Video Swin Transformer [Paper]
AdaMAE: Adaptive Masking for Efficient Spatiotemporal Learning with Masked Autoencoders [Paper]
A-ViT: Adaptive Tokens for Efficient Vision Transformer [Paper]
AdaViT: Adaptive Vision Transformers for Efficient Image Recognition [Paper]
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification [Paper]
A Survey on In-context Learning [Paper]
Learning To Retrieve Prompts for In-Context Learning [Paper]
