V0.2.6-A2MIM-ImageNet-Weights
Released by Lupin1998 on 18 Nov.
A collection of weights and logs for self-supervised learning benchmarks on ImageNet-1K (download). You can find the pre-training code of the compared methods in OpenMixup, VISSL, solo-learn, and the official repositories. You can download all files from Baidu Cloud: A2MIM (3q5i).
- All compared methods adopt ResNet-50 or ViT-B architectures and are pre-trained for 100/300 or 800 epochs on ImageNet-1K. Both pre-training and fine-tuning use a $224\times 224$ image size. The fine-tuning protocols include RSB A3 and RSB A2 for ResNet-50, and BEiT (as used in SimMIM) for ViT-B; refer to the A2MIM paper for more details.
- The best top-1 accuracy over the last 10 fine-tuning epochs is reported for all self-supervised methods (a sketch of this protocol follows the list).
- Visualizations of mixed samples from A2MIM are provided in the zip files.
- As for the pre-training and fine-tuning weights, you can evaluate them with `tools/dist_test.sh`, or fine-tune pre-trained models with `tools/dist_train.sh` and `--load_checkpoint` (which loads the full checkpoint). Note that pre-trained weights prefixed with `full_` contain the full keys of the pre-trained model, while those prefixed with `backbone_` contain only the encoder weights, which can be used for downstream tasks, e.g., COCO detection and ADE20K segmentation (a sketch of extracting the backbone weights follows the list).
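To make the reporting protocol above concrete, here is a minimal sketch of taking the best top-1 accuracy over the last 10 fine-tuning epochs. The per-epoch accuracy values are purely illustrative placeholders, not OpenMixup's actual log format.

```python
# Minimal sketch of the reporting protocol: best top-1 accuracy over
# the last 10 fine-tuning epochs. The accuracy values below are
# illustrative placeholders, not real training logs.
def best_top1_last10(per_epoch_top1):
    """Return the best top-1 accuracy among the final 10 epochs."""
    return max(per_epoch_top1[-10:])

# Hypothetical per-epoch top-1 accuracies from a 100-epoch fine-tuning run.
accs = [50.0 + 0.3 * epoch for epoch in range(90)] + [
    77.9, 78.1, 78.0, 78.3, 78.2, 78.4, 78.3, 78.5, 78.4, 78.4,
]
print(f"Reported top-1: {best_top1_last10(accs):.1f}")  # -> 78.5
```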
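And to make the `full_` / `backbone_` distinction concrete, the following is a minimal sketch of deriving a backbone-only checkpoint from a full one. It assumes an mmcv-style checkpoint layout (a `state_dict` dict whose encoder keys carry a `backbone.` prefix); the key names and file names here are assumptions, so inspect your own checkpoint before relying on them.

```python
# Minimal sketch: derive a backbone_-style checkpoint from a full_ one.
# Assumes an mmcv-style layout ({'state_dict': {...}}) where encoder
# weights are stored under a 'backbone.' prefix; verify the actual keys
# of your checkpoint first.
import torch

def extract_backbone(full_ckpt_path: str, out_path: str) -> None:
    ckpt = torch.load(full_ckpt_path, map_location="cpu")
    state = ckpt.get("state_dict", ckpt)  # fall back to a flat checkpoint
    backbone = {
        key[len("backbone."):]: value
        for key, value in state.items()
        if key.startswith("backbone.")
    }
    torch.save({"state_dict": backbone}, out_path)
    print(f"Kept {len(backbone)} of {len(state)} tensors")

# Hypothetical file names for illustration only.
extract_backbone("full_a2mim_r50_ep300.pth", "backbone_a2mim_r50_ep300.pth")
```

The resulting backbone-only file can then serve as initialization for downstream detection or segmentation configs.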
Self-supervised Pre-training and Fine-tuning with ResNet-50 on ImageNet-1K
We provide the source of the pre-trained weights, the number of pre-training epochs, the fine-tuning protocol, and the fine-tuning top-1 accuracy in the following table.
| Methods | Source | PT epoch | FT protocol | FT top-1 |
|---|---|---|---|---|
| PyTorch | PyTorch | 90 | RSB A3 | 78.8 |
| Inpainting | OpenMixup | 70 | RSB A3 | 78.4 |
| Relative-Loc | OpenMixup | 70 | RSB A3 | 77.8 |
| Rotation | OpenMixup | 70 | RSB A3 | 77.7 |
| SimCLR | VISSL | 100 | RSB A3 | 78.5 |
| MoCoV2 | OpenMixup | 100 | RSB A3 | 78.5 |
| BYOL | OpenMixup | 100 | RSB A3 | 78.7 |
| BYOL | Official | 300 | RSB A3 | 78.9 |
| BYOL | Official | 300 | RSB A2 | 80.1 |
| SwAV | VISSL | 100 | RSB A3 | 78.9 |
| SwAV | Official | 400 | RSB A3 | 79.0 |
| SwAV | Official | 400 | RSB A2 | 80.2 |
| BarlowTwins | solo-learn | 100 | RSB A3 | 78.5 |
| BarlowTwins | Official | 300 | RSB A3 | 78.8 |
| MoCoV3 | Official | 100 | RSB A3 | 78.7 |
| MoCoV3 | Official | 300 | RSB A3 | 79.0 |
| MoCoV3 | Official | 300 | RSB A2 | 80.1 |
| A2MIM | OpenMixup | 100 | RSB A3 | 78.8 |
| A2MIM | OpenMixup | 300 | RSB A3 | 78.9 |
| A2MIM | OpenMixup | 300 | RSB A2 | 80.4 |
Self-supervised Pre-training and Fine-tuning with ViT-B on ImageNet-1K
We provide the source of the pre-trained weights, the number of pre-training epochs, the fine-tuning protocol, and the fine-tuning top-1 accuracy in the following table.
| Methods | Source | PT epoch | FT protocol | FT top-1 |
|---|---|---|---|---|
| SimMIM | Official | 800 | BEiT (SimMIM) | 83.8 |
| SimMIM (RGB mean) | OpenMixup | 800 | BEiT (SimMIM) | 84.0 |
| A2MIM | OpenMixup | 800 | BEiT (SimMIM) | 84.3 |