Releases: Westlake-AI/openmixup
ViTs-Mixup-CIFAR100-Weights
A collection of weights and logs for image classification experiments with modern Transformer architectures on CIFAR-100. These benchmarks are provided to ease research on Mixup augmentations with Transformers, since most published benchmarks of Mixup variants with ViTs are based on ImageNet-1K. Please refer to our tech report for more details.
- Since the original $32\times 32$ resolution of CIFAR-100 is too small for ViTs, we resize the input images to $224\times 224$ (for both training and testing) without modifying the ViT architectures. This benchmark follows the DeiT setup and trains the models for 200 or 600 epochs with a batch size of 100 on CIFAR-100. The basic learning rates of DeiT and Swin are $1e-3$ and $5e-4$, which is the optimal setup in our experiments. We search and report $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods (a minimal sampling sketch follows this list). View config files in mixups/vits.
- The best top-1 accuracy in the last 10 training epochs is reported for ViT architectures. We release the trained models and logs in vits-mix-cifar100-weights.
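The $\alpha$ column in the table below is the parameter of the $Beta(\alpha, \alpha)$ distribution from which the mixing ratio $\lambda$ is drawn. As a minimal, hedged sketch of input-space Mixup (illustrative only; the function name and shapes are assumptions, not the OpenMixup implementation):

```python
# Minimal sketch: sample lambda ~ Beta(alpha, alpha) and mix a batch of
# images and one-hot labels. Illustrative only, not the OpenMixup code.
import numpy as np
import torch
import torch.nn.functional as F

def mixup_batch(images, labels, num_classes, alpha=0.8):
    lam = float(np.random.beta(alpha, alpha))   # mixing ratio in (0, 1)
    index = torch.randperm(images.size(0))      # random pairing within the batch
    mixed_images = lam * images + (1.0 - lam) * images[index]
    one_hot = F.one_hot(labels, num_classes).float()
    mixed_labels = lam * one_hot + (1.0 - lam) * one_hot[index]
    return mixed_images, mixed_labels

# Example: a CIFAR-100 batch resized to 224x224, with alpha = 0.8 as in the table.
x, y = torch.randn(100, 3, 224, 224), torch.randint(0, 100, (100,))
mx, my = mixup_batch(x, y, num_classes=100, alpha=0.8)
```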
ViTs' Mixup Benchmark on CIFAR-100
Backbones | $\alpha$ | DeiT-S(/16) | DeiT-S(/16) | Swin-T | Swin-T |
---|---|---|---|---|---|
Epoch | | 200 epochs | 600 epochs | 200 epochs | 600 epochs |
Vanilla | - | 65.81 | 68.50 | 78.41 | 81.29 |
MixUp | 0.8 | 69.98 | 76.35 | 76.78 | 83.67 |
CutMix | 2 | 74.12 | 79.54 | 80.64 | 83.38 |
DeiT | 0.8,1 | 75.92 | 79.38 | 81.25 | 84.41 |
SmoothMix | 0.2 | 67.54 | 80.25 | 66.69 | 81.18 |
SaliencyMix | 0.2 | 69.78 | 76.60 | 80.40 | 82.58 |
AttentiveMix+ | 2 | 75.98 | 80.33 | 81.13 | 83.69 |
FMix* | 1 | 70.41 | 74.31 | 80.72 | 82.82 |
GridMix | 1 | 68.86 | 74.96 | 78.54 | 80.79 |
PuzzleMix | 2 | 73.60 | 81.01 | 80.44 | 84.74 |
ResizeMix* | 1 | 68.45 | 71.95 | 80.16 | 82.36 |
AlignMix | 1 | - | - | 78.91 | 83.34 |
TransMix | 0.8,1 | 76.17 | 79.33 | 81.33 | 84.45 |
AutoMix | 2 | 76.24 | 80.91 | 82.67 | 84.70 |
SAMix* | 2 | 77.94 | 82.49 | 82.62 | 84.85 |
V0.2.7-OpenSource-Preproduced-ImageNet-Weights
We provide a collection of model weights and logs for image classification networks on ImageNet-1K (download) reproduced with OpenMixup
or MMLab frameworks. You can view the training setting in config files and README pages of related models. You can download all files from Baidu Cloud (cicj).
If you want us to reproduce a new model, or if you can provide reproduced results to OpenMixup, please contact us via GitHub issues or e-mail. This release will continue to be updated over time!
ImageNet Classification with OpenMixup
Model | Paper | Pretrain | Params(M) | Flops(G) | Top-1(%) | Top-5(%) | Config | Download |
---|---|---|---|---|---|---|---|---|
DeiT-S | ICML'2021 | From scratch | 22.05 | 4.24 | 80.28 | 95.07 | config | model | log |
DeiT-B | ICML'2021 | From scratch | 86.57 | 16.86 | 81.82 | 95.57 | config | model | log |
Swin-T | ICCV'2021 | From scratch | 28.29 | 4.36 | 81.18 | 95.61 | config | model | log |
ConvNeXt-T | CVPR'2022 | From scratch | 28.59 | 4.46 | 82.16 | 95.81 | config | model | log |
UniFormer-T | ICLR'2022 | From scratch | 5.55 | 0.88 | 78.02 | 94.14 | config | model | log |
UniFormer-S | ICLR'2022 | From scratch | 21.5 | 3.44 | 82.29 | 95.91 | config | model | log |
VAN-T (B0) | arXiv'2022 | From scratch | 4.11 | 0.88 | 75.77 | 92.99 | config | model | log |
VAN-S (B1) | arXiv'2022 | From scratch | 13.86 | 2.52 | 81.03 | 95.56 | config | model | log |
VAN-B (B2) | arXiv'2022 | From scratch | 26.58 | 5.03 | 82.65 | 96.17 | config | model | log |
LITv2-S | NIPS'2022 | From scratch | 27.85 | 3.52 | 81.74 | 95.59 | config | model | log |
CoC-T | ICLR'2023 | From scratch | 5.60 | 1.10 | 72.70 | 91.26 | config | model | log |
CoC-T-plain | ICLR'2023 | From scratch | 5.60 | 1.10 | 73.16 | 95.48 | config | model | log |
CoC-S | ICLR'2023 | From scratch | 14.7 | 2.78 | 77.71 | 93.87 | config | model | log |
V0.2.7-RSB-A3-ImageNet-Weights
A collection of weights and logs for image classification experiments with RSB A3 training setting on ImageNet-1K (download). You can view the training setting in ResNet strikes back and find the full results in MogaNet (Appendix Table A.7). You can download all files from Baidu Cloud (ss3j).
- We train all models for 100 epochs according to the RSB A3 setting on ImageNet-1K. We tune the basic learning rate in {8e-3, 6e-3} to obtain better performance.
- The best top-1 accuracy of image classification in the last 10 training epochs is reported for all experiments.
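The reporting rule above can be written as a tiny, hypothetical helper (illustrative only, not an OpenMixup utility):

```python
# Hypothetical helper (not part of OpenMixup): report the best top-1 accuracy
# observed over the last k training epochs.
def best_top1_last_k(top1_per_epoch, k=10):
    return max(top1_per_epoch[-k:])

# Example with a fake 100-epoch accuracy curve.
accs = [50.0 + 0.3 * epoch for epoch in range(100)]
print(best_top1_last_k(accs, k=10))  # best value among epochs 90-99
```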
RSB A3 Image Classification on ImageNet-1K
Model | Date | Train / Test | Params (M) | Top-1 (%) | Top-5 (%) | Config | Download |
---|---|---|---|---|---|---|---|
ResNet-50 | CVPR'2016 | 160 / 224 | 26 | 78.1 | 93.8 | config | model | log |
ResNet-101 | CVPR'2016 | 160 / 224 | 45 | 79.9 | 94.9 | config | model | log |
ResNet-152 | CVPR'2016 | 160 / 224 | 60 | 80.7 | 95.2 | config | model | log |
ViT-T | ICLR'2021 | 160 / 224 | 6 | 66.7 | 87.7 | config | model | log |
ViT-S | ICLR'2021 | 160 / 224 | 22 | 73.8 | 91.2 | config | model | log |
ViT-B | ICLR'2021 | 160 / 224 | 87 | 76.0 | 91.8 | config | model | log |
PVT-T | ICCV'2021 | 160 / 224 | 13 | 71.5 | 89.8 | config | model | log |
PVT-S | ICCV'2021 | 160 / 224 | 25 | 72.1 | 90.2 | config | model | log |
Swin-T | ICCV'2021 | 160 / 224 | 28 | 77.7 | 93.7 | config | model | log |
Swin-S | ICCV'2021 | 160 / 224 | 50 | 80.2 | 95.1 | config | model | log |
Swin-B | ICCV'2021 | 160 / 224 | 50 | 80.5 | 95.4 | config | model | log |
LITV2-T | NIPS'2022 | 160 / 224 | 28 | 79.7 | 94.7 | config | model | log |
LITV2-M | NIPS'2022 | 160 / 224 | 49 | 80.5 | 95.2 | config | model | log |
LITV2-B | NIPS'2022 | 160 / 224 | 87 | 81.3 | 95.5 | config | model | log |
ConvMixer-768-d32 | arXiv'2022 | 160 / 224 | 21 | 77.6 | 93.5 | config | model | log |
PoolFormer-S12 | CVPR'2022 | 160 / 224 | 12 | 69.3 | 88.7 | config | model | log |
PoolFormer-S24 | CVPR'2022 | 160 / 224 | 21 | 74.1 | 91.8 | config | model | log |
PoolFormer-S36 | CVPR'2022 | 160 / 224 | 31 | 74.6 | 92.0 | config | model | log |
PoolFormer-M36 | CVPR'2022 | 160 / 224 | 56 | 80.7 | 95.2 | config | model | log |
PoolFormer-M48 | CVPR'2022 | 160 / 224 | 73 | 81.2 | 95.3 | config | [model](ht... |
V0.2.6-MogaNet-ImageNet-Weights
A collection of weights and logs for image classification experiments of MogaNet on ImageNet-1K (download). You can download all files from Baidu Cloud (z8mf) at MogaNet/Classification_OpenMixup.
- We train MogaNet for 100 and 300 epochs according to the RSB A3 and DeiT settings on ImageNet-1K. Note that * denotes the refined training setting of lightweight models with 3-Augment. Refer to the Appendix of MogaNet for more training details.
- The best top-1 accuracy of image classification in the last 10 training epochs is reported for all experiments. Note that we report the classification accuracy of EMA weights for MogaNet-S, MogaNet-B, and MogaNet-L.
- As for evaluation of the pre-trained weights, you can test them with `tools/dist_test.sh` for the classification performance, or fine-tune them on downstream tasks (e.g., COCO detection and ADE20K segmentation) by loading only the encoder weights.
- Warning on `attn_force_fp32`: during fp16 training, we force the gating functions to run in fp32 to avoid inf or nan values. We found that if `attn_force_fp32=True` is used during training, it should also be kept as `attn_force_fp32=True` during evaluation. This might be caused by the difference between the outputs produced with and without `attn_force_fp32`. It does not affect the performance of full fine-tuning, but it does affect transfer-learning results (e.g., COCO Mask R-CNN freezes the parameters of the first stage). We set it to true by default in OpenMixup while removing it in the MogaNet implementation. For example, use moga_small_ema_sz224_8xb128_ep300.pth with `attn_force_fp32=True`, and moga_small_ema_sz224_8xb128_no_forcefp32_ep300.pth with `attn_force_fp32=False`. A hedged sketch of this fp32-gating trick follows this list.
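As a rough illustration of what forcing fp32 gating means under mixed-precision training (this toy module and its reuse of the flag name are assumptions for clarity, not the MogaNet/OpenMixup code):

```python
# Illustrative sketch (assumed module, not the actual MogaNet implementation):
# run a sigmoid gating branch in fp32 inside an autocast (fp16) region.
import torch
import torch.nn as nn

class ToyGate(nn.Module):
    def __init__(self, dim, attn_force_fp32=True):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.attn_force_fp32 = attn_force_fp32

    def forward(self, x):
        gate = self.proj(x)
        if self.attn_force_fp32:
            # Disable autocast and compute the gating activation in fp32 to
            # avoid inf/nan, then cast back to the input dtype.
            with torch.autocast(device_type=x.device.type, enabled=False):
                gate = torch.sigmoid(gate.float()).to(x.dtype)
        else:
            gate = torch.sigmoid(gate)
        return x * gate
```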
Image Classification on ImageNet-1K
Model | Pretrain | Setting | resolution | Params(M) | Flops(G) | Top-1 (%) | Config | Download |
---|---|---|---|---|---|---|---|---|
MogaNet-XT | From scratch | DeiT | 224x224 | 2.97 | 0.80 | 76.5 | config | model | log |
MogaNet-XT | From scratch | DeiT | 256x256 | 2.97 | 1.04 | 77.2 | config | model | log |
MogaNet-XT* | From scratch | DeiT-3 | 256x256 | 2.97 | 1.04 | 77.6 | config | model | log |
MogaNet-T | From scratch | DeiT | 224x224 | 5.20 | 1.10 | 79.0 | config | model | log |
MogaNet-T | From scratch | DeiT | 256x256 | 5.20 | 1.44 | 79.6 | config | model | log |
MogaNet-T* | From scratch | DeiT-3 | 256x256 | 5.20 | 1.44 | 80.0 | config | model | log |
MogaNet-S | From scratch | DeiT | 224x224 | 25.3 | 4.97 | 83.4 | config | model | log |
MogaNet-B | From scratch | DeiT | 224x224 | 43.9 | 9.93 | 84.3 | config | model | log |
MogaNet-L | From scratch | DeiT | 224x224 | 82.5 | 15.9 | 84.7 | config | model | log |
MogaNet-XL | From scratch | DeiT | 224x224 | 180.8 | 34.5 | 85.1 | config | model | log |
MogaNet-XT | From scratch | RSB A3 | 160x160 | 2.97 | 0.80 | 72.8 | config | model | log |
MogaNet-T | From scratch | RSB A3 | 160x160 | 5.20 | 1.10 | 75.4 | config | model | log |
MogaNet-S | From scratch | RSB A3 | 160x160 | 25.3 | 4.97 | 81.1 | config | model | log |
MogaNet-B | From scratch | RSB A3 | 160x160 | 43.9 | 9.93 | 82.2 | config | model | log |
MogaNet-L | From scratch | RSB A3 | 160x160 | 43.9 | 9.93 | 83.2 | config | model | log |
V0.2.6-A2MIM-ImageNet-Weights
A collection of weights and logs for self-supervised learning benchmark on ImageNet-1K (download). You can find pre-training codes of compared methods in OpenMixup, VISSL, solo-learn, and the official repositories. You can download all files from Baidu Cloud: A2MIM (3q5i).
- All compared methods adopt ResNet-50 or ViT-B architectures and are pre-trained for 100/300 or 800 epochs on ImageNet-1K. The pre-training and fine-tuning image size is $224\times 224$. The fine-tuning protocols include RSB A3 and RSB A2 for ResNet-50, and BEiT (SimMIM) for ViT-B. Refer to the A2MIM paper for more details.
- The best top-1 accuracy of fine-tuning in the last 10 training epochs is reported for all self-supervised methods.
- Visualizations of mixed samples from A2MIM are provided in zip files.
- As for the pre-training and fine-tuning weights, you can evaluate them with `tools/dist_test.sh`, or fine-tune pre-trained models with `tools/dist_train.sh` and `--load_checkpoint` (loading the full checkpoint). Note that pre-trained weights whose names start with `full_` contain the full keys of the pre-trained models, while `backbone_` weights only contain the encoder weights, which can be used for downstream tasks, e.g., COCO detection and ADE20K segmentation. A hedged sketch of extracting the encoder weights follows this list.
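For reference, extracting encoder-only weights from a full checkpoint might look like the sketch below (the file names and the `state_dict` / `backbone.` key conventions are assumptions for illustration, not guaranteed OpenMixup keys):

```python
# Hedged sketch: extract encoder-only weights from a full checkpoint so they
# can be loaded for downstream tasks. Key names are illustrative assumptions.
import torch

ckpt = torch.load("full_a2mim_checkpoint.pth", map_location="cpu")
state = ckpt.get("state_dict", ckpt)

backbone_only = {
    k[len("backbone."):]: v
    for k, v in state.items()
    if k.startswith("backbone.")
}
torch.save({"state_dict": backbone_only}, "backbone_a2mim_checkpoint.pth")
```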
Self-supervised Pre-training and Fine-tuning with ResNet-50 on ImageNet-1K
We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.
Methods | Source | PT epoch | FT protocol | FT top-1 |
---|---|---|---|---|
PyTorch | PyTorch | 90 | RSB A3 | 78.8 |
Inpainting | OpenMixup | 70 | RSB A3 | 78.4 |
Relative-Loc | OpenMixup | 70 | RSB A3 | 77.8 |
Rotation | OpenMixup | 70 | RSB A3 | 77.7 |
SimCLR | VISSL | 100 | RSB A3 | 78.5 |
MoCoV2 | OpenMixup | 100 | RSB A3 | 78.5 |
BYOL | OpenMixup | 100 | RSB A3 | 78.7 |
BYOL | Official | 300 | RSB A3 | 78.9 |
BYOL | Official | 300 | RSB A2 | 80.1 |
SwAV | VISSL | 100 | RSB A3 | 78.9 |
SwAV | Official | 400 | RSB A3 | 79.0 |
SwAV | Official | 400 | RSB A2 | 80.2 |
BarlowTwins | solo-learn | 100 | RSB A3 | 78.5 |
BarlowTwins | Official | 300 | RSB A3 | 78.8 |
MoCoV3 | Official | 100 | RSB A3 | 78.7 |
MoCoV3 | Official | 300 | RSB A3 | 79.0 |
MoCoV3 | Official | 300 | RSB A2 | 80.1 |
A2MIM | OpenMixup | 100 | RSB A3 | 78.8 |
A2MIM | OpenMixup | 300 | RSB A3 | 78.9 |
A2MIM | OpenMixup | 300 | RSB A2 | 80.4 |
Self-supervised Pre-training and Fine-tuning with ViT-B on ImageNet-1K
We provide the source of pre-trained weights, pre-training epochs, fine-tuning epochs and protocol, and top-1 accuracy in the following table.
Methods | Source | PT epoch | FT protocol | FT top-1 |
---|---|---|---|---|
SimMIM | Official | 800 | BEiT (SimMIM) | 83.8 |
SimMIM (RGB mean) | OpenMixup | 800 | BEiT (SimMIM) | 84.0 |
A2MIM | OpenMixup | 800 | BEiT (SimMIM) | 84.3 |
V0.2.5-Mixup-iNaturalist2018-Weights
A collection of weights and logs for mixup classification benchmark on iNaturalist-2018 (download, config). You can download all files from Baidu Cloud: iNaturalist-2018 (wy2v).
- All compared methods adopt ResNet-50 and ResNeXt-101 (32x4d) architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85 (a hedged sketch of this test-time transform follows this list). We search $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
- The median top-1 accuracy over the last 5 training epochs is reported for ResNet variants.
- Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Updated MixBlock keys in the AutoMix and SAMix checkpoints.
- Test the pre-trained weights with `tools/dist_test.sh`, or fine-tune pre-trained models with `tools/dist_train.sh` and `--load_checkpoint`.
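A possible torchvision version of the test-time pipeline described above (resize so that a 224 crop covers about 85% of the shorter side, then center-crop; the exact resize size and normalization statistics are assumptions, not copied from the OpenMixup configs):

```python
# Hedged sketch of a test-time transform with a CenterCrop ratio of 0.85,
# i.e., resize the shorter side to 224 / 0.85 (about 263 px), then crop 224.
from torchvision import transforms

test_transform = transforms.Compose([
    transforms.Resize(int(224 / 0.85)),  # shorter side -> about 263 px
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),  # ImageNet statistics
])
```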
Mixup Classification Benchmark on iNaturalist-2018
Backbones | ResNet-50 top-1 | ResNeXt-101 top-1 |
---|---|---|
Vanilla | 62.53 | 66.94 |
MixUp [ICLR'2018] | 62.69 | 67.56 |
CutMix [ICCV'2019] | 63.91 | 69.75 |
ManifoldMix [ICML'2019] | 63.46 | 69.30 |
SaliencyMix [ICLR'2021] | 64.27 | 70.01 |
FMix [arXiv'2020] | 63.71 | 69.46 |
PuzzleMix [ICML'2020] | 64.36 | 70.12 |
ResizeMix [arXiv'2020] | 64.12 | 69.30 |
AutoMix [ECCV'2022] | 64.73 | 70.49 |
SAMix [arXiv'2021] | 64.84 | 70.54 |
V0.2.5-Mixup-Place205-Weights
A collection of weights and logs for mixup classification benchmark on Place205 (download, config). You can download all files from Baidu Cloud (4m94).
- All compared methods adopt ResNet-18/50 architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85. We search $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
- The median top-1 accuracy over the last 5 training epochs is reported for ResNet variants.
- Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Updated MixBlock keys in the AutoMix and SAMix checkpoints.
- Test the pre-trained weights with `tools/dist_test.sh`, or fine-tune pre-trained models with `tools/dist_train.sh` and `--load_checkpoint`.
Mixup Classification Benchmark on Place205
Backbones | ResNet-18 top-1 | ResNet-50 top-1 |
---|---|---|
Vanilla | 59.63 | 63.10 |
MixUp [ICLR'2018] | 59.33 | 63.01 |
CutMix [ICCV'2019] | 59.21 | 63.75 |
ManifoldMix [ICML'2019] | 59.46 | 63.23 |
SaliencyMix [ICLR'2021] | 59.50 | 63.33 |
FMix [arXiv'2020] | 59.51 | 63.63 |
PuzzleMix [ICML'2020] | 59.62 | 63.91 |
ResizeMix [arXiv'2020] | 59.66 | 63.88 |
AutoMix [ECCV'2022] | 59.74 | 64.06 |
SAMix [arXiv'2021] | 59.86 | 64.27 |
V0.2.5-Mixup-iNaturalist2017-Weights
A collection of weights and logs for mixup classification benchmark on iNaturalist-2017 (download, config). You can download all files from Baidu Cloud: iNaturalist-2017 (1e7w).
- All compared methods adopt ResNet-18/50 and ResNeXt-101 (32x4d) architectures and are trained for 100 epochs using the PyTorch training recipe. The training and testing image size is 224 with a CenterCrop ratio of 0.85. We search $\alpha$ in $Beta(\alpha, \alpha)$ for all compared methods.
- The median top-1 accuracy over the last 5 training epochs is reported for ResNet variants.
- Visualizations of mixed samples from AutoMix and SAMix are provided in zip files. [2022-08-22] Updated MixBlock keys in the AutoMix and SAMix checkpoints.
- Test the pre-trained weights with `tools/dist_test.sh`, or fine-tune pre-trained models with `tools/dist_train.sh` and `--load_checkpoint`.
Mixup Classification Benchmark on iNaturalist-2017
Backbones | ResNet-18 top-1 | ResNet-50 top-1 | ResNeXt-101 top-1 |
---|---|---|---|
Vanilla | 51.79 | 60.23 | 63.70 |
MixUp [ICLR'2018] | 51.40 | 61.22 | 66.27 |
CutMix [ICCV'2019] | 51.24 | 62.34 | 67.59 |
ManifoldMix [ICML'2019] | 51.83 | 61.47 | 66.08 |
SaliencyMix [ICLR'2021] | 51.29 | 62.51 | 67.20 |
FMix [arXiv'2020] | 52.01 | 61.90 | 66.64 |
PuzzleMix [ICML'2020] | - | 62.66 | 67.72 |
ResizeMix [arXiv'2020] | 51.21 | 62.29 | 66.82 |
AutoMix [ECCV'2022] | 52.84 | 63.08 | 68.03 |
SAMix [arXiv'2021] | 53.42 | 63.32 | 68.26 |
OpenMixup Release V0.2.3
Highlight
- Support the online documentation of `OpenMixup` (built on Read the Docs).
- Provide `README` pages and update configs for self-supervised and supervised methods.
- Support a new contrastive learning method (Barlow Twins) and Masked Image Modeling (MIM) methods (MAE, SimMIM, MaskFeat, CAE, A2MIM).
- Support new backbone networks (ConvMixer, DenseNet, MLPMixer, ResNeSt, PoolFormer, UniFormer, VAN).
- Support a new fine-tuning method (HCR).
- Support new mixup augmentation methods (SmoothMix, GridMix).
- Support more regression losses (Charbonnier loss, Focal Frequency loss, Focal L1/L2 loss, Balanced L1 loss, Balanced MSE loss); a hedged Charbonnier-loss sketch is given after this list.
- Support more regression metrics (regression errors and correlations) and the regression dataset.
- Support more re-weighted classification losses (Gradient Harmonized loss, Varifocal Focal Loss) from MMDetection.
- Model Zoos and lists of Awesome Mixups have been updated.
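As a hedged sketch of one of the newly supported losses, the Charbonnier loss (a smooth L1 variant) can be written as follows; the epsilon value is illustrative and this is not the exact OpenMixup implementation:

```python
# Hedged sketch of the Charbonnier loss named above, a differentiable |x|
# surrogate: L = mean(sqrt((pred - target)^2 + eps^2)).
import torch

def charbonnier_loss(pred, target, eps=1e-3):
    return torch.sqrt((pred - target) ** 2 + eps * eps).mean()

loss = charbonnier_loss(torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32))
```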
Bug Fixes
- Refactor the code structure of `openmixup.models.utils` and support more network layers.
- Fix the bug of `DropPath` (using the stochastic depth rule) in `ResNet` for RSB A1/A2 training settings.
- Fix bugs in self-supervised classification benchmarks (configs and implementations of VisionTransformer).
- Update INSTALL.md. We suggest installing PyTorch 1.8 or higher and mmcv-full for better usage of this repo. However, since PyTorch 1.8 has a bug in the AdamW optimizer, do not use PyTorch 1.8 to fine-tune ViT-based methods.
- Fix bugs in `PreciseBNHook` (update all BN stats) and `RepeatSampler` (set sync_random_seed) for RSB A1/A2.
- Fix bugs in regression metrics, the MIM dataset, and benchmark configs. Notice that only `l1_loss` is supported in FP16 training; other regression losses (e.g., MSE and Smooth_L1 losses) will produce NaN when the target and prediction are not normalized in FP16 training. A hedged sketch of this overflow is given after this list.
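A hedged illustration of why unnormalized targets can break MSE under fp16 (fp16 overflows above roughly 65504, so squaring a large residual yields inf and then NaN gradients), whereas an L1-style loss keeps values in range:

```python
# Toy example, not OpenMixup code: compare MSE and L1 on an unnormalized
# target under fp16. Squaring the residual overflows fp16; |residual| does not.
import torch

pred = torch.tensor([0.0], dtype=torch.float16)
target = torch.tensor([300.0], dtype=torch.float16)   # unnormalized target

mse = ((pred - target) ** 2).mean()   # 90000 > 65504 -> inf in fp16
l1 = (pred - target).abs().mean()     # 300, still representable in fp16
print(mse, l1)                        # mse is inf, l1 is 300
```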
OpenMixup Release V0.2.0
Highlights
- Support various popular backbones (ConvNets and ViTs), various image datasets, popular mixup methods, and benchmarks for supervised learning. Config files are available (reorganized).
- Support popular self-supervised methods (e.g., BYOL, MoCo.V3, MAE, SimMIM) on both large-scale and small-scale datasets, and self-supervised benchmarks (merged from MMSelfSup). Config files are available (reorganized).
- Support analyzing tools for self-supervised learning (kNN/SVM/linear metrics and t-SNE/UMAP visualization).
- Convenient usage of configs: fast config generation with 'auto_train.py' and config inheritance (MMCV).
- Support mixed-precision training (NVIDIA Apex or MMCV Apex) for all methods.
- Model Zoos and lists of Awesome Mixups have been released.
Bug Fixes
- Code refactoring has been done following MMSelfSup and MMClassification (#3).
- Fix mixed-precision training overflow (NaN & Inf in supervised mixup methods).
- Fix fine-tuning settings (ViT and Swin Transformer) to follow MMSelfSup.