Möbius Transform for Mitigating Perspective Distortions in Representation Learning (ECCV 2024)
Chhipa, P.C., Chippa, M.S., De, K., Saini, R., Liwicki, M., Shah, M.: Möbius transform for mitigating perspective distortions in representation learning. In: European Conference on Computer Vision (ECCV 2024)
Visit Project Website - https://prakashchhipa.github.io/projects/mpd/
- Pretrained models - https://huggingface.co/prakashchhipa/MPD_SSL
- ImageNet-PD benchmark dataset - https://huggingface.co/datasets/prakashchhipa/ImageNet-PD
- Two-minute summary of MPD - https://prakashchhipa.github.io/projects/mpd/
Train a ResNet50 model in a supervised setting, applying Möbius transformations with a probability of 0.2:
torchrun --nproc_per_node=2 train_supervised.py --model resnet50 --apply-mobius --forward-mobius --mobius-prob 0.2 --name supervised_mpd_rn50
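As background for the probability option used above, here is a minimal sketch of a Möbius map f(z) = (az + b) / (cz + d), with ad - bc ≠ 0, applied to complex-valued coordinates with a given probability. It is not the repository's implementation: the helper names `mobius` and `maybe_mobius` and the parameter values are hypothetical and chosen only for illustration, and how the training scripts sample parameters or resample pixels is not covered here.

```python
import random


def mobius(z: complex, a: complex, b: complex, c: complex, d: complex) -> complex:
    """Apply f(z) = (a*z + b) / (c*z + d); requires a*d - b*c != 0."""
    if a * d - b * c == 0:
        raise ValueError("degenerate Möbius map: a*d - b*c must be non-zero")
    # Raises ZeroDivisionError if z is the pole of the map (c*z + d == 0).
    return (a * z + b) / (c * z + d)


def maybe_mobius(points, params, prob, rng):
    """With probability `prob`, map every point; otherwise return them unchanged."""
    if rng.random() >= prob:
        return list(points)
    a, b, c, d = params
    return [mobius(z, a, b, c, d) for z in points]


# Corners of a unit image plane written as complex numbers x + iy.
corners = [complex(0, 0), complex(1, 0), complex(1, 1), complex(0, 1)]

# Hypothetical parameters (assumption): a non-zero c makes the map non-affine.
params = (1 + 0j, 0j, 0.2 + 0j, 1 + 0j)

rng = random.Random(0)  # seeded so repeated runs make the same choice
warped = maybe_mobius(corners, params, prob=0.2, rng=rng)
```

With `prob=0.0` the points are always returned unchanged, and with `prob=1.0` the map is always applied, because `rng.random()` returns a value in [0, 1).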
Pretrain ResNet50 using the SimCLR framework with Möbius transformations:
python train_simclr.py --config configs/imagenet_train_epochs100_bs512.yaml
Pretrain ResNet50 with SimCLR, applying Möbius transformations and controlling the background with the mobius_background input:
python train_simclr.py --config configs/imagenet_train_epochs100_bs512.yaml
Fine-tune a ResNet50 model pretrained with SimCLR on a downstream task using Möbius transformations:
torchrun --nproc_per_node=5 train_downstream_task_from_ssl.py --model resnet50 --apply-mobius --apply-BGI --name downstream_simclr_rn50 --forward-mobius --mobius-prob 0.2 --ssl-checkpoint <checkpoint URL>
Pretrain a Vision Transformer (ViT) using DINO with Möbius transformations:
python -m torch.distributed.launch --nproc_per_node=8 main_dino_mobius.py --mobius_prob 0.8 --arch vit_small --data_path /path/to/imagenet/train --output_dir /path/to/saving_dir
Pretrain a Vision Transformer (ViT) using DINO with Möbius transformations and a padded background:
python -m torch.distributed.launch --nproc_per_node=8 main_dino_mobius_bgi.py --mobius_prob 0.8 --arch vit_small --data_path /path/to/imagenet/train --output_dir /path/to/saving_dir
The source code for additional computer vision applications will be released later.