[ECCV 2024] Official code release for "Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition"

masashi-hatano/MM-CDFSL

Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition (ECCV'24)

PyTorch Lightning Config: Hydra

[Paper][Supplementary][Project Page][Poster][Data]

This is the official code release for our ECCV 2024 paper
"Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition".

🔨 Installation

# Create a virtual environment
python3 -m venv mm-cdfsl
source mm-cdfsl/bin/activate

# Install the dependencies
pip install -r requirements.txt

📂 Data Preparation

Training Split

The train/val split files for all three target datasets are in the cdfsl folders.

Data Structure

Please follow the data structure as detailed in DATA_STRUCTURE.md.

Pre-processed Data

You can download the pre-processed data from the hub.

📍 Model Zoo

You can browse the checkpoints of the pre-trained models, the comparison methods, and our models in this folder, or download them directly from the following links.

Pre-Train

| Method | Source Dataset | Target Dataset | Modality | Ckpt |
|---|---|---|---|---|
| VideoMAE | Kinetics-400 | - | RGB | checkpoint |
| VideoMAE | Ego4D | - | RGB | checkpoint |
| VideoMAE w/ classifier | Ego4D | EPIC | RGB | checkpoint |
| VideoMAE w/ classifier | Ego4D | EPIC | flow | checkpoint |
| VideoMAE w/ classifier | Ego4D | EPIC | pose | checkpoint |
| VideoMAE w/ classifier | Ego4D | MECCANO | RGB | checkpoint |
| VideoMAE w/ classifier | Ego4D | MECCANO | flow | checkpoint |
| VideoMAE w/ classifier | Ego4D | MECCANO | pose | checkpoint |
| VideoMAE w/ classifier | Ego4D | WEAR | RGB | checkpoint |
| VideoMAE w/ classifier | Ego4D | WEAR | flow | checkpoint |
| VideoMAE w/ classifier | Ego4D | WEAR | pose | checkpoint |

2nd Stage

| Method | Source Dataset | Target Dataset | Modality | Ckpt |
|---|---|---|---|---|
| STARTUP++ | Ego4D | EPIC | RGB | checkpoint |
| STARTUP++ | Ego4D | MECCANO | RGB | checkpoint |
| STARTUP++ | Ego4D | WEAR | RGB | checkpoint |
| Dynamic Distill++ | Ego4D | EPIC | RGB | checkpoint |
| Dynamic Distill++ | Ego4D | MECCANO | RGB | checkpoint |
| Dynamic Distill++ | Ego4D | WEAR | RGB | checkpoint |
| CDFSL-V | Ego4D | EPIC | RGB | checkpoint |
| CDFSL-V | Ego4D | MECCANO | RGB | checkpoint |
| CDFSL-V | Ego4D | WEAR | RGB | checkpoint |
| Ours | Ego4D | EPIC | RGB, flow, pose | checkpoint |
| Ours | Ego4D | MECCANO | RGB, flow, pose | checkpoint |
| Ours | Ego4D | WEAR | RGB, flow, pose | checkpoint |

🔥 Training

1. Pre-training

Make sure to set the modality (e.g., rgb) and the dataset (e.g., epic) in configs/trainer/pretrain_trainer.yaml and configs/data_module/pretrain_data_module.yaml.

python3 lit_main_pretrain.py train=True test=False
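
The `train=True test=False` arguments are Hydra-style `key=value` overrides, and dotted keys such as `data_module.n_way` address nested config entries. As a rough illustration of how such overrides map onto a nested config, here is a minimal pure-Python sketch. The `apply_overrides` helper and the sample default values are hypothetical; they are not part of this repository or of Hydra, whose real override grammar handles much more.

```python
import ast
import copy


def apply_overrides(config, overrides):
    """Apply Hydra-like `a.b=value` overrides to a nested dict (hypothetical helper)."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        try:
            # Interpret literals such as True, 5, or 0.1; fall back to the raw string.
            value = ast.literal_eval(raw_value)
        except (ValueError, SyntaxError):
            value = raw_value
        node = result
        *parents, leaf = dotted_key.split(".")
        for name in parents:
            # Create intermediate dicts on demand for keys that do not exist yet.
            node = node.setdefault(name, {})
        node[leaf] = value
    return result


# Illustrative defaults only; the real values live in the YAML files under configs/.
defaults = {"train": False, "test": False, "data_module": {"n_way": 5}}
cfg = apply_overrides(defaults, ["train=True", "test=False", "data_module.k_shot=5"])
print(cfg)
```

The sketch copies the defaults before modifying them, so the original dictionary stays unchanged, which mirrors the idea that command-line overrides affect only the current run.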

2. Multimodal Distillation

Make sure to set the dataset (e.g., epic) in configs/data_module/mm_distill_data_module.yaml. You also need to set the checkpoint paths of all modalities in configs/trainer/mm_distill_trainer.yaml.

python3 lit_main_mmdistill.py train=True test=False

🔍 Evaluation

To evaluate the model in the 5-way 5-shot setting with 600 runs, run the following command.

python3 lit_main_mmdistill.py train=False test=True data_module.n_way=5 data_module.k_shot=5 data_module.episodes=600
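
For readers unfamiliar with episodic evaluation, the following is a minimal sketch of what an N-way K-shot protocol with 600 runs typically involves. Each episode samples N classes, K support examples per class, and a number of query examples; the final score is the mean accuracy over all episodes, usually reported with a 95% confidence interval. The function names, the `n_query` parameter, and the toy labels are assumptions for illustration, and the repository's actual sampler may differ.

```python
import math
import random
import statistics


def sample_episode(labels, n_way, k_shot, n_query, rng):
    """Sample support/query indices for one N-way K-shot episode."""
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Only classes with enough examples for both support and query sets are eligible.
    eligible = [c for c, idxs in by_class.items() if len(idxs) >= k_shot + n_query]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return classes, support, query


def mean_and_ci95(accuracies):
    """Mean accuracy and 95% confidence half-width (normal approximation)."""
    mean = statistics.mean(accuracies)
    half_width = 1.96 * statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, half_width


rng = random.Random(0)
toy_labels = [c for c in range(10) for _ in range(30)]  # 10 classes, 30 clips each
classes, support, query = sample_episode(toy_labels, n_way=5, k_shot=5, n_query=15, rng=rng)
print(len(classes), len(support), len(query))

# Placeholder accuracies standing in for 600 real per-episode results.
placeholder_accs = [rng.uniform(0.4, 0.8) for _ in range(600)]
mean, ci = mean_and_ci95(placeholder_accs)
print(f"{mean:.3f} +/- {ci:.3f}")
```

In a real run, each placeholder accuracy would be replaced by the accuracy obtained by adapting on the episode's support set and classifying its query set.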

✍️ Citation

If you use this code for your research, please cite our paper.

@inproceedings{Hatano2024MMCDFSL,
    author = {Masashi Hatano and Ryo Hachiuma and Ryo Fujii and Hideo Saito},
    title = {Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year = {2024},
}
