This is the official implementation of the ICML'24 paper *Exploring Intrinsic Dimension for Vision-Language Model Pruning*.
The Intrinsic Dimension (ID) of vision representations spans a wider and higher range than that of language representations, which helps explain the heightened sensitivity of vision models to pruning. Language models, in contrast, remain robust despite containing more redundant weights.
This code is tested with `pytorch==1.11.0`, `cuda==11.5`, and `python==3.9.0`. Install the dependencies with:

```bash
conda install --yes --file requirements.txt
```
- CPU Mode

  ```bash
  python ComputeID.py -n 2000 --Path ID/Blip_coco --cpu
  ```

- GPU Mode

  ```bash
  python ComputeID.py -n 2000 --Path ID/Blip_coco --gpu 0
  ```
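`ComputeID.py` builds on IntrinsicDimDeep, which implements the TwoNN estimator of Facco et al. (2017). As a rough illustration of the underlying idea (not the repository's actual code; the function name and the synthetic data below are our own), here is a minimal sketch of the maximum-likelihood TwoNN estimate in NumPy:

```python
import numpy as np
from scipy.spatial.distance import cdist

def twonn_id(X):
    """Maximum-likelihood TwoNN estimate of intrinsic dimension.

    For each point, mu = r2 / r1 is the ratio of its second- and
    first-nearest-neighbor distances; the ML estimate is N / sum(log mu).
    """
    dists = cdist(X, X)             # pairwise Euclidean distances
    dists.sort(axis=1)              # column 0 is the zero self-distance
    mu = dists[:, 2] / dists[:, 1]  # r2 / r1 for every point
    return len(X) / np.log(mu).sum()

# Sanity check: 2000 points on a 5-dimensional subspace embedded in R^50.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5)) @ rng.normal(size=(5, 50))
print(f"estimated ID: {twonn_id(X):.1f}")  # close to 5
```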
**Image Captioning on COCO**

- Dataset & Annotation
- Pruning
  - Download the uncompressed model from this link and place it in the `pretrained` folder. Update the `pretrained` path in the config.
  - To prune BLIP by 80% and add Intrinsic Dimension to the importance-score metric during pruning, run (a sketch of the ID-weighted score follows this list):

    ```bash
    python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_caption.py --final_threshold 0.2 --model_dir coco/PLATON80 --pruner_name PLATON --useID
    ```
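With `--useID`, each layer's ID is folded into the pruner's importance score. Below is a minimal sketch of how an ID weighting can be combined with a PLATON-style score (EMA-smoothed sensitivity times EMA-smoothed uncertainty); the function name, the `layer_id` argument, and the exact way ID enters the score are illustrative assumptions, not the repository's API:

```python
import torch

def platon_id_score(weight, grad, ema_imp, ema_unc,
                    layer_id, beta1=0.85, beta2=0.95):
    """PLATON-style importance score scaled by a per-layer ID weight.

    `layer_id` is a hypothetical normalized ID estimate for this layer;
    the paper's exact combination rule may differ in detail.
    """
    imp = (weight * grad).abs()                    # sensitivity |w * dL/dw|
    ema_imp = beta1 * ema_imp + (1 - beta1) * imp  # smoothed sensitivity
    unc = (imp - ema_imp).abs()                    # deviation = uncertainty
    ema_unc = beta2 * ema_unc + (1 - beta2) * unc  # smoothed uncertainty
    score = ema_imp * ema_unc * layer_id           # ID-weighted importance
    return score, ema_imp, ema_unc                 # keep EMAs for next step
```

Weights with the lowest scores are masked until the target sparsity is reached; `--final_threshold 0.2` keeps 20% of the weights, i.e. prunes 80%.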
- Evaluation
  - Place the pruned model in the `output` folder and update the `--pretrained` path in the scripts.
  - To evaluate the pruned model, run (a quick sparsity check follows this list):

    ```bash
    python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_caption.py --pruner_name PLATON --pruned output/pruned_model_path --evaluate
    ```
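Before evaluating, it can be useful to confirm that the checkpoint really is ~80% sparse. A minimal sketch, assuming the checkpoint is a dict holding its state dict under the key `model` (the path is the placeholder from the command above):

```python
import torch

ckpt = torch.load("output/pruned_model_path", map_location="cpu")
state = ckpt.get("model", ckpt)  # assumed layout; fall back to a bare state dict
floats = [t for t in state.values()
          if torch.is_tensor(t) and t.is_floating_point()]
total = sum(t.numel() for t in floats)
zeros = sum((t == 0).sum().item() for t in floats)
print(f"overall sparsity: {zeros / total:.2%}")  # ~80% for --final_threshold 0.2
```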
**NLVR**

- Dataset & Annotation
- Pruning
  - Download the uncompressed model from this link and place it in the `pretrained` folder. Update the `pretrained` path in the config.
  - To prune BLIP by 80% and add Intrinsic Dimension to the importance-score metric during pruning, run:

    ```bash
    python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_nlvr.py --final_threshold 0.2 --model_dir nlvr/PLATON80 --pruner_name PLATON --useID
    ```
- Evaluation
  - Place the pruned model in the `output` folder and update the `--pretrained` path in the scripts.
  - To evaluate the pruned model, run:

    ```bash
    python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_nlvr.py --pruner_name PLATON --pruned output/pruned_model_path --evaluate
    ```
This code is built upon IntrinsicDimDeep, BLIP and PLATON, and we sincerely appreciate their contributions.
If you find this work useful, please consider citing our paper:
```bibtex
@inproceedings{
  wang2024exploring,
  title={Exploring Intrinsic Dimension for Vision-Language Model Pruning},
  author={Hanzhang Wang and Jiawen Zhang and Qingyuan Ma},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024},
  url={https://openreview.net/forum?id=xxL7CEWuxz}
}
```