
Exploring Intrinsic Dimension for Vision-Language Model Pruning

This is the official implementation of the ICML'24 paper Exploring Intrinsic Dimension for Vision-Language Model Pruning.

🏃‍♂️ TL;DR

The Intrinsic Dimension (ID) of vision representations spans a wider range and reaches higher values than that of language representations, which we attribute to the heightened sensitivity of vision models to pruning. In contrast, language models remain robust to pruning despite containing more redundant weights.

🔨 Installation

This code is tested with PyTorch 1.11.0, CUDA 11.5, and Python 3.9.0. Install the dependencies with:

conda install --yes --file requirements.txt

📐 Evaluation of IDs

  • CPU Mode
    python ComputeID.py -n 2000 --Path ID/Blip_coco --cpu
  • GPU Mode
    python ComputeID.py -n 2000 --Path ID/Blip_coco --gpu 0
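
For intuition, the snippet below sketches a TwoNN intrinsic-dimension estimate in the style of IntrinsicDimDeep, which this repository builds on. It is a simplified illustration only; ComputeID.py is the authoritative implementation, and the feature matrix here is random placeholder data.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def twonn_id(x):
        """Estimate the intrinsic dimension of representations x of shape (num_samples, num_features)."""
        # Distances to each point's two nearest neighbours (column 0 is the point itself).
        dists, _ = NearestNeighbors(n_neighbors=3).fit(x).kneighbors(x)
        mu = dists[:, 2] / dists[:, 1]          # ratio of 2nd to 1st neighbour distance
        mu = mu[np.isfinite(mu) & (mu > 1.0)]   # drop duplicates / degenerate points
        # Maximum-likelihood estimate: mu follows a Pareto law whose exponent is the ID.
        return len(mu) / np.sum(np.log(mu))

    feats = np.random.randn(2000, 768)          # placeholder for a layer's representations
    print(f"Estimated ID: {twonn_id(feats):.1f}")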

✂️ Pruning

Image Captioning on the COCO Caption Dataset with BLIP

  • Dataset & Annotation

    1. Download the COCO2014 dataset and unzip it under the datasets folder. Update the image_root in config.
    2. Download all-in-one annotations from this link, unzip it under the coco/annotation folder, and update the annotation in config.
  • Pruning

    1. Download the uncompressed model from this link and place it in the pretrained folder. Update the pretrained in config.
    2. To prune BLIP by 80% and incorporate Intrinsic Dimension into the importance score during pruning (a conceptual sketch of this re-weighting follows this list), run:
      python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_caption.py --final_threshold 0.2 --model_dir coco/PLATON80 --pruner_name PLATON --useID
  • Evaluation

    1. Place the pruned model in the output folder and point --pruned in the command below to it.
    2. To evaluate the pruned model, run:
      python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_caption.py --pruner_name PLATON --pruned output/pruned_model_path --evaluate
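
The --useID flag folds the per-layer Intrinsic Dimension into the PLATON importance scores. The sketch below illustrates one possible form of such a re-weighting; the helper names and the exact scaling rule are hypothetical and only meant to convey the idea, while the authoritative logic lives in this repository's pruner.

    import torch

    def id_weighted_importance(importance, layer_id):
        """importance: layer name -> per-weight importance tensor (PLATON-style sensitivity x uncertainty).
        layer_id: layer name -> scalar intrinsic dimension of that layer's representations (hypothetical inputs)."""
        max_id = max(layer_id.values())
        # Illustrative rule: layers whose representations have higher ID get larger scores,
        # so a single global threshold prunes them less aggressively.
        return {name: score * (layer_id[name] / max_id) for name, score in importance.items()}

    def prune_by_threshold(scores, final_threshold=0.2):
        """Keep the top `final_threshold` fraction of weights globally (0.2 -> 80% pruned)."""
        flat = torch.cat([s.flatten() for s in scores.values()])
        k = max(1, int(final_threshold * flat.numel()))
        cutoff = torch.topk(flat, k).values.min()
        return {name: (s >= cutoff).float() for name, s in scores.items()}  # binary keep-masks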

Visual Reasoning on the NLVR2 Dataset with BLIP

  • Dataset & Annotation

    1. Download the NLVR2 dataset and unzip it under the datasets folder. Update the image_root in config.
    2. Download all-in-one annotations from this link, unzip it under the nlvr/annotation folder, and update the annotation in config.
  • Pruning

    1. Download the uncompressed model from this link and place it in the pretrained folder. Update the pretrained in config.
    2. To prune BLIP by 80% and incorporate Intrinsic Dimension into the importance score during pruning, run:
      python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_nlvr.py --final_threshold 0.2 --model_dir nlvr/PLATON80 --pruner_name PLATON --useID
  • Evaluation

    1. Place the pruned model in the output folder and point --pruned in the command below to it.
    2. To evaluate the pruned model, run:
      python -m torch.distributed.run --nproc_per_node=2 --master_port=29505 train_nlvr.py --pruner_name PLATON --pruned output/pruned_model_path --evaluate
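
Before evaluating, it can help to confirm that a pruned checkpoint actually reaches the target sparsity. The check below is a hypothetical sketch: it assumes the checkpoint is (or contains under a "model" key) a standard state_dict, so adjust the path and key to your setup.

    import torch

    ckpt = torch.load("output/pruned_model_path", map_location="cpu")
    state = ckpt.get("model", ckpt) if isinstance(ckpt, dict) else ckpt

    total = zeros = 0
    for name, tensor in state.items():
        if not torch.is_tensor(tensor) or tensor.dim() < 2:
            continue                              # count only prunable weight matrices
        total += tensor.numel()
        zeros += (tensor == 0).sum().item()

    print(f"Overall sparsity of weight matrices: {zeros / total:.1%}")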

💐 Acknowledgments

This code is built upon IntrinsicDimDeep, BLIP and PLATON, and we sincerely appreciate their contributions.

🌸 Citation

If you find this work useful, please consider citing our paper:

@inproceedings{wang2024exploring,
  title={Exploring Intrinsic Dimension for Vision-Language Model Pruning},
  author={Hanzhang Wang and Jiawen Zhang and Qingyuan Ma},
  booktitle={Forty-first International Conference on Machine Learning},
  year={2024},
  url={https://openreview.net/forum?id=xxL7CEWuxz}
}
