This repository contains the official code for the paper "Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection". (TODO: add link once the paper is on arXiv)
First, create a virtual environment for the project (we use Conda to create a Python 3.9 environment) and install all the requirements from `requirements.txt`:

```bash
conda create -n ood_det python==3.9
conda activate ood_det
pip install -r requirements.txt
```
Now, create two directories:
- A data directory
- A model directory, with two subdirectories: `pretrained_models` and `finetuned_models`

Both directories can be located anywhere, but their paths should be specified in `config.py` (see more below). A sketch of this layout is shown below.
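The layout can be created with a few lines like the following (a minimal sketch; the `data/` and `models/` locations are placeholder paths of our choosing, not names the code requires):

```python
# Sketch only: create the expected directory layout.
# "data" and "models" are placeholder paths -- use any locations you like,
# then point DATA_DIR and MODEL_DIR in config.py at them.
from pathlib import Path

data_dir = Path("data")
model_dir = Path("models")

data_dir.mkdir(parents=True, exist_ok=True)
(model_dir / "pretrained_models").mkdir(parents=True, exist_ok=True)
(model_dir / "finetuned_models").mkdir(parents=True, exist_ok=True)
```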
At the start of each session, run the following:

```bash
. bin/start.sh
```
Directory paths and training arguments can be specified in `config.py`. Some important arguments are:
- `DATA_DIR`: Path to the data directory.
- `MODEL_DIR`: Path to the model directory.
- `task_name`: Name of the dataset to use as in-distribution (ID) data.
- `ood_datasets`: List of datasets to use as out-of-distribution (OOD) data.
- `model_class`: Type of model to use. Options are `roberta`, `gpt2` and `t5` (the base versions of all models are used).
- `do_train`: If true, trains the model on the ID data before performing OOD detection.
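As a rough illustration, setting these values in `config.py` might look like the snippet below. This is a sketch under the assumption that `config.py` exposes plain module-level variables; the actual structure of the file, and the exact dataset identifiers it accepts (`"sst2"`, `"20ng"`, etc. here are placeholders), may differ.

```python
# config.py -- illustrative sketch only; the real file may be organized differently.
DATA_DIR = "/path/to/data"        # path to the data directory
MODEL_DIR = "/path/to/models"     # path to the model directory

task_name = "sst2"                # in-distribution (ID) dataset (placeholder name)
ood_datasets = ["20ng", "rte"]    # out-of-distribution (OOD) datasets (placeholder names)
model_class = "roberta"           # one of: "roberta", "gpt2", "t5"
do_train = False                  # if True, fine-tune on ID data before OOD detection
```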
After specifying the arguments, run the following command:

```bash
python run_ood_detection.py
```
Running the command below will extend the pretraining process on a specified dataset. Once again, the values in `config.py` can be used to specify the dataset and model to be used.

```bash
python tapt_training/pretrain_roberta.py
```
After the new model has been saved to the `pretrained_models` subdirectory of `MODEL_DIR` (specified in `config.py`), it can be used for OOD detection by running `run_ood_detection.py`.
If you find this repo helpful, you are welcome to cite our work:
```bibtex
@inproceedings{uppaal2023fine,
  title={Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection},
  author={Uppaal, Rheeya and Hu, Junjie and Li, Yixuan},
  booktitle={Annual Meeting of the Association for Computational Linguistics},
  year={2023}
}
```
Our codebase borrows from the following:
```bibtex
@inproceedings{zhou2021contrastive,
  title={Contrastive Out-of-Distribution Detection for Pretrained Transformers},
  author={Zhou, Wenxuan and Liu, Fangyu and Chen, Muhao},
  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
  pages={1100--1111},
  year={2021}
}

@article{liu2020tfew,
  title={Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning},
  author={Liu, Haokun and Tam, Derek and Muqeeth, Mohammed and Mohta, Jay and Huang, Tenghao and Bansal, Mohit and Raffel, Colin},
  journal={arXiv preprint arXiv:2205.05638},
  year={2022}
}
```