Welcome to the official PyTorch implementation of GEM (Entropic Distribution Matching in Supervised Fine-tuning)! 🎉 Introduced in our paper *Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity*, GEM is your go-to method for improving model generalization and output diversity. 🌍✨
Tired of overfitting when using standard cross-entropy loss in supervised fine-tuning (SFT)? GEM is here to help! 🚀
- Lower Perplexity: Get better evaluation results with consistently lower perplexity than cross-entropy (CE). 📉
- Improved Downstream Performance: Achieve higher performance on downstream tasks. 🏆
- Enhanced Output Diversity: Generate more diverse outputs, which is especially useful for test-time scaling with best-of-n strategies. 🌈💡
First, create a new environment and install the required packages:
```bash
conda create -n gem python=3.10
conda activate gem
pip install -r requirements.txt
```
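If you want, you can quickly verify the environment before moving on; this check is illustrative and not part of the repository's scripts:

```python
# Optional sanity check (not part of the repo's scripts): confirm PyTorch is
# installed and can see a GPU.
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA device is usable
```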
Kickstart your training process using the UltraFeedback dataset from HuggingFace. Here’s how:
**Tokenize Data**

```bash
bash scripts/tokenize_data.sh
```
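As a rough illustration of what this step does (the dataset name, model name, and preprocessing details below are assumptions, not the repository's actual script), chat-style tokenization with HuggingFace tooling typically looks like this:

```python
# Illustrative sketch of chat-style SFT tokenization (not the repo's tokenize_data.sh logic).
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-1.5B-Instruct")  # placeholder model
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_sft")  # placeholder dataset/split

def tokenize(example):
    # Render the conversation with the model's chat template, then tokenize it.
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)
```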
**Training**

```bash
bash scripts/train_gem_ultrafeedback.sh
```
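Once training finishes, you can load the checkpoint with standard `transformers` APIs and sample a few completions to get a feel for output diversity. The checkpoint path and generation settings below are illustrative, not prescribed by the training script:

```python
# Illustrative: sample several completions from a fine-tuned checkpoint to inspect diversity.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "outputs/gem_ultrafeedback"  # placeholder path to your trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "Write a short story about a lighthouse keeper."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,          # stochastic sampling
    temperature=1.0,
    max_new_tokens=128,
    num_return_sequences=4,  # draw several samples to compare diversity
)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```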
Run evaluations for different tasks:
**Instruction Following**

```bash
bash scripts/eval/if_eval.sh
```
**GSM8K**

```bash
bash scripts/eval/gsm8k_eval.sh
```
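GSM8K reference solutions end with a final line of the form `#### <answer>`, so evaluation typically extracts the last number from the model's output and compares it with that reference. A minimal sketch of such a check (not necessarily how this repository's script implements it):

```python
# Minimal sketch of GSM8K-style answer checking: compare the last number in the
# model output against the reference answer after the "####" marker.
import re

def extract_last_number(text: str) -> str | None:
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def is_correct(model_output: str, reference_solution: str) -> bool:
    gold = reference_solution.split("####")[-1].strip().replace(",", "")
    pred = extract_last_number(model_output)
    return pred is not None and pred == gold

print(is_correct("... so the total is 42.", "Some reasoning.\n#### 42"))  # True
```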
**GSM8K (Voting)**

```bash
bash scripts/eval/gsm8k_voting_eval.sh
```
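The voting variant samples multiple solutions per problem and aggregates them, typically by majority vote over the extracted final answers (self-consistency). A minimal sketch of that aggregation step, with the inputs assumed to be answers already extracted from n sampled completions:

```python
# Minimal sketch of majority-vote aggregation over sampled answers (self-consistency).
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    # Pick the most frequent extracted answer among the sampled completions.
    return Counter(answers).most_common(1)[0][0]

samples = ["42", "41", "42", "42", "40"]  # final answers extracted from n sampled solutions
print(majority_vote(samples))  # "42"
```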
**Creative Writing**

```bash
bash scripts/eval/creative_writing.sh
```
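Creative-writing outputs are usually judged on diversity as well as quality; one common surface-level diversity measure is distinct-n, the fraction of unique n-grams across a set of generations. This is a generic sketch and not necessarily the metric computed by `creative_writing.sh`:

```python
# Generic distinct-n diversity metric over a set of generations (illustrative only).
def distinct_n(texts: list[str], n: int = 2) -> float:
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / max(len(ngrams), 1)

generations = ["the old lighthouse stood alone", "a storm rolled over the old lighthouse"]
print(round(distinct_n(generations, n=2), 3))  # higher values indicate more diverse outputs
```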
Results of the fine-tuned models from the above scripts are available here.
If you find this repository helpful in your research or projects, please consider citing the GEM paper in your academic work. Your support is much appreciated! 🙌
```bibtex
@article{li2024entropic,
  title={Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity},
  author={Li, Ziniu and Chen, Congliang and Xu, Tian and Qin, Zeyu and Xiao, Jiancong and Sun, Ruoyu and Luo, Zhi-Quan},
  journal={arXiv preprint arXiv:2408.16673},
  year={2024}
}
```
Ziniu Li would like to acknowledge Zhengyang Tang for his minimalistic and clean implementation of SFT.