
🚀 PyTorch Implementation of GEM 🌟

Welcome to the official PyTorch implementation of GEM (Entropic Distribution Matching in Supervised Fine-tuning)! 🎉 Introduced in our paper Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity, GEM is a drop-in alternative to the cross-entropy loss that improves model generalization and output diversity. 🌍✨

Why GEM? 🤔

Tired of overfitting when using standard cross-entropy loss in supervised fine-tuning (SFT)? GEM is here to help! 🚀

  • Lower Perplexity: Achieve consistently lower test perplexity than cross-entropy (CE) training. 📉
  • Improved Downstream Performance: Reach higher accuracy on downstream tasks. 🏆
  • Enhanced Output Diversity: Generate more diverse outputs, which is especially useful for test-time scaling with best-of-n strategies. 🌈💡
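To make the contrast with plain CE concrete, here is a minimal PyTorch sketch of an entropy-regularized token-level loss. It only illustrates the trade-off GEM targets (fitting the data while keeping output entropy high); it is not the exact GEM objective from the paper, and the function name and beta weight are hypothetical:

# Hypothetical sketch: cross-entropy plus an entropy bonus. This is NOT the
# exact GEM objective -- see the paper for the real formulation.
import torch
import torch.nn.functional as F

def entropy_regularized_ce(logits, labels, beta=0.1, ignore_index=-100):
    # logits: (batch, seq, vocab); labels: (batch, seq), prompt tokens masked with -100.
    ce = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), labels.reshape(-1),
        ignore_index=ignore_index,
    )
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(-1)   # per-token entropy
    mask = (labels != ignore_index).float()
    entropy = (entropy * mask).sum() / mask.sum().clamp(min=1.0)
    return ce - beta * entropy                         # reward higher entropy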

Quickstart Guide 💻

Setup 🔧

First, create a new environment and install the required packages:

conda create -n gem python=3.10
conda activate gem
pip install -r requirements.txt

Training 🏋️‍♂️

Kickstart your training with the UltraFeedback dataset from Hugging Face. Here’s how:

Tokenize Data

bash scripts/tokenize_data.sh
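Conceptually, this step maps chat-formatted UltraFeedback examples to token IDs and saves them for training. A rough sketch of what it can look like with the datasets and transformers libraries (the model name, dataset split, column name, max length, and output path below are assumptions, not the script's actual settings):

# Hypothetical tokenization sketch; the names and paths here are placeholders,
# not the repo's actual configuration.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_sft")

def tokenize(example):
    text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
    return tokenizer(text, truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)
tokenized.save_to_disk("data/ultrafeedback_tokenized")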

Train

bash scripts/train_gem_ultrafeedback.sh
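For intuition, a single training step with the entropy-regularized loss sketched in the Why GEM? section might look like the following (the model name, learning rate, and batch layout are placeholders; the actual script configures training through its own arguments):

# Hypothetical single SFT update; reuses entropy_regularized_ce from the
# sketch above. Not the repo's actual training loop.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-0.5B-Instruct")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def sft_step(batch):
    # batch tensors: (batch, seq); prompt positions in labels masked with -100.
    out = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])
    logits = out.logits[:, :-1, :]   # predict token t+1 from the prefix up to t
    labels = batch["labels"][:, 1:]
    loss = entropy_regularized_ce(logits, labels)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()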

Evaluation 🧪

Run evaluations for different tasks:

Instruction Following

bash scripts/eval/if_eval.sh

GSM8K

bash scripts/eval/gsm8k_eval.sh

GSM8K (Voting)

bash scripts/eval/gsm8k_voting_eval.sh
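Voting here means sampling several completions per question (at a temperature above zero) and keeping the most common final answer, which is where GEM's output diversity pays off. A minimal sketch of the voting step (the answer-extraction regex is an assumption, not the script's exact logic):

# Hypothetical majority-voting sketch for GSM8K-style numeric answers.
import re
from collections import Counter

def extract_answer(completion: str) -> str | None:
    # Take the last number in the completion as the candidate answer.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return numbers[-1] if numbers else None

def majority_vote(completions: list[str]) -> str | None:
    answers = [a for a in map(extract_answer, completions) if a is not None]
    return Counter(answers).most_common(1)[0][0] if answers else None

samples = ["... so the answer is 42", "The total is 42.", "I get 41"]
print(majority_vote(samples))  # -> "42"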

Creative Writing

bash scripts/eval/creative_writing.sh

Results of the models fine-tuned with the above scripts are available here.

📜 Citation

If you find this repository helpful in your research or projects, please consider citing the GEM paper. Your support is much appreciated! 🙌

@article{li2024entropic,
  title={Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity},
  author={Li, Ziniu and Chen, Congliang and Xu, Tian and Qin, Zeyu and Xiao, Jiancong and Sun, Ruoyu and Luo, Zhi-Quan},
  journal={arXiv preprint arXiv:2408.16673},
  year={2024}
}

Ziniu Li would like to acknowledge Zhengyang Tang for his minimalistic and clean implementation of SFT.
