Rectified Rotary Position Embeddings (ReRoPE)

With ReRoPE, the context length of LLMs can be extended more effectively, without any fine-tuning.

Blog

https://kexue.fm/archives/9706 (Chinese)
https://kexue.fm/archives/9708 (Chinese)
https://normxu.github.io/Rethinking-Rotary-Position-Embedding-2/ (English by @NormXU)
https://normxu.github.io/Rethinking-Rotary-Position-Embedding-3/ (English by @NormXU)

Idea
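
As a rough illustration of the idea described in the blog posts above: ReRoPE keeps ordinary relative positions inside a window w and holds the position at w for everything farther away, while Leaky ReRoPE lets positions beyond the window keep growing with a smaller step 1/k. The sketch below only builds the relative-position matrix with NumPy. It is not the code in rerope_patch.py, and the function name and its parameters (`seq_len`, `window`, `k`) are hypothetical names chosen for this example.

```python
import numpy as np

def rerope_relative_positions(seq_len, window, k=None):
    """Relative position assigned to each (query i, key j) pair with j <= i.

    Hypothetical helper for illustration only.
    k=None : ReRoPE, distances beyond `window` are held at `window`.
    k=num  : Leaky ReRoPE, distances beyond `window` grow with step 1/k.
    """
    i = np.arange(seq_len)[:, None]   # query index, one per row
    j = np.arange(seq_len)[None, :]   # key index, one per column
    rel = (i - j).astype(float)       # ordinary RoPE relative position
    if k is None:
        rectified = np.minimum(rel, window)
    else:
        rectified = np.where(rel < window, rel, window + (rel - window) / k)
    # Causal attention never looks at future keys, so zero out j > i.
    return np.tril(rectified)

print(rerope_relative_positions(6, window=3))
print(rerope_relative_positions(6, window=3, k=2))
```

Under this reading, the w1024 in the Results table would correspond to `window=1024`. Because no relative position exceeds the window (ReRoPE) or positions grow only slowly past it (Leaky ReRoPE), the model does not see position values far outside its training range.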

Results

Loss measured on llama2-13b with samples_15k.jsonl:

Method	Loss
RoPE-4k (original llama2-13b)	1.4967
RoPE-8k (original llama2-13b)	8.8615
NTK-RoPE-4k (not dynamic)	1.6081
NTK-RoPE-8k (not dynamic)	1.5417
NTK-RoPE-16k (not dynamic)	1.5163
ReRoPE-w1024-4k	1.4996
ReRoPE-w1024-8k	1.4267
ReRoPE-w1024-16k	1.4001

At the training length (4k), ReRoPE's loss is almost unchanged from the original RoPE (1.4996 vs. 1.4967), and it shows the ideal property of "longer context, lower loss": 1.4267 at 8k and 1.4001 at 16k.

Usage

Dependency: transformers 4.31.0

Run python test.py to try chatting, or run python eval_loss.py to compute the loss with llama2.

The modifications that ReRoPE and Leaky ReRoPE make to the original llama implementation can be seen in the patch files (rerope_patch.py and leaky_rerope_patch.py).

Other

Triton Implementation of ReRoPE: https://gist.github.com/chu-tianxiang/4307937fd94b49c75b61a6967716bae9

Cite

@misc{rerope2023,
  title={Rectified Rotary Position Embeddings},
  author={Jianlin Su},
  year={2023},
  howpublished={\url{https://github.com/bojone/rerope}},
}

Communication

QQ discussion group: 67729435. To join the WeChat group, add the bot account with WeChat ID spaces_ac_cn.
