[PaddleNLP 3.0] Update README #8681
Thanks for your contribution!
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #8681 +/- ##
========================================
Coverage 55.42% 55.42%
========================================
Files 626 626
Lines 98082 98082
========================================
Hits 54364 54364
Misses 43718 43718
README.md
Outdated
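* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-pipeline LLM toolchain](./llm), covering pre-training, fine-tuning, compression, inference, and deployment, giving users an end-to-end LLM solution and a one-stop development experience; ships a built-in [4D parallel distributed Trainer](./docs/trainer.md), [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), [in-house INT8/INT4 quantization algorithms](./llm#6-量化), and more; fully supports mainstream LLMs including [LLaMA 1/2](./llm/llama), [BLOOM](.llm/bloom), [ChatGLM 1/2](./llm/chatglm), [GLM](./llm/glm), and [OPT](./llm/opt)
* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-pipeline LLM toolchain](./llm), covering pre-training, fine-tuning, compression, inference, and deployment, giving users an end-to-end LLM solution and a one-stop development experience; ships a built-in [4D parallel distributed Trainer](./docs/trainer.md), [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), [in-house INT8/INT4 quantization algorithms](./llm#6-量化), and more; fully supports mainstream LLMs including [LLaMA 1/2](./llm/config/llama), [BLOOM](.llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), [GLM](./llm/config/glm), and [OPT](./llm/config/opt)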
The BLOOM link is broken, and GLM could simply be dropped.
The BLOOM link has been fixed, and GLM has been removed.
Signed-off-by: Zhang Jun <jzhang533@gmail.com>
…n-readme' into dev_update_readme_3.0beta
7d0627c to b4a5862
…ddleNLP into dev_update_readme_3.0beta
README.md
Outdated
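* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0)**: Embrace large models with a fully upgraded experience. A unified LLM toolchain with full-pipeline support for domestic compute chips; full support for industrial-grade LLM workflows such as PaddlePaddle 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference; the in-house, fast-converging RsLoRA+ algorithm, the Unified Checkpoint storage mechanism with automatic scaling, and general support for FastFFN and FusedQKV accelerate LLM training and inference; mainstream models continue to be supported and updated, providing efficient solutions.
* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: The in-house, fast-converging RsLoRA+ algorithm greatly improves PEFT training convergence speed and training quality; high-performance generation acceleration is introduced into the RLHF PPO algorithm, removing the generation-speed bottleneck in PPO training and giving PPO training a large performance lead. General support for FastFFN, FusedQKV, and other LLM training performance optimizations makes LLM training faster and more stable.
* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0)**: Embrace large models with a fully upgraded experience. A unified LLM toolchain with full-pipeline support for domestic compute chips; full support for industrial-grade LLM workflows such as PaddlePaddle 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference; the in-house, fast-converging RsLoRA+ algorithm, the Unified Checkpoint storage mechanism with automatic scaling, and general support for FastFFN and FusedQKV accelerate LLM training and inference; mainstream models continue to be supported and updated, providing efficient solutions.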
Please unify the two terms "toolchain" (工具链) and "suite" (套件) throughout; both the LLM directory and the main README still use these related terms.
Removed the term "toolchain" (工具链) and now use "suite" (套件) consistently.
README.md
Outdated
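Supports high-performance 4D training with data, sharding, tensor, and pipeline parallelism; the Trainer supports configurable distributed strategies, reducing the usage cost of complex distributed combinations;
The Unified Checkpoint LLM storage format supports dynamically scaling training up or down across model parameter distributions, reducing the migration cost of switching hardware.
Supports high-performance 4D training with data, sharding, tensor, and pipeline parallelism; the Trainer supports configurable distributed strategies, reducing the usage cost of complex distributed combinations;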
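Model parallel strategy, group-sharded parameter combination, pipeline parallel strategy, and data parallel strategy: the description of 4D parallelism here could be aligned with the official terminology.
Following the distributed training introduction on the official website, changed to: "Supports high-performance 4D training with the pure data parallel strategy, the group-sharded (parameter-sliced) data parallel strategy, the tensor model parallel strategy, and the pipeline model parallel strategy."
Excerpt:
For models with hundreds of billions of parameters or more, PaddlePaddle's multi-dimensional hybrid parallel strategy can be used. It effectively combines pure data parallelism, group-sharded (parameter-sliced) data parallelism, tensor model parallelism, pipeline model parallelism, expert parallelism, and other strategies, giving users an efficient distributed training solution for large models.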
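As a rough illustration of how the four strategies named in this wording compose, here is a minimal Python sketch. It assumes each device belongs to exactly one group along every dimension, so the product of the four degrees equals the device count. `ParallelDegrees` and `infer_data_parallel_degree` are hypothetical names for this example, not PaddleNLP or PaddlePaddle APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelDegrees:
    """Hypothetical container for the four parallel degrees (not a PaddleNLP class)."""

    data: int      # pure data parallel
    sharding: int  # group-sharded (parameter-sliced) data parallel
    tensor: int    # tensor model parallel
    pipeline: int  # pipeline model parallel

    @property
    def world_size(self) -> int:
        # Assumption: the four degrees multiply to the total device count.
        return self.data * self.sharding * self.tensor * self.pipeline


def infer_data_parallel_degree(
    num_devices: int, sharding: int, tensor: int, pipeline: int
) -> ParallelDegrees:
    """Derive the pure data-parallel degree left over once the other three are fixed."""
    if min(num_devices, sharding, tensor, pipeline) < 1:
        raise ValueError("all degrees and the device count must be positive")
    fixed = sharding * tensor * pipeline
    if num_devices % fixed != 0:
        raise ValueError(
            f"{num_devices} devices cannot be split evenly into groups of {fixed}"
        )
    return ParallelDegrees(
        data=num_devices // fixed,
        sharding=sharding,
        tensor=tensor,
        pipeline=pipeline,
    )
```

Under that assumption, 32 devices with `sharding=2`, `tensor=4`, and `pipeline=2` leave a pure data-parallel degree of 2.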
README.md
Outdated
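### <a href=#高效精调与高效对齐> 🤗 Efficient Fine-Tuning and Efficient Alignment </a>
Fine-tuning and alignment algorithms are deeply integrated with the zero-padding data stream and the high-performance FlashMask operator, reducing wasted padding and computation during training and greatly improving fine-tuning and alignment training throughput.
Fine-tuning and alignment algorithms are deeply integrated with the zero-padding data stream and the high-performance FlashMask operator, reducing wasted padding and computation during training and greatly improving fine-tuning and alignment training throughput.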
For alignment, the DPO algorithm does not use the FlashMask strategy yet, so drop the mention of alignment here.
Changed to: "Fine-tuning algorithms are deeply integrated with the zero-padding data stream and the high-performance FlashMask operator, reducing wasted padding and computation during training and greatly improving fine-tuning training throughput."
@@ -86,27 +88,28 @@ pip install --upgrade paddlenlp
pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html
Does this approach install the latest dev version?
Yes, it installs the latest version; checking paddlenlp.version.commit shows it is the latest build from the previous day.
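A small sketch of that check, for anyone who wants to repeat it after running the install command. The attribute `paddlenlp.version.commit` is taken from the reply above; the helper name `installed_paddlenlp_commit` is hypothetical, and the fallback to `None` when PaddleNLP cannot be imported is an assumption added so the snippet runs anywhere.

```python
import importlib
from typing import Optional


def installed_paddlenlp_commit() -> Optional[str]:
    """Return paddlenlp.version.commit, or None if PaddleNLP cannot be imported."""
    try:
        version_module = importlib.import_module("paddlenlp.version")
    except ImportError:
        # PaddleNLP (or one of its dependencies) is not installed here.
        return None
    commit = getattr(version_module, "commit", None)
    return None if commit is None else str(commit)


if __name__ == "__main__":
    print(installed_paddlenlp_commit())
```

Comparing the printed commit with the head of the develop branch would show how far behind the nightly wheel is.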
README.md
Outdated
>>> tokenizer.batch_decode(outputs[0])
['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?']
>>> print(tokenizer.batch_decode(outputs[0]))
['我是一个 AI 语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?']
There shouldn't be spaces here, should there? It looks like this needs to be exempted from the markdown formatting.
Updated the formatting code to exempt code blocks from the spacing inserted between mixed Chinese and English text.
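For readers unfamiliar with this kind of exemption, a minimal sketch of the idea follows. It is not the formatting script used in this PR: the function name, the regular expressions, and the restriction to the basic CJK block and to triple-backtick fences are all assumptions made for illustration, and the real tool may apply different spacing rules.

```python
import re

# Assumption: only the basic CJK Unified Ideographs block is treated as "Chinese".
CJK = r"\u4e00-\u9fff"
CJK_THEN_LATIN = re.compile(rf"([{CJK}])([A-Za-z0-9])")
LATIN_THEN_CJK = re.compile(rf"([A-Za-z0-9])([{CJK}])")

# Built at runtime so this example does not contain a literal fence marker.
FENCE = "`" * 3


def add_cjk_latin_spacing(markdown_text: str) -> str:
    """Insert a space between CJK and Latin/digit characters, skipping fenced code."""
    out = []
    in_fence = False
    for line in markdown_text.split("\n"):
        if line.lstrip().startswith(FENCE):
            # A fence line opens or closes a code block; leave it untouched.
            in_fence = not in_fence
            out.append(line)
            continue
        if not in_fence:
            line = CJK_THEN_LATIN.sub(r"\1 \2", line)
            line = LATIN_THEN_CJK.sub(r"\1 \2", line)
        out.append(line)
    return "\n".join(out)
```

With this sketch, prose such as `我是一个AI语言模型` becomes `我是一个 AI 语言模型`, while the same string inside a fenced block is returned unchanged, which is the behaviour requested for the REPL output above.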
README_en.md
Outdated
>>> tokenizer.batch_decode(outputs[0])
['我是一个AI语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?']
>>> print(tokenizer.batch_decode(outputs[0]))
['我是一个 AI 语言模型,我可以回答各种问题,包括但不限于:天气、新闻、历史、文化、科学、教育、娱乐等。请问您有什么需要了解的吗?']
Same as above.
Updated the formatting code to exempt code blocks from the spacing inserted between mixed Chinese and English text.
llm/README.md
Outdated
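This project supports pre-training of large models such as LLaMA, GPT-3, BaiChuan, Qwen, and Mixtral. Users can run it with one click simply by switching the config file.
For the detailed data preparation process, see [here](https://paddlenlp.readthedocs.io/zh/latest/llm/pretraining/dataset.html), [Pretrain and custom datasets](https://paddlenlp.readthedocs.io/zh/latest/llm/pretraining/dataset.html)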
It looks like two duplicate links were placed here.
@@ -158,8 +160,8 @@ python -u -m paddle.distributed.launch --gpus "0,1,2,3,4,5,6,7" run_finetune.py
How will the model list in the main README be presented going forward?
It will be presented through the issue list: #8663 (pinning requested)
Let's provide some links for this in the README.
Added a description and links:
- Model parameters are supported for the LLaMA, Baichuan, Bloom, ChatGLM, Gemma, Mistral, OPT, and Qwen series; full list 👉 [LLM] Model parameter support list
- 4D parallelism and operator optimizations are supported for the LLaMA, Baichuan, Bloom, ChatGLM, Gemma, Mistral, OPT, and Qwen series; full list 👉 [LLM] Model 4D parallelism and operator support list
docs/llm/docs/quantization.md
Outdated
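- `auto_clip`: whether AWQ automatically searches for a clipping value and clips the model weights accordingly. Clipping helps the accuracy of the quantized model, but the search is slow. Defaults to False.
- `autoclip_step`: number of AutoClip steps, i.e. the number of model forward passes; during sampling, each round's data is concatenated by default to search for the clipping value. Defaults to 8.
<summary>  Quantization arguments (QuantArgument)</summary>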
Fixed Chinese-English spacing and spelling issues here.
# Only check sibling headings
siblings_only: false
# MD025/single-title/single-h1 : Multiple top-level headings in the same document : https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md025.md
Nice!
LGTM
PR types
Others
PR changes
Docs
Description
markdownlint tool to format .md files.