remove refined recompute deep copy #9617
base: develop
Changes from 9 commits
b6429c0
6b93359
64084ca
258b48b
79b86f6
927d157
2878179
39bd2c8
92d7e82
aa91e8e
2486c2a
688eb18
1b14cba
c81a8b3
0035f4f
69a5123
b1f0ef8
https://github.com/PaddlePaddle/PaddleNLP/blob/1842d6d133525a2ba72d1e26aecbbca54b78f8f9/llm/run_finetune.py#L156C1-L160C6
Could this part be moved into the trainer or somewhere similar? Otherwise it is easy to forget to write it.
skip_recompute_ops is gone; where is it added now?
It has been moved here, as an attribute of the config base class; it defaults to an empty dict.
Please add it to the config entries that are not saved; otherwise it may affect loading in downstream tasks such as inference.
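A minimal sketch of the idea in this comment, assuming a config class whose `to_dict` skips a set of runtime-only keys. `SketchConfig`, `_unsavable_keys`, `refined_recompute`, and `mlp_op` are hypothetical names used for illustration only; they are not taken from the PR or from PaddleNLP's API.

```python
class SketchConfig:
    """Hypothetical stand-in for a config base class (not PaddleNLP's)."""

    # Assumed name: attributes kept in memory but never written to disk.
    _unsavable_keys = {"refined_recompute"}

    def __init__(self, hidden_size=8, refined_recompute=None):
        self.hidden_size = hidden_size
        # Default to an empty dict, as described in the thread.
        self.refined_recompute = {} if refined_recompute is None else refined_recompute

    def to_dict(self):
        # Drop runtime-only keys so a saved config stays loadable downstream.
        return {k: v for k, v in self.__dict__.items() if k not in self._unsavable_keys}


config = SketchConfig(refined_recompute={"mlp_op": 2})
saved = config.to_dict()
```

With this shape, the training-only setting stays available on the in-memory config, while anything that loads the saved dictionary (for example an inference job) never sees it.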
Codecov / codecov/patch: check warnings in paddlenlp/transformers/llama/modeling.py on lines 621, 623, 631, 633, 749, 751, 758, and 760.
Is it really necessary to add layer_idx here?
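Yes, it is required; otherwise there is no way to know whether rr (refined recompute) should be enabled for a given layer.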
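A rough sketch of why a per-layer index is needed, assuming the setting maps an op name to the number of leading layers in which that op skips recompute. These semantics, and the names `skip_recompute`, `SketchBlock`, and `mlp_op`, are assumptions made for illustration and are not confirmed by the PR.

```python
def skip_recompute(refined_recompute, op_name, layer_idx):
    """Return True if `op_name` in layer `layer_idx` should skip recompute.

    Assumed semantics: the dict maps an op name to the number of leading
    layers in which that op skips recompute; missing ops never skip.
    """
    return layer_idx < refined_recompute.get(op_name, 0)


class SketchBlock:
    """Hypothetical decoder block that records its own position."""

    def __init__(self, refined_recompute, layer_idx):
        self.layer_idx = layer_idx
        # Without layer_idx the block could not tell which side of the cutoff it is on.
        self.skip_mlp = skip_recompute(refined_recompute, "mlp_op", layer_idx)


# Every call site has to pass layer_idx when building the stack.
blocks = [SketchBlock({"mlp_op": 2}, layer_idx=i) for i in range(4)]
flags = [b.skip_mlp for b in blocks]
```

Under these assumptions only the first two blocks skip recompute for `mlp_op`, which is a decision a block cannot make without knowing its own index.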
Codecov / codecov/patch: check warnings in paddlenlp/transformers/qwen/modeling.py on lines 169, 171, 178, 180, 423, 425, 432, and 434.
The external call sites of QwenBlock are missing the layer_idx argument; please check them.