grad norm too large, gradient explosion #6249
Labels
solved
This problem has been already solved
Comments
use bf16
hiyouga added the solved label and removed the pending label (This problem is yet to be addressed) on Dec 5, 2024
Yes, that is already set.
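It was most likely a hardware problem: switching to a different GPU fixed it. The hardware on which the grad norm exploded is now reporting ECC errors.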
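For anyone who suspects the same cause, here is a minimal sketch of how ECC error counters could be read per GPU. It assumes `nvidia-smi` is on the PATH and that the driver exposes the `ecc.errors.uncorrected.volatile.total` query field. The helper names `parse_ecc_csv` and `query_ecc` are hypothetical and not part of LLaMA-Factory.

```python
import subprocess

# Fields requested from nvidia-smi; the ECC field is reported as a
# non-numeric placeholder on GPUs without ECC support.
QUERY = "index,name,ecc.errors.uncorrected.volatile.total"


def parse_ecc_csv(text):
    """Parse `--format=csv,noheader` output into (index, name, count) tuples.

    count is None when the value is not a plain integer (e.g. ECC unsupported).
    """
    rows = []
    for line in text.strip().splitlines():
        parts = [field.strip() for field in line.split(",")]
        if len(parts) < 3:
            continue  # skip lines that do not match the expected layout
        index, count = parts[0], parts[-1]
        name = ",".join(parts[1:-1])
        rows.append((int(index), name, int(count) if count.isdigit() else None))
    return rows


def query_ecc():
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_ecc_csv(result.stdout)
```

A GPU whose uncorrected count is non-zero would be a reasonable first candidate to swap out before debugging the training configuration further.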
Reminder
System Info
Name: llamafactory
Version: 0.9.2.dev0
Reproduction
Expected behavior
The grad norm reaches values in the hundreds of thousands. I don't know why this happens. Is gradient clipping not taking effect?
A normal grad norm should be below 10.
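On the clipping question, one possible source of confusion is that norm-based clipping usually reports the norm measured before scaling, which is how `torch.nn.utils.clip_grad_norm_` behaves. Assuming the logged grad norm comes from such a return value, a very large number does not by itself show that clipping was skipped. The pure-Python sketch below illustrates this with a hypothetical `clip_grad_norm` helper and made-up gradient values; it is not the LLaMA-Factory implementation.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Scale grads in place so their global L2 norm is at most max_norm.

    Returns the norm measured BEFORE clipping, which is the value a
    training loop would typically log.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Scaling factor is capped at 1.0 so small gradients are left unchanged.
    clip_coef = min(1.0, max_norm / (total_norm + eps))
    for i in range(len(grads)):
        grads[i] *= clip_coef
    return total_norm


# Made-up gradient values chosen to give a norm in the hundreds of thousands.
grads = [300000.0, 400000.0]
reported = clip_grad_norm(grads, max_norm=1.0)
after = math.sqrt(sum(g * g for g in grads))

print(f"logged grad norm (pre-clip): {reported}")
print(f"norm actually applied (post-clip): {after}")
```

Clipping of this kind cannot repair gradients that are already NaN or infinite, so a sudden jump like the one described still deserves investigation even when clipping is enabled.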
Others
No response