epic: Replace optimizers or code new optimizers? #103

Open
tikikun opened this issue Nov 7, 2024 · 4 comments

tikikun commented Nov 7, 2024

Goal

A few recently proposed optimizers may converge faster than our current one. Should we consider switching?

Resources

https://arxiv.org/abs/2411.02853

@tikikun tikikun added the type: epic A major feature or initiative label Nov 7, 2024
@tikikun tikikun self-assigned this Nov 7, 2024
@tikikun tikikun added this to Research Nov 7, 2024

tikikun commented Nov 11, 2024

@tuanlda78202 you can also look into this


tikikun commented Nov 19, 2024

Precision scaling

https://arxiv.org/abs/2411.04330

@tikikun tikikun changed the title epic: Replace optimizers? epic: Replace optimizers or code new optimizers? Nov 19, 2024

tikikun commented Nov 20, 2024

Some observations:

  • Higher precision can lead to better convergence
  • The precision scaling paper used the same optimizer; we might get better results with a better optimizer
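
One plausible mechanism behind the first observation (an assumption for illustration, not something the thread or the linked papers are quoted as stating) is that low-precision weights can silently drop small updates. The sketch below emulates bfloat16 rounding in pure Python; `to_bf16`, `to_fp32`, and `apply_updates` are hypothetical helper names, and the setup is deliberately simplified (weights stored directly in the low-precision format, no FP32 master copy, NaN/inf not handled).

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value.

    bfloat16 keeps the top 16 bits of a float32, so we round the float32
    bit pattern to nearest-even at bit 16 and zero the lower half.
    Hypothetical helper for illustration; NaN/inf are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def apply_updates(weight: float, update: float, steps: int, cast) -> float:
    """Add `update` to `weight` `steps` times, rounding with `cast` each step."""
    w = cast(weight)
    for _ in range(steps):
        w = cast(w + cast(update))
    return w


# Assumed toy values: a weight of 1.0 receiving 100 updates of size 1e-3.
w_fp32 = apply_updates(1.0, 1e-3, 100, to_fp32)
w_bf16 = apply_updates(1.0, 1e-3, 100, to_bf16)

print(f"fp32 weight after 100 steps: {w_fp32:.6f}")  # approximately 1.1
print(f"bf16 weight after 100 steps: {w_bf16:.6f}")  # stays at 1.0
```

Near 1.0, adjacent bfloat16 values are 2^-7 (about 0.0078) apart, so each 1e-3 update rounds back to the old weight and the bf16 run never moves. Mixed-precision training setups usually keep an FP32 copy of the weights for this reason, so whether this effect explains the results here would need to be checked against the actual training configuration.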


tikikun commented Nov 20, 2024

Current results:

  • A 1B Llama model can stabilize at a higher learning rate in FP32 than in bf16
  • Catastrophic forgetting occurred

@github-project-automation github-project-automation bot moved this to Investigating in Jan & Cortex Nov 22, 2024
@tikikun tikikun moved this from Investigating to Icebox in Jan & Cortex Nov 25, 2024
@bachvudinh bachvudinh added this to the Icebox milestone Nov 25, 2024