
Monkey patch layer norm in mllama #302

Merged: 4 commits merged into main from shisahni/patch-mllama on Oct 17, 2024
Conversation

shivam15s (Collaborator) commented on Oct 11, 2024

Summary

Monkey patches layer norm in mllama for conditional generation
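For context, a hypothetical usage sketch of the patching entry point this PR extends; it assumes the entry point is apply_liger_kernel_to_mllama, as for the other models in liger_kernel.transformers, with the rope and layer_norm flags shown in the diff below. The checkpoint name is illustrative only.

```python
from liger_kernel.transformers import apply_liger_kernel_to_mllama
from transformers import MllamaForConditionalGeneration

# Patch before instantiating the model, since the monkey patch rebinds
# the LayerNorm class that the modeling code looks up at __init__ time.
apply_liger_kernel_to_mllama(rope=True, layer_norm=True)

model = MllamaForConditionalGeneration.from_pretrained(
    "meta-llama/Llama-3.2-11B-Vision-Instruct"  # illustrative checkpoint
)
```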

Testing Done

Tested that monkey patching works as intended

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Diff under review:

```python
)

from liger_kernel.transformers.model.mllama import lce_forward as mllama_lce_forward

if rope:
    modeling_mllama.apply_rotary_pos_emb = liger_rotary_pos_emb
if layer_norm:
    modeling_mllama.nn.LayerNorm = LigerLayerNorm
```
Contributor commented on this diff:
FWIW, I looked into this but was worried it would modify torch.nn.LayerNorm globally, not just for the targeted model.
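A minimal sketch of why the patch leaks, assuming modeling_mllama does `from torch import nn` as Hugging Face modeling files typically do (and a transformers version that ships mllama): modeling_mllama.nn is then the torch.nn module object itself, so assigning to its LayerNorm attribute rebinds torch.nn.LayerNorm process-wide. FakeLayerNorm stands in for LigerLayerNorm here.

```python
import torch.nn
from transformers.models.mllama import modeling_mllama

# `from torch import nn` binds the module object itself, so this holds:
assert modeling_mllama.nn is torch.nn

class FakeLayerNorm(torch.nn.Module):  # stand-in for LigerLayerNorm
    pass

# Assigning through the modeling module mutates torch.nn itself...
modeling_mllama.nn.LayerNorm = FakeLayerNorm

# ...so the patch is visible to every importer of torch.nn, not just mllama.
assert torch.nn.LayerNorm is FakeLayerNorm
```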

shivam15s (Collaborator, author) replied:
That probably explains why the CI tests fail: LigerLayerNorm gets applied even before we monkey patch, because a previous test patched layer norm globally.
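One way to avoid the leak (a sketch of a targeted alternative, not what this PR ships): leave torch.nn untouched and swap LayerNorm instances in place on an already-constructed model. The LigerLayerNorm import path and constructor signature here are assumptions.

```python
import torch.nn as nn

def swap_layer_norms(module: nn.Module) -> None:
    """Recursively replace nn.LayerNorm children with LigerLayerNorm,
    reusing the trained weights, without touching torch.nn globally."""
    # Assumed import path and constructor signature for LigerLayerNorm.
    from liger_kernel.transformers.layer_norm import LigerLayerNorm

    for name, child in module.named_children():
        if isinstance(child, nn.LayerNorm):
            liger_ln = LigerLayerNorm(child.normalized_shape, eps=child.eps)
            liger_ln.weight = child.weight  # share parameters with the original
            liger_ln.bias = child.bias
            setattr(module, name, liger_ln)
        else:
            swap_layer_norms(child)  # recurse into submodules
```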

shivam15s changed the title from "Monkey patches layer norm in mllama" to "Monkey patch layer norm in mllama" on Oct 17, 2024
ByronHsu merged commit 6ab3b9f into main on Oct 17, 2024 (2 checks passed)
ByronHsu deleted the shisahni/patch-mllama branch on Oct 17, 2024 at 22:54