Monkeypatch for Llama 3.2-Vision #282
Conversation
Force-pushed from c7900fb to 01c2fd7
Wanted to add support for LayerNorm in the vision tower as well, but it is hard to patch the way it is used in modeling_mllama.py: a patch would end up changing torch.nn globally.
## Summary
- Previously, the pre-trained weights were not being loaded if patching the model post-initialization
- Instead of loading weights, just patch the model instance module's forward method (see linkedin#279)

## Testing Done
- In convergence tests, check that pre-init patching and post-init patching match results from the original model
- Hardware Type: A100
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence --> most tests working, waiting for other fixes for all tests to pass
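The instance-level patching described above can be sketched in plain Python. The class and forward bodies below are illustrative stand-ins, not Liger Kernel's actual code; the point is that binding a new method to one instance leaves the class (and every other instance) untouched, avoiding the global torch.nn side effect mentioned earlier:

```python
import types

class LayerNormLike:
    """Stand-in for an nn.Module instance whose forward we want to swap."""
    def forward(self, x):
        return [v * 1.0 for v in x]

def liger_forward(self, x):
    # Hypothetical replacement kernel; here it just scales inputs so the
    # patched and unpatched behaviors are easy to tell apart.
    return [v * 2.0 for v in x]

patched = LayerNormLike()    # imagine pre-trained weights already loaded
untouched = LayerNormLike()

# Bind the new forward to *this instance only*. Assigning to the class
# (LayerNormLike.forward = liger_forward) would change every instance,
# which is the torch.nn-wide side effect the comment above warns about.
patched.forward = types.MethodType(liger_forward, patched)

patched_out = patched.forward([1, 2])      # uses the replacement forward
untouched_out = untouched.forward([1, 2])  # still uses the original forward
```

Because only the bound method changes, the instance's existing parameters survive the patch, which is what makes post-initialization patching work without reloading weights.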
- Make the `transformers` dependency optional so we only have `torch` and `triton` as required deps, which is helpful if you're not using `transformers` for modeling code. This was also causing installation issues for people using slightly older transformers versions.
- If transformers is needed, make it compatible with any 4.x version. The specific model being used should dictate the transformers version compatibility.

`pip install -e .[transformers]`
`pip install -e .[dev]`

- Hardware Type: A100-80G-PCIe
- [x] run `make test` to ensure correctness
- [x] run `make checkstyle` to ensure code style
- [x] run `make test-convergence` to ensure convergence
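The optional-dependency arrangement described above might look roughly like this in `pyproject.toml` (a sketch only: the exact version pins and extra names here are illustrative, not the repo's actual ones):

```toml
[project]
# Only torch and triton are hard requirements (versions are illustrative).
dependencies = [
    "torch>=2.1.0",
    "triton>=2.3.0",
]

[project.optional-dependencies]
# Any 4.x transformers; the chosen model dictates the real floor.
transformers = ["transformers>=4.0.0,<5.0.0"]
```

With this layout, `pip install -e .` stays lightweight, while `pip install -e .[transformers]` pulls in the modeling stack for users who need it.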
Just need someone to set an HF token that has accepted the Llama license in the CICD env.
Added an HF Hub read secret to the repo and got access to the gated Llama 3.2 models.
Thanks, for some reason I'm still seeing the same issue - could you try triggering a CICD run on this branch?
Let me get back to you. Still figuring out how to configure the GPU runner to use the access token.
Seems like using gated assets for tests isn't a good idea. I'm looking into using fake/mirrored models and processors for the tests.
Any update, @shivam15s?
Hey @tyler-romero, created a PR on your branch. Let me know if the proposed solution makes sense.
Gated tokenizer possible fix
Really nice solution, thank you!
Seems like a convergence test fails because of a precision issue.
Now that you've made custom tokenizers for the multimodal tests, I could reduce the vocab sizes of the mini-models (to be in line with the vocab sizes of the other mini-models), and the convergence tests are passing now.
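The checked-in custom tokenizer approach can be illustrated with a toy word-level tokenizer that lives entirely in the test suite; this is a sketch of the idea only, not the PR's actual implementation (the class, method names, and vocab are made up, though `save_pretrained`/`from_pretrained` mirror the Hugging Face convention):

```python
import json
import os
import tempfile

# A vocab this small can be committed to the repo; no gated hub download needed.
VOCAB = {"<pad>": 0, "<s>": 1, "</s>": 2, "hello": 3, "world": 4}

class MiniTokenizer:
    """Word-level stand-in for a gated HF tokenizer (illustration only)."""

    def __init__(self, vocab):
        self.vocab = vocab

    def save_pretrained(self, path):
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "vocab.json"), "w") as f:
            json.dump(self.vocab, f)

    @classmethod
    def from_pretrained(cls, path):
        with open(os.path.join(path, "vocab.json")) as f:
            return cls(json.load(f))

    def encode(self, text):
        # Naive whitespace split; real tokenizers use BPE/SentencePiece.
        return [self.vocab[tok] for tok in text.split()]

# Round-trip through disk, the way tests would load a checked-in tokenizer.
with tempfile.TemporaryDirectory() as tmp:
    MiniTokenizer(VOCab_size := VOCAB).save_pretrained(tmp)
    ids = MiniTokenizer.from_pretrained(tmp).encode("hello world")
```

A tiny vocab also keeps the mini-model's embedding and lm_head matrices small, which is what the vocab-size reduction above relies on for stable convergence comparisons.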
Great work, lgtm
Sounds good, feel free to merge (only collaborators/LinkedIn employees can merge). And please add yourself as a contributor to this change if GitHub doesn't do that automatically.
Thanks to you both for driving this!!
Summary
Add monkeypatch to support Llama 3.2-Vision models.
Details
Llama 3.2-Vision is a multimodal model. It is only available in `transformers>=4.45.0`.
`torchvision` is required to run the multimodal tests for Llama 3.2-Vision (the image processor requires it).
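With the patch merged, usage presumably mirrors the repo's other `apply_liger_kernel_to_*` helpers; a sketch under those assumptions (the function name, import path, and checkpoint id below are assumptions, and running it requires access to the gated Llama 3.2-Vision weights):

```python
# Sketch only: assumes liger_kernel exposes apply_liger_kernel_to_mllama,
# mirroring its other apply_liger_kernel_to_* helpers, and that this
# gated checkpoint id is accessible with your HF token.
from transformers import MllamaForConditionalGeneration
from liger_kernel.transformers import apply_liger_kernel_to_mllama

# Patch before (or after) model init; per this PR, post-init patching now
# preserves the pre-trained weights as well.
apply_liger_kernel_to_mllama()
model = MllamaForConditionalGeneration.from_pretrained(
    "meta-llama/Llama-3.2-11B-Vision-Instruct"
)
```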
Testing Done
- run `make test` to ensure correctness
- run `make checkstyle` to ensure code style
- run `make test-convergence` to ensure convergence