[model] add support for mixtral moe model #128

Open
wants to merge 14 commits into base: main

Conversation

@936187425 (Collaborator) commented Apr 16, 2024

Adds support for the Mixtral-8x7B-v0.1 mixture-of-experts (MoE) model.
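For context on what the MoE layer adds, here is a minimal sketch of Mixtral-style top-2 expert routing. This is a libtorch-flavored C++ illustration; `sparse_moe_forward`, `router_w`, and the expert callables are hypothetical names, not this PR's actual code.

```cpp
#include <torch/torch.h>

#include <functional>
#include <vector>

// Illustrative top-2 routing for a Mixtral-style sparse MoE layer.
// Each expert is modeled as a callable mapping [n_tokens, dim] -> [n_tokens, dim].
torch::Tensor sparse_moe_forward(
    const torch::Tensor& hidden,    // [n_tokens, dim]
    const torch::Tensor& router_w,  // [n_experts, dim] gating weight
    const std::vector<std::function<torch::Tensor(const torch::Tensor&)>>& experts,
    int64_t top_k = 2) {
  // Router logits for every expert, then keep only the top-k experts per token.
  auto logits = torch::matmul(hidden, router_w.t());             // [n_tokens, n_experts]
  auto topk = torch::topk(logits, top_k, /*dim=*/-1);
  auto weights = torch::softmax(std::get<0>(topk), /*dim=*/-1);  // renormalized over chosen experts
  auto indices = std::get<1>(topk);                              // [n_tokens, top_k]

  auto output = torch::zeros_like(hidden);
  for (int64_t e = 0; e < static_cast<int64_t>(experts.size()); ++e) {
    // Tokens that routed to expert e in any of their top-k slots.
    auto mask = (indices == e);
    auto token_idx = mask.any(/*dim=*/-1).nonzero().squeeze(-1);
    if (token_idx.numel() == 0) {
      continue;  // this expert receives no tokens in this batch
    }
    auto expert_out = experts[e](hidden.index_select(0, token_idx));
    // Routing weight each selected token assigned to expert e.
    auto w = weights.index_select(0, token_idx)
                 .masked_select(mask.index_select(0, token_idx))
                 .unsqueeze(-1);
    output.index_add_(0, token_idx, expert_out * w);  // weighted sum of expert outputs
  }
  return output;
}
```

The renormalized softmax over only the selected top-k logits is what keeps the layer sparse: each token runs through just two expert FFNs per forward pass.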

// Per-rank head counts when the attention heads are sharded across
// world_size ranks (tensor parallelism).
const int64_t head_dim = args.head_dim();
const int64_t n_kv_heads = args.n_kv_heads().value_or(n_heads);
const int64_t n_local_heads = n_heads / world_size;
const int64_t n_local_kv_heads = n_kv_heads / world_size;
Collaborator commented:

Just a heads up: I added support for MQA and GQA, so please also include that support in your change. FYI dff774e

You can learn about MQA and GQA from this blog: https://iamshobhitagarwal.medium.com/navigating-the-attention-landscape-mha-mqa-and-gqa-decoded-288217d0a7d1
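For readers following along, here is a minimal sketch of how GQA/MQA changes the attention shapes. It is illustrative libtorch-style C++; `grouped_query_attention` and its parameters are hypothetical names, not the code in dff774e.

```cpp
#include <torch/torch.h>

#include <cmath>

// Illustrative grouped-query attention over a single sequence (no KV cache,
// no mask) showing how n_kv_heads interacts with n_heads.
torch::Tensor grouped_query_attention(torch::Tensor q,  // [n_tokens, n_heads, head_dim]
                                      torch::Tensor k,  // [n_tokens, n_kv_heads, head_dim]
                                      torch::Tensor v,  // [n_tokens, n_kv_heads, head_dim]
                                      int64_t n_heads,
                                      int64_t n_kv_heads,
                                      int64_t head_dim) {
  // Each KV head serves a group of query heads:
  //   MHA: group_size == 1, GQA: 1 < group_size < n_heads, MQA: group_size == n_heads.
  const int64_t group_size = n_heads / n_kv_heads;
  if (group_size > 1) {
    // Expand KV heads so every query head has a matching K/V head.
    k = k.repeat_interleave(group_size, /*dim=*/1);
    v = v.repeat_interleave(group_size, /*dim=*/1);
  }
  // Scaled dot-product attention per head: [n_heads, n_tokens, n_tokens].
  auto scores = torch::matmul(q.transpose(0, 1),
                              k.transpose(0, 1).transpose(-2, -1)) /
                std::sqrt(static_cast<double>(head_dim));
  auto probs = torch::softmax(scores, /*dim=*/-1);
  auto out = torch::matmul(probs, v.transpose(0, 1));  // [n_heads, n_tokens, head_dim]
  return out.transpose(0, 1).reshape({-1, n_heads * head_dim});  // [n_tokens, n_heads * head_dim]
}
```

Note that the per-rank split in the snippet above (n_kv_heads / world_size) implicitly assumes n_kv_heads divides evenly by world_size, which is worth keeping in mind once n_kv_heads is small under GQA/MQA.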

@936187425 changed the title from "[model] added support for mixtral moe model" to "[model] add support for mixtral moe model" on May 16, 2024