Hey, what about making it work with Llama3?

Replies: 2 comments

-
Hi @agbarbosa, it seems that llama3 uses practically the same architecture. We haven't tested llama2 on llama3 models, but it looks like they should work.
-
The architecture is the same. You even had the foresight to implement grouped-query attention before they rolled it out in Llama3. But the vocab is bigger and no longer uses SentencePiece. The new BPE switches from merging the token pair with the highest score to merging the pair with the lowest rank. The vocab is provided in ranked order, so conveniently index = rank and you don't need to keep the score. But some small fiddling is needed if you want to stay compatible with both Llama2 and Llama3.
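Here is a minimal toy sketch of that difference. The tiny vocab, scores, and function names below are made up for illustration; this is not llama2.c's actual tokenizer code, just the two selection rules side by side:

```c
/* toy_bpe.c -- contrast Llama 2 (score-based) vs Llama 3 (rank-based)
 * merge selection on a made-up 5-entry vocab. Build: cc toy_bpe.c */
#include <stdio.h>
#include <string.h>

#define NVOCAB 5
/* Index order doubles as merge rank for the Llama 3 rule. */
static const char *vocab[NVOCAB] = {"a", "b", "c", "ab", "bc"};
/* SentencePiece-style scores for the Llama 2 rule (higher merges first).
 * Note the orders disagree: "bc" has the better score, "ab" the lower rank. */
static const float score[NVOCAB] = {0, 0, 0, -2.0f, -1.0f};

/* Vocab index of the concatenation of tokens a and b, or -1 if absent. */
static int lookup_merged(int a, int b) {
    char buf[64];
    snprintf(buf, sizeof buf, "%s%s", vocab[a], vocab[b]);
    for (int i = 0; i < NVOCAB; i++)
        if (strcmp(vocab[i], buf) == 0) return i;
    return -1;
}

/* One merge pass: pick the best adjacent pair, merge it, return the new
 * token count (unchanged if nothing can merge).
 * use_rank = 0: highest score wins (Llama 2 / SentencePiece).
 * use_rank = 1: lowest vocab index wins, since index == rank (Llama 3). */
static int merge_step(int *toks, int n, int use_rank) {
    int best_pos = -1, best_id = -1, best_rank = NVOCAB;
    float best_score = -1e10f;
    for (int i = 0; i + 1 < n; i++) {
        int id = lookup_merged(toks[i], toks[i + 1]);
        if (id < 0) continue;
        if (use_rank ? id < best_rank : score[id] > best_score) {
            best_pos = i; best_id = id;
            best_rank = id; best_score = score[id];
        }
    }
    if (best_pos < 0) return n;
    toks[best_pos] = best_id;  /* replace the pair with the merged token */
    memmove(&toks[best_pos + 1], &toks[best_pos + 2],
            (size_t)(n - best_pos - 2) * sizeof(int));
    return n - 1;
}

int main(void) {
    int t2[] = {0, 1, 2}, t3[] = {0, 1, 2};  /* "abc" as a,b,c */
    int n2 = 3, n3 = 3, m;
    while ((m = merge_step(t2, n2, 0)) != n2) n2 = m;
    while ((m = merge_step(t3, n3, 1)) != n3) n3 = m;
    printf("llama2 rule:");
    for (int i = 0; i < n2; i++) printf(" %s", vocab[t2[i]]);
    printf("\nllama3 rule:");
    for (int i = 0; i < n3; i++) printf(" %s", vocab[t3[i]]);
    printf("\n");  /* prints: a bc  /  ab c */
    return 0;
}
```

On the toy input "abc" the two rules disagree on the first merge: the score rule picks "bc" while the rank rule picks "ab", so the final tokenizations differ. That is the kind of small fiddling needed to support both tokenizers in one code path.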