Support multiple chat templates per model #134
Conversation
I've tried to bring this more in line with the Python implementation. I'm unsure about whether …
Thanks a lot for the PR! I made a couple of comments, but it looks in good shape. Do you think it would be possible to add a simple test case that exercises this feature?
What do you mean? As I understand it, you are adding support to the Jinja Swift package here, right?
I think that PR won't include the tools functionality, for the sake of simplicity. Later on I may try to add it.
In that case, maybe we should get started with just a simple way to select the desired template (by name) and add the tool stuff later. Would that work for the purposes of ml-explore/mlx-swift-examples#135?
I think this now mirrors the Python implementation more closely. It throws an error if no template is selected. That's better than silently falling back to a default template that might not be compatible with the model, which previously caused some confusion for me when I was debugging the problems with Mistral 7B 0.3.
I've added some tests, with a note to add tests for tool use templates once these are implemented. Do you think this can be merged without implementing tool use in Jinja? It would be an improvement on the current situation, in which Mistral 7B doesn't work properly.
let tokenizer = try await AutoTokenizer.from(pretrained: "microsoft/Phi-3-mini-128k-instruct")
// Purposely not using the correct template for this model to verify that the template from the config is not being used
let mistral7BDefaultTemplate = "{{bos_token}}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ ' [INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' ' + message['content'] + ' ' + eos_token}}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}"
let encoded = try tokenizer.applyChatTemplate(messages: messages, chatTemplate: mistral7BDefaultTemplate, addGenerationPrompt: false, truncation: false, maxLength: nil, tools: nil)
It's a bit unfortunate we have to add all the arguments in this version of the call; perhaps we could add a new one that supports `messages` and `chatTemplate`, if you think it's interesting to do so.
Sources/Tokenizers/Tokenizer.swift
Outdated
}
return (name, template)
})
if let chatTemplateArgument = chatTemplate, let matchingDictEntry = templateDict[chatTemplateArgument] {
So if you pass an actual template as an argument to a tokenizer that has multiple ones (`mlx-community/Mistral-7B-Instruct-v0.3-4bit`, for example), it will be silently ignored and replaced with the default one, right? Perhaps we could fail if the tokenizer has several templates and the argument provided does not match any of them. What do you think?
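The stricter behavior proposed here could be sketched roughly as follows. This is only an illustration with hypothetical names (`selectTemplate`, `TokenizerError`), not the actual swift-transformers implementation:

```swift
// Hypothetical error type for illustration only.
enum TokenizerError: Error {
    case chatTemplateNotFound(String)
    case missingChatTemplate
}

// Sketch: resolve a named template, failing loudly instead of
// silently falling back to the default when the name doesn't match.
func selectTemplate(templateDict: [String: String], chatTemplate: String?) throws -> String {
    if let requested = chatTemplate {
        guard let match = templateDict[requested] else {
            throw TokenizerError.chatTemplateNotFound(requested)
        }
        return match
    }
    guard let defaultTemplate = templateDict["default"] else {
        throw TokenizerError.missingChatTemplate
    }
    return defaultTemplate
}
```

With this shape, a mismatched template name surfaces as an error at the call site rather than producing output rendered with the wrong template.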
Thank you!
Yes, let's merge this and we can improve later. I have a couple of final comments / questions and that's it.
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
/// A Jinja template or the name of a template to use for this conversion.
/// It is usually not necessary to pass anything to this argument,
/// as the model's template will be used by default.
chatTemplate: String? = nil,
addGenerationPrompt: Bool = false,
`addGenerationPrompt` is false by default in this method (which mirrors the Python implementation), but it is currently set to true in the overload method. Should we make the behavior consistent in all `applyChatTemplate` methods? Should it be true or false by default?
I think it should be `true` for the overloads, as I expect that to be the most common use of the API, but I'm not fully sure.
We can maintain consistency with the Python API while being more explicit by using an enum for the …
That's a great idea! In any case, it is ok to deviate a bit from the Python API if it's useful for the project. I see it more as a reference. The more we use the same criteria, the less friction people who come from Python will experience. But I think that some developers don't come from that audience.
Looking great, thanks a lot for your patience!
func applyChatTemplate(messages: [[String: String]], chatTemplateName: String) throws -> [Int]
func applyChatTemplate(messages: [[String: String]], chatTemplate: ChatTemplateArgument) throws -> [Int]
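The `ChatTemplateArgument` type in the signatures above suggests an enum along these lines. This is a sketch based on the discussion (the case names are assumptions, not necessarily the merged API):

```swift
// Hypothetical sketch: a chat template argument is either a full
// Jinja template passed inline, or the name of a template declared
// in tokenizer_config.json (e.g. "default" or "tool_use").
enum ChatTemplateArgument {
    case literal(String)  // a complete Jinja template string
    case name(String)     // the name of a template from the config
}
```

Making the two cases explicit at the call site avoids the ambiguity of a single `String` parameter that is sometimes a template and sometimes a template name.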
I like this solution. But it's a breaking change; do you know if the current method is used by mlx or others? Perhaps we could add an overload `applyChatTemplate(messages: [[String: String]], chatTemplateName: String)` that just builds a `ChatTemplateArgument` (`literal`) and forwards the call.
I searched the mlx-swift and mlx-swift-examples repos and didn't find any instances of `applyChatTemplate` there. I have a draft PR to use the chat templates once this PR and the Jinja PR land.
I think you mean the additional overload should have the argument `chatTemplate: String`. Sure, I can add that.
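The convenience overload being agreed on here might be sketched as below: a `String`-taking method that wraps the raw template in the `literal` case and forwards to the enum-based method. All names here are assumptions for illustration, and the enum-based method body is a placeholder rather than real template rendering:

```swift
// Hypothetical sketch, not the swift-transformers implementation.
enum ChatTemplateArgument {
    case literal(String)  // a complete Jinja template string
    case name(String)     // a template name from tokenizer_config.json
}

struct Tokenizer {
    // Enum-based method. A real implementation would render the
    // template and tokenize; this sketch returns a placeholder.
    func applyChatTemplate(messages: [[String: String]],
                           chatTemplate: ChatTemplateArgument) throws -> [Int] {
        return []
    }

    // Non-breaking convenience overload: accepts a raw template string
    // and forwards it as a literal template.
    func applyChatTemplate(messages: [[String: String]],
                           chatTemplate: String) throws -> [Int] {
        try applyChatTemplate(messages: messages,
                              chatTemplate: .literal(chatTemplate))
    }
}
```

Because this is an additional overload rather than a signature change, existing call sites that pass a template string keep compiling.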
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Thanks a lot for iterating here @DePasqualeOrg 🙌
Great contribution, merging this now!
Thanks for your help! Do you want to make a new release so that this can get picked up in mlx-swift-examples for ml-explore/mlx-swift-examples#135?
Some models, such as Mistral 7B Instruct 0.3, specify an array of chat templates in `tokenizer_config.json`. This change allows for parsing chat templates in this format.
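The two shapes of the `chat_template` field described in this PR (a single template string, or an array of named templates) can be decoded with a pattern like the following. This is a sketch; the type names are assumptions, not the ones used in swift-transformers:

```swift
import Foundation

// A named template entry, as found in configs like Mistral 7B
// Instruct v0.3: {"name": "...", "template": "..."}.
struct NamedChatTemplate: Codable {
    let name: String
    let template: String
}

// The chat_template field is either a single Jinja string or an
// array of named templates; try the string form first, then fall
// back to the array form.
enum ChatTemplateConfig: Codable {
    case single(String)
    case multiple([NamedChatTemplate])

    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        if let template = try? container.decode(String.self) {
            self = .single(template)
        } else {
            self = .multiple(try container.decode([NamedChatTemplate].self))
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        switch self {
        case .single(let template): try container.encode(template)
        case .multiple(let templates): try container.encode(templates)
        }
    }
}
```

A named template can then be looked up by matching on the `multiple` case, which is where the name-based selection discussed earlier in this conversation comes in.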