Why is llama-cli doing string comparison to check antiprompt? #10007
In llama-cli's main.cpp, antiprompts are stored in tokenized form (antiprompt_ids), and each entry can span multiple tokens. When it comes time to check them, however, we do a string comparison of the last 32 decoded tokens against params.antiprompt (a vector of strings). Right after that we check antiprompt_ids (the tokenized version), but ONLY for entries that are a single token long. This seems unnecessarily complex. Wouldn't it be easier to compare the most recent tokens against each antiprompt_ids entry over its full length, dropping both the string check and the "only check the token if the antiprompt is a single token" special case? In short, I'm trying to understand whether there is a reason behind this complexity.
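For concreteness, the token-space check proposed above could look roughly like the sketch below. It is not code from main.cpp: `token_id` and `ends_with_antiprompt_ids` are hypothetical names, and a plain `int` stands in for the real token type.

```cpp
#include <algorithm>
#include <vector>

// Assumption for this sketch: a plain int stands in for the real token type.
using token_id = int;

// Hypothetical helper: returns true if `history` (the most recent tokens,
// oldest first) ends with the full token sequence of any antiprompt.
bool ends_with_antiprompt_ids(const std::vector<token_id> & history,
                              const std::vector<std::vector<token_id>> & antiprompt_ids) {
    for (const auto & ids : antiprompt_ids) {
        // Skip empty entries and entries longer than what has been generated.
        if (ids.empty() || ids.size() > history.size()) {
            continue;
        }
        // Compare from the back so that only the tail of the history is examined.
        if (std::equal(ids.rbegin(), ids.rend(), history.rbegin())) {
            return true;
        }
    }
    return false;
}
```

This only matches when the model emits exactly the same token sequence that tokenizing the antiprompt on its own produced, which is the assumption the question is really about.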
Replies: 1 comment 1 reply
Likely the logic can be simplified, and PRs are welcome. Just keep in mind that tokenization is tricky: for example, "hello" and " hello" (with a leading space) will in most cases tokenize to 2 different tokens, so antiprompt checks most likely have to remain in text space rather than token space.
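To illustrate the text-space idea, here is a minimal sketch that joins the decoded token pieces back into a string and checks whether that string ends with an antiprompt. It is not the llama-cli implementation; `join_pieces` and `ends_with_antiprompt` are hypothetical names, and the token pieces are assumed to be already decoded to text.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: concatenates already-decoded token pieces into text.
std::string join_pieces(const std::vector<std::string> & pieces) {
    std::string text;
    for (const std::string & piece : pieces) {
        text += piece;
    }
    return text;
}

// Hypothetical helper: returns true if `output` ends with any antiprompt string.
bool ends_with_antiprompt(const std::string & output,
                          const std::vector<std::string> & antiprompts) {
    for (const std::string & ap : antiprompts) {
        // Skip empty antiprompts and ones longer than the text produced so far.
        if (ap.empty() || ap.size() > output.size()) {
            continue;
        }
        // Compare the tail of the output against the antiprompt.
        if (output.compare(output.size() - ap.size(), ap.size(), ap) == 0) {
            return true;
        }
    }
    return false;
}
```

Because the comparison happens after the pieces are joined, it does not matter how the text was split into tokens: the pieces "User" followed by ":" and the single piece "User:" both match the antiprompt "User:".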