v0.1.8

xinfeishi released this 25 Mar 13:32
· 1160 commits to main since this release

feat

  • support qwen2 gptq
  • update multi_task_prompt create
  • speculative decoding supports tp
  • support roberta

refactor

  • refactor multimodal model process

fix

  • fix kv cache int8 bug: add dequantization for the block-reuse scenario
  • fix stop words in stream output
  • fix lora