Hello:
Glad to see that LLaVA is supported now. We're trying to deploy it with Triton; how can we do that?
You could refer to the Triton backend documentation at https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/docs/llama.md and replace the engine and tokenizer with the LLaVA ones, if you already have a LLaVA engine built.
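Once a model repository is up following that walkthrough, the server can be queried with the standard Triton client. The sketch below is a minimal, hedged example: it assumes the ensemble model and the tensor names (`text_input`, `max_tokens`, `text_output`) used in the text-only llama.md setup; the image/visual-feature input is exactly the part that has no official multimodal example yet, so it is only noted in a comment.

```python
# Minimal sketch of querying a tensorrtllm_backend ensemble via the Triton HTTP client.
# Assumptions: the server runs on localhost:8000 and exposes the "ensemble" model with
# the text_input / max_tokens / text_output tensors from the llama.md walkthrough.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Text prompt as a BYTES tensor of shape [1, 1].
text = np.array([["Describe the image."]], dtype=object)
text_input = httpclient.InferInput("text_input", text.shape, "BYTES")
text_input.set_data_from_numpy(text)

# Maximum number of tokens to generate.
max_tokens = np.array([[64]], dtype=np.int32)
max_tokens_input = httpclient.InferInput("max_tokens", max_tokens.shape, "INT32")
max_tokens_input.set_data_from_numpy(max_tokens)

# NOTE: a LLaVA deployment would additionally need an input carrying the image
# (or precomputed visual features); there is no official example of that yet,
# so only the text path is shown here.
result = client.infer(model_name="ensemble", inputs=[text_input, max_tokens_input])
print(result.as_numpy("text_output"))
```

This only covers the text side; wiring the vision encoder output into the request is the open question in this thread.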
Does tensorrtllm_backend have support for multimodal models? Is there an example of passing a prompt and an image through a request?
There is no such example now.
Same question. We need some docs on how to deploy a multimodal model (such as LLaVA) via the Triton server tensorrtllm_backend.
@DefTruth Did you figure it out? I'm looking for the same thing.