Actions: NVIDIA/TensorRT-LLM

auto-assign

67 workflow runs

Adding custom sampling config
auto-assign #70: Issue #2609 labeled by nv-guomingz
December 24, 2024 15:28 3s
[Performance] What is the purpose of compiling a model?
auto-assign #69: Issue #2617 labeled by nv-guomingz
December 24, 2024 15:26 2s
SIGABRT while trying to build trtllm engine for biomistral model on T4
auto-assign #67: Issue #2619 labeled by nv-guomingz
December 24, 2024 15:24 2s
Performance of streaming requests is worse than non-streaming
auto-assign #66: Issue #2613 labeled by nv-guomingz
December 24, 2024 15:21 49s
Phi4 support?
auto-assign #65: Issue #2616 labeled by nv-guomingz
December 24, 2024 15:07 3s
support for T4
auto-assign #64: Issue #2620 labeled by nv-guomingz
December 24, 2024 15:04 2s
support for T4
auto-assign #63: Issue #2620 labeled by krishnanpooja
December 24, 2024 11:32 2s
SIGABRT while trying to build trtllm engine for biomistral model on T4
auto-assign #62: Issue #2619 labeled by krishnanpooja
December 24, 2024 10:44 3s
[Performance] What is the purpose of compiling a model?
auto-assign #61: Issue #2617 labeled by Flynn-Zh
December 24, 2024 10:03 2s
Performance of streaming requests is worse than non-streaming
auto-assign #60: Issue #2613 labeled by activezhao
December 24, 2024 07:50 2s
Adding custom sampling config
auto-assign #59: Issue #2609 labeled by buddhapuneeth
December 23, 2024 23:58 3s
SmoothQuant doesn't work with lora
auto-assign #58: Issue #2604 labeled by nv-guomingz
December 23, 2024 06:11 39s
[Feature Request] Better support for w4a8 quantization
auto-assign #57: Issue #2605 labeled by nv-guomingz
December 23, 2024 06:09 47s
Gemma 2 LoRA support
auto-assign #56: Issue #2606 labeled by nv-guomingz
December 23, 2024 06:09 43s
SmoothQuant doesn't work with lora
auto-assign #55: Issue #2604 labeled by ShuaiShao93
December 20, 2024 20:39 2s
lora doesn't work with --use_fp8_rowwise
auto-assign #54: Issue #2603 labeled by ShuaiShao93
December 20, 2024 20:15 2s
--use_fp8 doesn't work with llama 3.1 8b
auto-assign #53: Issue #2602 labeled by ShuaiShao93
December 20, 2024 20:11 3s
Qwen 2.5 hallucinating
auto-assign #52: Issue #2600 labeled by ChristophHandschuh
December 20, 2024 14:38 3s
No module named 'tensorrt_llm.bindings'
auto-assign #51: Issue #2599 labeled by WGS-note
December 20, 2024 06:52 3s
[Performance] TTFT of qwen2.5 0.5B model
auto-assign #50: Issue #2598 labeled by ReginaZh
December 20, 2024 06:27 2s
Wrong result when using lora on multi gpus
auto-assign #49: Issue #2589 labeled by ShuaiShao93
December 18, 2024 16:57 3s
Error in building llama with eagle for speculative decoding
auto-assign #48: Issue #2588 labeled by nv-guomingz
December 18, 2024 07:36 2s
Error in building llama with eagle for speculative decoding
auto-assign #47: Issue #2588 labeled by nv-guomingz
December 18, 2024 07:36 48s
Cannot load built Llama engine due to KeyError with config
auto-assign #46: Issue #2555 labeled by nv-guomingz
December 18, 2024 06:52 43s