checkpoint conversion script (/llama/convert_checkpoint.py) for Llama-3.2-3B-Instruct is failing with the following error #2339
I am getting the same error for meta-llama/Llama-3.2-3B:

model=meta-llama/Llama-3.2-3B
suffix=bfloat16_1gpu_tp1
in_dir=/root/huggingface_local_models
out_conv_path=/root/trt_checkpoint_dir/$model/$suffix
out_build_path=/root/trt_build_dir/$model/$suffix

# Convert the Llama 3.2 3B HF checkpoint directly, TP=1.
python3 convert_checkpoint.py --model_dir $in_dir/$model \
    --output_dir $out_conv_path \
    --dtype bfloat16 \
    --tp_size 1
I got the same error when I tried running it with the quickstart script yesterday. I just took that script and substituted the model with the …
Has anyone built an engine for Llama-3.2-3B unquantized? I can build an engine quantized, but the unquantized checkpoint conversion fails with the same error @GaneshDoosa and @mrakgr have.
I've since switched to LMDeploy.
Please note that Llama 3.2 is not supported in release 0.13. Please try the main branch.
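For reference, one way to try the main branch is to install a pre-release wheel and run the matching conversion script from the repository. This is a hedged sketch, not official guidance: the install command follows NVIDIA's documented pre-release extra index, the script path matches the repo's examples tree, and the variables reuse the script above; adjust versions and paths to your setup.

# Install a pre-release TensorRT-LLM wheel, then run the main-branch
# conversion script against the same checkpoint as above.
pip3 install --upgrade --pre tensorrt_llm --extra-index-url https://pypi.nvidia.com
git clone https://github.com/NVIDIA/TensorRT-LLM.git
python3 TensorRT-LLM/examples/llama/convert_checkpoint.py \
    --model_dir $in_dir/$model \
    --output_dir $out_conv_path \
    --dtype bfloat16 \
    --tp_size 1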
I checked out the …
[TensorRT-LLM] TensorRT-LLM version: 0.13.0
0.13.0
201it [00:00, 1554.11it/s]
[1729020016.135793] [toyota-tom-buddy-ml-vm:879 :0] ucp_context.c:1774 UCX WARN UCP version is incompatible, required: 1.17, actual: 1.12 (release 1)
[1729020016.154083] [toyota-tom-buddy-ml-vm:879 :0] ucp_context.c:1774 UCX WARN UCP version is incompatible, required: 1.17, actual: 1.12 (release 1)
Traceback (most recent call last):
File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 503, in
main()
File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 495, in main
convert_and_save_hf(args)
File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 437, in convert_and_save_hf
execute(args.workers, [convert_and_save_rank] * world_size, args)
File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 444, in execute
f(args, rank)
File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 423, in convert_and_save_rank
llama = LLaMAForCausalLM.from_hugging_face(
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 358, in from_hugging_face
loader.generate_tllm_weights(model)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 357, in generate_tllm_weights
self.load(tllm_key,
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 278, in load
v = sub_module.postprocess(tllm_key, v, **postprocess_kwargs)
File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/layers/linear.py", line 391, in postprocess
weights = weights.to(str_dtype_to_torch(self.dtype))
AttributeError: 'NoneType' object has no attribute 'to'
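One plausible explanation for the None weight, consistent with the note above that release 0.13 predates Llama 3.2 support: Llama-3.2-3B sets tie_word_embeddings in its config, so the checkpoint ships no standalone lm_head.weight for the 0.13 loader to look up, and postprocess() receives None. A quick hedged check, assuming a sharded safetensors checkpoint with an index file and reusing the variables from the script above:

# Does the model tie lm_head to the embeddings?
grep tie_word_embeddings $in_dir/$model/config.json
# Is there a standalone lm_head.weight in the weight map?
# (grep exits nonzero when there is no match)
grep -c '"lm_head.weight"' $in_dir/$model/model.safetensors.index.json \
    || echo "no lm_head.weight in the checkpoint"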