checkpoint conversion script (/llama/convert_checkpoint.py) for Llama-3.2-3B-Instruct is failing with the following error #2339

GaneshDoosa opened this issue Oct 15, 2024 · 5 comments

@GaneshDoosa

[TensorRT-LLM] TensorRT-LLM version: 0.13.0
0.13.0
201it [00:00, 1554.11it/s]
[1729020016.135793] [toyota-tom-buddy-ml-vm:879 :0] ucp_context.c:1774 UCX WARN UCP version is incompatible, required: 1.17, actual: 1.12 (release 1)
[1729020016.154083] [toyota-tom-buddy-ml-vm:879 :0] ucp_context.c:1774 UCX WARN UCP version is incompatible, required: 1.17, actual: 1.12 (release 1)
Traceback (most recent call last):
  File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 503, in <module>
    main()
  File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 495, in main
    convert_and_save_hf(args)
  File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 437, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 444, in execute
    f(args, rank)
  File "/tensorrtllm_backend/convert_checkpoint_v0.13.0.py", line 423, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 358, in from_hugging_face
    loader.generate_tllm_weights(model)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 357, in generate_tllm_weights
    self.load(tllm_key,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 278, in load
    v = sub_module.postprocess(tllm_key, v, **postprocess_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/layers/linear.py", line 391, in postprocess
    weights = weights.to(str_dtype_to_torch(self.dtype))
AttributeError: 'NoneType' object has no attribute 'to'

@Superjomn added the bug and triaged labels on Oct 16, 2024
@mrakgr commented Nov 3, 2024

I am getting this same error for meta-llama/Llama-3.2-3B. Here is the script I used to run it:

model=meta-llama/Llama-3.2-3B
suffix=bfloat16_1gpu_tp1
in_dir=/root/huggingface_local_models
out_conv_path=/root/trt_checkpoint_dir/$model/$suffix
out_build_path=/root/trt_build_dir/$model/$suffix

# Build LLaMA v3 3B TP=1 using HF checkpoints directly.
python3 convert_checkpoint.py --model_dir $in_dir/$model \
                            --output_dir $out_conv_path \
                            --dtype bfloat16 \
                            --tp_size 1
root@08478573f7bc:~/TensorRT-LLM/examples/llama# bash run.sh
[TensorRT-LLM] TensorRT-LLM version: 0.15.0.dev2024102900
0.15.0.dev2024102900
201it [00:00, 1037.89it/s]
Traceback (most recent call last):
  File "/root/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 529, in <module>
    main()
  File "/root/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 521, in main
    convert_and_save_hf(args)
  File "/root/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 463, in convert_and_save_hf
    execute(args.workers, [convert_and_save_rank] * world_size, args)
  File "/root/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 470, in execute
    f(args, rank)
  File "/root/TensorRT-LLM/examples/llama/convert_checkpoint.py", line 447, in convert_and_save_rank
    llama = LLaMAForCausalLM.from_hugging_face(
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/llama/model.py", line 397, in from_hugging_face
    loader.generate_tllm_weights(model)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 403, in generate_tllm_weights
    self.load(tllm_key,
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/models/model_weights_loader.py", line 291, in load
    v = sub_module.postprocess(tllm_key, v, **postprocess_kwargs)
  File "/usr/local/lib/python3.10/dist-packages/tensorrt_llm/layers/linear.py", line 390, in postprocess
    weights = weights.to(str_dtype_to_torch(self.dtype))
AttributeError: 'NoneType' object has no attribute 'to'

I got this same error yesterday when I ran the quickstart script, simply substituting the model with meta-llama/Llama-3.2-3B.

@JoJoLev commented Nov 17, 2024

Has anyone built an engine file for the unquantized Llama-3.2-3B? I can build an engine for the quantized model, but the checkpoint conversion for the unquantized one fails with the same error @GaneshDoosa and @mrakgr have.

@mrakgr commented Nov 17, 2024

I've since switched to LMDeploy.

@byshiue (Collaborator) commented Nov 21, 2024

Please note that Llama 3.2 is not supported in release 0.13; please try the main branch.
Also, please add --use_embedding_sharing when you convert the Llama 3.2 checkpoint.
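
For context: the Llama 3.2 1B/3B checkpoints tie the input embedding and lm_head weights (tie_word_embeddings=True in the HF config), so there is no standalone lm_head.weight tensor for the loader to read, which is likely why postprocess receives None and raises the AttributeError above. Applied to the script mrakgr posted, the fixed conversion step would look roughly like this (a sketch: everything except the added --use_embedding_sharing flag is taken from that script):

# Convert the HF checkpoint with embedding sharing enabled, since the
# Llama 3.2 checkpoint has no standalone lm_head.weight tensor.
python3 convert_checkpoint.py --model_dir $in_dir/$model \
                              --output_dir $out_conv_path \
                              --dtype bfloat16 \
                              --tp_size 1 \
                              --use_embedding_sharing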

@byshiue self-assigned this on Nov 21, 2024
@jingzhaoou commented Dec 2, 2024

I checked out the r24.10 branch of tensorrtllm_backend. Using the nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3 Docker image, I ran into the same error when converting checkpoints for meta-llama/Llama-3.2-3B-Instruct. I resolved the issue by adding --use_embedding_sharing. Everything works great for an unquantized model.
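
For anyone reproducing this, the sequence would look roughly like the following (a sketch: the host mount path and the location of convert_checkpoint.py inside the container are illustrative assumptions, not verified against the 24.10 image):

# Launch the 24.10 Triton + TRT-LLM container (host path is illustrative)
docker run --gpus all -it --rm \
    -v /path/to/models:/models \
    nvcr.io/nvidia/tritonserver:24.10-trtllm-python-py3

# Inside the container: convert with embedding sharing enabled; adjust the
# script path to wherever your tensorrtllm_backend checkout places it
python3 convert_checkpoint.py \
    --model_dir /models/Llama-3.2-3B-Instruct \
    --output_dir /models/trt_ckpt/Llama-3.2-3B-Instruct \
    --dtype bfloat16 \
    --use_embedding_sharing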
