
Error when chatting with qwen2-vl-7b loaded via webui #6371

Open
1 task done
laoqiongsuan opened this issue Dec 18, 2024 · 1 comment
Labels
pending This problem is yet to be addressed

Comments

@laoqiongsuan

Reminder

  • I have read the README and searched the existing issues.

System Info

  • llamafactory version: 0.9.2.dev0
  • Platform: Linux-5.15.0-124-generic-x86_64-with-glibc2.17
  • Python version: 3.8.19
  • PyTorch version: 2.3.1+cu121 (GPU)
  • Transformers version: 4.46.1
  • Datasets version: 2.20.0
  • Accelerate version: 1.0.1
  • PEFT version: 0.12.0
  • TRL version: 0.9.6
  • GPU type: NVIDIA RTX A6000
  • DeepSpeed version: 0.14.4

Reproduction

```
llamafactory-cli webui
```

```
[INFO|2024-12-18 10:52:09] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-18 10:52:09] llamafactory.model.loader:157 >> all params: 8,291,375,616
[WARNING|2024-12-18 10:52:09] llamafactory.chat.hf_engine:168 >> There is no current event loop, creating a new one.
Exception in thread Thread-8:
Traceback (most recent call last):
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 2215, in generate
    result = self._sample(
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/generation/utils.py", line 3206, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1722, in forward
    outputs = self.model(
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/transformers/models/qwen2_vl/modeling_qwen2_vl.py", line 1159, in forward
    layer_outputs = decoder_layer(
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/hooks.py", line 165, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/hooks.py", line 364, in pre_forward
    return send_to_device(args, self.execution_device), send_to_device(
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 184, in send_to_device
    {
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 185, in
    k: t if k in skip_keys else send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 175, in send_to_device
    return honor_type(
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 82, in honor_type
    return type(obj)(generator)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 176, in
    tensor, (send_to_device(t, device, non_blocking=non_blocking, skip_keys=skip_keys) for t in tensor)
  File "/data/s2/zhuzhaowei/anaconda3/envs/py38/lib/python3.8/site-packages/accelerate/utils/operations.py", line 156, in send_to_device
    return tensor.to(device, non_blocking=non_blocking)
RuntimeError: CUDA error: peer mapping resources exhausted
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
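The thread itself offers no fix, but `peer mapping resources exhausted` is typically raised when too many GPUs on one host try to enable mutual peer-to-peer (P2P) mappings, which `accelerate`'s `device_map` sharding triggers as it moves tensors between devices. A possible workaround (a sketch, not confirmed by the thread) is to restrict the process to fewer GPUs; the device indices below are examples only:

```shell
# Expose only a subset of GPUs to the process so fewer P2P mappings
# are created; adjust the indices to your machine.
export CUDA_VISIBLE_DEVICES=0,1,2,3
llamafactory-cli webui
```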

Expected behavior

The webui should be able to chat with qwen2-vl-7b without raising a CUDA error.

Others

No response
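To see whether the host is in the many-GPU situation described above, one can check how many devices are visible and which pairs support peer access. This is a hypothetical diagnostic, not part of the original report:

```python
# Count visible CUDA devices and report pairwise peer-access support.
# A large device count with full mutual P2P is the usual trigger for
# "peer mapping resources exhausted".
import torch

n = torch.cuda.device_count()
print(f"visible GPUs: {n}")
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"P2P {i} -> {j}: {ok}")
```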

@github-actions github-actions bot added the pending This problem is yet to be addressed label Dec 18, 2024
@laoqiongsuan
Author

[image attachment]
