How to run glm-4-9b at INT4 precision #211
How do I run glm-4-9b at INT4 precision? Here is my code:

from modelscope import AutoTokenizer, AutoModel, snapshot_download
model_dir = snapshot_download("ZhipuAI/glm-4-9b")
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_dir, trust_remote_code=True).half().cuda()
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)

Running it fails with an error, because I only have a single Tesla T4 GPU with 16 GB of VRAM:

─➤ nvidia-smi
Wed Jun 19 09:48:27 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08 Driver Version: 535.161.08 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 Off | 00000000:AF:00.0 Off | 0 |
| N/A 45C P0 27W / 70W | 2MiB / 15360MiB | 7% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

I saw that https://github.com/THUDM/GLM-4/blob/main/basic_demo/README.md mentions that glm-4-9b can be run at INT4 precision. How should I modify my code?
Answered by
zRzRzRzRzRzRzR
Jun 22, 2024
It is already in the code: see cli_demo, which contains the corresponding commented-out code.
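
The commented-out code in cli_demo is not reproduced in this thread, so the following is only a minimal sketch of one common way to do this, not necessarily what the demo does. It assumes the `transformers`, `bitsandbytes`, and `accelerate` packages are installed and uses the `BitsAndBytesConfig` class from `transformers` to quantize the weights to 4 bits at load time. The helper names `weight_memory_gib` and `load_glm4_int4` are hypothetical, and the figure of roughly 9 billion parameters is an assumption taken from the model name. Under that assumption the FP16 weights alone need about 17 GiB, which exceeds the 15360 MiB reported by nvidia-smi above, while 4-bit weights need about 4 GiB.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Rough size of the weights alone, ignoring activations and the KV cache."""
    return num_params * bits_per_param / 8 / 1024**3


def load_glm4_int4(model_id: str = "ZhipuAI/glm-4-9b"):
    """Load the model with 4-bit weights (sketch; requires a CUDA GPU)."""
    # Heavy imports are kept local so the estimate above runs without them.
    import torch
    from modelscope import snapshot_download
    from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

    model_dir = snapshot_download(model_id)
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

    # float16 is chosen as the compute dtype on the assumption that the
    # Tesla T4 has no native bfloat16 support.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )

    # Do not call .half() or .cuda() here: the quantized model is placed on
    # the GPU during loading, and casting or moving it afterwards is rejected.
    model = AutoModel.from_pretrained(
        model_dir,
        trust_remote_code=True,
        quantization_config=quant_config,
        device_map="auto",
        low_cpu_mem_usage=True,
    ).eval()
    return tokenizer, model


# Assumed parameter count of about 9e9, taken from the model name.
fp16_gib = weight_memory_gib(9e9, 16)
int4_gib = weight_memory_gib(9e9, 4)
print(f"FP16 weights: about {fp16_gib:.1f} GiB, INT4 weights: about {int4_gib:.1f} GiB")

# Usage on a machine with a GPU (not executed here):
# tokenizer, model = load_glm4_int4()
# response, history = model.chat(tokenizer, "你好", history=[])
```

The rest of the original script (the `model.chat` calls) should be usable unchanged once the model is loaded this way. Actual memory use will be higher than the weight estimate because of activations, the KV cache, and quantization overhead.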
Answer selected by
zRzRzRzRzRzRzR