Replies: 2 comments
-
Please refer to #2432 (comment) and use the fine-tuned model.
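In case it helps to make that pointer concrete: the sketch below illustrates, under the assumption that the fine-tuning used LoRA and with purely hypothetical paths, the two usual ways to run a model fine-tuned with LLaMA-Factory, either loading the exported (merged) directory directly or loading the base model and attaching the adapter with peft. This is only an illustration, not taken from the referenced comment.

# Minimal sketch (assumptions: LoRA fine-tuning, hypothetical paths).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Option 1: load the directory produced by LLaMA-Factory's export
# (the adapter weights are already merged into it).
merged_dir = "path/to/exported_model"   # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(merged_dir)
model = AutoModelForCausalLM.from_pretrained(merged_dir)

# Option 2: load the base model and attach the LoRA adapter without merging.
base_dir = "path/to/base_model"         # hypothetical path
adapter_dir = "path/to/lora_adapter"    # hypothetical path
base = AutoModelForCausalLM.from_pretrained(base_dir)
model = PeftModel.from_pretrained(base, adapter_dir)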
-
Thank you very much for your help!
-
I used the export feature in LLaMA-Factory to export my model. To test its performance I wrote a simple Python program to load it, but the results are not what I expected. Could the problem have happened during export, or is my Python code missing something?
Here is my code:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the tokenizer and model from the exported directory.
model_path = r"C:\Users\User\Desktop\LLMs_myproject\LLaMA-Factory-main\epochs50"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = model.to("cpu")  # note: the model is kept on CPU even when CUDA is detected

temperature = 0.95
top_k = 50
top_p = 0.7

def gen(prompt, result_length):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(
        input_ids,
        max_length=result_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        num_beams=2,
        no_repeat_ngram_size=2,
        early_stopping=True,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

while True:
    prompt = input("prompt: ")
    length = int(input("length: "))
    full_prompt = "Prompt: " + prompt
    result = gen(full_prompt, length)
    print(result)
Thank you all for your help.
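Not an authoritative answer, but a few things in the snippet above commonly account for weaker-than-expected output: with the default do_sample=False, generate runs beam search and temperature/top_k/top_p have no effect; the detected CUDA device is never actually used; and the raw "Prompt: " prefix may not match the template the model was fine-tuned with. Below is a minimal sketch of those adjustments, reusing the names (tokenizer, model, device, temperature, top_k, top_p) defined in the snippet above. It assumes the exported tokenizer ships a chat_template (apply_chat_template requires a recent transformers version); if it does not, the prompt should be formatted with the same template string used during fine-tuning.

# Minimal sketch of possible adjustments (assumption: the exported tokenizer
# defines a chat template matching the fine-tuning setup).
model = model.to(device)  # use the GPU when it is available

def gen_sampled(prompt, max_new_tokens):
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(device)
    output = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,  # bound newly generated tokens instead of total length
        do_sample=True,                 # required for temperature/top_k/top_p to take effect
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)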