
Add inference code #199

Open · wants to merge 2 commits into main

Conversation

wade3han

Tested with my own fine-tuned 7B Alpaca model:

python inference.py \
    --model_name_or_path {model_path}
Instruction: Tell me about alpacas.
|  2499 | Al       | -15.960 | 0.00%
| 29886 | p        | -33.403 | 0.00%
|   562 | ac       | -32.065 | 0.00%
|   294 | as       | -24.586 | 0.00%
|   526 | are      | -20.448 | 0.00%
|   263 | a        | -17.845 | 0.00%
|  6606 | species  | -16.602 | 0.00%
|   310 | of       | -15.564 | 0.00%
|  4275 | South    | -11.832 | 0.00%
|  3082 | American | -22.230 | 0.00%
|  3949 | cam      | -12.354 | 0.00%
|   295 | el       | -34.635 | 0.00%
|   333 | id       | -19.849 | 0.00%
| 29892 | ,        | -20.313 | 0.00%
...
| 29889 | .        | -25.931 | 0.00%
|     2 | </s>     | -21.040 | 0.00%
Response:  Alpacas are a species of South American camelid, related to the llama. They are smaller than llamas and typically have finer fiber. Alpacas are primarily bred for their fiber, which can be spun into soft and luxurious yarns. They are also used for their meat, which is similar to that of a chicken. Alpacas are social animals and live in herds with a dominant male leader.</s>

...

Largely influenced by https://github.com/kriskrisliu/stanford_alpaca/tree/krisliu
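For reference, the per-token columns in the output above (token id, token text, log-probability, probability) can be reproduced from raw logits with a numerically stable log-softmax. A minimal sketch of the idea; the function name and toy inputs are illustrative and not taken from the PR:

```python
import math

def logprob_table(logits_per_step, chosen_ids, vocab):
    """For each generation step, compute the log-probability (and
    probability) of the chosen token under a softmax over that step's
    logits, mirroring the four columns printed by the inference script."""
    rows = []
    for logits, tok_id in zip(logits_per_step, chosen_ids):
        # log-sum-exp with the max subtracted for numerical stability
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        lp = logits[tok_id] - log_z          # log-softmax of the chosen token
        rows.append((tok_id, vocab[tok_id], lp, math.exp(lp)))
    return rows

# Toy example: two equally likely tokens -> log-prob of each is -ln(2).
rows = logprob_table([[0.0, 0.0]], [0], ["a", "b"])
```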

@MrRace

MrRace commented Apr 10, 2023

    indices = sequences[:, cut_idx:] + beam_sequence_indices
RuntimeError: The size of tensor a (114) must match the size of tensor b (259) at non-singleton dimension 1

Have you met an error like this? @wade3han
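For context, a mismatch like "tensor a (114) must match tensor b (259)" usually means two token-id tensors of different sequence lengths are being combined element-wise; padding (or truncating) them to a common length first is the usual remedy. A minimal list-based sketch of the idea, with illustrative names; the actual fix depends on how inference.py slices `sequences`:

```python
def pad_to_length(seqs, length, pad_id=0):
    """Right-pad each token-id sequence to a common length so that
    element-wise operations on stacked sequences have matching shapes."""
    return [s + [pad_id] * (length - len(s)) for s in seqs]

a = [1, 2, 3]
b = [4, 5]
max_len = max(len(a), len(b))
padded = pad_to_length([a, b], max_len)  # both rows now have length 3
```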

@wade3han
Author

No, I didn't encounter that error. Can you give me more context?

@MrRace

MrRace commented Apr 10, 2023

> No, I didn't encounter that error. Can you give me more context?

I just used:

instructions = [
        # "In the style of Lu Xun, complain about the recent canteen price hikes"
        "模仿鲁迅的风格, 吐槽一下最近食堂饭菜涨价",
    ]

@diichen

diichen commented Apr 13, 2023

same problem.

@magnificent1208

RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

The same error occurs when running inference with both llama-7b-hf and the fine-tuned model.

@diichen

diichen commented Apr 14, 2023

Cool! The problem has been fixed.

@BaoBaoGitHub

BaoBaoGitHub commented Jul 20, 2023

Thanks for the code!

However, I ran into some problems running the code on my server with three 3090 GPUs (24 GB VRAM each).
I resolved the out-of-memory error by commenting out the line model.cuda().
Then I fixed the error "Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!" by commenting out the line num_beams=4,.

I understand that model.cuda() moves the whole model onto the first GPU.
But what happens when I comment out num_beams=4? Why does that fix the error?
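One likely explanation (an assumption, not confirmed in this thread): with the model spread across several GPUs, beam search reorders candidate sequences and cached tensors using index tensors that can land on a different GPU than the tensors they index, while greedy decoding (the default once num_beams=4 is removed) never performs that reorder. The general remedy is to move tensors onto a common device before combining them. A toy sketch of the rule, using a mock Tensor with a device tag instead of real CUDA tensors:

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    data: list
    device: str

    def to(self, device):
        # Return a copy tagged with the target device.
        return Tensor(self.data, device)

def add(a, b):
    # PyTorch raises on cross-device ops; the fix is to align devices first.
    if a.device != b.device:
        b = b.to(a.device)
    return Tensor([x + y for x, y in zip(a.data, b.data)], a.device)

result = add(Tensor([1, 2], "cuda:0"), Tensor([3, 4], "cuda:1"))
```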
