Verify chatglm3 6b #1119

Open
wants to merge 39 commits into base: master
Conversation

Aniruddha521

I proceeded as mentioned in task #259 with the following changes.
1) Extended the nightly_models list in the file openvino.genai/tests/python_tests/ov_genai_test_utils.py

nightly_models = [
        "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        "facebook/opt-125m",
        "microsoft/phi-1_5",
        "microsoft/phi-2",
        "THUDM/chatglm2-6b",
        "THUDM/chatglm3-6b", # no beam_search
        "Qwen/Qwen2-0.5B-Instruct",
        "Qwen/Qwen-7B-Chat",
        "Qwen/Qwen1.5-7B-Chat",
        "argilla/notus-7b-v1",
        "HuggingFaceH4/zephyr-7b-beta",
        "ikala/redpajama-3b-chat",
        "mistralai/Mistral-7B-v0.1",

2) Added the model to openvino.genai/.github/workflows/causal_lm_cpp.yml: the jobs cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu-Chatglm3-6b.

3) Extended the supported_model_list and added a note.

Note

beam_search_causal_lm is not supported for the ChatGLM3-6B model.

@github-actions github-actions bot added the category: sampling (Sampling / Decoding algorithms) and category: GHA (CI based on Github actions) labels Oct 31, 2024
A:' > ./prompt.txt

./build/samples/cpp/prompt_lookup_decoding_lm/prompt_lookup_decoding_lm ./TinyLlama-1.1B-Chat-v1.0/ "$(<prompt.txt)" > predictions_prompt_lookup.txt
./build/samples/cpp/greedy_causal_lm/greedy_causal_lm ./TinyLlama-1.1B-Chat-v1.0/ "$(<prompt.txt)" > predictions_greedy.txt
Collaborator

Test chatglm3-6b instead.

Author

That's a mistake, thank you for pointing it out. In my updated commit I have removed cpp-prompt_lookup_decoding_lm-ubuntu-Chatglm3-6b and just added the necessary part to cpp-prompt_lookup_decoding_lm-ubuntu.

assert predicted_greedy == predicted_prompt_lookup
"
echo "Prompt lookup" passed
- name: run and compare (model with seq_length_axis = 1)
Collaborator

It seems this step is just a copy of another one. Remove it.

Collaborator

https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/SUPPORTED_MODELS.md isn't updated. Please extend the note below the table: optimum-cli requires --task text-generation-with-past for THUDM/chatglm3-6b, and beam search isn't supported for that model.
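For illustration, such a note could read roughly as follows (a sketch only; the exact wording and its placement below the table are up to the author):

    Note: THUDM/chatglm3-6b must be exported by optimum-cli with --task text-generation-with-past, and beam search is not supported for this model.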

@@ -25,6 +25,7 @@ def get_models_list():
"microsoft/phi-1_5",
"microsoft/phi-2",
"THUDM/chatglm2-6b",
"THUDM/chatglm3-6b", # no beam_search
Collaborator

Does every test pass with THUDM/chatglm3-6b? If not, please, mark the failing tests to be skipped.

Collaborator

Skips must happen only for that particular model.
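For illustration, a per-model skip in the Python tests could look roughly like this (a sketch only: get_models_list exists in ov_genai_test_utils.py, but the test name, the parametrization shape, and the run_beam_search_check helper are hypothetical):

import pytest
from ov_genai_test_utils import get_models_list

@pytest.mark.parametrize("model_id", get_models_list())
def test_beam_search_decoding(model_id):  # hypothetical test name
    if model_id == "THUDM/chatglm3-6b":
        # Skip only this particular model; every other model still runs the check.
        pytest.skip("beam_search is not supported for THUDM/chatglm3-6b")
    run_beam_search_check(model_id)  # hypothetical helper containing the actual assertions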

@@ -274,6 +274,41 @@ jobs:
&& call .\ov\setupvars.bat
&& python samples\python\greedy_causal_lm\lora.py .\TinyLlama\TinyLlama-1.1B-intermediate-step-1431k-3T\ adapter_model.safetensors "How to create a table with two columns, one of them has type float, another one has type int?"

cpp-greedy_causal_lm-Chatglm3-6b:
runs-on: ubuntu-24.04
Collaborator

Suggested change
- runs-on: ubuntu-24.04
+ runs-on: ubuntu-20.04-4-cores

@Aniruddha521
Author

@Wovchena can you please guide me on what could have gone wrong for cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu? Both greedy_causal_lm and prompt_lookup_decoding_lm work for chatglm3-6b on my local machine.

@mlukasze mlukasze linked an issue Nov 4, 2024 that may be closed by this pull request
optimum-cli export openvino --trust-remote-code --weight-format fp16 --model THUDM/chatglm3-6b chatglm3-6b --task text-generation-with-past
- run: >
. ./ov/setupvars.sh
&& timeout 2m ./build/samples/cpp/greedy_causal_lm/greedy_causal_lm ./chatglm3-6b/ 69 | diff <(timeout 2m samples/python/greedy_causal_lm/greedy_causal_lm.py ./chatglm3-6b/ 69) -
Collaborator

@Wovchena can you please guide me on what could have gone wrong for cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu? Both greedy_causal_lm and prompt_lookup_decoding_lm work for chatglm3-6b on my local machine.

The samples work, but the comparison fails, which means the samples produced different results. Try running the samples again and pay attention to their output.

By the way, here's a newer and more verbose implementation of the same idea using tee and timeout-minutes. It is also split into multiple steps to make it clear which command fails. You can rewrite your tests in a similar way to simplify reasoning about what happened.

- name: Run visual_language_chat C++ sample - MiniCPM-V-2_6
  run: >
    source ./ov/setupvars.sh
    && ./build/samples/cpp/visual_language_chat/visual_language_chat ./MiniCPM-V-2_6/ ./images/
    <<< $'Describe the images?' | tee cpp.txt
  timeout-minutes: 2
- run: diff cpp.txt ref.txt
- name: Run visual_language_chat Python sample - MiniCPM-V-2_6
  run: >
    source ./ov/setupvars.sh
    && ./samples/python/visual_language_chat/visual_language_chat.py ./MiniCPM-V-2_6/ ./images/
    <<< $'Describe the images?' | tee py.txt
  env:
    PYTHONPATH: "./build/"
- run: diff py.txt ref.txt
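Applied to this PR, the chatglm3-6b greedy C++ vs Python comparison could be laid out in the same multi-step style, roughly like this (a sketch only; the step names are assumptions, while the paths and the prompt 69 mirror the commands already used in this thread):

- name: Run greedy_causal_lm C++ sample - chatglm3-6b
  run: >
    source ./ov/setupvars.sh
    && ./build/samples/cpp/greedy_causal_lm/greedy_causal_lm ./chatglm3-6b/ 69 | tee cpp.txt
  timeout-minutes: 2
- name: Run greedy_causal_lm Python sample - chatglm3-6b
  run: >
    source ./ov/setupvars.sh
    && samples/python/greedy_causal_lm/greedy_causal_lm.py ./chatglm3-6b/ 69 | tee py.txt
  env:
    PYTHONPATH: "./build/"
  timeout-minutes: 2
- run: diff cpp.txt py.txt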

@ilya-lavrenov ilya-lavrenov removed the category: sampling (Sampling / Decoding algorithms) label Nov 5, 2024
@github-actions github-actions bot added the category: sampling (Sampling / Decoding algorithms) label Nov 5, 2024
@github-actions github-actions bot added the category: tokenizers (Tokenizer class or submodule update) label Nov 5, 2024
….genai into verify_chatglm3-6b

merging-remote-and-local
@Aniruddha521
Author

@Wovchena
I used the following commands, but it doesn't seem to be working:

git pull --rebase origin verify-chatglm-6b
git add resolved-file
git rebase --continue
git push origin verify-chatglm-6b --force

Can you suggest anything I may have done wrong or missed?
Additionally, my forked repo is up to date with openvinotoolkit:master.
Please help me with this matter!

@Wovchena
Collaborator

You can check out thirdparty/openvino_tokenizers from master and push to your branch to align the tokenizers with master:

git checkout master thirdparty/openvino_tokenizers/
git add thirdparty/openvino_tokenizers/
git commit -m tokenizers
git push

@Wovchena
Collaborator

Apparently that was the wrong master. Try git checkout a0268cd5c5fe71ccbc4dc773b502075867c859fe thirdparty/openvino_tokenizers instead of git checkout master thirdparty/openvino_tokenizers/.

@Aniruddha521
Author

@Wovchena
Why are prompt_lookup and greedy generating different output for chatglm3? Here:

- name: run and compare
        run: |
          source ./ov/setupvars.sh

          echo 'Code:```python
          def add(a, b):
              return a + b
          ```
          Question: Can you please add 2 and 3
          A:' > ./prompt.txt

          ./build/samples/cpp/prompt_lookup_decoding_lm/prompt_lookup_decoding_lm ./chatglm3-6b/ "$(<prompt.txt)" > predictions_prompt_lookup.txt
          ./build/samples/cpp/greedy_causal_lm/greedy_causal_lm ./chatglm3-6b/ "$(<prompt.txt)" > predictions_greedy.txt
          diff predictions_prompt_lookup.txt predictions_greedy.txt
          python -c "
          with open('predictions_greedy.txt', 'r') as f:
              predicted_greedy = f.readline()
          with open('predictions_prompt_lookup.txt', 'r') as f:
              predicted_prompt_lookup = f.readline()
          print(predicted_greedy)
          print(predicted_prompt_lookup)
          "
          echo "Prompt lookup" passed

Based on the error raised, I think it is due to the different outputs generated by greedy and prompt lookup.
But it is probably generating the same results for the Qwen model, since that one passes the tests.

@Wovchena
Collaborator

I don't know. That is the point of this task: to test whether chatglm is fine. Apparently it's not. You can print the results to see them.

@Aniruddha521
Author

@Wovchena
As far as I can see, cpp-greedy_causal_lm-Chatglm3-6b is failing because the outputs generated by the Python sample and the C++ sample are different, whereas cpp-prompt_lookup_decoding_lm-ubuntu is failing because the prompt lookup and greedy generated texts are different.
In both cases I am using the diff command to compare the generated text, which may be causing the assertion error. So should I only remove the comparison part, or remove the whole cpp-greedy_causal_lm-Chatglm3-6b and cpp-prompt_lookup_decoding_lm-ubuntu jobs?
I also don't know why the tests below are failing:

  1. Linux (Ubuntu 20.04, Python 3.9) / OpenVINO genai extension (cmake + wheel) (pull_request) Failing after 27m
  2. Windows (VS 2019, Python 3.11) / OpenVINO genai extension (cmake + wheel) (pull_request)
  3. macOS (12, Python 3.9) / OpenVINO genai extension (cmake + wheel) (pull_request)

I also proceeded as you mentioned earlier:

git checkout a0268cd5c5fe71ccbc4dc773b502075867c859fe thirdparty/openvino_tokenizers
git add thirdparty/openvino_tokenizers/
git commit --amend
git push

@Wovchena
Collaborator

You need to find out why C++ and Python greedy produce different outputs and fix it. Other model runs are aligned, which means the problem is not in the samples themselves.

This PR diff shows that you've modified thirdparty/openvino_tokenizers. Eliminate that diff; master's version of openvino_tokenizers should be kept. Maybe this is the reason for the Linux (Ubuntu 20.04, Python 3.9) / OpenVINO genai extension (cmake + wheel) (pull_request) failure.
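For illustration, one way to check that the submodule diff is gone and to align it if not (standard git commands; the upstream remote name is an assumption for the openvinotoolkit/openvino.genai remote):

git fetch upstream
git diff upstream/master -- thirdparty/openvino_tokenizers    # should print nothing once aligned
git checkout upstream/master -- thirdparty/openvino_tokenizers
git add thirdparty/openvino_tokenizers
git commit -m "Align openvino_tokenizers with master"
git push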

Labels
category: GHA (CI based on Github actions), category: sampling (Sampling / Decoding algorithms), category: tokenizers (Tokenizer class or submodule update)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Good First Issue]: Verify chatglm3-6b with GenAI text_generation
3 participants