
py : type-check all Python scripts with Pyright #8341

Merged: 10 commits merged into master on Jul 7, 2024

Conversation

@compilade (Collaborator) commented on Jul 6, 2024

This fixes pretty much all type errors and warnings in all Python scripts of this repo. I've also added a GitHub workflow that runs Pyright on all the Python scripts, to make sure new changes don't introduce type errors.

The local equivalent is simply to run pyright (without arguments), since the config is all in pyrightconfig.json.

This should help catch typos, invalid imports, and type errors in general, as well as constructs that are not valid in the targeted Python version. (Pyright allows targeting different Python versions for different parts of the code, which is nice.)

Some of the Python scripts are simply not statically type-checkable, like examples/pydantic_models_to_grammar.py and some minor parts of other scripts, so appropriate # pyright: ignore[somethingHere] comments were added.
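
For illustration, here's a minimal, hypothetical example of such a suppression (not taken from the PR; the bracketed rule name must match the diagnostic Pyright actually reports):

```python
from pydantic import create_model

# Fields of models created at runtime are invisible to static analysis,
# so the attribute access below is flagged even though it works fine.
DynamicModel = create_model("DynamicModel", name=(str, ...))
instance = DynamicModel(name="example")
print(instance.name)  # pyright: ignore[reportAttributeAccessIssue]
```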

Note that this upgrades the openai Python library used by the server test suite to 1.30 (from 0.25). This PR also includes the cpu-only torch changes from #8335, to make fetching dependencies faster.

I've added requirements/requirements-all.txt and other missing requirements files, to allow the Pyright workflow to build a Python environment with all the dependencies used by the scripts in this repo. Note that Pyright does verify the validity of imports, so it will output warnings or errors about invalid imports even for non-toplevel imports, unlike scripts/check-requirements.sh. It does not use a separate venv for each script, though, so it can't really check the requirements files themselves. (By the way, this is how I noticed an invalid non-toplevel import on master in convert_llama_ggml_to_gguf.py: it was import convert, but that script has since been moved and renamed.)
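
As a hedged sketch of why this matters (the function and its body are made up; only the convert module name comes from the note above), a stale import inside a function never fails during a quick smoke test, but Pyright still flags it:

```python
# Hypothetical script: the import only runs when this code path is taken,
# so it never fails at startup, but Pyright reports it statically.
def load_legacy_model(path: str):
    import convert  # error: unresolved import (script was moved/renamed)

    return convert.load(path)  # hypothetical API
```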

TODO

  • Test the scripts whose changes went beyond simply adding type annotations, to check that I didn't break anything.

    • lazy conversion with convert_hf_to_gguf.py @compilade
      • using a class instead of a field on a function object to carry the lazy tensor queue when recursing (first sketch below)
    • JAIS model conversion with convert_hf_to_gguf.py @fmz
      • it seems data_torch._data[0] relies on a field that isn't a valid attribute of torch.Tensor, so I've replaced it with data_torch[0].item() (second sketch below).
      • This should work, based on a separate test of only this line's behavior.
    • llama-server test suite
      • The failures were mostly caused by the upgrade to openai~=1.30 from openai~=0.25 (third sketch below), plus a lack of type annotations which made them harder to debug.
    • ggml/ggml_vk_generate_shaders.py @0cc4m
      • the base_dict variable seems to be a constant, and was possibly unbound below the loop, so I moved it outside the loop; I'm not sure that's the right fix here, though (the fourth sketch below shows the pattern Pyright flags).
    • tests/test-tokenizer-random.py @jaime-m-p
      • which version of cffi is known to work with this?
        • It works with 1.16.0. A more useful question would be "which gcc version is known to work with this?", because gcc 13.3.0 seems to run optimizations during the pre-processing step (unless -O0 is used), which confuses pycparser.
          • I've fixed this by passing -O0 in the call to gcc in the test-tokenizer-random.py script (last sketch below).
      • does generator.__qualname__ behave the same way generator.__name__ did?
        • it seems like it does
  • I have read the contributing guidelines

  • Self-reported review complexity: Medium
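
The first sketch below illustrates the pattern behind the lazy-conversion item, with made-up names rather than the actual convert_hf_to_gguf.py code: state that used to hang off the function object itself is moved into a small class whose attribute types Pyright can see.

```python
from collections import deque
from typing import Any

# Before (hard to type-check): the queue was stored as an attribute on
# the function object, e.g. `recurse.queue.append(node)`, which Pyright
# cannot model. After: a class owns the queue, with an explicit type.
class LazyTensorWalker:  # hypothetical name
    def __init__(self) -> None:
        self.queue: deque[Any] = deque()

    def recurse(self, node: Any) -> None:
        self.queue.append(node)
        for child in getattr(node, "children", ()):
            self.recurse(child)
```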
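
The second sketch shows the JAIS tensor change in isolation: _data is not a public (or typed) attribute of torch.Tensor, while indexing followed by .item() is the supported way to extract a Python scalar.

```python
import torch

data_torch = torch.tensor([2.0, 3.0])

# `data_torch._data[0]` relies on a private attribute Pyright rejects;
# `.item()` converts a one-element tensor to a plain Python number.
first = data_torch[0].item()
print(first)  # 2.0
```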
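
The third sketch gives a rough idea of the openai 0.x to 1.x migration behind the server-test failures; the URL, key, and model name are placeholders, not values from the actual test suite.

```python
from openai import OpenAI  # openai>=1.0

# 1.x replaces the module-level openai.ChatCompletion API with a client
# object; base_url points the client at a local server (placeholder URL,
# with a trailing slash as in the commit list further down).
client = OpenAI(base_url="http://localhost:8080/v1/", api_key="sk-no-key")
res = client.chat.completions.create(
    model="placeholder-model",
    messages=[{"role": "user", "content": "hello"}],
)
print(res.choices[0].message.content)
```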
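
The fourth sketch reproduces, with made-up names, the "possibly unbound" pattern Pyright reports for code shaped like ggml_vk_generate_shaders.py.

```python
shaders = ["a", "b"]  # hypothetical input

for name in shaders:
    base_dict = {"FLOAT_TYPE": "float"}  # bound only if the loop runs
    # ... per-shader work ...

print(base_dict)  # Pyright: "base_dict" is possibly unbound
# The fix applied in the PR: define the constant once, above the loop.
```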
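
Finally, a minimal sketch of the shape of the -O0 fix, assuming a gcc pre-processing step feeding pycparser; this is not the actual test-tokenizer-random.py code, and the header name is hypothetical.

```python
import subprocess

from pycparser import CParser

# Pre-process with -O0 so gcc doesn't apply optimizations during the
# pre-processing step, which would emit constructs pycparser can't parse.
source = subprocess.run(
    ["gcc", "-E", "-O0", "libllama.h"],  # hypothetical header name
    capture_output=True, check=True, text=True,
).stdout
ast = CParser().parse(source, filename="libllama.h")
```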

@compilade added the python, Review Complexity : Medium, and devops labels on Jul 6, 2024
@github-actions bot added the script, testing, nix, Vulkan, examples, and server labels on Jul 6, 2024
@compilade pushed a commit: "ci : disable pip cache in type-check workflow"

The cache is not shared between branches, and it's 250MB in size, so it would become quite a big part of the 10GB cache limit of the repo.
@compilade added the merge ready label on Jul 7, 2024
@compilade (Collaborator, Author) commented:

I'll be merging this soon, because otherwise nothing warns anyone about the new type errors they introduce in Python scripts, unless they already use a static type checker.

For example, #8031 and #8048 were both recently merged and introduced new type errors in some Python scripts which I've fixed in 6f215f1.

@jaime-m-p (Collaborator) commented:

@compilade

I'm testing with:

  • gcc 10.2.1
  • python 3.9.2
  • cffi 1.16.0

Using -O3 fails due to inlining; adding -fno-inline fixes it.
But -O0 seems to generate the exact same code with no speed difference, so -O0 is fine.

Since there are no nested scopes, generator.__qualname__ and generator.__name__ are the same.
Are you suggesting the replacement? I have no problem with it.

@compilade (Collaborator, Author) commented:

> I'm testing with:
>
>   • gcc 10.2.1
>   • python 3.9.2
>   • cffi 1.16.0

@jaime-m-p

Thanks for sharing your setup. I've also tested tests/test-tokenizer-random.py on my system, and it works with:

  • gcc 13.3.0
  • python 3.11.9
  • cffi 1.16.0

(which are the versions I get with nix develop .#default-extra)

> Using -O3 fails due to inlining; adding -fno-inline fixes it.

Hmm, when I try with -fno-inline, it strangely still produces extern __inline int in some places, so pycparser fails. -O0 works, though.

> Since there are no nested scopes, generator.__qualname__ and generator.__name__ are the same.
> Are you suggesting the replacement? I have no problem with it.

Yes, because Pyright otherwise complains that the attribute __name__ does not exist on Iterator[str], while __qualname__ does exist.
Even though the generator functions do have __name__ at runtime, that isn't necessarily the case for all types compatible with Iterator[str], unlike with __qualname__.
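
To illustrate the distinction, a minimal sketch with a made-up generator name:

```python
from typing import Iterator

def unicode_ids() -> Iterator[str]:  # hypothetical generator function
    yield "a"

generator: Iterator[str] = unicode_ids()
# At runtime a generator object carries both attributes:
print(generator.__qualname__)  # "unicode_ids"; accepted by Pyright here
print(generator.__name__)      # also works at runtime, but Pyright can't
                               # prove every Iterator[str] has __name__
```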

The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.
@compilade merged commit 3fd62a6 into master on Jul 7, 2024, with 16 checks passed.
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Jul 13, 2024:
* py : type-check all Python scripts with Pyright

* server-tests : use trailing slash in openai base_url

* server-tests : add more type annotations

* server-tests : strip "chat" from base_url in oai_chat_completions

* server-tests : model metadata is a dict

* ci : disable pip cache in type-check workflow

The cache is not shared between branches, and it's 250MB in size,
so it would become quite a big part of the 10GB cache limit of the repo.

* py : fix new type errors from master branch

* tests : fix test-tokenizer-random.py

Apparently, gcc applies optimisations even when pre-processing,
which confuses pycparser.

* ci : only show warnings and errors in python type-check

The "information" level otherwise has entries
from 'examples/pydantic_models_to_grammar.py',
which could be confusing for someone trying to figure out what failed,
considering that these messages can safely be ignored
even though they look like errors.