
Embed files #8121

Open · katsu560 wants to merge 13 commits into master

Conversation

katsu560 (Contributor):

This syncs ggml's PR "Embed yolo files" (ggerganov/ggml#831).

I added the new script gguf-addfile.py to ggml, so I need to sync the llama.cpp side.
I closed #7392 (gguf : embed files to gguf model file).

github-actions bot added the "python" (python script changes) label on Jun 25, 2024.
mofosyne added the "Review Complexity : Low" label on Jun 26, 2024.
ggerganov (Owner) left a comment:

After the CI is fixed and the file renamed, we can merge.

ggerganov (Owner):

Use underscores in the Python filenames: gguf-addfile.py -> gguf_add_file.py.

katsu560 (Contributor, Author):

Thank you for checking.
I renamed the script to gguf_add_file.py.

mofosyne (Collaborator):

If possible, it would be helpful to have a README, a wiki entry, or comments in the code explaining how the file is embedded and its structure in the KV store. E.g., is there a structure for storing the file name? Is it stored as a dummy tensor?

compilade (Collaborator):

@mofosyne The embedded files seem to be stored as tensors, which means they must be loaded at model load time, so the embedded files cannot be optional metadata as in #8602.

(Using tensors was suggested in ggerganov/ggml#831 (review) out of concern for the size of the metadata.)

This limits the use-cases for embedded files to required files (no extra unhandled files), which might or might not be intended.

I don't really have a strong opinion on this; either way is fine, as long as the tradeoffs are known.
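
To make the mechanism concrete, here is a minimal sketch of the writer side of this approach, using the gguf Python package. The model path, embedded file, and tensor naming are illustrative assumptions, not necessarily this PR's actual schema:

    # Sketch (assumed naming, not the PR's exact code): embed a file's raw
    # bytes as a 1-D I8 tensor named after the file's path.
    import numpy as np
    from gguf import GGUFWriter, GGMLQuantizationType

    writer = GGUFWriter("model-with-files.gguf", arch="llama")

    data = np.fromfile("tokenizer.json", dtype=np.int8)  # hypothetical file to embed
    writer.add_tensor("tokenizer.json", data, raw_dtype=GGMLQuantizationType.I8)

    writer.write_header_to_file()
    writer.write_kv_data_to_file()
    writer.write_tensors_to_file()
    writer.close()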

Comment on lines +103 to +105:

    # Dimensions are written in reverse order, so flip them first
    shape = np.flipud(tensor.shape)
    writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
Collaborator:

Quantized tensors won't have the correct shape otherwise. See #7483.

Suggested change:

-    # Dimensions are written in reverse order, so flip them first
-    shape = np.flipud(tensor.shape)
-    writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
+    writer.add_tensor_info(tensor.name, tensor.data.shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
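
For context on why this matters: for quantized tensors, the logical shape recorded in the GGUF metadata differs from the shape of the raw data buffer, so flipping tensor.shape and re-serializing writes incorrect tensor info. A quick way to see the difference (a sketch, assuming a local quantized model file):

    # Sketch: compare each tensor's logical shape with the shape of its
    # raw (possibly quantized) data buffer, as exposed by the gguf reader.
    from gguf import GGUFReader

    reader = GGUFReader("model.gguf")  # hypothetical local model
    for t in reader.tensors:
        # t.shape is the logical shape; t.data.shape is the on-disk layout,
        # which differs for quantized tensor types.
        print(t.name, t.tensor_type.name, list(t.shape), t.data.shape)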

    data_len = len(data)
    dims = [data_len]
    raw_dtype = GGMLQuantizationType.I8
    writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
Collaborator:

It's ignored anyway because raw_dtype is specified, but I think the NumPy type given should at least be similar.

Suggested change:

-    writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
+    writer.add_tensor_info(path, dims, np.int8, data_len, raw_dtype)
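
A note on the precedence here: when raw_dtype is passed, GGUFWriter.add_tensor_info derives the stored type from it rather than from the NumPy dtype, so the change above is cosmetic; still, np.int8 matches the one-byte I8 elements, keeping dims and data_len trivially consistent with each other.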
