Embed files #8121
base: master
Conversation
After the CI is fixed and the file renamed we can merge
gguf-py/scripts/gguf-addfile.py
Use underscores in Python filenames: gguf-addfile.py -> gguf_add_file.py
Thank you for checking.
I renamed the script to gguf_add_file.py.
If possible, it would be helpful to have a readme, a wiki entry, or comments in the code explaining how the file is embedded and its structure in the kv store... e.g. is there a structure for storing the file name? Is it stored as a dummy tensor?
@mofosyne The embedded files seem to be stored as tensors, so this means they must be loaded on model load, and so the embedded files cannot be optional metadata as in #8602. (Using tensors was suggested in ggerganov/ggml#831 (review) out of concern for the size of the metadata.) This limits the use-cases for embedded files to required files (no extra unhandled files), which might or might not be intended. I don't really have a strong opinion on this, either way is fine, as long as the tradeoffs are known.
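To make the tradeoff concrete, here is a minimal sketch of the "file as a tensor" idea using only NumPy. The helper names and the round-trip convention are assumptions for illustration, not the PR's actual API:

```python
import numpy as np

def file_bytes_to_tensor(data: bytes) -> np.ndarray:
    """Wrap raw file bytes as a 1-D int8 tensor, matching the
    dummy-tensor approach discussed in this PR (dims = [byte count])."""
    return np.frombuffer(data, dtype=np.int8)

def tensor_to_file_bytes(tensor: np.ndarray) -> bytes:
    """Recover the original file contents from the embedded tensor."""
    return tensor.astype(np.int8).tobytes()

# Round-trip example with an in-memory stand-in for a file:
payload = b"hello gguf"
tensor = file_bytes_to_tensor(payload)
assert tensor.shape == (len(payload),)       # one dimension: the byte count
assert tensor_to_file_bytes(tensor) == payload
```

Because the file lives in the tensor data section rather than the KV metadata, a loader walks it like any other tensor, which is why it cannot be skipped as optional metadata.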
# Dimensions are written in reverse order, so flip them first
shape = np.flipud(tensor.shape)
writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
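The dimension flip above can be seen with plain NumPy (the shape here is illustrative, not taken from the PR):

```python
import numpy as np

# GGUF records dimensions in reverse order, so a logical shape of
# (rows, cols) = (2, 3) is written out as (3, 2).
shape = (2, 3)
flipped = np.flipud(np.array(shape))
print(flipped)  # [3 2]
```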
Quantized tensors won't have the correct shape otherwise. See #7483.
Suggested change:
- # Dimensions are written in reverse order, so flip them first
- shape = np.flipud(tensor.shape)
- writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
+ writer.add_tensor_info(tensor.name, tensor.data.shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
data_len = len(data)
dims = [data_len]
raw_dtype = GGMLQuantizationType.I8
writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
It's ignored anyway because raw_dtype is specified, but I think the NumPy type given should at least be similar.
Suggested change:
- writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
+ writer.add_tensor_info(path, dims, np.int8, data_len, raw_dtype)
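Why np.int8 is the closer match here: the element count equals the byte count only for a 1-byte dtype, which is consistent with dims = [data_len] and raw_dtype = I8. A small check (the sample data is illustrative):

```python
import numpy as np

data = b"\x00\x01\x02\x03"
data_len = len(data)

# With a 1-byte dtype, element count and byte count agree,
# matching dims = [data_len].
as_i8 = np.frombuffer(data, dtype=np.int8)
assert as_i8.size == data_len and as_i8.nbytes == data_len

# np.float16 implies 2 bytes per element, so the sizes would disagree.
assert np.dtype(np.float16).itemsize == 2
```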
sync ggml's PR: Embed yolo files #831 (ggerganov/ggml#831)
I added the new script gguf-addfile.py to ggml, so I need to sync the llama.cpp side.
I closed gguf : embed files to gguf model file #7392.