Embed files #8121
base: master
Conversation
After the CI is fixed and the file renamed we can merge
gguf-py/scripts/gguf-addfile.py
Use underscores in Python filenames: gguf-addfile.py -> gguf_add_file.py
Thank you for checking.
I renamed the script to gguf_add_file.py.
If possible, it would be helpful to have a readme, a wiki entry, or comments in the code explaining how the file is embedded and its structure in the kv store... e.g. is there a structure for storing the file name? Is it stored as a dummy tensor?
@mofosyne The embedded files seem to be stored as tensors, so this means they must be loaded on model load, and so the embedded files cannot be optional metadata as in #8602. (Using tensors was suggested in ggerganov/ggml#831 (review) out of concern for the size of the metadata.) This limits the use-cases for embedded files to required files (no extra unhandled files), which might or might not be intended. I don't really have a strong opinion on this, either way is fine, as long as the tradeoffs are known.
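To make the tradeoff concrete, here is a minimal sketch of the "file as a tensor" idea using only NumPy. The helper names and the round-trip convention are assumptions for illustration, not the PR's actual API:

```python
import numpy as np

def file_bytes_to_tensor(data: bytes) -> np.ndarray:
    """Wrap raw file bytes as a 1-D int8 tensor, matching the
    dummy-tensor approach discussed in this PR (dims = [byte count])."""
    return np.frombuffer(data, dtype=np.int8)

def tensor_to_file_bytes(tensor: np.ndarray) -> bytes:
    """Recover the original file contents from the embedded tensor."""
    return tensor.astype(np.int8).tobytes()

# Round-trip example with an in-memory stand-in for a file:
payload = b"hello gguf"
tensor = file_bytes_to_tensor(payload)
assert tensor.shape == (len(payload),)       # one dimension: the byte count
assert tensor_to_file_bytes(tensor) == payload
```

Because the file lives in the tensor data section rather than the KV metadata, a loader walks it like any other tensor, which is why it cannot be skipped as optional metadata.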
# Dimensions are written in reverse order, so flip them first
shape = np.flipud(tensor.shape)
writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
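The dimension flip above can be seen with plain NumPy (the shape here is illustrative, not taken from the PR):

```python
import numpy as np

# GGUF records dimensions in reverse order, so a logical shape of
# (rows, cols) = (2, 3) is written out as (3, 2).
shape = (2, 3)
flipped = np.flipud(np.array(shape))
print(flipped)  # [3 2]
```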
Quantized tensors won't have the correct shape otherwise. See #7483.
Suggested change:
- # Dimensions are written in reverse order, so flip them first
- shape = np.flipud(tensor.shape)
- writer.add_tensor_info(tensor.name, shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
+ writer.add_tensor_info(tensor.name, tensor.data.shape, tensor.data.dtype, tensor.data.nbytes, tensor.tensor_type)
data_len = len(data)
dims = [data_len]
raw_dtype = GGMLQuantizationType.I8
writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
It's ignored anyway because raw_dtype is specified, but I think the NumPy type given should at least be similar.
Suggested change:
- writer.add_tensor_info(path, dims, np.float16, data_len, raw_dtype)
+ writer.add_tensor_info(path, dims, np.int8, data_len, raw_dtype)
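Why np.int8 is the closer match here: the element count equals the byte count only for a 1-byte dtype, which is consistent with dims = [data_len] and raw_dtype = I8. A small check (the sample data is illustrative):

```python
import numpy as np

data = b"\x00\x01\x02\x03"
data_len = len(data)

# With a 1-byte dtype, element count and byte count agree,
# matching dims = [data_len].
as_i8 = np.frombuffer(data, dtype=np.int8)
assert as_i8.size == data_len and as_i8.nbytes == data_len

# np.float16 implies 2 bytes per element, so the sizes would disagree.
assert np.dtype(np.float16).itemsize == 2
```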
sync ggml's PR: Embed yolo files #831 (ggerganov/ggml#831)
I added the new script gguf-addfile.py to ggml, so I need to sync the llama.cpp side.
I closed gguf : embed files to gguf model file #7392.