Replies: 2 comments
- What worked for me was setting the […]. This works because bentoml seems to be using […].
- The temp directory is used to make the move into the model store atomic. You probably ran out of space while the model was being copied. For vLLM specifically, you might want to just use the Hugging Face cache. Try a newer workflow, such as llama3 or llama3.1 in BentoVLLM. Thanks.
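A minimal sketch of the cache suggestion above, assuming vLLM pulls weights through the Hugging Face Hub (whose cache root honors the `HF_HOME` environment variable); `/data/hf_cache` is a hypothetical path on a disk with enough room:

```python
import os

# Assumption: vLLM downloads weights via the Hugging Face Hub, and the
# hub honors HF_HOME as its cache root. Set it BEFORE importing
# vllm / huggingface_hub so the first download lands on the roomy disk.
os.environ["HF_HOME"] = "/data/hf_cache"  # hypothetical path with free space

# vLLM can then load the model straight from that cache, e.g.:
# from vllm import LLM
# llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
```

With this setup the weights are cached once under `HF_HOME` and reused on subsequent loads, so nothing large needs to be copied through `/tmp`.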
Hi,
I am trying to create a bentoml.Model from my local pretrained model. I am following the "BentoVLLM/llama2-7b-chat/import_models.py" file, and I get an error on this line:
model.save_pretrained(bento_model_ref.path)
safetensors_rust.SafetensorError: Error while serializing: IoError(Os { code: 28, kind: StorageFull, message: "No space left on device" })
bento_model_ref.path returned "/tmp/tmpgh8al39tbentoml_mymodel", but my computer doesn't have enough space on "/tmp". I tried to change this path, but I couldn't find the exact line of code that produces the "/tmp/..." prefix. I found that getsyspath() returns a "/tmp/..." prefix, but I could not find its implementation.
Can anybody help me change bentoml.Model.path?
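One workaround worth trying, as a sketch under two assumptions: the staging directory appears to come from Python's `tempfile` module (the `/tmp/tmp…bentoml_…` pattern looks like `mkdtemp` output), and `tempfile` honors the `TMPDIR` environment variable. If both hold, pointing `TMPDIR` at a disk with enough space before importing bentoml should move the staging path off `/tmp`; `~/bentoml_tmp` below is a hypothetical location:

```python
import os
import tempfile

# Assumption: BentoML stages models via Python's tempfile module, which
# consults the TMPDIR environment variable when picking its directory.
big_disk = os.path.expanduser("~/bentoml_tmp")  # hypothetical roomy location
os.makedirs(big_disk, exist_ok=True)

os.environ["TMPDIR"] = big_disk
tempfile.tempdir = None  # clear tempfile's cached choice so TMPDIR is re-read

print(tempfile.gettempdir())  # should now be the new directory
```

Setting `TMPDIR` in the shell before running `import_models.py` would have the same effect, since the environment is inherited by the Python process.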