Helm chart - copy models from NFS storage to attached storage #10

Open
bd-g opened this issue May 31, 2024 · 0 comments

bd-g (Collaborator) commented May 31, 2024

Proposed changes

The default AWS and GCP configurations set up the Engine Pods to read models from shared network-attached storage. This can increase disk read latency and, in turn, the latency of any request that requires a model load. Streaming requests are particularly sensitive to this.

There should be an option to copy models from NFS onto the Pod's attached host storage to reduce model read latency. The copy could run once on startup, and the Pod could optionally poll NFS for updated models.
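To make the proposal concrete, here is a minimal sketch of the copy-then-poll behavior in Python. It is only an illustration under stated assumptions, not the chart's implementation: the function names `sync_models` and `run`, the `poll_seconds` parameter, and the change check (file size plus modification time) are all hypothetical choices, and the sketch does not remove local files that have been deleted from NFS.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def sync_models(src: Path, dst: Path) -> List[Path]:
    """Copy new or changed files from src to dst.

    Returns the relative paths that were copied. A file counts as unchanged
    when the local copy has the same size and is at least as new (hypothetical
    heuristic; a checksum would be stricter but slower over NFS).
    """
    copied = []
    for src_file in sorted(src.rglob("*")):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(src)
        dst_file = dst / rel
        src_stat = src_file.stat()
        if dst_file.exists():
            dst_stat = dst_file.stat()
            same_size = dst_stat.st_size == src_stat.st_size
            not_older = int(dst_stat.st_mtime) >= int(src_stat.st_mtime)
            if same_size and not_older:
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        # Copy to a temporary name first, then rename, so a reader never
        # sees a partially written model file at the final path.
        tmp_file = dst_file.with_name(dst_file.name + ".tmp")
        shutil.copy2(src_file, tmp_file)
        tmp_file.replace(dst_file)
        copied.append(rel)
    return copied


def run(src: Path, dst: Path, poll_seconds: Optional[float] = None) -> None:
    """Sync once on startup; keep polling if poll_seconds is set."""
    sync_models(src, dst)
    while poll_seconds:
        time.sleep(poll_seconds)
        sync_models(src, dst)
```

In a Pod, something like `run(Path("/mnt/nfs/models"), Path("/models"), poll_seconds=300)` could serve as the entry point of an init container or sidecar; both paths and the interval are placeholders. Running the first sync before the Engine container starts would keep model loads off the network path from the first request onward.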
