Add support for remote filesystem paths in Hugging Face CLI (--local-dir) #2407

Open
kimminw00 opened this issue Jul 21, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

kimminw00 commented Jul 21, 2024

huggingface-cli download provides a convenient way to interact with our pre-trained models and datasets.
However, when working with large models and datasets, it can be cumbersome to download and manage them locally.
To improve the user experience, I would like to request a feature that supports remote filesystem paths (for example, S3 paths) in the Hugging Face CLI.

Contributor

Wauplin commented Jul 22, 2024

> To improve the user experience, I request a feature which supports S3 paths in the Hugging Face CLI.

What would be the goal of such a feature, @kimminw00? Do you want to download models that are on the Hugging Face Hub to an S3 bucket? Or from an S3 bucket to the Hugging Face Hub? Or something else? What CLI command would you expect? Could you provide me with an example? Thanks in advance!

kimminw00 changed the title from "Add support for S3 paths in Hugging Face CLI" to "Add support for remote filesystems in Hugging Face CLI" on Aug 5, 2024
kimminw00 changed the title from "Add support for remote filesystems in Hugging Face CLI" to "Add support for remote filesystem paths in Hugging Face CLI" on Aug 5, 2024
kimminw00 changed the title from "Add support for remote filesystem paths in Hugging Face CLI" to "Add support for remote filesystem uri in Hugging Face CLI" on Aug 5, 2024
kimminw00 changed the title from "Add support for remote filesystem uri in Hugging Face CLI" to "Add support for remote filesystem paths in Hugging Face CLI" on Aug 5, 2024
Author

kimminw00 commented Aug 5, 2024

> What would be the goal of such a feature @kimminw00?

The goal of this feature is to support remote filesystems for --local-dir and --cache-dir.

> Do you want to download models that are on the Hugging Face Hub to an S3 bucket? Or from an S3 bucket to the Hugging Face Hub? Or something else?

I want to download models that are on the Hugging Face Hub to an S3 bucket.
(It would be nice to support other remote filesystems as well.)

> What would be the CLI command you are expecting? Could you provide me with an example? Thanks in advance!

huggingface-cli repo download meta-llama/Meta-Llama-3.1-405B \
  --cache-dir s3://BUCKET_NAME/cache \
  --save-dir s3://BUCKET_NAME/models

(To emphasize that it also works on remote filesystems, I replaced --local-dir with --save-dir.)
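
For illustration only, here is a minimal sketch of how a tool could tell a remote target such as `s3://BUCKET_NAME/models` apart from an ordinary local directory. The `Target` type and the `parse_target` helper are hypothetical names made up for this example; nothing like them is confirmed to exist in `huggingface-cli`.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class Target(NamedTuple):
    """Hypothetical description of a download destination."""
    scheme: str  # "" means a plain local path
    bucket: str  # bucket / host part of a remote URI, "" for local paths
    path: str    # key prefix for remote targets, the raw path for local ones


def parse_target(value: str) -> Target:
    """Split a destination string into scheme, bucket, and path."""
    # Requiring "://" keeps Windows paths such as C:\models from being
    # mistaken for a URI with scheme "c".
    if "://" not in value:
        return Target("", "", value)
    parts = urlsplit(value)
    return Target(parts.scheme, parts.netloc, parts.path.lstrip("/"))


if __name__ == "__main__":
    print(parse_target("s3://BUCKET_NAME/models"))
    print(parse_target("./models"))
```

A scheme-based check like this only decides where the files should go; it does not address the locking and symlink requirements of the download process itself.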

Wauplin added the enhancement label on Aug 12, 2024
Wauplin changed the title from "Add support for remote filesystem paths in Hugging Face CLI" to "Add support for remote filesystem paths in Hugging Face CLI (--cache-dir / --local-dir)" on Aug 12, 2024
Wauplin changed the title from "Add support for remote filesystem paths in Hugging Face CLI (--cache-dir / --local-dir)" to "Add support for remote filesystem paths in Hugging Face CLI (--local-dir)" on Aug 12, 2024
Contributor

Wauplin commented Aug 12, 2024

Oooh, I see. Thanks for the examples! I don't think this will be supported in the mid term. The download process relies on some low-level IO features (filelock, symlinks, chmod), and turning it into something that supports generic filesystems would require heavy changes to that process. Furthermore, such a change would only be possible for --local-dir (which you renamed --save-dir), since the cache system uses symlinks, which are not supported by most remote filesystems.
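
To make the symlink point concrete, here is a small sketch that builds a cache-like tree on local disk: one real file under `blobs/` and a snapshot entry that is only a relative symlink to it. The directory names mirror the general blobs/snapshots idea, but the hash `abc123`, the revision `rev1`, and both helper functions are invented for this example. It assumes a platform where `os.symlink` is permitted.

```python
import os
import tempfile
from pathlib import Path


def build_cache_like_layout(root: Path) -> Path:
    """Create blobs/ and snapshots/ where the snapshot entry is a symlink."""
    blob = root / "blobs" / "abc123"  # hypothetical content hash
    blob.parent.mkdir(parents=True)
    blob.write_text("model weights placeholder")

    snapshot_file = root / "snapshots" / "rev1" / "config.json"  # hypothetical revision
    snapshot_file.parent.mkdir(parents=True)
    # Relative link: snapshots/rev1/config.json -> ../../blobs/abc123
    os.symlink(os.path.join("..", "..", "blobs", "abc123"), snapshot_file)
    return snapshot_file


def flat_object_keys(root: Path) -> list:
    """List what a flat key/value object store could hold: regular files only."""
    keys = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            keys.append(path.relative_to(root).as_posix())
    return keys


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        link = build_cache_like_layout(root)
        print(link.is_symlink())       # the snapshot entry is a link, not a copy
        print(link.read_text())        # reading it follows the link to the blob
        print(flat_object_keys(root))  # only the blob survives as a plain object
```

On a store that only holds plain objects, the snapshot entry would have to become either a full copy of the blob or a custom pointer file, which is the kind of heavy change described above.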

I think the best short-term solution would be to build an ad-hoc tool (i.e. one that transfers from the HF Hub to S3) and share it with the community to see if there is interest in such a feature.
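
As a rough sketch of what such an ad-hoc tool might look like (not an existing feature of `huggingface_hub`), the script below stages a repository in a temporary local directory with `snapshot_download` and then uploads each file with boto3's `upload_file`. The function names `iter_upload_plan`, `upload_tree`, and `transfer_repo_to_s3` are hypothetical, and the sketch assumes `huggingface_hub` and `boto3` are installed, AWS credentials are configured, and the local disk can hold the whole repository.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator, Tuple


def iter_upload_plan(local_dir: str, key_prefix: str) -> Iterator[Tuple[str, str]]:
    """Yield (local_path, object_key) pairs for every file under local_dir."""
    root = Path(local_dir)
    prefix = key_prefix.strip("/")
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Assumption: a top-level ".cache" folder holds download metadata
        # rather than repository content, so it is not uploaded.
        if rel.parts[0] == ".cache" or not path.is_file():
            continue
        key = rel.as_posix()
        yield str(path), f"{prefix}/{key}" if prefix else key


def upload_tree(local_dir: str, key_prefix: str,
                put_file: Callable[[str, str], None]) -> int:
    """Call put_file(local_path, object_key) for each file; return the count."""
    count = 0
    for local_path, key in iter_upload_plan(local_dir, key_prefix):
        put_file(local_path, key)
        count += 1
    return count


def transfer_repo_to_s3(repo_id: str, bucket: str, key_prefix: str) -> int:
    """Download a Hub repo to a staging directory, then copy it to S3."""
    # Third-party imports stay local so the helpers above work without them.
    import boto3
    from huggingface_hub import snapshot_download

    s3 = boto3.client("s3")
    with tempfile.TemporaryDirectory() as staging:
        snapshot_download(repo_id=repo_id, local_dir=staging)
        return upload_tree(
            staging,
            key_prefix,
            lambda path, key: s3.upload_file(path, bucket, key),
        )
```

A call such as `transfer_repo_to_s3("some-org/some-model", "BUCKET_NAME", "models/some-model")` would then copy the repository, where the repo id and prefix are placeholders. Staging the full snapshot locally is the simplest design but needs as much free disk space as the repository; for very large models, transferring one file at a time would likely be a better fit.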
