feat(datasets): Added the Experimental SafetensorsDataset #898

Open
wants to merge 22 commits into
base: main

Conversation

MinuraPunchihewa
Contributor

@MinuraPunchihewa MinuraPunchihewa commented Oct 18, 2024

Description

This PR adds the SafetensorsDataset to support interactions with tensors stored in files in the Safetensors format.

Fixes #221

Development notes

I used the PickleDataset as the basis for this implementation, since it accesses files in much the same way.

These changes have been tested:

  1. Manually, by running the code locally to load and save tensors from and to Safetensors files.
  2. Via the existing and newly added unit tests.
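
For readers unfamiliar with the format, the save-then-load round trip mentioned above can be illustrated with a minimal, dependency-free sketch. It is not the code in this PR (the dataset presumably delegates to the safetensors library) and it handles only float32 data. The helper names `save_f32`, `load_f32`, and `roundtrip` are hypothetical. It relies on the documented Safetensors layout: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw byte buffer.

```python
import json
import os
import struct
import tempfile


def save_f32(path, tensors):
    """Write {name: (shape, flat list of floats)} as a Safetensors-style file (F32 only)."""
    header, chunks, offset = {}, [], 0
    for name, (shape, values) in tensors.items():
        data = struct.pack(f"<{len(values)}f", *values)  # little-endian float32
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            "data_offsets": [offset, offset + len(data)],  # relative to buffer start
        }
        chunks.append(data)
        offset += len(data)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte header length
        f.write(header_bytes)
        f.write(b"".join(chunks))


def load_f32(path):
    """Read back the F32 tensors written by save_f32."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
        buffer = f.read()
    tensors = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional free-form metadata entry
            continue
        begin, end = info["data_offsets"]
        count = (end - begin) // 4  # 4 bytes per float32
        values = list(struct.unpack(f"<{count}f", buffer[begin:end]))
        tensors[name] = (tuple(info["shape"]), values)
    return tensors


def roundtrip(tensors):
    """Save to a temporary file and load again, mimicking a manual check."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.safetensors")
        save_f32(path, tensors)
        return load_f32(path)
```

Calling `roundtrip({"w": ((2, 2), [1.0, 2.5, -3.0, 0.0])})` should return the same mapping, since those values are exactly representable in float32. Files written by this sketch are only meant to illustrate the layout and are not guaranteed to satisfy every check the official library performs.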

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes
  • Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)

Signed-off-by: Minura Punchihewa <minurapunchihewa17@gmail.com>
MinuraPunchihewa and others added 4 commits October 20, 2024 12:08
@MinuraPunchihewa MinuraPunchihewa marked this pull request as ready for review October 20, 2024 06:39
@MinuraPunchihewa
Contributor Author

Hey @astrojuanlu,
This is my implementation of the SafetensorsDataset. I would appreciate a review.

Signed-off-by: Minura Punchihewa <49385643+MinuraPunchihewa@users.noreply.github.com>
@DimedS DimedS added the Community Issue/PR opened by the open-source community label Nov 8, 2024
Contributor

@DimedS DimedS left a comment

Thank you for the PR, @MinuraPunchihewa! It looks great to me. I tested it manually, and everything works perfectly!

DimedS and others added 2 commits November 12, 2024 12:07
@MinuraPunchihewa
Contributor Author

> Thank you for the PR, @MinuraPunchihewa! It looks great to me. I tested it manually, and everything works perfectly!

Thank you, @DimedS. I appreciate it. I just updated the release notes as well.

Labels
Community Issue/PR opened by the open-source community
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add safetensors dataset
2 participants