-
Notifications
You must be signed in to change notification settings - Fork 14.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update airflow.io.path.ObjectStoragePath for upcoming universal_pathlib release #37067
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval. |
Hey @ap-- , That just sounds awesome! Obviously any simplification while retaining flexibility sounds great. The generic need for a custom class on top of UPath is indeed to support Airflow-isms, like what you mentioned but also to be able to de/serialize a path into something that can be stored in a database. We are also a bit more "object store" oriented (semantically) than UPath I guess, so a little bit more strict in what we support how how things behave. |
Great 👍 Once I finish the implementation for fsspec/universal_pathlib#173 I'll open a PR here. |
Last blocker for #36755 |
@bolkedebruin I released a new version of universal_pathlib yesterday, that vendors the required code to delay import of the required fsspec filesystems until they are actually used for IO. PR #37524 should in theory pass all tests now, if |
Apache Airflow version
main (development)
If "Other Airflow 2 version" selected, which one?
No response
What happened?
I'm opening this issue to track a needed refactor of
airflow.io.path.ObjectStoragePath
to support the next release ofuniversal_pathlib
without having to have a special implementation for Python-3.12What you think should happen instead?
Tagging @bolkedebruin, @uranusjr, @potiuk
It would be great if I could have some insight into the intended behaviour for the custom
UPath
subclassairflow.io.path.ObjectStoragePath
Looking at the current implementation it seems that these are the intended changes:
airflow/airflow/io/path.py
Lines 148 to 151 in 07fd364
airflow.io.store.ObjectStore
airflow/airflow/io/path.py
Lines 49 to 65 in 07fd364
UPath
interface with several custom methods.I am working on supporting (1) and (2) via custom methods that can be overwritten in UPath subclasses, and allow to have a single implementation for python versions prior to 3.12 and 3.12+, see: fsspec/universal_pathlib#172
Once that is integrated in UPath it should simplify the
ObjectStoragePath
implementation a lot. The custom accessor will be replaced by a factory method for creating fsspec filesystems, and the custom__new__
implementation by a method for parsing additional storage_options.Let me know if that sounds good, and what's the best way to collaborate on this issue and move forward.
Cheers,
Andreas 😃
How to reproduce
N/A
Operating System
N/A
Versions of Apache Airflow Providers
Deployment
Other
Deployment details
N/A
Anything else?
N/A
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: