How to deal with 'side input' recipe - Example static ocean grid #663
Comments
Thanks for raising this, Julius. I think something like your pseudocode example is entirely possible in Beam. The challenge I see is that to deploy this, the pipeline object needs to get involved, i.e. IIUC composite transforms without a pipeline object cannot operate as a side input, so it would need to be something like:

```python
with beam.Pipeline() as p:
    ds_static_side_input = p | static_rec
    recipe = (
        beam.Create(pattern.items())
        | OpenURLWithFSSpec()
        | OpenWithXarray()
        | Preprocess(ds_static=ds_static_side_input)
        | StoreToZarr()
    )
```

which means that Other than that, I'm not aware of any fundamental technical blocker, will just require someone spending some time on it to explore further.
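For readers unfamiliar with Beam side inputs, here is a minimal sketch of the general pattern discussed above, not the pangeo-forge implementation. It assumes plain dicts in place of xarray datasets; `attach_static`, `run_pipeline`, and the variable names `areacello` and `thetao` are hypothetical stand-ins. The static grid is built as a single-element PCollection from the pipeline object and handed to the per-file step with `beam.pvalue.AsSingleton`.

```python
def attach_static(ds, ds_static):
    """Merge static grid variables into one per-file dataset.

    Plain dicts stand in for xarray datasets in this sketch.
    Per-file values win if a name appears in both.
    """
    return {**ds_static, **ds}


def run_pipeline():
    # Imported lazily so attach_static stays usable without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as p:
        # Single-element PCollection standing in for `p | static_rec`.
        static = p | "CreateStatic" >> beam.Create([{"areacello": [1.0, 2.0]}])

        (
            p
            | "CreateTimesteps" >> beam.Create(
                [
                    {"time": 0, "thetao": [10.0, 11.0]},
                    {"time": 1, "thetao": [10.5, 11.5]},
                ]
            )
            # AsSingleton materializes the static PCollection and passes its
            # one element as a keyword argument to every call of the function.
            | "AttachStatic" >> beam.Map(
                attach_static, ds_static=beam.pvalue.AsSingleton(static)
            )
            | "Print" >> beam.Map(print)
        )
```

Calling `run_pipeline()` prints one merged dict per time step. The side input has to be derived from the same pipeline object `p`, which is the constraint raised in the comment above.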
That's great to hear, @cisaacstern. What do you suggest as the next action item? xref/move this issue in
Not sure if this is helpful, but from the little I know of
I just ran into this case over at leap-stc/data-management#75 and I believe it is actually a super relevant use case for pangeo-forge:
In principle the situation is the following:
The user provides
We usually want two things from b):
Both of these aspects are very important for the AR aspect of the resulting dataset, and present a huge advantage to the user of the resulting ARCO data.
I would be curious what folks here think is a good workflow to achieve this. Some thoughts from my side:
Making this work would really enable PGF to just fully ingest output from ocean models fresh off the HPC 🍞