Enumerating tool chains for geozarr in different ecosystems #5
Comments
Toolchain: Python / Pangeo (Zarr + Dask + Xarray + MetPy)
Overview
The Pangeo ecosystem has done a lot to move Zarr forward as a cloud-native format. Here's what our stack looks like for the common use case of reading Zarr from cloud storage. If you want to parse CRS information out of your Xarray data that has been loaded in this way, I believe that your only option is MetPy's
Dependency chain
flowchart TD
s3fs --> zarr-python
zarr-python --> Dask
Dask --> Xarray
zarr-python --> Xarray
Xarray --> MetPy
I'm not sure I see what you mean about the CRS. You might parse it with MetPy (and, I suppose, with rioxarray) if you first set it, right?
What I mean is that Xarray by itself has no inherent understanding of CRS. The metadata are there, but not useful. MetPy's
To follow up on this issue: at today's meeting, we tried parsing data from Brianna's example Zarr dataset with rioxarray. The notebook is here: https://gist.github.com/rabernat/8c53c380fbb38dcf556e85487960a847
Here's an example of rioxarray (building on rasterio and GDAL) understanding CRS information (via rioxarray's support for CF conventions, I believe).
import requests
import rioxarray  # imported for its side effect: registers the .rio accessor on xarray objects
import xarray as xr

# Request a SAS token for the Daymet container from the Planetary Computer.
token = requests.get(
    "https://planetarycomputer.microsoft.com/api/sas/v1/token/daymet-daily-hi"
).json()["token"]

# decode_coords="all" promotes the variable named by the CF grid_mapping attribute to a coordinate.
ds = xr.open_dataset(
    "abfs://daymet-zarr/daily/hi.zarr",
    engine="zarr",
    decode_coords="all",
    storage_options={"account_name": "daymeteuwest", "credential": token},
)
print(ds.rio.crs)
print("---")
print(ds.rio.transform())
which prints out
In that case, the It doesn't appear that
Super useful, Tom! Your issue reveals the problem that different Python libraries use different in-memory representations of CRS / transform. This is separate from, but closely related to, the on-disk representation (e.g. #12).
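To make the point about differing in-memory representations concrete, here is a minimal sketch in plain Python. It assumes evenly spaced, north-up, cell-center coordinates, and the helper names `affine_from_coords` and `to_gdal_geotransform` are hypothetical, not part of any library. It derives a transform in the (a, b, c, d, e, f) coefficient order used by the `affine` package and then reorders the same six numbers into a GDAL-style geotransform.

```python
from typing import NamedTuple, Sequence, Tuple


class AffineCoeffs(NamedTuple):
    """Coefficients in (a, b, c, d, e, f) order, where
    x = a*col + b*row + c and y = d*col + e*row + f."""
    a: float
    b: float
    c: float
    d: float
    e: float
    f: float


def affine_from_coords(x: Sequence[float], y: Sequence[float]) -> AffineCoeffs:
    """Hypothetical helper: build a north-up transform from evenly spaced
    1-D cell-center coordinates (needs at least two values per axis)."""
    dx = (x[-1] - x[0]) / (len(x) - 1)
    dy = (y[-1] - y[0]) / (len(y) - 1)
    # Shift the origin from the first cell center to that cell's outer corner.
    return AffineCoeffs(dx, 0.0, x[0] - dx / 2, 0.0, dy, y[0] - dy / 2)


def to_gdal_geotransform(t: AffineCoeffs) -> Tuple[float, ...]:
    """Reorder the same coefficients into GDAL's geotransform layout:
    (x origin, pixel width, row rotation, y origin, column rotation, pixel height)."""
    return (t.c, t.a, t.b, t.f, t.d, t.e)


# Made-up coordinates for illustration only.
x = [100.0, 200.0, 300.0]
y = [900.0, 800.0, 700.0, 600.0]

transform = affine_from_coords(x, y)
geotransform = to_gdal_geotransform(transform)
print(transform)
print(geotransform)
```

Both tuples carry identical information; only the ordering differs, which is the kind of mismatch a tool chain has to reconcile when passing georeferencing between libraries.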
@TomAugspurger: so the source metadata only includes the CF projection (grid mapping), and rioxarray handles that? I don't remember this working on my side last year. Surprising.
Yes, at some point I want to dig into why that is. As long as they're all using the same "protocol" things aren't so bad, but still it'd be nicer to consolidate if possible.
Yes. I think it's some combination of https://github.com/corteva/rioxarray/blob/34c64140ba1f38e020e6d4b2cd0f3e4bb4c36f62/rioxarray/rioxarray.py#L262-L277 and https://github.com/corteva/rioxarray/blob/34c64140ba1f38e020e6d4b2cd0f3e4bb4c36f62/rioxarray/rioxarray.py#L307-L310 in rioxarray. It is important to use
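For readers unfamiliar with the CF grid mapping convention mentioned above, the following is a rough sketch of the lookup it implies, using plain dictionaries in place of a real dataset. It is not rioxarray's implementation. The `variables` dictionary, its attribute values, and the function `find_grid_mapping` are hypothetical and exist only for illustration. The CF attribute names `grid_mapping` and `grid_mapping_name` are real, but the sketch handles only the simple form in which `grid_mapping` holds a single variable name.

```python
from typing import Any, Dict, Optional

# Hypothetical per-variable attributes, shaped like CF-style metadata.
variables: Dict[str, Dict[str, Any]] = {
    "tmax": {"grid_mapping": "lambert_conformal_conic", "units": "degrees C"},
    "lambert_conformal_conic": {
        "grid_mapping_name": "lambert_conformal_conic",
        "standard_parallel": [25.0, 60.0],
        "longitude_of_central_meridian": -100.0,
        "latitude_of_projection_origin": 42.5,
    },
}


def find_grid_mapping(
    variables: Dict[str, Dict[str, Any]], data_var: str
) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: follow a data variable's `grid_mapping`
    attribute to the variable whose attributes describe the projection."""
    name = variables[data_var].get("grid_mapping")
    if name is None:
        return None  # this variable declares no projection
    return variables.get(name)


crs_attrs = find_grid_mapping(variables, "tmax")
print(crs_attrs["grid_mapping_name"])
```

A CRS-aware library still has to turn these attributes into its own CRS object; the lookup only shows where the projection parameters live relative to the data variable.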
As we discussed on today's call, it will be very useful to enumerate different tool chains used to read Zarr in a geospatial context in different languages / package ecosystems. This will help us understand where GeoZarr actually needs to be implemented.
I'll go first in a reply to this issue.