Using pangeo-forge-recipes with two concat dims #348

Closed
leifdenby opened this issue Apr 26, 2022 · 2 comments

@leifdenby

I work with a large-eddy simulation (LES) model that decomposes the 3D simulation domain into a horizontal 2D grid of columns, one per CPU. The output netCDF files follow the same layout, so I have one file for each CPU used during the run. I was hoping to use pangeo-forge-recipes to produce a single Zarr store for my simulation output, rather than keeping the individual netCDF files.
Unfortunately, XarrayZarrRecipe raises an exception saying that it doesn't currently support multiple concat dims.

Is this use case outside the intended scope of this package?

Below is what I've done so far:

```python
from pathlib import Path
from pangeo_forge_recipes.patterns import ConcatDim
from pangeo_forge_recipes.patterns import FilePattern
from pangeo_forge_recipes.recipes import XarrayZarrRecipe


SOURCE_BLOCK_FILENAME_FORMAT_3D = "{file_prefix}.{i:04d}{j:04d}.nc"

def make_full_path(i, j):
    data_root = Path(
        "/nfs/see-fs-02_users/earlcd/datastore/a289/LES_analysis_output/uclales/rico_gcss/raw_data"
    )
    return data_root / SOURCE_BLOCK_FILENAME_FORMAT_3D.format(
        i=i, j=j, file_prefix="rico_gcss"
    )

col_dim_x = ConcatDim("i", list(range(1, 4)))
col_dim_y = ConcatDim("j", list(range(1, 4)))

pattern = FilePattern(make_full_path, col_dim_x, col_dim_y)
recipe = XarrayZarrRecipe(pattern, inputs_per_chunk=10)
```
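
To make the file layout concrete, here is a minimal sketch of the 3×3 grid of files that the two concat dims above describe. It uses only the standard library, not pangeo-forge-recipes. `DATA_ROOT` is a hypothetical placeholder path, and `files` and `nested` are illustrative names that do not come from the package.

```python
from itertools import product
from pathlib import PurePosixPath

# Same filename convention as above: one file per (i, j) CPU column.
FILENAME_FORMAT = "{file_prefix}.{i:04d}{j:04d}.nc"

# Hypothetical placeholder; substitute the real data directory.
DATA_ROOT = PurePosixPath("/path/to/raw_data")

# Block indices along each of the two horizontal dimensions.
I_KEYS = list(range(1, 4))
J_KEYS = list(range(1, 4))


def make_full_path(i, j):
    """Return the path of the block file at grid position (i, j) as a string."""
    name = FILENAME_FORMAT.format(i=i, j=j, file_prefix="rico_gcss")
    return str(DATA_ROOT / name)


# Flat view: every (i, j) combination maps to exactly one file.
files = {(i, j): make_full_path(i, j) for i, j in product(I_KEYS, J_KEYS)}

# Nested view: outer list runs along i, inner lists run along j.
nested = [[files[(i, j)] for j in J_KEYS] for i in I_KEYS]

if __name__ == "__main__":
    for row in nested:
        print(row)
```

The nested form is the shape a two-level combine would need: concatenate along `j` within each inner list, then along `i` across the outer list. For example, `xarray.combine_nested` accepts this kind of nested list together with a list of `concat_dim` names.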
@rabernat

Thanks for reporting Leif!

This is a duplicate of #140. It is definitely high on our list of priorities for development! We hope to support it within the next few months.

@cisaacstern - would you mind making a user story for this?

@martindurant

Note that kerchunk's MultiZarrToZarr does support multiple dimensions, so it may be possible to plumb it into pangeo-forge without too much of a rewrite. Of course, it's still on us to get that done :)
