Add support for custom seasons spanning calendar years #423
Example result of dropping incomplete seasons:
# Before dropping
# -----------------
# 2000-1, 2000-2, and 2001-12 are months in incomplete "DJF" seasons, so they are dropped
ds.time
<xarray.DataArray 'time' (time: 15)>
array([cftime.DatetimeGregorian(2000, 1, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 2, 15, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 3, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 4, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 5, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 6, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 7, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 8, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 9, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 10, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 11, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 12, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2001, 1, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2001, 2, 15, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2001, 12, 16, 12, 0, 0, 0, has_year_zero=False)],
dtype=object)
Coordinates:
* time (time) object 2000-01-16 12:00:00 ... 2001-12-16 12:00:00
Attributes:
axis: T
long_name: time
standard_name: time
bounds: time_bnds
# After dropping
# -----------------
ds_new.time
<xarray.DataArray 'time' (time: 12)>
array([cftime.DatetimeGregorian(2000, 3, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 4, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 5, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 6, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 7, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 8, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 9, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 10, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 11, 16, 0, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2000, 12, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2001, 1, 16, 12, 0, 0, 0, has_year_zero=False),
cftime.DatetimeGregorian(2001, 2, 15, 0, 0, 0, 0, has_year_zero=False)],
dtype=object)
Coordinates:
* time (time) object 2000-03-16 12:00:00 ... 2001-02-15 00:00:00
Attributes:
axis: T
long_name: time
standard_name: time
bounds: time_bnds
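The before/after output above can be reproduced with a small pure-Python sketch. This is not xCDAT's implementation: `drop_incomplete_seasons` here is a hypothetical helper that works on `(year, month)` tuples instead of `cftime` coordinates, and it assumes a season is complete when every one of its months is present for the same season year.

```python
from collections import defaultdict


# Hypothetical helper for illustration only; not xCDAT's implementation.
def drop_incomplete_seasons(times, seasons):
    """Keep only (year, month) pairs that belong to complete seasons.

    times: list of (year, month_number) tuples.
    seasons: list of month-number lists in season order, e.g. [12, 1, 2].
    """
    month_to_season = {m: tuple(s) for s in seasons for m in s}

    def season_year(year, month, season):
        # Months listed before January in a spanning season are assigned
        # to the season that ends in the following year.
        if 1 in season and season.index(month) < season.index(1):
            return year + 1
        return year

    groups = defaultdict(list)
    for year, month in times:
        season = month_to_season.get(month)
        if season is None:
            continue  # month is not part of any season
        key = (season, season_year(year, month, season))
        groups[key].append((year, month))

    # A season is kept only if all of its months are present.
    kept = [
        t
        for (season, _), members in groups.items()
        if len(members) == len(season)
        for t in members
    ]
    return sorted(kept)


seasons = [[12, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
# Same months as the "before" example: all of 2000, then 2001-1, 2001-2, 2001-12.
times = [(2000, m) for m in range(1, 13)] + [(2001, 1), (2001, 2), (2001, 12)]
result = drop_incomplete_seasons(times, seasons)
```

With this input, `result` holds the 12 months from 2000-3 through 2001-2, matching the "after" output.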
Hey @lee1043, this PR seemed to be mostly done when I stopped working on it last year. I just had to fix a few things and update the tests. Would you like to check out this branch to test it out on real data? Also a code review would be appreciated.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##              main     #423    +/-  ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files           15       15
  Lines         1555     1609    +54
=========================================
+ Hits          1555     1609    +54
@tomvothecoder sure, I will test it out and review. Thank you for the update!
@tomvothecoder Can this be considered for v0.7.0?
My PR self-review
xcdat/temporal.py
Outdated
warnings.warn(
    "The `season_config` argument 'drop_incomplete_djf' is being "
    "deprecated. Please use 'drop_incomplete_seasons' instead.",
    DeprecationWarning,
)
TODO: Need to specify the version in which drop_incomplete_djf will be deprecated. Probably v0.8.0 or v0.9.0.
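One possible way to handle the deprecated key while both names are accepted is sketched below. `normalize_season_config` is a hypothetical helper, and mapping the old key's value directly onto `drop_incomplete_seasons` is an assumption made for illustration; the two options may not be equivalent in xCDAT, since the old one concerns only DJF.

```python
import warnings


# Hypothetical helper for illustration only. The key names come from the
# warning message quoted above; the mapping behaviour is an assumption.
def normalize_season_config(season_config):
    config = dict(season_config)  # do not mutate the caller's dict
    if "drop_incomplete_djf" in config:
        warnings.warn(
            "The `season_config` argument 'drop_incomplete_djf' is being "
            "deprecated. Please use 'drop_incomplete_seasons' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        old_value = config.pop("drop_incomplete_djf")
        # An explicitly passed new key takes precedence over the old one.
        config.setdefault("drop_incomplete_seasons", old_value)
    return config


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    normalized = normalize_season_config(
        {"dec_mode": "DJF", "drop_incomplete_djf": True}
    )
```

Copying the dict keeps the caller's configuration untouched, which makes the shim easy to remove once the deprecation period ends.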
if len(input_months) != len(predefined_months):
    raise ValueError(
        "Exactly 12 months were not passed in the list of custom seasons."
    )
Removed requirements for all 12 months to be included in a custom season
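With the 12-month requirement gone, some validation of the month names is presumably still useful. The sketch below is a hypothetical validator, not the code in `xcdat/temporal.py`; rejecting months that appear more than once is an assumption made for illustration.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


# Hypothetical validator for illustration only; not xCDAT's implementation.
def validate_custom_seasons(custom_seasons):
    """Check month names without requiring all 12 months to be used."""
    seen = set()
    for season in custom_seasons:
        for month in season:
            if month not in MONTHS:
                raise ValueError(f"{month!r} is not a supported month abbreviation.")
            if month in seen:
                # Assumption: a month may belong to at most one season.
                raise ValueError(f"{month!r} is listed more than once.")
            seen.add(month)
    # Return the months in calendar order so callers can subset the data.
    return sorted(seen, key=MONTHS.index)


used_months = validate_custom_seasons([["Dec", "Jan"]])
```

The returned list of used months could then drive subsetting of the time coordinates when fewer than 12 months are part of any season.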
This PR still needs thorough review before I'm confident in merging it. I'll probably tag Steve at some point. The release after v0.7.0 is more realistic and reasonable. We can always initiate a new release for this feature whenever it is merged.
@tomvothecoder no problem. Thank you for consideration.
@tomvothecoder it looks like when a custom season goes beyond the calendar year (Nov, Dec, Jan), there is an error as follows.
import os
import xcdat as xc
input_data = os.path.join(
    "/p/css03/esgf_publish/CMIP6/CMIP/AWI/AWI-CM-1-1-MR/historical/r1i1p1f1/Amon/psl/gn/v20181218/",
    "psl_Amon_AWI-CM-1-1-MR_historical_r1i1p1f1_gn_201301-201312.nc")
ds = xc.open_mfdataset(input_data)
# Custom season that spans the calendar year (Dec -> Jan):
custom_seasons = [
    ['Dec', 'Jan'],
]
season_config = {'custom_seasons': custom_seasons, 'dec_mode': 'DJF', 'drop_incomplete_djf': True}
ds.temporal.group_average("psl", "season", season_config=season_config)
@lee1043 Thanks for trying this out and providing an example script! I'll debug the stack trace.
- Remove logic for requiring all 12 months to be used
- Add conditional that determines whether subsetting time coordinates is necessary with custom seasons
- Update docstrings for `season_config`
- Add tests
- Months are also shifted in the `_preprocess_dataset()` method now. Previously, months were shifted twice: once when dropping incomplete seasons or DJF, and a second time when labeling time coordinates.
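The year shift for seasons that span calendar years, as described in the PR description (months listed before `"Jan"` belong to the previous year and are moved one year forward), can be sketched as follows. `shift_spanning_months` is a hypothetical stand-in that operates on `(year, month)` tuples; the actual `_shift_spanning_months()` method works on the dataset's time coordinates and may differ.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


# Hypothetical stand-in for illustration only; not xCDAT's implementation.
def shift_spanning_months(times, custom_season):
    """Shift months listed before "Jan" one year forward.

    times: list of (year, month_number) tuples.
    custom_season: ordered list of month abbreviations.
    """
    month_nums = [MONTHS.index(m) + 1 for m in custom_season]
    if 1 not in month_nums:
        return list(times)  # season does not span a calendar year
    # Months listed before January come from the previous year.
    months_to_shift = set(month_nums[: month_nums.index(1)])
    return [
        (year + 1, month) if month in months_to_shift else (year, month)
        for year, month in times
    ]


custom_season = ["Nov", "Dec", "Jan", "Feb", "Mar"]
times = [(2000, 11), (2000, 12), (2001, 1), (2001, 2), (2001, 3)]
shifted = shift_spanning_months(times, custom_season)
```

After the shift, all five months carry the year 2001, so a plain group-by on year and season puts them in the same group. Doing this once during preprocessing avoids the double shift mentioned in the commit notes.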
In commit I will do a final walk through at the next xCDAT meeting (11/20) before merging.
I conducted extra testing and confirmed that the current PR is working without any noticeable issues.
import xcdat
import matplotlib.pyplot as plt
filepath = "http://esgf.nci.org.au/thredds/dodsC/master/CMIP6/CMIP/CSIRO/ACCESS-ESM1-5/historical/r10i1p1f1/Amon/tas/gn/v20200605/tas_Amon_ACCESS-ESM1-5_historical_r10i1p1f1_gn_185001-201412.nc"
ds = xcdat.open_dataset(filepath)
# Climatology for default seasons
season_climo = ds.temporal.climatology(
"tas",
freq="season",
weighted=True,
season_config={"dec_mode": "DJF", "drop_incomplete_djf": True},
)
# Climatology for custom seasons
custom_seasons = [
["Jan", "Feb", "Mar"], # "JanFebMar"
["Apr", "May", "Jun"], # "AprMayJun"
["Jul", "Aug", "Sep"], # "JulAugSep"
["Oct", "Nov", "Dec"], # "OctNovDec"
]
c_season_climo = ds.temporal.climatology(
"tas",
freq="season",
weighted=True,
season_config={"custom_seasons": custom_seasons},
)
fig, ax = plt.subplots(2, 2, figsize=(12, 5))
c_season_climo.isel(time=0)["tas"].plot(ax=ax[0, 0]) # First row, first column
c_season_climo.isel(time=1)["tas"].plot(ax=ax[0, 1]) # First row, second column
c_season_climo.isel(time=2)["tas"].plot(ax=ax[1, 0]) # Second row, first column
c_season_climo.isel(time=3)["tas"].plot(ax=ax[1, 1]) # Second row, second column
plt.tight_layout()
plt.show()
# Climatology for a custom season that spans the calendar year
custom_seasons_2 = [
["Oct", "Nov", "Dec", "Jan", "Feb"], # "OctNovDecJanFeb"
]
c_season_climo_2 = ds.temporal.climatology(
"tas",
freq="season",
weighted=True,
season_config={"custom_seasons": custom_seasons_2},
)
c_season_climo_2["tas"].plot()
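The label comments in the script above suggest that each custom season is named by concatenating its month abbreviations in the order given. A minimal sketch of that naming, assuming this is the convention (`season_labels` is a hypothetical helper, not an xCDAT function):

```python
# Hypothetical helper for illustration only; the label format is inferred
# from the comments in the test script, not taken from xCDAT's source.
def season_labels(custom_seasons):
    """Build one label per season by joining its month abbreviations."""
    return ["".join(months) for months in custom_seasons]


labels = season_labels([
    ["Jan", "Feb", "Mar"],
    ["Apr", "May", "Jun"],
    ["Jul", "Aug", "Sep"],
    ["Oct", "Nov", "Dec"],
])
spanning_label = season_labels([["Oct", "Nov", "Dec", "Jan", "Feb"]])[0]
```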
- Methods include `_subset_coords_for_custom_seasons()` and `_shift_custom_season_years()`
Thanks for testing, @lee1043. I will now merge this PR!
Hey @arfriedman, @DamienIrving, and @oliviermarti, I know this PR has been a long time coming (over a year and a half). If you're still interested, you can try out this custom seasons feature by checking out the latest code:
git clone https://github.com/xCDAT/xcdat.git
cd xcdat
conda activate <YOUR-ENV>
make install # or python -m pip install .
Thank you @tomvothecoder! I'm very excited about this feature.
Tom, thank you for this work :-) I still have a concern: I have a monthly variable with time values centered at the middle of the month, and correct bounds. When I compute
The result is on a time axis with values at the beginning of the season, not the middle. That is not very nice for plots, for instance when I plot monthly and seasonal means on the same plot. Also, no bounds are produced. That means, for example, that if I start from daily values and then compute monthly means, I cannot compute seasonal means. Olivier
Thanks @oliviermarti! And thank you @pochedls for pointing to #565 and a partial solution. #565 should be relatively easy to address. I will add this as a higher priority item to tackle in the next few months.
Description
TODO:
- `_shift_spanning_months()`
  - `custom_season = ["Nov", "Dec", "Jan", "Feb", "Mar"]`:
    - `["Nov", "Dec"]` are from the previous year since they are listed before `"Jan"`
    - `["Jan", "Feb", "Mar"]` are from the current year
    - `["Nov", "Dec"]` need to be shifted a year forward for correct grouping.
- `_drop_incomplete_seasons()`
  - `_drop_incomplete_djf()`
  - Replace `drop_incomplete_djf` with `drop_incomplete_season`
- `cftime` time coordinates. Does it make sense to also keep the custom seasons with the time coordinates, similar to what Xarray does?
Checklist
If applicable:
Additional Context
Google Slides explaining logic
Refactoring this PR to use "Add SeasonGrouper, SeasonResampler" (pydata/xarray#9524) will most likely require addressing "[Refactor]: Consider using `flox` and `xr.resample()` to improve temporal averaging grouping logic" (#217), which involves extensive refactoring in how the time coordinates are pre-processed based on the averaging mode and frequency. As of `xarray >=2024.09.0`, Xarray supports grouping by multiple variables too.
https://docs.xarray.dev/en/stable/user-guide/groupby.html#grouping-by-multiple-variables