truncated training data? #122

janetrbarclay · 2021-07-16T13:12:59Z

Unless I'm missing something, the line referenced below truncates the observations to the time frame of the PRMS data, which means we're only training on WY 1986 - 2016 (ignoring WY 2017 - 2020, even in the fine-tuning). I guess this is unavoidable since we need the PRMS data for the inputs even in the fine-tuning? Maybe we just need a comment in the config.yml that the train / test / val dates are truncated to those in the sntemp file?

river-dl/river_dl/preproc_utils.py

Line 125 in 0c78af2

obs = xr.merge(obs, join="left")
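For anyone following along, here is a minimal sketch of why `join="left"` truncates: `xr.merge` keeps only the index values of the first object in the list, so any observation dates past the end of that first dataset are dropped. The datasets, variable names (`seg_tave_air`, `temp_c`), and dates below are made up for illustration and are not the actual river-dl inputs; the sketch also assumes the PRMS/SNTemp data is the first item in the list.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the PRMS/SNTemp data: 4 days ending 2016-09-30.
sntemp_dates = pd.date_range("2016-09-27", periods=4, freq="D")
sntemp = xr.Dataset(
    {"seg_tave_air": ("date", np.array([15.0, 14.5, 14.0, 13.5]))},
    coords={"date": sntemp_dates},
)

# Hypothetical observations that run 3 days past the end of the SNTemp data.
obs_dates = pd.date_range("2016-09-27", periods=7, freq="D")
temp_obs = xr.Dataset(
    {"temp_c": ("date", np.arange(7, dtype=float))},
    coords={"date": obs_dates},
)

# join="left" keeps only the dates of the FIRST dataset in the list,
# so the 3 later observation days are dropped.
merged = xr.merge([sntemp, temp_obs], join="left")

# join="outer" keeps the union of dates; the SNTemp variable is NaN
# on the days it does not cover.
outer = xr.merge([sntemp, temp_obs], join="outer")

print(merged.sizes["date"], outer.sizes["date"])
```

With an outer join the later observations survive, but the model inputs are missing for those days, which is the same limitation described above.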

jsadler2 · 2021-07-27T19:19:29Z

this is a really interesting idea, @janetrbarclay. really, we could have two separate training periods, huh? i'd never thought of that. we could have
pretrain_start_date
pretrain_end_date
finetune_start_date
finetune_end_date
instead of just
train_start_date
train_end_date

I think this would be a good PR. If only pretrain_{start,end}_date is defined, those would be used for the finetuning and vice versa.
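A rough sketch of how that fallback could work, assuming the config has already been loaded into a dict. The key names come from the list above; the function name `resolve_training_periods` is hypothetical, and falling back to the existing `train_{start,end}_date` keys when neither new pair is set is an extra assumption, not something decided in this thread.

```python
from typing import Mapping, Optional, Tuple

DateRange = Tuple[str, str]


def resolve_training_periods(
    config: Mapping[str, Optional[str]]
) -> Tuple[DateRange, DateRange]:
    """Return (pretrain_period, finetune_period) from a config mapping.

    Hypothetical helper: if only one of the two periods is defined,
    it is used for both phases.
    """

    def period(prefix: str) -> Optional[DateRange]:
        start = config.get(f"{prefix}_start_date")
        end = config.get(f"{prefix}_end_date")
        if start is None and end is None:
            return None
        if start is None or end is None:
            raise ValueError(f"{prefix}_start_date and {prefix}_end_date must be set together")
        return (start, end)

    pretrain = period("pretrain")
    finetune = period("finetune")
    legacy = period("train")  # assumption: keep supporting the existing keys

    # Each phase falls back to the other phase, then to the legacy keys.
    resolved_pretrain = pretrain or finetune or legacy
    resolved_finetune = finetune or pretrain or legacy
    if resolved_pretrain is None or resolved_finetune is None:
        raise ValueError("no training period defined in config")
    return resolved_pretrain, resolved_finetune


# Only the pretraining period is defined, so finetuning reuses it.
only_pretrain = resolve_training_periods(
    {"pretrain_start_date": "1985-10-01", "pretrain_end_date": "2016-09-30"}
)

# Both periods are defined, so each phase gets its own dates.
both = resolve_training_periods(
    {
        "pretrain_start_date": "1985-10-01",
        "pretrain_end_date": "2016-09-30",
        "finetune_start_date": "1985-10-01",
        "finetune_end_date": "2020-09-30",
    }
)
```

Dates are kept as plain strings here; converting them to whatever date type the preprocessing expects would happen downstream.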

janetrbarclay · 2021-07-27T19:36:55Z

I agree that could be interesting. Is it something that someone would use right now? If so, I'd be happy to do a PR for it. I modified the gw_utils.py script to use observed temps from the training period outside the PRMS time frame to calculate the annual properties (assuming those properties are representative of the full training period), but without the PRMS outputs I can't train on that time period.

aappling-usgs · 2021-07-31T13:44:37Z

I don't know of immediate needs for this for streams, but I wouldn't be surprised if they come up. Hayley is working on lake projections where she's pretraining on contemporary and future periods of GCM simulations and GLM predictions, then finetuning on contemporary periods only. We might have a similar need when we get to making stream projections, too.

jzwart · 2021-08-06T13:47:16Z

For the temperature forecasting project, we had to use two different training periods for pre-training and fine-tuning, since the pre-training dataset didn't extend as far as the fine-tuning dataset (see here). So I think this flexibility would be useful.
