This repository has been archived by the owner on Jun 2, 2023. It is now read-only.

truncated training data? #122

Open
janetrbarclay opened this issue Jul 16, 2021 · 4 comments

Comments

@janetrbarclay
Collaborator

Unless I'm missing something, this line truncates the observations to the time frame of the PRMS data, which means we're only training on WY 1986 - 2016 (ignoring WY 2017 - 2020, even in the fine-tuning). I guess this is unavoidable, since we need the PRMS data for the inputs even in the fine-tuning? Maybe we just need a comment in the config.yml saying that the train / test / val dates are truncated to those in the sntemp file?

obs = xr.merge(obs, join="left")
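To illustrate the truncation described above, here is a minimal sketch of how `xr.merge(..., join="left")` keeps only the index of the first dataset in the list. The datasets, variable names (`seg_tave_air`, `temp_c`), and dates are made up for the example, and it assumes the SNTemp/PRMS-derived dataset comes first in the list, which the quoted line alone does not confirm.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the SNTemp/PRMS data: 5 daily steps ending 2016-09-30.
sntemp = xr.Dataset(
    {"seg_tave_air": ("date", np.arange(5.0))},
    coords={"date": pd.date_range("2016-09-26", periods=5, freq="D")},
)

# Hypothetical observations: 6 daily steps that start later and run past 2016-09-30.
temp_obs = xr.Dataset(
    {"temp_c": ("date", np.arange(6.0))},
    coords={"date": pd.date_range("2016-09-28", periods=6, freq="D")},
)

# join="left" aligns everything to the index of the FIRST object in the list,
# so observation dates after the last SNTemp date are dropped.
merged = xr.merge([sntemp, temp_obs], join="left")

print(merged["date"].values[-1])             # last date comes from sntemp
print(int(merged["temp_c"].isnull().sum()))  # SNTemp dates with no observation become NaN
```

With `join="outer"` the later observation dates would be kept instead, but the input variables would be NaN there, which matches the point that those dates cannot be trained on without the PRMS outputs.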

@jsadler2
Collaborator

this is a really interesting idea, @janetrbarclay. really, we could have two separate training periods, huh? i'd never thought of that. we could have
pretrain_start_date
pretrain_end_date
finetune_start_date
finetune_end_date
instead of just
train_start_date
train_end_date

I think this would be a good PR. If only pretrain_{start,end}_date is defined, those would be used for the finetuning and vice versa.
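The fallback rule proposed above could look roughly like the sketch below. The function name `resolve_training_periods` is hypothetical and not part of the repository; it assumes the config has already been loaded into a plain dict using the key names suggested in this comment, and the dates in the usage lines are illustrative only.

```python
def resolve_training_periods(config):
    """Return pretrain/finetune dates, filling a missing pair from the other one.

    Hypothetical helper: `config` is assumed to be a dict loaded from config.yml.
    """
    periods = {}
    for bound in ("start_date", "end_date"):
        pretrain = config.get(f"pretrain_{bound}")
        finetune = config.get(f"finetune_{bound}")
        if pretrain is None and finetune is None:
            raise ValueError(
                f"Define pretrain_{bound} or finetune_{bound} in the config"
            )
        # If only one phase is defined, use its date for both phases.
        periods[f"pretrain_{bound}"] = pretrain if pretrain is not None else finetune
        periods[f"finetune_{bound}"] = finetune if finetune is not None else pretrain
    return periods


# Only the pretraining period is given, so finetuning reuses it.
only_pretrain = resolve_training_periods(
    {"pretrain_start_date": "1985-10-01", "pretrain_end_date": "2016-09-30"}
)
print(only_pretrain["finetune_end_date"])  # 2016-09-30

# Both periods given: each phase keeps its own dates.
both = resolve_training_periods(
    {
        "pretrain_start_date": "1985-10-01",
        "pretrain_end_date": "2016-09-30",
        "finetune_start_date": "1985-10-01",
        "finetune_end_date": "2020-09-30",
    }
)
print(both["finetune_end_date"])  # 2020-09-30
```

Resolving each bound independently keeps the rule simple, but it also means a config could mix a pretrain start date with a finetune end date; a real PR might want to validate that each phase is defined as a complete pair.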

@janetrbarclay
Collaborator Author

I agree that could be interesting. Is it something that someone would use right now? If so, I'd be happy to do a PR for it. I modified the gw_utils.py script to use observed temps from the training period outside the PRMS time frame to calculate the annual properties (assuming those properties are representative of the full training period), but without the PRMS outputs I can't train on that time period.

@aappling-usgs
Member

I don't know of immediate needs for this for streams, but I wouldn't be surprised if they come up. Hayley is working on lake projections where she's pretraining on contemporary and future periods of GCM simulations and GLM predictions, then finetuning on contemporary periods only. We might have a similar need when we get to making stream projections, too.

@jzwart
Member

jzwart commented Aug 6, 2021

For the temperature forecasting project, we had to use two different training periods for pre-training and fine-tuning, since the pre-training dataset didn't extend as far as the fine-tuning dataset (see here). So I think this flexibility would be useful.
