Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Da abstract classes #579

Merged
merged 9 commits into from
Sep 21, 2022
Merged

Conversation

shorvath-noaa
Copy link
Contributor

Updates to DataAssimlilation.py to create a data_assimilation object containing all the necessary DA information. This implementation defines a single DA class, 'AllDA', that creates all of the DA dataframes and dictionaries (e.g., lastobs_df, usgs_df, reservoir_usgs_df, etc.) that are needed based on user input. All of the logic regarding which DA options to use are performed within the AllDA class definition.

I've tested that calling AllDA(...) creates identical dataframes as those in the current implementation of t-route. I used the standard_AnA.yaml configuration file for these tests.

Additions

  • import additional necessary modules
  • _reindex_link_to_lake_id function to replace link ID index values with lake ID
  • _create_usgs_df function to create dataframe of USGS gage observations from USGS timeslice files
  • _create_reservoir_df function which is generalized to create either usgs or usace reservoir dataframes based on input
  • _set_reservoir_da_params function to update reservoir parameters from run_results
  • new_lastobs function to create new lastobs_df for next simulation chunk
  • AllDA class which is a subclass of DataAssimilation. Determines which DA features to turn on based on user input and creates all necessary dataframes
  • update function within AllDA that updates dataframes for next simulation chunk

Removals

Changes

Testing

  1. Once NHDNetwork object is updated, test to make sure these two are compatible. AllDA takes 'network' as an input, so we need to make sure network has the necessary features to run AllDA.

Screenshots

Notes

Todos

  • We might consider creating separate DA subclasses for each type of DA (streamflow nudging, reservoir persistence, etc.). Right now it is all done in a single DA subclass, 'AllDA'

Checklist

  • PR has an informative and human-readable title
  • Changes are limited to a single goal (no scope creep)
  • Code can be automatically merged (no conflicts)
  • Code follows project standards (link if applicable)
  • Passes all existing automated tests
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • Visually tested in supported browsers and devices (see checklist below 👇)
  • Project documentation has been updated (including the "Unreleased" section of the CHANGELOG)
  • Reviewers requested with the Reviewers tool ➡️

Testing checklist

Target Environment support

  • Windows
  • Linux
  • Browser

Accessibility

  • Keyboard friendly
  • Screen reader friendly

Other

  • Is useable without CSS
  • Is useable without JS
  • Flexible from small to large screens
  • No linting errors or warnings
  • JavaScript tests are passing

@shorvath-noaa
Copy link
Contributor Author

@kumdonoaa @hellkite500 @mattw-nws : The only changes are to the DataAssimilation.py file, so t-route will run as normal with these changes. This new object will be implemented once we refactor main.py which has a draft PR out (#578)

Copy link
Member

@hellkite500 hellkite500 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely going in the right direction, but we may need to discuss the decoupling of network and DA functionality a bit...

__slots__ = ["_usgs_df", "_last_obs_df", "_reservoir_usgs_df", "_reservoir_usgs_param_df", "_reservoir_usace_df", "_reservoir_usace_param_df", "_da_parameter_dict", "_waterbody_types_df", "_usgs_lake_gage_crosswalk", "_usace_lake_gage_crosswalk"]
def __init__(self, data_assimilation_parameters, run_parameters, waterbody_parameters, network, da_run):
lastobs_df, da_parameter_dict = nnu.build_data_assimilation_lastobs(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't this build_data_assimilation_lastobs functions moved to this module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. I've now included this function within this module.

#--------------------------------------------------------------------------------

# if user requested nudging or usgs_persistence and a specified a USGS TimeSlice directory,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly confusing comment "and a specified a USGS TimeSlice directory"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of your codes are from existing functions in preprocess.py. The original intention for creating preprocess.py was to put everything related to data or parameters preparation on the same place. Thus, it might be efficient either to directly call existing functions in preprocess.py or to create modified functions there for customized function calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropped the 'a' before 'specified'. I think that makes it clearer?

usgs_lake_gage_crosswalk,
usace_lake_gage_crosswalk
) = nhd_io.read_reservoir_parameter_file(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...here is a strong connection of DA back to the nhd network, which isn't what we want with these classes. We may need to take a close look at what the purpose of this is and where the functionality is best suited to exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function just relates gage ids to reservoir ids, so I believe this is still used for ngen, no? I don't think this has anything to do with the type of network.... But I will move this function into this module so we are not relying on nhd_io.py

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant here is that the way this function works, it relies on some state/info in a reservoir param file that is specific to a particular network representation (in this case, the nhd/nwm reservoir params). What I was suggesting is refactoring this so that DA can extract that information from an instantiated Network (whether it is an NHDNetwork, HYFeaturesNetwork, CustomNetwork, ect...) instead of reading hard coded params from a file that require all networks to use the same file format/layout ect...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. Ok, so the outputs from this function (waterbody_types_df, usgs_lake_gage_crosswalk, and usace_lake_gage_crosswalk) should actually be created when the network is created. These variables would then just be passed to the data_assimilation object as inputs within the network object, yeah?

@shorvath-noaa
Copy link
Contributor Author

Changes have been made to this module so that it should be network-agnostic now.

#--------------------------------------------------------------------------------

# if user requested nudging or usgs_persistence and a specified a USGS TimeSlice directory,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think most of your codes are from existing functions in preprocess.py. The original intention for creating preprocess.py was to put everything related to data or parameters preparation on the same place. Thus, it might be efficient either to directly call existing functions in preprocess.py or to create modified functions there for customized function calls.

@property
def usace_lake_gage_crosswalk(self):
return self._usace_lake_gage_crosswalk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned before, we could make new/modified functions in preprocess.py and then call the functions from here for concisely reading code? Also, to make this file agnostic to either NHD network or HY feature network, we may create a separate preprocess.py only for HY feature network?

@shorvath-noaa shorvath-noaa merged commit bb9a092 into NOAA-OWP:master Sep 21, 2022
nmizukami pushed a commit to nmizukami/t-route that referenced this pull request Sep 30, 2022
* created AllDA class to perfrom all DA actions

* added additional packages to be imported

* fixed indentation errors

* cleaning up the code a bit

* added update function and some small bug fixes

* fix update function, added documentation for functions and AllDA class

* fixed typos and moved build_lastobs_df function into DataAssimilation.py. Also simplified this function

* moved read_reservoir_parameter_file function into DataAssimilation module

* changed usgs/usace lake gage variables and waterbody_type_df to be called from within network object (now agnostic to type of network).

Co-authored-by: Sean Horvath <shorvath@cheyenne4.cheyenne.ucar.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants