-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#157] Example notebook for NLDI data access #158
[#157] Example notebook for NLDI data access #158
Conversation
@thodson-usgs I am waiting for Jeff to do some final edits to the new notebook for nldi data access. After that I it should be ready for review. |
@thodson-usgs It is now ready for review. Thanks. |
@pkdash I will take a look in the next week. Thanks for contributing! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good! I definitely learned a lot. As someone who has not used this module before, my questions represent how a beginner might struggle with the nldi functions. I also feel that the nldi module could benefit from more detailed descriptions of what the arguments do/mean. For example, what is stop_comid
in get_flowlines()
, and what does trim_start()
trim? I am also curious if you can use basin polygons as features from which to obtain upstream/downstream basins? If so, that might be a nice example to add.
"source": [ | ||
"### Install the Package\n", | ||
"\n", | ||
"Use the following code to install the package if it doesn't exist already within your Jupyter Python environment. Note the `nldi` option in the `dataretrieval` package installation. The default `dataretrieval` package installation does not support NLDI data access. You must run the dataretrieval install with the `nldi` option to ensure that you have the necessary capabilities. If you are running this notebook in the CUAHSI JuypyterHub server linked to HydroShare, you will want to run the following pip install command. The base `dataretrieval` package is already installed in the CUAHSI JupyterHub Python environment, but it does not include the NLDI option." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I follow this. Are you saying the official release of dataretrieval-python does not have the nldi module? I think if you pip install the latest version, the default does come with NLDI. Please let me know if I misunderstand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NLDI was added after the last official release.
"\n", | ||
"The following arguments are supported:\n", | ||
"\n", | ||
"* **feature_source** (string): The name of the NLDI feature source.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One difficulty I have here is that I don't know all the feature_source
options. I see in this example wqp
and the NLDI blog post mentions nwissite
, but do you know of a place where one can reference all possible inputs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is something USGS should answer. @dblodgett-usgs might recommend a NLDI API reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A full list is available here: https://api.water.usgs.gov/nldi/linked-data/?f=json
https://waterdata.usgs.gov/blog/nldi-intro/ has more info
"The following arguments are supported:\n", | ||
"\n", | ||
"* **feature_source** (string): The name of the NLDI feature source.\n", | ||
"* **feature_id** (string): The identifier of the NLDI feature.\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to previous comment, it's not clear the format of the feature_id
. I see examples with monitoring locations, like USGS-#######
, but what about other features?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to know the id you are looking for or discover it through navigation requests.
"metadata": {}, | ||
"source": [ | ||
"#### Example 2: Get flowline data using a NHDPlus comid\n", | ||
"In some cases, you may wish to get flowline data associated with a NHDPlus COMID rather than flowlines associated with a monitoring station. The following shows how to get flowline data for a COMID as a geopandas dataframe." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a link to what a COMID is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ehinman that's our job to define. I'm also hesitant to use links that we know will break. Perhaps a conceptual description would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the link I usually use. https://www.epa.gov/waterdata/nhdplus-national-hydrography-dataset-plus
"* \"WQP\" - Water Quality Portal\n", | ||
"* \"comid\" - NHDPlus comid\n", | ||
"\n", | ||
"There are a few others - for a complete list, consult the getDataSources endpoint of the NLDI API. For this example, we will trace upstream along the mainstem and find any NWIS surface water sites located along the mainstem of the river." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"There are a few others - for a complete list, consult the getDataSources endpoint of the NLDI API. For this example, we will trace upstream along the mainstem and find any NWIS surface water sites located along the mainstem of the river." | |
"There are a few others - for a complete list, consult the getDataSources endpoint of the NLDI API. For this example, we will trace from an NWIS surface water site upstream along the mainstem and find any NWIS surface water sites located along the mainstem of the river." |
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"gdf = nldi.get_features(lat=43.073051, long=-89.401230)\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the buffer distance for a call like this? Can you use navigation mode with a lat/long input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece is actually throwing an error for me: ValueError: Error getting features for lat '43.073051' and long '-89.40123'. Error reason: Internal Server Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disregard bug comment, above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ehinman I was able to run it without error:
"source": [ | ||
"***\n", | ||
"\n", | ||
"### Examples for the `search()` function:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a little bit confused by the search function. I don't see how it differs from get_basin/flowlines/features(as_json=True)
? I could see using search()
for situations when someone isn't sure what kind of geometry they want, they just want to know the geometries available around a certain point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think search mainly allows you to search for multiple feature types in a single query, which can be much more efficient. My recollection was that I wanted to tweak the implementation somewhat, but that's not on @pkdash. He did great for having little-to-no requirements to work from.
"source": [ | ||
"#### Example 1: Get all indexed features along a specified navigation path\n", | ||
"Get all of the indexed features from a particular data source using a navigation path that traces upstream along the mainstam (UM) and uses as an origin for the trace a feature from a given feature source (e.g., a monitoring station from the Water Quality Portal). You can get any indexed features from any of the included data sources. Example data sources and the codes you need to retrieve features from those sources include:\n", | ||
"* \"census2020-nhdpv2\" - 2020 Census Block - NHDPlusV2 Catchment Intersections\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this list is really helpful and should be at the top. Have you considered providing this list as a function like dataRetrieval::get_nldi_sources()
?
I would just add that @ehinman makes some good points, but some seem out of scope of the PR and will be for us to follow up later.
After reviewing our own swagger doc, I'm also confused. As these functions aren't implemented in the PR, this is a little out of scope and perhaps not something @pkdash can answer. |
BTW, we are migrating the NLDI to the URL in that link. If the "labs" url is used in this package, can you change "https://labs.waterdata.usgs.gov/api/nldi/" to "https://api.water.usgs.gov/nldi/" throughout? |
Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com>
Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com>
Co-authored-by: Elise Hinman <121896266+ehinman@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, for your review @ehinman. Indeed, there are a few loose ends for us to follow up on, but otherwise, this contribution is a clear improvement to our documentation, and I support merging it.
No description provided.