Initial research #5

nikola-rados · 2020-07-10T21:02:41Z

We need to figure out how to break down the RVIC tool into usable wps processes. Explore the code and try to come to up with a few pathways for us to follow. Once we have sorted some of our ideas out we can create new issues and begin work.

Please post any notes/thoughts/discussion in this issue thread.

The text was updated successfully, but these errors were encountered:

corviday · 2020-07-10T21:40:18Z

In order to generate routed streamflow, RVIC has to be run twice.

The first run is the "parameters" run - it looks at the watershed uphill from the point you care about and calculates how long it takes rain that falls on each grid square to reach the one you care about.

The second run is the "convolution" run - it starts with the equations created in the parameters run, and applies them to rainfall data to calculate the amount of water passing through the grid cell you're asking about at each time point.

You could probably make those two runs separate processes. The parameter process usually takes about ten seconds. The convolution process usually takes 10-20 minutes (it can be stopped and restarted, but I've never done that and don't know a huge amount about it). So splitting them up doesn't divide the work evenly.

RVIC takes a large number of arguments in the form of a configuration file. (There's a separate one for each of the two processes). You could also have a process to generate the configuration files, or a separate process to generate each one.

RVIC theoretically can be called from a python script and passed a dictionary of values, instead of gettng all its arguments from config files, but this isn't well documented and we've never done it here at PCIC. Might be the most convenient approach for this project, though.

eyvorchuk · 2020-07-10T21:56:58Z

What does the convert run do?

nikola-rados · 2020-07-10T21:57:46Z

Thanks for that rundown @corviday!

You could probably make those two runs separate processes.

Is there ever a time where you would only want to run just one of these processes? Or is it always one then the other to produce the ouptut you want? If the latter is the case I think it would make more sense to have it a single process (at least to start).

You could also have a process to generate the configuration files, or a separate process to generate each one.

This seems like a pretty good idea. Is there code in RVIC that already does this? Or would it be from scratch?

RVIC theoretically can be called from a python script and passed a dictionary of values

You are right in saying that this is probably the best approach for the project. I think taking in a collection of kwargs then either building a config or dict to pass into RVIC would be a good approach. Although, you also said it was a large number of args so perhaps that would be messy.

corviday · 2020-07-10T22:07:25Z

What does the convert run do?

I think it converts data from some other format into a format that can be used by RVIC. I don't know much about it, I've never needed to use it.

Is there code in RVIC that already does this [generate config files]? Or would it be from scratch?

It would be from scratch. The config files are just a list of argument: value pairs for all the arguments to the RVIC program. Most of the arguments are the same for all PCIC runs.

Is there ever a time where you would only want to run just one of these processes? Or is it always one then the other to produce the output you want? If the latter is the case I think it would make more sense to have it a single process (at least to start).

Every convolution run to output routed streamflow data needs a parameter run.

I think if you want to generate routed streamflow for the same point, but different climate model outputs - to compare CanESM2 and ACCESS1-0 or something - they could use the same parameter run. The elevations and soil types and all those things that go into the parameter run don't change over time or between different climate models. (On the other hand, it is pretty quick to run parameter, I don't know that there's really a good case for saving its results.)

sum1lim · 2020-07-14T16:51:42Z

RVIC takes a large number of arguments in the form of a configuration file. (There's a separate one for each of the two processes). You could also have a process to generate the configuration files, or a separate process to generate each one.

It looks like we do not necessarily have to write a process to write a .cfg file(looks kind of redundant to me). Instead, both convolution_init and gen_uh_init can take dictionary input.

So, we can write a process that allows users to input both:

an already written config file(this case we only have the call the function)
string inputs that need to be generated into a dictionary.

sum1lim · 2020-07-14T16:53:01Z

I believe this issue has not been resolved in the most recent version of RVIC. This might be problematic in our future processes. Correct me if I am wrong.

corviday · 2020-07-14T17:01:56Z

You are correct; that issue has not been resolved, as far as I know. A fix I used is here.

sum1lim · 2020-07-14T17:20:12Z

Also, I got this error:

slim@devel2:~/Desktop/hydro_data$ rvic parameters sample_parameter_config.cfg
-->> Could not load xarray!! <<--
INFO:init_logger>> -------------------- INITIALIZED RVIC LOG ------------------
INFO:init_logger>> LOG LEVEL: DEBUG
INFO:init_logger>> Logging To Console: True
INFO:init_logger>> LOG FILE: ./routing/sample/logs/RVIC-20200714-171433.log
INFO:init_logger>> ----------------------------------------------------------

INFO:gen_uh_init>> plots directory is ./routing/sample/plots
INFO:gen_uh_init>> logs directory is ./routing/sample/logs
INFO:gen_uh_init>> params directory is ./routing/sample/params
INFO:gen_uh_init>> inputs directory is ./routing/sample/inputs
INFO:gen_uh_init>> aggregated directory is ./routing/sample/temp/aggregated
INFO:gen_uh_init>> remapped directory is ./routing/sample/temp/remapped
INFO:gen_uh_init>> Opened Pour Points File: ./routing/sample/inputs/sample_pour.txt
ERROR:gen_uh_init>> Error opening pour points file: ./routing/sample/inputs/sample_pour.txt
ERROR:gen_uh_init>> 'DataFrame' object has no attribute 'ix'
Traceback (most recent call last):
  File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 188, in gen_uh_init
    pour_points.ix[i, 'names'] = strip_invalid_char(name)
  File "/home/slim/.local/lib/python3.7/site-packages/pandas/core/generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'ix'
ERROR:write>> Traceback (most recent call last):
ERROR:write>>   File "/home/slim/.local/bin/rvic", line 61, in <module>
ERROR:write>> main()
ERROR:write>>   File "/home/slim/.local/bin/rvic", line 43, in main
ERROR:write>> parameters(args.config_file, numofproc=args.numofproc)
ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 51, in parameters
ERROR:write>> outlets, config_dict, directories = gen_uh_init(config_file)
ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 188, in gen_uh_init
ERROR:write>> pour_points.ix[i, 'names'] = strip_invalid_char(name)
ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/pandas/core/generic.py", line 5274, in __getattr__
ERROR:write>> return object.__getattribute__(self, name)
ERROR:write>> AttributeError
ERROR:write>> :
ERROR:write>> 'DataFrame' object has no attribute 'ix'

I had to change this line from .ix[i, 'names'] to .loc[i, 'names]:

ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 188, in gen_uh_init
ERROR:write>> pour_points.ix[i, 'names'] = strip_invalid_char(name)

eyvorchuk · 2020-07-14T17:35:39Z

@sum1lim Where'd you get that config file? It's not in samples/configs.

nikola-rados · 2020-07-14T17:36:43Z

After doing some reading, I think the rvic tool breaks down fairly nicely into 3 different components:

parameters: "Development of impulse response functions using input datasets such as a flow direction grid, outlet locations, etc."
convolution: "This utility generates streamflows by convolving the impulse response function developed in the previous setp with runoff fluxes from a land model (e.g. VIC)."1
convert: "A simple conversion utility to provide users with the ability to convert old routing model setups into RVIC parameters."

I would suggest we create an issue/process for each of these items.

Each of these components are called with a config file in the main script which gives us a good starting point in terms of wps inputs. The outputs are a little more difficult. The modules don't actually directly return anything, so we'll have to dig around in workdir to get the files we are interested in (nothing we haven't done before).

There looks to be lots of samples that will likely help us with development. Furthermore, we will likely be able to use some of those files for testing.

I have yet to test out the scripts, but that is my next step. Once we have all had the chance to successfully run them I think we can get to work.

sum1lim · 2020-07-14T17:47:07Z

Just my thought from running rvic. I think it would be a lot nicer if we make our own input checking process. For example, if a user mixes up between the input files such as DOMAIN/FILE_NAME and ROUTING/FILE_NAME, this is what error messages look like:

slim@devel2:~/Desktop/hydro_data$ rvic parameters sample_parameter_config.cfg
-->> Could not load xarray!! <<--
INFO:init_logger>> -------------------- INITIALIZED RVIC LOG ------------------
INFO:init_logger>> LOG LEVEL: DEBUG
INFO:init_logger>> Logging To Console: True
INFO:init_logger>> LOG FILE: ./routing/sample/logs/RVIC-20200714-172724.log
INFO:init_logger>> ----------------------------------------------------------

INFO:gen_uh_init>> plots directory is ./routing/sample/plots
INFO:gen_uh_init>> logs directory is ./routing/sample/logs
INFO:gen_uh_init>> params directory is ./routing/sample/params
INFO:gen_uh_init>> inputs directory is ./routing/sample/inputs
INFO:gen_uh_init>> aggregated directory is ./routing/sample/temp/aggregated
INFO:gen_uh_init>> remapped directory is ./routing/sample/temp/remapped
INFO:gen_uh_init>> Opened Pour Points File: ./routing/sample/inputs/sample_pour.txt
INFO:gen_uh_init>> Opened UHbox File: ./routing/sample/inputs/uhbox.csv
DEBUG:read_netcdf>> Reading input data variables: ['lon', 'lat', 'frac', 'area', 'mask'], from file: ./routing/sample/inputs/sample_routing_domain.nc
ERROR:gen_uh_init>> Error opening FDR file
Traceback (most recent call last):
  File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 227, in gen_uh_init
    fdr_shape = fdr_data[fdr_var].shape
KeyError: 'Flow_Direction'
ERROR:write>> Traceback (most recent call last):
ERROR:write>>   File "/home/slim/.local/bin/rvic", line 61, in <module>
ERROR:write>> main()
ERROR:write>>   File "/home/slim/.local/bin/rvic", line 43, in main
ERROR:write>> parameters(args.config_file, numofproc=args.numofproc)
ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 51, in parameters
ERROR:write>> outlets, config_dict, directories = gen_uh_init(config_file)
ERROR:write>>   File "/home/slim/.local/lib/python3.7/site-packages/rvic/parameters.py", line 227, in gen_uh_init
ERROR:write>> fdr_shape = fdr_data[fdr_var].shape
ERROR:write>> KeyError
ERROR:write>> :
ERROR:write>> 'Flow_Direction'

KeyError: 'Flow_Direction' does not really give meaningful information to users(Does this mean the file has or does not have the Key?). It would be better if there was a logging message saying "Invalid format: FILENAME does not have Flow_direction variable".

Since RVIC uses many input files, it is so confusing and easy to mix up between the input files. Therefore, a message that directs users to try another attempt with the inputs in the right order could help them reduce time on running this program. For example, if Flow_Direction is in the file that was given for DOMAIN/FILE_NAME, a message like "The file given to DOMAIN could possibly be a ROUTING file. Try another attempt by reordering the inputs." can be logged out.

nikola-rados · 2020-07-14T18:51:41Z

@sum1lim I got that same ix error you did. Do you want to put an issue into the RVIC repo and then suggest your change in a PR to them?

sum1lim · 2020-07-14T18:53:50Z

Do you want to put an issue into the RVIC repo and then suggest your change in a PR to them?

@nikola-rados I can surely do that!

eyvorchuk · 2020-07-14T19:06:39Z

When I view the sample pour_points csv files through GitHub, it shows proper lons and lats columns, but I get

ERROR:gen_uh_init>> Pour Points File must include variables (lons, lats) or (x, y)

when trying to run rvic parameters on either of the sample parameter configuration files. Reading the rows from the csv file gives me this output

['version https://git-lfs.github.com/spec/v1']
['oid sha256:ce13e2314839c6c2a560db0381b32df6e09300f35a0f4591dc256e0690f9610e']
['size 518']

rather than what's shown in GitHub. These were the commands I ran:

>>> import csv
>>> with open('samples/pour_points/columbia_sample_pour_points.csv') as csv_file:
...     csv_reader = csv.reader(csv_file, delimiter=',')
...     for row in csv_reader:
...             print(row)
...

Does anyone know why the output would be different?

nikola-rados · 2020-07-14T19:13:33Z

Hmmm, I'm not sure. @corviday do you have any ideas?

nikola-rados · 2020-07-14T19:16:39Z

@sum1lim and @eyvorchuk I have assigned issues to each of you. Once we clear up some of the running issues I will close this issue and you can begin working on the processes.

sum1lim · 2020-07-14T19:20:41Z

@eyvorchuk yeah, I have the same problem. The csv file is corrupted somehow. You probably want to just copy and paste the content you see in gitHub and make a new csv file.

eyvorchuk · 2020-07-14T20:58:31Z

I think it'd be good to put rvic in one of the requirements files (probably requirements_dev.txt) so that we can import the RVIC modules we need in our processes. Are there any other packages we should add?

sum1lim · 2020-07-20T20:28:47Z

RVIC theoretically can be called from a python script and passed a dictionary of values, instead of gettng all its arguments from config files, but this isn't well documented and we've never done it here at PCIC.

Unfortunately, I have found out that the ability to pass dictionary into RVIC has not been applied to the latest version 1.1.0.post1. We will probably have to wait until version 1.1.1 is released. Currently, we only can pass config files at the moment.

Do you want to put an issue into the RVIC repo and then suggest your change in a PR to them?

Also, my PR is still pending yet.

nikola-rados · 2020-07-20T20:46:23Z

That's fine we can add that in later once they get around to it. Maybe you can @ the lead developer over there to get their attention. It looks like the repo hasn't been touched in a while so they may not have seen your request.

nikola-rados added process Create a WPS process research Initial exploration labels Jul 10, 2020

nikola-rados added this to the Initial Setup milestone Jul 10, 2020

nikola-rados assigned nikola-rados, eyvorchuk and sum1lim Jul 10, 2020

sum1lim mentioned this issue Jul 14, 2020

.ix indexer should be changed to .loc in parameters.py UW-Hydro/RVIC#130

Open

nikola-rados closed this as completed Jul 23, 2020

sum1lim mentioned this issue Aug 27, 2020

Use rvic 1.1.1 and resolve its internal issues #31

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial research #5

Initial research #5

nikola-rados commented Jul 10, 2020

corviday commented Jul 10, 2020 •

edited

Loading

eyvorchuk commented Jul 10, 2020

nikola-rados commented Jul 10, 2020

corviday commented Jul 10, 2020 •

edited

Loading

sum1lim commented Jul 14, 2020 •

edited

Loading

sum1lim commented Jul 14, 2020

corviday commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 •

edited

Loading

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020 •

edited

Loading

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 •

edited

Loading

nikola-rados commented Jul 14, 2020

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 •

edited

Loading

sum1lim commented Jul 20, 2020 •

edited

Loading

nikola-rados commented Jul 20, 2020

Initial research #5

Initial research #5

Comments

nikola-rados commented Jul 10, 2020

corviday commented Jul 10, 2020 • edited Loading

eyvorchuk commented Jul 10, 2020

nikola-rados commented Jul 10, 2020

corviday commented Jul 10, 2020 • edited Loading

sum1lim commented Jul 14, 2020 • edited Loading

sum1lim commented Jul 14, 2020

corviday commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 • edited Loading

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020 • edited Loading

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 • edited Loading

nikola-rados commented Jul 14, 2020

nikola-rados commented Jul 14, 2020

sum1lim commented Jul 14, 2020

eyvorchuk commented Jul 14, 2020 • edited Loading

sum1lim commented Jul 20, 2020 • edited Loading

nikola-rados commented Jul 20, 2020

corviday commented Jul 10, 2020 •

edited

Loading

corviday commented Jul 10, 2020 •

edited

Loading

sum1lim commented Jul 14, 2020 •

edited

Loading

eyvorchuk commented Jul 14, 2020 •

edited

Loading

sum1lim commented Jul 14, 2020 •

edited

Loading

eyvorchuk commented Jul 14, 2020 •

edited

Loading

eyvorchuk commented Jul 14, 2020 •

edited

Loading

sum1lim commented Jul 20, 2020 •

edited

Loading