You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is a slightly difficult issue to write, because I need to strike a balance between the purpose of this project (which is good and I support) and one of the fundamental decisions you made when creating the project (which I like to initiate a discussion about). I would like to discuss and get your opinion about the format of the input files that you chose.
Just as a basis for discussion, this is the format you chose:
"""Set the full path to the flow executable and flags"""
mpirun -np 33 flow --tolerance-mb=1e-7 --linear-solver=cprw --newton-min-iterations=1 --enable-tuning=true --enable-opm-rst-file=true --output-extra-convergence-info=steps,iterations
"""Set the model parameters"""
spe11a master #Name of the spe case (spe11a, spe11b, or spe11c) and OPM Flow version (master or release)
complete gaswater #Name of the co2 model (immiscible, convective [convective requires a Flow version newer than 22-08-2024], or complete) and co2store implementation (gaswater or gasoil [oil properties are set to water internally in OPM flow])
cartesian #Type of grid (cartesian, tensor, or corner-point)
2.8 0.01 1.2 #Length, width, and depth [m]
280 #If cartesian, number of x cells [-]; otherwise, variable array of x-refinment
1 #If cartesian, number of y cells [-]; otherwise, variable array of y-refinment [-] (for spe11c)
120 #If cartesian, number of z cells [-]; if tensor, variable array of z-refinment; if corner-point, fix array of z-refinment (18 entries)
20 20 #Temperature bottom and top rig [C]
1.2 110000 1 #Datum [m], pressure at the datum [Pa], and multiplier for the permeability in the z direction [-]
1e-9 1.6e-5 #Diffusion (in liquid and gas) [m^2/s]
0 0 #Rock specific heat and density (for spe11b/c)
0 0 0 #Added pore volume on top boundary (for spe11a [if 0, free flow bc]), pore volume on lateral boundaries, and width of buffer cell [m] (for spe11b/c)
0 0 #Elevation of the parabola and back boundary [m] (for spe11c)
"""Set the saturation functions"""
(max(0, (s_w - swi) / (1 - swi))) ** 2 #Wetting rel perm saturation function [-]
(max(0, (1 - s_w - sni) / (1 - sni))) ** 2 #Non-wetting rel perm saturation function [-]
penmax * math.erf(pen * ((s_w-swi) / (1.-swi)) ** (-(1.0 / 2)) * math.pi**0.5 / (penmax * 2)) #Capillary pressure saturation function [Pa]
(np.exp(np.flip(np.linspace(0, 5.0, npoints))) - 1) / (np.exp(5.0) - 1) #Points to evaluate the saturation functions (s_w) [-]
Just from a data perspective, this is a:
line-based plain-text format
supporting comments (#)
with custom markup elements ("""), I am actually unclear if they serve a purpose or are comments
only containing values, not key-value pairs (e.g. x=y)
since there are no keys it seems to depend on the order of arguments
without a hierarchy (there are no sections, unless the """ stands for that)
Such a type of input format has a number of serious challenges:
it is not versioned (i.e. you cannot tell when and for what version of pyopenspe you made that file)
it is not backward compatible (old files will not work with new versions of pyopenspe if the number or order of parameters change)
it is not expressive (without the comments I have no idea what each line means)
it is not readable by standard input parser libraries (e.g. configparser, or argparse), instead you wrote your own parser (in utils/inputvalues.py)
writing your own parser is actually pretty hard to get correct, e.g. I think the following is true:
your parser does not fail gracefully if I do not follow the format (e.g. put in the wrong number of lines, put in too many arguments on one line, put in the wrong type, put values outside of ranges)
your parser does not support default arguments, specifying the expected type of a value, specifying the expected data range of a value, a documentation string that belongs to a key-value pair, other types of information about the value (e.g. if it is deprecated and will be removed)
I am not pointing this out, because I want to diminish your effort. Constructing data formats and writing parsers is just hard (e.g. one of the projects I contribute to that uses a custom parser has a parser of 2500 lines of C++ code), which is why many state-of-the-art software makes use of existing file formats and established parser libraries that can read them.
Currently the following formats are widely in-use:
E.g. the input file listed above, but converted to TOML could look like:
# Set the full path to the flow executable and flags
flow_exectuable = "mpirun -np 33 flow --tolerance-mb=1e-7 --linear-solver=cprw --newton-min-iterations=1 --enable-tuning=true --enable-opm-rst-file=true --output-extra-convergence-info=steps,iterations"
[Model parameters]
spe_case = "spe11a master" #Name of the spe case (spe11a, spe11b, or spe11c) and OPM Flow version = "version" (master or release)
co2_model = ["complete", "gaswater"] #Name of the co2 model (immiscible, convective [convective requires a Flow version newer than 22-08-2024], or complete) and co2store implementation (gaswater or gasoil [oil properties are set to water internally in OPM flow])
grid = "cartesian" #Type of grid (cartesian, tensor, or corner-point)
grid_length = [2.8, 0.01, 1.2] #Length, width, and depth [m]
...
[Saturation functions]
wetting_permeability_function = "(max(0, (s_w - swi) / (1 - swi))) ** 2"
...
And this file could be read into a python dictionary as simple as:
import tomllib
with open("spe11b.toml", "rb") as f:
data = tomllib.load(f)
You can then check that all necessary entries in the dict exist with the correct type, or create default values if parameters are not given in the file.
I understand that this is one of the basic assumptions about the user interface of pyopenspe11 and you already distributed this to a number of users, but I would like to at least urge you to consider to switch a better data format. You will make your own life and that of your users unbelievably hard with the current data format, and better formats are available and easy to use / implement. Additionally using standard data formats is one of the guiding principles of FAIR research software (https://www.go-fair.org/fair-principles/, I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.](https://www.go-fair.org/fair-principles/i1-metadata-use-formal-accessible-shared-broadly-applicable-language-knowledge-representation/).
This is a slightly difficult issue to write, because I need to strike a balance between the purpose of this project (which is good and I support) and one of the fundamental decisions you made when creating the project (which I like to initiate a discussion about). I would like to discuss and get your opinion about the format of the input files that you chose.
Just as a basis for discussion, this is the format you chose:
Just from a data perspective, this is a:
x=y
)"""
stands for that)Such a type of input format has a number of serious challenges:
utils/inputvalues.py
)I am not pointing this out, because I want to diminish your effort. Constructing data formats and writing parsers is just hard (e.g. one of the projects I contribute to that uses a custom parser has a parser of 2500 lines of C++ code), which is why many state-of-the-art software makes use of existing file formats and established parser libraries that can read them.
Currently the following formats are widely in-use:
E.g. the input file listed above, but converted to TOML could look like:
And this file could be read into a python dictionary as simple as:
You can then check that all necessary entries in the dict exist with the correct type, or create default values if parameters are not given in the file.
I understand that this is one of the basic assumptions about the user interface of pyopenspe11 and you already distributed this to a number of users, but I would like to at least urge you to consider to switch a better data format. You will make your own life and that of your users unbelievably hard with the current data format, and better formats are available and easy to use / implement. Additionally using standard data formats is one of the guiding principles of FAIR research software (https://www.go-fair.org/fair-principles/, I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.](https://www.go-fair.org/fair-principles/i1-metadata-use-formal-accessible-shared-broadly-applicable-language-knowledge-representation/).
Related to openjournals/joss-reviews#7357.
The text was updated successfully, but these errors were encountered: