Note: the assessment reports, model input files, and review panel report from the 2021 lingcod assessments are available at https://www.pcouncil.org/stock-assessments-star-reports-stat-reports-rebuilding-analyses-terms-of-reference/groundfish-stock-assessment-documents/ (search for "lingcod").
To interact with this repository

- clone it to your computer: `git clone https://github.com/pfmc-assessments/lingcod.git`,
- open R and set your working directory to the cloned location,
- run `devtools::load_all()`, and
- make magic.
- [Overview](#overview)
- [Repository structure](#repository-structure)
  - [DESCRIPTION](#description)
  - [R](#r)
  - [data-raw](#data-raw)
  - [unfit](#unfit)
  - [example structure](#example-structure)
- [Development guidelines](#development-guidelines)
- [Github issue guidelines](#github-issue-guidelines)
- [Modeling workflow](#modeling-workflow)
These materials do not constitute a formal publication and are for information only. They are in a pre-review, pre-decisional state and should not be formally cited (or reproduced). They are to be considered provisional and do not represent any determination or policy of NOAA or the Department of Commerce.
## Overview

This repository houses information related to the U.S. West Coast 2021 lingcod stock assessment. The repository is structured as an R package, though it contains additional directories that are not standard in an R package. There are many benefits to using R packages for science (Vuorre and Crump, 2020), and given that we are already familiar with using them for other purposes, it seems like a natural extension to use them for stock assessments as well. The structure follows the vertical philosophy, combined with a Google Drive folder with unlimited storage.

The premise behind vertical is that everything must be portable across users and reproducible. Both of these features follow from users being able to expect a common directory structure when a project is designed using vertical. This may seem overly prescriptive at first, but it should be helpful.
## Repository structure

The following sections contain descriptive information about potential contents and how to interact with files and directories in this repository. Feel free to add new files or directories; just be sure to also add information to the .gitignore and .Rbuildignore files if you need to.
### DESCRIPTION

A plain text file that lists all of the necessary packages. Add packages that everyone must have to Imports, and `devtools::load_all()` will check that imported packages are installed. Add packages that are useful and that people should install to Suggests; these are packages used for one-off analyses or for preparing things that will not be touched again. Note that you will still need to use the `::` operator unless you add `@import` or `@importFrom` to a function's roxygen documentation.
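A minimal sketch of the `@importFrom` route (the function below is hypothetical, not part of this package):

```r
#' Count lingcod records by year.
#'
#' A hypothetical example showing roxygen import tags. With the
#' @importFrom line below, dplyr::count() can be called as count();
#' without it, the :: operator would be required.
#'
#' @param data A data frame with a `year` column.
#' @importFrom dplyr count
count_by_year <- function(data) {
  count(data, year)
}
```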
### R

A directory that stores R functions in .R files. All code in this folder should be used to create functions, i.e., no scripts or analyses. Files stored in this directory will be sourced upon loading the package, e.g., with `devtools::load_all()`, and the resulting R objects will be available to everyone in their workspace. Keeping functions in a single place facilitates loading them without performing any analyses, the same as when you load the ggplot2 package. Other stock assessment teams could even load the lingcod package if they want to make use of the epic plotting functions that we WILL make.
If you have files that are not quite ready for deployment in this directory, then please commit them to the unfit directory and we can always move them later.
### data-raw

A directory that stores
- CONFIDENTIAL data that is NOT committed to the repository,
- non-confidential data that is NOT committed to the repository,
- other non-confidential data that is committed to the repository,
- scripts to transform data-raw objects to data objects, and
- other useful data-like files.
Most importantly, DO NOT commit confidential data. The `.gitignore` file is set up to help you avoid committing confidential data by only suggesting that you commit .R files that are saved within data-raw. For example, when you add a new file called test.csv to data-raw and run `git status`, the console output will not list test.csv under Untracked files; but if you add test.R to data-raw, `git status` will list the new file under Untracked files.
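A minimal sketch of how such ignore rules could be added from R, assuming the usethis package is installed and that the patterns below match how this repository's `.gitignore` is actually written:

```r
# Ignore everything in data-raw except .R scripts (hypothetical patterns
# mirroring the behavior described above).
usethis::use_git_ignore(c("data-raw/*", "!data-raw/*.R"))
```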
The choice to not track a data file can be made for multiple reasons other than ensuring confidentiality. For example, files might be too large to be tracked, lack a tangible structure that can be tracked, be static and unlikely to change over time, or have been provided but not currently be in use. For such files, Google Drive is sufficient storage.
If you go to the Lingcod_2021 Google Drive folder, you will see data-raw. Download all of the files at the top level of this directory if you want to run the scripts inside the repository version of data-raw. Do not worry about downloading the directories stored within data-raw/ if you do not want to. When you receive an email with data or a contributor wants to provide data, think about adding it to a directory within the Google Drive data-raw directory. If others provide data to you and you upload it to the Google Drive, please do not change the file name even if it has spaces in it ... just add it as is to maintain its traceability.
An extra step that you can do for fun within Google Drive is to provide shortcuts to shared folders. For example, WDFW shared a Google Drive folder with me called Lingcod. I added a folder in the Google Drive called data-raw/washington_sharedwithTheresa and added a link to her shared folder here. I also added a link to the actual catch file she provided in data-raw/ rather than copying the file. This process is largely explained in the data-raw README.
Scripts inside this folder will be used to create data objects and products. In theory, scripts should be developed such that running the following code from the top-level directory of lingcod_2021 will create all of the stored data objects and save them in data/ as .rda files:

```r
mapply(source, dir("data-raw", pattern = "\\.R", full.names = TRUE))
```
For example, recreational fisheries catches are constructed from information stored in multiple data files within data-raw and code in lingcod_catch.R. The result, rec_catch_OR.rda, is an R object that is available in everyone's workspace when the R package is loaded. This script also stores code to build other catch data frames that will be combined to create the time series of catches placed in the data file.
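A hedged sketch of what such a script might look like; the input file name and cleaning steps are hypothetical (only rec_catch_OR is named in this README):

```r
# data-raw/lingcod_catch.R (abbreviated, hypothetical sketch)
# Read a raw catch file that lives in data-raw but is not tracked by git.
rec_catch_OR <- read.csv(file.path("data-raw", "oregon_rec_catch.csv"))

# ... clean, filter, and reshape the raw data here ...

# Save the object to data/rec_catch_OR.rda so it loads with the package.
usethis::use_data(rec_catch_OR, overwrite = TRUE)
```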
todo: more information about types of scripts and where would you put a script that uses output?
### unfit

A storage and tracking location for 'unfit' information, such as exploratory scripts. Feel free to add anything here that you want tracked but that does not fit into the 'vertical' structure outlined by the 'rules' above.
### example structure

The following directory structure should be adhered to when adding new files:
```
lingcod_2021
|----R
|    |    data.R
|    |    plot_ntable.R
|
|----data
|    |    rec_catch_OR.rda
|
|----data-raw
|    |    lingcod_catch.R
|    |    template.R
|
|----doc
|    |----north
|    |----south
|    |    lingcod_21f_rec_00.R
|
|----figures
|
|----inst
|    |----extdata
|
|----man (generated by devtools::document())
|    |----roxygen
|    |    |----templates
|    |    |    |    data.R
|    |    rec_catch_OR.Rd
|
|----models
|
|----slides
|
|----tables
|
|----tests
|
|----unfit
|
|    .Rbuildignore (lists directories and files that do not pertain to building the R package)
|    .gitignore (lists untracked directories and files)
|    DESCRIPTION (lists necessary R packages; see above)
|    NAMESPACE (generated by roxygen)
|    README.md (stores this content you are reading)
```
## Development guidelines

- Do not commit any confidential data to this repository. Files placed in data-raw are ignored by default unless they have the .R extension. Use the Google Drive folder to share data with team members.
- Hard wrap text at less than 80 characters; consider using fewer characters if it leads to logical chunks. Think about how users will edit the text and use that to guide where you wrap lines.
- Use decimal degrees rounded to 2 digits instead of degrees and minutes for location information (see the sketch after this list).
- Colors for north and south are blue and red, respectively.
- Please use a functional spell checker while developing within this repository.
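A minimal sketch of the location-format conversion, assuming input as degrees and decimal minutes (the helper name is hypothetical):

```r
# Convert degrees and decimal minutes to decimal degrees rounded to
# 2 digits, per the guideline above.
dm_to_dd <- function(degrees, minutes) {
  round(degrees + minutes / 60, 2)
}
dm_to_dd(46, 53.4)  # 46.89
```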
## Github issue guidelines

If you think you should write something down, more than likely you should put it in an issue. Issues are searchable and a great way to document thoughts and voluntell people to do things.
todo: provide information on how to use github issues for this repository
## Modeling workflow

[updated Friday, June 4]: examples are at the end of models/lingcod_model_bridging_new_exe.R and models/lingcod_model_bridging_newdat.R.
- Add rows to models/README.md for each model, including placeholders as needed.
- Add a script with a name like models/lingcod_model_..._.R which is focused on the particular modeling task, and note the script name in the README file.
- Within the script, use the functions `r4ss::copy_SS_inputs()`, `get_dir_ling()`, and `get_dir_exe()` to copy model files into a new folder.
- Use the `get_inputs_ling()` function to read the SS input files into R.
- Modify the input files within R.
- Write the modified files using `write_inputs_ling()`.
- Run the model using either `r4ss::run_SS_models()`, command line commands, or whatever approach you like.
- Commit the model results to the repo (most files will be ignored thanks to `.gitignore`).
- Read model output and assign it to the workspace using a command like `get_mod(area = "n", num = 22, sens = 1)`. Note: `get_mod()` is a wrapper for `r4ss::SS_output()` that saves you having to figure out the full path for the model, but also adds an `$area` element ("n" or "s") to the resulting list, which is used in the next steps.
- Make standard r4ss plots with custom colors: `make_r4ss_plots_ling(mod.2021.n.022.001, plot = 1:26)`.
- Make custom plots: `make_r4ss_plots_ling(mod.2021.n.022.001, plot = 31:50)`.
- Make a two-panel plot comparing two models: `plot_twopanel_comparison(list(mod.2021.n.022.001, mod.2021.n.022.405), print = FALSE)`, which is just a wrapper for `SSsummarize() %>% SSplotComparisons()` with a few extra defaults. Use the argument `legendlabels = c("north base", "sensitivity blah blah blah")` if you want more in the key than the model ids.
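A hedged sketch tying these steps together. The helper functions (`get_dir_ling()`, `get_inputs_ling()`, `write_inputs_ling()`, `get_mod()`, `make_r4ss_plots_ling()`, `plot_twopanel_comparison()`) are from this package, but the argument names shown for them, the model number 23, and the object name mod.2021.n.023.001 are assumptions for illustration only:

```r
devtools::load_all()  # load the lingcod package functions

# Copy the input files from an existing model into a new folder
# (arguments to the directory helpers are assumed, not documented signatures)
dir_old <- get_dir_ling(area = "n", num = 22)
dir_new <- get_dir_ling(area = "n", num = 23)
r4ss::copy_SS_inputs(dir.old = dir_old, dir.new = dir_new)

# Read the SS input files into R, modify them, and write them back out
inputs <- get_inputs_ling(area = "n", num = 23)
# ... modify elements of `inputs` here, e.g., a control-file parameter ...
write_inputs_ling(inputs)

# Run the model
r4ss::run_SS_models(dirvec = dir_new)

# Read the output and make plots
mod.2021.n.023.001 <- get_mod(area = "n", num = 23, sens = 1)
make_r4ss_plots_ling(mod.2021.n.023.001, plot = 1:26)   # standard r4ss plots
make_r4ss_plots_ling(mod.2021.n.023.001, plot = 31:50)  # custom plots

# Compare the new model against the base model
plot_twopanel_comparison(
  list(mod.2021.n.022.001, mod.2021.n.023.001),
  print = FALSE,
  legendlabels = c("north base", "new model")
)
```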
Some additional ideas:

- We could save the results of `SS_output()` (or `get_mod()`) as an .rda file for sensitivities, not just the base model, when building the report (see the sketch after this list).
- If the repo is getting too big, we can delete most of the .sso files from the repo and rely on Google Drive to pass these back and forth, perhaps in concert with the R package to load those files.
- The compiled pdf can be found on Google Drive and is not always pushed to GitHub because of its size.
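A minimal sketch of that caching idea; the sensitivity number and file path are hypothetical:

```r
# Hypothetical: cache a sensitivity model's output as .rda so the report
# can load it without re-reading the .sso files.
mod.2021.n.022.002 <- get_mod(area = "n", num = 22, sens = 2)
save(mod.2021.n.022.002, file = "data/mod.2021.n.022.002.rda")
```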