Using classified raster images and meteorological drivers to better understand what causes sediment plumes and blooms in Lake Superior. The input data for this repo comes from the rossyndicate/Superior-Plume-Bloom repo.
This pipeline is set up to download, process, and run models for detecting blooms and plumes. It is structured as a `{targets}` pipeline so that the workflow is reproducible and easy to follow. The full pipeline can be run with `tar_make()`. The first time you run this, you may get errors about missing packages; install those and then try again. You should read the following caveats about some of the data inputs/downloads within the pipeline before attempting to build.
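A typical first build looks like this (a minimal sketch, assuming you are in an R session at the repo's top-level directory):

```r
# First time only: install {targets}, plus any other packages
# that tar_make() later reports as missing.
# install.packages("targets")

library(targets)
tar_make()   # builds every outdated target in the pipeline
```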
The meteorological driver data from PRISM take a long time to download and process. For this reason, there are two spots in the pipeline where pre-built data can be used to skip those steps.
- If you have access to the zip file of the pre-downloaded, raw meteorological data on Box, comment out the `p1_prism_files` target in `1_download.R` and uncomment the target with the same name that is set up below it. You will need to download the zip file from Box and unzip the files to the `1_download/prism_data/` directory before being able to build the full pipeline.
- If you have access to the CSV file of processed meteorological data on Box, comment out the `p2_prism_data_huc` target in `2_process.R` and uncomment the target with the same name that is set up below it. You will need to download the CSV file from Box and move it to the `2_process/in/` directory before being able to build the full pipeline.
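The comment/uncomment swap looks roughly like this inside `1_download.R`. This is a hedged sketch only: the repo's actual target commands will differ, and `download_prism_data()` is a placeholder name, not a real function from the repo.

```r
# Default (slow) target -- comment this one out:
# tar_target(
#   p1_prism_files,
#   download_prism_data(...),   # placeholder for the repo's real download step
#   format = "file"
# ),

# Pre-built alternative -- uncomment this one after unzipping the
# Box data into 1_download/prism_data/:
tar_target(
  p1_prism_files,
  list.files("1_download/prism_data", full.names = TRUE),
  format = "file"
)
```

Because both targets share the name `p1_prism_files`, downstream targets do not need to change when you swap them.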
At this time, the raster files of classified imagery are kept in a Google Drive folder to which you need specific access. The data may be released in the future, which would make this step easier. For now, follow the steps below to authenticate to Google Drive when running `tar_make()`.
- Create a new text file called `.gd_config` and save it in the top-level directory of this project.
- Copy-paste this line into that file: `gd_email: 'YOUR_EMAIL@some.service'`
- Change the `YOUR_EMAIL@some.service` part of the file to match the email you will use to access the data.
- Then, try running `tar_make()`.
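The first two steps above can also be done from R in one line; `YOUR_EMAIL@some.service` is the placeholder from the template, which you then replace with your own email.

```r
# Create .gd_config at the project root with the template line;
# edit the email afterwards to match your Google Drive account.
writeLines("gd_email: 'YOUR_EMAIL@some.service'", ".gd_config")
```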
For now, the Lake Superior watershed shapefile `LakeSuperiorWatershed.shp` is only available to our internal team via Box. You should download the spatial zip called `LakeSuperiorWatershed.zip` from Box (it includes all associated metadata files) and unzip it to the folder `1_download/in`. This will ensure that the target in `1_download.R` called `p1_lake_superior_watershed_shp` will successfully find the file it needs.
After you build the pipeline, you should be able to see the following:
- Histogram summarizing the pixel counts by year and mission: `tar_read(p4_basic_summary_histogram)`
- PRISM drivers as a timeseries, visualized by HUC: `tar_read(p4_prism_summary_timeseries)`
- PRISM drivers as boxplots, visualized by HUC and decade: `tar_read(p4_prism_summary_boxes)`
Everyone who is developing this package will build their own pipeline locally. We will not commit pipeline output, and files generated by the pipeline build should be `.gitignore`d. The very first time you build the pipeline, you should delete the `_targets/.gitignore` file. It overrides the top-level `.gitignore` and can be frustrating. Run the following to delete it: `file.remove('_targets/.gitignore')`.