time series module

aed-modeller edited this page Jul 5, 2023 · 32 revisions

overview

The time-series plotting module is one of the core modules of MARVL. In a basic model performance assessment, model outputs are extracted at the same locations and times as the field observations, and the two are compared to calculate model skill. In water quality studies, however, the model domain of interest often contains multiple sites (from regular monitoring or occasional surveys), and it is critical that the model captures the spatial variation across them rather than the behaviour at a single site. This is especially important in areas with rapid, steep changes in bathymetry, where both the field observations and the model results show strong spatiotemporal variation. The module is therefore designed to show observations from one or more sites within a given area alongside the median and percentiles of the model results within the same area. Users define the area of interest in GIS polygon files, which can cover a single site or a larger area with multiple sites.
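
As a rough illustration of the area-based comparison described above, the following MATLAB sketch selects the model cells that fall inside a polygon and summarises them with a median and a 5th to 95th percentile band at each time step. The coordinates, polygon and values are made up, and all variable names are hypothetical; this is not MARVL's internal code. Note that `prctile` requires the Statistics and Machine Learning Toolbox in MATLAB.

```matlab
% Hypothetical model cell centres (x, y) and a square polygon.
cellX = [0.5 1.5 2.5 0.5 1.5 2.5];
cellY = [0.5 0.5 0.5 1.5 1.5 1.5];
polyX = [0 2 2 0 0];
polyY = [0 0 2 2 0];

% Made-up model output: rows are time steps, columns are cells.
modelData = [10 12 30 11 13 31;
             14 16 40 15 17 41];

% Keep only the cells whose centres lie inside the polygon.
inArea   = inpolygon(cellX, cellY, polyX, polyY);
areaData = modelData(:, inArea);

% Percentiles across the selected cells (dimension 2), per time step.
pred_lims = [0.05 0.5 0.95];
bands = prctile(areaData, pred_lims * 100, 2);

modelLow    = bands(:, 1);   % 5th percentile
modelMedian = bands(:, 2);   % median
modelHigh   = bands(:, 3);   % 95th percentile
```

Widening the polygon brings more cells into `areaData`, so the band reflects spatial spread across the area rather than the value at a single cell.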

The module also includes standard and advanced statistical methods for scoring model performance. Seven statistical measures are available in the time-series module for assessing model performance (Table 2.2.1). Readers are referred to Bennett et al. (2013) and Hipsey et al. (2020) for a detailed overview and discussion of these assessment methods. Users select the measure(s) to include in the performance assessment via the ‘skills’ field of the module configuration.

The module provides two options for comparing observations with model predictions when multiple sites exist at the same time in a defined polygon. The option is selected in the ‘scoremethod’ field of the module:

  • Median: when this option is activated, the median value of all observations is compared to the median value of the model predictions, i.e.

    • $O_i = O_{50}$
    • $P_i = P_{50}$
      where $O_{50}$ and $P_{50}$ are the 50th percentiles of the observations and model predictions at the time of the $i$th observation, respectively.
  • Range: when this option is activated, the median value of all observations is compared to the range of model predictions in the defined polygon. For example, if the user sets ‘timeseries.pred_lims = [0.05,0.25,0.5,0.75,0.95]’, the range is taken as the 5th to 95th percentiles of the model predictions. The prediction is considered satisfactory if the median observation sits within this range, in which case the prediction is set equal to the observation. Otherwise, the limit of the range nearest to the median observation is used, i.e.

    • $P_i = O_i$ if $P_5 \leq O_i \leq P_{95}$
    • $P_i = P_{95}$ if $O_i > P_{95}$
    • $P_i = P_5$ if $O_i < P_5$

    where $P_5$ and $P_{95}$ are the 5th and 95th percentiles of the model predictions at the time of the $i$th observation, respectively.
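
The two options can be sketched in a few lines of MATLAB. The values and variable names below are made up for illustration and are not taken from the MARVL source. The 'range' case is written as a clamp, which is equivalent to the three cases listed above.

```matlab
% Made-up values at three observation times (hypothetical data).
obsMedian = [8.0; 12.0; 20.0];    % O_i: median of observations in the polygon
predP5    = [9.0; 10.0; 11.0];    % P_5:  5th percentile of model prediction
predP50   = [10.0; 11.0; 12.0];   % P_50: median of model prediction
predP95   = [11.0; 13.0; 14.0];   % P_95: 95th percentile of model prediction

scoremethod = 'range';            % 'median' or 'range'

switch scoremethod
    case 'median'
        % Compare the median observation with the median prediction.
        P = predP50;
    case 'range'
        % Inside the band: P_i = O_i. Outside: the nearest band limit.
        P = min(max(obsMedian, predP5), predP95);
end

O = obsMedian;   % O and P are the paired series passed to the skill scores
```

With these numbers the first observation lies below the band, the second inside it and the third above it, so `P` becomes `[9; 12; 14]`.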

Summary of model performance skills in the time series module

| Model skill | Description |
| --- | --- |
| R | correlation coefficient |
| BIAS | bias of model prediction relative to observations |
| MAE | mean absolute error |
| NMAE | mean absolute error normalized to mean observation |
| RMS | root-mean-square error |
| NRMS | root-mean-square error normalized to mean observation |
| MEF | model efficiency, also called Nash-Sutcliffe efficiency |
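
For orientation, the sketch below computes the seven skills for a short made-up pair of series using common textbook definitions. These formulas are assumptions (in particular, BIAS is taken here as the mean error expressed as a percentage of the mean observation) and may differ in detail from what MARVL implements; see Bennett et al. (2013) for the definitions.

```matlab
% Made-up paired observations (O) and predictions (P).
O = [2; 4; 6; 8];
P = [3; 5; 5; 9];

err   = P - O;
meanO = mean(O);

% Pearson correlation coefficient, written out explicitly.
dO = O - meanO;
dP = P - mean(P);
R  = sum(dO .* dP) / sqrt(sum(dO.^2) * sum(dP.^2));

BIAS = 100 * mean(err) / meanO;               % assumed: percent of mean observation
MAE  = mean(abs(err));                        % mean absolute error
NMAE = MAE / meanO;                           % MAE normalized to mean observation
RMS  = sqrt(mean(err.^2));                    % root-mean-square error
NRMS = RMS / meanO;                           % RMS normalized to mean observation
MEF  = 1 - sum(err.^2) / sum((O - meanO).^2); % Nash-Sutcliffe efficiency
```

A MEF of 1 indicates a perfect match, while values at or below 0 mean the model is no better than the mean of the observations.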

Summary of time series module configuration

| Field name | Description | Options | Comments |
| --- | --- | --- | --- |
| start_plot_ID | select which variable to start plotting | | refer to the setting in master.varname |
| end_plot_ID | select which variable to finish plotting | | refer to the setting in master.varname |
| plotvalidation | option to add field data | 1: add field data; 0: no field data | |
| plotmodel | option to add model results | 1: add model results; 0: no model results | |
| plotdepth | option to select surface or bottom layers | surface and/or bottom | if both surface and bottom are configured, both layers are added to the same plots and used together for the model performance matrix |
| edge_color | define edge colors for field data symbols | | |
| depth_range | define depth range for plotting | | [0 5000] by default |
| validation_minmax | option to add min and max values of observations | 1: add max/min data; 0: no max/min data | 0 by default |
| isModelRange | option to add model result percentiles | 1: add percentile range; 0: no percentile range | 0 by default |
| pred_lims | define percentile limits for range plotting | 5 numbers between 0 and 1 | [0.05 0.25 0.5 0.75 0.95] by default |
| alph | transparency of range plot | 0 - 1 | 0.5 by default |
| isFieldRange | option to add monthly historical field data range | 1: add range; 0: no range | 0 by default |
| fieldprctile | define field data percentile range if isFieldRange==1 | range between 0 and 100 | [10 90] by default |
| isHTML | option to add all plots into one HTML page | 1: add to HTML; 0: no HTML | 0 by default |
| polygon_file | define the polygon file for zones/sites/areas | | |
| plotAllsites | option to plot all sites in the polygon file | 1: plot all sites; 0: selected sites | 1 by default |
| plotsites | define site IDs for plotting if plotAllsites==0 | | |
| add_error | option to calculate model performance matrix | 1: calculate; 0: not calculate | 0 by default |
| isSaveErr | option to save the performance matrix in a .mat file | 1: save; 0: not save | 0 by default |
| obsTHRESH | define the number of observations over which the performance statistics make sense | >5 | 5 by default |
| showSkill | option to show performance statistics in the plot | 1: show; 0: not show | 0 by default |
| scoremethod | option to select traditional or advanced method for performance skills | 'median' or 'range' | 'median': traditional method that compares the median observation values to the median model results at selected sites; 'range': compares the median observation values to the model percentile predictions |
| skills | option to select statistical skills | R; BIAS; MAE; RMSE; NMAE; NRMS; MEF | |
| SkillStyle | option to select performance skill style | 'score' or 'tailor' | 'score': traditional way to present the selected skills in a table on the figure; 'tailor': Taylor diagram presenting the correlation coefficient, RMSE and standard deviation of the modelled and observed data on the figure |
| outputdirectory | define directory to save plots | | |
| htmloutput | define directory to save HTML files | | |
| ErrFilename | define directory and file name to save the performance statistics matrix | | |
| ncfile.symbol | define symbol for model results (surface and bottom) | '-' or '.' | user can define multiple model outputs to be compared in one figure |
| ncfile.colour | define colors for model median value plotting (surface and bottom) | | RGB format for color definition |
| ncfile.col_pal_color_surf | define color for range plot of surface model results | | |
| ncfile.col_pal_color_bot | define color for range plot of bottom model results | | |
| datearray | define time period for plotting | | |
| dateformat | define time format to show on x axis | | 'mm/yy' by default |
| istitled | option to add title | 1: add; 0: not add | 1 by default |
| isylabel | option to add y label | 1: add; 0: not add | 1 by default |
| islegend | option to add legend | 1: add; 0: not add | 1 by default |
| isYlim | option to define Y axis limits | 1: add; 0: not add | 0 by default |
| isGridon | option to add grid | 1: add; 0: not add | 1 by default |
| dimensions | define figure dimensions in centimeters | | [20 10] by default |
| dailyave | option to use daily average or raw model output interval | 1: daily average; 0: raw model output interval | 0 by default |
| smoothfactor | option to smooth out the median model results | use odd numbers, such as 1 or 3 | 1 by default, no smoothing |
| legendlocation | define legend location | | 'northeastoutside' by default |
| filetype | define figure file type | 'png' or 'eps' | 'png': save figures in PNG format only; 'eps': save figures in both EPS and 300 dpi JPG formats |
| cAxis.value | define limits of Y axis | | empty [] by default; MATLAB will automatically adjust the y limits |
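
To make the `dailyave` option more concrete, here is a minimal sketch of one way to reduce sub-daily model output to daily means in MATLAB. The time stamps, values and variable names are made up, and the grouping approach is an assumption rather than a description of MARVL's own routine.

```matlab
% Made-up model output every 12 hours over two days (hypothetical data).
mtime = datenum(2013, 5, 1) + [0; 0.5; 1; 1.5];
mdata = [10; 14; 20; 22];

% Group samples by calendar day (integer part of the date number).
[days, ~, dayIdx] = unique(floor(mtime));

% Average the samples that fall on each day.
dailyMean = accumarray(dayIdx(:), mdata, [], @mean);
```

Here `days` holds one date number per calendar day and `dailyMean` the matching averages, so the pair can be plotted in place of the raw series.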

Example timeseries module configuration:

timeseries.start_plot_ID = 1; % select which variable to plot
timeseries.end_plot_ID = 12;

timeseries.plotvalidation = 1; % Add field data to figure (1 or 0)
timeseries.plotmodel = 1;

timeseries.plotdepth = {'surface','bottom'};  % cell array with 'surface', 'bottom', or both
timeseries.edge_color = {[166,86,40]./255;[8,88,158]./255}; % symbol edge color for field data, surface and bottom
timeseries.depth_range = [0.2 100];
timeseries.validation_minmax = 0;    % option to add max/min observations
timeseries.isModelRange = 1;         % option to plot model range using the percentiles below
timeseries.pred_lims = [0.05,0.25,0.5,0.75,0.95]; % must be 5 numbers
timeseries.alph = 0.5; % transparency

timeseries.isFieldRange = 0;         % option to add plot field data range
timeseries.fieldprctile = [10 90];
timeseries.isHTML = 1;

% polygon file define the site areas
timeseries.polygon_file = 'E:\database\AED-MARVL-v0.4\Projects\Erie\GIS\erie_validation_v4.shp';

% option to plot all sites or selected sites
timeseries.plotAllsites = 1;
if timeseries.plotAllsites == 0
    timeseries.plotsite = [1];
end

% section for model skill calculations
timeseries.add_error = 1;
timeseries.isSaveErr = 1;
timeseries.obsTHRESH = 5;
timeseries.showSkill = 1;
timeseries.scoremethod = 'range'; % 'range' or 'median'
timeseries.SkillStyle = 'score'; % 'score' for score table or 'tailor' for Taylor diagram

% selection of model skill assessment, 1: activated; 0: not activated
timeseries.skills = [1,... % R: correlation coefficient
    1,... % BIAS: bias relative to mean observation (%)
    1,... % MAE: mean absolute error
    1,... % RMSE: root mean square error
    1,... % NMAE: MAE normalized to mean observation
    1,... % NRMS: RMSE normalized to mean observation
    1,... % MEF: model efficiency, Nash-Sutcliffe efficiency
    ];
timeseries.outputdirectory = 'E:\database\AED-MARVL-v0.4\Projects\Erie\plotting\timeseries_testF\RAW\';
timeseries.htmloutput = 'E:\database\AED-MARVL-v0.4\Projects\Erie\plotting\timeseries_testF\HTML\';
timeseries.ErrFilename = 'E:\database\AED-MARVL-v0.4\Projects\Erie\plotting\timeseries_testF\errormatrix.mat';

timeseries.ncfile(1).symbol = {'-','-'};
timeseries.ncfile(1).colour = {[166,86,40]./255;[8,88,158]./255};% Surface and Bottom
timeseries.ncfile(1).col_pal_color_surf =[[254,232,200]./255;[252,141,89]./255]; % color1: 5-95 perc; color2: 25-75 perc
timeseries.ncfile(1).col_pal_color_bot  =[[222,235,247]./255;[107,174,214]./255];

% plotting configuration
timeseries.datearray = datenum(2013,5:1:10,01);
timeseries.dateformat = 'mm/yy';

%timeseries.dimc = [0.9 0.9 0.9]; % dimmest (lightest) color
timeseries.istitled = 1;
timeseries.isylabel = 1;
timeseries.islegend = 1;
timeseries.isYlim   = 1;
%timeseries.isGridon = 1;
timeseries.dimensions = [15 7.5]; % Width & Height in cm

timeseries.dailyave = 0; % 1 for daily average, 0 for model output interval. 

timeseries.legendlocation = 'northeastoutside';
timeseries.filetype = 'eps';

for vvvv=1:size(MARVLs.master.varname,1)
    timeseries.cAxis(vvvv).value = [ ];
end
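
Before running the module, it can help to check a few fields against the constraints listed in the configuration table. The sketch below is a hypothetical pre-flight check, not part of MARVL. It redefines a small subset of the fields so that it runs on its own, and the requirement that `pred_lims` be ascending is an assumption.

```matlab
% Hypothetical subset of the configuration shown above.
timeseries.pred_lims   = [0.05, 0.25, 0.5, 0.75, 0.95];
timeseries.scoremethod = 'range';
timeseries.skills      = [1, 1, 1, 1, 1, 1, 1];

% pred_lims: five fractions between 0 and 1 (ascending order assumed).
okLims = numel(timeseries.pred_lims) == 5 && ...
         all(timeseries.pred_lims >= 0 & timeseries.pred_lims <= 1) && ...
         issorted(timeseries.pred_lims);

% scoremethod: one of the two documented options.
okMethod = any(strcmp(timeseries.scoremethod, {'median', 'range'}));

% skills: one 0/1 flag per documented skill (seven in total).
okSkills = numel(timeseries.skills) == 7 && ...
           all(ismember(timeseries.skills, [0 1]));

if ~(okLims && okMethod && okSkills)
    error('timeseries configuration failed a basic sanity check');
end
```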

Example outputs

Example output of the time-series module for water temperature in Coorong South Lagoon, using the ‘Range’ method and a Taylor diagram for model performance assessment.
