Model Calibration
Luca
Luca is a tool for building and performing a procedure to calibrate parameters for a (hydrologic) model. Luca integrates the following components in a wizard-style user interface:
Shuffled Complex Evolution (SCE)
The purpose of SCE is to calibrate parameters so that the model, which requires those parameters, gives better results. SCE consists of the following six steps:
The points converge into a very small region, which is less than 0.1% of the space within the lower and upper bounds of the parameters. The number of complexes used in SCE Step 3 decreases by 1 in every shuffling loop. This decrease stops when the number of complexes reaches the minimum number of complexes required. The output is the parameter file containing the point (a parameter set) that has the best criterion value.

Luca Wizard Steps
The following calibration wizard steps are provided by Luca. The step numbers and captions match the list shown in the wizard window.

1. Start
The user can either start a new session or continue a previous one by opening an existing session file. Session files are created by saving the session in Wizard Step 12.

Opening Session File
When the user opens a session file, the project directory specified in the selected session file is displayed in the project directory in the session file field. The user can change the project directory while keeping the rest of the settings by clicking the Change button and selecting another project directory. When some fields in the selected session file are invalid or missing, the checkbox Skip to last step and run the calibration using this settings is disabled, so the user cannot skip to the last step. However, the user can still proceed through the following steps and fill in the missing fields.

2. Model and Parameter Files
In this wizard step, the model and parameter files are chosen by selecting a Project Simulation. In OMS, a Project Simulation is a user-specified combination of a specific model, parameter set, and data set. The Simulations available in the Project Simulation directory are displayed in the Basic Simulation File panel. In the example below, the Basic Simulation prms_test_cc.jsa has been selected. The contents of this Simulation are displayed. The model, chosen from the Project Model directory, is prms_test_cc.jma.
In OMS, the parameter set can be divided into multiple files to enable easier access to specific groups of parameters. In this example, the efcarson_dates.csp file contains time-specific parameters, such as the start and end times of the simulation. The efcarson_files.csp file contains parameters specific to input and output files. The efc1108.csp file contains the basin-specific parameters for the selected model.
The efcarson_luca.csp file contains parameter information specific to a Luca application. An example of this file is shown below:

    @Properties, Parameter created by, od
    @Properties, Luca created by, od
    # Obs data set (static)
    @Property, obs_file, "${prj.dir}/data/efcarson.csv"
    @Property, obs_table, "efcarson"
    @Property, obs_column, "runoff[0]"
    # simulated data set (dynamic)
    @Property, sim_file, "${sim.output}/Luca/efc.csv"
    @Property, sim_table, "Output"
    @Property, sim_column, "basin_cfs"

The Properties tags identify this as a parameter file and a Luca-specific file. The Obs data set and simulated data set Properties define the input and output files from which observed and predicted variables are read for objective function computations in Luca. All variables in each file are available for selection when creating the objective functions used in Luca (see Step 10 below).

3. Rounds and Steps
In this Wizard Step, the user is prompted to set the number of rounds and steps.
Rounds and Steps
In the multi-step calibration technique, a step and a round are defined as follows:
Examples: Assume there are N rounds, each of which contains M steps. The program performs calibration as follows:
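The round and step structure can be pictured as two nested loops. The sketch below is an illustration under stated assumptions, not Luca's implementation: `run_sce` is a hypothetical stand-in for one SCE run, and the sketch assumes that each step starts from the best parameter values found so far.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def calibrate(params: Params,
              step_parameter_sets: List[List[str]],
              n_rounds: int,
              run_sce: Callable[[Params, List[str]], Params]
              ) -> List[Tuple[int, int, Params]]:
    """Run every step of every round, carrying parameter values forward."""
    history = []
    current = dict(params)
    for rnd in range(1, n_rounds + 1):
        for step, names in enumerate(step_parameter_sets, start=1):
            # Hypothetical SCE call: returns new values for `names` only.
            best = run_sce(current, names)
            current.update(best)
            history.append((rnd, step, dict(current)))
    return history


def fake_sce(current: Params, names: List[str]) -> Params:
    # Stand-in optimizer for illustration only: halves each selected value.
    return {n: current[n] / 2 for n in names}


history = calibrate({"jh_coef": 0.8, "soil2gw_max": 0.4, "ssrcoef_sq": 0.2},
                    [["jh_coef"], ["soil2gw_max", "ssrcoef_sq"]],
                    n_rounds=2, run_sce=fake_sce)
order = [(r, s) for r, s, _ in history]
```

With N = 2 rounds and M = 2 steps, `order` lists the four SCE runs in the sequence round 1 step 1, round 1 step 2, round 2 step 1, round 2 step 2.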
4. Calibration Period
The calibration period is the period for which parameters are calibrated. When selecting the calibration period, the user must also specify the Model Initialization Start Date. The model runs from the selected Model Initialization Start Date to the Calibration End Date and generates an output file containing values for this whole period. However, the objective function is calculated only from the simulated and observed values between the Calibration Start Date and the Calibration End Date. The initialization period is typically set to run the model through at least one wetting and drying cycle in an effort to remove any bias created by the user-defined initial model state variables.
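As a minimal sketch of this split, the snippet below keeps only the pairs that fall inside the calibration period. The function name `objective_window` and the placeholder data are assumptions for illustration; the dates are the ones used in this section's example.

```python
from datetime import date, timedelta


def objective_window(dates, sim, obs, calib_start, calib_end):
    """Keep only the (sim, obs) pairs that fall inside the calibration period."""
    return [(s, o) for d, s, o in zip(dates, sim, obs)
            if calib_start <= d <= calib_end]


init_start = date(1980, 10, 1)   # Model Initialization Start Date
calib_start = date(1981, 10, 1)  # Calibration Start Date
calib_end = date(1984, 9, 30)    # Calibration End Date

# The model is run for the whole span, initialization period included.
n_days = (calib_end - init_start).days + 1
dates = [init_start + timedelta(days=i) for i in range(n_days)]
sim = [1.0] * n_days   # placeholder simulated values
obs = [1.0] * n_days   # placeholder observed values

# Only the calibration period contributes to the objective function.
pairs = objective_window(dates, sim, obs, calib_start, calib_end)
```

Here the model output covers 1461 days, while only the last 1096 days enter the objective function; the first 365 days serve as initialization.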
In the example above, the model runs from 1980/10/1 to 1984/9/30, and the objective function calculation is performed using simulated and observed values for the period 1981/10/1 to 1984/9/30.

5. Parameter Selection
In Step 3, one Round and two Steps were specified. In Step 5, the parameters for each step are selected. A parameter set for each calibration step is defined by selecting parameters from the Available Parameters list. These parameters are automatically displayed based on the input parameter file specified in the Project Simulation selected in Step 2. The user creates a distinct parameter set for each calibration step by clicking the appropriate Step tab and then selecting the desired parameters. In the panel below, the Step 1 tab and the parameter jh_coef have been selected. The selected parameter is moved to the Selected Parameters list by clicking the Add button.
To select parameters for Step 2, the Step 2 tab is selected. In the example below, two parameters, soil2gw_max and ssrcoef_sq, have been chosen and moved to the Selected Parameters list by clicking on the Add button. Parameters can be added one at a time, or can be selected in multiples by holding down the Ctrl key when making parameter selections in the Available Parameters list.
6. Calibration Strategy
In Wizard Step 6, the user is prompted to do the following three things for the selected parameters in each step:
Calibration Types
Three calibration types are available for calibrating the parameter values:
Initial Parameter Values
The default parameter values shown in this Wizard Step come from the input parameter file defined by the Simulation. The individual values or the mean values of the parameters specified in this Wizard Step are used as the initial values of one of the points in SCE. If you do not want to specify the initial parameter values or mean values as one of the points, check the box Do not use values from parameter file. In this case, all points are generated by SCE.

Lower and Upper Bounds
The lower and upper bounds for each parameter in every step are used to generate points in SCE. In SCE, a set of selected parameters is treated as a point in N-dimensional space, where N is the number of parameter values in the parameter set. Points are randomly generated by SCE such that each parameter value is within its lower and upper bounds. (See also the brief description of SCE above.)

Restriction:
Actual Bounds
After the values for the lower and upper bounds are entered, the actual range is displayed. The actual bounds are exactly the same as the user-defined bounds if Use the individual values or Parameters are binary is selected for the use of parameter values during calibration. If Use the mean is selected, then the actual bounds are calculated as follows:
where
Learn how the equations above are obtained.

Saving Modified Parameter Values
When the initial parameter values are set by modifying the parameter values, the user may want to save the changes in a new parameter file. Luca offers this as an option. If a file is specified at the bottom of the Wizard Step 6 screen, the parameters whose values have changed are saved in that file when the user clicks the Next button.

The calibration strategy for the parameters selected in Step 2 is to Use the mean value. The soil2gw_max parameter is a distributed parameter with an individual value for each spatial hydrologic response unit. The generation of distributed parameters using the mean value is described below.
Generation of Parameter Values based on the Mean
If the mean value is calibrated instead of individual values, the parameter values must be regenerated from the calibrated mean value. Each value of a parameter can be generated by the following equation: (1) where
Equation (1) does not work if [original mean] = 0. To avoid this problem, some constant C is added to each variable in Equation (1). C is a constant that is large enough to ensure that the values of all variables are positive:
Therefore, the equation that determines the new value is:
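A minimal sketch of this regeneration step follows. It assumes that Equation (1) scales each original value by the ratio of the calibrated mean to the original mean, and that the shifted form adds C to every variable and subtracts it again afterwards; this form is an assumption inferred from the surrounding text, not a quotation of Luca's equation. The function name and the sample values are hypothetical, and C follows the definition given in this section (absolute value of the user-defined lower bound plus 10).

```python
def regenerate_from_mean(original_values, calibrated_mean, lower_bound):
    """Rescale values so that their mean equals calibrated_mean (assumed form)."""
    c = abs(lower_bound) + 10.0  # C as defined in the text
    original_mean = sum(original_values) / len(original_values)
    # The denominator cannot be 0 because C is at least 10 larger than
    # the magnitude of the lower bound.
    ratio = (calibrated_mean + c) / (original_mean + c)
    # Shift every value by C, scale, then shift back.
    return [(v + c) * ratio - c for v in original_values]


values = [0.0, 0.1, 0.2]  # hypothetical distributed parameter values
new_values = regenerate_from_mean(values, calibrated_mean=0.2, lower_bound=0.0)
new_mean = sum(new_values) / len(new_values)

# An original mean of 0 is handled because of the shift by C.
zero_mean = regenerate_from_mean([-1.0, 0.0, 1.0], calibrated_mean=0.5,
                                 lower_bound=-1.0)
```

Under this assumed form, the regenerated values keep their relative spread and their mean equals the calibrated mean.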
In Luca, C is defined as:

    C = [absolute value of the user defined lower bound] + 10

If the absolute value of the user-defined lower bound is added to the parameter values, none of them can be negative because no value is less than the lower bound. Note that 10 is added to C so that the denominator [original mean] + C can never be 0.

7. SCE Control Parameters
In this Wizard Step, SCE control parameters need to be defined for each step. The default values are automatically calculated based on the suggestions in Duan et al. (1994). The definitions and details of the SCE control parameters are discussed below. For more information on these parameters, the user is referred to Duan et al. (1994).
Definition of SCE Control Parameters
The SCE control parameters are defined as follows (see also the brief description of how SCE works above to get a better understanding of the SCE control parameters):
SCE Control Parameter Values
N = number of parameter values. This is the total number of values in the parameter set. For example, if the parameter set has two parameters where one contains 12 values and the other contains 1 value, then N is 13.

8. Number of Objective Functions
For each step, Luca allows the user to use the multi-objective function technique, in which the objective function value is calculated from multiple objective functions. In each step, the user must define the number of objective functions to be used in the multi-objective function.
9. Multi-Objective Function Setup
In this Wizard Step, the user is asked to set the following for the individual objective functions in the multi-objective function:
The user must specify whether the objective function value is to be minimized or maximized during the calibration by selecting Minimized or Maximized in the combo box.
Objective Function Type
Each objective function is defined below.
Normalized Root Mean Square Error (NRMSE):
This objective function value is considered good when it is small.

Nash-Sutcliffe (NS):
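For orientation, the textbook Nash-Sutcliffe efficiency from the hydrology literature can be computed as sketched below. That Luca uses exactly this form is an assumption; the function name and the sample values are hypothetical.

```python
def nash_sutcliffe(sim, obs):
    """Textbook Nash-Sutcliffe efficiency: 1 - SSE / spread of obs about its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for s, o in zip(sim, obs))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


obs = [50.3, 40.1, 55.5, 39.0]                        # hypothetical observed values
perfect = nash_sutcliffe(obs, obs)                    # simulation equals observation
mean_only = nash_sutcliffe([sum(obs) / 4] * 4, obs)   # simulation is the observed mean
```

A perfect simulation scores 1.0, and a simulation that only reproduces the observed mean scores 0.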
This objective function value is considered good when it is large (closer to 1.0).

Absolute Difference (ABS):
This objective function value is considered good when it is small.

Absolute Difference log (ABS log):
This objective function value is considered good when it is small.

Pearson's Correlation:
Refer to the function PEARSN in the book Numerical Recipes in Fortran (http://www.nr.com). This objective function value is considered good when it is large.

Weight for Each Objective Function
In Luca, the final objective function value based on multi-objective functions for each step is calculated as follows:
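Assuming the combination is a plain weighted sum (an assumption based on the symbols defined for this formula; whether Luca also normalizes by the sum of the weights is not stated here), the final value can be written as:

```latex
OF_f = \sum_{i=1}^{nOF} w_i \, OF_i
```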
where OF_f is the final objective function value, nOF is the number of objective functions for the given step, and w_i is the ith weight, associated with the ith objective function (OF_i). The user is prompted to set the weight for each objective function in every step.

Data Subdivide
The user can choose to use a subset of the observed data for model calibration. If the Use Data Subdivide file option is chosen, the user is prompted for an input file in which each record consists of a date and a value. When the value in the file for a given time step is equal to the Data Subdivide value, the data for that day are used in model calibration. For example, the user may only want to calibrate runoff on days with irrigation diversions, peak flows, or significant hydrometeorological events.

Data Subdivide File Format
The format of the data_subdivide file is:

    Year Month Day Value

where Value is the Data Subdivide value, which must be an integer.

Example:

    2000 10 1 1
    2000 10 2 1
    2000 10 3 2
    2000 10 4 2
    2000 10 5 2
    2000 10 6 1
    2000 10 7 3
    ...

If the Data Subdivide technique is used, the simulated and observed values of only the desired dates are used for the objective function calculation. Assume the Data Subdivide value is 2. Then, in the example above, the simulated and observed values for 2000/10/3 through 2000/10/5 are used, but those for 2000/10/1, 2000/10/2, 2000/10/6, and 2000/10/7 are not.

Restriction
If the Data Subdivide technique is used, the following must be satisfied:
Time Step
The time step specified in Wizard Step 9 is used to calculate an objective function value for simulated and observed values. The available time steps are:
The calibration period must be at least 1 year (excluding the model initialization period) to select Annual Mean.

Number of Days
If the user selects Daily, then the Number of Days must be chosen. A value n greater than 1 uses an n-day moving average for objective function calculation (in the example below, the n daily values are summed rather than averaged). If the Data Subdivide file is used, then Number of Days is automatically set to 1.

Example:

    2001 10 1 50.3
    2001 10 2 40.1
    2001 10 3 55.5
    2001 10 4 39.0
    2001 10 5 42.3
    2001 10 6 41.8
    ....
    2002 9 27 57.1
    2002 9 28 55.2
    2002 9 29 54.8
    2002 9 30 51.0

Assume you want to use the daily data above and Number of Days is set to 3. Then the program adds up the daily values of 2001/10/1, 2001/10/2, and 2001/10/3 (50.3, 40.1, and 55.5) to get 145.9, which is used as the value for 2001/10/1. The sum of the values of 2001/10/2, 2001/10/3, and 2001/10/4 is used as the value for 2001/10/2, and so forth. There are not enough values for the last two dates, 2002/9/29 and 2002/9/30, so no values are calculated for them. The values below are what is actually used for the calculation when the daily values above are used and Number of Days is 3.

    2001 10 1 145.9
    2001 10 2 134.6
    2001 10 3 136.8
    2001 10 4 123.1
    ....
    2002 9 27 167.1
    2002 9 28 161.0

Period
The Period field must be specified if the Monthly Mean, Mean Monthly, or Annual Mean time step is selected. Period allows the user to select a period of months whose monthly mean, mean monthly, or annual mean values are used for objective function calculation. For example, if Period is set to Oct. - Mar., then the monthly mean, mean monthly, or annual mean values for the period from October to March of each year are used for objective function calculation, and values from April to September are ignored. If Period is set to Oct. - Sept., then all monthly mean, mean monthly, or annual mean values are used.
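The Number of Days aggregation described earlier in this step can be sketched as a forward-looking window. The sketch follows the worked example, which sums the values in each window; the function name `n_day_sums` is hypothetical.

```python
def n_day_sums(values, n_days):
    """Sum each value with the following n_days - 1 values (forward window).

    The last n_days - 1 dates have no complete window and are dropped.
    """
    return [sum(values[i:i + n_days])
            for i in range(len(values) - n_days + 1)]


daily = [50.3, 40.1, 55.5, 39.0, 42.3, 41.8]   # 2001/10/1 to 2001/10/6
windowed = [round(v, 1) for v in n_day_sums(daily, 3)]
```

For these six daily values, `windowed` holds the four values listed for 2001/10/1 through 2001/10/4 in the example (145.9, 134.6, 136.8, 123.1).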
10. Simulated and Observed Variables
In this Wizard Step, the user is prompted to select the simulated variable from a list of variables that are produced by the model and saved in the model output file. The corresponding observed data must be identified, and the source must be indicated as either Input File or External Source. If the user chooses Input File, the user must select an available observed variable from the Observed Variable list. If the user chooses External Source, the user must supply the observed data file, which must contain data covering at least the calibration period, excluding the model initialization period. When External Source is selected, the observed data type (Value or Range) must be set. In this wizard step, the user is also asked to set the Data Missing Value for observed values.
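For the Range type, the observed value used in the objective function is derived from the simulated value and the bounds in the observed data file; the rule is described under Objective Function Calculation with the Range type Observed Data later in this step. A minimal sketch of that rule, using a hypothetical function name and the numbers from that section's example:

```python
def observed_from_range(sim, lower, upper):
    """Observed value used for the objective function under the Range type."""
    # Inside the range the simulated value is kept; outside, the nearer bound.
    return min(max(sim, lower), upper)


ranges = [(18.1, 22.0), (10.5, 14.9), (12.9, 16.0)]   # 2001/10/1 to 2001/10/3
simulated = [28.3, 12.6, 8.9]
observed = [observed_from_range(s, lo, hi)
            for s, (lo, hi) in zip(simulated, ranges)]
```

A simulated value inside the range therefore contributes no error for that date, while a value outside the range is penalized by its distance to the nearer bound.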
Observed Data File from External Source

Format for the Value type
If the observed data file contains an observed value for each date, it is the Value type. If the Value type is selected, the file format of the observed data must be selected from several choices:
    2001 10 5.5
    2001 11 9.1
    2001 12 15.2
    ...

    1 43.1
    2 50.4
    ...
    12 49.8

    10 55.1
    11 61.4
    12 49.8
    1 43.1
    ...
    9 49.8
    2001 2.2
    2002 3.05
    2003 1.4
    ...

NOTE: There is only one space between fields in the examples shown above, but there can be more than one space.

Format for the Range type
Observed data files of the Range type must contain lower and upper bounds for observed values. The file format varies depending on the time step selected for objective function calculation. In order to use the Range type, the same time step must be selected for all objective functions for a given step. If the Range type is selected, the format that the file must follow is displayed in the panel:
    2001 10 1 18.1 22.0
    2001 10 2 10.5 14.9
    2001 10 3 12.9 16.0
    ...

    2001 10 4.9 5.8
    2001 11 7.8 10.0
    2001 12 14.3 16.2
    ...

    1 40.8 44.2
    2 49.2 52.1
    ...
    12 49.3 50.8

    10 50.0 57.1
    11 57.23 64.0
    12 49.3 50.8
    1 40.8 44.2
    ...
    9 49.8
    2001 1.43 2.86
    2002 2.44 5.1
    2003 0.4 2.0
    ...

Objective Function Calculation with the Range type Observed Data
When the Range type is selected, the observed values used in the objective function calculations are determined from the range defined by the lower and upper bounds in the observed data file and the simulated value generated by the model. If a simulated value is within the range for a given date, the observed value is set equal to the simulated value. If it is less than the lower bound of the range, the observed value is set to the lower bound. If it is greater than the upper bound, the observed value is set to the upper bound.

Example: Suppose the observed data file contains the following:

    2001 10 1 18.1 22.0
    2001 10 2 10.5 14.9
    2001 10 3 12.9 16.0
    ....

and the simulated values generated by the model are:

    2001 10 1 28.3
    2001 10 2 12.6
    2001 10 3 8.9
    ....

Then the observed values that will be used for objective function calculation are:

    2001 10 1 22.0 (the same as the upper bound)
    2001 10 2 12.6 (the same as the simulated value)
    2001 10 3 12.9 (the same as the lower bound)
    ....

11. Output Files & Save
Output Files
Three different types of output files are generated in the <work directory>/output/ directory during the Luca calibration:
In Wizard Step 11, the name of the model output file is displayed. To keep a record of the parameter values and the objective function value for each model execution during the calibration, the user must check the Generate Calibration Trace File checkbox and specify the calibration trace file name. The user is also asked to set the name of the output parameter file for each step. Note that a separate output parameter file is generated for each step of each round, while only one model output file and one calibration trace file are generated during the calibration. See below for details on each output file.

Session Name
The session name is used to distinguish this session from others and is used as a prefix for the names of the files Luca creates (e.g. output parameter files). This also allows several calibration procedures to run simultaneously, as long as each session has a different session name, because no two sessions will write the same file.

Example: Suppose the following:
Then the names of the output files will be:
Model Output File
The model output file, the Statvar file, is not saved for each step of each round. Instead, the model keeps writing to the same Statvar file during calibration. (If the session name is "calib1", then the name of the Statvar file is "calib1_luca.statvar".) The user can always reproduce the Statvar file for any step of any round using the output parameter files created by Luca.

Calibration Trace File
The calibration trace file is a text file containing every objective function value calculated during the calibration, whether good or bad, together with the corresponding parameter values selected for the calibration. SCE calculates the objective function value many times with many different parameter values in the parameter space in order to find the best values. All of these, for all steps of all rounds, are stored in this one file. The file shows how the objective function value changes across the parameter space. It can be very large if the numbers of steps and rounds are large, so the user must make sure that enough storage is available.

Calibration Trace File Format:
Each line contains:

    Count, Parameter Value 1, Parameter Value 2, ...., Round Number, Step Number, OF Value 1, OF Value 2, ....

Note:
Output Parameter File
The user can set the name of the output parameter file for each step. Luca keeps the parameter files of all steps of all rounds. The output parameter files are placed in the <work directory>/params/ directory. If the name is set to 'step1.par' for Step 1 and the session name is 'calib1', then the actual parameter file name for Step 1 of each round will be:
12. Calibration Run
In the last Wizard Step, the user must click the Run Calibration button at the bottom of the panel to run SCE for every step of every round. While the calibration is running, Luca shows the current calibration status, and the best values and status for each step of each round, including the current best parameter values and objective function values. After the calibration ends, the SCE end statement is displayed in the box on the right of the screen. The end statement explains why SCE terminated for a given step of a round. The calibration is summarized in a summary file.
Calibration Current Status
The fields in Calibration Current Status display how far the calibration has run. During the calibration, the model is executed many times, and the objective function value is calculated every time the model is executed. The Objective Function Value field shows the latest objective function value, whether good or bad.

Best Values and Status for each Step of each Round
The fields associated with each step of each round, including the following, are displayed by clicking on a node in the tree structure on the left-hand side of the screen:
SCE End Statement
When SCE finishes a step of a round, one of the following is displayed as an end statement:

Optimization search terminated because the limit on the maximum number of trials, N, was exceeded. Search was stopped at sub-complex M of complex L in shuffling loop K.
(Note: N, M, L, and K are substituted with real numerical values.) This end statement is displayed if SCE terminates because the number of model executions in SCE reaches the maximum number of model executions. N is the user-defined value of the SCE control parameter Maximum number of model executions.

Optimization terminated because the criterion value has not changed N percent in M shuffling loops.
(Note: N and M are substituted with real numerical values.) This end statement is displayed if SCE terminates because the criterion value (the best objective function value) that SCE finds in each shuffling loop has stopped changing appreciably. A criterion value that no longer varies means that no parameter values better than the current best values can be found. The criterion value is considered unchanged if the percent change between the best criterion value of the current shuffling loop and that of M shuffling loops earlier is less than N percent. N and M are user-defined SCE control parameter values. N corresponds to the value of Percentage of the criterion value, and M corresponds to the value of Shuffling loops in which the criterion value must change by given percent before optimization is terminated.

Optimization terminated because the population has converged into N percent of the feasible space.
(Note: N is a normalized geometric mean of parameter ranges.) This end statement is displayed if SCE terminates because all points (parameter sets) generated in SCE have converged into a very small region. SCE terminates if the points are gathered in an area that is less than or equal to 0.1% of the feasible space, that is, the space within the lower and upper bounds of the parameters.
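The check behind the second end statement (criterion value not changing) can be sketched as follows. The function name is hypothetical, and expressing the percent change relative to the earlier criterion value is an assumption; the normalization Luca actually uses is not stated here.

```python
def criterion_stalled(best_per_loop, loops_back, percent):
    """True if the best criterion value changed by less than `percent`
    between the current shuffling loop and `loops_back` loops earlier."""
    if len(best_per_loop) <= loops_back:
        return False                      # not enough history yet
    current = best_per_loop[-1]
    earlier = best_per_loop[-1 - loops_back]
    if earlier == 0:
        return current == 0               # avoid dividing by zero
    # Assumed normalization: change relative to the earlier value.
    change = abs(current - earlier) / abs(earlier) * 100.0
    return change < percent


# Hypothetical best criterion value after each shuffling loop.
history = [10.0, 6.0, 5.0, 4.999, 4.998, 4.998]
stalled_now = criterion_stalled(history, loops_back=3, percent=0.1)
stalled_before = criterion_stalled(history[:4], loops_back=3, percent=0.1)
```

With this history, the change over the last three loops is about 0.04 percent, so the search would stop, whereas two loops earlier the change was still about 50 percent.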
In the convergence end statement, the value N is the percent area into which the points converged.

References
SCE Related Papers
Step-Wise, Multiple-Objective Calibration Related Papers