Table of Contents
List of Figures
List of Tables
List of Equations
PET (Parameter Estimation Toolkit) is a graphical user interface for exploring the parameter space of a mathematical model. PET is specifically designed for biological models based on ODEs (Ordinary Differential Equations) using SBML (Systems Biology Markup Language) to define the model. PET can be applied to a much broader range of models as long as they can be expressed in SBML (e.g., chemical kinetic models). This manual describes how to use PET and provides tutorials and examples of different uses of PET.
Table of Contents
This chapter describes concepts that PET uses in the user interface or data structuring that the user may find useful to understand and tasks that the user may wish to perform using PET.
The concept discussed in this section is central to PET and is the primary reason for the creation of PET. Please read this section. Initial conditions are modified in the same way as parameters, unless otherwise noted.
This section explains how PET determines which values of parameters to use when simulating the model. This is not about parameter estimation. This is about how the user specified basal sets, simulations, transforms, and estimator settings determine which value of parameters and initial conditions PET uses for any particular simulation.
When PET performs simulations, PET takes parameters from the basal set, modifies them according to the simulation setup, and passes those parameters to the transform for simulation (initial conditions are handled in the same way). This process is shown abstractly in Figure 2.1, “Overview of How Parameter Values Are Determined”.
Figure 2.2, “Parameter Determination Example 1” shows the values of the parameters through the process. In this example PET is simulating 5 experiments (the simulation setup for each experiment is shown) using the one basal set Foobar. Notice how parameters assigned values in a simulation setup override the values from the basal set. Also, if the simulation setup does not specify a change to the parameters, the parameter value from the basal set is used. Finally, each simulation is done independently; parameters are assigned values from the basal set, modified by the simulation setup, passed to the simulator, then reset to the values in the basal set in preparation for the next simulation.
Multiple experiments and basal sets can be selected for simulation. They are selected by checking the box left of their name in the "Basals" and "Simulations" tabs. If the basal sets Foobar and Joe are selected to be simulated along with all five simulations in Figure 2.2, “Parameter Determination Example 1”, then PET will first set the parameters to the values in one basal set, say Foobar. Next, PET will simulate the model for each of the selected simulation setups (parameters being reset to those from the basal set between each simulation). After all simulations are performed using one basal set, PET will set the parameter values to those from the next basal set: Joe. The process repeats until all selected basal sets and simulation setups are used to generate a plot. The results of the simulations for the basal sets and simulation setups from Figure 2.2, “Parameter Determination Example 1” and Figure 2.3, “Parameter Determination Example 2” can be seen in this PDF report. For this example the PET file used is parameter_change_example.pet.
In Figure 2.3, “Parameter Determination Example 2” the value of krev is modified by the simulation setup Double Rev. The value of krev is doubled with the expression 2*krev.
The process for determining parameters while estimating parameters (using the automated parameter estimator) is similar to the process used when simulating (see the section called “While Simulating”). The difference lies in the way the "initial" parameter values are chosen. Instead of reseting parameter values to those of the basal set between each simulation, PET resets the paramter values to those of the best parameter set for the current iteration of the parameter estimation algorithm (call this the "current best parameters"). The basal set selected in the "Estimator Settings" tab is used as the initial guess and/or the default values (default values are used for parameters that are not estimated).
Using the PET file parameter_change_example_pe.pet and Figure 2.4, “Parameter Determination During Parameter Estimation”, this example illustrates how parameters are determined. For this parameter estimation run, only the parameter k is estimated. The other parameter, krev and the initial conditions are set to their default values between each simulation; these values are determined from the basal set named "Initial Guess" and are a = 1, b = undefined, and krev = 0. During the first iteration of the parameter estimator k = 0.5. k is then modified by the simulation setup to have a value of 0.25. The transform "Time Series w/ Presimulation" executes and returns the transform output to the parameter estimator. The parameter estimator adjusts all parameters being estimated (in this case only k). Let's say k is adjusted to be 0.72. The current best parameters are now a = 1, b = undefined, k = 0.72, and krev = 0. The simulation setup modifies k to be 0.36, and the loop continues.
PET implements a linear undo/redo history. Every user action that modifies the user's data is saved as an "action" in the undo/redo history. The user can then move between states of their data by using the the undo and redo buttons. Any click of the undo button changes the data to a state before the last action was taken. A redo changes the data to reflect the state after that action.
All "future" history is lost when the undo button is clicked followed by a user edit of the data. For example, if a user deletes the simulation "My Sim", undoes the delete, then renames the simulation "Another Sim" to "Last Sim", PET will no longer have the delete action of "My Sim". In this example the redo button will not be active.
This section describes how to get your SBML model into PET, how to modify the SBML file that is in PET, what it means for SBML to be in PET, how PET uses the SBML to get basal sets and default settings, and how PET does not directly edit or change the SBML (that is left to the JigCell ModelBuilder).
PET requires SBML as a source for parameter names, species names, and a model definition. The parameter and species names are used directly in PET when creating basal sets, defining simulations, defining experimental data, configuring plots, setting up parameter estimation runs, etc.. The model definition is passed to the simulator or parameter estimator when a user requests simulation or parameter estimation.
PET does not edit or display SBML. Instead, it relies on the JigCell ModelBuilder (packaged with PET and integrated via the Edit SBML button) to edit and display SBML.
Though PET does not edit or display SBML, PET stores SBML in PET files when they are saved. For example, a user may create a new PET file, import their SBML, create a couple basal sets, save their PET file, close PET, and the SBML will be saved inside their PET file. The next time the user opens that PET file they will be able to access the SBML as if they just imported it (e.g., to edit the SBML the user may click the Edit SBML Button).
See the section called “Basal Sets” for how parameters and species are loaded from SBML into a basal set.
SBML files can be imported by clicking the Import SBML button located in the toolbar. NOTE: At the time of this writing this is the only way to import SBML files; the File->Import menu allows users to select an SBML file but will not import the file. When an SBML file is imported it overwrites any previously imported SBML (if any). In the case where SBML is overwritten, PET checks
If PET finds any changes the following takes place:
The current SBML can be editted by clicking the Edit SBML Button. PET will then open the JigCell ModelBuilder with the current SBML file loaded and ready for editting.
When finished editting, the SBML must be saved to a file within the ModelBuilder. The file can be anywhere with any name. The SBML will then be sent back to PET and automatically imported as if the Import SBML button was pressed.
When the SBML has been changed (through importing or editting) PET may display the Parameter and Species Mapper (or Mapper for short) to prompt the user how to map the old parameters to the new parameters. A map of one parameter to another tells PET that the two are equivalent and to use the parameter name on the right hand side of the map for display. Species are also mapped this way, and anywhere this section refers to "parameter", "species" can be substituted unless otherwise stated.
The Mapper is color coded to help identify which parameters are deleted, created, or mapped. There is also a key in the user interface briefly explaining the meaning of the color coding.
The color codings and their meaning are as follows:
The
parameter vcp is deleted. This means it appears in the old SBML but not in the
new. Any occurence of vcp in the PET file will be removed.
The
parameter k23 is created. This means it appears in the new SBML but not in the
old. k23 will be added to all basal sets and to the list of available
parameters for simulations, data, and parameter estimation. If k23 were a
species it would also be added to the list of plotables.
The
parameter vwppp is mapped (i.e., renamed) to vwppp_triple. All occurences of
vwppp in the PET file will be renamed to vwppp_triple.
The maps may be changed or parameters set to be created or deleted by clicking
the row then clicking the right or left side of the arrow (
). The new parameter mapping can be typed or a drop down list of all available
maps can be accessed by clicking
. The screenshots
Figure 2.6, “Mapper Left Drop Down” and
Figure 2.7, “Mapper Right Drop Down” show
the drop down lists.
This section describes what basal sets are, how they are used by PET, how they are created (by PET and users), and how users can effectively use them.
This section describes what simulations are, the inheritence structure of simulations, how simulations can be used to duplicate experimental protocols in silico, what happens when parents are deleted, and more.
This section describes what happens when the simulation button is clicked and the results.
This section describes how plots are created, how the user can change ranges, how the user can change which data is plotted, how the user can change plot titles and labels, etc..
This section describes how to create PDF reports containing plots and how those reports are layed out on paper.
This section describes how PET can be used to manually explore parameter space.
This section describes the requirements for parameter estimation, how to select parameters to estimate, how to set ranges and scales for parameters being estimated, the meaning of and how to set settings for the optimization algorithms, and how to set weights for experiments and experimental data.
Every experimental datum is assigned a weight. Effectively this is a relative importance of that datum compared to the other data. Larger values for the weight cause the optimization algorithm to try "harder" to fit the model to that datum when adjusting parameters. See the section called “Objective Function” for details on how the optimization algorithms use the weights when fitting the model to experimental data. This section talks about default values for the weights and the information in the "Weights and Deltas" tab.
In the Weights and Deltas tab (Figure 2.8, “Weights and Deltas Tab”) there is a column for "Weight", "X Weight", and "Y Weight". These are editable and for the user to set. The column "Experiment", "X Data", and "Y Data" are there as information for the user to reference when setting the weights. The default value for weights in the "Weight" column is 1. The default value for weights in the "X Weight" column is
where x is the value from the "X Data" column in the corresponding row for a give "X Weight". Similarly, the default weight for y data is
The value of 0.2 in the second row under experiment "Fig. 3C (1995) Kumagai and Dunphy" in the "X Weight" column comes from 1/(4+1), where the 4 is the value from the "X Data" column of the same row.
The final value of the weight used in the objective function (the section called “Objective Function”) is a combination of values from the "Weight" column and the corresponding "X Weight" or "Y Weight". Each of the x weights are multiplied by the weight from the experiment the datum comes from. For the first experiment in Figure 2.8, “Weights and Deltas Tab”, all the x weights are multiplied by 1 before passed to the optimization algorithm for use in the objective function. For the second experiment "Fig. 3C M-phase (1995) Kumagai and Dunphy", all the x weights are multiplied by 2 before being passed to the optimization algorithm. Therefore, 0.2 becomes 0.4, 0.0588... becomes 0.1176..., and 1 becomes 2.
An equation can be entered to the right of "X Weight Equation:" defining how to calculate the value for the x weights. When the "Set X Weights" button is clicked, all the weights will be calculated using this user supplied equation. The equation can include reference to the x data using the character "x". The equation must be a valid Perl expression. Similarly, the y weights can be set. Figure 2.9, “User Defined Weights” shows an example where the weights are defined by an equation. Note: the weights can be individually specified and changed by clicking on the values. If an equation is specified and the "Set ... Weights" button is clicked, then any previously value will be lost.
Table of Contents
This chapter details how to use the PET user interface and the function of the user interface components.
Clicking this button opens a file chooser and waits for the user to select an SBML file.
Clicking this button launches the JigCell ModelBuilder with the current SBML or empty SBML if there is no current SBML.
Table of Contents
This chapter describes the algorithms used for optimization and transforms. The settings for these algorithms are described and some tips on how to effectively use the algorithms to get desired results.
The Time Series transform is the most basic of transforms. For this transform the model is simulated and values of a species are returned for all requested values of time. The user specifies which species is to be returned in the "Transforms" tab. PET automatically determines the values of time used when requesting values of the species. When the transform is plotted, PET chooses a couple hundred time points within the plot range for the x-axis. When parameter estimation is run, PET chooses one time point for each experimental datum from the "Data" tab.
In the following example, species b is selected as the output of the transform for the simulation named "basic" (See Figure 4.1, “Time Series Transform Configuration”). The model for this example consists of the single reaction shown in Equation 4.1, “Reaction Network”. The rate of the reaction is k · a (k=1.0), and the values of the initial conditions a0 and b0 are 1.0 and 0.0, respectively. The experimental data can be seen in Figure 4.2, “Time Series Experimental Data”.
The output of the Time Series transform for the simulation named "basic" is shown in Figure 4.3, “Time Series Transform Plot”. Comparing Figure 4.3, “Time Series Transform Plot” to Figure 4.4, “Time Series Plot”, one can see that output of the simulation of this model for species b is identical to output of the Time Series transform. This is precisely the purpose of the Time Series transform; simple as it is. Other transforms will have more complicated calculations that process the simulation output.
The Time Series with Pre-simulation transform adds a "pre-simulation" to the Time Series transform; otherwise, the Time Series with Pre-simulation transform behaves exactly like the Time Series transform. A pre-simulation is run before the main simulation. The pre-simulation is selected (See Figure 4.5, “Time Series With Pre-simulation Transform Configuration”) from the available simulations defined in the Simulations tab. The pre-simulation is simulated to the specified time. Then, the main simulation is simulated using the initial conditions from the pre-simulation and the parameter values from the main simulation.
In the following example, species b is selected as the output of the transform for simulation "Using Reverse as Presim". Before the output of the transform, b, is generated, the pre-simulation "Reverse" is run for 1.7 units of time. Note that "Reverse" is just a regular simulation, defined like any other simulation. The only difference between "Reverse" and any other simulation is that we select it in the Transforms tab as the pre-simulation for the simulation "Using Reverse as Presim". These settings can be seen in Figure 4.5, “Time Series With Pre-simulation Transform Configuration”.
The model for this example contains two reactions, shown in Equation 4.2, “Reaction Network”. The initial conditions and parameters are shown in Table 4.1, “Parameters and Initial Conditions”. For this example, those initial conditions and parameters are in the basal set "from SBML".
When the transform is run for the simulation "Using Reverse as Presim" using the basal set "from SBML", the following steps are taken by PET:
Figure 4.7, “Pre-simulation and Simulation” shows how a plot of a and b would look if we could see both the pre-simulation and the simulation. Note that PET only plots the simulation (the part after the blue line). Figure 4.7, “Pre-simulation and Simulation” shows the initial conditions and parameters for both the pre-simulation and the simulation. Notice how the parameters are "reset" for the simulation while the initial conditions of the simulation are taken from the final conditions of the pre-simulation.
The Steady State or Fixed Time transform performs a simulation for every experimental datum. Before each simulation the "Independent Variable" is assigned the value from the experimental datum and the final value of the "Dependent Variable" is returned as the transform output. Each simulation is run to the time specified in the transform configuration's "Time" field (see Figure 4.8, “Steady State or Fixed Time Transform Configuration”). When "Fixed Time" is selected, the transform will run to "Time" and return the result. In this case, "Steady State Tolerance" is ignored. When "Steady State" is selected, the transform will still run to "Time", but the output of the transform will be undefined if the model is not at a steady state. Steady state is defined as the L2-norm of the vector of the values of the ODE right hand sides (RHS) being less than the value of "Steady State Tolerance" (this is discussed more below).
The Steady State or Fixed Time transform will run a single simulation for each experimental data point when parameter estimation requests transform output. The independent data (seen in the "Independent" column of Figure 4.9, “Experimental Data”) are assigned, one per simulation, to the "Independent Variable" (seen in Figure 4.8, “Steady State or Fixed Time Transform Configuration”) before the simulation is started and before the simulation parameter and initial condition changes are applied. When plotting, PET will give the Steady State or Fixed Time transform several hundred points for values of the "Independent Variable". One simulation will be run for each point. Again, taking the final value of "Dependent Variable" from the simulation and returning the value as the transform output.
In the following example, species b is selected as the output of the transform for simulation "A Sim" (based on Figure 4.8, “Steady State or Fixed Time Transform Configuration”). During parameter estimation, for each of the experimental data for "A Sim" (seen in Figure 4.9, “Experimental Data”) a simulation of the model is run to 100 time units and the value of b, at 100 time units, is returned as the transform output. Since there are 3 experimental data points, each request by the parameter estimator for transform output will result in 3 simulations: one with Total=1, one with Total=5, and one with Total=10 (this is approximately what happens, see the section called “Optimization Algorithms” for more details on how parameter estimation requests transform output). The step-by-step process, for this example, by which the Steady State or Fixed Time transform will generate output for plotting can be seen below (this example uses the simulation "A Sim" and the basal set "from SBML").
The model for this example contains two reactions, shown in Equation 4.3, “Reaction Network”. The rate of each reaction is listed to the right of the reaction. The initial conditions and parameters are shown in Table 4.2, “Parameters and Initial Conditions”. For this example, those initial conditions and parameters are in the basal set "from SBML". The parameter Total is the conserved quantity a+b.
When using the "Steady State" option, PET will determine if the model is at steady state before returning this transforms output. The L2-norm of the vector of the right hand sides of the differential equations is calculated and compared to the "Steady State Tolerance" (see Figure 4.8, “Steady State or Fixed Time Transform Configuration”). If the L2-norm is less than the tolerance then the transform returns the value of the "Dependent Variable". If the L2-norm is greater than the tolerance the transform returns "Undefined" for the current value of the "Dependent Variable". "Undefined" may cause error messages in simulations and parameter estimation similar to "Transform Error", "Transform Undefined", etc..
For this example the simulation is run to 100 time units (as seen in Figure 4.10, “Timeseries to 100 time units”). The values for the dependent variable b are defined. We can see this be taking the last value for a and b and calculating the L2-norm. The last value for a is 0.3333… and b is 0.6666…. Plug these numbers into the equations for the right hand sides (RHSs can be seen in Equation 4.4, “Right Hand Sides”) and we have -1.0*0.3333+0.5*1.0*0.6666 = 0.0000 for da/dt. Similarly we have 0.0000 for db/dt. We take da/dt square it and add it to the square of db/dt and get 0.0000. Take the square root and we still have 0.0000. This is less than 1.0e-10 (the "Steady State Tolerance"). Therefore, by definition, we have a steady state.
For illustration let's assume 1.0 as the "Time" for the Steady State or Fixed Time transform. In this case a timeseries plot would look like Figure 4.11, “Timeseries to 1 time unit”. The last values of a and b are 0.48 and 0.52, respectively. Pluggin these numbers into the RHSs yields da/dt = -0.22 and db/dt = 0.22. -0.22 squared equals 0.0484, 0.22 squared equals 0.0484, add the two numbers to get 0.0968, and take the square root to get 0.31. 0.31 is greater than 1.0e-10. Therefore, by definition, this is not a steady state. Note that poor choices for the "Steady State Tolerance" can cause a steady state to be detected when there is none.
Finally, the transform output for this example can be seen in Figure 4.12, “Steady State or Fixed Time transform plot”.
This transform plots the time a species (variable) takes to activate with respect to a parameter. Different values of the parameter will cause the activation time to change. Using an example from the frog egg extract model of Novak and Tyson, MT (total MPF) is set to the values 0.19, 0.31, and 0.44. As seen in Figure 4.13, “Timelags for MPF Activation”, the time for MPF to become mostly active varies with respect to MT. The timelag transform takes these independent simulations and calculates the timelags from each simulation. The result can be plotted as timelag vs. parameter. Figure 4.14, “Timelag vs. MT” shows timelag for MPF activation vs. MT. The timelags in Figure 4.13, “Timelags for MPF Activation” are approximately 60 minutes, 25 minutes, and 15 minutes for MT=0.19, MT=0.31, and MT=0.44, respectively. These points are on the curve plotted in Figure 4.14, “Timelag vs. MT” along with all the timelags for other values of MT (the black dots are the experimental data, not to be confused with the "points" just mentioned).
The configuration for the MPF example can be seen in Figure 4.15, “Timelag Transform Configuration”. MPF, M, is set as the "Variable (Species)"; this is the species that activates (or inactivates) after some timelag - it will be plotted on the y-axis of the transform plot. The timelag depends on the parameter MT, therefore, this parameter is set as the "Independent Paramter"; it will be plotted on the x-axis of the transform plot. The species will be set to the initial value specified in "IC Parameter" (where IC stands for "initial condition") before each simulation. The time lag will be measured as the time it takes the species to reach the value of the "EFC Parameter" (EFC stands for "Expected Final Concentration"). The species may never reach the EFC exactly; therefore, the "EFC Tolerance" is used to specify how close the species must be to the EFC when the timelag is measured. For this example a value of 0.5 is assigned to EFC tolerance. In this example, the timelag is measured to the point where the species, M, is half the EFC. Sometimes the species will never reach the EFC. The model is simulated to steady state and the species is compared to the EFC at steady state. If the species reach the EFC within the tolerance then the data is defined, calculated, and returned. Otherwise, the data is set as undefined and the transform returns. The setting "Steady State Tolerance" tells the transform when to assume the model is at steady state.
The threshold transform is related to the timelag transform. The threshold transform looks for the point at which a species is no longer able to reach the EFC (see the section called “Timelag”). The EFC is the expected final concentration of the species of interest ("Variable (Species)"). For the frog egg extract model of Novak and Tyson the concentration of active MPF is expected to increase to approximately equal the total concentration of MPF. This does not happen if the total concentration of MPF is too low. The point at which the total concentration of MPF, MT, is no longer high enough for MPF to reach its EFC is the threshold value of MT. Just like in the timelag transform, the threshold transform does not look for when MPF reaches the EFC, instead the threshold transform looks for when MPF crosses a value close enough to the EFC; this value is the value of the "Threshold Parameter". For this example it is equal to half the value of total MPF, MT. MT is specified as the "Control Parameter"; it will be varied and the final concentration of MPF will be measured and compared to the "Threshold Parameter".
The threshold transform uses a bisection algorithm and tries to bracket the threshold then home in on the precise value. The user supplies a guess for the lower and upper bracket of the threshold: "Lower Guess" and "Upper Guess". The threshold transform tests that this guess does bracket the solution. If not, the lower guess is halfed or the upper guess is doubled until the solution is bracketed. The user supplies a maximum upper guess ("Max Upper") so that the algorithm will run in finite time even if there is no upper bound (and therefore no threshold to be found). The upper and lower brackets are bisected and one of them is replaced by the mid-point between the two. The new bracket is such that the solution is still within the brackets. When the upper and lower brackets are within "Relative Tolerance" of each other, the algorithm terminates and returns the threshold value as the mid-point between the brackets. The "Absolute Tolerance" is used to determine what numbers are treated as equivalent to zero. If the "Lower Guess" is within "Absolute Tolerance" of zero then it is considered zero. Same goes for the upper guess (which may decrease with any iteration of the algorithm). The value of the variable ("Variable (Species)") is taken from the steady state of a simulation using the current value of the "Control Parameter". The model is considered at steady state when the L2-norm of the species' rates is less than "Steady State Tolerance". The model will not be simulated longer than "Max Time".
The "Direction" setting specifies whether the variable should be high or low when the control parameter is above or below the threshold. A direction of "Positive" means when the control parameter is above the threshold, the variable is high (above the threshold parameter). If a direction of "Negative" is selected, the variable would be low in this case.
Let's take a moment to distinguish "Threshold Parameter" from the threshold being searched for. The threshold is the point at which the variable crosses the value of the threshold parameter. This point occurs for a specific value of the control parameter. This value of the control parameter is referred to here as the threshold.
When plotted the threshold plots the steady state value of the variable with respect to the control parmeter. This can be seen in Figure 4.17, “Threshold Transform Plot”
The User Defined transform can be used to specify Fortran 95 code to be executed as the transform. There are two fields that must be completed: the "Name" field and the text area where the Fortran source code goes. The name of the transform must be a valid Fortran 95 subroutine name: it must start with a character, contain only characters, numbers, and '_' (no spaces are permitted). Figure 4.18, “User Defined Transform” shows these fields and an example user defined transform. A real world example of a user defined transform can be found in the section called “Time Series User Defined Transform”.
This transform is written to a Fortran source file and compiled before simulating or estimating parameters. The code is wrapped in a subroutine and would look something like
SUBROUTINE My_transform( XDATA, YDATA, DEFINED )
USE MODEL_MOD
USE LSODAR_MOD
USE REAL_PRECISION
USE TRANSFORMS
USE MOD__New_u20Mechanism_
IMPLICIT NONE
REAL (KIND=R8) :: XDATA(:), YDATA(:,:)
INTEGER :: DEFINED(:)
! User defined code:
integer :: i
do i = 1, size(xdata)
ydata(i,1) = sin(xdata(i))
defined(i) = 1
end do
! End user defined code
END SUBROUTINE My_transform
There are many variables, functions, and subroutines accessible by this user written transform. The full Fortran 95 language is available to the user written code.
Error: Symbol 'dummy' at (1) conflicts with the same name in an encompassing program unitthen the name of the transform or one of the names of a local variable conflicts with an existing name. Rename the variable or transform and this error will be resolved.
The subroutine arguments are used as follows:
All the species and parameters defined in the model are available in user defined transforms. The names will mostly be the same as the names used in the rest of PET, except there is an underscore '_' appended to the name. If the model contains the species and parameters MPF, Cdc25, km, kmd, kcm, kcdm, etc., then the names MPF_, Cdc25_, km_, kmd_, kcm_, kcdm_ will be available in the user defined transform.
There are many subroutines and functions available for user defined transforms. Here is a partial list of the subroutines of most value:
The get and set routines of lsodar (lsodar_get and lsodar_set, respectively) are used to set simulator settings, initial conditions, and retrieve the simulation output. Of particular note are the setting and getting of species values and the setting of the simulation time. Setting the simulation time to 144 can be done with the code
call lsodar_set( tout = 144 )
Setting the initial condition for MPF to 1 can be done with the following code. Note that this code sets all the initial conditions to the current value of each species. At the start of your transform the values will be set by the basal set and the simulation setup; any modification to the species within the transform will affect the initial conditions when this code executes. See the section called “How PET Determines Initial Condition Values” for a discussion of how initial conditions are determined in PET.
MPF_ = 1 call lsodar_set( y = model_vars )
Getting (or retrieving) the final value of MPF after a simulation and setting the temp variable temp to this value can be done with the code
call lsodar_get( y = model_vars ) temp = MPF_
There are many simulator settings available. Some important and commonly used settings are listed below.
| 2 : no root was found, and tout was reached. |
| 3 : a root was found prior to reaching tout. |
| -1 : excess work done on this call (perhaps wrong jt). |
| -2 : excess accuracy requested (tolerances too small). |
| -3 : illegal input detected (see printed message). |
| -4 : repeated error test failures (check all inputs). |
| -5 : repeated convergence failures (perhaps bad jacobian supplied or wrong choice of jt or tolerances). |
| -6 : error weight became zero during problem. (solution component i vanished, and atol or atol(i) = 0.) |
| -7 : work space insufficient to finish (see messages). |
PET supports two optimization algorithms: ODRPACK95 and VTDirect. These algorithms are integrated into PET's underlying engine, Biopack, for estimating parameters. In some cases these algorithms perform very well at adjusting and discovering parameter values that fit experimental data. In other cases they fail to find any reasonable parameter values. Although there is never a guarantee that an optimization algorithm can find optimal parameters in finite time, this section will help you understand the algorithms and their settings so that PET will perform its best when searching parameter space for you.
There are three options on which algorithm to use. The radio buttons at the top right of the "Estimator Settings" tab (see Figure 4.19, “Estimator Settings Tab”) are used to select which algorithm to use. The next three sections describe each of these choices, how the algorithms work, and what their settings mean.
The local optimization algorithm takes an initial guess defined in the "Default Values / Initial Guess" field of the "Estimator Settings" tab (see Figure 4.19, “Estimator Settings Tab”). The parameter values are assigned to the values from this basal set and an objective function is evaluated at this point. The objective function - described in more detail later - is a measure of the goodness of fit between the model and the experimental data. A reduction in the objective function value corresponds to a better fit to experimental data. ODRPACK95 calculates the gradient of the objective function and changes the parameter values in the direction of the steepest descent of the gradient (i.e., maximal reduction of the objective function value). The new value of parameters will give a smaller, and therefore better, objective function value. The process repeats until some stopping criteria are met (see the section called “Convergence”). The final parameter values are returned as the "best" fit to experimental data. The values may or may not actually be the best fit, but they are the best ODRPACK95 could find. This is a rough outline of how ODRPACK95 works. To understand the algorithm more completely consult a numerical analysis textbook on trust region Levenburg-Marquardt optimization or the ODRPACK95 paper published in ACM TOMS (Transactions On Mathematical Software). The remainder of this section describes concepts necessary to understanding how to set the parameters and weights for ODRPACK95.
The objective function is the weighted sum of the squares of the orthogonal distances between the experimental data and the model simulations (transform output). The orthogonal distance for any single datum is defined as the length of the straight line from the experimental data to the model curve, where the line is perpendicular to the tangent line. Figure 4.20, “Orthogonal Distance” shows this orthogonal line, as the red line, for one particular point. The components of the orthogonal distance are epsilon and delta: the sides of a right triangle where the orthogonal distance is the hypotenuse. Delta represents the error contribution along the x-axis, while epsilon represents the error contribution along the y-axis. Intuitively, the orthogonal distance represents the shortest distance between the model simulation and the experimental data. For each point these distances are calculated, then squared, then summed together (see Equation 4.5, “Objective Function”). The resulting number is the value of the objective function for a particular set of parameters.
where

The objective function just described is used when the user selects "Orthogonal Distance Regression" under the Estimator Settings Tab (see Figure 4.19, “Estimator Settings Tab”). This is the default setting PET uses. If the user selects "Ordinary Least Squares" the objective function is changed by setting all deltas to 0. This change assumes that all the error in the experimental observations is in the dependent variable. A visual illustration of the distance between model and data, for this case, can be seen in Figure 4.21, “Ordinary Least Squares Error Illustration”. Mathematically the objective function is Equation 4.6, “Ordinary Least Squares Objective Function”.
ODRPACK95 converges or stops when one of three conditions are met:
When ODRPACK95 explores parameter space, it guesses one parameter set per iteration. Therefore, the maximum number of iterations will control how many parameter sets ODRPACK95 guesses to be the minimum. A parameter set is considered the minimum when the sum of squares tolerance is met. The sum of squares tolerance is met when the change in the objective function value from the last iteration to the current is less than the sum of squares tolerance provided in the estimator settings tab (Figure 4.19, “Estimator Settings Tab”). Last, if the parameter tolerance is met ODRPACK95 will stop and return the current parameter set as the best set. The parameter tolerance is met when the distance from the previous parameter values to the current parameter values is less than the parameter tolerance defined in the estimator settings tab.
VTDirect globally searches parameter space within upper and lower bounds on the parameters. VTDirect has guaranteed convergence on a global minimum; practically, one may not find a global minimum if the algorithm is not given enough time to run. (Running the algorithm for some time followed by a local search can yield good results and reduce run times.) This section describes the VTDirect algorithm and its settings.
VTDirect divides the search space (a p-dimensional box, where p is the number of parameters being estimated) into boxes and systematically subdivides the boxes in search of regions of parameter space where the objective function values are small. For each box in parameter space, VTDirect keeps the value of the objective function at the center of that box. The size of the boxes and the values of the objective function are used to determine which boxes to divide in the next iteration. The larger the box and the smaller the objective function value the more likely the box will be divided. Figure 4.22, “VTDirect Example” shows how VTDirect divided parameter space for a model with two parameters. The boxes are shown for 1, 5, and 10 iterations. The parameter space is overlayed with a contour plot of the objective function so that we can see how VTDirect is focusing computation effort on the regions where the objective function values are smaller.
VTDirect chooses the boxes to divide by comparing the objective function values and the box size (box diameter). Boxes on the lower right portion of the convex hull of a plot of the objective function values vs. the box sizes are divided in the next iteration. Figure 4.23, “Scatter Plot of Objective Function Values vs. Box Size” shows an example where the boxes that are to be divided on the next iteration have a line drawn through them. This line is the lower right hand portion of the convex hull. When a box is divided, it is replaced with smaller boxes. Therefore, the point representing the box in the scatter plot is removed and replaced by the new smaller boxes. Note that on one iteration many boxes may be divided. The parameters to control the number of iterations and the number of boxes to divide are "Maximum Iterations" and "Maximum Evaluations", respectively (see Figure 4.19, “Estimator Settings Tab”). The term "Maximum Evaluations" refers to the number of times the objective function is calculated. This is equivalent to the number of boxes since the objective function is calculated once per box (at the center of the box).
The setting "Sum of Squares Tolerance" in VTDirect Settings tells VTDirect to stop running if the improvement in the objective function from the last iteration to the current iteration does not exceed the specified value.
EPS is a difficult parameter to grasp and use effectively. It controls how VTDirect splits computational effort between exploration and convergence. Larger values of EPS cause VTDirect to divide larger boxes more aggressively. Smaller values of EPS cause VTDirect to divide smaller boxes more aggressively. The relationship is best seen visually in Figure 4.24, “EPS Illustration”. Effectively, use of EPS adds a fictitious box with size = 0 and an objective function value of Emin-Emin·EPS. This point can be seen as E* in the figure. The addition of the point at E* causes some of the small boxes to be skipped for division in the current iteration. Only small boxes with "significant" improvement in the objective function will be divided. That is, the improvement in the objective function must place the box below the dashed line in Figure 4.24, “EPS Illustration”.
Biopack, PET's computational engine, implements some additional functionality specifically for VTDirect. Each parameter can be set on a linear or log scale by checking the box in the "Log" column on the left hand side of the estimator settings tab (see Figure 4.19, “Estimator Settings Tab”). If, for example, the range for a parameter was set to be 0.01 to 1000, on a linear scale VTDirect might divide boxes such that values of the parameter at the center of the boxes are approximately 200, 400, 600, and 800. The same scenario using a log scale would result in parameter values of 0.1, 1, 10, and 100 at the center of their respective boxes (which would be different boxes than the boxes from the linear case). This feature allows VTDirect to search parameter space more completely when the scale of a parameter is unknown.
The "Default Values / Initial Guess" setting is used by VTDirect only for default values. For the parameters that are designated as "Fixed" – by a check in the box in the "Fixed" column left of the parameter name – the default values are used and the parameter value is not changed. The remainder of the parameters are adjusted according to the box dividing algorithm (described above) and start with a value at the midpoint between their upper and lower bounds. The starting values of the parameters and VTDirect are not the starting values of the "Default Values / Initial Guess" basal set. Therefore, the parameter set returned by VTDirect may be a worse fit to the data than the "Default Values / Initial Guess" basal set.
When the setting "VTDirect Followed BY ODRPACK95" is selected then the resulting best point from the VTDirect run will be passed to ODRPACK95 as the initial guess. ODRPACK95 will run and the resulting best point will be returned in a basal set.
This section gives examples and explanations to help you choose appropriate settings for your problem.
A simple definition of local optimization is: search parameter space until a local minimum is found. A local minimum is a place in parameter space where all close neighboring points are a worse fit to the data. A simple definition of global optimization is: search parameter space until the minimum is found. This is the parameter set for which there is no better fit to the model. Sounds like one should always use global optimization.
The choice is not theoretical though. The choice is a practical one. Global optimization tends to take longer and may not produce any good results in the time allotted for the algorithm to run. The practical advantages of ODRPACK95 are:
The disadvantages of ODRPACK95 are:
The advantages of VTDirect are:
The disadvantages of VTDirect are:
In most cases, ODRPACK95 is a good algorithm to start with. If no good parameter values are discovered then move to using VTDirect. In most cases, running ODRPACK95 on the resulting parameters returned by VTDirect will give further improvement with only a little extra computational time.
The rule of thumb is this: if there are lots of "close" data in one experiment then use OLS, otherwise, use ODR. ODR generally does a better job of fitting data since it allows for error in the independent variable (often there is really error in the independent variable: think of the experimentalist that takes a measurement at 1 hr and is 5-10 minutes late). "close" is a relative term. If the error in the independent component of the data is likely to be more than the distance between data points then the data is "close". Choosing OLS or ODR also depends on how much error is in the dependent data. If most of the error is in the dependent data then OLS is a fine choice regardless of the closeness of the data.
The default values of these settings are probably sufficient for most problems. Though, choosing between ordinary least squares and orthogonal distance regression could have a significant impact (this is discussed in the section called “Ordinary Least Squares (OLS) vs. Orthogonal Distance Regression (ODR)”). Read the section called “Convergence” to learn about how these settings affect ODRPACK95.
Table of Contents
This section shows how the Time Series transform could be defined by a User Defined Transform. The PET file used in this example can be found at timeseries_userdefined.pet.
In PET the transform looks like

The transform takes time in the argument XDATA and outputs the species L2 in the first column of YDATA. The code can be divided into 4 sections and we'll explain each of these sections. The first section declares and defines local variables used within this transform.
integer :: i i = 1
The second section skips any negative times in the independent variable XDATA. The DEFINED array is set to 0 for each corresponding negative time. This lets PET know that YDATA is not defined at those times.
do while ( xdata(i) .lt. 0 )
defined(i) = 0
i = i+1
end do
The third section checks if the first simulatable time is zero. There is no need to perform a simulation for time=0, because these are the initial conditions. However, L2 is defined at time=0 so we set YDATA and DEFINED accordingly. Note here that L2 has an '_' appended to the end. This is done by the tool underlying PET (Biopack) so that user names will not conflict with names used by Biopack or Fortran.
if ( xdata(i) .eq. 0 ) then
defined(i) = 1
ydata(i,1) = L2_
i = i+1
end if
The fourth and last section performs a simulation stopping at each time in XDATA and recording L2 in YDATA. Note the same '_' is appended to L2 as above and for the same reasons discussed above. There are three steps to performing a simulation: set up the simulator (lsodar_set), run the simulator (lsodar_simulate), and get the results (lsodar_get). TOUT is the time the simulated will run to. Y is the generic internal name the simulator uses for the state variables of the ODEs. MODEL_VARS is an array that is "tied" to the names of the species in the model. When MODEL_VARS is assigned a value the species (i.e., L2_, L_, Ma_, etc.) are updated and vice versa. This is done automatically. The Fortran code that does this uses the EQUIVALENCE Fortran statement.
do while ( i .le. size(xdata) )
call lsodar_set( tout = xdata(i) )
call lsodar_simulate()
call lsodar_get( y = model_vars )
defined(i) = 1
ydata(i,1) = L2_
i = i+1
end do
Table of Contents
This tutorial walks you through a series of tasks involving simulating models using PET. Each task builds on the previous starting with a basic simulation and ending with complex simulation definitions that inherit parameter changes from other simulation definitions.
NOTE: You will need the files listed below for this tutorial. They can be found on the PET website (http://mpf.biol.vt.edu/pet/), in the PET distribution (c:\Program Files\PET\doc\manual\), or by right clicking and saving the files below (if you are viewing this page through a web browser).







The state of a user interface component such as a button, check box, text box, etc.. In this state the user interface will respond to user clicks and edits.
See Also inactive.
See user data.
The state of a user interface component such as a button, check box, text box, etc.. In this state the user interface will not respond to user clicks and edits.
See Also active.
To take a file from the filesystem (i.e., disk) and read its contents into the currently open PET file. The contents of the imported file could be added to a list of like content (e.g., basal set files) or overwrite some existing content (e.g., SBML files).
A problem solving environment created by a Virginia Tech team of computer scientists and computational biologists. This is a separate project from PET, but some components are included with PET.
The act of changing the state of the user's data to match its state before the last time the user editted the data.
All information entered into PET by the user. This includes experimental data, names of simulations, plot settings, experimental data weights, basal set names, values of parameters in basal sets, changes to parameters in simulations, parameter estimator settings, etc..