Pathway Modeling and Problem Solving Environments
Cliff Shaffer
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061
The Fundamental Goal of Molecular Cell Biology
How do cells convert genes into behavior?
• Create proteins from genes
• Protein interactions
• Protein effects on the cell
Application: Cell Cycle Modeling
Our study system is the cell cycle of the budding yeast Saccharomyces cerevisiae.
[Figure: the cell cycle, showing G1, S (DNA replication), G2, M (mitosis), cell division, and growth.]
[Figure: wiring diagram of the budding yeast cell cycle control network. Components include the CDKs and their cyclins (Cln2, Cln3, Clb2, Clb5), Bck2, SBF, MBF, Mcm1, Swi5, Sic1, SCF, APC, Cdc20, Cdh1, Cdc14, Net1, RENT, PPX, Mad2, Bub2, Lte1, Tem1, Cdc15/MEN, Esp1, and Pds1. Labeled events: budding, DNA synthesis, sister chromatid separation, and mitosis, with unaligned chromosomes as an input signal.]
Modeling Techniques
One method: Use ODEs that describe the rate at which each protein concentration changes
• Protein B degrades protein A:
$$\frac{d[A]}{dt} = -c\,[A][B]$$
  … with initial condition [A](0) = A0.
• Parameter c determines the rate of degradation.
• Sometimes modelers use “creative” rate laws to approximate subsystems
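As a rough illustration of what "solving" such a rate equation means, the sketch below integrates the mass-action form d[A]/dt = -c[A][B] with forward Euler. It assumes [B] is held constant so the result can be checked against the closed-form solution A0·exp(-c·[B]·t). The function name, step size, and numerical values are illustrative assumptions, not taken from the slides.

```python
import math

def simulate_degradation(a0, b, c, t_end, dt=0.01):
    """Integrate d[A]/dt = -c*[A]*[B] with forward Euler, holding [B] constant."""
    a = a0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        a += dt * (-c * a * b)  # one Euler step of the rate law
    return a

# Hypothetical values chosen only for the demonstration.
a_numeric = simulate_degradation(a0=1.0, b=2.0, c=0.5, t_end=1.0, dt=0.0001)
a_exact = 1.0 * math.exp(-0.5 * 2.0 * 1.0)  # closed form for constant [B]
print(a_numeric, a_exact)
```

Setting c = 0 leaves [A] at its initial value, which is how a knockout of the degrading protein would be expressed in this toy setting.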
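$$\frac{d[\mathrm{Cln2}]}{dt} = \underbrace{k_1' + k_1[\mathrm{SBF}]}_{\text{synthesis}} - \underbrace{k_2[\mathrm{Cln2}]}_{\text{degradation}}$$

$$\frac{d[\mathrm{Clb2}]}{dt} = \underbrace{k_3' + k_3[\mathrm{Mcm1}]}_{\text{synthesis}} - \underbrace{\left(k_4' + k_4[\mathrm{Cdh1}]\right)[\mathrm{Clb2}]}_{\text{degradation}} - \underbrace{k_5[\mathrm{Sic1}][\mathrm{Clb2}]}_{\text{binding}}$$

$$\frac{d[\mathrm{Cdh1}]}{dt} = \underbrace{\frac{\left(k_6' + k_6[\mathrm{Cdc20}]\right)\left([\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]\right)}{J_6 + [\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]}}_{\text{activation}} - \underbrace{\frac{\left(k_7' + k_7[\mathrm{Clb5}]\right)[\mathrm{Cdh1}]}{J_7 + [\mathrm{Cdh1}]}}_{\text{inactivation}}$$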
Mathematical Model
[Figure: Simulation of the budding yeast cell cycle. Time courses of mass, CKI, Clb2, Cln2, Cdh1, and Cdc20 over 0 to 150 min, with the G1 and S/M phases marked.]
Table 6. Properties of clb, sic1, and hct1 mutants

| # | Genotype | Mass at birth | Mass at SBF 50% | Mass at DNA repl. | Mass at bud ini. | Mass at division | TG1 (min) | Changed parameter | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | wild type (daughter) | 0.71 | 1.07 (71’) | 1.15 (84’) | 1.15 (84’) | 1.64 (146’) | 84 | | CT 146 min (time of occurrence of event) |
| 2 | clb1 clb2 | 0.71 | 1.07 | 1.16 | 1.16 | No mit | | k's,b2 = 0, k"s,b2 = 0 | Surana 1991 Table 1, G2 arrest. |
| 3 | clb1 clb2 1X GAL-CLB2 | 0.65 | 1.10 | 1.19 | 1.19 | 1.50 | 105 | k's,b2 = 0.1, k"s,b2 = 0 | Surana 1993 Fig 4, 1X GAL-CLB2 is OK, 4X GAL-CLB2 (or 1X GAL-CLB2db) causes telophase arrest. |
| 4 | clb5 clb6 | 0.73 | 1.07 (65’) | 1.30 (99’) | 1.17 (80’) | 1.70 (146’) | 99 | k's,b5 = 0, k"s,b5 = 0 | Schwob 1993 Fig 4, DNA repl begins 30 min after SBF activation. |
| 5 | clb5 clb6 GAL-CLB5 | 0.61 | 0.93 | 0.92 | 0.96 | 1.41 | 73 | k's,b5 = 0.1, k"s,b5 = 0 | Schwob 1993 Fig 6, DNA repl concurrent with SBF activation in both GAL-CLB5 and GAL-CLB5db. |
| 6 | sic1 | 0.66 | 1.00 (73’) | 0.82 (37’) | 1.06 (83’) | 1.52 (146’) | 38 | k's,c1 = 0, k"s,c1 = 0 | Schneider 1996 Fig 4, sic1 uncouples S phase from budding. |
| 7 | sic1 GAL-SIC1 | 0.80 | 1.07 | 1.38 | 1.17 | 1.86 | 94 | k's,c1 = 0.1, k"s,c1 = 0 | Verma 1997 Fig 3B, Nugroho & Mendenhall 1994 Fig 2, most cells are viable. |
| 8 | hct1 | 0.73 | 1.08 | 1.17 | 1.18 | 1.69 | 82 | k"d,b2 = 0.01 | Schwab 1997 Fig 2, viable, size like WT, Clb2 level high throughout the cycle. |
| 9 | sic1 hct1 | 0.71 | No SBF | 0.72 | No bud | No mit | | k's,c1 = 0, k"d,b2 = 0.01 | Visintin 1997, telophase arrest. |
| 10 | sic1 GAL-CLB5 | 0.71 (first cycle), 0.52 (second cycle) | 0.74 | 0.73 (first cycle), No repl (second cycle) | 0.76 | 1.20 | | k's,b5 = 0.1, k"s,b5 = 0, k's,c1 = 0 | Schwob 1994 Fig 7C, inviable. First cycle OK, DNA repl advanced; but pre-repl complexes cannot form and cell dies after the first cycle. |
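Differential equations

$$\frac{d\,\mathrm{CDK}}{dt} = k_1 - \left(v_2' + v_2''\,\mathrm{Cdh1}\right)\mathrm{CDK}$$

$$\frac{d\,\mathrm{Cdh1}}{dt} = \frac{\left(k_3' + k_3''\,\mathrm{Cdc20_A}\right)\left(1 - \mathrm{Cdh1}\right)}{J_3 + 1 - \mathrm{Cdh1}} - \frac{\left(k_4' + k_4''\,\mathrm{CDK}\cdot M\right)\mathrm{Cdh1}}{J_4 + \mathrm{Cdh1}}$$

$$\frac{d\,\mathrm{IEP}}{dt} = k_9\,\mathrm{CDK}\cdot M\,\left(1 - \mathrm{IEP}\right) - k_{10}\,\mathrm{IEP}$$

$$\frac{d\,\mathrm{Cdc20_T}}{dt} = k_5' + k_5''\,\frac{\left(\mathrm{CDK}\cdot M\right)^4}{J_5^4 + \left(\mathrm{CDK}\cdot M\right)^4} - k_6\,\mathrm{Cdc20_T}$$

$$\frac{d\,\mathrm{Cdc20_A}}{dt} = \frac{k_7\,\mathrm{IEP}\,\left(\mathrm{Cdc20_T} - \mathrm{Cdc20_A}\right)}{J_7 + \mathrm{Cdc20_T} - \mathrm{Cdc20_A}} - \frac{k_8\,\mathrm{MAD}\cdot\mathrm{Cdc20_A}}{J_8 + \mathrm{Cdc20_A}} - k_6\,\mathrm{Cdc20_T}$$

Parameter values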
k1 = 0.0013, v2’ = 0.001, v2” = 0.17,
k3’ = 0.02, k3” = 0.85, k4’ = 0.01, k4” = 0.9,
J3 = 0.01, J4 = 0.01, k9 = 0.38, k10 = 0.2,
k5’ = 0.005, k5” = 2.4, J5 = 0.5, k6 = 0.33,
k7 = 2.2, J7 = 0.05, k8 = 0.2, J8 = 0.05,
…
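To make the link between equations and simulation concrete, here is a minimal sketch of the five-variable system as a right-hand-side function, using only the parameter values listed on this slide. The cell mass M and the MAD signal are not given numerically here, so both are set to 1.0 purely as assumptions, and the parameter list is truncated in the source ("…"). The final term of the Cdc20A equation is copied as it appears in the transcript (k6 · Cdc20T); the published model may differ. Names such as `rhs` and `euler_step` are hypothetical.

```python
# Parameter values as listed on the slide (primes written as "p"/"pp").
PARAMS = {
    "k1": 0.0013, "v2p": 0.001, "v2pp": 0.17,
    "k3p": 0.02, "k3pp": 0.85, "k4p": 0.01, "k4pp": 0.9,
    "J3": 0.01, "J4": 0.01, "k9": 0.38, "k10": 0.2,
    "k5p": 0.005, "k5pp": 2.4, "J5": 0.5, "k6": 0.33,
    "k7": 2.2, "J7": 0.05, "k8": 0.2, "J8": 0.05,
}

def rhs(state, mass=1.0, mad=1.0, p=PARAMS):
    """Time derivatives of (CDK, Cdh1, IEP, Cdc20T, Cdc20A).

    mass and mad default to 1.0 as an assumption; the slide gives no values.
    """
    cdk, cdh1, iep, cdc20t, cdc20a = state
    x = cdk * mass  # the product CDK . M that appears in several terms
    d_cdk = p["k1"] - (p["v2p"] + p["v2pp"] * cdh1) * cdk
    d_cdh1 = ((p["k3p"] + p["k3pp"] * cdc20a) * (1 - cdh1) / (p["J3"] + 1 - cdh1)
              - (p["k4p"] + p["k4pp"] * x) * cdh1 / (p["J4"] + cdh1))
    d_iep = p["k9"] * x * (1 - iep) - p["k10"] * iep
    d_cdc20t = (p["k5p"] + p["k5pp"] * x**4 / (p["J5"]**4 + x**4)
                - p["k6"] * cdc20t)
    inactive = cdc20t - cdc20a
    d_cdc20a = (p["k7"] * iep * inactive / (p["J7"] + inactive)
                - p["k8"] * mad * cdc20a / (p["J8"] + cdc20a)
                - p["k6"] * cdc20t)  # last term as transcribed on the slide
    return [d_cdk, d_cdh1, d_iep, d_cdc20t, d_cdc20a]

def euler_step(state, dt, **kwargs):
    """Advance the state by one forward-Euler step of size dt."""
    return [s + dt * d for s, d in zip(state, rhs(state, **kwargs))]

# Hypothetical starting state, chosen only to exercise the function.
derivs = rhs([0.5, 0.1, 0.0, 0.0, 0.0])
next_state = euler_step([0.5, 0.1, 0.0, 0.0, 0.0], dt=0.001)
```

A production run would use an adaptive stiff solver rather than fixed-step Euler; the step function is included only to show how the right-hand side is consumed.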
Experimental Data
Tyson’s Budding Yeast Model
• Tyson’s model contains over 30 ODEs, some nonlinear.
• Events can cause concentrations to be reset.
• About 140 rate constant parameters
  - Most are unavailable from experiment and must be set by the modeler
Fundamental Activities
• Collect information
  - Search literature (databases), lab notebooks
• Define/modify models
  - A user interface problem
• Run simulations
  - Equation solvers (ODEs, PDEs, deterministic, stochastic)
• Compare simulation results to experimental data
  - Analysis
Modeling Lifecycle
Our Mission: Build Software to Help the Modelers
• Typical cycle time for changing the model used to be one month
  - Collect data on paper lab notebooks
  - Convert to differential equations by hand
  - Calibrate the model by trial and error
  - Inadequate analysis tools
• Goal: Change the model once per day.
  - Bottleneck should shift to the experimentalists
Another View
Current models of simple organisms contain a few tens of equations.
Modeling mammalian systems might require two additional orders of magnitude of complexity.
We hope our current vision for tools can supply one order of magnitude.
The other order of magnitude is an open problem.
JigCell
Current Primary Software Components:
• JigCell Model Builder
• JigCell Run Manager
• JigCell Comparator
• Automated Parameter Estimation (PET)
• Bifurcation Analysis (Oscill8)
http://jigcell.biol.vt.edu
[Figure: how the JigCell components fit together: Model Builder, Run Manager, Comparator, and Parameter Optimizer, linked by parameter values and optimum parameter values.]
From a wiring diagram…
JigCell Model Builder
N.B. Parameters are given names, not numerical values!
… to a reaction mechanism
… to ordinary differential equations (ode files, SBML)
JigCell Model Builder
Mutations
• Wild type cell
• Mutations
  - Typically caused by gene knockout
  - Consider a mutant with no B to degrade A: set c = 0
• We have about 130 mutations
  - Each requires a separate simulation run
• Inheritance patterns
Basal Set (wild-type)
Derived Set (mutant A)
Derived Set (mutant B)
Derived Set (mutant C)
Derived Set (mutant A’)
Derived Set (mutant AB)
Derived Set (mutant A’C)
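One way to picture the inheritance pattern above is as layered parameter sets, where each derived set stores only the values it changes and falls back to its parent for everything else. The sketch below uses Python's `collections.ChainMap` for this. It is an illustration of the idea only, not JigCell's actual data model; the parameter names and values are hypothetical, and treating mutant AB as "mutant A plus B's change" is an assumption.

```python
from collections import ChainMap

# Basal (wild-type) parameter set. Names and values are hypothetical.
basal = {"c": 0.5, "k_s": 0.1, "k_d": 0.05}

# Each derived set records only its overrides and inherits the rest.
mutant_a = ChainMap({"c": 0.0}, basal)          # knockout: no degradation
mutant_b = ChainMap({"k_s": 0.0}, basal)        # knockout: no synthesis
mutant_ab = ChainMap({"k_s": 0.0}, mutant_a)    # derived from mutant A

print(dict(mutant_ab))

# A change to the basal set is seen by every derived set that does not override it.
basal["k_d"] = 0.07
print(mutant_ab["k_d"])
```

The appeal of this layout for roughly 130 mutants is that recalibrating a wild-type rate constant requires one edit rather than one edit per mutant.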
Run Manager
JigCell Run Manager
Phenotypes
• Each mutant has some observed outcome (“experimental” data). Generally qualitative.
  - Cell lived
  - Cell died in G1 phase
• Model should match the experimental data.
• Model should not be overly sensitive to the rate constants.
  - Overly sensitive biological systems tend not to survive
Visualize results
Kumagai1 Kumagai2
Comparator
Comparator
Optimization
How to decide on parameter values?
Key features of optimization
• Each candidate solution (here, a set of parameter values) is a point in a multidimensional space
• Each point can be assigned a value by an objective function
• The goal is to find the best point in the space as defined by the objective function
• We usually settle for a “good” point
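The points-and-objective view can be sketched in a few lines. The example below assumes a toy two-parameter model y(t) = a·exp(-c·t), synthetic "data" generated from known parameter values, a sum-of-squares objective, and a coarse grid as the set of candidate points. All of these choices are illustrative assumptions; real problems have far more dimensions and cannot be searched exhaustively.

```python
import math

def model(a, c, t):
    """Toy model: exponential decay with amplitude a and rate c."""
    return a * math.exp(-c * t)

# Synthetic "experimental" data generated from a = 1.0, c = 0.5 (an assumption).
TIMES = [0.0, 1.0, 2.0, 3.0, 4.0]
DATA = [model(1.0, 0.5, t) for t in TIMES]

def objective(point):
    """Sum of squared differences between model output and data at a point."""
    a, c = point
    return sum((model(a, c, t) - y) ** 2 for t, y in zip(TIMES, DATA))

# Candidate points: a coarse grid over the two-dimensional parameter space.
grid = [(i / 10, j / 10) for i in range(5, 16) for j in range(1, 11)]
best = min(grid, key=objective)
print(best, objective(best))
```

Here the grid happens to contain the generating parameters, so the best grid point recovers them; with a coarser or offset grid the search would return only a "good" point.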
Parameter Optimization
Error Function
orthogonal distance regression
Levenberg-Marquardt algorithm
Parameter Optimization
Only 1 experiment shown here. The model must be fitted simultaneously to many different experiments.
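Fitting several experiments at once amounts to stacking their residuals into a single error function. The sketch below does this for a single rate constant c shared by two hypothetical experiments, each following A(t) = A0·exp(-c·B·t), and minimizes the stacked sum of squares with a hand-written scalar Levenberg-Marquardt iteration. It uses ordinary least-squares residuals rather than orthogonal distance regression, and the data are synthetic (generated with c = 0.3), so every number here is an assumption made for illustration.

```python
import math

def model(c, a0, b, t):
    """Closed-form solution of d[A]/dt = -c*[A]*[B] with constant [B]."""
    return a0 * math.exp(-c * b * t)

def residuals_and_jacobian(c, experiments):
    """Stack residuals (and d residual / d c) over every sample of every experiment."""
    r, j = [], []
    for a0, b, samples in experiments:
        for t, observed in samples:
            pred = model(c, a0, b, t)
            r.append(pred - observed)
            j.append(-b * t * pred)
    return r, j

def fit_c(experiments, c=1.0, lam=1e-3, iterations=50):
    """Scalar Levenberg-Marquardt: damped Gauss-Newton with adaptive damping."""
    r, j = residuals_and_jacobian(c, experiments)
    cost = sum(x * x for x in r)
    for _ in range(iterations):
        jtj = sum(x * x for x in j)
        jtr = sum(ji * ri for ji, ri in zip(j, r))
        if jtj == 0.0:
            break
        step = -jtr / (jtj * (1.0 + lam))  # Marquardt-scaled damping
        r_new, j_new = residuals_and_jacobian(c + step, experiments)
        cost_new = sum(x * x for x in r_new)
        if cost_new < cost:   # accept the step and trust the model more
            c, r, j, cost = c + step, r_new, j_new, cost_new
            lam /= 10.0
        else:                 # reject the step and damp harder
            lam *= 10.0
    return c

# Two synthetic experiments with different initial conditions and B levels.
TRUE_C = 0.3
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
experiments = [
    (1.0, 1.0, [(t, model(TRUE_C, 1.0, 1.0, t)) for t in times]),
    (2.0, 0.5, [(t, model(TRUE_C, 2.0, 0.5, t)) for t in times]),
]
fitted_c = fit_c(experiments)
print(fitted_c)
```

With about 140 parameters and many mutants, the same structure applies, but the Jacobian becomes a matrix and each residual evaluation requires a full ODE simulation.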
Parameter Optimization
Global DIRECT Search (DIViding RECTangles)
Composition Motivation
• Models are reaching the limits of manageability due to an increase in:
  - Size
  - Complexity
• Making a model suitable for stochastic simulation increases the number of reactions by a factor of 3-5.
• Models of the mammalian cell cycle will require 100-1000 reactions (even more for stochastic simulation).
Model Composition
Notice that the yeast cell diagram contains natural components
Composition Processes
• Fusion: Merging two or more existing models
• Composition: Build up model hierarchy from existing models by describing their interactions and connections
• Aggregation: Connects modular blocks using controlled interfaces (ports)
• Flattening: Convert hierarchy back into a single “flat” model for use with standard simulators
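As a rough picture of fusion, the sketch below merges two tiny models using a species mapping table that says which names in the second model refer to species already present in the first. The dictionary layout, the `fuse` function, and the example species and rate-constant names are all hypothetical; this is not the JigCell Composition Wizard's format, and it ignores conflicts in rate laws and parameters that a real tool has to resolve.

```python
def fuse(model_a, model_b, species_map):
    """Merge model_b into model_a, renaming model_b species via species_map."""
    def rename(name):
        return species_map.get(name, name)

    species = set(model_a["species"]) | {rename(s) for s in model_b["species"]}
    reactions = list(model_a["reactions"])
    for reactants, products, rate_constant in model_b["reactions"]:
        renamed = (tuple(sorted(rename(s) for s in reactants)),
                   tuple(sorted(rename(s) for s in products)),
                   rate_constant)
        if renamed not in reactions:  # skip reactions both models already share
            reactions.append(renamed)
    return {"species": sorted(species), "reactions": reactions}

# Each reaction is (reactants, products, rate-constant name); all names are made up.
model_a = {
    "species": ["Cdh1", "Clb2"],
    "reactions": [(("Cdh1", "Clb2"), ("Cdh1",), "k4")],
}
model_b = {
    "species": ["Cdc20", "CycB"],
    "reactions": [(("Cdc20", "CycB"), ("Cdc20",), "k6")],
}

# Mapping table: model_b's "CycB" is the same species as model_a's "Clb2".
fused = fuse(model_a, model_b, {"CycB": "Clb2"})
print(fused)
```

The fused result is already a flat model; composition and aggregation differ in that they keep the submodels separate and only flatten when a simulator needs a single reaction list.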
Composition Processes
Sample Sub-models
Sample Composed Model
Composition Wizard: Final Species Mapping Table
Composition Wizard: Final Reaction Mapping Table
Aggregated Submodels
Final Aggregated Model
Aggregation Connector
Composition in SBML
Virginia Tech’s proposed language features to support composition/aggregation are being written into the forthcoming SBML Level 3 definition.
Stochastic Simulation
ODE-based (deterministic) models cannot explain behaviors introduced by the random nature of the system.
• Variations in mass at division
• Variations in time of events
• Differences in gross outcomes
Gillespie’s Stochastic Simulation Algorithm
• There is a population for each chemical species
• There is a “propensity” for each reaction, in part determined by population
• Each reaction changes population for associated species
• Loop:
  - Pick next reaction (random, propensity)
  - Update populations, propensities
• Slow, but there are approximations to speed it up
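The loop described above can be sketched as a minimal direct-method simulation. In this sketch each reaction is a pair of a propensity function and a dictionary of population changes, the waiting time to the next event is drawn from an exponential distribution with rate equal to the total propensity, and the reaction that fires is chosen with probability proportional to its propensity. The example reaction (B degrades A), the rate constant, and the initial populations are hypothetical.

```python
import random

def gillespie(populations, reactions, t_end, rng):
    """Direct-method stochastic simulation.

    reactions is a list of (propensity_function, population_changes) pairs.
    Returns a list of (time, populations) snapshots.
    """
    t = 0.0
    state = dict(populations)
    trajectory = [(t, dict(state))]
    while True:
        propensities = [propensity(state) for propensity, _ in reactions]
        total = sum(propensities)
        if total <= 0.0:
            break  # nothing can fire any more
        t += rng.expovariate(total)  # waiting time to the next event
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        index = rng.choices(range(len(reactions)), weights=propensities)[0]
        for species, delta in reactions[index][1].items():
            state[species] += delta
        trajectory.append((t, dict(state)))
    return trajectory

# Hypothetical single reaction: A + B -> B, with propensity c * A * B and c = 0.01.
reactions = [(lambda s: 0.01 * s["A"] * s["B"], {"A": -1})]
trajectory = gillespie({"A": 100, "B": 10}, reactions,
                       t_end=100.0, rng=random.Random(0))
print(len(trajectory), trajectory[-1])
```

Every event costs one pass through the loop, which is why the exact method is slow for large populations and why approximate variants that advance many events per step are attractive.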
Comments on Collaboration
Domain team routinely underestimates how difficult it is to create reliable and usable software.
CS team routinely underestimates how difficult it is to stay focused on the needs of the domain team.
Partial solution: truly integrate.
How to Succeed in CBB
Programming skills are necessary but not sufficient
Math is usually the biggest bottleneck
• Statistics for Bioinformatics
• Numerical analysis, optimization, differential equations for computational biology
Chemistry/biochemistry are good choices for domain knowledge
You have to have an “interdisciplinary attitude”