Post on 11-Jan-2016
Assigning Numbers to the Arrows
Parameterizing a Gene Regulation Network by using Accurate
Expression Kinetics
Overview
• Motivation
• Gene Regulation Networks Background
• Our Goal
• Our Example
• Parameterizing Algorithm
• Results
Motivation
• Understand regulation factors for different genes
• Can help understand a gene’s function
• If we can understand how it all works we can use it for medical purposes like fixing and preventing DNA damage!
Background: Gene Regulation Networks(1)
• Dynamically orchestrate the level of expression for each gene
• How? Control whether and how vigorously that gene will be transcribed into RNA (biological stuff)
Background: Gene Regulation Networks(2)
• Contains:
1. Input Signals: environmental cues, intracellular signals
2. Regulatory Proteins
3. Target Genes
Our Goal
• Assign parameters to a Gene Regulation Network based on experiments:
$\beta$ - production rate of the unrepressed promoter, i.e. the maximal production rate
$k$ - concentration of repressor at half-maximal repression. The larger it is, the earlier the gene becomes active and the later it becomes inactive again
Our Example(1)
• Escherichia coli bacterium
• SOS DNA repair system – used to repair damage done by UV light
• 8 (out of about 30) gene groups (operons)
Our Example(2)
• Simple network architecture – recall what we saw last week: SIM (Single Input Module)
• All genes are under negative control of a single repressor (a protein that reduces gene levels)
[Diagram: a single repressor A regulating the target genes $X_1, X_2, \ldots, X_n$]
Parametrization Algorithm
Definitions:
$X_{ij}(t)$ - the activity of promoter i in experiment j as a function of time
$A_j(t)$ - effective repressor concentration in experiment j as a function of time
$\beta_i$ - production rate of the unrepressed promoter i
$k_i$ - k parameter of promoter i
Parametrization Algorithm 1:Trial Function
[1]: $X_{ij}(t) = \dfrac{\beta_i}{1 + A_j(t)/k_i}$
Why? Michaelis-Menten form: a very useful equation in modeling biological behavior.
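To make the roles of the two parameters concrete, here is a minimal sketch of trial function [1]. The function name and the numerical values are illustrative assumptions, not values from the experiments.

```python
def promoter_activity(beta, k, repressor):
    """Trial function [1]: X = beta / (1 + A / k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return beta / (1.0 + repressor / k)


# Illustrative values only (not taken from the experiments).
beta, k = 2.0, 0.5
full_activity = promoter_activity(beta, k, repressor=0.0)  # no repressor present
half_activity = promoter_activity(beta, k, repressor=k)    # repressor level equal to k
```

With no repressor the activity equals beta, and when the repressor level equals k the activity drops to half of beta, which is the sense in which k marks half-maximal repression.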
Parametrization Algorithm 2:Data Preprocessing(1)
• Smoothing the signals using a hybrid Gaussian-median filter with a window size of five measurements:
Five time points are taken, sorted and the average of central three points is taken to be the signal.
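The filter described above can be sketched as follows. The function name is hypothetical, and leaving the first and last two points unchanged is an assumption, since the slides do not say how the ends of the signal are handled.

```python
def hybrid_median_filter(signal, window=5, keep=3):
    """Sort each window of `window` points and average its central `keep` values."""
    half = window // 2
    trim = (window - keep) // 2
    smoothed = list(signal)  # edge points are left unchanged (an assumption)
    for center in range(half, len(signal) - half):
        values = sorted(signal[center - half:center + half + 1])
        central = values[trim:window - trim]  # drop the extremes of the window
        smoothed[center] = sum(central) / len(central)
    return smoothed


raw = [1.0, 2.0, 100.0, 4.0, 5.0, 6.0, 7.0]  # one outlier at index 2
smooth = hybrid_median_filter(raw)
```

Because the largest and smallest values of each window are discarded before averaging, a single outlier does not leak into the smoothed signal.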
Parametrization Algorithm 2:Data Preprocessing(2)
Some more definitions:
$X_i(t)$ - the activity of promoter i as a function of time
$G_i(t)$ - GFP fluorescence from the corresponding reporter as a function of time
$OD_i(t)$ - corresponding Optical Density as a function of time
Parametrization Algorithm 2:Data Preprocessing(3)
• The signal is smooth enough to be differentiated
• The activity of promoter i is proportional to the number of GFP molecules produced per unit time per cell
$X_i(t) = \dfrac{dG_i(t)/dt}{OD_i(t)}$
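As a rough illustration of this step, the sketch below computes the promoter activity as the time derivative of the GFP signal divided by the optical density. The use of `numpy.gradient` as the derivative estimate and the synthetic data are assumptions made for the example.

```python
import numpy as np


def promoter_activity_from_gfp(time, gfp, od):
    """Promoter activity: (dG/dt) / OD, evaluated at each time point."""
    time = np.asarray(time, dtype=float)
    gfp = np.asarray(gfp, dtype=float)
    od = np.asarray(od, dtype=float)
    dG_dt = np.gradient(gfp, time)  # finite-difference estimate of dG/dt
    return dG_dt / od


# Synthetic data: fluorescence rising linearly at a constant optical density.
time = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
gfp = 10.0 + 3.0 * time
od = np.full_like(time, 0.5)
activity = promoter_activity_from_gfp(time, gfp, od)
```

Differentiation amplifies noise, which is why the signal has to be smoothed before this step.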
Parametrization Algorithm 2:Data Preprocessing(4)
• The activity signal is smoothed by a sixth-order polynomial fit to $\log[X_i(t)]$
• The smoothing procedure captures the dynamics well, while removing noise
• Data for all experiments is concatenated and normalized by the maximal activity for each operon
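The smoothing and normalization steps above might look like the following sketch. The function names, the rescaling of time to [-1, 1] (a numerical convenience) and the synthetic signal are all assumptions for illustration.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def smooth_log_activity(time, activity, order=6):
    """Fit a polynomial of the given order to log(activity); return the smoothed activity."""
    time = np.asarray(time, dtype=float)
    log_x = np.log(np.asarray(activity, dtype=float))
    # Rescale time to [-1, 1] so the high-order fit stays well conditioned
    # (an assumption for numerical stability, not something the slides specify).
    scaled = 2.0 * (time - time.min()) / (time.max() - time.min()) - 1.0
    coeffs = P.polyfit(scaled, log_x, order)
    return np.exp(P.polyval(scaled, coeffs))


def normalize_by_max(activity):
    """Divide an operon's activity by its maximal value."""
    activity = np.asarray(activity, dtype=float)
    return activity / activity.max()


# Synthetic noisy activity profile (illustrative only).
time = np.linspace(0.0, 100.0, 40)
clean = 5.0 * np.exp(-((time - 30.0) / 25.0) ** 2) + 0.5
rng = np.random.default_rng(0)
noisy = clean * np.exp(rng.normal(0.0, 0.05, time.size))
smoothed = normalize_by_max(smooth_log_activity(time, noisy))
```

Fitting the logarithm rather than the raw signal keeps the smoothed activity positive, which matters because the later steps work with 1/X.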
Parametrization Algorithm 3:Parameter Determination(1)
• To determine parameters in equation [1] based on experimental data we transform it into a bilinear form:
$u_i(t) \equiv \dfrac{1}{X_i(t)} = a_i A(t) + b_i$
where:
$a_i = \dfrac{1}{\beta_i k_i}, \qquad b_i = \dfrac{1}{\beta_i}$
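The point of the bilinear form is that 1/X is a straight line in A, so the slope and intercept give the two parameters. The sketch below shows only that relationship: it assumes A(t) is already known, which is not the case in the actual algorithm, and the function name and values are illustrative.

```python
import numpy as np


def fit_bilinear(repressor, activity):
    """Fit 1/X = a*A + b by linear least squares and return (beta, k)."""
    u = 1.0 / np.asarray(activity, dtype=float)
    a, b = np.polyfit(np.asarray(repressor, dtype=float), u, 1)  # slope, intercept
    beta = 1.0 / b  # because b = 1 / beta
    k = b / a       # because a = 1 / (beta * k)
    return beta, k


# Noise-free data generated from trial function [1] with beta = 2, k = 0.5.
A = np.linspace(0.0, 1.0, 11)
X = 2.0 / (1.0 + A / 0.5)
beta_fit, k_fit = fit_bilinear(A, X)
```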
Parametrization Algorithm 3:Parameter Determination(2)
• Now, the $N \times M$ matrix $X_i(t)$, where N is the number of genes and M the number of time points, is modeled by two vectors of size N, $a_i$ and $b_i$, and one vector of size M, $A(t)$
• $2N + M$ variables in total
Parametrization Algorithm 3:Parameter Determination(3) – some algebra
• The standard method of least mean squares solution for such a problem uses SVD (Singular Value Decomposition)
• The mean over i of $u_i(t)$ is removed: $u_i(t) \rightarrow u_i(t) - \mathrm{mean}_i(u_i(t))$
Parametrization Algorithm 3:Parameter Determination(4) – some algebra
• A(t) is the SVD eigenvector with the largest eigenvalue of the matrix:
$J(t, t') = \sum_i u_i(t)\, u_i(t')$
This is the covariance matrix
• Results for A(t) are normalized to fit the constraints:
$A(t=0) = 1, \qquad \min(A(t)) = 0$
• Alternative normalization: add points with $A = 0$ and $X_i$
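A minimal sketch of this SVD step, under stated assumptions: the function name is hypothetical, the data are noise-free and generated from trial function [1], and the arbitrary sign of the singular vector is resolved by assuming the repressor level is at its highest at t = 0.

```python
import numpy as np


def estimate_repressor_profile(activity):
    """Estimate A(t) from an (N promoters x M time points) array of positive activities."""
    u = 1.0 / np.asarray(activity, dtype=float)
    u_centered = u - u.mean(axis=0, keepdims=True)  # remove the mean over promoters i
    # Rows of vt are eigenvectors of J(t, t') = sum_i u_i(t) u_i(t'),
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(u_centered, full_matrices=False)
    v = vt[0]
    # The sign of a singular vector is arbitrary. Assumption: A is at its
    # maximum at t = 0, so keep the sign with the larger v[0] - min(v).
    w = max([v, -v], key=lambda c: c[0] - c.min())
    return (w - w.min()) / (w[0] - w.min())  # enforces A(t=0) = 1 and min(A) = 0


# Synthetic test: three promoters sharing one repressor profile.
A_true = np.array([1.0, 0.6, 0.2, 0.0, 0.1, 0.4, 0.7])
beta = np.array([1.0, 2.0, 4.0])
k = np.array([0.5, 1.0, 0.2])
X = beta[:, None] / (1.0 + A_true[None, :] / k[:, None])
A_est = estimate_repressor_profile(X)
```

The SVD fixes the repressor profile only up to scale and offset, which is why the two normalization constraints are needed to pin it down.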
Parametrization Algorithm 3:Parameter Determination(5) – some algebra
• Perform a second round of optimization for $\beta_i, k_i$ by using a nonlinear least mean squares solver to minimize $(X^{\mathrm{measured}} - X^{\mathrm{predicted}})^2$
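The second optimization round could be sketched as below. The slides do not name a solver, so SciPy's `least_squares` is used here as a stand-in. The function name, the positivity bounds and the starting values are assumptions, and A(t) is held fixed.

```python
import numpy as np
from scipy.optimize import least_squares


def refine_parameters(repressor, measured, beta0, k0):
    """Refine (beta, k) for one promoter by nonlinear least squares, with A(t) held fixed."""
    repressor = np.asarray(repressor, dtype=float)
    measured = np.asarray(measured, dtype=float)

    def residuals(params):
        beta, k = params
        predicted = beta / (1.0 + repressor / k)
        return measured - predicted

    # Positivity bounds are an assumption that keeps the model well defined.
    result = least_squares(
        residuals, x0=[beta0, k0], bounds=([1e-9, 1e-9], [np.inf, np.inf])
    )
    return result.x


# Noise-free data from trial function [1] with beta = 2, k = 0.5.
A = np.linspace(0.0, 1.0, 21)
X_measured = 2.0 / (1.0 + A / 0.5)
beta_fit, k_fit = refine_parameters(A, X_measured, beta0=1.5, k0=0.8)
```

A refinement of this kind minimizes the error in X itself, whereas the linear step minimizes the error in 1/X, which weights low-activity points more heavily.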
Parametrization Algorithm 4:Error Evaluation(1)
• The mean error for promoter i is given by:
$E_i = \dfrac{1}{T} \sum_{t=1}^{T} \dfrac{\left| X_{it}^{\mathrm{measured}} - X_{it}^{\mathrm{predicted}} \right|}{X_{it}^{\mathrm{measured}}}$
where T is the total time of the experiment
• This is taken as a measure of how well the model describes the data
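The mean error can be computed as in this short sketch. It treats T as the number of time points in the sum, which is an assumption about how the total time enters the formula. The function name and the numbers are illustrative.

```python
def mean_relative_error(measured, predicted):
    """Mean over time points of |measured - predicted| / measured."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    total = sum(abs(m - p) / m for m, p in zip(measured, predicted))
    return total / len(measured)


# Illustrative values only.
measured = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.0]
error = mean_relative_error(measured, predicted)
```

Here the three relative deviations are 10%, 10% and 0%, so the mean error is about 6.7%.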
Parametrization Algorithm 4:Error Evaluation(2)
• The error estimate for the parameters is determined by using a graphic method:
$\dfrac{1}{X_i(t)} = a_i A(t) + b_i = \dfrac{A(t)}{\beta_i k_i} + \dfrac{1}{\beta_i}$
is plotted vs. A(t)
Parametrization Algorithm 4:Error Evaluation(3)
• From the maximal and minimal slopes of the graphs the error for $a_i = \dfrac{1}{\beta_i k_i}$ is determined
• From the maximal and minimal intersections with the y axis the error for $\dfrac{1}{\beta_i}$ is determined
Parametrization Algorithm 5:Additional Trial Function(1)
• An extension of the model to the case of cooperative binding – a regulator can be a repressor for some genes and an activator for others, and with different measures:
$X_{ij}(t) = \dfrac{\beta_i}{1 + (A_j(t)/k_i)^{H_i}}$
Parametrization Algorithm 5:Additional Trial Function(2)
$H_i$ - Hill coefficient for operon i
Hill coefficient? A coefficient that describes binding
$H_i > 0$ - repression
$H_i < 0$ - activation
$H_i = 1$ - no cooperation
Parametrization Algorithm 5:Additional Trial Function(3)
Our example: the good agreement between the measured results and those calculated with the trial function suggests there may be no significant cooperativity in the repressor action ($H_i = 1$)
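A small sketch of the extended trial function shows how the sign of the Hill coefficient switches between repression and activation. The function name and parameter values are illustrative assumptions, and the regulator level must be positive when the coefficient is negative.

```python
def hill_activity(beta, k, hill, regulator):
    """Extended trial function: X = beta / (1 + (A / k) ** H)."""
    return beta / (1.0 + (regulator / k) ** hill)


# Illustrative values only.
beta, k = 2.0, 0.5
repressed_low = hill_activity(beta, k, 2.0, 0.25)    # H > 0, low regulator level
repressed_high = hill_activity(beta, k, 2.0, 1.0)    # H > 0, high regulator level
activated_low = hill_activity(beta, k, -2.0, 0.25)   # H < 0, low regulator level
activated_high = hill_activity(beta, k, -2.0, 1.0)   # H < 0, high regulator level
```

With a positive coefficient the activity falls as the regulator level rises, and with a negative coefficient it rises. Setting the coefficient to 1 recovers the original trial function.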
Results: Promoter Activity Profiles(1)
• After about half a cell cycle the promoter activities begin to decrease
• Corresponds to the repair of damaged DNA
Results: Promoter Activity Profiles(2)
• The mean error between repeat experiments performed on different days is about 10%
Results:Assigning Effective Kinetic Parameters
• The error is under 25% for most promoters
Results:Detection of Promoters with Additional Regulation
• Relatively large error may help to detect operons that have additional regulation.
• Examples:
1. lacZ – very large error (150%)
2. uvrY – recently found to participate in another system and to be regulated by other transcription factors (45% error)
Results:Determining Dynamics of an Entire System Based on a Single Representative(1)
• Once the parameters are determined for each operon, we need to measure only the dynamics of one promoter in a new experiment to estimate all other SOS promoter kinetics
$X_n(t) = \dfrac{\beta_n}{1 + \dfrac{k_m}{k_n}\left(\dfrac{\beta_m}{X_m(t)} - 1\right)}$
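The prediction follows from inverting the trial function for the measured promoter m to get A(t), then substituting that A(t) into the trial function for promoter n. A minimal sketch, with a hypothetical function name and illustrative parameter values:

```python
def predict_from_reference(x_m, beta_m, k_m, beta_n, k_n):
    """Estimate X_n(t) from the measured activity X_m(t) of a reference promoter m."""
    predictions = []
    for x in x_m:
        repressor = k_m * (beta_m / x - 1.0)  # invert the trial function for A(t)
        predictions.append(beta_n / (1.0 + repressor / k_n))
    return predictions


# Illustrative values: reference promoter with beta = 2, k = 0.5,
# target promoter with beta = 3, k = 0.25.
A = [0.0, 0.5, 1.0]
x_m = [2.0 / (1.0 + a / 0.5) for a in A]
x_n = predict_from_reference(x_m, 2.0, 0.5, 3.0, 0.25)
```

The repressor profile itself never has to be measured: it is eliminated between the two promoters.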
Results:Determining Dynamics of an Entire System Based on a Single Representative(2)
• The estimated kinetics using data from only one of the operons agree quite well with the measured kinetics for all operons
• Same level of agreement found by using different operons as the base operon
Results:Determining Dynamics of an Entire System Based on a Single Representative(3)
Results:Repressor Protein Concentration Profile
• Current measurements don’t directly measure the concentration of the proteins produced by these operons, only the rate at which the corresponding mRNAs are produced
• The parameterization algorithm allows the profile of the transcriptional repressor, A(t), to be calculated directly.
Summary
• We can apply the current method to any SIM motif in gene regulation networks
• The method won’t work with multiple regulatory factors
Questions?
Thank You For Listening!