Inferring Subnetworks from Perturbed Expression Profiles
D. Pe’er, A. Regev, G. Elidan, N. Friedman
Expression Profiling
An expression profile is a simultaneous measurement of the levels of all mRNAs in a cell population
Experimental design: measure profiles of mutated or treated cultures
Goal: infer regulatory and molecular interactions
[Figure: wild-type and mutant profiles are compared]
Common Approaches
Comparative analysis (Holstege et al. 1998)
Clustering (Hughes et al. 2000)
Limitations:
  Cannot distinguish between direct and indirect interactions
  Limited to pair-wise relations
  Cannot infer a finer context
Bayesian Network Framework
Friedman, Linial, Nachman, Pe’er (JCB 2000)
Probabilistic: characterize statistical relationships between expression patterns of different genes
Multi-variable interactions (beyond pair-wise):
  Identify intermediate interactions
  Handle combinatorial regulation by several gene products
Statistical confidence: assess the statistical significance of the interactions found
Our Contributions
Modeling of mutations and treatments within the Bayesian network framework
Novel data discretization based on guided k-means clustering
New features: Mediator and Regulator
Automatic reconstruction of statistically significant sub-networks.
Modeling Gene Expression
Expression level of each gene = random variable
Gene interaction = probabilistic dependency
Directed acyclic graph: models the dependency structure of the distribution
Each node has a probabilistic function conditioned on its parents in the graph
[Figure: network over Gene 1 to Gene 5 with edges labeled Activator and Inhibitor; conditional probability table P(3 | 1,2) with entries 0.9 0.1, 0.2 0.8, 0.6 0.4, 0.9 0.1]
Graph structure + local probabilities define a unique multivariate distribution
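The factorization stated on this slide can be sketched in a few lines of Python. This is an illustration only: the three-gene network, the binary low/high states, and every probability value are assumptions chosen for the example, not values taken from the deck.

```python
from itertools import product

# Hypothetical three-gene network: gene1 -> gene3 <- gene2.
# States: 0 = low expression, 1 = high expression.
# All probabilities are illustrative assumptions.
parents = {"gene1": (), "gene2": (), "gene3": ("gene1", "gene2")}

# cpt[node][parent_states] = (P(node=0 | parents), P(node=1 | parents))
cpt = {
    "gene1": {(): (0.7, 0.3)},
    "gene2": {(): (0.4, 0.6)},
    "gene3": {
        (0, 0): (0.9, 0.1),
        (0, 1): (0.2, 0.8),
        (1, 0): (0.6, 0.4),
        (1, 1): (0.9, 0.1),
    },
}

def joint_probability(assignment):
    """P(assignment) as the product of each node's local conditional probability."""
    p = 1.0
    for node, pa in parents.items():
        pa_states = tuple(assignment[q] for q in pa)
        p *= cpt[node][pa_states][assignment[node]]
    return p

# Summing over all eight joint states checks that the local tables
# together define a proper distribution.
total = sum(
    joint_probability(dict(zip(parents, states)))
    for states in product((0, 1), repeat=len(parents))
)
print(round(total, 10))
```

Because each local table is normalized, the product over nodes is automatically a valid joint distribution; no global normalization step is needed.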
Mutational Assay
Wild-type measurements: P(rap1 | pgk1) has rows 0.9 0.1 and 0.1 0.9 (one row per state of pgk1)
Equivalence: two models explain the correlation between RAP1 & PGK1
[Figure: two network diagrams over RAP1 and PGK1]
Mutated pgk1 measurements: P(rap1 | pgk1) has rows 0.5 0.5 and 0.5 0.5
Note causality into mutated variable
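A small Python sketch may help show why mutant samples break the equivalence between the two models. It assumes one common treatment of a knockout, in which the mutated gene's own probability term is dropped from the likelihood because its value was set by the experimenter; the slide does not spell out the scoring, so this is an assumption. The sample counts are invented so that the fitted tables resemble the 0.9/0.1 and 0.5/0.5 values above.

```python
import math
from collections import Counter

def log_likelihood(samples, parents):
    """Maximized log-likelihood of a discrete network; terms of mutated genes are skipped."""
    counts = {node: Counter() for node in parents}
    for values, mutated in samples:
        for node, pa in parents.items():
            if node in mutated:
                continue  # value was forced by the experiment, not by its parents
            key = (tuple(values[p] for p in pa), values[node])
            counts[node][key] += 1
    ll = 0.0
    for node in parents:
        totals = Counter()
        for (pa_state, _), n in counts[node].items():
            totals[pa_state] += n
        for (pa_state, _), n in counts[node].items():
            ll += n * math.log(n / totals[pa_state])
    return ll

def make(rap1, pgk1, n, mutated=()):
    """n identical samples; `mutated` names the genes set by intervention."""
    return [({"RAP1": rap1, "PGK1": pgk1}, frozenset(mutated))] * n

# Invented counts: strongly correlated wild-type data, and pgk1 knockouts
# in which rap1 is split evenly.
wild_type = make(0, 0, 45) + make(0, 1, 5) + make(1, 0, 5) + make(1, 1, 45)
knockout = make(0, 0, 25, ["PGK1"]) + make(1, 0, 25, ["PGK1"])

rap1_to_pgk1 = {"RAP1": (), "PGK1": ("RAP1",)}
pgk1_to_rap1 = {"PGK1": (), "RAP1": ("PGK1",)}

# Wild-type data alone cannot tell the two directions apart.
print(log_likelihood(wild_type, rap1_to_pgk1), log_likelihood(wild_type, pgk1_to_rap1))
# Adding the knockout samples separates them.
both = wild_type + knockout
print(log_likelihood(both, rap1_to_pgk1), log_likelihood(both, pgk1_to_rap1))
```

With these invented counts the two structures tie on wild-type data, while the combined data favors the model with no edge into the mutated gene.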
Compendium Dataset (Hughes et al., 2000)
300 samples of yeast deletion mutants and other treatments
Deleted genes are from various functional families
A rich variety of profiles, but there is only one sample from each mutation
[Figure: functional families of the deleted genes]
cell growth, division, DNA synthesis
cell rescue, cell defense, aging
cellular biogenesis
cellular organization
unclear classification
energy
intracellular transport
ionic homeostasis
metabolism
protein destination
protein synthesis
signal transduction
transcription
transport facilitation
unclassified proteins
[Figure: analysis pipeline]
Preprocess: expression data, guided K-means discretization
Learn model: Bayesian network learning algorithm + bootstrap
Feature extraction: Markov, Separator, Edge, Regulator
Feature assembly: reconstruct sub-networks
Visualization: visualize using Pathway Explorer
[Figure: resulting PDAG]
Confidence Estimates: Bootstrap
Bootstrap approach [FGW, UAI99]
[Figure: dataset D is resampled into D1, D2, ..., Dm; a network over nodes E, R, B, A, C is learned from each resample]
Estimate: $C(f) = \frac{1}{m} \sum_{i=1}^{m} 1\{f \in G_i\}$
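The bootstrap estimate can be sketched as follows. The resampling loop follows the slide; everything else is an assumption made for illustration. In particular, `toy_learner` is a hypothetical stand-in that reports gene pairs with a large absolute Pearson correlation, not a Bayesian network learner, and the data are synthetic.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def toy_learner(data, genes=("A", "B", "C"), threshold=0.5):
    """Hypothetical stand-in for structure learning: pairs with |r| above threshold."""
    features = set()
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            xs = [row[i] for row in data]
            ys = [row[j] for row in data]
            if abs(pearson(xs, ys)) > threshold:
                features.add((genes[i], genes[j]))
    return features

def bootstrap_confidence(data, learner, m=200, seed=0):
    """Fraction of the m resampled datasets in which each feature is recovered."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(m):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        for feature in learner(resample):
            counts[feature] = counts.get(feature, 0) + 1
    return {feature: c / m for feature, c in counts.items()}

# Synthetic data: B tracks A closely, C is independent noise.
rng = random.Random(1)
data = []
for _ in range(60):
    a = rng.gauss(0, 1)
    data.append((a, a + rng.gauss(0, 0.1), rng.gauss(0, 1)))

confidence = bootstrap_confidence(data, toy_learner)
print(confidence)
```

Any learner that maps a dataset to a set of features can be passed in place of `toy_learner`; the confidence of a feature is simply how often it reappears across resamples.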
Estimating Confidence
Common practice: pick a single top-scoring model
Problem: insufficient information! In gene expression data there are only a few hundred experiments, so many models score highly
An answer based on one model is useless
Solution: search for features common to many likely models
Sample models from the posterior distribution P(Model | Data)
Confidence of a feature: $P(f \mid D) = \sum_{G} f(G)\, P(G \mid D)$, where f(G) is 1 if the feature holds in G and 0 otherwise (e.g., an edge X → Y)
Markov Relations
Question: do X and Y directly interact?
  Parent-child (one gene regulating the other)
  Hidden parent (two genes co-regulated by a hidden factor)
Examples:
  SST2 - STE6 (0.91, 0.67); SST2: mating pathway regulator, STE6: exporter of mating factor
  ARG3 - ARG5 (0.84, 0.79); both in arginine biosynthesis
  GCN4: transcription factor
Low Correlation Relations
Previously unknown link, strongly supported by evidence in the literature
High confidence, low correlation
  Processes occur under specific conditions
  Captured by our context-specific score
ESC4 - KU70 (0.91, 0.16); annotations: DNA ds break repair, chromatin silencing
Separators
Question: given that X and Y are indirectly dependent, who mediates this dependence?
Separator relation:
  X affects Z, which in turn affects Y
  Z regulates both X and Y
Example: KAR4 (mating transcriptional regulator of nuclear fusion) separates AGA1 and FUS1 (both cell fusion)
Separators: Intra-cluster Context
[Figure: SLT2 (MAPK of cell wall integrity pathway) with CRH1, YPS3, YPS1 (cell wall proteins) and SLR3 (protein of unknown function)]
All gene pairs have high correlation, so clustering groups them together
A putative function is assigned to SLR3: cell wall protein
We can assign a regulatory role to SLT2
Many other signaling and regulatory proteins were identified as direct and indirect separators
Sub-Networks
Reconstruct a conserved sub-network
  Provides a more global picture
  Allows inclusion of features with lower confidence
  Preserved in most networks with high posterior
  Probably reflects a real biological process
Automatic algorithm
  Score: high concentration of pairwise features
  Greedy search for high-scoring subgraphs
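A greedy search of this kind might look like the sketch below. The slide names only the ingredients (a score rewarding a high concentration of pairwise features, and greedy search), so the concrete score used here, the sum of pairwise confidences above a `baseline`, is an assumption, as are the gene names A to E and their confidence values.

```python
from itertools import combinations

def score(nodes, conf, baseline=0.5):
    """Hypothetical score: total pairwise confidence above a baseline inside the node set."""
    return sum(
        conf.get(frozenset(pair), 0.0) - baseline
        for pair in combinations(sorted(nodes), 2)
    )

def greedy_subnetwork(conf, baseline=0.5):
    """Grow a node set from the most confident pair while the score keeps improving."""
    seed = max(conf, key=conf.get)
    nodes = set(seed)
    candidates = set().union(*conf) - nodes
    while candidates:
        best = max(sorted(candidates), key=lambda v: score(nodes | {v}, conf, baseline))
        if score(nodes | {best}, conf, baseline) <= score(nodes, conf, baseline):
            break  # no candidate improves the score
        nodes.add(best)
        candidates.remove(best)
    return nodes

# Invented pairwise feature confidences between hypothetical genes A..E.
conf = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"A", "C"}): 0.8,
    frozenset({"B", "C"}): 0.7,
    frozenset({"C", "D"}): 0.6,
    frozenset({"A", "D"}): 0.55,
    frozenset({"B", "D"}): 0.2,
    frozenset({"D", "E"}): 0.65,
}

print(sorted(greedy_subnetwork(conf)))
```

Under this score a node joins the sub-network only if its features to the current members are, on balance, above the baseline, which is how a moderately confident feature can be kept when it is surrounded by stronger ones.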
Increased Confidence (simulated data)
[Figure: percent of false positives (0 to 1) plotted against confidence (0 to 1), for the entire network and for the subnetwork]
Rosetta networks in Pathway Explorer
http://www.cs.huji.ac.il/labs/compbio/ismb01
Summary
Primary contribution: automated methodology for finding patterns of interactions among genes
Clear semantics
  Principled handling of mutations and interventions
Built-in handling of statistical significance
  Feature confidence
  Extracts significant sub-networks
Differs from clustering
  Inter-cluster relations
  Finer intra-cluster structure
Provides biologists with promising hypotheses