Bayesian network models of Biological signaling pathways
K. Sachs
From Phospho-molecular profiling to Signaling pathways
High throughput data
Flow Measurements
[Table: columns Raf, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt, ...; rows Cell1, Cell2, Cell3, Cell4, ..., Cell600]
Signaling Pathways
Picture: John Albeck
Outline
What are signaling pathways?
What kind of data is available to study them?
How do we use Bayesian networks to learn their structure?
Two extensions:
Markov neighborhood algorithm
Bayesian network based cyclic networks (BBCs)
Cells respond to their environment
Cell death, Proliferation, Secrete cytokines
Inside each cell is a molecular network
“Central Dogma”
DNA → (Transcription) → mRNA → (Translation) → Protein → (Modification) → Modified Protein
DNA: ‘Blueprint’, instructions for production of all proteins
mRNA: Delivers instructions for a specific gene
Ribosome: Protein-production factory
Signaling & Genetic pathways
[Figure: a signaling pathway (A, B, C) and a genetic pathway (A, B, TF, DNA, RNA, C), leading to a cell response]
Spectrum of Modeling Tools in Systems Biology
$$\frac{d[R]}{dt} = k_1[LR] - k_2[R][L]$$
...
Bayesian Networks
Graph
Node: Measured level/activity of protein
Edge: Influence (dependency) between proteins
Conditional probability distributions
Each node has a conditional probability given its parents
[Figure: example network over Protein A, Protein B, Protein C, Protein D, Protein E, with a plot of P(B|A=‘On’)]
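As a minimal sketch of what "each node has a conditional probability given its parents" means in practice, the joint distribution of a Bayesian network is the product of one conditional table per node. The three-node chain A → B → C and every probability value below are made up for illustration; they are not taken from the slides or the data.

```python
# Hypothetical chain A -> B -> C with binary states; all numbers are illustrative.
STATES = ("on", "off")

p_a = {"on": 0.3, "off": 0.7}                      # P(A)
p_b_given_a = {                                    # P(B | A), indexed [a][b]
    "on": {"on": 0.9, "off": 0.1},
    "off": {"on": 0.2, "off": 0.8},
}
p_c_given_b = {                                    # P(C | B), indexed [b][c]
    "on": {"on": 0.6, "off": 0.4},
    "off": {"on": 0.1, "off": 0.9},
}


def joint(a, b, c):
    """P(A=a, B=b, C=c) as the product of each node's conditional given its parents."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]


# The factorized joint must sum to 1 over all state combinations.
total = sum(joint(a, b, c) for a in STATES for b in STATES for c in STATES)
print(round(total, 12))
```

The graph fixes which tables exist; the tables carry the strength and shape of each influence.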
How do we use Bayesian Networks to infer pathways?
The Technical Details
Score candidate models:
$$\mathrm{BayesianScore}(S) = \log P(S \mid D) = \log P(S) + \log P(D \mid S) + c$$
$$P(D \mid S) = \int \cdots \int P(D \mid \Theta, S)\, P(\Theta \mid S)\, d\theta_1 \cdots d\theta_n \quad \text{(analytical solution!)}$$
Use a heuristic search to find high scoring models
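To make the "analytical solution" concrete, here is a minimal sketch of the marginal-likelihood term log P(D | S) for a single discrete node. It assumes a uniform Dirichlet prior (pseudo-count `alpha = 1` per cell), for which the integral over parameters has a closed form in gamma functions. The slides do not say which prior was used, so the prior, the function name `log_marginal_likelihood`, and the toy data are all assumptions.

```python
from collections import Counter
from itertools import product
from math import lgamma


def log_marginal_likelihood(data, child, parents, states=(0, 1), alpha=1.0):
    """Closed-form log P(D | S) contribution of one node given its parent set.

    data    -- list of dicts mapping variable name -> discrete state
    alpha   -- Dirichlet pseudo-count per (parent configuration, child state) cell
    """
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    r = len(states)
    total = 0.0
    for config in product(states, repeat=len(parents)):
        n_config = sum(counts[(config, s)] for s in states)
        total += lgamma(alpha * r) - lgamma(alpha * r + n_config)
        for s in states:
            total += lgamma(alpha + counts[(config, s)]) - lgamma(alpha)
    return total


# Toy data in which B always copies A: the structure A -> B should score higher.
data = [{"A": 0, "B": 0}] * 20 + [{"A": 1, "B": 1}] * 20
with_parent = log_marginal_likelihood(data, "B", ["A"])
without_parent = log_marginal_likelihood(data, "B", [])
print(with_parent > without_parent)
```

Summing this term over all nodes, plus a structure prior log P(S), gives a score that a heuristic search can compare across candidate graphs.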
Protein data
Western blot
Protein arrays
Mass Spectrometry
All of these lysate approaches give 1 measurement per protein for 10^3-10^7 cells
Flow Cytometry: Single Cell Analysis
Thousands of datapoints
Stimulations and perturbations
[Figure: T-cell signaling diagram over CD3, CD28, LFA-1, LAT, Lck, Zap70, SLP-76, VAV, PI3K, PLC, PIP2, PIP3, RAS, Raf, Mek1/2, Erk1/2, PKC, PKA, Akt, MAPKKK, MEK3/6, MEK4/7, p38, JNK, Cytohesin, JAB-1, with the numbered perturbation sites listed below]
Activators
1. α-CD3
2. α-CD28
3. ICAM-2
4. PMA
5. β2cAMP
Inhibitors
6. G06976
7. AKT inh
8. Psitect
9. U0126
10. LY294002
T-Lymphocyte Data
Primary human T-Cells
9 conditions (6 specific interventions)
9 phosphoproteins, 2 phospholipids
600 cells per condition, 5400 data points
12 Color Flow Cytometry
Conditions (multi-well format): perturbation a, perturbation b, ..., perturbation n
Datasets of cells: condition ‘a’, condition ‘b’, ..., condition ‘n’
Measured: Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt
Omar Perez
Statistical Dependencies
Edges can be directed (primarily) due to the use of interventions
[Figure: example network over A, B, C, D, E, with a plot of Phospho A against Phospho B]
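A small simulation can illustrate why interventions help direct edges. The sketch below assumes a ground truth A → B with a linear Gaussian relationship. The function `sample`, the coefficients, and the clamp value are all hypothetical and stand in for an inhibitor or activator. Clamping A shifts B, while clamping B leaves A where it was, which is the asymmetry a correlation alone cannot show.

```python
import random
from statistics import mean


def sample(n, rng, do_a=None, do_b=None):
    """Draw n cells from the toy model A -> B; do_a / do_b clamp a variable."""
    rows = []
    for _ in range(n):
        a = do_a if do_a is not None else rng.gauss(0.0, 1.0)
        b = do_b if do_b is not None else 2.0 * a + rng.gauss(0.0, 0.5)
        rows.append((a, b))
    return rows


rng = random.Random(0)
observed = sample(5000, rng)
clamp_a = sample(5000, rng, do_a=-2.0)   # intervene on the upstream variable
clamp_b = sample(5000, rng, do_b=-2.0)   # intervene on the downstream variable

# Perturbing A moves B, but perturbing B does not move A.
shift_in_b = mean(b for _, b in clamp_a) - mean(b for _, b in observed)
shift_in_a = mean(a for a, _ in clamp_b) - mean(a for a, _ in observed)
print(f"shift in B after clamping A: {shift_in_b:.2f}")
print(f"shift in A after clamping B: {shift_in_a:.2f}")
```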
Overview
Multiparameter Flow Cytometry
Conditions (multi-well format): perturbation a, perturbation b, ..., perturbation n
Datasets of cells: condition ‘a’, condition ‘b’, ..., condition ‘n’
Measured: Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt
Bayesian Network Analysis
Influence diagram of measured variables
Inferred Network
[Figure: inferred network over PKC, Raf, P44/42, Mek, Plc, PKA, Akt, Jnk, P38, PIP2, PIP3. Legend: Phospho-Proteins, Phospho-Lipids, Perturbed in data]
How well did we do?
Direct phosphorylation
Features of Approach
Direct phosphorylation: Mek → Erk
Difficult to detect using other forms of high-throughput data:
-Protein-protein interaction data
-Microarrays
How well did we do?
Indirect Signaling
Indirect signaling
Dismissing edges: Raf → Mek → Erk
Indirect signaling: PKC → Jnk, standing in for PKC → Mapkkk → Mek4/7 → Jnk (Mapkkk and Mek4/7 not measured)
Indirect connections can be found even when the intermediate molecule(s) are not measured
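Both ideas on this slide can be illustrated with a toy chain Raf → Mek → Erk. The sketch below assumes linear Gaussian relationships and uses partial correlation as the conditional-independence check; that is a stand-in for illustration, not the discrete Bayesian network scoring used in the talk. If Mek is ignored, Raf and Erk remain clearly correlated (an indirect connection is visible). Once Mek is conditioned on, the dependence vanishes, so a direct Raf → Erk edge can be dismissed.

```python
import random
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


rng = random.Random(1)
raf = [rng.gauss(0.0, 1.0) for _ in range(5000)]
mek = [r + rng.gauss(0.0, 1.0) for r in raf]      # Raf -> Mek
erk = [m + rng.gauss(0.0, 1.0) for m in mek]      # Mek -> Erk

marginal = pearson(raf, erk)                # Mek treated as unmeasured
conditional = partial_corr(raf, erk, mek)   # Mek measured and conditioned on
print(f"corr(Raf, Erk) = {marginal:.2f}")
print(f"corr(Raf, Erk | Mek) = {conditional:.2f}")
```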
Indirect signaling - Complex example
Is this a mistake?
The real picture
Phospho-protein specific
More than one pathway of influence
[Figure: inferred edges among PKC, Raf, Mek, compared with the real picture involving PKC, Ras, Raf S259, Raf S497, Mek]
How well did we do?
Signaling pathway reconstruction [Sachs et al 2005]
[Figure: inferred network over PKC, Raf, Erk, Mek, Plc, PKA, Akt, Jnk, P38, PIP2, PIP3. Node legend: Phospho-Proteins, Phospho-Lipids, Perturbed in data. Edge legend: Expected Pathway, Reported, Missed, Reversed]
15/17 Classic
17/17 Reported
3 Missed
Caveats
Inhibitor specificity: binding site similar across proteins
Reagent availability and specificity
Data quality
These are issues in many biological applications!
[Cartoon caption: “I think I’ll bind here”]
Markov Neighborhood Algorithm
Building larger networks
12 color capability: Model 50-100 variables
4 color capability: Model 12 variables
[Figure: network over PKC, Raf, P44/42, Mek, Plc, PKA, Akt, Jnk, P38, PIP2, PIP3]
~80 proteins involved in MAPK signaling
(11, at the cutting edge, is NOT enough!)
Measured subsets = Incomplete dataset (Missing data)
Insufficient information for standard approaches (will perform poorly)
Use a set of biologically motivated assumptions to constrain search
And to reduce the number of experiments
$$\binom{11}{4} = 330$$
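The count of 330 is the number of distinct 4-molecule subsets that could be measured out of 11 molecules. A short check, assuming molecules are simply numbered 1 to 11:

```python
from itertools import combinations
from math import comb

molecules = range(1, 12)                    # 11 molecules, numbered 1..11
n_subsets = comb(11, 4)                     # closed-form binomial coefficient
subsets = list(combinations(molecules, 4))  # explicit enumeration of every subset

print(n_subsets)      # 330
print(len(subsets))   # 330
```

Running all 330 experiments is what the constrained search is meant to avoid.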
Constraining the search
Identify candidate parents using ‘Markov neighborhoods’ (for each variable)
Plus potential perturbation parents
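The slides do not define ‘Markov neighborhood’ precisely. As a point of reference only, the sketch below computes the standard Markov blanket of a node in a known directed graph: its parents, its children, and the other parents of its children. Treating this as the candidate-parent set is an assumption made for illustration, and the graph and the function name `markov_blanket` are hypothetical.

```python
def markov_blanket(parents_of, node):
    """Parents, children, and co-parents of `node` in a DAG given as child -> parents."""
    parents = set(parents_of.get(node, ()))
    children = {child for child, ps in parents_of.items() if node in ps}
    co_parents = {p for child in children for p in parents_of[child]} - {node}
    return parents | children | co_parents


# Toy graph: A -> B -> C <- D, C -> E
toy_graph = {"A": [], "B": ["A"], "C": ["B", "D"], "D": [], "E": ["C"]}

blanket_b = markov_blanket(toy_graph, "B")
print(sorted(blanket_b))   # ['A', 'C', 'D']
```

Restricting each variable's candidate parents to a small neighborhood like this shrinks the space of structures the search has to score.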
Approach overview
Measured subsets: Molecules 1, 3, 7, 9; Molecules 2, 4, 7, 10; Molecules 1, 2, 6, 11
Bayesian Network Analysis (Constrained search)
Variables: Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt
Neighborhood reduction
4 color capability
Conditional independencies in the substructure?
[Figure: example network over A, B, C, D, E, F, with substructure A, B, C]
411
Accurate Reproduction of Model
~15 experiments, 4-colors
[Figure: reproduced network over PKC, Raf, Erk, Mek, Plc, Akt, Jnk, P38, PIP2, PIP3, PKA. Legend: Confidence value different from original model]
Active learning approach
[Figure: measured variables Raf, Mek1/2, Erk, p38, PKA, PKC, Jnk, PIP2, PIP3, Plc, Akt]
Learning cyclic structures with Bayesian networks
Biological networks contain many loops
Bayesian networks are constrained to be acyclic
So…
Overcoming acyclicity
Signaling pathways contain many cycles
Bayesian networks are constrained to be acyclic
How can we accurately model pathways with cycles?
[Figure: signaling loop over GRB2/SOS, Ras, Raf, MEK, Erk]
Develop a new, Bayesian network derived algorithm that models cycles…
Bayesian Network Based Cyclic Networks (BBCs)
I. Break loops with molecule inhibitors
II. Use BN to learn the structure (now not cyclic!)
III. Close loops
[Figure: loop over GRB2/SOS, Ras, Raf, MEK, Erk, broken with a Mek inhibitor]
Solomon Itani
Bayesian Network Based Cyclic Networks (BBCs)
I. Break loops with molecule inhibitors
II. Use BN to learn the structure (now not cyclic!)
III. Close loops
Detect loops: P(A)A* ~= P(A)
P(B|Pa(B)) A* ~= P(B|Pa(B))
[Figure: loop over GRB2/SOS, Ras, Raf, MEK, Erk, and a two-node example A, B]
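The graph side of steps I and III can be sketched as follows. This is not the authors' algorithm; it only illustrates the idea that inhibiting one molecule can turn a cyclic graph into an acyclic one that a Bayesian network can represent. The edge list (including the Erk → GRB2/SOS feedback edge) and the choice to model inhibition by dropping the inhibited node's outgoing edges are assumptions made for illustration.

```python
def is_acyclic(edges):
    """Kahn-style check: True if the directed graph given by (src, dst) pairs has no cycle."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        current = frontier.pop()
        visited += 1
        for src, dst in edges:
            if src == current:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    frontier.append(dst)
    return visited == len(nodes)


def break_loops(edges, inhibited):
    """Step I (assumed model): an inhibited node no longer influences its targets."""
    return [(src, dst) for src, dst in edges if src != inhibited]


# Hypothetical loop for illustration only.
loop = [("GRB2/SOS", "Ras"), ("Ras", "Raf"), ("Raf", "MEK"),
        ("MEK", "Erk"), ("Erk", "GRB2/SOS")]

broken = break_loops(loop, "MEK")            # step I: Mek inhibitor
removed = [e for e in loop if e not in broken]
closed = broken + removed                    # step III: close the loop again

print(is_acyclic(loop), is_acyclic(broken), is_acyclic(closed))
```

Step II, learning the structure on the inhibited data, would run on the acyclic `broken` graph's variables with an ordinary Bayesian network search.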
Future work
Larger network from overlapping sets (Markov neighborhood)
Dynamic models over time
Differences in signaling (sub-populations, treatment conditions, cell types, disease states)
Acknowledgements
Shigeru Okumura
Solomon Itani
Garry Nolan
Dana Pe’er
Doug Lauffenburger
Omar Perez
Dennis Mitchell
Mesrob Ohannessian
Funding
LLS post doctoral fellowship
Extra slides
Mathematical Intuition
[Figure: network over A, B, C] C is independent of A given B.
[Figure: network over A, B, C, D] C is independent of A given B and D.
1) No need to introduce time!!!
2) When loops are broken, the result is a BN!!!
Prediction: Erk → Akt
Erk1/2 unperturbed
Erk → Akt not well established in literature
Predictions:
Erk1/2 influences Akt
While correlated, Erk1/2 does not influence PKA
[Figure: network over PKC, Raf, Erk1/2, Mek, PKA, Akt]
Validation
siRNA on Erk1/Erk2 → Select transfected cells → Measure Akt and PKA
[Figure: flow cytometry histograms of P-Akt (APC-A: p-akt-647) and P-PKA (PE-A: p-pka-546), comparing "control, stimulated" with "Erk1 siRNA, stimulated"]
P-Akt: P=9.4e-5
P-PKA: P=0.28