![Page 1: Building biological networks from diverse genomic data Chad Myers Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics Princeton.](https://reader035.fdocuments.us/reader035/viewer/2022062714/56649d365503460f94a0d83d/html5/thumbnails/1.jpg)
Building biological networks from diverse genomic data
Chad Myers
Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics
Princeton University
PRIME Workshop on Pathway Databases and Modeling Tools, June 16, 2006
2
Motivation: building biological networks from experimental data
Explosion of functional genomic DATA → ? → KNOWLEDGE of components and inter-relationships that lead to function
Find missing pathway components
Detect uncharacterized crosstalk between pathways
Discover novel pathways
3
Motivation: building biological networks from experimental data
noisy
How can we harness this information without sacrificing precision?
4
Directed network discovery: involving the biologist in the search process
Previous approaches to network analysis from genomic data:
largely undirected global approaches that detect interesting network features
Incorporating expert direction can:
Improve sensitivity and precision by using context information
Focus on relevant information for biologist user (allows interactivity)
[Figure: two-hybrid interaction network, yeast (SH3 domain), Boone lab]
Previous work: Bader et al. (2003), Asthana et al. (2004), Yamanashi et al. (2004, 2005), Kato et al. (2005)
5
bioPIXIE system overview
bioPIXIE: Pathway Inference from eXperimental Interaction Evidence
6
Overview
How do we integrate heterogeneous evidence?
Expert-driven network discovery
Making it usable: practical visualization and other interface considerations
Does it work? (evaluation experiments and biological validation)
Challenges/opportunities and future work
7
Heterogeneous data integration
Diverse forms of data: what’s a unifying framework?
Variable coverage, reliability, and relevance
Integration scheme should utilize information in data when available, but be robust when missing
physical binding
genetic interaction
cellular localization
expression
sequence (TF motifs, coding, …)
Bayes net
Map to associations of genes/proteins
8
Bayes net for evidence integration
Functional Relationship
Microarray correlation
Shared transcription factors
Purified complex
Affinity precipitation
2 Hybrid
Synthetic lethality
Synthetic rescue
Co-localization
We infer: $P(\text{protein } i \text{ is functionally related to protein } j \mid \text{evidence})$
Input evidence: grouped by lab (source) and by type
Structure: Naïve Bayes (~60 nodes) (also tried TAN)
CPTs: learned from GO gold standard
Fully-connected, weighted graph of proteins
…
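The integration step can be illustrated with a minimal sketch of naïve Bayes evidence combination for a single protein pair. Everything concrete here is an assumption for illustration: the evidence is treated as binary, the prior and conditional probabilities are made-up numbers (not values learned from the GO gold standard), and the names `PRIOR`, `CPT`, and `posterior_related` are hypothetical rather than part of bioPIXIE.

```python
from math import prod

# Hypothetical prior P(related) and conditional probability tables.
# CPT[source] = (P(evidence present | related), P(evidence present | unrelated))
PRIOR = 0.05
CPT = {
    "two_hybrid": (0.20, 0.01),
    "coexpression": (0.40, 0.10),
    "colocalization": (0.60, 0.30),
}

def posterior_related(observed):
    """Return P(related | evidence) under the naive Bayes assumption.

    `observed` maps source name -> True/False. Sources absent from the
    dict are treated as missing and contribute no factor at all.
    """
    like_rel = prod(
        CPT[s][0] if present else 1.0 - CPT[s][0]
        for s, present in observed.items()
    )
    like_unrel = prod(
        CPT[s][1] if present else 1.0 - CPT[s][1]
        for s, present in observed.items()
    )
    num = PRIOR * like_rel
    return num / (num + (1.0 - PRIOR) * like_unrel)

# With no evidence the posterior falls back to the prior.
no_evidence = posterior_related({})
# A single positive two-hybrid result raises it substantially.
with_y2h = posterior_related({"two_hybrid": True})
```

Skipping missing sources, rather than treating them as negative results, is one simple way to stay robust when a dataset does not cover a given pair.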
9
Overview
How do we integrate heterogeneous evidence?
Expert-driven network discovery
Making it usable: practical visualization and other interface considerations
Does it work? (evaluation experiments and biological validation)
Challenges/opportunities and future work
10
Expert-driven network discovery
Local search in the PPI network centered at the query
Which proteins should we extract as a single, functionally coherent group?
Should consider: confidence in links and topology surrounding query group
11
Extracting relevant proteins
Basic idea: compute expected linkage to query set
$e_{ij} = P(\text{protein } i \text{ is functionally related to protein } j \mid \text{evidence})$
$X_{ij}$: binary RV with prob. $e_{ij}$
$S_Q(p_i)$: # of links from protein $i$ to query set, $Q$
Find proteins that maximize:
$$E[S_Q(p_i)] = E\Big[\sum_{p_j \in Q} X_{ij}\Big] = \sum_{p_j \in Q} E[X_{ij}] = \sum_{p_j \in Q} e_{ij}$$
What about indirect links to the query set?
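The expected linkage to the query set reduces to a sum of edge probabilities, which can be sketched in a few lines. The edge probabilities and protein names below are invented for illustration, and `expected_linkage` is a hypothetical helper, not bioPIXIE code.

```python
def expected_linkage(e, query):
    """Return {protein: E[S_Q(p_i)]}, the sum of e_ij over query members j.

    `e` maps frozenset({i, j}) -> probability that i and j are related.
    Pairs absent from `e` count as probability 0.
    """
    proteins = {p for pair in e for p in pair}
    scores = {}
    for p in proteins - set(query):
        scores[p] = sum(e.get(frozenset((p, q)), 0.0) for q in query)
    return scores

# Hypothetical edge probabilities, for illustration only.
e = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.2,
    frozenset(("B", "C")): 0.7,
    frozenset(("C", "D")): 0.8,
}
scores = expected_linkage(e, query={"A", "B"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, protein D scores zero because it connects to the query only through C, which is exactly the indirect-link case raised above.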
12
Graph search: handling indirect links
Solution: iterative expanding search where indirect links to the query through high confidence neighbors are counted
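The slide gives only the outline of the search, so the following is a sketch of one possible iterative expansion, not the published algorithm. It assumes that each round adds the best-scoring proteins to the working set and then re-scores, so that later rounds count links through earlier additions. The parameters `rounds`, `per_round`, and `min_score`, and the toy edge probabilities, are all hypothetical.

```python
def expand_query(e, query, rounds=2, per_round=1, min_score=0.5):
    """Grow the query set round by round.

    Each round scores every outside protein by its summed edge
    probability to the current set, then adds up to `per_round` of the
    top scorers whose score is at least `min_score`.
    """
    proteins = {p for pair in e for p in pair}
    current = set(query)
    for _ in range(rounds):
        scores = {
            p: sum(e.get(frozenset((p, q)), 0.0) for q in current)
            for p in proteins - current
        }
        # Sort by descending score, breaking ties by name for determinism.
        best = sorted(scores, key=lambda p: (-scores[p], p))[:per_round]
        added = [p for p in best if scores[p] >= min_score]
        if not added:
            break
        current.update(added)
    return current

# Hypothetical edge probabilities, for illustration only.
e = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.2,
    frozenset(("B", "C")): 0.7,
    frozenset(("C", "D")): 0.8,
}
one_round = expand_query(e, {"A", "B"}, rounds=1)
two_rounds = expand_query(e, {"A", "B"}, rounds=2)
```

Here D has no direct link to the query {A, B}, but it is picked up in the second round once its high-confidence neighbor C has joined the set.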
13
Overview
How do we integrate heterogeneous evidence?
Expert-driven network discovery
Making it usable: practical visualization and other interface considerations
Does it work? (evaluation experiments and biological validation)
Challenges/opportunities and future work
14
Making bioPIXIE usable
Guiding principles:
Accessibility (users can access most recent data with little effort)
Simplicity vs. flexibility
Drill-down (details, e.g. supporting exp. data, hidden until requested)
Browseable
15
Graph visualization
16
Overview
How do we integrate heterogeneous evidence?
Expert-driven network discovery
Making it usable: practical visualization and other interface considerations
Does it work? (evaluation experiments and biological validation)
Challenges/opportunities and future work
17
Evaluation experiments
Recovering known network components:
How much does integration help?
Results averaged over 31 pathways, processes, and complexes (KEGG, GO, MIPS)
Use 10 random proteins as the query set and try to recover the remaining members
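A recovery experiment of this kind could be scored as sketched below: hold out all but 10 members of a known group, rank the remaining proteins, and summarize the ranking. Average precision is used here as a stand-in summary of the precision-recall behavior; the slides do not say which exact statistic was computed for this experiment, and both helper names are hypothetical.

```python
import random

def average_precision(ranked, positives):
    """Sum the precision at each hit, then divide by the number of positives.

    Positives that never appear in `ranked` contribute zero.
    """
    hits, total = 0, 0.0
    for rank, p in enumerate(ranked, start=1):
        if p in positives:
            hits += 1
            total += hits / rank
    return total / len(positives) if positives else 0.0

def split_query(members, k=10, seed=0):
    """Pick k random members as the query; the rest are held out."""
    rng = random.Random(seed)
    query = set(rng.sample(sorted(members), k))
    return query, set(members) - query

# Toy group of 25 hypothetical proteins.
members = {f"p{i}" for i in range(25)}
query, held_out = split_query(members)
ap = average_precision(["a", "x", "b"], {"a", "b"})
```

Averaging this score over many groups and random query draws would give a single number per method for comparison.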
18
Evaluation experiments (2)
Recovering known network components:
Do naïve methods of integration/search work just as well?
Results averaged over 31 pathways, processes, and complexes (KEGG, GO, MIPS)
Use 10 random proteins as the query set and try to recover the remaining members
19
Biological validation: finding new components
S. cerevisiae uncharacterized gene, YPL077C
Predicted involvement in chromosome segregation
Using bioPIXIE to characterize unknown genes
20
Biological validation: finding new components
P-value based on blind counting: $1.98 \times 10^{-7}$ (Fisher’s exact test)
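For readers unfamiliar with the test, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table, computed from the hypergeometric tail. The slide does not give the underlying counts or say whether the test was one- or two-sided, so the tables below are hypothetical and are not meant to reproduce the reported p-value.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """Return P(X >= a) for the 2x2 table [[a, b], [c, d]].

    X is hypergeometric: the count in the top-left cell when all row
    and column totals are held fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)

# Hypothetical counts: rows = mutant / wild type, columns = defect / normal.
p_extreme = fisher_one_sided(3, 0, 0, 3)
p_mild = fisher_one_sided(1, 2, 2, 1)
```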
21
(Helmut Pospiech)
Biological validation: novel links between pathways
DNA replication initiation:
Cdc7: “switch” that starts replication (activated by Dbf4)
Linked to Hsp90 complex by our method
Hsp90 (yeast: hsc82, hsp82): cytosolic molecular chaperone that participates in the folding of several signaling kinases and hormone receptors
22
Genetic analysis of DNA replication-Hsp90 link
[Figure: three panels labeled $10^5$ cells, comparing wt, dbf4Δ, hsp82Δ, and dbf4Δ hsp82Δ; wt, dbf4Δ, hsc82Δ, and dbf4Δ hsc82Δ; wt, dbf4Δ, cpr7Δ, and dbf4Δ cpr7Δ at RT, 30°C, and 37°C]
YKO Dbf4 vs. hsp82, hsc82 and co-chaperones: cpr7, sti1, cdc37
23
Overview
How do we integrate heterogeneous evidence?
Expert-driven network discovery
Making it usable: practical visualization and other interface considerations
Does it work? (evaluation experiments and biological validation)
Challenges/opportunities and future work
24
Practical challenges/opportunities
Visualizing complex networks of interactions in a meaningful way
How does it scale with added data?
Easy user navigation around the network
Data-centric vs. established knowledge views
How do we overlay current knowledge of pathways with predictions derived from experimental data?
25
Future work
An observation:
The more specific we can be about the end goal, the better the accuracy of our prediction
26
Future work
Exploiting relevance and reliability variation: context-specific integration
27
Summary
bioPIXIE can facilitate precise network discovery from experimental data using:
Bayesian data integration
Expert-directed search
Web-based dynamic interface
bioPIXIE is an effective tool for browsing genomic evidence and generating specific, testable hypotheses
http://pixie.princeton.edu
28
Acknowledgements
http://pixie.princeton.edu
Olga Troyanskaya
Drew Robson
Adam Wible
Kara Dolinski
Camelia Chiriac
Matt Hibbs
Curtis Huttenhower
David Botstein Lab
Leonid Kruglyak Lab
Thank you!
29
Evaluation experiments (3): what about noise in the query set?
[Figure: AUPRC vs. # of random proteins out of 20 total query proteins]
31
Hydroxyurea sensitivity (replication inhibitor)
[Figure: panels labeled $10^6$ cells at 30°C and 37°C with HU at 0 mM, 50 mM, and 100 mM; strains: wt, cpr7Δ, sti1Δ, dbf4Δ, hsp82Δ, hsc82Δ, dbf4Δ hsc82Δ, dbf4Δ sti1Δ, dbf4Δ cpr7Δ, dbf4Δ hsp82Δ]
32
Is this interaction specific to DNA replication?
[Figure: panel labeled $10^6$ cells at 37°C; strains: wt, cpr7Δ, sti1Δ, dbf4Δ, hsp82Δ, hsc82Δ, dbf4Δ hsc82Δ, dbf4Δ sti1Δ, dbf4Δ cpr7Δ, dbf4Δ hsp82Δ]
MMS treatment has no apparent effect at RT, 30°C or 37°C (shown)
MMS sensitivity (induces DNA damage)
Conclusions:
Hsp90 complex plays specific role in DNA replication
Hsc82 and hsp82 do not have identical function
Possible new link between signaling cascades, stress, and DNA replication
Our system generates specific, testable hypotheses