Module 7: Comparing Datasets and Comparing a Dataset with a Standard
1 Having genome data allows collection of other ‘omic’ datasets Systems biology takes a...
1
Having genome data allows collection of other ‘omic’ datasets
Systems biology takes a different perspective on the entire dataset,
often from a Network Perspective
Networks consist of nodes (entities) and interactions between nodes
2
Ongoing questions in Systems Biology:
Types of network structures and their properties
Effects of positive/negative feedback, feed-forward
Dynamics of signal processing through network
Insulation of signal through the network
Ultimately, using information to predict the output of the network given some input
3
Certain network features are of interest
Connectivity (degree): Number of connections
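Centrality (betweenness): How central a node is, i.e. how many shortest paths between other nodes pass through it
Assortativity: Tendency of nodes to connect to nodes of similar degree (the density of a node's neighborhood is its clustering coefficient)
Distance: length of the shortest path between 2 nodes
Average Distance: average shortest-path length over all node pairs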
Node: entity (protein, gene, metabolite)
Edge: connection (physical, genetic) between entities
DAG: Directed Acyclic Graph
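The features listed above can be computed directly on a small graph. The following is a minimal sketch, assuming an undirected, unweighted network; the five-node edge list and the function names (`build_adjacency`, `degree`, `distance`, `average_distance`) are hypothetical and only for illustration.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy network: nodes are proteins, edges are undirected interactions.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("D", "E")]


def build_adjacency(edge_list):
    """Map each node to the set of its neighbors."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree(adjacency, node):
    """Connectivity (degree): number of connections of a node."""
    return len(adjacency[node])


def distance(adjacency, source, target):
    """Shortest-path length via breadth-first search; None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def average_distance(adjacency):
    """Average shortest-path length over all connected node pairs."""
    dists = [distance(adjacency, u, v) for u, v in combinations(sorted(adjacency), 2)]
    reachable = [d for d in dists if d is not None]
    return sum(reachable) / len(reachable)


adjacency = build_adjacency(edges)
print(degree(adjacency, "B"))          # 3
print(distance(adjacency, "A", "E"))   # 3
print(average_distance(adjacency))     # 1.6
```

Breadth-first search is enough here because every edge counts as one step; a weighted network would need a different shortest-path method.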
4
How do networks evolve?
Evolution of networks through:
* Adding new nodes to a network
* Addition/loss of connections
* Higher-order rewiring
5
Protein-protein interaction (ppi) networks
Data can be collected in several ways:
Goal is to capture every ppi in the cell
Bait immunoprecipitation + tandem mass spectrometry (MS/MS)
high-throughput bait pull-downs and tons of MS/MS
6
Protein-protein interaction (ppi) networks
Data can be collected in several ways:
From Ho et al. Nature 2002
Goal is to capture every ppi in the cell
Bait immunoprecipitation + tandem mass spectrometry (MS/MS)
high-throughput bait pull-downs and tons of MS/MS
7
Protein-protein interaction (ppi) networks
Goal is to capture every ppi in the cell
Data can be collected in several ways:
Large-scale yeast two-hybrid assays (in vivo in yeast)
Fuse bait to the DNA-binding domain (BD) of a transcription factor (TF)
Co-express in yeast: library of proteins fused to the activation domain (AD) of the TF
Reporter (often a drug resistance gene) is only expressed if BD and AD are brought together through a ppi
8
Protein-protein interaction (ppi) networks
Goal is to capture every ppi in the cell
Currently, there are several major issues with ppi data
* Only partial data: some interactions are hard to measure
* Often noisy: different types of noise are inherent to different approaches
* Affected (sometimes) by high false-positive rates
* So far collected only under standard conditions: there are likely to be many condition-specific interactions
Still relatively low overlap between different ppi datasets
Most reliable data: interactions observed in >1 study
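Both the overlap between ppi datasets and the ">1 study" reliability filter can be expressed as set operations. The following is a rough sketch under stated assumptions: the three datasets, their names, and the protein labels are invented, and the Jaccard index is just one possible overlap measure, not necessarily the one used in the studies discussed here.

```python
from collections import Counter

# Hypothetical ppi datasets; each interaction is an unordered pair of protein names.
datasets = {
    "y2h_screen": [("P1", "P2"), ("P2", "P3"), ("P4", "P5")],
    "ap_ms_screen": [("P2", "P1"), ("P3", "P6"), ("P4", "P5"), ("P7", "P8")],
    "literature": [("P1", "P2"), ("P7", "P8")],
}


def as_pair_set(pairs):
    """Store pairs as frozensets so that (A, B) and (B, A) count as the same ppi."""
    return {frozenset(pair) for pair in pairs}


def jaccard(pairs_a, pairs_b):
    """Overlap between two datasets: shared interactions / all interactions."""
    a, b = as_pair_set(pairs_a), as_pair_set(pairs_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def supported_by_multiple(all_datasets, min_studies=2):
    """Interactions observed in at least `min_studies` datasets."""
    counts = Counter()
    for pairs in all_datasets.values():
        counts.update(as_pair_set(pairs))
    return {pair for pair, n in counts.items() if n >= min_studies}


overlap = jaccard(datasets["y2h_screen"], datasets["ap_ms_screen"])
reliable = supported_by_multiple(datasets)
print(overlap)                                   # 0.4
print(sorted(sorted(pair) for pair in reliable))
```

Normalizing each pair to a `frozenset` matters because bait and prey are reported in different orders by different methods.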
9
Conservation of ppi’s across species
‘interologs’ (M. Vidal): conserved protein-protein interaction pairs
Matthews et al. Genome Res 2001: tested Y2H interactions in worm ‘interologs’
- only 25% of previously shown Y2H ppi could be verified in yeast!
- 6/19 (31%) were conserved ppi
- another assessment found 19% of ppi were conserved
So, 19 - 31% of ppi were conserved between yeast and C. elegans
Other methods are emerging to compare networks in a more complex way … but it’s challenging due to partial/noisy networks.
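The conservation percentages above come from a simple count: map each interaction through an ortholog table and check whether the mapped pair is also seen in the other species. The sketch below illustrates that count; the gene names, the ortholog map, and the function `conserved_fraction` are hypothetical, and it assumes 1:1 orthologs.

```python
def conserved_fraction(source_ppis, orthologs, target_ppis):
    """Count source-species ppi whose ortholog pair also interacts in the target.

    Returns (conserved, testable), where testable is the number of source
    interactions for which both partners have an ortholog.
    """
    target = {frozenset(pair) for pair in target_ppis}
    conserved = 0
    testable = 0
    for a, b in source_ppis:
        if a in orthologs and b in orthologs:
            testable += 1
            if frozenset((orthologs[a], orthologs[b])) in target:
                conserved += 1
    return conserved, testable


# Hypothetical toy data: yeast genes Y*, worm genes W*.
yeast_ppis = [("Y1", "Y2"), ("Y2", "Y3"), ("Y3", "Y4"), ("Y5", "Y6")]
orthologs = {"Y1": "W1", "Y2": "W2", "Y3": "W3", "Y4": "W4"}
worm_ppis = [("W2", "W1"), ("W3", "W4"), ("W1", "W4")]

conserved, testable = conserved_fraction(yeast_ppis, orthologs, worm_ppis)
print(conserved, testable)  # 2 3
```

Because both networks are partial and noisy, a count like this is better read as a lower bound on conservation: a missing pair in the target species may simply not have been measured.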
10
Do ppi’s constrain protein evolution?
Fraser et al. Science 2001: significant correlation between rate of protein evolution and connectivity (# ppi)
- reported slower evolution rates for proteins with lots of contacts
But other studies reported no significant correlation …
Bloom & Adami. BMC Evo Biol. 2003: the Fraser correlation was an artifact of some of the datasets
- compiled 7 different yeast large-scale datasets
- argue that affinity purification = more artifactual ppi’s measured, specifically for abundant proteins
- after controlling for this, the remaining partial correlation is explained by protein abundance.
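The idea of "controlling for" abundance can be illustrated with a first-order partial correlation. This is only a sketch on made-up numbers, not a reproduction of either study's analysis; it uses Pearson correlation for simplicity (an assumption, rank-based correlations are a common alternative), and the variable names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up toy values for 8 proteins (arbitrary units).
abundance = [1, 2, 3, 4, 5, 6, 7, 8]
connectivity = [2, 1, 2, 5, 6, 5, 6, 9]        # rises with abundance
evolution_rate = [10, 9, 6, 5, 4, 3, 4, 3]     # falls with abundance

raw = pearson(connectivity, evolution_rate)
controlled = partial_correlation(connectivity, evolution_rate, abundance)
print(round(raw, 2))              # -0.84
print(abs(controlled) < 1e-9)     # True
```

In this toy case the strong negative correlation between connectivity and evolution rate disappears once abundance is held fixed, because both variables were constructed to depend on abundance only.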
11
Genetic interaction networks
Synthetic genetic (epistatic) interactions for double-gene knockouts:
Gene 1 knockout: no phenotype
Gene 2 knockout: no phenotype
Gene 1 & 2 knocked out: sickly
Negative interaction: double-knockout phenotype is worse than expected from the singles
Gene 1 knockout: sickly
Gene 2 knockout: no phenotype or sickly
Gene 1 & 2 knocked out: less sickly
Positive interaction: double-knockout phenotype improves over the singles
Generally more (>2X in yeast) negative than positive interactions detected in a single species
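One common way to make "worse" or "better" than the singles quantitative is a multiplicative fitness model, in which the expected double-knockout fitness is the product of the single-knockout fitnesses. The sketch below assumes that model, which the slides do not specify; the fitness values, the `threshold` of 0.1, and the function names are hypothetical.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Deviation of the observed double-knockout fitness from the
    multiplicative expectation (fitness relative to wild type = 1.0)."""
    return fitness_ab - fitness_a * fitness_b


def classify(fitness_a, fitness_b, fitness_ab, threshold=0.1):
    """Label a gene pair as a negative, positive, or no genetic interaction."""
    score = interaction_score(fitness_a, fitness_b, fitness_ab)
    if score < -threshold:
        return "negative"
    if score > threshold:
        return "positive"
    return "none"


# Singles have no phenotype, double is sickly: negative interaction.
print(classify(1.0, 1.0, 0.4))    # negative
# One single is sickly, double is less sickly: positive interaction.
print(classify(0.5, 1.0, 0.8))    # positive
# Double matches the product of the singles: no interaction.
print(classify(0.8, 0.9, 0.72))   # none
```

A synthetic lethal pair is the extreme negative case, where the double-knockout fitness is close to zero while both singles are viable.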
12
Tischler et al. Nat Gen 2008
Identified synthetic lethal (extreme negative) genetic interactions in S. cerevisiae
Then used RNAi to knock down 837 pairs of orthologs in C. elegans
Only 6 (0.7%) of pairs were synthetic lethal in C. elegans
- Adjusted to ~5% given the error rate
- Not explained by paralogy, as these are all 1:1 orthologs
Compared to:
>60% essentiality conserved across species (individual essential genes)
>30% protein-protein interactions conserved across species
13
Nevan Krogan E-MAPs (epistatic interactions between pairs of gene knockouts)
Science 2008
550 genes, 118,000 different gene-gene knockouts, focusing on chromatin/nuclear functions
* Matches a similar network designed in S. cerevisiae
15 - 30% of negative interactions were conserved between species (>500 my)
- more than in the C. elegans-yeast comparison by Tischler et al.
>50% of positive interactions were conserved
15
Roguev et al. 2008
Several networks appear to have evolved significantly
Figure labels: MSC1 (Sz. pombe-specific paralog of SWR-C); RPD3L; MED.
WHY?
1. Could be subfunctionalization in Sz. pombe by SWR-C paralog MSC1
2. Could be compensation in S. cerevisiae for loss of RNAi
3. Could be missed interactions (different environment, etc.)
16
Many remaining questions …
* What types of protein-protein interactions are most conserved and why?
* What types of networks are more constrained and why?
  Are specific functions, structures, features more constrained?
* What processes allow/promote network ‘rewiring’?
* What effect do network interactions have on protein evolution rates?
* How do ppi networks vary across environmental space and time?