STRING - Predicting novel metabolic pathways through the integration of diverse genome-scale data
Lars Juhl Jensen, EMBL Heidelberg
Too much information – too little knowledge
• Biology is now in the age of large-scale data collection
– Explosive increase in data from genome sequencing, microarray expression studies, screening for protein interactions, etc.
– The data types are highly heterogeneous
– Much data is not being deposited in standardized repositories
– Most data sets are error-prone and suffer from systematic biases
• STRING is a web resource that integrates many different types of information across 100+ species
– Objective definition of metabolic pathways / functional modules
– Prediction of additional pathway members / novel pathways
• We do not intend STRING to be
– a primary repository for experimental data
– a curated database of complexes or pathways
– a substitute for expert annotation
STRING provides a network of functional interactions between proteins
Genomic neighborhood
Species co-occurrence
Gene fusions
Database imports
Exp. interaction data
Microarray expression data
Literature co-mentioning
Inferring functional modules from gene presence/absence patterns
[Figure: The “Cellulosome”. Labels: resting protuberances, protracted protuberance, cell, cell wall, anchoring proteins, cellulosomes, cellulose. © Trends Microbiol, 1999]
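As an illustration of how presence/absence patterns can point to functional modules, here is a minimal sketch that compares gene profiles across species and reports pairs with nearly identical patterns. The gene names, the profiles, and the choice of Hamming distance with a cutoff of 1 are assumptions made for this example; the slides do not specify the measure STRING uses.

```python
from itertools import combinations

# Hypothetical presence/absence profiles (1 = gene present in that species).
# Gene names and values are invented for illustration.
profiles = {
    "geneA": [1, 1, 0, 0, 1, 0],
    "geneB": [1, 1, 0, 0, 1, 0],
    "geneC": [0, 1, 1, 1, 0, 1],
    "geneD": [1, 1, 0, 1, 1, 0],
}


def hamming(p, q):
    """Number of species in which the two genes differ in presence/absence."""
    return sum(a != b for a, b in zip(p, q))


def co_occurring_pairs(profiles, max_distance=1):
    """Return (gene1, gene2, distance) for pairs with similar profiles.

    The Hamming distance and the cutoff are illustrative choices,
    not the scoring scheme described in the talk.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        d = hamming(profiles[g1], profiles[g2])
        if d <= max_distance:
            pairs.append((g1, g2, d))
    return pairs


if __name__ == "__main__":
    for g1, g2, d in co_occurring_pairs(profiles):
        print(f"{g1} and {g2} differ in {d} species")
```

Genes that are gained and lost together across species, like geneA and geneB here, are candidates for belonging to the same functional module.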
Score calibration against a common reference
• Many diverse types of evidence
– The quality of each is judged by very different raw scores
– These are all calibrated against the same reference set
• Requirements for a reference
– Must represent a compromise across all types of evidence
– Broad species coverage
• Both a strength and a weakness
– Scores for all evidence types are directly comparable
– The type of interaction is currently not predicted
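The calibration idea can be sketched as follows: bin the raw scores of one evidence type, then measure in each bin the fraction of protein pairs that are also linked in the reference set. The bin edges, the example data, and the function names are hypothetical; the slides do not describe the actual binning or fitting procedure.

```python
from bisect import bisect_right


def bin_index(raw, bin_edges):
    """Index of the bin whose lower edge is the largest edge <= raw."""
    return max(bisect_right(bin_edges, raw) - 1, 0)


def calibrate(scored_pairs, bin_edges):
    """Fraction of reference-supported pairs per raw-score bin.

    scored_pairs: iterable of (raw_score, in_reference) tuples.
    bin_edges: ascending lower edges of the bins (an assumed layout).
    Bins without data yield None.
    """
    hits = [0] * len(bin_edges)
    totals = [0] * len(bin_edges)
    for raw, in_reference in scored_pairs:
        i = bin_index(raw, bin_edges)
        totals[i] += 1
        hits[i] += bool(in_reference)
    return [h / t if t else None for h, t in zip(hits, totals)]


def calibrated_score(raw, bin_edges, table):
    """Look up the calibrated score for a new raw score."""
    return table[bin_index(raw, bin_edges)]


# Invented raw scores for one evidence type, with reference membership.
example = [
    (0.10, False), (0.30, False), (0.40, True), (0.45, False),
    (0.60, True), (0.70, False),
    (0.90, True), (0.95, True),
]
edges = [0.0, 0.5, 0.8]
table = calibrate(example, edges)

if __name__ == "__main__":
    print(table)
    print(calibrated_score(0.85, edges, table))
```

Because every evidence type is mapped onto the same scale (the fraction of pairs confirmed by the reference), the resulting scores can be compared directly, which is the strength noted above.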
Multiple evidence types from several species
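Once scores are calibrated to a common scale, evidence from several sources can be merged into one confidence value. The sketch below assumes the evidence types are independent and combines them as one minus the product of the individual failure probabilities. This rule is an assumption for illustration; the slides do not state how scores are combined.

```python
def combine_scores(scores):
    """Combine calibrated scores assuming independent evidence types.

    Each score is treated as the probability that the link is real;
    the combined score is 1 - prod(1 - s_i). The independence
    assumption is illustrative and not taken from the talk.
    """
    remaining_doubt = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        remaining_doubt *= 1.0 - s
    return 1.0 - remaining_doubt


if __name__ == "__main__":
    # Two moderately confident, independent lines of evidence.
    print(combine_scores([0.5, 0.5]))
```

Under this rule, several weak but independent observations can add up to a confident prediction, while a single score is passed through unchanged.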
Image: Molecular Biology of the Cell, 3rd edition
Metabolism overview
Defined manually: cutting metabolic maps into pathways
Purine biosynthesis
Histidine biosynthesis
Objective definition of metabolic pathways
Defined objectively: standard clustering of genome-scale data
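To make the idea of defining pathways by clustering concrete, here is a minimal sketch that groups proteins into modules by linking every pair whose score passes a threshold (single-linkage clustering via union-find). The slides only say "standard clustering", so the algorithm, the threshold, and the scores below are assumptions; the gene names are used purely as labels.

```python
def cluster(edges, threshold):
    """Single-linkage clusters from (protein_a, protein_b, score) edges.

    Proteins joined by a chain of edges with score >= threshold end up
    in the same cluster. Illustrative only; not the method from the talk.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, score in edges:
        root_a, root_b = find(a), find(b)  # registers both proteins
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in groups.values())


# Invented association scores between a few biosynthesis genes.
edges = [
    ("purA", "purB", 0.90),
    ("purB", "purC", 0.80),
    ("hisA", "hisB", 0.95),
    ("purC", "hisA", 0.20),  # weak link, below the threshold
]

if __name__ == "__main__":
    for module in cluster(edges, threshold=0.7):
        print(module)
```

With a threshold of 0.7 the weak purC/hisA link is ignored, so the purine and histidine genes fall into separate modules; lowering the threshold below 0.2 would merge them into one.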
Getting more specific – generally speaking
Acknowledgments
• The STRING team
– Christian von Mering
– Berend Snel
– Martijn Huynen
– Daniel Jaeggi
– Steffen Schmidt
– Mathilde Foglierini
– Peer Bork
• ArrayProspector web service
– Julien Lagarde
– Chris Workman
• NetView visualization tool
– Sean Hooper
• Analysis of yeast cell cycle
– Ulrik de Lichtenberg
– Thomas Skøt
– Anders Fausbøll
– Søren Brunak
• Web resources
– string.embl.de
– www.bork.embl.de/ArrayProspector
– www.bork.embl.de/synonyms
Thank you!