Representing Biological Processes: The Reactome Database
Gopal Gopinathrao (1) & Peter D’Eustachio (1,2)
(1) Cold Spring Harbor Laboratory; (2) NYU School of Medicine
[email protected]@med.nyu.edu
Reactome is
- reductionist. All of biology can be represented as events that convert input physical entities into output physical entities.
- a generic parts list. Tissue and state specificity of events are not captured.
- qualitative. Kinetic parameters and data are not captured.
- human-centric. Experiments can use reagents from diverse sources, but most biological processes take place in single species, and our focus is on human biological processes.
- manually curated. Events are annotated by expert curators, and linked to published data.
- open source. All data and software are freely downloadable and reusable.
Data model in a nutshell
[Diagram: a Pathway groups Reactions and other Pathways. Each Reaction converts inputs (Input 1, Input 2) into outputs (Output 1, Output 2) and can carry a CatalystActivity and a Regulation.]
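The data model can be pictured as a handful of nested record types. The sketch below is only an illustration of that idea: the class and field names are assumptions made for this example, not the actual Reactome schema, and the hexokinase step is a toy instance.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Union


@dataclass
class PhysicalEntity:
    name: str
    compartment: Optional[str] = None  # hypothetical field for cellular location


@dataclass
class CatalystActivity:
    catalyst: PhysicalEntity
    activity: str


@dataclass
class Regulation:
    regulator: PhysicalEntity
    kind: str  # "positive" or "negative" (assumed vocabulary)


@dataclass
class Reaction:
    name: str
    inputs: List[PhysicalEntity]
    outputs: List[PhysicalEntity]
    catalyst_activity: Optional[CatalystActivity] = None
    regulations: List[Regulation] = field(default_factory=list)


@dataclass
class Pathway:
    name: str
    events: List[Union["Pathway", Reaction]] = field(default_factory=list)

    def reactions(self) -> Iterator[Reaction]:
        """Yield every Reaction in this pathway, descending into sub-pathways."""
        for event in self.events:
            if isinstance(event, Pathway):
                yield from event.reactions()
            else:
                yield event


# Toy instance: one catalysed reaction nested two pathway levels deep.
glucose = PhysicalEntity("glucose", "cytosol")
atp = PhysicalEntity("ATP", "cytosol")
g6p = PhysicalEntity("glucose 6-phosphate", "cytosol")
adp = PhysicalEntity("ADP", "cytosol")

hexokinase_step = Reaction(
    name="glucose + ATP => glucose 6-phosphate + ADP",
    inputs=[glucose, atp],
    outputs=[g6p, adp],
    catalyst_activity=CatalystActivity(
        PhysicalEntity("hexokinase", "cytosol"), "hexokinase activity"
    ),
)
glycolysis = Pathway("Glycolysis (toy)", [hexokinase_step])
metabolism = Pathway("Carbohydrate metabolism (toy)", [glycolysis])

reaction_names = [r.name for r in metabolism.reactions()]
print(reaction_names)
```

Letting a Pathway hold both Reactions and other Pathways keeps the hierarchy open-ended, which is why `reactions()` is written recursively.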
Annotating more details
- post-translational modifications of proteins
- exact locations of entities and events
Annotating more ambiguities
- sets of entities - defined, open, and candidate
- incompletely specified entities
- “black box” reactions
A geometrical compartment set for locating molecules in human cells
The starry sky view of all of Reactome
[Figure labels:]
- Hemostasis
- Apoptosis
- Insulin signaling
- Notch signaling
- Glucagon signaling
- Cell cycle & DNA replication
- DNA repair
- Transcription
- Translation
- Post-translational modifications
- TCA cycle
- Lipid metabolism
- Amino acid metabolism
- Nucleotide metabolism
- Xenobiotic metabolism
- Carbohydrate metabolism
- HIV & Influenza life cycles
- Sterol metabolism
Reactome Home Page
http://brie8.cshl.edu/cgi-bin/frontpage?DB=gk_central
Reactome Event Page
http://brie8.cshl.edu/cgi-bin/eventbrowser?DB=gk_central&ID=163767&
Export Formats
<owl:Ontology rdf:about="">
  <owl:imports rdf:resource="http://www.biopax.org/release/biopax-level2.owl" />
  <rdfs:comment rdf:datatype="http://www.w3.org/2001/XMLSchema#string">BioPAX pathway converted from "DNA Replication" in the Reactome database.</rdfs:comment>
</owl:Ontology>
<bp:pathway rdf:ID="DNA_Replication">
  <bp:PATHWAY-COMPONENTS rdf:resource="#Regulation_of_DNA_replicationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_strand_elongationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_replication_initiationStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#Switching_of_origins_to_a_post_replicative_stateStep" />
  <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_Replication_Pre_InitiationStep" />
  <bp:ORGANISM rdf:resource="#Homo_sapiens" />
  <bp:NAME rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DNA Replication</bp:NAME>
  <bp:SHORT-NAME rdf:datatype="http://www.w3.org/2001/XMLSchema#string">DNA Replication</bp:SHORT-NAME>
  <bp:XREF rdf:resource="#Reactome69306" />
  <bp:XREF rdf:resource="#REACT_383.2" />
  <bp:COMMENT rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Studies in the past decade have suggested that the basic mechanism of DNA replication initiation is conserved in all kingdoms of life. Initiation in unicellular eukaryotes, in particular Saccharomyces cerevisiae (budding yeast), is well
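A BioPAX export of this kind can be read with any RDF/XML-aware tool. As a rough sketch, the Python snippet below lists the components of each pathway using only the standard library. The `SAMPLE` document is a shortened, well-formed stand-in written for this example (the excerpt on the slide is truncated and omits its namespace declarations), and the `bp` namespace URI is assumed to be the BioPAX Level 2 one implied by the `owl:imports` line.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumed namespace for the bp: prefix (BioPAX Level 2).
BP = "http://www.biopax.org/release/biopax-level2.owl#"

# Hypothetical, shortened stand-in for a full export file.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:bp="{BP}">
  <bp:pathway rdf:ID="DNA_Replication">
    <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_strand_elongationStep" />
    <bp:PATHWAY-COMPONENTS rdf:resource="#DNA_replication_initiationStep" />
    <bp:ORGANISM rdf:resource="#Homo_sapiens" />
    <bp:NAME>DNA Replication</bp:NAME>
  </bp:pathway>
</rdf:RDF>"""


def pathway_components(xml_text):
    """Map each pathway NAME to the IDs referenced by its PATHWAY-COMPONENTS."""
    root = ET.fromstring(xml_text)
    result = {}
    for pathway in root.iter(f"{{{BP}}}pathway"):
        name = pathway.findtext(f"{{{BP}}}NAME")
        components = [
            step.get(f"{{{RDF}}}resource").lstrip("#")
            for step in pathway.findall(f"{{{BP}}}PATHWAY-COMPONENTS")
        ]
        result[name] = components
    return result


components_by_pathway = pathway_components(SAMPLE)
print(components_by_pathway)
```

For real exports a dedicated RDF library would be more robust; plain ElementTree is used here only to keep the sketch dependency-free.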
Bioinformatics Access
• BioMart API
• MySQL/Perl API
• MySQL/Java API
• SOAP/WSDL Interface (multiple languages)
• Flat files
• Database dumps
• Local site install (instructions going into CPBI)
Inference Statistics
direct curation underway
Validation of inference
• Comparison of manually curated yeast reactions from YBP with inferred reactions from human Reactome
• Sensitivity: 72%
• Specificity: 78%
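For readers unfamiliar with the two measures, the sketch below shows how sensitivity and specificity are conventionally computed when a set of inferred reactions is compared with a manually curated set. The slides do not say exactly how the comparison was scored, so the standard definitions are assumed here, and the reaction identifiers are invented toy values, not data from the validation.

```python
def confusion_counts(curated, inferred, universe):
    """Count agreement between curated and inferred reaction sets.

    `universe` is the full set of candidate reactions considered.
    """
    tp = len(curated & inferred)             # curated and also inferred
    fn = len(curated - inferred)             # curated but missed by inference
    fp = len(inferred - curated)             # inferred but not curated
    tn = len(universe - curated - inferred)  # in neither set
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


# Toy data (hypothetical identifiers).
universe = {"r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8"}
curated = {"r1", "r2", "r3", "r4"}
inferred = {"r1", "r2", "r3", "r5"}

tp, fn, fp, tn = confusion_counts(curated, inferred, universe)
toy_sensitivity = sensitivity(tp, fn)
toy_specificity = specificity(tn, fp)
print(toy_sensitivity, toy_specificity)
```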
Inferring chicken reactions from curated human ones
Gaps in Reactome
Gopal Gopinathrao, PhD
Reactome, CSHL
1) Gaps in Reactome annotation
2) Gaps in annotate-able information
3) What can a network / pathway ontology do to fill this gap?
Domains of Biology waiting to be Reactomized
[Figure labels: Information, Cellular, Pathogens, Metabolism, Signaling]
- Protozoan/Host interactions
- Developmental pathways
- Transcriptional regulation
- Feedback loops
- Neuroscience topics
- Degenerative diseases
- Synaptic processes
- Cancer processes
- OMIM-functional (biochemical)
- Complex diseases
- Cellular differentiation, Regulation
Unique human proteins used in pathways (in March 2008)
[Chart labels, in extracted order: Metabolism, Cellular housekeeping, Information, Signaling, Pathogens/Host interactions]
[Chart values, in extracted order: 476, 755, 376, 414, 600]
2500
Swissprot section of UniProt: ~16,000
[Plot: total proteins (y-axis, 0 to 6000) by release (x-axis, 10 to 40); series: unique proteins, unique + isoforms]
Gaps in Reactome annotation
Are some pathway/interaction databases more equal than others?
Are all Swissprot proteins annotatable for pathways/interactions?
Can all interactions be placed in a biologically relevant ‘pathway’ or even in sub-graphs of a network?
If yes, who is going to validate the biological ‘truth’ of any subgraphs derived from a network, and how?
[Terms of biological truth - tissue, regulation, developmental stage, expression …]
Mind what gets filled in…
Watching the gap…
Adding in pathway data decomposed to interactions …

Data Source    Protein (SwissProt)   Coverage (SwissProt)   Interactions   Citation
Reactome       1229 (1194)           5% (8%)                21394          Vastrik et al, 2007
Panther        2997 (1670)           12% (12%)              75694          Mi et al, 2007
CellMap        567 (567)             2% (4%)                1195           cancer.cellmap.org
INOH           719 (711)             3% (5%)                11759          Kushida et al, 2006
NCI-Nature     593 (592)             2% (4%)                2900           pid.nci.nih.gov
NCI-BioCarta   936 (936)             4% (6%)                4752           pid.nci.nih.gov
KEGG           2033 (1947)           8% (13%)               11144          Kanehisa et al, 2004
Total          5283 (3847)           21% (27%)              118867

Adding PPI data to the above …

Source Type    Protein (SwissProt)   Coverage (SwissProt)   Interactions
Pathways       5283 (3847)           21% (27%)              118867
PPIs           10674 (6298)          42% (44%)              43797
Total          13318 (7590)          53% (53%)              162664
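To make "pathway data decomposed to interactions" concrete, here is a minimal sketch of one possible decomposition: every pair of participants in the same reaction or complex is counted as an interaction. The slides do not state which scheme was actually used, so the all-pairs rule, the function names, and the toy reactions are assumptions for illustration. The final check uses only the totals from the tables above: interaction counts add across source types, while protein counts do not, because the same protein can appear in both.

```python
from itertools import combinations


def decompose(participants):
    """Return all unordered pairs among the participants of one reaction."""
    return {frozenset(pair) for pair in combinations(sorted(set(participants)), 2)}


def coverage(proteins, reference):
    """Fraction of a reference protein set that appears in `proteins`."""
    return len(proteins & reference) / len(reference)


# Toy pathway data (hypothetical reactions and protein names).
reactions = {"R1": ["A", "B", "C"], "R2": ["B", "C", "D"]}

interactions = set()
for members in reactions.values():
    interactions |= decompose(members)

proteins = {p for members in reactions.values() for p in members}
reference = {"A", "B", "C", "D", "E", "F", "G", "H"}  # toy reference proteome

n_interactions = len(interactions)  # the B-C pair is shared, so it counts once
toy_coverage = coverage(proteins, reference)
print(n_interactions, toy_coverage)

# Totals from the tables: interactions add up, proteins overlap.
assert 118867 + 43797 == 162664
assert 5283 + 10674 > 13318
```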
NBC Predictions in Reactome
How a network / pathway ontology may help to fill the gap in pathway annotations …
A<----->B
C<----->B
A<----->D
C<----->D
C<----->A
A<-----| B
Known
Novel
New regulatory event
A+B+C+D
Feedback loop
ABCD complex
1. A+B+C+D
2. Interaction of C and D may regulate ABCD complex formation
Updated model for curation would be:
C<----->D
A<----->B
Novel
Feedback loop
3. Post-translational inhibition of B by A may result in downregulation of A, thereby affecting the stability of complex ABCD
Evidence from a network ontology
A<-----| B
New regulatory event
Evidence from a network ontology in a model organism
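The idea of using network evidence to flag candidates for curation can be sketched as a simple set comparison. In the snippet below, which edges count as already curated and which come only from the network is a hypothetical assignment made for this example (the diagram does not survive in the transcript), and the function name is invented.

```python
def edge(a, b):
    """Undirected interaction between two entities."""
    return frozenset((a, b))


def novel_edges(curated, network, complex_members):
    """Network edges not yet curated whose endpoints both sit in a known complex."""
    members = set(complex_members)
    candidates = [e for e in network - curated if e <= members]
    return sorted(tuple(sorted(e)) for e in candidates)


# Hypothetical assignment of known and network-derived interactions.
curated = {edge("A", "B"), edge("C", "B"), edge("A", "D")}
network = curated | {edge("C", "D"), edge("C", "A"), edge("C", "X")}

candidates = novel_edges(curated, network, complex_members="ABCD")
print(candidates)
```

Restricting candidates to members of an already annotated complex is one way to keep the list short enough for a curator to review; the C-X edge is dropped because X is not part of the complex.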
The Team
• CSHL
  – Lincoln Stein (PI)
  – Gopal Gopinathrao (managing editor)
  – Marc Gillespie, Lisa Matthews, Bruce May, Mike Caudy (curators)
  – Guanming Wu, Alex Kanapin (developers)
• EBI
  – Ewan Birney (co-PI)
  – Esther Schmidt, Imre Vastrik, David Croft (developers)
  – Bernard de Bono, Bijay Jassal, Phani Garapati (curators)
• NYU
  – Peter D’Eustachio (co-PI; editor-in-chief)
  – Shahana Mahajan (curator)
P41 HG003751