BioRDF Task: Building a Knowledgebase for Neuroscience Eric Prud’hommeaux, W3C


Page 1

BioRDF Task: Building a Knowledgebase for Neuroscience

Eric Prud’hommeaux, W3C

Page 2

BioRDF Introduction

• BioRDF participants
  • The task is led by Kei Cheung (Yale)
  • Has approximately 20 participants

• BioRDF activities include:
  • Explore the effectiveness of current tools for making data available as RDF/OWL
  • Build a life sciences demo that spans from bench to bedside using RDF/OWL to help scientists better understand the value of the Semantic Web
  • Document our findings to help accelerate the adoption of the Semantic Web by others

• BioRDF Publications
  • A Prototype Knowledge Base for the Life Sciences - http://www.w3.org/TR/hcls-kb/
  • Experience with the Conversion of SenseLab Databases to RDF/OWL - http://www.w3.org/TR/hcls-senselab/

• More information on the group is available at
  • http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup

Page 3

Answering Questions

Goals: Get answers to questions posed to a body of collective knowledge in an effective way

Knowledge used: Publicly available databases, and text mining

Strategy: Integrate knowledge using careful modeling, exploiting Semantic Web standards and technologies

Page 4

Looking for Alzheimer Disease Targets

• Signal transduction pathways are considered to be rich in “druggable” targets

• CA1 Pyramidal Neurons are known to be particularly damaged in Alzheimer’s disease

• Casting a wide net, can we find candidate genes known to be involved in signal transduction and active in Pyramidal Neurons?

Page 5

Answering Questions with Google

Page 6

Answering Questions with PubMed

Page 7

Answering Questions across Data Sets

Page 8

Integrating Heterogeneous Data Sets

• NeuronDB
• BAMS
• Literature
• Homologene
• SWAN
• Entrez Gene
• Gene Ontology
• Mammalian Phenotype
• PDSPki
• BrainPharm
• AlzGene
• Antibodies
• PubChem
• MESH
• Reactome
• Allen Brain Atlas

Page 9

Integrating Heterogeneous Data Sets

Page 10

Integrating Heterogeneous Data Sets

• NeuronDB
• BAMS
• Literature
• Homologene
• SWAN
• Entrez Gene
• Gene Ontology
• Mammalian Phenotype
• PDSPki
• BrainPharm
• AlzGene
• Antibodies
• PubChem
• MESH
• Reactome
• Allen Brain Atlas

Page 11

SPARQL Query Spanning Data Sources

Page 12

Results: Genes, Processes

DRD1, 1812 adenylate cyclase activation
ADRB2, 154 adenylate cyclase activation
ADRB2, 154 arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
DRD1IP, 50632 dopamine receptor signaling pathway
DRD1, 1812 dopamine receptor, adenylate cyclase activating pathway
DRD2, 1813 dopamine receptor, adenylate cyclase inhibiting pathway
GRM7, 2917 G-protein coupled receptor protein signaling pathway
GNG3, 2785 G-protein coupled receptor protein signaling pathway
GNG12, 55970 G-protein coupled receptor protein signaling pathway
DRD2, 1813 G-protein coupled receptor protein signaling pathway
ADRB2, 154 G-protein coupled receptor protein signaling pathway
CALM3, 808 G-protein coupled receptor protein signaling pathway
HTR2A, 3356 G-protein coupled receptor protein signaling pathway
DRD1, 1812 G-protein signaling, coupled to cyclic nucleotide second messenger
SSTR5, 6755 G-protein signaling, coupled to cyclic nucleotide second messenger
MTNR1A, 4543 G-protein signaling, coupled to cyclic nucleotide second messenger
CNR2, 1269 G-protein signaling, coupled to cyclic nucleotide second messenger
HTR6, 3362 G-protein signaling, coupled to cyclic nucleotide second messenger
GRIK2, 2898 glutamate signaling pathway
GRIN1, 2902 glutamate signaling pathway
GRIN2A, 2903 glutamate signaling pathway
GRIN2B, 2904 glutamate signaling pathway
ADAM10, 102 integrin-mediated signaling pathway
GRM7, 2917 negative regulation of adenylate cyclase activity
LRP1, 4035 negative regulation of Wnt receptor signaling pathway
ADAM10, 102 Notch receptor processing
ASCL1, 429 Notch signaling pathway
HTR2A, 3356 serotonin receptor signaling pathway
ADRB2, 154 transmembrane receptor protein tyrosine kinase activation (dimerization)
PTPRG, 5793 transmembrane receptor protein tyrosine kinase signaling pathway
EPHA4, 2043 transmembrane receptor protein tyrosine kinase signaling pathway
NRTN, 4902 transmembrane receptor protein tyrosine kinase signaling pathway
CTNND1, 1500 Wnt receptor signaling pathway

Many of the genes are related to AD through gamma secretase (presenilin) activity
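Several genes in the results appear under more than one process, so it can help to regroup the rows by gene. The following is a minimal sketch, assuming each result row has the form `SYMBOL, ID process name` on its own line; the sample rows are copied from the results above, while `parse_row` and `group_by_gene` are hypothetical helper names introduced only for this illustration.

```python
from collections import defaultdict

# A few rows copied from the results above, one row per line.
ROWS = """\
DRD1, 1812 adenylate cyclase activation
ADRB2, 154 adenylate cyclase activation
DRD1, 1812 dopamine receptor, adenylate cyclase activating pathway
ADAM10, 102 integrin-mediated signaling pathway
ADAM10, 102 Notch receptor processing
"""


def parse_row(line):
    """Split 'SYMBOL, ID process name' into (symbol, id, process)."""
    # Split only on the first ', ' because process names may contain commas.
    symbol, rest = line.split(", ", 1)
    gene_id, process = rest.split(" ", 1)
    return symbol, int(gene_id), process


def group_by_gene(text):
    """Collect the processes listed for each (symbol, id) pair."""
    grouped = defaultdict(list)
    for line in text.splitlines():
        if line.strip():
            symbol, gene_id, process = parse_row(line)
            grouped[(symbol, gene_id)].append(process)
    return dict(grouped)


genes = group_by_gene(ROWS)
for (symbol, gene_id), processes in genes.items():
    print(symbol, gene_id, processes)
```

Grouping on the pair of symbol and number, rather than on the symbol alone, keeps the sketch independent of any assumption about what the number identifies.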

Page 13

Another View of the Query

http://hcls1.csail.mit.edu:8890/sparql/?query=prefix%20go%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fobo%2Fowl%2FGO%23%3E%0Aprefix%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0Aprefix%20owl%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2002%2F07%2Fowl%23%3E%0Aprefix%20mesh%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Frecord%2Fmesh%2F%3E%0Aprefix%20sc%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fscience%2Fowl%2Fsciencecommons%2F%3E%0Aprefix%20ro%3A%20%3Chttp%3A%2F%2Fwww.obofoundry.org%2Fro%2Fro.owl%23%3E%0A%0Aselect%20%3Fgenename%20%3Fprocessname%0Awhere%0A%7B%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fpubmesh%3E%0A%20%20%20%20%20%7B%20%3Fpaper%20%3Fp%20mesh%3AD017966%20.%0A%20%20%20%20%20%20%20%3Farticle%20sc%3Aidentified_by_pmid%20%3Fpaper.%0A%20%20%20%20%20%20%20%3Fgene%20sc%3Adescribes_gene_or_gene_product_mentioned_by%20%3Farticle.%0A%20%20%20%20%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fgoa%3E%0A%20%20%20%20%20%7B%20%3Fprotein%20rdfs%3AsubClassOf%20%3Fres.%0A%20%20%20%20%20%20%20%3Fres%20owl%3AonProperty%20ro%3Ahas_function.%0A%20%20%20%20%20%20%20%3Fres%20owl%3AsomeValuesFrom%20%3Fres2.%0A%20%20%20%20%20%20%20%3Fres2%20owl%3AonProperty%20ro%3Arealized_as.%0A%20%20%20%20%20%20%20%3Fres2%20owl%3AsomeValuesFrom%20%3Fprocess.%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%2Fclassrelations%3E%0A%20%20%20%20%20%7B%7B%3Fprocess%20%3Chttp%3A%2F%2Fpurl.org%2Fobo%2Fowl%2Fobo%23part_of%3E%20go%3AGO_0007166%7D%0A%20%20%20%20%20%20%20union%0A%20%20%20%20%20%20%7B%3Fprocess%20rdfs%3AsubClassOf%20go%3AGO_0007166%20%7D%7D%0A%20%20%20%20%20%20%20%3Fprotein%20rdfs%3AsubClassOf%20%3Fparent.%0A%20%20%20%20%20%20%20%3Fparent%20owl%3AequivalentClass%20%3Fres3.%0A%20%20%20%20%20%20%20%3Fres3%20owl%3AhasValue%20%3Fgene.%0A%20%20%20%20%20%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fgene%3E%0A%20%20%20%20%20%7B%20%3Fgene%20rdfs%3Alabel%20%3Fgenename%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%3E%0A%20%20%20%20%20%7B%20%3Fprocess%20rdfs%3Alabel%20%3Fprocessname%7D%0A%7D&format=&maxrows=50
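The URL on this slide is the SPARQL query percent-encoded into the `query` parameter of an HTTP GET request, alongside `format` and `maxrows`. As a rough sketch of that encoding and how to reverse it, the example below round-trips a shortened query through Python's standard `urllib.parse`. The endpoint, parameter names and the `gene` graph pattern are taken from the slide; `SAMPLE_QUERY` is a cut-down illustration rather than the full query, and `build_query_url` and `decode_query_url` are hypothetical helper names.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint as it appears on the slide.
ENDPOINT = "http://hcls1.csail.mit.edu:8890/sparql/"

# Shortened illustration: only the gene-label pattern of the full query.
SAMPLE_QUERY = """prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?genename
where
{  graph <http://purl.org/commons/hcls/gene>
     { ?gene rdfs:label ?genename }
}"""


def build_query_url(endpoint, sparql, maxrows=50):
    """Percent-encode a SPARQL query into a GET URL."""
    params = {"query": sparql, "format": "", "maxrows": str(maxrows)}
    # quote_via=quote encodes spaces as %20 (not '+'), as in the slide's URL.
    return endpoint + "?" + urlencode(params, quote_via=quote)


def decode_query_url(url):
    """Recover the parameters, including the readable query text."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return {name: values[0] for name, values in params.items()}


url = build_query_url(ENDPOINT, SAMPLE_QUERY)
decoded = decode_query_url(url)
print(url)
print(decoded["query"])
```

Passing the slide's full URL to `decode_query_url` should recover the complete query text in the same way, provided the URL is supplied as a single unbroken string.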

Page 14

Discoverable, Queryable and Accessible on the Web

[Diagram labels: Allen Brain Institute Servers; Javascript; SPARQL; AJAX; Query URL; Google Maps API; Neurocommons Servers]

http://www.brainmap.org://….0205032816_B.aff/TileGroup3/1-0-1.jpg

http://hcls1.csail.mit.edu/map/#Kcnip3@2850,Kcnd1@2800

Page 15

Use Exhibit to Visualize Results

Page 16

Technology

• So far about 350M triples (~20 GB on disk)

• OpenLink Virtuoso - open source triple store

• Commodity hardware: 2x2core duo / 2 disks / 8 GB RAM

Page 17

Going Forwards

• Incorporate additional data sources into the HCLS KB

• Make the interface easier for scientists to use

• Focus on processes for updating the data sources

• Find additional places to host the HCLS KB

Page 18

Conclusions

• The Semantic Web offers a flexible approach to data integration

• BioRDF has integrated over a dozen neuroscience-related resources to simplify answering scientific questions

• The HCLS KB is accessible on the Web today

• Please let us know if you are interested in participating in the BioRDF task