Bio2RDF and Beyond!
![Page 1: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/1.jpg)
EBI : 14-01-10
Bio2RDF and Beyond! Large Scale, Distributed Biological Knowledge Discovery
Michel Dumontier, Ph.D.
Associate Professor of Bioinformatics
Carleton University
Department of Biology
School of Computer Science
Institute of Biochemistry
Ottawa Institute of Systems Biology
Ottawa-Carleton Institute of Biomedical Engineering
![Page 2: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/2.jpg)
Web-based Knowledge Discovery: a very painful process
Carole Goble (ISWC 2005)
![Page 3: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/3.jpg)
Syntactic Web: it takes a lot of digging to get answers
![Page 4: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/4.jpg)
Portals provide structured information and give better results
![Page 5: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/5.jpg)
Surface web: 167 terabytes
Deep web: 91,000 terabytes
545-to-one
We need to expose the deep web
![Page 6: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/6.jpg)
Data silos – not made for sharing
![Page 7: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/7.jpg)
How do we integrate these resources?
![Page 8: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/8.jpg)
We want to simultaneously
query the 1000+ biological databases
![Page 9: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/9.jpg)
The Semantic Web is a web of knowledge.
It is about standards for publishing, sharing and querying knowledge drawn from diverse sources
It enables the answering of sophisticated questions
![Page 10: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/10.jpg)
A growing web of linked data
![Page 11: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/11.jpg)
Life Science Data Contributors
• HCLS (LODD)
• Neurocommons
• Bio2RDF
![Page 12: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/12.jpg)
Resource Description Framework (RDF)
Uniform Resource Identifier (URI) can be used as entity names
Bio2RDF specifies the naming convention
http://bio2rdf.org/uniprot:P05067
is a name for Amyloid precursor protein
http://bio2rdf.org/omim:104300
is a name for Alzheimer disease
uniprot:P05067
omim:104300
Allows one to talk about anything
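The naming convention can be sketched in a few lines of Python. This is only an illustration, assuming the pattern `http://bio2rdf.org/<namespace>:<identifier>` seen in the two examples above; the helper names `bio2rdf_uri` and `split_name` are hypothetical and not part of Bio2RDF.

```python
from typing import Tuple

BASE = "http://bio2rdf.org/"  # base taken from the slide's example URIs


def split_name(name: str) -> Tuple[str, str]:
    """Split a compact name such as 'uniprot:P05067' into (namespace, identifier)."""
    namespace, sep, identifier = name.partition(":")
    if not sep or not namespace or not identifier:
        raise ValueError(f"not a namespace:identifier pair: {name!r}")
    return namespace, identifier


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Build a URI of the form http://bio2rdf.org/<namespace>:<identifier>."""
    return f"{BASE}{namespace}:{identifier}"


for name in ("uniprot:P05067", "omim:104300"):
    print(name, "->", bio2rdf_uri(*split_name(name)))
```

Because every record gets its name from the same rule, two datasets that mention `uniprot:P05067` end up talking about the same resource.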
![Page 13: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/13.jpg)
Resource Description Framework (RDF)
An RDF statement consists of:
– Subject: resource identified by a URI
– Predicate: resource identified by a URI
– Object: resource or literal
Example: uniprot:P05067 is a uniprot:Protein
Allows one to express statements
![Page 14: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/14.jpg)
Multi-Source Data Integration
[Diagram: statements from three sources combined into a unified view]
Sources: UniProt + Gene Ontology + iRefIndex
Statements:
– uniprot:P05067 is a uniprot:Protein
– uniprot:P05067 located in go:Membrane
– uniprot:P05067 interacts with uniprot:P05067
Unified view: uniprot:P05067 with has name, located in, interacts with
The unified view depends on consistent naming
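A minimal sketch of the integration step, assuming each source contributes (subject, predicate, object) statements. The assignment of statements to sources and the informal predicate labels are read off the diagram and should be treated as assumptions; `unified_view` is a hypothetical helper.

```python
from collections import defaultdict

# One list of (subject, predicate, object) statements per source.
# Predicate labels are the informal ones from the slide, not real vocabulary terms.
uniprot = [("uniprot:P05067", "is a", "uniprot:Protein")]
gene_ontology = [("uniprot:P05067", "located in", "go:Membrane")]
irefindex = [("uniprot:P05067", "interacts with", "uniprot:P05067")]


def unified_view(*sources):
    """Group the statements from all sources by subject name."""
    view = defaultdict(set)
    for source in sources:
        for subject, predicate, obj in source:
            view[subject].add((predicate, obj))
    return view


view = unified_view(uniprot, gene_ontology, irefindex)
for predicate, obj in sorted(view["uniprot:P05067"]):
    print("uniprot:P05067", predicate, obj)
```

The merge is a plain dictionary lookup on the subject string, which is why it only works when all sources use exactly the same name for the protein.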
![Page 15: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/15.jpg)
Building statements creates knowledge
– uniprot:P05067 is a Protein
– uniprot:P05067 label "Amyloid precursor protein"
– omim:104300 is a Disease
– omim:104300 label "Alzheimer Disease"
– uniprot:P05067 is involved in omim:104300
![Page 16: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/16.jpg)
RDF has multiple representations
RDF/XML
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF [<!ENTITY u "http://bio2rdf.org/uniprot:">]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:u="http://bio2rdf.org/uniprot:">
  <rdf:Description rdf:about="&u;Q16665">
    <rdf:type rdf:resource="&u;Protein"/>
  </rdf:Description>
</rdf:RDF>
RDF/N3
@prefix u: <http://bio2rdf.org/uniprot:> .
u:Q16665 a u:Protein .
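The same statement can also be written out programmatically. The sketch below emits it as an N-Triples line, a third representation that is not shown on the slide; `to_ntriples` is a hypothetical helper and only handles the case where all three terms are URIs.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # what "a" abbreviates
U = "http://bio2rdf.org/uniprot:"


def to_ntriples(subject: str, predicate: str, obj: str) -> str:
    """Serialize one statement whose three terms are all URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


line = to_ntriples(U + "Q16665", RDF_TYPE, U + "Protein")
print(line)
```

Whatever the syntax, the underlying triple is the same: subject `Q16665`, predicate `rdf:type`, object `Protein`.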
![Page 17: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/17.jpg)
Bio2RDF is a framework to create and provision linked data networks
Francois Belleau, Laval University
Marc-Alexandre Nolin, Laval University
Peter Ansell, Queensland University of Technology
Michel Dumontier, Carleton University
![Page 18: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/18.jpg)
Bio2RDF’s RDFized data fits together
![Page 19: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/19.jpg)
Bio2RDF now serving over 5 / 15 billion triples of linked biological data
![Page 20: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/20.jpg)
Bio2RDF linked data
![Page 21: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/21.jpg)
Bioinformatics Discovery Registry
• SharedName initiative to provide stable URI patterns for data records.
• We added the relationship between entities and records
Directory Service
• ~1700 datasets & dozens of resolvers.
Discovery Service
• Registry links entities to data records, their formats (RDF/XML, HTML, etc.) and provider (Bio2RDF, UniProt)
Redirection Service
• Automatic redirection to data provider document
![Page 22: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/22.jpg)
Something you can look up or search for, with rich descriptions
![Page 23: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/23.jpg)
Bio2RDF: Raw Data!
![Page 24: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/24.jpg)
SPARQL is the new cool kid on the query block
SQL SPARQL
![Page 25: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/25.jpg)
Bio2RDF’s describe service uses SPARQL
CONSTRUCT {
  ?s ?p ?o .
}
WHERE {
  ?s ?p ?o .
  FILTER(?s = <http://bio2rdf.org/ns:id>)
}
Sent to http://ns.bio2rdf.org/sparql?query=...
http://bio2rdf.org/ns:id
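The describe step can be sketched as a URL builder. This assumes that `ns` and `id` on the slide are placeholders for a namespace and an identifier, and that the endpoint host follows the `http://ns.bio2rdf.org/sparql` pattern shown; the function names are hypothetical and no request is actually sent.

```python
from urllib.parse import urlencode


def describe_query(uri: str) -> str:
    """Return the slide's CONSTRUCT query with the resource URI filled in."""
    return (
        "CONSTRUCT { ?s ?p ?o . } "
        "WHERE { ?s ?p ?o . "
        f"FILTER(?s = <{uri}>) }}"
    )


def describe_url(namespace: str, identifier: str) -> str:
    """Build the request URL for the namespace's SPARQL endpoint (assumed host pattern)."""
    uri = f"http://bio2rdf.org/{namespace}:{identifier}"
    endpoint = f"http://{namespace}.bio2rdf.org/sparql"
    return endpoint + "?" + urlencode({"query": describe_query(uri)})


print(describe_query("http://bio2rdf.org/uniprot:P05067"))
print(describe_url("uniprot", "P05067"))
```

`urlencode` percent-encodes the braces and angle brackets, so the query survives being passed as the `query` parameter.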
![Page 26: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/26.jpg)
Bio2RDF’s search service uses SPARQL
http://bio2rdf.org/search/hexokinase
[Diagram labels: kegg, uniprot, gene, bio2rdf.org]
![Page 27: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/27.jpg)
Bio2RDF
Scalable, Decentralized Data Provision
Globally Mirrored and Point Provision
Customizable Query Resolution
![Page 28: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/28.jpg)
Customizable Configuration (in N3)
Single Query, Single Provider
![Page 29: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/29.jpg)
Query Resolution
![Page 30: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/30.jpg)
![Page 31: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/31.jpg)
700,000 queries in November 2009
![Page 32: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/32.jpg)
Yay for data!
But how do we discover more than what was in the data?
![Page 33: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/33.jpg)
Ontology as Strategy
![Page 34: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/34.jpg)
Reasoning and Inference through Semantics
– Fact: uniprot:P05067 is a uniprot:Protein
– Ontology: uniprot:Protein is a chebi:PolyatomicEntity
– Inferred: uniprot:P05067 is a chebi:PolyatomicEntity
Knowledge base: fact + ontology
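The inference on this slide can be illustrated with a small sketch: one asserted fact, one subclass axiom, and a walk up the class hierarchy. This is a toy stand-in for what a reasoner does, not how OWL reasoners are implemented; the dictionaries and `inferred_types` are hypothetical.

```python
# Asserted class membership (the "fact") and subclass axioms (the "ontology").
facts = {"uniprot:P05067": {"uniprot:Protein"}}
subclass_of = {"uniprot:Protein": {"chebi:PolyatomicEntity"}}


def inferred_types(individual):
    """Return the asserted types plus every superclass reachable via subclass_of."""
    seen = set()
    stack = list(facts.get(individual, ()))
    while stack:
        cls = stack.pop()
        if cls not in seen:
            seen.add(cls)
            stack.extend(subclass_of.get(cls, ()))
    return seen


print(sorted(inferred_types("uniprot:P05067")))
# ['chebi:PolyatomicEntity', 'uniprot:Protein']
```

The second type was never stated in the data; it follows from combining the fact with the ontology.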
![Page 35: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/35.jpg)
The Web Ontology Language (OWL) Has Explicit Semantics
Can therefore be used to capture knowledge in a machine understandable way
![Page 36: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/36.jpg)
Over 170 bio-ontologies
![Page 37: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/37.jpg)
From linked data to linked knowledge through syntactic and semantic normalization.
![Page 38: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/38.jpg)
Multiple Ways To Represent Knowledge
Three ways to model the relationship between a protein and the volume it occupies.
![Page 39: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/39.jpg)
Web-based Knowledge Discovery
Some of our queries need services
![Page 40: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/40.jpg)
The Holy Grail:
Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% similar in the active site to kinases known to be involved in cell-cycle regulation in any other species.
![Page 41: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/41.jpg)
Semantic Automated Discovery and Integration
http://sadiframework.org
Mark Wilkinson, UBC
Michel Dumontier, Carleton University
Christopher Baker, UNB
![Page 42: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/42.jpg)
SADI – description-oriented service matching based on registered predicates
![Page 43: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/43.jpg)
![Page 44: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/44.jpg)
What pathways does UniProt protein P47989 belong to?
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}
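One way to picture how such a query could be answered by services is to map each predicate to a callable and resolve the two triple patterns in order. This is a sketch under stated assumptions: the two functions are hypothetical stand-ins for registered services, and the gene and pathway names are placeholders, not real annotations for P47989.

```python
# Hypothetical stand-ins for services: each takes a subject and returns
# the objects the service would attach to it. All returned names are placeholders.
def encoded_by_service(protein):
    return {"uniprot:P47989": ["example:geneA"]}.get(protein, [])


def pathway_service(gene):
    return {"example:geneA": ["example:pathway1", "example:pathway2"]}.get(gene, [])


# Registry: predicate -> service that can produce values for it.
services = {
    "pred:isEncodedBy": encoded_by_service,
    "ont:isParticipantIn": pathway_service,
}


def pathways_for(protein):
    """Resolve the two triple patterns in order, binding ?gene and then ?pathway."""
    rows = []
    for gene in services["pred:isEncodedBy"](protein):
        for pathway in services["ont:isParticipantIn"](gene):
            rows.append((gene, pathway))
    return rows


print(pathways_for("uniprot:P47989"))
```

Each row corresponds to one `?gene ?pathway` binding of the SELECT; a protein with no registered answer simply yields no rows.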
![Page 45: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/45.jpg)
SADI
• Describe the input and output using OWL-DL classes
• Subject of input and output must be the same
• Web services correspond to predicates
• Biocatalogue to register SADI-compliant services
• Simplified migration path for existing web services (Java, Perl)
![Page 46: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/46.jpg)
Build a knowledge base from a series of questions
![Page 47: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/47.jpg)
You want to join the knowledge web
![Page 48: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/48.jpg)
Share your data
![Page 49: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/49.jpg)
Build semantic web services
![Page 50: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/50.jpg)
Get to where you want to be … faster!
![Page 51: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/51.jpg)
Next Steps
Service and Data Buildout
Formal Partnerships
Applications
![Page 53: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/53.jpg)
![Page 54: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/54.jpg)
We’re interested in Personalized Medicine
The ability to offer
• The Right Drug
• To The Right Patient
• For The Right Disease
• At The Right Time
• With The Right Dosage
Genetic and metabolic data will allow drugs to be tailored to patient subgroups
![Page 55: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/55.jpg)
PharmGKB is an emerging resource for pharmacogenomics
+ Role of genes, gene variants, drugs + pharmacokinetics + pharmacodynamics + clinical outcomes
+ Links to publications
- Natural language descriptions
- Variant details in publications
![Page 56: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/56.jpg)
PHARMACOGENOMICS OF DEPRESSION KNOWLEDGE BASE
Contains statements from 11 of 40 relevant publications involving 45 genes / gene variants, 57 drugs annotated with 19 classes of antidepressants, 45 drug treatments, 47 drug-gene interactions, 29 clinical outcomes, 10 drug-induced side-effects, and 8 gene-disease interactions.
![Page 57: Bio2RDF and Beyond!](https://reader038.fdocuments.us/reader038/viewer/2022110119/5560b153d8b42aef3b8b45e3/html5/thumbnails/57.jpg)
QUERYING THE PDKB
Protégé 4, FaCT++, DL Query Tab
Nortriptyline-induced side effects for ABCB1 gene variants
‘side effect’ that ‘is realized by’ some (‘drug treatment’ that ‘involves’ some ‘nortriptyline’ and ‘involves’ some (‘variant of’ some ‘ABCB1’))
Result: postural hypotension is a side effect of nortriptyline treatment of depression for individuals presenting the 3435C>T genotype