Post on 11-Jun-2015
From BioMoby to SADI
The Quest for the Holy Grail!
BioMoby Stats in a nutshell
• >1800 services worldwide (~1300 “alive” at any given time)
• 4 major installations of the Moby Service registry
– Genome Canada, SUN Center of Excellence, Calgary
– Genome España, Barcelona Supercomputing Center
– International Rice Research Institute, Philippines
– Max Planck, Cologne
• Canadian service registry brokers ~400,000 requests/month
• Canadian BioMoby services receive ~700,000 uses/month
• Canadian server just had a significant memory upgrade to improve performance
“The report of my death was an exaggeration”-- Mark Twain
Model Organism Bring Your-Own Database Interface Conference
“MOBY-DIC”
Emma Lake, Saskatchewan
Sept 21, 2001
Are we going after The Holy Grail
here?
The Holy Grail:
(this slide created circa 2002)
Align the promoters of all serine threonine kinases involved exclusively in the regulation of cell sorting during wound healing in blood vessels.
Retrieve and align 2000nt 5' from every serine/threonine kinase in Mus musculus expressed exclusively in the tunica [I | M |A] whose expression increases 5X or more within 5 hours of wounding but is not activated during the normal development of blood vessels, and is <40% homologous in the active site to kinases known to be involved in cell-cycle regulation in any other species.
Holy Grail Demo #1
Imagine there is a “virtual database” containing all of the data from all of the databases, together with the output of every conceivable analysis
How do we query that database?
“SHARE”
Semantic Health And Research Environment
SADI client application
http://biordf.net/cardioSHARE (Pellet)
http://dev.biordf.net/cardioSHARE (Pellet 2)
What pathways does UniProt protein P47989 belong to?
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
    uniprot:P47989 pred:isEncodedBy ?gene .
    ?gene ont:isParticipantIn ?pathway .
}
Recap: what we just saw
A standard SPARQL query was entered into SHARE, a SADI-aware query engine
Recap: what we just saw
The query was interpreted to extract the “triple” patterns
subject, predicate, object
being requested
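As a rough illustration of this step, the sketch below pulls the subject, predicate, object patterns out of the query shown earlier. It is a toy parser written for this example only (the function name `extract_triple_patterns` is hypothetical) and is not how SHARE itself parses SPARQL; it ignores literals, OPTIONAL, FILTER, and everything else beyond plain three-term statements.

```python
import re

QUERY = """
PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}
"""

def extract_triple_patterns(query):
    """Return (subject, predicate, object) tuples from a simple WHERE block."""
    # Take the text between the braces that follow WHERE.
    body = re.search(r"WHERE\s*\{(.*)\}", query, re.DOTALL | re.IGNORECASE).group(1)
    patterns = []
    # Statements end with a whitespace-delimited "."; anything that does not
    # split into exactly three terms is skipped by this toy parser.
    for statement in re.split(r"\s\.(?:\s|$)", body):
        terms = statement.split()
        if len(terms) == 3:
            patterns.append(tuple(terms))
    return patterns

patterns = extract_triple_patterns(QUERY)
for subject, predicate, obj in patterns:
    print(subject, predicate, obj)
```

Each tuple is one "triple pattern"; the predicate in the middle is what the next step uses to look for services.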
Recap: what we just saw
Triple-patterns are passed to SADI for Web Service discovery
Recap: what we just saw
Services capable of generating those triple-patterns are automatically executed,
the triples are stored, and the query is resolved.
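The discover, execute, store, resolve loop can be sketched in a few lines. Everything below is an assumption made for illustration: the registry is a plain dictionary rather than the real SADI registry, the two "services" are local functions rather than Web Services, and the gene and pathway identifiers are made-up placeholders, not real data.

```python
# Hypothetical stand-ins for remote SADI services. Each takes a subject and
# returns the objects it can attach with one predicate. The returned values
# are invented placeholders, not real UniProt or pathway records.
def encoded_by_service(protein):
    return {"uniprot:P47989": ["gene:EXAMPLE_GENE"]}.get(protein, [])

def participant_in_service(gene):
    return {"gene:EXAMPLE_GENE": ["pathway:EXAMPLE_A", "pathway:EXAMPLE_B"]}.get(gene, [])

# Toy registry: predicate -> service able to generate triples with it.
REGISTRY = {
    "pred:isEncodedBy": encoded_by_service,
    "ont:isParticipantIn": participant_in_service,
}

def resolve(patterns):
    """Resolve triple patterns left to right; objects are assumed to be variables."""
    triples = []       # every generated triple is stored here
    solutions = [{}]   # start with a single empty set of variable bindings
    for subject, predicate, obj in patterns:
        service = REGISTRY[predicate]            # "discovery" by predicate
        next_solutions = []
        for bindings in solutions:
            # Use the bound value of a variable, or the constant as written.
            s = bindings.get(subject, subject)
            for value in service(s):             # "execution"
                triples.append((s, predicate, value))
                next_solutions.append({**bindings, obj: value})
        solutions = next_solutions
    return solutions, triples

patterns = [
    ("uniprot:P47989", "pred:isEncodedBy", "?gene"),
    ("?gene", "ont:isParticipantIn", "?pathway"),
]
solutions, triples = resolve(patterns)
for row in solutions:
    print(row["?gene"], row["?pathway"])
```

Note that nothing is looked up in a pre-built database: the triples exist only because the services were called while the query was being answered.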
Recap: what we just saw
We posed and answered a fairly complex database query
WITHOUT A DATABASE
(in fact, the data didn’t even have to exist...)
Recap: what we just saw
Note that there is no centralized ontology
Unlike BioMoby, SADI supports all (OWL) ontologies and
does not invent any of its own
Holy Grail Demo #2
Show me the latest Blood Urea Nitrogen and Creatinine levels of patients who appear to be rejecting their transplants
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
    ?patient rdf:type patient:LikelyRejecter .
    ?patient l:latestBUN ?bun .
    ?patient l:latestCreatinine ?creat .
}
Start burrowing through the LikelyRejecter OWL class and find that we need a regression model OWL class
Regression models have features like slopes and intercepts, and so on. The class is completely decomposed until a set of required Services is discovered that is capable of creating all these necessary properties
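A minimal sketch of that decomposition, under loud assumptions: the slides do not show the actual LikelyRejecter definition, so the class definitions, property names, and service names below are all hypothetical, and real OWL reasoning (done by Pellet in SHARE) is replaced by a simple recursive walk over dictionaries.

```python
# Hypothetical, heavily simplified class definitions: each class lists the
# properties its members must have; a property may point at another class.
CLASS_DEFINITIONS = {
    "LikelyRejecter": {"hasCreatinineTrend": "RegressionModel"},
    "RegressionModel": {"hasSlope": None, "hasIntercept": None},
}

# Hypothetical registry: property -> name of a service that can generate it.
SERVICE_REGISTRY = {
    "hasCreatinineTrend": "collectCreatinineMeasurements",
    "hasSlope": "linearRegression",
    "hasIntercept": "linearRegression",
}

def required_services(class_name, seen=None):
    """Walk a class definition and collect a service for every property."""
    seen = set() if seen is None else seen
    if class_name in seen:          # guard against cyclic definitions
        return []
    seen.add(class_name)
    services = []
    for prop, range_class in CLASS_DEFINITIONS.get(class_name, {}).items():
        service = SERVICE_REGISTRY[prop]
        if service not in services:
            services.append(service)
        # If the property points at another class, decompose that class too.
        if range_class is not None:
            for nested in required_services(range_class, seen):
                if nested not in services:
                    services.append(nested)
    return services

plan = required_services("LikelyRejecter")
print(plan)
```

The walk stops when every property has a service that can produce it, which is the point at which the query engine has something it can actually run.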
Decomposition of the OWL class uncovers the need for a Linear Regression analysis on the patient blood chemistry data
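For readers who want to see the analysis itself, here is a plain least-squares fit producing the slope and intercept mentioned above. The decision rule (`looks_like_rejecter`, with its `min_slope` threshold) and the sample values are assumptions for illustration only; the slides do not state the actual criterion used in the LikelyRejecter class, and this is not clinical guidance.

```python
def linear_regression(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def looks_like_rejecter(days, creatinine, min_slope=0.1):
    """Hypothetical rule: flag a patient whose creatinine trend is rising."""
    slope, _ = linear_regression(days, creatinine)
    return slope >= min_slope

# Made-up measurements (day of observation, creatinine level).
days = [0, 1, 2, 3]
rising = [1.0, 1.2, 1.4, 1.6]
flat = [1.0, 1.0, 1.0, 1.0]

slope, intercept = linear_regression(days, rising)
print(looks_like_rejecter(days, rising), looks_like_rejecter(days, flat))
```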
VOILA!
We just dynamically evaluated if individuals matching a particular high-level concept definition exist
…or can exist
How does
SADI + SHARE
do that?
Please see other presentations uploaded to SlideShare for a full explanation
of SADI Functionality
See also the Taverna and Protégé plug-ins for discovering, running and creating services
Taverna
Sentient Knowledge Explorer
The Holy Grail may not yet be in-hand, but I think we can at least see it from here!
So… now what?
Mark’s Manifesto
What is my next “Holy Grail”?
Science
Support for the in silico Scientific Method
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
The Scientific Method
Discourse: What do you believe? What do I believe?
Disagreement: You’re wrong! And I’m gonna prove it!
Clarity: This is the experiment I am going to do
Reproducibility: This is how I did it (“provenance”)
Clarity: This is my new hypothesis
Workflows (e.g. myExperiment)
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
In opposition to the lessons we learnt from Web 2.0
The Semantic Web in Healthcare and Life Sciences
is currently solving the problems of science…
…by forming institutions
Result:
Large, centrally-designed and centrally-curated ontologies
that enforce “community agreement” about “biological reality”
Science ≠ Consensus
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
Reproducibility
Clarity (hypothesis)
Institutions & Consortia
Disagreement
Clarity (experiment)
Reproducibility
Clarity (hypothesis)
Institutions & Consortia
Consensus
Clarity (experiment)
Reproducibility
????
Institutions & Consortia
Consensus
Clarity (experiment)
To bring the “traditions of Science”
to in silico Science
we need Web 3.0 tools that encourage and facilitate
personal opinion and debate
What has this got to do with SADI and SHARE?
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
    ?patient rdf:type patient:LikelyRejecter .
    ?patient l:latestBUN ?bun .
    ?patient l:latestCreatinine ?creat .
}
Likely Rejecter
I created a small ontology describing my definition of a Likely Rejecter
… it was MY ontology!
I can re-use it
I can modify it as I change my world-view
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
I can publish it for others to use
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
Others can modify it and/or compare it to THEIR world-view
Reproducibility
Clarity (hypothesis)
Discourse
Disagreement
Clarity (experiment)
Sharing my ontology also gives opportunities for micro-attribution;
“Citation” of me is transparent and automatic when someone extends my ontology
Using SADI and SHARE, my personal world-view is explicitly expressed and can be dynamically evaluated against global data and knowledge
Ontology development is distributed and personal rather than centralized
no institutions
“an ecosystem of ideas!”
…but there’s more…
“Likely Rejecter”
I made that up! It came out of my head!
What’s another word for a world-view that you make up?
Hypothesis
Reproducibility
Hypotheses
Discourse
Disagreement
Clarity (experiment)
The “Likely Rejecter” OWL Class is an explicitly-expressed hypothesis;
Members of that class may or may not exist!
Reproducibility
Hypotheses
Discourse
Disagreement
Experiment
Ontologically-expressed Hypotheses drive the discovery, assembly, and analysis of data capable of evaluating their validity
[Figure labels: Hypothesis (Blood Pressure, Hypertension, Ischemia); SADI + SHARE; Database 1; Database 2; Analytical Algorithm]
Join us!
SADI and CardioSHARE are Open-Source projects
Come join us – we’re having a lot of fun!!
http://sadiframework.org
#SADIFramework
SADI SemanticWeb Services Page
C-BRASS: Canadian Bioinformatics Resources As Semantic Services
together with Michel Dumontier, Chris Baker
~$1M funding to help us deploy SADI services and provide training for new service providers
We can help you get started!
“C-BRASS” is on Facebook!
Credits
Benjamin VanderValk (SADI & SHARE)
Luke McCarthy (SADI & SHARE)
Soroush Samadian (CardioSHARE)
Microsoft Research
Fin
This presentation is available on SlideShare: keywords ‘wilkinson’ ‘bosc’