OntoQuest: Exploring Ontological Data Made Easy

Post on 28-Jan-2016

Authors: Li Chen, Maryann Martone, Amarnath Gupta, Lisa Fong, Mona Wong-Barnum

Background

Many application domains in the natural sciences are rapidly building ontologies

• To attempt to standardize the vocabulary of their domains
• To record known relationships that have been established from years of scientific research in the discipline
• To use the ontology as the common framework to exchange, assimilate and compare information
  • Experimental data collected by research groups
  • Curated data compiled from the literature
• To establish relationships with data and ontologies from other domains to achieve interoperability and information integration

The Problem / Requirement

Need a system

• To explore the ontology itself (OWL)
• To relate the terms and relationships in an ontology to data sources (RDBMS, RDF, XML)
• To explore multiple data sources as part of the ontology exploration process (instance inference)
• To update the databases through the ontology exploration tool (instance inference triggered by update)
• To update the ontology and propagate the effects of the update to the mappings between data sources and the ontology (mapping change triggered by update)

OntoQuest

• Ingests any OWL-expressed ontology
  • Uses IBM’s IODT tool (modified) to shred the OWL ontology to a schema
• Instances of ontology classes may reside locally or be accessed from remote sources
• Provides the ability for ontology exploration
  • By traversal of any transitive relationship
  • By SPARQL queries
• Allows data exploration through ontology classes
• Allows single instance updates
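Exploration by traversal of a transitive relationship can be pictured as a reachability walk over the edges of that relationship. The following is only a minimal sketch in Python, not OntoQuest's implementation: the `HAS_PART` edge table and its class names are hypothetical, and the real system works against a relational schema rather than an in-memory dictionary.

```python
from collections import deque

# Hypothetical edges for one transitive relationship (here "has_part").
# The class names are illustrative only, not taken from the real ontology.
HAS_PART = {
    "Neuron": ["Dendrite", "Axon"],
    "Dendrite": ["Dendritic_Spine"],
    "Dendritic_Spine": ["PSD"],
    "Axon": [],
}

def traverse(edges, start):
    """Return every node reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:      # guard against cycles and repeats
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

reachable = traverse(HAS_PART, "Neuron")
print(reachable)
```

The same walk applies to any relationship declared transitive, such as subclass links; only the edge table changes.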

OntoQuest Builds on IODT

• Our system is developed on top of an IBM integrated ontology toolkit
  • Implements a high performance ontology repository built on a relational database
  • Supports a subset of W3C’s OWL and the SPARQL query language
  • Uses a description logic reasoner for class-level inference and a set of logic rules translated from DLP for instance-level inference
    • Hence, inference completeness and soundness on DLP can be guaranteed
  • Back-end database schema design supports efficient querying and inference; performance is superior compared to Jena, Sesame, etc.

[Figure: system architecture diagram. Labeled components: Biologist-Friendly GUI, SKIL APIs, Query Mediator, Cache, Updater, Reasoner, IBM ToolKit, SQL]

System Development Facts

• OntoQuest has a domain-user-friendly GUI and a library of customized APIs
  • Updater: enables inserting classes and instances incrementally into the ontology repository
  • Query Mediator: forms the user’s request as a query against the global view; decomposes it into sub-queries in the form of SQL and SPARQL and sends them to CCDB and CKB; reassembles the results and renders an appropriate view (e.g. graphic) for the user
  • Reasoner: executes rules to compute indirect class memberships and properties
  • Cache: further enhances system efficiency by caching or prefetching frequent query results
• The system is still under development; some of the functionality is not completed or needs to be improved, e.g., propagation of ontology updates
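The Query Mediator's decompose, dispatch and reassemble cycle can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `query_ccdb` and `query_ckb` are hypothetical stand-ins that return canned rows instead of issuing SQL or SPARQL, and merging by a shared `id` field is one plausible way to reassemble results, not necessarily the one OntoQuest uses.

```python
# Hypothetical stand-ins for the two back ends named on the slide.
# A real deployment would send SQL to CCDB and SPARQL to CKB.
def query_ccdb(class_name):
    rows = {"Dendrite": [{"id": "d1", "image": "img_001"}]}
    return rows.get(class_name, [])

def query_ckb(class_name):
    rows = {"Dendrite": [{"id": "d1", "has_Component": "Microtubule"},
                         {"id": "d2", "has_Component": "Mitochondrion"}]}
    return rows.get(class_name, [])

def mediate(class_name):
    """Send one sub-query per source, then merge the rows by instance id."""
    merged = {}
    for source in (query_ckb, query_ccdb):
        for row in source(class_name):
            merged.setdefault(row["id"], {}).update(row)
    # Sort by id so the reassembled view has a stable order.
    return [merged[key] for key in sorted(merged)]

result = mediate("Dendrite")
print(result)
```

Keeping each source behind its own function means a new source can be added to the tuple in `mediate` without touching the merge logic.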

Data Integration with OntoQuest

For every class,

• the ids of the instances of the class are tracked from the respective data stores and maintained locally
• a mapping is used to fetch instances of the class from the relevant store to a local instance store on demand
• only the properties that are associated with the ontology classes are retrieved in a GAV fashion
• all other properties are obtained (for now) only by allowing the user to query the data source directly
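The points above can be sketched in Python as a global-as-view (GAV) mapping in which each ontology class is defined by a query over a source. This is a rough illustration under stated assumptions: an in-memory SQLite database stands in for a remote source, and the table `dendrite_data`, its columns, and the names `MAPPING`, `local_ids` and `local_store` are all invented for the example.

```python
import sqlite3

# In-memory database standing in for one remote source; the table and
# column names are invented for this sketch.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE dendrite_data (id TEXT, diameter REAL, notes TEXT)")
source.executemany(
    "INSERT INTO dendrite_data VALUES (?, ?, ?)",
    [("d1", 3.0, "raw note"), ("d2", 3.4, "raw note")],
)

# GAV-style mapping: each ontology class is defined as a query over a source.
# Only columns mapped to ontology properties are selected ("notes" is not).
MAPPING = {
    "Dendrite": "SELECT id, diameter FROM dendrite_data ORDER BY id",
}

local_ids = {"Dendrite": ["d1", "d2"]}  # instance ids tracked locally
local_store = {}                         # instances fetched on demand

def fetch_instances(class_name):
    """Fetch a class's instances through its mapping once, then reuse them."""
    if class_name not in local_store:
        rows = source.execute(MAPPING[class_name]).fetchall()
        local_store[class_name] = {
            row[0]: {"diameter": row[1]}
            for row in rows
            if row[0] in local_ids[class_name]
        }
    return local_store[class_name]

dendrites = fetch_instances("Dendrite")
print(dendrites)
```

The unmapped `notes` column never reaches the local store, which mirrors the last point: such properties stay available only by querying the source directly.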

The Application Setting for this Demo

• The Ontology
  • Developed by the neuroscientists in our group
  • Describes the subcellular anatomy of the nervous system, including cell types and their subcellular properties and multicellular domains
  • The knowledge base was constructed as a directed graph using the open source tool Protégé (http://protege.stanford.edu), a freely available knowledge management tool written in Java
  • The ontology is expressed in OWL-DL
    • Since OWL-DL supports description logic, inferences are made from the property rules
    • e.g., protein Kv3.2 is located in the plasma membrane; if an instance of axon terminal expresses Kv3.2, then it must have a plasma membrane
• Data Sources
  • A Derby data store for literature-curated instances of subcellular anatomy (CKB)
  • A relational (MySQL) source containing experimental data from CCDB
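The Kv3.2 example on this slide amounts to a single inference rule applied to stored facts. The sketch below shows that one rule in Python with naive forward chaining; it is an illustration only, not the DLP-derived rule set or the reasoner the system uses. The triple layout and the predicate names `expresses`, `located_in` and `has_component` are assumptions.

```python
# Facts are (subject, predicate, object) triples; all names are illustrative.
facts = {
    ("Kv3.2", "located_in", "Plasma_Membrane"),
    ("axon_terminal_1", "expresses", "Kv3.2"),
}

def apply_rule(triples):
    """If x expresses p and p is located_in m, infer that x has_component m."""
    inferred = set(triples)
    changed = True
    while changed:                      # repeat until no new fact appears
        changed = False
        for (x, pred1, p) in list(inferred):
            if pred1 != "expresses":
                continue
            for (p2, pred2, m) in list(inferred):
                if pred2 == "located_in" and p2 == p:
                    new_fact = (x, "has_component", m)
                    if new_fact not in inferred:
                        inferred.add(new_fact)
                        changed = True
    return inferred

closure = apply_rule(facts)
print(sorted(closure))
```

The input set is left unchanged, so the original facts and the inferred ones can be compared after the run.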

[Figure: Subcellular Ontology diagram. Node labels: Intercellular Junction, Multi-cellular Domain, Pinceau, Node of Ranvier, Extracellular Space, Glomerulus, Neuropil, Synaptic Cleft, Subcellular Space, Nerve Cell, Neuron, Glia, Microglia, Macroglia, Compartment, Dendrite, Axon, Cell body, Spine, Dendritic Spine, Shaft, Component, Post synaptic Component, PSD, SER, Actin Filament, Ribosome, Cytoplasm, Organelle, Cytoskeleton, Cilium, Specialization, Inclusion, Plasma Membrane, Molecule, Property, Orientation, Distribution, Morphometrics, Shape. Legend: subclass, has-a]

Demo Scenarios

Step 0: startup screen

Step 1: click to show subclass hierarchy by default

Step 1: other options for expanding different types of hierarchies e.g., the compartment types for Neuroepithelial_Cell and those for Neuron

Step 2: get the detailed info (instances and properties) of the subclass Dendrite of Neuron_Compartment

Step 2a: accessing the property values for the selected class

Step 2b: the CCDB image page corresponding to the selected instance Dendritic_Tree_1 is shown here

Step 2’: some concepts (like Cellular_Dependent_Continuant here) have properties but no instances in CKB

Step 3: right-clicking a concept in the hierarchy pops up a list of view functions to choose from

Step 4: aggregate the has_Component values of all Dendrite instances; the last row shows statistics summary

You may also have noticed that instances of Dendrite include those of its subclasses (such as Dendrite_Tree)

Step 5: drill down to view instances of Dendrite_Tree and aggregate on several numeric property values
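Steps 4 and 5 combine two ideas: a class's instances include those of its subclasses, and numeric property values are summarized across them. A minimal Python sketch of both follows. The subclass table, instance records and `length` values are made up for illustration, and the choice of count, minimum, maximum and mean as the statistics is an assumption, since the transcript does not say what the summary row contains.

```python
from statistics import mean

# Hypothetical subclass links and instance data; values are made up.
SUBCLASSES = {"Dendrite": ["Dendrite_Tree"], "Dendrite_Tree": []}
INSTANCES = {
    "Dendrite": [{"id": "dendrite_1", "length": 100.0}],
    "Dendrite_Tree": [{"id": "tree_1", "length": 150.0},
                      {"id": "tree_2", "length": 200.0}],
}

def instances_of(class_name):
    """Collect direct instances plus those of all subclasses, recursively."""
    found = list(INSTANCES.get(class_name, []))
    for sub in SUBCLASSES.get(class_name, []):
        found.extend(instances_of(sub))
    return found

def summarize(class_name, prop):
    """Return count, minimum, maximum and mean of a numeric property."""
    values = [inst[prop] for inst in instances_of(class_name) if prop in inst]
    return {"count": len(values), "min": min(values),
            "max": max(values), "mean": mean(values)}

summary = summarize("Dendrite", "length")
print(summary)
```

Calling `summarize("Dendrite_Tree", "length")` corresponds to the drill-down in Step 5, where only the subclass's own instances are aggregated.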

SPARQL Query

Add an Instance

Edit Instance Properties

Ontology Store Properties

• What are the cellular components of a dendrite?

29 instances of dendrite

1. Microtubules
2. Mitochondria
3. Hypolemmal cisternae
4. Plasma membrane
5. Smooth endoplasmic reticulum
6. Rough endoplasmic reticulum
7. Polyribosomes
8. Neurofilaments

Average diameter = 3.2 µm
Average length = 150 µm

• How many dendrites does a Purkinje cell have?

3 instances of Purkinje cell dendritic tree

1. Avg branch order = 22
2. Number of primary dendrites = 1.3
3. Avg number of branches = 760

**Computes aggregate properties from instances

“Rules” for cellular assembly