Amarnath Gupta Univ. of California San Diego. An Abstract Question There is no concrete answer...
Semantic Integration of Heterogeneous Domain-Specific Information: The NIF Case
Amarnath Gupta
Univ. of California San Diego
“Bing is a search engine that finds and organizes the answers you need so you can make faster, more informed decisions” !!!
What happened to “organizes the answers” and helping more informed decisions? !!!
In NIF
Domain = ontological definitions + axioms
Indexing property chains for fast query expansion
Schema mapping when possible
Ontological Source Filtering
Bicycle as in bi-cyclic
Bicycle as a therapeutic aid
Ontological Resource Annotation
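The idea of indexing property chains for query expansion can be illustrated with a minimal sketch. It assumes a toy in-memory set of triples; the terms, the `EDGES` list and the function names `build_chain_index` and `expand` are hypothetical and are not taken from NIF. The index is built once, so expanding a query term at query time becomes a single lookup rather than a multi-step traversal.

```python
from collections import defaultdict

# Hypothetical toy ontology as (subject, property, object) triples.
EDGES = [
    ("CA1", "partOf", "Hippocampus"),
    ("CA3", "partOf", "Hippocampus"),
    ("Hippocampus", "partOf", "Limbic system"),
    ("Pyramidal cell", "locatedIn", "CA1"),
]


def build_chain_index(edges, chain):
    """Precompute, for every end term, the start terms that reach it by
    following the properties in `chain` in order."""
    by_prop = defaultdict(lambda: defaultdict(set))
    for subj, prop, obj in edges:
        by_prop[prop][subj].add(obj)

    index = defaultdict(set)
    for start in {subj for subj, _, _ in edges}:
        frontier = {start}
        for prop in chain:
            # Advance one property step from every node reached so far.
            frontier = {
                obj
                for node in frontier
                for obj in by_prop[prop].get(node, ())
            }
        for end in frontier:
            index[end].add(start)
    return index


def expand(term, index):
    """Expand a query term with every term linked to it through the chain."""
    return {term} | index.get(term, set())
```

With the chain `("locatedIn", "partOf")`, a query for "Hippocampus" would also match resources annotated with "Pyramidal cell", because that term is located in a part of the hippocampus in the toy data.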
Current Query Architecture
[Figure: architecture diagram. Labeled components: Data Ingestion and Transformation; Ontology Ingestion and Transformation; Data Readers; OWL Reader, OBO Reader, RDFS Reader; Type-Partitioned Data Store; Ontology Repository; OntoQuest; Index Structures; Semantic & Assn. Catalogs; User Query Parser; Query Planner; Execution Engine; Keyword, Relational, Tree and Graph Query Processors.]
•How to store, index and query ontologies efficiently?
•What about different forms of ontology?
•What about multiple inter-mapped ontologies?
Some Performance Numbers
Q1. A single term ontological query: synonyms(Hippocampus)
Q2. transcription AND gene AND pathway
Q3. (gene) AND (pathway) AND (regulation OR "biological regulation") AND (transcription) AND (recombinant)
Q4. synonyms(zebrafish AND descendants(promoter,subclassOf))
Q5. synonyms(descendants(Hippocampus,partOf))
Q6. synonyms(Hippocampus) AND equivalent(synonyms(memory))
Q7. synonyms(x:descendants(neuron,subclassOf) where x.neurotransmitter='GABA') AND synonyms(gene where gene name='IGF')
Q8. synonyms(x:descendants(neuron,subclassOf) where x.soma.location=descendants(Hippocampus,partOf))
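To make the query operators above more concrete, here is a rough sketch of how `descendants` and `synonyms` might compose for a query shaped like Q5. Everything in it is an assumption made for illustration: the `PART_OF` and `SYNONYMS` tables are invented toy data, and the choice that `descendants` excludes the starting term is ours, since the slides do not say how the real operators behave.

```python
# Hypothetical toy data: child -> parent edges for the partOf relation.
PART_OF = {
    "CA1": "Hippocampus",
    "CA3": "Hippocampus",
    "Dentate gyrus": "Hippocampus",
    "Hippocampus": "Limbic system",
}

# Hypothetical synonym table (illustrative values only).
SYNONYMS = {
    "Hippocampus": {"Ammon's horn"},
    "CA1": {"Cornu ammonis 1"},
}


def descendants(term, relation):
    """Return every term that reaches `term` by following `relation`
    edges transitively. Assumption: the term itself is not included."""
    children = {}
    for child, parent in relation.items():
        children.setdefault(parent, set()).add(child)

    found, stack = set(), [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def synonyms(terms):
    """Return the given term(s) together with all of their known synonyms."""
    if isinstance(terms, str):
        terms = {terms}
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded
```

Under these assumptions, Q1 corresponds to `synonyms("Hippocampus")` and Q5 to `synonyms(descendants("Hippocampus", PART_OF))`; the resulting term set would then be handed to the keyword query processor.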
The Abstract Problem
Given
•n data sources (n of the order of hundreds)
  •Structured (relational)
  •Semi-structured (XML, RDF)
  •Unstructured (text)
  •With specialized data semantics (pathway graphs, social nets, annotated images, …)
•A domain specified by an ontology with known entailment rules (preferably less expressive than full MSO logic)
•A set of mappings from the data to the ontology
Construct
•An information system such that
  •The ontology is the effective target schema
  •Its query language has an enhanced keyword model (or any associative query language)
  •User queries are transformed into “intentionally equivalent” source queries
  •Results are ranked by relevance
  •The system is responsive, robust and scalable
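One way to picture the step in which a user query is transformed into source queries is the sketch below. It assumes that each ontology concept carries a list of mappings to source fields; the `Mapping` class, the source names `cell_db` and `lit_index`, and the query templates are all hypothetical and only show the shape of the rewriting, not how NIF performs it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str   # hypothetical data source name
    kind: str     # "relational" or "text"
    target: str   # "table.column" for relational, field name for text


# Hypothetical mappings from an ontology concept to source fields.
MAPPINGS = {
    "neuron": [
        Mapping("cell_db", "relational", "cells.cell_type"),
        Mapping("lit_index", "text", "abstract"),
    ],
}


def rewrite(concept, expansions, mappings):
    """Turn one ontology concept plus its expanded terms into
    (source, query, parameters) triples, one per mapping and term."""
    queries = []
    for m in mappings.get(concept, []):
        for term in sorted(expansions):
            if m.kind == "relational":
                table, column = m.target.split(".")
                # Parameterized SQL; the term is passed separately.
                sql = f"SELECT * FROM {table} WHERE {column} = ?"
                queries.append((m.source, sql, (term,)))
            elif m.kind == "text":
                # Fielded phrase query for a text index.
                queries.append((m.source, f'{m.target}:"{term}"', ()))
    return queries
```

Calling `rewrite("neuron", {"neuron", "nerve cell"}, MAPPINGS)` yields two parameterized SQL queries for the relational source and two phrase queries for the text source. Ranking and merging the results across sources is left out of the sketch.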
•Bootstrapping from a seed ontology
•Creating a feature-derived ontology
A Linked Graph Perspective
We can view the data problem as a “constrained” graph integration exercise where
Every data/knowledge resource can be considered as a graph that is governed by a set of (Description Logic) axioms about its structure and component relationships
Connections between individual resources can be defined either at the level of instances or at the level of concepts
The connections themselves can be defined in terms of asserted or inferred Description Logic statements
The ontology’s role is to provide the bridges, which can be considered “general knowledge” that is modularized under a well-formed upper ontology.
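The distinction between instance-level and concept-level connections can be sketched as follows. This is a deliberately small illustration under stated assumptions: the two resources, their instance identifiers, the subclass table and all function names are hypothetical, and a simple subclass walk stands in for full Description Logic inference.

```python
# Hypothetical shared ontology: concept -> parent concept (subclassOf).
SUBCLASS_OF = {"Purkinje cell": "Neuron", "Granule cell": "Neuron"}

# Two hypothetical resources, each annotating its instances with a concept.
RESOURCE_A = {"a:cell42": "Purkinje cell"}
RESOURCE_B = {"b:img7": "Neuron", "b:img8": "Neuron", "b:img9": "Astrocyte"}

# Asserted instance-level connections between the two resources.
INSTANCE_LINKS = {("a:cell42", "b:img7")}


def ancestors(concept):
    """Return the concept and all of its superclasses."""
    result = {concept}
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        result.add(concept)
    return result


def concept_linked(inst_a, inst_b):
    """Inferred connection: one instance's concept subsumes the other's."""
    concept_a, concept_b = RESOURCE_A[inst_a], RESOURCE_B[inst_b]
    return concept_b in ancestors(concept_a) or concept_a in ancestors(concept_b)


def linked(inst_a, inst_b):
    """True if the instances are connected by assertion or by inference."""
    return (inst_a, inst_b) in INSTANCE_LINKS or concept_linked(inst_a, inst_b)
```

Here `a:cell42` is connected to `b:img7` by an asserted instance link, to `b:img8` only through the ontology (a Purkinje cell is a neuron), and not at all to `b:img9`.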
Too Many Issues, Too Little Time (and Resources)
What’s the best way to implement ontologies with concrete domains through a graph-based approach?
Graphs with Colored DAG backbones?
Balancing Materialized vs. Computed edges for best time-space tradeoffs
What is an appropriate result model for an associative graph query?
What is the query language and result model of a story?
Combining result presentation and navigation options?
Ranking Models?
Contextual Query Interpretation and Ranking?
Oh! Scalability!!!
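The tradeoff between materialized and computed edges raised above can be illustrated with a small sketch. It assumes an acyclic, single-parent subclass hierarchy held in a dictionary; the `SUBCLASS` data and both function names are hypothetical. The computed variant stores only asserted edges and walks the hierarchy at query time, while the materialized variant stores the full transitive closure so that each check is a set lookup.

```python
# Hypothetical single-parent hierarchy: child -> parent (asserted edges only).
SUBCLASS = {
    "Purkinje cell": "Neuron",
    "Neuron": "Cell",
    "Cell": "Anatomical entity",
}


def is_a_computed(child, ancestor, parent_of):
    """Computed edges: walk up the hierarchy at query time.
    No extra storage, but time grows with the depth of the hierarchy."""
    node = child
    while node in parent_of:
        node = parent_of[node]
        if node == ancestor:
            return True
    return False


def materialize(parent_of):
    """Materialized edges: store every (descendant, ancestor) pair once.
    Lookups become set membership tests, at the cost of extra space."""
    closure = set()
    for child in parent_of:
        node = child
        while node in parent_of:
            node = parent_of[node]
            closure.add((child, node))
    return closure
```

In this toy hierarchy, three asserted edges grow to six materialized pairs. On a deep ontology the closure can become much larger than the asserted graph, which is why a real system would likely materialize only the edges that queries touch most often.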