An Efficient Method for Computing Alignment Diagnoses
Christian Meilicke, Heiner Stuckenschmidt
University of Mannheim, Lehrstuhl für Künstliche Intelligenz
{christian, heiner}@informatik.uni-mannheim.de
Computing a local optimal diagnosis 2
Problem Statement
• Automatically and manually (!) generated ontology alignments are often incoherent
– See the OAEI-2008 results of the conference track
• => Incoherent alignments are a problem in many application scenarios*
– Instance migration results in inconsistent ontologies
– Query translation results in 'a priori' empty result sets
• Goal: find a way to repair incoherent alignments automatically and very efficiently, because …
– 'Agents on the web' require coherent alignments on the fly
– Large ontologies require efficient algorithms
* C. Meilicke and H. Stuckenschmidt. Incoherence as a Basis for Measuring the Quality of Ontology Mappings. OM-08.
Outline
• Alignment Semantics: Incoherence of an alignment, MIPS alignments
• Alignment Diagnosis: Diagnosis, Minimal Hitting Set, Local Optimal Diagnosis
• Computing a Local Optimal Diagnosis (LOD): Brute-Force LOD and Efficient LOD
• Experimental Results: Runtime, Quality of the Diagnosis
"Natural" Semantics
An alignment A and two ontologies O1 and O2:
<1#Person, 2#Person, =, 0.98>
<1#hasName, 2#name, =, 0.87>
<1#writtenBy, 2#docWrittenBy, =, 0.7>
<1#authorOf, 2#hasWritten, =, 0.56>
<1#firstAuthor, 2#Author, ⊑, 0.56>
Merged ontology O1 ∪A O2: the correspondences are read as axioms
1#firstAuthor ⊑ 2#Author
1#Person ≡ 2#Person
…
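A minimal sketch of the "natural" reading, assuming ontologies are represented simply as sets of axiom strings. The names `Correspondence`, `to_axiom` and `merge` are hypothetical and only illustrate how correspondences become axioms of the merged ontology.

```python
from typing import NamedTuple


class Correspondence(NamedTuple):
    source: str        # entity of O1, e.g. "1#Person"
    target: str        # entity of O2, e.g. "2#Person"
    relation: str      # "=" (equivalence) or "⊑" (subsumption)
    confidence: float


def to_axiom(c: Correspondence) -> str:
    """Read a correspondence as an axiom; the confidence value is dropped."""
    symbol = {"=": "≡", "⊑": "⊑"}[c.relation]
    return f"{c.source} {symbol} {c.target}"


def merge(o1_axioms: set, o2_axioms: set, alignment: list) -> set:
    """O1 ∪A O2: axioms of both ontologies plus the translated correspondences."""
    return o1_axioms | o2_axioms | {to_axiom(c) for c in alignment}


# Two correspondences taken from the slide; the ontologies are left empty here.
alignment = [
    Correspondence("1#Person", "2#Person", "=", 0.98),
    Correspondence("1#firstAuthor", "2#Author", "⊑", 0.56),
]
merged = merge(set(), set(), alignment)
```

In a real setting the merged axiom set would be handed to a description logic reasoner; the string representation is only for illustration.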
Incoherence of an Alignment
Definition: Incoherence of an Alignment
An alignment A between ontologies O1 and O2 is incoherent iff there exists a concept i#C or property i#R with i ∈ {1,2} that is satisfiable in Oi but unsatisfiable in O1 ∪A O2.
Note: the satisfiability of a property i#R can be reduced to the satisfiability of ∃i#R.⊤.
Definition: MIPS Alignment (minimal conflict set)
Given an incoherent alignment A between ontologies O1 and O2, a subalignment M ⊆ A is a MIPS alignment (= minimal incoherence preserving subalignment) iff M is incoherent and there exists no M' ⊂ M such that M' is incoherent.
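The MIPS definition can be sketched as a check against a coherence oracle. This assumes monotonic reasoning (adding correspondences never restores satisfiability), so it is enough to test the removal of single correspondences. The oracle `toy_is_coherent` and the conflict list are hypothetical stand-ins for a real reasoner.

```python
from typing import Callable, FrozenSet

Alignment = FrozenSet[str]


def is_mips(m: Alignment, is_coherent: Callable[[Alignment], bool]) -> bool:
    """M is a MIPS iff M is incoherent and every proper subset is coherent.

    Under monotonicity, checking the subsets M \\ {c} is sufficient.
    """
    if is_coherent(m):
        return False
    return all(is_coherent(m - {c}) for c in m)


# Hypothetical oracle: an alignment is incoherent iff it contains a known conflict.
KNOWN_CONFLICTS = [frozenset({"c1", "c2"}), frozenset({"c3", "c4", "c5"})]


def toy_is_coherent(a: Alignment) -> bool:
    return not any(conflict <= a for conflict in KNOWN_CONFLICTS)


minimal = is_mips(frozenset({"c1", "c2"}), toy_is_coherent)
not_minimal = is_mips(frozenset({"c1", "c2", "c3"}), toy_is_coherent)
coherent = is_mips(frozenset({"c1"}), toy_is_coherent)
```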
"Terminology"
[Figure legend: a correspondence; an alignment; an alignment as a sequence ordered by confidence; MIPS depicted by red-dotted links; an alignment with MIPS shown as subsets]
Alignment Diagnosis
Definition: Alignment Diagnosis
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff A \ ∆ is coherent with respect to O1 and O2 and, for each ∆' ⊂ ∆, the alignment A \ ∆' is incoherent with respect to O1 and O2.
Proposition: Alignment Diagnosis and Minimal Hitting Sets
An alignment ∆ ⊆ A is an alignment diagnosis for O1 and O2 iff ∆ is a minimal hitting set over all MIPS in A.
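To make the proposition concrete, here is a small sketch that tests whether a candidate ∆ is a minimal hitting set, assuming the MIPS are already known as explicit sets. The function names and the correspondences "a", "b", "c" are hypothetical.

```python
def is_hitting_set(delta: set, mips_sets: list) -> bool:
    """∆ hits every MIPS if it shares at least one correspondence with each."""
    return all(delta & m for m in mips_sets)


def is_minimal_hitting_set(delta: set, mips_sets: list) -> bool:
    """Minimal: ∆ hits all MIPS, but no ∆ \\ {c} does."""
    if not is_hitting_set(delta, mips_sets):
        return False
    return all(not is_hitting_set(delta - {c}, mips_sets) for c in delta)


# Two overlapping MIPS over three correspondences.
mips_sets = [{"a", "b"}, {"b", "c"}]
single = is_minimal_hitting_set({"b"}, mips_sets)        # hits both MIPS
pair = is_minimal_hitting_set({"a", "c"}, mips_sets)     # also minimal
redundant = is_minimal_hitting_set({"a", "b"}, mips_sets)  # "a" is superfluous
```

Both {b} and {a, c} are diagnoses here, which is why an additional criterion such as confidence is needed to pick one.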
Local Optimal Diagnosis (LOD)
[Figure labels: low confidence, high confidence]
Definition: Accused correspondence
A correspondence c ∈ A is accused by A iff there exists a MIPS M in A with c ∈ M such that for all c' ≠ c in M it holds that
• (1) conf(c') > conf(c) and
• (2) c' is not accused by A.
Definition: Local optimal diagnosis (LOD)
The set of all accused correspondences is referred to as the local optimal diagnosis (LOD).
important!
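The recursive definition can be evaluated directly when the MIPS are given explicitly: whether c is accused depends only on correspondences with higher confidence, so processing in descending confidence resolves the recursion. This is a sketch that assumes distinct confidence values; the function name, the MIPS and the confidences are hypothetical.

```python
def lod_from_mips(mips_sets: list, conf: dict) -> set:
    """Collect all accused correspondences, given explicit MIPS."""
    accused = set()
    members = set().union(*mips_sets)
    # Higher confidence first: the status of every c' with conf(c') > conf(c)
    # is already decided when c is examined.
    for c in sorted(members, key=conf.get, reverse=True):
        for m in mips_sets:
            if c in m and all(
                conf[other] > conf[c] and other not in accused
                for other in m
                if other != c
            ):
                accused.add(c)
                break
    return accused


# Correspondences 1..10, where a smaller number means higher confidence.
conf = {c: 1.0 - 0.05 * c for c in range(1, 11)}
mips_sets = [{2, 4}, {3, 6}, {4, 7}, {6, 8}]
lod = lod_from_mips(mips_sets, conf)
```

Here 4 and 6 are accused. Correspondences 7 and 8 are not, because their only higher-ranked partners (4 and 6) are themselves accused, which is exactly condition (2).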
Algorithm 1
[Animation over the correspondences 1 2 3 4 5 6 7 8 9 10; each frame asks whether the alignment built so far is coherent]
Coherent? YES!
Coherent? YES!
Coherent? NO!
Coherent? Now it is!
Coherent? YES!
Coherent? YES!
Coherent? NO!
Coherent? Now it is!
… continue the same way
Algorithm 1: Result
… and after a few more slides we would end up like this:
[Figure: resulting diagnosis over the correspondences 1 2 3 4 5 6 7 8 9 10]
Note:
• Coherence is checked 10 times to construct a local optimal diagnosis, which is a minimal hitting set over all MIPS
• We have not computed a single MIPS alignment!
First sketch: Meilicke, Völker, Stuckenschmidt. Learning Disjointness for Debugging Mappings between Lightweight Ontologies (EKAW-08)
Discussed with a focus on the relation to belief revision in: Qi, Ji, Haase: A Conflict-based Operator for Mapping Revision (ISWC-09)
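A minimal sketch of the brute-force procedure, under the assumption that correspondences are added in order of descending confidence and dropped whenever the alignment built so far becomes incoherent. The coherence oracle, the MIPS inside it and the confidence function are hypothetical; a real implementation would call a reasoner on the merged ontology.

```python
def brute_force_lod(alignment, conf, is_coherent):
    """Return the diagnosis using one coherence check per correspondence."""
    kept, diagnosis = [], []
    for c in sorted(alignment, key=conf, reverse=True):
        if is_coherent(kept + [c]):
            kept.append(c)        # still coherent: keep c
        else:
            diagnosis.append(c)   # c introduced a conflict: remove it
    return diagnosis


# Hypothetical oracle: incoherent iff the alignment contains one of these MIPS.
MIPS = [{2, 4}, {3, 6}, {4, 7}, {6, 8}]
calls = 0


def toy_is_coherent(correspondences):
    global calls
    calls += 1
    return not any(m <= set(correspondences) for m in MIPS)


diagnosis = brute_force_lod(range(1, 11), lambda c: 1.0 - 0.05 * c, toy_is_coherent)
```

The oracle is called exactly once per correspondence (10 times here), and the MIPS are never enumerated by the algorithm itself; they only live inside the toy oracle.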
"Pattern-based" reasoning
• Idea: Use an incomplete method for incoherence detection in A' ⊆ A
– Classify O1 and O2 once, then check for each pair of correspondences in A' whether a certain pattern occurs
• If a pattern occurs for some pair of correspondences in A', then A' is incoherent
– If no pattern occurs, A' can nevertheless be incoherent!
[Figure: pattern between ontologies Oi and Oj]
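A sketch of one possible pattern, chosen here for illustration and not necessarily the one used by the authors: two equivalence correspondences A1 ≡ B1 and A2 ≡ B2 conflict if one source concept is subsumed by the other in O1 while B1 and B2 are disjoint in O2. It assumes both ontologies were classified beforehand and that all named concepts are satisfiable on their own; all concept names are hypothetical.

```python
from itertools import combinations


def pattern_conflict(c1, c2, sub1, disj2):
    """Subsumption in O1 plus disjointness in O2 makes a mapped concept unsatisfiable."""
    (a1, b1), (a2, b2) = c1, c2
    subsumed = (a1, a2) in sub1 or (a2, a1) in sub1
    return subsumed and frozenset((b1, b2)) in disj2


def detects_incoherence(alignment, sub1, disj2):
    """Sound but incomplete: True proves incoherence, False proves nothing."""
    return any(
        pattern_conflict(c1, c2, sub1, disj2)
        for c1, c2 in combinations(alignment, 2)
    )


# Precomputed (classified) facts about the two ontologies.
sub1 = {("1#Reviewer", "1#Person")}                        # O1: Reviewer ⊑ Person
disj2 = {frozenset(("2#Reviewer", "2#Document"))}          # O2: Reviewer, Document disjoint

bad = [("1#Reviewer", "2#Reviewer"), ("1#Person", "2#Document")]
fine = [("1#Reviewer", "2#Reviewer"), ("1#Person", "2#Person")]
found = detects_incoherence(bad, sub1, disj2)
not_found = detects_incoherence(fine, sub1, disj2)
```

Only set lookups are needed per pair, which is what makes the test cheap compared with reasoning in the merged ontology.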
That doesn't work …
• Use the efficient coherence test instead of complete reasoning in the algorithm described above
– Reasoning about A' ⊆ A no longer requires reasoning in O1 ∪A' O2; it is replaced by iterating over all pairs in A'
– However: the resulting alignment might still be incoherent, and ∆ is not a LOD
– Missing a single MIPS might result in a chain of incorrect follow-up decisions!
– Thus, removing the missed MIPS afterwards does not work!
• How can we exploit the efficient method while still constructing a LOD?
Algorithm 2: Example
[Figure: correspondences 1 2 3 4 5 6 7 8 9 10 with their MIPS; legend:]
– Detectable by efficient method
– Only detectable by complete method
– Resolved due to removal of correspondence
Run the brute-force (BF) algorithm with efficient reasoning. Still incoherent?
Verification step: use binary search to detect the correspondence k such that A[0 … k-1] is coherent and A[0 … k] is incoherent
k = 8
Safe part: efficient reasoning did not fail up to k
Incorrect part: recompute!
Run the main algorithm again with efficient reasoning for A[k+1 … n], where ∆1-k ∪ A[k], the diagnosis for A[1 … k], is a fixed part of the resulting diagnosis.
Still incoherent? If yes, we have k_new > k_old;
repeat the same verification step
[Figure labels: A[1…k], A[k+1…n]]
Final result is a LOD.
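The example can be sketched as follows. This is one possible reading of the slides, not the authors' code: an efficient pass with a sound but incomplete pairwise test, a complete check of the result, and a binary search for the first correspondence that the efficient test wrongly accepted. `pair_conflict`, `is_coherent` and the MIPS are hypothetical stand-ins.

```python
def efficient_lod(alignment, conf, pair_conflict, is_coherent):
    """LOD via cheap pairwise tests, verified and repaired by complete checks."""
    ordered = sorted(alignment, key=conf, reverse=True)
    fixed = []   # kept correspondences already verified by complete reasoning
    start = 0    # position in `ordered` where the efficient pass resumes
    while True:
        kept = list(fixed)
        for c in ordered[start:]:                      # efficient pass
            if not any(pair_conflict(c, k) for k in kept):
                kept.append(c)
        if is_coherent(kept):                          # verification
            return [c for c in ordered if c not in kept]
        lo, hi = len(fixed), len(kept) - 1             # binary search for the
        while lo < hi:                                 # first incoherent prefix
            mid = (lo + hi) // 2
            if is_coherent(kept[:mid + 1]):
                lo = mid + 1
            else:
                hi = mid
        fixed = kept[:lo]                              # safe part
        start = ordered.index(kept[lo]) + 1            # kept[lo] joins the diagnosis


# Toy setting: {3, 6} is only detectable by the complete method.
PAIRWISE = [{2, 4}, {4, 7}, {6, 8}]
ALL_MIPS = PAIRWISE + [{3, 6}]
complete_calls = 0


def toy_pair_conflict(a, b):
    return {a, b} in PAIRWISE


def toy_is_coherent(correspondences):
    global complete_calls
    complete_calls += 1
    return not any(m <= set(correspondences) for m in ALL_MIPS)


diagnosis = efficient_lod(
    range(1, 11), lambda c: 1.0 - 0.05 * c, toy_pair_conflict, toy_is_coherent
)
```

In this toy run the first pass wrongly keeps 6 and therefore wrongly drops 8; the verification step locates 6, fixes the part before it, and the second pass recomputes the rest. The complete oracle is called 5 times instead of 10.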
Runtime Considerations (Theory)
n = size of alignment A
m = number of times the binary search is applied
• The more complete the pattern-based reasoning is, the fewer verification steps/iterations are necessary
– The runtime of the pattern-based reasoning itself hardly matters for the overall runtime!
• Runtime Comparison
– Brute Force LOD: O(n)
– Efficient LOD: O(log(n) * m)
Do we have m << n ?
Results: Runtime
• Based on experiments with OAEI conference ontologies and submissions from 2007/08
– Expressivity: SHIN(D), ELI(D), SIF(D), ALCIF(D)
– Four different state-of-the-art matching systems
[Table with columns n and m; values not preserved in this transcript]
• Better results for benchmark datasets: 5 to 10 times faster
Results: Quality of Diagnosis
• Removing the LOD results in an alignment with increased precision and slightly decreased recall => slightly increased f-measure
• For alignments with low precision, the positive effects are very strong.
• In rare cases, an incorrect correspondence annotated with high confidence has negative effects.
Summary
• Algorithm 1: Algorithm for computing a LOD
– Without computing MIPS or MUPS!
• Algorithm 2: General approach for improving algorithms of type 1
– Shown for the natural interpretation of correspondences as axioms and a specific type of incomplete reasoning
– In principle applicable to any semantics for which we can find a similarly efficient reasoning approach!
• Good results for natural interpretation + pattern-based reasoning: between 2 and 10 times faster!
Thanks for your attention
Questions?
Back-Up Slides
Property Pattern Example
[Figure: two ontologies O1 and O2 linked by the correspondences readPaper ≡ reviewOfPaper and Document ≡ Document; disjointness links are marked "disjoint"]
Axioms shown for one ontology: ∃readPaper.⊤ ⊑ Reviewer, Reviewer ⊑ Person, Document ⊑ ¬Person
Axioms shown for the other ontology: ∃reviewOfPaper.⊤ ⊑ Review ⊑ Document