By Helen Howell.
2
Outline
What is interoperation?
Why?
Problems
Approach to Interoperations...
Date posted: 20-Dec-2015
3
What is Interoperation?
Interoperation may be defined as a loose coupling across information sources, semantic metadata descriptions and ontologies
5
Domain Differences
Naming: attribute items are named differently (payroll - EMP; personnel - PEOPLE)
Scope: payroll may include support for student benefits for employees' children; personnel covers employees only
6
Domain Differences (cont.)
Encoding: SSN with and without hyphens
Attribute scopes - semantic heterogeneity: "hot" has a different meaning in the weather domain than in the truck-engine domain
7
Integration Problems
Data that is available is subject to implicit assumptions which means that it cannot be reused out of context.
Data is often duplicated and labeled under different terms.
e.g., Address versus Addr
9
Direct Translation
For each ontology a set of translating functions is provided to allow mappings with the other ontologies.
10
Direct Translation
Only feasible if there are a few ontologies
OBSERVER - mappings between a term in a source ontology and its synonym in a destination ontology.
Set up relationships, e.g., Volcano affects Environment
11
Single Shared Ontology
Often standards are not convenient to use since they have to be suitable for all potential uses. So a single subset is formed as a standard
13
Multiple Shared Ontologies
Referred to as Resource Clustering or Ontology Clustering
Approach is more flexible and scalable
No longer have to commit to a comprehensive ontology
14
Multiple Ontologies
Normally organized into a hierarchy
Top level provides general definitions shared by all resources
Lower levels provide definitions of concepts that extend and characterize upper levels
15
Mediation-Based Information Systems
Mediators provide intelligent middleware services
- matching resources to applications
- resources are autonomous and heterogeneous
16
Mediation-Based Information Systems
Identify most relevant resources for user’s needs
Retrieve the relevant data
Translate it to a common representation, e.g., Nevada, NV, NV USA
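The translation step can be pictured with a minimal sketch. It assumes a hand-built lookup table and picks "NV" as the common representation; both the table and the helper name `to_common` are hypothetical, not part of any mediator described in the slides.

```python
# Hypothetical lookup table: every known spelling maps to one common form.
# Choosing "NV" as the canonical value is an assumption for illustration.
CANONICAL = {"nevada": "NV", "nv": "NV", "nv usa": "NV"}

def to_common(value):
    """Normalize case, commas and spacing, then look up the common form."""
    key = " ".join(value.replace(",", " ").lower().split())
    # Unknown values are passed through unchanged rather than guessed.
    return CANONICAL.get(key, value)

for raw in ("Nevada", "NV", "NV USA"):
    print(raw, "->", to_common(raw))
```

A real mediator would derive such mappings from the resource descriptions rather than from a fixed table.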
Interoperation of Information Sources via Articulation of Ontologies
Prasenjit Mitra, Gio Wiederhold
Stanford University
Supported by AFOSR- New World Vistas Program
18
Interoperation of Information Sources
Compose information
Multiple independent, heterogeneous sources
Reliability, scalability
Semantic heterogeneity
- same term, different semantics
- different term, same semantics
20
Articulation Generator
[Figure: Articulation Generator. Inputs: Ont1, Ont2. Components: Driver, Phrase Relator, Thesaurus, Structural Matcher, Semantic Network, Context-based Word Relator, Human Expert. Output: OntA.]
21
Preliminaries: Ontology
Ontology - hierarchy of terms and specification of their properties.
Modeled as a directed labeled graph + set of rules.
Ont = (V, E, R)
V - set of nodes (concepts)
E - set of edges (properties)
R - set of rules involving V, E
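One way to picture the triple (V, E, R) is as a small data structure. The sketch below is an assumption about representation only: the class names `Edge` and `Ontology` and the method names are hypothetical, and rules are kept as an opaque list.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Edge:
    """A directed, labeled edge: source --label--> target."""
    source: str
    label: str
    target: str

@dataclass
class Ontology:
    name: str
    nodes: set = field(default_factory=set)    # V: concepts
    edges: set = field(default_factory=set)    # E: properties
    rules: list = field(default_factory=list)  # R: rules over V and E

    def add_edge(self, source, label, target):
        # Adding an edge also registers both endpoints as nodes.
        self.nodes.update({source, target})
        self.edges.add(Edge(source, label, target))

    def neighbours(self, node, label):
        """Targets reachable from `node` over edges carrying `label`."""
        return {e.target for e in self.edges
                if e.source == node and e.label == label}

o2 = Ontology("O2")
o2.add_edge("LuxuryCar", "SubClassOf", "Vehicle")
o2.add_edge("Vehicle", "Attribute", "Retail_Price")
print(o2.neighbours("LuxuryCar", "SubClassOf"))
```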
22
Articulation Rules
Articulation rules - logic-based rules that relate concepts in two ontologies:
- Binary Relationships
(O1.Car SubClassOf O2.Vehicle)
(O1.Buyer Equ O2.Owner)
(O2.LuxuryCar SubClassOf O1.Car)
23
Example
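[Figure: two example ontology graphs joined by articulation edges. Labels shown for the first graph: Car, instance H123, VID, List_Price (Unit, Amount; $ 40k), Buyer PMitra (Name, Address; CA). Labels shown for the second graph: Vehicle, LuxuryCar, instance H123, VIN, Owner PMitra (Name, Addr; CA), Retail_Price ($35k). Edge labels: Instance, SubClass, Attribute, Equ (Buyer Equ Owner).]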
24
Articulation Rules (contd.)
Horn Clauses
- (O1.Car O1.Instance X), (X O1.Price Y), (Y > $30000) => (O2.LuxuryCar Instance X)
- (V O2.Retail_Price P), (C O1.List_Price L), (L Unit U), (L Amount A), (V Equ C) => (P Equ concat(U,A))
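The first Horn clause can be read as a forward rule over triples. The sketch below hard-codes that single rule rather than implementing a general rule engine; the function name `derive_luxury_cars`, the numeric price encoding, and the second instance H456 are assumptions made for illustration.

```python
def derive_luxury_cars(facts, threshold=30000):
    """Apply: (O1.Car O1.Instance X), (X O1.Price Y), (Y > threshold)
    => (O2.LuxuryCar Instance X), over a set of (subject, predicate, object) triples."""
    instances = {o for (s, p, o) in facts
                 if s == "O1.Car" and p == "O1.Instance"}
    derived = set()
    for (s, p, o) in facts:
        # The price comparison is only reached for O1.Price facts.
        if s in instances and p == "O1.Price" and o > threshold:
            derived.add(("O2.LuxuryCar", "Instance", s))
    return derived

facts = {
    ("O1.Car", "O1.Instance", "H123"),
    ("H123", "O1.Price", 40000),
    ("O1.Car", "O1.Instance", "H456"),  # hypothetical second car
    ("H456", "O1.Price", 20000),
}
print(derive_luxury_cars(facts))
```

Only H123 satisfies the price condition, so only H123 is classified as an instance of O2.LuxuryCar.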
25
Contributions
Articulation Generation Toolkit
- produce translation rules semi-automatically
- a library of reusable heuristic methods
- a GUI to display ontologies and interact with the expert
Ontology Algebra
- query rewriting and planning
26
Articulation Generation Methods
Non-iterative Methods
- Lexical Matcher
- Thesaurus-based Matcher
- Corpus-based Matcher
- Instance-based Matcher
Iterative Methods
- Structural Matcher
- Inference-based Matcher
27
Lexical Methods
Preprocessing rules.
- Expert-generated seed rules, e.g., (O1.List_Price Equ O2.Retail_Price)
- Context-based preprocessing directives, e.g., (O1.UK_Govt Equ O2.US_Govt)
- Stop-word removal & stemming
- Word match (full or partial)
- Phrase match, e.g., (O1.Ministry_Of_Defence Equ O2.Defense_Ministry) 0.6
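A rough sketch of stop-word removal, stemming and word matching follows. The stop-word list, the plural-stripping "stemmer" and the Jaccard overlap score are all stand-ins chosen for brevity; the slides do not say how the toolkit scores a match, and this sketch does not reproduce the 0.6 shown for the phrase-match example.

```python
import re

STOP_WORDS = {"of", "the", "and", "for", "a", "an"}  # illustrative list

def stem(word):
    """Crude stand-in for a stemmer: strip a single plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def tokens(label):
    """Split a label on non-alphanumerics, lowercase, drop stop words, stem."""
    words = [w.lower() for w in re.split(r"[^A-Za-z0-9]+", label) if w]
    return {stem(w) for w in words if w not in STOP_WORDS}

def lexical_score(a, b):
    """Jaccard overlap of the two token sets (an assumed scoring scheme)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

print(lexical_score("Ministry_Of_Defence", "Defense_Ministry"))
```

Exact token matching treats "defence" and "defense" as different words, which is one reason partial word matching and thesaurus lookups are listed alongside it.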
28
Thesaurus-based methods
Consult a dictionary/thesaurus to find synonyms, related words
Generate a similarity measure or relatedness measure
- words that have similar words in their definitions are similar
Get more semantically meaningful relationships from WordNet (syn, hyper)
29
Candidate Match Repository
Term linkages automatically extracted from 1912 Webster’s dictionary *
* free; other sources are being processed.
Based on processing headwords and their definitions
Notice presence of 2 domains: chemistry, transport
30
Corpus-based
Collect a set of text documents, preferably from the same domain
- search using keywords in Google
Build a context vector (1000-character neighbourhood) for each word
Compute word-pair similarity based on the cosine of the vectors
Use word-pair similarity to find similarity among labels of nodes/edges
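The context-vector and cosine steps can be sketched as below. The slide gives only the neighbourhood size; taking 500 characters on each side of an occurrence, counting raw word frequencies, and the function names are assumptions.

```python
import math
import re
from collections import Counter

def context_vector(word, text, radius=500):
    """Count words found within `radius` characters of each occurrence of `word`."""
    vec = Counter()
    lowered = text.lower()
    target = word.lower()
    for m in re.finditer(r"\b" + re.escape(target) + r"\b", lowered):
        window = lowered[max(0, m.start() - radius): m.end() + radius]
        vec.update(t for t in re.findall(r"[a-z]+", window) if t != target)
    return vec

def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

corpus = "The car has an engine. The truck has an engine."  # toy corpus
similarity = cosine(context_vector("car", corpus),
                    context_vector("truck", corpus))
print(round(similarity, 3))
```

On a realistic corpus the vectors would usually be weighted and stop words removed before comparison; the toy corpus here only shows the mechanics.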
31
Structural Methods
Uses results of lexical match
If x% of parent nodes match & y% of children nodes match, the two nodes are candidates for a match
Special relations (AttributeOf) match
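The parent/children test can be sketched as a threshold check over pairs already accepted by the lexical matcher. The function names, the default thresholds of 50%, and the choice to treat a node with no parents (or no children) as trivially satisfying that half of the test are all assumptions.

```python
def fraction_matched(nodes_a, nodes_b, matched):
    """Share of nodes in `nodes_a` that are matched to some node in `nodes_b`."""
    if not nodes_a:
        return 1.0  # assumption: an empty neighbourhood does not block a match
    hits = sum(1 for a in nodes_a if any((a, b) in matched for b in nodes_b))
    return hits / len(nodes_a)

def structural_match(a, b, parents1, children1, parents2, children2,
                     matched, x=0.5, y=0.5):
    """True if at least x of a's parents and y of a's children are matched
    to parents and children of b, given lexical matches in `matched`."""
    p = fraction_matched(parents1.get(a, set()), parents2.get(b, set()), matched)
    c = fraction_matched(children1.get(a, set()), children2.get(b, set()), matched)
    return p >= x and c >= y

# Hypothetical fragments of two ontologies.
parents1 = {"Sedan": {"Car"}}
parents2 = {"Saloon": {"Automobile"}}
lexical_matches = {("Car", "Automobile")}

print(structural_match("Sedan", "Saloon", parents1, {}, parents2, {},
                       lexical_matches))
```

Because the method is iterative, newly accepted pairs would be added to `matched` and the check repeated until no further candidates appear.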
32
Tools to create articulations
[Figure: graph matcher for the articulation-creating expert, showing the Vehicle ontology, the Transport ontology, and suggestions for articulations.]
33
Continue from initial point
Also suggest similar terms for further articulation:
• by spelling similarity
• by graph position
• by term match repository
Expert response:
1. Okay
2. False
3. Irrelevant to this articulation
All results are recorded
Okays are converted into articulation rules
34
An Ontology Algebra
Operations can be composed
Operations can be rearranged
Alternate arrangements can be evaluated
Optimization is enabled
The record of past operations can be
kept and reused when sources change
35
Binary Operators
A knowledge-based algebra for ontologies
The Articulation Function (ArtGen), given two ontologies, supplies articulation rules between them.
Intersection: create a subset ontology; keep sharable entries
Union: create a joint ontology; merge entries
Difference: create a distinct ontology; remove shared entries
36
Intersection: Definition
O1 = (V1, E1, R1)
O2 = (V2, E2, R2)
OI = (O1 IntArtGen O2) = (VI, EI, RI)
Arules = ArtGen(O1, O2)
VI = Nodes(Arules)
EI = Edges(Arules) + Edges(E1, VI.V1) + Edges(E2, VI.V2)
RI = Arules + Rules(R1, VI.V1) + Rules(R2, VI.V2)
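The definition can be turned into a short sketch. It assumes that VI.V1 denotes the nodes of VI that belong to V1, that Edges(E, N) keeps the edges of E whose endpoints both lie in N, and that a rule is stored with the set of nodes it mentions; the InexpCar and LuxuryTax edges and the `toy_artgen` function are made up for the example.

```python
def induced(edges, nodes):
    """Edges whose two endpoints both lie in `nodes`."""
    return {(s, label, t) for (s, label, t) in edges
            if s in nodes and t in nodes}

def intersection(o1, o2, artgen):
    """OI = (VI, EI, RI) for ontologies given as (V, E, R) triples.
    Rules are (frozenset_of_nodes, text) pairs (assumed encoding)."""
    v1, e1, r1 = o1
    v2, e2, r2 = o2
    arules = set(artgen(o1, o2))
    vi = {n for (a, _rel, b) in arules for n in (a, b)}
    ei = arules | induced(e1, vi & v1) | induced(e2, vi & v2)
    ri = (sorted(arules)
          + [r for r in r1 if r[0] <= (vi & v1)]
          + [r for r in r2 if r[0] <= (vi & v2)])
    return vi, ei, ri

o1 = ({"O1.Car", "O1.MSRP", "O1.InexpCar"},
      {("O1.Car", "Attribute", "O1.MSRP"),
       ("O1.InexpCar", "SubClass", "O1.Car")},      # assumed edge
      [])
o2 = ({"O2.LuxuryCar", "O2.Price", "O2.LuxuryTax"},
      {("O2.LuxuryCar", "Attribute", "O2.Price"),
       ("O2.LuxuryCar", "Attribute", "O2.LuxuryTax")},  # assumed edge
      [])

def toy_artgen(a, b):
    """Stand-in for ArtGen that returns fixed articulation rules."""
    return {("O1.Car", "SubClass", "O2.LuxuryCar"),
            ("O1.MSRP", "Equ", "O2.Price")}

vi, ei, ri = intersection(o1, o2, toy_artgen)
print(sorted(vi))
```

Nodes that no articulation rule mentions, such as O1.InexpCar here, drop out of the intersection together with their edges.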
37
Intersection: Example
ARules = { (O1.Car SubClass O2.LuxuryCar), (O1.MSRP Equ O2.Price) }
VI = ( O1.Car, O1.MSRP, O2.LuxuryCar, O2.Price )
EI = Edges(ARules) + { (O1.Car Attribute O1.MSRP), (O2.LuxuryCar Attribute O2.Price) }
[Figure: ontologies O1, O2 and their intersection OI. Node labels: InexpCar, Car, LuxuryCar, MSRP, Price, LuxuryTax. Articulation edge labels: SubClass, Equ.]
38
Intersection: Properties
Commutative? OI12 = (O1 IntArtGen O2) = (O2 IntArtGen O1) = OI21
VI12 = VI21, EI12 = EI21, RI12 = RI21
ARules12 = ARules21
ArtGen(O1, O2) = ArtGen(O2, O1)
IntArtGen is commutative iff ArtGen is commutative
39
Semantically Commutative
Example:
ArtGen(O1,O2) : (O1.Car SubClassOf O2.Vehicle)
ArtGen(O2,O1) : (O2.Vehicle SuperClassOf O1.Car)
Defn: ArtGen is Semantically Commutative iff ArtGen(O1,O2) <=> ArtGen(O2,O1)
Operands to intersection can be rearranged if ArtGen is semantically commutative
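One simple way to test this kind of equivalence is to rewrite every rule into a canonical direction before comparing the two rule sets. The inverse table and the function names below are assumptions, and the check covers only binary relationships with a known inverse, not general logical equivalence.

```python
# Assumed inverse table for the binary relationships used in the slides.
INVERSE = {
    "SubClassOf": "SuperClassOf",
    "SuperClassOf": "SubClassOf",
    "Equ": "Equ",
}

def canonical(rule):
    """Pick one fixed orientation for a rule and its inverse reading."""
    a, rel, b = rule
    flipped = (b, INVERSE[rel], a)
    return min(rule, flipped)

def semantically_equal(rules12, rules21):
    """True if both rule sets state the same relationships up to direction."""
    return ({canonical(r) for r in rules12}
            == {canonical(r) for r in rules21})

rules12 = {("O1.Car", "SubClassOf", "O2.Vehicle")}
rules21 = {("O2.Vehicle", "SuperClassOf", "O1.Car")}
print(semantically_equal(rules12, rules21))
```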
40
Associativity Example
Example:
ArtRules(O1, O3) : (O1.Car SubClassOf O3.Vehicle)
ArtRules(O2, O3) : (O2.Truck SubClassOf O3.Vehicle)
ArtRules(O1, O2) : null
Intersection is not associative!
41
Associativity
(O1.O2).O3 = O1.(O2.O3) iff ArtGen is consistent and
transitively connective
consistent - given two nodes it generates the same relationship irrespective of relationships between other nodes
42
Associativity (contd.)
ArtGen relates A and B: (A Rel B) = (A R1 B) or (B R2 A)
Transitively connective: If ArtGen generates (A Rel B), (B Rel C) then it also generates (A Rel C’) where A ∈ O1, B ∈ O2, C, C’ ∈ O3
43
Using the Articulation Rules
Q(X) :- (LuxuryCar Instance X), (X Owner Y), (Y Addr ‘CA’)
Translated using Articulation Rules:
Q(X) :- (Car Instance X), (X List_Price Y), (Y Unit ‘$’), (Y Amount A), (A > 30000), (X Buyer Z), (Z Address ‘CA’)
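The translation can be mimicked with a small rewriting sketch. It hard-codes the two kinds of rewrite visible in this example (expanding the LuxuryCar class atom into its price-based definition, and renaming Owner/Addr to Buyer/Address); the predicate table, the function name and the fresh variable names V1, V2 are assumptions, and this is not the system's query planner.

```python
import itertools

# Assumed renaming derived from Equ-style articulation rules.
PREDICATE_MAP = {"Owner": "Buyer", "Addr": "Address"}

def rewrite(query):
    """Rewrite a list of (subject, predicate, object) atoms into O1 terms."""
    fresh = (f"V{i}" for i in itertools.count(1))  # assumed not to clash
    out = []
    for s, p, o in query:
        if (s, p) == ("LuxuryCar", "Instance"):
            # Expand the class atom using the price-based Horn clause.
            price, amount = next(fresh), next(fresh)
            out += [("Car", "Instance", o),
                    (o, "List_Price", price),
                    (price, "Unit", "$"),
                    (price, "Amount", amount),
                    (amount, ">", 30000)]
        else:
            out.append((s, PREDICATE_MAP.get(p, p), o))
    return out

query = [("LuxuryCar", "Instance", "X"),
         ("X", "Owner", "Y"),
         ("Y", "Addr", "CA")]
for atom in rewrite(query):
    print(atom)
```

Up to variable names, the seven atoms produced correspond to the seven atoms of the translated query on the slide.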
44
Conclusion
Introduced an Ontology Interoperation System.
Interoperation based on articulations that bridge the semantic gap.
Graph-oriented data model, logical rules
Founded on Ontology Algebra
Thanks to Stefan Decker, Alex Carobus, Jan Jannink, Sergey Melnik, Shrish Agarwal
46
DAML Excitement at Cycorp
Promoting DAML ontology interoperability through Cyc, as a reference ontology.
Cyc now imports and exports DAML, paraphrases and reasons about it.
Cyc lexical tools assist the mapping of imported DAML ontologies into Cyc.
Both UDDI categorization schemes imported as DAML into Cyc:
–UNSPSC (via KSL) and NAICS
Cycorp is featuring DAML in its upcoming release of OpenCyc (this fall).
47
DAML Challenges at Cycorp
How to automate mapping and translation of DAML ontologies.
How to get the user community to help map numerous DAML terms -- interest growing at OpenCyc.
Importation of large DAML schema is daunting but underway: WordNet, OpenDirectory, UNSPSC, NAICS
Provide an inference engine for the Semantic Web - subsumption, classification, constraints, well-formed statement validation, diagnostics, justifications.
What web services interface? SOAP likely.
DAML Ontology Interoperation
[Figure: the Cyc ontology linked to the UNSPSC ontology (Mapping 1: UNSPSC to Cyc) and the NAICS ontology (Mapping 2: NAICS to Cyc). A user search for “waterproof plywood” becomes a search for UNSPSC/7311 against UDDI Web Services, which hold the entry “Hardwood Int.… marine plywood …”. Return: marine plywood by Hardwood Int.]
51
UNSPSC/NAICS Mapping
Universal Standard Products and Services Classification (UNSPSC), North American Industry Classification System (NAICS)
Both business taxonomies are monohierarchies: each entity should have one code assigned to it.
Cyc: #$MonohierarchyClassificationSystem implies that each collection is #$partitionedInto its narrower terms.
Mapping is targeted at facets of the reference Cyc terms via functional composition.
52
Mapping to Semantic Facets
(#$isa (#$BusinessFacetFn #$HardwoodInternational #$Hardware) #$NAICS-Hardware-Store)
(#$isa (#$BusinessFacetFn #$HardwoodInternational #$Hardwood) #$NAICS-Other-Building-Materials-Dealers)
• Hardwood International is an online store, selling mainly lumber but offering hardware as well, but NAICS is a monohierarchy.
[Figure: business facets of Hardwood International. Hardware facet: 44413. Hardwood facet: 44419.]
53
Conclusion
Ontology Interoperation is not without its problems
Heterogeneous data can work together