Ontology engineering: Ontology alignment

Post on 18-Nov-2014

Ontology Alignment

Course “Ontology Engineering”

Goals of the lecture

Understand why ontology alignment is done
Know what constructs can be used to express an alignment between two concepts
Know what options there are to find mappings

Agenda

Why ontology alignment?
Alignment relations
Alignment techniques

Why is Ontology Alignment done?

Interoperability problem II

A private company wants to participate in a marketplace

E.g. eBay: Home > Buy > Cameras & Photo > Digital Cameras > Digital SLR > Nikon > D40

Needed: correspondences between entries of its catalogs and entries of a common catalog of a marketplace.

Example use of vocabulary alignment

“Tokugawa”

SVCN period Edo

SVCN is a local in-house ethnology thesaurus

AAT style/period Edo (Japanese period) Tokugawa

AAT is Getty’s Art & Architecture Thesaurus

Alignment architecture for P2P

Two kinds of interoperability

Syntactic interoperability
– using data formats that you can share
– XML family is the preferred option

Semantic interoperability
– How to share meaning / concepts
– Technology for finding and representing semantic links

Reusing vocabularies

The myth of a unified vocabulary

There will always be multiple ontologies
Partly overlapping
In multiple languages
Each with their own perspective

Links between ontologies

“Ontology Alignment” / “Ontology Mapping”
– Use ontologies jointly by defining a limited set of links
– Benefit from knowledge encoded in the other ontology
– Enable access across applications/collections
– Partial by nature!

Why ontology alignment?

Summary:
There is no single ontology of the world
People work with different viewpoints and thus multiple conceptualizations
But: these concepts often overlap
Semantic relations between ontologies help integrate information sources
Currently seen as a major issue in the development of distributed (web) systems

How do we represent the alignment between two concepts?

Link types between concepts in different ontologies

Equality: owl:sameAs (individual, individual)
– “Den Haag” = “The Hague”

Equivalence: owl:equivalentClass (class, class)
– wood-material = wood

Subclass: rdfs:subClassOf (class, class)
– aat:Artist rdfs:subClassOf wn:Artist

Instance of: rdf:type (individual, class)
– tgn:Africa rdf:type wn:Continent

Disjoint: owl:disjointWith (class, class)
– aat:wood owl:disjointWith wn:plastic
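
A minimal sketch of how these link types could be held in code, assuming plain (subject, predicate, object) tuples instead of an RDF library. The `a:` and `b:` prefixes, the `ALIGNMENT` list and the `links_by_type` helper are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Cross-ontology links from the slide as (subject, predicate, object) triples.
# The "a:" and "b:" prefixes are made up; the slide gives no namespaces for them.
ALIGNMENT = [
    ("a:DenHaag", "owl:sameAs", "b:TheHague"),              # equality of individuals
    ("a:wood-material", "owl:equivalentClass", "b:wood"),   # equivalence of classes
    ("aat:Artist", "rdfs:subClassOf", "wn:Artist"),         # subclass
    ("tgn:Africa", "rdf:type", "wn:Continent"),             # instance of
    ("aat:wood", "owl:disjointWith", "wn:plastic"),         # disjointness
]


def links_by_type(triples):
    """Group (subject, object) pairs under the predicate that links them."""
    grouped = defaultdict(list)
    for subject, predicate, obj in triples:
        grouped[predicate].append((subject, obj))
    return dict(grouped)
```

Grouping by predicate makes it easy to treat, say, all subclass links differently from all disjointness links when the alignment is consumed.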

Types of links between concepts in different thesauri

SKOS mapping properties

skos:mappingRelation
– skos:closeMatch
– skos:exactMatch
– skos:broadMatch
– skos:narrowMatch
– skos:relatedMatch

skos:closeMatch
– symmetric property

skos:exactMatch
– subPropertyOf skos:closeMatch
– transitive property
– symmetric property

skos:broadMatch
– subPropertyOf skos:broader
– inverseOf skos:narrowMatch

skos:narrowMatch
– subPropertyOf skos:narrower
– inverseOf skos:broadMatch

skos:relatedMatch
– subPropertyOf skos:related
– symmetric property
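
The property characteristics listed here determine which further mappings follow from the asserted ones. The sketch below computes that closure over plain triples; it is an illustration under the assumption that only the axioms named on this slide are applied, and the `entail` function and its lookup tables are hypothetical names.

```python
# Axioms taken from the slide; nothing beyond them is applied.
SYMMETRIC = {"skos:closeMatch", "skos:exactMatch", "skos:relatedMatch"}
TRANSITIVE = {"skos:exactMatch"}
INVERSE = {
    "skos:broadMatch": "skos:narrowMatch",
    "skos:narrowMatch": "skos:broadMatch",
}
SUPER_PROPERTY = {
    "skos:exactMatch": "skos:closeMatch",
    "skos:broadMatch": "skos:broader",
    "skos:narrowMatch": "skos:narrower",
    "skos:relatedMatch": "skos:related",
}


def entail(triples):
    """Return the closure of `triples` under the axioms above."""
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in SUPER_PROPERTY:
                new.add((s, SUPER_PROPERTY[p], o))
            if p in TRANSITIVE:
                for s2, p2, o2 in closed:
                    # Chain s -> o -> o2, skipping trivial self-links.
                    if p2 == p and s2 == o and o2 != s:
                        new.add((s, p, o2))
        if new <= closed:  # fixed point: nothing left to add
            return closed
        closed |= new
```

For example, asserting only that a concept has an exactMatch in another thesaurus also yields the reverse exactMatch and a closeMatch in both directions.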

Example: partial alignment between citations

Example: alignment between XML Schemas

Example: alignment between thesauri

Types of links between properties in different ontologies

Links between properties:
– equivalentProperty
– subPropertyOf
– inverseOf

E.g. painterOf – creatorOf
Trick: wn:hyponym subPropertyOf rdfs:subClassOf

Types of links between concepts in different ontologies

Domain-specific links
– Van Gogh (ULAN) born-in Groot-Zundert (TGN)
– Derain (ULAN) related-to Fauve (AAT)
– Wandelkaart Pyreneeën RANDO.07 Haute-Ariège - Vicdessos (Pied à Terre) related-to Pyrénées (TGN)
– Part-of relations

Alignment Techniques

Alignment tools

Input: two ontologies, each consisting of a set of discrete entities
• HTML table headers
• XML elements
• Classes
• Properties

Output: relationships holding between these entities (equivalence, subsumption, etc.) + confidence measure.

Cardinality (e.g., 1:1, 1:m)
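
One way to picture the output of such a tool is as a list of records, each pairing two entities with a relation and a confidence. The sketch below is an illustration only; the `Correspondence` class, its field names and the `is_one_to_one` check are assumptions, not the format of any particular alignment tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    source: str        # entity from the first ontology
    target: str        # entity from the second ontology
    relation: str      # e.g. "equivalence" or "subsumption"
    confidence: float  # assumed here to lie in [0, 1]


def is_one_to_one(alignment):
    """True if no source and no target occurs in more than one correspondence."""
    sources = [c.source for c in alignment]
    targets = [c.target for c in alignment]
    return len(set(sources)) == len(sources) and len(set(targets)) == len(targets)
```

An alignment in which one source entity maps to several targets would fail this check and corresponds to the 1:m case.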

Alignment techniques

Syntax: comparison of characters of the terms
– Measures of syntactic distance
– Language processing
  • E.g. tokenization, singular/plural

Relate to lexical resource
– Relate terms to place in WordNet hierarchy

Taxonomy comparison
– Look for common parents/children in taxonomy

Instance based mapping
– Two classes are similar if their instances are similar.
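
For instance-based mapping, one simple reading of "their instances are similar" is to measure how many instances two classes share. The sketch below uses the Jaccard coefficient for this; the choice of Jaccard and the assumption that instances can be compared by identifier are mine, not stated in the lecture.

```python
def jaccard(instances_a, instances_b):
    """Share of instances common to both classes, between 0.0 and 1.0."""
    a, b = set(instances_a), set(instances_b)
    if not a and not b:
        return 0.0  # no evidence either way when both classes are empty
    return len(a & b) / len(a | b)
```

Two classes whose score exceeds some chosen threshold would then be proposed as candidates for an equivalence link.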

String-based techniques (1)

Exact string match

Prefix
– takes as input two strings and checks whether the first string starts with the second one
– net = network; but also hot = hotel

Suffix
– takes as input two strings and checks whether the first string ends with the second one
– ID = PID; but also word = sword
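
The prefix and suffix tests can be sketched in a few lines. As an assumption for illustration, the functions below ignore case and check both directions, so the argument order does not matter; the function names are hypothetical.

```python
def prefix_match(a, b):
    """True if one string starts with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return False  # an empty string would trivially match everything
    return a.startswith(b) or b.startswith(a)


def suffix_match(a, b):
    """True if one string ends with the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    if not a or not b:
        return False
    return a.endswith(b) or b.endswith(a)
```

Both functions accept the intended pairs (net/network, ID/PID) and also the false positives from the slide (hot/hotel, word/sword), which is why such matches are usually treated as candidates and not as final mappings.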

String-based techniques (2)

Edit distance
– takes as input two strings and calculates the number of edit operations (e.g., insertions, deletions, substitutions) of characters required to transform one string into another, normalized by the length of the longer string
– EditDistance ( NKN , Nikon ) = 0.4 (2/5)
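
The normalized edit distance can be sketched with the standard Levenshtein dynamic programme. The sketch assumes unit cost for every operation and case-insensitive comparison (which the NKN/Nikon example seems to imply); the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def normalized_edit_distance(a, b):
    """Edit distance divided by the length of the longer string."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0
```

For "NKN" and "Nikon", two insertions ("i" and "o") are needed and the longer string has five characters, giving 2/5 = 0.4 as on the slide.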

Language-based techniques

Tokenization
– parses names into tokens by recognizing punctuation, cases
– Hands-Free Kits => hands, free, kits

Lemmatization
– morphologically analyses tokens in order to find all their possible basic forms
– Kits => Kit

Elimination
– discards “empty” tokens that are articles, prepositions, conjunctions ...
– a, the, by, type of, their, from
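
The three steps can be chained into a small normalization pipeline. This is a rough sketch: the lemmatizer is a naive plural stripper standing in for a real morphological analyser, and the stop-word set simply reuses the examples from the slide (with "type of" split into two tokens). All names are hypothetical.

```python
import re

# Example "empty" tokens from the slide; "type of" is split into two tokens.
STOP_WORDS = {"a", "the", "by", "type", "of", "their", "from"}


def tokenize(name):
    """Split on punctuation, whitespace and lower-to-upper case changes."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def lemmatize(token):
    """Naive stand-in for morphological analysis: strip a plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(name):
    """Tokenize, drop stop words, then lemmatize what remains."""
    return [lemmatize(t) for t in tokenize(name) if t not in STOP_WORDS]
```

After normalization, labels such as "Hands-Free Kits" and "handsFreeKit" reduce to the same token list and can be compared with the string-based techniques.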

Linguistic techniques using WordNet senses

A subClassOf B if A is a hyponym of B
– Pine subClassOf Tree

A hasPart B if A is a holonym of B
– Europe hasPart Greece

A = B if they are synonyms
– Quantity = Amount

A disjoint B if they are antonyms or are siblings in the same part of hierarchy
– Pine disjoint Oak
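
These rules can be illustrated with a toy lexicon in place of WordNet. Everything in the tables below is hand-made for the example and covers only the terms on the slide; antonyms are left out, so only the sibling case of the disjointness rule is shown. The `relation` function is a hypothetical helper.

```python
# Hand-made stand-in for WordNet; not real WordNet data.
HYPERNYM = {"pine": "tree", "oak": "tree", "tree": "plant"}  # term -> direct hypernym
PART_OF = {"greece": "europe"}                               # part -> whole
SYNSETS = [{"quantity", "amount"}]                           # sets of synonyms


def ancestors(term):
    """All hypernyms of `term`, nearest first."""
    found = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        found.append(term)
    return found


def relation(a, b):
    """Return the relation between terms a and b, or None if none applies."""
    a, b = a.lower(), b.lower()
    if a == b or any(a in s and b in s for s in SYNSETS):
        return "="
    if b in ancestors(a):
        return "subClassOf"   # a is a hyponym of b
    if PART_OF.get(b) == a:
        return "hasPart"      # a is a holonym of b
    if a in HYPERNYM and HYPERNYM.get(a) == HYPERNYM.get(b):
        return "disjoint"     # siblings under the same direct hypernym
    return None
```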

Linguistic techniques: gloss-based

WordNet gloss comparison
– The number of the same words occurring in both input glosses increases the similarity value.
– The equivalence relation is returned if the resulting similarity value exceeds a given threshold
– Maltese dog is a breed of toy dogs having a long straight silky white coat
– Afghan hound is a tall graceful breed of hound with a long silky coat
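
A minimal sketch of gloss comparison, assuming similarity is simply the count of distinct shared words, optionally ignoring a caller-supplied stop-word set. The function names and the way the threshold is applied are illustrative assumptions.

```python
import re


def gloss_words(gloss):
    """Distinct lowercase words in a gloss."""
    return set(re.findall(r"[a-z]+", gloss.lower()))


def gloss_overlap(gloss_a, gloss_b, stop_words=frozenset()):
    """Number of distinct words shared by both glosses, ignoring stop words."""
    return len((gloss_words(gloss_a) & gloss_words(gloss_b)) - set(stop_words))


def glosses_equivalent(gloss_a, gloss_b, threshold, stop_words=frozenset()):
    """True if the overlap exceeds the given threshold."""
    return gloss_overlap(gloss_a, gloss_b, stop_words) > threshold
```

For the two glosses on the slide, the shared words are "a", "breed", "of", "long", "silky" and "coat"; with "a" and "of" treated as stop words the overlap is 4, so whether the two breeds are reported as equivalent depends entirely on the threshold chosen.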

Structural technique: taxonomy comparison

Techniques for Part-of Relations

Phrase (Hearst) patterns:

add <part> to <whole>
<whole> is made of <part>
<part> gives the <whole> its
<whole>-containing <part>
<whole> consists of <part>
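
Such patterns can be applied to text with regular expressions. The sketch below covers three of the patterns and assumes, for simplicity, that each slot is filled by a single word; the `PATTERNS` table and `part_of_pairs` function are hypothetical names.

```python
import re

WORD = r"([A-Za-z]+)"  # simplifying assumption: one word per slot

# Each entry: compiled pattern and the roles of its two capture groups.
PATTERNS = [
    (re.compile(rf"\badd {WORD} to {WORD}\b", re.I), ("part", "whole")),
    (re.compile(rf"\b{WORD} is made of {WORD}\b", re.I), ("whole", "part")),
    (re.compile(rf"\b{WORD} consists of {WORD}\b", re.I), ("whole", "part")),
]


def part_of_pairs(text):
    """Return the set of (part, whole) pairs matched by the patterns."""
    pairs = set()
    for regex, roles in PATTERNS:
        for match in regex.finditer(text):
            slots = dict(zip(roles, match.groups()))
            pairs.add((slots["part"].lower(), slots["whole"].lower()))
    return pairs
```

In practice the extracted pairs are noisy, so they would normally be filtered or counted across many documents before being accepted as part-of links.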

Overview of alignment techniques

Alignment issues (1)

Nature of the input
– Underlying data models
– Schema-level vs. Instance-level
– Example: Link WordNet to Wikipedia

Interpretation of the output
– Approximate vs. exact
– Graded vs. absolute confidence

Performance varies => semi-automatic alignment.

Involving the human in alignment evaluation

Evaluation of alignments

Judging individual alignments
– Precision

Comparison to a reference alignment
– Recall
– Precision?

Comparing the logical consequences of the models

End-to-end evaluation
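
Comparison to a reference alignment comes down to two ratios. The sketch below assumes each correspondence is represented as a hashable item (for example a pair of entity names) and that a match must be exact; the function name is hypothetical.

```python
def precision_recall(found, reference):
    """Precision and recall of a found alignment against a reference alignment."""
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall
```

Precision can also be estimated without a reference by having a human judge a sample of the found correspondences, whereas recall needs some notion of the complete set of correct ones.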

The intrinsic fuzziness of alignment

WordNet

AAT

Literature / acknowledgment

Some slides from this lecture are based on a tutorial of Pavel Shvaiko and Jerome Euzenat
http://dit.unitn.it/~accord/Presentations/ESWC'05-MatchingHandOuts.pdf

Some slides are from Antoine Isaac (STICH)