Next Generation Semantic Web Applicationseurolan.info.uaic.ro/html/profs/materials/Motta.pdf ·...

Next Generation Semantic Web Applications. Enrico Motta, Mathieu D’Aquin, Sofia Angeletou, Claudio Baldassarre, Martin Dzbor, Laurian Gridinoc, Davide Guidi, Ainhoa Llorente, Vanessa Lopez, Marta Sabou. Knowledge Media Institute, The Open University, Milton Keynes, UK



Next Generation Semantic Web Applications

Enrico Motta, Mathieu D’Aquin, Sofia Angeletou, Claudio Baldassarre, Martin Dzbor, Laurian Gridinoc, Davide Guidi, Ainhoa Llorente, Vanessa Lopez, Marta Sabou

Knowledge Media Institute, The Open University, Milton Keynes, UK


Introduction

This talk presents a number of projects, which are part of an integrated effort at exploring the possibilities opened by the Semantic Web, viewed as a domain-independent, large scale supplier of formally encoded background knowledge, with respect to enabling intelligent problem solving.

We call the resulting applications: Next Generation Semantic Web Applications


Organization of the Talk

• The Semantic Web
• The Semantic Web in the context of AI research
• Next Generation Semantic Web Applications
  – What are they?
  – Why are they different from 1st generation SW Applications?
  – Infrastructure needs
• Examples
  – Ontology Matching
  – Integrating Web2.0 and SW
  – Semantic Web Browsing
  – Question Answering
• Conclusions


The Semantic Web

The collection of all formal, machine processable, web accessible, ontology-based statements (semantic metadata) about web resources and other entities in the world, expressed in a knowledge representation language based on an XML syntax (e.g., OWL, DAML, DAML+OIL, RDF, etc…)


[Figure: RDF triples; labels: Ontology, Metadata, UoD]


Semantic Web Document

[Screenshot excerpt; lines are cut off at the right edge]

<foaf:PersonalProfileDocument rdf:about="http://kmi.open.ac.uk/peopl
<dc:title>Sofia Angeletou&apos;s RDF Description</dc:title>
<rdfs:label>Sofia Angeletous RDF Description</rdfs:label>
<dc:description>RDF description for Sofia Angeletou in machine-read
<dc:creator rdf:resource="http://identifiers.kmi.open.ac.uk/people/sofi
<foaf:maker rdf:resource="http://identifiers.kmi.open.ac.uk/people/sof
<foaf:primaryTopic rdf:resource="http://identifiers.kmi.open.ac.uk/peo
</foaf:PersonalProfileDocument>

<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/sofia-a
<foaf:name>Sofia Angeletou</foaf:name>
<foaf:firstName>Sofia</foaf:firstName>
<foaf:surname>Angeletou</foaf:surname>
<foaf:mbox_sha1sum>F78114D4E45CFC6AC811E6191F50182FB98


Increasing Semantic Content

<rdf:RDF>
<Feature rdf:about="http://sws.geonames.org/26380
<name>Shenley Church End</name>
<alternateName>Shenley</alternateName>
<inCountry rdf:resource="http://www.geonames.org
</rdf:RDF>


Charting the web


Charting the web (2)


Domain Coverage on the SW

• Great variety: some topics are barely covered (e.g. Adult), while some are over-represented (e.g. Society, Computers)
• As expected, there is a large number of narrow-coverage documents and a small number of large-coverage ones.

[Chart: Distribution of documents in the 16 top categories of DMOZ]
[Chart: Distribution of the documents according to their coverage]



Example: Annotating the queen's birthday dinner


Knowledge Sparseness

Nr.  (t1, t2, t3)                          (t1)  (t2)  (t3)  (t1, t2)  (t1, t3)  (t2, t3)  (t1, t2, t3)
1    (project, article, researcher)          84    90    24         9        13         9             8
2    (researcher, student, university)       24   101    64        16        15        38            13
3    (research, publication, author)         15    77   138         8         5        36             4
4    (adventurer, expedition, photo)          1     0    32         0         1         0             0
5    (mountain, team, talk)                  12    25     9         2         1         1             1
6    (queen, birthday, dinner)                0     9     2         0         0         1             0
7    (project, relatedTo, researcher)        84    11    24         0        13         0             0
8    (researcher, worksWith, Ontology)       24     9    52         0         3         0             0
9    (academic, memberOf, project)           21    36    84         0         3         5             0
10   (article, hasAuthor, person)            90    14   371         8        32         2             0
11   (person, trip, photo)                  371     7    32         1        20         1             1
12   (woman, birthday, dinner)               32     9     2         1         1         1             1
13   (person, memberOf, project)            371    36    84        16        46         5             5
14   (publication, hasAuthor, person)        77    14   371         2        52         2             2


Example: Annotating the queen's birthday dinner


The Rise of Semantics


Thesis #1

The SW today has already reached a level of scale good enough to make it a very useful source of knowledge to support intelligent applications

In other words: the Semantic Web is no longer an aspiration but a reality

The availability of such large scale amounts of formalised knowledge is unprecedented in the history of AI


Thesis #2

The SW may well provide a solution to one of the classic AI challenges: how to acquire and manage large volumes of knowledge to develop truly intelligent problem solvers and address the brittleness of traditional KBS


Knowledge Representation Hypothesis in AI

Any mechanically embodied intelligent process will be comprised of structural ingredients that we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and independent of such external semantic attribution, play a formal but causal and essential role in engendering the behaviour that manifests that knowledge

Brian Smith, 1982


Intelligence as a function of possessing domain knowledge

Large Body of Knowledge

Intelligent Behaviour

KA Bottleneck


The Knowledge Acquisition Bottleneck

Large Body of Knowledge

Intelligent Behaviour

KA Bottleneck

Knowledge


SW as Enabler of Intelligent Behaviour

Intelligent Behaviour


Overall Goal

Our research programme is to contribute to the development of this large-scale web of data and develop a new generation of web applications able to exploit it to provide intelligent functionalities


First Generation Semantic Web Applications


<rdf:Description rdf:about="http://www.ecs.soton.ac.uk/info/#person-01269">
<ns0:family-name>Gibbins</ns0:family-name>
<ns0:full-name>Nicholas Gibbins</ns0:full-name>
<ns0:given-name>Nicholas</ns0:given-name>
<ns0:has-email-address>[email protected]ton.ac.uk</ns0:has-email-address>
<ns0:has-affiliation-to-unit rdf:resource="http://194.66.183.26/WEBSITE/GOW/ViewDepartment.aspx?Department=750"/>
</rdf:Description>
</rdf:RDF>

CS Dept Data

AKT Reference Ontology

RDF Data

Bibliographic Data


Features of 1st generation SW Applications

• Typically use a single ontology
  – Usually providing a homogeneous view over heterogeneous data sources.
  – Limited use of existing SW data
• Closed to semantic resources

Hence: current SW applications are more similar to traditional KBS (closed semantic systems) than to 'real' SW applications (open semantic systems)


1895 2007

It is still early days..


Next Generation Semantic Web Applications


Architecture of NGSW Apps


Issue: Semantic Web Infrastructure


Current Gateway to the Semantic Web


Limitations of Swoogle

• Limited quality control mechanisms
  – Many ontologies are duplicated
• Limited Query/Search mechanisms
  – Only keyword search; no distinction between types of elements
  – No support for formal query languages (such as SPARQL)
• Limited range of ontology ranking mechanisms
  – Swoogle only uses a 'popularity-based' one
• Limited API
• No support for ontology modularization and combination


A New Gateway to the Semantic Web

http://watson.kmi.open.ac.uk


• Sophisticated quality control mechanism
  – Detects duplications
  – Fixes obvious syntax problems
    • E.g., duplicated ontology IDs, namespaces, etc.
• Structures ontologies in a network
  – Using relations such as: extends, inconsistentWith, duplicates
• Provides efficient API
• Supports formal queries (SPARQL)
• Variety of ontology ranking mechanisms
• Modularization/Combination support
• Plug-ins for Protégé and NeOn Toolkit (under devpt.)
• Very cool logo!


Networked Ontologies

[Figure: network of ontologies O1, O1‘, O1‘‘, O2, O3, O4 and mappings M1, M2 (each with source and target); relation labels: priorVersionOf, relatedWith, dependsOn, incompatibleWith, extends]




Examples of Next Generation Semantic Web Applications


Example #1: Ontology Matching


Ontology Matching

[Figure: two ontologies with similarity scores on candidate correspondences]

– Label similarity methods
  • e.g., Full_Professor = FullProfessor
– Structure similarity methods
  • Using taxonomic/property related information
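
The label similarity idea (e.g., Full_Professor = FullProfessor) can be sketched in a few lines of Python. This is only an illustration, not the matcher discussed in the talk: the normalisation rules and the choice of `difflib.SequenceMatcher` as the string measure are assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(label: str) -> str:
    """Lower-case a label and drop separators (assumed normalisation)."""
    return re.sub(r"[\s_\-]+", "", label).lower()


def label_similarity(a: str, b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the normalised labels are identical."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


# Usage: labels that differ only in separators or case get the top score.
score = label_similarity("Full_Professor", "FullProfessor")
```

A matcher of this kind sees only the strings, which is why the following slides turn to background knowledge.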


New paradigm: use of background knowledge

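[Figure: terms A and B are anchored to A’ and B’ in the Background Knowledge (external source); the relation R between A’ and B’ yields the relation R between A and B]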


External Source = One Ontology

Aleksovski et al. EKAW’06
• Map (anchor) terms into concepts from a richly axiomatized domain ontology
• Derive a mapping based on the relation of the anchor terms

Assumes that a suitable (rich, large) domain ontology (DO) is available.


External Source = Web

van Hage et al. ISWC’05
• Rely on Google and an online dictionary in the food domain to extract semantic relations between candidate terms using IR techniques

[Figure: A rel B, derived with IR methods from the Web plus an online dictionary]

Precision increases significantly if domain-specific sources are used: 50% with the Web; 75% with domain texts.

Does not rely on a rich domain ontology.


External Source = SW

Proposal:
• Rely on online ontologies (Semantic Web) to derive mappings
• Ontologies are dynamically discovered and combined

[Figure: A rel B, derived from the Semantic Web]

Does not rely on any pre-selected knowledge sources.

M. Sabou, M. d’Aquin, E. Motta, “Using the Semantic Web as Background Knowledge in Ontology Mapping", Ontology Mapping Workshop, ISWC’06. Best Paper Award


Strategy 1 - Definition

Find ontologies that contain equivalent classes for A and B and use their relationship in the ontologies to derive the mapping.

[Figure: A rel B; on the Semantic Web, ontologies O1, O2, …, On contain the pairs (A1’, B1’), (A2’, B2’), …, (An’, Bn’)]

For each ontology use these rules:

A’ ≡ B’ ⇒ A ≡ B
A’ ⊆ B’ ⇒ A ⊆ B
A’ ⊇ B’ ⇒ A ⊇ B
A’ ⊥ B’ ⇒ A ⊥ B

These rules can be extended to take into account indirect relations between A’ and B’, e.g., between parents of A’ and B’:

A’ ⊆ C’ ∧ C’ ⊥ B’ ⇒ A ⊥ B
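
Strategy 1 can be sketched as a lookup over a collection of ontologies. The sketch below rests on assumptions that are not in the talk: each ontology is a hypothetical Python dict from an ordered pair of class names to a relation symbol, anchoring (A to A’) is done by exact name equality, and only the direct rules are covered, not the indirect extension.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy representation: (class, class) -> "≡", "⊆", "⊇" or "⊥".
Ontology = Dict[Tuple[str, str], str]

# Reading a stored relation in the opposite direction.
INVERSE = {"≡": "≡", "⊥": "⊥", "⊆": "⊇", "⊇": "⊆"}


def direct_relation(onto: Ontology, a: str, b: str) -> Optional[str]:
    """Look up the relation between a and b, in either stored direction."""
    if (a, b) in onto:
        return onto[(a, b)]
    if (b, a) in onto:
        return INVERSE[onto[(b, a)]]
    return None


def strategy1(ontologies: List[Ontology], a: str, b: str) -> Optional[str]:
    """Return the relation from the first single ontology relating both terms."""
    for onto in ontologies:
        rel = direct_relation(onto, a, b)
        if rel is not None:
            return rel  # A' rel B'  =>  A rel B
    return None


# Toy data for illustration only, not the content of any real ontology.
toy = {("Beef", "Food"): "⊆"}
mapping = strategy1([{}, toy], "Beef", "Food")
```

Returning the first hit is a simplification; a fuller version would presumably have to reconcile conflicting answers from different ontologies.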


Strategy 1- Examples

[Figure, example 1: Beef vs. Food; labels: Semantic Web, Beef, RedMeat, MeatOrPoultry, Food, Tap, SR-16, FAO_Agrovoc]

[Figure, example 2: Researcher vs. AcademicStaff; labels: Semantic Web, Researcher, AcademicStaff, ka2.rdf, ISWC, SWRC]


Strategy 2 - Definition

Principle: If no ontologies are found that contain the two terms then combine information from multiple ontologies to find a mapping.

[Figure: A rel B; on the Semantic Web, one ontology relates A’ to C and another relates C’ to B’]

Details:
(1) Select all ontologies containing A’ equiv. with A
(2) For each ontology containing A’:
  (a) if A’ ⊆ C, find relation between C and B.
  (b) if A’ ⊇ C, find relation between C and B.

Rules:
(r1) A’ ⊆ C ∧ C ⊆ B ⇒ A ⊆ B
(r2) A’ ⊆ C ∧ C ≡ B ⇒ A ⊆ B
(r3) A’ ⊆ C ∧ C ⊥ B ⇒ A ⊥ B
(r4) A’ ⊇ C ∧ C ⊇ B ⇒ A ⊇ B
(r5) A’ ⊇ C ∧ C ≡ B ⇒ A ⊇ B
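
Strategy 2 amounts to composing a relation A’–C found in one ontology with a relation C–B found in another. The following sketch makes several assumptions beyond the slides: ontologies are hypothetical dicts from ordered class-name pairs to relation symbols, terms are anchored by exact name, only the stored direction of each pair is consulted, and the rule table encodes r1 to r5 as (relation A’–C, relation C–B) → relation A–B.

```python
from typing import Dict, List, Optional, Tuple

Ontology = Dict[Tuple[str, str], str]  # hypothetical toy representation

# (relation A'-C, relation C-B) -> derived relation A-B
RULES = {
    ("⊆", "⊆"): "⊆",  # r1
    ("⊆", "≡"): "⊆",  # r2
    ("⊆", "⊥"): "⊥",  # r3
    ("⊇", "⊇"): "⊇",  # r4
    ("⊇", "≡"): "⊇",  # r5
}


def strategy2(ontologies: List[Ontology], a: str, b: str) -> Optional[str]:
    """Combine A'-C from one ontology with C-B from any ontology."""
    for first in ontologies:
        for (x, c), rel_ac in first.items():
            if x != a:
                continue  # this entry does not start from the anchored term
            for second in ontologies:
                rel_cb = second.get((c, b))
                derived = RULES.get((rel_ac, rel_cb))
                if derived is not None:
                    return derived
    return None


# Toy facts mirroring the Ham examples of this talk (illustrative only).
pizza = {("Ham", "Meat"): "⊆"}
sumo = {("Meat", "Food"): "⊆"}
wine = {("Meat", "Seafood"): "⊥"}
ham_food = strategy2([pizza, sumo, wine], "Ham", "Food")        # via r1
ham_seafood = strategy2([pizza, sumo, wine], "Ham", "Seafood")  # via r3
```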


Strategy 2 - Examples

Ex1: Chicken vs. Food
Chicken ⊆ Poultry (midlevel-onto) ∧ Poultry ⊆ Food (Tap) ⇒ Chicken ⊆ Food (r1)
(Same results for Duck, Goose, Turkey)

Ex2: Ham vs. Food
Ham ⊆ Meat (pizza-to-go) ∧ Meat ⊆ Food (SUMO) ⇒ Ham ⊆ Food (r1)

Ex3: Ham vs. Seafood
Ham ⊆ Meat (pizza-to-go) ∧ Meat ⊥ Seafood (wine.owl) ⇒ Ham ⊥ Seafood (r3)


Large Scale Evaluation

Matching AGROVOC (16k terms) and NALT (41k terms)

Evaluation: 1600 mappings, two teams
Overall performance comparable to best in class
(derived from 180 different ontologies)

M. Sabou, M. d’Aquin, W.R. van Hage, E. Motta, “Improving Ontology Matching by Dynamically Exploring Online Knowledge“. In Press


[Chart: Ontologies (180) used to derive mappings; labelled ontologies: TAP, CPE, Mid-level-ontology.daml, SUMO.daml, Economy.daml]


Thesis #3

Using the SW to dynamically provide background knowledge for the Agrovoc/NALT mapping problem is the first ever test case in which the SW, viewed as a large scale heterogeneous resource, has been successfully used to address a real-world problem


Thesis #4

The claim that the information on the SW is of poor quality and therefore not useful to support intelligent problem solving is a myth not supported by concrete experience.

Our experience in the NALT/Agrovoc ontology matching benchmark problem shows that, without any particularly intelligent filter, the information available on the SW already allows an 85% theoretical precision for our algorithm, well beyond the performance of any other ontology matching algorithm


Example #2: Integrating SW and Web2.0


Features of Web2.0 sites

• Tagging as opposed to rigid classification
• Dynamic vocabulary does not require much annotation effort and evolves easily
• Shared vocabulary emerges over time – certain tags become particularly popular


Limitations of tagging

• Different granularity of tagging
  – rome vs colosseum vs roman monument
  – Flower vs tulip
  – Etc.
• Multilinguality
• Spelling errors, different terminology, plural vs singular, etc.

This has a number of negative implications for the effective use of tagged resources
  – e.g., Search exhibits very poor recall


Giving meaning to tags


What does it mean to add semantics to tags?

1. Mapping a tag to a SW element
   "japan" → <akt:Country Japan>

2. Linking two "SW tags" using semantic relations
   {japan, asia} → <japan subRegionOf asia>


Applications of the approach

• To improve recall in keyword search
• To support annotation by dynamically suggesting relevant tags or visualizing the structure of relevant tags
• To enable formal queries over a space of tags
  – Hence, going beyond keyword search
• To support new forms of intelligent navigation
  – i.e., using the 'semantic layer' to support navigation


[Flowchart with three stages:
Pre-processing: Folksonomy tags → clean tags → group similar tags → filter infrequent tags → concise tags
Clustering: analyze co-occurrence of tags → co-occurrence matrix → cluster tags → Cluster1, Cluster2, …, Clustern
Concept and relation identification: take 2 “related” tags → find mappings & relation for the pair of tags (resources shown: SW search engine, Wikipedia, Google) → <concept, relation, concept>; remaining tags? Yes: repeat; No: END]


Pre-processing

• Scope: Subsets of Flickr and del.icio.us tags.
• Pre-processing (thresholds):
  – To be “similar”, Levenshtein similarity >= 0.83;
  – A tag has to occur at least 10 times.

              Total                  Distinct
              # entries   # tags     # users   # resources   # tags
del.icio.us   19,605      89,978     7,164     14,211        11,960
Flickr        49,087      167,130    6,140     49,087        17,956

              Total                  Distinct
              # entries   # tags     # users   # resources   # tags
del.icio.us   18,882      70,194     7,090     13,579        1,265
Flickr        44,032      127,098    5,321     44,032        2,696
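
The two thresholds above can be illustrated with a short sketch. It is not the pipeline used in the experiments: the normalisation of Levenshtein distance into a similarity (1 minus distance divided by the length of the longer tag), the greedy grouping under the most frequent variant, and the function names are all assumptions.

```python
from collections import Counter
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 - distance / length of the longer tag."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def preprocess(tags: Iterable[str], min_sim: float = 0.83,
               min_count: int = 10) -> List[str]:
    """Group similar tags under their most frequent variant, drop rare groups."""
    counts = Counter(t.lower() for t in tags)
    groups: Counter = Counter()
    for tag, n in counts.most_common():  # most frequent variants first
        rep = next((g for g in groups if similarity(tag, g) >= min_sim), tag)
        groups[rep] += n
    return [g for g, n in groups.items() if n >= min_count]


# Usage with made-up tag occurrences.
kept = preprocess(["semanticweb"] * 9 + ["semantic-web"] * 2 + ["rare"] * 3)
```

In this toy run the two spellings are merged into one group of 11 occurrences, which passes the frequency threshold, while the infrequent tag is dropped.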


Clustering

1. Each pair of similar tags, as determined by a co-occurrence analysis (e.g., audio and mp3), is a seed constituting an initial cluster;

2. The cluster is enlarged by including tags that are similar to both the initial tags;

3. Repeat procedure recursively for all tags: each new “candidate” tag for a cluster must be similar to the whole (possibly enlarged) set of tags in that cluster.

4. If there are no more candidates for the cluster, go to step 1 with a new seed (e.g., audio and music).
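
The four steps can be sketched as follows. The sketch assumes that the co-occurrence analysis has already produced a set of similar tag pairs (`similar_pairs`, a hypothetical input); how similarity is computed from the co-occurrence matrix is not specified here and is left out.

```python
from itertools import combinations
from typing import FrozenSet, List, Set


def build_clusters(tags: List[str],
                   similar_pairs: Set[FrozenSet[str]]) -> List[Set[str]]:
    """Grow one cluster per seed pair; a candidate must be similar to every member."""
    def similar(x: str, y: str) -> bool:
        return frozenset((x, y)) in similar_pairs

    clusters: List[Set[str]] = []
    for a, b in combinations(tags, 2):
        if not similar(a, b):
            continue  # step 1: only similar pairs act as seeds
        cluster = {a, b}
        grown = True
        while grown:  # steps 2-3: enlarge until no candidate is left
            grown = False
            for t in tags:
                if t not in cluster and all(similar(t, m) for m in cluster):
                    cluster.add(t)
                    grown = True
        if cluster not in clusters:
            clusters.append(cluster)  # step 4: continue with the next seed
    return clusters


# Usage with made-up similarity judgements.
pairs = {frozenset(p) for p in [("audio", "mp3"), ("audio", "music"),
                                ("mp3", "music"), ("rdf", "owl")]}
found = build_clusters(["audio", "mp3", "music", "rdf", "owl"], pairs)
```

Different seeds can grow into the same or nearly the same cluster, which is why a later combination step is needed.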


Clustering

     audio       semantic-web   adult   apple       chat
1    mp3         rdf            girls   mac         aim
2    music       ontology       nude    macintosh   messenger
3    playlist    owl            babes   tiger       gtalk
4    streaming   semweb         pics    osx         msn
5    radio       daml           sex     macosx      icq

? Fruit


Combining Clusters

• Smoothing heuristics are applied to avoid having a number of very similar clusters
  – originated from distinct seeds that are similar amongst each other.
• For every two clusters:
  1. If one cluster contains the other, i.e., if the larger cluster contains all the tags of the smaller, remove the smaller cluster;
  2. If clusters differ within a small margin, i.e., the number of different tags in the smaller cluster represents less than a percentage of the number of tags in the smaller and larger clusters, add the distinct words from the smaller to the larger cluster and remove the smaller.
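
A minimal sketch of the two smoothing rules is given below. The percentage is not stated on the slide, so `margin=0.2` is an arbitrary assumption, as is measuring it against the size of the union of the two clusters.

```python
from typing import List, Set


def combine_clusters(clusters: List[Set[str]],
                     margin: float = 0.2) -> List[Set[str]]:
    """Merge clusters that are contained in, or nearly equal to, a larger one."""
    merged: List[Set[str]] = []
    for cluster in sorted(clusters, key=len, reverse=True):  # larger first
        for kept in merged:
            extra = cluster - kept  # tags found only in the smaller cluster
            if len(extra) < margin * len(cluster | kept):
                kept |= extra  # rule 2; rule 1 is the case where extra is empty
                break
        else:
            merged.append(set(cluster))  # no sufficiently close larger cluster
    return merged


# Usage with made-up clusters.
result = combine_clusters([
    {"a", "b", "c", "d", "e"},
    {"a", "b"},
    {"a", "b", "c", "d", "f"},
    {"x", "y"},
])
```

Processing larger clusters first means a contained cluster is always compared against the cluster that could absorb it.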


Extracting relations

• For each pair of tags for which the search engine retrieved information, investigate the possible relationships:
  1. A tag can be an ancestor of the other. For example, in the FOOD ontology, apple is a subclass of fruit.
  2. A tag is the range or the value of a property of another tag. E.g., Class Zinfandel has a property hasColor, with value red
  3. Both tags have the same direct parent: apple and pear are subclasses of fruit
  4. Both tags have the same ancestors: assembly has as ancestors building (1st level) and construction (2nd), while formation has fabrication (1st) and construction (2nd) in WordNet.
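
The four checks can be illustrated on a toy knowledge source. The data structures are assumptions made for the example: `parent` is a hypothetical single-inheritance map from a class to its direct superclass (assumed acyclic), and `properties` maps a class to its property values. The facts reuse the examples from the list above.

```python
from typing import Dict, List, Optional, Tuple


def ancestors(parent: Dict[str, str], cls: str) -> List[str]:
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in parent:
        cls = parent[cls]
        chain.append(cls)
    return chain


def find_relation(parent: Dict[str, str],
                  properties: Dict[str, Dict[str, str]],
                  t1: str, t2: str) -> Optional[Tuple[str, str]]:
    """Try the four relationship types in order; return (kind, detail) or None."""
    anc1, anc2 = ancestors(parent, t1), ancestors(parent, t2)
    if t2 in anc1:
        return ("ancestor", t2)                  # 1. t2 is an ancestor of t1
    if t1 in anc2:
        return ("ancestor", t1)                  # 1. t1 is an ancestor of t2
    for subj, obj in ((t1, t2), (t2, t1)):
        for prop, value in properties.get(subj, {}).items():
            if value == obj:
                return ("property", prop)        # 2. value of a property
    if anc1 and anc2 and anc1[0] == anc2[0]:
        return ("same_parent", anc1[0])          # 3. same direct parent
    shared = [a for a in anc1 if a in anc2]
    if shared:
        return ("common_ancestor", shared[0])    # 4. shared ancestor
    return None


parent = {"apple": "fruit", "pear": "fruit",
          "assembly": "building", "building": "construction",
          "formation": "fabrication", "fabrication": "construction"}
properties = {"Zinfandel": {"hasColor": "red"}}
```

For instance, `find_relation(parent, properties, "assembly", "formation")` reports a common ancestor, `construction`, matching the WordNet example.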


Examples

Cluster_1: {admin application archive collection component control developer dom example form innovation interface layout planning program repository resource sourcecode}

[Figure: concepts and relations found for Cluster_1; node labels: participant, innovation, event, developer, activity, creator, planning, example, application, user, admin, resource, component, interface, archive, Information Object; relation labels: typeRange, participatesIn, in-event, has-mention-of]


Examples

Cluster_2: {college commerce corporate course education high instructing learn learning lms school student}

[Figure: ontology fragment derived from Cluster_2. Bracketed numbers give the source ontologies listed below.]

Concepts: education, training [1,4], qualification, corporate [1], institution, university [2,3], college [2], postSecondarySchool [2], school [2], student [3], course [3], activities [4], learning [4], teaching [4]

Relations: studiesAt, offersCourse, takesCourse

Sources:
[1] http://gate.ac.uk/projects/htechsight/Employment.daml
[2] http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
[3] http://www.mondeca.com/owl/moses/ita.owl
[4] http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl


Faceted Ontology

• Ontology creation and maintenance is automated

• Ontology evolution is driven by task features and by user changes

• Large scale integration of ontology elements from massively distributed online ontologies

• Very different from traditional top-down-designed ontologies


Lessons Learnt

• Approach proven to be feasible and promising.
• However…
– Assumptions in initial experiments (e.g., single ontology coverage for pairs of tags, focus on classes, clustering-based approach, etc.) too restrictive
– Swoogle is too limited to support a fully automated approach; we are now using Watson for the current experiments
– Integration with SW-enabled ontology matching algorithm is essential to improve term matching


Example #3: Semantics-Enhanced Web Browsing


Magpie Architecture

[Figure: Magpie architecture. Components: Web Page, Ontology based Proxy Server, Ontology cache (Lexicon), Problem Domain & Resources, Magpie Hub, Jabber Server, Semantic Log, Enriched Web Page.]

Semantic Log sample:
(found-item 3275578832 localhost #u"http://localhost/people/motta/" john-domingue john-domingue)
(found-item 3275578832 localhost …


PowerMagpie Architecture


Example #4: Question Answering on the Semantic Web


AquaLog: QA for Corporate Semantic Webs

Which KMi researchers work on the Semantic Web?

<akt:Person rdf:about="akt:PeterScott">
  <rdfs:label>Peter Scott</rdfs:label>
  <akt:hasAffiliation rdf:resource="akt:TheOpenUniversity"/>
  <akt:hasJobTitle>kmi deputy director</akt:hasJobTitle>
  <akt:worksInUnit rdf:resource="akt:KnowledgeMediaInstitute"/>
  <akt:hasGivenName>Peter</akt:hasGivenName>
  <akt:hasFamilyName>Scott</akt:hasFamilyName>
  <akt:hasPrettyName>Peter Scott</akt:hasPrettyName>
  <akt:hasPostalAddress rdf:resource="akt:KmiPostalAddress"/>
  <akt:hasEmailAddress>[email protected]</akt:hasEmailAddress>
  <akt:hasHomePage rdf:resource="http://kmi.open.ac.uk/people/scott/"/>
</akt:Person>

Answer…


An Ontology-Modular System

Which premiership footballers have played for Leeds and Chelsea?

<ftb:Footballer rdf:about="ftb:WayneRooney">
  <rdfs:label>Wayne Rooney</rdfs:label>
  <ftb:playsFor rdf:resource="ftb:ManUnited"/>
  <ftb:hasPosition rdf:resource="ftb:Forward"/>
  <ftb:hasPreviousClub rdf:resource="ftb:Everton"/>
</ftb:Footballer>
<ftb:Footballer rdf:about="ftb:DavidBeckham">
  <rdfs:label>David Beckham</rdfs:label>
  <ftb:playsFor rdf:resource="ftb:RealMadrid"/>
  <ftb:hasPosition rdf:resource="ftb:RightMidfield"/>
  <ftb:hasPreviousClub rdf:resource="ftb:ManUnited"/>
</ftb:Footballer>

AquaLog

Answer…


Coarse-grained Architecture

[Figure: NL SENTENCE INPUT → LINGUISTIC & QUERY CLASSIFICATION → QUERY TRIPLES → RELATION SIMILARITY SERVICE → ONTOLOGY COMPATIBLE TRIPLES → INFERENCE ENGINE → ANSWER]

Linguistic Component obtains intermediate representation from the input query

Relation Similarity Service maps the intermediate representation to the ontology/kb


Relation Similarity Service

MECHANISMS: Ontology relationships and taxonomy, String algorithms, WordNet, Learning Mechanism, User’s feedback

http://sourceforge.net/projects/jwordnet

http://secondstring.sourceforge.net/

Which are the KMi researchers in the semantic web area?

Translated query: KMi researchers (person/organization, semantic web area)

Ontological structures: Has-research-interest (kmi-research-staff-member, Semantic-web-area)
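A rough Python sketch of how a query term might be matched to an ontology relation using string similarity plus synonyms. It is only an illustration under stated assumptions: `difflib.SequenceMatcher` from the standard library stands in for the string metrics, the `SYNONYMS` table stands in for WordNet lookups, and all relation names other than has-research-interest, as well as the 0.5 threshold, are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical ontology relations with (domain, range) signatures.
ONTOLOGY_RELATIONS = {
    "has-research-interest": ("kmi-research-staff-member", "semantic-web-area"),
    "has-project-member": ("project", "person"),
    "works-in-unit": ("person", "organization-unit"),
}
# Stand-in for WordNet: a real system would look synonyms up there.
SYNONYMS = {"researchers": ["researcher", "research"]}

def similarity(a, b):
    """String similarity in [0, 1] between two terms, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_relation(term, threshold=0.5):
    """Return the ontology relation that best matches term, or None."""
    candidates = [term] + SYNONYMS.get(term, [])
    scored = []
    for relation in ONTOLOGY_RELATIONS:
        tokens = relation.split("-")
        # Score a relation by its best-matching token over all candidates.
        score = max(similarity(c, t) for c in candidates for t in tokens)
        scored.append((score, relation))
    score, relation = max(scored)
    return relation if score >= threshold else None
```

When no relation scores above the threshold, the function returns None; in that situation a system of this kind would fall back on the other mechanisms listed above, such as asking the user.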


Learning Mechanism

[Figure: learning a mapping between the User Lexicon and Ontology Concepts]

Query: Which academics work in Akt?
Ontology concepts: academic, project
User Disambiguation: work → Has-project-member (inverse-of)
Mapping: academic, work / has-project-member, project


Learning Mechanism

- Key: the CONTEXT of a mapping is given by its particular place within the ontology, namely, since we work with triples, by the two arguments that the relation connects.

[Figure: work → Has-project-member, in the context (academic, project)]

- How can we generalize what we have learned? We take the highest concepts in the ontology that can handle the same relation.

[Figure: (academic, work, project) is generalized and stored in the database as (person, Has-project-member, project)]
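The idea of storing a learned mapping under its context, generalized to the highest concepts the relation accepts, can be sketched in Python as below. This is an assumption-laden illustration rather than AquaLog's code: the taxonomy in `PARENT`, the `SIGNATURE` table, and the `UserLexicon` class are all hypothetical.

```python
# Hypothetical taxonomy: class -> direct superclass.
PARENT = {"academic": "researcher", "researcher": "person", "project": "activity"}
# relation -> (domain, range): the highest classes the relation can handle.
SIGNATURE = {"has-project-member": ("project", "person")}

def is_a(cls, ancestor):
    """True if cls equals ancestor or is one of its subclasses."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False

class UserLexicon:
    def __init__(self):
        self._entries = {}  # (term, subject class, object class) -> relation

    def learn(self, term, subject, obj, relation):
        """Store a user-confirmed mapping, generalized to the relation's signature."""
        domain, range_ = SIGNATURE[relation]
        if is_a(subject, domain) and is_a(obj, range_):
            context = (domain, range_)
        elif is_a(subject, range_) and is_a(obj, domain):  # inverse reading
            context = (range_, domain)
        else:
            raise ValueError("arguments do not fit the relation")
        self._entries[(term, *context)] = relation

    def lookup(self, term, subject, obj):
        """Return a learned relation whose context covers (subject, obj)."""
        for (t, s, o), relation in self._entries.items():
            if t == term and is_a(subject, s) and is_a(obj, o):
                return relation
        return None
```

Because the entry is stored at the level of person rather than academic, a later query about researchers and projects reuses the mapping learned from the query about academics.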


Interpretation Mechanisms: Ontology structure, String algorithms, WordNet, Machine Learning, User’s feedback


User Feedback


AquaLog --> PowerAqua

NL Query

<akt:Person rdf:about="akt:PeterScott">
  <rdfs:label>Peter Scott</rdfs:label>
  <akt:hasAffiliation rdf:resource="akt:TheOpenUniversity"/>
</akt:Person>

Answer…

PowerAqua

<ftb:Footballer rdf:about="ftb:WayneRooney">
  <rdfs:label>Wayne Rooney</rdfs:label>
  <ftb:playsFor rdf:resource="ftb:ManUnited"/>
  <ftb:hasPosition rdf:resource="ftb:Forward"/>
  <ftb:hasPreviousClub rdf:resource="ftb:Everton"/>
</ftb:Footballer>

<ptb:Builder rdf:about="ptb:Bob">
  <rdfs:label>Bob the Builder</rdfs:label>
  <ptb:playedBy……>
</ptb:Builder>


PowerAqua vs AquaLog

• Challenges when consulting and aggregating (dynamically mapping) information derived from multiple heterogeneous ontologies:
– Locating the right ontologies
– Intra-ontology semantic relevance analysis
   • Filtering the right mappings
– Integrating heterogeneous information to provide an answer
   • This reduces to deciding whether two instances specified according to different ontologies denote the same entity
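To make the last point concrete, here is one simple heuristic for guessing whether two instances from different ontologies denote the same entity. It is a sketch under assumptions, not the method PowerAqua uses: instances are represented as plain dictionaries, the keys `label`, `homepage` and `email` are hypothetical, and the 0.9 threshold is arbitrary.

```python
from difflib import SequenceMatcher

def normalize(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())

def same_entity(inst_a, inst_b, threshold=0.9):
    """Guess whether two instance descriptions denote the same entity."""
    # Strong evidence: both instances share an identifying value.
    for key in ("homepage", "email"):
        if inst_a.get(key) and inst_a.get(key) == inst_b.get(key):
            return True
    # Weaker evidence: the labels are nearly identical strings.
    score = SequenceMatcher(
        None, normalize(inst_a["label"]), normalize(inst_b["label"])
    ).ratio()
    return score >= threshold
```

Label matching alone gives false positives for common names, which is why the sketch checks identifying properties first.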


Conclusions

• SW provides an unprecedented opportunity to build a new generation of intelligent systems, able to exploit large scale background knowledge

• The large scale background knowledge provided by the SW may address one of the fundamental premises (and holy grails) of AI

• The SW is not an aspiration: it is a concrete technology that is already in place today and is steadily becoming larger and more robust

• The new class of systems enabled by the SW is fundamentally different in many respects both from traditional KBS and even from early SW applications

• The examples shown in this talk provide an initial taste of the new generation of applications which will be made possible by the emerging Semantic Web


References

• Ontology Mapping
– Lopez, V., Sabou, M., Motta, E. (2006). "Mapping the real semantic web on the fly". ISWC 2006.
– Sabou, M., D'Aquin, M., Motta, E. (2006). "Using the semantic web as background knowledge for ontology mapping". ISWC 2006 Workshop on Ontology Mapping.

• Integration of Web2.0 and Semantic Web
– Specia, L., Motta, E. (2007). "Integrating Folksonomies with the Semantic Web". ESWC 2007.
– Angeletou, S., Sabou, M., Specia, L., Motta, E. (2007). "Bridging the Gap Between Folksonomies and the Semantic Web: An Experience Report". ESWC 2007 Workshop on Bridging the Gap between Semantic Web and Web 2.0.

• Watson
– D'Aquin, M., Sabou, M., Dzbor, M., Baldassarre, C., Gridinoc, L., Angeletou, S., Motta, E. (2007). "WATSON: A Gateway for the Semantic Web". Poster Session at ESWC 2007.


'Vision' Papers

• Motta, E., Sabou, M. (2006). "Next Generation Semantic Web Applications". 1st Asian Semantic Web Conference, Beijing.

• Motta, E., Sabou, M. (2006). "Language Technologies and the Evolution of the Semantic Web". LREC 2006, Genoa, Italy.

• Motta, E. (2006). "Knowledge Publishing and Access on the Semantic Web: A Socio-Technological Analysis". IEEE Intelligent Systems, Vol.21, 3, (88-90).
