YAGO – A Core of Semantic Knowledge

Post on 22-Mar-2016

YAGO - A Core of Semantic Knowledge 1Fabian M. Suchanek

Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum

(Max-Planck Institute for Computer Science Saarbrücken/Germany)

Overview

• Motivation
• The Yago ontology
  • Content
  • Model
  • Extension
• Conclusion

The Truth about Elvis

Elvis is alive!

He works as an astronaut in NASA's special security program.

Usual solution

Ask a Web search engine: "Which NASA astronaut was born when Elvis was born?"

This yields only rubbish.

Reasons:

1. Google participates in the conspiracy

2. Google does not search knowledge, but Web sites

Solution: An ontology

[Figure: a graph of facts.
 Elvis born 1935; ? born 1935; ? is a astronaut;
 astronaut subclass person; person subclass entity;
 "Elvis Presley" means Elvis; "The King" means Elvis.
 The graph is made up of words, individuals, classes, and relations.]
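The graph on this slide can be read as a set of (subject, relation, object) triples. The following is a minimal sketch of that reading, not YAGO's actual storage format. The name `Some_Astronaut` is a hypothetical stand-in for the "?" node, and the helper functions are assumptions made for illustration.

```python
# Toy facts taken from the slide; "Some_Astronaut" is a hypothetical
# placeholder for the unknown "?" individual in the diagram.
FACTS = {
    ("Elvis", "born", "1935"),
    ("Some_Astronaut", "born", "1935"),
    ("Some_Astronaut", "is_a", "astronaut"),
    ("astronaut", "subclass", "person"),
    ("person", "subclass", "entity"),
    ('"Elvis Presley"', "means", "Elvis"),
    ('"The King"', "means", "Elvis"),
}


def objects(subject, relation):
    """All objects o such that (subject, relation, o) is a fact."""
    return {o for s, r, o in FACTS if s == subject and r == relation}


def subjects(relation, obj):
    """All subjects s such that (s, relation, obj) is a fact."""
    return {s for s, r, o in FACTS if r == relation and o == obj}


def resolve(word):
    """Map a word to the individuals it means."""
    return objects(f'"{word}"', "means")


def astronauts_born_with(word):
    """Astronauts born in the same year as the individual named by word."""
    result = set()
    for individual in resolve(word):
        for year in objects(individual, "born"):
            same_year = subjects("born", year) - {individual}
            result |= same_year & subjects("is_a", "astronaut")
    return result


print(astronauts_born_with("The King"))
```

Because words and individuals are separate nodes, the lookup works equally for "Elvis Presley" and "The King".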

Where do we get the ontology from?

Previous approaches:

• Assemble the ontology manually (WordNet, SUMO, GeneOntology)
  Problem: usually low coverage (MPI is in none of these)

• Extract the ontology from corpora, e.g. the Web (KnowItAll, Espresso, Snowball, LEILA)
  Problem: usually low accuracy (50%-92%)

YAGO approach:

Assemble the ontology from Wikipedia (=> good coverage)

Use the category system of Wikipedia (=> good accuracy)

Exploiting the Wikipedia category system

[Figure: mock-up of a Wikipedia article about Elvis, with filler text and a list of categories.]

• 1935_births: exploit relational categories. Yields (Elvis, born, 1935).
• American_singers: exploit conceptual categories. Yields (Elvis, is a, American_singer).
• Disputed_articles: avoid administrative categories. Otherwise: (Elvis, is a, Disputed_article).
• Rock'n_Roll_Music: avoid thematic categories. Otherwise: (Elvis, is a, Rock'n_Roll_Music).
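The four kinds of categories can be illustrated with a small classifier. This is only a sketch: the talk does not say how the kinds are told apart, so the regular expression, the administrative list, and the crude "ends in s" plural test are all hypothetical rules chosen to reproduce the four examples from the slides.

```python
import re

# Hypothetical rules for illustration; not the rules used by YAGO.
RELATIONAL_PATTERNS = [
    (re.compile(r"^(\d{4})_births$"), "born"),
]
ADMINISTRATIVE = {"Disputed_articles"}


def classify(category):
    """Return (kind, relation, value) for a Wikipedia category name."""
    for pattern, relation in RELATIONAL_PATTERNS:
        match = pattern.match(category)
        if match:
            return ("relational", relation, match.group(1))
    if category in ADMINISTRATIVE:
        return ("administrative", None, None)
    if category.endswith("s"):
        # Crude plural test: treat a plural name as a class of things.
        return ("conceptual", "is_a", category[:-1])
    return ("thematic", None, None)


def facts_for(entity, categories):
    """Keep only the facts from relational and conceptual categories."""
    facts = []
    for category in categories:
        kind, relation, value = classify(category)
        if relation is not None:
            facts.append((entity, relation, value))
    return facts


categories = ["1935_births", "American_singers",
              "Disputed_articles", "Rock'n_Roll_Music"]
print(facts_for("Elvis", categories))
```

Administrative and thematic categories produce no facts, which is the behaviour the slides ask for.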

The Upper Model

[Figure: Elvis born 1935; Elvis is a American_singer. Open question (?): how does American_singer connect to the upper classes person and entity?]

The Upper Model: From Wikipedia?

[Figure: Elvis born 1935; Elvis is a American_singer. Above American_singer, the Wikipedia category hierarchy shows People_by_occupation, Business, and Social_group, followed by a "?".]

The Upper Model: From WordNet?

[Figure: Elvis born 1935; Elvis is a American_singer. WordNet classes shown above American_singer: Singer#1, Person#3, Singer#17, ...]

The Upper Model: From WordNet?

[Figure: Elvis born 1935; Elvis is a American_singers_of_Jewish_origin. WordNet classes shown above it: Singer#1, Person#3, Singer#17, ..., Origin#7.]

The YAGO ontology

[Figure: Elvis born 1935; Elvis is a American_singer; American_singer subclass Singer#1; Singer#1 subclass Person#3; "singer" means Singer#1; "Elvis Presley" means Elvis.]

The YAGO ontology: Accuracy

Relation        Accuracy
subclass        97.70% +/- 1.59%
is a            94.54% +/- 2.36%
familyName      97.81% +/- 1.75%
givenName       97.62% +/- 2.08%
establishedIn   90.84% +/- 4.28%
bornInYear      93.14% +/- 3.71%
diedInYear      98.72% +/- 1.30%
locatedIn       98.41% +/- 1.52%
politicianOf    92.43% +/- 3.93%
writtenInYear   94.35% +/- 3.33%
hasWonPrize     98.47% +/- 1.57%

The YAGO ontology: Number of Facts

Ontology     Number of facts
KnowItAll           30,000
SUMO                60,000
WordNet            200,000
OpenCyc            300,000
Cyc              2,000,000
Yago             6,000,000

Ontologies should not be judged purely by the number of facts! This is just an informational overview.

The Yago Model: Why binary is not enough

(Elvis, is_a, singer)

But only from 1953 to 1977.
We know this from Wikipedia.

A binary fact has no place for the time or the source.

Each fact gets an identifier, so that further facts can refer to it:

#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)

The Yago model formally

A YAGO ontology over
• a set of relations R
• a set of common entities C
• a set of fact identifiers I
is a function

I → (R ∪ C ∪ I) × R × (R ∪ I ∪ C)

#1 (Elvis, is_a, singer)
#2 (#1, time, 1953-1977)
#3 (#1, source, Wikipedia)

We can talk about
• facts: (#1, source, Wikipedia)
• additional arguments: (#1, time, 1953-1977)
• relations: (time, hasRange, time_interval)
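The formal definition maps each fact identifier to a triple whose first and last positions may themselves be fact identifiers. A minimal sketch of that idea, assuming a plain Python dictionary as the function and two hypothetical helper names (`facts_about`, `describe`):

```python
from typing import Dict, Tuple

Fact = Tuple[str, str, str]

# The ontology as a function from fact identifiers to triples.
# Arguments starting with "#" refer to other facts.
ontology: Dict[str, Fact] = {
    "#1": ("Elvis", "is_a", "singer"),
    "#2": ("#1", "time", "1953-1977"),
    "#3": ("#1", "source", "Wikipedia"),
}


def facts_about(fact_id: str, onto: Dict[str, Fact]) -> Dict[str, str]:
    """Collect {relation: object} for all facts whose subject is fact_id."""
    return {rel: obj for subj, rel, obj in onto.values() if subj == fact_id}


def describe(fact_id: str, onto: Dict[str, Fact]):
    """Return the triple together with the facts stated about it."""
    subj, rel, obj = onto[fact_id]
    return (subj, rel, obj, facts_about(fact_id, onto))


print(describe("#1", ontology))
```

The time interval and the source are ordinary facts, so no relation needs more than two arguments.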

The Yago model: Logical aspects

Axioms:

(x, is_a, y)

(y, subclass, z)

=> (x, is_a, z)

...

[Figure: Elvis is a singer; singer subclass person; therefore Elvis is a person.]

[Figure: starting from the facts f1, ..., f5, deriving facts yields f1, ..., f10, and eliminating facts yields f1, f2, f3. Both results are finite and unique.]
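Deriving facts can be sketched as applying the axiom repeatedly until nothing new appears. The sketch below implements only the one axiom spelled out on the slide; the other axioms, elided there as "...", are left out. The function name `derive` and the sample facts are assumptions for illustration.

```python
def derive(facts):
    """Apply (x, is_a, y), (y, subclass, z) => (x, is_a, z) to a fixpoint."""
    closure = set(facts)
    while True:
        new = {
            (x, "is_a", z)
            for (x, r1, y) in closure if r1 == "is_a"
            for (y2, r2, z) in closure if r2 == "subclass" and y2 == y
        } - closure
        if not new:
            # No rule application adds anything: the result is complete.
            return closure
        closure |= new


base = {
    ("Elvis", "is_a", "singer"),
    ("singer", "subclass", "person"),
    ("person", "subclass", "entity"),
}
for fact in sorted(derive(base) - base):
    print(fact)
```

The loop terminates because every derived fact reuses existing entities, so only finitely many triples can ever be added.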

Extending the Ontology

Whom did Elvis marry?

Pattern: X married Y
Text: Elvis married Priscilla
Result: Priscilla

With LEILA, the pattern X married Y is matched on the subject (subj) and object (obj) links of "married", so it also covers:

Text: Elvis, the great rock star, married Priscilla
Result: Priscilla

[Figure: the Ontology (YAGO) combined with Information Extraction (LEILA).]

The Truth about Elvis

Which astronaut was born in the same year as Elvis?

http://www.mpi-inf.mpg.de/~suchanek/downloads/yago/

Enter your Yago Query:

"Elvis Presley" bornInYear $year
$astro bornInYear $year
$astro isa astronaut

20 results

Which astronaut codenamed "Roger" was born in the same year as Elvis?

Enter your Yago Query:

"Elvis Presley" bornInYear $year
$astro bornInYear $year
"Roger" givenNameOf $astro
$astro isa astronaut

$astro = Roger_Chaffee
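A query of this shape is a conjunction of triple patterns that share variables. The following sketch shows one simple way such a query could be evaluated over a handful of toy facts. It is not the YAGO query engine; the facts, the `match` and `query` functions, and the convention that variables start with `$` are assumptions made for illustration.

```python
# Toy facts for illustration only.
FACTS = [
    ("Elvis Presley", "bornInYear", "1935"),
    ("Roger_Chaffee", "bornInYear", "1935"),
    ("Roger_Chaffee", "isa", "astronaut"),
    ("Roger", "givenNameOf", "Roger_Chaffee"),
    ("Neil_Armstrong", "bornInYear", "1930"),
    ("Neil_Armstrong", "isa", "astronaut"),
]


def match(pattern, fact, binding):
    """Extend binding so that pattern equals fact, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, fact):
        if term.startswith("$"):
            # Bind the variable, or check it against its earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, facts):
    """Return all variable bindings that satisfy every pattern."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = match(pattern, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


patterns = [
    ("Elvis Presley", "bornInYear", "$year"),
    ("$astro", "bornInYear", "$year"),
    ("Roger", "givenNameOf", "$astro"),
    ("$astro", "isa", "astronaut"),
]
print(query(patterns, FACTS))
```

Each pattern narrows the set of candidate bindings, so the shared variable `$year` links Elvis to the astronaut.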

Conclusions

• Yago is based on a logically clean model
• Yago has an accuracy of around 95%
• Yago is 3 times larger than the largest competitor
• Elvis is alive

Reference

For all details, please refer to our technical report

"Yago – A Core of Semantic Knowledge"

(Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum)

available at http://www.mpii.mpg.de/~suchanek

BibTeX:
@TECHREPORT{yagotr,
  AUTHOR = {Suchanek, Fabian and Kasneci, Gjergji and Weikum, Gerhard},
  TITLE = {Yago: A Core of Semantic Knowledge},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Stuhlsatzenhausweg 85, 66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-2006-5-006},
  YEAR = {2006}
}