Knowledge Extraction and Linked Data: Playing with Frames

Knowledge Extraction and Linked Data: Playing with Frames. Valentina Presutti, STLab, ISTC-CNR. Linked Data For Information Extraction @ ISWC 2016, Tuesday, October 18th, 2016

Page 1: Knowledge Extraction and Linked Data: Playing with Frames

Knowledge Extraction and Linked Data: Playing with Frames

Valentina Presutti

STLab, ISTC-CNR

Linked Data For Information Extraction @ ISWC 2016

Tuesday, October 18th 2016

Page 2: Knowledge Extraction and Linked Data: Playing with Frames

STLab team

Valentina Presutti, Aldo Gangemi, Andrea Nuzzolese, Diego Reforgiato, Martina Sangiovanni, Mario Caruso, Giorgia Lodi, Alessandro Russo, Luigi Asprino, Piero Conca

Page 3: Knowledge Extraction and Linked Data: Playing with Frames

Outline

• Frames as units of meaning (claim and intuition)
• Background on frames
• From entity-centric to frame-centric knowledge extraction
• Some STLab research projects and results
• Next and open issues

Page 4: Knowledge Extraction and Linked Data: Playing with Frames

Claim and intuition

Page 5: Knowledge Extraction and Linked Data: Playing with Frames

Frames naturally support knowledge reconciliation, regardless of the logical, conceptual or syntactic representation of knowledge sources

Page 6: Knowledge Extraction and Linked Data: Playing with Frames

The intuition

To understand someone who speaks to us, or a text we read, we identify the main entities and how they relate to each other within a schema (frame)

Frame occurrences + context-dependent reasoning

Page 7: Knowledge Extraction and Linked Data: Playing with Frames

I went to the disco and I met a friend, who had lost her keys.

We spent the night looking for them.
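A minimal sketch of how the two sentences above could be held as frame occurrences, plus one very naive piece of context-dependent reasoning (resolving "them"). The frame labels, role names and the `resolve_pronoun` helper are illustrative assumptions, not FrameNet identifiers or an STLab API.

```python
from dataclasses import dataclass, field


@dataclass
class FrameOccurrence:
    frame: str                                   # illustrative label, not a FrameNet id
    roles: dict = field(default_factory=dict)    # role name -> filler


# One occurrence per situation mentioned in the two sentences.
occurrences = [
    FrameOccurrence("Going", {"traveller": "I", "destination": "disco"}),
    FrameOccurrence("Meeting", {"party_1": "I", "party_2": "friend"}),
    FrameOccurrence("Losing", {"owner": "friend", "possession": "keys"}),
    FrameOccurrence("Searching", {"searchers": ("I", "friend"),
                                  "sought": "them", "duration": "night"}),
]


def resolve_pronoun(occurrences, pronoun, antecedent_role):
    """Replace a pronoun filler with the most recent earlier filler of antecedent_role."""
    last_seen = None
    for occ in occurrences:
        for role, filler in list(occ.roles.items()):
            if filler == pronoun and last_seen is not None:
                occ.roles[role] = last_seen
        if antecedent_role in occ.roles:
            last_seen = occ.roles[antecedent_role]
    return occurrences


resolve_pronoun(occurrences, "them", "possession")
print(occurrences[3].roles["sought"])
```

The point of the sketch is only that the pronoun is resolved by looking at roles of neighbouring frame occurrences rather than at isolated entities.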

Page 12: Knowledge Extraction and Linked Data: Playing with Frames

We want machines to perform this process

Page 13: Knowledge Extraction and Linked Data: Playing with Frames

Background

Page 14: Knowledge Extraction and Linked Data: Playing with Frames

Minsky [1]

“When one encounters a new situation […] one selects from memory a structure called a Frame. This is a remembered framework to be adapted to fit reality by changing details as necessary.”

“A frame is a data-structure for representing a stereotyped situation, like being in a certain kind of living room, or going to a child's birthday party”

“We can think of a frame as a network of nodes and relations.”

“Collections of related frames are linked together into frame-systems”

Fillmore [2]

“[…] in characterising a language system we must add to the description of grammar and lexicon a description of the cognitive and interactional “frames” […]”

“The evolution toward language must have consisted in part in the gradual acquisition of a repertory of frames and of mental processes for operating with them, and eventually the capacity to create new frames and to transmit them.”

“[…] in order to perceive something or to attain a concept, what is […] necessary is to have in memory a repertoire of prototypes. The act of perception or conception being that of recognizing in what ways an object can be seen as an instance of one or another of these prototypes.”

Page 15: Knowledge Extraction and Linked Data: Playing with Frames

Frame definition and representation

Page 16: Knowledge Extraction and Linked Data: Playing with Frames

Cure (FrameNet frame), with frame elements Healer, Medication and Patient

http://framenet.icsi.berkeley.edu/

Page 17: Knowledge Extraction and Linked Data: Playing with Frames

Representing frames

N-ary relation f(e, e1, …, en)
• f is a first-order logic relation
• e is a variable for any event or situation described by f
• ei is a variable for any of the entity arguments of f

An OWL n-ary relation pattern
• the n-ary relation is the reification of f, i.e. e
• the n objects represent the arguments of f
• the n argument relations are binary projections of f including e
• co-participation relations are binary projections of f not including e

Example: “Hagrid rolled up a note for Harry in Hogwarts”
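The n-ary pattern can be sketched in a few lines of Python for the Hagrid sentence. This is an illustration under assumptions: the role names (`agent`, `theme`, `beneficiary`, `location`), the identifiers `roll_up_1` and `note_1`, and the `coParticipatesWith` property are made up for the example and are not taken from the slides.

```python
from itertools import combinations


def reify_frame(frame, event_id, arguments):
    """Split f(e, e1..en) into binary projections.

    arguments maps a (hypothetical) role name to the entity filling it.
    Returns (argument_triples, coparticipation_triples).
    """
    # The reified relation e is typed with the frame f.
    argument_triples = [(event_id, "rdf:type", frame)]
    # Binary projections that include e: one per argument.
    argument_triples += [(event_id, role, entity) for role, entity in arguments.items()]
    # Binary projections that do not include e: one per unordered pair of arguments.
    coparticipation_triples = [
        (a, "coParticipatesWith", b) for a, b in combinations(arguments.values(), 2)
    ]
    return argument_triples, coparticipation_triples


args = {"agent": "Hagrid", "theme": "note_1", "beneficiary": "Harry", "location": "Hogwarts"}
arg_triples, copart_triples = reify_frame("RollUp", "roll_up_1", args)

for triple in arg_triples + copart_triples:
    print(triple)
```

With four arguments the sketch yields one typing triple, four argument relations and six co-participation relations.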

Page 18: Knowledge Extraction and Linked Data: Playing with Frames

Frames as units of meaning [3]

From entity-centric to frame-centric design and extraction

Before: key terms → classes/properties

After: key situations → frames/patterns

Page 19: Knowledge Extraction and Linked Data: Playing with Frames

From entity-centric to frame-centric knowledge extraction

Page 20: Knowledge Extraction and Linked Data: Playing with Frames

We want machines to perform this process

Page 21: Knowledge Extraction and Linked Data: Playing with Frames

This requires at least three ingredients:
• Knowledge representation
• Knowledge extraction
• Automated reasoning and learning

Page 22: Knowledge Extraction and Linked Data: Playing with Frames

The Semantic Web and Linked Data
• Knowledge representation
• Knowledge extraction
• Automated reasoning

Page 23: Knowledge Extraction and Linked Data: Playing with Frames

The Semantic Web and Linked Data
• Knowledge representation
• Knowledge extraction
• Automated reasoning

Mary marriedWith John
Mary weddingDate October 12th, 2016
John weddingDate October 12th, 2016
Mary weddingPlace Kobe
John weddingPlace Kobe
Mary weddingPlace Rome

Page 24: Knowledge Extraction and Linked Data: Playing with Frames

The Semantic Web and Linked Data
• Knowledge representation
• Knowledge extraction
• Automated reasoning

OL & KE tools main focus: named entity extraction, taxonomy induction, relation extraction, axiom extraction, …

Page 25: Knowledge Extraction and Linked Data: Playing with Frames

This is useful, but it’s not enough
• Semantic heterogeneity
• Lack of knowledge boundaries (context) [3]

[Figure labels: marriedWith; firstMarriageWith; spouse, marriage; spouse, date]
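One way to read the "lack of knowledge boundaries" point is with the wedding triples from the earlier slide: flat binary triples cannot say which place belongs to which wedding, while one frame occurrence per wedding can. The sketch below assumes that the Rome triple belongs to a second, otherwise unknown wedding; that assumption, the ISO date strings and the helper names are not in the slides.

```python
# Flat, entity-centric triples, as listed on the wedding slide.
flat = [
    ("Mary", "marriedWith", "John"),
    ("Mary", "weddingDate", "2016-10-12"),
    ("John", "weddingDate", "2016-10-12"),
    ("Mary", "weddingPlace", "Kobe"),
    ("John", "weddingPlace", "Kobe"),
    ("Mary", "weddingPlace", "Rome"),
]


def places_for(person, triples):
    """All wedding places asserted for a person, with no notion of which wedding."""
    return sorted(o for s, p, o in triples if s == person and p == "weddingPlace")


# Frame-centric view: each wedding occurrence bounds its own facts.
weddings = [
    {"spouses": {"Mary", "John"}, "date": "2016-10-12", "place": "Kobe"},
    {"spouses": {"Mary"}, "date": None, "place": "Rome"},  # assumed second wedding
]


def place_of_wedding(person, date, occurrences):
    """Places of the weddings of a person that took place on a given date."""
    return [w["place"] for w in occurrences
            if person in w["spouses"] and w["date"] == date]


print(places_for("Mary", flat))                          # two candidate places
print(place_of_wedding("Mary", "2016-10-12", weddings))  # bounded by the occurrence
```

The flat query returns both Kobe and Rome for Mary, whereas the frame-based query ties the October 12th date to Kobe only.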

Page 26: Knowledge Extraction and Linked Data: Playing with Frames

Frames as units of meaning

• The role of frames in knowledge representation, extraction and interaction
• Performing empirical observations on the web (in line with van Harmelen [4])
• Using frames to drive the design of solutions to research problems and to test their performance

Page 27: Knowledge Extraction and Linked Data: Playing with Frames

Some projects and results

Page 28: Knowledge Extraction and Linked Data: Playing with Frames

Frame-based knowledge extraction [5]

http://wit.istc.cnr.it/stlab-tools/fred/

From text to linked data

Page 29: Knowledge Extraction and Linked Data: Playing with Frames

Frame-based Linked Data

“Rico Lebrun taught visual arts at the Chouinard Art Institute and at the Disney Studios. He was influenced by Michelangelo and maintained a lifelong affinity for Goya and Picasso.”

Page 30: Knowledge Extraction and Linked Data: Playing with Frames

FRED

“The Black Hand might not have decided to barbarously assassinate Franz Ferdinand after he arrived in Sarajevo on June 28th, 1914”

Page 31: Knowledge Extraction and Linked Data: Playing with Frames

Binary relations [6]

• Automatic selection of relevant binary projections of frames
• Usable label generation
• Formal alignment between frames and binary properties

Page 32: Knowledge Extraction and Linked Data: Playing with Frames

Binary relation assessment

“Rico Lebrun taught visual arts at the Chouinard Art Institute and at the Disney Studios. He was influenced by Michelangelo and maintained a lifelong affinity for Goya and Picasso.”

[Figure labels: Subject, Object; Subject, Object]

http://wit.istc.cnr.it/stlab-tools/legalo

Page 33: Knowledge Extraction and Linked Data: Playing with Frames

Binary property generation

vn.role:Actor1 -> “with”
vn.role:Actor2 -> “with”
vn.role:Beneficiary -> “for”
vn.role:Instrument -> “with”
vn.role:Destination -> “to”
vn.role:Topic -> “about”
vn.role:Source -> “from”

[Figure labels: Subject, Object]

legalo:teachArtAt

teach art at

“Rico Lebrun taught visual arts at the Chouinard Art Institute and at the Disney Studios. He was influenced by Michelangelo and maintained a lifelong affinity for Goya and Picasso.”

http://wit.istc.cnr.it/stlab-tools/legalo

Page 34: Knowledge Extraction and Linked Data: Playing with Frames

Binary property generation

vn.role:Actor1 -> “with”
vn.role:Actor2 -> “with”
vn.role:Beneficiary -> “for”
vn.role:Instrument -> “with”
vn.role:Destination -> “to”
vn.role:Topic -> “about”
vn.role:Source -> “from”

[Figure labels: Subject, Object]

legalo:teachAbout

teach about

“Rico Lebrun taught visual arts at the Chouinard Art Institute and at the Disney Studios. He was influenced by Michelangelo and maintained a lifelong affinity for Goya and Picasso.”

http://wit.istc.cnr.it/stlab-tools/legalo
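The label and property-name step can be sketched as follows, using only the role-to-preposition mapping shown on the slide. The function names `label_for` and `property_name` are hypothetical, and the real Legalo pipeline is certainly richer than this; the sketch only reproduces the two outputs visible on the slides (`teach about` and `teach art at`).

```python
# Role-to-preposition mapping, copied from the slide.
ROLE_TO_PREPOSITION = {
    "vn.role:Actor1": "with",
    "vn.role:Actor2": "with",
    "vn.role:Beneficiary": "for",
    "vn.role:Instrument": "with",
    "vn.role:Destination": "to",
    "vn.role:Topic": "about",
    "vn.role:Source": "from",
}


def label_for(verb, role):
    """Build a readable label from a verb lemma and the role of the object."""
    return f"{verb} {ROLE_TO_PREPOSITION[role]}"


def property_name(label, prefix="legalo"):
    """Turn a space-separated label into a lowerCamelCase property name."""
    first, *rest = label.split()
    return f"{prefix}:{first}{''.join(word.capitalize() for word in rest)}"


topic_label = label_for("teach", "vn.role:Topic")
print(topic_label, "->", property_name(topic_label))
print("teach art at", "->", property_name("teach art at"))
```

Keeping the label and the property name as two separate steps mirrors the slide, which shows both the human-readable label and the generated `legalo:` property.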

Page 35: Knowledge Extraction and Linked Data: Playing with Frames

Semantic Web triples and properties generation

“Rico Lebrun taught visual arts at the Chouinard Art Institute and at the Disney Studios. He was influenced by Michelangelo and maintained a lifelong affinity for Goya and Picasso.”

dbpedia:Rico_Lebrun s:teachAbout dbpedia:Visual_arts .

s:teachAbout a owl:ObjectProperty ;
    rdfs:subPropertyOf fred:Teach ;
    rdfs:domain wibi:Artist ;
    rdfs:range wibi:Art ;
    grounding:definedFromFormalRepresentation fred-graph:a6705cedbf9b53d10bbcdedaa3be9791da0a9e94 ;
    grounding:derivedFromLinguisticEvidence s:linguisticEvidence ;
    owl:propertyChainAxiom ( [ owl:inverseOf s:AgentTeach ] s:TopicTeach ) .

_:b2 a alignment:Cell ;
    alignment:entity1 s:teachAbout ;
    alignment:entity2 <http://purl.org/vocab/aiiso/schema#teaches> ;
    alignment:measure "0.846"^^xsd:float ;
    alignment:relation "equivalence" .

Annotations on the slide:
• domain, range, subsumption
• linguistic and formal scope
• alignment to existing LOD vocabularies
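The property chain in the triples above states that following s:AgentTeach backwards and then s:TopicTeach forwards yields s:teachAbout. A small sketch of that inference over plain tuples, assuming a hypothetical event node `fred:teach_1` (the slide does not name the event individual) and a hypothetical helper `apply_chain`:

```python
def apply_chain(triples, inverse_of, then, inferred_property):
    """Infer (x, inferred_property, z) whenever (e, inverse_of, x) and (e, then, z) hold.

    This mirrors a two-step chain whose first step is the inverse of a property.
    """
    new_triples = set()
    for event, p1, x in triples:
        if p1 != inverse_of:
            continue
        for other_event, p2, z in triples:
            if other_event == event and p2 == then:
                new_triples.add((x, inferred_property, z))
    return new_triples


# Frame-centric input: the teaching event with its agent and topic.
graph = {
    ("fred:teach_1", "s:AgentTeach", "dbpedia:Rico_Lebrun"),
    ("fred:teach_1", "s:TopicTeach", "dbpedia:Visual_arts"),
}

derived = apply_chain(graph, "s:AgentTeach", "s:TopicTeach", "s:teachAbout")
print(derived)
```

This is how the binary property stays formally aligned with the frame it was projected from: the binary triple is derivable from the event-based representation.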

Page 36: Knowledge Extraction and Linked Data: Playing with Frames

Evaluation tasks [7]

Tool/Task       Topics  NER  NE-RS  TE  TE-RS  Senses  Taxo  Rel  Roles  Events  Frames+SRL
AIDA            –       +    +      –   –      +       –     –    –      –       –
Alchemy         +       +    –      +   –      +       –     +    –      –       –
Apache Stanbol  –       +    +      –   –      +       –     –    –      –       –
CiceroLite      –       +    +      +   +      +       –     +    +      +       +
DB Spotlight    –       +    +      –   –      +       –     –    –      –       –
FOX             +       +    +      +   +      +       –     –    –      –       –
FRED            –       +    +      +   +      +       +     +    +      +       +
NERD            –       +    +      –   –      +       –     –    –      –       –
Ollie           –       –    –      –   –      –       –     +    –      –       –
Open Calais     +       +    –      –   –      +       –     –    –      +       –
PoolParty KD    +       –    –      –   –      –       –     –    –      –       –
ReVerb          –       –    –      –   –      –       –     +    –      –       –
Semiosearch     –       –    +      –   +      –       –     –    –      –       –
Tagme           –       +    +      +   +      –       –     –    –      –       –
Wikimeta        –       +    –      +   +      +       –     –    –      –       –
Zemanta         –       +    –      –   –      +       –     –    –      –       –

Page 37: Knowledge Extraction and Linked Data: Playing with Frames

Sentiment analysis

• Topic detection and opinion holder detection [8]
• Sentiment propagation through frames and roles [9]

Example: “People hope that the President will be condemned by the judges”

Page 38: Knowledge Extraction and Linked Data: Playing with Frames

Evaluation

• 50 sentences from the MPQA opinion corpus¹ and the Europarl corpus²
• Sentence sentiment polarity on 100 open-rated hotel reviews (positive and negative)³

Task                               Measure           Value
Holder detection                   F1                0.95
Topic detection                    F1                0.68
Sub-topic detection                F1                0.77
Review sentiment vs. user scores   Avg. correlation  0.81

¹ http://mpqa.cs.pitt.edu/corpora/mpqacorpus/
² http://www.statmt.org/europarl/
³ http://www.stlab.istc.cnr.it/documents/sentilo/reviewsposneg.zip
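For reference, the F1 values reported for the detection tasks are, by the standard definition, the harmonic mean of precision P and recall R. The slide reports only the combined score, so P and R themselves are not available here.

```latex
\[
  F_1 = \frac{2\,P\,R}{P + R}
\]
```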

Page 39: Knowledge Extraction and Linked Data: Playing with Frames

Coverage and integration of linguistic and world knowledge

• Frame-based linked data provides an effective representation of discourse
• Our ultimate goal is machine understanding, hence an important issue is the limited coverage of existing resources and their integration with factual world knowledge
• FrameBase [10] partially addresses this problem, starting from similar principles and intuitions
• STLab has developed Framester [11,12]: a general web-scale resource that integrates linguistic and factual world knowledge (see Aldo’s presentation later)

Page 40: Knowledge Extraction and Linked Data: Playing with Frames

Framester

• Abstract, formalised frame model
• Generalised model of roles
• Represents all resources’ entities in terms of their frame semantics
• Links linguistic data with ontologies and facts (~43M triples)
• Includes FrameBase’s ReDer rules

Page 41: Knowledge Extraction and Linked Data: Playing with Frames

Word-Frame-Disambiguation (frame detection)

• Any word, e.g. Shakespeare, write, alone, nicely, etc.
• Frames evoked by word senses
• Outperforms Semafor and FrameBase

Spoiler warning: details to come in a few minutes :)

http://lipn.univ-paris13.fr/framester/en/wfd/
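The idea of going from a word to its senses and then to the frames those senses evoke can be illustrated with a toy lookup. Everything in this sketch is an assumption made for illustration: the lexicon entries, the sense identifiers, the keyword sets and the overlap heuristic are not Framester data and not the method evaluated in [12].

```python
# Toy lexicon: word -> sense id -> (context keywords, evoked frame). All entries are hypothetical.
LEXICON = {
    "write": {
        "write%compose": ({"book", "letter", "poem", "author"}, "Text_creation"),
        "write%record": ({"disk", "file", "data", "memory"}, "Recording"),
    },
}


def detect_frame(word, context):
    """Return the frame of the sense whose keywords overlap most with the context.

    A simple word-overlap heuristic; ties go to the first listed sense.
    """
    senses = LEXICON.get(word)
    if not senses:
        return None
    context_words = {w.lower().strip(".,") for w in context.split()}
    best_sense = max(senses, key=lambda s: len(senses[s][0] & context_words))
    return senses[best_sense][1]


print(detect_frame("write", "Shakespeare wanted to write a poem."))
print(detect_frame("write", "The program will write data to disk."))
```

The sketch shows only the shape of the task: the same word can evoke different frames depending on which sense the context selects.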

Page 42: Knowledge Extraction and Linked Data: Playing with Frames

Current project and challenge

Helping people with dementia and their carers

http://www.mario-project.eu

Natural language understanding
• questionnaire for cognitive ability assessment
• speech to tag (pictures, music, events, etc.)
• reminiscence games and suggestions
• suggesting missing words
• understanding with partial information

[Figure label: User-Robot KB]

Page 43: Knowledge Extraction and Linked Data: Playing with Frames

Next and open issues

Current work:
• Integrating FRED and Framester to normalise results
• Framester-driven ontology alignment (part of a PhD thesis under development)
• MARIO understanding component and evaluation (with datasets and people with dementia, PwD)

Open challenge:
• How to combine statistical learning with our approaches?
• We want FRED to learn from interaction experiences
• We want to learn new rules and procedures, not only data (algorithm learning), and to obtain their explicit formalisation

Page 44: Knowledge Extraction and Linked Data: Playing with Frames

Stupid questions are only those that are not asked (Prof. Paolo Ciancarini)

Page 45: Knowledge Extraction and Linked Data: Playing with Frames

References

[1] Marvin Minsky: A Framework for Representing Knowledge. MIT-AI Laboratory Memo 306 (June 1974)

[2] Charles J. Fillmore: Frame Semantics and the Nature of Language. Annals of the New York Academy of Sciences 280(1): 20-32 (1976)

[3] Aldo Gangemi, Valentina Presutti: Towards a pattern science for the Semantic Web. Semantic Web 1(1-2): 61-68 (2010)

[4] Frank van Harmelen: The Web of Data: do we understand what we build? https://sssw.org/2016/?page_id=386

[5] Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero, Andrea Giovanni Nuzzolese, Francesco Draicchio, Misael Mongiovì: Semantic Web Machine Reading with FRED. Semantic Web (to appear)

[6] Valentina Presutti, Andrea Giovanni Nuzzolese, Sergio Consoli, Aldo Gangemi, Diego Reforgiato Recupero: From hyperlinks to Semantic Web properties using Open Knowledge Extraction. Semantic Web 7(4): 351-378 (2016)

Page 46: Knowledge Extraction and Linked Data: Playing with Frames

References cont.

[7] Aldo Gangemi: A Comparison of Knowledge Extraction Tools for the Semantic Web. ESWC 2013: 351-366

[8] Aldo Gangemi, Valentina Presutti, Diego Reforgiato Recupero: Frame-Based Detection of Opinion Holders and Topics: A Model and a Tool. IEEE Comp. Int. Mag. 9(1): 20-30 (2014)

[9] Diego Reforgiato Recupero, Valentina Presutti, Sergio Consoli, Aldo Gangemi, Andrea Giovanni Nuzzolese: Sentilo: Frame-Based Sentiment Analysis. Cognitive Computation 7(2): 211-225 (2015)

[10] Jacobo Rouces, Gerard de Melo, Katja Hose: FrameBase: Representing N-ary Relations Using Semantic Frames. ESWC 2015: 505-521

[11] Aldo Gangemi, Mehwish Alam, Valentina Presutti, Luigi Asprino, Diego Reforgiato Recupero: Framester: A Wide Coverage Linguistic Linked Data Hub. In Proceedings of EKAW 2016

[12] Aldo Gangemi, Mehwish Alam, Valentina Presutti: Word Frame Disambiguation: Evaluating Linguistic Linked Data on Frame Detection. LD4IE@ISWC 2016: 23-31