Ontology and the Lexiocn.week4

Ontology and the Lexicon. Week 4: Ontological and Lexical Resources. Shu-Kai Hsieh, Lab of Ontologies, Language Processing and e-Humanities, GIL, National Taiwan University. March 18, 2014.

Transcript of Ontology and the Lexiocn.week4

Page 1: Ontology and the Lexiocn.week4


Ontology and the Lexicon

Week 4: Ontological and Lexical Resources

Shu-Kai Hsieh

Lab of Ontologies, Language Processing and e-Humanities

GIL, National Taiwan University

March 18, 2014

Ontology and the Lexicon Shu-Kai Hsieh

Page 2: Ontology and the Lexiocn.week4


1. Lexical Resources
   WordNet
   FrameNet

2. Lexicalized Ontologies

3. Lab
   Wiki Taxonomy
   Chinese Wordnet

Page 4: Ontology and the Lexiocn.week4

Barsalou[1]

• We shall assume that concepts are people’s psychological representations of categories (e.g., apple, chair); whereas meanings are people’s understandings of words and other linguistic expressions (e.g., "apple", "large chair").

• We shall argue that concepts and meanings differ substantially. Although they are related in important ways, the relationship is one of complementarity, not equivalence.

Page 5: Ontology and the Lexiocn.week4

WordNet

• A freely available lexical database developed at Princeton University.

• A semantic network which consists of synsets and relations.

• 155,287 word forms are grouped into 117,659 synsets (WordNet 3.0).

• Synsets are glossed and interconnected by different semantic relations.

Page 6: Ontology and the Lexiocn.week4

FrameNet

Page 7: Ontology and the Lexiocn.week4

Frames: chunks of knowledge

1. The notion of frame came up in the 1970s in AI and Cognitive Science. Similar notions include schema, script and scenario.

2. Words are defined through the semantic roles they play and the frames (types of event, relation, or entity) they evoke.

Page 8: Ontology and the Lexiocn.week4

Source: http://www.icsi.berkeley.edu/icsi/news/2014/02/charles-fillmore-dies-at-84

Page 9: Ontology and the Lexiocn.week4

FrameNet

• A word evokes a frame.

• Instead of words, FN works with lexical units (LUs), each of these being a pairing of a word with a sense.

Example (of FN work)

Let’s work through the Revenge frame following Fillmore (pp. 26-): https://framenet.icsi.berkeley.edu/fndrupal/sites/default/files/FNintroCJF.ppt. The glossary also helps: https://framenet.icsi.berkeley.edu/fndrupal/glossary
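The idea of a lexical unit as a word paired with a sense can be made concrete with a small data-structure sketch. The class names `LexicalUnit` and `Frame` are hypothetical and are not FrameNet's own schema; the frame elements and lexical units listed for Revenge are an illustrative subset that should be checked against the FrameNet database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LexicalUnit:
    """One word (lemma + part of speech) paired with one sense.

    In this sketch the sense is identified by the frame the word evokes.
    """
    lemma: str
    pos: str
    frame: str

    @property
    def name(self):
        return f"{self.lemma}.{self.pos}"


@dataclass
class Frame:
    name: str
    frame_elements: list = field(default_factory=list)
    lexical_units: list = field(default_factory=list)

    def add_lu(self, lemma, pos):
        """Create an LU that evokes this frame and register it."""
        lu = LexicalUnit(lemma, pos, self.name)
        self.lexical_units.append(lu)
        return lu


# Illustrative subset only; consult FrameNet for the full Revenge frame.
revenge = Frame(
    "Revenge",
    frame_elements=["Avenger", "Offender", "Injury", "Injured_party", "Punishment"],
)
for lemma, pos in [("avenge", "v"), ("revenge", "n"), ("retaliate", "v")]:
    revenge.add_lu(lemma, pos)


def frames_evoked(word, frames):
    """Return the names of all frames that have an LU for `word`."""
    return [f.name for f in frames if any(lu.lemma == word for lu in f.lexical_units)]


print(frames_evoked("avenge", [revenge]))
```

Keeping the frame name inside the LU is one possible design choice: the same lemma can then appear in several LUs, one per frame it evokes.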

Page 10: Ontology and the Lexiocn.week4

FrameNet Relations

The frames that we create, and thus the Frame Elements (FE) and Lexical Units (LU) associated with them, are intended to be situated in semantic space by means of frame-to-frame relations and semantic types. The relations we use include Inheritance, Subframe, Causative of, Inchoative of, and Using.

Page 11: Ontology and the Lexiocn.week4

FrameNet Relations

source: FrameNet (II) book

Inheritance: An IS-A relation. The child frame is a subtype of the parent frame, and each FE in the parent is bound to a corresponding FE in the child. An example is the Revenge frame, which inherits from the Rewards and punishments frame.

Subframe: The child frame is a subevent of a complex event represented by the parent, e.g. the Criminal process frame has subframes of Arrest, Arraignment, Trial, and Sentencing.

Page 12: Ontology and the Lexiocn.week4

FrameNet Relations

source: FrameNet (II) book

Using: The child frame presupposes the parent frame as background, e.g. the Speed frame "uses" (or presupposes) the Motion frame; however, not all parent FEs need to be bound to child FEs.

Perspective on: The child frame provides a particular perspective on an un-perspectivized parent frame. A pair of examples consists of the Hiring and Get_a_job frames, which perspectivize the Employment_start frame from the Employer’s and the Employee’s point of view, respectively.
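The frame-to-frame relations can be pictured as typed edges in a small graph. The sketch below stores only the example pairs mentioned on these slides as (child, relation, parent) triples; the triple layout, the helper names, and the underscore spelling of multi-word frame names are assumptions made for illustration, not FrameNet's data format.

```python
from collections import defaultdict

# (child frame, relation, parent frame), taken from the examples on the slides.
RELATIONS = [
    ("Revenge", "Inheritance", "Rewards_and_punishments"),
    ("Arrest", "Subframe", "Criminal_process"),
    ("Arraignment", "Subframe", "Criminal_process"),
    ("Trial", "Subframe", "Criminal_process"),
    ("Sentencing", "Subframe", "Criminal_process"),
    ("Speed", "Using", "Motion"),
    ("Hiring", "Perspective_on", "Employment_start"),
    ("Get_a_job", "Perspective_on", "Employment_start"),
]


def build_index(relations):
    """Index child frames by (parent, relation) for quick lookup."""
    index = defaultdict(list)
    for child, relation, parent in relations:
        index[(parent, relation)].append(child)
    return index


def children(index, parent, relation):
    """Child frames linked to `parent` by `relation` (empty list if none)."""
    return list(index.get((parent, relation), []))


index = build_index(RELATIONS)
print(children(index, "Criminal_process", "Subframe"))
print(children(index, "Employment_start", "Perspective_on"))
```

Keeping the relation type on the edge matters because the relations license different inferences: Inheritance binds every parent FE, whereas Using does not.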

Page 13: Ontology and the Lexiocn.week4

FrameNet Ontology

Page 14: Ontology and the Lexiocn.week4

Lexical Resources: Comparison and Alignment

• FN: 800 frames, 10,000 LUs, exemplified in more than 135,000 sentences.

• WN: 117,659 synsets, 206,941 word-sense pairs, ...
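A few ratios make the size comparison easier to read. The sketch below uses only figures quoted in these slides (including the 155,287 WordNet word forms given earlier); the variable names are ad hoc, and the sentence count is a lower bound because the slide says "more than".

```python
# Figures as quoted on the slides (WordNet 3.0 and FrameNet).
WN_SYNSETS = 117_659
WN_WORD_SENSE_PAIRS = 206_941
WN_WORD_FORMS = 155_287
FN_FRAMES = 800
FN_LUS = 10_000
FN_SENTENCES = 135_000  # "more than", so this is a lower bound

# Average number of word-sense pairs (i.e. synonyms) per synset.
senses_per_synset = WN_WORD_SENSE_PAIRS / WN_SYNSETS
# Average number of senses per word form (a rough polysemy figure).
senses_per_word_form = WN_WORD_SENSE_PAIRS / WN_WORD_FORMS
# Average number of lexical units per frame.
lus_per_frame = FN_LUS / FN_FRAMES
# Lower bound on annotated example sentences per lexical unit.
sentences_per_lu = FN_SENTENCES / FN_LUS

print(f"WN word-sense pairs per synset:    {senses_per_synset:.2f}")
print(f"WN word-sense pairs per word form: {senses_per_word_form:.2f}")
print(f"FN lexical units per frame:        {lus_per_frame:.1f}")
print(f"FN sentences per LU (at least):    {sentences_per_lu:.1f}")
```

These averages hint at the structural difference: a synset groups fewer than two words on average, while a frame groups a dozen or so lexical units.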

Page 15: Ontology and the Lexiocn.week4

Lexical Resources: Comparison and Alignment

• WN: a synset comprises only synonyms of the same part of speech; FN: a frame may include different parts of speech, and words with contradictory definitions (such as antonyms related to the same idea).

• A statistical measure shows where WordNet and FrameNet agree well on the meanings of words and phrases, and where they do not [2].

Page 17: Ontology and the Lexiocn.week4

Lexical Resources: Comparison and Alignment

Out of the 67 annotated sentences that included the word curious, 48 both evoked the typicality frame (FN1) and used curious in the WordNet sense of "beyond or deviating from the usual or expected" (WN1).
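Counting how often a FrameNet frame and a WordNet sense are assigned to the same sentence is the raw material for such a comparison. The sketch below reproduces the 48-out-of-67 figure; the slide does not say how the remaining 19 sentences were labelled, so the "other" labels are placeholders, and the function names are hypothetical.

```python
from collections import Counter


def joint_counts(annotations):
    """Count how often each (FrameNet frame, WordNet sense) pair co-occurs."""
    return Counter(annotations)


def share(counts, pair):
    """Proportion of all annotated sentences that carry `pair`."""
    total = sum(counts.values())
    return counts[pair] / total if total else 0.0


# 48 of the 67 sentences are (FN1, WN1), as reported on the slide. The labels
# of the remaining 19 are NOT given there, so "other" is only a placeholder.
annotations = [("FN1", "WN1")] * 48 + [("other", "other")] * 19

counts = joint_counts(annotations)
print(f"{share(counts, ('FN1', 'WN1')):.1%} of the sentences pair FN1 with WN1")
```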

Page 19: Ontology and the Lexiocn.week4

Classification of Ontologies

• Formal vs informal ontologies

• Lexicalized vs non-lexicalized ontologies

Page 20: Ontology and the Lexiocn.week4

WN as a lexicalized ontology

• All of the WordNet noun synsets are organized into hierarchies, with 25 top-level synsets headed by the unique beginner synset entity.

Page 21: Ontology and the Lexiocn.week4

• E.g., Hypernymy/Hyponymy and Meronymy/Holonymy are transitive and asymmetrical. Because hyponymy generates a hierarchical semantic structure, a hyponym inherits all the features of the more generic concept and adds at least one feature that distinguishes it from its superordinate. The hierarchy is about 16 levels deep.
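Transitivity and asymmetry of hypernymy can be checked on a toy hierarchy. The chain below is illustrative and is not a dump of WordNet; the sketch also assumes a single hypernym per concept and no cycles, whereas WordNet synsets may have several hypernyms.

```python
# Toy hypernym links (child -> parent); illustrative, not taken from WordNet.
HYPERNYM = {
    "poodle": "dog",
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
    "animal": "organism",
    "organism": "entity",
}


def hypernym_chain(concept, hypernym=HYPERNYM):
    """Follow hypernym links upward until the top of the hierarchy.

    By transitivity, every concept on the returned path is a hypernym.
    """
    chain = []
    while concept in hypernym:
        concept = hypernym[concept]
        chain.append(concept)
    return chain


def is_a(child, ancestor, hypernym=HYPERNYM):
    """True if `ancestor` is a direct or inherited hypernym of `child`."""
    return ancestor in hypernym_chain(child, hypernym)


def depth(concept, hypernym=HYPERNYM):
    """Number of links between `concept` and the top of the hierarchy."""
    return len(hypernym_chain(concept, hypernym))


print(hypernym_chain("poodle"))
print(is_a("poodle", "animal"), is_a("animal", "poodle"))
print(depth("poodle"))
```

The two `is_a` calls illustrate asymmetry: a poodle is an animal, but an animal is not a poodle.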

Page 22: Ontology and the Lexiocn.week4

Multilingual Wordnets

• (EuroWordNet) The development of a multilingual database with wordnets for several European languages, with 10,000 up to 50,000 synsets (Dutch, German, French, Spanish, Italian, Czech, Estonian).

• The Inter-Lingual-Index, which is mainly based on EWN synsets, serves as an unstructured fund of concepts that provides an efficient mapping across the languages.

• Various types of equivalence relations are distinguished to link synsets with index records.

• Some cross-linguistic issues identified: different lexicalization, differences in synonymy and homonymy, etc.
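The role of the Inter-Lingual-Index as a hub can be sketched as follows: each language-specific synset carries typed equivalence links to index records, and two synsets in different languages are matched when they point to the same record. All identifiers below are made up for illustration, and the relation labels are only loosely modelled on EuroWordNet-style names; treat them as assumptions.

```python
# language -> synset id -> list of (equivalence relation, ILI record).
# Every identifier here is hypothetical.
ILI_LINKS = {
    "nl": {
        "hond-n-1": [("eq_synonym", "ili-dog")],
    },
    "es": {
        "perro-n-1": [("eq_synonym", "ili-dog")],
        # Spanish "dedo" covers both fingers and toes: a case of different
        # lexicalization, so no single exact equivalent is assumed here.
        "dedo-n-1": [("eq_has_hyponym", "ili-finger"), ("eq_has_hyponym", "ili-toe")],
    },
}


def ili_records(lang, synset, relation="eq_synonym"):
    """ILI records linked to `synset` through the given equivalence relation."""
    links = ILI_LINKS.get(lang, {}).get(synset, [])
    return {ili for rel, ili in links if rel == relation}


def equivalents(src_lang, synset, tgt_lang, relation="eq_synonym"):
    """Synsets in `tgt_lang` that share an ILI record with the source synset."""
    targets = ili_records(src_lang, synset, relation)
    return sorted(
        tgt
        for tgt, links in ILI_LINKS.get(tgt_lang, {}).items()
        if any(rel == relation and ili in targets for rel, ili in links)
    )


print(equivalents("nl", "hond-n-1", "es"))
print(ili_records("es", "dedo-n-1", "eq_has_hyponym"))
```

Because the languages are linked only through the index, adding a new wordnet requires links to the index records rather than pairwise mappings to every other language.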

Page 23: Ontology and the Lexiocn.week4

Multilingual Wordnets

• (BalkaNet) (Romanian, Bulgarian, Turkish, Slovenian, Greek, Serbian), with 10,000 synsets.

• (AsianWordnet) (Hindi, Indonesian, Lao, Mongolian, Japanese, Burmese, Nepali, Sinhala, Thai, Vietnamese) 1

• (The Global WordNet Association, more than 50 languages) 2

1 http://www.asianwordnet.org

2 http://www.globalwordnet.org

Page 24: Ontology and the Lexiocn.week4

Figure: Multilingual WordNet Architecture: EWN’s model

Page 25: Ontology and the Lexiocn.week4


The ideal Global Wordnet Model (Vossen)?

• Construct separate wordnets for each language.

• Contributors from each language encode the same core set of concepts plus culture/language-specific ones.

• Synsets (concepts) are mapped cross-linguistically via an ontology, instead of just the English Wordnet.

Page 26: Ontology and the Lexiocn.week4

NLTK rocks
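As a starting point for exploring WordNet from Python, here is a minimal sketch built on NLTK's WordNet corpus reader. The helper names `load_wordnet`, `describe` and `pos_counts` are hypothetical; the reader methods they call (`synsets`, `name`, `definition`, `lemma_names`, `hypernyms`, `pos`) are part of NLTK. The sketch assumes NLTK is installed and the WordNet corpus has been downloaded, for example with `nltk.download('wordnet')`.

```python
from collections import Counter


def load_wordnet():
    """Return NLTK's WordNet reader (requires nltk plus the 'wordnet' corpus)."""
    from nltk.corpus import wordnet as wn  # imported lazily on purpose
    return wn


def describe(wn, word):
    """Summarise each synset of `word`: name, gloss, synonyms, direct hypernyms."""
    rows = []
    for synset in wn.synsets(word):
        rows.append({
            "synset": synset.name(),
            "gloss": synset.definition(),
            "lemmas": synset.lemma_names(),
            "hypernyms": [h.name() for h in synset.hypernyms()],
        })
    return rows


def pos_counts(wn, word):
    """Count the senses of `word` per part of speech."""
    return Counter(synset.pos() for synset in wn.synsets(word))
```

With the corpus available, `describe(load_wordnet(), "dog")` lists the senses of dog, and `pos_counts(load_wordnet(), "dog")` shows how they split across parts of speech. Passing the reader in as an argument keeps the helpers usable with any object that offers the same methods.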

Page 27: Ontology and the Lexiocn.week4

Wiki Taxonomy

Page 28: Ontology and the Lexiocn.week4

Wiki: its role

• A good review of the current state of the art can be found in [3].

• Resolving the knowledge acquisition bottleneck: the creation of very large knowledge bases has been made possible by the availability of collaboratively curated online resources such as Wikipedia and Wiktionary.

• Structured, semi-structured, unstructured resources.

• What are the advantages and disadvantages, respectively?

Page 29: Ontology and the Lexiocn.week4

Wiki as semi-structured content for Ontologies

• Transforming Wikipedia into machine-readable knowledge

• Acquiring related terms: thesaurus extraction

• Relation extraction

• Leitmotif: generating semantics by exploiting the shallow structure found in Wikipedia.

• Building and enriching ontologies from Wikipedia: YAGO, WikiNet and BabelNet.

Page 30: Ontology and the Lexiocn.week4

Multilingual case of collaboratively-generated semi-structured resources

Page 31: Ontology and the Lexiocn.week4

• WikiTaxonomy (Ponzetto and Strube, 2007; Ponzetto and Strube, 2011) (100k is-a relations)

• WikiNet (Nastase et al., 2010; Nastase and Strube, 2013) is a project which heuristically exploits different aspects of Wikipedia to obtain a multilingual concept network by deriving not only is-a relations, but also other types of relations.

• MENTA (de Melo and Weikum, 2010) creates one of the largest multilingual lexical knowledge bases by interconnecting more than 13M articles in 271 languages.
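To give a flavour of how is-a relations might be read off Wikipedia's shallow structure, here is a deliberately simplified sketch that labels a subcategory link as is-a when both category names share the same lexical head. It is loosely inspired by head-matching heuristics and is not the published WikiTaxonomy algorithm; the preposition list, the function names and the example category names are all assumptions.

```python
# Words that typically start a post-modifier in category names (assumed list).
PREPOSITIONS = {"of", "in", "by", "from", "for", "at", "on"}


def lexical_head(category):
    """Very rough head finder: the last word before the first preposition."""
    tokens = category.lower().split()
    if not tokens:
        raise ValueError("category name must not be empty")
    for i, token in enumerate(tokens):
        if token in PREPOSITIONS and i > 0:
            tokens = tokens[:i]
            break
    return tokens[-1]


def label_link(subcategory, category):
    """Label a subcategory -> category link as 'is-a' when the heads match."""
    same_head = lexical_head(subcategory) == lexical_head(category)
    return "is-a" if same_head else "unknown"


# Hypothetical category pairs for illustration.
print(label_link("British computer scientists", "Computer scientists"))
print(label_link("Scientists by nationality", "Scientists"))
print(label_link("Crime comics", "Crime"))
```

A heuristic of this kind is cheap but coarse: it ignores plural/singular variation and links it cannot decide are simply left as "unknown".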

Page 32: Ontology and the Lexiocn.week4

Chinese Wordnet

Chinese Wordnet(s)

Existing CWNs:

• Sinica BOW (Huang et al, 2004)

• CWN.1 (Academia Sinica, 2004-2010) >>> CWN.2 (NTU-Taiwan 2011-)

• COW (NTU-Singapore)

• Southeast University WordNet (SEW), 2008

Page 33: Ontology and the Lexiocn.week4

Chinese Wordnet(s)

Evaluation matters!

Example

• quantity w/o quality

• linguistic depth (co-predication, variants, etc.)

• coarse or fine-grained evaluation methodology (e.g., via annotators' agreement or the OntoClean methodology) (Hsieh, 2013)
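One common way to quantify annotators' agreement is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The slides do not name a specific measure, so kappa is used here as an assumption; the sense labels in the usage lines are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented sense labels for four tokens of the same word.
annotator_1 = ["sense1", "sense1", "sense2", "sense2"]
annotator_2 = ["sense1", "sense2", "sense2", "sense2"]
print(cohen_kappa(annotator_1, annotator_2))
```

In this toy case the annotators agree on three of four items, but half of that agreement is expected by chance, which is why kappa is lower than the raw agreement rate.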

Page 34: Ontology and the Lexiocn.week4

Chinese Wordnet.2

Launched and developed by AS (Chu-Ren Huang) and maintained by LOPE lab (Shu-Kai Hsieh) at NTU.

• Successor of the bilingual translation wordnet (Sinica BOW) developed by Academia Sinica. 3

• dense in synsets, sparse in LSR.

• with some specific designs (e.g., meaning facets, paranymy, etc.).

• other variants 4

3 bow.sinica.edu.tw

4 googleCWN: lope.linguistics.ntu.edu.tw/gcwn

Page 35: Ontology and the Lexiocn.week4

Chinese Wordnet 2

Figure: Chinese Wordnet: lope.linguistics.ntu.edu.tw/cwn2

Page 36: Ontology and the Lexiocn.week4

A wikified collaboration platform

Page 37: Ontology and the Lexiocn.week4

Homework

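• Pick one angle and compare the similarities and differences between WN and FN.

• (Students who have taken Python: please make good use of nltk to support your points with quantitative illustration and evidence.)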

Page 38: Ontology and the Lexiocn.week4

[1] Lawrence W. Barsalou, Wenchi Yeh, Barbara J. Luka, Karen L. Olseth, Kelly S. Mix, and Ling-Ling Wu. Concepts and meaning. 1993.

[2] Gerard De Melo, Collin F. Baker, Nancy Ide, Rebecca J. Passonneau, and Christiane Fellbaum. Empirical comparisons of MASC word sense annotations. In LREC, pages 3036–3043, 2012.

[3] Eduard Hovy, Roberto Navigli, and Simone Paolo Ponzetto. Collaboratively built semi-structured content and artificial intelligence: The story so far. Artificial Intelligence, 194:2–27, 2013.

[4] Sebastian Löbner. Understanding semantics. Routledge, 2013.