Concept Hierarchy Induction
Transcript of Concept Hierarchy Induction
By Philipp Cimiano
Presented by Joseph Park
CONCEPT HIERARCHY INDUCTION
CONCEPT HIERARCHIES
• Structure information into categories
• Provide a level of generalization
• Form the backbone of any ontology
COMMON APPROACHES
• Machine readable dictionaries
• Lexico-syntactic patterns
• Distributional similarity
• Co-occurrence analysis
MACHINE READABLE DICTIONARIES
• Exploit the regularity of dictionary definitions
• Find a hypernym for the defined word
• Head of the first NP (genus or kernel term)
• spring: "the season between winter and summer and in which leaves and flowers appear"
• hornbeam: "a type of tree with a hard wood, sometimes used in hedges"
• launch: "a large usu. motor-driven boat used for carrying people on rivers, lakes, harbors, etc."
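The genus heuristic above can be sketched in a few lines. This is a toy rendering under stated assumptions: the determiner, "transparent noun", and boundary word lists are illustrative inventions, and a real system would use a POS tagger or parser rather than word lists.

```python
# A minimal sketch of the genus-term heuristic for machine-readable
# dictionaries: strip leading determiners, descend through "transparent"
# nouns such as "type of" / "kind of", and return the head of the first
# NP, approximated as the last word before a preposition or relative
# clause. The word lists are illustrative assumptions.

DETERMINERS = {"a", "an", "the"}
TRANSPARENT = {"type", "kind", "sort"}      # as in "a type of tree"
BOUNDARY = {"of", "with", "between", "in", "used", "which", "that", "for"}

def genus_term(gloss):
    """Guess the hypernym (genus term) of a word from its gloss."""
    words = gloss.lower().replace(",", "").split()
    i = 0
    while i < len(words) and words[i] in DETERMINERS:
        i += 1
    # "a type of X" -> descend to X
    while i + 1 < len(words) and words[i] in TRANSPARENT and words[i + 1] == "of":
        i += 2
        while i < len(words) and words[i] in DETERMINERS:
            i += 1
    head = None
    for w in words[i:]:
        if w in BOUNDARY:
            break
        head = w            # keep the last content word before the boundary
    return head

genus_term("the season between winter and summer")   # -> "season"
genus_term("a type of tree with a hard wood")        # -> "tree"
```

On the slide's three glosses this yields season, tree, and boat; it breaks on glosses like "a piece of furniture", which is exactly why real extractors parse the definition instead.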
LEXICO-SYNTACTIC PATTERNS
• Hearst patterns
• Hearst1: NP such as {NP,}* {(and | or)} NP
• Hearst2: such NP as {NP,}* {(and | or)} NP
• Hearst3: NP {, NP}* {,} or other NP
• Hearst4: NP {, NP}* {,} and other NP
• Hearst5: NP including {NP,}* NP {(and | or)} NP
• Hearst6: NP especially {NP,}* {(and | or)} NP
• They should occur frequently and in many text genres
• They should accurately indicate the relation of interest
• They should be recognizable with little or no pre-encoded knowledge
EXAMPLE OF USING HEARST PATTERN
• 'Such injuries as bruises, wounds and broken bones...'
• hyponym(bruise, injury)
• hyponym(wound, injury)
• hyponym(broken bone, injury)
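A Hearst-pattern extractor of this kind can be approximated with a regular expression. The sketch below handles only pattern 2 ("such NP as ...") and approximates an NP as one or two lowercase words, which is a deliberate simplification: real extractors run over chunked or parsed text.

```python
import re

# A toy extractor for Hearst pattern 2: "such NP as NP {, NP}* {(and|or)} NP".
# W is a word that is not a conjunction; NP is one or two such words.
W = r"(?!(?:and|or)\b)[a-z]+"
NP = rf"{W}(?: {W})?"

HEARST2 = re.compile(
    rf"such (?P<hyper>{NP}) as (?P<hypos>{NP}(?:, {NP})*(?:,? (?:and|or) {NP})?)"
)

def extract_hyponyms(text):
    """Return (hyponym, hypernym) pairs matched by Hearst pattern 2."""
    pairs = []
    for m in HEARST2.finditer(text.lower()):
        hyper = m.group("hyper")
        # split the coordinated NP list on commas and conjunctions
        for hypo in re.split(r",? (?:and|or) |, ", m.group("hypos")):
            pairs.append((hypo, hyper))
    return pairs

extract_hyponyms("Such injuries as bruises, wounds and broken bones...")
# -> [('bruises', 'injuries'), ('wounds', 'injuries'), ('broken bones', 'injuries')]
```

Note this returns the plural surface forms ('bruises', 'injuries'); lemmatizing them to the slide's hyponym(bruise, injury) form is a separate normalization step.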
DISTRIBUTIONAL SIMILARITY
• Distributional hypothesis: words are similar to the extent that they share the same contexts
• "You shall know a word by the company it keeps" – Firth
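The hypothesis can be made concrete with context-count vectors and cosine similarity. This is a minimal sketch under assumptions: a ±2-token window (an arbitrary choice) and a one-sentence made-up corpus.

```python
from collections import Counter
from math import sqrt

# Represent each word by counts of the words near it, then compare words
# by the cosine of their context vectors.

def context_vectors(tokens, window=2):
    vecs = {}
    for i, w in enumerate(tokens):
        ctx = vecs.setdefault(w, Counter())
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                ctx[tokens[j]] += 1
    return vecs

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

tokens = "the cat chased the mouse and the dog chased the cat".split()
vecs = context_vectors(tokens)
# 'cat' and 'dog' keep similar company (both near 'the' and 'chased'),
# so they come out more similar to each other than either is to 'chased'
sim = cosine(vecs["cat"], vecs["dog"])
```

In a real system the counts would come from a large corpus and typically be reweighted (e.g. with PMI) before comparison.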
EXAMPLE
CO-OCCURRENCE ANALYSIS
• Collocation
• Document-based subsumption: a term t1 is more specific than a term t2 if t2 also appears in all the documents in which t1 appears
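Document-based subsumption is easy to state in code. The sketch below uses made-up documents, and adds a coverage threshold parameter: requiring t2 in *all* of t1's documents is brittle on real corpora, so implementations commonly relax it (the 1.0 default and the threshold mechanism here are illustrative assumptions).

```python
# t2 subsumes t1 (t1 is more specific) when t2 occurs in at least a
# `threshold` fraction of the documents in which t1 occurs.

docs = [
    {"animal", "dog", "leash"},
    {"animal", "dog", "bark"},
    {"animal", "cat"},
]

def subsumes(t_general, t_specific, docs, threshold=1.0):
    """True if t_general covers enough of t_specific's documents."""
    with_specific = [d for d in docs if t_specific in d]
    if not with_specific:
        return False
    covered = sum(1 for d in with_specific if t_general in d)
    return covered / len(with_specific) >= threshold

subsumes("animal", "dog", docs)  # True: 'animal' is in every 'dog' document
subsumes("dog", "animal", docs)  # False: document 3 has 'animal' but no 'dog'
```

The asymmetry of the test is what turns plain co-occurrence into a direction for the hierarchy edge.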
THREE MORE APPROACHES
• Formal Concept Analysis (FCA)
• Guided Clustering
• Learning from heterogeneous sources of evidence
FORMAL CONCEPT ANALYSIS
• Set-theoretical approach
• Parse the corpus and extract dependencies:
• Verb-PP-complement
• Verb-object
• Verb-subject
• Extract surface dependencies (section 4.1.4)
PSEUDOCODE
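The original pseudocode is shown as a slide image; as a stand-in, here is a toy rendering of the FCA idea in Python. The formal context below is made up (objects are nouns, attributes the verb dependencies they occur with, e.g. "car" is "driveable" because it appears as the object of "drive"), and concepts are enumerated naively by closing every subset of objects; real FCA tools use much faster algorithms.

```python
from itertools import combinations

context = {
    "car":       {"driveable", "rentable"},
    "bike":      {"driveable", "rentable", "rideable"},
    "excursion": {"bookable", "rentable"},
}

def intent(objects):
    """Attributes shared by every object in the set."""
    return set.intersection(*[context[o] for o in objects])

def extent(attrs):
    """Objects that have every attribute in the set."""
    return {o for o, a in context.items() if attrs <= a}

# A formal concept is a closed (extent, intent) pair.
objs = list(context)
all_attrs = set().union(*context.values())
concepts = set()
for r in range(len(objs) + 1):
    for combo in combinations(objs, r):
        i = intent(combo) if combo else all_attrs
        concepts.add((frozenset(extent(i)), frozenset(i)))
```

The resulting lattice orders concepts by extent inclusion: ({car, bike}, {driveable, rentable}) sits below ({car, bike, excursion}, {rentable}), and that ordering is read off as the concept hierarchy.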
EXAMPLE
RESULTS
GUIDED CLUSTERING
• Uses hypernyms from WordNet and Hearst patterns
EXAMPLE
RESULTS
MORE RESULTS
HETEROGENEOUS SOURCES OF EVIDENCE
• Naïve threshold classifier
• Uses Hearst patterns for corpus patterns
• Uses the Google API for web patterns
• Uses Hearst patterns over downloaded pages
• Uses WordNet senses
• Uses the 'head'-heuristic (r-match)
• Uses corpus-based subsumption
• Uses document-based subsumption
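The combination step of such a classifier can be sketched as a weighted vote: each evidence source scores the candidate is-a pair in [0, 1], and the pair is accepted if the weighted sum clears a threshold. The feature names mirror the list above, but the weights and the 0.5 threshold are illustrative assumptions, not the values from the paper.

```python
# Naive threshold classifier over heterogeneous evidence sources.
WEIGHTS = {
    "hearst_corpus":        0.20,  # Hearst patterns in the corpus
    "hearst_web":           0.20,  # web patterns / downloaded pages
    "wordnet":              0.25,  # hypernym sense found in WordNet
    "head_match":           0.15,  # 'head'-heuristic (r-match)
    "corpus_subsumption":   0.10,
    "document_subsumption": 0.10,
}

def classify(evidence, threshold=0.5):
    """evidence: dict feature -> score in [0, 1]. Returns (is_a, score)."""
    score = sum(WEIGHTS[f] * evidence.get(f, 0.0) for f in WEIGHTS)
    return score >= threshold, score

decision, score = classify({
    "hearst_corpus": 1.0,
    "wordnet": 1.0,
    "head_match": 1.0,
})
# 0.20 + 0.25 + 0.15 = 0.60 >= 0.5, so the pair is accepted
```

The appeal of the scheme is that weak, noisy sources (subsumption scores) can tip a borderline decision without ever deciding it alone.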
RESULTS
MORE RESULTS