Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
-
Upload
subhabrata-mukherjee -
Category
Data & Analytics
-
view
219 -
download
0
Transcript of Domain Cartridge: Unsupervised Framework for Shallow Domain Ontology Construction from Corpus
Domain Cartridge: UnsupervisedFramework for Shallow Domain Ontology
Construction from Corpus
Subhabrata MukherjeeJitendra Ajmera, Sachindra Joshi
Max Planck Institute for InformaticsIBM India Research Lab
CIKM 2014
October 9, 2014
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Term Discovery
Usefulness for Parsing. Consider the examples:
I “install sprint navigation”I “problem with touch screen of the mobile”
ESG Parser generates noisy or incomplete parse without thesmartphone domain knowledge that “sprint navigation, touchscreen” are multi-word concepts
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Term Discovery
Usefulness for Parsing. Consider the examples:
I “install sprint navigation”I “problem with touch screen of the mobile”
ESG Parser generates noisy or incomplete parse without thesmartphone domain knowledge that “sprint navigation, touchscreen” are multi-word concepts
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
I Interactive dialogue systemsI For user query “battery of my device depletes fast", the
knowledge ‘battery’ is a Feature-Of ‘device’ enables the systemto clarify about the type of device
I Query expansionI E.g. Consider Synonyms along with the original query, ‘battery’
is a Feature-Of ‘phone’ as well as ‘tablet’ ‘device’
I Query re-formulationI For user query “screen freezes E5150", the knowledge ‘E5150’
is a Type-Of ‘Error’ results in query re-formulation “screenfreezes error E5150"
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
I Interactive dialogue systemsI For user query “battery of my device depletes fast", the
knowledge ‘battery’ is a Feature-Of ‘device’ enables the systemto clarify about the type of device
I Query expansionI E.g. Consider Synonyms along with the original query, ‘battery’
is a Feature-Of ‘phone’ as well as ‘tablet’ ‘device’
I Query re-formulationI For user query “screen freezes E5150", the knowledge ‘E5150’
is a Type-Of ‘Error’ results in query re-formulation “screenfreezes error E5150"
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Motivation: Domain Relation Discovery
I Interactive dialogue systemsI For user query “battery of my device depletes fast", the
knowledge ‘battery’ is a Feature-Of ‘device’ enables the systemto clarify about the type of device
I Query expansionI E.g. Consider Synonyms along with the original query, ‘battery’
is a Feature-Of ‘phone’ as well as ‘tablet’ ‘device’
I Query re-formulationI For user query “screen freezes E5150", the knowledge ‘E5150’
is a Type-Of ‘Error’ results in query re-formulation “screenfreezes error E5150"
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
I Typically for a domain, a lot of knowledge articles, manuals,tutorials etc. are available in a variety of formats
I Most of these documents have less hyperlink and table(info-box as in Wikipedia) information
I Challenge is to learn a shallow ontology from raw unannotatedplain text
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
I Typically for a domain, a lot of knowledge articles, manuals,tutorials etc. are available in a variety of formats
I Most of these documents have less hyperlink and table(info-box as in Wikipedia) information
I Challenge is to learn a shallow ontology from raw unannotatedplain text
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Unsupervised Framework
I Typically for a domain, a lot of knowledge articles, manuals,tutorials etc. are available in a variety of formats
I Most of these documents have less hyperlink and table(info-box as in Wikipedia) information
I Challenge is to learn a shallow ontology from raw unannotatedplain text
Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operatingsystem
samsung
sim
card
SamsungGalaxyvictory
Samsungarray
Domainterm
Domainprocess
Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operatingsystem
samsung
sim
card
SamsungGalaxyvictory
Samsungarray
Domainterm
Domainprocess
Synonyms
Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operatingsystem
samsung
sim
card
SamsungGalaxyvictory
Samsungarray
Domainterm
Domainprocess
Feature-Of
Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operatingsystem
samsung
sim
card
SamsungGalaxyvictory
Samsungarray
Domainterm
Domainprocess
Type-Of
Domain Cartridge as a Graph
install insert
device
handset
android
blackberry
Operatingsystem
samsung
sim
card
SamsungGalaxyvictory
Samsungarray
Domainterm
Domainprocess
Action-On
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
I Unsupervised framework for shallow domain ontologyconstruction:
I Domain Term Discovery (DTD)I Improvement of Parser performance by DTDI Domain Relation Discovery (DRD)
I Use-Case: Improvement of an in-house Question-Answeringsystem
I Experiments: Manual Evaluation, Comparison with BabelNet,WordNet, Yago
I Conclusions
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
I Unsupervised framework for shallow domain ontologyconstruction:
I Domain Term Discovery (DTD)I Improvement of Parser performance by DTDI Domain Relation Discovery (DRD)
I Use-Case: Improvement of an in-house Question-Answeringsystem
I Experiments: Manual Evaluation, Comparison with BabelNet,WordNet, Yago
I Conclusions
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
I Unsupervised framework for shallow domain ontologyconstruction:
I Domain Term Discovery (DTD)I Improvement of Parser performance by DTDI Domain Relation Discovery (DRD)
I Use-Case: Improvement of an in-house Question-Answeringsystem
I Experiments: Manual Evaluation, Comparison with BabelNet,WordNet, Yago
I Conclusions
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Roadmap
I Unsupervised framework for shallow domain ontologyconstruction:
I Domain Term Discovery (DTD)I Improvement of Parser performance by DTDI Domain Relation Discovery (DRD)
I Use-Case: Improvement of an in-house Question-Answeringsystem
I Experiments: Manual Evaluation, Comparison with BabelNet,WordNet, Yago
I Conclusions
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Corpus: Knowledge articles, manuals, tutorials etc.
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parsing
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parsing
English Slot Grammar (ESG) parser is used
50 - 100 times faster than the Charniak parser
Shallow semantic relationship (SSR) annotation done over ESGparser output to generate normalized parser relations
E.g., “Samsung has a battery" and “Samsung’s battery died" bothgenerate the same relation ‘nnMod:samsung_battery’
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Lucene Index
For efficient retrieval of relations, documents, positional (offset)information, handling proximity based queries or answer scorers
E.g: (Doc. Freq. of) All relations containing “install” or“rel:svo:quickbook_update_file”
TokenStream modified to produce contiguous representation ofall types of concepts (ngrams, relations, words etc.)
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term DiscoveryESG parser maintains a domain term lexicon of multi-wordconcepts. E.g. “touch screen, sprint navigation”
Noun Phrase Chunking done on document titles to extractfrequently occuring concepts as domain wordsI Enrich lexicon and bootstrap parserI Parser generates refined output
High precision but low recall — as titles are precise, clean butshort
To extract more fine-grained domain terms HITS is used onparser output
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term DiscoveryESG parser maintains a domain term lexicon of multi-wordconcepts. E.g. “touch screen, sprint navigation”
Noun Phrase Chunking done on document titles to extractfrequently occuring concepts as domain wordsI Enrich lexicon and bootstrap parserI Parser generates refined output
High precision but low recall — as titles are precise, clean butshort
To extract more fine-grained domain terms HITS is used onparser output
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term DiscoveryESG parser maintains a domain term lexicon of multi-wordconcepts. E.g. “touch screen, sprint navigation”
Noun Phrase Chunking done on document titles to extractfrequently occuring concepts as domain wordsI Enrich lexicon and bootstrap parserI Parser generates refined output
High precision but low recall — as titles are precise, clean butshort
To extract more fine-grained domain terms HITS is used onparser output
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
I Any Shallow Semantic Relation (SSR) from ESG parser can bevisualized as a hub generating domain terms
I Any domain term can be visualized as an authority influencedby incoming features from hubs
I Link strength between the hub and authority is taken as thenumber of mutual participating SSR in the Lucene Index
I Good authorities are incorporated in Parser Domain TermLexicon
I Parser is re-run, refined relations generated and previous stepsiterated till convergence
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
I Any Shallow Semantic Relation (SSR) from ESG parser can bevisualized as a hub generating domain terms
I Any domain term can be visualized as an authority influencedby incoming features from hubs
I Link strength between the hub and authority is taken as thenumber of mutual participating SSR in the Lucene Index
I Good authorities are incorporated in Parser Domain TermLexicon
I Parser is re-run, refined relations generated and previous stepsiterated till convergence
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
I Any Shallow Semantic Relation (SSR) from ESG parser can bevisualized as a hub generating domain terms
I Any domain term can be visualized as an authority influencedby incoming features from hubs
I Link strength between the hub and authority is taken as thenumber of mutual participating SSR in the Lucene Index
I Good authorities are incorporated in Parser Domain TermLexicon
I Parser is re-run, refined relations generated and previous stepsiterated till convergence
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
I Any Shallow Semantic Relation (SSR) from ESG parser can bevisualized as a hub generating domain terms
I Any domain term can be visualized as an authority influencedby incoming features from hubs
I Link strength between the hub and authority is taken as thenumber of mutual participating SSR in the Lucene Index
I Good authorities are incorporated in Parser Domain TermLexicon
I Parser is re-run, refined relations generated and previous stepsiterated till convergence
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
HITS
I Any Shallow Semantic Relation (SSR) from ESG parser can bevisualized as a hub generating domain terms
I Any domain term can be visualized as an authority influencedby incoming features from hubs
I Link strength between the hub and authority is taken as thenumber of mutual participating SSR in the Lucene Index
I Good authorities are incorporated in Parser Domain TermLexicon
I Parser is re-run, refined relations generated and previous stepsiterated till convergence
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feedback
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Parser Performance Improvement
Number of incomplete parses went down by 73% afterincorporating domain terms in the parser lexicon
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Terms
samsung blackberry software-version application htc-evo wi-fi memory-card bluetooth motorola kyocera browser voicemailmicrosoft-exchange lg-optimus samsung-m400 samsung-galaxy-victory software-updates samsung-array text-messaging wallpa-per synchronization iphone touch-screen blackberry-bold
Table: Snapshot of multi-word domain terms extracted by NP Chunking.
wi-fi optimus-g set-up novatel-wireless e-mail sierra-wirelessapple-id google-maps play-music mobile-network 10-digitinternet-explorer slacker-radio caller-id google-search address-book my-computer software-update blackberry-id as-well-aswindows-update terms-of-service drop-down pro-700 add-onscp-2700 mac-os device-manager voice-mail non-camera
Table: Snapshot of multi-word domain terms extracted by HITS (notfound by NP Chunking).
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index (RI)
Random Indexing is used for computing word-relatedness anddimensionality reduction
RI considers “term X term” co-occurrence, as opposed to“term X document” matrix — allowing for incremental learning ofcontext information, scaling up with the corpus size
We adopt the notion of Relational Distributional Similarity. Twoterms are considered similar if they appear in a similar contextparticipating in similar Shallow Semantic Relations
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index (RI)
Random Indexing is used for computing word-relatedness anddimensionality reduction
RI considers “term X term” co-occurrence, as opposed to“term X document” matrix — allowing for incremental learning ofcontext information, scaling up with the corpus size
We adopt the notion of Relational Distributional Similarity. Twoterms are considered similar if they appear in a similar contextparticipating in similar Shallow Semantic Relations
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Random Index
Instead of taking the raw neighborhood of N terms, we use onlythose neighboring terms that share a syntactic dependency(given by ESG parser) with the target term
Lucene Index is queried for term and relation co-occurrencestatistics while constructing high dimensional random indexvector for each term
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym DiscoveryRandom Index is used to get top N similar terms for a given termbased on Relational Distributional Similarity
HITS is used to get the dominant domain terms and domain(SSR) relations
Sim(wi ,wj) =
∑p Ili=lj ,ki=kj (fwki
,p, fwkj,p′)∑
p∑
r Ili=lr ,ki=kr (fwki,p, fwkr ,p′)
Numerator counts the number of times any word appears in bothfeature vectors with similar dominant SSR relations
Denominator counts the number of times it appears in the featurevector of any other word with that SSR relation
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym DiscoveryRandom Index is used to get top N similar terms for a given termbased on Relational Distributional Similarity
HITS is used to get the dominant domain terms and domain(SSR) relations
Sim(wi ,wj) =
∑p Ili=lj ,ki=kj (fwki
,p, fwkj,p′)∑
p∑
r Ili=lr ,ki=kr (fwki,p, fwkr ,p′)
Numerator counts the number of times any word appears in bothfeature vectors with similar dominant SSR relations
Denominator counts the number of times it appears in the featurevector of any other word with that SSR relation
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Synonym Discovery
High recall but low precision. E.g. ‘folder’ and ‘tab’ have a highrelational distributional similarity due to overlapping contextwords like “open, close, minimize, shortcut, multiple"
ESG parser attributes like noun, animated, physical object,device are used to increase precision
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Cartridge: Framework
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Action-On Discovery
Action-On represents any activity or ‘method’ on a domain term
E.g. “rel:svo:tap_add_account, rel:svo:phone_access_internet,rel:svo:mobile_sync_phone, rel:svo:account_use_phone etc."
HITS index is traversed to extract all the dominant dm (action onentities), and verb-object from svo (subject-verb-object) SSR
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Action-On Discovery
Action-On represents any activity or ‘method’ on a domain term
E.g. “rel:svo:tap_add_account, rel:svo:phone_access_internet,rel:svo:mobile_sync_phone, rel:svo:account_use_phone etc."
HITS index is traversed to extract all the dominant dm (action onentities), and verb-object from svo (subject-verb-object) SSR
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Type-Of DiscoveryType-Of relations depict Is-A hierarchy i.e. a parent-child relation
E.g. “rel:svo:devices_include_HTC, rel:npo:applications_such-as_WhatsApp, rel:npo:features_like_call,rel:npo:contact_such-as_address".
svo like “include” and npo “like, such-as, as” etc. are mostinformative
Dominant domain terms from HITS, bearing Hearst patterns like“or”, “especially”
E.g. service_or_process, prev_or_next,messages_especially_sms_and_ mms
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Type-Of DiscoveryType-Of relations depict Is-A hierarchy i.e. a parent-child relation
E.g. “rel:svo:devices_include_HTC, rel:npo:applications_such-as_WhatsApp, rel:npo:features_like_call,rel:npo:contact_such-as_address".
svo like “include” and npo “like, such-as, as” etc. are mostinformative
Dominant domain terms from HITS, bearing Hearst patterns like“or”, “especially”
E.g. service_or_process, prev_or_next,messages_especially_sms_and_ mms
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feature-Of Discovery
Feature-Of relations depict components or functionalities of adomain term
nnMod depicting noun-noun modifications, and subject-object(so) extracted from svo SSR used to discover Feature-Of fromHITS index. E.g.
“rel:nnMod:network_life, rel:nnMod:account_settings,rel:nnMod:iPhone_battery etc."
“rel:svo:motorola-photon-4g_run_device-software, rel:svo:router_decrease_signal-strength,rel:svo:find-a-store_open_locator etc."
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Feature-Of Discovery
Feature-Of relations depict components or functionalities of adomain term
nnMod depicting noun-noun modifications, and subject-object(so) extracted from svo SSR used to discover Feature-Of fromHITS index. E.g.
“rel:nnMod:network_life, rel:nnMod:account_settings,rel:nnMod:iPhone_battery etc."
“rel:svo:motorola-photon-4g_run_device-software, rel:svo:router_decrease_signal-strength,rel:svo:find-a-store_open_locator etc."
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Smartphone Ontology Snapshot
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Evaluation
5000 articles, tutorials and manuals from the smartphone domain
We used the Back-of-the-Book Index (BOI) of manuals, to createground truth for domain term discovery
Baselines:I WordNet (G. A. Miller. Wordnet: A lexical database for english. COMMUNICATIONS OF THE ACM, 38,
1995.)
I BabelNet (R. Navigli and S. P. Ponzetto. BabelNet: Building a very large multilingual semantic network. ACL
’10.)
I Yago (F. M. Suchanek, G. Kasneci, and G. Weikum. Yago: a core of semantic knowledge. WWW ’07.)
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Term Evaluation
Method RecallWordNet 22.62%NP Chunking on Titles 32.45%HITS 40.87%Yago 43.77%BabelNet 53.74%
Table: Domain term evaluation.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Recall of a Question-Answering System
Recall@N With DomainTerm Lexicon
Without domainterm lexicon
recall@1 0.40 0.33recall@2 0.49 0.45
Table: Performance of a QA system with and without domain termlexicon.
Incorporation of domain terms in parser lexicon improves QAsystem performance
1D. Gondek et al. A framework for merging and ranking of answers inDeepQA. IBM Journal of Research and Development, 56(3), 2012.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Domain Relation Evaluation
2000 word pairs (500 for each of four categories) are manuallyannotated by two annotators
High Kappa score of 0.92 for Synonyms and 0.70 for Type-Of dueto the large number of word pairs, 84% and 62%, the annotatorsagree are not Synonyms and Type-Of respectively
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ComparisonI Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNetI Yago uses features like “wikipedia:redirect” links to detect
synonymous contextsI WordNet “Hyponyms, “Hypernyms” are similar to Type-Of, and
“Meronyms, Holonyms” are subset of Feature-OfI BabelNet, WordNet, Yago do not have any category like
Action-OnI Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-OnBabelNet, WordNet 19.27% - -Yago 25.12% - -Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ComparisonI Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNetI Yago uses features like “wikipedia:redirect” links to detect
synonymous contextsI WordNet “Hyponyms, “Hypernyms” are similar to Type-Of, and
“Meronyms, Holonyms” are subset of Feature-OfI BabelNet, WordNet, Yago do not have any category like
Action-OnI Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-OnBabelNet, WordNet 19.27% - -Yago 25.12% - -Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ComparisonI Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNetI Yago uses features like “wikipedia:redirect” links to detect
synonymous contextsI WordNet “Hyponyms, “Hypernyms” are similar to Type-Of, and
“Meronyms, Holonyms” are subset of Feature-OfI BabelNet, WordNet, Yago do not have any category like
Action-OnI Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-OnBabelNet, WordNet 19.27% - -Yago 25.12% - -Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ComparisonI Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNetI Yago uses features like “wikipedia:redirect” links to detect
synonymous contextsI WordNet “Hyponyms, “Hypernyms” are similar to Type-Of, and
“Meronyms, Holonyms” are subset of Feature-OfI BabelNet, WordNet, Yago do not have any category like
Action-OnI Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-OnBabelNet, WordNet 19.27% - -Yago 25.12% - -Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ComparisonI Relations in BabelNet come either from (unlabeled) Wikipedia
hyperlinks or WordNetI Yago uses features like “wikipedia:redirect” links to detect
synonymous contextsI WordNet “Hyponyms, “Hypernyms” are similar to Type-Of, and
“Meronyms, Holonyms” are subset of Feature-OfI BabelNet, WordNet, Yago do not have any category like
Action-OnI Yago has Type-Of categories derived from WordNet hyponymy
and hypernymy, and Wikipedia categories
System Type-Of Feature-Of Action-OnBabelNet, WordNet 19.27% - -Yago 25.12% - -Domain Cartridge 77% 85.7% 68%
Table: Recall comparison of systems for 3 relations.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Distributional Similarity Comparison
System Precision Recall F-ScoreYago 38% 32% 34.37%BabelNet, WordNet 83% 31% 45.14%Domain Cartridge (DC) 58% 41% 47.60%DC + WordNet 62% 40% 49.00%DC + ESG Parser Features 65% 39% 49.14%
Table: Precision-Recall comparison of Domain Cartridge(random-indexing, HITS and sim. eqn.) with other systems.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
Comparison with Distributional SimilarityMeasures in WordNet
WordNet F-ScoreLCH 0.22RES 0.31JCN 0.42PATH 0.42LIN 0.43WUP 0.43LESK 0.45Domain Cartridge 0.49
Table: F-Score comparison of WordNet similarity measures withDomain Cartridge.
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ConclusionsI Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
I Multi-words form an important component of Domain TermDiscovery
I Incorporation of domain terms in parser lexicon results in 73%reduction in incomplete parses, improving performance of anin-house QA system by upto 7%
I Synonym discovery approach, using Relational DistributionalSimilarity, RI, HITS etc., performs better than other existingapproaches
I Future Work: Measure customization time required for domainadaptation of the QA system
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ConclusionsI Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
I Multi-words form an important component of Domain TermDiscovery
I Incorporation of domain terms in parser lexicon results in 73%reduction in incomplete parses, improving performance of anin-house QA system by upto 7%
I Synonym discovery approach, using Relational DistributionalSimilarity, RI, HITS etc., performs better than other existingapproaches
I Future Work: Measure customization time required for domainadaptation of the QA system
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ConclusionsI Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
I Multi-words form an important component of Domain TermDiscovery
I Incorporation of domain terms in parser lexicon results in 73%reduction in incomplete parses, improving performance of anin-house QA system by upto 7%
I Synonym discovery approach, using Relational DistributionalSimilarity, RI, HITS etc., performs better than other existingapproaches
I Future Work: Measure customization time required for domainadaptation of the QA system
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ConclusionsI Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
I Multi-words form an important component of Domain TermDiscovery
I Incorporation of domain terms in parser lexicon results in 73%reduction in incomplete parses, improving performance of anin-house QA system by upto 7%
I Synonym discovery approach, using Relational DistributionalSimilarity, RI, HITS etc., performs better than other existingapproaches
I Future Work: Measure customization time required for domainadaptation of the QA system
Domain Adaptation for IE and IR Domain Term Discovery Domain Relation Discovery Experiments
ConclusionsI Unsupervised framework for shallow domain ontology
construction, without using manually annotated resources
I Multi-words form an important component of Domain TermDiscovery
I Incorporation of domain terms in parser lexicon results in 73%reduction in incomplete parses, improving performance of anin-house QA system by upto 7%
I Synonym discovery approach, using Relational DistributionalSimilarity, RI, HITS etc., performs better than other existingapproaches
I Future Work: Measure customization time required for domainadaptation of the QA system