Text Analysis Conference
Knowledge Base Population 2013
Hoa Trang Dang
National Institute of Standards and Technology
Sponsored by:

TAC KBP Goals
• Goal: Populate a knowledge base (KB) with information about entities as found in a collection of source documents, following a specified schema for the KB
• KBP 2009-2011: Focus on augmenting an existing KB. Decompose KBP into two tasks:
  ▫ Entity-Linking: Link each given named entity mention to a node in reference KB (or create new node)
  ▫ Slot-Filling: Learn attributes about target entities from the source documents and add new information about the entity to the reference KB
• KBP 2012: Combine entity-linking and slot-filling to build a KB from scratch -> Cold Start
• KBP 2013:
  ▫ Conversational, informal data (discussion fora)
  ▫ Temporal constraints for Slot Filling (2011 pilot)
  ▫ Sentiment analysis for Slot Filling

TAC KBP 2013 Track Participants
• Track coordinators
  ▫ Hoa Dang (Slot Filler Validation)
  ▫ Jim Mayfield (Entity Linking, Cold Start KBP)
  ▫ Margaret Mitchell (Sentiment Slot Filling)
  ▫ Mihai Surdeanu (English Slot Filling and Temporal Slot Filling)
• LDC linguistic resource providers: Joe Ellis, Jeremy Getman, Justin Mott, Xuansong Li, Kira Griffitt, Stephanie M. Strassel, Jonathan Wright
• Coordinators emeritus: Ralph Grishman, Heng Ji
• Advisor: Boyan Onyshkevych
• 45 Teams
  ▫ 14 countries (21 USA, 9 China, 3 Spain, 2 Germany, …)

6 (8) TAC KBP 2013 Tracks
• Entity-Linking
  ▫ English
  ▫ Chinese
  ▫ Spanish
• Slot-Filling (English)
  ▫ Regular
  ▫ Sentiment
  ▫ Temporal
  ▫ Slot Filler Validation Task
• Cold Start (English)

Entity Linking and Slot Filling Tracks
• Goal: Augment a reference knowledge base (KB) with info about query entities (PER, ORG, GPE) as found in a diverse collection of documents
• Reference KB: Oct 2008 Wikipedia snapshot. Each KB node corresponds to a Wikipedia page and contains:
  ▫ Infobox
  ▫ Wiki_text (free text not in infobox)
• English source documents:
  ▫ 1M News docs
  ▫ 1M Web docs
  ▫ 99K Discussion Forum docs (threads)
• Chinese source documents: 2M news, 800K Web
• Spanish source documents: 900K news

Entity-Linking Evaluation Results
• English
  ▫ Participants: 26 teams
  ▫ Highest F1: 0.721 (0.730 in 2012)
  ▫ Median F1: 0.583 (0.536 in 2012)
• Chinese
  ▫ Participants: 4 teams
  ▫ Highest F1: 0.622 (0.740 in 2012)
  ▫ Median F1: 0.619 (0.617 in 2012)
• Spanish
  ▫ Participants: 3 teams
  ▫ Highest F1: 0.709 (0.641 in 2012)
  ▫ Median F1: 0.651 (0.612 in 2012)

Regular Slot Filling Evaluation Results
• Participants: 18 teams
• Human F1: 0.685 (0.814 in 2012)
• Highest System F1: 0.373 (0.517 in 2012)
• 2nd Highest System F1: 0.339 (0.296 in 2012)
• Median System F1: 0.150 (0.099 in 2012)

Sentiment Slot Filling Track
• Sentiment analysis for KBP:
  ▫ Holder (PER, ORG, GPE)
  ▫ Target (PER, ORG, GPE)
  ▫ Polarity (positive, negative)
• Implemented as regular slot filling, with different set of slots
  ▫ {per,org,gpe}:positive-towards
  ▫ {per,org,gpe}:negative-towards
  ▫ {per,org,gpe}:positive-from
  ▫ {per,org,gpe}:negative-from
• Participants: 3 teams
• Evaluation results:
  ▫ Human F1: 0.727
  ▫ Highest System F1: 0.132
  ▫ Median System F1: 0.014
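
As a rough illustration of how a (holder, target, polarity) triple might map onto the four sentiment slot families, the sketch below emits one "-towards" fill on the holder and the inverse "-from" fill on the target. The slide does not spell out this directionality, so treat it as an assumption; the `Sentiment` type, the `to_slot_fills` helper, and the example entities are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Sentiment(NamedTuple):
    holder: str
    holder_type: str  # "per", "org", or "gpe"
    target: str
    target_type: str  # "per", "org", or "gpe"
    polarity: str     # "positive" or "negative"


def to_slot_fills(s: Sentiment) -> List[Tuple[str, str, str]]:
    """Return (query_entity, slot_name, filler) triples for one sentiment.

    Assumption: "-towards" slots are filled on the holder and "-from"
    slots on the target, so each sentiment yields two inverse fills.
    """
    if s.polarity not in ("positive", "negative"):
        raise ValueError(f"unknown polarity: {s.polarity}")
    return [
        (s.holder, f"{s.holder_type}:{s.polarity}-towards", s.target),
        (s.target, f"{s.target_type}:{s.polarity}-from", s.holder),
    ]


# Made-up example entities, for illustration only.
fills = to_slot_fills(Sentiment("Alice", "per", "Acme Corp", "org", "negative"))
```

Treating the two slot directions as inverses of a single triple keeps the output in the same (entity, slot, filler) shape that regular slot filling uses.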

Temporal Slot Filling Track
• Find tightest temporal constraints [T1 T2 T3 T4] on a given relation
  ▫ Relation is true for a period beginning between T1 and T2
  ▫ Relation is true for a period ending between T3 and T4
• Participants: 5 teams
• Evaluation results:
  ▫ Human Accuracy: 0.688
  ▫ Highest System Accuracy: 0.331
  ▫ Median System Accuracy: 0.148
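
The [T1 T2 T3 T4] constraint can be pictured with a minimal sketch: the relation starts somewhere in [T1, T2] and ends somewhere in [T3, T4]. The names `TemporalConstraint` and `is_consistent`, and the use of `None` for an unknown (unbounded) endpoint, are assumptions made for illustration and are not taken from the task definition.

```python
from datetime import date
from typing import NamedTuple, Optional


class TemporalConstraint(NamedTuple):
    t1: Optional[date]  # earliest possible start
    t2: Optional[date]  # latest possible start
    t3: Optional[date]  # earliest possible end
    t4: Optional[date]  # latest possible end


def _within(value: date, low: Optional[date], high: Optional[date]) -> bool:
    """True if low <= value <= high, treating None as unbounded."""
    return (low is None or low <= value) and (high is None or value <= high)


def is_consistent(c: TemporalConstraint, start: date, end: date) -> bool:
    """Check whether a concrete [start, end] period satisfies the 4-tuple."""
    return (
        start <= end
        and _within(start, c.t1, c.t2)
        and _within(end, c.t3, c.t4)
    )


# Made-up example: started some time in 2001, ended no earlier than 2005.
constraint = TemporalConstraint(date(2001, 1, 1), date(2001, 12, 31),
                                date(2005, 1, 1), None)
```

A tighter constraint admits fewer candidate periods, which is why the task asks for the tightest bounds the documents support.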

Slot Filler Validation Track (SFV)
• Task: Determine whether or not a candidate slot filler is correct
• Objective: improve precision without excessive reduction of recall
• Participants: 5 teams
• Some SFV runs had overwhelmingly positive impact on individual SF runs!
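
The precision/recall trade-off behind the SFV objective can be sketched with the textbook F1 definition. This is a simplified illustration, not the official scorer: the candidate list, the confidence field, and the 0.35 threshold are made-up toy values, and `validate` stands in for whatever validator a team actually built.

```python
def precision_recall_f1(num_correct, num_returned, num_gold):
    """Textbook precision, recall, and F1 (harmonic mean of the two)."""
    precision = num_correct / num_returned if num_returned else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def validate(candidates, accept):
    """Keep only the candidate fillers the validator accepts."""
    return [c for c in candidates if accept(c)]


def score(candidates, num_gold):
    """Score a list of candidates against num_gold reference fillers."""
    num_correct = sum(1 for c in candidates if c["correct"])
    return precision_recall_f1(num_correct, len(candidates), num_gold)


# Toy slot filling output: invented values, not evaluation data.
candidates = [
    {"confidence": 0.9, "correct": True},
    {"confidence": 0.8, "correct": True},
    {"confidence": 0.4, "correct": True},
    {"confidence": 0.3, "correct": False},
    {"confidence": 0.2, "correct": False},
]
NUM_GOLD = 4  # assumed number of correct fillers in the reference

before = score(candidates, NUM_GOLD)
after = score(validate(candidates, lambda c: c["confidence"] >= 0.35), NUM_GOLD)
```

In this toy case the filter removes only incorrect fillers, so precision rises while recall is unchanged; a stricter threshold would start discarding correct fillers and lower recall.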

Cold Start KBP Track
• Goal: Build a KB from scratch, containing all targeted info about all entities as found in a relatively closed domain corpus of documents
• KB schema: same entity types and slots as regular slot-filling task
• Source document collection:
  ▫ 50K Web pages from small-town publications (from TREC KBA document stream)
• Required capabilities:
  ▫ Entity-linking: Grounding all named entity mentions in docs to KB nodes
  ▫ Slot-filling: Learning attributes about all named entities
• Post-submission evaluation queries traverse KB starting from a single entity node (entity mention):
  ▫ 0-hop: Find all children of Michael Jordan
  ▫ 1-hop: Find date of birth of each of the children of Michael Jordan
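
The 0-hop and 1-hop queries can be sketched as lookups over a toy KB keyed by (entity, slot). The `ToyKB` class, the entity identifiers, and the dates are all hypothetical placeholders; the slot names `per:children` and `per:date_of_birth` are assumed to follow the regular slot-filling schema.

```python
from typing import Dict, List, Tuple


class ToyKB:
    """Minimal in-memory KB: (entity, slot) -> list of fillers."""

    def __init__(self) -> None:
        self._facts: Dict[Tuple[str, str], List[str]] = {}

    def add(self, entity: str, slot: str, filler: str) -> None:
        self._facts.setdefault((entity, slot), []).append(filler)

    def zero_hop(self, entity: str, slot: str) -> List[str]:
        """0-hop: fillers of one slot on the starting entity."""
        return list(self._facts.get((entity, slot), []))

    def one_hop(self, entity: str, slot0: str, slot1: str) -> Dict[str, List[str]]:
        """1-hop: for each 0-hop filler, look up a second slot on it."""
        return {
            mid: self.zero_hop(mid, slot1)
            for mid in self.zero_hop(entity, slot0)
        }


# Placeholder entities and dates, invented for illustration only.
kb = ToyKB()
kb.add(":e_parent", "per:children", ":e_child1")
kb.add(":e_parent", "per:children", ":e_child2")
kb.add(":e_child1", "per:date_of_birth", "1990-01-01")

children = kb.zero_hop(":e_parent", "per:children")
birth_dates = kb.one_hop(":e_parent", "per:children", "per:date_of_birth")
```

A 1-hop answer can only be right if the 0-hop step found and linked the intermediate entity, so errors compound across hops.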

Cold Start Evaluation Results (Preliminary)
• Participants: 3 teams
• 0-hop queries:
  ▫ Highest F1: 0.384 (0.497 in 2012)
• 1-hop queries:
  ▫ Highest F1: 0.145 (0.255 in 2012)
• Combined 0-hop and 1-hop F1:
  ▫ Highest F1: 0.278 (~0.352 in 2012)

TAC KBP Discussion/Planning Sessions
• Monday, November 18 (2:15-3:10pm):
  ▫ English Slot Filling
  ▫ Slot Filler Validation
  ▫ Temporal Slot Filling?
  ▫ +Spanish Slot Filling?
  ▫ +Event identification and argument extraction?
• Tuesday, November 19 (3:00-4:00pm):
  ▫ Cold Start
  ▫ English Entity Linking (as queries in Cold Start framework?)
  ▫ Cross-Lingual Spanish and Chinese Entity Linking + Discussion forum