Linguistic Resources for the 2013 TAC KBP Entity Linking Evaluation
Joe Ellis (presenter), Justin Mott, Xuansong Li, Jeremy Getman, Jonathan
Wright, Stephanie Strassel
Linguistic Data Consortium, University of Pennsylvania, USA
2013 Source Corpus
Language | Genre | Documents
English | Newswire | 1,000,257
English | Web Text | 999,999
English | Discussion Forums | 99,063
Chinese | Newswire | 2,000,256
Chinese | Web Text | 815,886
Chinese | Discussion Forums | 199,321
Spanish | Newswire | 910,734
TAC KBP Evaluation Workshop – NIST, November 18-19, 2013
Entity Linking Overview
Stage 1: Select namestrings and reference documents
Stage 2: Link namestrings to KB or mark as NIL
Stage 3: Co-reference NIL entities (e.g. "Wendy", "Wendy Gaxiola")
Entity Linking – Stage 1
Run named entity taggers over source corpora
  Provides guided search through the corpus
  Thanks KBP coordinators!
Namestring Selection
  Confusable
  Ambiguous
  Varied
Entity Linking – Stage 1: Namestring Selection
Ratios
  NIL & non-NIL
  Entity types
  Genre
Measurable confusability
  Multiple-entity namestrings ("Smith")
  Multiple-namestring entities ("Barack Obama", "Bam-Bam", "Bammy")
  NIL singletons
  Cross-lingual
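The confusability measures listed above can be illustrated with a small sketch. This is not LDC's tooling: the query list, the entity IDs, and the convention that NIL entities carry IDs starting with "NIL" are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical annotated queries as (namestring, entity_id) pairs.
# IDs are invented; an ID starting with "NIL" stands for an entity with no KB node.
queries = [
    ("Smith", "E1"),
    ("Smith", "E2"),
    ("Barack Obama", "E3"),
    ("Bam-Bam", "E3"),
    ("Bammy", "E3"),
    ("Wendy Gaxiola", "NIL1"),
]

def confusability(queries):
    """Summarize how ambiguous and varied a query set is."""
    entities_by_name = defaultdict(set)
    names_by_entity = defaultdict(set)
    mentions_by_entity = defaultdict(int)
    for name, entity in queries:
        entities_by_name[name].add(entity)
        names_by_entity[entity].add(name)
        mentions_by_entity[entity] += 1

    return {
        # Namestrings that refer to more than one entity ("Smith").
        "ambiguous_namestrings": sorted(
            n for n, es in entities_by_name.items() if len(es) > 1
        ),
        # Entities referred to by more than one namestring.
        "varied_entities": sorted(
            e for e, ns in names_by_entity.items() if len(ns) > 1
        ),
        # NIL entities mentioned by exactly one query.
        "nil_singletons": sorted(
            e for e, c in mentions_by_entity.items()
            if e.startswith("NIL") and c == 1
        ),
        # Share of queries whose entity is NIL.
        "nil_ratio": sum(1 for _, e in queries if e.startswith("NIL")) / len(queries),
    }

summary = confusability(queries)
```

Counts like these would let a query set be checked against target ratios before release; the actual thresholds used for the evaluation are not stated in the slides.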
Entity Linking – Stages 2 & 3: KB Linking and NIL Coref
KB Linking
  Review reference document and search KB for matching node
  Multiple namestrings viewed together for quicker linking
NIL Coreference
  NIL queries (no KB match) require manual co-reference annotation
  Time-limited quality control pass to enhance completeness and accuracy
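A minimal sketch of how the outcomes of Stages 2 and 3 might be recorded, assuming the annotators' decisions are already available as plain dictionaries. The function name, the query IDs, the KB node ID, and the "NIL0001"-style cluster labels are hypothetical and chosen only for illustration; the linking and co-reference decisions themselves are manual.

```python
def assign_links(query_ids, kb_decisions, nil_groups):
    """Map each query to a KB node or to a NIL cluster ID.

    query_ids:    ordered list of query IDs
    kb_decisions: query ID -> KB node ID chosen by an annotator (absent if no match)
    nil_groups:   query ID -> annotator's group label for co-referent NIL queries
    """
    links = {}
    cluster_ids = {}
    for qid in query_ids:
        node = kb_decisions.get(qid)
        if node is not None:
            # Stage 2: the annotator found a matching KB node.
            links[qid] = node
            continue
        # Stage 3: NIL query. Queries sharing a group label share a cluster;
        # an ungrouped NIL query forms its own singleton cluster.
        label = nil_groups.get(qid, qid)
        if label not in cluster_ids:
            cluster_ids[label] = f"NIL{len(cluster_ids) + 1:04d}"
        links[qid] = cluster_ids[label]
    return links

links = assign_links(
    ["q1", "q2", "q3", "q4"],
    kb_decisions={"q1": "KB_node_17"},
    nil_groups={"q2": "wendy", "q3": "wendy"},
)
```

Here `q2` and `q3` (say, "Wendy" and "Wendy Gaxiola") end up in the same NIL cluster, while `q4` becomes a NIL singleton.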
Delivered 2013 Resources
Corpus Title | Type | LDC Catalog | Language | Size
TAC 2013 KBP English Entity Linking Evaluation Queries and Knowledge Base Links | Evaluation | LDC2013E90 | English | 803 GPE, 686 PER, 701 ORG
TAC 2013 KBP Chinese Entity Linking Evaluation Queries and Knowledge Base Links | Evaluation | LDC2013E96 | Chinese, English | 714 GPE, 706 PER, 735 ORG
TAC 2013 KBP Spanish Entity Linking Evaluation Queries and Knowledge Base Links | Evaluation | LDC2013E97 | Spanish, English | 660 GPE, 695 PER, 762 ORG
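The per-type sizes above can be totalled with a few lines of arithmetic. The sketch assumes the Size column counts queries per entity type, which the slides do not state explicitly; the variable names are illustrative.

```python
# Sizes copied from the Delivered 2013 Resources table.
sizes = {
    "English": {"GPE": 803, "PER": 686, "ORG": 701},
    "Chinese": {"GPE": 714, "PER": 706, "ORG": 735},
    "Spanish": {"GPE": 660, "PER": 695, "ORG": 762},
}

# Total per evaluation set.
per_language = {lang: sum(counts.values()) for lang, counts in sizes.items()}

# Total per entity type across the three sets.
per_type = {
    etype: sum(counts[etype] for counts in sizes.values())
    for etype in ("GPE", "PER", "ORG")
}

grand_total = sum(per_language.values())

print(per_language)  # {'English': 2190, 'Chinese': 2155, 'Spanish': 2117}
print(per_type)      # {'GPE': 2177, 'PER': 2087, 'ORG': 2198}
print(grand_total)   # 6462
```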