Talk Schedule Question Answering from Email Bryan Klimt July 28, 2005.
Talk Schedule Question Answering from Email
Bryan Klimt, July 28, 2005
Project Goals
• To build a practical, working question answering system for personal email
• To learn about the technologies that go into QA (IR, IE, NLP, MT)
• To discover which techniques work best and when
System Overview
Dataset
• 18 months of email (Sept 2003 to Feb 2005)
• 4799 emails total
• 196 are talk announcements
• hand labelled and annotated
• 478 questions and answers
A new email arrives…
• Is it a talk announcement?
• If so, we should index it.
Email Classifier
(Diagram: Email Data → Logistic Regression classifiers → Combo → Decision)
Classification Performance
• precision = 0.81
• recall = 0.66
• (previous works had better performance)
• Top features: abstract, bio, speaker, copeta, multicast, esm, donut, talk, seminar, cmtv, broadcast, speech, distinguish, ph, lectur, ieee, approach, translat, professor, award
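The precision and recall figures above follow the standard definitions. A small sketch of the arithmetic, using hypothetical confusion counts chosen only to reproduce the slide's ratios (the actual counts are not given in the deck):

```python
# Precision = TP / (TP + FP); recall = TP / (TP + FN).
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical example: if 160 emails are flagged as talks and 129 of
# them are correct, out of 196 true announcements:
p, r = precision_recall(tp=129, fp=31, fn=67)
print(round(p, 2), round(r, 2))  # 0.81 0.66
```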
Annotator
• Use Information Extraction techniques to identify certain types of data in the emails:
– speaker names and affiliations
– dates and times
– locations
– lecture series and titles
Annotator
Rule-based Annotator
• Combine regular expressions and dictionary lookups
• defSpanType date =: ...[re('^\d\d?$') ai(dayEnd)? ai(month)]...;
• matches "23rd September"
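The span rule above can be approximated with a plain regular expression. A rough Python re-implementation (my sketch, not the actual annotator; the inlined alternations stand in for the rule's dayEnd and month dictionary lookups):

```python
import re

# A 1-2 digit day number with an optional ordinal suffix,
# followed by a month name.
DAY_END = r"(?:st|nd|rd|th)"
MONTHS = (r"(?:January|February|March|April|May|June|July|August|"
          r"September|October|November|December)")
DATE_RE = re.compile(rf"\b(\d{{1,2}}){DAY_END}?\s+({MONTHS})\b",
                     re.IGNORECASE)

m = DATE_RE.search("The talk is on 23rd September in NSH.")
print(m.group(0))  # 23rd September
```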
Conditional Random Fields
• Probabilistic framework for labelling sequential data
• Known to outperform HMMs (relaxation of independence assumptions) and MEMMs (avoid “label bias” problem)
• Allow for multiple output features at each node in the sequence
Rule-based vs. CRFs
Rule-based vs. CRFs
• Both results are much higher than in previous study
• For dates, times, and locations, rules are easy to write and perform extremely well
• For names, titles, affiliations, and series, rules are very difficult to write, and CRFs are preferable
Template Filler
• Creates a database record for each talk announced in the email
• This database is used by the NLP answer extractor
Filled Template
Seminar {
  title = "Keyword Translation from English to Chinese for Multilingual QA"
  name = Frank Lin
  time = 5:30pm
  date = Thursday, Sept. 23
  location = 4513 Newell Simon Hall
  affiliation =
  series =
}
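A filled template maps naturally onto a record type. A minimal sketch (field names taken from the slide; the class itself is my assumption, not the system's actual storage format):

```python
from dataclasses import dataclass

# Record type mirroring the Seminar template; empty strings stand in
# for fields the annotator could not fill (affiliation, series above).
@dataclass
class Seminar:
    title: str = ""
    name: str = ""
    time: str = ""
    date: str = ""
    location: str = ""
    affiliation: str = ""
    series: str = ""

talk = Seminar(
    title="Keyword Translation from English to Chinese for Multilingual QA",
    name="Frank Lin",
    time="5:30pm",
    date="Thursday, Sept. 23",
    location="4513 Newell Simon Hall",
)
print(talk.location)  # 4513 Newell Simon Hall
```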
Search Time
• Now the email is indexed
• The user can ask questions
IR Answer Extractor
"Where is Frank Lin's talk?"
0.5055 3451.txt  search[468:473]: "frank"  search[2025:2030]: "frank"  search[474:477]: "lin"
0.1249 2547.txt  search[580:583]: "lin"
0.0642 2535.txt  search[2283:2286]: "lin"
• Performs a traditional IR (TF-IDF) search using the question as a query
• Determines the answer type from simple heuristics (“Where”->LOCATION)
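The question-word heuristic can be sketched as a simple prefix lookup. Only the "Where" -> LOCATION rule is stated on the slide; the other entries are plausible extensions of mine, not necessarily the actual system's:

```python
# Map a question's wh-word to the annotation type expected as the
# answer. Entries other than "where" are hypothetical extensions.
ANSWER_TYPES = {
    "where": "LOCATION",
    "who": "NAME",
    "what time": "TIME",
    "when": "DATE",
}

def answer_type(question):
    q = question.lower()
    for prefix, atype in ANSWER_TYPES.items():
        if q.startswith(prefix):
            return atype
    return "UNKNOWN"

print(answer_type("Where is Frank Lin's talk?"))  # LOCATION
```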
IR Answer Extractor
NL Question Analyzer
• Uses the Tomita Parser to fully parse questions and translate them into a structured query language
• "Where is Frank Lin's talk?"
• ((FIELD LOCATION) (FILTER (NAME "FRANK LIN")))
NL Answer Extractor
• Simply executes the structured query produced by the Question Analyzer
• ((FIELD LOCATION) (FILTER (NAME “FRANK LIN”)))
• select LOCATION from seminar_templates where NAME=“FRANK LIN”;
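Against the template database, such a query reduces to a filter plus a field projection. A sketch over plain dicts (my illustration, not the actual storage layer; the second record is invented for contrast):

```python
# Each filled template is a dict; a query is an answer field plus
# field/value filters, as produced by the question analyzer.
templates = [
    {"NAME": "FRANK LIN", "LOCATION": "4513 Newell Simon Hall",
     "TIME": "5:30pm"},
    {"NAME": "JANE DOE", "LOCATION": "Wean Hall 5409", "TIME": "noon"},
]

def execute(field, filters, records):
    """Return the requested field from every record matching all filters."""
    return [r[field] for r in records
            if all(r.get(k) == v for k, v in filters.items())]

# ((FIELD LOCATION) (FILTER (NAME "FRANK LIN")))
print(execute("LOCATION", {"NAME": "FRANK LIN"}, templates))
# ['4513 Newell Simon Hall']
```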
Results
• NL Answer Extractor -> 0.870
• IR Answer Extractor -> 0.755
(Bar chart: answer accuracy for the NL and IR answer extractors)
Results
• Both answer extractors have similar (good) performance
• IR based extractor
– easy to implement (1-2 days)
– better on questions w/ titles and names
– very bad on yes/no questions
• NLP based extractor
– more difficult to implement (4-5 days)
– better on questions w/ dates and times
Examples
• "Where is the lecture on dolphin language?"
– NLP Answer Extractor: Fails to find any talk
– IR Answer Extractor: Finds the correct talk
– Actual Title: "Natural History and Communication of Spotted Dolphin, Stenella Frontalis, in the Bahamas"
• "Who is speaking on September 10?"
– NLP Extractor: Finds the correct record(s)
– IR Extractor: Extracts the wrong answer
– A talk "10 am, November 10" ranks higher than one on "Sept 10th"
Future Work
• Add an annotation "feedback loop" for the classifier
• Add a planner module to decide which answer extractor to apply to each individual question
• Tune parameters for classifier and TF-IDF search engine
• Integrate into a mail client!
Conclusions
• Overall performance is good enough for the system to be helpful to end users
• Both rule-based and automatic annotators should be used, but for different types of annotations
• Both IR-based and NLP-based answer extractors should be used, but for different types of questions
DEMO