A Technical Seminar on
Question Answering
SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT
Presented By: Rohini Kamdi
Guided By: Dr. A.J. Agrawal
Contents:
  Introduction
  Why Question Answering?
  The Architecture of a Generic QA System
  Issues with traditional QA Systems
  The Web Solution: AskMSR
  Current Research Work
  Conclusion
Introduction
Question answering (QA), in information retrieval, is the task of automatically answering a question posed in natural language (NL) using either a pre-structured database or a collection of natural language documents.
Goal: to retrieve answers to questions rather than full documents or best-matching passages, and in particular to find short answers to fact-based questions.
QA = Information Retrieval + Information Extraction
Why Question Answering?
Google: query-driven search. Answers to a query are documents.
Question Answering: answer-driven search. Answers to a query are phrases.
The Architecture of a Generic QA System
[Figure: pipeline diagram. question -> Question Processing -> query -> Document Retrieval -> Passage Retrieval -> Answer Extraction -> answers]
Question Processing
Captures the semantics of the question.
Tasks:
  Determine the question type
  Determine the answer type
  Extract keywords from the question and formulate a query
Question Types
Class 1  A: single datum or list of items
         C: who, when, where, how (old, much, large)
Class 2  A: multi-sentence
         C: extract from multiple sentences
Class 3  A: across several texts
         C: comparative/contrastive
Class 4  A: an analysis of retrieved information
         C: synthesized coherently from several retrieved fragments
Class 5  A: result of reasoning
         C: world/domain knowledge and common sense reasoning
Types of QA
Closed-domain QA systems: built for a very specific domain and exploit expert knowledge of it.
  Very high accuracy
  Require extensive language processing and are limited to one domain
Open-domain QA systems: can potentially answer any question from any collection.
  Very low accuracy
Keyword Selection
A list of keywords from the question helps in finding relevant texts.
Some systems expand the keywords with lexical/semantic alternations for better matching:
  inventor -> invent
  have been sold -> sell
  dog -> animal
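Keyword selection and expansion can be sketched roughly as below. This is only an illustration: the stopword list, the alternation table and the function names are assumptions made for the example, and the multi-word alternation "have been sold -> sell" is simplified to a single-token entry.

```python
import re

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"who", "what", "when", "where", "how", "is", "was",
             "the", "a", "an", "of", "in", "did"}

# Hypothetical alternation table, loosely following the slide's examples.
ALTERNATIONS = {
    "inventor": ["invent"],
    "sold": ["sell"],
    "dog": ["animal"],
}

def select_keywords(question):
    """Return the non-stopword tokens of the question, in question order."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOPWORDS]

def expand_keywords(keywords):
    """Map each keyword to the set of word forms a passage may match."""
    return {k: {k, *ALTERNATIONS.get(k, [])} for k in keywords}

keywords = select_keywords("Who is the inventor of the telephone?")
expanded = expand_keywords(keywords)
```

A real system would draw the alternations from a morphological analyser or a resource such as WordNet rather than a hand-written table.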
Passage Retrieval
Extracts passages that contain all selected keywords.
Passage quality is controlled by a loop:
  In the first iteration, use the first 6 keyword selection heuristics
  If the number of passages < a threshold, the query is too strict: drop a keyword
  If the number of passages > a threshold, the query is too relaxed: add a keyword
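The retrieval loop can be sketched as follows. The sketch assumes the keywords are already ranked by importance (standing in for the keyword selection heuristics, which the slides do not spell out), and the thresholds `low` and `high` and all function names are hypothetical.

```python
import re

def contains_all(passage, keywords):
    """True if every keyword occurs as a token of the passage."""
    tokens = set(re.findall(r"[a-z0-9]+", passage.lower()))
    return all(k in tokens for k in keywords)

def retrieve_passages(passages, ranked_keywords, start, low, high):
    """Adjust the number of active keywords until the hit count fits [low, high].

    ranked_keywords is ordered from most to least important; the first n
    are used for matching. Returns (active_keywords, matching_passages).
    """
    n = max(1, min(start, len(ranked_keywords)))
    tried = set()
    while True:
        tried.add(n)
        active = ranked_keywords[:n]
        hits = [p for p in passages if contains_all(p, active)]
        if len(hits) < low:
            next_n = n - 1   # query too strict: drop a keyword
        elif len(hits) > high:
            next_n = n + 1   # query too relaxed: add a keyword
        else:
            return active, hits
        # Stop when no keyword can be dropped or added, or when the loop
        # would revisit a setting it has already tried (oscillation).
        if next_n < 1 or next_n > len(ranked_keywords) or next_n in tried:
            return active, hits
        n = next_n

passages = [
    "lincoln was shot by booth in 1865",
    "booth was an actor",
    "lincoln was president",
]
active, hits = retrieve_passages(passages, ["lincoln", "shot", "theatre"],
                                 start=3, low=1, high=2)
```

Here the three-keyword query matches nothing, so the least important keyword is dropped and the remaining two keywords retrieve a single passage.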
Answer Extraction
Pattern matching between question and the representation of the candidate answer-bearing texts
A set of candidate answers is produced
Ranking according to likelihood of correctness.
QA System              Output
AnswerBus              Sentences
AskJeeves (ask.com)    Documents/direct answers
IONAUT                 Passages
LCC                    Sentences
Mulder                 Extracted answers
QuASM                  Document blocks
START                  Mixture
Webclopedia            Sentences
Example of Answer Processing
Issues with traditional QA Systems
Retrieval is performed against a small set of documents.
Extensive use of linguistic resources: POS tagging, Named Entity Tagging, WordNet, etc.
Difficult to recognize answers that do not match the question syntax.
E.g. Q: Who shot President Abraham Lincoln?
     A: John Wilkes Booth is perhaps America’s most infamous assassin, having fired the bullet that killed Abraham Lincoln.
The Web can help
The Web: a gigantic data repository with extensive data redundancy.
Factoids are likely to be expressed in hundreds of different ways.
At least a few will match the way the question was asked.
E.g. Q: Who shot President Abraham Lincoln?
     A: John Wilkes Booth shot President Abraham Lincoln.
AskMSR: Details

Step 1: Rewrite queries
Intuition: the user’s question is often syntactically quite close to sentences that contain the answer.
E.g. Q: Where is the Louvre Museum located?
     A: The Louvre Museum is located in Paris.
Classify the question into specific categories, each with:
  Category-specific transformation rules
  Expected answer “data type” (e.g. Date, Person, Location, …)
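A single category-specific transformation rule might look like the sketch below, which handles only questions of the form "Where is ...?". The function name, the weights 5 and 1, and the `AND` fallback syntax are assumptions for illustration, not details taken from the slides.

```python
def rewrite_where_is(question):
    """Generate (query, weight) rewrites for a 'Where is X ...?' question.

    The verb "is" is moved to every position after the first remaining word,
    so that at least one rewrite resembles a declarative answer sentence.
    """
    words = question.rstrip("?").split()
    if len(words) < 3 or [w.lower() for w in words[:2]] != ["where", "is"]:
        return []
    rest = words[2:]
    rewrites = []
    for i in range(1, len(rest) + 1):
        phrase = " ".join(rest[:i] + ["is"] + rest[i:])
        # Quoted phrase queries are treated as more reliable (hypothetical weight).
        rewrites.append(('"' + phrase + '"', 5))
    # Low-weight fallback: a plain conjunction of the remaining words.
    rewrites.append((" AND ".join(rest), 1))
    return rewrites

rewrites = rewrite_where_is("Where is the Louvre Museum located?")
```

Most of the generated phrases are ungrammatical and will simply match nothing; the rule relies on the search engine to sort that out rather than on a parser to pick the right position for the verb.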
Step 2: Query search engine
Send all rewrites to a Web search engine.
Retrieve the top N answers.
For speed, the search engine’s “snippets” are used instead of the full text of the actual document.
Step 3: Mining N-Grams
Enumerate all N-grams (N = 1, 2, 3, say) in all retrieved snippets.
Use a hash table to make this efficient.
Weight of an n-gram: its occurrence count, with each occurrence weighted by the “reliability” (weight) of the rewrite that fetched the document.
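The mining step can be sketched with a `collections.Counter` playing the role of the hash table. The function name and the `(text, weight)` input format are assumptions; each occurrence of an n-gram adds the weight of the rewrite that retrieved its snippet.

```python
import re
from collections import Counter

def mine_ngrams(snippets, max_n=3):
    """Score every n-gram (n = 1..max_n) found in the snippets.

    snippets is an iterable of (text, rewrite_weight) pairs.
    """
    scores = Counter()
    for text, weight in snippets:
        tokens = re.findall(r"\w+", text.lower())
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                scores[" ".join(tokens[i:i + n])] += weight
    return scores

scores = mine_ngrams([("Charles Dickens wrote it", 5), ("Dickens", 1)])
```

In this toy input "dickens" collects weight from both snippets, while "charles dickens" is credited only by the first.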
Step 4: Filtering N-Grams
Each question type is associated with one or more “data-type filters” (regular expressions).
[Figure: question words (When…, Where…, What…, Who…) mapped to data-type filters (Date, Location, Person)]
Boost the score of n-grams that match the regular expression.
Lower the score of n-grams that do not match the regular expression.
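A data-type filter can be sketched as a regular expression keyed by the question word. The two patterns below (a four-digit year for "when", capitalised words for "who"), the boost and penalty factors, and the function name are all assumptions for illustration; the slides do not give the actual expressions.

```python
import re

# Hypothetical data-type filters, keyed by the leading question word.
DATA_TYPE_FILTERS = {
    "when": re.compile(r"\b\d{4}\b"),                      # Date: a 4-digit year
    "who": re.compile(r"^[A-Z][a-z]+(?: [A-Z][a-z]+)*$"),  # Person: capitalised words
}

def filter_ngrams(question, scores, boost=2.0, penalty=0.5):
    """Rescale n-gram scores according to the question's data-type filter."""
    words = question.split()
    qword = words[0].lower() if words else ""
    pattern = DATA_TYPE_FILTERS.get(qword)
    if pattern is None:
        return dict(scores)   # no filter known for this question type
    return {
        gram: score * (boost if pattern.search(gram) else penalty)
        for gram, score in scores.items()
    }

rescored = filter_ngrams("When was Lincoln shot?", {"1865": 4, "ford theatre": 4})
```

Note that the "who" pattern depends on capitalisation, so it only works on n-grams whose original casing has been preserved.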
Step 5: Tiling the Answers
Example: “Who created the character of Scrooge?”
N-grams and scores:
  Dickens            20
  Charles Dickens    15
  Mr Charles         10
Tile the highest-scoring n-gram with the n-grams that overlap it; the tiles are merged and the old n-grams discarded:
  Mr Charles Dickens    Score 45
Repeat until there is no more overlap.
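The tiling step can be sketched as a greedy merge of overlapping n-grams, summing their scores as in the Scrooge example. The function names and the exact merge order are assumptions; the slides only show the end result.

```python
def tile(a, b):
    """Merge token lists a and b if b lies inside a, or if a suffix of a
    equals a prefix of b. Returns the merged list, or None if no overlap."""
    for i in range(len(a) - len(b) + 1):
        if a[i:i + len(b)] == b:
            return a
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return a + b[k:]
    return None

def tile_answers(scored):
    """Repeatedly merge overlapping n-grams, highest-scoring first.

    scored maps n-gram strings to scores. Returns a list of
    (answer, score) pairs sorted by decreasing score.
    """
    items = [(gram.split(), score) for gram, score in scored.items()]
    merged = True
    while merged:
        merged = False
        items.sort(key=lambda item: -item[1])
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                (a, score_a), (b, score_b) = items[i], items[j]
                tiled = tile(a, b) or tile(b, a)
                if tiled is not None:
                    # Discard the two old n-grams and keep the merged tile.
                    items = [it for k, it in enumerate(items) if k not in (i, j)]
                    items.append((tiled, score_a + score_b))
                    merged = True
                    break
            if merged:
                break
    items.sort(key=lambda item: -item[1])
    return [(" ".join(tokens), score) for tokens, score in items]

answers = tile_answers({"Dickens": 20, "Charles Dickens": 15, "Mr Charles": 10})
```

On the example scores, "Dickens" is first absorbed into "Charles Dickens" (35), which then tiles with "Mr Charles" to give "Mr Charles Dickens" with score 45.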
Current Research Work
Human Question Answering Performance Using an Interactive Document Retrieval System
Document retrieval + QA system: studies how well users can answer questions on their own using an interactive document retrieval system; the results are compared with those of QA systems.

Towards Automatic Question Answering over Social Media by Learning Question Equivalence Patterns
Collaborative Question Answering (CQA) systems: work on an existing archive in which users answer each other's questions. Many questions that will be asked have already been asked and answered, so equivalence patterns are generated for groups of questions with syntactic similarities.

An Automatic Answering System with Template Matching for Natural Language Questions
Closed-domain system: template matching is applied to provide a service for cell phones by SMS; Frequently Asked Questions (FAQ) are used as sample data.
Stages: Preprocessing -> Question Template Matching -> Answering
Conclusion
Question Answering requires more complex NLP techniques than other forms of Information Retrieval.
Complex automatic QA systems could well replace simple web search systems, but automatic QA remains a non-trivial research field: Document and Information Retrieval are huge areas with many different approaches, not all of which are fully developed.
References
An Analysis of the AskMSR Question Answering System, Eric Brill et al., Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Philadelphia, July 2002, pp. 257-264
New Trends in Automatic Question Answering, Univ.-Doz. Dr.techn. Christian Gutl
Thank You..