A Technical Seminar on Question Answering SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT...

A Technical Seminar on Question Answering SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT Presented By: Rohini Kamdi Guided By: Dr. A.J.Agrawal

Page 1:

A Technical Seminar on Question Answering

SHRI RAMDEOBABA COLLEGE OF ENGINEERING & MANAGEMENT

Presented By: Rohini Kamdi

Guided By: Dr. A.J. Agrawal

Page 2:

Contents:
- Introduction
- Why Question Answering?
- The Architecture of a Generic QA System
- Issues with traditional QA Systems
- The Web Solution: AskMSR
- Current Research Work
- Conclusion

Page 3:

Introduction

Question answering (QA), in information retrieval, is the task of automatically answering a question posed in natural language (NL) using either a pre-structured database or a collection of natural language documents.

Goal: to retrieve answers to questions rather than full documents or best-matching passages, that is, to find short answers to fact-based questions.

QA = Information Retrieval + Information Extraction

Page 4:

Why Question Answering?

- Google: query-driven search. Answers to a query are documents.
- Question Answering: answer-driven search. Answers to a query are phrases.

Page 5:

The Architecture of a Generic QA System

[Figure: question -> Question Processing -> query -> Document Retrieval -> Passage Retrieval -> Answer Extraction -> answers]

Page 6:

Question Processing

Captures the semantics of the question.

Tasks:
- Determine the question type
- Determine the answer type
- Extract keywords from the question and formulate a query
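
The three tasks above can be sketched with a few rules. This is a minimal illustration, not the method of any particular system: the `ANSWER_TYPES` table, the `STOPWORDS` list, and the `process_question` helper are all assumptions introduced here.

```python
import re

# Hypothetical mapping from question word to expected answer type.
ANSWER_TYPES = {
    "who": "PERSON",
    "when": "DATE",
    "where": "LOCATION",
    "how": "QUANTITY",
}

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "was", "are", "of", "in", "on", "to",
             "did", "does", "do", "who", "when", "where", "what", "how", "which"}


def process_question(question):
    """Return (question_type, answer_type, keywords) for a factoid question."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    # Assumption: the first word of the question identifies its type.
    question_type = tokens[0].lower() if tokens else ""
    answer_type = ANSWER_TYPES.get(question_type, "UNKNOWN")
    # Keywords are the remaining non-stopword tokens, in question order.
    keywords = [t for t in tokens if t.lower() not in STOPWORDS]
    return question_type, answer_type, keywords


qtype, atype, keywords = process_question("Who shot President Abraham Lincoln?")
query = " ".join(keywords)  # "shot President Abraham Lincoln"
```

The query here is simply the keywords joined by spaces; how a real system formulates its query is not specified on the slide.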

Page 7:

Question Types

Class 1
  A (answer): single datum or list of items
  C: who, when, where, how (old, much, large)

Class 2
  A: multi-sentence
  C: extract from multiple sentences

Class 3
  A: across several texts
  C: comparative/contrastive

Class 4
  A: an analysis of retrieved information
  C: synthesized coherently from several retrieved fragments

Class 5
  A: result of reasoning
  C: world/domain knowledge and common-sense reasoning

Page 8:

Types of QA

Closed-domain QA systems: built for a very specific domain and exploit expert knowledge of it.
- Very high accuracy
- Require extensive language processing and are limited to one domain

Open-domain QA systems: can answer any question from any collection.
- Can potentially answer any question
- Very low accuracy

Page 9:

Keyword Selection

A list of keywords from the question helps in finding relevant texts.

Some systems expand the keywords with lexical/semantic alternations for better matching:
- inventor -> invent
- have been sold -> sell
- dog -> animal
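
Keyword expansion can be pictured as a lookup in an alternation table. The sketch below assumes a hand-written table seeded with the three examples above; the slide does not say how real systems derive alternations, and `ALTERNATIONS` and `expand_keywords` are hypothetical names.

```python
# Hypothetical alternation table, seeded with the three examples from the slide.
ALTERNATIONS = {
    "inventor": ["invent"],
    "have been sold": ["sell"],
    "dog": ["animal"],
}


def expand_keywords(keywords):
    """Return the keywords followed by any known alternations, without duplicates."""
    expanded = []
    for keyword in keywords:
        for term in [keyword] + ALTERNATIONS.get(keyword.lower(), []):
            if term not in expanded:
                expanded.append(term)
    return expanded


expanded = expand_keywords(["inventor", "telephone"])
# expanded is ["inventor", "invent", "telephone"]
```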

Page 10:

Passage Retrieval

Extracts passages that contain all selected keywords.

Passage quality is controlled by a loop:
- In the first iteration, use the first 6 keyword selection heuristics
- If the number of passages is below a threshold, the query is too strict: drop a keyword
- If the number of passages is above a threshold, the query is too relaxed: add a keyword
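
The drop/add loop above might look roughly like this. It is a sketch under stated assumptions: keywords are taken to be pre-ordered from most to least important, matching is plain case-insensitive substring search, and the threshold values and the `retrieve_passages` name are illustrative only.

```python
def retrieve_passages(passages, keywords, min_hits=1, max_hits=3, start=None):
    """Adjust the number of active keywords until the hit count is acceptable.

    `keywords` is assumed to be ordered from most to least important.
    Returns (active_keywords, matching_passages).
    """
    n = len(keywords) if start is None else start
    tried = set()
    while True:
        tried.add(n)
        active = keywords[:n]
        # A passage matches only if it contains every active keyword.
        hits = [p for p in passages
                if all(k.lower() in p.lower() for k in active)]
        if len(hits) < min_hits and n > 1:
            next_n = n - 1      # query too strict: drop a keyword
        elif len(hits) > max_hits and n < len(keywords):
            next_n = n + 1      # query too relaxed: add a keyword
        else:
            return active, hits
        if next_n in tried:     # avoid oscillating between two settings
            return active, hits
        n = next_n


passages = [
    "John Wilkes Booth shot President Abraham Lincoln.",
    "Abraham Lincoln was the 16th President.",
    "Lincoln is the capital of Nebraska.",
]
active, hits = retrieve_passages(passages, ["Lincoln", "Abraham", "shot", "theater"])
```

With all four keywords no passage matches, so the loop drops "theater" and returns the single passage that contains the remaining three.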

Page 11:

Answer Extraction

Pattern matching between question and the representation of the candidate answer-bearing texts

A set of candidate answers is produced

Ranking according to likelihood of correctness.
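
A toy version of this step, assuming the expected answer type is a person: a surface pattern proposes candidates from each passage, and candidates are ranked by how many times they are found. The `PERSON_PATTERN` regular expression and the frequency-based ranking are simplifying assumptions, not the matching or scoring used by any specific system.

```python
import re
from collections import Counter

# Hypothetical surface pattern for a PERSON answer: two or more capitalized words.
PERSON_PATTERN = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b")


def extract_answers(question, passages):
    """Return candidate answers as (candidate, count) pairs, most frequent first."""
    question_words = set(re.findall(r"[A-Za-z]+", question.lower()))
    counts = Counter()
    for passage in passages:
        for candidate in PERSON_PATTERN.findall(passage):
            words = candidate.lower().split()
            # Skip candidates that merely repeat the question's own words.
            if all(w in question_words for w in words):
                continue
            counts[candidate] += 1
    return counts.most_common()


ranked = extract_answers(
    "Who shot President Abraham Lincoln?",
    [
        "John Wilkes Booth shot President Abraham Lincoln.",
        "The assassin John Wilkes Booth fled on horseback.",
    ],
)
# ranked is [("John Wilkes Booth", 2)]
```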

Page 12:

Example of Answer Processing

QA System              Output
AnswerBus              Sentences
AskJeeves (ask.com)    Documents/direct answers
IONAUT                 Passages
LCC                    Sentences
Mulder                 Extracted answers
QuASM                  Document blocks
START                  Mixture
Webclopedia            Sentences

Page 13:

Issues with traditional QA Systems

- Retrieval is performed against a small set of documents
- Extensive use of linguistic resources: POS tagging, Named Entity Tagging, WordNet, etc.
- Difficult to recognize answers that do not match the question syntax

E.g. Q: Who shot President Abraham Lincoln?
A: John Wilkes Booth is perhaps America’s most infamous assassin, having fired the bullet that killed Abraham Lincoln.

Page 14:

The Web can help

- The Web is a gigantic data repository with extensive data redundancy
- Factoids are likely to be expressed in hundreds of different ways
- At least a few will match the way the question was asked

E.g. Q: Who shot President Abraham Lincoln?
A: John Wilkes Booth shot President Abraham Lincoln.

Page 15:

AskMSR: Details

Page 16:

Step 1: Rewrite queries

Intuition: The user’s question is often syntactically quite close to sentences that contain the answer.

E.g. Q: Where is the Louvre Museum located?
A: The Louvre Museum is located in Paris

- Classify the question into specific categories
- Apply category-specific transformation rules
- Determine the expected answer “data type” (e.g. Date, Person, Location, …)
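
One way to picture a transformation rule for "Where is ...?" questions is to move the verb to every possible position, so that at least one rewrite reads like a declarative sentence. The sketch below is an illustration under assumptions: the `rewrite_question` helper, the single verb-movement rule, and the weights 5 and 1 are invented here and are not taken from the slide.

```python
def rewrite_question(question, verb="is"):
    """Generate weighted declarative rewrites by moving `verb` to each position.

    Returns a list of (rewrite, weight) pairs. The weights are illustrative.
    """
    words = question.rstrip("?").split()
    # Drop the leading wh-word and the verb itself: "Where is X" -> "X".
    rest = [w for w in words[1:] if w.lower() != verb]
    rewrites = []
    for i in range(len(rest) + 1):
        phrase = " ".join(rest[:i] + [verb] + rest[i:])
        rewrites.append(('"%s"' % phrase, 5))   # exact-phrase rewrite, high weight
    rewrites.append((" ".join(rest), 1))        # bag-of-words fallback, low weight
    return rewrites


rewrites = rewrite_question("Where is the Louvre Museum located?")
```

Among the six rewrites produced for this question is the quoted phrase "the Louvre Museum is located", which matches the start of the answer sentence in the example.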

Step 2: Query search engine

- Send all rewrites to a Web search engine
- Retrieve the top N answers
- For speed, the search engine’s “snippets” are used instead of the full text of the actual documents

Page 17:

Step 3: Mining N-Grams

- Enumerate all N-grams (N = 1, 2, 3, say) in all retrieved snippets
- Use a hash table to make this efficient
- Weight of an n-gram: its occurrence count, with each occurrence weighted by the “reliability” (weight) of the rewrite that fetched the document
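
A compact sketch of this step, assuming each snippet arrives paired with the weight of the rewrite that retrieved it. A Python dict plays the role of the hash table; whitespace tokenization and the `mine_ngrams` name are simplifications introduced for illustration.

```python
from collections import defaultdict


def mine_ngrams(snippets, max_n=3):
    """Score n-grams from (snippet_text, rewrite_weight) pairs.

    Each occurrence adds the weight of the rewrite that fetched the snippet.
    """
    scores = defaultdict(int)   # dict = hash table keyed by the n-gram tuple
    for text, weight in snippets:
        tokens = text.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                scores[tuple(tokens[i:i + n])] += weight
    return scores


scores = mine_ngrams([
    ("Charles Dickens wrote", 5),      # fetched by a high-weight rewrite
    ("Dickens created Scrooge", 1),    # fetched by a low-weight rewrite
])
# scores[("Dickens",)] is 6 and scores[("Charles", "Dickens")] is 5
```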

Step 4: Filtering N-Grams

- Each question type is associated with one or more “data-type filters” (regular expressions)
- Boost the score of n-grams that match the regular expression
- Lower the score of n-grams that do not match the regular expression

[Figure: question words (When…, Where…, What…, Who…) mapped to data types (Date, Location, Person)]
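
The boost/lower rule can be sketched as a rescaling of the scores. Everything specific here is an assumption: the two regular expressions in `FILTERS` are deliberately crude stand-ins for real data-type filters, the `boost` and `penalty` factors are arbitrary, and n-grams are represented as plain strings for brevity.

```python
import re

# Hypothetical data-type filters; real filters would be far more careful.
FILTERS = {
    "when": re.compile(r"\b\d{4}\b"),                     # looks like a year
    "who": re.compile(r"^[A-Z][a-z]+( [A-Z][a-z]+)*$"),   # capitalized name
}


def filter_ngrams(question_type, scores, boost=2.0, penalty=0.5):
    """Rescale n-gram scores according to the question type's filter."""
    pattern = FILTERS.get(question_type)
    if pattern is None:
        return dict(scores)     # no filter known: leave scores unchanged
    return {
        ngram: score * (boost if pattern.search(ngram) else penalty)
        for ngram, score in scores.items()
    }


raw = {"Charles Dickens": 10, "wrote the": 10, "1843": 10}
who_scores = filter_ngrams("who", raw)
when_scores = filter_ngrams("when", raw)
```

For a "who" question the name-like n-gram is boosted and the others are lowered; for a "when" question only the year-like n-gram is boosted.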

Page 18:

Step 5: Tiling the Answers

Example: “Who created the character of Scrooge?”

N-Grams            Scores
Dickens            20
Charles Dickens    15
Mr Charles         10

Tile the highest-scoring n-gram with the n-grams it overlaps; the tiles are merged and the old n-grams are discarded:

Mr Charles Dickens    Score 45

Repeat until no more overlap.
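
The tiling example can be reproduced with a small greedy routine. This is a sketch, not the original implementation: the `tile` and `tile_answers` helpers are hypothetical, merged scores are simply summed (which matches the 20 + 15 + 10 = 45 in the example), and the loop only tries tiles that involve the current highest-scoring n-gram.

```python
def tile(a, b):
    """Merge token tuples a and b if they overlap, else return None."""
    for x, y in ((a, b), (b, a)):
        # Case 1: y occurs inside x as a contiguous run.
        for i in range(len(x) - len(y) + 1):
            if x[i:i + len(y)] == y:
                return x
        # Case 2: a suffix of x equals a prefix of y.
        for k in range(min(len(x), len(y)) - 1, 0, -1):
            if x[-k:] == y[:k]:
                return x + y[k:]
    return None


def tile_answers(scored):
    """Greedily tile {ngram: score} until the top n-gram overlaps no other."""
    if not scored:
        return {}
    items = {tuple(k.split()): v for k, v in scored.items()}
    merged_any = True
    while merged_any:
        merged_any = False
        best = max(items, key=items.get)
        for other in sorted(items, key=items.get, reverse=True):
            if other == best:
                continue
            merged = tile(best, other)
            if merged is not None:
                # Discard the two old n-grams and keep the merged tile.
                score = items.pop(best) + items.pop(other)
                items[merged] = items.get(merged, 0) + score
                merged_any = True
                break
    return {" ".join(k): v for k, v in items.items()}


result = tile_answers({"Dickens": 20, "Charles Dickens": 15, "Mr Charles": 10})
# result is {"Mr Charles Dickens": 45}
```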

Page 19:

Current Research Work

Human Question Answering Performance Using an Interactive Document Retrieval System

Document retrieval + QA system: studies the ability of users to answer the questions on their own using an interactive document retrieval system; the results are compared and evaluated against QA systems.

Towards Automatic Question Answering over Social Media by Learning Question Equivalence Patterns

Collaborative Question Answering (CQA) systems: these work on an existing archive in which users answer each other's questions. Many questions that will be asked have already been asked and answered, so equivalence patterns are generated for groups of questions that have syntactic similarities.

Page 20:

An Automatic Answering System with Template Matching for Natural Language Questions

Closed-domain system: template matching is applied to provide a service for cell phones by SMS; Frequently Asked Questions (FAQ) are used as sample data.

Pipeline: Preprocessing -> Question Template Matching -> Answering
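
The three stages can be illustrated with a toy FAQ matcher. The templates, the placeholder answers, and the `preprocess` and `answer` helpers are all hypothetical; the slide does not describe how the cited system represents its templates.

```python
import re

# Hypothetical FAQ templates: (question pattern, placeholder answer template).
TEMPLATES = [
    (re.compile(r"what is the price of (?P<item>.+)"), "Price list entry for {item}"),
    (re.compile(r"how do i (?P<action>.+)"), "Instructions to {action}"),
]


def preprocess(question):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", question.lower())
    return " ".join(cleaned.split())


def answer(question):
    """Return the answer for the first matching template, or None."""
    text = preprocess(question)
    for pattern, reply in TEMPLATES:
        match = pattern.fullmatch(text)
        if match:
            return reply.format(**match.groupdict())
    return None


reply = answer("How do I  reset my PIN?")
# reply is "Instructions to reset my pin"
```

Questions that match no template return None, which a real service would presumably turn into a fallback message.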

Page 21:

Conclusion

- Question Answering requires more complex NLP techniques than other forms of Information Retrieval.
- There is a strong possibility that complex automatic QA systems can replace simple web search systems. However, automatic QA is still a non-trivial research field: Document Retrieval and Information Retrieval are large areas with many different approaches, not all of which are fully developed.

Page 22:

References

- Eric Brill et al., An Analysis of the AskMSR Question Answering System, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Philadelphia, July 2002, pp. 257-264.
- Univ.-Doz. Dr.techn. Christian Gutl, New Trends in Automatic Question Answering.

Page 23:

Thank You..