June 11-13, 2002 AQUAINT 6-Month Workshop 1

HITIQA: High-Quality Interactive Question Answering

6-Month Review

University at Albany, SUNY
Rutgers University


HITIQA Team

• SUNY Albany:
– Prof. Tomek Strzalkowski, PI/PM
– Prof. Rong Tang
– Prof. Boris Yamrom, consultant
– Ms. Sharon Small, Research Scientist
– Mr. Ting Liu, Graduate Student
– Mr. Tom Palen, summer intern
– Mr. Peter LaMonica, summer intern/AFRL

• Rutgers:
– Prof. Paul Kantor, co-PI
– Prof. K.B. Ng
– Mr. Robert Rittman, Graduate Student
– Ms. Ying Sun, Graduate Student
– Mr. Vladimir Menkov, Consultant/Programmer
– Mr. Peng Song, summer student


HITIQA System

Question: What recent disasters occurred in tunnels used for transportation?

Possible Category Axes Seen: Vehicle type (auto, train); Losses/Cost; location; other

USER PROFILE; TASK CONTEXT

QUESTION NL PROCESSING

Clarification Dialogue:
S: Are you interested in train accidents, automobile accidents or others?
U: Any that involved lost life or a major disruption in communication. Must identify losses.

Semantics: What the question “means”:
• to the system
• to the user

SEMANTIC PROC

FUSE & SUMMARIZE

Answer & Justification

ANSWER GENER.

SEARCH & CATEGORIZE

KB

TEMPLATE SELECTION

Focused Information Need

QUALITY ASSESSMENT


Key Research Issues

• Question Semantics – how the system “understands” user requests

• Human-Computer Dialogue – how the user and the system negotiate this understanding

• Information Quality Metrics – how some information is better than other

• Information Fusion – how to assemble the answer that fits user needs.


Extracting Question Semantics

What are the laws dealing with the quality and processing of food or drugs?

[Diagram labels]
• Question terms: food, drug; law, quality (legend: ground terms, un/assigned attributes)
• Answer Cluster: food, drug; food labeling, recall, FDA; life threaten
• Alternative Cluster: drug, drug trafficking; enforcement, FBI


Possible Answer Clusters

• Each cluster represents:
– Complementary answer pieces
• With attribute grounding
• Specific concept/variable instantiation
– Alternative interpretations of analyst’s question

• Within a cluster:
– Instances of some concepts/variables
– Complementary descriptions
– Redundancy increases confidence


Inducing answer “frames”

• Collect references to the topic of interest
– high-precision query
– cluster to separate nuggets from noise
– extract verb patterns, n-grams

• Form a ‘naïve’ rule to find more examples
– low recall is expected
– high precision is desired
– identify ‘signature’ features – initial rules

• Bootstrap the rules to find new signature patterns in new examples
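
The bootstrapping loop on this slide can be sketched roughly as follows. This is an illustration only, not the HITIQA implementation: the toy sentences, the seed pattern, the stopword list, and the helper names (`bigrams`, `bootstrap`) are all hypothetical, and "signature" features are approximated by word bigrams that recur in sentences the current rules already match.

```python
from collections import Counter

# Hypothetical stopword list; bigrams made only of these are ignored.
STOPWORDS = {"a", "an", "the", "in", "out", "near", "on", "of"}

def bigrams(sentence):
    """Return the set of word bigrams, skipping stopword-only pairs."""
    words = sentence.lower().split()
    return {
        f"{w1} {w2}"
        for w1, w2 in zip(words, words[1:])
        if not (w1 in STOPWORDS and w2 in STOPWORDS)
    }

def bootstrap(sentences, seed_patterns, rounds=2, min_support=2):
    """Grow a pattern set from a high-precision seed."""
    patterns = set(seed_patterns)
    for _ in range(rounds):
        # Sentences matched by the current rules (low recall is expected).
        matched = [s for s in sentences if bigrams(s) & patterns]
        # Bigrams that recur across matched sentences become new patterns.
        counts = Counter(b for s in matched for b in bigrams(s))
        new = {b for b, c in counts.items() if c >= min_support} - patterns
        if not new:
            break
        patterns |= new
    return patterns

sentences = [
    "fire broke out in the tunnel",
    "a train fire broke out near the station",
    "an explosion broke out in the tunnel",
    "stocks rose on friday",
]
found = bootstrap(sentences, seed_patterns={"fire broke"})
```

In this toy run the seed first picks up "broke out", which then pulls in the explosion sentence and with it "the tunnel"; the unrelated sentence is never matched.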


Data-Driven Interaction

What does the question mean to the user?
– The speech act
– The focus
– User’s task/intention/goal
– User’s background knowledge

What does the question mean to the system?
– Available information
– Information that can be retrieved
– The dimensions of the retrieved information

Shared Understanding
– Semantic gaps drive the dialogue:
• to negotiate between user’s meaning and system’s meaning
• to fill the gaps in the expected answer
• to resolve ambiguities in the data
• to reduce dimensionality of the answer space


Dialogue Motivators

• Dialogue arises from:
– System’s need to clarify before proceeding
– Analyst’s need to clarify to keep system on target

• What is returned from database:
– Alternative interpretations: need to select
• differentiate candidate answers from others
– Off-target interpretations: need to re-target
• reformulate the question
– Partial answers
• follow through linked questions

• A dialogue is unique to each analyst-data pair
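
One way to read the three cases above is as a small decision rule over what the search returns. The sketch below is an assumption-laden illustration: the cluster representation (a `relevance` score and a `complete` flag), the threshold, and the function name `next_dialogue_move` are hypothetical and do not come from the slides.

```python
def next_dialogue_move(clusters, on_target_threshold=0.5):
    """Pick a dialogue move from retrieved clusters.

    clusters: list of dicts with 'relevance' (0..1) and 'complete' (bool).
    Both fields and the threshold are illustrative assumptions.
    """
    on_target = [c for c in clusters if c["relevance"] >= on_target_threshold]
    if not on_target:
        # Off-target interpretations: need to re-target.
        return "reformulate the question"
    if len(on_target) > 1:
        # Alternative interpretations: need to select.
        return "ask the analyst to select an interpretation"
    if not on_target[0]["complete"]:
        # Partial answer.
        return "follow through linked questions"
    return "present the answer"

move_alternatives = next_dialogue_move(
    [{"relevance": 0.8, "complete": True}, {"relevance": 0.7, "complete": True}]
)
move_off_target = next_dialogue_move([{"relevance": 0.2, "complete": False}])
move_partial = next_dialogue_move([{"relevance": 0.9, "complete": False}])
```

Because the move depends on what the data returns for a given analyst's question, the same rule yields a different dialogue for each analyst-data pair.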


What kind of Dialogue?

Good afternoon, how can help?

Hello I wanted to notify you of a change of address

Yes certainly, can I take your name?

Yeh, its Miss Danielle Lansley

And your old post code please?

SS6 9GD

Oh I have a different post code to that

What for my old address?

Errr well for the address that’s on file

Oh what address have you got?

What address are you at now?

I’m at 1A Willop Call

Oh we’ve actually got that address

Oh right ohhhh


Information Quality

Quality Criteria

• CONTENT
– Accuracy and Objectivity
– Completeness; uniqueness
– Importance; Verifiability

• AUTHORITY
– Reliability; credibility

• PRESENTATION
– Clarity and Un-ambiguity
– Style and Gravitas
– Orientation and Level
– Readability and Usability

• TIMELINESS
– Recency
– Currency

Measurable Quality Indicators

• IN/OUT-DEGREE MEASURE
– Number of cites or links to/from
– Credibility of these cites/links

• DOCUMENT SIZE

• STYLISTIC FEATURES
– Typical sentence length
– Use of pronouns, punctuation

• LINGUISTIC FEATURES
– Sentence forms, verbs
– References to names, amounts

• STRUCTURAL FEATURES
– Organization of sections
– Use of section titles, etc.

• COLLECTION FEATURES
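
A few of the measurable indicators above (document size and the stylistic features) can be computed directly from raw text. The sketch below is a simplified illustration: the pronoun list, the regex-based sentence splitting, and the function name `quality_indicators` are assumptions, not the project's actual feature extractor.

```python
import re
import string

# Hypothetical, deliberately small pronoun list.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}

def quality_indicators(text):
    """Compute simple size and style indicators for one document."""
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    punct = sum(1 for ch in text if ch in string.punctuation)
    n_words = len(words)
    return {
        "document_size": n_words,
        "mean_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "pronoun_rate": sum(w in PRONOUNS for w in words) / n_words if n_words else 0.0,
        "punctuation_per_word": punct / n_words if n_words else 0.0,
    }

features = quality_indicators("The FDA recalled the drug. It was unsafe, they said.")
```

Indicators like these are cheap to compute for every document, which is what makes them candidates for predicting the judged quality criteria.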


Information Quality Assessments

[Flow diagram; box labels: Initial Framework; Focus Group Studies; System for Judgment Experiments; Pretest; Judgment Experiments; Quality Metrics]


Focus Group Studies

• Identify quality aspects salient to analysts’ work

• Participants: Journalists, editors (newspaper, TV & Radio); Faculty of Journalism & Communication

• Design: 90 min discussion and/or task oriented

• Sessions completed: March 8 (Times Union Albany); April 9 (SUNY Albany)

• Future Sessions: Rutgers, NBC News


Quality experiments design

• Developed simple, practical GUI
– Supports document manipulation (TREC, Web)
– Supports quality assessment
– Supports gathering material for answer

• Subjects perform sessions
– Gather material on a given topic
• E.g., Laws governing food and drug production
– Assess the quality of each retained text

• In phase 2, actually compose the answer


Quality Assessment GUI


Quality Annotation status

• Expert Sessions (phase I)
– 10 experts, 10 documents each
– Approx. 2 hours per session

• Student sessions (phase I)
– 40 students, 1000 documents
– Train & test on expert judged material
– Compute student-expert differential
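
The slides do not say how the student-expert differential is computed. One plausible reading, sketched here under that assumption, is the mean signed and mean absolute difference between a student's ratings and the expert ratings on the documents both judged. The rating scale, document ids, and function name `differential` are hypothetical.

```python
from statistics import mean

def differential(student, expert):
    """Compare one student's ratings with expert ratings on shared documents.

    student, expert: dicts mapping document id -> numeric quality rating.
    """
    shared = sorted(set(student) & set(expert))
    if not shared:
        raise ValueError("no documents in common")
    diffs = [student[d] - expert[d] for d in shared]
    return {
        "bias": mean(diffs),                          # signed: over/under-rating
        "mean_abs_diff": mean(abs(x) for x in diffs)  # size of disagreement
    }

expert = {"doc1": 8, "doc2": 3, "doc3": 6}
student = {"doc1": 7, "doc2": 5, "doc3": 6, "doc4": 9}
result = differential(student, expert)
```

Documents judged by only one side (here `doc4`) are ignored, so the measure applies to the expert-judged training material only.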


Info fusion: what and how?

Evidence about relevance of a document
– statistical information
– analyst judgments
– links to other documents
– internal linguistic evidence
– named entity evidence

Evidence about “confidence” of a document
– source validity evidence
– grammatical evidence
– linguistic evidence

Search techniques
– discrete - different methods
– linear and Support Vector models for specific methods
– non-linear optimization for “elliptical models”
– Boolean rule learning

Other training data available
– TREC filtering track


Evidence Fusion Experiments

Off-line experimentation
– scores of items on multiple scales are known
– value judgments by evaluators (during design) and users (during adaptive usage) are known
– search the space of fusion formulas to find the version that produces best results for training data. Parameter variation.

On-line experimentation
– users drag related points together
– system works in background to find which features must have been salient to make these similar
– user can do a “what if” update of the display
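
The off-line search over fusion formulas can be illustrated with a small grid search. Everything specific here is an assumption made for the sketch: the formula family (a weighted sum of per-scale scores), the objective (how often the fused score orders a pair of items the same way as the value judgments), and the toy data.

```python
from itertools import combinations, product

def fused(weights, scores):
    """Linear fusion formula: weighted sum of an item's scores."""
    return sum(w * s for w, s in zip(weights, scores))

def pairwise_agreement(weights, items):
    """Fraction of differently judged pairs that the fused score orders correctly."""
    agree = total = 0
    for (si, ji), (sj, jj) in combinations(items, 2):
        if ji == jj:
            continue  # equal judgments carry no ordering information
        total += 1
        diff = fused(weights, si) - fused(weights, sj)
        if diff * (ji - jj) > 0:  # ties in fused score count as disagreement
            agree += 1
    return agree / total if total else 0.0

def grid_search(items, steps=5):
    """Try every weight vector on a regular grid and keep the best one."""
    n_scales = len(items[0][0])
    grid = [k / (steps - 1) for k in range(steps)]
    return max(product(grid, repeat=n_scales),
               key=lambda w: pairwise_agreement(w, items))

# Toy training data: (scores on two scales, value judgment).
items = [
    ((0.9, 0.1), 1),
    ((0.2, 0.8), 0),
    ((0.7, 0.3), 1),
    ((0.4, 0.9), 0),
]
best = grid_search(items)
```

In this toy set the judgments follow the first scale, so the search settles on weights that favor it. Varying `steps` or swapping the formula family corresponds to the "parameter variation" mentioned above.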


Lemur as a Tool – batch fusion

• Linux box – datafusion.rutgers.edu

• 11 Ranking algorithms

• (Rel, x_1, …, x_11)

• Pattern-finding, Machine Learning

• Can we predict the relevance from x?

• Rel=f(x)

• Use f(x) as a ranking score
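
The idea of learning Rel = f(x) and ranking by f(x) can be sketched with a small logistic model trained by gradient descent. This is a stand-in chosen for brevity, not the learner used in the project; the toy rows use 3 scores instead of 11, and the function names and data are hypothetical.

```python
import math

def predict(w, x):
    """f(x): estimated probability of relevance; w[0] is the bias."""
    z = w[0] + sum(wk * xk for wk, xk in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(rows, epochs=500, lr=0.5):
    """rows: list of (rel, [x_1, ..., x_n]) with rel in {0, 1}."""
    n = len(rows[0][1])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        for rel, x in rows:
            err = rel - predict(w, x)
            w[0] += lr * err
            for k in range(n):
                w[k + 1] += lr * err * x[k]
    return w

# Toy training rows (Rel, x_1, x_2, x_3): scores from three rankers.
rows = [
    (1, [0.9, 0.8, 0.7]),
    (1, [0.8, 0.6, 0.9]),
    (0, [0.2, 0.3, 0.1]),
    (0, [0.1, 0.4, 0.2]),
]
w = train_logistic(rows)

# Use f(x) as a ranking score for unseen documents.
unseen = {"doc_a": [0.85, 0.7, 0.8], "doc_b": [0.15, 0.35, 0.15]}
ranking = sorted(unseen, key=lambda d: predict(w, unseen[d]), reverse=True)
```

The fused score replaces the individual ranker scores, so the final ordering reflects whatever combination of rankers best predicted relevance on the training rows.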


Evidence Fusion Displays

ABSTRACT FUSION SPACE
[Plot: axes Attribute_1 and Attribute_2, each running from Low to High; regions labeled Focused Fusion (Elliptic) and Fuzzy Logic (AND)]

DISPLAY SPACE
[Plot: axes labeled Recent, Relevant, Representative]

• By observing many users the system learns:
– patterns of fusion which are most effective
– choices that produce a useful display
– preferences of individual users and user types

The user can select which characteristic is displayed
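
The two region labels in the abstract fusion space suggest two ways of combining a pair of attribute scores. The sketch below shows one common reading of each, offered as an assumption since the slides give no formulas: fuzzy AND as the minimum of the two scores, and "elliptic" focused fusion as a score that falls off with elliptical distance from a focus point. The focus point, radii, and function names are hypothetical.

```python
import math

def fuzzy_and(a1, a2):
    """Fuzzy-logic AND of two scores in [0, 1]: the weaker one dominates."""
    return min(a1, a2)

def elliptic_score(a1, a2, center=(1.0, 1.0), radii=(0.5, 0.25)):
    """Score 1.0 at the focus point, dropping to 0.0 on the ellipse boundary."""
    d = math.sqrt(((a1 - center[0]) / radii[0]) ** 2 +
                  ((a2 - center[1]) / radii[1]) ** 2)
    return max(0.0, 1.0 - d)

and_score = fuzzy_and(0.9, 0.4)
on_focus = elliptic_score(1.0, 1.0)
off_axis_1 = elliptic_score(0.75, 1.0)  # same shortfall on Attribute_1
off_axis_2 = elliptic_score(1.0, 0.75)  # same shortfall on Attribute_2
```

With unequal radii the same shortfall costs more on one attribute than on the other, which is the kind of preference the system could learn from observing users.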


Information Visualization

• Supports Evidence Fusion
– Dimensional displays

• Supports Information Quality Decisions
– User interfaces

• Supports Clarification Dialogue
– Multi-media dialogue: “picture = Kilo-word”

• Navigation through information space
– Multiple views and orientation


Clusters in gravitational space

Points represent individual texts

Color-coded clusters.

Each cluster has different shape and density showing how well formed it is.

Verify if the cluster solution makes sense


Current Status Summary

• Initial HITIQA architecture completed

• End-to-end prototype running
– Most coded in Java

• Data: TREC
– A subset of TREC queries converted into “analytical” questions

• User studies underway
– Focus groups done

• Information fusion work underway


Plans for the next 6 months

• Mapping topical clusters onto Frames
– Frames represent “transactions”: events, situations, etc.

• Build initial Dialogue Manager
– Data driven, mixed initiative

• Complete Phase I of user experiments
– Collect information quality assessments

• Interactive Visualization prototype

• Start information fusion experiments
