Capturing and Answering Questions Posed to a Knowledge-Based System Peter Clark, John Thompson,...

Capturing and Answering Questions Posed to a Knowledge-Based System. Peter Clark, John Thompson, William R Murray, Phil Harrison (Boeing); Jason Chaw, Bruce Porter, Ken Barker, Peter Yeh, James Fan (UT Austin); Vinay Chaudhri, Aaron Spaulding (SRI); Bonnie John (CMU)

Capturing and Answering Questions Posed to a Knowledge-Based System

Peter Clark, John Thompson, William R Murray, Phil Harrison (Boeing)

Jason Chaw, Bruce Porter, Ken Barker, Peter Yeh, James Fan (UT Austin)

Vinay Chaudhri, Aaron Spaulding (SRI)

Bonnie John (CMU)

Overview

The context and problem
Question-Answering
  Controlled Language for Asking Questions
  Reasoning for Answering Questions
Evaluation and how it worked out
  Reformulation attempts
  Advice
  Common sense
Future

Context: Project Halo (Vulcan Inc)

“The Digital Aristotle”: access to massive amounts of knowledge in computationally usable form
First step: restricted to science domains
  Advanced high-school (AP) physics, chemistry, biology
Knowledge acquisition
  Can domain experts directly enter their knowledge?
Question-Answering
  Can non-computer-scientists pose questions?
  Can the system reason and provide good answers back?

Some Example AP Questions

Example question (chemistry)

A solution of nickel nitrate and sodium hydroxide are mixed together. Which of the following statements is true?
a. A precipitate will not form.
b. A precipitate of sodium nitrate will be produced.
c. Nickel hydroxide and sodium nitrate will be produced.
d. Nickel hydroxide will precipitate.
e. Hydrogen gas is produced from the sodium hydroxide.

Example question (physics)

An alien measures the height of a cliff by dropping a boulder from rest and measuring the time it takes to hit the ground below. The boulder fell for 23 seconds on a planet with an acceleration of gravity of 7.9 m/s^2. Assuming constant acceleration and ignoring air resistance, how high was the cliff?

Question-Asking: Approaches

Posing complex questions is challenging!

Templates: Too restricted

English: Too difficult for computer to understand

Formal language: Too difficult for user to learn

Controlled language?

The “Controlled Language” Claim

Formal language: “xy B(x)R(x,y)C(y)” (too hard for the user)
CPL: “A boulder is dropped”
Unrestricted natural language: “Consider the following possible situation in which a boulder first…” (too hard for the computer to understand)

There lies a “sweet spot” between logic and full NL which is both human-usable and machine-understandable

Example of a CPL encoding of a question

Original question:
An alien measures the height of a cliff by dropping a boulder from rest and measuring the time it takes to hit the ground below. The boulder fell for 23 seconds on a planet with an acceleration of gravity of 7.9 m/s^2. Assuming constant acceleration and ignoring air resistance, how high was the cliff?

CPL encoding:
A boulder is dropped.
The initial speed of the boulder is 0 m/s.
The duration of the drop is 23 seconds.
The acceleration of the drop is 7.9 m/s^2.
What is the distance of the drop?
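As a quick check of what this CPL encoding asks for, the sketch below computes the distance under the constant-acceleration assumption stated in the original question, using d = v0*t + 0.5*a*t^2. The function name `drop_distance` is illustrative only and is not part of CPL or AURA.

```python
def drop_distance(initial_speed: float, duration: float, acceleration: float) -> float:
    """Distance covered under constant acceleration: d = v0*t + 0.5*a*t**2."""
    return initial_speed * duration + 0.5 * acceleration * duration ** 2


# Values taken from the CPL encoding: 0 m/s, 23 seconds, 7.9 m/s^2.
distance = drop_distance(initial_speed=0.0, duration=23.0, acceleration=7.9)
print(f"{distance:.2f} m")  # 2089.55 m
```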

The Interface (Posing Questions)

Question-Answering: The Interface

Controlled Language for Question-Asking…

Controlled Language: Not a panacea!
Not just a matter of grammatical simplification
Only certain linguistic forms are understood
  Many concepts, many ways of expressing each one
  Huge effort to encode these in the interpreter
  User has to learn acceptable forms
User needs to make common sense explicit
  Man pulls rope, rope attached to sled → force on sled
  4 wheels support a car → ¼ weight on each wheel

The Question Answering Cycle

Stages: Original text, CPL (controlled English), Logic, Question-Answering
Feedback to the user: Rewriting advice; Graph & paraphrase of system’s understanding

CPL (controlled English):
A boulder is dropped.
The initial speed of the boulder is 0 m/s.
The duration of the drop is 23 seconds.
The acceleration of the drop is 7.9 m/s^2.
What is the distance of the drop?

Paraphrase of system’s understanding:
A boulder is the object of a dropping.
The dropping has a duration of 23 seconds.
The dropping has initial speed 0 m/s.
The dropping has acceleration 7.9 m/s^2.
The dropping has a distance of unknown.
What is the distance?
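To make the step from controlled English to a structured form concrete, here is a toy sketch that reads the two sentence shapes used in the boulder example into (entity, property) facts and queries. It is not the CPL interpreter: the patterns `FACT` and `QUERY` and the function `interpret` are hypothetical and cover only these shapes. Sentences they cannot read are returned separately, which is roughly the point at which a real system would have to give rewriting advice.

```python
import re

# Hypothetical patterns covering only the two sentence shapes used in the example.
FACT = re.compile(
    r"^The (?P<prop>[\w ]+?) of the (?P<entity>\w+) is "
    r"(?P<value>\d+(?:\.\d+)?) (?P<unit>\S+?)\.$"
)
QUERY = re.compile(r"^What is the (?P<prop>[\w ]+?) of the (?P<entity>\w+)\?$")


def interpret(sentences):
    """Split sentences into facts, queries, and sentences the toy patterns cannot read."""
    facts, queries, unparsed = {}, [], []
    for sentence in sentences:
        fact = FACT.match(sentence)
        query = QUERY.match(sentence)
        if fact:
            key = (fact["entity"], fact["prop"])
            facts[key] = (float(fact["value"]), fact["unit"])
        elif query:
            queries.append((query["entity"], query["prop"]))
        else:
            unparsed.append(sentence)
    return facts, queries, unparsed


question = [
    "A boulder is dropped.",
    "The initial speed of the boulder is 0 m/s.",
    "The duration of the drop is 23 seconds.",
    "The acceleration of the drop is 7.9 m/s^2.",
    "What is the distance of the drop?",
]
facts, queries, unparsed = interpret(question)
for key, value in facts.items():
    print(key, value)
print("queries:", queries)
print("unparsed:", unparsed)
```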

General Guidelines for CPL

Write in very simple sentences.
Avoid “and” & “or” phrases and negatives.
Avoid “flowery” language.
Avoid multiple states; instead, describe a single event with initial and final values.
Include common-sense facts if needed.
Ask for a single value in a question, or ask “Is it true that ...?”
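Some of these guidelines are mechanical enough to check automatically. The sketch below flags three of them in a single sentence. It is an illustration only: `check_guidelines`, the word lists, and the `MAX_WORDS` threshold are assumptions made here, not rules taken from CPL.

```python
import re

MAX_WORDS = 12  # assumed threshold for a "very simple" sentence
NEGATIVES = {"not", "no", "never"}  # assumed, incomplete list


def check_guidelines(sentence):
    """Return a list of guideline warnings for one sentence (empty if none apply)."""
    words = re.findall(r"[A-Za-z']+", sentence.lower())
    warnings = []
    if "and" in words or "or" in words:
        warnings.append('avoid "and" & "or" phrases')
    if any(w in NEGATIVES or w.endswith("n't") for w in words):
        warnings.append("avoid negatives")
    if len(words) > MAX_WORDS:
        warnings.append("write in very simple sentences")
    return warnings


print(check_guidelines("The mass of a block is 2 kg."))
print(check_guidelines("A precipitate will not form."))
```

A check like this only covers surface form. Guidelines such as including common-sense facts still depend on the user.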

For example: Write in Simple Sentences

INSTEAD OF:
A 2 kg block, starting from rest, slides 20 m down a frictionless inclined plane from X to Y, dropping a vertical distance of 10 m.

WRITE:
The mass of a block is 2 kg.
The initial velocity of the block is 0 m/s.
The block slides down an inclined plane from X to Y.
The coefficient of friction of the plane is 0 units.
X is a point on the plane.
Y is a point on the plane.
The distance between X and Y is 20 m.
The vertical distance between X and Y is 10 m.

Question-Answering

Find a “model” (set of equations + assumptions) that matches the question scenario and can provide an answer

May require making additional assumptions

"An object moves.
The mass of the object is 80 kg.
The initial speed of the object is 17 m/s.
The final speed of the object is 0 m/s.
The distance of the move is 10 m.
What is the force on the object?"

Unstated assumption: Acceleration is constant

Without this assumption: Can’t answer the question
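Once the unstated assumption is made, the example becomes a two-step calculation: constant acceleration gives a = (v_f^2 - v_i^2) / (2d), and then F = m*a. The sketch below works it through. The function name is illustrative, and the negative sign only indicates that the force opposes the motion.

```python
def force_constant_acceleration(mass, initial_speed, final_speed, distance):
    """Force on an object, ASSUMING its acceleration is constant over the distance."""
    acceleration = (final_speed ** 2 - initial_speed ** 2) / (2 * distance)
    return mass * acceleration


# Values from the CPL question: 80 kg, 17 m/s, 0 m/s, 10 m.
force = force_constant_acceleration(
    mass=80.0, initial_speed=17.0, final_speed=0.0, distance=10.0
)
print(f"{force:.1f} N")  # -1156.0 N
```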

Question-Answering

Basic Problem Solver (BPS)
  Searches space of possible models
  Finds a model which answers the question under assumptions
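The idea of searching over models can be sketched in a few lines. This is a toy illustration, not the BPS itself: the `Model` class, the two entries in `MODELS`, and the rule of preferring the applicable model with the fewest assumptions are all assumptions made here for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class Model:
    name: str
    assumptions: Tuple[str, ...]  # assumptions the model adds to the scenario
    inputs: Tuple[str, ...]       # quantities that must be known
    output: str                   # quantity the model can compute
    solve: Callable[[Dict[str, float]], float]


# Hypothetical model library with two ways of computing a force.
MODELS = (
    Model(
        name="constant-acceleration kinematics",
        assumptions=("acceleration is constant",),
        inputs=("mass", "initial speed", "final speed", "distance"),
        output="force",
        solve=lambda k: k["mass"]
        * (k["final speed"] ** 2 - k["initial speed"] ** 2)
        / (2 * k["distance"]),
    ),
    Model(
        name="Newton's second law",
        assumptions=(),
        inputs=("mass", "acceleration"),
        output="force",
        solve=lambda k: k["mass"] * k["acceleration"],
    ),
)


def answer(known, query, models=MODELS):
    """Return (model, value) for the applicable model with the fewest assumptions, or None."""
    applicable = [
        m for m in models
        if m.output == query and all(name in known for name in m.inputs)
    ]
    if not applicable:
        return None
    best = min(applicable, key=lambda m: len(m.assumptions))
    return best, best.solve(known)


known = {"mass": 80.0, "initial speed": 17.0, "final speed": 0.0, "distance": 10.0}
model, value = answer(known, "force")
print(model.name, model.assumptions, f"{value:.1f}")
```

With the quantities from the force question, only the first model applies, so the answer is returned together with the assumption it depends on.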

Results

Tested on topics in AP science in June 2006
Overall results (knowledge formulation & QA): 38% (biology), 37.5% (chemistry), 19% (physics)

Huge achievement! But what about the 60%-70% incorrect?

No single weak point:
  40% correct
  20% missing knowledge
  20% bad interpretation
  20% bad question formulation

Also: users needed several attempts to ask questions

Three investigations…

1. Reformulation attempts
2. Advice system
3. Factoring out common sense

1. Users would often have several attempts…

Mean # of reformulations ~ 6
  Physics: 6.3 tries/qn (between 1 and 18)
  Chemistry: 6.6 tries/qn (between 1 and 19)
  Biology: 1.5 tries/qn (between 1 and 5)

In the majority of cases, the user was trying to find a wording which worked

In general, only certain wordings translate to logical forms that trigger the right solution process, and in many cases the users appeared to be performing trial-and-error guessing until they hit a wording that worked, or they gave up.

1. Users would often have several attempts…

The SME trying to find a wording which works…

Is it true that BaCO3 is soluble in H2O?

Is it true that BaCO3 dissolves in water?

Is it true that BaCO3 dissolves?

There is a reaction. BaCO3 is the raw material of the reaction. Is it true that the result of the reaction is an aqueous solution?

What is the solubility of BaCO3?

What is the solubility of BaCO3 in water?

E6. Which of the following ionic compounds are insoluble in water?
[a] BaCO3
[b] BaNO3
[c] Al(OH)3
[d] NaOH

2. CPL’s advice to the User

Not mapping a word to a concept (37%)
Ungrammatical sentence (5%)
Not legal CPL (12%)
Bad chemical formulas (21% in chem)

“Failed to understand the input. Please rephrase”: 60%

“Always specify a unit for numbers (e.g., “10 m”, not just “10”)”: 40%
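The difference between the two kinds of message can be illustrated with a small sketch: when a sentence ends in a bare number, targeted advice is possible, and otherwise only the general message is left. The detection rule `BARE_NUMBER` and the function `advise` are assumptions for this example, not CPL's actual logic. The two message strings are the ones quoted on the slide.

```python
import re

GENERIC_ADVICE = "Failed to understand the input. Please rephrase"
UNIT_ADVICE = 'Always specify a unit for numbers (e.g., "10 m", not just "10")'

# Assumed rule: a number preceded by whitespace (or at the start) that ends the sentence.
BARE_NUMBER = re.compile(r"(?:^|\s)\d+(?:\.\d+)?\s*[.?]?\s*$")


def advise(sentence):
    """Pick advice for a sentence that the interpreter could not understand."""
    if BARE_NUMBER.search(sentence):
        return UNIT_ADVICE
    return GENERIC_ADVICE


print(advise("The distance of the move is 10."))
print(advise("The boulder plummets majestically."))
```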

2. CPL’s advice to the User

Unable to map word to a concept

Ungrammatical sentence

Not legal CPL

Bad chemical formula notation

Specific, targeted advice

1171 CPL advice messages given
About half of the advice library used at some point

3. Making common sense explicit

phys3-#24a: An aeroplane moves exactly in horizontal direction with a constant velocity of 50 km/h. A parachutist leaves the aeroplane....
  AURA needs to know v(parachutist) = v(airplane)

phys3-#18: A policeman chases a jewel thief across city rooftops. Both come to a gap between two buildings that is 5 m wide with a horizontal velocity of 5 m/s. What is the minimum drop of the second building in comparison to the first one to clear the gap (most nearly)?
  Need to know difference in height = distance of fall
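The rooftop question shows how much rests on the common-sense step. Once "difference in height = distance of fall" is stated, the rest is routine: the time to cross the gap is width / horizontal speed, and the fall in that time is 0.5*g*t^2. The sketch below assumes g = 9.8 m/s^2, which the question does not state. The function name is illustrative.

```python
G = 9.8  # m/s^2, assumed value of gravitational acceleration (not given in the question)


def drop_during_crossing(gap_width, horizontal_speed, gravity=G):
    """Vertical distance fallen while crossing the gap at constant horizontal speed."""
    time_to_cross = gap_width / horizontal_speed
    return 0.5 * gravity * time_to_cross ** 2


# Values from phys3-#18: gap of 5 m, horizontal velocity of 5 m/s.
drop = drop_during_crossing(gap_width=5.0, horizontal_speed=5.0)
print(f"{drop:.1f} m")  # 4.9 m
```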

Summary

Question-Answering: Challenging!
Controlled language for question-asking
  A “sweet spot” between logic and language
  But not a panacea:
    Users need to learn how to use and control it
    Expensive to build and maintain
Our system CPL
  Was able to adequately support QA in AURA
  But:
    Users needed several attempts to formulate questions
    CPL’s advice was often too general to be useful
    Challenging to factor out common sense knowledge
Question answering: The Basic Problem Solver (BPS)
  Searches for “best” model to answer a question
  May include heuristic assumptions about the scenario