
Project Halo

Towards a Digital Aristotle

Noah S. Friedland, Paul G. Allen, Gavin Matthews, Michael Witbrock, David Baxter, Jon Curtis, Blake Shepard, Pierluigi Miraglia, Jurgen Angele, Steffen Staab, Eddie Moench, Henrik Oppermann, Dirk Wenke, David Israel, Vinay Chaudhri, Bruce Porter, Ken Barker, James Fan, Shaw Yi Chaw, Peter Yeh, Dan Tecuci, and Peter Clark

Presented by Jacob Halvorson

Aristotle?

• Real person who lived 384–322 BC

• Known for:
  – The depth and scope of his knowledge
    • Wide range of topics: medicine, philosophy, physics, biology
  – Ability to explain this knowledge to others

What is Project Halo?

Goal: Create an application that will encompass much of the world’s scientific knowledge and be capable of applying sophisticated problem solving to answer novel questions.

Roles (envisioned):
• Tutor to instruct students in the sciences.
• Interdisciplinary research assistant to help scientists.

Sponsored by Vulcan, Inc.

Why is a program like Digital Aristotle important?

• Too much knowledge in the world for a single person to assimilate.
  – This forces people to become more specialized, thus defining their own restrictive “microworld”.
• Even these microworlds are too big.
  – MEDLINE: 12 million publications, with 2,000 added daily.

Don’t we have something like this already?

• Voorhees-style question answering
  – Retrieval of simple facts from an “answer” database.
• Knowledge-based expert systems
  – Derivation of answers that aren’t in a database.
  – Digital Aristotle fits in this category.

How Digital Aristotle differs from other knowledge-based expert systems

1 Speed and ease of knowledge formulation
  – Little or no help from knowledge engineers
  – *Other expert systems required years to perfect and highly skilled knowledge engineers to craft them
2 Coverage
  – Encompass much of the world’s scientific knowledge
3 Reasoning techniques
  – Multiple technologies and problem-solving methods
4 Explanations
  – Appropriate to the domain and the user’s level of expertise

Where to Start: The Halo Pilot

• Three teams contracted to participate in the evaluation
  – SRI International (with Boeing Phantom Works and Univ. of Texas backing)
  – Cycorp
  – Ontoprise
• Goal:
  – Determine the current state of knowledge representation & reasoning (KR&R) by mastering a 70-page subset of introductory college-level AP Chemistry.
  – *Secret goal: Set the bar so high that the current weaknesses of KR&R would be exposed
• Four months to create formal encodings; six months total.

70-page AP Chemistry Overview

• Self-contained, no reasoning with uncertainty, no diagrams

• Large enough for complex inference
  – Nearly 100 distinct chemistry laws

• Small enough to be represented quickly

The three teams’ technologies

• All three teams needed to address:
  – Knowledge formation
    • All built knowledge bases in a formal language and had knowledge engineers do the encoding.
  – Question answering
    • All used automated deductive inference to answer questions (see the sketch below).
  – Explanation generation
    • All three approaches were different.
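To make the shared pipeline concrete, here is a minimal Python sketch of “encode knowledge in a formal language, then answer by deductive inference”, assuming a toy forward-chaining engine. The Fact class, the rule function, and the chemistry simplification are all illustrative; none of the three teams used this exact representation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    predicate: str
    args: tuple

def strong_base_dissociation(facts):
    # Encoded law (illustrative simplification): a strong base dissociates
    # completely, so [OH-] equals the base's molar concentration.
    for f in facts:
        if f.predicate == "strong_base_concentration":
            substance, molarity = f.args
            yield Fact("hydroxide_concentration", (substance, molarity))

def forward_chain(facts, rules):
    """Naive forward chaining: apply every rule until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            for new in list(rule(facts)):  # consume before mutating the set
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

kb = {Fact("strong_base_concentration", ("NaOH", 0.1))}
derived = forward_chain(kb, [strong_base_dissociation])
print([f for f in derived if f.predicate == "hydroxide_concentration"])
```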

Knowledge Formation – Ontoprise team

• Ontoprise encoded knowledge in three phases:
  1 Encoded knowledge into the ontology and rules without considering sample questions
    – Tested with questions from the textbook
  2 Tested with questions from Vulcan
    – Refined the knowledge base until it reached 70% coverage
    – Coded explanation rules
  3 Refined the encoding of the knowledge base and the explanation rules

Knowledge Formation - Cycorp

• Cycorp encoded knowledge in two phases:
  1 Concentrated on representing the basic concepts and principles
  2 Shifted over to a question-driven approach

*Avoid overfitting the knowledge to the specifics of the sample questions available.

Knowledge Formation - SRI

• Question-driven
  – Started with 50 sample questions
    • Worked backwards to determine what knowledge was needed to solve them.
  – Found additional questions and continued
• Combined team of knowledge engineers and four chemistry domain experts.

Cycorp and SRI had preexisting knowledge-base content; Ontoprise started from scratch.

Explanation Generation – Ontoprise team

• Used metainferencing
  – While processing a query, a log file of the proof tree is created.
  – The log file is used to create English answer justifications.
• Running short on time, they mostly used template matching (see the sketch below).

Explanation of the Ka value of a substance given its quantity in moles (0.2) and its pH (3.0)
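A minimal sketch of the template-matching step, assuming the proof tree is logged as nested nodes as the slide describes. The node layout, rule names, and English templates are invented for illustration; Ontoprise’s actual log format is not shown in the slides.

```python
TEMPLATES = {
    "weak-acid-ka": "The Ka of {substance} follows from [H+] = 10^-pH = {h} "
                    "and the initial concentration {c} mol/L.",
    "ph-to-h": "A pH of {ph} means the H+ concentration is {h} mol/L.",
}

def explain(node, depth=0):
    """Walk the logged proof tree and emit one templated sentence per step."""
    line = TEMPLATES[node["rule"]].format(**node["bindings"])
    print("  " * depth + line)
    for child in node.get("children", []):
        explain(child, depth + 1)

# Hypothetical logged proof tree for the Ka example above (0.2 mol, pH 3.0).
proof_log = {
    "rule": "weak-acid-ka",
    "bindings": {"substance": "HA", "h": 1e-3, "c": 0.2},
    "children": [
        {"rule": "ph-to-h", "bindings": {"ph": 3.0, "h": 1e-3}},
    ],
}
explain(proof_log)
```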

Explanation Generation - SRI

• The knowledge engineer specifies what text to display (see the sketch below):
  – When a rule is invoked (“entry text”)
  – When the rule has been successfully applied (“exit text”)
  – A list of any other facts that should be explained in support of the current rule (“dependent facts”)

Explanation generated for the computation of concentration of ions in NaOH
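A minimal sketch of rule annotations in this style, assuming a rule object that carries the three kinds of engineer-authored text. The AnnotatedRule class and the NaOH example are illustrative assumptions, not SRI’s actual API.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotatedRule:
    name: str
    entry_text: str   # shown when the rule is invoked
    exit_text: str    # shown when the rule has been successfully applied
    dependent_facts: list = field(default_factory=list)  # facts explained in support

def render_explanation(rule, bindings):
    """Emit entry text, supporting facts, then exit text for one rule firing."""
    print(rule.entry_text.format(**bindings))
    for fact in rule.dependent_facts:
        print("  - " + fact.format(**bindings))
    print(rule.exit_text.format(**bindings))

ion_concentration = AnnotatedRule(
    name="strong-base-ion-concentration",
    entry_text="To find [OH-] in {molarity} M {base}:",
    exit_text="Therefore [OH-] = {molarity} mol/L.",
    dependent_facts=["{base} is a strong base, so it dissociates completely."],
)
render_explanation(ion_concentration, {"base": "NaOH", "molarity": 0.1})
```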

Explanation Generation - Cycorp

• Cycorp already had a program capable of providing natural-language explanations at any level of detail.
  – Much of the effort was spent on strengthening the explanation filters
    • Output errs on the side of verbosity.
  – The English is built up compositionally by automated techniques rather than handcrafted (see the sketch below).
    • Exhibits clumsiness of expression.
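A minimal sketch of compositional generation, assuming a toy grammar over nested logical forms; the operators and the example form are invented. Rendering every subterm recursively tends to produce exactly the verbose, slightly clumsy English the slide describes.

```python
def render(term):
    """Render a nested (operator, *args) logical form as English."""
    if isinstance(term, str):
        return term
    op, *args = term
    if op == "concentration":
        return f"the concentration of {render(args[0])} in {render(args[1])}"
    if op == "equals":
        return f"{render(args[0])} is equal to {render(args[1])}"
    if op == "because":
        return f"{render(args[0])}, because {render(args[1])}"
    return " ".join(render(a) for a in args)

form = ("because",
        ("equals", ("concentration", "hydroxide ions", "the solution"), "0.1 mol/L"),
        ("equals", ("concentration", "NaOH", "the solution"), "0.1 mol/L"))
print(render(form))
# -> "the concentration of hydroxide ions in the solution is equal to
#     0.1 mol/L, because the concentration of NaOH in the solution is
#     equal to 0.1 mol/L"
```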

Evaluation of the three systems

• After four months, an exam was given to all three systems.
  – 100 AP-style English questions (total score: 1008 points)
    • 50 multiple choice
    • Two sets of 25 multipart questions
      – Detailed answer: fill-in-the-blank and short essay
      – Free-form answer: qualitative comprehension questions (somewhat commonsense, and somewhat beyond the scope of the defined syllabus)
  – Graded by 3 chemistry professors (336 points per professor; 3 × 336 = 1008)
    • Correctness (168 points)
    • Quality of explanation (168 points)
  – **Input could be in any form

Processing Time

• Ontoprise – 2 hours
• SRI – 5 hours
• Cycorp – 12 hours

Exam Examples

Results

• All three systems scored above 40%


Results – Multiple Choice

• Cycorp’s program looked for provably wrong answers if the correct answer couldn’t be found immediately (see the sketch below).
  – No answer = no justification
  – Incorrect answers = unconvincing justifications
• SRI was the winner of multiple choice
  – Best answers and justifications
• Cycorp was the loser
  – Its generated English was the least comprehensible
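A minimal sketch of this fallback strategy as the slide describes it: try to prove a choice correct, and failing that, try to refute the alternatives and pick whatever survives. The prover and disprover stubs are illustrative assumptions.

```python
def answer_multiple_choice(question, choices, prove, disprove):
    """Return (choice, justification) via direct proof, then elimination."""
    for c in choices:
        if prove(question, c):                      # direct proof of correctness
            return c, f"Proved that '{c}' is correct."
    refuted = {c for c in choices if disprove(question, c)}
    remaining = [c for c in choices if c not in refuted]
    if len(remaining) == 1:                         # all other choices provably wrong
        return remaining[0], "All other choices were proved wrong."
    return None, "No answer found."                 # no answer = no justification

# Toy usage with stub provers: only elimination succeeds here.
choices = ["A", "B", "C", "D"]
pick, why = answer_multiple_choice(
    "q", choices,
    prove=lambda q, c: False,
    disprove=lambda q, c: c != "B",
)
print(pick, "-", why)   # B - All other choices were proved wrong.
```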

Results – Detailed Answer

• Cycorp appears to be the best
  – It wasn’t penalized for working through all the answers, as it was in multiple choice.

Results – Free-Form

• SRI & Cycorp were expected to do well

• SRI did much better than the others

Second Exam

• All three teams made their own modifications after the exam and ran the challenge again

• Ontoprise – 9 minutes (2 hours previously)

• SRI – 30 minutes (5 hours previously)

• Cycorp – 27 hours (12 hours previously)

Failure Analysis (What we’ve learned)

• Modeling
  – Incorrect knowledge was represented, and knowledge was captured at the wrong level of abstraction
    • Solution: domain experts
• Answer Justification
  – Answers don’t matter if they can’t be explained
    • Solution: perform metareasoning over the proof tree
• Scalability for Speed and Reuse
  – How to manage the trade-off between expressiveness and tractability

Ontoprise Output Example

SRI Output Example

Cycorp Output Example

Where To Go From Here: Phase Two

• Goal: A domain expert uses an existing document, such as a textbook, as the basis for formulating a knowledge module.
  – 30-month phase
  – Three stages:
    • 6-month analysis-driven design stage
    • 15-month implementation stage
    • 9-month refinement stage
• *Get rid of knowledge engineers