The Doctor is in… How does it work?jdb/ai/wk2-2006.pdf · project came from feeding the phrase...


RIT Computer Science Dept.

ELIZA

A program by Joseph Weizenbaum in the 1960s that emulated a Rogerian psychologist in dialogue with a patient

– Led at least one psychologist to seriously suggest that computers might help alleviate the shortage of trained psychotherapists

– Weizenbaum later questioned his own work and became one of the field’s biggest critics

ELIZA is…

A reactive agent (a stimulus-response agent)
An example of a simple production system (we will discuss production systems later)

The Doctor is in…

[example of dialog with the doctor program in emacs]

How does it work?

See Weizenbaum’s paper at: http://i5.nyu.edu/~mm64/x52.9265/january1966.html

Slides come from information and text in the paper.

“Input sentences are analyzed on the basis of decomposition rules which are triggered by key words appearing in the input text. Responses are generated by reassembly rules associated with selected decomposition rules.”

The technical problems

1. the identification of keywords
2. the discovery of minimal context
3. the choice of appropriate transformations
4. generation of responses in the absence of keywords
5. the provision of an editing capability for ELIZA "scripts"

ID of Keywords

Keywords may have a RANK or precedence number. The procedure is sensitive to such numbers.
When a keyword has been found and there is a delimiter (period or comma), all subsequent text is deleted from the input message.
The keywords and their transformation rules are the SCRIPT for the conversation class.
Scripts are independent of the program, so the same program can handle English, German, and other languages.
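The keyword scan described on this slide can be sketched in a few lines of Python. This is only an illustration, not Weizenbaum's code: the rank table `KEYWORD_RANK`, the function name `scan`, and the rule that a higher number wins are all assumptions made for the example.

```python
import re

# Hypothetical ranks for illustration only; here a higher number wins.
KEYWORD_RANK = {"i": 1, "you": 2, "mother": 5, "computer": 10}

def scan(message):
    """Return (highest-ranked keyword or None, text kept after truncation)."""
    kept = []
    best = None
    # Tokens are words or one of the two delimiters named on the slide.
    for token in re.findall(r"[A-Za-z']+|[.,]", message):
        if token in (".", ","):
            if best is not None:
                break      # keyword already found: drop all subsequent text
            continue       # no keyword yet: keep scanning
        kept.append(token)
        rank = KEYWORD_RANK.get(token.lower())
        if rank is not None and (best is None or rank > KEYWORD_RANK[best]):
            best = token.lower()
    return best, " ".join(kept)
```

Because the table lives outside `scan`, swapping in a different table is all it would take to change the "script", which is the independence the slide points out.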

Other Issues

1. The ID of the "most important" keyword
2. The ID of some minimal context within which the chosen keyword appears; e.g., if the keyword is "you", is it followed by the word "are" (in which case an assertion is probably being made)?
3. The choice of an appropriate transformation rule, and the transformation itself.
4. To respond "intelligently" when the input text contains no keywords.
5. The provision of machinery that facilitates editing, particularly extension, of the script on the script-writing level

An Example Transformation

Consider the sentence:

– "I am very unhappy these days".

Consider somebody who only understood “I am” and responded:

– "How long have you been very unhappy these days?"

This person must have applied a template!
He must also have a reassembly kit:

– "I am BLAH" can be transformed to

– "How long have you been BLAH" independently of the meaning of BLAH.
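The "I am BLAH" template can be mimicked with a regular expression. This is a minimal sketch under the assumption that one pattern and one reply are enough; the names `I_AM` and `respond` are invented for the example.

```python
import re

# Decomposition template: the words "I am" followed by anything (BLAH).
I_AM = re.compile(r"^\s*I am\s+(?P<blah>.+?)[.!?]*\s*$", re.IGNORECASE)

def respond(sentence):
    match = I_AM.match(sentence)
    if match is None:
        return None  # the template does not apply
    # Reassembly: BLAH is copied verbatim, without being understood.
    return f"How long have you been {match.group('blah')}?"
```

Calling `respond("I am very unhappy these days.")` gives the reply shown on the slide, and the function never looks inside BLAH.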

A more complicated example

The sentence:

– "It seems that you hate me".

The words "you" and "me" are understood
A template decomposes the sentence into 4 parts:

1) It seems that
2) you
3) hate
4) me

The reassembly rule might then be:

– "What makes you think I hate you"
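The four-part decomposition can be sketched the same way. The pattern `YOU_ME` and both function names are assumptions for illustration; the original program used its own list-based template notation, not regular expressions.

```python
import re

# Template "(anything) you (anything) me", which yields the four parts above.
YOU_ME = re.compile(r"^(.*?)\byou\b\s*(.*?)\s*\bme\b[.!?]*$", re.IGNORECASE)

def decompose(sentence):
    m = YOU_ME.match(sentence.strip())
    if m is None:
        return None
    return [m.group(1).strip(), "you", m.group(2), "me"]

def reassemble(parts):
    # Only part 3 is reused; the other parts are replaced by fixed text.
    return f"What makes you think I {parts[2]} you"
```

Note that the reassembly rule swaps the speaker and listener ("you ... me" becomes "I ... you") purely by position in the template.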

What are the problems?

How can we make the Eliza program fail?

Automated Translation

Failed miserably for years and is still very difficult

⇒ One of the famous mistakes from an early DOD project came from feeding the phrase “the spirit is willing but the flesh is weak” into an English to Russian translator, then feeding the result back into a Russian to English translator to yield “the vodka is strong but the meat is rotten”.

Knowledge Representation and Search

If a computer only has the right knowledge, representation of that knowledge, and way of indexing that knowledge then it can be intelligent

Does this idea work?

Physical Symbol Hypothesis

Intelligent activity in either human or machine requires [Newell & Simon] :

– Symbol patterns to represent significant aspects of a problem domain

– Operations on these patterns to generate potential solutions to problems

– Search to select a solution from among these possibilities

Representation Schemes

Schemes should be [Luger, pg. 36]:
Expressive: The scheme must be adequate to express all necessary information
Efficient: Support efficient execution of the resulting code
Natural: Provide a natural scheme for expressing the required knowledge

Representation Schemes

Where a good representation is available, the solution to the problem may be easy.
Sometimes it’s hard to have a good representation – just imagine a representation for common sense knowledge in a natural language system!

An example

Task Description

To write a program that finds, for a given phone number, all possible encodings by words, and prints them. A phone number is an arbitrary(!) string of dashes - , slashes / and digits. The dashes and slashes will not be encoded. The words are taken from a dictionary which is given as an alphabetically sorted ASCII file (one word per line).

Participants: 14 programmers (ave. experience: ~7 yr)
Biggest experimental flaw: subjects self-selected

Mapping

The following mapping from letters to digits is given:

E | J N Q | R W X | D S Y | F T | A M | C I V | B K U | L O P | G H Z
e | j n q | r w x | d s y | f t | a m | c i v | b k u | l o p | g h z
0 |   1   |   2   |   3   |  4  |  5  |   6   |   7   |   8   |   9
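To make the task concrete, here is a small Python sketch built on the mapping above. It is a simplification, not one of the programs from the study: it only reports encodings that cover every digit with dictionary words, and the names `GROUPS`, `word_to_digits`, and `encodings` are made up for this example.

```python
# Letter groups for digits 0-9, copied from the mapping on the slide.
GROUPS = ["e", "jnq", "rwx", "dsy", "ft", "am", "civ", "bku", "lop", "ghz"]
LETTER_TO_DIGIT = {ch: str(d) for d, group in enumerate(GROUPS) for ch in group}

def word_to_digits(word):
    """Digit string a word encodes; characters outside a-z are ignored."""
    return "".join(LETTER_TO_DIGIT[c] for c in word.lower() if c in LETTER_TO_DIGIT)

def encodings(number, words):
    """All ways to spell the digits of `number` as a sequence of words."""
    digits = "".join(c for c in number if c.isdigit())  # dashes and slashes are skipped
    by_code = {}
    for w in words:
        by_code.setdefault(word_to_digits(w), []).append(w)

    def solve(rest):
        if not rest:
            return [[]]
        found = []
        # Try every prefix of the remaining digits as one word.
        for end in range(1, len(rest) + 1):
            for w in by_code.get(rest[:end], []):
                found.extend([w] + tail for tail in solve(rest[end:]))
        return found

    return solve(digits)
```

Indexing the dictionary by digit string is a representation choice: once words are stored under their codes, the search itself is only a few lines.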

The Results

Using Lisp
Time (hr): 2 to 8.5, ave: 5
Lines of Code: 51 to 182
Run Time (median): 30 seconds

Using C/C++
Time (hr): 3 to 25, ave: 11
Lines of Code: 107 to 614, ave: 277
Run Time (median): 54 seconds

Using Java
Time (hr): 4 to 63, ave: 9
Lines of Code: 107 to 614, ave: 277

Quickest C/C++ program ran faster than the quickest Lisp program

Different Categories

Logical representation schemes. This class of representations uses expressions in formal logic to represent a knowledge base. Inference rules and proof procedures apply this knowledge to problem instances.

Procedural representation schemes. Procedural schemes represent knowledge as a set of instructions for solving a problem. This contrasts with the declarative representations provided by logic and semantic networks. A production rule system is an example of this approach.

Network representation schemes. Network representations capture knowledge as a graph in which the nodes represent objects or concepts in the problem domain and the arcs represent relations or associations between them.

Structured representation schemes. Structured representation languages extend networks by allowing each node to be a complex data structure consisting of named slots with attached values.

Different Representation Schemes

We’ll cover:
– Brooks’ Subsumption Architecture
– Semantic networks
– Frames
– Trees

Brooks’ Hypothesis

Rational behavior does not come from disembodied systems. Intelligence is the product of the interaction between an appropriately layered system and its environment. Intelligent behavior emerges from the interactions of architectures of organized simpler behaviors

Subsumption Architecture

Constructed of augmented finite state machines
Based on production rules where inputs map to actions
Lower level modules combine together and create emergent intelligent behavior
There is no central memory
[example given from Brooks’ 1985 paper]

Semantic Networks

Use the associationist view
Represent knowledge as a graph with nodes corresponding to facts or concepts and arcs corresponding to relations or associations between concepts.
Can represent inheritance relationships
Variations used for natural language processing
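A tiny semantic network with inheritance might look like the sketch below. The nodes, the arc labels, and the `lookup` function are invented for illustration, and the sketch assumes the "isa" arcs contain no cycles.

```python
# Arcs are (node, relation) -> value; every name here is a made-up example.
ARCS = {
    ("canary", "isa"): "bird",
    ("bird", "isa"): "animal",
    ("canary", "color"): "yellow",
    ("bird", "can"): "fly",
    ("animal", "has"): "skin",
}

def lookup(node, relation):
    """Follow isa arcs upward until some node answers the relation."""
    while node is not None:
        if (node, relation) in ARCS:
            return ARCS[(node, relation)]
        node = ARCS.get((node, "isa"))  # inherit from the more general concept
    return None
```

The fact "a canary can fly" is never stored directly; it is inherited from "bird" by walking the "isa" arcs.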

Examples

[shown in class and your book]

Quillian’s Word System [1967]

Program defined English words in terms of other words, as a dictionary does
There were possible circular traversals of the graph
Each node was a word concept and the knowledge base (KB) was organized into planes where each plane was a graph that defined a single word

How it was used

The KB was used to find relationships between pairs of English words. Given two words, it would search the graphs outward from each word in a breadth-first fashion, searching for a common concept or intersection node.
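The intersection search can be sketched as two breadth-first searches that take turns. The toy graph `GRAPH` and the function `intersection` are assumptions for illustration and are far simpler than Quillian's planes.

```python
from collections import deque

# Toy association graph; the words and links are invented for illustration.
GRAPH = {
    "cry": ["sad", "sound"],
    "comfort": ["sad", "calm"],
    "sad": ["cry", "comfort"],
    "sound": ["cry"],
    "calm": ["comfort"],
    "rock": [],
}

def intersection(word_a, word_b):
    """Expand both words breadth-first, in turn, until the searches meet."""
    if word_a == word_b:
        return word_a
    seen = {word_a: {word_a}, word_b: {word_b}}
    queues = {word_a: deque([word_a]), word_b: deque([word_b])}
    while queues[word_a] or queues[word_b]:
        for start, other in ((word_a, word_b), (word_b, word_a)):
            if not queues[start]:
                continue
            node = queues[start].popleft()
            for neighbor in GRAPH.get(node, []):
                if neighbor in seen[other]:
                    return neighbor  # reached from both sides
                if neighbor not in seen[start]:
                    seen[start].add(neighbor)
                    queues[start].append(neighbor)
    return None
```

The `seen` sets also keep the search from looping forever on the circular definitions mentioned earlier.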

Quillian Suggested

This approach to semantics might provide a natural language processing system with the ability to:

– Determine the meaning of a body of English text by building up collections of these intersection nodes

– Choose between multiple meanings of words by finding the meanings with the shortest intersection path to other words in the sentence.

– Answer a flexible range of queries based on associations between word concepts in the queries and concepts in the system

Problems

Graphs were just another notation for relationships

– Many people have worked on building up a richer set of link labels to model language semantics. By implementing semantic relationships as part of the representational scheme rather than as domain knowledge added by the system builder, KBs require less handcrafting

– Can a program be written to reduce sentences to a canonical form?

– There is a computational price to reducing everything to low-level primitives

Frames

Frame system (p. 32):
Each frame:

– Describes an instance or a class
– Has one or more slots, which are assigned slot values
– Slots can be frames
– Can have procedural attachments
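The four properties of a frame can be shown in a short Python sketch. The `Frame` class and the hotel-room example are assumptions made for illustration; real frame languages offer much more (defaults, demons, multiple inheritance).

```python
class Frame:
    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent  # the class frame this frame inherits from
        self.slots = slots

    def get(self, slot):
        frame = self
        while frame is not None:
            if slot in frame.slots:
                value = frame.slots[slot]
                # Procedural attachment: a callable is run to compute the value.
                return value(self) if callable(value) else value
            frame = frame.parent
        raise KeyError(slot)

room = Frame("room", walls=4)
hotel_room = Frame("hotel_room", parent=room,
                   bed=Frame("bed", size="queen"),     # a slot that is itself a frame
                   beds=1,
                   rate=lambda f: 50 * f.get("beds"))  # hypothetical pricing rule
suite = Frame("suite", parent=hotel_room, beds=2)
```

Here `suite` stores only what differs from `hotel_room`; everything else, including the attached procedure for `rate`, is inherited.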

Scripts

Many systems require a large amount of background knowledge in order to function
A script is a structured representation describing a stereotyped sequence of events in a particular context.
Pieces of semantic meaning are represented with conceptual dependency relationships

Example

When you enter a classroom, the teacher normally follows a certain script and you as a student do also!

Components of a Script

Entry conditions: Things that must be true for a script to occur
Results: Facts that are true once the script has terminated.
Props: Things that support the content of the script.
Roles: Actions that individual participants perform
Scenes: Represent the temporal aspect of each script

An example?

[class picks an example and we turn it into a script]

Problems?

What kind of problems are there with scripts?

Search Trees

The tree is a representation of a problem that makes it easy for a solution to be found.

Decision trees: yes/no
Goal trees: problem broken into subproblems

Deciding What to Do: Scenario Objectives (Simplified Counter-Terrorist Bomb Example)

[Booth, GDC 2004]

Search

Many problems are search problems
– Traveling salesman problem
– N-Queens or half toning?
– Object Recognition
– Internet search or any kind of data mining
– Robot navigation
– VLSI circuit layout
– …

State Space Search

Questions to be answered:
– Is the problem solver guaranteed to find a solution?
– Will the problem solver always terminate or can it become caught in an infinite loop?
– When a solution is found will it always be optimal?
– What is the complexity of the search process in terms of time usage? Memory usage?

Uninformed Searches

Ch. 4 in book
Breadth-first – queue-based
Uniform-cost
Depth-first (depth limited) – stack-based
Iterative deepening depth-first
Bi-directional

Example of navigation shown in class
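The queue-versus-stack distinction can be seen in a sketch where one line is the only difference between the two searches. The map `MAP` and the function `search` are hypothetical and are not the navigation example from class.

```python
from collections import deque

# Hypothetical map: each place lists the places reachable in one step.
MAP = {"A": ["B", "C"], "B": ["F"], "C": ["D"], "D": ["F"], "F": []}

def search(start, goal, breadth_first=True):
    frontier = deque([[start]])  # paths waiting to be extended
    visited = {start}
    while frontier:
        # Queue: take the oldest path. Stack: take the newest path.
        path = frontier.popleft() if breadth_first else frontier.pop()
        if path[-1] == goal:
            return path
        for nxt in MAP[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None
```

On this map the breadth-first version returns the two-step route through B, while the depth-first version commits to C first and returns a longer route.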

Basic Search Problem

All search techniques so far have a worst case exponential time complexity
A full search may take too long!

– Chess has a possible 10^120 game paths
– Checkers has a possible 10^40 game paths

Comparing Search Techniques (Russell & Norvig, p. 81)

Criterion | Breadth 1st | Uniform cost    | Depth 1st | Depth limited | Iterative deepening | Bidir
Complete? | Yes         | Yes             | No        | No            | Yes                 | Yes
Time      | O(b^(d+1))  | O(b^(1+[C*/ε])) | O(b^m)    | O(b^l)        | O(b^d)              | O(b^(d/2))
Space     | O(b^(d+1))  | O(b^(1+[C*/ε])) | O(bm)     | O(bl)         | O(bd)               | O(b^(d/2))
Optimal?  | Yes         | Yes             | No        | No            | Yes                 | Yes

Examples

Trees for common 2-player games
– Tic-tac-toe has at most 9! = 362,880 game paths (this is why tic-tac-toe tends to be used in brute force search examples)
– The Eight Puzzle is another game example

Different Ways of Problem Solving

Data-driven: forward chaining
– Starts with facts and attempts to make a path to the goal

Goal-driven: backward chaining
– Starts from the goal and goes backwards towards the initial facts

What is preferred?
– It depends
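Both directions can be sketched over the same rule set. The rules and facts below are invented, and the sketch assumes the rules contain no cycles (otherwise `backward_chain` could recurse forever).

```python
# Each rule is (set of premises, conclusion). Rules are invented examples.
RULES = [
    ({"rain"}, "wet_ground"),
    ({"wet_ground", "cold"}, "ice"),
    ({"ice"}, "slippery"),
]

def forward_chain(facts):
    """Data-driven: fire rules from the facts until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def backward_chain(goal, facts):
    """Goal-driven: a goal holds if it is a fact or some rule proves it."""
    if goal in facts:
        return True
    return any(all(backward_chain(p, facts) for p in premises)
               for premises, conclusion in RULES if conclusion == goal)
```

`forward_chain` derives everything that follows from the facts, whether or not anyone asked for it; `backward_chain` only touches the rules that bear on the one goal.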

An Example

Say I want to confirm/deny that “I am a descendant of Thomas Jefferson”.

– He was born around 250 years ago and we assume 25 yr. per generation. We also assume that people generally have more children than parents (say an average of 3 children)

– The required path back would be around 10 generations. If we assume 2 parents for each person, then there are 2^10 possible states to search.

– Searching forward from Jefferson over the same 10 generations would involve around 3^10 states
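The arithmetic behind this comparison is easy to check. The variable names are only for illustration; the numbers are the slide's own assumptions (250 years, 25 years per generation, 2 parents, 3 children).

```python
generations = 250 // 25            # about 10 generations
backward_states = 2 ** generations  # from me toward Jefferson: 2 parents each
forward_states = 3 ** generations   # from Jefferson toward me: 3 children each

print(generations, backward_states, forward_states)
```

Searching backward from the goal looks at 1,024 ancestors, against 59,049 descendants when searching forward, so the goal-driven direction is the cheaper one here.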

Which one?

Goal-driven search suggested if:
– Goal-hypothesis is given in the problem and is easily formulated. Example: a theorem prover
– There are a large number of rules that match the facts of the problem and thus produce an increasing number of conclusions or goals.
– Problem data are not given but must be acquired by the problem solver. Example: a medical diagnosis system where diagnostic tests are ordered to confirm/deny a particular hypothesis

The other?

Data-driven search is suggested if:
– All or most of the data are given in the initial problem statement. Systems that analyze data fall into this category
– There are a large number of possible goals, but only a few ways to use the facts. Example: DENDRAL, an expert system that finds the molecular structure of organic compounds based on their formula, mass spectrographic data, and knowledge of chemistry
– It’s difficult to form a goal or hypothesis

Examples

Trees for common 2-player games
– Tic-tac-toe has at most 9! = 362,880 game paths (this is why tic-tac-toe tends to be used in brute force search examples)
– The Eight Puzzle is another game example

What about solving mazes using trees?
[example from book shown]

Basic Search Problem

All search techniques so far have a worst case exponential time complexity
A full search may take too long!

– Chess has a possible 10^120 game paths
– Checkers has a possible 10^40 game paths

Considering Complexity in Search

10^120 is far more than the estimated number of atoms in the observable universe (about 10^80)!
If the branching factor of a tree (ave. number of branches for each parent) can be cut down, then the tree can be searched deeper

– This is easy to do for tic-tac-toe when you consider the symmetry of the board
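The effect of board symmetry on the first move can be checked with a short sketch. The functions `transforms` and `distinct_first_moves` are assumptions for illustration; they treat two opening squares as the same if a rotation or reflection of the 3x3 board maps one onto the other.

```python
def transforms(cell):
    """The images of (row, col) under the 8 symmetries of a 3x3 board."""
    r, c = cell
    images = []
    for _ in range(4):
        r, c = c, 2 - r             # rotate by 90 degrees
        images.append((r, c))
        images.append((r, 2 - c))   # mirror image of that rotation
    return images

def distinct_first_moves():
    seen, representatives = set(), []
    for cell in [(r, c) for r in range(3) for c in range(3)]:
        if cell not in seen:
            representatives.append(cell)
            seen.update(transforms(cell))
    return representatives
```

The nine opening moves collapse to three (a corner, an edge, the center), so the branching factor at the root drops from 9 to 3.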

Heuristic Search

“Intelligence for a system with limited processing resources consists in making wise choices of what to do next…”

– Newell and Simon, 1976, Turing Award Lecture

Consider Two Problems

Medical Diagnosis
Chess

Two Basic Situations

A problem may not have an exact solution because of inherent ambiguities in the problem statement or available data. Medical diagnosis is an example of this. Heuristics are used to choose the most likely diagnosis and formulate a plan of treatment.

A problem may have an exact solution, but the computational cost may be prohibitive. An example here is chess.

Comments on the Nature of Heuristics

Heuristics have limited information and are seldom able to predict the exact behavior of the state space farther along in the search – can lead to sub-optimal solutions
Heuristic algorithms have two parts:

– The heuristic measure
– An algorithm that uses it to search the state space

Hill Climbing

The simplest thing to do is to climb the hill by the steepest path possible
Hill climbing strategies expand and evaluate only the children of the best child. They ignore its siblings.
What could go wrong here?

Local Maxima

What if you have to take a winding path (it goes down and sometimes up) to get to the top of the hill?

– Many algorithms in AI suffer from the problem of getting stuck in local maxima.
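A one-dimensional sketch shows how hill climbing gets stuck. The landscape `HEIGHTS` and the function `hill_climb` are invented for illustration: index 2 is a local peak and index 6 is the true top.

```python
# Hypothetical landscape: heights along a line.
HEIGHTS = [1, 3, 5, 4, 2, 6, 9, 7]

def hill_climb(position):
    while True:
        neighbors = [p for p in (position - 1, position + 1)
                     if 0 <= p < len(HEIGHTS)]
        best = max(neighbors, key=lambda p: HEIGHTS[p])
        if HEIGHTS[best] <= HEIGHTS[position]:
            return position  # no neighbor is higher: stop here
        position = best      # move to the best neighbor, forget the rest
```

Starting at index 0 the climb stops at index 2 (height 5), because reaching the real top would first require going down through index 4.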

Best-first Search

Local maxima are hopefully avoided by backtracking
Uses two lists:

– An open list: keeps track of the current fringe of the search

– A closed list: keeps track of states already visited

– The algorithm orders states on open according to their “closeness” to a goal

– This amounts to a priority queue for the search
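The open and closed lists can be sketched with Python's `heapq` module as the priority queue. The graph `GRAPH`, the heuristic table `H` (smaller means closer to the goal), and the function `best_first` are assumptions for illustration.

```python
import heapq

# Hypothetical graph and heuristic estimates of distance to the goal G.
GRAPH = {"S": ["A", "B"], "A": ["C"], "B": ["G"], "C": [], "G": []}
H = {"S": 3, "A": 1, "B": 2, "C": 4, "G": 0}

def best_first(start, goal):
    open_list = [(H[start], start, [start])]  # priority queue ordered by H
    closed = set()                            # states already visited
    while open_list:
        _, state, path = heapq.heappop(open_list)
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for child in GRAPH[state]:
            if child not in closed:
                heapq.heappush(open_list, (H[child], child, path + [child]))
    return None
```

The search first follows A because it looks closest, finds only the unpromising C beyond it, and then falls back to B, which is still waiting on the open list. That fallback is the backtracking hill climbing lacks.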

Best-First Search

When the frontier of the search is uneven, best-first search can work well

Admissibility

Admissibility: Search algorithms that find the shortest path to a goal whenever one exists are admissible.
Breadth-first search is admissible because it examines all possible states at level n before going to level n+1