AI Jacobs University Green Slides Dec 5


  • Artificial Intelligence, Course No. 320331, Fall 2013

    Dr. Kaustubh Pathak, Assistant Professor, Computer Science, [email protected]

    Jacobs University Bremen

    December 5, 2013

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 1 / 475

    Course Introduction

    Python Brief Introduction

    Agents and their Task-environments

    Goal-based Problem-solving Agents using Searching

    Non-classical Search Algorithms

    Games Agents Play

    Logical Agents: Propositional Logic

    Probability Calculus

    Beginning to Learn using Naïve Bayesian Classifiers

    Bayesian Networks

    Some Learning Methodologies

    Home-work Assignments

    Quizzes

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 2 / 475

  • Course Introduction

    Contents

    Course Introduction
      Course Logistics
      What is Artificial Intelligence (AI)?
      Foundations of AI
      History of AI
      State of the Art

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 3 / 475

    Course Introduction Course Logistics

    Grading

    I Break-down:
        Easy quizzes: 15% (Auditors: taking 75% of the quizzes is necessary.)
        Homeworks (5): 25%
        Mid-term exam: 30% (23rd Oct. (Wed.), after Reading Days.)
        Final exam: 30%

    I If you have an official excuse for a quiz/exam, a make-up will be provided. For home-works, make-ups will be decided on a case-by-case basis: an official excuse for at least three days immediately before the deadline is necessary.

    I Home-works: Python or C++.

    I Teaching Assistant: Vahid Azizi [email protected]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 4 / 475

  • Course Introduction Course Logistics

    Homework Submission via Grader

    Check after a week: https://cantaloupe.eecs.jacobs-university.de/login.php

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 5 / 475

    Course Introduction Course Logistics

    Teaching Philosophy

    I No question will be ridiculed.

    I Some questions may be taken offline or might be postponed.

    I Homeworks are where you really learn!

    I Not all material will be in the slides. Some material will be derived on the board; you should take lecture-notes yourselves.

    I Material done on the board is especially likely to appear in quizzes/exams.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 6 / 475

  • Course Introduction Course Logistics

    Expert of the Day

    I At the beginning of each lecture, a student will summarize the last lecture in 5 minutes (more than 7 minutes will be penalized).

    I She/He can also highlight things which need more clarification.

    I A student will volunteer at the end of each lecture to be the expert in the next lecture.

    I Your participation counts as 1 quiz. Everyone should do it at least once.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 7 / 475

    Course Introduction Course Logistics

    Coming Up...

    Our Next Expert Is?

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 8 / 475

  • Course Introduction Course Logistics

    Textbooks

    Main textbook:

    I Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, 3rd Edition, 2010, Pearson International Edition.

    Other references:

    I Uwe Schöning, Logic for Computer Scientists, English 2001, German 2005, Birkhäuser.

    I Daphne Koller and Nir Friedman, Probabilistic Graphical Models: Principles and Techniques, 2009, MIT Press.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 9 / 475

    Course Introduction Course Logistics

    Syllabus
    I Introduction to AI; Intelligent agents: Chapters 1, 2.
    I A Brief Introduction to Python (skipped this year).
    I Solving problems by Searching
      I BF, DF, A* search: Proofs. Chapter 3.
      I Sampling Discrete Probability Distributions, Simulated Annealing, Genetic Algorithms: Real-world example. Chapter 4.
      I Adversarial search (Games): Minimax, pruning. Chapter 5.
    I Logical Agents (also Schöning's book)
      I Propositional Logic: Inference with Resolution. Chapter 7.
    I Uncertain Knowledge & Reasoning (also Koller's book)
      I Introduction to Probabilistic Reasoning: Chapter 13.
      I Bayesian Networks: Various Inference Approaches. Chapter 14.
    I Introduction to Machine-Learning
      I Supervised Learning: Information Entropy, Decision Trees, ANNs: Chapter 18.
      I Model Estimation: Priors, Maximum Likelihood, Kalman Filter, EKF, RANSAC.
      I Learning Probabilistic Models: Chapter 20.
      I Unsupervised Learning: Clustering (K-Means, Mean-Shift Algorithm).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 10 / 475

  • Course Introduction What is Artificial Intelligence (AI)?

    Defining AI: Human-centered vs. Rationalist Approaches

    Thinking Humanly: "[The automation of] activities that we associate with human thinking, activities such as decision-making, problem-solving, learning..." (Bellman, 1978)

    Thinking Rationally: "The study of computations that make it possible to perceive, reason, and act." (Winston, 1992)

    Acting Humanly: "The art of creating machines that perform functions that require intelligence when performed by people." (Kurzweil, 1990)

    Acting Rationally: "Computational Intelligence is the study of the design of intelligent agents." (Poole et al., 1998)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 11 / 475

    Course Introduction What is Artificial Intelligence (AI)?

    Acting Humanly

    The Turing Test (1950)

    The test is passed if a human interrogator, after posing some written questions, cannot determine whether the responses come from a human or from a computer.

    Total Turing Test

    There is a video signal for the interrogator to test the subject's perceptual abilities, as well as a hatch to pass physical objects through.

    Figure 1: Alan Turing (1912-1954)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 12 / 475

  • Course Introduction What is Artificial Intelligence (AI)?

    Reverse Turing Test: CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart)

    Figure 2: Source: http://www.captcha.net/

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 13 / 475

    Course Introduction What is Artificial Intelligence (AI)?

    Capabilities required for passing the Turing test: the 6 main disciplines composing AI.

    The Turing Test

    I Natural language processing

    I Knowledge representation

    I Automated reasoning

    I Machine learning

    The Total Turing Test

    I Computer vision

    I Robotics

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 14 / 475

  • Course Introduction What is Artificial Intelligence (AI)?

    Thinking Humanly

    Trying to discover how human minds work. Three ways:

    I Introspection

    I Psychological experiments on humans

    I Brain imaging: Functional Magnetic Resonance Imaging (fMRI), Positron Emission Tomography (PET), EEG, etc.

    Cognitive Science constructs testable theories of mind:

    I Computer models from AI

    I Experimental techniques from psychology

    Figure 3: fMRI image (source: http://www.umsl.edu/~tsytsarev)

    YouTube video (1:00-4:20): Reading the mind by fMRI

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 15 / 475

    Course Introduction What is Artificial Intelligence (AI)?

    Thinking Rationally

    I Aristotle's syllogisms (384-322 B.C.): "right thinking", deductive logic.

    I The logicist tradition in AI. Good old AI. Logic programming.
    I Problems:
      I Cannot handle uncertainty.
      I Does not scale up due to high computational requirements.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 16 / 475

  • Course Introduction What is Artificial Intelligence (AI)?

    Acting Rationally

    Definition 1.1 (Agent)

    An agent is something that acts, i.e.,

    I perceives the environment,

    I acts autonomously,

    I persists over a prolonged time-period,

    I adapts to change,

    I creates and pursues goals (by planning), etc.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 17 / 475

    Course Introduction What is Artificial Intelligence (AI)?

    Acting Rationally: The Rational Agent Approach

    Definition 1.2 (Rational Agent)

    A rational agent is one that acts so as to achieve the best outcome, or, when there is uncertainty, the best expected outcome. This approach is more general, because:

    I Rationality is more general than logical inference, e.g. reflex actions.

    I Rationality is more amenable to scientific development than approaches based on human behavior or thought.

    I Rationality is well defined mathematically: in a way, it is just optimization under constraints. When, due to computational demands in a complicated environment, the agent cannot maintain perfect rationality, it resorts to limited rationality.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 18 / 475

  • Course Introduction What is Artificial Intelligence (AI)?

    Acting Rationally: The Rational Agent Approach

    This course therefore concentrates on general principles of rational agents and on components for constructing them.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 19 / 475

    Course Introduction Foundations of AI

    Disciplines Contributing to AI. I

    Philosophy

    I Rationalism: Using the power of reasoning to understand the world.
    I How does the mind arise from the physical brain?
      I Dualism: Part of the mind is separate from matter/nature. Proponent: René Descartes, among others.
      I Materialism: The brain's operation constitutes the mind. Claims that free will is just the way the perception of available choices appears to the choosing entity.

    Mathematics
    Logic, computational tractability, probability theory.

    Economics
    Utility theory, decision theory (probability theory + utility theory), game theory.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 20 / 475

  • Course Introduction Foundations of AI

    Neuroscience
    The exact way the brain enables thought is still a scientific mystery. However, the mapping between areas of the brain and the parts of the body they control or receive sensory input from can be found, though it can change over the course of a few weeks.

    Figure 4: The human cortex with the various lobes shown in different colors (frontal, parietal, occipital, and temporal lobes; motor cortex, visual cortex, cerebellum, spinal cord). The information from the visual cortex gets channeled into the dorsal (where/how) and the ventral (what) streams.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 21 / 475

    Course Introduction Foundations of AI

    The Human Brain

    The human brain has about 10^11 neurons, with 10^14 synapses, a cycle time of 10^-3 s, and 10^14 memory updates/sec. Refer to Fig. 1.3 in the book.

    Figure 5: TED Video: Dr. Jill Bolte Taylor

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 22 / 475

  • Course Introduction Foundations of AI

    Psychology

    Behaviorism (stimulus/response), Cognitive psychology.

    Computer Engineering

    Hardware and Software. Computer vision.

    Linguistics

    Natural language processing.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 23 / 475

    Course Introduction Foundations of AI

    Control Theory and Cybernetics

    Figure 6: A typical control system with feedback. Source: https://www.ece.cmu.edu/~koopman/des_s99/control_theory/

    The basic idea of control theory is to use sensory feedback to alter the system inputs so as to minimize the error between the desired and observed output. Basic example: controlling the movement of an industrial robotic arm to a desired orientation.
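    To make the feedback idea concrete, here is a minimal sketch in Python (not from the slides) of a discrete-time proportional controller driving a simple first-order plant toward a setpoint; the gain kp, the time step dt, and the plant model dx/dt = u are illustrative assumptions.

    # Minimal proportional-feedback sketch (illustrative assumptions throughout).
    def simulate_p_control(setpoint=1.0, kp=2.0, dt=0.05, steps=100):
        x = 0.0                      # observed output (e.g., arm orientation)
        for _ in range(steps):
            error = setpoint - x     # desired output minus observed output
            u = kp * error           # proportional feedback: input from the error
            x += dt * u              # assumed first-order plant: dx/dt = u
        return x

    print(simulate_p_control())      # approaches the setpoint 1.0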

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 24 / 475

  • Course Introduction Foundations of AI

    Control Theory and Cybernetics

    I Norbert Wiener (1894-1964): book Cybernetics (1948).

    I Modern control theory and AI have a considerable overlap: both have the goal of designing systems which maximize an objective function over time.

    I The difference lies in: 1) the mathematical techniques used; 2) the application areas.

    I Control theory focuses more on the calculus of continuous variables and matrix algebra, whereas AI also uses tools of logical inference and planning.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 25 / 475

    Course Introduction History of AI

    History of AI I

    Gestation Period (1943-1955)

    McCulloch and Pitts (1943) proposed a model for the neuron. Hebbian learning (1949) for updating inter-neuron connection strengths was developed. Alan Turing published "Computing Machinery and Intelligence" (1950), proposing the Turing test, machine learning, genetic algorithms, and reinforcement learning.

    Birth of AI (1956)

    The Dartmouth workshop, organized by John McCarthy (later of Stanford).

    Early Enthusiasm (1952-1969)

    LISP developed. Several small successes, including theorem proving, etc. Perceptrons (Rosenblatt, 1962) developed.

    Reality hits (1966-1973)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 26 / 475

  • Course Introduction History of AI

    History of AI II

    After the Sputnik launch (1957), automatic Russian-to-English translation was attempted. It failed miserably.

    1. The spirit is willing, but the flesh is weak. Translated to:

    2. The vodka is good but the meat is rotten.

    The scaling-up of computational complexity could not be handled. Single-layer perceptrons were found to have very limited representational power. Most government funding stopped.

    Knowledge-based Systems (1969-1979)

    Use of expert, domain-specific knowledge and cookbook rules collected from experts for inference. Examples: DENDRAL (1969) for inferring molecular structure from mass-spectrometer results; MYCIN (blood-infection diagnosis) with 450 rules.

    AI in Industry (1980-present)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 27 / 475

    Course Introduction History of AI

    History of AI III

    Companies like DEC, DuPont, etc. developed expert systems. The industry boomed, but the extravagant promises were not all fulfilled, leading to the "AI winter".

    Return of Neural Networks (1986-present)

    The back-propagation learning algorithm was developed. The connectionist approach competes with logicist and symbolic approaches. NN research bifurcates.

    AI embraces Control Theory and Statistics (1987-present)

    Rigorous mathematical methods began to be reused instead of ad hoc methods. Examples: Hidden Markov Models (HMMs), Bayesian Networks, etc. Sharing of real-life data-sets started.

    Intelligent agents (1995-present)

    Growth of the Internet. AI in web-based applications (-bots).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 28 / 475

  • Course Introduction History of AI

    History of AI IV

    Huge data-sets (2001-present)

    Learning based on very large data-sets. Example: filling in holes in a photograph; Hays and Efros (2007). Performance went from poor with 10,000 samples to excellent with 2 million samples.

    Figure 7: Source: Hays and Efros (SIGGRAPH 2007).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 29 / 475

    Course Introduction History of AI

    Reading Assignment (not graded)

    Read Sec. 1.3 of the textbook.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 30 / 475

  • Course Introduction State of the Art

    Successful Applications
    Intelligent Software Wizards and Assistants

    (a) Microsoft Office Assistant Clippit (b) Siri

    Figure 8: Wizards and Assistants.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 31 / 475

    Course Introduction State of the Art

    Logistics Planning

    Dynamic Analysis and Replanning Tool (DART). Used during the Gulf War (1990s) for the scheduling of transportation. DARPA stated that this single application paid back DARPA's 30-year investment in AI. DART won DARPA's Outstanding Performance by a Contractor award, for modification and transportation-feasibility analysis of the Time-Phased Force and Deployment Data used during Desert Storm. http://www.bbn.com

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 32 / 475

  • Course Introduction State of the Art

    Flow Machines
    2013 Best AI Video Award: http://www.aaaivideos.org

    Figure 9: Video (4:53)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 33 / 475

    Course Introduction State of the Art

    Intelligent Textbook
    2012 Best AI Video Award: http://www.aaaivideos.org

    Figure 10: Video (4:53)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 34 / 475

  • Course Introduction State of the Art

    DARPA Urban Challenge 2007

    Figure 11: Video (6:05)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 35 / 475

    Course Introduction State of the Art

    3D Planar-Patches based Simultaneous Localization and Mapping (SLAM): Scene Registration

    Figure 12: Collecting Data; Registered Point-Clouds; Registered Planar-Patches

    The Minimally Uncertain Maximum Consensus (MUMC) Algorithm

    Related to the RANSAC (Random Sample Consensus) algorithm that we will study.

    K. Pathak, A. Birk, N. Vaskevicius, and J. Poppinga, "Fast registration based on noisy planes with unknown correspondences for 3D mapping," IEEE Transactions on Robotics, vol. 26, no. 3, pp. 424-441, 2010.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 36 / 475

  • Course Introduction State of the Art

    New Sensing Technologies: Example Kinect

    (a) The Microsoft Kinect 3D camera (from Wikipedia)

    (b) A point-cloud obtained from it (from Willow Garage).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 37 / 475

    Course Introduction State of the Art

    RGBD Segmentation
    Unsupervised Clustering by the Mean-Shift Algorithm

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 38 / 475

  • Course Introduction State of the Art

    Object Recognition & Pose Estimation

    Figure 13: IEEE Int. Conf. on Robotics & Automation (ICRA) 2011: Perception Challenge. Our group won 2nd place, between Berkeley (1st) and Stanford (3rd). Video (2:39)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 39 / 475

    Python Brief Introduction

    Contents

    Python Brief Introduction
      Data-types
      Control Statements
      Functions
      Packages, Modules, Classes

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 40 / 475

  • Python Brief Introduction Data-types

    Built-in Data-types

    Type               Example                              Immutable?
    Numbers            12, 3.4, 7788990L, 6.1+4j, Decimal   yes
    Strings            "abcd", 'abc', "abc's"               yes
    Boolean            True, False                          yes
    Lists              [True, 1.2, "vcf"]                   no
    Dictionaries       {"A": 25, "V": 70}                   no
    Tuples             ("ABC", 1, 'Z')                      yes
    Sets / FrozenSets  {90, 'a'}, frozenset({'a', 2})       no / yes
    Files              f = open('spam.txt', 'r')            no
    Single Instances   None, NotImplemented, ...            yes

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 41 / 475

    Python Brief Introduction Data-types

    Sequences Istr, list, tuple

    Creation and Indexing

    a= "1234567"

    a[0]

    b = ['z', 'x', 'a', 'k']

    b[-1] == b[len(b)-1], b[-1] is b[len(b)-1]

    x= """This is a

    multiline string"""

    print x

    y = '''me
    too'''

    print y

    len(y)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 42 / 475

  • Python Brief Introduction Data-types

    Sequences IIstr, list, tuple

    Immutability

    a[1] = 'q'  # Fails: strings are immutable

    b[1] = 's'

    c= a;

    c is a, c==a

    a= "xyz"; c is a

    Help

    dir(b)

    help(b.sort)

    b.sort()

    b # In-place sorting

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 43 / 475

    Python Brief Introduction Data-types

    Sequences IIIstr, list, tuple

    Slicing

    a[1:2]

    a[0:-1]

    a[:-1], a[3:]

    a[:]

    a[0:len(a):2]

    a[-1::-1]

    Repetition & Concatenation

    c=a*2

    b*3

    a = a + '5mn'

    d = b + ['abs', 1, False]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 44 / 475

  • Python Brief Introduction Data-types

    Sequences IVstr, list, tuple

    Nesting

    A=[[1,2,3],[4,5,6],[7,8,9]]

    A[0]

    A[0][2]

    A[0:-1][-1]

    A[3] # Error

    List Comprehension

    q= [x.isdigit() for x in a]

    print q

    p=[(r[1]**2) for r in A if r[1]< 8] # Power

    print p

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 45 / 475

    Python Brief Introduction Data-types

    Sequences Vstr, list, tuple

    Dictionaries

    D = {0: 'Rhine', 1: "Indus", 3: "Hudson"}

    D[0]

    D[6] # Error

    D[6]="Volga"

    dir(D)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 46 / 475

  • Python Brief Introduction Data-types

    Numbers I

    I Math Operations

    a = 10; b = 3; c = 10.5; d = 1.2345
    a/b
    a//b, c//b  # Floor division: b*(a//b) + (a%b) == a
    d**c  # Power
    type(10**40)  # Unlimited integers
    import math
    import random
    dir(math)
    math.pi  # repr(x)
    print math.pi  # str(x)
    s = "e is %08.3f and list is %s" % (math.e, [a, 1, 1.5])
    random.random()  # [0, 1)
    random.choice(['apple', 'orange', 'banana', 'kiwi'])

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 47 / 475

    Python Brief Introduction Data-types

    Numbers II

    I Booleans

    s1 = True
    s2 = 3 < 5

  • Python Brief Introduction Data-types

    Dynamic Typing I

    I Variables are names and have no types. They can refer to objects of any type. The type is associated with objects.

    a = "abcf"
    b = "abcf"
    a == b, a is b
    a = 2.5

    I Objects are garbage-collected automatically.

    I Shared references

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 49 / 475

    Python Brief Introduction Data-types

    Dynamic Typing II

    a = [4, 1, 5, 10]
    b = a
    b is a
    a.sort()
    b is a
    a.append('w')
    a
    b is a
    a = a + ['w']
    a
    b is a
    b
    x = 42
    y = 42
    x is y, x == y
    x = [1, 2, 3]; y = [1, 2, 3]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 50 / 475

  • Python Brief Introduction Data-types

    Dynamic Typing III

    x is y, x == y
    x = 123; y = 123
    x is y, x == y  # Wassup?
    # Assignments create references
    L = [1, 2, 3]
    M = ['x', L, 'c']
    M
    L[1] = 0
    M
    # To copy
    L = [1, 2, 3]
    M = ['x', L[:], 'c']
    M
    L[1] = 0
    M

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 51 / 475

    Python Brief Introduction Control Statements

    Control Statements I

    I Mind the indentation! One extra carriage return is needed to finish a block in interactive mode.

    import sys
    tmp = sys.stdout
    sys.stdout = open('log.txt', 'a')
    x = random.random()
    if x < 0.25:
        [y, z] = [-1, 4]
    elif 0.25

  • Python Brief Introduction Control Statements

    Loops I

    I While

    i = 0
    while i < 5:
        s = raw_input("Enter an int: ")
        try:
            j = int(s)
        except:
            print 'invalid input'
            break
        else:
            print "Its square is %d" % j**2
            i += 1
    else:
        print "exited normally without break"

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 53 / 475

    Python Brief Introduction Control Statements

    Loops II

    I For

    X = range(2, 10, 2)  # [2, 4, 6, 8]
    N = 7
    for x in X:
        if x > N:
            print x, "is >", N
            break
    else:
        print 'no number >', N, 'found'

    for line in open('test.txt', 'r'):
        print line.upper()

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 54 / 475

  • Python Brief Introduction Functions

    Functions I

    I Arguments are passed by assignment

    def change_q(p, q):
        for i in p:
            if i not in q: q.append(i)
        p = 'abc'

    x = ['a', 'b', 'c']  # Mutable
    y = 'bdg'            # Immutable
    print x, y
    change_q(q=x, p=y)
    print x, y

    I Output

    ['a', 'b', 'c'] bdg
    ['a', 'b', 'c', 'd', 'g'] bdg

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 55 / 475

    Python Brief Introduction Functions

    Functions II

    I Scoping rule: LEGB = Local function, Enclosing function(s), Global (module), Built-ins.

    v = 99
    def local():
        def locallocal():
            v = u
            print "inside locallocal ", v
        u = 7; v = 2
        locallocal()
        print "outside locallocal ", v


    def glob1():
        global v
        v += 1

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 56 / 475

  • Python Brief Introduction Functions

    Functions III

    local()
    print v
    glob1()
    print v

    I Output

    inside locallocal 7

    outside locallocal 2

    99

    100

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 57 / 475

    Python Brief Introduction Packages, Modules, Classes

    Packages, Modules I
    Python Standard Library: http://docs.python.org/library/

    I Folder structure

      root/
          pack1/
              __init__.py
              mod1.py
              pack2/
                  __init__.py
                  mod2.py

    I root should be in one of the following: 1) the program home folder, 2) PYTHONPATH, 3) the standard lib folder, or 4) a .pth file on the path. The full search-path is in sys.path.

    I Importing

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 58 / 475

  • Python Brief Introduction Packages, Modules, Classes

    Packages, Modules II
    Python Standard Library: http://docs.python.org/library/

    import pack1.mod1

    import pack1.mod3 as m3

    from pack1.pack2.mod2 import A,B,C

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 59 / 475

    Python Brief Introduction Packages, Modules, Classes

    Classes I

    I Example

    class Animal(object):  # new-style classes
        count = 0
        def __init__(self, _name):
            Animal.count += 1
            self.name = _name
        def __str__(self):
            return 'I am ' + self.name
        def make_noise(self):
            print (self.speak() + " ") * 3

    class Dog(Animal):
        def __init__(self, _name):
            Animal.__init__(self, _name)
            self.count = 1

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 60 / 475

  • Python Brief Introduction Packages, Modules, Classes

    Classes II

        def speak(self):
            return "woof"

    I Full examples in python examples.tgz

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 61 / 475

    Python Brief Introduction Packages, Modules, Classes

    Useful External Libraries

    I The SciPy library is a vast Python library for scientific computations.
      I http://www.scipy.org/
      I In Ubuntu, install python-scitools in the package-manager.
      I Library for doing linear algebra, statistics, FFT, integration, optimization, plotting, etc.

    I Boost is a very mature and professional C++ library. It has Python bindings. Refer to: http://www.boost.org/doc/libs/1_47_0/libs/python/doc/

    I For creation of Python graph data-structures (leveraging Boost) look at: http://projects.skewed.de/graph-tool/

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 62 / 475

  • Agents and their Task-environments

    Contents

    Agents and their Task-environments
      Agent Types

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 63 / 475

    Agents and their Task-environments

    A general agent

    [Diagram: an agent interacting with its environment; sensors deliver percepts, actuators perform actions, and a "?" box stands for the agent program.]

    Definition 3.1 (A Rational Agent)

    For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in (prior) knowledge the agent has.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 64 / 475

  • Agents and their Task-environments

    Properties of the Task Environment

    Fully observable: the relevant environment state is fully exposed by the sensors. vs. Partially observable: e.g. a limited field-of-view sensor; or Unobservable.

    Single agent vs. Multi-agent.

    Deterministic: the next state is completely determined by the current state and the action of the agent. vs. Stochastic: uncertainties quantified by probabilities.

    Episodic: the agent's experience is divided into atomic episodes, each independent of the last, e.g. an assembly-line robot. vs. Sequential: the current action affects future actions, e.g. a chess-playing agent.

    Static: the environment is unchanging. Semi-dynamic: the agent's performance measure changes with time, but the environment is static. vs. Dynamic: the environment changes while the agent is deliberating.

    Discrete: the state of the environment is discrete, e.g. chess-playing, traffic control. vs. Continuous: the state changes smoothly in time, e.g. a mobile robot.

    Known: the rules of the game / laws of physics of the environment are known to the agent. vs. Unknown: the agent must learn the rules of the game.

    I Hardest case: partially observable, multi-agent, stochastic, sequential, dynamic, continuous, and unknown.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 65 / 475

    Agents and their Task-environments

    Example of a Partially Observable Environment

    Figure 14: A mobile robot operating GUI.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 66 / 475

  • Agents and their Task-environments Agent Types

    Agent Types

    Four basic types in order of increasing generality:

    I Simple reflex agents

    I Reflex agents with state

    I Goal-based agents

    I Utility-based agents

    All of these can be turned into learning agents.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 67 / 475

    Agents and their Task-environments Agent Types

    Simple reflex agents

    [Diagram: a simple reflex agent; the sensors report "what the world is like now", the condition-action rules yield "what action I should do now", and the actuators execute it.]

    Algorithm 1: Simple-Reflex-Agent

    input     : percept
    output    : action
    persistent: rules, a set of condition-action rules

    state ← Interpret-Input(percept);
    rule ← Rule-Match(state, rules);
    action ← rule.action;
    return action
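    As a concrete illustration of Algorithm 1, here is a minimal Python sketch for a two-cell vacuum world; the percept format, the rule table, and the helper names are assumptions for the example, not the textbook's code.

    # Illustrative simple reflex agent for an assumed two-cell vacuum world.
    RULES = {                                   # condition -> action table
        "dirty": "Suck",
        ("clean", "A"): "Right",
        ("clean", "B"): "Left",
    }

    def interpret_input(percept):               # state <- Interpret-Input(percept)
        location, status = percept
        return "dirty" if status == "Dirty" else ("clean", location)

    def simple_reflex_agent(percept):
        state = interpret_input(percept)
        return RULES[state]                     # Rule-Match + rule.action

    print(simple_reflex_agent(("A", "Dirty")))  # -> Suck
    print(simple_reflex_agent(("A", "Clean")))  # -> Right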

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 68 / 475

  • Agents and their Task-environments Agent Types

    Model-based reflex agents I

    [Diagram: a model-based reflex agent; an internal state, updated using "how the world evolves" and "what my actions do", feeds the condition-action rules that select the action.]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 69 / 475

    Agents and their Task-environments Agent Types

    Model-based reflex agents II

    Algorithm 2: Model-Based-Reflex-Agent

    input     : percept
    output    : action
    persistent: state, the agent's current conception of the world's state
                model, how the next state depends on the current state and action
                rules, a set of condition-action rules
                action, the most recent action, initially none

    state ← Update-State(state, action, percept, model);
    rule ← Rule-Match(state, rules);
    action ← rule.action;
    return action
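    A minimal Python sketch of Algorithm 2, showing how the internal state persists between calls; the callable world model and the rule table passed in are assumptions for the example.

    # Illustrative model-based reflex agent; model and rules are supplied by the caller.
    class ModelBasedReflexAgent(object):
        def __init__(self, model, rules):
            self.state = None      # current conception of the world's state
            self.model = model     # how the next state depends on state, action, percept
            self.rules = rules     # condition-action rules
            self.action = None     # most recent action, initially none

        def __call__(self, percept):
            # state <- Update-State(state, action, percept, model)
            self.state = self.model(self.state, self.action, percept)
            self.action = self.rules.get(self.state, "NoOp")   # Rule-Match + rule.action
            return self.action

    # toy usage: the assumed "model" just remembers the latest percept
    agent = ModelBasedReflexAgent(model=lambda s, a, p: p, rules={"hot": "cool-down"})
    print(agent("hot"))            # -> cool-down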

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 70 / 475

  • Agents and their Task-environments Agent Types

    Goal-based agents

    [Diagram: a goal-based agent; the internal state and the models of "how the world evolves" and "what my actions do" predict "what it will be like if I do action A", and the goals determine the action.]

    Figure 15: Includes search and planning.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 71 / 475

    Agents and their Task-environments Agent Types

    Utility-based agents

    [Diagram: a utility-based agent; predicted outcomes are scored by "how happy I will be in such a state" (utility) to choose the action.]

    Figure 16: An agent's utility function is its internalization of the performance measure.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 72 / 475

  • Goal-based Problem-solving Agents using Searching

    Contents

    Goal-based Problem-solving Agents using Searching
      The Graph-Search Algorithm
      Uninformed (Blind) Search
      Informed (Heuristic) Search

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 73 / 475

    Goal-based Problem-solving Agents using Searching

    Problem Solving Agents

    Algorithm 3: Simple-Problem-Solving-Agent

    input     : percept
    output    : action
    persistent: seq, an action sequence, initially empty
                state, the agent's current conception of the world's state
                goal, a goal, initially null
                problem, a problem formulation

    state ← Update-State(state, percept);
    if seq is empty then
        goal ← Formulate-Goal(state);
        problem ← Formulate-Problem(state, goal);
        seq ← Search(problem);
        if seq = failure then return a null action
    action ← First(seq);
    seq ← Rest(seq);
    return action

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 74 / 475

  • Goal-based Problem-solving Agents using Searching

    Searching for Solutions

    I State: The system-state x ∈ X parameterizes all properties of interest. The initial state is x0 and the set of goal-states is Xg. The set X of valid states is called the state-space.

    I Actions or Inputs: At each state x, there is a set of valid actions u ∈ U(x) that can be taken by the search agent to alter the state.

    I State Transition Function: How a new state x' is created by applying an action u to the current state x:

        x' = f(x, u)    (4.1)

    The transition may have a cost k(x, u) > 0. (A minimal problem-formulation sketch in code follows Fig. 17 below.)

    Figure 17: Nodes are states, and edges are state-transitions caused by actions (initial state x0, goal state xg, valid actions u1, u2).
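    The state/action/transition formulation above maps directly onto code. Below is a minimal sketch, assuming a 4-connected grid-world route-finding problem; the class name GridProblem and its methods mirror x0, Xg, U(x), f(x,u) and k(x,u) but are otherwise illustrative.

    # Minimal problem formulation mirroring x0, Xg, U(x), f(x,u), k(x,u) (assumed grid world).
    class GridProblem(object):
        MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

        def __init__(self, start, goal, blocked=frozenset()):
            self.x0 = start                    # initial state x0
            self.goal = goal                   # goal state (Xg = {goal})
            self.blocked = set(blocked)        # impassable cells

        def is_goal(self, x):
            return x == self.goal

        def actions(self, x):                  # U(x): valid actions in state x
            return [u for u, d in self.MOVES.items()
                    if (x[0] + d[0], x[1] + d[1]) not in self.blocked]

        def result(self, x, u):                # f(x, u): state-transition function
            dx, dy = self.MOVES[u]
            return (x[0] + dx, x[1] + dy)

        def cost(self, x, u):                  # k(x, u) > 0: step cost
            return 1.0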

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 75 / 475

    Goal-based Problem-solving Agents using Searching

    Examples I

    (a) An instance of the 8-puzzle (start and goal states).
    (b) A node of the search-graph: it stores the state, a pointer to the parent and the generating action, and bookkeeping such as depth = 6, g = 6. Arrows point to parent-nodes.

    Figure 18: The 8-puzzle problem

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 76 / 475

  • Goal-based Problem-solving Agents using Searching

    Examples II

    Figure 19: An instance of the 8-Queens problem

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 77 / 475

    Goal-based Problem-solving Agents using Searching

    Examples III

    Figure 20: The map of Romania, with road distances between cities. An instance of the route-planning problem given a map.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 78 / 475

  • Goal-based Problem-solving Agents using Searching

    Examples IV

    Figure 21: A 2D occupancy grid map created using a Laser-Range-Finder (LRF).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 79 / 475

    Goal-based Problem-solving Agents using Searching

    Examples V

    Figure 22: Result of the A* path-planning algorithm on a multi-resolution quad-tree.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 80 / 475

  • Goal-based Problem-solving Agents using Searching The Graph-Search Algorithm

    Graph-Search (compare with Textbook Fig. 3.7)

    Algorithm 4: Graph-Search

    input: x0, Xg
    D ← ∅  (the explored-set/dead-set/passive-set);
    F.Insert(x0, g(x0) = 0, ℓ(x0) = h(x0))  (the frontier/active-set);
    while F not empty do
        x, g(x), ℓ(x) ← F.Choose()  (remove the best x from F);
        if x ∈ Xg then return SUCCESS;
        D ← D ∪ {x};
        for u ∈ U(x) do
    1       x' ← f(x, u),  g(x') ← g(x) + k(x, u);
            if (x' ∉ D) and (x' ∉ F) then
                F.Insert(x', g(x'), ℓ(x') = g(x') + h(x', Xg));
            else if (x' ∈ F) then
    2           F.Resolve-Duplicate(x', g(x'), ℓ(x'));
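    The pseudocode above can be realized with a binary-heap frontier. The following is a minimal Python sketch (not the textbook's code), assuming a problem object with the is_goal/actions/result/cost interface of the earlier grid sketch and a heuristic h(x); stale heap entries stand in for F.Resolve-Duplicate.

    import heapq, itertools

    def graph_search(problem, h=lambda x: 0.0):
        # Best-first graph search on l(x) = g(x) + h(x); h = 0 gives Dijkstra,
        # an admissible and consistent h gives A*.
        tie = itertools.count()                            # heap tie-breaker
        frontier = [(h(problem.x0), next(tie), 0.0, problem.x0, None)]   # F
        explored, best_g, parent = set(), {problem.x0: 0.0}, {}
        while frontier:
            l, _, g, x, par = heapq.heappop(frontier)      # F.Choose(): minimum l(x)
            if x in explored or g > best_g.get(x, float("inf")):
                continue                                   # stale entry (lazy Resolve-Duplicate)
            parent[x] = par
            if problem.is_goal(x):
                return g, parent                           # SUCCESS: optimal cost reached
            explored.add(x)                                # D <- D U {x}
            for u in problem.actions(x):                   # for u in U(x)
                x2 = problem.result(x, u)                  # x' = f(x, u)
                g2 = g + problem.cost(x, u)                # g(x') = g(x) + k(x, u)
                if x2 not in explored and g2 < best_g.get(x2, float("inf")):
                    best_g[x2] = g2
                    heapq.heappush(frontier, (g2 + h(x2), next(tie), g2, x2, x))
        return None, parent                                # frontier empty: failure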

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 81 / 475

    Goal-based Problem-solving Agents using Searching The Graph-Search Algorithm

    Measuring Problem-Solving Performance

    I Completeness: Is the algorithm guaranteed to find a solution if there is one?
    I Optimality: Does the strategy find optimal solutions?
    I Time & Space Complexity: How long does the algorithm take and how much memory is needed?
    I Branching factor b: The maximum number of successors (children) of any node.
    I Depth d: The shallowest goal-node level.
    I Max-length m: Maximum length of any path in the state-space.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 82 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Breadth-First Search (BFS)

    I The frontier F is implemented as a FIFO queue. The oldest element is chosen by Choose().
    I For a finite graph, it is complete, and optimum if all edges have the same cost. It finds the shallowest goal node.
    I The Graph-Search can return as soon as a goal-state is generated in line 1. (A FIFO-frontier sketch in code follows below.)
    I Number of nodes generated: b + b^2 + ... + b^d = O(b^d). This is the space and time complexity.
    I The explored set will have O(b^(d-1)) nodes and the frontier will have O(b^d) nodes.
    I Memory becomes more critical than computation time, e.g. for b = 10, d = 12, 1 KB/node, the search-time is 13 days, and the memory requirement is 1 petabyte (= 10^15 Bytes).
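    For contrast with the priority-queue frontier of the general graph-search sketch, here is a minimal FIFO-frontier BFS sketch under the same assumed problem interface, with the goal test applied as soon as a node is generated:

    from collections import deque

    def breadth_first_search(problem):
        # FIFO frontier; the oldest element is chosen first.
        if problem.is_goal(problem.x0):
            return problem.x0
        frontier = deque([problem.x0])
        explored = {problem.x0}
        while frontier:
            x = frontier.popleft()               # Choose(): oldest element
            for u in problem.actions(x):
                x2 = problem.result(x, u)
                if x2 not in explored:
                    if problem.is_goal(x2):      # return as soon as the goal is generated
                        return x2
                    explored.add(x2)
                    frontier.append(x2)
        return None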

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 83 / 475

    Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    BFS Example

    [Figure: four snapshots of BFS on a binary tree with root A, children B, C, and grandchildren D, E, F, G, expanding one level at a time.]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 84 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Dijkstra Algorithm or Uniform-Cost Search

    I The frontier F is implemented as a priority-queue. Choose() selects the element with the highest priority, i.e. the minimum path-length g(x).
    I The F.Resolve-Duplicate(x) function on line 2 updates the path-cost g(x) of x in the frontier F, if the new value is lower than the stored value. If the cost is decreased, the old parent is replaced by the new one. The priority queue is reordered to reflect the change.
    I It is complete and optimum.
    I When a node x is chosen from the priority-queue, the minimum-length path from x0 to it has been found. Its length is denoted as g*(x).
    I In other words, the optimum path-lengths of all the explored nodes in the set D have already been found.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 85 / 475

    Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Correctness of Dijkstra Algorithm

    Figure 23: The graph separation by the frontier F of the explored set D from the unexplored nodes. The node x in the frontier is chosen for further expansion. The set D is a tree.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 86 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Correctness of Dijkstra Algorithm: Observations

    I Unexplored nodes can only be reached through the frontier nodes.

    I An existing frontier node xf's cost can only be reduced through a node xc which has currently been chosen from the priority-queue, as it has the smallest cost in the frontier: this is done by the Resolve-Duplicate function. Afterwards, parent(xf) = xc.

    I Note that D remains a tree.

    I The frontier expands only through the unexplored children of the chosen frontier node; all of the children will have costs worse than their parent.

    I The costs of the successively chosen frontier nodes are non-decreasing.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 87 / 475

    Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Correctness of Dijkstra Algorithm: Proof by Induction

    Theorem 4.1 (When a frontier node xc is chosen for expansion, its optimum path has been found)

    Proof. The proof will be done as part of the proof of optimality of the A* algorithm, as the Dijkstra Algorithm is a special case of the A* algorithm. Refer to Lemma 4.6.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 88 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Depth-first Search (DFS)

    I The frontier F is implemented as a LIFO stack. The newest element is chosen by Choose().
    I For a finite graph, it is complete, but not optimum.
    I Explored nodes with no descendants in the frontier can be removed from memory! This gives a space-complexity advantage: O(bm) nodes. This happens automatically if the algorithm is written recursively.
    I Assuming that nodes at the same depth as the goal-node have no successors, with b = 10, d = 16, 1 KB/node, DFS will require 7 trillion (10^12) times less space than BFS! That is why it is popular in AI research.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 89 / 475

    Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Algorithm 5: Depth-limited-Search

    input: current-state x, depth d
    if x ∈ Xg then
        return SUCCESS;
    else if d = 0 then
        return CUTOFF;
    else
        for u ∈ U(x) do
            x' ← f(x, u);
            result ← Depth-limited-Search(x', d − 1);
            if result = SUCCESS then
                return SUCCESS;
            else if result = CUTOFF then
                cutoff-occurred ← true;
        if cutoff-occurred then return CUTOFF;
        else return NOT-FOUND;
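    A recursive Python sketch of Algorithm 5 under the same assumed problem interface; the strings returned mirror SUCCESS/CUTOFF/NOT-FOUND.

    def depth_limited_search(problem, x, d):
        # Returns 'SUCCESS', 'CUTOFF', or 'NOT-FOUND' (cf. Algorithm 5).
        if problem.is_goal(x):
            return "SUCCESS"
        if d == 0:
            return "CUTOFF"
        cutoff_occurred = False
        for u in problem.actions(x):
            result = depth_limited_search(problem, problem.result(x, u), d - 1)
            if result == "SUCCESS":
                return "SUCCESS"
            if result == "CUTOFF":
                cutoff_occurred = True
        return "CUTOFF" if cutoff_occurred else "NOT-FOUND"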

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 90 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    DFS Example (goal node M)

    [Figure: successive snapshots of DFS on a depth-3 binary tree (root A, leaves H-O), always expanding the deepest unexpanded node until the goal M is found.]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 91 / 475

    Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    Iterative Deepening Search (IDS)

    I As O(b^d) ≫ O(b^(d-1)), one can combine the benefits of BFS and DFS.

    I All the work from the previous iteration is redone, but this is acceptable, as the frontier-size is dominant.

    Algorithm 6: Iterative-Deepening-Search

    for d = 0 to ∞ do
        result ← Depth-limited-Search(x0, d);
        if result ≠ CUTOFF then
            return result
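    The iterative-deepening wrapper is then a one-loop sketch around the depth-limited search sketched above; the max_depth cap is an assumption added here only to keep the sketch finite.

    def iterative_deepening_search(problem, max_depth=50):
        # deepen the limit until the result is no longer CUTOFF
        for d in range(max_depth + 1):
            result = depth_limited_search(problem, problem.x0, d)
            if result != "CUTOFF":
                return result
        return "CUTOFF"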

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 92 / 475

  • Goal-based Problem-solving Agents using Searching Uninformed (Blind) Search

    [Figure: iterative deepening on a binary tree, showing the snapshots for depth limits 0 through 3; each iteration restarts the depth-limited search from the root A.]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 93 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Bellman's Principle of Optimality

    Theorem 4.2
    All subpaths of an optimal path are also optimal.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 94 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    A* Search

    ℓ(x), the est. path-cost from x0 to xg through x,
        = g(x), the path-cost from x0 to x, + h(x), the est. path-cost from x to xg    (4.2)

    ℓ(x) ≜ g(x) + h(x)                          (4.3)
    ℓ(xg) = g(xg), as h(xg) = 0.                (4.4)

    I h(x) is a heuristically estimated cost, e.g., for the map route-finding problem, h(x) = ||xg - x||.

    I ℓ(x) is the estimated cost of the cheapest solution through node x.

    I A* is a Graph-Search where the frontier F is a priority-queue with higher priority given to lower values of the evaluation function ℓ(x).

    I If no heuristics are taken, i.e. h(x) ≡ 0, A* reduces to Dijkstra's algorithm.
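    As a concrete heuristic for the grid-world sketch used earlier, the Manhattan distance below is admissible and consistent when every step costs 1; plugging it into the graph-search sketch yields A*, while h = 0 recovers Dijkstra. (The function and usage names are assumptions tied to those sketches.)

    def manhattan_h(goal):
        # Admissible, consistent heuristic for a unit-cost 4-connected grid.
        def h(x):
            return abs(x[0] - goal[0]) + abs(x[1] - goal[1])
        return h

    # illustrative usage with the earlier sketches:
    # problem = GridProblem(start=(0, 0), goal=(5, 3))
    # cost, parents = graph_search(problem, h=manhattan_h(problem.goal))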

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 95 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    A* Search: Resolve-Duplicate

    Similar to Dijkstra's Algorithm, F.Resolve-Duplicate(x) on line 2 of Algo. 4 updates the cost ℓ(x) in the frontier F, if the new value is lower than the stored value. If this occurs, the old parent of x is replaced by the new one. The priority queue is reordered to reflect the change.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 96 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Example heuristic function

    Figure 24: Values of hSLD, the straight-line distances to Bucharest.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 97 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Conditions for Optimality of A* I

    A* will find the optimal path if the heuristic cost h(x) is admissible and consistent.

    Definition 4.3 (Admissibility)

    h(x) is admissible if it never over-estimates the cost to reach the goal, i.e. h(x) is always optimistic.

    Definition 4.4 (Consistency)

    h(x) is consistent if, for every child xi (generated by action ui) of a node x, the triangle inequality holds:

        h(xg) = 0,                               (4.5)
        h(x) ≤ k(x, ui, xi) + h(xi)  for all i   (4.6)

    This is a stronger condition than admissibility, i.e. consistency ⇒ admissibility.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 98 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof: consistency ⇒ admissibility I

    We show the result by induction on n([x : xg]): the number of edges in the optimal path (with the least sum of edge-costs) from a node x to the goal xg. We assume that we have a consistent heuristic h(x). So, our induction hypothesis is that the consistent heuristic h(x) is also admissible.

    I Case n([x : xg]) = 1: We use the property of consistency that h(xg) = 0. Nodes x with n([x : xg]) = 1 are such that their optimum path is just one edge long, this edge being the one which connects them to the goal: thus, h*(x) = k(x, xg). Now, since consistency is assumed to hold,

        h(x) ≤ k(x, xg) + 0 = h*(x)    (4.7)

    This proves admissibility. Thus our hypothesis holds for n = 1.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 99 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof: consistency ⇒ admissibility II

    I Case n([x : xg]) = m: We assume now that our hypothesis holds for all nodes with optimal paths which have at most m - 1 edges, i.e. for nodes in S_{m-1} ≜ {y | n([y : xg]) < m}. Let x be a node with n([x : xg]) = m, i.e. its optimal path to the goal is m edges long. Let the successor of x on this optimal path be x'. Since the sub-paths of an optimal path are also optimal, x' ∈ S_{m-1}; hence the hypothesis holds for it, i.e. h(x') ≤ h*(x'). Now, since consistency holds for x,

        h(x) ≤ h(x') + k(x, x') ≤ h*(x') + k(x, x') = h*(x)    (4.8)

    The last step holds because x' is the successor of x on the latter's optimal path to the goal. Hence admissibility has been demonstrated for a general node x with n([x : xg]) = m.

    By induction, the result holds for nodes with all values of n([x : xg]), i.e. the entire graph.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 100 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    A* proof I

    Lemma 4.5 (ℓ(x) is non-decreasing along any optimal path, if h(x) is consistent)

    Figure 25: A dashed line between two nodes denotes that the nodes are connected by a path, but are not necessarily directly connected.

    To prove: Let xp be a node for which the optimum path with cost g*(xp) has been found (see figure). Nodes xm and xn lie on this optimum path such that xm precedes xn; then

        ℓ(xm) ≤ ℓ(xn)    (4.9)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 101 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    A* proof II

    Proof.

    I First note that since xm and xn lie on the optimum path to xp, their paths are also optimum and have lengths g*(xn) and g*(xm) respectively.
    I Let us first assume that xm is the parent of xn; then

        ℓ(xn) = h(xn) + g*(xn)                        (4.10)
              = h(xn) + g*(xm) + k(xm, xn)            (4.11)
              ≥ h(xm) + g*(xm) = ℓ(xm)    (by 4.6)    (4.12)

    I Now if xm is not the parent of xn but a predecessor, the inequality can be chained for every child-parent pair on the path between them, and we reach the same conclusion.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 102 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Recall Graph-Search

    Algorithm 7: Graph-Search

    input: x0, Xg
    D ← ∅  (the explored-set/dead-set/passive-set);
    F.Insert(x0, g(x0) = 0)  (the frontier/active-set);
    while F not empty do
        x, g(x) ← F.Choose()  (remove x from the frontier);
        if x ∈ Xg then return SUCCESS;
        D ← D ∪ {x};
        for u ∈ U(x) do
            x' ← f(x, u),  g(x') ← g(x) + k(x, u);
            if (x' ∉ D) and (x' ∉ F) then
                F.Insert(x', g(x') + h(x', Xg));
            else if (x' ∈ F) then
                F.Resolve-Duplicate(x');

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 103 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    A* proof I

    Lemma 4.6 (At selection for expansion, a node's optimum path has been found)

    To prove: In every iteration k of A*, the node x selected for expansion by the frontier Fk (x has the minimum value of ℓ(x) in Fk) is such that, at selection:

        g(x) = g*(x).    (4.13)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 104 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof of Lemma 4.6

    I The proof is by induction on the iteration number N. At N = 1, the frontier F1 selects its only node x0 with cost g(x0) = g*(x0) = 0.

    I Assume that the induction hypothesis holds for N = 1 ... k. Now we need to show that it holds for iteration k + 1.

    I Assume that Fk+1 selects xn at this iteration. All frontier nodes have their parents in D. At the time of selection, parent(xn) = xs.

    I Suppose that the path through xs is not the optimal path for xn, but the optimal path is π, as shown in the figure in blue. This path exists in the graph at iteration k + 1; whether or not it will ever be discovered in future iterations is irrelevant.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 105 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof of Lemma 4.6 contd.

    Figure 26: The path π shown in blue is the assumed optimal path from x0 to xn. Since xm ∈ D at iteration k + 1, by the induction hypothesis it must have been selected by some Fi, i < k + 1, and hence its path π(x0 : xm) is optimum.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 106 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof of Lemma 4.6 contd.

    I Note that the assumed optimal path π has to pass through a node which is in Fk+1, because the frontier separates the dead nodes D from the unknown nodes and all expansion occurs through the frontier nodes.

    I Let xp be the first node on π to belong to Fk+1. Let its parent in π be xm ∈ D.

    I Thus, the entire assumed optimal path consists of the following sub-paths (⊕ stands for path-concatenation):

        π(x0 : xn) = π(x0 : xm) ⊕ k(xm : xp) ⊕ π(xp : xn)    (4.14)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 107 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof of Lemma 4.6 contd.

    I The cost of xp in Fk+1 is

        g_{k+1}(xp) = g*(xm) + k(xm : xp) = g*(xp)            (4.15)
        ℓ_{k+1}(xp) = ℓ*(xp) = h(xp) + g*(xp).                (4.16)

    I As xp lies on the optimum path to xn, from Lemma 4.5,

        ℓ*(xp) ≤ ℓ*(xn)                                        (4.17a)
               ≤ ℓ_{k+1}(xn), the cost of xn at k + 1.         (4.17b)

    I However, xn was selected by Fk+1 for expansion. Therefore,

        ℓ_{k+1}(xn) ≤ ℓ_{k+1}(xp) = ℓ*(xp)   (by 4.16).        (4.18)

    I Combining the results,

        ℓ*(xp) ≤ ℓ*(xn) ≤ ℓ_{k+1}(xn) ≤ ℓ*(xp)   (by 4.17a, 4.18)    (4.19)
        ⇒ ℓ*(xn) = ℓ_{k+1}(xn).                                       (4.20)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 108 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Proof of Lemma 4.6 contd.

    As ℓ*(xn) = ℓ_{k+1}(xn), g*(xn) + h(xn) = g_{k+1}(xn) + h(xn), so

        g*(xn) = g_{k+1}(xn).    (4.21)

    Therefore, the path-cost g_{k+1}(xn) at k + 1 is indeed optimum. This is in contradiction with what we assumed with our alternate optimal-path hypothesis π. Therefore, we conclude that the optimal path with cost g*(xn) has been found at N = k + 1 when xn is selected by Fk+1. By extension, when the goal point is selected by the frontier, its optimum path with cost g*(xg) has been found.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 109 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Lemma 4.7 (Monotonicity of Expansion)

    Let the node selected by Fj be xj, and that selected by Fj+1 be xj+1. Then it must be true that

    ℓ(xj) ≤ ℓ(xj+1). (4.22)

    Why?

    Proof. Sketch: You have to consider two cases:

    I At iteration j + 1, xj+1 is a child of xj .

    I At iteration j + 1, xj+1 is not a child of xj .

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 110 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Consequences of the Monotonicity of Expansion

    Remark 4.8
    This shows that at iteration N = j, if Fj selects xj, then:

    I All nodes x with ℓ(x) < ℓ(xj) have already been expanded (i.e. they have died), and some nodes with ℓ(x) = ℓ(xj) have also been expanded.

    I In particular, when the first goal is found, all nodes x with ℓ(x) < g∗(xg) have already been expanded, and some nodes with ℓ(x) = g∗(xg) have also been expanded.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 111 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Figure 27: Region searched before finding a solution: Dijkstra path search

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 112 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Figure 28: Region searched before finding a solution: A∗ path search. The number of nodes expanded is the minimum possible.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 113 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Properties of A∗

    I The paths are optimal w.r.t. the cost function, but do you notice any undesirable properties of the planned paths?

    I Why are there fewer colors in Fig. 28 than in Fig. 27?

    I In Fig. 28, why are the red shades lighter in the beginning?

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 114 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    An alternative to A∗ for path-planning

    Figure 29: Funnel-planning using wave-front expansion. The path stays away from the obstacles.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 115 / 475

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Funnel path-planning

    Figure 30: Funnel-planning using wave-front expansion in 3D. Source: Brock and Kavraki, Decomposition based motion-planning: A framework for real-time motion-planning in high dimensional configuration spaces, ICRA 2001.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 116 / 475

  • Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Algorithm 8: Funnel-Planning

    input: x0, xg
    B ← findFreeSphere(x0)
    B.parent ← ∅
    Q.insert(B, ‖B.center − xg‖ − B.r)
    while Q not empty do
        B ← Q.getMin()
        D.insert(B)
        if xg ∈ B then
            return [D, B]
        for s ← 1 . . . Ns do
            x ← sampleOnSurface(B)
            if x ∉ D then
                C ← findFreeSphere(x)
                C.parent ← B
                Q.insert(C, ‖C.center − xg‖ − C.r)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 117 / 475
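
    As an illustration of Algorithm 8, a minimal Python sketch for a 2-D workspace with circular obstacles. The environment model, the sphere-of-free-space computation and all names (find_free_sphere, sample_on_surface, the obstacle list) are assumptions made for this example; the slides do not prescribe them.

import heapq, math, random

# Hypothetical workspace: the unit square with two circular obstacles (cx, cy, radius).
OBSTACLES = [(0.5, 0.5, 0.15), (0.25, 0.75, 0.10)]
LO, HI = 0.0, 1.0

class Sphere:
    def __init__(self, center, r, parent=None):
        self.center, self.r, self.parent = center, r, parent
    def contains(self, p):
        return math.dist(p, self.center) < self.r - 1e-9

def find_free_sphere(p):
    """Largest disc centred at p that stays inside the bounds and clears all obstacles."""
    r = min(p[0] - LO, HI - p[0], p[1] - LO, HI - p[1])
    for cx, cy, cr in OBSTACLES:
        r = min(r, math.dist(p, (cx, cy)) - cr)
    return Sphere(p, max(r, 0.0))

def sample_on_surface(B):
    a = random.uniform(0.0, 2.0 * math.pi)
    return (B.center[0] + B.r * math.cos(a), B.center[1] + B.r * math.sin(a))

def funnel_plan(x0, xg, n_samples=20, max_iter=500):
    B0 = find_free_sphere(x0)
    Q = [(math.dist(B0.center, xg) - B0.r, 0, B0)]   # priority = ||center - xg|| - r
    dead, tie = [], 1                                # tie counter keeps heap entries comparable
    while Q and max_iter > 0:
        max_iter -= 1
        _, _, B = heapq.heappop(Q)
        dead.append(B)
        if math.dist(B.center, xg) <= B.r:           # goal lies inside the selected sphere
            path = []                                # walk the parent chain back to x0
            while B is not None:
                path.append(B.center)
                B = B.parent
            return list(reversed(path))
        for _ in range(n_samples):
            x = sample_on_surface(B)
            if not any(D.contains(x) for D in dead):
                C = find_free_sphere(x)
                if C.r > 1e-3:
                    C.parent = B
                    heapq.heappush(Q, (math.dist(C.center, xg) - C.r, tie, C))
                    tie += 1
    return None

if __name__ == '__main__':
    print(funnel_plan((0.1, 0.1), (0.9, 0.9)))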

    Goal-based Problem-solving Agents using Searching Informed (Heuristic) Search

    Funnel path-planning

    Figure 31: Motion planning using the funnel potentials. Source: LaValle, Planning Algorithms, http://planning.cs.uiuc.edu/.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 118 / 475

  • Non-classical Search Algorithms

    Contents

    Non-classical Search Algorithms
    Hill-Climbing
    Sampling from a PMF
    Simulated Annealing
    Genetic Algorithms

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 119 / 475

    Non-classical Search Algorithms

    Local Search

    [Figure: a 1-D landscape of the objective function over the state space, with the current state, a shoulder, a flat local maximum, a local maximum and the global maximum marked]

    Figure 32: A 1-D state-space landscape. The aim is to find the global maximum.

    Local search algorithms are used when

    I The search-path itself is not important, but only the final optimal state, e.g. 8-Queens problem, job-shop scheduling, IC design, TSP, etc.

    I Memory efficiency is needed. Typically only one node is retained.

    I The aim is to find the best state according to an objective function to be optimized. We may be seeking the global maximum or the minimum. How can we reformulate the former to the latter?

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 120 / 475

  • Non-classical Search Algorithms Hill-Climbing

    Algorithm 9: Hill-Climbing

    input : x0, objective (value) function v(x) to maximize
    output: x, the state where a local maximum is achieved

    x ← x0 ;
    while True do
        y ← the highest-valued child of x ;
        if v(y) ≤ v(x) then return x ;
        x ← y

    To avoid getting stuck in plateaux:

    I Allow side-ways movements. Problems?

    I Random-restart: perform search from many randomly chosen x0 till an optimal solution is found.

    Figure 33: A ridge. The local maxima are not directly connected to each other.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 121 / 475

    Non-classical Search Algorithms Hill-Climbing

    Figure 34: (a) Starting state. h(x) is the number of pairs of queens attacking each other. Each node has 8 × 7 children. (b) A local minimum with h(x) = 1.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 122 / 475
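
    Combining Algorithm 9 with the 8-Queens formulation of Figure 34, a minimal Python sketch (the function names and the random-restart wrapper are our own choices for this illustration; the "value" being maximized is −h(x), i.e. h(x) is minimized):

import random

N = 8   # state: state[col] = row of the queen in that column

def attacking_pairs(state):
    """h(x): number of pairs of queens attacking each other (0..28)."""
    h = 0
    for i in range(N):
        for j in range(i + 1, N):
            if state[i] == state[j] or abs(state[i] - state[j]) == j - i:
                h += 1
    return h

def best_child(state):
    """Lowest-h child: move a single queen within its column (8 x 7 children)."""
    best, best_h = None, float('inf')
    for col in range(N):
        for row in range(N):
            if row != state[col]:
                child = state[:]
                child[col] = row
                ch = attacking_pairs(child)
                if ch < best_h:
                    best, best_h = child, ch
    return best, best_h

def hill_climb(state):
    while True:
        child, child_h = best_child(state)
        if child_h >= attacking_pairs(state):   # no strictly better child: stuck
            return state
        state = child

def random_restart():
    while True:
        state = hill_climb([random.randrange(N) for _ in range(N)])
        if attacking_pairs(state) == 0:         # a global optimum, i.e. a solution
            return state

if __name__ == '__main__':
    print(random_restart())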

  • Non-classical Search Algorithms Sampling from a PMF

    Sampling from a Probability Mass Function (pmf)

    Definition 5.1 (PMF)

    Given a discrete random-variable A with an exhaustive and ordered (can be user-defined, if no natural order exists) list of its possible values [a1, a2, . . . , an], its pmf P(A) is a table with the probabilities P(A = ai), i = 1 . . . n. Obviously,

    ∑_{i=1}^{n} P(A = ai) = 1.

    Problem 5.2 (Sampling a PMF)
    A uniform random number generator in the unit interval has the probability distribution function pu[0,1](x) as shown below in the figure. Python random.random() returns a sample x ∈ [0.0, 1.0). How can you use it to sample a given discrete distribution (PMF)?

    [Figure: pu[0,1](x) equals 1 on [0, 1) and 0 elsewhere; a sub-interval [a, b] has probability]

    P(x ∈ [a, b] ; 0 ≤ a ≤ b < 1) = b − a

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 123 / 475

    Non-classical Search Algorithms Sampling from a PMF

    Definition 5.3 (Cumulative PMF FA for a PMF P(A))

    FA(aj) ≜ P(A ≤ aj) = ∑_{i=1}^{n} P(A = ai) u(aj − ai), where (5.1)

    u(x) ≜ 0 if x < 0, and 1 if x ≥ 0. (5.2)

    It follows that

    FA(aj) = ∑_{i ≤ j} P(A = ai). (5.3)

    u(x) is called the discrete unit-step function or the Heaviside step function. What is FA(an)?

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 124 / 475

  • Non-classical Search Algorithms Sampling from a PMF

    I Form a vector of half-closed intervals

    s ≜ [ [0, FA(a1)), [FA(a1), FA(a2)), . . . , [FA(an−2), FA(an−1)), [FA(an−1), 1) ], where a0 is defined s.t. FA(a0) ≜ 0. (5.4)

    I Let ru be a sample from a uniform random-number generator in the unit interval [0, 1). Then,

    P(ru ∈ s[i]) = P(FA(ai−1) ≤ ru < FA(ai)) = FA(ai) − FA(ai−1) = P(A = ai), using (5.3). (5.5)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 125 / 475
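
    The construction (5.4)-(5.5) translates directly into Python; in the sketch below (the function and variable names are ours) bisect_right finds the interval s[i] that contains the uniform sample ru:

import random
from bisect import bisect_right
from itertools import accumulate

def sample_pmf(values, probs):
    """Draw a_i with probability P(A = a_i) from one uniform sample r_u in [0, 1)."""
    cdf = list(accumulate(probs))                      # F_A(a_1), ..., F_A(a_n) = 1
    r_u = random.random()
    i = min(bisect_right(cdf, r_u), len(values) - 1)   # interval [F_A(a_{i-1}), F_A(a_i))
    return values[i]

if __name__ == '__main__':
    values, probs = ['a', 'b', 'c'], [0.2, 0.5, 0.3]
    draws = [sample_pmf(values, probs) for _ in range(100000)]
    print({v: round(draws.count(v) / len(draws), 3) for v in values})  # approx. 0.2, 0.5, 0.3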

    Non-classical Search Algorithms Simulated Annealing

    Algorithm 10: Simulated-Annealing

    input : x0, objective (cost) function c(x) to minimize
    output: x, a locally optimum state

    x ← x0 ;
    for k ← k0 to ∞ do
        T ← Schedule(k) ;
        if T < ε (a small threshold) then return x ;
        y ← a randomly selected child of x ;
        ΔE ← c(y) − c(x) ;
        if ΔE < 0 then
            x ← y
        else
            x ← y with probability P(ΔE, T) ;

    P(ΔE, T) = 1 / (1 + e^{ΔE/T}) ≈ e^{−ΔE/T} (Boltzmann distribution)

    An example of a schedule is

    Tk = T0 ln(k0) / ln(k) (5.6)

    Applications: VLSI layouts, factory scheduling.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 126 / 475
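
    A generic Python sketch of Algorithm 10 with the schedule (5.6) and the approximate Boltzmann acceptance e^{−ΔE/T} (the argument names, the toy objective and all parameter values are assumptions made for this illustration):

import math, random

def simulated_annealing(x0, cost, random_child, T0=10.0, k0=2, eps=1e-3, max_iter=100000):
    """Minimize cost(x); returns a (locally) optimal state."""
    x = x0
    for k in range(k0, max_iter):
        T = T0 * math.log(k0) / math.log(k)                # schedule (5.6)
        if T < eps:
            return x
        y = random_child(x)
        dE = cost(y) - cost(x)
        if dE < 0 or random.random() < math.exp(-dE / T):  # Boltzmann acceptance
            x = y
    return x

if __name__ == '__main__':
    # Toy example: minimize a bumpy 1-D function over the integers 0..99.
    f = lambda i: (i - 70) ** 2 / 100.0 + 3.0 * math.sin(i)
    step = lambda i: max(0, min(99, i + random.choice([-1, 1])))
    print(simulated_annealing(0, f, step))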

  • Non-classical Search Algorithms Simulated Annealing

    Another Schedule

    We first generate some random rearrangements, and use them to determine the range of values of ΔE that will be encountered from move to move. Choosing a starting value of T which is considerably larger than the largest ΔE normally encountered, we proceed downward in multiplicative steps each amounting to a 10% decrease in T. We hold each new value of T constant for, say, 100N reconfigurations, or for 10N successful reconfigurations, whichever comes first. When efforts to reduce ΔE further become sufficiently discouraging, we stop.

    Numerical Recipes in C: The Art of Scientific Computing.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 127 / 475
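
    The quoted schedule can be sketched as a small driver loop: hold each temperature for at most 100N proposed moves or 10N accepted moves, then lower T by 10% (the function names, the stopping test and the toy example below are our assumptions, not the book's code):

import math, random

def anneal_nr_schedule(x0, cost, random_child, N, T0, decay=0.9, n_temps=50):
    """Multiplicative cooling in the style of the quotation above."""
    x, T = x0, T0
    for _ in range(n_temps):
        accepted = 0
        for _ in range(100 * N):                          # at most 100*N reconfigurations
            y = random_child(x)
            dE = cost(y) - cost(x)
            if dE < 0 or random.random() < math.exp(-dE / T):
                x, accepted = y, accepted + 1
            if accepted >= 10 * N:                        # ... or 10*N successful ones
                break
        if accepted == 0:                                 # "sufficiently discouraging"
            break
        T *= decay                                        # 10% decrease in T
    return x

if __name__ == '__main__':
    f = lambda i: (i - 70) ** 2 / 100.0 + 3.0 * math.sin(i)
    step = lambda i: max(0, min(99, i + random.choice([-1, 1])))
    print(anneal_nr_schedule(0, f, step, N=20, T0=50.0))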

    Non-classical Search Algorithms Simulated Annealing

    Example: Traveling Salesman Problem

    [Plots: the tour cost E and the temperature T plotted against the iteration number, over 250,000 iterations of simulated annealing]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 128 / 475

  • Non-classical Search Algorithms Simulated Annealing

    Example: Traveling Salesman Problem


    Figure 35: A suboptimal tour found by the algorithm in one of the runs.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 129 / 475

    Non-classical Search Algorithms Genetic Algorithms

    Algorithm 11: Genetic Algorithm

    input : P = {x}, a population of individuals, a Fitness() function to maximize
    output: x, an individual

    repeat
        Pn ← ∅ ;
        for i ← 1 to Size(P) do
            x ← Random-Selection(P, Fitness()) ;
            y ← Random-Selection(P, Fitness()) ;
            c ← Reproduce(x, y) ;
            if small probability then Mutate(c) ;
            Add c to Pn
        P ← Pn
    until some x ∈ P has Fitness(x) > Threshold, or enough time has elapsed ;
    return best individual in P

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 130 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    Algorithm 12: Reproduce(x, y)

    N ← Length(x) ;
    R ← random number from 1 to N (cross-over point) ;
    c ← Substring(x, 1, R) + Substring(y, R + 1, N) ;
    return c

    [Figure panels: (a) Initial Population of four 8-digit strings (e.g. 24748552, 32752411, 24415124, 32543213), (b) Fitness Function values 24, 23, 20, 11, (c) Selection percentages 29%, 31%, 26%, 14%, (d) Crossover, (e) Mutation]

    Figure 36: The 8-Queens problem. The ith number in the string is the position of the queen in the ith column. The fitness function is the number of non-attacking pairs (maximum fitness 28).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 131 / 475
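
    Algorithms 11-12 and the encoding of Figure 36 fit into a short Python sketch for 8-Queens (the population size, mutation probability and helper names are our own choices; fitness = number of non-attacking pairs, maximum 28):

import random

N, POP, MUT_P = 8, 100, 0.1
MAX_FIT = N * (N - 1) // 2                 # 28 non-attacking pairs

def fitness(x):
    attacks = sum(1 for i in range(N) for j in range(i + 1, N)
                  if x[i] == x[j] or abs(x[i] - x[j]) == j - i)
    return MAX_FIT - attacks

def reproduce(x, y):
    r = random.randint(1, N - 1)           # cross-over point, as in Algorithm 12
    return x[:r] + y[r:]

def mutate(c):
    c = list(c)
    c[random.randrange(N)] = random.randrange(N)
    return tuple(c)

def genetic_algorithm(generations=300):
    pop = [tuple(random.randrange(N) for _ in range(N)) for _ in range(POP)]
    for _ in range(generations):
        weights = [fitness(x) for x in pop]
        if MAX_FIT in weights:             # a solution appeared in the population
            break
        new_pop = []
        for _ in range(POP):
            x, y = random.choices(pop, weights=weights, k=2)   # fitness-proportionate selection
            c = reproduce(x, y)
            if random.random() < MUT_P:
                c = mutate(c)
            new_pop.append(c)
        pop = new_pop
    return max(pop, key=fitness)

if __name__ == '__main__':
    best = genetic_algorithm()
    print(best, fitness(best))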

    Non-classical Search Algorithms Genetic Algorithms

    Properties of Genetic Algorithms (GA)

    I Crucial issue: encoding.

    I Schema: e.g. 236*****; an instance of this schema is 23689745. If the average fitness of the instances of a schema is above the mean, then the number of instances of the schema in the population will grow over time. It is important that the schema makes some sense within the semantics/physics of the problem.

    I GAs have been used in job-shop scheduling, circuit-layout, etc.

    I The identification of the exact conditions under which GAs perform well requires further research.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 132 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    A Detailed Example: Flexible Job Scheduling
    From G. Zhang et al., An effective genetic algorithm for the flexible job-shop scheduling problem, Expert Systems with Applications, vol. 38, 2011.

    Figure 37: Gantt-Chart of a Schedule: Minimizing Makespan.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 133 / 475

    Non-classical Search Algorithms Genetic Algorithms

    GA Example: Flexible Job Scheduling

    Job  Operation  M1  M2  M3  M4  M5
    J1   O11         2   6   5   3   4
         O12         -   8   -   4   -
    J2   O21         3   -   6   -   5
         O22         4   6   5   -   -
         O23         -   7  11   5   8

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 134 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    GA Example: Constraints

    I Oi(j+1) can begin only after Oij has ended.

    I Only a certain subset of the machines can perform Oij.

    I Jio is the number of total operations for job Ji .

    I L = ∑_{i=1}^{N} Jio is the total number of operations of all jobs.

    I Pijk is the processing-time of Oij on machine k.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 135 / 475

    Non-classical Search Algorithms Genetic Algorithms

    GA Example: Chromosome Representation

    (a) (b) Machine Selection Part

    (c) Operation Sequence Part

    Figure 38: Chromosome Representation

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 136 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    GA Example: Decoding Chromosome

    (a) Finding enough space to insert Oi(j+1)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 137 / 475

    Non-classical Search Algorithms Genetic Algorithms

    GA Example: Initial Population: Global Selection (GS)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 138 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    GA Example: Initial Population: Local Selection (LS)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 139 / 475

    Non-classical Search Algorithms Genetic Algorithms

    GA Example: MS Crossover Operator

    Figure 39: Machine Sequence (MS) Part Crossover

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 140 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    GA Example: OS Crossover Operator
    Precedence Preserving Order-Based Crossover (POX)

    Figure 40: Operation Sequence (OS) Part: Precedence Preserving Order-Based Crossover (POX)

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 141 / 475
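
    One common formulation of POX for the operation-sequence part, sketched in Python (the job-index chromosome encoding and the names are assumptions that loosely follow the paper's description; details of the published operator may differ):

import random

def pox_crossover(parent1, parent2, jobs):
    """Precedence Preserving Order-Based Crossover (POX), one common variant.
    A chromosome is a list of job indices; the k-th occurrence of job j stands
    for operation O_jk, so relative order within a job encodes precedence.
    The job set is split in two; the child keeps parent1's genes from the first
    subset in place and fills the gaps with parent2's remaining genes in order."""
    jobs = list(jobs)
    random.shuffle(jobs)
    j1 = set(jobs[:len(jobs) // 2])
    child = [g if g in j1 else None for g in parent1]
    fill = (g for g in parent2 if g not in j1)
    return [g if g is not None else next(fill) for g in child]

if __name__ == '__main__':
    p1 = [1, 2, 1, 2, 2]     # J1 has 2 operations, J2 has 3, as in the table above
    p2 = [2, 1, 2, 2, 1]
    print(pox_crossover(p1, p2, jobs={1, 2}))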

    Non-classical Search Algorithms Genetic Algorithms

    GA Example: Mutation

    Figure 41: Machine Sequence (MS) Part Mutation

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 142 / 475

  • Non-classical Search Algorithms Genetic Algorithms

    GA Example: Run

    Figure 42: A typical run of GA

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 143 / 475

    Games Agents Play

    Contents

    Games Agents Play
    Minimax
    Alpha-Beta Pruning

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 144 / 475

  • Games Agents Play Minimax

    Zero-Sum Games
    A Partial Game-Tree for Tic-Tac-Toe

    [Figure: partial game tree for Tic-Tac-Toe; MAX (X) moves at the root ply, MIN (O) at the next, and so on down to terminal states with utilities −1, 0 and +1]

    Figure 43: Each half-move is called a ply.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 145 / 475

    Games Agents Play Minimax

    Search-Tree vs Game-Tree

    I The search-tree is usually a sub-set of the game-tree.

    I Example: For Chess, the game-tree is estimated to be over 10^40 nodes big, with an average branching factor of 35.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 146 / 475

  • Games Agents Play Minimax

    Zero-Sum Games
    Nomenclature

    S0 The initial state.

    Player(s) The player which has the move in state s.

    Actions(s) Set of legal moves in state s.

    Result(s, a) The transition-model: the state resulting from applying the action a to the state s.

    Terminal-Test(s) Returns True if the game is over at s.

    Utility(s) The payoff for the Max player at a terminal state s.

    Zero-sum game A game where the sum of utilities for both players at each terminal state is a constant. Example: Chess: (1, 0), (0, 1), (1/2, 1/2).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 147 / 475

    Games Agents Play Minimax

    An Example 2-Ply Game

    [Figure: a MAX root with moves A1, A2, A3 leading to three MIN nodes whose terminal utilities are (3, 12, 8), (2, 4, 6) and (14, 5, 2); the MIN nodes take the values 3, 2, 2 and the root the value 3]

    Figure 44: Each node (state) labeled with its minimax value.

    Minimax(s) =
        Utility(s) if Terminal-Test(s)
        max_{a ∈ Actions(s)} Minimax(Result(s, a)) if Player(s) = Max
        min_{a ∈ Actions(s)} Minimax(Result(s, a)) if Player(s) = Min (6.1)

    (6.1)K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 148 / 475

  • Games Agents Play Minimax

    The Minimax Algorithm

    Algorithm 13: Minimax-Decision
    input : State s
    return arg max_{a ∈ Actions(s)} Min-Value(Result(s, a))

    Algorithm 14: Max-Value
    input : State s
    if Terminal-Test(s) then return Utility(s) ;
    v ← −∞ ;
    for a ∈ Actions(s) do
        v ← max(v, Min-Value(Result(s, a)))
    return v

    Algorithm 15: Min-Value
    input : State s
    if Terminal-Test(s) then return Utility(s) ;
    v ← +∞ ;
    for a ∈ Actions(s) do
        v ← min(v, Max-Value(Result(s, a)))
    return v

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 149 / 475
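
    A direct Python transcription of Algorithms 13-15, applied to the 2-ply tree of Figure 44 (the nested-list encoding of the tree and the function names are our own):

# The 2-ply tree of Figure 44: the root is a MAX node, each inner list is a
# MIN node, and plain numbers are terminal utilities.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

def is_terminal(s):
    return not isinstance(s, list)

def max_value(s):
    if is_terminal(s):
        return s
    return max(min_value(c) for c in s)

def min_value(s):
    if is_terminal(s):
        return s
    return min(max_value(c) for c in s)

def minimax_decision(s):
    """Index of the action maximizing the MIN-value of the resulting state."""
    return max(range(len(s)), key=lambda a: min_value(s[a]))

if __name__ == '__main__':
    print(max_value(TREE), minimax_decision(TREE))   # 3 and action 0 (A1)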

    Games Agents Play Minimax

    Search-Tree Complexity

    I Since the search is a DFS, for an average branching-factor b and maximum depth m:
        I Space complexity: O(bm).
        I Time complexity: O(b^m).

    I It turns out we can reduce the time-complexity in the best case to O(b^{m/2}) using Alpha-Beta Pruning.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 150 / 475

  • Games Agents Play Alpha-Beta Pruning

    Alpha-Beta Pruning

    [Figure: successive snapshots of Alpha-Beta pruning on the 2-ply tree of Figure 44; once the first MIN node evaluates to 3, the second MIN node is abandoned after its first leaf 2 is seen (its remaining leaves, marked X, are pruned), and the root MAX value 3 is returned]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 151 / 475

    Games Agents Play Alpha-Beta Pruning

    Algorithm 16: Alpha-Beta-Search(s)

    v ← Max-Value(s, α = −∞, β = +∞) ;
    return the Action in Actions(s) with value v

    Algorithm 17: Max-Value(s, α, β)

    if Terminal-Test(s) then return Utility(s) ;
    v ← −∞ ;
    for a ∈ Actions(s) do
        v ← max(v, Min-Value(Result(s, a), α, β)) ;
        if v ≥ β then return v ;
        α ← max(α, v) ;
    return v

    Algorithm 18: Min-Value(s, α, β)

    if Terminal-Test(s) then return Utility(s) ;
    v ← +∞ ;
    for a ∈ Actions(s) do
        v ← min(v, Max-Value(Result(s, a), α, β)) ;
        if v ≤ α then return v ;
        β ← min(β, v) ;
    return v

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 152 / 475
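
    The same tree with Algorithms 16-18 in Python; a list of evaluated leaves shows which terminal nodes are pruned (the encoding and names are ours):

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # the 2-ply tree of Figure 44
evaluated = []                               # leaves actually looked at

def max_value(s, alpha, beta):
    if not isinstance(s, list):
        evaluated.append(s)
        return s
    v = float('-inf')
    for c in s:
        v = max(v, min_value(c, alpha, beta))
        if v >= beta:
            return v                         # beta cut-off
        alpha = max(alpha, v)
    return v

def min_value(s, alpha, beta):
    if not isinstance(s, list):
        evaluated.append(s)
        return s
    v = float('inf')
    for c in s:
        v = min(v, max_value(c, alpha, beta))
        if v <= alpha:
            return v                         # alpha cut-off
        beta = min(beta, v)
    return v

def alpha_beta_search(s):
    best_a, best_v, alpha = None, float('-inf'), float('-inf')
    for a, c in enumerate(s):
        v = min_value(c, alpha, float('inf'))
        if v > best_v:
            best_a, best_v = a, v
        alpha = max(alpha, best_v)
    return best_a, best_v

if __name__ == '__main__':
    print(alpha_beta_search(TREE))   # (0, 3)
    print(evaluated)                 # [3, 12, 8, 2, 14, 5, 2]: two leaves were pruned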

  • Games Agents Play Alpha-Beta Pruning

    Reference

    Donald E. Knuth and Ronald W. Moore, An Analysis of Alpha-Beta Pruning, Artificial Intelligence, 1975.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 153 / 475

    Games Agents Play Alpha-Beta Pruning

    Transposition Table

    I Some state-nodes may reappear in the tree: to avoid repeating their expansion, their computed utilities can be cached in a hash-table called the Transposition Table. This is analogous to the dead-set in the Graph-Search Algorithm.

    I It may not be practical to cache all visited nodes. Various heuristics are used to decide which nodes to discard from the transposition table.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 154 / 475
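
    A toy illustration of the caching idea (the nested-tuple key, the absence of any eviction policy and the names are simplifications for this sketch; real engines hash board positions, e.g. with Zobrist hashing, and bound the table size):

TT = {}   # transposition table: hashable state key -> cached minimax value

def to_key(s):
    """Encode a (sub-)tree as nested tuples so it can serve as a dictionary key."""
    return tuple(to_key(c) for c in s) if isinstance(s, list) else s

def minimax(s, is_max=True):
    key = (to_key(s), is_max)
    if key not in TT:                        # only expand states never seen before
        if not isinstance(s, list):
            TT[key] = s                      # terminal utility
        else:
            vals = [minimax(c, not is_max) for c in s]
            TT[key] = max(vals) if is_max else min(vals)
    return TT[key]

if __name__ == '__main__':
    print(minimax([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))   # 3, with repeated sub-states cached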

  • Games Agents Play Alpha-Beta Pruning

    Cutoff Depth and Evaluation Functions

    I To achieve real-time performance, we expand the search-tree only up to a maximum depth and replace the utility computation by a heuristic evaluation function.

    Algorithm 19: Min-Value(s, α, β, d)

    if Cutoff-Test(s, d) then return Eval(s) ;
    v ← +∞ ;
    for a ∈ Actions(s) do
        v ← min(v, Max-Value(Result(s, a), α, β, d + 1)) ;
        if v ≤ α then return v ;
        β ← min(β, v) ;
    return v

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 155 / 475

    Games Agents Play Alpha-Beta Pruning

    Example Evaluation Functions

    I Each state s is considered to have several features, e.g. in Chess: the number of rooks, pawns and bishops, the number of plys till now, etc.

    I Each feature can be given a weight and a weighted sum of features can be used. Example: pawn (1), bishop (3), rook (5), queen (9).

    I Weighting can be nonlinear, e.g. a pair of bishops is worth more than twice the worth of a single bishop; a bishop is more valuable in the endgame.

    I Read Sec. 5.7 of the textbook. The (2007-2010) computer world champion was RYBKA, running on a desktop with its evaluation function tuned by International Master Vasik Rajlich. Allegations of plagiarism: Crafty and Fruit.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 156 / 475
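
    A linear material-count evaluation of the kind described above, sketched in Python (the board encoding and the restriction to four piece types are our simplifications; the weights are the ones given on the slide):

WEIGHTS = {'P': 1, 'B': 3, 'R': 5, 'Q': 9}    # pawn, bishop, rook, queen

def evaluate(board):
    """board: iterable of piece codes; upper-case = MAX's pieces, lower-case = MIN's.
    Returns the weighted material difference from MAX's point of view."""
    score = 0
    for piece in board:
        w = WEIGHTS.get(piece.upper(), 0)     # pieces not listed contribute nothing here
        score += w if piece.isupper() else -w
    return score

if __name__ == '__main__':
    print(evaluate(['Q', 'R', 'P', 'P', 'q', 'b', 'p']))   # (9+5+1+1) - (9+3+1) = 3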

  • Logical Agents: Propositional Logic

    Contents

    Logical Agents: Propositional Logic
    Propositional Logic
    Entailment and Inference
    Inference by Model-Checking
    Inference by Theorem Proving
    Inference by Resolution
    Inference with Definite Clauses
    2SAT
    Agents based on Propositional Logic
    Time out from Logic

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 157 / 475

    Logical Agents: Propositional Logic

    Knowledge-Base

    A Knowledge-Base is a set of sentences expressed in a knowledge representation language. New sentences can be added to the KB, and it can be queried about whether a given sentence can be inferred from what is known.
    A KB-Agent is an example of a Reflex-Agent explained previously.

    Algorithm 20: Knowledge-Base (KB) Agent

    input : KB, a knowledge-base,
            t, time, initially 0.

    Tell(KB, Make-Percept-Sentence(percept, t)) ;
    action ← Ask(KB, Make-Action-Query(t)) ;
    Tell(KB, Make-Action-Sentence(action, t)) ;
    t ← t + 1 ;
    return action

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 158 / 475

  • Logical Agents: Propositional Logic Propositional Logic

    Propositional Logic
    A simple knowledge representation language

    Definition 7.1 (Syntax of Propositional Logic)

    I An atomic formula (also called an atomic sentence or a proposition-symbol) has the form P, Q, A1, True, False, IsRaining, etc.

    I A formula/sentence can be defined inductively as:
        I All atomic formulas are formulas.
        I For every formula F, ¬F is a formula, called a negation.
        I For all formulas F and G, the following are also formulas:
            I (F ∨ G), called a disjunction.
            I (F ∧ G), called a conjunction.

    I If a formula F is part of another formula G, then it is called a subformula of G.

    I We use the short-hand notations:
        I F ⇒ G (Premise implies Conclusion) for (¬F) ∨ G
        I F ⇔ G (Biconditional) for (F ⇒ G) ∧ (G ⇒ F).

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 159 / 475

    Logical Agents: Propositional Logic Propositional Logic

    Propositional Logic
    A simple knowledge representation language

    Syntax of Propositional Logic rewritten in BNF Grammar

    Sentence → Atomic-Sentence | Complex-Sentence
    Atomic-Sentence → True | False | P | Q | R | . . .
    Complex-Sentence → (Sentence) | [Sentence]
                     | ¬ Sentence
                     | Sentence ∧ Sentence
                     | Sentence ∨ Sentence
                     | Sentence ⇒ Sentence
                     | Sentence ⇔ Sentence

    Operator-Precedence: ¬, ∧, ∨, ⇒, ⇔ (7.1)

    Axioms are sentences which are given and cannot be derived from other sentences.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 160 / 475

  • Logical Agents: Propositional Logic Propositional Logic

    Semantics of Propositional Logic

    I The elements of the set T ≜ {0, 1}, also written {False, True} or {F, T}, are called Truth-Values.

    I Let D be a set of atomic formulas/sentences. Then an assignment A is a mapping A : D → T.

    I We can extend the mapping A to Â : E → T, where E ⊇ D is the set of formulas which can be built using only the atomic formulas in D, as follows:
        I For any atomic formula Bi ∈ D, Â(Bi) ≜ A(Bi).
        I Â(¬P) ≜ 1 if Â(P) = 0, and 0 otherwise.
        I Â((P ∧ Q)) ≜ 1 if Â(P) = 1 and Â(Q) = 1, and 0 otherwise.
        I Â((P ∨ Q)) ≜ 1 if Â(P) = 1 or Â(Q) = 1, and 0 otherwise.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 161 / 475

    Logical Agents: Propositional Logic Propositional Logic

    Semantics of Propositional Logic
    Truth-Table

    The semantic interpretation can be shown by a truth-table.

    A(P)  A(Q)  A(P ⇒ Q)  A(P ⇔ Q)
     1     1        1          1
     1     0        0          0
     0     1        1          0
     0     0        1          1

    I From now on, the distinction between A and its extension Â is dropped.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 162 / 475

  • Logical Agents: Propositional Logic Propositional Logic

    Suitable Assignment, Model, Satisfiability, Validity

    I If an assignment A is defined for all atomic formulas in a formula F, then A is called suitable for F.

    I If A is suitable for F and A(F) = 1, then A is called a model for F, and we write A ⊨ F. Otherwise, we write A ⊭ F.

    I The set of all models of a formula/sentence F is denoted by M(F).

    I A formula F is called satisfiable if it has at least one model; otherwise it is called unsatisfiable or contradictory.

    I A set of formulas F is called satisfiable if there exists an assignment A which is a model for all Fi ∈ F.

    I A formula F is called valid (or a tautology) if every suitable assignment for F is also a model of F. In this case, we write ⊨ F. Otherwise, we write ⊭ F.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 163 / 475

    Logical Agents: Propositional Logic Propositional Logic

    Theorem 7.2
    A formula F is valid if and only if (iff) ¬F is unsatisfiable.

    Proof.

    I F is valid iff every suitable assignment of F is a model of F.

    I iff every suitable assignment of F (and hence, of ¬F) is not a model of ¬F.

    I iff ¬F has no model, and hence, is unsatisfiable.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 164 / 475

  • Logical Agents: Propositional Logic Propositional Logic

    Wumpus World

    [Figure: the 4×4 Wumpus World grid with the agent's START square, three pits (with a breeze perceived in each adjacent square), the Wumpus (with a stench in each adjacent square) and the gold]

    Figure 45: Actions = [Move-Forward, Turn-Left, Turn-Right, Grab, Shoot, Climb], Percept = [Stench, Breeze, Glitter, Bump, Scream]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 165 / 475

    Logical Agents: Propositional Logic Propositional Logic

    Wumpus World

    Figure 46: Actions = [Move-Forward, Turn-Left, Turn-Right, Grab, Shoot, Climb], Percept = [Stench, Breeze, Glitter, Bump, Scream]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 166 / 475

  • Logical Agents: Propositional Logic Propositional Logic

    Wumpus World

    Figure 47: Actions = [Move-Forward, Turn-Left, Turn-Right, Grab, Shoot, Climb], Percept = [Stench, Breeze, Glitter, Bump, Scream]

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 167 / 475

    Logical Agents: Propositional Logic Propositional Logic

    Wumpus World KB

    Px,y is true if there is a pit in [x, y]
    Wx,y is true if there is a Wumpus in [x, y]
    Bx,y is true if the agent perceives a breeze in [x, y]
    Sx,y is true if the agent perceives a stench in [x, y]

    R1 : ¬P1,1 (7.2)
    R2 : B1,1 ⇔ (P1,2 ∨ P2,1), (7.3)
    R3 : B2,1 ⇔ (P1,1 ∨ P2,2 ∨ P3,1) (7.4)

    We also have percepts:

    R4 : ¬B1,1, R5 : B2,1 (7.5)

    Query to the KB: Q = ¬P1,2 or Q = ¬P2,2.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 168 / 475
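
    With only seven symbols involved, the queries can be settled by brute-force model checking; a Python sketch (the enumeration helper and the symbol encoding are ours; R1-R5 are the sentences above):

from itertools import product

SYMBOLS = ['P11', 'P12', 'P21', 'P22', 'P31', 'B11', 'B21']

def kb_holds(m):
    """R1-R5 evaluated in the model m (a dict: symbol -> bool)."""
    return (not m['P11']                                        # R1
            and m['B11'] == (m['P12'] or m['P21'])              # R2
            and m['B21'] == (m['P11'] or m['P22'] or m['P31'])  # R3
            and not m['B11']                                    # R4
            and m['B21'])                                       # R5

def entails(query):
    """KB |= query iff the query holds in every model of the KB."""
    for values in product([False, True], repeat=len(SYMBOLS)):
        m = dict(zip(SYMBOLS, values))
        if kb_holds(m) and not query(m):
            return False
    return True

if __name__ == '__main__':
    print(entails(lambda m: not m['P12']))   # True:  [1,2] is provably pit-free
    print(entails(lambda m: not m['P22']))   # False: a pit in [2,2] is still possible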

  • Logical Agents: Propositional Logic Entailment and Inference

    Entailment

    Definition 7.3 (Entailment)

    The formula/sentence F entails the formula/sentence G , i.e.

    F ⊨ G, iff M(F) ⊆ M(G). (7.6)

    We also say that G is a consequence of F .

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 169 / 475

    Logical Agents: Propositional Logic Entailment and Inference

    Theorem 7.4 (Deduction Theorem)

    For any formulas F and G, F ⊨ G iff the formula (F ⇒ G) is valid, i.e. true in all assignments suitable for F and G.

    Proof.

    I Assume F ⊨ G. Let A be an assignment suitable for F and G. Then,
        I If A is not a model of F, i.e. A ⊭ F, then A is a model of (F ⇒ G) (ref. truth-table of implication).
        I If A ⊨ F, then as F ⊨ G, A ⊨ G. Hence, A is a model of (F ⇒ G).
        I Thus, A is always a model of (F ⇒ G). Hence, (F ⇒ G) is valid.

    I Assume (F ⇒ G) is valid. Hence, there does not exist an assignment A such that
        I A ⊨ F, and
        I A ⊭ G.
        I Hence, all models of F are also models of G, and so F ⊨ G.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 170 / 475

  • Logical Agents: Propositional Logic Entailment and Inference

    Definition 7.5 (Equivalence F ≡ G)
    Two formulas F and G are semantically equivalent if for every assignment A suitable for both F and G, A(F) = A(G).

    Remark 7.6 (An equivalent definition of equivalence ≡)
    F ≡ G iff F ⊨ G and G ⊨ F.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 171 / 475

    Logical Agents: Propositional Logic Entailment and Inference

    Equivalence

    Example 7.7

    In the following, ∧ and ∨ can be swapped to get new equivalences.

    ¬¬F ≡ F (7.7)
    F ∧ F ≡ F Idempotency (7.8)
    F ∧ G ≡ G ∧ F Commutativity (7.9)
    (F ∧ G) ∧ H ≡ F ∧ (G ∧ H) Associativity (7.10)
    F ∧ (F ∨ G) ≡ F Absorption (7.11)
    F ∧ (G ∨ H) ≡ (F ∧ G) ∨ (F ∧ H) Distributivity (7.12)
    ¬(F ∧ G) ≡ (¬F) ∨ (¬G) de Morgan's Law (7.13)
    P ⇒ Q ≡ ¬Q ⇒ ¬P Contraposition (7.14)

    All of them can be shown by truth-tables.

    K. Pathak (Jacobs University Bremen) Artificial Intelligence December 5, 2013 172 / 475
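
    Each equivalence can be verified mechanically by enumerating all assignments; for instance de Morgan's Law (7.13) and Contraposition (7.14) in Python (the predicate encoding, with F ⇒ G written as (not F) or G, is ours):

from itertools import product

def equivalent(f, g, n_vars):
    """True iff the formulas f and g (Python predicates over booleans) agree
    under every assignment of their n_vars atomic propositions."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=n_vars))

# de Morgan (7.13):  not (F and G)  ==  (not F) or (not G)
print(equivalent(lambda F, G: not (F and G),
                 lambda F, G: (not F) or (not G), 2))        # True

# Contraposition (7.14):  F -> G  ==  (not G) -> (not F)
print(equivalent(lambda F, G: (not F) or G,
                 lambda F, G: (not (not G)) or (not F), 2))  # True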

  • Logical Agents: P