State Space 4 Chapter 4 Adversarial Games. Two Flavors Games of Perfect Information ◦Each player...


State Space 4 Chapter 4

Adversarial Games

Two Flavors

Games of Perfect Information
◦ Each player knows everything that can be known
◦ Chess, Othello

Games of Imperfect Information
◦ Players have partial knowledge
◦ Poker: dispute is settled by revealing the contents of one’s hand

Two Approaches to Perfect Information Games

Use simple heuristics and search many nodes

Use sophisticated heuristics and search few nodes

Cost of calculating the heuristics might outweigh the cost of opening many nodes

The closer h is to h*, the better informed it is. But information can be expensive

A Model: MiniMax on exhaustively searchable graphs

Two Players
◦ min: tries to achieve an outcome of 0
◦ max: tries to achieve an outcome of 1

You are max at node A

[Figure: game tree rooted at A (Max to move) with nodes A through R. Levels alternate between Max and Min, and each terminal node is labeled 1 (a win for max) or 0 (a win for min).]

Max would like to go to F, J, N, Q, L
Min would like to go to D, H, P, R, M

Propagating Scores: A First Pass

Min’s turn at node I
◦ Go to M to win
◦ So assign 0 to node I

Max’s turn at node O
◦ Go to Q to win
◦ So assign 1 to node O

Conclusion

1. If faced with two labeled choices, you would choose 0 (if min) or 1 (if max)

2. Assume your opponent will play the same way

Propagating Scores

For each unlabeled node in the tree
◦ If it’s max’s turn, give it the max score of its children
◦ If it’s min’s turn, give it the min score of its children

Now label the tree
Conclusion: Max must choose C at the first move or lose the game
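The propagation rule can be sketched in a few lines of Python. The tree below is a small hypothetical example, not the tree from the figure: a leaf is an integer score (1 means max wins, 0 means min wins), an inner node is a list of children, and the function name `minimax` is chosen only for illustration.

```python
def minimax(node, max_to_move):
    """Return the backed-up score of a node.

    A leaf is an int (1 = win for max, 0 = win for min);
    an inner node is a list of child nodes.
    """
    if isinstance(node, int):
        return node
    # Score every child with the other player to move, then pick
    # the largest score on max's turn and the smallest on min's turn.
    scores = [minimax(child, not max_to_move) for child in node]
    return max(scores) if max_to_move else min(scores)


# Hypothetical tree: max moves at the root, min moves at the next level.
tree = [[0, 1], [1, [0, 1]]]

# Scores of max's two possible first moves.
root_scores = [minimax(child, False) for child in tree]
print(root_scores)          # [0, 1]
print(minimax(tree, True))  # 1
```

In this toy tree the first move leads to a min node worth 0 and the second to a min node worth 1, so max must take the second move, which mirrors the kind of conclusion drawn for node C above.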

Nim

7 coins
2 players
At each move, a player divides one pile into two piles, such that
◦ The two new piles have an unequal number of coins
◦ No pile is empty

Play ends when a player can no longer move

Start of Game

[Figure: the first levels of the game tree: 7 at the root (min to move), then 6,1 / 5,2 / 4,3, then 5,1,1 and 4,2,1.]

Complete the game to see that min wins only if max makes a mistake at 6,1 or 5,2
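The game is small enough to complete by machine. The sketch below rests on two assumptions that the slides imply but do not state outright: the player who cannot move loses, and min moves first from 7. The names `moves` and `value` are illustrative only.

```python
from functools import lru_cache


def moves(state):
    """All successor states: split one pile into two unequal, non-empty piles."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small < pile - small guarantees the two new piles are unequal.
        for small in range(1, (pile - 1) // 2 + 1):
            new_state = rest + (small, pile - small)
            result.add(tuple(sorted(new_state, reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def value(state, max_to_move):
    """1 if max wins with best play from here, 0 if min wins.

    Assumption: the player who cannot move loses.
    """
    successors = moves(state)
    if not successors:
        return 0 if max_to_move else 1
    scores = [value(s, not max_to_move) for s in successors]
    return max(scores) if max_to_move else min(scores)


print(moves((7,)))          # [(6, 1), (5, 2), (4, 3)]
print(value((7,), False))   # 1: max wins if min moves first and max plays well
for reply in moves((6, 1)) + moves((5, 2)) + moves((4, 3)):
    print(reply, value(reply, False))
```

Under these assumptions the only replies by max that score 0 are 5,1,1 (from 6,1) and 3,2,2 (from 5,2). Every reply from 4,3 keeps the win.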

αβ Pruning

Problem
◦ For games of any complexity, you can’t search the whole tree

Solution
◦ Look ahead a fixed number of plies (levels)
◦ Evaluate according to some heuristic estimate
◦ Develop a criterion for pruning subtrees that don’t require examination

Recast

Instead of representing wins, numbers represent the relative goodness of nodes.

At any junction
◦ Max chooses highest
◦ Min chooses lowest
◦ Higher means better for max
◦ Lower means better for min

What should Max do?

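[Figure: Max node g with Min children j and k. j has children m = 8 and z = 12, so j = 8. k has children n = 4 and t; t has children p = 7, q = 3, r = 9. The bounds α and β are marked on the tree.]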

Situation

Max’s turn at node g
◦ The left subtree of g has been explored
◦ If max chooses j, min will choose m
◦ So the best max can do by going left is 8. Call this α
◦ Now examine k and its left subtree n, with a value of 4
◦ If max chooses k, the worst min can do is 4. Why?
◦ t may be < 4. If it is, min will choose it. If not, min will choose 4
◦ So the worst min can do, if max goes right, is 4. Call this β

Question

Must max expand the rest of k?

No. If max chooses k, min is guaranteed an outcome of at most 4. But if max chooses j, the outcome is 8
So max is better off going left

More formally:

val(k) = min(4, val(t)) <= 4
val(g) = max(8, val(k)) = max(8, min(4, val(t))) = 8

Max Principle

If you’re max:
◦ Search can be stopped below any min node where β <= α of its max ancestor

What should Min do?

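[Figure: Min node k with Max children n and t. n has children d = 4 and e = 3, so n = 4. t has children p = 7, q = 3, r = 9. The bounds α and β are marked on the tree.]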

Situation

Min’s turn at node k
◦ The left subtree of k has been explored
◦ If min chooses n, max will choose d
◦ So the best min can do by going left is 4. Call this β
◦ Now examine t and its left subtree p, with a value of 7
◦ If min chooses t, the worst max can do is 7. Why?
◦ q or r may be > 7. If either is, max will choose one of them. If not, max will choose 7
◦ So the worst max can do, if min goes right, is 7. Call this α

Question

Should min explore q and r?

No
◦ Max is guaranteed at least 7 if min chooses t
◦ But if min chooses n, max gets only 4

val(t) = max(7, val(q), val(r)) >= 7
val(k) = min(4, val(t)) = 4

Min Principle

If you’re min:
◦ Search can be stopped below any max node where α >= β of its min ancestor

To Summarize

Max’s turn
◦ β is min’s guaranteed score (the worst min can do)
◦ α is the best max can do

Max principle
◦ Search can be stopped below any min node where β <= α of its max ancestor

Min’s turn
◦ α is max’s guaranteed score (the worst max can do)
◦ β is the best min can do

Min principle
◦ Search can be stopped below any max node where α >= β of its min ancestor
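Both principles fit into one recursive routine. What follows is a minimal sketch of the usual textbook formulation, not necessarily the procedure given in Nilsson: a leaf is a number, an inner node is a list of children, and the `visited` list (an illustrative addition) records which leaves were actually evaluated. The two trees encode the "What should Max do?" and "What should Min do?" examples, with n treated as a leaf worth 4 in the first one.

```python
import math


def alphabeta(node, max_to_move, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of node, skipping subtrees that cannot matter."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)
        return node
    if max_to_move:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:   # min principle: stop below this max node
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:       # max principle: stop below this min node
            break
    return value


# g = [j, k] with j = [m, z] and k = [n, t], t = [p, q, r]
g_tree = [[8, 12], [4, [7, 3, 9]]]
seen_g = []
print(alphabeta(g_tree, True, visited=seen_g), seen_g)   # 8 [8, 12, 4]

# k = [n, t] with n = [d, e] and t = [p, q, r]
k_tree = [[4, 3], [7, 3, 9]]
seen_k = []
print(alphabeta(k_tree, False, visited=seen_k), seen_k)  # 4 [4, 3, 7]
```

In the first run t is never opened, and in the second q and r are never opened, matching the two worked situations.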

Examples 3 - 6

On White Board

AB Prune (Nilsson, p. 205)

Similarity between
• 2 and 2’
• Arguments for max/min principles (Slides 15, 19)

Best Case Performance

Call
◦ D the depth of the search space
◦ $N_D$ the number of terminal nodes
◦ B the branching factor

Best case AB performance:
$N_D = 2B^{D/2} - 1$ for even D
$N_D = B^{(D+1)/2} + B^{(D-1)/2} - 1$ for odd D

Example

Suppose B = 5, D = 6
Without alpha/beta: $N_D = 5^6 = 15625$
With alpha/beta: $N_D = 2 \cdot 5^3 - 1 = 249$

Approx. 1.6% of the terminal nodes examined without AB prune
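The arithmetic can be checked with a short script. It uses the standard best-case count for alpha-beta in the single form $B^{\lceil D/2 \rceil} + B^{\lfloor D/2 \rfloor} - 1$, which reduces to $2B^{D/2} - 1$ when D is even. The function name `best_case_leaves` is hypothetical.

```python
def best_case_leaves(b, d):
    """Terminal nodes examined by alpha-beta with perfect move ordering."""
    half_up = (d + 1) // 2    # ceil(d / 2)
    half_down = d // 2        # floor(d / 2)
    return b ** half_up + b ** half_down - 1


b, d = 5, 6
full = b ** d                       # every terminal node, no pruning
pruned = best_case_leaves(b, d)
print(full, pruned)                 # 15625 249
print(f"{pruned / full:.1%}")       # 1.6%
```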

Average Performance

AB prune reduces branching factor B to $B^{3/4}$

Suppose B = 5, so $B_{AB} = 5^{3/4} \approx 3.34$

Suppose D = 6
Then $N_D = 5^6 = 15625$
$N_{D,AB} = 3.34^6 \approx 1388$, about 8.9% of the terminal nodes examined without AB prune
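A quick numeric check of the average case, taking the $B^{3/4}$ rule as given. The variable names are illustrative, and the figures differ slightly depending on whether the effective branching factor is rounded to 3.34 before it is raised to the sixth power.

```python
b, d = 5, 6

b_ab = b ** 0.75                   # effective branching factor, about 3.34
full = b ** d                      # 15625 terminal nodes without pruning

rounded_estimate = 3.34 ** d       # uses the rounded factor, about 1388
unrounded_estimate = b_ab ** d     # uses the unrounded factor, about 1397.5

print(round(b_ab, 2))
print(round(rounded_estimate), f"{rounded_estimate / full:.1%}")
print(round(unrounded_estimate, 1), f"{unrounded_estimate / full:.1%}")
```

Either way the average case examines roughly 9% of the terminal nodes, compared with about 1.6% in the best case.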

Binary Trees

Clear relationship between branching factor and the size of the search space

Let T = number of nodes in a full binary tree
$T = 2^0 + 2^1 + 2^2 + \dots + 2^{D-1} = 2^D - 1$

Easily proved through induction

Extend to arbitrary B

$T = B^0 + B^1 + B^2 + \dots + B^D = B(B^D - 1)/(B - 1) + 1$

Also provable through induction
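Short of an induction proof, the closed form can be spot-checked against the explicit sum. This is a sketch with hypothetical function names, and it assumes B > 1 so that the division is defined.

```python
def tree_size_sum(b, d):
    """Nodes in a full tree with levels 0..d, counted level by level."""
    return sum(b ** level for level in range(d + 1))


def tree_size_closed(b, d):
    """Closed form B(B^D - 1)/(B - 1) + 1; the division is exact for b > 1."""
    return b * (b ** d - 1) // (b - 1) + 1


# Compare the two on a range of small trees.
for b in range(2, 8):
    for d in range(0, 10):
        assert tree_size_sum(b, d) == tree_size_closed(b, d)

print(tree_size_closed(5, 6))   # 19531
```

With B = 5 and D = 6 the whole tree has 19531 nodes, of which 15625 are terminal.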

To Sum Up

1. D is the depth of the search space
2. B is the average number of descendants at each level
3. Size of search space = $B(B^D - 1)/(B - 1) + 1$
4. Grows very fast
◦ As branching factor increases
◦ As depth increases
5. Combinatorial Explosion: search space grows too fast to be exhaustively searched
6. But we want to search deeply (large D)
7. Conclusion: reduce B through AB pruning