Adversarial Games. Two Flavors Perfect Information –everything that can be known is known...

Adversarial Games

Two Flavors

Perfect information
- Everything that can be known is known
- Chess, Othello

Imperfect information
- Each player has only partial knowledge
- Poker: a dispute is settled by revealing the contents of one's hand

Two Approaches to Perfect Information Games

Use simple heuristics and search many nodes

Use sophisticated heuristics and search few nodes

Cost of calculating the heuristics might outweigh the cost of opening many nodes

The closer h is to h*, the better informed it is, but that information can be expensive to compute

MiniMax on exhaustively searchable graphs

Two players
- min: tries to achieve an outcome of 0
- max: tries to achieve an outcome of 1

You are max at node A

Expand the entire search space

[Figure: the fully expanded game tree rooted at A, with nodes A through R. Levels alternate between Max and Min, starting with Max at A, and each leaf is labeled 0 or 1.]

Wins

Max: F, J, N, Q, L
Min: D, H, P, R, M

Propagating Scores: A First Pass

Min's turn at node I
- Go to M to win
- So assign 0 to node I

Max's turn at node O
- Go to Q to win
- So assign 1 to node O

Conclusion

1. If faced with two labeled choices, you would choose 0 (if min) or 1 (if max)

2. Assume your opponent will play the same way

Propagating Scores

For each unlabeled node in the tree
- If it's max's turn, give it the max score of its children
- If it's min's turn, give it the min score of its children

Now label the tree

Conclusion: Max must choose C at the first move or lose the game
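
The labeling rule above can be sketched in a few lines of Python. The tree used here is a small hypothetical one, not the tree from the slides: the node names (`root`, `left`, `right`, `l1`, ...) and the dictionaries `children` and `leaf_score` are assumptions made for illustration. Leaves carry 1 for a max win and 0 for a min win.

```python
# Hypothetical two-ply tree: max moves at "root", min moves at "left"/"right".
children = {
    "root": ["left", "right"],
    "left": ["l1", "l2"],
    "right": ["r1", "r2"],
}
leaf_score = {"l1": 1, "l2": 0, "r1": 1, "r2": 1}


def minimax(node, is_max):
    """Label a node with the max (or min) of its children's labels."""
    if node in leaf_score:
        return leaf_score[node]
    scores = [minimax(child, not is_max) for child in children[node]]
    return max(scores) if is_max else min(scores)


def best_move(node, is_max):
    """Pick the child whose propagated label is best for the player to move."""
    pick = max if is_max else min
    return pick(children[node], key=lambda child: minimax(child, not is_max))


root_value = minimax("root", True)
first_move = best_move("root", True)
```

In this toy tree, going left lets min reach the 0 leaf, so max has to go right, which mirrors the kind of conclusion drawn on the slide.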

Nim

7 coins, 2 players

At each move, a player divides one pile into two piles, such that
- The two piles have an unequal number of coins
- No pile is empty

Play ends when a player can no longer move

Start of Game

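[Figure: partial game tree. Root: 7. Second level: 6,1; 5,2; 4,3. Third level (partial): 5,1,1; 4,2,1. One level is labeled "min".]

Complete the game to see that min wins only if max makes a mistake at 6,1 or 5,2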
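
The 7-coin game is small enough to complete by exhaustive search. The sketch below rests on two assumptions that the slides do not state outright: min makes the first move from the single pile of 7, and the player who cannot move loses. The function names `moves` and `value` are illustrative.

```python
from functools import lru_cache


def moves(state):
    """States reachable by splitting one pile into two unequal, non-empty piles."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for small in range(1, (pile - 1) // 2 + 1):  # small < pile - small
            result.add(tuple(sorted(rest + (small, pile - small), reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def value(state, max_to_move):
    """1 if max wins with best play, 0 if min wins (assumed: no move = loss)."""
    successors = moves(state)
    if not successors:
        return 0 if max_to_move else 1
    scores = [value(s, not max_to_move) for s in successors]
    return max(scores) if max_to_move else min(scores)


# Assumed setup: min opens from (7,), so max replies at 6,1 / 5,2 / 4,3.
game_value = value((7,), False)

# For each state max may face, list the replies that hand the win to min.
mistakes = {
    state: [reply for reply in moves(state) if value(reply, False) == 0]
    for state in moves((7,))
}
```

Under these assumptions the root is labeled 1, and the only losing replies for max are at 6,1 and 5,2, which agrees with the claim on the slide.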

α-β Pruning

Problem
- For games of any complexity, you can't search the whole tree

Solution
- Look ahead a fixed number of plies (levels)
- Evaluate according to some heuristic estimate
- Develop a criterion for pruning subtrees that don't require examination

Recast

Instead of representing wins, numbers represent the relative goodness of nodes.

At any junction
- Max chooses highest
- Min chooses lowest

Higher means better for max

Lower means better for min
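
A fixed-depth lookahead over such goodness scores might look like the sketch below. Everything in it is hypothetical: the tree in `children`, the heuristic table `estimate`, and the function name `minimax_limited` are made up to show the idea, and a real game would compute the estimate from the position.

```python
# Hypothetical tree and heuristic estimates (higher is better for max).
children = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"]}
estimate = {"a": 5, "b": 6, "c": 4, "d": 3, "e": 9, "f": 8, "g": 7}


def minimax_limited(node, depth, is_max):
    """Search `depth` plies below node, then fall back on the heuristic."""
    successors = children.get(node, [])
    if depth == 0 or not successors:
        return estimate[node]
    scores = [minimax_limited(s, depth - 1, not is_max) for s in successors]
    return max(scores) if is_max else min(scores)


one_ply = minimax_limited("a", 1, True)   # compares the estimates of b and c
two_ply = minimax_limited("a", 2, True)   # lets min reply below b and c first
```

With one ply of lookahead b looks better (6 against 4); with two plies c wins out (7 against 3), which shows how the depth of the search can change the move chosen.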

Example: Max’s Turn

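[Figure: alpha-beta example. Max node g (value 8) has min children j (value 8) and k (value 4). j has children m = 8 and z = 12. k has children n = 4 and t (value 9), and t has children p = 7, q = 3, r = 9.]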

Situation

Max's turn at node g
- Left subtree of g has been explored
- If max chooses j, min will choose m
- So the best max can do by going left is 8. Call this α

Now examine k and its left subtree n, with a value of 4
- If max chooses k, the worst min can do is 4
- This is because t may be < 4. If it is, min will choose it. If not, min will choose 4
- So the worst min can do, if max goes right, is 4. Call this β

Question

Should max expand the right subtree of k?

No. If max chooses k, min is guaranteed a score of 4 or less. But if max chooses j, min can only hold max to 8.

Val(k) = min(4, Val(t)) <= 4

Val(g) = max(8, Val(k)) = max(8, min(4, Val(t))) = 8

Leads to Max Principle

Search can be stopped below any min node where β <= α of its max ancestor

Example: Min’s Turn

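[Figure: alpha-beta example. Min node k (value 4) has max children n (value 4) and t (value 9). n has children d = 4 and e = 3. t has children p = 7, q = 3, r = 9.]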

Situation

Min's turn at node k
- Left subtree of k has been explored
- If min chooses n, max will choose d
- So the best min can do by going left is 4. Call this β

Now examine t and its left subtree p, with a value of 7
- If min chooses t, the worst max can do is 7
- This is because q or r may be > 7. If one is, max will choose it. If not, max will choose 7
- So the worst max can do, if min goes right, is 7. Call this α

Question

Should min explore q and r?

No. Max is guaranteed 7 if min chooses t. But if min chooses n, max gets only 4.

Val(t) = max(7, Val(q), Val(r)) >= 7

Val(k) = min(4, Val(t)) = 4

Leads to min principle

Search can be stopped below any max node where α >= β of its min ancestor

To Summarize

Max's turn
- β is min's guaranteed score (the worst min can do)
- α is the best max can do

Max principle
- Search can be stopped below any min node where β <= α of its max ancestor

Min's turn
- α is max's guaranteed score (the worst max can do)
- β is the best min can do

Min principle
- Search can be stopped below any max node where α >= β of its min ancestor
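
Both principles reduce to a single test, α >= β, inside one recursive search. The sketch below runs it on a tree pieced together from the two examples (g over j and k; k over n and t). Treating m, z, d, e, p, q, r as leaves with the values shown is an assumption of this sketch, as are the names `CHILDREN`, `LEAF`, and `alphabeta`.

```python
import math

# Tree assumed from the two worked examples; leaf values are the slide numbers.
CHILDREN = {
    "g": ["j", "k"],
    "j": ["m", "z"],
    "k": ["n", "t"],
    "n": ["d", "e"],
    "t": ["p", "q", "r"],
}
LEAF = {"m": 8, "z": 12, "d": 4, "e": 3, "p": 7, "q": 3, "r": 9}


def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of node, skipping subtrees once alpha >= beta."""
    if node in LEAF:
        if visited is not None:
            visited.append(node)  # record which leaves were actually evaluated
        return LEAF[node]
    best = -math.inf if is_max else math.inf
    for child in CHILDREN[node]:
        score = alphabeta(child, not is_max, alpha, beta, visited)
        if is_max:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if alpha >= beta:  # the max principle and the min principle in one test
            break
    return best


seen_from_g, seen_from_k = [], []
value_g = alphabeta("g", True, visited=seen_from_g)    # max's turn at g
value_k = alphabeta("k", False, visited=seen_from_k)   # min's turn at k
```

Starting at g, the search never enters t, because k's β of 4 is already below g's α of 8. Starting at k, it evaluates p and then skips q and r, because t's α of 7 already exceeds k's β of 4.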

Efficiency

Suppose a tree has depth $D$ and branching factor $B$

$D = 2$, $B = 2$: terminal nodes $= 2^2 = 4$

In general, terminal nodes $= B^D$

Ordering

Ordering of nodes in a tree clearly affects the number that can be pruned using alpha/beta.

Call $N_D$ the number of terminal nodes examined

It can be shown that with alpha/beta the best-case performance is:

$N_D = 2B^{D/2} - 1$ for even $D$

$N_D = B^{(D+1)/2} + B^{(D-1)/2} - 1$ for odd $D$

Example

Suppose $B = 5$, $D = 6$

Without alpha/beta: $N_D = 5^6 = 15625$

With alpha/beta: $N_D = 2 \cdot 5^3 - 1 = 249$

Approx. 1.6% of worst case
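
The best-case count can be checked by running alpha/beta on a uniform tree and counting the leaves it evaluates. This sketch uses the standard best-case formula, $B^{\lceil D/2 \rceil} + B^{\lfloor D/2 \rfloor} - 1$, which equals $2B^{D/2} - 1$ for even $D$. As a stand-in for perfect move ordering it gives every leaf the same score, so each cutoff fires as early as it can; that choice, and the function names, are assumptions of the example.

```python
import math


def best_case_leaves(b, d):
    """Closed-form best-case leaf count for branching factor b, depth d."""
    return b ** ((d + 1) // 2) + b ** (d // 2) - 1


def count_visited(b, d):
    """Leaves evaluated by alpha/beta on a uniform tree whose leaves all score 0."""
    visited = 0

    def search(depth, is_max, alpha, beta):
        nonlocal visited
        if depth == 0:
            visited += 1
            return 0
        best = -math.inf if is_max else math.inf
        for _ in range(b):
            score = search(depth - 1, not is_max, alpha, beta)
            if is_max:
                best = max(best, score)
                alpha = max(alpha, best)
            else:
                best = min(best, score)
                beta = min(beta, best)
            if alpha >= beta:
                break
        return best

    search(d, True, -math.inf, math.inf)
    return visited


formula = best_case_leaves(5, 6)
measured = count_visited(5, 6)
share_of_full_tree = formula / 5 ** 6
```

For $B = 5$, $D = 6$ both the formula and the count give 249 leaves out of 15625.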

Reduction in Branching Factor

Without alpha/beta pruning: $N_D = B^D$

With alpha/beta pruning: $N_D = 2B^{D/2} - 1$ for even $D$

This reduces the branching factor used in computing $N_D$ from $B$ to $B^{1/2}$

(since $(B^{1/2})^D = B^{D/2}$)

Average Performance

On average, alpha/beta reduces $B$ to about $B^{3/4}$

Suppose $B = 5$; then $B_{\alpha\beta} = 5^{3/4} \approx 3.34$

Suppose $D = 6$

Then $N_D = 5^6 = 15625$

$N_{D,\alpha\beta} = 3.34^6 \approx 1388 \approx 8.9\%$ of worst case
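
The arithmetic on this slide can be reproduced directly. The variable names are illustrative, and the effective branching factor is rounded to two decimals before exponentiation, as the slide appears to do.

```python
B, D = 5, 6

effective_b = round(B ** 0.75, 2)     # B^(3/4), rounded as on the slide
full_tree = B ** D                    # leaves without pruning
average_case = effective_b ** D       # leaves with the reduced branching factor
share = 100 * average_case / full_tree
```

Rounded to one decimal place, the share of the full tree comes out at 8.9%.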

Combinatorial Explosion

Key idea: the branching factor makes finding an optimal solution intractable
- Sum of Subsets problem: 2
- Traveling Salesperson: (N + 1)/2
- 8 puzzle: 2.67
- 15 puzzle: approximately 3 (approaching 4 as puzzles get larger)

Let B = average branching factor
Let T = total nodes
Let D = depth of search

Then

$T = 1 + B + B^2 + B^3 + \dots + B^D = B(B^D - 1)/(B - 1) + 1$

Sum of Subsets

Given a set, S, of positive integers, find all subsets whose sum is m.

E.g.
S = {7, 11, 13, 24}
m = 31

Solutions
S-1 = {7, 11, 13}
S-2 = {7, 24}

Problem Representation

A solution is a sequence of 1s and 0s, indicating whether each element of S has been chosen or not.

Rep of S-1: (1,1,1,0)
Rep of S-2: (1,0,0,1)

The state space is a tree where a left turn indicates a 1 and a right turn indicates a 0
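
This representation translates into a short depth-first search over the binary tree. The sketch below visits every leaf with no pruning; the function name `subset_sums` and its helper are illustrative.

```python
def subset_sums(items, target):
    """Return every 0/1 sequence over items whose chosen elements sum to target."""
    solutions = []

    def explore(index, chosen, total):
        if index == len(items):           # reached a leaf of the state space
            if total == target:
                solutions.append(tuple(chosen))
            return
        explore(index + 1, chosen + [1], total + items[index])  # left turn: take it
        explore(index + 1, chosen + [0], total)                 # right turn: skip it

    explore(0, [], 0)
    return solutions


found = subset_sums([7, 11, 13, 24], 31)
```

For the example set, the search reaches all $2^4 = 16$ leaves and keeps the two that sum to 31.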

Partial Space

[Figure: partial state-space tree with the leaf S-1 marked.]

Three left turns and a right turn to get to S-1 is the sequence (1,1,1,0)

Clearly the branching factor is 2

TSP

Can also be represented as a state space search.

Suppose 4 cities
- 4 choices at level 0
- 3 choices at level 1
- 2 choices at level 2
- 1 choice at level 3

$T = 4 \cdot 3 \cdot 2 \cdot 1 = {}_4P_4 = 4!/(4-4)! = 4! = 24$

Branching Factor

$B_F$ = (sum of choices at each level) / (number of levels)

$= (4 + 3 + 2 + 1)/4 = 2.5$

Clearly this increases with the size of the tour:

$(1 + 2 + 3 + \dots + n)/n = (n(n+1)/2)/n = (n+1)/2$
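
A quick numeric check of the averaging above, with hypothetical helper names:

```python
from math import factorial


def choices_per_level(n):
    """Choices at levels 0 .. n-1 when ordering n cities: n, n-1, ..., 1."""
    return list(range(n, 0, -1))


def average_branching(n):
    """Sum of the choices at each level divided by the number of levels."""
    choices = choices_per_level(n)
    return sum(choices) / len(choices)


def orderings(n):
    """Number of complete orderings of n cities, n!."""
    return factorial(n)


four_city_average = average_branching(4)
four_city_orderings = orderings(4)
```

The average grows only linearly in n, while the number of orderings grows factorially.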

Relationship between branching factor and nodes in the tree

Whenever the branching factor >= 2, we have an exponentially complex problem

Let T = number of nodes in a full binary tree

$T = 2^0 + 2^1 + 2^2 + \dots + 2^{d-1} = 2^d - 1$

Easily proved through induction

Replace 2 by branching factor, B

$T = B^0 + B^1 + B^2 + \dots + B^L = B(B^L - 1)/(B - 1) + 1$

where $L$ is $d - 1$, $d$ being the depth of the tree

Proof

Basis ($L = 0$): $T = B(B^0 - 1)/(B - 1) + 1 = 1$

Inductive hypothesis: $T = B^0 + B^1 + B^2 + \dots + B^L = B(B^L - 1)/(B - 1) + 1$

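Show that this is true at level $L + 1$. That is, show

$B^0 + B^1 + B^2 + \dots + B^{L+1} = B(B^{L+1} - 1)/(B - 1) + 1$

$$
\begin{aligned}
B^0 + B^1 + \dots + B^L + B^{L+1} &= \frac{B(B^L - 1)}{B - 1} + B^{L+1} + 1 \\
&= \frac{B(B^L - 1) + B^{L+1}(B - 1)}{B - 1} + 1 && \text{common denominator} \\
&= \frac{B\left((B^L - 1) + B^L(B - 1)\right)}{B - 1} + 1 && \text{factor } B \text{ out} \\
&= \frac{B(B^L - 1 + B^{L+1} - B^L)}{B - 1} + 1 && \text{multiply} \\
&= \frac{B(B^{L+1} - 1)}{B - 1} + 1 && \text{subtract}
\end{aligned}
$$

which is what we were trying to prove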

Two Concepts

1. B: the average number of descendants that emerge from any state in the space

2. Total nodes $= B(B^L - 1)/(B - 1) + 1$

where $L$ is the deepest level (or, the depth of the search)
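
The closed form can be spot-checked against the level-by-level sum. The sketch assumes an integer branching factor of at least 2, so that the division is exact; the function names are illustrative.

```python
def total_nodes_by_sum(b, deepest_level):
    """Add up b^0 + b^1 + ... + b^L one level at a time."""
    return sum(b ** level for level in range(deepest_level + 1))


def total_nodes_closed_form(b, deepest_level):
    """B(B^L - 1)/(B - 1) + 1, for integer b >= 2."""
    return b * (b ** deepest_level - 1) // (b - 1) + 1


agreement = all(
    total_nodes_by_sum(b, level) == total_nodes_closed_form(b, level)
    for b in range(2, 8)
    for level in range(0, 10)
)
```

With $B = 2$ and $L = d - 1$ this reduces to the binary-tree count $2^d - 1$.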

Another Problem: 8 Puzzle

A          B          C
1 2 3      _ 1 2      1 _ 2
8 _ 4      3 4 5      3 4 5
7 6 5      6 7 8      6 7 8

(_ marks the blank)

A: 4 moves for blank * 1 position = 4
B: 2 moves for blank * 4 positions = 8
C: 3 moves for blank * 4 positions = 12

B = (4 + 8 + 12)/(1 + 4 + 4) = 2.67

Does the branching factor of larger (15, 24) puzzles approach 4 as the puzzles get larger?
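
One way to explore this question is to apply the same counting to an n x n board: for each possible position of the blank, count its legal moves, then average. The function name `average_branching` is illustrative, and the sketch assumes the same position-weighted average used for the 8 puzzle above.

```python
def average_branching(n):
    """Average number of blank moves over all n*n blank positions."""
    total_moves = 0
    for row in range(n):
        for col in range(n):
            neighbors = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
            total_moves += sum(0 <= r < n and 0 <= c < n for r, c in neighbors)
    return total_moves / (n * n)


eight_puzzle = average_branching(3)
fifteen_puzzle = average_branching(4)
twenty_four_puzzle = average_branching(5)
```

Under this counting the 15 puzzle averages 3 and the 24 puzzle 3.2. The total number of moves is $4n(n-1)$, so the average is $4(n-1)/n$, which approaches 4 from below as n grows.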