Rutgers CS440, Fall 2003 Lecture 6: Adversarial Search & Games Reading: Ch. 6, AIMA.
Lecture 6: Adversarial Search & Games
Reading: Ch. 6, AIMA
Adversarial search
• So far: single-agent search, with no opponents or collaborators
• Multi-agent search:
– Playing a game against an opponent: adversarial search
– Economies: even more complex, with societies of cooperative and non-cooperative agents
• Game playing and AI:
– Games can be complex and (arguably) require human intelligence
– Play has to evolve in “real time”
– Well-defined problems
– Limited scope
Games and AI
|                | Deterministic                | Chance                  |
|----------------|------------------------------|-------------------------|
| Perfect info   | Checkers, Chess, Go, Othello | Backgammon, Monopoly    |
| Imperfect info |                              | Bridge, Poker, Scrabble |
Games and search
• Traditional search: a single agent searches for its own well-being, unobstructed
• Games: search against an opponent
• Consider a two-player board game:
– e.g., chess, checkers, tic-tac-toe
– board configuration: a unique arrangement of "pieces"
• Representing a board game as a search problem:
– states: board configurations
– operators: legal moves
– initial state: current board configuration
– goal state: winning/terminal board configuration
Wrong representation
• We want to optimize our agent's goal, so we build a search tree based on our possible moves/actions
• Problem: this discounts the opponent

[Figure: a tic-tac-toe search tree that expands only X's moves, ignoring O's replies]
Better representation: game search tree
• Include the opponent's actions as well

[Figure: a tic-tac-toe game tree with alternating levels: agent move, opponent move, agent move. An agent move plus the opponent's reply is one full move. Utilities (e.g., 10, 15) are assigned to the goal nodes.]
Game search trees
• What is the size of a game search tree?
– O(b^d)
– Tic-tac-toe: 9! leaves (max depth = 9)
– Chess: ~35 legal moves per position, average game "depth" ~100
• b^d ≈ 35^100 ≈ 10^154 states, though "only" ~10^40 legal states
• Too deep for exhaustive search!
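The chess estimate can be checked directly by working in logarithms, since 35^100 itself is far too large to enumerate:

```python
import math

b, d = 35, 100                      # branching factor and game "depth" from the slide
log10_states = d * math.log10(b)    # log10(b^d) = d * log10(b)
print(round(log10_states))          # 154, i.e., b^d is about 10^154
```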
Utilities in search trees
• Assign a utility to each (terminal) state, describing how much it is valued by the agent:
– High utility: good for the agent
– Low utility: good for the opponent

[Figure: example game tree with root A; the computer's possible moves lead to B..E, the opponent's possible moves lead from there to terminal states with utilities F:-7, G:-5, H:3, I:9, J:-6, K:0, L:2, M:1, N:3, O:2. The board evaluation is from the agent's perspective.]
Search strategy
• Worst-case scenario: assume the opponent always makes the best move (i.e., the worst move for us)
• Minimax search: maximize the utility for our agent while expecting that the opponent plays their best moves:
1. High utility favors the agent => choose the move with maximal utility
2. Low utility favors the opponent => assume the opponent makes the move with the lowest utility

[Figure: the example tree with minimax values: terminal utilities F:-7, G:-5, H:3, I:9, J:-6, K:0, L:2, M:1, N:3, O:2 give B:-7, C:-6, D:0, E:1 at the opponent's level, and A:1 at the computer's level.]
Minimax algorithm
1. Start with the utilities of the terminal nodes
2. Propagate them back to the root node by choosing the minimax strategy

[Figure: three snapshots of the example tree. Terminal utilities F:-7, G:-5, H:3, I:9, J:-6, K:0, L:2, M:1, N:3, O:2; the min step yields B:-7, C:-6, D:0, E:1; the max step yields A:1.]
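The two steps above amount to a recursive depth-first evaluation. A minimal sketch follows; the `DictGame` class and its `is_terminal`/`utility`/`successors` interface are a hypothetical toy representation, not the lecture's code:

```python
def minimax(state, game, maximizing=True):
    """Return the minimax value of `state` via depth-first search."""
    if game.is_terminal(state):
        return game.utility(state)                 # step 1: terminal utilities
    values = [minimax(s, game, not maximizing)
              for s in game.successors(state)]
    # step 2: propagate up, taking max at agent levels, min at opponent levels
    return max(values) if maximizing else min(values)


class DictGame:
    """Hypothetical toy game: a tree as {node: children} plus leaf utilities."""
    def __init__(self, tree, utilities):
        self.tree, self.utilities = tree, utilities
    def is_terminal(self, s):
        return s not in self.tree
    def utility(self, s):
        return self.utilities[s]
    def successors(self, s):
        return self.tree[s]


# The example tree from the slides: A is a max node, B..E are min nodes.
game = DictGame(
    tree={'A': 'BCDE', 'B': 'FG', 'C': 'HIJ', 'D': 'KL', 'E': 'MNO'},
    utilities={'F': -7, 'G': -5, 'H': 3, 'I': 9, 'J': -6,
               'K': 0, 'L': 2, 'M': 1, 'N': 3, 'O': 2})
print(minimax('A', game))  # 1
```

The root's value max(-7, -6, 0, 1) = 1 matches the minimax figure above.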
Complexity of minimax algorithm
• Utilities propagate up in a recursive fashion: DFS
• Space complexity: O(bd) (DFS stores only the current path and its siblings)
• Time complexity: O(b^d)
• Problem: the time complexity — it's a game, and there is only finite time to make a move
Reducing complexity of minimax (1)
• Don’t search to the full depth d; terminate early
• Prune bad paths
• Problem:
– we don’t have utilities for non-terminal nodes
• Estimate the utility of non-terminal nodes:
– a static board evaluation function (SBE) is a heuristic that assigns a utility to non-terminal nodes
– it reflects the computer’s chances of winning from that node
– it must be easy to calculate from the board configuration
• For example, in chess:
SBE = α · materialBalance + β · centerControl + γ · …
material balance = value of white pieces - value of black pieces
(pawn = 1, rook = 5, queen = 9, etc.)
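The material term above might be sketched as follows. This is a minimal, hypothetical sketch: the board-as-list-of-(piece, color)-pairs representation and the non-pawn piece values beyond those on the slide are assumptions:

```python
# Assumed, simplified representation: a board is a list of (piece, color) pairs.
PIECE_VALUE = {'pawn': 1, 'knight': 3, 'bishop': 3, 'rook': 5, 'queen': 9}

def material_balance(board):
    """Value of white pieces minus value of black pieces (kings excluded)."""
    balance = 0
    for piece, color in board:
        value = PIECE_VALUE.get(piece, 0)
        balance += value if color == 'white' else -value
    return balance

def sbe(board, alpha=1.0, beta=0.0):
    """SBE = alpha * materialBalance + beta * centerControl + ...
    Only the material term is implemented in this sketch."""
    return alpha * material_balance(board)

board = [('queen', 'white'), ('pawn', 'white'), ('rook', 'black')]
print(sbe(board))  # 9 + 1 - 5 = 5.0
```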
Minimax with Evaluation Functions
• Same as general minimax, except:
– it only goes to depth m
– it estimates non-terminal utilities using the SBE function
• How would this algorithm perform at chess?
– if it could look ahead ~4 pairs of moves (i.e., 8 ply), it would be consistently beaten by average players
– if it could look ahead ~8 pairs, as done on a typical PC, it plays as well as a human master
Reducing complexity of minimax (2)
• Some branches of the tree will never be taken if the opponent plays cleverly. Can we detect them ahead of time?
• Prune off paths that do not need to be explored
• Alpha-beta pruning
• Keep track of the following while doing DFS of the game tree:
– at maximizing levels: alpha
• the highest value seen so far
• a lower bound on the node's evaluation/score
– at minimizing levels: beta
• the lowest value seen so far
• an upper bound on the node's evaluation/score
Alpha-Beta Example

[Figure: the example game tree, searched with depth limit 4. Root A (max) leads to min nodes including B; B leads to max nodes including F and the terminal G:-5; F leads to the terminal N:4 and the min node O; O leads to the terminals W:-3 and X:-5. Blue marks a terminal state (or the depth limit).]

Trace, with calls written as minimax(node, depth, depth limit):

• minimax(A,0,4): expand root A (max level); call stack: A; α(A) not yet set
• minimax(B,1,4): expand B (min level); β(B) not yet set
• minimax(F,2,4): expand F (max level); α(F) not yet set
• minimax(N,3,4): N is a terminal state with value 4
• return to minimax(F,2,4): α(F) = 4, the maximum seen so far
• minimax(O,3,4): expand O (min level); β(O) not yet set
• minimax(W,4,4): W is a terminal state (depth limit) with value -3
• return to minimax(O,3,4): β(O) = -3, the minimum seen so far
• O's beta ≤ F's alpha: stop expanding O (alpha cut-off); X:-5 is never examined
– Why? A smart opponent at O will choose W or worse, so O's upper bound is -3; the computer shouldn't choose O (worth at most -3) since N:4 is better
• return to minimax(F,2,4): alpha is not changed (maximizing), so F's value is 4
• return to minimax(B,1,4): β(B) = 4, the minimum seen so far
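The alpha-beta procedure illustrated above can be sketched as follows. This is a minimal sketch; the nested-list tree encoding (an internal node is a list of children, a leaf is its utility) is an assumption made for illustration:

```python
import math

def alphabeta(node, depth, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Depth-limited minimax with alpha-beta pruning.
    alpha: highest value seen so far at max levels (a lower bound).
    beta:  lowest value seen so far at min levels (an upper bound)."""
    if not isinstance(node, list):
        return node                        # terminal state: return its utility
    # (a real depth-limited search would apply the SBE when depth == 0)
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:              # beta cut-off: prune remaining children
                break
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:              # alpha cut-off, as at node O above
                break
    return value

# The F subtree from the example: F (max) has children N=4 and O (min), and
# O has children W=-3 and X=-5. After N returns 4, expanding O drives O's
# beta to -3 <= F's alpha of 4, so X is pruned without being examined.
print(alphabeta([4, [-3, -5]], depth=2))  # 4
```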
Effectiveness of Alpha-Beta Search
• Effectiveness depends on the order in which successors are examined; it is more effective if the best successors are examined first
• Worst case:
– successors ordered so that no pruning takes place
– no improvement over exhaustive search
• Best case:
– each player's best move is evaluated first (left-most)
• In practice, performance is closer to the best case than to the worst case
Effectiveness of Alpha-Beta Search
• In practice we often get O(b^(d/2)) rather than O(b^d)
– the same as having a branching factor of √b, since (√b)^d = b^(d/2)
• For example, chess:
– b goes from ~35 to ~6
– permits much deeper search in the same time
– makes computer chess competitive with humans
Dealing with Limited Time
• In real games, there is usually a time limit T on making a move
• How do we take this into account?
– we cannot stop alpha-beta midway and expect to use its results with any confidence
– so we could set a conservative depth limit that guarantees we will find a move in time < T
– but then the search may finish early, and the opportunity to search deeper is wasted
Dealing with Limited Time
• In practice, iterative deepening search (IDS) is used
– run alpha-beta search with an increasing depth limit
– when the clock runs out, use the solution found by the last completed alpha-beta search (i.e., the deepest search that was completed)
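The time-limited loop can be sketched as follows. This is a minimal sketch under stated assumptions: `alphabeta_best_move` stands in for a depth-limited alpha-beta search that returns a move (a hypothetical function), and the `max_depth` cap is added so the sketch always terminates:

```python
import time

def id_search(state, time_limit, alphabeta_best_move, max_depth=64):
    """Iterative deepening under a clock: deepen until time runs out, then
    return the move from the deepest search that completed in time."""
    deadline = time.monotonic() + time_limit
    best_move = None
    for depth in range(1, max_depth + 1):
        move = alphabeta_best_move(state, depth)   # one full alpha-beta search
        if time.monotonic() >= deadline:
            break                # this iteration overran the clock: discard it
        best_move = move         # completed in time: keep it
    return best_move
```

A production implementation would also abort the in-progress search at the deadline instead of letting the final iteration run to completion.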
The Horizon Effect
• Sometimes disaster lurks just beyond the search depth
– the computer captures the queen, but a few moves later the opponent checkmates (i.e., wins)
• The computer has a limited horizon; it cannot see that this significant event could happen
• How do you avoid catastrophic losses due to “short-sightedness”?
– quiescence search
– secondary search
The Horizon Effect
• Quiescence search
– when the evaluation is changing frequently, look deeper than the limit
– look for a point where the game “quiets down”
• Secondary search
1. find the best move looking to depth d
2. look k steps beyond to verify that it still looks good
3. if it doesn't, repeat step 2 for the next best move
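The secondary-search steps above can be sketched as follows. Everything here is hypothetical scaffolding: `evaluate(move, depth)` stands in for a depth-limited search that scores a move, and the `tolerance` threshold for "still looks good" is an assumption of this sketch:

```python
def secondary_search(moves, evaluate, d, k, tolerance=1.0):
    """Sketch of secondary search: rank moves by their depth-d score, then
    re-check each candidate k plies deeper; if its score has collapsed,
    fall back to the next best move."""
    ranked = sorted(moves, key=lambda m: evaluate(m, d), reverse=True)  # step 1
    for move in ranked:
        shallow = evaluate(move, d)
        deep = evaluate(move, d + k)          # step 2: look k steps beyond
        if deep >= shallow - tolerance:       # still looks good: accept it
            return move
    return ranked[0]   # step 3 exhausted: take the best shallow move anyway

# Hypothetical depth-indexed scores: move 'a' looks best at depth 4 but
# collapses at depth 6 (the queen-capture trap), so 'b' is chosen.
scores = {('a', 4): 5, ('a', 6): -9, ('b', 4): 3, ('b', 6): 3}
print(secondary_search(['a', 'b'], lambda m, depth: scores[(m, depth)], d=4, k=2))  # b
```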
Book Moves
• Build a database of opening moves, end games, and studied configurations
• If the current state is in the database, use the database:
– to determine the next move
– to evaluate the board
• Otherwise, do alpha-beta search
Examples of Algorithms which Learn to Play Well
Checkers:
A. L. Samuel, “Some Studies in Machine Learning using the Game of Checkers,” IBM Journal of Research and Development, 11(6):601-617, 1959
• Learned by playing a copy of itself thousands of times
• Used only an IBM 704 with 10,000 words of RAM, magnetic tape, and a clock speed of 1 kHz
• Successful enough to compete well at human tournaments
Examples of Algorithms which Learn to Play Well
Backgammon:
G. Tesauro and T. J. Sejnowski, “A Parallel Network that Learns to Play Backgammon,” Artificial Intelligence 39(3), 357-390, 1989
• Also learns by playing copies of itself
• Uses a non-linear evaluation function: a neural network
• Rated one of the top three players in the world
Non-deterministic Games
• Some games involve chance, for example:
– roll of dice
– spin of a game wheel
– deal of cards from a shuffled deck
• How can we handle games with random elements?
• The game tree representation is extended to include chance nodes:
1. agent moves
2. chance nodes
3. opponent moves
Non-deterministic Games
The game tree representation is extended:

[Figure: root A (max) over two chance nodes (each labeled 50/50, with branch probabilities .5 and .5); the left chance node leads to min nodes B (leaves 7, 2; β=2) and C (leaves 9, 6; β=6), the right to D (leaves 5, 0; β=0) and E (leaves 8, -4; β=-4).]
Non-deterministic Games
• Weight each outcome's score by the probability that it occurs
• Use the expected value for a chance node: the probability-weighted sum over its possible random outcomes

[Figure: same tree; the left chance node's value is .5·2 + .5·6 = 4 and the right chance node's value is .5·0 + .5·(-4) = -2.]
Non-deterministic Games
• Choose the move with the highest expected value

[Figure: same tree; the root takes the maximum of the chance-node values, so A = max(4, -2) = 4.]
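The three-level evaluation (max over chance over min) can be sketched as follows. This is a minimal sketch; the tagged-tuple tree encoding is an assumption made for illustration:

```python
def expectiminimax(node):
    """Evaluate a tree whose nodes are tagged tuples:
    ('max', children), ('min', children), ('chance', [(prob, child), ...]),
    or a bare number for a terminal utility."""
    if not isinstance(node, tuple):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # chance node: probability-weighted sum of its outcomes
    return sum(p * expectiminimax(c) for p, c in children)

# The tree from the slides: A (max) over two 50/50 chance nodes over min nodes.
tree = ('max', [
    ('chance', [(.5, ('min', [7, 2])), (.5, ('min', [9, 6]))]),   # value 4
    ('chance', [(.5, ('min', [5, 0])), (.5, ('min', [8, -4]))]),  # value -2
])
print(expectiminimax(tree))  # 4.0
```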
Non-deterministic Games
• Non-determinism increases the branching factor
– 21 distinguishable rolls with 2 dice
• The value of lookahead diminishes: as depth increases, the probability of reaching a given node decreases
• Alpha-beta pruning is less effective
• TD-Gammon:
– depth-2 search
– very good heuristic
– plays at world-champion level
Computers can play Grandmaster Chess

“Deep Blue” (IBM)
• Parallel processor, 32 nodes
• Each node has 8 dedicated VLSI “chess chips”
• Can search 200 million configurations/second
• Uses minimax, alpha-beta, and sophisticated heuristics
• It can typically search to 14 ply (i.e., 7 pairs of moves)
• Can avoid the horizon effect by searching as deep as 40 ply
• Uses book moves
Computers can play Grandmaster Chess

Kasparov vs. Deep Blue, May 1997
• A 6-game full-regulation chess match sponsored by ACM
• Kasparov lost the match 2.5 to 3.5 (1 win and 3 draws against 2 wins and 3 draws)
• This was a historic achievement for computer chess: the first time a computer became the best chess player on the planet
• Note that Deep Blue plays by “brute force” (i.e., raw power from computer speed and memory); it uses relatively little that resembles human intuition and cleverness
Status of Computers in Other Deterministic Games

• Checkers/Draughts
– the current world champion is Chinook
– can beat any human (beat Tinsley in 1994)
– uses alpha-beta search and book moves (> 443 billion positions)
• Othello
– computers can easily beat the world experts
• Go
– branching factor b ~ 360 (very large!)
– a $2 million prize has been offered for any system that can beat a world expert