Games
CSI 4106, Winter 2005

Points
• Games and strategies
• A logic-based approach to games
• AND-OR trees/graphs
• Static evaluation functions
• Tic-Tac-Toe
• Minimax
• Alpha-beta cutoff
• Extensions


Definitions

We consider non-random or semi-random games
• with full information,
• zero-sum,
• two-person (dual),
• with rational players.

That's mainly board games such as chess, chequers, go -- sometimes called strategic games; some paper-and-pencil games are of this kind too.


Definitions (2)

A (pure) strategy: a complete set of advance instructions that specifies a definite choice for every conceivable situation in which the player may be required to act.

In a two-player game, a strategy allows the player to have a response to every move of the opponent.

Game-playing programs implement a strategy as a software mechanism that supplies the right move on request.


A logic-based approach to games

Find a winning strategy by proving that the game can be won -- use backward chaining.

A very simple game: nim (this variant is also known as Grundy's game).
• Initially, there is one stack of chips.
• A move: select a stack and divide it into two unequal non-empty stacks.
• A player who cannot move loses the game.

(With the six chips used in the example that follows, the player who moves first can win.)
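The move rule can be sketched in a few lines of Python. This is only an illustration: the function name `moves` and the choice to represent a position as a sorted tuple of all stack sizes are assumptions, not part of the slides.

```python
def moves(position):
    """Return the set of positions reachable in one move.

    A position is a sorted tuple of stack sizes (an assumed representation).
    A move splits one stack into two unequal non-empty stacks.
    """
    result = set()
    for i, n in enumerate(position):
        rest = position[:i] + position[i + 1:]
        # Choose the smaller part a, with a < n - a, so a runs over 1 .. (n - 1) // 2.
        for a in range(1, (n - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (a, n - a))))
    return result


print(moves((6,)))   # the two ways to split six chips: 5 + 1 and 4 + 2
print(moves((2,)))   # a stack of two cannot be split into unequal parts
```

A stack of 1 or 2 chips yields no moves, which is why such stacks can be ignored when deciding who wins.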


... Games in logic (2)



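[Figure: AND/OR graph for a stack of six chips, A moves first. Each line gives a node, then its successors.]
A_wins([6], A): A_wins([5], B), A_wins([4], B)
A_wins([5], B): A_wins([4], A), A_wins([3], A)
A_wins([4], B): A_wins([3], A)
A_wins([4], A): A_wins([3], B)
A_wins([3], A): A_wins([], B)
A_wins([3], B): A_wins([], A)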

The players are A and B. A_wins( P, X ) means "player X moves in position P and there is a winning continuation for A". A position is represented as a list of the sizes of stacks with 3 or more chips (only those can still be divided).

This is an AND/OR tree (actually a directed, acyclic AND/OR graph).

Player B loses.



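[Figure: AND/OR graph for a stack of six chips, B moves first. Each line gives a node, then its successors.]
A_wins([6], B): A_wins([5], A), A_wins([4], A)
A_wins([5], A): A_wins([4], B), A_wins([3], B)
A_wins([4], A): A_wins([3], B)
A_wins([4], B): A_wins([3], A)
A_wins([3], B): A_wins([], A)
A_wins([3], A): A_wins([], B)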

A winning strategy would always lead to a win. Here, such a strategy is described by a subgraph with one OR edge selected from each OR node. All leaves in the subgraph must represent wins for player A.

Now we cannot find a winning strategy: why?
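The AND/OR evaluation of A_wins can be sketched as a short recursive function. This is a hedged illustration rather than the course's own program: the names `successors` and `a_wins`, the use of tuples for positions, and the memoisation are assumptions. As in the slides, a position keeps only stacks of 3 or more chips.

```python
from functools import lru_cache


def successors(position):
    """Positions after one move; stacks of 1 or 2 chips are dropped."""
    result = set()
    for i, n in enumerate(position):
        rest = position[:i] + position[i + 1:]
        for a in range(1, (n - 1) // 2 + 1):      # a < n - a, both non-empty
            parts = tuple(s for s in (a, n - a) if s >= 3)
            result.add(tuple(sorted(rest + parts, reverse=True)))
    return result


@lru_cache(maxsize=None)
def a_wins(position, mover):
    """True if A has a winning continuation when `mover` is to move."""
    nxt = successors(position)
    if mover == "A":
        # OR node: one good move is enough; with no move left, A loses.
        return any(a_wins(p, "B") for p in nxt)
    # AND node: A must win after every reply of B; with no move left, B loses.
    return all(a_wins(p, "A") for p in nxt)


print(a_wins((6,), "A"))   # A moves first
print(a_wins((6,), "B"))   # B moves first
```

The two calls correspond to the roots of the two graphs: the first evaluates to True and the second to False. The leaves fall out of `any` and `all` on an empty set of successors, so A_wins([], B) is true and A_wins([], A) is false.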



This kind of analysis only works for very small game trees:

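[Figure: the top of the graph for a stack of eight chips, A moves first; only some of the nodes are shown, level by level.]
A_wins([8], A)
A_wins([7], B)
A_wins([6], A)   A_wins([5], A)   A_wins([4, 3], A)
A_wins([4], B)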


The basic loop

The basic loop in a game program:
• build as much of the complete tree as seems reasonable (for example, within a given time limit);
• evaluate the incomplete tree;
• prune unpromising or bad moves;
• make a move;
• get the opponent's move.

Regularities:
• moves of player A sprout from OR nodes,
• moves of player B sprout from AND nodes.

Seen from A's perspective, this means that A chooses one of the moves (the best move, if possible), and is ready to react to all of B's moves.


Static evaluation

A static evaluation function returns the value of a move without trying to play (playing would mean simulating the rest of the game, though not actually playing it).

Usually a static evaluation function returns positive values for positions advantageous to A, negative values for positions advantageous to B.

If player A is rational, he will choose the maximal value of a leaf. Player B will choose the minimal value.


Static evaluation (2)

If we can have (guess or calculate) the value of an internal node N, we can treat it as if it were a leaf. This is the basis of the minimax procedure.

No tree would be necessary if we could evaluate the initial position statically. Normally we need a tree, and we need to look ahead into it. Positions further down the tree can be evaluated more precisely, because there is more information and the search is more focussed.

Minimax works best for large trees, but it can be useful even in mini-games such as tic-tac-toe.
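Backing values up from the leaves can be sketched as follows. The nested-list tree representation and the name `minimax` are assumptions made for this example, and the leaf values are made up.

```python
def minimax(node, maximizing):
    """Back up static values through a tree given as nested lists.

    A leaf is a number (its static evaluation); an internal node is a list
    of children. Levels alternate between maximizing (A) and minimizing (B).
    """
    if not isinstance(node, list):
        return node                      # a leaf: use its static value
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical 2-ply tree: A chooses among three moves, B replies to each.
tree = [[3, 5], [2, 9], [0, 7]]
print(minimax(tree, True))
```

B would hold the three moves to 3, 2 and 0, so A's best backed-up value is 3. Once a subtree has been reduced to a number this way, it is treated exactly like a leaf one level up.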


Tic-Tac-Toe

Let player A be x and let open(x), open(o) mean the number of lines open to x and o. There are 8 lines. An evaluation function for position P:
f(P) = -∞ if o wins
f(P) = +∞ if x wins
f(P) = open(x) - open(o) otherwise

Example: open(x) - open(o) = 6 - 4
[Figure: a board with one x and one o]

Assumptions:
• only one of symmetrical positions is generated;
• we build 2 levels of the game tree (one move -- one response) to have 2-ply lookahead.
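The evaluation function f can be written down directly. In this sketch the board encoding (a 9-character string, row by row, with "." for an empty square) and the helper names `open_lines` and `wins` are assumptions chosen for the example.

```python
import math

# The 8 lines of the board, as index triples into a 9-character string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),    # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),    # columns
         (0, 4, 8), (2, 4, 6)]               # diagonals


def open_lines(board, player):
    """Number of lines that contain no mark of the other player."""
    other = "o" if player == "x" else "x"
    return sum(all(board[i] != other for i in line) for line in LINES)


def wins(board, player):
    """True if `player` occupies all three squares of some line."""
    return any(all(board[i] == player for i in line) for line in LINES)


def f(board):
    """Static evaluation from x's point of view."""
    if wins(board, "o"):
        return -math.inf
    if wins(board, "x"):
        return math.inf
    return open_lines(board, "x") - open_lines(board, "o")


board = ".o..x...."                  # x in the centre, o on an edge
print(open_lines(board, "x"), open_lines(board, "o"), f(board))
```

With x in the centre and o on an edge, x has 6 open lines and o has 4, so this is one position that evaluates to 6 - 4 = 2.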


Tic-Tac-Toe (2)

Player B chooses the minimal backed-up value among level 1 nodes.
Player A chooses the maximal value, and makes the move.
Player B, as a rational agent, selects the optimal response.

[Figure: the 2-ply game tree from the empty board. Evaluations of the level 2 positions, open(x) - open(o), in the order they appear:]
6-5  5-5  6-5  5-5  4-5  5-4  6-4
5-6  5-5  5-6  6-6  4-6
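The 2-ply lookahead for the first move can be sketched as below. This is an illustration under stated assumptions: the board is a 9-character string with "." for an empty square, the names `children` and `best_first_move` are invented here, and symmetrical positions are not merged (that changes the amount of work, not the backed-up values).

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def f(board):
    """open(x) - open(o), or +/-infinity if someone has three in a line."""
    marks = [[board[i] for i in line] for line in LINES]
    if any(m == ["o"] * 3 for m in marks):
        return -math.inf
    if any(m == ["x"] * 3 for m in marks):
        return math.inf
    return sum("o" not in m for m in marks) - sum("x" not in m for m in marks)


def children(board, player):
    """All boards obtained by placing `player` on an empty square."""
    return [board[:i] + player + board[i + 1:]
            for i, c in enumerate(board) if c == "."]


def best_first_move(board="........."):
    """Return (backed-up value, board after x's move) with 2-ply lookahead."""
    best = (-math.inf, None)
    for after_x in children(board, "x"):
        # B picks the reply with the minimal static value.
        backed_up = min((f(b) for b in children(after_x, "o")),
                        default=f(after_x))
        if backed_up > best[0]:          # A keeps the maximal backed-up value
            best = (backed_up, after_x)
    return best


print(best_first_move())
```

From the empty board the backed-up values are -1 for a corner, -2 for an edge and 1 for the centre, so the sketch selects the centre square.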


Tic-Tac-Toe (3)

B's first three moves are blocking moves. Other moves lead to +∞ for A: the only finite value is the minimum.

For A this is a three-way tie in the evaluation; the chance to get more information is to consider more plies.

[Figure: the next 2-ply game tree, after x and o have each moved once. Evaluations shown, open(x) - open(o):]
3-3  3-2  4-3  4-2  3-2  3-2


Tic-Tac-Toe (4)

Now, what happens if B chooses a weaker move?

The procedure finds a winning continuation: the best position ensures a win by forced moves.

[Figure: the 2-ply game tree after a weaker move by B. Evaluations shown, open(x) - open(o):]
2-2  3-2  4-2  4-3  4-3  3-3


Tic-Tac-Toe (5)

Building complete plies is usually not necessary. If we evaluate a position when it is generated, we may save a lot.

Assume that we are at a minimizing level. If the evaluation function returns -∞, we do not need to consider other positions: -∞ will be the minimum.

The same applies to +∞ at a maximizing level.

[Figure: game tree with positions evaluated as they are generated. Evaluations shown:]
2-1  3-1  2-1  3-1  -∞
-∞  -∞  -∞
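Stopping at the first infinite value can be sketched as a loop that evaluates positions one at a time. The function name `min_value`, the returned count and the sample values are assumptions made for this illustration.

```python
import math


def min_value(positions, evaluate):
    """Minimum of evaluate(p) at a minimizing level, stopping at -infinity.

    Returns (value, number of positions actually evaluated). `positions`
    may be a generator, so later positions need not even be generated.
    """
    best = math.inf
    evaluated = 0
    for p in positions:
        value = evaluate(p)
        evaluated += 1
        if value < best:
            best = value
        if best == -math.inf:
            break                        # nothing can be smaller
    return best, evaluated


# Hypothetical static values of five sibling positions, in generation order.
values = [1, 2, -math.inf, 1, 2]
print(min_value(values, lambda v: v))
```

Here the third position already evaluates to -∞, so the last two are never looked at. The mirror-image loop for a maximizing level would stop at +∞.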


Tic-Tac-Toe (6)

This is possible because of the special properties of the infinite values, but we can achieve a similar effect for finite values.

[Figure: the first part of the 2-ply game tree from the empty board. Evaluations shown, open(x) - open(o):]
6-5  5-5  6-5  5-5  4-5  5-6


Tic-Tac-Toe (7)

The backed-up value of the first node at level 1 is -1, so the value of the (maximizing) root must be ≥ -1.

When we see 5 - 6 = -1, we know that the value of the (minimizing) node • must be ≤ -1. The whole subtree sprouting from • cannot contribute anything and should not even be built.

[Figure: the same part of the tree, with the second level 1 node marked •. Evaluations shown, open(x) - open(o):]
6-5  5-5  6-5  5-5  4-5  5-6
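The argument can be replayed in a few lines. This sketch assumes that the first five evaluations belong to the first level 1 node (their minimum is the -1 mentioned in the text) and that 5-6 is the first leaf below the node •; the remaining values below • are hypothetical placeholders, and the name `min_with_bound` is invented here.

```python
def min_with_bound(leaf_values, lower_bound):
    """Minimum over leaf_values, abandoned once it cannot beat lower_bound.

    lower_bound is the provisional value of the maximizing parent.
    Returns (value, number of leaves examined).
    """
    value = float("inf")
    examined = 0
    for v in leaf_values:
        examined += 1
        value = min(value, v)
        if value <= lower_bound:
            break            # the parent already has something at least as good
    return value, examined


first = [6 - 5, 5 - 5, 6 - 5, 5 - 5, 4 - 5]
root_bound, _ = min_with_bound(first, float("-inf"))    # no bound yet

# Only the first value is taken from the figure; the rest are placeholders.
second = [5 - 6, 0, -1, 0, -2]
print(root_bound, min_with_bound(second, root_bound))
```

The first node needs all five leaves and backs up -1. Below •, a single leaf is enough: its value -1 is already no better than what the root has, so the other four are never examined.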


Minimax with cut-off

In general, we keep a provisional value in every node. This value can only increase in an OR (maximizing) node, and decrease in an AND (minimizing) node.

If an AND node with the provisional value V has a child C whose provisional value is greater than or equal to V, we abandon C: the value of C can only increase, so it cannot lower V.

If an OR node with the provisional value V has a child C whose provisional value is less than or equal to V, we abandon C: the value of C can only decrease, so it cannot raise V.

Provisional values are established as soon as we "know something", and are propagated up the tree, from the leaves to the root.


Alpha-beta cut-off (2)

i. We stop searching in and below a minimizing node N with a provisional value PV_N that is less than or equal to the provisional value of one of its maximizing ancestors. The final value for N is PV_N.

ii. We stop searching in and below a maximizing node N with a provisional value PV_N that is greater than or equal to the provisional value of one of its minimizing ancestors. The final value for N is PV_N.


Alpha-beta cut-off (3)

A provisional value of an OR node is called its alpha-value.
A provisional value of an AND node is called its beta-value.

During search,
• the alpha-value of a node is set to the currently largest of the final values for descendants;
• the beta-value of a node is set to the currently smallest of the final values for descendants.

(i) is a shallow alpha-cutoff, (ii) is a shallow beta-cutoff.
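Rules (i) and (ii) can be combined into one recursive procedure. What follows is a common textbook formulation, offered as a sketch and not necessarily the one used in the course: the nested-list tree, the name `alphabeta` and the `visited` list (used only to show which leaves were examined) are assumptions.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list tree, with alpha-beta cut-offs.

    alpha: best value found so far for the maximizing ancestors.
    beta:  best value found so far for the minimizing ancestors.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)         # record every leaf we evaluate
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                    # rule (ii): beta cut-off
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:
            break                        # rule (i): alpha cut-off
    return value


tree = [[3, 5], [2, 9], [0, 7]]          # hypothetical leaf values
seen = []
print(alphabeta(tree, True, visited=seen), seen)
```

On this small tree the root value is 3, and the leaves 9 and 7 are never examined: once the second and third minimizing nodes drop to 2 and 0, they cannot exceed the root's provisional value of 3.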


Alpha-beta cut-off (4)

[Figure: example game tree for alpha-beta cut-off. Leaf values, left to right:]
0 5 -3 3 3 -3 0 2 -2 3 5 2 5 -5 0 1 5 1 -3 0 -5 5 -3 3 2


Extensions, modifications

• "Waiting for quiescence" -- when we reach the depth limit in the middle of a dynamic exchange (large amplitude of values).
• Secondary search ("feedover") -- "double check" down a path that seems the best.
• Book moves -- "canned" continuations (openings, endgames), and forced moves.

Disadvantages of minimax:
• relying on the optimality of the opponent's play,
• no spectacular sacrifices are possible (winning back beyond the search limit),
• the horizon effect.

[Figure: the horizon effect. Labels: Queen lost, Queen lost, Pawn lost, Search limit]


What next?

This is a technology for one kind of game, and quite probably not the most popular (-:). Adventure games require a very different kind of Artificial Intelligence. Visit http://www.gameai.com/ai.html for a nearly professional perspective. And, in general, google it.




