Introduction General Game PlayingLecture 1 Michael Genesereth Spring 2012.

download Introduction General Game PlayingLecture 1 Michael Genesereth Spring 2012.

If you can't read please download the document

description

3/42 General Game Playing General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at “runtime”. Translation: They don’t know the rules until the game starts.

Transcript of Introduction General Game PlayingLecture 1 Michael Genesereth Spring 2012.

Introduction General Game PlayingLecture 1 Michael Genesereth Spring 2012 2/42 Game Playing Human Game Playing Intellectual Activity Skill Comparison Computer Game Playing Testbed for AI Limitations 3/42 General Game Playing General Game Players are systems able to play arbitrary games effectively based solely on formal descriptions supplied at runtime. Translation: They dont know the rules until the game starts. 4/42 Technology Unlike specialized game players (e.g. Deep Blue), they do not use algorithms designed in advance for specific games. Artificial Intelligence Technologies knowledge representation deduction, reformulation, induction, rational behavior w/ uncertainty, resource bounds 5/42 Variety of Games 6/42 Chess 7/42 Knight-Zone Chess 8/42 Bughouse Chess 9/42 Other Games 10/42 Single Player Games a c b 11/42 Annual GGP Competition Held at AAAI or IJCAI conference Administered by Stanford (Stanford folks not eligible to participate) Reward Cluneplayer (Jim Clune) Fluxplayer (Stephan Schiffel, Michael Thielscher) Cadiaplayer (Yngve Bjornsson, Hilmar Finsson) Cadiaplayer (Yngve Bjornsson, Hilmar Finsson) Ary (Jean Mehat) TurboTurtle (Sam Schreiber) Annual GGP Competition 12/42 GGP-05 Winner Jim Clune 13/42 Other Games, Other Winners 14/42 Qualification Round via the web Tournaments in June Semi-Final Round at AAAI Tournament style (one full day) Final Round at AAAI Top two competitors head to head Winner take all (glory + chance to ) Competition at GGP-2012 15/42 Carbon versus Silicon Come cheer on your favorite. The battle continues at GGP-2012. 16/42 Chris Welty 17/42 General Game Description 18/42 Single Player Game as a State Machine s3s3 s2s2 s5s5 s6s6 s8s8 s9s9 s4s4 s7 s 10 s1s1 s 11 b a a d b c a b a b a a b b a a a c d b 19/42 Composite Actions for Multiple Players s3s3 s2s2 s5s5 s6s6 s8s8 s9s9 s4s4 s7s7 s 10 s1s1 s 11 bb ba aa bbab ba aa ab aa ab aa ab aa bb aa bb ba ab 20/42 Direct Description Since all of the games that we are considering are finite, it is possible in principle to communicate game information in the form of transition graphs. Problem:Size of description. Even though everything is finite, the graphs can be large. Solution: Compact Encoding 21/42 Tic-Tac-Toe init(cell(1,1,b)) init(cell(1,2,b)) init(cell(1,3,b)) init(cell(2,1,b)) init(cell(2,2,b)) init(cell(2,3,b)) init(cell(3,1,b)) init(cell(3,2,b)) init(cell(3,3,b)) init(control(x)) legal(P,mark(X,Y)) :- true(cell(X,Y,b)) & true(control(P)) legal(x,noop) :- true(control(o)) legal(o,noop) :- true(control(x)) next(cell(M,N,P)) :- does(P,mark(M,N)) next(cell(M,N,Z)) :- does(P,mark(M,N)) & true(cell(M,N,Z)) & Z#b next(cell(M,N,b)) :- does(P,mark(J,K)) & true(cell(M,N,b)) & (M#J | N#K) next(control(x)) :- true(control(o)) next(control(o)) :- true(control(x)) terminal :- line(P) terminal :- ~open goal(x,100) :- line(x) goal(x,50) :- draw goal(x,0) :- line(o) goal(o,100) :- line(o) goal(o,50) :- draw goal(o,0) :- line(x) row(M,P) :- true(cell(M,1,P)) & true(cell(M,2,P)) & true(cell(M,3,P)) column(N,P) :- true(cell(1,N,P)) & true(cell(2,N,P)) & true(cell(3,N,P)) diagonal(P) :- true(cell(1,1,P)) & true(cell(2,2,P)) & true(cell(3,3,P)) diagonal(P) :- true(cell(1,3,P)) & true(cell(2,2,P)) & true(cell(3,1,P)) line(P) :- row(M,P) line(P) :- column(N,P) line(P) :- diagonal(P) open :- true(cell(M,N,b)) draw :- ~line(x) & ~line(o) 22/42 No Built-in Assumptions What we see: next(cell(M,N,x)) :- does(white,mark(M,N)) & true(cell(M,N,b)) What the player sees: next(welcoul(M,N,himenoing)) :- does(himenoing,dukepse(M,N)) & true(welcoul(M,N,lorenchise)) 23/42 Game Management 24/42 Game Manager Game Descriptions Match Records Temporary State Data Player Graphics for Spectators... Tcp/ip 25/42 Start Manager sends Start message to players Start(id,role,description,startclock,playclock) Play Manager sends Play messages to players Play(id, actions) Receives plays in response Stop Manager sends Stop message to players Stop(id, actions) Game Playing Protocol 26/42 Records (Gamemaster) Matches Tournaments Parameterized Collection of Games One player or n players Competition and Cooperation Services Game Stepper Sample Players (Legal, Random, Maximizer, ) Game Manager (Gameserver) 27/42 General Game Playing 28/42 Logic Possible to use logical reasoning for game play Computing legality of moves Computing consequences of actions Computing goal achievement Computing termination Easy to convert from logic to other representations many orders or magnitude speedup on simulation no asymptotic change Things that may be better done in Logic Game Reformulation Game Analysis 29/42 State Representation cell(1,1,x) cell(1,2,b) cell(1,3,b) cell(2,1,b) cell(2,2,o) cell(2,3,b) cell(3,1,b) cell(3,2,b) cell(3,3,x) control(black) X O X 30/42 Legality Computation legal(W,mark(X,Y)) :-cell(1,1,x) true(cell(X,Y,b)) cell(1,2,b) true(control(W))cell(1,3,b) cell(2,1,b) legal(white,noop) :-cell(2,2,o) true(cell(X,Y,b)) cell(2,3,b) true(control(black))cell(3,1,b) cell(3,2,b) legal(black,noop) :-cell(3,3,x) true(cell(X,Y,b)) control(black) true(control(white)) Conclusions: legal(white,noop) legal(black,mark(1,2)) legal(black,mark(3,2)) 31/42 Update Computation cell(1,1,x)cell(1,2,b) cell(1,3,b)cell(1,3,o) cell(2,1,b) blackcell(2,1,b)cell(2,2,o) cell(2,3,b) mark(1,3)cell(2,3,b)cell(3,1,b)cell(3,2,b)cell(3,3,x) control(black)control(white) next(cell(M,N,o)) :- does(black,mark(M,N)) & true(cell(M,N,b)) 32/42 Game Tree Expansion X OX OX O X OX OX O X X OX OX O X X OX OX O X 33/42 Game Tree Search X OX O X OX OX O X X OX OX O X X OX OX O X X OX OX O X OX OX X OX O X X OX O X X OX O X X OX O X 34/42 Resource Limitations Large state spaces ~5000 states in Tic-Tac-Toe ~10 44 states in Chess Limited Resources Memory Time (start clock, move clock) 35/42 Incremental Search Expand Game Graph incrementally As much as time allows Minimax/etc. evaluation on non-terminal states using an evaluation function of some sort But how do we evaluate non-terminal states? In traditional game playing, the rules are known in advance; and the programmer can invent game- specific evaluation functions. Not possible in GGP. 36/42 Non-Guaranteed Evaluation Functions Ideas Goal proximity (everyone) Maximize mobility (Barney Pell) Minimize opponents mobility (Jim Clune) Depth Charge (aka Monte Carlo) 37/42 AAAI-06 Final - Cylinder Checkers 38/42 Guaranteed Evaluation Functions Ideas * Novelty with reversibility * Goal-monotonic observables * Bad states, useless moves 39/42 Metagaming Metagaming is the process of analyzing and/or modifying a game during the startclock so as to improve performance once the game begins. Examples of Metagaming Techniques: Headstart on Game-Graph Search Game Optimization Compilation Game Reformulation Discovery of Game-Specific Evaluation Functions 40/42 Hodgepodge = Chess + Othello Branching factor: aBranching factor: b Analysis of joint game: Branching factor as given to players: a*b Fringe of tree at depth n as given: (a*b) n Fringe of tree at depth n factored: a n +b n Hodgepodge 41/42 Double Tic-Tac-Toe X O X O O X X OX O O X 42/42 Game Reformulation Examples Symmetry, e.g. Tic-Tac-Toe Independence, e.g Hodgepodge Bottlenecks, e.g. Triathalon Techniques: Graph Analysis (factoring, branching factor, ) Proofs (esp those using mathematical induction) Trade-off - cost of finding structure vs savings Sometimes cost varies with size of description rather than size of expanded game graph Use of startclock versus playclock 43/42 Delayed Gratification X OX O O X 44/42 Conclusion 45/42 Critique: Game playing is frivolous. Reply: General game playing is serious business. AI: General problem solving DB: Dynamic constraints Systems: Automatic Programming redux Motivating Applications: Enterprise Management, Computational Law Funding by Darpa, SAP, etc. General Game Playing is not a game 46/42 Like General Game Theory, traditional Game Theory is concerned with games in general. Game Theory is concerned with the analysis of game trees. There is little concern for how those trees are communicated. There is little concern for trade-offs between deliberation and action. In General Game Playing, the game tree is communicated at runtime. Hence there is an emphasis on description and use of this description. There is the possibility of new forms of rationality. General Game Playing and Game Theory 47/42 Planning: Inputs: State machine, initial state, goal states Output: Plan that transforms initial to goal state General Game Playing: Inputs: State machine, initial state, goal states Objective: Reach a goal state (plan/execute?) Difference: Presence of execution environment with time bounds on combined deliberation/action Study of interleaved planning and execution General Game Playing and Planning 48/42 Critique: Human Intelligence is arguably the product of eons of evolution. We are, to some extent at least, wired to function well in this world. General Game Players have nothing at their disposal but mathematics. Reply: A key indicator of intelligence is the ability of each individual to function in radically new environments. Tabula Rasa 49/42 Critique: Real intelligence requires the ability to figure out new environments. Reply: Well, there is no question that that ability is essential. However, real intelligence also requires the ability to use theories once they are formed. This is the domain of general systems (universal systems or, sometimes, declarative systems); and interest in this problem dates to the beginning of the field. Using What We Learn 50/42 General Systems Ordinary Systems General Systems Like program interpreter but for declarative rules. General System Client Environment Rules Application-Specific System Client Environment 51/42 John McCarthy The main advantage we expect the advice taker to have is that its behavior will be improvable merely by making statements to it, telling it about its environment and what is wanted from it. To make these statements will require little, if any, knowledge of the program or the previous knowledge of the advice taker. 52/42 Ed Feigenbaum The potential use of computers by people to accomplish tasks can be one-dimensionalized into a spectrum representing the nature of the instruction that must be given the computer to do its job. Call it the what-to-how spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly how to do his job step-by-step.... At the other end of the spectrum is the user with his real problem.... He aspires to communicate what he wants done... without having to lay out in detail all necessary subgoals for adequate performance. 53/42 Robert Heinlein A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects. computer/robot v 54/42 55/42 56/42 Schedule March 30 Introduction April 6 Game Description 13Complete Search 20Incomplete Search 27Metagaming May 4Propositional Nets 11Factoring 18Pruning Heuristics 25Really General Game Playing June 1Final Competition 57/42 Teams Composition 3 people each (2 or 4 okay with good reason) Names: Pansy Division The Pumamen Team Camembert Mighty Bourgeoisie Michael Genesereth The Lemmings Greedy Bastards Team Gal No Homers Red Hot Chili Peppers 58/42 Technology Language Java Lisp Operating System Mac OS Unix Linux Windows Hardware Whatever you like but Accessible via hardwire connection to our router 59/42 60/42 61/42 GGP-06 Winners 62/42 Winners