CSC304 Lecture 5
Game Theory : Zero-Sum Games,
The Minimax Theorem
CSC304 - Nisarg Shah 1
Recap
CSC304 - Nisarg Shah 2
โข Last lectureโข Cost-sharing games
o Price of anarchy (PoA) can be ๐
o Price of stability (PoS) is ๐(log ๐)
โข Potential functions and pure Nash equilibria
โข Congestion games
โข Braessโ paradox
โข Updated (slightly more detailed) slides
โข Assignment 1 to be posted
โข Volunteer note-taker
Zero-Sum Games
CSC304 - Nisarg Shah 3
โข Total reward constant in all outcomes (w.l.o.g. 0)โข Common term: โzero-sum situationโ
โข Psychology literature: โzero-sum thinkingโ
โข โStrictly competitive gamesโ
โข Focus on two-player zero-sum games (2p-zs)โข โThe more I win, the more you loseโ
Zero-Sum Games
CSC304 - Nisarg Shah 4
SamJohn Stay Silent Betray
Stay Silent (-1 , -1) (-3 , 0)
Betray (0 , -3) (-2 , -2)
Non-zero-sum game: Prisonerโs dilemma
Zero-sum game: Rock-Paper-Scissor
P1P2 Rock Paper Scissor
Rock (0 , 0) (-1 , 1) (1 , -1)
Paper (1 , -1) (0 , 0) (-1 , 1)
Scissor (-1 , 1) (1 , -1) (0 , 0)
Zero-Sum Games
CSC304 - Nisarg Shah 5
โข Why are they interesting?โข Most games we play are zero-sum: chess, tic-tac-toe,
rock-paper-scissor, โฆ
โข (win, lose), (lose, win), (draw, draw)
โข (1, -1), (-1, 1), (0, 0)
โข Why are they technically interesting?โข Relation between the rewards of P1 and P2
โข P1 maximizes his reward
โข P2 maximizes his reward = minimizes reward of P1
Zero-Sum Games
CSC304 - Nisarg Shah 6
โข Reward for P2 = - Reward for P1
โข Only need a single matrix ๐ด : reward for P1
โข P1 wants to maximize, P2 wants to minimize
P1P2 Rock Paper Scissor
Rock 0 -1 1
Paper 1 0 -1
Scissor -1 1 0
Rewards in Matrix Form
CSC304 - Nisarg Shah 7
โข Say P1 uses mixed strategy ๐ฅ1 = (๐ฅ1,1, ๐ฅ1,2, โฆ )โข What are the rewards for P1 corresponding to different
possible actions of P2?
๐ ๐
๐ฅ1,1
๐ฅ1,2
๐ฅ1,3
.
.
.
Rewards in Matrix Form
CSC304 - Nisarg Shah 8
โข Say P1 uses mixed strategy ๐ฅ1 = (๐ฅ1,1, ๐ฅ1,2, โฆ )โข What are the rewards for P1 corresponding to different
possible actions of P2?
๐ ๐
๐ฅ1,1, ๐ฅ1,2, ๐ฅ1,3, โฆ โ
โ Reward for P1 when P2
chooses sj = ๐ฅ1๐ โ ๐ด
๐
Rewards in Matrix Form
CSC304 - Nisarg Shah 9
โข Reward for P1 whenโฆโข P1 uses mixed strategy ๐ฅ1โข P2 uses mixed strategy ๐ฅ2
๐ฅ1๐ โ ๐ด
1, ๐ฅ1
๐ โ ๐ด2, ๐ฅ1
๐ โ ๐ด3โฆ โ
๐ฅ2,1๐ฅ2,2๐ฅ2,3โฎ
= ๐ฅ1๐ โ ๐ด โ ๐ฅ2
CSC304 - Nisarg Shah 10
How would the two players actdo in this zero-sum game?
John von Neumann, 1928
Maximin Strategy
CSC304 - Nisarg Shah 11
โข Worst-case thinking by P1โฆโข If I choose mixed strategy ๐ฅ1โฆ
โข P2 would choose ๐ฅ2 to minimize my reward (i.e., maximize his reward)
โข Let me choose ๐ฅ1 to maximize this โworst-case rewardโ
๐1โ = max
๐ฅ1min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2
Maximin Strategy
CSC304 - Nisarg Shah 12
๐1โ = max
๐ฅ1min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2
โข ๐1โ : maximin value of P1
โข ๐ฅ1โ (maximizer) : maximin strategy of P1
โข โBy playing ๐ฅ1โ, I guarantee myself at least ๐1
โโ
โข But if P1 โ ๐ฅ1โ, P2โs best response โ เท๐ฅ2
โข Will ๐ฅ1โ be the best response to เท๐ฅ2?
Maximin vs Minimax
CSC304 - Nisarg Shah 13
Player 1
Choose my strategy to maximize my reward, worst-case over P2โs response
๐1โ = max
๐ฅ1min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2
Player 2
Choose my strategy to minimize P1โs reward, worst-case over P1โs response
๐2โ = min
๐ฅ2max๐ฅ1
๐ฅ1๐ โ ๐ด โ ๐ฅ2
Question: Relation between ๐1โ and ๐2
โ?
๐ฅ1โ ๐ฅ2
โ
Maximin vs Minimax
CSC304 - Nisarg Shah 14
๐1โ = max
๐ฅ1min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2 ๐2
โ = min๐ฅ2
max๐ฅ1
๐ฅ1๐ โ ๐ด โ ๐ฅ2
โข What if (P1,P2) play (x1โ , x2
โ)?โข P1 must get at least ๐1
โ (ensured by P1)
โข P1 must get at most ๐2โ (ensured by P2)
โข ๐1โ โค ๐2
โ
๐ฅ1โ ๐ฅ2
โ
The Minimax Theorem
CSC304 - Nisarg Shah 16
โข Jon von Neumann [1928]
โข Theorem: For any 2p-zs game,
โข ๐1โ = ๐2
โ = ๐โ (called the minimax value of the game)
โข Set of Nash equilibria =
{ x1โ , x2
โ โถ x1โ = maximin for P1, x2
โ = minimax for P2}
โข Corollary: ๐ฅ1โ is best response to ๐ฅ2
โ and vice-versa.
The Minimax Theorem
CSC304 - Nisarg Shah 17
โข Jon von Neumann [1928]
โAs far as I can see, there could be no theory of games โฆ without that theorem โฆ
I thought there was nothing worth publishing until the Minimax Theorem was provedโ
โข An unequivocal way to โsolveโ zero-sum gamesโข Optimal strategies for P1 and P2 (up to ties)
โข Optimal rewards for P1 and P2 under a rational play
Proof of the Minimax Theorem
CSC304 - Nisarg Shah 18
โข Simpler proof using Nashโs theoremโข But predates Nashโs theorem
โข Suppose ๐ฅ1, ๐ฅ2 is a NE
โข P1 gets value ๐ฃ = ๐ฅ1๐๐ด ๐ฅ2
โข ๐ฅ1 is best response for P1 : ๐ฃ = max๐ฅ1 ๐ฅ1๐๐ด ๐ฅ2
โข ๐ฅ2 is best response for P2 : ๐ฃ = min๐ฅ2 ๐ฅ1๐๐ด ๐ฅ2
Proof of the Minimax Theorem
CSC304 - Nisarg Shah 19
โข But we already saw ๐1โ โค ๐2
โ
โข ๐1โ = ๐2
โ
max๐ฅ1
๐ฅ1๐๐ด ๐ฅ2 = ๐ฃ = min
๐ฅ2๐ฅ1
๐๐ด ๐ฅ2
โค max๐ฅ1
min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2 = ๐1
โ
๐2โ = min
๐ฅ2max๐ฅ1
๐ฅ1๐ โ ๐ด โ ๐ฅ2 โค
Proof of the Minimax Theorem
CSC304 - Nisarg Shah 20
โข When ( ๐ฅ1, ๐ฅ2) is a NE, ๐ฅ1 and ๐ฅ2 must be maximin and minimax strategies for P1 and P2, respectively.
โข The reverse direction is also easy to prove.
max๐ฅ1
๐ฅ1๐๐ด ๐ฅ2 = ๐ฃ = max
๐ฅ2๐ฅ1
๐๐ด ๐ฅ2
= max๐ฅ1
min๐ฅ2
๐ฅ1๐ โ ๐ด โ ๐ฅ2 = ๐1
โ
๐2โ = min
๐ฅ2max๐ฅ1
๐ฅ1๐ โ ๐ด โ ๐ฅ2 =
Computing Nash Equilibria
CSC304 - Nisarg Shah 21
โข Can I practically compute a maximin strategy (and thus a Nash equilibrium of the game)?
โข Wasnโt it computationally hard even for 2-player games?
โข For 2p-zs games, a Nash equilibrium can be computed in polynomial time using linear programming.
โข Polynomial in #actions of the two players: ๐1 and ๐2
Computing Nash Equilibria
CSC304 - Nisarg Shah 22
Maximize ๐ฃ
Subject to
๐ฅ1๐ ๐ด
๐โฅ ๐ฃ, ๐ โ 1,โฆ ,๐2
๐ฅ1 1 +โฏ+ ๐ฅ1 ๐1 = 1
๐ฅ1 ๐ โฅ 0, ๐ โ {1,โฆ ,๐1}
Minimax Theorem in Real Life?
CSC304 - Nisarg Shah 23
โข If you were to play a 2-player zero-sum game (say, as player 1), would you always play a maximin strategy?
โข What if you were convinced your opponent is an idiot?
โข What if you start playing the maximin strategy, but observe that your opponent is not best responding?
Minimax Theorem in Real Life?
CSC304 - Nisarg Shah 24
KickerGoalie L R
L 0.58 0.95
R 0.93 0.70
KickerMaximize ๐ฃSubject to0.58๐๐ฟ + 0.93๐๐ โฅ ๐ฃ0.95๐๐ฟ + 0.70๐๐ โฅ ๐ฃ๐๐ฟ + ๐๐ = 1
๐๐ฟ โฅ 0, ๐๐ โฅ 0
GoalieMinimize ๐ฃSubject to0.58๐๐ฟ + 0.95๐๐ โค ๐ฃ0.93๐๐ฟ + 0.70๐๐ โค ๐ฃ๐๐ฟ + ๐๐ = 1
๐๐ฟ โฅ 0, ๐๐ โฅ 0
Minimax Theorem in Real Life?
CSC304 - Nisarg Shah 25
KickerGoalie L R
L 0.58 0.95
R 0.93 0.70
KickerMaximin:๐๐ฟ = 0.38, ๐๐ = 0.62
Reality:๐๐ฟ = 0.40, ๐๐ = 0.60
GoalieMaximin:๐๐ฟ = 0.42, ๐๐ = 0.58
Reality:๐๐ฟ = 0.423, ๐๐ = 0.577
Some evidence that people may play minimax strategies.
Minimax Theorem
CSC304 - Nisarg Shah 26
โข We proved it using Nashโs theoremโข Cheating. Typically, Nashโs theorem (for
the special case of 2p-zs games) is proved using the minimax theorem.
โข Useful for proving Yaoโs principle, which provides lower bound for randomized algorithms
โข Equivalent to linear programming duality
John von Neumann
George Dantzig
von Neumann and Dantzig
CSC304 - Nisarg Shah 27
George Dantzig loves to tell the story of his meeting with John von Neumann on October 3, 1947 at the Institute for Advanced Study at Princeton. Dantzig went to that meeting with the express purpose of describing the linear programming problem to von Neumann and asking him to suggest a computational procedure. He was actually looking for methods to benchmark the simplex method. Instead, he got a 90-minute lecture on Farkas Lemma and Duality (Dantzig's notes of this session formed the source of the modern perspective on linear programming duality). Not wanting Dantzig to be completely amazed, von Neumann admitted:
"I don't want you to think that I am pulling all this out of my sleeve like a magician. I have recently completed a book with Morgenstern on the theory of games. What I am doing is conjecturing that the two problems are equivalent. The theory that I am outlining is an analogue to the one we have developed for games.โ
- (Chandru & Rao, 1999)
Top Related