Learning in Games. Fictitious Play Notation! For n Players we have: n Finite Player’s Strategies...
Date posted: 20-Dec-2015
Learning in Games
Fictitious Play
Notation!
For n players we have:
• n finite strategy spaces $S^1, S^2, \dots, S^n$, one per player
• n opponents' strategy spaces $S^{-1}, S^{-2}, \dots, S^{-n}$
• n payoff functions $u^1, u^2, \dots, u^n$
• for each $i$ and each $s^{-i} \in S^{-i}$, a set of best responses $BR^i(s^{-i})$
What is Fictitious Play?
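Each player forms an assessment of the opponents' strategies in the form of a weight function $\kappa_t^i$. It starts from initial weights $\kappa_0^i : S^{-i} \to \mathbb{R}_+$ and is updated after every period:

$$\kappa_t^i(s^{-i}) = \kappa_{t-1}^i(s^{-i}) + \begin{cases} 1 & \text{if } s_{t-1}^{-i} = s^{-i} \\ 0 & \text{if } s_{t-1}^{-i} \neq s^{-i} \end{cases}$$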
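The weight update above (add 1 to the weight of the opponent action just observed, leave the others unchanged) can be sketched in a few lines of Python. The function name `update_weights` and the dictionary representation are illustrative assumptions, not something the slides prescribe; the example weights are the row player's initial weights from the matching pennies example.

```python
def update_weights(weights, observed):
    """Return updated weights after observing the opponent's last action.

    weights:  dict mapping each opponent action to its current weight
    observed: the opponent action played in the previous period
    (Both names are hypothetical; the slides only give the formula.)
    """
    # Add 1 to the observed action's weight and 0 to every other weight.
    return {s: w + (1.0 if s == observed else 0.0) for s, w in weights.items()}


kappa_0 = {"H": 1.5, "T": 2.0}          # initial weights
kappa_1 = update_weights(kappa_0, "T")  # the opponent played T
print(kappa_1)  # {'H': 1.5, 'T': 3.0}
```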
Prediction
Probability that player i assigns to player −i playing $s^{-i}$ at time t:

$$\gamma_t^i(s^{-i}) = \frac{\kappa_t^i(s^{-i})}{\sum_{\tilde{s}^{-i} \in S^{-i}} \kappa_t^i(\tilde{s}^{-i})}$$
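As a small sketch of this prediction step, the weights can be normalized so that they sum to one. The function name `assessment` is a hypothetical choice, and the weights in the example are illustrative.

```python
def assessment(weights):
    """Turn a weight function into probabilities over the opponent's actions."""
    total = sum(weights.values())
    # Each probability is the action's weight divided by the sum of all weights.
    return {s: w / total for s, w in weights.items()}


gamma = assessment({"H": 1.5, "T": 3.0})
print(gamma)  # H gets 1.5/4.5 and T gets 3/4.5
```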
Fictitious Play is …
… any rule $\rho_t^i$ that assigns $\rho_t^i(\gamma_t^i) \in BR^i(\gamma_t^i)$.
NOT UNIQUE!
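The non-uniqueness comes from ties: the set of best responses to an assessment can contain more than one action, and any selection from it is a valid fictitious play rule. A minimal sketch, assuming a hypothetical `best_responses` helper and using the row player's matching pennies payoffs as illustrative data:

```python
def best_responses(payoff, gamma, tol=1e-12):
    """Return all actions that maximize expected payoff against gamma.

    payoff[a][s] is the player's payoff for own action a against opponent
    action s; gamma[s] is the assessed probability of s. Names are illustrative.
    """
    expected = {a: sum(gamma[s] * u for s, u in row.items())
                for a, row in payoff.items()}
    best = max(expected.values())
    return [a for a, v in expected.items() if v >= best - tol]


# Row player in matching pennies: wins 1 on a match, loses 1 otherwise.
ROW_PAYOFF = {"H": {"H": 1, "T": -1}, "T": {"H": -1, "T": 1}}

tie = best_responses(ROW_PAYOFF, {"H": 0.5, "T": 0.5})
unique = best_responses(ROW_PAYOFF, {"H": 1 / 3, "T": 2 / 3})
print(tie)     # ['H', 'T']  (both actions are best responses)
print(unique)  # ['T']
```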
Further Definitions
In 2 Player games:
Marginal empirical distributions of j’s play (j=-i):

$$d_t^j(s^j) = \frac{\kappa_t^i(s^j) - \kappa_0^i(s^j)}{t}$$
Asymptotic Behavior
Propositions:
• Strict Nash equilibria are absorbing for the process of fictitious play.
• Any pure-strategy steady state of fictitious play must be a Nash equilibrium.
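Example “matching pennies”

|       | H    | T    |
|-------|------|------|
| **H** | 1,-1 | -1,1 |
| **T** | -1,1 | 1,-1 |

(Rows: row player's action; columns: column player's action.)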
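Weights:

| Period | Row Player (H, T) | Col Player (H, T) |
|--------|-------------------|-------------------|
| 0      | 1.5, 2            | 2, 1.5            |
| 1      | 1.5, 3            | 2, 2.5            |
| 2      | 2.5, 3            | 2, 3.5            |
| 3      | 3.5, 3            | 2, 4.5            |
| 4      | 4.5, 3            | 3, 4.5            |
| 5      | 5.5, 3            | 4, 4.5            |
| 6      | 6.5, 3            | 5, 4.5            |
| 7      | 6.5, 4            | 6, 4.5            |

(Each player's weights are over the opponent's actions H and T.)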
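The weight sequence of this example can be reproduced with a short simulation. This is a sketch under stated assumptions: the function names are hypothetical, the row player is taken to win on a match and the column player on a mismatch (as in the payoff matrix of the example), and best responses are computed directly from the unnormalized weights, which gives the same choice as normalizing first.

```python
ACTIONS = ("H", "T")

# Row player's payoff for (row_action, col_action); the column player gets the negative.
U_ROW = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}


def row_best_response(w):
    """Best response of the row player to weights w over the column player's actions."""
    return max(ACTIONS, key=lambda a: sum(w[s] * U_ROW[(a, s)] for s in ACTIONS))


def col_best_response(w):
    """Best response of the column player to weights w over the row player's actions."""
    return max(ACTIONS, key=lambda a: sum(-w[s] * U_ROW[(s, a)] for s in ACTIONS))


def fictitious_play(row_w, col_w, periods):
    """Return (plays, weight_trace) starting from the given initial weights."""
    row_w, col_w = dict(row_w), dict(col_w)
    plays = []
    trace = [((row_w["H"], row_w["T"]), (col_w["H"], col_w["T"]))]
    for _ in range(periods):
        a_row, a_col = row_best_response(row_w), col_best_response(col_w)
        plays.append((a_row, a_col))
        row_w[a_col] += 1.0  # the row player observes the column player's action
        col_w[a_row] += 1.0  # the column player observes the row player's action
        trace.append(((row_w["H"], row_w["T"]), (col_w["H"], col_w["T"])))
    return plays, trace


plays, trace = fictitious_play({"H": 1.5, "T": 2.0}, {"H": 2.0, "T": 1.5}, periods=7)
for t, (row_weights, col_weights) in enumerate(trace):
    print(t, row_weights, col_weights)
```

With these initial weights no ties occur, so the tie-breaking behaviour of `max` does not affect the result.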
Convergence?
Strategies cycle and do not converge …
…but the marginal empirical distributions?

$$d_t^j(s^j) = \frac{\kappa_t^i(s^j) - \kappa_0^i(s^j)}{t} \;\to\; \frac{1}{2} \quad \text{as } t \to \infty$$
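A rough numerical check of this behaviour for matching pennies: play keeps cycling, yet the empirical frequency of H approaches 1/2 for both players. The sketch below assumes the initial weights (1.5, 2) and (2, 1.5) from the example; the function name is hypothetical. With these half-integer offsets the two weights of a player are never equal, so no tie-breaking rule is needed.

```python
def empirical_frequencies(periods):
    """Return the empirical frequency of H for (row player, column player)."""
    row_w = {"H": 1.5, "T": 2.0}  # row player's weights over the column player's actions
    col_w = {"H": 2.0, "T": 1.5}  # column player's weights over the row player's actions
    row_w0, col_w0 = dict(row_w), dict(col_w)
    for _ in range(periods):
        # The row player matches the action it considers more likely.
        a_row = "H" if row_w["H"] > row_w["T"] else "T"
        # The column player mismatches the action it considers more likely.
        a_col = "T" if col_w["H"] > col_w["T"] else "H"
        row_w[a_col] += 1.0
        col_w[a_row] += 1.0
    # Empirical distribution: (current weight - initial weight) / t.
    d_col_H = (row_w["H"] - row_w0["H"]) / periods
    d_row_H = (col_w["H"] - col_w0["H"]) / periods
    return d_row_H, d_col_H


for t in (10, 100, 1000, 10000):
    print(t, empirical_frequencies(t))
```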
MATLAB Simulation - Pennies
[Figure panels: Game Play, Payoff, Weight / Time]
Proposition
Under fictitious play, if the empirical distributions over each player’s choices converge, the strategy profile corresponding to the product of these distributions is a Nash equilibrium.
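Rock-Paper-Scissors
[Figure panels: Game Play, Payoff, Weight / Time]

|       | A       | B       | C       |
|-------|---------|---------|---------|
| **A** | 1/2,1/2 | 0,1     | 1,0     |
| **B** | 1,0     | 1/2,1/2 | 0,1     |
| **C** | 0,1     | 1,0     | 1/2,1/2 |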
Shapley Game
[Figure panels: Game Play, Payoff, Weight / Time]

| 0,0 | 0,1 | 1,0 |
|-----|-----|-----|
| 1,0 | 0,0 | 0,1 |
| 0,1 | 1,0 | 0,0 |
Persistent miscoordination
[Figure panels: Game Play, Payoff, Weight / Time]

|       | A   | B   |
|-------|-----|-----|
| **A** | 0,0 | 1,1 |
| **B** | 1,1 | 0,0 |

Initial weights shown on successive slides: (1.41, 1.41), (1.42, 1.42), (2.42, 2.42)
Nash: (1,0), (0,1), (0.5,0.5)
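A sketch of how miscoordination can persist under fictitious play. The initial weights used here, 1 on A and √2 on B for both players, are an assumption borrowed from the textbook version of this example; the slides only show rounded values such as 1.41, so the exact setup of the MATLAB runs is not known. All names are illustrative.

```python
import math

def payoff(a_own, a_other):
    # Both players earn 1 when their actions differ and 0 when they coincide.
    return 1 if a_own != a_other else 0

def best_response(w):
    # Choose the opposite of the action the opponent is believed more likely to play.
    return "B" if w["A"] > w["B"] else "A"

def simulate(periods, w_a=1.0, w_b=math.sqrt(2)):
    """Fictitious play with identical (assumed) initial weights for both players."""
    w1 = {"A": w_a, "B": w_b}  # player 1's weights over player 2's actions
    w2 = {"A": w_a, "B": w_b}  # player 2's weights over player 1's actions
    plays, payoffs = [], []
    for _ in range(periods):
        a1, a2 = best_response(w1), best_response(w2)
        plays.append((a1, a2))
        payoffs.append(payoff(a1, a2))
        w1[a2] += 1.0
        w2[a1] += 1.0
    return plays, payoffs

plays, payoffs = simulate(1000)
freq_a = sum(1 for a1, _ in plays if a1 == "A") / len(plays)
print(plays[:4])      # both players always choose the same action
print(sum(payoffs))   # total realized payoff
print(freq_a)         # empirical frequency of A for player 1
```

Under this assumption the players alternate between (A, A) and (B, B): the empirical frequencies match the mixed equilibrium (0.5, 0.5), but every realized payoff is 0, whereas the mixed equilibrium yields an expected payoff of 0.5.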
Summary on fictitious play
If the time averages of the strategies converge, their limit forms a Nash equilibrium.
The average payoff need not be that of a Nash equilibrium (e.g. persistent miscoordination).
The time averages may not converge at all (e.g. Shapley game).
References
Fudenberg D., Levine D. K. (1998). The Theory of Learning in Games. MIT Press.
Nash Convergence of Gradient Dynamics in General-Sum Games
Notation
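2 Players:
Strategies $(\alpha, 1-\alpha)^T$ for the row player and $(\beta, 1-\beta)^T$ for the column player
Payoff matrices

$$R = \begin{pmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \end{pmatrix}, \qquad C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$$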
Objective Functions
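Payoff Functions:

$$V_r(\alpha,\beta) = r_{11}(\alpha\beta) + r_{22}((1-\alpha)(1-\beta)) + r_{12}(\alpha(1-\beta)) + r_{21}((1-\alpha)\beta)$$

$$V_c(\alpha,\beta) = c_{11}(\alpha\beta) + c_{22}((1-\alpha)(1-\beta)) + c_{12}(\alpha(1-\beta)) + c_{21}((1-\alpha)\beta)$$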
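The payoff functions can be evaluated directly once the probabilities of the four action pairs are written out. A minimal sketch, assuming the row player picks action 1 with probability α and the column player with probability β; the function name and the matching pennies matrices are illustrative choices.

```python
def payoffs(R, C, alpha, beta):
    """Expected payoffs (V_r, V_c) for mixed strategies (alpha, 1-alpha), (beta, 1-beta)."""
    # Probability of each (row action, column action) pair.
    probs = [[alpha * beta, alpha * (1 - beta)],
             [(1 - alpha) * beta, (1 - alpha) * (1 - beta)]]
    v_r = sum(probs[i][j] * R[i][j] for i in range(2) for j in range(2))
    v_c = sum(probs[i][j] * C[i][j] for i in range(2) for j in range(2))
    return v_r, v_c


# Illustrative matrices: matching pennies.
R = [[1, -1], [-1, 1]]
C = [[-1, 1], [1, -1]]
print(payoffs(R, C, 0.5, 0.5))  # (0.0, 0.0)
print(payoffs(R, C, 1.0, 1.0))  # (1.0, -1.0)
```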
Hillclimbing Idea
Gradient Ascent for Iterated Games
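With $u = (r_{11}+r_{22}) - (r_{21}+r_{12})$ and $u' = (c_{11}+c_{22}) - (c_{21}+c_{12})$:

$$\frac{\partial V_r(\alpha,\beta)}{\partial \alpha} = \beta u - (r_{22} - r_{12})$$

$$\frac{\partial V_c(\alpha,\beta)}{\partial \beta} = \alpha u' - (c_{22} - c_{21})$$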
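The two partial derivatives follow from differentiating the payoff functions term by term, and they can be checked against a finite difference. The sketch below uses a hypothetical prisoner's-dilemma-like pair of matrices chosen only for illustration; note that the constant term for the column player works out to c22 − c21.

```python
def value(M, a, b):
    """Expected payoff for payoff matrix M when action 1 has probability a (row), b (column)."""
    return (M[0][0] * a * b + M[1][1] * (1 - a) * (1 - b)
            + M[0][1] * a * (1 - b) + M[1][0] * (1 - a) * b)

def grad_alpha(R, beta):
    """dV_r/dalpha = beta * u - (r22 - r12)."""
    u = (R[0][0] + R[1][1]) - (R[1][0] + R[0][1])
    return beta * u - (R[1][1] - R[0][1])

def grad_beta(C, alpha):
    """dV_c/dbeta = alpha * u' - (c22 - c21)."""
    u_prime = (C[0][0] + C[1][1]) - (C[1][0] + C[0][1])
    return alpha * u_prime - (C[1][1] - C[1][0])

# Hypothetical payoff matrices, not taken from the slides.
R = [[3, 0], [5, 1]]
C = [[3, 5], [0, 1]]

h = 1e-6
alpha, beta = 0.3, 0.7
num_alpha = (value(R, alpha + h, beta) - value(R, alpha - h, beta)) / (2 * h)
num_beta = (value(C, alpha, beta + h) - value(C, alpha, beta - h)) / (2 * h)
print(grad_alpha(R, beta), num_alpha)  # analytic vs numeric derivative
print(grad_beta(C, alpha), num_beta)
```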
Update Rule
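$$\alpha_{k+1} = \alpha_k + \eta \, \frac{\partial V_r(\alpha_k,\beta_k)}{\partial \alpha}$$

$$\beta_{k+1} = \beta_k + \eta \, \frac{\partial V_c(\alpha_k,\beta_k)}{\partial \beta}$$

$(\alpha_0, \beta_0)$ can be arbitrary strategies; $\eta$ is the step size.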
Problem
The gradient can lead the players to an infeasible point outside the unit square.
[Figure: the unit square of strategy pairs, both axes from 0 to 1]
Solution:
Redefine the gradient, whenever it points out of the unit square, as the projection of the true gradient onto the boundary.
[Figure: the unit square of strategy pairs, both axes from 0 to 1]
Let this denote the constrained dynamics!
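With a finite step size, one simple way to keep the iterates feasible is to clip each updated probability back into [0, 1]. This is a stand-in for the projection idea described above, not necessarily the exact construction used in the paper; the function names, step size, and payoff matrices are hypothetical.

```python
def clip(x):
    """Project a scalar onto the interval [0, 1]."""
    return max(0.0, min(1.0, x))

def gradient_ascent(R, C, alpha, beta, eta=0.1, steps=50):
    """Simultaneous gradient ascent for a 2x2 game, clipped to the unit square."""
    u = (R[0][0] + R[1][1]) - (R[1][0] + R[0][1])
    u_prime = (C[0][0] + C[1][1]) - (C[1][0] + C[0][1])
    for _ in range(steps):
        d_alpha = beta * u - (R[1][1] - R[0][1])         # dV_r / dalpha
        d_beta = alpha * u_prime - (C[1][1] - C[1][0])   # dV_c / dbeta
        # Both players update at once, then each coordinate is clipped.
        alpha, beta = clip(alpha + eta * d_alpha), clip(beta + eta * d_beta)
    return alpha, beta

# Hypothetical game in which action 2 is dominant for both players.
R = [[3, 0], [5, 1]]
C = [[3, 5], [0, 1]]
result = gradient_ascent(R, C, 0.5, 0.5)
print(result)  # (0.0, 0.0): both players end up on action 2
```

In this example both gradients are negative everywhere on the square, so the unclipped update would leave it after a few steps; clipping pins the pair at the corner (0, 0), which is the Nash equilibrium of this particular game.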
Infinitesimal Gradient Ascent (IGA)
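In the limit $\eta \to 0$:

$$\begin{pmatrix} \partial\alpha/\partial t \\ \partial\beta/\partial t \end{pmatrix} = \begin{pmatrix} 0 & u \\ u' & 0 \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \begin{pmatrix} -(r_{22} - r_{12}) \\ -(c_{22} - c_{21}) \end{pmatrix}$$

$(\alpha(t), \beta(t))$ become functions of time! The matrix $\begin{pmatrix} 0 & u \\ u' & 0 \end{pmatrix}$ is denoted $U$.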
1. Case: U is invertible
The two possible qualitative forms of the unconstrained strategy pair:
2. Case: U is not invertible
Some examples of qualitative forms of the unconstrained strategy pair:
Convergence
If both players follow the IGA rule, then both players' average payoffs will converge to the expected payoffs of some Nash equilibrium.
If the strategy pair trajectory converges at all, then it converges to a Nash pair.
Proposition
Both previous propositions also hold with finite decreasing step size
References
Singh S., Kearns M., Mansour Y. (2000). Nash Convergence of Gradient Dynamics in General-Sum Games. Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pages 541-548.
Dynamic computation of Nash equilibria in Two-Player general-sum games.
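Notation
2 Players:
Strategies $p = (p_1, \dots, p_n)^T$ and $q = (q_1, \dots, q_n)^T$
Payoff matrices

$$R = \begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & & \vdots \\ r_{n1} & \cdots & r_{nn} \end{pmatrix}, \qquad C = \begin{pmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & & \vdots \\ c_{n1} & \cdots & c_{nn} \end{pmatrix}$$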
Objective Functions
Payoff Functions:
Row Player: $V_r(p,q) = p^T R q$
Col Player: $V_c(p,q) = p^T C q$
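Both payoff functions are bilinear forms and can be computed with a double sum. The sketch below uses plain Python lists; the helper name `bilinear` is hypothetical, and the matrices are one reading of the Shapley game table from the fictitious play part of the slides.

```python
def bilinear(p, M, q):
    """Compute p^T M q for a mixed strategy pair (p, q) and payoff matrix M."""
    return sum(p[i] * M[i][j] * q[j] for i in range(len(p)) for j in range(len(q)))


# Shapley game payoffs as read from the earlier table (an interpretation).
R = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]  # row player's payoffs
C = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # column player's payoffs

uniform = [1 / 3, 1 / 3, 1 / 3]
x1 = [1, 0, 0]  # pure strategy for action 1

v_r = bilinear(uniform, R, uniform)
v_c = bilinear(uniform, C, uniform)
v_r_pure = bilinear(x1, R, uniform)
print(v_r, v_c, v_r_pure)  # all three are 1/3 up to rounding
```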
Observation!
$V_r(p,q)$ is linear in each $p_i$ and $q_j$.
Let $x_i$ denote the pure strategy for action i.
This means:
If $V_r(x_i,q) > V_r(p,q)$, then increasing the value of $p_i$ increases the payoff.
Hill climbing (again)
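Multiplicative Update Rules

$$\frac{\partial p_i(t)}{\partial t} = p_i(t)\left[V_r(x_i,q) - V_r(p,q)\right] = p_i(t)\left[(Rq)_i - p^T R q\right]$$

$$\frac{\partial q_i(t)}{\partial t} = q_i(t)\left[V_c(p,x_i) - V_c(p,q)\right] = q_i(t)\left[(C^T p)_i - p^T C q\right]$$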
Hill climbing (again)
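System of Differential Equations (i=1..n)

$$\frac{\partial p_i(t)}{\partial t} = p_i(t)\left[(Rq)_i - p^T R q\right]$$

$$\frac{\partial q_i(t)}{\partial t} = q_i(t)\left[(C^T p)_i - p^T C q\right]$$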
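The system of differential equations can be explored numerically with a forward Euler discretization. This is a sketch under stated assumptions: the step size, the function name `replicator_step`, and the payoff matrices (a hypothetical game in which action 1 is dominant for both players) are illustrative, and the column player's equation is written with its own matrix C.

```python
def replicator_step(p, q, R, C, dt):
    """One forward Euler step of the multiplicative update dynamics."""
    n = len(p)
    Rq = [sum(R[i][j] * q[j] for j in range(n)) for i in range(n)]   # (R q)_i
    CTp = [sum(C[i][j] * p[i] for i in range(n)) for j in range(n)]  # (C^T p)_j
    v_r = sum(p[i] * Rq[i] for i in range(n))    # p^T R q
    v_c = sum(q[j] * CTp[j] for j in range(n))   # p^T C q
    new_p = [p[i] + dt * p[i] * (Rq[i] - v_r) for i in range(n)]
    new_q = [q[j] + dt * q[j] * (CTp[j] - v_c) for j in range(n)]
    return new_p, new_q


# Hypothetical game: action 1 is strictly dominant for both players.
R = [[2, 2], [1, 1]]
C = [[1, 0], [1, 0]]

p, q = [0.5, 0.5], [0.5, 0.5]
for _ in range(2000):
    p, q = replicator_step(p, q, R, C, dt=0.01)
print(p, q)  # both strategies move toward the pure strategy (1, 0)
```

Because each derivative is proportional to the current probability, the updates sum to zero across actions, so the strategies stay on the simplex up to rounding.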
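Fixed Points?

$$\frac{\partial p_i(t)}{\partial t} = p_i(t)\left[(Rq)_i - p^T R q\right] = 0$$

$\Rightarrow$ for every $i$, either $p_i(t) = 0$ or $(Rq)_i - p^T R q = 0$.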
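The fixed-point condition is satisfied not only by Nash equilibria but by every pure strategy pair, since each p_i is then either 0 or has a vanishing bracket. The sketch below checks this for matching pennies, used here as an illustrative game; the function name is hypothetical.

```python
def derivatives(p, q, R, C):
    """Right-hand sides of the dynamics: (dp/dt, dq/dt)."""
    n = len(p)
    Rq = [sum(R[i][j] * q[j] for j in range(n)) for i in range(n)]
    CTp = [sum(C[i][j] * p[i] for i in range(n)) for j in range(n)]
    v_r = sum(p[i] * Rq[i] for i in range(n))
    v_c = sum(q[j] * CTp[j] for j in range(n))
    dp = [p[i] * (Rq[i] - v_r) for i in range(n)]
    dq = [q[j] * (CTp[j] - v_c) for j in range(n)]
    return dp, dq


R = [[1, -1], [-1, 1]]   # matching pennies, row player
C = [[-1, 1], [1, -1]]   # matching pennies, column player

# The mixed Nash equilibrium is a fixed point ...
dp_mixed, dq_mixed = derivatives([0.5, 0.5], [0.5, 0.5], R, C)
# ... but so is a pure strategy pair that is not a Nash equilibrium.
dp_pure, dq_pure = derivatives([1, 0], [0, 1], R, C)
print(dp_mixed, dq_mixed)
print(dp_pure, dq_pure)

# At (H, T) the row player earns -1 and would gain by switching to T.
payoff_now = R[0][1]
payoff_deviation = R[1][1]
print(payoff_now, payoff_deviation)  # -1 1
```

This is why a statement about fixed points being Nash equilibria needs a condition on the starting point, such as starting in the interior.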
When is a Fixpoint a Nash?
• Proposition: Provided all $p_i(0)$ are neither 0 nor 1, if $(p,q)$ converges to $(p^*,q^*)$, then $(p^*,q^*)$ is a Nash equilibrium.
Unit Square?
No Problem! $p_i = 0$ or $p_i = 1$ both set $\partial p_i / \partial t$ to zero!
Convergence of the average of the payoff
If the (p,q) trajectory and both players' payoffs converge in average, the average payoff must be the payoff of some Nash equilibrium.
2 Player 2 Action Case
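Either the strategies converge immediately to some pure strategy, or the difference between the Kullback-Leibler distances of (p,q) and some mixed Nash equilibrium (p*,q*) is constant. Here p and q denote the probabilities of the first action.

$$KL(p, p^*) = p^* \log\left(\frac{p^*}{p}\right) + (1 - p^*) \log\left(\frac{1 - p^*}{1 - p}\right)$$

$$KL(p, p^*) - KL(q, q^*) = \text{const.}$$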
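The constancy of the difference can be checked numerically for a specific case. The sketch below assumes a hypothetical coordination game with R = C = [[1, 0], [0, 1]], whose mixed equilibrium is p* = q* = 1/2, and integrates the two-action dynamics with small Euler steps; it is only evidence for this one game, and other games were not checked here.

```python
import math

def kl(x, x_star):
    """Kullback-Leibler distance between (x_star, 1-x_star) and (x, 1-x)."""
    return (x_star * math.log(x_star / x)
            + (1 - x_star) * math.log((1 - x_star) / (1 - x)))

def kl_difference_along_path(p, q, dt=1e-3, steps=1000):
    """Track KL(p, p*) - KL(q, q*) along an Euler-integrated trajectory."""
    values = []
    for _ in range(steps):
        values.append(kl(p, 0.5) - kl(q, 0.5))
        # Two-action form of the dynamics for R = C = identity:
        # dp/dt = p (1-p) (2q - 1),  dq/dt = q (1-q) (2p - 1).
        dp = p * (1 - p) * (2 * q - 1)
        dq = q * (1 - q) * (2 * p - 1)
        p, q = p + dt * dp, q + dt * dq
    return values

values = kl_difference_along_path(0.3, 0.6)
drift = max(values) - min(values)
print(values[0], values[-1], drift)  # the difference barely moves along the path
```

The small remaining drift comes from the Euler discretization, not from the continuous-time dynamics.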
Trajectories of the difference between the Kullback-Leibler Distances
[Figure: trajectories, with one point labeled "Nash"]
But…
… for games with more than 2 actions, convergence is not guaranteed! Counterexample: Shapley Game