Evolution and Repeated Games D. Fudenberg (Harvard) E. Maskin (IAS, Princeton)

Page 1

Evolution and Repeated Games

D. Fudenberg (Harvard)

E. Maskin (IAS, Princeton)

Page 2

Theory of repeated games important
• central model for explaining how self-interested agents can cooperate
• used in economics, biology, political science and other fields

Prisoner's Dilemma (C = Cooperate, D = Defect; row player's payoff listed first):

        C        D
C     2, 2    −1, 3
D     3, −1    0, 0

Page 3

But theory has a serious flaw:

• although cooperative behavior possible, so is uncooperative behavior (and everything in between)

• theory doesn’t favor one behavior over another

• theory doesn’t make sharp predictions

Page 4

Evolution (biological or cultural) can promote efficiency

• might hope that uncooperative behavior will be “weeded out”

• this view expressed in Axelrod (1984)

Page 5

Basic idea:
• Start with population of repeated game strategy Always D
• Consider small group of mutants using Conditional C (play C until someone plays D, thereafter play D)
  – does essentially the same against Always D as Always D does
  – does much better against Conditional C than Always D does
• Thus Conditional C will invade Always D
• uncooperative behavior driven out

Prisoner's Dilemma (C = Cooperate, D = Defect; row player's payoff listed first):

        C        D
C     2, 2    −1, 3
D     3, −1    0, 0

Page 6

But consider ALT: alternate between C and D until pattern broken, thereafter play D
• can't be invaded by some other strategy
  – other strategy would have to alternate or else would do much worse against ALT than ALT does
• Thus ALT is "evolutionarily stable"
• But ALT is quite inefficient (average payoff 1)

        C        D
C     2, 2    −1, 3
D     3, −1    0, 0
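The payoff comparisons on pages 5 and 6 can be checked with a small simulation. This is a minimal sketch without mistakes or discounting. It assumes the Prisoner's Dilemma payoffs (2, 2), (−1, 3), (3, −1), (0, 0), with −1 for a lone cooperator as in the expected-loss arithmetic on page 33. The function names and the history-based encoding of strategies are illustrative assumptions, not taken from the slides.

```python
# Stage-game payoffs: (row player's payoff, column player's payoff).
PAYOFFS = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
           ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def always_d(my_hist, other_hist):
    return "D"

def conditional_c(my_hist, other_hist):
    # Play C until someone plays D, thereafter play D.
    return "D" if "D" in my_hist or "D" in other_hist else "C"

def alt(my_hist, other_hist):
    # Alternate C, D, C, D, ... until the pattern is broken, thereafter play D.
    for t, (a, b) in enumerate(zip(my_hist, other_hist)):
        expected = "C" if t % 2 == 0 else "D"
        if a != expected or b != expected:
            return "D"
    return "C" if len(my_hist) % 2 == 0 else "D"

def average_payoffs(s1, s2, periods=1000):
    """Average per-period payoffs when s1 meets s2 (no mistakes)."""
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        u1, u2 = PAYOFFS[(a1, a2)]
        total1 += u1
        total2 += u2
        h1.append(a1)
        h2.append(a2)
    return total1 / periods, total2 / periods

print(average_payoffs(alt, alt))                      # ALT against itself
print(average_payoffs(always_d, alt))                 # a non-alternating entrant
print(average_payoffs(conditional_c, conditional_c))  # full cooperation
```

In this sketch ALT earns an average of 1 against itself, while an entrant that refuses to alternate is held to roughly 0 against ALT, which is the sense in which ALT resists invasion.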

Page 7

• Still, ALT highly inflexible
  – relies on perfect alternation
  – if pattern broken, get D forever
• What if there is a (small) probability of mistake in execution?

Page 8

• Consider mutant strategy $s'$ identical to ALT except if (by mistake) alternating pattern broken
  – $s'$ "signals" intention to cooperate by playing C in following period
  – if other strategy plays C too, $s'$ plays C forever
  – if other strategy plays D, $s'$ plays D forever
• $s'$ identical to ALT before pattern broken
• $s'$ and ALT each get about 0 against ALT, after pattern broken
• $s'$ gets 2 against $s'$; ALT gets about 0 against $s'$, after pattern broken
• so $s'$ invades ALT

        C        D
C     2, 2    −1, 3
D     3, −1    0, 0
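The continuation payoffs quoted on page 8 can be reproduced with a short sketch that starts the clock at the first period after the pattern has been broken. It assumes the same Prisoner's Dilemma payoffs (with −1 for a lone cooperator) and ignores further mistakes and discounting. The two-argument encoding of the strategies is a hypothetical simplification for illustration.

```python
PAYOFFS = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
           ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def alt_after_break(period, other_first_action):
    # ALT plays D forever once the alternating pattern has been broken.
    return "D"

def signaler_after_break(period, other_first_action):
    # Mutant s': signal with C in the first period after the break, then
    # play C forever if the other player also played C, else D forever.
    if period == 0:
        return "C"
    return "C" if other_first_action == "C" else "D"

def continuation_average(s1, s2, periods=1000):
    """Average per-period payoffs after the pattern is broken (no mistakes)."""
    first1 = first2 = None
    total1 = total2 = 0
    for t in range(periods):
        a1, a2 = s1(t, first2), s2(t, first1)
        if t == 0:
            first1, first2 = a1, a2
        u1, u2 = PAYOFFS[(a1, a2)]
        total1 += u1
        total2 += u2
    return total1 / periods, total2 / periods

print(continuation_average(signaler_after_break, signaler_after_break))
print(continuation_average(signaler_after_break, alt_after_break))
print(continuation_average(alt_after_break, alt_after_break))
```

Under these assumptions the mutant gets 2 against itself, while both strategies get about 0 against ALT. The one-period cost of the signal vanishes in the long-run average.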

Page 9

Main results in paper (for 2-player symmetric repeated games)

(1) If s evolutionarily stable and
  – discount rate r small (future important)
  – mistake probability p small (but p > 0)
then s (almost) "efficient"

(2) If payoffs (v, v) "efficient", then there exists ES strategy s (almost) attaining (v, v) provided
  – r small
  – p small relative to r

• generalizes Fudenberg-Maskin (1990), in which r = p = 0

Page 10

Finite symmetric 2-player game $g : A \times A \to \mathbb{R}^2$
• $V = \text{convex hull}\,\{\, g(a_1, a_2) \mid (a_1, a_2) \in A \times A \,\}$
• if $g(a_1, a_2) = (v_1, v_2)$, then $g(a_2, a_1) = (v_2, v_1)$
• normalize payoffs so that $\min_{a_2} \max_{a_1} g_1(a_1, a_2) = 0$

Page 11

• $(v_1, v_2) \in V$ strongly efficient if $w = v_1 + v_2 = \max_{(v_1', v_2') \in V} (v_1' + v_2')$

Battle of the sexes:

   0, 0    1, 2
   2, 1    0, 0

$\{\, (v_1, v_2) \mid v_1 + v_2 = 3,\ 1 \le v_1 \le 2 \,\}$ strongly efficient

Prisoner's Dilemma:

   2, 2    −1, 3
   3, −1    0, 0

$(2, 2)$ unique strongly efficient pair

Page 12

Repeated game: $g$ repeated infinitely many times
• period $t$ history $h = \big( (a_1(1), a_2(1)), \ldots, (a_1(t-1), a_2(t-1)) \big)$
  – same history with the players' roles swapped: $\tilde h = \big( (a_2(1), a_1(1)), \ldots, (a_2(t-1), a_1(t-1)) \big)$
• $H$ = set of all histories
• repeated game strategy $s : H \to A$
  – assume finitely complex (playable by finite computer)
• in each period, probability $p$ that $i$ makes mistake
  – chooses $a_i \ne s_i(h)$ (equal probabilities for all actions)
  – mistakes independent across players

Page 13

$U_1^{r,p}(s_1, s_2) = E\left[ \frac{r}{1+r} \sum_{t=1}^{\infty} \left( \frac{1}{1+r} \right)^{t-1} g_1\big(a_1(t), a_2(t)\big) \,\middle|\, s_1, s_2, p \right]$

$U_1^{r,p}(s_1, s_2 \mid h) = E\left[ \frac{r}{1+r} \sum_{t=1}^{\infty} \left( \frac{1}{1+r} \right)^{t-1} g_1\big(a_1(t), a_2(t)\big) \,\middle|\, s_1, s_2, p, h \right]$
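To make the payoff criterion concrete, here is a minimal sketch that evaluates the normalized discounted sum $\frac{r}{1+r} \sum_{t \ge 1} (1+r)^{-(t-1)} g_1(t)$ for a given stream of stage payoffs. It assumes no mistakes (p = 0) and truncates the infinite sum at the length of the supplied list. The function name is an illustrative assumption.

```python
def normalized_discounted_payoff(stage_payoffs, r):
    """Normalized discounted value of a finite stream of stage payoffs.

    stage_payoffs[0] is the period-1 payoff; r > 0 is the discount rate.
    The weight r / (1 + r) makes a constant stream g forever worth g.
    """
    delta = 1.0 / (1.0 + r)        # per-period discount factor
    weight = r / (1.0 + r)         # normalization, equal to 1 - delta
    return weight * sum(g * delta ** t for t, g in enumerate(stage_payoffs))

r = 0.05
cooperate_forever = [2] * 5000      # (C, C) in every period
alt_against_alt = [2, 0] * 2500     # (C, C), (D, D), (C, C), ...

print(normalized_discounted_payoff(cooperate_forever, r))
print(normalized_discounted_payoff(alt_against_alt, r))
```

With a small r the alternating stream is worth close to 1, in line with the average payoff of 1 quoted for ALT. For a stream alternating 2, 0 the exact value is 2 / (1 + δ), with δ = 1 / (1 + r).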

Page 14

• informally, $s$ evolutionarily stable (ES), if no mutant $s'$ can invade population with big proportion $s$ and small proportion $s'$
• formally, $s$ is ES w.r.t. $(\bar q, r, p)$ if for all $q \le \bar q$ and all $s' \ne s$,

  $(1-q)\, U_1^{r,p}(s, s) + q\, U_1^{r,p}(s, s') \ \ge\ (1-q)\, U_1^{r,p}(s', s) + q\, U_1^{r,p}(s', s')$

• evolutionary stability
  – expressed statically here
  – but can be given precise dynamic meaning
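The static ES condition can be checked numerically once the four payoffs are known. The sketch below is illustrative only: it assumes the weak-inequality form (the incumbent must do at least as well as the mutant for every mutant share q up to the bound), and it approximates "for all q" by a finite grid. The function name and the grid size are assumptions.

```python
def resists_invasion(u_ss, u_s_mut, u_mut_s, u_mut_mut, q_bar, steps=100):
    """True if the incumbent s is never out-scored by the mutant s'.

    u_ss      : payoff of s against s
    u_s_mut   : payoff of s against s'
    u_mut_s   : payoff of s' against s
    u_mut_mut : payoff of s' against s'
    The mutant share q ranges over a grid in (0, q_bar].
    """
    for k in range(1, steps + 1):
        q = q_bar * k / steps
        incumbent = (1 - q) * u_ss + q * u_s_mut
        mutant = (1 - q) * u_mut_s + q * u_mut_mut
        if incumbent < mutant:
            return False
    return True

# ALT after the pattern is broken, facing the signaling mutant
# (payoffs are approximate long-run averages, assumed for illustration).
print(resists_invasion(u_ss=0.0, u_s_mut=0.003, u_mut_s=-0.001,
                       u_mut_mut=2.0, q_bar=0.05))

# An incumbent that already earns 2 against everyone it meets.
print(resists_invasion(u_ss=2.0, u_s_mut=2.0, u_mut_s=1.0,
                       u_mut_mut=2.0, q_bar=0.05))
```

In the first call the mutant's payoff of 2 against itself outweighs its small signaling cost for some admissible q, so the check fails. In the second call the incumbent is never out-scored.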

Page 15

• population of $b$ players
• suppose time measured in "epochs" $T = 1, 2, \ldots$
• strategy state $s_T$ in epoch $T$
  – most players in population use $s_T$
• group of mutants (of size $a$) plays $s'$
  – $a$ drawn randomly from $\{1, 2, \ldots, \bar a\}$, where $\bar a / b = \bar q$
  – $s'$ drawn randomly from finitely complex strategies
• $M$ random drawings of pairs of players
  – each pair plays repeated game
• $s_{T+1}$ = strategy with highest average score

Page 16

Theorem 1: For any $(\bar q, p, r)$ and $\varepsilon > 0$, there exists $\bar T$ such that, for all $T \ge \bar T$, there exists $\bar M_T$ such that, for all $M \ge \bar M_T$,

(i) if $s$ not ES, $\Pr\{\, s_{T+t} = s \text{ for all } t \le T \mid s_T = s \,\} < \varepsilon$

(ii) if $s$ is ES, $\Pr\{\, s_{T+t} = s \text{ for all } t \le T \mid s_T = s \,\} > 1 - \varepsilon$

Page 17

Let $\underline v = \min \{\, v \ge 0 \mid \text{there exists } v' \text{ such that } (v, v') \text{ strongly efficient} \,\}$

Theorem 2: Given $\varepsilon > 0$ and $\bar q > 0$, there exist $\bar r$ and $\bar p$ such that, for all $r \in (0, \bar r)$ and $p \in (0, \bar p)$, if $s$ is ES w.r.t. $(\bar q, r, p)$, then

  $U_1^{r,p}(s, s \mid h) \ge \underline v - \varepsilon$ for all $h$.

Page 18

Battle of the sexes:

   0, 0    1, 2
   2, 1    0, 0

$\underline v = 1$. So $U_1^{r,p}(s, s) \ge 1 - \varepsilon$

Prisoner's Dilemma:

   2, 2    −1, 3
   3, −1    0, 0

$\underline v = 2$. So $U_1^{r,p}(s, s) \ge 2 - \varepsilon$
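The two bounds above can be recomputed directly from the payoff tables. The sketch assumes that V is the convex hull of the pure-action payoff pairs, so the maximal payoff sum w is reached at a pure pair and the lowest first coordinate among strongly efficient points is also reached at a pure pair. It also assumes integer payoffs, so that the equality test is exact. The function name is an illustrative assumption.

```python
def efficiency_summary(payoff_pairs):
    """Return (w, v_low) for a list of pure-action payoff pairs.

    w     : largest attainable sum of the two players' payoffs
    v_low : smallest payoff v such that (v, v') is strongly efficient
    """
    w = max(v1 + v2 for v1, v2 in payoff_pairs)
    efficient = [(v1, v2) for v1, v2 in payoff_pairs if v1 + v2 == w]
    v_low = min(v1 for v1, _ in efficient)
    return w, v_low

battle_of_sexes = [(0, 0), (1, 2), (2, 1), (0, 0)]
prisoners_dilemma = [(2, 2), (-1, 3), (3, -1), (0, 0)]

print(efficiency_summary(battle_of_sexes))    # (3, 1)
print(efficiency_summary(prisoners_dilemma))  # (4, 2)
```

The second component reproduces the lower bounds 1 and 2 used for the two games.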

Page 19

Proof:

Suppose $U_1^{r,p}(s, s \mid h) < \underline v - \varepsilon$ for some $h$
• will construct mutant $s'$ that can invade
• let $\bar h = \arg\min_h U_1^{r,p}(s, s \mid h)$
• if $s$ = ALT, $\bar h$ = any history for which alternating pattern broken

Page 20

Construct $s'$ so that
• if $h$ not a continuation of $\bar h$ or $\tilde{\bar h}$ (the history $\bar h$ with the players' roles swapped), $s'(h) = s(h)$
• after $\bar h$ (assume $\bar h \ne \tilde{\bar h}$), strategy $s'$
  – "signals" willingness to cooperate by playing differently from $s$ for 1 period (assume $s$ is pure strategy)
  – if other player responds positively, plays strongly efficiently thereafter
  – if not, plays according to $s$ thereafter
• after $\tilde{\bar h}$, strategy $s'$
  – responds positively if other strategy has signaled, and thereafter plays strongly efficiently
  – plays according to $s$ otherwise

Page 21

• because $\bar h$ is already worst history, $s'$ loses for only 1 period by signaling (small loss if $r$ small)
• if $p$ small, probability that $s'$ "misreads" other player's intention is small
• hence, $s'$ does nearly as well against $s$ as $s$ does against itself (even after $\bar h$ or $\tilde{\bar h}$, the history $\bar h$ with the players' roles swapped)
• $s'$ does very well against itself (strong efficiency), after $\bar h$ or $\tilde{\bar h}$:

  $U_1^{r,p}(s', s' \mid \bar h) + U_1^{r,p}(s', s' \mid \tilde{\bar h}) \approx w$

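Page 22

• remains to check how well $s$ does against $s'$
• by definition of $\bar h$ (with $\tilde{\bar h}$ denoting $\bar h$ with the players' roles swapped),

  $U_1^{r,p}(s, s \mid \bar h) \le U_1^{r,p}(s, s \mid \tilde{\bar h})$, with $U_1^{r,p}(s, s \mid \bar h) < \underline v - \varepsilon$

• Ignoring effect of $p$,

  $U_1^{r,p}(s, s \mid \bar h) + U_1^{r,p}(s, s \mid \tilde{\bar h}) \le w - \gamma$ for some $\gamma > 0$

• Also, after deviation by $s'$, punishment started again, and so

  $U_1^{r,p}(s, s' \mid h) \approx U_1^{r,p}(s, s \mid h)$ for $h \in \{\bar h, \tilde{\bar h}\}$.

• Hence

  $U_1^{r,p}(s, s' \mid \bar h) + U_1^{r,p}(s, s' \mid \tilde{\bar h}) \le w - \gamma$

• so $s$ does appreciably worse against $s'$ than $s'$ does against $s'$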

Page 23

• Summing up, we have:

  $(1-q)\, U_1^{r,p}(s, s) + q\, U_1^{r,p}(s, s') \ <\ (1-q)\, U_1^{r,p}(s', s) + q\, U_1^{r,p}(s', s')$

• $s$ is not ES

Page 24

• Theorem 2 implies for Prisoner's Dilemma that, for any $\varepsilon > 0$,

  $U_1^{r,p}(s, s \mid h) \ge 2 - \varepsilon$ for $r$ and $p$ small

• doesn't rule out punishments of arbitrary (finite) length

Page 25

• Consider strategy $s$ with "cooperative" and "punishment" phases
  – in cooperative phase, play C
  – stay in cooperative phase until one player plays D, in which case go to punishment phase
  – in punishment phase, play D
  – stay in punishment phase for $m$ periods (and then go back to cooperative phase) unless at some point some player chooses C, in which case restart punishment
• For any $m$,

  $U_1^{r,p}(s, s \mid h) \to 2$ (efficiency), as $r \to 0$, $p \to 0$

Page 26

Can sharpen Theorem 2 for Prisoner's Dilemma:

Given $\bar q$, there exist $\bar r$ and $\bar p$ such that, for all $r \in (0, \bar r)$ and $p \in (0, \bar p)$, if $s$ is ES w.r.t. $(\bar q, r, p)$, then it cannot entail a punishment lasting more than $\frac{3 - 2\bar q}{2 \bar q}$ periods

Proof: very similar to that of Theorem 2

Page 27

For $r$ and $p$ too big, ES strategy $s$ may not be "efficient"
• if $r \to \infty$, back in one-shot case
• if $p = \tfrac{1}{2}$, then even fully cooperative strategies in Prisoner's Dilemma generate payoffs of 1.

Page 28

Theorem 3: Let $(v, v) \in V$, with $v \ge \underline v$.

For all $\varepsilon > 0$, there exist $\bar q > 0$ and $\bar r > 0$ s.t. for all $r \in (0, \bar r)$ there exists $\bar p > 0$ s.t. for all $p \in (0, \bar p)$ there exists $s$ ES w.r.t. $(\bar q, r, p)$ for which

  $\left| U_1^{r,p}(s, s) - v \right| < \varepsilon$

Page 29

Proof: Construct $s$ so that
• along equilibrium path of $(s, s)$, payoffs are (approximately) $(v, v)$
• punishments are nearly strongly efficient
  – deviating player (say 1) minimaxed long enough to wipe out gain
  – thereafter go to strongly efficient point
  – overall payoffs after deviation: approximately $(\underline v,\ w - \underline v)$
• if $r$ and $p$ small, $(s, s)$ is a subgame perfect equilibrium

Page 30

• In Prisoner's Dilemma, consider $s$ that
  – plays C the first period
  – thereafter, plays C if and only if either both players played C previous period or neither did
• strategy $s$
  – is efficient
  – entails punishments that are as short as possible
  – is modification of Tit-for-Tat (C the first period; thereafter, do what other player did previous period)
• Tit-for-Tat not ES
  – if mistake (D, C) occurs then get wave of alternating punishments: (C, D), (D, C), (C, D), ... until another mistake made
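The difference between Tit-for-Tat and its modification after a single mistake can be traced with a short sketch. It assumes exactly one mistake, made by player 1 in a chosen period, and no mistakes afterwards. The function names and the "last period only" encoding of both strategies are illustrative assumptions.

```python
def tit_for_tat(my_last, other_last):
    # C in the first period; thereafter do what the other player did last period.
    return "C" if other_last is None else other_last

def modified_tit_for_tat(my_last, other_last):
    # C in the first period; thereafter C if and only if both players
    # took the same action last period.
    if my_last is None:
        return "C"
    return "C" if my_last == other_last else "D"

def play_with_one_mistake(strategy, periods, mistake_period):
    """Self-play path in which player 1's action is flipped once."""
    path, last1, last2 = [], None, None
    for t in range(periods):
        a1, a2 = strategy(last1, last2), strategy(last2, last1)
        if t == mistake_period:
            a1 = "D" if a1 == "C" else "C"   # execution mistake by player 1
        path.append((a1, a2))
        last1, last2 = a1, a2
    return path

print(play_with_one_mistake(tit_for_tat, 6, 1))
print(play_with_one_mistake(modified_tit_for_tat, 6, 1))
```

Tit-for-Tat locks into the alternating wave (C, D), (D, C), ... after the mistake, whereas the modified strategy spends one period at (D, D) and then returns to (C, C).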

Page 31

Modified battle of sexes (row player's payoff listed first):

        a       b       c       d
a     0, 0    4, 1    0, 0    0, 0
b     1, 4    0, 0    0, 0    0, 0
c     0, 0    0, 0    0, 0    0, 0
d     0, 0    0, 0    0, 0    2, 2

• Let $s$ = play d as long as in all past periods
  – both players played d, or
  – neither played d
  if single player deviates from d
  – henceforth, that player plays b
  – other player plays a
• $s$ is ES even though inefficient
  – any attempt to improve on efficiency punished forever
  – can't invade during punishment, because punishment efficient

Page 32

Consider potential invader $s'$

• For any $h$, $s'$ cannot do better against $s$ than $s$ does against itself, since $(s, s)$ is an equilibrium; hence, for all $h$,

  $U_1^{r,p}(s', s \mid h) \le U_1^{r,p}(s, s \mid h)$   (*)

  and so

  $U_1^{r,p}(s', s \mid h) = U_1^{r,p}(s, s \mid h)$ (otherwise $s'$ can't invade)   (**)

• For $s'$ to invade, need

  $U_1^{r,p}(s', s' \mid h') > U_1^{r,p}(s, s' \mid h')$ for some $h'$   (***)

• Claim: (***) implies $h'$ involves deviation from equilibrium path of $(s, s)$
  – only other possibility: $s'$ different from $s$ on equilibrium path
  – then $s'$ punished by $s$
  – violates (**)
• We thus have, with $\tilde h'$ denoting $h'$ with the players' roles swapped,

  $U_1^{r,p}(s, s' \mid h') + U_1^{r,p}(s', s \mid \tilde h') = w$

• Hence, from the right-hand side of (***), the total is already $w$, so inequality (***) is not feasible

Page 33

For Theorem 3 to hold, $p$ must be small relative to $r$
• consider modified Tit-for-Tat against itself (play C if and only if both players took same action last period)
• with every mistake, there is an expected loss of
  – 2 − (½ · 3 + ½ (−1)) = 1 the first period
  – 2 − 0 = 2 the second period
• so overall the expected loss from mistakes is approximately $3p\,\frac{1+r}{r}$
• By contrast, a mutant strategy that signals, etc. and doesn't punish at all against itself loses only about $p\,\frac{1+r}{r}$
• so if $r$ is small enough relative to $p$, mutant can invade
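The per-mistake arithmetic above can be reproduced with exact fractions. The sketch assumes the Prisoner's Dilemma payoffs (2, 2), (−1, 3), (3, −1), (0, 0), a single mistake, and that a given player is equally likely to be the one who erred, so each period's expected payoff is the average of the two players' payoffs. The mutant's path, which returns to (C, C) immediately, is an illustrative assumption about a strategy that does not punish at all against itself.

```python
from fractions import Fraction

PAYOFFS = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
           ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def expected_losses(path, cooperative_payoff=2):
    """Per-period expected loss relative to mutual cooperation.

    Each period's expected payoff is the average of the two players'
    payoffs, since either player is equally likely to have erred.
    """
    losses = []
    for a1, a2 in path:
        u1, u2 = PAYOFFS[(a1, a2)]
        losses.append(cooperative_payoff - Fraction(u1 + u2, 2))
    return losses

# Modified Tit-for-Tat against itself: mistake, one period of (D, D), recovery.
modified_tft_path = [("D", "C"), ("D", "D"), ("C", "C")]
# Hypothetical non-punishing mutant against itself: mistake, immediate recovery.
mutant_path = [("D", "C"), ("C", "C")]

print(expected_losses(modified_tft_path), sum(expected_losses(modified_tft_path)))
print(expected_losses(mutant_path), sum(expected_losses(mutant_path)))
```

Each mistake costs modified Tit-for-Tat 1 + 2 = 3 in undiscounted terms, against 1 for the non-punishing mutant, which is the gap the mutant exploits when mistakes are frequent relative to r.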