Time-Average Replicator and Best-Reply Dynamics

Yannick Viossat (Université Paris-Dauphine)

joint work with

Josef Hofbauer (Universität Wien)

and

Sylvain Sorin (Université Paris 6 and École polytechnique)

Seminar on Discrete Mathematics and Game Theory, LSE, May 2010

Hofbauer, Sorin, Viossat Replicator and Best-Reply Dynamics

Philosophy

A quote from a famous German philosopher :

Proofs should not be made in public

B. von Stengel

Getting started...

Introduction to evolutionary game theory

Talk’s topic

Traditional vs evolutionary game theory

Standard game theory :

few agents

know the game (common knowledge)

high rationality, use elaborate thinking

Evolutionary game theory :

populations of agents,

need not fully understand the game

low rationality, use rules of thumb (or selection process)

Evolutionary Game Theory : conceptual framework

large population of agents

meet randomly, and play a symmetric game

strategies with good results spread (imitation, selection,...)

changes the average behaviour, hence the “good strategies”, hence the strategies that spread...

Two approaches : static and dynamic

Evolutionary game dynamics : dynamical system modeling such a process

Usual topics

Compare the outcome of evolutionary game dynamics with standard concepts in game theory :

are dominated strategies eliminated ?

do dynamics lead to Nash equilibria ?

if so, to which equilibrium ?

Here, different topic : relate the two most studied dynamics

Topic of the talk

Major dynamics : replicator (REP) and best-reply dynamics (BRD)

different interpretations

different degrees of rationality

different mathematical formulations

but in many examples, same long-run behaviour for BRD and time-average of REP (Gaunersdorfer and Hofbauer, 1995)

Aim : find a formal link between BRD and time-average of REP

Outline

1 Framework, notation

2 Dynamics

3 Similarities between BRD and the time-average of REP

4 Main result : theoretical link

5 Intuition

6 Comments

Framework and notation - I

single, large population

randomly drawn agents play a two-player symmetric game

n possible pure strategies : 1, 2, ..., n

$x_i(t)$ : frequency of strategy $i$ at time $t$

$x(t) = (x_1(t), x_2(t), \dots, x_n(t))$ : state variable

state space : $\{(x_1, \dots, x_n) \in \mathbb{R}^n_+ : \sum_i x_i = 1\}$

Evolutionary game dynamics : $\dot{x} = f(x, \text{payoffs})$

Framework and notation - II

Payoffs :

payoff matrix $A = (a_{ij})_{1 \le i,j \le n}$

expected payoff of strategy $i$ against $x$ : $a_i(x) = (Ax)_i$

mean payoff : $x \cdot a(x)$, where $a(x) = (a_1(x), \dots, a_n(x))$

For any quantity $q(t)$, let $\bar{q}(t) = \frac{1}{t} \int_0^t q(s)\,ds$

Replicator Dynamics (REP)

(REP) $\dot{x}_i = x_i [a_i(x) - x \cdot a(x)]$ with $x = x(t)$

differential equation due to Taylor and Jonker (1978)

growth rate : difference between own and average payoff

idea : payoff = additional fitness

prototype of biological dynamics

Time-Average of REP (TAREP)

$X(t) = \bar{x}(t) = \frac{1}{t} \int_0^t x(s)\,ds$ with $x(s)$ following (REP)
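
As a minimal numerical sketch of (REP) and its time-average (not part of the original slides), the Python code below integrates (REP) with an explicit Euler scheme and accumulates a running average of the state. The function names, the step size `dt`, and the example payoff matrix are illustrative assumptions.

```python
def payoff_vector(x, A):
    """a(x) = Ax: expected payoff of each pure strategy against state x."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def replicator_step(x, A, dt):
    """One explicit Euler step of (REP): x_i += dt * x_i * (a_i(x) - x.a(x))."""
    a = payoff_vector(x, A)
    mean = sum(xi * ai for xi, ai in zip(x, a))
    return [xi + dt * xi * (ai - mean) for xi, ai in zip(x, a)]


def time_average_rep(x0, A, T, dt=0.01):
    """Return (x(T), X(T)), with X(T) approximating (1/T) * int_0^T x(s) ds."""
    x = list(x0)
    running_sum = [0.0] * len(x)
    steps = int(round(T / dt))
    for _ in range(steps):
        # Left Riemann sum for the integral of x(s).
        running_sum = [s + dt * xi for s, xi in zip(running_sum, x)]
        x = replicator_step(x, A, dt)
    return x, [s / (steps * dt) for s in running_sum]


# Hypothetical example: an anti-coordination game whose interior
# equilibrium is (1/2, 1/2).
A = [[0.0, 1.0], [1.0, 0.0]]
x_T, X_T = time_average_rep([0.9, 0.1], A, T=50.0)
```

In this example the state $x(T)$ settles near the interior equilibrium, while the time-average $X(T)$ still carries a trace of the initial transient that shrinks like $1/T$.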

Best Reply Dynamics (BRD)

(BRD) $\dot{x} \in BR(x) - x$

where BR(x) is the set of mixed best-replies to x

differential inclusion ; Gilboa and Matsui (1991), Matsui (1992)

population evolving towards best-reply to current situation

idea : in every time interval, a fraction of the population switches to a current best-response

prototype of rational (but myopic) dynamics
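
One way to picture (BRD) numerically is an Euler scheme in which, at each step, a fraction `h` of the population switches to a current pure best reply. The sketch below is only an illustration under stated assumptions: (BRD) is a differential inclusion, and here ties are broken by picking the lowest-index best reply, which selects one solution among possibly many. All names and the example game are hypothetical.

```python
def payoff_vector(x, A):
    """a(x) = Ax: expected payoff of each pure strategy against state x."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def pure_best_reply(x, A):
    """A pure best reply to x as a unit vector (ties: lowest index, an assumption)."""
    a = payoff_vector(x, A)
    best = max(range(len(x)), key=lambda i: a[i])
    return [1.0 if i == best else 0.0 for i in range(len(x))]


def brd_path_end(x0, A, T, h=0.01):
    """Euler scheme for x' in BR(x) - x: move a fraction h towards a best reply."""
    x = list(x0)
    for _ in range(int(round(T / h))):
        b = pure_best_reply(x, A)
        x = [xi + h * (bi - xi) for xi, bi in zip(x, b)]
    return x


# Hypothetical example: a coordination game; from (1/2, 1/2) the first
# strategy is the unique best reply, so the population flows towards it.
A = [[2.0, 0.0], [0.0, 1.0]]
x_end = brd_path_end([0.5, 0.5], A, T=10.0)
```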

Similarities between TAREP and BRD - I

1. Convergence results that also hold for REP :

Example : In so-called potential games and in games with an interior evolutionarily stable strategy, any interior solution converges to a Nash equilibrium (NE).

2. Convergence results that do not hold for REP :

Example : In zero-sum games with an interior equilibrium, any interior solution converges to the set of NE

Similarities between TAREP and BRD - II

3. Divergence “exactly in the same way” :

Example : Generalized Rock-Paper-Scissors game

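$$A = \begin{pmatrix} 0 & \varepsilon & -1 \\ -1 & 0 & \varepsilon \\ \varepsilon & -1 & 0 \end{pmatrix}$$

If $\varepsilon \ge 1$, then any interior solution converges to the unique NE

If $\varepsilon < 1$, then any interior solution converges to the “Shapley triangle” (Gaunersdorfer and Hofbauer, 1995).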
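
The behaviour of the time-average of REP in this game can be explored numerically. The sketch below (an illustration, not the authors' code) integrates (REP) with a classical Runge-Kutta scheme, chosen here because plain Euler steps drift on the closed orbits of the zero-sum case $\varepsilon = 1$, and returns both the state and its time-average. Function names, step size, horizons and the initial condition are assumptions.

```python
def rps_matrix(eps):
    """Generalized Rock-Paper-Scissors payoff matrix from the slide."""
    return [[0.0, eps, -1.0], [-1.0, 0.0, eps], [eps, -1.0, 0.0]]


def rep_rhs(x, A):
    """Right-hand side of (REP)."""
    a = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
    mean = sum(xi * ai for xi, ai in zip(x, a))
    return [xi * (ai - mean) for xi, ai in zip(x, a)]


def rk4_step(x, A, dt):
    """One classical fourth-order Runge-Kutta step for (REP)."""
    def shifted(k, c):
        return [xi + c * ki for xi, ki in zip(x, k)]
    k1 = rep_rhs(x, A)
    k2 = rep_rhs(shifted(k1, dt / 2), A)
    k3 = rep_rhs(shifted(k2, dt / 2), A)
    k4 = rep_rhs(shifted(k3, dt), A)
    return [xi + dt * (p + 2 * q + 2 * r + s) / 6
            for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]


def state_and_time_average(eps, x0, T, dt=0.02):
    """Return (x(T), X(T)) for the generalized RPS game with parameter eps."""
    A = rps_matrix(eps)
    x, integral = list(x0), [0.0, 0.0, 0.0]
    steps = int(round(T / dt))
    for _ in range(steps):
        x_next = rk4_step(x, A, dt)
        # Trapezoidal rule for the running integral of x(s).
        integral = [s + dt * (u + v) / 2 for s, u, v in zip(integral, x, x_next)]
        x = x_next
    return x, [s / (steps * dt) for s in integral]


x0 = [0.6, 0.3, 0.1]
x_good, X_good = state_and_time_average(2.0, x0, T=200.0)   # eps > 1
x_zero, X_zero = state_and_time_average(1.0, x0, T=300.0)   # zero-sum case
```

For $\varepsilon = 1$ the state itself keeps cycling around the equilibrium $(1/3, 1/3, 1/3)$, while its time-average approaches it; this is the kind of convergence result that holds for TAREP but not for REP.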

Main result

(BRD) and (REP) look very different, yet striking similarities between the behaviour of (BRD) and of the time-average of (REP).

Why ?

Main result : formal link

Up to a change in time, any interior solution of the time-average of REP is a perturbed solution of BRD, with the perturbation vanishing as $t \to +\infty$.

Formal statement

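Define the perturbed best-reply correspondence $BR^{\varepsilon}$ by :

$y \in BR^{\varepsilon}(x)$ if $\forall i$, $[\max_j a_j(x)] - a_i(x) > \varepsilon \Rightarrow y_i < \varepsilon$

Thm : if $X(\cdot)$ is the time-average of an interior solution of REP, then

$\dot{X}(t) \in \frac{1}{t} \left( BR^{\varepsilon(t)}(X(t)) - X(t) \right)$

with $\varepsilon(t) \to 0$ as $t \to +\infty$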
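
The definition of $BR^{\varepsilon}$ translates directly into a membership test. The following sketch is illustrative only: the function name is hypothetical, and `y` is assumed to already be a mixed strategy (non-negative entries summing to one), which the code does not check.

```python
def in_perturbed_best_reply(y, x, A, eps):
    """True if y is in BR^eps(x): every strategy whose payoff against x falls
    more than eps short of the best payoff has weight below eps in y."""
    n = len(x)
    a = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    best = max(a)
    return all(y[i] < eps for i in range(n) if best - a[i] > eps)


# Hypothetical example: against x = (0.8, 0.2) the second strategy is
# 0.6 worse than the first in this coordination game.
A = [[1.0, 0.0], [0.0, 1.0]]
x = [0.8, 0.2]
almost_best = in_perturbed_best_reply([0.95, 0.05], x, A, eps=0.1)
too_mixed = in_perturbed_best_reply([0.5, 0.5], x, A, eps=0.1)
```

As `eps` shrinks, the test forces `y` to concentrate on exact best replies; for large `eps` it accepts every mixed strategy.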

Corollary

Corollary : the limit set of any interior solution of the time-average of REP “has the same properties” as a true limit set of BRD

That is : internally chain transitive under BRD, hence invariant.

Proof : apply results on perturbed differential inclusions due to Benaïm, Hofbauer and Sorin (2005, 2006).

Consequences

(almost) all properties mentioned above

in any zero-sum game (even with no interior equilibrium), every interior solution of TAREP converges to the set of NE

a better understanding

Intuition - I

We want to show that under REP, the time-average of the past evolves towards an approximate best-response to itself.

First idea : past of tomorrow = (past of today) + today

Formally : $\dot{X} = \frac{1}{t}(x - X)$

We want : $\dot{X} \in \frac{1}{t}(BR^{\varepsilon(t)}(X) - X)$

We need : $x \in BR^{\varepsilon(t)}(X)$
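
The identity for $\dot{X}$ follows from differentiating the definition of the time-average. The short derivation below is added for illustration; it uses the symbols of the slides, plus a rescaled time variable $\tau$ that is an assumption of this sketch and makes the "change in time" of the main result explicit.

```latex
\begin{align*}
X(t) &= \frac{1}{t}\int_0^t x(s)\,ds \\
\dot{X}(t) &= -\frac{1}{t^2}\int_0^t x(s)\,ds + \frac{1}{t}\,x(t)
            = \frac{1}{t}\bigl(x(t) - X(t)\bigr) \\
\text{with } t = e^{\tau}:\quad
\frac{dX}{d\tau} &= t\,\dot{X}(t) = x - X
\end{align*}
```

So if $x(t) \in BR^{\varepsilon(t)}(X(t))$, then in the time scale $\tau$ the average $X$ moves in a direction belonging to $BR^{\varepsilon(t)}(X) - X$, that is, it follows a perturbed version of (BRD).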

Intuition - II

We need $x \in BR^{\varepsilon(t)}(X)$, that is : eventually, strategies having a high share are almost best replies to the average population of the past.

Idea : REP is a selection process.

Strategies having a high share now are those that had :

- a good average growth rate in the past

- hence (?) a good average payoff in the past

- hence (?) a good payoff against the average population of the past

Problems : justify both “hence” + good versus best

Growth rate and payoff

good average growth rate ⇔ good average payoff ?

Yes, because differences in growth rates = differences in payoffs

Recall :

(REP) $\dot{x}_i = x_i [a_i(x) - x \cdot a(x)]$ with $x = x(t)$

Let $g_i = \dot{x}_i / x_i$ and $a_i = a_i(x)$. We have :

$g_i - g_j = a_i - a_j$

hence $\bar{g}_i - \bar{g}_j = \bar{a}_i - \bar{a}_j$

Past average payoff and payoff against average past

good average payoff in the past

⇔ good payoff against average population of the past ?

Yes because the payoffs are linear in the population profile :

$a_i(x) = (Ax)_i$ hence $\bar{a}_i = \overline{(Ax)_i} = (A\bar{x})_i = (AX)_i$
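
Both linearity arguments (growth-rate differences equal payoff differences, and the average payoff equals the payoff against the average state) can be checked on numbers. In the sketch below, a finite list of hypothetical past states stands in for the time integral; the payoff matrix and all names are illustrative assumptions.

```python
def payoff_vector(x, A):
    """a(x) = Ax: expected payoff of each pure strategy against state x."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def growth_rates(x, A):
    """g_i = a_i(x) - x.a(x): per-capita growth rate of strategy i under (REP)."""
    a = payoff_vector(x, A)
    mean = sum(xi * ai for xi, ai in zip(x, a))
    return [ai - mean for ai in a]


def average(vectors):
    """Component-wise arithmetic mean of a list of equal-length vectors."""
    n = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(n)]


A = [[0.0, 2.0, -1.0], [-1.0, 0.0, 2.0], [2.0, -1.0, 0.0]]
past_states = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]

# Average of past payoffs versus payoff against the average past state.
avg_of_payoffs = average([payoff_vector(x, A) for x in past_states])
payoff_vs_avg = payoff_vector(average(past_states), A)

# Growth-rate differences versus payoff differences at one state.
g = growth_rates(past_states[0], A)
a = payoff_vector(past_states[0], A)
```

The two averaged vectors agree up to rounding because $x \mapsto Ax$ is linear; with payoffs that are nonlinear in the population profile this step would fail.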

Good versus best

We saw : surviving strategies are good responses to the average population profile of the past

Why almost best responses ?

Answer : over a long period of time, a small difference in selection pressures makes a large difference in shares

→ strategies that are good but not best-responses to the past are eliminated

Link with logit map

$BR(x)$ multivalued, no $C^1$ selection

Logit approximation : $br^{\varepsilon}(x) = \arg\max_{y \in \Delta} \left( y \cdot a(x) - \varepsilon \sum_k y_k \ln y_k \right)$

Unique solution : $br^{\varepsilon}(x) = L(a(x)/\varepsilon)$ with $L_i(U) = \frac{\exp(U_i)}{\sum_j \exp(U_j)}$

$L$ : logit map, appears in multiplicative weight algorithms

Prop : the solution of (REP) starting at the barycenter satisfies $x(t) = br^{\varepsilon}(X(t))$ with $\varepsilon = 1/t$.

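Proof. We have : $\frac{\dot{x}_i}{x_i} - \frac{\dot{x}_j}{x_j} = a_i - a_j$

Integrating and assuming $x_i(0) = x_j(0)$, this gives :

$\ln(x_i / x_j) = \int_0^t (a_i - a_j) = t(\bar{a}_i - \bar{a}_j)$

hence $\frac{x_i}{x_j} = \exp(t[\bar{a}_i - \bar{a}_j]) = \frac{\exp(t \bar{a}_i)}{\exp(t \bar{a}_j)}$

hence letting $\varepsilon = 1/t$ and $Z$ be a normalization factor :

$x_i(t) = \frac{\exp(t \bar{a}_i)}{Z} = \frac{\exp(\bar{a}_i / \varepsilon)}{Z} = br^{\varepsilon}_i(X(t))$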
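
The proposition can be checked numerically: integrate (REP) from the barycenter, form the time-average $X(t)$, and compare $x(t)$ with the logit map applied to $t \cdot A X(t)$. The sketch below is an illustration under stated assumptions (a Runge-Kutta integrator, a trapezoidal time-average, and an arbitrary hypothetical payoff matrix); agreement is only up to discretization error.

```python
import math


def payoff_vector(x, A):
    """a(x) = Ax."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]


def logit(U):
    """Logit map L_i(U) = exp(U_i) / sum_j exp(U_j), shifted for stability."""
    m = max(U)
    w = [math.exp(u - m) for u in U]
    z = sum(w)
    return [wi / z for wi in w]


def rep_rhs(x, A):
    """Right-hand side of (REP)."""
    a = payoff_vector(x, A)
    mean = sum(xi * ai for xi, ai in zip(x, a))
    return [xi * (ai - mean) for xi, ai in zip(x, a)]


def rk4_step(x, A, dt):
    """One classical fourth-order Runge-Kutta step for (REP)."""
    def shifted(k, c):
        return [xi + c * ki for xi, ki in zip(x, k)]
    k1 = rep_rhs(x, A)
    k2 = rep_rhs(shifted(k1, dt / 2), A)
    k3 = rep_rhs(shifted(k2, dt / 2), A)
    k4 = rep_rhs(shifted(k3, dt), A)
    return [xi + dt * (p + 2 * q + 2 * r + s) / 6
            for xi, p, q, r, s in zip(x, k1, k2, k3, k4)]


def rep_state_and_logit_prediction(A, T, dt=1e-3):
    """Return (x(T), L(T * A X(T))) for the solution started at the barycenter."""
    n = len(A)
    x = [1.0 / n] * n
    integral = [0.0] * n
    steps = int(round(T / dt))
    for _ in range(steps):
        x_next = rk4_step(x, A, dt)
        integral = [s + dt * (u + v) / 2 for s, u, v in zip(integral, x, x_next)]
        x = x_next
    t = steps * dt
    X = [s / t for s in integral]
    # br^eps(X) with eps = 1/t, i.e. the logit map applied to t * a(X).
    return x, logit([t * ai for ai in payoff_vector(X, A)])


# Hypothetical payoff matrix whose barycenter is not an equilibrium.
A = [[0.0, 1.0, -1.0], [2.0, 0.0, 0.0], [-1.0, 1.0, 1.0]]
x_T, predicted = rep_state_and_logit_prediction(A, T=5.0)
```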

Summary

REP and BRD seem very different but actually, the time-average of REP is related to BRD

The link can be made using another standard tool : the logit map.

Shows a link between REP and multiplicative weight algorithms (also noted by Ed Hopkins).

Comments I - What does the result really mean ?

The theorem says the time-average of REP : will lead you to some invariant set of BRD

Does not say : will lead you to the same outcome as BRD

E.g., (BRD) and (REP) may lead to different equilibria.

Example (Golman and Page, 2010)

$$\begin{array}{c|ccc} & A & B & C \\ \hline A & 1 & -N & 0 \\ B & -N^2 & 0 & 1 \\ C & 0 & 0 & 0 \end{array}$$

REP leads to everybody playing A, hence so does its time-average

But BRD leads to everybody playing B.

Precisely : there exists a sequence $\varepsilon_N \to 0$ such that more than $1 - \varepsilon_N$ of the state space flows to A under REP and to B under BRD.

Sketch of proof

$$\begin{array}{c|ccc} & A & B & C \\ \hline A & 1 & -N & 0 \\ B & -N^2 & 0 & 1 \\ C & 0 & 0 & 0 \end{array}$$

REP : if $x_1 > 1/N$, strategy 1 earns more than average, hence $\dot{x}_1 > 0$, hence leads to A.

BRD : for most initial conditions, flows first to C (unique best-response), then to B (best-response to C), hence leads to B.
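
The two arguments above can be illustrated on one initial condition. The sketch below is only a rough numerical illustration under assumptions: the value $N = 10$, the starting point, the explicit Euler schemes and their step sizes, and the lowest-index tie-breaking rule for best replies are all choices made here, not taken from Golman and Page.

```python
def golman_page_matrix(N):
    """Payoff matrix of the example, rows and columns ordered A, B, C."""
    return [[1.0, -float(N), 0.0],
            [-float(N) ** 2, 0.0, 1.0],
            [0.0, 0.0, 0.0]]


def payoff_vector(x, A):
    """a(x) = Ax."""
    return [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]


def rep_end_state(x0, A, T, dt=0.001):
    """Explicit Euler integration of (REP) up to time T."""
    x = list(x0)
    for _ in range(int(round(T / dt))):
        a = payoff_vector(x, A)
        mean = sum(xi * ai for xi, ai in zip(x, a))
        x = [xi + dt * xi * (ai - mean) for xi, ai in zip(x, a)]
    return x


def brd_end_state(x0, A, T, h=0.01):
    """Euler scheme for (BRD): move a fraction h towards a pure best reply."""
    x = list(x0)
    for _ in range(int(round(T / h))):
        a = payoff_vector(x, A)
        best = max(range(3), key=lambda i: a[i])  # ties: lowest index
        x = [xi + h * ((1.0 if i == best else 0.0) - xi)
             for i, xi in enumerate(x)]
    return x


A = golman_page_matrix(10)
x0 = [0.3, 0.3, 0.4]          # x_1 > 1/N, so REP should head to A
rep_final = rep_end_state(x0, A, T=30.0)
brd_final = brd_end_state(x0, A, T=20.0)
```

From this starting point the replicator path ends close to the pure state A, while the best-reply path first moves towards C and then turns towards B.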

Comments II - extension

The analysis extends to a more general framework : “games against the environment”.

That is : the focal player faces a stream of vector payoffs, does not know if she plays against Nature, one player, several players...

Link then between time-average of REP and Fictitious Play

Comments III - lack of extensions

The proof uses two kinds of linearity :

- the growth rates are linearly related to the payoffs : a property of (REP)

- the payoffs are linear in the population profile : a property of one- or two-population settings.

→ The link between BRD and the time-average of REP does not extend to variants of REP nor to three-population dynamics

Comments IV - no-regret vs Nash

The strategies surviving a selection process are optimal against the average past, but not necessarily against the present.

Accordingly, (REP) satisfies a no-regret property (Hofbauer) but not convergence to Nash equilibrium.

The End

Thank you very much
