Evolutionary Game Theory and the Prisoner's …stat-or.unc.edu/files/2016/04/TR_08_08.pdfEGT and the...

EGT and the IPD page 1 of 53

Evolutionary Game Theory and the Prisoner's Dilemma: a survey

Douglas G. KellyDepartment of Statistics and Operations Research

University of North Carolina at Chapel HillOctober 2008

Contents0. Introduction1. The Prisoner's Dilemma2. The Prisoner's Dilemma as a two-player game3. The Iterated Prisoner's Dilemma (IPD)4. IPD competition as a two-player game5. The length of the round; discounting the future6. Game theory: Nash equilibrium7. Mixed strategies, best replies, expected payoff, and Nash equilibrium8. Evolutionarily stable strategies9. Evolutionary game theory: replicator dynamics10. Dynamic stability of stationary points11. Special case: 2 2 symmetric games‚12. Some particular pairs of strategies13. A class of three-strategy populations14. The case of TFT, TF2T, and STFT15. References

0. Introduction. This is an expository survey, intended to introduce, at the upperundergraduate level, enough game theory, evolutionary game theory, and evolutionarypopulation dynamics to begin a study of the ecology of populations of individualsplaying various strategies in games involving competition and cooperation. Somefamiliarity with linear algebra, differential equations, and elementary probability isassumed.

Sections 1 through 5 introduce the Prisoner's Dilemma, which is a particular example of asymmetric 2 2 game, and the Iterated Prisoner's Dilemma (IPD), in which competitious‚involving finite sets of strategies can be viewed as symmetric games.8 ‚ 8

Sections 6 through 8 introduce some of the ideas of game theory, focusing on mixedstrategies viewed as populations of pure-strategy players. We discuss some equilibriumproperties of mixed strategies in particular, Nash equilibrium and evolutionarystability that appear meaningful for studying the evolution of such populations.

In Sections 9 and 10 we introduce the , a system of ordinaryreplicator dynamicsdifferential equations whose orbits represent population evolution. We show (in somecases without proof; the proofs are in another paper to come) all the connections among


the equilibrium properties from Sections 6 through 8 and the various types of stationarypoints of the replicator dynamics. These connections are summarized in Table 10.1.

Our exposition of the theory in Sections 6 through 10 is based heavily on the coverage inthree important books on the subject, namely those by Hofbauer and Sigmund [1998],Osborne and Rubinstein [1994], and Weibull [1997].

In Secion 11 we cover the simplest special case, 2 symmetric games, classifying and‚ #describing all possible types of population dynamics in terms of the parameters of thegame. In Section 12 we apply these results to some two-strategy populations for the IPD.

For games, such a complete categorization is sufficiently complicated to be not$ ‚ $only infeasible but also unenlightening. In Section 13 we consider a fairly narrowsubclass of the games and describe completely all types of dynamics for these.$ ‚ $Though narrow, this subclass includes the games determined by three particular strategiesfor the IPD, suggested in a paper of Boyd and Lorberbaum [1987]. In Section 14 weexamine the dynamics of these three strategies. We confirm Boyd and Lorberbaum'sobservation that the celebrated strategy is in fact not evolutionarily stable;Tit for Tatunder some parameter settings it is driven to extinction by the combination of a more“forgiving” strategy, , and a less “nice” strategy, .Tit for Two Tats Suspicious Tit for Tat


1. The Prisoner's Dilemma. We start with a game invented in 1950 by Merrill Floodand Melvin Dresher of RAND. The history of this game, and many of its ramifications,are engagingly told in William Poundstone's book [1993[. Albert W.Prisoner's Dilemma Tucker named the game the and told something like the followingPrisoner's Dilemma story to motivate it.

Two people have been arrested in connection with a crime and are being held,unable to communicate with each other. The prosecuting attorney talks to eachof them separately, saying: “This crime carries a maximum sentence of sixyears in prison. If neither of you chooses to confess to it, I will have sometrouble prosecuting you, but I can still get a three-year sentence for each of you.If one of you confesses and the other doesn't, the confessor's sentence will bereduced to one year, while the other will get the full six years. If you bothconfess, you will each get five years.”

If you were one of these prisoners, what would you do? You have a choice of twoactions: with your partner in crime and refuse to confess, or and turnCooperate DefectState's evidence. The payoff (sentence reduction, in years) to either prisoner depends onwhat both prisoners do, according to the following table.

B's actionCooperate Defect

A's Cooperate 3 for A, 3 for B 0 for A, 5 for Baction Defect 5 for A, 0 for B 1 for A, 1 for B

The dilemma, once thought of as a paradox, is this. Each prisoner looks at the payofftable and notices that, regardless of what the other prisoner does, it is better to defect. Ifthey could talk to each other, agree to cooperate, and trust each other, they could getthree-year reductions. But they can't, and the result is that both defect and get only one-year reductions, simply because of their inability to communicate and lack of trust.

As applied to two criminals, this story's moral may be ambiguous. But real life is full ofsimilar situations “games” involving two or more players, in which defecting takes theform of “free riding.” Garrett Hardin [1968] wrote about the “Tragedy of the Commons.”Hardin's parable is of common grazing land, maintained and used by a collection ofkeepers of livestock. As long as they all “cooperate” and keep their herds at moderatesizes, all can use the land comfortably. But anyone who “defects” and unduly increaseshis herd size will gain. If many choose to defect, all suffer in the long run. Yet in theoptimal “all cooperate” state, there is a strong incentive for any individual to defect.

Analogous situations abound in human society. For example:

Why should public-radio listeners bother to contribute to their local stations? Yourcontribution is a noticeable expense for you but its absence would have littleeffect on the station.


Late at night in a subway station, with no one around, why not jump over the turnstileto avoid paying? It saves you a fare and costs the system almost nothing.

In traffic, where two lanes are narrowing into one, the flow will be smoother foreveryone if drivers alternate entering the single lane, but a defector saves sometime at cost to the others.

Why should you vote, since it's inconvenient for you and you can be virtually certainthat your vote will not affect the outcome?

In each case, the benefit to you of defecting, or free riding, is greater than its negativeeffect on the world. So people defect; and when many defect, the world is worse than itmight be.

On the other hand, many of us do cooperate, and as a result the world isn't as bad as itcould be. The question is: where does cooperation come from, in humans and in someanimal species? Where do we get our moral sense of wanting to be part of the solutionand not part of the problem? Religious answers may be sufficient in some contexts, but,like “intelligent design,” they don't get to the mechanisms by which deities work in theworld. Scientists, religious or not, want to understand these mechanisms.

The Prisoner's Dilemma is a minimal, and clearly insufficient, model for interactionsinvolving competition and cooperation among humans. What do we need to add to themodel so that cooperative behavior finds a place?

There is a twofold partial answer.First, the relevant game is not the Prisoner's Dilemma, but the so-called Iterated

Prisoner's Dilemma, in which players will play series of Prisoner's Dilemmagames against each other and may discover the advantages of cooperation and thepossibility of punishing defectors.

Second, extending game theory to provides a way to seeevolutionary game theoryhow populations of players, using different strategies, may attain stability with amix of different degrees of cooperativeness.


2. The Prisoner's Dilemma as a two-player game. A general two-player game takesplace between two people, a “row player” whom we will call Rhoda, and a “columnplayer,” called Colleen. The game has a payoff table of the form

Colleen's actions

Rhoda'sactions

G G â GV + ß , + ß , â + ß ,V + ß , + ß , â + ß ,ã ã ã ä ãV + ß , + ß , â + ß ,

" # 8

" "" "" "# "# "5 "5

# #" #" #" #" #5 #5

5 8" 8" 8# 8# 85 85

where is the payoff to Rhoda, and is the payoff to Colleen, when they+ − , −34 34‘ ‘take actions and , respectively. The two matrices and V G 8 ‚ 5 E œ + F œ ,3 4 34 34c d c ddetermine the game.

The game is if , , and . Note that thesymmetric 8 œ 5 V œ G Ð3 œ "á ß8Ñ F œ E3 3X

matrix of a symmetric game is not necessarily a symmetric matrix. If does happenE Eto be symmetric, the game is called .doubly symmetric

We will consider only symmetric two-player games in these notes.

A (PD) is defined to be a symmetric two-player game with twoPrisoner's Dilemmaactions, denoted C (Cooperate) D (Defect) instead of and , and payoff matricesV V" #

generalizing the original C D C D

C CD D

VßV Wß X $ß $ !ß &X ß W T ß T &ß ! "ß "

ß

with and . These four letters are chosen so that we canW T V X V ÐW XÑ"#

remember as the Reward for mutual cooperation,V as the Sucker's penalty,W as the Temptation to defect, andX as the Penalty for mutual defection.T

The second condition, , will be important when we consider two peopleV ÐW XÑ"#

playing repeatedly against each other. Then is the average gain per play of people whoVcooperate with each other, while is the average gain per play of people who"

# ÐW XÑ

alternately cooperate and defect, out of phase with each other. We will see theimportance of this below.

Customarily we will define a symmetric game by giving only the matrix , sinceEF œ EX . Thus we would represent the Prisoner's Dilemma with


E œV W $ !X T & "” • ” •or, in the original example, ,

remembering that the entries represent payoffs to Rhoda, the row player. We will speakof the parameters as the ”Axelrod numbers,” for reasonsV œ $ß W œ !ß X œ &ß T œ "that we will see below.


3. The Iterated Prisoner's Dilemma. Suppose two players repeatedly play the PDagainst each other. They can decide on each play whether to cooperate or defect, basedon the history of the game so far. Here is a possible outcome of a ten-play round betweentwo players, X and Y.

play 1 2 3 4 5 6 7 8 9 10 totalX's action (payoff) C (3) C (0) D (1) C (3) C (3) D (5) C (0) C (0) C (3) C (0) 18Ys action (payoff) C (3) D (5) D (1) C (3) C (3) C (0) D (5) D (5) C (3) D (5) 33

If they had both cooperated on every play, they would each have got 30 points in all. ButY was able to improve on that by being willing to defect, taking advantage of X'scooperativeness.

At this point it is helpful to think for a while about how one might play the IPDeffectively, assuming that one's opponent, like oneself, is bright and analytic.

The IPD became famous beginning in the late 1970s, when Robert Axelrod, a politicalscientist then at the University of Michigan, conducted a series of tournaments. Axelrodasked various researchers to contribute strategies, and then ran computer tournaments inwhich each submitted strategy was pitted against every other submitted strategy, as wellas against a copy of itself, along with some random strategies in which the actions areBernoulli sequences. Axelrod's results are summarized in a pair of books, The Evolutionof Cooperation The Complexity of Cooperation [1984] and [1997].

Here are some of the simpler strategies submitted to Axelrod's tournaments.

Tit for Tat: Cooperate on the first play; on subsequent moves do whateveropponent did on the previous play.

Always Cooperate.Always Defect.Grudger (“Massive Retaliation”): Cooperate until opponent defects, then defect on

every play thereafter.Suspicious Tit for Tat: Defect on the first play; on subsequent plays do whatever

opponent did on the previous play.Tit for Two Tats: Cooperate until opponent has defected twice in succession; then

defect until the opponent cooperates again; start over.Prober: Same as Tit for Tat, but wherever a C is called for, D with some small

probability (say 0.1).p : œPeacemaker: Same as Tit for Tat, but wherever a D is called for, C with some

small probability p.Adaptive: Begin with CCCCCCDDDDD; then on each play choose the action that

has given the best average return to date. [There are obviously variants, usingdifferent initial sequences.]

Soft Grudger: Cooperate until opponent defects, then punish with DDDDCC.[Obvious variants use different sequqnces of punishment followed byforgiveness.]


Pavlov, or Win-Stay, Lose-Switch: Cooperate on the first play; after that, repeatthe previous action if opponent cooperated on that play, change if opponentdefected.

The most important and surprising early finding in these tournaments was the success ofTit for Tat (TFT), the invention and tournament entry of Anatol Rapoport. It is one of thesimplest strategies; it looks only at the opponent's previous move and merely repeats it.It is not hard to see that TFT can never beat any opponent in head-to-head competition:at the end of a round it will either be tied with the opponent or have 5 points less. Yet itconsistently ranked at or near the top against the field in Axelrod's competitions.

Competitions have been held regularly by a group whose website can be found at<http://www.prisoners-dilemma.com>. Strategies have become quite elaborate in recentyears. TFT can be beaten, though not badly; it continually ranks among the mostsuccessful strategies. Another successful simple strategy is Pavlov, which strangelydidn't enter the public consciousness until the early 1990s. It appears to outperform TFT,depending of course on the other strategies in the contest.

Notice that the emphasis has shifted when we think about tournaments. Instead of tryingto beat opponents in single rounds of head-to-head competition, we would like a strategythat performs well against many other strategies. As we have mentioned, there are highlysuccessful strategies like TFT that cannot beat other strategy head-to-head, but areanynever beaten badly by any of them.

In his early work Axelrod noticed some qualities of strategies that seemed to make themsuccessful in his tournaments. These observations are provocative, if sometimes vague,and rigorous work on them has been difficult to achieve.

One such quality is A strategy is one that is never the first toniceness. nice defect. Examples are TFT, Grudger, Soft Grudger, Pavlov, and several others.Suspicious Tit for Tat, Prober, Adaptive, and Always Defect are not nice.

A strategy is called to the extent that it responds to defections byprovokable defecting one or more times. TFT is provokable, Tit for Two Tats less so;Always Cooperate is not provokable; Grudger is maximally provokable.

A strategy is called to the extent that it resumes cooperating at someforgiving point after an opponent's defection. TFT is quite forgiving; Always Cooperate isperfectly forgiving; Grudger is perfectly non-forgiving.

Axelrod noticed, and hypothesized, that successful strategies tend to be nice, provokable,and forgiving.

In reading Axelrod's early papers and his first book, one is struck by their qualitativeviewpoint. Axelrod scarcely describes the submitted strategies, but takes care todocument who submitted them, what fields of expertise and what organizations the


submitters came from, the lengths of the submitted computer programs, and thelanguages the programs were written in. In later work, of course and even in hisearliest work, as noted above Axelrod recognized the importance of trying to identifythe qualities that made strategies successful.

Axelrod used the parameter values in his tournaments, andV œ $ß W œ !ß X œ &ß T œ "while he was not the first to use these particular values, we will refer to them as the“Axelrod numbers.”

Where are we heading? At first one may approach the IPD by looking for a strategy thatbeats other strategies in head-to-head competition. Tournaments like Axelrod's put theemphasis not on head-to-head competition but on performing well against a field ofopponents playing a wide range of strategies; it is not important to do well againstindividuals, or even to beat a large number of others, as it is to do better than anyone elseagainst the field.

But when we think about the theevolution of cooperation among self-interested agents,emphasis shifts again. Now the goal is not to devise strategies that win in any sense, butto see what kinds of strategies can exist in stable equilibrium in a population.

Here “stable equilibrium” refers to the following scenario: in a large population ofindividuals, each individual plays one strategy consistently in encounters against a largenumber of randomly-chosen opponents. Strategies gain “fitness points” proportional totheir average gains in these encounters. Individuals reproduce, passing their strategies totheir progeny, and the number of progeny of an individual playing a given strategy isproportional to the number of fitness points she earned. A population is in stableequilibrium if the population shares of the pure strategies do not change over time. Aswe will see, this occurs when all strategies have the same expected average gain perencounter. Sections 6 through 10 below expand on this sketch of the scenario.


4. IPD competition as a two-player game. Now we observe that, given a set ofstrategies, we can regard an IPD round as a single play in a symmetric two-player game,in which the actions are the strategies in the given set.

To illustrate, suppose a number of players each choose one of the three strategies Tit forTat (TFT), Suspicious Tit for Tat (STFT), and Tit for Two Tats (TF2T). How will theydo in IPD rounds against each other (including against players using the same strategy asthemselves)?

Here, for example, is what happens in a round between two players, one playing TFT andthe other STFT.

play 1 2 3 4 5 6 7 8 9 10 averageTFT C ( ) D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D ( )

STFT D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D W X W X W X W X W X ÐW XÑ

X W X W X W X W

"#

( ) C ( )X W ÐW XÑ"#

The averages will change slightly if the round has an odd number of plays, butasymptotically for long rounds each player will earn points per round. With"

# ÐW XÑ

the Axelrod numbers this is 2.5.

Note that if TFT plays TFT, they get per play for an average of , and we haveV Vassumed . This requirement makes mutual cooperation preferable to theV ÐW XÑ"

#

kind of cycle of retribution shown above.

As another example, consider a long round of TF2T against STFT.

play 1 2 3 4 averageTF2T C ( ) C ( ) C ( ) C ( )STFT D ( ) C ( ) C ( ) C ( )

âW V V V â µ VX V V V â µ V

TF2T does better against STFT than TFT does. (However, TF2T is easily exploited byother non-nice strategies, which are not present in this example.)

It is not difficult to check that the following table gives the asymptotic average scores perround for each of the strategies against itself and the others.

TFT TF2T STFTTFT ( 3) ( 3) ( 2.5)TF2T ( 3) ( 3) ( 3)STFT ( 2.5) ( 3) ( 1)

(4.1)V œ V œ ÐW XÑ œ

V œ V œ V œ

ÐW XÑ œ V œ T œ

"#

"#

It is clear that strategies for the IPD can be thought of as the actions in a symmetric two-player game. A single play of this game is an IPD round.


In general, any set of IPD strategies determines a symmetric two-player game with 8 8actions available to each player. The payoffs are the average winnings of the strategiesagainst each other (and copies of themselves) in long rounds. (If the strategies haverandomness in them, the payoffs will be expected average winnings.)


5. The length of the round; discounting the future. People who play the IPD soonrecognize that it always pays to defect on the last play of the round. You cannot do betterby cooperating, and your opponent cannot punish you by defecting subsequently.Indeed, if you believe your opponent has this same insight, you may decide that youought to defect on the next-to-last play also, and so on. The well-known “unexpectedhanging paradox” comes to mind. Whether you reason all the way to the paradox or not,any strategy that does not always defect on the last play of the round can be improved bymodifying it to do so.

To avoid this, Axelrod decided to stop the IPD rounds in his tournaments at randomtimes unknown to the players. Players engage in shorter or longer rounds, but thenumber of plays over many rounds is approximately constant, so that average gain perplay (or expected average gain per play in the case of randomized strategies) isequivalent to total gain as a measure of success.

One way to achieve random stopping is to decide after each play to continue withprobability or to stop with probability 1 . If the decisions are made independently,A Athen the number of plays equals with probability .5 Ð" AÑA Ð5 œ "ß #ßá Ñ5"

We can then compute the expected payoffs of the various strategies against each other.In general, if denotes the payoff to strategy on the play against a given1 W 55 "

>2

opponent strategy , then the expected total payoff to in a randomly-stopped game isW W# "

5œ" 5œ4

∞ 5 ∞ ∞ ∞5" 5" 4"

4œ" 4œ" 4œ"

4 4 4Ð" AÑA 1 œ Ð" AÑ 1 A œ 1 A Þ

We see that this is the same as the total gain in an infinitely long IPD round, if the gainon the play is discounted by . We will refer to as the .4 A A>2 4" discount factor

For TFT STFT, for example, we have as abovevs.

play 1 2 3 4 5 6 7 8TFT C ( ) D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D ( )

STFT D ( ) C ( ) D ( ) C ( ) D ( ) C ( ) D ( ) C ( )

âW X W X W X W X âX W X W X W X W â

and so the expected total payoffs are

W XA WA XA â œW XA

" A

X WA XA WA â œX WA

" A

# $#

# $#

for TFT

and for STFT.

The advantage enjoyed by STFT, as measured by the ratio of their payoffs, is AsXWAWXA Þ

A increases towards 1 (i.e. the expected length of the game increases, or, equivalently,the future counts for more), the advantage enjoyed by STFT decreases. Notice that thedifference between their total gains increases without limit. However, it is the ratio thatmatters when we consider population dynamics, as we will see.


One can check that with the discount factor , the table (4.1) is replaced byA

TFT TF2T STFTTFTTF2TSTFT

(5.1)V V WXA

"A "A "AV V VA

"A "A "AXWA VA T"A "A "A

#

#

W

X

It is interesting to note the effect of on the game. Small values of correspond toA Arounds that are likely to be short or, equivalently, infinite rounds in which the future isdiscounted heavily. Small values of would seem to give STFT, with its initialAdefection, a greater advantage over the other strategies than larger values. And this istrue. Following are the values in the table for the Axelrod numbers V œ $ß W œ !ßX œ &ß T œ " A and two values of . We have multiplied all entries in the first matrix by3 for easier comparison with the second. (As we will see, this has no effect on any of ourconsiderations.)

A œ !Þ#&

"# "# %"# "# $"' ") %

A œ !Þ(&

"# "# )

"# "# *

"" "% %



%(

$(

The effect of on STFT's performance is apparent.A


6. Game theory: Nash equilibrium. In general, a two-player game is determined by apair of matrices, and . As noted above, the players are usually8 ‚ 5 E œ + F œ ,c d c d34 34

called the “row player” and the ”column player;” we call them Rhoda and Colleen forshort. Rhoda has actions to choose from, and Colleen has actions V ßá ßV G ßá ßG Þ" 8 " 5

If they choose actions and , then Rhoda's payoff is and Colleen's is . It isV G + ,3 4 34 43

customary to display both matrices in a single table, as we did earlier:

Colleen's actions

Rhoda'sactions

(6.1)

G G â GV + ß , + ß , â + ß ,V + ß , + ß , â + ß ,ã ã ã ä ãV + ß @ + ß , â + ß ,

" # 8

" "" "" "# "# "5 "5

# #" #" #" #" #5 #5

5 8" 8" 8# 8# 85 85

In a symmetric two-player game, , each player has the same set of actions, and8 œ 5E œ F E FX . Even for symmetric games, it is sometimes helpful to list both and . Wewould describe the PD, for example, by

C D C DC 3,3 0,5 C , ,D 5,0 1,1 D , ,

or generally by V V W XX W T T

and the IPD with three strategies TFT, TF2T, and STFT and , as seen at the endA œ !Þ(&of Section 5, by


"#ß "# "#ß "# ) ß ""

"#ß "# "#ß "# *ß "%

"" ß ) "%ß * %ß %

% $( (

$ %( (

Of course, any set of IPD strategies will produce a symmetric two-player game with 8 8actions. The discount factor must be specified, and if the strategies are random, thenAexpected payoffs are used.

Game theory originated with the attempt to answer questions like “What should playersdo in such a game?” or “What players do in such a game?” In the Prisoner'swill Dilemma, played once with no collusion and no trust, they probably will, alas, eachdefect, and perhaps this is what they should do. In the IPD game above, it is less clearwhat they will do.


Here is a nonsymmetric game in which it is instructive to think about what they will orshould do.

G G G GV &ß # #ß ' "ß % !ß %V !ß ! $ß # #ß " "ß "V (ß ! #ß # "ß & &ß "V *ß & "ß $ !ß # %ß )

" # $ %

"

#

$

%

Imagine that Rhoda, the row player, and Colleen, the column player, are going to playthis game just once. They have no opportunity to discuss the matter with each otherbefore playing. Rhoda might reason as follows: “I'd like to play in the hopes ofV%

getting that 9-point payoff. But if Colleen knows I plan to play , she'll surely play V G4 %

for an 8-point payoff. But if she's going to play , I should play (5-point payoff).G V% $

And if I play , her best play is . But knowing that, I'd better play , and if sheV G V$ $ #

knows that, then she'll play But then my best play is still . So if we play andG Þ V V# # #

G#, neither of us will have any incentive to change."

This kind of argument is a simple form of what is called . But will twofictitious playplayers actually follow such reasoning? Notice that neither of them will be very happywith the resulting pair of actions. But what is clear is that, if the pair wereV ßG# #

imposed as a default or a starting point, then neither player could improve her payoff byswitching to another action, unless the other player switched as well. This is because G#

is a to and vice versa.best reply V#

Generally, in a nonsymmetric two-player game as in (6.1) above, is a by theV3 best replyrow player to the column player's action if for . Similarly, G + + < œ "ß #ßá ß 8 G4 34 <4 4

is a by the column player to the row player's if for .best reply V , , = œ "ß #ßá ß 53 34 3=

And a (NE) is a pair of actions that are best replies to eachNash equilibrium V ßW3 4

other.

V G + + < Á 3 G V3 4 34 <4 4 3 is a to if for and similarly for to . A strict best reply strictNE is a NE in which the actions are strict best replies to each other.

It is easy to find all the NE (there may be more than one) for any game, using the tabularpresentation of and . Here is the table for the game above, with the payoffs for allE Fthe best replies in boldface.

G G G GV &ß # #ß "ß % !ß %V !ß ! ß ß " "ß "V (ß ! #ß # "ß ß "V ß & "ß $ !ß # %ß

" # $ %

"

#

$

%

'$ # #

& &* )


Look, for example, at the entry corresponding to and The is in boldface,&ß " V G$ %Þ 5because is the row player's best reply to the column player's . But is not theV G G$ % %

column player's best reply to .V$

Wee see immediately that there is one NE: the pair of actions in which bothV ßG# #

payoffs are in bold. This is the same pair of actions that we reached by the fictitious-playargument above. But we must note that such an argument does not always lead to a NE.Some games have more than one NE, and in some there is no pair of actions that are bestreplies to each other.

Another argument that may lead to NE is . Initerated elimination of dominated actionsthe game above, for example, Rhoda may notice that she has no reason ever to play ;V"

it is (i.e., never better and sometimes worse than) So she (anddominated by V Þ$ Colleen, since she is smart too) can eliminate One can check that in this reducedV Þ"game, Colleen can eliminate since dominates it. In this further reduced game,G G" %

Rhoda can eliminate , and then Colleen can eliminate , then Rhoda can eliminateV G% %

V G Þ V G$ $ # #, and finally Colleen can eliminate They are left with and , the NE.

It can be shown that iterated elimination of dominated actions never eliminates a NE;all NE will remain. But other pairs that are not NE may also remain. Indeed,sometimes there are no dominated actions in a game, so there is no elimination.

We reiterate that a NE is not necessarily optimal in any sense. In the above example,neither player would be happy with the NE , and in fact both would preferV ßG# #

V ßG V ßG% " % %or . But neither player can gain by changing actions from a NE unlessher opponent changes as well.

In the PD, as one may have sadly suspected, Defect Defect is the only NE, and it isßstrict.

C D C DC CD D

or, for any PD, .$ß $ !ß VßV Wßß ! ß ß W ß

& X& " " X T T

We also refer to the action D as a : an action that is a best reply to itself.symmetric NENotice that the idea of a symmetric NE makes little or no sense in nonsymmetric games,but that there can be nonsymmetric NE in symmetric games.

In the three-strategy IPD considered above, with , there are three NE, of whichA œ !Þ(&two are strict and the third is symmetric.


"# "# "#

"# * "%

"% *

ß "#ß ) ß ""

ß "# "#ß "# ß

"" ß ) ß %ß %

% $( (

$ %( (

(6.2)

Given that only these three strategies are available, if both players play TFT, or if one


plays TF2T and the other plays STFT, neither will have any incentive to switch strategiesunless the other does also.

A game need not have any NE:

V VV " "ßV "ß ß"

" #

"

#

" "" "

,

This game is called “Matching Pennies.” The row player wins if the actions match andthe column player wins otherwise. It is a zero-sum game, in that the row player's gain isthe column player's loss and vice versa.

This example does not contradict Nash's celebrated theorem that every finite symmetricgame has a NE. That theorem guarantees a - NE, which we will cover inmixed strategy the next section. In this section we are talking only about NE.pure-strategy

Even a symmetric game need not have any NE. The following game is “Rock, Paper,Scissors,” in which Rock defeats (breaks) Scissors, Scissors defeats (cuts) paper, andPaper defeats (covers) Rock.

Rock Paper ScissorsRock ,Paper ,

Scissors ,

!ß ! "ß "" !ß ! "ß

"ß " !ß !

" "" "

" "

One can also see easily that in a symmetric game, if is a NE, then so is V ßV V ßV Þ3 4 4 3

The considerations above are not dynamic in any sense. They have to do with one play(it is irrelevant here that one IPD play comes from many plays of a different game), withno prior history or contact between the players; and they give no indication of how anycombination of plays may be arrived at.

To put dynamic considerations into models of competition and cooperation, we need tobegin with the notion of a mixed strategy. This will be defined next as a probabilitydistribution over the set of actions available to a player. In this context the single actionsare called .pure strategies


7. Mixed Strategies, expected payoff, best replies, and Nash equilibrium. We willrestrict our attention to symmetric two-player games from here on. For simplicity, wewill label the actions by integers rather than . Any such game is"ß #ßá ß 8 V ßV ßáV" # 8

determined by an matrix , where is the payoff to any player who takes8 ‚ 8 E œ + +c d34 34

action when her opponent takes action .3 4

Given any such game, a for either player is a probability distribution overmixed strategythe actions , which we represent as a column vector of"ß #ßá ß 8 œ B ß B ßá ß BB c d" # 8

X

nonnegative numbers with The set of all such vectors is the B B œ "Þ 383 3"

8

unit simplex

‘ ?88; it is denoted .

The single action can then be viewed as the mixed strategy represented by the column3vector, denoted that has a in the position and zeros elsewhere. Such mixed/3 >2ß " 3strategies are called . The vectors are the vertices of thepure strategies / / /" # 8ß ßá ß

simplex , and any strategy equals .?8 " # 8 3X

"

83B /œ B ß B ßá ß B Bc d

There are two or three ways to think of a mixed strategy .B œ B ß B ßá ß Bc d" # 8X

Interpretation 1. As a randomized strategy for a player who chooses an action atrandom, with being the probability that she chooses action We call suchB 3Þ3

a player an B-player.Interpretation 2 In terms of an This can be eitherÞ B Þ-population 2a. A population of -players, orB 2b. A population of pure-strategy players, in which, for is the3 œ "ßá ß 8ß B3

proportion of players who always play pure strategy .3

In any of these interpretations, if and are mixed strategies, the of anB C expected payoffB B C-player, or a random member of an -population, playing against a -player or arandom member of a -population, isC

3œ" 4œ"

8 8

34 3 4X+ B C œ E œ † EB C B C.

We will regularly use the dot-product notation . Besides avoiding a superscript,B C† Ethe dot helps us remember that is the payoff to a player using .B C B† E

We will move freely among the interpretations given above, and we will drop the term“mixed,” referring simply to strategies , whether they are mixed or pure. Thus there isBno need to distinguish between an -player and an -player. Notice that the expected3 /3

payoff to an -player in an -population is which is the entry of the column3 † E 3B / B3 >2, vector EBÞ


Given two strategies and , we call a to strategy ifB D B Dbest reply

B D C D C† E † E − for all .?8

A (symmetric) is a strategy that is a best reply to itself:Nash equilibrium (NE) B

B B C B C† E † E − for all .?8

If this inequality is strict for all , then is a A vertex that is a NE is aC B B /Á strict NE. 3

pure-strategy NE.

[A general, nonsymmetric NE is a pair of strategies that are best replies to eachB Cßother. These certainly may exist in symmetric games; there are even pure-strategynonsymmetric NE, as shown in (6.2) above. But this concept has less use in evolutionarygame theory, so we will not deal with it. By “NE” here, we mean symmetric Nashequilibria.]

In his 1950 PhD dissertation, John Nash proved that every finite symmetric two-playergame has at least one symmetric NE. This can be proved using Kakutani's fixed-pointtheorem. [If denotes the set of best replies to , then is an upper semicontinuous" "Ð ÑB Bpoint-to-set map on the compact set , and so there is an with ]? "8 B B B− Ð ÑÞ

Following are some important properties of Nash equilibria involving the notion of thesupport of a strategy , which is the set of actions (or the set of vertices ) forWÐ Ñ 3B B /3

which . This might be called the “set of pure strategies in ” or the “set of actionsB !3 Bavailable to an -player”. Geometrically, it is the set of vertices on the smallest closedBface of containing . This closed face consists precisely of the convex combinations?8 Bof the vertices in . Notice that the interior of is the set of for whichWÐ ÑB B?8

WÐ Ñ œ "ß #ßá ß 8B e fThe condition means that whenever ; that is, contains onlyW © W C œ ! B œ !C B C3 3

actions contained in .B

Proposition 7.1. is a NE if and only ifB

/ B B B B

/ B B B B

3

3

† E œ † 3 − WÐ Ñ

† E Ÿ † 3 Â WÐ Ñ

E

E

for all (7.2.1)and for all . (7.2.2)

That is, if and only if every pure strategy in is also a best reply to , and the pureB Bstrategies not in are not better replies than itself.B B

Proof. If is a NE, then by definition for all . SinceB / B B B3 † E Ÿ † E 3

B / B B / B B Bœ B † œ B † †3−WÐBÑ 3−WÐBÑ

3 33 3, so . So is a weighted average ofE E E

numbers , none of which can exceed it. So they must all equal it./ B3 † E


Conversely, if satisfies (7.2.1) and (7.2.2), then for any we haveB C

C B / B B B B B B† E œ C † E Ÿ C † E œ † E3œ" 3œ"

8 8

3 33 , so is a NE.

Corollary 7.1.1. A strict Nash equilibrium must be one of the pure strategies /3Þ

Proof : If is a strict NE but not a pure strategy, then for allB / B B B3 † †E E3 † œ † 3 − WÐ Ñ, whereas we know for ./ B B B B3 E E

Proposition 7.2. If and is a linear combination of best replies to , then isC C B C− ?8

also a best reply to BÞ

Proof ,: If is a NE and are best replies to thenB C C B C B B B" 5 3ßá ß † E œ † E

for . If is in . then one can check that , and3 œ "ßá ß 5 œ - - œ "C C4œ" "

5 5

4 8 44 ?

so

C B C B B B B B† E œ - † E œ - † E œ † E" "

5 5

4 44 .

Note that although must equal 1 for to be in , it is not necessary that"

5

4 8- C ?

! Ÿ - Ÿ " 44 for all . For example,

# † $ † % † œ!Þ& !Þ% !Þ& !Þ#!Þ$ !Þ$ !Þ$ !Þ$!Þ# !Þ$ !Þ !Þ&

Ô × Ô × Ô × Ô ×Õ Ø Õ Ø Õ Ø Õ Ø2

.

Corollary 7.2.1. If is a NE and is a strategy with thenB C C BW © W ßC B B B C B Þ† œ †E E (i.e., is also a best reply to )

Proof. Such a strategy is a linear combination of the pure strategies in .B

Corollary 7.2.3. is a NE if and only ifB

/ B C B3

C3† œ † 3 B !ÞE Emax for every with (7.3)

Proof: That (7.3) holds for any NE is immediate from Proposition 7.1.Conversely, if (7.3) holds, then


B B / B C B C B† œ B † œ † B œ †E E E EB ! "

3 33

8

3

max maxC C

,

so is a NEB Þ


8. Evolutionarily stable strategies. Under Interpretation 2 above, if is a NashBequilibrium, then we might say that an -population is “resistant to invasion” in theBfollowing sense. If it is invaded by a relatively small group of individuals with adifferent mix , then, because , the invaders will not do better againstC B B C B† †E Ethe incumbents than the incumbents have been doing against each other.

Clearly, though, what is important is how the invaders and the incumbents will performin the whole augmented population, not just against the incumbents. Evolutionarystability formalizes this and turns out to be a considerable strengthening of the conditionsfor a Nash equilibrium.

Imagine an -population either a group of players all using mixed strategy , or aB Bgroup of pure-strategy players with mix . Suppose this population is invaded by a smallBC B-population, whose size is only times that of the -population. Then the expected%payoff to a -player in the enlarged population, whether is or or one of the pureD D B Cstrategies (or, for that matter, any other strategy that may wander in), is/3

D C B D C D B† Ð Ð" Ñ Ñ œ † Ð" Ñ † ÞE E E% % % %

We call an if, for any strategy , there is an B Cevolutionarily stable strategy (ESS) %Csuch that as long as this expected payoff is greater for than for That% % œ œ ÞC D B D Cis,

% % % % % %B C B B C C C B† Ð" Ñ † † Ð" Ñ † a ÞE E E E (8.1)C

In other words, is evolutionarily stable if, as long as the invading population is smallBenough, the incumbents will do better than the invaders in the enlarged population.

If the inequality in (8.1) holds but not strictly, then is called a B neutrally stable strategy(NSS).

Proposition 8.1. is an ESS (NSS) if and only if is a NE that satisfies in additionB B

If and , then ( ) (8.2)C B C B B B C C B CÁ † œ † † Ÿ † ÞE E E E

Proof. We can rewrite (8.1) as

Ð" ÑÐ † † Ñ Ð † † Ñ ! a ß% % % %B B C B B C C CE E E E C

and the result follows.

In words, an ESS is a NE such that, if is another best reply to , then is a betterB C B Breply to than itself. This does not look like much of a strengthening, but when weC Cconsider the replicator dynamics below, we will see that it is significant.

Corollary 8.1.1. An ESS is a NSS, and a NSS is a NE.


We will not prove the following useful proposition here; its proof is surprisinglytechnical.

Proposition 8.2. is an ESS (NSS) if and only ifB

B C C C C Á B† † RE E( ) for all in (8.3) B

where is some neighborhood of intersected with .RB B, ?8

The strict version of (8.3) is called . Proposition 8.2 shows in anotherlocal superioritylight how evolutionary stability strengthens the notion of Nash equilibrium.

Corollary 8.2.1. If is a NE and every neighborhood of contains at least one otherB BNE, then is not an ESS.B

Proof. If were an ESS, then by Proposition 8.2 it would have a neighborhoodBin which for every . But at least one of these is a NE, andB C C C C C† †E E for this we have , a contradiction.C C C B C† E † E

That is, an ESS must be an isolated NE. However, an NSS need not be, as we see belowin Section 13

The following is a restatement of Proposition 8.2.

Corollary 8.2.2. is an ESS (NSS) iff there exists a positive so thatB %B

% % %† E † E Ÿ !B ( ) , (8.4)

for all nonzero vectors with and and% œ ß ßá ß œ ! c d% % % % % %" # 8 3 BX

" "

8 8

3#

B −% ?8.

Proof. Vectors near in are vectors where is as described. It isC B B?8 % %easy to check that (8.3) is equivalent to (8.4).


9. Evolutionary game theory: replicator dynamics. Evolutionary stability is clearlydesigned to confer “invasion resistance” on an -population, but it is a static property,Bconcerning only the relative expected payoffs of various strategies in one play. Now wemove to a model of the evolution of a population, in order actually to see what happenswhen individuals playing different pure strategies come together repeatedly. The modelis a system of ordinary differential equations that yield what are called the replicatordynamics. They model evolution through natural selection, with payoffs regarded asevolutionary “fitness points” determining the number of offspring who inherit one's ownstrategy.

The idea is that we take a symmetric 2-player game with action set , and we"ß #ßá ß 8start at time with a population of pure-strategy players that has mix . The ! Ð!Ñ 3B >2

component is the proportion of the population who are : they always takeB Ð!Ñ3 3-playersaction .3

Prior to reproducing, individuals play this game many times against randomly-chosenother members of the population. Each individual gains “fitness points” proportional toher total payoff in these games. Then we decree that an individual's number of daughterswill be proportional to her number of fitness points, and her daughters will inherit thepure strategy that she plays. As time passes the mix changes, becoming at time .BÐ>Ñ >

In our motivating examples above, the game is the IPD, and the pure strategies are IPDstrategies such as TFT, TF2T, Pavlov, and STFT. The payoffs, as above, are theexpected winnings per round. These rounds are infinite but carry a discount factor , asAabove. We will assume that all individuals play about the same number of rounds, andthat this number is large enough to ensure that all pure strategies encounter each otherwith the appropriate frequencies. Thus a player's total winnings will be proportional toher expected winnings per round.

If is the mix at a given time, then the expected payoff ofB Bœ Ð>Ñ œ B Ð>Ñßá ß B Ð>Ñc d" 8X

an -player against a random member of the population is , which is just the 3 † 3/ B3 >2Eentry of the column vector EBÞ

Let denote the of -players in the population at time , and let: Ð>Ñ 3 >3 number

:Ð>Ñ œ : Ð>Ñ B Ð>Ñ œ 33œ"

8

3 3: Ð>Ñ:Ð>Ñ be the total population size. Then is the proportion of -3

players. (The notation can be confusing: is an vector, but is aBÐ>Ñ 8 ‚ " :Ð>Ñ œ : Ð>Ñ3

3

scalar.)

Discrete replicator dynamics. (This may be omitted.) Here we think of generationslabeled by Each individual reproduces just once in each generation> œ !ß "ß #ßá Þwhile she is alive. The fundamental assumption is

: Ð> "Ñ œ † Ð>Ñ : Ð>Ñß 3 œ "ßá ß 8Þ3 33ˆ ‰α / BE (9.1)

Here is a constant common to all members of the population, equal to a baselineα


fitness level minus a common death rate. Obviously, and the entries of need to beα Escaled so that the multipliers are not too far from 1, in order to avoidα † Ð>Ñ/ B3 Epopulation explosion or implosion.

We get a recursion for the proportions as follows. Summing (9.1) givesB Ð>Ñ3

:Ð> "Ñ œ † Ð>Ñ : Ð>Ñ

œ :Ð>Ñ :Ð>ÑB Ð>Ñ † Ð>Ñ

œ † Ð>Ñ :Ð>ÑÞ

ˆ ‰3œ"

83

3

3œ"

8

33

α

α

α

/ B

/ B

B B

E

E

E

(9.2)

Dividing (9.1) by (9.2), we get

B Ð> "Ñ œ Ð>Ñß 3 œ "ßá ß 8Þ † Ð>Ñ

Ð>Ñ † Ð>Ñ3

3α

α

/ B

B BB

E

E(9.3)

We see that a pure strategy that earns its players an above-average number of points perencounter will increase its population share, while one that earns less will decrease.The population will be in that whenstationaryß B BÐ> "Ñ œ Ð>Ñß/ B B B B3 † Ð>Ñ œ > † > 3 − WE E for all ; that is, when all players in the populationhave the same expected payoff. By Proposition 7.1, the population is stationary at anyNE, but not necessarily conversely.

Further analysis of the discrete replicator dynamics is not nearly so easy as for thecontinuous-time version, which we consider next.

Continuous replicator dynamics. Suppose the population is large enough that we canimagine reproduction taking place continuously in time. The fundamental assumption is

: œ † : ß 3 œ "ßá ß 8ß†3

33ˆ ‰α / BE (9.4)

where and denotes the time derivative . (Remember thatB Bœ Ð>Ñß : œ : Ð>Ñß : : Ð>Ñ†3 3 33

..>

:Ð>Ñ Ð>Ñis a scalar while is a vector.) The constant is common to all members of theB αpopulation, incorporating a baseline fitness level minus a death rate that is the same forall players. It may appear that and the entries of must be carefully scaled to avoidα Eexplosion or implosion of the population, but this turns out not to be crucial. Inparticular, we will see that disappears from the equations for the .α B Ð>Ñ3


Summing (9.4) over all gives3

: œ † :†

œ : Ð:B Ñ †

œ † :Þ

ˆ ‰3œ"

83

3

5œ"

8

33

α

α

α

/ B

/ B

B B

E

E

E

(9.5)

From this and (9.4) we get

B œ œ† . : : : : :

.> : :

† †

œ B: :† †

: :

œ † B † B Þ

33 33

#

33

33 3ˆ ‰α α/ B B BE E

That is,

(9.6)B œ B † † ß 3 œ "ß #ßá ß 8Þ†3 3

3/ B B BE E

The quantity in parentheses on the right is the difference between the expected payoff toan -player against a random member of the population and the expected payoff to one3random player in the population against another. Pure strategies whose players earnabove the population average will increase their population shares; those that earn lesswill shrink in number. If any pure strategy is not represented in the population ( ),B œ !3

it will stay unrepresented.

A mix is a of the replicator dynamics if for .B stationary point (SP) B œ ! 3 œ "ßá ß 8†3

Looking at (9.6), we see that this is equivalent to

/ B B B33† œ † 3 B !E E for all with . (9.7)

That is, the expected payoff of an -player is the same for all .3 3

Proposition 9.1. All vertices are stationary points.

This is immediate; (9.7) is obviously true if .B /œ 3

Proposition 9.2.a. If is a positive constant and denotes + +E Ew (the matrix obtained by multiplying

every entry of by )E + E E, then the replicator dynamics (9.6) with in place of w

have the same orbits, stationary points, and dynamic stability properties, and alsothe same NE, ESS, and NSS.


b. If is any real constant and denotes , where has ones in column - -I I 3E Ew3 3

and zeros elsewhere, then (9.6) for is identical to (9.6) for .E Ew

Proof (for the static properties NE, ESS, and NSS only). Part a is easy to see.

For b, we note that and so andI œ † I œ B

BBãB

3 3 3

3

3

3

3B / B

Ô ×Ö ÙÖ ÙÕ Ø

/ B33 3 4 3

4† I œ B B œ B Þ Thus

/ B B B œ / B B B / B / B

/ B B BÞ

3 w w 3 3 33 3

3 w w

† † † † -Ð † I † I Ñ

œ † †

E E E E

E E

Corollary 9.2.1. Subtracting any row of from all rows (leaving at least one row ofEzeros) does not change the replicator dynamics.

The following proposition assures us that all orbits of the replicator dynamics that start in? ?8 8 stay in .

Proposition 9.3. If , then for all B BÐ!Ñ − Ð>Ñ − > !Þ? ?8 8

Proof. Suppose . We need to show that for all andBÐ!Ñ − B Ð>Ñ ! 3?8 3

=Ð>Ñ œ " > =Ð>Ñ œ B Ð>Ñ 3 for all , where . Summing (9.6) over all we see that,"

8

3

as long as we haveBÐ>Ñ − ?8

= Ð>Ñ œ B Ð>Ñ † Ð>Ñ B Ð>Ñ † Ð>Ñ†

œ Ð>Ñ † Ð>Ñ Ð>Ñ † Ð>Ñ=Ð>Ñß

ˆ ‰ˆ ‰"

8

3 33/ B B B

B B B B

E E

E E

and as long as , and so BÐ>Ñ − =Ð>Ñ œ " =Ð>Ñ œ !Þ†?8

If any for some , then there is with and , by theB Ð>Ñ ! > > ! > > B Ð>Ñ œ !3 3

continuity of orbits of ODEs. But then (9.6) shows that , and so weB Ð>Ñ œ !†3

must have for all , contradicting .B Ð<Ñ œ ! < > B Ð>Ñ !3 3

This is easily generalized: if is on any face of (i.e. for some givenBÐ>Ñ B Ð>Ñ œ !?8 3

collection of indices ), then is on that face for all . This says in particular that3 Ð<Ñ < >Bif some pure strategies are not in , they cannot appear through evolution. That is,BÐ!Ñthere are no mutations.


Proposition 9.4. A Nash equilibrium is precisely a stationary point for which/ B B B B3 † Ÿ † 3 WÐ ÑÞE E for all not in

Proof. This is immediate from Proposition 7.1.

If is in the interior of , then all are in , and so we haveB B?8 3 WÐ Ñ

Corollary 9.4.1. If is in the interior of then is a Nash equilibrium if and only if itB B?8

satisfies (9.7). That is, all interior stationary points are Nash equilibria.

Thus we can amplify Corollary 8.1.1 as follows.

Proposition 9.5.

ESS NSS NE SP (and NE SP for interior points).Ê Ê Ê Í


10. Dynamic stability of stationary points. Now we are ready to examine the behaviorof the replicator dynamics. Because is compact and no orbit of the replicator?8

dynamics can go outside , all orbits will tend toward either a stationary point or a limit?8

cycle. As it happens, limit cycles do not arise in the examples we consider, so we willfocus on stationary points.

A stationary point is called (Lyapunov) if every neighborhood of containsB B! !stable Fa neighborhood of such that if then for all .F Ð!Ñ − F Ð>Ñ − F > !! !B! B B

B! is if it is Lyapunov stable and in addition there is aasymptotically stable neighborhood so that if then F Ð!Ñ − F Ð>Ñ œ Þ‡ ‡

>p∞B Blim B!

A stationary point that is not Lyapunov stable is called .unstable

Proposition 10.1. Any Lyapunov stable stationary point is a Nash equilibrium.

Proof. Suppose is a stationary point but not a NE. Then there exists some withB CC B B BÞ B / B B B† E † E † E œ † E 3Since is stationary, we know for all with3

B ! 3 B œ ! † E † E3 3, so there must be some with and Since bilinear/ B B BÞ3 functions are continuous, there is some neighborhood of and some constant Y !B $so that for . So for at least this , if is any point in/ C C C C C3 † E † E − ∩ Y 3? !

? ? $∩ Y C ! − ∩ Y C C† with , then for all on the orbit starting from and3 33C C!, therefore It follows that cannot be Lyapunov stable.C Ð>Ñ / Þ3

>$ B

The connections among all the important properties of mixed strategies areBsummarized in the following chains of implications, extending Proposition 9.5.

Table 10.1

evolutionarily Nashstable strategy equil.

neutrallystable strat.ß à

à ßÒ Ò

asymp. stablestat. point

Lyap. stable stat.stat. point point

The properties named in italics are defined in terms of the replicator dynamics, andaccordingly we will refer to them as “dynamic” stability properties of stationary points.The other properties are defined in terms of relative payoffs in a single play (or IPDround), in attempts to capture evolutionary behavior in a static manner. We will refer tothem as “static” stability properties.


We note also that none of the converse implications hold in general, nor does eitherimplication between neutrally stable strategies and asymptotically stable stationarypoints.

We will not prove these implications here, nor will we give counterexamples to the non-implications.

The key implications are the two from evolutionary and neutral stability to the twoforms of dynamic stability. Lyapunov's direct method is used to prove each of these,and the Lyapunov function is the Kullback-Leibler relative entropy measureL ÐCÑ œB

33 3 3B B ÎC Þlog

The implications in Table 10.1 enable us, at least in some cases, to determine nearly allthe dynamic properties of a mix by looking only at the static properties. Only ifB − ?8

we discover that a mix is neutrally stable but not evolutionarily stable would we need toBcheck further to determine whether it is asymptotically stable or only Lyapunov stable.

However, deterniming the static stability properties and then using Table 10.1 is notnecessarily the easiest way to investigate the dynamic stability properties. At least in thespecial cases we will consider below, it is quite easy to identify the stationary points andlook at the signs of the derivatives at nearby points, to determine their dynamicB†3stability properties without checking their static properties. The point is that knowing thestatic properties refines our understanding of the various population situations, beyondwhat we know from the dynamic properties alone.


11. Special case: 2x2 symmetric games. An arbitrary symmetric game has a# ‚ #

payoff matrix E œ Þ+ ++ +” •"" "#

#" ##As noted in Corollary 9.1.1, the dynamics will not

change if we subtract the first row from both rows. Hence we lose no generality byconsidering symmetric games with matrices of the form

E œ Þ! !+ ,” • (11.1)

The mixed strategies, viewed as population mixes i.e., vectors in are vectors of ?#

the form B œ ! Ÿ B Ÿ "B

" B” •, with , and the replicator dynamics are determined by

just one equation in the scalar variable , namelyB

B œ B † †† ˆ ‰/ B B B" E E . (11.2)

But since

EB œ!

+B ,Ð" BÑ” •,

(11.2) reads

B œ BÐ" BÑÐ+B ,Ð" BÑÑÞ† (11.3)

We will first find all stationary points; then we will determine conditions under whicheach stationary point is a NE, and finally we will determine conditions under which eachNE is a NSS or an ESS. As we will see, in this case no point is an NSS without being anESS, so there will be no open questions about asymptotic stability. Thus we can settle allthe questions of dynamic stability of a stationary point in the case by examining its# ‚ #static stability properties.

Stationary points. We see immediately from (11.3) that when and whenB œ ! B œ "†

B œ ! E; that is, the vertices and are stationary points. This is always true for any ,/ /" #

by Proposition 9.1.

In the trivial case , (11.3) reads and there are essentially no dynamics. It+ œ , œ ! B œ !†

is easy to see that all points are NSS but not ESS in this case. We will assume and are+ ,not both zero in what follows.

There is a third zero of when , namely . This will determine a vector inB + Á , B œ† ,,+

?# as long as or . This is equivalent to saying that either! , , + ! , , +


+ ! , + ! , or . In this case we denote the resulting stationary point by :B!

B! œ Þ"

, +

,+” • (11.4)

Summarizing: and are stationary points of the replicator dynamics for (11.1), and/ /1 #

if and are nonzero with opposite signs, given in (11.4) is also a stationary point.+ , B!

Nash equilibria. Since all NE are stationary points, we just check the stationary points tosee which are NE. By Proposition 9.4, a stationary point is a NE if and only if, inBaddition,

/ B B B B3 † E Ÿ † E 3 Â WÐ Ñ for all . (11.5)

Case 1: . Here (11.5) says . Note that is justB / / / / / / /œ † E Ÿ † E † E" # " " " 3 4

the entry in the row and column of . So is a NE if and only if 3 4 E + Ÿ !>2 >2 /" (and a strict NE if )+ ! Þ

Case 2: Now (11.5) says ; that is, . So is aB / / / / / /œ Þ † E Ÿ † E ! Ÿ ,# " # # # #

NE if (and a strict NE if ), ! , ! Þ

Case 3: and have opposite signs and By Corollary 9.2.1, any interior+ , œ ÞB B!

SP is a NE, so is a NE whenever and have opposite signs. ( cannot be aB B! + , !

strict NE; only vertices can, by Corollary 7.1.1.)

ESS and NSS. All NSS and ESS are NE, so we look at the three cases above to see whenthe points in question are NSS or ESS.

Case 1: and . In this case the vectors in that are in a+ Ÿ ! œB / C"#?

neighborhood of are vectors for small positive . For such ,/ C C" œ" ” •%% %

EC œ à!

+Ð" Ñ ,” •% %

so and / C C C" † œ ! † œ Ð+Ð" Ñ , Ñ œ + Ð, +Ñ ÞE E % % % % %c dSo is an ESS (NSS) if and only if/"

! + Ð, +Ñ( ) for small positive . (11.6)% % %c dIt follows that is an ESS if . If then (11.6) reads ( ) 0 for/" #+ ! + œ !ß , Ÿ%small positive ; since we are assuming when this requires .% , Á ! + œ !ß , !


Summarizing Case 1: is an ESS if and only if either (and is arbitrary), or/" + ! ,+ œ ! , ! + !ß and . If is not even a NE./"

Case 2: and Neighborhoods of in consist of vectors, ! œ ÞB / /# ##?

C Cœ" ” •%

%% for small positive . For such ,

EC œ!

+ ,Ð" Ñ” •% %;

so and . So is an ESS/ C œ C C /# #† + ,Ð" Ñ † œ Ð" Ñ + ,Ð" ÑE E% % % % %(NSS) if and only if

+ ,Ð" Ñ Ð" Ñ + ,Ð" Ñ% % % % % %( ) for small positive . (11.7)

This is equivalent to ( ) for small positive .+ ,Ð" Ñ !% % %

It follows that is an ESS if . If , then (11.7) reads ( )/# , ! , œ ! + Ð" Ñ % %! + Á ! , œ ! for small positive . Since we are assuming when , this requires%+ !.

Summarizing Case 2: is an ESS if and only if either (and is arbitrary),/# , ! +or and If , is not even a NE., œ ! + !Þ , ! /#

Case 3: and are nonzero with opposite signs, and .+ , œ œ,+

B B0"

,+” •Neighborhoods of in consist of vectors for small positiveB! ?

%%#

",+C œ

, + ” •

or negative . For such ,% C

E E EC œ œ ! œ"

, + , + + , ! !

B! ” • ” • ” •%% %

%,

so and So is an ESS (NSS) if and only ifB B! !† œ † œ ÞE EC C C+,+ ,+

Ð+ Ñ% % %

+ + ,+ ,+% % % ( ) for small ; that is,#

%

%%

#

, + Ÿ !( ) for all small positive and negative .

Since we are assuming and are nonzero with opposite signs, this requires+ ,, + ! , ! + Þ, i.e. , and implies strict inequality

Summarizing Case 3: If then is an ESS. If then is a, ! + + ! ,B B! !

NE but not an NSS. In other cases there are no interior SP.

We can put together all the results of this section in the following table. In each case theassertion (SP, NE, NSS, ESS) about each point is the strongest we can make. By Table


10.1, points listed as “NE” or “SP” are unstable stationary points, and those listed as

“ESS” are asymptotically stable. Recall that B! ",+œ

,+” •.

Table 11.1case orbit picture

NE ESS not in ESS SP not in SP ESS not in

ESS NE not in SP ESS not i

/ /" #

#

#

#

#

B!

+ œ !ß , !+ œ !ß , !+ !ß , œ !+ !ß , œ !+ !ß , !

????

ABAB

n ESS SP not in SP SP ESS

ESS ESS NE

??

#

#

ABCD

+ !ß , !

+ ! ,+ ! ,

With regard to the orbits, the above table shows how the situation depends on the valuesof and . Note that we are excluding the trivial case .+ , + œ , œ !

+ !ß , !

+ Ÿ !ß , Ÿ !

+ !ß , !

+ !ß , !

None of this is surprising, of course. We note only that the four orbit pictures, whichcapture the dynamic properties fully, do not reveal the eight different situations withregard to the static properties.


12. Some particular pairs of strategies: Here we look at a few pairs of strategies forthe IPD, with various values of the discount factor .A

Example 1: TFT vs. STFT. From (5.1) above we get the payoffs

TFT STFTTFT

STFT

V WXA"A "AXWA T"A "A

#

#

We multiply the matrix by and then subtract the first row from both rows to get" A#

E œ ÞVÐ" AÑ W XA ! !X WA TÐ" AÑ ÐX VÑ AÐW VÑ ÐT WÑ AÐT XÑ” • ” •Ò

For the Axelrod numbers this becomesV œ $ß W œ !ß X œ &ß T œ "

” • ” •! ! ! !+ , # $A " %A

œ ,

and we have the following cases, depending on the value of .A

A

A Ÿ + !ß , !

A , ! +

A + Ÿ !ß , !

case outcomes is ESS; TFT dies out

is ESS: is stable proportion of TFTis ESS; ST

"%

#

" # %A"% $ A"

#$

"

/

/

B0 FT dies out

This is intuitively plausible. Small values of correspond to shorter games, in which theAinitial defection of STFT gives it an advantage. Larger values of nullify thisAadvantage, and it is overcome by the advantage TFT players get by cooperating with eachother on every play.

Example 2: TF2T vs. STFT. From (5.1) we getTF2T STFT

TF2TSTFT

V VA"A "A

VA T"A "A

W

X

and so, multiplying by a positive constant and subtracting the first row from both rows,we get

E œV W AÐV WÑ

X AÐV XÑ T

! !ÐX VÑÐ" AÑ ÐT WÑ AÐV WÑ

Þ

– — ” •” •

V VA"A "A

VA T"A "A

W

X Ò

Ò


With the Axelrod numbers, this becomes

” • ” •! ! ! !+ , # #A " $A

œ ,

and we have the following cases, depending on the value of .A

A

A Ÿ + !ß , !

A + !ß , !

case outcomes is ESS; TF2T dies out

is ESS; is stable proportion of TF2T

"$

#

" $A"$ A"

!

/

B

Example 3: ALC (Always cooperate) ALD (Always Defect). Here the payoff table isvs.

ALC ALDALCALD

V W"A "AX T

"A "A

and so the matrix is

E œV W ! !X T X V T W” • ” •Ò .

The dynamics do not depend on , even if the total payoffs do. Because andA X VT W + !ß , ! in any PD, we are in the case , and so ALC dies/# is an ESS and out.


13. A class of three-strategy populations. Here we consider a special class ofsymmetric 3 3 games, with payoff matrices of the form‚

Ô ×Õ Ø< = >< = ?@ A B

Þ (13.1)

We will denote the three pure strategies simply by 1, 2, and 3. Notice that if no 3-playersare present, neither 1 nor 2 has an advantage; both have the same expected payoff. Anydifference in their expected payoffs comes from their encounters with 3-players.

This is motivated by the population dynamics of the three IPD strategies TFT,TF2T, and STFT, whose payoff matrix is given in (5.1) above:

E œ ÞW

X

Ô ×Ö ÙÕ Ø

V V WXA"A "A "AV V VA


#

#

The paper of Boyd and Lorberbaum [1987] drew attention to this combination ofthree strategies. The success of TFT in tournaments like Axelrod's had led tothe suggestion that a combination of cooperative strategies like TFT and TF2Twould be evolutionarily stable against invasion by any other strategies. Boydand Lorberbaum showed that TFT by itself is resistant to invasion by STFT, butthat a mix of TFT and TF2T can be invaded by a small infusion of STFT, and inthe end TFT may die out.

Notice that the payoff matrix for these three strategies specializes (13.1) furtherin that here .< œ =

In this section we will study the replicator dynamics of three-strategy populations withpayoff matrix of the form (13.1). In the next section we will specialize to the three IPDstrategies mentioned above.

If we subtract the first row of (13.1) from all rows, we get a matrix of the form

E œ! ! !! ! +, - .

Ô ×Õ Ø (13.2)

which produces the same replicator dynamics and the same static equilibrium properties.The equations (9.6) for this areE

B œ B † E † E ß 3 œ "ß #ß $Þ†3 3

3ˆ ‰/ B B B


We have

and . (13.3)E œ † E œ B +B ,B -B .B!

+B,B -B .B

B B BÔ ×Õ Ø$

" # $

$ # " # $

So stationary points must satisfy

! œ B œ B B +B ,B -B .B†

! œ B œ B B + +B ,B -B .B†

! œ B œ B +B B " B ,B -B .B†

" " $ # " # $

# # $ # " # $

$ $ # $ $ " # $

c dc dc d (13.4.2) (13

(13.4.1)

.4.3)

We will look at a few classes of points in and place them on the chain of implications?$

in Table 10.1. We will use the following facts.

B / B B B B is a SP iff (13.4) holds (or, equivalently, for all ).3 † E œ † 3 − WÐ ÑEB / B B B B is a NE iff in addition for all (Proposition 7.1).3 † E Ÿ † 3 Â WÐ ÑEB B Ñis an ESS (NSS) iff ( ) for all sufficiently small vectors for% % %† EÐ Ÿ !

which (Corollary 8.2.2).B −% ?$

Any interior SP is a NE.Non-isolated SP cannot be asymptotically stable, and thus cannot be ESS. (However,

they may be Lyapunov stable, and in addition they may be NSS.)

We note also that a SP that is not a NE must be an unstable SP.

There are three classes of points in : the vertices, the three open faces (which we refer?$

to as the open “1-2 face” and so on), and the interior.

13a. The vertices. We know that all three vertices are stationary points, by Proposition9.1. We look at them in turn to see which are NE, NSS, and ESS.

The vertex /" 3 " " ". This point is a NE if and only if for and ./ / / /† E Ÿ † 3 œ # 3 œ $EWe have and , so is a NE if and only if ./ / / / / / /" " # " $ " "† œ † œ ! † œ , , Ÿ !E E E

The vertex cannot be an ESS, because (as we note below) all points on the closed 1-2/"

face are stationary points, and and ESS must be an isolated stationary point.

/ /" "# $ # $ # $

Xis an NSS if and only if for all with and % % %† EÐ Ñ Ÿ ! œ ß ßc d% % % % % %nonnegative, not both zero, and sufficiently small. For such an ,%

% %† EÐ Ñ œ , Ð+ , -Ñ Ð. ,Ñ/"$ # $% % %c d.

This is certainly nonpositive when , so is an NSS if and only if%$"œ ! /

, Ð+ , -Ñ Ð. ,Ñ Ÿ !% % % %# $ # $ for small nonnegative and . This is certainly true


if . If , then is an NSS iff for such ; that is, if, ! , œ ! Ð+ -Ñ . Ÿ !/"# $% % %

+ - Ÿ ! . Ÿ ! and .

Summarizing: is/"

always a stationary point;a NE iff ;, Ÿ !an NSS iff either , or and and ;, ! , œ ! + - Ÿ ! . Ÿ !never an ESS.

The vertex /# #. A similar argument shows that is/always a stationary point;a NE iff ;+ - Ÿ !an NSS iff either , or and and ;+ - ! + - œ ! , Ÿ ! . Ÿ !never an ESS.

The vertex /$ 3 $ $ $. This vertex is a NE iff for and ; that is,/ / / /† E Ÿ † 3 œ " 3 œ #Eiff and .! Ÿ . + Ÿ .

/ / Ñ$ $# $ # $ $

X is an ESS (NSS) iff ( ) for all with % % %† EÐ Ÿ ! œ ß ßc d% % % % %small and positive and . If we write for ( ), then one can! Ÿ Ÿ ! Ÿ Ÿ "% % )% % )# $ $ #

check that for such ,%

% %† EÐ œ Ð+ .Ñ + , - . , Þ/ Ñ$ #$ $% ) % )c d

So is an NSS iff for all , and, if for some (which/$ + . Ÿ ! − Ò!ß "Ó + . œ !) ) ) )would have to be or ), then for that . The first) ) ) )œ ! œ " + , - . , !condition is equivalent to and , i.e. to and This just says+ . Ÿ ! . Ÿ ! . ! . +Þthat is a NE. The second condition says in addition that if then (i.e./$ . œ ! . , !, Ÿ ! . œ + + - . ! - Ÿ !), and if then (i.e. ).

Thus is an NSS iff is a NE and in addition if and if / /$ $ , Ÿ ! . œ ! - Ÿ ! . œ +Þ

Similarly, is an ESS iff/$

+ . ! − Ò!ß "Óß . ! . +ß

. œ ! + . , !ß . œ !ß + !ß , !ß

. œ + ! + . - ! . œ + ! - !Þ

) ) for all i.e. and or and i.e. and or and , i.e. and

Summarizing: is/$

always a stationary point;a NE iff and . ! . +ßan NSS iff in addition implies and implies . œ ! , Ÿ ! . œ + - Ÿ !ßan ESS iff in addition implies and implies . œ ! , ! . œ + - !Þ

One dynamic stability question is left open by the above. will be an NSS without/$

being an ESS in case or . In such cases we still need to, œ . œ ! + - œ . œ + !


determine whether is asymptotically stable or merely Lyapunov stable. These cases/$

are exemplified, respectively, by

E œ E œ Þ! ! ! ! ! !! ! + ! ! ! + !! - ! , + +

Ô × Ô ×Õ Ø Õ Ø and

13b. The interior. If and are all required to be nonzero, then the threeB ß B ß B" # $

equations in (13.4) become

,B + - B .B œ !

,B + - B .B œ +

,B + - B .B œ Þ,B -B .B

B

" # $

" # $

" # $" # $

$

So there will be interior stationary points if and only if and the plane defined by+ œ !,B -B .B œ !" # $ $passes through the interior of . This happens if and only if the?

vector that is normal to this plane is in neither the closed positive octant nor thec d,ß -ß . X

closed negative octant.

Thus there will be interior stationary points if and only if and , , and are all+ œ ! , - .nonzero and not all of the same sign. Moreover, in this case there will be a line segmentof interior stationary points, which cannot be ESS, but as noted above must be NE.

Which of these points are NSS? Such a point is an NSS if and only ifB B% %† EÐ Ñ Ÿ !

for all sufficiently small nonzero for which Such % % %œ ß ß − Þc d% % % % ?# $ # $ $X B

may have positive or negative components. Notice that when and+ œ !

,B -B .B E œ !ß !ß !" # $X, we have . So an interior stationary point is a NSS ifB Bc d

and only if for all such . Thus either all are NSS or none are. When ,% % %† E Ÿ ! + œ !

% %† E - , Ð. ,Ñœ % % %# $ $#,

and in order that this be nonpositive for all small positive and negative and , we% %# $

require and .- œ , . , Ÿ !

Summarizing:There are interior stationary points if and only if and are all nonzero+ œ ! ,ß -ß .

and not all of the same sign. In this case there is a line segment of interiorstationary points, namely those in satisfyingB œ B ß B ß Bc d" # $ $

X ?,B -B .B œ !" # $ .

These points are all NE. They are all NSS if and only if ., œ - Ÿ .There are no ESS in the interior of .?$


13c. The open 1-3 face. This is the set of vectors withB œ B ß !ß Bc d" $X

! B œ " B " ! œ !" $ . For such vectors, (13.4.2) becomes and the other twoequations in (13.4) both reduce to We see that if , then all, Ð. ,ÑB œ !Þ , œ . œ !$

points on the open 1-3 face are SP. If , there will be none. If , there will, œ . Á ! , Á .be one SP, with , if and none if or .B œ ! " Ÿ ! "$

, , ,,. ,. ,.

Notice that if and only if or ; that is, if and! " ! , , . ! , , .,,.

only if and are nonzero with opposite signs., .

Which of the SP are NE? is a NE iff ; that is, ifB / B B Bœ " B ß !ß B † E Ÿ † Ec d$ $X #

+B Ÿ B Ð, . , B Ñ + Ÿ , Ð. ,ÑB Þ$ $ $ $; that is,

In case , when all points are stationary, they will all be NE iff In case , œ . œ ! + Ÿ !Þ ,

and are nonzero with opposite signs, so that s a stationary point,. œ ß !ß 3B"$ . ,,. ,.

X‘this point will be a NE iff . So any SP on the open 1-3 face are NE iff .+ Ÿ ! + Ÿ !

One of these is an ESS (NSS) if and only if ( ) for all sufficiently% %† EÐ Ñ Ÿ !B

small nonzero with and positive or negative.% œ c d ß ß !% % % % % %# $ # $ # $X

Case 1: If and and are nonzero with opposite signs, then + Ÿ ! , . œ ß !ßB"$ . ,,. ,.

X‘is a NE. In this case

% %† EÐ Ñ œ + + , - . ,,

, .B1$

# $ # $% % % %c d.The first term is a positive multiple of , so if , then for small+ + ! † EÐ Ñ !% %B1$

% % %, and thus is an ESS. If , then , andB B"$ $$ # $+ œ ! † EÐ Ñ œ - , . ,1 % % %c d

unless this will take positive values in any neighborhood of , so is not an, œ - œ . B BNSS.

Case 2: Suppose , so that all points in the open 1-3 face are NE. None of+ Ÿ ! œ , œ .these can be an ESS because they are not isolated SP and therefore cannot beasymptotically stable. One can check that in this case

% %† EÐ Ñ œ +B Ð+ -ÑB % %# $ $ ,

and since , is a NSS iff%# ! B

+B Ð+ -Ñ Ÿ !$ $%

for as described. Since , we see that if then this will indeed hold for small% B ! + !$

% % % %# $ $ $ ! + œ ! - Ÿ ! and . But if then we require for all small positive or negative .Thus if we must also have + œ ! - œ !Þ

So in case 2 all points on the open 1-3 face are NSS if or if ; otherwise+ ! + œ - œ !none are.


Summarizing:There are no SP on the open 1-3 face unless either (1) and are nonzero with, .

opposite signs or (2) ., œ . œ !(1) If and are nonzero with opposite signs, then there is exactly one SP on the 1-3, .

face, namely is a NE iff in addition , and an ESSB B"$ "$. ,,. ,.

Xœ ß !ß Þ + Ÿ !‘

if . It is never an NSS without being an ESS.+ !(2) If , then all on the 1-3 face are SP. They are NE iff in addition , œ . œ ! + Ÿ !B

and NSS iff . They are never ESS.+ !

13d. The open 2-3 face. Vectors on this face can be handled usingB œ !ß " B ß Bc d$ $X

the results of 13c above, by converting

E œ E œ! ! ! ! ! +! ! + ! ! !, - . , - . +

Ô × Ô ×Õ Ø Õ Ø to ,

and exchanging strategies 1 and 2 to produce

E œ Þ! ! !! ! +- , . +

Ô ×Õ Ø

The vector becomes . Translating the above summary merely requiresB c d" B ß !ß B$ $

substituting for and for , and exchanging and .+ + . + . , -

The summary is as follows.There are no SP on the 2-3 face unless either (1) and are nonzero with- . +

opposite signs or (2) - œ . + œ !Þ(1) If and are nonzero with opposite signs, then there is exactly one SP on- . +

the 2-3 face, namely is a NE iff in addition ,B B#$ #$+. -+-. +-.

Xœ !ß ß Þ + !‘

and an ESS if . It is never a NSS without being an ESS.+ !(2) If , then all on the 2-3 face are SP. They are NE iff in addition- œ . + œ ! B

+ ! + ! and NSS if . They are never ESS.

13e. The open 1-2 face. For , the equations of (13.4) all reduce toB œ B ß " B ß !c d" "X

! œ !, and so every point on the open 1-2 face is always a stationary point. (This makesintuitive sense, since neither a 1-player nor a 2-player can distinguish between a 1-playerand a 2-player.)

Such an is a NE iff , and one can check that this inequality saysB / B B B$ † E Ÿ † E,B -Ð" B Ñ Ÿ ! Ð- ,ÑB -" " "; that is, . There are several cases.

If , then iff - , Ð- ,ÑB - B Þ" "-

-,

If , then , so is a NE iff - ! , ! " B Þ- --, -,"B

If , then , so no such is a NE.- , ! "--, B


If , then , so all such are NE.! - , Ÿ !--, B

If , then iff - , - , B - B Ÿ Þ" "-

-,

If , then , so is a NE iff - ! , ! " B Ÿ Þ- --, -,"B

If , then , so all such are NE.- , Ÿ ! "--, B

If , then , so no such is a NE.! Ÿ - , Ÿ !--, B

If , then iff .- œ , - , B - - Ÿ !"

If , then all such are NE.- œ , Ÿ ! BIf , then no such is a NE.- œ , ! B

We can rearrange the above into just four cases as follows. For withB œ B ß !ß " Bc d" "X

! B "" :

If and , then all such are NE., Ÿ ! - Ÿ ! BIf and (but excluding , then no such is a NE.., ! - ! , œ - œ !Ñ BIf , then all such with is a NE., ! - B B "

--,

If , then all such with is a NE.- ! , B ŸB "-

-,

No NE on the open 1-2 face can be an ESS, because if there are any there will be an openline segment of them, and they will not be isolated NE. An on the open 1-2 face willBbe an NSS if and only if for all sufficiently small nonzero% %† EÐ Ñ Ÿ !B

% œ c d ß ß !% % % % % %# $ # $ $ #X with and positive or negative. One can check that this is

equivalent to

- , - B + , - . , Ÿ !" # $% %

for small and with . We see that is a NSS iff ; that is, iff% % %# $ # " ! - , - B !B- , B -" . So we can expand on the four cases above as follows.

All with are SP.B œ B ß " B ß ! ! B "c d" " "X

If and (but excluding ), then no such is a NE., ! - ! , œ - œ ! BIf and , then all such are NE. All such are NSS iff in addition, Ÿ ! - Ÿ ! B B

, œ - ! , - Ÿ ! or .If , then is a NE iff and a NSS iff in addition ., ! - B B B " "

- --, -,

If , then is a NE iff and a NSS iff in addition .- ! , B Ÿ B B " "- -

-, -,

As a partial summary of this section, we note the following “watershed” relations among+ß ,ß -ß .and .

for / À"

, œ ! , œ !ß + - œ !ß . œ !

for / À#

+ - œ ! + - œ !ß , œ !ß . œ !


for / À$

. œ !ß . œ +. œ !ß , œ !. œ +ß - œ !

for interior À+ œ !ß ,ß -ß . nonzero and not all same sign

for 1-2 face À, œ - œ !


14. The case of TFT, TF2T, and STFT.

Now we apply the above to the particular case of the three strategies TFT (1), TF2T (2),and STFT (3). The matrix for the replicator dynamics is given by (5.1), but here weE

begin by simplifying the original PD payoff table ” •V WX T

W. If we subtract from all

entries, and then divide by the positive number , we getV W

” •" !" ZY

Y where and .œ Z œXV TWVW VW

The conditions andW T V X V ÐW XÑ and are equivalent to "# ! Z "

! "ÞY

We can interpret and as the gain to be had by defecting (i.e. the regret of aY Zcooperator) when the opponent cooperates and defects, respectively, as fractions ofV Wß the cost of the opponent's defection to a cooperator.

In the special case of the Axelrod numbers we haveV œ $ß W œ !ß X œ &ß T œ "ßY œ Z œ Þ# "

$ $ and As we will see, this is a somewhat special situation becauseY Z œ ".

Now we drop the original values and use Then theV œ "ß W œ !ß X œ " Yß T œ Z Þmatrix given by (5.1),

Ô ×Ö ÙÕ Ø

V V WXA"A "A "AV V VA


#

#

W

X

Þ

becomes

Ô ×Ö ÙÕ Ø

" ""A "A "A

Ð" ÑA

" " A"A "A "A" " Z"A "A "A

Y

Y

#

#

Þ

Y

We multiply all entries by the positive number , obtaining" A#

Ô ×Õ Ø" A " A Ð" ÑA" A " A AÐ" AÑ

" " A Ð" A Ñ Z Ð" AÑ

Y

Y Y #,

and then subtract the first row from all rows, producing


Ô ×Õ Ø

! ! !! ! AÐ AÑ

A Ð" A Ñ Z AÐ Z "Ñ

Y

Y Y Y#.

With the Axelrod numbers and , this becomesY œ Z œ# "$ $

Ô ×Ö ÙÕ Øˆ ‰! ! !

! ! A A

A " A A

#$

# # " %$ $ $ $

#

,

and finally we multiply by 3 to get

E œ œ Þ! ! !! ! +, - .

! ! !! ! A # $A

# $A # " A " %A

Ô × Ô ×Õ Ø Õ Ø#

The locations of the four parameters with respect to each other and zero, as determinedby the value of , are given in the following graph and table.A

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1-3

-2.5

-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

a

b

c

d

The vertical lines mark the four values of at which the relative positions of the fourA

parameters and zero change: and . Thus there are nine separateA œ ß ß ß" " #% $ $ &

" ""Ècases: each of these four values and the five intervals between and around them.However, for the three strategies here, and the particular values of and , there areY Zonly six different dynamical situations, as shown in the following table.

parameters vertices 1-2 face 1-3 face 2-3 face interior # values of =A +ß ,ß -ß . / / / B B /" # $

" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A

SP SP SP none2 SP SP SP none3 SP SP SP SP none4 SP SP SP SP SP none5

ESSNE

ESSESS

? ?

? ?

?

œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NENSS NSS ESS

SP SP SP6 SP SP SP none

?

?

Following are pictures showing the orbits of the dynamics in each of these cases.



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A


ESS ? ?

? ?

?

NEESSESS

œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NENSS NSS ESS


?

?

Case 1:



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A

SP SP SP noneSP SP SP none

3 SP SP SP SP none4 SP SP SP SP SP none5

ESS

ESSESS

? ?

? ?

?

2 NE

œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NENSS NSS ESS


?

?

Case 2:



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A

SP SP SP none2 SP SP SP none

SP SP SP SP none4 SP SP SP SP SP none5

ESSNE

ESS

? ?

? ?

?3 ESS

œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NENSS NSS ESS


?

?

Case 3:



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A

SP SP SP none2 SP SP SP none3 SP SP SP SP none

SP SP SP SP SP none5

ESSNE

ESS

? ?

? ?

?

4 ESSœ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NENSS NSS ESS


?

?

Case 4:



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A

SP SP SP none2 SP SP SP none3 SP SP SP SP none4 SP SP SP SP SP none

ESSNE

ESSESS

? ?

? ?

?

5 œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NESP SP SP6 SP SP SP none

?

?NSS NSS ESS

Case 5:



" #, +.

,- +.-B Ÿ B

! A

" ", ,

,- ,."= to 2-3 face

1 "%

"%

" "% $" #$ $

+ ! . , - Â Â

A œ + ! œ . , - Â Â

A Ÿ + . ! , - Â

A . + ! , -

A


ESSNE

ESSESS

? ?

? ?

?

œ . ! œ + œ , - Â

A " . , ! +ß - Â

#$

#$

NE NE NESP SP SPSP SP SP none

?

?6 NSS NSS ESS

Case 6:


15. References.

Axelrod, Robert: The Evolution of Cooperation. Basic Books, 1984.

Axelrod, Robert: Princeton University Press, 1997.The Complexity of Cooperation.

Boyd, Robert and Lorberbaum, Jeffrey P.: No pure strategy is evolutionarily stable in therepeated Prisoner's Dilemma game, Nature, vol. 327 (7 May 1987), pp. 58-9.

Hardin, Garrett: The Tragedy of the Commons. , vol. 162 (1968), pp. 1243-1248.Science

Hofbauer, J. and Sigmund, K.: Evolutionary Games and Population Dynamics.Cambridge University Press, 1998.

Osborne, Martin J.: . Oxford University Press, 2004An Introduction to Game Theory( ).an undergraduate-level text

Osborne, Martin J.and Rubinstein, Ariel: . MIT Press, 1994 (A Course in Game Theory aslightly higher-level introductory text).

Poundstone, William: . Anchor Books, 1993.Prisoner's Dilemma

Weibull, Jörgen W.: MIT Press, 1997.Evolutionary Game Theory.

A website of interest: <http://www.prisoners-dilemma.com>

Evolutionary Game Theory and the Prisoner's …stat-or.unc.edu/files/2016/04/TR_08_08.pdfEGT and the...

Documents

Transcript of Evolutionary Game Theory and the Prisoner's …stat-or.unc.edu/files/2016/04/TR_08_08.pdfEGT and the...