
A Constructive Study of Markov Equilibria in

Stochastic Games with Strategic Complementarities∗

Lukasz Balbus† Kevin Reffett‡ Lukasz Wozny§

December 2011

Abstract

We study a class of discounted infinite horizon stochastic games with strategic complementarities. We first characterize the set of all Markovian equilibrium values by developing a new Abreu, Pearce, and Stacchetti (1990) type procedure (a monotone method in function spaces under set inclusion orders). Secondly, using monotone operators on the space of values and strategies (under pointwise orders), we prove existence of a stationary Markov Nash equilibrium via constructive methods. In addition, we provide monotone comparative statics results for ordered perturbations of the space of stochastic games. Under slightly stronger assumptions, we prove the stationary Markov Nash equilibrium values form a complete lattice, with least and greatest equilibrium value functions being the uniform limit of successive approximations from pointwise lower and upper bounds. We finally discuss the relationship between the two monotone methods presented in the paper.

1 Introduction and related literature

Since the class of discounted infinite horizon stochastic games was introduced by Shapley (1953), and subsequently extended to more general n-player settings (e.g., Fink (1964)), the question of existence and characterization of (stationary) Markov Nash equilibrium (henceforth, (S)MNE) has been the object of extensive study in game theory.1 Further, and perhaps more central to the motivation of this paper, in recent times stochastic games have also become a fundamental tool for studying dynamic economic models where agents in the economy possess some form of limited commitment. Examples of such limited commitment frictions are extensive, as they arise naturally in diverse fields of economics, including work in: (i) dynamic political economy (e.g., see Lagunoff (2009), and references contained within), (ii) dynamic search with learning (e.g., see Curtat (1996) and Amir (2005)), (iii) equilibrium models of stochastic growth without commitment (e.g., Majumdar and Sundaram (1991), Dutta and Sundaram (1992), Amir (1996) or Balbus and Nowak (2004, 2008) for examples of the classic fish war problem or stochastic altruistic growth models with limited commitment between successor generations), (iv) dynamic oligopoly models (see Rosenthal (1982), Cabral and Riordan (1994) or Pakes and Ericson (1998)), (v) dynamic negotiations with status quo (see Duggan and Kalandrakis (2007)), (vi) international lending and sovereign debt (Atkeson, 1991), (vii) optimal Ramsey taxation (Phelan and Stacchetti, 2001), (viii) models of savings and asset prices with hyperbolic discounting (Harris and Laibson, 2001), among others.2

∗We thank Rabah Amir, Andy Atkeson, Eric Maskin, Andrzej Nowak, Frank Page, Adrian Peralta-Alva, Ed Prescott, Manuel Santos and Jan Werner, as well as participants of the 10th Society for the Advancement in Economic Theory Conference in Singapore 2010 and the 2011 NSF/NBER/CEME conference in Mathematical Economics and General Equilibrium Theory in Iowa, for helpful discussions during the writing of this paper.

†Institute of Mathematics, Wroclaw University of Technology, 50-370 Wroclaw, Poland.

‡Department of Economics, Arizona State University, USA.

§Department of Theoretical and Applied Economics, Warsaw School of Economics, Warsaw, Poland. Address: al. Niepodleglosci 162, 02-554 Warszawa, Poland. E-mail: [email protected].

1For example, see Raghavan, Ferguson, Parthasarathy, and Vrieze (1991) or Neyman and Sorin (2003) for an extensive survey of results, along with references.

Additionally, in the literature pertaining to economic applications of stochastic games, the central concerns have been broader than the mere question of weakening conditions for the existence of subgame perfect or Markovian equilibrium. Rather, researchers have become progressively more concerned with characterizing the properties of computational implementations, so they can study the quantitative (as well as qualitative) properties of particular subclasses of subgame perfect equilibrium. For example, often one seeks to simulate approximate equilibria in order to assess the quantitative importance of dynamic/time consistency problems (as is done, for example, for calibration exercises in applied macroeconomics). In other cases, one seeks to estimate the deep structural parameters of the game (as, for instance, in the recent work in empirical industrial organization). In either situation, one needs to relate theory to numerical implementation, which requires both (i) sharp characterizations of the set of SMNE being computed, and (ii) constructive fixed point methods that can be tied directly to approximation schemes. Of course, for finite games3, the question of existence and computation of SMNE has been essentially resolved.4 Unfortunately, for infinite games, although the equilibrium existence question has received a great deal of attention, rigorous characterizations of the set of SMNE are still needed (e.g., to obtain results on the accuracy of approximation methods). Similarly, we still lack results establishing classes of robust equilibrium comparative statics on the space of games, which are needed to develop a collection of computable equilibrium comparative statics5.

2Also, for an excellent survey including other economic applications of stochastic games, please see Amir (2003).

3By "finite game" we mean a dynamic/stochastic game with (a) a finite horizon and (b) finite state and strategy spaces. By an "infinite game", we mean a game where either (c) the horizon is countable (but not finite), or (d) the action/state spaces are a continuum. We shall focus on stochastic games where both (c) and (d) are present.

4For example, relative to existence, see Federgruen (1978) and Rieder (1979) (or, more recently, Chakrabarti (1999)); for computation of equilibrium, see Herings and Peeters (2004); and, finally, for estimation of deep parameters, see Pakes and Pollard (1989), or, more recently, Aguirregabiria and Mira (2007), Pesendorfer and Schmidt-Dengler (2008), Pakes, Ostrovsky, and Berry (2007) and Doraszelski and Satterthwaite (2010).

The aim of this paper is to address all of these issues in a single unified methodological approach, that being monotone methods. We study these questions with a constructive monotone approach, where our notions of monotonicity include both set inclusion and pointwise partial orders (on spaces of values or pure strategies). Specifically, we study existence, computation, and equilibrium comparative statics relative to the set of (S)MNE for an important class of stochastic games, namely those with (within period) strategic complementarities6 and positive externalities (as analyzed by Amir, Curtat or Nowak).

We first prove existence of a Markovian NE via strategic dynamic programming methods similar to those proposed in the seminal work of Abreu-Pearce-Stacchetti (henceforth, APS).7 We refer to this as a more "indirect" method, as we focus exclusively on equilibrium values (rather than characterizing the set of strategies that implement those equilibrium values). Our method differs from those of the traditional APS literature in at least two directions. Perhaps most importantly, we study the existence of short memory or Markovian equilibria, as opposed to broad classes of sequential subgame perfect equilibria.8 Additionally, our strategic dynamic programming method works directly in function spaces (as opposed to spaces of correspondences), where compatible partial orders and order topologies can be developed to provide a unified framework to study the convergence and continuity of iterative procedures. The latter methodological issue is key (e.g., for uncountable state spaces), as it implies we can avoid many of the

5There are exceptions to this remark. For example, in Balbus and Nowak (2004), atruncation argument for constructing a SMNE in symmetric games of capital accumulationis proposed. In general, though, in this literature, a unified approach to approximationand existence has not been addressed.

6By this we mean supermodularity of an auxiliary game. See Echenique (2004) for a notion of supermodular extensive form games.

7Our results are also related to recursive saddlepoint type methods, which began in the work of Kydland and Prescott (1980), but are also found in the work of Marcet and Marimon (2011) and Messner, Pavoni, and Sleet (2011).

8It bears mentioning that we focus on short-memory Markovian equilibrium because this class of equilibrium has been the focus of a great deal of applied work. Our methods can also be adapted to studying the set of subgame perfect/sequential equilibria. We should also mention very interesting papers by Curtat (1996), Cole and Kocherlakota (2001), and Doraszelski and Escobar (2011) that pursue similar ideas of trying to develop APS type procedures in function spaces for Markovian equilibrium (i.e., methods where continuation structures are parameterized by functions).


difficult technical problems associated with numerical implementations using set-approximation techniques9.

Next, we strengthen the conditions on the noise of the game, and we propose a different (direct) monotone method for proving existence of stationary MNE under different (albeit related) conditions than those used in previous work (e.g., Curtat (1996), Amir (2002, 2005) or Nowak (2007)). In this case, we are able to improve a great deal on existence results for SMNE relative to the literature. First, we are able to consider existence of SMNE in broader spaces of strategies. Second, we give conditions under which the set of SMNE values forms a complete lattice of Lipschitz continuous functions, which provides a generalization of results found in Curtat (1996). And thirdly, not only are we able to show that SMNE exist, but we are able in principle to compute these SMNE via monotone iterative procedures, and provide conditions under which SMNE can be uniformly approximated as the limit of sequences generated by iterations on our fixed point operator.

Finally, unlike the existing work in the literature, our monotone iterative methods apply to infinite horizon games, as well as finite horizon games. This is particularly important for numerical implementations. Specifically, we are able to provide conditions under which infinite horizon SMNE are the limits of equilibria of truncated finite horizon stochastic games. We are also able to provide conditions for monotone comparative statics results on the set of SMNE for the infinite horizon game, as well as describe how equilibrium comparative statics can be computed. This is particularly important when one seeks to construct stable selections of the set of SMNE that are numerically (and theoretically) tractable as functions of the deep parameters of the economy.

The rest of the paper is organized as follows. Section 2 states the formal definition of an infinite horizon stochastic game. Under general conditions, our main result on Markov equilibrium existence and its value set approximation can be found in section 3. Then, under stronger assumptions, in section 4 we propose a direct method for Markov stationary Nash equilibrium existence and computation. In section 4.4, we present related comparative statics and equilibrium dynamics results. Finally, in section 5, we discuss possible applications of our results.

2 The Class of Stochastic Games

We consider an n-player discounted infinite horizon stochastic game in discrete time. The primitives of the class of games are given by the tuple {S, (Ai, Ai(⋅), βi, ui)i=1,...,n, Q, s0}, where S = [0, S̄] ⊂ ℝk is the state space, Ai ⊂ ℝki is player i's action space with A = ×iAi, βi is the discount factor for player i, ui : S × A → ℝ is the one-period payoff function, and s0 ∈ S is the initial state of the game. For each s ∈ S, the set of feasible actions for player i is given by Ai(s), which is assumed10 to be a compact Euclidean interval in ℝki. By Q, we denote a transition function that specifies, for any current state s ∈ S and current action a ∈ A, a probability distribution over the realizations of next period states s′ ∈ S.

Using this notation, a formal definition of a (Markov, stationary) strategy, payoff, and a Nash equilibrium can now be stated as follows. A strategy for player i is denoted by Γi = (γi^1, γi^2, . . .), where γi^t specifies an action to be taken at stage t as a function of the history of all states, as well as actions taken, as of stage t of the game. If a strategy depends only on the current state st, then the resulting strategy is referred to as Markov. If for all stages t we have a Markov strategy given as γi^t = γi, then the strategy Γi for player i is called a Markov-stationary strategy, and is denoted simply by γi. For a strategy profile Γ = (Γ1, Γ2, . . . , Γn), and initial state s0 ∈ S, the expected payoff for player i can be denoted by:

Ui(Γ, s0) = (1 − βi) ∑t=0..∞ βi^t ∫ ui(st, at) dmi^t(Γ, s0),

where mi^t is the stage t marginal on Ai of the unique probability distribution (given by the Ionescu–Tulcea theorem) induced on the space of all histories by Γ. A strategy profile Γ∗ = (Γi∗, Γ−i∗) is a Nash equilibrium if and only if Γ∗ is feasible and, for any i and all feasible Γi, we have

Ui(Γi∗, Γ−i∗, s0) ≥ Ui(Γi, Γ−i∗, s0).
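To make the payoff definition above concrete, the normalized discounted sum can be approximated by simulation. The sketch below is our own toy example, not a game from the paper: two players, binary states and actions, and made-up payoff and transition primitives. It estimates Ui(Γ, s0) for a stationary Markov profile by Monte Carlo, truncating the horizon where βi^t is negligible.

```python
import random

# Illustrative toy example (not from the paper): estimate the normalized
# discounted payoff U_i(Gamma, s0) of a stationary Markov profile by
# Monte Carlo simulation, truncating the infinite horizon.

BETA = 0.9       # discount factor beta_i (common to both players here)
HORIZON = 200    # BETA**200 is negligible, so truncation error is tiny
N_PATHS = 500

def u(i, s, a):
    # one-period payoff: increasing in the state and in the rival's action
    return s + a[i] * (1 + a[1 - i])

def next_state(s, a, rng):
    # transition Q: higher actions make the high state more likely
    p_high = 0.2 + 0.3 * (a[0] + a[1])
    return 1 if rng.random() < p_high else 0

def profile(s):
    # a stationary Markov profile: each player plays the current state
    return (s, s)

def expected_payoff(i, s0, rng):
    total = 0.0
    for _ in range(N_PATHS):
        s, disc, acc = s0, 1.0, 0.0
        for _ in range(HORIZON):
            a = profile(s)
            acc += disc * u(i, s, a)
            disc *= BETA
            s = next_state(s, a, rng)
        total += (1 - BETA) * acc   # normalization as in U_i above
    return total / N_PATHS

print(expected_payoff(0, s0=0, rng=random.Random(0)))
```

Because payoffs are normalized by (1 − βi), the estimate necessarily lies between the minimum and maximum one-period payoff (here 0 and 3).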

3 MNE via APS methods in Function Spaces

The approach11 we take in this section to prove existence of MNE and approximate its value set is the strategic dynamic programming approach based on the seminal work of Abreu, Pearce, and Stacchetti (1986, 1990), adapted to a stochastic game12. In the original strategic dynamic programming approach, dynamic incentive constraints are handled using correspondence-based arguments. For each state s ∈ S, we shall envision agents playing a

10It is important to note that our methods allow for a more general state space than many related papers in the existing literature (e.g., Amir (2002, 2005), where the state space is often assumed to be one-dimensional).

11To motivate our approach in this section, recall a classical result on the existence of a subgame perfect equilibrium in infinite horizon stochastic games first offered by Mertens and Parthasarathy (1987), and later modified by other authors (e.g., Solan (1998) or Maitra and Sudderth (2007)). See also Harris, Reny, and Robson (1995) for a related argument on subgame perfect equilibria in repeated games with public randomization.

12The APS method is also related to the work of Kydland and Prescott (1977, 1980) (now referred to as KP), but is distinct. In the KP approach (and more general "recursive saddlepoint methods" à la Marcet and Marimon (2011) and Messner, Pavoni, and Sleet (2011)), one develops a recursive dual method for computing equilibrium, where Lagrange


one-shot stage game with the continuation structure parameterized by a correspondence of continuation values, say v′ ∈ V(S), where V(S) is the space of nonempty, bounded, upper semicontinuous correspondences (for example). Imposing incentive constraints on deviations of the stage game under some continuation promised utility v′, a natural operator B that is monotone under set inclusion can be defined that transforms V(S). By iterating on a "value correspondence" operator from a "greatest" element of the collection V(S), the operator is shown to map down the "largest set" V(S), and then, by appealing to standard "self-generation" arguments, it can be shown that a decreasing subchain of subsets can be constructed, whose pointwise limit in the Hausdorff topology is the greatest fixed point of B. Just as in APS for a repeated game, this fixed point turns out to be the set of sustainable values in the game, with a sequential subgame perfect equilibrium being any measurable selection from the strategy set supporting this limiting correspondence of values.

We now propose a new procedure for constructing all measurable (possibly nonstationary) Markov Nash equilibria for our infinite horizon stochastic game. Our approach13 is novel, as we operate directly in function spaces14, i.e., a set of bounded measurable functions on S valued in ℝn. Our construction is related to Cole and Kocherlakota's (2001) study of Markov-private information equilibria via an APS type procedure in function spaces. As compared to their study, however, ours treats a different class of games (with general payoff functions, public signals and an uncountable number of states).

For our arguments in this section, we shall require the following assumptions. We postpone comments on them until the next section.

Assumption 1 We let:

∙ ui is continuous on S × A, and bounded below by 0 and above by ū,

∙ ui is supermodular in ai (for any s, a−i), has increasing differences in (ai; a−i, s), and is increasing in (s, a−i) (for each ai),

multipliers become state variables. In this method, the initial condition for the dual variables has to be made consistent with the initial conditions of the incentive constrained dynamic program. To solve this last step, Kydland-Prescott propose a correspondence-based method for solving for the set of such time-consistent initial conditions that bears a striking resemblance to APS methods.

13It bears mentioning that for dynamic games with more restrictive shock spaces (e.g., discrete or countable), KP/APS procedures have been used extensively in economics in recent years: e.g., by Kydland and Prescott (1980), Atkeson (1991), Pearce and Stacchetti (1997), and Phelan and Stacchetti (2001) for policy games, and Feng, Miao, Peralta-Alva, and Santos (2009) for dynamic competitive equilibrium in recursive economies. Related methods have been proposed for theoretical work in stochastic games (e.g., Mertens and Parthasarathy (1987) and Chakrabarti (1999, 2008)).

14For example, see Phelan and Stacchetti (2001, pp. 1500-1501), who discuss such a possibility in function spaces. In this section, we show such a procedure is analytically tractable, and in addition discuss its computational advantages.


∙ for all s ∈ S, the sets Ai(s) are compact intervals, and the multifunction Ai(⋅) is upper hemicontinuous and ascending under both (i) set inclusion, i.e., if s1 ≤ s2 then Ai(s1) ⊆ Ai(s2), and (ii) Veinott's strong set order ≤v (i.e., Ai(s1) ≤v Ai(s2) if for all a1i ∈ Ai(s1) and a2i ∈ Ai(s2), a1i ∧ a2i ∈ Ai(s1) and a1i ∨ a2i ∈ Ai(s2)),

∙ Q(ds′|⋅, ⋅) has the Feller property on S × A15,

∙ Q(ds′|s, a) is stochastically supermodular in ai (for any s, a−i), has stochastically increasing differences in (ai; a−i, s), and is stochastically increasing in (s, a),

∙ Q(ds′|s, a) has a density q(s′|s, a) with respect to some σ-finite measure μ, i.e., Q(A|s, a) = ∫A q(s′|s, a) μ(ds′); assume that q(s′|s, ⋅) is continuous and bounded for all (s′, s), and for all s ∈ S, ∫S ||q(s′|s, ⋅)||∞ μ(ds′) < ∞,

∙ the support of Q is independent of a.

We now describe formally our APS type method. Let V be the space of bounded, increasing16, measurable value functions on S with values in ℝn, and let 𝒱 be the set of all subsets of V, partially ordered by set inclusion. Define an auxiliary (or "super") one period n-player game Gsv = ({1, . . . , n}, {Ai(s), Πi}i=1..n), where payoffs depend on a weighted average of (i) current within period payoffs, and (ii) a vector of expected continuation values v ∈ V, with weights given by a discount factor:

Πi(vi, s, a) := (1 − βi) ui(s, a) + βi ∫S vi(s′) Q(ds′|s, a),

where v = (v1, v2, . . . , vn), and s ∈ S is the state. As each v ∈ V is an increasing function, under our assumptions Gsv is a supermodular game. Therefore, Gsv has a nonempty complete lattice of pure strategy Nash equilibria NE(v, s) (e.g., see Topkis (1979) or Veinott (1992)), with corresponding values Π(v, s). Given this, for any subset of functions W ∈ 𝒱, define the operator B by

B(W) = ∪v∈W {w ∈ V : w(s) = Π(v, s, a∗), a∗ ∈ NE(v, s)}.
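On a discretized toy game, the extremal Nash equilibria of the auxiliary game Gsv that the operator B draws on can be computed by monotone best-response iteration from the least action profile, in the spirit of Topkis. The sketch below uses hypothetical primitives u and q of our own making (not the paper's), chosen to satisfy the complementarity properties of Assumption 1.

```python
# Hedged sketch: least Nash equilibrium of the auxiliary game G^s_v on a
# finite action grid, found by iterating joint (smallest) best responses
# upward from the least profile. All primitives are illustrative assumptions.

STATES = [0, 1]
ACTIONS = [0, 1, 2]
BETA = 0.9

def u(i, s, a):
    ai, aj = a[i], a[1 - i]
    # supermodular stage payoff with increasing differences in (ai; aj, s)
    return ai * (s + aj) - 0.5 * ai * ai

def q(s_next, s, a):
    # transition probabilities, stochastically increasing in (s, a)
    p_high = min(0.9, 0.1 + 0.2 * (a[0] + a[1]) + 0.2 * s)
    return p_high if s_next == 1 else 1.0 - p_high

def Pi(i, v_i, s, a):
    # Pi_i(v_i, s, a) = (1 - beta_i) u_i + beta_i * E[v_i(s') | s, a]
    cont = sum(v_i[sn] * q(sn, s, a) for sn in STATES)
    return (1 - BETA) * u(i, s, a) + BETA * cont

def smallest_best_response(i, v_i, s, a):
    # smallest maximizer of player i's payoff, rival's action held fixed
    return max(ACTIONS, key=lambda x: (Pi(i, v_i, s, a[:i] + (x,) + a[i+1:]), -x))

def least_ne(v, s):
    a = (min(ACTIONS), min(ACTIONS))   # start from the least profile
    for _ in range(50):                # converges in a few rounds on this grid
        a_new = tuple(smallest_best_response(i, v[i], s, a) for i in range(2))
        if a_new == a:
            break
        a = a_new
    return a
```

For example, with the increasing continuation value v_i = {0: 0.0, 1: 1.0} for both players, least_ne(v, s) returns the least equilibrium profile of Gsv at state s; a symmetric iteration downward from the greatest profile would deliver the greatest equilibrium.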

We denote by V∗ ∈ 𝒱 the set of (nonempty) equilibrium values corresponding to all Markovian equilibria of our stochastic game. It is immediate that

15That is, ∫S f(s′)Q(ds′|⋅, ⋅) is continuous whenever f is continuous and bounded.

16In our paper we use the terms increasing / decreasing in their weak sense.


B is both (i) increasing under set inclusion, and (ii) transforms the space 𝒱 into itself. This fact implies, among other things, that B : 𝒱 → 𝒱 has a greatest fixed point by Tarski's theorem (as 𝒱, ordered under set inclusion, is a complete lattice). Now, as opposed to traditional APS, where subgame perfect equilibria are the focal point of the analysis, it turns out the greatest fixed point of our operator B will generate the set of all (possibly nonstationary) Markovian equilibrium values V∗ in the function space V. Finally, we offer iterative methods to compute the set of MNE values in the game. Consider a sequence of subsets of (n-tuple) equilibrium value functions (on S) {Wt}t=1..∞ generated by iterations of our operator B from the initial subset W1 = V (with Wt+1 = B(Wt)). By construction, W1 = V is mapped down (under set inclusion) by B, and the subchain of decreasing subsets {Wt} converges to V∗, the greatest fixed point of B.
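In a finite toy setting, one step of the iteration Wt+1 = B(Wt) can be carried out exactly: for each value function pair v in W, enumerate the Nash equilibria of the auxiliary game at every state, and collect the value functions generated by every state-by-state equilibrium selection. A minimal sketch, with made-up primitives (not the paper's) and value functions encoded as tuples indexed by state:

```python
from itertools import product

# Hedged sketch of one application of the operator B on a finite toy game:
# two players, states {0,1}, actions {0,1}. A value-function pair v is a
# tuple (v_1, v_2), each v_i a tuple indexed by state. All primitives below
# are illustrative assumptions.

STATES = (0, 1)
ACTIONS = (0, 1)
BETA = 0.5

def u(i, s, a):
    return a[i] * (s + a[1 - i])         # complementarity in (a_i, a_j)

def q(sn, s, a):
    p = 0.25 + 0.25 * (a[0] + a[1])      # stochastically increasing in actions
    return p if sn == 1 else 1.0 - p

def Pi(i, v_i, s, a):
    return (1 - BETA) * u(i, s, a) + BETA * sum(v_i[sn] * q(sn, s, a) for sn in STATES)

def is_ne(v, s, a):
    # no player has a strictly profitable deviation in the auxiliary game
    for i in range(2):
        for dev in ACTIONS:
            d = a[:i] + (dev,) + a[i + 1:]
            if Pi(i, v[i], s, d) > Pi(i, v[i], s, a) + 1e-12:
                return False
    return True

def B(W):
    out = set()
    for v in W:
        ne = {s: [a for a in product(ACTIONS, ACTIONS) if is_ne(v, s, a)]
              for s in STATES}
        # every state-by-state equilibrium selection generates one value pair
        for sel in product(*(ne[s] for s in STATES)):
            w = tuple(tuple(Pi(i, v[i], s, sel[k]) for k, s in enumerate(STATES))
                      for i in range(2))
            out.add(w)
    return out
```

In the paper the iteration starts from the whole space V and produces a decreasing subchain of sets; the finite start-set here only illustrates the mechanics of a single application of B.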

We are now ready to summarize these results in the next theorem.

Theorem 3.1 Assume 1. Then:

1. the operator B has a greatest fixed point W∗ ≠ ∅, with W∗ = limt→∞ Wt = ∩t=1..∞ Wt,

2. V∗ = W∗.

The above theorem establishes, among other things, that the stochastic game has a (possibly nonstationary) Markov Nash equilibrium in monotone strategies on a minimal state space of current state variables. Observe that the conditions needed to establish this fact are weaker than those imposed by Curtat (1996) or Amir (2002). Specifically, we require neither smoothness of the primitives nor any diagonal dominance conditions assuring that the auxiliary game has a unique Nash equilibrium that is, moreover, continuous in the continuation value. A few other comments are now in order.

First, Theorem 3.1.1 follows from a standard application of the Tarski-Kantorovich theorem (e.g., Dugundji and Granas (1982), Theorem 4.2, p. 15). In our application of this theorem, as order convergence and topological convergence coincide in our setting, the lower envelope of the subchain generated by {Wt} under set inclusion is equal to the topological limit of the sequence, which greatly simplifies computations in principle.

Second, any strategic dynamic programming argument (e.g., traditional APS in spaces of correspondences or our method in function spaces) requires, for the method to make sense, that the auxiliary game have a Nash equilibrium. In our situation, we assume the auxiliary game is a game of strategic complementarities. Of course, alternative topological conditions could be imposed. In our paper, for an easy comparison with the results of the monotone operator method presented in the next section, we keep the supermodular game specification.


Third, the assumption of full (or invariant) support is critical for this "recursive type" method to work. That is, by having a Markov selection of values from V∗, we can generate a supporting Markov (time and state dependent) Nash equilibrium.

Fourth, observe that the procedure does not imply that the (Bellman type) equation B(V∗) = V∗ is satisfied by a particular value function v∗; rather, it is satisfied only by a set of value functions V∗. Hence, generally, existence of a stationary Markov Nash equilibrium cannot be deduced using these arguments. Note that this is in contrast to the original APS method, where for any w(s) one finds a continuation v; hence, the construction for that method becomes pointwise, and for any state s ∈ S one can select a different continuation function v. This implies the equilibrium strategies that sustain any given point in the equilibrium value set are not only state dependent, but also continuation dependent. In our method, this is not the case. This difference, in principle, could have significant implications for computation.

Fifth, on a related matter, the set of Markov Nash equilibrium values is generally a subset of the APS (sequential subgame perfect equilibrium) value set. One should keep in mind that, for repeated games, the two sets need not be very different17.

Sixth, the characterization of a sequential NE strategy obtained using the standard APS method is in general very weak (i.e., all one can conclude for our stochastic game is that there exist some measurable value functions that are selections from V∗). In our case, as we focus on an operator B that maps in a space of functions, we resolve this selection issue at the stage of defining the operator. Perhaps more importantly, to implement the traditional APS method, one must convexify the set of continuation values. This convexification step usually involves introducing sunspots or correlation devices into the argument. In comparison, we do not need any of this convexification or correlation, as our method delivers MNE on a minimal state space.

Finally, Judd, Yeltekin, and Conklin (2003) offer a technique for APS set approximation. Their method can be applied in our context as well, but is substantially simpler here, for at least three reasons: (i) we operate directly in function spaces, (ii) we analyze equilibria that are time/state dependent only, and moreover (iii) we define equilibria on a minimal state space.

In the next section, we strengthen the existence results to stationary (and possibly non-monotone) equilibrium, and establish an important equilibrium approximation result together with a novel comparative statics exercise.

17For example, as Horner and Olszewski (2009) show, the Folk theorem can hold for repeated games with imperfect monitoring and finite memory strategies. Further, Barlo, Carmona, and Sabourian (2009) show similar results for repeated games with perfect monitoring, rich action spaces and Markov equilibria.


4 SMNE via Monotone Operators

The aim of this section is to both (i) prove the existence of a stationary Markov Nash equilibrium (SMNE), and (ii) provide a simple successive approximation scheme for computing particular elements of this equilibrium set. The methods developed in this section of the paper are essentially value iteration procedures in pointwise partial orders on spaces of bounded value functions that map subsets of functions to subsets of functions, incorporating the strategic restrictions imposed by the game implicitly in their definition. What is particularly important is that, as compared to APS methods, we can also characterize (and compute) the set of pure strategies that support particular extremal equilibrium values, as well as discuss pointwise equilibrium comparative statics. We will also show that SMNE for infinite horizon games can be computed as the limit of finite horizon stochastic games. Sufficient conditions for any of these results have not been produced in any of the existing literature.

Motivated by the discussion in Maskin and Tirole (2001), and concentrating on the existence and properties of SMNE, many authors (as do we in section 3) have subsequently analyzed the structure of equilibria for the infinite horizon game by appealing to equilibrium in the auxiliary game Gsv. Recall that, if we denote by Π(v, s) a vector of values generated by any Nash equilibrium of Gsv, then to show the existence of a stationary Markov equilibrium value in the original discounted infinite horizon stochastic game, it suffices to show existence of a (measurable) "fixed point" selection in this auxiliary game (i.e., a measurable selection v∗(s) ∈ Π(v∗, s)).

The existence of such a fixed point has typically been achieved in the literature by applying nonconstructive topological fixed point theorems on some nonempty, compact and (locally) convex space of value functions V. For example, using this method, Nowak and Raghavan (1992) show the existence of a measurable value v∗ of the original stochastic game by applying the Fan–Glicksberg generalization of Kakutani's fixed point theorem. In particular, they deduce the existence of a correlated equilibrium of a stochastic game using a measurable selection theorem due to Himmelberg. This method has been further developed by Nowak (2003), where, by adding a specific type of stochastic transition structure for the game (e.g., assuming a particular mixing structure), the existence of a measurable stationary Markov equilibrium is obtained.18

When using the auxiliary game approach, note that one can make significant progress if it is possible to link a continuation value, say v′, to a particular equilibrium value19 v(⋅) ∈ Π(v′, ⋅) from the same set of values.

18See also the papers by Nowak and Szajowski (2003) or Balbus and Nowak (2004, 2008) for related mixing assumptions on the stochastic transition.

19That is, one can define a single-valued self-map T v(⋅) = Π̂(v, ⋅) on a nonempty, convex and compact function space, where Π̂ is some selection from Π.



There are many approaches in the literature for implementing this idea. One method is to assume that the auxiliary game has a unique equilibrium (and, hence, a unique equilibrium value Π(v, ⋅) ∈ CM, where CM is a set of Lipschitz continuous functions on the state space S). This is precisely the approach taken recently by a number of authors (e.g., Curtat (1996) and Amir (2002)). The conditions required to apply this argument are strong, involving very restrictive assumptions on the game's primitives: strong concavity/diagonal dominance of the game's payoffs/transitions, a Lipschitzian structure for the payoffs/transitions in the auxiliary game, as well as stochastic supermodularity conditions. When these conditions hold, one can show the existence of a monotone, Lipschitz continuous stationary equilibrium via, e.g., Schauder's theorem.20

Another interesting idea for resolving these existence issues for the original game via the one-shot auxiliary game is to develop a procedure for selecting from Π an upper semicontinuous, increasing function (i.e., a distribution function) on a compact interval of the real line, and observing that a set of such functions is weakly compact. This approach was first explored by Majumdar and Sundaram (1991) and Dutta and Sundaram (1992) for an important class of dynamic games (e.g., dynamic resource extraction games). More recently, Amir (2005) has generalized this argument to a class of stochastic supermodular games, where both values and pure strategies for SMNE are shown to exist in spaces of increasing, upper semicontinuous functions. Of course, the most serious limitation of this purely topological approach is that, to date, authors have been forced to severely restrict the state space of the game, as well as to require a great deal of complementarity between actions and states (i.e., assumptions consistent with the existence of monotone Markov equilibrium), to keep the topological argument tractable. Perhaps equally important is the fact that this method loses all hope of preserving monotone comparative statics results in the supermodular game (in deep parameters).

In our approach, we also follow this sort of analysis using the auxiliary game, but we show how, under our assumptions, we are able to define a single-valued map that maps between general spaces of value functions, without imposing uniqueness of Nash equilibrium in the one-shot game, nor requiring monotone SMNE. As will be clear in the sequel, our key point of departure from some of the existing literature is that we adopt the specific noise structure assumed in section 3. Although this noise structure has already been used a great deal in the existing literature (e.g. in (Nowak and Szajowski,

20A related, although more general, approach to this problem has recently been proposed by Nowak and Szajowski (2003) and Nowak (2007). In Nowak's approach, based on results of Nowak and Raghavan (1992) and Nowak (2003), he is able to prove the existence of a stationary Markov equilibrium, where the key restrictive assumptions are placed on the transition probability structure rather than on preferences.



2003), (Balbus and Nowak, 2004, 2008) or (Nowak, 2007)), its full power has not been explored so far. Here our focus is computation and computable equilibrium comparative statics. Using this assumption, we are able to characterize appropriate monotone operators on spaces of values and pure strategies that preserve complementarities across periods (and onto the infinite horizon game).

4.1 Assumptions on the Game

We first state some initial conditions on the primitives of the game that are required for our methods in this section.

Assumption 2 (Preferences) For i = 1, . . . , n let:

∙ ui be continuous on A and measurable on S, with 0 ≤ ui(s, a) ≤ ū,

∙ (∀a ∈ A)ui(0, a) = 0,

∙ ui be increasing in a−i,

∙ ui be supermodular in ai for each (a−i, s), and have increasing differences in (ai; a−i),

∙ for all s ∈ S the sets Ai(s) are nonempty, compact intervals, and s → Ai(s) is a measurable correspondence.

Assumption 3 (Transition) Let Q be given by:

∙ Q(⋅∣s, a) = g0(s, a)δ0(⋅) + ∑_{j=1}^{L} gj(s, a)λj(⋅∣s), where

∙ for j = 1, . . . , L the function gj : S × A → [0, 1] is continuous on A and measurable on S, increasing and supermodular in a for fixed s, with gj(0, a) = 0 (clearly ∑_{j=1}^{L} gj(⋅) + g0(⋅) ≡ 1),

∙ (∀s ∈ S, j = 1, . . . , L) λj(⋅∣s) is a Borel transition probability on S,

∙ (∀j = 1, . . . , L) the function s → ∫_S v(s′)λj(ds′∣s) is measurable and bounded for any measurable and bounded (by some predefined constants) v,

∙ δ0 is a probability measure concentrated at point 0.
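To fix ideas, the mixing transition of Assumption 3 is straightforward to simulate: a component index j is selected with probability gj(s, a) and s′ is drawn from λj(⋅∣s), while the residual probability g0(s, a) sends the state to 0. The following sketch uses hypothetical stand-ins for g1 and λ1 (our own illustrative choices, not primitives from the paper):

```python
import random

def draw_next_state(s, a, g, lambdas, rng=random):
    """Sample s' ~ Q(.|s, a) = g0(s,a) * delta_0 + sum_j gj(s,a) * lambda_j(.|s).

    g: list of functions [g_1, ..., g_L] with sum_j g_j(s, a) <= 1;
    lambdas: list of samplers, lambdas[j](s) draws s' from lambda_{j+1}(.|s)."""
    u = rng.random()
    acc = 0.0
    for gj, lam_j in zip(g, lambdas):
        acc += gj(s, a)
        if u < acc:
            return lam_j(s)
    return 0.0  # residual mass g_0(s, a): Dirac delta concentrated at 0

# Toy example with L = 1: g_1 vanishes at s = 0 (so 0 is absorbing) and is
# increasing and supermodular (here multiplicative) in the action profile a.
g = [lambda s, a: 0.5 * s * a[0] * a[1]]
lambdas = [lambda s: random.uniform(0.0, 1.0)]  # lambda_1(.|s): uniform on [0, 1]
samples = [draw_next_state(1.0, (1.0, 1.0), g, lambdas) for _ in range(2000)]
```

With s = 1 and a = (1, 1), the chain resets to the absorbing state 0 with probability g0 = 0.5 each period; at s = 0 it stays at 0 forever, which is exactly the feature exploited in the invariant-distribution results later.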

Although the assumptions here are related to those made in recent work by Amir (2005) and Nowak (2007), there exist some important differences. Before discussing them (in section 4.2), we state our main result.



4.2 Existence and Computation of SMNE

We now discuss the existence and computation of SMNE. Let Bor(S, ℝn) be the set of Borel measurable functions from S into ℝn, and consider the subset:

ℬn(S) := {v ∈ Bor(S, ℝn) : ∀i vi(0) = 0, ∣∣vi∣∣ ≤ ū}. Equip the space ℬn(S) with the pointwise partial (product) order. For a vector of continuation values v = (v1, v2, . . . , vn) ∈ ℬn(S), we can now analyze the auxiliary one-period, n-player game G^s_v with action sets Ai(s) and payoffs:

Πi(vi, s, ai, a−i) := (1 − βi)ui(s, ai, a−i) + βi ∑_{j=1}^{L} gj(s, ai, a−i) ∫_S vi(s′)λj(ds′∣s).

Under assumptions 2 and 3, this auxiliary game G^s_v is supermodular in a for any (v, s); hence it possesses a greatest, ā(s, v), and a least, a̲(s, v), pure strategy Nash equilibrium (see Topkis (1979) and Vives (1990)), as well as corresponding greatest Π̄∗(v, s) and least Π̲∗(v, s) equilibrium values, where Π̲∗(v, s) = (Π̲∗1(v, s), Π̲∗2(v, s), . . . , Π̲∗n(v, s)).

We are now prepared to state the first main theorem of this section concerning the existence of SMNE. Further, we verify existence constructively by providing an explicit successive approximation method whose iterations construct SMNE pure strategies and values.21 To do this, we define a pair of extremal value operators T̄(v)(s) = Π̄∗(v, s) and T̲(v)(s) = Π̲∗(v, s), as well as T^t(v), which denotes the t-th iteration/orbit of the operator T from v. We then recursively22 generate a sequence of lower (resp., upper) bounds for equilibrium values {vt}∞t=0 (resp., {wt}∞t=0), where vt+1 = T̲(vt) for t ≥ 0 from the initial guess v0(s) = (0, 0, . . . , 0) (resp., wt+1 = T̄(wt) from the initial guess w0(s) = (ū, ū, . . . , ū) for s > 0 and w0(0) = 0). For both lower (resp., upper) value iterations, we can also associate sequences of pure strategy Nash equilibria {σt}∞t=0 (resp., {γt}∞t=0), related to the operators by σt = a̲(s, vt) (resp., γt = ā(s, wt)). With this, our main existence theorem in this section is the following:

Theorem 4.1 (The successive approximation of SMNE) Under assumptions 2 and 3 we have:

1. (for fixed s ∈ S) σt(s) and vt(s) are increasing sequences, and γt(s) and wt(s) are decreasing sequences,

2. for all t we have σt ≤ γt and vt ≤ wt (pointwise),

3. the following limits exist: (∀s ∈ S) lim_{t→∞} σt(s) = σ∗(s) and (∀s ∈ S) lim_{t→∞} γt(s) = γ∗(s),

4. the following limits exist: (∀s ∈ S) lim_{t→∞} vt(s) = v∗(s) and (∀s ∈ S) lim_{t→∞} wt(s) = w∗(s),

5. σ∗ and γ∗ are stationary Markov Nash equilibria in the infinite horizon stochastic game. Moreover, v∗ and w∗ are the equilibrium payoffs associated with σ∗ and γ∗, respectively.

21One key aspect of our work relative to APS strategic dynamic programming arguments is that our methods construct selections for both values and strategies. When using APS strategic dynamic programming methods, even in the simplest stochastic games (e.g., cases of games where measurability is not an issue), constructive methods are only available for constructing the set of sustainable values in any state (not the strategies which support them, which can be pure or mixed).

22Lemmas 7.5, 7.6 and 7.7 show that both T̄ and T̲ are well-defined transformations of ℬn(S).
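The two-sided iteration above can be sketched numerically on a toy two-player, two-state game. Everything below (payoffs, the weight g1, the measure lam1, grids) is our own illustrative choice, not a primitive from the paper: extremal equilibria of the auxiliary game are found by monotone best-reply iteration from the bottom/top of the action lattice, and the extremal value operators are iterated from v0 = 0 and w0 = ū.

```python
import numpy as np

beta, u_bar = 0.9, 2.0
states = [0.0, 1.0]                      # state 0 is absorbing with zero payoff
acts = np.linspace(0.0, 1.0, 5)          # common action grid for both players

def u(i, s, a1, a2):
    # Supermodular in (a1, a2), increasing in the rival's action, u_i(0, .) = 0.
    own = a1 if i == 0 else a2
    return s * (0.5 * own + a1 * a2)     # bounded above by u_bar

def g1(s, a1, a2):
    # Weight on the non-degenerate measure; increasing, supermodular, g1(0,.) = 0.
    return 0.25 * s * (1.0 + a1 * a2)

lam1 = np.array([0.2, 0.8])              # lambda_1(.|s), state-independent here

def Pi(i, si, a1, a2, v):
    # One-shot payoff (1 - beta) u_i + beta [g1 * E_lam1 v_i + g0 * v_i(0)].
    g = g1(states[si], a1, a2)
    cont = g * (lam1 @ v[i]) + (1.0 - g) * v[i][0]
    return (1.0 - beta) * u(i, states[si], a1, a2) + beta * cont

def extremal_ne(si, v, least=True):
    # Best-reply iteration from the bottom (top) of the action lattice converges
    # to the least (greatest) pure-strategy Nash equilibrium of G^s_v.
    j1 = j2 = 0 if least else len(acts) - 1
    for _ in range(100):
        b1 = int(np.argmax([Pi(0, si, acts[j], acts[j2], v) for j in range(len(acts))]))
        b2 = int(np.argmax([Pi(1, si, acts[b1], acts[j], v) for j in range(len(acts))]))
        if (b1, b2) == (j1, j2):
            break
        j1, j2 = b1, b2
    return j1, j2

def T(v, least=True):
    # Extremal value operator: evaluate payoffs at the extremal NE, state by state.
    w = np.zeros_like(v)
    for si in range(len(states)):
        j1, j2 = extremal_ne(si, v, least)
        for i in (0, 1):
            w[i, si] = Pi(i, si, acts[j1], acts[j2], v)
    w[:, 0] = 0.0                        # value at the absorbing state stays 0
    return w

v = np.zeros((2, 2))                     # lower iterations from v0 = 0
w = np.full((2, 2), u_bar); w[:, 0] = 0.0  # upper iterations from w0 = u_bar
for _ in range(300):
    v, w = T(v, least=True), T(w, least=False)
assert np.all(v <= w + 1e-9)             # item 2 of the theorem: v_t <= w_t pointwise
```

The final assertion mirrors item 2 of the theorem: lower iterates stay below upper iterates pointwise, so the gap between the two arrays is a computable bound on how far either is from any SMNE value.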

Importantly, we can now also give pointwise bounds for successive approximations relative to any SMNE, using the upper and lower iterations built from the construction in Theorem 4.1. This result implies a notion of "pointwise bounds", stated as follows.23

Theorem 4.2 (Pointwise equilibrium bounds for SMNE) Let assumptions 2 and 3 be satisfied, and let ψ∗ be an arbitrary stationary Markov Nash equilibrium. Then (∀s ∈ S) we have the equilibrium bounds σ∗(s) ≤ ψ∗(s) ≤ γ∗(s). Further, if ω∗ is the equilibrium payoff associated with any stationary Markov Nash equilibrium ψ∗, then (∀s ∈ S) we have the bounds: v∗(s) ≤ ω∗(s) ≤ w∗(s).

Before we continue, we make a few remarks on how Theorems 4.1 and 4.2 relate to the results of section 3. The above result proves the existence of SMNE, while theorem 3.1 in the APS section of the paper applies to all MNE (i.e., it computes the MNE, stationary or not). Moreover, the SMNE of theorem 4.1 are not necessarily monotone, unlike in theorem 3.1. Finally, the theorem establishes the existence of extremal SMNE in pointwise partial orders, as opposed to the set inclusion orders used for MNE via APS type methods.

The existence result in Theorem 4.1 is obtained under different assumptions than those of Curtat (1996), Amir (2002, 2005) or Nowak (2007) concerning stochastic supermodular games. We proceed with a detailed comparison in six points.

1. Relative to Curtat and Amir (see (Amir, 2002)), we do not require the payoffs or the transition probabilities to be smooth (e.g., twice continuously differentiable on an open set containing S × A). Additionally, we weaken the implied assumption of Lipschitz continuity of the functions ui and gj used in both those papers, which appears to be very strong relative

23Of course, if we take the supremum of the pointwise bounds across our compact state space, we also get an estimate of uniform bounds. The point is that our iterations converge in a weaker sense, i.e., in the topology of pointwise convergence.



to many economic applications24. Perhaps most importantly, we also do not impose conditions on payoffs and stochastic transitions that imply "double increasing differences" in the sense of Granot and Veinott (1985), or diagonal dominance conditions (needed for equilibrium uniqueness in the stage game) for each player in actions and states. Finally, we do not assume any increasing differences between actions and states.

2. On the other hand, we do impose a very important condition on the stochastic transitions of the game that is stronger than needed for existence in these two papers. In particular, we assume the transition structure induced by Q can be represented as a convex combination of L + 1 probability measures, of which one is a Dirac delta concentrated at 0. As a result, with probability g0, the next period state is set to zero; with probability gj, the next state is drawn from the nondegenerate distribution λj (where, in this latter case, the distribution does not depend on the vector of actions a, but is allowed to depend on the current state s). Also, although we assume each λj is stochastically ordered relative to the Dirac delta δ0, we do not impose stochastic orderings among the various measures λj.

This "mixing" assumption on transition probabilities has been discussed extensively in the literature. For example, it was mentioned in Amir (1996), studied systematically for broad classes of stochastic games in Nowak (2003), and (almost everywhere) by Horst (2005). Further, in many applications, it has recently been used to great advantage in various papers studying dynamic consistency problems. Surprisingly, the main strength of this assumption was not fully exploited until the work of Balbus, Reffett, and Wozny (2009) in the context of paternalistic altruism economies, as well as in the present paper, where we integrate this mixing assumption into our class of stochastic supermodular games.

There are, however, important differences between the functions gj that we use in the current paper and those found in the existing literature. Specifically, as we are building a monotone continuation method in spaces of value functions, and not Euler equations (as, for example, in Balbus, Reffett, and Wozny (2009)), we do not require strict monotonicity of gj. This issue, although somewhat technical, turns out to be important for the nature of the results we obtain in the paper (e.g., compare with Balbus, Reffett, and Wozny (2009), where it is shown that weak monotonicity of gj is needed to avoid a trivial invariant distribution on bounded state spaces).

3. The assumptions in the paper by Amir (2005) are, perhaps, the most closely related to ours. Our requirements on payoffs and action sets are substantially weaker (i.e., the feasible action correspondences Ai(s) and the payoff/transition structures ui and gj are only required to be measurable in s, as opposed to upper semicontinuous as in Amir (2005)). Further, we

24E.g., it rules out payoffs consistent with Inada type assumptions, which, for example, are also ruled out in the work of Horst (2005).



also eliminate Amir's assumption of increasing differences between players' actions and the state variable s for existence. That is, for existence, we do not require monotone Markov equilibrium. Additionally, we do not require the class of games to have a single dimensional state space (as required for Amir's existence argument for the infinite horizon version of his game).

There is a critical difference between our work and Amir's relative to the specification of the transition Q, and the comparisons here are more subtle. On the one hand, we require the aforementioned mixing conditions on the stochastic transitions to make the infinite horizon game tractable. On the other hand, Amir, for the infinite horizon game, requires a strong stochastic equicontinuity condition on the distribution function Q relative to the actions a, which is critical for his existence argument. We do not need this latter assumption. Also, as we do not require any form of continuity of λj with respect to the state s, we do not satisfy Amir's assumption T1. Further, although we both require that the stochastic transition structure Q be stochastically supermodular in a, we do not require increasing differences in (a, s) as Amir does, nor stochastic monotonicity of Q in s. Therefore, the two sets of assumptions are incomparable (i.e., one set does not imply the other).

With these changes of assumptions, our results are substantially stronger than Amir's. In particular, aside from verifying the existence of SMNE, we provide methods for constructing them, and in a moment (see section 4.4), we will show how to obtain equilibrium monotone comparative statics for the infinite horizon game.

More specifically, to better understand the differences between our approach and Amir's, first observe that Amir's existence proof is based on a Schauder fixed point theorem. For this, he needs stochastic equicontinuity conditions on the noise to get (weak*) continuity of a best response operator on a compact and convex set of monotone, upper semicontinuous strategies defined on the real line. In contrast, we just construct a sequence25 of functions whose limit is a (fixed point) value leading to SMNE. Our method is completely constructive. Therefore, we do not need continuity of a best response operator, nor compactness or convexity of a particular strategy space.

4. It bears mentioning that our assumptions are weaker than thosestudied in Nowak (2007).

5. Our limiting arguments are based on the topology of pointwise convergence26. This allows us to state all our results equivalently using fixed point theorems in σ-complete posets (e.g., see Dugundji and Granas (1982),

25Compare with Nowak and Szajowski (2003), lemma 5, where a related limiting Nash equilibrium result for a two player game is obtained.

26Which, for pointwise partial orders on spaces of values and strategies, coincides with order convergence in the interval topology. See Aliprantis and Border (1999), lemma 7.16, as applied in our context.



Theorems 4.1-4.2), where continuity and convergence are always characterized in terms of order topologies. It is precisely here that our mixing assumption on the noise has its bite, as it leads to a form of monotonicity that is preserved by the value function operator onto the infinite horizon. For example, in Curtat (1996), one can only manage to show monotonicity of an operator T in gradient orders (i.e., in ∂v, where ∂ denotes a vector of (almost everywhere defined) superdifferentials of v). Obviously, for such superdifferentials to be guaranteed to be well-behaved, one needs concavity of the value function in equilibrium. Under our mixing assumptions, we are not limited to such cases.

6. It is important to keep in mind that our methods obviously apply to finite horizon games. Of course, for such games, our existence and comparative statics results can be obtained under weaker conditions; but for arguing that stationary MNE are limits of finite horizon games, our assumptions here will be critical.

We finish this section with an important remark for the applications we discuss in section 5. Namely, as the proofs of theorems 4.1 and 4.2 are conducted pointwise, we can weaken the restrictions placed on S.

Remark 1 Theorems 4.1 and 4.2 (and later theorem 4.5) work for compact posets S with a minimal element 0, not necessarily boxes in ℝk.

4.3 Uniform Error Bounds for Lipschitz continuous SMNE

We now turn to error bounds for approximate solutions. This is something that has not been addressed in the current literature. We first give two motivations for the results of this section. First, observe that the limits provided by theorems 4.1 and 4.2 are only pointwise. With slightly stronger assumptions, we can obtain uniform convergence to a set of stationary Markov Nash equilibria. Second, to obtain uniform error bounds, we need to make some stronger assumptions on the primitives that allow us to address the question of Lipschitz continuity of equilibrium strategies. Such assumptions are common in applications (compare Curtat (1996) and Amir (2002)).

In this section we assume that S is endowed with the taxi norm ∣∣ ⋅ ∣∣1.27 The spaces Ai and A are endowed with the natural maximum norm. A function f : S → A is said to be M-Lipschitz continuous if and only if, for all i = 1, . . . , n, ∣∣fi(x) − fi(y)∣∣ ≤ M∣∣x − y∣∣1. Note that, if fi is differentiable, M-Lipschitz continuity is equivalent to each partial derivative being bounded in absolute value by M.

To obtain corresponding uniform convergence and uniform approxima-tion results we need additional assumptions.

27The taxi norm of a vector x = (x1, . . . , xk) is defined as ∣∣x∣∣1 = ∑_{i=1}^{k} ∣xi∣.
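For intuition, the taxi-norm Lipschitz constant of a function on a finite grid can be estimated directly from the definition by checking all grid pairs; the helper below is a hypothetical illustration (not part of the paper's machinery) of the equivalence just noted between the Lipschitz constant and the bound on partial derivatives.

```python
import itertools

def lipschitz_constant(f, grid_points):
    """Smallest M (over the grid) with |f(x) - f(y)| <= M * ||x - y||_1."""
    best = 0.0
    for x, y in itertools.combinations(grid_points, 2):
        d1 = sum(abs(xi - yi) for xi, yi in zip(x, y))  # taxi norm ||x - y||_1
        if d1 > 0:
            best = max(best, abs(f(x) - f(y)) / d1)
    return best

# f(x) = 0.3 x1 + 0.4 x2 has partial derivatives 0.3 and 0.4, so its taxi-norm
# Lipschitz constant is max(0.3, 0.4) = 0.4, matching the remark above.
grid = list(itertools.product([i / 10 for i in range(11)], repeat=2))
M = lipschitz_constant(lambda x: 0.3 * x[0] + 0.4 * x[1], grid)
```

On this grid, M evaluates to 0.4 (attained by pairs differing only in the second coordinate).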



Assumption 4 For all i, j:

∙ ui, gj are twice continuously differentiable on an open set containing28 S × A,

∙ ui is increasing in (s, a−i) and satisfies cardinal complementarity29 in ai and (a−i, s),

∙ ui satisfies a strict dominant diagonal condition in (ai, a−i) for fixed s ∈ S, s > 0; i.e., if we denote ai ∈ ℝ^{ki} by ai := (a^1_i, . . . , a^{ki}_i), then

∀i=1,...,n ∀j=1,...,ki  ∑_{ν=1}^{n} ∑_{μ=1}^{k_ν} ∂²ui/(∂a^j_i ∂a^μ_ν) < 0,

∙ gj is increasing in (s, a) and has cardinal complementarity in ai and (s, a−i),

∙ gj satisfies a strict dominant diagonal condition in (ai, a−i) for fixed s ∈ S, s > 0,

∙ for each increasing, Lipschitz continuous function f bounded by ū, the function λ^f_j(s) := ∫_S f(s′)λj(ds′∣s) is increasing and Lipschitz continuous30 with a constant μ,

∙ Ai(s) := [0, āi(s)], and each function s → āi(s) is a Lipschitz continuous and isotone function31.

Now, define the set C^N_M of N-tuples of increasing, Lipschitz continuous (with some constant M and natural number N) functions on S. Observe that C^N_M is a complete lattice when endowed with the pointwise partial order (and is also convex and compact in the sup norm). C^N_M is also closely related to the space in which equilibrium is constructed in Curtat (1996) and Amir (2002). There are two key differences between our game and those of both authors. First, we allow the choice set Ai to depend on s (whereas these authors assume Ai independent of s). Second, under our assumptions, the auxiliary game G^0_v has a continuum of Nash equilibria, and hence we need to close the Nash equilibrium correspondence at state 0. These technical differences are addressed in lemma 7.8 and in the proof of the next theorem. Before that, we

28Note that this implies that ui and gj are bounded, Lipschitz continuous functions on the compact set S × A.

29That is, the payoff is supermodular in ai and has increasing differences in (ai, s) and in (ai, a−i).

30This condition is satisfied if each measure λj(ds′∣s) has a density λj(s′∣s) and the function s → λj(s′∣s) is Lipschitz continuous uniformly in s′.

31Each coordinate ā^j_i is Lipschitz continuous. Notice that this implies that the feasible action correspondences are Veinott strong set order isotone.



provide a few definitions. For each twice continuously differentiable function f : A → ℝ we define:

ℒ(f) := −∑_{ν=1}^{n} ∑_{μ=1}^{k_ν} ∂²f/(∂a^j_i ∂a^μ_ν),   U²_{i,j,l} := sup_{s∈S, a∈A(s)} ∂²ui/(∂a^j_i ∂s_l)(s, a),

G¹_{i,j} := sup_{s∈S, a∈A(s)} ∑_{ν=1}^{L} ∂g_ν/∂a^j_i (s, a),   G²_{i,j,l} := sup_{s∈S, a∈A(s)} ∑_{ν=1}^{L} ∂²g_ν/(∂a^j_i ∂s_l)(s, a),

M0 := max { [(1 − βi)U²_{i,j,l} + βi μ G¹_{i,j} + βi ū G²_{i,j,l}] / [−(1 − βi)ℒ(ui)] : i = 1, . . . , n, j = 1, . . . , ki, l = 1, . . . , k }.

Theorem 4.3 (Lipschitz continuity) Let assumptions 2, 3 and 4 be satisfied. Assume additionally that each āi(⋅) is Lipschitz continuous with a constant less than M0. Then the stationary Markov Nash equilibria σ∗, γ∗ and the corresponding values v∗, w∗ are all Lipschitz continuous.

Remark 2 Note that we could relax the assumption on the transition probability, requiring only that ∫_S v(s′)q(ds′∣s, a) be smooth, supermodular, and satisfy the strict diagonal property for all Lipschitz continuous v.

We can now provide conditions for the uniform approximation of SMNE. The results appeal to a version of Amann's theorem (e.g., Amann (1976), Theorem 6.1) to characterize the least and greatest SMNE via successive approximations. Further, as a corollary of Theorem 4.1, we also obtain their associated value functions. For this argument, denote by 0 (resp. ū) the n-tuple of functions identically equal to 0 (resp. ū) for all s > 0. Observe that, as under assumption 4 the auxiliary game has a unique NE value, we have T̄(v) = T̲(v) := T(v). The result is then stated as follows:

Corollary 4.1 (Uniform approximation of extremal SMNE) Let assumptions 2, 3 and 4 be satisfied. Then limt→∞ ∣∣T^t 0 − v∗∣∣ = 0 and limt→∞ ∣∣T^t ū − w∗∣∣ = 0, with limt→∞ ∣∣σt − σ∗∣∣ = 0 and limt→∞ ∣∣γt − γ∗∣∣ = 0.

Notice that the above corollary ensures that the convergence in theorem 4.1 is uniform. We also obtain a stronger characterization of the set of SMNE in this case; namely, the set of SMNE equilibrium value functions forms a complete lattice.

Theorem 4.4 (Complete lattice structure of the SMNE value set) Under assumptions 2, 3 and 4, the set of stationary Markov Nash equilibrium values v∗ in C^n_M is a nonempty complete lattice.



The above result provides a further characterization of the SMNE strategies, as well as of their corresponding set of equilibrium value functions. From a computational point of view, not only are the extremal values and strategies Lipschitzian (as known from previous work); they can also be uniformly approximated by a simple algorithm. Also observe that the set of SMNE in C^{∑i ki}_M is not necessarily a complete lattice.

4.4 Monotone Comparative Dynamics

In this section, we provide conditions under which our games exhibit equilibrium monotone comparative statics of the extremal fixed point values v∗, w∗, as well as of the corresponding extremal equilibria σ∗, γ∗. We also prove a theorem on ordered equilibrium stochastic dynamics. With this equilibrium comparative statics question in mind, consider a parametrization of our stochastic game by a set of parameters θ ∈ Θ, where Θ is a partially ordered set. We can view θ as representing the deep parameters of the space of games, with its elements including parameters for (i) the period preferences ui, (ii) the stochastic transitions gj and λj, and (iii) the feasibility correspondence Ai. Alternatively, we can think of the elements θ as policy parameters of the environment governing the setting of taxes or subsidies (as, for example, in a dynamic policy game with strategic complementarities). Along those lines, we provide parameterized versions of Assumptions 2 and 3 as follows:

Assumption 5 (Parameterized preferences) For i = 1, . . . , n let:

∙ ui : S × A × Θ → ℝ is such that a → ui(s, a, θ) is continuous on A for any s ∈ S, θ ∈ Θ, with ui(⋅) ≤ ū, and ui(⋅, ⋅, θ) is measurable for all θ,

∙ (∀a ∈ A, θ ∈ Θ) ui(0, a, θ) = 0,

∙ ui be increasing in (s, a−i, θ),

∙ ui be supermodular in ai for fixed (a−i, s, θ), and have increasing differences in (ai; a−i, s, θ),

∙ for all s ∈ S, θ ∈ Θ, the sets Ai(s, θ) are nonempty, measurable (for given θ), compact intervals, and Ai is a measurable multifunction that is both ascending in Veinott's strong set order32 and expanding under set inclusion33, with Ai(0, θ) = {0}.

Assumption 6 (Parameterized transition) Let Q be given by:

∙ Q(⋅∣s, a, θ) = g0(s, a, θ)δ0(⋅) + ∑_{j=1}^{L} gj(s, a, θ)λj(⋅∣s, θ), where

∙ for j = 1, . . . , L the function gj : S × A × Θ → [0, 1] is continuous in a for given (s, θ), measurable for given θ, increasing in (s, a, θ), supermodular in a for fixed (s, θ), and has increasing differences in (a; s, θ), with gj(0, a, θ) = 0 (clearly ∑_{j=1}^{L} gj(⋅) + g0(⋅) ≡ 1),

∙ (∀s ∈ S, θ ∈ Θ, j = 1, . . . , L) λj(⋅∣s, θ) is a Borel transition probability on S, with each λj(⋅∣s, θ) stochastically increasing in θ and s,

∙ (∀j = 1, . . . , L, θ ∈ Θ) the function s → ∫_S v(s′)λj(ds′∣s, θ) is measurable for any measurable and bounded v,

∙ δ0 is a probability measure concentrated at point 0.

32That is, Ai(s, θ) is ascending in Veinott's strong set order if for any (s, θ) ≤ (s′, θ′), ai ∈ Ai(s, θ) and a′i ∈ Ai(s′, θ′) imply ai ∧ a′i ∈ Ai(s, θ) and ai ∨ a′i ∈ Ai(s′, θ′).

33That is, Ai(s, θ) is expanding if s1 ≤ s2 and θ1 ≤ θ2 imply Ai(s1, θ1) ⊆ Ai(s2, θ2).

Notice that in both of these assumptions we have added increasing differences assumptions between actions and states (as, for example, in Curtat (1996) and Amir (2002)). We first introduce some notation. For a stochastic game evaluated at parameter θ ∈ Θ, denote the least and greatest equilibrium values, respectively, by v∗θ and w∗θ. Further, for each of these extremal values, denote the associated least and greatest SMNE pure strategies, respectively, by σ∗θ and γ∗θ. Our first monotone equilibrium comparative statics result is given in the next theorem:

Theorem 4.5 (Monotone equilibrium comparative statics) Let assumptions 5 and 6 be satisfied. Then the extremal equilibrium values v∗θ(s), w∗θ(s) are increasing on S × Θ. In addition, the associated extremal pure strategy stationary Markov Nash equilibria σ∗θ(s) and γ∗θ(s) are increasing on S × Θ.

In the literature on stochastic games with strategic complementarities over an infinite horizon, we are not aware of any analogous result on monotone equilibrium comparative statics such as Theorem 4.5. In particular, because of the nonconstructive approach to the equilibrium existence problem typically taken in the literature, it is difficult to obtain such monotone comparative statics without fixed point uniqueness (see e.g. Villas-Boas (1997) for details). Therefore, one key innovation of our approach in the previous section is that, for the special case of our games where SMNE are monotone Markov processes, we are able to construct a sequence of parameterized monotone operators whose fixed points are extremal equilibrium selections. As the method is constructive, this also allows us to compute directly the relevant monotone selections from the set of SMNE.

Finally, we state results on the dynamics and invariant distributions started from s0 and governed by an SMNE and the transition Q. Before that, let us mention that, by our assumptions, the Dirac delta concentrated at 0 is an absorbing state, and hence we always have a trivial invariant distribution. As a result, we do not aim to prove the existence of an invariant distribution, but rather characterize the set of all invariant distributions and discuss conditions under which it is

21

Page 22: A Constructive Study of Markov Equilibria in Stochastic Games …econ/conference/documents/reffett.pdf · 2011-12-07 · Stochastic Games with Strategic Complementarities L ukasz

not a singleton. For this reason, let f be given, and denote by s^f_t the process induced by Q and the equilibrium strategy f (i.e., s0 = sf is an initial value and, for t > 0, st+1 has conditional distribution Q(⋅∣st, f(st))). By ⪰ we denote the first order stochastic dominance order on the space of probability measures. We have the following theorem:

Theorem 4.6 (Invariant distribution) Let assumptions 2, 3 and 4 be satisfied.

∙ The sets of invariant distributions of the processes s_t^{φ*} and s_t^{ψ*} are chain complete (with both greatest and least elements) with respect to the (first-order) stochastic dominance order.

∙ Let μ(φ*) be the greatest invariant distribution with respect to φ*, and μ(ψ*) the greatest invariant distribution with respect to ψ*. If the initial state of s_t^{φ*} or s_t^{ψ*} is a Dirac delta in S, then the distribution of s_t^{φ*} converges weakly to μ(φ*), and that of s_t^{ψ*} converges weakly to μ(ψ*), respectively.

We make a few remarks. First, the above result is stronger than the related theorem in Curtat (1996) (e.g., Theorem 5.2). That is, not only do we characterize the set of invariant distributions associated with the extremal strategies (which he does not), but we also prove a weak convergence result for the greatest invariant selection. Further, it is worth mentioning that if, for almost all s ∈ S, we have Σ_j g_j(s, ⋅) < 1, we obtain a positive probability of reaching zero (an absorbing state) each period, and hence the only invariant distribution is the Dirac delta at zero. Therefore, to obtain a nontrivial invariant distribution, one has to assume Σ_j g_j(s, ⋅) = 1 for all s in some subset of the state space S of positive measure, e.g. an interval [S′, S] ⊂ S (see Hopenhayn and Prescott (1992), or more recently Kamihigashi and Stachurski (2010)).34
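The role of this condition can be seen in a small simulation. The sketch below uses a hypothetical specification of the mixing structure of assumption 3 (a single component g1 with uniform λ1, a fixed strategy folded into g1; none of these are the paper's primitives): when g1 is bounded away from 1 the chain is absorbed at 0, while for g1 ≡ 1 no mass ever leaks to the absorbing state.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical transition Q(.|s) = (1 - g1(s))*delta_0 + g1(s)*lambda_1(.|s)
# on S = [0, 1]; state 0 is absorbing.
def step(s, g1, lam1_draw):
    if s == 0.0 or rng.random() > g1(s):
        return 0.0                 # jump to the absorbing state
    return lam1_draw(s)            # otherwise draw from lambda_1(.|s)

def simulate(g1, lam1_draw, s0=1.0, T=500):
    s = s0
    for _ in range(T):
        s = step(s, g1, lam1_draw)
    return s

# g1 bounded away from 1: absorption at 0 is (essentially) certain.
absorbed = [simulate(lambda s: 0.9, lambda s: rng.uniform(0, 1)) for _ in range(200)]
assert all(s == 0.0 for s in absorbed)

# g1 = 1 everywhere: the chain never reaches 0, so nontrivial
# long-run behavior is possible.
alive = [simulate(lambda s: 1.0, lambda s: rng.uniform(0, 1)) for _ in range(200)]
assert all(s > 0.0 for s in alive)
```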

Second, Theorems 4.5 and 4.6 also imply the existence of results on monotone comparative dynamics (e.g., as defined by Huggett (2003)) with respect to the parameter vector θ ∈ Θ, induced by the extremal SMNE φ*_θ and ψ*_θ. To see this, define the greatest invariant distribution μ_θ(φ*_θ) induced by Q(⋅|s, φ*_θ, θ), and the greatest invariant distribution μ_θ(ψ*_θ) induced by Q(⋅|s, ψ*_θ, θ), and consider the following corollary:

Corollary 4.2 Assume 5, 6. Additionally, let the assumptions of Theorem 4.6 be satisfied for all θ ∈ Θ. Then μ_θ2(φ*_θ2) ≽ μ_θ1(φ*_θ1) as well as μ_θ2(ψ*_θ2) ≽ μ_θ1(ψ*_θ1) for any θ2 ≥ θ1.

34 It is also worth mentioning that much of the existing literature does not consider the question of characterizing the existence of stationary Markovian equilibrium (i.e., strategy and invariant distribution pairs).


Clearly, the above results point to the importance of having constructive iterative methods both for strategies/values and for the limiting distributions associated with extremal SMNE. Without such monotone iterations, we could not close our monotone comparative statics results. Further, in conclusion, we stress that, by the weak continuity of the operators used to establish invariant distributions, we can also obtain results that allow one to estimate the parameters θ in the context of our stochastic game using simulated moments methods (e.g., see Pakes and Pollard (1989), Lee and Ingram (1991), and more recently Aguirregabiria and Mira (2007), for discussion of how this is done, and why it is important).

5 Applications

Applications of our theorems are immediate, and can be used to study equilibria in the games of Curtat (1996), Amir (2002), Horst (2005) or Nowak (2007), including examples such as dynamic (price or quantity) oligopolistic competition, stochastic growth models without commitment, problems of dynamic consistency, models with weak social interactions, as well as various dynamic policy games. We now discuss three such applications of our results. We first use the results of section 4 to prove existence of Markov equilibria in dynamic oligopoly models. We then show how our methods can be used to analyze the question of credible government public policies as defined by Stokey (1991). We conclude by discussing comparative statics and equilibrium dynamics in a dynamic search model with learning.

5.1 Price competition with durable goods

Consider an economy with N firms competing for customers buying durable goods that are heterogeneous but mutually substitutable. Apart from the price of a given good and the vector of competitors' prices, demand for any commodity depends on a demand parameter s. Each period firms choose their prices, competing à la Bertrand. Our aim is to analyze the (stationary) Markov Nash equilibria of such an economy. Our paper developed two separate sets of conditions and methods for studying equilibria in such economies: those of section 3 and those of section 4. We start with the latter.

The payoff of firm i, choosing price ai ∈ [0, ā], is ui(s, ai, a−i, θ) = aiDi(ai, a−i, s) − Ci(Di(ai, a−i, s), θ), where s is a (common) demand parameter, while θ is a cost function parameter. As the within-period game is Bertrand with heterogeneous but substitutable products, the preference assumption 2 is naturally satisfied if Di is increasing in a−i, Di has increasing differences in (ai, a−i), and Ci is increasing and convex. As [0, ā] is single dimensional, ui is a supermodular function of ai. Concerning the interpretation of the assumptions placed on Q in the context of this model: letting s = 0 be an absorbing


state means that there is a probability that demand will vanish and the companies will be driven out of the market. The other assumptions on the transition probabilities are also satisfied if Q(⋅|s, a) = g0(s, a)δ0(⋅) + Σ_j gj(s, a)λj(⋅|s) and gj, λj satisfy assumptions 3.

Interpreting: high prices a today result in a high probability of positive demand in the future, as the customer trades off between exchanging the old product for a new one, and keeping the current product while waiting for lower prices tomorrow. Supermodularity in prices implies that the impact of a price increase on the positive-demand parameter tomorrow is higher when the others set higher prices. Indeed, when a company increases its price today, this may lead to positive demand in the future if the others also set high prices. But if the other firms set low prices today, then such an impact is definitely lower, as some clients may prefer to purchase a competitor's good today instead.

The results of the paper (theorem 4.1) prove existence of the greatest and the least stationary Markov Bertrand equilibria and allow one to compute these equilibria by a simple iterative procedure. The results hence extend the example of Curtat (1996) to non-monotone strategies, characterizing a monopolistically competitive economy with substitutable durable goods and varying consumer preferences. Moreover, our approximation procedure allows an applied researcher to compute and estimate the stochastic properties of these models using the extremal invariant distributions (see theorem 4.6). Finally, if one adds the assumptions of theorem 4.5, one obtains monotone comparative statics of the extremal equilibria and invariant distributions (see corollary 4.2), results absent in the related work.
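As a one-period illustration of this kind of iterative procedure, the sketch below uses a hypothetical linear demand specification (not the general model): D_i = max(s − a_i + b a_j, 0) is increasing in the rival's price, and a_i D_i has increasing differences in (a_i, a_j), so the pricing game is supermodular and best-response iteration from the top (bottom) of a price grid converges to the greatest (least) Bertrand equilibrium.

```python
import numpy as np

# Hypothetical linear demand and linear cost of serving demand.
s, b, c = 1.0, 0.5, 0.1             # demand state, substitutability, unit cost
prices = np.linspace(0.0, 1.0, 101)

def profit(ai, aj):
    d = np.maximum(s - ai + b * aj, 0.0)
    return ai * d - c * d           # revenue minus cost

def best_response(aj):
    return prices[np.argmax(profit(prices, aj))]

def iterate(a0, n=200):
    # Monotone joint best-response iteration from a common starting price.
    a = np.array([a0, a0])
    for _ in range(n):
        a = np.array([best_response(a[1]), best_response(a[0])])
    return a

a_greatest = iterate(prices[-1])    # from the top of the grid
a_least = iterate(prices[0])        # from the bottom of the grid
assert np.all(a_least <= a_greatest)
```

In this linear specification the interior equilibrium is (s + c)/(2 − b), and both iterations land within a grid step of it, so the two limits essentially coincide; with multiple equilibria they would instead bracket the equilibrium set.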

If the mixing assumption is too restrictive in some applications, theorem 3.1 still offers MNE existence, provided assumption 1 is satisfied. Observe that here, as compared to assumptions 2 and 3, we need to add increasing differences in (ai, s) and monotonicity in s. Such a method hence allows one to study monotone equilibria only. This assumption means, among other things, that high demand today implies high demand in the future. To justify it, Curtat (1996) argues that a "high level of demand today is likely to result in a high level of demand tomorrow because one can assume that not all customers will be served today in the case of high demand." Although in our paper MNE existence is obtained under weaker assumptions than those of Curtat (1996) or Amir (2002), the mentioned monotonicity assumption remains questionable, as customer rationing is not part of the game description. Hence, the methods developed in section 4 are preferable.

5.2 Time-consistent public policy

We now consider a time-consistent policy game as defined by Stokey (1991)(also see Curtat (1992), chapter 7.6) and analyzed more recently by Lagunoff


(2008). Consider a (stochastic) game between a single household and the government. For any state k ∈ S (the capital level), the household chooses consumption c and investment i, treating the level of government spending G as given. There are no security markets through which the household can insure against tomorrow's capital level; the only way to consume tomorrow is to invest in the stochastic technology Q. The within-period preferences of the household are given by u(c), i.e. the household does not obtain utility from public spending G. The government raises revenue by levying a flat tax τ ∈ [0, 1] on capital income to finance its public spending G ≥ 0. Each period the government budget is balanced, and its within-period preferences are given by u(c) + J(G). The consumption good production technology is given by a constant returns to scale function f(k) with f(0) = 0. The transition between states is given by a probability distribution Q(⋅|i, k), where i denotes household investment. The timing in each period is that the government and the household move simultaneously, the government choosing the tax rate. The household and the government take the price R of capital as given, which in equilibrium equals its marginal productivity. Assume that u, J, f are increasing, concave and twice continuously differentiable, and that Q is given by assumption 3.

The household chooses:

$$\max_{i\in[0,(1-\tau)Rk]} \; u((1-\tau)Rk - i) + \beta g(i)\int_S v_H(s)\,\lambda(ds|k).$$

Observe that the objective is supermodular in i and has increasing differences in (i, t), where t = 1 − τ, by the envelope theorem and −u′′(⋅) ≥ 0. Moreover, the objective is increasing in t = 1 − τ by the monotonicity of u.
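As a quick numerical sanity check of this increasing-differences claim, the following sketch (hypothetical log utility, g(i) = i, and a fixed continuation value V, all chosen only for illustration) verifies that the gain from raising investment is larger at a higher retention rate t:

```python
import math

# Hypothetical primitives: u = log, g(i) = i, fixed continuation value V.
R, k, beta, V = 1.2, 1.0, 0.95, 2.0
u = math.log

def H(i, t):
    # Household objective at investment i and retention rate t = 1 - tau.
    return u(t * R * k - i) + beta * i * V

i_lo, i_hi = 0.2, 0.4
t_lo, t_hi = 0.6, 0.9
gain_at_low_t = H(i_hi, t_lo) - H(i_lo, t_lo)
gain_at_high_t = H(i_hi, t_hi) - H(i_lo, t_hi)
assert gain_at_high_t >= gain_at_low_t   # increasing differences in (i, t)
```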

The government's problem is:

$$\max_{t\in[0,1]} \; u(tRk - i) + \beta g(i)\int_S v_H(s)\,\lambda(ds|k) + J(Rk(1-t)) + \beta g(i)\int_S v_G(s)\,\lambda(ds|k),$$

where t = 1 − τ. That is, the government maximizes the household's utility plus its additional utility from public spending J and its continuation value vG. Again, the objective is supermodular in t = 1 − τ and has increasing differences in (t, i), as −u′′(⋅) ≥ 0. Moreover, observe that although the objective is not increasing in i, along the Nash equilibrium of the auxiliary game the objective is increasing in i∗(vH), again by the envelope theorem.

Although the model does not fit the general assumptions of the game analyzed in our paper, essentially the same methods as those developed here can be applied to prove existence of, and to compute, SMNE. Specifically, we can construct an operator on the space of values that is monotone (as the within-period game is supermodular and its Nash equilibrium is monotone in vH, vG).


Some additional interesting points of departure from the above basic specification can also be worked out, including: (i) an elastic labor supply choice, or, more importantly, (ii) adding security markets and investment/insurance firms possessing Q, and proving existence of prices decentralizing the optimal investment decision i∗. Observe, however, that here we are able to weaken substantially the assumptions needed to guarantee existence of a stationary credible policy in Curtat (1992), as well as to offer a variety of tools allowing for a constructive study of its properties and computation.

5.3 Dynamic search with learning

We extend a dynamic game of search35 with learning studied by Amir (2002), Horst (2005) and Nowak (2007); see also Rauh (2009) for a large search game with strategic complementarities. Each period, N traders (players) expend effort searching for trading partners (other players). Denoting the search effort level by ai ∈ [0, s] = Ai(s) and search costs by ci(ai), we let the state s ∈ S = [0, S] ⊂ ℝ+ represent the current productivity level of the search process, with the current search reward given by:

ui(s, a) = sf(ai, a−i) − (1 − θ)ci(ai),

where f is a "matching" function and θ ∈ [0, 1] a search subsidy. Let the next state be drawn from the transition Q.

Let f and ci both be continuous and increasing, with f also supermodular. Similarly to Curtat (1996), Horst (2005) or Nowak (2007), assume the transition Q is given by assumption 6. Then theorem 4.5 implies existence of extremal Markov perfect Nash equilibria, each monotone in the productivity s and the subsidy θ. Further, by Theorem 4.6 and Corollary 4.2, we have existence of extremal invariant distributions that are monotone in the subsidy θ, and monotone comparative dynamics in θ. Adding strict convexity/concavity assumptions, we further obtain uniform approximation results. This allows us to produce a new result on monotone comparative statics and ordered equilibria. Additionally, we stress that, as observed by Nowak (2007), the least equilibrium may be trivial for a common (additive) specification of f. As we can use monotonicity to guarantee the existence of a (greatest) nontrivial equilibrium, this is an advantage of our constructive methods over the purely topological results of Curtat (1996), Amir (2002) or Horst (2005).
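The comparative statics in the subsidy θ can be sketched on a toy one-period version of the game, with a hypothetical multiplicative matching function f(a_i, a_j) = a_i a_j and quadratic cost c(a_i) = a_i² (chosen only for illustration):

```python
import numpy as np

# Toy one-period search game: u_i = s*a_i*a_j - (1 - theta)*a_i**2,
# actions in A_i(s) = [0, s]. Best-response iteration from the top of the
# action space computes the greatest equilibrium effort profile.
def greatest_effort(s, theta, npts=201, iters=300):
    grid = np.linspace(0.0, s, npts)
    def br(aj):
        return grid[np.argmax(s * grid * aj - (1.0 - theta) * grid**2)]
    a = np.array([s, s])
    for _ in range(iters):
        a = np.array([br(a[1]), br(a[0])])
    return a

base = greatest_effort(s=1.0, theta=0.4)
more_subsidy = greatest_effort(s=1.0, theta=0.7)
assert np.all(more_subsidy >= base)   # greatest effort increasing in theta
```

For the lower subsidy the iteration collapses (up to grid resolution) to near-zero effort, echoing the remark above that the least equilibrium can be trivial for such multiplicative specifications, while for the higher subsidy the greatest equilibrium sits at the corner a_i = s.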

6 Concluding remarks

We have presented two constructive methods for computing MNE in thispaper. Under very mild conditions on the game, we are able to develop

35 Originating from the Diamond (1982) search model.


a strategic dynamic programming approach to our class of stochastic supermodular games, and provide constructive methods for computing the equilibrium value set associated with MNE in the game. Aside from allowing us to weaken significantly the conditions for existence relative to the existing literature, we are also able to produce strategic dynamic programming methods that focus on Markovian equilibria (as opposed to more general subgame perfect equilibria). The set of MNE includes both stationary and nonstationary Markovian equilibria, but in all cases MNE exist on minimal state spaces, and our state spaces are allowed to be very general (i.e., compact intervals in finite dimensional Euclidean spaces).

Additionally, under mild mixing assumptions on the stochastic transitions, we are able to exploit the complementarity structure of our class of games and develop very sharp results for iterative procedures on both values and pure strategies, which provides very important improvements over the strategic dynamic programming/APS type methods that we develop in section 3. For example, under the somewhat stronger assumptions 1, 2 and 3, we are able to show how to compute both the greatest and least SMNE values v∗, w∗ ∈ V∗ = B(V∗) (as, of course, they are also Markov NE values), as well as the associated extremal pure strategy equilibria. As for every iteration of our APS operator we can keep track of a greatest and least NE value, our iterative methods under mixing assumptions verify that in these cases the APS value set satisfies Wt ⊆ [vt, wt]. So in this case, the iterations of section 4 provide interval bounds on the iterations of our strategic dynamic programming operator. Further, under these mixing assumptions, as the partial orders we use are in all cases chain-complete (i.e., both pointwise and set inclusion orders), we conclude that V∗ ⊆ [v∗, w∗]. That is, the set of value functions associated with MNE belongs to an ordered interval between the least and greatest SMNE values. So although we cannot show that Wt (for t ≥ 2) or V∗ are ordered intervals of functions, we can use our iterative methods to calculate the two bounds using the direct techniques of section 4.

This observation leads us to a final important point linking our direct methods with the strategic dynamic programming/APS approach. Namely, Abreu, Pearce, and Stacchetti (1990) show that, under certain assumptions, any value from V∗(s) can be obtained in a bang-bang equilibrium of a repeated game, i.e. one using extremal values from V∗(s). Our direct and constructive methods of section 4 can hence be used to compute the two extremal values that support equilibrium punishment schemes implementing MNE. This greatly sharpens the method by which we support all MNE in our collection of dynamic games. It also allows one to extend some of the ideas developed in the literature on repeated games to settings with state variables, which helps complement the important work of Judd, Yeltekin, and Conklin (2003) on developing numerical techniques to compute the entire equilibrium value correspondence. This particular issue will be pursued in our future research.


7 Proofs

7.1 Proof using indirect methods

We first prove a series of important lemmas that we use to prove the central theorem of this section. The main theorem, Theorem 3.1.1, will show that iterations of an operator B, starting from some upper element V ∈ V that is mapped down (under set inclusion), converge pointwise (in the Hausdorff metric topology on V) to the greatest fixed point W∗ of B. Further, by Theorem 3.1.2, the greatest fixed point of B under set inclusion is the set of all Markov Nash equilibrium values in V, i.e. W∗ = V∗.

Our first lemma considers the fixed point set of Gsv.

Lemma 7.1 (Complete lattice of NE(v, s)) If W ∈ V, then B(W) ∩ V ≠ ∅.

Proof of lemma 7.1: Let v ∈ W. Since v is an increasing function, Πi(vi, s, a) is supermodular in ai and has increasing differences in (ai, a−i). Therefore Gsv is a supermodular game parameterized by s. By Theorem 5 in Milgrom and Roberts (1990), it possesses a nonempty, complete lattice of pure strategy Nash equilibria. The greatest and least of them are increasing in s by the increasing differences assumption in (ai, s) and the monotonicity of s ⇉ Ai in the Veinott strong set order (see Theorem 6 in Milgrom and Roberts (1990)). Let s2 ≥ s1 and consider the greatest (or the least) Nash equilibrium a(s, v). The following holds for all i:

$$w_i(s_2) := \Pi_i(v_i, s_2, a_i(s_2, v_i), a_{-i}(s_2, v_i)) = \max_{a_i\in A_i(s_2)} \Pi_i(v_i, s_2, a_i, a_{-i}(s_2, v_i))$$
$$\geq \Pi_i(v_i, s_2, a_i(s_1, v_i), a_{-i}(s_2, v_i)) \geq \Pi_i(v_i, s_1, a_i(s_1, v_i), a_{-i}(s_2, v_i))$$
$$\geq \Pi_i(v_i, s_1, a_i(s_1, v_i), a_{-i}(s_1, v_i)) =: w_i(s_1),$$

where the first inequality follows as Ai is ascending in the set inclusion order, the second by the monotonicity of Πi in s, and the last by the monotonicity of s → a−i(s, vi) and a−i → Πi(vi, s, ai, a−i). Therefore, w = (w1, ..., wn) is a selection from B(W) ∩ V.

We next prove a result we need to study the convergence properties ofB. We use it in the proof of lemma 7.3 and theorem 3.1.

Lemma 7.2 (Convergence of values) Let vt → v in the weak*-topology on L∞(μ),36 and at → a as t → ∞. Then Π(vt, s, at) → Π(v, s, a) pointwise in s ∈ S.

36 vt → v in the weak* topology if and only if ∫_S vt(s)g(s)μ(ds) → ∫_S v(s)g(s)μ(ds) for all g ∈ L1(μ).


Proof of lemma 7.2: By the continuity assumptions 1, we just need to show that:

$$\int_S v_i^t(s')q(s'|s,a^t)\,\mu(ds') \to \int_S v_i(s')q(s'|s,a)\,\mu(ds').$$

Indeed, by the triangle inequality:

$$\left|\int_S v_i^t(s')q(s'|s,a^t)\,\mu(ds') - \int_S v_i(s')q(s'|s,a)\,\mu(ds')\right|$$
$$\leq \left|\int_S v_i^t(s')q(s'|s,a^t)\,\mu(ds') - \int_S v_i^t(s')q(s'|s,a)\,\mu(ds')\right| + \left|\int_S v_i^t(s')q(s'|s,a)\,\mu(ds') - \int_S v_i(s')q(s'|s,a)\,\mu(ds')\right|.$$

Using the bounds on u, we obtain that the expression above is

$$\leq \bar u\int_S \left|q(s'|s,a^t) - q(s'|s,a)\right|\mu(ds') + \left|\int_S v_i^t(s')q(s'|s,a)\,\mu(ds') - \int_S v_i(s')q(s'|s,a)\,\mu(ds')\right|. \qquad (1)$$

By the definition of weak*-convergence, the second term above converges to 0. We now show that the first term in (1) converges to 0. By Assumption 1, the function under the integral converges to 0 pointwise. Note that |q(s'|s, a) − q(s'|s, at)| ≤ 2‖q(s'|s, ⋅)‖∞. Since, by Assumption 1, ‖q(s'|s, ⋅)‖∞ is a μ-integrable function, the convergence of the integrals follows from the Lebesgue dominated convergence theorem.

As a result, Π(vt, s, at) → Π(v, s, a) pointwise in s ∈ S.

Lemma 7.3 (Compactness of Wt) For each t ∈ ℕ, Wt is a compact set in the weak*-topology on L∞(μ).

Proof of lemma 7.3: Since V is a set of functions with image in [0, ū], by Alaoglu's theorem it is weak*-compact. We now show that Wt is compact for t > 1. To do this, it is sufficient to show that B(W) is weak*-compact whenever W is weak*-compact.

Let wt ∈ B(W) for all t. By the definition of B(W), we have

wt(s) = Π(vt, s, at(s))


where vt ∈ W and at(s) ∈ NE(vt, s). As both W and V are compact, without loss of generality assume vt → v∗, where v∗ ∈ W, and wt → w∗, where convergence in both cases is with respect to the weak*-topology. Fix s > 0. Then at(s) → a∗. We now show that a∗ is a Nash equilibrium of the reduced game Gsv∗. By Lemma 7.2, we have

Π(vt, s, at) → Π(v∗, s, a∗)

pointwise in s. Hence, a∗ is a Nash equilibrium in the static game Gsv∗. For each s, we can define a∗(s) as a Nash equilibrium function37. Therefore, we have

w∗(s) := Π(v∗, s, a∗(s))

with w∗ ∈ B(W) and a∗(s) ∈ NE(v∗, s). Hence w∗ is a weak*-limit of the sequence wt, as it is a pointwise limit.

Lemma 7.4 (Self-generation) If W ⊂ B(W), then B(W) ⊂ V∗.

Proof of lemma 7.4: Let w ∈ B(W). Then we have v0(⋅) := w(⋅) = Π(v1, ⋅, σ1(⋅)) for some v1 ∈ W and a Nash equilibrium σ1(s) ∈ NE(v1, s). Since v1 ∈ W, by the assumption v1 ∈ B(W). Consequently, for vt ∈ W ⊂ B(W) (t ≥ 1), we can choose vt+1 ∈ W such that vt(⋅) = Π(vt+1, ⋅, σt+1(⋅)) with σt+1(⋅) ∈ NE(vt+1, ⋅). Clearly, the Markovian strategy σ = (σ1, σ2, . . .) generates the payoff vector w. We next need to show that this is a Nash equilibrium of the stochastic game for (μ-almost) all initial states. Suppose that only player i uses some other strategy σ̃i. Then, for all t, we have vti(s) = Πi(vt+1, s, σt+1(s)) ≥ Πi(vt+1, s, σ̃t+1i, σt+1−i(s)). If we take the T-truncated deviation38 σ̃T,∞i = (σ̃1i, . . . , σ̃Ti, σT+1i, σT+2i, . . .), this strategy cannot improve the payoff of player i. Indeed,

Ui(σ, s) ≥ Ui(σ−i, σ̃T,∞i, s) → Ui(σ−i, σ̃i, s)

as T → ∞. This convergence obtains because ui is bounded, and the residual of the sum Ui(σ−i, σ̃T,∞i, s) depending on (σT+1, σT+2, . . .) is bounded by ū multiplied by βTi.

Proof of theorem 3.1: We prove 1. As V is a complete lattice and B is increasing, by Tarski's theorem B has a greatest fixed point W∗. Moreover, as B is increasing, {Wt}∞t=0 is a decreasing sequence (under set inclusion).

37 In general a∗ may be non-measurable; however, for our purposes it is enough to obtain measurability of w∗.

38 That is, agent i uses strategy σ̃i up to stage T and σi thereafter. The remaining agents use σ−i.


Let V∞ := lim_{t→∞} Wt = ∩_{t=1}^∞ Wt. We need to show that V∞ = W∗. Clearly, V∞ ⊂ Wt for all t ∈ ℕ; hence

$$B(V_\infty) = B\left(\bigcap_{t=1}^{\infty} W_t\right) \subset \bigcap_{t=1}^{\infty} B(W_t) = \bigcap_{t=1}^{\infty} W_{t+1} = V_\infty.$$

To show equality, it suffices to show V∞ ⊂ B(V∞). Let w ∈ V∞. Then w ∈ Wt for all t. By the definition of Wt and B, we obtain the existence of a sequence vt ∈ Wt and Nash equilibria at such that

w(s) = Π(vt, s, at).

Since V is compact, without loss of generality assume vt weak* converges to v∗. Moreover, v∗ ∈ V∞, since {Wt} is a descending sequence of weak*-compact sets. Fix an arbitrary s > 0. Without loss of generality, let at → a∗, where a∗ is some point in A. We now show a∗ is a Nash equilibrium in the static game Gsv∗. Let ai ∈ Ai(s). Then, for all τ ∈ ℕ:

Πi(vτi, s, aτ) ≥ Πi(vτi, s, aτ−i, ai).

By lemma 7.2, taking the limit in this expression, we obtain that a∗ is a Nash equilibrium in the static game Gsv∗, and w(s) = Π(v∗, s, a∗). Hence w ∈ B(V∞). Thus V∞ is a fixed point of B and, by definition, V∞ ⊂ W∗.

To finish the proof, we simply need to show W∗ ⊂ V∞. Since W∗ ⊂ V, W∗ = B(W∗) ⊂ B(V) = W1. By induction, we have W∗ ⊂ Wt for all t; hence W∗ ⊂ V∞. Therefore, W∗ = V∞, which completes the proof.

We prove 2. First, we show that V∗ is a fixed point of the operator B. Clearly B(V∗) ⊂ V∗, so we just need to show the reverse inclusion. Let v ∈ V∗ and σ = (σ1, σ2, . . .) be a profile supporting v. By assumption 1, σ2,∞ = (σ2, σ3, . . .) must be a Nash equilibrium μ-almost everywhere (i.e., the set S0 of initial states for which σ2,∞ is not a Markov equilibrium must satisfy μ(S0) = 0). Define a new profile σ̃(s) = σ2,∞ for s ∉ S0 and σ̃(s) = σ for s ∈ S0. Let ṽ be the equilibrium payoff generated by σ̃. Clearly, ṽ ∈ V∗, ṽ is μ-measurable, and v(s) = Π(ṽ, s, σ1). Thus v ∈ B(V∗), and hence V∗ ⊂ B(V∗). As a result, B(V∗) = V∗.

Finally, by the definition of W∗ (the greatest fixed point), we conclude that V∗ ⊂ W∗. To obtain the reverse inclusion, we apply lemma 7.4: indeed W∗ ⊂ B(W∗), and therefore W∗ ⊂ V∗. We thus obtain V∗ = W∗.

7.2 Proof using direct methods

We first state three lemmata that prove useful in verifying the existence of SMNE in our game. These lemmata are, in addition, useful in characterizing monotone iterative procedures for constructing the least and greatest


SMNE (relative to the pointwise partial order on ℬn(S)). More specifically, the first lemma concerns the structure of Nash equilibria (and their associated equilibrium payoffs) in our auxiliary game Gsv.

Lemma 7.5 (Monotone Nash equilibria in Gsv) Under assumptions 2 and 3, for every s ∈ S and value v ∈ ℬn(S), the game Gsv has a greatest Nash equilibrium ā(v, s) and a least Nash equilibrium a̲(v, s). Moreover, both equilibria are increasing in v.

Proof of lemma 7.5: Without loss of generality, fix s > 0. Define an auxiliary one-shot game, say Δ(τ), with action space A and payoff function for player i given by

$$H_i(a, \tau) := (1-\beta_i)u_i(s, a_i, a_{-i}) + \beta_i \sum_{j=1}^{L} \tau_{i,j}\, g_j(s, a_i, a_{-i}),$$

where τ := [τi,j], i = 1, . . . , n, j = 1, . . . , L, lies in T := ℝn×L endowed with the natural pointwise order. As supermodularity of a function on a sublattice of a direct product of lattices implies increasing differences (see Topkis (1998), Theorem 2.6.1), clearly, for each τ ∈ T, the game Δ(τ) is supermodular and satisfies all assumptions of Theorem 5 in Milgrom and Roberts (1990). Hence, there exists a complete lattice of Nash equilibria, with the greatest Nash equilibrium denoted N̄EΔ(τ) and the least Nash equilibrium N̲EΔ(τ). Moreover, for arbitrary i, the payoff function Hi(a, τ) has increasing differences in ai and τ; hence Δ(τ) also satisfies the conditions of Theorem 6 in Milgrom and Roberts (1990). As a result, both N̄EΔ(τ) and N̲EΔ(τ) are increasing in τ.

Step 2: For each s ∈ S, the game Gsv is a special case of Δ(τ) with τi,j = ∫_S vi(s′)λj(ds′|s). Therefore, by the previous step, the least and greatest Nash equilibria a̲(v, s) and ā(v, s) are increasing in v, for each s ∈ S.
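The lattice structure delivered by the Milgrom and Roberts theorem invoked above can be checked by brute force on a tiny finite example (hypothetical payoffs with increasing differences, not the game Δ(τ) itself): the set of pure Nash equilibria is nonempty and contains its coordinatewise least and greatest elements.

```python
import itertools

# Hypothetical two-player supermodular game on the chain {0, 1, 2}:
# u_i(a_i, a_j) = a_i*a_j - c*a_i has increasing differences in (a_i, a_j).
A = [0, 1, 2]
c = 1.5

def u(ai, aj):
    return ai * aj - c * ai

def is_nash(a):
    # No player gains by a unilateral deviation.
    return all(u(a[i], a[1 - i]) >= u(d, a[1 - i]) for i in (0, 1) for d in A)

ne = [a for a in itertools.product(A, A) if is_nash(a)]
least = tuple(min(x) for x in zip(*ne))        # coordinatewise minimum
greatest = tuple(max(x) for x in zip(*ne))     # coordinatewise maximum
assert least in ne and greatest in ne          # extremal profiles are equilibria
```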

In our next lemma, we show that with each extremal Nash equilibrium (for state s and continuation v) we can associate an equilibrium payoff that preserves monotonicity in v. To do this, we first compute the values of the greatest (resp., least) best responses, given a continuation value v and state s, as follows:

Π̄∗i(v, s) := Πi(vi, s, āi(v, s), ā−i(v, s)),

and similarly

Π̲∗i(v, s) := Πi(vi, s, a̲i(v, s), a̲−i(v, s)).

We now have the following lemma:

Lemma 7.6 (Monotone values in Gsv) Under assumptions 2 and 3, both Π̄∗i(v, s) and Π̲∗i(v, s) are monotone in v.


Proof of lemma 7.6: The function Πi is increasing in a−i and vi. For v2 ≥ v1, by Lemma 7.5 we have ā(v2, s) ≥ ā(v1, s). Hence,

$$\bar\Pi^*_i(v^2, s) = \max_{a_i\in A_i(s)} \Pi_i(v^2_i, s, a_i, \bar a_{-i}(v^2, s)) \geq \max_{a_i\in A_i(s)} \Pi_i(v^1_i, s, a_i, \bar a_{-i}(v^2, s))$$
$$\geq \max_{a_i\in A_i(s)} \Pi_i(v^1_i, s, a_i, \bar a_{-i}(v^1, s)) = \bar\Pi^*_i(v^1, s).$$

A similar argument proves the monotonicity of Π̲∗i(v, s).

To show that T̄(⋅)(s) = Π̄∗(⋅, s) and T̲(⋅)(s) = Π̲∗(⋅, s) are well-defined transformations of ℬn(S), we use the techniques introduced by Nowak and Raghavan (1992).

Lemma 7.7 (Measurable equilibria and values of Gsv) Under assumptions 2 and 3 we have:

∙ T̄ : ℬn(S) → ℬn(S) and T̲ : ℬn(S) → ℬn(S),

∙ the functions s → ā(v, s) and s → a̲(v, s) are measurable for any v ∈ ℬn(S).

Proof of lemma 7.7: For v ∈ ℬn(S) and s ∈ S, define the function Fv : A × S → ℝ as follows:

$$F_v(a, s) = \sum_{i=1}^{n} \Pi_i(v_i, s, a) - \sum_{i=1}^{n} \max_{z_i\in A_i(s)} \Pi_i(v_i, s, z_i, a_{-i}).$$

Observe that Fv(a, s) ≤ 0. Consider the problem:

$$\max_{a\in \times_i A_i(s)} F_v(a, s).$$

By assumptions 2 and 3, the objective Fv is a Carathéodory function, and the (joint) feasible correspondence A(s) = ×iAi(s) is weakly measurable. By a standard measurable maximum theorem (e.g. Theorem 18.19 in Aliprantis and Border (1999)), the correspondence Nv : S ↠ ×iAi(s) defined as

$$N_v(s) := \arg\max_{a\in \times_i A_i(s)} F_v(a, s)$$

is measurable with nonempty compact values. Further, observe that Nv(s), by definition, is the set of all Nash equilibria of the game Gsv. Therefore, to finish the proof of our first assertion, for some player i, consider the problem max_{a∈Nv(s)} Πi(vi, s, a). Again, by the measurable maximum theorem, the value function Π̄∗i(v, s) is measurable. A similar argument shows that each


Π̲∗i(v, s) is measurable. Therefore, we have well-defined value operators T̄ : ℬn(S) → ℬn(S) and T̲ : ℬn(S) → ℬn(S).

To show the second assertion of the lemma, for some player i, again consider the problem max_{a∈Nv(s)} aji for some j ∈ {1, 2, . . . , ki}. Again, appealing to the measurable maximum theorem, the (maximizing) selection ā(v, s) (respectively, a̲(v, s)) is measurable in s.

Proof of theorem 4.1: Proof of 1. Clearly φ1 ≤ φ2 and v1 ≤ v2. Suppose φt ≤ φt+1 and vt ≤ vt+1. By the definition of the sequence {vt} and lemma 7.6, we have vt+1 ≤ vt+2. Then, by Lemma 7.5, the definition of {φt}, and the induction hypothesis, we obtain φt+1(s) = a̲(vt+1, s) ≤ a̲(vt+2, s) = φt+2(s). Similarly, we obtain the monotonicity of ψt and wt.

Proof of 2: Clearly, the thesis is satisfied for t = 1. By induction, suppose the thesis is satisfied for some t. Since vt ≤ wt, by Lemma 7.6 we obtain

vt+1(s) = Π̲∗(vt, s) ≤ Π̲∗(wt, s) ≤ Π̄∗(wt, s) = wt+1(s).

Then, by Lemma 7.5, we obtain

φt+1(s) = a̲(vt+1, s) ≤ a̲(wt+1, s) ≤ ā(wt+1, s) = ψt+1(s).

Proof of 3-4: This is clear, since for each s ∈ S the sequences of values vt, wt and associated pure strategies φt and ψt are bounded and, by the previous steps, monotone.

Proof of 5: By the definitions of vt and φt, we obtain

$$v^{t+1}_i(s) = (1-\beta_i)u_i(s, \varphi^t(s)) + \beta_i \sum_{j=1}^{L} g_j(s, \varphi^t(s)) \int_S v^t_i(s')\,\lambda_j(ds'|s)$$
$$\geq (1-\beta_i)u_i(s, a_i, \varphi^t_{-i}(s)) + \beta_i \sum_{j=1}^{L} g_j(s, a_i, \varphi^t_{-i}(s)) \int_S v^t_i(s')\,\lambda_j(ds'|s),$$

for arbitrary ai ∈ Ai(s). By the continuity of ui and g and the Lebesgue dominated convergence theorem, taking the limit t → ∞, we obtain

$$v^{*}_i(s) = (1-\beta_i)u_i(s, \varphi^*(s)) + \beta_i \sum_{j=1}^{L} g_j(s, \varphi^*(s)) \int_S v^*_i(s')\,\lambda_j(ds'|s)$$
$$\geq (1-\beta_i)u_i(s, a_i, \varphi^*_{-i}(s)) + \beta_i \sum_{j=1}^{L} g_j(s, a_i, \varphi^*_{-i}(s)) \int_S v^*_i(s')\,\lambda_j(ds'|s),$$


which, by Lemma 7.7, implies that $\sigma^*$ is a pure stationary (measurable) Nash equilibrium, and $v^*$ is its associated (measurable) equilibrium payoff. Analogously, $\gamma^*$ is a pure strategy (measurable) Nash equilibrium, and $w^*$ is its associated (measurable) equilibrium payoff.
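The two-sided successive approximation behind this proof can be illustrated with a scalar stand-in for the extremal payoff operators. The map `T` and the numbers below (`beta`, `u_bar`) are illustrative assumptions, not the paper's model; the sketch only shows the monotone bracketing property.

```python
# One-dimensional stand-in for the iterations v^{t+1} = T(v^t), w^{t+1} = T(w^t):
# T is an increasing self-map of the order interval [0, u_bar].
beta, u_bar = 0.9, 1.0

def T(v):
    # increasing in v and maps [0, u_bar] into itself (illustrative form)
    return (1 - beta) * u_bar + beta * (0.5 * v + 0.25 * v * v)

v, w = 0.0, u_bar              # v^1 = lower bound, w^1 = upper bound
vs, ws = [v], [w]
for _ in range(200):
    v, w = T(v), T(w)
    vs.append(v)
    ws.append(w)

assert all(a <= b + 1e-12 for a, b in zip(vs, vs[1:]))  # v^t is increasing
assert all(a >= b - 1e-12 for a, b in zip(ws, ws[1:]))  # w^t is decreasing
assert all(a <= b + 1e-12 for a, b in zip(vs, ws))      # v^t <= w^t throughout
assert abs(vs[-1] - ws[-1]) < 1e-9                      # here both iterates share one limit
```

In the paper the two limits need not coincide in general; they are the least and greatest equilibrium values.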

Proof of theorem 4.2: Step 1. We prove the desired inequality for equilibrium payoffs. Since $0 \le \omega^* \le u$, by Lemma 7.6 and the definition of $v^t$ and $w^t$, we obtain
$$v^1 \le \omega^* \le w^1.$$

By induction, let $v^t \le \omega^* \le w^t$. Again, from Lemma 7.6 we have:
$$v^{t+1} = \underline{\Pi}^*(v^t, s) \le \underline{\Pi}^*(\omega^*, s) \le \omega^*(s) \le \overline{\Pi}^*(\omega^*, s) \le \overline{\Pi}^*(w^t, s) = w^{t+1}.$$

Taking the limit in $t$, we obtain the desired inequality for equilibrium payoffs. Step 2: By the previous step and Lemma 7.5, we obtain:
$$\sigma^*(s) = \underline{a}(v^*, s) \le \underline{a}(\omega^*, s) \le f^*(s) \le \overline{a}(\omega^*, s) \le \overline{a}(w^*, s) = \gamma^*(s),$$
where $f^*$ denotes the stationary equilibrium strategy associated with $\omega^*$.

For fixed continuation value $v$, let:
$$M_{i,j,l} := \sup_{s \in S,\, a \in A(s)} \frac{\dfrac{\partial^2 \Pi_i}{\partial a^j_i \partial s_l}}{-\displaystyle\sum_{\iota=1}^{n}\sum_{j'=1}^{k_\iota} \dfrac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{j'}_\iota}},$$
$$M := \max\left\{M_{i,j,l} : i = 1, \ldots, n,\ j = 1, \ldots, k_i,\ l = 1, \ldots, k\right\}.$$
By assumption 4, the constant $M$ is a strictly positive real number.

Lemma 7.8 Let assumptions 2, 3, 4 be satisfied and let the constraint functions satisfy $\bar a_i \in CM^{k_i}$. Fix $v \in \mathcal{B}^n(S)$, and assume it is Lipschitz continuous. Consider the auxiliary game $G^s_v$. Then there is a unique Nash equilibrium $a^*(v, s)$ in this game, and it belongs to $CM^{\sum_i k_i}$.

Proof of lemma 7.8: Let $s > 0$ and let $v \in \mathcal{B}^n(S)$ be a Lipschitz continuous function. To simplify notation, we drop $v$. Let $x^1(s) = \bar a(s)$ and
$$x^{t+1}_i(s) := \arg\max_{a_i \in A_i(s)} \Pi_i(s, a_i, x^t_{-i}(s)) \quad \text{for } t \ge 1.$$
This is well defined by strict concavity of $\Pi_i$ in $a_i$. Clearly, $x^1$ is nondecreasing and Lipschitz continuous with a constant less than $M$. By induction, assume that this claim holds


for $t \in \mathbb{N}$. Note that $(s, a_i) \mapsto \Pi_i(s, a_i, x^t_{-i}(s))$ has increasing differences. Indeed, if we take $s_1 \le s_2$ and $y_1 \le y_2$, then $x^t_{-i}(s_1) \le x^t_{-i}(s_2)$ and
$$\Pi_i(s_1, y_2, x^t_{-i}(s_1)) - \Pi_i(s_1, y_1, x^t_{-i}(s_1)) \le \Pi_i(s_1, y_2, x^t_{-i}(s_2)) - \Pi_i(s_1, y_1, x^t_{-i}(s_2)) \le \Pi_i(s_2, y_2, x^t_{-i}(s_2)) - \Pi_i(s_2, y_1, x^t_{-i}(s_2)).$$

Therefore, since $A_i(\cdot)$ is ascending in the Veinott strong set order, by Theorem 6.1 in Topkis (1978) we obtain that $x^{t+1}_i(\cdot)$ is isotone. We next show that $x^{t+1}_i(\cdot)$ is Lipschitz continuous with constant $M$. To do this, we check the hypotheses of Theorem 2.4(ii) in Curtat (1996). Define $\varphi(s) = s_1 + \ldots + s_k$ and $\mathbf{1}_i := (1, 1, \ldots, 1) \in \mathbb{R}^{k_i}$. We show that the function $(s, y) \mapsto \Pi^*(s, y) := \Pi_i(s, M\varphi(s)\mathbf{1}_i - y, x^t_{-i}(s))$ has increasing differences. Note that $M\varphi(s) - \bar a_i(s) \le y \le M\varphi(s)$. We show that the collection of sets $Y(s) := [M\varphi(s) - \bar a_i(s), M\varphi(s)]$ is ascending in the Veinott strong set order. Let $s_1 \ge s_2$ in the product order. Then,
$$M\varphi(s_1) - \bar a^j_i(s_1) - \left(M\varphi(s_2) - \bar a^j_i(s_2)\right) = M\|s_1 - s_2\|_1 - \left|\bar a^j_i(s_1) - \bar a^j_i(s_2)\right| \ge 0.$$
This therefore implies that the lower bound of $Y(s)$ is increasing in $s$. Clearly, the upper bound of $Y(s)$ is increasing as well. Hence $Y(s)$ is ascending in the Veinott strong set order.

Note that since, for each $l$, the map $s_l \mapsto x^t(s)$ is monotone and continuous, it must be differentiable almost everywhere (Royden (1968)). By the $M$-Lipschitz property of $x^t$, we conclude that each partial derivative is bounded by $M$. Hence, for all $l = 1, \ldots, k$, $i = 1, \ldots, n$ and $j = 1, \ldots, k_i$, we have:
$$\frac{\partial \Pi^*_i}{\partial y^j_i} = -\frac{\partial \Pi_i}{\partial a^j_i}.$$


Next, we have (for fixed $s_{-l}$):
$$\begin{aligned}
\frac{\partial^2 \Pi^*_i}{\partial y^j_i \partial s_l} &= -\frac{\partial^2 \Pi_i}{\partial a^j_i \partial s_l} - M\sum_{\nu=1}^{k_i} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{\nu}_i}\,\frac{\partial \varphi}{\partial s_l} - \sum_{\iota \ne i}\sum_{j'=1}^{k_\iota} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{j'}_\iota}\,\frac{\partial x^t_{\iota,j'}}{\partial s_l} \\
&\ge -\frac{\partial^2 \Pi_i}{\partial a^j_i \partial s_l} - M\sum_{\nu=1}^{k_i} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{\nu}_i} - M\sum_{\iota \ne i}\sum_{j'=1}^{k_\iota} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{j'}_\iota} \\
&= -\frac{\partial^2 \Pi_i}{\partial a^j_i \partial s_l} - M\sum_{\iota=1}^{n}\sum_{j'=1}^{k_\iota} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{j'}_\iota} \\
&= \left(-\sum_{\iota=1}^{n}\sum_{j'=1}^{k_\iota} \frac{\partial^2 \Pi_i}{\partial a^j_i \partial a^{j'}_\iota}\right)\left(M - \frac{\partial^2 \Pi_i / \partial a^j_i \partial s_l}{-\sum_{\iota=1}^{n}\sum_{j'=1}^{k_\iota} \partial^2 \Pi_i / \partial a^j_i \partial a^{j'}_\iota}\right) \ge 0
\end{aligned}$$
almost everywhere, where the inequality uses $0 \le \partial x^t_{\iota,j'} / \partial s_l \le M$ together with the nonnegativity of the cross partial derivatives. Since $\partial \Pi^*_i / \partial y^j_i$ is continuous, by Theorem 6.2 in Topkis (1978) the solution of the optimization problem $y \mapsto \Pi^*(v, s, y)$ (say $y^*(s, v)$) is isotone. From the definition of $y^*(s, v)$ and $x^{t+1}$, if $s_1 \le s_2$ we have

$$0 \le x^{t+1}_{i,j}(s_2) - x^{t+1}_{i,j}(s_1) \le M(\varphi(s_2) - \varphi(s_1)) = M\|s_1 - s_2\|_1.$$
Analogously, we prove the appropriate inequality whenever $s_1 \ge s_2$. If $s_1$ and $s_2$ are incomparable, then

$$x^{t+1}_{i,j}(s_1) - x^{t+1}_{i,j}(s_2) \le x^{t+1}_{i,j}(s_1 \vee s_2) - x^{t+1}_{i,j}(s_1 \wedge s_2) \le M\|s_1 - s_2\|_1,$$
since $\|s_1 - s_2\|_1 = \|s_1 \vee s_2 - s_1 \wedge s_2\|_1$. Similarly, we prove that:
$$x^{t+1}_{i,j}(s_1) - x^{t+1}_{i,j}(s_2) \ge -M\|s_1 - s_2\|_1.$$

But this implies that $x^{t+1}$ is also $M$-Lipschitz continuous, and hence each $x^t$ is $M$-Lipschitz continuous. Since $\Pi_i$ has increasing differences in $(a_i, a_{-i})$, by Theorem 6.2 in Topkis (1978) we know that the operator $x \mapsto \arg\max_{y_i \in A_i(s)} \Pi_i(s, y_i, x_{-i})$ is increasing. Therefore $x^t(s)$ must be decreasing in $t$. This implies that there exists $a^* = \lim_{t \to \infty} x^t$, which is isotone and $M$-Lipschitz continuous. Uniqueness of the Nash equilibrium follows from assumption 4 and Gabay and Moulin (1980); hence $a^* = a^*(s, v)$ for $s > 0$.

Finally, $\Pi(0, a) = 0$ for all $a \in A$; hence we can define $a^*(0) := \lim_{s \to 0^+} a^*(s)$ and obtain a unique Nash equilibrium $a^*(s, v)$ that is isotone and $M$-Lipschitz continuous in $s$.
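The decreasing best-response iteration used in this proof can be sketched numerically for a two-player quadratic game satisfying a diagonal dominance condition. All payoff coefficients below are hypothetical, chosen only so that the cross partial (0.5) is dominated by the own second derivative (2):

```python
# Two-player game on A_i = [0, 2] with illustrative payoff
#   Pi_i(a_i, a_j) = -a_i**2 + a_i + 0.5*a_i*a_j,
# whose best response is BR_i(a_j) = clip((1 + 0.5*a_j)/2, 0, 2).
def br(a_other):
    return min(max((1 + 0.5 * a_other) / 2, 0.0), 2.0)

x = (2.0, 2.0)                      # x^1 = upper bound of the action sets
path = [x]
for _ in range(100):
    x = (br(x[1]), br(x[0]))        # simultaneous best-response update
    path.append(x)

# The iterates decrease coordinatewise to the unique equilibrium a* = (2/3, 2/3).
for prev, cur in zip(path, path[1:]):
    assert all(p >= c - 1e-12 for p, c in zip(prev, cur))
assert abs(x[0] - 2 / 3) < 1e-9 and abs(x[1] - 2 / 3) < 1e-9
```

Starting at the greatest joint action and iterating downward mirrors the construction of $x^1 = \bar a(s)$ and its decreasing limit in the proof.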


Proof of theorem 4.3: To simplify notation, let $L = 1$, and hence $g(s, a) := g_1(s, a)$ and $\Lambda(f)(s) := \int_S f(s')\,\lambda_1(ds' \mid s)$. Let $v \in \mathcal{B}^n(S)$ be Lipschitz continuous. Under assumption 4, $a^*(s, v)$ is well defined (for $s > 0$), as the auxiliary game satisfies the conditions of Gabay and Moulin (1980). Let $\Pi_i(v_i, s, a) = (1 - \beta_i)u_i(s, a) + \beta_i \Lambda(v_i)(s)\, g(s, a)$, and observe that $\Pi_i(v_i, s, \cdot)$ also has the strict diagonal dominance property, and obviously has cardinal complementarities. Here, note that
$$\mathcal{L}(\Pi_i(v_i, s, \cdot)) = (1 - \beta_i)\mathcal{L}(u_i(s, \cdot)) + \beta_i \Lambda(v_i)(s)\,\mathcal{L}(g(s, \cdot)) < 0.$$

Note further that, applying Royden (1968) and the continuity of the left side of the expression below, we have
$$\frac{\dfrac{\partial^2 \Pi_i(v, s, \cdot)}{\partial a^j_i \partial s_l}}{-\displaystyle\sum_{\iota=1}^{n}\sum_{j'=1}^{k_\iota} \dfrac{\partial^2 \Pi_i(v, s, \cdot)}{\partial a^j_i \partial a^{j'}_\iota}} = \frac{(1-\beta_i)\dfrac{\partial^2 u_i}{\partial a^j_i \partial s_l} + \beta_i \dfrac{\partial \Lambda(v)(s)}{\partial s_l}\dfrac{\partial g(s, a)}{\partial a^j_i} + \beta_i \Lambda(v)(s)\dfrac{\partial^2 g(s, a)}{\partial a^j_i \partial s_l}}{-(1-\beta_i)\mathcal{L}(u_i) - \beta_i \Lambda(v)(s)\mathcal{L}(g)} \le \frac{(1-\beta_i)U^2_{i,j,l} + \beta_i \delta\, G^1_{i,j} + \beta_i u\, G^2_{i,j,l}}{-(1-\beta_i)\mathcal{L}(u_i)} \le M_0.
$$

By Lemma 7.8, we know that $a^*(\cdot, v) \in CM_0^{\sum_i k_i}$. The following argument shows that $Tv(s)$ is Lipschitz continuous:

$$\begin{aligned}
|T_i v(s_1) - T_i v(s_2)| &\le (1-\beta_i)\,|u_i(s_1, a^*(s_1, v)) - u_i(s_2, a^*(s_2, v))| \\
&\quad + \beta_i\,|\Lambda(v_i)(s_1) - \Lambda(v_i)(s_2)|\; g(s_1, a^*(s_1, v)) \\
&\quad + \beta_i \Lambda(v_i)(s_2)\,|g(s_1, a^*(s_1, v)) - g(s_2, a^*(s_2, v))| \\
&\le (U_1 + M_0 U_2 + \delta + (G_1 + G_2 M_0)u)\,\|s_1 - s_2\|_1 = M_1\|s_1 - s_2\|_1,
\end{aligned}$$

where
$$U_1 := \sum_{l=1}^{h} \sup_{s \in S,\, a \in A(s)} \frac{\partial u_i}{\partial s_l}(s, a), \qquad U_2 := \sum_{i=1}^{n}\sum_{j=1}^{k_i} \sup_{s \in S,\, a \in A(s)} \frac{\partial u_i}{\partial a^j_i}(s, a),$$
$$G_1 := \sum_{l=1}^{h} \sup_{s \in S,\, a \in A(s)} \frac{\partial g}{\partial s_l}(s, a), \qquad G_2 := \sum_{i=1}^{n}\sum_{j=1}^{k_i} \sup_{s \in S,\, a \in A(s)} \frac{\partial g}{\partial a^j_i}(s, a),$$
and $M_1 := U_1 + M_0 U_2 + \delta + (G_1 + G_2 M_0)u$, where $\delta$ denotes the Lipschitz constant of $s \mapsto \Lambda(v_i)(s)$. Hence the image of the operator $T$ is a subset of $CM_1^n$. Therefore, the thesis is proven.

Proof of theorem 4.4: On $CM^n$ define the operator $T(v)(s) = \Pi^*(v, s)$. By a standard argument (e.g., Curtat (1996)), $T : CM^n \to CM^n$ is continuous and increasing on $CM^n$. By Tarski's (1955) theorem, it therefore


has a nonempty complete lattice of fixed points, say $FP(T)$. Further, for each fixed point $v^*(\cdot) \in FP(T)$, there is a corresponding unique stationary Markov Nash equilibrium $a^*(v^*, \cdot)$.

Lemma 7.9 Let $X$ be a lattice and $Y$ a poset. Assume (i) $F : X \times Y \to \mathbb{R}$ and $G : X \times Y \to \mathbb{R}$ have increasing differences, and (ii) for all $y \in Y$, $G(\cdot, y)$ and $\psi : Y \to \mathbb{R}$ are increasing functions. Then the function $H$ defined by $H(x, y) = F(x, y) + \psi(y)G(x, y)$ has increasing differences.

Proof of lemma 7.9: Under the hypotheses of the lemma, it suffices to show that $\psi(y)G(x, y)$ has increasing differences (as increasing differences is a cardinal property and is closed under addition). Let $y_1 > y_2$ and $x_1 > x_2$, with $(x_i, y_i) \in X \times Y$. By the hypothesis of increasing differences of $G$, and monotonicity of $\psi$ and $G(\cdot, y)$, we have the following inequality:
$$\psi(y_1)\left(G(x_1, y_1) - G(x_2, y_1)\right) \ge \psi(y_2)\left(G(x_1, y_2) - G(x_2, y_2)\right).$$
Therefore,
$$\psi(y_1)G(x_1, y_1) + \psi(y_2)G(x_2, y_2) \ge \psi(y_1)G(x_2, y_1) + \psi(y_2)G(x_1, y_2).$$
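The lemma can be verified numerically for one concrete instance; the choices of F, G, and the increasing multiplier function (here `psi`) are hypothetical examples satisfying the lemma's hypotheses, not objects from the paper:

```python
# Grid check of the increasing-differences claim H = F + psi*G on [0, 3].
def F(x, y): return x * y          # has increasing differences
def G(x, y): return x + y          # has increasing differences; G(., y) is increasing
def psi(y): return y               # increasing (and nonnegative on the grid)
def H(x, y): return F(x, y) + psi(y) * G(x, y)

grid = [i * 0.5 for i in range(7)]  # 0.0, 0.5, ..., 3.0
for x1 in grid:
    for x2 in grid:
        if x1 < x2:
            continue
        for y1 in grid:
            for y2 in grid:
                if y1 < y2:
                    continue
                # increasing differences: the x-difference grows with y
                assert H(x1, y1) - H(x2, y1) >= H(x1, y2) - H(x2, y2) - 1e-12
```

The check exhausts all comparable pairs on the grid, mirroring the pointwise inequality in the proof.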

Proof of theorem 4.5: Step 1. Let $v_\theta$ be a function $(s, \theta) \mapsto v_\theta(s)$ that is increasing. By assumption 6 and Lemma 7.9, the payoff function $\Pi_i(v_\theta, s, a, \theta)$ has increasing differences in $(a_i, \theta)$. Further, $\Pi_i$ clearly also has increasing differences in $(a_i, a_{-i})$. As $A_i(\theta)$ is ascending in Veinott's strong set order, by Theorem 6 in Milgrom and Roberts (1990), the greatest and the least Nash equilibria of the supermodular game $G^s_{v,\theta}$ are increasing selections. Further, by the same argument as in Lemma 7.6, as $A_i(\theta)$ is also ascending under set inclusion by assumption, we obtain monotonicity of the corresponding equilibrium payoffs.

Step 2: Note that, for each $\theta$, the parameterized stochastic game satisfies the conditions of Theorem 4.1. Further, noting that the initial values of the sequences $w^t_\theta(s)$ and $v^t_\theta(s)$ (constructed in Theorem 4.1) do not depend on $\theta$ and are isotone in $s$, by the previous step each iteration of both sequences of values is increasing with respect to $(s, \theta)$. Also, each of the iterations $\sigma^t_\theta(s)$ and $\gamma^t_\theta(s)$ is increasing in $(s, \theta)$. Therefore, as the pointwise partial order is closed, the limits of these sequences preserve this partial ordering, and the limits are increasing with respect to $(s, \theta)$.
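The monotone comparative statics of Step 1 can be illustrated in a parameterized two-player quadratic supermodular game; all coefficients and the functional form of the best response are hypothetical choices for illustration:

```python
# Parameterized two-player game on A_i = [0, 2] with illustrative payoff
#   Pi_i(a_i, a_j; theta) = -a_i**2 + theta*a_i + 0.5*a_i*a_j,
# so BR_i(a_j; theta) = clip((theta + 0.5*a_j)/2, 0, 2) is increasing in theta.
def br(a_other, theta):
    return min(max((theta + 0.5 * a_other) / 2, 0.0), 2.0)

def ne_from(theta, start):
    """Iterate best responses from a common starting point."""
    a = (start, start)
    for _ in range(200):
        a = (br(a[1], theta), br(a[0], theta))
    return a

for th_lo, th_hi in [(0.6, 1.2), (0.3, 2.0)]:
    g_lo, g_hi = ne_from(th_lo, 2.0), ne_from(th_hi, 2.0)  # from the top: greatest NE
    l_lo, l_hi = ne_from(th_lo, 0.0), ne_from(th_hi, 0.0)  # from the bottom: least NE
    assert all(b >= a - 1e-9 for a, b in zip(g_lo, g_hi))  # greatest NE increasing in theta
    assert all(b >= a - 1e-9 for a, b in zip(l_lo, l_hi))  # least NE increasing in theta
```

In this example the equilibrium is unique, so the least and greatest selections coincide; the point is only that both extremal selections move up with the parameter.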


For each equilibrium strategy $f$, define the operator
$$T^o_f(\mu)(A) = \int_S Q(A \mid s, f(s))\,\mu(ds), \qquad (2)$$
where $\mu^*$ is said to be invariant with respect to $f$ if and only if it is a fixed point of $T^o_f$.
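On a finite state space, the operator in (2) reduces to multiplication of a distribution by a row-stochastic matrix. The three-state kernel below is an illustrative, stochastically monotone example (not taken from the paper), iterated to an invariant distribution:

```python
# Finite-state sketch of (2): with S = {0, 1, 2} and a fixed stationary
# strategy f, the kernel Q(.|s, f(s)) becomes the row-stochastic matrix P.
P = [[0.8, 0.2, 0.0],
     [0.3, 0.4, 0.3],
     [0.0, 0.2, 0.8]]

def T_o(mu):
    """mu |-> integral of Q(.|s, f(s)) d mu(s), i.e. the row vector mu P."""
    return [sum(mu[s] * P[s][a] for s in range(3)) for a in range(3)]

mu = [0.0, 0.0, 1.0]        # start from the point mass on the greatest state
for _ in range(500):
    mu = T_o(mu)

# mu is now (numerically) invariant with respect to f: a fixed point of T_o.
nxt = T_o(mu)
assert max(abs(a - b) for a, b in zip(mu, nxt)) < 1e-12
assert abs(sum(mu) - 1.0) < 1e-9
```

Starting from the point mass on the greatest state matches the iteration used in the proof of Theorem 4.6 below.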

Proof of theorem 4.6: By Theorem 4.5, both $\sigma^*$ and $\gamma^*$ are increasing functions. By Assumption 3, $T^o_{\sigma^*}(\mu)([s, S]) = 1$ for $s \le 0$, and for $s > 0$:
$$T^o_{\sigma^*}(\mu)([s, S]) = \sum_{j=1}^{L} \int_S g_j(s', \sigma^*(s'))\,\lambda_j([s, S] \mid s')\,\mu(ds').$$
Since, by assumption, for each $s \in S$ the function under the integral is increasing, the right side increases pointwise whenever $\mu$ increases stochastically. Moreover, the family of probability measures on a compact state space $S$ ordered by $\succeq$ (first order stochastic dominance) is chain complete (as it is a compact ordered topological space; e.g., Amann (1977), Lemma 3.1 or Corollary 3.2). Hence, $T^o_{\sigma^*}$ satisfies the conditions of Markowsky's (1976) theorem (Theorem 9), and we conclude that the set of invariant distributions is chain complete, with greatest and least invariant distributions (see also Amann (1977), Theorem 3.3). By a similar construction, the same is true for the operator $T^o_{\gamma^*}$.

To show the second assertion, we first prove that $T^o_{\sigma^*}(\cdot)$ is weakly continuous (i.e., if $\mu_t \to \mu$ weakly, then $T^o_{\sigma^*}(\mu_t) \to T^o_{\sigma^*}(\mu)$ weakly). Let $\mu_t \to \mu$ weakly. By stochastic continuity of $\lambda_j(\cdot \mid s)$, the map $s' \mapsto \lambda_j([s, S] \mid s')$ is continuous. Therefore,
$$\int_S \lambda_j([s, S] \mid s')\,\mu_t(ds') \to \int_S \lambda_j([s, S] \mid s')\,\mu(ds')$$
for all $s$. This, in turn, implies that $T^o_{\sigma^*}(\mu_t) \to T^o_{\sigma^*}(\mu)$ weakly. Let $\mu^{\sigma^*}_t$ be the distribution of $s^{\sigma^*}_t$ and $\mu^{\sigma^*}_1 = \delta_S$. By the previous step, $\mu_t$ is stochastically decreasing. It is, therefore, weakly convergent to some $\mu^*$. By continuity of $T^o$, we have $\mu^* = T^o_{\sigma^*}(\mu^*)$. By the definition of $\Phi(\sigma^*)$, we immediately obtain $\mu^* \preceq \Phi(\sigma^*)$. By the stochastic monotonicity of $T^o_{\sigma^*}(\cdot)$, we can recursively obtain $\delta_S \succeq \mu^{\sigma^*}_t \succeq \Phi(\sigma^*)$, and hence $\mu^* \succeq \Phi(\sigma^*)$. As a result, we conclude that $\mu^* = \Phi(\sigma^*)$. Similarly, we show convergence of the sequence of distributions of $s^{\gamma^*}_t$.


Proof of corollary 4.2: By Theorem 4.6, there exist greatest fixed points of $T^{o,\theta_2}_{\sigma^*}$ and $T^{o,\theta_1}_{\sigma^*}$. Also, $T^{o,\theta}_{\sigma^*}$ is weakly continuous. Further, $\theta \mapsto T^{o,\theta}_{\sigma^*}$ is an increasing map under first order stochastic dominance on a chain complete poset of probability measures on the compact state set.

Consider the sequence of iterations from $\delta_S$ generated by $T^{o,\theta}_{\sigma^*}$ (the operator defined in (2), but associated with $Q(\cdot \mid s, a, \theta)$). Observe that, by the Kantorovich-Tarski theorem (Dugundji and Granas (1982), Theorem 4.2), we have
$$\sup_t T^{t,o,\theta_2}_{\sigma^*} = \mu_{\theta_2} \quad \text{and} \quad \sup_t T^{t,o,\theta_1}_{\sigma^*} = \mu_{\theta_1}.$$
For any $t$, we also have $T^{t,o,\theta_2}_{\sigma^*} \succeq T^{t,o,\theta_1}_{\sigma^*}$. Therefore, by weak continuity (and the fact that $\succeq$ is a closed order), we obtain:
$$\mu_{\theta_2} = \sup_t T^{t,o,\theta_2}_{\sigma^*} \succeq \sup_t T^{t,o,\theta_1}_{\sigma^*} = \mu_{\theta_1}.$$
Similarly, we proceed for $\gamma^*$.
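This ordering of invariant distributions can be seen in a finite-state sketch: two stochastically monotone kernels ordered row-by-row in first order stochastic dominance yield ordered invariant distributions. All matrix entries are illustrative assumptions:

```python
# Two illustrative stochastically monotone kernels on S = {0, 1, 2}, with
# P_hi dominating P_lo row-by-row in first order stochastic dominance.
P_lo = [[0.8, 0.2, 0.0], [0.3, 0.4, 0.3], [0.0, 0.2, 0.8]]
P_hi = [[0.6, 0.4, 0.0], [0.1, 0.4, 0.5], [0.0, 0.1, 0.9]]

def invariant(P):
    mu = [0.0, 0.0, 1.0]            # iterate from the top point mass
    for _ in range(1000):
        mu = [sum(mu[s] * P[s][a] for s in range(3)) for a in range(3)]
    return mu

def survival(mu):                    # (P(state >= 1), P(state >= 2))
    return (mu[1] + mu[2], mu[2])

m_lo, m_hi = invariant(P_lo), invariant(P_hi)
# The invariant distributions inherit the first order stochastic dominance ordering.
assert all(b >= a - 1e-9 for a, b in zip(survival(m_lo), survival(m_hi)))
```

The induction behind this is the one in the corollary: iterates under the dominating kernel dominate the iterates under the dominated kernel at every step, and the ordering survives in the limit.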

Remark 3 Since the norms $\|\cdot\|_1$ and $\|\cdot\|$ are equivalent on $\mathbb{R}^h$, and for all $s \in S$ we have $\|s\|_1 \le h\|s\|$, it follows that $v^*$, $w^*$, $\sigma^*$ and $\gamma^*$ are Lipschitz continuous when we endow $S$ with the maximum norm.

References

Abreu, D., D. Pearce, and E. Stacchetti (1986): "Optimal Cartel Equilibria with Imperfect Monitoring," Journal of Economic Theory, 39(1), 251–269.

(1990): "Toward a theory of discounted repeated games with imperfect monitoring," Econometrica, 58(5), 1041–1063.

Aguirregabiria, V., and P. Mira (2007): "Sequential Estimation of Dynamic Discrete Games," Econometrica, 75(1), 1–53.

Aliprantis, C. D., and K. C. Border (1999): Infinite Dimensional Analysis. A Hitchhiker's Guide. Springer Verlag: Heidelberg.

Amann, H. (1976): "Fixed point equations and nonlinear eigenvalue problems in ordered Banach spaces," SIAM Review, 18(4), 620–709.

(1977): "Order structures and fixed points," SAFA 2, ATTI del 2. Seminario di Analisi Funzionale e Applicazioni. MS.

Amir, R. (1996): "Strategic intergenerational bequests with stochastic convex production," Economic Theory, 8, 367–376.


(2002): "Complementarity and Diagonal Dominance in Discounted Stochastic Games," Annals of Operations Research, 114, 39–56.

(2003): "Stochastic games in economics and related fields: an overview," in Stochastic Games and Applications, ed. by A. Neyman, and S. Sorin, NATO Advanced Science Institutes Series D: Behavioural and Social Sciences. Kluwer Academic Press, Boston.

(2005): "Discounted supermodular stochastic games: theory and applications," MS. University of Arizona.

Atkeson, A. (1991): "International Lending with Moral Hazard and Risk of Repudiation," Econometrica, 59(4), 1069–89.

Balbus, L., and A. S. Nowak (2004): "Construction of Nash equilibria in symmetric stochastic games of capital accumulation," Mathematical Methods of Operations Research, 60, 267–277.

(2008): "Existence of perfect equilibria in a class of multigenerational stochastic games of capital accumulation," Automatica, 44(6).

Balbus, L., K. Reffett, and L. Wozny (2009): "A Constructive Geometrical Approach to the Uniqueness of Markov Perfect Equilibrium in Stochastic Games of Intergenerational Altruism," Arizona State University.

Barlo, M., G. Carmona, and H. Sabourian (2009): "Repeated games with one-memory," Journal of Economic Theory, 144(1), 312–336.

Cabral, L. M. B., and M. H. Riordan (1994): "The Learning Curve, Market Dominance, and Predatory Pricing," Econometrica, 62(5), 1115–40.

Chakrabarti, S. K. (1999): "Markov Equilibria in Discounted Stochastic Games," Journal of Economic Theory, 85(2), 294–327.

(2008): "Pure Strategy Markov Equilibrium in Stochastic Games with Concave Transition Probabilities," Department of Economics, Indiana University Purdue University Indianapolis.

Cole, H. L., and N. Kocherlakota (2001): "Dynamic Games with Hidden Actions and Hidden States," Journal of Economic Theory, 98(1), 114–126.

Curtat, L. (1992): "Supermodular stochastic games: existence and characterization of Markov equilibria," Ph.D. thesis, Stanford University.

(1996): "Markov equilibria of stochastic games with complementarities," Games and Economic Behavior, 17, 177–199.


Diamond, P. A. (1982): "Aggregate Demand Management in Search Equilibrium," Journal of Political Economy, 90(5), 881–94.

Doraszelski, U., and J. F. Escobar (2011): "Restricted Feedback in Long Term Relationships," Discussion paper.

Doraszelski, U., and M. Satterthwaite (2010): "Computable Markov-Perfect Industry Dynamics: Existence, Purification, and Multiplicity," Discussion paper.

Duggan, J., and T. Kalandrakis (2007): "Dynamic Legislative Policy Making," Wallis Working Papers WP45, University of Rochester - Wallis Institute of Political Economy.

Dugundji, J., and A. Granas (1982): Fixed Point Theory. PWN Polish Scientific Publishers.

Dutta, P. K., and R. Sundaram (1992): "Markovian Equilibrium in a Class of Stochastic Games: Existence Theorems for Discounted and Undiscounted Models," Economic Theory, 2(2), 197–214.

Echenique, F. (2004): "Extensive-form games and strategic complementarities," Games and Economic Behavior, 46, 348–354.

Federgruen, A. (1978): "On N-person stochastic games with denumerable state space," Advances in Applied Probability, 10, 452–572.

Feng, Z., J. Miao, A. Peralta-Alva, and M. S. Santos (2009): "Numerical Simulation of Nonoptimal Dynamic Equilibrium Models," Discussion Paper WP2009-013, Boston University, Department of Economics Working Papers Series.

Fink, A. (1964): "Equilibrium in a stochastic n-person game," Journal of Science of the Hiroshima University, A-I(28), 89–93.

Gabay, D., and H. Moulin (1980): "On the uniqueness and stability of Nash-equilibria in noncooperative games," in Applied Stochastic Control in Econometrics and Management Science, ed. by A. Bensoussan, P. Kleindorfer, and C. Tapiero. North-Holland, New York.

Granot, F., and A. F. Veinott (1985): "Substitutes, complements, and ripples in network flows," Mathematics of Operations Research, 10(3), 471–497.

Harris, C., and D. Laibson (2001): "Dynamic Choices of Hyperbolic Consumers," Econometrica, 69(4), 935–57.


Harris, C., P. Reny, and A. Robson (1995): "The existence of subgame-perfect equilibrium in continuous games with almost perfect information: a case for public randomization," Econometrica, 63(3), 507–44.

Herings, P. J.-J., and R. J. A. P. Peeters (2004): "Stationary equilibria in stochastic games: structure, selection, and computation," Journal of Economic Theory, 118(1), 32–60.

Hopenhayn, H. A., and E. C. Prescott (1992): "Stochastic monotonicity and stationary distributions for dynamic economies," Econometrica, 60(6), 1387–1406.

Horst, U. (2005): "Stationary equilibria in discounted stochastic games with weakly interacting players," Games and Economic Behavior, 51(1), 83–108.

Horner, J., and W. Olszewski (2009): "How Robust Is the Folk Theorem?," Quarterly Journal of Economics, 124(4), 1773–1814.

Huggett, M. (2003): "When Are Comparative Dynamics Monotone?," Review of Economic Dynamics, 6(1), 1–11.

Judd, K. L., S. Yeltekin, and J. Conklin (2003): "Computing Supergame Equilibria," Econometrica, 71(4), 1239–1254.

Kamihigashi, T., and J. Stachurski (2010): "Stochastic Stability in Monotone Economies," Discussion Papers Series, Kobe University, RIEB, DP2010-12.

Kydland, F., and E. Prescott (1977): "Rules Rather Than Discretion: The Inconsistency of Optimal Plans," Journal of Political Economy, 85(3), 473–91.

(1980): "Dynamic optimal taxation, rational expectations and optimal control," Journal of Economic Dynamics and Control, 2(1), 79–91.

Lagunoff, R. (2008): "Markov Equilibrium in Models of Dynamic Endogenous Political Institutions," Georgetown University.

(2009): "Dynamic stability and reform of political institutions," Games and Economic Behavior, 67(2), 569–583.

Lee, B.-S., and B. F. Ingram (1991): "Simulation estimation of time-series models," Journal of Econometrics, 47(2-3), 197–205.

Maitra, A. P., and W. D. Sudderth (2007): "Subgame-Perfect Equilibria for Stochastic Games," Mathematics of Operations Research, 32(3), 711–722.


Majumdar, M., and R. Sundaram (1991): "Symmetric stochastic games of resource extraction: the existence of non-randomized stationary equilibrium," in Stochastic Games and Related Topics, ed. by T. Raghavan, T. Ferguson, T. Parthasarathy, and O. Vrieze. Kluwer, Dordrecht.

Marcet, A., and R. Marimon (2011): "Recursive Contracts," Discussion paper, Barcelona GSE Working Paper Series, Working Paper no. 552.

Markowsky, G. (1976): "Chain-complete posets and directed sets with applications," Algebra Universalis, 6, 53–68.

Maskin, E., and J. Tirole (2001): "Markov Perfect Equilibrium: I. Observable Actions," Journal of Economic Theory, 100(2), 191–219.

Mertens, J.-F., and T. Parthasarathy (1987): "Equilibria for discounted stochastic games," C.O.R.E. Discussion Paper 8750.

Messner, M., N. Pavoni, and C. Sleet (2011): "Recursive methods for incentive problems," Discussion paper.

Milgrom, P., and J. Roberts (1990): "Rationalizability, learning and equilibrium in games with strategic complementarities," Econometrica, 58(6), 1255–1277.

Neyman, A., and S. Sorin (eds.) (2003): Stochastic Games and Applications, NATO Advanced Science Institutes Series D: Behavioural and Social Sciences. Kluwer Academic Press, Boston.

Nowak, A. S. (2003): "On a new class of nonzero-sum discounted stochastic games having stationary Nash equilibrium points," International Journal of Game Theory, 32, 121–132.

(2007): "On stochastic games in economics," Mathematical Methods of Operations Research, 66(3), 513–530.

Nowak, A. S., and T. Raghavan (1992): "Existence of stationary correlated equilibria with symmetric information for discounted stochastic games," Mathematics of Operations Research, 17, 519–526.

Nowak, A. S., and P. Szajowski (2003): "On Nash equilibria in stochastic games of capital accumulation," Game Theory and Applications, 9, 118–129.

Pakes, A., and R. Ericson (1998): "Empirical Implications of Alternative Models of Firm Dynamics," Journal of Economic Theory, 79(1), 1–45.

Pakes, A., M. Ostrovsky, and S. Berry (2007): "Simple Estimators for the Parameters of Discrete Dynamic Games (with Entry/Exit Examples)," Rand Journal of Economics, 38(2), 373–399.


Pakes, A., and D. Pollard (1989): "Simulation and the Asymptotics of Optimization Estimators," Econometrica, 57(5), 1027–57.

Pearce, D., and E. Stacchetti (1997): "Time Consistent Taxation by a Government with Redistributive Goals," Journal of Economic Theory, 72(2), 282–305.

Pesendorfer, M., and P. Schmidt-Dengler (2008): "Asymptotic Least Squares Estimators for Dynamic Games," Review of Economic Studies, 75(3), 901–928.

Phelan, C., and E. Stacchetti (2001): "Sequential equilibria in a Ramsey tax model," Econometrica, 69(6), 1491–1518.

Raghavan, T., T. Ferguson, T. Parthasarathy, and O. Vrieze (eds.) (1991): Stochastic Games and Related Topics. Kluwer, Dordrecht.

Rauh, M. T. (2009): "Strategic complementarities and search market equilibrium," Games and Economic Behavior, 66(2), 959–978.

Rieder, U. (1979): "Equilibrium plans for non-zero sum Markov games," in Game Theory and Related Topics, ed. by O. Moeschlin, and D. Pallaschke.

Rosenthal, R. W. (1982): "A dynamic model of duopoly with customer loyalties," Journal of Economic Theory, 27(1), 69–76.

Royden, H. L. (1968): Real Analysis. New York: MacMillan.

Shapley, L. (1953): "Stochastic games," Proceedings of the National Academy of Sciences of the USA, 39, 1095–1100.

Solan, E. (1998): "Discounted Stochastic Games," Mathematics of Operations Research, 23(4), 1010–1021.

Stokey, N. L. (1991): "Credible public policy," Journal of Economic Dynamics and Control, 15(4), 627–656.

Tarski, A. (1955): "A lattice-theoretical fixpoint theorem and its applications," Pacific Journal of Mathematics, 5, 285–309.

Topkis, D. M. (1978): "Minimizing a submodular function on a lattice," Operations Research, 26(2), 305–321.

(1979): "Equilibrium points in nonzero-sum n-person submodular games," SIAM Journal on Control and Optimization, 17, 773–787.

(1998): Supermodularity and Complementarity, Frontiers of Economic Research. Princeton University Press.


Veinott (1992): Lattice Programming: Qualitative Optimization and Equilibria. MS Stanford.

Villas-Boas, M. J. (1997): "Comparative statics of fixed points," Journal of Economic Theory, 73, 183–198.

Vives, X. (1990): "Nash equilibrium with strategic complementarities," Journal of Mathematical Economics, 19, 305–321.
