Stochastic Dynamics

A. Banerji

Markov Chains: Introduction
Marginal Distributions
Other Identities
Stability of Finite State MCs: Stationary Distributions
Dobrushin Coefficient
Lecture 2.2 Finite State Markov Chains
A. Banerji
Department of Economics
February 24, 2014
Outline
Markov Chains: Introduction; Marginal Distributions; Identities
Stability of Finite State MCs: Stationary Distributions; Dobrushin Coefficient
Stochastic Kernels
Finite state space S = {x_1, x_2, ..., x_N}.

A distribution on S is a function φ : S → ℝ s.t. φ(x) ≥ 0 for all x ∈ S and ∑_{x∈S} φ(x) = 1. The set of all distributions on S, P(S), is the (N−1)-dimensional unit simplex in ℝ^N.

Definition. A stochastic kernel on S is a function p : S × S → [0, 1] s.t. ∑_{y∈S} p(x, y) = 1 for all x ∈ S.

For each x ∈ S, we call the corresponding distribution on S p(x, dy). For a finite state space S, we can write down the N distributions in an N × N matrix.
Markov Chains
M = (p(x, dy))_{x∈S} =

    [ p(x_1, x_1)  ...  p(x_1, x_N) ]
    [      ⋮        ⋱        ⋮      ]
    [ p(x_N, x_1)  ...  p(x_N, x_N) ]

Definition. The Markov chain on S generated by stochastic kernel p and initial condition ψ ∈ P(S) is the sequence (X_t)_{t=0}^∞ of random variables defined by
(i) X_0 ∼ ψ
(ii) for t = 0, 1, 2, ..., X_{t+1} ∼ p(X_t, dy).
So if X_t = x, then P(X_{t+1} = y | X_t = x) = p(x, y). Called a Markov-(p, ψ) chain. Discuss Hamilton (2005), Quah (1993).
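The definition translates directly into a simulation routine: draw X_0 from ψ, then repeatedly draw X_{t+1} from row X_t of M. A minimal sketch in Python/NumPy; the Hamilton (2005) matrix used to exercise it is quoted from Stachurski's text and its entries should be treated as illustrative assumptions:

```python
import numpy as np

def markov_chain(M, psi, T, seed=0):
    """Simulate X_0, ..., X_T of the Markov-(p, psi) chain:
    X_0 ~ psi, then X_{t+1} ~ p(X_t, dy), i.e. row X_t of M."""
    rng = np.random.default_rng(seed)
    N = len(psi)
    path = np.empty(T + 1, dtype=int)
    path[0] = rng.choice(N, p=psi)
    for t in range(T):
        path[t + 1] = rng.choice(N, p=M[path[t]])
    return path

# Hamilton's growth/recession kernel as reported in Stachurski's text
# (assumed entries, for illustration; ordering: NG, MR, SR).
M_H = np.array([[0.971, 0.029, 0.000],
                [0.145, 0.778, 0.077],
                [0.000, 0.508, 0.492]])
psi0 = np.array([1.0, 0.0, 0.0])   # start in normal growth for sure
path = markov_chain(M_H, psi0, T=50)
```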
Marginal Distribution - Approximation
Let (X_t)_{t=0}^∞ be a Markov chain on S generated by a stochastic kernel p and initial condition ψ. The marginal distribution is ψ_t(y) ≡ P(X_t = y), for all y ∈ S.

Approximating ψ_t(y) by Monte Carlo: draw X_t a large number of times and compute the relative frequency of y. Specifically, draw X_0 from ψ a large number n of times. Each of these times, for k = 1, ..., t, draw X_k from p(X_{k−1}, dy). Then for y ∈ S,

ψ_t(y) ≈ (1/n) ∑_{i=1}^n 1{X_t^i = y}
Now do JS exercises 4.2.1, 4.2.2.
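The scheme above can be sketched as follows; the 2-state kernel is a made-up example (not from the lecture), and the estimate is checked against the exact marginal ψ_t = ψM^t:

```python
import numpy as np

def marginal_mc(M, psi, t, n=10_000, seed=0):
    """Monte-Carlo estimate of psi_t: simulate n independent chains
    for t steps and return the relative frequencies of X_t."""
    rng = np.random.default_rng(seed)
    N = len(psi)
    counts = np.zeros(N)
    for _ in range(n):
        x = rng.choice(N, p=psi)
        for _ in range(t):
            x = rng.choice(N, p=M[x])
        counts[x] += 1
    return counts / n

M = np.array([[0.9, 0.1],     # made-up 2-state kernel
              [0.5, 0.5]])
psi = np.array([0.5, 0.5])
est = marginal_mc(M, psi, t=3)
exact = psi @ np.linalg.matrix_power(M, 3)   # psi_t = psi M^t
```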
Marginal Distribution - Recursion
By the Law of Total Probability,

P{X_{t+1} = y} = ∑_{x∈S} P{X_{t+1} = y | X_t = x} P{X_t = x}

(the above just integrates out X_t from the joint distribution of (X_{t+1}, X_t)). That is,

ψ_{t+1}(y) = ∑_{x∈S} p(x, y) ψ_t(x)

Stacking these for all y ∈ S in one row vector,

ψ_{t+1} = (ψ_{t+1}(y))_{y∈S} = ψ_t M

By t recursions, we get ψ_{t+1} = ψ M^{t+1}.   (†)

The probabilities of the states at t + 1 are a weighted average of the transitions p(x, dy) (rows of M), weighted by the probabilities of the states at t.

Example. Quah: starting in extreme poverty (state 1), what is the marginal distribution after 10, 60 and 160 periods?
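The recursion gives ψ_t exactly by matrix multiplication. A sketch for the Quah example, with the mobility kernel quoted from Stachurski's text (treat the entries as illustrative assumptions; state 1 is extreme poverty):

```python
import numpy as np

# Quah's income-mobility kernel as reported in Stachurski's text
# (assumed entries, for illustration).
M_Q = np.array([[0.97, 0.03, 0.00, 0.00, 0.00],
                [0.05, 0.92, 0.03, 0.00, 0.00],
                [0.00, 0.04, 0.92, 0.04, 0.00],
                [0.00, 0.00, 0.04, 0.94, 0.02],
                [0.00, 0.00, 0.00, 0.01, 0.99]])

psi0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # extreme poverty for sure
marginals = {t: psi0 @ np.linalg.matrix_power(M_Q, t)
             for t in (10, 60, 160)}          # psi_t = psi_0 M^t
```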
Powers of M
Let (p^k(x, y))_{N×N} ≡ M^k.

Lemma. p^k(x, y) = P{X_{t+k} = y | X_t = x}.

Proof. Let δ_x ∈ P(S) be the degenerate distribution that puts probability 1 on x. So P{X_{t+k} = y | X_t = x} = P{X_{t+k} = y | X_t ∼ δ_x}. This is just the marginal distribution ψ_{t+k}(y) with initial condition X_t ∼ δ_x. By recursion (†), ψ_{t+k} = δ_x M^k = p^k(x, dy).
Expectation
Suppose X_t ∼ ψ ∈ P(S). Then the marginal distribution of X_{t+k} is ψ_{t+k} = ψ M^k. So if h : S → ℝ, the expectation

E[h(X_{t+k}) | X_t ∼ ψ] = ∑_{y∈S} ψM^k(y) h(y) = ψ M^k h

where h ≡ (h(y))_{y∈S} (we've taken an inner product).

Example. If ψ = δ_x, we have E[h(X_{t+k}) | X_t = x] = ∑_{y∈S} p^k(x, y) h(y) = δ_x M^k h. This is just the x-th row of the matrix M^k multiplied by the vector h. For Hamilton (2005), let h = (1000, 0, −1000) be the profits of a firm in the 3 states. What is expected profit 5 periods from now, if we are currently in severe recession (state 1)? Do JS 4.2.4-5.
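A sketch of the computation with Hamilton's kernel as quoted from Stachurski's text (assumed entries; note the lecture's state numbering may differ from the NG, MR, SR ordering used here, so severe recession is taken to be the last state):

```python
import numpy as np

M_H = np.array([[0.971, 0.029, 0.000],   # assumed entries (Stachurski)
                [0.145, 0.778, 0.077],   # ordering: NG, MR, SR
                [0.000, 0.508, 0.492]])
h = np.array([1000.0, 0.0, -1000.0])     # profits in the 3 states

def expected_h(M, psi, k, h):
    """E[h(X_{t+k}) | X_t ~ psi] = psi M^k h."""
    return psi @ np.linalg.matrix_power(M, k) @ h

delta_sr = np.array([0.0, 0.0, 1.0])     # degenerate at severe recession
profit_5 = expected_h(M_H, delta_sr, 5, h)
```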
Chapman-Kolmogorov Equation
The equation:

p^{k+j}(x, y) = ∑_{z∈S} p^k(x, z) p^j(z, y)

Proof. M^{k+j} = M^k M^j. So the (x, y)-th element of M^{k+j} is the inner product of the x-th row of M^k and the y-th column of M^j. To go from state x to state y in k + j steps, we must go to some state z ∈ S in k steps, then from there to y in j steps. For fixed z, multiply the two probabilities; then add over all (mutually exclusive) z's. In other standard notation,

P{X_{k+j} = y | X_0 = x} = ∑_{z∈S} P{X_{k+j} = y | X_k = z} P{X_k = z | X_0 = x}
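The identity M^{k+j} = M^k M^j is easy to confirm numerically; the 3-state kernel below is a made-up example:

```python
import numpy as np

M = np.array([[0.6, 0.3, 0.1],   # made-up stochastic kernel
              [0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8]])
k, j = 3, 4
lhs = np.linalg.matrix_power(M, k + j)                       # p^{k+j}
rhs = np.linalg.matrix_power(M, k) @ np.linalg.matrix_power(M, j)
```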
Exercise
JS 4.2.6. In terms of sums,

p^k(x, y) = ∑_{z_1∈S} p(x, z_1) ∑_{z_2∈S} p(z_1, z_2) ... ∑_{z_{k−1}∈S} p(z_{k−2}, z_{k−1}) p(z_{k−1}, y)

Proof. p^k(x, y) is the sum of the probabilities of all mutually exclusive outcome paths of type {x z_1 z_2 ... z_{k−1} y}, which equals

∑_{all {x z_1 z_2 ... z_{k−1} y}} p(x, z_1) p(z_1, z_2) ... p(z_{k−1}, y)
= ∑_{all {x z_1 ... z_{k−2}}} p(x, z_1) ... p(z_{k−3}, z_{k−2}) ∑_{z_{k−1}∈S} p(z_{k−2}, z_{k−1}) p(z_{k−1}, y)

where we've fixed {x z_1 ... z_{k−2}} and summed across the last stretch of the paths ending at y. Working backwards all the way, we get

∑_{z_1∈S} p(x, z_1) ∑_{z_2∈S} p(z_1, z_2) ... ∑_{z_{k−1}∈S} p(z_{k−2}, z_{k−1}) p(z_{k−1}, y)
Introduction
• Investigate the sequence (ψ_t) of marginal distributions for Quah, as t grows large.
• (ψ_t) settles at some ψ*, regardless of where we start.
• Global asymptotic stability of Markov chains refers to the settling down of the marginal distribution to a unique distribution, regardless of initial condition.
• Known as ergodicity.
Dynamical System Corresponding to FSMC
• The marginal distributions of the Markov process (X_t) with matrix M are (ψ_t) = (ψ M^t), if X_0 ∼ ψ ∈ P(S).
• Notice that M : P(S) → P(S) (JS 4.3.1). Indeed, for any ψ ∈ P(S), ψM = ∑_{x∈S} ψ(x) p(x, dy). So, for all y ∈ S, the y-th coordinate of ψM satisfies ψM(y) = ∑_{x∈S} ψ(x) p(x, y) ≥ 0. Also,
  ∑_{y∈S} ψM(y) = ∑_{y∈S} ∑_{x∈S} ψ(x) p(x, y) = ∑_{x∈S} ψ(x) ∑_{y∈S} p(x, y) = ∑_{x∈S} ψ(x) = 1.
  So ψM ∈ P(S). Basically, ψM is a convex combination of points of P(S) and therefore belongs to P(S).
• Impose the norm || · ||_1 and the corresponding metric d_1 on P(S). Then (P(S), M) is a dynamical system, with ψ_{t+1} = ψ_t M, t = 0, 1, 2, ....
Stationary Distributions
Definition. A distribution ψ* ∈ P(S) is stationary or invariant for M if ψ*M = ψ*. That is, ψ* is a fixed point of the dynamical system (P(S), M).

Theorem. Every Markov chain on a finite state space has at least one stationary distribution.

Proof. P(S) is compact and convex (it's just the (N−1)-dimensional unit simplex), and the map ψ ↦ ψM is linear and hence continuous. So by Brouwer's fixed point theorem, it has a fixed point in P(S).

Note: there can be many fixed points, e.g. JS 4.3.4. For the Markov matrix I_N (the N × N identity matrix), every ψ ∈ P(S) is stationary.
Some Implications
Lemma. M is d_1-nonexpansive on P(S). That is, for all ψ, ψ′ ∈ P(S), d_1(ψM, ψ′M) ≤ d_1(ψ, ψ′).

Proof.

||ψM − ψ′M||_1 = ∑_{y∈S} |ψM(y) − ψ′M(y)|
= ∑_{y∈S} |∑_{x∈S} (ψ(x) − ψ′(x)) p(x, y)|
≤ ∑_{y∈S} ∑_{x∈S} |(ψ(x) − ψ′(x)) p(x, y)|
= ∑_{x∈S} |ψ(x) − ψ′(x)| ∑_{y∈S} p(x, y) = ∑_{x∈S} |ψ(x) − ψ′(x)|
= ||ψ − ψ′||_1.

The inequality follows from the triangle inequality.
Computing Stationary Distributions
ψ ∈ P(S) is stationary (a fixed point) iff ψ(I_N − M) = 0, i.e. (I_N − M)^T ψ^T = 0. We can solve this system of equations and normalize ψ^T (divide by the sum of its entries) so that it lies in P(S).

Alternatively (JS 4.3.5): let 1_N ≡ (1, 1, ..., 1) and 1_{N×N} be the N × N matrix of ones. If ψ ∈ P(S) is a fixed point of M, then ψ 1_{N×N} = 1_N, so ψ(I_N − M + 1_{N×N}) = ψ − ψM + 1_N = 1_N. Conversely, if ψ solves 1_N = ψ(I_N − M + 1_{N×N}), one can check that ψ ∈ P(S) and ψ(I_N − M) = 0, i.e. ψ is a fixed point of M.

So solving (I_N − M + 1_{N×N})^T ψ^T = 1_N^T works. (Do JS 4.3.6-7.)
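The JS 4.3.5 method is one linear solve. A sketch, again using Hamilton's kernel as quoted from Stachurski's text (assumed entries):

```python
import numpy as np

def stationary(M):
    """Solve (I_N - M + 1_{NxN})^T psi^T = 1_N^T (JS 4.3.5)."""
    N = M.shape[0]
    A = (np.eye(N) - M + np.ones((N, N))).T
    return np.linalg.solve(A, np.ones(N))

M_H = np.array([[0.971, 0.029, 0.000],   # assumed entries (Stachurski)
                [0.145, 0.778, 0.077],
                [0.000, 0.508, 0.492]])
psi_star = stationary(M_H)
```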
Stability

Definition. The dynamical system (P(S), M) is globally stable if it has a unique stationary distribution (fixed point) ψ* ∈ P(S), and for all ψ ∈ P(S), d_1(ψM^t, ψ*) ≡ ||ψM^t − ψ*||_1 → 0 as t → ∞.

We need more than nonexpansiveness of M for stability. Stability fails for M = I_N. It succeeds 'best' if p(x, dy) is identical for all x ∈ S (we then jump to the unique fixed point in a single step, from any ψ ∈ P(S)).

Example

M = ( 0 1 )
    ( 1 0 )

ψ* = (1/2, 1/2) is the unique fixed point; every other ψ = (ψ_1, 1 − ψ_1) satisfies ψ ≠ ψM = (1 − ψ_1, ψ_1). This also shows that (ψM^t) oscillates with t, so the system is not globally stable.
Dobrushin Coefficient
Definition. The Dobrushin coefficient of a stochastic kernel p is

α(p) ≡ min_{(x, x′) ∈ S×S} ∑_{y∈S} p(x, y) ∧ p(x′, y)

where a ∧ b ≡ min{a, b}.

Remarks.
1. Hamilton and Quah: α(p) equals 0.029 for M_H and 0 for M_Q (see the 1st and 5th rows of M_Q).
2. α(p) ∈ [0, 1] for all p. It equals 1 iff p(x, dy) is identical for all x ∈ S. It equals 0 for I_N and for the periodic kernel on the previous slide.
3. α(p) > 0 iff for every pair (x, x′) ∈ S × S, p(x, dy) and p(x′, dy) overlap (assign positive probability to at least one common state y). From any 2 states, then, there is positive probability that the chains will meet next period.
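α(p) is a minimum over row pairs of the overlap ∑_y p(x, y) ∧ p(x′, y). A sketch; the Hamilton kernel is quoted from Stachurski's text (assumed entries), and the lecture's value α = 0.029 comes out of it:

```python
import numpy as np

def dobrushin(M):
    """alpha(p) = min over (x, x') of sum_y p(x, y) ∧ p(x', y)."""
    N = M.shape[0]
    return min(np.minimum(M[i], M[j]).sum()
               for i in range(N) for j in range(N))

M_H = np.array([[0.971, 0.029, 0.000],   # assumed entries (Stachurski)
                [0.145, 0.778, 0.077],
                [0.000, 0.508, 0.492]])
alpha_H = dobrushin(M_H)   # rows 1 and 3 overlap only in column 2
```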
Dobrushin Coefficient and Stability
Theorem. Let p be a stochastic kernel with Markov matrix M. Then for every φ, ψ ∈ P(S),

||φM − ψM||_1 ≤ (1 − α(p)) ||φ − ψ||_1

Moreover this bound is tight: for any λ < 1 − α(p), there exists a pair φ, ψ that violates the inequality ||φM − ψM||_1 ≤ λ ||φ − ψ||_1.

The proof consists of 3 steps/lemmas.

Lemma (JS C.2.1). Let φ, ψ ∈ P(S) and h : S → ℝ_+. Then

|∑_{x∈S} h(x)φ(x) − ∑_{x∈S} h(x)ψ(x)| ≤ (1/2) sup_{x,x′} |h(x) − h(x′)| · ||φ − ψ||_1
Stability 2
JS C.2.1 provides an upper bound for the absolute difference of the expectations of h under φ and ψ. See the proof in Stachurski (appendix).

Lemma (C.2.2). ||φM − ψM||_1 ≤ (1/2) sup_{x,x′} ||p(x, dy) − p(x′, dy)||_1 · ||φ − ψ||_1

Proof. See Stachurski (appendix). The inequality looks similar to C.2.1. Exercise 4.3.2 implies ||φM − ψM||_1 = 2 sup_{A⊆S} |φM(A) − ψM(A)|. We introduce the function h used in C.2.1 by noting that |φM(A) − ψM(A)| = |∑_{x∈S} P(x, A)φ(x) − ∑_{x∈S} P(x, A)ψ(x)|, where P(x, A) = ∑_{y∈A} p(x, y).
Stability 3

To prove the first claim of the theorem, we now show that

(1/2) sup_{x,x′} ||p(x, dy) − p(x′, dy)||_1 = 1 − inf_{x,x′} ∑_{y∈S} p(x, y) ∧ p(x′, y)

It suffices to show that for every x, x′,

(1/2) ||p(x, dy) − p(x′, dy)||_1 = 1 − ∑_{y∈S} p(x, y) ∧ p(x′, y)

This is true for any pair of distributions, as below.

Lemma (C.2.3). For every pair µ, ν ∈ P(S) we have

(1/2) ||µ − ν||_1 = 1 − ∑_{y∈S} µ(y) ∧ ν(y)
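Lemma C.2.3 is easy to check numerically on random distributions (it follows from ∑_y |µ(y) − ν(y)| = ∑_y (µ(y) + ν(y) − 2 µ(y) ∧ ν(y)) = 2 − 2 ∑_y µ(y) ∧ ν(y)); a sketch:

```python
import numpy as np

# Check (1/2)||mu - nu||_1 = 1 - sum_y mu(y) ∧ nu(y) on random pairs.
rng = np.random.default_rng(0)
for _ in range(100):
    mu = rng.random(6); mu /= mu.sum()   # random distribution, 6 states
    nu = rng.random(6); nu /= nu.sum()
    lhs = 0.5 * np.abs(mu - nu).sum()
    rhs = 1.0 - np.minimum(mu, nu).sum()
    assert abs(lhs - rhs) < 1e-12
```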
Stability 4
To show that the bound is tight, note that

1 − α(p) = (1/2) sup_{x,x′} ||p(x, dy) − p(x′, dy)||_1

         = sup_{x≠x′} ||p(x, dy) − p(x′, dy)||_1 / ||δ_x − δ_{x′}||_1

         ≤ sup_{µ≠ν} ||µM − νM||_1 / ||µ − ν||_1

The second equality holds since ||δ_x − δ_{x′}||_1 = 2 for x ≠ x′ and δ_x M = p(x, dy). The final inequality holds because the set of degenerate distributions δ_x, δ_{x′} is a subset of the set of all distributions. More simply, to see that the bound cannot be improved in general, put M = I_N (so α(p) = 0 and the inequality holds with equality). Now for the main theorem.

Theorem. Let p be a stochastic kernel on a finite set S, and M the corresponding Markov matrix. The dynamical system (P(S), M) is globally stable iff there exists a t ∈ ℕ s.t. α(p^t) > 0.
Stability 5
Proof. Suppose α(p^t) > 0 for some t. From the earlier theorem, (P(S), M^t) is globally stable; since M is nonexpansive, Lemma 4.1.1 then gives that (P(S), M) is globally stable.

Conversely, suppose (P(S), M) is globally stable. Then there is a unique stationary distribution ψ*, and ψM^t → ψ* for all ψ ∈ P(S). In particular, δ_x M^t = p^t(x, dy) → ψ* for every x ∈ S. So for all x ∈ S, p^t(x, y) → ψ*(y) for all y ∈ S. Since ψ* is a distribution, there is at least one y ∈ S s.t. ψ*(y) > 0. So for this y, the convergence implies that for t large enough, p^t(x, y) > 0 for all x ∈ S. Thus there exists t such that all rows p^t(x, dy) of M^t overlap at this y; hence the Dobrushin coefficient α(p^t) > 0.
Exercises

The theorem shows: for every pair (x, x′) of states, p^t(x, dy) and p^t(x′, dy) overlap. Starting at any 2 different points today, the chains meet with positive probability t periods later. Extreme form: p^t(x, dy) is the same for all x, i.e. convergence in t periods.

• α(p) > 0 for Hamilton's matrix but zero for Quah's matrix. However, for the 23rd iterate M_Q^{23} reported by Quah, α(p^{23}) > 0. So (P(S), M_Q) is globally stable.
• JS 4.3.20. Write code to calculate α(p^t), t = 1, 2, ..., T for a given Markov matrix M, stopping at the first t s.t. α(p^t) > 0. Show that this t = 2 for Quah's matrix.
• (s, S) (or (q, Q)) inventory dynamics. A firm with inventory X_t at the start of period t has the option of ordering inventory up to its maximum storage capacity Q. At the end of period t, demand D_{t+1} is observed (taking nonnegative integer values). The firm meets demand up to its current stock level; remaining inventory is carried over to the next period.
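A sketch of JS 4.3.20, run on Quah's kernel as quoted from Stachurski's text (assumed entries; the structure — positive diagonal, nearest-neighbour moves — is what makes t = 2 work):

```python
import numpy as np

def dobrushin(M):
    """alpha(p) = min over (x, x') of sum_y p(x, y) ∧ p(x', y)."""
    N = M.shape[0]
    return min(np.minimum(M[i], M[j]).sum()
               for i in range(N) for j in range(N))

def first_positive_alpha(M, T=50):
    """First t in 1..T with alpha(p^t) > 0 (None if there is none)."""
    Mt = np.eye(M.shape[0])
    for t in range(1, T + 1):
        Mt = Mt @ M            # Mt = M^t
        if dobrushin(Mt) > 0:
            return t
    return None

M_Q = np.array([[0.97, 0.03, 0.00, 0.00, 0.00],   # assumed entries
                [0.05, 0.92, 0.03, 0.00, 0.00],   # (Stachurski)
                [0.00, 0.04, 0.92, 0.04, 0.00],
                [0.00, 0.00, 0.04, 0.94, 0.02],
                [0.00, 0.00, 0.00, 0.01, 0.99]])
t_star = first_positive_alpha(M_Q)   # alpha(p) = 0, but alpha(p^2) > 0
```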
Inventory Dynamics exercise
(D_t)_{t≥1} is an iid sequence of random variables taking nonnegative integer values according to the distribution b(d) ≡ P{D_t = d} = (1/2)^{d+1}.

The firm follows a stationary policy: if X_t ≤ q, order inventory to top up the stock to Q; otherwise, order no inventory (the threshold q is the firm's pre-decided policy choice). So,

X_{t+1} = max{Q − D_{t+1}, 0}   if X_t ≤ q
X_{t+1} = max{X_t − D_{t+1}, 0} if X_t > q

Let S = {0, 1, ..., Q}. What is the stochastic kernel M_q = (p(x, y)) corresponding to restocking policy q?
(q, Q) Dynamics

Let x ≤ q. Then

X_{t+1} = Q − i  with probability (1/2)^{i+1}, i = 0, 1, ..., Q − 1
X_{t+1} = 0      with probability (1/2)^Q

Let x > q. Then

X_{t+1} = x − i  with probability (1/2)^{i+1}, i = 0, 1, ..., x − 1
X_{t+1} = 0      with probability (1/2)^x

With the states ordered 0, 1, ..., Q, each row x ≤ q of M_q is ((1/2)^Q, (1/2)^Q, (1/2)^{Q−1}, ..., (1/2)), and each row x > q is ((1/2)^x, (1/2)^x, (1/2)^{x−1}, ..., (1/2), 0, ..., 0):

M_q =
  [ (1/2)^Q  (1/2)^Q  (1/2)^{Q−1}  ...   ...   (1/2) ]
  [   ...      ...        ...      ...   ...    ...  ]
  [ (1/2)^Q  (1/2)^Q  (1/2)^{Q−1}  ...   ...   (1/2) ]
  [ (1/2)^x  (1/2)^x  (1/2)^{x−1}  ...  (1/2)    0   ]
  [   ...      ...        ...      ...   ...    ...  ]
  [ (1/2)^Q  (1/2)^Q  (1/2)^{Q−1}  ...   ...   (1/2) ]
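The kernel above can be generated mechanically: after restocking, the stock is s = Q if x ≤ q and s = x otherwise; demand d ≤ s − 1 leaves s − d units, while demand d ≥ s leaves 0. A sketch:

```python
import numpy as np

def inventory_kernel(q, Q):
    """M_q on S = {0, 1, ..., Q} for the (q, Q) policy with demand
    distribution b(d) = (1/2)^{d+1}."""
    M = np.zeros((Q + 1, Q + 1))
    for x in range(Q + 1):
        s = Q if x <= q else x           # stock after (possible) restock
        for d in range(s):               # demand d leaves s - d units
            M[x, s - d] = 0.5 ** (d + 1)
        M[x, 0] += 0.5 ** s              # demand >= s wipes out the stock
    return M

M_25 = inventory_kernel(2, 5)
```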
(q,Q) Dynamics cont
Staring at M_q, we see that regardless of q (and Q), every row assigns probability at least (1/2)^Q to state 0, so any two rows overlap and α(p) > 0. So (P(S), M_q) is globally stable.

JS 4.3.23. Compute the stationary distribution when (q, Q) = (2, 5).

JS 4.3.24. Suppose Q = 20, and the fixed cost of ordering inventory in any period is 0.1. The firm buys the product at zero cost per unit and sells at USD 1 per unit. For each q ∈ {0, 1, 2, ..., 20}, evaluate the stationary distribution ψ*_q, and evaluate the firm's expected per-period profit at this stationary distribution (i.e. compute the firm's long-run average profit with restocking policy q). Show that this profit is maximized at q = 7.
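A sketch of JS 4.3.23-24. The stationary distribution uses the JS 4.3.5 linear solve; the profit calculation is one plausible reading of the exercise (expected sales revenue at USD 1 per unit, minus the 0.1 fixed cost whenever an order is placed) and should be treated as an assumption — the exercise itself reports the maximizer q = 7:

```python
import numpy as np

def inventory_kernel(q, Q):
    """M_q on S = {0, ..., Q} with demand b(d) = (1/2)^{d+1}."""
    M = np.zeros((Q + 1, Q + 1))
    for x in range(Q + 1):
        s = Q if x <= q else x
        for d in range(s):
            M[x, s - d] = 0.5 ** (d + 1)
        M[x, 0] += 0.5 ** s
    return M

def stationary(M):
    """JS 4.3.5: solve (I - M + 1_{NxN})^T psi^T = 1^T."""
    N = M.shape[0]
    return np.linalg.solve((np.eye(N) - M + np.ones((N, N))).T, np.ones(N))

# JS 4.3.23: stationary distribution for (q, Q) = (2, 5)
psi_star = stationary(inventory_kernel(2, 5))

# JS 4.3.24 (sketch, under the profit reading described above)
def long_run_profit(q, Q=20, fixed_cost=0.1):
    psi = stationary(inventory_kernel(q, Q))
    def expected_sales(s):               # E[min(s, D)], truncated sum
        return sum(min(s, d) * 0.5 ** (d + 1) for d in range(200))
    sales = sum(psi[x] * expected_sales(Q if x <= q else x)
                for x in range(Q + 1))
    return sales - fixed_cost * psi[: q + 1].sum()
```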