
Moderate Deviations for Time-varying Dynamic Systems Driven by Nonhomogeneous Markov Chains with Two-time Scales∗

Qi He† G. Yin‡

Abstract

Motivated by problems arising in time-dependent queues and dynamic systems with random environment, this work develops moderate deviations principles for dynamic systems driven by a fast-varying nonhomogeneous Markov chain in continuous time. A distinct feature is that the Markov chain is time dependent or inhomogeneous, and so are the dynamic systems. Under irreducibility of the nonhomogeneous Markov chain, moderate deviations of a nonhomogeneous functional are established first. With the help of a martingale problem formulation and a functional central limit theorem for the two-time-scale system, moderate deviations upper and lower bounds are obtained for the rapidly fluctuating Markovian systems. Then applications to dynamic systems modulated by a fast-varying Markov chain are examined.

Key Words. Moderate deviation, time-varying functional, Markov chain, averaging principle.

AMS Subject Classification. 60J27, 93E20, 34E05.

Running Title. Moderate Deviations for Two-time-scale Systems

∗This research was supported in part by the National Science Foundation under DMS-1207667.

†Department of Mathematics, University of California, Irvine, CA 92697, [email protected].

‡Department of Mathematics, Wayne State University, Detroit, MI 48202, [email protected].


1 Introduction

Markovian models have been used widely in manufacturing systems, production planning,

queueing networks, Monte Carlo simulation, and random environment. In recent years, new

applications of Markovian models have also emerged from wireless communications, internet

traffic modeling, and financial engineering. The rapid progress in technology has opened up

new domains and provided greater opportunities for further exploration.

To make the computation affordable and feasible, one often has to be content with

finding approximate solutions. This is especially true for many control and optimization

problems. A useful modeling and computational step is to use a two-time-scale formulation.

In fact, time-scale separation is often inherent in the underlying problems. For instance,

equity investors in a stock market can be classified into two categories: long-term

investors and short-term investors. Long-term investors consider a relatively long time

horizon and make decisions based on weekly or monthly performance of the stock, whereas

short-term investors (such as day traders) focus on returns over a day or an

even shorter period of time. Their time scales are in sharp contrast. An effective way

to delineate the distinct rates of changes is to introduce a small parameter ε > 0 into the

system, in which ε is used to separate the time scales. In the past, Simon and Ando [18] used

such an idea and introduced the so-called hierarchical decomposition and aggregation. Sethi

and Zhang [17] initiated the study of nearly optimal controls for flexible manufacturing

systems. To further investigate the underlying properties, Khasminskii, Yin, and Zhang

developed asymptotic expansions for the probability distribution vectors [10, 11] using an

analytic approach. Subsequently, a more comprehensive study was launched in [23] (see

also [24, 26]), which contains scaled sequence of occupation measures, switching diffusion

limits, near-optimal controls of Markovian systems, Markov decision processes, and numerical

methods, among others.

Nonhomogeneous continuous-time Markov chains (or Markov chains with time-varying

transition rates) naturally arise in various applications. As motivation, we recall the

problems treated in [15, 16]. Consider an $M_t/M_t/1/m_0$ queue with a finite number of waiting

buffers, m0, and the first-in, first-out service discipline. Suppose that the arrival process is

a non-homogeneous Poisson process with intensity (arrival rate) function λ(t), and that the

service time is exponentially distributed with time-dependent rate µ(t). Let α(t) be the

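queue length at time t, which is a nonstationary Markov chain with generator

\[
Q(t) = \begin{pmatrix}
-\lambda(t) & \lambda(t) & & & \\
\mu(t) & -(\lambda(t)+\mu(t)) & \lambda(t) & & \\
 & \ddots & \ddots & \ddots & \\
 & & \mu(t) & -(\lambda(t)+\mu(t)) & \lambda(t) \\
 & & & \mu(t) & -\mu(t)
\end{pmatrix}.
\]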
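As a concrete illustration of the tridiagonal structure of this generator, the following minimal Python sketch builds Q for given rates. The function names `queue_generator` and `generator_at`, as well as the sinusoidal arrival rate, are illustrative assumptions and are not taken from [15, 16]; states are indexed 0, ..., m0 − 1 in the code.

```python
import math


def queue_generator(lam, mu, m0):
    """Return the m0 x m0 birth-death generator with arrival rate lam and service rate mu."""
    Q = [[0.0] * m0 for _ in range(m0)]
    for k in range(m0):
        if k + 1 < m0:
            Q[k][k + 1] = lam   # arrival: state k -> k + 1
        if k - 1 >= 0:
            Q[k][k - 1] = mu    # service completion: state k -> k - 1
        Q[k][k] = -sum(Q[k])    # diagonal entry makes the row sum to zero
    return Q


def generator_at(t, m0=4):
    """Time-varying generator Q(t); the rate functions here are hypothetical examples."""
    lam = 1.0 + 0.5 * math.sin(t)
    mu = 2.0
    return queue_generator(lam, mu, m0)
```

Each row of the returned matrix sums to zero, and the first and last rows contain only two nonzero entries, matching the boundary rows of Q(t).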

One aims to find an approximation to the probability P (α(t) = k) with 1 ≤ k ≤ m0. Define

the probability distribution by

p(t) = (P (α(t) = 1), . . . , P (α(t) = m0)). (1.1)

Then we have

\[
\frac{dp(t)}{dt} = p(t)Q(t). \tag{1.2}
\]

To treat the problem, Massey [15], and Massey and Whitt [16] developed an approach known

as uniform acceleration expansions, which can be explained as follows. Introduce a small

parameter ε > 0 to the generator Q(t). Assume that the generator

Q(t) varies so slowly in time that the process α(t) can achieve equilibrium before there is any

significant change in the rate. Then we can replace Q(t) by Q(εt). In the following discussion,

for notational simplicity, we take the initial time to be t0 = 0. Let pε(t) be the probability

distribution (as defined in (1.1) that is indexed by ε) corresponding to the generator Q(εt).

We know

\[
\frac{dp^\varepsilon(t)}{dt} = p^\varepsilon(t)Q(\varepsilon t). \tag{1.3}
\]

To get uniform acceleration, the limit as t → ∞ and ε → 0 simultaneously was considered

in [16]. Denote ς = εt and let αε(·) be the new process associated with ς. Then the

corresponding probability distribution pε(ς) solves the forward equation

\[
\frac{dp^\varepsilon(\varsigma)}{d\varsigma} = p^\varepsilon(\varsigma)\frac{Q(\varsigma)}{\varepsilon}. \tag{1.4}
\]
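To see the effect of the 1/ε scaling numerically, the following sketch integrates the forward equation for a two-state chain with an explicit Euler scheme and compares the result with the normalized solution of νQ = 0. The rate functions `rate_a` and `rate_b`, the step-size rule, and all function names are assumptions made only for this illustration.

```python
def rate_a(s):
    """Hypothetical jump rate from state 1 to state 2 at time s."""
    return 1.0 + s


def rate_b(s):
    """Hypothetical jump rate from state 2 to state 1 at time s."""
    return 2.0


def nu(s):
    """Normalized solution of nu Q(s) = 0 for the two-state generator."""
    a, b = rate_a(s), rate_b(s)
    return (b / (a + b), a / (a + b))


def forward_euler(eps, T=1.0, p0=(1.0, 0.0), steps_per_eps=20):
    """Explicit Euler scheme for dp/ds = p Q(s) / eps on [0, T]."""
    h = eps / steps_per_eps          # the step must be small relative to eps
    n = int(round(T / h))
    p1, p2 = p0
    for k in range(n):
        a, b = rate_a(k * h), rate_b(k * h)
        dp1 = (-a * p1 + b * p2) / eps   # first component of p Q / eps
        dp2 = (a * p1 - b * p2) / eps    # second component of p Q / eps
        p1, p2 = p1 + h * dp1, p2 + h * dp2
    return p1, p2
```

For small ε, the computed distribution at time T stays close to `nu(T)` even though the chain starts in state 1, which is the averaging effect exploited throughout the paper.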

Recall that (see [10]) a Markov chain or its generator Q(ς) is irreducible, if the system of

equations

\[
\nu(\varsigma)Q(\varsigma) = 0, \qquad \sum_{i=1}^{m_0}\nu_i(\varsigma) = 1 \tag{1.5}
\]

has a unique solution such that νi(ς) > 0 for each i ∈ M. The unique solution of (1.5),

namely, the row vector ν(ς) = (ν1(ς), ν2(ς), . . . , νm0(ς)), is termed the quasi-stationary distribution.

When Q is independent of time, ν becomes the usual stationary distribution.
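For the birth-death generator of the queueing example, system (1.5) can be solved in closed form at each frozen time via detailed balance, ν_{k+1} λ = ν_k ... that is, ν_{k+1} = ν_k λ/μ. The sketch below is a minimal illustration under that assumption; the function names `quasi_stationary` and `residual` are hypothetical, and the rates are the values of λ(ς) and μ(ς) at one fixed ς.

```python
def quasi_stationary(lam, mu, m0):
    """Solve nu Q = 0, sum(nu) = 1 for the birth-death generator via detailed balance."""
    weights = [(lam / mu) ** k for k in range(m0)]   # nu_k is proportional to (lam/mu)^k
    total = sum(weights)
    return [w / total for w in weights]


def residual(nu, lam, mu):
    """Return the row vector nu * Q for the birth-death generator (should be zero)."""
    m0 = len(nu)
    out = []
    for j in range(m0):
        up = lam if j + 1 < m0 else 0.0     # rate from state j to j + 1
        down = mu if j > 0 else 0.0         # rate from state j to j - 1
        inflow = (nu[j - 1] * lam if j > 0 else 0.0) \
            + (nu[j + 1] * mu if j + 1 < m0 else 0.0)
        out.append(inflow - nu[j] * (up + down))
    return out
```

All components of the returned distribution are strictly positive whenever λ and μ are positive, in line with the irreducibility requirement.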

Assuming that Q(ς) is irreducible, it was proved in [16, Theorem 3.3] that

$p^\varepsilon(\varsigma) = \sum_{i=0}^{n}\varepsilon^{i}\varphi_i(\varsigma) + O(\varepsilon^{n+1})$ using only outer expansions. Using the results of [10, 23], one obtains

not only the outer expansions but also initial layer corrections leading to more accurate error

bounds for the acceleration. For the queueing problem, for T > 0, some c > 0, each i ∈ M,

and for any κ ∈ (0, 1/2), we are often interested in obtaining

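\[
P\left( \left| \frac{1}{\varepsilon^{\kappa}} \sup_{0\le\varsigma\le T} \int_0^{\varsigma} \bigl[I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr]\,ds \right| \ge c \right), \tag{1.6}
\]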

where I{·} is the indicator function and ν(t) = (ν1(t), ν2(t), . . . , νm0(t)) is the quasi-stationary

distribution corresponding to the generator Q(t). Previous results mentioned in the above

references cannot provide the desired bounds. In this paper, we show that such error bounds

can be obtained by means of a moderate deviations approach.

To further motivate our study, we consider the following dynamic system in a random

environment:

\[
\dot x^\varepsilon(t) = b(t, x^\varepsilon(t), \alpha^\varepsilon(t)), \quad x^\varepsilon(0) = x, \tag{1.7}
\]

which is a time-varying dynamic system perturbed by a fast-varying Markov chain αε(t)

in continuous time. The Markov chain has a finite state space M = {1, . . . ,m0}, whose

generator is Q(t)/ε, where Q(t) ∈ Rm0×m0 is a generator of a continuous-time Markov chain

and ε is a small parameter.

Under suitable conditions, in [23, Chapter 5], we showed that xε(·) converges weakly

to x(·), where x(·) is the solution of the system $\dot x(t) = \bar b(t, x(t))$ with $\bar b(t, x) = \sum_{i=1}^{m_0} b(t, x, i)\nu_i(t)$. Define

\[
y^\varepsilon(t) = \frac{1}{\varepsilon^{\kappa}}\bigl(x^\varepsilon(t) - x(t)\bigr) \quad \text{for some } 0 \le \kappa \le 1/2.
\]

When κ = 0, the probability error estimates of the form P (|yε(t)| > δ) were considered in

He, Yin, and Zhang [8], in which large deviations results were obtained. When κ = 1/2, the

asymptotic distribution of yε(·) becomes a central limit type problem that was considered

in [9] under a somewhat different setting. It can be shown that yε(·) converges weakly to

y(·), a diffusion process. Then one natural question is: What will happen if κ ∈ (0, 1/2)? To

answer this question, one must resort to moderate deviations techniques.

Much work has been done for investigating moderate deviations problems. Freidlin and

Wentzell [5] studied moderate deviations associated with the averaging principle; they

obtained the results by imposing some abstract conditions. Subsequently, moderate deviations

were considered in Djellout [3] for martingale differences and φ-mixing sequences, Wu [20]

for Markov processes and Markov chains, de Acosta [2] and Djellout and Guillin [4] for sharp

results on Markov chains, Liptser and Spokoiny [14] for fast ergodic diffusion processes, and

Wu [21] for large-time behavior of empirical measures of the solution of a damped Hamiltonian

system. However, all the aforementioned work mainly considered moderate deviations

in a time-homogeneous setting. Recently, Guillin [6] studied moderate deviations of

inhomogeneous functionals of Markov processes, but the Markov process under consideration

is still homogeneous. Motivated by the work [8], in this paper, we consider the case where

both the underlying functional and the Markov process are inhomogeneous, which is needed

for many applications.

This paper is organized as follows. Section 2 begins with certain preliminary results.

Section 3 provides results on the H-functional, which is a basic tool we need. Then, we study

moderate deviations of inhomogeneous functionals and moderate deviations of Markovian

switching systems in Sections 4 and 5, respectively. In Section 6, we revisit the queueing problem

mentioned before and examine a couple of examples of dynamic systems modulated by a fast-varying

Markov chain, together with some final remarks.

2 Preliminary Result

Let us begin with a continuous-time Markov chain with state space M = {1, 2, . . . , m0}

and a generator Q that is independent of t and that is irreducible. Then we can obtain the

following result, whose proof can be found in [8].

Lemma 2.1. Suppose that $\beta \in \mathbb{R}^k$ is a constant vector, $f(\cdot) : \mathcal{M} \to \mathbb{R}^k$, and X(t) is a

Markov chain with generator Q that is irreducible. Then for each i ∈ M, the limit

\[
\lim_{T\to\infty} \frac{1}{T}\log E_i \exp\left( \int_0^T \langle f(X(s)), \beta\rangle\,ds \right) \tag{2.1}
\]

exists, where $E_i$ denotes the expectation with X(0) = i and $\langle\cdot,\cdot\rangle$ denotes the inner product in $\mathbb{R}^k$.

Denote the limit in (2.1) by H(β,Q). Then H(β,Q) is continuous and convex in β.

The H(β,Q) in the above Lemma is the H functional that depends on β and Q. Similar

to [5, Theorem 7.2], we can derive the following result.

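Lemma 2.2 Assume αε(·) is a homogeneous Markov chain with generator Q/ε. Then

\[
\lim_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(\alpha^\varepsilon(s)), \beta\rangle\,ds \right) = \frac{T}{2}\langle C\beta, \beta\rangle, \tag{2.2}
\]

where κ ∈ (0, 1/2), and C is the Hessian matrix of H(β,Q) at β = 0, that is, $C_{ij} = \dfrac{\partial^2 H(0,Q)}{\partial\beta_i\,\partial\beta_j}$.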

Next, let us consider the nonhomogeneous Markov chain αε(t) with a finite state space M =

{1, . . . , m0} and generator Q(t)/ε, with $Q(t) \in \mathbb{R}^{m_0\times m_0}$ being an irreducible generator. By

virtue of [23, Theorem 5.9], we have the following result.

Lemma 2.3 For each i ∈ M, let βi(·) be a bounded measurable deterministic function and

\[
U^\varepsilon_i(t) = \frac{1}{\sqrt{\varepsilon}}\int_0^t \bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\beta_i(s)\,ds,
\]

with $U^\varepsilon(t) = (U^\varepsilon_1(t), \dots, U^\varepsilon_{m_0}(t))'$, where z′ denotes the transpose of z. Suppose that Q(·) is

twice continuously differentiable with the second derivative being Lipschitz in [0, T] for some

T > 0. Then Uε(·) converges weakly to a Gaussian process U(·) such that

\[
EU(t) = 0 \quad\text{and}\quad E[U_i(t)U_j(t)] = \int_0^t \beta_i(s)\beta_j(s)A_{ij}(s)\,ds,
\]

and

\[
A_{ij}(t) = \nu_i(t)\int_0^\infty \psi_{ij}(r,t)\,dr + \nu_j(t)\int_0^\infty \psi_{ji}(r,t)\,dr, \tag{2.3}
\]

in which $\Psi(r,t) = (\psi_{ij}(r,t))$ satisfies

\[
\frac{d\Psi(\tau,t_0)}{d\tau} = \Psi(\tau,t_0)Q(t_0), \quad \tau \ge 0, \qquad \Psi(0,t_0) = I - P^{(0)}(t_0),
\]

and

\[
P^{(0)}(t) = \begin{pmatrix} \nu(t) \\ \vdots \\ \nu(t) \end{pmatrix}.
\]
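To make formula (2.3) concrete, the following sketch evaluates A for a two-state chain with frozen rates by integrating the equation for Ψ with an explicit Euler scheme. The rates `a` (state 1 to 2) and `b` (state 2 to 1), the step size, the truncation horizon, and the function names are assumptions chosen only for illustration. For this special case one can check by hand that Ψ(r) = e^{−(a+b)r}(I − P^{(0)}), so that A11 = 2ab/(a+b)^3.

```python
def mat_mul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def covariance_density(a, b, h=1e-3, horizon=10.0):
    """Approximate A = (A_ij) of (2.3) for the generator [[-a, a], [b, -b]]."""
    s = a + b
    nu = (b / s, a / s)
    Q = [[-a, a], [b, -b]]
    # Psi(0) = I - P0, where every row of P0 equals nu
    Psi = [[(1.0 if i == j else 0.0) - nu[j] for j in range(2)] for i in range(2)]
    integral = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(int(horizon / h)):
        for i in range(2):
            for j in range(2):
                integral[i][j] += h * Psi[i][j]     # left Riemann sum of Psi
        PQ = mat_mul(Psi, Q)
        Psi = [[Psi[i][j] + h * PQ[i][j] for j in range(2)] for i in range(2)]
    return [[nu[i] * integral[i][j] + nu[j] * integral[j][i] for j in range(2)]
            for i in range(2)]
```

The resulting matrix is symmetric with rows summing to zero, as expected for the covariance density of the centered indicator processes.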

3 H-Functional for the Two-time-scale Markov Chain

To study the moderate deviations of dynamic system (1.7), we first investigate the moderate

deviations of the nonhomogeneous functional of the time-varying Markov chain. Denote

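\[
\bar f(t) = \sum_{i=1}^{m_0} f(t,i)\nu_i(t),
\]

and define

\[
\begin{aligned}
n^\varepsilon(t) &= \varepsilon^{-\kappa}\int_0^t \bigl[f(s,\alpha^\varepsilon(s)) - \bar f(s)\bigr]\,ds, \quad \kappa \in (0, \tfrac12),\\
&= \sum_{i=1}^{m_0} \varepsilon^{-\kappa}\int_0^t f(s,i)\bigl[I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr]\,ds.
\end{aligned}
\tag{3.1}
\]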

Then nε(t) goes to 0 as ε → 0. To estimate the convergence rate, we study nε(·) further by using

moderate deviations. Next, let us state our standing assumption to be used throughout the

paper.

(A) The generator Q(t) is irreducible for each t ∈ [0, T], Q(·) is twice continuously differentiable

in [0, T], and the function f(·, i) is Lipschitz continuous in t for each i ∈ M.

Use Es,i to denote the expectation with αε(s) = i and simply use Ei to denote the

expectation with αε(0) = i. Modifying the proof of [8, Theorem 3.2], we derive the following

theorem. For brevity, the proof is omitted.

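Theorem 3.1 Assume that $f : [0,T]\times\mathcal{M} \to \mathbb{R}^k$, β(t) is a step function on [0, T],

and αε(t) is the continuous-time Markov chain with state space M and generator Qε(t) =

Q(t)/ε. Then under Assumption (A), there exists a function $H(\cdot,\cdot) : \mathbb{R}^k\times[0,\infty] \to \mathbb{R}$ such

that

\[
\lim_{\varepsilon\to0} \varepsilon\log E_i \exp\left\{ \frac{1}{\varepsilon}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta(s)\rangle\,ds \right\} = \int_0^T H(\beta(t),t)\,dt.
\]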

To study the moderate deviations, we need to examine the following limit

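\[
\lim_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left\{ \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta(s)\rangle\,ds \right\},
\]

where κ ∈ (0, 1/2).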

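Theorem 3.2. Consider αε(·) on an interval [0, T]. Then for each i ∈ M,

\[
\lim_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \right) = \frac12\int_0^T \langle \beta C(t), \beta\rangle\,dt,
\]

where C(t) is the Hessian matrix of H(β, t) at β = 0, i.e., $C_{ij}(t) = \dfrac{\partial^2 H(\beta,t)}{\partial\beta_i\,\partial\beta_j}\Big|_{\beta=0}$.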
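For a two-state chain with frozen rates and scalar f, the matrix C(t) can be checked numerically. The sketch below assumes rates `a` (state 1 to 2) and `b` (state 2 to 1), takes H(β) to be the largest eigenvalue of the 2 × 2 matrix with entries q_ij + δ_ij (f(i) − f̄)β, and approximates the second derivative at β = 0 by a central difference. The function names and the step size are illustrative assumptions. A short Taylor expansion of the eigenvalue gives the closed form 2ab(f1 − f2)^2/(a + b)^3 for comparison.

```python
import math


def H(beta, a, b, f1, f2):
    """Largest eigenvalue of [[-a + beta*(f1 - fbar), a], [b, -b + beta*(f2 - fbar)]]."""
    nu1, nu2 = b / (a + b), a / (a + b)
    fbar = nu1 * f1 + nu2 * f2               # average of f under the stationary law
    d1 = -a + beta * (f1 - fbar)
    d2 = -b + beta * (f2 - fbar)
    disc = (d1 - d2) ** 2 + 4.0 * a * b      # discriminant of the 2 x 2 eigenproblem
    return 0.5 * (d1 + d2 + math.sqrt(disc))


def hessian_at_zero(a, b, f1, f2, h=1e-3):
    """Central-difference approximation of the second derivative of H at beta = 0."""
    return (H(h, a, b, f1, f2) - 2.0 * H(0.0, a, b, f1, f2) + H(-h, a, b, f1, f2)) / h ** 2


def closed_form_variance(a, b, f1, f2):
    """Closed form 2ab(f1 - f2)^2 / (a + b)^3 for the two-state chain."""
    return 2.0 * a * b * (f1 - f2) ** 2 / (a + b) ** 3
```

Because f is centered by its stationary average, both H(0) and the first derivative of H at 0 vanish, so the quadratic term is the leading one; this is exactly what the scaling ε^{1−2κ} in Theorem 3.2 picks out.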

Proof. To prove the result, we define the family of operators $L^{\beta}_{s,t}$, 0 ≤ s ≤ t, on the set of

vectors in $\mathbb{R}^{m_0}$ by

\[
L^{\beta}_{s,t}w(i) = E_{s,i}\,w(\alpha^\varepsilon(t)) \exp\left( \int_s^t \langle f(\tau,\alpha^\varepsilon(\tau)) - \bar f(\tau), \beta\rangle\,d\tau \right),
\]

where w is a vector in $\mathbb{R}^{m_0}$. Note that this operator also depends on ε. However, we suppress

the subscript ε for simplicity. It follows that $L^{\beta}_{s,t}$ has the monotone and semigroup properties

\[
\begin{aligned}
& L^{\beta}_{s,t}w_1 \le L^{\beta}_{s,t}w_2, \quad \text{if } w_1 \le w_2,\\
& L^{\beta}_{s,t}L^{\beta}_{t,p} = L^{\beta}_{s,p}, \quad \text{for } s \le t \le p.
\end{aligned}
\tag{3.2}
\]

In the above and hereafter, for two vectors $v$ and $\tilde v$, by $v \le \tilde v$ we mean that each component

of $v$ is less than or equal to that of $\tilde v$. In addition, we use | · | to denote either a matrix norm

or a vector norm in what follows. Define the matrix

\[
Q^{\beta}_t = \bigl( q_{ij}(t) + \delta_{ij}\langle f(t,i) - \bar f(t), \beta\rangle \bigr).
\]

From the result of [8, Lemma 2.6], (i) H(β, t) is the real, simple eigenvalue of $Q^{\beta}_t$ exceeding the

real parts of all other eigenvalues, and (ii) there exists an eigenvector u(β, t) of $Q^{\beta}_t$ satisfying

$u(\beta,t) = (u_1(\beta,t), \dots, u_{m_0}(\beta,t))$ and $0 < c(\beta,t) < \min_{i\le m_0} u_i \le \max_{i\le m_0} u_i = 1$. Note that

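\[
A^{\beta}_t = \lim_{h\to0} \frac{L^{\beta}_{t,t+h} - I}{h} = \left( \frac{q_{ij}(t)}{\varepsilon} + \delta_{ij}\langle f(t,i) - \bar f(t), \beta\rangle \right) = \frac{1}{\varepsilon}Q^{\varepsilon\beta}_t.
\tag{3.3}
\]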

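So, $A^{\varepsilon^{\kappa-1}\beta}_t u(\varepsilon^{\kappa}\beta,t) = \frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,t)\,u(\varepsilon^{\kappa}\beta,t)$. Use $\mathbb{1}$ to denote the vector with all components

being equal to one. Since $u(0,t) = \mathbb{1}$ for any t ∈ [0, T], and by the continuity of $Q^{\beta}_t$ in t,

$\lim_{\varepsilon\to0} u(\varepsilon^{\kappa}\beta,t) = \mathbb{1}$ uniformly on [0, T], and hence $\lim_{\varepsilon\to0} c(\varepsilon^{\kappa}\beta,t) = 1$ uniformly on [0, T].

So, for small enough ε, there exists a constant c > 0 such that

\[
\begin{aligned}
E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \right)
&= E_i \exp\left( \int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \varepsilon^{\kappa-1}\beta\rangle\,ds \right)\\
&= \bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,T}\mathbb{1}\bigr)(i)\\
&\le \frac{1}{c}\bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,T}u(\varepsilon^{\kappa}\beta,T)\bigr)(i).
\end{aligned}
\]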

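By virtue of the semigroup property (3.2) and (3.3),

\[
\begin{aligned}
\frac{d}{dt}\Bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t)\Bigr)
&= \frac{dL^{\varepsilon^{\kappa-1}\beta}_{0,t}}{dt}u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}\\
&= \lim_{h\to0}\frac{L^{\varepsilon^{\kappa-1}\beta}_{0,t+h} - L^{\varepsilon^{\kappa-1}\beta}_{0,t}}{h}u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}\\
&= L^{\varepsilon^{\kappa-1}\beta}_{0,t}\lim_{h\to0}\frac{L^{\varepsilon^{\kappa-1}\beta}_{t,t+h} - L^{\varepsilon^{\kappa-1}\beta}_{t,t}}{h}u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}\\
&= L^{\varepsilon^{\kappa-1}\beta}_{0,t}A^{\varepsilon^{\kappa-1}\beta}_t u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}\\
&= \frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,t)L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}.
\end{aligned}
\tag{3.4}
\]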

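Since $\lim_{\varepsilon\to0} u(\varepsilon^{\kappa}\beta,t) = \mathbb{1}$ uniformly on [0, T], $\lim_{\varepsilon\to0} \frac{du(\varepsilon^{\kappa}\beta,t)}{dt} = 0$ uniformly on [0, T]. Hence,

for small enough ε > 0, $-u(\varepsilon^{\kappa}\beta,t) \le \frac{du(\varepsilon^{\kappa}\beta,t)}{dt} \le u(\varepsilon^{\kappa}\beta,t)$ on [0, T]. By using (3.4), and

the property that $L^{\varepsilon^{\kappa-1}\beta}_{s,t}$ is monotone and $u(\varepsilon^{\kappa}\beta,t) > 0$, we have

\[
\begin{aligned}
\Bigl(\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,t) - 1\Bigr)L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t)
&\le \frac{d\bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t)\bigr)}{dt}
= \frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,t)L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t) + L^{\varepsilon^{\kappa-1}\beta}_{0,t}\frac{du(\varepsilon^{\kappa}\beta,t)}{dt}\\
&\le \Bigl(\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,t) + 1\Bigr)L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t).
\end{aligned}
\]

Applying Gronwall’s inequality, we conclude that

\[
e^{\int_0^t (\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) - 1)\,ds}\,u(\varepsilon^{\kappa}\beta,0) \le L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t) \le e^{\int_0^t (\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) + 1)\,ds}\,u(\varepsilon^{\kappa}\beta,0).
\]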

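Hence

\[
\begin{aligned}
&\limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \right)\\
&\quad= \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \varepsilon^{\kappa-1}\beta\rangle\,ds \right)\\
&\quad= \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log \bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,T}\mathbb{1}\bigr)(i)\\
&\quad\le \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log \frac{1}{c}\bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,T}u(\varepsilon^{\kappa}\beta,T)\bigr)(i)\\
&\quad\le \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log e^{\int_0^T (\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) + 1)\,ds} + \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log \frac{1}{c}u(\varepsilon^{\kappa}\beta,0)(i)\\
&\quad= \limsup_{\varepsilon\to0} \int_0^T \varepsilon^{1-2\kappa}\Bigl(\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) + 1\Bigr)\,ds\\
&\quad= \limsup_{\varepsilon\to0} \int_0^T \varepsilon^{-2\kappa}H(\varepsilon^{\kappa}\beta,s)\,ds\\
&\quad= \limsup_{\varepsilon\to0} \int_0^T \varepsilon^{-2\kappa}\Bigl[H(0,s) + \langle \varepsilon^{\kappa}\beta, \nabla_\beta H(0,s)\rangle + \frac12\langle \varepsilon^{\kappa}\beta C(s), \varepsilon^{\kappa}\beta\rangle + o(|\varepsilon^{\kappa}\beta|^2)\Bigr]\,ds,
\end{aligned}
\]

where $\nabla_\beta$ denotes the gradient of H with respect to β and C(t) denotes the Hessian matrix

of H(β, t) with respect to β at β = 0. Note that H(0, t) = 0, and that $\nabla_\beta H(0,t) = 0$ since

$\bar f(t) = \sum_{i=1}^{m_0} f(t,i)\nu_i(t)$. Therefore,

\[
\begin{aligned}
&\limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \right)\\
&\quad\le \limsup_{\varepsilon\to0} \varepsilon^{-2\kappa}\int_0^T \Bigl[H(0,s) + \langle \varepsilon^{\kappa}\beta, \nabla_\beta H(0,s)\rangle + \frac12\langle \varepsilon^{\kappa}\beta C(s), \varepsilon^{\kappa}\beta\rangle + o(|\varepsilon^{\kappa}\beta|^2)\Bigr]\,ds\\
&\quad= \frac12\int_0^T \langle \beta C(s), \beta\rangle\,ds.
\end{aligned}
\]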

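Similarly, by

\[
\bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,t}\mathbb{1}\bigr)(i) \ge \bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,t}u(\varepsilon^{\kappa}\beta,t)\bigr)(i) \ge e^{\int_0^t (\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) - 1)\,ds}\,u(\varepsilon^{\kappa}\beta,0)(i),
\]

we obtain

\[
\begin{aligned}
&\liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \right)\\
&\quad= \liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left( \int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \varepsilon^{\kappa-1}\beta\rangle\,ds \right)\\
&\quad= \liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log \bigl(L^{\varepsilon^{\kappa-1}\beta}_{0,T}\mathbb{1}\bigr)(i)\\
&\quad\ge \liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log \Bigl(e^{\int_0^T (\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) - 1)\,ds}\,u(\varepsilon^{\kappa}\beta,0)(i)\Bigr)\\
&\quad\ge \liminf_{\varepsilon\to0} \int_0^T \varepsilon^{1-2\kappa}\Bigl(\frac{1}{\varepsilon}H(\varepsilon^{\kappa}\beta,s) - 1\Bigr)\,ds\\
&\quad= \liminf_{\varepsilon\to0} \int_0^T \varepsilon^{-2\kappa}H(\varepsilon^{\kappa}\beta,s)\,ds\\
&\quad= \frac12\int_0^T \langle \beta C(s), \beta\rangle\,ds.
\end{aligned}
\]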

Thus the proof of the theorem is concluded.

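Theorem 3.3 Assume that $f : \mathcal{M} \to \mathbb{R}^k$ is a well-defined function, β(t) is a step function

on [0, T], and αε(t) is the continuous-time Markov chain with state space M and generator

Qε(t) = Q(t)/ε. Then under Assumption (A), we have

\[
\lim_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log E_i \exp\left\{ \varepsilon^{\kappa-1}\int_0^T \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta(s)\rangle\,ds \right\} = \frac12\int_0^T \langle C(t)\beta(t), \beta(t)\rangle\,dt.
\]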

Proof. The argument is similar to the proof of [8, Theorem 3.2], so we omit the details.

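Remark 3.4 It can be shown that $\int_0^t C(s)\,ds$ is the covariance matrix of the limit distribution of

$\frac{1}{\sqrt{\varepsilon}}\int_0^t (f(s,\alpha^\varepsilon(s)) - \bar f(s))\,ds$ as ε → 0, with A(t) given in (2.3). To see this, note that

\[
\begin{aligned}
\int_0^t C_{ij}(s)\,ds
&= \int_0^t \frac{\partial^2}{\partial\beta_i\,\partial\beta_j}H(\beta,s)\Big|_{\beta=0}\,ds\\
&= \lim_{\varepsilon\to0} \left( \frac{\partial^2}{\partial\beta_i\,\partial\beta_j}\,\varepsilon\log E\exp\Bigl( \frac{1}{\varepsilon}\int_0^t \langle f(s,\alpha^\varepsilon(s)) - \bar f(s), \beta\rangle\,ds \Bigr) \right)\Big|_{\beta=0}\\
&= \lim_{\varepsilon\to0} \frac{1}{\varepsilon}\Bigl[ E\Bigl( \int_0^t (f_i(s,\alpha^\varepsilon(s)) - \bar f_i(s))\,ds \int_0^t (f_j(s,\alpha^\varepsilon(s)) - \bar f_j(s))\,ds \Bigr)\\
&\qquad\qquad - E\Bigl( \int_0^t (f_i(s,\alpha^\varepsilon(s)) - \bar f_i(s))\,ds \Bigr) E\Bigl( \int_0^t (f_j(s,\alpha^\varepsilon(s)) - \bar f_j(s))\,ds \Bigr) \Bigr]\\
&= \lim_{\varepsilon\to0} \mathrm{Cov}\Bigl( \frac{1}{\sqrt{\varepsilon}}\int_0^t (f_i(s,\alpha^\varepsilon(s)) - \bar f_i(s))\,ds,\ \frac{1}{\sqrt{\varepsilon}}\int_0^t (f_j(s,\alpha^\varepsilon(s)) - \bar f_j(s))\,ds \Bigr),
\end{aligned}
\]

where $f_i$ denotes the ith component of the vector f.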

4 Moderate Deviations of Inhomogeneous Functional of Fast-Changing Markov Chain

Let $C_{0,T}(\mathbb{R}^k)$ be the collection of continuous functions defined on [0, T] taking values in $\mathbb{R}^k$.

Define

\[
C^{x}_{0,T}(\mathbb{R}^k) = \{\varphi \in C_{0,T}(\mathbb{R}^k) : \varphi(0) = x\},
\]

the set of all continuous functions $\varphi : [0,T] \to \mathbb{R}^k$ with ϕ(0) = x. Recall the definitions

of nε(t) given in (3.1) and C(t) defined in Theorem 3.2. In what follows, we use $C_f(t)$ to

denote the dependence of C(t) on f. To proceed, we state one of the main results below.

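Theorem 4.1 If Assumption (A) holds, then for each set $B \subset C^{0}_{0,T}(\mathbb{R}^k)$,

\[
-\inf_{\varphi\in B^{\circ}} S_f(\varphi) \le \liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log P\{n^\varepsilon(\cdot) \in B\} \le \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log P\{n^\varepsilon(\cdot) \in B\} \le -\inf_{\varphi\in\bar B} S_f(\varphi),
\tag{4.1}
\]

where

\[
S_f(\varphi) =
\begin{cases}
\displaystyle\int_0^T \sup_{\beta\in\mathbb{R}^k}\Bigl[\langle \dot\varphi(s), \beta\rangle - \frac12\langle C_f(s)\beta, \beta\rangle\Bigr]\,ds, & \text{if } \varphi \in C_{0,T}(\mathbb{R}^k) \text{ is absolutely continuous},\\[1ex]
\infty & \text{otherwise}.
\end{cases}
\tag{4.2}
\]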

Remark 4.2 Note that if $C_f(t)$ is invertible for each t ∈ [0, T], we have that

\[
S_f(\varphi) = \frac12\int_0^T \langle C_f^{-1}(s)\dot\varphi(s), \dot\varphi(s)\rangle\,ds,
\]

and otherwise $S_f(\varphi) = \infty$. The proof of this theorem is similar to [8, Theorem 5.1]. We first

present the following two lemmas.
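In the scalar case (k = 1) the supremum in (4.2) is an elementary Legendre transform: for C > 0, the supremum over β of γβ − Cβ²/2 equals γ²/(2C), attained at β = γ/C. The following sketch checks this by brute force on a grid and evaluates the rate function for a given path derivative with the midpoint rule. The grid parameters and the function names are illustrative assumptions only.

```python
def legendre_quadratic(gamma, C, span=50.0, points=20001):
    """Brute-force sup over beta in [-span, span] of gamma*beta - 0.5*C*beta^2."""
    step = 2.0 * span / (points - 1)
    best = float("-inf")
    for k in range(points):
        beta = -span + k * step
        best = max(best, gamma * beta - 0.5 * C * beta * beta)
    return best


def rate_function(phi_dot, C, T, n=1000):
    """Midpoint-rule value of the integral of phi_dot(s)^2 / (2 C(s)) over [0, T]."""
    h = T / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += h * phi_dot(s) ** 2 / (2.0 * C(s))
    return total
```

For example, with a constant C and the linear path ϕ(t) = ct, the rate is c²T/(2C), so paths that move away from zero faster, or that move in directions of small variance, are exponentially less likely.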

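Lemma 4.3 Let $\Phi(s) = \{\varphi \in C^{0}_{0,T}(\mathbb{R}^k) : S_f(\varphi) \le s\}$. Then the following two statements are

equivalent:

(1) For each $\varphi \in C^{0}_{0,T}(\mathbb{R}^k)$, each s ≥ 0, h > 0, and δ > 0, there is an ε0 > 0 such that for

ε ≤ ε0,

\[
\begin{aligned}
& P\{d_{0T}(n^\varepsilon, \varphi) < \delta\} \ge \exp\Bigl(-\frac{1}{\varepsilon^{1-2\kappa}}(S_f(\varphi) + h)\Bigr),\\
& P\{d_{0T}(n^\varepsilon, \Phi(s)) > \delta\} \le \exp\Bigl(-\frac{1}{\varepsilon^{1-2\kappa}}(s - h)\Bigr),
\end{aligned}
\tag{4.3}
\]

where $d_{0T}$ is the metric on $C_{0,T}(\mathbb{R}^k)$ defined by

\[
d_{0T}(\varphi_1, \varphi_2) = \sup_{0\le t\le T} |\varphi_1(t) - \varphi_2(t)|.
\]

(2) For any set $B \subset C^{0}_{0,T}(\mathbb{R}^k)$,

\[
-\inf_{\varphi\in B^{\circ}} S_f(\varphi) \le \liminf_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log P\{n^\varepsilon \in B\} \le \limsup_{\varepsilon\to0} \varepsilon^{1-2\kappa}\log P\{n^\varepsilon \in B\} \le -\inf_{\varphi\in\bar B} S_f(\varphi).
\tag{4.4}
\]

By virtue of the above lemma, to prove Theorem 4.1, we need only show that (4.3) holds.

Note that the functional $S_f(\varphi)$ is lower semi-continuous, and the set Φ(s) is compact in

$C_{0,T}(\mathbb{R}^k)$ [5, Lemma 4.2, p.231].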

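Lemma 4.4 Suppose Assumption (A) holds. Then $S_f(\varphi)$ is the rate function for the family

of processes nε(t), t ∈ [0, T], as ε → 0; that is, for any s, δ, h > 0, any $\varphi \in C_{0,T}(\mathbb{R}^k)$ with ϕ(0) = 0, and all sufficiently small ε,

\[
P\{d_{0T}(n^\varepsilon, \varphi) < \delta\} \ge \exp\Bigl(-\frac{1}{\varepsilon^{1-2\kappa}}(S_f(\varphi) + h)\Bigr), \tag{4.5}
\]

and

\[
P\{d_{0T}(n^\varepsilon, \Phi(s)) > \delta\} \le \exp\Bigl(-\frac{1}{\varepsilon^{1-2\kappa}}(s - h)\Bigr), \tag{4.6}
\]

where $\Phi(s) = \{\varphi \in C_{0,T}(\mathbb{R}^k) : \varphi(0) = 0,\ S_f(\varphi) \le s\}$.

Proof. The detailed proof is similar to the argument in the proof of He, Yin, and Zhang [8,

Theorem 5.1] and is thus omitted.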

The proof of Theorem 4.1 follows immediately from the above two lemmas.

5 ODEs with Markovian Switching

This section is devoted to dynamic systems represented by ordinary differential equations

with Markovian switching. We consider nonlinear ordinary differential equations involving

a Markov chain. The Markov chain αε(·) is fast varying and can be thought of as a noise

process. The continuous state, on the other hand, varies much more slowly. As a result,

certain averaging takes place. The dynamic system is replaced by an average with respect to

the quasi-stationary measure of the fast-varying process. Consider the system of ordinary

differential equations

\[
\dot x^\varepsilon(t) = b(t, x^\varepsilon(t), \alpha^\varepsilon(t)), \quad x^\varepsilon(0) = x. \tag{5.1}
\]

Throughout this section, we assume that αε(t) ∼ Q(t)/ε, where Q(t) is irreducible and ε > 0

is a small parameter. We also use the following condition.

(B) For each i ∈ M, $b(\cdot,\cdot,i) : [0,T]\times\mathbb{R}^k \to \mathbb{R}^k$ is bounded and continuous; for each t and

each i ∈ M, the first and second partial derivatives of b(t, ·, i) with respect to x are bounded

and continuous.

Remark 5.1 Note that for convenience, we imposed the boundedness of b together with

its first and second derivatives with respect to x. Instead of requiring the boundedness

condition, we may use a truncation device, for example, as in [13, p. 284]. In order to avoid

using complex notation, we decide to use the above condition (B) in this section.

Lemma 5.2. The solution of (5.1), xε(·), converges weakly to x(·) such that for each T > 0

and for any t ∈ [0, T],

\[
\dot x(t) = \bar b(t, x(t)), \quad x(0) = x_0, \tag{5.2}
\]

where

\[
\bar b(t,x) = \sum_{i=1}^{m_0} b(t,x,i)\nu_i(t),
\]

and ν(t) = (ν1(t), . . . , νm0(t)) is the quasi-stationary distribution associated with Q(t).
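The averaging effect in Lemma 5.2 can be observed in a small simulation. The sketch below is a special case chosen for illustration only: a scalar system with drift b(x, i) = −x + c_i, a two-state chain with constant rates a/ε and b/ε, exact integration between jumps, and a fixed random seed. None of these choices, nor the function names, come from the paper.

```python
import math
import random


def simulate_switching(eps, T, x0, a, b, c, rng):
    """Simulate dx/dt = -x + c[state] driven by a two-state chain with rates a/eps, b/eps."""
    t, x, state = 0.0, x0, 0
    rates = (a / eps, b / eps)               # exit rates of states 0 and 1
    while t < T:
        tau = min(rng.expovariate(rates[state]), T - t)   # holding time, capped at T
        # exact solution of the linear ODE over the holding interval
        x = c[state] + (x - c[state]) * math.exp(-tau)
        t += tau
        state = 1 - state
    return x


def averaged_solution(T, x0, a, b, c):
    """Solution at time T of the averaged ODE dx/dt = -x + (nu_1 c_1 + nu_2 c_2)."""
    nu = (b / (a + b), a / (a + b))
    c_bar = nu[0] * c[0] + nu[1] * c[1]
    return c_bar + (x0 - c_bar) * math.exp(-T)
```

With ε = 10⁻³ the simulated endpoint typically differs from the averaged one by a few multiples of √ε, which is the scale examined in the central limit and moderate deviations results of this section.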

The above lemma is similar to a law of large numbers result. To proceed, consider

the scaled difference $\varepsilon^{-\kappa}(x^\varepsilon(t) - x(t))$, κ ∈ [0, 1/2]. Note that when κ = 0, we have a large

deviations type result as follows.

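Theorem 5.3 Under Assumption (B), for each set $B \subset C^{x}_{0,T}(\mathbb{R}^k)$,

\[
-\inf_{\varphi\in B^{\circ}} I(\varphi) \le \liminf_{\varepsilon\to0} \varepsilon\log P\{x^\varepsilon \in B\} \le \limsup_{\varepsilon\to0} \varepsilon\log P\{x^\varepsilon \in B\} \le -\inf_{\varphi\in\bar B} I(\varphi),
\]

where

\[
I(\varphi) =
\begin{cases}
\displaystyle\int_0^T L(\varphi(s), \dot\varphi(s), s)\,ds, & \text{if } \varphi \in C_{0,T}(\mathbb{R}^k) \text{ is absolutely continuous},\\[1ex]
\infty & \text{otherwise},
\end{cases}
\qquad
L(x,\gamma,s) = \sup_{\beta\in\mathbb{R}^k}\bigl[\langle \gamma, \beta\rangle - H(x,\beta,s)\bigr],
\]

and $B^{\circ}$ and $\bar B$ denote the interior and closure of B in $C^{x}_{0,T}(\mathbb{R}^k)$, respectively.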

Let $H : \mathbb{R}^k\times\mathbb{R}^k\times[0,T] \to \mathbb{R}$ be such that

\[
\lim_{\varepsilon\to0} \varepsilon\log E_i \exp\left\{ \frac{1}{\varepsilon}\int_0^T \langle b(\varphi(s), \alpha^\varepsilon(s)), \beta(s)\rangle\,ds \right\} = \int_0^T H(\varphi(s), \beta(s), s)\,ds \tag{5.3}
\]

for any step functions ϕ(s) and β(s) on the interval [0, T] with values in $\mathbb{R}^k$, and for any

i ∈ M. When κ = 1/2, we obtain a central limit theorem. Consider the scaled difference

\[
\lambda^\varepsilon(t) = (x^\varepsilon(t) - x(t))/\sqrt{\varepsilon}. \tag{5.4}
\]

Theorem 5.5 gives the limit distribution of λε(·). To obtain the desired result, we first show

that {λε(·)} is tight.

Lemma 5.4 Under Assumptions (A) and (B), {λε(·)} is tight in $D([0,T] : \mathbb{R}^k)$, the space of

functions that are right continuous with left limits endowed with the Skorohod topology.

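Proof. 1. First, we show that $\sup_{t\in[0,T]} E|\lambda^\varepsilon(t)|^2 < \infty$. Note that

\[
\begin{aligned}
|\lambda^\varepsilon(t)|^2
&= \frac{1}{\varepsilon}\Bigl|\int_0^t \bigl(b(s,x^\varepsilon(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds\Bigr|^2\\
&= \frac{1}{\varepsilon}\Bigl|\int_0^t \bigl(b(s,x^\varepsilon(s),\alpha^\varepsilon(s)) - b(s,x(s),\alpha^\varepsilon(s)) + b(s,x(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds\Bigr|^2\\
&\le \frac{2}{\varepsilon}\Bigl|\int_0^t \bigl(b(s,x^\varepsilon(s),\alpha^\varepsilon(s)) - b(s,x(s),\alpha^\varepsilon(s))\bigr)\,ds\Bigr|^2
 + \frac{2}{\varepsilon}\Bigl|\int_0^t \bigl(b(s,x(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds\Bigr|^2\\
&\le 2TK\int_0^t |\lambda^\varepsilon(s)|^2\,ds + \frac{2}{\varepsilon}\Bigl|\sum_{i=1}^{m_0}\int_0^t b(s,x(s),i)\bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\,ds\Bigr|^2.
\end{aligned}
\]

By virtue of Lemma 2.3, there exists a constant M > 0 such that

\[
E\,\frac{1}{\varepsilon}\Bigl|\int_0^t b(s,x(s),i)\bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\,ds\Bigr|^2 \le M.
\]

Applying Gronwall’s inequality, we have that $\sup_{t\in[0,T]} E|\lambda^\varepsilon(t)|^2 < \infty$.

2. For any δ ≥ 0, 0 ≤ s, t ≤ T, and s ≤ δ,

\[
\begin{aligned}
E^\varepsilon_t|\lambda^\varepsilon(t+s) - \lambda^\varepsilon(t)|^2
&= E^\varepsilon_t\,\frac{1}{\varepsilon}\Bigl|\int_t^{t+s} b(u,x^\varepsilon(u),\alpha^\varepsilon(u))\,du - \int_t^{t+s} \bar b(u,x(u))\,du\Bigr|^2\\
&\le \frac{2}{\varepsilon}E^\varepsilon_t\Bigl|\int_t^{t+s} \bigl(b(u,x^\varepsilon(u),\alpha^\varepsilon(u)) - b(u,x(u),\alpha^\varepsilon(u))\bigr)\,du\Bigr|^2
 + \frac{2}{\varepsilon}E^\varepsilon_t\Bigl|\int_t^{t+s} \bigl(b(u,x(u),\alpha^\varepsilon(u)) - \bar b(u,x(u))\bigr)\,du\Bigr|^2\\
&\le \frac{2}{\varepsilon}K E^\varepsilon_t\int_t^{t+s} |x^\varepsilon(u) - x(u)|^2\,du
 + \frac{2}{\varepsilon}E^\varepsilon_t\sum_{i=1}^{m_0}\Bigl|\int_t^{t+s} b(u,x(u),i)\bigl(I_{\{\alpha^\varepsilon(u)=i\}} - \nu_i(u)\bigr)\,du\Bigr|^2\\
&\le 2sK^\varepsilon_t,
\end{aligned}
\]

where $E^\varepsilon_t$ denotes the conditional expectation on the σ-algebra generated by $\mathcal{F}^\varepsilon_t$, and $K^\varepsilon_t$ is

a random variable such that

\[
\lim_{\delta\to0}\limsup_{\varepsilon\to0}\sup_{0\le s\le\delta} E[2sK^\varepsilon_t] = 0.
\]

By the Kurtz tightness criterion [12, p.47], {λε(·)} is tight.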

Theorem 5.5. Under Assumption (B), λε(·) converges weakly to a process λ(·) that is the

solution of the stochastic differential equation

\[
d\lambda(t) = \bar D(t, x(t))\lambda(t)\,dt + dV(t), \tag{5.5}
\]

where

\[
D_{lm}(t,x,i) = \frac{\partial b_l(t,x,i)}{\partial x_m} \quad\text{and}\quad \bar D_{lm}(t,x) = \sum_{i=1}^{m_0} D_{lm}(t,x,i)\nu_i(t),
\]

and V(t) is a Gaussian process with mean EV(t) = 0 and covariance matrix $\int_0^t C(s)\,ds$,

where C(t) = B(t)A(t)B′(t), $B_{ij}(t) = b_i(t,x(t),j)$, and A(t) is defined in Lemma 2.3.

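Proof. Observe that we can write

\[
\begin{aligned}
\lambda^\varepsilon(t) &= \frac{1}{\sqrt{\varepsilon}}(x^\varepsilon(t) - x(t))\\
&= \frac{1}{\sqrt{\varepsilon}}\int_0^t \bigl(b(s,x^\varepsilon(s),\alpha^\varepsilon(s)) - b(s,x(s),\alpha^\varepsilon(s))\bigr)\,ds
 + \frac{1}{\sqrt{\varepsilon}}\int_0^t \bigl(b(s,x(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds\\
&= \frac{1}{\sqrt{\varepsilon}}\Bigl(\int_0^t b(s,\sqrt{\varepsilon}\lambda^\varepsilon(s) + x(s),\alpha^\varepsilon(s))\,ds - \int_0^t b(s,x(s),\alpha^\varepsilon(s))\,ds\Bigr)
 - \int_0^t D(s,x(s),\alpha^\varepsilon(s))\lambda^\varepsilon(s)\,ds\\
&\quad + \int_0^t D(s,x(s),\alpha^\varepsilon(s))\lambda^\varepsilon(s)\,ds - \int_0^t \bar D(s,x(s))\lambda^\varepsilon(s)\,ds\\
&\quad + \int_0^t \bar D(s,x(s))\lambda^\varepsilon(s)\,ds\\
&\quad + \frac{1}{\sqrt{\varepsilon}}\int_0^t \bigl(b(s,x(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds.
\end{aligned}
\tag{5.6}
\]

For notational simplicity, define

\[
\begin{aligned}
I^\varepsilon_1(t) &:= \frac{1}{\sqrt{\varepsilon}}\bigl(b(t,\sqrt{\varepsilon}\lambda^\varepsilon(t) + x(t),\alpha^\varepsilon(t)) - b(t,x(t),\alpha^\varepsilon(t))\bigr) - D(t,x(t),\alpha^\varepsilon(t))\lambda^\varepsilon(t),\\
I^\varepsilon_2(t) &:= D(t,x(t),\alpha^\varepsilon(t))\lambda^\varepsilon(t) - \bar D(t,x(t))\lambda^\varepsilon(t),\\
I^\varepsilon_3(t) &:= \bar D(t,x(t))\lambda^\varepsilon(t),\\
I^\varepsilon_4(t) &:= \frac{1}{\sqrt{\varepsilon}}\bigl(b(t,x(t),\alpha^\varepsilon(t)) - \bar b(t,x(t))\bigr).
\end{aligned}
\]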

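Then we can write

\[
\lambda^\varepsilon(t) = \int_0^t \bigl(I^\varepsilon_1(s) + I^\varepsilon_2(s) + I^\varepsilon_3(s) + I^\varepsilon_4(s)\bigr)\,ds.
\]

Next, we claim that $\int_0^t I^\varepsilon_4(s)\,ds$ converges weakly to the Gaussian process with mean 0 and

covariance matrix $\int_0^t C(s)\,ds$. Define

\[
V^\varepsilon(t) = \int_0^t I^\varepsilon_4(s)\,ds = \frac{1}{\sqrt{\varepsilon}}\int_0^t \bigl(b(s,x(s),\alpha^\varepsilon(s)) - \bar b(s,x(s))\bigr)\,ds.
\]

Then we can write

\[
V^\varepsilon(t) = \frac{1}{\sqrt{\varepsilon}}\sum_{i=1}^{m_0}\int_0^t b(s,x(s),i)\bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\,ds.
\]

With the notation $V^\varepsilon(t) = (V^\varepsilon_1(t), \dots, V^\varepsilon_k(t))' \in \mathbb{R}^k$, by virtue of Lemma 2.3, it is easily seen

that $\lim_{\varepsilon\to0} E(V^\varepsilon(t)) = 0$ and

\[
\begin{aligned}
\lim_{\varepsilon\to0}\mathrm{Cov}(V^\varepsilon_l(t), V^\varepsilon_m(t))
&= \lim_{\varepsilon\to0}\mathrm{Cov}\Bigl(\frac{1}{\sqrt{\varepsilon}}\sum_{i=1}^{m_0}\int_0^t b_l(s,x(s),i)\bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\,ds,\\
&\qquad\qquad\quad \frac{1}{\sqrt{\varepsilon}}\sum_{j=1}^{m_0}\int_0^t b_m(u,x(u),j)\bigl(I_{\{\alpha^\varepsilon(u)=j\}} - \nu_j(u)\bigr)\,du\Bigr)\\
&= \lim_{\varepsilon\to0}\sum_{i,j} E\Bigl(\int_0^t b_l(s,x(s),i)\frac{1}{\sqrt{\varepsilon}}\bigl(I_{\{\alpha^\varepsilon(s)=i\}} - \nu_i(s)\bigr)\,ds\\
&\qquad\qquad\quad \times \int_0^t b_m(u,x(u),j)\frac{1}{\sqrt{\varepsilon}}\bigl(I_{\{\alpha^\varepsilon(u)=j\}} - \nu_j(u)\bigr)\,du\Bigr)\\
&= \sum_{i,j}\int_0^t b_l(s,x(s),i)\,b_m(s,x(s),j)\,A_{ij}(s)\,ds,
\end{aligned}
\]

where A(t) is defined in Lemma 2.3. Define the matrix $B_{ij}(t) = b_i(t,x(t),j)$ and let

C(t) = B(t)A(t)B′(t). Then by virtue of Lemma 2.3, Vε(·) converges weakly to V(·) with

covariance matrix $\int_0^t C(s)\,ds$.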

Since {λε(·)} is tight, by Prokhorov’s theorem, we can select a convergent subsequence.

For simplicity, we still denote the sequence by {λε(·)} with the limit denoted by λ(·). Define

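a generator $\mathcal{L}$ on a suitable smooth function h(t, λ) by

\[
\mathcal{L}h(t,\lambda) = \frac{\partial h(t,\lambda)}{\partial t} + \langle \nabla_\lambda h(t,\lambda), \bar D(t,x(t))\lambda\rangle + \frac12\sum_{i,j}\frac{\partial^2 h(t,\lambda)}{\partial\lambda_i\,\partial\lambda_j}C_{ij}(t).
\]

Next, we show that for any bounded and continuous function ρ, any positive integer ϑ,

and any $\tau_\iota \le t$,

\[
E\rho(\tau_\iota, \lambda(\tau_\iota); \iota < \vartheta)\Bigl[h(t+s,\lambda(t+s)) - h(t,\lambda(t)) - \int_t^{t+s}\mathcal{L}h(u,\lambda(u))\,du\Bigr] = 0,
\]

for each bounded and continuous function ρ(t, x) and for each twice

continuously differentiable function h(t, x) with compact support. This implies that

\[
h(t,\lambda(t)) - \int_0^{t}\mathcal{L}h(u,\lambda(u))\,du
\]

is a continuous-time martingale, which in turn yields that λ(·) is a solution of the martingale

problem with operator $\mathcal{L}$. To establish the desired result, we divide the interval [t, t + s] into

subintervals $t = t_0 \le t_1 \le \cdots \le t_N = t + s$, where $t_k = t + k\varepsilon$, 0 ≤ k ≤ N. Note that

\[
\begin{aligned}
&E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\bigl[h(t_{k+1},\lambda^\varepsilon(t_{k+1})) - h(t_k,\lambda^\varepsilon(t_k))\bigr]\\
&\quad= E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\bigl[h(t_{k+1},\lambda^\varepsilon(t_{k+1})) - h(t_k,\lambda^\varepsilon(t_{k+1})) + h(t_k,\lambda^\varepsilon(t_{k+1})) - h(t_k,\lambda^\varepsilon(t_k))\bigr]\\
&\quad= E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\Bigl[\int_{t_k}^{t_{k+1}}\frac{\partial h(u,\lambda^\varepsilon(t_{k+1}))}{\partial u}\,du + \langle \nabla h(t_k,\lambda^\varepsilon(t_k)), \lambda^\varepsilon(t_{k+1}) - \lambda^\varepsilon(t_k)\rangle\\
&\qquad\quad + \frac12\langle (\lambda^\varepsilon(t_{k+1}) - \lambda^\varepsilon(t_k))\nabla^2 h(t_k,\lambda^\varepsilon(t_k)), \lambda^\varepsilon(t_{k+1}) - \lambda^\varepsilon(t_k)\rangle + o(|\lambda^\varepsilon(t_{k+1}) - \lambda^\varepsilon(t_k)|^2)\Bigr]\\
&\quad= E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\int_{t_k}^{t_{k+1}}\frac{\partial h(u,\lambda^\varepsilon(t_{k+1}))}{\partial u}\,du\\
&\qquad + E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\Bigl[\Bigl\langle \nabla h(t_k,\lambda^\varepsilon(t_k)), \int_{t_k}^{t_{k+1}}\bigl[I^\varepsilon_1(u) + I^\varepsilon_2(u) + I^\varepsilon_3(u) + I^\varepsilon_4(u)\bigr]\,du\Bigr\rangle\Bigr]\\
&\qquad + \frac12 E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\Bigl\langle \Bigl(\int_{t_k}^{t_{k+1}}\bigl[I^\varepsilon_1(u) + I^\varepsilon_2(u) + I^\varepsilon_3(u) + I^\varepsilon_4(u)\bigr]\,du\Bigr)\nabla^2 h(t_k,\lambda^\varepsilon(t_k)),\\
&\qquad\qquad\qquad \int_{t_k}^{t_{k+1}}\bigl[I^\varepsilon_1(u) + I^\varepsilon_2(u) + I^\varepsilon_3(u) + I^\varepsilon_4(u)\bigr]\,du\Bigr\rangle\\
&\qquad + o(|\lambda^\varepsilon(t_{k+1}) - \lambda^\varepsilon(t_k)|^2).
\end{aligned}
\tag{5.7}
\]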

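Sending ε to 0, we have

\[
\begin{aligned}
&\lim_{\varepsilon\to0} E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\bigl[h(t_{k+1},\lambda^\varepsilon(t_{k+1})) - h(t_k,\lambda^\varepsilon(t_k))\bigr]\\
&\quad= E\rho(\tau_\iota,\lambda(\tau_\iota),\iota<\vartheta)\Bigl[\int_{t_k}^{t_{k+1}}\frac{\partial h(u,\lambda(t_{k+1}))}{\partial u}\,du\\
&\qquad + \Bigl\langle \nabla h(t_k,\lambda(t_k)), \int_{t_k}^{t_{k+1}}\bar D(u,x(u))\lambda(u)\,du\Bigr\rangle\\
&\qquad + \frac12\int_{t_k}^{t_{k+1}}\mathrm{tr}\bigl(\nabla^2 h(t_k,\lambda(t_k))C'(u)\bigr)\,du\Bigr],
\end{aligned}
\tag{5.8}
\]

where tr(A) denotes the trace of A. The expression next to the last line of (5.8) follows from

the fact that $E|I^\varepsilon_1(t)| \to 0$, $E|I^\varepsilon_2(t)| \to 0$ as ε → 0, and $E|I^\varepsilon_4(t)| = 0$. The last line of (5.8) is

concluded by the asymptotic normality of $\int_{t_k}^{t_{k+1}} I^\varepsilon_4(u)\,du$. Owing to the continuity of h(t, x),

the last line of (5.8) can be replaced by

\[
\begin{aligned}
E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\Bigl[&(t_{k+1} - t_k)\frac{\partial h(u,\lambda^\varepsilon(t_{k+1}))}{\partial u}\Big|_{u=t_k}
 + \langle \nabla h(t_k,\lambda^\varepsilon(t_k)), \bar D(t_k,x(t_k))\lambda^\varepsilon(t_k)(t_{k+1} - t_k)\rangle\\
&+ \frac12\mathrm{tr}\bigl(\nabla^2 h(t_k,\lambda^\varepsilon(t_k))C'(t_k)(t_{k+1} - t_k)\bigr)\Bigr].
\end{aligned}
\]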

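Hence,

\[
\begin{aligned}
&E\rho(\tau_\iota,\lambda(\tau_\iota),\iota<\vartheta)\bigl[h(t+s,\lambda(t+s)) - h(t,\lambda(t))\bigr]\\
&\quad= \lim_{\varepsilon\to0} E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\bigl[h(t+s,\lambda^\varepsilon(t+s)) - h(t,\lambda^\varepsilon(t))\bigr]\\
&\quad= \lim_{\varepsilon\to0} E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\sum_{k=0}^{N-1}\bigl[h(t_{k+1},\lambda^\varepsilon(t_{k+1})) - h(t_k,\lambda^\varepsilon(t_k))\bigr]\\
&\quad= \lim_{\varepsilon\to0} E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\sum_{k=0}^{N-1}\Bigl[(t_{k+1} - t_k)\frac{\partial h(u,\lambda^\varepsilon(t_{k+1}))}{\partial u}\Big|_{u=t_k}\\
&\qquad + \langle \nabla h(t_k,\lambda^\varepsilon(t_k)), \bar D(t_k,x(t_k))\lambda^\varepsilon(t_k)(t_{k+1} - t_k)\rangle + \frac12\mathrm{tr}\bigl(\nabla^2 h(t_k,\lambda^\varepsilon(t_k))C'(t_k)(t_{k+1} - t_k)\bigr)\Bigr]\\
&\quad= \lim_{\varepsilon\to0} E\rho(\tau_\iota,\lambda^\varepsilon(\tau_\iota),\iota<\vartheta)\Bigl[\int_t^{t+s}\frac{\partial h(u,\lambda^\varepsilon(u))}{\partial u}\,du + \int_t^{t+s}\langle \nabla h(u,\lambda^\varepsilon(u)), \bar D(u,x(u))\lambda^\varepsilon(u)\rangle\,du\\
&\qquad + \frac12\int_t^{t+s}\mathrm{tr}\bigl(\nabla^2 h(u,\lambda^\varepsilon(u))C'(u)\bigr)\,du\Bigr]\\
&\quad= E\rho(\tau_\iota,\lambda(\tau_\iota),\iota<\vartheta)\int_t^{t+s}\mathcal{L}h(u,\lambda(u))\,du.
\end{aligned}
\]

What we have proved is that λ(·) is a solution of the martingale problem with operator

$\mathcal{L}$. Equivalently, λ(·) is a solution of (5.5). The uniqueness of the solution of (5.5) then yields

that λε(·) converges weakly to λ(·), the solution of (5.5).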

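When κ ∈ (0, 1/2), the sequence nε(·) is in the moderate deviations range. To study

this, let us define a function $f : [0,T]\times\mathcal{M} \to \mathbb{R}^k$ as $f(s,i) = b(s,x(s),i) - \bar b(s,x(s))$. Then

by Theorem 4.1, $n^\varepsilon(t) = \varepsilon^{-\kappa}\int_0^t f(s,\alpha^\varepsilon(s))\,ds$ satisfies the moderate deviations principle with

rate function

\[
S(\varphi) =
\begin{cases}
\displaystyle\int_0^T \sup_{\beta\in\mathbb{R}^k}\Bigl[\langle \dot\varphi(s), \beta\rangle - \frac12\langle C(s)\beta, \beta\rangle\Bigr]\,ds, & \text{if } \varphi \in C_{0,T}(\mathbb{R}^k) \text{ is absolutely continuous},\\[1ex]
\infty & \text{otherwise},
\end{cases}
\tag{5.9}
\]

where C(t) = B(t)A(t)B′(t). By applying the above fact, we obtain the following result.

Theorem 5.6. Under Assumption (B), $y^\varepsilon(t) = \varepsilon^{-\kappa}(x^\varepsilon(t) - x(t))$ satisfies a moderate

deviations principle on $C_{0,T}(\mathbb{R}^k)$ with rate function

\[
\tilde S(\varphi(\cdot)) = S\Bigl(\varphi(\cdot) - \int_0^{\cdot}\bar D(s,x(s))\varphi(s)\,ds\Bigr).
\]

If C(t) = B(t)A(t)B′(t) is invertible, we can write $\tilde S(\varphi)$ explicitly as

\[
\tilde S(\varphi(\cdot)) =
\begin{cases}
\displaystyle\frac12\int_0^T \bigl\langle C(s)^{-1}\bigl(\dot\varphi(s) - \bar D(s,x(s))\varphi(s)\bigr), \dot\varphi(s) - \bar D(s,x(s))\varphi(s)\bigr\rangle\,ds, & \text{if } \varphi \in C_{0,T}(\mathbb{R}^k) \text{ is absolutely continuous},\\[1ex]
\infty & \text{otherwise}.
\end{cases}
\]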

18

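Proof. We can use a decomposition similar to that of (5.6) to get

$$
\begin{aligned}
y^\varepsilon(t)&=\frac{1}{\varepsilon^\kappa}\bigl(x^\varepsilon(t)-x(t)\bigr)\\
&=\frac{1}{\varepsilon^\kappa}\int_0^t\bigl(b(s,x^\varepsilon(s),\alpha^\varepsilon(s))-b(s,x(s),\alpha^\varepsilon(s))\bigr)\,ds
+\frac{1}{\varepsilon^\kappa}\int_0^t\bigl(b(s,x(s),\alpha^\varepsilon(s))-\overline{b}(s,x(s))\bigr)\,ds\\
&=\frac{1}{\varepsilon^\kappa}\Bigl(\int_0^t b(s,\varepsilon^\kappa y^\varepsilon(s)+x(s),\alpha^\varepsilon(s))\,ds-\int_0^t b(s,x(s),\alpha^\varepsilon(s))\,ds\Bigr)
-\int_0^t D(s,x(s),\alpha^\varepsilon(s))y^\varepsilon(s)\,ds\\
&\quad+\int_0^t\bigl[D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr]y^\varepsilon(s)\,ds\\
&\quad+\int_0^t\overline{D}(s,x(s))y^\varepsilon(s)\,ds\\
&\quad+\frac{1}{\varepsilon^\kappa}\int_0^t\bigl(b(s,x(s),\alpha^\varepsilon(s))-\overline{b}(s,x(s))\bigr)\,ds.
\end{aligned}
\tag{5.10}
$$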

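Define

$$
\begin{aligned}
I_1^\varepsilon(t)&:=\frac{1}{\varepsilon^\kappa}\bigl(b(t,\varepsilon^\kappa y^\varepsilon(t)+x(t),\alpha^\varepsilon(t))-b(t,x(t),\alpha^\varepsilon(t))\bigr)-D(t,x(t),\alpha^\varepsilon(t))y^\varepsilon(t),\\
I_2^\varepsilon(t)&:=D(t,x(t),\alpha^\varepsilon(t))y^\varepsilon(t)-\overline{D}(t,x(t))y^\varepsilon(t),\\
I_3^\varepsilon(t)&:=\overline{D}(t,x(t))y^\varepsilon(t),\\
I_4^\varepsilon(t)&:=\frac{1}{\varepsilon^\kappa}\bigl(b(t,x(t),\alpha^\varepsilon(t))-\overline{b}(t,x(t))\bigr).
\end{aligned}
\tag{5.11}
$$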

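Similar to the proof of Theorem 4.1, we can show that

$$
n^\varepsilon(t)=\varepsilon^{-\kappa}\int_0^t f(s,\alpha^\varepsilon(s))\,ds=\int_0^t I_4^\varepsilon(s)\,ds
$$

satisfies a moderate deviations principle with the rate function (5.9). Then we consider the process $z^\varepsilon(t)$ defined as the solution of

$$
dz^\varepsilon(t)=\overline{D}(t,x(t))z^\varepsilon(t)\,dt+dn^\varepsilon(t).
$$

Since $z^\varepsilon(\cdot)$ can be seen as a continuous mapping of $n^\varepsilon(\cdot)$ on $C_{0,T}(\mathbb{R}^k)$, by virtue of the contraction principle of large deviations, $z^\varepsilon(\cdot)$ satisfies a moderate deviations principle with rate function

$$
\begin{aligned}
\widetilde{S}(\phi(\cdot))&=\inf\Bigl\{S(\psi(\cdot)):\ \phi(\cdot)=\psi(\cdot)+\int_0^{\cdot}\overline{D}(s,x(s))\phi(s)\,ds\Bigr\}\\
&=S\Bigl(\phi(\cdot)-\int_0^{\cdot}\overline{D}(s,x(s))\phi(s)\,ds\Bigr).
\end{aligned}
$$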
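The map from the driving path n to the solution z of z(t) = n(t) + ∫₀ᵗ D̄(s, x(s)) z(s) ds, to which the contraction principle is applied above, can be illustrated numerically. The following is a minimal sketch, assuming a scalar state, a uniform time grid, and an explicit Euler step; the function name `solve_z`, the stand-in coefficient `d_bar`, and the test input n(t) = t are all hypothetical choices made only for illustration.

```python
import math

def solve_z(n_path, d_bar, dt):
    """Explicit Euler sketch of z(t) = n(t) + int_0^t d_bar(s) z(s) ds.

    n_path[k] approximates n(k * dt); d_bar(s) is a scalar stand-in
    for the averaged coefficient evaluated along x(s).
    """
    z = [n_path[0]]
    for k in range(len(n_path) - 1):
        dn = n_path[k + 1] - n_path[k]          # increment of the driving path
        z.append(z[k] + d_bar(k * dt) * z[k] * dt + dn)
    return z

steps, T = 1000, 1.0
dt = T / steps
n_path = [k * dt for k in range(steps + 1)]      # illustrative input n(t) = t
z = solve_z(n_path, lambda s: -1.0, dt)          # illustrative d_bar = -1

# For this input the exact solution is z(t) = 1 - exp(-t).
print(z[-1], 1.0 - math.exp(-T))
```

Because the map is linear in n for a fixed coefficient, a small perturbation of the input path produces a proportionally small perturbation of the output, which is the continuity property the argument relies on.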


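Next, we will show that $y^\varepsilon(t)$ and $z^\varepsilon(t)$ have the same rate function. To do so, we need only show that they are exponentially equivalent, i.e., for any $\delta>0$,

$$
\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}|y^\varepsilon(t)-z^\varepsilon(t)|>\delta\Bigr)=-\infty.
\tag{5.12}
$$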

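Observe that

$$
\begin{aligned}
|y^\varepsilon(t)-z^\varepsilon(t)|&=\Bigl|\int_0^t\overline{D}(s,x(s))\bigl(y^\varepsilon(s)-z^\varepsilon(s)\bigr)\,ds+\int_0^t I_2^\varepsilon(s)\,ds+\int_0^t I_1^\varepsilon(s)\,ds\Bigr|\\
&\le|\overline{D}|_\infty\int_0^t|y^\varepsilon(s)-z^\varepsilon(s)|\,ds+\Bigl|\int_0^t\bigl[I_2^\varepsilon(s)+I_1^\varepsilon(s)\bigr]ds\Bigr|,
\end{aligned}
$$

where $|\overline{D}|_\infty=\sup_{t\in[0,T]}|\overline{D}(t,x(t))|$. Applying Gronwall's inequality, and by the boundedness of $D$ and $\overline{D}$ on $[0,T]$,

$$
|y^\varepsilon(t)-z^\varepsilon(t)|\le e^{|\overline{D}|_\infty t}\Bigl|\int_0^t\bigl[I_2^\varepsilon(s)+I_1^\varepsilon(s)\bigr]ds\Bigr|.
$$

We need to show that

$$
\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t I_2^\varepsilon(s)\,ds\Bigr|>\delta\Bigr)=-\infty,
\tag{5.13}
$$

$$
\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t I_1^\varepsilon(s)\,ds\Bigr|>\delta\Bigr)=-\infty.
\tag{5.14}
$$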

First, let us verify (5.14). By the boundedness of $D$ and $\overline{D}$,

$$
|I_1^\varepsilon(t)|\le K|y^\varepsilon(t)|,
$$

and

$$
|I_1^\varepsilon(t)|\le K\varepsilon^\kappa|y^\varepsilon(t)|^2.
$$
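One way to see where both bounds come from is a first-order Taylor expansion with integral remainder. The sketch below rests on assumptions that are not stated in the lines above: that $D(t,x,i)$ is the derivative of $b(t,x,i)$ in $x$, and that $b(t,\cdot,i)$ is twice continuously differentiable with first and second derivatives bounded by a constant $K_0$ in a suitable matrix norm. The symbols $K_0$ and $\theta$ are introduced only for this illustration.

```latex
\begin{aligned}
b(t,x+\varepsilon^\kappa y,i)-b(t,x,i)
  &= \varepsilon^\kappa\int_0^1 D(t,x+\theta\varepsilon^\kappa y,i)\,y\,d\theta,\\
I_1^\varepsilon(t)
  &= \int_0^1\bigl[D(t,x(t)+\theta\varepsilon^\kappa y^\varepsilon(t),\alpha^\varepsilon(t))
     -D(t,x(t),\alpha^\varepsilon(t))\bigr]\,y^\varepsilon(t)\,d\theta,\\
|I_1^\varepsilon(t)| &\le 2K_0\,|y^\varepsilon(t)|,
\qquad
|I_1^\varepsilon(t)| \le K_0\int_0^1\theta\,\varepsilon^\kappa|y^\varepsilon(t)|^2\,d\theta
   = \tfrac12K_0\,\varepsilon^\kappa|y^\varepsilon(t)|^2 .
\end{aligned}
```

Under these assumptions, both displayed bounds hold with $K=2K_0$.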

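So, from the decomposition (5.10), we have

$$
|y^\varepsilon(t)|\le|n^\varepsilon(t)|+\bigl(|D-\overline{D}|_\infty+|\overline{D}|_\infty\bigr)\int_0^t|y^\varepsilon(s)|\,ds.
$$

By Gronwall's inequality,

$$
|y^\varepsilon(t)|\le K|n^\varepsilon(t)|.
$$

Hence,

$$
P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t I_1^\varepsilon(s)\,ds\Bigr|>\delta\Bigr)\le P\Bigl(\int_0^T|n^\varepsilon(s)|^2\,ds>\frac{\delta}{K\varepsilon^\kappa}\Bigr).
$$

Since $n^\varepsilon$ satisfies a moderate deviations principle with rate function $S(\cdot)$, $\int_0^T|n^\varepsilon(s)|^2\,ds$ satisfies a moderate deviations principle with rate function $I(x)=\inf\{S(\phi):\int_0^T|\phi(s)|^2\,ds=x\}$ by the contraction principle. It is easily seen that $I(x)\to\infty$ as $|x|\to\infty$. Thus, (5.14) is concluded.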


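To prove (5.13), we apply the Schwarz inequality:

$$
\begin{aligned}
&P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t I_2^\varepsilon(s)\,ds\Bigr|>\delta\Bigr)\\
&\quad=P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t\bigl(D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr)y^\varepsilon(s)\,ds\Bigr|>\delta\Bigr)\\
&\quad\le P\Bigl(\sup_{t\in[0,T]}\Bigl|\int_0^t\bigl(D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr)y^\varepsilon(s)\,ds\Bigr|^2>\delta^2\Bigr)\\
&\quad\le P\Bigl(\sup_{t\in[0,T]}\int_0^t\bigl|D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr|^2ds\int_0^t|y^\varepsilon(s)|^2ds>\delta^2\Bigr)\\
&\quad\le P\Bigl(\sup_{t\in[0,T]}\int_0^t\bigl|D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr|^2ds>\frac{\delta}{M}\Bigr)
+P\Bigl(\sup_{t\in[0,T]}\int_0^t|y^\varepsilon(s)|^2ds>\delta M\Bigr).
\end{aligned}
$$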

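Since $n^\varepsilon(t)$ satisfies a moderate deviations principle,

$$
\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}\int_0^t|y^\varepsilon(s)|^2ds>\delta M\Bigr)=-\infty,
$$

by letting $M\to\infty$. Define $f(s,i)=D(s,x(s),i)-\overline{D}(s,x(s))$; then $f(s,\alpha^\varepsilon(s))$ satisfies a moderate deviations principle with rate function $S_f$. Applying the contraction principle,

$$
\int_0^t\bigl|D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr|^2ds
$$

satisfies a moderate deviations principle with rate function $I(x)=\inf\{S_f(\phi):\int_0^T|\phi(s)|^2ds=x\}$. Moreover, $I(x)\to\infty$ as $x\to\infty$. Hence,

$$
\begin{aligned}
&\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}\int_0^t\bigl|D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr|^2ds>\frac{\delta}{M}\Bigr)\\
&\quad=\limsup_{\varepsilon\to 0}\varepsilon^{1-2\kappa}\log P\Bigl(\sup_{t\in[0,T]}\frac{1}{\varepsilon^\kappa}\int_0^t\bigl|D(s,x(s),\alpha^\varepsilon(s))-\overline{D}(s,x(s))\bigr|^2ds>\frac{\delta}{M\varepsilon^\kappa}\Bigr)\\
&\quad=-\infty.
\end{aligned}
$$

Thus (5.12) is proved and the proof of the theorem is concluded.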

6 Examples

In this section, we consider a couple of examples that require the use of moderate deviations.

Finally, a few additional remarks are provided.

6.1 Examples

The first example is the queueing example introduced in the introduction; the second and third examples are controlled switching linear systems in continuous time.


Example 6.1 Let us revisit the queueing problem given in the introduction. Using the results obtained in this paper, we can show that for each $i\in\mathcal{M}$, any $\kappa\in(0,1/2)$, and some $T>0$,

$$
P\Bigl(\Bigl|\frac{1}{\varepsilon^\kappa}\sup_{0\le\varsigma\le T}\int_0^\varsigma\bigl[I_{\{\alpha^\varepsilon(s)=i\}}-\nu_i(s)\bigr]ds\Bigr|\ge c\Bigr)\le\exp\bigl(-k_0/\varepsilon^{1-2\kappa}\bigr)\quad\text{for some }k_0>0,
$$

which provides accurate asymptotic error bounds on the desired occupation measures for the queueing problem.
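The scaled occupation-time deviation in this bound can be explored by simulation. The sketch below is only an illustration under simplifying assumptions: a time-homogeneous two-state chain with generator Q/ε, Q = [[-1, 1], [1, -1]], stands in for the queueing model (so the quasi-stationary probability is the constant 1/2), and the deviation is evaluated at the terminal time T only rather than as a supremum over the interval. The function name `occupation_time` and all parameter values are hypothetical.

```python
import random

def occupation_time(eps, T, state0=0, seed=0):
    """Time spent in state 0 on [0, T] by a two-state chain with generator Q/eps.

    Q = [[-1, 1], [1, -1]] is an illustrative, time-homogeneous stand-in.
    """
    rng = random.Random(seed)
    t, state, occ = 0.0, state0, 0.0
    while t < T:
        hold = rng.expovariate(1.0 / eps)   # sojourn time with rate 1/eps
        hold = min(hold, T - t)             # truncate the last sojourn at T
        if state == 0:
            occ += hold
        t += hold
        state = 1 - state                   # jump to the other state
    return occ

eps, T, kappa = 1e-3, 1.0, 0.25
occ = occupation_time(eps, T)
# eps^{-kappa} * int_0^T [I{alpha = 0} - 1/2] ds for one sample path
scaled = (occ - 0.5 * T) / eps**kappa
print(occ, scaled)
```

Repeating the run over many seeds and smaller values of `eps` gives an empirical picture of how rarely the scaled deviation exceeds a fixed threshold.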

Our next two examples are concerned with randomly switched systems. The original motivation comes from [22]. If we combine two stable linear systems, intuitively we may expect the resulting system to remain stable. This, however, is not the case, as shown in the aforementioned reference, which is counterintuitive at first glance. In a similar spirit, in the third example, we consider a case in which three linear systems are unstable, but when they are modulated by a Markov chain, the resulting system becomes stable in the limit. Why such a case can arise was explored in [23, Section 5.6] using perturbed Liapunov function methods. In what follows, we study the error between the original systems and the associated limit by means of a moderate deviations approach. For any $T>0$, denote by $d_{0,T}$ the metric on $C_{0,T}(\mathbb{R}^k)$ defined by $d_{0,T}(\phi_1,\phi_2)=\sup_{0\le t\le T}|\phi_1(t)-\phi_2(t)|$.

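Example 6.2 As a variation of the system considered in [22] (see [25]), we consider the following example. Suppose that $\alpha^\varepsilon(\cdot)$ is a continuous-time Markov chain with state space $\mathcal{M}=\{1,2\}$ and generator $Q^\varepsilon=Q/\varepsilon$, where

$$
Q=\begin{pmatrix}-1&1\\1&-1\end{pmatrix}.
$$

Consider $\dot{x}=A(\alpha^\varepsilon(t))x+B(\alpha^\varepsilon(t))u(t)$ with the state feedback $u(t)=-K(\alpha^\varepsilon(t))x(t)$. Then one gets

$$
\dot{x}=\bigl[A(\alpha^\varepsilon(t))-B(\alpha^\varepsilon(t))K(\alpha^\varepsilon(t))\bigr]x.
$$

Suppose that $\alpha^\varepsilon(t)\in\{1,2\}$ and that

$$
G(1)=A(1)-B(1)K(1)=\begin{pmatrix}-100&20\\200&-100\end{pmatrix},\qquad
G(2)=A(2)-B(2)K(2)=\begin{pmatrix}-100&200\\20&-100\end{pmatrix}.
$$

The system can be rewritten as

$$
\dot{x}^\varepsilon(t)=G(\alpha^\varepsilon(t))x^\varepsilon(t).
\tag{6.1}
$$

It is easily checked that both $G(1)$ and $G(2)$ are stable matrices.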


Since $Q$ is irreducible, the stationary distribution associated with $Q$ is given by $(1/2,1/2)$. Consequently, as $\varepsilon\to 0$, $x^\varepsilon(\cdot)$ converges weakly to $x(\cdot)$, which is a solution of the system

$$
\dot{x}(t)=\overline{G}x(t),\quad\text{where}\quad\overline{G}=\frac12\bigl(G(1)+G(2)\bigr)=\begin{pmatrix}-100&110\\110&-100\end{pmatrix}.
$$

Note that $\overline{G}$ is an unstable matrix in that one of its eigenvalues ($-210$) lies in the left half of the complex plane and the other ($10$) lies in the right half. Thus, the averaged system is unstable. Moreover,

$$
P\Bigl(\frac{1}{\varepsilon^\kappa}d_{0,T}\bigl(x^\varepsilon(\cdot),x(\cdot)\bigr)\ge\delta\Bigr)\le\exp\bigl(-c_1/\varepsilon^{1-2\kappa}\bigr)\quad\text{for some }c_1>0.
$$

We can conclude that, on any finite interval, the probability that the sample paths of (6.1) deviate from those of the averaged system by at least $\varepsilon^\kappa\delta$, for $\delta>0$, is exponentially small.
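The stability claims in this example can be checked directly from the matrices. The following is a small sketch using only the standard library; the helper names `eig2` and `is_stable` are hypothetical, and the eigenvalues are obtained from the characteristic polynomial of a 2×2 matrix. It averages G(1) and G(2) with the stationary weights (1/2, 1/2).

```python
import cmath

def eig2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr - root) / 2, (tr + root) / 2)

def is_stable(m):
    """True if every eigenvalue has a negative real part."""
    return all(lam.real < 0 for lam in eig2(m))

G1 = [[-100.0, 20.0], [200.0, -100.0]]
G2 = [[-100.0, 200.0], [20.0, -100.0]]
nu = (0.5, 0.5)  # stationary distribution of Q = [[-1, 1], [1, -1]]

# Entrywise average of G(1) and G(2) under the stationary distribution
G_bar = [[nu[0] * G1[r][c] + nu[1] * G2[r][c] for c in range(2)]
         for r in range(2)]

print(G_bar)
print(is_stable(G1), is_stable(G2), is_stable(G_bar))
print(eig2(G_bar))
```

Each of G(1) and G(2) has eigenvalues -100 ± sqrt(4000), both negative, while the average has one negative and one positive eigenvalue.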

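Example 6.3 Suppose that $\alpha^\varepsilon(\cdot)$ is a 3-state Markov chain with generator $Q/\varepsilon$ and the system of interest is given by

$$
\begin{aligned}
&\dot{x}^\varepsilon(t)=G(\alpha^\varepsilon(t))x^\varepsilon(t),\quad\alpha^\varepsilon(t)\sim Q/\varepsilon,\quad\text{where}\\
&G(1)=\begin{pmatrix}-4&-1\\0&1\end{pmatrix},\quad
G(2)=\begin{pmatrix}\tfrac12&0\\0&\tfrac12\end{pmatrix},\quad
G(3)=\begin{pmatrix}1&0\\-1&-4\end{pmatrix},\\
&Q=\begin{pmatrix}-1&1&0\\\tfrac12&-1&\tfrac12\\0&1&-1\end{pmatrix}.
\end{aligned}
\tag{6.2}
$$

Note that for each $i=1,2,3$, the matrix $G(i)$ is not stable, and that $Q$ is irreducible with the associated stationary distribution given by $(1/4,1/2,1/4)$. As a result, it can be shown that $x^\varepsilon(\cdot)$ converges weakly to $x(\cdot)$, which is a solution of the following system:

$$
\dot{x}(t)=\overline{G}x(t),\qquad\overline{G}=\begin{pmatrix}-\tfrac12&-\tfrac14\\-\tfrac14&-\tfrac12\end{pmatrix}.
\tag{6.3}
$$

It is readily verified that system (6.3) is asymptotically stable. For any $\delta>0$, the probability that $x^\varepsilon(\cdot)$ deviates from $x(\cdot)$ by at least $\varepsilon^\kappa\delta$ is exponentially small. That is, for any $T>0$,

$$
P\Bigl(\frac{1}{\varepsilon^\kappa}d_{0,T}\bigl(x^\varepsilon(\cdot),x(\cdot)\bigr)\ge\delta\Bigr)\le\exp\bigl(-c_2/\varepsilon^{1-2\kappa}\bigr)\quad\text{for some }c_2>0.
$$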
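The stationary distribution, the averaged matrix, and the stability claims of this example can be verified with exact rational arithmetic. The sketch below uses only the standard library; the helper names `stationary` and `is_hurwitz_2x2` are hypothetical. Stability of a real 2×2 matrix is tested through the equivalent condition of negative trace and positive determinant.

```python
from fractions import Fraction as F

def stationary(Q):
    """Solve nu Q = 0, sum(nu) = 1 by Gauss-Jordan elimination over the rationals."""
    n = len(Q)
    # First n-1 balance equations (columns of nu Q = 0), then the normalization row.
    rows = [[Q[i][j] for i in range(n)] + [F(0)] for j in range(n - 1)]
    rows.append([F(1)] * n + [F(1)])
    for col in range(n):
        piv = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        pivot = rows[col][col]
        rows[col] = [v / pivot for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def is_hurwitz_2x2(m):
    """A real 2x2 matrix is stable iff its trace is negative and its determinant positive."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return trace < 0 and det > 0

Q = [[F(-1), F(1), F(0)], [F(1, 2), F(-1), F(1, 2)], [F(0), F(1), F(-1)]]
G = [
    [[F(-4), F(-1)], [F(0), F(1)]],
    [[F(1, 2), F(0)], [F(0), F(1, 2)]],
    [[F(1), F(0)], [F(-1), F(-4)]],
]

nu = stationary(Q)
# Entrywise average of G(1), G(2), G(3) under the stationary distribution
G_bar = [[sum(nu[i] * G[i][r][c] for i in range(3)) for c in range(2)]
         for r in range(2)]

print(nu)
print(G_bar)
print([is_hurwitz_2x2(g) for g in G], is_hurwitz_2x2(G_bar))
```

The averaged matrix has trace -1 and determinant 3/16, so its eigenvalues are -1/4 and -3/4, while each G(i) has at least one positive eigenvalue.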

6.2 Concluding Remarks

This paper derived moderate deviations for two-time-scale Markovian systems. Both lower

and upper bounds for moderate deviations were obtained. The functional and Markov chain

under consideration are both inhomogeneous. The results obtained can be used to treat

queueing problems, consensus problems with randomly switching topology, and randomly

switched systems. Our results will yield new insight on the study of dynamic systems in

random environment.


References

[1] P. Billingsley, Convergence of Probability Measures, J. Wiley, New York, NY, 1968.

[2] A. de Acosta, Moderate deviations for empirical measure of Markov chains: Lower bound, Ann. Probab., 25 (1998), 259–284.

[3] H. Djellout, Moderate deviations for martingale differences and applications to φ-mixing sequences, Stoch. Stoch. Rep., 73 (2002), 37–63.

[4] H. Djellout and A. Guillin, Moderate deviations for Markov chains with atom, Stochastic Process. Appl., 95 (2001), no. 2, 203–217.

[5] M.I. Freidlin and A.D. Wentzell, Random Perturbations of Dynamical Systems, Springer-Verlag, New York, 1984.

[6] A. Guillin, Moderate deviations of inhomogeneous functionals of Markov process and application to averaging, Stochastic Process. Appl., 92 (2001), 287–313.

[7] F. Hollander, Large deviations, Amer. Math. Soc., 2008.

[8] Q. He, G. Yin, and Q. Zhang, Large Deviations for Two-Time-Scale Systems Driven by Nonhomogeneous Markov Chains and LQ Control Problems, SIAM J. Control Optim., 49 (2011), 1737–1765.

[9] R.Z. Khasminskii, On stochastic processes defined by differential equations with a small parameter, Theory Probab. Appl., 11 (1966), 211–228.

[10] R.Z. Khasminskii, G. Yin, and Q. Zhang, Asymptotic expansions of singularly perturbed systems involving rapidly fluctuating Markov chains, SIAM J. Appl. Math., 56 (1996), 277–293.

[11] R.Z. Khasminskii, G. Yin, and Q. Zhang, Constructing asymptotic series for probability distribution of Markov chains with weak and strong interactions, Quart. Appl. Math., LV (1997), 177–200.

[12] H.J. Kushner, Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory, MIT Press, Cambridge, MA, 1984.

[13] H.J. Kushner and G. Yin, Stochastic Approximation and Recursive Algorithms and Applications, 2nd ed., Springer-Verlag, New York, NY, 2003.

[14] R.S. Liptser and V. Spokoiny, Moderate deviations type evaluation for integral functional of diffusion processes, Electron. J. Probab., 4 (1999), 1–25.

[15] W. Massey, Asymptotic analysis of the time dependent M/M/1 queue, Math. Oper. Res., 10 (1985), 305–327.

[16] W.A. Massey and W. Whitt, Uniform acceleration expansions for Markov chains with time-varying rates, Ann. Appl. Probab., 8 (1998), 1130–1155.

[17] S.P. Sethi and Q. Zhang, Hierarchical Decision Making in Stochastic Manufacturing Systems, Birkhäuser, Boston, MA, 1994.

[18] H.A. Simon and A. Ando, Aggregation of variables in dynamic systems, Econometrica, 29 (1961), 111–138.

[19] A.V. Skorohod, Studies in the Theory of Random Processes, Dover, New York, 1982.

[20] L. Wu, Moderate deviations of dependent random variables related to CLT, Ann. Probab., 23 (1995), 420–445.

[21] L. Wu, Large and moderate deviations and exponential convergence for stochastic damping Hamiltonian systems, Stochastic Process. Appl., 91 (2001), 205–238.

[22] L.Y. Wang, P.P. Khargonekar, and A. Beydoun, Robust control of hybrid systems: Performance guided strategies, in Hybrid Systems V, P. Antsaklis, W. Kohn, M. Lemmon, A. Nerode, and S. Sastry, Eds., Lecture Notes in Computer Sci., 1567, 356–389.

[23] G. Yin and Q. Zhang, Continuous-time Markov Chains and Applications: A Two-Time-Scale Approach, 2nd ed., Springer-Verlag, New York, NY, 2013.

[24] G. Yin, Q. Zhang, and G. Badowski, Asymptotic properties of a singularly perturbed Markov chain with inclusion of transient states, Ann. Appl. Probab., 10 (2000), 549–572.

[25] G. Yin, G. Zhao, and F. Wu, Regularization and stabilization of randomly switching dynamic systems, SIAM J. Appl. Math., 72 (2012), 1361–1382.

[26] Q. Zhang and G. Yin, On nearly optimal controls of hybrid LQG problems, IEEE Trans. Automat. Control, 44 (1999), 2271–2282.