Markowitz's Mean-Variance Portfolio Selection With Regime Switching: From Discrete-Time Models to...

12
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004 349 Markowitz’s Mean-Variance Portfolio Selection With Regime Switching: From Discrete-Time Models to Their Continuous-Time Limits G. Yin, Fellow, IEEE, and Xun Yu Zhou, Senior Member, IEEE Abstract—We study a discrete-time version of Markowitz’s mean-variance portfolio selection problem where the market parameters depend on the market mode (regime) that jumps among a finite number of states. The random regime switching is delineated by a finite-state Markov chain, based on which a discrete-time Markov modulated portfolio selection model is presented. Such models either arise from multiperiod portfolio selections or result from numerical solution of continuous-time problems. The natural connections between discrete-time models and their continuous-time counterpart are revealed. Since the Markov chain frequently has a large state space, to reduce the complexity, an aggregated process with smaller state–space is introduced and the underlying portfolio selection is formulated as a two-time-scale problem. We prove that the process of interest yields a switching diffusion limit using weak convergence methods. Next, based on the optimal control of the limit process obtained from our recent work, we devise portfolio selection strategies for the original problem and demonstrate their asymptotic optimality. Index Terms—Discrete-time model, linear-quadratic problem, Markov chain, Markowitz’s mean-variance portfolio selection, singular perturbation, switching diffusion. I. INTRODUCTION M ARKOWITZ’S Nobel-prize winning mean-variance portfolio selection model (for a single period) [16], [17] provides a foundation of modern finance theory; it has inspired numerous extensions and applications. The Markowitz model aims to maximize the terminal wealth, in the mean time to minimize the risk using the variance as a criterion, which enables investors to seek highest return upon specifying their acceptable risk level. There have been continuing efforts in extending portfolio se- lection from the static single period model to dynamic multi- period or continuous-time models. However, the research works on dynamic portfolio selections have been dominated by those of maximizing expected utility functions of the terminal wealth, Manuscript received May 15, 2002; revised December 15, 2002. Recom- mended by Guest Editor B. Pasik-Duncan. The work of G. Yin was supported in part by the National Science Foundation under Grant DMS-0304928 and in part by Wayne State University under the WSU Research Enhancement Program. The work of X. Y. Zhou was supported in part by the RGC Earmarked Grants CUHK 4435/99E and CUHK 4175/00E. G. Yin is with the Department of Mathematics, Wayne State University, De- troit, MI 48202 USA (e-mail: [email protected]). X. Y. Zhou is with the Department of Systems Engineering and Engineering Management, the Chinese University of Hong Kong, Shatin, Hong Kong (e-mail: [email protected]). Digital Object Identifier 10.1109/TAC.2004.824479 which is in spirit different from the original Markowitz’s model. For example, the multiperiod utility models were investigated in [7]–[10] and [19]–[21], among many others. As for the contin- uous-time case, the famous Merton paper [18] studied a utility maximization problem with market factors modeled as a dif- fusion process (rather than as a Markov chain). Along another line, the mean-variance hedging problem was investigated by Duffie and Richardson [6] and Schweizer [22], where an op- timal dynamic strategy was sought to hedge contingent claims in an imperfect market. Optimal hedging policies [6], [22] were obtained primarily based on the so-called projection theorem. Very recently, using the stochastic linear-quadratic (LQ) theory developed in [4] and [28], Zhou and Li [31] introduced a stochastic LQ control framework to study the continuous-time version of the Markowitz’s problem. Within this framework, they derived closed-form efficient policies (in the Markowitz sense) along with an explicit expression of the efficient frontier. In the aforementioned references, for continuous-time formulations of mean-variance problems, stochastic differ- ential equations and geometric Brownian motion models were used. Although such models have been used in a wide variety of situations, they have certain limitations since all the key parameters, including the interest rate and the stock appreciation/volatility rates, are assumed to be insensitive to the (very likely) drastic changes in the market. Typically, the underlying market may have many “modes” or “regimes” that switch among themselves from time to time. The market mode could reflect the state of the underlying economy, the general mood of investors in the market, and so on. For example, the market can be roughly divided as “bullish” and “bearish,” while the market parameters can be quite different in the two modes. One could certainly introduce more intermediate states between the two extremes. A system, commonly referred to as the regime switching model, can be formulated as a stochastic differential equation whose coefficients are modulated by a continuous-time Markov chain. Such a model has been employed in the literature to discuss options; see [1], [3], and [5]. Moreover, an investment-consumption model with regime switching was studied in [29]; an optimal stock selling rule for a Markov-modulated Black–Scholes model was derived in [30]; a stochastic approximation approach for the liquidation problem could be found in [25]. In [32], we treated the continuous-time version of Markowitz’s mean-variance portfolio selection with regime switching and derived the efficient portfolio and efficient frontier explicitly. 0018-9286/04$20.00 © 2004 IEEE

Transcript of Markowitz's Mean-Variance Portfolio Selection With Regime Switching: From Discrete-Time Models to...

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004 349

Markowitz’s Mean-Variance Portfolio Selection WithRegime Switching: From Discrete-Time Models

to Their Continuous-Time LimitsG. Yin, Fellow, IEEE, and Xun Yu Zhou, Senior Member, IEEE

Abstract—We study a discrete-time version of Markowitz’smean-variance portfolio selection problem where the marketparameters depend on the market mode (regime) that jumpsamong a finite number of states. The random regime switchingis delineated by a finite-state Markov chain, based on whicha discrete-time Markov modulated portfolio selection model ispresented. Such models either arise from multiperiod portfolioselections or result from numerical solution of continuous-timeproblems. The natural connections between discrete-time modelsand their continuous-time counterpart are revealed. Since theMarkov chain frequently has a large state space, to reduce thecomplexity, an aggregated process with smaller state–space isintroduced and the underlying portfolio selection is formulated asa two-time-scale problem. We prove that the process of interestyields a switching diffusion limit using weak convergence methods.Next, based on the optimal control of the limit process obtainedfrom our recent work, we devise portfolio selection strategies forthe original problem and demonstrate their asymptotic optimality.

Index Terms—Discrete-time model, linear-quadratic problem,Markov chain, Markowitz’s mean-variance portfolio selection,singular perturbation, switching diffusion.

I. INTRODUCTION

MARKOWITZ’S Nobel-prize winning mean-varianceportfolio selection model (for a single period) [16],

[17] provides a foundation of modern finance theory; it hasinspired numerous extensions and applications. The Markowitzmodel aims to maximize the terminal wealth, in the mean timeto minimize the risk using the variance as a criterion, whichenables investors to seek highest return upon specifying theiracceptable risk level.

There have been continuing efforts in extending portfolio se-lection from the static single period model to dynamic multi-period or continuous-time models. However, the research workson dynamic portfolio selections have been dominated by thoseof maximizing expected utility functions of the terminal wealth,

Manuscript received May 15, 2002; revised December 15, 2002. Recom-mended by Guest Editor B. Pasik-Duncan. The work of G. Yin was supported inpart by the National Science Foundation under Grant DMS-0304928 and in partby Wayne State University under the WSU Research Enhancement Program.The work of X. Y. Zhou was supported in part by the RGC Earmarked GrantsCUHK 4435/99E and CUHK 4175/00E.

G. Yin is with the Department of Mathematics, Wayne State University, De-troit, MI 48202 USA (e-mail: [email protected]).

X. Y. Zhou is with the Department of Systems Engineering and EngineeringManagement, the Chinese University of Hong Kong, Shatin, Hong Kong(e-mail: [email protected]).

Digital Object Identifier 10.1109/TAC.2004.824479

which is in spirit different from the original Markowitz’s model.For example, the multiperiod utility models were investigated in[7]–[10] and [19]–[21], among many others. As for the contin-uous-time case, the famous Merton paper [18] studied a utilitymaximization problem with market factors modeled as a dif-fusion process (rather than as a Markov chain). Along anotherline, the mean-variance hedging problem was investigated byDuffie and Richardson [6] and Schweizer [22], where an op-timal dynamic strategy was sought to hedge contingent claimsin an imperfect market. Optimal hedging policies [6], [22] wereobtained primarily based on the so-called projection theorem.

Very recently, using the stochastic linear-quadratic (LQ)theory developed in [4] and [28], Zhou and Li [31] introduced astochastic LQ control framework to study the continuous-timeversion of the Markowitz’s problem. Within this framework,they derived closed-form efficient policies (in the Markowitzsense) along with an explicit expression of the efficient frontier.

In the aforementioned references, for continuous-timeformulations of mean-variance problems, stochastic differ-ential equations and geometric Brownian motion modelswere used. Although such models have been used in a widevariety of situations, they have certain limitations since allthe key parameters, including the interest rate and the stockappreciation/volatility rates, are assumed to be insensitive tothe (very likely) drastic changes in the market. Typically, theunderlying market may have many “modes” or “regimes” thatswitch among themselves from time to time. The market modecould reflect the state of the underlying economy, the generalmood of investors in the market, and so on. For example, themarket can be roughly divided as “bullish” and “bearish,”while the market parameters can be quite different in the twomodes. One could certainly introduce more intermediate statesbetween the two extremes. A system, commonly referred to asthe regime switching model, can be formulated as a stochasticdifferential equation whose coefficients are modulated bya continuous-time Markov chain. Such a model has beenemployed in the literature to discuss options; see [1], [3], and[5]. Moreover, an investment-consumption model with regimeswitching was studied in [29]; an optimal stock selling rule for aMarkov-modulated Black–Scholes model was derived in [30]; astochastic approximation approach for the liquidation problemcould be found in [25]. In [32], we treated the continuous-timeversion of Markowitz’s mean-variance portfolio selectionwith regime switching and derived the efficient portfolio andefficient frontier explicitly.

0018-9286/04$20.00 © 2004 IEEE

350 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

Motivated by the recent developments of mean-variance port-folio selection and Markov-modulated geometric Brownian mo-tion formulation, we develop a class of discrete-time mean-variance portfolio selection models and reveal their relationshipwith the continuous-time counterparts in this paper. The dis-crete-time case is as important as the continuous-time one. First,frequently, one needs to deal with multiperiod, discrete-timeMarkowitz’s portfolio selection problems directly; see [13] fora recent account on the topic, in which efficient strategies werederived together with the efficient frontier. In addition, to simu-late a continuous-time model, one often has to use a discretiza-tion technique leading to a discrete-time problem formulation.In this paper, however, one of the main features of the problemto be tackled is that all the market coefficients are modulated bya discrete-time Markov chain that has a finite state–space.

Owing to the presence of long-term and short-term investors,the movements of a capital market can be divided into primarymovement and secondary movement naturally leading totwo-time scales. Besides, with various economic factors suchas trends of the market, interest rates, and business cyclesbeing taken into consideration, the state space of the Markovchain, representing the totality of the possible market modes,is often large. If we simply treated each possible mode asan individual one distinct from all others, the size of theproblem would be huge. A straightforward implementation ofnumerical schemes may deem to be infeasible due to the curseof dimensionality. It is thus crucial to find a viable alternative.To reduce the complexity, we observe that the transition ratesamong different states could be quite different. In fact, there iscertain hierarchy (in terms of the magnitude of the transitionrates) involved. Therefore, it is possible to lump many statesat a similar hierarchical level together to form a big “state.”With this aggregated Markov chain, the size of the state–spaceis substantially reduced. Now, to highlight the different rates ofchanges, we introduce a small parameter into the tran-sition matrix, resulting in a singular perturbation formulation.Based on the recent progress on two-time-scale Markov chains(see [23] and [26]), we establish the natural connection be-tween the discrete-time problem and its continuous-time limit.Under simple conditions, we show that suitably interpolatedprocesses converge weakly to their limits leading to a con-tinuous-time mean-variance portfolio selection problem withregime switching. The limit mean-variance portfolio selectionproblem has an optimal solution [32] that can be obtained ina very simple way under appropriate conditions. Using thatsolution, we design policies that are asymptotically optimal.Our findings indicate that in lieu of examining the more com-plex original problem, we could use the much simplified limitproblem as a guide to obtain portfolio selection policies thatare nearly as good as the optimal one from a practical concern.The advantage is that the complexity is much reduced.

We remark that although the specific mean-variance portfolioselection is treated in this paper, the formulation and techniquescan be generally employed as well in the so-called hybrid con-trol problems that are modulated by a Markov chain for manyother applications.

The rest of this paper is arranged as follows. Section II be-gins with the formulation of the discrete-time mean-variance

portfolio selection problem. Section III proceeds with some pre-liminary results concerning two-time-scale Markov chains andintroduces an auxiliary process. Section IV is devoted to weakconvergence analysis, in which we establish the natural con-nection between the discrete-time and continuous-time modelsaiming at reducing the complexity of the underlying systems.Section V constructs policies that are based on optimal controlof the limit problem and derives asymptotic optimal strategyvia the constructed controls. Section VI extends the results byallowing the Markov chain to be nonhomogeneous and/or in-cluding transient states. Section VII concludes paper with addi-tional remarks. To make the paper more accessible, a couple oflong and technical proofs are placed in the Appendix.

II. FORMULATION

Suppose that is a fixed positive real number and thatis a small parameter. Working with discrete time , we consider

, where denotes the integer part of a realnumber . For ease of presentation, in what follows, we suppressthe floor function notation whenever there is no confusion.All the random variables/processes in this paper are defined ona given complete probability space .

Consider a market model as follows. Letdenote the collection of different market modes. Let , for

, be a discrete-time Markov chain, which is pa-rameterized by , with the state–space . Suppose that thereare assets in the underlying market. One of which is thebond and the rest of them are the stock holdings. Use todenote the price of the bond, and , , to de-note the prices of the stocks at time , respectively. (Note thathere does not represent the calendar time; it is the iterationtime at which the systems dynamics are updated. The calendartime is .) Corresponding to a market mode , letbe the interest rate, and and the appreciationrates and volatility rates of the stocks, respectively, where ,

, for , are givenfunctions. For each , denote

(1)Then, under the so-called multiplicative model, the asset prices

satisfy the following system of equations:

(2)

where , are sequences of independent andidentically distributed random variables. Note that the multi-plicative model adopted above is analogous to the geometricBrownian motion model in continuous time, which ensures thenonnegativity of the stock prices.

YIN AND ZHOU: MARKOWITZ’S MEAN-VARIANCE PORTFOLIO SELECTION 351

Remark 2.1: Another frequently used model for discrete-time stock prices is the following:

(3)

This model does not necessarily produce nonnegative stockprices. However, as demonstrated in [15, p. 311], the twomodels are essentially the same in terms of approximating thestock price process.

Suppose that at time , an investor with an initial endowmentholds shares of the th asset, ,

during the time interval . His or her total wealth at timeis for . Now for a self-financed portfolio (i.e.,

no infusion or withdrawal of funds during the indicated timeinterval), the difference in total wealth between two consecutivetimes is purely due to the change in the prices of the stocks (see,e.g., [11, p. 6. eq. (2.2)]). Therefore, we have

(4)

(5)

Denote by , the -algebra generated by, where . A portfolio

is admissible ifis -measurable for each . Note that we donot include , the amount of money allocated to the bond,in defining the portfolio. This is because, by (4), we have

Therefore

Hence, the wealth process iscompletely determined by the allocation of wealth among thestocks (excluding the bond). We call an admissible

wealth-portfolio pair, and denote the class of admissible wealth-portfolio pairs by .

The objective of the discrete-time mean-variance portfolio se-lection problem is to find an admissible portfolio

, for an initial wealth and an initial market mode, such that the terminal wealth is for a

given , and the risk in terms of the variance of the ter-minal wealth, , is minimized. That is

subject to:

(6)

III. PRELIMINARY RESULTS

Our effort in what follows is to obtain a limit problem of(6) as . The idea comes from aggregation of the Mar-kovian states to reduce the complexity of the underlying systemvia a hierarchical approach. To proceed, we make the followingassumptions. The first one is on the transition matrix of theMarkov chain, the second one concerns about the interest rateof the bond, and the stock appreciation and volatility rates, andthe last one is a condition on the exogenous noise .

A1) The transition matrix of the discrete-time Markovchain is given by

(7)

(8)

such that for are transition matrices,and is a generator (i.e., for and

for each ). Moreover,are irreducible and aperiodic.

A2) For each ,are real-valued continuous

functions defined on .A3) For each is a sequence of inde-

pendent and identically distributed (i.i.d.) random vari-ables that are independent of and that have mean 0and variance 1. Moreover, for , and are in-dependent.

Remark 3.1: The rationale for A1) is that among the tran-sition rates of the Markovian states, some of them vary rapidlyand others change slowly. Decomposing the state space in accor-dance with these transition rates, we can write the state–spaceas

(9)

where . Since in a finite-state Markovchain, there is at least one recurrent state (not all states can betransient), and there is no null-recurrent state, the irreducibilityimplies that there are ergodic classes of states. Each of thetransition matrix is responsible for the rapid transitions,whereas the generator governs the slow transitions from oneergodic class to another. The transition matrix given in (7)with specified in (8) has the so-called nearly completelydecomposable structure. It arises from many discrete-timecontrol and optimization problems, or from discretization of acontinuous-time problem; see [24] and the references therein.

352 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

On the other hand, Condition A3) indicates that the sequencesare the so-called white noise. Under this condition,

the Donsker’s invariance principle [2, p. 137] implies thatconverges weakly to a Brownian motion as

. For definition, discussion, and basic results on weakconvergence, we refer the reader to [2] and [12]. In fact,correlated noise may also be dealt with. What is essential isa functional central limit theorem as mentioned before holdsfor the scaled sequence. Sufficient conditions guaranteeing theconvergence, for example, for -mixing processes are available.However, assumption A3) allows us to simplify much of thediscussion in what follows.

Due to the various considerations, the state space is large.That is, is a large number. In solving the portfolio selectionproblem, we aggregate the states in each of the recurrent class

into one state, resulting in an aggregated process havingonly states. If , the possible number of configurations issubstantially smaller for the aggregated process. In practice, wetypically merge all the similar states into one “big” state. For ex-ample, we can put all the states together where the market is gen-erally up (respectively, down) to form a “bullish” (respectively,“bearish”) mode of the market. Our effort in what follows is toshow that suitably interpolated processes of the prices convergeto limit processes that are switching diffusions, and to establishthe connection between the discrete-time and continuous-timeportfolio selection problems. Using the optimal strategy of thelimit problem, which was obtained explicitly in [32], enables usto construct an asymptotically optimal strategy for the originalproblem (6).

A. Singularly Perturbed Markov Chains

To carry out the analysis, define an aggregated discrete-timeprocess by if . Next,define the interpolated processes of the stock/bond price pro-cesses, the aggregated process, and admissible wealth-portfoliopairs as follows: For

(10)

That is, they are all piecewise constant on the interval of length. We will obtain the weak convergence of the interpolated pro-

cesses. Before proceeding further, we present some preliminaryresults.

Lemma 3.2: Assume condition A1). Then, the following as-sertions hold.

i) Denote by the stationary distribution correspondingto the transition matrix for each . Then

satisfies

(11)

for some , where(with ) satisfies

(12)

Here, denotes an -dimensional column vector withall entries being 1.

ii) For , the -step transition probability matrixsatisfies

(13)

where withsatisfying

(14)

iii) As converges weakly to , acontinuous-time Markov chain with state space

and generator given by (12).Moreover, for the occupation measures defined by

with , and , thefollowing mean square estimates hold:

(15)

Proof: The proofs of i) and ii) are in [24], and that of iii)can be found in [27].

B. Auxiliary Process

Model (2) is a multiplicative one. For the subsequent analysis,it is more convenient to work with an auxiliary additive model.To do so, we retain the definition of , and define auxiliaryprocesses for as: and

(16)

where is given by (1). In terms of the processes ,the price for the th stock can be written as

(17)

Similar to the interpolation , define the interpolatedprocesses

(18)

It is easily seen that . That is,is related to the interpolation of through the exponentialfunction.

To obtain the convergence of , we utilize the auxil-iary process . We first show that convergesweakly to certain limit process, in which the appreciation andvolatility rates are averaged out with respect to the stationarymeasures of the corresponding Markov chains. Then, by usinga well-known continuous mapping theorem [2, Th. 5.2, p. 31],we obtain the desired result.

IV. WEAK CONVERGENCE

This section is devoted to the weak convergence of interpo-lated processes and . We first verify the tight-ness of the underlying sequences. Then, we characterize thelimit by showing that they are solutions of certain martingaleproblems with appropriate generators. In fact, we work withthe processes and for

, respectively.

YIN AND ZHOU: MARKOWITZ’S MEAN-VARIANCE PORTFOLIO SELECTION 353

A. Tightness of and

Theorem 4.1: Under A1)–A3),and for are tight on

, where is the spaceof functions that are right continuous, have left limits, endowedwith the Skorohod topology.

Proof: The proof is in the Appendix.Since and

are tight, Prohorov’s theorem [2, p. 37] allows us to extractweakly convergent subsequences. Select such convergent subse-quences (still indexed by for notational simplicity) with limits

and , respectively. By virtue of theSkorohod representation [12, p. 29], we may assume withoutloss of generality that andconverge to and , respectively,with probability one (w.p. 1), and that they converge uniformlyon any compact time interval.

B. Weak Convergence

We proceed to characterize the limit processes. Denote thematrix by . Define

(19)

where denotes the th component of the stationary distribu-tion corresponding to . Let be suchthat

(20)

where for a vector or a matrix denotes its trans-pose. Concentrating on the processes and

, we wish to show that the limit processes aresolutions of

(21)

respectively, where for are independent,scalar, standard Brownian motions. Equivalently, it suffices toshow that for each is a solution ofthe martingale problem with the operator and

(22)

where for each is a suitable real-valued function de-fined on is given by (12), and

(23)

Theorem 4.2: Assume A1)–A3). Then,and for , the weak limits of

and are the unique solutionsof the martingale problems with the operators for

, where is defined in (22).Proof: It is in the Appendix.

Corollary 4.3: Under the conditions of Theorem 4.2,converges weakly to which are

solutions of

(24)

respectively, where for are independent,scalar, standard Brownian motions.

Proof: Since , and isa continuous function, by the well-known continuous mappingtheorem [2, Th. 5.1, p. 31], converges to such that

, where is the limit of .Applying Itô’s formula to (21) and noticing the relations (1) and(19), we obtain the desired result.

Now, we define the (continuous-time) limit problem ofmean-variance portfolio selection. Denote by the -algebragenerated by , where

. A control isadmissible for the limit problem if is -adapted and

(25)

has a unique solution corresponding to . Here, isthe total wealth of the investor and the allocation of wealthto the stocks, both at time . Thus, with being the amountof fund in the bond, we must have

(26)

where is the number of shares of the th stock held bythe investor at time . The is termed an admis-sible wealth-portfolio pair (for the limit problem). Denote theclass of admissible wealth-portfolio pairs by . The limit mean-variance portfolio selection problem is formulated as

Minimize

subject to:

and (27)

354 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

where is such that [recall that is the initialcondition of the Markov chain in the original discrete-timeproblem (6)].

For any , there are the corresponding interpo-lated processes determined by (10). From nowon we will not distinguish between and ,and will occasionally write . As a conse-quence of Corollary 4.3, the following result holds, which indi-cates that associated with the discrete-time problem (6) there isa limit continuous-time problem (27).

Corollary 4.4: Under the conditions of Theorem 4.2, anyconverges weakly to that be-

longs to . Moreover, as

(28)

Proof: Set .Corollary 4.3 implies that converges weakly to

that satisfies

(29)

Recall that for . That is,. The weak convergence of

yields that converges weakly towhere . It follows

then from (5) and (26) that converges weakly to ,which satisfies (25). This also implies . Fi-nally, the interpolation of , the weak convergence of to

, the Skorohod representation, and the dominated conver-gence theorem lead to (28).

V. NEARLY EFFICIENT PORTFOLIO

We have established that associated with the original mean-variance control problem, there is a limit problem. In this sec-tion, we demonstrate that we can construct a nearly optimal port-folio for (6) based on the optimal portfolio for (27).

A. Efficient Portfolio of the Limit Problem

Recall that (27) is feasible (for fixed and ) if there is atleast one portfolio satisfying all the constraints. It is finite if it isfeasible and the infimum of is finite. An optimal portfolio fora given , if it exists, is called an efficient portfolio, and the cor-responding is called an efficient point. The set ofall the efficient points as varies is termed the efficient frontier.In terms of the control theory terminology, an efficient portfoliois an optimal control policy (corresponding to a particular ).

In [32], a necessary and sufficient condition for the feasibilityof the limit problem (27) is derived. In addition, it is proved thatif it is feasible, then indeed the efficient portfolio correspondingto exists, which can be expressed in a feedback form, namely,a function of the time , the wealth level and the market mode

. With the initial data given by and , theefficient portfolio (optimal control) is given by

(30)

where for

(31)

Moreover, the optimal value of , among all the wealthprocesses satisfying , is

(32)

which gives the closed-form efficient frontier. In what follows,using an efficient portfolio of the limit problem, we constructa nearly efficient portfolio (i.e., a near-optimal control) for theoriginal problem.

B. Nearly Efficient Portfolio

With the optimal control of the limit problem, ,given by (30), we construct

(33)

That is, if , for .Let be the wealth trajectory of the original problem (6) underthe feedback control . Recall that is the contin-uous-time interpolation of and denote the continuous-timeinterpolation of by . We shall show that theuse of such a control leads to near optimality of (6). Write

Theorem 5.1: Suppose that the conditions of Theorem 4.2are satisfied. Then

(34)

Proof: In view of the construction of in(33), the weak convergence argument as in the proofof Theorem 4.2 yields that converges weakly

YIN AND ZHOU: MARKOWITZ’S MEAN-VARIANCE PORTFOLIO SELECTION 355

to , and converges weaklyto . Then,

as . Therefore,

where as . Select an admissible controlsuch that

Define

set as in (5) but with replaced by , and letand be the piecewise constant interpo-

lations of and , respectively. Then,. Similar to the argu-

ment leading to the equation after (49) in [14], using the meansquares estimate (15) on the occupation measure, the wealth(5), the definition of , and the Gronwall’s inequality, we canshow that as for . Thisimplies that

(35)

where as . The tightness ofimplies that we can extract a con-

vergent subsequence, still denoted byfor simplicity, such that .It follows that

(36)

where as . Combining (35) and (36)

(37)

Subtracting from (37) and taking the limit as ,we arrive at (34).

VI. EXTENSIONS

We have demonstrated the natural connections of the dis-crete-time model and its continuous-time counterpart for theMarkowitz’s mean-variance portfolio selection problems underthe premise that the modulating Markov chain is time homoge-neous. In this section, we extend the results to time inhomoge-neous Markov chains, which entail the time-dependent transi-tion matrices. To proceed, we first recall the definition of weakirreducibility.

Definition 6.1: Suppose that is a discrete-time Markovchain with a finite state–space and (time-dependent) transition probability matrices . The Markovchain (or the transition matrix) is said to be weakly irreducibleif for each , the system of equations

(38)

has a unique solution andfor each . The unique nonnegative so-

lution is termed a quasi-stationary distribution.Remark 6.2: The definition above is an extension of the usual

notion of irreducibility. It allows the transition matrices andthe quasistationary measures to be time dependent. In addition,some of the components could be 0.

Using Definition 6.1, we modify the argument used in the pre-vious sections, and extend the results to include time-dependenttransition probabilities. Since the proofs of the assertions aresimilar to the previous case, we will only present the condi-tions needed and the final results. The verbatim proofs will beomitted. In lieu of A1), we use the following.

A1’) Suppose that is a Markov chain with state spaceand time-dependent transition matrices

(39)

such that has the form of the decompositionas in (8) with each replaced by . For each

is weakly irre-ducible and aperiodic. As functions ofis continuously differentiable with Lipschitz contin-uous derivative, and is Lipschitz continuous.

Theorem 6.3: Under the conditions of Theorem 4.2 with A1)replaced by A1’), the conclusions of Theorem 4.2 and Corollary4.4 continue to hold with and replaced by and ,respectively.

Since in a finite-state Markov chain, in addition to recurrentstates, transient states may also be included. Our next result con-cerns about the inclusion of transient states in addition to theclasses of weakly irreducible classes. In this case we need to re-place A1’) by A1”).

A1”) Suppose that the Markov chain has a finite statespace and transition matrix (39) with

. . .

(40)

For each , and each , thetransition probability matrices are weakly irre-ducible and aperiodic. For each ,is a matrix having all of its eigenvalues inside the unitcircle. As functions of is continu-ously differentiable with Lipschitz continuous deriva-tive, and is Lipschitz continuous. In addition,with denoting , there exists an nonsin-gular matrix and constant matrices andsatisfying , and

, for .Now, the state–space of the Markov chain can be written as:

, wherefor , and . Define

for

356 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

Then, it is easily seen that is independent of .Denote the th component of by . Partition the matrix

as

where, and , and define

(41)In carrying out the aggregation, we only lump all the states ineach weakly irreducible class into a single state. This leads to thedefinition of the aggregated process. Let be a random variableuniformly distributed in and let

Then, define

Using essentially the same idea but more complex and detailedestimates, we can derive the following results.

Theorem 6.4: Under the conditions of Theorem 4.2 with A1)replaced by A1”), the conclusions of Theorem 4.2 and Corollary4.4 continue to hold with and replaced by the time-varying

and given by (41), respectively.

VII. CONCLUDING REMARKS

This paper has been devoted to a class of dynamicMarkowitz’s mean-variance portfolio selection problems.Taking into consideration of market trend and other factors, adiscrete-time model that is modulated by a Markov chain wasintroduced. Aiming at complexity reduction, we use nearlycompletely decomposable transition matrices and weak con-vergence methods to derive the limit mean-variance portfolioselection problem. Based on the limit, we can design optimal(efficient) portfolios and derive efficient frontier [32] (see alsothe framework of LQ control with indefinite control weights[4]). Then using the efficient portfolios of the limit problem,we constructed portfolios for the original discrete-time modeland show that such portfolios are nearly efficient.

In Section VI, we assumed the smoothness of the transitionfunctions in A1’) and A1”). Such a condition can be relaxedif we work with convergence of the probability vector andthe transition probability matrices under the weak topology of

. The associated weak convergence and the limit sys-tems can still be obtained. As far as the limit mean-varianceproblem is concerned, for the cases discussed in Section VI,due to the time-dependent generator (the nonhomogeneity), thecorresponding limit problem is more difficult to handle. How-ever, we can still obtain near optimality if we use a “ -optimal”portfolio policy for the limit problem. Based on such a -op-timal policy, we can construct portfolios that are nearly op-timal for the original problem. The statement of (34) is changedto . This paperhas been devoted to a discrete-time model. In fact, one may

use the weak convergence method to treat a continuous-timemean-variance portfolio problem

where is a continuous-time singularly perturbed Markovchain generated by . The use of thesingularly perturbed Markov chain again comes from the mo-tivation of reduction of complexity. As shown in [24, §2.2, p.840], such a continuous-time Markov chain has an associateddiscrete-time chain whose transition matrix has the form (7).Thus, the problem treated in this paper can be thought of as adiscretization of the continuous-time problem given previously.

It should be noted that in our model the wealth of an agentis allowed to become and remain negative. Prohibition of bank-ruptcy renders the problem one with state constraint, which re-mains an interesting yet challenging open problem.

Finally, we remark that although the formulation and moti-vation in this paper stem from mean-variance portfolio selec-tion problems, the techniques used and the methods of solutionsare not restricted to financial engineering applications. They canalso be employed in other hybrid control problems modulatedby a singularly perturbed Markov chain.

APPENDIX

Proof of Theorem 4.1: By virtue of Lemma 3.2-iii),is tight. In view of [12, Th. 3, p. 47], to prove the

tightness of , it suffices to establish thetightness of .

Use to denote the conditional expectation with respectto , the -algebra generated by

. For any , and , and for each

(42)

In this and hereafter, represents a generic positive realnumber; its values may be different for different appearances.Note that are understood to be the integerparts in the summation limits of (42). As shown previously,we have also used the independence of with , and

YIN AND ZHOU: MARKOWITZ’S MEAN-VARIANCE PORTFOLIO SELECTION 357

for and . Theboundedness of then implies that

Similarly, the boundedness of implies that

Consequently

The desired result then follows from the tightness crite-rion [12, Th. 3, p. 47]. Likewise, it can be verified that

. Thusis also tight.

Proof of Theorem 4.2: We focus on the stock price pro-cesses for . To proceed, let us fix

. The uniqueness of the solution of the martin-gale problem can be verified as in [23, Lemma 7.18]. To char-acterize the limit, it suffices to show that for each and

(where is the collection of functions thathave compact support and that are continuously differentiablewith respect to the first variable, and twice continuously differ-entiable with respect to the second variable)

(43)

To obtain (43), it suffices to show that for any positive integer, bounded and continuous function with , and

any satisfying

(44)

To verify (44), we begin with the pre-limit processes (the pro-cesses indexed by ). Define

(45)

Then . The weak convergence ofto , the Skorohod repre-

sentation, the continuity of , and the dominatedconvergence theorem then yield that as

(46)

Pick out , a sequence of positive integers such thatas and as . Then

(47)

where in probability as uniformly in .Note that we have used in theabove.

By virtue of the weak convergence, the Skorohod rep-resentation, and the continuity and the boundedness of

, as

(48)

As for the terms on the next to the last line of (47), define

...

Since is -measurable for , andgiven by (7)

(49)

358 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

In (49), we have used the orthogonality .Furthermore

Lemma 3.2 implies that the terms in the last two lines abovegoes to 0 in probability uniformly in . In view of (12),as

uniformly in . Putting these estimates together, as

(50)

Using a truncated Taylor expansion and denotingand , we have

(51)

where is on the line segment joining and .Substituting (16) in (51), we can write

(52)

To proceed, we first replace in the argument ofabove by . Letting as , and using Lemma3.2-iii), for all

(53)

By virtue of the independence of , inserting and then, and using , we have

YIN AND ZHOU: MARKOWITZ’S MEAN-VARIANCE PORTFOLIO SELECTION 359

Likewise, detailed estimates reveal that

(54)

where and are given by (20) and (23),respectively. In addition, similar to the estimates for (49), as

(55)

(56)

Similar to (55) and (56), as

(57)

Combining the estimates obtained thus far and using (51) inconjunction with (47), we arrive at

(58)

Equation (58) together with (46) then yields the desiredassertion.

The same method works for the proof of to. The proof is even simpler since no diffusion is

involved.

ACKNOWLEDGMENT

The authors would like to thank the Reviewers and Editorsfor their suggestions leading to much improvement. They wouldalso like to thank W. Runggaldier for commenting on an earlierdraft of this paper.

REFERENCES

[1] G. Barone-Adesi and R. Whaley, “Efficient analytic approximation ofAmerican option values,” J. Finance, vol. 42, pp. 301–320, 1987.

[2] P. Billingsley, Convergence of Probability Measures. New York:Wiley, 1968.

[3] J. Buffington and R. J. Elliott, “American options with regimeswitching,” Int. J. Theoret. Appl. Finance, vol. 5, pp. 497–514, 2002.

[4] S. Chen, X. Li, and X. Y. Zhou, “Stochastic linear quadratic regulatorswith indefinite control weight costs,” in SIAM J. Control Optim., vol.36, 1998, pp. 1685–1702.

[5] G. B. Di Masi, Y. M. Kabanov, and W. J. Runggaldier, “Mean variancehedging of options on stocks with Markov volatility,” Theory Probab.Appl., vol. 39, pp. 173–181, 1994.

[6] D. Duffie and H. Richardson, “Mean-variance hedging in continuoustime,” Ann. Appl. Probab., vol. 1, pp. 1–15, 1991.

[7] E. J. Elton and M. J. Gruber, Finance as a Dynamic Process. UpperSaddle River, NJ: Prentice-Hall, 1975.

[8] J. C. Francis, Investments: Analysis and Management. New York: Mc-Graw-Hill, 1976.

[9] R. R. Grauer and N. H. Hakansson, “On the use of mean-variance andquadratic approximations in implementing dynamic investment strate-gies: A comparison of returns and investment policies,” Manage. Sci.,vol. 39, pp. 856–871, 1993.

[10] N. H. Hakansson, “Multi-period mean-variance analysis: Toward a gen-eral theory of portfolio choice,” J. Finance, vol. 26, pp. 857–884, 1971.

[11] I. Karatzas and S. E. Shreve, Methods of Mathematical Finance. NewYork: Springer-Verlag, 1998.

[12] H. J. Kushner, Approximation and Weak Convergence Methodsfor Random Processes, with applications to Stochastic SystemsTheory. Cambridge, MA: MIT Press, 1984.

[13] D. Li and W. L. Ng, “Optimal dynamic portfolio selection: Multi-pe-riod mean-variance formulation,” Math. Finance, vol. 10, pp. 387–406,2000.

[14] R. H. Liu, G. Yin, and Q. Zhang, “Asymptotically optimal controls ofhybrid linear quadratic regulators in discrete time,” Automatica, vol. 38,pp. 409–419, 2002.

360 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 49, NO. 3, MARCH 2004

[15] D. G. Luenberger, Investment Science. New York: Oxford Univ. Press,1998.

[16] H. Markowitz, “Portfolio selection,” J. Finance, vol. 7, pp. 77–91, 1952.[17] , Portfolio Selection: Efficient Diversification of Investment. New

York: Wiley, 1959.[18] R. C. Merton, “An intertemporal capital asset pricing model,” Econo-

metrica, vol. 41, pp. 867–888, 1973.[19] J. Mossin, “Optimal multiperiod portfolio policies,” J. Business, vol. 41,

pp. 215–229, 1968.[20] S. R. Pliska, Introduction to Mathematical Finance: Discrete Time

Models. Oxford, U.K.: Blackwell, 1997.[21] P. A. Samuelson, “Lifetime portfolio selection by dynamic stochastic

programming,” Rev. Econ. Statist., vol. 51, pp. 239–246, 1969.[22] M. Schweizer, “Approximation pricing and the variance-optimal mar-

tingale measure,” Ann. Probab., vol. 24, pp. 206–236, 1996.[23] G. Yin and Q. Zhang, Continuous-Time Markov Chains and Applica-

tions: A Singular Perturbation Approach. New York: Springer-Verlag,1998.

[24] , “Singularly perturbed discrete-time Markov chains,” SIAM J.Appl. Math., vol. 61, pp. 834–854, 2000.

[25] G. Yin, R. H. Liu, and Q. Zhang, “Recursive algorithms for stock Liq-uidation: A stochastic optimization approach,” SIAM J. Optim., vol. 13,pp. 240–263, 2002.

[26] G. Yin, Q. Zhang, and G. Badowski, “Asymptotic properties of a sin-gularly perturbed Markov chain with inclusion of transient states,” Ann.Appl. Probab., vol. 10, pp. 549–572, 2000.

[27] , “Discrete-time singularly perturbed Markov chains: Aggregation,occupation measures, and switching diffusion limit,” Adv. Appl. Probab.,vol. 35, pp. 449–476, 2003.

[28] J. Yong and X. Y. Zhou, Stochastic Controls: Hamiltonian Systems andHJB Equations. New York: Springer-Verlag, 1999.

[29] T. Zariphopoulou, “Investment-consumption models with transactionscosts and Markov-chain parameters,” SIAM J. Control Optim., vol. 30,pp. 613–636, 1992.

[30] Q. Zhang, “Stock trading: An optimal selling rule,” SIAM J. ControlOptim., vol. 40, pp. 64–87, 2001.

[31] X. Y. Zhou and D. Li, “Continuous-Time mean-variance portfolio se-lection: A stochastic LQ framework,” Appl. Math. Optim., vol. 42, pp.19–33, 2000.

[32] X. Y. Zhou and G. Yin, “Markowitz’s mean-variance portfolio selec-tion with regime switching: A continuous-time model,” SIAM J. ControlOptim., vol. 42, pp. 1466–1482, 2003.

G. Yin (S’87–M’87–SM’96–F’02) received theB.S. degree in mathematics from the University ofDelaware, Newark, in 1983, and the M.S. degree inelectrical engineering, and Ph.D. degree in appliedmathematics from Brown University, Providence,RI, in 1987.

Subsequently, he joined the Department ofMathematics, Wayne State University, Detroit, MI,where he became a Professor in 1996. He servedon the Editorial Board of Stochastic Optimizationand Design, the Mathematical Review Date Base

Committee, and various conference program committees. He was the Editor ofSIAM Activity Group on Control and Systems Theory Newsletters, the SIAMRepresentative to the 34th Conference on Decision and Control, Co-Chair of1996 AMS-SIAM Summer Seminar in Applied Mathematics, and Co-Chairof the 2003 AMS-IMA-SIAM Summer Research Conference: Mathematicsof Finance. He is an Associate Editor of the Journal of Control Theory andApplications and SIAM Journal on Control and Optimization.

Dr. Yin was an Associate Editor of the IEEE TRANSACTIONS ON AUTOMATIC

CONTROL.

Xun Yu Zhou (A’97–SM’99) received the Ph.D. de-gree in applied mathematics from Fudan University,Fundan, China, in 1989.

During 1989–1991 and 1991–1993, he was doingpostdoctoral research at Kobe University, Kobe,Japan, and the University of Toronto, Toronto, ON,Canada, respectively. He has been with the ChineseUniversity of Hong Kong since 1993, and is currentlya Professor. As a principal investigator of numerousresearch grants, he has done research in optimalstochastic control, mathematical finance/insurance,

and discrete-event manufacturing systems. He has served as Organizer andProgram Committee Member for numerous international conferences, and hasbeen invited to visit and/or give seminars at universities around the globe. Hehas published approximately 60 papers in refereed journals, and has publishedthe book Stochastic Controls: Hamiltonian Systems and HJB Equations(coauthored with J. Yong) (New York: Springer-Verlag, 1999). He is a Memberof SIAM and is currently on the Editorial Boards of Operations Research andMathematical Finance.

Dr. Zhou is a recipient of the 2003 SIAM Outstanding Paper Prize. He wason the Editorial Board of IEEE TRANSACTIONS ON AUTOMATIC CONTROL from1999 to 2002.