Varun Balupuri - Thesis

Stochastic Control for Optimal Dynamic Trading Strategies

Varun Balupuri

Department of Mathematics, King’s College London

The Strand, London WC2R 2LS, United Kingdom

Email: [email protected]: +44 (0)583 248 930

19 September 2016

Report submitted in partial fulfillment of the requirements for the degree of MSc in Financial Mathematics in the University of London

Abstract

In this paper, we apply dynamic programming to solve Merton’s portfolio problem in the classical Black-Scholes model under the familiar cases of power, exponential and logarithmic utility, where we show that the optimal strategy is to keep a constant proportion of wealth in the risky asset.

We also examine the problem in the presence of stochastic volatility. The problem is found to be solved by a non-stochastic function of time, and we perform Monte Carlo simulations to verify this numerically. A focus of this paper is on numerical estimates and analysis to back up theoretical results. A brief overview of the problem in the presence of transaction costs and the associated difficulties is presented in Chapter 4.


Contents

1 INTRODUCTION

1.1 Background

1.2 Market model and notation

1.3 Preliminaries

2 Portfolio Allocation

2.1 Optimising Risky Wealth Under Exponential Utility

2.2 Extension to non-zero interest rates

2.3 Numerical Implementation/Monte Carlo Estimates

2.3.1 Re-balancing the portfolio

2.4 Addition of Consumption and Infinite Horizons

3 Extension to Stochastic Volatility

3.1 Merton’s Problem in the Heston Model

4 Transaction Costs

4.1 Proportional Transaction Costs

5 CONCLUSION

A Simulation and Graphing Code


Chapter 1

INTRODUCTION

1.1 Background

Stochastic optimisation is concerned with controlling dynamic systems with stochastic perturbations to maximise (or minimise) some criterion, usually a function representing value or a stopping time attaining said value. Richard Bellman pioneered the dynamic programming equation and approach to these types of problems in a 1954 paper, ’The theory of dynamic programming’ [1].

In the 1970s the mathematics of dynamic programming and stochastic control was applied in the context of economics and financial mathematics, first receiving widespread recognition due to Merton’s 1973 paper [2].

From a mathematical perspective, the field of dynamic portfolio choice was first approached by Merton in his much celebrated, seminal papers ([3] [4]). Building upon the Black-Scholes framework, Merton employed techniques from stochastic control to show how to create optimal portfolios consisting of a risky and a riskless asset in a frictionless market with constant volatility and constant interest rates. Since then, Merton’s ’portfolio problem’ has become a highly studied topic and there have been many papers addressing improvements and extensions to this framework.

In this paper we consider maximizing an investor’s value function using dynamic programming to solve the associated Hamilton-Jacobi-Bellman equation with various utility functions in the HARA¹ class, as Merton did, and we place extra emphasis on numerical analysis and verification of these results through Monte Carlo methods.

In Chapter 1 we look at the topic of stochastic control in the context of portfolio optimisation, building the initial framework and presenting the relevant mathematical theorems and results which are important for later chapters.

In Chapter 2 we look at the classical 2-asset Merton portfolio problem in the Black-Scholes framework under exponential utility and power utility. In the most basic sense, the investor wants to decide what proportion of their wealth to keep in the risky asset to maximise their expected utility over a finite time horizon. In addition to optimising the investor’s ’risky wealth’ we consider the case when a consumption parameter is added, seeking to optimise the investor’s rate of withdrawal of money as well. We will also consider Merton’s problem over an infinite horizon, which is useful when modeling how to invest and consume one’s savings until retirement or death. We will show via the use of dynamic programming that all of these problems have a closed form solution and can be explicitly calculated.

¹ Hyperbolic absolute risk aversion (HARA) functions are easy to model mathematically and have properties which make them suited to modeling reality.

Since Merton’s problem is built upon many unrealistic assumptions, such as the ability to trade continuously without fees and constant interest rates, the addition of transaction costs and stochastic volatility is a very important step in building a more realistic model.

Mathematical models have been introduced to add stochastic volatility and stochastic interest rates to the modeling of stock prices. We examine Merton’s problem with stochastic volatility in Chapter 3, focusing on Heston’s model. Using Liu and Muhle-Karbe’s approach we show that even in this setting Merton’s problem has an explicit solution, and we verify the theoretical results by discretizing and using Monte Carlo methods.

The addition of transaction costs is briefly considered in Chapter 4. Proportional transaction costs were first incorporated by Davis and Norman in 1990 [6]. Since the investor cannot make a large number of trades without accruing large fees, the optimal strategy changes from continuous re-balancing of the portfolio to one which involves only making trades if the ratio falls outside of a region, called the ’no trade’ region. Finding the boundaries of this region is not a trivial task and computational approaches fall beyond the scope of this thesis.

This paper focuses on models where an explicit solution to Merton’s problem exists; however, this is not generally the case. We can only use the Hamilton-Jacobi-Bellman equation in the classical sense when the value function is sufficiently smooth. For many problems this is not the case, and alternative methods such as the viscosity solution approach introduced by Crandall and Lions (1980) are an effective way to attack such problems. A modern development in the literature is on backward stochastic differential equations, which provide a probabilistic representation of non-linear PDEs. These topics are not discussed in this paper.

1.2 Market model and notation

Unless explicitly stated otherwise, consider a frictionless market with two assets: a risky stock, denoted by S, modelled by a stochastic price process (St)t≥0 which follows the standard Black-Scholes dynamics:

dSt = St(µdt+ σdWt)


and a riskless ’bank account’ Bt, whose dynamics are given by the ODE:

dBt = rBtdt

Definition 1.2.1 (Self Financing Portfolio). In continuous time, consider a portfolio consisting of ∆t units of the risky asset S and φt units of the riskless asset B. This portfolio has a time t value of Vt = ∆tSt + φtBt. We say that the portfolio is self-financing² if:

$$V_t = V_0 + \int_0^t \Delta_w \, dS_w + \int_0^t \varphi_w \, dB_w$$

We assume the investor’s actions have no effect on the market and the investor’s strategy is self-financing.

Unless explicitly stated otherwise, let µ, σ, r ∈ R and let Wt be a standard one-dimensional Brownian motion on a complete probability space (Ω,F ,P), with (Ft)t≥0 being a filtration.

Definition 1.2.2 (Adapted Process). A process (Xt)t≥0 is adapted with respect to a filtration F = (Ft)t≥0 if:

∀t ≥ 0, Xt is Ft-measurable.

Definition 1.2.3 (Progressively Measurable Process). A process (Xt)t≥0 is progressively measurable with respect to F if ∀t ≥ 0:

the map on [0, t]× Ω defined by (s, ω) ↦ Xs(ω) is B([0, t])⊗Ft-measurable.

1.3 Preliminaries

Important theorems and results which are applied in later chapters are presented here without formal proof. References to derivations are provided for the interested reader.

Consider the controlled stochastic differential equation:

dXt = b(Xt, αt)dt+ σ(Xt, αt)dWt (1.1)

with X0 = x, where (αt)t≥0 is a progressively measurable process. Let the functions b, σ satisfy the Lipschitz condition.

Finite Horizon Case

For proofs and extensions of the following theorems, see pg. 40-46 of Pham’s textbook [5].

² Intuitively this means that no money is exogenously added to or withdrawn from the portfolio.


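Fix T ∈ (0,∞). Define

$$\mathcal{A} = \left\{ \alpha : \mathbb{E}\left[ \int_0^T |b(0,\alpha_t)|^2 + |\sigma(0,\alpha_t)|^2 \, dt \right] < \infty \right\}$$

For functions f, g define our gain function as

$$J(t,x,\alpha) = \mathbb{E}_{t,x}\left[ \int_t^T f(s, X_s, \alpha_s)\, ds + g(X_T) \right]$$

The value function linked to this is

$$v(t,x) = \sup_{\alpha \in \mathcal{A}} J(t,x,\alpha)$$

Theorem 1.3.1 (Dynamic Programming Principle). For all t1 ∈ [t, T ]

$$v(t,x) = \sup_{\alpha \in \mathcal{A}} \mathbb{E}_{t,x}\left[ \int_t^{t_1} f(s, X_s, \alpha_s)\, ds + v(t_1, X_{t_1}) \right]$$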

If v is smooth, making use of Itô’s formula leads us to the HJB equation in the finite-horizon case.

Theorem 1.3.2 (Finite Horizon Hamilton-Jacobi-Bellman equation).³ Let α = a ∈ R with a arbitrary. The value function v(t, x) satisfies the following partial differential equation (if the supremum is finite), known as the Hamilton-Jacobi-Bellman equation:

$$\frac{\partial v}{\partial t} + \sup_{a}\left[ f(t,x,a) + b(x,a)\frac{\partial v}{\partial x} + \frac{1}{2}|\sigma(x,a)|^2 \frac{\partial^2 v}{\partial x^2} \right] = 0$$

with boundary condition v(T, x) = g(x).

Infinite Horizon Case

For the infinite horizon case, we consider problems where T = ∞. Xt is time-homogeneous and we discount the gain function to maintain finiteness of J(x, α). For β > 0, let A(x) be the set of admissible controls α satisfying

$$\mathbb{E}\left[ \int_0^\infty e^{-\beta s} |f(X_s, \alpha_s)| \, ds \right] < \infty$$

Define

$$J(x,\alpha) = \mathbb{E}\left[ \int_0^\infty e^{-\beta s} f(X^x_s, \alpha_s)\, ds \right]$$

and, similarly to the finite horizon case, the value function is

$$v(x) = \sup_{\alpha \in \mathcal{A}(x)} J(x,\alpha)$$

Theorem 1.3.3 (Infinite Horizon Hamilton-Jacobi-Bellman equation). Assuming our SDE follows (1.1), v(x) satisfies ∀x ∈ R:

$$-\beta v(x) + \sup_{a}\left[ f(x,a) + b(x,a)\frac{\partial v}{\partial x} + \frac{1}{2}|\sigma(x,a)|^2 \frac{\partial^2 v}{\partial x^2} \right] = 0$$

³ In Pham’s proof, the infinitesimal generator L^a is used. We do not use it in this paper, for consistency and to highlight the explicit equations.

A vital result in dynamic programming is the verification theorem, which ensures that, for an optimal control problem, a sufficiently smooth candidate solution of the non-linear PDE coincides with the value function.

We do not state the theorem here, but its consequences are important for ensuring our solutions are indeed optimal controls [5].

Utility Classes

The aim of Merton’s portfolio problem is to optimise the investor’s (expected) utility. We must take into account the investor’s risk aversion. A more risk averse investor would prefer to place more of their capital in the riskless asset to ensure guaranteed returns rather than in a riskier stock.

Definition 1.3.4 (Hyperbolic Absolute Risk Aversion). A utility function U(x) is said to be a HARA utility function if and only if it is of the form:

$$U(x) = \frac{1-\gamma}{\gamma}\left( \frac{ax}{1-\gamma} + b \right)^{\gamma}$$

with a > 0 and ax/(1−γ) + b > 0.
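As a brief check of why this family is called ’hyperbolic’, the absolute risk aversion −U''(x)/U'(x) can be computed directly from Definition 1.3.4. The following is a sketch of that standard calculation, using only the symbols a, b and γ from the definition:

```latex
\begin{aligned}
U'(x)  &= a\left(\frac{ax}{1-\gamma} + b\right)^{\gamma-1}, \\
U''(x) &= -a^{2}\left(\frac{ax}{1-\gamma} + b\right)^{\gamma-2}, \\
-\frac{U''(x)}{U'(x)} &= \frac{a}{\frac{ax}{1-\gamma} + b}
                       = \frac{1}{\frac{x}{1-\gamma} + \frac{b}{a}} .
\end{aligned}
```

The absolute risk aversion is therefore the reciprocal of a linear function of wealth, i.e. a hyperbola in x.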

In this paper we consider three types of utility function:

• Exponential : $U(x) = -e^{-\alpha x}$

• Power : $U(x) = \frac{1}{1-\gamma}x^{1-\gamma}$

• Logarithmic : $U(x) = \log(x)$

To see that exponential utility belongs in the class, take the limit as γ tends to ∞ with b = 1 in Definition 1.3.4. Similarly, the logarithmic utility is obtained by setting a = 1 and observing the limit as γ tends to 0.

These utility functions all fall under the HARA class and, for many of the optimization problems studied in this paper, they lead to closed form solutions.

In the case of power utility, which we will consider later, we can simplify calculations by using a result about its homotheticity.

Proposition 1.3.5 (Power Utility Homotheticity). Let $U(x) = \frac{1}{\gamma}x^{\gamma}$ for γ ∈ (0, 1). Then V (t, x, y) is increasing and concave in both x and y and ∀ρ > 0:

$$V(t, \rho x, \rho y) = \rho^{\gamma} V(t, x, y)$$

For a proof, see Davis and Norman 1990, Theorem 3.1 [6].

Chapter 2

Portfolio Allocation

2.1 Optimising Risky Wealth Under Exponential Utility

In this subsection, we assume the interest rate is r = 0. Hence the risk-free account follows Bt = 1. Let Xt denote the investor’s total wealth at time t. Let πt be the proportion of total wealth that the investor has invested in the risky asset S at time t.

Hence, πtXt is the amount of wealth in units of currency invested in the stock at time t and πtXt/St is the number of shares the investor owns at t.

We wish to optimize over all possible dynamic strategies πt, the proportion of the investor’s total wealth which is ’risky wealth’, in order to maximize their expected utility. Xt follows the dynamics¹

$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{(1-\pi_t) X_t}{B_t}\, dB_t \\ &= \frac{\pi_t X_t}{S_t}\, S_t(\mu\, dt + \sigma\, dW_t) \\ &= \pi_t X_t(\mu\, dt + \sigma\, dW_t) \end{aligned} \tag{2.1}$$

The sum of proportions invested over all assets in the market model must equal 1. In this two-asset market this of course means the investor places (1− πt)Xt units of wealth into the riskless asset. If πt > 1, the investor is short in the riskless asset. Similarly, if πt < 0 the investor is short in the risky asset.

Define the utility function U : R→ R with risk aversion parameter α by

$$U(x) = -e^{-\alpha x}$$

This falls under the class of constant absolute risk aversion (CARA) utility functions and is concave and increasing on R, implying that ∀t ∈ [0, T ], V (t, x) is also increasing and concave (see [5], pg. 52 for a proof).

¹ We may also impose the additional restriction that Xt ≥ 0 depending on whether the investor is allowed to continue trading after bankruptcy.

Define the set of all admissible trading strategies

$$\mathcal{A} = \left\{ \pi(\cdot) : \text{bounded, adapted stochastic process s.t. } X^{\pi}_x \geq 0 \ \ \mathbb{P}\text{-a.s.} \right\}$$

where $X^{\pi}_x(t)$ corresponds to the wealth of the investor following strategy π at time t with initial wealth x. We want to maximise expected utility over all π ∈ A.

Define our value function:

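$$V(t,x) = \sup_{\pi \in \mathcal{A}} \mathbb{E}_{t,x}\left[ U(X_T) \right] \tag{2.2}$$

The Hamilton-Jacobi-Bellman equation corresponding to (2.2) is given by Theorem 1.3.2²

$$\frac{\partial V}{\partial t} + \sup_{\pi \in \mathbb{R}}\left( \pi x \mu \frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2} \right) = 0 \tag{2.3}$$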

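with boundary condition V (T, x) = U(x). Denote the optimal π by π∗. Differentiating the expression inside the supremum in (2.3) with respect to π and equating to zero to find the optimal π yields:

$$\begin{aligned} \frac{d}{d\pi}\left( \pi x \mu \frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2 x^2 \frac{\partial^2 V}{\partial x^2} \right) &= 0 \\ x\mu \frac{\partial V}{\partial x} + x^2\sigma^2\pi^* \frac{\partial^2 V}{\partial x^2} &= 0 \\ \implies \pi^* &= \frac{-\mu \frac{\partial V}{\partial x}}{x\sigma^2 \frac{\partial^2 V}{\partial x^2}} \end{aligned} \tag{2.4}$$

Substituting this optimal value π∗ into (2.3) gives rise to a non-linear PDE:

$$\frac{\partial V}{\partial t} - \frac{\mu^2}{2\sigma^2} \frac{\left(\frac{\partial V}{\partial x}\right)^2}{\frac{\partial^2 V}{\partial x^2}} = 0 \tag{2.5}$$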

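This is a separable partial differential equation, so we can look for a solution using an ansatz of the form $V(t,x) = -e^{-\alpha x - \alpha\beta(T-t)}$. We now have:

$$\begin{aligned} \frac{\partial V}{\partial t} &= -\alpha\beta\, e^{-\alpha x - \alpha\beta(T-t)} \\ \frac{\partial V}{\partial x} &= \alpha\, e^{-\alpha x - \alpha\beta(T-t)} \\ \frac{\partial^2 V}{\partial x^2} &= -\alpha^2 e^{-\alpha x - \alpha\beta(T-t)} \end{aligned} \tag{2.6}$$

² In this case the corresponding variables are f(t, x, π) = 0 and b(x, π)dt+ σ(x, π)dWt = πµxdt+ σπxdWt.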


Substituting (2.6) into (2.5) gives us a value for β

$$\beta = \frac{\mu^2}{2\alpha\sigma^2} \tag{2.7}$$

Also, π∗ in (2.4) becomes

$$\pi^* = \frac{\mu}{x\sigma^2\alpha} \tag{2.8}$$

It is important to note that π∗ is constant (x is fixed as X0 = x), in agreement with Merton’s discovery: the amount of wealth held in the risky asset, π∗x = µ/(σ²α), does not change with time, and the investor rebalances the portfolio continuously to maintain this position in the risky asset S.

The analytic value function V (t, x) under exponential utility is

$$V(t,x) = -\exp\left( -\alpha x - \alpha\left(\frac{\mu^2}{2\alpha\sigma^2}\right)(T-t) \right) \tag{2.9}$$
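As a sanity check on the algebra, one can confirm numerically that (2.9) satisfies the non-linear PDE (2.5) and the terminal condition V (T, x) = U(x). The sketch below uses central finite differences; the function names, the evaluation point (t, x) = (0.3, 1) and the step size h are illustrative assumptions and are not taken from the thesis code.

```python
import math

def value_exp(t, x, alpha, mu, sigma, T):
    """Closed-form value function (2.9) for exponential utility."""
    beta = mu**2 / (2.0 * alpha * sigma**2)   # equation (2.7)
    return -math.exp(-alpha * x - alpha * beta * (T - t))

def pde_residual(t, x, alpha, mu, sigma, T, h=1e-4):
    """Central-difference residual of the non-linear PDE (2.5) at (t, x)."""
    V = lambda s, y: value_exp(s, y, alpha, mu, sigma, T)
    V_t = (V(t + h, x) - V(t - h, x)) / (2.0 * h)
    V_x = (V(t, x + h) - V(t, x - h)) / (2.0 * h)
    V_xx = (V(t, x + h) - 2.0 * V(t, x) + V(t, x - h)) / h**2
    return V_t - mu**2 / (2.0 * sigma**2) * V_x**2 / V_xx

# Hypothetical evaluation point; any (t, x) with t < T should behave similarly.
residual = pde_residual(t=0.3, x=1.0, alpha=1.0, mu=0.05, sigma=0.2, T=1.0)
# At t = T the value function should coincide with U(x) = -exp(-alpha * x).
terminal_gap = value_exp(1.0, 1.0, 1.0, 0.05, 0.2, 1.0) + math.exp(-1.0)
print(residual, terminal_gap)
```

Both printed numbers should be zero up to discretisation and rounding error.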

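The dynamics of Xt derived in (2.1) now reduce to arithmetic Brownian motion, which by direct integration gives:

$$dX_t = \frac{\mu^2}{\sigma^2\alpha}\, dt + \frac{\mu}{\sigma\alpha}\, dW_t \implies X_t = X_0 + \frac{\mu^2}{\sigma^2\alpha}\, t + \frac{\mu}{\sigma\alpha}\, W_t \tag{2.10}$$

Hence $X_t \sim N\left(X_0 + \frac{\mu^2}{\sigma^2\alpha}t,\ \frac{\mu^2}{\sigma^2\alpha^2}t\right)$. We see that if no modification is made to the model to stop the investor from trading once they hit bankruptcy, then Xt can become negative.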
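To get a feel for how likely negative wealth is under this normal distribution, the probability P(Xt < 0) can be evaluated with the standard normal CDF. This is a minimal sketch; the function name is hypothetical and the parameter values in the usage lines are chosen here purely for illustration.

```python
import math

def prob_negative_wealth(x0, mu, sigma, alpha, t):
    """P(X_t < 0) for X_t ~ N(x0 + mu^2 t / (sigma^2 alpha), mu^2 t / (sigma^2 alpha^2))."""
    mean = x0 + mu**2 / (sigma**2 * alpha) * t
    std = abs(mu) / (sigma * alpha) * math.sqrt(t)
    z = (0.0 - mean) / std
    # Standard normal CDF written with erfc, which stays accurate in the far tail.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Illustrative parameters (assumed): x0 = 1, mu = 0.05, sigma = 0.2, alpha = 1.
p_one_year = prob_negative_wealth(1.0, 0.05, 0.2, 1.0, 1.0)
p_ten_years = prob_negative_wealth(1.0, 0.05, 0.2, 1.0, 10.0)
print(p_one_year, p_ten_years)
```

For these inputs the one-year probability is very small, and it grows as the horizon lengthens because the standard deviation increases with the square root of t.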

2.2 Extension to non-zero interest rates

We extend our market model from the previous section by now adding a constant interest rate r to the riskless asset.

Let St follow:

dSt = St[(r + µ)dt+ σdWt]

and let the riskless asset obey dBt = rBtdt as usual.

Consider a ’power utility’, or constant relative risk aversion (CRRA), utility function:

$$U(x) = \frac{1}{1-\gamma}x^{1-\gamma}, \quad \gamma \in (0,1)$$

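Consider as before all admissible dynamic trading strategies πt. Again we want to find V (t, x) as defined in (2.2) for our new utility function on a finite time horizon.

The investor’s total wealth obeys

$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t \\ &= \pi_t X_t[(r+\mu)\, dt + \sigma\, dW_t] + X_t(1-\pi_t) r\, dt \\ &= X_t[\pi_t\mu + r]\, dt + X_t\pi_t\sigma\, dW_t \end{aligned} \tag{2.11}$$

By applying Theorem 1.3.2, the corresponding HJB equation is:

$$\frac{\partial V}{\partial t} + \sup_{\pi \in \mathcal{A}}\left( (\pi x\mu + rx)\frac{\partial V}{\partial x} + \frac{1}{2}\pi^2\sigma^2x^2\frac{\partial^2 V}{\partial x^2} \right) = 0 \tag{2.12}$$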

With boundary condition V (T, x) = U(x)

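Using the ansatz $V(t,x) = \frac{1}{1-\gamma}x^{1-\gamma}f(t)$ for some function f, and following the same procedure as in the exponential utility case to find π∗ and β, we obtain

$$\begin{aligned} \frac{\partial V}{\partial t} &= \frac{1}{1-\gamma}x^{1-\gamma}f'(t) \\ \frac{\partial V}{\partial x} &= x^{-\gamma}f(t) \\ \frac{\partial^2 V}{\partial x^2} &= -\gamma x^{-\gamma-1}f(t) \end{aligned} \tag{2.13}$$

Substituting (2.13) into (2.12) for the optimal π∗ and unknown f(t) gives us explicit solutions:

$$\pi^* = \frac{\mu}{\gamma\sigma^2} \tag{2.14}$$

$$\begin{aligned} \frac{1}{1-\gamma}x^{1-\gamma}f'(t) &= \left(-\frac{\mu^2}{2\gamma\sigma^2} - r\right)x^{1-\gamma}f(t) \\ \implies f'(t) &= -(1-\gamma)\left(\frac{\mu^2}{2\gamma\sigma^2} + r\right)f(t) \end{aligned} \tag{2.15}$$

The solution of the ODE in (2.15) with the terminal condition f(T ) = 1 is given by³

$$f(t) = e^{(1-\gamma)\left(\frac{\mu^2}{2\gamma\sigma^2} + r\right)(T-t)}$$

³ This is obvious as at time T we have $V(T,x) = U(x) = \frac{1}{1-\gamma}x^{1-\gamma}$.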


So the previously unknown function f(t) is:

$$f(t) = e^{(T-t)(1-\gamma)\beta}$$

where we have used:

$$\beta = \frac{1}{2}\frac{\mu^2}{\gamma\sigma^2} + r$$

We say β represents a ’fictitious safe rate’. If the investor were to place their wealth in a safe asset with continuously compounded interest rate β, they would attain the same utility as by following the trading strategy π∗.

This gives us an explicit representation for V (t, x):

$$V(t,x) = \frac{1}{1-\gamma}x^{1-\gamma}\exp\left( (T-t)(1-\gamma)\left(\frac{1}{2}\frac{\mu^2}{\gamma\sigma^2} + r\right) \right) \tag{2.16}$$

The optimal π∗ is constant and is given by:

$$\pi^* = \frac{\mu}{\gamma\sigma^2}$$

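In this case, the dynamics of Xt in (2.11) reduce to geometric Brownian motion

$$\begin{aligned} dX_t &= X_t[\pi^*\mu + r]\, dt + X_t\pi^*\sigma\, dW_t \\ &= \left(\frac{\mu^2}{\gamma\sigma^2} + r\right)X_t\, dt + \frac{\mu}{\gamma\sigma}X_t\, dW_t \\ \implies X_t &= X_0\exp\left( \left[\left(\frac{\mu^2}{\gamma\sigma^2} + r\right) - \frac{\mu^2}{2\gamma^2\sigma^2}\right]t + \frac{\mu}{\gamma\sigma}W_t \right) \end{aligned} \tag{2.17}$$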

2.3 Numerical Implementation/Monte Carlo Estimates

In this section we wish to verify the previous results. We must first discretize the continuous time setting of the previous section so that we can perform numerical analysis.

Once we have a model to simulate stock prices and the dynamics of the wealth processes, we can perform thousands of simulations and use these paths to find an estimate for the value function in (2.2).

Simulation Model

The stock price process and the wealth process with constant π∗ follow geometric Brownian motion and can be discretized by various methods such as the Euler-Maruyama method or the Milstein method, but since GBM has a closed form solution it is possible to simulate the log-price process exactly and then exponentiate.

Assume dSt = St(µdt + σdWt). Let us divide the interval [0, T ] into N equal time steps such that T = Nδ. The following code then simulates a price path, making use of the property $W_t \sim \sqrt{t}\,N(0,1)$.⁴

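#GenStockPrice.py
import numpy as np

def gen_BS_pricepath(mu, sigma, S0, N, T):
    # Simulate one Black-Scholes price path on N equal steps of [0, T]
    delta = T/float(N)
    s = np.zeros(N+1)   # log-price path, including the initial value
    s[0] = np.log(S0)
    for i in range(1, N+1):
        # exact log-price increment over one step of length delta
        s[i] = (s[i-1] + (mu-0.5*sigma**2)*delta
                + sigma*np.sqrt(delta)*np.random.normal())
    return np.exp(s)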

The riskless asset is not influenced by any stochastic element, so to simulate the riskless balance at each step δ⁵ we simply loop over:

#Bank Balance logic

b[i] = b[i-1]*np.exp(r*delta)

Optimal pi

For exponential utility $U(x) = -e^{-\alpha x}$, using our derived $V(t,x) = -e^{-\alpha x - \alpha\beta(T-t)}$ in (2.9) with $\beta = \frac{\mu^2}{2\alpha\sigma^2}$ and test parameters t = 0, T = 1, x = 1, µ = 0.05, σ = 0.2, α = 1 gives π∗ = 1.25.

Comparing the theoretically derived value function V (t, x) to a Monte Carlo simulation of the expected utility of the wealth process with optimal π∗ and 10,000 scenarios gives us a result in very close agreement:

• Analytic V (t, x) = −0.35678

• Simulated V (t, x) = −0.35656
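A comparison of this kind can be sketched as follows. Because the optimal wealth in (2.10) is normally distributed, the terminal wealth can be sampled directly and the utility averaged. This is an illustrative sketch rather than the code used for the figures above: the function names, the seed and the use of Python's standard `random` module are assumptions.

```python
import math
import random

def analytic_value(x, alpha, mu, sigma, T, t=0.0):
    """Closed-form V(t, x) from (2.9)."""
    return -math.exp(-alpha * x - mu**2 / (2.0 * sigma**2) * (T - t))

def monte_carlo_value(x, alpha, mu, sigma, T, n_paths=10000, seed=1):
    """Average of U(X_T) = -exp(-alpha X_T) with X_T sampled from (2.10)."""
    rng = random.Random(seed)
    drift = mu**2 / (sigma**2 * alpha) * T
    vol = mu / (sigma * alpha) * math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        x_T = x + drift + vol * rng.gauss(0.0, 1.0)
        total += -math.exp(-alpha * x_T)
    return total / n_paths

# Test parameters quoted in the text: x = 1, alpha = 1, mu = 0.05, sigma = 0.2, T = 1.
v_exact = analytic_value(1.0, 1.0, 0.05, 0.2, 1.0)
v_mc = monte_carlo_value(1.0, 1.0, 0.05, 0.2, 1.0)
print(v_exact, v_mc)
```

With 10,000 samples the standard error of the estimate is of order 0.001, so agreement to two or three decimal places is what one would expect.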

To verify that this value of π∗ is indeed the optimal choice for these parameters, we show numerically that the expected utility decreases when π∗ is perturbed.

For the power utility case with γ = 1/2 and σ = 0.2, Monte Carlo estimates of V (t, x) with 10,000 simulations lead to results in very close agreement with the theoretical values of V (t, x), as shown in the table below.

⁴ N(µ, σ²) denotes the normal distribution with mean µ and variance σ².

⁵ Interest is continuously compounded in these simulations.


Figure 2.1: Proportion of risky wealth vs. expected utility for π-values at increments of 0.02 with 1,000 simulations each. Optimal π∗ = 1.25 shown by red line.

Figure 2.2: A sample stock price path with optimal π∗ and optimal number of shares π∗Xt/St for µ = 0.05, σ = 0.2, α = 1 and T = 1 year (exponential utility).

Parameters               Monte Carlo V (t, x)    Theoretical V (t, x)
µ = 0.02, r = 0          6.383373                6.388118
µ = 0.05, r = 0          6.712548                6.732454
µ = 0, r = 0             6.324555                6.324555
µ = 0.02, r = 0.02       6.524122                6.517167
µ = 0.05, r = 0.02       6.884561                6.868459
µ = −0.02, r = 0.02      6.514090                6.517167
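The theoretical column of the table can be recomputed from (2.16), and a Monte Carlo estimate obtained by sampling the terminal wealth from (2.17). The sketch below is illustrative only. In particular, the initial wealth X0 = 10 and horizon T = 2 are assumptions: they are not stated alongside the table, but they are values for which (2.16) gives the tabulated theoretical numbers. The function names and seed are also hypothetical.

```python
import math
import random

def merton_power(mu, r, sigma, gamma):
    """Optimal proportion (2.14) and fictitious safe rate beta."""
    pi_star = mu / (gamma * sigma**2)
    beta = 0.5 * mu**2 / (gamma * sigma**2) + r
    return pi_star, beta

def value_power(t, x, mu, r, sigma, gamma, T):
    """Closed-form value function (2.16) for power utility."""
    _, beta = merton_power(mu, r, sigma, gamma)
    return x**(1.0 - gamma) / (1.0 - gamma) * math.exp((T - t) * (1.0 - gamma) * beta)

def mc_value_power(x, mu, r, sigma, gamma, T, n_paths=20000, seed=7):
    """Monte Carlo estimate of E[U(X_T)] with X_T sampled from (2.17)."""
    rng = random.Random(seed)
    pi_star, _ = merton_power(mu, r, sigma, gamma)
    drift = (pi_star * mu + r - 0.5 * (pi_star * sigma)**2) * T
    vol = pi_star * sigma * math.sqrt(T)
    total = 0.0
    for _ in range(n_paths):
        x_T = x * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += x_T**(1.0 - gamma) / (1.0 - gamma)
    return total / n_paths

# Assumed inputs: X0 = 10, T = 2 (see the note above); gamma = 1/2, sigma = 0.2 from the text.
theory = value_power(0.0, 10.0, 0.05, 0.02, 0.2, 0.5, 2.0)
estimate = mc_value_power(10.0, 0.05, 0.02, 0.2, 0.5, 2.0)
print(theory, estimate)
```

The Monte Carlo estimate depends on the seed and the number of paths, so only approximate agreement with the closed form should be expected.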


We turn our attention now to the investor’s risk aversion parameter. Intuitively, increasing α (or γ in the power utility case) means the investor is less willing to take on risk via the risky asset. Consequently, their expected wealth should be lower, but its standard deviation should also be smaller. Conversely, a low risk aversion means the investor is willing to invest a greater proportion of their wealth in the risky asset, which (in the case of these parameters) will result in a higher expected terminal wealth but with a higher standard deviation. This can be seen in Figure 2.3 for exponential utility, with the final wealth distribution of Xt tending to $N\left(X_0 + \frac{\mu^2}{\sigma^2\alpha}t,\ \frac{\mu^2}{\sigma^2\alpha^2}t\right)$ as expected due to the ABM behaviour of Xt.

In the case of the power utility function, the wealth dynamics are given by the geometric Brownian motion in (2.17), so Xt is log-normally distributed. This behaviour can be clearly seen in Figure 2.4 in the lack of symmetry and the longer right tail.

Figure 2.3: Histogram of terminal wealth distribution for different risk aversions in the exponential utility case. µ = 0.05, σ = 0.2, T = 1 year.

Figure 2.4: Histogram of terminal wealth distribution for different risk aversions in the power utility case. µ = 0.05, σ = 0.2, T = 1 year.


2.3.1 Re-balancing the portfolio

The results we have derived so far rely on the investor being able to rebalance their portfolio continuously in time. However, in practice it is impossible for an investor to rebalance in such a manner. In this subsection, we look at rebalancing our portfolio weights at various frequencies to see how this affects our terminal wealth and utility.

We have derived a closed form solution for the wealth process Xt and shown that, in the absence of transaction costs or fees, the optimal control πt is constant. This gives us a theoretical value in a continuous time setting with instant and continuous rebalancing, given by (2.17) in the case of power utility and (2.10) in the case of exponential utility.

We shall consider the power utility case, with Xt behaving as in (2.17). Recall that the stock price obeys geometric Brownian motion and the riskless account is continuously compounded at a constant interest rate r.

Figure 2.5: Sample path of the theoretical total wealth process, with the amount of wealth in the stock and in the riskless account if continuous rebalancing is performed.

Figure 2.5 demonstrates how Xt, the wealth in the stock and the wealth in the bank evolve in the case of continuous portfolio rebalancing.

As before, divide [0, T ] into N equal time steps such that T = Nδ. Let $X_{i\delta}$ be the investor’s total wealth at the discrete time point iδ for i ∈ {0, ..., N}. Let $X^S_{i\delta}$ denote the amount of wealth invested in the stock S at iδ and $X^B_{i\delta}$ be the amount of wealth invested in the riskless asset B.

At certain regular time-points kiδ for some k ∈ N the investor rebalances their portfolio⁶, buying or selling the risky asset (stock) such that the proportion of wealth they have invested in the stock is again π∗.

At each rebalance step kiδ, the investor wants to re-attain the proportion πt = π∗, so they must transfer

$$R_{ki\delta} = (1-\pi^*)X^S_{ki\delta} - \pi^* X^B_{ki\delta}$$

from the stock to the riskless asset.

At these rebalancing steps, the usual evolution process for the stock is replaced by $X^S_{ki\delta} - R_{ki\delta}$. Similarly for the riskless asset, we replace $X^B_{ki\delta}$ with $X^B_{ki\delta} + R_{ki\delta}$.
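The rebalancing rule described above can be sketched in a few lines. This is an illustrative implementation and not the thesis code: the function names, the seed and the default parameters in the usage lines are assumptions, and the stock is stepped with the exact log-normal update for the Section 2.2 dynamics dSt = St[(r + µ)dt + σdWt].

```python
import math
import random

def rebalance_transfer(x_stock, x_bank, pi_star):
    """Amount R moved from stock to bank so that stock / total equals pi_star."""
    return (1.0 - pi_star) * x_stock - pi_star * x_bank

def simulate_rebalanced(pi_star, mu, r, sigma, T, N, k, x0, seed=0):
    """Simulate wealth when the portfolio is rebalanced every k-th step.

    Returns the terminal total wealth and the path of the proportion pi_t.
    """
    rng = random.Random(seed)
    delta = T / N
    x_stock = pi_star * x0
    x_bank = (1.0 - pi_star) * x0
    proportions = [pi_star]
    for i in range(1, N + 1):
        z = rng.gauss(0.0, 1.0)
        # Stock wealth follows the stock return; bank wealth compounds at rate r.
        x_stock *= math.exp((r + mu - 0.5 * sigma**2) * delta
                            + sigma * math.sqrt(delta) * z)
        x_bank *= math.exp(r * delta)
        if i % k == 0:
            R = rebalance_transfer(x_stock, x_bank, pi_star)
            x_stock -= R
            x_bank += R
        proportions.append(x_stock / (x_stock + x_bank))
    return x_stock + x_bank, proportions

# Hypothetical run: one year of 250 daily steps, rebalancing every 50 steps.
total, props = simulate_rebalanced(1.25, 0.05, 0.02, 0.2, 1.0, 250, 50, 1.0)
print(total, props[49], props[50])
```

Between rebalancing dates the proportion drifts with the stock price, and at every k-th step it is reset to π∗, since the transfer leaves total wealth unchanged.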

Figure 2.6: Rebalancing πt with k = 50. πt evolves as usual according to the evolution of St and Bt until every 50th day, when the investor buys or sells shares to bring πt back to π∗, in this case π∗ = 1.25.

Figure 2.6 demonstrates how the proportion of wealth held in the stock S, i.e. πt, varies with time. The same parameters are used as at the start of Section 2.3.

When rebalancing is performed less often than daily (k = 5), as in Figure 2.7, the investor’s total actual wealth is very close to the theoretical wealth process given in (2.17), but begins to deviate from the optimal path.

Figure 2.8 shows the discrepancy between the theoretical wealth process and the investor’s actual wealth process for different rebalancing periods, demonstrating how actual wealth deviates from the optimal wealth path that would be obtained if the ratio π∗ were maintained continuously.

⁶ For simplicity assume there are 250 trading days in a year, 20 trading days in a month and 5 in a week (in reality there are approximately 252 trading days in a year).

Figure 2.7: Wealth in stock, riskless account and total wealth with the continuous time result over 5 years (T = 5, X0 = 1).

Figure 2.8: Optimal stock wealth (red) vs. actual stock wealth (green) for various rebalancing frequencies. (a) = Daily, (b) = Weekly, (c) = Fortnightly, (d) = Monthly, (e) = Bi-monthly, (f) = No rebalancing.


2.4 Addition of Consumption and Infinite Horizons

In this section we add an additional variable to our model. Now, let us assume the investor wants to optimize how they withdraw wealth from their portfolio in order to spend on goods and services.

Let us assume the stock price evolves as dSt = St[(r+ µ)dt+ σdWt] as in Section 2.2. Let πt be the proportion of wealth invested in the stock at time t and ct be the ’consumption rate’.

The SDE for Xt is:

$$\begin{aligned} dX_t &= \frac{\pi_t X_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t - c_t\, dt \\ &= \pi_t X_t[(r+\mu)\, dt + \sigma\, dW_t] + X_t(1-\pi_t) r\, dt - c_t\, dt \\ &= (\pi_t\mu X_t + rX_t - c_t)\, dt + X_t\pi_t\sigma\, dW_t \end{aligned} \tag{2.18}$$

Note that in the case of zero interest rate r = 0, which we will consider from now on for simplicity, the SDE for Xt becomes

dXt = (µπtXt − ct)dt+ σπtXtdWt (2.19)

We consider an infinite time horizon and we want to maximise over (π, c) ∈ A×C, where A is the set of admissible control processes π and C is the set of control processes for consumption c.⁷

In this problem, the investor is trying to maximise the log-utility of their consumption. Our value function is given by:

$$V(x) = \sup_{(\pi,c) \in \mathcal{A}\times\mathcal{C}} \mathbb{E}_x\left[ \int_0^\infty e^{-\delta t}\log(c_t)\, dt \right] \tag{2.20}$$

Since we are working in the infinite horizon case, the corresponding Hamilton-Jacobi-Bellman equation is given by Theorem 1.3.3 with discount rate δ. The coefficients b(x, π) and σ(x, π) in Theorem 1.3.3 correspond to (πµx− c) and σπx as in (2.19). We also have the running gain f = log(c).

$$-\delta V + \sup_{(\pi,c) \in \mathbb{R}^2}\left[ (\pi\mu x - c)\frac{\partial V}{\partial x} + \frac{1}{2}(\pi\sigma x)^2\frac{\partial^2 V}{\partial x^2} + \log(c) \right] = 0 \tag{2.21}$$

Let π∗, c∗ denote the optimal values of π, c respectively.

To find the maximising values of π and c, we set the partial derivatives of the expression inside the supremum in (2.21) to zero: $\frac{\partial}{\partial c}(\cdots) = 0$ and $\frac{\partial}{\partial \pi}(\cdots) = 0$.

⁷ c has the restrictions that ct ≥ 0 ∀t and $\int_0^\infty |c_t|\, dt < \infty$.


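$$\implies -\frac{\partial V}{\partial x} + \frac{1}{c^*} = 0 \quad\implies\quad c^* = 1\Big/\frac{\partial V}{\partial x}$$

Similarly,

$$\pi^* = \frac{-\mu\frac{\partial V}{\partial x}}{x\sigma^2\frac{\partial^2 V}{\partial x^2}}$$

Using the ansatz $V(x) = \frac{1}{\delta}\log(x) + C_1$ we calculate the derivatives

$$\frac{\partial V}{\partial x} = \frac{1}{\delta x}, \qquad \frac{\partial^2 V}{\partial x^2} = \frac{-1}{\delta x^2}$$

so that π∗ = µ/σ² and c∗ = δx⁸. Hence we see that the optimal consumption is proportional to the investor’s wealth at time t. The HJB equation (2.21), multiplied through by δ, becomes

$$\begin{aligned} 0 &= \mu\pi^* - \frac{c^*}{x} - \frac{1}{2}(\pi^*\sigma)^2 + \delta\log(c^*) - \delta\log(x) - \delta^2 C_1 \\ 0 &= \frac{\mu^2}{\sigma^2} - \delta - \frac{\mu^2}{2\sigma^2} + \delta\log\left(\frac{\delta x}{x}\right) - \delta^2 C_1 \end{aligned}$$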

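After re-arranging,

$$C_1 = \frac{\mu^2 - 2\sigma^2\delta + 2\sigma^2\delta\log(\delta)}{2\delta^2\sigma^2}$$

and for the optimal (π∗, c∗), the SDE for Xt in (2.19) becomes

$$dX_t = \left(\frac{\mu^2}{\sigma^2} - \delta\right)X_t\, dt + \frac{\mu}{\sigma}X_t\, dW_t$$

As in previous sections, the solution is a geometric Brownian motion

$$X_t = X_0\exp\left( \left(-\delta + \frac{\mu^2}{\sigma^2} - \frac{\mu^2}{2\sigma^2}\right)t + \frac{\mu}{\sigma}W_t \right) \tag{2.22}$$

V (x) is given by:

$$V(x) = \frac{1}{\delta}\log(x) + \frac{\mu^2 - 2\sigma^2\delta + 2\sigma^2\delta\log(\delta)}{2\delta^2\sigma^2} \tag{2.23}$$

⁸ c∗t is a constant fraction of Xt, meaning that the investor’s consumption rate is linearly linked to their current wealth.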


Numerical Results

From the value function (2.23), we aim to find an approximation to V (x) by simulating many wealth paths obeying (2.22). Since this problem is stated in the context of an infinite horizon, we take a large value of T for approximation purposes. Let T = 100 years.

For parameters µ = 0.1, σ = 0.4, X0 = S0 = 1, δ = 0.5, T = 100 years, assuming 250 trading days per year and running 1,000 simulations yields an estimate V (x) = −3.28909282. This is in close agreement with the analytic expression for the value function, V (x) = −3.26129436.
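A simulation of this kind can be sketched as follows. The closed form is (2.23); the Monte Carlo estimate approximates the discounted integral in (2.20) by a left Riemann sum along wealth paths generated from (2.22) with c∗t = δXt. To keep the example fast, the truncation horizon T = 30, step size 0.05 and 300 paths used here are assumptions that are much coarser than the settings quoted above; function names and seed are hypothetical.

```python
import math
import random

def value_log_consumption(x, mu, sigma, delta):
    """Closed-form value function (2.23)."""
    c1 = ((mu**2 - 2.0 * sigma**2 * delta + 2.0 * sigma**2 * delta * math.log(delta))
          / (2.0 * delta**2 * sigma**2))
    return math.log(x) / delta + c1

def mc_value_log_consumption(x, mu, sigma, delta, T=30.0, dt=0.05,
                             n_paths=300, seed=3):
    """Monte Carlo estimate of (2.20) under the optimal controls."""
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    drift = (mu**2 / (2.0 * sigma**2) - delta) * dt   # log-wealth drift in (2.22)
    vol = mu / sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        log_x = math.log(x)
        integral = 0.0
        for i in range(n_steps):
            t = i * dt
            # c*_t = delta * X_t, so log(c*_t) = log(delta) + log(X_t)
            integral += math.exp(-delta * t) * (math.log(delta) + log_x) * dt
            log_x += drift + vol * rng.gauss(0.0, 1.0)
        total += integral
    return total / n_paths

v_closed = value_log_consumption(1.0, 0.1, 0.4, 0.5)
v_sim = mc_value_log_consumption(1.0, 0.1, 0.4, 0.5)
print(v_closed, v_sim)
```

Because the discount factor decays like e^{−δt}, truncating at a finite T changes the integral only slightly once δT is large. The remaining error comes mainly from the time discretisation and the sampling noise.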

Figure 2.9: Optimal consumption, bank and stock wealth with stock price pathand total wealth shown. Note how consumption is a constant proportion oftotal wealth. µ = 0.1, σ = 0.4, X0 = S0 = 1, δ = 0.5, T = 1 year.

Chapter 3

Extension to Stochastic Volatility

It is unrealistic over longer time horizons to assume that interest rates and volatility will be constant. In this chapter we examine Merton’s problem in the presence of stochastic volatility.

3.1 Merton’s Problem in the Heston Model

In this section we address solving Merton’s problem in the presence of stochastic volatility. We examine the problem in the framework of the much celebrated and studied Heston model, pioneered by Steven Heston in his 1993 paper [11].

Assume St follows

$$dS_t = (\mu Y_t + r)S_t\, dt + \sqrt{Y_t}\, S_t\, dW^S_t$$

µ is the rate of return of the stock, as in the Black-Scholes framework, and we assume a fixed constant interest rate r. The main difference is that the volatility in the Heston model is itself a stochastic process: the variance Yt follows the dynamics

$$dY_t = \kappa(\theta - Y_t)\, dt + \xi\sqrt{Y_t}\, dW^Y_t$$

We say κ is the rate of mean reversion, i.e. the rate at which Yt returns to the long term mean given by θ. ξ is the volatility of the volatility.

In this model there are two driving Brownian motions, $W^S_t$ and $W^Y_t$. These are correlated with correlation ρ ∈ [−1, 1].¹

¹ In order to keep the volatility positive, we enforce that 2κθ > ξ². This is known as the Feller condition.
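For illustration, the two correlated SDEs above can be discretized as in the sketch below. The discretization scheme is not specified at this point in the text, so the choice here is an assumption: a ’full truncation’ Euler step for Yt (using max(Yt, 0) inside the coefficients) together with a log-Euler step for St. All function names, the seed and the example parameters are hypothetical.

```python
import math
import random

def feller_condition_holds(kappa, theta, xi):
    """Check 2 * kappa * theta > xi^2."""
    return 2.0 * kappa * theta > xi**2

def simulate_heston_path(mu, r, kappa, theta, xi, rho, s0, y0, T, N, seed=0):
    """Simulate one path of (S_t, Y_t) on N equal steps of [0, T]."""
    rng = random.Random(seed)
    dt = T / N
    s, y = [s0], [y0]
    for _ in range(N):
        z1 = rng.gauss(0.0, 1.0)   # drives the stock
        # z2 drives the variance and has correlation rho with z1
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        y_pos = max(y[-1], 0.0)    # full truncation: negative variance treated as 0
        s_next = s[-1] * math.exp((mu * y_pos + r - 0.5 * y_pos) * dt
                                  + math.sqrt(y_pos * dt) * z1)
        y_next = y[-1] + kappa * (theta - y_pos) * dt + xi * math.sqrt(y_pos * dt) * z2
        s.append(s_next)
        y.append(y_next)
    return s, y

# Hypothetical parameters satisfying the Feller condition (2 * 2 * 0.04 > 0.3^2).
ok = feller_condition_holds(2.0, 0.04, 0.3)
s_path, y_path = simulate_heston_path(1.0, 0.02, 2.0, 0.04, 0.3, -0.5,
                                      1.0, 0.04, 1.0, 250)
print(ok, s_path[-1], y_path[-1])
```

Even when the Feller condition holds for the continuous-time process, a discrete Euler step can overshoot below zero, which is why the truncation is applied inside the square root.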


Let us consider a finite horizon problem, where the investor seeks to optimise the proportion of wealth πt invested in the risky asset. Let Xt denote the investor’s wealth at time t. Xt follows

$$dX_t = \frac{X_t\pi_t}{S_t}\, dS_t + \frac{X_t(1-\pi_t)}{B_t}\, dB_t \tag{3.1}$$

$$\phantom{dX_t} = X_t[\pi_t\mu Y_t + r]\, dt + \pi_t X_t\sqrt{Y_t}\, dW^S_t \tag{3.2}$$

The investor’s optimisation problem is to find

$$V(t,x,y) = \sup_{\pi}\left( \mathbb{E}_{t,x,y}\left[ U(X_T) \right] \right)$$

where the utility function is a power utility of the form $U(x) = \frac{1}{1-\gamma}x^{1-\gamma}$.

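The HJB equation in the finite horizon case is:

$$\frac{\partial V}{\partial t} + \sup_{\pi}\left( x(\pi\mu y + r)\frac{\partial V}{\partial x} + \kappa(\theta - y)\frac{\partial V}{\partial y} + \frac{1}{2}x^2\pi^2 y\frac{\partial^2 V}{\partial x^2} + x\pi\rho\xi y\frac{\partial^2 V}{\partial x\partial y} + \frac{1}{2}\xi^2 y\frac{\partial^2 V}{\partial y^2} \right) = 0$$

In a similar style to our constant volatility 1-dimensional derivation, to find the optimal πt we differentiate the expression inside the supremum with respect to π and equate to zero:

$$\frac{d}{d\pi}\left( x(\pi\mu y + r)\frac{\partial V}{\partial x} + \kappa(\theta - y)\frac{\partial V}{\partial y} + \frac{1}{2}x^2\pi^2 y\frac{\partial^2 V}{\partial x^2} + x\pi\rho\xi y\frac{\partial^2 V}{\partial x\partial y} + \frac{1}{2}\xi^2 y\frac{\partial^2 V}{\partial y^2} \right) = 0$$

$$\implies \mu x y\frac{\partial V}{\partial x} + x^2 y\pi^*\frac{\partial^2 V}{\partial x^2} + x\rho y\xi\frac{\partial^2 V}{\partial x\partial y} = 0$$

$$\implies \pi^* = \frac{-\mu x\frac{\partial V}{\partial x} - \rho\xi x\frac{\partial^2 V}{\partial x\partial y}}{x^2\frac{\partial^2 V}{\partial x^2}} = \frac{-\mu\frac{\partial V}{\partial x} - \rho\xi\frac{\partial^2 V}{\partial x\partial y}}{x\frac{\partial^2 V}{\partial x^2}} \tag{3.3}$$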

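Substituting this into our HJB equation reduces it to:

$$\frac{\partial V}{\partial t} = \frac{\mu^2 y}{2}\frac{\left(\frac{\partial V}{\partial x}\right)^2}{\frac{\partial^2 V}{\partial x^2}} + \mu\rho\xi y\,\frac{\frac{\partial V}{\partial x}\frac{\partial^2 V}{\partial x\partial y}}{\frac{\partial^2 V}{\partial x^2}} + \frac{\rho^2\xi^2 y}{2}\frac{\left(\frac{\partial^2 V}{\partial x\partial y}\right)^2}{\frac{\partial^2 V}{\partial x^2}} - rx\frac{\partial V}{\partial x} - \kappa(\theta - y)\frac{\partial V}{\partial y} - \frac{\xi^2 y}{2}\frac{\partial^2 V}{\partial y^2}$$

For the purposes of our numerical analysis, we will consider only the power utility case. We can exploit results on homotheticity as in [9], defining a reduced value function² v(t, y) = (1−γ)V (t, 1, y), where $V(t,x,y) = x^{1-\gamma}V(t,1,y)$. The partial derivatives are

² See Proposition 1.3.5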


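$$\begin{aligned} \frac{\partial V}{\partial x} &= (1-\gamma)x^{-\gamma}V(t,1,y) \\ \frac{\partial^2 V}{\partial x^2} &= -\gamma(1-\gamma)x^{-\gamma-1}V(t,1,y) \\ \frac{\partial^2 V}{\partial x\partial y} &= (1-\gamma)x^{-\gamma}\frac{\partial V(t,1,y)}{\partial y} \\ \frac{\partial^2 V}{\partial y^2} &= x^{1-\gamma}\frac{\partial^2 V(t,1,y)}{\partial y^2} \end{aligned}$$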

The problem is simplified to solving a corresponding ’reduced’ HJB equation:

∂v

∂t=

1− γγ

[

(−1

2µ2y − γr

)v − µyρξ ∂v

∂y− 1

2vρ2ξ2y(

∂v

∂y)2]

− κ(θ − y)∂v

∂y− 1

2ξ2y

∂2v

∂y2

with v(T, y) = 1

(3.4)

When these partial derivatives are substituted into (3.3), we find that the optimal ratio $\pi^*_t$ becomes

$$\pi^*_t = \frac{\mu}{\gamma} + \frac{\rho\xi}{\gamma}\,\frac{\partial v/\partial y}{v} \tag{3.5}$$
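The substitution behind (3.5) is short enough to write out. As a sketch, write $\bar V = V(t,1,y)$ (a shorthand introduced here), so that $v = (1-\gamma)\bar V$ and therefore $\frac{\partial v/\partial y}{v} = \frac{\partial \bar V/\partial y}{\bar V}$:

```latex
\begin{aligned}
\pi^* &= -\frac{\mu\,(1-\gamma)x^{-\gamma}\bar V + \rho\xi\,(1-\gamma)x^{-\gamma}\,\partial_y \bar V}
               {x\cdot\big(-\gamma(1-\gamma)x^{-\gamma-1}\bar V\big)} \\
      &= \frac{\mu \bar V + \rho\xi\,\partial_y \bar V}{\gamma \bar V}
       = \frac{\mu}{\gamma} + \frac{\rho\xi}{\gamma}\,\frac{\partial v/\partial y}{v}
\end{aligned}
```

All powers of $x$ cancel, which is why the optimal ratio does not depend on the wealth level.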

One possible approach to attacking this problem under power utility and exponential utility is to use the result of Boguslavskaya and Muravey (2015) [12], which reduces the optimal control problem to a linear parabolic boundary problem. Kallsen and Muhle-Karbe showed that in a general affine stochastic volatility setting, the coefficients in the reduced PDE are affine linear functions, meaning we can use the ansatz $v = e^{A(t)+B(t)y}$ for some smooth functions $A$ and $B$ [13].

Calculating derivatives of $v$ gives

$$\frac{\partial v}{\partial y} = B(t)e^{A(t)+B(t)y}$$

$$\frac{\partial^2 v}{\partial y^2} = B(t)^2 e^{A(t)+B(t)y}$$

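Inserting these into (3.4) gives (after cancelling exponential terms throughout)

$$\frac{dA(t)}{dt} + \frac{dB(t)}{dt}y = \frac{1-\gamma}{\gamma}\left[ \left(-\frac{1}{2}\mu^2 y - \gamma r\right) - \mu y\rho\xi B(t) - \frac{1}{2}\rho^2\xi^2 y B(t)^2 \right] - \kappa(\theta - y)B(t) - \frac{1}{2}\xi^2 y B(t)^2 \tag{3.6}$$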

To solve this we use the following result.


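Result 3.1.1 (Liu & Muhle-Karbe's Representation of A(t) and B(t)). Let:

$$a = \frac{(\gamma-1)\mu^2}{2\gamma}, \qquad b = \frac{\gamma-1}{\gamma}\mu\rho\xi + \kappa, \qquad c = \frac{\xi^2}{2}\left(\frac{\gamma-1}{\gamma}\rho^2 - 1\right), \qquad D = b^2 - 4ac$$

By comparing coefficients in (3.6) we can separate terms to give a system of ODEs for $A(t)$ and $B(t)$:

$$\frac{dB(t)}{dt} = cB(t)^2 + bB(t) + a, \qquad B(T) = 0$$

$$\frac{dA(t)}{dt} = (\gamma-1)r - \kappa\theta B(t), \qquad A(T) = 0$$

$A(t)$ is a straightforward integral and $B(t)$ solves a Riccati equation. These are solved by³:

$$B(t) = -2a\,\frac{e^{\sqrt{D}(T-t)} - 1}{e^{\sqrt{D}(T-t)}(b+\sqrt{D}) - b + \sqrt{D}}$$

$$A(t) = r(1-\gamma)(T-t) + \kappa\theta\int_t^T B(s)\,ds$$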

Then the theoretical value function is:

$$V(t,x,y) = \frac{1}{1-\gamma}x^{1-\gamma}e^{A(t)+B(t)y}$$

For a more detailed explanation, see [9].
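As a sanity check on Result 3.1.1, the closed form for $B(t)$ can be tested numerically against its Riccati equation, and $A(t)$ can be evaluated by quadrature. The following is a minimal sketch, assuming the parameter values of Table 3.1 with $\gamma = 0.5$; the function names and the trapezoid rule are illustrative choices and are not taken from the appendix code.

```python
import numpy as np

def riccati_coefficients(mu, gamma, rho, xi, kappa):
    """Coefficients a, b, c of dB/dt = c*B**2 + b*B + a (Result 3.1.1)."""
    a = (gamma - 1.0) * mu**2 / (2.0 * gamma)
    b = (gamma - 1.0) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1.0) / gamma * rho**2 - 1.0)
    return a, b, c

def B_closed_form(t, T, a, b, c):
    """Closed-form B(t); requires D = b**2 - 4ac > 0."""
    D = b**2 - 4.0 * a * c
    if D <= 0:
        raise ValueError("closed form requires D > 0")
    sqrt_d = np.sqrt(D)
    e = np.exp(sqrt_d * (T - t))
    return -2.0 * a * (e - 1.0) / (e * (b + sqrt_d) - b + sqrt_d)

def A_by_quadrature(t, T, r, gamma, kappa, theta, a, b, c, n=2001):
    """A(t) = r(1-gamma)(T-t) + kappa*theta*int_t^T B(s) ds, trapezoid rule."""
    s = np.linspace(t, T, n)
    Bs = B_closed_form(s, T, a, b, c)
    integral = np.sum(0.5 * (Bs[1:] + Bs[:-1]) * np.diff(s))
    return r * (1.0 - gamma) * (T - t) + kappa * theta * integral

# Assumed example parameters (Table 3.1, gamma = 0.5).
a, b, c = riccati_coefficients(mu=0.05, gamma=0.5, rho=0.3, xi=0.38, kappa=5.0)
T, t, h = 1.0, 0.3, 1e-5

# Central-difference estimate of dB/dt compared with the Riccati right-hand side.
dB_dt = (B_closed_form(t + h, T, a, b, c) - B_closed_form(t - h, T, a, b, c)) / (2 * h)
B_t = B_closed_form(t, T, a, b, c)
print("ODE residual:", dB_dt - (c * B_t**2 + b * B_t + a))
print("A(0):", A_by_quadrature(0.0, T, 0.0, 0.5, 5.0, 0.024, a, b, c))
```

A residual close to machine precision, together with $B(T) = 0$, suggests the closed form is consistent with the ODE system for these parameters.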

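Importantly, we now have a deterministic expression for $\pi^*_t$ by using Result 3.1.1 with (3.5), since $\frac{\partial v/\partial y}{v} = B(t)$:

$$\pi^*_t = \frac{\mu}{\gamma} + \frac{\rho\xi B(t)}{\gamma} \tag{3.7}$$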

Perhaps surprisingly, Merton's problem has an explicit solution even in the Heston model. However, unlike the constant volatility Black-Scholes case, $\pi^*_t$ is not constant: it is a deterministic function of the current time $t$ and the time horizon $T$.
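The dependence of $\pi^*_t$ on $t$ and $T$ can be tabulated directly from (3.7). The following is a small sketch, assuming the Table 3.1 parameters and $\gamma = 0.5$ as defaults; `riccati_B` and `optimal_fraction` are illustrative names, not functions from the appendix.

```python
import numpy as np

def riccati_B(t, T, mu, gamma, rho, xi, kappa):
    """Closed-form B(t) from Result 3.1.1 (assumes D > 0)."""
    a = (gamma - 1.0) * mu**2 / (2.0 * gamma)
    b = (gamma - 1.0) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1.0) / gamma * rho**2 - 1.0)
    D = b**2 - 4.0 * a * c
    if D <= 0:
        raise ValueError("closed form requires D > 0")
    sqrt_d = np.sqrt(D)
    e = np.exp(sqrt_d * (T - t))
    return -2.0 * a * (e - 1.0) / (e * (b + sqrt_d) - b + sqrt_d)

def optimal_fraction(t, T, mu=0.05, gamma=0.5, rho=0.3, xi=0.38, kappa=5.0):
    """pi*_t = mu/gamma + rho*xi*B(t)/gamma, equation (3.7)."""
    return mu / gamma + rho * xi * riccati_B(t, T, mu, gamma, rho, xi, kappa) / gamma

# Optimal fraction today (t = 0) and at the horizon (t = T) for several horizons.
for horizon in (1.0, 2.0, 5.0, 10.0):
    print(horizon, optimal_fraction(0.0, horizon), optimal_fraction(horizon, horizon))
```

At $t = T$ the second term vanishes because $B(T) = 0$, so the last column should equal $\mu/\gamma$ for every horizon.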

Numerical Analysis

In this section we wish to explore the properties of the optimal portfolio $\pi^*_t$ and confirm that the $V(t,x,y)$ explicitly derived in Result 3.1.1 is in line with simulated estimates.

³ Provided $D > 0$.


For our simulation model, we can no longer model the wealth process as Geometric Brownian Motion, because $\pi^*_t$ is a function of time $t$ rather than a constant as in the Black-Scholes case. Instead we can simulate the behaviour of $X_t$ by using a finite difference method (an Euler discretisation of the SDEs).

With $N$ discretisation steps of equal size $dt$ such that $T = N\,dt$, and using (3.1) as the dynamics of $X_t$, we first generate a $\pi^*_t$ array by calculating the deterministic functions $A(t)$ and $B(t)$. We then draw pairs of increments from a bivariate normal distribution with correlation $\rho$. We can then simulate $S_t$, $Y_t$ and $X_t$ by looping the following code over $N$ time steps and for $M$ paths:

    # Loop body for path j and time step i (i = 1, ..., N).
    # `sigma` is the vol-of-vol xi of the text; `cov` must be the
    # correlation matrix [[1, rho], [rho, 1]].
    epsilon = np.random.multivariate_normal([0, 0], cov)
    dW_S = epsilon[0]*np.sqrt(dt)
    dW_Y = epsilon[1]*np.sqrt(dt)
    pi[j,i] = mu/gamma + ((rho*sigma)/gamma)*B[j,i]
    # Euler steps; the variance from the previous step is used throughout,
    # since Y_values[j,i] is only assigned below.
    S_values[j,i] = (S_values[j,i-1]
                     + (Y_values[j,i-1]*mu + r)*S_values[j,i-1]*dt
                     + np.sqrt(Y_values[j,i-1])*S_values[j,i-1]*dW_S)
    X_values[j,i] = (X_values[j,i-1]
                     + (pi[j,i]*Y_values[j,i-1]*mu + r)*X_values[j,i-1]*dt
                     + pi[j,i]*np.sqrt(Y_values[j,i-1])*X_values[j,i-1]*dW_S)
    Y_values[j,i] = (Y_values[j,i-1]
                     + kappa*(theta - Y_values[j,i-1])*dt
                     + sigma*np.sqrt(Y_values[j,i-1])*dW_Y)
    Y_values[j,i] = abs(Y_values[j,i])  # force non-negative variance
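The loop above can also be written in vectorised form, stepping all paths at once. The following is a sketch under stated assumptions rather than the code used for the figures: the correlated increments are built as $dW^Y = \rho\,dW^S + \sqrt{1-\rho^2}\,dZ$ instead of calling `multivariate_normal`, $\pi^*$ is evaluated at the left endpoint of each step, the defaults are the Table 3.1 values with $\gamma = 0.5$, and the function name `simulate_heston_wealth` is hypothetical.

```python
import numpy as np

def simulate_heston_wealth(gamma=0.5, mu=0.05, r=0.0, kappa=5.0, theta=0.024,
                           xi=0.38, rho=0.3, x0=1.0, y0=0.1, T=1.0,
                           n_steps=250, n_paths=100, seed=0):
    """Euler scheme for wealth X and variance Y under the optimal ratio (3.7)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps

    # Deterministic B(t) and pi*_t on the time grid (Result 3.1.1).
    a = (gamma - 1.0) * mu**2 / (2.0 * gamma)
    b = (gamma - 1.0) / gamma * mu * rho * xi + kappa
    c = 0.5 * xi**2 * ((gamma - 1.0) / gamma * rho**2 - 1.0)
    D = b**2 - 4.0 * a * c
    if D <= 0:
        raise ValueError("closed form requires D > 0")
    sqrt_d = np.sqrt(D)
    tau = T - dt * np.arange(n_steps + 1)
    e = np.exp(sqrt_d * tau)
    B = -2.0 * a * (e - 1.0) / (e * (b + sqrt_d) - b + sqrt_d)
    pi = mu / gamma + rho * xi * B / gamma

    X = np.empty((n_paths, n_steps + 1))
    Y = np.empty((n_paths, n_steps + 1))
    X[:, 0] = x0
    Y[:, 0] = y0
    for i in range(1, n_steps + 1):
        z1 = rng.standard_normal(n_paths)
        z2 = rng.standard_normal(n_paths)
        dW_S = np.sqrt(dt) * z1
        dW_Y = np.sqrt(dt) * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
        x_prev, y_prev = X[:, i - 1], Y[:, i - 1]
        # Wealth dynamics (3.2), with pi taken at the start of the step.
        X[:, i] = (x_prev + (pi[i - 1] * mu * y_prev + r) * x_prev * dt
                   + pi[i - 1] * np.sqrt(y_prev) * x_prev * dW_S)
        # Variance dynamics, reflected at zero as in the loop above.
        Y[:, i] = np.abs(y_prev + kappa * (theta - y_prev) * dt
                         + xi * np.sqrt(y_prev) * dW_Y)
    return pi, X, Y

pi, X, Y = simulate_heston_wealth()
gamma = 0.5
print("Monte Carlo estimate of V(0, 1, 0.1):", np.mean(X[:, -1]**(1 - gamma) / (1 - gamma)))
```

Reflecting the variance at zero mirrors the `abs` call in the original loop; other fixes for negative variance (such as truncation) would give slightly different estimates.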

To verify that the simulated $V(t,x,y)$ agrees with the analytic result, we use the example parameters in Table 3.1.

Table 3.1: Simulation Parameters

    Parameter   Value
    µ           0.05
    r           0
    T           1
    X0          1
    S0          1
    Y0          0.1
    θ           0.024
    κ           5
    ρ           0.3
    ξ           0.38

As Figure 3.1 shows, the simulated $V(t,x,y)$ is in very close agreement with the analytic result given in Result 3.1.1.

Figure 3.1: Simulating $V(t,x,y)$ with 100 scenarios for $\gamma \in (0,1)$ against analytic $V(t,x,y)$.

Behaviour of π∗t

From (3.7) we can see that $\pi^*_t$ is a deterministic function involving $B(t)$. By the definition of $B(t)$, this means that it depends on both the time horizon $T$ and the current time $t$. Since we have the terminal condition $B(T) = 0$, we expect that as $t \to T$ the $\frac{\rho\xi B(t)}{\gamma}$ term will vanish, meaning $\pi^*_t$ approaches $\frac{\mu}{\gamma}$.


Figure 3.2 demonstrates how $\pi^*_t$ evolves for different time horizons, using the parameters in Table 3.1. It is clear that in a Heston type model, the investor's horizon $T$ affects the optimal proportion of wealth he should place in the risky asset $S$. As expected, $\pi^*_t$ approaches $\frac{\mu}{\gamma}$, with equality at $T$.⁴

Figure 3.2: Behaviour of $\pi^*_t$ with $\gamma = 0.5$. (a) T = 1 year, (b) T = 2 years, (c) T = 5 years, (d) T = 10 years.

⁴ For these parameters $\frac{\mu}{\gamma} = \frac{0.05}{0.5} = 0.1$.

Chapter 4

Transaction Costs

In this chapter, we briefly touch on Merton's problem when transaction costs are present. This is an important step towards a more realistic model.

In the transaction-cost free environment, an investor would ideally rebalance his portfolio by trading as close to continuously as possible. When transaction costs are added, however, this is irrational behaviour, as the cost of rebalancing continuously is greater than the benefit in utility gained. We will see that it is only beneficial for the investor to trade when the ratio $\pi_t$ moves outside certain bounds.

We say the investor pays a fixed transaction cost $\psi \in \mathbb{R}^+$ every time they buy or sell any amount of the stock.¹

4.1 Proportional Transaction Costs

Let us assume the investor trades as in Chapter 2, starting with a wealth of $10,000 and rebalancing daily, but in the presence of a 1% proportional transaction cost and a small fixed fee of $5.

From Figure 4.1, we see clearly that maintaining the optimal constant $\pi^*$ as in the frictionless case is far from the rational way to trade: due to the cumulative fees paid, rebalancing regularly to keep $\pi_t$ at $\pi^*$ leads to the investor losing his wealth. For this reason we must modify our strategy and framework.
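The effect of the fees can be reproduced on a single simulated path. The following is a rough sketch and not the code behind Figure 4.1: it assumes $\mu = 0.05$, $\sigma = 0.2$, $r = 0$ and $\pi^* = 1.25$ (the values used in the appendix), charges the 1% cost on the traded value plus the $5 fee on every trade, and deducts all costs from the bank account. The function name and the cost-accounting convention are assumptions.

```python
import numpy as np

def simulate_rebalancing(x0=10_000.0, mu=0.05, sigma=0.2, r=0.0, pi=1.25,
                         prop_cost=0.01, fixed_fee=5.0, n_steps=250, T=1.0, seed=0):
    """Terminal wealth with and without costs under daily rebalancing to pi."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    z = rng.standard_normal(n_steps)
    # Exact GBM growth factors of the stock over each step.
    stock_growth = np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
    bank_growth = np.exp(r * dt)

    stock, bank = pi * x0, (1.0 - pi) * x0            # account paying costs
    stock_free, bank_free = pi * x0, (1.0 - pi) * x0  # frictionless benchmark
    fees = 0.0
    for g in stock_growth:
        stock, bank = stock * g, bank * bank_growth
        stock_free, bank_free = stock_free * g, bank_free * bank_growth

        # Frictionless account: rebalance to pi at no cost.
        wealth_free = stock_free + bank_free
        stock_free, bank_free = pi * wealth_free, (1.0 - pi) * wealth_free

        # Costly account: same target, but each trade is charged.
        trade = pi * (stock + bank) - stock   # > 0 buys stock, < 0 sells
        if trade != 0.0:
            cost = prop_cost * abs(trade) + fixed_fee
            stock += trade
            bank -= trade + cost
            fees += cost
    return stock + bank, stock_free + bank_free, fees

wealth_with_costs, wealth_frictionless, total_fees = simulate_rebalancing()
print(wealth_with_costs, wealth_frictionless, total_fees)
```

With 250 trades the fixed fee alone removes at least $1,250 from a $10,000 account, which is consistent with the qualitative picture described above.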

Proportional transaction costs were first studied by Magill and Constantinides [7] in 1976 and expanded upon by Davis and Norman in 1990, who introduced the notion of the no-trade region and provided mathematical rigour. Davis and Norman showed that the optimal strategy is to make the minimal trade required (if necessary) to reach the closest point in the 'wedge' defined by the no-trade region [6].
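The "minimal trade to the wedge" rule can be illustrated with a small sketch. It assumes positive total wealth and that the no-trade region is described by two stock-fraction boundaries `p_low` and `p_high`; these boundaries are hypothetical inputs here, since the true ones come from solving a free boundary problem. The cost parameters `lam` and `eps` play the role of $\lambda$ and $\varepsilon$ in the bid/ask model of this chapter.

```python
def rebalance_to_no_trade_region(x, y, p_low, p_high, lam, eps):
    """Smallest trade moving (bank x, stock y) onto the nearest boundary.

    Buying `bought` units of stock value costs (1 + lam) * bought from the bank;
    selling `sold` units of stock value yields (1 - eps) * sold to the bank.
    """
    wealth = x + y
    if wealth <= 0:
        raise ValueError("sketch assumes positive total wealth")
    fraction = y / wealth
    if fraction < p_low:
        # Solve (y + bought) / (wealth - lam * bought) = p_low for bought.
        bought = (p_low * wealth - y) / (1.0 + p_low * lam)
        return x - (1.0 + lam) * bought, y + bought
    if fraction > p_high:
        # Solve (y - sold) / (wealth - eps * sold) = p_high for sold.
        sold = (y - p_high * wealth) / (1.0 - p_high * eps)
        return x + (1.0 - eps) * sold, y - sold
    return x, y  # inside the no-trade region: do nothing

# Hypothetical boundaries 0.3 and 0.5 with 1% costs on each side.
print(rebalance_to_no_trade_region(80.0, 20.0, 0.3, 0.5, 0.01, 0.01))
print(rebalance_to_no_trade_region(60.0, 40.0, 0.3, 0.5, 0.01, 0.01))
print(rebalance_to_no_trade_region(20.0, 80.0, 0.3, 0.5, 0.01, 0.01))
```

The investor trades only up to the boundary, never to the interior, because any further trade would incur costs without being required.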

¹ Many brokers charge a fee to make trades. This can vary from as low as $5 to upwards of $100.


Figure 4.1: Actual wealth with daily rebalancing in the presence of transaction costs vs. theoretical transaction-cost-free wealth.

Assume $S_t$ follows $dS_t = S_t(\mu\,dt + \sigma\,dW_t)$ and $dB_t = rB_t\,dt$ as usual. We use a similar notation to Davis and Norman.

To model a bid/ask spread, we now assume the investor can buy the stock at the ask price $S^A_t$ and sell the stock at the bid price $S^B_t$, given by:

Investor buys at $S^A_t = (1+\lambda)S_t$

Investor sells at $S^B_t = (1-\varepsilon)S_t$

with $\varepsilon, \lambda \in [0,1)$.

Let $D_t$ and $L_t$ be the cumulative wealth from selling and buying stock respectively. Let $X_t$ and $Y_t$ be the amounts invested in the riskless asset and the stock respectively ($X_0 = x$, $Y_0 = y$), so our total wealth at time $t$ is now $Z_t = X_t + Y_t$ and the starting wealth is $x + y$.

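This gives rise to the wealth equations:

$$\begin{aligned} dX_t &= (rX_t - c_t)\,dt - (1+\lambda)\,dL_t + (1-\varepsilon)\,dD_t, & X_0 &= x \\ dY_t &= \mu Y_t\,dt + \sigma Y_t\,dW_t + dL_t - dD_t, & Y_0 &= y \end{aligned} \tag{4.1}$$

where $c_t$ is the investor's consumption rate.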

In the infinite horizon case, the value function is defined for a utility function $U$ by:

$$V(x,y) = \sup_{(c,L,D)} \mathbb{E}\left[\int_0^\infty e^{-\delta t}\,U(c(t))\,dt\right] \tag{4.2}$$

Davis and Norman showed that the holdings at time $t$ are within a closed region, given by

$$S_{\lambda,\varepsilon} = \left\{(x,y) \in \mathbb{R}^2 : x + (1-\varepsilon)y \ge 0 \text{ and } x + (1+\lambda)y \ge 0\right\}$$
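The two inequalities say that the position can be liquidated without going into debt: the first covers a long stock position sold at the bid, the second a short position bought back at the ask. As a small illustrative sketch (the function name is hypothetical):

```python
def in_solvency_region(x, y, lam, eps):
    """True if (bank x, stock y) satisfies both inequalities defining S_{lam,eps}."""
    long_liquidation = x + (1.0 - eps) * y   # value after selling a long position
    short_liquidation = x + (1.0 + lam) * y  # value after closing a short position
    return long_liquidation >= 0.0 and short_liquidation >= 0.0

# Example holdings with 1% costs on each side.
for holdings in [(1.0, 1.0), (-1.0, 1.0), (2.0, -1.0), (1.0, -1.0)]:
    print(holdings, in_solvency_region(*holdings, lam=0.01, eps=0.01))
```

For $y \ge 0$ only the first inequality can bind, and for $y < 0$ only the second, so the region is a wedge in the $(x, y)$ plane.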

The investor wants to find a triplet $(c, L, D) \in \mathcal{A}(x,y)$ which maximizes $V(x,y) = \sup\left(\mathbb{E}[U(Z_T)]\right)$. By using Proposition 1.3.5 we can factor $V(t,x,y) = x^{\gamma}V(t,1,y)$.

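Using the dynamics in (4.1), the HJB equation for this problem becomes:

$$-\delta V + \sup_{c,l,d}\left[ \frac{1}{2}\sigma^2 y^2\frac{\partial^2 V}{\partial y^2} + (rx - c)\frac{\partial V}{\partial x} + \mu y\frac{\partial V}{\partial y} + \frac{1}{\gamma}c^{\gamma} + \left(-(1+\lambda)\frac{\partial V}{\partial x} + \frac{\partial V}{\partial y}\right)l + \left((1-\varepsilon)\frac{\partial V}{\partial x} - \frac{\partial V}{\partial y}\right)d \right] = 0$$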

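We differentiate with respect to $c$ and set the result to zero to find the maximum. This gives the optimal consumption:

$$(c^*)^{\gamma-1} = \frac{\partial V}{\partial x} \implies c^* = \left(\frac{\partial V}{\partial x}\right)^{\frac{1}{\gamma-1}}$$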

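In the no-trade region, $l$ and $d$ are zero as the investor does not make any trades, so the value function satisfies:

$$-\delta V + \sup_{c}\left[ \frac{1}{2}\sigma^2 y^2\frac{\partial^2 V}{\partial y^2} + (rx - c)\frac{\partial V}{\partial x} + \mu y\frac{\partial V}{\partial y} + \frac{1}{\gamma}c^{\gamma} \right] = 0$$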

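In the buy region, $dL$ is active and $dD = 0$, as the investor buys the stock to rebalance his portfolio. The coefficient of $l$ in the HJB equation must then vanish, meaning

$$\frac{\partial V}{\partial y} = (1+\lambda)\frac{\partial V}{\partial x} \tag{4.3}$$

Similarly, in the sell region, $dD$ is active and $dL = 0$, so the coefficient of $d$ vanishes:

$$\frac{\partial V}{\partial y} = (1-\varepsilon)\frac{\partial V}{\partial x} \tag{4.4}$$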

We know that the value function is concave and, by the homotheticity property, we have reason to believe that the no-trade region is a cone in $\mathbb{R}^2$ [10].

The main difficulty arises when trying to solve the HJB equation directly, as unlike in the classical or Heston case, it cannot be solved analytically [?].

By reducing the dimensionality of the problem, exploiting the homotheticity property and solving a free boundary problem, it can be shown that the optimal strategy consists of a pair of 'local time' processes. Davis and Norman showed how to numerically calculate the buy and sell boundaries. This is still very much an active area of research.

Chapter 5

CONCLUSION

In this paper, we have presented solutions to Merton's portfolio problem in various settings using the dynamic programming approach. We have demonstrated and numerically verified how an investor can optimise his portfolio when his utility function is in the class of HARA functions and stock prices are assumed to obey Geometric Brownian Motion.

In Chapter 2, we showed that the optimal portfolio consists of holding a constant proportion of wealth in the risky asset in the idealised model where volatility is constant and there are no transaction costs. This is true in both the finite and infinite time horizon cases.

In the stochastic volatility setting, we showed that, rather surprisingly, an explicit solution exists and the optimal portfolio is characterised by a deterministic function of time. In Chapter 3 we presented a simulation model to verify this optimal strategy via Monte Carlo estimation.

The main difficulties arise when proportional transaction costs are present, resulting in the HJB equation no longer having an explicit solution. Various approaches to defining the boundaries of the trading regions have been proposed, such as those by Muthuraman and Zha (2008) and Budhiraja and Ross (2007).


Bibliography

[1] R. Bellman, "The theory of dynamic programming", Bull. Amer. Math. Soc., vol. 60, no. 6, pp. 503-515, 1954.

[2] R. C. Merton, "An intertemporal capital asset pricing model", Econometrica, vol. 41, no. 5, pp. 867-887, 1973.

[3] R. C. Merton, "Lifetime portfolio selection under uncertainty: The continuous-time case", The Review of Economics and Statistics, vol. 51, no. 3, pp. 247-257, 1969.

[4] R. C. Merton, "Optimal consumption and portfolio rules in a continuous-time model", J. Econom. Theory, vol. 3, no. 4, pp. 373-413, 1971.

[5] H. Pham, "Continuous-time stochastic control and optimization with financial applications", Springer-Verlag, 2009.

[6] M. H. A. Davis and A. R. Norman, "Portfolio selection with transaction costs", Mathematics of Operations Research, vol. 15, no. 4, pp. 676-713, 1990.

[7] M. Magill and G. M. Constantinides, "Portfolio selection with transactions costs", J. Econom. Theory, vol. 13, no. 2, pp. 245-263, 1976.

[8] H. Liu and M. Loewenstein, "Optimal portfolio selection with transaction costs and finite horizons", The Review of Financial Studies, vol. 15, no. 3, pp. 805-835, 2002.

[9] R. Liu and J. Muhle-Karbe, "Portfolio choice with stochastic investment opportunities: a user's guide", 2013.

[10] K. Muthuraman and S. Kumar, "Solving free-boundary problems with applications in finance", Now Publishers, 2008.

[11] S. L. Heston, "A closed-form solution for options with stochastic volatility with applications to bond and currency options", Review of Financial Studies, vol. 6, no. 2, pp. 327-343, 1993.

[12] E. Boguslavskaya and D. Muravey, "An explicit solution for optimal investment in Heston model", Teor. Veroyatnost. i Primenen, vol. 60, no. 4, pp. 811-819, 2015.

[13] J. Kallsen and J. Muhle-Karbe, "Utility maximization in affine stochastic volatility models", 2008.

[14] M. Monoyios, "Finite horizon portfolio selection with transaction costs", Journal of Economic Dynamics and Control, vol. 28, pp. 889-913, 2004.

Appendix A

Simulation and Graphing Code

# All code was run on Python 3.5.1
# Dependencies: numpy, matplotlib, seaborn.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# GBM WEALTH PROCESS
def generateWealthPricePaths(mu, sigma, pi, X0, nSteps, nPaths, T):
    # Simulates wealth paths for a constant proportion pi held in the stock.
    delta = T/float(nSteps)
    logX0 = np.log(X0)
    x_values = np.zeros((nPaths, nSteps+1))
    x_values[:, 0] = logX0
    for j in range(0, nPaths):
        for i in range(1, nSteps+1):
            # exact log-wealth step; the diffusion coefficient is sigma*pi
            x_values[j,i] = (x_values[j,i-1]
                             + (mu*pi - 0.5*(sigma*pi)**2)*delta
                             + sigma*pi*np.sqrt(delta)*np.random.normal())
    return np.exp(x_values)

print(generateWealthPricePaths(0.05, 0.2, 1.25, 1, 10, 10, 1))

mu = 0.05
sigma = 0.2
pi = 1.25
X0 = 1
nSteps = 100
nPaths = 10
T = 1

fig, ax = plt.subplots()
ax.ticklabel_format(useOffset=False)
ax.set_title('')
ax.set_ylabel('Asset Price')
ax.set_xlabel('Time Steps')
for scenario in generateWealthPricePaths(mu, sigma, pi, X0, nSteps, nPaths, T):
    ax.plot(range(nSteps+1), scenario, alpha=0.2, color='green')
# highlight the last simulated path
ax.plot(range(nSteps+1), scenario, alpha=1., color='red')
plt.show()

def exp_utility(alpha, x):
    return -np.exp(-alpha*x)

#EXP UTILITY MONTE CARLO AND HISTOGRAM
# Calculates a Monte Carlo estimate of the value function with N scenarios.
# pi_star is the cash amount held in the stock (zero interest rate), so
# terminal wealth is X0 + pi*mu*t + pi*sigma*sqrt(t)*Z with Z standard normal.
def Value_function(N, alpha, X0, pi_star, mu, sigma, t):
    value_functions_list = []
    for i in range(N):
        X_t = (X0 + pi_star*mu*t
               + pi_star*sigma*np.sqrt(t)*np.random.normal())
        value_functions_list.append(exp_utility(alpha, X_t))
    return np.mean(value_functions_list)

alpha = 1.   # pre-defined
t = 1.       # pre-defined
X0 = 1       # pre-defined
mu = 0.05    # pre-defined
sigma = 0.2  # pre-defined
analytic_pi_star = mu/(X0*alpha*sigma**2)  # optimal risky wealth
beta = (mu**2)/(2*alpha*(sigma)**2)        # analytic calculation

def analyticValueFunction(alpha, beta, X0):
    return -np.exp((-alpha*X0) - alpha*beta*(t-0))

print(Value_function(10000, alpha, X0, analytic_pi_star, mu, sigma, t))
print(analyticValueFunction(alpha, beta, X0))

# Generates 2M+1 points: pi_star plus M values below and M values above,
# equally spaced with distance epsilon, each valued with N Monte Carlo scenarios
def errors(M, N, epsilon):
    x_axis = [analytic_pi_star]
    for i in range(1, M+1):
        x_axis.append(analytic_pi_star - i*epsilon)
        x_axis.append(analytic_pi_star + i*epsilon)
    x_axis = sorted(x_axis)
    y_axis = []
    for pi_val in x_axis:
        y_axis.append(Value_function(N, alpha, X0, pi_val, mu, sigma, t))

    fig = plt.figure()
    plt.title('')
    plt.ylabel('Value (exponential utility)')
    plt.xlabel('pi')
    # reference lines at the analytic optimum and analytic value
    plt.axvline(analytic_pi_star, color='red', alpha=0.3)
    plt.axhline(analyticValueFunction(alpha, beta, X0), color='red', alpha=0.3)
    plt.xticks(np.arange(-10, 10, 0.5))
    plt.plot(x_axis, y_axis)
    # save before show(), otherwise the saved figure may be empty
    fig.savefig('MonteCarlo_value_function_optimality.png', format='png',
                dpi=800)
    plt.show()

errors(250, 1000, 0.05)

#EXPONENTIAL UTILITY HISTOGRAM
# Note: pi_star is unused; terminal wealth is sampled under the optimal
# holding mu/(alpha*sigma**2).
def final_wealth(N, alpha, X0, pi_star, mu, sigma, t):
    wealth_list = []
    for i in range(N):
        X_t = np.random.normal(X0 + ((mu**2)/(alpha*sigma**2))*t,
                               np.sqrt(t)*(mu/(alpha*sigma)))
        wealth_list.append(X_t)
    return wealth_list

alpha_1 = final_wealth(1000, 5, 1, 1.25, 0.05, 0.2, 1)

plt.hist(alpha_1, bins=100, alpha=0.5, label='alpha = 5.0', color='blue')
plt.xlabel('Final wealth')
plt.ylabel('Frequency')
plt.legend(loc='upper right', prop={'size': 14})
plt.show()

#OPTIMAL SHARES, S_t AND pi FOR EXPONENTIAL UTILITY
pi_t = 1.25

def optimals(mu, sigma, alpha, S0, X0, nSteps, T):
    # The step size and loop header are missing from the source listing and
    # are reconstructed here from the other functions in this appendix.
    delta = T/float(nSteps)
    logS0 = np.log(S0)
    s_values = np.zeros(nSteps+1)
    s_values[0] = logS0
    X = np.zeros(nSteps+1)
    X[0] = X0
    optimal_shares = np.zeros(nSteps+1)
    optimal_shares[0] = (pi_t*X0)/S0
    for i in range(1, nSteps+1):
        w = np.random.normal()  # one shock drives both stock and wealth
        X[i] = (X[i-1] + (mu**2)/(alpha*sigma**2)*delta
                + (mu/(sigma*alpha))*np.sqrt(delta)*w)
        s_values[i] = (s_values[i-1] + (mu - 0.5*sigma**2)*delta
                       + sigma*np.sqrt(delta)*w)
        optimal_shares[i] = (pi_t*X[i])/np.exp(s_values[i])
    return np.exp(s_values), X, optimal_shares

nSteps = 250
listo = optimals(0.05, 0.2, 1, 1, 1, nSteps, 1)

fig, ax = plt.subplots()
ax.plot(range(nSteps+1), listo[0], alpha=1, color='red', label='Stock Price')
ax.plot(range(nSteps+1), listo[2], alpha=1, color='green',
        label='Optimal number of shares')
ax.plot(range(nSteps+1), [1.25]*(nSteps+1), alpha=1, color='blue',
        label='Optimal pi')
plt.xlabel('Time Steps (days)')
for item in ([ax.title, ax.xaxis.label, ax.yaxis.label] +
             ax.get_xticklabels() + ax.get_yticklabels()):
    item.set_fontsize(14)
plt.legend(loc='upper right', prop={'size': 14})
plt.show()

#MONTE CARLO VERIFICATIONS WITH POWER UTIL AND NON-ZERO INTEREST RATES
def power_utility(gamma, x):
    if gamma >= 1 or gamma <= 0:
        raise ValueError('gamma must lie in (0, 1)')
    return 1/(1-float(gamma))*x**(1-float(gamma))

# Calculates a Monte Carlo estimate of the value function with N scenarios
def Value_function(N, gamma, X0, pi_star, mu, r, sigma, t):
    value_functions_list = []
    for i in range(N):
        X_t = X0*np.exp((pi_star*mu + r - 0.5*(sigma*pi_star)**2)*t
                        + pi_star*sigma*np.sqrt(t)*np.random.normal())
        value_functions_list.append(power_utility(gamma, X_t))
    return np.mean(value_functions_list)

gamma = 0.5  # pre-defined
t = 2.       # pre-defined
X0 = 10.     # pre-defined
mu = 0.02    # pre-defined
sigma = 0.2  # pre-defined
r = 1        # pre-defined
analytic_pi_star = mu/(gamma*sigma**2)   # optimal risky wealth
beta = 0.5*(mu**2)/(gamma*sigma**2) + r  # analytic calculation

def analyticValueFunction(gamma, beta, X0):
    return (1/(1-float(gamma))*X0**(1-float(gamma))
            * np.exp((t-0)*(1-gamma)*beta))

def printer(muu, rr):
    analytic_pi_star = muu/(gamma*sigma**2)    # optimal risky wealth
    beta = 0.5*(muu**2)/(gamma*sigma**2) + rr  # analytic calculation
    print('mu is ' + str(muu))
    print('r is ' + str(rr))
    print(Value_function(10000, gamma, X0, analytic_pi_star, muu, rr, sigma, t))
    print(analyticValueFunction(gamma, beta, X0))
    print('\n\n')

#CODE FOR VARYING REBALANCING FREQUENCIES
def generateWealthPricePaths(mu, sigma, r, pi, X0, S0, nSteps, nPaths, T, freq):
    # The step size, the x_values array and the loop headers are missing from
    # the source listing and are reconstructed here.
    delta = T/float(nSteps)

    # stock (log-price)
    logS0 = np.log(S0)
    s_values = np.zeros((nPaths, nSteps+1))
    s_values[:, 0] = logS0

    # theoretical, continuously rebalanced log-wealth
    x_values = np.zeros((nPaths, nSteps+1))
    x_values[:, 0] = np.log(X0)
    theo_money_in_stock = np.zeros((nPaths, nSteps+1))
    theo_money_in_bank = np.zeros((nPaths, nSteps+1))
    theo_money_in_stock[:, 0] = pi*X0
    theo_money_in_bank[:, 0] = (1-pi)*X0

    # actual holdings, rebalanced every `freq` steps
    money_in_bank = np.zeros((nPaths, nSteps+1))
    money_in_stock = np.zeros((nPaths, nSteps+1))
    money_in_bank[:, 0] = (1-pi)*X0
    money_in_stock[:, 0] = pi*X0

    for j in range(nPaths):
        units_of_stock = (pi*X0)/S0  # reset for every path
        for i in range(1, nSteps+1):
            w = np.random.normal()
            s_values[j,i] = (s_values[j,i-1] + (mu - 0.5*sigma**2)*delta
                             + sigma*np.sqrt(delta)*w)
            x_values[j,i] = (x_values[j,i-1]
                             + (mu*pi - 0.5*(sigma*pi)**2)*delta
                             + sigma*pi*np.sqrt(delta)*w)
            money_in_bank[j,i] = money_in_bank[j,i-1]*np.exp(r*delta)
            money_in_stock[j,i] = units_of_stock*np.exp(s_values[j,i])
            theo_money_in_stock[j,i] = pi*np.exp(x_values[j,i])
            theo_money_in_bank[j,i] = (1-pi)*np.exp(x_values[j,i])
            if i % freq == 0:
                # cash to move from stock to bank to restore proportion pi
                CORRECTOR = ((1-pi)*money_in_stock[j,i]
                             - pi*money_in_bank[j,i])
                money_in_stock[j,i] = money_in_stock[j,i] - CORRECTOR
                money_in_bank[j,i] = money_in_bank[j,i] + CORRECTOR
                units_of_stock = units_of_stock - CORRECTOR/np.exp(s_values[j,i])

    x_values = np.exp(x_values)
    total_wealth = np.add(money_in_bank, money_in_stock)
    return [total_wealth, money_in_stock, money_in_bank, x_values,
            theo_money_in_stock, theo_money_in_bank]



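mu = 0.05
sigma = 0.2
r = 0.0
B0 = 1.
S0 = 100.
nSteps = 250
nPaths = 1
T = 1
pi = 1.25
X0 = 1.
# The value of freq is missing from the source listing; rebalancing at every
# step (freq = 1) is assumed here.
freq = 1

fig, ax = plt.subplots()
plt.ylabel('Wealth')
plt.xlabel('Time Steps (Days)')
listo = generateWealthPricePaths(mu, sigma, r, pi, X0, S0, nSteps, nPaths, T, freq)

# Several plotting lines are truncated in the source. The pairing of series,
# colours and labels below is inferred from the surviving labels and the order
# of the returned list; the line styles are an assumption.
series = [(5, '--', 'red', 'optimal bank'),
          (1, '-', 'green', 'Actual Stock Wealth'),
          (2, '-', 'red', 'Actual Bank Wealth'),
          (4, '--', 'red', 'optimal Stock wealth'),
          (0, '-', 'black', 'actual total wealth'),
          (3, '--', 'blue', 'optimal total Wealth')]
for index, style, colour, label in series:
    for scenario in listo[index]:
        ax.plot(range(nSteps+1), scenario, alpha=1, linestyle=style,
                color=colour, label=label)
plt.show()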

#INFINITE HORIZON LOG UTILITY WITH CONSUMPTION MODEL
def utility_function(c):
    return np.log(c)

def Wealth_process(mu, sigma, r, X0, S0, delta, nSteps, nPaths, T):
    # The x_values array and the loop headers are missing from the source
    # listing and are reconstructed here.
    dt = T/float(nSteps)
    pi = mu/(sigma**2)
    logX0 = np.log(X0)
    logS0 = np.log(S0)
    x_values = np.zeros((nPaths, nSteps+1))
    x_values[:, 0] = logX0
    b_values = np.zeros((nPaths, nSteps+1))
    b_values[:, 0] = 1
    s_values = np.zeros((nPaths, nSteps+1))
    s_values[:, 0] = logS0
    money_in_stock = np.zeros((nPaths, nSteps+1))
    money_in_bank = np.zeros((nPaths, nSteps+1))
    money_in_stock[:, 0] = pi*X0
    money_in_bank[:, 0] = (1-pi)*X0
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            w = np.random.normal()
            # log-wealth under optimal investment and consumption rate delta*X
            x_values[j,i] = (x_values[j,i-1]
                             + (-delta + 0.5*(mu/sigma)**2)*dt
                             + (mu/sigma)*np.sqrt(dt)*w)
            s_values[j,i] = (s_values[j,i-1] + (mu - 0.5*sigma**2)*dt
                             + sigma*np.sqrt(dt)*w)
            b_values[j,i] = b_values[j,i-1]*np.exp(r*dt)
            money_in_stock[j,i] = pi*np.exp(x_values[j,i])
            money_in_bank[j,i] = (1-pi)*np.exp(x_values[j,i])
    S_values = np.exp(s_values)
    wealth = np.exp(x_values)
    optimal_consumption_process = np.multiply(delta, wealth)
    return [money_in_bank, money_in_stock, wealth, S_values,
            optimal_consumption_process]


#INFINITE HORIZON LOG UTILITY MONTE CARLO VERIFICATION
X0 = 1
T = 100
mu = 0.1
sigma = 0.4
delta = 0.5
nSteps = 250*100
nPaths = 1000
pi = mu/(sigma**2)

def analytic_value_function(mu, sigma, delta, X0):
    C_1 = ((2*delta*np.log(delta)*sigma**2 + mu**2 - 2*delta*sigma**2)
           / (2*(delta*sigma)**2))
    return (1/delta)*np.log(X0) + C_1

def monte_carlo_value_function(mu, sigma, X0, delta, nSteps, nPaths, T):
    # The step size, the x_values array and the loop headers are missing from
    # the source listing and are reconstructed here.
    dt = T/float(nSteps)
    x_values = np.zeros((nPaths, nSteps+1))
    x_values[:, 0] = np.log(X0)
    c = np.zeros((nPaths, nSteps+1))
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            w = np.random.normal()
            x_values[j,i] = (x_values[j,i-1]
                             + (-delta + 0.5*(mu/sigma)**2)*dt
                             + (mu/sigma)*np.sqrt(dt)*w)
            c[j,i] = delta*np.exp(x_values[j,i])
    # discounted utility of consumption, accumulated along each path
    integral = np.zeros((nPaths, nSteps+1))
    for j in range(nPaths):
        for i in range(1, nSteps+1):
            integral[j,i] = (integral[j,i-1]
                             + np.exp(-delta*(i-1)*dt)*np.log(c[j,i])*dt)
    return np.mean(integral[:, -1])

def graph_analytic_value_vs_delta(mu, sigma, X0):
    analytic_list = []
    monte_carlo_list = []
    x_range = np.arange(0.2, 0.8, 0.05)
    for delta in x_range:
        analytic_list.append(analytic_value_function(mu, sigma, delta, X0))
        monte_carlo_list.append(
            monte_carlo_value_function(mu, sigma, X0, delta, nSteps, nPaths, T))
    plt.plot(x_range, analytic_list)
    plt.plot(x_range, monte_carlo_list)
    plt.show()

#STOCHASTIC VOLATILITY MODEL AND MONTE CARLO VERIFICATION
def analytic_utility(gamma, t, x, y):
    # Uses the module-level parameters mu, r, rho, kappa, theta, T and
    # sigma (the vol-of-vol, written xi in the text).
    a = ((gamma-1)/gamma)*0.5*(mu**2)
    b = ((gamma-1)/gamma)*mu*rho*sigma + kappa
    c = (((gamma-1)/gamma)*(rho**2) - 1)*0.5*(sigma**2)
    D = (b**2) - (4*a*c)
    sqrtD = np.sqrt(D)
    tau = T - t
    denom = np.exp(sqrtD*tau)*(b + sqrtD) - b + sqrtD
    B = -(2*a)*(np.exp(sqrtD*tau) - 1)/denom
    # A(t) = (1-gamma)*r*(T-t) + kappa*theta*int_t^T B(s) ds in closed form;
    # the logarithm uses the same denominator as B, so that A(T) = 0.
    A = ((1-gamma)*r*tau
         - ((2*kappa*theta*a)/((b**2) - D))
         * ((b + sqrtD)*tau - 2*np.log(denom/(2*sqrtD))))
    return np.exp(A + B*y)*(1/(1-gamma))*x**(1-gamma)

def simulateHestonWealthPaths(gamma, S0, Y0, X0, mu, r, kappa, theta,
                              sigma, rho, T, nPaths, nSteps):
    # The step size and loop headers are missing from the source listing and
    # are reconstructed here. A(t) is not needed for the simulation and is
    # therefore not computed.
    dt = T/float(nSteps)
    # correlation matrix of the two Brownian increments
    cov = [[1, rho], [rho, 1]]
    a = ((gamma-1)/gamma)*0.5*(mu**2)
    b = ((gamma-1)/gamma)*mu*rho*sigma + kappa
    c = (((gamma-1)/gamma)*(rho**2) - 1)*0.5*(sigma**2)
    D = b**2 - 4*a*c
    if D <= 0:
        raise ValueError('invalid a,b,c: D must be positive')
    sqrtD = np.sqrt(D)

    S_values = np.zeros((nPaths, nSteps+1))
    X_values = np.zeros((nPaths, nSteps+1))
    Y_values = np.zeros((nPaths, nSteps+1))
    S_values[:, 0] = S0
    X_values[:, 0] = X0
    Y_values[:, 0] = Y0

    # B(t) and pi*(t) are deterministic: compute them once on the time grid
    tau = T - dt*np.arange(nSteps+1)
    B_grid = (-2*a*(np.exp(sqrtD*tau) - 1)
              / (np.exp(sqrtD*tau)*(b + sqrtD) - b + sqrtD))
    pi = np.tile(mu/gamma + ((rho*sigma)/gamma)*B_grid, (nPaths, 1))

    for j in range(nPaths):
        for i in range(1, nSteps+1):
            epsilon = np.random.multivariate_normal([0, 0], cov)
            dW_S = epsilon[0]*np.sqrt(dt)
            dW_Y = epsilon[1]*np.sqrt(dt)
            S_values[j,i] = (S_values[j,i-1]
                             + (Y_values[j,i-1]*mu + r)*S_values[j,i-1]*dt
                             + np.sqrt(Y_values[j,i-1])*S_values[j,i-1]*dW_S)
            X_values[j,i] = (X_values[j,i-1]
                             + (pi[j,i]*Y_values[j,i-1]*mu + r)*X_values[j,i-1]*dt
                             + pi[j,i]*np.sqrt(Y_values[j,i-1])*X_values[j,i-1]*dW_S)
            Y_values[j,i] = (Y_values[j,i-1]
                             + kappa*(theta - Y_values[j,i-1])*dt
                             + sigma*np.sqrt(Y_values[j,i-1])*dW_Y)
            Y_values[j,i] = abs(Y_values[j,i])  # force non-negative variance
    return [pi, X_values]

nPaths = 100
nSteps = 250
mu = 0.05
r = 0
T = 1
X0 = 1
kappa = 5
theta = 0.024
Y0 = 0.1
sigma = 0.38
rho = 0.3
S0 = 1
gamma_list = np.arange(0.2, 0.8, 0.01)
MC_value = []
theo_value = []
for gamma in gamma_list:
    a_price_path = simulateHestonWealthPaths(gamma, S0, Y0, X0, mu, r,
                                             kappa, theta, sigma, rho, T,
                                             nPaths, nSteps)
    utils_list = []
    for wealth_path in a_price_path[1]:
        final_util = (1/(1-gamma))*wealth_path[-1]**(1-gamma)
        utils_list.append(final_util)
    MC_value.append(np.mean(utils_list))
    theo_value.append(analytic_utility(gamma, 0, X0, Y0))

fig, ax = plt.subplots()
plt.ylabel('Value')
plt.xlabel('Gamma')
ax.plot(gamma_list, theo_value, color='red',
        label='Theoretical Value Function')
ax.plot(gamma_list, MC_value, color='green',
        label='Simulated Value Function')
plt.legend(loc='upper right')
plt.show()
