Financial Engineering


RECENT ADVANCES IN FINANCIAL ENGINEERING 2009

Proceedings of the KIER-TMU International Workshop on Financial Engineering 2009


New Jersey • London • Singapore • Beijing • Shanghai • Hong Kong • Taipei • Chennai

World Scientific


Otemachi Sankei Plaza, Tokyo, 3–4 August 2009

editors

Masaaki Kijima
Tokyo Metropolitan University, Japan

Chiaki Hara
Kyoto University, Japan

Keiichi Tanaka
Tokyo Metropolitan University, Japan

Yukio Muromachi
Tokyo Metropolitan University, Japan


British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-4299-89-3
ISBN-10 981-4299-89-8

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

Copyright © 2010 by World Scientific Publishing Co. Pte. Ltd.

Published by

World Scientific Publishing Co. Pte. Ltd.

5 Toh Tuck Link, Singapore 596224

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Printed in Singapore.


May 3, 2010 13:23 Proceedings Trim Size: 9in x 6in preface

PREFACE

This book is the Proceedings of the KIER-TMU International Workshop on Financial Engineering 2009, held in Summer 2009. The workshop is the successor of the "Daiwa International Workshop on Financial Engineering", which had been held in Tokyo every year since 2004 in order to exchange new ideas in financial engineering among workshop participants. Every year, various interesting and high quality studies were presented by many researchers from various countries, from both academia and industry. As such, this workshop served as a bridge between academic researchers in the field of financial engineering and practitioners.

We would like to mention that the workshop is jointly organized by the Institute of Economic Research, Kyoto University (KIER) and the Graduate School of Social Sciences, Tokyo Metropolitan University (TMU). Financial support from the Public Management Program, the Program for Enhancing Systematic Education in Graduate Schools, the Japan Society for Promotion of Science's Program for Grants-in-Aid for Scientific Research (A) #21241040, the Selective Research Fund of Tokyo Metropolitan University, and Credit Pricing Corporation is greatly appreciated.

We invited leading scholars including four keynote speakers, and various kinds of fruitful and active discussions were held during the KIER-TMU workshop. This book consists of eleven papers related to the topics presented at the workshop. These papers address state-of-the-art techniques and concepts in financial engineering, and have been selected through appropriate referees' evaluation followed by the editors' final decision in order to make this book a high quality one. The reader will be convinced of the contributions made by this research.

We would like to express our deep gratitude to those who submitted their papers to these proceedings and those who helped us kindly by refereeing these papers. We would also like to thank Mr. Satoshi Kanai for editing the manuscripts, and Ms. Kakarlapudi Shalini Raju and Ms. Grace Lu Huiru of World Scientific Publishing Co. for their kind assistance in publishing this book.

February, 2010

Masaaki Kijima, Tokyo Metropolitan University
Chiaki Hara, Institute of Economic Research, Kyoto University
Keiichi Tanaka, Tokyo Metropolitan University
Yukio Muromachi, Tokyo Metropolitan University


KIER-TMU International Workshop on Financial Engineering 2009

Date
August 3–4, 2009

Place
Otemachi Sankei Plaza, Tokyo, Japan

Organizer
Institute of Economic Research, Kyoto University
Graduate School of Social Sciences, Tokyo Metropolitan University

Supported by
Public Management Program
Program for Enhancing Systematic Education in Graduate Schools
Japan Society for Promotion of Science's Program for Grants-in-Aid for Scientific Research (A) #21241040
Selective Research Fund of Tokyo Metropolitan University
Credit Pricing Corporation

Program Committee
Masaaki Kijima, Tokyo Metropolitan University, Chair
Akihisa Shibata, Kyoto University, Co-Chair
Chiaki Hara, Kyoto University
Tadashi Yagi, Doshisha University
Hidetaka Nakaoka, Tokyo Metropolitan University
Keiichi Tanaka, Tokyo Metropolitan University
Takashi Shibata, Tokyo Metropolitan University
Yukio Muromachi, Tokyo Metropolitan University


Program

August 3 (Monday)

Chair: Masaaki Kijima

10:00–10:10 Yasuyuki Kato, Nomura Securities/Kyoto University
Opening Address

Chair: Chiaki Hara

10:10–10:55 Chris Rogers, University of Cambridge
Optimal and Robust Contracts for a Risk-Constrained Principal

10:55–11:25 Yumiharu Nakano, Tokyo Institute of Technology
Quantile Hedging for Defaultable Claims

11:25–12:45 Lunch

Chair: Yukio Muromachi

12:45–13:30 Michael Gordy, Federal Reserve Board
Constant Proportion Debt Obligations: A Post-Mortem Analysis of Rating Models (with Soren Willemann)

13:30–14:00 Kyoko Yagi, University of Tokyo
An Optimal Investment Policy in Equity-Debt Financed Firms with Finite Maturities (with Ryuta Takashima and Katsushige Sawaki)

14:00–14:20 Afternoon Coffee I

Chair: Stéphane Crépey

14:20–14:50 Hidetoshi Nakagawa, Hitotsubashi University
Surrender Risk and Default Risk of Insurance Companies (with Olivier Le Courtois)

14:50–15:20 Kyo Yamamoto, University of Tokyo
Generating a Target Payoff Distribution with the Cheapest Dynamic Portfolio: An Application to Hedge Fund Replication (with Akihiko Takahashi)

15:20–15:50 Yasuo Taniguchi, Sumitomo Mitsui Banking Corporation/Tokyo Metropolitan University
Looping Default Model with Multiple Obligors

15:50–16:10 Afternoon Coffee II


Chair: Hidetaka Nakaoka

16:10–16:40 Stéphane Crépey, Evry University
Counterparty Credit Risk (with Samson Assefa, Tomasz R. Bielecki, Monique Jeanblanc and Behnaz Zargari)

16:40–17:10 Kohta Takehara, University of Tokyo
Computation in an Asymptotic Expansion Method (with Akihiko Takahashi and Masashi Toda)


August 4 (Tuesday)

Chair: Takashi Shibata

10:00–10:45 Chiaki Hara, Kyoto University
Heterogeneous Beliefs and Representative Consumer

10:45–11:15 Xue-Zhong He, University of Technology, Sydney
Boundedly Rational Equilibrium and Risk Premium (with Lei Shi)

11:15–11:45 Yuan Tian, Kyoto University/Tokyo Metropolitan University
Financial Synergy in M&A (with Michi Nishihara and Takashi Shibata)

11:45–13:15 Lunch

Chair: Andrea Macrina

13:15–14:00 Mark Davis, Imperial College London
Jump-Diffusion Risk-Sensitive Asset Management (with Sebastien Lleo)

14:00–14:30 Masahiko Egami, Kyoto University
A Game Options Approach to the Investment Problem with Convertible Debt Financing

14:30–15:00 Katsunori Ano
Optimal Stopping Problem with Uncertain Stopping and its Application to Discrete Options

15:00–15:30 Afternoon Coffee

Chair: Xue-Zhong He

15:30–16:00 Andrea Macrina, King's College London/Kyoto University
Information-Sensitive Pricing Kernels (with Lane Hughston)

16:00–16:30 Hiroki Masuda, Kyushu University
Explicit Estimators of a Skewed Stable Model Based on High-Frequency Data

16:30–17:00 Takayuki Morimoto, Kwansei Gakuin University
A Note on a Statistical Hypothesis Testing for Removing Noise by The Random Matrix Theory, and its Application to Co-Volatility Matrices (with Kanta Tachibana)

Chair: Keiichi Tanaka

17:00–17:10 Kohtaro Kuwada, Tokyo Metropolitan University
Closing Address


CONTENTS

Preface

Program

Risk Sensitive Investment Management with Affine Processes: A Viscosity Approach
    M. Davis and S. Lleo

Small-Sample Estimation of Models of Portfolio Credit Risk
    M. B. Gordy and E. Heitfield

Heterogeneous Beliefs with Mortal Agents
    A. A. Brown and L. C. G. Rogers

Counterparty Risk on a CDS in a Markov Chain Copula Model with Joint Defaults
    S. Crépey, M. Jeanblanc and B. Zargari

Portfolio Efficiency Under Heterogeneous Beliefs
    X.-Z. He and L. Shi

Security Pricing with Information-Sensitive Discounting
    A. Macrina and P. A. Parbhoo

On Statistical Aspects in Calibrating a Geometric Skewed Stable Asset Price Model
    H. Masuda

A Note on a Statistical Hypothesis Testing for Removing Noise by the Random Matrix Theory and Its Application to Co-Volatility Matrices
    T. Morimoto and K. Tachibana

Quantile Hedging for Defaultable Claims
    Y. Nakano

New Unified Computational Algorithm in a High-Order Asymptotic Expansion Scheme
    K. Takehara, A. Takahashi and M. Toda

Can Financial Synergy Motivate M&A?
    Y. Tian, M. Nishihara and T. Shibata


Risk Sensitive Investment Management with Affine Processes: A Viscosity Approach*

Mark Davis and Sebastien Lleo

Department of Mathematics, Imperial College London, London SW7 2AZ, England
E-mail: [email protected] and [email protected]

In this paper, we extend the jump-diffusion model proposed by Davis and Lleo to include jumps in asset prices as well as valuation factors. The criterion, following earlier work by Bielecki, Pliska, Nagai and others, is risk-sensitive optimization (equivalent to maximizing the expected growth rate subject to a constraint on variance). In this setting, the Hamilton-Jacobi-Bellman equation is a partial integro-differential equation (PIDE). The main result of the paper is to show that the value function of the control problem is the unique viscosity solution of the Hamilton-Jacobi-Bellman equation.

Keywords: Asset management, risk-sensitive stochastic control, jump diffusion processes, Poisson point processes, Lévy processes, HJB PDE, policy improvement.

1. Introduction

In this paper, we extend the jump diffusion risk-sensitive asset management model proposed by Davis and Lleo [19] to allow jumps in both asset prices and factor levels.

Risk-sensitive control generalizes classical stochastic control by parametrizing explicitly the degree of risk aversion or risk tolerance of the optimizing agent. In risk-sensitive control, the decision maker's objective is to select a control policy $h(t)$ to maximize the criterion

\[ J(t, x, h; \theta) := -\frac{1}{\theta} \ln E\left[ e^{-\theta F(t,x,h)} \right] \tag{1} \]

*The authors are very grateful to the editors and anonymous referees for a number of very helpful comments.


where $t$ is the time, $x$ is the state variable, $F$ is a given reward function, and the risk sensitivity $\theta \in (-1, 0) \cup (0, \infty)$ is an exogenous parameter representing the decision maker's degree of risk aversion. A Taylor expansion of this criterion around $\theta = 0$ yields

\[ J(t, x, h; \theta) = E\left[F(t, x, h)\right] - \frac{\theta}{2} \mathrm{Var}\left[F(t, x, h)\right] + O(\theta^2) \tag{2} \]
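The expansion (2) can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-point distribution for the reward $F$ (the outcomes and probabilities are illustrative and do not come from the paper); it compares the exact criterion (1) with the mean-variance approximation for decreasing $\theta$.

```python
import math

def risk_sensitive_value(outcomes, probs, theta):
    """Exact criterion (1): J = -(1/theta) * ln E[exp(-theta * F)] for a discrete F."""
    expectation = sum(p * math.exp(-theta * f) for f, p in zip(outcomes, probs))
    return -math.log(expectation) / theta

def mean_variance_approx(outcomes, probs, theta):
    """Approximation from (2): E[F] - (theta / 2) * Var[F]."""
    mean = sum(p * f for f, p in zip(outcomes, probs))
    var = sum(p * (f - mean) ** 2 for f, p in zip(outcomes, probs))
    return mean - 0.5 * theta * var

# Hypothetical reward distribution (illustrative values only).
outcomes = [-0.10, 0.02, 0.15]
probs = [0.2, 0.5, 0.3]

for theta in (0.5, 0.1, 0.01):
    exact = risk_sensitive_value(outcomes, probs, theta)
    approx = mean_variance_approx(outcomes, probs, theta)
    print(f"theta={theta}: exact={exact:.8f} approx={approx:.8f} gap={abs(exact - approx):.2e}")
```

The gap between the two quantities should shrink roughly like $\theta^2$, which is the content of the remainder term in (2).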

which shows that the risk-sensitive criterion amounts to maximizing $E[F(t, x, h)]$ subject to a penalty for variance. Jacobson [28], Whittle [35], and Bensoussan and Van Schuppen [9] led the theoretical development of risk-sensitive control, while Lefebvre and Montulet [32], Fleming [25] and Bielecki and Pliska [11] pioneered the financial application of risk-sensitive control. In particular, Bielecki and Pliska proposed the logarithm of the investor's wealth as a reward function, so that the investor's objective is to maximize the risk-sensitive (log) return of his/her portfolio or, alternatively, to maximize a function of the power utility (HARA) of terminal wealth. Bielecki and Pliska made an enormous contribution to the field by studying the economic properties of the risk-sensitive asset management criterion (see [13]), extending the asset management model into an intertemporal CAPM ([14]), working on transaction costs ([12]) and numerical methods ([10]), and considering factors driven by a CIR model ([15]). Other main contributors include Kuroda and Nagai [31], who introduced an elegant solution method based on a change of measure argument. Davis and Lleo applied this change of measure technique to solve a benchmarked investment problem in which an investor selects an asset allocation to outperform a given financial benchmark (see [18]) and analyzed the link between optimal portfolios and fractional Kelly strategies (see [20]). More recently, Davis and Lleo [19] extended the risk-sensitive asset management model by allowing jumps in asset prices.

In this chapter, our contribution is to allow jumps not only in asset prices but also in the level of the underlying valuation factors. Once we introduce jumps in the factors, the Bellman equation becomes a nonlinear Partial Integro-Differential Equation and an analytical or classical $C^{1,2}$ solution may not exist. As a result, to give a sense to the relation between the value function and the risk-sensitive Hamilton-Jacobi-Bellman Partial Integro-Differential Equation (RS HJB PIDE), we consider a class of weak solutions called viscosity solutions, which have gained widespread acceptance in control theory in recent years. The main results are a comparison theorem and the proof that the value function of the control problem under consideration is the unique continuous viscosity solution of the associated RS HJB PIDE. In particular, the proof of the comparison result uses non-standard arguments to circumvent difficulties linked to the highly nonlinear nature of the RS HJB PIDE and to the unboundedness of the instantaneous reward function $g$.


This chapter is organized as follows. Section 2 introduces the general setting of the model and defines the class of random Poisson measures which will be used to model the jump component of the asset and factor dynamics. In Section 3 we formulate the control problem and apply a change of measure to obtain a simpler auxiliary criterion. Section 4 outlines the properties of the value function. In Section 5 we show that the value function is a viscosity solution of the RS HJB PIDE before proving a comparison result in Section 6 which provides uniqueness.

2. Analytical Setting

Our analytical setting is based on that of [19]. The notable difference is that we allow the factor processes to experience jumps.

2.1 Overview

The growth rates of the assets are assumed to depend on $n$ valuation factors $X_1(t), \dots, X_n(t)$ which follow the dynamics given in equation (4) below. The assets market comprises $m$ risky securities $S_i$, $i = 1, \dots, m$. Let $M := n + m$. Let $(\Omega, \{\mathcal{F}_t\}, \mathcal{F}, P)$ be the underlying probability space. On this space is defined an $\mathbb{R}^M$-valued $(\mathcal{F}_t)$-Brownian motion $W(t)$ with components $W_k(t)$, $k = 1, \dots, M$. Moreover, let $(Z, \mathcal{B}_Z)$ be a Borel space¹. Let $p$ be an $(\mathcal{F}_t)$-adapted $\sigma$-finite Poisson point process on $Z$ whose underlying point functions are maps from a countable set $D_p \subset (0, \infty)$ into $Z$. Define

\[ \mathfrak{Z}_p := \left\{ U \in \mathcal{B}(Z),\ E\left[N_p(t, U)\right] < \infty \ \forall t \right\} \tag{3} \]

Consider $N_p(dt, dz)$, the Poisson random measure on $(0, \infty) \times Z$ induced by $p$. Following Davis and Lleo [19], we concentrate on stationary Poisson point processes of class (QL) with associated Poisson random measure $N_p(dt, dz)$. The class (QL) is defined in [27] (Definition II.3.1, p. 59) as

Definition 2.1. An $(\mathcal{F}_t)$-adapted point process $p$ on $(\Omega, \mathcal{F}, P)$ is said to be of class (QL) with respect to $(\mathcal{F}_t)$ if it is $\sigma$-finite and there exists $\hat{N}_p = \left(\hat{N}_p(t, U)\right)$ such that

(i) for $U \in \mathfrak{Z}_p$, $t \mapsto \hat{N}_p(t, U)$ is a continuous $(\mathcal{F}_t)$-adapted increasing process;

(ii) for each $t$ and a.a. $\omega \in \Omega$, $U \mapsto \hat{N}_p(t, U)$ is a $\sigma$-finite measure on $(Z, \mathcal{B}(Z))$;

(iii) for $U \in \mathfrak{Z}_p$, $t \mapsto \tilde{N}_p(t, U) = N_p(t, U) - \hat{N}_p(t, U)$ is an $(\mathcal{F}_t)$-martingale.

The random measure $\left\{\hat{N}_p(t, U)\right\}$ is called the compensator of the point process $p$.

¹ $Z$ is a standard measurable (metric or topological) space and $\mathcal{B}_Z$ is the Borel $\sigma$-field endowed to $Z$.


Since the Poisson point processes we consider are stationary, their compensators are of the form $\hat{N}_p(t, U) = \nu(U)t$, where $\nu$ is the $\sigma$-finite characteristic measure of the Poisson point process $p$. For notational convenience, we define the Poisson random measure $\bar{N}_p(dt, dz)$ as

\[ \bar{N}_p(dt, dz) = \begin{cases} N_p(dt, dz) - \hat{N}_p(dt, dz) = N_p(dt, dz) - \nu(dz)\,dt =: \tilde{N}_p(dt, dz) & \text{if } z \in Z_0 \\ N_p(dt, dz) & \text{if } z \in Z \setminus Z_0 \end{cases} \]

where $Z_0 \subset \mathcal{B}_Z$ is such that $\nu(Z \setminus Z_0) < \infty$.

2.2 Factor Dynamics

We model the dynamics of the $n$ factors with an affine jump diffusion process

\[ dX(t) = \left(b + BX(t^-)\right)dt + \Lambda\,dW(t) + \int_Z \xi(z)\,\bar{N}_p(dt, dz), \qquad X(0) = x \tag{4} \]

where $X(t)$ is the $\mathbb{R}^n$-valued factor process with components $X_j(t)$ and $b \in \mathbb{R}^n$, $B \in \mathbb{R}^{n \times n}$, $\Lambda := \left[\Lambda_{ij}\right]$, $i = 1, \dots, n$, $j = 1, \dots, M$, and $\xi(z) \in \mathbb{R}^n$ with $-\infty < \xi_i^{\min} \le \xi_i(z) \le \xi_i^{\max} < \infty$ for $i = 1, \dots, n$. Moreover, the vector-valued function $\xi(z)$ satisfies:

\[ \int_{Z_0} |\xi(z)|^2\,\nu(dz) < \infty \]

(See for example Definition II.4.1 in Ikeda and Watanabe [27] where $F_P$ and $F_P^{2,loc}$ are given in equations II(3.2) and II(3.5) respectively.)
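To make the factor dynamics (4) concrete, here is a minimal Euler-type simulation sketch for a single factor ($n = 1$). It rests on simplifying assumptions that are not part of the paper: the jump measure is finite (so that $Z_0 = \emptyset$ and no compensation is needed), jumps arrive at a constant hypothetical rate `jump_rate`, jump sizes are drawn uniformly from $[\xi^{\min}, \xi^{\max}]$, and at most one jump per time step is allowed.

```python
import math
import random

def simulate_factor(x0, b, B, lam, jump_rate, xi_min, xi_max, T, steps, rng):
    """Euler scheme for dX = (b + B X) dt + lam dW + (uncompensated jumps), n = 1."""
    dt = T / steps
    x = x0
    path = [x]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))           # Brownian increment
        jump = 0.0
        if rng.random() < jump_rate * dt:            # Bernoulli approximation of a Poisson arrival
            jump = rng.uniform(xi_min, xi_max)       # bounded jump size, as required of xi(z)
        x = x + (b + B * x) * dt + lam * dw + jump   # drift uses the pre-jump value X(t-)
        path.append(x)
    return path

# Illustrative parameters only.
path = simulate_factor(x0=1.0, b=0.1, B=-0.5, lam=0.2, jump_rate=2.0,
                       xi_min=-0.1, xi_max=0.1, T=1.0, steps=1000,
                       rng=random.Random(0))
print(f"X(0)={path[0]:.4f}, X(T)={path[-1]:.4f}")
```

Switching off the diffusion and the jumps (`lam=0`, `jump_rate=0`) reduces the scheme to the linear ODE $\dot{x} = b + Bx$, which gives a simple sanity check against the closed-form solution.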

2.3 Asset Market Dynamics

Let $S_0$ denote the wealth invested in the money market account with dynamics given by the equation:

\[ \frac{dS_0(t)}{S_0(t)} = \left(a_0 + A_0' X(t)\right)dt, \qquad S_0(0) = s_0 \tag{5} \]

where $a_0 \in \mathbb{R}$ is a scalar constant, $A_0 \in \mathbb{R}^n$ is an $n$-element column vector and where $M'$ denotes the transpose of the matrix $M$. Note that if we set $A_0 = 0$ and $a_0 = r$, then equation (5) can be interpreted as the dynamics of a globally risk-free asset. Let $S_i(t)$ denote the price at time $t$ of the $i$th security, with $i = 1, \dots, m$. The dynamics of risky security $i$ can be expressed as:

\[ \frac{dS_i(t)}{S_i(t^-)} = \left(a + AX(t)\right)_i dt + \sum_{k=1}^{M} \sigma_{ik}\,dW_k(t) + \int_Z \gamma_i(z)\,\bar{N}_p(dt, dz), \qquad S_i(0) = s_i, \quad i = 1, \dots, m \tag{6} \]


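where $a \in \mathbb{R}^m$, $A \in \mathbb{R}^{m \times n}$, $\Sigma := \left[\sigma_{ij}\right]$, $i = 1, \dots, m$, $j = 1, \dots, M$, and $\gamma(z) \in \mathbb{R}^m$ satisfies Assumption 2.1.

Assumption 2.1. $\gamma(z) \in \mathbb{R}^m$ satisfies

\[ -1 \le \gamma_i^{\min} \le \gamma_i(z) \le \gamma_i^{\max} < +\infty, \qquad i = 1, \dots, m \]

and

\[ -1 \le \gamma_i^{\min} < 0 < \gamma_i^{\max} < +\infty, \qquad i = 1, \dots, m. \]

Furthermore, define

\[ \mathbf{S} := \mathrm{supp}(\nu) \in \mathcal{B}_Z \]

and

\[ \tilde{\mathbf{S}} := \mathrm{supp}(\nu \circ \gamma^{-1}) \in \mathcal{B}\left(\mathbb{R}^m\right) \]

where $\mathrm{supp}(\cdot)$ denotes the measure's support; then we assume that $\prod_{i=1}^{m} [\gamma_i^{\min}, \gamma_i^{\max}]$ is the smallest closed hypercube containing $\tilde{\mathbf{S}}$.

In addition, the vector-valued function $\gamma(z)$ satisfies:

\[ \int_{Z_0} |\gamma(z)|^2\,\nu(dz) < \infty \]

As noted in [19], Assumption 2.1 requires that each asset has, with positive probability, both upward and downward jumps and as a result bounds the space of controls.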

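Define the set $\mathcal{J}$ as

\[ \mathcal{J} := \left\{ h \in \mathbb{R}^m : -1 - h'\psi < 0 \ \ \forall \psi \in \tilde{\mathbf{S}} \right\} \tag{7} \]

For a given $z$, the equation $h'\gamma(z) = -1$ describes a hyperplane in $\mathbb{R}^m$. Under Assumption 2.1, $\mathcal{J}$ is a convex subset of $\mathbb{R}^m$.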
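The membership condition in (7) is a family of linear inequalities, one per point of the jump support, which is why the resulting set is convex. The sketch below illustrates this under the assumption (not made in the paper) that the support consists of finitely many points, here the corners of a hypothetical hypercube $[\gamma_1^{\min}, \gamma_1^{\max}] \times [\gamma_2^{\min}, \gamma_2^{\max}]$ with $m = 2$.

```python
def in_J(h, support_points):
    """True if -1 - h'psi < 0 for every support point psi (finite-support assumption)."""
    return all(
        -1.0 - sum(h_i * psi_i for h_i, psi_i in zip(h, psi)) < 0.0
        for psi in support_points
    )

# Hypothetical jump support: corners of [-0.5, 1.0] x [-0.4, 0.8].
support = [(-0.5, -0.4), (-0.5, 0.8), (1.0, -0.4), (1.0, 0.8)]

h_a = (0.5, 0.5)     # moderately leveraged long portfolio
h_b = (-0.5, 1.0)    # long-short portfolio
h_c = (3.0, 0.0)     # too leveraged: a -50% jump in asset 1 wipes out wealth

midpoint = tuple(0.5 * (u + v) for u, v in zip(h_a, h_b))
print(in_J(h_a, support), in_J(h_b, support), in_J(h_c, support), in_J(midpoint, support))
```

Because each constraint is linear in $h$, any convex combination of two admissible points satisfies every constraint as well, which the midpoint check illustrates.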

2.4 Portfolio Dynamics

We will assume that:

Assumption 2.2. The matrix $\Sigma\Sigma'$ is positive definite.

and

Assumption 2.3. The systematic (factor-driven) and idiosyncratic (asset-driven) jump risks are uncorrelated, i.e. $\forall z \in Z$ and $i = 1, \dots, m$, $\gamma_i(z)\xi'(z) = 0$.


The second assumption implies that there cannot be simultaneous jumps in the factor process and any asset price process. This assumption, which will prove sufficient to show the existence of a unique optimal investment policy, may appear somewhat restrictive as it does not enable us to model a jump correlation structure across factors and assets, although we can model a jump correlation structure within the factors and within the assets.

Remark 2.1. Assumption 2.3 is automatically satisfied when jumps are only allowed in the security prices and the state variable $X(t)$ is modelled using a diffusion process (see [19] for a full treatment of this case).

Let $\mathcal{G}_t := \sigma((S(s), X(s)), 0 \le s \le t)$ be the sigma-field generated by the security and factor processes up to time $t$.

An investment strategy or control process is an $\mathbb{R}^m$-valued process with the interpretation that $h_i(t)$ is the fraction of current portfolio value invested in the $i$th asset, $i = 1, \dots, m$. The fraction invested in the money market account is then $h_0(t) = 1 - \sum_{i=1}^{m} h_i(t)$.

Definition 2.2. An $\mathbb{R}^m$-valued control process $h(t)$ is in class $\mathcal{H}$ if the following conditions are satisfied:

1. $h(t)$ is progressively measurable with respect to $\{\mathcal{B}([0, t]) \otimes \mathcal{G}_t\}_{t \ge 0}$ and is càdlàg;

2. $P\left(\int_0^T |h(s)|^2\,ds < +\infty\right) = 1$, $\forall T > 0$;

3. $h'(t)\gamma(z) > -1$, $\forall t > 0$, $z \in Z$, a.s. $d\nu$.

Define the set $\mathcal{K}$ as

\[ \mathcal{K} := \left\{ h(t) \in \mathcal{H} : h(t) \in \mathcal{J} \ \ \forall t \ \text{a.s.} \right\} \tag{8} \]

Lemma 2.1. Under Assumption 2.1, a control process $h(t)$ satisfying condition 3 in Definition 2.2 is bounded.

Proof. The proof of this result is immediate.

Definition 2.3. A control process $h(t)$ is in class $\mathcal{A}(T)$ if the following conditions are satisfied:

1. $h(t) \in \mathcal{H}$ $\forall t \in [0, T]$;

2. $E\chi_T^h = 1$ where $\chi_t^h$ is the Doléans exponential defined as

\[ \chi_t^h := \exp\left\{ -\theta \int_0^t h(s)'\Sigma\,dW_s - \frac{1}{2}\theta^2 \int_0^t h(s)'\Sigma\Sigma'h(s)\,ds + \int_0^t\!\!\int_Z \ln\left(1 - G(z, h(s); \theta)\right) \tilde{N}_p(ds, dz) + \int_0^t\!\!\int_Z \left\{ \ln\left(1 - G(z, h(s); \theta)\right) + G(z, h(s); \theta) \right\} \nu(dz)\,ds \right\} \tag{9} \]

and

\[ G(z, h; \theta) = 1 - \left(1 + h'\gamma(z)\right)^{-\theta} \tag{10} \]

Definition 2.4. We say that a control process $h(t)$ is admissible if $h(t) \in \mathcal{A}(T)$.

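The proportion invested in the money market account is $h_0(t) = 1 - \sum_{i=1}^{m} h_i(t)$. Taking this budget equation into consideration, the wealth $V(t, x, h)$, or $V(t)$, of the investor in response to an investment strategy $h(t) \in \mathcal{H}$ follows the dynamics

\[ \frac{dV(t)}{V(t^-)} = \left(a_0 + A_0'X(t)\right)dt + h'(t)\left(a - a_0\mathbf{1} + \left(A - \mathbf{1}A_0'\right)X(t)\right)dt + h'(t)\Sigma\,dW_t + \int_Z h'(t)\gamma(z)\,\bar{N}_p(dt, dz) \]

where $\mathbf{1} \in \mathbb{R}^m$ denotes the $m$-element unit column vector and with $V(0) = v$. Defining $\hat{a} := a - a_0\mathbf{1}$ and $\hat{A} := A - \mathbf{1}A_0'$, we can express the portfolio dynamics as

\[ \frac{dV(t)}{V(t^-)} = \left(a_0 + A_0'X(t)\right)dt + h'(t)\left(\hat{a} + \hat{A}X(t)\right)dt + h'(t)\Sigma\,dW_t + \int_Z h'(t)\gamma(z)\,\bar{N}_p(dt, dz) \tag{11} \]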

3. Problem Setup

3.1 Optimization Criterion

We will follow Bielecki and Pliska [11] and Kuroda and Nagai [31] and assume that the objective of the investor is to maximize the long-term risk adjusted growth of his/her portfolio of assets. In this context, the objective of the risk-sensitive management problem is to find $h^*(t) \in \mathcal{A}(T)$ that maximizes the control criterion

\[ J(t, x, h; \theta) := -\frac{1}{\theta} \ln E\left[ e^{-\theta \ln V(t,x,h)} \right] \tag{12} \]


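By Itô's formula, the log of the portfolio value in response to a strategy $h$ is

\[ \begin{aligned} \ln V(t) = \ln v &+ \int_0^t \left\{ \left(a_0 + A_0'X(s)\right) + h(s)'\left(\hat{a} + \hat{A}X(s)\right) \right\} ds - \frac{1}{2}\int_0^t h(s)'\Sigma\Sigma'h(s)\,ds + \int_0^t h(s)'\Sigma\,dW(s) \\ &+ \int_0^t\!\!\int_{Z_0} \left\{ \ln\left(1 + h(s)'\gamma(z)\right) - h(s)'\gamma(z) \right\} \nu(dz)\,ds + \int_0^t\!\!\int_Z \ln\left(1 + h(s)'\gamma(z)\right) \bar{N}_p(ds, dz) \end{aligned} \tag{13} \]

Hence,

\[ e^{-\theta \ln V(t)} = v^{-\theta} \exp\left\{ \theta \int_0^t g(X_s, h(s); \theta)\,ds \right\} \chi_t^h \tag{14} \]

where

\[ g(x, h; \theta) = \frac{1}{2}(\theta + 1)\,h'\Sigma\Sigma'h - a_0 - A_0'x - h'(\hat{a} + \hat{A}x) + \int_Z \left\{ \frac{1}{\theta}\left[ \left(1 + h'\gamma(z)\right)^{-\theta} - 1 \right] + h'\gamma(z)\mathbf{1}_{Z_0}(z) \right\} \nu(dz) \tag{15} \]

and the Doléans exponential $\chi_t^h$ is given by (9).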
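The running cost $g$ in (15) is easy to evaluate once the jump measure is discretized. The sketch below treats the scalar case $n = m = 1$ and assumes a hypothetical discrete jump measure given as a list of `(gamma, weight, in_Z0)` triples. It also assumes that the nonlinear jump term carries the factor $1/\theta$, i.e. that the integrand is $\frac{1}{\theta}[(1 + h\gamma)^{-\theta} - 1] + h\gamma\mathbf{1}_{Z_0}$, which is the form implied by combining (13), (14) and (9); all parameter names and values are illustrative.

```python
def g(x, h, theta, sigma, a0, A0, a_hat, A_hat, jumps):
    """Running cost (15) for n = m = 1.

    jumps: list of (gamma, weight, in_Z0) triples describing a hypothetical
    discrete jump measure nu; in_Z0 marks the marks that are compensated.
    """
    diffusion = 0.5 * (theta + 1.0) * sigma ** 2 * h ** 2
    drift = -a0 - A0 * x - h * (a_hat + A_hat * x)
    jump = 0.0
    for gamma, weight, in_Z0 in jumps:
        nonlinear = ((1.0 + h * gamma) ** (-theta) - 1.0) / theta
        compensator = h * gamma if in_Z0 else 0.0
        jump += weight * (nonlinear + compensator)
    return diffusion + drift + jump

# Illustrative parameters only.
jumps = [(-0.3, 0.5, True), (0.4, 0.5, True)]
params = dict(theta=2.0, sigma=0.2, a0=0.01, A0=0.02, a_hat=0.05, A_hat=0.1, jumps=jumps)

print(g(x=1.0, h=0.0, **params))   # with no risky holdings only -a0 - A0*x remains
print(g(x=1.0, h=0.5, **params))
```

With $h = 0$ every asset-related term vanishes and $g$ reduces to minus the money market rate $a_0 + A_0 x$, which gives a quick consistency check on the signs.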

3.2 Change of Measure

Let $P_h^\theta$ be the measure on $(\Omega, \mathcal{F})$ defined as

\[ \left. \frac{dP_h^\theta}{dP} \right|_{\mathcal{F}_t} := \chi_t^h \tag{16} \]

For a change of measure to be possible, we must ensure that the following technical condition holds:

\[ G(z, h(s); \theta) < 1 \]

for all $s \in [0, T]$ and $z$ a.s. $d\nu$. This condition is satisfied iff

\[ h'(s)\gamma(z) > -1 \tag{17} \]

a.s. $d\nu$, which was already one of the conditions required for $h$ to be in class $\mathcal{H}$ (Condition 3 in Definition 2.2).
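The equivalence between $G < 1$ and condition (17) follows because $1 - G = (1 + h'\gamma(z))^{-\theta}$ is strictly positive exactly when $1 + h'\gamma(z) > 0$. A minimal sketch, assuming a hypothetical finite sample of values of $h'\gamma(z)$:

```python
def G(h_dot_gamma, theta):
    """G(z, h; theta) = 1 - (1 + h'gamma(z)) ** (-theta), as in equation (10)."""
    return 1.0 - (1.0 + h_dot_gamma) ** (-theta)

def change_of_measure_ok(h_dot_gamma_values, theta):
    """True if condition (17) holds, and hence G < 1, at every sampled jump mark."""
    # The check u > -1 comes first so that the power is never taken of a non-positive base.
    return all(u > -1.0 and G(u, theta) < 1.0 for u in h_dot_gamma_values)

print(change_of_measure_ok([-0.5, 0.2, 3.0], theta=2.0))   # all values above -1
print(change_of_measure_ok([-1.0, 0.2], theta=2.0))        # a jump that wipes out wealth
```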

$P_h^\theta$ is a probability measure for $h \in \mathcal{A}(T)$. For $h \in \mathcal{A}(T)$,

\[ W_t^h = W_t + \theta \int_0^t \Sigma'h(s)\,ds \]

is a standard Brownian motion under the measure $P_h^\theta$ and we define the $P_h^\theta$-compensated Poisson measure as

\[ \begin{aligned} \int_0^t\!\!\int_Z \tilde{N}_p^h(ds, dz) &= \int_0^t\!\!\int_Z N_p(ds, dz) - \int_0^t\!\!\int_Z \left\{ 1 - G(z, h(s); \theta) \right\} \nu(dz)\,ds \\ &= \int_0^t\!\!\int_Z N_p(ds, dz) - \int_0^t\!\!\int_Z \left(1 + h'\gamma(z)\right)^{-\theta} \nu(dz)\,ds \end{aligned} \]

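As a result, $X(s)$, $0 \le s \le t$, satisfies the SDE:

\[ dX(s) = f\left(X(s^-), h(s); \theta\right)ds + \Lambda\,dW_s^h + \int_Z \xi(z)\,\tilde{N}_p^h(ds, dz) \tag{18} \]

where

\[ f(x, h; \theta) := b + Bx - \theta\Lambda\Sigma'h + \int_Z \xi(z)\left[ \left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{Z_0}(z) \right] \nu(dz) \tag{19} \]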

We will now introduce the following two auxiliary criterion functions under the measure $P_h^\theta$:

• the auxiliary function directly associated with the risk-sensitive control problem:

\[ I(v, x; h; t, T; \theta) = -\frac{1}{\theta} \ln E_{t,x}^{h,\theta}\left[ \exp\left\{ \theta \int_t^T g(X_s, h(s); \theta)\,ds - \theta \ln v \right\} \right] \tag{20} \]

where $E_{t,x}^{h,\theta}[\cdot]$ denotes the expectation taken with respect to the measure $P_h^\theta$ and with initial conditions $(t, x)$.

• the exponentially transformed criterion

\[ \tilde{I}(v, x, h; t, T; \theta) := E_{t,x}^{h,\theta}\left[ \exp\left\{ \theta \int_t^T g(X_s, h(s); \theta)\,ds - \theta \ln v \right\} \right] \tag{21} \]

which we will find convenient to use in our derivations.

We have completed our reformulation of the problem under the measure $P_h^\theta$. The state dynamics (18) is a jump-diffusion process and our objective is to maximize the criterion (20) or alternatively minimize (21).

3.3 The HJB Equation

In this section we derive the risk-sensitive Hamilton-Jacobi-Bellman partial integro-differential equation (RS HJB PIDE) associated with the optimal control problem. Since we do not anticipate that a classical solution generally exists, we will not attempt to derive a verification theorem. Instead, we will show that the value function $\Phi$ is a solution of the RS HJB PIDE in the viscosity sense. In fact, we will show that the value function is the unique continuous viscosity solution of the RS HJB PIDE. This result will in turn justify the association of the RS HJB PIDE with the control problem and replace the verification theorem we would derive if a classical solution existed.

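Let $\Phi$ be the value function for the auxiliary criterion function $I(v, x; h; t, T)$ defined in (20). Then $\Phi$ is defined as

\[ \Phi(t, x) = \sup_{h \in \mathcal{A}(T)} I(v, x; h; t, T) \tag{22} \]

We will show that $\Phi$ satisfies the HJB PDE

\[ \frac{\partial \Phi}{\partial t}(t, x) + \sup_{h \in \mathcal{J}} L_t^h \Phi(t, X(t)) = 0 \tag{23} \]

where

\[ \begin{aligned} L_t^h \Phi(t, x) = {} & f(x, h; \theta)'D\Phi + \frac{1}{2}\mathrm{tr}\left(\Lambda\Lambda'D^2\Phi\right) - \frac{\theta}{2}(D\Phi)'\Lambda\Lambda'D\Phi \\ & + \int_Z \left\{ -\frac{1}{\theta}\left( e^{-\theta\left(\Phi(t, x + \xi(z)) - \Phi(t, x)\right)} - 1 \right) - \xi'(z)D\Phi \right\} \nu(dz) - g(x, h; \theta) \end{aligned} \tag{24} \]

$D\cdot = \frac{\partial \cdot}{\partial x}$, and subject to terminal condition

\[ \Phi(T, x) = \ln v \tag{25} \]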

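Similarly, let $\tilde{\Phi}$ be the value function for the auxiliary criterion function $\tilde{I}(v, x; h; t, T)$. Then $\tilde{\Phi}$ is defined as

\[ \tilde{\Phi}(t, x) = \inf_{h \in \mathcal{A}(T)} \tilde{I}(v, x; h; t, T) \tag{26} \]

The corresponding HJB PDE is

\[ \frac{\partial \tilde{\Phi}}{\partial t}(t, x) + \frac{1}{2}\mathrm{tr}\left(\Lambda\Lambda'D^2\tilde{\Phi}(t, x)\right) + H(x, \tilde{\Phi}, D\tilde{\Phi}) + \int_Z \left\{ \tilde{\Phi}(t, x + \xi(z)) - \tilde{\Phi}(t, x) - \xi'(z)D\tilde{\Phi}(t, x) \right\} \nu(dz) = 0 \tag{27} \]

subject to terminal condition

\[ \tilde{\Phi}(T, x) = v^{-\theta} \tag{28} \]

and where

\[ H(s, x, r, p) = \inf_{h \in \mathcal{J}} \left\{ \left(b + Bx - \theta\Lambda\Sigma'h(s)\right)'p + \theta g(x, h; \theta)\,r \right\} \tag{29} \]

for $r \in \mathbb{R}$, $p \in \mathbb{R}^n$ and in particular,

\[ \tilde{\Phi}(t, x) = \exp\left\{ -\theta\Phi(t, x) \right\} \tag{30} \]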

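The supremum in (23) can be expressed as:

\[ \begin{aligned} \sup_{h \in \mathcal{J}} L_t^h \Phi = {} & (b + Bx)'D\Phi + \frac{1}{2}\mathrm{tr}\left(\Lambda\Lambda'D^2\Phi\right) - \frac{\theta}{2}(D\Phi)'\Lambda\Lambda'D\Phi + a_0 + A_0'x \\ & + \int_Z \left\{ -\frac{1}{\theta}\left( e^{-\theta\left(\Phi(t, x + \xi(z)) - \Phi(t, x)\right)} - 1 \right) - \xi'(z)D\Phi\,\mathbf{1}_{Z_0}(z) \right\} \nu(dz) \\ & + \sup_{h \in \mathcal{J}} \left\{ -\frac{1}{2}(\theta + 1)\,h'\Sigma\Sigma'h - \theta h'\Sigma\Lambda'D\Phi + h'(\hat{a} + \hat{A}x) \right. \\ & \qquad\quad \left. - \frac{1}{\theta}\int_Z \left\{ \left(1 - \theta\xi'(z)D\Phi\right)\left[ \left(1 + h'\gamma(z)\right)^{-\theta} - 1 \right] + \theta h'\gamma(z)\mathbf{1}_{Z_0}(z) \right\} \nu(dz) \right\} \end{aligned} \tag{31} \]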

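Under Assumption 2.2 the term

\[ -\frac{1}{2}(\theta + 1)\,h'\Sigma\Sigma'h - \theta h'\Sigma\Lambda'D\Phi + h'(\hat{a} + \hat{A}x) - \int_Z h'\gamma(z)\mathbf{1}_{Z_0}(z)\,\nu(dz) \]

is strictly concave in $h$. Under Assumption 2.3, the nonlinear jump-related term

\[ -\frac{1}{\theta}\int_Z \left\{ \left(1 - \theta\xi'(z)D\Phi\right)\left[ \left(1 + h'\gamma(z)\right)^{-\theta} - 1 \right] \right\} \nu(dz) \]

simplifies to

\[ -\frac{1}{\theta}\int_Z \left\{ \left[ \left(1 + h'\gamma(z)\right)^{-\theta} - 1 \right] \right\} \nu(dz) \]

which is also concave in $h$ $\forall z \in Z$ a.s. $d\nu$. Therefore, the supremum is reached for a unique optimal control $h^*$, which is an interior point of the set $\mathcal{J}$ defined in equation (7), and the supremum, evaluated at $h^*$, is finite.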
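Since the function being maximized is strictly concave on the convex set $\mathcal{J}$, the maximizer can be located by any one-dimensional search in the scalar case. The following sketch assumes $m = 1$, a hypothetical two-point jump measure with $Z_0 = \emptyset$ (so the compensator term drops out), and collects every term linear in $h$, including the one involving $D\Phi$, into a single illustrative coefficient `mu`. It uses a ternary search, which is a generic technique chosen here for illustration and not the authors' method.

```python
def objective(h, theta, sigma, mu, jumps):
    """Concave function of h: quadratic diffusion term, linear term, jump term."""
    jump = sum(w * ((1.0 + h * gam) ** (-theta) - 1.0) for gam, w in jumps)
    return -0.5 * (theta + 1.0) * sigma ** 2 * h ** 2 + mu * h - jump / theta

def objective_derivative(h, theta, sigma, mu, jumps):
    """First derivative in h, used only to verify the first-order condition."""
    jump = sum(w * gam * (1.0 + h * gam) ** (-theta - 1.0) for gam, w in jumps)
    return -(theta + 1.0) * sigma ** 2 * h + mu + jump

def maximize_on_J(theta, sigma, mu, jumps, iterations=200, eps=1e-9):
    """Ternary search on J = (-1/gamma_max, -1/gamma_min) for a concave objective."""
    gamma_min = min(gam for gam, _ in jumps)
    gamma_max = max(gam for gam, _ in jumps)
    lo, hi = -1.0 / gamma_max + eps, -1.0 / gamma_min - eps
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1, theta, sigma, mu, jumps) < objective(m2, theta, sigma, mu, jumps):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

# Illustrative parameters: jumps of -30% and +40%, each with intensity 0.5.
jumps = [(-0.3, 0.5), (0.4, 0.5)]
h_star = maximize_on_J(theta=2.0, sigma=0.2, mu=0.05, jumps=jumps)
print(f"h* = {h_star:.6f}, derivative at h* = {objective_derivative(h_star, 2.0, 0.2, 0.05, jumps):.2e}")
```

The jump term tends to $-\infty$ at both ends of $\mathcal{J}$ when $\theta > 0$, which is why the maximizer sits strictly inside the interval.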


4. Properties of the Value Function

4.1 "Zero Beta" Policies

As in [19], we will use "zero beta" ($0\beta$) policies (initially introduced by Black [16]).

Definition 4.1 ($0\beta$-policy). By reference to the definition of the function $g$ in equation (15), a "zero beta" ($0\beta$) control policy $\check{h}(t)$ is an admissible control policy for which the function $g$ is independent of the state variable $x$.

In our problem, the set $\mathcal{Z}$ of $0\beta$-policies is the set of admissible policies $\check{h}$ which satisfy the equation

\[ \check{h}'\hat{A} = -A_0' \]

As $m > n$, there is potentially an infinite number of $0\beta$-policies as long as the following assumption is satisfied.

Assumption 4.1. The matrix $\hat{A}$ has rank $n$.

Without loss of generality, we fix a $0\beta$ control $\check{h}$ as a constant function of time so that

\[ g(x, \check{h}; \theta) = \check{g} \]

where $\check{g}$ is a constant.
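A zero-beta policy removes the dependence of $g$ on $x$ by solving a linear system with more unknowns ($m$) than equations ($n$). The sketch below assumes a single factor ($n = 1$), so that the matrix $\hat{A} = A - \mathbf{1}A_0'$ is an $m$-vector and the condition reads $\sum_i h_i \hat{A}_i = -A_0$; it returns the smallest-norm solution. The numbers are illustrative, and the sketch does not check that the resulting $h$ also lies in the admissible set.

```python
def min_norm_zero_beta(A_hat, A0):
    """Smallest-norm h solving h'A_hat = -A0 when n = 1 (A_hat is an m-vector)."""
    norm_sq = sum(a * a for a in A_hat)
    if norm_sq == 0.0:
        raise ValueError("A_hat must be non-zero (rank condition with n = 1)")
    return [-A0 * a / norm_sq for a in A_hat]

def factor_loading(h, A_hat, A0):
    """Coefficient of x in g, namely -(A0 + h'A_hat); zero for a zero-beta policy."""
    return -(A0 + sum(h_i * a_i for h_i, a_i in zip(h, A_hat)))

# Illustrative values for m = 3 assets and one factor.
A_hat = [0.10, -0.05, 0.20]
A0 = 0.02

h_check = min_norm_zero_beta(A_hat, A0)
print(h_check, factor_loading(h_check, A_hat, A0))
```

Adding to `h_check` any vector orthogonal to `A_hat` gives another zero-beta policy, which is the sense in which there are infinitely many of them when $m > n$.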

4.2 Convexity

Proposition 4.1. The value function $\Phi(t, x)$ is convex in $x$.

Proof. See the proof of Proposition 6.2 in [19].

Corollary 4.1. The exponentially transformed value function $\tilde{\Phi}$ has the following property: $\forall (x_1, x_2) \in \mathbb{R}^2$, $\kappa \in (0, 1)$,

\[ \tilde{\Phi}(t, \kappa x_1 + (1 - \kappa)x_2) \ge \tilde{\Phi}^{\kappa}(t, x_1)\,\tilde{\Phi}^{1-\kappa}(t, x_2) \tag{32} \]

Proof. The property follows immediately from the definition of $\Phi(t, x) = -\frac{1}{\theta}\ln\tilde{\Phi}(t, x)$.
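The step behind (32) can be spelled out as follows. This is a sketch that assumes $\theta > 0$, so that multiplying the convexity inequality by $-\theta$ reverses its direction; it writes $\tilde{\Phi} = \exp\{-\theta\Phi\}$ for the exponentially transformed value function, as in (30).

```latex
\begin{align*}
\Phi(t, \kappa x_1 + (1-\kappa)x_2)
  &\le \kappa\,\Phi(t, x_1) + (1-\kappa)\,\Phi(t, x_2)
  && \text{(convexity, Proposition 4.1)} \\
-\theta\,\Phi(t, \kappa x_1 + (1-\kappa)x_2)
  &\ge -\theta\kappa\,\Phi(t, x_1) - \theta(1-\kappa)\,\Phi(t, x_2)
  && (\theta > 0) \\
\tilde{\Phi}(t, \kappa x_1 + (1-\kappa)x_2)
  &\ge \left(e^{-\theta\Phi(t, x_1)}\right)^{\kappa}
       \left(e^{-\theta\Phi(t, x_2)}\right)^{1-\kappa}
   = \tilde{\Phi}^{\kappa}(t, x_1)\,\tilde{\Phi}^{1-\kappa}(t, x_2)
  && \text{(exponentiating)}
\end{align*}
```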


4.3 Boundedness

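Proposition 4.2. The exponentially transformed value function $\tilde{\Phi}$ is positive and bounded, i.e. there exists $M > 0$ such that

\[ 0 \le \tilde{\Phi}(t, x) \le M \qquad \forall (t, x) \in [0, T] \times \mathbb{R}^n \]

Proof. By definition,

\[ \tilde{\Phi}(t, x) = \inf_{h \in \mathcal{A}(T)} E_{t,x}^{h,\theta}\left[ \exp\left\{ \theta \int_t^T g(X_s, h(s); \theta)\,ds - \theta \ln v \right\} \right] \ge 0 \]

Consider the zero-beta policy $\check{h}$. By the Dynamic Programming Principle

\[ \tilde{\Phi}(t, x) \le e^{\theta\left[ \int_t^T g(X(s), \check{h}; \theta)\,ds - \ln v \right]} = e^{\theta\left[ \check{g}(T - t) - \ln v \right]} \]

which concludes the proof.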

4.4 Growth

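Assumption 4.2. There exist $2n$ constant controls $h^k$, $k = 1, \dots, 2n$, such that the $2n$ functions $\beta^k : [0, T] \to \mathbb{R}^n$ defined by

\[ \beta^k(t) = \theta B^{-1}\left(I - e^{B(T - t)}\right)\left(A_0 + \hat{A}'h^k\right) \tag{33} \]

and the $2n$ functions $\alpha^k : [0, T] \to \mathbb{R}$ defined by

\[ \alpha^k(t) = -\int_t^T q^k(s)\,ds \tag{34} \]

where

\[ \begin{aligned} q^k(t) := {} & \left( b - \theta\Lambda\Sigma'h^k + \int_Z \xi(z)\left[ \left(1 + h^{k\prime}\gamma(z)\right)^{-\theta} - \mathbf{1}_{Z_0}(z) \right] \nu(dz) \right)'\beta^k(t) \\ & + \frac{1}{2}\mathrm{tr}\left(\Lambda\Lambda'\beta^k(t)\beta^{k\prime}(t)\right) + \int_Z \left\{ e^{\beta^{k\prime}(t)\xi(z)} - 1 - \xi'(z)\beta^k(t) \right\} \nu(dz) \\ & + \frac{1}{2}\theta(\theta + 1)\,h^{k\prime}\Sigma\Sigma'h^k - \theta a_0 - \theta h^{k\prime}\hat{a} \\ & + \theta\int_Z \left\{ \frac{1}{\theta}\left[ \left(1 + h^{k\prime}\gamma(z)\right)^{-\theta} - 1 \right] + h^{k\prime}\gamma(z)\mathbf{1}_{Z_0}(z) \right\} \nu(dz) \end{aligned} \]

exist and for $i = 1, \dots, n$ satisfy:

\[ \beta_i^i(t) < 0, \qquad \beta_i^{n+i}(t) > 0 \tag{35} \]

where $\beta_j^i(t)$ denotes the $j$th component of the vector $\beta^i(t)$.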
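The sign condition (35) is easiest to see in the scalar case. The sketch below assumes $n = m = 1$ (so transposes play no role) and evaluates $\beta(t) = \theta B^{-1}(1 - e^{B(T-t)})(A_0 + h\hat{A})$, where $\hat{A}$ is the factor loading of the excess return. Because $(1 - e^{B(T-t)})/B$ is negative for every $B \ne 0$ and $t < T$, the sign of $\beta(t)$ is the opposite of the sign of $\theta(A_0 + h\hat{A})$, so one control on each side of $h = -A_0/\hat{A}$ produces the two required signs. All parameter values are illustrative.

```python
import math

def beta(t, T, theta, B, A0, A_hat, h):
    """Scalar version of (33): theta * (1 - exp(B (T - t))) / B * (A0 + h * A_hat)."""
    if B == 0.0:
        raise ValueError("B must be invertible (non-zero in the scalar case)")
    return theta * (1.0 - math.exp(B * (T - t))) / B * (A0 + h * A_hat)

# Illustrative parameters.
T, theta, B, A0, A_hat = 1.0, 1.0, -0.5, 0.02, 0.10

h_first, h_second = 1.0, -1.0   # A0 + h * A_hat is positive for the first, negative for the second
for t in (0.0, 0.5, 0.9):
    print(t, beta(t, T, theta, B, A0, A_hat, h_first), beta(t, T, theta, B, A0, A_hat, h_second))
```

In this scalar setting the function solves $\dot{\beta}(t) = -B\beta(t) + \theta(A_0 + h\hat{A})$ with $\beta(T) = 0$, which can be confirmed by a finite-difference check.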


Remark 4.1. Key to this assumption is the condition (35), which imposes a specific constraint on one element of each of the $2n$ vectors $\beta^k(t)$. To clarify the structure of this constraint, define $M^-_\beta$ as the square $n \times n$ matrix whose $i$-th column (with $i = 1, \ldots, n$) is the $n$-element column vector $\beta^i(t)$. Then all the elements $m^-_{jj}$, $j = 1, \ldots, n$ on the diagonal of $M^-_\beta$ are such that

\[ m^-_{jj} = \beta^j_j(t) < 0 \]

Similarly, define $M^+_\beta$ as the square $n \times n$ matrix whose $i$-th column (with $i = 1, \ldots, n$) is the $n$-element column vector $\beta^{n+i}(t)$. Then all the elements $m^+_{jj}$, $j = 1, \ldots, n$ on the diagonal of $M^+_\beta$ are such that

\[ m^+_{jj} = \beta^{n+j}_j(t) > 0 \]

Note that there is no requirement for either $M^-_\beta$ or $M^+_\beta$ to have full rank. It would in fact be perfectly acceptable to have rank 1 as a result of column duplication.

Remark 4.2. For the function $\beta^k$ in equation (33) to exist, $B$ must be invertible. Moreover, the existence of $2n$ constant controls $h^k$, $k = 1, \ldots, 2n$ such that (33) satisfies (35) is only guaranteed when $\mathcal{J} = \mathbb{R}^m$. However, since finding the controls is equivalent to solving a system of at most $n$ inequalities with $m$ variables and $m > n$, it is likely that one could find constant controls after some adjustments to the elements of the matrices $A_0$, $A$, $B$ or to the maximum jump size allowed.

Proposition 4.3. Suppose Assumption 4.2 holds and consider the $2n$ constant controls $h^k$, $k = 1, \ldots, 2n$ parameterizing the $4n$ functions

\[ \alpha^k : [0,T] \to \mathbb{R}, \quad k = 1, \ldots, 2n \]

\[ \beta^k : [0,T] \to \mathbb{R}^n, \quad k = 1, \ldots, 2n \]

such that for $i = 1, \ldots, n$,

\[ \beta^i_i(t) < 0, \qquad \beta^{n+i}_i(t) > 0 \]

where $\beta^i_j(t)$ denotes the $j$-th component of the vector $\beta^i(t)$. Then we have the following upper bounds:

\[ \tilde{\Phi}(t, x) \le e^{\alpha^k(t) + \beta^{k\prime}(t)x} \]

in each element $x_i$, $i = 1, \ldots, n$ of $x$.


Proof. Setting $\mathbf{Z} = \mathbb{R}^n \setminus \{0\}$ and recalling that the dynamics of the state variable $X(t)$ under the $\mathbb{P}^\theta_h$-measure is given by

\[ dX(t) = f(X(t^-), h(t); \theta)\,dt + \Lambda\,dW^h_t + \int_{\mathbb{R}^n} \xi(z)\,N^h_p(dt, dz) \]

we note that the associated Lévy measure $\tilde{\nu}$ can be defined via the map:

\[ \tilde{\nu} = \nu \circ \xi^{-1} \tag{36} \]

We will now limit ourselves to the class $\mathcal{H}_c$ of constant controls. By the optimality principle, for an arbitrary admissible constant control policy $h$, we have

\[ \tilde{\Phi}(t, x) \le I(x; h; t, T) \le \mathbf{E}_{t,x}\left[\exp\left\{\theta\int_t^T g(X_s, h)\,ds - \theta\ln v\right\}\right] =: W(t, x) \tag{37} \]

In this setting, we note that the function $g$ is an affine function of the affine process $X(t)$. Affine process theory (see Appendix A in Duffie and Singleton [24], Duffie, Pan and Singleton [23] or Duffie, Filipovic and Schachermayer [21] for more details on the properties of affine processes) leads us to expect that the expectation on the right-hand side of equation (37) takes the form

\[ W(t, x) = \exp\{\alpha(t) + \beta(t)x\} \tag{38} \]

where

\[ \alpha : t \in [0,T] \to \mathbb{R}, \qquad \beta : t \in [0,T] \to \mathbb{R}^n \]

are functions solving two ODEs.

Indeed, applying the Feynman-Kac formula, we find that the function $W(t, x)$ satisfies the integro-differential PDE:

\begin{align*}
& \frac{\partial W}{\partial t}
 + \left(b + Bx - \theta\Lambda\Sigma' h + \int_{\mathbf{Z}} \xi(z)\left[\left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{\mathbf{Z}_0}(z)\right]\nu(dz)\right)' DW(t, x) \\
& \quad + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda' D^2 W(t, x)\right)
 + \int_{\mathbf{Z}} \left\{ W(t, x + \xi(z)) - W(t, x) - \xi'(z)DW(t, x) \right\}\nu(dz) \\
& \quad + \theta g(x, h; \theta)\,W(t, x) = 0
\end{align*}

subject to terminal condition $W(T, x) = v^{-\theta}$.


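Now, taking a candidate solution of the form

\[ W(t, x) = \exp\{\alpha(t) + \beta(t)x\} \]

we have

\[ \frac{\partial W}{\partial t} = \left(\dot{\alpha}(t) + \dot{\beta}(t)x\right)W(t, x), \qquad DW = \beta'(t)W(t, x), \qquad D^2W = \beta'(t)\beta(t)W(t, x) \]

Substituting into the PDE, we get

\begin{align*}
& \left(\dot{\alpha}(t) + \dot{\beta}(t)x\right)W(t, x) \\
& \quad + \left(b + Bx - \theta\Lambda\Sigma' h + \int_{\mathbf{Z}} \xi(z)\left[\left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{\mathbf{Z}_0}(z)\right]\nu(dz)\right)' \beta'(t)W(t, x) \\
& \quad + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda'\beta'(t)\beta(t)\right)W(t, x) \\
& \quad + \int_{\mathbf{Z}} \left\{ W(t, x + \xi(z)) - W(t, x) - \xi'(z)\beta'(t)W(t, x) \right\}\nu(dz) \\
& \quad + \theta\left(\frac{1}{2}(\theta+1)\,h'\Sigma\Sigma' h - a_0 - A_0' x - h'(a + Ax)
   + \int_{\mathbf{Z}} \left\{ \left[\left(1 + h'\gamma(z)\right)^{-\theta} - 1\right] + h'\gamma(z)\mathbf{1}_{\mathbf{Z}_0}(z) \right\}\nu(dz)\right)W(t, x) \\
& = 0
\end{align*}

Dividing by $W(t, x)$ and rearranging, we get

\begin{align*}
& \left(\dot{\beta}(t) + B'\beta'(t) - \theta A_0' - \theta h' A\right)x \\
& = -\Bigg(\dot{\alpha}(t)
  + \left(b - \theta\Lambda\Sigma' h + \int_{\mathbf{Z}} \xi(z)\left[\left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{\mathbf{Z}_0}(z)\right]\nu(dz)\right)' \beta'(t) \\
& \qquad + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda'\beta'(t)\beta(t)\right)
  + \int_{\mathbf{Z}} \left\{ e^{\beta(t)\xi(z)} - 1 - \xi'(z)\beta'(t) \right\}\nu(dz) \\
& \qquad + \frac{1}{2}\theta(\theta+1)\,h'\Sigma\Sigma' h - \theta a_0 - \theta h' a
  + \theta\int_{\mathbf{Z}} \left\{ \left[\left(1 + h'\gamma(z)\right)^{-\theta} - 1\right] + h'\gamma(z)\mathbf{1}_{\mathbf{Z}_0}(z) \right\}\nu(dz)\Bigg)
\end{align*}

Since the left-hand side is linear in $x$ while the right-hand side does not depend on $x$, both sides must vanish identically. As a result we now only need to solve the two ODEs

\[ \dot{\beta}(t) + B'\beta'(t) - \theta A_0' - \theta h' A = 0 \tag{39} \]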


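and

\begin{align*}
& \dot{\alpha}(t)
  + \left(b - \theta\Lambda\Sigma' h + \int_{\mathbf{Z}} \xi(z)\left[\left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{\mathbf{Z}_0}(z)\right]\nu(dz)\right)' \beta'(t) \\
& \quad + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda'\beta'(t)\beta(t)\right)
  + \int_{\mathbf{Z}} \left\{ e^{\beta(t)\xi(z)} - 1 - \xi'(z)\beta'(t) \right\}\nu(dz) \\
& \quad + \frac{1}{2}\theta(\theta+1)\,h'\Sigma\Sigma' h - \theta a_0 - \theta h' a
  + \theta\int_{\mathbf{Z}} \left\{ \left[\left(1 + h'\gamma(z)\right)^{-\theta} - 1\right] + h'\gamma(z)\mathbf{1}_{\mathbf{Z}_0}(z) \right\}\nu(dz) = 0 \tag{40}
\end{align*}

to obtain the value of $W(t, x)$. The ODE (39) for $\beta$ is linear and admits the solution

\[ \beta(t) = \theta B^{-1}\left(I - e^{B(T-t)}\right)\left(A_0 + h^k A\right) \tag{41} \]

As for the ODE (40) for $\alpha$, we only need to integrate to get

\[ \alpha(t) = -\int_t^T q(s)\,ds \tag{42} \]

where

\begin{align*}
q(t) := {} & \left(b - \theta\Lambda\Sigma' h + \int_{\mathbf{Z}} \xi(z)\left[\left(1 + h'\gamma(z)\right)^{-\theta} - \mathbf{1}_{\mathbf{Z}_0}(z)\right]\nu(dz)\right)' \beta'(t) \\
& + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda'\beta'(t)\beta(t)\right)
  + \int_{\mathbf{Z}} \left\{ e^{\beta(t)\xi(z)} - 1 - \xi'(z)\beta'(t) \right\}\nu(dz) \\
& + \frac{1}{2}\theta(\theta+1)\,h'\Sigma\Sigma' h - \theta a_0 - \theta h' a
  + \theta\int_{\mathbf{Z}} \left\{ \left[\left(1 + h'\gamma(z)\right)^{-\theta} - 1\right] + h'\gamma(z)\mathbf{1}_{\mathbf{Z}_0}(z) \right\}\nu(dz)
\end{align*}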
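As a sanity check on the closed form (41), the following sketch verifies numerically, in the scalar case $n = m = 1$, that it solves the linear ODE (39) with $\beta(T) = 0$. The function names, the parameter values and the shorthand `c` for the scalar $A_0 + hA$ are illustrative assumptions, not part of the paper.

```python
import math

def beta(t, theta, B, c, T):
    """Scalar version of (41): theta * (1 - exp(B*(T - t))) * c / B.

    c is a hypothetical shorthand for the scalar A_0 + h*A.
    """
    return theta * (1.0 - math.exp(B * (T - t))) * c / B

def ode_residual(t, theta, B, c, T, eps=1e-6):
    """Residual of the scalar ODE (39): beta'(t) + B*beta(t) - theta*c.

    The time derivative is approximated by a central difference.
    """
    dbeta = (beta(t + eps, theta, B, c, T) - beta(t - eps, theta, B, c, T)) / (2 * eps)
    return dbeta + B * beta(t, theta, B, c, T) - theta * c

# Illustrative parameters (assumptions): mean-reverting factor, B < 0.
theta, B, T = 2.0, -0.5, 1.0
for c in (0.3, -0.3):
    worst = max(abs(ode_residual(t, theta, B, c, T)) for t in (0.1, 0.5, 0.9))
    print(f"c = {c:+.1f}: beta(0.5) = {beta(0.5, theta, B, c, T):+.4f}, "
          f"max |residual| = {worst:.1e}")
```

With $B < 0$, the two values of `c` produce $\beta$ of opposite signs on $[0, T)$, which is the scalar analogue of the sign condition (35): one control yields a negative coefficient and the other a positive one.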

Observe that $W(t, x)$ is increasing in $x_i$, the $i$-th element of $x$, if $\beta_i > 0$, and conversely, $W(t, x)$ is decreasing in $x_i$ if $\beta_i < 0$.

Equations (41) and (42) are respectively equations (33) and (34) from Assumption 4.2. By Assumption 4.2, there exist $2n$ constant controls $h^k$, $k = 1, \ldots, 2n$ such that for $i = 1, \ldots, n$,

\[ \beta^i_i(t) < 0, \qquad \beta^{n+i}_i(t) > 0 \]

where $\beta^i_j(t)$ denotes the $j$-th component of the vector $\beta^i(t)$. We can now conclude that we have the following upper bounds

\[ \tilde{\Phi}(t, x) \le e^{\alpha^k(t) + \beta^{k\prime}(t)x} \]

for each element $x_i$, $i = 1, \ldots, n$ of $x$.
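To see how these bounds control the behaviour at infinity, consider the scalar case $n = 1$ as an illustration. Assumption 4.2 then supplies two controls with $\beta^1(t) < 0 < \beta^2(t)$, and taking the smaller of the two bounds gives exponential decay in both directions, for each fixed $t$ at which the strict inequalities hold:

```latex
\[
0 \le \tilde{\Phi}(t, x)
  \le \min\left\{ e^{\alpha^1(t) + \beta^1(t)x},\; e^{\alpha^2(t) + \beta^2(t)x} \right\}
  \longrightarrow 0
  \quad \text{as } |x| \to \infty,
\]
```

since the first bound vanishes as $x \to +\infty$ and the second as $x \to -\infty$.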


Remark 4.3. To obtain the upper bounds and the asymptotic behaviour, we do not need the $2n$ constant controls to be pairwise different. In fact, we need at least 2 different controls and at most $2n$ different controls. Moreover, we could consider wider classes of controls extending beyond constant controls. This would require some modifications to the proof but would also alleviate the assumptions required for the result to hold.

Remark 4.4. For a given constant control $h$, equation (39) is a linear $n$-dimensional ODE. However, if in the dynamics of the state variable $X(t)$, $\Lambda$ and $\Xi$ depended on $X$, the ODE would be nonlinear. Once ODE (39) is solved, obtaining $\alpha(t)$ from equation (40) is a simple matter of integration.

Remark 4.5. For a given constant control $h$, given $x \in \mathbb{R}^n$ and $t \in [0,T]$, the solution of ODE (39) is the same whether the dynamics of $S(t)$ and $X(t)$ is the jump diffusion considered here or the corresponding pure diffusion model. The converse is, however, not true since in the pure diffusion setting $h \in \mathbb{R}^m$, while in the jump diffusion case $h \in \mathcal{J} \subset \mathbb{R}^m$.

5. Viscosity Solution Approach

In recent years, viscosity solutions have gained widespread acceptance as an effective technique to obtain a weak sense solution for HJB PDEs when no classical (i.e. $C^{1,2}$) solution can be shown to exist, which is the case for many stochastic control problems. Viscosity solutions also have a very practical interest. Indeed, once a solution has been interpreted in the viscosity sense and the uniqueness of this solution has been proved via a comparison result, the fundamental 'stability' result of Barles and Souganidis [8] opens the way to a numerical resolution of the problem through a wide range of schemes. Readers interested in an overview of viscosity solutions should refer to the classic article by Crandall, Ishii and Lions [17], the books by Fleming and Soner [26] and Øksendal and Sulem [30], as well as the notes by Barles [5] and Touzi [34].

While the use of viscosity solutions to solve classical diffusion-type stochastic control problems has been extensively studied and surveyed (see Fleming and Soner [26] and Touzi [34]), the introduction of a jump-related measure makes the jump-diffusion framework more complex. As a result, so far no general theory has been developed to solve jump-diffusion problems. Instead, the assumptions made to derive a comparison result are closely related to what the specific problem allows. Broadly speaking, the literature can be split along two lines of analysis, depending on whether the measure associated with the jumps is assumed to be finite.

In the case when the jump measure is finite, Alvarez and Tourin [1] consider a fairly general setting in which the jump term does not need to be linear in the function $u$ which solves the integro-differential PDE. In this setting, Alvarez and Tourin develop a comparison theorem that they apply to a stochastic differential utility problem. Amadori [3] extends Alvarez and Tourin's analysis to price European options. Barles, Buckdahn and Pardoux [6] study the viscosity solution of integro-differential equations associated with backward SDEs (BSDEs).

The Lévy measure is the most extensively studied measure with singularities. Pham [33] derives a comparison result for the variational inequality associated with an optimal stopping problem. Jakobsen and Karlsen [29] analyse in detail the impact of the Lévy measure's singularity and propose a maximum principle. Amadori, Karlsen and La Chioma [4] focus on geometric Lévy processes and the partial integro-differential equations they generate before applying their results to BSDEs and to the pricing of European and American derivatives. A recent article by Barles and Imbert [7] takes a broader view of PDEs and their nonlocal operators. However, the authors assume that the nonlocal operator is, broadly speaking, linear in the solution, which may prove overly restrictive in some cases, including our present problem.

As far as our jump diffusion risk-sensitive control problem is concerned, we will promote a general treatment and avoid restricting the class of the compensator $\nu$. At some point, we will however need $\nu$ to be finite. This assumption will only be made for a purely technical reason arising in the proof of the comparison result (in Section 6). Since the rest of the story is still valid if $\nu$ is not finite, and in accordance with our goal of keeping the discussion as broad as possible, we will write the rest of the article in the spirit of a general compensator $\nu$.

5.1 Definitions

Before proceeding further, we will introduce the following definition:

Definition 5.1. The upper semicontinuous envelope $u^*(x)$ of a function $u$ at $x$ is defined as

\[ u^*(x) = \limsup_{y \to x} u(y) \]

and the lower semicontinuous envelope $u_*(x)$ of $u(x)$ is defined as

\[ u_*(x) = \liminf_{y \to x} u(y) \]

Note in particular the fundamental inequality between a function and its upper and lower semicontinuous envelopes:

\[ u_* \le u \le u^* \]
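The envelopes of Definition 5.1 can be illustrated on a step function. The sketch below is only a grid approximation under stated assumptions: the true envelopes are limits as the neighbourhood shrinks, whereas the hypothetical helpers `upper_envelope` and `lower_envelope` take the max and min of `u` over a small fixed window around `x`.

```python
def _window(x, radius, samples):
    """Grid of points in [x - radius, x + radius], with x itself included."""
    pts = [x - radius + 2.0 * radius * i / (samples - 1) for i in range(samples)]
    pts.append(x)
    return pts

def upper_envelope(u, x, radius=1e-3, samples=2001):
    """Approximate u^*(x) = limsup_{y -> x} u(y) by a max over a small window."""
    return max(u(y) for y in _window(x, radius, samples))

def lower_envelope(u, x, radius=1e-3, samples=2001):
    """Approximate u_*(x) = liminf_{y -> x} u(y) by a min over a small window."""
    return min(u(y) for y in _window(x, radius, samples))

def step(x):
    """Discontinuous test function: 0 for x <= 0, 1 for x > 0."""
    return 1.0 if x > 0 else 0.0

# At the jump the envelopes differ and bracket the function value.
print(lower_envelope(step, 0.0), step(0.0), upper_envelope(step, 0.0))
# Away from the jump all three values coincide.
print(lower_envelope(step, 1.0), step(1.0), upper_envelope(step, 1.0))
```

At the discontinuity the three values differ, while at a continuity point they agree, which is why a viscosity solution whose envelopes coincide is continuous.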

The theory of viscosity solutions was initially developed for elliptic PDEs of the form

\[ H(x, u, Du, D^2u) = 0 \]

and parabolic PDEs of the form

\[ \frac{\partial u}{\partial t} + H(x, u, Du, D^2u) = 0 \]

for what Crandall, Ishii and Lions [17] term a "proper" functional $H(x, r, p, A)$.

Definition 5.2. A functional $H(x, r, p, A)$ is said to be proper if it satisfies the following two properties:

1. (degenerate) ellipticity:

\[ H(x, r, p, A) \le H(x, r, p, B), \quad B \le A \]

and

2. monotonicity:

\[ H(x, r, p, A) \le H(x, s, p, A), \quad r \le s \]
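As a simple illustration of Definition 5.2 (a textbook-style example, not taken from the paper), consider a linear functional built from the diffusion matrix $\Lambda\Lambda'$ and a zeroth-order term:

```latex
\[
H(x, r, p, A) := r - \tfrac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda' A\right).
\]
% Degenerate ellipticity: if B <= A then A - B is positive semidefinite, so
\[
H(x, r, p, B) - H(x, r, p, A)
  = \tfrac{1}{2}\,\mathrm{tr}\left(\Lambda'(A - B)\Lambda\right) \ge 0 .
\]
% Monotonicity: H is increasing in r, so H(x, r, p, A) <= H(x, s, p, A) for r <= s.
```

Both properties hold, so this $H$ is proper. Dropping the term $r$ keeps ellipticity and makes monotonicity hold trivially, whereas replacing $r$ by $-r$ would break monotonicity.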

In our problem, the functional $F$ defined as

\begin{align*}
F(x, p, A) := -\sup_{h \in \mathcal{J}} \Bigg\{ & f(x, h)'p + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda' A\right) - \frac{\theta}{2}\,p'\Lambda\Lambda' p \\
& + \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(\Phi(t, x + \xi(z)) - \Phi(t, x)\right)} - 1\right) - \xi'(z)p \right\}\nu(dz) - g(x, h) \Bigg\} \tag{43}
\end{align*}

plays a similar role to the functional $H$ in the general parabolic equation above, and we note that it is indeed "proper". As a result, we can develop a viscosity approach to show that the value function $\Phi$ is the unique solution of the associated RS HJB PIDE.

We now give two equivalent definitions of viscosity solutions adapted from Alvarez and Tourin [1]:

• a definition based on the notion of semijets;

• a definition based on the notion of test function.

Before introducing these two definitions, we need to define the parabolic semijets of upper semicontinuous and lower semicontinuous functions and to add two conditions.

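Definition 5.3. Let $u \in USC([0,T] \times \mathbb{R}^n)$ and $(t, x) \in [0,T] \times \mathbb{R}^n$. We define:

• the Parabolic superjet $\mathcal{P}^{2,+}u$ as

\begin{align*}
\mathcal{P}^{2,+}u := \Big\{ & (p, q, A) \in \mathbb{R} \times \mathbb{R}^n \times \mathcal{S}^n : \\
& u(s, y) \le u(t, x) + p(s - t) + \langle q, y - x \rangle + \frac{1}{2}\langle A(y - x), y - x \rangle \\
& \qquad + o\left(|s - t| + |y - x|^2\right) \text{ as } (s, y) \to (t, x) \Big\}
\end{align*}

• the closure of the Parabolic superjet $\overline{\mathcal{P}}^{2,+}u$ as

\begin{align*}
\overline{\mathcal{P}}^{2,+}u := \Big\{ & (p, q, A) = \lim_{k \to \infty} (p_k, q_k, A_k) \text{ with } (p_k, q_k, A_k) \in \mathcal{P}^{2,+}u(t_k, x_k) \\
& \text{and } \lim_{k \to \infty} (t_k, x_k, u(t_k, x_k)) = (t, x, u(t, x)) \Big\}
\end{align*}

Let $u \in LSC([0,T] \times \mathbb{R}^n)$ and $(t, x) \in [0,T] \times \mathbb{R}^n$. We define:

• the Parabolic subjet $\mathcal{P}^{2,-}u$ as $\mathcal{P}^{2,-}u := -\mathcal{P}^{2,+}(-u)$, and

• the closure of the Parabolic subjet $\overline{\mathcal{P}}^{2,-}u$ as $\overline{\mathcal{P}}^{2,-}u = -\overline{\mathcal{P}}^{2,+}(-u)$.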

Condition 5.1. Let $(t, x) \in [0,T] \times \mathbb{R}^n$ and $(p, q, A) \in \mathcal{P}^{2,+}u(t, x)$. There are $\varphi \in C(\mathbb{R}^n)$, $\varphi \ge 1$ and $R > 0$ such that for

\[ ((s, y), z) \in \left(B_R(t, x) \cap ([0,T] \times \mathbb{R}^n)\right) \times \mathbf{Z}, \]

\[ \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(s, y + \xi(z)) - u(s, y)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \le \varphi(y) \]

Condition 5.2. Let $(t, x) \in [0,T] \times \mathbb{R}^n$ and $(p, q, A) \in \mathcal{P}^{2,-}u(t, x)$. There are $\varphi \in C(\mathbb{R}^n)$, $\varphi \ge 1$ and $R > 0$ such that for

\[ ((s, y), z) \in \left(B_R(t, x) \cap ([0,T] \times \mathbb{R}^n)\right) \times \mathbf{Z}, \]

\[ \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(s, y + \xi(z)) - u(s, y)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \ge -\varphi(y) \]

The purpose of these conditions on $u$ and $v$ is to ensure that the jump term is semicontinuous at any given point $(t, x) \in [0,T] \times \mathbb{R}^n$ (see Lemma 1 and Conditions (6) and (7) in [1]). In our setting, we note that since the value function $\Phi$ and the function $x \mapsto e^x$ are locally bounded, these two conditions are satisfied.

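Remark 5.1. Note that the jump-related integral term

\[ \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(s, y + \xi(z)) - u(s, y)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \]

is well defined when $(p, q, A) \in \mathcal{P}^{2,\pm}u$. First, by Taylor,

\begin{align*}
& \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(s, y + \xi(z)) - u(s, y)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \\
& = \int_{\mathbf{Z}} \Big\{ \left(u(s, y + \xi(z)) - u(s, y)\right) - \frac{\theta}{2}\left(u(s, y + \xi(z)) - u(s, y)\right)^2 \\
& \qquad\quad + \frac{\theta^2}{3!}\left(u(s, y + \xi(z)) - u(s, y)\right)^3 + \ldots - \xi'(z)q \Big\}\nu(dz)
\end{align*}

By definition of the Parabolic superjet $\mathcal{P}^{2,+}u$, for $t = s$, the pair $(q, A)$ satisfies the inequality

\[ u(s, y + \xi(z)) - u(s, y) - \xi'(z)q \le \frac{1}{2}\xi'(z)A\xi(z) + o(|\xi(z)|^2) \]

Similarly, by definition of the Parabolic subjet $\mathcal{P}^{2,-}u$, for $t = s$, the pair $(q, A)$ satisfies the inequality

\[ u(s, y + \xi(z)) - u(s, y) - \xi'(z)q \ge \frac{1}{2}\xi'(z)A\xi(z) + o(|\xi(z)|^2) \]

Thus, if $u$ is a viscosity solution, we have

\[ u(s, y + \xi(z)) - u(s, y) - \xi'(z)q = \frac{1}{2}\xi'(z)A\xi(z) + o(|\xi(z)|^2) \]

and the jump-related integral is equal to

\begin{align*}
& \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(s, y + \xi(z)) - u(s, y)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \\
& = \int_{\mathbf{Z}} \left\{ -\frac{\theta}{2}\left(u(s, y + \xi(z)) - u(s, y)\right)^2 + \frac{1}{2}\xi'(z)A\xi(z) + o(|\xi(z)|^2) \right\}\nu(dz)
\end{align*}

which is well-defined.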
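The Taylor expansion used in Remark 5.1 can be checked numerically. The sketch below compares the exact jump integrand term $-\frac{1}{\theta}(e^{-\theta\Delta} - 1)$ with its third-order expansion for a small increment $\Delta$, which stands for $u(s, y + \xi(z)) - u(s, y)$; the function names and the sample values are illustrative assumptions.

```python
import math

def jump_term_exact(delta, theta):
    """Exact value of -(1/theta) * (exp(-theta * delta) - 1)."""
    # math.expm1 computes exp(x) - 1 accurately for small x.
    return -math.expm1(-theta * delta) / theta

def jump_term_taylor(delta, theta):
    """Third-order expansion: delta - (theta/2) delta^2 + (theta^2/3!) delta^3."""
    return delta - 0.5 * theta * delta ** 2 + (theta ** 2 / 6.0) * delta ** 3

theta = 2.0
for delta in (0.1, 0.01, 0.001):
    exact = jump_term_exact(delta, theta)
    approx = jump_term_taylor(delta, theta)
    print(f"delta = {delta}: exact = {exact:.12f}, taylor = {approx:.12f}, "
          f"error = {abs(exact - approx):.2e}")
```

The error shrinks like $\Delta^4$, consistent with the omitted higher-order terms of the series.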


Definition 5.4. A locally bounded function $u \in USC([0,T] \times \mathbb{R}^n)$ satisfying Condition 5.1 is a viscosity subsolution of (23), if for all $x \in \mathbb{R}^n$, $u(T, x) \le g_0(x)$, and for all $(t, x) \in [0,T] \times \mathbb{R}^n$, $(p, q, A) \in \mathcal{P}^{2,+}u(t, x)$, we have

\[ -p + F(x, q, A) - \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(t, x + \xi(z)) - u(t, x)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \le 0 \]

A locally bounded function $u \in LSC([0,T] \times \mathbb{R}^n)$ satisfying Condition 5.2 is a viscosity supersolution of (23), if for all $x \in \mathbb{R}^n$, $u(T, x) \ge g_0(x)$, and for all $(t, x) \in [0,T] \times \mathbb{R}^n$, $(p, q, A) \in \mathcal{P}^{2,-}u(t, x)$, we have

\[ -p + F(x, q, A) - \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(u(t, x + \xi(z)) - u(t, x)\right)} - 1\right) - \xi'(z)q \right\}\nu(dz) \ge 0 \]

A locally bounded function $\Phi$ whose upper semicontinuous and lower semicontinuous envelopes are a viscosity subsolution and a viscosity supersolution of (23) is a viscosity solution of (23).

Definition 5.5. A locally bounded function $u \in USC([0,T] \times \mathbb{R}^n)$ is a viscosity subsolution of (23), if for all $x \in \mathbb{R}^n$, $u(T, x) \le g_0(x)$, and for all $(t, x) \in [0,T] \times \mathbb{R}^n$, $\psi \in C^2([0,T] \times \mathbb{R}^n)$ such that $u(t, x) = \psi(t, x)$, $u < \psi$ on $[0,T] \times \mathbb{R}^n \setminus \{(t, x)\}$, we have

\[ -\frac{\partial\psi}{\partial t} + F(x, D\psi, D^2\psi) - \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(\psi(t, x + \xi(z)) - \psi(t, x)\right)} - 1\right) - \xi'(z)D\psi \right\}\nu(dz) \le 0 \]

A locally bounded function $v \in LSC([0,T] \times \mathbb{R}^n)$ is a viscosity supersolution of (23), if for all $x \in \mathbb{R}^n$, $v(T, x) \ge g_0(x)$, and for all $(t, x) \in [0,T] \times \mathbb{R}^n$, $\psi \in C^2([0,T] \times \mathbb{R}^n)$ such that $v(t, x) = \psi(t, x)$, $v > \psi$ on $[0,T] \times \mathbb{R}^n \setminus \{(t, x)\}$, we have

\[ -\frac{\partial\psi}{\partial t} + F(x, D\psi, D^2\psi) - \int_{\mathbf{Z}} \left\{ -\frac{1}{\theta}\left(e^{-\theta\left(\psi(t, x + \xi(z)) - \psi(t, x)\right)} - 1\right) - \xi'(z)D\psi \right\}\nu(dz) \ge 0 \]

A locally bounded function $\Phi$ whose upper semicontinuous and lower semicontinuous envelopes are a viscosity subsolution and a viscosity supersolution of (23) is a viscosity solution of (23).

We would have similar definitions for the viscosity supersolution, subsolution and solution of equation (27). Once again, the superjet and test function formulations are strictly equivalent (see Alvarez and Tourin [1] and Crandall, Ishii and Lions [17]).

Remark 5.2. A more classical but also more restrictive definition of viscosity solution is as the continuous function which is both a supersolution and a subsolution of (23) (see Definition 5.1 in Barles [5]). The line of reasoning we will follow will make full use of the latitude afforded by our definition and we will have to wait until the comparison result is established in Section 6 to prove the continuity of the viscosity solution.


5.2 Characterization of the Value Function as a Viscosity Solution

To show that the value function is a (discontinuous) viscosity solution of the associated RS HJB PIDE (23), we follow an argument by Touzi [34] which enables us to make greater use of control theory in the derivation of the proof.

Theorem 5.1. $\Phi$ is a (discontinuous) viscosity solution of the RS HJB PIDE (23) on $[0,T] \times \mathbb{R}^n$, subject to terminal condition (25).

Proof.

Outline: This proof can be decomposed in five steps. First, we define $\tilde{\Phi}$ as a log transformation of $\Phi$. In the next three steps, we prove that $\tilde{\Phi}$ is a viscosity solution of the exponentially transformed RS HJB PIDE by showing that it is 1) a viscosity subsolution; 2) a viscosity supersolution; and hence 3) a viscosity solution. Finally, applying a change of variable result, such as Proposition 2.2 in [34], we conclude that $\Phi$ is a viscosity solution of the RS HJB PIDE (23).

Step 1: Exponential Transformation

In order to prove that the value function $\Phi$ is a (discontinuous) viscosity solution of (23), we will start by proving that the exponentially transformed value function $\tilde{\Phi}$ is a (discontinuous) viscosity solution of (27).

Step 2: Viscosity Subsolution

Let $(t_0, x_0) \in Q := [0,T] \times \mathbb{R}^n$ and $u \in C^{1,2}(Q)$ satisfy

\[ 0 = (\tilde{\Phi}^* - u)(t_0, x_0) = \max_{(t,x) \in Q} \left(\tilde{\Phi}^*(t, x) - u(t, x)\right) \tag{44} \]

and hence

\[ \tilde{\Phi} \le \tilde{\Phi}^* \le u \tag{45} \]

on $Q$.

Let $(t_k, x_k)$ be a sequence in $Q$ such that

\[ \lim_{k \to \infty} (t_k, x_k) = (t_0, x_0), \qquad \lim_{k \to \infty} \tilde{\Phi}(t_k, x_k) = \tilde{\Phi}^*(t_0, x_0) \]

and define the sequence $\xi_k$ as $\xi_k := \tilde{\Phi}(t_k, x_k) - u(t_k, x_k)$. Since $u$ is of class $C^{1,2}$, $\lim_{k \to \infty} \xi_k = 0$.


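Fix $h \in \mathcal{J}$ and consider a constant control $\hat{h} = h$. Denote by $X^k$ the state process with initial data $X^k_{t_k} = x_k$ and, for $k > 0$, define the stopping time

\[ \tau_k := \inf\left\{ s > t_k : (s - t_k, X^k_s - x_k) \notin [0, \delta_k) \times \alpha B_n \right\} \]

for a given constant $\alpha > 0$ and where $B_n$ is the unit ball in $\mathbb{R}^n$ and

\[ \delta_k := \sqrt{\xi_k}\left(1 - \mathbf{1}_{\{0\}}(\xi_k)\right) + k^{-1}\mathbf{1}_{\{0\}}(\xi_k) \]

From the definition of $\tau_k$, we see that $\lim_{k \to \infty} \tau_k = t_0$.

By the Dynamic Programming Principle,

\[ \tilde{\Phi}(t_k, x_k) \le \mathbf{E}_{t_k,x_k}\left[\exp\left\{\theta\int_{t_k}^{\tau_k} g(X_s, \hat{h}_s; \theta)\,ds\right\}\tilde{\Phi}(\tau_k, X^k_{\tau_k})\right] \]

where $\mathbf{E}_{t_k,x_k}[\cdot]$ represents the expectation under the measure $\mathbb{P}$ given initial data $(t_k, x_k)$.

By inequality (45),

\[ \tilde{\Phi}(t_k, x_k) \le \mathbf{E}_{t_k,x_k}\left[\exp\left\{\theta\int_{t_k}^{\tau_k} g(X_s, \hat{h}_s)\,ds\right\}u(\tau_k, X^k_{\tau_k})\right] \]

and hence by definition of $\xi_k$,

\[ u(t_k, x_k) + \xi_k \le \mathbf{E}_{t_k,x_k}\left[\exp\left\{\theta\int_{t_k}^{\tau_k} g(X_s, \hat{h}_s)\,ds\right\}u(\tau_k, X^k_{\tau_k})\right] \]

i.e.

\[ \xi_k \le \mathbf{E}_{t_k,x_k}\left[\exp\left\{\theta\int_{t_k}^{\tau_k} g(X_s, \hat{h}_s)\,ds\right\}u(\tau_k, X^k_{\tau_k})\right] - u(t_k, x_k) \]

Define $Z_t = \theta\int_{t_k}^{t} g(X_s, \hat{h}_s)\,ds$ for $t \in [t_k, \tau_k]$, so that $Z_{t_k} = 0$; then

\[ d\left(e^{Z_s}\right) = \theta g(X_s, \hat{h}_s)\,e^{Z_s}\,ds \]

Also, by Itô,

\[ du_s = \left\{\frac{\partial u}{\partial s} + \mathcal{L}u\right\}ds + Du'\Lambda(s)\,dW_s + \int_{\mathbf{Z}} \left\{ u\left(s, X(s^-) + \xi(z)\right) - u\left(s, X(s^-)\right) \right\} N_p(ds, dz) \]

for $s \in [t_k, \tau_k]$ and where the generator $\mathcal{L}$ of the state process $X(t)$ is defined as

\[ \mathcal{L}u(t, x) := f(t, x, h; \theta)'Du + \frac{1}{2}\,\mathrm{tr}\left(\Lambda\Lambda'(t, X)D^2u\right) \tag{46} \]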


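By the Itô product rule, and since $dZ_s \cdot du_s = 0$, we get

\[ d\left(u_s e^{Z_s}\right) = u_s\,d\left(e^{Z_s}\right) + e^{Z_s}\,du_s \]

and hence for $t \in [t_k, \tau_k]$

\begin{align*}
u(t, X^k_t)e^{Z_t} = {} & u(t_k, x_k)e^{Z_{t_k}} + \theta\int_{t_k}^{t} u(s, X^k_s)\,g(X^k_s, \hat{h}_s)\,e^{Z_s}\,ds \\
& + \int_{t_k}^{t} \left(\frac{\partial u}{\partial s}(s, X^k_s) + \mathcal{L}u(s, X^k_s)\right)e^{Z_s}\,ds + \int_{t_k}^{t} e^{Z_s}Du'\Lambda(s)\,dW_s \\
& + \int_{t_k}^{t}\int_{\mathbf{Z}} e^{Z_s}\left\{ u\left(s, X^k(s^-) + \xi(z)\right) - u\left(s, X^k(s^-)\right) \right\} N_p(ds, dz)
\end{align*}

Noting that $u(t_k, x_k)e^{Z_{t_k}} = u(t_k, x_k)$ and taking the expectation with respect to the initial data $(t_k, x_k)$, we get

\[ \mathbf{E}_{t_k,x_k}\left[u(t, X_t)e^{Z_t}\right] = u(t_k, x_k)e^{Z_{t_k}} + \mathbf{E}_{t_k,x_k}\left[\int_{t_k}^{t} \left(\frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)\right)e^{Z_s}\,ds\right] \]

In particular, for $t = \tau_k$,

\begin{align*}
\xi_k & \le \mathbf{E}_{t_k,x_k}\left[u(\tau_k, X_{\tau_k})e^{Z_{\tau_k}}\right] - u(t_k, x_k)e^{Z_{t_k}} \\
& = \mathbf{E}_{t_k,x_k}\left[\int_{t_k}^{\tau_k} \left(\frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)\right)e^{Z_s}\,ds\right]
\end{align*}

and thus

\begin{align*}
\frac{\xi_k}{\delta_k} & \le \frac{1}{\delta_k}\left(\mathbf{E}_{t_k,x_k}\left[u(\tau_k, X_{\tau_k})e^{Z_{\tau_k}}\right] - u(t_k, x_k)e^{Z_{t_k}}\right) \\
& = \frac{1}{\delta_k}\,\mathbf{E}_{t_k,x_k}\left[\int_{t_k}^{\tau_k} \left(\frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)\right)e^{Z_s}\,ds\right]
\end{align*}

As $k \to \infty$, $t_k \to t_0$, $\tau_k \to t_0$, $\frac{\xi_k}{\delta_k} \to 0$ and

\begin{align*}
& \frac{1}{\delta_k}\,\mathbf{E}_{t_k,x_k}\left[\int_{t_k}^{\tau_k} \left(\frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)\right)e^{Z_s}\,ds\right] \\
& \quad \to \frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)
\end{align*}

a.s. by the Bounded Convergence Theorem, since the random variable

\[ \frac{1}{\delta_k}\int_{t_k}^{\tau_k} \left(\frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s)\right)e^{Z_s}\,ds \]

is bounded for large enough $k$.

Hence, we conclude that since $\hat{h}_s$ is arbitrary,

\[ \frac{\partial u}{\partial s}(s, X_s) + \mathcal{L}u(s, X_s) + \theta u(s, X_s)g(X_s, \hat{h}_s) \ge 0 \]

i.e.

\[ -\frac{\partial u}{\partial s}(s, X_s) - \mathcal{L}u(s, X_s) - \theta u(s, X_s)g(X_s, \hat{h}_s) \le 0 \]

This argument proves that $\tilde{\Phi}$ is a (discontinuous) viscosity subsolution of the PDE (27) on $[0,T) \times \mathbb{R}^n$ subject to terminal condition $\tilde{\Phi}(T, x) = e^{g_0(x;T)}$.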

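Step 3: Viscosity Supersolution

This step in the proof is a slight adaptation of the proof for classical control problems in Touzi [34]. Let $(t_0, x_0) \in Q$ and $u \in C^{1,2}(Q)$ satisfy

\[ 0 = (\tilde{\Phi}_* - u)(t_0, x_0) < (\tilde{\Phi}_* - u)(t, x) \quad \text{for } Q \setminus \{(t_0, x_0)\} \tag{47} \]

We intend to prove that at $(t_0, x_0)$

\[ \frac{\partial u}{\partial t}(t, x) + \inf_{h \in \mathcal{H}} \left\{ \mathcal{L}^h u(t, x) - \theta g(x, h) \right\} \le 0 \]

by contradiction. Thus, assume that

\[ \frac{\partial u}{\partial t}(t, x) + \inf_{h \in \mathcal{H}} \left\{ \mathcal{L}^h u(t, x) - \theta g(x, h) \right\} > 0 \tag{48} \]

at $(t_0, x_0)$.

Since $\mathcal{L}^h u$ is continuous, there exists an open neighbourhood $\mathcal{N}_\delta$ of $(t_0, x_0)$ defined for $\delta > 0$ as

\[ \mathcal{N}_\delta := \left\{ (t, x) : (t - t_0, x - x_0) \in (-\delta, \delta) \times \delta B_n, \text{ and (48) holds} \right\} \tag{49} \]

Note that by (47) and since $\tilde{\Phi} \ge \tilde{\Phi}_* > u$ on $Q \setminus \mathcal{N}_\delta$,

\[ \min_{Q \setminus \mathcal{N}_\delta} \left(\tilde{\Phi} - u\right) > 0 \]

For $\rho > 0$, consider the set $\mathcal{J}_\rho$ of $\rho$-optimal controls $h^\rho$ satisfying

\[ I(t_0, x_0, h^\rho) \le \tilde{\Phi}(t_0, x_0) + \rho \tag{50} \]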


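Also, let $\varepsilon > 0$, $\varepsilon \le \gamma$ be such that

\[ \min_{Q \setminus \mathcal{N}_\delta} \left(\tilde{\Phi} - u\right) \ge 3\varepsilon e^{-\delta\theta M_\delta} > 0 \tag{51} \]

where $M_\delta$ is defined as

\[ M_\delta := \max_{(t,x) \in \mathcal{N}^J_\delta,\; h \in \mathcal{J}_\rho} \left(-g(x, h), 0\right) \]

for

\[ \mathcal{N}^J_\delta := \left\{ (t, x) : (t - t_0, x - x_0) \in (-\delta, \delta) \times (\zeta + \delta)B_n \right\} \tag{52} \]

and

\[ \zeta := \max_{z \in \mathbf{Z}} \|\xi(z)\| \]

Note that $\zeta < \infty$ by boundedness of $\xi(z)$ and thus $M_\delta < \infty$.

Now let $(t_k, x_k)$ be a sequence in $\mathcal{N}_\delta$ such that

\[ \lim_{k \to \infty} (t_k, x_k) = (t_0, x_0) \]

and

\[ \lim_{k \to \infty} \tilde{\Phi}(t_k, x_k) = \tilde{\Phi}_*(t_0, x_0) \]

Since $(\tilde{\Phi} - u)(t_k, x_k) \to 0$, we can assume that the sequence $(t_k, x_k)$ satisfies

\[ |(\tilde{\Phi} - u)(t_k, x_k)| \le \varepsilon, \quad \text{for } k \ge 1 \tag{53} \]

for $\varepsilon$ defined by (51).

Consider the $\varepsilon$-optimal control $h^\varepsilon_k$, denote by $X^\varepsilon_k$ the controlled process defined by the control process $h^\varepsilon_k$ and introduce the stopping time

\[ \tau_k := \inf\left\{ s > t_k : (s, X^\varepsilon_k(s)) \notin \mathcal{N}_\delta \right\} \]

Note that since we assumed that $-\infty < \xi^{\min}_i \le \xi_i \le \xi^{\max}_i < \infty$ for $i = 1, \ldots, n$ and since $\nu$ is assumed to be bounded, then $X(\tau)$ is also finite and in particular,

\[ (\tilde{\Phi} - u)(\tau_k, X^\varepsilon_k(\tau_k)) \ge (\tilde{\Phi}_* - u)(\tau_k, X^\varepsilon_k(\tau_k)) \ge 3\varepsilon e^{-\delta\theta M_\delta} \tag{54} \]

Choose $\mathcal{N}^J_\delta$ so that $(\tau, X^\varepsilon(\tau)) \in \mathcal{N}^J_\delta$. In particular, since $X^\varepsilon(\tau)$ is finite then $\mathcal{N}^J_\delta$ can be defined to be a strict subset of $Q$ and we can effectively use the local boundedness of $g$ to establish $M_\delta$.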


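Let $Z(t_k) = \theta\int_{t_k}^{\tau_k} g(X^\varepsilon_s, h^\varepsilon_s)\,ds$. Since $\tilde{\Phi} \ge \tilde{\Phi}_*$ and by (53) and (54),

\begin{align*}
& \tilde{\Phi}(\tau_k, X^\varepsilon_k(\tau_k))e^{Z(\tau_k)} - \tilde{\Phi}(t_k, x_k)e^{Z(t_k)} \\
& \quad \ge u(\tau_k, X^\varepsilon_k(\tau_k))e^{Z(\tau_k)} - u(t_k, x_k)e^{Z(t_k)} + 3\varepsilon e^{-\delta\theta M_\delta}e^{Z(\tau_k)} - \varepsilon \\
& \quad \ge \int_{t_k}^{\tau_k} d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right) + 2\varepsilon
\end{align*}

i.e.

\[ \tilde{\Phi}(t_k, x_k) \le \tilde{\Phi}(\tau_k, X^\varepsilon_k(\tau_k))e^{Z(\tau_k)} - \int_{t_k}^{\tau_k} d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right) - 2\varepsilon \]

Taking expectation with respect to the initial data $(t_k, x_k)$,

\[ \tilde{\Phi}(t_k, x_k) \le \mathbf{E}_{t_k,x_k}\left[\tilde{\Phi}(\tau_k, X^\varepsilon_k(\tau_k))e^{Z(\tau_k)} - \int_{t_k}^{\tau_k} d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right)\right] - 2\varepsilon \]

Note that by the Itô product rule,

\[ d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right) = u_s\,d\left(e^{Z_s}\right) + e^{Z_s}\,du_s = \frac{\partial u}{\partial t}(t, x) + \mathcal{L}^h u(t, x) + \theta g(x, h) \]

Since we assumed that

\[ -\frac{\partial u}{\partial t}(t, x) - \mathcal{L}^h u(t, x) - \theta g(x, h) < 0 \]

then

\[ \int_{t_k}^{\tau_k} d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right) > 0 \]

and therefore

\begin{align*}
\tilde{\Phi}(t_k, x_k) & \le \mathbf{E}_{t_k,x_k}\left[\tilde{\Phi}(\tau_k, X^\varepsilon_k(\tau_k))e^{Z(\tau_k)} - \int_{t_k}^{\tau_k} d\left(u(s, X^\varepsilon_k(s))e^{Z_s}\right)\right] - 2\varepsilon \\
& \le -2\varepsilon + \mathbf{E}\left[\exp\left\{\theta\int_{t_k}^{\tau_k} g(X_s, h^\varepsilon_k(s))\,ds\right\}\tilde{\Phi}(\tau_k, X^\varepsilon_k(\tau_k))\right] \\
& \le -2\varepsilon + I(t_k, x_k, h^\varepsilon_k) \\
& \le \tilde{\Phi}(t_k, x_k) - \varepsilon
\end{align*}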


where the third inequality follows from the Dynamic Programming Principle and the last inequality follows from the definition of $\varepsilon$-optimal controls (see equation (50)).

Hence, equation (48),

\[ \frac{\partial u}{\partial t}(t, x) + \inf_{h \in \mathcal{H}} \left\{ \mathcal{L}^h u(t, x) - \theta g(x, h) \right\} > 0 \]

is false and we have shown that

\[ \frac{\partial u}{\partial t}(t, x) + \inf_{h \in \mathcal{H}} \left\{ \mathcal{L}^h u(t, x) - \theta g(x, h) \right\} \le 0 \]

This argument therefore proves that $\tilde{\Phi}$ is a (discontinuous) viscosity supersolution of the PDE (27) on $[0,T) \times \mathbb{R}^n$ subject to terminal condition $\tilde{\Phi}(T, x) = e^{g_0(x;T)}$.

Step 4: Viscosity Solution

Since $\tilde{\Phi}$ is both a (discontinuous) viscosity subsolution and a supersolution of (27), it is a (discontinuous) viscosity solution.

Step 5: Conclusion

Since by assumption $\Phi$ is locally bounded, so is $\tilde{\Phi}$. In addition, $\varphi(x) = e^{-\theta x}$ is of class $C^1(\mathbb{R})$. Also we note that $\frac{d\varphi}{dx} < 0$. By the change of variable property (see for example Proposition 2.2 in Touzi [34]), we see that

1. since $\tilde{\Phi}$ is a (discontinuous) viscosity subsolution of (27), $\Phi = \varphi^{-1} \circ \tilde{\Phi}$ is a (discontinuous) viscosity supersolution of (23);

2. since $\tilde{\Phi}$ is a (discontinuous) viscosity supersolution of (27), $\Phi = \varphi^{-1} \circ \tilde{\Phi}$ is a (discontinuous) viscosity subsolution of (23)

and therefore $\Phi$ is a (discontinuous) viscosity solution of (23) on $[0,T) \times \mathbb{R}^n$ subject to terminal condition $\Phi(T, x) = e^{g_0(x;T)}$.

We also note the following corollary:

Corollary 5.1.

(i) $\Phi^*$ is an upper semicontinuous viscosity subsolution, and;

(ii) $\Phi_*$ is a lower semicontinuous viscosity supersolution of the RS HJB PIDE (23) on $[0,T] \times \mathbb{R}^n$, subject to terminal condition (25).

As a result of this corollary, we note that $\Phi^*$, $\Phi_*$ and $\Phi$ are respectively a viscosity subsolution, supersolution, and solution in the sense of Definitions 5.4 and 5.5.


6. Comparison Result

Once we have characterized the class of viscosity solutions associated with a given problem, the next task is to prove that the problem actually admits a unique viscosity solution by establishing a comparison theorem. Comparison theorems are the cornerstone of the application of viscosity theory. Their main use is to prove uniqueness, and in our case continuity, of the viscosity solution. Although a set of, by now fairly standard, techniques can be applied in the proof, the comparison theorem per se is generally customized to address both the specificities of the PDE and the requirements of the general problem.

We face three main difficulties in establishing a comparison result for our risk-sensitive control problem. The first obstacle is the behaviour of the value function $\Phi$ at infinity. In the pure diffusion case or LEQR case solved by Kuroda and Nagai [31], the value function is quadratic in the state and is therefore not bounded for $x \in \mathbb{R}^n$. Consequently, there is no reason to expect the solution to the integro-differential RS HJB PIDE (23) to be bounded. The second hurdle is the presence of an extra non-linearity: the quadratic growth term $(D\Phi)'\Lambda\Lambda'D\Phi$. This extra non-linearity could, in particular, increase the complexity of the derivation of a comparison result for an unbounded value function. Before dealing with the asymptotic growth condition we will therefore need to address this non-linear term. The traditional solution, an exponential change of variable such as the one proposed by Duffie and Lions [22], is equivalent to the log transformation we used to derive the RS HJB PIDE and again to prove that the value function is a viscosity solution of the RS HJB PIDE. However, the drawback of this method is that, by creating a new zeroth order term equal to the solution multiplied by the cost function $g$, it imposes a severe restriction on $g$ for the PDE to satisfy the monotonicity property required to talk about viscosity solutions. The final difficulty lies in the presence of the jump term and of the compensator $\nu$. If we assume that the measure is finite, this can be addressed following the general argument proposed by Alvarez and Tourin [1] and Amadori [2].

To address these difficulties, we will need to adopt a slightly different strategy from the classical argument used to prove comparison results as set out in Crandall, Ishii and Lions [17]. In particular, we will exploit the properties of the exponentially transformed value function $\tilde{\Phi}$ resulting from Assumption 4.2 and alternate between the log transformed RS HJB PIDE and the quadratic growth RS HJB PIDE (23) through the proof.

Theorem 6.1. Let $\tilde{u} = e^{-\theta v} \in USC([0,T] \times \mathbb{R}^n)$ be a bounded from above viscosity subsolution of (23) and $\tilde{v} = e^{-\theta u} \in LSC([0,T] \times \mathbb{R}^n)$ be a bounded from below viscosity supersolution of (23). If the measure $\nu$ is bounded and Assumption 4.2 holds then

\[ u \le v \quad \text{on } [0,T] \times \mathbb{R}^n \]


Proof outline: This proof can be decomposed in seven steps. In the first step, we perform the usual exponential transformation to rewrite the problem for the value function $\Phi$ into a problem for the value function $\tilde{\Phi}$. The rest of the proof is done by contradiction. In Step 2, we state the assumption we are planning to disprove. The properties of the value function $\tilde{\Phi}$ related to Assumption 4.2 are used in Step 3 to deduce that it is enough to prove the comparison result for $\tilde{\Phi}$ on a bounded state space to reach our conclusion. We then double variables in Step 4 before finding moduli of continuity for the diffusion and the jump components respectively in Steps 5 and 6. Finally, we reach a contradiction in Step 7 and conclude the proof.

Step 1: Exponential Transformation

Let $u \in USC([0,T] \times \mathbb{R}^n)$ be a viscosity subsolution of (23) and $v \in LSC([0,T] \times \mathbb{R}^n)$ be a viscosity supersolution of (23). Define:

\[ \tilde{u} := e^{-\theta v}, \qquad \tilde{v} := e^{-\theta u} \]

By the change of variable property (see for example Proposition 2.2 in Touzi [34]), $\tilde{u}$ and $\tilde{v}$ are respectively a viscosity subsolution and a viscosity supersolution of the RS HJB PIDE (27) for the exponentially transformed value function $\tilde{\Phi}$.

Thus, to prove that

\[ u \le v \quad \text{on } [0,T] \times \mathbb{R}^n \]

it is sufficient to prove that

\[ \tilde{u} \le \tilde{v} \quad \text{on } [0,T] \times \mathbb{R}^n \]

Step 2: Setting the Problem

As is usual in the derivation of comparison results, we argue by contradiction and assume that

\[ \sup_{(t,x) \in [0,T] \times \mathbb{R}^n} \left[\tilde{u}(t, x) - \tilde{v}(t, x)\right] > 0 \tag{55} \]

Step 3: Taking the Behaviour of the Value Function into Consideration

The assertion of this theorem is that the comparison result holds in the class of functions satisfying Assumption 4.2. As a result Proposition 4.3 holds and we can concentrate our analysis on subsolutions and supersolutions sharing the same growth properties as the exponentially transformed value function $\tilde{\Phi}$. By Propositions 4.3 and 4.2,

\[ 0 < \tilde{u}(t, x) \le e^{\alpha^k(t) + \beta^{k\prime}(t)x} \quad \forall (t, x) \in [0,T] \times \mathbb{R}^n \]

\[ 0 < \tilde{v}(t, x) \le e^{\alpha^k(t) + \beta^{k\prime}(t)x} \quad \forall (t, x) \in [0,T] \times \mathbb{R}^n \]

and

\[ \lim_{|x| \to \infty} \tilde{u}(t, x) = \lim_{|x| \to \infty} \tilde{v}(t, x) = 0 \quad \forall t \in [0,T] \tag{56} \]

for $k = 1, \ldots, 2n$ where $\alpha^k$ and $\beta^k$ are the functions given in Assumption 4.2.

Since (56) holds at an exponential rate, then by Assumption (55) there exists $R > 0$ such that

\[ \sup_{(t,x) \in [0,T] \times \mathbb{R}^n} \left[\tilde{u}(t, x) - \tilde{v}(t, x)\right] = \sup_{(t,x) \in [0,T] \times B_R} \left[\tilde{u}(t, x) - \tilde{v}(t, x)\right] \]

Hence, it is enough to show a contradiction with respect to the hypothesis

\[ \sup_{(t,x) \in Q} \left[\tilde{u}(t, x) - \tilde{v}(t, x)\right] > 0 \tag{57} \]

established on the set $Q := [0,T] \times B_R$. Before proceeding to the next step, we restate assumption (57) in terms of $u$ and $v$ as

\[ \sup_{(t,x) \in Q} \left[u(t, x) - v(t, x)\right] > 0 \tag{58} \]

Step 4: Doubling of Variables on the Set Q

Let $\eta > 0$ be such that

$$N := \sup_{(t,x) \in Q} \left[ u(t,x) - v(t,x) - \varphi(t) \right] > 0$$

where $\varphi(t) := \eta / t$.

We will now double variables, a technique commonly used in the viscosity solutions literature (see e.g. Crandall, Ishii and Lions [17]). Consider a global maximum point $(t_\varepsilon, x_\varepsilon, y_\varepsilon) \in (0,T] \times B_R \times B_R =: Q_d$ of

$$u(t,x) - v(t,y) - \varphi(t) - \varepsilon |x - y|^2$$

and define

$$N_\varepsilon := \sup_{(t,x,y) \in Q_d} \left[ u(t,x) - v(t,y) - \varphi(t) - \varepsilon |x - y|^2 \right] > 0$$

Note that $N_\varepsilon > 0$ for $\varepsilon$ large enough. Moreover, $N_\varepsilon \ge N$ and $N_\varepsilon \downarrow 0$ as $\varepsilon \to \infty$.

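It is well established (see Lemma 3.1 and Proposition 3.7 in [17]) that along a subsequence

$$\lim_{\varepsilon \to \infty} (t_\varepsilon, x_\varepsilon, y_\varepsilon) = (t, x, x)$$

for some $(t,x) \in [0,T] \times \mathbb{R}^n$ which is a maximum point of

$$u(t,x) - v(t,x) - \varphi(t)$$

Via the same argument, we also have

$$\lim_{\varepsilon \to \infty} \varepsilon |x_\varepsilon - y_\varepsilon|^2 = 0$$

as well as

$$\lim_{\varepsilon \to \infty} u(t_\varepsilon, x_\varepsilon) = u(t,x)$$

and

$$\lim_{\varepsilon \to \infty} v(t_\varepsilon, x_\varepsilon) = v(t,x)$$

In addition, we note that

$$\lim_{\varepsilon \to \infty} N_\varepsilon = N$$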

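Applying Theorem 8.3 in Crandall, Ishii and Lions [17] at $(t_\varepsilon, x_\varepsilon, y_\varepsilon)$, we see that there exist $a_\varepsilon, b_\varepsilon \in \mathbb{R}$ and $A_\varepsilon, B_\varepsilon \in S^n$ such that

$$(a_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) \in \mathcal{P}^{2,+} u$$

$$(b_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) \in \mathcal{P}^{2,-} v$$

$$a_\varepsilon - b_\varepsilon = \varphi'(t_\varepsilon)$$

and

$$-3\varepsilon \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} \le \begin{bmatrix} A_\varepsilon & 0 \\ 0 & -B_\varepsilon \end{bmatrix} \le 3\varepsilon \begin{bmatrix} I & -I \\ -I & I \end{bmatrix}$$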

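Thus, we have for the subsolution $u$

$$-a_\varepsilon + F(x_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) + \int_Z \left\{ \left( e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon)\right)} - 1 \right) + \varepsilon \xi'(z)(x_\varepsilon - y_\varepsilon) \right\} \nu(dz) \le 0$$

and for the supersolution $v$,

$$-b_\varepsilon + F(y_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) + \int_Z \left\{ \left( e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - 1 \right) + \varepsilon \xi'(z)(x_\varepsilon - y_\varepsilon) \right\} \nu(dz) \ge 0$$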

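Subtracting these two inequalities,

$$\begin{aligned}
-\varphi'(t_\varepsilon) &= b_\varepsilon - a_\varepsilon \\
&\le F(y_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) - F(x_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) \\
&\quad + \int_Z \left\{ \left( e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - 1 \right) + \varepsilon \xi'(z)(x_\varepsilon - y_\varepsilon) \right\} \nu(dz) \\
&\quad - \int_Z \left\{ \left( e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon)\right)} - 1 \right) + \varepsilon \xi'(z)(x_\varepsilon - y_\varepsilon) \right\} \nu(dz) \\
&= F(y_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) - F(x_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) \\
&\quad + \frac{1}{\theta} \int_Z e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} \nu(dz) - \frac{1}{\theta} \int_Z e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon)\right)} \nu(dz)
\end{aligned} \tag{59}$$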

Step 5: Modulus of Continuity

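In this step, we focus on the (diffusion) operator $F$.

$$\begin{aligned}
& F(y_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) - F(x_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) \\
&= \sup_{h \in J} \left\{ \varepsilon f(t_\varepsilon, y_\varepsilon, h)'(x_\varepsilon - y_\varepsilon) + \tfrac{1}{2} \mathrm{tr}\left(\Lambda\Lambda' B_\varepsilon\right) - \tfrac{\theta}{2} \varepsilon^2 (x_\varepsilon - y_\varepsilon)' \Lambda\Lambda' (x_\varepsilon - y_\varepsilon) - g(y_\varepsilon, h) \right\} \\
&\quad - \sup_{h \in J} \left\{ \varepsilon f(t_\varepsilon, x_\varepsilon, h)'(x_\varepsilon - y_\varepsilon) + \tfrac{1}{2} \mathrm{tr}\left(\Lambda\Lambda' A_\varepsilon + \delta I_n\right) - \tfrac{\theta}{2} \varepsilon^2 (x_\varepsilon - y_\varepsilon)' \Lambda\Lambda' (x_\varepsilon - y_\varepsilon) - g(x_\varepsilon, h) \right\} \\
&\le \tfrac{1}{2} \left| \mathrm{tr}\left(\Lambda\Lambda' B_\varepsilon - \Lambda\Lambda' A_\varepsilon\right) \right| + \sup_{h \in J} \varepsilon \left| f(t_\varepsilon, y_\varepsilon, h) - f(t_\varepsilon, x_\varepsilon, h) \right| \left| x_\varepsilon - y_\varepsilon \right| + \sup_{h \in J} \left| g(x_\varepsilon, h) - g(y_\varepsilon, h) \right| \\
&\le \tfrac{1}{2} \left| \mathrm{tr}\left(\Lambda\Lambda' A_\varepsilon - \Lambda\Lambda' B_\varepsilon\right) \right| + \sup_{h \in J} \varepsilon \left| f(t_\varepsilon, y_\varepsilon, h) - f(t_\varepsilon, x_\varepsilon, h) \right| \left| x_\varepsilon - y_\varepsilon \right| + \sup_{h \in J} \left| g(x_\varepsilon, h) - g(y_\varepsilon, h) \right|
\end{aligned}$$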

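Note that the functional $f$ defined in (19) satisfies

$$\left| f(t_\varepsilon, y_\varepsilon, h) - f(t_\varepsilon, x_\varepsilon, h) \right| \le C_f \left| y_\varepsilon - x_\varepsilon \right|$$

for some constant $C_f > 0$. In addition,

$$\mathrm{tr}\left( \Lambda\Lambda' A_\varepsilon - \Lambda\Lambda' B_\varepsilon \right) = \mathrm{tr}\left( \begin{bmatrix} \Lambda\Lambda' & \Lambda\Lambda' \\ \Lambda\Lambda' & \Lambda\Lambda' \end{bmatrix} \begin{bmatrix} A_\varepsilon & 0 \\ 0 & -B_\varepsilon \end{bmatrix} \right) \le 3\varepsilon \, \mathrm{tr}\left( \begin{bmatrix} \Lambda\Lambda' & \Lambda\Lambda' \\ \Lambda\Lambda' & \Lambda\Lambda' \end{bmatrix} \begin{bmatrix} I & -I \\ -I & I \end{bmatrix} \right) = 0$$

Finally, by definition of $g$,

$$\left| g(y_\varepsilon, h) - g(x_\varepsilon, h) \right| \le C_g \left| y_\varepsilon - x_\varepsilon \right|$$

for some constant $C_g > 0$. Combining these estimates, we get

$$F(y_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), B_\varepsilon) - F(x_\varepsilon, \varepsilon(x_\varepsilon - y_\varepsilon), A_\varepsilon) \le \omega\left( \varepsilon |y_\varepsilon - x_\varepsilon|^2 + |y_\varepsilon - x_\varepsilon| \right) \tag{60}$$

for a function $\omega(\zeta) = C\zeta$, with $C = \max\left[ C_f, C_g \right]$. The function $\omega : [0,\infty) \to [0,\infty)$, which satisfies the condition $\omega(0+) = 0$, is called a modulus of continuity.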
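The trace estimate in this step rests on two pieces of block-matrix algebra: the block product reproduces $\mathrm{tr}(\Lambda\Lambda' A_\varepsilon) - \mathrm{tr}(\Lambda\Lambda' B_\varepsilon)$, and the trace against the matrix with blocks $I, -I, -I, I$ vanishes. The Python sketch below checks both identities numerically. The helper names, the dimension `n = 3` and the randomly generated matrices are illustrative assumptions, not part of the original proof.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def scale(a, c):
    return [[c * x for x in row] for row in a]


def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 2n x 2n matrix from four n x n blocks."""
    top = [r1 + r2 for r1, r2 in zip(top_left, top_right)]
    bottom = [r1 + r2 for r1, r2 in zip(bottom_left, bottom_right)]
    return top + bottom


def random_symmetric(n, rng):
    m = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(n)] for i in range(n)]


rng = random.Random(0)
n = 3  # illustrative dimension (assumption)
lam = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
sigma = matmul(lam, transpose(lam))          # plays the role of Lambda Lambda'
a_eps = random_symmetric(n, rng)             # stands in for A_eps
b_eps = random_symmetric(n, rng)             # stands in for B_eps
eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
zero = [[0.0] * n for _ in range(n)]

big_sigma = block(sigma, sigma, sigma, sigma)

# Identity 1: tr(S A) - tr(S B) equals the trace of the block product.
lhs = trace(matmul(sigma, a_eps)) - trace(matmul(sigma, b_eps))
rhs = trace(matmul(big_sigma, block(a_eps, zero, zero, scale(b_eps, -1.0))))

# Identity 2: the trace against [[I, -I], [-I, I]] is zero.
bound = trace(matmul(big_sigma,
                     block(eye, scale(eye, -1.0), scale(eye, -1.0), eye)))

print(lhs - rhs, bound)
```

The inequality between the two traces additionally uses the fact that the block matrix built from $\Lambda\Lambda'$ is positive semidefinite, so that taking the trace against it preserves the matrix ordering supplied by Theorem 8.3.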

Step 6: The Jump Term

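We now consider the jump term

$$\begin{aligned}
& \int_Z \left\{ e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon)\right)} \right\} \nu(dz) \\
&\quad = \frac{1}{\theta} \int_Z \left\{ e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon) + v(t_\varepsilon, x_\varepsilon) - v(t_\varepsilon, x_\varepsilon)\right)} \right\} \nu(dz)
\end{aligned} \tag{61}$$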

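Since for $\varepsilon > 0$ large enough, $u(t,x) - v(t,y) \ge 0$, then

$$u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon) + v(t_\varepsilon, y_\varepsilon) - v(t_\varepsilon, y_\varepsilon + \xi(z)) \le -\left( u(t_\varepsilon, x_\varepsilon) - v(t_\varepsilon, y_\varepsilon) \right) + N$$

by definition of $N$. Moreover, since $N_\varepsilon = \sup_{(t,x,y) \in Q_d} \left[ u(t,x) - v(t,y) - \varphi(t) - \varepsilon |x - y|^2 \right] > 0$, then $N_\varepsilon \le u(t_\varepsilon, x_\varepsilon) - v(t_\varepsilon, y_\varepsilon)$ and therefore

$$u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon) + v(t_\varepsilon, y_\varepsilon) - v(t_\varepsilon, y_\varepsilon + \xi(z)) \le N - N_\varepsilon$$

for $z \in Z$. Thus,

$$e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon) + v(t_\varepsilon, y_\varepsilon) - v(t_\varepsilon, y_\varepsilon)\right)} \ge e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon) + N - N_\varepsilon\right)}$$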

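and equation (61) can be bounded from above by:

$$\begin{aligned}
& \int_Z \left\{ e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - e^{-\theta\left(u(t_\varepsilon, x_\varepsilon + \xi(z)) - u(t_\varepsilon, x_\varepsilon) + v(t_\varepsilon, x_\varepsilon) - v(t_\varepsilon, x_\varepsilon)\right)} \right\} \nu(dz) \\
&\le \frac{1}{\theta} \int_Z \left\{ e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} - e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon) + N - N_\varepsilon\right)} \right\} \nu(dz) \\
&= \frac{1}{\theta} \int_Z e^{-\theta\left(v(t_\varepsilon, y_\varepsilon + \xi(z)) - v(t_\varepsilon, y_\varepsilon)\right)} \left( 1 - e^{-\theta(N - N_\varepsilon)} \right) \nu(dz) \\
&= \frac{1}{\theta} \int_Z e^{-\theta\left(-\frac{1}{\theta}\left[ \ln \tilde{v}(t_\varepsilon, y_\varepsilon + \xi(z)) - \ln \tilde{v}(t_\varepsilon, y_\varepsilon) \right]\right)} \left( 1 - e^{-\theta(N - N_\varepsilon)} \right) \nu(dz) \\
&= \frac{1}{\theta} \int_Z \frac{\tilde{v}(t_\varepsilon, y_\varepsilon + \xi(z))}{\tilde{v}(t_\varepsilon, y_\varepsilon)} \left( 1 - e^{-\theta(N - N_\varepsilon)} \right) \nu(dz)
\end{aligned} \tag{62}$$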

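By Proposition 4.2 and since $\tilde{v}$ is LSC, there exists $\lambda > 0$ such that $0 < \lambda \le \tilde{v}(t,x) \le C_\Phi$ for all $(t,x) \in Q$. As a result,

$$\frac{\tilde{v}(t_\varepsilon, y_\varepsilon + \xi(z))}{\tilde{v}(t_\varepsilon, y_\varepsilon)} \le K$$

for some constant $K > 0$. In addition, since the measure $\nu$ is assumed to be finite and the function $\zeta \mapsto e^{\zeta}$ is continuous, we can establish the following upper bound for the right-hand side of (62):

$$\begin{aligned}
\int_Z \frac{\tilde{v}(t_\varepsilon, y_\varepsilon + \xi(z))}{\tilde{v}(t_\varepsilon, y_\varepsilon)} \left( 1 - e^{-\theta(N - N_\varepsilon)} \right) \nu(dz)
&\le \frac{K}{\theta} \int_Z \left( 1 - e^{-\theta(N - N_\varepsilon)} \right) \nu(dz) \\
&\le \omega_R(N - N_\varepsilon) \sup_{(t,y) \in [0,T] \times \mathbb{R}^n} \nu(Z)
\end{aligned} \tag{63}$$

for some modulus of continuity $\omega_R$ related to the function $\zeta \mapsto 1 - e^{\zeta}$ and parameterized by the radius $R > 0$ of the ball $B_R$ introduced in Step 3. Note that this parametrization is implicitly due to the dependence of $N$ and $N_\varepsilon$ on $R$. The term $\sup_{(t,y) \in [0,T] \times \mathbb{R}^n} \nu(Z)$ is the upper bound for the measure $\nu$.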

Step 7: Conclusion

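We now substitute the upper bounds obtained in inequalities (60) and (63) into (59) to obtain:

$$-\varphi'(t_\varepsilon) \le \omega\left( \varepsilon |y_\varepsilon - x_\varepsilon|^2 + |y_\varepsilon - x_\varepsilon| \right) + \omega_R(N - N_\varepsilon) \sup_{(t,x) \in [0,T] \times \mathbb{R}^n} \nu(Z) \tag{64}$$

Taking the limit superior in inequality (64) as $\varepsilon \to \infty$ and recalling that

1. the measure $\nu$ is finite;

2. $\xi_i(z)$, $i = 1, \ldots, m$, is bounded $\forall z \in Z$ a.s. $d\nu$

we see that

$$\nu(Z) < \infty$$

Then

$$\lim_{\varepsilon \to \infty} \omega_R(N - N_\varepsilon) \, \nu(Z) = 0$$

which leads to the contradiction

$$-\varphi'(t) = \frac{\eta}{t^2} \le 0$$

which is impossible because $\eta > 0$. We conclude from this that Assumption (58) is false and therefore

$$\sup_{(t,x) \in Q} \left[ v(t,x) - u(t,x) \right] \ge 0 \tag{65}$$

Stated differently, we conclude that

$$u \le v \quad \text{on } [0,T] \times \mathbb{R}^n$$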

6.1 Uniqueness

Uniqueness is a direct consequence of Theorem 6.1. Another important corollary is the fact that the (discontinuous) locally bounded viscosity solution $\Phi$ is in fact continuous on $[0,T] \times \mathbb{R}^n$.

Corollary 6.1. The function $\Phi(t,x)$ defined on $[0,T] \times \mathbb{R}^n$ is the unique continuous viscosity solution of the RS HJB PIDE (23) subject to terminal condition (25).

Proof. Uniqueness is a standard by-product of Theorem 6.1. Continuity can be proved as follows. By definition of the upper and lower semicontinuous envelopes, recall that

$$\Phi_* \le \Phi \le \Phi^*$$

By Corollary 5.1, $\Phi_*$ and $\Phi^*$ are respectively a semicontinuous supersolution and subsolution of the RS HJB PIDE (23) subject to terminal condition (25).

A consequence of Theorem 6.1 is that

$$\Phi_* \ge \Phi^*$$

and hence $\Phi_* = \Phi^*$ is a continuous viscosity solution of the RS HJB PIDE (23) subject to terminal condition (25).

Hence, $\Phi = \Phi_* = \Phi^*$ and it is the unique continuous viscosity solution of the RS HJB PIDE (23) subject to terminal condition (25).

Now that we have proved uniqueness and continuity of the viscosity solution $\Phi$ to the RS HJB PIDE (23) subject to terminal condition (25), we can deduce that the RS HJB PIDE (27) subject to terminal condition (28) also has a unique continuous viscosity solution. We formalize the uniqueness and continuity of $\tilde{\Phi}$ in the following corollary:

Corollary 6.2. The function $\tilde{\Phi}(t,x)$ defined on $[0,T] \times \mathbb{R}^n$ is the unique continuous viscosity solution of the RS HJB PIDE (27) subject to terminal condition (28).

7. Conclusion

In this chapter, we considered a risk-sensitive asset management model with assets and factors modelled using affine jump-diffusion processes. This apparently simple setting conceals a number of difficulties, such as the unboundedness of the instantaneous reward function $g$ and the high nonlinearity of the HJB PIDE, which make the existence of a classical $C^{1,2}$ solution unlikely barring the introduction of significant assumptions. As a result, we considered a wider class of weak solutions, namely viscosity solutions. We proved that the value function of a class of risk-sensitive control problems is a viscosity solution of the associated HJB PIDE, and established uniqueness by proving a non-standard comparison result. The viscosity approach has proved remarkably useful at solving difficult control problems for which the classical approach may fail. However, it is limited by the fact that it only provides continuity of the value function and by its focus on the PDE in relative isolation from the actual optimization problem. The question is where to go from here. A possible avenue of research would be to look for a method to establish smoothness of the value function, for example through a connection between viscosity solutions and classical solutions. Achieving this objective may also require changes to the analytic setting in order to remove some of the difficulties inherent in manipulating unbounded functions.

References

1. O. Alvarez and A. Tourin. Viscosity solutions of nonlinear integro-differential equations. Annales de l’Institut Henri Poincare - Analyse Non Lineaire, 13(3):293–317, 1996.

2. A. L. Amadori. The obstacle problem for nonlinear integro-differential operators arising in option pricing. Quaderno IAC Q21-000, 2000.

3. A. L. Amadori. Nonlinear integro-differential evolution problems arising in option pricing: a viscosity solutions approach. Journal of Differential and Integral Equations, 16(7):787–811, 2003.

4. A. L. Amadori, K. H. Karlsen, and C. La Chioma. Non-linear degenerate integro-partial differential evolution equations related to geometric Levy processes and applications to backward stochastic differential equations. Stochastics An International Journal of Probability and Stochastic Processes, 76(2):147–177, 2004.

5. G. Barles. Solutions de viscosite et equations elliptiques du deuxieme ordre. http://www.phys.univ-tours.fr/~barles/Toulcours.pdf, 1997. Universite de Tours.

6. G. Barles, R. Buckdahn, and E. Pardoux. Backward stochastic differential equations and integral-partial differential equations. Stochastics An International Journal of Probability and Stochastic Processes, 60(1):57–83, 1997.

7. G. Barles and C. Imbert. Second-order elliptic integro-differential equations: Viscosity solutions’ theory revisited. Annales de l’Institut Henri Poincare, 25(3):567–585, 2008.

8. G. Barles and P. E. Souganidis. Convergence of approximation schemes for fully nonlinear second order equations. Journal of Asymptotic Analysis, 4:271–283, 1991.

9. A. Bensoussan and J. H. Van Schuppen. Optimal control of partially observable stochastic systems with an exponential-of-integral performance index. SIAM Journal on Control and Optimization, 23(4):599–613, 1985.

10. T. R. Bielecki, D. Hernandez-Hernandez, and S. R. Pliska. Recent Developments in Mathematical Finance, chapter Risk sensitive Asset Management with Constrained Trading Strategies, pages 127–138. World Scientific, Singapore, 2002.

11. T. R. Bielecki and S. R. Pliska. Risk-sensitive dynamic asset management. Applied Mathematics and Optimization, 39:337–360, 1999.

12. T. R. Bielecki and S. R. Pliska. Risk sensitive asset management with transaction costs. Finance and Stochastics, 4:1–33, 2000.

13. T. R. Bielecki and S. R. Pliska. Economic properties of the risk sensitive criterion for portfolio management. The Review of Accounting and Finance, 2(2):3–17, 2003.

14. T. R. Bielecki and S. R. Pliska. Risk sensitive intertemporal CAPM. IEEE Transactions on Automatic Control, 49(3):420–432, March 2004.

15. T. R. Bielecki, S. R. Pliska, and S. J. Sheu. Risk sensitive portfolio management with Cox-Ingersoll-Ross interest rates: the HJB equation. SIAM Journal of Control and Optimization, 44:1811–1843, 2005.

16. F. Black. Capital market equilibrium with restricted borrowing. Journal of Business, 45(1):445–454, 1972.

17. M. Crandall, H. Ishii, and P.-L. Lions. User’s guide to viscosity solutions of second order partial differential equations. Bulletin of the American Mathematical Society, 27(1):1–67, July 1992.

18. M. H. A. Davis and S. Lleo. Risk-sensitive benchmarked asset management. Quantitative Finance, 8(4):415–426, June 2008.

19. M. H. A. Davis and S. Lleo. Jump-diffusion risk-sensitive asset management. Submitted to the SIAM Journal on Financial Mathematics, 2009. http://arxiv.org/abs/0905.4740v1.

20. M. H. A. Davis and S. Lleo. The Kelly Capital Growth Investment Criterion: Theory and Practice, chapter Fractional Kelly Strategies for Benchmarked Asset Management. World Scientific, forthcoming.

21. D. Duffie, D. Filipovic, and W. Schachermayer. Affine processes and applications in finance. Annals of Applied Probability, 13:984–1053, 2003.

22. D. Duffie and P.-L. Lions. PDE solutions of stochastic differential utility. Journal of Mathematical Economics, 21(6):577–606, 1992.

23. D. Duffie, J. Pan, and K. Singleton. Transform analysis and asset pricing for affine jump-diffusions. Econometrica, 68(6):1343–1376, 2000.

24. D. Duffie and K. J. Singleton. Credit Risk: Pricing, Measurement and Management. Princeton University Press, 2003.

25. W. H. Fleming. Mathematical Finance, volume 65 of The IMA volumes in mathematics and its applications, chapter Optimal Investment Models and Risk-Sensitive Stochastic Control, pages 75–88. Springer-Verlag, New York, 1995.

26. W. H. Fleming and H. M. Soner. Controlled Markov Processes and Viscosity Solutions, volume 24 of Stochastic Modeling and Applied Probability. Springer-Verlag, 2 edition, 2006.

27. N. Ikeda and S. Watanabe. Stochastic Differential Equations and Diffusion Processes. North-Holland Publishing Company, 1981.

28. D. H. Jacobson. Optimal stochastic linear systems with exponential criteria and their relation to deterministic differential games. IEEE Transactions on Automatic Control, 18(2):114–131, 1973.

29. E. R. Jakobsen and K. H. Karlsen. A “maximum principle for semicontinuous functions” applicable to integro-partial differential equations. Nonlinear Differential Equations and Applications, 13:137–165, 2006.

30. B. Øksendal and A. Sulem. Applied Stochastic Control of Jump Diffusions. Springer, 2005.

31. K. Kuroda and H. Nagai. Risk-sensitive portfolio optimization on infinite time horizon. Stochastics and Stochastics Reports, 73:309–331, 2002.

32. M. Lefebvre and P. Montulet. Risk-sensitive optimal investment policy. International Journal of Systems Science, 22:183–192, 1994.

33. H. Pham. Optimal stopping of controlled jump diffusion processes: A viscosity solution approach. Journal of Mathematical Systems, Estimation and Control, 8(1):1–27, 1998.

34. N. Touzi. Stochastic control and application to finance. http://www.cmap.polytechnique.fr/~touzi/pise02.pdf, 2002. Special Research Semester on Financial Mathematics, Scuola Normale Superiore, Pisa, April 29–July 15 2002.

35. P. Whittle. Risk Sensitive Optimal Control. John Wiley & Sons, New York, 1990.

May 3, 2010 12:27 Proceedings Trim Size: 9in x 6in 002

Small-Sample Estimation of Models of Portfolio Credit Risk ∗

Michael B. Gordy and Erik Heitfield

Federal Reserve Board, Washington, DC 20551, USA
E-mail: [email protected] and [email protected]

This paper explores the small sample properties of the most commonly used estimators of ratings-based portfolio credit models. We consider both method of moments and maximum likelihood estimators, and show that unrestricted estimators are subject to large biases in realistic sample sizes. We demonstrate large potential gains in precision and bias-reduction from imposing parametric restrictions across rating buckets. The restrictions we consider are based on economically meaningful hypotheses on the structure of systematic risk.

Keywords: Portfolio credit risk, maximum likelihood, method of moments, small sample bias.

1. Introduction

Models of portfolio credit risk have widespread application in bank risk-management, the credit rating of structured credit products, and the assessment of regulatory capital requirements. At the level of the individual position, credit risk depends most importantly on obligor default and rating migration probabilities. At the portfolio level, aggregate risk-measures (such as value-at-risk) depend also on the correlation (or, more generally, the dependence) across obligors in credit events. In practice and in academic work, the most widely used models are constructed as multi-firm generalizations of the structural model of Merton [20]. The return on firm asset value determines the outcome for the obligor at the model horizon. Dependence across obligors is generated through a factor structure in which the obligor asset return is modeled as a weighted sum of systematic and idiosyncratic risk factors.

∗ This paper is drawn from an earlier working paper by the title “Estimating Default Correlations from Short Panels of Credit Rating Performance Data,” dated January 2002. The opinions expressed here are those of the authors, and do not reflect the views of the Board of Governors or its staff.

Calibration of these models often draws upon historical ratings performance data. These panel datasets may provide performance data on large numbers of rated obligors, but in the time-series dimension they invariably span just a few decades at most. As shown by Gagliardini and Gourieroux [7], large $n$ in the cross-sectional dimension is not sufficient for consistency of the parameter estimates. Rather, it is large $T$ in the time-series dimension that is needed. Thus, for the foreseeable future, large sample asymptotics may not be an adequate guide to the performance of the estimators on available data. Furthermore, even if the asymptotics were reliable and the estimators unbiased, parameter uncertainty matters. Value-at-risk is a non-linear function of the model parameters, so the estimated VaR under parameter uncertainty is biased [15, 25]. Heitfield [14] draws similar conclusions in the context of model-based rating of collateralized debt obligations.

This paper explores the small sample properties of the most commonly used estimators of ratings-based portfolio credit models. We consider both method of moments and maximum likelihood estimators. Our main purpose is to measure the potential gain in precision and bias-reduction from imposing parametric restrictions across rating buckets. The restrictions we consider are based on economically meaningful hypotheses on the nature of the rating system and the structure of systematic risk.

The literature on estimation of portfolio credit risk models has grown enormously over the last decade. Method of moment estimators were introduced to this literature by Gordy [9] and Nagpal and Bahar [21], and refined by Frey and McNeil [6]. Early applications included [13] and [3]. Gagliardini and Gourieroux [7] extend the method to models of rating migration. Maximum likelihood estimation of these models was considered by Frey and McNeil [6], and has since been extended by Feng, Gourieroux and Jasiak [5] to models with rating migration. Gagliardini and Gourieroux [8] and Gourieroux and Jasiak [11] develop approximate maximum likelihood approaches that exploit the large cross-sectional dimension to reduce the computational burden of the estimator. A promising new development has been the introduction by McNeil and Wendin [18, 19] of Bayesian MCMC estimators of portfolio credit models. These methods are flexible and powerful, though their computational requirements are non-trivial. For a recent application and extension of the Bayesian approach, see [24].

The portfolio credit model is presented in Section 2. We work within a two-state (default/no-default) setting, and so do not consider rating migrations of surviving obligors. In many cases, the ratings performance data include information on rating migrations as well as on default. In principle, transition data can and should be exploited to increase the precision of the estimators. We restrict ourselves to the two-state case partly for simplicity in exposition, but also for two practical reasons. First, some datasets might not contain information on rating migrations. Default information is the “least common denominator” in the credit risk world. Second, estimation of a model of rating migration requires stronger assumptions on the nature and objectives of the rating process. The “through-the-cycle” rating philosophies of the leading rating agencies are open to varied interpretations, some of which may be difficult to formalize in a statistical model.¹

Section 3 shows how model parameters can be estimated from ratings performance data using the method of moments or maximum likelihood. The method of moments estimator has a closed-form solution, so it is especially convenient. The maximum likelihood estimators are somewhat more computationally demanding, but are also more efficient. Furthermore, the ML estimators lend themselves to imposing structural parameter restrictions.

Section 4 presents results for a Monte Carlo study of the small sample properties of three different maximum likelihood estimators as well as the method of moments estimator. We find that the method of moments and the least-restricted maximum likelihood estimator are subject to large biases in realistic sample sizes. The restricted maximum likelihood estimators offer large improvements in performance. In Section 5, we explain the source of the bias in the method of moments estimator. Implications are discussed in the Conclusion.

2. A Structural Default Model

We adopt a two-state version of the popular CreditMetrics model [12]. Assume we have a set of obligors, indexed by $i$. Associated with each obligor is a latent variable $R_i$ which represents the normalized return on an obligor’s assets. $R_i$ is given by

$$R_i = Z\eta_i + \xi_i \varepsilon_i \tag{1}$$

where $Z$ is a $K$-vector of systematic risk factors. These factors capture unanticipated changes in economy-wide variables such as interest rates and commodity prices. We assume that $Z$ is a mean-zero normal random vector with variance matrix $\Omega$. We measure the sensitivity of obligor $i$ to $Z$ by a vector of factor loadings, $\eta_i$. Obligor-specific risk is represented by $\varepsilon_i$. Each $\varepsilon_i$ is assumed to have a standard normal distribution and is independent across obligors and independent of $Z$. Without loss of generality, the covariance matrix $\Omega$ is assumed to have ones on the main diagonal (so each $Z_k$ has a standard normal marginal distribution), and the weights $\eta_i$ and $\xi_i$ are scaled so that $R_i$ has a mean of zero and a variance of one. The obligor defaults if $R_i$ falls below the default threshold $\gamma_i$. By construction, then, the unconditional probability of default (“PD”) of obligor $i$ is equal to the standard normal CDF evaluated at $\gamma_i$.

¹ Alternative interpretations of “through-the-cycle” can be found in [2], [26], [1], and [16, 17].

To allow the model to be calibrated using historical data of the sort available from the rating agencies, we group the obligors into $G$ homogeneous “buckets” indexed by $g$. In most applications, the buckets comprise an ordered set of rating grades. In principle, however, a bucketing system can be defined along multiple dimensions. For example, a bucket might be composed of obligors of a given rating in a particular industry and country. Within a bucket, each obligor has the same default threshold $\gamma_g$ so that the PD of any obligor in grade $g$ is

$$p_g = \Phi(\gamma_g), \tag{2}$$

where $\Phi(z)$ is the standard normal CDF.

The vector of factor loadings is assumed to be constant across all obligors in a bucket, so we can rewrite the equation for $R_i$ as

$$R_i = X_g w_g + \varepsilon_i \sqrt{1 - w_g^2} \tag{3}$$

where

$$X_g = \frac{\sum_k Z_k \eta_{g,k}}{\sqrt{\eta_g' \Omega \eta_g}}$$

is a univariate bucket-specific common risk factor. By construction, each $X_g$ is normally distributed with mean zero and unit variance. The $G$-vector $X = (X_1, \ldots, X_G)$ has a multivariate normal distribution. Let $\sigma_{gh}$ denote the covariance between $X_g$ and $X_h$. The factor loading on $X_g$ for obligors in bucket $g$ is

$$w_g = \sqrt{\eta_g' \Omega \eta_g},$$

which is bounded between zero and one. We eliminate $\xi_i$ from equation (1) by imposing the scaling convention that the variance of $R_i$ is one.
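As a rough illustration of equations (2) and (3), the Python sketch below simulates the latent return of a single obligor many times and compares the simulated default frequency with the PD implied by the threshold. The function name `simulate_default_rate`, the parameter values (`pd=0.02`, `w=0.45`), the number of draws and the seed are all illustrative assumptions rather than values taken from the paper.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_default_rate(pd, w, n_draws=100_000, seed=1):
    """Monte Carlo default frequency for one obligor in a bucket.

    pd : unconditional default probability p_g (assumed input)
    w  : factor loading w_g, with 0 <= w < 1 (assumed input)
    """
    rng = random.Random(seed)
    gamma = NormalDist().inv_cdf(pd)       # threshold from p_g = Phi(gamma_g), eq. (2)
    idio_weight = sqrt(1.0 - w * w)        # keeps Var(R_i) equal to one
    defaults = 0
    for _ in range(n_draws):
        x = rng.gauss(0.0, 1.0)            # bucket-specific common risk factor X_g
        eps = rng.gauss(0.0, 1.0)          # obligor-specific risk
        r = x * w + eps * idio_weight      # latent asset return, eq. (3)
        if r < gamma:                      # default when the return falls below gamma
            defaults += 1
    return defaults / n_draws


rate = simulate_default_rate(pd=0.02, w=0.45)
print(rate)
```

Because each draw uses a fresh realization of the common factor, the draws are independent and the simulated frequency should be close to the chosen PD whatever the loading. Dependence only becomes visible when several obligors share the same draw of $X_g$.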

The advantage of writing $R_i$ in terms of $X_g$ and $w_g$ rather than $Z$ and $\eta_g$ is that we then only need to keep track of one risk factor per bucket. We can think of $X_g$ as summarizing the total effect of $Z$ on obligors in bucket $g$, and $w_g$ as describing the sensitivity of those obligors to the bucket-specific common risk factor. In the discussion that follows, the term risk factors should be taken to refer to $X_g$. The term structural risk factors will be used to identify the elements of $Z$ because they reflect underlying economic variables. Likewise factor loadings will refer to $w_g$ and structural factor loadings will refer to $\eta_g$.

In this model, dependence across obligors $i$ and $j$ is summarized by their asset correlation, which is the correlation between the latent variables $R_i$ and $R_j$. If $i$ and $j$ are in buckets $g$ and $h$, respectively, then the asset correlation is $\rho_{gh} = w_g w_h \sigma_{gh}$. For two distinct obligors in the same bucket $g$, we have $\rho_{gg} = w_g^2$. In the Gaussian framework of the standard structural model, the matrix of asset correlations is a complete characterization of the dependence structure. As observed by Embrechts, McNeil and Straumann [4], linear correlations need not be sufficient under more general distributional assumptions.
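The asset correlation formula follows directly from equation (3). As a short worked derivation, using only the independence and unit-variance assumptions of this section, take distinct obligors $i$ in bucket $g$ and $j$ in bucket $h$:

```latex
\begin{aligned}
\mathrm{Cov}(R_i, R_j)
  &= \mathrm{Cov}\!\left( w_g X_g + \sqrt{1 - w_g^2}\,\varepsilon_i,\;
                          w_h X_h + \sqrt{1 - w_h^2}\,\varepsilon_j \right) \\
  &= w_g w_h \,\mathrm{Cov}(X_g, X_h) \\
  &= w_g w_h \,\sigma_{gh}
\end{aligned}
```

The cross terms vanish because $\varepsilon_i$ and $\varepsilon_j$ are independent of each other and of the risk factors. Since $R_i$ and $R_j$ have unit variance, this covariance is also their correlation, and setting $h = g$ (so that $\sigma_{gg} = 1$) gives $\rho_{gg} = w_g^2$.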

In some applications, there is interest in the correlation between default event indicators $1[R_i < \gamma_i]$ and $1[R_j < \gamma_j]$. For obligor $i$ in bucket $g$ and obligor $j$ in bucket $h$, the default correlation is

$$C_{gh} = \frac{\Phi_2(\gamma_g, \gamma_h, \rho_{gh}) - p_g p_h}{\sqrt{p_g(1 - p_g)} \sqrt{p_h(1 - p_h)}} \tag{4}$$

where $\Phi_2(z_1, z_2, \rho)$ is the bivariate normal CDF for standard normal marginals and correlation $\rho$. The same formula holds in the special case where the two (distinct) obligors lie in the same bucket.
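Equation (4) can be evaluated with nothing more than a one-dimensional integral. The sketch below is one possible implementation using only the Python standard library: the bivariate normal CDF is approximated with Simpson's rule, which is a choice made here for self-containment and not a method described in the paper. The function names, the integration range and the example inputs are assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def bvn_cdf(a, b, rho, steps=2000):
    """Approximate Phi_2(a, b, rho) for |rho| < 1.

    Uses Phi_2(a, b, rho) = integral over x <= a of
    phi(x) * Phi((b - rho * x) / sqrt(1 - rho^2)),
    evaluated by composite Simpson's rule on [-10, a] (steps must be even).
    """
    lower = -10.0
    if a <= lower:
        return 0.0
    scale = sqrt(1.0 - rho * rho)
    width = (a - lower) / steps

    def integrand(x):
        return STD.pdf(x) * STD.cdf((b - rho * x) / scale)

    total = integrand(lower) + integrand(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * width)
    return total * width / 3.0


def default_correlation(p_g, p_h, rho_gh):
    """Default correlation C_gh of equation (4) from PDs and asset correlation."""
    gamma_g = STD.inv_cdf(p_g)   # thresholds recovered via equation (2)
    gamma_h = STD.inv_cdf(p_h)
    joint = bvn_cdf(gamma_g, gamma_h, rho_gh)
    return (joint - p_g * p_h) / (sqrt(p_g * (1 - p_g)) * sqrt(p_h * (1 - p_h)))


# Illustrative inputs: two obligors in the same bucket, PD of 1%, w_g^2 = 0.2.
c_same_bucket = default_correlation(0.01, 0.01, 0.2)
print(c_same_bucket)
```

With zero asset correlation the joint default probability factorizes and the function returns a value of essentially zero, which is a convenient sanity check on the quadrature.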

Given sufficient data, one can estimate all $G(G+1)/2$ asset correlations. When data are scarce, however, many of these parameters may be unidentified or poorly identified. To reduce the number of parameters to be estimated, we impose ex ante restrictions on the factor loadings and risk factor variance matrix. The most commonly applied restriction is

Restriction R1. One Risk Factor: $\sigma_{gh} = 1$ for all $(g,h)$ bucket pairs.

R1 is equivalent to requiring that $X_1 = X_2 = \ldots = X_G$. A sufficient condition for R1 is that there is exactly one structural risk factor (i.e., $K = 1$). As shown by Gordy [10], R1 is a necessary assumption in the model underpinnings for the Basel II internal ratings-based capital standard and, indeed, is unavoidable (implicitly if not explicitly) in any system of ratings-based capital charges. Empirically, R1 may be an overly strong assumption, as casual observation suggests that industry and country business cycles are not perfectly synchronized. Nonetheless, if a portfolio is relatively homogeneous, or if sectoral distinctions among obligors cannot be observed from available data, a single-factor representation can serve as a reasonable approximation.

While R1 imposes a restriction on the correlation among reduced form risk factors, it does nothing to restrict the sensitivity of each obligor’s asset return to those factors. A different reduced form factor loading is associated with each bucket, and no restrictions are imposed on how these loadings vary. In practice it may be reasonable to assume that factor loadings vary smoothly with obligor default probabilities (or equivalently with obligor default thresholds). This assumption can be imposed by expressing factor loadings as a continuous function of default thresholds.

Restriction R2. Smooth Factor Loadings: $w_g = \Lambda(\lambda(\gamma_g))$ for all $g$, where $\Lambda(\cdot)$ is a continuous, strictly monotonic link function that maps real numbers onto the interval $(-1,1)$ and $\lambda(\cdot)$ is a continuous index function that maps default thresholds onto the real line.

The choice of the link function is rather arbitrary. In the analysis that follows we use the simple arctangent transformation

$$\Lambda(\lambda) = \frac{2}{\pi} \arctan(\lambda).$$

This function is approximately linear in a neighborhood of $\lambda = 0$ (with slope $2/\pi$) and asymptotes smoothly toward positive (negative) one as $\lambda$ approaches positive (negative) infinity. The specification of the index function is more important than the choice of the link because it can be used to restrict the way $w$ varies with $\gamma$. If the index function is monotonic in $\gamma$, then the mapping from $\gamma$ to $w$ will be monotonic as well. The more parsimonious the index function is, the more restrictive the implied relationship between the default thresholds and the factor loadings.
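The link and index functions of Restriction R2 are easy to prototype. In the minimal sketch below, `link` implements the arctangent transformation, while `linear_index` and its coefficients are purely hypothetical choices of index function introduced for illustration. The paper's own index specification is not reproduced here.

```python
from math import atan, pi


def link(index_value):
    """Arctangent link: maps the real line onto (-1, 1)."""
    return (2.0 / pi) * atan(index_value)


def linear_index(gamma, a=0.5, b=0.1):
    """Hypothetical index function lambda(gamma) = a + b * gamma (an assumption)."""
    return a + b * gamma


def constant_index(gamma):
    """A constant index function: every bucket gets the same loading."""
    return 0.8


def factor_loading(gamma, index_fn):
    """Factor loading w_g = Lambda(lambda(gamma_g)) under Restriction R2."""
    return link(index_fn(gamma))


thresholds = (-3.0, -2.0, -1.0)  # illustrative default thresholds
smooth_loadings = [factor_loading(g, linear_index) for g in thresholds]
constant_loadings = [factor_loading(g, constant_index) for g in thresholds]
print(smooth_loadings, constant_loadings)
```

A monotonic index function produces loadings that are monotonic in the threshold, and a constant index function collapses them to a single common value.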

The strongest restriction one can impose on the factor loadings is to assume that they are constant across all obligors.

Restriction R3. Constant Factor Loading: $w_g = w_h$ for all $(g,h)$ bucket pairs.

Together, R1 and R3 imply that the structural factor loadings are constant across buckets. Note that R3 is a special case of R2 in which the index function $\lambda(\cdot)$ is a constant.

3. Moment and Maximum Likelihood Estimators

In this section, we develop method of moments and maximum likelihood estimators for the structural model. The estimation framework assumes that we have access to historical performance data for a credit ratings system. For each of $T$ years and $G$ rating buckets, we observe the number of obligors in bucket $g$ at the beginning of year $t$ (a “bucket-cohort”), and the number of members of the bucket-cohort who default by year-end. We assume that the default threshold $\gamma_g$ and the factor loading $w_g$ are constant across time for each bucket, and that the vector of risk factors $(X, \varepsilon)$ is serially independent. The task at hand is to estimate $\gamma_g$ and $w_g$ for each rating bucket and (in the full-information MLE case) the variance matrix $\Sigma$. Given these parameter estimates we can recover PDs and default correlations using equations (2) and (4).

Let $n_g$ and $d_g$ denote the number of obligors and the number of defaults in bucket $g$. Throughout this paper, we take $n_g$ as exogenous, and so can treat it as a fixed parameter in moment conditions and likelihood functions.² Conditional on $X_g$, defaults in bucket $g$ are independent, and each default event can be viewed as the outcome of a Bernoulli trial with success probability

$$p_g(X_g) = p(X_g; \gamma_g, w_g) = \Phi\left( \frac{\gamma_g - w_g X_g}{\sqrt{1 - w_g^2}} \right). \tag{5}$$

² It is clear that the number of obligors in each bucket is stochastic. We assume the random process that generates the vector $n$ is independent of the process that generates defaults.

Thus, the total number of defaults in the bucket is conditionally binomial with parameters $n_g$ and $p_g(X_g)$.
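The conditional default probability can be coded in a few lines. The sketch below also checks numerically that averaging it over the standard normal distribution of the risk factor recovers the unconditional PD $\Phi(\gamma_g)$ of equation (2). The function names, the Simpson's-rule quadrature and the parameter values are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def conditional_pd(x, gamma, w):
    """Default probability given the risk factor X_g = x, for |w| < 1."""
    return STD.cdf((gamma - w * x) / sqrt(1.0 - w * w))


def expected_conditional_pd(gamma, w, steps=2000, bound=8.0):
    """E[p(X; gamma, w)] for X ~ N(0, 1), by Simpson's rule on [-bound, bound]."""
    width = 2.0 * bound / steps
    total = 0.0
    for i in range(steps + 1):
        x = -bound + i * width
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * conditional_pd(x, gamma, w) * STD.pdf(x)
    return total * width / 3.0


gamma = STD.inv_cdf(0.02)  # threshold for an illustrative 2% PD
w = 0.45                   # illustrative factor loading

pd_bad_state = conditional_pd(-2.0, gamma, w)   # adverse draw of the risk factor
pd_good_state = conditional_pd(2.0, gamma, w)   # favourable draw
average_pd = expected_conditional_pd(gamma, w)
print(pd_bad_state, pd_good_state, average_pd)
```

For a positive loading, a low realization of the risk factor raises the conditional PD and a high realization lowers it, which is the mechanism that generates default clustering within a bucket-cohort.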

From the factorial moment of the binomial distribution, we have

E[dg(dg−1)|Xg] = ng(ng−1)pg(Xg)2.

Taking expectations, we obtain the unconditional second factorial moment

E[dg(dg−1)] = ng(ng−1)E[pg(Xg)2] = ng(ng−1)Φ2(γg,γg,w

2g) (6)

where the last equality follows from Proposition 1 in [9]. This leads to the simplemethod of moments estimator for bucket parametersγg,wg. Let Yg,1 andYg,2 bethe sample moments

Yg,1 =1T

T

∑t=1

dg,t

ng,t

Yg,2 =1T

T

∑t=1

dg,t

ng,t

(dg,t −1)(ng,t −1)

From equation (2), we have the moment restriction

$$E[Y_{g,1}] = p_g = \Phi(\gamma_g) \tag{7}$$

which implies the MM estimator $\hat{\gamma}_g = \Phi^{-1}(Y_{g,1})$. From equation (6), we have

$$E[Y_{g,2}] = \Phi_2(\gamma_g, \gamma_g, w_g^2). \tag{8}$$

The Frey and McNeil [6] MM estimator of $w_g$ is the value $\hat{w}_g$ that satisfies

$$Y_{g,2} = \Phi_2(\hat{\gamma}_g, \hat{\gamma}_g, \hat{w}_g^2).$$

Note that the sign of $w_g$ is not identified. Without loss of generality, we impose $w_g \ge 0$ for the MM estimator.
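The two moment conditions can be turned into a small solver. The following is a sketch under stated assumptions rather than the authors' code: $\Phi_2(\gamma, \gamma, \rho)$ is computed as $E[p(X)^2]$ by trapezoid quadrature (using the equality in equation (6)), the root in $\rho = w^2$ is found by bisection, and the names `phi2_equal`, `sample_moments` and `mm_from_moments` as well as the default counts are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def phi2_equal(gamma, rho, half_width=10.0, steps=2000):
    """Phi_2(gamma, gamma, rho) for 0 <= rho < 1, computed as E[p(X)^2]."""
    w, scale = math.sqrt(rho), math.sqrt(1.0 - rho)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        p = STD_NORMAL.cdf((gamma - w * x) / scale)
        total += weight * p * p * STD_NORMAL.pdf(x)
    return total * h


def sample_moments(defaults, obligors):
    """Y1 and Y2 for one bucket from yearly default and obligor counts."""
    T = len(defaults)
    y1 = sum(d / n for d, n in zip(defaults, obligors)) / T
    y2 = sum(d * (d - 1) / (n * (n - 1)) for d, n in zip(defaults, obligors)) / T
    return y1, y2


def mm_from_moments(y1, y2, iterations=60):
    """Solve (7) for gamma_hat, then Y2 = Phi_2(gamma_hat, gamma_hat, w^2) for w_hat >= 0."""
    if not 0.0 < y1 < 1.0:
        raise ValueError("default rate of 0 or 1: threshold not identified")
    gamma_hat = STD_NORMAL.inv_cdf(y1)
    if y2 <= y1 * y1:            # implied asset correlation would be <= 0
        return gamma_hat, 0.0    # impose the lower bound w = 0
    lo, hi = 0.0, 0.999          # Phi_2 is increasing in rho, so bisect
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi2_equal(gamma_hat, mid) < y2:
            lo = mid
        else:
            hi = mid
    return gamma_hat, math.sqrt(0.5 * (lo + hi))


# Hypothetical default counts for a 250-obligor bucket over eight years.
defaults = [1, 0, 6, 2, 0, 9, 1, 3]
obligors = [250] * 8
print(mm_from_moments(*sample_moments(defaults, obligors)))
```

Bisection is used only because it needs nothing beyond monotonicity of $\Phi_2$ in $\rho$; any one-dimensional root finder would serve.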

We now develop the full and restricted maximum likelihood estimators for the model. The conditional binomial distribution for $d_g$ implies the likelihood function

$$L(\gamma_g, w_g \mid d_g, X_g) = \binom{n_g}{d_g}\, p(X_g; \gamma_g, w_g)^{d_g} \left(1 - p(X_g; \gamma_g, w_g)\right)^{n_g - d_g}. \tag{9}$$

Since defaults are conditionally independent across buckets, the joint likelihood of the vector $d$ conditional on $X$ is simply the product of the $G$ conditional likelihoods defined in (9). The unconditional likelihood for $d$ is thus

$$L(\gamma, w, \Sigma \mid d) = \int_{\Re^G} \prod_{g=1}^{G} \left( \binom{n_g}{d_g}\, p(x_g; \gamma_g, w_g)^{d_g} \left(1 - p(x_g; \gamma_g, w_g)\right)^{n_g - d_g} \right) dF(x; \Sigma) \tag{10}$$


where $F(x; \Sigma)$ is the multivariate normal CDF of $X$.

In principle, we could maximize the product of (10) across $T$ observations with respect to all $2G + (G-1)G/2$ free parameters simultaneously. This would provide unrestricted full information maximum likelihood estimates of the parameters. In practice, however, this strategy is computationally feasible only when $G$ is small. To reduce the dimensionality of the optimization problem, we can integrate $X_g$ out of equation (9) to yield the marginal likelihood

$$L(\gamma_g, w_g \mid d_g) = \int_{\Re} \binom{n_g}{d_g}\, p(x; \gamma_g, w_g)^{d_g} \left(1 - p(x; \gamma_g, w_g)\right)^{n_g - d_g} d\Phi(x). \tag{11}$$

This function depends only on the two parameters $w_g$ and $\gamma_g$, so estimates of $w$ and $\gamma$ can be obtained by maximizing the marginal likelihood for each bucket, one bucket at a time.³ This procedure yields our least restrictive maximum likelihood estimator, which imposes no restrictions on the parameters of the default model described in Section 2. Because this estimator does not utilize information about the potential correlation in default rates across buckets, it is not asymptotically efficient, except in the unrealistic special case where $\sigma_{gh} = 0$ for all $g \ne h$. It also provides no estimate of the variance matrix $\Sigma$, which is needed to calculate value-at-risk. In practical application, $\Sigma$ is sometimes obtained from other data sources. For example, in CreditMetrics, $\Sigma$ is estimated by taking pairwise correlations in stock market indices [12].
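The marginal likelihood (11) is a one-dimensional integral and is cheap to evaluate numerically. The sketch below assumes a plain trapezoid rule on a truncated grid; the function names and the default counts are illustrative, and no claim is made that this is how the authors computed MLE1. A numerical optimizer would then be run over $(\gamma_g, w_g)$ for each bucket.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_binom_pmf(d, n, p):
    """Log of the binomial probability of d defaults among n obligors."""
    p = min(max(p, 1e-300), 1.0 - 1e-16)   # keep the logarithms finite
    return (math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
            + d * math.log(p) + (n - d) * math.log1p(-p))


def marginal_likelihood(d, n, gamma, w, half_width=8.0, steps=800):
    """Equation (11) for one bucket-year: integrate x out with the trapezoid rule."""
    h = 2.0 * half_width / steps
    scale = math.sqrt(1.0 - w * w)
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        p = STD_NORMAL.cdf((gamma - w * x) / scale)
        total += weight * math.exp(log_binom_pmf(d, n, p)) * STD_NORMAL.pdf(x)
    return total * h


def marginal_loglik(defaults, obligors, gamma, w):
    """Log-likelihood of one bucket across years (years are independent)."""
    return sum(math.log(marginal_likelihood(d, n, gamma, w))
               for d, n in zip(defaults, obligors))


# Hypothetical counts: compare two candidate loadings at a fixed threshold.
defaults, obligors = [1, 0, 6, 2, 0, 9, 1, 3], [250] * 8
gamma = STD_NORMAL.inv_cdf(0.011)
for w in (0.10, 0.45):
    print(w, marginal_loglik(defaults, obligors, gamma, w))
```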

R1 implies that the effect of $X$ on all obligors can be represented by a single standard normal scalar variable $X$. Under this restriction we can re-write (10) as

$$L(\gamma, w \mid d) = \int_{\Re} \prod_{g=1}^{G} \left( \binom{n_g}{d_g}\, p(x; \gamma_g, w_g)^{d_g} \left(1 - p(x; \gamma_g, w_g)\right)^{n_g - d_g} \right) d\Phi(x). \tag{12}$$

Maximizing this likelihood over $w$ and $\gamma$ yields a full information likelihood estimator that imposes the one risk factor restriction.

Rather than estimate the elements of $w$ directly, one can substitute the formula in R2 into equation (12) and maximize the resulting equation over $\gamma$ and the parameters of the index function $\lambda(\gamma)$. This procedure yields a FIML estimator that imposes both the one risk factor and the smooth factor loading restrictions. Similarly, R1 and R3 can be imposed by replacing the vector $w$ in equation (12) with a single loading $w \ge 0$ and maximizing the resulting likelihood with respect to $\gamma$ and the scalar $w$.
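A sketch of the one-factor likelihood (12) follows, again using trapezoid quadrature as an assumed integration scheme. The names `year_likelihood`, `loglik_mle2` and `loglik_mle3` and the data layout (one `(defaults, obligors)` pair of tuples per year) are hypothetical; imposing R3 amounts to passing the same loading for every bucket.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def log_binom_pmf(d, n, p):
    """Log of the binomial probability of d defaults among n obligors."""
    p = min(max(p, 1e-300), 1.0 - 1e-16)   # keep the logarithms finite
    return (math.lgamma(n + 1) - math.lgamma(d + 1) - math.lgamma(n - d + 1)
            + d * math.log(p) + (n - d) * math.log1p(-p))


def year_likelihood(d, n, gammas, loadings, half_width=8.0, steps=800):
    """Equation (12) for one year; d, n, gammas, loadings have one entry per bucket."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        log_prod = 0.0
        for dg, ng, gam, w in zip(d, n, gammas, loadings):
            p = STD_NORMAL.cdf((gam - w * x) / math.sqrt(1.0 - w * w))
            log_prod += log_binom_pmf(dg, ng, p)   # buckets independent given x
        total += weight * math.exp(log_prod) * STD_NORMAL.pdf(x)
    return total * h


def loglik_mle2(data, gammas, loadings):
    """One common factor (R1), bucket-specific loadings."""
    return sum(math.log(year_likelihood(d, n, gammas, loadings)) for d, n in data)


def loglik_mle3(data, gammas, w):
    """One common factor (R1) and a single loading for all buckets (R3)."""
    return loglik_mle2(data, gammas, [w] * len(gammas))


# Hypothetical two-bucket data: (defaults, obligors) per year.
data = [((0, 3), (400, 100)), ((2, 9), (400, 100)), ((0, 4), (400, 100))]
gammas = [STD_NORMAL.inv_cdf(0.0015), STD_NORMAL.inv_cdf(0.05)]
print(loglik_mle3(data, gammas, 0.45))
```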

If both R1 and R3 hold, then all the maximum likelihood estimators described in this section are consistent for $T \to \infty$. Furthermore, the estimator that imposes R1 and R3 is efficient in the sense that it achieves the lowest possible asymptotic variance among consistent estimators. It is important to emphasize, however, that in finite samples some or all of these maximum likelihood estimators may be biased. In the next section we use Monte Carlo simulations to investigate the small sample properties of these estimators.

³ As was the case for the MM estimator, the sign of $w_g$ is not identified by the marginal likelihood estimator.

4. Monte Carlo Simulations

If many decades of ratings performance data were available, the asymptotic results of the previous section would pose a clear trade-off. On one hand, the more restrictive maximum likelihood estimators yield more precise estimates if the restrictions they impose are valid; on the other hand, the less restrictive estimators are more robust to specification errors. When ratings performance data are in short supply (i.e., $T$ is small) the trade-off becomes more complicated because the less restrictive estimators may also be the most biased. We use Monte Carlo simulations to study the small sample biases in our estimators.

The following four estimators are examined in this analysis.

MM: unrestricted method of moments estimator.

MLE1: limited information maximum likelihood estimator.

MLE2: full information maximum likelihood estimator that imposes R1.

MLE3: full information maximum likelihood estimator that imposes R1 and R3.

In each Monte Carlo simulation, we constructed a synthetic dataset intended to represent the type of historical data available from the major rating agencies. Data were simulated for three rating grades. Grade “A” corresponds to medium to low investment grade (S&P A/BBB), grade “B” corresponds to high speculative grade (S&P BB), and grade “C” corresponds to medium speculative grade (S&P B). Table 1 summarizes characteristics of these three grades.⁴ Simulated defaults in each grade were generated according to the stochastic model described in Section 2 with R1 and R3 imposed.
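One way to generate such a synthetic dataset is sketched below. It assumes defaults are drawn from the conditional binomial implied by equation (5), with one standard normal factor per year shared by all grades; the PDs and cohort sizes come from Table 1, while the function names, the seed handling and the per-obligor Bernoulli draw are illustrative choices, not a description of the authors' simulation code.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

# (PD, number of obligors) per grade, taken from Table 1.
GRADES = {"A": (0.0015, 400), "B": (0.0100, 250), "C": (0.0500, 100)}


def default_threshold(pd):
    """Default threshold implied by an unconditional PD, gamma = Phi^{-1}(PD)."""
    return STD_NORMAL.inv_cdf(pd)


def simulate_dataset(T, w, grades=GRADES, seed=0):
    """Return {grade: [defaults in year 1, ..., year T]} with R1 and R3 imposed."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - w * w)
    counts = {g: [] for g in grades}
    for _ in range(T):
        x = rng.gauss(0.0, 1.0)                      # one systematic factor per year (R1)
        for g, (pd, n) in grades.items():
            # same loading w for every grade (R3)
            p = STD_NORMAL.cdf((default_threshold(pd) - w * x) / scale)
            counts[g].append(sum(rng.random() < p for _ in range(n)))
    return counts


dataset = simulate_dataset(T=20, w=0.45, seed=1)
for grade, yearly_defaults in dataset.items():
    print(grade, yearly_defaults)
```

Repeating the call with different seeds gives the collection of datasets over which the estimators can be compared.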

Two sets of Monte Carlo simulations were undertaken. In the first, 500 synthetic datasets were generated for four different values of $T$: 20, 40, 80, and 160. In each case a “true” factor loading of 0.45 was assumed. These simulations were intended to shed light on the properties of our estimators as the number of years of default data increases. Though estimates of both factor loadings and default thresholds were obtained for each simulated dataset, we will postpone discussing default thresholds for the time being. Table 2 summarizes the means, standard deviations, and root mean squared errors (“RMSE”) for the estimates of $w$ given each of the four sample sizes. Figure 1 displays the median and the 5th and 95th percentiles of the estimated parameter values.

⁴ S&P grade-cohorts are somewhat larger than we have assumed, but are similar in the relative preponderance of higher grade obligors.

Table 1. Characteristics of simulated rating grades.

Grade    PD        Default Threshold    No. of Obligors
A        0.0015    −2.9677              400
B        0.0100    −2.3263              250
C        0.0500    −1.6449              100

Figure 1. Median estimated factor loadings by sample size (error bars show 5th and 95th percentiles).

Not surprisingly, properties of all four estimators improve as $T$ increases. The means become closer to 0.45 and the variances and RMSEs decrease. Also as expected, for large values of $T$ the more restrictive estimators are more tightly clustered around 0.45 than the less restrictive estimators. More surprising is the rather poor performance of MM and MLE1 when $T$ is small. Though all four estimators appear to be downward-biased in small samples, the bias of MM and MLE1 is substantially worse than that of MLE2 and MLE3.

In real-world applications, we could never hope to observe 80 or 160 years of default data. S&P historical performance data currently cover 28 annual cohorts [28]. Moody’s performance data go back to 1970, but there is believed to be an important break in the time-series at 1983 due to a change in Moody’s rating methods. Banks’ internal rating systems typically contain even shorter time-series, though larger grade-cohorts. For the vast majority of these internal systems, we would observe fewer than 20 years of data. To explore the small-sample properties of our estimators in greater detail, a second set of Monte Carlo simulations was run with $T$ fixed at 20. Four groups of 1,500 synthetic datasets were simulated for a grid of “true” factor loadings from 0.15 to 0.60. For a small minority of trials, the simulated data did not permit identification of all model parameters. In other trials, the optimization routines used to calculate the maximum likelihood estimators failed to converge. In Appendix A, we provide details on the incidence and treatment of identification and convergence problems.

Tables 3 and 4 show the distributions of estimated default thresholds and implied default probabilities. Even when $T$ is small, all four estimators generally produce minimally biased and reasonably precise estimates of default thresholds and, therefore, of the corresponding PDs. Although the direct estimator of the PD is unbiased, we favor estimation of default thresholds, because the distribution of $\gamma$ is approximately symmetric. PDs, by contrast, are bounded at zero, so estimated PDs for the higher quality grades have highly asymmetric distributions. Therefore, standard test statistics should be better behaved for estimated default thresholds.

Tables 5(a) through 5(d) describe the distributions of estimated factor loadings. Several strong patterns can be seen in these tables, of which the most striking is the large downward bias associated with MM and MLE1. This problem is particularly significant for high quality grades when the true factor loadings are high. MLE2 and MLE3 are also biased downward, but the magnitude of the bias is less severe. In contrast to the results for MM and MLE1, the magnitude of the bias for MLE2 does not appear to depend on the grade in any systematic way.

Based on the root mean squared error criterion, MLE3 clearly outperforms the other three estimators; and more generally, the more restrictive estimators outperform the less restrictive estimators. The greatest gain in efficiency appears to occur when the single factor assumption (R1) is imposed. Because it incorporates information on cross-grade default correlations, MLE2 produces substantially more accurate estimates of high-grade factor loadings than MLE1 or MM.

5. Bias in Method of Moments

Finite-sample bias in moment estimators arises when the moment restrictions are nonlinear functions of the parameters. In this section, we show why the MM estimator for factor loading $w$ is subject to a large downward bias in realistic settings. For clarity in exposition, we make a number of simplifying assumptions to reduce notation.

Table 2. Distribution of estimated factor loadings by sample size for w = 0.45.

                        MM                        MLE1                      MLE2              MLE3
T    Statistic    A       B       C        A       B       C        A       B       C        All
20   Mean       0.3020  0.3816  0.4105   0.3748  0.4272  0.4390   0.4356  0.4389  0.4427   0.4374
20   Std. Dev.  0.1504  0.0984  0.0923   0.1762  0.1053  0.0852   0.1319  0.0907  0.0773   0.0743
20   RMSE       0.2110  0.1198  0.1004   0.1914  0.1077  0.0858   0.1325  0.0913  0.0775   0.0753
40   Mean       0.3660  0.4098  0.4293   0.4151  0.4429  0.4462   0.4418  0.4426  0.4444   0.4454
40   Std. Dev.  0.1074  0.0768  0.0692   0.1269  0.0717  0.0609   0.0885  0.0610  0.0550   0.0529
40   RMSE       0.1363  0.0867  0.0722   0.1315  0.0720  0.0610   0.0888  0.0614  0.0553   0.0530
80   Mean       0.3999  0.4272  0.4381   0.4329  0.4468  0.4499   0.4455  0.4481  0.4497   0.4486
80   Std. Dev.  0.0799  0.0620  0.0503   0.0837  0.0512  0.0432   0.0619  0.0454  0.0401   0.0396
80   RMSE       0.0943  0.0661  0.0516   0.0853  0.0513  0.0432   0.0620  0.0454  0.0400   0.0396
160  Mean       0.4208  0.4383  0.4444   0.4527  0.4536  0.4548   0.4573  0.4563  0.4571   0.4548
160  Std. Dev.  0.0621  0.0483  0.0367   0.0595  0.0368  0.0320   0.0467  0.0369  0.0333   0.0311
160  RMSE       0.0686  0.0497  0.0371   0.0595  0.0369  0.0324   0.0472  0.0374  0.0341   0.0315

Table 3. Distribution of estimated default thresholds by “true” factor loadings (T = 20).

                         MM                      MLE1                    MLE2                    MLE3
w     Statistic     A       B       C       A       B       C       A       B       C       A       B       C
0.15  Mean       −2.980  −2.333  −1.649  −2.982  −2.331  −1.647  −2.981  −2.331  −1.647  −2.982  −2.331  −1.647
0.15  Std. Dev.   0.101   0.064   0.060   0.096   0.063   0.060   0.097   0.063   0.060   0.096   0.063   0.060
0.15  RMSE        0.101   0.065   0.060   0.097   0.063   0.060   0.097   0.064   0.060   0.097   0.063   0.060
0.30  Mean       −2.995  −2.337  −1.656  −2.987  −2.334  −1.651  −2.985  −2.335  −1.653  −2.988  −2.334  −1.651
0.30  Std. Dev.   0.134   0.092   0.084   0.125   0.094   0.088   0.125   0.093   0.088   0.124   0.094   0.088
0.30  RMSE        0.136   0.093   0.085   0.126   0.094   0.088   0.126   0.094   0.088   0.126   0.094   0.088
0.45  Mean       −3.016  −2.352  −1.655  −3.008  −2.343  −1.657  −2.995  −2.345  −1.661  −3.005  −2.350  −1.664
0.45  Std. Dev.   0.190   0.139   0.124   0.173   0.137   0.123   0.163   0.127   0.117   0.165   0.132   0.120
0.45  RMSE        0.196   0.141   0.124   0.177   0.138   0.124   0.165   0.129   0.118   0.170   0.134   0.122
0.60  Mean       −3.088  −2.378  −1.670  −3.046  −2.360  −1.653  −3.009  −2.345  −1.652  −3.014  −2.360  −1.667
0.60  Std. Dev.   0.311   0.214   0.176   0.213   0.186   0.155   0.181   0.155   0.133   0.185   0.159   0.137
0.60  RMSE        0.334   0.220   0.177   0.227   0.189   0.155   0.186   0.156   0.133   0.191   0.163   0.139
      True Value −2.968  −2.326  −1.645  −2.968  −2.326  −1.645  −2.968  −2.326  −1.645  −2.968  −2.326  −1.645

Table 4. Distribution of estimated default probabilities by “true” factor loadings (in percentage points).

                         MM                     MLE1                   MLE2                   MLE3
w     Statistic     A      B      C        A      B      C        A      B      C        A      B      C
0.15  Mean        0.151  0.996  4.987    0.149  1.000  5.008    0.150  0.999  5.008    0.149  1.000  5.008
0.15  Std. Dev.   0.047  0.168  0.607    0.045  0.167  0.616    0.045  0.167  0.618    0.045  0.167  0.620
0.15  RMSE        0.047  0.168  0.607    0.045  0.166  0.615    0.045  0.167  0.618    0.045  0.167  0.619
0.30  Mean        0.149  0.999  4.951    0.151  1.007  5.003    0.152  1.003  4.986    0.151  1.008  5.006
0.30  Std. Dev.   0.059  0.243  0.855    0.059  0.252  0.909    0.061  0.248  0.908    0.059  0.249  0.911
0.30  RMSE        0.059  0.243  0.856    0.059  0.252  0.909    0.061  0.248  0.908    0.059  0.249  0.911
0.45  Mean        0.149  0.991  5.025    0.152  1.014  5.006    0.155  1.000  4.953    0.151  0.990  4.923
0.45  Std. Dev.   0.083  0.360  1.259    0.089  0.381  1.264    0.085  0.338  1.198    0.083  0.351  1.222
0.45  RMSE        0.083  0.360  1.259    0.089  0.381  1.263    0.085  0.338  1.198    0.083  0.351  1.224
0.60  Mean        0.149  0.999  4.998    0.145  1.015  5.121    0.152  1.020  5.078    0.150  0.984  4.926
0.60  Std. Dev.   0.142  0.565  1.769    0.108  0.525  1.646    0.085  0.399  1.372    0.085  0.387  1.369
0.60  RMSE        0.142  0.565  1.769    0.108  0.525  1.650    0.085  0.400  1.374    0.085  0.387  1.370
      True Value  0.150  1.000  5.000    0.150  1.000  5.000    0.150  1.000  5.000    0.150  1.000  5.000

We fix a bucket with threshold $\gamma$ and factor loading $w$. Assume that the cohort size $n$ is constant across time. For now, let us assume that $\gamma$ is known, so does not need to be estimated, and that we wish to estimate the asset correlation $\rho = w^2$. The MM estimator of $\rho$ is the value $\hat{\rho}$ that satisfies

$$Y_2 = \Phi_2(\gamma, \gamma, \hat{\rho}).$$

To emphasize that this gives $\hat{\rho}$ as an implicit function, let us write $\Phi_2(\rho; \gamma)$ for the bivariate normal in the above equation. We denote $\Upsilon_\gamma$ as the inverse of this function, so that

$$\Upsilon_\gamma(\Phi_2(\rho; \gamma)) = \rho. \tag{13}$$

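The empirical moment $Y_2$ is a noisy but unbiased estimator of the quantity $y_2^* = \Phi_2(\rho; \gamma)$ for the true parameter value $\rho$. As in [22], we take a Taylor series approximation for $\hat{\rho}$ as

$$\hat{\rho} = \Upsilon_\gamma(Y_2) \approx \Upsilon_\gamma(y_2^*) + (Y_2 - y_2^*)\, \Upsilon'_\gamma(y_2^*) + \frac{1}{2} (Y_2 - y_2^*)^2\, \Upsilon''_\gamma(y_2^*).$$

Taking expectations of both sides, and noting that $\Upsilon_\gamma(y_2^*) = \rho$, $E[Y_2 - y_2^*] = 0$ and $E[(Y_2 - y_2^*)^2]$ is the variance $V[Y_2]$, the bias is approximated as

$$E[\hat{\rho}] - \rho \approx \frac{1}{2} V[Y_2]\, \Upsilon''_\gamma(y_2^*). \tag{14}$$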

By twice differentiating both sides of identity (13), we find

$$\Upsilon''_\gamma(y_2^*) = -\frac{\Phi_2''(\rho; \gamma)}{\Phi_2'(\rho; \gamma)^3}.$$
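The differentiation step can be spelled out as follows. This is a routine chain-rule calculation included only as a reading aid; it writes $y = \Phi_2(\rho; \gamma)$ and differentiates identity (13) with respect to $\rho$, with primes on $\Phi_2$ denoting derivatives in $\rho$.

```latex
\[
\begin{aligned}
\Upsilon_\gamma\bigl(\Phi_2(\rho;\gamma)\bigr) &= \rho \\[4pt]
\Upsilon'_\gamma(y)\,\Phi_2'(\rho;\gamma) &= 1
  &&\Longrightarrow\quad \Upsilon'_\gamma(y) = \frac{1}{\Phi_2'(\rho;\gamma)} \\[4pt]
\Upsilon''_\gamma(y)\,\Phi_2'(\rho;\gamma)^2 + \Upsilon'_\gamma(y)\,\Phi_2''(\rho;\gamma) &= 0
  &&\Longrightarrow\quad \Upsilon''_\gamma(y) = -\frac{\Phi_2''(\rho;\gamma)}{\Phi_2'(\rho;\gamma)^3}
\end{aligned}
\]
```

Evaluating at the true $\rho$, so that $y = y_2^*$, gives the expression used in the bias approximation.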

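As noted by Vasicek [27],

$$\Phi_2'(\rho; \gamma) = \frac{\partial}{\partial \rho} \Phi_2(\gamma, \gamma, \rho) = \phi_2(\gamma, \gamma, \rho)$$

where $\phi_2$ is the bivariate normal density. From this, it is straightforward to show that

$$\Phi_2''(\rho; \gamma) = \frac{\partial}{\partial \rho} \phi_2(\gamma, \gamma, \rho) = \left( \left( \frac{\gamma}{1 + \rho} \right)^2 + \frac{\rho}{1 - \rho^2} \right) \phi_2(\gamma, \gamma, \rho).$$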
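For readers who want the intermediate step, the derivative of the density can be obtained by logarithmic differentiation. The sketch below assumes the standard bivariate normal density with unit variances evaluated at the point $(\gamma, \gamma)$, where the exponent simplifies to $-\gamma^2/(1+\rho)$.

```latex
\[
\begin{aligned}
\phi_2(\gamma,\gamma,\rho)
  &= \frac{1}{2\pi\sqrt{1-\rho^2}}\,
     \exp\!\left(-\frac{\gamma^2}{1+\rho}\right) \\[4pt]
\log \phi_2(\gamma,\gamma,\rho)
  &= -\log(2\pi) - \tfrac{1}{2}\log\!\left(1-\rho^2\right) - \frac{\gamma^2}{1+\rho} \\[4pt]
\frac{\partial}{\partial\rho}\log \phi_2(\gamma,\gamma,\rho)
  &= \frac{\rho}{1-\rho^2} + \frac{\gamma^2}{(1+\rho)^2} \\[4pt]
\frac{\partial}{\partial\rho}\,\phi_2(\gamma,\gamma,\rho)
  &= \left(\left(\frac{\gamma}{1+\rho}\right)^{2} + \frac{\rho}{1-\rho^2}\right)
     \phi_2(\gamma,\gamma,\rho)
\end{aligned}
\]
```

Both terms in the bracket are non-negative for $\rho \ge 0$, which is what makes $\Upsilon''_\gamma$ negative.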


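Thus, we arrive at

$$\Upsilon''_\gamma(y_2^*) = -\left( \left( \frac{\gamma}{1 + \rho} \right)^2 + \frac{\rho}{1 - \rho^2} \right) \frac{1}{\phi_2(\gamma, \gamma, \rho)^2}. \tag{15}$$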

In Appendix B, we derive the variance of $Y_2$. Scaling by $T$, we have

$$T \cdot V[Y_2] = \Phi_4(\rho; \gamma) - \Phi_2(\rho; \gamma)^2 + \frac{1}{n(n-1)} \Bigl( -2(2n-3)\, \Phi_4(\rho; \gamma) + 4(n-2)\, \Phi_3(\rho; \gamma) + 2\, \Phi_2(\rho; \gamma) \Bigr) \tag{16}$$

where $\Phi_m$ is the $m$-variate normal cdf such that

$$\Phi_m(\rho; \gamma) = \Pr(Z_1 \le \gamma, \ldots, Z_m \le \gamma)$$

for $Z_i$ that are standard normal variables with equal correlations $E[Z_i Z_j] = \rho$ for $i \ne j$. When $\rho = 0$, $\Phi_4(\rho; \gamma) = \Phi_2(\rho; \gamma)^2$, so that only the sampling variation term remains in the variance. In this case, the bias in $\hat{\rho}$ is $O(1/n)$. When $\rho > 0$, $\Phi_4(\rho; \gamma) > \Phi_2(\rho; \gamma)^2$, so the bias does not vanish even as the number of obligors increases to infinity.
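The behaviour of the two variance components can be checked numerically. The sketch below assumes $\rho \ge 0$ and uses the one-factor representation of equicorrelated normals, under which $\Phi_m(\rho; \gamma) = E[p(X)^m]$ with $p$ as in equation (5) and $w = \sqrt{\rho}$; the quadrature scheme and the names `phi_m` and `scaled_var_y2` are choices made for this illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def phi_m(m, rho, gamma, half_width=10.0, steps=4000):
    """Phi_m(rho; gamma) for 0 <= rho < 1, computed as E[p(X)^m] by trapezoid rule."""
    w, scale = math.sqrt(rho), math.sqrt(1.0 - rho)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        p = STD_NORMAL.cdf((gamma - w * x) / scale)
        total += weight * p ** m * STD_NORMAL.pdf(x)
    return total * h


def scaled_var_y2(n, rho, gamma):
    """T * V[Y2] from equation (16)."""
    p2, p3, p4 = (phi_m(m, rho, gamma) for m in (2, 3, 4))
    sampling = (-2 * (2 * n - 3) * p4 + 4 * (n - 2) * p3 + 2 * p2) / (n * (n - 1))
    return p4 - p2 ** 2 + sampling


gamma = STD_NORMAL.inv_cdf(0.01)
for n in (250, 2500, 25000):
    # rho = 0: variance keeps shrinking with n; rho > 0: it levels off
    print(n, scaled_var_y2(n, 0.0, gamma), scaled_var_y2(n, 0.2025, gamma))
```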

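A minor extension of these calculations gives us the bias for the factor loading $w$. The moment condition is $\hat{w} = \sqrt{\Upsilon_\gamma(Y_2)}$, so the Taylor series approximation to the bias is

$$E[\hat{w}] - w \approx \frac{1}{2} V[Y_2] \left. \frac{d^2}{dy^2} \sqrt{\Upsilon_\gamma(y)} \right|_{y = y_2^*} \tag{17}$$

Taking derivatives of $\sqrt{\Upsilon(y)}$,

$$\frac{d^2}{dy^2} \sqrt{\Upsilon(y)} = \frac{1}{2} \frac{\Upsilon''(y)}{\Upsilon(y)^{1/2}} - \frac{1}{4} \frac{\Upsilon'(y)^2}{\Upsilon(y)^{3/2}}$$

and substituting as before for $\Upsilon_\gamma(y_2^*)$ and its derivatives, we obtain

$$\left. \frac{d^2}{dy^2} \sqrt{\Upsilon_\gamma(y)} \right|_{y = y_2^*} = -\frac{1}{2w} \left( \left( \frac{\gamma}{1 + w^2} \right)^2 + \frac{w^2}{1 - w^4} + \frac{1}{2w^2} \right) \frac{1}{\phi_2(\gamma, \gamma, w^2)^2} \tag{18}$$

This expression is negative for $w > 0$, so it is clear that the bias in $\hat{w}$ must be towards zero.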
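Equations (16) to (18) can be combined into a short numerical sketch of the approximate bias, scaled by $T$. The assumptions are the same as elsewhere in this section ($\gamma$ known, constant cohort size $n$), plus a trapezoid quadrature for $\Phi_m$ based on the one-factor representation; the function names are hypothetical. The printed values can be compared with the $w = 0.45$ column of Table 6.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def phi_m(m, rho, gamma, half_width=10.0, steps=4000):
    """Phi_m(rho; gamma) for 0 <= rho < 1, computed as E[p(X)^m] by trapezoid rule."""
    w, scale = math.sqrt(rho), math.sqrt(1.0 - rho)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * STD_NORMAL.cdf((gamma - w * x) / scale) ** m * STD_NORMAL.pdf(x)
    return total * h


def scaled_var_y2(n, rho, gamma):
    """T * V[Y2] from equation (16)."""
    p2, p3, p4 = (phi_m(m, rho, gamma) for m in (2, 3, 4))
    return p4 - p2 ** 2 + (-2 * (2 * n - 3) * p4 + 4 * (n - 2) * p3 + 2 * p2) / (n * (n - 1))


def bivariate_density_equal(gamma, rho):
    """phi_2(gamma, gamma, rho): bivariate normal density on the diagonal."""
    return math.exp(-gamma * gamma / (1.0 + rho)) / (2.0 * math.pi * math.sqrt(1.0 - rho * rho))


def d2_sqrt_upsilon(gamma, w):
    """Right-hand side of equation (18); requires w > 0."""
    rho = w * w
    bracket = (gamma / (1.0 + rho)) ** 2 + rho / (1.0 - rho * rho) + 1.0 / (2.0 * rho)
    return -bracket / (2.0 * w * bivariate_density_equal(gamma, rho) ** 2)


def bias_times_T(pd, n, w):
    """T times the approximate bias in w_hat from equation (17)."""
    gamma = STD_NORMAL.inv_cdf(pd)
    return 0.5 * scaled_var_y2(n, w * w, gamma) * d2_sqrt_upsilon(gamma, w)


# PDs and cohort sizes of the three grades in Table 1.
for grade, pd, n in (("A", 0.0015, 400), ("B", 0.0100, 250), ("C", 0.0500, 100)):
    print(grade, round(bias_times_T(pd, n, 0.45), 2))
```

Dividing a printed value by the number of years gives the approximate bias for that sample length.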

Table 6 displays the approximate bias in the factor loading estimator as given by equation (17) for the three hypothetical buckets in Table 1. As in the previous section, we vary $w$ from 0.15 to 0.60. The bias is expressed as a multiple of $1/T$ so, for example, if $T = 100$ and $w = 0.45$, then the approximate bias in $\hat{w}$ for grade B is −0.035. As it derives from a Taylor series in powers of $1/\sqrt{T}$, the accuracy of the approximate bias may be poor for low values of $T$. For higher values of $T$, the results of Table 6 are comparable to the simulation results of Section 4.

Table 5(a). Distribution of estimated factor loadings for w = 0.15 and T = 20.

                         MM                        MLE1                      MLE2              MLE3
Statistic          A       B       C        A       B       C        A       B       C        All
Mean             0.0956  0.1091  0.1201   0.1220  0.1180  0.1257   0.1643  0.1383  0.1412   0.1341
Std. Dev.        0.1090  0.0819  0.0765   0.1174  0.0813  0.0736   0.1032  0.0703  0.0638   0.0533
RMSE             0.1218  0.0915  0.0821   0.1206  0.0874  0.0775   0.1042  0.0712  0.0644   0.0556
Percentile
  2.5            0.0000  0.0000  0.0000   0.0013  0.0010  0.0019   0.0095  0.0099  0.0142   0.0111
  5.0            0.0000  0.0000  0.0000   0.0021  0.0028  0.0042   0.0200  0.0194  0.0273   0.0318
  50.0 (Med.)    0.0000  0.1170  0.1300   0.0955  0.1212  0.1314   0.1544  0.1379  0.1431   0.1387
  95.0           0.2900  0.2384  0.2350   0.3390  0.2540  0.2430   0.3450  0.2570  0.2438   0.2173
  97.5           0.3124  0.2590  0.2555   0.3814  0.2772  0.2622   0.3817  0.2793  0.2595   0.2305

Table 5(b). Distribution of estimated factor loadings for w = 0.30 and T = 20.

                         MM                        MLE1                      MLE2              MLE3
Statistic          A       B       C        A       B       C        A       B       C        All
Mean             0.1960  0.2570  0.2718   0.2354  0.2723  0.2779   0.2898  0.2847  0.2850   0.2849
Std. Dev.        0.1372  0.0874  0.0776   0.1519  0.0863  0.0757   0.1173  0.0773  0.0707   0.0621
RMSE             0.1721  0.0973  0.0826   0.1650  0.0906  0.0788   0.1177  0.0788  0.0723   0.0639
Percentile
  2.5            0.0000  0.0531  0.1011   0.0040  0.0820  0.1179   0.0476  0.1307  0.1377   0.1605
  5.0            0.0000  0.1171  0.1402   0.0081  0.1216  0.1498   0.0892  0.1529  0.1660   0.1793
  50.0 (Med.)    0.2201  0.2583  0.2738   0.2393  0.2750  0.2813   0.2900  0.2841  0.2876   0.2871
  95.0           0.4043  0.3994  0.3975   0.4784  0.4072  0.3968   0.4862  0.4115  0.3961   0.3858
  97.5           0.4343  0.4297  0.4170   0.5429  0.4246  0.4199   0.5280  0.4333  0.4171   0.4018

Table 5(c). Distribution of estimated factor loadings for w = 0.45 and T = 20.

                         MM                        MLE1                      MLE2              MLE3
Statistic          A       B       C        A       B       C        A       B       C        All
Mean             0.3020  0.3816  0.4105   0.3591  0.4209  0.4251   0.4289  0.4319  0.4278   0.4280
Std. Dev.        0.1504  0.0984  0.0923   0.1732  0.1022  0.0865   0.1255  0.0880  0.0796   0.0753
RMSE             0.2110  0.1198  0.1004   0.1955  0.1062  0.0900   0.1272  0.0898  0.0826   0.0784
Percentile
  2.5            0.0000  0.1948  0.2479   0.0119  0.2026  0.2553   0.1485  0.2475  0.2685   0.2813
  5.0            0.0000  0.2302  0.2696   0.0238  0.2484  0.2838   0.2061  0.2816  0.2924   0.2986
  50.0 (Med.)    0.3274  0.3788  0.4053   0.3849  0.4258  0.4277   0.4362  0.4354  0.4318   0.4309
  95.0           0.5074  0.5463  0.5688   0.6184  0.5780  0.5598   0.6215  0.5677  0.5527   0.5495
  97.5           0.5449  0.5782  0.6028   0.6493  0.6127  0.5788   0.6499  0.5968  0.5777   0.5702

Table 5(d). Distribution of estimated factor loadings for w = 0.60 and T = 20.

                         MM                        MLE1                      MLE2              MLE3
Statistic          A       B       C        A       B       C        A       B       C        All
Mean             0.3675  0.4857  0.5388   0.4374  0.5517  0.5740   0.5384  0.5721  0.5767   0.5733
Std. Dev.        0.1802  0.1159  0.1106   0.2004  0.1095  0.0842   0.1193  0.0891  0.0749   0.0721
RMSE             0.2941  0.1628  0.1264   0.2580  0.1196  0.0881   0.1342  0.0933  0.0784   0.0768
Percentile
  2.5            0.0000  0.2718  0.3362   0.0107  0.3160  0.3930   0.2548  0.3755  0.4081   0.4004
  5.0            0.0000  0.3078  0.3642   0.0248  0.3610  0.4267   0.3112  0.4122  0.4377   0.4368
  50.0 (Med.)    0.4048  0.4776  0.5355   0.4769  0.5680  0.5747   0.5606  0.5843  0.5795   0.5822
  95.0           0.6007  0.6862  0.7262   0.6803  0.7037  0.7020   0.6890  0.7008  0.6917   0.6742
  97.5           0.6432  0.7372  0.7651   0.6970  0.7214  0.7294   0.7037  0.7244  0.7150   0.6895

Table 6. Bias (times T) in MM estimator for factor loading.

                          Bias
Grade    w = 0.15    w = 0.30    w = 0.45    w = 0.60
A        −6.95       −4.13       −10.59      −27.77
B        −2.83       −1.89       −3.50       −6.56
C        −3.21       −1.52       −1.77       −2.36

Table 7. Bias (times T) in MM estimator for default threshold.

                                     Bias
Grade    γ         w = 0.15    w = 0.30    w = 0.45    w = 0.60
A        −2.968    −0.0013     −0.0021     −0.0047     −0.0131
B        −2.326    −0.0025     −0.0053     −0.0123     −0.0291
C        −1.645    −0.0057     −0.0123     −0.0258     −0.0508

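Thus far, we have assumed that $\gamma$ is known. If not, then the bias in $\hat{w}$ has components associated with the variance of $Y_1$ and the covariance between $Y_1$ and $Y_2$, as well as the term (analyzed above) associated with the variance of $Y_2$. The MM estimator $\hat{\gamma}$ is biased too. Arguments parallel to those above show that

$$E[\hat{\gamma}] - \gamma \approx \frac{1}{2} V[Y_1] \left. \frac{d^2}{dy^2} \Phi^{-1}(y) \right|_{y = y_1^*} \tag{19}$$

Proceeding as before, we find the variance of $Y_1$ is given by

$$T \cdot V[Y_1] = \Phi_2(\rho; \gamma) - p^2 + \frac{1}{n} \bigl( p - \Phi_2(\rho; \gamma) \bigr). \tag{20}$$

For the second derivative of $\Phi^{-1}$, we have

$$\left. \frac{d^2}{dy^2} \Phi^{-1}(y) \right|_{y = y_1^*} = -\frac{\Phi''(\gamma)}{\Phi'(\gamma)^3} = \frac{\gamma}{\phi(\gamma)^2}$$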

The bias in $\hat{\gamma}$ is away from zero.

Table 7 displays the approximate bias in the default threshold estimator as given by equation (19) for the three hypothetical buckets in Table 1. As above, we vary $w$ from 0.15 to 0.60, and express the bias as a multiple of $1/T$. We see that the bias is negligible in realistic sample sizes. For example, with only ten years of data, the bias for grade B is −0.0012 when $w = 0.45$, so $E[\hat{\gamma}] = -2.3276$ against $\gamma = -2.3263$.

Conclusion

We have examined the small sample properties of method of moments and maximum likelihood estimators of portfolio credit risk models. We show that estimates of default thresholds are reasonably robust to the choice of estimators, but estimates of factor loadings ($w$) can differ markedly. The unrestricted estimators for $w$ are subject to large bias towards zero and high mean square error in realistic sample sizes. The downward bias is most severe for higher quality grades.

The performance of the method of moments (MM) estimator for $w$ is particularly dismal, as $E[\hat{w}]$ is roughly one-third less than the true value when we have $T = 20$ years of data. The virtue of the MM estimator, and indeed the main source of its relative popularity in practical application, is its tractability. The cost of this tractability is bias and inefficiency. In realistic sample sizes, the costs of MM clearly should outweigh the benefits. Work in progress will determine whether we can improve the performance of moment estimators without too much sacrifice in computational facility. One possibility is to use cross-bucket moments as overidentifying information in generalized method of moments (GMM) estimation.

The three maximum likelihood estimators we study can be ordered by the restrictiveness of the assumptions they impose. The least restrictive (MLE1) allows for the possibility that obligors in different rating grades may be sensitive to different risk factors. The second (MLE2) imposes the restriction that obligors in all grades are sensitive to a single systematic risk factor, but allows factor loadings to vary across grades. Finally, the most restrictive (MLE3) requires that factor loadings be constant across rating grades.

If the restrictions imposed by the last estimator are correct, all three ML estimators are consistent. We find that all three estimators for $w$ are downward biased in small samples, but the biases for MLE2 and MLE3 are much smaller than the bias for MLE1. The gap between MLE2 and MLE3 is relatively modest in terms of bias, though for higher quality grades MLE3 has a much smaller variance.

In applied work, an intermediate approach between MLE2 and MLE3 could be preferred. Such an estimator would allow for the possibility that highly-rated obligors have systematically higher or lower factor loadings than lower-rated obligors, while still capturing the benefits of imposing structure on the relationship between PDs and factor loadings. Instead of fixing a single common value for all factor loadings as in MLE3, factor loadings would be expressed as a simple parametric function of the default threshold. This approach would permit greater flexibility in fitting data than MLE3, but afford greater efficiency than MLE2.

Finally, MLE3 or a blended version of MLE2 and MLE3 provides two practical advantages over the less restrictive estimators. First, by limiting the number of parameters that must be estimated, cross-bucket restrictions on factor loadings go a long way toward solving identification problems that arise when the number of obligors in a bucket is small or when defaults are infrequent. When very few defaults are observed in a bucket, estimating all the parameters of the more general default models becomes difficult or impossible. Such circumstances may arise, for example, when buckets consist of a large number of narrowly-defined rating grades. Second, and perhaps more important, making factor loadings a (possibly constant) parametric function of default thresholds ensures that a bucket’s factor loading can be calculated directly from its PD. This provides a natural means for assigning factor loadings to bank rating grades that straddle or fall between rating agency grades.

Appendices

A. Identification and Convergence Problems

In the main Monte Carlo study four sets of 1,500 synthetic datasets were constructed with $w$ set to 0.15, 0.30, 0.45, and 0.60. For some of these datasets, one or more of the estimators described in Section 3 failed to generate a full set of model parameters. The table below shows the fraction of simulations for which one or more parameters could not be estimated.

w       MM       MLE1     MLE2     MLE3
0.15    0.005    0.000    0.000    0.000
0.30    0.005    0.003    0.003    0.000
0.45    0.007    0.038    0.043    0.007
0.60    0.061    0.281    0.311    0.121

For grades where the PD implied by $\gamma_g$ is small, a simulated dataset may contain a very small number of defaults. This outcome is particularly likely when $w$ is large. When no defaults are observed in a bucket, the unrestricted model parameters (MM, MLE1) are not identified. When fewer than two defaults are observed, the MM asset correlation ($\rho$) is negative. In this case, we impose a lower bound of $w = 0$.

Even when model parameters are strictly identified by the data, the optimization algorithm used to obtain maximum likelihood estimators may fail to converge to a solution. Often such convergence problems arise when the matrix of second partial derivatives of the log-likelihood function (the Hessian matrix) is nearly singular. Rothenberg [23] shows that such singularity may result when model parameters are “nearly” unidentified. In general, highly correlated observations contain less information that is helpful in identifying model parameters than independent data. For this reason, it is perhaps not surprising that convergence problems are greater for higher values of $w$. Identification problems can be overcome by imposing parametric restrictions such as R3. This helps explain why MLE3 is more likely to converge to a solution than MLE1 or MLE2.


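B. Variance of the Second Factorial Moment Estimator

For the variance of $Y_2$, we write

$$V[Y_2] = (n(n-1))^{-2}\, V[n(n-1) Y_2] = (n(n-1))^{-2}\, V\Bigl[ (1/T) \sum_t d_t(d_t - 1) \Bigr]. \tag{21}$$

As the $d_t$ are identically and independently distributed across time,

$$V\Bigl[ (1/T) \sum_t d_t(d_t - 1) \Bigr] = \frac{1}{T} V[d_1(d_1 - 1)] = \frac{1}{T} \Bigl( E[d_1^2 (d_1 - 1)^2] - E[d_1(d_1 - 1)]^2 \Bigr).$$

We want to exploit the factorial moment rule

$$E[d_1(d_1 - 1) \cdots (d_1 - j + 1)] = n(n-1) \cdots (n - j + 1)\, \Phi_j(\rho; \gamma).$$

Straightforward algebra shows that

$$d^2 (d-1)^2 = d(d-1)(d-2)(d-3) + 4 d(d-1)(d-2) + 2 d(d-1)$$

and from this we obtain

$$E[d_1^2 (d_1 - 1)^2] = n(n-1)(n-2)(n-3)\, \Phi_4(\rho; \gamma) + 4 n(n-1)(n-2)\, \Phi_3(\rho; \gamma) + 2 n(n-1)\, \Phi_2(\rho; \gamma).$$

Substituting into equation (21), we arrive at

$$\begin{aligned}
T \cdot V[Y_2] &= \frac{(n-2)(n-3)}{n(n-1)} \Phi_4(\rho; \gamma) + \frac{4(n-2)}{n(n-1)} \Phi_3(\rho; \gamma) + \frac{2}{n(n-1)} \Phi_2(\rho; \gamma) - \Phi_2(\rho; \gamma)^2 \\
&= \Phi_4(\rho; \gamma) - \Phi_2(\rho; \gamma)^2 + \frac{1}{n(n-1)} \Bigl( -2(2n-3)\, \Phi_4(\rho; \gamma) + 4(n-2)\, \Phi_3(\rho; \gamma) + 2\, \Phi_2(\rho; \gamma) \Bigr)
\end{aligned}$$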
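The two algebraic steps in this appendix are easy to check mechanically. The short sketch below verifies the factorial expansion on a range of integers (both sides are degree-four polynomials, so agreement at more than five points establishes the identity) and confirms with exact rational arithmetic that the coefficient on $\Phi_4$ regroups as stated. The function names are illustrative only.

```python
from fractions import Fraction


def squared_second_factorial(d):
    """Left-hand side: d^2 (d-1)^2."""
    return d ** 2 * (d - 1) ** 2


def factorial_expansion(d):
    """Right-hand side: sum of falling factorials of orders 4, 3 and 2."""
    return (d * (d - 1) * (d - 2) * (d - 3)
            + 4 * d * (d - 1) * (d - 2)
            + 2 * d * (d - 1))


def phi4_coefficient(n):
    """Coefficient on Phi_4 before regrouping: (n-2)(n-3) / (n(n-1))."""
    return Fraction((n - 2) * (n - 3), n * (n - 1))


def phi4_coefficient_regrouped(n):
    """Coefficient on Phi_4 after regrouping: 1 - 2(2n-3) / (n(n-1))."""
    return 1 - Fraction(2 * (2 * n - 3), n * (n - 1))


# Both checks print True.
print(all(squared_second_factorial(d) == factorial_expansion(d) for d in range(50)))
print(all(phi4_coefficient(n) == phi4_coefficient_regrouped(n) for n in range(2, 500)))
```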


References

1. Altman, E. I. and Rijken, H. A. (2004), “How rating agencies achieve rating stability,” Journal of Banking and Finance, 28(11), 2679–2714.
2. Cantor, R. and Mann, C. (2009), “Are corporate bond ratings procyclical? An update,” Special Comment, Moody’s Investor Services.
3. de Servigny, A. and Renault, A. (2002), “Default correlation: empirical evidence,” Technical report, Standard & Poor’s.
4. Embrechts, P., McNeil, A. J. and Straumann, D. (1999), “Correlations: pitfalls and alternatives,” Risk, 12(2), 69–71.
5. Feng, D., Gourieroux, C. and Jasiak, J. (2008), “The ordered qualitative model for credit rating transitions,” Journal of Empirical Finance, 15(1), 111–130.
6. Frey, R. and McNeil, A. J. (2003), “Dependent defaults in models of portfolio credit risk,” Journal of Risk, 6(1), 59–92.
7. Gagliardini, P. and Gourieroux, C. (2005), “Migration correlation: Definition and efficient estimation,” Journal of Banking and Finance, 29(4), 865–894.
8. Gagliardini, P. and Gourieroux, C. (2005), “Stochastic migration models with application to corporate risk,” Journal of Financial Econometrics, 3(3), 188–226.
9. Gordy, M. B. (2000), “A comparative anatomy of credit risk models,” Journal of Banking and Finance, 24(1–2), 119–149.
10. Gordy, M. B. (2003), “A risk-factor model foundation for ratings-based bank capital rules,” Journal of Financial Intermediation, 12(3), 199–232.
11. Gourieroux, C. and Jasiak, J. (2008), “Granularity adjustment for default risk factor model with cohorts,” working paper.
12. Gupton, G. M., Finger, C. C. and Bhatia, M. (1997), CreditMetrics–Technical Document, J. P. Morgan & Co., New York.
13. Hamerle, A., Liebig, T. and Rosch, D. (2003), “Credit risk factor modeling and the Basel II IRB approach,” Discussion Paper Series 2: Banking and Financial Studies 02/2003, Deutsche Bundesbank.
14. Heitfield, E. A. (2008), “Parameter uncertainty and the credit risk of collateralized debt obligations,” working paper.
15. Loffler, G. (2003), “The effects of estimation error on measures of portfolio credit risk,” Journal of Banking and Finance, 27(8), 423–444.
16. Loffler, G. (2004), “An anatomy of rating through the cycle,” Journal of Banking and Finance, 28(3), 695–720.
17. Loffler, G. (2005), “Avoiding the rating bounce: Why rating agencies are slow to react to new information,” Journal of Economic Behavior and Organization, 56(3), 365–381.
18. McNeil, A. J. and Wendin, J. P. (2006), “Dependent credit migrations,” Journal of Credit Risk, 2(2), 87–114.
19. McNeil, A. J. and Wendin, J. P. (2007), “Bayesian inference for generalized linear mixed models of portfolio credit risk,” Journal of Empirical Finance, 14(2), 131–149.
20. Merton, R. C. (1974), “On the pricing of corporate debt: The risk structure of interest rates,” Journal of Finance, 29(2), 449–470.
21. Nagpal, K. and Bahar, R. (2001), “Measuring default correlation,” Risk, 14(3), 129–132.
22. Phillips, P. C. B. and Yu, J. (2009), “Simulation-based estimation of contingent-claims prices,” Review of Financial Studies, 22(9), 3669–3705, September 2009.
23. Rothenberg, T. J. (1971), “Identification in parametric models,” Econometrica, 39(3), 577–591.
24. Stefanescu, C., Tunaru, R. and Turnbull, S. (2009), “The credit rating process and estimation of transition probabilities: A Bayesian approach,” Journal of Empirical Finance, 16(2), 216–234.
25. Tarashev, N. A. (2009), “Measuring portfolio credit risk correctly: why parameter uncertainty matters,” Working Paper 280, Bank for International Settlements.
26. Treacy, W. F. and Carey, M. S. (1998), “Credit risk rating at large U.S. banks,” Federal Reserve Bulletin, 84(11), 897–921.
27. Vasicek, O. A. (1998), “A series expansion for the bivariate normal integral,” Journal of Computational Finance, 1(4), 5–10.
28. Vazza, D., Aurora, D. and Kraemer, N. (2009), “2008 annual global corporate default study and rating transitions,” Technical report, Standard & Poor’s.


April 14, 2010 9:36 Proceedings Trim Size: 9in x 6in 003

Heterogeneous Beliefs with Mortal Agents∗

A. A. Brown and L. C. G. Rogers†

Statistical Laboratory, University of Cambridge
E-mail: [email protected]

This paper will examine a model with many agents, each of whom has a different belief about the dynamics of a risky asset. The agents are Bayesian and so learn about the asset over time. All agents are assumed to have a finite (but random) lifetime. When an agent dies, he passes his wealth (but not his knowledge) on to his heir. As a result, the agents never become sure of the dynamics of the risky asset. We derive expressions for the stock price and riskless rate. We then use numerical examples to exhibit their behaviour.

1. Introduction

This paper will look at a model of agents with heterogeneous beliefs. We assume that there is a single risky asset that produces a dividend process. Agents are unsure of the dynamics of the dividend process. Specifically, they do not know one of the parameters that governs its dynamics. Agents therefore form beliefs about this parameter and update these over time. To avoid agents eventually determining the true value of the parameter, we assume that agents are finite lived.

The paper will build on previous work of Brown & Rogers (2009). That paper explained the general theory of how to incorporate heterogeneous beliefs into a dynamic equilibrium model. However, in the case in which the agents were Bayesian, it was seen that the agents would eventually determine the true drift of the dividend process. The purpose of this paper is therefore to investigate a model in which there is a non-trivial steady state. This is done through the assumption that the different agents are in fact dynasties. Each member of the dynasty has a finite but random lifetime and when that member dies, he will pass on his wealth, but not his knowledge, to his heir. The paper will explain how to construct and solve this model and will lead to a stationary distribution for the stock price.

∗It is a pleasure to thank the workshop organisers, Masaaki Kijima, Yukio Muromachi, Hidetaka Nakaoka, and Keiichi Tanaka for their warm welcome and efficient organisation; the many workshop participants for interesting discussions; and the referee of this paper for valuable comments on the first draft.
†Corresponding author.


As in Brown & Rogers (2009), we assume that there is a single risky asset which pays a dividend continuously in time. In addition there is a riskless asset in zero net supply. The dividend process of the stock is now assumed to be a quadratic function of an Ornstein-Uhlenbeck (OU) process. All the agents know all the parameters of the OU process except the mean to which it reverts. All the agents observe the OU process as it evolves and so as time progresses they update their beliefs about the unknown parameter. However, since they are finite lived, they will never find its true value.

The model described is quite simple, yet already there is enough to make the asset pricing non-trivial. Just as in Brown & Rogers (2009), the agents maximize their expected utilities subject to their budget constraints and we use these optimisation problems to derive a state price density. Using this state price density we can then price the risky asset as the net present value of future dividends. Comparative statics allow us to see how the stock price depends on the parameters of our model. We also produce a volatility surface for the stock, which behaves very reasonably.

The structure of the paper is as follows. We give a brief literature review below. Section 2 introduces the model and solves the equilibrium to determine a state price density. Section 3 then uses this state price density to calculate the prices of the stock and bond; these calculations are non-trivial. Section 4 looks at comparative statics of the model and Section 5 concludes.

1.1 Literature Review

There is a large literature on heterogeneous beliefs, which has been discussed in detail in Brown & Rogers (2009). Work includes Kurz (2008b), Kurz (1994), Kurz (1997), Kurz & Motolese (2006), Kurz (2008a), Kurz et al. (2005), Fan (2006), Harrison & Kreps (1978), Morris (1996), Wu & Guo (2003), Wu & Guo (2004), Harris & Raviv (1993), Kandel & Pearson (1995), Buraschi & Jiltsov (2006), Jouini & Napp (2007). Closer to the work presented here are the papers that assume that there is a parameter of the economy that is unknown to the agents. We briefly review such models here.

Basak (2000) considers a two-agent model in which each agent receives an endowment process. There is also an extraneous process that agents believe may affect the economy. The endowment process and all its parameters are observed. The extraneous process is observed, but the parameters of the stochastic differential equation (SDE) that drives it are not known to the agents. They form beliefs about the drift term in this SDE and update their beliefs in a Bayesian manner. The paper analyses this problem and derives quantities such as the consumption, the state price density and riskless rate. Basak also explains how to generalise the model to multiple agents and multiple extraneous processes.

Basak (2005) also considers a model with two agents, who each receive an endowment process. The aggregate endowment process is observed by the agents. They also observe its volatility, but not its drift; they use filtering to determine this drift. There is assumed to be a bond and risky security, both in zero net supply. Again, agents do not know the drift of the stock price. Agents maximize the expected utility of consumption. He then solves for the equilibrium and uses it to derive interest rates and perceived market risk of the agents. He also gives a number of generalisations to the model. For example, he considers the case in which there is a process which does not directly affect the asset prices. However, each agent thinks that this process does affect the dynamics of the asset prices and so this changes the equilibrium. He also looks at the case of multiple agents and again derives the riskless rate and perceived market prices of risk. The final part of his paper looks at further extensions to his model; for example, he explores a monetary model in which there is a money supply that is stochastic and agents disagree on its drift.

Gallmeyer & Hollifield (2008) have considered the effects of adding a short-sale constraint to a model with heterogeneous beliefs. They consider a model with two agents. These agents are unsure about the drift of the output process of the economy. They start with initial beliefs about the drift and use filtering to update these. The agent who is initially more pessimistic is assumed to have logarithmic utility and a short sale constraint. The optimistic agent is assumed to have general CRRA utility and does not have a short sale constraint. The authors examine this model and derive expressions for the state price densities, stock price and consumption. In particular, they examine the effects of the imposition of the short sale constraint on the stock price.

The paper of Zapatero (1998) considers a model in which there is an aggregate endowment process that obeys an SDE driven by two independent Brownian motions. The constant drift of the process is unknown to the agents. There are 2 groups of agents and they each have a different Gaussian prior for this drift. Zapatero also considers the case in which as well as observing the endowment process, the agents also see a signal, which again is driven by the two Brownian motions, but has unknown drift. Again, agents have prior beliefs about this drift, which they update. He derives an equilibrium and shows that volatility of the interest rate is higher in an economy with the additional information source.

Li (2007) considers a model with 2 groups of agents. There is a dividend process which obeys some SDE, but the drift of this SDE is unknown. The drift can satisfy one of two different SDEs. Each group of agents attaches a different probability to the drift obeying the two different SDEs. They update this probability as they observe more data. Agents are assumed to have log utility and Li derives the stock price, wealth and consumption of agents in this model. He also analyses the volatility of the stock price.

Turning to the Bayesian learning side of our story, we remark that there is an extensive literature on Bayesian learning in finance and economics in which agents update their beliefs as they observe data. Work includes Hautsch & Hess (2004), Kandel & Pearson (1995), Schinkel et al. (2002) and Kalai & Lehrer (1993), each of whom uses this Bayesian learning in quite different setups. For example, Schinkel et al. (2002) apply Bayesian learning to n competitive firms who set prices but do not know the demand function. They observe demand at each step and use this to update their posterior belief for the state of the world, which then impacts their perceived demand function. The authors show that prices converge. Kalai & Lehrer (1993) apply Bayesian learning to an n-person game in which agents do not know the payoff matrices of their competitors. They show that the equilibrium will approach the Nash equilibrium of the system. Hautsch & Hess (2004) apply Bayesian learning to explain why more precise data has a larger impact on market prices. They test this by looking at the behaviour of T-bond futures when unemployment data is announced.

Closer to our work, Guidolin & Timmermann (2001) look at a discrete time model in which the dividend process can have one of two different growth rates over each time period and the probability of each growth rate is unknown to the agents. The agents are learning, so they update their estimate for the unknown probability at each time step. In order to avoid the problem of agents discovering the true probability, they also consider agents who only look at a rolling window of data.

2. The Model

The setup of our model is similar to Brown & Rogers (2009). There is a single productive asset, which we refer to as the stock, which pays dividends continuously in time. The dividend at time $t$ is $\delta_t$. The dividend process is assumed to be a quadratic function of a stationary Ornstein-Uhlenbeck (OU) process.

Since we are interested in obtaining a stationary distribution for the stock price, the construction of the probability space requires slightly more care than in Brown & Rogers (2009). Let $\Omega$ denote the sample space. We set $\Omega = C(\mathbb{R}, \mathbb{R})$, the space of continuous functions from $\mathbb{R}$ to $\mathbb{R}$. Let $X_t(\omega) \equiv \omega(t)$ denote the canonical process. Furthermore, let $\mathcal{F}_t = \sigma(X_s : -\infty \le s \le t)$.

As before, the reference measure is denoted by $P^0$. We assume that under this measure $X$ is a stationary OU process which reverts to mean zero and has reversion rate $\lambda$.¹

Next, we define:

$$W_t = X_t - X_0 + \int_0^t \lambda X_s \, ds \tag{1}$$

¹An Ornstein-Uhlenbeck process which reverts to mean $a'$ with reversion rate $\lambda$ satisfies the SDE $dX_t = dW_t + \lambda(a' - X_t)\,dt$, where $W$ is a standard Brownian motion under the reference measure. While it is common to allow a non-unit volatility in the definition of the OU process, this can always be scaled to 1, and in view of the form (2) of the dividend process, this scaling can be absorbed into the constants $a_0, a_1, a_2$.


for all $t \in \mathbb{R}$. Since $X$ is an OU process, we observe that the process $(W_t)_{t \ge 0}$ is a standard Brownian motion.²

2.1 The Dividend Process

We now define the dividend process by:

$$\delta_t = a_0 + a_1 X_t + a_2 X_t^2 \tag{2}$$

for some constants $a_0, a_1, a_2$, where $a_0$ and $a_2$ are non-negative.

The simplest non-trivial setup is that in which $a_0 = a_2 = 0$, in which case the dividend process will simply be an OU process. However, choosing such values of $a_0$ and $a_2$ means that there is a positive probability that the dividend process will become negative, which is unrealistic. To overcome this problem, the constants can be chosen so that $a_0 \ge a_1^2/(4a_2)$, in which case the dividend process will always be non-negative. Furthermore, it will transpire that considering the case in which the dividend process is a quadratic function of $X$ is no more difficult than the case in which $\delta$ is simply a scaling of $X$.³
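The non-negativity condition can be read off by completing the square in (2). The short worked step below is an added illustration rather than part of the original derivation, and it assumes $a_2 > 0$ (for $a_2 = 0$ the dividend is non-negative for every value of $X_t$ only when $a_1 = 0$).

```latex
\begin{aligned}
\delta_t &= a_2\left(X_t + \frac{a_1}{2a_2}\right)^{2} + a_0 - \frac{a_1^{2}}{4a_2}, \\
\min_{x \in \mathbb{R}} \left(a_0 + a_1 x + a_2 x^{2}\right) &= a_0 - \frac{a_1^{2}}{4a_2} \;\ge\; 0
\quad\Longleftrightarrow\quad a_0 \ge \frac{a_1^{2}}{4a_2}.
\end{aligned}
```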

2.2 The Agents

In our model there are $N$ agents at all times. We assume that each person has a random lifetime. When this person dies, their wealth is immediately passed on to their (ignorant) child. Thus we are viewing each agent as a dynasty rather than a person.⁴

Formally, there exist times $(T^i_k)_{k \in \mathbb{Z}}$ which are the jump times of a stationary renewal process. At each of these times $T^i_k$, agent $i$ will die and be replaced by his child. Thus, the wealth of the agent will be maintained, but their beliefs will not; the child will start with his own ignorant beliefs which will not depend on any historical data.

Turning now to the beliefs of the agents, first recall that, under the reference measure, $(X_t)_{t \in \mathbb{R}}$ is an OU process with zero mean. However, under the true measure, $X$ will revert to level $a$, which will not necessarily be zero. The agents do not know this level. They will use Bayesian updating to deduce it.

²It will transpire that we are only interested in the increments of $W$; thus it does not matter that $W_0$ is known before time 0.

³The case in which $\delta$ is a quadratic function of $X$ is slightly more complicated, since two different values of $X$ can give the same value of $\delta$. Hence, $\sigma(X_s : t_0 \le s \le t) \ne \sigma(\delta_s : t_0 \le s \le t)$. Thus, we must assume that the agents observe the process $X$, rather than just observing the process $\delta$.

⁴This idea of dynasties has been used by Nakata (2007), who considers an economy in which at any time point there are $H$ young and $H$ old agents. Each agent lives for 2 periods. Young agent $h \in \{1, \dots, H\}$ has the same preferences and beliefs as the old agent $h$. He then considers a Rational Beliefs Equilibrium as explained by Kurz. However, all agents in his model live for exactly two units of time, in contrast to our assumptions.


We need to determine the measure that each agent works under. First note that if we restrict to the time interval $[s, t]$, we may define a new measure by:

$$\frac{dP^a}{dP^0} = \exp\left(\lambda a (W_t - W_s) - \frac{1}{2}(\lambda a)^2 (t - s)\right) \tag{3}$$

It follows from the Cameron-Martin-Girsanov theorem⁵ that a standard Brownian motion under $P^0$ becomes a Brownian motion with drift $\lambda a$ under $P^a$. Formally, $W_r = \tilde{W}_r + \lambda a r$ for $s \le r \le t$, where $\tilde{W}$ is a standard Brownian motion under $P^a$. Thus,

$$dX_t = d\tilde{W}_t + \lambda(a - X_t)\,dt$$

so we see that, under $P^a$, $X$ is an OU process which reverts to mean $a$.

Since agents do not know $a$, the beliefs of each agent simply consist of their distribution function for the parameter $a$. When a member of the $i$th dynasty is born, he gives $\lambda a$ a prior distribution.⁶ We make the reasonable modelling assumption that this child’s prior for the parameter $\alpha \equiv \lambda a$ is Normal with mean $\alpha^i$ and precision⁷ $\varepsilon$. Hence, all members of dynasty $i$ begin life with the same prior precision $\varepsilon$. The agent then updates his prior according to his observation of $(X_s)_{t^i_k \le s \le t}$, where $t^i_k$ denotes the time of birth of the current child and $t$ is the current time.

If the agent knew the value of $a$, he would simply use a change of measure of the form (3). However, $a$ is unknown, so the agent must weight each of the changes of measure according to his prior distribution for $a$. Hence at time $t$, agent $i$ has posterior density

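$$\begin{aligned}
\pi^i_t(\alpha) &= \sqrt{\frac{\varepsilon}{2\pi}} \exp\left(-\frac{\varepsilon}{2}(\alpha - \alpha^i)^2 + \alpha\,(W_t - W_{t^i_k}) - \frac{1}{2}\alpha^2 (t - t^i_k)\right) \\
&= \sqrt{\frac{\varepsilon}{2\pi}} \exp\left(-\frac{\varepsilon}{2}(\alpha - \alpha^i)^2 + \alpha\,\Delta W - \frac{1}{2}\alpha^2 \Delta t\right),
\end{aligned} \tag{4}$$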

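for $\alpha$, where we use the abbreviations

$$\Delta t \equiv t - t^i_k, \qquad \Delta W \equiv W_t - W_{t^i_k}.$$

Notice that this posterior for $\alpha$ is of course Gaussian; when we maximize over $\alpha$, we find the posterior mean to be

$$\alpha_t = \frac{\Delta W + \varepsilon \alpha^i}{\varepsilon + \Delta t}, \tag{5}$$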

⁵See Rogers & Williams (2000), IV.38 for an account.
⁶This is equivalent to having a prior distribution for $a$, since $\lambda$ is known.
⁷Equivalently, the prior has variance $\varepsilon^{-1}$.


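which summarizes the way that agent $i$ learns from the observations.

Hence agent $i$’s law for the path has density with respect to the reference measure given by:

$$\Lambda^i_t = \int_{-\infty}^{\infty} \pi^i_t(\alpha)\, d\alpha = \sqrt{\frac{\varepsilon}{\varepsilon + \Delta t}} \exp\left(\frac{(\Delta W)^2 + 2\alpha^i \varepsilon\, \Delta W - \varepsilon (\alpha^i)^2 \Delta t}{2(\varepsilon + \Delta t)}\right) \tag{6}$$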
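The Gaussian integral behind (6) can be checked numerically. The sketch below is an added illustration, not part of the paper: it integrates the density (4) over $\alpha$ by the trapezoid rule and compares the result with the closed form (6), and it also evaluates the posterior mean (5). All function names and the sample values of $\varepsilon$, $\alpha^i$, $\Delta W$ and $\Delta t$ are hypothetical choices made for the example.

```python
import math


def posterior_unnormalised(alpha, eps, alpha_i, dW, dt):
    """Integrand of (6): the density (4) evaluated at alpha."""
    return math.sqrt(eps / (2.0 * math.pi)) * math.exp(
        -0.5 * eps * (alpha - alpha_i) ** 2 + alpha * dW - 0.5 * alpha ** 2 * dt
    )


def posterior_mean(eps, alpha_i, dW, dt):
    """Posterior mean from equation (5)."""
    return (dW + eps * alpha_i) / (eps + dt)


def density_closed_form(eps, alpha_i, dW, dt):
    """Closed form (6) for the agent's density with respect to the reference measure."""
    exponent = (dW ** 2 + 2.0 * alpha_i * eps * dW - eps * alpha_i ** 2 * dt) / (
        2.0 * (eps + dt)
    )
    return math.sqrt(eps / (eps + dt)) * math.exp(exponent)


def density_by_quadrature(eps, alpha_i, dW, dt, half_width=12.0, n=20001):
    """Trapezoid rule for the integral of (4) over alpha, centred on the posterior mean."""
    centre = posterior_mean(eps, alpha_i, dW, dt)
    lower = centre - half_width
    step = 2.0 * half_width / (n - 1)
    total = 0.0
    for k in range(n):
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * posterior_unnormalised(lower + k * step, eps, alpha_i, dW, dt)
    return total * step


if __name__ == "__main__":
    # Hypothetical illustrative values: prior precision 0.5, prior mean 1.0,
    # observed increment 0.7 over an elapsed lifetime of 2.0.
    args = (0.5, 1.0, 0.7, 2.0)
    print("closed form :", density_closed_form(*args))
    print("quadrature  :", density_by_quadrature(*args))
    print("posterior mean:", posterior_mean(*args))
```

The quadrature window is centred on the posterior mean so that the truncated tails are negligible for moderate parameter values; for very small $\varepsilon + \Delta t$ the window would need widening.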

2.3 Deriving the State Price Density

Associated with agent (or dynasty) $i$ is a utility function, which we take to be CARA: $U_i(t, x) = -\frac{1}{\gamma_i} e^{-\gamma_i x} e^{-\rho t}$. Here, $\rho$ is the discount factor, assumed to be the same for all agents. The agents seek to maximize the expected discounted utility of their consumption. Thus, agent $i$’s objective is:

$$\max \; E^0\left[\int_{t_0}^{\infty} U_i(t, c^i_t)\, \Lambda^i_t \, dt\right] \tag{7}$$

where $t_0$ is some start value, which we will later allow to go to $-\infty$. $\Lambda^i_t$ is the density derived in (6), which jumps at each of the times $T^i_k$.

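The objectives of the agents have the same form as in Brown & Rogers (2009), so its theory can be used to derive a state price density. In particular, by looking at the price of an arbitrary contingent claim we can deduce that:

$$\zeta_s \nu_i = U_i'(s, c^i_s)\, \Lambda^i_s$$

where $\nu_i$ is some $\mathcal{F}_{t_0}$ random variable,⁸ and $U_i'$ denotes the derivative of $U_i$ with respect to its second argument. Recalling the CARA form of $U_i$ and taking logs, we obtain:

$$\frac{\log \zeta_t}{\gamma_i} + \frac{\log \nu_i}{\gamma_i} = -\frac{\rho t}{\gamma_i} - c^i_t + \frac{\log \Lambda^i_t}{\gamma_i} \tag{8}$$

Summing (8) over $i$ and using market clearing gives:

$$\log \zeta_t \, \frac{1}{N}\sum_i \frac{1}{\gamma_i} + \frac{1}{N}\sum_i \frac{\log \nu_i}{\gamma_i} = -\frac{1}{N}\sum_i \frac{\rho t}{\gamma_i} - \frac{\delta_t}{N} + \frac{1}{N}\sum_i \frac{\log \Lambda^i_t}{\gamma_i}$$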

⁸We will shortly let $t_0$ tend to negative infinity and when this occurs, $\mathcal{F}_{t_0}$ will be trivial, thus $\nu_i$ will just be a constant.


2.4 A Continuum of Agents

Recall that there are $N$ different agents in our model. We will now let $N$ tend to infinity so that we can examine the case in which there is a continuum⁹ of agents. We assume that $\frac{1}{N}\sum_i \frac{1}{\gamma_i}$ has a finite limit and denote this limit by:

$$\Gamma^{-1} \equiv \lim_{N \to \infty} \frac{1}{N}\sum_i \frac{1}{\gamma_i}$$

Abusing notation slightly, we use $a_i$ to denote $\lim_{N \to \infty} a_i / N$. Hence:

$$\log \zeta_t + G' = -\rho t - \Gamma(a_1 X_t + a_2 X_t^2) + \Gamma \lim_{N \to \infty} \sum_i \frac{1}{N\gamma_i} \log \Lambda^i_t \tag{9}$$

where $G'$ is some $\mathcal{F}_{t_0}$-measurable function. We now let $t_0$ tend to negative infinity; $\mathcal{F}_{t_0}$ then becomes trivial, so $G'$ becomes a simple constant.¹⁰

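Only the last term in (9) requires further development. Writing $u_i$ for the time since the last person died in the $i$th dynasty, we obtain:

$$\Gamma \lim_{N \to \infty} \sum_i \frac{1}{N\gamma_i} \log \Lambda^i_t = \Gamma \lim_{N \to \infty} \sum_i \frac{1}{N\gamma_i} \left[\frac{1}{2}\log\left(\frac{\varepsilon}{\varepsilon + u_i}\right) + \frac{(W_t - W_{t-u_i})^2 + 2\alpha^i \varepsilon (W_t - W_{t-u_i}) - \varepsilon(\alpha^i)^2 u_i}{2(\varepsilon + u_i)}\right] \tag{10}$$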

We assume that the mean of $\alpha^i$ is given by $\langle\alpha\rangle$ and further that the distributions of $u_i$, $\alpha^i$ and $\gamma_i$ are all independent.¹¹ We further make the assumption that $u$ has a density $\varphi(\cdot)$, given by:

$$\varphi(u) = A(\varepsilon + u)\lambda e^{-\lambda u} \tag{11}$$

where $A = \frac{\lambda}{1 + \varepsilon\lambda}$ is chosen so that $\int_0^\infty \varphi(u)\,du = 1$. Since $\varphi(u)$ represents the probability of someone who is currently alive having age $u$, it follows that $\varphi(\cdot)$ must be decreasing. This gives the inequality $\lambda\varepsilon \ge 1$. The assumed form (11) of $\varphi$ is restrictive; in particular, it confounds the effect of the mean reversion parameter $\lambda$ and prior precision $\varepsilon$ with the lifetimes of the individual members of the dynasties,

⁹Why do we not begin with a continuum of agents, then? We find the derivation of the state-price density and the evolution of beliefs easier to understand in the finite-$N$ description, though it should be possible to derive these directly in a continuum model.

¹⁰We note that as $t_0 \to -\infty$, the expression on the right of (9) is almost surely finite, so the left hand side must be as well. Since our $\zeta$ and $(\nu_i)_{1 \le i \le N}$ were only chosen up to a multiplicative constant, we may choose them to depend on $t_0$ in such a way that as $t_0 \to -\infty$ both $\zeta$ and $G'$ are a.s. finite.

¹¹The assumed independence of the $\alpha^i$ and $\gamma_i$ is a substantive structural assumption made for tractability; that these are independent of the $u_i$ is a consequence of the renewal process structure of the death times, and the fact that the renewal process (which has been running for infinite time) will have reached steady-state.


and this makes it impossible to give a clean interpretation of our later investigation of the effects of varying $\lambda$ and $\varepsilon$. Nevertheless, we proceed with this assumption, as it would be difficult to make further progress without it.
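Two properties of the age density (11) are used in the text: the constant $A$ normalises it, and it is decreasing exactly when $\lambda\varepsilon \ge 1$. The following sketch is an added numerical illustration of both; the function names, grid sizes and parameter values are hypothetical and chosen only for the example.

```python
import math


def age_density(u, lam, eps):
    """Age density (11): phi(u) = A (eps + u) lam exp(-lam u), A = lam / (1 + eps lam)."""
    A = lam / (1.0 + eps * lam)
    return A * (eps + u) * lam * math.exp(-lam * u)


def total_mass(lam, eps, upper=20.0, n=40001):
    """Trapezoid approximation of the integral of phi over [0, upper]."""
    step = upper / (n - 1)
    total = 0.0
    for k in range(n):
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * age_density(k * step, lam, eps)
    return total * step


def is_decreasing(lam, eps, upper=10.0, n=2001):
    """Check on a grid that phi never increases from one grid point to the next."""
    step = upper / (n - 1)
    values = [age_density(k * step, lam, eps) for k in range(n)]
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


if __name__ == "__main__":
    print(total_mass(2.0, 1.0))        # close to 1
    print(is_decreasing(2.0, 1.0))     # lam * eps = 2, so the density decreases
    print(is_decreasing(2.0, 0.25))    # lam * eps = 0.5, so it first increases
```

The monotonicity condition follows from $\varphi'(u) \propto 1 - \lambda(\varepsilon + u)$, which is non-positive for all $u \ge 0$ precisely when $\lambda\varepsilon \ge 1$.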

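Using our expression for $\varphi$, equation (10) becomes:

$$\log \zeta_t = -G - \Gamma(a_1 X_t + a_2 X_t^2) - \rho t + \frac{1}{2}\int_0^\infty \frac{(W_t - W_{t-u})^2}{\varepsilon + u}\,\varphi(u)\,du + \langle\alpha\rangle\varepsilon \int_0^\infty \frac{W_t - W_{t-u}}{\varepsilon + u}\,\varphi(u)\,du$$

where $G$ is some new constant. This then gives us:

$$\log \zeta_t = -G - \Gamma(a_1 X_t + a_2 X_t^2) - \rho t + \frac{A}{2}\eta_t + \langle\alpha\rangle\varepsilon A\,\xi_t$$

where

$$\xi_t = \int_0^\infty (W_t - W_{t-u})\,\lambda e^{-\lambda u}\,du, \qquad \eta_t = \int_0^\infty (W_t - W_{t-u})^2\,\lambda e^{-\lambda u}\,du$$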

By rearrangement and use of Fubini (see appendix), we are able to show that:

$$\xi_t = X_t, \qquad \eta_t = X_t^2 + e^{-\lambda t}\int_{-\infty}^{t} \lambda e^{\lambda s} X_s^2\,ds$$
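The two identities for $\xi_t$ and $\eta_t$ can be checked on a simulated path. The sketch below is an added illustration under stated assumptions: it simulates a stationary zero-mean OU path on a finite window using exact Gaussian transitions, rebuilds the increments $W_t - W_{t-u}$ from definition (1), and evaluates both sides by the trapezoid rule. The window length, step size, seed and function names are hypothetical choices, and agreement is only up to discretisation and truncation error.

```python
import math
import random


def simulate_ou(lam, horizon, dt, seed):
    """Stationary zero-mean, unit-volatility OU path on a grid of step dt (exact transitions)."""
    rng = random.Random(seed)
    n = int(round(horizon / dt))
    decay = math.exp(-lam * dt)
    sd = math.sqrt((1.0 - decay * decay) / (2.0 * lam))
    x = [rng.gauss(0.0, math.sqrt(1.0 / (2.0 * lam)))]
    for _ in range(n):
        x.append(decay * x[-1] + sd * rng.gauss(0.0, 1.0))
    return x


def check_identities(lam=2.0, horizon=20.0, dt=1e-3, seed=1):
    """Return (xi, X_t, eta, X_t^2 + discounted past average of X^2) on one simulated path."""
    x = simulate_ou(lam, horizon, dt, seed)
    z = x[::-1]  # z[j] approximates X_{t - j*dt}; z[0] is X_t
    n = len(z)
    # integral[j] approximates the integral of X_s over [t - j*dt, t] (trapezoid rule)
    integral = [0.0]
    for j in range(1, n):
        integral.append(integral[-1] + 0.5 * (z[j] + z[j - 1]) * dt)
    xi = eta = past = 0.0
    for j in range(n):
        end_weight = 0.5 if j in (0, n - 1) else 1.0
        weight = end_weight * lam * math.exp(-lam * j * dt) * dt
        # W_t - W_{t-u} from definition (1), with u = j*dt
        y = z[0] - z[j] + lam * integral[j]
        xi += weight * y
        eta += weight * y * y
        past += weight * z[j] ** 2  # integral of lam e^{-lam u} X_{t-u}^2 du
    return xi, z[0], eta, z[0] ** 2 + past


if __name__ == "__main__":
    xi, x_t, eta, eta_rhs = check_identities()
    print("xi  vs X_t :", xi, x_t)
    print("eta vs rhs :", eta, eta_rhs)
```

The last term of $\eta_t$ is rewritten with the substitution $s = t - u$, so that $e^{-\lambda t}\int_{-\infty}^{t}\lambda e^{\lambda s}X_s^2\,ds = \int_0^\infty \lambda e^{-\lambda u}X_{t-u}^2\,du$, which is the form the loop accumulates.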

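Our final expression for the state price density is then given by:

$$\begin{aligned}
\log \zeta_t &= -G - \Gamma(a_1 X_t + a_2 X_t^2) - \rho t + \frac{A}{2}\left[X_t^2 + e^{-\lambda t}\int_{-\infty}^{t}\lambda e^{\lambda s}X_s^2\,ds\right] + \langle\alpha\rangle\varepsilon A X_t && (12) \\
&= -G + B X_t + C X_t^2 + U_t - \rho t && (13)
\end{aligned}$$

where:

$$B = \langle\alpha\rangle\varepsilon A - \Gamma a_1, \qquad C = \frac{A}{2} - \Gamma a_2$$

and

$$U_t = \frac{1}{2} A e^{-\lambda t}\int_{-\infty}^{t} \lambda e^{\lambda s} X_s^2\,ds$$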


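3. Asset Prices

3.1 The Interest Rate Process

We will use our state price density to derive the interest rate process. From Ito’s formula, we have:

$$\begin{aligned}
\frac{d\zeta_t}{\zeta_t} &= (B + 2CX_t)\,dW_t + \left(C + \frac{\lambda A}{2}X_t^2 - \lambda U_t - \rho - B\lambda X_t - 2\lambda C X_t^2 + \frac{1}{2}(B + 2CX_t)^2\right)dt \\
&\doteq \left[\left(-\rho + C + \frac{1}{2}B^2\right) + (-\lambda B + 2CB)X_t + \left(-2\lambda C + \frac{\lambda A}{2} + 2C^2\right)X_t^2 - \lambda U_t\right]dt
\end{aligned}$$

where the symbol $\doteq$ signifies that the two sides differ by a local martingale. The interest rate is equal to minus the coefficient of $dt$ in the above expansion, hence:

$$r_t = r(X_t, U_t) \equiv \left(\rho - C - \frac{1}{2}B^2\right) + B(\lambda - 2C)X_t + \left(2\lambda C - \frac{\lambda A}{2} - 2C^2\right)X_t^2 + \lambda U_t \tag{14}$$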

Thus, our model gives us an interest rate process of the form:

$$r_t = \alpha_0 + \alpha_1 X_t + \alpha_2 X_t^2 + \lambda U_t$$

for some constants $\alpha_i$, $i = 0, 1, 2$. Note that the interest rate process will depend on the behaviour of the dividend process in the past (via $U_t$) as well as on the current value of the dividend process. We therefore see that in some sense, high historical volatility generates high values of the riskless rate.
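The constants $\alpha_0, \alpha_1, \alpha_2$ follow directly from (14) together with the definitions of $A$, $B$ and $C$. The sketch below is an added illustration that simply transcribes those formulas; the function names and argument names are hypothetical. As a sanity check it compares a very large prior precision with the limiting form of the rate quoted in Section 4 for $a_1 = 0$, $a_2 = 1$.

```python
def rate_coefficients(lam, rho, Gamma, eps, mean_alpha, a1, a2):
    """Coefficients (alpha0, alpha1, alpha2) of r_t = alpha0 + alpha1 X + alpha2 X^2 + lam U, from (14)."""
    A = lam / (1.0 + eps * lam)             # normalising constant of the age density (11)
    B = mean_alpha * eps * A - Gamma * a1   # coefficient of X_t in (13)
    C = 0.5 * A - Gamma * a2                # coefficient of X_t^2 in (13)
    alpha0 = rho - C - 0.5 * B ** 2
    alpha1 = B * (lam - 2.0 * C)
    alpha2 = 2.0 * lam * C - 0.5 * lam * A - 2.0 * C ** 2
    return alpha0, alpha1, alpha2


def short_rate(x, u, lam, coefficients):
    """Evaluate r(X_t, U_t) for given state values x and u."""
    alpha0, alpha1, alpha2 = coefficients
    return alpha0 + alpha1 * x + alpha2 * x ** 2 + lam * u


if __name__ == "__main__":
    # Illustrative values only; a huge eps mimics agents who are certain of the mean.
    lam, rho, Gamma, a = 2.0, 0.04, 0.49, 2.01
    coeffs = rate_coefficients(lam, rho, Gamma, eps=1e9, mean_alpha=a, a1=0.0, a2=1.0)
    print(coeffs)
    print((rho + Gamma - 0.5 * a ** 2, a * (lam + 2.0 * Gamma), -2.0 * Gamma * (lam + Gamma)))
```

In the large-$\varepsilon$ limit $A \to 0$ and $A\varepsilon \to 1$, so $B \to \langle\alpha\rangle$ and $C \to -\Gamma$, which is why the two printed tuples should agree closely.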

3.2 The Stock Price

We will now calculate the stock price. We have:

$$S_t = E^0_t\left[\int_t^\infty \frac{\zeta_u \delta_u}{\zeta_t}\,du\right] = \frac{1}{\zeta_t}\int_t^\infty E^0_t\left[\zeta_u \delta_u\right]du \tag{15}$$

3.2.1 A PDE for the stock price

From the form of $\zeta_t$ and the Markovian structure, we will have that:

$$\zeta_t S_t = \zeta_t h(X_t, U_t) \tag{16}$$

for some function $h$. This function will satisfy a PDE which we may determine by observing that $\zeta_t S_t + \int_0^t \zeta_s \delta_s\,ds$ is a martingale and applying Ito’s formula. After a few calculations, we obtain the PDE:

$$0 = \frac{1}{2}h_{xx} + (B + (2C - \lambda)x)h_x + \lambda\left(\frac{A}{2}x^2 - u\right)h_u - r(x, u)h + (a_0 + a_1 x + a_2 x^2) \tag{17}$$


Unfortunately, it does not appear to be possible to solve this equation in closed form, so we will resort to another approach. However, before we do this, let us look at some of the consequences of (16) and (17). Suppose that under the real-world probability, $P^*$, the OU process reverts to level $a^*$, then we have that:

$$dS_t = h_x\,dW^*_t + h_x\lambda(a^* - X_t)\,dt + h_u\left(\frac{\lambda A}{2}X_t^2 - \lambda U_t\right)dt + \frac{1}{2}h_{xx}\,dt$$

where $W^*$ denotes a Brownian motion under measure $P^*$. After using (17) we get that:

$$dS_t = h_x\,dW^*_t + h_x\left((\lambda a^* - B) - 2CX_t\right)dt + r(X_t, U_t)h\,dt - (a_0 + a_1 X_t + a_2 X_t^2)\,dt$$

Hence, we see that the volatility and drift of the stock price are given by:

$$\Sigma_t = \frac{h_x(X_t, U_t)}{h(X_t, U_t)} \tag{18}$$

$$\mu^*_t = \frac{r(X_t, U_t)h(X_t, U_t) - (a_0 + a_1 X_t + a_2 X_t^2) + (\lambda a^* - 2CX_t - B)h_x(X_t, U_t)}{h(X_t, U_t)} \tag{19}$$

We shall use these expressions later.

3.2.2 Calculation of stock price via computation of conditional expectation

We will now proceed to determine the stock price via another method. Substituting the state price density from (13) into (15), we obtain:

$$S_t = \exp\{-BX_t - CX_t^2 - U_t + \rho t\} \times \int_t^\infty E^0_t\left[(a_0 + a_1 X_T + a_2 X_T^2)\exp\{BX_T + CX_T^2 + U_T - \rho T\}\right]dT$$

On first sight it may appear that it is very difficult to get any further with this expression. However, if we can calculate:

$$V_T(t, X_t; \theta) := E^0_t\left[\exp\left\{\theta(a_0 + a_1 X_T + a_2 X_T^2) + BX_T + CX_T^2 + \int_t^T \frac{A}{2}\lambda e^{\lambda(s - T)}X_s^2\,ds\right\}\right]$$

then we may differentiate with respect to $\theta$ and set $\theta = 0$ to give:

$$S_t = \exp\{-BX_t - CX_t^2\}\int_t^\infty \exp\{(e^{-\lambda(T - t)} - 1)U_t - \rho(T - t)\}\;\partial_\theta\big|_{\theta = 0}V_T(t, X_t; \theta)\,dT$$

We also define $\tau \equiv T - t$. We will show that:

$$V_T(t, X_t; \theta) = \exp\left\{\frac{1}{2}a(\tau)X_t^2 + b(\tau)X_t + c(\tau)\right\}$$


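where $a$, $b$ and $c$ are functions which we will shortly deduce. To deduce these functions, we will use a martingale argument. For $t \le T$ we define:

$$\begin{aligned}
M^T_t &\equiv E^0_t\left[\exp\left\{\theta(a_0 + a_1 X_T + a_2 X_T^2) + BX_T + CX_T^2 + \int_{-\infty}^T \frac{A}{2}\lambda e^{\lambda(s - T)}X_s^2\,ds\right\}\right] \\
&= V_T(t, X_t; \theta)\exp\left\{\int_{-\infty}^t \frac{A}{2}\lambda e^{\lambda(s - T)}X_s^2\,ds\right\}
\end{aligned}$$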

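Now apply Ito’s formula:

$$\begin{aligned}
dM^T_t &= \exp\left\{\int_{-\infty}^t \frac{A}{2}\lambda e^{\lambda(s - T)}X_s^2\,ds\right\}\left[V_t\,dt + V_x\,dX_t + \frac{1}{2}V_{xx}\,dX_t\,dX_t + \frac{\lambda A}{2}e^{\lambda(t - T)}X_t^2 V\,dt\right] \\
&= M^T_t\left[\frac{\lambda A}{2}e^{\lambda(t - T)}X_t^2\,dt - \left(\frac{1}{2}\dot{a}(\tau)X_t^2 + \dot{b}(\tau)X_t + \dot{c}(\tau)\right)dt \right. \\
&\qquad\quad \left. {} + (a(\tau)X_t + b(\tau))(dW_t - \lambda X_t\,dt) + \frac{1}{2}\left(a(\tau) + (a(\tau)X_t + b(\tau))^2\right)dt\right]
\end{aligned}$$

But $(M^T_t)_{t \le T}$ is a martingale under $P^0$, so the coefficient of $dt$ in the above expression must be zero. Thus we obtain:

$$\begin{aligned}
\frac{1}{2}\dot{a} &= \frac{\lambda A}{2}e^{-\lambda\tau} - \lambda a + \frac{1}{2}a^2 \\
\dot{b} &= ab - \lambda b \\
\dot{c} &= \frac{1}{2}(a + b^2)
\end{aligned}$$

The boundary conditions are given by:

$$a(0) = 2(C + \theta a_2), \qquad b(0) = B + \theta a_1, \qquad c(0) = \theta a_0$$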
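The system for $a$, $b$ and $c$ can also be integrated numerically, which gives an independent cross-check on any closed-form solution. The sketch below is an added illustration and not the method used in the paper: it applies a classical fourth-order Runge-Kutta scheme to the three ODEs with the stated boundary conditions. The function names, default arguments and step count are hypothetical.

```python
import math


def abc_rhs(tau, y, lam, A):
    """Right-hand side of the ODE system for (a, b, c) as functions of tau."""
    a, b, c = y
    da = lam * A * math.exp(-lam * tau) - 2.0 * lam * a + a * a
    db = a * b - lam * b
    dc = 0.5 * (a + b * b)
    return (da, db, dc)


def solve_abc(tau_max, lam, A, B, C, theta=0.0, a0=0.0, a1=0.0, a2=1.0, steps=2000):
    """Integrate from tau = 0 to tau_max with classical RK4 and return (a, b, c) at tau_max."""
    y = (2.0 * (C + theta * a2), B + theta * a1, theta * a0)  # boundary conditions
    h = tau_max / steps
    tau = 0.0
    for _ in range(steps):
        k1 = abc_rhs(tau, y, lam, A)
        k2 = abc_rhs(tau + 0.5 * h, tuple(v + 0.5 * h * k for v, k in zip(y, k1)), lam, A)
        k3 = abc_rhs(tau + 0.5 * h, tuple(v + 0.5 * h * k for v, k in zip(y, k2)), lam, A)
        k4 = abc_rhs(tau + h, tuple(v + h * k for v, k in zip(y, k3)), lam, A)
        y = tuple(
            v + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for v, p, q, r, s in zip(y, k1, k2, k3, k4)
        )
        tau += h
    return y


if __name__ == "__main__":
    # Illustrative values only; A = 0 removes the forcing term, leaving a Bernoulli equation.
    print(solve_abc(1.0, lam=2.0, A=0.0, B=2.01, C=-0.49))
```

With $A = 0$ the first equation reduces to $\dot{a} = a^2 - 2\lambda a$, whose solution is $a(\tau) = 1/v(\tau)$ with $v(\tau) = \frac{1}{2\lambda} + \left(\frac{1}{a(0)} - \frac{1}{2\lambda}\right)e^{2\lambda\tau}$, which gives a convenient test case for the integrator. Note that a Riccati equation can blow up in finite time for some initial values, so a fixed-step scheme should be used with care away from such benign cases.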

3.2.3 Solving the ODEs

We now solve the ODEs. The first equation is a Riccati equation, so in order to solve we make the usual substitution:

$$a(\tau) = -\frac{\dot{g}(\tau)}{g(\tau)}$$

Substituting this into the ODE for $a$ gives:

$$\frac{1}{2}\ddot{g} + \lambda\dot{g} + \frac{\lambda A}{2}e^{-\lambda\tau}g = 0$$

and the boundary condition becomes:

$$-\dot{g}(0) = 2(C + \theta a_2)\,g(0)$$
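For completeness, the intermediate algebra behind the linearisation can be written out as follows. This worked step is an added illustration; it uses only the substitution $a = -\dot{g}/g$ and the Riccati equation $\frac{1}{2}\dot{a} = \frac{\lambda A}{2}e^{-\lambda\tau} - \lambda a + \frac{1}{2}a^2$.

```latex
\begin{aligned}
a = -\frac{\dot g}{g}
\;&\Longrightarrow\;
\dot a = -\frac{\ddot g}{g} + \frac{\dot g^{2}}{g^{2}} = -\frac{\ddot g}{g} + a^{2}, \\
\tfrac12 \dot a - \tfrac12 a^{2} + \lambda a
&= -\frac{1}{g}\left(\tfrac12 \ddot g + \lambda \dot g\right)
= \frac{\lambda A}{2} e^{-\lambda \tau}, \\
\text{hence}\quad
\tfrac12 \ddot g + \lambda \dot g + \frac{\lambda A}{2} e^{-\lambda \tau} g &= 0 .
\end{aligned}
```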


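We can solve this equation using Maple to obtain:

$$\begin{aligned}
g(u) = e^{-\lambda u}\Big[&\left(\sqrt{\lambda A}\,Y_1(2\sqrt{A/\lambda}) - 2(C + \theta a_2)Y_2(2\sqrt{A/\lambda})\right)J_2(2e^{-\lambda u/2}\sqrt{A/\lambda}) \\
&- \left(\sqrt{\lambda A}\,J_1(2\sqrt{A/\lambda}) - 2(C + \theta a_2)J_2(2\sqrt{A/\lambda})\right)Y_2(2e^{-\lambda u/2}\sqrt{A/\lambda})\Big]
\end{aligned}$$

where $J_i$ and $Y_i$ are Bessel functions of order $i$ of the first and second kind respectively. Turning now to the ODE for $b$, we may use our solution for $a$ to deduce:

$$\dot{b} + \frac{\dot{g}}{g}b + \lambda b = 0$$

Rearranging gives:

$$\frac{d}{d\tau}\left(b\,g\,e^{\lambda\tau}\right) = 0$$

which we can solve subject to $b(0) = B + \theta a_1$ to give:

$$b(\tau) = \frac{(B + \theta a_1)\,g(0)}{e^{\lambda\tau}g(\tau)}$$

Finally, we obtain:

$$c(\tau) = \theta a_0 + \int_0^\tau \frac{1}{2}\left(a(\tau') + b(\tau')^2\right)d\tau'$$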

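Thus we have completely solved the ODEs. In order to calculate the stock price, we need to find $\frac{\partial V}{\partial\theta}$. We therefore need:

$$\frac{\partial g}{\partial\theta} = e^{-\lambda u}\left[-2a_2 Y_2(2\sqrt{A/\lambda})J_2(2e^{-\lambda u/2}\sqrt{A/\lambda}) + 2a_2 J_2(2\sqrt{A/\lambda})Y_2(2e^{-\lambda u/2}\sqrt{A/\lambda})\right]$$

and also:

$$\begin{aligned}
\frac{\partial\dot{g}}{\partial\theta} = -\lambda\frac{\partial g}{\partial\theta} + e^{-\lambda u}\Big[&-2a_2 Y_2(2\sqrt{A/\lambda})\left(\lambda J_2(2\sqrt{A/\lambda}\,e^{-\lambda u/2}) - \sqrt{A\lambda}\,e^{-\lambda u/2}J_1(2\sqrt{A/\lambda}\,e^{-\lambda u/2})\right) \\
&+ 2a_2 J_2(2\sqrt{A/\lambda})\left(\lambda Y_2(2\sqrt{A/\lambda}\,e^{-\lambda u/2}) - \sqrt{A\lambda}\,e^{-\lambda u/2}Y_1(2\sqrt{A/\lambda}\,e^{-\lambda u/2})\right)\Big]
\end{aligned}$$

We may then calculate expressions for $\frac{\partial V}{\partial\theta}$. First note that:

$$\frac{\partial V}{\partial\theta} = \left(\frac{1}{2}\frac{\partial a}{\partial\theta}X_t^2 + \frac{\partial b}{\partial\theta}X_t + \frac{\partial c}{\partial\theta}\right)\exp\left\{\frac{1}{2}a(\tau)X_t^2 + b(\tau)X_t + c(\tau)\right\}$$

But:

$$\begin{aligned}
\frac{\partial c}{\partial\theta}(\tau) &= a_0 + \int_0^\tau \frac{1}{2}\left(\frac{\partial a}{\partial\theta}(\tau') + 2b(\tau')\frac{\partial b}{\partial\theta}(\tau')\right)d\tau' \\
\frac{\partial b}{\partial\theta}(\tau) &= \frac{a_1 g(0)}{e^{\lambda\tau}g(\tau)} + \frac{(B + \theta a_1)}{e^{\lambda\tau}}\frac{\frac{\partial g}{\partial\theta}(0)}{g(\tau)} - \frac{(B + \theta a_1)}{e^{\lambda\tau}}\frac{g(0)}{g(\tau)^2}\frac{\partial g}{\partial\theta}(\tau) \\
\frac{\partial a}{\partial\theta}(\tau) &= -\frac{\frac{\partial\dot{g}}{\partial\theta}(\tau)}{g(\tau)} + \frac{\dot{g}(\tau)}{g(\tau)^2}\frac{\partial g}{\partial\theta}(\tau)
\end{aligned}$$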


So finally we have:

$$S_t = \exp\{-BX_t - CX_t^2\}\int_0^\infty \exp\{-\rho\tau - (1 - e^{-\lambda\tau})U_t\}\left(\frac{1}{2}\frac{\partial a}{\partial\theta}X_t^2 + \frac{\partial b}{\partial\theta}X_t + \frac{\partial c}{\partial\theta}\right)\exp\left\{\frac{1}{2}a(\tau)X_t^2 + b(\tau)X_t + c(\tau)\right\}d\tau \tag{20}$$

This is as far as we can get with the expression for the stock price. We see that the stock price depends not only on the dividend at time $t$, but also on $U_t$, a term reflecting the behaviour of $(X_s)_{-\infty \le s \le t}$. This is as we would expect, since agents need to use information from the whole of their lifetimes to make better estimates of the mean to which $X$ is reverting. From properties of the OU process, we see that if $X_t$ reverts to mean $a$ then, since $X$ is stationary, we have $X_t \sim N(a, \frac{1}{2\lambda})$. Hence,

$$EU_t = \int_{-\infty}^t \frac{\lambda A}{2}e^{\lambda(r - t)}\left(\frac{1}{2\lambda} + a^2\right)dr = \frac{A}{2}\left(\frac{1}{2\lambda} + a^2\right)$$

This indicates a sensible value for $U_t$, which will be helpful for when we begin to look at numerical examples later on.

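3.3 The Bond Price

The time-$t$ price of a zero-coupon bond which has unit payoff at time $T$ is given by:

$$E^0\left[\frac{\zeta_T}{\zeta_t}\,\Big|\,\mathcal{F}_t\right] = \exp\left[-BX_t - CX_t^2 - U_t(1 - e^{-\lambda\tau}) - \rho\tau\right]V_T(t, X_t; \theta = 0)$$

Using our expression for $V_T(t, X_t; \theta = 0)$, we obtain:

$$\exp\left[\left(\frac{1}{2}a(\tau) - C\right)X_t^2 + (b(\tau) - B)X_t + c(\tau) - \rho\tau - (1 - e^{-\lambda\tau})U_t\right] \tag{21}$$

where the functions $a$, $b$ and $c$ are all evaluated using $\theta = 0$.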
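A quick consistency check on (21), added here as an illustration: at zero time to maturity the bond must be worth its unit payoff. Using the boundary conditions $a(0) = 2C$, $b(0) = B$ and $c(0) = 0$ at $\theta = 0$, every term in the exponent vanishes.

```latex
\begin{aligned}
\tau = 0:\quad
\left(\tfrac12 a(0) - C\right) X_t^{2} + \left(b(0) - B\right) X_t + c(0) - \rho \cdot 0 - \left(1 - e^{0}\right) U_t
&= (C - C) X_t^{2} + (B - B) X_t + 0 - 0 - 0 = 0, \\
\text{so the price in (21) equals } e^{0} &= 1 .
\end{aligned}
```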

3.4 Remarks on the Case in which a is Known

Note that if we let $\varepsilon \to \infty$, then this corresponds to the case in which all the agents are certain that they know the value of $a$. By taking the limit in our expressions for the stock price, bond price and riskless rate, we can deduce expressions for these quantities in this limit. We note further that if the agents are sure about the value of $a$ and this value corresponds to the true value, $a^*$, then the expressions we obtain will be the same as those for the model in which the true value of $a$ was known to all the agents.


4. Numerical ResultsThe aim of this Section is to investigate how the stock price varies as the

different parameters of the model are varied. We do not intend here to discuss theextent to which this model might fit actual prices; this would be an econometricstudy taking us some distance from the theoretical aims of this paper. However,we want to work with parameter values which are plausible, and choosing theserequires some care.

We will restrict to the case in whicha0 = a1 = 0, so that we have simplyδt = a2X2

t . This ensures that the dividend process remains positive. Note furtherthat the state price density (13) only depends on theproduct Γa2 rather than theindividualΓ anda2. Although the dividend process does depend ona2, changinga2 simply corresponds to the changing the units in which we measure the dividendprocess. Hence, we may choosea2 = 1.

Some of the parameters are relatively easy to choose, such asλ andρ, forwhich we chooseλ = 2 andρ = 0.04; the impatience rate of the agents is 4%,so they have a mean time horizon of 25 years, reasonable for a human agent, andthe mean reversion of the OU process for the dividend has a half-life of 6 months,again a plausible value. However, other parameters, such asΓ are much harder todetermine. We are only interested in ensuring that the parameters are of the correctorder. For this, we abbreviate〈α〉 = a, and consider the thought experiment whereε → ∞, which corresponds to the case in which agents are sure that they knowthe true value ofa. This leaves the parametersa andΓ which we still need todetermine.

One way to determine these parameters would be to choose them in orderto match various moments from empirical data, such as the mean price-dividendratio; this was the strategy employed in Brown & Rogers (2009) when we con-sidered the equity premium puzzle. Ideally, we would use the same method here,but unfortunately our stock price is much more complicated. Thus, computing agiven stock price requires the numerical computation of an integral. To work outthe mean price dividend ratio, we would then need to compute a further integralas we averaged over the values of the driving Brownian motion. We would thenvary the parameters and calculate the expected price dividend ratio each time inan attempt to find a realistic set of parameters. Given the additional complexityof this problem and the fact that we are only interested in determining parametersthat are of the correct order, we will proceed in a different manner.

We first note that the interest rate process has a particularly simple form, which we can use to get a simple expression for the expected riskless rate. We can match this with the mean riskless rate from the Shiller data set.

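Note that we are considering the case in which $a_0 = a_1 = 0$, $a_2 = 1$ and the limit as $\varepsilon \to \infty$, and hence $A_\varepsilon \to 1$, $B \to \langle\alpha\rangle = a$, $C \to -\Gamma$. Substituting into expression (14) gives:

$$r = \left(\rho + \Gamma - \tfrac{1}{2}a^2\right) + a(\lambda + 2\Gamma)X_t - 2\Gamma(\lambda + \Gamma)X_t^2$$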


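Thus, taking expectations with $E X_t = a$ and $E X_t^2 = a^2 + \frac{1}{2\lambda}$, the expected riskless rate is given by:

$$Er = \left(\rho + \Gamma - \tfrac{1}{2}a^2\right) + a^2(\lambda + 2\Gamma) - 2\Gamma(\lambda + \Gamma)\left(a^2 + \frac{1}{2\lambda}\right)$$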

To determine $\Gamma$, we compare a CRRA agent (where we know a reasonable value for the constant of relative risk aversion¹²) with a CARA agent. If we consider a single agent model in which the value of $a$ is known, the stock price will be given by:

$$S_0 = E \int_0^\infty \frac{U'(\delta_t)}{U'(\delta_0)}\,\delta_t\,dt$$

Since we just want our parameters to be of the correct order, it is sufficient to check that the behaviour of

$$\frac{U'(\delta_t)}{U'(\delta_0)}\,\delta_t \qquad (22)$$

when $X$ is near to its mean value $a$ is the same for both the CRRA and CARA case. If we set $X_0 = X_t = a$ then clearly (22) will be the same in both the CRRA and CARA case. We therefore impose the requirement that a small change in $X_t$ from $X_t = a$ has the same effect in both cases, leading to the condition:

$$\frac{U''_{\mathrm{CRRA}}(a^2)}{U'_{\mathrm{CRRA}}(a^2)} = \frac{U''_{\mathrm{CARA}}(a^2)}{U'_{\mathrm{CARA}}(a^2)}$$

which leads us to the condition:

$$\Gamma = \frac{R}{a^2} \qquad (23)$$
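The step from the matching condition to (23) can be written out explicitly. The following is only a sketch: it assumes the standard marginal utilities $U'_{\mathrm{CRRA}}(x) = x^{-R}$ and $U'_{\mathrm{CARA}}(x) = e^{-\Gamma x}$, which are not restated in this section.

```latex
\frac{U''_{\mathrm{CRRA}}(x)}{U'_{\mathrm{CRRA}}(x)} = -\frac{R}{x},
\qquad
\frac{U''_{\mathrm{CARA}}(x)}{U'_{\mathrm{CARA}}(x)} = -\Gamma,
\qquad\text{so at } x = a^2:\quad
-\frac{R}{a^2} = -\Gamma
\;\Longleftrightarrow\;
\Gamma = \frac{R}{a^2}.
```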

Since a sensible value for the coefficient of relative risk aversion is $R = 2$, equation (23) gives a relation between $\Gamma$ and $a$. Substituting $a^2 = R/\Gamma$ into our expression for the expected riskless rate and multiplying through by $\Gamma$ yields the cubic equation:

$$l(\Gamma) \equiv \frac{\Gamma^3}{\lambda} + 2R\Gamma^2 + \left(Er - \rho + 2R(\lambda - 1)\right)\Gamma + \tfrac{1}{2}R - R\lambda = 0$$

We will choose $R = 2$. We also choose $Er = 0.01$, as given by the Shiller data set. We may then note that $l(0) < 0$ and $\frac{dl}{d\Gamma} > 0$ for $\Gamma > 0$, hence there is a unique positive solution to the above equation, which we can easily compute. Computation shows that the correct $\Gamma$ to choose is $\Gamma = 0.49$, which we take as our default value. This gives $a = 2.01$.
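The computation of $\Gamma$ and $a$ can be reproduced in a few lines. The sketch below is not the authors' code: the function names are hypothetical, and bisection is just one convenient choice, justified by the monotonicity of $l$ noted above. It uses the parameter values $\lambda = 2$, $\rho = 0.04$, $R = 2$ and $Er = 0.01$ from the text.

```python
import math

def cubic_l(gamma, lam=2.0, rho=0.04, R=2.0, Er=0.01):
    """The cubic l(Gamma) from the text; its positive root is the calibrated Gamma."""
    return (gamma**3 / lam
            + 2.0 * R * gamma**2
            + (Er - rho + 2.0 * R * (lam - 1.0)) * gamma
            + 0.5 * R - R * lam)

def solve_gamma(lo=0.0, hi=10.0, tol=1e-12):
    """Bisection on [lo, hi]; valid because l(0) < 0 and l is increasing for Gamma > 0."""
    assert cubic_l(lo) < 0.0 < cubic_l(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cubic_l(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_rate(gamma, a, lam=2.0, rho=0.04):
    """Expected riskless rate Er as a function of Gamma and a (formula in the text)."""
    a2 = a * a
    return (rho + gamma - 0.5 * a2
            + a2 * (lam + 2.0 * gamma)
            - 2.0 * gamma * (lam + gamma) * (a2 + 1.0 / (2.0 * lam)))

gamma = solve_gamma()
a = math.sqrt(2.0 / gamma)  # from Gamma = R / a^2 with R = 2
print(round(gamma, 2), round(a, 2))  # prints 0.49 2.01
```

Note that the expected rate is quite sensitive to rounding: evaluating `expected_rate` at the rounded values 0.49 and 2.01 gives a rate well away from 0.01, whereas the unrounded root reproduces it.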

¹² Ideally, we would have worked with CRRA agents throughout, but the combination of the individual agents’ first-order conditions to specify the state-price density is intractable.


This concludes the thought experiment we used to find a reasonable value for $\Gamma$. We now use this with a more interesting value for $\varepsilon$ which does not imply that agents know $\alpha$ with certainty. To summarise, the default parameters we choose are: $a_0 = a_1 = 0$; $a_2 = 1$; $\lambda = 2$; $\rho = 0.04$; $\varepsilon = 1.0$; $\Gamma = 0.49$; $\langle\alpha\rangle = a = 2.01$. We

also choose $X_t = a$, $U_t = \frac{A}{2}\left(a^2 + \frac{1}{2\lambda}\right)$. We then vary the parameters and examine the behaviour.

4.1 Comments on Results

Figure 1 shows that the stock price is decreasing in $\lambda$. Recall that $\lambda$ is the parameter which tells us how quickly the dividend process returns to its mean. Hence, a lower value of $\lambda$ means that the dividend process is more likely to reach high values, so is worth more to the agents. However, $\lambda$ is also a parameter used in specifying the distribution of the lifetime of the agents. Increasing $\lambda$ therefore decreases the expected lifetime of the agents. Each child in the dynasty therefore has less time to learn about the unknown parameter $a$, and this increased uncertainty amongst the agents also means that the stock price decreases as $\lambda$ increases.

Figure 2 shows that as $\varepsilon$ increases, so does the stock price, which is to be expected since if the agents know more about the dividend process (i.e. their beliefs have a higher precision), the stock should be worth more to them. Once again, the effect of varying $\varepsilon$ is confounded with the distribution of the agents’ lifetimes.

Similarly, Figure 3 shows that the larger the value of $\rho$, the less the stock is worth. A large $\rho$ indicates that the agents are impatient and want to consume their wealth in the near future, making the stock less attractive.

Figure 4 exhibits the dependence of the stock price on $\langle\alpha\rangle$. Recall that $X_t$ and $U_t$ are kept fixed as we vary $\langle\alpha\rangle$. A small $\langle\alpha\rangle$ indicates that the agents think the level to which $X$ reverts is low. Thus, since we do not change $X_t$, a low value of $\langle\alpha\rangle$ relative to $X$ indicates that $X$ is currently abnormally high and so the dividends are abnormally high. Thus, the agents are keen to hold this stock. Furthermore, the relatively high level of $X$ means that the agents have a large amount of dividend with which to buy the stock.

Figure 5 may at first seem surprising, since it shows that the stock price is increasing in the risk aversion, $\Gamma$. However, we recall that all agents have a CARA utility and furthermore, the parameters of our model are chosen so that the dividend process is non-negative. On the one hand, a larger value of $\Gamma$ means that increases in the value of the dividend process are valued more highly than before. On the other hand, the downside of holding the stock is limited, since the dividend process is always non-negative. This explains the behaviour shown in Figure 5.

The volatility surface¹³ in Figure 6 shows that the volatility appears to be increasing in both $X_t$ and $U_t$. This seems reasonable: if the dividend process has been varying greatly in the past, then $U_t$ will be large, and in this case we would expect the stock to have a larger volatility.

¹³ Note that the plot shows $h_x/S_t$; the absolute value of this would give the volatility.

Figure 1. Graph of $S_t$ against $\lambda$.

Figure 2. Graph of $S_t$ against $\varepsilon$.

Figure 3. Graph of $S_t$ against $\rho$.

Figure 4. Graph of $S_t$ against $\langle\alpha\rangle$.

Figure 5. Graph of $S_t$ against $\Gamma$.

Figure 6. Volatility surface.

5. Conclusions

We have introduced a new model in which the dividend of the stock obeys an OU process for which none of the agents know the mean. We derived a state price density and were able to use this to price the stock and a bond. We were also able to deduce an interest rate model. We produced graphs which illustrated the dependence of the stock price on the various parameters. The behaviour shown in these graphs seemed very reasonable. We also looked at how the parameter certainty case could be viewed as a special limit of the parameter uncertainty case.

Extensions to this work include using a different utility function for the agents; a CRRA utility would be a natural choice. In section 2.4 we also had to assume a quite specific form for the distribution of the lifetimes of the agents. An obvious improvement would be to consider the problem with a different distribution of lifetimes, in particular one that did not depend on the parameters of the dividend process. Unfortunately both these generalisations appear to make the calculations intractable.

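Appendices. Stochastic Integrals

A.1. Calculating $\xi_t$

Recall that $\xi_t$ is given by:

$$\xi_t = \int_0^\infty (W_t - W_{t-u})\,\lambda e^{-\lambda u}\,du$$

By change of variables,

$$\xi_t = W_t - e^{-\lambda t}\int_{-\infty}^t \lambda e^{\lambda s} W_s\,ds$$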

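So substituting from (1) gives:

$$\xi_t = W_t + X_0 - e^{-\lambda t}\left[\int_{-\infty}^t X_s \lambda e^{\lambda s}\,ds + \int_{-\infty}^t \lambda e^{\lambda s}\int_0^s \lambda X_r\,dr\,ds\right] \qquad (24)$$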

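But the final term in the above expression is:

$$-e^{-\lambda t}\int_{-\infty}^t \lambda e^{\lambda s}\int_0^s \lambda X_r\,dr\,ds = e^{-\lambda t}\int_{s=-\infty}^0 \int_{r=s}^0 \lambda e^{\lambda s}\lambda X_r\,dr\,ds - e^{-\lambda t}\int_{s=0}^t \int_{r=0}^s \lambda e^{\lambda s}\lambda X_r\,dr\,ds$$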


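Applying Fubini, we obtain:

$$e^{-\lambda t}\int_{r=-\infty}^0 \int_{s=-\infty}^r \lambda e^{\lambda s}\lambda X_r\,ds\,dr - e^{-\lambda t}\int_{r=0}^t \int_{s=r}^t \lambda e^{\lambda s}\lambda X_r\,ds\,dr$$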

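Computing the integral with respect to $s$ gives:

$$e^{-\lambda t}\left[\int_{-\infty}^0 \lambda e^{\lambda r} X_r\,dr - e^{\lambda t}\int_0^t X_r \lambda\,dr + \int_0^t \lambda e^{\lambda r} X_r\,dr\right] = e^{-\lambda t}\int_{-\infty}^t \lambda e^{\lambda r} X_r\,dr - \int_0^t \lambda X_r\,dr$$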

Substituting this into (24) gives:

$$\xi_t = W_t + X_0 - \int_0^t \lambda X_r\,dr$$

But recalling (1), we obtain:

$$\xi_t = X_t$$

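A.2. Calculating $\eta_t$

Recall that $\eta_t$ is given by:

$$\eta_t = \int_0^\infty (W_t - W_{t-u})^2\,\lambda e^{-\lambda u}\,du$$

Changing variables we obtain:

$$\eta_t = e^{-\lambda t}\int_{-\infty}^t (W_t - W_r)^2\,\lambda e^{\lambda r}\,dr$$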

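Substituting from (1) gives:

$$\eta_t = e^{-\lambda t}\int_{-\infty}^t \left[(X_t - X_r) + \int_r^t \lambda X_s\,ds\right]^2 \lambda e^{\lambda r}\,dr$$

$$= e^{-\lambda t}\int_{-\infty}^t (X_t - X_r)^2\,\lambda e^{\lambda r}\,dr + 2e^{-\lambda t}\int_{-\infty}^t (X_t - X_r)\left(\int_r^t \lambda X_s\,ds\right)\lambda e^{\lambda r}\,dr + e^{-\lambda t}\int_{-\infty}^t \left(\int_r^t \lambda X_s\,ds\right)^2 \lambda e^{\lambda r}\,dr \qquad (25)$$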

We will now apply Fubini to two of these terms to deduce an expression for $\eta_t$. Firstly, we work on:

$$\int_{r=-\infty}^t X_t \int_{s=r}^t \lambda X_s\,ds\,\lambda e^{\lambda r}\,dr$$


By applying Fubini, we obtain:

$$\int_{s=-\infty}^t X_t X_s \int_{r=-\infty}^s \lambda^2 e^{\lambda r}\,dr\,ds = \int_{-\infty}^t X_t X_s\,\lambda e^{\lambda s}\,ds$$

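Putting this into (25) gives:

$$\eta_t = X_t^2 + e^{-\lambda t}\int_{-\infty}^t \lambda e^{\lambda r} X_r^2\,dr - 2e^{-\lambda t}\int_{-\infty}^t X_r\left(\int_r^t \lambda X_s\,ds\right)\lambda e^{\lambda r}\,dr + e^{-\lambda t}\int_{-\infty}^t \int_r^t \lambda X_s\,ds \int_r^t \lambda X_v\,dv\,\lambda e^{\lambda r}\,dr \qquad (26)$$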

The final term is:

$$2e^{-\lambda t}\int_{r=-\infty}^t \int_{s=r}^t \int_{v=s}^t \lambda X_s\,\lambda X_v\,\lambda e^{\lambda r}\,dv\,ds\,dr$$

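where we have halved the area of integration in the $dv\,ds$ integral. Applying Fubini yields:

$$2e^{-\lambda t}\int_{s=-\infty}^t \int_{v=s}^t \int_{r=-\infty}^s \lambda X_s\,\lambda X_v\,\lambda e^{\lambda r}\,dr\,dv\,ds$$

$$= 2e^{-\lambda t}\int_{s=-\infty}^t \int_{v=s}^t \lambda X_s\,\lambda X_v\,e^{\lambda s}\,dv\,ds$$

$$= 2e^{-\lambda t}\int_{r=-\infty}^t \lambda X_r e^{\lambda r}\int_{s=r}^t \lambda X_s\,ds\,dr$$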

Substituting this into (26) gives:

$$\eta_t = X_t^2 + e^{-\lambda t}\int_{-\infty}^t \lambda e^{\lambda s} X_s^2\,ds$$

References

Basak, S. (2000). A model of dynamic equilibrium asset pricing with heterogeneous beliefs and extraneous risk. Journal of Economic Dynamics and Control, 24, 63–95.

Basak, S. (2005). Asset pricing with heterogeneous beliefs. Journal of Banking & Finance, 29, 2849–2881. Thirty Years of Continuous-Time Finance.


Brown, A. A. & Rogers, L. C. G. (2009). Diverse beliefs. Preprint, Statistical Laboratory, University of Cambridge.

Buraschi, A. & Jiltsov, A. (2006). Model uncertainty and option markets with heterogeneous beliefs. Journal of Finance, 61, 2841–2897.

Fan, M. (2006). Heterogeneous beliefs, the term structure and time-varying risk premia. Annals of Finance, 2, 259–285.

Gallmeyer, M. & Hollifield, B. (2008). An Examination of Heterogeneous Beliefs with a Short-Sale Constraint in a Dynamic Economy. Review of Finance, 12, 323–364.

Guidolin, M. & Timmermann, A. G. (2001). Option prices under bayesian learning: Implied volatility dynamics and predictive densities. CEPR Discussion Paper, Available from http://ideas.repec.org/p/cpr/ceprdp/3005.html.

Harris, M. & Raviv, A. (1993). Differences of opinion make a horse race. The Review of Financial Studies, 6, 473–506.

Harrison, J. M. & Kreps, D. (1978). Speculative investor behavior in a stock market with heterogeneous expectations. The Quarterly Journal of Economics, 92, 323–336.

Hautsch, N. & Hess, D. (2004). Bayesian learning in financial markets - testing for the relevance of information precision in price discovery. Discussion Paper, Available from http://ideas.repec.org/p/kud/kuiedp/0417.html.

Jouini, E. & Napp, C. (2007). Consensus consumer and intertemporal asset pricing with heterogeneous beliefs. Review of Economic Studies, 74, 1149–1174.

Kalai, E. & Lehrer, E. (1993). Rational learning leads to Nash equilibrium. Econometrica, 61, 1019–1045.

Kandel, E. & Pearson, N. D. (1995). Differential interpretation of public signals and trade in speculative markets. Journal of Political Economy, 4, 831–872.

Kurz, M. (1994). On the structure and diversity of rational beliefs. Economic Theory, 4, 877–900.

Kurz, M., ed. (1997). Endogenous Economic Fluctuations: Studies in the Theory of Rational Belief, vol. 6 of Studies in Economic Theory. Berlin and New York: Springer-Verlag.

Kurz, M. (2008a). Beauty contests under private information and diverse beliefs: How different? Journal of Mathematical Economics, 44, 762–784.


Kurz, M. (2008b). Rational Diverse Beliefs and Economic Volatility. Prepared for the Handbook of Finance Series Volume Entitled: Handbook of Financial Markets: Dynamics and Evolution.

Kurz, M. & Motolese, M. (2006). Risk premia, diverse belief and beauty contests. Working Paper, Available from http://ideas.repec.org/p/pra/mprapa/247.html.

Kurz, M., Jin, H. & Motolese, M. (2005). Determinants of stock market volatility and risk premia. Annals of Finance, 1, 109–147.

Li, T. (2007). Heterogeneous beliefs, asset prices, and volatility in a pure exchange economy. Journal of Economic Dynamics and Control, 31, 1697–1727.

Morris, S. (1996). Speculative investor behavior and learning. The Quarterly Journal of Economics, 111, 1111–1133.

Nakata, H. (2007). A model of financial markets with endogenously correlated rational beliefs. Economic Theory, 30, 431–452.

Rogers, L. C. G. & Williams, D. (2000). Diffusions, Markov Processes and Martingales. Cambridge University Press.

Schinkel, M. P., Tuinstra, J. & Vermeulen, D. (2002). Convergence of bayesian learning to general equilibrium in mis-specified models. Journal of Mathematical Economics, 38, 483–508.

Wu, H. M. & Guo, W. C. (2003). Speculative trading with rational beliefs and endogenous uncertainty. Economic Theory, 21, 263–292.

Wu, H. M. & Guo, W. C. (2004). Asset price volatility and trading volume with rational beliefs. Economic Theory, 23, 795–829.

Zapatero, F. (1998). Effects of financial innovations on market volatility when beliefs are heterogeneous. Journal of Economic Dynamics and Control, 22, 597–626.


May 3, 2010 13:51 Proceedings Trim Size: 9in x 6in 004

Counterparty Risk on a CDS in a Markov Chain Copula Model with Joint Defaults∗

S. Crépey¹,², M. Jeanblanc¹,² and B. Zargari¹,³

¹ Équipe Analyse et Probabilité, Université d’Évry Val d’Essonne, Bd. F. Mitterrand, 91025 Évry Cedex, France

² CRIS Consortium†

³ Dept. of Mathematical Sciences, Sharif University of Technology, Azadi Ave., PO Box: 11365-11155, Tehran, Iran

E-mail: [email protected], [email protected], and [email protected]

In this paper we study the counterparty risk on a payer CDS in a Markov chain model of two reference credits, the firm underlying the CDS and the protection seller in the CDS. We first state a few preliminary results about pricing and CVA of a CDS with counterparty risk in a general set-up. We then introduce a Markov chain copula model in which wrong way risk is represented by the possibility of joint defaults between the counterpart and the firm underlying the CDS. In the set-up thus specified we derive semi-explicit formulas for most quantities of interest with regard to CDS counterparty risk such as price, CVA, EPE or hedging strategies. Model calibration is made simple by the copula property of the model. Numerical results show that the behavior of EPE and CVA in the model is consistent with stylized features.

Keywords: Counterparty credit risk, CDS, wrong way risk, CVA, EPE.

∗This research benefited from the support of the Europlace Institute of Finance and an exchange grant from AMaMeF. It was motivated by a presentation of J.-P. Lardy at the CRIS research working group [20] (see http://www.cris-creditrisk.com). The authors thank J.-P. Lardy, F. Patras, S. Assefa and other members from the CRIS research group, as well as T. Bielecki, M. Rutkowski and V. Brunel, for enlightening discussions, comments and remarks.

†See http://www.cris-creditrisk.com.


1. Introduction

Since the sub-prime crisis, counterparty risk has been a crucial issue in connection with valuation and risk management of credit derivatives. Counterparty risk in general is ‘the risk that a party to an OTC derivative contract may fail to perform on its contractual obligations, causing losses to the other party’ (cf. Canabarro and Duffie [13]). A major issue in this regard is the so-called wrong way risk, namely the risk that the value of the contract is particularly high from the perspective of the other party at the moment of default of the counterparty. As classic examples of wrong way risk, one can mention the situations of selling a put option to a company on its own stock, or entering a forward contract in which oil is bought by an airline company (see Redon [24]).

Among papers dealing with general counterparty risk, one can mention, apart from the abovementioned references, Canabarro et al. [14], Zhu and Pykhtin [26], and the series of papers by Brigo et al. [7, 9, 10, 8, 11, 12]. From the point of view of measurement and management of counterparty risk, two important notions emerge:

• The Credit Value Adjustment process (CVA), which measures the depreciation of a contract due to counterparty risk. So, in rough terms, $\mathrm{CVA}_t = P_t - \Pi_t$, where $\Pi$ and $P$ denote the price process of a contract depending on whether one accounts or not for counterparty risk.

• The Expected Positive Exposure function (EPE), where $\mathrm{EPE}(t)$ is the risk-neutral expectation of the loss on a contract conditional on a default of the counterparty occurring at time $t$.

Note that the CVA can be given an option-theoretic interpretation, so that counterparty risk can, in principle, be managed dynamically.

1.1 Counterparty Credit Risk

Wrong way risk is particularly important in the case of credit derivatives transactions, at least from the perspective of a credit protection buyer. Indeed, via economic cycle and default contagion effects, the time of default of a counterparty selling credit protection is typically a time of higher value of credit protection.

We consider in this paper a Credit Default Swap with counterparty risk (‘risky CDS’ in the sequel, as opposed to ‘risk-free CDS’, without counterparty risk). Note that this topic has already received a lot of attention in the literature. It can thus be considered as a benchmark problem of counterparty credit risk. To quote but a few:

• Huge and Lando [17] propose a rating-based approach,

• Hull and White [18] study this problem in the set-up of a static copula model,

• Jarrow and Yu [19] use an intensity contagion model, further considered in Leung and Kwok [21],

• Brigo and Chourdakis [7] work in the set-up of their Gaussian copula and CIR++ intensity model, extended to the issue of bilateral counterparty credit risk in Brigo and Capponi [6],

• Blanchet-Scalliet and Patras [5] or Lipton and Sepp [22] develop structural approaches.

1.2 A Markov Copula Approach

We shall consider a Markovian model of credit risk in which simultaneous defaults are possible. Wrong way risk is thus represented in the model by the fact that at the time of default of the counterparty, there is a positive probability that the firm on which the CDS is written defaults too, in which case the loss incurred by the investor (Exposure at Default ED, cf. (3)) is the loss given default of the firm (up to the recovery on the counterparty), that is, a very large amount. Of course, this simple model should not be taken too literally. We are not claiming here that simultaneous defaults can happen in actual practice. The rationale and financial interpretation of our model is rather that at the time of default of the counterparty, there is a positive probability of a high default spreads environment, in which case the value of the CDS for a protection buyer is close to the loss given default of the firm.

More specifically, we shall consider a four-state Markov chain model of two obligors, in which all the computations are straightforward: either there are explicit formulas for all the quantities of interest or, in case less elementary parametrizations of the model are used, these quantities can be easily and quickly computed by numerically solving the related Kolmogorov ODEs.

This Markovian set-up makes it possible to address in a dynamic and consistent way the issues of valuing (and also hedging) the CDS, and/or, if wished, the CVA, interpreted as an option as evoked above.

To make this even more practical, we shall work in a Markovian copula set-up in the sense of Bielecki et al. [3], in which calibration of the model marginals to the related CDS curves is straightforward. The only really free model parameters are thus the few dependence parameters, which can be calibrated or estimated in ways that we shall explain in the paper.

1.3 Outline of the Paper

In Section 2 we first describe the mechanism and cash flows of a payer CDS with counterparty credit risk. We then state a few preliminary results about pricing and CVA of this CDS in a general set-up. In Section 3 we introduce our Markov chain copula model, in which we derive explicit formulas for most quantities of interest in regard to a risky CDS, like price, EPE, CVA or hedging ratios. Section 4 is about implementation of the model. Alternative model parametrizations and related calibration or estimation procedures are proposed and analyzed. Numerical results are presented and discussed, showing good agreement of the model’s EPE and CVA with expected features. Section 5 recapitulates our model’s main properties and presents some directions for possible extensions of the previous results.

2. General Set-Up

2.1 Cash Flows

As is well known, a CDS contract involves three entities: a reference credit (firm), a buyer of default protection on the firm, and a seller of default protection on the firm. The issue of counterparty risk on a CDS is:

• Primarily, the fact that the seller of protection may fail to pay the protection cash flows to the buyer in case of a default of the firm;

• Also, the symmetric concern that the buyer may fail to pay the contractual CDS spread to the seller.

We shall focus in this paper on the so-called unilateral counterparty credit risk involved in a payer CDS contract, namely the risk corresponding to the first bullet point above; however, it should be noted that the approach of this paper could be extended to the issue of bilateral credit risk.

We shall refer to the buyer and the seller of protection on the firm as the risk-free investor and the defaultable counterpart, respectively. Indices 1 and 2 will refer to quantities related to the firm and to the counterpart. The default times of the firm and of the counterpart are denoted by $\tau_1$ and $\tau_2$.

Under a risky CDS (payer CDS with counterparty credit risk), the investor pays to the counterpart a stream of premia with spread $\kappa$, or Fees Cash Flows, from the inception date (time 0 henceforth) until the occurrence of a credit event (default of the counterpart or the firm) or the maturity $T$ of the contract, whichever comes first.

Let us denote by $R_1$ and $R_2$ the recovery of the firm and the counterpart, supposed to be adapted to the information available at time $\tau_1$ and $\tau_2$, respectively. If the firm defaults prior to the expiration of the contract, the Protection Cash Flows paid by the counterpart to the investor depend on the situation of the counterpart:

• If the counterpart is still alive, she can fully compensate the loss of the investor, i.e., she pays $(1 - R_1)$ times the face value of the CDS to the investor;

• If the counterpart defaults at the same time as the firm (note that it is important to take this case into account in the perspective of the model with simultaneous defaults to be introduced later in this paper), she will only be able to pay to the investor a fraction of this amount, namely $R_2(1 - R_1)$ times the face value of the CDS.


Finally, there is a Close-Out Cash Flow which is associated with clearing the positions in the case of early default of the counterpart. As of today, CDSs are sold over-the-counter (OTC), meaning that the two parties have to negotiate and agree on the terms of the contract. In particular the two parties can agree on one of the following three possibilities to exit (unwind) a trade:

• Termination: The contract is stopped after a terminal cash flow (positive or negative) has been paid to the investor;

• Offsetting: The counterpart takes the opposite protection position. This new contract should have virtually the same terms as the original CDS except for the premium, which is fixed at the prevailing market level, and for the tenor, which is set at the remaining time to maturity of the original CDS. So the counterpart leaves the original transaction in place but effectively cancels out its economic effect;

• Novation (or Assignment): The original CDS is assigned to a new counterpart, settling the amount of gain or loss with him. In this assignment the original counterpart (or transferor), the new counterpart (transferee) and the investor agree to transfer all the rights and obligations of the transferor to the transferee. So the transferor thereby ends his involvement in the contract and the investor thereafter deals with the default risk of the transferee.

In this paper we shall focus on termination. More precisely, if the counterpart defaults in the lifetime of the CDS while the firm is still alive, a ‘fair value’ $\chi_{(\tau_2)}$ of the CDS is computed at time $\tau_2$ according to a methodology specified in the CDS contract at inception. If this value (from the perspective of the investor) is negative, $(-\chi_{(\tau_2)})$ is paid by the investor to the counterpart, whereas if it is positive, the counterpart is assumed to pay to the investor a portion $R_2$ of $\chi_{(\tau_2)}$.
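The termination rule just described amounts to a payoff that is asymmetric in the sign of the fair value. A minimal sketch follows; the function name `close_out_cash_flow` is hypothetical, and the fair value and the recovery $R_2$ are treated as known numbers at time $\tau_2$.

```python
def close_out_cash_flow(chi, R2):
    """Close-out cash flow received by the investor at tau_2 under termination.

    chi : fair value of the CDS at tau_2, from the investor's perspective
    R2  : recovery rate of the defaulted counterpart, in [0, 1]
    Returns R2 * chi^+ - chi^-: a positive value is only partially recovered,
    a negative value is paid in full by the investor.
    """
    chi_plus = max(chi, 0.0)    # amount owed to the investor
    chi_minus = max(-chi, 0.0)  # amount owed by the investor
    return R2 * chi_plus - chi_minus
```

For instance, with $R_2 = 0.4$ a fair value of 0.10 produces a cash flow of about 0.04 to the investor, whereas a fair value of $-0.05$ produces a cash flow of $-0.05$, paid in full.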

Remark 2.1. A typical specification is $\chi_{(\tau_2)} = P_{\tau_2}$, where $P_t$ is the value at time $t$ of a risk-free CDS on the same reference name, with the same contractual maturity $T$ and spread $\kappa$ as the original risky CDS. The consistency of this rather standard way of specifying $\chi_{(\tau_2)}$ is, in a sense, questionable. Given a pricing model accounting for the major risks in the product at hand, including, if appropriate, counterparty credit risk, with a related price process of the risky CDS denoted by $\Pi$, it could be argued that a more consistent specification would be $\chi_{(\tau_2)} = \Pi_{\tau_2}$ (or, more precisely, $\chi_{(\tau_2)} = \Pi_{\tau_2-}$, since $\Pi_{\tau_2} = 0$ in view of the usual conventions regarding the definition of ex-dividend prices). We shall see in section 4 that, at least in the specific model of this paper, adopting either convention makes little difference in practice.

2.2 Pricing

Let us be given a risk-neutral pricing model $(\Omega, \mathbb{F}, \mathbb{P})$, where $\mathbb{F} = (\mathcal{F}_t)_{t\in[0,T]}$ is a given filtration making the $\tau_i$’s stopping times. In absence of further precision, all the processes, first of which the discount factor process $\beta$, are supposed to be $\mathbb{F}$-adapted, and all the random variables are assumed to be $\mathcal{F}_T$-measurable. The fair value $\chi_{(\tau_2)}$ is supposed to be an $\mathcal{F}_{\tau_2}$-measurable random variable. The recoveries $R_1$ and $R_2$ are assumed to be $\mathcal{F}_{\tau_1}$- and $\mathcal{F}_{\tau_2}$-measurable random variables. Let $\mathbb{E}_\tau$ stand for the conditional expectation under $\mathbb{P}$ given $\mathcal{F}_\tau$, for any stopping time $\tau$.

We assume for simplicity that the face value of all the CDSs under consideration (risky or not) is equal to one monetary unit and that the spreads are paid continuously in time. All the cash flows and prices are considered from the perspective of the investor. In accordance with the usual convention regarding the definition of ex-dividend prices, the integrals in this paper are taken open on the left and closed on the right of the interval of integration. In view of the description of the cash flows in subsection 2.1, one then has

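Definition 2.2. (i) The model price process of a risky CDS is given by $\Pi_t = \mathbb{E}_t[\pi_T(t)]$, where $\pi_T(t)$ corresponds to the risky CDS cumulative discounted cash flows on the time interval $(t, T]$, so,

$$\beta_t \pi_T(t) = -\kappa \int_{t\wedge\tau_1\wedge\tau_2\wedge T}^{\tau_1\wedge\tau_2\wedge T} \beta_s\,ds + \beta_{\tau_1}(1 - R_1)\mathbf{1}_{\{t<\tau_1\le T\}}\left[\mathbf{1}_{\{\tau_1<\tau_2\}} + R_2\mathbf{1}_{\{\tau_1=\tau_2\}}\right] + \beta_{\tau_2}\mathbf{1}_{\{t<\tau_2\le T\}}\mathbf{1}_{\{\tau_2<\tau_1\}}\left[R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right]. \qquad (1)$$

(ii) The model price process of a risk-free CDS is given by $P_t = \mathbb{E}_t[p_T(t)]$, where $p_T(t)$ corresponds to the risk-free CDS cumulative discounted cash flows on the time interval $(t, T]$, so,

$$\beta_t p_T(t) = -\kappa \int_{t\wedge\tau_1\wedge T}^{\tau_1\wedge T} \beta_s\,ds + (1 - R_1)\beta_{\tau_1}\mathbf{1}_{\{t<\tau_1\le T\}}. \qquad (2)$$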

The first, second and third terms on the right-hand side of (1) correspond to the fees, protection and close-out cash flows of a risky CDS, respectively. Note that there are no cash flows of any kind after $\tau_1 \wedge \tau_2 \wedge T$ (in the case of the risky CDS) or $\tau_1 \wedge T$ (in the case of the risk-free CDS), so $\pi_T(t) = 0$ for $t \ge \tau_1 \wedge \tau_2 \wedge T$ and $p_T(t) = 0$ for $t \ge \tau_1 \wedge T$.
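To make the three legs of (1) and the two legs of (2) concrete, they can be evaluated scenario by scenario. The sketch below is illustrative only and rests on assumptions that are not part of the paper: a constant short rate `r`, so that $\beta_s = e^{-rs}$, constant recoveries, and a fair value `chi` supplied as a number. All function names are hypothetical.

```python
import math

def discounted_fees(kappa, r, start, end):
    """kappa * integral of exp(-r*s) ds over (start, end]; zero if the interval is empty."""
    if end <= start:
        return 0.0
    if r == 0.0:
        return kappa * (end - start)
    return kappa * (math.exp(-r * start) - math.exp(-r * end)) / r

def risky_cds_cash_flows(t, tau1, tau2, T, kappa, R1, R2, chi, r):
    """beta_t * pi_T(t) of equation (1) for one scenario (tau1, tau2)."""
    stop = min(tau1, tau2, T)
    value = -discounted_fees(kappa, r, min(t, stop), stop)      # fees leg
    if t < tau1 <= T:                                           # protection leg
        if tau1 < tau2:
            value += math.exp(-r * tau1) * (1.0 - R1)
        elif tau1 == tau2:
            value += math.exp(-r * tau1) * (1.0 - R1) * R2
    if t < tau2 <= T and tau2 < tau1:                           # close-out leg
        value += math.exp(-r * tau2) * (R2 * max(chi, 0.0) - max(-chi, 0.0))
    return value

def risk_free_cds_cash_flows(t, tau1, T, kappa, R1, r):
    """beta_t * p_T(t) of equation (2) for one scenario tau1."""
    stop = min(tau1, T)
    value = -discounted_fees(kappa, r, min(t, stop), stop)      # fees leg
    if t < tau1 <= T:                                           # protection leg
        value += math.exp(-r * tau1) * (1.0 - R1)
    return value
```

In a joint-default scenario with zero interest rate, the difference between the risk-free and the risky cash flows reduces to $(1 - R_1)(1 - R_2)$, which is the loss one expects when the counterpart pays only the fraction $R_2$ of the protection payment.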

Remark 2.3. In these definitions it is implicitly assumed that, consistent with the now standard theory of no-arbitrage (cf. Delbaen and Schachermayer [15]), a primary market of financial instruments (along with the risk-free asset $\beta^{-1}$) has been defined, with price processes given as locally bounded $(\Omega, \mathbb{F}, \mathbb{P})$-local martingales. No-arbitrage on the extended market consisting of the primary assets and a further CDS then motivates the previous definitions. Since the precise specification of the primary market is irrelevant until the question of hedging is dealt with, we postpone it to section 3.3.


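Definition 2.4. (i) The Exposure at Default (ED) is the $\mathcal{F}_{\tau_2}$-measurable random variable $\xi_{(\tau_2)}$ defined by,

$$\xi_{(\tau_2)} = \begin{cases} (1 - R_2)(1 - R_1), & \tau_2 = \tau_1 \le T, \\ P_{\tau_2} - \left(R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right), & \tau_2 < \tau_1,\ \tau_2 \le T, \\ 0, & \text{otherwise}. \end{cases} \qquad (3)$$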

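(ii) The Credit Valuation Adjustment (CVA) is the process killed at $\tau_1 \wedge \tau_2 \wedge T$ defined by, for $t \in [0, T]$,

$$\beta_t \mathrm{CVA}_t = \mathbf{1}_{\{t<\tau_2\}}\mathbb{E}_t\left[\beta_{\tau_2}\xi_{(\tau_2)}\right]. \qquad (4)$$

(iii) The Expected Positive Exposure (EPE) is the function of time defined by, for $t \in [0, T]$,

$$\mathrm{EPE}(t) = \mathbb{E}\left[\xi_{(\tau_2)} \mid \tau_2 = t\right]. \qquad (5)$$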

The following proposition justifies the name of Credit Valuation Adjustment which is used for the CVA process defined by (4). In case $\chi_{(\tau_2)} = P_{\tau_2}$ (see Remark 2.1), then

$$\xi_{(\tau_2)} = \xi^0_{(\tau_2)} := (1 - R_2) \times \begin{cases} (1 - R_1), & \tau_2 = \tau_1 \le T, \\ P^+_{\tau_2}, & \tau_2 < \tau_1,\ \tau_2 \le T, \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$

and we essentially recover the basic result that has been established in Brigo and Masetti [8]. Note that as opposed to [8] we do not exclude simultaneous defaults in our set-up, whence further terms in $\mathbf{1}_{\{t<\tau_1=\tau_2\le T\}}$ in the proof of Proposition 2.1.

Proposition 2.1. One has $\mathrm{CVA}_t = P_t - \Pi_t$ on $\{t < \tau_2\}$.

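Proof. If $\tau_1 \le t < \tau_2$, then $\Pi_t = P_t = \mathrm{CVA}_t = 0$ in view of (1), (2) and (4). Assume $t < \tau_1 \wedge \tau_2$. Subtracting $\pi_T(t)$ from $p_T(t)$ yields,

$$\beta_t\left(p_T(t) - \pi_T(t)\right) = -\kappa\int_{\tau_1\wedge\tau_2\wedge T}^{\tau_1\wedge T}\beta_s\,ds + \beta_{\tau_1}(1 - R_1)\mathbf{1}_{\{\tau_1\le T\}}\mathbf{1}_{\{\tau_1\ge\tau_2\}} - \beta_{\tau_1}R_2(1 - R_1)\mathbf{1}_{\{\tau_1\le T\}}\mathbf{1}_{\{\tau_1=\tau_2\}} - \beta_{\tau_2}\mathbf{1}_{\{\tau_2<\tau_1\}}\mathbf{1}_{\{\tau_2\le T\}}\left(R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right). \qquad (7)$$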

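Moreover, in view of (2), one has,

$$\beta_{\tau_2}p_T(\tau_2)\mathbf{1}_{\{\tau_2<\tau_1\}}\mathbf{1}_{\{\tau_2\le T\}} = -\kappa\int_{\tau_1\wedge\tau_2\wedge T}^{\tau_1\wedge T}\beta_s\,ds + (1 - R_1)\beta_{\tau_1}\mathbf{1}_{\{\tau_2<\tau_1\le T\}}. \qquad (8)$$

Now, using the following identity in the second term on the right-hand side of (7):

$$\mathbf{1}_{\{\tau_1\le T\}}\mathbf{1}_{\{\tau_1\ge\tau_2\}} = \mathbf{1}_{\{\tau_1\le T\}}\mathbf{1}_{\{\tau_2<\tau_1\}} + \mathbf{1}_{\{\tau_1=\tau_2\le T\}},$$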


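and plugging (8) into (7), we obtain (recall $t < \tau_1 \wedge \tau_2$),

$$\beta_t\left(p_T(t) - \pi_T(t)\right) = \beta_{\tau_2}\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}}p_T(\tau_2) + \beta_{\tau_2}\mathbf{1}_{\{\tau_2=\tau_1\le T\}}(1 - R_2)(1 - R_1) - \beta_{\tau_2}\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}}\left(R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right).$$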

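Thus:

• On the set $\{\tau_2 < \tau_1,\ \tau_2 \le T\}$,

$$\beta_t\left(p_T(t) - \pi_T(t)\right) = \beta_{\tau_2}p_T(\tau_2) - \beta_{\tau_2}\left(R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right).$$

As $P_{\tau_2} = \mathbb{E}_{\tau_2}[p_T(\tau_2)]$, we then have, since $R_2$ and $\chi_{(\tau_2)}$ are $\mathcal{F}_{\tau_2}$-measurable,

$$\beta_t\mathbb{E}_{\tau_2}\left[p_T(t) - \pi_T(t)\right] = \beta_{\tau_2}\left(P_{\tau_2} - \left(R_2\chi^+_{(\tau_2)} - \chi^-_{(\tau_2)}\right)\right); \qquad (9)$$

• On the set $\{\tau_1 = \tau_2 \le T\}$,

$$\beta_t\left(p_T(t) - \pi_T(t)\right) = \beta_{\tau_2}(1 - R_1)(1 - R_2)$$

and thus

$$\beta_t\mathbb{E}_{\tau_2}\left[p_T(t) - \pi_T(t)\right] = \mathbb{E}_{\tau_2}\left[\beta_{\tau_2}(1 - R_1)(1 - R_2)\right]. \qquad (10)$$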

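Using the fact that $\{\tau_2 < \tau_1,\ \tau_2 \le T\}$ and $\{\tau_2 = \tau_1 \le T\}$ are $\mathcal{F}_{\tau_2}$-measurable, it follows,

$$\beta_tP_t - \beta_t\Pi_t = \beta_t\mathbb{E}_t\left[\mathbb{E}_{\tau_2}[p_T(t) - \pi_T(t)]\right]$$

$$= \beta_t\mathbb{E}_t\left[\mathbb{E}_{\tau_2}[p_T(t) - \pi_T(t)]\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}} + \mathbb{E}_{\tau_2}[p_T(t) - \pi_T(t)]\mathbf{1}_{\{\tau_2=\tau_1\le T\}}\right]$$

$$= \mathbb{E}_t\left[\beta_{\tau_2}\xi_{(\tau_2)}\right] = \beta_t\mathrm{CVA}_t.$$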

2.3 Special Case $\mathbb{F} = \mathbb{H}$

Let $H = (H^1, H^2)$ denote the pair of the default indicator processes of the firm and the counterpart, so $H^i_t = \mathbf{1}_{\{\tau_i\le t\}}$. The following proposition gathers a few useful results that can be established in the special case of a model filtration $\mathbb{F}$ given as

$$\mathbb{F} = \mathbb{H} = \left(\mathcal{H}^1_t \vee \mathcal{H}^2_t\right)_{t\in[0,T]},$$

with $\mathcal{H}^i_t = \sigma(H^i_s;\ 0 \le s \le t)$.

Proposition 2.2. (i) For $t \in [0, T]$, any $\mathcal{H}_t$-measurable random variable $Y_t$ can be written as

$$Y_t = y_0(t)\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}} + y_1(t, \tau_1)\mathbf{1}_{\{\tau_1\le t<\tau_2\}} + y_2(t, \tau_2)\mathbf{1}_{\{\tau_2\le t<\tau_1\}} + y_3(t, \tau_1, \tau_2)\mathbf{1}_{\{\tau_2\vee\tau_1\le t\}}$$

where $y_0(t)$, $y_1(t, u)$, $y_2(t, v)$, $y_3(t, u, v)$ are deterministic functions.

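(ii) For any integrable random variable $Z$, one has

$$
\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\,\mathbb{E}_t Z = \mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\,\frac{\mathbb{E}\bigl(Z\,\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\bigr)}{\mathbb{P}(t<\tau_1\wedge\tau_2)}. \tag{11}
$$

(iii) The price process of the risky CDS is given by $\Pi_t = \Pi(t, H_t)$, for a pricing function $\Pi$ defined on $\mathbb{R}_+ \times E_1 \times E_1$ with $E_1 = \{0,1\}$, such that $\Pi(t,e) = 0$ for $e \ne (0,0)$. On the set $\{t < \tau_1\wedge\tau_2\}$, $\Pi_t$ is given by the deterministic function

$$
\Pi(t,0,0) = u(t) := \frac{\mathbb{E}\bigl[\pi_T(t)\bigr]}{\mathbb{P}(\tau_1\wedge\tau_2 > t)}. \tag{12}
$$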

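(iv) One has, for suitable functions $\chi(\cdot)$, $v(\cdot)$, $\xi(\cdot,\cdot)$ and $\mathrm{CVA}(\cdot)$,

$$
\mathbf{1}_{\{\tau_2<\tau_1\}}\,\chi(\tau_2) = \mathbf{1}_{\{\tau_2<\tau_1\}}\,\chi(\tau_2), \qquad \mathbf{1}_{\{\tau_2<\tau_1\}}\,P_{\tau_2} = \mathbf{1}_{\{\tau_2<\tau_1\}}\,v(\tau_2), \tag{13}
$$

$$
\xi(\tau_2) = \xi(\tau_1,\tau_2) := \mathbf{1}_{\{\tau_2=\tau_1\le T\}}(1-R_2)(1-R_1) + \mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}}\Bigl(v(\tau_2) - \bigl(R_2\chi^{+}(\tau_2) - \chi^{-}(\tau_2)\bigr)\Bigr), \tag{14}
$$

$$
\mathrm{CVA}_t = \mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\,\mathrm{CVA}(t). \tag{15}
$$

(v) A function $\mathrm{CVA}(\cdot)$ satisfying (15) is defined by, for $t \in [0,T]$,

$$
\beta_t\,\mathrm{CVA}(t) := \int_t^T \beta_s\,\mathrm{EPE}(s)\,\frac{\mathbb{P}(\tau_2\in ds)}{\mathbb{P}(t<\tau_1\wedge\tau_2)}. \tag{16}
$$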

Proof. (i) and (ii) are standard (see, e.g., Bielecki and Rutkowski [4]; (ii) in particular is the so-called Key Lemma).

(iii) Since there are no cash flows of a risky CDS beyond the first default (cf. (1)), one has $\pi_T(t) = \pi_T(t)\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}$. The Key Lemma then yields

$$
\Pi_t = \mathbb{E}_t\bigl[\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\,\pi_T(t)\bigr] = (1-H^1_t)(1-H^2_t)\,\frac{\mathbb{E}\bigl[\pi_T(t)\bigr]}{\mathbb{P}(\tau_1\wedge\tau_2>t)}.
$$

Thus $\Pi_t = \Pi(t, H^1_t, H^2_t)$, for a pricing function $\Pi$ defined by

$$
\Pi(t, e_1, e_2) = (1-e_1)(1-e_2)\,u(t),
$$

where $u(t)$ is defined by the right-hand side of (12).

(iv) follows directly from part (i), given the definition of $P_{\tau_2}$, $\chi(\tau_2)$, $\xi(\tau_2)$ and of the CVA process.

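(v) By (iv) and using (ii) again, one has, on the set $\{t < \tau_1\wedge\tau_2\}$,

$$
\begin{aligned}
\beta_t\,\mathrm{CVA}_t &= \mathbb{E}_t\bigl[\beta_{\tau_2}\,\xi(\tau_2)\bigr] = \mathbb{E}_t\bigl[\beta_{\tau_2}\,\xi(\tau_1,\tau_2)\bigr] \\
&= \frac{\mathbb{E}\bigl[\beta_{\tau_2}\,\xi(\tau_1,\tau_2)\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}}\bigr]}{\mathbb{P}(t<\tau_1\wedge\tau_2)} = \frac{\mathbb{E}\bigl[\mathbb{E}\bigl(\beta_{\tau_2}\,\xi(\tau_1,\tau_2)\mathbf{1}_{\{t<\tau_1\wedge\tau_2\}} \,\big|\, \tau_2\bigr)\bigr]}{\mathbb{P}(t<\tau_1\wedge\tau_2)} \\
&= \frac{\mathbb{E}\bigl[\mathbb{E}\bigl(\beta_{\tau_2}\,\xi(\tau_1,\tau_2)\mathbf{1}_{\{t<\tau_2\le T\}} \,\big|\, \tau_2\bigr)\bigr]}{\mathbb{P}(t<\tau_1\wedge\tau_2)} = \frac{\mathbb{E}\bigl[\beta_{\tau_2}\,\mathbb{E}\bigl(\xi(\tau_1,\tau_2) \,\big|\, \tau_2\bigr)\mathbf{1}_{\{t<\tau_2\le T\}}\bigr]}{\mathbb{P}(t<\tau_1\wedge\tau_2)} \\
&= \frac{\mathbb{E}\bigl[\beta_{\tau_2}\,\mathrm{EPE}(\tau_2)\mathbf{1}_{\{t<\tau_2\le T\}}\bigr]}{\mathbb{P}(t<\tau_1\wedge\tau_2)} = \int_t^T \beta_s\,\mathrm{EPE}(s)\,\frac{\mathbb{P}(\tau_2\in ds)}{\mathbb{P}(t<\tau_1\wedge\tau_2)},
\end{aligned}
$$

whence (v).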

3. Markov Copula Factor Set-Up

3.1 Factor Process Model

We shall now introduce a suitable Markovian Copula Model for the pair of default indicator processes $H = (H^1, H^2)$ of the firm and the counterpart. The name 'Markovian Copula' refers to the fact that the model will have prescribed marginals for the laws of $H^1$ and $H^2$, respectively (see Bielecki et al. [2, 3] for a general theory). The practical interest of a Markovian copula model is clear in view of the task of model calibration, since the copula property allows one to decouple the calibration of the marginal and of the dependence parameters in the model (see Section 4.1). More fundamentally, the opinion developed in this paper is that it is also a virtue for a model to 'take the right inputs to generate the right outputs', namely taking as basic inputs the individual default probabilities (individual CDS curves), which correspond to the more reliable information on the market, and are then 'coupled together' in a suitable way (see Section 4.1).

An apparent shortcoming of the Markov copula approach is that it does not allow for default contagion effects in the usual sense (default of a name impacting the default intensities of the other ones). Note also that in this work we assume that the underlying filtration is $\mathbb{H}$, and thus the default intensities are deterministic between defaults. The interest of this admittedly simplified set-up is that one is able to derive explicit formulas for most quantities of interest with regard to CDS counterparty risk, like price, CVA, EPE or hedging ratios. In a forthcoming paper, we will generalize this setting to take into account the spread risk.

The way we shall introduce dependence between $\tau_1$ and $\tau_2$ is by relaxing the standard assumption of no simultaneous defaults. As we shall see, allowing for simultaneous defaults is a powerful way of modeling default dependence.

Specifically, we model the pair $H = (H^1, H^2)$ as an inhomogeneous Markov chain relative to its own filtration $\mathbb{H}$ on a probability space $(\Omega, \mathbb{P})$ (for the $\sigma$-algebra $\mathcal{H}_T$), with state space $E = \{(0,0), (1,0), (0,1), (1,1)\}$, and generator matrix at time $t$ given by the following $4 \times 4$ matrix $A(t)$, where the first to fourth rows (or columns) correspond to the four possible states $(0,0)$, $(1,0)$, $(0,1)$ and $(1,1)$ of $H_t$:

$$
A(t) = \begin{pmatrix}
-l(t) & l_1(t) & l_2(t) & l_3(t) \\
0 & -q_2(t) & 0 & q_2(t) \\
0 & 0 & -q_1(t) & q_1(t) \\
0 & 0 & 0 & 0
\end{pmatrix}. \tag{17}
$$

In (17) the $l$'s and $q$'s denote deterministic functions of time integrable over $[0,T]$, with in particular $l(t) = l_1(t) + l_2(t) + l_3(t)$.

Remark 3.1. The intuitive meaning of '(17) being the generator matrix of $H$' is the following (see, e.g., Rogers and Williams [25], Vol. I, Chap. III, Sec. 2, for standard definitions and results on Markov chains):

• First line: Conditional on the pair $H_t = (H^1_t, H^2_t)$ being in state $(0,0)$ (firm and counterpart still alive at time $t$), there is a probability $l_1(t)\,dt$ (resp. $l_2(t)\,dt$; resp. $l_3(t)\,dt$) of a default of the firm alone (resp. of the counterpart alone; resp. of a simultaneous default of the firm and the counterpart) in the infinitesimal time interval $(t, t+dt)$;

• Second line: Conditional on the pair $H_t = (H^1_t, H^2_t)$ being in state $(1,0)$ (firm defaulted but counterpart still alive at time $t$), there is a probability $q_2(t)\,dt$ of a further default of the counterpart in the time interval $(t, t+dt)$;

• Third line: Conditional on the pair $H_t = (H^1_t, H^2_t)$ being in state $(0,1)$ (firm still alive but counterpart defaulted at time $t$), there is a probability $q_1(t)\,dt$ of a further default of the firm in the time interval $(t, t+dt)$.

On each line the diagonal term is then set as minus the sum of the off-diagonal terms, so that the entries of each line sum to zero, as they must for $A(t)$ to represent the generator of a Markov process. Moreover, for the sake of the desired Markov copula property (Proposition 3.1(iii) below), we impose the following relations between the $l$'s and the $q$'s.

Assumption 3.2. $q_1(t) = l_1(t) + l_3(t)$, $\quad q_2(t) = l_2(t) + l_3(t)$.

Observe that in virtue of these relations:

• Conditional on $H^1_t$ being in state 0, and whatever the state of $H^2_t$ may be (that is, in the state $(0,0)$ or $(0,1)$ for $H_t$), there is a probability $q_1(t)\,dt$ of a default of the firm (alone or jointly with the counterpart) in the next time interval $(t, t+dt)$;

• Conditional on $H^2_t$ being in state 0, and whatever the state of $H^1_t$ may be (that is, in the state $(0,0)$ or $(1,0)$ for $H_t$), there is a probability $q_2(t)\,dt$ of a default of the counterpart (alone or jointly with the firm) in the next time interval $(t, t+dt)$.

In mathematical terms the default indicator processes $H^1$ and $H^2$ are $\mathbb{H}$-Markov processes on the state space $E_1 = \{0,1\}$, with time-$t$ generators respectively given by

$$
A_1(t) = \begin{pmatrix} -q_1(t) & q_1(t) \\ 0 & 0 \end{pmatrix}, \qquad
A_2(t) = \begin{pmatrix} -q_2(t) & q_2(t) \\ 0 & 0 \end{pmatrix}. \tag{18}
$$
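As an illustration of the structure of (17) and (18), the following minimal Python sketch builds the generator for given values of $l_1(t)$, $l_2(t)$, $l_3(t)$ and checks that, under Assumption 3.2, the total rate at which each name defaults does not depend on the state of the other name. The function names and the numerical values are hypothetical and chosen only for illustration.

```python
# States of H_t, in the row/column order used in (17).
STATES = [(0, 0), (1, 0), (0, 1), (1, 1)]


def generator(l1, l2, l3):
    """4x4 generator A(t) of (17) for given values l1(t), l2(t), l3(t).

    Assumption 3.2 fixes q1 = l1 + l3 and q2 = l2 + l3.
    """
    q1, q2 = l1 + l3, l2 + l3
    l = l1 + l2 + l3
    return [
        [-l, l1, l2, l3],
        [0.0, -q2, 0.0, q2],
        [0.0, 0.0, -q1, q1],
        [0.0, 0.0, 0.0, 0.0],
    ]


def marginal_jump_rate(A, state, name):
    """Total rate at which component `name` (0 = firm, 1 = counterpart)
    jumps from 0 to 1 when the chain sits in `state`."""
    i = STATES.index(state)
    return sum(
        A[i][j]
        for j, target in enumerate(STATES)
        if state[name] == 0 and target[name] == 1
    )


# Illustrative (assumed) intensity values at some fixed time t.
A = generator(l1=0.010, l2=0.006, l3=0.004)
firm_rates = [marginal_jump_rate(A, s, 0) for s in [(0, 0), (0, 1)]]
counterpart_rates = [marginal_jump_rate(A, s, 1) for s in [(0, 0), (1, 0)]]
```

Both entries of `firm_rates` equal $q_1 = l_1 + l_3$ and both entries of `counterpart_rates` equal $q_2 = l_2 + l_3$, which is the state-independence behind the marginal generators in (18).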

To formalize the previous statements, and in view of the study of simultaneous jumps, let us further introduce the processes $H^{\{1\}}$, $H^{\{2\}}$ and $H^{\{1,2\}}$ standing for the indicator processes of a default of the firm alone, of the counterpart alone, and of a simultaneous default of the firm and the counterpart, respectively. So

$$
H^{\{1,2\}} = [H^1, H^2], \qquad H^{\{1\}} = H^1 - H^{\{1,2\}}, \qquad H^{\{2\}} = H^2 - H^{\{1,2\}}, \tag{19}
$$

where $[\cdot,\cdot]$ stands for the quadratic covariation. Equivalently, for $t \in [0,T]$,

$$
H^{\{1\}}_t = \mathbf{1}_{\{\tau_1\le t,\ \tau_1\ne\tau_2\}}, \qquad H^{\{2\}}_t = \mathbf{1}_{\{\tau_2\le t,\ \tau_1\ne\tau_2\}}, \qquad H^{\{1,2\}}_t = \mathbf{1}_{\{\tau_1=\tau_2\le t\}}.
$$

Note that the natural filtration of $(H^{\iota})_{\iota\in I}$, with here and henceforth $I = \{\{1\}, \{2\}, \{1,2\}\}$, is equal to $\mathbb{H}$. The proof of the following Proposition is deferred to Appendix 5.

Proposition 3.1. (i) The $\mathbb{H}$-intensity of $H^{\iota}$ is of the form $q_{\iota}(t, H_t)$ for a suitable function $q_{\iota}(t,e)$, for every $\iota \in I$, namely,

$$
\begin{aligned}
q_{\{1\}}(t,e) &= \mathbf{1}_{e_1=0}\bigl(\mathbf{1}_{e_2=0}\,l_1(t) + \mathbf{1}_{e_2=1}\,q_1(t)\bigr), \\
q_{\{2\}}(t,e) &= \mathbf{1}_{e_2=0}\bigl(\mathbf{1}_{e_1=0}\,l_2(t) + \mathbf{1}_{e_1=1}\,q_2(t)\bigr), \\
q_{\{1,2\}}(t,e) &= \mathbf{1}_{e=(0,0)}\,l_3(t).
\end{aligned}
$$

Put another way, the processes $M^{\iota}$ defined by, for every $\iota \in I$,

$$
M^{\iota}_t = H^{\iota}_t - \int_0^t q_{\iota}(s, H_s)\,ds, \tag{20}
$$

with

$$
\begin{aligned}
q_{\{1\}}(t,H_t) &= (1-H^1_t)\bigl((1-H^2_t)\,l_1(t) + H^2_t\,q_1(t)\bigr), \\
q_{\{2\}}(t,H_t) &= (1-H^2_t)\bigl((1-H^1_t)\,l_2(t) + H^1_t\,q_2(t)\bigr), \\
q_{\{1,2\}}(t,H_t) &= (1-H^1_t)(1-H^2_t)\,l_3(t),
\end{aligned}
\tag{21}
$$

are $\mathbb{H}$-martingales.

(ii) The $\mathbb{H}$-intensity process of $H^i$ is given by $(1-H^i_t)\,q_i(t)$. In other words, the processes $M^i$ defined by, for $i = 1, 2$,

$$
M^i_t = H^i_t - \int_0^t (1-H^i_s)\,q_i(s)\,ds, \tag{22}
$$

are $\mathbb{H}$-martingales.

(iii) The processes $H^1$ and $H^2$ are $\mathbb{H}$-Markov processes with generator matrix at time $t$ given by $A_1(t)$ and $A_2(t)$ (cf. (18)).

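(iv) One has

$$
\mathbb{P}(\tau_1 > s,\ \tau_2 > t) = \exp\Bigl(-\int_0^s l_1(u)\,du - \int_0^t l_2(u)\,du - \int_0^{s\vee t} l_3(u)\,du\Bigr) \tag{23}
$$

and therefore

$$
\begin{aligned}
&\mathbb{P}(\tau_1 > t) = e^{-\int_0^t q_1(u)\,du}, \qquad \mathbb{P}(\tau_2 > t) = e^{-\int_0^t q_2(u)\,du}, \\
&\mathbb{P}(\tau_1\wedge\tau_2 > t) = e^{-\int_0^t l(u)\,du}, \\
&\mathbb{P}(\tau_1 > s,\ \tau_2 \in dt) = q_2(t)\,e^{-\int_0^s l(u)\,du}\,e^{-\int_s^t q_2(u)\,du}\,dt, \\
&\mathbb{P}(\tau_1 \in dt,\ \tau_2 > s) = q_1(t)\,e^{-\int_0^s l(u)\,du}\,e^{-\int_s^t q_1(u)\,du}\,dt, \\
&\mathbb{P}(\tau_1 > t,\ \tau_2 \in dt) = q_2(t)\,e^{-\int_0^t l(u)\,du}\,dt, \\
&\mathbb{P}(\tau_1 \in dt,\ \tau_2 > t) = q_1(t)\,e^{-\int_0^t l(u)\,du}\,dt.
\end{aligned}
\tag{24}
$$

(v) The correlation of $H^1_t$ and $H^2_t$ (default correlation at the time horizon $t$) is

$$
\rho_d(t) = \frac{\exp\bigl(\int_0^t l_3(s)\,ds\bigr) - 1}{\sqrt{\Bigl(\exp\bigl(\int_0^t q_1(s)\,ds\bigr) - 1\Bigr)\Bigl(\exp\bigl(\int_0^t q_2(s)\,ds\bigr) - 1\Bigr)}}. \tag{25}
$$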

Remark 3.3. (i) In the Markov copula terminology of Bielecki et al. [3], the so-called consistency condition is satisfied ($H^1$ and $H^2$ are $\mathbb{H}$-Markov processes). The bivariate model $H$ with generator $A$ is thus a Markovian copula model with marginal generators $A_1$ and $A_2$.

(ii) The default times $\tau_1$ and $\tau_2$ could equivalently be defined by

$$
\tau_1 = \eta_1 \wedge \eta_3, \qquad \tau_2 = \eta_2 \wedge \eta_3,
$$

where the $\eta_i$'s are independent inhomogeneous exponential random variables with parameters $l_i(t)$. Thus, for every $s, t \ge 0$,

$$
\mathbb{P}(\tau_1 > s,\ \tau_2 > t) = \mathbb{P}(\eta_1 > s)\,\mathbb{P}(\eta_2 > t)\,\mathbb{P}(\eta_3 > s\vee t). \tag{26}
$$

In the special case of homogeneous exponential random variables with (constant) parameters $l_i$, one has further (see Section 4 of Embrechts et al. [16] or Marshall and Olkin [23]),

$$
\mathbb{P}(\tau_1 > s,\ \tau_2 > t) = C\bigl(\mathbb{P}(\tau_1 > s),\ \mathbb{P}(\tau_2 > t)\bigr), \tag{27}
$$

where the Marshall-Olkin survival copula function $C$ is defined by, for $p, q \in [0,1]$,

$$
C(p,q) = p\,q\,\min\bigl(p^{-\alpha_1},\ q^{-\alpha_2}\bigr) \tag{28}
$$

with $\alpha_i = \dfrac{l_3}{l_i + l_3}$. Our model is thus an extension of the classical Marshall-Olkin copula model, in which inhomogeneous exponential random variables are used as model inputs and where, more importantly, a dynamic perspective is shed on the random times $\tau_1$ and $\tau_2$ by introducing the model filtration $\mathbb{H}$.
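The common-shock representation of Remark 3.3(ii) can be sketched in a few lines of Python for the homogeneous case. The sketch below is only an illustration under assumed constant intensities (the values are exaggerated so that a modest sample size suffices). It samples $(\tau_1, \tau_2)$ as $(\eta_1\wedge\eta_3, \eta_2\wedge\eta_3)$ and compares the empirical joint survival probability with (23) and with the copula form (27)-(28), using the marginal survival probabilities of $\tau_1$ and $\tau_2$ as copula arguments. All function names are hypothetical.

```python
import math
import random


def joint_survival(s, t, l1, l2, l3):
    """P(tau1 > s, tau2 > t) from (23) with constant intensities."""
    return math.exp(-l1 * s - l2 * t - l3 * max(s, t))


def marshall_olkin_copula(p, q, alpha1, alpha2):
    """Survival copula (28): C(p, q) = p q min(p^-alpha1, q^-alpha2)."""
    return p * q * min(p ** -alpha1, q ** -alpha2)


def simulate_default_times(l1, l2, l3, n, seed=0):
    """Sample (tau1, tau2) = (eta1 ^ eta3, eta2 ^ eta3), eta_i ~ Exp(l_i)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        eta1 = rng.expovariate(l1)
        eta2 = rng.expovariate(l2)
        eta3 = rng.expovariate(l3)
        samples.append((min(eta1, eta3), min(eta2, eta3)))
    return samples


# Assumed, deliberately large, constant intensities.
l1, l2, l3 = 0.10, 0.06, 0.04
s, t = 3.0, 5.0

exact = joint_survival(s, t, l1, l2, l3)
via_copula = marshall_olkin_copula(
    math.exp(-(l1 + l3) * s),      # P(tau1 > s)
    math.exp(-(l2 + l3) * t),      # P(tau2 > t)
    l3 / (l1 + l3),
    l3 / (l2 + l3),
)

samples = simulate_default_times(l1, l2, l3, n=100_000)
empirical = sum(1 for t1, t2 in samples if t1 > s and t2 > t) / len(samples)
joint_default_freq = sum(1 for t1, t2 in samples if t1 == t2) / len(samples)
```

In this construction a simultaneous default occurs exactly when $\eta_3$ is the smallest of the three clocks, so `joint_default_freq` should be close to $l_3 / l$.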

3.2 Pricing

We use the notation of Proposition 2.2, which applies here since we are in the special case $\mathbb{F} = \mathbb{H}$. Recall in particular $\Pi_t = \Pi(t, H_t) = (1-H^1_t)(1-H^2_t)\,u(t)$, for a pricing function $\Pi(t,0,0) = u(t)$, as well as the identities (13), (15), (16). We assume henceforth for simplicity that:

• The discount factor writes $\beta_t = \exp\bigl(-\int_0^t r(s)\,ds\bigr)$, for a deterministic short-term interest-rate function $r$,

• The recovery rates $R_1$ and $R_2$ are constant.

Proposition 3.2. The pricing function $u$ of the risky CDS is given by

$$
\beta_t\,u(t) = \int_t^T \beta_s\,e^{-\int_t^s l(x)\,dx}\,\pi(s)\,ds \tag{29}
$$

with

$$
\pi(s) = (1-R_1)\bigl[l_1(s) + R_2\,l_3(s)\bigr] + l_2(s)\bigl[R_2\,\chi^{+}(s) - \chi^{-}(s)\bigr] - \kappa. \tag{30}
$$

The function $u$ satisfies the following ODE:

$$
\begin{cases}
u(T) = 0, \\[2pt]
\dfrac{du}{dt}(t) - \bigl(r(t) + l(t)\bigr)\,u(t) + \pi(t) = 0, \quad t \in [0,T).
\end{cases}
\tag{31}
$$

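Proof. Recall (12):

$$
u(t) = \frac{\mathbb{E}\bigl[\pi_T(t)\bigr]}{\mathbb{P}(\tau_1\wedge\tau_2 > t)},
$$

where the denominator can be calculated using Proposition 3.1(iv). For computing the numerator, one rewrites the expressions for the cumulative discounted Fee, Protection and Close-out cash flows in terms of integrals with respect to $H^{\{1\}}$, $H^{\{2\}}$ and $H^{\{1,2\}}$, as follows:

$$
\begin{aligned}
\text{Fees Cash Flow} ={}& \kappa\int_0^T \beta_s(1-H^1_s)(1-H^2_s)\,ds, \\[4pt]
\text{Protection Cash Flow} ={}& (1-R_1)\int_0^T \beta_s(1-H^2_{s-})\,dH^{\{1\}}_s + R_2(1-R_1)\int_0^T \beta_s\,dH^{\{1,2\}}_s \\
={}& (1-R_1)\int_0^T \beta_s(1-H^2_{s-})\,dM^{\{1\}}_s + (1-R_1)\int_0^T \beta_s(1-H^2_s)\,q_{\{1\}}(s,H_s)\,ds \\
& + R_2(1-R_1)\int_0^T \beta_s\,dM^{\{1,2\}}_s + R_2(1-R_1)\int_0^T \beta_s\,q_{\{1,2\}}(s,H_s)\,ds, \\[4pt]
\text{Close-out Cash Flow} ={}& \int_0^T \beta_s\bigl[R_2\chi^{+}(s) - \chi^{-}(s)\bigr](1-H^1_{s-})\,dH^{\{2\}}_s \\
={}& \int_0^T \beta_s\bigl[R_2\chi^{+}(s) - \chi^{-}(s)\bigr](1-H^1_{s-})\,dM^{\{2\}}_s \\
& + \int_0^T \beta_s\bigl[R_2\chi^{+}(s) - \chi^{-}(s)\bigr](1-H^1_s)\,q_{\{2\}}(s,H_s)\,ds.
\end{aligned}
$$

Making use of the martingale property of $M^{\{1\}}$, $M^{\{2\}}$ and $M^{\{1,2\}}$, and of the fact that the integrals of bounded predictable processes with respect to these martingales are indeed martingales, we thus have

$$
\mathbb{E}\bigl(\pi_T(t)\bigr) = \mathbb{E}\bigl(\tilde{\pi}_T(t)\bigr) \tag{32}
$$

with

$$
\begin{aligned}
\beta_t\,\tilde{\pi}_T(t) ={}& -\kappa\int_t^T \beta_s(1-H^1_s)(1-H^2_s)\,ds + (1-R_1)\int_t^T \beta_s(1-H^2_s)\,q_{\{1\}}(s,H_s)\,ds \\
& + R_2(1-R_1)\int_t^T \beta_s\,q_{\{1,2\}}(s,H_s)\,ds + \int_t^T \beta_s\bigl[R_2\chi^{+}(s) - \chi^{-}(s)\bigr](1-H^1_s)\,q_{\{2\}}(s,H_s)\,ds.
\end{aligned}
$$

Moreover, in view of the expressions for $q_{\{1\}}$ and $q_{\{2\}}$ in (21), one has

$$
\begin{aligned}
(1-H^2_s)\,q_{\{1\}}(s,H_s) &= (1-H^1_s)(1-H^2_s)\,l_1(s), \\
(1-H^1_s)\,q_{\{2\}}(s,H_s) &= (1-H^1_s)(1-H^2_s)\,l_2(s).
\end{aligned}
\tag{33}
$$

Plugging this into (32) and using (24), it follows that

$$
\begin{aligned}
\beta_t\,\mathbb{E}\bigl[\pi_T(t)\bigr] &= \mathbb{E}\Bigl[\int_t^T \beta_s(1-H^1_s)(1-H^2_s)\,\pi(s)\,ds\Bigr] \\
&= \int_t^T \beta_s\,\mathbb{E}\bigl[(1-H^1_s)(1-H^2_s)\bigr]\,\pi(s)\,ds = \int_t^T \beta_s\,e^{-\int_0^s l(x)\,dx}\,\pi(s)\,ds,
\end{aligned}
$$

where $\pi$ is given by (30). Dividing by $\mathbb{P}(\tau_1\wedge\tau_2 > t) = e^{-\int_0^t l(x)\,dx}$ yields (29). One can now check by inspection that the function $u$ satisfies the ODE (31).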

Remark 3.4. The equation (31) can also be interpreted as the Kolmogorov backward equation related to the valuation of a risky CDS in our set-up. This ODE can in fact be derived directly and independently by an application of the Itô formula to the martingale $\Pi(t, H^1_t, H^2_t)$, which results in an alternative proof of Proposition 3.2.

Remark 3.5. In the set-up of the Markov chain copula model, the identity (whenever assumed) $\chi(\tau_2) = \Pi_{\tau_2-}$ (see Remark 2.1) is thus equivalent to

$$
\chi(\tau_2) = \Pi_{\tau_2-} = \lim_{t\to\tau_2-} u(t) = u(\tau_2),
$$

by continuity of $u$. This case thus corresponds to the case where the function $\chi$ in Proposition 2.2(iv) is in fact given by the function $u$ (case $\chi = u$). In this case the positive and negative parts of $u$, i.e., $u^+$ and $u^-$, appear in the expression for $\pi$ in (30). One thus deals with a non-linear valuation ODE (31), and the formula (29) is no longer explicit, since $u$ is 'hidden' in $\pi$ on the right-hand side of this formula. However, one can still compute $u$ by numerical solution of (31).
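The numerical solution mentioned in Remark 3.5 can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes constant intensities and a constant short rate, integrates (31) backward from $u(T) = 0$ with a classical fourth-order Runge-Kutta scheme, and takes the close-out value as a user-supplied function `chi(t, u)`. The function name `solve_backward` and the value of `a3` are hypothetical.

```python
import math


def solve_backward(T, r, l1, l2, l3, R1, R2, kappa, chi, n_steps=2000):
    """Integrate ODE (31) backward from u(T) = 0 with RK4 and return u(0).

    chi(t, u) is the close-out value entering (30); chi = lambda t, u: u
    gives the non-linear case chi = u of Remark 3.5.
    """
    l = l1 + l2 + l3

    def dudt(t, u):
        c = chi(t, u)
        # pi of (30), with c^+ = max(c, 0) and c^- = max(-c, 0).
        pi = ((1 - R1) * (l1 + R2 * l3)
              + l2 * (R2 * max(c, 0.0) - max(-c, 0.0))
              - kappa)
        return (r + l) * u - pi

    h = -T / n_steps  # negative step: march from T down to 0
    t, u = T, 0.0
    for _ in range(n_steps):
        k1 = dudt(t, u)
        k2 = dudt(t + h / 2, u + h / 2 * k1)
        k3 = dudt(t + h / 2, u + h / 2 * k2)
        k4 = dudt(t + h, u + h * k3)
        u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return u


# Illustrative constant-intensity inputs (a3 is an assumed value).
a1, a2, a3 = 0.0140, 0.0083, 0.00042
params = dict(T=10.0, r=0.05, l1=a1 - a3, l2=a2 - a3, l3=a3,
              R1=0.4, R2=0.4, kappa=(1 - 0.4) * a1)

u0_at_0 = solve_backward(chi=lambda t, u: 0.0, **params)  # chi = v = 0
u1_at_0 = solve_backward(chi=lambda t, u: u, **params)    # chi = u
```

With these inputs the spread is the fair risk-free spread, so $v \equiv 0$ and the first call corresponds to $\chi = v$. In both cases $u$ stays negative, the ODE is effectively linear, and the two values can be compared with the corresponding closed-form exponential expressions.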

Proposition 3.3. The price of a risk-free CDS with spread $\kappa$ on the firm admits the representation

$$
P_t = P(t, H^1_t), \tag{34}
$$

for a function $P$ of the form $P(t, e_1) = (1-e_1)\,v(t)$. The pricing function $v$ is given by

$$
\beta_t\,v(t) = \int_t^T \beta_s\,e^{-\int_t^s q_1(x)\,dx}\,p(s)\,ds
$$

with

$$
p(s) = (1-R_1)\,q_1(s) - \kappa. \tag{35}
$$

The pricing function $v$ thus solves the following pricing ODE:

$$
\begin{cases}
v(T) = 0, \\[2pt]
\dfrac{dv}{dt}(t) - \bigl(r(t) + q_1(t)\bigr)\,v(t) + p(t) = 0, \quad t \in [0,T).
\end{cases}
$$

Proof. One has

$$
\begin{aligned}
\beta_t\,p_T(t) &= -\kappa\int_t^T \beta_s(1-H^1_s)\,ds + (1-R_1)\int_t^T \beta_s\,dH^1_s \\
&= -\kappa\int_t^T \beta_s(1-H^1_s)\,ds + (1-R_1)\int_t^T \beta_s\,dM^1_s + (1-R_1)\int_t^T \beta_s\,q_1(s)(1-H^1_s)\,ds.
\end{aligned}
$$

As $M^1$ is an $\mathbb{H}$-martingale and $\beta$ a bounded continuous function, one thus has

$$
\beta_t\,\mathbb{E}_t\bigl[p_T(t)\bigr] = \mathbb{E}_t\Bigl[\int_t^T \beta_s(1-H^1_s)\,p(s)\,ds\Bigr] = \int_t^T \beta_s\,\mathbb{E}_t\bigl[1-H^1_s\bigr]\,p(s)\,ds, \tag{36}
$$

with $p$ defined by (35), and where, in virtue of Proposition 3.1(iii) and Proposition 2.2(ii) (Key Lemma), one has for $t < s$,

$$
\mathbb{E}_t\bigl[1-H^1_s\bigr] = \mathbb{E}\bigl[1-H^1_s \,\big|\, H^1_t\bigr] = (1-H^1_t)\,\frac{\mathbb{P}(\tau_1 > s)}{\mathbb{P}(\tau_1 > t)} = (1-H^1_t)\,e^{-\int_t^s q_1(x)\,dx}.
$$

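Proposition 3.4. One has, for $t \in [0,T]$ (cf. (13), (15), (16)),

$$
\mathrm{EPE}(t) = \Bigl((1-R_2)(1-R_1)\,\frac{l_3(t)}{q_2(t)} + \bigl(v(t) - (R_2\chi^{+}(t) - \chi^{-}(t))\bigr)\frac{l_2(t)}{q_2(t)}\Bigr)\,e^{-\int_0^t l_1(x)\,dx}, \tag{37}
$$

$$
\beta_t\,\mathrm{CVA}(t) = \int_t^T \beta_s\Bigl((1-R_2)(1-R_1)\,l_3(s) + \bigl(v(s) - (R_2\chi^{+}(s) - \chi^{-}(s))\bigr)\,l_2(s)\Bigr)\,e^{-\int_t^s l(x)\,dx}\,ds, \tag{38}
$$

which in the special case where $\chi(\tau_2) = P_{\tau_2}$, $\chi = v$, reduce to $\mathrm{EPE}(t) = \mathrm{EPE}^0(t)$ and $\mathrm{CVA}(t) = \mathrm{CVA}^0(t)$, with

$$
\mathrm{EPE}^0(t) := (1-R_2)\Bigl((1-R_1)\,\frac{l_3(t)}{q_2(t)} + v^{+}(t)\,\frac{l_2(t)}{q_2(t)}\Bigr)\,e^{-\int_0^t l_1(x)\,dx}, \tag{39}
$$

$$
\beta_t\,\mathrm{CVA}^0(t) := \int_t^T (1-R_2)\,\beta_s\bigl((1-R_1)\,l_3(s) + v^{+}(s)\,l_2(s)\bigr)\,e^{-\int_t^s l(x)\,dx}\,ds. \tag{40}
$$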

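Proof. Set

$$
\Phi(\tau_2) = \mathbb{E}\bigl(\mathbf{1}_{\{\tau_1=\tau_2\le T\}} \,\big|\, \tau_2\bigr), \qquad \Psi(\tau_2) = \mathbb{E}\bigl(\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}} \,\big|\, \tau_2\bigr),
$$

which are characterized by

$$
\begin{aligned}
\mathbb{E}\bigl(\Phi(\tau_2)\,f(\tau_2)\bigr) &= \mathbb{E}\bigl(f(\tau_2)\,\mathbf{1}_{\{\tau_1=\tau_2\le T\}}\bigr), \\
\mathbb{E}\bigl(\Psi(\tau_2)\,f(\tau_2)\bigr) &= \mathbb{E}\bigl(f(\tau_2)\,\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}}\bigr),
\end{aligned}
\tag{41}
$$

for every Borel function $f$. In particular we take $f(x) = \mathbf{1}_{x\le t}$ for some $t \in (0,T]$. Now, using the law of $\tau_2$, the left-hand sides of (41) are given by

$$
\begin{aligned}
\mathbb{E}\bigl(\Phi(\tau_2)\,\mathbf{1}_{\{\tau_2\le t\}}\bigr) &= \int_0^t \Phi(s)\,q_2(s)\,e^{-\int_0^s q_2(x)\,dx}\,ds, \\
\mathbb{E}\bigl(\Psi(\tau_2)\,\mathbf{1}_{\{\tau_2\le t\}}\bigr) &= \int_0^t \Psi(s)\,q_2(s)\,e^{-\int_0^s q_2(x)\,dx}\,ds.
\end{aligned}
$$

As for the right-hand sides of (41), thanks to Proposition 3.1(i) and (iv), one has

$$
\mathbb{E}\bigl(\mathbf{1}_{\{\tau_2\le t\}}\mathbf{1}_{\{\tau_1=\tau_2\le T\}}\bigr) = \mathbb{E}\Bigl(\int_0^t dH^{\{1,2\}}_s\Bigr) = \int_0^t \mathbb{E}\bigl((1-H^1_s)(1-H^2_s)\bigr)\,l_3(s)\,ds = \int_0^t e^{-\int_0^s l(x)\,dx}\,l_3(s)\,ds,
$$

and

$$
\begin{aligned}
\mathbb{E}\bigl(\mathbf{1}_{\{\tau_2\le t\}}\mathbf{1}_{\{\tau_2<\tau_1,\ \tau_2\le T\}}\bigr) &= \mathbb{E}\Bigl(\int_0^t \mathbf{1}_{\{s\le\tau_1\wedge T\}}\,dH^{\{2\}}_s\Bigr) = \mathbb{E}\Bigl(\int_0^t \mathbf{1}_{\{s\le\tau_1\}}\,q_{\{2\}}(s,H_s)\,ds\Bigr) \\
&= \mathbb{E}\Bigl(\int_0^t (1-H^1_s)(1-H^2_s)\,l_2(s)\,ds\Bigr) = \int_0^t e^{-\int_0^s l(x)\,dx}\,l_2(s)\,ds,
\end{aligned}
$$

where the second identity in the first line uses that $H^{\{2\}}$ does not jump at $\tau_1$.

Thus for $f(x) = \mathbf{1}_{x\le t}$ the identities in (41) can be rewritten as

$$
\begin{aligned}
\int_0^t \Phi(s)\,q_2(s)\,e^{-\int_0^s q_2(x)\,dx}\,ds &= \int_0^t l_3(s)\,e^{-\int_0^s l(x)\,dx}\,ds, \\
\int_0^t \Psi(s)\,q_2(s)\,e^{-\int_0^s q_2(x)\,dx}\,ds &= \int_0^t l_2(s)\,e^{-\int_0^s l(x)\,dx}\,ds.
\end{aligned}
$$

Taking the derivative with respect to $t$ of these last equations leads us to

$$
\Phi(t) = \frac{l_3(t)\,e^{-\int_0^t l(x)\,dx}}{q_2(t)\,e^{-\int_0^t q_2(x)\,dx}}, \qquad \Psi(t) = \frac{l_2(t)\,e^{-\int_0^t l(x)\,dx}}{q_2(t)\,e^{-\int_0^t q_2(x)\,dx}},
$$

and (37) follows. Using (16), one then has for $t \in [0,T]$,

$$
\begin{aligned}
\beta_t\,\mathrm{CVA}(t) &= \int_t^T \beta_s\,\mathrm{EPE}(s)\,e^{\int_0^s l(x)\,dx}\,e^{-\int_0^s q_2(x)\,dx}\,q_2(s)\,e^{-\int_t^s l(x)\,dx}\,ds \\
&= \int_t^T \beta_s\,\mathrm{EPE}(s)\,e^{\int_0^s l_1(x)\,dx}\,q_2(s)\,e^{-\int_t^s l(x)\,dx}\,ds.
\end{aligned}
$$

Hence (38) follows from (37).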

Remark 3.6. In view of the option-theoretic interpretation of the CVA, the CVA valuation formula (38) can also be established directly, without passing through the EPE, much like formula (29) in Proposition 3.2 above (using a probabilistic computation, or resorting to the related Kolmogorov pricing ODE).

3.3 Hedging

We now give a few preliminary results about hedging the risky CDS. We shall mainly consider the issue of delta-hedging, at least partially, the risky CDS by a risk-free CDS which would also be available on the market (CDS on the firm with the same characteristics, except for the counterparty credit risk). Another perspective on the counterparty credit risk of the risky CDS can thus be given by assessing to what extent the risky CDS could, in principle, be hedged by the risk-free CDS.

3.3.1 Price dynamics

Let $\widehat{\Pi}$ denote the discounted cum-dividend price of the risky CDS, that is, the local martingale

$$
\widehat{\Pi}_t = \beta_t\,\Pi_t + \pi_t(0).
$$

The Itô formula applied to $\Pi_t = \Pi(t, H_t)$ yields, on $[0, \tau_1\wedge\tau_2\wedge T]$,

$$
d\widehat{\Pi}_t = \beta_t\Bigl(\delta\Pi^{1}(t)\,dM^{\{1\}}_t + \delta\Pi^{2}(t)\,dM^{\{2\}}_t + \delta\Pi^{1,2}(t)\,dM^{\{1,2\}}_t\Bigr) \tag{42}
$$

with

$$
\delta\Pi^{1}(t) = 1 - R_1 - u(t), \qquad \delta\Pi^{2}(t) = R_2\chi^{+}(t) - \chi^{-}(t) - u(t), \qquad \delta\Pi^{1,2}(t) = R_2(1-R_1) - u(t).
$$

Similarly, setting $\widehat{P}_t = \beta_t\,P_t + p_t(0)$, it follows that

$$
d\widehat{P}_t = \beta_t\,\delta P^{1}(t)\,dM^{1}_t \tag{43}
$$

with $\delta P^{1}(t) = 1 - R_1 - v(t)$.

3.3.2 Min-variance hedging

Let us denote by $\psi$ a (self-financing) strategy in the risk-free CDS with price process $P$ (and the savings account $\beta_t^{-1}$) for tentatively hedging the risky CDS with price process $\Pi$.

Recall that $\mathbb{P}$ is the risk-neutral probability chosen by the market. So the discounted cum-dividend price process $\widehat{P}$ is a $\mathbb{P}$-local martingale (actually, in view of (43), $\widehat{P}$ is here a $\mathbb{P}$-martingale). As a result of the Galtchouk-Kunita-Watanabe decomposition, the hedging strategy $\psi^{va}$ which minimizes the $\mathbb{P}$-variance of the hedging error, or min-variance hedging strategy, is given by

$$
\psi^{va}_t = \frac{d\langle \widehat{\Pi}, \widehat{P}\rangle_t}{d\langle \widehat{P}\rangle_t}.
$$

Remark 3.7. Note that we only deal with minimization of the risk-neutral variance of the hedging error here, as opposed to the more difficult problem of minimizing the variance of the hedging error under the historical probability measure.

In view of the price dynamics (42)-(43), one has, for $t \le \tau_1\wedge\tau_2$,

$$
\frac{d\langle \widehat{\Pi}, \widehat{P}\rangle_t}{d\langle \widehat{P}\rangle_t} = \frac{l_1(t)\,\delta\Pi^{1}(t)\,\delta P^{1}(t) + l_3(t)\,\delta\Pi^{1,2}(t)\,\delta P^{1}(t)}{q_1(t)\,\bigl(\delta P^{1}(t)\bigr)^2}.
$$

So

$$
\psi^{va}_t = \frac{l_1(t)}{q_1(t)}\,\frac{1 - R_1 - u(t)}{1 - R_1 - v(t)} + \frac{l_3(t)}{q_1(t)}\,\frac{R_2(1-R_1) - u(t)}{1 - R_1 - v(t)}
$$

on $[0, \tau_1\wedge\tau_2\wedge T]$ (and $\psi^{va} = 0$ on $(\tau_1\wedge\tau_2\wedge T, T]$). The related min-variance hedging reduction factor writes:

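$$
\frac{\mathrm{Var}(\widehat{\Pi}_T)}{\mathrm{Var}\bigl(\widehat{\Pi}_T - \int_0^T \psi^{va}_t\,d\widehat{P}_t\bigr)} = \frac{\mathrm{Var}(\widehat{\Pi}_T)}{\mathrm{Var}(\widehat{\Pi}_T) + \mathrm{Var}\bigl(\int_0^T \psi^{va}_t\,d\widehat{P}_t\bigr) - 2\,\mathrm{Cov}\bigl(\widehat{\Pi}_T, \int_0^T \psi^{va}_t\,d\widehat{P}_t\bigr)}, \tag{44}
$$

where:

$$
\begin{aligned}
\mathrm{Var}(\widehat{\Pi}_T) &= \mathbb{E}\langle \widehat{\Pi}\rangle_T = \mathbb{E}\int_0^{\tau_1\wedge\tau_2\wedge T} \Bigl(l_1(t)\bigl(\delta\Pi^{1}(t)\bigr)^2 + l_2(t)\bigl(\delta\Pi^{2}(t)\bigr)^2 + l_3(t)\bigl(\delta\Pi^{1,2}(t)\bigr)^2\Bigr)\,dt, \\
\mathrm{Var}\Bigl(\int_0^T \psi^{va}_t\,d\widehat{P}_t\Bigr) &= \mathbb{E}\Bigl\langle \int_0^{\cdot} \psi^{va}_t\,d\widehat{P}_t\Bigr\rangle_T = \mathbb{E}\int_0^{\tau_1\wedge\tau_2\wedge T} q_1(t)\bigl(\psi^{va}_t\,\delta P^{1}(t)\bigr)^2\,dt, \\
\mathrm{Cov}\Bigl(\widehat{\Pi}_T, \int_0^T \psi^{va}_t\,d\widehat{P}_t\Bigr) &= \mathbb{E}\Bigl\langle \widehat{\Pi}, \int_0^{\cdot} \psi^{va}_t\,d\widehat{P}_t\Bigr\rangle_T = \mathbb{E}\int_0^{\tau_1\wedge\tau_2\wedge T} \bigl(l_1(t)\,\delta\Pi^{1}(t) + l_3(t)\,\delta\Pi^{1,2}(t)\bigr)\,\psi^{va}_t\,\delta P^{1}(t)\,dt.
\end{aligned}
\tag{45}
$$

The various quantities that arise in (45), and therefore the hedging reduction factor given by (44), can be computed by Monte Carlo simulation.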
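Before any simulation, the pre-default hedge ratio $\psi^{va}_t$ of Section 3.3.2 is a simple deterministic function of $u(t)$, $v(t)$ and the intensities. The sketch below evaluates it pointwise; the function name and the numerical inputs are hypothetical and serve only to illustrate the formula.

```python
def min_variance_ratio(u, v, l1, l3, R1, R2):
    """Pre-default min-variance hedge ratio psi^va_t of Section 3.3.2.

    u, v are the values u(t), v(t) of the risky and risk-free pricing
    functions; l1, l3 are the values l1(t), l3(t); q1 = l1 + l3 by
    Assumption 3.2.
    """
    q1 = l1 + l3
    delta_pi_1 = 1 - R1 - u              # jump of the risky CDS, firm alone
    delta_pi_12 = R2 * (1 - R1) - u      # jump of the risky CDS, joint default
    delta_p_1 = 1 - R1 - v               # jump of the risk-free CDS
    return (l1 * delta_pi_1 + l3 * delta_pi_12) / (q1 * delta_p_1)


# Illustrative (assumed) inputs with u = v = 0.
ratio = min_variance_ratio(u=0.0, v=0.0, l1=0.0136, l3=0.0004, R1=0.4, R2=0.4)
no_joint_default = min_variance_ratio(u=0.01, v=0.01, l1=0.014, l3=0.0,
                                      R1=0.4, R2=0.4)
```

When $l_3 = 0$ and $u = v$ the ratio equals one, as expected: without joint defaults and with identical prices, the firm-default jump of the risky CDS is matched one-for-one by the risk-free CDS.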

Remark 3.8. The previous min-variance hedging strategy can be easily extended to multi-instrument hedging schemes. In case three non-redundant hedging instruments are available, then, in view of (42), the risky CDS can be perfectly replicated.

4. Implementation

4.1 Affine Intensities Model Specification

Note that the Markov chain copula model primitives are the marginal pre-default intensity functions $q_1$ and $q_2$ as well as the 'dependence intensity function' $l_3$ in $A(t)$ (cf. (17)).

Let us specify, for constants $a_i$ and $b_i$,

$$
q_i(t) = a_i + b_i\,t, \qquad l_3(t) = a_3 + b_3\,t, \tag{46}
$$

with

$$
a_3 = \alpha\,\min\{a_1, a_2\}, \qquad b_3 = \alpha\,\min\{b_1, b_2\},
$$

for a model dependence parameter $\alpha \in [0,1]$ (for the sake of Assumption 3.2).

Remark 4.1. Such an affine specification of intensities was already used by Bielecki et al. [2] in a context of CDO modeling.

It is immediate to check that under (46), the spread $\kappa_i$ of a risk-free CDS on name $i$ is given by

$$
\kappa_i = (1-R_i)\,\frac{\int_0^T \beta_t\,(a_i + b_i t)\,\exp\bigl(-a_i t - \tfrac{b_i}{2}t^2\bigr)\,dt}{\int_0^T \beta_t\,\exp\bigl(-a_i t - \tfrac{b_i}{2}t^2\bigr)\,dt}. \tag{47}
$$
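Formula (47) is a ratio of two one-dimensional integrals and is straightforward to evaluate numerically. The following sketch assumes a constant short rate $r$, so that $\beta_t = e^{-rt}$, and uses a composite Simpson rule. The helper names `simpson` and `fair_spread` are hypothetical.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def fair_spread(a, b, R, r, T):
    """Risk-free CDS spread (47) for the affine intensity q(t) = a + b t."""
    def survival(t):
        return math.exp(-a * t - 0.5 * b * t * t)

    def discount(t):
        return math.exp(-r * t)  # assumed constant short rate

    protection = simpson(lambda t: discount(t) * (a + b * t) * survival(t), 0.0, T)
    annuity = simpson(lambda t: discount(t) * survival(t), 0.0, T)
    return (1 - R) * protection / annuity


# Firm data of Table 1 (affine) and Table 3 (constant).
kappa_affine = fair_spread(a=0.0095, b=0.0010, R=0.4, r=0.05, T=10.0)
kappa_constant = fair_spread(a=0.0140, b=0.0, R=0.4, r=0.05, T=10.0)
```

With the firm parameters of Tables 1 and 3 both values come out close to the 84 bp reported there. In the constant case the ratio collapses to $(1-R_1)\,a_1$.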

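Also note that one has, by Proposition 3.1(v),

$$
\rho_d := \rho_d(T) = \frac{e^{a_3 T + b_3 T^2/2} - 1}{\sqrt{\bigl(e^{a_1 T + b_1 T^2/2} - 1\bigr)\bigl(e^{a_2 T + b_2 T^2/2} - 1\bigr)}}, \tag{48}
$$

or, equivalently,

$$
\alpha = \frac{\ln\Bigl(1 + \rho_d\sqrt{\bigl(e^{a_1 T + b_1 T^2/2} - 1\bigr)\bigl(e^{a_2 T + b_2 T^2/2} - 1\bigr)}\Bigr)}{a\,T + b\,T^2/2}, \tag{49}
$$

where $a = \min\{a_1, a_2\}$ and $b = \min\{b_1, b_2\}$.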

4.1.1 Calibration issues

Using (47), the $a_i$'s and $b_i$'s can be calibrated independently in a straightforward way to the market CDS curves of the firm and the counterpart, respectively. Note in this regard that market CDS curves can be considered as 'risk-free CDS curves'.

As for the model dependence parameter $\alpha$, in case the market price of an instrument sensitive to the dependence structure of default times (basket credit instrument on the firm and the counterpart) is available, one can use it to calibrate $\alpha$. Admittedly, however, this situation is the exception rather than the rule. It is thus important to devise a practical way of setting $\alpha$ in case such market data is not available. A possible procedure¹ thus consists in 'calibrating' $\alpha$ to a target value for the model probability $p_{1,2}(T) = \mathbb{P}(H^1_T = H^2_T = 1)$. A target value for $p_{1,2}(T)$ can be obtained by plugging a standard static Gaussian copula asset correlation $\rho$ into a bivariate normal distribution function, so

$$
p_{1,2}(T) = N_2^{\rho}\Bigl(N_1^{-1}\bigl(p_1(T)\bigr),\ N_1^{-1}\bigl(p_2(T)\bigr)\Bigr), \tag{50}
$$

where:

• $N_1$ denotes the standard Gaussian c.d.f.,

• $N_2^{\rho}$ denotes a bivariate centered Gaussian c.d.f. with unit variances and correlation coefficient $\rho$,

• $p_i(T) = \mathbb{P}(H^i_T = 1)$ for $i = 1, 2$.

Regulatory capital requirements being based on the Vasicek formula, such a static copula correlation $\rho$ can be retrieved from the Basel II correlations per asset class (cf. [1]).

¹ We thank J.-P. Lardy for suggesting this procedure.
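The calibration procedure described above can be sketched as follows in the constant-intensity case. This is an illustrative sketch rather than the authors' code: the bivariate Gaussian c.d.f. in (50) is evaluated by a one-dimensional Simpson quadrature of $\varphi(z)\,N_1\bigl((y - \rho z)/\sqrt{1-\rho^2}\bigr)$ over $z \le x$, and $a_3$ is then backed out from $p_{1,2}(T) = 1 - \mathbb{P}(\tau_1 > T) - \mathbb{P}(\tau_2 > T) + \mathbb{P}(\tau_1\wedge\tau_2 > T)$ using (24). All function names are hypothetical, and $|\rho| < 1$ is assumed.

```python
import math
from statistics import NormalDist

_N = NormalDist()  # standard Gaussian N_1


def bivariate_normal_cdf(x, y, rho, n=2000, lower=-10.0):
    """P(X <= x, Y <= y) for standard normals with correlation rho.

    Simpson quadrature of phi(z) * N_1((y - rho z) / sqrt(1 - rho^2))
    over z in [lower, x]; assumes |rho| < 1 and x > lower.
    """
    scale = math.sqrt(1 - rho * rho)

    def integrand(z):
        return _N.pdf(z) * _N.cdf((y - rho * z) / scale)

    h = (x - lower) / n  # n is even
    total = integrand(lower) + integrand(x)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(lower + k * h)
    return total * h / 3


def target_joint_default_probability(p1, p2, rho):
    """Target value (50) for p_{1,2}(T)."""
    return bivariate_normal_cdf(_N.inv_cdf(p1), _N.inv_cdf(p2), rho)


def calibrate_a3_constant(a1, a2, rho, T):
    """Back out a3 (constant intensities) from the target p_{1,2}(T)."""
    s1, s2 = math.exp(-a1 * T), math.exp(-a2 * T)
    p12 = target_joint_default_probability(1 - s1, 1 - s2, rho)
    s12 = p12 - 1 + s1 + s2          # P(tau1 ^ tau2 > T) = exp(-(a1+a2-a3) T)
    a3 = a1 + a2 + math.log(s12) / T
    return a3, p12


# First row of Table 4: a2 = .0083, rho = 10%, with a1 = .0140 and T = 10.
a3, p12 = calibrate_a3_constant(a1=0.0140, a2=0.0083, rho=0.10, T=10.0)
alpha = a3 / min(0.0140, 0.0083)
```

With the inputs of the first row of Table 4, the resulting `p12` and `alpha` can be compared with the values of $p_{1,2}$ and $\alpha$ reported in that row.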

4.1.2 Special case of constant intensities

We now look at a particular case in which $b_1 = b_2 = b_3 = 0$. This case will be referred to henceforth as the case of constant intensities, as opposed to the more general case of affine intensities introduced in Subsection 4.1. In the case of constant intensities, one has

$$
q_1(t) = a_1, \qquad q_2(t) = a_2, \qquad l_3(t) = a_3.
$$

The correlation coefficient $\rho_d$ in (48) simplifies to

$$
\rho_d = \frac{e^{a_3 T} - 1}{\sqrt{\bigl(e^{a_1 T} - 1\bigr)\bigl(e^{a_2 T} - 1\bigr)}},
$$

from which $a_3$ can be calculated as

$$
a_3 = \frac{1}{T}\,\ln\Bigl(1 + \rho_d\sqrt{\bigl(e^{a_1 T} - 1\bigr)\bigl(e^{a_2 T} - 1\bigr)}\Bigr).
$$

As is well known, the price of a risk-free CDS in a constant intensity model is null, i.e., $v(t) \equiv 0$ when $b_1 = 0$. So the EPE formula (37) simplifies to

$$
\mathrm{EPE}(t) = (1-R_1)(1-R_2)\,\frac{a_3}{a_2}\,e^{-(a_1 - a_3)\,t}.
$$

Also in this case, the pricing formula (29) for the risky CDS reduces to (assuming here $r(t) = r$)

$$
u(t) = -(1-R_1)(1-R_2)\,a_3\,\frac{1 - e^{-(r + a_1 + a_2 - a_3)(T-t)}}{r + a_1 + a_2 - a_3}.
$$

Finally, from Proposition 2.1, one gets

$$
\mathrm{CVA}(t) = -u(t).
$$

In particular, for low values of the coefficients,

$$
\mathrm{CVA}(0) \simeq (1-R_1)(1-R_2)\,a_3\,T = (1-R_1)(1-R_2)\,\ln\Bigl[1 + \rho_d\sqrt{\bigl(e^{a_1 T} - 1\bigr)\bigl(e^{a_2 T} - 1\bigr)}\Bigr],
$$

so, finally,

$$
\mathrm{CVA}(0) \simeq (1-R_1)(1-R_2)\,\sqrt{a_1 a_2}\;T\,\rho_d. \tag{51}
$$

4.2 Numerical Results

Our aim is to assess by means of numerical experiments the impact of $\rho$ (the asset correlation between the firm and the counterpart, cf. (50)) on the one hand, and of $\kappa_2$ (the risk-free CDS fair spread of the counterparty as of (47)) on the other hand, on the counterparty risk exposure of the investor. Towards this end we fix the general data of Table 1 (case with affine intensities) or Table 3 (case with constant intensities, all $b$'s equal to 0), and we further consider twelve alternative sets of values for $a_2$, $b_2$ and $\rho$ given in columns one, two and four of Table 2 (case with affine intensities), resp. for $a_2$ and $\rho$ given in columns one and three of Table 4 (case with constant intensities).

Table 1. Fixed Data: Affine Intensities.

r     R1     R2     T          a1       b1       κ1
5%    40%    40%    10 years   .0095    .0010    84 bp

In the case of affine intensities the corresponding spreads $\kappa_2$ at time 0, default correlation $\rho_d$, model dependence parameter $\alpha$ and joint default probabilities $p_{1,2} = \mathbb{P}(H^1_T = H^2_T = 1)$ are displayed respectively in the third, fifth, sixth and seventh columns of Table 2, whereas the last column of Table 2 (which will be commented on later in the text) gives the corresponding CVAs at time 0. The risky and risk-free CDS pricing functions $u$ and $v$ corresponding to each of our twelve sets of parameters are displayed in Figures 1 and 2. On each graph three curves are represented (see Remark 3.5):

• $v(t)$ (dashed blue curve),

• $u(t)$ with $\chi = v$ therein, denoted by $u^0(t)$ (dotted red curve),

• $u(t)$ with $\chi = u$ therein, denoted by $u^1(t)$ (black curve).

The analogous results in the case of constant intensities are displayed in Table 4 and Figures 3 and 4. Note that on each graph in Figures 3 and 4 the function $v$ is equal to 0, as it must be in the case of constant intensities.

In all cases $u^0$ and $u^1$ are rather close to each other, and one can check numerically that using either one makes little difference regarding the related EPEs and CVAs. We present henceforth the results for $u = u^0$.

Figures 5, 6 and 7 show the graphs of the Expected Positive Exposure as a function of time, of the Credit Valuation Adjustment as a function of time, and of the Credit Valuation Adjustment at time 0 as a function of $\rho$, in the cases of affine (left graphs) or constant (right graphs) intensities.

Table 2. Variable Data: Affine Intensities.

a2      b2      κ2       ρ     ρd      α       p1,2    CVA(0)
.0056   .0006   50 bp    10%   .0378   .0520   .0147   .0013
.0085   .0009   75 bp    10%   .0418   .0472   .0211   .0018
.0122   .0010   100 bp   10%   .0444   .0522   .0269   .0021
.0189   .0014   150 bp   10%   .0476   .0702   .0376   .0028
.0056   .0006   50 bp    40%   .1859   .2531   .0286   .0056
.0085   .0009   75 bp    40%   .1998   .2230   .0388   .0074
.0122   .0010   100 bp   40%   .2074   .2406   .0472   .0087
.0189   .0014   150 bp   40%   .2145   .3107   .0616   .0110
.0056   .0006   50 bp    70%   .4020   .5406   .0489   .0119
.0085   .0009   75 bp    70%   .4256   .4673   .0640   .0153
.0122   .0010   100 bp   70%   .4336   .4937   .0754   .0178
.0189   .0014   150 bp   70%   .4306   .6100   .0925   .0214

Table 3. Fixed Data: Constant Intensities.

r     R1     R2     T          a1       κ1
5%    40%    40%    10 years   .0140    84 bp

Table 4. Variable Data: Constant Intensities.

a2      κ2       ρ     ρd      α       p1,2    CVA(0)
.0083   50 bp    10%   .0372   .0510   .0138   .0011
.0125   75 bp    10%   .0411   .0464   .0198   .0015
.0167   100 bp   10%   .0438   .0515   .0254   .0018
.0250   150 bp   10%   .0470   .0690   .0355   .0023
.0083   50 bp    40%   .1839   .2501   .0272   .0054
.0125   75 bp    40%   .1977   .2207   .0368   .0070
.0167   100 bp   40%   .2056   .2387   .0451   .0084
.0250   150 bp   40%   .2128   .3073   .0587   .0104
.0083   50 bp    70%   .3998   .5372   .0469   .0117
.0125   75 bp    70%   .4231   .4650   .0613   .0150
.0167   100 bp   70%   .4315   .4921   .0726   .0175
.0250   150 bp   70%   .4288   .6063   .0889   .0210

Figure 1. Pricing functions in the case of affine intensities: $v(t)$ (dashed curve), $u^0(t)$ (dotted curve) and $u^1(t)$ (line).

One can see in Figure 5 the impact on the counterparty risk exposure of the investor of the default risk (as measured by the risk-free spread $\kappa_2$) of the counterpart. On each graph the asset correlation $\rho$ is fixed, with, from top to bottom, $\rho$ = 10%, 40% and 70%. The four curves on each graph of Figure 5 correspond to EPE($t$) for $\kappa_2$ = 50, 75, 100 and 150 bp. Observe that as $\kappa_2$ decreases the counterparty risk exposure increases. This is in line with the stylized features and the financial intuition regarding the EPE: EPE($t$) is the expectation of the investor's loss, given the default of the counterpart at time $t$. A default of a counterpart with a lower spread is interpreted by the markets as worse news than a default of a counterpart with a higher spread. The related EPE is thus larger.

Figure 2. Pricing functions in the case of affine intensities: $v(t)$ (dashed curve), $u^0(t)$ (dotted curve) and $u^1(t)$ (black curve).

Figure 6 shows the graphs of the Credit Valuation Adjustment as a function of time, for affine (left column) or constant (right column) intensities. One can thus see the impact of $\kappa_2$ on the CVA. In each graph the asset correlation $\rho$ is fixed, with, from top to bottom, $\rho$ = 10%, 40% and 70%. The four curves on each graph of Figure 6 correspond to CVA($t$) for $\kappa_2$ = 50, 75, 100 and 150 bp. Observe that, as opposed to the EPE, the CVA is increasing in $\kappa_2$, in line with stylized features. Also note that the CVA is a decreasing function of time, in accordance again with expected features: less time to maturity, less risk.

Figure 3. Pricing functions in the case of constant intensities: $v(t)$ (dashed curve), $u^0(t)$ (dotted curve) and $u^1(t)$ (line).

Finally, Figure 7 represents the graphs of CVA(0) as a function of the asset correlation $\rho$ for $\kappa_2$ = 50, 75, 100 and 150 bp. Note for comparison that CVA(0) grows essentially linearly in the default correlation $\rho_d$, at least in the case of constant coefficients (cf. formula (51)).

Page 131: Financial Engineering

May 3, 2010 13:51 Proceedings Trim Size: 9in x 6in 004

119

Figure 4. Pricing functions in the case of constant intensities — v(t) (dashedcurve),u0(t) (dotted curve) andu1(t) (line).

5. Concluding Remarks and PerspectivesIn this article we propose a model of CDS with counterparty credit risk, with

the following desirable properties:

• Adequacy of the behavior of EPE and CVA in the model with expectedfeatures (see Section 4.2),

• Wrong way risk (via joint defaults, specifically),

• Simplicity, since the model is a four-state Markov chain of two creditnames, with one-name marginals automatically calibrated to the individualCDS curves,

Page 132: Financial Engineering

May 3, 2010 13:51 Proceedings Trim Size: 9in x 6in 004

120

Figure 5. EPE(t) (χ = v, u = u0). In each graphρ is fixed. From top to downρ = 10%,ρ = 40% andρ = 70%. Left column: affine intensities. Right column:constant intensities.

Page 133: Financial Engineering

May 3, 2010 13:51 Proceedings Trim Size: 9in x 6in 004

121

Figure 6. CVA(t) (χ = v, u = u0). In each graphρ is fixed. From top to downρ = 10%,ρ = 40% andρ = 70%. Left column: affine intensities. Right column:constant intensities.

Page 134: Financial Engineering

May 3, 2010 13:51 Proceedings Trim Size: 9in x 6in 004

122

Figure 7. CVA(0) as a function ofρ for κ2 = 50 bp, 75 bp, 100 bp and 150 bp(χ = v, u = u0). Left: Affine intensities. Right: Constant intensities.

• Fact, related to the previous one, that the model ‘takes the right inputs togenerate the right outputs’, namely it takes as basic inputs the individualdefault probabilities (individual CDS curves), which correspond to the morereliable information on the market, which are then ‘coupled’ in a suitableway,

• Consistency, in the sense that it is a dynamic model with replication-based valuation and hedging arguments.

The present work might be extended in at least three directions. First, it would be desirable to add credit spread volatility into the model. This could be achieved by adding a reference filtration F so that the model filtration is given as F ∨ H, with the intensities l, q being non-negative F-adapted processes. A second related issue is that of merging the CDS-CVA pricing tool of this paper into a more general, real-life CVA engine, including the following features:

• Netting, that is, aggregation in a suitable way of all the contracts (as opposed to only one CDS in this paper) relative to a given counterpart,

• Market (other than credit) risk factors,

• Margin agreements.

Finally, at the stage of implementation (see, e.g., Zhu and Pykhtin [26]), such real-life CVA engines pose interesting challenges from the numerical point of view of Monte Carlo simulations.


Appendix. Proof of Proposition 3.1

We shall need the following (essentially classic) Lemma.

Lemma 5.1. Let $X$ be a right-continuous process with a finite state space $E$ and adapted to some filtration $\mathbb{F}$. Each of conditions (i), (ii) and (iii) below is necessary and sufficient for $X$ to be an $\mathbb{F}$-Markov chain with infinitesimal generator $A(t) = A_t = [A_t^{i,j}]_{i,j \in E}$:

(i) For every function $h$ over $E$,

\[ M_t^h = h(X_t) - \int_0^t (A_s h)(X_s)\,ds \tag{52} \]

is an $\mathbb{F}$-local martingale;

(ii) For every $j \in E$, the process $M^j$ defined by

\[ M_t^j = \mathbf{1}_{X_t = j} - \int_0^t A_s^{X_s, j}\,ds \]

is an $\mathbb{F}$-local martingale;

(iii) For every $i, j \in E$, the process $M^{i,j}$ given by

\[ M_t^{i,j} = \mathbf{1}_{X_{t-} = i,\, X_t = j} - \int_0^t \mathbf{1}_{X_s = i}\, A_s^{i,j}\,ds \]

is an $\mathbb{F}$-local martingale.

Proof. (i) is the usual local martingale characterization of Markov chains (see, e.g., Proposition 11.2.2 in Bielecki and Rutkowski [4]).

(ii) Since $E$ is finite, the set of the indicator functions $\mathbf{1}_{\cdot = j}$ spans linearly the set of all functions over $E$. The condition of part (ii) is thus equivalent to that of (i).

(iii) Necessity follows by combination of Proposition 11.2.2 and Lemma 11.2.3 in [4]. As for sufficiency, note that the $M^{i,j}$'s being $\mathbb{F}$-local martingales implies the same property for the $M^j$'s in (ii), by summation over $i$. We thus conclude by the sufficiency in part (ii).

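Let us proceed with the proof of Proposition 3.1. First, note that the processes $H^\iota$ can also be written as

\[ H_t^{\{1\}} = \sum_{0 < s \le t} \mathbf{1}_{\Delta H_s = (1,0)}, \qquad H_t^{\{2\}} = \sum_{0 < s \le t} \mathbf{1}_{\Delta H_s = (0,1)}, \qquad H_t^{\{1,2\}} = \sum_{0 < s \le t} \mathbf{1}_{\Delta H_s = (1,1)}. \]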

(i) Let us verify that the $M^\iota$'s in (20) are $\mathbb{H}$-local martingales. As bounded $\mathbb{H}$-local martingales, $M^{\{1\}}$, $M^{\{2\}}$ and $M^{\{1,2\}}$ will thus be $\mathbb{H}$-martingales. For $\iota = \{1,2\}$, one has

\[
\begin{aligned}
M_t^{\{1,2\}} &= H_t^{\{1,2\}} - \int_0^t q_{\{1,2\}}(s, H_s)\,ds \\
&= \sum_{0 < s \le t} \mathbf{1}_{\Delta H_s = (1,1)} - \int_0^t \mathbf{1}_{H_s = (0,0)}\, l_3(s)\,ds \\
&= \sum_{0 < s \le t} \mathbf{1}_{H_{s-} = (0,0),\, H_s = (1,1)} - \int_0^t \mathbf{1}_{H_s = (0,0)}\, l_3(s)\,ds .
\end{aligned}
\]

Thus Lemma 5.1 with $i = (0,0)$ and $j = (1,1)$ implies the local martingale property of $M^{\{1,2\}}$.

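For $M^{\{1\}}$, one has

\[
\begin{aligned}
M_t^{\{1\}} &= H_t^{\{1\}} - \int_0^t q_{\{1\}}(s, H_s)\,ds \\
&= \sum_{0 < s \le t} \mathbf{1}_{\Delta H_s = (1,0)} - \int_0^t \mathbf{1}_{H_s^1 = 0} \left[ \mathbf{1}_{H_s^2 = 0}\, l_1(s) + \mathbf{1}_{H_s^2 = 1}\, q_1(s) \right] ds \\
&= \left[ \sum_{0 < s \le t} \mathbf{1}_{H_{s-} = (0,0),\, H_s = (1,0)} - \int_0^t \mathbf{1}_{H_s = (0,0)}\, l_1(s)\,ds \right] \\
&\quad + \left[ \sum_{0 < s \le t} \mathbf{1}_{H_{s-} = (0,1),\, H_s = (1,1)} - \int_0^t \mathbf{1}_{H_s = (0,1)}\, q_1(s)\,ds \right].
\end{aligned}
\]

Now we apply Lemma 5.1 to the two terms in the last equation, with $i = (0,0)$ and $j = (1,0)$ for the first term and $i = (0,1)$ and $j = (1,1)$ for the second term. Thus $M^{\{1\}}$, being the sum of two $\mathbb{H}$-local martingales, is an $\mathbb{H}$-local martingale. In the same way, $M^{\{2\}}$ is an $\mathbb{H}$-local martingale. As bounded $\mathbb{H}$-local martingales, $M^{\{1\}}$, $M^{\{2\}}$ and $M^{\{1,2\}}$ are thus $\mathbb{H}$-martingales.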

(ii) As $q_i = l_i + l_3$ and $H^i = H^{\{i\}} + H^{\{1,2\}}$, one has $M^i = M^{\{i\}} + M^{\{1,2\}}$, so the $M^i$'s are in turn $\mathbb{H}$-martingales.

(iii) Since the $M^i$'s are $\mathbb{H}$-martingales, this follows easily from the sufficiency in Lemma 5.1(ii).

(iv) Formulas (24) follow directly from (23), in which we shall now show the first identity. One has for $t > s$ (see the end of the proof of Proposition 3.3),

\[ P(\tau_2 > t \mid \mathcal{H}_s) = P(\tau_2 > t \mid H_s^2) = (1 - H_s^2)\, e^{-\int_s^t q_2(u)\,du} . \]

Thus

\[
\begin{aligned}
P(\tau_1 > s, \tau_2 > t) &= E\left( \mathbf{1}_{\tau_1 > s}\, E(\mathbf{1}_{\tau_2 > t} \mid \mathcal{H}_s) \right) \\
&= E\left[ (1 - H_s^1)(1 - H_s^2)\, e^{-\int_s^t q_2(u)\,du} \right],
\end{aligned}
\]

and the result follows.

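(v) Since $H_t^i$ is a Bernoulli random variable with (cf. Proposition 3.1(iv))

\[ P(H_t^i = 1) = P(\tau_i \le t) = 1 - \exp\left( -\int_0^t q_i(s)\,ds \right) := p_i(t), \]

one has

\[ \mathrm{Var}(H_t^i) = p_i(t)\,(1 - p_i(t)) . \]

Also

\[
\begin{aligned}
\mathrm{Cov}(H_t^1, H_t^2) &= \mathrm{Cov}(1 - H_t^1, 1 - H_t^2) \\
&= E\left[ (1 - H_t^1)(1 - H_t^2) \right] - E(1 - H_t^1)\,E(1 - H_t^2) \\
&= P(\tau_1 > t, \tau_2 > t) - P(\tau_1 > t)\,P(\tau_2 > t) \\
&= \exp\left( -\int_0^t l(s)\,ds \right) - \exp\left( -\int_0^t q_1(s)\,ds \right) \exp\left( -\int_0^t q_2(s)\,ds \right).
\end{aligned}
\]

Thus, after some algebraic simplifications,

\[
\rho_d(t) = \frac{\mathrm{Cov}(H_t^1, H_t^2)}{\sqrt{\mathrm{Var}(H_t^1)\,\mathrm{Var}(H_t^2)}}
= \frac{\exp\left( \int_0^t l_3(s)\,ds \right) - 1}{\sqrt{\left( \exp\left( \int_0^t q_1(s)\,ds \right) - 1 \right)\left( \exp\left( \int_0^t q_2(s)\,ds \right) - 1 \right)}} .
\]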
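The "algebraic simplifications" leading to the closed form for $\rho_d(t)$ can be checked numerically. The following is a minimal sketch for constant intensities, under the assumption (not restated in this appendix) that $l = l_1 + l_2 + l_3$; the function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math

def default_correlation(l1, l2, l3, t):
    """Closed-form rho_d(t) for constant intensities l1, l2, l3."""
    q1, q2 = l1 + l3, l2 + l3              # marginal intensities q_i = l_i + l_3
    num = math.expm1(l3 * t)               # exp(l3 * t) - 1
    den = math.sqrt(math.expm1(q1 * t) * math.expm1(q2 * t))
    return num / den

def default_correlation_direct(l1, l2, l3, t):
    """Same quantity computed as Cov / sqrt(Var * Var)."""
    q1, q2 = l1 + l3, l2 + l3
    l = l1 + l2 + l3                       # assumption: l is the sum of the three intensities
    p1 = 1.0 - math.exp(-q1 * t)           # P(tau_1 <= t)
    p2 = 1.0 - math.exp(-q2 * t)           # P(tau_2 <= t)
    cov = math.exp(-l * t) - math.exp(-q1 * t) * math.exp(-q2 * t)
    return cov / math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))

# Hypothetical inputs: intensities per year and a 5-year horizon.
rho_closed = default_correlation(0.01, 0.02, 0.005, 5.0)
rho_direct = default_correlation_direct(0.01, 0.02, 0.005, 5.0)
```

Both routes should agree up to floating-point error, which is a quick way to catch a sign or index slip when re-deriving the formula.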

References

1. Basel Committee on Banking Supervision: International Convergence of Capital Measurement and Capital Standards, Bank for International Settlements, June 2006.
2. Bielecki, T. R., Vidozzi, A. and Vidozzi, L.: A Markov Copulae Approach to Pricing and Hedging of Credit Index Derivatives and Ratings Triggered Step-Up Bonds, J. of Credit Risk, vol. 4, num. 1, 2008.
3. Bielecki, T. R., Vidozzi, A., Vidozzi, L. and Jakubowski, J.: Study of Dependence for Some Stochastic Processes, Journal of Stochastic Analysis and Applications, Volume 26, Issue 4, July 2008, pages 903–924.
4. Bielecki, T. R. and Rutkowski, M.: Credit Risk: Modeling, Valuation and Hedging. Springer-Verlag, Berlin, 2002.
5. Blanchet-Scalliet, Ch. and Patras, F.: Counterparty Risk Valuation for CDS, Working Paper, 2008.
6. Brigo, D. and Capponi, A.: Bilateral counterparty risk valuation with stochastic dynamical models and application to Credit Default Swaps, Working Paper, 2008.
7. Brigo, D. and Chourdakis, K.: Counterparty Risk for Credit Default Swaps: Impact of spread volatility and default correlation, International Journal of Theoretical and Applied Finance, Vol. 12(07), pages 1007–1026, 2009.
8. Brigo, D. and Masetti, M.: Risk Neutral Pricing of Counterparty Risk. In Counterparty Credit Risk Modeling: Risk Management, Pricing and Regulation, ed. Pykhtin, M., Risk Books, London, 2006.
9. Brigo, D. and Pallavicini, A.: Counterparty Risk under Correlation between Default and Interest Rates, Numerical Methods for Finance, Miller, J., Edelman, D., and Appleby, J. (Editors), Chapman Hall, 2007.
10. Brigo, D. and Pallavicini, A.: Counterparty Risk and Contingent CDS under correlation, Risk Magazine, 2008.
11. Brigo, D. and Tarenghi, M.: Credit Default Swap Calibration and Counterparty Risk Valuation with a Scenario based First Passage Model, Working Paper, 2005.
12. Brigo, D. and Tarenghi, M.: Credit Default Swap Calibration and Equity Swap Valuation under Counterparty Risk with a Tractable Structural Model, Proceedings of the FEA 2004, Conference at MIT, Cambridge, Massachusetts, November 8–10.
13. Canabarro, E. and Duffie, D.: Measuring and marking counterparty risk, Chapter 9 of Asset/Liability Management of Financial Institutions, Euromoney Books, 2003.
14. Canabarro, E., Picoult, E. and Wilde, T.: Analysing counterparty risk, Risk Magazine, 16:9 (September 2003), pp. 117–122.
15. Delbaen, F. and Schachermayer, W.: The mathematics of arbitrage. Springer Finance, 2006.
16. Embrechts, P., Lindskog, F. and McNeil, A. J.: Modelling dependence with copulas and applications to risk management. In Handbook of heavy tailed distributions in finance, edited by Rachev, S. T., published by Elsevier/North-Holland, 2003.
17. Huge, B. and Lando, D.: Swap Pricing with Two-Sided Default Risk in a Rating-Based Model, European Finance Review, 1999, 3, pp. 239–68.
18. Hull, J. and White, A.: Valuing Credit Default Swaps II: Modeling Default Correlation, The Journal of Derivatives, Vol. 8, No. 3 (2001), pp. 12–22.
19. Jarrow, R. and Yu, F.: Counterparty risk and the pricing of defaultable securities, Journal of Finance, Vol. 56 (2001), pp. 1765–1799.
20. Lardy, J. P.: Counterpart Risk on CDS: A composite Spread Method example, CRIS Seminar, 2008.
21. Leung, S. Y. and Kwok, Y. K.: Credit Default Swap Valuation with Counterparty Risk, Kyoto Economic Review, 74(1): 25–45, 2005.
22. Lipton, A. and Sepp, A.: Counterparty Risk in the Extended Structural Default Model, The Journal of Credit Risk, Vol 5, Num 2, 123–146, Summer 2009.
23. Marshall, A. W. and Olkin, I.: A multivariate exponential distribution, J. Amer. Statist. Assoc., 2, 84–98, 1967.
24. Redon, C.: Wrong way risk modelling, Risk Magazine, April 2006.
25. Rogers, L. C. G. and Williams, D.: Diffusions, Markov Processes and Martingales. 2nd edition. Cambridge University Press, 2000.
26. Zhu, S. and Pykhtin, M.: A Guide to Modeling Counterparty Credit Risk, GARP Risk Review, July/August 2007.


May 3, 2010 14:33 Proceedings Trim Size: 9in x 6in 005

Portfolio Efficiency Under Heterogeneous Beliefs∗

Xue-Zhong He† and Lei Shi

School of Finance and Economics, University of Technology, Sydney, PO Box 123 Broadway NSW 2007, Australia

E-mail: [email protected] and [email protected]

In the standard mean variance (MV) capital asset pricing model (CAPM) with homogeneous beliefs, the optimal portfolios of investors are MV efficient. It is expected that this is no longer true in general when investors have heterogeneous beliefs in the means and variances/covariances of asset returns. This paper extends the standard Black’s zero-beta CAPM to incorporate heterogeneous beliefs and verifies that the subjectively optimal portfolios of heterogeneous investors are MV inefficient in general. The paper then demonstrates that the traditional geometric relation of the mean variance frontiers with and without the riskless asset under homogeneous beliefs does not hold in general under heterogeneous beliefs. The paper further examines the impact of biased beliefs among investors on the MV efficiency of their optimal portfolios. The results provide some explanations of the risk premium puzzle, Miller’s hypothesis, and the under-performance of managed funds.

Keywords: Asset prices, mean-variance efficiency, heterogeneous beliefs, zero-beta CAPM.

∗We are grateful to an anonymous referee, to seminar participants at Peking University and the National Chengchi University, and to participants at the 14th International Conference on Computing in Economics and Finance (Paris, June 2008) and the 2009 Asian FA conference (Brisbane) for helpful comments. Financial support from the Australian Research Council (ARC) under Discovery Grant (DP0773776) and the Faculty Research Grant at UTS is gratefully acknowledged.
†Corresponding author.

1. Introduction

The Capital Asset Pricing Model (CAPM) developed by Sharpe [28], Lintner [22] and Mossin [25] is perhaps the most influential equilibrium model in modern finance. It is based on the assumptions that investors have homogeneous beliefs in the means and variances/covariances of risky assets and that there is unrestricted borrowing and lending of a risk-free asset. To relax these unrealistic assumptions, Lintner [23] extends the CAPM by incorporating heterogeneous beliefs among investors. To provide a theoretical explanation of the early empirical tests of the CAPM, Black [2] removes the risk-free asset and develops the well-known zero-beta CAPM. Since then, equilibrium models have been developed in the literature to examine the impact of heterogeneity amongst investors on market equilibrium¹. Assuming that investors are boundedly rational, heterogeneity may be caused by difference in information or difference in opinion².

In the mean-variance (MV) literature, the impact of heterogeneous beliefs is mostly studied for the case of a portfolio of one risky asset and one risk-free asset. Lintner [23] is the first to consider the CAPM with heterogeneous beliefs and without a risk-free asset, and shows that heterogeneity does not change the structure of capital asset prices in any significant way, and that removing the risk-free asset is a mere extension of the case with a risk-free asset. Surprisingly, this significant contribution from Lintner did not receive much attention until recent years³. The main obstacle in dealing with heterogeneity is the complexity and heavy notation involved when the number of assets and the dimension of the heterogeneity increase. It may be this notational obstacle that makes the paper of Lintner hard to follow and renders the analysis of the impact of heterogeneity on the market equilibrium prices rather complicated. Recently, Sun and Yang [30] provide conditions for the existence of the market equilibrium and show that the zero-beta CAPM still holds under heterogeneous beliefs within the MV framework. However, they do not provide the market equilibrium price or examine the impact of heterogeneity on the market equilibrium price, including MV efficiency of the optimal portfolios of heterogeneous investors. When investors have heterogeneous beliefs in the means and variances/covariances of asset returns, it is expected in general that the subjectively optimal portfolios are no longer MV efficient. If we treat managed funds as subjectively optimal portfolios, the MV inefficiency would imply the under-performance of the managed funds. This paper is devoted to presenting an explicit equilibrium price formula, and to examining the impacts of the heterogeneous beliefs on the MV efficiency of the optimal portfolios and on the market equilibrium in general.

¹Some have considered the problem in discrete time (for example, see Lintner [23], Rubinstein [27], Fan [13], Sun and Yang [30], Chiarella et al. [7] and Sharpe [29]) and others in continuous time (for example, see Williams [32], Detemple and Murthy [10] and Zapatero [33]), and more recently Jouini and Napp ([19], [18], [20]), Hara [14] and Brown and Rogers [5]. Some models are in the MV framework (see Lintner [23], Williams [32] and Sun and Yang [30]), others are in the Arrow-Debreu contingent claims economy (see, for example, Rubinstein [27] and Abel [1]).

²In the first case, investors may update their beliefs as new information becomes available; a Bayesian updating rule is often used (see, for example, Williams [32] and Zapatero [33]). In the second case, investors agree to disagree and may revise their portfolio strategies as their views of the market change over time (see, for example, Lintner [23], Rubinstein [26] and Brown and Rogers [5]). For a discussion of the difference between the two cases, we refer the reader to a survey paper by Kurz [21].

³See, for example, Wenzelburger [31], Bohm and Chiarella [3], Bohm and Wenzelburger [4], and Chiarella et al. ([6], [7], [8], [9]).

In their recent work, Chiarella et al. [7] introduce a concept of consensus belief and show that, when there is a riskless asset, the market consensus belief can be constructed explicitly as a weighted average of the heterogeneous beliefs. They show that the market equilibrium price is a weighted average of the equilibrium prices perceived by each investor. They also establish a CAPM-like relation under heterogeneous beliefs. In this paper, we first extend their analysis to the case when there is no riskless asset and obtain a zero-beta CAPM-like relation under heterogeneous beliefs. It is well known that the geometric tangency relation of traditional portfolio theory plays a very important role in the establishment of the CAPM⁴. We demonstrate that this geometric relationship does not hold under heterogeneous beliefs.

The current paper is related to Jouini and Napp ([19], [18]), who investigate the impact of belief heterogeneity on the consumption CAPM and the risk-free rate by constructing a consensus belief and consumer. They show how pessimism and doubt at the aggregate level result from pessimism and doubt at the individual level. The construction of the consensus belief in this paper shares some similarity (in a much simpler and more explicit way within the MV framework) with that in Jouini and Napp; however, our focus is on the portfolio analysis and MV efficiency of the subjectively optimal portfolios, rather than on the risk premium. In other words, the focus of Jouini and Napp is on the impact of the aggregation of heterogeneous beliefs on the market, while we focus on the impact of the aggregation on the MV efficiency of individuals’ optimal portfolios. Also, we compare the market MV frontiers with and without a riskless asset and focus on the impact of the heterogeneous beliefs on the geometric relation of the frontiers. Interestingly, a similar result on the MV efficiency of the optimal portfolio under heterogeneous beliefs is found in Easley and O’Hara [12], where the heterogeneous beliefs are due to information asymmetry⁵. With a rational expectations equilibrium model, they show that the average market portfolio is MV efficient, but not necessarily the portfolios of investors with different information. In our setup, investors are boundedly rational and the market consensus belief is endogenously determined by all market participants, and we show that the market portfolio is always MV efficient (by the construction of the consensus belief) and the subjectively optimal portfolios of investors are inefficient in general.

⁴The market portfolio remains the same and MV efficient with or without the existence of a riskless security.

⁵The heterogeneity can be due to either asymmetric information or different interpretations of the same information among investors in general. In the first case, certain structures on information and learning (such as Bayesian updating and learning) are imposed, while in the second case, the heterogeneous beliefs are associated with certain trading strategies used in financial markets (such as the momentum and contrarian strategies).

The paper is structured as follows. In Section 2, we introduce and construct the market consensus belief linking the heterogeneous market with an equivalent homogeneous market, and present an explicit market equilibrium price formula. Consequently, a zero-beta CAPM under the heterogeneous beliefs is derived. In Section 3, we examine the impacts of different aspects of heterogeneous beliefs on the market equilibrium. Through some numerical examples, Section 4 examines the implications of heterogeneity for the MV efficiency of the optimal portfolios of heterogeneous investors and the geometric tangency relation of the portfolios with and without a riskless asset. Section 5 extends the numerical analysis to a market with many investors and examines the impact of heterogeneity on the MV efficiency of optimal portfolios and on the market when the belief dispersions are characterized by mean-preserving spreads. Section 6 summarizes and concludes the paper. The proofs and details of a numerical example are provided in the appendices. An earlier version considering the beliefs in both payoff and return setups can be found in He and Shi [16].

2. MV Equilibrium Asset Prices Under Heterogeneous Beliefs

When a financial market consists of investors with different views on the future movement of the market, it is important to understand how market equilibrium is obtained and the roles played by different investors. Within the standard MV framework, in this section, we first introduce heterogeneous beliefs among investors and a concept of market consensus belief to reflect the market belief when the market is in equilibrium. By constructing the consensus belief explicitly, we characterize the equilibrium asset prices. Consequently we obtain a zero-beta CAPM-like relation under heterogeneous beliefs.

2.1 Heterogeneous Beliefs

Following Lintner [23] and Black [2], we extend the static MV model with homogeneous belief and consider a market in which there are many risky assets but there is no risk-free asset and investors have heterogeneous beliefs of the future returns of risky assets. Similar to Chiarella et al. [7], asset returns are measured in the payoff in capital.

Consider a market with $N$ risky assets, indexed by $j = 1, 2, \dots, N$, and $I$ investors, indexed by $i = 1, 2, \dots, I$. Let $x = (x_1, \dots, x_N)^T$ be the random payoff vector of the risky assets. Assume that each investor has his/her own set of beliefs about the market in terms of means, variances and covariances of the payoffs of the assets, denoted by

\[ y_{i,j} = E_i[x_j], \qquad \sigma_{i,jk} = \mathrm{Cov}_i(x_j, x_k) \qquad \text{for } 1 \le i \le I,\ 1 \le j, k \le N. \tag{1} \]

For investor $i$, we define the mean vector and variance/covariance matrix of the payoffs of the $N$ assets as follows: $y_i = E_i(x) = (y_{i,1}, y_{i,2}, \dots, y_{i,N})^T$ and $\Omega_i = (\sigma_{i,jk})_{N \times N}$, which is positive definite. Denote by $B_i = (E_i(x), \Omega_i)$ the set of subjective beliefs of investor $i$. Let $z_i = (z_{i,1}, z_{i,2}, \dots, z_{i,N})^T$ be the portfolio in the risky assets (in quantity) and $W_{i,0}$ be the initial wealth of investor $i$. Then the end-of-period portfolio wealth of investor $i$ is given by $W_i = x^T z_i$. Under the belief $B_i$, the mean and variance of the portfolio wealth $W_i$ of investor $i$ are given, respectively, by

\[ E_i(W_i) = y_i^T z_i, \qquad \sigma_i^2(W_i) = z_i^T \Omega_i z_i. \tag{2} \]

As in the standard MV framework, we assume that investor $i$ has a constant absolute risk aversion (CARA) utility function $U_i(w) = -e^{-\theta_i w}$, where $\theta_i$ is the CARA coefficient, and that the end-of-period wealth $W_i$ of investor $i$ is normally distributed. Under these assumptions, maximizing investor $i$'s expected utility of wealth is equivalent to maximizing his/her certainty equivalent end-of-period wealth, $\max_{z_i} Q_i(z_i)$, subject to the wealth constraint

\[ p_0^T z_i = W_{i,0}, \tag{3} \]

where

\[ Q_i(z_i) := E_i(W_i) - \frac{\theta_i}{2}\,\sigma_i^2(W_i) = y_i^T z_i - \frac{\theta_i}{2}\, z_i^T \Omega_i z_i \]

and $p_0$ is the market price vector of the risky assets. Applying the first order conditions, we obtain the following lemma on the optimal portfolio of the investor.

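Lemma 2.1. For a given market price vector $p_0$ of risky assets, the optimal risky portfolio $z_i^*$ of investor $i$ is uniquely determined by

\[ z_i^* = \theta_i^{-1} \Omega_i^{-1} \left[ y_i - \lambda_i^* p_0 \right], \tag{4} \]

where

\[ \lambda_i^* = \frac{p_0^T \Omega_i^{-1} y_i - \theta_i W_{i,0}}{p_0^T \Omega_i^{-1} p_0}. \tag{5} \]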
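Equations (4) and (5) can be illustrated with a small numerical sketch. The two-asset inputs below are hypothetical, and the helper names (`inv2`, `matvec`, `dot`, `optimal_portfolio`) are introduced here only for illustration; the matrix inverse is hard-coded for the 2x2 case to keep the example free of external libraries.

```python
def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    """Inner product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def optimal_portfolio(y, omega, theta, w0, p0):
    """Return (z_star, lambda_star) following equations (4) and (5)."""
    omega_inv = inv2(omega)
    oy = matvec(omega_inv, y)     # Omega_i^{-1} y_i
    op = matvec(omega_inv, p0)    # Omega_i^{-1} p_0
    lam = (dot(p0, oy) - theta * w0) / dot(p0, op)        # equation (5)
    z = [(a - lam * b) / theta for a, b in zip(oy, op)]   # equation (4)
    return z, lam

# Hypothetical beliefs, risk aversion, wealth and prices for one investor.
y = [6.0, 9.0]
omega = [[0.6, 0.1], [0.1, 0.9]]
theta, w0, p0 = 2.0, 10.0, [5.0, 8.0]
z_star, lam_star = optimal_portfolio(y, omega, theta, w0, p0)
```

With these inputs the portfolio should exhaust the budget (3), and the gradient of $Q_i$ at `z_star` should be proportional to `p0` with factor `lam_star`, which is the first order condition behind the lemma.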

Lemma 2.1 implies that the optimal demand of investor $i$ depends on his/her absolute risk aversion (ARA) coefficient ($\theta_i$), the expected payoffs and variance/covariance matrix of the risky asset payoffs, the Lagrange multiplier ($\lambda_i^*$), as well as the market price of the risky assets. Following Lintner [23], $\lambda_i^*$ is a shadow price, measuring the marginal real (riskless) certainty-equivalent of investor $i$'s end-of-period wealth. In fact, applying the first order condition, we obtain $\partial Q_i(z_i^*)/\partial z_i = \lambda_i^* p_0$, which leads to

\[ \lambda_i^* = \frac{1}{p_{j,0}} \frac{\partial Q_i(z_i^*)}{\partial z_{i,j}} \qquad \text{for all } j = 1, 2, \dots, N. \tag{6} \]


More precisely, equation (6) indicates that $\lambda_i^*$ actually measures investor $i$'s optimal marginal certainty equivalent end-of-period wealth per unit of asset $j$ relative to its market price, and it is constant across all assets. In general, the shadow price is not necessarily the same for all investors; however, it becomes the same when there exists a risk-free asset in the market. In fact, let the current price of the risk-free asset $f$ be 1 and its payoff be $R_f = 1 + r_f$. Applying (6) to the risk-free asset leads to $\lambda_i^* = R_f$ for all investors, that is, the shadow price is equal to the payoff of the risk-free asset.

2.2 Consensus Belief and Equilibrium Asset Prices

We define the market equilibrium asset price vector $p_0$ of the risky assets as the price vector under which the individuals' optimal demands (4) satisfy the market aggregation condition

\[ \sum_{i=1}^{I} z_i^* = \sum_{i=1}^{I} \bar{z}_i := z_m, \tag{7} \]

where $\bar{z}_i$ is the endowment portfolio of investor $i$. Correspondingly, $z_m$ is the market portfolio of the risky assets. It then follows from (7) and (4) that the market equilibrium price $p_0$ is given, in terms of the heterogeneous beliefs of the investors, by

\[ p_0 = \left( \sum_{i=1}^{I} \theta_i^{-1} \lambda_i^* \Omega_i^{-1} \right)^{-1} \left[ \left( \sum_{i=1}^{I} \theta_i^{-1} \Omega_i^{-1} y_i \right) - z_m \right]. \tag{8} \]

This expression defines the market equilibrium price $p_0$ implicitly since $\lambda_i^*$ depends on $p_0$ as well. For the existence of the market equilibrium price in general, we refer to Sun and Yang [30] and the references cited there. The concept of consensus belief has been used to characterize the market when investors are heterogeneous in different contexts (such as Jouini and Napp ([19], [18]) and Chiarella et al. [7]). It is closely related to but significantly different from the concept of representative investor in the classical finance literature. It is endogenously determined through the market aggregation and reflects a weighted average of heterogeneous beliefs. We now introduce the concept of consensus belief for the market with the heterogeneous beliefs.

Definition 2.1. A belief $B_a = (E_a(x), \Omega_a)$, defined by the expected payoff of the risky assets $E_a(x)$ and the covariance matrix of the risky asset payoffs $\Omega_a$, is called a market consensus belief if the market equilibrium price under the heterogeneous beliefs is also the market equilibrium price under the homogeneous belief $B_a$.

When a consensus belief exists, the market with heterogeneous beliefs can be treated as a market with a homogeneous consensus belief and then the classical Markowitz portfolio analysis can be applied. Due to the complexity of heterogeneity, establishing the existence of such a consensus belief and finding it is a difficult task in the literature. This obstacle makes the examination of the impact of the heterogeneity difficult. In the following, we construct the consensus belief explicitly, from which the market equilibrium prices $p_0$ can be determined explicitly in terms of the consensus belief. It is the explicit construction of the consensus belief that makes it easy to examine the role played by heterogeneous beliefs in determining the market equilibrium price and to derive the zero-beta CAPM relation.

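Proposition 2.1. Let

\[ \theta_a := \left( \frac{1}{I} \sum_{i=1}^{I} \theta_i^{-1} \right)^{-1}, \qquad \lambda_a^* := \frac{1}{I}\,\theta_a \sum_{i=1}^{I} \theta_i^{-1} \lambda_i^* . \]

Then

(i) the consensus belief $B_a = (E_a(x), \Omega_a)$ is given by

\[ \Omega_a = \theta_a^{-1} \lambda_a^* \left( \frac{1}{I} \sum_{i=1}^{I} \lambda_i^* \theta_i^{-1} \Omega_i^{-1} \right)^{-1}, \tag{9} \]

\[ y_a := E_a(x) = \theta_a \Omega_a \left( \frac{1}{I} \sum_{i=1}^{I} \theta_i^{-1} \Omega_i^{-1} E_i(x) \right); \tag{10} \]

(ii) the market equilibrium price $p_0$ is determined by

\[ p_0 = \frac{1}{\lambda_a^*} \left[ y_a - \frac{1}{I}\,\theta_a \Omega_a z_m \right]; \tag{11} \]

(iii) the equilibrium optimal portfolio of investor $i$ is given by

\[ z_i^* = \theta_i^{-1} \Omega_i^{-1} \left[ \left( y_i - \frac{\lambda_i^*}{\lambda_a^*}\, y_a \right) + \frac{\lambda_i^*}{I \lambda_a^*}\, \theta_a \Omega_a z_m \right]. \tag{12} \]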
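Since (8) defines $p_0$ only implicitly, one way to see Proposition 2.1 at work is to run the construction backwards: fix a price vector, compute each investor's shadow price (5) and demand (4), take the aggregate demand as the market portfolio, and then check that (9)-(11) return the price that was fixed. The sketch below does this for a hypothetical two-investor, two-asset market; all numbers and helper names are assumptions made for illustration, not values from the paper.

```python
def inv2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical beliefs: mean payoffs y, covariance omega, CARA theta, wealth w0.
investors = [
    {"y": [6.0, 9.0], "omega": [[0.6, 0.1], [0.1, 0.9]], "theta": 2.0, "w0": 10.0},
    {"y": [5.5, 9.5], "omega": [[0.5, 0.0], [0.0, 1.2]], "theta": 1.0, "w0": 20.0},
]
p0 = [5.0, 8.0]  # assumed price; z_m below is chosen so that p0 clears the market
n_inv = len(investors)

# Shadow prices (5) and demands (4); aggregate demand defines the market portfolio.
for inv in investors:
    inv["prec"] = inv2(inv["omega"])
    oy, op = matvec(inv["prec"], inv["y"]), matvec(inv["prec"], p0)
    inv["lam"] = (dot(p0, oy) - inv["theta"] * inv["w0"]) / dot(p0, op)
    inv["z"] = [(a - inv["lam"] * b) / inv["theta"] for a, b in zip(oy, op)]
z_m = [sum(inv["z"][j] for inv in investors) for j in range(2)]

# Aggregate risk aversion and consensus shadow price (Proposition 2.1).
theta_a = n_inv / sum(1.0 / inv["theta"] for inv in investors)
lam_a = theta_a * sum(inv["lam"] / inv["theta"] for inv in investors) / n_inv

# Equation (9): consensus covariance matrix.
s = [[0.0, 0.0], [0.0, 0.0]]
for inv in investors:
    w = inv["lam"] / inv["theta"] / n_inv
    s = [[s[r][c] + w * inv["prec"][r][c] for c in range(2)] for r in range(2)]
omega_a = [[lam_a / theta_a * e for e in row] for row in inv2(s)]

# Equation (10): consensus expected payoff.
v = [0.0, 0.0]
for inv in investors:
    oy = matvec(inv["prec"], inv["y"])
    v = [v[j] + oy[j] / inv["theta"] / n_inv for j in range(2)]
y_a = [theta_a * e for e in matvec(omega_a, v)]

# Equation (11): equilibrium price implied by the consensus belief.
oz = matvec(omega_a, z_m)
p_eq = [(y_a[j] - theta_a * oz[j] / n_inv) / lam_a for j in range(2)]
```

If the reconstruction is consistent, `p_eq` coincides with `p0` up to rounding. Working backwards from a price avoids solving the fixed-point problem in (8), which a full equilibrium computation would have to address.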

Proposition 2.1 shows how the consensus belief can be constructed explicitly from the heterogeneous beliefs. Under the consensus belief, the market equilibrium prices of the risky assets are determined in the standard way with no risk-free asset. Intuitively, Proposition 2.1 indicates that the market consensus belief is a weighted average of the heterogeneous beliefs. More precisely, the market risk tolerance ($1/\theta_a$) is simply an average of the risk tolerances of the heterogeneous investors. According to Huang and Litzenberger [17], $\theta_a/I = (\sum_{i=1}^{I} \theta_i^{-1})^{-1}$ is called the aggregate absolute risk aversion and consequently $\theta_a W_{m,0}/I$ is referred to as the aggregate relative risk aversion. The weighted average behaviour can also be viewed in the following way. Let $\tau_i = 1/\theta_i$ be the risk tolerance of investor $i$ and $\tau_a = \sum_{i=1}^{I} \tau_i$ be the market aggregate risk tolerance. Then

\[ \lambda_a^* = \sum_{i=1}^{I} \frac{\tau_i}{\tau_a}\, \lambda_i^*, \qquad \Omega_a^{-1} = \sum_{i=1}^{I} \frac{\tau_i \lambda_i^*}{\tau_a \lambda_a^*}\, \Omega_i^{-1}, \qquad E_a(x) = \Omega_a \sum_{i=1}^{I} \frac{\tau_i}{\tau_a}\, \Omega_i^{-1} E_i(x). \]

Hence the precision matrix ($\Omega_a^{-1}$) for the market reflects a weighted average of the precision matrices of all investors and the market expected payoff is a weighted average of the expected payoffs of the investors. The market equilibrium prices are determined such that each investor can choose their optimal portfolio subjectively and the market is cleared. It follows from (4) in Lemma 2.1 that $p_0 = \frac{1}{\lambda_i^*}\left( y_i - \frac{1}{\tau_i}\,\Omega_i z_i^* \right)$ for $i = 1, \dots, I$. However, if the entire market acts as an aggregate investor, then for the market to clear, the prices must be determined by the consensus belief as in (11) or equivalently as $p_0 = \frac{1}{\lambda_a^*}\left( y_a - \frac{1}{\tau_a}\,\Omega_a z_m \right)$. This suggests that the consensus belief $B_a$ must correspond to the belief of the aggregate market such that the market portfolio is an optimal portfolio. The expressions in Proposition 2.1 provide explicit relationships between the heterogeneous beliefs and the market consensus belief under the market aggregation. Their usefulness will be revealed when we derive a zero-beta CAPM-like relation and examine the impacts of the heterogeneity on the market equilibrium in the following subsection.
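The risk-tolerance weighting described above can be made concrete with a few lines of arithmetic. The sketch below uses hypothetical CARA coefficients and shadow prices for three investors and compares the definition of $\lambda_a^*$ in Proposition 2.1 with its weighted-average form; the variable names are illustrative only.

```python
# Hypothetical CARA coefficients and individual shadow prices for three investors.
thetas = [2.0, 1.0, 4.0]
lambdas = [0.95, 0.97, 0.93]

n_inv = len(thetas)
taus = [1.0 / th for th in thetas]       # individual risk tolerances tau_i
tau_a = sum(taus)                        # market aggregate risk tolerance
weights = [t / tau_a for t in taus]      # weights tau_i / tau_a, summing to one

theta_a = 1.0 / (sum(taus) / n_inv)      # theta_a from Proposition 2.1

# Consensus shadow price: definition versus risk-tolerance-weighted average.
lam_a_def = theta_a * sum(l / th for l, th in zip(lambdas, thetas)) / n_inv
lam_a_weighted = sum(w * l for w, l in zip(weights, lambdas))
```

In this illustration the least risk-averse investor (the one with $\theta_i = 1$) carries the largest weight, in line with the reading of the consensus belief as a tolerance-weighted average.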

2.3 The Zero-Beta CAPM Under Heterogeneous Beliefs

As a corollary of Proposition 2.1, we show now that a zero-beta CAPM-like relation holds under the constructed consensus belief with no risk-free asset.

Let the future payoff of the market portfolio $z_m$ be given by $W_m = x^T z_m$ and let its current market value be $W_{m,0} = z_m^T p_0 = \sum_{i=1}^{I} W_{i,0}$. Hence, under the consensus belief $B_a$, $E_a(W_m) = y_a^T z_m$ and $\sigma_a^2(W_m) = z_m^T \Omega_a z_m$. Define the return vector $r = (r_1, \dots, r_N)^T$ with $r_j = x_j/p_{j,0} - 1$ and $r_m = W_m/W_{m,0} - 1$. Under the market consensus belief $B_a$, we set

\[ E_a(r_j) = \frac{E_a(x_j)}{p_{j,0}} - 1, \qquad E_a(r_m) = \frac{E_a(W_m)}{W_{m,0}} - 1, \qquad \sigma_a^2(r_m) = \frac{\sigma_a^2(W_m)}{W_{m,0}^2} \]

and

\[ \mathrm{Cov}_a(r_j, r_m) = \frac{1}{p_{j,0} W_{m,0}}\, \mathrm{Cov}_a(x_j, W_m), \qquad \mathrm{Cov}_a(r_j, r_k) = \frac{1}{p_{j,0}\, p_{k,0}}\, \mathrm{Cov}_a(x_j, x_k). \]

Then we have the following result.

Corollary 2.1. In market equilibrium, the relation between expected return and risk under the heterogeneous beliefs can be expressed as

\[ E_a[r] - (\lambda_a^* - 1)\mathbf{1} = \beta \left[ E_a(r_m) - (\lambda_a^* - 1) \right], \tag{13} \]

where

\[ \lambda_a^* = \frac{z_m^T y_a - \theta_a z_m^T \Omega_a z_m / I}{W_{m,0}}, \tag{14} \]

\[ E_a(r_m) - (\lambda_a^* - 1) = \frac{\theta_a z_m^T \Omega_a z_m / I}{W_{m,0}} = \frac{1}{\tau_a}\, W_{m,0}\, \sigma_a^2(r_m) > 0 \tag{15} \]

and $\beta = (\beta_1, \beta_2, \dots, \beta_N)^T$ with

\[ \beta_j = \frac{\mathrm{Cov}_a(r_m, r_j)}{\sigma_a^2(r_m)} = \frac{W_{m,0}}{p_{j,0}}\, \frac{\mathrm{Cov}_a(x_j, W_m)}{\sigma_a^2(W_m)}, \qquad j = 1, \dots, N. \]

The equilibrium relation (13) is the standard Zero-Beta CAPM except that the mean and variance/covariance are calculated based on the consensus belief $B_a$. We refer to it as the Zero-beta Heterogeneous Capital Asset Pricing Model (ZHCAPM). For risky assets, relation (13) is equivalent to

\[ E_a[r_j] - (\lambda_a^* - 1) = \beta_j \left[ E_a(r_m) - (\lambda_a^* - 1) \right], \qquad \text{for } j = 1, \dots, N. \tag{16} \]

The zero-beta rate, $\lambda_a^* - 1$, corresponds to the expected return of the zero-beta portfolio of the market portfolio, where $\lambda_a^*$ is the market shadow price. As in the standard case, the market risk premium, given by equation (15), is positively proportional to the aggregate relative risk aversion $W_{m,0}/\tau_a$ and the variance of the market portfolio returns $\sigma_a^2(r_m)$. The market price of risk under the consensus belief is given by $\phi = (E_a(r_m) - (\lambda_a^* - 1))/\sigma_a(r_m) = W_{m,0}\,\sigma_a(r_m)/\tau_a$, which is proportional to the level of volatility of the market and the aggregate relative risk aversion.
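Relations (13)-(16) can be checked on a toy example. The sketch below takes a hypothetical consensus belief, market portfolio and market value as given, computes $\lambda_a^*$ from (14) and $p_0$ from (11), and then compares both sides of (16) asset by asset. All numbers and names are assumptions for illustration; $\mathrm{Cov}_a(x_j, W_m)$ is evaluated as the $j$-th entry of $\Omega_a z_m$.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical consensus belief and market data (not taken from the paper).
y_a = [6.0, 9.0]                         # consensus expected payoffs
omega_a = [[0.6, 0.1], [0.1, 0.9]]       # consensus covariance matrix
theta_a, n_inv = 1.5, 2                  # aggregate CARA coefficient, number of investors
z_m = [2.4, 2.2]                         # market portfolio (quantities)
w_m0 = 26.0                              # current market value W_{m,0}

oz = matvec(omega_a, z_m)                # Cov_a(x_j, W_m) for each asset j
var_w = dot(z_m, oz)                     # sigma_a^2(W_m)
lam_a = (dot(z_m, y_a) - theta_a * var_w / n_inv) / w_m0             # equation (14)
p0 = [(y_a[j] - theta_a * oz[j] / n_inv) / lam_a for j in range(2)]  # equation (11)

exp_r = [y_a[j] / p0[j] - 1.0 for j in range(2)]   # E_a(r_j)
exp_rm = dot(z_m, y_a) / w_m0 - 1.0                # E_a(r_m)
var_rm = var_w / w_m0 ** 2                         # sigma_a^2(r_m)
beta = [oz[j] / (p0[j] * w_m0) / var_rm for j in range(2)]

zero_beta_rate = lam_a - 1.0
premium = exp_rm - zero_beta_rate                  # left-hand side of (15)
lhs = [exp_r[j] - zero_beta_rate for j in range(2)]
rhs = [beta[j] * premium for j in range(2)]        # right-hand side of (16)
```

By construction the price vector satisfies $z_m^T p_0 = W_{m,0}$, and `lhs` and `rhs` should agree for every asset, with a positive `premium` as stated in (15).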

As discussed earlier, investor $i$'s shadow price becomes $R_f$ across all investors when there exists a risk-free asset in the market. That is, $\lambda_i^* = \lambda_a^* = R_f$. Substituting this into Proposition 2.1 and Corollary 2.1 leads to the main results in Chiarella et al. [7].

3. The Impact of Heterogeneity

In this section, we use Proposition 2.1 and Corollary 2.1 to examine the impact of the heterogeneous beliefs on the market consensus belief and equilibrium price. To simplify the analysis, we focus on some special cases.

3.1 The Shadow Prices and the Aggregation Property

We first examine the relationship between individual shadow prices and the market consensus shadow price. Following (2.1), let $\lambda_a^* = f(\lambda_1^*, \lambda_2^*, \cdots, \lambda_I^*; \theta_1, \theta_2, \cdots, \theta_I)$. Then it is easy to see that

$$\frac{\partial f}{\partial \lambda_i^*} = \frac{\theta_a\theta_i^{-1}}{I} > 0,$$

showing that the market consensus shadow price increases as the shadow price of investor $i$ increases, and the rate of increase depends on $\theta_i$. It follows from

$$\frac{\partial^2 f}{\partial \lambda_i^*\,\partial \theta_i} = \frac{1}{I}\,\theta_i^{-3}\theta_a\Big(\frac{1}{I}\theta_a - \theta_i\Big)$$

and $I\theta_a^{-1} > \theta_i^{-1}$ that $\partial^2 f/\partial \lambda_i^*\partial \theta_i < 0$. Therefore the market consensus shadow price is more sensitive to a change in the shadow price of an investor who is less risk-averse.
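As a quick numerical check of the two derivatives above, the following Python sketch evaluates the consensus shadow price as the risk-tolerance-weighted average $\lambda_a^* = (\theta_a/I)\sum_i \theta_i^{-1}\lambda_i^*$, which is the form implied by $\partial f/\partial\lambda_i^* = \theta_a\theta_i^{-1}/I$ when $\theta_a$ is the harmonic mean of the $\theta_i$. The function names and the input values are illustrative assumptions and are not taken from the paper.

```python
def harmonic_mean(thetas):
    """Aggregate ARA coefficient theta_a = I / sum_i(1 / theta_i)."""
    return len(thetas) / sum(1.0 / t for t in thetas)


def shadow_price_weights(thetas):
    """Sensitivities d(lambda_a)/d(lambda_i) = theta_a / (I * theta_i)."""
    theta_a = harmonic_mean(thetas)
    return [theta_a / (len(thetas) * t) for t in thetas]


def consensus_shadow_price(lambdas, thetas):
    """Weighted average of individual shadow prices (assumed linear form)."""
    weights = shadow_price_weights(thetas)
    return sum(w * lam for w, lam in zip(weights, lambdas))


# Hypothetical inputs: investor 1 is more risk averse than investor 2.
thetas = [5.0, 1.0]
lambdas = [1.02, 1.08]
weights = shadow_price_weights(thetas)
lambda_a = consensus_shadow_price(lambdas, thetas)
print(weights, lambda_a)
```

In this sketch the less risk-averse investor ($\theta = 1$) receives the larger weight, so the consensus shadow price sits closer to that investor's shadow price.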

According to Huang and Litzenberger [17], when investors have homogeneous beliefs and time-additive, state-independent utility functions with linear risk tolerance and a common cautiousness coefficient, the market equilibrium prices are independent of the distribution of the initial wealth among investors; in this case, we say that the market satisfies the aggregation property. In a general two-period economy without specifying the type of utility function for any investor, Fan [13] shows that the Second Welfare Theorem holds. The theorem states that investors with large capital endowments would have lower marginal utilities of capital endowments and a stronger influence on the market equilibrium. In our case, the utility is measured by $Q_i(z)$. From (6), the marginal utility of investor $i$ is represented by the shadow price $\lambda_i^*$. It then follows from (5) that a large initial wealth or capital endowment leads to a lower marginal utility. Also, from the expression of the equilibrium price vector in (8), it can be seen that $\lambda_i^*$ is inversely related to the price vector. This suggests that an investor with a lower shadow price or marginal utility has a stronger impact on the market equilibrium prices, and hence an investor with larger capital is more influential in the market. This is consistent with the Second Welfare Theorem. In other words, the aggregation property does not hold in our case in general. However, if there is a risk-free asset in the market, then the shadow prices or marginal utilities are constant across all investors. Correspondingly, the market prices are independent of the initial wealth distribution. Summarizing the above analysis, we have the following corollary.

Corollary 3.1. With heterogeneous beliefs and no risk-free asset, the aggregation property does not hold. Furthermore, investors with lower shadow prices or marginal utilities have a stronger impact on the market equilibrium prices, and hence investors with larger capital are more influential in the market. However, if there is a risk-free asset, the aggregation property holds.

3.2 The Impact of Heterogeneous ARA Coefficients

Proposition 2.1 indicates that heterogeneous ARA coefficients, or risk tolerances, have a complicated impact on the market consensus belief and equilibrium price. To illustrate this impact, we consider a special case in which investors are homogeneous in the expected payoffs and covariance matrix but heterogeneous in ARA, that is, $\Omega_i = \Omega_a := \Omega_o$ and $y_i = y_a := y_o$ for all $i$. Accordingly the equilibrium price vector can be written as

$$p_0 = \frac{1}{\lambda_a^*}\Big[y_o - \frac{1}{I}\theta_a\Omega_o z_m\Big], \qquad \lambda_a^* = \frac{z_m^T y_o - \theta_a z_m^T\Omega_o z_m/I}{W_{m0}}. \tag{17}$$

Equation (17) implies that, when the risk aversion coefficient is the only source of heterogeneity, the market equilibrium prices are independent of the initial wealth distribution amongst individuals and hence the aggregation property holds. For any risky asset $j$, (17) becomes

$$p_{0,j} = \frac{1}{\lambda_a^*}\Big[y_{o,j} - \frac{1}{I}\theta_a\,\mathrm{Cov}(x_j, W_m)\Big].$$

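This, together with the market shadow price in equation (17), leads to $\partial p_{0,j}/\partial\theta_a = \sigma^2(r_m)(1-\beta_j)/(I\lambda_a^*)$. In the presence of a risk-free asset with payoff $R_f$, this becomes $\partial p_{0,j}/\partial\theta_a = -\sigma^2(r_m)\beta_j/(IR_f)$. Note that, in this case, the equilibrium prices and expected returns are inversely related since the expected payoff is given. Because $\theta_a$ is the harmonic mean of the $\theta_i$, we also have

$$\frac{\partial\theta_a}{\partial\theta_i} = \frac{(\theta_a\theta_i^{-1})^2}{I} > 0, \qquad \frac{\partial^2\theta_a}{\partial\theta_i^2} = -2\,\frac{\partial\theta_a}{\partial\theta_i}\Big(\theta_i^{-1} - \frac{\partial\theta_a}{\partial\theta_i}\,\theta_a^{-1}\Big) < 0,$$

where the second inequality uses $\theta_a < I\theta_i$. This analysis leads to the following corollary.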

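Corollary 3.2. In a market with homogeneous beliefs and no risk-free assets,

$$(\beta_j - 1)\frac{\partial p_{0,j}}{\partial\theta_i} < 0, \qquad (\beta_j - 1)\frac{\partial E_o(r_j)}{\partial\theta_i} > 0$$

for $\beta_j \neq 1$, and $\partial p_{0,j}/\partial\theta_i = \partial E_o(r_j)/\partial\theta_i = 0$ for $\beta_j = 1$. If there exists a risk-free asset, then

$$\beta_j\frac{\partial p_{0,j}}{\partial\theta_i} < 0, \qquad \beta_j\frac{\partial E_o(r_j)}{\partial\theta_i} > 0$$

for $\beta_j \neq 0$, and $\partial p_{0,j}/\partial\theta_i = \partial E_o(r_j)/\partial\theta_i = 0$ for $\beta_j = 0$. The rate of change for both the equilibrium price and the expected return is greater when the investor is less risk averse.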

Corollary 3.2 indicates that the impact of ARA on the market equilibrium depends on the beta of the asset. When there is no risk-free asset, if an asset is riskier than the market ($\beta_j > 1$), an increase in ARA for any investor decreases the price and increases the expected future return of the asset, and vice versa for a less risky asset. However, if there is a risk-free asset, the changes depend on the return correlation of the asset with the market. If the returns of the asset and the market are positively correlated, an increase (decrease) in the ARA of any investor leads to a lower (higher) market equilibrium price and a higher (lower) expected return for the asset. In addition, changing the ARA of a less risk averse investor has a more significant impact on the market equilibrium price and expected return. The market is dominated by less risk averse investors: because the market average risk aversion coefficient $\theta_a$ is a harmonic mean of the $\theta_i$, it amplifies the impact of the small $\theta_i$. This suggests that, when there is no risk-free asset in the market and the risk aversion coefficients of the investors become more divergent with a given average, the aggregate ARA is reduced, resulting in a lower (higher) equilibrium price and a higher (lower) expected return for assets with betas below (above) the market level. However, when there is a risk-free asset, the reduction of the market aggregate risk aversion leads to a lower (higher) equilibrium price and a higher (lower) expected return for assets that are negatively (positively) correlated with the market.
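The two properties of the harmonic mean used in this discussion can be illustrated numerically. The Python sketch below, with hypothetical ARA values chosen only for illustration, shows that a mean-preserving spread of the $\theta_i$ lowers the aggregate ARA, and that $\theta_a$ responds more strongly to the less risk averse investor. The derivative is estimated by a finite difference rather than a closed form.

```python
def aggregate_ara(thetas):
    """Harmonic mean of the individual ARA coefficients."""
    return len(thetas) / sum(1.0 / t for t in thetas)


def sensitivity(thetas, i, h=1e-6):
    """Central finite-difference estimate of d(theta_a)/d(theta_i)."""
    up = list(thetas)
    down = list(thetas)
    up[i] += h
    down[i] -= h
    return (aggregate_ara(up) - aggregate_ara(down)) / (2.0 * h)


# Mean-preserving spreads around an (assumed) arithmetic average of 3.
spreads = [[3.0, 3.0], [2.0, 4.0], [1.0, 5.0]]
aggregates = [aggregate_ara(s) for s in spreads]

# Sensitivity of theta_a to each investor when theta = (1, 5).
sens = [sensitivity([1.0, 5.0], i) for i in range(2)]
print(aggregates, sens)
```

The aggregate ARA falls as the spread widens while the arithmetic average stays at 3, and the estimated sensitivity is largest for the investor with the smallest $\theta_i$.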

3.3 The Impact of Heterogeneous Expected Payoffs

We now assume that investors agree on the variances and covariances of asset payoffs, say $\Omega_i = \Omega_o$, but disagree on the expected future payoffs of the assets. Consequently $\Omega_a = \Omega_o$ and the equilibrium price for asset $j$ becomes

$$p_{0,j} = \frac{1}{\lambda_a^*}\Big[y_{a,j} - \frac{1}{\tau_a}\mathrm{Cov}_o(x_j, W_m)\Big], \tag{18}$$

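where $\lambda_a^* = [z_m^T y_a - z_m^T\Omega_o z_m/\tau_a]/W_{m0}$ and $y_{a,j} = \sum_{i=1}^{I}(\tau_i/\tau_a)\,y_{i,j}$. This, together with (18), leads to

$$\frac{\partial p_{0,j}}{\partial y_{a,j}} = \frac{1-\alpha_j}{\lambda_a^*}, \tag{19}$$

where $\alpha_j = p_{0,j}z_{m,j}/W_{m0}$ is the market share of asset $j$ in wealth. If there is a risk-free asset in the market with payoff $R_f$, then (19) simply becomes $\partial p_{0,j}/\partial y_{a,j} = 1/R_f$. Note that

$$\frac{\partial y_{a,j}}{\partial y_{i,j}} = \frac{1}{I}\frac{\theta_a}{\theta_i}\,y_{i,j} > 0, \qquad \frac{\partial^2 y_{a,j}}{\partial y_{i,j}\,\partial\theta_i} = \theta_a\theta_i^{-3}\,\frac{y_{i,j}}{I}\Big(\frac{\theta_a}{I} - \theta_i\Big) < 0. \tag{20}$$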

Because $\alpha_j \in [0, 1]$, equations (19) and (20) indicate that investor $i$'s subjective belief in the expected payoff of asset $j$ is positively related to its equilibrium price. This is also true when there is a risk-free asset in the market. The positive correlation between the subjective beliefs in the expected payoff and the equilibrium price for asset $j$ does not necessarily lead to a negative correlation between the subjective beliefs in the expected payoff and the market expected return for asset $j$. To see the exact relation, we have from $E_a(r_j) = y_{a,j}/p_{0,j} - 1$ that

$$\frac{\partial E_a(r_j)}{\partial y_{a,j}} = \frac{p_{0,j} - (1-\alpha_j)\,y_{a,j}/\lambda_a^*}{p_{0,j}^2}. \tag{21}$$

This expression is negative if and only if $(1 + E_a(r_j))(1 - \alpha_j) > \lambda_a^*$. When this condition holds, the expected return decreases when the expected payoff increases for asset $j$. When there is a risk-free asset, $\lambda_a^* = R_f$ and equation (21) becomes $\partial E_a(r_j)/\partial y_{a,j} = (p_{0,j} - y_{a,j}/R_f)/p_{0,j}^2$, which is negative if and only if $E_a(r_j) > r_f$. When this condition holds, the expected return decreases when the heterogeneous belief in the expected payoff increases for asset $j$. Summarizing the above analysis, we obtain the following corollary.
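The sign condition can be checked directly from (19) and (21). The following Python sketch treats the price, consensus payoff, market share and shadow price as free inputs; all numerical values and function names are hypothetical and serve only to show that the sign of (21) flips exactly when $(1 + E_a(r_j))(1 - \alpha_j) > \lambda_a^*$.

```python
def price_sensitivity(alpha_j, lambda_a):
    """Equation (19): d p_{0,j} / d y_{a,j} = (1 - alpha_j) / lambda_a."""
    return (1.0 - alpha_j) / lambda_a


def return_sensitivity(p_j, y_aj, alpha_j, lambda_a):
    """Equation (21): d E_a(r_j) / d y_{a,j}."""
    return (p_j - price_sensitivity(alpha_j, lambda_a) * y_aj) / p_j ** 2


def condition_holds(p_j, y_aj, alpha_j, lambda_a):
    """(1 + E_a(r_j)) (1 - alpha_j) > lambda_a, with 1 + E_a(r_j) = y / p."""
    return (y_aj / p_j) * (1.0 - alpha_j) > lambda_a


# Hypothetical asset: price 8, consensus expected payoff 9.34, shadow price 1.05.
small_share = return_sensitivity(8.0, 9.34, 0.05, 1.05)
large_share = return_sensitivity(8.0, 9.34, 0.40, 1.05)
print(small_share, large_share)
```

With a small market share the condition holds and the expected return falls as the consensus payoff rises; with a large market share the price responds too weakly and the expected return rises instead.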


Corollary 3.3. In a market with homogeneous beliefs in the covariance matrix and no risk-free assets, if

$$(1 + E_a(r_j))(1 - \alpha_j) > \lambda_a^* \tag{22}$$

for asset $j$, then the market expected payoff increases and the expected return decreases when the heterogeneous belief in the expected payoff of any investor increases for asset $j$. When there is a risk-free asset, condition (22) becomes $E_a(r_j) > r_f$.

The following discussion is devoted to Miller's hypothesis (Miller [24]) that assets with high dispersion in beliefs have higher market prices and lower expected future returns than otherwise similar stocks. Empirical tests performed in Diether, Malloy and Scherbina [11] support Miller's hypothesis. Intuitively, optimistic investors would increase the price of the asset and thereby reduce its expected future return. We now provide an explanation of this hypothesis. Let us consider a market in which investors have homogeneous beliefs in the covariance matrix but heterogeneous beliefs in the expected payoffs of two risky assets $j$ and $j'$. Let the expected payoffs be $y_j = (y_{1,j}, y_{2,j}, \cdots, y_{I,j})^T$ and $y_{j'} = (y_{1,j'}, y_{2,j'}, \cdots, y_{I,j'})^T$ for assets $j$ and $j'$, respectively. Assume $y_{i,j'} = y_{i,j} + \varepsilon_{i,j}$, where $\varepsilon_{1,j}, \varepsilon_{2,j}, \cdots, \varepsilon_{I,j}$ is a set of real numbers such that

$$\sum_{i=1}^{I}\varepsilon_{i,j} = 0 \quad\text{and}\quad \frac{1}{I}\sum_{i=1}^{I}(y_{i,j'} - \bar{y})^2 \geq \frac{1}{I}\sum_{i=1}^{I}(y_{i,j} - \bar{y})^2,$$

where $\bar{y} = (1/I)\sum_{i=1}^{I} y_{i,j}$. This condition implies that investors have more divergence of opinions in the expected payoff for asset $j'$ than for asset $j$. According to Miller's hypothesis, asset $j'$ would have a higher market price and a lower expected future return than asset $j$. To see if this is true, we consider the following simple example when $I = 2$.

Example 3.1. Let $I = 2$. Given $\varepsilon > 0$, consider two assets $j$ and $k$ with $y_{2,j} < y_{1,j}$, $y_{1,k} = y_{1,j} + \varepsilon$ and $y_{2,k} = y_{2,j} - \varepsilon$. This specification indicates that the divergence of opinion about the asset's expected payoff is greater for asset $k$ than for asset $j$. Then

$$y_{a,j} = \frac{\theta_1^{-1}}{\theta_1^{-1} + \theta_2^{-1}}\,y_{1,j} + \frac{\theta_2^{-1}}{\theta_1^{-1} + \theta_2^{-1}}\,y_{2,j}, \qquad y_{a,k} = \frac{\theta_1^{-1}}{\theta_1^{-1} + \theta_2^{-1}}\,(y_{1,j} + \varepsilon) + \frac{\theta_2^{-1}}{\theta_1^{-1} + \theta_2^{-1}}\,(y_{2,j} - \varepsilon).$$

Hence

$$y_{a,j} - y_{a,k} = \frac{\varepsilon}{\theta_1^{-1} + \theta_2^{-1}}\,(\theta_2^{-1} - \theta_1^{-1}).$$

Accordingly, $y_{a,j} < y_{a,k}$ if and only if $\theta_1 < \theta_2$. This implies that if the investor who is optimistic about the asset's expected payoff is less risk averse, then a divergence of opinion between the two investors about the expected payoff of asset $k$ leads to a higher expected payoff for the asset in equilibrium. This suggests that divergence of opinion on the asset expected payoffs generates a higher market expected payoff if the belief in the assets' expected future payoffs is negatively correlated with risk aversion across investors. It then follows from Corollary 3.3 that, when both assets $j$ and $k$ satisfy condition (22), the divergence of opinion on the asset expected payoffs generates a lower expected future return for the asset.
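The comparison in Example 3.1 can be reproduced with a few lines of Python. The sketch below uses the risk-tolerance weights $\tau_i/\tau_a$ with $\tau_i = 1/\theta_i$; the payoff beliefs, $\varepsilon$ and the ARA values are hypothetical numbers chosen for illustration.

```python
def consensus_payoff(payoffs, thetas):
    """Risk-tolerance-weighted average of expected payoffs, tau_i = 1 / theta_i."""
    taus = [1.0 / t for t in thetas]
    return sum(tau * y for tau, y in zip(taus, payoffs)) / sum(taus)


def payoff_gap(eps, thetas):
    """Closed form for y_{a,j} - y_{a,k} in the two-investor example."""
    inv1, inv2 = 1.0 / thetas[0], 1.0 / thetas[1]
    return eps * (inv2 - inv1) / (inv1 + inv2)


# Hypothetical beliefs: investor 1 is the optimist (y_{1,j} > y_{2,j}).
y_j = [10.0, 8.0]
eps = 1.0
y_k = [y_j[0] + eps, y_j[1] - eps]

optimist_less_averse = [1.0, 5.0]   # theta_1 < theta_2
optimist_more_averse = [5.0, 1.0]   # theta_1 > theta_2

gap_a = (consensus_payoff(y_j, optimist_less_averse)
         - consensus_payoff(y_k, optimist_less_averse))
gap_b = (consensus_payoff(y_j, optimist_more_averse)
         - consensus_payoff(y_k, optimist_more_averse))
print(gap_a, gap_b)
```

When the optimist is less risk averse the gap is negative, so the more dispersed asset $k$ has the higher consensus payoff; reversing the ARA coefficients reverses the sign.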

To summarize, if our model is to be consistent with Miller's hypothesis that divergence of opinion causes the asset price to increase and the expected return to decrease, we need the investor with an optimistic view of the asset's future payoff to be less risk-averse than the relatively pessimistic investor, and the asset return to satisfy condition (22).

4. MV Efficiency and Geometric Relationship of MV Frontiers

In this section, we examine the MV efficiency of the optimal portfolios of investors and the geometric relationship of the MV frontiers with and without a riskless asset in market equilibrium. Following the standard Markowitz method, we can construct the MV portfolio frontier based on the consensus belief. Because the consensus belief reflects the market belief when it is in equilibrium, we call this frontier the market equilibrium MV frontier. A portfolio is MV efficient if it is located on the market equilibrium MV frontier. When investors are homogeneous in their beliefs, it is well known that the optimal portfolios of investors are always MV efficient and that the market portfolio is the unique tangency portfolio between the MV frontiers with and without a riskless asset. When investors' beliefs are heterogeneous, we would expect the subjectively optimal portfolios of investors to be MV inefficient. However, it is not clear whether the geometric relationship of the MV frontiers with and without a riskless asset still holds. In this section, we first demonstrate the MV inefficiency of the subjectively optimal portfolios and then show that the geometric relationship breaks down under heterogeneous beliefs.

4.1 MV Efficiency of the Optimal Portfolios Under Heterogeneous Beliefs

In the market we set up in Section 2, investors are boundedly rational in the sense that they make their optimal decisions based on their own beliefs. Based on investors' subjective beliefs, we can construct the MV frontiers (in the standard deviation and expected return space) by using the standard Markowitz method. Of course, the optimal portfolios of the investors will be located on the efficient MV frontiers under their subjective beliefs. Similarly, based on the consensus belief, the market equilibrium MV frontier can be constructed. By the market clearing condition and frontier construction, the market portfolio is always located on the market equilibrium frontier, and hence is always efficient. The question is whether the optimal portfolios of individual investors are MV efficient. This is a very important question both theoretically and empirically. If the answer is yes, then the optimal portfolios of the boundedly rational heterogeneous investors are MV efficient under market aggregation. Otherwise, the market fails to provide MV efficiency for the investors. If we think of heterogeneous investors as fund managers and the market portfolio as the market index, the MV efficiency of the optimal portfolios has important implications for whether fund managers can outperform the market index based on the MV criterion.

To answer this question, we consider a consensus investor with the market consensus belief $\mathcal{B}_a$, risk aversion coefficient $\theta_i$ and initial wealth $W_{i,0}$. Then the optimal portfolio of the investor given by equation (12) becomes

$$z_i^* = \Big(1 - \frac{\lambda_i^*}{\lambda_a^*}\Big)\theta_i^{-1}\Omega_a^{-1}y_a + \frac{1}{I}\,\theta_i^{-1}\theta_a\,\frac{\lambda_i^*}{\lambda_a^*}\,z_m. \tag{23}$$

Equation (23) shows that any consensus investor will divide his/her investment into two portfolios, namely $\Omega_a^{-1}y_a$ and the market portfolio $z_m$, which is consistent with the Two Fund Separation Theorem (see Huang and Litzenberger [17], Chapter 4, page 83). Such portfolios must be MV efficient by construction, which means that the portfolios $\Omega_a^{-1}y_a$ and $z_m$ must be MV frontier portfolios. It is easy to verify from (23) that the aggregate position of all investors in the portfolio $\Omega_a^{-1}y_a$ is $\sum_i (1 - \lambda_i^*/\lambda_a^*)\,\theta_i^{-1}\Omega_a^{-1}y_a = 0$ when the market clearing condition (7) is satisfied.

However, when investor $i$'s subjective belief ($\mathcal{B}_i$) differs from the market belief ($\mathcal{B}_a$), the optimal portfolio of investor $i$ can be expressed as

$$z_i^* = \theta_i^{-1}\Omega_i^{-1}(y_i - y_a) + \frac{1}{I}\,\theta_i^{-1}\theta_a\,\frac{\lambda_i^*}{\lambda_a^*}\,\Omega_i^{-1}\Omega_a z_m. \tag{24}$$

Then the composition of the portfolio also depends on the belief errors $y_i - y_a$ and $\Omega_i^{-1}\Omega_a$ of investor $i$ relative to the market. Analytically it is not easy to see whether the optimal portfolio of investor $i$ lies on the market equilibrium MV frontier. However, through Example D.1 in Appendix D, we can show that the optimal portfolios of investors are not located on the market equilibrium MV frontier in general. In this example, we consider a market with two investors and three risky assets. Given individuals' risk aversion coefficients, subjective beliefs and initial wealth, we first form the consensus belief and calculate the equilibrium price vector. Using the equilibrium price, we convert the consensus belief in asset payoffs to the consensus belief in asset returns and obtain the market expected returns and variances/covariances of asset returns. With the information provided in Table 3 in Appendix D, we can construct the portfolio frontiers for each investor and the market equilibrium frontier in the mean-standard deviation space, and locate the optimal portfolios for individual investors as well as the market portfolio. Figure 1 exhibits the resulting graph.
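The aggregation identities behind the two-fund decomposition in (23) can be verified with scalars alone. In the Python sketch below, the consensus shadow price is taken to be the risk-tolerance-weighted average of the individual shadow prices, which is an assumption consistent with market clearing; the shadow prices and ARA coefficients are hypothetical.

```python
def two_fund_coefficients(lambdas, thetas):
    """Coefficients of Omega_a^{-1} y_a and z_m in equation (23), per investor."""
    n = len(thetas)
    theta_a = n / sum(1.0 / t for t in thetas)
    # Consensus shadow price as the risk-tolerance-weighted average (assumption).
    lambda_a = theta_a / n * sum(lam / t for lam, t in zip(lambdas, thetas))
    fund = [(1.0 - lam / lambda_a) / t for lam, t in zip(lambdas, thetas)]
    market = [theta_a * lam / (n * t * lambda_a) for lam, t in zip(lambdas, thetas)]
    return fund, market


# Hypothetical shadow prices and ARA coefficients for two investors.
fund, market = two_fund_coefficients([1.02, 1.08], [5.0, 1.0])
print(sum(fund), sum(market))
```

The positions in $\Omega_a^{-1}y_a$ net out to zero across investors, while the holdings of $z_m$ add up to exactly one market portfolio.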

Figure 1 shows two interesting and important features. Firstly, the market equilibrium MV frontier is located between the two individuals' MV frontiers. However, it is closer to that of investor 2. Intuitively, this is because investor 2 is less risk averse and more optimistic about the market, in the sense that he/she perceives higher expected payoffs and smaller standard deviations of the asset payoffs, and hence dominates the market. Secondly, it is verified that the optimal portfolios of the two investors are always located on their MV efficient frontiers based on their own beliefs and the market portfolio is located on the market MV efficient frontier under the consensus belief. However, in market equilibrium, the optimal portfolios of the two investors are strictly below the market equilibrium MV frontier. This may be hard to see in Figure 1. We provide a zoomed-in version in Figure 2 to verify this observation.

[Figure 1 plot, titled "MVS without a risk-free asset": expected return $E(r_p)$ against standard deviation. Series shown: investor 1's frontier, investor 2's frontier, the aggregated frontier, each investor's optimal portfolio under his/her own belief and under the aggregate belief, the market portfolio, the zero-beta rate, and the tangent line.]

Figure 1. The mean-variance frontiers under the heterogeneous beliefs and the market equilibrium consensus beliefs. The tangency line corresponding to the consensus belief has the market portfolio as the tangency portfolio and the expected return of the zero-beta portfolio of the market as the intercept with the expected return axis.

Figure 2 clearly shows that the optimal portfolios of the two investors are not located on the MV frontier, though they are very close to it, and hence are MV inefficient. Intuitively, because of the bounded rationality and the fact that the market consensus belief is jointly determined by all market participants, no investor has knowledge of the "correct" market belief. Therefore, both investors made "wrong guesses" about the market, investor 1 being pessimistic and investor 2 being optimistic, and their optimal portfolios suffer from those "wrong guesses" in terms of MV efficiency.

Sharpe [29] simulates market trading using his program APSIM, which assumes a risk-free asset and a true probability distribution of future states of the market. Although not directly comparable due to the different setups, Sharpe's findings⁶ are consistent with ours. That is, the market portfolio outperforms most other portfolios in terms of MV efficiency, and superior fund managers can at best perform as well as the market portfolio. In our model, the "true" probability distribution of the future depends on the heterogeneous beliefs of the investors. Sharpe [29] explains the inferior performance of active fund managers in the long run compared to index funds by the higher cost of active managers. To add to this, we suggest that it might be simply because it is very difficult for active managers to consistently make correct predictions about the future, while index funds track the market portfolio, which is always MV efficient. We conclude this numerical example by amending Sharpe's Index Fund Premise (IFP) to the following:

[Figure 2 plot, titled "Zoom in on the aggregate market frontier": expected return $E(r_p)$ against standard deviation. Series shown: the aggregate frontier, the market portfolio, investor 1's optimal portfolio and investor 2's optimal portfolio.]

Figure 2. Close-up of the locations of individuals' optimal portfolios and the market portfolio relative to the market frontier when the market is in equilibrium.

⁶ Sharpe shows in Chapter 6, case 18, that a superior fund manager who makes the "correct guesses" about the future of the market (meaning that their probability assessment of the future coincides with the hypothetical true probability assessment) has a Sharpe Ratio (of 0.367) slightly above the market's value (of 0.366). Other investors who make "wrong guesses" (meaning that their probability assessment of the future differs from the hypothetical true probability assessment) are mostly penalized in terms of efficiency (with the lowest Sharpe Ratio of 0.237). However, a lucky investor still has the same Sharpe Ratio (of 0.367) as the superior fund managers.


IFPa. Few of us are as smart as all of us.

IFPb. Few of us are as smart as all of us, and it is hard to identify such people in advance.

IFPc. Few of us are as smart as all of us, and it is hard to identify such people in advance, and they definitely⁷ charge more than they are worth.

4.2 The Geometric Relation of the Equilibrium MV Frontiers with and without Risk-Free Asset

To examine the tangency relationship of the traditional portfolio theory with heterogeneous beliefs, we consider the situation in which a riskless asset exists with future payoff $R_f$. Under the homogeneous belief, the classic portfolio theory tells us that the efficient portfolio frontier collapses to a straight line when a risk-free asset is added to the market. This straight line has one tangency point with the original frontier without a risk-free asset. This tangency portfolio is exactly the market portfolio when both the risk-free and equity markets are in equilibrium. We now examine this equilibrium tangency relationship under heterogeneous beliefs through the following example.

Example 4.1. Consider the case with $I = 2$ investors with beliefs $\mathcal{B}_i = (\Omega_i, y_i)$ for $i = 1, 2$. There are $N = 3$ risky assets and a risk-free asset with payoff $R_f$. Let the absolute risk aversion coefficients be $(\theta_1, \theta_2) = (5, 1)$, investors' initial wealth $W_{1,0} = W_{2,0} = \$10$, the market endowment of risky assets $z_m = (1, 1, 1)^T$, $y_o = (6.59, 9.34, 9.78)^T$, $\mathbf{1} = (1, 1, 1)^T$ and $\Omega_o = D_o C D_o$, where

$$D_o = \begin{pmatrix} 0.7933 & 0 & 0 \\ 0 & 0.8770 & 0 \\ 0 & 0 & 1.4622 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 0.2233 & 0.1950 \\ 0.2233 & 1 & 0.1163 \\ 0.1950 & 0.1163 & 1 \end{pmatrix},$$

in which $D_o$ corresponds to the standard deviation matrix and $C$ is the correlation matrix. Assume that investors' beliefs are given by $y_i = (1 + \delta_i)\,y_o$ and $\Omega_i = D_i C D_i$, where $D_i = (1 + \varepsilon_i)\,D_o$ for $i = 1, 2$. This implies that investors agree on the correlation of asset payoffs, but disagree about the volatilities and expected payoffs. Next we aggregate individuals' beliefs according to Proposition 2.1, first without a risk-free asset, then with a risk-free asset. The risk-free payoff $R_f$ is determined such that the risk-free asset is in net-zero supply in equilibrium. To examine the tangency relationship, we plot the MV frontiers and optimal portfolios under the market consensus belief with and without a risk-free asset for different values of $\delta_i$ and $\varepsilon_i$. Plots are shown in Figure 3.
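For orientation, the homogeneous benchmark of this example ($\delta_i = \varepsilon_i = 0$) can be priced directly from equation (17). The Python sketch below builds $\Omega_o = D_o C D_o$ from the values above and computes the price vector and shadow price. Taking the total market wealth $W_{m0}$ to be the sum of the two initial endowments, \$20, is an assumption of this sketch, as are the function names; the heterogeneous cases in Figure 3 require the full aggregation in Proposition 2.1 and are not reproduced here.

```python
Y_O = [6.59, 9.34, 9.78]              # benchmark expected payoffs y_o
STD = [0.7933, 0.8770, 1.4622]        # diagonal of D_o
CORR = [[1.0, 0.2233, 0.1950],
        [0.2233, 1.0, 0.1163],
        [0.1950, 0.1163, 1.0]]        # correlation matrix C
Z_M = [1.0, 1.0, 1.0]                 # market endowment z_m
THETAS = [5.0, 1.0]                   # ARA coefficients
W_M0 = 20.0                           # assumed total wealth, W_{1,0} + W_{2,0}


def covariance(std, corr):
    """Omega = D C D for a diagonal standard-deviation matrix D."""
    n = len(std)
    return [[std[a] * corr[a][b] * std[b] for b in range(n)] for a in range(n)]


def equilibrium_prices(y, omega, z, thetas, wealth):
    """Equation (17): price vector and shadow price under common beliefs."""
    n_inv = len(thetas)
    theta_a = n_inv / sum(1.0 / t for t in thetas)
    omega_z = [sum(row[b] * z[b] for b in range(len(z))) for row in omega]
    z_y = sum(zi * yi for zi, yi in zip(z, y))
    z_omega_z = sum(zi * oz for zi, oz in zip(z, omega_z))
    lambda_a = (z_y - theta_a * z_omega_z / n_inv) / wealth
    prices = [(yi - theta_a * oz / n_inv) / lambda_a for yi, oz in zip(y, omega_z)]
    return prices, lambda_a


OMEGA_O = covariance(STD, CORR)
prices, lambda_a = equilibrium_prices(Y_O, OMEGA_O, Z_M, THETAS, W_M0)
print(prices, lambda_a)
```

By construction the value of the market portfolio at these prices equals $W_{m0}$, which is the identity $W_{m0} = z_m^T p_0$ used in the discussion of Figure 3.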

⁷ It reads "may" in Sharpe's book. Because no optimal portfolio is MV efficient unless the individual's belief coincides with the consensus belief, no one can beat the market portfolio when the market is in equilibrium.


[Figure 3 plots: expected return $E(r_p)$ against standard deviation $\sigma(r_p)$ in four panels, each showing portfolios without a riskless security and portfolios with a riskless security. (a1) $(\delta_1, \delta_2) = (0.2, 0)$; (a2) $(\delta_1, \delta_2) = (0, 0.2)$; (a3) $(\varepsilon_1, \varepsilon_2) = (-0.2, 0)$; (a4) $(\varepsilon_1, \varepsilon_2) = (0, -0.2)$.]

Figure 3. Comparison of the geometric relationships between market MV frontiers with and without a risk-free asset, when the risk-free asset is in net-zero supply. In (a1) and (a2), $y_1 \neq y_2$, $\Omega_1 = \Omega_2$; in (a3) and (a4), $y_1 = y_2$, $\Omega_1 \neq \Omega_2$.

When investors are homogeneous about the variances and covariances but heterogeneous about the expected payoffs of the risky assets, Figure 3(a1) and (a2) show that the tangency relation still holds. This is not surprising. Because of the homogeneous belief in the variance-covariance matrix, $\Omega_i = \Omega_o$, the consensus variance-covariance matrix is given by $\Omega_a = \Omega_o$. From the construction of the consensus belief, the expected payoff $y_a$ is a risk-tolerance-weighted average of the heterogeneous beliefs in the expected payoffs. Therefore, the consensus belief $\mathcal{B}_a$ remains the same when a risk-free asset is added to the market. Furthermore, since the risk-free asset is in net-zero supply, it follows from equation (11) in Proposition 2.1 that

$$W_{m0} = z_m^T p_0 = \frac{1}{\lambda_a^*}\Big[y_a^T z_m - \frac{1}{I}\theta_a z_m^T\Omega_a z_m\Big] = \frac{1}{R_f}\Big[y_a^T z_m - \frac{1}{I}\theta_a z_m^T\Omega_a z_m\Big].$$

Consequently, the riskless payoff $R_f$ must equal the zero-beta payoff $\lambda_a^*$. This implies that both the market's optimal marginal certainty equivalent wealth (CEW) and the equilibrium prices do not change when a risk-free asset is added to the market. Therefore the tangency relationship of the two market equilibrium frontiers with and without a risk-free asset holds, with the market portfolio as the tangency portfolio. However, the efficiency of the optimal portfolios of the two investors depends on their expectations and risk aversion coefficients. On the one hand, when the more risk averse investor is optimistic and the less risk averse investor is pessimistic about the expected payoffs, Figure 3(a1) indicates that the optimal portfolios of both investors are located closer to the market portfolio and the market MV frontiers. On the other hand, when the more risk averse investor is pessimistic and the less risk averse investor is optimistic about the expected payoffs, Figure 3(a2) indicates that the optimal portfolios of both investors are located far away from the market portfolio and the equilibrium market MV frontier. In particular, the optimal portfolio of the pessimistic investor may become even more inefficient when the risk-free asset is available. This means that adding a risk-free asset in this situation may help investor 2 to achieve a higher expected return for his optimal portfolio at the expense of the MV efficiency of the optimal portfolio of investor 1.

When investors are heterogeneous in the variances of the asset payoffs but homogeneous in their expected payoffs, Figure 3(a3) and (a4) illustrate that the tangency relation breaks down. The risk-free payoff is no longer guaranteed to equal the zero-beta payoff, which results in a change in the market's optimal CEW and also in the equilibrium prices. In particular, when the relatively less risk averse investor, investor 2 in this case, is more confident (measured by the smaller variance), Figure 3(a4) indicates that the existence of a risk-free asset actually pushes up the MV frontier, leading to a higher expected return for the market portfolio. If one believes that the less risk averse investor is more likely to be the more confident one in general, this implies that adding a risk-free asset would be more likely to push the portfolio frontier line above the tangency line of the frontier without the risk-free asset, leading to a higher market expected return. This observation would help us to explain the risk premium puzzle⁸. However, when the relatively more risk averse investor, investor 1 in this case, is more confident, Figure 3(a3) implies that the existence of a risk-free asset actually pushes down the MV frontier, lowering the expected return of the market portfolio. This is an unexpected and surprising result. In the standard homogeneous case, the expected return of the market portfolio is independent of the existence of a risk-free asset in zero-net supply. The above analysis demonstrates that this is no longer the case when investors are heterogeneous. Based on Figure 3(a4), we observe that a restriction on access to the risk-free asset may lead to a lower market expected return, a phenomenon we experienced in the current financial crisis. We leave further development along this line to future research.

⁸ A detailed analysis of the conditions under which the market generates a higher market risk premium and a lower risk-free rate in market equilibrium when there are two assets and two beliefs can be found in He and Shi [15].

5. The Impact of Heterogeneity on the Market with Many Investors

From the traditional portfolio theory under homogeneous beliefs, we know that all investors hold portfolios located on the market's MV efficient frontier, which is a hyperbola when there is no riskless asset and, when there is a risk-free asset, a straight line connecting the risk-free asset and the market portfolio, called the Capital Market Line (CML). When investors are heterogeneous, we have shown through numerical examples with three assets and two investors in the previous section that investors no longer hold portfolios on the equilibrium market MV frontier unless they hold the market consensus belief. Essentially, this is because the consensus belief is determined endogenously by the heterogeneous beliefs and no individual knows the consensus belief in advance. In this section, we extend the analysis of the previous section to a market consisting of many different investors, to see whether the features observed for the market with two investors also hold for the market with many investors. We use mean-preserving spreads, which can be either univariate or multivariate, to characterize the beliefs of the investors. We use numerical examples to examine the MV efficiency of the optimal portfolios of investors and their position relative to the CML when the heterogeneity is either in the expected payoffs or in the variances of the payoffs.

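Example 5.1. Let the number of investors be $I = 50$ and the number of risky assets $N = 3$, and let the market portfolio of risky assets be $z_m = (25, 25, 25)^T$ (so that the average number of each stock per investor stays at 0.5 as in the previous example). Assume that there is a risk-free asset with payoff $R_f = 1.05$. Investors' initial wealth is $W_{i,0} = \$10$, and the ARA coefficients are $\theta_i \sim N(\theta_o, \sigma_\theta^2)$ with $\theta_o = 3$ and $\sigma_\theta = 0.3$ for $i = 1, 2, \cdots, I$. Consider two types of probability distributions for investors' beliefs:

(i) $y_i = (1 + \delta_i)\,y_o$ and $\Omega_i = D_i C D_i$, $D_i = (1 + \varepsilon_i)\,D_o$ for $i = 1, \cdots, 50$, where $C$, $y_o$ and $D_o$ are defined in Example 4.1 and $\delta_i \overset{iid}{\sim} N(0, \sigma_\delta^2)$ and $\varepsilon_i \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$;

(ii) $y_i = \delta_i + y_o$ and $\Omega_i = D_i C D_i$, $D_i = \mathrm{Diag}[\varepsilon_i + (0.7933, 0.8770, 1.4622)^T]$ for $i = 1, \cdots, 50$, where $\delta_i \overset{iid}{\sim} MN(0, \Sigma_\delta)$ and $\varepsilon_i \overset{iid}{\sim} MN(0, \Sigma_\varepsilon)$ with $\Sigma_\delta = \sigma_\delta\,\mathrm{Diag}[1]$ and $\Sigma_\varepsilon = \sigma_\varepsilon\,\mathrm{Diag}[1]$.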


[Figure 4 plots: expected return $E(r_p)$ against standard deviation $\sigma(r_p)$ in four panels. (b1) Heterogeneity in expected payoffs, $(\sigma_\delta, \sigma_\varepsilon) = (0.2, 0)$. (b2) Heterogeneity in variances, $(\sigma_\delta, \sigma_\varepsilon) = (0, 0.03)$.]

Figure 4. The optimal portfolios of all 50 investors and their positions relative to the CML when investors' beliefs are homogeneous in variances and heterogeneous in expected payoffs in (b1), or homogeneous in expected payoffs and heterogeneous in variances in (b2). The left (right) panels correspond to a univariate (multivariate) distribution in beliefs.

The symbols $N$ and $MN$ stand for the (truncated, if necessary) normal distribution and the multivariate normal distribution, respectively.
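To make the belief specification concrete, the following Python sketch draws the 50 investors for belief type (i) with $(\sigma_\delta, \sigma_\varepsilon) = (0.2, 0)$. The paper only states that the normal distributions are truncated if necessary, so the truncation rule used here (redrawing until $\theta_i > 0$, $\delta_i > -1$ and $\varepsilon_i > -1$), the random seed and the function names are all assumptions of this sketch.

```python
import random

Y_O = [6.59, 9.34, 9.78]
STD_O = [0.7933, 0.8770, 1.4622]
CORR = [[1.0, 0.2233, 0.1950],
        [0.2233, 1.0, 0.1163],
        [0.1950, 0.1163, 1.0]]


def truncated_gauss(rng, mean, sd, lower):
    """Normal draw, redrawn until it exceeds `lower` (truncation rule assumed)."""
    if sd == 0.0:
        return mean
    while True:
        draw = rng.gauss(mean, sd)
        if draw > lower:
            return draw


def sample_investor(rng, sigma_delta, sigma_eps, theta_mean=3.0, theta_sd=0.3):
    """Draw (theta_i, y_i, Omega_i) for belief type (i) of Example 5.1."""
    theta = truncated_gauss(rng, theta_mean, theta_sd, 0.0)
    delta = truncated_gauss(rng, 0.0, sigma_delta, -1.0)
    eps = truncated_gauss(rng, 0.0, sigma_eps, -1.0)
    y = [(1.0 + delta) * v for v in Y_O]
    std = [(1.0 + eps) * s for s in STD_O]
    omega = [[std[a] * CORR[a][b] * std[b] for b in range(3)] for a in range(3)]
    return theta, y, omega


rng = random.Random(0)  # fixed seed so the draw is reproducible
investors = [sample_investor(rng, sigma_delta=0.2, sigma_eps=0.0) for _ in range(50)]
```

Each element of `investors` holds one investor's ARA coefficient, expected payoff vector and covariance matrix; with $\sigma_\varepsilon = 0$ every investor shares the benchmark covariance matrix $\Omega_o$.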

The results for the two cases are plotted in Figure 4, which shows the optimal portfolios of all 50 investors and their positions relative to the CML. Figure 4(b1) illustrates the case when investors are homogeneous in variances but heterogeneous in the expected payoffs, while Figure 4(b2) illustrates the opposite case. The left panels correspond to univariate belief dispersion and the right panels to multivariate dispersions. Figure 4 leads to the following interesting observations. (i) The optimal portfolios of all investors are almost on the CML when investors are heterogeneous in variances but homogeneous in the expected payoffs, as illustrated by Figure 4(b2). The same effect is observed for the univariate spread (left panel) and the multivariate spreads (right panel). This shows that heterogeneity in covariances, characterized by mean-preserving spreads, plays an insignificant role in the MV efficiency of the optimal portfolios of investors. (ii) The heterogeneity in expected payoffs has a significant impact on the MV efficiency of the optimal portfolios of the investors, as illustrated in Figure 4(b1). The optimal portfolios become less MV efficient, in particular when the belief dispersions are multivariate normally distributed (the right panel). Some optimal portfolios are far below the CML and even have a lower expected return than the risk-free rate (the left panel). In addition, when the belief dispersions are univariate, the optimal portfolios seem to form a hyperbolic curve below the equilibrium market MV efficient frontier (the left panel). However, when the divergence of opinions is not the same for each asset, the optimal portfolios are scattered under the MV frontier without any significant pattern. This example shows that heterogeneity in expected payoffs has a more significant impact on the MV efficiency of the optimal portfolios of investors than heterogeneity in variances.

Based on the example with many investors, we find that the impact on the MV efficiency of investors' optimal portfolios is significant for heterogeneity in expected payoffs but insignificant for heterogeneity in variances, and that different mean-preserving spreads in beliefs have different impacts. However, based on the analysis in the previous section, the impact on the geometric relation of the frontiers with and without a riskless asset is insignificant for heterogeneity in expected payoffs but significant for heterogeneity in variances. Therefore the two kinds of heterogeneity have different impacts on the MV efficiency and on the geometric relation of the portfolio frontiers. Overall, we can see that, due to the heterogeneous beliefs, the market fails to provide investors with MV efficient portfolios; this generic feature is not what we would expect in a market with homogeneous beliefs. It shows that heterogeneous investors can never beat the market when performance is measured by MV efficiency.

6. Conclusion

Within the MV framework, by assuming that investors are heterogeneous, this paper examines the impact of heterogeneity on the market equilibrium prices and the equilibrium MV frontier in a market with many risky assets and no riskless asset. The heterogeneity is measured by the risk aversion coefficients, expected payoffs, and variance/covariance matrices of risky assets of heterogeneous investors. Investors are boundedly rational in the sense that they make their optimal portfolio decisions based on their own beliefs. To characterize the market equilibrium prices of the risky assets, we introduce the concept of a consensus belief of the market and show how the consensus or market belief can be constructed from the heterogeneous beliefs. Basically, under market aggregation, the consensus belief is a weighted average of the heterogeneous beliefs. An explicit formula for the market equilibrium prices of the risky assets is derived. As a by-product of the consensus belief and the equilibrium price formula, we show that the standard Black's zero-beta CAPM still holds with heterogeneous beliefs. The impact of the heterogeneity on the market equilibrium, the mean-variance frontier and the MV efficiency of the investors' optimal portfolios is analyzed. In particular, through some numerical examples, we show that, in market equilibrium, an investor's belief that is biased away from the market belief places his/her optimal portfolio below the equilibrium market MV frontier (although it may be very close to the MV efficient frontier). This demonstrates that boundedly rational investors may never achieve MV efficiency in market equilibrium. If we regard the heterogeneous investors as fund managers and the market portfolio as a market index, then our result offers an explanation for the empirical finding that, according to the MV criterion, managed funds underperform the market indices on average. We also offer an explanation for Miller's proposition that “divergence of opinion corresponds to lower future asset returns” and the subsequent empirical findings on this. Furthermore, we show that the well-known tangency relation of the frontiers with and without a risk-free asset under homogeneous beliefs breaks down under heterogeneous beliefs, in particular when investors are heterogeneous in variances. Adding a risk-free asset to a market with many risky assets can in general have a very complicated effect on the market. In the homogeneous market, the expected return of the market portfolio is independent of the existence of the risk-free asset. However, in the heterogeneous market, adding a risk-free asset to the market with many risky assets can change the expected return of the market portfolio in equilibrium. This result can be used to explain the risk premium puzzle and financial market crisis. In addition, heterogeneity in the expected payoffs has a significant impact on the MV efficiency of the subjectively optimal portfolios but an insignificant impact on the geometric relation. The reverse holds for heterogeneity in variances.

The implications of heterogeneity for the market under different market conditions are far more complicated than they seem and deserve further study. It would be interesting to extend the current static framework to a dynamic setting in which the heterogeneous beliefs are characterized by trading strategies used in financial markets, so that the MV efficiency of different trading strategies can be examined. It would also be interesting to allow investors to learn over time from the market through various learning mechanisms, such as the Bayesian updating rule and adaptive learning mechanisms, so that the expectation feedback (see Chiarella et al. [9]) and the MV efficiency under learning can be examined. These extensions will give us a richer modeling environment and hopefully lead to a better understanding of the phenomena in our financial markets. We leave these issues to future research.


Appendices

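A. Proof of Lemma 2.1

Let $\lambda_i$ be the Lagrange multiplier and set

$$L(z_i, \lambda_i) := y_i^T z_i - \frac{\theta_i}{2} z_i^T \Omega_i z_i + \lambda_i \left[ W_{i0} - p_0^T z_i \right]. \tag{25}$$

Then the optimal portfolio of agent $i$ is determined by the first order condition

$$\frac{\partial L}{\partial z_i} = 0 \;\Rightarrow\; z_i = \theta_i^{-1} \Omega_i^{-1} \left[ y_i - \lambda_i p_0 \right]. \tag{26}$$

Substituting (26) into (3) yields (5).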

B. Proof of Proposition 2.1

From Definition 2.1, if the consensus belief $\mathcal{B}_a = (E_a(x), \Omega_a)$ exists, then

$$z_i^* = \theta_a^{-1} \Omega_a^{-1} \left[ y_a - \lambda_a^* p_0 \right]. \tag{27}$$

Applying the market equilibrium condition to (27), we must have

$$z_m = \sum_{i=1}^{I} z_i^* = I \left[ \theta_a^{-1} \Omega_a^{-1} \left[ y_a - \lambda_a^* p_0 \right] \right]. \tag{28}$$

This leads to the equilibrium price (11). On the other hand, it follows from the individuals' demand (4) and the market clearing condition (7) that, under the heterogeneous beliefs,

$$z_m = \sum_{i=1}^{I} z_i^* = \sum_{i=1}^{I} \theta_i^{-1} \Omega_i^{-1} \left[ y_i - \lambda_i^* p_0 \right]. \tag{29}$$

Under the definitions (9) and (10), we can rewrite equation (29) as

$$z_m = \sum_{i=1}^{I} \theta_i^{-1} \Omega_i^{-1} y_i - \left( \sum_{i=1}^{I} \theta_i^{-1} \lambda_i^* \Omega_i^{-1} \right) p_0 = I \theta_a^{-1} \Omega_a^{-1} y_a - I \theta_a^{-1} \lambda_a^* \Omega_a^{-1} p_0, \tag{30}$$

which leads to the same market equilibrium price (11). This shows that $\mathcal{B}_a = \{\Omega_a, y_a\}$ defined in (9) and (10) is the consensus belief. Inserting (11) into (4) gives the equilibrium optimal portfolio (12) of investor $i$.
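For reference, the step from (28) to the price formula can be written out explicitly. The lines below are a sketch that assumes $\Omega_a$ is invertible and $\lambda_a^* \neq 0$; equation (11) itself appears earlier in the paper and is not reproduced here, but the result agrees with the component form (31) used in the proof of Corollary 2.1.

```latex
\begin{aligned}
z_m = I\,\theta_a^{-1}\,\Omega_a^{-1}\,[\,y_a - \lambda_a^* p_0\,]
\;&\Longrightarrow\;
\frac{\theta_a}{I}\,\Omega_a z_m = y_a - \lambda_a^* p_0 \\
\;&\Longrightarrow\;
p_0 = \frac{1}{\lambda_a^*}\Bigl[\,y_a - \frac{\theta_a}{I}\,\Omega_a z_m\Bigr].
\end{aligned}
```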


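C. Proof of Corollary 2.1

The equilibrium price vector in (11) can be rewritten to express the price of each asset as

$$p_{0,j} = \frac{1}{\lambda_a^*} \Bigl( y_{a,j} - \frac{\theta_a}{I} \sum_{k=1}^{N} \sigma_{j,k} z_{m,k} \Bigr) = \frac{1}{\lambda_a^*} \Bigl[ y_{a,j} - \frac{\theta_a}{I} \mathrm{Cov}_a(x_j, W_m) \Bigr]. \tag{31}$$

It follows from (31) that $y_{a,j} - \lambda_a^* p_{0,j} = \frac{\theta_a}{I} \mathrm{Cov}_a(x_j, W_m)$ and hence

$$\frac{y_{a,j}}{p_{0,j}} - \lambda_a^* = \frac{1}{p_{0,j}} \frac{\theta_a}{I} \mathrm{Cov}_a(x_j, W_m).$$

Therefore

$$E_a(r_j) - (\lambda_a^* - 1) = \frac{1}{p_{0,j}} \frac{\theta_a}{I} \mathrm{Cov}_a(x_j, W_m). \tag{32}$$

It follows from $W_{m0} = \frac{1}{\lambda_a^*} z_m^T (y_a - \theta_a \Omega_a z_m / I)$ that

$$\lambda_a^* = \frac{z_m^T y_a - \theta_a z_m^T \Omega_a z_m / I}{W_{m0}}. \tag{33}$$

Using the definition of $\lambda_a^*$ in (33), we obtain

$$E_a(r_m) - (\lambda_a^* - 1) = \frac{y_a^T z_m}{z_m^T p_0} - \lambda_a^* = \frac{y_a^T z_m}{W_{m0}} - \frac{z_m^T y_a - \theta_a z_m^T \Omega_a z_m / I}{W_{m0}}.$$

Thus

$$E_a(r_m) - (\lambda_a^* - 1) = \frac{\theta_a z_m^T \Omega_a z_m / I}{W_{m0}} \neq 0. \tag{34}$$

Dividing (32) by (34) leads to

$$\frac{E_a(r_j) - (\lambda_a^* - 1)}{E_a(r_m) - (\lambda_a^* - 1)} = \frac{\frac{1}{p_{0,j}} \frac{\theta_a}{I} \mathrm{Cov}_a(x_j, W_m)}{\frac{\theta_a z_m^T \Omega_a z_m / I}{W_{m0}}} = \frac{\frac{1}{p_{0,j}} \mathrm{Cov}_a(x_j, W_m)}{\sigma_{a,m}^2 / W_{m0}} = \frac{\mathrm{Cov}_a\bigl( \frac{x_j}{p_{0,j}}, \frac{W_m}{W_{m0}} \bigr)}{\sigma_{a,m}^2 / W_{m0}^2} = \frac{\mathrm{Cov}_a(r_j, r_m)}{\sigma_a^2(r_m)} = \beta_j, \tag{35}$$

leading to the CAPM-like relation in (13).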


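Table 1. Market specifications and heterogeneous beliefs. (Covariance matrices are symmetric; only the upper triangle is shown.)

$$
\begin{array}{llll}
\text{Initial wealth} & \text{Risk aversion} & \text{Expected payoffs} & \text{Variance/covariance of payoffs} \\[4pt]
W_{10} = 10 & \theta_1 = 5 & y_1 = \begin{pmatrix} 6.60 \\ 9.35 \\ 9.78 \end{pmatrix} & \Omega_1 = \begin{pmatrix} 0.6292 & 0.1553 & 0.2262 \\ & 0.7692 & 0.1492 \\ & & 2.1381 \end{pmatrix} \\[14pt]
W_{20} = 10 & \theta_2 = 1 & y_2 = \begin{pmatrix} 9.60 \\ 12.35 \\ 12.78 \end{pmatrix} & \Omega_2 = \begin{pmatrix} 0.4292 & -0.0447 & 0.0262 \\ & 0.5692 & -0.0508 \\ & & 1.7381 \end{pmatrix}
\end{array}
$$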

D. A Numerical Example

Example D.1. Let $I = 2$ and $N = 3$. Consider the setup in Table 1. Assume that there is one share available for each asset, that is, $z_m = (1, 1, 1)^T$. Based on the information in Table 1, we use equation (8) and Excel Solver to solve for the equilibrium price vector and obtain the market equilibrium price $p_0 = (5.6436, 7.4328, 6.9236)^T$. The optimal portfolios and shadow prices of the investors are given by $z_1^* = (0.380, 0.768, 0.310)^T$ and $\lambda_1^* = 0.7894$ for investor 1, and $z_2^* = (0.620, 0.232, 0.690)^T$ and $\lambda_2^* = 1.6520$ for investor 2. Using Proposition 2.1, we construct the consensus belief $\mathcal{B}_a$, the aggregate risk aversion coefficient $\theta_a$, and the aggregate shadow price $\lambda_a^*$, and obtain the result in Table 2.
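The reported equilibrium can be cross-checked against the first order condition (26). The following sketch plugs investor 1's data from Table 1, together with the quoted $p_0$ and $\lambda_1^*$, into $z_1 = \theta_1^{-1} \Omega_1^{-1} [y_1 - \lambda_1 p_0]$ and evaluates the budget $p_0^T z_1$. The helper `solve3` is an illustrative hand-rolled Gaussian elimination, not part of the paper's method (which uses Excel Solver); agreement is expected only up to the rounding of the published figures.

```python
def solve3(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = len(m)
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z


# Investor 1 (Table 1) and the equilibrium values quoted in Example D.1.
theta1 = 5.0
y1 = [6.60, 9.35, 9.78]
omega1 = [[0.6292, 0.1553, 0.2262],
          [0.1553, 0.7692, 0.1492],
          [0.2262, 0.1492, 2.1381]]
p0 = [5.6436, 7.4328, 6.9236]
lambda1 = 0.7894

# Equation (26): Omega_1 z_1 = (y_1 - lambda_1 p_0) / theta_1.
rhs = [(y - lambda1 * p) / theta1 for y, p in zip(y1, p0)]
z1 = solve3(omega1, rhs)

# Budget constraint: the value of the portfolio should match W_10 = 10.
budget = sum(p * z for p, z in zip(p0, z1))
print([round(z, 3) for z in z1], round(budget, 3))
```

The computed holdings should be close to the reported $z_1^*$ and the budget close to the initial wealth of 10.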

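We then use the market equilibrium price to convert the consensus belief from payoffs to returns as follows. Let $P_0 = \mathrm{diag}[p_0] = \mathrm{diag}(5.6436, 7.4328, 6.9236)$ and

$$E_i(r) := P_0^{-1} y_i - \mathbf{1}, \qquad V_i(r) := P_0^{-1} \Omega_i P_0^{-1}, \qquad i = 1, 2, a;$$

$$w_i^* := \frac{1}{W_{i,o}} P_0 z_i^*, \qquad E_i(r_{ip}^*) := E_i(r)^T w_i^*, \qquad \sigma_{ip}^* = \bigl( w_i^{*T} V_i(r) w_i^* \bigr)^{1/2}, \qquad i = 1, 2;$$

$$E_a(r_{ip}^*) := E_a(r)^T w_i^*, \qquad \sigma_{ip}^a = \bigl( w_i^{*T} V_a(r) w_i^* \bigr)^{1/2}, \qquad i = 1, 2;$$

$$w_m := \frac{1}{W_{m,o}} P_0 z_m, \qquad E_a(r_m) := E_a(r)^T w_m,$$

$$\sigma_{a,m} = \bigl( w_m^T V_a(r) w_m \bigr)^{1/2}, \qquad \beta := \frac{V_a(r) w_m}{\sigma_{a,m}^2}.$$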

We then obtain the results in Table 3. In the above definitions, $E_i(r)$ and $V_i(r)$ are the expected return vectors and covariance matrices in terms of asset returns for each investor. Subsequently, $w_i^*$ are the individuals' optimal portfolio weights, and $E_i(r_{ip}^*)$ and $\sigma_{ip}^*$ are the expected returns and standard deviations of the investors' optimal portfolios under their subjective beliefs $\mathcal{B}_i$, while $E_a(r_{ip}^*)$ and $\sigma_{ip}^a$ are the same quantities evaluated under the consensus belief. Similarly, under the market


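Table 2. The market consensus beliefs, shadow price and ARA. (The covariance matrix is symmetric; only the upper triangle is shown.)

$$
\begin{array}{lllll}
\text{Initial market wealth} & \text{Shadow price} & \text{Risk aversion} & \text{Expected payoffs} & \text{Variance/covariance of payoffs} \\[4pt]
W_{m0} = 20 & \lambda_a^* = 1.5083 & \theta_a = 1.6667 & y_a = \begin{pmatrix} 8.88 \\ 11.63 \\ 12.06 \end{pmatrix} & \Omega_a = \begin{pmatrix} 0.4383 & -0.0356 & 0.0352 \\ & 0.5783 & -0.0417 \\ & & 1.9472 \end{pmatrix}
\end{array}
$$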

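Table 3. Heterogeneous beliefs and the consensus belief, the individual optimal and market portfolios in equilibrium, and the means and standard deviations of these portfolios under heterogeneous and consensus beliefs, respectively. (Covariance matrices are symmetric; only the upper triangle is shown.)

$$
\begin{array}{llll}
\text{Expected returns} & \text{Variance/covariance of returns} & \text{Portfolio weights} & \text{Portfolio return/SD} \\[4pt]
E_1(r) = \begin{pmatrix} .1690 \\ .2577 \\ .4126 \end{pmatrix} & V_1 = \begin{pmatrix} .0198 & .0037 & .0058 \\ & .0139 & .0029 \\ & & .0446 \end{pmatrix} & w_1^* = \begin{pmatrix} .2144 \\ .5711 \\ .2145 \end{pmatrix} & \begin{array}{l} E_1(r_{1p}^*) = .2719, \; \sigma_{1p}^* = .09824 \\ E_a(r_{1p}^*) = .6043, \; \sigma_{1p}^a = .0748 \end{array} \\[14pt]
E_2(r) = \begin{pmatrix} .7006 \\ .6613 \\ .8459 \end{pmatrix} & V_2 = \begin{pmatrix} .0135 & -.0011 & .0007 \\ & .0103 & -.0010 \\ & & .0404 \end{pmatrix} & w_2^* = \begin{pmatrix} .3499 \\ .1722 \\ .4778 \end{pmatrix} & \begin{array}{l} E_2(r_{2p}^*) = .7633, \; \sigma_{2p}^* = .1054 \\ E_a(r_{2p}^*) = .6522, \; \sigma_{2p}^a = .1065 \end{array} \\[14pt]
E_a(r) = \begin{pmatrix} .5729 \\ .5644 \\ .7418 \end{pmatrix} & V_a = \begin{pmatrix} .0138 & -.0008 & .0009 \\ & .0105 & -.0008 \\ & & .0406 \end{pmatrix} & w_m = \begin{pmatrix} .2822 \\ .3716 \\ .3462 \end{pmatrix} & \begin{array}{l} E_a(r_m) = .6283, \; \sigma_{a,m} = .0848 \end{array}
\end{array}
$$

$$\beta = (0.5390, \; 0.4681, \; 1.9468)^T$$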

belief $\mathcal{B}_a$, $w_m$ is the market portfolio weight vector, and $E_a(r_m)$ and $\sigma_{a,m}$ are the market return and volatility under the market belief, respectively. Finally, $\beta$ is the vector of beta coefficients. According to these definitions, we obtain the results in Table 3.
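The published tables can be used for a quick consistency check of the price formula (31) and the zero-beta relation implied by (35), namely $E_a(r_j) - (\lambda_a^* - 1) = \beta_j [E_a(r_m) - (\lambda_a^* - 1)]$. The sketch below uses only the rounded numbers from Tables 2 and 3 and Example D.1; the helper names are illustrative, and small discrepancies are expected from rounding.

```python
def mat_vec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def symmetric(upper):
    """Build a full symmetric matrix from its upper-triangular rows."""
    n = len(upper)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for k, value in enumerate(row):
            full[i][i + k] = full[i + k][i] = value
    return full


# Table 2: consensus belief in payoffs; Example D.1: I = 2, z_m = (1, 1, 1).
n_investors = 2
lambda_a, theta_a = 1.5083, 1.6667
y_a = [8.88, 11.63, 12.06]
omega_a = symmetric([[0.4383, -0.0356, 0.0352], [0.5783, -0.0417], [1.9472]])
z_m = [1.0, 1.0, 1.0]

# Equation (31): p_0 = (y_a - (theta_a / I) * Omega_a z_m) / lambda_a.
cov_with_market = mat_vec(omega_a, z_m)
p0 = [(y - theta_a / n_investors * c) / lambda_a for y, c in zip(y_a, cov_with_market)]

# Table 3: consensus belief in returns and the market portfolio weights.
exp_ret = [0.5729, 0.5644, 0.7418]
v_a = symmetric([[0.0138, -0.0008, 0.0009], [0.0105, -0.0008], [0.0406]])
w_m = [0.2822, 0.3716, 0.3462]

# beta = V_a w_m / sigma_{a,m}^2 and the market return under the consensus belief.
var_m = dot(w_m, mat_vec(v_a, w_m))
beta = [c / var_m for c in mat_vec(v_a, w_m)]
exp_ret_m = dot(exp_ret, w_m)

# Residual of the zero-beta CAPM relation for each asset (should be near zero).
zero_beta_rate = lambda_a - 1.0
capm_gap = [(e - zero_beta_rate) - b * (exp_ret_m - zero_beta_rate)
            for e, b in zip(exp_ret, beta)]

print([round(p, 4) for p in p0])
print([round(b, 4) for b in beta])
print([round(g, 4) for g in capm_gap])
```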


References

1. Abel, A. (2002), ‘An exploration of the effects of pessimism and doubt on asset returns’, Journal of Economic Dynamics and Control 26, 1075–1092.
2. Black, F. (1972), ‘Capital market equilibrium with restricted borrowing’, Journal of Business 45, 444–454.
3. Bohm, V. and Chiarella, C. (2005), ‘Mean variance preferences, expectations formation, and the dynamics of random asset prices’, Mathematical Finance 15, 61–97.
4. Bohm, V. and Wenzelburger, J. (2005), ‘On the performance of efficient portfolios’, Journal of Economic Dynamics and Control 29, 721–740.
5. Brown, A. and Rogers, C. (2009), Diverse beliefs, Preprint, Statistical Laboratory, University of Cambridge.
6. Chiarella, C., Dieci, R. and Gardini, L. (2005), ‘The dynamic interaction of speculation and diversification’, Applied Mathematical Finance 12(1), 17–52.
7. Chiarella, C., Dieci, R. and He, X. (2010), ‘Do heterogeneous beliefs diversify market risk?’, European Journal of Finance. Forthcoming.
8. Chiarella, C., Dieci, R. and He, X. (2007), ‘Heterogeneous expectations and speculative behaviour in a dynamic multi-asset framework’, Journal of Economic Behavior and Organization 62, 402–427.
9. Chiarella, C., Dieci, R. and He, X. (2009), Heterogeneity, Market Mechanisms and Asset Price Dynamics, in Handbook of Financial Markets: Dynamics and Evolution, Eds. Hens, T. and K. R. Schenk-Hoppe, Elsevier, pp. 277–344.
10. Detemple, J. and Murthy, S. (1994), ‘Intertemporal asset pricing with heterogeneous beliefs’, Journal of Economic Theory 62, 294–320.
11. Diether, K., Malloy, C. and Scherbina, A. (2002), ‘Differences of opinion and cross section of stock returns’, Journal of Finance 57, 2113–2141.
12. Easley, D. and O’Hara, M. (2004), ‘Information and the cost of capital’, Journal of Finance 59, 1553–1583.
13. Fan, S. (2003), GCAPM(I): A microeconomic theory of investments, SSRN working paper series, Fan Asset Management LLC and Institutional Financial Analytics.
14. Hara, C. (2009), Heterogeneous impatience in a continuous-time model, Preprint, Institute of Economic Research, Kyoto University.
15. He, X. and Shi, L. (2010), Differences in opinion and risk premium, Technical Report 271, Quantitative Finance Research Centre, University of Sydney.
16. He, X. and Shi, L. (2009), Portfolio analysis and zero-beta CAPM with heterogeneous beliefs, Technical Report 244, Quantitative Finance Research Centre, University of Technology, Sydney.
17. Huang, C.-F. and Litzenberger, R. (1988), Foundations for Financial Economics, Elsevier, North-Holland.
18. Jouini, E. and Napp, C. (2006a), ‘Aggregation of heterogeneous beliefs’, Journal of Mathematical Economics 42, 752–770.
19. Jouini, E. and Napp, C. (2006b), ‘Heterogeneous beliefs and asset pricing in discrete time: An analysis of pessimism and doubt’, Journal of Economic Dynamics and Control 30, 1233–1260.
20. Jouini, E. and Napp, C. (2007), ‘Consensus consumer and intertemporal asset pricing with heterogeneous beliefs’, Review of Economic Studies 74, 1149–1174.
21. Kurz, M. (2009), Rational Diverse Beliefs and Economic Volatility, in Handbook of Financial Markets: Dynamics and Evolution, Eds. Hens, T. and K. R. Schenk-Hoppe, Elsevier, pp. 439–506.
22. Lintner, J. (1965), ‘The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets’, Review of Economics and Statistics 47, 13–37.
23. Lintner, J. (1969), ‘The aggregation of investor’s diverse judgements and preferences in purely competitive security markets’, Journal of Financial and Quantitative Analysis 4, 347–400.
24. Miller, E. (1977), ‘Risk, uncertainty, and divergence of opinion’, Journal of Finance 32, 1151–1168.
25. Mossin, J. (1966), ‘Equilibrium in a capital asset market’, Econometrica 34, 768–783.
26. Rubinstein, M. (1975), ‘Security market efficiency in an Arrow-Debreu economy’, American Economic Review 65, 812–824.
27. Rubinstein, M. (1976), ‘The strong case for the generalized logarithmic utility model as the premier model of financial markets’, Journal of Finance 31, 551–571.
28. Sharpe, W. (1964), ‘Capital asset prices: A theory of market equilibrium under conditions of risk’, Journal of Finance 19, 425–442.
29. Sharpe, W. (2007), Investors and Markets, Portfolio Choice, Asset Prices, and Investment Advice, Princeton.
30. Sun, N. and Yang, Z. (2003), ‘Existence of equilibrium and zero-beta pricing formula in the capital asset pricing model with heterogeneous beliefs’, Annals of Economics and Finance 4, 51–71.
31. Wenzelburger, J. (2004), ‘Learning to predict rationally when beliefs are heterogeneous’, Journal of Economic Dynamics and Control 28, 2075–2104.
32. Williams, J. (1977), ‘Capital asset prices with heterogeneous beliefs’, Journal of Financial Economics 5, 219–239.
33. Zapatero, F. (1998), ‘Effects of financial innovations on market volatility when beliefs are heterogeneous’, Journal of Economic Dynamics and Control 22, 597–626.


May 3, 2010 15:25 Proceedings Trim Size: 9in x 6in 006

Security Pricing with Information-Sensitive Discounting∗

Andrea Macrina1,2,† and Priyanka A. Parbhoo3

1Department of Mathematics, King’s College London, London WC2R 2LS, United Kingdom

2Institute of Economic Research, Kyoto University, Kyoto 606-8501, Japan

3School of Computational and Applied Mathematics, University of the Witwatersrand, Johannesburg, Private Bag-3, Wits 2050, South Africa

E-mail: [email protected]

In this paper, incomplete-information models are developed for the pricing of securities in a stochastic interest rate setting. In particular, we consider credit-risky assets that may include random recovery upon default. The market filtration is generated by a collection of information processes associated with economic factors, on which interest rates depend, and information processes associated with market factors used to model the cash flows of the securities. We use information-sensitive pricing kernels to give rise to stochastic interest rates. Semi-analytical expressions for the price of credit-risky bonds are derived, and a number of recovery models are constructed which take into account the perceived state of the economy at the time of default. The price of a European-style call bond option is deduced, and it is shown how examples of hybrid securities, like inflation-linked credit-risky bonds, can be valued. Finally, a cumulative information process is employed to develop pricing kernels that respond to the amount of aggregate debt of an economy.

Keywords: Asset pricing, incomplete information, stochastic interest rates, credit risk, recovery models, credit-inflation hybrid securities, information-sensitive pricing kernels.

∗The authors thank D.C. Brody, M.H.A. Davis, C. Hara, T. Honda, E. Hoyle, R. Miura, H. Nakagawa, K. Ohashi, J. Sekine, K. Tanaka and participants in the KIER/TMU 2009 International Workshop on Financial Engineering for useful comments. We are in particular grateful to J. Akahori and L.P. Hughston for helpful suggestions at an early stage of this work. P. A. Parbhoo thanks the Institute of Economic Research, Kyoto University, for its hospitality, and acknowledges financial support from the Programme in Advanced Mathematics of Finance at the University of the Witwatersrand and the National Research Foundation, South Africa.

†Corresponding author.


1. Introduction

The information-based framework developed by Brody et al. in [7] and [8] is a method to price assets based on incomplete information available to market participants about the cash flows of traded assets. In this approach the value of a number of different types of assets can be derived by modelling the random cash flows defining the asset, and by explicitly constructing the market filtration that is generated by the incomplete information about independent market factors that build the cash flows. This principle has been used in [7] to derive the price processes of credit-risky securities, in [8] to value equity-type assets with various dividend structures, in [9] to price insurance and reinsurance products, and in [6] to price assets in a market with asymmetric information. However, for simplicity, in this framework it is typically assumed that interest rates are deterministic.

One of the earliest generalizations of the models developed in [7] to include stochastic interest rates can be found in [19]. Here, it is assumed that the filtration is generated jointly by the information processes associated with the future random cash flows of a defaultable bond and by an independent Brownian motion that drives the stochastic discount factor.

Pricing kernel models for interest rates have been studied by the authors of [10], [15] and [18], among others. In such models, the price $P_{tT}$ at time $t$ of a sovereign bond with maturity $T$ and unit payoff is given by the formula

$$P_{tT} = \frac{\mathbb{E}^{\mathbb{P}}[\pi_T \,|\, \mathcal{F}_t]}{\pi_t}, \tag{1}$$

where $\{\pi_t\}_{t \ge 0}$ is the $\{\mathcal{F}_t\}$-adapted pricing kernel process and $\mathbb{P}$ denotes the real probability measure. Given the filtration $\{\mathcal{F}_t\}_{t \ge 0}$, arbitrage-free interest rate models can be obtained by specifying the dynamics of the pricing kernel. In particular, term structure models with positive interest rates are generated by requiring that $\{\pi_t\}$ is a positive supermartingale. A more recent approach to constructing interest rate models in an information-based setting, presented in [14], develops the notion of an information-sensitive pricing kernel. The pricing kernel is modelled by a function of time and information processes that are observed by market participants and that over time reveal genuine information about economic factors at a certain rate. In order to obtain positive interest rate models, this function must be chosen so that the pricing kernel has the supermartingale property. A scheme for generating appropriate functions to construct such pricing kernels in an information-based approach is considered in [2]. Incomplete information about economic factors that is available to investors is modelled in [2] by using time-inhomogeneous Markov processes. The Brownian bridge information process considered in [14] and, more generally, the subclass of the continuous Lévy random bridges, recently introduced in [12], are examples of time-inhomogeneous Markov processes.


In this paper we describe how credit-risky securities can be priced within the framework considered in [7] while including a stochastic discount factor by use of information-sensitive pricing kernels. To this end, we proceed in Section 2 to recap briefly the theory for the pricing of fixed-income securities in an information-based framework described in [14]. In Section 3 we recall the result in [2] that can be used to obtain the explicit dynamics of the pricing kernel by use of so-called “weighted heat kernels” with time-inhomogeneous Markov processes. In Section 4, we derive the price process of a defaultable discount bond and compute the yield spreads between digital bonds and sovereign bonds. Section 5 considers a number of random recovery models for defaultable bonds, and in the following section we derive a semi-analytical formula for the price of a European option on a credit-risky bond. In Section 7 we demonstrate how to price credit-inflation securities as an example of a hybrid structure. We investigate the valuation of credit-risky coupon bonds in Section 8 and conclude by considering a pricing kernel that reacts to the level of debt accumulated in a country over a finite period of time.

2. Information-Sensitive Pricing Kernels

We define the probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, \mathbb{P})$, where $\mathbb{P}$ denotes the real probability measure. We fix two dates $T$ and $U$, where $T < U$, and introduce a macroeconomic random variable $X_U$, the value of which is revealed at time $U$. Noisy information about the economic factor available to market participants is modelled by the information process $\{\xi_{tU}\}_{0 \le t \le U}$ given by

$$\xi_{tU} = \sigma t X_U + \beta_{tU}. \tag{2}$$

Here the parameter $\sigma$ represents the information flow rate at which the true value of $X_U$ is revealed as time progresses, and the noise component $\{\beta_{tU}\}_{0 \le t \le U}$ is a Brownian bridge that is taken to be independent of $X_U$. We assume that the market filtration $\{\mathcal{F}_t\}_{t \ge 0}$ is generated by $\{\xi_{tU}\}$, and note that it is shown in, e.g., [7] that $\{\xi_{tU}\}$ is a Markov process with respect to its natural filtration. We consider pricing kernels $\{\pi_t\}$ that are of the form

$$\pi_t = M_t f(t, \xi_{tU}), \tag{3}$$

where $\{M_t\}_{0 \le t < U}$ is the density martingale associated with a change of measure from $\mathbb{P}$ to the so-called “bridge measure” $\mathbb{B}$ under which the information process has the law of a Brownian bridge. It is proven in [7] that $\{M_t\}$ satisfies the differential equation

$$dM_t = -\sigma \frac{U}{U - t} \mathbb{E}^{\mathbb{P}}[X_U \,|\, \xi_{tU}] \, M_t \, dW_t, \tag{4}$$

where $\{W_t\}_{0 \le t < U}$ is an $(\{\mathcal{F}_t\}, \mathbb{P})$-Brownian motion given by

$$W_t = \xi_{tU} + \int_0^t \frac{1}{U - s} \xi_{sU} \, ds - \sigma \int_0^t \frac{U}{U - s} \mathbb{E}^{\mathbb{P}}[X_U \,|\, \xi_{sU}] \, ds. \tag{5}$$
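To make the information process (2) concrete, the following sketch simulates one path of $\xi_{tU} = \sigma t X_U + \beta_{tU}$ on a time grid. The Brownian bridge is built from a simulated Brownian motion $B$ via $\beta_{tU} = B_t - (t/U) B_U$, which is one standard construction; the parameter values and the function name are illustrative assumptions, not taken from the paper.

```python
import math
import random


def simulate_information_process(x_u, sigma, horizon, n_steps, seed=0):
    """Return the time grid and one path of xi_t = sigma * t * X_U + beta_t on [0, U]."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    times = [k * dt for k in range(n_steps + 1)]
    # Brownian motion on the grid.
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    # Brownian bridge pinned to zero at t = 0 and t = U.
    bridge = [w_t - (t / horizon) * w[-1] for t, w_t in zip(times, w)]
    # Signal plus noise: the noise vanishes at U, so X_U is revealed there.
    xi = [sigma * t * x_u + b for t, b in zip(times, bridge)]
    return times, xi


times, xi = simulate_information_process(x_u=1.2, sigma=0.5, horizon=2.0, n_steps=200)
print(xi[-1])  # equals sigma * U * X_U, since the bridge term is zero at U
```

Because the bridge vanishes at $U$, the terminal value of the path is $\sigma U X_U$, which is how the process reveals $X_U$ at time $U$.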


By applying the Bayes change-of-measure formula to equation (1), we can express the price $P_{tT}$ at time $t$ of a sovereign discount bond with maturity $T$ by

$$P_{tT} = \frac{\mathbb{E}^{\mathbb{B}}[f(T, \xi_{TU}) \,|\, \xi_{tU}]}{f(t, \xi_{tU})}. \tag{6}$$

Next we introduce the random variable $Y_{tT}$ defined by

$$Y_{tT} = \xi_{TU} - \frac{U - T}{U - t} \xi_{tU}, \tag{7}$$

and observe that under the measure $\mathbb{B}$, $Y_{tT}$ is a Gaussian random variable with zero mean and variance given by

$$v_{tT}^2 = \frac{(T - t)(U - T)}{U - t}. \tag{8}$$

It can be verified that $Y_{tT}$ is independent of $\xi_{tU}$ under $\mathbb{B}$, see [14]. Next, we introduce a Gaussian random variable $Y$ with zero mean and unit variance; this allows us to write $Y_{tT} = v_{tT} Y$. Since $\xi_{tU}$ is $\mathcal{F}_t$-measurable and $Y$ is independent of $\xi_{tU}$, we can express the price of a sovereign bond by the following Gaussian integral:

$$P_{tT} = \frac{1}{f(t, \xi_{tU})} \int_{-\infty}^{\infty} f\Bigl( T, v_{tT} y + \frac{U - T}{U - t} \xi_{tU} \Bigr) \frac{1}{\sqrt{2\pi}} \exp\Bigl( -\tfrac{1}{2} y^2 \Bigr) dy. \tag{9}$$
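Formula (9) lends itself to direct numerical evaluation. The sketch below approximates the Gaussian integral with a trapezoidal rule on a truncated range; the function name `bond_price` and the test function `flat_kernel` are illustrative assumptions. With $f(t, x) = e^{-rt}$, which does not depend on $x$, the integral collapses and the price should reduce to the deterministic discount factor $e^{-r(T - t)}$, which gives a simple sanity check.

```python
import math


def bond_price(f, t, T, U, xi_t, n_steps=2000, y_max=8.0):
    """Approximate equation (9) with the trapezoidal rule on [-y_max, y_max]."""
    v = math.sqrt((T - t) * (U - T) / (U - t))   # standard deviation from (8)
    shift = (U - T) / (U - t) * xi_t             # conditional mean of xi_TU under B
    h = 2.0 * y_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        y = -y_max + k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        density = math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)
        total += weight * f(T, v * y + shift) * density
    return h * total / f(t, xi_t)


RATE = 0.03


def flat_kernel(t, x):
    """Hypothetical test function f(t, x) = exp(-r t), independent of x."""
    return math.exp(-RATE * t)


price = bond_price(flat_kernel, t=0.5, T=1.0, U=2.0, xi_t=0.3)
print(price, math.exp(-RATE * 0.5))
```

Any other function $f$ can be passed in the same way, although the resulting rates are guaranteed to be positive only when $f$ makes the pricing kernel a supermartingale.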

Interest rate models of various types can therefore be constructed in this framework by specifying the function $f(t, x)$. However, pricing kernels constructed by the relation (3) are not automatically $(\{\mathcal{F}_t\}, \mathbb{P})$-supermartingales. In particular, to guarantee positive interest rates, it is a requirement that the function $f(t, x)$ satisfies the following differential inequality, see [14]:

$$\frac{x}{U - t} \, \partial_x f(t, x) - \frac{1}{2} \, \partial_x^2 f(t, x) - \partial_t f(t, x) > 0. \tag{10}$$
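A candidate function can be screened against inequality (10) numerically before it is used in a model. The sketch below evaluates the left-hand side of (10) with central finite differences on a grid; the grid, the step size and the two candidate functions are illustrative assumptions. A grid check of this kind can reveal violations but does not prove that the inequality holds everywhere.

```python
import math


def supermartingale_lhs(f, t, x, U, h=1e-4):
    """Left-hand side of inequality (10), using central finite differences."""
    df_dx = (f(t, x + h) - f(t, x - h)) / (2.0 * h)
    d2f_dx2 = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / (h * h)
    df_dt = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    return x / (U - t) * df_dx - 0.5 * d2f_dx2 - df_dt


def satisfies_inequality(f, U, t_grid, x_grid):
    """True if the left-hand side of (10) is positive at every grid point."""
    return all(supermartingale_lhs(f, t, x, U) > 0.0 for t in t_grid for x in x_grid)


U = 2.0
t_grid = [0.1 * k for k in range(1, 19)]      # times strictly inside (0, U)
x_grid = [-3.0 + 0.1 * k for k in range(61)]  # values of the information process


def decaying(t, x):
    """Hypothetical candidate: pure exponential decay in time."""
    return math.exp(-0.05 * t)


def oscillating(t, x):
    """Hypothetical candidate that is positive but violates (10) for some x."""
    return math.exp(-0.05 * t) * (2.0 + math.cos(x))


ok_decaying = satisfies_inequality(decaying, U, t_grid, x_grid)
ok_oscillating = satisfies_inequality(oscillating, U, t_grid, x_grid)
print(ok_decaying, ok_oscillating)
```

The second candidate illustrates that positivity of $f$ alone is not enough: the drift term $x\,\partial_x f/(U - t)$ can dominate and make the left-hand side negative.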

We emphasize that finding a function which satisfies relation (10) is equivalent to finding a process $\{f(t, \xi_{tU})\}_{0 \le t < U}$ that is a positive $(\{\mathcal{F}_t\}, \mathbb{B})$-supermartingale. Hence the pricing kernel $\{\pi_t\}_{0 \le t < U}$ is a positive $(\{\mathcal{F}_t\}, \mathbb{P})$-supermartingale since

$$\mathbb{E}^{\mathbb{P}}[\pi_T \,|\, \mathcal{F}_t] = M_t \, \mathbb{E}^{\mathbb{B}}[f(T, \xi_{TU}) \,|\, \xi_{tU}] \le M_t f(t, \xi_{tU}) = \pi_t. \tag{11}$$

We now proceed to construct such positive $(\{\mathcal{F}_t\}, \mathbb{B})$-supermartingales using a technique known as the “weighted heat kernel approach”, presented in [1] and adapted for time-inhomogeneous Markov processes in [2].


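3. Weighted Heat Kernel Models

We consider the filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}, \mathbb{P})$ where the filtration $\{\mathcal{F}_t\}_{t \ge 0}$ is generated by the information process $\{\xi_{tU}\}$. We recall that the martingale $\{M_t\}$ satisfying equation (4) induces a change of measure from $\mathbb{P}$ to the bridge measure $\mathbb{B}$, and that the information process $\{\xi_{tU}\}$ is a Brownian bridge under $\mathbb{B}$. The Brownian bridge is a time-inhomogeneous Markov process with respect to its own filtration.

Let $w : \mathbb{R}_0^+ \times \mathbb{R}_0^+ \to \mathbb{R}^+$ be a weight function that satisfies

$$w(t, u - s) \le w(t - s, u) \tag{12}$$

for arbitrary $t, u \in \mathbb{R}_0^+$ and $s \le t \wedge u$. Then, for $t < U$ and a positive integrable function $F(x)$, the process $\{f(t, \xi_{tU})\}$ given by

$$f(t, \xi_{tU}) = \int_0^{U - t} \mathbb{E}^{\mathbb{B}}[F(\xi_{t+u,U}) \,|\, \xi_{tU}] \, w(t, u) \, du \tag{13}$$

is a positive supermartingale.

The proof of this result goes as follows. For $f(t, x)$ an integrable function, the process $\{f(t, \xi_{tU})\}$ is a supermartingale for $0 \le s \le t < U$ if

$$\mathbb{E}^{\mathbb{B}}[f(t, \xi_{tU}) \,|\, \xi_{sU}] \le f(s, \xi_{sU}) \tag{14}$$

is satisfied. We define the process $\{p(t, u, \xi_{tU})\}$ by

$$p(t, u, \xi_{tU}) = \mathbb{E}^{\mathbb{B}}\bigl[ F(\xi_{t+u,U}) \,|\, \xi_{tU} \bigr], \tag{15}$$

where $0 \le u \le U - t$. Then we have:

$$
\begin{aligned}
\mathbb{E}^{\mathbb{B}}[f(t, \xi_{tU}) \,|\, \xi_{sU}] &= \int_0^{U - t} \mathbb{E}^{\mathbb{B}}[p(t, u, \xi_{tU}) \,|\, \xi_{sU}] \, w(t, u) \, du \\
&= \int_0^{U - t} p(s, u + t - s, \xi_{sU}) \, w(t, u) \, du \\
&= \int_{t - s}^{U - s} p(s, v, \xi_{sU}) \, w(t, v - t + s) \, dv.
\end{aligned}
\tag{16}
$$

Here we have used the tower rule of conditional expectation and the Markov property of $\{\xi_{tU}\}$. Next we make use of the relation (12) to obtain

$$
\begin{aligned}
\mathbb{E}^{\mathbb{B}}[f(t, \xi_{tU}) \,|\, \xi_{sU}] &\le \int_{t - s}^{U - s} p(s, v, \xi_{sU}) \, w(t - (t - s), v) \, dv \\
&\le \int_0^{U - s} p(s, v, \xi_{sU}) \, w(s, v) \, dv \\
&= f(s, \xi_{sU}).
\end{aligned}
\tag{17}
$$

Thus $\{f(t, \xi_{tU})\}$ is a positive $(\{\mathcal{F}_t\}, \mathbb{B})$-supermartingale if $F(x)$ is positive.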


The method based on equation (13) provides one with a convenient way to generate positive pricing kernels driven by the information process $\{\xi_{tU}\}$. These models can be used to generate information-sensitive dynamics of positive interest rates. In particular, the functions $f(t, x)$ underlying such interest rate models satisfy inequality (10).
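The construction (13) can be tried out numerically. The sketch below makes two hypothetical choices that are not taken from the paper: $F(x) = 1 + x^2$ and the weight $w(t, u) = U - t - u$, which depends on $t + u$ only and therefore satisfies (12) with equality. It uses the fact that, under $\mathbb{B}$, $\xi_{t+u,U}$ given $\xi_{tU} = x$ is Gaussian with mean $\frac{U - t - u}{U - t} x$ and variance $\frac{u (U - t - u)}{U - t}$, consistent with (7) and (8). The resulting $f$ is then inserted into the bond price integral (9).

```python
import math

U = 2.0  # date at which X_U is revealed


def trapezoid(g, a, b, n):
    """Composite trapezoidal rule for g on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * (0.5 * g(a) + sum(g(a + k * h) for k in range(1, n)) + 0.5 * g(b))


def expected_F(t, u, x):
    """E^B[F(xi_{t+u,U}) | xi_tU = x] for the hypothetical choice F(x) = 1 + x^2."""
    mean = (U - t - u) / (U - t) * x
    var = u * (U - t - u) / (U - t)
    return 1.0 + mean * mean + var


def weight(t, u):
    """Hypothetical weight w(t, u) = U - t - u; a function of t + u only."""
    return U - t - u


def f(t, x, n=200):
    """Equation (13), integrating over u with the trapezoidal rule."""
    return trapezoid(lambda u: expected_F(t, u, x) * weight(t, u), 0.0, U - t, n)


def bond_price(t, T, x, n=400, y_max=8.0):
    """Equation (9) with f generated from (13)."""
    v = math.sqrt((T - t) * (U - T) / (U - t))
    shift = (U - T) / (U - t) * x

    def integrand(y):
        return f(T, v * y + shift) * math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

    return trapezoid(integrand, -y_max, y_max, n) / f(t, x)


price = bond_price(t=0.5, T=1.0, x=0.3)
print(price)
```

Since the resulting $f(t, \xi_{tU})$ is a positive supermartingale under $\mathbb{B}$, the computed price should lie strictly between 0 and 1, that is, the implied interest rate is positive.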

4. Credit-Risky Discount Bonds

We introduce two dates $T$ and $U$, where $T < U$, and attach two independent factors $X_T$ and $X_U$ to these dates respectively. We assume that $X_T$ is a discrete random variable that takes values in $\{x_0, x_1, \ldots, x_n\}$ with a priori probabilities $\{p_0, p_1, \ldots, p_n\}$, where $1 \ge x_n > x_{n-1} > \ldots > x_1 > x_0 \ge 0$. We take $X_T$ to be the random variable by which the future payoff of a credit-risky bond issued by a firm is modelled. The second random variable $X_U$ is assumed to be continuous and represents a macroeconomic factor. For instance, one might consider the GDP level at time $U$ of an economy in which the bond is issued. With the two $X$-factors, we associate the independent information processes $\{\xi_{tT}\}_{0 \le t \le T}$ and $\{\xi_{tU}\}_{0 \le t \le U}$ given by

$$\xi_{tU} = \sigma_1 t X_U + \beta_{tU}, \qquad \xi_{tT} = \sigma_2 t X_T + \beta_{tT}. \tag{18}$$

The market filtration $\{\mathcal{F}_t\}$ is generated by both information processes $\{\xi_{tT}\}$ and $\{\xi_{tU}\}$. The price $B_{tT}$ at $t \le T$ of a defaultable discount bond with payoff $H_T$ at $T < U$ can be written in the form

$$B_{tT} = \frac{\mathbb{E}^{\mathbb{P}}[\pi_T H_T \,|\, \mathcal{F}_t]}{\pi_t} \tag{19}$$

where $\{\pi_t\}$ is the pricing kernel. We consider the positive martingale $\{M_t\}_{0 \le t < U}$ that satisfies

$$dM_t = -\sigma_1 \frac{U}{U - t} \mathbb{E}^{\mathbb{P}}[X_U \,|\, \xi_{tU}] \, M_t \, dW_t, \tag{20}$$

and introduce the pricing kernel $\{\pi_t\}$ given by

$$\pi_t = M_t f(t, \xi_{tU}). \tag{21}$$

The dependence of the pricing kernel on $\{\xi_{tU}\}$ implies that interest rates fluctuate due to the information flow in the market about the likely value of the macroeconomic factor $X_U$ at time $U$. Since the information processes are Markovian, the price of the defaultable discount bond can be expressed by

$$B_{tT} = \frac{\mathbb{E}^{\mathbb{P}}\bigl[ M_T f(T, \xi_{TU}) H_T \,\big|\, \xi_{tT}, \xi_{tU} \bigr]}{M_t f(t, \xi_{tU})}, \tag{22}$$


whereHT is the bond payoff at maturityT. We now suppose that the payoff ofthe credit-risky bond is a function ofXT and the value of the information processassociated withXU at the bond’s maturityT, that is

HT = H (XT , ξTU) . (23)

Due to the independence property of the information processes, the price of thecredit-risky discount bond can be written as follows:

BtT =EP[

EP[

MT f (T, ξTU)H(XT , ξTU)∣

∣ ξtT] ∣

∣ ξtU]

Mt f (t, ξtU ). (24)

By applying the conditional form of Bayes formula, we change the measure to thebridge measureB with respect to which the outer expectation is taken:

BtT =EB[

EP[

f (T, ξTU)H(XT , ξTU)∣

∣ ξtT] ∣

∣ ξtU]

f (t, ξtU ). (25)

At this stage, we define a random variable $Y_{tT}$ by

$$Y_{tT} = \xi_{TU} - \frac{U-T}{U-t}\, \xi_{tU}. \tag{26}$$

Since $\{\xi_{tU}\}$ is a Brownian bridge under $\mathbb{B}$, we know that $Y_{tT}$ is a Gaussian random variable with zero mean and variance

$$\mathrm{Var}^{\mathbb{B}}[Y_{tT}] = \frac{(T-t)(U-T)}{U-t}. \tag{27}$$
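The variance in (27) can be checked directly from the covariance of a Brownian bridge. The following is a minimal sketch, assuming the bridge on $[0, U]$ is standard, so that its covariance at times $s \le t$ is $s(U-t)/U$; the function names `bridge_cov` and `var_Y` are illustrative and not part of the original text.

```python
def bridge_cov(s, t, U):
    """Covariance of a standard Brownian bridge on [0, U] at times s <= t."""
    return s * (U - t) / U


def var_Y(t, T, U):
    """Variance of Y_tT = xi_TU - c * xi_tU with c = (U - T) / (U - t), for t <= T < U.

    Expands Var[A - c B] = Var[A] - 2 c Cov[A, B] + c^2 Var[B]
    using the bridge covariance.
    """
    c = (U - T) / (U - t)
    return (bridge_cov(T, T, U)
            - 2.0 * c * bridge_cov(t, T, U)
            + c * c * bridge_cov(t, t, U))


def var_closed_form(t, T, U):
    """Right-hand side of equation (27)."""
    return (T - t) * (U - T) / (U - t)
```

For any $0 \le t \le T < U$ the two functions should agree up to rounding, which is the content of (27).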

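Next we introduce a standard Gaussian random variable $Y$ and write $Y_{tT} = \nu_{tT} Y$, where $\nu_{tT}^2 = \mathrm{Var}^{\mathbb{B}}[Y_{tT}]$. We can now express the price of the defaultable discount bond in terms of $Y$ as

$$B_{tT} = \frac{\mathbb{E}^{\mathbb{B}}\left[ \mathbb{E}^{\mathbb{P}}\left[ f\left(T, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) H\left(X_T, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) \,\Big|\, \xi_{tT} \right] \,\Big|\, \xi_{tU} \right]}{f(t, \xi_{tU})}. \tag{28}$$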

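Since $f(T, Y, \xi_{tU})$ in the numerator does not depend on $\xi_{tT}$, we can write

$$B_{tT} = \frac{\mathbb{E}^{\mathbb{B}}\left[ f\left(T, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) \mathbb{E}^{\mathbb{P}}\left[ H\left(X_T, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) \,\Big|\, \xi_{tT} \right] \,\Big|\, \xi_{tU} \right]}{f(t, \xi_{tU})}. \tag{29}$$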

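Because both $Y$ and $\xi_{tU}$ are independent of $\xi_{tT}$, the inner conditional expectation in this expression can be carried out explicitly. We obtain

$$B_{tT} = \frac{\mathbb{E}^{\mathbb{B}}\left[ f\left(T, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) \sum_{i=0}^{n} \pi_{it}\, H\left(x_i, \nu_{tT} Y + \tfrac{U-T}{U-t}\xi_{tU}\right) \,\Big|\, \xi_{tU} \right]}{f(t, \xi_{tU})}, \tag{30}$$

where $\pi_{it}$ denotes the conditional density of $X_T$, given by

$$\pi_{it} = \mathbb{P}\left[ X_T = x_i \mid \xi_{tT} \right] = \frac{p_i \exp\left[ \frac{T}{T-t}\left( \sigma_2 x_i \xi_{tT} - \frac{1}{2}\sigma_2^2 x_i^2 t \right) \right]}{\sum_{i=0}^{n} p_i \exp\left[ \frac{T}{T-t}\left( \sigma_2 x_i \xi_{tT} - \frac{1}{2}\sigma_2^2 x_i^2 t \right) \right]}. \tag{31}$$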
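The conditional probabilities in (31) are straightforward to evaluate numerically. The sketch below is one possible implementation, assuming strictly positive a priori probabilities; the function name `posterior_probs` and the log-weight normalisation (used only to avoid overflow when $t$ is close to $T$) are choices made here and do not come from the original text.

```python
import math


def posterior_probs(xs, ps, sigma, xi, t, T):
    """Conditional probabilities pi_it = P[X_T = x_i | xi_tT] as in equation (31).

    xs    : possible values x_0, ..., x_n of X_T
    ps    : a priori probabilities p_0, ..., p_n (assumed strictly positive)
    sigma : information flow rate (sigma_2 in the text)
    xi    : observed value of the information process xi_tT
    """
    if not 0.0 <= t < T:
        raise ValueError("need 0 <= t < T")
    scale = T / (T - t)
    # log of p_i * exp[T/(T-t) * (sigma x_i xi - 0.5 sigma^2 x_i^2 t)]
    logw = [math.log(p) + scale * (sigma * x * xi - 0.5 * sigma ** 2 * x ** 2 * t)
            for x, p in zip(xs, ps)]
    m = max(logw)  # subtract the maximum before exponentiating, for stability
    w = [math.exp(lw - m) for lw in logw]
    total = sum(w)
    return [wi / total for wi in w]
```

At $t = 0$ the information process vanishes and the function returns the a priori probabilities, as expected from (31).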

Finally, since the random variable $\xi_{tU}$, appearing in the arguments of $f(T, Y, \xi_{tU})$ and of $H(Y, \xi_{tU})$ in (30), is measurable at time $t$ and $Y$ is independent of the conditioning random variable $\xi_{tU}$, the conditional expectation reduces to a Gaussian integral over the range of the random variable $Y$:

$$B_{tT} = \frac{1}{f(t, \xi_{tU})} \sum_{i=0}^{n} \pi_{it} \int_{-\infty}^{\infty} f\left(T, \nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) H\left(x_i, \nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy. \tag{32}$$
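Once $f$ and $H$ are specified, the integral in (32) can be evaluated numerically. The following is a minimal sketch using composite Simpson's rule on a truncated range; the helper names `gaussian_expectation` and `bond_price`, the truncation to $[-8, 8]$ and the grid size are assumptions made for illustration, and any particular $f$ passed in is hypothetical since the text leaves it unspecified.

```python
import math


def gaussian_expectation(g, lo=-8.0, hi=8.0, n=800):
    """Approximate E[g(Y)] for a standard normal Y by composite Simpson's rule.

    n must be even; the tails beyond [lo, hi] are neglected.
    """
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * g(y) * math.exp(-0.5 * y * y)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)


def bond_price(f, H, xs, pis, xi_U, t, T, U):
    """Price B_tT from equation (32).

    f(t, x) : function entering the pricing kernel
    H(x, z) : payoff function of X_T and xi_TU
    xs, pis : values x_i and conditional probabilities pi_it from (31)
    xi_U    : observed value of xi_tU
    """
    nu = math.sqrt((T - t) * (U - T) / (U - t))   # square root of (27)
    shift = (U - T) / (U - t) * xi_U

    def integrand(y):
        z = nu * y + shift
        return f(T, z) * sum(pi * H(x, z) for x, pi in zip(xs, pis))

    return gaussian_expectation(integrand) / f(t, xi_U)
```

As a sanity check, if $f$ does not depend on its second argument and $H(x, z) = x$, the result collapses to a deterministic discount factor times $\sum_i \pi_{it} x_i$.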

In the case where the payoff is $H_T = X_T$, by using the expression for the sovereign bond given by equation (9), we can write the price of the defaultable bond as:

$$B_{tT} = P_{tT} \sum_{i=0}^{n} x_i\, \pi_{it}, \tag{33}$$

where $\pi_{it}$ is defined by equation (31). For $n = 1$, the defaultable bond pays a principal of $x_1$ units of currency if there is no default, and $x_0$ units of currency in the event of default; we call such an instrument a "binary bond". In particular, if $x_0 = 0$ and $x_1 = 1$, we call such a bond a "digital bond". The price of the digital bond is

$$B_{tT} = P_{tT}\, \pi_{1t}. \tag{34}$$

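We can generalize the above situation slightly by considering a pricing kernel $\{\pi_t\}$ of the form

$$\pi_t = M_t\, f(t, \xi_{tT}, \xi_{tU}). \tag{35}$$

By following the technique in equations (22) to (32), and by using the fact that $\xi_{TT} = \sigma_2 X_T T$, we can show that

$$B_{tT} = \frac{1}{f(t, \xi_{tT}, \xi_{tU})} \sum_{i=0}^{n} \pi_{it} \int_{-\infty}^{\infty} f\left(T, \sigma_2 x_i T, \nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) H\left(x_i, \nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy. \tag{36}$$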


Here we model the situation in which the pricing kernel in the economy is not only a function of information at that time about the macroeconomic variable, but is also dependent on noisy information about potential default of the firm leaked in the market through $\{\xi_{tT}\}$. This is relevant in light of events occurring in financial markets where defaults by big companies can affect interest rates and the market price of risk.

A measure for the excess return provided by a defaultable bond over the return on a sovereign bond with the same maturity is the bond yield spread. This measure is given by the difference between the yields-to-maturity on the defaultable bond and the sovereign bond; see for example [3]. That is:

$$s_{tT} = y^{d}_{tT} - y_{tT} \tag{37}$$

for $t < T$, where $y_{tT}$ and $y^{d}_{tT}$ are the yields associated with the sovereign bond and the credit-risky bond, respectively. We have:

$$s_{tT} = \frac{1}{T-t}\left( \ln P_{tT} - \ln B_{tT} \right). \tag{38}$$

In particular, the bond yield spread between a digital bond and the sovereign bond is given by

$$s_{tT} = -\frac{1}{T-t} \ln \pi_{1t}. \tag{39}$$

For bonds with payoff $H_T = X_T$, we see that the information related to the macroeconomic factor $X_U$ does not influence the spread. Thus for $0 \le t < T$, the spread at time $t$ depends only on the information concerning potential default. In this case, the bond yield spread between the defaultable discount bond and the sovereign bond with stochastic interest rates is of the form of that in the deterministic interest rate setting treated in [7].

Figure 1 shows the bond yield spreads between a digital bond, with all trajectories conditional on the outcome that the bond does not default, and a sovereign bond. The maturities of the bonds are taken to be $T = 2$ years and the a priori probability of default is assumed to be $p_0 = 0.2$. The effect of different values of the information flow parameter is shown by setting $\sigma_2 = 0.04$, $\sigma_2 = 0.2$, $\sigma_2 = 1$ and $\sigma_2 = 5$. Since the paths of the digital bond are conditional on the outcome that default does not occur, we observe that the bond yield spreads must eventually drop to zero. The parameter $\sigma_2$ controls the magnitude of genuine information about potential default that is available to bondholders. For low values of $\sigma_2$, the bondholder is, so to speak, "in the dark" about the outcome until very close to maturity, while for higher values of $\sigma_2$, the bondholder is better informed. As $\sigma_2$ increases, the noisiness in the bond yield spreads, which is indicative of the bondholder's uncertainty of the outcome, becomes less pronounced near maturity.
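Trajectories of the kind described for Figure 1 can be generated by simulating the information process conditional on $X_T = 1$ and applying (31) and (39). The following is an illustrative sketch only: the grid size, the seed, and the function names `log_pi1` and `digital_spread_path` are assumptions made here, and the Brownian bridge is built by pinning a simulated Brownian path to zero at $T$.

```python
import math
import random


def log_pi1(a, p0):
    """log of pi_1t = p1 e^a / (p0 + p1 e^a), written to avoid overflow."""
    p1 = 1.0 - p0
    if a >= 0.0:
        return -math.log1p(p0 / p1 * math.exp(-a))
    return a + math.log(p1) - math.log(p0 + p1 * math.exp(a))


def digital_spread_path(sigma, p0=0.2, T=2.0, steps=200, seed=0):
    """Simulate (t, s_tT) for a digital bond, conditional on no default (X_T = 1)."""
    rng = random.Random(seed)
    dt = T / steps
    # Brownian path on the grid; beta_t = W_t - (t / T) W_T is a bridge on [0, T].
    w = [0.0]
    for _ in range(steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    path = []
    for k in range(steps):  # stop before t = T, where (39) is not defined
        t = k * dt
        beta = w[k] - (t / T) * w[-1]
        xi = sigma * t * 1.0 + beta  # information process with X_T = 1
        # exponent in (31) for x_1 = 1; the x_0 = 0 term has exponent zero
        a = T / (T - t) * (sigma * xi - 0.5 * sigma ** 2 * t)
        path.append((t, -log_pi1(a, p0) / (T - t)))  # equation (39)
    return path
```

With the default arguments, each path starts at $-\ln(1 - p_0)/T$, and for a large information flow rate the spread is close to zero well before maturity.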


Furthermore, if the bondholders in the market were well-informed, they would require a smaller premium for buying the credit-risky bond since its behaviour would be similar to that of the sovereign bond; this is illustrated in Figure 1. It is worth noting that in the information-based asset pricing approach, an increased level of genuine information available to investors about their exposure is manifestly equivalent to a sort of "securitisation" of the risky investments.

The case for which the paths of the digital bond are conditional on default can also be simulated. Here, the effect of increasing the information flow rate parameter $\sigma_2$ is similar. However, the bondholder now requires an infinitely high reward for buying a bond that will be worthless at maturity. Thus the bond yield spread grows to infinity at maturity.

Figure 1. Bond yield spread between a digital bond (with all trajectories conditional on no default) and a sovereign bond. The bonds have maturity $T = 2$ years. The a priori probability of default is taken to be $p_0 = 0.2$. We use (i) $\sigma_2 = 0.04$, (ii) $\sigma_2 = 0.2$, (iii) $\sigma_2 = 1$, and (iv) $\sigma_2 = 5$.

5. Credit-Risky Bonds with Continuous Market-Dependent Recovery

Let us consider the case in which the credit-risky bond pays $H_T = X_T$, where $X_T$ is a discrete random variable which takes values $x_0, x_1, \ldots, x_n \in [0, 1]$ with a priori probabilities $\{p_0, p_1, \ldots, p_n\}$, where $x_n > x_{n-1} > \ldots > x_1 > x_0$. Such a payoff spectrum is a model for random recovery where at bond maturity one out of a discrete number of recovery levels may be realised. We can also consider credit-risky bonds with continuous random recovery in the event of default. In doing so, we introduce the notion of "market-dependent recovery". Suppose that the payoff of the defaultable bond is given by

$$H_T = X_T + (1 - X_T)\, R(\xi_{TU}), \tag{40}$$

where $X_T$ takes the values $\{0, 1\}$ with a priori probabilities $\{p_0, p_1\}$. The recovery level $R : \mathbb{R} \to [0, 1)$ is dependent on the information at time $T$ about the macroeconomic factor $X_U$. In this case, if the credit-risky bond defaults at maturity $T$, the recovery level of the bond depends on the state of the economy at time $U$ that is perceived in the market at time $T$. In other words, if the sentiment in the market at time $T$ is that the economy will have good times ahead, then a firm in a state of default at $T$ may have better chances to raise more capital from liquidation (or restructuring), thus increasing the level of recovery of the issued bond. We can price the cash flow (40) by applying equation (32), with $n = 1$, $x_0 = 0$ and $x_1 = 1$. The result is:

$$B_{tT} = P_{tT}\, \pi_{1t} + \pi_{0t}\, \frac{1}{f(t, \xi_{tU})} \int_{-\infty}^{\infty} f\left(T, \nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) R\left(\nu_{tT}\, y + \frac{U-T}{U-t}\,\xi_{tU}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy, \tag{41}$$

where $P_{tT}$ is given by equation (9). As an example, suppose that we choose the recovery function to be of the form $R(z) = 1 - \exp(-z^2)$. In this case, it is possible to have zero recovery when the value of the information process at time $t$ is $\xi_{tU} = -(U-t)/(U-T)\, \nu_{tT} Y$, thereby capturing the worst-case scenario in which bondholders lose their entire investment in the event of default.
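Formula (41) with the recovery function $R(z) = 1 - \exp(-z^2)$ can be evaluated along the same lines as (32). The sketch below is a hedged illustration: it assumes that the sovereign bond price $P_{tT}$ of equation (9), which is not reproduced in this section, coincides with (32) for the constant payoff $H \equiv 1$, and the names `normal_mean`, `recovery` and `price_with_recovery` are hypothetical.

```python
import math


def normal_mean(g, lo=-8.0, hi=8.0, n=800):
    """E[g(Y)] for standard normal Y via composite Simpson's rule (n even)."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        y = lo + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        total += weight * g(y) * math.exp(-0.5 * y * y)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)


def recovery(z):
    """Example recovery function R(z) = 1 - exp(-z^2), with values in [0, 1)."""
    return 1.0 - math.exp(-z * z)


def price_with_recovery(f, pi1, xi_U, t, T, U, R=recovery):
    """Equation (41): digital bond with market-dependent recovery on default.

    pi1 is the conditional probability of no default, pi_1t, from (31).
    Assumption: P_tT is taken to be (32) with payoff identically one.
    """
    nu = math.sqrt((T - t) * (U - T) / (U - t))
    shift = (U - T) / (U - t) * xi_U
    z = lambda y: nu * y + shift
    sovereign = normal_mean(lambda y: f(T, z(y))) / f(t, xi_U)
    recovered = normal_mean(lambda y: f(T, z(y)) * R(z(y))) / f(t, xi_U)
    return sovereign * pi1 + (1.0 - pi1) * recovered
```

Because $0 \le R < 1$, the resulting price lies between the digital bond price $P_{tT}\pi_{1t}$ and the sovereign bond price $P_{tT}$ whenever $f$ is positive.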

The latter consideration is apt in the situation where the extent of recovery is determined by how difficult it is for the firm to raise capital by liquidating its assets, i.e. the exposure of the firm to the general economic environment. However, this model does not say much about how the quality of the management of the firm may influence recovery in the event of default. This observation brings us to another model of recovery. Default of a firm may be triggered by poor internal practices and (or) tough economic conditions. We now structure recovery by specifying the payoff of the credit-risky bond by

$$H_T = X_C \left[ X_E + (1 - X_E) R_E \right] + (1 - X_C) \left[ X_E R_C + (1 - X_E) R_{CE} \right], \tag{42}$$

where $X_C$ and $X_E$ are random variables taking values in $\{0, 1\}$ with a priori probabilities $\{p^C_0, p^C_1\}$ and $\{p^E_0, p^E_1\}$, respectively. We define $X_C$ and $X_E$ to be indicators of good management of the company and a strong economy, respectively. We set $R_C$ to be a continuous random variable assuming values in the interval $[0, 1)$. We take $R_E$ to be a function of $\xi_{TU}$, and $R_{CE}$ to be a function of $\xi_{TU}$ and $R_C$, where both $R_E$ and $R_{CE}$ assume values in the interval $[0, 1)$.


The payoff in equation (42) covers the following situations. First, we suppose that despite good overall management of the firm, default is triggered as a result of a depressed economy. Here, $X_C = 1$ and $X_E = 0$, which implies that $H_T = R_E$. Therefore the recovery is dependent on the state of the economy at time $T$ and thus on how difficult it has been for the firm to raise funds. It is also possible that a firm can default in otherwise favourable economic conditions, perhaps due to the management's negligence. In this case we have $X_E = 1$ and $X_C = 0$. Thus $H_T = R_C$ and the amount recovered is dependent on the level of mismanagement of the firm. Finally we have the worst case in which a firm is poorly managed, $X_C = 0$, and difficult economic times prevail, $X_E = 0$. Recovery is given by the amount $H_T = R_{CE}$, which is dependent on both the extent of mismanagement of the firm and how much capital the firm can raise in the face of an economic downturn. The particular payoff structure (42) is used in [16] to model the dependence structure between two credit-risky discount bonds that share market factors in common. Further investigation may include the situation where one models such dependence structures for bonds subject to stochastic interest rates and featuring recovery functions of the form (42).
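The four scenarios covered by (42) can be made explicit with a direct transcription of the payoff. In this short sketch the function name `payoff` and the numerical recovery levels used in the usage lines are purely illustrative.

```python
def payoff(x_c, x_e, r_e, r_c, r_ce):
    """Payoff H_T of equation (42).

    x_c, x_e : indicators (0 or 1) of good management and a strong economy
    r_e      : recovery when only the economy is weak
    r_c      : recovery when only the management is poor
    r_ce     : recovery when both are adverse
    """
    return (x_c * (x_e + (1 - x_e) * r_e)
            + (1 - x_c) * (x_e * r_c + (1 - x_e) * r_ce))


# Enumerate the four scenarios for hypothetical recovery levels.
scenarios = {(x_c, x_e): payoff(x_c, x_e, r_e=0.6, r_c=0.4, r_ce=0.1)
             for x_c in (0, 1) for x_e in (0, 1)}
```

Here `scenarios[(1, 1)]` is the full principal, while the remaining three entries return $R_E$, $R_C$ and $R_{CE}$ respectively.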

6. Call Option Price Process

Let $\{C_{st}\}_{0\le s\le t<T}$ be the price process of a European-style call option with maturity $t$ and strike $K$, written on a defaultable bond with price process $\{B_{tT}\}$. The price of such an option at time $s$ is given by

$$C_{st} = \frac{1}{\pi_s}\, \mathbb{E}^{\mathbb{P}}\left[ \pi_t\, (B_{tT} - K)^{+} \mid \mathcal{F}_s \right]. \tag{43}$$

We recall that if the payoff of the credit-risky bond is $H_T = X_T$, then the price of the bond at time $t$ is

$$B_{tT} = P_{tT} \sum_{i=0}^{n} \pi_{it}\, x_i, \tag{44}$$

where $P_{tT}$ is given by equation (9) and the conditional density $\pi_{it}$ is defined in equation (31). The filtration $\{\mathcal{F}_t\}$ is generated by the information processes $\{\xi_{tT}\}$ and $\{\xi_{tU}\}$, and the pricing kernel $\{\pi_t\}$ is of the form

$$\pi_t = M_t\, f(t, \xi_{tU}), \tag{45}$$

with $\{M_t\}$ satisfying equation (20). Then the price of the option at time $s$ is expressed by

$$C_{st} = \frac{1}{M_s\, f(s, \xi_{sU})}\, \mathbb{E}^{\mathbb{P}}\left[ M_t\, f(t, \xi_{tU})\, (B_{tT} - K)^{+} \mid \xi_{sT}, \xi_{sU} \right]. \tag{46}$$


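We recall that the two information processes are independent, and use the martingale $\{M_t\}$ to change the measure as follows:

$$C_{st} = \frac{1}{f(s, \xi_{sU})}\, \mathbb{E}^{\mathbb{B}_U}\left[ f(t, \xi_{tU})\, \mathbb{E}^{\mathbb{P}}\left[ \left( P_{tT} \sum_{i=0}^{n} x_i\, \pi_{it} - K \right)^{+} \,\Big|\, \xi_{sT} \right] \,\Big|\, \xi_{sU} \right]. \tag{47}$$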

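We first simplify the inner conditional expectation by following an analogous calculation to that in [7], Section 9. The difference is that the discount factor $P_{tT}$ in (47) is stochastic. However, since $P_{tT}$ is driven by $\{\xi_{tU}\}$, it is unaffected by the conditioning of the inner expectation, allowing us to use the result in [7]. Let us introduce $\{\Phi_t\}$ by

$$\Phi_t = \sum_{i=0}^{n} p_{it}, \tag{48}$$

where $p_{it} = p_i \exp\left[ \frac{T}{T-t}\left( \sigma_2 x_i \xi_{tT} - \frac{1}{2}\sigma_2^2 x_i^2 t \right) \right]$. We write the inner expectation as

$$\mathbb{E}^{\mathbb{P}}\left[ \left( P_{tT} \sum_{i=0}^{n} x_i\, \pi_{it} - K \right)^{+} \,\Big|\, \xi_{sT} \right] = \mathbb{E}^{\mathbb{P}}\left[ \frac{1}{\Phi_t} \left( \sum_{i=0}^{n} (P_{tT}\, x_i - K)\, p_{it} \right)^{+} \,\Big|\, \xi_{sT} \right]. \tag{49}$$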

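The process $\{\Phi_t^{-1}\}$ induces a change of measure from $\mathbb{P}$ to the bridge measure $\mathbb{B}_T$, under which $\{\xi_{tT}\}$ is a Brownian bridge; this allows us to use Bayes formula to express the expectation as follows:

$$\mathbb{E}^{\mathbb{P}}\left[ \frac{1}{\Phi_t} \left( \sum_{i=0}^{n} (P_{tT}\, x_i - K)\, p_{it} \right)^{+} \,\Big|\, \xi_{sT} \right] = \frac{1}{\Phi_s}\, \mathbb{E}^{\mathbb{B}_T}\left[ \left( \sum_{i=0}^{n} (P_{tT}\, x_i - K)\, p_{it} \right)^{+} \,\Big|\, \xi_{sT} \right]. \tag{50}$$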

In order to compute the expectation we introduce the Gaussian random variable $Z_{st}$, defined by

$$Z_{st} = \frac{\xi_{tT}}{T-t} - \frac{\xi_{sT}}{T-s}, \tag{51}$$

which is independent of $\{\xi_{uT}\}_{0\le u\le s}$. It is possible to find the critical value, for which the argument of the expectation vanishes, in closed form if it is assumed that the defaultable bond is binary. So, for $n = 1$, the critical value $z^*$ is given by

$$z^* = \frac{\ln\left[ \dfrac{\pi_{0s}\,(K - x_0 P_{tT})}{\pi_{1s}\,(x_1 P_{tT} - K)} \right] + \frac{1}{2}\sigma_2^2 \left( x_1^2 - x_0^2 \right) \alpha_{st}^2\, T^2}{\sigma_2\, (x_1 - x_0)\, \alpha_{st}\, T}, \tag{52}$$

where $\alpha_{st}^2 = \mathrm{Var}^{\mathbb{B}_T}[Z_{st}]$. The computation of the expectation amounts to two Gaussian integrals reducing to cumulative normal distribution functions, which we denote by $N[x]$. We obtain the following:

$$\mathbb{E}^{\mathbb{P}}\left[ \left( P_{tT} \sum_{i=0}^{1} x_i\, \pi_{it} - K \right)^{+} \,\Big|\, \xi_{sT} \right] = \pi_{1s}\,(P_{tT}\, x_1 - K)\, N[d^{+}_s] - \pi_{0s}\,(K - P_{tT}\, x_0)\, N[d^{-}_s], \tag{53}$$

where

$$d^{\pm}_s = \frac{\ln\left[ \dfrac{\pi_{1s}\,(x_1 P_{tT} - K)}{\pi_{0s}\,(K - x_0 P_{tT})} \right] \pm \frac{1}{2}\sigma_2^2\, (x_1 - x_0)^2\, \alpha_{st}^2\, T^2}{\sigma_2\, (x_1 - x_0)\, \alpha_{st}\, T}. \tag{54}$$
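For a fixed value of the discount factor $P_{tT}$, the right-hand side of (53) is a closed-form expression that can be coded directly. The sketch below makes two assumptions that are not stated in this section: the variance of $Z_{st}$ under the bridge measure is taken to be $\alpha_{st}^2 = (t-s)/((T-t)(T-s))$, which is the standard result for a Brownian bridge on $[0, T]$, and the strike is required to satisfy $x_0 P_{tT} < K < x_1 P_{tT}$ so that the logarithm in (54) is defined. The function names are illustrative.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution function N[x]."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inner_expectation(pi1s, x0, x1, P, K, sigma2, s, t, T):
    """Right-hand side of (53) for a binary bond, for a given value P of P_tT.

    pi1s   : conditional probability of the payoff x1 at time s
    sigma2 : information flow rate of the process xi_tT
    Assumes s < t < T and x0 * P < K < x1 * P.
    """
    if not x0 * P < K < x1 * P:
        raise ValueError("formula (54) needs x0 * P < K < x1 * P")
    pi0s = 1.0 - pi1s
    # Assumed closed form of Var[Z_st] for a Brownian bridge on [0, T].
    alpha = math.sqrt((t - s) / ((T - t) * (T - s)))
    vol = sigma2 * (x1 - x0) * alpha * T
    log_ratio = math.log(pi1s * (x1 * P - K) / (pi0s * (K - x0 * P)))
    d_plus = (log_ratio + 0.5 * vol ** 2) / vol
    d_minus = (log_ratio - 0.5 * vol ** 2) / vol
    return (pi1s * (P * x1 - K) * norm_cdf(d_plus)
            - pi0s * (K - P * x0) * norm_cdf(d_minus))
```

As $t$ approaches $s$ the value tends to the intrinsic value $\max(P_{tT}(\pi_{0s}x_0 + \pi_{1s}x_1) - K, 0)$, which gives a quick consistency check on an implementation.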

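We can now insert this intermediate result into equation (47) for $n = 1$; we have

$$C_{st} = \frac{1}{f(s, \xi_{sU})}\, \mathbb{E}^{\mathbb{B}_U}\left[ f(t, \xi_{tU}) \left[ \pi_{1s}\,(P_{tT}\, x_1 - K)\, N[d^{+}_s] - \pi_{0s}\,(K - P_{tT}\, x_0)\, N[d^{-}_s] \right] \,\Big|\, \xi_{sU} \right]. \tag{55}$$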

We emphasize that $P_{tT}$ is given by a function $P(t, T, \xi_{tU})$ and thus is affected by the conditioning with respect to $\xi_{sU}$. To compute the expectation in equation (55), we use the same technique as in Section 4 and introduce the Gaussian random variable $Y_{st}$, defined by

$$Y_{st} = \xi_{tU} - \frac{U-t}{U-s}\, \xi_{sU}, \tag{56}$$

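with mean zero and variance $\nu_{st}^2 = \mathrm{Var}^{\mathbb{B}_U}[Y_{st}]$. Thus, as shown in the previous sections, the outer conditional expectation reduces to a Gaussian integral:

$$\begin{aligned} C_{st} = \frac{1}{f(s, \xi_{sU})} \int_{-\infty}^{\infty} & f\left(t, \nu_{st}\, y + \frac{U-t}{U-s}\,\xi_{sU}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) \\ & \times \left[ \pi_{1s} \left( P\left(t, T, \nu_{st}\, y + \frac{U-t}{U-s}\,\xi_{sU}\right) x_1 - K \right) N[d^{+}_s(y)] \right. \\ & \quad \left. - \pi_{0s} \left( K - P\left(t, T, \nu_{st}\, y + \frac{U-t}{U-s}\,\xi_{sU}\right) x_0 \right) N[d^{-}_s(y)] \right] dy. \end{aligned} \tag{57}$$

Therefore we obtain a semi-analytical pricing formula for a call option on a defaultable bond in a stochastic interest rate setting. The integral in equation (57) can be evaluated using numerical methods once the function $f(t, x)$ is specified.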

7. Hybrid Securities

So far we have focused on the pricing of credit-risky bonds with stochastic discounting. The formalism presented in the above sections can also be applied to price other types of securities. In particular, as an example of a hybrid security, we show how to price an inflation-linked credit-risky discount bond. While such a security has inherent credit risk, it offers bondholders protection against inflation. This application also gives us the opportunity to extend the pricing models presented thus far to the case where $n$ independent information processes are employed. We shall call such models "multi-dimensional pricing models".

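In what follows, we consider three independent information processes, $\{\xi_{tT}\}$, $\{\xi_{tU_1}\}$ and $\{\xi_{tU_2}\}$, defined by

$$\xi_{tT} = \sigma\, t\, X_T + \beta_{tT}, \qquad \xi_{tU_1} = \sigma_1 t\, X_{U_1} + \beta_{tU_1}, \qquad \xi_{tU_2} = \sigma_2 t\, X_{U_2} + \beta_{tU_2}, \tag{58}$$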

where $0 \le t \le T < U_1 \le U_2$. The positive random variable $X_T$ is discrete, while $X_{U_1}$, $X_{U_2}$ are assumed to be continuous. The market filtration $\{\mathcal{F}_t\}$ is generated jointly by the three information processes. Let $\{C_t\}_{t\ge 0}$ be a price level process, e.g., the process of the consumer price index. The price $Q_{tT}$, at time $t$, of an inflation-linked discount bond that pays $C_T$ units of a currency at maturity $T$, is

$$Q_{tT} = \frac{\mathbb{E}^{\mathbb{P}}\left[ \pi_T\, C_T \mid \mathcal{F}_t \right]}{\pi_t}. \tag{59}$$

We now make use of the "foreign exchange analogy" (see, e.g., [4], [5], [11], [13], [17]) in which the nominal pricing kernel $\{\pi_t\}$ and the real pricing kernel $\{\pi^R_t\}$ are viewed as being associated with "domestic" and "foreign" economies respectively, with the price level process $\{C_t\}$ acting as an "exchange rate". The process $\{C_t\}$ is expressed by the following ratio:

$$C_t = \frac{\pi^R_t}{\pi_t}. \tag{60}$$

For further details about the modelling of the real and the nominal pricing kernels, and the pricing of inflation-linked assets, we refer to [14]. In what follows, we make use of the method proposed in [14] to price an example of an inflation-linked credit-risky discount bond (ILCR) that, at maturity $T$, pays a cash flow $H_T = C_T\, H(X_T, \xi_{TU_1}, \xi_{TU_2})$. The price $H_{tT}$ at time $t \le T$ of such a bond is

$$H_{tT} = \frac{1}{\pi_t}\, \mathbb{E}^{\mathbb{P}}\left[ \pi^R_T\, H(X_T, \xi_{TU_1}, \xi_{TU_2}) \mid \mathcal{F}_t \right], \tag{61}$$

where we have used relation (60). We choose to model the real and the nominal pricing kernels by

$$\pi_t = M^{(1)}_t M^{(2)}_t\, f(t, \xi_{tU_1}, \xi_{tU_2}) \quad \text{and} \quad \pi^R_t = M^{(1)}_t M^{(2)}_t\, g(t, \xi_{tU_1}, \xi_{tU_2}), \tag{62}$$

where $f(t, x, y)$ and $g(t, x, y)$ are two functions of three variables. The process $\{M^{(i)}_t\}_{0\le t\le T<U_i}$ for $i = 1, 2$ is a martingale that induces a change of measure to the bridge measure $\mathbb{B}_i$. We recall that the information process $\{\xi_{tU_i}\}$ has the law of a Brownian bridge under the measure $\mathbb{B}_i$. In order to work out the expectation in (61) with the pricing kernel models introduced in (62), we can also define a process $\{M_t\}$ by

$$M_t = M^{(1)}_t M^{(2)}_t, \tag{63}$$

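where $0 \le t \le T < U_1 \le U_2$. Since the information processes $\{\xi_{tU_1}\}$ and $\{\xi_{tU_2}\}$ are independent, $\{M_t\}$ is itself an $(\{\mathcal{F}_t\}, \mathbb{P})$-martingale, with $M_0 = 1$ and $\mathbb{E}^{\mathbb{P}}[M_t] = 1$. Thus $\{M_t\}$ can be used to effect a change of measure from $\mathbb{P}$ to a bridge measure $\mathbb{B}$, under which the random variables $\xi_{tU_1}$ and $\xi_{tU_2}$ have the distribution of a Brownian bridge for $0 \le t \le T < U_1$. This can be verified as follows: $\{\xi_{tU_1}\}$ is a Gaussian process with mean

$$\mathbb{E}^{\mathbb{B}}[\xi_{tU_1}] = \mathbb{E}^{\mathbb{B}_1}\left[ \frac{M_t}{M^{(1)}_t}\, \xi_{tU_1} \right] = \mathbb{E}^{\mathbb{B}_1}\left[ M^{(2)}_t \right] \mathbb{E}^{\mathbb{B}_1}[\xi_{tU_1}] = 0, \tag{64}$$

due to the independence property of $\{\xi_{tU_1}\}$ and $\{\xi_{tU_2}\}$. Moreover, for $0 \le s \le t \le T < U_1$, the covariance is given by

$$\mathbb{E}^{\mathbb{B}}[\xi_{sU_1} \xi_{tU_1}] = \mathbb{E}^{\mathbb{B}_1}\left[ M^{(2)}_t \right] \mathbb{E}^{\mathbb{B}_1}[\xi_{sU_1} \xi_{tU_1}] = \mathbb{E}^{\mathbb{P}}[M_t]\, \mathbb{E}^{\mathbb{B}_1}[\xi_{sU_1} \xi_{tU_1}] = \frac{s\,(U_1 - t)}{U_1}. \tag{65}$$

The same can be shown for $\{\xi_{tU_2}\}$.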

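By the definition of $\{M_t\}$ and by use of Bayes formula and the fact that $\{\xi_{tT}\}$, $\{\xi_{tU_1}\}$ and $\{\xi_{tU_2}\}$ are $\{\mathcal{F}_t\}$-Markov processes, equation (61) reduces to

$$H_{tT} = \frac{1}{f(t, \xi_{tU_1}, \xi_{tU_2})}\, \mathbb{E}^{\mathbb{B}}\left[ \mathbb{E}^{\mathbb{P}}\left[ g(T, \xi_{TU_1}, \xi_{TU_2})\, H(X_T, \xi_{TU_1}, \xi_{TU_2}) \mid \xi_{tT} \right] \,\Big|\, \xi_{tU_1}, \xi_{tU_2} \right]. \tag{66}$$

Next we repeat an analogous calculation to the one leading from equation (25) to expression (32). For the ILCR discount bond under consideration, we obtain

$$H_{tT} = \frac{1}{f(t, \xi_{tU_1}, \xi_{tU_2})} \sum_{i=0}^{n} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g\left(T, z(y_1), z(y_2)\right) H\left(x_i, z(y_1), z(y_2)\right) \pi_{it}\, \frac{1}{2\pi} \exp\left[ -\tfrac{1}{2}\left( y_1^2 + y_2^2 \right) \right] dy_1\, dy_2. \tag{67}$$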

Here the conditional density $\pi_{it}$ is given by an expression analogous to the one in equation (31) and, for $k = 1, 2$, $z(y_k)$ is defined by

$$z(y_k) = \nu^{(k)}_{tT}\, y_k + \frac{U_k - T}{U_k - t}\, \xi_{tU_k}, \quad \text{where} \quad \nu^{(k)}_{tT} = \sqrt{\frac{(T-t)(U_k - T)}{U_k - t}}. \tag{68}$$

In the special case where $H_T = X_T$, the expression for the price at time $t$ of the ILCR discount bond simplifies to

$$H_{tT} = Q_{tT} \sum_{i=0}^{n} \pi_{it}\, x_i. \tag{69}$$


Here $Q_{tT}$ is the price of an inflation-linked discount bond that depends on the information processes $\{\xi_{tU_1}\}$ and $\{\xi_{tU_2}\}$. In particular, a formula similar to (57) can be derived for the price of a European-style call option written on an ILCR bond with price process given by (69) with $n = 1$. We note here that similar pricing formulae can be derived for credit-risky discount bonds traded in a foreign currency. In this case the real pricing kernel, and thus the real interest rate, is associated with the pricing kernel denominated in the foreign currency. On the other hand, the nominal pricing kernel is associated with the domestic currency, thus giving rise to the domestic interest rate.

8. Credit-Risky Coupon Bonds

Let $\{T_k\}_{k=1,\ldots,n}$ be a collection of fixed dates where $0 \le t \le T_1 \le \ldots \le T_n$. We consider the valuation of a credit-risky bond with coupon payment $H_{T_k}$ at time $T_k$ and maturity $T_n$. The bond is in a state of default as soon as the first coupon payment does not occur. We denote the price process of the coupon bond by $\{B_{tT_n}\}$ and introduce $n$ independent random variables $X_{T_1}, \ldots, X_{T_n}$ that are applied to construct the cash flows $H_{T_k}$ given by

$$H_{T_k} = c \prod_{j=1}^{k} X_{T_j}, \tag{70}$$

for $k = 1, \ldots, n-1$, and for $k = n$ by

$$H_{T_n} = (c + p) \prod_{j=1}^{n} X_{T_j}. \tag{71}$$

Here $c$ and $p$ denote the coupon and principal payment, respectively, and the random variables $\{X_{T_k}\}_{k=1,\ldots,n}$ take values in $\{0, 1\}$. With each factor $X_{T_k}$ we associate an information process $\{\xi_{tT_k}\}$ defined by

$$\xi_{tT_k} = \sigma_k\, t\, X_{T_k} + \beta_{tT_k}. \tag{72}$$

Furthermore we introduce another information process $\{\xi_{tU}\}$ given by

$$\xi_{tU} = \sigma\, t\, X_U + \beta_{tU} \qquad (0 \le t \le T_n < U) \tag{73}$$

that we reserve for the modelling of the pricing kernel. The market filtration $\{\mathcal{F}_t\}$ is generated jointly by the $n + 1$ information processes, that is $\{\xi_{tT_k}\}_{k=1,\ldots,n}$ and $\{\xi_{tU}\}$. Following the method in Section 4, we model the pricing kernel $\{\pi_t\}$ by

$$\pi_t = M_t\, f(t, \xi_{tU}), \tag{74}$$

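where the density martingale $\{M_t\}$ which induces a change of measure to the bridge measure satisfies equation (4). Armed with these ingredients we are now in the position to write down the formula for the price $B_{tT_n}$ at time $t$ of the credit-risky coupon bond:

$$\begin{aligned} B_{tT_n} &= \frac{1}{\pi_t} \sum_{k=1}^{n} \mathbb{E}^{\mathbb{P}}\left[ \pi_{T_k} H_{T_k} \mid \xi_{tT_1}, \ldots, \xi_{tT_k}, \xi_{tU} \right] \\ &= \frac{1}{M_t\, f(t, \xi_{tU})} \sum_{k=1}^{n} \mathbb{E}^{\mathbb{P}}\left[ M_{T_k}\, f(T_k, \xi_{T_kU})\; c \prod_{j=1}^{k} X_{T_j} \,\Big|\, \xi_{tT_1}, \ldots, \xi_{tT_k}, \xi_{tU} \right] \\ &\quad + \frac{1}{M_t\, f(t, \xi_{tU})}\, \mathbb{E}^{\mathbb{P}}\left[ M_{T_n}\, f(T_n, \xi_{T_nU})\; p \prod_{j=1}^{n} X_{T_j} \,\Big|\, \xi_{tT_1}, \ldots, \xi_{tT_n}, \xi_{tU} \right]. \end{aligned} \tag{75}$$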

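To compute the expectation, we use the approach presented in Section 4. Since the pricing kernel and the cash flow random variables $H_{T_k}$, $k = 1, \ldots, n$, are independent, we conclude that the expression for the bond price $B_{tT_n}$ reduces to

$$B_{tT_n} = c \sum_{k=1}^{n} P_{tT_k}\, \mathbb{E}^{\mathbb{P}}\left[ \prod_{j=1}^{k} X_{T_j} \,\Big|\, \xi_{tT_1}, \ldots, \xi_{tT_k} \right] + p\, P_{tT_n}\, \mathbb{E}^{\mathbb{P}}\left[ \prod_{j=1}^{n} X_{T_j} \,\Big|\, \xi_{tT_1}, \ldots, \xi_{tT_n} \right], \tag{76}$$

where the discount bond system $\{P_{tT_k}\}$ is given by

$$P_{tT_k} = \frac{1}{f(t, \xi_{tU})} \int_{-\infty}^{\infty} f\left(T_k, \nu_{tT_k}\, y_k + \frac{U - T_k}{U-t}\,\xi_{tU}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y_k^2\right) dy_k, \tag{77}$$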

and $\nu_{tT_k}^2 = (T_k - t)(U - T_k)/(U - t)$. We note that formula (75) can be simplified further since the expectations therein can be worked out explicitly due to the independence property of the information processes. We have

$$\mathbb{E}^{\mathbb{P}}\left[ \prod_{j=1}^{k} X_{T_j} \,\Big|\, \xi_{tT_1}, \ldots, \xi_{tT_k} \right] = \prod_{j=1}^{k} \pi^{(j)}_{1t}, \tag{78}$$

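where the conditional density $\pi^{(j)}_{1t}$ at time $t$ that the random variable $X_{T_j}$ takes value one is given by

$$\pi^{(j)}_{1t} = \frac{p^{(j)}_1 \exp\left[ \frac{T_j}{T_j - t}\left( \sigma_j\, \xi_{tT_j} - \frac{1}{2}\sigma_j^2\, t \right) \right]}{p^{(j)}_0 + p^{(j)}_1 \exp\left[ \frac{T_j}{T_j - t}\left( \sigma_j\, \xi_{tT_j} - \frac{1}{2}\sigma_j^2\, t \right) \right]}. \tag{79}$$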

Here $p^{(j)}_1 = \mathbb{P}[X_{T_j} = 1]$. Thus, the price $B_{tT_n}$ at time $t$ of the credit-risky coupon bond is given by

$$B_{tT_n} = \sum_{k=1}^{n} c\, P_{tT_k} \prod_{j=1}^{k} \pi^{(j)}_{1t} + p\, P_{tT_n} \prod_{j=1}^{n} \pi^{(j)}_{1t}. \tag{80}$$
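Given the discount bond prices and the conditional probabilities, (80) is a running product over the coupon dates. The sketch below illustrates this; the function names `survival_prob` and `coupon_bond_price` are hypothetical, and the discount factors $P_{tT_k}$ are taken as inputs rather than computed from (77).

```python
import math


def survival_prob(p1, sigma, xi, t, T):
    """pi^{(j)}_{1t} from equation (79) for a {0, 1}-valued factor, with 0 < p1 < 1.

    Written in logistic form so that large exponents do not overflow.
    """
    a = T / (T - t) * (sigma * xi - 0.5 * sigma ** 2 * t)
    logit = math.log(p1 / (1.0 - p1)) + a
    if logit >= 0.0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def coupon_bond_price(c, p, discounts, survival):
    """Equation (80).

    discounts[k] : discount bond price P_{t T_{k+1}}
    survival[k]  : conditional probability pi^{(k+1)}_{1t}
    """
    price, cumulative = 0.0, 1.0
    for P_k, pi_k in zip(discounts, survival):
        cumulative *= pi_k            # product of pi^{(j)}_{1t} for j <= k
        price += c * P_k * cumulative
    return price + p * discounts[-1] * cumulative
```

For example, `coupon_bond_price(5.0, 100.0, [0.97, 0.94], [0.9, 0.8])` weights the second coupon and the principal by the product $0.9 \times 0.8$, reflecting that both factors must equal one for those cash flows to be paid.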


At this stage, we observe that the price of a credit-risky coupon bond has been derived for the case in which the cash flow functions $H_{T_k}$, $k = 1, \ldots, n$, do not depend on the information available at time $T_k$ about the macroeconomic factor $X_U$, thereby leading to independence between the discount bond system and the credit-risky component of the bond. This is generalized in a straightforward manner by considering cash flow functions of the form

$$H_{T_k} = H(X_{T_1}, \ldots, X_{T_k}, \xi_{T_kU}), \tag{81}$$

for $k = 1, \ldots, n$. The valuation of such cash flows at time $t$ may include the case treated in (32), however endowed with coupon payments.

As an illustration we consider the situation in which the bond pays a coupon $c$ at $T_k$, $k = 1, \ldots, n$, and the principal amount $p$ at $T_n$. Upon default, market-dependent recovery given by $R_k(\xi_{T_kU})$ (as a percentage of coupon plus principal) is paid at $T_k$. For simplicity, we consider $n = 2$. In this case, the random cash flows of the bond are given by

$$H_{T_1} = c\, X_{T_1} + (c + p)\, R_1(\xi_{T_1U})\,(1 - X_{T_1}),$$

$$H_{T_2} = (c + p)\, X_{T_1} \left[ X_{T_2} + R_2(\xi_{T_2U})\,(1 - X_{T_2}) \right].$$

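By making use of the technique presented in Section 5, we can express the price of the credit-risky coupon bond by

$$\begin{aligned} B_{tT_2} &= c\, P_{tT_1}\, \pi^{(1)}_{1t} + (c + p)\, P_{tT_2}\, \pi^{(1)}_{1t}\, \pi^{(2)}_{1t} \\ &\quad + (c + p)\, \pi^{(1)}_{0t}\, \frac{1}{f(t, \xi_{tU})} \int_{-\infty}^{\infty} f\left(T_1, m(y_1)\right) R_1\left(m(y_1)\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y_1^2\right) dy_1 \\ &\quad + (c + p)\, \pi^{(1)}_{1t}\, \pi^{(2)}_{0t}\, \frac{1}{f(t, \xi_{tU})} \int_{-\infty}^{\infty} f\left(T_2, m(y_2)\right) R_2\left(m(y_2)\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y_2^2\right) dy_2, \end{aligned} \tag{82}$$

where, for $k = 1, 2$, we define

$$m(y_k) = \nu_{tT_k}\, y_k + \frac{U - T_k}{U - t}\, \xi_{tU}, \qquad \nu_{tT_k} = \sqrt{\frac{(T_k - t)(U - T_k)}{U - t}}. \tag{83}$$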

9. Credit-Sensitive Pricing Kernels

We fix the dates $T_1$ and $T_2$, where $T_1 \le T_2$, to which we associate the economic factors $X_{T_1}$ and $X_{T_2}$ respectively. The first factor is identified with a debt payment at time $T_1$. For example $X_{T_1}$ could be a coupon payment that a country is obliged to make at time $T_1$. The second factor, $X_{T_2}$, could be identified with the measured growth (possibly negative) in the employment level in the same country at time $T_2$ since the last published figure. In such an economy, with two random factors only, it is plausible that the prices of the treasuries fluctuate according to the noisy information market participants will have about the outcome of $X_{T_1}$ and $X_{T_2}$. Thus the price of a sovereign bond with maturity $T$, where $0 \le t \le T < T_1 \le T_2$, is given by:

$$P_{tT} = \frac{1}{f(t, \xi_{tT_1}, \xi_{tT_2})} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f\left(T, \nu^{(1)}_{tT}\, y_1 + \frac{T_1 - T}{T_1 - t}\, \xi_{tT_1}, \nu^{(2)}_{tT}\, y_2 + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right) \frac{1}{2\pi} \exp\left[ -\tfrac{1}{2}\left( y_1^2 + y_2^2 \right) \right] dy_2\, dy_1. \tag{84}$$

In particular, the resulting interest rate process in this model is subject to the information processes $\{\xi_{tT_1}\}$ and $\{\xi_{tT_2}\}$, making it fluctuate according to information (both genuine and misleading) about the economy's factors $X_{T_1}$ and $X_{T_2}$.

We now ask the following question: What type of model should one consider if the goal is to model a pricing kernel that is sensitive to an accumulation of losses? Or in other words, how should one model the nominal short rate of interest and the market price of risk processes if both react to the amount of debt accumulated by a country over a finite period of time?

To treat this question we need to introduce a model for an accumulation process. We shall adopt the method developed in [9], where the idea of a gamma bridge information process is introduced. It turns out that the use of such a cumulative process is suitable to provide an answer to the question above. In fact, if in the example above, the factor $X_{T_1}$ is identified with the total accumulated debt at time $T_1$, then the gamma bridge information process $\{\xi^{\gamma}_{tT_1}\}$, defined by

$$\xi^{\gamma}_{tT_1} = X_{T_1}\, \gamma_{tT_1}, \tag{85}$$

where $\{\gamma_{tT_1}\}_{0\le t\le T_1}$ is a gamma bridge process that is independent of $X_{T_1}$, measures the level of the accumulated debt as of time $t$, $0 \le t \le T_1$. If the market filtration is generated, among other information processes, also by the debt accumulation process, then asset prices that are calculated by use of this filtration will fluctuate according to the updated information about the level of the accumulated debt of a country. We now work out the price of a sovereign bond for which the price process reacts both to Brownian and gamma information.
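A path of the accumulation process in (85) can be simulated once a construction of the gamma bridge is fixed. The sketch below assumes the standard construction of a gamma bridge as a gamma process with rate parameter $m$ divided by its terminal value, which we take to agree with the process described in [9]; the function names, grid and seed are illustrative choices.

```python
import random


def gamma_bridge_path(m, T1, steps, rng):
    """Simulate a gamma bridge on an equally spaced grid of [0, T1].

    Assumed construction: gamma_t = G_t / G_T1, where G is a gamma process
    whose increments over dt are Gamma(shape = m * dt, scale = 1).
    The path starts at 0, is nondecreasing, and ends at 1.
    """
    dt = T1 / steps
    cumulative = [0.0]
    for _ in range(steps):
        cumulative.append(cumulative[-1] + rng.gammavariate(m * dt, 1.0))
    terminal = cumulative[-1]
    return [g / terminal for g in cumulative]


def accumulated_debt_path(x_T1, m, T1, steps, seed=0):
    """Information process xi^gamma_t = X_T1 * gamma_t of (85), for a given X_T1."""
    rng = random.Random(seed)
    return [x_T1 * g for g in gamma_bridge_path(m, T1, steps, rng)]
```

A call such as `accumulated_debt_path(3.5, m=2.0, T1=1.0, steps=50)` returns a nondecreasing path that reveals the total debt level 3.5 only at the final grid point, which is the sense in which the process measures accumulated debt as of time $t$.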

We consider the time line $0 \le t \le T < T_1 \le T_2 < \infty$. Time $T$ is the maturity date of a sovereign bond with unit payoff and price process $\{P_{tT}\}_{0\le t\le T}$. With the date $T_1$ we associate the factor $X_{T_1}$ and with the date $T_2$ the factor $X_{T_2}$. The positive random variable $X_{T_1}$ is independent of $X_{T_2}$, and both may be discrete or continuous random variables. Then we introduce the following information processes:

$$\xi^{\gamma}_{tT_1} = X_{T_1}\, \gamma_{tT_1}, \qquad \xi_{tT_2} = \sigma\, t\, X_{T_2} + \beta_{tT_2}. \tag{86}$$


The process $\{\xi^{\gamma}_{tT_1}\}$ is a gamma bridge information process, and it is taken to be independent of $\{\xi_{tT_2}\}$. The properties of the gamma bridge process $\{\gamma_{tT_1}\}$ are described in great detail in [9]. We assume that the market filtration $\{\mathcal{F}_t\}_{t\ge 0}$ is generated jointly by $\{\xi^{\gamma}_{tT_1}\}$ and $\{\xi_{tT_2}\}$.

In this setting, the pricing kernel reacts to the updated information about the level of accumulated debt and, for the sake of example, also to noisy information about the likely level of employment growth at $T_2$. Thus we propose the following model for the pricing kernel:

$$\pi_t = M_t\, f\left(t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right), \tag{87}$$

where the process $\{M_t\}$ is the change-of-measure martingale from the probability measure $\mathbb{P}$ to the Brownian bridge measure $\mathbb{B}$, satisfying

$$dM_t = -\sigma\, \frac{T_2}{T_2 - t}\, \mathbb{E}\left[ X_{T_2} \mid \xi_{tT_2} \right] M_t\, dW_t. \tag{88}$$

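Here $\{W_t\}$ is an $(\{\mathcal{F}_t\}, \mathbb{P})$-Brownian motion. The formula for the price of the sovereign bond is given by

$$P_{tT} = \frac{\mathbb{E}^{\mathbb{P}}\left[ M_T\, f\left(T, \xi^{\gamma}_{TT_1}, \xi_{TT_2}\right) \,\Big|\, \xi^{\gamma}_{tT_1}, \xi_{tT_2} \right]}{M_t\, f\left(t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right)}. \tag{89}$$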

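We make use of the Markov property and the independence property of the information processes, together with the change of measure, to express the bond price by

$$P_{tT} = \frac{\mathbb{E}^{\mathbb{P}}_{\gamma}\left[ \mathbb{E}^{\mathbb{B}}\left[ f\left(T, \xi^{\gamma}_{TT_1}, \xi_{TT_2}\right) \,\Big|\, \xi_{tT_2} \right] \,\Big|\, \xi^{\gamma}_{tT_1} \right]}{f\left(t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right)}. \tag{90}$$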

Here, the expectations $\mathbb{E}^{\mathbb{P}}_{\gamma}$ and $\mathbb{E}^{\mathbb{B}}$ are operators that apply according to the dependence of their argument on the random variables $\xi^{\gamma}_{TT_1}$ and $\xi_{TT_2}$ respectively. This is a direct consequence of the independence of $\{\xi^{\gamma}_{tT_1}\}$ and $\{\xi_{tT_2}\}$. We now use the technique adopted in the preceding sections, where we introduce the Gaussian random variable $Y_{tT}$ with mean zero and variance $\nu_{tT}^2 = (T-t)(T_2 - T)/(T_2 - t)$, and the standard Gaussian random variable $Y$. By following the approach taken in Section 4, we can compute the inner expectation explicitly since the conditional expectation reduces to a Gaussian integral over the range of the random variable $Y$. Thus we obtain:

$$P_{tT} = \int_{-\infty}^{\infty} \frac{\mathbb{E}^{\mathbb{P}}_{\gamma}\left[ f\left(T, \xi^{\gamma}_{TT_1}, \nu_{tT}\, y + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right) \,\Big|\, \xi^{\gamma}_{tT_1} \right]}{f\left(t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right)}\, \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy. \tag{91}$$


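The feature of this model which sets it apart from those considered in preceding sections is the fact that we have to calculate a gamma expectation $\mathbb{E}^{\mathbb{P}}_{\gamma}$. In this case, we cannot adopt the "usual" change-of-measure method we have used thus far. To this end we refer to [9], where the price process of the Arrow-Debreu security for the case that it is driven by a gamma bridge information process is derived. We use this result and obtain for the Arrow-Debreu density process $\{A_{tT}\}$ the following expression:

$$A_{tT}(y_\gamma) = \mathbb{E}^{\mathbb{P}}\left[ \delta\left(\xi^{\gamma}_{TT_1} - y_\gamma\right) \,\Big|\, \xi^{\gamma}_{tT_1} \right] \tag{92}$$

$$= \frac{\mathbb{1}\{y_\gamma > \xi^{\gamma}_{tT_1}\}\, \left(y_\gamma - \xi^{\gamma}_{tT_1}\right)^{m(T-t)-1}}{B[m(T-t),\, m(T_1 - T)]}\; \frac{\displaystyle\int_{y_\gamma}^{\infty} p(x)\, x^{1 - mT_1}\, (x - y_\gamma)^{m(T_1 - T) - 1}\, dx}{\displaystyle\int_{\xi^{\gamma}_{tT_1}}^{\infty} p(z)\, z^{1 - mT_1}\, \left(z - \xi^{\gamma}_{tT_1}\right)^{m(T_1 - t) - 1}\, dz}, \tag{93}$$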

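where $\delta(y)$ is the Dirac distribution and $p(x)$ is the a priori probability density of $X_{T_1}$. Here $B[a, b]$ is the beta function. Following [16], Section 3.4, we consider a function $h(\xi^{\gamma}_{TT_1})$ of the random variable $\xi^{\gamma}_{TT_1}$ and note that for a suitable function $h$ we may write:

$$\mathbb{E}^{\mathbb{P}}_{\gamma}\left[ h\left(\xi^{\gamma}_{TT_1}\right) \,\Big|\, \xi^{\gamma}_{tT_1} \right] = \int_{-\infty}^{\infty} \mathbb{E}^{\mathbb{P}}_{\gamma}\left[ \delta\left(\xi^{\gamma}_{TT_1} - y_\gamma\right) \,\Big|\, \xi^{\gamma}_{tT_1} \right] h(y_\gamma)\, dy_\gamma. \tag{94}$$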

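Next we see that the conditional expectation under the integral is the Arrow-Debreu density (92) for which there is the closed-form expression (93). We go back to equation (91) and observe that the conditional expectation under the integral is of the form $\mathbb{E}^{\mathbb{P}}_{\gamma}\left[ h(\xi^{\gamma}_{TT_1}) \mid \xi^{\gamma}_{tT_1} \right]$. Thus we can use (94) to calculate the gamma expectation in (91). We write:

$$\mathbb{E}^{\mathbb{P}}_{\gamma}\left[ f\left(T, \xi^{\gamma}_{TT_1}, \nu_{tT}\, y + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right) \,\Big|\, \xi^{\gamma}_{tT_1} \right] = \int_{-\infty}^{\infty} A_{tT}(y_\gamma)\, f\left(T, y_\gamma, \nu_{tT}\, y + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right) dy_\gamma. \tag{95}$$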

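We are now in the position to write down the bond price (91) in explicit form by using equation (95). We thus obtain:

$$P_{tT} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} A_{tT}(y_\gamma)\, \frac{f\left(T, y_\gamma, \nu_{tT}\, y + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right)}{f\left(t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right)}\, \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy_\gamma\, dy. \tag{96}$$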

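The bond price can be written more concisely by defining

$$f\left(T, t, \xi^{\gamma}_{tT_1}, \xi_{tT_2}\right) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} A_{tT}(y_\gamma)\, f\left(T, y_\gamma, \nu_{tT}\, y + \frac{T_2 - T}{T_2 - t}\, \xi_{tT_2}\right) \frac{1}{\sqrt{2\pi}} \exp\left(-\tfrac{1}{2} y^2\right) dy_\gamma\, dy. \tag{97}$$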


We thus have:

\[
P_{tT} = \frac{f\left(T, t, \xi^\gamma_{tT_1}, \xi_{tT_2}\right)}{f\left(t, \xi^\gamma_{tT_1}, \xi_{tT_2}\right)}. \tag{98}
\]

Future investigation in this line of research includes the construction of processes $f(t, \xi^\gamma_{tT_1}, \xi_{tT_2})$ such that the resulting pricing kernel (87) is an $(\mathcal{F}_t, \mathbb{P})$-supermartingale. The appropriate choice of $f(t, x, y)$ depends also on a suitable description of the economic interplay of the information flows modelled by $\xi^\gamma_{tT_1}$ and $\xi_{tT_2}$. One might begin by looking at the situation in which the price of the bond depreciates due to a rising debt level and a higher level of employment. We conclude by observing that the gamma bridge information process may also be considered for the modelling of credit-risky bonds, where default is triggered by the firm's accumulated debt exceeding a specified threshold at bond maturity. Random recovery models may be constructed using the technique in Section 5.

References

1. Akahori, J., Hishida, Y., Teichmann, J. and Tsuchiya, T. (2009), "A Heat Kernel Approach to Interest Rate Models", http://arxiv.org/abs/0910.5033v1.
2. Akahori, J. and Macrina, A. (2010), "Heat Kernel Interest Rate Models with Time-Inhomogeneous Markov Processes", Ritsumeikan University, King's College London and Kyoto University working paper.
3. Bielecki, T. R. and Rutkowski, M. (2002), Credit Risk: Modelling, Valuation and Hedging, Springer-Verlag, Berlin.
4. Brigo, D. and Mercurio, F. (2006), Interest Rate Models: Theory and Practice (with Smile, Inflation and Credit), Springer-Verlag, Berlin.
5. Brody, D. C., Crosby, J. and Li, H. (2008), "Convexity Adjustments in Inflation-Linked Derivatives", Risk Magazine.
6. Brody, D. C., Davis, M. H. A., Friedman, R. L. and Hughston, L. P. (2009), "Informed Traders", Proceedings of the Royal Society London, A465, 1103–1122.
7. Brody, D. C., Hughston, L. P. and Macrina, A. (2007), "Beyond Hazard Rates: A New Framework to Credit Risk Modelling", in Advances in Mathematical Finance, Festschrift Volume in Honour of Dilip Madan (eds Elliott, R., Fu, M., Jarrow, R. and Yen, J. Y.), Birkhäuser, Basel.
8. Brody, D. C., Hughston, L. P. and Macrina, A. (2008), "Information-Based Asset Pricing", International Journal of Theoretical and Applied Finance, 11, 107–142.
9. Brody, D. C., Hughston, L. P. and Macrina, A. (2008), "Dam Rain and Cumulative Gain", Proceedings of the Royal Society London, A464, 1801–1822.
10. Flesaker, B. and Hughston, L. P. (1996), "Positive Interest", Risk, 9, 46–49.
11. Hinnerich, M. (2008), "Inflation-Indexed Swaps and Swaptions", Journal of Banking and Finance, 32, 2293–2306.
12. Hoyle, E., Hughston, L. P. and Macrina, A. (2009), "Lévy Random Bridges and the Modelling of Financial Information", http://arxiv.org/abs/0912.3652v1.
13. Hughston, L. P. (1998), "Inflation Derivatives", Merrill Lynch and King's College London working paper, with added note (2004).
14. Hughston, L. P. and Macrina, A. (2009), "Pricing Fixed-Income Securities in an Information-Based Framework", http://arxiv.org/abs/0911.1610v1.
15. Hunt, P. J. and Kennedy, J. E. (2004), Financial Derivatives in Theory and Practice, Wiley, Chichester.
16. Macrina, A. (2006), "An Information-Based Framework for Asset Pricing: X-factor Theory and its Applications", PhD thesis, King's College London.
17. Mercurio, F. (2005), "Pricing Inflation-Indexed Derivatives", Journal of Quantitative Finance, 5, 289–302.
18. Rogers, L. C. G. (1997), "The Potential Approach to the Term Structure of Interest Rates and Foreign Exchange Rates", Mathematical Finance, 7, 157–176.
19. Rutkowski, M. and Yu, N. (2007), "An Extension of the Brody-Hughston-Macrina Approach to Modelling of Defaultable Bonds", International Journal of Theoretical and Applied Finance, 10, 557–589.

Page 193: Financial Engineering

May 3, 2010 15:41 Proceedings Trim Size: 9in x 6in 007

On Statistical Aspects in Calibrating a Geometric Skewed Stable Asset Price Model∗

Hiroki Masuda

Graduate School of Mathematics, Kyushu University, 744, Motooka, Nishi-ku, Fukuoka, 819-0395, Japan

E-mail: [email protected]

Estimation of an asset price process under the physical measure can be regarded as the first step of the calibration problem, hence is of practical importance. In this article, supposing that a log-price process is expressed by a possibly skewed stable driven model and that a high-frequency dataset over a fixed period is available, we provide practical procedures for estimating the dominating parameters. In particular, the scale parameter may be time-varying and possibly random as long as it is independent of the driving skewed stable Lévy process. By means of the scaling property and realized bipower variations, it is possible to estimate the index and positivity (skewness) parameters without specific information on the scale process. When the target scale parameter is constant, our estimators are asymptotically normally distributed, the rate of convergence being $\sqrt{n}$. When the scale is actually time-varying, we focus on estimation of the integrated scale, which is an analogue of the integrated volatility in the Brownian-semimartingale framework. In this case we show that estimation of the integrated scale exhibits a kind of asymptotic singularity with respect to the unknown index parameter, with the rate of convergence being the slower $\sqrt{n}/\log n$.

Keywords: High-frequency sampling, parameter estimation, skewed stable Lévy process.

∗This work was partly supported by Grant-in-Aid for Young Scientists (B) of Japan, and Cooperative Research Program of the Institute of Statistical Mathematics.


1. Introduction

Nowadays there exists a vast amount of option-pricing theories for many kinds of underlying asset price processes, which depend on either finite- or infinite-dimensional unknown parameters. Typically, we are first given an underlying asset price process whose law is governed by a physical measure (real world), and then construct a risk-neutral measure under which a price formula is provided through a change of measure. To apply the theories in practice, we are inevitably forced to calibrate the model in question. Then, the first key step would be to estimate the structure of the underlying asset price process based on observed return data.

In this article, we address the estimation problem for a class of asset price models driven by a possibly skewed stable Lévy process. Specifically, we provide simple recipes for estimating the parameters governing the law of the log-price process $X = \log S$, where $S$ denotes a univariate asset price process: recall that for a semimartingale $X$ without continuous local martingale part it follows from Itô's formula that

\[
dS_t = S_{t-}\left\{ dX_t + \left(e^{\Delta X_t} - 1 - \Delta X_t\right) \right\}
\]

with some positive initial variable $S_0$, where $\Delta X_t := X_t - X_{t-}$ denotes the jump of $X$ at time $t$. We model $X$ as a stochastic integral of a positive process $\sigma$ independent of the integrator $Z$, a skewed stable Lévy process with finite mean. Our model includes the so-called geometric stable Lévy process, where $\sigma$ is constant. Undoubtedly, Lévy processes, which form the continuous-time counterpart of discrete-time random walks, serve as a building block for continuous-time modelling of financial data. We refer the reader to, among others, Bertoin [5] and Sato [16] for systematic accounts of Lévy processes. Recently, Miyahara and Moriwaki [15] (see also Fujiwara and Miyahara [9]) introduced an option-pricing model based on the geometric stable Lévy process and the minimal entropy martingale measure, and showed its usability to, e.g., reproduce the volatility smile/smirk properties.

Our estimation procedure utilizes empirical-sign statistics and realized multipower variations (MPV for short). Its implementation is simple and requires no hard numerical optimization, hence it is preferable in practice. Using MPVs essentially amounts to the classical method of moments with possibly random targets. Some authors have studied the asymptotic behavior of MPVs for estimating integrated-$\sigma$ quantities: Barndorff-Nielsen and Shephard [4] for centered and symmetric stable $Z$, Woerner [18, 20] for general $Z$ admitting a nearly symmetric Lévy density near the origin. The independence between $\sigma$ and $Z$ was crucial in these papers. On the other hand, Corcuera et al. [7] treated realized power variation for general strictly stable $Z$ with $\sigma$ not necessarily independent of $Z$.

Concerning joint estimation of the stable Lévy processes based on high-frequency data, Masuda [13] considered a joint estimation of the index, scale, and location parameters in the case of symmetric Lévy density. There it was shown that the sample median based estimator of the location combined with a variant of the central limit theorem led to full-joint estimators, which are asymptotically normal with finite and nondegenerate asymptotic covariance matrices. In particular, the sample median based estimator turned out to be rate-efficient. Our model setup in this article does not contain the drift parameter (presupposed to be zero), but instead allows possible skewness.

This article is organized as follows. Our model setup and objectives are described in Section 2. Section 3 presents our estimation procedures. Small simulation results are reported in Section 4. Concluding remarks are given in Section 5.

2. Setup

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\in[0,1]}, P)$ be an underlying probability space, which is supposed to be rich enough to carry all the random variables and processes appearing below, and to make all the random processes adapted. We denote by $E$ the expectation operator. For convenience, we start by describing some basic facts concerning the stable distributions and stable Lévy processes.

Denote by $S_\alpha(\rho, \sigma)$ the possibly skewed stable distribution without drift, the characteristic function of which is given by

\[
u \mapsto \exp\left\{ -\sigma |u|^\alpha \left( 1 - i\,\mathrm{sgn}(u) \tan \alpha\pi\left(\rho - \tfrac{1}{2}\right) \right) \right\}, \quad u \in \mathbb{R}. \tag{1}
\]

The dominating parameters $\alpha$, $\rho$, and $\sigma$ correspond to:

• the stable-index parameter $\alpha \in (1,2)$;

• the positivity parameter $\rho$ fulfilling that $1 - 1/\alpha < \rho < 1/\alpha$; and

• the scale parameter $\sigma > 0$.

We here rule out the "infinite-mean" case (i.e. $\alpha \in (0,1]$), and also the case of "one-sided jumps" (i.e. either $\rho = 1 - 1/\alpha$ or $1/\alpha$) from our scope; in many cases, this restriction is non-fatal for realistic modelling in finance.
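As a small illustration of the parametrization (1), the following Python sketch evaluates the characteristic function of $S_\alpha(\rho,\sigma)$ at a real point. The function name `stable_cf` and its argument order are illustrative choices, not notation from the article, and the sketch does not check the parameter restrictions listed above.

```python
import cmath
import math


def stable_cf(u, alpha, rho, sigma):
    """Characteristic function (1) of S_alpha(rho, sigma) at a real point u.

    Illustrative helper (name and signature are assumptions); the caller is
    expected to respect alpha in (1, 2) and 1 - 1/alpha < rho < 1/alpha.
    """
    if u == 0.0:
        return complex(1.0, 0.0)
    sgn = 1.0 if u > 0 else -1.0
    # tan(alpha * pi * (rho - 1/2)) carries the skewness information.
    skew = math.tan(alpha * math.pi * (rho - 0.5))
    return cmath.exp(-sigma * abs(u) ** alpha * (1.0 - 1j * sgn * skew))
```

In the symmetric case $\rho = 1/2$ the tangent term vanishes and the value is the real number $\exp(-\sigma|u|^\alpha)$; for any admissible $\rho$ the modulus is $\exp(-\sigma|u|^\alpha)$.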

Let $\zeta$ stand for a random variable such that $\mathcal{L}(\zeta) = S_\alpha(\rho, \sigma)$. Here and in the sequel, for a random variable $\xi$ we denote its law by $\mathcal{L}(\xi)$. The name "positivity parameter" for $\rho$ comes from the fact that $P[\zeta \ge 0] = \rho$; trivially, the symmetric case corresponds to $\rho = 1/2$. Note that the positivity parameter of $\mathcal{L}(c\zeta)$ is again $\rho$ whatever $c > 0$ is. For future reference, we mention the closed-form expressions of absolute and signed-absolute moments (cf. Kuruoglu [11]): for any $r \in (-1, \alpha)$ and $r' \in (-2,-1) \cup (-1, \alpha)$,

\[
E[|\zeta|^r] = \frac{\Gamma(1 - r/\alpha)}{\Gamma(1 - r)}\, \frac{\cos(r\xi/\alpha)}{\cos(r\pi/2)}\, \frac{\sigma^{r/\alpha}}{|\cos(\xi)|^{r/\alpha}}, \tag{2}
\]
\[
E[|\zeta|^{r'}\mathrm{sgn}(\zeta)] = \frac{\Gamma(1 - r'/\alpha)}{\Gamma(1 - r')}\, \frac{\sin(r'\xi/\alpha)}{\sin(r'\pi/2)}\, \frac{\sigma^{r'/\alpha}}{|\cos(\xi)|^{r'/\alpha}}, \tag{3}
\]

where we wrote

\[
\xi = \alpha\pi\left(\rho - \tfrac{1}{2}\right)
\]

and the symbol $\mathrm{sgn}(u)$ expresses $1$, $0$, $-1$ according as $u > 0$, $= 0$, $< 0$, respectively. We write

\[
\mu_r = \sigma^{-r/\alpha} E[|\zeta|^r] \quad\text{and}\quad \nu_{r'} = \sigma^{-r'/\alpha} E[|\zeta|^{r'}\mathrm{sgn}(\zeta)],
\]

the $r$th absolute and $r'$th signed-absolute moments associated with $S_\alpha(\rho, 1)$, respectively.
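The moment formulas (2) and (3) with $\sigma = 1$ are straightforward to evaluate numerically. The sketch below is one possible transcription using only the standard library; the names `abs_moment` and `signed_moment` are illustrative, and the removable singularities at $r = 1$ (and at $r' = 0$ for the signed moment) are simply excluded rather than handled by a limit.

```python
import math


def _xi(alpha, rho):
    # xi = alpha * pi * (rho - 1/2), as defined after (3).
    return alpha * math.pi * (rho - 0.5)


def abs_moment(r, alpha, rho):
    """mu_r = E|zeta|^r for zeta ~ S_alpha(rho, 1), transcribing (2).

    Assumes -1 < r < alpha and r != 1 (Gamma(1 - r) has a pole at r = 1,
    where the formula has to be read as a limit).
    """
    xi = _xi(alpha, rho)
    return (math.gamma(1 - r / alpha) / math.gamma(1 - r)
            * math.cos(r * xi / alpha) / math.cos(r * math.pi / 2)
            / abs(math.cos(xi)) ** (r / alpha))


def signed_moment(r, alpha, rho):
    """nu_r = E[|zeta|^r sgn(zeta)] for zeta ~ S_alpha(rho, 1), transcribing (3).

    Assumes r in (-2, -1) or (-1, alpha), with r not equal to 0 or 1.
    """
    xi = _xi(alpha, rho)
    return (math.gamma(1 - r / alpha) / math.gamma(1 - r)
            * math.sin(r * xi / alpha) / math.sin(r * math.pi / 2)
            / abs(math.cos(xi)) ** (r / alpha))
```

Two quick sanity checks: `abs_moment(0.0, alpha, rho)` equals 1, and for very small positive `r` the value of `signed_moment(r, alpha, rho)` is close to $2\rho - 1 = E[\mathrm{sgn}(\zeta)]$.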

The most familiar parametrization of the stable distribution would be, instead of (1),

\[
u \mapsto \exp\left\{ -(\sigma|u|)^\alpha \left( 1 - i\beta\,\mathrm{sgn}(u) \tan\frac{\alpha\pi}{2} \right) \right\},
\]

where the skewness parameter fulfils $\beta \in (-1,1)$, the symmetric case corresponding to $\beta = 0$; as such, $\rho$ and $\beta$ have the one-to-one relation

\[
\tan\left\{ \alpha\pi\left(\rho - \tfrac{1}{2}\right) \right\} = \beta \tan\frac{\alpha\pi}{2}.
\]

Also, regarding $\rho$ as a function of $\beta$ (for any fixed $\alpha \in (1,2)$), it can be seen that $\rho$ is monotonically decreasing on $(-1,1)$. Hence $\rho - 1/2$ and $\beta$ have opposite signs for $\alpha \in (1,2)$, which is not the case for $\alpha \in (0,1)$; Figure 1 illustrates this point, where also included just for comparison is the case of $\alpha = 0.8$. Interested readers can consult Zolotarev [21] for more details concerning one-dimensional stable distributions; see also Borak et al. [6].
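The relation between $\rho$ and $\beta$ can be inverted explicitly. The following sketch assumes that the principal branch of the arctangent is the right one, which holds because $\alpha\pi(\rho - 1/2)$ lies in $(-\pi/2, \pi/2)$ under the parameter restrictions of this section; the function name `beta_to_rho` is illustrative.

```python
import math


def beta_to_rho(alpha, beta):
    """Map the skewness parameter beta to the positivity parameter rho.

    Solves tan(alpha*pi*(rho - 1/2)) = beta * tan(alpha*pi/2) for rho using
    the principal branch of arctan (an assumption justified by
    alpha*pi*(rho - 1/2) lying in (-pi/2, pi/2)). alpha = 1 is excluded.
    """
    xi = math.atan(beta * math.tan(alpha * math.pi / 2))
    return 0.5 + xi / (alpha * math.pi)
```

For example, with $\alpha = 1.5$ a positive $\beta$ gives $\rho < 1/2$, whereas with $\alpha = 0.8$ it gives $\rho > 1/2$, in line with the sign discussion above.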

The reason why we have chosen the parametrization (1) is that, as is expected from Figure 1, estimation performance of $\beta$ based on the empirical sign is destabilized for $\alpha$ close to 2. That is to say, a "small" change of the empirical-sign quantity (see Section 3.1.1) leads to a "big" deviation of the estimate of $\beta$ from the true value; this point can be seen from Figure 1, where the curve is gentler for $\alpha$ closer to 2.

Denote by $Z = (Z_t)_{t\in[0,1]}$ a univariate Lévy process starting from the origin such that

\[
\mathcal{L}(Z_t) = S_\alpha(\rho, t), \quad t \in [0,1]. \tag{4}
\]

The image measure of the process $Z$ is completely characterized by the two parameters $\alpha$ and $\rho$. Figure 2 shows two simulated sample paths of $Z$.

For the stable Lévy processes, the (tail-)index $\alpha$ also corresponds to the Blumenthal-Getoor activity index (see, e.g., Sato [16]). In view of (4), we see that the time parameter $t$ directly serves as the scale in the parametrization (1).

[Figure 1. Plots of $\rho$ as a function of $\beta$ for the values $\alpha = 0.8, 1.2, 1.5$, and $1.8$. Horizontal axis: Beta; vertical axis: Rho.]

[Figure 2. Two simulated sample paths of $Z$ of (4) for $\alpha = 1.5$ and $1.8$, with $\beta = -0.5$ and $\sigma_t \equiv 1$; although we drew solid and dashed lines for clarity, they are actually of pure jump in theory. Horizontal axis: time; vertical axis: Sample paths.]

The process $Z$ itself does not accommodate the scale parametrization. Now we introduce a possibly time-varying scale process. Let $\sigma = (\sigma_t)_{t\in[0,1]}$ be a positive càdlàg process (right-continuous and having left-hand side limits) independent of $Z$, such that

\[
P\left[ \int_0^1 \sigma_s^2\, ds < \infty \right] = 1. \tag{5}
\]

Then we consider the process $X = (X_t)_{t\in[0,1]}$ given by

\[
X_t = \int_0^t \sigma_{s-}\, dZ_s
\]

as a model of a univariate log-price process under the physical measure; without loss of generality, we have set $X_0 = 0$. The condition (5) is sufficient to make the stochastic integral well-defined; see, e.g., Applebaum [1] for a general account of stochastic integration. Additionally, for a technical reason, we impose the following structure on $\sigma^\alpha$ (the $\alpha$th power process of $\sigma$), which is borrowed from Barndorff-Nielsen et al. [3] (see also Barndorff-Nielsen et al. [2]):

\[
\sigma_t^\alpha = \sigma_0^\alpha + \int_0^t a_s\, ds + \int_0^t b_{s-}\, dw_s
+ \int_0^t\!\!\int_{\mathbb{R}} h\circ c(s-,z)\,(\mu - \nu)(ds,dz) + \int_0^t\!\!\int_{\mathbb{R}} (c - h\circ c)(s-,z)\,\mu(ds,dz).
\]

Here the ingredients are as follows: $w$ is a standard Wiener process; $\mu$ is a Poisson random measure having the intensity measure $\nu(ds,dz) = ds\,F(dz)$, where $F$ is a $\sigma$-finite measure on $(0,\infty)\times\mathbb{R}$; $a$ and $b$ are real-valued càdlàg processes; $c : \Omega\times[0,\infty)\times\mathbb{R} \to \mathbb{R}$ is a càdlàg process satisfying that (i) $c(s,z) = c(\omega; s, z)$ is $\mathcal{F}_s\otimes\mathcal{B}(\mathbb{R})$-measurable for each $s$, and that (ii) $\sup_{\omega\in\Omega,\, s<S_k(\omega)} |c(\omega; s, z)| \le \psi_k(z)$ for some nonrandom functions $\psi_k(z)$ fulfilling that $\int_{\mathbb{R}} \{1\wedge\psi_k(z)^2\}\, F(dz) < \infty$ and stopping times $S_k$ such that $S_k \to \infty$ a.s.; finally, $h$ is a continuous function on $\mathbb{R}$ with compact support such that $h(x) = x$ near the origin.

Such $\sigma$'s constitute a broad class of the so-called Itô semimartingales, including diffusions with jumps.

Remark 2.1. Extending the present time period $[0,1]$ to $[0,\infty)$, we may equivalently set $X_t = Z_{\int_0^t \sigma_s^\alpha ds}$ provided that the "clock" process $\int_0^t \sigma_s^\alpha\, ds \to \infty$ a.s. for $t \to \infty$. This time-change representation is known to be inherent in the case of stable-Lévy integrators among general Lévy ones; see Kallsen and Shiryaev [10] for details.

Remark 2.2. We have set the target period to $[0,1]$ from the very beginning. However, this point is a matter of no importance: enlarging the length of the period is reflected in making $\int_0^1 \sigma_s^\alpha\, ds$ larger through $\sigma$.

Suppose that we have discrete-time data with sampling mesh $1/n$ over the target period $[0,1]$, where $n$ denotes the sample size; namely, we observe the sequence of log prices

\[
X_{1/n}, X_{2/n}, \dots, X_{(n-1)/n}, X_1.
\]

The log-price model described above is governed by the parameter $(\rho, \alpha, \sigma_\cdot)$ unknown to observers. Note, however, that $(\rho, \alpha, \sigma_\cdot)$ is possibly infinite-dimensional. Ideally we would estimate $\sigma_t$ for each $t \in [0,1]$, but this is beyond the scope of this article; to the best of the author's knowledge, no such result has been obtained in the non-Gaussian stable driven case. Instead, we are going to confine our objective to the following:

(A) estimation of $(\rho, \alpha, \sigma)$ when $\sigma_t \equiv \sigma$ for a positive constant $\sigma$.

(B) estimation of $(\rho, \alpha, \int_0^1 \sigma_s^\alpha\, ds)$ when $\sigma_\cdot$ is actually time-varying (possibly random).

Our goal is to provide an explicit recipe of interval estimation based on the available high-frequency data (i.e., for $n \to \infty$). To this end we are going to derive asymptotic (mixed) normality with specific asymptotic covariance matrix as well as rate of convergence.

Of course the case (A) is formally included in the case (B); however, we need a separate argument to consider the latter. In both cases, we first construct a simple estimator of $(\rho, \alpha)$, leaving $\sigma_\cdot$ unknown. Then, using the estimates, we provide an estimator of $\sigma$ or $\int_0^1 \sigma_s^\alpha\, ds$. Estimation of integrated quantities such as $\int_0^1 \sigma_s^\alpha\, ds$ is already known to be possible in the light of the recently developed theory of MPV for pure-jump processes; see Woerner [20] and references therein. However, to implement the procedure, as a matter of fact we need estimates of $\alpha$ and $\rho$ beforehand. We can avoid this inconvenience since, in our estimation procedure, an estimator of $(\rho, \alpha)$ is first provided without using information on $\sigma$. This is a great advantage of our estimation procedure.

According to the scaling property of the strictly stable distributions and the independence between $\sigma$ and $Z$, we have

\[
\mathcal{L}(X_1 | \sigma) = S_\alpha\left( \rho,\ \int_0^1 \sigma_s^\alpha\, ds \right) \tag{6}
\]

in the case (B). It seems natural to target the integrated scale $\int_0^1 \sigma_s^\alpha\, ds$; the major estimation target in the familiar Brownian semimartingale framework (e.g., Barndorff-Nielsen et al. [2, 3] as well as their references) is the integrated volatility $\int_0^1 \sigma_s^2\, ds$. The author expects that the pricing strategy of Miyahara and Moriwaki [15] for the geometric stable Lévy process remains valid even for the cases of time-varying scale, as long as the option in question is of European type, in which only an expectation of the "terminal" variable (namely, $X_1$ in our framework) is concerned: this is just because, as specified in (6), $\mathcal{L}(X_1|\sigma)$ is exactly stable.

3. Description of Estimation Procedure

3.1 Preliminaries

Write the increments of successive observations as

\[
\Delta_i X = X_{i/n} - X_{(i-1)/n}, \quad i \le n.
\]

Conditional on the process $\sigma$, the random variables $\Delta_i X$ are mutually independent and for each $n \in \mathbb{N}$ and $i \le n$

\[
\mathcal{L}(\Delta_i X | \sigma) = S_\alpha\left( \rho,\ \int_{(i-1)/n}^{i/n} \sigma_s^\alpha\, ds \right).
\]

Before proceeding let us recall two fundamental facts, which are used several times in the sequel without notice.

• Since we are concerned here with the weak property, we may set

\[
\Delta_i X = (\bar\sigma_i / n)^{1/\alpha}\, \zeta_i \quad \text{a.s.},
\]

where $\bar\sigma_i := n \int_{(i-1)/n}^{i/n} \sigma_s^\alpha\, ds$ and $(\zeta_i)$ is an i.i.d. sequence with common law $S_\alpha(\rho, 1)$.

• Let $\Lambda_n$ be a sequence of essentially bounded functionals on the product space of the path spaces of $Z$ and $\sigma$, and let $\lambda_n(\sigma) := \int \Lambda_n(\sigma, z)\, P^Z(dz)$, where $P^\xi$ denotes the image measure of a variable $\xi$. Suppose $\lambda_n(\sigma) \to_p \lambda_0(\sigma)$ for some functional $\lambda_0$ on the path space of $\sigma$, where $\to_p$ denotes the convergence in probability. In view of the independence between $Z$ and $\sigma$, a disintegration argument gives $\lambda_n(\sigma) = E[\Lambda_n(\sigma, Z)|\sigma]$ a.s.; moreover, the boundedness of $\{\lambda_n(\sigma)\}_{n\in\mathbb{N}}$ yields convergence of moments, namely, $E[\Lambda_n(\sigma, Z)] = \int \lambda_n(\sigma)\, P^\sigma(d\sigma) \to \int \lambda_0(\sigma)\, P^\sigma(d\sigma)$. That is to say, we may actually treat $\sigma$ as a nonrandom process in the process of deriving weak limit theorems. In particular, if some functionals $S_n(\sigma_0, Z)$ with fixed $\sigma_0$ are asymptotically centered normal with covariance matrix $V(\sigma_0)$, then it automatically follows that the limit distribution of $S_n(\sigma, Z)$ has the characteristic function $u \mapsto \int \exp\{-u^\top V(\sigma) u / 2\}\, P^\sigma(d\sigma)$, a mixed normal if $\sigma$ is random.

These are trivial, but crucial in our study.¹

As mentioned before, first we construct concrete estimators of $\rho$ and $\alpha$ in this order without any further information on the scale process $\sigma_\cdot$ (Section 3.2), and then, using the estimates of $\rho$ and $\alpha$ so obtained, we give estimators of the remaining $\sigma$ or $\int_0^1 \sigma_s^\alpha\, ds$ according as the cases (A) or (B), respectively (Sections 3.3 and 3.4). For later use, in the rest of this subsection we give some background information on the empirical-sign statistics and MPVs.

3.1.1 Expression of empirical-sign statistics

Let $H_n := n^{-1} \sum_{i=1}^n \mathrm{sgn}(\Delta_i X)$; then $H_n = n^{-1} \sum_{i=1}^n \mathrm{sgn}(\zeta_i) \to_p E[\mathrm{sgn}(\zeta_1)] = 2\rho - 1$. Hence

\[
\hat\rho_n := \frac{1}{2}(H_n + 1) \tag{7}
\]

serves as a consistent estimator of $\rho$. Since

\[
\sqrt{n}(\hat\rho_n - \rho) = \sum_{i=1}^n \frac{1}{2\sqrt{n}} \left\{ \mathrm{sgn}(\zeta_i) - (2\rho - 1) \right\}, \tag{8}
\]

we easily deduce the asymptotic normality $\sqrt{n}(\hat\rho_n - \rho) \to_d N_1(0, \rho(1-\rho))$, where the symbol $\to_d$ stands for the weak convergence. It is nice that the asymptotic variance only depends on $\rho$, as it directly enables us to provide a confidence interval for $\rho$. Despite its simplicity, the estimator exhibits unexpectedly good finite-sample performance; see Section 4.

¹ Moreover, if necessary in the proof, we may suppose that $(\sigma_t)_{t\in[0,1]}$ is bounded from above and bounded away from zero without loss of generality: this follows from the localization arguments as in Barndorff-Nielsen et al. [3].
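The estimator (7) and the interval implied by the asymptotic variance $\rho(1-\rho)$ can be sketched in a few lines. The function name `estimate_rho`, the default normal quantile `z = 1.96` (a nominal 95% level), and the plug-in of $\hat\rho_n$ into the variance are illustrative assumptions rather than choices made in the article.

```python
import math


def estimate_rho(increments, z=1.96):
    """Empirical-sign estimator (7) with a plug-in asymptotic interval.

    increments: sequence of observed increments Delta_i X.
    z: normal quantile for the interval (1.96 is an assumed default).
    Returns (rho_hat, (lower, upper)).
    """
    n = len(increments)
    # H_n is the average of sgn(Delta_i X), with sgn(0) = 0.
    h_n = sum((x > 0) - (x < 0) for x in increments) / n
    rho_hat = 0.5 * (h_n + 1.0)
    # sqrt(n) (rho_hat - rho) is asymptotically N(0, rho (1 - rho)).
    half_width = z * math.sqrt(rho_hat * (1.0 - rho_hat) / n)
    return rho_hat, (rho_hat - half_width, rho_hat + half_width)
```

Only the signs of the increments enter, so no knowledge of $\alpha$ or of the scale process is needed at this step.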

Perhaps the simplest possible estimator of $\rho$ is not (7) but $n^{-1} \sum_{i=1}^n I(\Delta_i X \ge 0)$, where $I(A)$ denotes the indicator function of an event $A$. The reason why we chose (7) is that, thanks to (3), it directly leads to an explicit asymptotic covariance with the estimators of the remaining parameters. Moreover, the asymptotic variance of $n^{-1} \sum_{i=1}^n I(\Delta_i X \ge 0)$ is $\rho(1-\rho)$, which is the same as that of (7). See Section 3.2 for details.

Remark 3.1. There are other possible ways to construct an estimate of $\rho$, for example, the method of moments based on $E[|\zeta|^q]$ together with $E[\zeta^{\langle q\rangle}]$, where $\mathcal{L}(\zeta) = S_\alpha(\rho, 1)$ (see Kuruoglu [11]). However, in this case the asymptotic variance of the resulting estimator must depend on the true value of $\alpha$.

Remark 3.2. It may be expected that there is no other Lévy process than the stable one for which we can consistently estimate the "degree of skewness" in such a simple way. For instance, the familiar generalized hyperbolic Lévy process has a skewness parameter, but it can be consistently estimated only when we target the long-term asymptotics; see, e.g., Woerner [19].

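3.1.2 Expression of normalized MPV

Fix an $m \in \mathbb{N}$, and let $r = (r_l)_{l=1}^m$ be such that $r_l \ge 0$, $r_+ := \sum_{l=1}^m r_l > 0$, and $\max_{l\le m} r_l < \alpha/2$. Then we define the $r$th MPV as

\[
M_n(r) := \frac{1}{n} \sum_{i=1}^{n-m+1} \prod_{l=1}^{m} \left| n^{1/\alpha} \Delta_{i+l-1} X \right|^{r_l}. \tag{9}
\]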
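Definition (9) translates directly into a double loop. The sketch below is a plain transcription; the function name `multipower_variation` is illustrative, and `alpha` has to be supplied by the caller (in practice an estimate), since the normalization $n^{1/\alpha}$ depends on it.

```python
def multipower_variation(increments, r, alpha):
    """Normalized multipower variation M_n(r) of (9).

    increments: sequence Delta_1 X, ..., Delta_n X.
    r: tuple (r_1, ..., r_m) of nonnegative powers.
    alpha: stable index used in the normalization n^{1/alpha}.
    """
    n = len(increments)
    m = len(r)
    scale = n ** (1.0 / alpha)
    total = 0.0
    # i runs over 1, ..., n - m + 1 in the notation of (9) (0-based here).
    for i in range(n - m + 1):
        term = 1.0
        for l in range(m):
            term *= abs(scale * increments[i + l]) ** r[l]
        total += term
    return total / n
```

For instance, `multipower_variation(dx, (2 * q, 0.0), alpha)` and `multipower_variation(dx, (q, q), alpha)` give the power and bipower variations that appear in Section 3.2.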

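By the equivalent expression of $(\Delta_i X)$, we may replace "$|n^{1/\alpha}\Delta_{i+l-1}X|^{r_l}$" in the right-hand side of (9) by "$\bar\sigma_{i+l-1}^{r_l/\alpha} |\zeta_{i+l-1}|^{r_l}$". Let

\[
\sigma^*_q := \int_0^1 \sigma_s^q\, ds
\]

for $q > 0$ and $\mu(r) := \prod_{l=1}^m \mu_{r_l}$. Here we prepare a first-order stochastic expansion useful for our goal.

Observe that

\[
\sqrt{n}\left\{ M_n(r) - \mu(r)\sigma^*_{r_+} \right\} = \sum_{i=1}^{n-m+1} \frac{1}{\sqrt{n}}\, \chi'_{ni}(r) + R_n(r),
\]

where

\[
\chi'_{ni}(r) := \left( \prod_{l=1}^m \bar\sigma_{i+l-1}^{r_l/\alpha} \right) \left( \prod_{l=1}^m |\zeta_{i+l-1}|^{r_l} - \mu(r) \right),
\]
\[
R_n(r) := \mu(r) \left\{ \sum_{i=1}^{n-m+1} \frac{1}{\sqrt{n}} \left( \prod_{l=1}^m \bar\sigma_{i+l-1}^{r_l/\alpha} - \sigma^{r_+}_{(i-1)/n} \right)
+ \sum_{i=1}^{n-m+1} \sqrt{n} \int_{(i-1)/n}^{i/n} \left( \sigma^{r_+}_{(i-1)/n} - \sigma^{r_+}_s \right) ds \right\}
+ O\left( \frac{1}{\sqrt{n}} \right).
\]

From the same argument as in Woerner [20] together with Barndorff-Nielsen et al. [3] (see also Masuda [14]), we can deduce that $R_n(r) \to_p 0$. Similarly, straightforward but rather messy computations lead to

\[
\sum_{i=1}^{n-m+1} \frac{1}{\sqrt{n}}\, \chi'_{ni}(r) = \sum_{i=m}^{n} \frac{1}{\sqrt{n}}\, \chi_{ni}(r) + o_p(1),
\]

where

\[
\chi_{ni}(r) := \left( \prod_{l=1}^m \bar\sigma_{i-m+l}^{r_l/\alpha} \right) \sum_{q=1}^m \left( \prod_{l=1}^{q-1} |\zeta_{i+l-q}|^{r_l} \right) \left( \prod_{l=q+1}^m \mu_{r_l} \right) \left( |\zeta_i|^{r_q} - \mu_{r_q} \right).
\]

In summary, we have

\[
\sqrt{n}\left\{ M_n(r) - \mu(r)\sigma^*_{r_+} \right\} = \sum_{i=m}^{n} \frac{1}{\sqrt{n}}\, \chi_{ni}(r) + o_p(1). \tag{10}
\]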

3.1.3 A basic limit result

Building on the arguments above, we now derive a basic distributional result. Let $r = (r_l)_{l=1}^m$ be as before, and also let $r' = (r'_l)_{l=1}^m$ be another vector fulfilling the same conditions as $r$. In what follows we set

\[
r_+ = r'_+ = p \tag{11}
\]

for some $p > 0$; this setting is enough for both (A) and (B). We here derive the limit distribution (normal conditional on $\sigma$) of the random vectors

\[
S_n(r, r') := \sqrt{n} \begin{pmatrix} H_n - (2\rho - 1) \\ M_n(r) - \mu(r)\sigma^*_p \\ M_n(r') - \mu(r')\sigma^*_p \end{pmatrix},
\]

which serves as a basic tool for our purpose.

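In view of (8) and (10), it follows that $S_n(r, r')$ admits the stochastic expansion

\[
S_n(r, r') = \sum_{i=m}^{n} \frac{1}{\sqrt{n}} \begin{pmatrix} \mathrm{sgn}(\zeta_i) - (2\rho - 1) \\ \chi_{ni}(r) \\ \chi_{ni}(r') \end{pmatrix} + o_p(1) =: \sum_{i=m}^{n} \frac{1}{\sqrt{n}}\, \gamma_{ni} + o_p(1).
\]

For the leading term $\sum_{i=m}^n n^{-1/2}\gamma_{ni}$, we can apply a central limit theorem either for finite-order dependent arrays or for martingale difference arrays. Here we formally use the latter, where the underlying filtration may be taken as $\{\mathcal{G}_{ni}\}_{i\le n}$ with $\mathcal{G}_{ni} := \sigma(\zeta_j : j \le i)$; recall that we are now regarding $\sigma$ as a nonrandom process. The Lindeberg condition readily follows from the condition

\[
\max_{l \le m} (r_l \vee r'_l) < \frac{\alpha}{2},
\]

hence it suffices to compute the quadratic variation. Therefore we are left with finding the limits in probability of $n^{-1}\sum_{i=m}^n E[\gamma_{ni}\gamma_{ni}^\top | \mathcal{G}_{n,i-1}]$. After lengthy computation, it turns out that, under the regularity conditions imposed on $\sigma$,

\[
\frac{1}{n} \sum_{i=m}^{n} E\left[ \gamma_{ni}\gamma_{ni}^\top \,\middle|\, \mathcal{G}_{n,i-1} \right] \to_p \Sigma(\rho, \alpha, \sigma_\cdot) :=
\begin{pmatrix}
4\rho(1-\rho) & A(r)\sigma^*_{r_+} & A(r')\sigma^*_{r'_+} \\
 & B(r,r)\sigma^*_{2r_+} & B(r,r')\sigma^*_{r_+ + r'_+} \\
\text{sym.} & & B(r',r')\sigma^*_{2r'_+}
\end{pmatrix},
\]

where we conveniently wrote

\[
A(r) = \sum_{q=1}^m \left( \prod_{1\le l\le m,\ l\ne q} \mu_{r_l} \right) \left\{ \nu_{r_q} - (2\rho - 1)\mu_{r_q} \right\},
\]
\[
B(r, r') = \prod_{l=1}^m \mu_{r_l + r'_l} - (2m-1) \prod_{l=1}^m \mu_{r_l}\mu_{r'_l}
+ \sum_{q=1}^{m-1} \left\{ \left( \prod_{l=1}^{m-q} \mu_{r'_l} \right) \left( \prod_{l=m-q+1}^{m} \mu_{r'_l + r_{l-m+q}} \right) \left( \prod_{l=q+1}^{m} \mu_{r_l} \right)
+ \left( \prod_{l=1}^{m-q} \mu_{r_l} \right) \left( \prod_{l=m-q+1}^{m} \mu_{r_l + r'_{l-m+q}} \right) \left( \prod_{l=q+1}^{m} \mu_{r'_l} \right) \right\},
\]

with obvious analogues $A(r')$, $B(r,r)$, and $B(r',r')$. Thus we arrive at

\[
S_n(r, r') \to_d N_3\left( 0, \Sigma(\rho, \alpha, \sigma_\cdot) \right), \tag{12}
\]

which implies that the limit distribution of $S_n(r, r')$ is a normal scale mixture conditional on $\sigma$ with conditional covariance matrix $\Sigma(\rho, \alpha, \sigma_\cdot)$. Here we note that $\Sigma(\rho, \alpha, \sigma_\cdot)$ depends on the process $\sigma_\cdot$ only through the integrated quantities $\sigma^*_{r_+}$, $\sigma^*_{r'_+}$, $\sigma^*_{2r_+}$, $\sigma^*_{2r'_+}$, and $\sigma^*_{r_+ + r'_+}$.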
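As a worked check of the formula for $B(r, r')$, consider the bipower case $m = 2$ with the illustrative choice $r = (2q, 0)$ and $r' = (q, q)$ for some $q > 0$ with $4q < \alpha$ (so that $\mu_{4q}$ is finite), using $\mu_0 = 1$. The sum over $q$ in the definition then has a single term, and substituting gives

\[
\begin{aligned}
B(r, r) &= \mu_{4q} - 3\mu_{2q}^2 + 2\mu_{2q}^2 = \mu_{4q} - \mu_{2q}^2, \\
B(r', r') &= \mu_{2q}^2 - 3\mu_q^4 + 2\mu_q^2\mu_{2q}, \\
B(r, r') &= \mu_{3q}\mu_q - 3\mu_{2q}\mu_q^2 + \left( \mu_q\mu_{3q} + \mu_{2q}\mu_q^2 \right) = 2\mu_q\mu_{3q} - 2\mu_q^2\mu_{2q}.
\end{aligned}
\]

The first line is the variance of $|\zeta_1|^{2q}$, and the third equals $\mathrm{Cov}(|\zeta_1|^{2q}, |\zeta_1|^q|\zeta_2|^q) + \mathrm{Cov}(|\zeta_2|^{2q}, |\zeta_1|^q|\zeta_2|^q)$, as one would expect from a direct computation with i.i.d. $\zeta_i$.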

.

Having the basic convergence (12) in hand, we now turn to our main objectives, (A) and (B) mentioned in Section 2.


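3.2 Joint Asymptotic (Mixed) Normality

Given a $p > 0$ and $(r, r')$ (recall that we are assuming (11)), we write $(\hat\rho_n, \hat\alpha_{p,n}, \hat\sigma^*_{p,n})$ for the random root of

\[
\begin{pmatrix} H_n - (2\rho - 1) \\ M_n(r) - \mu(r)\sigma^*_p \\ M_n(r') - \mu(r')\sigma^*_p \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \tag{13}
\]

For the moment we suppose that such a root indeed exists. We introduce the function

\[
F(\rho, \alpha, s) := \begin{pmatrix} 2\rho - 1 \\ \mu(r)s \\ \mu(r')s \end{pmatrix}.
\]

Now let us recall (2) with $\sigma = 1$. As we are assuming that $\alpha \in (1,2)$ and $1 - 1/\alpha < \rho < 1/\alpha$, we have $\xi \in (-\pi/2, \pi/2)$, so that $\cos(\xi) > 0$. Hence the quantities $\mu(r)$ and $\mu(r')$ are continuously differentiable with respect to $(\rho, \alpha)$. Let $D_\rho(r) := \frac{\partial}{\partial\rho}\mu(r)$ and $D_\alpha(r) := \frac{\partial}{\partial\alpha}\mu(r)$: here, the variable "$s$" is supposed to be independent of $(\rho, \alpha)$. Trivially,

\[
\nabla F(\rho, \alpha, s) = \begin{pmatrix} 2 & 0 & 0 \\ sD_\rho(r) & sD_\alpha(r) & \mu(r) \\ sD_\rho(r') & sD_\alpha(r') & \mu(r') \end{pmatrix},
\]

which is nonsingular for each $s > 0$ as soon as

\[
\mu(r')D_\alpha(r) \ne \mu(r)D_\alpha(r'). \tag{14}
\]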
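To see why (14) is the right condition, one can expand the determinant of $\nabla F$ along its first row:

\[
\det \nabla F(\rho, \alpha, s) = 2 \det \begin{pmatrix} sD_\alpha(r) & \mu(r) \\ sD_\alpha(r') & \mu(r') \end{pmatrix} = 2s \left\{ \mu(r')D_\alpha(r) - \mu(r)D_\alpha(r') \right\},
\]

which is nonzero for $s > 0$ exactly when (14) holds.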

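Again let us recall that we may proceed as if $\sigma$ is nonrandom. The classical delta method (e.g., van der Vaart [17]) yields that, if (14) holds true, then

\[
\sqrt{n} \begin{pmatrix} \hat\rho_n - \rho \\ \hat\alpha_{p,n} - \alpha \\ \hat\sigma^*_{p,n} - \sigma^*_p \end{pmatrix} \to_d N_3(0, V(\rho, \alpha, \sigma_\cdot)), \tag{15}
\]

where

\[
V(\rho, \alpha, \sigma_\cdot) := \nabla F(\rho, \alpha, \sigma^*_p)^{-1}\, \Sigma(\rho, \alpha, \sigma_\cdot)\, \nabla F(\rho, \alpha, \sigma^*_p)^{-1,\top}.
\]

We see that $\Sigma(\rho, \alpha, \sigma_\cdot)$ here depends on $\sigma$ only through $\sigma^*_p$ and $\sigma^*_{2p}$; hence, more specifically we may write $\Sigma(\rho, \alpha, \sigma_\cdot) = \Sigma(\rho, \alpha, \sigma^*_p, \sigma^*_{2p})$, and accordingly, $V(\rho, \alpha, \sigma_\cdot) = V(\rho, \alpha, \sigma^*_p, \sigma^*_{2p})$. We should note that the function $V(\rho, \alpha, \sigma^*_p, \sigma^*_{2p})$ is fully explicit as a function of its four arguments.

Now we set $m = 2$ and consider $r = (2q, 0)$ and $r' = (q, q)$ for a $q > 0$ (hence $p = 2q$). In order to make (12) valid, we need $q < \alpha/4$: as we are assuming that $\alpha \in (1,2)$, a naive choice is $q = 1/4$ (see Remark 3.3 below).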


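Let us mention the computation of the solution to (13). We already have a closed-form solution $\hat\rho_n$ in (7). As for $\hat\alpha_{p,n}$, we can conveniently utilize the second and third components of (13): write $\hat\mu(\cdot)$ for the $\mu(\cdot)$ with $(\rho, \alpha)$ replaced by $(\hat\rho_n, \hat\alpha_{p,n})$, and then consider the estimating equation $M_n(q,q)/M_n(2q,0) = \hat\mu(q,q)/\hat\mu(2q,0)$, which can be rewritten as

\[
\frac{\sum_{i=1}^{n-1} |\Delta_i X|^q |\Delta_{i+1} X|^q}{\sum_{i=1}^{n} |\Delta_i X|^{2q}} = C_1(q)\, C_2(q, \hat\rho_n)\, \frac{\Gamma(1 - q/\hat\alpha_{p,n})^2}{\Gamma(1 - 2q/\hat\alpha_{p,n})}, \tag{16}
\]

where, having $\hat\rho_n$ beforehand, we can regard

\[
C_1(q) := \frac{\Gamma(1-2q)\cos(q\pi)}{\{\Gamma(1-q)\cos(q\pi/2)\}^2} \quad\text{and}\quad C_2(q, \hat\rho_n) := \frac{[\cos\{q\pi(\hat\rho_n - 1/2)\}]^2}{\cos\{2q\pi(\hat\rho_n - 1/2)\}}
\]

as constants. Since the function

\[
\alpha \mapsto \frac{\Gamma(1 - q/\alpha)^2}{\Gamma(1 - 2q/\alpha)} \tag{17}
\]

is strictly monotone on $(1,2)$, it is easy to search for the root $\hat\alpha_{p,n}$. Clearly, the root does uniquely exist with probability tending to one.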
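Because (17) is strictly monotone, the root of (16) can be located by plain bisection. The following sketch assumes the sample ratio on the left-hand side of (16) has already been computed; the function names, the default `q = 0.25`, and the tolerance are illustrative choices, and a `ValueError` is raised when the sample ratio falls outside the range attainable for $\alpha \in (1,2)$.

```python
import math


def gamma_ratio(alpha, q):
    # The function (17): Gamma(1 - q/alpha)^2 / Gamma(1 - 2q/alpha).
    return math.gamma(1 - q / alpha) ** 2 / math.gamma(1 - 2 * q / alpha)


def constant_factor(q, rho_hat):
    # C_1(q) * C_2(q, rho_hat) as defined after (16).
    c1 = (math.gamma(1 - 2 * q) * math.cos(q * math.pi)
          / (math.gamma(1 - q) * math.cos(q * math.pi / 2)) ** 2)
    a = q * math.pi * (rho_hat - 0.5)
    c2 = math.cos(a) ** 2 / math.cos(2 * a)
    return c1 * c2


def solve_alpha(sample_ratio, rho_hat, q=0.25, tol=1e-12):
    """Bisection for the root of (16) on (1, 2); names are illustrative."""
    target = sample_ratio / constant_factor(q, rho_hat)
    lo, hi = 1.0, 2.0
    f_lo = gamma_ratio(lo, q) - target
    f_hi = gamma_ratio(hi, q) - target
    if f_lo * f_hi > 0:
        raise ValueError("no root of (16) in (1, 2) for this sample ratio")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = gamma_ratio(mid, q) - target
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

Feeding in the exact population ratio, `constant_factor(q, rho) * gamma_ratio(alpha, q)`, recovers `alpha` up to the tolerance, which is a convenient self-check. Comparing `gamma_ratio(2, q) - gamma_ratio(1, q)` for different `q` also shows how the attainable range shrinks as `q` decreases.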

Remark 3.3. We see that the range of the function (17) becomes narrower for smaller $q$, so that the root $\hat\alpha_{p,n}$ becomes too sensitive to a small change of the sample quantity in the left-hand side of (16). This implies that, for smaller $q$, the law of large numbers for the sample quantity must be in force with a high degree of accuracy.

Thus, given a $p = 2q > 0$, we could get the estimates $\hat\rho_n$ and $\hat\alpha_{p,n}$ without special information on $\sigma$, which may be time-varying and random as long as the regularity conditions on $\sigma$ imposed in Section 2 hold true. It is important here that we have used the bipower variation in part; the procedure using the first and second empirical moments as in Masuda [13] is valid only when $\sigma$ is constant.

The present asymptotic covariance matrix is $V(\rho, \alpha, \sigma^*_{2q}, \sigma^*_{4q})$, for which we want to provide a consistent estimator. We only need to give consistent estimators of $\sigma^*_{2q}$ and $\sigma^*_{4q}$; recall that we need

\[
4q < \alpha
\]

in order to make the distributional result (15) with $p = 2q$ valid. For instance, we can proceed as follows. First, (15) with $p = 2q$ implies that $M_n(2q, 0) \to_p \mu(2q, 0)\sigma^*_{2q}$. Using the estimates $(\hat\rho_n, \hat\alpha_{p,n})$ and the continuous mapping theorem, we deduce that $M_n(2q,0)/\hat\mu(2q,0)$ is a consistent estimator of $\sigma^*_{2q}$. We should notice the dependence of $M_n(2q,0)$ on $\alpha$ (recall (9)): $M_n(2q,0) = n^{2q/\alpha - 1}\sum_{i=1}^n |\Delta_i X|^{2q}$. Nevertheless, as in Masuda [13], we see that the $\alpha$ can be replaced by $\hat\alpha_{p,n}$ since we already know that $\sqrt{n}(\hat\alpha_{p,n} - \alpha) = O_p(1)$. Therefore,

\[
\hat\sigma^*_{2q,n} := \frac{n^{2q/\hat\alpha_{p,n} - 1}}{\hat\mu(2q, 0)} \sum_{i=1}^{n} |\Delta_i X|^{2q} \to_p \sigma^*_{2q}. \tag{18}
\]

Once again, let us recall that $\hat\mu(2q,0)$ can be easily computed in view of (2) with $\sigma = 1$. By the same token, we could deduce that (still under $4q < \alpha$, of course)

\[
\hat\sigma^*_{4q,n} := \frac{n^{4q/\hat\alpha_{p,n} - 1}}{\hat\mu(2q, 2q)} \sum_{i=1}^{n-1} |\Delta_i X|^{2q} |\Delta_{i+1} X|^{2q} \to_p \sigma^*_{4q}.
\]

After all, $V(\hat\rho_n, \hat\alpha_{p,n}, \hat\sigma^*_{2q,n}, \hat\sigma^*_{4q,n})$ can serve as a desired consistent estimator.

Now we are in a position to complete our main objectives (A) and (B).

3.3 Case (A): Geometric Skewed Stable Levy ProcessWhenσt ≡ σ > 0, our model reduces to the geometric skewed stable Levy

process. In this case we can perform a full-joint interval estimation concerningthe dominating (three-dimensional) parameter(ρ ,α,σ) at rate

√n.

We keep using the framework of the last subsection. It directly follows from (15) that

$$\sqrt{n}\begin{pmatrix}\hat\rho_n-\rho\\ \hat\alpha_{p,n}-\alpha\\ (\hat\sigma_{p,n})^p-\sigma^p\end{pmatrix} \to_d N_3(0,V(\rho,\alpha,\sigma)), \tag{19}$$

where $V(\rho,\alpha,\sigma)$ explicitly depends on the three-dimensional parameter $(\rho,\alpha,\sigma)$; recall that $p = 2q < \alpha/2$. Applying the delta method to (19) in order to convert $(\hat\sigma_{p,n})^p$ to $\hat\sigma_{p,n}$, we readily get the asymptotic normality of $\sqrt{n}(\hat\rho_n-\rho, \hat\alpha_{p,n}-\alpha, \hat\sigma_{p,n}-\sigma)$; we omit the details. Our first objective (A) is thus achieved.

In summary, we may proceed with the choice $q = 1/4$ (so $p = 1/2$) as follows.

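1. Compute the estimate $\hat\rho_n$ of $\rho$ by (7).

2. Using the $\hat\rho_n$, find the root $\hat\alpha_{1/2,n}$ of (16).

3. Using $(\hat\rho_n, \hat\alpha_{1/2,n})$ thus obtained, an estimate of $\sigma$ is provided by, e.g. (recall (18)),

$$\hat\sigma_{1/2,n} := \left(\frac{n^{1/(2\hat\alpha_{p,n})-1}}{\mu(1/2,0)} \sum_{i=1}^{n} \sqrt{|\Delta_i X|}\right)^{2}.$$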
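The delta-method conversion from $(\hat\sigma_{p,n})^p$ to $\hat\sigma_{p,n}$, whose details are omitted in the text, can be sketched as follows. The map $g$ and the Jacobian $J$ are notation introduced here for illustration only; the sketch assumes nothing beyond (19).

$$g(\rho,\alpha,s) := (\rho,\alpha,s^{1/p}), \qquad J := \nabla g(\rho,\alpha,\sigma^p) = \mathrm{diag}\left(1,\,1,\,\tfrac{1}{p}\,\sigma^{1-p}\right),$$

$$\sqrt{n}\begin{pmatrix}\hat\rho_n-\rho\\ \hat\alpha_{p,n}-\alpha\\ \hat\sigma_{p,n}-\sigma\end{pmatrix} \to_d N_3\left(0,\; J\,V(\rho,\alpha,\sigma)\,J^\top\right).$$

For the choice $p = 1/2$ the last diagonal entry of $J$ is $2\sigma^{1/2}$, so the limiting variance of $\sqrt{n}(\hat\sigma_{1/2,n}-\sigma)$ would be $4\sigma$ times the $(3,3)$th entry of $V$.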

3.4 Case (B): Time-Varying Scale Process

Now we turn to the case (B). Again by means of the argument given in Section 3.2, it remains to construct an estimator of $\sigma^*_\alpha = \int_0^1 \sigma_s^\alpha\,ds$. The point here is that, different from the case (A), a direct use of (15) is not sufficient to deduce the


distributional result concerning estimating $\sigma^*_\alpha$ because the dependence of $(r, r')$ on $\alpha$ is not allowed there. In order to utilize $M_n(r)$ with $r$ depending on $\alpha$, we need some additional arguments.

Extracting the second row of (12), we have

$$\sqrt{n}\left\{M_n(r) - \mu(r)\sigma^*_{r_+}\right\} \to_d N_1\left(0,\, B(r,r)\sigma^*_{2r_+}\right). \tag{20}$$

In view of the condition $\max_{l\le m} r_l < \alpha/2$, we need (at least) a tripower variation for setting $r_+ = \alpha$. For simplicity, we set $m = 3$ and

$$r = r(\alpha) = \left(\frac{\alpha}{3}, \frac{\alpha}{3}, \frac{\alpha}{3}\right).$$

With this choice, we are going to provide an estimator of $\sigma^*_\alpha$, specifying its rate of convergence and limiting distribution.

Let $M^*_n(\alpha) := M_n(\alpha/3,\alpha/3,\alpha/3)$. In this case the normalizing factor is $n^{r_+/\alpha-1} \equiv 1$, so that

$$M^*_n(\alpha) = \sum_{i=1}^{n-2} \prod_{l=1}^{3} |\Delta_{i+l-1}X|^{\alpha/3},$$

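which is computable as soon as we have an estimate of $\alpha$. We have already obtained the estimator $\hat\alpha_{p,n}$, hence want to use $M^*_n(\hat\alpha_{p,n})$. For this, we have to look at the asymptotic behavior of the gap

$$\sqrt{n}\left\{M^*_n(r(\alpha)) - \mu(r(\alpha))\sigma^*_\alpha\right\} - \sqrt{n}\left\{M^*_n(\hat\alpha_{p,n}) - \mu(r(\hat\alpha_{p,n}))\sigma^*_\alpha\right\},$$

namely, the effect of "plugging in $\hat\alpha_{p,n}$".

By means of Taylor's formula

$$a^x = a^y + (\log a)\,a^y (x-y) + (\log a)^2 \int_0^1 (1-u)\,a^{y+u(x-y)}\,du\,(x-y)^2$$

applied to the function $x \mapsto a^x$ ($x, y, a > 0$), we get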

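$$\begin{aligned}
\sqrt{n}\left\{M^*_n(\hat\alpha_{p,n}) - \mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha\right\}
&= \sqrt{n}\left\{M^*_n(\alpha) - \mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha\right\}
+ \frac{1}{3}\sqrt{n}(\hat\alpha_{p,n}-\alpha) \sum_{i=1}^{n-2} x_i^{\alpha/3}\log x_i \\
&\quad + \left\{\frac{1}{3}\sqrt{n}(\hat\alpha_{p,n}-\alpha)\right\}^2 \frac{1}{\sqrt{n}} \sum_{i=1}^{n-2} (\log x_i)^2 \int_0^1 (1-u)\,x_i^{\{\alpha+u(\hat\alpha_{p,n}-\alpha)\}/3}\,du,
\end{aligned} \tag{21}$$

where we wrote $x_i = \prod_{l=1}^{3} |\Delta_{i+l-1}X|$. We look at the right-hand side of (21) termwise. Let $y_i := \prod_{l=1}^{3} |n^{1/\alpha}\Delta_{i+l-1}X|$.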


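• The first term is $O_p(1)$, as is evident from (20).

• Concerning the second term, since $x_i = n^{-3/\alpha} y_i$, we have

$$\begin{aligned}
\sum_{i=1}^{n-2} x_i^{\alpha/3}\log x_i
&= \frac{1}{n}\sum_{i=1}^{n-2} y_i^{\alpha/3}\log y_i - \frac{3}{\alpha}(\log n)\frac{1}{n}\sum_{i=1}^{n-2} y_i^{\alpha/3} \\
&= O_p(1) - (\log n)\frac{3}{\alpha}\left\{\mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha + O_p\left(\frac{1}{\sqrt{n}}\right)\right\} \\
&= O_p(1) - (\log n)\frac{3}{\alpha}\,\mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha.
\end{aligned}$$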

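• Write the third term as $\{\sqrt{n}(\hat\alpha_{p,n}-\alpha)/3\}^2 T_n$, and let us show that $T_n = o_p(1)$. Fix any $\varepsilon > 0$ and $\varepsilon_0 \in (0,\alpha/2)$ in the sequel. Then

$$P[|T_n| > \varepsilon] \le P[|\hat\alpha_{p,n}-\alpha| > \varepsilon_0] + P\left[|T_n| > \varepsilon,\ |\hat\alpha_{p,n}-\alpha| \le \varepsilon_0\right] =: p'_n + p''_n.$$

Clearly $p'_n \to 0$ by the $\sqrt{n}$-consistency of $\hat\alpha_{p,n}$. As for $p''_n$, we first note that

$$\inf_{u\in[0,1]} \frac{1}{\alpha}\{\alpha + u(\hat\alpha_{p,n}-\alpha)\} \ge 1 - \frac{\varepsilon_0}{\alpha} > 0$$

on the event $|\hat\alpha_{p,n}-\alpha| \le \varepsilon_0$. We estimate $p''_n$ as follows:

$$\begin{aligned}
p''_n &= P\left[|\hat\alpha_{p,n}-\alpha| \le \varepsilon_0,\ \frac{1}{\sqrt{n}}\sum_{i=1}^{n-2} (\log x_i)^2 \int_0^1 (1-u)\,y_i^{\{\alpha+u(\hat\alpha_{p,n}-\alpha)\}/3}\, n^{-\{\alpha+u(\hat\alpha_{p,n}-\alpha)\}/\alpha}\,du > \varepsilon\right] \\
&\le P\left[|\hat\alpha_{p,n}-\alpha| \le \varepsilon_0,\ n^{\varepsilon_0/\alpha-1/2}\,\frac{1}{n}\sum_{i=1}^{n-2} (\log x_i)^2 \int_0^1 (1-u)\,y_i^{\{\alpha+u(\hat\alpha_{p,n}-\alpha)\}/3}\,du > \varepsilon\right] \\
&\le P\left[n^{\varepsilon_0/\alpha-1/2}\,\frac{1}{n}\sum_{i=1}^{n-2} \{(\log n)^2 + (\log y_i)^2\}(1+y_i)^{(\alpha+\varepsilon_0)/3} > C\varepsilon\right] \\
&\le \frac{1}{C\varepsilon}\, n^{\varepsilon_0/\alpha-1/2}(\log n)^2 \to 0
\end{aligned}$$

for some constant $C > 0$. Here we used Markov's inequality in the last inequality; note that $(\alpha+\varepsilon_0)/3 < \alpha/2$, hence the moment does exist.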

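Piecing together these three items and (21), we arrive at the asymptotic relation:

$$\frac{\sqrt{n}}{\log n}\left\{M^*_n(\hat\alpha_{p,n}) - \mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha\right\} = -\frac{1}{\alpha}\,\mu\left(\tfrac{\alpha}{3},\tfrac{\alpha}{3},\tfrac{\alpha}{3}\right)\sigma^*_\alpha \sqrt{n}(\hat\alpha_{p,n}-\alpha) + O_p\left(\frac{1}{\log n}\right). \tag{22}$$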


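Now, recalling (2) we note that the quantity $\mu(\alpha/3,\alpha/3,\alpha/3)$ is a continuously differentiable function of $(\rho,\alpha)$. Write $\mu(\rho,\alpha) = \mu(\alpha/3,\alpha/3,\alpha/3)$. In view of the $\sqrt{n}$-consistency of $(\hat\rho_n, \hat\alpha_{p,n})$ and the delta method, we obtain

$$\mu(\rho,\alpha) = \mu(\hat\rho_n, \hat\alpha_{p,n}) + O_p\left(\frac{1}{\sqrt{n}}\right). \tag{23}$$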

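Substituting (23) in (22) we end up with

$$\frac{\sqrt{n}}{\log n}\left\{\frac{M^*_n(\hat\alpha_{p,n})}{\mu(\hat\rho_n, \hat\alpha_{p,n})} - \sigma^*_\alpha\right\} = -\frac{1}{\alpha}\,\sigma^*_\alpha \sqrt{n}(\hat\alpha_{p,n}-\alpha) + O_p\left(\frac{1}{\log n}\right), \tag{24}$$

which implies that

$$\hat\sigma^*_{\alpha,n} := \frac{M^*_n(\hat\alpha_{p,n})}{\mu(\hat\rho_n, \hat\alpha_{p,n})} \tag{25}$$

serves as a $(\sqrt{n}/\log n)$-consistent estimator of $\sigma^*_\alpha$. Its asymptotic distribution is the centered normal scale mixture with limiting variance being

$$v(\rho,\alpha,\sigma^*_\alpha,\sigma^*_p,\sigma^*_{2p}) := \left(\frac{\sigma^*_\alpha}{\alpha}\right)^2 V_{22}(\rho,\alpha,\sigma^*_p,\sigma^*_{2p}),$$

where $V_{22}$ denotes the $(2,2)$th entry of $V$; recall that $p$ is a parameter-free constant (see Section 3.2). A consistent estimator of $v(\rho,\alpha,\sigma^*_\alpha,\sigma^*_p,\sigma^*_{2p})$ can be constructed by plugging in the estimators of its arguments.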
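As a rough illustration of (25), the tripower statistic $M^*_n(\hat\alpha_{p,n})$ and the resulting estimate can be computed as below. The function name is hypothetical, and the constant $\mu(\hat\rho_n,\hat\alpha_{p,n})$ is taken as a given number because formula (2) is not reproduced in this passage.

```python
import numpy as np

def integrated_scale_estimate(dx, alpha_hat, mu_hat):
    """Sketch of (25): tripower variation with exponent alpha_hat/3, divided by mu_hat.

    dx        : increments Delta_i X, i = 1, ..., n
    alpha_hat : estimate of the stable index
    mu_hat    : value of mu(rho_hat, alpha_hat), supplied by the caller (assumption)
    """
    a = np.abs(np.asarray(dx, dtype=float)) ** (alpha_hat / 3.0)
    # Sum over i = 1, ..., n-2 of the product of three adjacent terms.
    m_star = np.sum(a[:-2] * a[1:-1] * a[2:])
    return m_star / mu_hat
```

No power of $n$ appears here because the normalizing factor is identically 1 for this choice of $r$.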

The stochastic expansion (24) indicates an asymptotic linear dependence of $\sqrt{n}(\hat\alpha_{p,n}-\alpha)$ and $(\sqrt{n}/\log n)(\hat\sigma^*_{\alpha,n}-\sigma^*_\alpha)$. Of course, this occurs even for constant $\sigma$, if we try to estimate $(\alpha,\sigma^\alpha)$ instead of $(\alpha,\sigma)$. The point is that plugging a $\sqrt{n}$-consistent estimator of $\alpha$ into the index $r$ of the MPV $M_n(r)$ slows down estimation of $\sigma^*_\alpha$ from $\sqrt{n}$ to $\sqrt{n}/(\log n)$. It is beyond the scope of this article to explore a better alternative estimator of $\sigma^*_\alpha$.

4. Simulation Experiments

Based on the discussion above, let us briefly observe the finite-sample performance of our estimators. For simplicity, we here focus on nonrandom $\sigma$.

4.1 Case (A)

First, let $\sigma$ be a positive constant, so that $X$ is the geometric skewed stable Lévy process and the parameter to be estimated is $(\rho,\alpha,\sigma)$.

As a simulation design, we set $\alpha$ = 1.2, 1.5, 1.7, and 1.9 with common $\beta = -0.5$ and $\sigma = 1$; hence $(\alpha,\rho)$ = (1.2, 0.7638), (1.5, 0.5984), (1.7, 0.5467), and (1.9, 0.5132). The sample sizes are taken as $n$ = 500, 1000, 2000, and 5000. In


all cases, the tuning parameter $q$ is set to be 1/4, and 1000 independent sample paths of $X$ are generated. Empirical means and empirical s.d.'s are given with the 1000 independent estimates obtained. The results are reported in Table 1. We see that estimation of $(\rho,\alpha)$ is, despite its simplicity, quite reliable. On the other hand, the estimation variance of $\sigma$ is relatively large compared with those of $\rho$ and $\alpha$. Nevertheless, it is clear that the bias is small. Moreover, as $\alpha$ gets close to 2, the performance of $\hat\sigma_n$ becomes better, while that of $(\hat\rho_n, \hat\alpha_{p,n})$ is seemingly unchanged.

In the unreported simulation results, we have observed that a change of $q$ within its admissible region does not lead to a drastic change unless it is too small (see Remark 3.3).
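The $(\alpha,\rho)$ pairs of the simulation design can be cross-checked numerically. The sketch below assumes the standard expression $\rho = P(Z_t > 0) = 1/2 + (\pi\alpha)^{-1}\arctan(\beta\tan(\pi\alpha/2))$ for $\alpha \ne 1$, in the parametrization for which it reproduces the quoted values; the formula is not taken from this passage, and the function name is hypothetical.

```python
import math

def positivity_parameter(alpha, beta):
    """rho = P(Z > 0) for a skewed stable law with index alpha != 1 (assumed parametrization)."""
    return 0.5 + math.atan(beta * math.tan(math.pi * alpha / 2.0)) / (math.pi * alpha)

# Design values quoted in the text: beta = -0.5 throughout.
design = {1.2: 0.7638, 1.5: 0.5984, 1.7: 0.5467, 1.9: 0.5132}
for alpha, rho_quoted in design.items():
    rho = positivity_parameter(alpha, -0.5)
    print(f"alpha = {alpha}: rho = {rho:.4f} (quoted {rho_quoted})")
```

Under this assumption, the index 1.2 (rather than 1.3) is the one consistent with $\rho = 0.7638$.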

Table 1. Estimation results for the true parameters $(\rho,\alpha,\sigma)$ = (0.7638, 1.2, 1), (0.5984, 1.5, 1), (0.5467, 1.7, 1), and (0.5132, 1.9, 1) with the geometric stable Lévy processes. In each case, the empirical mean and standard deviation (in parentheses) are given.

α = 1.2
n      ρ                 α                 σ
500    0.7627 (0.0186)   1.2026 (0.0790)   1.1021 (0.8717)
1000   0.7634 (0.0137)   1.2031 (0.0575)   1.0450 (0.4643)
2000   0.7645 (0.0096)   1.2031 (0.0437)   1.0253 (0.5102)
5000   0.7636 (0.0061)   1.2023 (0.0313)   1.0123 (0.2854)

α = 1.5
n      ρ                 α                 σ
500    0.5988 (0.0222)   1.4929 (0.1030)   1.0751 (0.4066)
1000   0.5981 (0.0162)   1.5010 (0.0757)   1.0289 (0.2549)
2000   0.5986 (0.0106)   1.4986 (0.0564)   1.0284 (0.2355)
5000   0.5984 (0.0073)   1.4983 (0.0364)   1.0169 (0.1516)

α = 1.7
n      ρ                 α                 σ
500    0.5476 (0.0219)   1.6810 (0.1103)   1.0633 (0.2359)
1000   0.5474 (0.0158)   1.6830 (0.0823)   1.0567 (0.1948)
2000   0.5472 (0.0113)   1.6930 (0.0625)   1.0308 (0.1611)
5000   0.5466 (0.0070)   1.6977 (0.0375)   1.0126 (0.1022)

α = 1.9
n      ρ                 α                 σ
500    0.5129 (0.0224)   1.8553 (0.1026)   1.0821 (0.1767)
1000   0.5133 (0.0164)   1.8767 (0.0808)   1.0535 (0.1568)
2000   0.5131 (0.0109)   1.8870 (0.0579)   1.0330 (0.1111)
5000   0.5128 (0.0073)   1.8971 (0.0401)   1.0097 (0.0809)


4.2 Case (B)

Next we observe a case of time-varying but nonrandom scale. We set

$$\sigma^\alpha_t = \frac{2}{5}\left\{\cos(2\pi t) + \frac{3}{2}\right\}, \tag{26}$$

so that $\sigma^*_\alpha = 0.6$.
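The value $\sigma^*_\alpha = 0.6$ can be verified directly, reading (26) as $\sigma^\alpha_t = \frac{2}{5}\{\cos(2\pi t) + \frac{3}{2}\}$. This reading is an assumption about the lost grouping braces; it is the one consistent with both the stated value 0.6 and the 0 to 1 range of the plotted function.

$$\sigma^*_\alpha = \int_0^1 \sigma^\alpha_s\,ds = \frac{2}{5}\int_0^1 \cos(2\pi s)\,ds + \frac{2}{5}\cdot\frac{3}{2} = 0 + \frac{3}{5} = 0.6,$$

and $\sigma^\alpha_t$ ranges over $[\frac{2}{5}\cdot\frac{1}{2},\ \frac{2}{5}\cdot\frac{5}{2}] = [0.2, 1.0]$.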

[Figure 3: horizontal axis "Time" (0.0 to 1.0), vertical axis "Varying scale" (0.0 to 1.0).]

Figure 3. The plot of the function $t \mapsto \sigma^\alpha_t$ given by (26).

With the same choices of $(\rho,\alpha)$, $q$, and $n$ as in the previous case, we obtain the results in Table 2; the estimator of $\sigma^*_\alpha$ here is based on (25). There we can observe a tendency quite similar to that of the previous case.

5. Concluding Remarks

We have studied some statistical aspects of the calibration problem for geometric skewed stable asset price models. Estimation of stable asset price models with possibly time-varying scale can be done easily by means of the simple empirical-sign statistics and MPVs. In particular, we could estimate the integrated scale, which is a natural quantity analogous to the integrated variance in the framework of Brownian semimartingales, with a multistep estimating procedure: we estimate $\rho$, $\alpha$, and $\sigma$ (or $\sigma^*_\alpha$) one by one in this order. Our simulation results say that the finite-sample performance of our estimators is unexpectedly good despite their simplicity, except for a relatively bigger variance in estimating $\sigma$ (or $\sigma^*_\alpha$).

We close by mentioning some possible future issues.

• Throughout we supposed the independence between the scale process $\sigma$ and the driving skewed stable Lévy process $Z$. This may be disappointing


Table 2. Estimation results for the true parameters $(\rho,\alpha)$ = (0.7638, 1.2), (0.5984, 1.5), (0.5467, 1.7), and (0.5132, 1.9) with $\sigma^*_\alpha = 0.6$ in common under (26). In each case, the empirical mean and standard deviation (in parentheses) are given.

α = 1.2
n      ρ                 α                 σ*_α
500    0.7632 (0.0179)   1.1951 (0.0794)   0.6730 (0.3857)
1000   0.7636 (0.0139)   1.2042 (0.0619)   0.6274 (0.3094)
2000   0.7638 (0.0098)   1.2044 (0.0472)   0.6105 (0.2323)
5000   0.7641 (0.0059)   1.2025 (0.0305)   0.6029 (0.1521)

α = 1.5
n      ρ                 α                 σ*_α
500    0.5978 (0.0220)   1.4877 (0.1023)   0.6697 (0.3031)
1000   0.5981 (0.0159)   1.4908 (0.0733)   0.6551 (0.2488)
2000   0.5985 (0.0111)   1.4960 (0.0573)   0.6349 (0.2033)
5000   0.5987 (0.0069)   1.4990 (0.0376)   0.6151 (0.1414)

α = 1.7
n      ρ                 α                 σ*_α
500    0.5460 (0.0216)   1.6727 (0.1038)   0.6832 (0.2465)
1000   0.5465 (0.0160)   1.6801 (0.0820)   0.6714 (0.2280)
2000   0.5468 (0.0113)   1.6931 (0.0600)   0.6318 (0.1607)
5000   0.5465 (0.0071)   1.6988 (0.0393)   0.6116 (0.1135)

α = 1.9
n      ρ                 α                 σ*_α
500    0.5130 (0.0229)   1.8440 (0.1039)   0.7196 (0.2233)
1000   0.5131 (0.0159)   1.8703 (0.0823)   0.6762 (0.1897)
2000   0.5138 (0.0114)   1.8851 (0.0588)   0.6412 (0.1349)
5000   0.5135 (0.0068)   1.8956 (0.0411)   0.6168 (0.0998)

as it excludes accommodating the leverage effect; however, the simple constructions of our estimators (especially $\hat\rho_n$) break down if they are allowed to be dependent. We may be able to deal with correlated $\sigma$ and $Z$ if we have an extension of the power-variation results obtained in Corcuera et al. [7] to the MPV version. To the best of the author's knowledge, such an extension does not seem to have been explicitly mentioned as yet.

• Assuming that $\sigma$ is indeed time-varying and possibly random, estimation of "spot" scales $\sigma_t$ is an open problem. Needless to say, this is much more difficult and delicate to deal with than the integrated scale. We know several results for Brownian-semimartingale cases (see, among others, Fan and Wang [8] and Malliavin and Mancino [12]); however, there is as yet no general result for the case of pure-jump $Z$.


• Finally, it might be interesting to derive an option-pricing formula for the case of time-varying scale, which seems more realistic than the mere geometric skewed stable Lévy processes.

References

1. Applebaum, D. (2004), Lévy Processes and Stochastic Calculus. Cambridge University Press, Cambridge.
2. Barndorff-Nielsen, O. E., Graversen, S. E., Jacod, J. and Shephard, N. (2006), Limit theorems for bipower variation in financial econometrics. Econometric Theory 22, 677–719.
3. Barndorff-Nielsen, O. E., Graversen, S. E., Jacod, J., Podolskij, M. and Shephard, N. (2006), A central limit theorem for realised power and bipower variations of continuous semimartingales. From Stochastic Calculus to Mathematical Finance, 33–68, Springer, Berlin.
4. Barndorff-Nielsen, O. E. and Shephard, N. (2005), Power variation and time change. Teor. Veroyatn. Primen. 50, 115–130; translation in Theory Probab. Appl. 50 (2006), 1–15.
5. Bertoin, J. (1996), Lévy Processes. Cambridge University Press.
6. Borak, S., Härdle, W. and Weron, R. (2005), Stable distributions. Statistical Tools for Finance and Insurance, 21–44, Springer.
7. Corcuera, J. M., Nualart, D. and Woerner, J. H. C. (2007), A functional central limit theorem for the realized power variation of integrated stable processes. Stoch. Anal. Appl. 25, 169–186.
8. Fan, J. and Wang, Y. (2008), Spot volatility estimation for high-frequency data. Stat. Interface 1, 279–288.
9. Fujiwara, T. and Miyahara, Y. (2003), The minimal entropy martingale measures for geometric Lévy processes. Finance Stoch. 7, 509–531.
10. Kallsen, J. and Shiryaev, A. N. (2001), Time change representation of stochastic integrals. Teor. Veroyatnost. i Primenen. 46, 579–585; translation in Theory Probab. Appl. 46 (2003), 522–528.
11. Kuruoglu, E. E. (2001), Density parameter estimation of skewed α-stable distributions. IEEE Trans. Signal Process. 49, no. 10, 2192–2201.
12. Malliavin, P. and Mancino, M. E. (2009), A Fourier transform method for nonparametric estimation of multivariate volatility. Ann. Statist. 37, 1983–2010.
13. Masuda, H. (2009), Joint estimation of discretely observed stable Lévy processes with symmetric Lévy density. J. Japan Statist. Soc. 39, 1–27.
14. Masuda, H. (2009), Estimation of second-characteristic matrix based on realized multipower variations. (Japanese) Proc. Inst. Statist. Math. 57, 17–38.
15. Miyahara, Y. and Moriwaki, N. (2009), Option pricing based on geometric stable processes and minimal entropy martingale measures. In "Recent Advances in Financial Engineering", World Sci. Publ., 119–133.
16. Sato, K. (1999), Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press.
17. van der Vaart, A. W. (1998), Asymptotic Statistics. Cambridge University Press, Cambridge.


18. Woerner, J. H. C. (2003), Purely discontinuous Lévy processes and power variation: inference for integrated volatility and the scale parameter. 2003-MF-08 Working Paper Series in Mathematical Finance, University of Oxford.
19. Woerner, J. H. C. (2004), Estimating the skewness in discretely observed Lévy processes. Econometric Theory 20, 927–942.
20. Woerner, J. H. C. (2007), Inference in Lévy-type stochastic volatility models. Adv. in Appl. Probab. 39, 531–549.
21. Zolotarev, V. M. (1986), One-Dimensional Stable Distributions. American Mathematical Society, Providence, RI. [Russian original 1983]


May 3, 2010 16:28 Proceedings Trim Size: 9in x 6in 008

A Note on a Statistical Hypothesis Testing for Removing Noise by the Random Matrix Theory and Its Application to Co-Volatility Matrices

Takayuki Morimoto1,∗ and Kanta Tachibana2

1School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda-shi, Hyogo 669-1337, Japan.

2Faculty of Informatics, Kogakuin University, 1-24-2 Nishi-shinjuku,Shinjuku-ku, Tokyo 163-8677, Japan.

Email: [email protected] and [email protected]

It is well known that a bias called market microstructure noise arises when estimating the realized co-volatility matrix, which is calculated as a sum of cross products of intraday high-frequency returns. An existing conventional technique for removing such market microstructure noise is to perform an eigenvalue decomposition of the sum-of-cross-products matrix and to identify as noise the elements corresponding to eigenvalues smaller than the maximum eigenvalue of the random matrix. Although the maximum eigenvalue of a random matrix asymptotically follows the Tracy-Widom distribution, the existing technique does not take this asymptotic nature into consideration and uses only the convergence value. Therefore, it cannot quantitatively evaluate the risk of accidentally regarding essential volatility as noise. In this paper, we propose a statistical hypothesis test for removing noise in the co-volatility matrix, based on the fact that the maximum eigenvalue of a random matrix asymptotically follows the Tracy-Widom distribution.

Keywords: Realized volatility, market microstructure noise, random matrix theory.

∗Corresponding author.


1. Introduction

In recent years, "high frequency data in finance" have become easy to obtain, so we may estimate and forecast (co-)volatility more accurately than before by using Realized Volatility (RV), which is a series of sums of intraday squared log returns, and Realized Co-volatility (RC), which is a series of sums of cross products of two log returns; see [2] or [1]. However, it is well known that, when forecasting volatility, RV and RC are contaminated by large biases, the so-called microstructure noise, which grows progressively as the sampling frequency becomes higher; see [7]. Thus, this research considers a statistical method of removing such noise in RV and RC by using random matrix theory. Performing an eigenvalue decomposition of the cross-product matrix, we regard as noise in a co-volatility matrix the elements corresponding to eigenvalues smaller than the maximum eigenvalue of the random matrix. It is known that the maximum eigenvalue of a random matrix asymptotically follows the Tracy-Widom distribution. However, existing methods have not taken into consideration the distribution of the maximum eigenvalue of a random matrix, but have used only the maximum eigenvalue itself; see, for example, [9]. Therefore, they cannot quantitatively evaluate the risk of accidentally regarding essential volatility as noise.

Therefore, we propose a statistical hypothesis test for removing noise in the co-volatility matrix, based on the fact that the maximum eigenvalue of a random matrix asymptotically follows the Tracy-Widom distribution.

This paper is organized as follows. Section 2 describes the theoretical background of this study and gives a brief explanation of random matrix theory and of our proposal. Section 3 presents the empirical analysis. Section 4 concludes.

2. Theoretical Background

In this section, we introduce theoretical properties of random matrices together with some simulation results.

2.1 Random matrix

A random matrix is a matrix which has random variables as its elements. First, [16] and [17] derived the eigenvalue distribution of an $N \times N$ real symmetric matrix $A = (a_{ij})$ with elements $a_{ij} \sim$ i.i.d.$(0, 1/N)$. Following [16] and [17], we introduce an $N \times N$ real symmetric random matrix $A = (a_{ij})$ whose elements $a_{ij}$, $i \le j$, independently follow a distribution with mean 0 and variance $1/N$. If the eigenvalues of $A$ are $\lambda_1, \dots, \lambda_N$ and the empirical eigenvalue distribution of $A$ is defined by

$$\rho_A(\lambda) = \frac{1}{N}\sum_{i=1}^{N} \delta(\lambda - \lambda_i),$$

then

$$\lim_{N\to\infty} \rho_A(\lambda) = \begin{cases} \dfrac{\sqrt{4-\lambda^2}}{2\pi} & (|\lambda| \le 2), \\ 0 & (\text{otherwise}), \end{cases}$$


where $\delta(\cdot)$ is the Dirac measure. Figures 1 and 2 show the simulated eigenvalue distribution of $A$ with $N = 1000$. The left panel is sampled from a normal distribution and the right one from a uniform distribution. From these figures we can see that the asymptotic behavior of the eigenvalues of $A$ is identical whatever distribution the elements follow.

[Figures 1 and 2: simulated eigenvalue distributions; horizontal axis from −4 to 4, vertical axis from 0 to 0.35.]

Figure 1. Sampled from normal.

Figure 2. Sampled from uniform.
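The simulation behind Figures 1 and 2 can be imitated with a short script. This is only a sketch under assumptions: the matrix size, seeds and helper names are illustrative choices, not the authors' settings. It builds a real symmetric matrix with entries of mean 0 and variance $1/N$ and checks two consequences of the semicircle limit: the eigenvalues stay close to $[-2, 2]$ and their second moment is close to 1.

```python
import numpy as np

def wigner_eigenvalues(N, sampler, seed=0):
    """Eigenvalues of a real symmetric N x N matrix with entries of mean 0, variance 1/N."""
    rng = np.random.default_rng(seed)
    G = sampler(rng, (N, N)) / np.sqrt(N)
    A = np.triu(G) + np.triu(G, 1).T      # mirror the upper triangle to make A symmetric
    return np.linalg.eigvalsh(A)

def normal_sampler(rng, shape):
    return rng.standard_normal(shape)     # unit variance

def uniform_sampler(rng, shape):
    half_width = np.sqrt(3.0)             # uniform on [-sqrt(3), sqrt(3)] has unit variance
    return rng.uniform(-half_width, half_width, shape)

lam_normal = wigner_eigenvalues(300, normal_sampler)
lam_uniform = wigner_eigenvalues(300, uniform_sampler)
print(np.abs(lam_normal).max(), np.abs(lam_uniform).max())
```

A histogram of `lam_normal` or `lam_uniform` can then be compared against $\sqrt{4-\lambda^2}/(2\pi)$.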

Second, we introduce the Wishart distribution, which plays a very important role in multivariate analysis. Wishart derived it in 1928 to describe the behavior of a sample covariance matrix $XX^\top$. The distribution of $XX^\top$ depends on the distribution of the random variables $x$, so we can estimate the original distribution of $x$ from the distribution of $XX^\top$. If each column vector of the $N \times p$ matrix $X = (x^{(1)} \cdots x^{(p)})$ independently follows the $N$-dimensional Gaussian distribution, $x^{(i)} \sim N_N(0,\Sigma)$, then the $N \times N$ random matrix $XX^\top$,

$$XX^\top \sim W_N(p,\Sigma),$$

follows the $N$-dimensional Wishart distribution with $p$ degrees of freedom and covariance matrix $\Sigma$. If $N = 1$, then it follows the $\chi^2$ distribution with $p$ degrees of freedom, and for $N = 2$, [6] found a relevant distribution.

Next, we also consider the asymptotic eigenvalue distribution of a Wishart matrix. Suppose that $\Sigma = I_N$ and each element of a random matrix $X \in \mathbb{R}^{N\times p}$ independently follows $N_1(0, 1^2)$, so that

$$XX^\top \sim W_N(p, I_N).$$

For a random matrix $X \in \mathbb{R}^{N\times p}$ with $XX^\top \sim W_N(p, I_N)$, if we hold fixed the proportion

$$\alpha = p/N$$


and let $N \to \infty$, then the eigenvalue distribution of $XX^\top$ converges to some function.

If the elements of the $N \times p$ matrix $X$ do not follow a Gaussian distribution, that is, if $XX^\top$ is not a Wishart matrix, then the eigenvalue distribution of $XX^\top$ still converges to the same function as above. This property is known as "the universality" of random matrix theory. It is a very important characteristic that each element of $X$ need not follow a Gaussian distribution; that is, it is not a necessary condition that $XX^\top$ follows a Wishart distribution. Hence, we can generalize the limiting distribution of the eigenvalues of $XX^\top$. We have the following theorem related to the universality by referring to [10] and [5].

Theorem 1 (Marcenko-Pastur law): Let $X$ be an $N \times p$ matrix with independent, identically distributed entries $X_{i,j}$. We assume that $E(X_{i,j}) = 0$ and $\mathrm{var}(X_{i,j}) = 1$. If $p$, $N$ are large enough and $p/N$ is a non-zero constant, then the distribution of eigenvalues of $XX^\top$ converges almost surely to a known density.

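We scale the eigenvalues $\lambda_1, \dots, \lambda_N$ sampled from $XX^\top \sim W_N(p, I_N)$ by

$$u_i = \lambda_i/p, \quad i = 1, \dots, N.$$

The empirical distribution of $u$ is

$$\delta_P = \frac{1}{N}\{\delta(u_1) + \cdots + \delta(u_N)\},$$

where $\delta(u)$ is the Dirac measure. If $\alpha = p/N$ and $p, N \to \infty$, $\delta_P$ converges a.e. to $p(u)\,du$,

$$p(u) = \begin{cases} \dfrac{1}{2\pi\alpha}\dfrac{\sqrt{(u-u_{\min})(u_{\max}-u)}}{u} & \text{if } u_{\min} < u < u_{\max}, \\ 0 & \text{otherwise}, \end{cases}$$

$$u_{\min} = (\sqrt{\alpha} - 1)^2, \quad u_{\max} = (\sqrt{\alpha} + 1)^2.$$

The asymptotic eigenvalue distribution is given by the following formula from the Marcenko-Pastur law,

$$\lambda_{\min} = (1-\sqrt{\alpha})^2, \quad \lambda_{\max} = (1+\sqrt{\alpha})^2.$$

The limiting distribution of the eigenvalues of $A = XX^\top$ is given by

$$\lim_{N\to\infty} \rho_A(\lambda) = \begin{cases} \dfrac{1}{2\pi\lambda}\sqrt{(\lambda - \lambda_{\min})(\lambda_{\max} - \lambda)} & \lambda_{\min} \le \lambda \le \lambda_{\max}, \\ 1 - \alpha & \lambda = 0 \text{ and } \alpha < 1, \\ 0 & \text{otherwise}. \end{cases}$$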
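A quick numerical check of the support $[(1-\sqrt{\alpha})^2, (1+\sqrt{\alpha})^2]$ is sketched below. It rests on an assumption that the text does not state explicitly: the eigenvalues are those of $XX^\top/N$ (scaling by $N$), which is the normalization under which this support and the mass $1-\alpha$ at zero are mutually consistent. The sizes, seed and function names are illustrative.

```python
import numpy as np

def scaled_wishart_eigenvalues(N, p, seed=0):
    """Eigenvalues of X X^T / N for an N x p matrix X with i.i.d. N(0, 1) entries."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((N, p))
    return np.linalg.eigvalsh(X @ X.T / N)

def marcenko_pastur_support(alpha):
    """Edges (lambda_min, lambda_max) for alpha = p / N."""
    return (1.0 - np.sqrt(alpha)) ** 2, (1.0 + np.sqrt(alpha)) ** 2

N, p = 200, 400
lam = scaled_wishart_eigenvalues(N, p)
lam_min, lam_max = marcenko_pastur_support(p / N)
print(lam.min(), lam_min, lam.max(), lam_max)
```

For finite $N$ the extreme eigenvalues fluctuate around the edges, which is exactly the randomness that the Tracy-Widom law quantifies later in the paper.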

Figures 3 and 4 show the simulated eigenvalue distribution of $XX^\top$ with $p = 1000$. The left panel is sampled from a normal distribution and the right one from a


[Figures 3 and 4: simulated eigenvalue distributions; horizontal axis from −0.5 to 3.5, vertical axis from 0 to about 0.9.]

Figure 3. Sampled from normal.

Figure 4. Sampled from uniform.

uniform distribution. From these figures we can see that the asymptotic behavior of the eigenvalues of $XX^\top$ is identical whatever distribution the elements follow, just as in Wigner's case.

Figures 5 and 6 show the theoretical and empirical distributions of the eigenvalues of $XX^\top$. The left panel is the simulated distribution sampled from a normal distribution with $p = 1000$ and $N = 600$, $N = 1000$ and $N = 1400$. The right panel is the empirical distribution sampled from individual stocks listed on the Tokyo Stock Exchange. From these figures we can see that the empirical distribution resembles the simulated one in appearance, although their scales are very different.

[Figure 5: panel title "Theoretical Distribution of Marcenko Pastur"; horizontal axis $\lambda$ (0 to 6), vertical axis $\rho(\lambda)$ (0 to 1.6); curves for p=1000, N=600; p=1000, N=1000; p=1000, N=1400.]

Figure 5. Theoretical distribution.

[Figure 6: panel title "Empirical Distribution of Latent Roots"; horizontal axis $\lambda$ (0 to 20), vertical axis $\rho(\lambda)$ (0 to 35).]

Figure 6. Empirical distribution.

2.2 Extraction of Essential Volatility

Here, we describe the technique of dividing a matrix $V = RR^\top$ calculated from a standardized log return matrix $R$ into essential parts and noise parts, following


[9]. We first obtain the eigenvalues $\lambda_1, \dots, \lambda_N$ of $V$ and the unit eigenvectors $u_k$ corresponding to $\lambda_k$ $(k = 1, \dots, N)$, where $N$ denotes the number of stocks. Letting $V_k := \lambda_k u_k u_k^\top$ be the $k$th element of the matrix, $V = \sum_{k=1}^{N} V_k$ holds, so we can divide $V$ into $N$ elements. Among the $N$ elements, those corresponding to big eigenvalues are more essential, since they heavily influence the overall market. On the other hand, the elements corresponding to small eigenvalues are less essential, being independent of the overall market. To put it briefly, independent elements whose eigenvalues lie below the maximum eigenvalue of the random matrix are of no use for portfolio strategies that rely on the correlation of log returns.

We schematically show the denoising method above.

• If the elements of $V_k$ are independent and identically distributed, then the corresponding eigenvalue $\lambda_k$ must lie in the support of the Marcenko-Pastur law.

• If the $k$th eigenvalue $\lambda_k$ lies outside the support of the Marcenko-Pastur law, then the corresponding element $V_k$ is not independent and identically distributed, that is, it can be considered to contain something other than noise.

• Thus, we can consider that the sum of the elements $V_k$ corresponding to eigenvalues bigger than a threshold value $\theta$, which is the maximum eigenvalue of the random matrix, is the so-called denoised daily realized volatility¹:

$$V^+ = \sum_{k \mid \lambda_k > \theta} V_k.$$
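The decomposition $V = \sum_k V_k$ and the thresholded sum $V^+$ can be sketched as follows. The function name is hypothetical, and the threshold `theta` is simply passed in by the caller; how to choose it is the subject of the rest of this section.

```python
import numpy as np

def denoise_by_threshold(V, theta):
    """Return V_plus = sum of lambda_k u_k u_k^T over eigenvalues lambda_k > theta.

    V is assumed to be a real symmetric matrix (for example R R^T).
    """
    lam, U = np.linalg.eigh(V)            # columns of U are unit eigenvectors
    keep = lam > theta
    # Scale the kept eigenvectors by their eigenvalues and rebuild the matrix.
    return (U[:, keep] * lam[keep]) @ U[:, keep].T

V = np.diag([1.0, 5.0])
print(denoise_by_threshold(V, 2.0))       # only the component with eigenvalue 5 survives
```

With a threshold below every eigenvalue the function returns $V$ itself, which is a convenient sanity check.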

As can be seen, conventional studies have dichotomously separated noise from substantial parts at the convergence point of the maximum eigenvalue. In the existing research, the threshold value $\theta$ is determined only by the maximum eigenvalue of the matrix, regardless of the asymptotic nature of the maximum eigenvalue of a random matrix. That is, they consider that the sum of the elements corresponding to eigenvalues with $\lambda_k > \theta$ is denoised, and the sum of the others is contaminated with noise. However, such a deterministic and "digital" method may accidentally misidentify denoised volatility as contaminated and vice versa, since the maximum eigenvalue of a random matrix is still a random variable.

Therefore, we propose an interval estimation of eigenvalues which can distinguish noise, paying attention to the point that the maximum eigenvalue is also a random variable. We perform a statistical hypothesis test with respect to denoised and contaminated volatility by using the property that the maximum

¹ No one knows the essential volatility since it is usually unobservable, so we dare to use the term "denoised" instead of "essential".


eigenvalue of $V$ follows the Tracy-Widom distribution, which is explained in the next subsection. Here, the null hypothesis, which assumes that log returns consist only of pure noise, can be rejected when the largest eigenvalue of the sample covariance matrix does not lie in the support of the Marcenko-Pastur law. Specifically, we set up the null hypothesis as "contaminated with noise" and the alternative as "denoised", and vice versa, and the test statistic is obtained from an eigenvalue of $V$ calculated from the standardized log return matrix $R$.

2.3 Maximum Eigenvalue Density of Random Matrix

We suppose that $X$ is an $n \times p$ random matrix and $XX^\top$ is its covariance matrix. Under Gaussian assumptions, $XX^\top$ is said to have a Wishart distribution $W_p(n,\Sigma)$. If $\Sigma = I$, it is called a white Wishart, in analogy with time series settings where a white spectrum is one with the same variance at all frequencies; see [8]. The asymptotic distribution of the maximum eigenvalue of a Wishart matrix $XX^\top$ with unit covariance is the first order Tracy-Widom distribution, if $\alpha = p/n$ is constant; see [12], [13] and [14]. Moreover, even if $n$ or $p$ is only about ten, this asymptotic property is not lost, and it is known that the Tracy-Widom distribution appears as a solution of a Painlevé II type differential equation.

Theorem 2 (Tracy-Widom Law): Suppose that $W$ is a white Wishart matrix, $\gamma$ is a constant, $l_1$ is the maximum eigenvalue, and $n/p \to \gamma \ge 1$. Then

$$\frac{l_1 - \mu_{np}}{\sigma_{np}} \xrightarrow{\text{dist}} W_1 \sim F_1,$$

where the location and scale parameters are given by

$$\mu_{np} = \left(\sqrt{n-1} + \sqrt{p}\right)^2,$$

$$\sigma_{np} = \sqrt{\mu_{np}}\left((n-1)^{-1/2} + p^{-1/2}\right)^{1/3},$$

where $F_1$ denotes the distribution function of the first order Tracy-Widom law.
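The centering and scaling constants of Theorem 2 are straightforward to compute. The sketch below uses hypothetical function names and simply evaluates $\mu_{np}$ and $\sigma_{np}$ as written above, then standardizes a given largest eigenvalue.

```python
import math

def tw_center_scale(n, p):
    """Location mu_np and scale sigma_np from Theorem 2."""
    root = math.sqrt(n - 1) + math.sqrt(p)           # equals sqrt(mu_np)
    mu = root ** 2
    sigma = root * (1.0 / math.sqrt(n - 1) + 1.0 / math.sqrt(p)) ** (1.0 / 3.0)
    return mu, sigma

def standardized_largest_eigenvalue(l1, n, p):
    """(l1 - mu_np) / sigma_np, to be compared with Tracy-Widom F_1 quantiles."""
    mu, sigma = tw_center_scale(n, p)
    return (l1 - mu) / sigma

# Small hand-checkable case: n = 5, p = 4 gives mu = (2 + 2)^2 = 16 and sigma = 4.
print(tw_center_scale(5, 4))
```

The standardized value is what one would compare with the tabulated tail points of $F_1$.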

We have to mention that the Tracy-Widom law also enjoys "the universality" of random matrix theory. Hence, the Tracy-Widom law still holds without Gaussian assumptions; see [11], for example.

The asymptotic distribution function $F_1$ is a special case of the distribution family $F_\beta$. For $\beta = 1, 2, 4$, the function $F_\beta$ appears as the asymptotic distribution of the maximum eigenvalue of the Gaussian Orthogonal Ensemble (GOE), Gaussian Unitary Ensemble (GUE) and Gaussian Symplectic Ensemble (GSE), respectively. According to this fact, the distribution function $F_{N,\beta}(s) \stackrel{\text{def}}{=} P(l_{\max}(A) < s)$, $\beta = 1, 2, 4$, for the maximum eigenvalue $l_{\max}(A)$ of each random matrix $A$ of GOE ($\beta = 1$), GUE ($\beta = 2$) or GSE ($\beta = 4$) satisfies the asymptotic law as follows:

$$F_\beta(s) = \lim_{N\to\infty} F_{N,\beta}\left(2\sigma\sqrt{N} + \sigma N^{-1/6}s\right),$$


where $F_\beta$ is explicitly given by

$$F_1(s) = \exp\left(-\frac{1}{2}\int_s^\infty q(x)\,dx\right)[F_2(s)]^{1/2},$$

$$F_2(s) = \exp\left(-\int_s^\infty (x-s)\,q^2(x)\,dx\right),$$

$$F_4(2^{-2/3}s) = \cosh\left(-\frac{1}{2}\int_s^\infty q(x)\,dx\right)[F_2(s)]^{1/2},$$

and $q(s)$ is the unique solution of the Painlevé II equation

$$q'' = sq + 2q^3 + \alpha, \quad \alpha = 0,$$

satisfying the boundary condition

$$q(s) \sim \mathrm{Ai}(s), \quad s \to +\infty,$$

where $\mathrm{Ai}(s)$ denotes the Airy function. Figures 7 and 8 show the simulated distribution of the maximum eigenvalue of $XX^\top$ with $p = 1000$, which is known as the Tracy-Widom distribution. The left panel is sampled from a normal distribution and the right one from a uniform distribution. From these figures we can see that the asymptotic behavior of the maximum eigenvalues of $XX^\top$ is identical whatever distribution the elements follow, just as in the previous cases.

[Figures 7 and 8: simulated distributions of the maximum eigenvalue; horizontal axis from −6 to 4, vertical axis (frequency) from 0 to about 350.]

Figure 7. Sampled from normal.

Figure 8. Sampled from uniform.

Then we can construct the following two types of hypothesis testing for noise by comparing the sample eigenvalues $\lambda_k$ $(k = 1, \dots, N)$ with the theoretical Tracy-Widom statistic $tw_\alpha$. To make it easy to understand, we illustrate these tests in Figures 9 and 10. These plots are from the numerical work of [15], reporting that


the $F_1$ distribution has mean $-1.21$ and standard deviation 1.27. The density is asymmetric: its left tail has exponential order of decay like $e^{-|s|^3/24}$, while its right tail is of exponential order $e^{-\frac{2}{3}s^{3/2}}$; see [8]. This asymmetry is precisely the reason for proposing two types of hypothesis testing for noise.²

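[Figure 9: Tracy-Widom density on the horizontal axis from −6 to 4 (vertical axis 0 to 0.35); the region of area $\alpha$ in the left tail is marked "H0 is rejected." and the remainder "H0 is not rejected."]

Figure 9. Illustration of Type I.

[Figure 10: Tracy-Widom density on the horizontal axis from −5 to 4 (vertical axis 0 to 0.35); the region of area $\alpha$ in the right tail is marked "H0 is rejected." and the remainder "H0 is not rejected."]

Figure 10. Illustration of Type II.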

Type I: We test the probability that we accidentally take denoised parts as noise. So in this case, the null hypothesis $H_0$ assumes that the log returns $R$ are not pure noise:

$$H_0 : R \sim \text{not i.i.d. distributed}.$$

If $\lambda_k > tw_\alpha$, that is, if a sample eigenvalue is larger than the relevant critical value, then we fail to reject the null hypothesis.

Type II: We test the probability that we accidentally take noise as denoised parts. So in this case, the null hypothesis $H_0$ assumes that the log returns $R$ are pure noise:

$$H_0 : R \sim \text{i.i.d. distributed}.$$

If $\lambda_k < tw_{1-\alpha}$, that is, if a sample eigenvalue is smaller than the relevant critical value, then we fail to reject the null hypothesis.

Hence, Type I is a lower test for Tracy-Widom distribution and Type II an upperone. Tail probability of Tracy-Widom distribution is given by numerical compu-tation as shown in the following Table 1, see [3] for more detailed description.Therefore, statistical hypothesis testing to the maximum eigenvalue of a covari-ance matrix becomes possible by using these values and significance levelα.

² If the asymptotic distribution is symmetric, such as a normal distribution or a t distribution, then, of course, it is not necessary to consider two types of hypothesis testing.


Table 1. Probability values (β = 1, 2, 4).

β \ α   0.995    0.975    0.95     0.05     0.025    0.005
1       −4.1505  −3.5166  −3.1808   0.9793   1.4538   2.4224
2       −3.9139  −3.4428  −3.1945  −0.2325   0.0915   0.7462
4       −4.0531  −3.6608  −3.4556  −1.0904  −0.8405  −0.3400

2.4 Realized Quantities
In recent years, it has become possible to estimate and forecast volatilities more accurately by using Realized Volatility (RV), which is a consistent estimator of Integrated Volatility (IV). RV is computed as the sum of squared intraday log returns of high frequency financial data. Realized Covariance (RC), which is computed as the sum of cross-products of two log returns, is also important in financial applications.

We define the logarithmic stock price at time $t$ as $p_t$ and assume that $p_t$ follows the diffusion process

$$dp_t = \mu_t\,dt + \sigma_t\,dw_t,$$

where $\mu_t$, $\sigma_t$ and $w_t$ are the instantaneous drift, the diffusion term and a standard Brownian motion, respectively. If $\Delta \to 0$, then

$$RV_\zeta := \sum_{\tau} r_{\zeta,\tau}^2 \to \int_{\zeta-1}^{\zeta} \sigma_s^2\,ds,$$

where $\Delta$ is a small time interval in each day and $r_{\zeta,\tau}$ is the $\tau$th intraday logarithmic return $p_{\zeta,\tau\Delta} - p_{\zeta,(\tau-1)\Delta}$ on day $\zeta$. If the sampling interval is small enough, then RV is a consistent estimator of IV.

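Provided that the $\tau$th logarithmic returns of two stocks $i, j$ on day $\zeta$ are denoted by $r_{\zeta,\tau,i}$ and $r_{\zeta,\tau,j}$ respectively, we define

$$CV_{\zeta,ij} := \sum_{\tau} r_{\zeta,\tau,i}\, r_{\zeta,\tau,j}.$$

Unifying RV and RC, we obtain the $N \times N$ matrix

$$V_\zeta = R_\zeta^{\top} R_\zeta,$$

where $R_\zeta$ is the $p \times N$ log return matrix, $N$ is the number of stocks, and $p$ is the length of the time series.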
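As a minimal sketch of the matrix $V_\zeta = R_\zeta^{\top} R_\zeta$, the following Python code builds the intraday return matrix from one day of log prices and forms the realized covariance. The function name, the array layout and the synthetic price data are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def realized_covariance(log_prices):
    """Return the N x N matrix R^T R for one trading day.

    log_prices: array of shape (p + 1, N) holding intraday log prices of
    N stocks on a regular time grid (assumed layout).
    """
    R = np.diff(log_prices, axis=0)  # p x N matrix of intraday log returns
    return R.T @ R


# Synthetic data, for illustration only: 270 one-minute returns of 3 stocks.
rng = np.random.default_rng(0)
log_prices = np.cumsum(0.001 * rng.standard_normal((271, 3)), axis=0)

V = realized_covariance(log_prices)
rv = np.diag(V)  # diagonal entries: realized volatility of each stock
```

The diagonal of `V` holds the sums of squared returns (RV) and the off-diagonal entries hold the sums of cross-products (RC), so a single matrix product covers both definitions.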

3. Empirical Analysis
We use high frequency data in our empirical analysis. The data consist of individual stocks listed in the Nikkei 225 and TOPIX (N = 226). The sampling period is from January 4, 2007 to December 28, 2007 (245 days). We calculate intraday log returns for ∆ = 1, . . . , 10 (minutes) in order to evaluate the denoising performance of each method. Trading time in a day is 4.5 hours on the Tokyo Stock Exchange, so if ∆ = 1 [min] then p = 270, α = 2.67.

However, using high frequency data in RV and RC raises a problem. It is well known as microstructure noise, which may be derived from asymmetric information and the bid-ask spread, and it brings some bias to the volatility estimates obtained by RV. Figure 11 is an example of microstructure noise from TOPIX 100³; this is the so-called volatility signature plot (VSP), whose horizontal axis denotes ∆ and whose vertical axis denotes volatility. Each value is the 245-day average of tr(V), tr(V+(m)) calculated from V, V+(m) (m = 1, 2, 3) on each day, where tr(·) is the sum of diagonal elements. Here m = 1 denotes volatility obtained from the conventional method, m = 2 from the Type I test and m = 3 from the Type II test, respectively. In Figure 12, the solid line is tr(V) obtained from raw data, the dashed line is tr(V+(1)), the dotted line is tr(V+(2)) and the chained line is tr(V+(3)). From this figure, we can see that tr(V) obviously diverges when the sampling interval is small, but the others are stable and almost identical to each other.

Figure 11. Microstructure noise. (Panel title: Average volatility over the sampling period (TOPIX100); horizontal axis: sampling interval (min.); vertical axis: volatility.)

Figure 12. Average volatility. (Panel title: Volatility Signature Plot; horizontal axis: sampling frequency in minutes; vertical axis: average volatility; legend: raw, conv, TpI, TpII.)

Next we calculate the minimum variance portfolio $p_k$ without a risk-free rate, defined by

$$p_k = \frac{1}{Z}\sum_{i=1}^{N} C^{-1}_{ik}, \qquad Z = \sum_{i,j=1}^{N} C^{-1}_{ij},$$

where $k = 1, \ldots, N$ and $C$ denotes the $N \times N$ correlation matrix, see [4]. Furthermore, we compute the total variance $\sigma_p^2$ of the minimum variance portfolio given by

$$\sigma_p^2 = p\,C\,p^{\top},$$

where $p$ is the $1 \times N$ vector which contains $p_1, \ldots, p_N$. If the correlation matrix $C$ is not contaminated by noises, that is, $C$ consists entirely of significant elements, then the total variance $\sigma_p^2$ of the minimum variance portfolio must be relatively small. Table 2 shows the estimated $\sigma_p^2$ for each sampling interval and each method. From the table we can see that the variance of Type I is better, particularly for sampling intervals shorter than 5 minutes, where noise causes explosive volatility as we explained in Figure 11.

³ TOPIX 100 consists of the 100 most liquid individual stocks on the Tokyo Stock Exchange.
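The minimum variance weights and the resulting total variance can be sketched as follows. The function name and the small example matrix are hypothetical, and the inverse is computed directly, which is adequate only for a small, well-conditioned matrix such as this toy example.

```python
import numpy as np


def min_variance_portfolio(C):
    """Return (p, sigma2_p) with p_k = (1/Z) * sum_i C^{-1}_{ik}, Z = sum_{i,j} C^{-1}_{ij}."""
    C_inv = np.linalg.inv(C)
    Z = C_inv.sum()
    p = C_inv.sum(axis=0) / Z   # weights; they sum to one by construction
    sigma2_p = p @ C @ p        # total variance p C p^T
    return p, sigma2_p


# Toy correlation matrix, for illustration only.
C = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
p, sigma2_p = min_variance_portfolio(C)
```

With these weights the total variance reduces algebraically to $1/Z$, which gives a convenient consistency check on any implementation.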

Table 2. Minimum variance portfolio without risk-free rate.

          raw     conv.   Type I  Type II
01 min.   3.5648  3.3221  3.3123  3.3214
02 min.   2.4060  2.2242  2.2130  2.2243
03 min.   2.4260  2.3802  2.3807  2.3910
04 min.   2.6374  2.3718  2.3553  2.3725
05 min.   1.5812  1.4415  1.4195  1.4431
06 min.   2.6137  2.4255  2.3724  2.4402
07 min.   2.1421  2.3312  2.3266  2.3300
08 min.   2.6004  2.2699  2.3505  2.2701
09 min.   1.5338  2.1933  2.1939  2.1935
10 min.   2.0842  1.7209  1.7280  1.7209

Furthermore, we investigate the efficient portfolio taking into account the risk-free rate. We use the interbank rate 0.0599 as of July 2007 for the risk-free rate. Figures 13 and 14 show two remarkable examples of the empirical efficient frontier. In these figures, circles denote raw data, the dotted line the conventional method of existing research, the solid line Type I, and the chained line Type II, respectively. Type I and Type II mean data denoised by the hypothesis testing that we proposed in the previous section. Figure 13 shows the efficient frontiers calculated from the data of March 15, 2007. It is interesting that the dotted and chained lines, which correspond to the conventional method and Type II, are placed nearer the vertical axis, which may underestimate the risk. Figure 14 shows the efficient frontiers calculated from the data of May 15, 2007. It is remarkable that the circles, which correspond to raw data, are situated inside the others, which also may underestimate the risk. As these results show, efficient frontiers differ from day to day; however, the outcome of Type I seems to be stable.⁴

Figure 13. March 15, 2007. (Panel title: Efficient Portfolio (0315); horizontal axis: risk; vertical axis: return; legend: raw, conv, Type I, Type II.)

Figure 14. May 15, 2007. (Panel title: Efficient Portfolio (0515); horizontal axis: risk; vertical axis: return; legend: raw, conv, Type I, Type II.)

Finally we present some results on empirically estimated volatility and covolatility. Table 3 shows the average values of volatility in 2007 for each sampling interval. The S.D. row is the standard deviation over all intervals. From the table we can see that the mean volatility of Type I is relatively stable. Table 4 shows the average values of absolute covolatility in 2007 for each sampling interval. However, apart from the raw data, we cannot find a remarkable difference in covolatility among the methods.

Table 3. Mean values of volatility.

          raw       conv.   Type I  Type II
01 min.   1.9067    0.7572  0.7803  0.7521
02 min.   1.6163    0.7495  0.7709  0.7444
03 min.   1.4847    0.7536  0.7753  0.7457
04 min.   1.4064    0.7589  0.7793  0.7540
05 min.   1.3577    0.7615  0.7780  0.7569
06 min.   1.3248    0.7630  0.7818  0.7597
07 min.   1.2785    0.7556  0.7711  0.7528
08 min.   1.2597    0.7597  0.7788  0.7582
09 min.   1.2501    0.7620  0.7757  0.7605
10 min.   1.2389    0.7625  0.7806  0.7603
S.D.      210.6982  4.3976  3.8342  5.7993

Note: S.D. is ×1/1000 and others are ×1/100.

⁴ The efficient frontiers over the sampling period are available upon request.
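The S.D. row of Table 3 can be checked against the printed entries. The sketch below assumes that the S.D. is the sample standard deviation (divisor n − 1) of the ten printed values in a column; under that assumption the raw column gives about 0.2107, which agrees with the printed 210.6982 after the ×1/1000 scaling stated in the table note. The variable names are illustrative only.

```python
import numpy as np

# Columns "raw" and "Type I" as printed in Table 3 (01 min. to 10 min.).
raw = np.array([1.9067, 1.6163, 1.4847, 1.4064, 1.3577,
                1.3248, 1.2785, 1.2597, 1.2501, 1.2389])
type_i = np.array([0.7803, 0.7709, 0.7753, 0.7793, 0.7780,
                   0.7818, 0.7711, 0.7788, 0.7757, 0.7806])

# Sample standard deviation over the ten sampling intervals (assumed convention).
sd_raw = np.std(raw, ddof=1)
sd_type_i = np.std(type_i, ddof=1)
```

Because the table entries are rounded to four decimals, the recomputed values match the printed S.D. row only up to rounding error.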

Table 4. Mean values of absolute covolatility.

          raw     conv.   Type I  Type II
01 min.   4.3136  4.2221  4.2334  4.2177
02 min.   4.6013  4.5301  4.5399  4.5271
03 min.   4.7614  4.6913  4.6985  4.6877
04 min.   4.8437  4.7672  4.7773  4.7651
05 min.   4.8765  4.7950  4.8039  4.7929
06 min.   4.9548  4.8649  4.8757  4.8631
07 min.   4.8942  4.7978  4.8073  4.7947
08 min.   4.9551  4.8525  4.8664  4.8497
09 min.   4.9508  4.8429  4.8534  4.8418
10 min.   4.9850  4.8679  4.8820  4.8649
S.D.      2.1010  2.0388  2.0426  2.0457

Note: S.D. is ×1/1000 and others are ×1/100.

4. Concluding Remarks
We focused on denoising a covariance matrix of log returns by using random matrix theory. Conventional research has separated noise from substantial parts dichotomously, at the convergence point of the maximum eigenvalue. Paying attention to the fact that the maximum eigenvalue is itself a random variable, we introduced an interval estimation of eigenvalues which can distinguish noises from substantial parts. We then applied this technique to an empirical analysis of high frequency financial data. Challenges for the future are the introduction of time series structure and a comparison of the forecasting ability of covolatility models.

References
1. Andersen, T. G., Bollerslev, T., and Diebold, F. X. (2007) “Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility,” Review of Economics and Statistics, 89, 701–720.
2. Barndorff-Nielsen, O. E., and Shephard, N. (2004) “Power and Bipower Variation with Stochastic Volatility and Jumps,” Journal of Financial Econometrics, 2, 1–37.
3. Bejan, A. (2005) “Largest eigenvalues and sample covariance matrices. Tracy-Widom and Painleve II: computational aspects and realization in S-Plus with applications,” Preprint.
4. Bouchaud, J. P. and Potters, M. (2000) “Theory of Financial Risks: From Statistical Physics to Risk Management,” Cambridge University Press.
5. El Karoui, N. (2005) “Recent results about the largest eigenvalue of random covariance matrices and statistical application,” Acta Phys. Pol. B, 36, 2681–2697.
6. Fisher, R. A. (1915) “Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population,” Biometrika, 10, 507–521.
7. Hansen, P. R. and Lunde, A. (2006) “Realized Variance and Market Microstructure Noise,” Journal of Business and Economic Statistics, 24, 127–161.
8. Johnstone, I. M. (2001) “On the distribution of the largest eigenvalue in principal component analysis,” Ann. of Stat., 29, 295–327.
9. Laloux, L., Cizeau, P., Potters, M. and Bouchaud, J. (2000) “Random matrix theory and financial correlations,” International Journal Theoretical Applied Finance, 3, 391–397.
10. Marcenko, V. A. and Pastur, L. A. (1967) “Distribution of eigenvalues for some sets of random matrices,” Mathematics of the USSR Sbornik, 72, 457–483.
11. Soshnikov, A. (2001) “A note on universality of the distribution of the largest eigenvalues in certain sample covariance matrices,” J. Statist. Phys., 108, 1033–1056.
12. Tracy, C. A. and Widom, H. (1993) “Level-spacing distribution and Airy kernel,” Phys. Letts. B, 305, 115–118.
13. Tracy, C. A. and Widom, H. (1994) “Level-spacing distribution and Airy kernel,” Comm. Math. Phys., 159, 151–174.
14. Tracy, C. A. and Widom, H. (1996) “On orthogonal and symplectic matrix ensembles,” Comm. Math. Phys., 177, 727–754.
15. Tracy, C. A. and Widom, H. (2000) “The distribution of the largest eigenvalue in the Gaussian ensembles,” In Calogero-Moser-Sutherland Models (J. van Diejen and L. Vinet, eds.) 461–472. Springer, New York.
16. Wigner, E. P. (1955) “Characteristic vectors of bordered matrices with infinite dimensions,” Annals of Mathematics, 62, 548–564.
17. Wigner, E. P. (1957) “On the distribution of the roots of certain symmetric matrices,” Annals of Mathematics, 67, 325–327.
18. Wishart, J. (1928) “The generalised product moment distribution in samples from a normal multivariate population,” Biometrika, 20, 32–52.


May 5, 2010 17:13 Proceedings Trim Size: 9in x 6in 009

Quantile Hedging for Defaultable Claims

Yumiharu Nakano

Graduate School of Innovation Management
Tokyo Institute of Technology
2-12-1 Ookayama 152-8552, Tokyo, Japan
and PRESTO, Japan Science and Technology Agency
4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan

E-mail: [email protected]

We study the quantile hedging problem for defaultable claims in incomplete markets modeled by Itô processes, in the case where the portfolio processes are adapted to the full filtration. Using the convex duality method as in Cvitanić and Karatzas (Bernoulli, 7 (2001), 79–97) and a good structure of the class of the equivalent martingale measures, we derive a closed form solution for the problem.

Keywords: Quantile hedging, defaultable claims, convex duality, Neyman-Pearson lemma, jump processes.

1. Introduction
It is known that, in arbitrage-free, incomplete financial markets, the super-hedging cost of a contingent claim is often too high. More precisely, for any European call option in markets with transaction costs, the cheapest super-hedging is given by the buy-and-hold portfolio. This result was conjectured by Davis and Clark [11], and proved by, to name a few, Soner, Shreve and Cvitanić [24], Cvitanić, Pham and Touzi [9], Levental and Skorohod [16], and Jakubenas, Levental and Ryznar [14]. Similar results were obtained by Bellamy and Jeanblanc [2] in jump-diffusion models and by Cvitanić, Pham and Touzi [10] in stochastic volatility models.

In such a situation, it is reasonable that a hedger of a claim starts with an initial capital less than the super-hedging cost and accepts the possibility of a shortfall. One criterion for measuring this downside risk is the probability of super-hedging being successful. Optimizing this criterion is usually called quantile hedging, which was first studied by Kulldorff [15] in the context of gambling theory. Browne [5] considers the case of financial markets modeled by Itô processes with deterministic coefficients. Föllmer and Leukert [12] study this problem for general semimartingale financial market models. Spivak and Cvitanić [25] treat partial information market models and markets with different interest rates for borrowing and for lending. Sekine [22] analyzes the case of defaultable claims in Brownian market models. Other criteria for the shortfall risk, such as the expected loss function or risk measures, are also considered. See, for example, Cvitanić [6], Cvitanić and Karatzas [7], Föllmer and Leukert [13], Nakano [17, 18, 19], Pham [21], and Sekine [23].

In this paper, we consider the quantile hedging problem for defaultable claims in Brownian market models as in [22]. That paper investigates the case where the portfolios are adapted to the market information structure and gives closed form solutions by reducing the original problems to default-free ones. In our framework presented below, the portfolio processes are assumed to be adapted to the full filtration, i.e., the filtration generated by both the price and default indicator processes.

The quantile hedging problem is non-standard as a stochastic control problem, and the usual dynamic programming approach cannot be applied in a straightforward way. Thus, the authors of [12] combine a super-hedging argument with a Neyman-Pearson lemma from hypothesis testing to reduce the original dynamic problem to a static one. In a complete market framework, the reduced static problem is stated as the testing problem of a single null hypothesis versus a single alternative hypothesis, and so is directly solved by the classical Neyman-Pearson lemma (see [12]). However, this is not the case in our incomplete markets. To handle this issue, as in [6] and [18], we follow the convex duality approach for the generalized Neyman-Pearson lemma developed in [8].

This paper is organized as follows: In Section 2, we describe our market models. As a basic result, we give an explicit formula for the super-replication cost. Section 3 presents a solution to our quantile hedging problem for defaultable claims with zero recovery rate. In doing so, we explicitly solve the dual problem with the help of a good structure of the class of the equivalent martingale measures. Section 4 deals with the case of non-zero recovery rate.

2. Model
We consider the financial market with terminal time $T \in (0,\infty)$ consisting of one stock with price process $\{S_t\}_{0\le t\le T}$ and one riskless bond with price process $\{B_t\}_{0\le t\le T}$, whose dynamics are given respectively by

$$dS_t = S_t\{b_t\,dt + \sigma_t\,dW_t\}, \quad 0 \le t \le T, \quad S_0 = s_0 \in (0,\infty),$$

$$dB_t = r_t B_t\,dt, \quad 0 \le t \le T, \quad B_0 = 1.$$

Here, $\{W_t\}_{t\ge 0}$ is a standard one-dimensional Brownian motion on a complete probability space $(\Omega, \mathcal{G}, P)$. The filtration $\mathbb{F} = \{\mathcal{F}_t\}_{t\ge 0}$ is generated by $\{W_t\}_{t\ge 0}$, augmented with the $P$-null sets in $\mathcal{G}$. The processes $\{b_t\}$, $\{r_t\}$, $\{\sigma_t\}$ are all assumed to be bounded $\mathbb{F}$-predictable processes. Moreover we assume that $\sigma_t > 0$ for $t \in [0,T]$ a.s. and that $\sigma_t^{-1}$ is also bounded. Then, the process

$$\theta_t := \sigma_t^{-1}(b_t - r_t), \quad 0 \le t \le T,$$

is a bounded $\mathbb{F}$-predictable process.

Let $\tau$ be a nonnegative random variable satisfying $P(\tau = 0) = 0$ and $P(\tau > t) > 0$ for any $t \ge 0$, and let $\{N_t\}_{t\ge 0}$ be the counting process with respect to $\tau$, i.e.,

$$N_t = 1_{\{\tau \le t\}}, \quad t \ge 0.$$

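Denote by $\mathbb{H} = \{\mathcal{H}_t\}_{t\ge 0}$ the filtration generated by $\{N_t\}$ and by $\mathbb{G} = \{\mathcal{G}_t\}_{t\ge 0}$ the filtration $\mathbb{F} \vee \mathbb{H}$. For simplicity we assume that $\mathcal{G} = \mathcal{G}_T$. The survival process $\{G_t\}_{t\ge 0}$ of $\tau$ with respect to $\mathbb{F}$ is then defined by

$$G_t = P(\tau > t \mid \mathcal{F}_t), \quad 0 \le t \le T.$$

We assume that $G_t > 0$ for $t \ge 0$, and consider the hazard process $\{\Gamma_t\}_{t\ge 0}$ of $\tau$ with respect to $\mathbb{F}$ defined by $G_t = e^{-\Gamma_t}$, or $\Gamma_t = -\log G_t$, for every $t \ge 0$. We also assume that $\Gamma_t = \int_0^t \mu_s\,ds$, $t \ge 0$, for some nonnegative $\mathbb{F}$-predictable process $\{\mu_t\}_{t\ge 0}$, the so-called $\mathbb{F}$-intensity of the random time $\tau$. Then the process

$$M_t := N_t - \int_0^t \mu_s (1 - N_{s-})\,ds = N_t - \int_0^{t\wedge\tau} \mu_s\,ds, \quad t \ge 0,$$

follows a $\mathbb{G}$-martingale (see Bielecki and Rutkowski [3]).

We now make the standing assumption that $\{W_t\}$ is a $(\mathbb{G}, P)$-standard Brownian motion. Notice that if $\tau$ is independent of $\{W_t\}$ then this assumption is satisfied. Moreover, we can construct the random time $\tau$ such that $\{W_t\}$ is a $(\mathbb{G}, P)$-standard Brownian motion (see, e.g., [3]).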
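The martingale property of $M_t$ can be illustrated numerically in the simplest special case. The sketch below assumes a constant intensity $\mu$, so that $\tau$ is exponentially distributed; this special case, the function name and the parameter values are assumptions for illustration. Since $M_0 = 0$, the sample mean of $M_t$ should be close to zero.

```python
import numpy as np


def compensated_default_indicator(tau, t, mu):
    """M_t = N_t - mu * min(t, tau) for a constant intensity mu."""
    N_t = (tau <= t).astype(float)       # default indicator 1_{tau <= t}
    return N_t - mu * np.minimum(t, tau)  # subtract the compensator


rng = np.random.default_rng(0)
mu, t = 0.5, 1.0
tau = rng.exponential(scale=1.0 / mu, size=200_000)  # P(tau > s) = exp(-mu * s)

M_t = compensated_default_indicator(tau, t, mu)
mean_M = M_t.mean()  # Monte Carlo estimate of E[M_t], which equals 0
```

This checks only the expectation at a single time; it is a sanity check on the compensator, not a verification of the full martingale property.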

As in the usual Brownian market models, we consider the $\mathbb{G}$-martingale

$$Z^*_t = \exp\left(-\int_0^t \theta_s\,dW_s - \frac{1}{2}\int_0^t \theta_s^2\,ds\right), \quad 0 \le t \le T.$$

Then by Girsanov's theorem, the process

$$W^*_t := W_t + \int_0^t \theta_s\,ds, \quad 0 \le t \le T,$$

is a standard Brownian motion under the probability measure $P^*$ defined by

$$\frac{dP^*}{dP} = Z^*_T.$$

In addition, we consider the process

$$Z^\kappa_t = \left(1 + \kappa_\tau 1_{\{\tau \le t\}}\right)\exp\left(-\int_0^{t\wedge\tau} \kappa_s \mu_s\,ds\right), \quad 0 \le t \le T,$$

where $\{\kappa_t\}_{0\le t\le T}$ is taken from the class

$$\mathcal{D} = \left\{\{\kappa_t\}_{0\le t\le T} : \text{bounded, } \mathbb{G}\text{-predictable, } \kappa_t > -1 \ dt \times dP\text{-a.e.}\right\}.$$

Then $Z^\kappa_t$, $\kappa \in \mathcal{D}$, satisfies

$$Z^\kappa_t = 1 + \int_0^t \kappa_s Z^\kappa_{s-}\,dM_s, \quad 0 \le t \le T,$$

and follows a $(P, \mathbb{G})$-martingale (see Brémaud [4] for example). Since the quadratic covariation process $[Z^*, Z^\kappa]$ is identically zero,

$$d(Z^*_t Z^\kappa_t) = Z^*_t Z^\kappa_{t-}\left(-\theta_t\,dW_t + \kappa_t\,dM_t\right). \qquad (2.1)$$

Thus, $Z^*_t Z^\kappa_t$ is a positive $(P, \mathbb{G})$-martingale for $\kappa \in \mathcal{D}$. Each $Z^\kappa_t$ is orthogonal to $(P, \mathbb{F})$-martingales, so we can show that $W^*_t$ is also a Brownian motion under $Q^\kappa$ defined by $dQ^\kappa/dP = Z^*_T Z^\kappa_T$. Hence $\{Q^\kappa : \kappa \in \mathcal{D}\}$ defines the class of the equivalent martingale measures. We refer to [3] for details.

We consider $\mathbb{G}$ as the available information for the market participants. The portfolio process is thus defined as a $\mathbb{G}$-predictable process $\{\pi_t\}_{0\le t\le T}$ satisfying $\int_0^T |\pi_t|^2\,dt < \infty$, a.s. The (self-financing) wealth process $\{X^{x,\pi}_t\}_{0\le t\le T}$ for an initial wealth $x \ge 0$ and a portfolio process $\{\pi_t\}$ is then described by

$$dX^{x,\pi}_t = r_t X^{x,\pi}_t\,dt + \pi_t (b_t - r_t)\,dt + \pi_t \sigma_t\,dW_t, \quad X^{x,\pi}_0 = x.$$

The solution to this equation is given by

$$X^{x,\pi}_t = B_t\left[x + \int_0^t B_u^{-1}\pi_u\{(b_u - r_u)\,du + \sigma_u\,dW_u\}\right], \quad 0 \le t \le T.$$

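We write $\mathcal{A}(x)$ for the set of all portfolio processes $\{\pi_t\}_{0\le t\le T}$ such that $X^{x,\pi}_t \ge 0$, $0 \le t \le T$, a.s.

By Itô's formula and (2.1), we get, for $\pi \in \mathcal{A}(x)$,

$$d(L^\kappa_t X^{x,\pi}_t) = L^\kappa_{t-}\left[(\pi_t \sigma_t - X^{x,\pi}_t \theta_t)\,dW_t + X^{x,\pi}_t \kappa_t\,dM_t\right],$$

where

$$L^\kappa_t = B_t^{-1} Z^*_t Z^\kappa_t, \quad \kappa \in \mathcal{D}.$$

This and the nonnegativity of the wealth process mean that the process $L^\kappa_t X^{x,\pi}_t$ is a nonnegative local martingale, hence a supermartingale, for each $\pi \in \mathcal{A}(x)$. We denote by $\mathcal{L}$ the set of all random variables $L^\kappa_T$, $\kappa \in \mathcal{D}$.

In this setting, we consider hedging problems for the defaultable claim $H$ defined by

$$H = Y 1_{\{\tau > T\}} + \delta Y 1_{\{\tau \le T\}}. \qquad (2.2)$$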


Here, $Y$ is an $\mathcal{F}_T$-measurable nonnegative random variable, which represents the payoff received by the holder at time $T$ if the default does not occur in $[0,T]$. We assume that $E^*[Y] < \infty$, where $E^*$ stands for the expectation with respect to $P^*$. The constant $\delta \in [0,1]$ is the recovery rate of the payoff in case the default occurs in $[0,T]$.

The most conservative way of hedging the claim is the so-called super-hedging, and its cost $\Pi(H)$ for $H$ is defined by

$$\Pi(H) = \inf\{x \ge 0 : X^{x,\pi}_T \ge H \text{ a.s. for some } \pi \in \mathcal{A}(x)\}.$$

In our setting, this super-hedging cost can be obtained explicitly.

Proposition 2.1. Let $H$ be as in (2.2) such that $E^*[Y] < \infty$. Then we have

$$\Pi(H) = E^*[B_T^{-1} Y].$$

Moreover, the replicating portfolio for $Y$ becomes a super-hedging portfolio for $H$.

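Proof. Set $x = E^*[B_T^{-1} Y]$ and let $\pi$ be the replicating portfolio for $Y$. Then we find that $\pi \in \mathcal{A}(x)$ and $X^{x,\pi}_T = Y \ge H$. Thus $x \ge \Pi(H)$.

On the other hand, suppose that $X^{x,\pi}_T \ge H$ for some $\pi \in \mathcal{A}(x)$. Then, from the supermartingale property of $L^\kappa_t X^{x,\pi}_t$,

$$E[L^\kappa_T H] \le E[L^\kappa_T X^{x,\pi}_T] \le x, \quad \kappa \in \mathcal{D}. \qquad (2.3)$$

It follows from $H = \delta Y + (1-\delta) Y 1_{\{\tau > T\}}$ that the left-hand side in (2.3) can be written as

$$E[L^\kappa_T H] = E[B_T^{-1} Z^*_T Z^\kappa_T \delta Y] + E[L^\kappa_T (1-\delta) Y 1_{\{\tau > T\}}]. \qquad (2.4)$$

Since the quadratic covariation of $Z^\kappa_t$ and an $\mathbb{F}$-martingale is equal to zero, the process $Z^\kappa_t E[B_T^{-1} Z^*_T \delta Y \mid \mathcal{F}_t]$ is a local martingale. So, if $Y$ is bounded then this process is a martingale. Therefore, by approximating $Y$ with $Y \wedge n$ and by the monotone convergence theorem, we find that the first term in the right-hand side of (2.4) is given by $E[B_T^{-1} Z^*_T \delta Y]$. From this and (2.3) we have

$$E[B_T^{-1} Z^*_T \delta Y] + \sup_{\kappa \in \mathcal{D}} E[L^\kappa_T (1-\delta) Y 1_{\{\tau > T\}}] \le \Pi(H).$$

However, for any constant $\kappa > -1$,

$$\begin{aligned}
E[L^\kappa_T Y 1_{\{\tau > T\}}] &= E\left[B_T^{-1} Z^*_T Y (1 + \kappa 1_{\{\tau \le T\}})\, e^{-\kappa \int_0^{\tau\wedge T} \mu_t\,dt}\, 1_{\{\tau > T\}}\right] \\
&= E\left[B_T^{-1} Z^*_T Y 1_{\{\tau > T\}}\, e^{-\kappa \int_0^T \mu_t\,dt}\right] = E\left[B_T^{-1} Z^*_T Y G_T\, e^{-\kappa \int_0^T \mu_t\,dt}\right] \\
&= E\left[B_T^{-1} Z^*_T Y e^{-(\kappa+1) \int_0^T \mu_t\,dt}\right].
\end{aligned}$$

Hence $E[L^\kappa_T Y 1_{\{\tau > T\}}] \to E[B_T^{-1} Z^*_T Y]$ as $\kappa \downarrow -1$. Thus the proposition follows.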
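The limiting step at the end of the proof can be made concrete in a special case. Assuming a constant intensity $\mu$ (an assumption for illustration, as are the function name and the numbers), the factor $e^{-(\kappa+1)\int_0^T \mu_t\,dt}$ becomes $e^{-(\kappa+1)\mu T}$, and the sketch below shows it increasing to 1 as $\kappa$ decreases to $-1$.

```python
import math


def default_weight(kappa, mu, T):
    """exp(-(kappa + 1) * mu * T): the factor multiplying B_T^{-1} Z*_T Y
    under the measure Q^kappa when kappa and mu are constants."""
    return math.exp(-(kappa + 1.0) * mu * T)


# kappa decreasing towards -1; the weight increases towards 1.
kappas = (0.0, -0.5, -0.9, -0.99, -0.999)
weights = [default_weight(k, mu=0.05, T=1.0) for k in kappas]
```

In words, measures $Q^\kappa$ with $\kappa$ close to $-1$ make default almost impossible before $T$, which is why the super-hedging cost of $H$ coincides with the cost of the default-free payoff $Y$.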


3. Quantile Hedging Problem
Proposition 2.1 implies that if the hedger of the defaultable claim wants to hedge the claim almost surely then s/he needs to have the perfect hedging cost for the liability to be paid when the default does not occur. However, the price of $H$ should reflect the possibility of default and be smaller than $E^*[B_T^{-1} Y]$, since with this cost one can receive $Y$ almost surely by buying in the default-free market. In other words, an initial wealth for hedging $H$ may be smaller than $E^*[B_T^{-1} Y]$. In such a case there is the possibility of a shortfall in hedging $H$. One criterion for measuring this downside risk is the probability of super-hedging being successful. Our objective is thus to solve the following problem: for $x < E^*[B_T^{-1} Y]$,

$$\max_{\pi \in \mathcal{A}(x)} P(X^{x,\pi}_T \ge H). \qquad (3.1)$$

Adopting an optimal portfolio for this problem as a hedging strategy for $H$ is usually called quantile hedging. To solve the quantile hedging problem (3.1), as in [18], we first reduce the original dynamic problem to a Neyman-Pearson type problem via a super-hedging argument, then adapt the convex duality approach to solve the Neyman-Pearson type problem.

To this end, we first introduce the class $\bar{\mathcal{L}}$ defined as the closed hull of $\mathcal{L}$ with respect to $L^1 := L^1(\Omega, \mathcal{G}, P)$ convergence. Since $\mathcal{L}$ is convex (see, e.g., [4]), so is $\bar{\mathcal{L}}$. Thus, $\bar{\mathcal{L}}$ is a closed convex set in $L^1$.

Let us consider the Neyman-Pearson type problem defined by

$$\max_{\varphi \in \mathcal{R}} E[\varphi] \qquad (3.2)$$

where $\mathcal{R} = \{\varphi : 0 \le \varphi \le 1 \text{ a.s.}, \ \sup_{L \in \bar{\mathcal{L}}} E[LH\varphi] \le x\}$.

As in [12], our problem is reduced to the Neyman-Pearson type problem via the following proposition.

Proposition 3.1. Suppose that there exist $A \in \mathcal{G}_T$ and $\{\hat{\pi}_t\} \in \mathcal{A}(x)$ such that $1_A$ solves the Neyman-Pearson type problem (3.2) and $X^{x,\hat{\pi}}_T \ge H 1_A$ a.s. Then $\hat{\pi}$ is optimal for the quantile hedging problem (3.1).

Proof. For $\pi \in \mathcal{A}(x)$,

$$E\left[L^\kappa_T 1_{\{X^{x,\pi}_T \ge H\}} H\right] \le E\left[L^\kappa_T 1_{\{X^{x,\pi}_T \ge H\}} X^{x,\pi}_T\right] \le E[X^{x,\pi}_T L^\kappa_T] \le x, \quad \kappa \in \mathcal{D}.$$

Let $L \in \bar{\mathcal{L}}$. Then there exist $L_n \in \mathcal{L}$, $n = 1, 2, \ldots$, such that $L = \lim_{n\to\infty} L_n$ a.s. (possibly along a subsequence). Thus it follows from Fatou's lemma that $1_{\{X^{x,\pi}_T \ge H\}} \in \mathcal{R}$. Hence

$$\max_{\pi \in \mathcal{A}(x)} P(X^{x,\pi}_T \ge H) \le \max_{\varphi \in \mathcal{R}} E[\varphi]. \qquad (3.3)$$

On the other hand, denoting $X = X^{x,\hat{\pi}}_T$, we see

$$P(X \ge H) \ge P(X \ge H, A) = P(X \ge H 1_A, A) = P(A) = \max_{\varphi \in \mathcal{R}} E[\varphi].$$

Combining this with (3.3), we have the proposition.

We adapt the convex duality approach in [8] and [18]. Observe that for $\varphi \in \mathcal{R}$, $y \ge 0$, $L \in \bar{\mathcal{L}}$,

$$E[\varphi] = E[\varphi(1 - yLH)] + yE[LH\varphi] \le E[(1 - yLH)^+] + yx. \qquad (3.4)$$

Thus the following dual problem naturally arises:

$$V(x) := \inf_{y \ge 0,\ L \in \bar{\mathcal{L}}} \left\{E[(1 - yLH)^+] + yx\right\}. \qquad (3.5)$$

In what follows we will see that the minimization above can be completely solved in the case of zero recovery rate, i.e., in the case that $H$ is of the following form:

$$H = Y 1_{\{\tau > T\}}. \qquad (3.6)$$

Define

$$\hat{L} = B_T^{-1} Z^*_T 1_{\{\tau > T\}}\, e^{\int_0^T \mu_t\,dt}. \qquad (3.7)$$

Then we have the following:

Theorem 3.2. Suppose that $H$ is as in (3.6) with $E^*[Y] < \infty$. Then $\hat{L}$ defined by (3.7) solves

$$\inf_{L \in \bar{\mathcal{L}}} E[(1 - yLH)^+].$$

Moreover, there exists $\hat{y} > 0$ that minimizes

$$h(y) := E[(1 - y\hat{L}H)^+] + yx$$

over $y \ge 0$. The pair $(\hat{y}, \hat{L})$ is optimal for the minimization problem (3.5).

Proof. First notice that $\hat{L} \in \bar{\mathcal{L}}$ since $\hat{L} = \lim_{\kappa \downarrow -1} L^\kappa_T$ a.s. and in $L^1$.

For $\kappa \in \mathcal{D}$,

$$\begin{aligned}
E[1 \wedge (yL^\kappa_T H)] &= E\left[1 \wedge \left(yB_T^{-1} Z^*_T e^{-\int_0^T \kappa_s \mu_s\,ds}\, Y\right) 1_{\{\tau > T\}}\right] \\
&\le E\left[1 \wedge \left(yB_T^{-1} Z^*_T e^{\int_0^T \mu_s\,ds}\, Y\right) 1_{\{\tau > T\}}\right] \\
&= E[1 \wedge (y\hat{L}H)].
\end{aligned}$$

Thus, by Fatou's lemma and $(1 - z)^+ = 1 - 1 \wedge z$, we find

$$E[(1 - yLH)^+] \ge E[(1 - y\hat{L}H)^+], \quad L \in \bar{\mathcal{L}}.$$

Next, we claim that there exists $y_0 > 0$ such that $h(y_0) < 1$. Suppose otherwise. Then we find that $E[1 \wedge (y\hat{L}H)] \le yx$ for every $y > 0$. Dividing by $y$ and then letting $y \downarrow 0$, we obtain $E[\hat{L}H] \le x$. However, this contradicts the assumption $x < E^*[B_T^{-1} Y]$ since

$$E[\hat{L}H] = E\left[B_T^{-1} Z^*_T e^{\int_0^T \mu_t\,dt}\, Y P(\tau > T \mid \mathcal{F}_T)\right] = E^*[B_T^{-1} Y].$$

The existence of the minimizer $\hat{y} > 0$ now follows from the convexity of $h$ and the facts that $h(0) = 1$ and $h(+\infty) = +\infty$.

The optimality of the pair $(\hat{y}, \hat{L})$ for (3.5) is easy to see, so it is omitted.

Let $\hat{y} > 0$ be as in the previous theorem and consider the $\mathcal{F}_T$-measurable random variable $\xi$ defined by

$$\xi = \hat{y} B_T^{-1} Z^*_T e^{\int_0^T \mu_t\,dt}\, Y.$$

Then we can describe an optimal quantile hedging portfolio in the following way:

Theorem 3.3. Suppose that $H$ is as in (3.6) with $E^*[Y] < \infty$ and that $P(\xi = 1) = 0$. Then the perfect hedging portfolio for $Y 1_{\{\xi < 1\}}$ is optimal for the quantile hedging problem (3.1).

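Proof. We follow essentially the same arguments as in [8] and [18].

The set

$$\mathcal{M} := \{(L, y) \in L^1 \times \mathbb{R} : L \in \bar{\mathcal{L}},\ y \ge 0\}$$

is closed and convex in the Banach space $L^1 \times \mathbb{R}$ with the norm $\|(L, y)\| := E|L| + |y|$. Moreover, it is straightforward to see that the functional

$$L^1 \times \mathbb{R} \ni (L, y) \mapsto U(L, y) := yx + E(1 - L)^+$$

is proper, convex and lower semi-continuous on $L^1 \times \mathbb{R}$, and that

$$\inf_{(L,y) \in \mathcal{M}} U(yLH, y) = U(\hat{y}\hat{L}H, \hat{y}).$$

Denote $L^\infty(\Omega, \mathcal{G}, P)$ by $L^\infty$. Let us consider the set $\mathcal{M}^* = \{(yLH, y) : (L, y) \in \mathcal{M}\}$, the normal cone

$$\mathcal{N}(\hat{y}\hat{L}H, \hat{y}) := \left\{(\varphi, u) \in L^\infty \times \mathbb{R} : E[\hat{y}\hat{L}H\varphi] + \hat{y}u \ge E[yLH\varphi] + yu,\ (L, y) \in \mathcal{M}\right\}$$

to the set $\mathcal{M}^*$ at $(\hat{y}\hat{L}H, \hat{y})$, and the subdifferential

$$\partial U(\hat{y}\hat{L}H, \hat{y}) := \left\{(\varphi, u) \in L^\infty \times \mathbb{R} : U(\hat{y}\hat{L}H, \hat{y}) - U(L, y) \le E[\varphi(\hat{y}\hat{L}H - L)] + u(\hat{y} - y),\ (L, y) \in L^1 \times \mathbb{R}\right\}$$

at this point. Then, by Corollary 4.6.3 in Aubin and Ekeland [1],

$$(0, 0) \in \partial U(\hat{y}\hat{L}H, \hat{y}) + \mathcal{N}(\hat{y}\hat{L}H, \hat{y}).$$

This implies that there exists $(\hat{\varphi}, \hat{u}) \in L^\infty \times \mathbb{R}$ such that $(\hat{\varphi}, \hat{u}) \in \mathcal{N}(\hat{y}\hat{L}H, \hat{y})$ and $(-\hat{\varphi}, -\hat{u}) \in \partial U(\hat{y}\hat{L}H, \hat{y})$. Hence we obtain

$$E[H\hat{\varphi}(\hat{y}\hat{L} - yL)] + (\hat{y} - y)\hat{u} \ge 0, \quad (L, y) \in \mathcal{M}, \qquad (3.8)$$

$$(x + \hat{u})(\hat{y} - y) \le E[\hat{\varphi}(L - \hat{y}\hat{L}H)] + E(1 - L)^+ - E(1 - \hat{y}\hat{L}H)^+, \quad (L, y) \in L^1 \times \mathbb{R}. \qquad (3.9)$$

By letting $y \to \pm\infty$, we see that (3.9) holds only if $\hat{u} = -x$. From (3.8) with $\hat{u} = -x$, $y = \hat{y} \pm \delta$ ($\delta > 0$), and $L = \hat{L}$, we have

$$E[H\hat{\varphi}\hat{L}] = x. \qquad (3.10)$$

Thus, reading (3.8) with $y = \hat{y}$, we get

$$E[H\hat{\varphi}L] \le x, \quad L \in \bar{\mathcal{L}}. \qquad (3.11)$$

Eq. (3.9) is now written as

$$E[\hat{\varphi}(L - \hat{y}\hat{L}H)] + E(1 - L)^+ - E(1 - \hat{y}\hat{L}H)^+ \ge 0, \quad L \in L^1. \qquad (3.12)$$

Considering (3.12) for $L = \hat{y}\hat{L}H + 1_A$ with arbitrary $A \in \mathcal{G}$, we see that $0 \le E[\hat{\varphi} 1_A]$. Thus $\hat{\varphi} \ge 0$ a.s. Similarly, considering (3.12) for $L = \hat{y}\hat{L}H - 1_A$ with arbitrary $A \in \mathcal{G}$ and using $(x + y)^+ \le (x)^+ + (y)^+$ for $x, y \in \mathbb{R}$, we see that $0 \le E[(1 - \hat{\varphi}) 1_A]$. Thus $\hat{\varphi} \le 1$ a.s. Combining with (3.11), we have $\hat{\varphi} \in \mathcal{R}$.

Eq. (3.12) for $L = 1$ implies $E[\hat{\varphi}(1 - \hat{y}\hat{L}H)] \ge E(1 - \hat{y}\hat{L}H)^+$. Thus $\hat{\varphi}(1 - \hat{y}\hat{L}H) = (1 - \hat{y}\hat{L}H)^+$ a.s. From this and $\hat{\varphi} \in \mathcal{R}$ we find that $\hat{\varphi} = 1$ on $\{\hat{y}\hat{L}H < 1\}$ and $\hat{\varphi} = 0$ on $\{\hat{y}\hat{L}H > 1\}$. Hence there must be some $[0,1]$-valued random variable $C$ such that the representation

$$\hat{\varphi} = 1_{\{\hat{y}\hat{L}H < 1\}} + C 1_{\{\hat{y}\hat{L}H = 1\}} \qquad (3.13)$$

holds. Moreover, we have $E[\hat{\varphi}] = E[(1 - \hat{y}\hat{L}H)^+] + \hat{y}x$. This and (3.4) imply that $\hat{\varphi}$ is optimal for the Neyman-Pearson type problem and that there is no duality gap. By the assumption of the theorem, $P(\hat{y}\hat{L}H = 1) = P(\xi = 1, \tau > T) = 0$. Moreover, since $H\hat{\varphi} = Y 1_{\{\xi < 1\}} 1_{\{\tau > T\}}$, we can apply Proposition 3.1 to deduce that a super-hedging portfolio for $H\hat{\varphi}$ is given by the perfect hedging portfolio for $Y 1_{\{\xi < 1\}}$. From this and Proposition 2.1, we arrive at the conclusion of the theorem.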

Remark 3.4. The condition $P(\xi = 1) = 0$ is satisfied, e.g., in the case where the following are all satisfied: $\tau$ is independent of $\{W_t\}$, the market model is the Black-Scholes model, and $Y$ is given by a plain vanilla option. In more general Markovian cases, Hörmander's condition in Malliavin calculus (see, e.g., Nualart [20]) can be used to check that $\xi$ is continuously distributed.

Remark 3.5. Since the perfect hedging strategy for $Y 1_{\{\xi < 1\}}$ is an $\mathbb{F}$-predictable process, Theorem 3.3 gives an explicit solution to the quantile hedging problems with respect to $\mathbb{F}$-predictable portfolios studied in [22] in the case of zero recovery rate.
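For the setting of Remark 3.4, the dual objective $h$ and the modified payoff $Y 1_{\{\xi < 1\}}$ can be approximated by Monte Carlo. The following is a rough sketch under explicit assumptions, none of which come from the paper: constant coefficients $b$, $r$, $\sigma$ (Black-Scholes), a constant intensity $\mu$ with $\tau$ independent of $W$, a call payoff $Y = (S_T - K)^+$, and illustrative parameter values and function names. Under these assumptions $P(\tau > T \mid \mathcal{F}_T) = e^{-\mu T}$, so the expectation over $\tau$ in $h(y)$ is taken in closed form.

```python
import numpy as np


def quantile_hedge_sketch(x_fraction=0.5, s0=100.0, K=100.0, b=0.08, r=0.02,
                          sigma=0.2, mu=0.05, T=1.0, n=1000, seed=0):
    """Monte Carlo sketch of h(y) = E[(1 - y*L_hat*H)^+] + y*x and its minimizer."""
    rng = np.random.default_rng(seed)
    W_T = np.sqrt(T) * rng.standard_normal(n)
    theta = (b - r) / sigma
    Z_star = np.exp(-theta * W_T - 0.5 * theta**2 * T)         # density dP*/dP
    S_T = s0 * np.exp((b - 0.5 * sigma**2) * T + sigma * W_T)  # stock price at T
    Y = np.maximum(S_T - K, 0.0)                               # plain vanilla call
    B_T = np.exp(r * T)
    survive = np.exp(-mu * T)                                  # P(tau > T | F_T)

    c = Z_star * np.exp(mu * T) * Y / B_T    # value of L_hat * H on {tau > T}
    full_cost = survive * c.mean()           # estimate of E*[B_T^{-1} Y]
    x = x_fraction * full_cost               # initial capital below the full cost

    def h(y):
        # On {tau <= T}, L_hat * H = 0, so the integrand equals 1 there.
        return (1.0 - survive) + survive * np.maximum(1.0 - y * c, 0.0).mean() + y * x

    # The sample version of h is convex and piecewise linear in y,
    # so its minimum is attained at y = 0 or at a kink y = 1 / c_i.
    candidates = np.concatenate(([0.0], 1.0 / c[c > 0]))
    values = np.array([h(y) for y in candidates])
    y_hat = candidates[values.argmin()]
    xi = y_hat * c
    return {"x": x, "full_cost": full_cost, "y_hat": y_hat, "h0": h(0.0),
            "h_min": values.min(), "hedged_payoff": Y * (xi < 1.0)}


result = quantile_hedge_sketch()
```

Here `h_min` estimates the dual value, which by the absence of a duality gap shown in the proof of Theorem 3.3 corresponds to the maximal success probability, and `hedged_payoff` holds sampled values of $Y 1_{\{\xi < 1\}}$. The search over kinks is exact only for the sample version of $h$; a finer simulation would be needed for reliable numbers.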

4. Case of Non-Zero Recovery Rate
Next we consider the case of $\delta > 0$. In this case, solving the dual problem is more difficult, and we leave it for a future study. Instead, we present a solution to our quantile hedging problem in a restricted class of portfolio processes.

Notice that the payoff of the defaultable claim satisfies

$$H = \delta Y + (1 - \delta) Y 1_{\{\tau > T\}} \ge \delta Y.$$

This implies that a seller of the claim must pay at least $\delta Y$ at the maturity. Since $\delta Y$ is $\mathcal{F}_T$-measurable, there exists a unique portfolio process $\{\pi^*_t\}_{0\le t\le T}$ such that

$$X^{x^*,\pi^*}_t = E^*[B_T^{-1} B_t \delta Y \mid \mathcal{F}_t],$$

where $x^* = E^*[B_T^{-1} \delta Y]$. In view of these considerations, we impose the following capital requirement on the wealth process:

$$X^{x,\pi}_t \ge E^*[B_T^{-1} B_t \delta Y \mid \mathcal{F}_t], \quad 0 \le t \le T, \text{ a.s.} \qquad (4.1)$$

In particular, $x$ must be at least $x^*$. We denote by $\mathcal{A}^*(x)$ the set of all portfolio processes such that the corresponding wealth process with initial wealth $x$ satisfies (4.1), and restrict the class of portfolio processes to $\mathcal{A}^*(x)$. Then let us consider the following quantile hedging problem:

$$\max_{\pi \in \mathcal{A}^*(x)} P(X^{x,\pi}_T \ge H). \qquad (4.2)$$

It follows from $H = \delta Y + (1 - \delta) Y 1_{\{\tau > T\}}$ that $P(X^{x,\pi}_T \ge H) = P(X^{x - x^*, \pi - \pi^*}_T \ge (1 - \delta) Y 1_{\{\tau > T\}})$. Thus the problem (4.2) is reduced to the maximization problem of $P(X^{x',\pi'}_T \ge H')$ over all portfolio processes $\pi' \in \mathcal{A}(x')$. Here, $x' = x - x^* \ge 0$ and $H' = (1 - \delta) Y 1_{\{\tau > T\}}$. Thus,

$$\max_{\pi \in \mathcal{A}^*(x)} P(X^{x,\pi}_T \ge H) = \max_{\pi' \in \mathcal{A}(x')} P(X^{x',\pi'}_T \ge H').$$

Therefore we can apply Theorem 3.3 to the problem (4.2) and obtain the following:

Theorem 4.1. Suppose that $E^*[B_T^{-1} Y] < \infty$, and let $y'$ and $\xi'$ be defined by

$$y' = \operatorname*{argmin}_{y \ge 0} \left\{E(1 - y\hat{L}H')^+ + yx'\right\},$$

$$\xi' = y' B_T^{-1} Z^*_T e^{\int_0^T \mu_t\,dt} (1 - \delta) Y.$$

Suppose moreover that $P(\xi' = 1) = 0$. Then the perfect hedging portfolio for $\delta Y + (1 - \delta) Y 1_{\{\xi' < 1\}}$ is optimal for the quantile hedging problem (4.2).
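The reduction used before Theorem 4.1 amounts to simple bookkeeping, sketched below. The function name and the example numbers are hypothetical; `x_star` stands for $x^* = E^*[B_T^{-1}\delta Y]$ and is assumed to have been computed elsewhere.

```python
def reduce_to_zero_recovery(x, x_star, delta):
    """Map problem (4.2) to the zero-recovery problem of Section 3.

    Returns (x_prime, scale), where x_prime = x - x_star is the capital left
    after replicating delta * Y, and scale = 1 - delta is the factor in
    H' = (1 - delta) * Y * 1_{tau > T}.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("recovery rate delta must lie in [0, 1]")
    if x < x_star:
        raise ValueError("capital requirement (4.1) needs x >= x_star")
    return x - x_star, 1.0 - delta


# Illustrative numbers only.
x_prime, scale = reduce_to_zero_recovery(x=5.0, x_star=2.0, delta=0.4)
```

The remaining capital `x_prime` and the scaled claim can then be treated exactly as the initial wealth and the claim of the zero-recovery problem.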

References

1. Aubin, J. and Ekeland, J., Applied nonlinear analysis, John Wiley & Sons, New York, 1984.
2. Bellamy, N. and Jeanblanc, M., Incompleteness of markets driven by a mixed diffusion, Finance and Stochastics, 4, (2000), 209–222.
3. Bielecki, T. R. and Rutkowski, M., Credit Risk: Modeling, Valuation and Hedging, Springer-Verlag, Berlin, 2004.
4. Bremaud, P., Point Processes and Queues: Martingale Dynamics, Springer-Verlag, New York, 1981.
5. Browne, S., Reaching Goals by a Deadline: Digital Options and Continuous-Time Active Portfolio Management, Advances in Applied Probability, 31, (1999), 551–577.
6. Cvitanic, J., Minimizing expected loss of hedging in incomplete and constrained markets, SIAM Journal on Control and Optimization, 38, (2000), 1050–1066.
7. Cvitanic, J. and Karatzas, I., On dynamic measures of risk, Finance and Stochastics, 3, (1999), 904–950.
8. Cvitanic, J. and Karatzas, I., Generalized Neyman-Pearson lemma via convex duality, Bernoulli, 7, (2001), 79–97.
9. Cvitanic, J., Pham, H., and Touzi, N., A closed-form solution for the problem of super-replication under transaction costs, Finance and Stochastics, 3, (1999), 35–54.
10. Cvitanic, J., Pham, H., and Touzi, N., Super-replication in stochastic volatility models with portfolio constraints, Journal of Applied Probability, 36, (1999), 523–545.
11. Davis, M. H. and Clark, J. M. C., A note on super replicating strategies, Philosophical Transactions of the Royal Society of London. Series A, Physical sciences and engineering, 347, (1994), 485–494.
12. Follmer, H. and Leukert, P., Quantile Hedging, Finance and Stochastics, 3, (1999), 251–273.
13. Follmer, H. and Leukert, P., Efficient hedging: cost versus shortfall risk, Finance and Stochastics, 4, (2000), 117–146.
14. Jakubenas, P., Levental, S., and Ryznar, M., The super-replication problem via probabilistic methods, Annals of Applied Probability, 13, (2003), 742–773.
15. Kulldorff, M., Optimal control of a favourable game with a time-limit, SIAM Journal on Control and Optimization, 31, (1993), 52–69.
16. Levental, S. and Skorohod, A. V., On the possibility of hedging options in the presence of transaction costs, Annals of Applied Probability, 7, (1997), 410–443.
17. Nakano, Y., Efficient hedging with coherent risk measure, Journal of Mathematical Analysis and Applications, 293, (2004), 345–354.
18. Nakano, Y., Minimizing coherent risk measures of shortfall in discrete-time models under cone constraints, Applied Mathematical Finance, 10, (2003), 163–181.
19. Nakano, Y., Minimization of shortfall risk in a jump-diffusion model, Statistics & Probability Letters, 67, (2004), 87–95.
20. Nualart, D., The Malliavin calculus and related topics, 2nd ed., Springer-Verlag, Berlin, 2006.
21. Pham, H., Dynamic $L^p$-hedging in discrete time under cone constraints, SIAM Journal on Control and Optimization, 38, (2000), 665–682.
22. Sekine, J., Quantile hedging for defaultable securities in an incomplete market, Mathematical Economics (Kyoto, 1999), Sūrikaisekikenkyūsho Kōkyūroku, No. 1165, (2000), 215–231.
23. Sekine, J., Dynamic minimization of worst conditional expectation of shortfall, Mathematical Finance, 14, (2004), 605–618.
24. Soner, H. M., Shreve, S. E., and Cvitanic, J., There is no nontrivial hedging portfolio for option pricing with transaction costs, Annals of Applied Probability, 5, (1995), 327–355.
25. Spivak, G. and Cvitanic, J., Maximizing the probability of a perfect hedging, Annals of Applied Probability, 9, (1999), 1303–1328.


May 3, 2010 16:24 Proceedings Trim Size: 9in x 6in 010

New Unified Computational Algorithm in a High-Order Asymptotic Expansion Scheme∗

Kohta Takehara†, Akihiko Takahashi and Masashi Toda

Graduate School of Economics, University of Tokyo
7-3-1, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.

E-mail: [email protected]

An asymptotic expansion scheme in finance initiated by Kunitomo and Takahashi [6] and Yoshida [29] is a widely applicable methodology for analytic approximation of the expectation of a certain functional of diffusion processes. Mathematically, this methodology is justified by Watanabe’s theory ([27]) in Malliavin calculus. In practical applications, it is desirable to investigate the accuracy and stability of the method, especially with expansion up to high orders in situations where the underlying processes are highly volatile, as seen in recent financial markets. Although Takahashi [17], [18] and Takahashi and Takehara [20] provided explicit formulas for the expansion up to the third order, to the best of our knowledge a general computation scheme for an arbitrary-order expansion has not been given yet. This paper proposes two general methods for computing the conditional expectations that are powerful especially for high order expansions. The first one, as an extension of the method introduced by the preceding papers, presents a unified scheme for computation of the conditional expectations. The second one develops a new calculation algorithm for computing the coefficients of the expansion through solving a system of ordinary differential equations, which is equivalent to computing the conditional expectations. To demonstrate their effectiveness, the paper gives numerical examples of the approximation for the λ-SABR model up to the fifth order and a cross-currency Libor market model with a general stochastic volatility model of the spot foreign exchange rate up to the fourth order.

Keywords: Asymptotic expansion, Malliavin calculus, approximation formula, stochastic volatility, λ-SABR model, Libor market model, currency options.

∗This research was partially supported by the global COE program “The Research and Training Center for New Development in Mathematics.”
†Corresponding author.


1. Introduction

This paper presents two alternative schemes for computation in the so-called “asymptotic expansion approach” based on Watanabe’s theory (Watanabe [27]) in Malliavin calculus, by extending the preceding papers and also by developing a new calculation algorithm.

To the best of our knowledge, the asymptotic expansion was first applied to finance for the evaluation of an average option, which is a popular derivative in commodity markets. [6] and [17] derive approximation formulas for an average option by an asymptotic method based on log-normal approximations of an average price distribution when the underlying asset price follows a geometric Brownian motion. [29] applies a formula derived by the asymptotic expansion of certain statistical estimators for small diffusion processes. Thereafter, the asymptotic expansion has been applied to a broad class of problems in finance. See [18], [19], Kunitomo and Takahashi [7], [8], Matsuoka, Takahashi and Uchida [12], Takahashi and Yoshida [25], [26], Muroi [13], and Takahashi and Takehara [20], [21], [22]. For other asymptotic methods in finance which do not depend on Watanabe’s theory, see also Fouque, Papanicolaou and Sircar [3], [4], Henry-Labordere [10], [11], Kusuoka and Osajima [9], and Siopacha and Teichmann [16].

Recently, not only academic researchers but also many practitioners such as Antonov and Misirpashaev [1] or Andersen and Hutchings [2] have used the asymptotic expansion method based on Watanabe’s theory in their proposed techniques for a variety of financial issues, e.g. pricing or hedging complex derivatives under high-dimensional underlying stochastic environments. These methods are fully or partially based on the framework developed by [6], [17], [18] in the financial literature.

In theory, this method provides us with an expansion of the underlying stochastic processes that has a proper meaning in the limit of some ideal situations, such as cases where they become deterministic (for details see [27], [28] or [8]).

In practice, however, we are often interested in cases far from that situation, where the underlying processes are highly volatile, as seen in recent financial markets, especially after the crisis of 2008. Then, from the viewpoint of the accuracy or stability of the techniques in practical use, it is desirable to investigate the behavior of the estimators in such situations, especially with expansion up to high orders.

Existing applications of the asymptotic expansion based on Watanabe’s theory calculate certain conditional expectations, which appear in the expansions and play a key role in computation, by the formulas up to the third order given explicitly in [17], [18] and [20]. In many applications, these formulas give sufficiently accurate approximations, but in some cases, for example those with long maturities and/or highly volatile underlying variables, the approximation up to the third order may not provide satisfactory accuracy. Thus, formulas for higher order computation are desirable. But to our knowledge, asymptotic expansion formulas higher than the third order have not been given yet.

This paper provides general procedures for the explicit computation of conditional expectations in the asymptotic expansion. Moreover, we develop an alternative but equivalent calculation algorithm which computes the unconditional expectations directly instead of the conditional ones and enables us to derive high order approximation formulas in an automatic manner. While these techniques can be applied to a broad class of Ito processes, for simplicity and because of limited space, in this paper we concentrate on a much simpler setting as described in Section 2. For further explanations in a more general environment, see our online working paper, Takahashi, Takehara and Toda [23].

Finally, our approximation generally shows sufficient accuracy with computation of high order expansions, which is confirmed by numerical experiments in more complex cases than that in Section 2.

The organization of this paper is as follows. Section 2 develops our methods in the simple setting. Section 3 applies the algorithms described in Section 2 to concrete financial models and confirms the effectiveness of the higher order expansions by numerical examples in the λ-SABR model and a cross-currency Libor market model with a general stochastic volatility model of the spot foreign exchange rate. Due to limitations of space, detailed proofs and the concrete expressions of some formulas and equations are omitted in this paper; they are given in [23]. We will refer to it if necessary.

2. An Asymptotic Expansion Approach in a Black-Scholes Economy

In this section, our essential idea is explained in a simple Black-Scholes-type economy. For discussions in more general settings, refer to Sections 3 and 4 of [23].

2.1 An Asymptotic Expansion Approach in a Black-Scholes Economy

Let $(W, P)$ be a one-dimensional Wiener space. Hereafter $P$ is considered as a risk-neutral equivalent martingale measure, and the risk-free interest rate is set to zero for simplicity. Then, the underlying economy is specified with a single ($\mathbf{R}_+$-valued) risky asset $S^{(\varepsilon)} = \{S^{(\varepsilon)}_t\}$ satisfying

\[ S^{(\varepsilon)}_t = S_0 + \varepsilon \int_0^t \sigma(S^{(\varepsilon)}_s, s)\,dW_s \tag{1} \]

where $\varepsilon \in (0, 1]$ is a constant parameter and $\sigma: \mathbf{R}^2_+ \to \mathbf{R}$ satisfies some regularity conditions. We will consider the following pricing problem:

\[ V(0,T) = E\bigl[\Phi(S^{(\varepsilon)}_T)\bigr] \tag{2} \]

where $\Phi$ is a payoff function written on $S^{(\varepsilon)}_T$ (for example, $\Phi(x) = \max(x - K, 0)$ for call options, or $\Phi(x) = \delta_x(x)$, a delta function with mass at $x$, for the density function) and $E[\,\cdot\,]$ is the expectation operator under the probability measure $P$. Rigorously speaking, they are a generalized function on the Wiener functional $S^{(\varepsilon)}$ and a generalized expectation defined for generalized functions, respectively, whose mathematically proper definitions are given in Section 2 of [23].

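Let $A_{kt} = \left.\dfrac{\partial^k S^{(\varepsilon)}_t}{\partial \varepsilon^k}\right|_{\varepsilon=0}$. Here we represent $A_{1t}$, $A_{2t}$ and $A_{3t}$ explicitly by

\[ A_{1t} = \int_0^t \sigma(S^{(0)}_s, s)\,dW_s, \tag{3} \]

\[ A_{2t} = 2\int_0^t \partial\sigma(S^{(0)}_s, s)\,A_{1s}\,dW_s, \tag{4} \]

\[ A_{3t} = 3\int_0^t \Bigl( \partial^2\sigma(S^{(0)}_s, s)\,(A_{1s})^2 + \partial\sigma(S^{(0)}_s, s)\,(A_{2s}) \Bigr)\,dW_s \tag{5} \]

recursively, and then $S^{(\varepsilon)}_T$ has the asymptotic expansion

\[ S^{(\varepsilon)}_T = S_0 + \varepsilon A_{1T} + \frac{\varepsilon^2}{2!}A_{2T} + \frac{\varepsilon^3}{3!}A_{3T} + o(\varepsilon^3). \tag{6} \]

Note that $S^{(0)}_t = \lim_{\varepsilon\downarrow 0} S^{(\varepsilon)}_t = S_0$ for all $t$.

Next, normalize $S^{(\varepsilon)}_T$ with respect to $\varepsilon$ as

\[ G^{(\varepsilon)} = \frac{S^{(\varepsilon)}_T - S^{(0)}_T}{\varepsilon} \]

for $\varepsilon \in (0, 1]$. Then,

\[ G^{(\varepsilon)} = A_{1T} + \frac{\varepsilon}{2!}A_{2T} + \frac{\varepsilon^2}{3!}A_{3T} + o(\varepsilon^2) \tag{7} \]

in $L^p$ for every $p > 1$. Here the following assumption is made:

\[ \Sigma_T = \int_0^T \sigma^2(S^{(0)}_t, t)\,dt > 0. \tag{8} \]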

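Note that $A_{1T}$ follows a normal distribution with mean 0 and variance $\Sigma_T$, and hence this assumption means that the distribution of $A_{1T}$ does not degenerate. It is clear that this assumption is satisfied when $\sigma(S^{(0)}_t, t) > 0$ for some $t > 0$.

Then, the expectation of $\Phi(G^{(\varepsilon)})$ is expanded around $\varepsilon = 0$ up to $\varepsilon^2$-order in the sense of Watanabe ([27], Yoshida [28]) as follows (hereafter the asymptotic expansion of $E[\Phi(G^{(\varepsilon)})]$ up to the second order will be considered):

\[
\begin{aligned}
E[\Phi(G^{(\varepsilon)})]
&= E[\Phi(A_{1T})] + \varepsilon E\bigl[\Phi^{(1)}(A_{1T})A_{2T}\bigr]
 + \varepsilon^2\Bigl( E\bigl[\Phi^{(1)}(A_{1T})A_{3T}\bigr] + \tfrac{1}{2}E\bigl[\Phi^{(2)}(A_{1T})(A_{2T})^2\bigr] \Bigr) + o(\varepsilon^2) \\
&= E[\Phi(A_{1T})] + \varepsilon E\bigl[\Phi^{(1)}(A_{1T})E[A_{2T}|A_{1T}]\bigr]
 + \varepsilon^2\Bigl( E\bigl[\Phi^{(1)}(A_{1T})E[A_{3T}|A_{1T}]\bigr] + \tfrac{1}{2}E\bigl[\Phi^{(2)}(A_{1T})E[(A_{2T})^2|A_{1T}]\bigr] \Bigr) + o(\varepsilon^2) \\
&= \int_{\mathbf{R}} \Phi(x) f_{A_{1T}}(x)\,dx + \varepsilon\int_{\mathbf{R}} \Phi^{(1)}(x)E[A_{2T}|A_{1T}=x]\,f_{A_{1T}}(x)\,dx \\
&\quad + \varepsilon^2\Bigl( \int_{\mathbf{R}} \Phi^{(1)}(x)E[A_{3T}|A_{1T}=x]\,f_{A_{1T}}(x)\,dx
 + \tfrac{1}{2}\int_{\mathbf{R}} \Phi^{(2)}(x)E[(A_{2T})^2|A_{1T}=x]\,f_{A_{1T}}(x)\,dx \Bigr) + o(\varepsilon^2) \\
&= \int_{\mathbf{R}} \Phi(x) f_{A_{1T}}(x)\,dx + \varepsilon\int_{\mathbf{R}} \Phi(x)(-1)\frac{\partial}{\partial x}\bigl\{E[A_{2T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx \\
&\quad + \varepsilon^2\Bigl( \int_{\mathbf{R}} \Phi(x)(-1)\frac{\partial}{\partial x}\bigl\{E[A_{3T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx
 + \tfrac{1}{2}\int_{\mathbf{R}} \Phi(x)(-1)^2\frac{\partial^2}{\partial x^2}\bigl\{E[(A_{2T})^2|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx \Bigr) + o(\varepsilon^2),
\end{aligned}
\tag{9}
\]

where $\Phi^{(m)}(x)$ is the $m$-th order derivative of $\Phi(x)$ and $f_{A_{1T}}(x)$ is the probability density function of $A_{1T}$, which follows a normal distribution:

\[ f_{A_{1T}}(x) := \frac{1}{\sqrt{2\pi\Sigma_T}}\exp\Bigl(-\frac{x^2}{2\Sigma_T}\Bigr). \tag{10} \]

In particular, letting $\Phi = \delta_x$, we have the asymptotic expansion of the density function of $G^{(\varepsilon)}$, as seen later.

Then, all we have to do to evaluate this expansion is to compute these conditional expectations. To this end, we present two alternative approaches.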

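2.2 An Approach with an Expansion into Iterated Ito Integrals

In this subsection we show an approach with a further expansion of $A_{2T}$, $A_{3T}$ and $(A_{2T})^2$ into iterated Ito integrals to compute the conditional expectations in (9).

Recall that we have

\[
\begin{aligned}
E[\Phi(G^{(\varepsilon)})]
&= \int_{\mathbf{R}} \Phi(x) f_{A_{1T}}(x)\,dx + \varepsilon\int_{\mathbf{R}} \Phi(x)(-1)\frac{\partial}{\partial x}\bigl\{E[A_{2T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx \\
&\quad + \varepsilon^2\Bigl( \int_{\mathbf{R}} \Phi(x)(-1)\frac{\partial}{\partial x}\bigl\{E[A_{3T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx
 + \tfrac{1}{2}\int_{\mathbf{R}} \Phi(x)(-1)^2\frac{\partial^2}{\partial x^2}\bigl\{E[(A_{2T})^2|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}\,dx \Bigr) + o(\varepsilon^2).
\end{aligned}
\tag{11}
\]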

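Next, it is shown that $A_{2T}$, $A_{3T}$ and $(A_{2T})^2$ can be expressed as summations of iterated Ito integrals. First, note that $A_{2T}$ is

\[ A_{2T} = 2\int_0^T\!\!\int_0^{t_1} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_2}, t_2)\,dW_{t_2}\,dW_{t_1}. \tag{12} \]

Next, by application of Itô’s formula to (5) we obtain

\[
\begin{aligned}
A_{3T} &= 6\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_3}, t_3)\,dW_{t_3}\,dW_{t_2}\,dW_{t_1} \\
&\quad + 6\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial^2\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_3}, t_3)\,dW_{t_3}\,dW_{t_2}\,dW_{t_1} \\
&\quad + 3\int_0^T\!\!\int_0^{t_1} \partial^2\sigma(S^{(0)}_{t_1}, t_1)\,\sigma^2(S^{(0)}_{t_2}, t_2)\,dt_2\,dW_{t_1}.
\end{aligned}
\tag{13}
\]

Similarly,

\[
\begin{aligned}
(A_{2T})^2 &= 16\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2}\!\!\int_0^{t_3} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_3}, t_3)\,\sigma(S^{(0)}_{t_4}, t_4)\,dW_{t_4}\,dW_{t_3}\,dW_{t_2}\,dW_{t_1} \\
&\quad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2}\!\!\int_0^{t_3} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_2}, t_2)\,\partial\sigma(S^{(0)}_{t_3}, t_3)\,\sigma(S^{(0)}_{t_4}, t_4)\,dW_{t_4}\,dW_{t_3}\,dW_{t_2}\,dW_{t_1} \\
&\quad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,dt_3\,dW_{t_2}\,dW_{t_1} \\
&\quad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_3}, t_3)\,dW_{t_3}\,dt_2\,dW_{t_1} \\
&\quad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \bigl(\partial\sigma(S^{(0)}_{t_1}, t_1)\bigr)^2\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_3}, t_3)\,dW_{t_3}\,dW_{t_2}\,dt_1 \\
&\quad + 4\int_0^T\!\!\int_0^{t_1} \bigl(\partial\sigma(S^{(0)}_{t_1}, t_1)\bigr)^2\,\sigma^2(S^{(0)}_{t_2}, t_2)\,dt_2\,dt_1.
\end{aligned}
\tag{14}
\]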

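Then, by Proposition 1 in [23], the conditional expectations in (11) can be computed as

\[
E[A_{2T}|A_{1T}=x]
= \Bigl( 2\int_0^T\!\!\int_0^{t_1} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\sigma^2(S^{(0)}_{t_2}, t_2)\,dt_2\,dt_1 \Bigr)\frac{H_2(x;\Sigma_T)}{\Sigma_T^2}
=: c^{2,1}_2 H_2(x;\Sigma_T),
\tag{15}
\]

\[
\begin{aligned}
E[A_{3T}|A_{1T}=x]
&= \Bigl( 6\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,dt_3\,dt_2\,dt_1 \\
&\qquad + 6\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial^2\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\sigma^2(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,dt_3\,dt_2\,dt_1 \Bigr) \times \frac{H_3(x;\Sigma_T)}{\Sigma_T^3} \\
&\quad + \Bigl( 3\int_0^T\!\!\int_0^{t_1} \partial^2\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\sigma^2(S^{(0)}_{t_2}, t_2)\,dt_2\,dt_1 \Bigr)\frac{H_1(x;\Sigma_T)}{\Sigma_T} \\
&=: c^{3,1}_3 H_3(x;\Sigma_T) + c^{3,1}_1 H_1(x;\Sigma_T)
\end{aligned}
\tag{16}
\]

and

\[
\begin{aligned}
E[(A_{2T})^2|A_{1T}=x]
&= \Bigl( 16\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2}\!\!\int_0^{t_3} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,\sigma^2(S^{(0)}_{t_4}, t_4)\,dt_4\,dt_3\,dt_2\,dt_1 \\
&\qquad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2}\!\!\int_0^{t_3} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\sigma^2(S^{(0)}_{t_2}, t_2)\,\partial\sigma(S^{(0)}_{t_3}, t_3)\,\sigma(S^{(0)}_{t_3}, t_3)\,\sigma^2(S^{(0)}_{t_4}, t_4)\,dt_4\,dt_3\,dt_2\,dt_1 \Bigr) \times \frac{H_4(x;\Sigma_T)}{\Sigma_T^4} \\
&\quad + \Bigl( 16\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \partial\sigma(S^{(0)}_{t_1}, t_1)\,\sigma(S^{(0)}_{t_1}, t_1)\,\partial\sigma(S^{(0)}_{t_2}, t_2)\,\sigma(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,dt_3\,dt_2\,dt_1 \\
&\qquad + 8\int_0^T\!\!\int_0^{t_1}\!\!\int_0^{t_2} \bigl(\partial\sigma(S^{(0)}_{t_1}, t_1)\bigr)^2\,\sigma^2(S^{(0)}_{t_2}, t_2)\,\sigma^2(S^{(0)}_{t_3}, t_3)\,dt_3\,dt_2\,dt_1 \Bigr)\frac{H_2(x;\Sigma_T)}{\Sigma_T^2} \\
&\quad + \Bigl( 4\int_0^T\!\!\int_0^{t_1} \bigl(\partial\sigma(S^{(0)}_{t_1}, t_1)\bigr)^2\,\sigma^2(S^{(0)}_{t_2}, t_2)\,dt_2\,dt_1 \Bigr) H_0(x;\Sigma_T) \\
&=: c^{2,2}_4 H_4(x;\Sigma_T) + c^{2,2}_2 H_2(x;\Sigma_T) + c^{2,2}_0 H_0(x;\Sigma_T)
\end{aligned}
\tag{17}
\]

where $H_n(x;\Sigma)$ is an $n$-th order Hermite polynomial defined by

\[ H_n(x;\Sigma) := (-\Sigma)^n e^{x^2/2\Sigma}\,\frac{d^n}{dx^n}e^{-x^2/2\Sigma}. \]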
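As a quick check on this definition, the following minimal Python sketch evaluates $H_n(x;\Sigma)$. It relies on the three-term recurrence $H_{n+1}(x;\Sigma) = xH_n(x;\Sigma) - n\Sigma H_{n-1}(x;\Sigma)$, which follows from the definition but is not stated in the text; the function name `hermite` and the argument name `var` (standing for $\Sigma$) are our own choices for illustration.

```python
def hermite(n: int, x: float, var: float) -> float:
    """Evaluate H_n(x; var) = (-var)^n e^{x^2/(2 var)} d^n/dx^n e^{-x^2/(2 var)}.

    Uses H_0 = 1, H_1 = x and the recurrence
    H_{k+1}(x; var) = x * H_k(x; var) - k * var * H_{k-1}(x; var).
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    h_prev, h_curr = 1.0, x  # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        # Advance the pair (H_{k-1}, H_k) to (H_k, H_{k+1}).
        h_prev, h_curr = h_curr, x * h_curr - k * var * h_prev
    return h_curr


if __name__ == "__main__":
    x, var = 1.5, 0.4
    # Compare with the explicit low-order polynomials.
    print(hermite(2, x, var), x**2 - var)
    print(hermite(3, x, var), x**3 - 3 * var * x)
```

The low-order cases $H_1 = x$, $H_2 = x^2 - \Sigma$ and $H_3 = x^3 - 3\Sigma x$ are the ones that enter (15) and (16).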

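Substituting these into (11), we have the asymptotic expansion of $E[\Phi(G^{(\varepsilon)})]$ up to $\varepsilon^2$-order. Further, letting $\Phi = \delta_x$, we have the expansion of $f_{G^{(\varepsilon)}}$, the density function of $G^{(\varepsilon)}$:

\[
\begin{aligned}
f_{G^{(\varepsilon)}}(x)
&= f_{A_{1T}}(x) + \varepsilon(-1)\frac{\partial}{\partial x}\bigl\{E[A_{2T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\} \\
&\quad + \varepsilon^2\Bigl( (-1)\frac{\partial}{\partial x}\bigl\{E[A_{3T}|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\}
 + \tfrac{1}{2}(-1)^2\frac{\partial^2}{\partial x^2}\bigl\{E[(A_{2T})^2|A_{1T}=x]\,f_{A_{1T}}(x)\bigr\} \Bigr) + o(\varepsilon^2) \\
&= f_{A_{1T}}(x) + \varepsilon(-1)\frac{\partial}{\partial x}\bigl\{c^{2,1}_2 H_2(x;\Sigma_T)\,f_{A_{1T}}(x)\bigr\} \\
&\quad + \varepsilon^2\Bigl( (-1)\frac{\partial}{\partial x}\Bigl\{\sum_{i=1,3} c^{3,1}_i H_i(x;\Sigma_T)\,f_{A_{1T}}(x)\Bigr\}
 + \tfrac{1}{2}(-1)^2\frac{\partial^2}{\partial x^2}\Bigl\{\sum_{i=0,2,4} c^{2,2}_i H_i(x;\Sigma_T)\,f_{A_{1T}}(x)\Bigr\} \Bigr) + o(\varepsilon^2).
\end{aligned}
\tag{18}
\]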

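2.3 An Alternative Approach with a System of Ordinary Differential Equations

In this subsection, we present an alternative approach in which the conditional expectations are computed through a system of ordinary differential equations. Again, the asymptotic expansion of $E[\Phi(G^{(\varepsilon)})]$ up to $\varepsilon^2$-order is considered in this subsection.

Note that the expectations of $A_{2T}$, $A_{3T}$ and $(A_{2T})^2$ conditional on $A_{1T}$ are expressed by linear combinations of a finite number of Hermite polynomials as in (15), (16) and (17). Thus, by Lemma 4 in [23], we have

\[ E[A_{2T}|A_{1T}=x] = \sum_{n=0}^{2} a^{2,1}_n H_n(x;\Sigma_T), \tag{19} \]

\[ E[A_{3T}|A_{1T}=x] = \sum_{n=0}^{3} a^{3,1}_n H_n(x;\Sigma_T), \tag{20} \]

\[ \text{and}\quad E[(A_{2T})^2|A_{1T}=x] = \sum_{n=0}^{4} a^{2,2}_n H_n(x;\Sigma_T), \tag{21} \]

where the coefficients are given by

\[
\begin{aligned}
a^{2,1}_n &= \frac{1}{n!}\frac{1}{(i\Sigma_T)^n}\left.\frac{\partial^n}{\partial\xi^n}\right|_{\xi=0} E\bigl[Z^{\langle\xi\rangle}_T A_{2T}\bigr], \\
a^{3,1}_n &= \frac{1}{n!}\frac{1}{(i\Sigma_T)^n}\left.\frac{\partial^n}{\partial\xi^n}\right|_{\xi=0} E\bigl[Z^{\langle\xi\rangle}_T A_{3T}\bigr], \\
a^{2,2}_n &= \frac{1}{n!}\frac{1}{(i\Sigma_T)^n}\left.\frac{\partial^n}{\partial\xi^n}\right|_{\xi=0} E\bigl[Z^{\langle\xi\rangle}_T (A_{2T})^2\bigr],
\end{aligned}
\]

\[ \text{and}\quad Z^{\langle\xi\rangle}_t := \exp\Bigl( i\xi A_{1t} + \frac{\xi^2}{2}\Sigma_t \Bigr). \]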


Note that $Z^{\langle\xi\rangle}$ is a martingale with $Z^{\langle\xi\rangle}_0 = 1$. Since these conditional expectations can be represented by linear combinations of Hermite polynomials as seen in the previous subsection, the following should hold, which can be confirmed easily with the results of this subsection:

\[
\begin{aligned}
& a^{2,1}_2 = c^{2,1}_2; \quad a^{2,1}_1 = a^{2,1}_0 = 0; \\
& a^{3,1}_3 = c^{3,1}_3; \quad a^{3,1}_1 = c^{3,1}_1; \quad a^{3,1}_2 = a^{3,1}_0 = 0; \\
& a^{2,2}_4 = c^{2,2}_4; \quad a^{2,2}_2 = c^{2,2}_2; \quad a^{2,2}_0 = c^{2,2}_0; \quad a^{2,2}_3 = a^{2,2}_1 = 0.
\end{aligned}
\tag{22}
\]

Then, computation of these conditional expectations is equivalent to that of the unconditional expectations $E[Z^{\langle\xi\rangle}_T A_{2T}]$, $E[Z^{\langle\xi\rangle}_T A_{3T}]$ and $E[Z^{\langle\xi\rangle}_T (A_{2T})^2]$.

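First, applying Itô’s formula to $(Z^{\langle\xi\rangle}_t A_{2t})$ we have

\[
\begin{aligned}
E\bigl[Z^{\langle\xi\rangle}_t A_{2t}\bigr]
&= E\Bigl[\int_0^t Z^{\langle\xi\rangle}_s\,dA_{2s} + \int_0^t A_{2s}\,dZ^{\langle\xi\rangle}_s + \bigl\langle A_2, Z^{\langle\xi\rangle}\bigr\rangle_t\Bigr] \\
&= 2(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds.
\end{aligned}
\tag{23}
\]

Then, applying Itô’s formula to $(Z^{\langle\xi\rangle}_t A_{1t})$ again, we also have

\[
\begin{aligned}
E\bigl[Z^{\langle\xi\rangle}_t A_{1t}\bigr]
&= E\Bigl[\int_0^t Z^{\langle\xi\rangle}_s\,dA_{1s} + \int_0^t A_{1s}\,dZ^{\langle\xi\rangle}_s + \bigl\langle A_1, Z^{\langle\xi\rangle}\bigr\rangle_t\Bigr] \\
&= (i\xi)\int_0^t \sigma^2(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s\bigr]\,ds \\
&= (i\xi)\int_0^t \sigma^2(S^{(0)}_s, s)\,ds
\end{aligned}
\tag{24}
\]

since $E[Z^{\langle\xi\rangle}_t] = 1$ for all $t$.

Similarly, the following are obtained:

\[
E\bigl[Z^{\langle\xi\rangle}_t A_{3t}\bigr]
= 3(i\xi)\Bigl( \int_0^t \partial^2\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds
+ \int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{2s}\bigr]\,ds \Bigr),
\tag{25}
\]

\[
E\bigl[Z^{\langle\xi\rangle}_t (A_{1t})^2\bigr]
= \int_0^t \sigma^2(S^{(0)}_s, s)\,ds + 2(i\xi)\int_0^t \sigma^2(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds,
\tag{26}
\]

\[
E\bigl[Z^{\langle\xi\rangle}_t (A_{2t})^2\bigr]
= 4\int_0^t \bigl(\partial\sigma(S^{(0)}_s, s)\bigr)^2 E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds
+ 4(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{2s}A_{1s}\bigr]\,ds,
\tag{27}
\]

\[
\begin{aligned}
E\bigl[Z^{\langle\xi\rangle}_t A_{2t}A_{1t}\bigr]
&= 2\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds
+ (i\xi)\int_0^t \bigl(\sigma(S^{(0)}_s, s)\bigr)^2 E\bigl[Z^{\langle\xi\rangle}_s A_{2s}\bigr]\,ds \\
&\quad + 2(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds.
\end{aligned}
\tag{28}
\]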

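Then, $E[Z^{\langle\xi\rangle}_T A_{2T}]$, $E[Z^{\langle\xi\rangle}_T A_{3T}]$ and $E[Z^{\langle\xi\rangle}_T (A_{2T})^2]$ can be obtained as solutions of the system of ordinary differential equations (23), (24), (25), (26), (27) and (28). In fact, they have a grading structure in which the higher-order equations depend only on the lower ones:

\[
\begin{aligned}
E\bigl[Z^{\langle\xi\rangle}_t A_{1t}\bigr] &= (i\xi)\int_0^t \sigma^2(S^{(0)}_s, s)\,ds, \\
E\bigl[Z^{\langle\xi\rangle}_t A_{2t}\bigr] &= 2(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds, \\
E\bigl[Z^{\langle\xi\rangle}_t (A_{1t})^2\bigr] &= \int_0^t \sigma^2(S^{(0)}_s, s)\,ds + 2(i\xi)\int_0^t \sigma^2(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds, \\
E\bigl[Z^{\langle\xi\rangle}_t A_{3t}\bigr] &= 3(i\xi)\Bigl( \int_0^t \partial^2\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds
 + \int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{2s}\bigr]\,ds \Bigr), \\
E\bigl[Z^{\langle\xi\rangle}_t A_{2t}A_{1t}\bigr] &= 2\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{1s}\bigr]\,ds
 + (i\xi)\int_0^t \bigl(\sigma(S^{(0)}_s, s)\bigr)^2 E\bigl[Z^{\langle\xi\rangle}_s A_{2s}\bigr]\,ds \\
&\quad + 2(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds, \\
E\bigl[Z^{\langle\xi\rangle}_t (A_{2t})^2\bigr] &= 4\int_0^t \bigl(\partial\sigma(S^{(0)}_s, s)\bigr)^2 E\bigl[Z^{\langle\xi\rangle}_s (A_{1s})^2\bigr]\,ds
 + 4(i\xi)\int_0^t \partial\sigma(S^{(0)}_s, s)\,\sigma(S^{(0)}_s, s)\,E\bigl[Z^{\langle\xi\rangle}_s A_{2s}A_{1s}\bigr]\,ds.
\end{aligned}
\]

Hence they can be easily solved by substituting each solution into the next ordinary differential equation recursively. Moreover, since these solutions are clearly polynomials in $(i\xi)$, we can easily implement the differentiations with respect to $\xi$ in (19), (20) and (21). It is obvious that the resulting coefficients given by these solutions are equivalent to the results in the previous subsection.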
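The recursive structure can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes that $\sigma$, $\partial\sigma$ and $\partial^2\sigma$ evaluated at $S_0$ are constant in time (passed as the hypothetical arguments `sigma`, `d1`, `d2`), writes the six equations in differential form, and integrates them jointly with a classical fourth-order Runge-Kutta step for a fixed $\xi$. The function names `rhs` and `solve_moments` are our own.

```python
def rhs(y, sigma, d1, d2, u):
    """Time derivatives of the six expectations, with u = i*xi.

    y = (E[Z A1], E[Z A2], E[Z A1^2], E[Z A3], E[Z A2 A1], E[Z A2^2]).
    Assumption: sigma, d1 = d(sigma), d2 = d^2(sigma) are constants.
    """
    m_a1, m_a2, m_a1sq, _m_a3, m_a2a1, _m_a2sq = y
    return [
        u * sigma**2,                                        # (24)
        2 * u * d1 * sigma * m_a1,                           # (23)
        sigma**2 + 2 * u * sigma**2 * m_a1,                  # (26)
        3 * u * sigma * (d2 * m_a1sq + d1 * m_a2),           # (25)
        2 * d1 * sigma * m_a1 + u * sigma**2 * m_a2
        + 2 * u * d1 * sigma * m_a1sq,                       # (28)
        4 * d1**2 * m_a1sq + 4 * u * d1 * sigma * m_a2a1,    # (27)
    ]


def solve_moments(sigma, d1, d2, xi, T, steps=200):
    """Integrate the graded system from 0 to T; all expectations start at 0."""
    u = 1j * xi
    y = [0j] * 6
    h = T / steps
    for _ in range(steps):
        k1 = rhs(y, sigma, d1, d2, u)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)], sigma, d1, d2, u)
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)], sigma, d1, d2, u)
        k4 = rhs([a + h * b for a, b in zip(y, k3)], sigma, d1, d2, u)
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y


if __name__ == "__main__":
    sigma, d1, d2, xi, T = 0.2, 0.5, 0.1, 1.3, 2.0
    moments = solve_moments(sigma, d1, d2, xi, T)
    # Closed form in the constant-coefficient case: (i xi)^2 d1 sigma^3 T^2.
    print(moments[1], (1j * xi) ** 2 * d1 * sigma**3 * T**2)
```

In this constant-coefficient case the second equation integrates to $(i\xi)^2\,\partial\sigma\,\sigma^3 T^2$, so the coefficient formula gives $a^{2,1}_2 = \partial\sigma\,\sigma^3T^2/\Sigma_T^2 = \partial\sigma/\sigma$ with $\Sigma_T = \sigma^2 T$, which agrees with $c^{2,1}_2$ computed from (15).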

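Moreover, we also remark on the relationship between our method and an approach presented by [18], in which the density function of $G^{(\varepsilon)}$ is derived by Fourier inversion of its formally expanded characteristic function. Precisely speaking, [18] formally expanded $\Psi_{G^{(\varepsilon)}}(\xi) = E[e^{i\xi G^{(\varepsilon)}}]$ as

\[
\begin{aligned}
\Psi_{G^{(\varepsilon)}}(\xi) = E\bigl[e^{i\xi G^{(\varepsilon)}}\bigr]
&= e^{-\frac{\xi^2}{2}\Sigma_T} \times \Bigl\{ 1 + \varepsilon(i\xi)E\bigl[Z^{\langle\xi\rangle}_T A_{2T}\bigr]
 + \varepsilon^2\Bigl( (i\xi)E\bigl[Z^{\langle\xi\rangle}_T A_{3T}\bigr] + \frac{(i\xi)^2}{2}E\bigl[Z^{\langle\xi\rangle}_T (A_{2T})^2\bigr] \Bigr) \Bigr\} + o(\varepsilon^2) \\
&= e^{-\frac{\xi^2}{2}\Sigma_T} \times \Bigl\{ 1 + \varepsilon(i\xi)E\bigl[Z^{\langle\xi\rangle}_T E[A_{2T}|A_{1T}]\bigr]
 + \varepsilon^2\Bigl( (i\xi)E\bigl[Z^{\langle\xi\rangle}_T E[A_{3T}|A_{1T}]\bigr] + \frac{(i\xi)^2}{2}E\bigl[Z^{\langle\xi\rangle}_T E[(A_{2T})^2|A_{1T}]\bigr] \Bigr) \Bigr\} + o(\varepsilon^2)
\end{aligned}
\tag{29}
\]

and computed the conditional expectations in this expansion. Then, $f_{G^{(\varepsilon)}}(x)$, the density function of $G^{(\varepsilon)}$, was derived by Fourier inversion of $\Psi_{G^{(\varepsilon)}}(\xi)$:

\[ f_{G^{(\varepsilon)}}(x) = \mathcal{F}^{-1}\bigl(\Psi_{G^{(\varepsilon)}}\bigr) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ix\xi}\,\Psi_{G^{(\varepsilon)}}(\xi)\,d\xi. \tag{30} \]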

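This approach is completely equivalent to our method based on Watanabe’s theory, as also mentioned in [18]. In fact, from (18) and (22) we obtain

\[
\begin{aligned}
f_{G^{(\varepsilon)}}(x)
&= f_{A_{1T}}(x) + \varepsilon(-1)\frac{\partial}{\partial x}\bigl\{c^{2,1}_2 H_2(x;\Sigma_T)\,f_{A_{1T}}(x)\bigr\} \\
&\quad + \varepsilon^2\Bigl( (-1)\frac{\partial}{\partial x}\Bigl\{\sum_{n=1,3} c^{3,1}_n H_n(x;\Sigma_T)\,f_{A_{1T}}(x)\Bigr\}
 + \tfrac{1}{2}(-1)^2\frac{\partial^2}{\partial x^2}\Bigl\{\sum_{n=0,2,4} c^{2,2}_n H_n(x;\Sigma_T)\,f_{A_{1T}}(x)\Bigr\} \Bigr) + o(\varepsilon^2)
\end{aligned}
\tag{31}
\]

\[
\begin{aligned}
&= \mathcal{F}^{-1}\Bigl(e^{-\frac{\xi^2}{2}\Sigma_T}\Bigr) + \varepsilon\,c^{2,1}_2\,\mathcal{F}^{-1}\Bigl((i\xi)(i\xi\Sigma_T)^2 e^{-\frac{\xi^2}{2}\Sigma_T}\Bigr) \\
&\quad + \varepsilon^2\Bigl( \sum_{n=1,3} c^{3,1}_n\,\mathcal{F}^{-1}\Bigl((i\xi)(i\xi\Sigma_T)^n e^{-\frac{\xi^2}{2}\Sigma_T}\Bigr)
 + \tfrac{1}{2}\sum_{n=0,2,4} c^{2,2}_n\,\mathcal{F}^{-1}\Bigl((i\xi)^2(i\xi\Sigma_T)^n e^{-\frac{\xi^2}{2}\Sigma_T}\Bigr) \Bigr) + o(\varepsilon^2) \\
&= \mathcal{F}^{-1}\Bigl( e^{-\frac{\xi^2}{2}\Sigma_T} \times \Bigl\{ 1 + \varepsilon(i\xi)\sum_{n=0}^{2} a^{2,1}_n (i\Sigma_T)^n\xi^n
 + \varepsilon^2\Bigl( (i\xi)\sum_{n=0}^{3} a^{3,1}_n (i\Sigma_T)^n\xi^n + \frac{(i\xi)^2}{2}\sum_{n=0}^{4} a^{2,2}_n (i\Sigma_T)^n\xi^n \Bigr) \Bigr\} \Bigr) + o(\varepsilon^2) \\
&= \mathcal{F}^{-1}\Bigl( e^{-\frac{\xi^2}{2}\Sigma_T} \times \Bigl\{ 1 + \varepsilon(i\xi)E\bigl[Z^{\langle\xi\rangle}_T A_{2T}\bigr]
 + \varepsilon^2\Bigl( (i\xi)E\bigl[Z^{\langle\xi\rangle}_T A_{3T}\bigr] + \frac{(i\xi)^2}{2}E\bigl[Z^{\langle\xi\rangle}_T (A_{2T})^2\bigr] \Bigr) \Bigr\} \Bigr) + o(\varepsilon^2).
\end{aligned}
\tag{32}
\]

Then it is obvious that the inversion of the characteristic function expanded up to $\varepsilon^2$-order (29) coincides with the density function obtained by our approach. Moreover, it can be shown that this equivalence holds at any order.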

Here, at the end of this section, we state a brief summary. In the Black-Scholes-type economy, we consider the risky asset $S^{(\varepsilon)}$ and evaluate some quantities, expressed as an expectation of a function of the future price, such as prices or risk sensitivities of securities on this asset. First we expand them around the limit $\varepsilon = 0$ so that we obtain the expansion (9), which contains some conditional expectations. Then, by the approaches described in Section 2.2 or 2.3, we compute these conditional expectations. Finally, substituting the computation results into (9), we obtain the asymptotic expansion of those quantities. Alternatively, one can use the formulas for these conditional expectations listed in [23].

3. Numerical Examples

In this section we apply the proposed techniques to models more complex than the Black-Scholes-type case in the previous section, to demonstrate their effectiveness. Detailed discussions in a general setting, including the following examples, are found in Sections 3 and 4 of [23].

3.1 λ-SABR Model

We first consider the European plain-vanilla call and put prices under the following λ-SABR model [10] (interest rate = 0%):

\[
\begin{aligned}
dS^{(\varepsilon)}(t) &= \varepsilon\,\sigma^{(\varepsilon)}(t)\,(S^{(\varepsilon)}(t))^{\beta}\,dW^1_t, \\
d\sigma^{(\varepsilon)}(t) &= \lambda(\theta - \sigma^{(\varepsilon)}(t))\,dt + \varepsilon\nu_1\sigma^{(\varepsilon)}(t)\,dW^1_t + \varepsilon\nu_2\sigma^{(\varepsilon)}(t)\,dW^2_t,
\end{aligned}
\]

where $\nu_1 = \rho\nu$ and $\nu_2 = (\sqrt{1-\rho^2})\nu$ (the correlation between $S$ and $\sigma$ is $\rho \in [-1, 1]$).

Approximated prices by the asymptotic expansion method are calculated up to the fifth order. Note that all the solutions to the differential equations are obtained analytically. Benchmark values are computed by Monte Carlo simulations. $\varepsilon$ is set to one and the other parameters used in the test are given in Table 1:

Table 1. Parameter specifications of the λ-SABR model for our numerical experiments.

Parameter   S(0)   λ     σ(0)   β     ρ      θ     ν     T
i           100    0.1   3.0    0.5   −0.7   3.0   0.3   10
ii          100    0.1   0.3    1.0   −0.7   0.3   0.3   10
iii         100    0.1   0.3    1.0   −0.7   0.3   0.3   30

For the case of β = 1 (i.e. cases ii and iii), we calculate approximated prices by the “log-normal asymptotic expansion method” described in Section 4.3 in [23] up to the fourth order. In the Monte Carlo simulations for benchmark values, we use the Euler-Maruyama scheme as a discretization scheme with 1024 time steps for case i, and for cases ii and iii the second order discretization scheme given by Ninomiya-Victoir [14] with 128 and 256 time steps, respectively. Each simulation contains 10^8 paths. The results are in Table 2.
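For orientation, a Monte Carlo benchmark of this kind can be sketched as follows. This is a minimal illustration only, using the Euler-Maruyama scheme for the λ-SABR dynamics above with a put payoff; it is not the code used for the reported results. The function name `lambda_sabr_put_mc`, the flooring of both processes at zero, and the small default path count are our own assumptions (the text does not state the boundary treatment, and the reported benchmarks use far more paths).

```python
import math
import random


def lambda_sabr_put_mc(s0, sigma0, lam, theta, beta, rho, nu, T, strike,
                       eps=1.0, n_steps=256, n_paths=2000, seed=0):
    """Euler-Maruyama estimate of E[(K - S_T)^+] under zero interest rate."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    nu1 = rho * nu
    nu2 = math.sqrt(1.0 - rho * rho) * nu
    total = 0.0
    for _ in range(n_paths):
        s, vol = s0, sigma0
        for _ in range(n_steps):
            dw1 = sqrt_dt * rng.gauss(0.0, 1.0)
            dw2 = sqrt_dt * rng.gauss(0.0, 1.0)
            s_new = s + eps * vol * s**beta * dw1
            vol_new = (vol + lam * (theta - vol) * dt
                       + eps * vol * (nu1 * dw1 + nu2 * dw2))
            # Assumption: floor both processes at zero so that s**beta
            # stays well defined; this is not specified in the text.
            s = max(s_new, 0.0)
            vol = max(vol_new, 0.0)
        total += max(strike - s, 0.0)
    return total / n_paths


if __name__ == "__main__":
    # Case i of Table 1 with a deliberately small number of paths.
    price = lambda_sabr_put_mc(100.0, 3.0, 0.1, 3.0, 0.5, -0.7, 0.3,
                               10.0, 100.0, n_steps=128, n_paths=1000)
    print(price)
```

With so few paths the estimate is noisy and is not expected to reproduce the benchmark column of Table 2; the sketch only shows the structure of the simulation.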

The results show that, in each case, the higher order asymptotic expansion or log-normal asymptotic expansion almost always improves on the accuracy of the lower order expansions. The improvement is especially significant in the long-term cases, in which the lower order asymptotic expansions cannot approximate the price well.

Table 2. Comparisons of the absolute and relative differences between the estimators by our asymptotic expansion at different orders and Monte Carlo simulations. “Absolute Differences” and “Relative Differences” are given by (the approximate value by our asymptotic expansion) − (the estimator by Monte Carlo simulations) and (Absolute Differences) / (the estimator by Monte Carlo simulations), respectively.

                              A.E. (Difference)                         A.E. (Relative Difference, %)
Case  Strike  (C/P)  MC       1st      2nd      3rd      4th      5th      1st     2nd     3rd     4th     5th
i     50      Put    13.109   4.876    5.000    2.313    1.067    0.260    37.20   38.14   17.64   8.14    1.98
      60      Put    16.618   4.544    4.648    1.931    0.938    0.195    27.34   27.97   11.62   5.65    1.17
      70      Put    20.482   4.241    4.322    1.585    0.844    0.149    20.71   21.10   7.74    4.12    0.73
      80      Put    24.720   3.965    4.020    1.269    0.778    0.117    16.04   16.26   5.14    3.15    0.47
      90      Put    29.347   3.710    3.738    0.980    0.735    0.094    12.64   12.74   3.34    2.51    0.32
      100     Call   34.375   3.472    3.472    0.712    0.712    0.077    10.10   10.10   2.07    2.07    0.22
      110     Call   29.811   3.246    3.217    0.459    0.704    0.063    10.89   10.79   1.54    2.36    0.21
      120     Call   25.659   3.026    2.971    0.220    0.711    0.050    11.79   11.58   0.86    2.77    0.19
      130     Call   21.914   2.809    2.728    −0.010   0.731    0.035    12.82   12.45   −0.04   3.33    0.16
      140     Call   18.571   2.591    2.487    −0.230   0.762    0.018    13.95   13.39   −1.24   4.10    0.10
      150     Call   15.615   2.370    2.246    −0.441   0.804    −0.002   15.18   14.38   −2.83   5.15    −0.02

                              Log Normal A.E. (Difference)              Log Normal A.E. (Relative Difference, %)
Case  Strike  (C/P)  MC       Log-Norm 1st      2nd      3rd      4th      Log-Norm 1st     2nd     3rd     4th
ii    50      Put    9.429    −0.896   0.250    0.470    −0.223   0.021    −9.51   2.65    4.99    −2.36   0.22
      60      Put    13.095   −0.187   0.168    0.449    −0.215   0.028    −1.43   1.29    3.43    −1.64   0.21
      70      Put    17.307   0.678    0.045    0.431    −0.203   0.034    3.92    0.26    2.49    −1.17   0.19
      80      Put    22.041   1.620    −0.099   0.414    −0.190   0.039    7.35    −0.45   1.88    −0.86   0.18
      90      Put    27.272   2.577    −0.253   0.397    −0.177   0.045    9.45    −0.93   1.45    −0.65   0.17
      100     Call   32.971   3.503    −0.416   0.379    −0.163   0.051    10.62   −1.26   1.15    −0.49   0.15
      110     Call   29.110   4.367    −0.589   0.360    −0.149   0.057    15.00   −2.02   1.24    −0.51   0.20
      120     Call   25.655   5.149    −0.773   0.338    −0.135   0.063    20.07   −3.01   1.32    −0.53   0.25
      130     Call   22.576   5.837    −0.972   0.315    −0.120   0.069    25.85   −4.30   1.39    −0.53   0.31
      140     Call   19.842   6.424    −1.186   0.289    −0.104   0.076    32.38   −5.98   1.46    −0.53   0.38
      150     Call   17.420   6.912    −1.416   0.261    −0.088   0.083    39.68   −8.13   1.50    −0.50   0.47
iii   50      Put    19.801   2.280    −0.889   1.143    −0.592   0.182    11.51   −4.49   5.77    −2.99   0.92
      60      Put    25.471   3.371    −1.248   1.254    −0.581   0.154    13.23   −4.90   4.93    −2.28   0.60
      70      Put    31.500   4.459    −1.594   1.351    −0.560   0.120    14.15   −5.06   4.29    −1.78   0.38
      80      Put    37.847   5.520    −1.927   1.437    −0.535   0.081    14.59   −5.09   3.80    −1.41   0.21
      90      Put    44.476   6.541    −2.246   1.515    −0.505   0.039    14.71   −5.05   3.41    −1.14   0.09
      100     Call   51.357   7.512    −2.555   1.586    −0.474   −0.005   14.63   −4.98   3.09    −0.92   −0.01
      110     Call   48.465   8.430    −2.856   1.652    −0.442   −0.051   17.39   −5.89   3.41    −0.91   −0.10
      120     Call   45.780   9.291    −3.150   1.715    −0.409   −0.098   20.30   −6.88   3.75    −0.89   −0.21
      130     Call   43.281   10.097   −3.439   1.774    −0.376   −0.147   23.33   −7.94   4.10    −0.87   −0.34
      140     Call   40.954   10.848   −3.724   1.831    −0.342   −0.197   26.49   −9.09   4.47    −0.84   −0.48
      150     Call   38.782   11.545   −4.007   1.886    −0.309   −0.248   29.77   −10.33  4.86    −0.80   −0.64

3.2 Currency Option under a Libor Market Model of Interest Rates and a Stochastic Volatility of a Spot Exchange Rate

In this subsection, we apply our methods to pricing options on currencies under Libor Market Models (LMMs) of interest rates and a stochastic volatility of the spot foreign exchange rate (forex). Due to limitations of space, only the structure of the stochastic differential equations of our model is described here. For details of the underlying model, see Takahashi and Takehara [20].

3.2.1 Cross-Currency Libor Market Models

Let $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\}_{0\le t\le T^*<\infty})$ be a complete probability space with a filtration satisfying the usual conditions. We consider the following pricing problem for the call option with maturity $T \in (0, T^*]$ and strike rate $K > 0$:

\[ V^C(0;T,K) = P_d(0,T) \times E^P\bigl[(S(T) - K)^+\bigr] = P_d(0,T) \times E^P\bigl[(F_T(T) - K)^+\bigr] \tag{33} \]

where $V^C(0;T,K)$ denotes the value of a European call option at time 0 with maturity $T$ and strike rate $K$, $S(t)$ denotes the spot exchange rate at time $t \ge 0$ and $F_T(t)$ denotes the time $t$ value of the forex forward rate with maturity $T$. Similarly, for the put option we consider

\[ V^P(0;T,K) = P_d(0,T) \times E^P\bigl[(K - S(T))^+\bigr] = P_d(0,T) \times E^P\bigl[(K - F_T(T))^+\bigr]. \tag{34} \]

It is well known that the arbitrage-free relation between the forex spot rate and the forex forward rate is given by $F_T(t) = S(t)\dfrac{P_f(t,T)}{P_d(t,T)}$, where $P_d(t,T)$ and $P_f(t,T)$ denote the time $t$ values of domestic and foreign zero coupon bonds with maturity $T$, respectively. $E^P[\cdot]$ denotes the expectation operator under the EMM (Equivalent Martingale Measure) $P$ whose associated numeraire is the domestic zero coupon bond maturing at $T$.

For these pricing problems, a market model and a stochastic volatility model are applied to modeling the interest rates’ and the spot exchange rate’s dynamics, respectively.

We first define domestic and foreign forward interest rates as

\[ f_{dj}(t) = \Bigl( \frac{P_d(t,T_j)}{P_d(t,T_{j+1})} - 1 \Bigr)\frac{1}{\tau_j} \quad\text{and}\quad f_{fj}(t) = \Bigl( \frac{P_f(t,T_j)}{P_f(t,T_{j+1})} - 1 \Bigr)\frac{1}{\tau_j}, \]

respectively, where $j = n(t), n(t)+1, \ldots, N$, $\tau_j = T_{j+1} - T_j$, and $P_d(t,T_j)$ and $P_f(t,T_j)$ denote the prices of domestic/foreign zero coupon bonds with maturity $T_j$ at time $t\,(\le T_j)$, respectively; $n(t) = \min\{i : t \le T_i\}$. We also define spot interest rates to the nearest fixing date, denoted by $f_{d,n(t)-1}(t)$ and $f_{f,n(t)-1}(t)$, as

\[ f_{d,n(t)-1}(t) = \Bigl( \frac{1}{P_d(t,T_{n(t)})} - 1 \Bigr)\frac{1}{(T_{n(t)} - t)} \quad\text{and}\quad f_{f,n(t)-1}(t) = \Bigl( \frac{1}{P_f(t,T_{n(t)})} - 1 \Bigr)\frac{1}{(T_{n(t)} - t)}. \]

Finally, we set $T = T_{N+1}$ and will abbreviate $F_{T_{N+1}}(t)$ to $F_{N+1}(t)$ in what follows.

Under the framework of the asymptotic expansion in the standard cross-currency Libor market model, we have to consider the following system of stochastic differential equations (henceforth called S.D.E.s) under the domestic terminal measure $P$ to price options. For detailed arguments on the framework of these S.D.E.s see [20].

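As for the domestic and foreign interest rates we assume forward market models; for $j = n(t)-1, n(t), n(t)+1, \ldots, N$,

$$f^{(\epsilon)}_{dj}(t) = f_{dj}(0) + \epsilon^2 \sum_{i=j+1}^{N} \int_0^t g^{0,(\epsilon)}_{di}(u)'\, \gamma_{dj}(u)\, f^{(\epsilon)}_{dj}(u)\,du + \epsilon \int_0^t f^{(\epsilon)}_{dj}(u)\, \gamma_{dj}'(u)\,dW_u, \tag{35}$$

$$\begin{aligned}
f^{(\epsilon)}_{fj}(t) = f_{fj}(0) &- \epsilon^2 \sum_{i=0}^{j} \int_0^t g^{0,(\epsilon)}_{fi}(u)'\, \gamma_{fj}(u)\, f^{(\epsilon)}_{fj}(u)\,du
+ \epsilon^2 \sum_{i=0}^{N} \int_0^t g^{0,(\epsilon)}_{di}(u)'\, \gamma_{fj}(u)\, f^{(\epsilon)}_{fj}(u)\,du \\
&- \epsilon^2 \int_0^t \sigma^{(\epsilon)}(u)\, \bar{\sigma}'\, \gamma_{fj}(u)\, f^{(\epsilon)}_{fj}(u)\,du
+ \epsilon \int_0^t f^{(\epsilon)}_{fj}(u)\, \gamma_{fj}'(u)\,dW_u,
\end{aligned} \tag{36}$$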


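where

$$g^{0,(\epsilon)}_{dj}(t) := \frac{-\tau_j f^{(\epsilon)}_{dj}(t)}{1 + \tau_j f^{(\epsilon)}_{dj}(t)}\,\gamma_{dj}(t), \qquad g^{0,(\epsilon)}_{fj}(t) := \frac{-\tau_j f^{(\epsilon)}_{fj}(t)}{1 + \tau_j f^{(\epsilon)}_{fj}(t)}\,\gamma_{fj}(t);$$

$x'$ denotes the transpose of $x$ and $W$ is an $r$-dimensional standard Wiener process under the domestic terminal measure $P$; $\gamma_{dj}(s)$ and $\gamma_{fj}(s)$ are $r$-dimensional vector-valued functions of the time parameter $s$; $\bar{\sigma}$ denotes an $r$-dimensional constant vector satisfying $\|\bar{\sigma}\| = 1$; and $\sigma^{(\epsilon)}(t)$, the volatility of the spot exchange rate, is specified to follow an $\mathbf{R}_{++}$-valued general time-inhomogeneous Markovian process as follows:

$$\sigma^{(\epsilon)}(t) = \sigma(0) + \int_0^t \mu(u, \sigma^{(\epsilon)}(u))\,du + \epsilon^2 \sum_{j=1}^{N} \int_0^t g^{0,(\epsilon)}_{dj}(u)'\, \omega(u, \sigma^{(\epsilon)}(u))\,du + \epsilon \int_0^t \omega'(u, \sigma^{(\epsilon)}(u))\,dW_u, \tag{37}$$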

where $\mu(s,x)$ and $\omega(s,x)$ are functions of $s$ and $x$.

Finally, we consider the process of the forex forward $F_{N+1}(t)$. Since $F_{N+1}(t) \equiv F_{T_{N+1}}(t)$ can be expressed as $F_{N+1}(t) = S(t)\,P_f(t,T_{N+1})/P_d(t,T_{N+1})$, we easily notice that it is a martingale under the domestic terminal measure. In particular, it satisfies the following stochastic differential equation

$$F^{(\epsilon)}_{N+1}(t) = F_{N+1}(0) + \epsilon \int_0^t \sigma^{(\epsilon)}_F(u)'\, F^{(\epsilon)}_{N+1}(u)\,dW_u \tag{38}$$

where

$$\sigma^{(\epsilon)}_F(t) := \sum_{j=0}^{N} \left( g^{0,(\epsilon)}_{fj}(t) - g^{0,(\epsilon)}_{dj}(t) \right) + \sigma^{(\epsilon)}(t)\,\bar{\sigma}.$$

3.2.2 Numerical Examples

Here, we specify our model and parameters, and confirm the effectiveness of our method in this cross-currency framework.

First of all, the processes of domestic and foreign forward interest rates and of the volatility of the spot exchange rate are specified. We suppose $r = 4$, that is, the dimension of the Brownian motion is set to four; it represents the uncertainty of domestic and foreign interest rates, the spot exchange rate, and its volatility. Note that in this framework correlations among all factors are allowed. We also suppose $S(0) = 100$.

Next, we specify a volatility process of the spot exchange rate in (37) with

$$\mu(s,x) = \kappa(\theta - x), \qquad \omega(s,x) = \omega x, \tag{39}$$

where $\theta$ and $\kappa$ represent the level and speed of its mean-reversion respectively, and $\omega$ denotes a volatility vector on the volatility. In this section the parameters are


Table 3. Initial domestic/foreign forward interest rates and their volatilities.

              f_d     γ*_d    f_f     γ*_f
case (i)      0.05    0.12    0.05    0.12
case (ii)     0.02    0.3     0.05    0.12
case (iii)    0.05    0.12    0.02    0.3
case (iv)     0.02    0.3     0.02    0.3

set as follows: $\epsilon = 1$, $\sigma(0) = \theta = 0.1$, and $\kappa = 0.1$; $\omega = \omega^* \bar{v}$ where $\omega^* = 0.3$ and $\bar{v}$ denotes a four dimensional constant vector given below.

We further suppose that the initial term structures of domestic and foreign forward interest rates are flat, and that their volatilities also have flat structures and are constant over time: that is, for all $j$, $f_{dj}(0) = f_d$, $f_{fj}(0) = f_f$, $\gamma_{dj}(t) = \gamma^*_d \bar{\gamma}_d 1_{\{t<T_j\}}(t)$ and $\gamma_{fj}(t) = \gamma^*_f \bar{\gamma}_f 1_{\{t<T_j\}}(t)$. Here, $\gamma^*_d$ and $\gamma^*_f$ are constant scalars, and $\bar{\gamma}_d$ and $\bar{\gamma}_f$ denote four dimensional constant vectors. Moreover, given a correlation matrix $C$ among all four factors, the constant vectors $\bar{\gamma}_d$, $\bar{\gamma}_f$, $\bar{\sigma}$ and $\bar{v}$ can be determined to satisfy $\|\bar{\gamma}_d\| = \|\bar{\gamma}_f\| = \|\bar{\sigma}\| = \|\bar{v}\| = 1$ and $V'V = C$ where $V := (\bar{\gamma}_d, \bar{\gamma}_f, \bar{\sigma}, \bar{v})$.
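One possible way to obtain unit vectors satisfying $V'V = C$ is a Cholesky factorization of $C$; the text does not say which decomposition the authors used, and the solution is unique only up to a rotation, so the following is a minimal sketch under that assumption. The function names and the factor ordering (domestic rates, foreign rates, spot forex, forex volatility) are illustrative choices, and the matrix encodes the "Corr. 4" setting described in this subsection.

```python
import math

def cholesky_lower(C):
    """Return lower-triangular L with L L' = C (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Assumed factor order: domestic rates, foreign rates, spot forex, forex volatility.
CORR4 = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, -0.5, 0.0],
    [0.5, -0.5, 1.0, -0.5],
    [0.0, 0.0, -0.5, 1.0],
]

# With V = L', the k-th column of V is the k-th row of L, so V'V = L L' = C.
gamma_d_bar, gamma_f_bar, sigma_bar, v_bar = cholesky_lower(CORR4)
```

Because the diagonal of a correlation matrix is one, each row of the factor automatically has unit norm, which is the normalization required above.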

In this subsection, we consider four different cases for $f_d$, $\gamma^*_d$, $f_f$ and $\gamma^*_f$ as in Table 3. For correlations, four sets of parameters are considered. In the case “Corr. 1”, all the factors are independent. In “Corr. 2”, there exists only the correlation of $-0.5$ between the spot exchange rate and its volatility (i.e. $\bar{\sigma}'\bar{v} = -0.5$) while there are no correlations among the others. In “Corr. 3”, correlations between the interest rates and the spot exchange rate are allowed while there are no correlations among the others; the correlation between the domestic rates and the spot forex is $0.5$ ($\bar{\gamma}_d'\bar{\sigma} = 0.5$) and the correlation between the foreign rates and the spot forex is $-0.5$ ($\bar{\gamma}_f'\bar{\sigma} = -0.5$). In “Corr. 4”, a more intricately correlated structure is considered: $\bar{\gamma}_d'\bar{\sigma} = 0.5$ and $\bar{\gamma}_f'\bar{\sigma} = -0.5$ between the interest rates and the spot forex, and $\bar{\sigma}'\bar{v} = -0.5$ between the spot forex and its volatility. It is well known that (both exact and approximate) evaluation of long-term options is a hard task in the case with complex structures of correlations such as in “Corr. 3” or “Corr. 4”. Lastly, we make the assumption that $\gamma_{d,n(t)-1}(t)$ and $\gamma_{f,n(t)-1}(t)$, the volatilities of the domestic and foreign interest rates applied to the period from $t$ to the next fixing date $T_{n(t)}$, are equal to zero for arbitrary $t \in [t, T_{n(t)}]$.

In Figure 1, we compare our estimates of the values of call and put options by an asymptotic expansion up to the fourth order with benchmarks estimated by $10^6$ trials of Monte Carlo simulation, discretized by the Euler-Maruyama scheme with time step 0.05 and combined with the Antithetic Variable Method. For moneyness (defined by $K/F_{N+1}(0)$) less than one, the prices of put options are shown; otherwise, the prices of call options are displayed. For detailed data obtained in this experiment, see Tables 6–9 of [23].
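To make the benchmark procedure concrete, the sketch below applies Euler-Maruyama discretization with antithetic variates to a deliberately simplified one-factor case: a driftless forward with constant volatility, i.e. (38) with $\sigma_F$ frozen at a constant. This is not the four-factor model of the paper; the function names, the parameter values and the Black-type closed form used as a sanity check are all assumptions introduced here for illustration.

```python
import math
import random

def black_call_forward(f0, strike, vol, maturity):
    """Undiscounted Black-type call value for a lognormal forward (check only)."""
    sd = vol * math.sqrt(maturity)
    d1 = (math.log(f0 / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return f0 * cdf(d1) - strike * cdf(d2)

def mc_call_antithetic(f0, strike, vol, maturity, dt, n_pairs, seed=0):
    """Euler-Maruyama for dF = vol * F dW, averaging each path with its mirror."""
    rng = random.Random(seed)
    n_steps = int(round(maturity / dt))
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_pairs):
        f_plus = f_minus = f0
        for _ in range(n_steps):
            dw = sqrt_dt * rng.gauss(0.0, 1.0)
            f_plus += vol * f_plus * dw      # path driven by +dW
            f_minus -= vol * f_minus * dw    # antithetic path driven by -dW
        total += 0.5 * (max(f_plus - strike, 0.0) + max(f_minus - strike, 0.0))
    return total / n_pairs
```

Pairing each path with its mirror image keeps the estimator unbiased for the discretized model while typically reducing its variance, which is the role the Antithetic Variable Method plays in the benchmark above.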

Figure 1. Graphs of comparisons of estimators by the third- and fourth-order asymptotic expansion and Monte Carlo simulations in Corr. 1–4, with a ten-year maturity. Squares denote the differences between the third-order estimators and Monte Carlo estimators; circles denote those between the fourth-order ones and Monte Carlo ones. These differences are defined by (the approximate value by our asymptotic expansion) − (the estimator by Monte Carlo simulations).


As seen in this figure, in general the estimators become more accurate as the order of the expansion increases. In particular, for the deep OTM options the fourth-order approximation performs much better and is more stable than the lower-order approximations.

4. Concluding Remarks

In this paper, we provided general procedures for the explicit computation of the conditional expectations necessary for practical use of the asymptotic expansion method. Moreover, we developed an alternative but equivalent calculation algorithm that computes the unconditional expectations directly instead of the conditional ones. For simplicity and because of limited space, we focused on the simple case of a Black-Scholes-type economy as in Section 2, which illustrated our key ideas. For further explanations in a more general environment, see [23].

Finally, we examined the accuracy of our approximation with high-order expansions in the λ-SABR model and in the cross-currency Libor market model with a stochastic volatility of the spot exchange rate, and confirmed satisfactory results in both examples.

At the end of this section, we state our future plans: we will develop a similar result in the case with a jump component; we will also pursue an efficient method for the evaluation of multi-factor path-dependent and/or American derivatives. In fact, our proposed scheme can be applied to average options under a general setting of the underlying factors.

References

1. Antonov, A. and Misirpashaev, T. [2009], “Projection on a Quadratic Model by Asymptotic Expansion with an Application to LMM Swaption,” Working Paper.
2. Andersen, L. B. G. and Hutchings, N. A. [2009], “Parameter Averaging of Quadratic SDEs With Stochastic Volatility,” Working Paper.
3. Fouque, J.-P., Papanicolaou, G. and Sircar, K. R. [1999], “Financial Modeling in a Fast Mean-reverting Stochastic Volatility Environment,” Asia-Pacific Financial Markets, Vol. 6(1), pp. 37–48.
4. Fouque, J.-P., Papanicolaou, G. and Sircar, K. R. [2000], Derivatives in Financial Markets with Stochastic Volatility, Cambridge University Press.
5. Ikeda, N. and Watanabe, S. [1989], Stochastic Differential Equations and Diffusion Processes, Second Edition, North-Holland/Kodansha, Tokyo.
6. Kunitomo, N. and Takahashi, A. [1992], “Pricing Average Options,” Japan Financial Review, Vol. 14, 1–20. (in Japanese).
7. Kunitomo, N. and Takahashi, A. [2001], “The Asymptotic Expansion Approach to the Valuation of Interest Rate Contingent Claims,” Mathematical Finance, Vol. 11, 117–151.
8. Kunitomo, N. and Takahashi, A. [2003a], “On Validity of the Asymptotic Expansion Approach in Contingent Claim Analysis,” Annals of Applied Probability, Vol. 13-3, 914–952.


9. Kusuoka, S. and Osajima, Y. [2007], “A Remark on the Asymptotic Expansion of Density Function of Wiener Functionals,” Preprint, Graduate School of Mathematical Sciences, the University of Tokyo.
10. Labordere, P. H. [2005a], “A General Asymptotic Implied Volatility for Stochastic Volatility Models,” cond-mat/0504317.
11. Labordere, P. H. [2005b], “Solvable Local and Stochastic Volatility Models: Supersymmetric Methods in Option Pricing,” Working Paper.
12. Matsuoka, R., Takahashi, A. and Uchida, Y. [2004], “A New Computational Scheme for Computing Greeks by the Asymptotic Expansion Approach,” Asia-Pacific Financial Markets, Vol. 11, 393–430.
13. Muroi, Y. [2005], “Pricing Contingent Claims with Credit Risk: Asymptotic Expansion Approach,” Finance and Stochastics, Vol. 9(3), 415–427.
14. Ninomiya, S. and Victoir, N. [2006], “Weak Approximation of Stochastic Differential Equations and Application to Derivative Pricing,” Preprint.
15. Nualart, D. [1995], “The Malliavin Calculus and Related Topics,” Springer.
16. Siopacha, M. and Teichmann, J. [2007], “Weak and Strong Taylor Methods for Numerical Solutions of Stochastic Differential Equations,” Working paper.
17. Takahashi, A. [1995], “Essays on the Valuation Problems of Contingent Claims,” Unpublished Ph.D. Dissertation, Haas School of Business, University of California, Berkeley.
18. Takahashi, A. [1999], “An Asymptotic Expansion Approach to Pricing Contingent Claims,” Asia-Pacific Financial Markets, Vol. 6, 115–151.
19. Takahashi, A. [2009], “On an Asymptotic Expansion Approach to Numerical Problems in Finance,” Selected Papers on Probability and Statistics, pp. 199–217, 2009, American Mathematical Society.
20. Takahashi, A. and Takehara, K. [2007], “Pricing Currency Options with a Market Model of Interest Rates under Jump-Diffusion Stochastic Volatility Processes of Spot Exchange Rates,” Asia-Pacific Financial Markets, Vol. 14, pp. 69–121.
21. Takahashi, A. and Takehara, K. [2008a], “Fourier Transform Method with an Asymptotic Expansion Approach: an Application to Currency Options,” International Journal of Theoretical and Applied Finance, Vol. 11(4), pp. 381–401.
22. Takahashi, A. and Takehara, K. [2008b], “A Hybrid Asymptotic Expansion Scheme: an Application to Currency Options,” Working paper, CARF-F-116, the University of Tokyo, http://www.carf.e.u-tokyo.ac.jp/workingpaper/
23. Takahashi, A., Takehara, K. and Toda, M. [2009], “Computation in an Asymptotic Expansion Method,” Working paper, CARF-F-149, the University of Tokyo, http://www.carf.e.u-tokyo.ac.jp/workingpaper/
24. Takahashi, A. and Yamada, T. [2008], “An Asymptotic Expansion with Push Down Malliavin Weights,” Preprint.
25. Takahashi, A. and Yoshida, N. [2004], “An Asymptotic Expansion Scheme for Optimal Investment Problems,” Statistical Inference for Stochastic Processes, Vol. 7-2, 153–188.
26. Takahashi, A. and Yoshida, N. [2005], “Monte Carlo Simulation with Asymptotic Method,” The Journal of Japan Statistical Society, Vol. 35-2, 171–203.
27. Watanabe, S. [1987], “Analysis of Wiener Functionals (Malliavin Calculus) and its Applications to Heat Kernels,” The Annals of Probability, Vol. 15, 1–39.


28. Yoshida, N. [1992a], “Asymptotic Expansion for Small Diffusions via the Theory of Malliavin-Watanabe,” Probability Theory and Related Fields, Vol. 92, 275–311.
29. Yoshida, N. [1992b], “Asymptotic Expansions for Statistics Related to Small Diffusions,” The Journal of Japan Statistical Society, Vol. 22, 139–159.


Can Financial Synergy Motivate M&A?∗

Yuan Tian1,2, Michi Nishihara3 and Takashi Shibata2

1 Graduate School of Economics, Kyoto University
2 Graduate School of Social Sciences, Tokyo Metropolitan University
3 Graduate School of Economics, Osaka University
E-mail: [email protected], [email protected] and [email protected]

1. Introduction

Recently, M&A has seen explosive growth. In 2007, M&A activity totaled a record $4.38 trillion globally, up 21 percent from 2006. More and more firms are considering M&A as a firm value creation strategy instead of internal growth. M&A also has been the subject of considerable research in financial economics. Most studies have focused on positive or negative operational synergy, e.g., economies of scale, market power, and managerial benefits. However, financial synergy has rarely been analyzed.

The Modigliani-Miller (1958) theorem states that, without tax benefits and default costs, capital structure is irrelevant to total firm value. As a result, there is no financial synergy. However, in the real world with tax benefits and default costs, capital structure does matter. Therefore, adjusting capital structure through M&A may create financial synergy. Although some empirical papers relate firms’ incentive for M&A to capital structure motives based on tax benefits, financial slack, wealth transfers, etc., they do not have an explicit model to analyze the financial synergy realized in M&A. On the other hand, with the exception of Leland (2007) and Morellec and Zhdanov (2008), M&A-related theoretical works have not focused on optimal capital structure by taking tax benefits and default costs into consideration.

This paper develops a continuous model to examine financial synergy when M&A timing is determined endogenously by equityholders. The main questions are as follows: (i) Can purely financial synergy (i.e., without operational synergy) motivate M&A? (ii) How is financial synergy distributed between equityholders and debtholders? Related to the first question, Lewellen (1971) asserts that the

∗The first author appreciates the financial support by MEXT and JSPS (the Public Management Program in Tokyo Metropolitan University). The second author acknowledges the financial support by KAKENHI 20710116, 19710126.


financial synergy of mergers is always positive. However, Leland (2007) suggests that financial synergy by itself is insufficient to justify M&A in many cases. Related to the second question, Scott (1977) and Shastri (1990) report that, while total firm value may increase through mergers because of lower risk, debtholders may gain at the expense of equityholders. On the other hand, Ghosh and Jain (2000) argue that equityholders can appropriate benefits from debtholders by financing M&A with debt and increasing financial leverage after M&A.

In this paper, we examine financial synergy in both scenario F and scenario E. By scenario F, we mean that the optimal capital structure after M&A is determined to maximize the total firm value. Because the M&A decision is made by equityholders, they choose scenario E rather than scenario F if there is no restriction on their behavior. By scenario E, we mean that equityholders maximize the sum of equity value and newly issued debt value to determine the optimal capital structure, ignoring the existing debt value. We find that purely financial synergy can motivate M&A in both scenarios. However, the optimal M&A timing is delayed and financial synergy is larger in scenario E.

We demonstrate that the distribution of financial synergy between equityholders and debtholders is different in the two scenarios. In scenario F, a part of the value created from exercising the M&A option goes to existing debtholders, irrespective of the fact that the M&A cost is fully borne by equityholders. The ex post wealth transfer leads equityholders to choose scenario E. This reflects the debt overhang problem discussed in Myers (1977), which may delay or prevent an investment decision to improve the total firm value because of the existing debt. In scenario E, equityholders may issue a significant amount of new debt, which results in higher default risk. Such actions that transfer wealth from existing debtholders to equityholders are similar to the risk shifting problem discussed in Jensen and Meckling (1976). Actually, scenario F corresponds to a situation where debt is issued with covenants protecting existing debtholders. On the other hand, scenario E provides a clear rationale for LBOs (leveraged buyouts), where the acquirer issues a significant amount of debt to pay for M&A and then uses the cash flows of the target firm to pay off debt over time.

The main contribution of our paper is that we examine financial synergy with endogenous M&A timing. Two recent papers on synergy in M&A are related to ours: Lambrecht (2004) and Leland (2007). Lambrecht (2004) analyzes the optimal timing of mergers motivated by economies of scale. Because Lambrecht (2004) focuses on operational synergy, the tax benefits and default costs, which are central to our analysis of financial synergy, are left out of consideration. Leland (2007) develops a one-period model to examine the role of purely financial synergy in motivating M&A with exogenous timing given as current time. Although the focus of our paper, financial synergy in M&A, is similar to Leland (2007), our modelling differs from Leland (2007) in that we provide a continuous model with endogenous M&A timing. Our justification is as follows. Practically, M&A


is usually regarded as a value creation strategy of separate firms with initial assets in place. While Leland (2007) considers brand-new firms with no initial assets in place that start their operations at current time, we assume that two separate firms have already started their operations with optimal capital structures initially. However, their initial capital structures are no longer optimal, because the state variable changes as time goes by. Therefore, adjusting capital structure to the optimal level through M&A may create financial synergy. Due to the uncertainty in the values after M&A and the irreversibility of the M&A cost, the M&A decision resembles an option exercise. It is appropriate to derive the optimal M&A timing using a real options approach. Moreover, the adjustment of capital structure through M&A depends on M&A timing. By taking the optimal M&A timing into consideration endogenously, our paper analyzes the interaction between capital structures and investment decisions and obtains results different from Leland (2007).$^1$ That is, we find that financial synergy can motivate M&A without operational synergy when M&A timing is endogenously determined, whereas Leland (2007) concludes that financial synergy by itself is insufficient to justify M&A in many cases. Therefore, we complement the literature by demonstrating that the results derived from endogenously determined M&A timing may significantly depart from those derived from exogenously given M&A timing.

The remainder of this paper is organized as follows. Section 2 describes the model setup. Section 3 examines the adjustment of capital structure and the determination of optimal M&A timing in both scenarios. Section 4 calibrates the model to measure financial synergy and provides model implications. Section 5 concludes. Some detailed proofs can be found in the Appendix.

2. Model Setup

The model is set in a continuous-time framework. Since the specialization effect is more important than the diversification effect nowadays, we consider M&A in the same industry.$^2$ There are two risk-neutral firms: a potential acquiring firm and a potential target firm.$^3$ These roles are exogenously assigned and are determined by firms’ specific characteristics, not modeled in this paper. Let “a” and “tar” stand for the acquiring firm and target firm, respectively. Firm $j$ ($j \in \{a, tar\}$)

1 Leland (1994) is the first to use a contingent claims approach to study the optimal capital structure in corporate finance. Dixit and Pindyck (1994) is a standard textbook on the real options approach to investment under uncertainty.

2 During the 1960s and the early 1970s, most mergers were motivated by the diversification effect. That is, when activities’ cash flows are imperfectly correlated, risk can be lowered via mergers. However, because business circumstances become increasingly competitive, it is inefficient to manage different activities as a conglomerate. See Rhodes-Kropf and Robinson (2004) for empirical evidence that similar firms merge.

3 We do not consider competition among several potential acquiring firms. See Morellec and Zhdanov (2008) for an analysis of a takeover contest with two potential acquiring firms and a potential target firm.


collects a revenue flow $Q_j X$, where $Q_j$ is the quantity produced by firm $j$ and $X$ is the price. We assume that the price process $(X(t))_{t \ge 0}$ is given by the following geometric Brownian motion:

$$dX(t) = \mu X(t)\,dt + \sigma X(t)\,dz(t),$$

where $\mu$ and $\sigma\,(>0)$ are constant parameters and $z(t)$ is a standard Brownian motion. The initial value $X(0)$ is sufficiently low. As in most real options models, we suppose that the discount rate $r$ satisfies $r > \mu$ for convergence. Let $\tau$ denote the corporate tax rate. Then, the unlevered firm value at time $t$ can be calculated as

$$\Pi_j(x) := \frac{1-\tau}{r-\mu}\, Q_j x, \tag{1}$$

given that $X(t) = x$.

Both the acquirer and target have already been financed optimally by equity and debt. For simplicity, we assume that the issued debt has infinite maturity and that the contractual continuous coupon payment of the perpetual debt issued by firm $j$ is $c_j$. The profit flow of firm $j$ at time $t$ before M&A is $(1-\tau)(Q_j x - c_j)$. Although issuing debt brings tax benefits, it is also accompanied by default costs. As in Leland (1994), we consider a stock-based definition of default whereby equityholders inject funds in the firm as long as equity value is positive. In other words, equityholders default on their debt obligations the first time equity value is equal to zero. Let $x^d_j$ denote the default threshold of firm $j$ before M&A. At the default threshold, we assume that the firm value is given by $(1-\alpha)\Pi_j(x^d_j)$, where $\alpha \in [0,1]$ measures the loss in firm value incurred by default costs.

We suppose that firms behave in the interests of equityholders and that they can only receive the M&A option unexpectedly.$^4$ If either the acquirer or the target goes into default before M&A occurs, M&A can never be realized. If the price process $(X(t))_{t>0}$ is sufficiently high to hit the optimal M&A threshold $x^i_m$ before each firm’s default threshold, then the acquiring equityholders exercise the M&A option by providing the stand-alone value to target equityholders and bearing the fixed M&A cost $I$.$^5$ The M&A cost is financed by issuing new equity and new debt with coupon $c_n$. After M&A, the profit flow of the merged firm is $(1-\tau)(Q_m x - c_m)$, where the subscript “m” stands for the merged firm. The unlevered firm value after M&A is

$$\Pi_m(x) := \frac{1-\tau}{r-\mu}\, Q_m x. \tag{2}$$

4 We abstract from potential agency conflicts between managers and equityholders by assuming that the incentives of these two groups are perfectly aligned. See Zwiebel (1996), Morellec (2004), Shibata and Nishihara (2010) for analysis of the relation between agency conflicts, financing decisions, and control transactions.

5 The fixed M&A cost here refers to the due diligence cost paid to the third party.


Since our paper focuses on whether purely financial synergy can motivate M&A or not, we assume $Q_m \equiv Q_a + Q_{tar}$ and $c_m \equiv c_a + c_{tar} + c_n$. The quantity $Q_m$ excludes the effect of operational synergy. The coupon $c_m$ reflects the adjustment of capital structure through M&A. We assume that firms cannot call back their existing debt when exercising the M&A option; consequently, $c_n \ge 0$.$^6$

3. Model Analysis

In our model, acquiring equityholders make two types of interrelated decisions: the M&A investment decision and the financing decision. The M&A decision is characterized by an endogenously determined threshold; when the price process $(X(t))_{t>0}$ reaches the M&A threshold $x^i_m$ before each firm’s default threshold $x^d_j$, acquiring equityholders exercise the M&A option. The financing decision involves the choice of newly issued debt and an endogenous default threshold. The coupon level of newly issued debt $c_n(x^i_m)$, which is characterized by a trade-off between the tax benefits and default costs of debt financing, is determined simultaneously with the M&A decision. In contrast, the default threshold $x^d_m(c_m)$, which depends on the coupon level after M&A, is determined after the M&A option is exercised. Note that the three endogenous variables (i.e., $x^i_m$, $c_n(x^i_m)$, and $x^d_m(c_m)$) form a nested structure, which is an important characteristic of this model.

We derive the equityholders’ decisions using backward induction. Section 3.1 examines the default threshold after M&A (step 1) and the coupon of newly issued debt (step 2), which depends on M&A timing. Section 3.2 analyzes the optimal M&A timing (step 3), taking the possibility of default before M&A into consideration.

3.1 After M&A

The first step is to derive the values after M&A and determine the default threshold for the merged firm, $x^d_m$. Let $T^i_m$ and $T^d_m$ denote the endogenously chosen times for M&A investment and default after M&A:

$$T^i_m = \inf\{t \ge 0;\ X(t) \ge x^i_m\}, \qquad T^d_m = \inf\{t \ge T^i_m;\ X(t) \le x^d_m\}.$$

According to our model setup, for $T^i_m \le t \le T^d_m$, the equity value after M&A can be expressed as follows:

$$E_m(x) = E\left[\int_t^{T^d_m} e^{-r(s-t)}(1-\tau)(Q_m X(s) - c_m)\,ds \,\Big|\, X(t) = x\right],$$

where $E[\cdot\,|\,X(t) = x]$ denotes the expectation operator given that $X(t) = x$. The instantaneous change in the equity value after M&A satisfies the following ordinary

6 Goldstein et al. (2001) argue that, while covenants are often in place to protect debtholders, in practice firms typically have the option to issue additional debt in the future without recalling the outstanding debt issues.


differential equation (ODE):

$$rE_m(x) = (1-\tau)(Q_m x - c_m) + \mu x E_m'(x) + \frac{1}{2}\sigma^2 x^2 E_m''(x), \qquad x \ge x^d_m. \tag{3}$$

Once the process $(X(s))_{s>0}$ hits the threshold $x^d_m$, the merged firm defaults. The following boundary conditions ensure that the optimal default threshold is chosen by equityholders:

$$E_m(x^d_m) = 0, \qquad E_m'(x^d_m) = 0, \qquad \lim_{x \to \infty} \frac{E_m(x)}{x} < \infty. \tag{4}$$

Here, the first condition is the value-matching condition. Following the stock-based definition of default, at the default threshold $x^d_m$, the equity value equals 0. The second condition is the smooth-pasting condition, which ensures that $x^d_m$ is chosen to maximize the equity value. The third condition is the no-bubbles condition.

Solving the ODE (3) under these boundary conditions, we obtain the equity value after M&A as follows (see Appendix A):

$$E_m(x) = \Pi_m(x) - (1-\tau)\frac{c_m}{r} - \left[\Pi_m(x^d_m) - (1-\tau)\frac{c_m}{r}\right]\left(\frac{x}{x^d_m}\right)^{\gamma}, \tag{5}$$

where

$$x^d_m = \frac{\gamma}{\gamma-1}\,\frac{r-\mu}{r}\,\frac{c_m}{Q_m}, \tag{6}$$

and $\gamma$ is the negative root of the quadratic equation $\frac{1}{2}\sigma^2 y^2 + (\mu - \frac{1}{2}\sigma^2)y - r = 0$, i.e.,

$$\gamma = -\frac{1}{\sigma^2}\left[\left(\mu - \frac{1}{2}\sigma^2\right) + \sqrt{\left(\mu - \frac{1}{2}\sigma^2\right)^2 + 2\sigma^2 r}\,\right] < 0. \tag{7}$$

The equity value after M&A has two components: (i) the unlevered firm value minus the present value of the contractual coupon paid to the debtholders, plus the present value of tax benefits; (ii) the value of the default option, which is the product of the savings from default and the default probability, given by $(x/x^d_m)^{\gamma}$. Note that the default threshold $x^d_m$ depends on the ratio $c_m/Q_m$.

Similarly, for $T^i_m \le t \le T^d_m$, the debt value after M&A can be expressed as follows:

$$D_m(x) = E\left[\int_t^{T^d_m} e^{-r(s-t)} c_m\,ds + e^{-r(T^d_m - t)}(1-\alpha)\Pi_m(X(T^d_m)) \,\Big|\, X(t) = x\right],$$


and we obtain the debt value as

$$D_m(x) = \frac{c_m}{r} - \left[\frac{c_m}{r} - (1-\alpha)\Pi_m(x^d_m)\right]\left(\frac{x}{x^d_m}\right)^{\gamma}. \tag{8}$$

It also has two components: (i) the present value of perpetual coupon payments; (ii) the present value of the loss in default. The firm value $V_m(x)$ is the sum of equity value and debt value:

$$V_m(x) = E_m(x) + D_m(x) = \Pi_m(x) + \frac{\tau c_m}{r} - \left[\alpha\Pi_m(x^d_m) + \frac{\tau c_m}{r}\right]\left(\frac{x}{x^d_m}\right)^{\gamma}. \tag{9}$$
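The closed forms (5)–(9) are easy to check numerically. The sketch below codes them directly and verifies value matching, smooth pasting and the identity $V_m = E_m + D_m$. The parameter values are hypothetical and are not the paper's calibration; they are chosen so that $\gamma = -1.5$ exactly, and all function names are illustrative.

```python
import math

# Hypothetical parameters, for illustration only.
MU, SIGMA, R, TAU, ALPHA = 0.01, 0.2, 0.06, 0.35, 0.3

def gamma_root():
    """Negative root of 0.5*s^2*y^2 + (mu - 0.5*s^2)*y - r = 0, Eq. (7)."""
    a = MU - 0.5 * SIGMA ** 2
    return (-a - math.sqrt(a * a + 2.0 * SIGMA ** 2 * R)) / SIGMA ** 2

def unlevered(x, q):
    """Unlevered firm value, Eq. (2)."""
    return (1.0 - TAU) / (R - MU) * q * x

def default_threshold(c_m, q):
    """Default threshold chosen by equityholders, Eq. (6)."""
    g = gamma_root()
    return g / (g - 1.0) * (R - MU) / R * c_m / q

def equity(x, c_m, q):
    """Equity value after M&A, Eq. (5)."""
    g, xd = gamma_root(), default_threshold(c_m, q)
    after_tax_coupon = (1.0 - TAU) * c_m / R
    return (unlevered(x, q) - after_tax_coupon
            - (unlevered(xd, q) - after_tax_coupon) * (x / xd) ** g)

def debt(x, c_m, q):
    """Debt value after M&A, Eq. (8)."""
    g, xd = gamma_root(), default_threshold(c_m, q)
    return c_m / R - (c_m / R - (1.0 - ALPHA) * unlevered(xd, q)) * (x / xd) ** g

def firm(x, c_m, q):
    """Total firm value after M&A, right-hand side of Eq. (9)."""
    g, xd = gamma_root(), default_threshold(c_m, q)
    return (unlevered(x, q) + TAU * c_m / R
            - (ALPHA * unlevered(xd, q) + TAU * c_m / R) * (x / xd) ** g)
```

With a coupon of 1 and a quantity of 2, the default threshold is 0.25; equity is zero there and its slope vanishes, which is exactly what conditions (4) require.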

The second step is to determine the coupon of newly issued debt. Following Sundaresan and Wang (2007), we assume that the existing debt and newly issued debt have equal priority at the default threshold.$^7$ Then, the existing debt value after M&A is $D^e_m(x) = [(c_a + c_{tar})/c_m]\,D_m(x)$ and the newly issued debt value after M&A is $D^n_m(x) = (c_n/c_m)\,D_m(x)$.

We consider the determination of newly issued debt in both scenario F and scenario E. In scenario F, equityholders choose $c^*_n$ to maximize the total firm value $V_m(x)$ at the optimal M&A threshold $x^{i*}_m$, which is endogenously determined later. The superscript “∗” stands for the solution corresponding to scenario F. In scenario E, equityholders choose $c^{**}_n$ at the optimal M&A threshold $x^{i**}_m$ to maximize $V^n_m(x)$, which represents the sum of equity value $E_m(x)$ and newly issued debt value $D^n_m(x)$. That is,

$$V^n_m(x) = \Pi_m(x) + \frac{\tau c_m - c_a - c_{tar}}{r} + \left[\left((1-\alpha)\frac{c_n}{c_m} - 1\right)\Pi_m(x^d_m) + \frac{c_a + c_{tar} - \tau c_m}{r}\right]\left(\frac{x}{x^d_m}\right)^{\gamma}. \tag{10}$$

The superscript “∗∗” stands for the solution corresponding to scenario E. The distinction between $V_m(x)$ and $V^n_m(x)$ is essential, because equityholders no longer care about the existing debt value when exercising the M&A option and issuing new debt. This creates the differences between the two scenarios.

The coupon of newly issued debt in scenario F is derived by taking the first-order condition of $V_m(x)$ in Eq. (9):

$$c^*_n = -c_a - c_{tar} + \frac{r}{r-\mu}\,\frac{\gamma-1}{\gamma}\,\frac{Q_m}{h}\,x^{i*}_m, \tag{11}$$

where

$$h = \left[1 - \gamma\left(1 - \alpha + \frac{\alpha}{\tau}\right)\right]^{-1/\gamma} > 1, \tag{12}$$

7 A number of papers, including Weiss (1990) and Goldstein et al. (2001), report that the priority of claims is frequently violated in bankruptcy. It is typical that all unsecured debt receives the same recovery rate, regardless of the issuance date.


provided that the right hand side of Eq. (11) is nonnegative. It is obvious thatdc∗n/dxi∗

m > 0. On the other hand, the coupon of the newly issued debt in scenarioE is derived by taking the first-order condition ofVn

m(x) in Eq. (10):

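\[
c_n^{**} = -c_a - c_{tar} + \frac{r}{r-\mu}\,\frac{\gamma-1}{\gamma}\,\frac{Q_m}{h}\,x_m^{i**}
\times \left[1 - \frac{\gamma}{\gamma-1}\left(\frac{\tau^{-1} - \gamma(1-\alpha+\alpha/\tau)}{1 - \gamma(1-\alpha+\alpha/\tau)}\right)\frac{c_a + c_{tar}}{c_n^{**} + c_a + c_{tar}}\right]^{1/\gamma}, \tag{13}
\]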

provided that the right-hand side of Eq. (13) is nonnegative. Totally differentiating Eq. (13) and rearranging yields $dc_n^{**}/dx_m^{i**} > 0$.

Comparing $c_n^{*}$ and $c_n^{**}$ in Eq. (11) and Eq. (13), respectively, we find that the expression for $c_n^{*}$ is explicit, while that for $c_n^{**}$ is implicit. Moreover, both depend positively on the M&A thresholds $x_m^{i*}$ and $x_m^{i**}$, respectively, which are derived in Section 3.2. This means that waiting for a better state to exercise the M&A option results in issuing more new debt.
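As a numerical illustration of Eqs. (11) and (13), the following minimal Python sketch evaluates both coupons. It assumes the calibration values given in Section 4 ($\mu = 0.01$, $\sigma = 0.25$, $r = 0.06$, $\tau = 0.4$, $\alpha = 0.4$, $c_a = 2.5$, $c_{tar} = 3$), takes $Q_m = Q_a + Q_{tar} = 2.5$ as an assumption, and treats the M&A thresholds as given inputs (the values 2.51 and 5.18 reported in Table 1) rather than solving for them. The function names and the bisection routine are illustrative choices and are not part of the original model description.

```python
import math

# Calibration values from Section 4; Q_M = Q_a + Q_tar is an assumption.
MU, SIGMA, R, TAU, ALPHA = 0.01, 0.25, 0.06, 0.4, 0.4
C_A, C_TAR, Q_M = 2.5, 3.0, 2.5


def negative_root(mu, sigma, r):
    """Negative root gamma of 0.5*sigma^2*y^2 + (mu - 0.5*sigma^2)*y - r = 0."""
    a = mu - 0.5 * sigma ** 2
    return (-a - math.sqrt(a ** 2 + 2.0 * sigma ** 2 * r)) / sigma ** 2


GAMMA = negative_root(MU, SIGMA, R)
K = 1.0 - GAMMA * (1.0 - ALPHA + ALPHA / TAU)   # bracket in Eq. (12)
H_CONST = K ** (-1.0 / GAMMA)                    # h in Eq. (12)
# Coefficient multiplying the M&A threshold in Eqs. (11) and (13).
COEF = R / (R - MU) * (GAMMA - 1.0) / GAMMA * Q_M / H_CONST


def coupon_f(x_i):
    """Explicit new-debt coupon c_n* of Eq. (11) for a given threshold x_i."""
    return -C_A - C_TAR + COEF * x_i


def coupon_e(x_i, iterations=200):
    """Implicit new-debt coupon c_n** of Eq. (13), found by bisection on c_m."""
    c0 = C_A + C_TAR
    w = GAMMA / (GAMMA - 1.0) * (1.0 / TAU - GAMMA * (1.0 - ALPHA + ALPHA / TAU)) / K

    def residual(c_m):
        # c_m minus the right-hand side of Eq. (13), written in terms of c_m.
        return c_m - COEF * x_i * (1.0 - w * c0 / c_m) ** (1.0 / GAMMA)

    # The residual is increasing in c_m: negative near w*c0, positive for large c_m.
    lo, hi = w * c0 * (1.0 + 1e-9), 1e4
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi) - c0


if __name__ == "__main__":
    print("h     =", round(H_CONST, 4))
    print("c_n*  =", round(coupon_f(2.51), 4))
    print("c_n** =", round(coupon_e(5.18), 4))
```

With these inputs the merged coupon $c_a + c_{tar} + c_n$ comes out at about 5.7 in scenario F and about 15.9 in scenario E, close to the values of $c_m$ in Table 1; small differences are expected because the thresholds used as inputs are rounded.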

3.2 Before M&A

The third step is to determine the M&A threshold, taking the possibility of default before M&A into consideration. While the upper boundary $x_m^i$ is determined by the acquiring equityholders, the lower boundary $\max[x_{am}^d, x_{tar}^d]$ is determined by either the acquiring equityholders (if $x_{tar}^d \le x_{am}^d$) or the target equityholders (if $x_{tar}^d \ge x_{am}^d$). The subscript “am” differs from “a” in that it represents value with the M&A option. Because default means losing the M&A option in the future, equityholders may be less willing to go into default before M&A, compared to the case without the M&A option. Therefore, even if $x_a^d > x_{tar}^d$, it is possible that $x_{am}^d < x_{tar}^d$.$^{8}$

Let $H(x; y, z)$ denote the present value of a claim that pays $1 contingent on $x$ reaching the upper threshold $y$ before reaching the lower threshold $z$. In contrast, let $L(x; y, z)$ denote the present value of a claim that pays $1 contingent on $x$ reaching the lower threshold $z$ before reaching the upper threshold $y$. In Appendix B, we demonstrate that:

\[
H(x; y, z) = \frac{z^{\gamma}x^{\beta} - z^{\beta}x^{\gamma}}{z^{\gamma}y^{\beta} - z^{\beta}y^{\gamma}}, \qquad
L(x; y, z) = \frac{x^{\gamma}y^{\beta} - x^{\beta}y^{\gamma}}{z^{\gamma}y^{\beta} - z^{\beta}y^{\gamma}}, \tag{14}
\]

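where $\beta$ is the positive root of the quadratic equation $\tfrac{1}{2}\sigma^2 y^2 + (\mu - \tfrac{1}{2}\sigma^2)y - r = 0$, i.e.,

\[
\beta = \frac{1}{\sigma^2}\left[-\left(\mu - \tfrac{1}{2}\sigma^2\right) + \sqrt{\left(\mu - \tfrac{1}{2}\sigma^2\right)^2 + 2\sigma^2 r}\,\right] > 1. \tag{15}
\]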
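The contingent claims in Eq. (14) are easy to check numerically. The sketch below, written under the assumption of the Section 4 calibration ($\mu = 0.01$, $\sigma = 0.25$, $r = 0.06$) and with illustrative function names, computes both roots of the quadratic and evaluates $H$ and $L$; the thresholds passed in the example are the illustrative values $x = 2.3$, $y = 2.5$, $z = 1.8$ used for Fig. 2.

```python
import math


def roots(mu, sigma, r):
    """Return (beta, gamma): the positive and negative roots of
    0.5*sigma^2*y^2 + (mu - 0.5*sigma^2)*y - r = 0."""
    a = mu - 0.5 * sigma ** 2
    disc = math.sqrt(a ** 2 + 2.0 * sigma ** 2 * r)
    return (-a + disc) / sigma ** 2, (-a - disc) / sigma ** 2


def claim_h(x, y, z, beta, gamma):
    """Present value of $1 paid if x reaches the upper threshold y before z."""
    den = z ** gamma * y ** beta - z ** beta * y ** gamma
    return (z ** gamma * x ** beta - z ** beta * x ** gamma) / den


def claim_l(x, y, z, beta, gamma):
    """Present value of $1 paid if x reaches the lower threshold z before y."""
    den = z ** gamma * y ** beta - z ** beta * y ** gamma
    return (x ** gamma * y ** beta - x ** beta * y ** gamma) / den


BETA, GAMMA = roots(0.01, 0.25, 0.06)

if __name__ == "__main__":
    print("beta, gamma:", BETA, GAMMA)
    print("H:", claim_h(2.3, 2.5, 1.8, BETA, GAMMA))
    print("L:", claim_l(2.3, 2.5, 1.8, BETA, GAMMA))
```

Under this calibration $\beta \approx 1.77$ and $\gamma \approx -1.09$. The boundary values $H(y; y, z) = 1$, $H(z; y, z) = 0$, $L(z; y, z) = 1$ and $L(y; y, z) = 0$ follow directly from the formulas, and $H + L < 1$ in the interior because both payoffs are discounted.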

$^{8}$ Morellec and Zhdanov (2008) also jointly determine the financing strategies and the takeover timing. However, in their model, the takeover threshold is chosen by target equityholders. Furthermore, they did not explicitly consider the change in the lower boundary when the M&A option is available.


We suppose that if the acquiring equityholders bear the M&A cost $I$ and provide the stand-alone value to target equityholders, the agreement on M&A can be realized. Therefore, the expression for the target equity value is similar to Eq. (5):

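\[
E_{tar}(x) = \Pi_{tar}(x) - (1-\tau)\frac{c_{tar}}{r} - \left[\Pi_{tar}(x_{tar}^d) - (1-\tau)\frac{c_{tar}}{r}\right]\left(\frac{x}{x_{tar}^d}\right)^{\gamma}, \tag{16}
\]

where

\[
x_{tar}^d = \frac{\gamma}{\gamma-1}\,\frac{r-\mu}{r}\,\frac{c_{tar}}{Q_{tar}}. \tag{17}
\]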

However, the target debt value with the M&A option differs from the stand-alone value (i.e., $D_{tarm}(x) \neq D_{tar}(x)$). Because of the assumption that the existing debt cannot be called back when M&A occurs, the target debt value is passively affected by the acquiring equityholders’ exercise of the M&A option. At the upper boundary,

\[
D_{tarm}(x_m^i) = \frac{c_{tar}}{c_m}\,D_m(x_m^i). \tag{18}
\]

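At the lower boundary, since the M&A option is lost, $D_{tarm}(\max[x_{am}^d, x_{tar}^d]) = D_{tar}(\max[x_{am}^d, x_{tar}^d])$, which is similar to Eq. (8):

\[
D_{tarm}(\max[x_{am}^d, x_{tar}^d]) = \frac{c_{tar}}{r} - \left[\frac{c_{tar}}{r} - (1-\alpha)\Pi_{tar}(x_{tar}^d)\right]\left(\frac{\max[x_{am}^d, x_{tar}^d]}{x_{tar}^d}\right)^{\gamma}. \tag{19}
\]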

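Therefore, we have the following expression for the target debt value with the M&A option:

\[
D_{tarm}(x) = \frac{c_{tar}}{r} + e_{tar}^{i}\,H\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right) + e_{tar}^{d}\,L\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right), \tag{20}
\]

where

\[
\begin{aligned}
e_{tar}^{i} &= \frac{c_{tar}}{c_m}\,D_m(x_m^i) - \frac{c_{tar}}{r},\\
e_{tar}^{d} &=
\begin{cases}
-\left[\dfrac{c_{tar}}{r} - (1-\alpha)\Pi_{tar}(x_{tar}^d)\right]\left(\dfrac{x_{am}^d}{x_{tar}^d}\right)^{\gamma}, & \text{if } x_{tar}^d < x_{am}^d,\\[2ex]
-\left[\dfrac{c_{tar}}{r} - (1-\alpha)\Pi_{tar}(x_{tar}^d)\right], & \text{if } x_{tar}^d \ge x_{am}^d.
\end{cases}
\end{aligned}
\tag{21}
\]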

Eq. (20) has three components: (i) the present value of the contractual coupon payments; (ii) the present value when the M&A option is exercised, which is given by the product of the net payoff $e_{tar}^{i}$ at the upper boundary $x_m^i$ and the present value of the unit-payoff contingent claim $H(x; x_m^i, \max[x_{am}^d, x_{tar}^d])$; and (iii) the present value when the default option is exercised, which is given by the product of the net payoff $e_{tar}^{d}$ at the lower boundary $\max[x_{am}^d, x_{tar}^d]$ and the present value of the unit-payoff contingent claim $L(x; x_m^i, \max[x_{am}^d, x_{tar}^d])$. The target firm value is the sum of Eq. (16) and Eq. (20):

\[
V_{tarm}(x) = E_{tar}(x) + D_{tarm}(x). \tag{22}
\]
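The decomposition in Eqs. (20) and (21) can be checked against the boundary values in Eqs. (18) and (19) with a short Python sketch. The assumptions are stated loudly: the calibration comes from Section 4; $\Pi_j(x)$ is taken to be $(1-\tau)Q_j x/(r-\mu)$, the form suggested by Eq. (A.1); the stand-alone debt value has the form appearing in Eq. (19); the merged-firm inputs $c_m = 5.71$ and $x_m^i = 2.51$ are the scenario F values from Table 1 with $Q_m = 2.5$ assumed; and the acquirer threshold `X_D_AM = 1.0` is purely hypothetical, since its value is not reported.

```python
import math

MU, SIGMA, R, TAU, ALPHA = 0.01, 0.25, 0.06, 0.4, 0.4
C_TAR, Q_TAR = 3.0, 1.5
C_M, Q_M, X_I = 5.71, 2.5, 2.51   # scenario F values (Table 1); Q_M is assumed
X_D_AM = 1.0                       # hypothetical acquirer default threshold

_A = MU - 0.5 * SIGMA ** 2
_DISC = math.sqrt(_A ** 2 + 2.0 * SIGMA ** 2 * R)
BETA, GAMMA = (-_A + _DISC) / SIGMA ** 2, (-_A - _DISC) / SIGMA ** 2


def default_threshold(c, q):
    """Stand-alone default threshold, same form as Eq. (17)."""
    return GAMMA / (GAMMA - 1.0) * (R - MU) / R * c / q


def pi(x, q):
    """Assumed form of Pi_j(x): (1 - tau) * Q_j * x / (r - mu)."""
    return (1.0 - TAU) * q * x / (R - MU)


def debt(x, c, q):
    """Stand-alone debt value, following the form used in Eq. (19)."""
    xd = default_threshold(c, q)
    return c / R - (c / R - (1.0 - ALPHA) * pi(xd, q)) * (x / xd) ** GAMMA


def claim_h(x, y, z):
    den = z ** GAMMA * y ** BETA - z ** BETA * y ** GAMMA
    return (z ** GAMMA * x ** BETA - z ** BETA * x ** GAMMA) / den


def claim_l(x, y, z):
    den = z ** GAMMA * y ** BETA - z ** BETA * y ** GAMMA
    return (x ** GAMMA * y ** BETA - x ** BETA * y ** GAMMA) / den


XD_TAR = default_threshold(C_TAR, Q_TAR)
LOWER = max(X_D_AM, XD_TAR)        # lower boundary max[x_am^d, x_tar^d]


def target_debt_with_option(x):
    """Eq. (20): coupon perpetuity plus the two boundary payoffs of Eq. (21)."""
    e_i = C_TAR / C_M * debt(X_I, C_M, Q_M) - C_TAR / R
    # (LOWER / XD_TAR)^gamma equals 1 when the target's threshold binds,
    # which reproduces the second case of Eq. (21).
    e_d = -(C_TAR / R - (1.0 - ALPHA) * pi(XD_TAR, Q_TAR)) * (LOWER / XD_TAR) ** GAMMA
    return C_TAR / R + e_i * claim_h(x, X_I, LOWER) + e_d * claim_l(x, X_I, LOWER)


if __name__ == "__main__":
    print("D_tarm at current x = 2.3:", target_debt_with_option(2.3))
```

Evaluating `target_debt_with_option` at `X_I` returns the share $(c_{tar}/c_m)D_m(x_m^i)$ of Eq. (18), and evaluating it at `LOWER` returns the stand-alone value of Eq. (19), because $H$ and $L$ take the values 1 and 0 (or 0 and 1) at the two boundaries.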

The following boundary conditions ensure that the optimal M&A threshold and default threshold of the acquirer are chosen in scenario F:

\[
\begin{aligned}
V_{am}(x_m^i) + V_{tarm}(x_m^i) &= V_m(x_m^i) - I,\\
V'_{am}(x_m^i) + V'_{tarm}(x_m^i) &= V'_m(x_m^i),\\
E_{am}(x_{am}^d) &= 0,\\
E'_{am}(x_{am}^d) &= 0.
\end{aligned}
\tag{23}
\]

Here, the first condition is the value-matching condition at $x_m^i$. After M&A, the acquiring equityholders internalize the tax benefits and default costs of the merged firm. By paying the fixed cost $I$ to exercise the M&A option at $x_m^i$, the acquiring firm collects the surplus, that is, the merged firm value less the value paid to the target firm ($V_{tarm} = E_{tar} + D_{tarm}$). The second condition is the smooth-pasting condition at $x_m^i$. This condition ensures that $x_m^i$ is chosen to maximize the total firm value. The remaining two conditions are the value-matching and smooth-pasting conditions at $x_{am}^d$.

According to the two value-matching conditions in (23), the firm value of the acquiring firm with the M&A option can be written as:

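\[
V_{am}(x) = \Pi_a(x) + \frac{\tau c_a}{r} + e_a^{i}\,H\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right) + e_a^{d}\,L\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right), \tag{24}
\]

where

\[
\begin{aligned}
e_a^{i} &= V_m(x_m^i) - V_{tarm}(x_m^i) - I - \left[\Pi_a(x_m^i) + \frac{\tau c_a}{r}\right],\\
e_a^{d} &=
\begin{cases}
-\left[\alpha\Pi_a(x_{am}^d) + \dfrac{\tau c_a}{r}\right], & \text{if } x_{am}^d > x_{tar}^d,\\[2ex]
-\left[\alpha\Pi_a(x_{tar}^d) + \dfrac{\tau c_a}{r}\right], & \text{if } x_{am}^d \le x_{tar}^d \le x_a^d,\\[2ex]
-\left[\alpha\Pi_a(x_a^d) + \dfrac{\tau c_a}{r}\right]\left(\dfrac{x_{tar}^d}{x_a^d}\right)^{\gamma}, & \text{if } x_{am}^d \le x_{tar}^d,\ x_a^d < x_{tar}^d.
\end{cases}
\end{aligned}
\tag{25}
\]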

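The equity value of the acquiring firm with the M&A option can be written as:

\[
E_{am}(x) = \Pi_a(x) - (1-\tau)\frac{c_a}{r} + e_a^{i}\,H\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right) + e_a^{d}\,L\!\left(x; x_m^i, \max[x_{am}^d, x_{tar}^d]\right), \tag{26}
\]

where

\[
\begin{aligned}
e_a^{i} &= V_m^n(x_m^i) - E_{tar}(x_m^i) - I - \left[\Pi_a(x_m^i) - (1-\tau)\frac{c_a}{r}\right],\\
e_a^{d} &=
\begin{cases}
-\left[\Pi_a(x_{am}^d) - (1-\tau)\dfrac{c_a}{r}\right], & \text{if } x_{am}^d > x_{tar}^d,\\[2ex]
-\left[\Pi_a(x_{tar}^d) - (1-\tau)\dfrac{c_a}{r}\right], & \text{if } x_{am}^d \le x_{tar}^d \le x_a^d,\\[2ex]
-\left[\Pi_a(x_a^d) - (1-\tau)\dfrac{c_a}{r}\right]\left(\dfrac{x_{tar}^d}{x_a^d}\right)^{\gamma}, & \text{if } x_{am}^d \le x_{tar}^d,\ x_a^d < x_{tar}^d.
\end{cases}
\end{aligned}
\tag{27}
\]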

Note that if $x_{am}^d \le x_{tar}^d$ (the second and third lines in Eq. (27)), then the lower boundary turns out to be $x_{tar}^d$. Once the price process $(X(s))_{s>0}$ hits $x_{tar}^d$, the acquirer loses the M&A option. Moreover, if $x_{am}^d \le x_{tar}^d \le x_a^d$ (the second line in Eq. (27)), then the acquirer immediately goes into default at the lower boundary $x_{tar}^d$; if $x_{am}^d \le x_{tar}^d$ and $x_a^d < x_{tar}^d$ (the third line in Eq. (27)), then the acquirer continues operating the firm and goes into default optimally when the price process $(X(s))_{s>0}$ hits $x_a^d$.

By now, we have obtained all the value expressions that appear in the boundary conditions (23). Substituting these expressions into the smooth-pasting conditions at $x_m^i$ and $\max[x_{am}^d, x_{tar}^d]$ in (23), respectively, we obtain:

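\[
\begin{aligned}
\nu_1\gamma\,(x_m^{i*})^{\gamma} ={}& \frac{(e_a^{d} + e_{tar}^{d})(\gamma-\beta)(x_m^{i*})^{\gamma+\beta}}{\left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\gamma}(x_m^{i*})^{\beta} - \left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\beta}(x_m^{i*})^{\gamma}}\\
&+ \frac{(e_a^{i} + e_{tar}^{i})\left[\beta(x_m^{i*})^{\beta}\left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\gamma} - \gamma(x_m^{i*})^{\gamma}\left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\beta}\right]}{\left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\gamma}(x_m^{i*})^{\beta} - \left(\max[x_{am}^{d*}, x_{tar}^d]\right)^{\beta}(x_m^{i*})^{\gamma}},
\end{aligned}
\tag{28}
\]

where

\[
\nu_1 = -\left[\alpha\Pi_m(x_m^d) + \frac{\tau c_m}{r}\right](x_m^d)^{-\gamma} + \left[\Pi_{tar}(x_{tar}^d) - \frac{(1-\tau)c_{tar}}{r}\right](x_{tar}^d)^{-\gamma},
\]

and

\[
\Pi_a(x_{am}^{d*}) + \frac{e_a^{i}(\beta-\gamma)(x_{am}^{d*})^{\beta+\gamma} + e_a^{d}\left[\gamma(x_{am}^{d*})^{\gamma}(x_m^{i*})^{\beta} - \beta(x_{am}^{d*})^{\beta}(x_m^{i*})^{\gamma}\right]}{(x_{am}^{d*})^{\gamma}(x_m^{i*})^{\beta} - (x_{am}^{d*})^{\beta}(x_m^{i*})^{\gamma}} = 0. \tag{29}
\]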
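The fractions in Eqs. (28) and (29) can be traced back to the derivatives of the contingent claims in Eq. (14). As a worked step, with the shorthand $\Delta = z^{\gamma}y^{\beta} - z^{\beta}y^{\gamma}$ (a symbol introduced here only for brevity), multiplying the derivatives of $H$ and $L$ by $x$ and evaluating them at the two boundaries gives:

```latex
\begin{align*}
x\,\frac{\partial H}{\partial x}\bigg|_{x=y} &= \frac{\beta z^{\gamma}y^{\beta} - \gamma z^{\beta}y^{\gamma}}{\Delta}, &
x\,\frac{\partial L}{\partial x}\bigg|_{x=y} &= \frac{(\gamma-\beta)\,y^{\gamma+\beta}}{\Delta},\\[1ex]
x\,\frac{\partial H}{\partial x}\bigg|_{x=z} &= \frac{(\beta-\gamma)\,z^{\beta+\gamma}}{\Delta}, &
x\,\frac{\partial L}{\partial x}\bigg|_{x=z} &= \frac{\gamma z^{\gamma}y^{\beta} - \beta z^{\beta}y^{\gamma}}{\Delta}.
\end{align*}
```

Setting $y = x_m^{i*}$ and $z = \max[x_{am}^{d*}, x_{tar}^d]$ in the first row reproduces the two fractions on the right-hand side of Eq. (28), weighted by the net payoffs at the lower and upper boundaries. Setting $z = x_{am}^{d*}$ in the second row reproduces the fraction in Eq. (29), where the leading term $\Pi_a(x_{am}^{d*})$ arises because $\Pi_a$ is linear in $x$.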

On the other hand, in scenario E, the value-matching and smooth-pasting conditions at $x_m^i$ are given as follows:

\[
\begin{aligned}
E_{am}(x_m^i) + E_{tar}(x_m^i) &= V_m^n(x_m^i) - I,\\
E'_{am}(x_m^i) + E'_{tar}(x_m^i) &= V_m^{n\prime}(x_m^i),
\end{aligned}
\tag{30}
\]


where $E_{am}(x)$ and $V_m^n(x)$ are given by Eq. (26) and Eq. (10), respectively. The value-matching and smooth-pasting conditions at the lower boundary are the same as those in scenario F. The smooth-pasting condition at $x_m^i$ in (30) implies:

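\[
\begin{aligned}
\nu_2\gamma\,(x_m^{i**})^{\gamma} ={}& \frac{e_a^{d}(\gamma-\beta)(x_m^{i**})^{\gamma+\beta}}{\left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\gamma}(x_m^{i**})^{\beta} - \left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\beta}(x_m^{i**})^{\gamma}}\\
&+ \frac{e_a^{i}\left[\beta(x_m^{i**})^{\beta}\left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\gamma} - \gamma(x_m^{i**})^{\gamma}\left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\beta}\right]}{\left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\gamma}(x_m^{i**})^{\beta} - \left(\max[x_{am}^{d**}, x_{tar}^d]\right)^{\beta}(x_m^{i**})^{\gamma}},
\end{aligned}
\tag{31}
\]

where

\[
\nu_2 = \left[\left(\frac{(1-\alpha)c_n}{c_m} - 1\right)\Pi_m(x_m^d) + \frac{c_a + c_{tar} - \tau c_m}{r}\right](x_m^d)^{-\gamma} + \left[\Pi_{tar}(x_{tar}^d) - \frac{(1-\tau)c_{tar}}{r}\right](x_{tar}^d)^{-\gamma}.
\]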

Proposition 3.1. The optimal M&A threshold, the default threshold of the acquirer with the M&A option, and the coupon level of newly issued debt can be obtained by simultaneously solving the following equations:

(i) For scenario F, the three equations that determine $x_m^{i*}$, $x_{am}^{d*}$, and $c_n^{*}$ are Eq. (11), Eq. (28), and Eq. (29);

(ii) For scenario E, the three equations that determine $x_m^{i**}$, $x_{am}^{d**}$, and $c_n^{**}$ are Eq. (13), Eq. (31), and Eq. (29) (with $x_{am}^{d**}$ instead of $x_{am}^{d*}$).

4. Model Implications

Since the equations above are nonlinear in the thresholds, closed-form analytical solutions are not available. In this section, we calibrate the model to analyze the characteristics of the solutions and provide several empirical predictions. In particular, we measure financial synergy when the M&A option is exercised optimally.

We use the following input parameter values for calibration: $\mu = 0.01$, $\sigma = 0.25$, $r = 0.06$, $\tau = 0.4$, $\alpha = 0.4$, $c_a = 2.5$, $c_{tar} = 3$, $Q_a = 1$, $Q_{tar} = 1.5$, $I = 10$, $x = 2.3$. The growth rate $\mu = 0.01$ and volatility $\sigma = 0.25$ of cash flows are selected to match the data of an average Standard and Poor’s (S&P) 500 firm (see Strebulaev (2007)). The risk-free rate $r = 0.06$ is taken from the yield curve on Treasury bonds. The corporate tax rate $\tau = 0.4$ follows the estimation by Kemsley and Nissim (2002). The default cost parameter $\alpha = 0.4$ is chosen to be consistent with Gilson (1997), which reports that default costs are equal to 0.365 and 0.455 for the median firm in his samples. The remaining parameter values (the coupon $c_j$, the quantity $Q_j$, the fixed cost $I$, and the current value of the state variable $x$) are not essential, because they can be normalized. We simply set them as above to show the results clearly.

Under this parameter setting, $x_a^d = 1.09$ and $x_{tar}^d = 0.89$. We can also calculate inversely that the initial values of the state variable (denoted by $x_j^0$, $j \in \{a, tar\}$) at which the acquirer and target established their firms are $x_a^0 = 2.74$ and $x_{tar}^0 = 2.19$, respectively, given that $c_a$ and $c_{tar}$ are their optimal coupons at the establishment time 0.$^{9}$ As time goes by, their initial capital structures are no longer optimal because the state variable changes. Since we set $x = 2.3$ at the current time, the acquirer is a firm with excessive debt and the target is a firm with insufficient debt relative to their currently optimal capital structures. Therefore, adjusting the capital structure to the optimal level through M&A may create financial synergy.

We also analyze a parameter setting with $c_a = 3$, $c_{tar} = 2.5$, $Q_a = 1.5$, $Q_{tar} = 1$, with the other parameters unchanged. In that case, the acquirer is a firm with insufficient debt and the target is a firm with excessive debt relative to their currently optimal capital structures. Comparing the results of the two cases (the case when the acquirer’s debt is excessive and the case when it is insufficient), we find that in scenario E there is little difference between the two cases, because the existing debt value is ignored in the maximization process. On the other hand, in scenario F, M&A is delayed when the acquirer’s debt is excessive compared with the case when it is insufficient, because the debt overhang problem is more serious. Except for this point, the results when the acquirer’s debt is insufficient are very similar to those when it is excessive, which we analyze in detail below.
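The default thresholds and the implied initial states quoted above can be reproduced approximately with a few lines of Python. This is a sketch under stated assumptions: the stand-alone default threshold is taken to have the form of Eq. (17) for both firms, and the initial state $x_j^0$ is obtained by inverting the linear coupon rule of Eq. (11) with no pre-existing debt, which gives $x_j^0 = h\,x_j^d$. The function names are illustrative.

```python
import math

MU, SIGMA, R, TAU, ALPHA = 0.01, 0.25, 0.06, 0.4, 0.4
FIRMS = {"a": (2.5, 1.0), "tar": (3.0, 1.5)}   # (coupon c_j, quantity Q_j)

_A = MU - 0.5 * SIGMA ** 2
GAMMA = (-_A - math.sqrt(_A ** 2 + 2.0 * SIGMA ** 2 * R)) / SIGMA ** 2
H_CONST = (1.0 - GAMMA * (1.0 - ALPHA + ALPHA / TAU)) ** (-1.0 / GAMMA)  # Eq. (12)


def default_threshold(c, q):
    """Stand-alone default threshold, same form as Eq. (17)."""
    return GAMMA / (GAMMA - 1.0) * (R - MU) / R * c / q


def initial_state(c, q):
    """State x0 at which coupon c would be optimal for a firm with no prior debt.

    Assumption: c = r/(r-mu) * (gamma-1)/gamma * Q * x0 / h, i.e. Eq. (11)
    with c_a + c_tar set to zero, so that x0 = h * x^d.
    """
    return H_CONST * default_threshold(c, q)


if __name__ == "__main__":
    for name, (c, q) in FIRMS.items():
        print(name, round(default_threshold(c, q), 3), round(initial_state(c, q), 3))
```

The sketch gives initial states of about 2.74 for the acquirer and 2.19 for the target, so the current state $x = 2.3$ lies between them, which is what makes the acquirer over-levered and the target under-levered. The stand-alone thresholds it returns are about 1.08 and 0.87; the second is slightly below the rounded figure quoted in the text, so the comparison should be read as approximate.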

4.1 Measure of Financial Synergy

Since we have assumed no operational synergy, the financial synergy of M&A is measured by the difference between the value of the optimally levered merged firm and the sum of the stand-alone acquirer value and target value. The purely financial synergy at the current time is defined as:

\[
FS(x) = \left[\Delta TB(x_m^i) - \Delta DC(x_m^i)\right]\left(\frac{x}{x_m^i}\right)^{\beta}, \tag{32}
\]

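where

\[
\Delta TB(x_m^i) = \frac{\tau}{r}\left[c_m\left[1 - \left(\frac{x_m^i}{x_m^d}\right)^{\gamma}\right] - c_a\left[1 - \left(\frac{x_m^i}{x_a^d}\right)^{\gamma}\right] - c_{tar}\left[1 - \left(\frac{x_m^i}{x_{tar}^d}\right)^{\gamma}\right]\right], \tag{33}
\]

\[
\Delta DC(x_m^i) = \alpha\left[\Pi_m(x_m^d)\left(\frac{x_m^i}{x_m^d}\right)^{\gamma} - \Pi_a(x_a^d)\left(\frac{x_m^i}{x_a^d}\right)^{\gamma} - \Pi_{tar}(x_{tar}^d)\left(\frac{x_m^i}{x_{tar}^d}\right)^{\gamma}\right]. \tag{34}
\]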

$^{9}$ From Eq. (11), we know that there is a linear relationship between the optimal coupon and the initial investment threshold.


The financial synergy can be divided into two components, which are directly related to changes in financial structure through M&A. The first component, $\Delta TB$, denotes the change in the present value of tax benefits from the optimally levered merged firm versus separate firms. The second component, $\Delta DC$, denotes the change in the present value of default costs. The credit spread and leverage at $x_m^i$ are defined as follows:

\[
CS_j(x_m^i) = \frac{c_j}{D_j(x_m^i)} - r, \tag{35}
\]

\[
L_j(x_m^i) = \frac{D_j(x_m^i)}{V_j(x_m^i)}, \tag{36}
\]

where $j \in \{m, a, tar\}$.
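Eqs. (32) to (36) can be evaluated directly once the thresholds are known. The following Python sketch does so for scenario F, taking $x_m^i = 2.51$ and $c_m = 5.71$ from Table 1 as given inputs instead of solving Proposition 3.1. Several ingredients are assumptions rather than statements from the text: $Q_m = Q_a + Q_{tar}$, $\Pi_j(x) = (1-\tau)Q_j x/(r-\mu)$, default thresholds of the form in Eq. (17) for every firm, and stand-alone debt and firm values of the forms appearing in Eqs. (19) and (24) to (25).

```python
import math

MU, SIGMA, R, TAU, ALPHA = 0.01, 0.25, 0.06, 0.4, 0.4
X, X_I = 2.3, 2.51                       # current state; scenario F threshold (Table 1)
FIRMS = {"a": (2.5, 1.0), "tar": (3.0, 1.5), "m": (5.71, 2.5)}   # (c_j, Q_j)

_A = MU - 0.5 * SIGMA ** 2
_DISC = math.sqrt(_A ** 2 + 2.0 * SIGMA ** 2 * R)
BETA, GAMMA = (-_A + _DISC) / SIGMA ** 2, (-_A - _DISC) / SIGMA ** 2


def default_threshold(j):
    c, q = FIRMS[j]
    return GAMMA / (GAMMA - 1.0) * (R - MU) / R * c / q


def pi(j, x):
    """Assumed form of Pi_j(x): (1 - tau) * Q_j * x / (r - mu)."""
    return (1.0 - TAU) * FIRMS[j][1] * x / (R - MU)


def hit(j, x):
    """(x / x_j^d)^gamma: present value of $1 paid when firm j defaults."""
    return (x / default_threshold(j)) ** GAMMA


def tax_benefit(j, x):
    return FIRMS[j][0] * (1.0 - hit(j, x))


def default_cost(j, x):
    return pi(j, default_threshold(j)) * hit(j, x)


def delta_tb(x_i):
    """Eq. (33)."""
    return TAU / R * (tax_benefit("m", x_i) - tax_benefit("a", x_i) - tax_benefit("tar", x_i))


def delta_dc(x_i):
    """Eq. (34)."""
    return ALPHA * (default_cost("m", x_i) - default_cost("a", x_i) - default_cost("tar", x_i))


def financial_synergy(x, x_i):
    """Eq. (32)."""
    return (delta_tb(x_i) - delta_dc(x_i)) * (x / x_i) ** BETA


def debt(j, x):
    c = FIRMS[j][0]
    return c / R - (c / R - (1.0 - ALPHA) * pi(j, default_threshold(j))) * hit(j, x)


def firm_value(j, x):
    c = FIRMS[j][0]
    return pi(j, x) + TAU * c / R - (ALPHA * pi(j, default_threshold(j)) + TAU * c / R) * hit(j, x)


def credit_spread(j, x_i):
    """Eq. (35)."""
    return FIRMS[j][0] / debt(j, x_i) - R


def leverage(j, x_i):
    """Eq. (36)."""
    return debt(j, x_i) / firm_value(j, x_i)


if __name__ == "__main__":
    print("FS  :", round(financial_synergy(X, X_I), 3))
    print("CS_a:", round(credit_spread("a", X_I), 4))
    print("L_a :", round(leverage("a", X_I), 3))
```

Under these assumptions the sketch returns a financial synergy of roughly 0.23, an acquirer credit spread of roughly 0.029 and an acquirer leverage of roughly 0.74, in line with the scenario F row of Table 1. Because the inputs are rounded, the last reported digit need not match.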

4.2 Main Results

Table 1 demonstrates the results in both scenarios.$^{10}$ According to our computation, the main results are robust across a wide range of parameter values $c_j$, $Q_j$, $I$, and $x$.

Table 1. Results of scenarios F and E.

Scenario   FS     ΔTB    ΔDC    x_m^i   c_m     x_m^d   ΔE     ΔD_a    ΔD_tar
F          0.23   0.46   0.24   2.51    5.71    0.99    0.86   1.09    −1.72
E          1.63   5.19   3.56   5.18    15.95   2.77    7.73   −2.60   −3.50

Scenario   CS_a     CS_tar   CS_m     L_a     L_tar   L_m
F          0.0292   0.0207   0.0253   0.739   0.654   0.705
E          0.0105   0.0079   0.0418   0.474   0.403   0.819

There are three interesting findings. First, consider the financial synergy and M&A threshold. We find that financial synergy can be positive in both scenarios. In other words, purely financial synergy by itself can motivate M&A. Both the tax benefits and default costs increase; however, the increase in tax benefits is much larger than that in default costs, resulting in positive financial synergy. Moreover, in comparison to scenario F, the M&A threshold is higher and the financial synergy is larger in scenario E. Because $x_m^i = 5.18$ in scenario E is much higher than $x_a^0 = 2.74$ and $x_{tar}^0 = 2.19$, the distortion of $V_a(x_m^i)$ and $V_{tar}(x_m^i)$ with initial coupons from those with optimal coupons is larger. Therefore, the financial synergy defined in Eq. (32) is larger in scenario E.

$^{10}$ Values are first calculated at $x_m^i$, and then multiplied by the M&A probability $(x/x_m^i)^{\beta}$. The credit spread and leverage are calculated at $x_m^i$.


Claim 4.1. When operational synergy is zero, purely financial synergy can motivate M&A in both scenario F and scenario E.

This result differs from that of Leland (2007), who assumes two separate firms with no initial assets in place. With the assumption that M&A timing is exogenously given as the current time, Leland (2007) concludes that purely financial synergy by itself is insufficient to justify M&A in many cases. By contrast, we assume two separate firms with initial assets in place. By deriving M&A timing endogenously, we find that purely financial synergy can motivate M&A in both scenarios. We therefore demonstrate that financial synergy hinges in large part on whether M&A timing is exogenously given or endogenously determined.

Second, consider the changes in coupon and values. In scenario F, although the coupon after M&A increases, the default threshold $x_m^d = 0.99$ lies between $x_a^d = 1.09$ and $x_{tar}^d = 0.89$. Therefore, the default threshold decreases and the existing debt value increases from the viewpoint of the acquiring firm with excessive debt. Although the M&A cost is fully borne by acquiring equityholders, a part of the increase in the total firm value accrues to existing debtholders. This wealth transfer discourages equityholders from exercising the M&A option at a lower threshold in scenario F. This reflects the debt overhang problem discussed in Myers (1977) and Sundaresan and Wang (2007), which may delay or prevent an investment decision that would improve the total firm value. In scenario E, the default threshold increases and the existing debt value decreases. The reason is that acquiring equityholders appropriate the benefits from existing debtholders by issuing a significant amount of new debt and increasing the leverage of the merged firm.$^{11}$ That is the so-called risk shifting problem discussed in Jensen and Meckling (1976). The equity value increases in both scenarios, which ensures the participation constraint of equityholders in M&A.

Third, consider the changes in leverage and credit spread. In scenario F, although the coupon after M&A increases a little, the default threshold is between those of the two firms before M&A. Therefore, both the leverage and credit spread are also between those of the two firms before M&A. On the other hand, in scenario E, because the coupon level increases significantly and the default threshold increases, both the leverage and credit spread increase. In fact, scenario F corresponds to a situation where debt is issued with covenants protecting the existing debtholders, while scenario E corresponds to LBOs. In LBOs, acquirers issue a significant amount of debt to pay for M&A and then use the cash flows of the target firm to pay off debt over time. After LBOs, firms usually have high leverage, and the debt usually is below investment grade. From the perspective of existing debtholders, LBOs represent a fundamental shift in the firm’s risk profile and result in a decrease in debt value.$^{12}$ However, our results demonstrate that the loss in debt values is not large enough to explain the gain in equity values. This is consistent with the empirical findings documented in Brealey et al. (2008).

$^{11}$ Although we assumed that both existing debt and newly issued debt have equal priority at the default threshold, even with seniority provisions, existing debtholders lose value when new debt is issued. Ziegler (2004) demonstrates that seniority provisions do protect existing debtholders against losing value to new debtholders; however, they do not protect existing debtholders against wealth transfers driven by changes in the timing and probability of default.

To examine the effect of uncertainty on the optimal M&A threshold, Fig. 1 plots M&A thresholds for varying volatilities of the price process. We find that in scenario E, the optimal M&A threshold increases with uncertainty. By contrast, in scenario F, the optimal M&A threshold increases with uncertainty at first, and then decreases with uncertainty. The intuition is as follows. The uncertainty has two countervailing effects on the optimal M&A threshold. One is the usual positive effect explained in the standard real options model (all-equity firm without default). Higher uncertainty implies a larger option value of waiting to exercise the M&A option. Therefore, the M&A threshold increases with uncertainty. The other is a negative effect because of the existence of the lower default threshold before M&A. As Fig. 2 shows (with parameters $x = 2.3$, $y = 2.5$, $z = 1.8$), the present value of claim $L(x; y, z)$ in Eq. (14) (pay $1 contingent on $x$ reaching the lower threshold $z$ before reaching the upper threshold $y$) increases with uncertainty. On the other hand, the present value of claim $H(x; y, z)$ in Eq. (14) (pay $1 contingent on $x$ reaching the upper threshold $y$ before reaching the lower threshold $z$) changes little with uncertainty. Since the probability of hitting the default threshold before M&A increases, there is an incentive for equityholders to exercise M&A earlier, which induces a lower M&A threshold. In scenario E, irrespective of the uncertainty level, the positive effect dominates the negative effect; in scenario F, the negative effect becomes stronger as uncertainty increases and begins to dominate the positive effect once uncertainty exceeds a certain level.

[Figure 1. The effects of uncertainty on M&A threshold. Vertical axis: $x_m^i$; curves: scenario F and scenario E.]

$^{12}$ A famous LBO is the acquisition of RJR Nabisco by Kohlberg Kravis Roberts (KKR) in the late 1980s, which illustrates the wealth transfer from the existing debtholders to equityholders.

[Figure 2. The effects of uncertainty on contingent claims $H$ and $L$. Horizontal axis: $\sigma$; vertical axis: $H(x; y, z)$, $L(x; y, z)$.]
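The pattern described for Fig. 2 can be reproduced with a short sweep over volatility. The sketch below evaluates Eq. (14) at $x = 2.3$, $y = 2.5$, $z = 1.8$ for a grid of $\sigma$ values, assuming $\mu = 0.01$ and $r = 0.06$ as in Section 4; the grid and the helper name `claims` are illustrative choices.

```python
import math

MU, R = 0.01, 0.06
X, Y, Z = 2.3, 2.5, 1.8   # parameters quoted for Fig. 2


def claims(sigma):
    """Return (H, L) of Eq. (14) for a given volatility sigma."""
    a = MU - 0.5 * sigma ** 2
    disc = math.sqrt(a ** 2 + 2.0 * sigma ** 2 * R)
    beta, gamma = (-a + disc) / sigma ** 2, (-a - disc) / sigma ** 2
    den = Z ** gamma * Y ** beta - Z ** beta * Y ** gamma
    h = (Z ** gamma * X ** beta - Z ** beta * X ** gamma) / den
    low = (X ** gamma * Y ** beta - X ** beta * Y ** gamma) / den
    return h, low


if __name__ == "__main__":
    for sigma in (0.1, 0.2, 0.3, 0.4, 0.5):
        h, low = claims(sigma)
        print(f"sigma={sigma:.1f}  H={h:.3f}  L={low:.3f}")
```

Between $\sigma = 0.1$ and $\sigma = 0.5$ the value of $L$ rises from about 0.18 to about 0.28, while $H$ stays near 0.71, which is the asymmetry behind the negative effect of uncertainty on the M&A threshold.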

5. Conclusions

This paper developed a continuous-time model to examine financial synergy when M&A timing is determined endogenously. We demonstrated that purely financial synergy can motivate M&A in both scenarios. However, the optimal M&A timing is delayed and financial synergy is larger in scenario E.

The analysis in this paper is suitable for settings where the firm receives a new growth option (such as M&A) unexpectedly. Our theoretical model generates implications that are consistent with empirical evidence in corporate finance. One implication is the debt overhang problem. While total firm value increases through M&A, a part of the value created from exercising the M&A option goes to existing debtholders. This ex post wealth transfer discourages equityholders from exercising the M&A option at the optimal timing in scenario F, because the M&A cost is fully borne by equityholders. Another implication is the risk shifting problem. The existence of debtholders already in place creates an incentive for equityholders to issue a significant amount of new debt, which results in higher default risk. Our results also have implications for empirical work that examines the sources of M&A synergies. The parameters mentioned above, such as the tax rate and default costs, which can create substantial financial synergy, should be included as possible explanatory variables.

Lastly, we point out an important but difficult topic for future research. While our paper considered the situation where firms receive the M&A option unexpectedly, an analysis in which firms are able to anticipate a future growth option could endogenously derive the initial capital structure that deters ex post inefficiency. We will consider this problem in the future.

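Appendix A

The general solution of ODE (3) is:

\[
E_m(x) = A_{+}x^{\beta} + A_{-}x^{\gamma} + (1-\tau)\left[\frac{Q_m x}{r-\mu} - \frac{c_m}{r}\right], \tag{A.1}
\]

where $\beta$ and $\gamma$ are the positive and negative roots of the quadratic equation $\tfrac{1}{2}\sigma^2 y^2 + (\mu - \tfrac{1}{2}\sigma^2)y - r = 0$.

According to the no-bubbles condition, $A_{+}$ must equal zero. From the value-matching and smooth-pasting conditions, we know that:

\[
\begin{aligned}
A_{-}(x_m^d)^{\gamma} + \frac{1-\tau}{r-\mu}\,Q_m x_m^d - (1-\tau)\frac{c_m}{r} &= 0,\\
A_{-}\gamma\,(x_m^d)^{\gamma-1} + \frac{1-\tau}{r-\mu}\,Q_m &= 0.
\end{aligned}
\tag{A.2}
\]

Solving the equations above yields the default threshold and equity value. The debt value can be obtained similarly.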
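For completeness, the elimination step can be written out. The following is a worked sketch, assuming the coupon term enters after tax as $(1-\tau)c_m/r$, consistent with Eq. (16), and writing $\Pi_m(x) = (1-\tau)Q_m x/(r-\mu)$:

```latex
\begin{align*}
\text{Smooth pasting:}\quad
  & A_{-}(x_m^d)^{\gamma} = -\frac{1}{\gamma}\,\frac{(1-\tau)Q_m x_m^d}{r-\mu}.\\[1ex]
\text{Substituting into value matching:}\quad
  & \frac{\gamma-1}{\gamma}\,\frac{(1-\tau)Q_m x_m^d}{r-\mu} = (1-\tau)\frac{c_m}{r}
  \;\;\Longrightarrow\;\;
  x_m^d = \frac{\gamma}{\gamma-1}\,\frac{r-\mu}{r}\,\frac{c_m}{Q_m}.\\[1ex]
\text{Hence:}\quad
  & A_{-} = -\left[\Pi_m(x_m^d) - (1-\tau)\frac{c_m}{r}\right](x_m^d)^{-\gamma},\\[1ex]
  & E_m(x) = \Pi_m(x) - (1-\tau)\frac{c_m}{r}
     - \left[\Pi_m(x_m^d) - (1-\tau)\frac{c_m}{r}\right]\left(\frac{x}{x_m^d}\right)^{\gamma}.
\end{align*}
```

The threshold has the same form as Eq. (17) and the equity value has the same form as Eq. (16), with the merged firm's $c_m$ and $Q_m$ in place of the target's parameters.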

Appendix B

Because $H(x; y, z)$ is a claim that receives no dividend, we know from (A.1) that $H(x; y, z)$ is of the form:

\[
H(x; y, z) = A_{+}x^{\beta} + A_{-}x^{\gamma}. \tag{A.3}
\]

Substituting (A.3) into the boundary conditions

\[
H(y; y, z) = 1, \qquad H(z; y, z) = 0,
\]

we obtain that

\[
H(x; y, z) = \frac{z^{\gamma}x^{\beta} - z^{\beta}x^{\gamma}}{z^{\gamma}y^{\beta} - z^{\beta}y^{\gamma}}.
\]

Similarly, $L(x; y, z)$ can be derived as

\[
L(x; y, z) = \frac{x^{\gamma}y^{\beta} - x^{\beta}y^{\gamma}}{z^{\gamma}y^{\beta} - z^{\beta}y^{\gamma}}.
\]

References

1. Brealey, R. A., Myers, S. C., and Allen, F. (2008), Principles of Corporate Finance, 9th Revised Edition, McGraw-Hill, New York.
2. Dixit, A. and Pindyck, R. (1994), Investment under Uncertainty, Princeton University Press, Princeton, NJ.
3. Ghosh, A. and Jain, P. (2000), “Financial leverage changes associated with corporate mergers,” Journal of Corporate Finance, 6, 377–402.
4. Gilson, S. (1997), “Transaction costs and capital structure choice: Evidence from financially distressed firms,” Journal of Finance, 52, 161–196.
5. Goldstein, R., Ju, N., and Leland, H. (2001), “An EBIT-based model of dynamic capital structure,” Journal of Business, 74, 483–512.
6. Jensen, M. C. and Meckling, W. H. (1976), “Theory of the firm: managerial behavior, agency costs and ownership structure,” Journal of Financial Economics, 3, 305–360.
7. Kemsley, D. and Nissim, D. (2002), “Valuation of the debt tax shields,” Journal of Finance, 57, 2045–2073.
8. Lambrecht, B. M. (2004), “The timing and terms of mergers motivated by economies of scale,” Journal of Financial Economics, 72, 41–62.
9. Leland, H. E. (1994), “Corporate debt value, bond covenants, and optimal capital structure,” Journal of Finance, 49, 1213–1252.
10. Leland, H. E. (2007), “Financial synergies and the optimal scope of the firm: Implications for mergers, spinoffs, and structured finance,” Journal of Finance, 62, 765–807.
11. Lewellen, W. (1971), “A pure financial rationale for the conglomerate merger,” Journal of Finance, 26, 521–537.
12. Modigliani, F. and Miller, M. (1958), “The cost of capital, corporation finance and the theory of investment,” American Economic Review, 48, 261–297.
13. Morellec, E. (2004), “Can managerial discretion explain observed leverage ratios?” Review of Financial Studies, 17, 257–294.
14. Morellec, E. and Zhdanov, A. (2008), “Financing and takeovers,” Journal of Financial Economics, 87, 556–581.
15. Myers, S. (1977), “Determinants of corporate borrowing,” Journal of Financial Economics, 5, 147–175.
16. Rhodes-Kropf, M. and Robinson, D. (2004), “The market for mergers and the boundaries of the firm,” Working paper, Utrecht University.
17. Scott, J. (1977), “On the theory of corporate mergers,” Journal of Finance, 32, 1235–1250.
18. Shastri, K. (1990), “The differential effects of mergers on corporate security values,” Research in Finance, 8, 179–201.
19. Shibata, T. and Nishihara, M. (2010), “Dynamic investment and capital structure under manager-shareholder conflict,” Journal of Economic Dynamics and Control, 34, 158–178.
20. Strebulaev, I. (2007), “Do tests of capital structure mean what they say?” Journal of Finance, 62, 1747–1787.
21. Sundaresan, S. and Wang, N. (2007), “Dynamic investment, capital structure, and debt overhang,” Working paper, Columbia University.
22. Weiss, L. A. (1990), “Bankruptcy resolution: direct costs and violation of priority of claims,” Journal of Financial Economics, 27, 285–314.
23. Ziegler, A. (2004), A Game Theory Analysis of Options: Corporate Finance and Financial Intermediation in Continuous Time, Springer, Berlin.
24. Zwiebel, J. (1996), “Dynamic capital structure under managerial entrenchment,” American Economic Review, 86, 1197–1215.