  • 8/7/2019 ItoIntegralNew

    1/37

    NOTES ON THE ITO CALCULUS

    STEVEN P. LALLEY

1. Continuous-Time Processes: Progressive Measurability

1.1. Progressive Measurability. Throughout these notes, $(\Omega, \mathcal{F}, P)$ will be a fixed probability space, and all random variables will be assumed to be $\mathcal{F}$-measurable. A filtration of the probability space is an increasing family $\mathbb{F} := \{\mathcal{F}_t\}_{t \in J}$ of sub-$\sigma$-algebras of $\mathcal{F}$ indexed by $t \in J$, where $J$ is an interval, usually $J = [0, \infty)$. The filtration $\mathbb{F}$ is said to be complete if each $\mathcal{F}_t$ contains all sets of measure 0, and is right-continuous if $\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s$. A standard filtration is a filtration that is both complete and right-continuous.

Definition 1. A stochastic process $\{X_t\}_{t \in J}$ is said to be adapted to the filtration $\mathbb{F}$ if for each $t$ the random variable $X_t$ is measurable relative to $\mathcal{F}_t$.

Definition 2. The minimal (sometimes called the natural) filtration associated with a stochastic process $\{X_t\}_{t \geq 0}$ is the smallest filtration $\mathbb{F}^X := \{\mathcal{F}^X_t\}_{t \geq 0}$ to which $X_t$ is adapted.

Definition 3. A stochastic process $\{X_t\}_{t \geq 0}$ is said to be progressively measurable if for every $T$, when viewed as a function $X(t, \omega)$ on the product space $[0, T] \times \Omega$, it is measurable relative to the product $\sigma$-algebra $\mathcal{B}_{[0,T]} \otimes \mathcal{F}_T$.

The definition extends in an obvious way to stochastic processes $\{X_t\}_{t \in J}$ indexed by other intervals $J$. If a process $\{X_t\}_{t \in J}$ is progressively measurable then it is necessarily adapted. Checking that a process is progressively measurable is in principle more difficult than checking that it is adapted, and so you might wonder why the notion is of interest. One reason is this: we will want to integrate processes over time $t$, and we will be interested in when the random variables $Y = \int_J X_t \, dt$ obtained by integration are integrable, or are even random variables at all. Progressive measurability guarantees this. Following is a useful tool for checking that a stochastic process is progressively measurable. You should prove it, as well as Lemma 2 below, as an exercise.

Lemma 1. If $\{X_t\}_{t \in J}$ is an adapted process with right- or left-continuous paths, then it is progressively measurable.

What happens when a process is evaluated at a stopping time?

Lemma 2. Let $\tau$ be a stopping time relative to the filtration $\mathbb{F}$ and let $\{X_t\}_{t \in J}$ be progressively measurable. Then $X_\tau \mathbf{1}\{\tau \in J\}$ is measurable relative to $\mathcal{F}_\tau$.


Definition 4. Fix an interval $J = [0, T]$ or $J = [0, \infty)$. The class $\mathcal{V}^p = \mathcal{V}^p_J$ consists of all progressively measurable processes $X_t = X(t, \omega)$ on $J$ such that

(1) $\int_J E|X_t|^p \, dt < \infty.$

Lemma 3. If $X_s$ belongs to the class $\mathcal{V}^1_J$ then for almost every $\omega$ the function $t \mapsto X_t(\omega)$ is Borel measurable and integrable on $J$, and $X(t, \omega)$ is integrable relative to the product measure Lebesgue$\times P$. Furthermore, the process

(2) $Y_t := \int_{s \leq t} X_s \, ds$

is continuous in $t$ (for almost every $\omega$) and progressively measurable.

Proof. Assume first that the process $X$ is bounded. Fubini's theorem implies that for every $\omega$, the function $X_t(\omega)$ is an integrable, Borel measurable function of $t$. Thus, the process $Y_t$ is well-defined, and the dominated convergence theorem implies that $Y_t$ is continuous in $t$. To prove that the process $Y_t$ is progressively measurable it suffices, by Lemma 1, to check that it is adapted. This follows from Fubini's theorem.

The general case follows by a truncation argument. If $X \in \mathcal{V}^1$, then by Tonelli's theorem $X(t, \omega)$ is integrable relative to the product measure Lebesgue$\times P$, and Fubini's theorem implies that $X_t(\omega)$ is Borel measurable in $t$ and Lebesgue integrable for $P$-almost every $\omega$. Denote by $X^m$ the truncation of $X$ at $\pm m$. Then the dominated convergence theorem implies that for $P$-almost every $\omega$ the integrals

$Y^m_t(\omega) := \int_{s \leq t} X^m_s(\omega) \, ds$

converge to the integral of $X_s$. Define $Y_t(\omega) = 0$ for those pairs $(t, \omega)$ for which the sequence $Y^m_t(\omega)$ does not converge. Then the process $Y_t(\omega)$ is well-defined, and since it is the almost everywhere limit of the progressively measurable processes $Y^m_t(\omega)$, it is progressively measurable. $\square$

1.2. Approximation of progressively measurable processes. The usual procedure for defining an integral is to begin by defining it for a class of simple integrands and then extending the definition to a larger, more interesting class of integrands by approximation. To define the Ito integral (section 3 below), we will extend from a simple class of integrands called elementary processes. A stochastic process $V(t) = V_t$ is called elementary relative to the filtration $\mathbb{F}$ if it has the form

(3) $V_t = \sum_{j=0}^{K} \xi_j \mathbf{1}_{(t_j, t_{j+1}]}(t)$

where $0 = t_0 < t_1 < \cdots < t_{K+1} < \infty$, and for each index $j$ the random variable $\xi_j$ is measurable relative to $\mathcal{F}_{t_j}$. Every elementary process is progressively measurable, and the class of elementary processes is closed under addition and multiplication (thus, it is an algebra).

Exercise 1. Let $\tau$ be a stopping time that takes values in a finite set $\{t_i\}$. (Such stopping times are called elementary.) Show that the process $V_t := \mathbf{1}_{(0, \tau]}(t)$ is elementary.


Lemma 4. For any $p \in [1, \infty)$, the elementary processes are $L^p$-dense in the space $\mathcal{V}^p$. That is, for any $Y \in \mathcal{V}^p$ there is a sequence $V_n$ of elementary functions such that

(4) $E \int_J |Y(t) - V_n(t)|^p \, dt \to 0.$

The proof requires an intermediate approximation:

Lemma 5. Let $X$ be a progressively measurable process such that $|X(t, \omega)| \leq K$ for some constant $K < \infty$. Then there exists a sequence $X^m$ of progressively measurable processes with continuous paths, all uniformly bounded by $K$, such that

(5) $X(t, \omega) = \lim_{m \to \infty} X^m(t, \omega) \quad$ a.e.

Note 1. Øksendal mistakenly asserts that the same is true for adapted processes (see Step 2 in his proof of Lemma 3.1.5). This is unfortunate, as this is the key step in the construction of the Ito integral.

Proof. This is a consequence of the Lebesgue differentiation theorem. Since $X$ is bounded, Lemma 3 implies that the indefinite integral $Y_t$ of $X_s$ is progressively measurable and has continuous paths. Define

$X^m_t = m(Y_t - Y_{(t - 1/m)^+});$

then $X^m$ is progressively measurable, has continuous paths, and is uniformly bounded by $K$ (because it is the average of $X_s$ over the interval $[t - 1/m, t]$). The Lebesgue differentiation theorem implies that $X^m(t, \omega) \to X(t, \omega)$ for almost every pair $(t, \omega)$. $\square$

Proof of Lemma 4. Any process in the class $\mathcal{V}^p$ can be arbitrarily well-approximated in $L^p$-norm by bounded progressively measurable processes. (Proof: truncate at $\pm m$, and use the dominated convergence theorem.) Thus, it suffices to prove the lemma for bounded processes. But any bounded process can be approximated by processes with continuous paths, by Lemma 5. Thus, it suffices to prove the lemma for bounded, progressively measurable processes with continuous sample paths.

Suppose, then, that $Y_t(\omega)$ is bounded and continuous in $t$. For each $n = 1, 2, \ldots$, define

$V_n(t) = \sum_{j=0}^{2^{2n} - 1} Y(j/2^n) \mathbf{1}_{[j/2^n, (j+1)/2^n)}(t);$

this is an elementary process, and because $Y$ is bounded, $V_n \in \mathcal{V}^p$. Since $Y$ is continuous in $t$, for each $t < 2^n$,

$\lim_{n \to \infty} V_n(t) = Y_t,$

and hence the $L^p$ convergence (4) follows by the bounded convergence theorem. $\square$

1.3. Admissible filtrations for Wiener processes. Recall that a $k$-dimensional Wiener process is an $\mathbb{R}^k$-valued stochastic process $W(t) = (W_i(t))_{1 \leq i \leq k}$ whose components $W_i(t)$ are independent standard 1-dimensional Wiener processes. It will, in certain situations, be advantageous to allow filtrations larger than the minimal filtration for $W(t)$. However, for the purposes of stochastic calculus we will want to deal only with filtrations $\mathbb{F}$ such that


the processes $W(t)$, $W(t)^2 - kt$, etc. are martingales (as they are for the minimal filtration). For this reason, we shall restrict attention to filtrations with the following property:

Definition 5. A filtration $\mathbb{F} = (\mathcal{F}_t)_{0 \leq t < \infty}$ is admissible for the Wiener process $W(t)$ if $W$ is adapted to $\mathbb{F}$ and if for every $t \geq 0$ the post-$t$ process $(W(t + s) - W(t))_{s \geq 0}$ is independent of $\mathcal{F}_t$.

[...]

Proposition 2. If $M_t$ is an $L^p$-bounded martingale for some $p > 1$ then it is uniformly integrable and closed in $L^p$, that is,

(8) $\lim_{t \to \infty} M_t := M_\infty$

exists a.s. and in $L^p$, and for each $t < \infty$,

(9) $M_t = E(M_\infty \mid \mathcal{F}_t).$

Note 2. This is not true for $p = 1$: the double-or-nothing martingale is an example of a nonnegative martingale that converges a.s. to zero, but not in $L^1$.

Proposition 3. (Doob's Maximal Inequality) If $M_t$ is closed in $L^p$ for some $p \geq 1$ then

(10) $P\{\sup_{t \in \mathbb{Q}_+} |M_t| \geq \alpha\} \leq E|M_\infty|^p / \alpha^p$ and

(11) $P\{\sup_{t \leq T, \, t \in \mathbb{Q}_+} |M_t| \geq \alpha\} \leq E|M_T|^p / \alpha^p.$

Note 3. The restriction $t \in \mathbb{Q}_+$ is only to guarantee that the sup is a random variable; if the martingale is right-continuous then the sup may be replaced by $\sup_{t \geq 0}$. Also, the second inequality (11) follows directly from the first, because the martingale $M_t$ can be frozen at its value $M_T$ for all $t \geq T$.
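For the Wiener process with its minimal filtration, $M_t = W_t$ is a martingale with $EW_T^2 = T$, so (11) with $p = 2$ bounds the probability that the path exceeds a level $\alpha$ before time $T$. The following sketch (an illustration, not part of the notes; the parameter choices are arbitrary) checks the bound by Monte Carlo; the discretized maximum slightly undercounts the true supremum.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths, alpha = 1.0, 500, 5000, 2.0
dt = T / n_steps

# Simulate Brownian paths and the running maximum of |W_t| on a discrete grid.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
paths = np.cumsum(increments, axis=1)
sup_abs = np.abs(paths).max(axis=1)

# Doob bound (11) with p = 2: P{sup_{t<=T} |M_t| >= alpha} <= E M_T^2 / alpha^2.
empirical = float((sup_abs >= alpha).mean())
doob_bound = T / alpha**2  # E W_T^2 = T for Brownian motion

print(empirical, doob_bound)
```

The empirical exceedance probability comes out well below the Doob bound, as it must; the bound is not tight here since the reflection principle gives the exact value.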


2.2. Martingales with continuous paths.

Definition 6. Let $\mathcal{M}^2$ be the linear space of all $L^2$-bounded martingales $M_t$ with continuous paths and initial values $M_0 = 0$.

[...] for every $\varepsilon > 0$ there exists $\delta > 0$ so that if $\max_j |I_j| < \delta$ then $\max_j |\Delta_j(F)| < \varepsilon$, and so

$\sum_j |\Delta_j(F)|^2 \leq C\varepsilon.$


[...] when one adds up these infinitesimal increments, one gets $\xi_j$ times the total increment of the Brownian motion over this time period.

Properties of the Ito Integral:
(A) Linearity: $I_t(aV + bU) = aI_t(V) + bI_t(U)$.
(B) Measurability: $I_t(V)$ is adapted to $\mathbb{F}$.
(C) Continuity: $t \mapsto I_t(V)$ is continuous.

    These are all immediate from the definition.

Proposition 6. Assume that $V_t$ is elementary with representation (13), and assume that each of the random variables $\xi_j$ has finite second moment. Then

(15) $E\left(\int_0^t V \, dW\right) = 0$ and

(16) $E\left(\int_0^t V \, dW\right)^2 = \int_0^t EV_s^2 \, ds.$

    Proof. Exercise!
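Identities (15) and (16) can also be checked numerically for a concrete elementary integrand. The sketch below (an illustration, not part of the notes) takes $V$ to be the dyadic discretization of $W$ itself, so that $\xi_j = W_{t_j}$ is $\mathcal{F}_{t_j}$-measurable, and compares the empirical mean and second moment of $\int_0^1 V \, dW$ with $0$ and $\int_0^1 EV_s^2 \, ds$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_paths = 6, 50_000          # 2^n dyadic intervals of [0, 1]
m = 2**n
dt = 1.0 / m

# Elementary integrand V_t = W_{t_j} on (t_j, t_{j+1}]; xi_j = W_{t_j} is F_{t_j}-measurable.
dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, m))
W_left = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)[:, :-1]], axis=1)

# Ito integral of an elementary process: sum over j of xi_j * (W_{t_{j+1}} - W_{t_j}).
ito = (W_left * dW).sum(axis=1)

# Predicted second moment (16): int_0^1 E V_s^2 ds = sum_j (t_j) * dt.
predicted_second_moment = sum(k * dt for k in range(m)) * dt

print(ito.mean(), ito.var(), predicted_second_moment)
```

The sample mean is near 0 and the sample variance near the predicted value, in line with (15) and (16).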

The equality (16) is of crucial importance: it asserts that the mapping that takes the process $V$ to its Ito integral at any time $t$ is an $L^2$-isometry relative to the $L^2$-norm for the product measure Lebesgue$\times P$. This will be the key to extending the integral to a wider class of integrands. The simple calculations that lead to (15) and (16) also yield the following useful information about the process $I_t(V)$:

Proposition 7. Assume that $V_t$ is elementary with representation (13), and assume that each of the random variables $\xi_j$ has finite second moment. Then $I_t(V)$ is an $L^2$-martingale relative to $\mathbb{F}$. Furthermore, if

(17) $[I(V)]_t := \int_0^t V_s^2 \, ds,$

then $I_t(V)^2 - [I(V)]_t$ is a martingale.

Note: The process $[I(V)]_t$ is called the quadratic variation of the martingale $I_t(V)$. The square bracket notation is standard in the literature.

Proof. First recall that a linear combination of martingales is a martingale, so to prove that $I_t(V)$ is a martingale it suffices to consider elementary functions $V_t$ with just one step:

$V_t = \xi \mathbf{1}_{(s, r]}(t)$

with $\xi$ measurable relative to $\mathcal{F}_s$. For such a process $V$ the integral $I_t(V)$ is zero for all $t \leq s$, and $I_t(V) = I_r(V)$ for all $t \geq r$, so to show that $I_t(V)$ is a martingale it is only necessary to check that $E(I_t(V) \mid \mathcal{F}_u) = I_u(V)$ for $s \leq u < t \leq r$, that is,

$E(\xi(W_t - W_s) \mid \mathcal{F}_u) = \xi(W_u - W_s).$

But this follows routinely from basic properties of conditional expectation, since $\xi$ is measurable relative to $\mathcal{F}_s$ and $W_t$ is a martingale with respect to $\mathbb{F}$.


It is only slightly more difficult to check that $I_t(V)^2 - [I(V)]_t$ is a martingale (you have to decompose a sum of squares). This is left as an exercise. $\square$

3.2. Extension to the class $\mathcal{V}_T$. Fix $T \leq \infty$. Recall that the class $\mathcal{V}^2 = \mathcal{V}^2_T$ consists of all progressively measurable processes $V_t$, defined for $t \leq T$, such that

(18) $\|V\|^2 = \|V\|^2_{\mathcal{V}_T} = \int_0^T EV_s^2 \, ds < \infty.$

Since $\mathcal{V}^2$ is the only one of the $\mathcal{V}^p$ spaces that will play a role in the following discussion, I will henceforth drop the superscript and just write $\mathcal{V} = \mathcal{V}_T$.

The norm $\|\cdot\|$ is just the standard $L^2$-norm for the product measure Lebesgue$\times P$. Consequently, the class $\mathcal{V}_T$ is a Hilbert space; in particular, every Cauchy sequence in the space $\mathcal{V}_T$ has a limit. The class $\mathcal{E}_T$ of bounded elementary functions is a linear subspace of $\mathcal{V}_T$. Lemma 4 implies that $\mathcal{E}_T$ is a dense linear subspace, a crucial fact that we now record:

Proposition 8. The set $\mathcal{E}_T$ is dense in $\mathcal{V}_T$; that is, for every $V \in \mathcal{V}_T$ there is a sequence of bounded elementary functions $V_n \in \mathcal{E}_T$ such that

(19) $\lim_{n \to \infty} \int_0^T E|V(t) - V_n(t)|^2 \, dt = 0.$

Corollary 1. (Ito Isometry) The Ito integral $I_t(V)$ defined by (14) extends to all integrands $V \in \mathcal{V}_T$ in such a way that for each $t \leq T$ the mapping $V \mapsto I_t(V)$ is a linear isometry from the space $\mathcal{V}_t$ to the $L^2$-space of square-integrable random variables. In particular, if $V_n$ is any sequence of bounded elementary functions such that $\|V_n - V\| \to 0$, then for all $t \leq T$,

(20) $I_t(V) = \int_0^t V \, dW := L^2\text{-}\lim_{n \to \infty} \int_0^t V_n \, dW$

exists and is independent of the approximating sequence $V_n$.

Proof. If $V_n \to V$ in the norm (18) then the sequence $V_n$ is Cauchy with respect to this norm. Consequently, by Proposition 6, the sequence of random variables $I_t(V_n)$ is Cauchy in $L^2(P)$, and so it has an $L^2$-limit. Linearity and uniqueness of the limit both follow by routine $L^2$-arguments. $\square$

This extended Ito integral inherits all of the properties of the Ito integral for elementary functions. Following is a list of these properties. Assume that $V \in \mathcal{V}_T$ and $t \leq T$.

Properties of the Ito Integral:
(A) Linearity: $I_t(aV + bU) = aI_t(V) + bI_t(U)$.
(B) Measurability: $I_t(V)$ is progressively measurable.
(C) Continuity: $t \mapsto I_t(V)$ is continuous (for some version).
(D) Mean: $EI_t(V) = 0$.
(E) Variance: $EI_t(V)^2 = \|V\|^2_{\mathcal{V}_t}$.
(F) Martingale Property: $\{I_t(V)\}_{t \leq T}$ is an $L^2$-martingale.


(G) Quadratic Martingale Property: $\{I_t(V)^2 - [I(V)]_t\}_{t \leq T}$ is an $L^1$-martingale, where

(21) $[I(V)]_t := \int_0^t V_s^2 \, ds.$

All of these, with the exception of (C), follow routinely from (20) and the corresponding properties of the integral for elementary functions by easy arguments using DCT and the like. Property (C) follows from Proposition 4 in Section 2, because for elementary $V_n$ the process $I_t(V_n)$ has continuous paths.

3.3. Quadratic Variation and $\int_0^T W \, dW$. There are tools for calculating stochastic integrals that usually make it unnecessary to use the definition of the Ito integral directly. The most useful of these, the Ito formula, will be discussed in the following sections. It is instructive, however, to do one explicit calculation using only the definition. This calculation will show (i) that the Fundamental Theorem of Calculus does not hold for Ito integrals; and (ii) the central importance of the Quadratic Variation formula in the Ito calculus. The Quadratic Variation formula, in its simplest guise, is this:

Proposition 9. For any $T > 0$,

(22) $P\text{-}\lim_{n \to \infty} \sum_{k=0}^{2^n - 1} (\Delta^n_k W)^2 = T,$

where

$\Delta^n_k W := W\left(\frac{(k+1)T}{2^n}\right) - W\left(\frac{kT}{2^n}\right).$

Proof. For each fixed $n$, the increments $\Delta^n_k W$ are independent, identically distributed Gaussian random variables with mean zero and variance $2^{-n}T$. Hence, the result follows from the WLLN for $\chi^2$ random variables. $\square$

Exercise 2. Prove that the convergence holds almost surely. Hint: Borel-Cantelli and exponential estimates.
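Proposition 9 is easy to observe on a simulated path (an illustration, not part of the notes): the sum of squared increments over dyadic partitions of $[0, T]$ concentrates near $T$ as the mesh shrinks.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n_max = 2.0, 14

# One Brownian path sampled at the finest dyadic mesh T / 2^n_max.
dW = rng.normal(0.0, np.sqrt(T / 2**n_max), size=2**n_max)

def quad_var(level):
    """Sum of squared increments of W over the dyadic partition with 2^level cells."""
    block = 2**(n_max - level)
    coarse = dW.reshape(2**level, block).sum(axis=1)  # increments at the coarser mesh
    return float((coarse**2).sum())

qv = {n: quad_var(n) for n in (4, 8, 14)}
print(qv)
```

At level $n$ the sum has mean $T$ and standard deviation $\sqrt{2}\,T/2^{n/2}$, so the level-14 value sits much closer to $T = 2$ than the level-4 value typically does.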

Exercise 3. Let $f : \mathbb{R} \to \mathbb{R}$ be a continuous function with compact support. Show that

$\lim_{n \to \infty} \sum_{k=0}^{2^n - 1} f(W(kT/2^n)) (\Delta^n_k W)^2 = \int_0^T f(W(s)) \, ds.$

Exercise 4. Let $W_1(t)$ and $W_2(t)$ be independent Wiener processes. Prove that

$P\text{-}\lim_{n \to \infty} \sum_{k=0}^{2^n - 1} (\Delta^n_k W_1)(\Delta^n_k W_2) = 0.$

Hint: $(W_1(t) + W_2(t))/\sqrt{2}$ is a standard Wiener process.

The Wiener process $W_t$ is itself in the class $\mathcal{V}_T$, for every $T < \infty$, because

$\|W\|^2 = \int_0^T EW_s^2 \, ds = \int_0^T s \, ds = \frac{T^2}{2} < \infty.$


Thus, by Corollary 1, the integral $\int_0^T W \, dW$ is defined and an element of $L^2$. To evaluate it, we will use the most obvious approximation of $W_s$ by elementary functions. For simplicity, set $T = 1$. Let $W^{(n)}_s$ be the elementary function whose jumps are at the dyadic rationals $1/2^n, 2/2^n, 3/2^n, \ldots$, and whose value in the interval $[k/2^n, (k+1)/2^n)$ is $W(k/2^n)$: that is,

$W^{(n)}_s = \sum_{k=0}^{2^n - 1} W(k/2^n) \mathbf{1}_{[k/2^n, (k+1)/2^n)}(s).$

Lemma 6. $\lim_{n \to \infty} \int_0^1 E(W_s - W^{(n)}_s)^2 \, ds = 0.$

Proof. Since the simple process $W^{(n)}_s$ takes the value $W(k/2^n)$ for all $s \in [k/2^n, (k+1)/2^n)$,

$\int_0^1 E(W_s - W^{(n)}_s)^2 \, ds = \sum_{k=0}^{2^n - 1} \int_{k/2^n}^{(k+1)/2^n} E(W_s - W_{k/2^n})^2 \, ds$
$= \sum_{k=0}^{2^n - 1} \int_{k/2^n}^{(k+1)/2^n} (s - k/2^n) \, ds$
$\leq \sum_{k=0}^{2^n - 1} 2^{-2n} = 2^n/2^{2n} \to 0. \qquad \square$

Corollary 1 now implies that the stochastic integral $\int W_s \, dW_s$ is the limit of the stochastic integrals $\int W^{(n)}_s \, dW_s$. Since $W^{(n)}_s$ is elementary, its stochastic integral is defined to be

$\int W^{(n)}_s \, dW_s = \sum_{k=0}^{2^n - 1} W_{k/2^n} (W_{(k+1)/2^n} - W_{k/2^n}).$

To evaluate this sum, we use the technique of summation by parts (the discrete analogue of integration by parts). Here, the technique takes the form of observing that the sum can


be modified slightly to give a sum that telescopes:

$W_1^2 = \sum_{k=0}^{2^n - 1} (W^2_{(k+1)/2^n} - W^2_{k/2^n})$
$= \sum_{k=0}^{2^n - 1} (W_{(k+1)/2^n} - W_{k/2^n})(W_{(k+1)/2^n} + W_{k/2^n})$
$= \sum_{k=0}^{2^n - 1} (W_{(k+1)/2^n} - W_{k/2^n})(W_{k/2^n} + W_{k/2^n}) + \sum_{k=0}^{2^n - 1} (W_{(k+1)/2^n} - W_{k/2^n})(W_{(k+1)/2^n} - W_{k/2^n})$
$= 2\sum_{k=0}^{2^n - 1} W_{k/2^n}(W_{(k+1)/2^n} - W_{k/2^n}) + \sum_{k=0}^{2^n - 1} (W_{(k+1)/2^n} - W_{k/2^n})^2.$

The first sum on the right side is $2\int W^{(n)}_s \, dW_s$, and so converges to $2\int_0^1 W_s \, dW_s$ as $n \to \infty$. The second sum is the same sum that occurs in the Quadratic Variation Formula (Proposition 9), and so converges, as $n \to \infty$, to 1. Therefore, $\int_0^1 W \, dW = (W_1^2 - 1)/2$. More generally,

(23) $\int_0^T W_s \, dW_s = \frac{1}{2}(W_T^2 - T).$

Note that if the Ito integral obeyed the Fundamental Theorem of Calculus, then the value of the integral would be

$\int_0^t W_s \, dW_s = \int_0^t W(s) W'(s) \, ds = \left. \frac{W_s^2}{2} \right|_0^t = \frac{W_t^2}{2}.$

Thus, formula (23) shows that the Ito calculus is fundamentally different from ordinary calculus.
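Formula (23) can be seen numerically (an illustrative sketch, not part of the notes): on a simulated path, the left-endpoint approximating sums $\sum_k W_{k/n}\,\Delta_k W$ land near $(W_1^2 - 1)/2$, not near the calculus answer $W_1^2/2$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2**16                       # number of subintervals of [0, 1]
dW = rng.normal(0.0, np.sqrt(1.0 / n), size=n)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Left-endpoint (Ito) approximating sum: sum_k W(k/n) * (W((k+1)/n) - W(k/n)).
ito_sum = float((W[:-1] * dW).sum())

# Formula (23) predicts the limit (W_1^2 - 1)/2; ordinary calculus would predict W_1^2/2.
ito_prediction = float(W[-1]**2 - 1.0) / 2.0
calculus_prediction = float(W[-1]**2) / 2.0
print(ito_sum, ito_prediction, calculus_prediction)
```

The two predictions differ by exactly $1/2$, and the simulated sum sides with the Ito value: the gap is precisely half the quadratic variation of the path, which is where the fundamental theorem of calculus fails.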

3.4. Stopping Rule for Ito Integrals.

Proposition 10. Let $V_t \in \mathcal{V}_T$ and let $\tau \leq T$ be a stopping time relative to the filtration $\mathbb{F}$. Then

(24) $\int_0^\tau V_s \, dW_s = \int_0^T V_s \mathbf{1}_{[0, \tau]}(s) \, dW_s.$

In other words, if the Ito integral $I_t(V)$ is evaluated at the random time $t = \tau$, the result is a.s. the same as the Ito integral $I_T(V\mathbf{1}_{[0,\tau]})$ of the truncated process $V_s \mathbf{1}_{[0, \tau]}(s)$.


Proof. First consider the special case where both $V_s$ and $\tau$ are elementary (in particular, $\tau$ takes values in a finite set). Then the truncated process $V_s \mathbf{1}_{[0,\tau]}(s)$ is elementary (Exercise 1 above), and so both sides of (24) can be evaluated using formula (14). It is routine to check that they give the same value (do it!).

Next, consider the case where $V$ is elementary and $\tau \leq T$ is an arbitrary stopping time. Then there is a sequence $\tau_m \leq T$ of elementary stopping times such that $\tau_m \downarrow \tau$. By path-continuity of $I_t(V)$ (property (C) above),

$\lim_{m \to \infty} I_{\tau_m}(V) = I_\tau(V).$

On the other hand, by the dominated convergence theorem, the sequence $V\mathbf{1}_{[0,\tau_m]}$ converges to $V\mathbf{1}_{[0,\tau]}$ in $\mathcal{V}_T$-norm, so by the Ito isometry,

$L^2\text{-}\lim_{m \to \infty} I_T(V\mathbf{1}_{[0,\tau_m]}) = I_T(V\mathbf{1}_{[0,\tau]}).$

Therefore, the equality (24) holds, since it holds for each $\tau_m$.

Finally, consider the general case $V \in \mathcal{V}_T$. By Proposition 8, there is a sequence $V_n$ of bounded elementary functions such that $V_n \to V$ in the $\mathcal{V}_T$-norm. Consequently, by the dominated convergence theorem, $V_n \mathbf{1}_{[0,\tau]} \to V\mathbf{1}_{[0,\tau]}$ in $\mathcal{V}_T$-norm, and so

$I_T(V_n \mathbf{1}_{[0,\tau]}) \to I_T(V\mathbf{1}_{[0,\tau]})$

in $L^2$, by the Ito isometry. But on the other hand, Doob's Maximal Inequality (see Proposition 4), together with the Ito isometry, implies that

$\max_{t \leq T} |I_t(V_n) - I_t(V)| \to 0$

in probability. The equality (24) follows. $\square$

Corollary 2. (Localization Principle) Let $\tau \leq T$ be a stopping time. Suppose that $V, U \in \mathcal{V}_T$ are two processes that agree up to time $\tau$, that is, $V_t \mathbf{1}_{[0,\tau]}(t) = U_t \mathbf{1}_{[0,\tau]}(t)$. Then

(25) $\int_0^\tau U \, dW = \int_0^\tau V \, dW.$

Proof. Immediate from Proposition 10. $\square$

3.5. Extension to the Class $\mathcal{W}_T$. Fix $T \leq \infty$. Define $\mathcal{W} = \mathcal{W}_T$ to be the class of all progressively measurable processes $V_t = V(t)$ such that

(26) $P\left\{\int_0^T V_s^2 \, ds < \infty\right\} = 1.$

Proposition 11. Let $V \in \mathcal{W}_T$, and for each $n \geq 1$ define $\tau_n = T \wedge \inf\{t : \int_0^t V_s^2 \, ds \geq n\}$. Then for each $n$ the process $V(t)\mathbf{1}_{[0,\tau_n]}(t)$ is an element of $\mathcal{V}_T$, and

(27) $\lim_{n \to \infty} \int_0^t V_s \mathbf{1}_{[0,\tau_n]}(s) \, dW_s =: \int_0^t V_s \, dW_s =: I_t(V)$

exists almost surely and varies continuously with $t \leq T$. The process $\{I_t(V)\}_{t \leq T}$ is called the Ito integral process associated to the integrand $V$.


    When there is no danger of confusion I will drop the boldface notation.

Theorem 2. (Multivariate Ito Formula) Let $u(t, x)$ be twice continuously differentiable in each $x_i$ and once continuously differentiable in $t$. If $W_t$ is a standard $k$-dimensional Wiener process, then

(31) $u(t, W_t) - u(0, 0) = \int_0^t u_s(s, W_s) \, ds + \int_0^t \nabla_x u(s, W_s) \cdot dW_s + \frac{1}{2} \int_0^t \Delta_x u(s, W_s) \, ds.$

Here $\nabla_x$ and $\Delta_x$ denote the gradient and Laplacian operators in the $x$-variables, respectively.

Proof. This is essentially the same as the proof in the univariate case. $\square$

Example 1. First consider the case of one variable, and let $u(t, x) = x^2$. Then $u_{xx} = 2$ and $u_t = 0$, and so the Ito formula gives another derivation of formula (23). Actually, the Ito formula will be proved in general by mimicking the derivation that led to (23), using a two-term Taylor series approximation for the increments of $u(t, W_t)$ over short time intervals.

Example 2. (Exponential Martingales.) Fix $\theta \in \mathbb{R}$, and let $u(t, x) = \exp\{\theta x - \theta^2 t/2\}$. It is readily checked that $u_t + u_{xx}/2 = 0$, so the two ordinary integrals in the Ito formula cancel, leaving just the stochastic integral. Since $u_x = \theta u$, the Ito formula gives

(32) $Z_\theta(t) = 1 + \theta \int_0^t Z_\theta(s) \, dW(s)$

where

$Z_\theta(t) := \exp\{\theta W(t) - \theta^2 t/2\}.$

Thus, the exponential martingale $Z_\theta(t)$ is a solution of the linear stochastic differential equation $dZ_t = \theta Z_t \, dW_t$.
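Both claims of Example 2 can be checked by simulation (an illustration, not part of the notes; $\theta = 1$ is an arbitrary choice): pathwise, $Z_\theta(1) - 1$ matches the left-endpoint Ito sum of $\theta Z_\theta \, dW$, and $EZ_\theta(1) = 1$ by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(4)
theta = 1.0

# Pathwise check of (32): Z(1) - 1 should match the Ito sum of theta * Z dW.
n = 2**16
dW = rng.normal(0.0, np.sqrt(1.0 / n), size=n)
t = np.arange(n + 1) / n
W = np.concatenate([[0.0], np.cumsum(dW)])
Z = np.exp(theta * W - theta**2 * t / 2.0)
ito_sum = theta * float((Z[:-1] * dW).sum())  # left-endpoint approximation

# Martingale property at t = 1: E Z_theta(1) = 1 (Monte Carlo over terminal values).
W1 = rng.normal(0.0, 1.0, size=200_000)
mean_Z = float(np.exp(theta * W1 - theta**2 / 2.0).mean())

print(Z[-1] - 1.0, ito_sum, mean_Z)
```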

Example 3. A function $u : \mathbb{R}^k \to \mathbb{R}$ is called harmonic in a domain $D$ (an open subset of $\mathbb{R}^k$) if it satisfies the Laplace equation $\Delta u = 0$ at all points of $D$. Let $u$ be a harmonic function on $\mathbb{R}^k$ that is twice continuously differentiable. Then the multivariate Ito formula implies that if $W_t$ is a $k$-dimensional Wiener process,

$u(W_t) = u(W_0) + \int_0^t \nabla u(W_s) \cdot dW_s.$

It follows, by localization, that if $\tau$ is a stopping time such that $u(W_s)$ is bounded for $s \leq \tau$ then $u(W_{t \wedge \tau})$ is an $L^2$-martingale.

Exercise 5. Check that in dimension $d \geq 3$ the Newtonian potential $u(x) = |x|^{-d+2}$ is harmonic away from the origin. Check that in dimension $d = 2$ the logarithmic potential $u(x) = \log |x|$ is harmonic away from the origin.

4.2. Ito processes. An Ito process is a solution of a stochastic differential equation. More precisely, an Ito process is an $\mathbb{F}$-progressively measurable process $X_t$ that can be represented as

(33) $X_t = X_0 + \int_0^t A_s \, ds + \int_0^t V_s \, dW_s, \quad t \leq T,$


or equivalently, in differential form,

(34) $dX(t) = A(t) \, dt + V(t) \, dW(t).$

Here $W(t)$ is a $k$-dimensional Wiener process; $V(t)$ is a $k$-dimensional vector-valued process with components $V_i \in \mathcal{W}_T$; and $A_t$ is a progressively measurable process that is integrable (in $t$) relative to Lebesgue measure with probability 1, that is,

(35) $\int_0^T |A_s| \, ds < \infty \quad$ a.s.

If $X_0 = 0$ and the integrand $A_s = 0$ for all $s$, then call $X_t = I_t(V)$ an Ito integral process. Note that every Ito process has (a version with) continuous paths. Similarly, a $k$-dimensional Ito process is a vector-valued process $X_t$ with representation (33) where $A, V$ are vector-valued and $W$ is a $k$-dimensional Wiener process relative to $\mathbb{F}$. (Note: In this case $V \, dW$ must be interpreted as $V \cdot dW$.) If $X_t$ is an Ito process with representation (33) (either univariate or multivariate), its quadratic variation is defined to be the process

(36) $[X]_t := \int_0^t |V_s|^2 \, ds.$

If $X_1(t)$ and $X_2(t)$ are Ito processes relative to the same driving $d$-dimensional Wiener process, with representations (in differential form)

(37) $dX_i(t) = A_i(t) \, dt + \sum_{j=1}^d V_{ij}(t) \, dW_j(t),$

then the quadratic covariation of $X_1$ and $X_2$ is defined by

(38) $d[X_i, X_j]_t := \sum_{l=1}^d V_{il}(t) V_{jl}(t) \, dt.$

Theorem 3. (Univariate Ito Formula) Let $u(t, x)$ be twice continuously differentiable in $x$ and once continuously differentiable in $t$, and let $X(t)$ be a univariate Ito process. Then

(39) $du(t, X(t)) = u_t(t, X(t)) \, dt + u_x(t, X(t)) \, dX(t) + \frac{1}{2} u_{xx}(t, X(t)) \, d[X]_t.$

Note: It should be understood that the differential equation in (39) is shorthand for an integral equation. Since $u$ and its partial derivatives are assumed to be continuous, the ordinary and stochastic integrals of the processes on the right side of (39) are well-defined up to any finite time $t$. The differential $dX(t)$ is interpreted as in (34).

Theorem 4. (Multivariate Ito Formula) Let $u(t, x)$ be twice continuously differentiable in $x \in \mathbb{R}^k$ and once continuously differentiable in $t$, and let $X(t)$ be a $k$-dimensional Ito process whose components $X_i(t)$ satisfy the stochastic differential equations (37). Then

(40) $du(t, X(t)) = u_s(s, X(s)) \, ds + \nabla_x u(s, X_s) \cdot dX(s) + \frac{1}{2} \sum_{i,j=1}^k u_{x_i, x_j}(s, X(s)) \, d[X_i, X_j](s).$


Note: Unlike the Multivariate Ito Formula for functions of Wiener processes (Theorem 2 above), this formula includes mixed partials.

Theorems 3 and 4 can be proved by similar reasoning as in sec. 4.5 below. Alternatively, they can be deduced as special cases of the general Ito formula for Ito integrals relative to continuous local martingales. (See notes on web page.)

4.3. Example: Ornstein-Uhlenbeck process. Recall that the Ornstein-Uhlenbeck process with mean-reversion parameter $\alpha > 0$ is the mean zero Gaussian process $X_t$ whose covariance function is $EX_sX_t = \exp\{-\alpha|t - s|\}$. This process is the continuous-time analogue of the autoregressive-1 process, and is a weak limit of suitably scaled AR-1 processes. It occurs frequently as a weak limit of stochastic processes with some sort of mean-reversion, for much the same reason that the classical harmonic oscillator equation (Hooke's Law) occurs in mechanical systems with a restoring force. The natural stochastic analogue of the harmonic oscillator equation is

(41) $dX_t = -\alpha X_t \, dt + dW_t;$

$\alpha$ is called the relaxation parameter. To solve equation (41), set $Y_t = e^{\alpha t} X_t$ and use the Ito formula along with (41) to obtain

$dY_t = e^{\alpha t} \, dW_t.$

Thus, for any initial value $X_0 = x$ the equation (41) has the unique solution

(42) $X_t = X_0 e^{-\alpha t} + e^{-\alpha t} \int_0^t e^{\alpha s} \, dW_s.$

It is easily checked that the Gaussian process defined by this equation has the covariance function of the Ornstein-Uhlenbeck process with parameter $\alpha$. Since Gaussian processes are determined by their means and covariances, it follows that the process $X_t$ defined by (42) is a stationary Ornstein-Uhlenbeck process, provided the initial value $X_0$ is chosen to be a standard normal variate independent of the driving Brownian motion $W_t$.
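A quick pathwise check that (42) solves (41) (an illustrative sketch, not part of the notes; the left-endpoint sum stands in for the stochastic integral, and $\alpha = 1$ is an arbitrary choice): the increments of the simulated $X$ should match $-\alpha X_t \, dt + dW_t$ up to discretization error.

```python
import numpy as np

rng = np.random.default_rng(5)
alpha, T, n = 1.0, 1.0, 10_000
h = T / n
t = np.arange(n + 1) * h
dW = rng.normal(0.0, np.sqrt(h), size=n)

# Solution formula (42): X_t = X_0 e^{-at} + e^{-at} int_0^t e^{as} dW_s,
# with the stochastic integral approximated by a left-endpoint sum.
X0 = rng.normal()
stoch = np.concatenate([[0.0], np.cumsum(np.exp(alpha * t[:-1]) * dW)])
X = X0 * np.exp(-alpha * t) + np.exp(-alpha * t) * stoch

# SDE check (41): increments of X minus (-alpha X dt + dW) should be O(h^{3/2}).
residual = np.diff(X) - (-alpha * X[:-1] * h + dW)
print(np.abs(residual).max())
```

Algebraically, the discrete scheme satisfies $X_{j+1} = e^{-\alpha h}(X_j + \Delta W_j)$, so the residual per step is of order $\alpha h \, |\Delta W_j|$, far below the size of the increments themselves.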

4.4. Example: Brownian bridge. Recall that the standard Brownian bridge is the mean zero Gaussian process $\{Y_t\}_{0 \leq t \leq 1}$ with covariance function $EY_sY_t = s(1 - t)$ for $0 < s \leq t < 1$. The Brownian bridge is the continuum limit of scaled simple random walk conditioned to return to 0 at time $2n$. But simple random walk conditioned to return to 0 at time $2n$ is equivalent to the random walk gotten by sampling without replacement from a box with $n$ tickets marked $+1$ and $n$ marked $-1$. Now if $S_{[nt]} = k$, then there will be an excess of $k$ tickets marked $-1$ left in the box, and so the next step is a biased Bernoulli. This suggests that, in the continuum limit, there will be an instantaneous drift whose direction (in the $(t, x)$ plane) points to $(1, 0)$. Thus, let $W_t$ be a standard Brownian motion, and consider the stochastic differential equation

(43) $dY_t = -\frac{Y_t}{1 - t} \, dt + dW_t$

for $0 \leq t \leq 1$. To solve this, set $U_t = f(t) Y_t$ and use (43) together with the Ito formula to determine which choice of $f(t)$ will make the $dt$ terms vanish. The answer is $f(t) = 1/(1 - t)$ (easy exercise), and so

$d((1 - t)^{-1} Y_t) = (1 - t)^{-1} \, dW_t.$


Consequently, the unique solution to equation (43) with initial value $Y_0 = 0$ is given by

(44) $Y_t = (1 - t) \int_0^t (1 - s)^{-1} \, dW_s.$

It is once again easily checked that the stochastic process $Y_t$ defined by (44) is a mean zero Gaussian process whose covariance function matches that of the standard Brownian bridge. Therefore, the solution of (43) with initial condition $Y_0 = 0$ is a standard Brownian bridge.
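The covariance claim can be checked by Monte Carlo (an illustration, not part of the notes): building $Y_t$ from (44) by left-endpoint sums, the sample covariance at $s = 1/4$, $t = 1/2$ should be near $s(1 - t) = 1/8$, and $Y_1 = 0$ holds exactly by construction.

```python
import numpy as np

rng = np.random.default_rng(6)
n_steps, n_paths = 256, 10_000
h = 1.0 / n_steps
s_grid = np.arange(n_steps) * h              # left endpoints of the mesh cells

# Build Y_t = (1 - t) * int_0^t (1 - s)^{-1} dW_s, formula (44), via left-endpoint sums.
dW = rng.normal(0.0, np.sqrt(h), size=(n_paths, n_steps))
integrand = 1.0 / (1.0 - s_grid)             # (1 - s)^{-1} at left endpoints
stoch = np.cumsum(integrand * dW, axis=1)
t_grid = np.arange(1, n_steps + 1) * h
Y = (1.0 - t_grid) * stoch                   # Y at times h, 2h, ..., 1

i, j = n_steps // 4 - 1, n_steps // 2 - 1    # indices for t = 0.25 and t = 0.5
cov = float((Y[:, i] * Y[:, j]).mean())
print(cov)
```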

4.5. Proof of the univariate Ito formula. For ease of notation, I will consider only the case where the driving Wiener process is 1-dimensional; the argument in the general case is similar. First, I claim that it suffices to prove the result for functions $u$ with compact support. This follows by a routine argument using the Stopping Rule and Localization Principle for Ito integrals: let $D_n$ be an increasing sequence of open sets in $\mathbb{R}_+ \times \mathbb{R}$ that exhaust the space, and let $\tau_n$ be the first time that $X_t$ exits the region $D_n$. Then by continuity, $u(t \wedge \tau_n, X(t \wedge \tau_n)) \to u(t, X(t))$ as $n \to \infty$, and $\int_0^{t \wedge \tau_n} \to \int_0^t$ for each of the integrals in (39). Thus, the result (39) will follow from the corresponding formula with $t$ replaced by $t \wedge \tau_n$. For this, the function $u$ can be replaced by a function $\tilde{u}$ with compact support such that $\tilde{u} = u$ in $D_n$, by the Localization Principle. (Note: the Mollification Lemma 9 in the Appendix below implies that there is a $C^\infty$ function $\psi_n$ with compact support that takes values between 0 and 1 and is identically 1 on $D_n$. Set $\tilde{u} = u\psi_n$ to obtain a $C^2$ function with compact support that agrees with $u$ on $D_n$.) Finally, if (39) can be proved with $u$ replaced by $\tilde{u}$, then it will hold with $t$ replaced by $t \wedge \tau_n$, using the Stopping Rule again. Thus, we may now assume that the function $u$ has compact support, and therefore that it and its partial derivatives are bounded and uniformly continuous.

Second, I claim that it suffices to prove the result for elementary Ito processes, that is, processes $X_t$ of the form (33) where $V_s$ and $A_s$ are bounded, elementary processes. This follows by a routine approximation argument, because any Ito process $X(t)$ can be approximated by elementary Ito processes.

It now remains to prove (39) for elementary Ito processes $X_t$ and functions $u$ of compact support. Assume that $X$ has the form (33) with

$A_s = \sum_j \alpha_j \mathbf{1}_{[t_j, t_{j+1}]}(s), \qquad V_s = \sum_j \beta_j \mathbf{1}_{[t_j, t_{j+1}]}(s)$

where $\alpha_j, \beta_j$ are bounded random variables, both measurable relative to $\mathcal{F}_{t_j}$. Now it is clear that to prove the Ito formula (39) it suffices to prove it for $t \in (t_j, t_{j+1})$ for each index $j$. But this is essentially the same as proving that for an elementary Ito process $X$ of the form (33) with $A_s = \alpha \mathbf{1}_{[a,b]}(s)$ and $V_s = \beta \mathbf{1}_{[a,b]}(s)$, and $\alpha, \beta$ measurable relative to $\mathcal{F}_a$,

$u(t, X_t) - u(a, X_a) = \int_a^t u_s(s, X_s) \, ds + \int_a^t u_x(s, X_s) \, dX_s + \frac{1}{2} \int_a^t u_{xx}(s, X_s) \, d[X]_s$


for all $t \in (a, b)$. Fix $a < t$ and set $T = t - a$. Define

$\Delta^n_k X := X(a + (k+1)T/2^n) - X(a + kT/2^n),$
$\Delta^n_k U := U(a + (k+1)T/2^n) - U(a + kT/2^n),$ where
$U(s) := u(s, X(s)); \quad U_s(s) = u_s(s, X(s)); \quad U_x(s) = u_x(s, X(s));$ etc.

Notice that because of the assumptions on $A_s$ and $V_s$,

(45) $\Delta^n_k X = 2^{-n}T\alpha + \beta \Delta^n_k W.$

Now by Taylor's theorem,

(46) $u(t, X_t) - u(a, X_a) = \sum_{k=0}^{2^n - 1} \Delta^n_k U$
$= 2^{-n}T \sum_{k=0}^{2^n - 1} U_s(a + kT/2^n) + \sum_{k=0}^{2^n - 1} U_x(a + kT/2^n) \Delta^n_k X$
$+ \sum_{k=0}^{2^n - 1} U_{xx}(a + kT/2^n)(\Delta^n_k X)^2/2 + \sum_{k=0}^{2^n - 1} R^n_k$

where the remainder term $R^n_k$ satisfies

(47) $|R^n_k| \leq \varepsilon_n(2^{-2n} + (\Delta^n_k X)^2)$

and the constants $\varepsilon_n$ converge to zero as $n \to \infty$. (Note: This uniform bound on the remainder terms follows from the assumption that $u(s, x)$ is $C^{1,2}$ and has compact support, because this ensures that the partial derivatives $u_s$ and $u_{xx}$ are uniformly continuous and bounded.)

Finally, let's see how the four sums on the right side of (46) behave as n \to \infty. First, because the partial derivative u_s(s, x) is uniformly continuous and bounded, the first sum is just a Riemann sum approximation to the integral of a continuous function; thus,

    \lim_{n\to\infty} 2^{-n}T \sum_{k=0}^{2^n - 1} U_s(a + kT/2^n) = \int_a^t u_s(s, X_s)\,ds.

Next, by (45), the second sum also converges, since it can be split into a Riemann sum for a Riemann integral and an elementary approximation to an Ito integral:

    \lim_{n\to\infty} \sum_{k=0}^{2^n - 1} U_x(a + kT/2^n)\,\Delta^n_k X = \int_a^t u_x(s, X_s)\,dX_s.

The third sum is handled using Proposition 9 on the quadratic variation of the Wiener process, and equation (45) to reduce the quadratic variation of X to that of W (Exercise: Use the fact that u_{xx} is uniformly continuous and bounded to fill in the details):

    \lim_{n\to\infty} \sum_{k=0}^{2^n - 1} U_{xx}(a + kT/2^n)\,(\Delta^n_k X)^2/2 = \frac{1}{2}\int_a^t u_{xx}(s, X_s)\,d[X]_s.


Finally, by (47) and Proposition 9,

    \lim_{n\to\infty} \sum_{k=0}^{2^n - 1} R^n_k = 0.

5. Complex Exponential Martingales and their Uses

5.1. Exponential Martingales. Assume that for each T < \infty the process V_t is in the class W_T. Then the Ito formula (39) implies that the (complex-valued) exponential process

(48)    Z_t := \exp\Big\{ i\int_0^t V_s\,dW_s + \frac{1}{2}\int_0^t V_s^2\,ds \Big\}, \qquad t \le T,

    satisfies the stochastic differential equation

    (49) dZt = iZtVt dWt.

This alone does not guarantee that the process Z_t is a martingale; however, if for every T < \infty

(50)    E\, V_T^2 \exp\Big\{ \int_0^T V_s^2\,ds \Big\} < \infty,

then the integrand in (49) will be in the class V_T, and so Z_t will be a martingale. This has some important ramifications for stochastic calculus, as we shall see.

Proposition 12. Assume that for each T < \infty,

(51)    E \exp\Big\{ \int_0^T V_s^2\,ds \Big\} < \infty.

Then the process Z_t defined by (48) is a martingale, and in particular,

(52)    EZ_T = 1 for each T < \infty.

Proof. Define

    X_t = \int_0^t V_s\,dW_s, \qquad [X]_t = \int_0^t V_s^2\,ds, \qquad and \qquad \tau_n = \inf\{t : [X]_t = n\}.

Since Z(t \wedge \tau_n)V(t \wedge \tau_n) is uniformly bounded for each n, the stochastic differential equation (49), together with the Stopping Rule for Ito integrals, implies that the process Z(t \wedge \tau_n) is a bounded martingale. Since

    |Z(t \wedge \tau_n)| = |\exp\{ iX(t \wedge \tau_n) + [X](t \wedge \tau_n)/2 \}| \le \exp\{ [X]_t/2 \},

the random variables Z(t \wedge \tau_n) are all dominated by the integrable random variable \exp\{[X]_t/2\} (note that the hypothesis (51) guarantees that \exp\{[X]_t/2\} has finite first moment). Therefore the Dominated Convergence Theorem for conditional expectations implies that the process Z(t) is a martingale.
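Proposition 12 can be checked numerically in the simplest case. The sketch below is an illustration added here, not part of the original notes: it takes the bounded integrand V_s = 1 (so Z_T = exp{iW_T + T/2}) and verifies E Z_T = 1 by Monte Carlo.

```python
import cmath
import math
import random

# Monte Carlo check of E[Z_T] = 1 for Z_T = exp(i W_T + T/2), i.e. the
# exponential martingale (48) with the constant integrand V_s = 1.
random.seed(0)
T = 1.0
N = 200_000
total = 0 + 0j
for _ in range(N):
    w_T = random.gauss(0.0, math.sqrt(T))   # W_T ~ N(0, T)
    total += cmath.exp(1j * w_T + T / 2)
mean_Z = total / N
print(abs(mean_Z - 1))
```

The factor e^{T/2} exactly offsets E exp{iW_T} = e^{-T/2}, which is why the sample mean hovers near 1 even though each summand has modulus e^{T/2}.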

    5.2. Ito Representation Theorem.

Theorem 5. Let W_t be a k-dimensional Wiener process and let F^W = (F^W_t)_{0 \le t


5.3. Time Change for Ito Integral Processes. Let X_t = I_t(V) be an Ito integral process, where the integrand V_s is a process in the class W_T. One should interpret V_t as representing instantaneous volatility: in particular, the conditional distribution of the increment X(t + \Delta t) - X(t) given the value V(t) = \sigma is approximately, for \Delta t \approx 0, the normal distribution with mean zero and variance \sigma^2 \Delta t. One may view this in one of two ways: (1) the volatility V(t) = \sigma is a damping factor, that is, it multiplies the next Wiener increment by \sigma; alternatively, (2) V(s) is a time regulation factor, either slowing or speeding the normal rate at which variance is accumulated. The next theorem makes this latter viewpoint precise:

Theorem 6. Every Ito integral process is a time-changed Wiener process. More precisely, let X_t = I_t(V) be an Ito integral process with quadratic variation [X]_t defined by (36). Assume that [X]_t < \infty for all t < \infty, but that [X]_\infty = \lim_{t\to\infty} [X]_t = \infty with probability one. For each s \ge 0 define

(55)    \tau(s) = \inf\{ t > 0 : [X]_t = s \}.

Then the process

(56)    \tilde W(s) = X(\tau(s)), \qquad s \ge 0,

is a standard Wiener process.

Note: There is a more general theorem of P. Lévy which asserts that every continuous martingale (not necessarily adapted to a Wiener filtration) is a time-changed Wiener process.

Proof. By Proposition 12, for any \theta \in R and any s > 0 the process

    Z_\theta(t) := \exp\{ i\theta X(t \wedge \tau(s)) + \theta^2 [X]_{t \wedge \tau(s)}/2 \}

is a bounded martingale. Hence, by the Bounded Convergence Theorem,

    E \exp\{ i\theta X(\tau(s)) + \theta^2 s/2 \} = 1.

This implies that \tilde W(s) = X(\tau(s)) has the characteristic function of the N(0, s) distribution. A variation of this argument (Exercise!) shows that the process \tilde W(s) has stationary, independent increments.

Corollary 4. Let X_t = I_t(V) be an Ito integral process with quadratic variation [X]_t, and let \tau(s) be defined by (55). Then for each \lambda > 0,

(57)    P\{ \max_{t \le \tau(s)} |X_t| \ge \lambda \} \le 2\,P\{ |\tilde W_s| \ge \lambda \}.

Consequently, the random variable \max_{t \le \tau(s)} |X_t| has finite moments of all orders, and even a finite moment generating function.

Proof. The maximum of |X(t)| up to time \tau(s) coincides with the maximum of the Wiener process \tilde W up to time s, so the result follows from the Reflection Principle.
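Theorem 6 can be illustrated with a deterministic integrand, chosen here purely for convenience: with V_s = sqrt(2s), [X]_t = t^2, so \tau(s) = sqrt(s) and X(\tau(s)) should be N(0, s). The sketch below is my own illustration (grid size and sample count are arbitrary); it simulates the Ito integral on a grid and checks the mean and variance at the time-changed instant.

```python
import math
import random

# Simulate X(tau(s)) for X_t = ∫ sqrt(2u) dW_u, where [X]_t = t^2 and so
# tau(s) = sqrt(s); by Theorem 6 the result should be N(0, s).
random.seed(1)
s = 1.0
tau = math.sqrt(s)          # deterministic time change
n_steps = 200
dt = tau / n_steps
n_paths = 20_000
samples = []
for _ in range(n_paths):
    x = 0.0
    for k in range(n_steps):
        x += math.sqrt(2 * k * dt) * random.gauss(0.0, math.sqrt(dt))
    samples.append(x)
mean = sum(samples) / n_paths
var = sum((v - mean) ** 2 for v in samples) / n_paths
print(mean, var)   # mean near 0, variance near s = 1
```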


5.4. Hermite Functions and Hermite Martingales. The Hermite functions H_n(x, t) are polynomials in the variables x and t that satisfy the backward heat equation H_t + H_{xx}/2 = 0. As we have seen (e.g., in connection with the exponential function \exp\{\theta x - \theta^2 t/2\}), if H(x, t) satisfies the backward heat equation, then when the Ito Formula is applied to H(W_t, t), the ordinary integrals cancel, leaving only the Ito integral; and thus, H(W_t, t) is a (local) martingale. Consequently, the Hermite functions provide a sequence of polynomial martingales. The first few Hermite functions are

(58)    H_0(x, t) = 1,
        H_1(x, t) = x,
        H_2(x, t) = x^2 - t,
        H_3(x, t) = x^3 - 3xt,
        H_4(x, t) = x^4 - 6x^2 t + 3t^2.

The formal definition is by a generating function:

(59)    \sum_{n=0}^{\infty} \frac{H_n(x, t)\,\theta^n}{n!} = \exp\{ \theta x - \theta^2 t/2 \}.

Exercise 6. Show that the Hermite functions satisfy the two-term recursion relation H_{n+1} = xH_n - ntH_{n-1}. Conclude that every term of H_{2n} is a constant times x^{2m}t^{n-m} for some 0 \le m \le n, and that the lead term is the monomial x^{2n}. Conclude also that each H_n solves the backward heat equation.
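The recursion in Exercise 6 is easy to mechanize. The sketch below is an illustration added here (the dict-of-monomials representation is my own choice): it builds H_n(x, t) as a map from (power of x, power of t) to coefficient.

```python
# Generate the Hermite functions via H_{n+1} = x H_n - n t H_{n-1},
# representing each polynomial as {(i, j): c} meaning c * x^i * t^j.
def hermite(n):
    H_prev = {(0, 0): 1}            # H_0 = 1
    if n == 0:
        return H_prev
    H_cur = {(1, 0): 1}             # H_1 = x
    for m in range(1, n):
        nxt = {}
        for (i, j), c in H_cur.items():        # multiply H_m by x
            nxt[(i + 1, j)] = nxt.get((i + 1, j), 0) + c
        for (i, j), c in H_prev.items():       # subtract m * t * H_{m-1}
            nxt[(i, j + 1)] = nxt.get((i, j + 1), 0) - m * c
        H_prev, H_cur = H_cur, {k: v for k, v in nxt.items() if v != 0}
    return H_cur

print(hermite(4))   # coefficients of x^4 - 6 x^2 t + 3 t^2
```

Running hermite(4) reproduces H_4 = x^4 - 6x^2 t + 3t^2 from the table (58).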

Proposition 13. Let V_s be a bounded, progressively measurable process, and let X_t = I_t(V) be the Ito integral process with integrand V. Then for each n \ge 0, the process H_n(X_t, [X]_t) is a martingale.

Proof. Since each H_n(x, t) satisfies the backward heat equation, the Ito formula implies that H_n(X_{t\wedge\tau}, [X]_{t\wedge\tau}) is a martingale for each stopping time \tau = \tau(m) = first t such that either |X_t| = m or [X]_t = m. If V_s is bounded, then Corollary 4 guarantees that for each t the random variables H_n(X_{t\wedge\tau(m)}, [X]_{t\wedge\tau(m)}) are dominated by an integrable random variable. Therefore, the DCT for conditional expectations implies that H_n(X_t, [X]_t) is a martingale.

    5.5. Moment Inequalities.

Corollary 5. Let V_t be a progressively measurable process in the class W_T, and let X_t = I_t(V) be the associated Ito integral process. Then for every integer m \ge 1 and every time T < \infty, there exist constants C_m < \infty such that

(60)    EX_T^{2m} \le C_m\, E[X]_T^m.

Note: Burkholder, Davis, and Gundy proved a stronger result: Inequality (60) remains true if X_T is replaced by \max_{t \le T} |X_t| on the left side of (60).


Proof. First, it suffices to consider the case where the integrand V_s is uniformly bounded. To see this, define in general the truncated integrands V^{(m)}_s := V_s \wedge m; then

    \lim_{m\to\infty} I_t(V^{(m)}) = I_t(V) \quad a.s., \qquad and \qquad \lim_{m\to\infty} [I(V^{(m)})]_t = [I(V)]_t.

Hence, if the result holds for each of the truncated integrands V^{(m)}, then it will hold for V, by Fatou's Lemma and the Monotone Convergence Theorem.

Assume, then, that V_s is uniformly bounded. The proof of (60) in this case is by induction on m. First note that, because V_t is assumed bounded, so is the quadratic variation [X]_T at any finite time T, and so [X]_T has finite moments of all orders. Also, if V_t is bounded then it is an element of V_T, and so the Ito isometry implies that

    EX_T^2 = E[X]_T.

This takes care of the case m = 1. The induction argument uses the Hermite martingales H_{2m}(X_t, [X]_t) (Proposition 13). By Exercise 6, the lead term (in x) of the polynomial H_{2m}(x, t) is x^{2m}, and the remaining terms are all of the form a_{m,k}\,x^{2k}t^{m-k} for k < m. Since H_{2m}(X_0, [X]_0) = 0, the martingale identity implies

    EX_T^{2m} = -\sum_{k=0}^{m-1} a_{m,k}\, E X_T^{2k} [X]_T^{m-k}
              \le A_{2m} \sum_{k=0}^{m-1} E X_T^{2k} [X]_T^{m-k}
              \le A_{2m} \sum_{k=0}^{m-1} (EX_T^{2m})^{k/m} (E[X]_T^m)^{1-k/m},

the last by Hölder's inequality. Note that the constants A_{2m} = \max_k |a_{m,k}| are determined by the coefficients of the Hermite function H_{2m}. Now divide both sides by EX_T^{2m} to obtain

    1 \le A_{2m} \sum_{k=0}^{m-1} \Big( \frac{E[X]_T^m}{EX_T^{2m}} \Big)^{1-k/m}.

The inequality (60) follows.
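For a concrete instance of (60), take V_s = 1, so that X_T = W_T and [X]_T = T; then EX_T^4 = 3T^2, i.e., the m = 2 inequality holds with constant 3. The Monte Carlo check below is an illustration only (the sample size is an arbitrary choice).

```python
import math
import random

# Estimate E[W_T^4] / T^2 for W_T ~ N(0, T); the exact value is 3, the
# fourth moment of a standard normal, matching (60) with V_s = 1, m = 2.
random.seed(2)
T = 2.0
N = 400_000
fourth = sum(random.gauss(0.0, math.sqrt(T)) ** 4 for _ in range(N)) / N
print(fourth / T ** 2)   # close to 3
```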

6. Girsanov's Theorem

6.1. Change of measure. If Z_T is a nonnegative, F_T-measurable random variable with expectation 1, then it is a likelihood ratio; that is, the measure Q on F_T defined by

(61)    Q(F) := E_P[ 1_F Z_T ]

is a probability measure, and the likelihood ratio (Radon-Nikodym derivative) of Q relative to P is Z_T. It is of interest to know how the measure Q restricts to the \sigma-algebra F_t \subset F_T when t < T.


Proposition 14. Let Z_T be an F_T-measurable, nonnegative random variable such that E_P Z_T = 1, and let Q = Q_T be the probability measure on F_T with likelihood ratio Z_T relative to P. Then for any t \in [0, T] the restriction Q_t of Q to the \sigma-algebra F_t has likelihood ratio

(62)    Z_t := E_P(Z_T \,|\, F_t).

Proof. The random variable Z_t defined by (62) is nonnegative, F_t-measurable, and integrates to 1, so it is the likelihood ratio of a probability measure on F_t. For any event A \in F_t,

    Q(A) := E_P Z_T 1_A = E_P E_P(Z_T \,|\, F_t) 1_A

by definition of conditional expectation, so

    Z_t = (dQ/dP)|_{F_t}.

6.2. The Girsanov formula. Let {V_t} be an adapted process, and let Z(t) be the exponential process

(63)    Z_t = \exp\Big\{ \int_0^t V_s\,dW_s - \frac{1}{2}\int_0^t V_s^2\,ds \Big\}.

Novikov's theorem implies that if V satisfies the integrability hypothesis (51) then the process {Z_t}_{t\ge 0} is a martingale, and so for each T > 0 the random variable Z(T) is a likelihood ratio. Thus, the formula

(64)    Q(F) = E_P( Z(T) 1_F )

defines a new probability measure on (\Omega, F_T). Girsanov's theorem describes the distribution of the stochastic process {W(t)}_{t\ge 0} under this new probability measure. Define

(65)    \tilde W(t) = W(t) - \int_0^t V_s\,ds.

Theorem 7. (Girsanov) Assume that under P the process {W_t}_{t\ge 0} is a standard Brownian motion with admissible filtration F = {F_t}_{t\ge 0}. Assume further that the exponential process {Z_t}_{t\le T} is a martingale relative to F under P. Define Q on F_T by equation (64). Then under the probability measure Q, the stochastic process \{\tilde W(t)\}_{0\le t\le T} is a standard Wiener process.

Proof. To show that the process \tilde W_t, under Q, is a standard Wiener process, it suffices to show that it has independent, normally distributed increments with the correct variances. For this, it suffices to show either that the joint moment generating function or the joint characteristic function (under Q) of the increments

    \tilde W(t_1),\ \tilde W(t_2) - \tilde W(t_1),\ \dots,\ \tilde W(t_n) - \tilde W(t_{n-1}),


where 0 < t_1 < t_2 < \cdots < t_n, is the same as that of n independent, normally distributed random variables with expectations 0 and variances t_1, t_2 - t_1, \dots; that is, either

(66)    E_Q \exp\Big\{ \sum_{k=1}^{n} \theta_k (\tilde W(t_k) - \tilde W(t_{k-1})) \Big\} = \prod_{k=1}^{n} \exp\{ \theta_k^2 (t_k - t_{k-1})/2 \}

or

(67)    E_Q \exp\Big\{ \sum_{k=1}^{n} i\theta_k (\tilde W(t_k) - \tilde W(t_{k-1})) \Big\} = \prod_{k=1}^{n} \exp\{ -\theta_k^2 (t_k - t_{k-1})/2 \}.

Special Case: Assume that the integrand process V_s is bounded. In this case it is easiest to use moment generating functions. Consider for simplicity the case n = 1: To evaluate the expectation E_Q on the left side of (66), we rewrite it as an expectation under P, using the basic likelihood ratio identity relating the two expectation operators:

    E_Q \exp\{ \theta \tilde W(t) \}
      = E_Q \exp\Big\{ \theta W(t) - \theta \int_0^t V_s\,ds \Big\}
      = E_P \exp\Big\{ \theta W(t) - \theta \int_0^t V_s\,ds \Big\} \exp\Big\{ \int_0^t V_s\,dW_s - \int_0^t V_s^2\,ds/2 \Big\}
      = E_P \exp\Big\{ \int_0^t (\theta + V_s)\,dW_s - \int_0^t (2\theta V_s + V_s^2)\,ds/2 \Big\}
      = e^{\theta^2 t/2}\, E_P \exp\Big\{ \int_0^t (\theta + V_s)\,dW_s - \int_0^t (\theta + V_s)^2\,ds/2 \Big\}
      = e^{\theta^2 t/2},

as desired. In the last step we used the fact that the exponential integrates to one. This follows from Novikov's theorem, because the hypothesis that the integrand V_s is bounded guarantees that Novikov's condition (51) is satisfied by (\theta + V_s). A similar calculation shows that (66) holds for n > 1.

General Case: Unfortunately, the final step in the calculation above cannot be justified in general. However, a similar argument can be made using characteristic functions rather than moment generating functions. Once again, consider for simplicity the case n = 1:

    E_Q \exp\{ i\theta \tilde W(t) \}
      = E_Q \exp\Big\{ i\theta W(t) - i\theta \int_0^t V_s\,ds \Big\}
      = E_P \exp\Big\{ i\theta W(t) - i\theta \int_0^t V_s\,ds \Big\} \exp\Big\{ \int_0^t V_s\,dW_s - \int_0^t V_s^2\,ds/2 \Big\}
      = E_P \exp\Big\{ \int_0^t (i\theta + V_s)\,dW_s - \int_0^t (2i\theta V_s + V_s^2)\,ds/2 \Big\}
      = E_P \exp\Big\{ \int_0^t (i\theta + V_s)\,dW_s - \int_0^t (i\theta + V_s)^2\,ds/2 \Big\}\, e^{-\theta^2 t/2}
      = e^{-\theta^2 t/2},


which is the characteristic function of the normal distribution N(0, t). To justify the final equality, we must show that

    E_P \exp\{ X_t - [X]_t/2 \} = 1, \quad where \quad X_t = \int_0^t (i\theta + V_s)\,dW_s \ \ and \ \ [X]_t = \int_0^t (i\theta + V_s)^2\,ds.

Ito's formula implies that

    d \exp\{ X_t - [X]_t/2 \} = \exp\{ X_t - [X]_t/2 \}\,(i\theta + V_t)\,dW_t,

and so the martingale property will hold up to any stopping time that keeps the integrand on the right side bounded. Define stopping times

    \tau(n) = \inf\{ s : |X_s| = n \ or \ |[X]_s| = n \ or \ |V_s| = n \};

then for each n = 1, 2, \dots,

    E_P \exp\{ X_{t\wedge\tau(n)} - [X]_{t\wedge\tau(n)}/2 \} = 1.

As n \to \infty, the integrand converges pointwise to \exp\{X_t - [X]_t/2\}, so to conclude the proof it suffices to verify uniform integrability. For this, observe that for any s \le t,

    |\exp\{ X_s - [X]_s/2 \}| = e^{\theta^2 s/2} Z_s \le e^{\theta^2 t/2} Z_s.

By hypothesis, the process {Z_s}_{s\le t} is a positive martingale, and consequently the random variables Z_{t\wedge\tau(n)} are uniformly integrable. This implies that the random variables |\exp\{X_{t\wedge\tau(n)} - [X]_{t\wedge\tau(n)}/2\}| are also uniformly integrable.

6.3. Example: Ornstein-Uhlenbeck revisited. Recall that the solution X_t to the linear stochastic differential equation (41) with initial condition X_0 = x is an Ornstein-Uhlenbeck

process with mean-reversion parameter \alpha. Because the stochastic differential equation (41) is of the same form as equation (65), Girsanov's theorem implies that a change of measure will convert a standard Brownian motion to an Ornstein-Uhlenbeck process with initial point 0. The details are as follows: Let W_t be a standard Brownian motion defined on a probability space (\Omega, F, P), and let F = {F_t}_{t\ge 0} be the associated Brownian filtration. Define

(68)    N_t = -\alpha \int_0^t W_s\,dW_s,

(69)    Z_t = \exp\{ N_t - [N]_t/2 \},

and let Q = Q_T be the probability measure on F_T with likelihood ratio Z_T relative to P. Observe that the quadratic variation [N]_t is just the integral \alpha^2 \int_0^t W_s^2\,ds. Theorem 7 asserts

that, under the measure Q, the process

    \tilde W_t := W_t + \alpha \int_0^t W_s\,ds,

for 0 \le t \le T, is a standard Brownian motion. But this implies that the process W itself must solve the stochastic differential equation

    dW_t = -\alpha W_t\,dt + d\tilde W_t


under Q. It follows that {W_t}_{0\le t\le T} is, under Q, an Ornstein-Uhlenbeck process with mean-reversion parameter \alpha and initial point 0. Similarly, the shifted process x + W_t is, under Q, an Ornstein-Uhlenbeck process with initial point x.

It is worth noting that the likelihood ratio Z_t may be written in an alternative form that contains no Ito integrals. To do this, use the Ito formula on the quadratic function u(x) = x^2 to obtain W_t^2 = 2I_t(W) + t; this shows that the Ito integral in the definition (68) of N_t may be replaced by W_t^2/2 - t/2. Hence, the likelihood ratio Z_t may be written as

(70)    Z_t = \exp\Big\{ -\alpha W_t^2/2 + \alpha t/2 - \alpha^2 \int_0^t W_s^2\,ds/2 \Big\}.

A physicist would interpret the quantity in the exponential as the energy of the path W in a quadratic potential well. According to the fundamental postulate of statistical physics (see Feynman, Lectures on Statistical Physics), the probability of finding a system in a given configuration \sigma is proportional to the exponential e^{-H(\sigma)/kT}, where H(\sigma) is the energy (Hamiltonian) of the system in configuration \sigma. Thus, a physicist might view the Ornstein-Uhlenbeck process as describing fluctuations in a quadratic potential. (In fact, the Ornstein-Uhlenbeck process was originally invented to describe the instantaneous velocity of a particle undergoing rapid collisions with molecules in a gas. See the book by Edward Nelson on Brownian motion for a discussion of the physics of the Brownian motion process.)

The formula (70) also suggests an interpretation of the change of measure in the language of acceptance/rejection sampling: Run a standard Brownian motion for time T to obtain a path x(t); then accept this path with probability proportional to

    \exp\Big\{ -\alpha x_T^2/2 - \alpha^2 \int_0^T x_s^2\,ds/2 \Big\}.

The random paths produced by this acceptance/rejection scheme will be distributed as Ornstein-Uhlenbeck paths with initial point 0. This suggests (and it can be proved, but this is beyond the scope of these notes) that the Ornstein-Uhlenbeck measure on C[0, T] is the weak limit of a sequence of discrete measures \mu_n that weight random walk paths according to their potential energies.

    Problem 1. Show how the Brownian bridge can be obtained from Brownian motion bychange of measure, and find an expression for the likelihood ratio that contains no Itointegrals.

6.4. Example: Brownian motion conditioned to stay positive. For each x \in R, let P^x be a probability measure on (\Omega, F) such that under P^x the process W_t is a one-dimensional Brownian motion with initial point W_0 = x. (Thus, under P^x the process W_t - x is a standard Brownian motion.) For a < b define

    T = T_{a,b} = \min\{ t \ge 0 : W_t \notin (a, b) \}.

Proposition 15. Fix 0 < x < b, and let T = T_{0,b}. Let Q^x be the probability measure obtained from P^x by conditioning on the event {W_T = b} that W reaches b before 0, that is,

(71)    Q^x(F) := P^x(F \cap \{W_T = b\}) / P^x\{W_T = b\}.


Then under Q^x the process {W_t}_{t\le T} has the same distribution as does the solution of the stochastic differential equation

(72)    dX_t = X_t^{-1}\,dt + dW_t, \qquad X_0 = x,

under P^x. In other words, conditioning on the event W_T = b has the same effect as adding the location-dependent drift 1/X_t.

Proof. Q^x is a measure on the \sigma-algebra F_T that is absolutely continuous with respect to P^x. The likelihood ratio dQ^x/dP^x on F_T is

    Z_T := \frac{1\{W_T = b\}}{P^x\{W_T = b\}} = \frac{b}{x}\,1\{W_T = b\}.

For any (nonrandom) time t \ge 0, the likelihood ratio Z_{T\wedge t} = dQ^x/dP^x on F_{T\wedge t} is gotten by computing the conditional expectation of Z_T under P^x (Proposition 14). Since Z_T is a function only of the endpoint W_T, its conditional expectation on F_{T\wedge t} is the same as the conditional expectation on \sigma(W_{T\wedge t}), by the (strong) Markov property of Brownian motion. Moreover, the conditional probability P^x(W_T = b \,|\, W_{T\wedge t}) is just W_{T\wedge t}/b (gambler's ruin!). Thus,

    Z_{T\wedge t} = W_{T\wedge t}/x.

This doesn't at first sight appear to be of the exponential form required by the Girsanov theorem, but actually it is: by the Ito formula,

    Z_{T\wedge t} = \exp\{ \log(W_{T\wedge t}/x) \} = \exp\Big\{ \int_0^{T\wedge t} W_s^{-1}\,dW_s - \int_0^{T\wedge t} W_s^{-2}\,ds/2 \Big\}.

Consequently, the Girsanov theorem (Theorem 7) implies that under Q^x the process

    W_{T\wedge t} - \int_0^{T\wedge t} W_s^{-1}\,ds

is a Brownian motion. This is equivalent to the assertion that under Q^x the process W_{T\wedge t} behaves as a solution to the stochastic differential equation (72).
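The gambler's-ruin identity P^x{W_T = b} = x/b that drives the proof can be checked with a simple symmetric random walk, which has the same exit probabilities as Brownian motion on a grid. The grid size and sample count below are arbitrary choices for illustration.

```python
import random

# Symmetric random walk on {0, ..., m}, started at j0: the probability of
# hitting m before 0 is exactly j0/m, the gambler's-ruin probability.
random.seed(4)
m, j0 = 20, 10          # corresponds to b = 1, x = 0.5 on a grid of mesh 1/20
N = 20_000
hits = 0
for _ in range(N):
    j = j0
    while 0 < j < m:
        j += 1 if random.random() < 0.5 else -1
    hits += (j == m)
p = hits / N
print(p)   # close to j0/m = 0.5
```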

Observe that the stochastic differential equation (72) does not involve the stopping place b. Thus, Brownian motion conditioned to hit b before 0 can be constructed by running a Brownian motion conditioned to hit b + a before 0, and stopping it at the first hitting time of b. Alternatively, one can construct a Brownian motion conditioned to hit b before 0 by solving¹ the stochastic differential equation (72) for t \ge 0 and stopping it at the first hit of b. Now observe that if W_t is a Brownian motion started at any fixed x > 0, the events {W_{T_{0,b}} = b} are decreasing in b, and their intersection over all b > x is the event that W_t > 0 for all t and W_t visits every b > x. (This is, of course, an event of probability zero.) For this reason, probabilists refer to the process X_t solving the stochastic differential equation (72) as Brownian motion conditioned to stay positive.

Caution: Because the event that W_t > 0 for all t > 0 has probability zero, we cannot define Brownian motion conditioned to stay positive using conditional probabilities directly in the same way (see equation (71)) that we defined Brownian motion conditioned to hit b before 0. Moreover, whereas Brownian motion conditioned to hit b before 0 has a distribution that is absolutely continuous relative to that of Brownian motion, Brownian

¹We haven't yet established that the SDE (72) has a solution for all t \ge 0. This will be done later.


    motion conditioned to stay positive for all t > 0 has a distribution (on C[0, )) that isnecessarily singular relative to the law of Brownian motion.

6.5. Bessel processes. A Bessel process with dimension parameter d is a solution (or a process whose law agrees with that of a solution) of the stochastic differential equation

(73)    dX_t = \frac{d - 1}{2X_t}\,dt + dW_t,

where W_t is a standard one-dimensional Brownian motion. Brownian motion conditioned to stay positive for all t > 0 is therefore the Bessel process of dimension 3. The problems of existence and uniqueness of solutions to the Bessel SDEs (73) will be addressed later. The next result shows that solutions exist when d is a positive integer.

Proposition 16. Let W_t = (W^1_t, W^2_t, \dots, W^d_t) be a d-dimensional Brownian motion started at a point x \ne 0, and let R_t = |W_t| be the modulus of W_t. Then R_t is a Bessel process of dimension d started at |x|.

Remark 1. Consequently, the law of one-dimensional Brownian motion conditioned to stay positive forever coincides with that of the radial part R_t of a 3-dimensional Brownian motion!
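A quick numerical check of Proposition 16 for d = 3: if W_1 is a standard 3-dimensional Gaussian vector, then R_1 = |W_1| has the chi distribution with 3 degrees of freedom, whose mean is sqrt(8/pi), roughly 1.5958. The sketch below (sample size arbitrary) verifies this.

```python
import math
import random

# Sample mean of |W_1| for 3-dimensional Brownian motion at time 1,
# compared with the chi(3) mean sqrt(8/pi).
random.seed(5)
N = 200_000
acc = 0.0
for _ in range(N):
    acc += math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(3)))
mean_R = acc / N
print(mean_R, math.sqrt(8 / math.pi))
```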

7. Local Time and Tanaka's Formula

7.1. Occupation Measure of Brownian Motion. Let W_t be a standard one-dimensional Brownian motion and let F := {F_t}_{t\ge 0} be the associated Brownian filtration. A sample path of {W_s}_{0\le s\le t} induces, in a natural way, an occupation measure \Gamma_t on (the Borel field of) R:

(74)    \Gamma_t(A) := \int_0^t 1_A(W_s)\,ds.

Theorem 8. With probability one, for each t < \infty the occupation measure \Gamma_t is absolutely continuous with respect to Lebesgue measure, and its Radon-Nikodym derivative L(t; x) is jointly continuous in t and x.

The occupation density L(t; x) is known as the local time of the Brownian motion at x. It was first studied by Paul Lévy, who showed among other things that it could be defined by the formula

(75)    L(t; x) = \lim_{\varepsilon \to 0} \frac{1}{2\varepsilon} \int_0^t 1_{[x-\varepsilon,\, x+\varepsilon]}(W_s)\,ds.

Joint continuity of L(t; x) in t, x was first proved by Trotter. The modern argument that

    follows is based on an integral representation discovered by Tanaka.
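The limit (75) suggests a simple estimator of the local time at 0 from a discretized path: the time spent in [-eps, eps], divided by 2 eps. Its expectation is (1/2eps) \int_0^t P{|W_s| <= eps} ds, approximately \int_0^t (2 pi s)^{-1/2} ds = sqrt(2t/pi). The sketch below (step size, eps, and path count are arbitrary choices, and both the finite eps and the discretization introduce a small bias) checks this by simulation.

```python
import math
import random

# Estimate the expected value of the occupation-time functional in (75)
# at x = 0, t = 1, and compare with sqrt(2/pi) ~ 0.798.
random.seed(6)
t, eps = 1.0, 0.05
n_steps = 1_000
dt = t / n_steps
n_paths = 2_000
acc = 0.0
for _ in range(n_paths):
    w, occ = 0.0, 0.0
    for _ in range(n_steps):
        w += random.gauss(0.0, math.sqrt(dt))
        if abs(w) <= eps:
            occ += dt
    acc += occ / (2 * eps)
est = acc / n_paths
print(est, math.sqrt(2 / math.pi))
```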

7.2. Tanaka's Formula.

    Theorem 9. The process L(t; x) defined by

(76)    2L(t; x) = |W_t - x| - |x| - \int_0^t \mathrm{sgn}(W_s - x)\,dW_s


is almost surely nondecreasing, jointly continuous in t, x, and constant on every open time interval during which W_t \ne x.

For the proof that L(t; x) is continuous (more precisely, that it has a continuous version) we will need two auxiliary results. The first is the Burkholder-Davis-Gundy inequality (Corollary 5 above). The second, the Kolmogorov-Chentsov criterion for path-continuity of a stochastic process, we have seen earlier:

Proposition 17. (Kolmogorov-Chentsov) Let Y(t) be a stochastic process indexed by a d-dimensional parameter t. Then Y(t) has a version with continuous paths if there exist constants p, \beta > 0 and C < \infty such that for all s, t,

(77)    E|Y(t) - Y(s)|^p \le C|t - s|^{d+\beta}.

Proof of Theorem 9: Continuity. Since the process |W_t - x| - |x| is jointly continuous, the Tanaka formula (76) implies that it is enough to show that

    Y(t; x) := \int_0^t \mathrm{sgn}(W_s - x)\,dW_s

is jointly continuous in t and x. For this, we appeal to the Kolmogorov-Chentsov theorem: This asserts that to prove continuity of Y(t; x) in t, x it suffices to show that for some p \ge 1 and C, \delta > 0,

(78)    E|Y(t; x) - Y(t; x')|^p \le C|x - x'|^{2+\delta} \quad and
(79)    E|Y(t; x) - Y(t'; x)|^p \le C|t - t'|^{2+\delta}.

I will prove only (78), with p = 6 and \delta = 1; the other inequality, with the same values of p, \delta, is similar. Begin by observing that for x < x',

    Y(t; x') - Y(t; x) = \int_0^t \{ \mathrm{sgn}(W_s - x') - \mathrm{sgn}(W_s - x) \}\,dW_s = -2 \int_0^t 1_{(x, x')}(W_s)\,dW_s.

This is a martingale in t whose quadratic variation is flat when W_s is not in the interval (x, x') and grows linearly in time when W_s \in (x, x'). By Corollary 5,

    E|Y(t; x') - Y(t; x)|^{2m}
      \le C_m 2^{2m}\, E\Big( \int_0^t 1_{(x,x')}(W_s)\,ds \Big)^{m}
      \le C_m 2^{3m} |x' - x|^m\, m! \int_{t_1=0}^{t} \int_{t_2=0}^{t-t_1} \cdots \int_{t_m=0}^{t-t_{m-1}} \frac{dt_1\,dt_2 \cdots dt_m}{\sqrt{t_1 t_2 \cdots t_m}}
      \le C|x' - x|^m.

Proof of Theorem 9: Conclusion. It remains to show that L(t; x) is nondecreasing in t, and that it is constant on any time interval during which W_t \ne x. Observe that the Tanaka formula


(76) is, in effect, a variation of the Ito formula for the absolute value function, since the sgn function is the derivative of the absolute value everywhere except at 0. This suggests that we try an approach by approximation, applying the usual Ito formula to a smoothed version of the absolute value function. To get a smoothed version of | \cdot |, convolve with a smooth probability density with support contained in [-\varepsilon, \varepsilon]. Thus, let \psi be an even, C^\infty probability density on R with support contained in (-1, 1) (see the proof of Lemma 9 in the Appendix below), and define

    \psi_n(x) := n\psi(nx);
    \varphi_n(x) := -1 + 2\int_{-\infty}^{x} \psi_n(y)\,dy; \quad and
    F_n(x) := |x| \quad for \ |x| > 1/n,
    F_n(x) := 1/n + \int_{-1/n}^{x} \varphi_n(z)\,dz \quad for \ |x| \le 1/n.

Note that \int_{-1/n}^{1/n} \varphi_n = 0, because of the symmetry of \psi about 0. (Exercise: Check this.) Consequently, F_n is C^\infty on R, and agrees with the absolute value function outside the interval [-n^{-1}, n^{-1}]. The first derivative of F_n is \varphi_n, and hence is bounded in absolute value by 1; it follows that F_n(y) \ge |y| for all y \in R. The second derivative is F_n'' = 2\psi_n. Therefore, Ito's formula implies that

(80)    F_n(W_t - x) - F_n(-x) = \int_0^t \varphi_n(W_s - x)\,dW_s + \frac{1}{2}\int_0^t 2\psi_n(W_s - x)\,ds.

By construction, F_n(y) \to |y| as n \to \infty, so the left side of (80) converges to |W_t - x| - |x|. Now consider the stochastic integral on the right side of (80): As n \to \infty, the function \varphi_n converges to sgn, and in fact \varphi_n coincides with sgn outside of [-n^{-1}, n^{-1}]; moreover, the difference |sgn - \varphi_n| is bounded. Consequently, as n \to \infty,

    \lim_{n\to\infty} \int_0^t \varphi_n(W_s - x)\,dW_s = \int_0^t \mathrm{sgn}(W_s - x)\,dW_s,

because the quadratic variation of the difference converges to 0. (Exercise: Check this.) This proves that two of the three quantities in (80) converge as n \to \infty, and so the third must also converge:

(81)    L(t; x) = \lim_{n\to\infty} \int_0^t \psi_n(W_s - x)\,ds =: \lim_{n\to\infty} L_n(t; x).

Each L_n(t; x) is nondecreasing in t, because \psi_n \ge 0; therefore, L(t; x) is also nondecreasing in t. Finally, since \psi_n = 0 outside the interval [-n^{-1}, n^{-1}], each of the processes L_{n+m}(t; x) is constant during time intervals when W_t \notin [x - n^{-1}, x + n^{-1}]. Hence, L(t; x) is constant on time intervals during which W_s \ne x.


Proof of Theorem 8. It suffices to show that for every continuous function f : R \to R with compact support,

(82)    \int_0^t f(W_s)\,ds = \int_{R} f(x)L(t; x)\,dx \quad a.s.

Let \psi be, as in the proof of Theorem 9, a C^\infty probability density on R with support [-1, 1], and let \psi_n(x) = n\psi(nx); thus, \psi_n is a probability density with support [-1/n, 1/n]. By Corollary 8, \psi_n * f \to f uniformly as n \to \infty. Therefore,

    \lim_{n\to\infty} \int_0^t \psi_n * f(W_s)\,ds = \int_0^t f(W_s)\,ds.

But

    \int_0^t \psi_n * f(W_s)\,ds = \int_0^t \int_{R} f(x)\psi_n(W_s - x)\,dx\,ds
                                 = \int_{R} f(x) \int_0^t \psi_n(W_s - x)\,ds\,dx
                                 = \int_{R} f(x)L_n(t; x)\,dx,

where L_n(t; x) is defined in equation (81) in the proof of Theorem 9. Since L_n(t; x) \to L(t; x) as n \to \infty, equation (82) follows.

7.3. Skorohod's Lemma. Tanaka's formula (76) relates three interesting processes: the local time process L(t; x), the reflected Brownian motion |W_t - x|, and the stochastic integral

(83)    X(t; x) := \int_0^t \mathrm{sgn}(W_s - x)\,dW_s.

Proposition 18. For each x \in R the process {X(t; x)}_{t\ge 0} is a standard Wiener process.

Proof. Since X(t; x) is given as the stochastic integral of a bounded integrand, the time-change theorem implies that it is a time-changed Brownian motion. To show that the time change is trivial it is enough to check that the accumulated quadratic variation up to time t is t. But the quadratic variation is

    [X]_t = \int_0^t \mathrm{sgn}(W_s - x)^2\,ds \le t, \qquad and \qquad E[X]_t = \int_0^t P\{W_s \ne x\}\,ds = t.


Thus, the process |x| + X(t; x) is a Brownian motion started at |x|. Tanaka's formula, after rearrangement of terms, shows that this Brownian motion can be decomposed as the difference of the nonnegative process |W_t - x| and the nondecreasing process 2L(t; x):

(84)    |x| + X(t; x) = |W_t - x| - 2L(t; x).

This is called Skorohod's equation. Skorohod discovered that there is only one such decomposition of a continuous path, and that the terms of the decomposition have a peculiar form:

Lemma 7. (Skorohod) Let w(t) be a continuous, real-valued function of t \ge 0 such that w(0) \ge 0. Then there exists a unique pair of real-valued continuous functions x(t) and y(t) such that

    (a) x(t) = w(t) + y(t);
    (b) x(t) \ge 0 and y(0) = 0;
    (c) y(t) is nondecreasing and is constant on any time interval during which x(t) > 0.

The functions x(t) and y(t) are given by

(85)    y(t) = \big( -\min_{s \le t} w(s) \big) \vee 0 \qquad and \qquad x(t) = w(t) + y(t).

Proof. First, let's verify that the functions x(t) and y(t) defined by (85) satisfy properties (a)-(c). It is easy to see that if w(t) is continuous then its minimum to time t is also continuous; hence, both x(t) and y(t) are continuous. Up to the first time t = t_0 that w(t) = 0, the function y(s) will remain at the value 0, and so x(s) = w(s) \ge 0 for all s \le t_0. For t \ge t_0,

    y(t) = -\min_{s \le t} w(s) \qquad and \qquad x(t) = w(t) - \min_{s \le t} w(s);

hence, x(t) \ge 0 for all t \ge t_0. Clearly, y(t) never decreases; and after time t_0 it increases only when w(t) is at its minimum, so that x(t) = 0. Thus, (a)-(c) hold for the functions x(t) and y(t).

Now suppose that (a)-(c) hold for some other functions \tilde x(t) and \tilde y(t). By hypothesis, y(0) = \tilde y(0) = 0, so x(0) = \tilde x(0) = w(0) \ge 0. Suppose that at some time t \ge 0,

    \tilde x(t) - x(t) = \delta > 0.

Then up until the next time s > t that \tilde x = 0, the function \tilde y must remain constant, and so \tilde x - w must remain constant. But x - w = y never decreases, so up until time s the difference \tilde x - x cannot increase; at time s (if this is finite) the difference \tilde x - x must be \le 0, because \tilde x(s) = 0. Since \tilde x(t) - x(t) is a continuous function of t that begins at \tilde x(0) - x(0) = 0, it follows that the difference \tilde x(t) - x(t) can never exceed \delta. But \delta > 0 is arbitrary, so it must be that

    \tilde x(t) - x(t) \le 0 \qquad \forall\, t \ge 0.

The same argument applies when the roles of x and \tilde x are reversed. Therefore, x \equiv \tilde x.

Corollary 6. (Lévy) For x \ge 0, the law of the vector-valued process (|W_t - x|,\, 2L(t; x)) coincides with that of (W_t + x - M(t; x),\, -M(t; x)), where

(86)    M(t; x) := \min_{s \le t}(W_s + x) \wedge 0.
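Skorohod's construction (85) is a one-pass computation on a discretized path: keep a running minimum of w, set y = (-min) v 0, and x = w + y. The sketch below (path parameters arbitrary) verifies properties (a)-(c) numerically; whenever y increases, x is exactly zero, because a new running minimum forces x = w + (-w) = 0.

```python
import random

# Skorohod decomposition of a discretized random path w with w(0) = 0:
# y(t) = max(0, -min_{s<=t} w(s)), x = w + y, per equation (85).
random.seed(7)
n = 1_000
w = [0.0]
for _ in range(n):
    w.append(w[-1] + random.gauss(0.0, 0.05))
y, run_min = [], 0.0
for v in w:
    run_min = min(run_min, v)
    y.append(max(0.0, -run_min))
x = [wi + yi for wi, yi in zip(w, y)]
assert min(x) >= 0.0                                  # property (b): x >= 0
assert all(b >= a for a, b in zip(y, y[1:]))          # property (c): y nondecreasing
# property (c): y increases only at instants where x sits at zero
assert all(not (b > a) or x[i + 1] < 1e-12
           for i, (a, b) in enumerate(zip(y, y[1:])))
print("Skorohod decomposition verified on a sample path")
```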


7.4. Application: Extensions of Ito's Formula.

Theorem 10. (Extended Ito Formula) Let u : R \to R be twice differentiable (but not necessarily C^2). Assume that |u''| is integrable on any compact interval. Let W_t be a standard Brownian motion. Then

(87)    u(W_t) = u(0) + \int_0^t u'(W_s)\,dW_s + \frac{1}{2}\int_0^t u''(W_s)\,ds.

    Proof. Exercise. Hint: First show that it suffices to consider the case where u has compact support. Next, let φ_ε be a C∞ probability density with support [−ε, ε]. By Lemma 10, u ∗ φ_ε is C∞, and so Theorem 1 implies that the Ito formula (87) is valid when u is replaced by u ∗ φ_ε. Now use Theorem 8 to show that as ε → 0,

    ∫_0^t (u'' ∗ φ_ε)(Ws) ds → ∫_0^t u''(Ws) ds.
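    For intuition, the smooth case of the Ito formula is easy to verify by simulation. The sketch below (an added illustration, not part of the notes) takes u(x) = x², for which the formula reads W_t² = 2∫_0^t W_s dW_s + t, and checks it against a left-endpoint Riemann sum for the stochastic integral.

```python
import numpy as np

rng = np.random.default_rng(1)
n, t = 200_000, 1.0
dW = rng.normal(0.0, np.sqrt(t / n), n)      # Brownian increments
W = np.concatenate([[0.0], np.cumsum(dW)])

# Ito formula for u(x) = x^2:  u' = 2x, u'' = 2, so
#   W_t^2 = 2 * int_0^t W_s dW_s + t
ito_sum = np.sum(W[:-1] * dW)                # left-endpoint (Ito) sum for int W dW
lhs = W[-1] ** 2
rhs = 2.0 * ito_sum + t
assert abs(lhs - rhs) < 0.05                 # error is O(1/sqrt(n))
```

    The discrepancy lhs − rhs equals t − Σ(ΔW)², the quadratic-variation term that distinguishes the Ito calculus from the classical one; it vanishes as the mesh shrinks.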

    Tanaka's theorem shows that there is at least a reasonable substitute for the Ito formula when u(x) = |x|. This function fails to have a second derivative at x = 0, but it does have the property that its derivative is nondecreasing. This suggests that the Ito formula might generalize to convex functions u, as these also have nondecreasing (left) derivatives.

    Definition 8. A function u : R → R is convex if for any real numbers x < y and any s ∈ (0, 1),

    (88)    u(sx + (1 − s)y) ≤ s u(x) + (1 − s) u(y).

    Lemma 8. If u is convex, then it has a right derivative

    (89)    D⁺u(x) := lim_{ε→0+} [u(x + ε) − u(x)]/ε

    at every point x ∈ R. The function D⁺u(x) is nondecreasing and right continuous in x, and the two-sided derivative Du(x) exists at all but at most countably many x, namely everywhere except at those x where D⁺u(x) has a jump discontinuity.

    Proof. Exercise or check any reasonable real analysis textbook, e.g. R, ch. 5.

    Since D⁺u is nondecreasing and right continuous, it is the cumulative distribution function of a Radon measure² μ on R, that is,

    (90)    D⁺u(x) = D⁺u(0) + μ((0, x])   for x > 0,
            D⁺u(x) = D⁺u(0) − μ((x, 0])   for x ≤ 0.

    Consequently, by the Lebesgue differentiation theorem and an integration by parts (Fubini),

    (91)    u(x) = u(0) + D⁺u(0) x + ∫_{(0,x]} (x − y) dμ(y)   for x > 0,
            u(x) = u(0) + D⁺u(0) x + ∫_{(x,0]} (y − x) dμ(y)   for x ≤ 0.

    2A Radon measure is a Borel measure that attaches finite mass to any compact interval.
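    As a concrete (added, hypothetical) illustration of the representation (91): for the piecewise-linear convex function u(x) = |x| + |x − 1|, the right derivative jumps by 2 at x = 0 and by 2 at x = 1, so μ = 2δ_0 + 2δ_1 and D⁺u(0) = 0. A short script can check the representation on a grid; note that the atom of μ at 0 is already accounted for in D⁺u(0).

```python
import numpy as np

def u(x):
    return abs(x) + abs(x - 1.0)

# Second-derivative measure of u: atoms of mass 2 at the kinks 0 and 1
atoms = np.array([0.0, 1.0])
masses = np.array([2.0, 2.0])
Dplus_u0 = 0.0   # right derivative of u at 0

def u_repr(x):
    """Right-hand side of the convex representation of u."""
    if x >= 0:
        hit = (atoms > 0) & (atoms <= x)      # mu restricted to (0, x]
        tail = np.sum(masses[hit] * (x - atoms[hit]))
    else:
        hit = (atoms > x) & (atoms <= 0)      # mu restricted to (x, 0]
        tail = np.sum(masses[hit] * (atoms[hit] - x))
    return u(0.0) + Dplus_u0 * x + tail

for x in np.linspace(-3.0, 3.0, 121):
    assert abs(u(x) - u_repr(x)) < 1e-12
```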


    Observe that this exhibits u as a mixture of absolute value functions a_y(x) := |x − y|. Thus, given the Tanaka formula, the next result should come as no surprise.

    Theorem 11. (Generalized Ito Formula) Let u : R → R be convex, with right derivative D⁺u and second-derivative measure μ, as above. Let Wt be a standard Brownian motion and let L(t; x) be the associated local time process. Then

    (92)    u(Wt) = u(0) + ∫_0^t D⁺u(Ws) dWs + ∫_R L(t; x) dμ(x).

    Proof. Another exercise. (Smooth; use Ito; take limits; use Tanaka and Trotter.)

    8. Appendix: Smoothing

    The proof of Ito's formula presented above relies on the technique of mollification, in which a smooth function f is replaced by a smooth function f̃ with compact support such that f = f̃ on some specified compact set. Such a function f̃ can be obtained by setting f̃ = f v, where v is a smooth function with compact support that takes values between 0 and 1, and is such that v = 1 on a specified compact set. The next lemma implies that such functions exist.

    Lemma 9. (Mollification) For any open, relatively compact³ sets D, G ⊂ Rᵈ such that the closure D̄ of D is contained in G, there exists a C∞ function v : Rᵈ → R such that

    v(x) = 1   for all x ∈ D;
    v(x) = 0   for all x ∉ G;
    0 ≤ v(x) ≤ 1   for all x ∈ Rᵈ.

    The proof uses two related results of classical analysis:

    Lemma 10. (Smoothing) Let φ : Rᵈ → R be of class Cᵏ for some integer⁴ k ≥ 0. Assume that φ has compact support.⁵ Let μ be a finite measure on Rᵈ that is supported by a compact subset of Rᵈ. Then the convolution

    (93)    φ ∗ μ(x) := ∫ φ(x − y) dμ(y)

    is of class Cᵏ, has compact support, and its first partial derivatives satisfy

    (94)    ∂(φ ∗ μ)/∂xᵢ (x) = ∫ (∂φ/∂xᵢ)(x − y) dμ(y).

    Proof. (Sketch) The shortest proof is by induction on k. For the case k = 0 it must be shown that if φ is continuous, then so is φ ∗ μ. This follows by a routine argument from the dominated convergence theorem, using the hypothesis that the measure μ has compact support (exercise). That φ ∗ μ has compact support follows easily from the hypothesis that φ and μ both have compact support.

    ³Relatively compact means that the closure is compact.
    ⁴A function is of class Cᵏ if it has k continuous derivatives; it is of class C⁰ if it is continuous.
    ⁵The hypothesis that μ has compact support isn't really necessary, but it makes the proof a bit easier.


    Assume now that the result is true for all integers k ≤ K, and let φ be a function of class C^{K+1} with compact support. All of the first partial derivatives φᵢ := ∂φ/∂xᵢ of φ are of class C^K, and so by the induction hypothesis each convolution

    φᵢ ∗ μ(x) = ∫ φᵢ(x − y) dμ(y)

    is of class C^K and has compact support. Thus, to prove that φ ∗ μ has continuous partial derivatives of order K + 1, it suffices to prove the identity (94). By the fundamental theorem of calculus and the fact that φᵢ and φ have compact support, for any x ∈ Rᵈ

    φ(x) = −∫_0^∞ φᵢ(x + t eᵢ) dt,

    where eᵢ is the ith standard unit vector in Rᵈ. Convolving with μ and using Fubini's theorem shows that

    φ ∗ μ(x) = −∫ ∫_0^∞ φᵢ(x − y + t eᵢ) dt dμ(y)
             = −∫_0^∞ ∫ φᵢ(x − y + t eᵢ) dμ(y) dt
             = −∫_0^∞ φᵢ ∗ μ(x + t eᵢ) dt.

    This identity and the fundamental theorem of calculus (this time used in the reverse direction) imply that the identity (94) holds for each i = 1, 2, . . . , d. □
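    When μ is a finite combination of point masses, φ ∗ μ(x) = Σᵢ mᵢ φ(x − yᵢ) is an explicit finite sum, and the derivative identity (94) can be tested directly. The sketch below (an added illustration, not part of the notes) compares a central difference of φ ∗ μ with the exact sum Σᵢ mᵢ φ'(x − yᵢ), for a smooth compactly supported φ.

```python
import numpy as np

def phi(x):
    """Compactly supported test function: cos(pi x / 2)^4 on (-1, 1)."""
    out = np.zeros_like(x, dtype=float)
    m = np.abs(x) < 1.0
    out[m] = np.cos(np.pi * x[m] / 2.0) ** 4
    return out

def dphi(x):
    """Exact derivative of phi."""
    out = np.zeros_like(x, dtype=float)
    m = np.abs(x) < 1.0
    out[m] = -2.0 * np.pi * np.cos(np.pi * x[m] / 2.0) ** 3 * np.sin(np.pi * x[m] / 2.0)
    return out

# A finite measure mu: point masses m_i at locations y_i
ys = np.array([-0.3, 0.2, 0.7])
ms = np.array([0.5, 1.0, 0.25])

def conv(x):          # (phi * mu)(x) = sum_i m_i phi(x - y_i)
    return np.sum(ms * phi(x - ys))

def conv_deriv(x):    # sum_i m_i phi'(x - y_i), the claimed derivative
    return np.sum(ms * dphi(x - ys))

h = 1e-5
for x in np.linspace(-2.0, 2.0, 41):
    central = (conv(x + h) - conv(x - h)) / (2.0 * h)
    assert abs(central - conv_deriv(x)) < 1e-5
```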

    Lemma 11. (Existence of Smooth Mollifiers) There exists an even, C∞ probability density ψ(x) on R with support [−1, 1].

    Proof. Consider independent, identically distributed uniform-[−1, 1] random variables Uₙ and set

    Y = Σ_{n=1}^∞ Uₙ/2ⁿ.

    The random variable Y is certainly between −1 and 1, and its distribution is symmetric about 0, so if it has a density ψ then the density must be an even function with support [−1, 1]. To show that ψ exists and is C∞, it is enough to show that if the cumulative distribution function F of Y has m continuous derivatives then it must have m + 1. For this, observe that Y has the self-similarity property Y = U₁/2 + Y′/2, where Y′ has the same distribution as Y and is independent of U₁. Hence,

    F(x) = (1/2) ∫_{−1}^{1} F(2x − t) dt.

    It now follows by an inductive argument similar to that used in the proof of Lemma 10 that F, and hence ψ = F′, is C∞. □
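    The self-similarity identity in the proof can be checked by simulation (an added illustration, not part of the notes): sample Y by truncating the series, build the empirical distribution function F, and compare both sides of F(x) = (1/2)∫_{−1}^{1} F(2x − t) dt.

```python
import numpy as np

rng = np.random.default_rng(2)
N, terms = 200_000, 40
# Monte Carlo sample of Y = sum_{n>=1} U_n / 2^n, series truncated at 40 terms
U = rng.uniform(-1.0, 1.0, size=(N, terms))
Y = U @ (0.5 ** np.arange(1, terms + 1))

def F(x):
    """Empirical distribution function of Y."""
    return np.mean(Y <= x)

# Midpoint rule for the fixed-point identity F(x) = (1/2) int_{-1}^{1} F(2x - t) dt
m = 400
ts = -1.0 + (np.arange(m) + 0.5) * (2.0 / m)
for x in (-0.5, 0.0, 0.4):
    rhs = 0.5 * (2.0 / m) * sum(F(2.0 * x - t) for t in ts)
    assert abs(F(x) - rhs) < 0.02      # Monte Carlo error ~ 1/sqrt(N)
```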

    Corollary 7. For any ε > 0 and d ≥ 1 there exists an even, C∞ probability density on Rᵈ that is supported by the ball of radius ε centered at the origin.
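    In practice such a density is often taken to be the classical bump ψ(x) ∝ exp(−1/(1 − |x/ε|²)) on |x| < ε, a different construction from the probabilistic one in the proof of Lemma 11 but with the same properties. The grid sketch below (an added illustration, with made-up sets D = (−1, 1) and G = (−2, 2)) also shows how such a mollifier yields the cutoff function v of Lemma 9 in one dimension, by convolving it with an indicator.

```python
import numpy as np

def bump(x):
    """Classical C-infinity bump, supported on (-1, 1), unnormalized."""
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

eps, h = 0.5, 1e-3
grid = np.arange(-4.0, 4.0, h)
psi = bump(grid / eps)
psi /= psi.sum() * h                      # normalize so that psi integrates to 1

# v = (indicator of the eps-enlargement of D) convolved with psi, for D = (-1, 1)
indicator = (np.abs(grid) <= 1.0 + eps).astype(float)
v = np.convolve(indicator, psi, mode="same") * h

assert np.all(v[np.abs(grid) <= 0.99] > 1.0 - 1e-9)   # v = 1 on D
assert np.all(v[np.abs(grid) >= 2.01] < 1e-9)         # v = 0 off G = (-2, 2)
assert np.all((v > -1e-9) & (v < 1.0 + 1e-9))         # 0 <= v <= 1
```

    Convolving the indicator of the ε-enlarged set with a mollifier of radius ε is exactly the strategy suggested by Lemmas 10 and 11: the result is C∞, equals 1 on D, and vanishes outside G.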
