Lecture 3


Transcript of Lecture 3

Page 1: Lecture 3

One-Step MLE. Many estimators are consistent and asymptotically normal but not asymptotically efficient. Some of them can be improved up to asymptotically efficient ones. We observe

dXt = S(ϑ, Xt) dt + σ(Xt) dWt, X0, 0 ≤ t ≤ T.

In the regular case the MLE ϑ̂T is asymptotically normal,

L{ √T (ϑ̂T − ϑ) } ⇒ N(0, I(ϑ)⁻¹),    I(ϑ) = Eϑ( Ṡ(ϑ, ξ)/σ(ξ) )²,

where Ṡ denotes the derivative of S with respect to ϑ and ξ has the invariant density of the diffusion, and asymptotically efficient:

lim_{δ→0} lim_{T→∞} sup_{|ϑ−ϑ0|<δ} T Eϑ( ϑ̂T − ϑ )² = I(ϑ0)⁻¹.

Page 2: Lecture 3

The family of measures is LAN:

L(ϑ + u/√T, ϑ; X^T) = exp{ u ΔT(ϑ, X^T) − (u²/2) I(ϑ) + rT(ϑ, u, X^T) }.

Here rT → 0 and

ΔT(ϑ, X^T) = (1/√T) ∫_0^T [ Ṡ(ϑ, Xt)/σ(Xt)² ] [ dXt − S(ϑ, Xt) dt ] ⇒ N(0, I(ϑ)).

Then, having a consistent and asymptotically normal estimator ϑT, we construct the estimator

ϑ◦T = ϑT + ΔT(ϑT, X^T) / ( √T I(ϑT) )

(a single scoring step of Newton type, started from ϑT) and show that this estimator is asymptotically efficient.

Page 3: Lecture 3

√T (ϑ◦T − ϑ) = √T (ϑT − ϑ) + [ ΔT(ϑ)/I(ϑ) ] (1 + o(1))
    + [ 1/(I(ϑ)√T) ] ∫_0^T [ Ṡ(ϑT, Xt)/σ(Xt)² ] [ S(ϑ, Xt) − S(ϑT, Xt) ] dt (1 + o(1))

= ηT + [ ΔT(ϑ)/I(ϑ) ] (1 + o(1)) − ηT [ 1/(I(ϑ) T) ] ∫_0^T ( Ṡ(ϑ, Xt)/σ(Xt) )² dt (1 + o(1))

= [ ΔT(ϑ)/I(ϑ) ] (1 + o(1)) + o(1) ⇒ N(0, I(ϑ)⁻¹).

Here ηT = √T (ϑT − ϑ); the second line uses S(ϑ, Xt) − S(ϑT, Xt) = −Ṡ(ϑ, Xt)(ϑT − ϑ)(1 + o(1)), and the ηT terms cancel because, by the law of large numbers, (1/T) ∫_0^T ( Ṡ(ϑ, Xt)/σ(Xt) )² dt → I(ϑ).

Page 4: Lecture 3

It is easy to verify by the Itô formula that δT(θ, X^T) = ΔT(θ, X^T); the advantage of δT is that it contains no stochastic integral, so a preliminary estimator can be substituted for θ directly. Here (with ′ denoting differentiation in the space variable)

δT(θ, X^T) = (1/√T) ∫_{X0}^{XT} [ Ṡ(θ, y)/σ(y)² ] dy − (1/(2√T)) ∫_0^T Ṡ′(θ, Xt) dt
    + (1/√T) ∫_0^T Ṡ(θ, Xt) [ σ′(Xt)/σ(Xt) − S(θ, Xt)/σ(Xt)² ] dt,

and we define the one-step maximum likelihood estimator by the same formula:

ϑ◦T = ϑT + δT(ϑT, X^T) / ( √T I(ϑT) ).

We prove that

√T (ϑ◦T − ϑ) ⇒ N(0, I(ϑ)⁻¹).

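The identity δT = ΔT can be checked numerically on a simulated path. The sketch below is not from the lecture: it assumes the linear model S(θ, x) = −θx with σ(x) = 1 (so Ṡ(θ, x) = −x, Ṡ′(θ, x) = −1 and σ′ = 0) and compares the discretized stochastic-integral statistic ΔT with the Riemann-integral representation δT.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative model (an assumption of this sketch): S(theta, x) = -theta*x, sigma = 1,
# so Sdot(theta, x) = -x, Sdot'(theta, x) = -1 and sigma' = 0.
theta, T, n = 1.0, 100.0, 100_000
dt = T / n

x = np.empty(n + 1)
x[0] = 0.0
dW = rng.normal(0.0, np.sqrt(dt), n)
for k in range(n):                          # Euler-Maruyama scheme
    x[k + 1] = x[k] - theta * x[k] * dt + dW[k]

xk, dx, rt = x[:-1], np.diff(x), np.sqrt(T)

# Delta_T = (1/sqrt(T)) * int Sdot/sigma^2 [dX - S dt]  (stochastic integral, discretized)
Delta = np.sum(-xk * (dx + theta * xk * dt)) / rt

# delta_T from the Ito-formula representation (no stochastic integral):
delta = (-(x[-1] ** 2 - x[0] ** 2) / 2 / rt      # (1/sqrt(T)) int_{X0}^{XT} (-y) dy
         + rt / 2                                 # -(1/(2 sqrt(T))) int (-1) dt
         - theta * np.sum(xk ** 2) * dt / rt)     # (1/sqrt(T)) int (-x)(theta*x) dt

print(Delta, delta)  # the two statistics agree up to discretization error
```

On a fine grid the two values differ only through the discretization of the quadratic variation, i.e. by a term of order √dt.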

Page 5: Lecture 3

Example. Let

dXt = −(Xt − ϑ)³ dt + σ dWt, X0, 0 ≤ t ≤ T.

The MLE cannot be written in explicit form, but the EMM (estimator of the method of moments)

ϑT = (1/T) ∫_0^T Xt dt

is uniformly consistent and asymptotically normal. The one-step MLE is

ϑ◦T = ϑT − [ 3/(σ² I T) ] ∫_0^T (ϑT − Xt)⁵ dt,

where the factor 3 comes from Ṡ(ϑ, x) = 3(x − ϑ)² and I = Eϑ( Ṡ(ϑ, ξ)/σ )².

This estimator is consistent and asymptotically normal:

√T (ϑ◦T − ϑ) ⇒ N(0, I⁻¹).
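The example can be tried numerically. The sketch below is an illustration with assumed values (seed, horizon, step size), not part of the lecture: it simulates the SDE by Euler-Maruyama, takes the EMM as the time average of the path, and applies the one-step correction ϑT + δT(ϑT, X^T)/(√T Î), where δT is taken from the previous page and Î = (9/σ²)(1/T)∫(Xt − ϑT)⁴ dt is a plug-in estimate of the information, both assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

theta, sigma = 1.0, 1.0          # true parameter and noise level (illustrative values)
T, n = 200.0, 200_000
dt = T / n

# Euler-Maruyama path of  dX = -(X - theta)^3 dt + sigma dW
x = np.empty(n + 1)
x[0] = theta
dW = rng.normal(0.0, np.sqrt(dt), n)
for k in range(n):
    x[k + 1] = x[k] - (x[k] - theta) ** 3 * dt + sigma * dW[k]

xk, rt = x[:-1], np.sqrt(T)

# EMM: time average of the path
theta_emm = np.sum(xk) * dt / T

# plug-in Fisher information: I = E (Sdot/sigma)^2 with Sdot(theta, x) = 3(x - theta)^2
I_hat = 9.0 / sigma**2 * np.sum((xk - theta_emm) ** 4) * dt / T

# delta_T(theta_emm, X^T) from the Ito-formula representation (sigma' = 0 here)
y = xk - theta_emm
delta = (((x[-1] - theta_emm) ** 3 - (x[0] - theta_emm) ** 3) / sigma**2 / rt  # int Sdot/sigma^2 dy
         - 0.5 / rt * np.sum(6.0 * y) * dt                                     # -(1/2) int Sdot' dt
         + 1.0 / rt * np.sum(3.0 * y ** 5) / sigma**2 * dt)                    # int Sdot * (-S/sigma^2) dt

# one-step MLE
theta_one = theta_emm + delta / (rt * I_hat)

print(theta_emm, theta_one)   # both close to the true value theta = 1.0
```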


Page 6: Lecture 3

Lower Bounds.

The first one is the Cramér–Rao bound. Suppose that the observed diffusion process is

dXt = S(ϑ, Xt) dt + σ(Xt) dWt, X0, 0 ≤ t ≤ T,

and we have to estimate some continuously differentiable function ψ(ϑ), ϑ ∈ Θ ⊂ R. For an estimator ψT = ψT(X^T) we have

∂ϑ EϑψT = Eϑ( ΔT(ϑ, X^T) ψT ),

where

ΔT(ϑ, X^T) = ḟ(ϑ, X0)/f(ϑ, X0) + ∫_0^T [ Ṡ(ϑ, Xt)/σ(Xt)² ] [ dXt − S(ϑ, Xt) dt ]

and f(ϑ, ·) denotes the density of the initial value X0.

Page 7: Lecture 3

Then we can write

∂ϑ EϑψT = Eϑ( ΔT(ϑ, X^T) ψT ) = Eϑ( ΔT(ϑ, X^T) [ ψT − EϑψT ] )

≤ ( Eϑ[ ψT − EϑψT ]² )^{1/2} ( Eϑ ΔT(ϑ, X^T)² )^{1/2}

(the first equality holds since Eϑ ΔT(ϑ, X^T) = 0), and

Eϑ ΔT(ϑ, X^T)² = Eϑ( ḟ(ϑ, X0)/f(ϑ, X0) )² + T Eϑ( Ṡ(ϑ, ξ)/σ(ξ) )² = IT(ϑ).

Hence, writing b(ϑ) = EϑψT − ψ(ϑ) for the bias, so that ∂ϑ EϑψT = ψ̇(ϑ) + ḃ(ϑ),

Eϑ[ ψT − EϑψT ]² ≥ [ ψ̇(ϑ) + ḃ(ϑ) ]² / IT(ϑ),

Page 8: Lecture 3

Using the equality Eϑ[ ψT − ψ(ϑ) − b(ϑ) ]² = Eϑ[ ψT − ψ(ϑ) ]² − b(ϑ)², we finally obtain

Eϑ[ ψT − ψ(ϑ) ]² ≥ [ ψ̇(ϑ) + ḃ(ϑ) ]² / IT(ϑ) + b(ϑ)²,

which is called the Cramér–Rao inequality. If ψ(ϑ) = ϑ it becomes

Eϑ[ ϑT − ϑ ]² ≥ [ 1 + ḃ(ϑ) ]² / IT(ϑ) + b(ϑ)²,

and this last inequality is sometimes used to define an asymptotically efficient estimator ϑT as an estimator satisfying, for any ϑ ∈ Θ, the relation

lim_{T→∞} T Eϑ[ ϑT − ϑ ]² = 1/I(ϑ)    (wrong!).

Page 9: Lecture 3

Due to the well-known Hodges example (an estimator can be superefficient at individual parameter points) this definition is not satisfactory. Therefore we use another bound (inequality), called the Hajek–Le Cam bound. For the quadratic loss function this lower bound is: for any estimator ϑT and any ϑ0 ∈ Θ,

lim_{δ→0} lim_{T→∞} sup_{|ϑ−ϑ0|<δ} T Eϑ[ ϑT − ϑ ]² ≥ 1/I(ϑ0).

It can be considered as an asymptotic minimax version of the Cramér–Rao inequality.
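The lecture cites the Hodges example only by name. As a sketch in the simplest i.i.d. N(ϑ, 1) setting (all numerical choices below are assumptions of this illustration; there I(ϑ) = 1), the estimator that rounds the sample mean to 0 whenever |X̄| < n^{−1/4} is superefficient at ϑ = 0, yet its normalized risk blows up at nearby points, which is exactly what the sup over a shrinking neighbourhood detects:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 10_000, 20_000   # sample size and Monte Carlo repetitions (illustrative)

def hodges_risk(theta):
    """Normalized risk n*E(est - theta)^2 of the Hodges estimator
    est = mean if |mean| >= n^(-1/4) else 0, for n iid N(theta, 1) observations."""
    xbar = theta + rng.normal(0.0, 1.0 / np.sqrt(n), reps)  # sample means, drawn directly
    est = np.where(np.abs(xbar) >= n ** -0.25, xbar, 0.0)
    return n * np.mean((est - theta) ** 2)

print(hodges_risk(0.0))        # ~0: beats the bound 1/I = 1 at this single point
print(hodges_risk(0.5))        # ~1: behaves like the sample mean away from 0
print(hodges_risk(n ** -0.25)) # >> 1: the risk explodes near the threshold
```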

To prove it we need the van Trees lower bound. Suppose that the unknown parameter ϑ ∈ Θ = (α, β) is a random variable with density p(ϑ), p(α) = 0 = p(β), and finite Fisher information

Ip = ∫_α^β [ ṗ(θ)²/p(θ) ] dθ < ∞.
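For a concrete prior (the choice below is this sketch's assumption, not from the lecture) Ip can be evaluated numerically. The density p(v) = cos²(πv/2) on [−1, 1] vanishes at both endpoints and has Ip = π²:

```python
import numpy as np

# Assumed example prior on [-1, 1], vanishing at the endpoints: p(v) = cos^2(pi v / 2)
p  = lambda v: np.cos(np.pi * v / 2) ** 2
dp = lambda v: -(np.pi / 2) * np.sin(np.pi * v)   # its derivative p'(v)

# Midpoint-rule quadrature of I_p = int_{-1}^{1} p'(v)^2 / p(v) dv;
# the integrand equals pi^2 * sin(pi v / 2)^2, so I_p = pi^2 exactly.
n = 2000
v = -1.0 + (2.0 * np.arange(n) + 1.0) / n         # midpoints of a uniform grid on [-1, 1]
I_p = np.sum(dp(v) ** 2 / p(v)) * (2.0 / n)

print(I_p)   # close to pi^2 = 9.8696...
```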


Page 10: Lecture 3

Further we suppose that

∂ϑ L(ϑ, ϑ1; X^T) = Δ(ϑ, X^T) L(ϑ, ϑ1; X^T),

where L(ϑ, ϑ1; X^T) is the likelihood ratio. Then we can write (integrating by parts; the boundary term vanishes since p(α) = p(β) = 0)

∫_α^β ψ(ϑ) (∂/∂ϑ)[ L(ϑ, ϑ1; X^T) p(ϑ) ] dϑ = ψ(ϑ) L(ϑ, ϑ1; X^T) p(ϑ) |_α^β − ∫_α^β ψ̇(ϑ) L(ϑ, ϑ1; X^T) p(ϑ) dϑ

= − ∫_α^β ψ̇(ϑ) L(ϑ, ϑ1; X^T) p(ϑ) dϑ.

In a similar way

Eϑ1 ∫_α^β ( ψT − ψ(ϑ) ) (∂/∂ϑ)[ L(ϑ, ϑ1; X^T) p(ϑ) ] dϑ = Eϑ1 ∫_α^β ψ̇(ϑ) L(ϑ, ϑ1; X^T) p(ϑ) dϑ

= ∫_α^β ψ̇(ϑ) p(ϑ) dϑ = EP ψ̇(ϑ).

Page 11: Lecture 3

The Cauchy–Schwarz inequality gives us

( EP ψ̇(ϑ) )² ≤ Eϑ1 ∫_α^β ( ψT − ψ(ϑ) )² L(ϑ, ϑ1; X^T) p(ϑ) dϑ

× Eϑ1 ∫_α^β ( (∂/∂ϑ) ln[ L(ϑ, ϑ1; X^T) p(ϑ) ] )² L(ϑ, ϑ1; X^T) p(ϑ) dϑ.

For the first integral we have

Eϑ1 ∫_α^β ( ψT − ψ(ϑ) )² L(ϑ, ϑ1; X^T) p(ϑ) dϑ = ∫_α^β Eϑ( ψT − ψ(ϑ) )² p(ϑ) dϑ = E( ψT − ψ(ϑ) )²,

and for the second integral we obtain EP IT(ϑ) + Ip.

Page 12: Lecture 3

Therefore

E( ψT − ψ(ϑ) )² ≥ ( EP ψ̇(ϑ) )² / ( EP IT(ϑ) + Ip ).

This lower bound is due to van Trees (1968) and is called the van Trees inequality, the global Cramér–Rao bound, the integral-type Cramér–Rao inequality, or the Bayesian Cramér–Rao bound. If we need to estimate ϑ only, then it becomes

E( ϑT − ϑ )² ≥ 1 / ( EP IT(ϑ) + Ip ).

The main advantage of this inequality is that the right-hand side does not depend on the properties of the estimators (say, on the bias) and so is the same for all estimators. It is widely used in asymptotic nonparametric statistics. In particular, it gives the Hajek–Le Cam inequality in the following elementary way.

Page 13: Lecture 3

Let us introduce a random variable η with density function p(v), v ∈ [−1, 1], such that p(−1) = p(1) = 0 and the Fisher information Ip < ∞. Fix some δ > 0, put ϑ = θ0 + δη (so the prior density of ϑ has Fisher information δ⁻² Ip) and write E for the expectation with respect to the joint distribution of X^T and η. Then we have

lim_{T→∞} sup_{|θ−θ0|<δ} T Eθ( ϑT − θ )² ≥ lim_{T→∞} T E( ϑT − ϑ )²

≥ lim_{T→∞} T / ( EP IT(ϑ) + δ⁻² Ip ) = 1 / ∫_{−1}^{1} I(θ0 + δu) p(u) du.

Hence from the continuity of the function I(·), as δ → 0 we obtain

lim_{δ→0} lim_{T→∞} sup_{|θ−θ0|<δ} T Eθ( ϑT − θ )² ≥ 1/I(θ0).