Extreme Value Theory in Time Series Analysis
Richard A. Davis
Columbia University
http://www.stat.columbia.edu/~rdavis
Game Plan
• Quick overview of classical extreme value theory
  - Extremal types theorem
  - Generalized extreme value distributions
  - Examples
  - Domain of attraction
  - Generalized Pareto distribution
• Point Processes
  - Basic results
  - Application to extremes
• Extensions to stationary time series
  - Extremal index
  - Mixing conditions
  - Point process examples
• Regular variation
  - Univariate
  - Multivariate
  - Estimation
  - Examples
• Application to Crystal River Flows
• Bonus example!!
References
References
Bingham, N.H., Goldie, C.M. and Teugels, J.L. (1987) Regular Variation. Cambridge
University Press, Cambridge
Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997) Modelling Extremal Events for
Insurance and Finance. Springer, Berlin.
Galambos, J. (1978) The Asymptotic Theory of Order Statistics. Wiley, New York.
Haan, L. de, Fereira, A. (2006) Extreme Value Theory: An Introduction. Springer,
Berlin, New York.
Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983) Extremes and Related
Properties of Random Sequences and Processes. Springer, Berlin.
Resnick, S.I. (1986) Point processes, regular variation and weak convergence. Adv.
Appl. Prob. 18, 66–138.
Resnick, S.I. (1987) Extreme Values, Regular Variation, and Point Processes.
Springer, New York.
Resnick, S.I. (20067 Heavy Tail Phenomena: Probabilistic and Statistical Modeling.
Springer, New York.
Samorodnitsky, G. and Taqqu, M.S. (1994) Stable Non–Gaussian Random
Processes. Stochastic Models with Infinite Variance. Chapman and Hall, London.
Classical Extreme Value Theory
Setup:
• {X_t} ~ IID(F)
• M_n = max{X_1, …, X_n}
• P(M_n ≤ x) = F^n(x)
For the right-hand side to converge to a nonzero limit, we must let x → ∞ with n. Replacing x by u_n,
$$P(M_n \le u_n) = F^n(u_n) = \bigl(1 - (1 - F(u_n))\bigr)^n = \Bigl(1 - \frac{n(1 - F(u_n))}{n}\Bigr)^n \to e^{-\tau}$$
if and only if
$$n\bigl(1 - F(u_n)\bigr) \to \tau.$$
Classical EVT— Extremal Types Theorem
Convergence of types: Now taking u_n = a_n x + b_n, a_n > 0,
$$P\bigl(a_n^{-1}(M_n - b_n) \le x\bigr) = F^n(a_n x + b_n) \to G(x)$$
if and only if
$$n\bigl(1 - F(a_n x + b_n)\bigr) \to -\log G(x).$$
Theorem. If G is a nondegenerate distribution, then G has to be one of the three types:
1. G(x) = exp(-e^{-x}) (Gumbel)
2. G(x) = exp(-x^{-α}), x > 0 (Fréchet)
3. G(x) = exp(-(-x)^{α}), x ≤ 0 (Weibull)
Classical EVT— GEV
The three types of extreme value distributions can be parameterized as one family (the GEV family) via
$$G(x) = \exp\bigl(-(1 + \xi x)^{-1/\xi}\bigr), \qquad 1 + \xi x > 0,$$
where
ξ = 0 gives G(x) = exp(-e^{-x}) (Gumbel),
ξ > 0 gives the Fréchet case (α = 1/ξ),
ξ < 0 gives the Weibull case (x < -1/ξ, α = -1/ξ).
Summary
• Gumbel — light tailed
• Fréchet — heavy tailed
• Weibull — finite right endpoint
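Not from the slides: a minimal numerical sketch of the GEV family using scipy.stats.genextreme; note that scipy's shape parameter c corresponds to -ξ in the parameterization above.

```python
from scipy.stats import genextreme

# G(x) = exp(-(1 + xi*x)^(-1/xi)); scipy's genextreme uses shape c = -xi,
# so c = 0 is Gumbel, c < 0 is Frechet-type, c > 0 is Weibull-type.
for xi, name in [(0.0, "Gumbel (xi = 0)"),
                 (0.5, "Frechet-type (xi > 0, alpha = 1/xi)"),
                 (-0.5, "Weibull-type (xi < 0, alpha = -1/xi)")]:
    print(f"{name}: G(1) = {genextreme.cdf(1.0, c=-xi):.4f}")
```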
Classical EVT— Examples
In looking at maxima, one usually considers only the Gumbel and Fréchet distributions for modeling.
Examples
1. Uniform. F(x) = x, 0 < x < 1. With a_n = 1/n, b_n = 1, we have for n large
$$n\bigl(1 - F(a_n x + b_n)\bigr) = -x = -\log G(x) = -\log\bigl(\exp(-(-x)^{1})\bigr),$$
so P(n(M_n - 1) ≤ x) → e^{x} (x ≤ 0): a Weibull limit.
2. Exponential. F(x) = 1 - e^{-x}, x > 0. With a_n = 1, b_n = log n, we have
$$n\bigl(1 - F(a_n x + b_n)\bigr) = n e^{-x - \log n} = e^{-x} = -\log\bigl(\exp(-e^{-x})\bigr),$$
so P(M_n - log n ≤ x) → exp(-e^{-x}) (Gumbel limit).
Application to Fisher’s test for hidden periodicity in time series (see
B&D (1991) and Davis and Mikosch (1999)).
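A quick simulation sketch (not part of the original slides; the sample size, number of replications, and seed are arbitrary choices) of Example 2: the centered maximum of iid Exp(1) data is approximately Gumbel.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 10_000
# M_n - log n for iid Exp(1) samples of size n
M = rng.exponential(size=(reps, n)).max(axis=1) - np.log(n)

for x in (0.0, 1.0, 2.0):
    emp = (M <= x).mean()
    lim = np.exp(-np.exp(-x))          # Gumbel cdf exp(-e^{-x})
    print(f"x = {x}: empirical {emp:.3f} vs limit {lim:.3f}")
```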
Classical EVT— Examples
Exponential: P(M_n - log n ≤ x) → exp(-e^{-x}).
[Figure: empirical distribution of M_n - log n, n = 100, against the limit density.]
Classical EVT— Examples
3. Pareto. F(x) = 1 - x^{-α}, x > 1. With a_n = n^{1/α}, b_n = 0, we have for n large
$$n\bigl(1 - F(a_n x)\bigr) = x^{-\alpha} = -\log G(x) = -\log\bigl(\exp(-x^{-\alpha})\bigr),$$
so P(n^{-1/α} M_n ≤ x) → exp(-x^{-α}) (x > 0): a Fréchet limit.
[Figure: empirical density (blue) against the limit density (red).]
Classical EVT— Examples
4. Gaussian. With
$$a_n = (2\log n)^{-1/2}, \qquad b_n = (2\log n)^{1/2} - \tfrac{1}{2}(2\log n)^{-1/2}\bigl(\log\log n + \log(4\pi)\bigr),$$
we have n(1 - F(a_n x + b_n)) → e^{-x} = -log(exp(-e^{-x})), so
$$P\bigl((2\log n)^{1/2}(M_n - b_n) \le x\bigr) \to \exp(-e^{-x}) \quad \text{(Gumbel limit)}.$$
[Figures: empirical density and cdf of the normalized maximum against the Gumbel limit, n = 100.]
Classical EVT— Examples
[Figures: empirical density and cdf of the normalized Gaussian maximum against the Gumbel limit for n = 1000 and n = 10000.]
Classical EVT— Examples
[Figures: empirical density and cdf of the normalized Gaussian maximum against the Gumbel limit for n = 1000 and n = 100000.]
Classical EVT— Domain of Attraction
If
$$P\bigl((M_n - b_n)/a_n \le x\bigr) = F^n(a_n x + b_n) \to G(x),$$
which is equivalent to
$$n\bigl(1 - F(a_n x + b_n)\bigr) \to -\log G(x) = (1 + \xi x)^{-1/\xi},$$
then we say that F (or X) is in the domain of attraction of G (written F or X ∈ D(G)).
Note that if x = 0, then
$$n\bigl(1 - F(b_n)\bigr) \to 1, \quad \text{or} \quad P(X_1 > b_n) \sim 1/n.$$
It follows that
$$P(X_1 - b_n > a_n x \mid X_1 > b_n) \to (1 + \xi x)^{-1/\xi}.$$
Classical EVT— GPD
$$P(X_1 - b_n > a_n x \mid X_1 > b_n) \to (1 + \xi x)^{-1/\xi}$$
Replacing b_n with u, it follows that the conditional distribution of X_1 - u given X_1 > u has an approximate generalized Pareto distribution,
$$P(X_1 - u \le a_n x \mid X_1 > u) \approx H(x) = 1 - (1 + \xi x)^{-1/\xi},$$
for u large.
Generalized Pareto Distribution (GPD)
$$H(x) = 1 - \bigl(1 + \xi (x - \mu)/\sigma\bigr)^{-1/\xi}$$
The limit above is the basis for fitting the tail of the distribution of a random variable using a GPD. There are myriad procedures for estimating the parameters in this model.
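One such procedure, sketched here with illustrative choices only (the simulated data, the 95% threshold, and the use of scipy.stats.genpareto are assumptions, not the slides' own analysis): fit a GPD to the exceedances over a high threshold u.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
x = rng.pareto(3.0, size=20_000) + 1.0      # heavy-tailed sample with tail index alpha = 3
u = np.quantile(x, 0.95)                    # threshold choice is up to the analyst
exc = x[x > u] - u                          # exceedances X - u given X > u

xi_hat, _, sigma_hat = genpareto.fit(exc, floc=0)   # fit GPD with location fixed at 0
print(f"u = {u:.2f}, xi_hat = {xi_hat:.3f} (true 1/alpha = {1/3:.3f}), sigma_hat = {sigma_hat:.3f}")
```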
Classical EVT— Domains of Attraction
Domains of attraction: There are necessary and sufficient conditions for F ∈ D(G) for each of the three extreme value distributions. The heavy-tailed Fréchet, which is perhaps the most commonly used extreme value distribution, has the easiest n.a.s. condition to state (and check!). In this case,
F ∈ D(exp(-x^{-α})) if and only if F is RV(α) for some α > 0.
Regular variation: F is RV(α) if and only if
$$\frac{\bar F(tx)}{\bar F(t)} = \frac{P(X > tx)}{P(X > t)} \to x^{-\alpha}, \qquad \text{as } t \to \infty,$$
for every x > 0.
Classical EVT — Domains of Attraction
Examples:
• Pareto
• log-gamma
• infinite-variance stable
• Cauchy
• Student's t
• Fréchet
Remarks:
• P(X > x) = L(x) x^{-α}, where L is slowly varying, i.e., L(tx)/L(t) → 1 (think of the log function for L).
• One can take b_n = 0 and a_n as the 1 - 1/n quantile of F, i.e., n(1 - F(a_n)) → 1. It turns out that a_n = L_1(n) n^{1/α} for some slowly varying L_1.
• P(a_n^{-1} M_n ≤ x) → exp(-x^{-α}).
Point Processes
The theory of point processes plays a central role in extreme value
theory. Applications include:
• Derivation of joint limiting distribution of order statistics, i.e.,
kth largest order statistic, limiting distribution of maximum
and minimum, etc.
• Calculation of limit distribution of exceedances of a high
level.
• Extensions to stationary processes.
• Provides a useful tool in the heavy-tailed case for deriving the limiting behavior of various statistics (e.g., the sample mean and sample autocovariances), which are often determined by the behavior of the extreme order statistics.
Point Processes — Definition and Basic Results.
Setup:
• Suppose (X_t) is an iid sequence with common distribution F.
• F ∈ D(exp(-x^{-α})). (This can be formulated for a general extreme value distribution, but for simplicity we restrict attention to the Fréchet case.)
We have
$$n\bigl(1 - F(a_n x)\bigr) = n P(a_n^{-1} X > x) \to x^{-\alpha},$$
from which it follows that
$$n P\bigl(a_n^{-1} X \in (a, b]\bigr) \to a^{-\alpha} - b^{-\alpha} =: \nu(a, b],$$
where ν is the measure on (0, ∞] given by ν(dx) = α x^{-α-1} dx.
More generally, we have
$$n P(a_n^{-1} X \in B) \to \nu(B)$$
for all nice sets B, i.e., sets bounded away from 0.
Point Processes — Definition and Basic Results.
Since
$$n P(a_n^{-1} X \in B) \to \nu(B)$$
for nice sets B, it follows that
$$n P(a_n^{-1} X \in \cdot\,) \xrightarrow{v} \nu(\cdot),$$
where →_v denotes vague convergence of measures.
Now, for sets B bounded away from 0, define the sequence of point processes by
$$N_n(B) = \#\{t = 1, \dots, n : a_n^{-1} X_t \in B\} = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t}(B),$$
where ε_x is the Dirac measure at x.
Point Processes — Definition and Basic Results.
$$N_n(B) = \#\{t = 1, \dots, n : a_n^{-1} X_t \in B\} = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t}(B)$$
Properties:
• N_n(B) ~ Bin(n, p_n), where p_n = P(a_n^{-1} X ∈ B), and since
$$n P(a_n^{-1} X \in B) \to \nu(B),$$
it follows from the convergence of the binomial to the Poisson that
$$N_n(B) \xrightarrow{d} N(B),$$
where N(B) is a Poisson random variable with mean ν(B).
• In fact, we have the stronger point process convergence
$$N_n \xrightarrow{d} N,$$
where N is a Poisson process on (0, ∞] with mean measure ν(dx), and →_d denotes convergence in distribution of point processes.
Point Processes — Definition and Basic Results.
Convergence of point processes: For our purposes, →_d for point processes means that for any collection of Borel sets B_1, …, B_k that are bounded away from 0 and satisfy P(N(∂B_j) > 0) = 0, j = 1, …, k, we have
$$(N_n(B_1), \dots, N_n(B_k)) \xrightarrow{d} (N(B_1), \dots, N(B_k)).$$
Point Processes — Application to Extremes
Application
Define M_{n,2} to be the second largest among X_1, …, X_n. Since the event
{a_n^{-1} M_{n,2} ≤ y} is the same as {N_n(y, ∞) ≤ 1},
we conclude from the point process convergence that
$$P(a_n^{-1} M_{n,2} \le y) = P\bigl(N_n(y, \infty) \le 1\bigr) \to P\bigl(N(y, \infty) \le 1\bigr) = \exp(-y^{-\alpha})\bigl(1 - \log(\exp(-y^{-\alpha}))\bigr) = \exp(-y^{-\alpha})(1 + y^{-\alpha}).$$
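A small simulation check of this limit (illustrative only, with arbitrary choices of α, n, and the number of replications) in the Pareto(α) case, where a_n = n^{1/α}.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, n, reps = 2.0, 1000, 5000
a_n = n ** (1 / alpha)
X = (1 - rng.random((reps, n))) ** (-1 / alpha)   # Pareto: P(X > x) = x^(-alpha), x >= 1
M2 = np.sort(X, axis=1)[:, -2] / a_n              # scaled second-largest order statistic

for y in (0.5, 1.0, 2.0):
    emp = (M2 <= y).mean()
    lim = np.exp(-y ** -alpha) * (1 + y ** -alpha)
    print(f"y = {y}: empirical {emp:.3f} vs limit {lim:.3f}")
```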
Point Processes — Application to Extremes
Application
Similarly, the joint limiting distribution of (M_n, M_{n,2}) can be calculated by noting that, for y ≤ x,
{a_n^{-1} M_n ≤ x, a_n^{-1} M_{n,2} ≤ y} is the same as {N_n(x, ∞) = 0, N_n(y, x] ≤ 1}.
Hence,
$$P(a_n^{-1} M_n \le x,\ a_n^{-1} M_{n,2} \le y) = P\bigl(N_n(x, \infty) = 0,\ N_n(y, x] \le 1\bigr) \to P\bigl(N(x, \infty) = 0,\ N(y, x] \le 1\bigr) = \exp(-y^{-\alpha})\bigl(1 + y^{-\alpha} - x^{-\alpha}\bigr).$$
Point Processes — Representation of Limit.
Representation of the limit Poisson process
The points of the limit Poisson process can be displayed in an explicit fashion. Set
$$\Gamma_k = E_1 + \cdots + E_k,$$
where E_1, E_2, … are iid unit exponentials. Then the limit Poisson process N has the representation
$$N = \sum_{k=1}^{\infty} \varepsilon_{\Gamma_k^{-1/\alpha}},$$
and
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} \xrightarrow{d} N = \sum_{k=1}^{\infty} \varepsilon_{\Gamma_k^{-1/\alpha}}.$$
If we order the data, then we can read off the weak convergence of the kth largest, M_{n,k}, i.e.,
$$a_n^{-1} M_{n,k} \xrightarrow{d} \Gamma_k^{-1/\alpha} \quad \text{(jointly in } k\text{)}.$$
Point Processes — Exceedances
Point Process of Exceedances
Sometimes it is convenient to consider the two-dimensional point process defined by
$$N_n = \sum_{t=1}^{n} \varepsilon_{(t/n,\ a_n^{-1} X_t)}.$$
For x > 0 fixed, the point process
$$N_n\bigl(\cdot \times (x, \infty]\bigr) = \sum_{t=1}^{n} \varepsilon_{t/n}(\cdot)\, 1\bigl(a_n^{-1} X_t > x\bigr)$$
is the point process of exceedances of the level a_n x. This point process converges in distribution to a homogeneous Poisson process with intensity x^{-α}.
Bottom line: the point process of exceedances is approximately Poisson.
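An illustrative check (not the slides' code; the Pareto data, the level x = 1, and the sample sizes are arbitrary choices): the number of exceedances of a_n x among n iid observations should be approximately Poisson with mean x^{-α}.

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, n, reps, x = 2.0, 1000, 5000, 1.0
a_n = n ** (1 / alpha)
X = (1 - rng.random((reps, n))) ** (-1 / alpha)   # iid Pareto(alpha) rows
counts = (X > a_n * x).sum(axis=1)                # exceedances of a_n * x per row

print("mean count:", counts.mean().round(3), " Poisson mean x^-alpha =", x ** -alpha)
print("var  count:", counts.var().round(3), " (should be close to the mean for Poisson)")
```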
Point Processes — An Application
Cool application
If α < 1, then
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} \xrightarrow{d} N = \sum_{k=1}^{\infty} \varepsilon_{\Gamma_k^{-1/\alpha}}$$
implies, by an application of the continuous mapping theorem, that
$$a_n^{-1} \sum_{t=1}^{n} X_t \xrightarrow{d} \sum_{k=1}^{\infty} \Gamma_k^{-1/\alpha}.$$
The limit random variable is positive stable with index α.
Extension to Stationary Time Series
Let (X_t) be a strictly stationary sequence with common df F ∈ D(G), i.e.,
$$F^n(a_n x + b_n) \to G(x).$$
Theorem. If (X_t) satisfies a mixing condition (like strong mixing) and
$$P\bigl(a_n^{-1}(M_n - b_n) \le x\bigr) \to H(x),$$
with H nondegenerate, then there exists a θ ∈ (0, 1] such that
$$H(x) = G^{\theta}(x).$$
The parameter θ is called the extremal index and is a measure of extremal clustering.
Extension to Stationary Time Series—Extremal Index
$$F^n(a_n x + b_n) \to G(x) \quad \Longrightarrow \quad P\bigl(a_n^{-1}(M_n - b_n) \le x\bigr) \to G^{\theta}(x).$$
Properties
• θ < 1 implies clustering of exceedances.
• Suppose c is a threshold such that F^n(c) ≈ .95 and θ = .5. Then P(M_n ≤ c) ≈ .95^{1/2} ≈ .975.
• 1/θ is the mean cluster size of exceedances.
• In a certain sense, one can view θ as a measure of statistical efficiency relative to the iid case: one needs 1/θ times as many observations to match the behavior of the iid case. Specifically,
$$P(M_{[n/\theta]} \le x) \approx F^n(x).$$
Extension to Stationary Time Series—Example
Example (max-moving average). Let (Z_t) be iid with a Pareto distribution, i.e., P(Z_1 > x) = x^{-α} for x ≥ 1, and set
$$X_t = \max(Z_t, \varphi Z_{t-1}), \qquad \varphi \in [0, 1].$$
Then
$$n P(X_1 > x n^{1/\alpha}) \to (1 + \varphi^{\alpha}) x^{-\alpha} \quad \text{and} \quad F^n(a_n x) \to \exp\bigl(-(1 + \varphi^{\alpha}) x^{-\alpha}\bigr).$$
On the other hand,
$$P(n^{-1/\alpha} M_n \le x) = P\bigl(n^{-1/\alpha} \max(Z_0, \dots, Z_n) \le x\bigr) \to \exp(-x^{-\alpha}).$$
Thus θ = 1/(1 + φ^α).
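A simulation sketch (not from the slides; the simple blocks estimator, the 99% threshold, and the block length are assumed, illustrative choices) of the extremal index θ = 1/(1 + φ^α) for this model.

```python
import numpy as np

rng = np.random.default_rng(4)
alpha, phi, n = 3.0, 1.0, 100_000
Z = (1 - rng.random(n + 1)) ** (-1 / alpha)      # Pareto(alpha) noise
X = np.maximum(Z[1:], phi * Z[:-1])              # X_t = max(Z_t, phi * Z_{t-1})

u = np.quantile(X, 0.99)                         # high threshold
exceed = X > u
block = 100                                      # rough blocks estimate of theta:
blocks_hit = exceed.reshape(-1, block).any(axis=1).sum()   # blocks with >= 1 exceedance
theta_hat = blocks_hit / exceed.sum()            # divided by total number of exceedances

print(f"theta_hat = {theta_hat:.3f}, theory 1/(1+phi^alpha) = {1 / (1 + phi ** alpha):.3f}")
```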
Extension to Stationary Time Series—Example
iid (Pareto, α = 3), θ = 1, versus max-moving average (φ = 1), θ = 1/2.
[Figures: sample paths of the two series.]
Note that the cluster size is exactly 2 in this case.
Extension to Stationary Time Series—Mixing Conditions
Strong mixing:
$$\alpha(k) = \sup_{A \in \sigma(X_s,\, s \le 0),\ B \in \sigma(X_s,\, s \ge k)} \bigl|P(A \cap B) - P(A)P(B)\bigr| \to 0 \quad \text{as } k \to \infty.$$
Remarks:
• Since mixing is defined via σ-fields, measurable functions of (X_t) inherit the same mixing property. For example, if the stationary sequence (X_t) is strongly mixing, then so are (|X_t|) and (X_t²), with a rate function of similar order.
• If (α_k) decays to zero at an exponential rate, (X_t) is strongly mixing with geometric rate, i.e., the memory between past and future dies out exponentially fast.
• Strong mixing is much stronger than Leadbetter's dependence condition D(u_n).
Extension to Stationary Time Series—Mixing Conditions
• The rate function is closely related to the ACVF of the process: if E|X_1|^{2+δ} < ∞ for some δ > 0, then
$$|\mathrm{Cov}(X_0, X_h)| \le c\, \alpha_h^{\delta/(2+\delta)}.$$
• Many commonly used time series models, such as ARMA, GARCH, stochastic volatility, and Markov processes, are strongly mixing with a geometric rate.
Extension to Stationary Time Series—D’
Anti-clustering condition D′(u_n): Think of u_n as a_n x + b_n. The condition requires
$$\limsup_{n \to \infty}\ n \sum_{t=2}^{[n/k]} P(X_1 > u_n,\ X_t > u_n) = O(1/k) \quad \text{as } k \to \infty.$$
Theorem: If (X_t) satisfies D and D′ and F ∈ D(G), then θ = 1 (i.e., no clustering).
Remarks:
• If (X_t) is iid, then the lim sup of the sum is
$$\limsup_{n \to \infty}\ (n^2/k)\, P^2(X_1 > u_n) = O(1/k).$$
• If (X_t) is a stationary Gaussian process with ACF ρ(h) = o(1/log h), then D and D′ hold, and there is no clustering for such Gaussian processes.
Extension to Stationary Time Series—Example
[Figures: realizations of an IID N(0, 1/(1-.9²)) sequence and of the AR(1) process X_t = .9 X_{t-1} + Z_t, (Z_t) ~ IID N(0,1).]
• Even though θ = 1, there appears to be some clustering for small n.
• Hsing, Hüsler, and Reiss (1996) overcome this problem for Gaussian processes by considering a triangular array of rvs.
Extension to Stationary Time Series—Example
Max-moving average: Let (Z_t) be iid with a Pareto distribution, i.e., P(Z_1 > x) = x^{-α} for x ≥ 1, and set
$$X_t = \max(Z_t, \varphi Z_{t-1}), \qquad \varphi \in [0, 1].$$
Then, since (X_t) is 1-dependent, the mixing condition D is automatically satisfied. As for D′, with u_n = n^{1/α} x,
$$\limsup_{n \to \infty}\ n \sum_{t=2}^{[n/k]} P(X_1 > u_n,\ X_t > u_n) = \lim_{n \to \infty} n P(X_1 > u_n,\ X_2 > u_n) + O(1/k)$$
$$= \lim_{n \to \infty} n P\bigl(\max(Z_1, \varphi Z_0) > u_n,\ \max(Z_2, \varphi Z_1) > u_n\bigr) + O(1/k) = \lim_{n \to \infty} n P(\varphi Z_1 > u_n) + O(1/k) = \varphi^{\alpha} x^{-\alpha} + O(1/k),$$
which does not vanish as k → ∞ when φ > 0, so D′ fails and exceedances cluster.
Point Process Example—baby steps
In particular, for one-dependent sequences,
$$P(X_2 > x \mid X_1 > x) \to 1 - \theta = \varphi^{\alpha}/(1 + \varphi^{\alpha}).$$
Point process convergence (max-moving average): With a_n = n^{1/α},
$$n P(Z_1 > a_n x) \to x^{-\alpha} \quad \text{and} \quad n P(X_1 > a_n x) \to (1 + \varphi^{\alpha}) x^{-\alpha}.$$
Define the sequence of point processes by
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1}(Z_{t-1},\, Z_t)}.$$
From the convergence
$$\sum_{t=1}^{n} \varepsilon_{a_n^{-1} Z_t} \xrightarrow{d} \sum_{k=1}^{\infty} \varepsilon_{\Gamma_k^{-1/\alpha}}, \qquad \Gamma_k = E_1 + \cdots + E_k,$$
one can show
$$N_n \xrightarrow{d} N^{*} = \sum_{k=1}^{\infty} \bigl(\varepsilon_{(\Gamma_k^{-1/\alpha},\, 0)} + \varepsilon_{(0,\, \Gamma_k^{-1/\alpha})}\bigr),$$
i.e., each large Z-value contributes a point on each coordinate axis.
Point Process Example—baby steps
Applying the continuous mapping theorem (one needs to be careful here), we have
$$\sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} \max(\varphi Z_{t-1},\, Z_t)} \xrightarrow{d} \sum_{k=1}^{\infty} \bigl(\varepsilon_{\max(\varphi \Gamma_k^{-1/\alpha},\, 0)} + \varepsilon_{\max(0,\, \Gamma_k^{-1/\alpha})}\bigr) = \sum_{k=1}^{\infty} \bigl(\varepsilon_{\varphi \Gamma_k^{-1/\alpha}} + \varepsilon_{\Gamma_k^{-1/\alpha}}\bigr).$$
[Figure: limit points with red = Γ_k^{-1/α}, k = 1, …, 5, and blue = .75·Γ_k^{-1/α}, k = 1, …, 5.]
Regular Variation — univariate case
Def: The random variable X is regularly varying with index α if
$$P(|X| > tx)/P(|X| > t) \to x^{-\alpha} \quad \text{and} \quad P(X > t)/P(|X| > t) \to p,$$
or, equivalently, if
$$P(X > tx)/P(|X| > t) \to p\, x^{-\alpha} \quad \text{and} \quad P(X < -tx)/P(|X| > t) \to q\, x^{-\alpha},$$
where 0 ≤ p ≤ 1 and p + q = 1.
Equivalence:
X is RV(α) if and only if P(X ∈ t·)/P(|X| > t) →_v μ(·)
(→_v denotes vague convergence of measures on R\{0}). In this case,
$$\mu(dx) = \bigl(p\, \alpha x^{-\alpha-1} I(x > 0) + q\, \alpha (-x)^{-\alpha-1} I(x < 0)\bigr)\, dx.$$
Note: μ(tA) = t^{-α} μ(A) for every t > 0 and A bounded away from 0.
Regular Variation — univariate case
Another formulation (polar coordinates):
Define the ±1-valued rv θ by P(θ = 1) = p, P(θ = -1) = 1 - p = q. Then X is RV(α) if and only if
$$\frac{P(|X| > tx,\ X/|X| \in S)}{P(|X| > t)} \to x^{-\alpha}\, P(\theta \in S),$$
or
$$\frac{P(|X| > tx,\ X/|X| \in \cdot\,)}{P(|X| > t)} \xrightarrow{v} x^{-\alpha}\, P(\theta \in \cdot\,)$$
(→_v denotes vague convergence of measures on S⁰ = {-1, 1}).
Regular Variation — multivariate case
Multivariate regular variation of X = (X_1, …, X_m): There exists a random vector θ ∈ S^{m-1} such that
$$P(|\mathbf{X}| > tx,\ \mathbf{X}/|\mathbf{X}| \in \cdot\,)/P(|\mathbf{X}| > t) \xrightarrow{v} x^{-\alpha}\, P(\theta \in \cdot\,)$$
(→_v denotes vague convergence on S^{m-1}, the unit sphere in R^m).
• P(θ ∈ ·) is called the spectral measure.
• α is the index of X.
Equivalence:
$$\frac{P(t^{-1}\mathbf{X} \in \cdot\,)}{P(|\mathbf{X}| > t)} \xrightarrow{v} \mu(\cdot),$$
where μ is a measure on R^m which satisfies, for x > 0 and A bounded away from 0,
$$\mu(xA) = x^{-\alpha}\, \mu(A).$$
Regular Variation — multivariate case
Examples:
1. If X_1 and X_2 are iid RV(α), then X = (X_1, X_2) is multivariate regularly varying with index α and spectral distribution (assuming symmetry)
P(θ = πk/2) = 1/4, k = 1, 2, 3, 4 (mass on the axes).
Interpretation: it is unlikely that X_1 and X_2 are very large at the same time.
[Figure: scatterplot of (X_{t1}, X_{t2}) for a realization of 10,000 independent components.]
2. If X_1 = X_2 > 0, then X = (X_1, X_2) is multivariate regularly varying with index α and spectral distribution
P(θ = π/4) = 1.
3. AR(1): X_t = .9 X_{t-1} + Z_t, {Z_t} ~ IID t(3), with
P(θ = arctan(.9)) = .9898 and P(θ = π/2) = .0102.
[Figure: time series plot of a realization of length 10,000.]
[Figure: scatterplot of (X_t, X_{t+1}) for a realization of length 10,000 from the AR(1) model X_t = .9 X_{t-1} + Z_t.]
Estimation of α and θ
The marginal distribution F for heavy-tailed data is often modeled using Pareto-like tails,
$$1 - F(x) = x^{-\alpha} L(x)$$
for x large, where L(x) is a slowly varying function (L(xt)/L(x) → 1 as x → ∞). Now if X ~ F, then
$$P(\log X > x) = P(X > e^{x}) \approx e^{-\alpha x},$$
and hence log X has an approximate exponential distribution for x large. The spacings
$$\log X_{(n-j)} - \log X_{(n-j-1)}, \qquad j = 0, 1, 2, \dots, m,$$
from a sample of size n from an exponential distribution are approximately independent and Exp(α(j+1)) distributed. This suggests estimating α^{-1} by
$$\hat\alpha^{-1} = \frac{1}{m} \sum_{j=0}^{m-1} (j+1)\bigl(\log X_{(n-j)} - \log X_{(n-j-1)}\bigr) = \frac{1}{m} \sum_{j=0}^{m-1} \bigl(\log X_{(n-j)} - \log X_{(n-m)}\bigr).$$
Hill's estimate of α
Def: The Hill estimate of α for heavy-tailed data with distribution given by
$$1 - F(x) = x^{-\alpha} L(x)$$
is
$$\hat\alpha = \Biggl(\frac{1}{m} \sum_{j=0}^{m-1} (j+1)\bigl(\log X_{(n-j)} - \log X_{(n-j-1)}\bigr)\Biggr)^{-1} = \Biggl(\frac{1}{m} \sum_{j=0}^{m-1} \bigl(\log X_{(n-j)} - \log X_{(n-m)}\bigr)\Biggr)^{-1}.$$
The asymptotic variance of this estimate of α is α²/m, estimated by α̂²/m. (See also the GPD = generalized Pareto distribution.)
Hill's estimate of α
[Figure: Hill plot (estimate versus m) for the independent-components example.]
For a bivariate series, we estimate α for the univariate series given by the Euclidean norm of the two components.
Hill's estimate of α
[Figure: Hill plot (estimate versus m) for the AR(1) example.]
Estimation of the spectral distribution of θ
Based on the relation
$$P(|\mathbf{X}| > tx,\ \mathbf{X}/|\mathbf{X}| \in \cdot\,)/P(|\mathbf{X}| > t) \xrightarrow{v} x^{-\alpha}\, P(\theta \in \cdot\,),$$
a naïve estimate of the distribution of θ is based on the angular components X_t/|X_t| in the sample. One simply uses the empirical distribution of those angular pieces for which the modulus |X_t| exceeds some large threshold. In the examples given below, we use a kernel density estimate of the angular components of the observations whose moduli exceed a large threshold. Here we consider only two components, i.e., θ is one-dimensional.
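A sketch of this naive estimate (the AR(1) test data, the 95% modulus threshold, and the use of arctan2 angles are illustrative assumptions, not the slides' own code).

```python
import numpy as np

def angular_sample(x, lag=1, q=0.95):
    """Angles of the pairs (X_t, X_{t+lag}) whose Euclidean norm exceeds the q-quantile."""
    u, v = x[:-lag], x[lag:]
    r = np.hypot(u, v)
    keep = r > np.quantile(r, q)
    return np.arctan2(v[keep], u[keep])        # angles in (-pi, pi]

rng = np.random.default_rng(6)
z = rng.standard_t(3, size=10_000)
x = np.empty_like(z); x[0] = z[0]
for t in range(1, len(z)):                     # AR(1): X_t = .9 X_{t-1} + Z_t, Z_t ~ t(3)
    x[t] = 0.9 * x[t - 1] + z[t]

theta = angular_sample(x)                      # feed this into a kernel density estimate
print("share of angles within 0.1 of arctan(.9):",
      round(float(np.mean(np.abs(theta - np.arctan(0.9)) < 0.1)), 3))
```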
Estimation of the spectral distribution of θ
[Figures: kernel density estimate of θ and scatterplot of (x_1, x_2) for the independent-components example.]
Estimation of θ
[Figures: kernel density estimate of θ and scatterplot of (X_t, X_{t+1}) for the AR(1) example. The vertical lines on the density plot are at arctan(.9) and arctan(.9) - π.]
Examples of Processes that are Regularly Varying
ARCH(1): X_t = (α₀ + α₁ X²_{t-1})^{1/2} Z_t, {Z_t} ~ IID.
α is found by solving E|α₁ Z²|^{α/2} = 1:
α₁: .312  .577  1.00  1.57
α:  8.00  4.00  2.00  1.00
Distribution of θ:
$$P(\theta \in \cdot\,) = E\bigl\{\|(B, Z)\|^{\alpha}\, I\bigl(\arg((B, Z)) \in \cdot\,\bigr)\bigr\} \big/ E\|(B, Z)\|^{\alpha},$$
where P(B = 1) = P(B = -1) = .5.
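A numerical sketch reproducing the α₁-to-α table above (this assumes standard normal Z, which is consistent with the tabulated values but is an assumption, not stated on the slide): solve E|α₁Z²|^{α/2} = α₁^{α/2} E|Z|^α = 1, using E|Z|^α = 2^{α/2} Γ((α+1)/2)/√π.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import gamma

def arch1_tail_index(a1):
    """Solve a1^(a/2) * E|Z|^a = 1 for a, with Z ~ N(0,1)."""
    f = lambda a: a1 ** (a / 2) * 2 ** (a / 2) * gamma((a + 1) / 2) / np.sqrt(np.pi) - 1.0
    return brentq(f, 0.01, 50.0)

for a1 in (0.312, 0.577, 1.00, 1.57):
    print(f"alpha_1 = {a1}: alpha = {arch1_tail_index(a1):.2f}")
```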
Examples of Processes that are Regularly Varying
Figures: plots of (X_t, X_{t+1}) and the estimated distribution of θ for a realization of length 10,000 from the ARCH(1) model with α₀ = 1, α₁ = 1, α = 2, X_t = (α₀ + α₁ X²_{t-1})^{1/2} Z_t, {Z_t} ~ IID.
Examples of Processes that are Regularly Varying
Example: SV model X_t = σ_t Z_t. Suppose Z_t ~ RV(α) and
$$\log \sigma_t^2 = \sum_{j} \psi_j \varepsilon_{t-j}, \qquad \sum_{j} \psi_j^2 < \infty, \qquad \{\varepsilon_t\} \sim \text{IID N}(0, \sigma^2).$$
Then Z_n = (Z_1, …, Z_n)' is regularly varying with index α, and so is
$$\mathbf{X}_n = (X_1, \dots, X_n)' = \mathrm{diag}(\sigma_1, \dots, \sigma_n)\, \mathbf{Z}_n,$$
with spectral distribution concentrated on (1, 0) and (0, 1).
[Figure: scatterplot of (X_t, X_{t+1}) for a realization of length 10,000.]
Point Process Convergence
Theorem (Davis & Hsing '95, Davis & Mikosch '97). Let {X_t} be a stationary sequence of random m-vectors. Suppose
(i) the finite-dimensional distributions are jointly regularly varying (let (θ_{-k}, …, θ_k) be the vector in S^{(2k+1)m-1} in the definition);
(ii) the mixing condition A(a_n) or strong mixing holds;
(iii) $\lim_{k \to \infty} \limsup_{n \to \infty} P\bigl(\max_{k \le |t| \le r_n} |X_t| > a_n y \,\big|\, |X_0| > a_n y\bigr) = 0$ for all y > 0.
Then the extremal index
$$\theta = \lim_{k \to \infty} E\Bigl(|\theta_0^{(k)}|^{\alpha} - \max_{1 \le j \le k} |\theta_j^{(k)}|^{\alpha}\Bigr)_{+} \Big/ E\,|\theta_0^{(k)}|^{\alpha}$$
exists. If θ > 0, then
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} \xrightarrow{d} N = \sum_{i=1}^{\infty} \sum_{j} \varepsilon_{P_i Q_{ij}}.$$
Point process convergence (cont.)
• (P_i) are the points of a Poisson process on (0, ∞) with intensity function ν(dy) = θ α y^{-α-1} dy.
• $\sum_{j} \varepsilon_{Q_{ij}}$, i ≥ 1, are iid point processes with distribution Q, and Q is the weak limit of
$$E\Bigl[\Bigl(|\theta_0^{(k)}|^{\alpha} - \max_{1 \le j \le k} |\theta_j^{(k)}|^{\alpha}\Bigr)_{+} I\Bigl(\sum_{|t| \le k} \varepsilon_{\theta_t^{(k)}} \in \cdot\Bigr)\Bigr] \Big/ E\Bigl(|\theta_0^{(k)}|^{\alpha} - \max_{1 \le j \le k} |\theta_j^{(k)}|^{\alpha}\Bigr)_{+}$$
as k → ∞.
Remarks:
1. GARCH and SV processes satisfy the conditions of the theorem.
2. Limit distribution for sample extremes and sample ACF follows from
this theorem.
Application to GARCH and SV Models
Setup: X_t = σ_t Z_t, {Z_t} ~ IID(0,1), X_t is RV(α). Choose {a_n} s.t. n P(X_t > a_n) → 1. Then
$$P^n(a_n^{-1} X_1 \le x) \to \exp\{-x^{-\alpha}\}.$$
Then, with M_n = max{X_1, …, X_n}:
(i) GARCH:
$$P(a_n^{-1} M_n \le x) \to \exp\{-\theta x^{-\alpha}\},$$
where θ is the extremal index (0 < θ < 1).
(ii) SV model:
$$P(a_n^{-1} M_n \le x) \to \exp\{-x^{-\alpha}\},$$
i.e., the extremal index is θ = 1 and there is no clustering.
Application to Linear Processes
Suppose (X_t) is the linear process
$$X_t = \sum_{j=-\infty}^{\infty} \psi_j Z_{t-j}, \qquad (Z_t) \sim \text{IID RV}(\alpha).$$
If a_n is chosen such that n P(|Z_t| > a_n) → 1, then
$$n P(|X_t| > a_n x) \to \sum_{j} |\psi_j|^{\alpha}\, x^{-\alpha}$$
and
$$N_n = \sum_{t=1}^{n} \varepsilon_{a_n^{-1} X_t} \xrightarrow{d} N = \sum_{k=1}^{\infty} \sum_{j=-\infty}^{\infty} \varepsilon_{\psi_j \Gamma_k^{-1/\alpha}}.$$
That is, each Poisson point Γ_k^{-1/α} gives rise to a (deterministic) cluster of points given by the coefficients of the filter, i.e., ψ_j Γ_k^{-1/α}. The extremal index and other extremal properties can be determined from this limit result.
Application to Crystal River
River flow rates for the Crystal River, located in the mountains of western Colorado (see Cooley et al. (2007)). After deseasonalizing the data, we obtain 728 weekly observations from Oct 1, 1990 to Oct 1, 2005.
[Figures: time series plot of the deseasonalized flows and their sample ACF.]
Application to Crystal River
Estimates of α and of the distribution of θ for the bivariate pairs (X_{t-1}, X_t).
[Figures: Hill plot, scatterplot of (X_{t-1}, X_t), and kernel density estimate of θ; the vertical lines on the density plot are at π/4 and π/4 - π.]
The Extremogram
The extremogram of a stationary time series (X_t) can be viewed as the analogue of the correlogram for measuring dependence in extremes (see Davis and Mikosch (2008)).
Definition: For two sets A and B bounded away from 0, the extremogram is defined as
$$\rho_{A,B}(h) = \lim_{n \to \infty} P\bigl(a_n^{-1} X_0 \in A,\ a_n^{-1} X_h \in B\bigr) \big/ P\bigl(a_n^{-1} X_0 \in A\bigr).$$
In many examples this can be computed explicitly. If one takes A = B = (1, ∞), then
$$\rho_{A,B}(h) = \lim_{x \to \infty} P(X_h > x \mid X_0 > x) = \lambda(X_0, X_h),$$
often called the extremal dependence coefficient (λ = 0 means independence or asymptotic independence).
The Extremogram
The extremogram is estimated via the empirical extremogram defined by
$$\hat\rho_{A,B}(h) = \frac{\sum_{t=1}^{n-h} I\{a_m^{-1} X_t \in A,\ a_m^{-1} X_{t+h} \in B\}}{\sum_{t=1}^{n} I\{a_m^{-1} X_t \in A\}},$$
where m → ∞ with m/n → 0. Note that the limit of the expectation of the numerator terms is given by
$$m P\bigl(a_m^{-1} X_0 \in A,\ a_m^{-1} X_h \in B\bigr) \to \mu(A \times B),$$
where μ is the measure defined in the statement of regular variation. Hence the empirical estimate is asymptotically unbiased. Under suitable mixing conditions, a CLT for the empirical estimate is established in Davis and Mikosch (2008).
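A minimal version of the empirical extremogram for A = B = (1, ∞) (a sketch; taking a_m as a high empirical quantile and using an AR(1) test series are illustrative choices, not the slides' own analysis).

```python
import numpy as np

def extremogram(x, max_lag=40, q=0.95):
    """Empirical extremogram for A = B = (1, inf), with a_m set to the q-quantile of x."""
    a_m = np.quantile(x, q)
    ind = x > a_m                              # I{ X_t / a_m in (1, inf) }
    denom = ind.sum()
    return np.array([(ind[:-h] & ind[h:]).sum() / denom for h in range(1, max_lag + 1)])

rng = np.random.default_rng(7)
z = rng.standard_t(3, size=20_000)
x = np.empty_like(z); x[0] = z[0]
for t in range(1, len(z)):                     # AR(1) with heavy-tailed noise
    x[t] = 0.9 * x[t - 1] + z[t]

print(np.round(extremogram(x, max_lag=5), 3))  # extremal dependence decaying with lag h
```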
Application to Crystal River
Extremogram for the Crystal River series, A = B = (1, ∞).
[Figure: sample extremogram, lags 0-40.]
Application to Crystal River
Fit an AR(6) model to the data (this removes all appreciable autocorrelation in the data). We then estimate the distribution of θ and the extremogram based on the residuals.
[Figures: sample extremogram of the residuals and kernel density estimate of θ; the vertical lines on the density plot are at -π/2, 0, and π/2.]
Application to Crystal River
There is still a touch of autocorrelation in the absolute values and squares of the residuals. We remove this by fitting a GARCH model to the residuals. The estimated degrees of freedom for the noise was 3.43.
[Figure: Hill plot for the resulting residuals.]
Bonus Example
[Figure: wind speed (knots) at Kilkenny, 1/1/61-1/17/78.]
Bonus Example
[Figures: wind speed at Kilkenny adjusted, and adjusted for volatility, 1961-1977, each with the sample ACF of the absolute values.]