An Introduction to Extreme Value Analysis
![Page 1: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/1.jpg)
An Introduction to Extreme Value Analysis
Whitney Huang
Clemson EVA Group, January 29, 2020
![Page 2: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/2.jpg)
Outline
Motivation
Extreme Value Theorem & Block Maxima Method
Peaks–Over–Threshold (POT) Method
![Page 3: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/3.jpg)
Extreme Rainfall During Hurricane Harvey
- “A storm forces Houston, the limitless city, to consider its limits” – The New York Times (8.31.17)
![Page 4: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/4.jpg)
Environmental Extremes: Heatwaves, Storm Surges, etc.
- Heat wave: The 2003 European heat wave produced the hottest summer on record in Europe since 1540 and resulted in at least 30,000 deaths
- Storm surge: Hurricane Katrina produced the highest storm surge ever recorded on the U.S. coast (27.8 feet)
![Page 5: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/5.jpg)
Scientific Questions
- How can we estimate the magnitude of extreme events (e.g., the 100-year rainfall)?
- How do extremes vary in space?
- How might extremes change under future climate conditions?
![Page 7: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/7.jpg)
Usual vs Extremes
![Page 8: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/8.jpg)
Probability Framework
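Let $X_1, \dots, X_n \overset{\text{iid}}{\sim} F$ and define $M_n = \max\{X_1, \dots, X_n\}$
Then the distribution function of $M_n$ is
$$P(M_n \le x) = P(X_1 \le x, \dots, X_n \le x) = P(X_1 \le x) \times \cdots \times P(X_n \le x) = F^n(x)$$
Remark
$$F^n(x) \xrightarrow{n \to \infty} \begin{cases} 0 & \text{if } F(x) < 1 \\ 1 & \text{if } F(x) = 1 \end{cases}$$
⇒ the limiting distribution is degenerate.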
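A minimal numerical sketch of the remark above: for a fixed x with F(x) < 1, the probability P(M_n ≤ x) = F(x)^n collapses to 0 as n grows. The standard exponential F and the evaluation point x = 5 are illustrative assumptions, not taken from the slides.

```python
import math


def exponential_cdf(x):
    """Standard exponential distribution function (illustrative choice of F)."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def cdf_of_maximum(base_cdf, x, n):
    """P(M_n <= x) = F(x)^n for the maximum of n iid draws from F."""
    return base_cdf(x) ** n


# F(5) is below 1, so F(5)^n shrinks toward 0 as the block length n grows.
for n in (1, 10, 100, 1000, 10000):
    print(n, cdf_of_maximum(exponential_cdf, 5.0, n))
```

Without rescaling, the same collapse happens at every x below the upper endpoint of F, which is why the next slides look for normalizing sequences.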
![Page 9: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/9.jpg)
Asymptotic: Classical Limit Laws
Recall the Central Limit Theorem:
$$\frac{S_n - n\mu}{\sqrt{n}\,\sigma} \xrightarrow{d} N(0, 1)$$
⇒ rescaling is the key to obtaining a non-degenerate distribution
Question: Can we get the limiting distribution of
$$\frac{M_n - b_n}{a_n}$$
for suitable sequences $\{a_n > 0\}$ and $\{b_n\}$?
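One case where the answer can be checked by hand is the standard exponential distribution, with the assumed normalizing choice a_n = 1 and b_n = log n: then P(M_n − log n ≤ x) = (1 − e^{−x}/n)^n, which tends to the Gumbel distribution function exp(−e^{−x}). The sketch below compares the two on a grid; the grid and the values of n are arbitrary.

```python
import math


def rescaled_max_cdf(x, n):
    """P(M_n - log(n) <= x) for the maximum of n iid standard exponentials."""
    inner = 1.0 - math.exp(-x) / n
    # inner <= 0 means x + log(n) <= 0, where the exponential CDF is zero.
    return inner ** n if inner > 0.0 else 0.0


def gumbel_cdf(x):
    """Standard Gumbel distribution function."""
    return math.exp(-math.exp(-x))


def max_gap(n, grid=None):
    """Largest absolute difference between the two CDFs over a grid of x values."""
    if grid is None:
        grid = [i / 10.0 for i in range(-30, 61)]
    return max(abs(rescaled_max_cdf(x, n) - gumbel_cdf(x)) for x in grid)


for n in (10, 100, 10000):
    print(n, max_gap(n))
```

The gap shrinks as n increases, which is the non-degenerate limit that the unscaled maximum could not provide.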
![Page 11: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/11.jpg)
Extremal Types Theorem (Fisher–Tippett 1928, Gnedenko 1943)
Define $M_n = \max\{X_1, \dots, X_n\}$ where $X_1, \dots, X_n \overset{\text{i.i.d.}}{\sim} F$. If there exist $a_n > 0$ and $b_n \in \mathbb{R}$ such that, as $n \to \infty$,
$$P\left(\frac{M_n - b_n}{a_n} \le x\right) \to G(x)$$
for a non-degenerate distribution function $G$, then $G$ must be of the same type as
$$G(x; \mu, \sigma, \xi) = \exp\left\{-\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]_+^{-1/\xi}\right\}$$
where $x_+ = \max(x, 0)$ and $G(x)$ is the distribution function of the generalized extreme value distribution (GEV($\mu, \sigma, \xi$))
- $\mu$ and $\sigma$ are location and scale parameters
- $\xi$ is a shape parameter determining the rate of tail decay, with
  - $\xi > 0$ giving the heavy-tailed case (Fréchet)
  - $\xi = 0$ giving the light-tailed case (Gumbel), read as the limit $\xi \to 0$
  - $\xi < 0$ giving the bounded-tailed case (reversed Weibull)
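A minimal sketch of the GEV distribution function, written with the standard library only. The function name `gev_cdf` and the tolerance used to switch to the Gumbel limit are assumptions for illustration; the comparison at x = 10 simply shows how the three tail types differ.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV(mu, sigma, xi) distribution function, with xi = 0 read as the Gumbel limit."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Tail probability P(X > 10) under the three types (standardized mu = 0, sigma = 1).
for label, xi in (("Frechet", 0.5), ("Gumbel", 0.0), ("reversed Weibull", -0.5)):
    print(label, 1.0 - gev_cdf(10.0, xi=xi))
```

Note that some libraries parameterize the shape with the opposite sign, so a hand-written function like this one is a useful cross-check of the convention in use.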
![Page 18: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/18.jpg)
Max-Stability and GEV
Definition: A distribution $G$ is said to be max-stable if
$$G^k(a_k x + b_k) = G(x), \quad k \in \mathbb{N}$$
for some constants $a_k > 0$ and $b_k$
- Taking powers of a distribution function results only in a change of location and scale
- A distribution is max-stable ⇐⇒ it is a GEV distribution
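The max-stability property can be checked numerically for the GEV. Solving G^k(a_k x + b_k) = G(x) for the GEV form gives a_k = k^ξ and b_k = (k^ξ − 1)(σ/ξ − μ), with a_k = 1 and b_k = σ log k in the Gumbel limit; these constants are derived here for the sketch and are not stated on the slide. The parameter values below are arbitrary.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV(mu, sigma, xi) distribution function (xi = 0 read as the Gumbel limit)."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def max_stability_constants(k, mu, sigma, xi):
    """Constants (a_k, b_k) such that G^k(a_k * x + b_k) = G(x) for a GEV G."""
    if abs(xi) < 1e-12:
        return 1.0, sigma * math.log(k)
    a_k = k ** xi
    b_k = (a_k - 1.0) * (sigma / xi - mu)
    return a_k, b_k


mu, sigma, xi, k = 1.0, 2.0, 0.2, 5  # arbitrary illustrative values
a_k, b_k = max_stability_constants(k, mu, sigma, xi)
for x in (-1.0, 0.5, 3.0):
    print(x, gev_cdf(a_k * x + b_k, mu, sigma, xi) ** k, gev_cdf(x, mu, sigma, xi))
```

The two printed columns agree up to floating-point error, which is the "change of location and scale" statement in concrete form.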
![Page 19: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/19.jpg)
Quantiles and Return Levels
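- Quantiles of GEV
$$G(x_p) = \exp\left\{-\left[1 + \xi\left(\frac{x_p - \mu}{\sigma}\right)\right]_+^{-1/\xi}\right\} = 1 - p$$
$$\Rightarrow x_p = \mu - \frac{\sigma}{\xi}\left[1 - \{-\log(1 - p)\}^{-\xi}\right], \quad 0 < p < 1 \quad (\xi \ne 0)$$
- In extreme value terminology, $x_p$ is the return level associated with the return period $1/p$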
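The return level formula translates directly into code. In this sketch the function names and the parameter values (μ = 3, σ = 0.8, ξ = 0.1) are hypothetical, chosen only to exercise the formula; the round trip through the distribution function confirms that G(x_p) = 1 − p.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function, used here only to check the round trip."""
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_return_level(p, mu, sigma, xi):
    """Level x_p exceeded with probability p per block (return period 1/p blocks)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    y = -math.log1p(-p)  # y = -log(1 - p)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)  # Gumbel limit of the formula
    return mu - (sigma / xi) * (1.0 - y ** (-xi))


mu, sigma, xi = 3.0, 0.8, 0.1  # hypothetical annual-maximum parameters
x_100 = gev_return_level(1.0 / 100.0, mu, sigma, xi)
print(x_100, gev_cdf(x_100, mu, sigma, xi))
```

With annual blocks, p = 1/100 corresponds to the "100-year" level mentioned in the motivating questions.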
![Page 20: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/20.jpg)
Clemson Daily Precipitation [Data Source: USHCN]
[Figure: "Daily Precip in Clemson". x-axis: Year (1950–2010); y-axis: Precipitation (in).]
![Page 21: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/21.jpg)
Block Maxima Method (Gumbel 1958)
1. Determine the block size and extract the block maxima
[Figure: "Daily Precip in Clemson". x-axis: Year (1950–2010); y-axis: Precipitation (in).]
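Step 1 can be sketched as a simple grouping operation. The daily series below is synthetic (exponential draws with a fixed seed) and stands in for the Clemson record, which is not reproduced here; the choice of one calendar year per block is an assumption that matches the usual annual-maximum setup.

```python
import random


def block_maxima(records):
    """Return {block_label: maximum value} for an iterable of (block_label, value)."""
    maxima = {}
    for label, value in records:
        if label not in maxima or value > maxima[label]:
            maxima[label] = value
    return maxima


# Synthetic stand-in for a daily precipitation series (inches), 1950-2010.
rng = random.Random(0)
daily = [(year, rng.expovariate(4.0)) for year in range(1950, 2011) for _ in range(365)]

annual_max = block_maxima(daily)
print(len(annual_max), "blocks; largest annual maximum:", max(annual_max.values()))
```

Larger blocks make the GEV approximation more plausible but leave fewer maxima to fit, so block size is itself a bias-variance choice.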
![Page 22: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/22.jpg)
[Figure: "Daily Precip in Clemson" with the block maxima marked as points. x-axis: Year (1950–2010); y-axis: Precipitation (in).]
![Page 23: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/23.jpg)
Block Maxima Method (Gumbel 1958)
2. Fit the GEV to the maxima and assess the fit
[Figure: "Daily Precip in Clemson" with the block maxima marked as points, next to a density panel. x-axis: Year (1950–2010); y-axis: Precipitation (in); side panel axis: Density.]
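The slides do not say which estimator is used for the fit; maximum likelihood is the common choice. To stay dependency-free, the sketch below swaps in a different technique, the L-moment (probability-weighted moment) estimator with Hosking's polynomial approximation for the shape, and checks it on simulated GEV data. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def sample_l_moments(data):
    """First three sample L-moments from unbiased probability-weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * v for j, v in enumerate(x)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * v for j, v in enumerate(x)) / (n * (n - 1) * (n - 2))
    return b0, 2.0 * b1 - b0, 6.0 * b2 - 6.0 * b1 + b0


def fit_gev_lmoments(data):
    """L-moment estimates (mu, sigma, xi) of a GEV; not maximum likelihood."""
    l1, l2, l3 = sample_l_moments(data)
    c = 2.0 / (3.0 + l3 / l2) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # Hosking's shape convention, k = -xi
    if abs(k) < 1e-6:
        sigma = l2 / math.log(2.0)  # Gumbel limit
        return l1 - 0.5772156649 * sigma, sigma, 0.0
    g = math.gamma(1.0 + k)
    sigma = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    mu = l1 - sigma * (1.0 - g) / k
    return mu, sigma, -k


def simulate_gev(n, mu, sigma, xi, rng):
    """Draw n GEV(mu, sigma, xi) values by inverting the distribution function."""
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u == 0.0:
            continue  # avoid log(0)
        y = -math.log(u)
        if abs(xi) < 1e-12:
            draws.append(mu - sigma * math.log(y))
        else:
            draws.append(mu + (sigma / xi) * (y ** (-xi) - 1.0))
    return draws


rng = random.Random(1)
maxima = simulate_gev(61, 3.0, 0.8, 0.1, rng)  # 61 synthetic "annual maxima"
print(fit_gev_lmoments(maxima))
```

With only about 60 maxima the shape estimate is noisy whichever estimator is used, which is part of the motivation for the confidence intervals in step 3.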
![Page 24: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/24.jpg)
[Figure: "Quantile Plot" for the fitted GEV. x-axis: Model; y-axis: Empirical.]
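A quantile plot pairs each ordered maximum with the fitted model's quantile at a matching probability. The sketch below computes those pairs; the plotting positions i/(n+1), the sample values, and the fitted parameters are all assumptions for illustration rather than values from the slides.

```python
import math


def gev_quantile(q, mu, sigma, xi):
    """Inverse of the GEV distribution function at probability q in (0, 1)."""
    y = -math.log(q)
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(y)
    return mu + (sigma / xi) * (y ** (-xi) - 1.0)


def qq_pairs(maxima, mu, sigma, xi):
    """(model quantile, empirical quantile) pairs using plotting positions i/(n+1)."""
    x = sorted(maxima)
    n = len(x)
    return [(gev_quantile((i + 1) / (n + 1), mu, sigma, xi), x[i]) for i in range(n)]


# Hypothetical annual maxima (inches) and hypothetical fitted parameters.
sample = [2.1, 2.6, 2.9, 3.2, 3.4, 3.9, 4.4, 5.8]
for model_q, empirical_q in qq_pairs(sample, 3.0, 0.8, 0.1):
    print(round(model_q, 3), empirical_q)
```

Points close to the 45-degree line suggest an adequate fit; systematic curvature in the upper tail usually points to a poorly estimated shape parameter.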
![Page 25: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/25.jpg)
Block Maxima Method (Gumbel 1958)
3. Perform inference for return levels, probabilities, etc
[Figure: "95% CI for 50-yr RL". x-axis: Annmx Precip (in); y-axis: Density; legend: Delta CI, Prof CI.]
![Page 27: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/27.jpg)
Recall the Block Maxima Method
[Figure: "Daily Precip in Clemson" with the block maxima marked as points, next to a density panel. x-axis: Year (1950–2010); y-axis: Precipitation (in); side panel axis: Density.]
Question: Can we use data more efficiently?
![Page 28: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/28.jpg)
Peaks–over–threshold (POT) method [Davison & Smith 1990]
1. Select a “sufficiently large” threshold u, extract the exceedances
[Figure: "Daily Precip in Clemson" next to a density panel. x-axis: Year (1950–2010); y-axis: Precipitation (in); side panel axis: Density.]
![Page 29: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/29.jpg)
Peaks–over–threshold (POT) method [Davison & Smith 1990]
2. Fit an appropriate model to exceedances
[Figure: "Daily Precip in Clemson" next to a density panel. x-axis: Year (1950–2010); y-axis: Precipitation (in); side panel axis: Density.]
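Steps 1 and 2 can be sketched together: keep the excesses over a threshold u, then fit a generalized Pareto distribution to them. The slides do not name the estimator; this sketch uses method-of-moments estimates (not maximum likelihood), which only make sense when the shape is well below 1/2. The synthetic data, the threshold, and the function names are assumptions.

```python
import random


def threshold_excesses(values, u):
    """Excesses y = x - u for the observations x that exceed the threshold u."""
    return [v - u for v in values if v > u]


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (sigma, xi) of a GPD; needs finite variance."""
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((y - mean) ** 2 for y in excesses) / (n - 1)
    ratio = mean * mean / var  # equals 1 - 2 * xi for an exact GPD
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (1.0 + ratio)
    return sigma, xi


# Synthetic daily values; exponential data have exponential excesses (xi = 0).
rng = random.Random(0)
daily = [rng.expovariate(4.0) for _ in range(61 * 365)]
u = 1.0  # hypothetical threshold
excesses = threshold_excesses(daily, u)
print(len(excesses), "exceedances; (sigma, xi) =", fit_gpd_moments(excesses))
```

Compared with one maximum per year, the exceedances typically give several times as many data points for the tail fit, which is the efficiency gain the POT method is after.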
![Page 30: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/30.jpg)
GPD for Exceedances
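If $M_n = \max_{i=1,\dots,n} X_i$ (for a large $n$) can be approximated by a GEV($\mu, \sigma, \xi$), then for sufficiently large $u$,
$$P(X_i > x + u \mid X_i > u) = \frac{n P(X_i > x + u)}{n P(X_i > u)} \to \left(\frac{1 + \xi \frac{x + u - b_n}{a_n}}{1 + \xi \frac{u - b_n}{a_n}}\right)^{-1/\xi} = \left(1 + \frac{\xi x}{a_n + \xi(u - b_n)}\right)^{-1/\xi}$$
⇒ Survival function of the generalized Pareto distribution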
![Page 31: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/31.jpg)
Pickands–Balkema–de Haan Theorem (1974, 1975)
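If $M_n = \max_{1 \le i \le n}\{X_i\} \approx \text{GEV}(\mu, \sigma, \xi)$, then, for a “large” $u$ (i.e., $u \to x_F = \sup\{x : F(x) < 1\}$),
$$P(X > u) \approx \frac{1}{n}\left[1 + \xi\left(\frac{u - \mu}{\sigma}\right)\right]^{-1/\xi}$$
$F_u(y) = P(X - u < y \mid X > u)$ is well approximated by the generalized Pareto distribution (GPD). That is:
$$F_u(y) \xrightarrow{d} H_{\tilde{\sigma}, \xi}(y), \quad u \to x_F$$
where
$$H_{\tilde{\sigma}, \xi}(y) = \begin{cases} 1 - (1 + \xi y/\tilde{\sigma})^{-1/\xi} & \xi \ne 0; \\ 1 - \exp(-y/\tilde{\sigma}) & \xi = 0. \end{cases}$$
and $\tilde{\sigma} = \sigma + \xi(u - \mu)$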
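A case where the GPD approximation is exact makes the theorem concrete. For a Pareto variable with P(X > x) = x^(−α) on x ≥ 1 (an illustrative choice, not from the slides), the excess over any threshold u ≥ 1 satisfies P(X − u > y | X > u) = (1 + y/u)^(−α), which is a GPD with shape 1/α and scale u/α. The sketch below checks this identity numerically.

```python
import math


def gpd_cdf(y, sigma, xi):
    """Generalized Pareto distribution function H(y) for excesses y >= 0."""
    if y <= 0.0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0.0:
        return 1.0  # xi < 0 and y beyond the upper endpoint -sigma / xi
    return 1.0 - t ** (-1.0 / xi)


def pareto_excess_cdf(y, u, alpha):
    """P(X - u <= y | X > u) when P(X > x) = x ** (-alpha) for x >= 1 and u >= 1."""
    return 1.0 - ((u + y) / u) ** (-alpha)


alpha, u = 4.0, 2.5  # arbitrary illustrative values
for y in (0.1, 1.0, 5.0):
    print(y, pareto_excess_cdf(y, u, alpha), gpd_cdf(y, u / alpha, 1.0 / alpha))
```

The scale u/α grows linearly with the threshold while the shape stays fixed, which mirrors the way the GPD scale in the theorem depends on u.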
![Page 32: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/32.jpg)
How to Choose the Threshold?
Bias–variance tradeoff:
- Threshold too low ⇒ bias because the model asymptotics are not yet valid
- Threshold too high ⇒ large variance because few data points remain
[Figure: "Mean Residual Life". x-axis: Threshold (in); y-axis: Mean Excess.]
Task: Choose a $u_0$ such that the Mean Residual Life curve is approximately linear for all $u > u_0$
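The mean residual life curve is just the average excess over each candidate threshold. If excesses over u_0 follow a GPD with shape ξ < 1, the mean excess is linear in u for u > u_0, with slope ξ/(1 − ξ). The sketch below computes the curve on synthetic exponential data, where the expected curve is flat because ξ = 0; the data and the threshold grid are assumptions.

```python
import random


def mean_excess(values, u):
    """Average of (x - u) over the observations x that exceed u."""
    excesses = [v - u for v in values if v > u]
    return sum(excesses) / len(excesses) if excesses else float("nan")


def mean_residual_life(values, thresholds):
    """Points (u, mean excess over u) of the mean residual life curve."""
    return [(u, mean_excess(values, u)) for u in thresholds]


# Synthetic data: standard exponential, for which the curve should hover near 1.
rng = random.Random(0)
data = [rng.expovariate(1.0) for _ in range(100000)]
for u, e in mean_residual_life(data, [0.5, 1.0, 1.5, 2.0, 3.0]):
    print(u, round(e, 3))
```

On real data the curve becomes erratic at high thresholds because few exceedances remain, so the linear stretch has to be judged with that extra variability in mind.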
![Page 33: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/33.jpg)
Peaks–over–threshold (POT) method [Davison & Smith 1990]
2. Fit an appropriate model to exceedances and assess the fit
[Figure: "Quantile Plot" for the fitted threshold-excess model. x-axis: Model; y-axis: Empirical.]
![Page 34: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/34.jpg)
Peaks–over–threshold (POT) method [Davison & Smith 1990]
3. Perform inference for return levels, probabilities, etc
[Figure: "95% CI for 50-yr RL". x-axis: Threshold excess (in); y-axis: Density; legend: Delta CI, Prof CI.]
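For the POT model, the level exceeded on average once every m observations is x_m = u + (σ/ξ)[(m ζ_u)^ξ − 1], where ζ_u = P(X > u); an N-year level uses m = N × (observations per year). The sketch below gives the point estimate only, without the delta or profile-likelihood intervals shown on the slide, and every numerical value in it is hypothetical.

```python
import math


def pot_return_level(m, u, sigma, xi, zeta_u):
    """Level exceeded on average once every m observations under a GPD tail."""
    if m * zeta_u <= 1.0:
        raise ValueError("m is too small: the return level would fall below u")
    if abs(xi) < 1e-12:
        return u + sigma * math.log(m * zeta_u)  # exponential-tail limit
    return u + (sigma / xi) * ((m * zeta_u) ** xi - 1.0)


def exceedance_probability(x, u, sigma, xi, zeta_u):
    """P(X > x) for x > u implied by the same model, used as a consistency check."""
    if abs(xi) < 1e-12:
        return zeta_u * math.exp(-(x - u) / sigma)
    return zeta_u * (1.0 + xi * (x - u) / sigma) ** (-1.0 / xi)


# Hypothetical fitted values: threshold (in), GPD scale and shape, exceedance rate.
u, sigma, xi, zeta_u = 1.0, 0.5, 0.1, 0.02
m = 50 * 365  # 50-year level with daily observations
x_50 = pot_return_level(m, u, sigma, xi, zeta_u)
print(x_50, exceedance_probability(x_50, u, sigma, xi, zeta_u) * m)
```

The second printed number should be 1, meaning one expected exceedance of the 50-year level per 50 years of daily data.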
![Page 35: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/35.jpg)
Summary & Discussion
- Extreme value theory provides a framework to model extreme values
- GEV for fitting block maxima
- GPD for fitting threshold exceedances
- Return level for communicating risk
- Practical issues: seasonality, temporal dependence, non-stationarity, ...
![Page 37: An Introduction to Extreme Value Analysis](https://reader030.fdocuments.us/reader030/viewer/2022012916/61c6c702e9bfef6d611b4260/html5/thumbnails/37.jpg)
For Further Reading
- S. Coles. An Introduction to Statistical Modeling of Extreme Values. Springer, 2001.
- J. Beirlant, Y. Goegebeur, J. Segers, and J. Teugels. Statistics of Extremes: Theory and Applications. Wiley, 2004.
- L. de Haan and A. Ferreira. Extreme Value Theory: An Introduction. Springer, 2006.
- S. I. Resnick. Heavy-Tail Phenomena: Probabilistic and Statistical Modeling. Springer, 2007.