Threshold Selection for Precipitation Extremes

Uli Schneider∗ and Philippe Naveau∗∗

EGU, April 27, 2004

∗ Geophysical Statistics Project, NCAR

∗∗ Dept. of Applied Mathematics, University of Colorado, and Lab. des Sciences du Climat et de l’Environnement, CNRS

Outline

Extreme value theory (threshold models, advantages and limitations)

A new approach – folding

Conclusions

Extreme value theory

In classical statistics: model the AVERAGE behavior of a process.

In extreme value theory: model the EXTREME behavior (the tail of a distribution).

Usually deal with very small data sets!

Threshold models

Model the exceedances over a high threshold u, i.e. the excess X − u given X > u.

[Figure: daily precipitation for Boulder, Colorado, in 1/100 in, plotted against years (1950 to 2000), with the threshold u marked.]
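As a minimal sketch of the exceedance definition, the snippet below extracts the excesses X − u for all observations above u. The function name `exceedances` and the sample values are illustrative assumptions; the numbers are made up and are not the Boulder record.

```python
def exceedances(values, u):
    """Return the excesses x - u for every observation x strictly above u."""
    return [x - u for x in values if x > u]


# Hypothetical daily precipitation totals in 1/100 in (NOT the Boulder data).
daily = [0, 12, 0, 85, 3, 140, 0, 0, 97, 210, 5, 60]
u = 80

excess = exceedances(daily, u)
rate = len(excess) / len(daily)  # empirical estimate of P(X > u)

print(excess)
print(rate)
```

Only the excesses enter the threshold model; the exceedance rate is kept separately because it is needed later to turn the fitted tail into return levels.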

The generalized Pareto distribution (GPD)

As u → ∞, the distribution of Y := X − u | X > u converges to

$$H(y) = 1 - \left(1 + \frac{\xi y}{\sigma}\right)^{-1/\xi}.$$

H(y) is called the “generalized Pareto” distribution (GPD) and has two parameters:

shape parameter ξ

scale parameter σ
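The GPD formula can be written down directly. The sketch below is one possible plain-Python version; the function names `gpd_cdf` and `gpd_quantile` are assumptions made for illustration, and the ξ = 0 case uses the exponential limit 1 − exp(−y/σ) of H(y).

```python
import math


def gpd_cdf(y, xi, sigma):
    """H(y) = 1 - (1 + xi*y/sigma)^(-1/xi) for y >= 0 (exponential limit at xi = 0)."""
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:
        return 1.0 - math.exp(-y / sigma)
    t = 1.0 + xi * y / sigma
    if t <= 0:
        # Only possible for xi < 0: y lies beyond the upper endpoint -sigma/xi.
        return 1.0
    return 1.0 - t ** (-1.0 / xi)


def gpd_quantile(p, xi, sigma):
    """Inverse of gpd_cdf for 0 <= p < 1."""
    if abs(xi) < 1e-12:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


# Illustrative parameter values, not estimates from any data set.
print(gpd_cdf(2.0, 0.2, 1.5))
print(gpd_quantile(0.99, 0.2, 1.5))
```

A positive ξ gives a heavy tail, ξ = 0 an exponential tail, and a negative ξ a tail with a finite upper endpoint.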

Extrapolation beyond the data

Return levels (quantiles) outside the data range are often the quantity of interest:

Given m, what is the return level z that is exceeded with probability 1/m?

$$P(X > z) = \frac{1}{m}$$

E.g. for precipitation: the “infamous” 100-year flood.

Easy to compute once the parameters of the model are estimated.
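To illustrate how a return level follows from fitted parameters, the sketch below solves P(X > z) = 1/m under the assumption that excesses over u follow a GPD(ξ, σ) and that ζ_u = P(X > u) is known or estimated, so that P(X > z) = ζ_u (1 + ξ(z − u)/σ)^(−1/ξ). The name `return_level`, the argument `zeta_u`, and all numbers are assumptions for illustration; m is counted in observations, so a 100-year level for daily data would use roughly m = 100 × 365.25.

```python
import math


def return_level(m, u, xi, sigma, zeta_u):
    """Level z with P(X > z) = 1/m, given GPD(xi, sigma) excesses over u
    and zeta_u = P(X > u). Requires m * zeta_u >= 1 so that z >= u."""
    if m * zeta_u < 1.0:
        raise ValueError("1/m exceeds P(X > u); z would lie below the threshold")
    if abs(xi) < 1e-12:
        return u + sigma * math.log(m * zeta_u)
    return u + sigma / xi * ((m * zeta_u) ** xi - 1.0)


# Hypothetical values: threshold 80 (1/100 in), 5% of days above it.
z100 = return_level(m=36525, u=80.0, xi=0.1, sigma=30.0, zeta_u=0.05)
print(z100)
```

The extrapolation is only as good as the GPD approximation above u, which is why the choice of threshold matters so much for return levels.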

Advantages and limitations

From a theoretical viewpoint

(+) “Universal” approach.

(-) Asymptotic result: convergence in u and sample size might be very slow.

From a statistical viewpoint

Choosing the threshold: a trade-off. A high threshold yields a better GPD approximation, whereas a low threshold leaves more data points.

Goodness of fit: is it reasonable to remove observations in order to fit a pre-fixed model?

From a scientific viewpoint: the threshold determines the question “What is an extreme value?”

Folding – idea

Main idea: Want to use information from the data below the threshold as well.

[Figure: data above u are kept where they are (“keep it here”); data below u are moved above u (“move above u”). F maps the values below u into [0, F(u)) (uniform), these are moved into [F(u), 1) (uniform), and F^{-1} maps them back to values above u.]

Folding – formula

[Figure: X is mapped to Y(u). F takes the values below u into [0, F(u)) (uniform), these are moved into [F(u), 1) (uniform), and F^{-1} maps them back to values above u.]

$$Y(u) := \begin{cases} F^{-1}\left( \dfrac{\bar{F}(u)}{F(u)}\, F(X) + F(u) \right) & \text{if } X \le u \\ X & \text{if } X > u \end{cases} \qquad \text{where } \bar{F} = 1 - F.$$

If X ∼ F(x), then Y(u) has the same distribution as X | X > u.

Problem: F is unknown.

Idea

Estimate F (in the “middle” of the distribution) with the empirical cdf F_n.

Estimate F^{-1} with a “preliminary” GPD.
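The folding map can be tried out in the idealized case where F is known. The sketch below assumes an exponential F with scale σ, chosen only because F and F^{-1} have closed forms; it does not implement the plug-in step with the empirical cdf and a preliminary GPD. The function name `fold` and all numbers are illustrative assumptions.

```python
import math
import random


def fold(x, u, cdf, inv_cdf):
    """Folding map Y(u): keep x if x > u, otherwise move it above u.
    Assumes 0 < cdf(u) < 1."""
    if x > u:
        return x
    fu = cdf(u)
    p = (1.0 - fu) / fu * cdf(x) + fu  # maps [0, F(u)] onto [F(u), 1]
    p = min(p, 1.0 - 1e-12)            # guard against p == 1 when x == u
    return inv_cdf(p)


sigma = 2.0  # assumed known scale of the exponential F


def cdf(x):
    return 1.0 - math.exp(-x / sigma)


def inv_cdf(p):
    return -sigma * math.log(1.0 - p)


rng = random.Random(0)
xs = [rng.expovariate(1.0 / sigma) for _ in range(1000)]
u = 3.0
ys = [fold(x, u, cdf, inv_cdf) for x in xs]

# For an exponential F, X - u | X > u is again exponential with mean sigma,
# so the mean folded excess should be close to sigma.
mean_excess = sum(y - u for y in ys) / len(ys)
print(min(ys), mean_excess)
```

Every folded value lies at or above u, so all 1000 points contribute to the tail estimate instead of only those that exceeded the threshold.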

Folding – simulation results

Normal dist. with 100 data points (“F unknown”)

[Figure: “RETURN LEVELS”, return levels plotted against m (years) for m from 20 to 100. Red = "true", green = with folding, white = conventional.]

Folding – simulation results

Normal dist. with 100 data points (“F known”)

[Figure: “RETURN LEVELS”, return levels plotted against m (years) for m from 20 to 100. Red = "true", green = with folding, white = conventional.]

Folding – simulation results

Cauchy dist. with 100 data points (“F unknown”)

[Figure: “RETURN LEVELS”, return levels plotted against m (years) for m from 20 to 100. Red = "true", green = with folding, white = conventional.]

Folding – simulation results

Cauchy dist. with 100 data points (“F known”)

[Figure: “RETURN LEVELS”, return levels plotted against m (years) for m from 20 to 100. Red = "true", green = with folding, white = conventional.]

Folding – An analytical result

Assuming that ξ = 0, it can be shown that the variance of the estimator for σ using the folding procedure, Var(σ̂), for a GPD(ξ = 0, σ) can be reduced compared to the conventional estimator.

$$\mathrm{Var}(\hat{\sigma}) \le \frac{\sigma}{n}$$

The reduction in variance is a function of the threshold u and the “quality” of the approximation for F.

Simulation results suggest that the folding works better for heavy-tailed (Cauchy) distributions.
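The variance reduction for ξ = 0 can be checked by simulation in the idealized “F known” case. The sketch below assumes exponential data with known scale, estimates σ by the mean excess, and compares the conventional estimator (exceedances only) with the one based on the folded sample. It is a Monte Carlo illustration under these assumptions, not the analytical argument referred to above; the constants `SIGMA`, `U`, `N`, `REPS` and the helper names are hypothetical.

```python
import math
import random
import statistics

SIGMA, U, N, REPS = 1.0, 1.5, 100, 400  # illustrative settings


def cdf(x):
    return 1.0 - math.exp(-x / SIGMA)


def inv_cdf(p):
    return -SIGMA * math.log(1.0 - p)


def fold(x, u):
    """Folding map with known exponential F (an assumption of this sketch)."""
    if x > u:
        return x
    fu = cdf(u)
    p = min((1.0 - fu) / fu * cdf(x) + fu, 1.0 - 1e-12)
    return inv_cdf(p)


rng = random.Random(1)
conventional, folded = [], []
for _ in range(REPS):
    xs = [rng.expovariate(1.0 / SIGMA) for _ in range(N)]
    exc = [x - U for x in xs if x > U]
    if not exc:
        continue  # no exceedances: the conventional estimate is undefined
    conventional.append(statistics.fmean(exc))
    folded.append(statistics.fmean([fold(x, U) - U for x in xs]))

var_conv = statistics.variance(conventional)
var_fold = statistics.variance(folded)
print(var_conv, var_fold)
```

With F known, the folded sample uses all N points rather than only the exceedances, which is where the smaller spread of the folded estimate comes from. When F has to be estimated, the gain depends on how well F is approximated.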

Conclusions

Increasing the threshold according to model fit diagnostics may be misleading in assessing the quality of the fit.

Using more information from below the threshold seems to yield more robust estimates.

Using the folding procedure may lead to more freedom to “define” extreme values in applications.

APPENDIX – GPD convergence

[Figure: shape parameter for normal distribution (simulated).]