Berlin

69
Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules Nonparametric estimation of copula densities A. Charpentier 1 , J.D. Fermanian 2 & O. Scaillet 3 Gemeinsame Jahrestagung der Deutschen Mathematiker-Vereinigung und der Gesellschaft für Didaktik der Mathematik, Multivariate Dependence Modelling using Copulas - Applications in Finance Humboldt-Universität zu Berlin, March 2007 1 ensae-ensai-crest & Katholieke Universiteit Leuven, 2 CoopeNeff A.M. & crest et 3 HEC Genève & FAME 1

description

 

Transcript of Berlin

Page 1: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Nonparametric estimation of copuladensities

A. Charpentier1, J.D. Fermanian2 & O. Scaillet3

Gemeinsame Jahrestagung der Deutschen Mathematiker-Vereinigung und derGesellschaft für Didaktik der Mathematik,

Multivariate Dependence Modelling using Copulas - Applications in Finance

Humboldt-Universität zu Berlin, March 2007

1ensae-ensai-crest & Katholieke Universiteit Leuven, 2CoopeNeff A.M. & crest et 3HEC Genève & FAME

1

Page 2: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

One (short) word on copulasDefinition 1. A 2-dimensional copula is a 2-dimensional cumulativedistribution function restricted to [0, 1]2 with standard uniform margins.

Copula (cumulative distribution function) Level curves of the copula

Copula density Level curves of the copula

If C is twice differentiable, let c denote the density of the copula.

2

Page 3: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Theorem 2. (Sklar) Let C be a copula, and FX and FY two marginaldistributions, then F (x, y) = C(FX(x), FY (y)) is a bivariate distributionfunction, with F ∈ F(FX , FY ).

Conversely, if F ∈ F(FX , FY ), there exists C such thatF (x, y) = C(FX(x), FY (y). Further, if FX and FY are continuous, then C isunique, and given by

C(u, v) = F (F−1X (u), F−1

Y (v)) for all (u, v) ∈ [0, 1]× [0, 1]

We will then define the copula of F , or the copula of (X, Y ).

3

Page 4: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Motivation

Example Loss-ALAE: consider the following dataset, were the Xi’s are lossamount (paid to the insured) and the Yi’s are allocated expenses. Denote byRi and Si the respective ranks of Xi and Yi. Set Ui = Ri/n = FX(Xi) andVi = Si/n = FY (Yi).

Figure 1 shows the log-log scatterplot (log Xi, log Yi), and the associate copulabased scatterplot (Ui, Vi).

Figure 2 is simply an histogram of the (Ui, Vi), which is a nonparametricestimation of the copula density.

Note that the histogram suggests strong dependence in upper tails (theinteresting part in an insurance/reinsurance context).

4

Page 5: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

1 2 3 4 5 6

12

34

5

Log−log scatterplot, Loss−ALAE

log(LOSS)

log(A

LA

E)

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Copula type scatterplot, Loss−ALAE

Probability level LOSSP

robabili

ty le

vel A

LA

E

Figure 1: Loss-ALAE, scatterplots (log-log and copula type).

5

Page 6: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Figure 2: Loss-ALAE, histogram of copula type transformation.

6

Page 7: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Basic notions on nonparametric estimation of densities

Let {X1, ..., Xn} denote an i.i.d. sample with density f .

The standard kernel estimate of a density is,

f(x) =1

nh

n∑

i=1

K

(x−Xi

h

),

where K is a kernel function (e.g. K(ω) = I(|ω| ≤ 1)/2 for the movinghistogram).

If {X1, ..., Xn} of positive random variable (Xi ∈ [0,∞[). Let K denote asymmetric kernel, then

E(f(0, h) =12f(0) + O(h)

7

Page 8: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Exponential random variables

Den

sity

−1 0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Exponential random variables

Den

sity

−1 0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Figure 3: Density estimation of an exponential density, 100 points.

8

Page 9: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Kernel based estimation of the uniform density on [0,1]

Densi

ty

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

1.2

Kernel based estimation of the uniform density on [0,1]

D

ensi

ty

Figure 4: Density estimation of an uniform density on [0, 1].

9

Page 10: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

How to get a proper estimation on the border

Several techniques have been introduce to get a better estimation on theborder,

• boundary kernel (Müller (1991))

• mirror image modification (Deheuvels & Hominal (1989), Schuster(1985))

• transformed kernel (Devroye & Györfi (1981), Wand, Marron &Ruppert (1991))

In the particular case of densities on [0, 1],

• Beta kernel (Brown & Chen (1999), Chen (1999, 2000)),

• average of histograms inspired by Dersko (1998).

10

Page 11: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

The mirror image modification

The idea is to fit a density to the enlarged sample −X1, ...,−Xn, X1, ..., Xn

(for a positive distribution) and −X1, ...,−Xn, X1, ..., Xn, 2−X1, ..., 2−Xn

(for a distribution with support [0, 1]).

The estimator of the density is

f(x, h) =1

nh

n∑

i=1

K

(x−Xi

h

)+ K

(x + Xi

h

).

Remark: It works well when the density f has a null derivative in 0.

11

Page 12: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

−1.0 −0.5 0.0 0.5 1.0 1.5 2.0

0.0

0.5

1.0

1.5

2.0

Density estimation using mirror image

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density estimation using mirror image

Figure 5: The mirror image principle.

12

Page 13: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

The transformed Kernel estimate

The idea was developed in Devroye & Györfi (1981) for univariatedensities.

Consider a transformation T : R→ [0, 1] strictly increasing, continuouslydifferentiable, one-to-one, with a continuously differentiable inverse.

Set Y = T (X). Then Y has density

fY (y) = fX(T−1(y)) · (T−1)′(y).

If fY is estimated by fY , then fX is estimated by

fX(x) = fY (T (x)) · T ′(x).

13

Page 14: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

−3 −2 −1 0 1 2 3

0.0

0.1

0.2

0.3

0.4

0.5

Density estimation transformed kernel

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density estimation transformed kernel

Figure 6: The transform kernel principle (with a Φ−1-transformation).

14

Page 15: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

The Beta Kernel estimate

The Beta-kernel based estimator of a density with support [0, 1] at point x, isobtained using beta kernels, which yields

f(x) =1n

n∑

i=1

K

(Xi,

u

b+ 1,

1− u

b+ 1

)

where K(·, α, β) denotes the density of the Beta distribution with parametersα and β,

K(x, α, β) =Γ(α + β)Γ(α)Γ(β)

xα(1− x)β−11(x ∈ [0, 1]).

15

Page 16: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

Gaussian Kernel approach

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

Beta Kernel approach

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density estimation using beta kernels

Figure 7: The beta-kernel estimate.

16

Page 17: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

The average of histogramsIn a very general context Dersko (1998) introduced the intrinsic estimator ofthe density, as the average of all possible histograms. Consider the case of adensity on the unit interval [0, 1]. Let Jm denote the set of all partitions of[0, 1) into m+1 subintervals. Define fm(·) as the average of all the histograms,

fm(·) =1

#Jm

Im∈Jm

fIm(x), (1)

as in Dersko (1998).

Let Im = (I0:m, I1:m, I2:m, . . . , Im:m) a partition of [0, 1) into m + 1subintervals, I =

⋃i Ii:m. Set Ii:m = [ui−1:m, ui:m), where

u0:m = 0 < u1:m < · · · < uk:m < uk+1:m = 1. The histogram of X1, . . . , Xn

associated to Ik is

fIm(x) =1

n · [ui:m − ui−1:m]

n∑

j=1

1(Xi ∈ Ii:m), where x ∈ Ii:m.

17

Page 18: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density as average of randomized histograms, n=250, m=1

Figure 8: Average of histograms, m = 1.

18

Page 19: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=2

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=2

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=2

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=2

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density as average of randomized histograms, n=250, m=2

Figure 9: Average of histograms, m = 2.

19

Page 20: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=5

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=5

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=5

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

Average of histograms, m=5

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density as average of randomized histograms, n=250, m=5

Figure 10: Average of histograms, m = 5.

20

Page 21: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density as average of randomized histograms, n=250, m=10

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

Density as average of randomized histograms, n=250, m=20

Figure 11: Average of histograms, m = 10 and m = 20.

21

Page 22: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

Gaussian Kernel approach

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

Beta Kernel approach

0.0 0.2 0.4 0.6 0.8 1.0

02

46

810

Average of histograms

0.0 0.2 0.4 0.6 0.8 1.00

24

68

10

Average of histograms

Figure 12: Shape of the induced kernel based on uniform averaging.

22

Page 23: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Copula density estimation: the boundary problem

Let (U1, V1), ..., (Un, Vn) denote a sample with support [0, 1]2, and with densityc(u, v), which is assumed to be twice continuously differentiable on (0, 1)2.

If K denotes a symmetric kernel, with support [−1,+1], then for all(u, v) ∈ [0, 1]× [0, 1], in any corners (e.g. (0, 0))

E(c(0, 0, h)) =14· c(u, v)− 1

2[c1(0, 0) + c2(0, 0)]

∫ 1

0

ωK(ω)dω · h + o(h).

on the interior of the borders (e.g. u = 0 and v ∈ (0, 1)),

E(c(0, v, h)) =12· c(u, v)− [c1(0, v)]

∫ 1

0

ωK(ω)dω · h + o(h).

and in the interior ((u, v) ∈ (0, 1)× (0, 1)),

E(c(u, v, h)) = c(u, v) +12[c1,1(u, v) + c2,2(u, v)]

∫ 1

−1

ω2K(ω)dω · h2 + o(h2).

23

Page 24: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

On borders, there is a multiplicative bias and the order of the bias is O(h)(while it is O(h2) in the interior).

If K denotes a symmetric kernel, with support [−1,+1], then for all(u, v) ∈ [0, 1]× [0, 1], in any corners (e.g. (0, 0))

V ar(c(0, 0, h)) = c(0, 0)(∫ 1

0

K(ω)2dω

)2

· 1nh2

+ o

(1

nh2

).

on the interior of the borders (e.g. u = 0 and v ∈ (0, 1)),

V ar(c(0, v, h)) = c(0, v)(∫ 1

−1

K(ω)2dω

)(∫ 1

0

K(ω)2dω

)· 1nh2

+ o

(1

nh2

).

and in the interior ((u, v) ∈ (0, 1)× (0, 1)),

V ar(c(u, v, h)) = c(u, v)(∫ 1

−1

K(ω)2dω

)2

d · 1nh2

+ o

(1

nh2

).

24

Page 25: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Figure 13: Theoretical density of Frank copula.

25

Page 26: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Figure 14: Estimated density of Frank copula, using standard Gaussian (inde-pendent) kernels, h = h∗.

26

Page 27: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Transformed kernel technique

Consider the kernel estimator of the density of the(Xi, Yi) = (G−1(Ui), G−1(Vi))’s, where G is a strictly increasing distributionfunction, with a differentiable density. Since density f is continuous, twicedifferentiable, and bounded above, for all (x, y) ∈ R2, define

f(x, y) =1

nh2

n∑

i=1

K

(x−Xi

h

)K

(y − Yi

h

),

Sincef(x, y) = g(x)g(y)c[G(x), G(y)]. (2)

can be inverted in

c(u, v) =f(G−1(u), G−1(v))

g(G−1(u))g(G−1(v)), (u, v) ∈ [0, 1]× [0, 1], (3)

27

Page 28: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

one gets, substituting f in (3)

c(u, v) =1

nh · g(G−1(u)) · g(G−1(v))

n∑

i=1

K

(G−1(u)−G−1(Ui)

h,G−1(v)−G−1(Vi)

h

),

(4)

Therefore,

E(c(u, v, h)) = c(u, v) +o(h)

g(G−1(u))g(G−1(v)).

Similarly,

V ar(c(u, v, h)) =1

g(G−1(u))g(G−1(v))

[c(u, v)nh2

(∫K(ω)2dω

)2]

+1

g(G−1(u))2g(G−1(v))2o

(1

nh2

).

Asymptotic normality and also be derived.

28

Page 29: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 15: Estimated density of Frank copula, using a Gaussian kernel, after aGaussian normalization.

29

Page 30: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 16: Estimated density of Frank copula, using a Gaussian kernel, after aStudent normalization, with 5 degrees of freedom.

30

Page 31: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 17: Estimated density of Frank copula, using a Gaussian kernel, after aStudent normalization, with 3 degrees of freedom.

31

Page 32: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Using the mirror reflection principleInstead of using only the (Ui, Vi)’s the mirror image consists in reflecting eachdata point with respect to all edges and corners of the unit square [0, 1]× [0, 1].Hence, additional observations can be considered, i.e. the (±Ui,±Vi)’s, the(±Ui, 2− Vi)’s, the (2− Ui,±Vi)’s and the (2− Ui, 2− Vi)’s. Hence, consider

c(u, v) =1

nh

n∑

i=1

K

(u− Ui

h

)K

(v − Vi

h

)+ K

(u + Ui

h

)K

(v − Vi

h

)+

K

(u− Ui

h

)K

(v + Vi

h

)+ K

(u + Ui

h

)K

(v + Vi

h

)+

K

(u− Ui

h

)K

(v − 2 + Vi

h

)+ K

(u + Ui

h

)K

(v − 2 + Vi

h

)+

K

(u− 2 + Ui

h

)K

(v − Vi

h

)+ K

(u− 2 + Ui

h

)K

(v + Vi

h

)+

K

(u− 2 + Ui

h

)K

(v − 2 + Vi

h

).

32

Page 33: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

In the case of uniformly continuous copula density c on [0, 1]× [0, 1], if K is abounded symmetric kernel that vanishes off [−1, 1], with ωK(ω) → 0 asω →∞, and if nh2 →∞ as h → 0 and n →∞, then f satisfies standardasymptotic properties, e.g. for every (u, v) ∈ [0, 1]× [0, 1]

E(c(u, v, h)) = c(u, v) + O(h) on the borders,

= c(u, v) + O(h2) in the interior,

and moreover, if σ2 =∫ 1

−1K(ω)2dω

V ar(c(u, v, h)) = 4c(u, v)(nh2)−1σ4 + o((nh2)−1) in corner,

= 2c(u, v)(nh2)−1σ4 + o((nh2)−1) in the interior of borders,

= c(u, v)(nh2)−1σ4 + o((nh2)−1) in the interior.

Furthermore, asymptotic normality can also be obtained,√

nh2 (c(u, v, h)− c(u, v)) L→ N(

0, κc(u, v)∫

K2(ω)dω

),

33

Page 34: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

−1.0 −0.5 0.0 0.5 1.0 1.5 2.0

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

0.0 0.2 0.4 0.6 0.8 1.00.0

0.2

0.4

0.6

0.81.0

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density

−1.0 −0.5 0.0 0.5 1.0 1.5 2.0

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

0.0 0.2 0.4 0.6 0.8 1.00.0

0.2

0.4

0.6

0.81.0

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density

Figure 18: Copula density estimation - mirror reflection (n = 100, 1000).

34

Page 35: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

−1.0 −0.5 0.0 0.5 1.0 1.5 2.0

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density (mirror reflection)

−1.0 −0.5 0.0 0.5 1.0 1.5 2.0

−1.0

−0.5

0.0

0.5

1.0

1.5

2.0

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density (mirror reflection)

Figure 19: Copula density estimation - mirror reflection (n = 100, 1000).

35

Page 36: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density (mirror reflection) Estimation of the copula density (mirror reflection)

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Figure 20: Estimated density of Frank copula, mirror reflection principle

36

Page 37: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Bivariate Beta kernels

The Beta-kernel based estimator of the copula density at point (u, v), isobtained using product beta kernels, which yields

c(u, v) =1n

n∑

i=1

K

(Xi,

u

b+ 1,

1− u

b+ 1

)·K

(Yi,

v

b+ 1,

1− v

b+ 1

),

where K(·, α, β) denotes the density of the Beta distribution with parametersα and β.

37

Page 38: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Beta (independent) bivariate kernel , x=0.0, y=0.0 Beta (independent) bivariate kernel , x=0.2, y=0.0 Beta (independent) bivariate kernel , x=0.5, y=0.0

Beta (independent) bivariate kernel , x=0.0, y=0.2 Beta (independent) bivariate kernel , x=0.2, y=0.2 Beta (independent) bivariate kernel , x=0.5, y=0.2

Beta (independent) bivariate kernel , x=0.0, y=0.5 Beta (independent) bivariate kernel , x=0.2, y=0.5 Beta (independent) bivariate kernel , x=0.5, y=0.5

Figure 21: Shape of bivariate Beta kernels K(·, x/b+1, (1−x)/b+1)×K(·, y/b+1, (1− y)/b + 1) for b = 0.2.

38

Page 39: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Assume that the copula density c is twice differentiable on [0, 1]× [0, 1]. Let(u, v) ∈ [0, 1]× [0, 1]. The bias of c(u, v) is of order b, i.e.

E(c(u, v)) = c(u, v) +Q(u, v) · b + o(b),

where the bias Q(u, v) is

Q(u, v) = (1−2u)c1(u, v)+(1−2v)c2(u, v)+12

[u(1− u)c1,1(u, v) + v(1− v)c2,2(u, v)] .

The bias here is O(b) (everywhere) while it is O(h2) using standard kernels.

Assume that the copula density c is twice differentiable on [0, 1]× [0, 1]. Let(u, v) ∈ [0, 1]× [0, 1]. The variance of c(u, v) is in corners, e.g. (0, 0),

V ar(c(0, 0)) =1

nb2[c(0, 0) + o(n−1)],

in the interior of borders, e.g. u = 0 and v ∈ (0, 1)

V ar(c(0, v)) =1

2nb3/2√

πv(1− v)[c(0, v) + o(n−1)],

39

Page 40: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

and in the interior,(u, v) ∈ (0, 1)× (0, 1)

V ar(c(u, v)) =1

4nbπ√

v(1− v)u(1− u)[c(u, v) + o(n−1)].

Remark From those properties, an (asymptotically) optimal bandwidth b canbe deduced, maximizing asymptotic mean squared error,

b∗ ≡(

116πnQ(u, v)2

· 1√v(1− v)u(1− u)

)1/3

.

Note (see Charpentier, Fermanian & Scaillet (2005)) that all those results canbe obtained in dimension d ≥ 2.

Example For n = 100 simulated data, from Frank copula, the optimalbandwidth is b ∼ 0.05.

40

Page 41: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density (Beta kernel, b=0.1) Estimation of the copula density (Beta kernel, b=0.1)

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Figure 22: Estimated density of Frank copula, Beta kernels, b = 0.1

41

Page 42: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Estimation of the copula density (Beta kernel, b=0.05) Estimation of the copula density (Beta kernel, b=0.05)

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Figure 23: Estimated density of Frank copula, Beta kernels, b = 0.05

42

Page 43: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4Standard Gaussian kernel estimator, n=100

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Standard Gaussian kernel estimator, n=1000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Standard Gaussian kernel estimator, n=10000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

Figure 24: Density estimation on the diagonal, standard kernel.

43

Page 44: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4Transformed kernel estimator (Gaussian), n=100

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Transformed kernel estimator (Gaussian), n=1000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Transformed kernel estimator (Gaussian), n=10000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

Figure 25: Density estimation on the diagonal, transformed kernel.

44

Page 45: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4Beta kernel estimator, b=0.05, n=100

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.02, n=1000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.005, n=10000

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

Figure 26: Density estimation on the diagonal, Beta kernel.

45

Page 46: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Average of histograms and copulas

A natural idea would be to extend the notion of average of histograms inhigher dimension. But note that the notion of partition can be morecomplicated.

Recall that C is a copula if for any u1 ≤ u2 and v1 ≤ v2,

C(u2, v2)− C(u1, v2)− C(u2, v1) + C(u1, v1) ≥ 0,

and C is a semicopula (Durante & Sempi (2004)) if for any u1 ≤ u2 andv1 ≤ v2, C(u2, v2)− C(u1, v2) ≥ 0 and C(u2, v2)− C(u2, v1) ≥ 0.

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

Copula, positive area

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

Semicopula, positive area

46

Page 47: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Histogram in dimension 2

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Histogram in dimension 2

Figure 27: Partitioning the unit square.

47

Page 48: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Copula density as average of randomized histograms, m=2 Copula density as average of randomized histograms, m=2

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Copula density as average of randomized histograms, m=5 Copula density as average of randomized histograms, m=5

0.0 0.2 0.4 0.6 0.8 1.00.0

0.20.4

0.60.8

1.0

Figure 28: Copula density using average of histograms.

48

Page 49: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0.0

0.5

1.0

1.5

2.0

2.5

3.0

Copula density as average of randomized histograms, m=10 Copula density as average of randomized histograms, m=10

0.0 0.2 0.4 0.6 0.8 1.0

0.00.2

0.40.6

0.81.0

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Copula density as average of randomized histograms, m=15 Copula density as average of randomized histograms, m=15

0.0 0.2 0.4 0.6 0.8 1.00.0

0.20.4

0.60.8

1.0

Figure 29: Copula density using average of histograms.

49

Page 50: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4Average of histograms, n=1000, m=5

Estimation of the density on the diagonal

Estim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Average of histograms, n=1000, m=10

Estimation of the density on the diagonal

Estim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Average of histograms, n=1000, m=15

Estimation of the density on the diagonal

Estim

ato

r

Figure 30: Density estimation on the diagonal, average of histograms.

50

Page 51: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Copula density estimation

Gijbels & Mielniczuk (1990): given an i.i.d. sample, a natural estimate forthe normed density is obtained using the transformed sample(FX(X1), FY (Y1)), ..., (FX(Xn), FY (Yn)), where FX and FY are the empiricaldistribution function of the marginal distribution. The copula density can beconstructed as some density estimate based on this sample (Behnen,Husková & Neuhaus (1985) investigated the kernel method).

The natural kernel type estimator c of c is

c(u, v) =1

nh2

n∑

i=1

K

(u− FX(Xi)

h,v − FY (Yi)

h

), (u, v) ∈ [0, 1].

“this estimator is not consistent in the points on the boundary of the unitsquare.”

51

Page 52: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Copula density estimation and pseudo-observations

Figure 31 shows scatterplots when margins are known (i.e.(FX(Xi), FY (Yi))’s), and when margins are estimated (i.e.(FX(Xi), FY (Yi)’s). Note that the pseudo sample is more “uniform”, in thesense of a lower discrepancy (as in Quasi Monte Carlo techniques, see e.g.Niederreiter, H. (1992)).

52

Page 53: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Scatterplot of observations (Xi,Yi)

Value of the Xis

Va

lue

of

the

Yis

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

Scatterplot of pseudo−observations (Ui,Vi)

Value of the Uis (ranks)V

alu

e o

f th

e V

is (

ran

ks)

Figure 31: Observations and pseudo-observation, 500 simulated observationsfrom Frank copula (Xi, Yi) and the associate pseudo-sample (FX(Xi), FY (Yi)).

53

Page 54: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Because samples are more “uniform” using ranks and pseudo-observations, thevariance of the estimator of the density, at some given point(u, v) ∈ (0, 1)× (0, 1) is usually smaller. For instance, Figure 32 shows theimpact of considering pseudo observations, i.e. substituting FX and FY tounknown marginal distributions FX and FY . The dotted line shows thedensity of c(u, v) from a n = 100 sample (Ui, Vi) (from Frank copula), and thestraight line shows the density of c(u, v) from the sample (FU (Ui), FV (Vi))(i.e. ranks of observations).

54

Page 55: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0 1 2 3 4

0.00.5

1.01.5

2.0

Impact of pseudo−observations (n=100)

Distribution of the estimation of the density c(u,v)

Dens

ity of

the e

stima

tor

Figure 32: The impact of estimating from pseudo-observations.

55

Page 56: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

The problem of censored dataIn the case of censored data, consider sample of ((Xi, δ

Xi ), (Yi, δ

Yi ))’s, where

δXi = 0 when Xi is censored. Xi = min{X0

i , CXi } and Yi = min{X0

i , CYi }

where the Ci’s are censoring variates. The (Xi, Yi) are observed data, the(X0

i , Y 0i )’s are the variables of interest, that might be unobservable.

Kaplan-Meier estimator can be considered, based on the transformed samplemade of the non-zero elements.

The transformed sample is denoted by (X ′i, Y

′i ), i = 1, ..., n′, and its size is

n′ =∑n

i=1 δXi · δY

i of fully uncensored data.

The marginal Kaplan-Meier of the distribution function is

FKMX (x) = 1−

i:Xi≤x

(n− i

n− i + 1

)δXi

.

The nonparametric estimator of the copula density at point (u, v) can beobtained using product beta kernels, on

56

Page 57: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

(UKMi , V KM

i ) = (FKMX (X ′

i), FKMY (Y ′

i )), i = 1, ..., n′.

Remark In the LOSS-ALAE dataset, the loss amounts Xi’s are censored,because of a policy limit. Hence, Xi = min{loss, limit}.

57

Page 58: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

2 4 6 8 10 12 14

46

810

12

Log(LOSS)

Log(A

LA

E)

Figure 33: The LOSS-ALAE dataset and censoring.

58

Page 59: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Unfortunately, such a procedure suffers from the so-called stock sampling (orselection bias) issue. Actually, the subsample used is far from being an i.i.d.sample drawn from the law of (X, Y ).

The observations in hand are chosen as fully uncensored. Thus, smallobserved (Xi, Yi) are more frequent in the subsample than they should be in ausual i.i.d. sample. Therefore, the previous estimator can be advised onlywhen the proportion of censoring is small. Otherwise the bias induced bystock sampling can become large.

59

Page 60: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 34: Estimated density of Frank copula, using Beta kernels, n = 1000,b = 0.050 and no-censoring.

60

Page 61: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 35: Estimated density of Frank copula, using Beta kernels, n = 1000,b = 0.050 and 1.7% censored data.

61

Page 62: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 36: Estimated density of Frank copula, using Beta kernels, n = 1000,b = 0.050 and 7.8% censored data.

62

Page 63: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

0

1

2

3

4

5

Estimation of Frank copula

0.2 0.4 0.6 0.8

0.2

0.4

0.6

0.8

Figure 37: Estimated density of Frank copula, using Beta kernels, n = 1000,b = 0.050 and 24% censored data.

63

Page 64: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.01, n=1000, no censoring

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.01, n=1000, 5% censoring

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.01, n=1000, 25% censoring

Estimation of the density on the diagonal

Density o

f th

e e

stim

ato

r

Figure 38: Estimation of the density on the diagonal, with censored data .

64

Page 65: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Stock sampling can be observe as the bias increases as u goes to 1, and as theproportion of censored data increases.

Figure 39 shows the estimated density c(12/13, 12/13) with no censoring, and25% censored observations.

Following the idea proposed in Efron and Tibshirani (1993), bootstraptechniques can be used to estimate the bias. Consider B bootstrap samplesdrawn independently from the original censored data, with replacement,{(

X∗i,b, Y

∗i,b, δ

∗i,b

)}, i = 1, ..., n and b = 1, ..., B. The bias of the density

estimator at point (u, v) can be estimated from

bias (u, v) =1B

B∑

i=1

c (u, v)∗,b − c (u, v)

where c (u)∗,b is the density obtained from the bth bootstrap sample.

65

Page 66: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

Standard Beta kernel estimator

Estimation of the density near the lower corner

Den

sity

of t

he e

stim

ator

Figure 39: c(12/13, 12/13) using Beta kernels, with censored data (dotted line).

66

Page 67: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Transformed kernel estimator (Gaussian), n=1000

Estimation of the density on the diagonal, 25% censoring

Densi

ty o

f th

e e

stim

ato

r, b

oots

trap b

ias

corr

ect

ion

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

Beta kernel estimator, b=0.01, n=1000

Estimation of the density on the diagonal, 25% censoringD

ensi

ty o

f th

e e

stim

ato

r, b

oots

trap b

ias

corr

ect

ion

Figure 40: Censoring bias correction using bootstrap techniques.

67

Page 68: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

Figure 40 shows the the evolution of{

u 7→ 1B

B∑

i=1

c (u, u)∗,b}

,

on the diagonal using bootstrap techniques (as in Efron (1981)).

For each bootstrap sample, unobservable value Y 0i,b and censoring value Ci,b

are simulated using Kaplan-Meier estimate of the density, and then a sample(Xi,b, Y i, b, δi, b), i = 1, ..., n is obtained, and a new Kaplan-Meier estimationis obtained, F 0

b , and is used to obtain the estimation of the copula density forthis bootstrap sample.

The application to the Loss-ALAE dataset confirms Gumbel model.

68

Page 69: Berlin

Arthur CHARPENTIER - Estimation non-paramétrique des densités de copules

0.0 0.2 0.4 0.6 0.8 1.0

01

23

45

67

Beta kernel estimator, b=0.01, n=1000

Estimation of the density on the diagonal, LOSS−ALAE

De

nsi

ty o

f th

e e

stim

ato

r, b

oo

tstr

ap

bia

s co

rre

ctio

n

0.75 0.80 0.85 0.90 0.95 1.00

01

23

45

67

Beta kernel estimator, b=0.005, n=1000

Estimation of the density on the diagonal, LOSS−ALAED

en

sity

of th

e e

stim

ato

r, b

oo

tstr

ap

bia

s co

rre

ctio

n

Figure 41: Beta kernel estimator for Loss-ALAE, compare with Gumbel copula.

69