The region determined by Kendall’s tau and Spearman’s...

The region determined by Kendall’s tau and Spearman’s rho

Manuela Schreyer, Ph.D. Student, University of Salzburg (joint work with Roland Paulin and Wolfgang Trutschnig)

Wien, October 22, 2015



Contents

1 Quick Overview

2 General definitions
   Population version of Kendall’s tau and Spearman’s rho

3 Relationship between Kendall’s tau and Spearman’s rho
   Shuffles of M
   Classical τ-ρ region Ω0
   Exact τ-ρ region Ω


Overview

[Figure: the classical τ-ρ region Ω0 (shaded) in the (τ, ρ) plane; horizontal axis τ, vertical axis ρ, both ranging from −1 to 1.]

The following inequalities between τ and ρ go back to Daniels and to Durbin & Stuart, respectively:

\[ |3\tau - 2\rho| \le 1 \]

\[ \frac{(1+\tau)^2}{2} - 1 \;\le\; \rho \;\le\; 1 - \frac{(1-\tau)^2}{2} \]

Together, the inequalities yield what we call the classical τ-ρ region Ω0 (shaded region), which is not sharp.

Although both inequalities have been known since the 1950s, the exact τ-ρ region was still unknown.


Overview

Main objective: Solve the sixty-year-old question about the τ-ρ region Ω. We derived a function Φ which fully determines Ω via

\[ \Omega = \left\{ (x, y) \in [-1,1]^2 : \Phi(x) \le y \le -\Phi(-x) \right\}. \]


Definition: Kendall’s tau and Spearman’s rho

X, Y are random variables with continuous distribution functions F and G, respectively.

Spearman’s ρ

Spearman’s ρ is defined as the Pearson correlation coefficient of the U(0, 1)-distributed random variables U := F ∘ X and V := G ∘ Y, i.e.

\[ \rho(X,Y) = 12\left( E(UV) - \tfrac{1}{4} \right) \]

Kendall’s τ

Kendall’s τ is given by the probability of concordance minus the probability of discordance, i.e.

\[ \tau(X,Y) = P\big((X_1 - X_2)(Y_1 - Y_2) > 0\big) - P\big((X_1 - X_2)(Y_1 - Y_2) < 0\big), \]

where (X1, Y1) and (X2, Y2) are independent and have the same distribution as (X, Y).
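The two definitions have direct sample analogues, which the following minimal sketch illustrates. It is not part of the talk: the function names are hypothetical, the data are made up, and ties are assumed not to occur (as is almost surely the case for continuous X and Y). Kendall's tau is computed as the share of concordant minus discordant pairs, Spearman's rho as the Pearson correlation of the ranks via the usual no-ties shortcut.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs, no ties assumed."""
    n = len(xs)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
        score += (prod > 0) - (prod < 0)  # +1 for a concordant pair, -1 for a discordant one
    return score / (n * (n - 1) / 2)

def ranks(values):
    """Ranks 1..n of the values, no ties assumed."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(xs, ys):
    """Sample Spearman's rho: Pearson correlation of the ranks (no-ties formula)."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Made-up illustrative data
x = [0.1, 0.4, 0.2, 0.9, 0.7]
y = [1.0, 3.5, 2.5, 8.0, 3.0]
print(kendall_tau(x, y), spearman_rho(x, y))
```

Even on this tiny sample the two values differ, which is the phenomenon the talk quantifies.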


Kendall’s tau and Spearman’s rho: definition with copulas

Both measures depend only on the underlying (uniquely determined) copula A of (X, Y).

Kendall’s τ and Spearman’s ρ

It is well known (Nelsen, 2007) that, given the copula A of (X, Y), Kendall’s τ and Spearman’s ρ can be expressed as

\[ \tau(X,Y) = 4 \int_{[0,1]^2} A(x,y)\, d\mu_A(x,y) - 1 =: \tau(A) \]

\[ \rho(X,Y) = 12 \int_{[0,1]^2} xy\, d\mu_A(x,y) - 3 =: \rho(A), \]

where µA denotes the doubly stochastic measure corresponding to A, i.e.

\[ \mu_A([a,b] \times [c,d]) := A(b,d) - A(b,c) - A(a,d) + A(a,c). \]
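As a quick sanity check of these formulas, here is a worked example added for illustration with two standard copulas. For the minimum copula M(x, y) = min(x, y) the measure µM is the uniform distribution on the diagonal; for the independence copula Π(x, y) = xy (not used elsewhere in the talk) the measure µΠ is the Lebesgue measure on [0, 1]².

\[ \tau(M) = 4\int_0^1 M(x,x)\,dx - 1 = 4\int_0^1 x\,dx - 1 = 1, \qquad \rho(M) = 12\int_0^1 x^2\,dx - 3 = 1, \]

\[ \tau(\Pi) = 4\int_0^1\!\!\int_0^1 xy\,dx\,dy - 1 = 0, \qquad \rho(\Pi) = 12\int_0^1\!\!\int_0^1 xy\,dx\,dy - 3 = 0. \]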


Relationship between Kendall’s tau and Spearman’s rho

Kendall’s tau and Spearman’s rho are, without doubt, the two most famous nonparametric measures of association/concordance. But the values of τ and ρ are often quite different.

A very natural question is how much they can differ, i.e. if τ(X, Y) is known, which values may ρ(X, Y) assume, and vice versa.

Main objective: Determine Ω

\[ \Omega = \left\{ (\tau(X,Y), \rho(X,Y)) : X, Y \text{ continuous random variables} \right\} = \left\{ (\tau(A), \rho(A)) : A \in \mathcal{C} \right\}. \]

Main tool: a special class of copulas, usually referred to as shuffles of the minimum copula M, which is dense in the set C of all copulas w.r.t. d∞.


Shuffles of M: Definition

Definition:

A copula A is called a shuffle of M if there exists a λ-preserving, piecewise linear function h : [0, 1] → [0, 1] with slope ±1 such that the mass of A is concentrated on the graph of h.

Example:

[Figure: example of a shuffle of M on the unit square [0, 1] × [0, 1].]

For determining Ω it was sufficient to consider the set of all straight shuffles of M, which is dense in the set of all copulas. These are shuffles of M with positive slope, to which we will refer as CS+.
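To make the definition concrete, the following sketch builds the function h of a straight shuffle. The representation is an assumption chosen for illustration and does not come from the talk: a list of strip lengths (left to right, summing to 1) and a permutation `perm`, where `perm[i]` is the 0-based position that strip i takes after shuffling. The name `make_straight_shuffle` is hypothetical.

```python
def make_straight_shuffle(lengths, perm):
    """Return h for the straight shuffle that cuts [0, 1] into strips of the given
    lengths (left to right) and moves strip i to position perm[i] (0-based)."""
    k = len(lengths)
    assert abs(sum(lengths) - 1.0) < 1e-12, "strip lengths must sum to 1"
    assert sorted(perm) == list(range(k)), "perm must be a permutation of 0..k-1"
    # Left end points of the strips on the x-axis
    starts = [sum(lengths[:i]) for i in range(k)]
    # Left end points of the images: total length of all strips placed before strip i
    images = [sum(l for l, p in zip(lengths, perm) if p < perm[i]) for i in range(k)]

    def h(x):
        for a, l, b in zip(starts, lengths, images):
            if a <= x < a + l:
                return b + (x - a)  # slope +1 on every strip
        return images[-1] + lengths[-1]  # x == 1: right end of the last strip

    return h

# Two halves swapped: the left half is mapped onto [0.5, 1], the right half onto [0, 0.5]
h = make_straight_shuffle([0.5, 0.5], [1, 0])
print(h(0.25), h(0.75))
```

Because the images of the strips tile [0, 1] without overlap and every piece has slope +1, a function built this way preserves Lebesgue measure.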


Classical τ-ρ region Ω0: Daniels’ and Durbin & Stuart’s inequalities

The following well-known universal inequalities between τ and ρ go back to Daniels and to Durbin & Stuart, respectively:

Classical τ-ρ region Ω0:

\[ |3\tau - 2\rho| \le 1 \]

\[ \frac{(1+\tau)^2}{2} - 1 \;\le\; \rho \;\le\; 1 - \frac{(1-\tau)^2}{2} \]

Together, the inequalities yield what we refer to as the classical τ-ρ region Ω0.
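The two inequalities translate directly into a membership test for Ω0. The sketch below is illustrative only; the function name and the small numerical tolerance `tol` are assumptions, not part of the talk.

```python
def in_classical_region(tau, rho, tol=1e-12):
    """True if (tau, rho) satisfies both Daniels' and Durbin & Stuart's inequalities."""
    daniels = abs(3 * tau - 2 * rho) <= 1 + tol
    lower = (1 + tau) ** 2 / 2 - 1   # Durbin & Stuart, lower bound on rho
    upper = 1 - (1 - tau) ** 2 / 2   # Durbin & Stuart, upper bound on rho
    return daniels and lower - tol <= rho <= upper + tol

print(in_classical_region(0.0, 0.0))    # independence
print(in_classical_region(0.0, 0.6))    # violates Daniels' inequality
print(in_classical_region(-0.5, -0.9))  # violates the Durbin & Stuart lower bound
```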


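Classical τ-ρ region Ω0: Are the inequalities sharp?

[Figure: the classical τ-ρ region Ω0 with the parts where the inequalities are known to be sharp marked in red; horizontal axis τ, vertical axis ρ, both ranging from −1 to 1.]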

Daniels’ inequality is known to be sharp (red line).

The first part of the inequality by Durbin & Stuart is only known to be sharp at the (red) points

\[ p_n := \left( -1 + \frac{2}{n},\; -1 + \frac{2}{n^2} \right), \quad n \ge 2. \]

The second part is sharp at the (red) points −pn, n ≥ 2.
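That the points pn lie on the lower Durbin & Stuart bound can be checked directly by inserting τ = −1 + 2/n (a short verification added here):

\[ \frac{(1+\tau)^2}{2} - 1 = \frac{1}{2}\left(\frac{2}{n}\right)^2 - 1 = -1 + \frac{2}{n^2}, \]

which is exactly the second coordinate of pn.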


Exact τ-ρ region Ω: Intuition behind the conjecture

Prototypes

Natural conjecture: all shuffles Ah of M with

n − 1 stripes of length r ∈ (1/n, 1/(n − 1)) and

one stripe of length 1 − r(n − 1)

for some n ≥ 2 might also be extremal in the sense that (τ(Ah), ρ(Ah)) is a boundary point of Ω. (For an example, see the figure.)

[Figure: example of such a shuffle on the unit square [0, 1] × [0, 1].]

We call shuffles of this form prototypes.

One can calculate τ and ρ explicitly for all prototypes and then, based on these values, derive the function Φ, which fully determines the exact τ-ρ region Ω by

\[ \Omega = \left\{ (x, y) \in [-1,1]^2 : \Phi(x) \le y \le -\Phi(-x) \right\}. \]
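The explicit calculation for prototypes can be sketched as follows. The transcript does not preserve the figure, so the arrangement of the stripes is an assumption here: the stripes are taken to be placed in reversed order, so that any two points from different stripes are discordant. Under that assumption, the integral formulas for straight shuffles given later in the talk reduce to τ = −1 + 2 Σ lᵢ² and ρ = −1 + 2 Σ lᵢ³, where lᵢ are the stripe lengths; these closed forms are our own simplification. Function names are hypothetical.

```python
def prototype_lengths(n, r):
    """Stripe lengths of a prototype: n - 1 stripes of length r and one of length 1 - r(n - 1)."""
    assert n >= 2 and 1 / n < r < 1 / (n - 1)
    return [r] * (n - 1) + [1 - r * (n - 1)]

def tau_rho_reversed(lengths):
    """tau and rho of the straight shuffle that reverses the order of the stripes.
    ASSUMPTION: reversed order is our reading of the prototype figure."""
    tau = -1 + 2 * sum(l ** 2 for l in lengths)
    rho = -1 + 2 * sum(l ** 3 for l in lengths)
    return tau, rho

# Prototype with n = 3 and r = 0.4: stripes of length 0.4, 0.4 and 0.2
tau, rho = tau_rho_reversed(prototype_lengths(3, 0.4))
print(tau, rho)
```

With n equal stripes of length 1/n the same formulas give (−1 + 2/n, −1 + 2/n²), i.e. the points pn at which the Durbin & Stuart inequality is sharp.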


Determining the exact τ-ρ region Ω: consider shuffles of M

We only considered the lower bound of Ω; by symmetry we automatically get the upper bound.

As mentioned before, for determining Ω it was sufficient to consider straight shuffles of M.

Explicit formulas for (τ(Ah), ρ(Ah)) for straight shuffles of M are easy to calculate (for shuffles ):

\[ \tau(A_h) = 4 \int_{[0,1]} A_h(x, h(x))\, d\lambda(x) - 1 = 1 - 4 \int_{[0,1]^2} \mathbf{1}_{[0,x)}(y)\, \mathbf{1}_{(h(x),1]}(h(y))\, d\lambda_2(x,y) \]

\[ \rho(A_h) = 12 \int_{[0,1]} x\, h(x)\, d\lambda(x) - 3 = 1 - 12 \int_{[0,1]^2} \mathbf{1}_{[0,x)}(y)\, \mathbf{1}_{(h(x),1]}(h(y))\, (x - y)\, d\lambda_2(x,y) \]
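For a straight shuffle the two double integrals become finite sums over pairs of strips, which the sketch below evaluates. The representation is an assumption made for illustration: strip lengths from left to right plus a permutation `perm`, with `perm[i]` the 0-based position of strip i after shuffling. Within a strip h is increasing, so only pairs of strips whose order is inverted contribute; such a pair contributes the product of its lengths to the first integral and that product times the distance of the strip midpoints to the second.

```python
def tau_rho_straight_shuffle(lengths, perm):
    """tau and rho of the straight shuffle with the given strip lengths (left to
    right) whose strip i is moved to position perm[i] (0-based)."""
    k = len(lengths)
    starts = [sum(lengths[:i]) for i in range(k)]
    mids = [a + l / 2 for a, l in zip(starts, lengths)]
    area = 0.0    # integral of the indicator of {y < x, h(y) > h(x)}
    moment = 0.0  # the same integral, weighted with (x - y)
    for i in range(k):
        for j in range(i + 1, k):
            if perm[i] > perm[j]:  # strip i lies left of strip j but is mapped above it
                area += lengths[i] * lengths[j]
                moment += lengths[i] * lengths[j] * (mids[j] - mids[i])
    return 1 - 4 * area, 1 - 12 * moment

print(tau_rho_straight_shuffle([0.5, 0.5], [0, 1]))  # identity: the copula M itself
print(tau_rho_straight_shuffle([0.5, 0.5], [1, 0]))  # two halves swapped
```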


Function Φ fully determines the exact τ-ρ region Ω

The following function Φ fully determines the exact τ-ρ region Ω:

Definition of the function Φ:

Define Φn : [−1 + 2/n, 1] → [−1, 1] by

\[ \Phi_n(x) = -1 - \frac{4}{n^2} + \frac{3}{n} + \frac{3x}{n} - \frac{n-2}{\sqrt{2}\, n^2 \sqrt{n-1}}\,(n - 2 + nx)^{3/2} \]

and set Φ : [−1, 1] → [−1, 1],

\[ \Phi(x) = \begin{cases} -1 & \text{if } x = -1, \\ \Phi_n(x) & \text{if } x \in \left[ \frac{2-n}{n}, \frac{2-(n-1)}{n-1} \right] \text{ for some } n \ge 2. \end{cases} \]

Notice that Φ2(x) = −1/2 + 3x/2, i.e. on [0, 1] the function Φ coincides with Daniels’ linear bound.

For xn = (2 − n)/n and n ≥ 2 we have (xn, Φ(xn)) = pn, i.e. (xn, Φ(xn)) coincides with the points at which Durbin and Stuart’s inequality is known to be sharp.
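The definition of Φ can be evaluated numerically as in the following sketch. The function names, the tolerance, and the way the index n is located are illustrative assumptions: the condition x ∈ [(2 − n)/n, (2 − (n − 1))/(n − 1)] is equivalent to n − 1 ≤ 2/(1 + x) ≤ n, so n is obtained by rounding 2/(1 + x) up.

```python
import math

def phi_n(n, x):
    """Phi_n from the definition; the base of the 3/2 power is clipped at 0
    so that rounding errors at the left end point cannot make it negative."""
    base = max(n - 2 + n * x, 0.0)
    coeff = (n - 2) / (math.sqrt(2) * n**2 * math.sqrt(n - 1))
    return -1 - 4 / n**2 + 3 / n + 3 * x / n - coeff * base**1.5

def phi(x):
    """Piecewise function Phi on [-1, 1]."""
    if not -1 <= x <= 1:
        raise ValueError("x must lie in [-1, 1]")
    if x == -1:
        return -1.0
    # Smallest n >= 2 with 2/(1 + x) <= n (up to a small tolerance at the break points)
    n = max(2, math.ceil(2 / (1 + x) - 1e-12))
    return phi_n(n, x)

def in_exact_region(tau, rho, tol=1e-12):
    """True if Phi(tau) <= rho <= -Phi(-tau)."""
    return phi(tau) - tol <= rho <= -phi(-tau) + tol

print(phi(0.0), phi(-1 / 3), in_exact_region(0.0, 0.0))
```

At the break points xn either neighbouring piece may be used, since both take the value −1 + 2/n² there.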


Exact τ-ρ region Ω: graphical representation

[Figure: the exact τ-ρ region Ω; horizontal axis τ, vertical axis ρ, both ranging from −1 to 1.]


Exact τ-ρ region Ω: graphical representation

The function Φ (red) and some prototypes with their corresponding Kendall’s τ and Spearman’s ρ. The shaded region depicts the classical τ-ρ region Ω0; straight lines connecting the points pn (points at which Durbin and Stuart’s inequality is known to be sharp) are plotted in green.

[Figure: two panels with horizontal axis τ and vertical axis ρ. One shows the full range, τ and ρ from −1 to 1; the other zooms to τ from −0.6 to 0.0 and ρ from −1.0 to −0.5.]

Bottom: Copulas for which the inequality by Durbin & Stuart is sharp.
Top: Copulas for which τ and ρ lie on the boundary curve Φ (red) of the exact τ-ρ region and the inequality by Durbin & Stuart is not sharp.


Exact τ-ρ region Ω: proof structure

Define the compact set ΩΦ by

\[ \Omega_\Phi := \left\{ (x, y) \in [-1,1]^2 : \Phi(x) \le y \le -\Phi(-x) \right\}. \]

First step in the proof (induction on the number of stripes):

The exact τ -ρ region Ω fulfills Ω ⊆ ΩΦ.

The fact that Ω ⊆ ΩΦ holds is the main result since it improves theclassical inequality by Durbin and Stuart mentioned before.

Second step in the proof (homotopy argument):

The exact τ -ρ region Ω coincides with ΩΦ. In particular, Ω is not convex.
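The non-convexity can be seen from a single numerical example, added here for illustration and based on the definition of Φ from the earlier slide. The points p2 = (0, −1/2) and p3 = (−1/3, −7/9) lie in Ω, and their midpoint is (−1/6, −23/36). For n = 3 the definition of Φ gives

\[ \Phi_3(x) = -\frac{4}{9} + x - \frac{1}{18}(1+3x)^{3/2}, \qquad \Phi\!\left(-\tfrac{1}{6}\right) = \Phi_3\!\left(-\tfrac{1}{6}\right) = -\frac{11}{18} - \frac{1}{18}\cdot\frac{1}{2\sqrt{2}} \approx -0.6308 \;>\; -\frac{23}{36} \approx -0.6389, \]

so the midpoint lies below the lower boundary of ΩΦ = Ω and hence outside the region.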


Additional related result

The proof of the main theorem has the following nice byproduct:

Corollary:

For every point (x, y) ∈ Ω there is a shuffle of M, A ∈ S, such that we have (τ(A), ρ(A)) = (x, y).

Paper currently under revision; downloadable under http://arxiv.org/abs/1502.04620


Thank you for your attention.
