1
Nonparametric Statistical Methods
Svetlana StoyanchevLuke SchordineLi OuyangValencia JosephMinghui Lu Rachel Merrill Kleva CostaMichael JohnesJane Cerise
Statistics and Data Analysis
Chapter 14December 13, 2007
3
Why use nonparametric methods?
Lawyers’ income (http://www.nalp.org/)
Make very few assumptions about the data distribution:
• Ordinal scale
• Not all data is normally distributed
4
Inference on a single sample: Sign Test
Use the median μ as the measure of center instead of the mean (for skewed data such as incomes, the mean and the median can differ substantially):
s+ = the number of xi's that exceed μ0,  s- = n - s+
H0: μ = μ0 vs. H1: μ > μ0
Reject H0 if s+ is large (or s- is small). How large should s+ be in order to reject H0 at a given significance level α?
5
Inference on a single sample: Sign Test
Random sample X1, X2, ..., Xn from a continuous distribution with median μ.
Let Prob(Xi > μ0) = p, so Prob(Xi < μ0) = 1 - p. Then
H0: μ = μ0 vs. H1: μ > μ0
is equivalent to
H0: p = 1/2 vs. H1: p > 1/2,
and S+ ~ Bin(n, p), S- ~ Bin(n, 1 - p). (S+ and S- are random variables; under H0, p = 1/2.)
Apply the test of a binomial proportion from Chapter 9!
6
Inference on a single sample: Sign Test
Rejection criterion for H0. Let b(n, α) denote the upper α critical point of the Bin(n, 1/2) distribution, and let smin = min(s+, s-), smax = max(s+, s-), so that 0 ≤ smin ≤ smax ≤ n.
One-sided test: H0: μ = μ0 vs. H1: μ > μ0 (or μ < μ0). Reject H0 if s+ ≥ b(n, α) (or s- ≥ b(n, α)).
Two-sided test: H0: μ = μ0 vs. H1: μ ≠ μ0. Reject H0 if smax ≥ b(n, α/2), or equivalently if smin ≤ n - b(n, α/2).
7
Inference on a single sample: Sign Test
When n > 20, the distributions of S+ and S- can be approximated by a normal distribution with mean n/2 and variance n/4. Can use a Z-test with statistic
   z = (s+ - n/2 - 1/2) / sqrt(n/4)   (with a continuity correction of 1/2).
Reject H0 when z ≥ z_α (one-sided) or |z| ≥ z_(α/2) (two-sided).
8
Confidence interval for μ from the ordered data values x(1) ≤ x(2) ≤ ... ≤ x(n):
An interval of the form [x(b+1), x(n-b)] contains μ with a confidence level determined by the Bin(n, 1/2) distribution.
Compute a 95% CI for the temperature measurements and test
   H0: μ = 200 vs. H1: μ ≠ 200.
Because of discreteness we cannot find an exact 95% CI, so we take the closest achievable level.
Ordered data (n = 10): 198.0 199.0 200.5 200.8 201.3 202.2 202.5 203.4 203.7 206.3
The lower 1.1% critical point is 1 (from Table A.1, n = 10, p = 0.5); the upper 1.1% critical point is 9 (by symmetry).
So α/2 = 0.011 and the confidence level is 1 - 0.022 = 0.978.
97.8% CI = [x(2), x(9)] = [199.0, 203.7]
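As a supplement to the slides (the deck's own code is SAS), the binomial sign test is easy to reproduce in a few lines of Python; the function name `sign_test` and its return convention are our own choices, applied here to the temperature data:

```python
from math import comb

def sign_test(data, mu0):
    """Two-sided sign test for H0: median = mu0 (observations equal to mu0 are dropped)."""
    s_plus = sum(1 for x in data if x > mu0)
    s_minus = sum(1 for x in data if x < mu0)
    n = s_plus + s_minus
    s_max = max(s_plus, s_minus)
    # P(S >= s_max) under Bin(n, 1/2), doubled for the two-sided test
    tail = sum(comb(n, k) for k in range(s_max, n + 1)) / 2 ** n
    return s_plus, s_minus, min(1.0, 2 * tail)

temps = [198.0, 199.0, 200.5, 200.8, 201.3, 202.2, 202.5, 203.4, 203.7, 206.3]
print(sign_test(temps, 200))  # (8, 2, 0.109375)
```

With s+ = 8 of the 10 observations above 200, the exact two-sided p-value 0.109 agrees with the slide's conclusion that the 97.8% CI contains 200.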
10
14.1.2: Wilcoxon Signed Rank Test
Designed by Frank Wilcoxon (1892-1965) to improve on the Sign Test
Takes into account not only whether xi is greater or less than μ0, but also the magnitude of the difference di = xi - μ0.
11
Frank Wilcoxon: The Man Behind the Test
Born in Ireland, grew up in the Catskills in New York
Earned a B.S. at the Pennsylvania Military Academy, a master's at Rutgers, and a Ph.D. at Cornell, all in chemistry
Worked as a research scientist at several laboratories
Became interested in statistical methods after reading R.A. Fisher's Statistical Methods for Research Workers
In response to Fisher's Student's t-tests, he developed nonparametric tests for paired and unpaired data sets
Source: http://www.wikipedia.org
12
Wilcoxon Signed Rank Test
Wilcoxon's paired-sample test; it assumes symmetry about the median.
Assigns a rank ri to each difference di based on its absolute value |di|, with the smallest |di| receiving rank 1.
Take the sums of the ranks of the positive and negative deviations (W+ and W-, respectively). Writing Zi = 1 if the difference with rank i is positive and Zi = 0 otherwise,
   W+ = Σ_{i=1}^n i·Zi,
and
E(W+) = E(Σ i·Zi) = E(1·Z1 + 2·Z2 + ... + n·Zn)
      = 1·E(Z1) + 2·E(Z2) + ... + n·E(Zn)   [E(Z1) = E(Z2) = ... = E(Zn)]
      = (1 + 2 + 3 + ... + n)·E(Z1) = n(n + 1)/4 under H0, since E(Zi) = 1/2.
13
Wilcoxon Signed Rank Test
The actual test uses a normal approximation. Under H0,
   E(W+) = n(n + 1)/4,   Var(W+) = n(n + 1)(2n + 1)/24,
and the continuity-corrected statistic is
   z = (w+ - n(n + 1)/4 - 1/2) / sqrt(n(n + 1)(2n + 1)/24).
14
Wilcoxon Signed Rank Test
Use a two-tailed Z-test of H0: the median difference is 0.
Reject H0 if |z0| ≥ z_(α/2).
There are advantages:
• Uses the ranked differences, not just their signs
• Leads to increased power (relative to the sign test)
And disadvantages:
• Symmetry is assumed but may not be true
• Can lead to an increased Type I error when the symmetry assumption fails
15
Adapted from http://faculty.vassar.edu/lowry/ch12a.html; W = Σ Wi.

Subj.  XA  XB  di = XA-XB  |di|   ri    Wi
  1    78  78       0        0    --    --
  2    24  24       0        0    --    --
  3    64  62      +2        2     1    +1
  4    45  48      -3        3     2    -2
  5    64  68      -4        4   3.5  -3.5
  6    52  56      -4        4   3.5  -3.5
  7    30  25      +5        5     5    +5
  8    50  44      +6        6     6    +6
  9    64  56      +8        8     7    +7
 10    50  40     +10       10   8.5  +8.5
 11    78  68     +10       10   8.5  +8.5
 12    22  36     -14       14    10   -10
 13    84  68     +16       16    11   +11
 14    40  20     +20       20    12   +12
 15    90  58     +32       32    13   +13
 16    72  32     +40       40    14   +14

W = 67.0, N = 14 (the two zero differences are dropped)
16
Wilcoxon Signed Rank Test: Example
For the present example, with N = 14, W = 67, and σW = 31.86, the result is:
   z0 = (W - 0.5 - μW) / σW = (67 - 0.5 - 0) / 31.86 = 2.09
(Subtracting 0.5 is a continuity correction, due to the fact that W is greater than μW = 0.)
Since z0 > z_(α/2) = 1.96, reject the null hypothesis.
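The whole calculation above can be reproduced with a short Python sketch (our own helper, not the chapter's SAS code): midranks for tied |di|, the signed-rank sum W, and the continuity-corrected z:

```python
from math import sqrt

def signed_rank_z(xa, xb):
    """Wilcoxon signed rank test via the signed-rank sum W, normal approximation."""
    d = [a - b for a, b in zip(xa, xb) if a != b]   # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                    # midranks for tied |d|
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j + 2) / 2       # average of ranks i+1 .. j+1
        i = j + 1
    w = sum(r if di > 0 else -r for r, di in zip(ranks, d))
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 6)     # SD of W under H0
    z = (w - 0.5) / sigma                           # continuity correction (w > 0)
    return w, z

xa = [78, 24, 64, 45, 64, 52, 30, 50, 64, 50, 78, 22, 84, 40, 90, 72]
xb = [78, 24, 62, 48, 68, 56, 25, 44, 56, 40, 68, 36, 68, 20, 58, 32]
w, z = signed_rank_z(xa, xb)
print(w, round(z, 2))  # 67.0 2.09
```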
17
Wilcoxon Signed Rank Test: Confidence Interval
To get a 95% CI for the median:
Take the N = n(n + 1)/2 pairwise (Walsh) averages of the data: X̄ij = (xi + xj)/2 for 1 ≤ i ≤ j ≤ n.
Order these Walsh averages.
A (1 - α)-level CI runs between order statistics of the Walsh averages: [X̄(w+1), X̄(N-w)], where w is the lower α/2 critical point of the signed rank distribution.
18
14.2 Inferences for Two Independent Samples
By Li Ouyang
19
Problem: Is one population larger than another population?
How to solve? Two equivalent nonparametric tests: the Wilcoxon rank sum test and the Mann-Whitney U-test.

14.2.1 Wilcoxon-Mann-Whitney Test
1st: the Wilcoxon rank sum test.
Assumption: no ties in the two samples x1, x2, ..., x_n1 and y1, y2, ..., y_n2.
1. Rank all N = n1 + n2 observations in ascending order.
2. Denote by w1 the sum of the ranks of the x's and by w2 the sum of the ranks of the y's. The ranks range over the integers 1, 2, ..., N, so we have
   w1 + w2 = 1 + 2 + ... + N = N(N + 1)/2.
3. Reject H0 if w1 is large (or if w2 is small).
Note: at significance level α, when n1 ≠ n2 the null distributions of W1 and W2 differ.
20
2nd: the Mann-Whitney U-test.
1. Compare each xi with each yj:
   u1 = number of pairs with xi > yj,  u2 = number of pairs with xi < yj,  and u1 + u2 = n1n2.
2. Reject H0 if u1 is large (or u2 is small).
The two test statistics are related as follows:
   u1 = w1 - n1(n1 + 1)/2,   u2 = w2 - n2(n2 + 1)/2.
Advantage of the Mann-Whitney form: U1 and U2 have the same null distribution, with range [0, n1n2].
P-value = P{U ≥ u1} = P{U ≤ u2}. At significance level α, we reject H0 if the P-value ≤ α, i.e., if u1 ≥ u_(n1,n2,α).
Denote by u_(n1,n2,α) the upper α critical point.
21
For large n1 and n2, the null distribution of U is approximately normal with
   E(U) = n1n2/2,   Var(U) = n1n2(N + 1)/12.
Z-test (large sample) test statistic:
   z = (u1 - n1n2/2 - 1/2) / sqrt(n1n2(N + 1)/12).
We reject H0 at significance level α if z ≥ z_α, or equivalently if
   u1 ≥ u_(n1,n2,α) ≈ n1n2/2 + 1/2 + z_α · sqrt(n1n2(N + 1)/12).
Two-sided test: use the statistics umax = max(u1, u2) or umin = min(u1, u2);
   P-value = 2P{U ≥ umax} = 2P{U ≤ umin}.
22
Example: Failure Times of Capacitors (Wilcoxon-Mann-Whitney Test)
18 capacitors: 8 in the control group and 10 in the thermally stressed group.
Perform the Wilcoxon-Mann-Whitney test to determine if thermal stress significantly reduces the time to failure of capacitors. α = 0.05.
n1 = 8, n2 = 10. The rank sums are
w1 = 4+8+10+11+13+14+17+18 = 95
w2 = 1+2+3+5+6+7+9+12+15+16 = 76
Times to Failure for Two Capacitor Groups, and Ranks of Times to Failure

Control Group    Stressed Group    Control Ranks    Stressed Ranks
 5.2  17.1        1.1   7.2         4  13            1   7
 8.5  17.9        2.3   9.1         8  14            2   9
 9.8  23.7        3.2  15.2        10  17            3  12
12.3  29.8        6.3  18.3        11  18            5  15
                  7.0  21.1                          6  16

u1 = w1 - n1(n1 + 1)/2 = 95 - (8)(9)/2 = 59
u2 = w2 - n2(n2 + 1)/2 = 76 - (10)(11)/2 = 21
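The rank sums and U statistics for these data are easy to verify with a small Python helper (our own sketch, assuming no ties, which holds for the capacitor data):

```python
def rank_sums(x, y):
    """Wilcoxon rank sums w1, w2 and Mann-Whitney statistics u1, u2 (no ties assumed)."""
    rank = {v: i + 1 for i, v in enumerate(sorted(x + y))}   # ranks 1..N
    n1, n2 = len(x), len(y)
    w1 = sum(rank[v] for v in x)
    w2 = sum(rank[v] for v in y)
    return w1, w2, w1 - n1 * (n1 + 1) // 2, w2 - n2 * (n2 + 1) // 2

control = [5.2, 8.5, 9.8, 12.3, 17.1, 17.9, 23.7, 29.8]
stressed = [1.1, 2.3, 3.2, 6.3, 7.0, 7.2, 9.1, 15.2, 18.3, 21.1]
print(rank_sums(control, stressed))  # (95, 76, 59, 21)
```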
23
Let F1 be the c.d.f. of the control group and F2 the c.d.f. of the stressed group:
   H0: F1 = F2 vs. H1: F1 < F2.
Check that u1 + u2 = n1n2 = 80. From Table A.11, P-value = 0.051.
Large-sample Z-test:
   z = (u1 - n1n2/2 - 1/2) / sqrt(n1n2(N + 1)/12)
     = (59 - (8)(10)/2 - 1/2) / sqrt((8)(10)(19)/12) = 1.643
Conclusion: this yields P-value = 1 - Φ(1.643) = 0.0502.
Table A.11 Upper-Tail Probabilities of the Null Distribution of the Wilcoxon-
Mann-Whitney Statistic
n1   n2   w1   u1   P(W ≥ w1) = P(U ≥ u1)
8 8 84 48 0.052
8 87 51 0.025
8 90 54 0.010
8 92 56 0.005
9 89 53 0.057
9 93 57 0.023
9 96 60 0.010
9 98 62 0.006
10 95 59 0.051
10 98 62 0.027
10 102 66 0.010
10 104 68 0.006
24
Null distribution of the Wilcoxon-Mann-Whitney Test Statistic
Two r.v.'s, X and Y, with c.d.f.'s F1 and F2, respectively.
Assumption: under H0, all N = n1 + n2 observations come from the common distribution F1 = F2. Therefore all possible orderings of these observations, with n1 coming from F1 and n2 coming from F2, are equally likely. There are
   N! / (n1! n2!)
such orderings in total. For example, with n1 = 2, n2 = 3 there are 5!/(2!·3!) = 10.

All possible orderings for n1 = 2, n2 = 3:

Ranks: 1 2 3 4 5   w1  u1      Ranks: 1 2 3 4 5   w1  u1
       x x y y y    3   0             y x y x y    6   3
       x y x y y    4   1             y x y y x    7   4
       x y y x y    5   2             y y x x y    7   4
       x y y y x    6   3             y y x y x    8   5
       y x x x y    5   2             y y y x x    9   6

Null distribution of W1 and U1 (n1 = 2, n2 = 3):

w1  u1  P(W1 = w1) = P(U1 = u1)
 3   0   0.1
 4   1   0.1
 5   2   0.2
 6   3   0.2
 7   4   0.2
 8   5   0.1
 9   6   0.1
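This enumeration argument can be carried out mechanically; the sketch below (our own, not from the text) tabulates the null distribution of W1 by listing every equally likely placement of the x-ranks:

```python
from itertools import combinations

def null_dist_w1(n1, n2):
    """Null distribution of the rank sum W1: enumerate all C(N, n1) equally
    likely placements of the x-ranks among the integers 1..N."""
    N = n1 + n2
    counts = {}
    for x_ranks in combinations(range(1, N + 1), n1):
        w1 = sum(x_ranks)
        counts[w1] = counts.get(w1, 0) + 1
    total = sum(counts.values())          # equals N! / (n1! n2!)
    return {w: c / total for w, c in sorted(counts.items())}

print(null_dist_w1(2, 3))
# {3: 0.1, 4: 0.1, 5: 0.2, 6: 0.2, 7: 0.2, 8: 0.1, 9: 0.1}
```

The output reproduces the table above exactly.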
25
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
Assumption: F1 and F2 belong to a location parameter family with location parameters θ1 and θ2 (θ1 and θ2 are the respective population medians): F1(x) = F(x - θ1) and F2(y) = F(y - θ2), where F is a common unknown distribution function.
How to calculate a CI for θ1 - θ2?
Step 1: Calculate all N = n1n2 pairwise differences dij = xi - yj (1 ≤ i ≤ n1, 1 ≤ j ≤ n2) and rank them: d(1) ≤ d(2) ≤ ... ≤ d(N).
Step 2: Obtain the lower α/2 critical point u = u_(n1,n2,1-α/2).
The 100(1 - α)% CI for θ1 - θ2 is given by d(u+1) ≤ θ1 - θ2 ≤ d(N-u).
26
Example: Find a 95% CI for the difference between the median failure times of the control group and the thermally stressed group of capacitors.
n1 = 8, n2 = 10, N = n1n2 = 80.
The lower 2.2% critical point of the distribution of U is 17; the upper 2.2% critical point is 80 - 17 = 63.
With α/2 = 0.022, 1 - α = 1 - 0.044 = 0.956.
Therefore [d(18), d(63)] = [-1.1, 14.7] is a 95.6% CI.
Differences dij = xi - yj between the two groups

       yj:  1.1   2.3   3.2   6.3   7.0   7.2   9.1  15.2  18.3  21.1
xi
 5.2        4.1   2.9   2.0  -1.1  -1.8  -2.0  -3.9 -10.0 -13.1 -15.9
 8.5        7.4   6.2   5.3   2.2   1.5   1.3  -0.6  -6.7  -9.8 -12.6
 9.8        8.7   7.5   6.6   3.5   2.8   2.6   0.7  -5.4  -8.5 -11.3
12.3       11.2  10.0   9.1   6.0   5.3   5.1   3.2  -2.9  -6.0  -8.8
17.1       16.0  14.8  13.9  10.8  10.1   9.9   8.0   1.9  -1.2  -4.0
17.9       16.8  15.6  14.7  11.6  10.9  10.7   8.8   2.7  -0.4  -3.2
23.7       22.6  21.4  20.5  17.4  16.7  16.5  14.6   8.5   5.4   2.6
29.8       28.7  27.5  26.6  23.5  22.8  22.6  20.7  14.6  11.5   8.7
Table A.11:
n1  n2  u1 (80 - u1)        P(U ≥ u1)
 8  10  59 (80 - 59 = 21)   0.051
    10  62 (80 - 62 = 18)   0.027
    10  63 (80 - 63 = 17)   0.022
    10  66 (80 - 66 = 14)   0.010
    10  68 (80 - 68 = 12)   0.006
27
Example Using SAS
Two Groups A & B Both groups are exposed to a chemical that encourages tumor growth Group B has been treated with a drug to prevent tumor formation
The masses (in grams) of tumors in each group are
Group A: 3.1 2.2 1.7 2.7 2.5
Group B: 0.0 0.0 1.0 2.3
We want to see if there are any differences in tumor mass between groups A and B.
Thus we will use the Wilcoxon test: put all the data in increasing order and calculate the ranks.
Mass: 0.0 0.0 1.0 1.7 2.2 2.3 2.5 2.7 3.1
Group: B B B A A B A A A
Rank: 1.5 1.5 3 4 5 6 7 8 9
28
SAS Program
data Tumor;
  input Group $ Mass @@;
  datalines;
A 3.1 A 2.2 A 1.7 A 2.7 A 2.5
B 0.0 B 0.0 B 1.0 B 2.3
;
proc npar1way data=Tumor wilcoxon;
  title "Non Parametric Test to Compare Tumor Masses";
  class Group;
  var Mass;
  exact wilcoxon;
run;

proc univariate data=Tumor normal plot;
  title "More Descriptive Statistics";
  class Group;
  var Mass;
run;
29
The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable Mass Classified by Variable Group

Group  N  Sum of Scores  Expected Under H0  Std Dev Under H0  Mean Score
A      5      33.0             25.0             4.065437         6.60
B      4      12.0             20.0             4.065437         3.00

Wilcoxon Two-Sample Test

Statistic                          12.0000
Normal Approximation
  Z                                -1.8448
  One-Sided Pr < Z                  0.0325
  Two-Sided Pr > |Z|                0.0651
t Approximation
  One-Sided Pr < Z                  0.0511
  Two-Sided Pr > |Z|                0.1023
Exact Test
  One-Sided Pr <= S                 0.0317
  Two-Sided Pr >= |S - Mean|        0.0635

Z includes a continuity correction of 0.5.

Kruskal-Wallis Test
Chi-Square        3.8723
DF                1
Pr > Chi-Square   0.0491
30
The Univariate Procedure
Tests for Location: Mu0=0
Tests Statistic p Value
Student’s t T 10.3479 Pr > |t| 0.0005
Sign M 2.5 Pr >= |M| 0.0625
Signed Rank S 7.5 Pr >= |S| 0.0625
32
INFERENCES FOR SEVERAL INDEPENDENT SAMPLES
- The Kruskal-Wallis test is a generalization of the Wilcoxon-Mann-Whitney test for a ≥ 2 independent samples
- It is also a nonparametric alternative to the ANOVA F-test for a one-way layout
33
The steps of the test:
1) First rank all N values from smallest to largest, assigning tied values the average of their ranks.
2) Calculate the rank sums ri = Σj rij and the averages ṝi = ri/ni, i = 1, 2, ..., a. (The average of all N ranks is (N + 1)/2.)
3) Calculate the test statistic
   kw = [12 / (N(N + 1))] Σ_{i=1}^a ni (ṝi - (N + 1)/2)².
4) Reject H0 for large values of kw (if kw > χ²_(a-1, α)).
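These steps can be sketched in Python (our own helper, not the chapter's SAS). Applied to the tumor-mass data from the earlier SAS example, it reproduces the tie-corrected chi-square of 3.8723 reported by PROC NPAR1WAY:

```python
def kruskal_wallis(samples):
    """Kruskal-Wallis statistic with midranks for ties and the usual tie correction."""
    pooled = sorted(v for s in samples for v in s)
    N = len(pooled)
    first, count = {}, {}
    for i, v in enumerate(pooled):
        first.setdefault(v, i)
        count[v] = count.get(v, 0) + 1
    # midrank of a value = average of the 1-based positions it occupies
    rank = {v: first[v] + (count[v] + 1) / 2 for v in count}
    kw = 12 / (N * (N + 1)) * sum(
        len(s) * (sum(rank[v] for v in s) / len(s) - (N + 1) / 2) ** 2
        for s in samples)
    ties = 1 - sum(t ** 3 - t for t in count.values()) / (N ** 3 - N)
    return kw / ties

a = [3.1, 2.2, 1.7, 2.7, 2.5]   # tumor masses, group A
b = [0.0, 0.0, 1.0, 2.3]        # tumor masses, group B
print(round(kruskal_wallis([a, b]), 4))  # 3.8723
```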
34
The Pedagogy Problem
Consider Example 14.9 on page 581 of the text, in which four methods of teaching the concept of percentage to sixth graders are compared. There are 28 classes, 7 using each method: the Case Method, the Formula Method, the Equation Method, and the Unitary Analysis Method.
35
DATA Test_Score;
  INPUT Method $ Score @@;
DATALINES;
C 14.59 C 23.44 C 25.43 C 18.15 C 20.82 C 14.06 C 14.26
F 20.27 F 26.84 F 14.71 F 22.34 F 19.49 F 24.92 F 20.20
E 27.82 E 24.92 E 28.68 E 23.32 E 32.85 E 33.90 E 23.42
U 33.16 U 26.93 U 30.43 U 36.43 U 37.04 U 29.76 U 33.88
;
PROC NPAR1WAY DATA=Test_Score WILCOXON;
  CLASS Method;
  VAR Score;
  *EXACT WILCOXON;
RUN;
The Program
36
A Note About the ProgramYou might have noticed the asterisk in the program line:
*EXACT WILCOXON
The asterisk turns the line into a comment. Without it, SAS attempts to find an exact p-value for the test, which can take a very long time; were it not for the run time, this command would be highly recommended. We'll settle for a quicker approximation.
37
The Output The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable Score Classified by Variable Method
Sum of Expected Std Dev Mean Method N Scores Under H0 Under H0 Score C 7 49.00 101.50 18.845498 7.000000 F 7 66.50 101.50 18.845498 9.500000 E 7 125.50 101.50 18.845498 17.928571 U 7 165.00 101.50 18.845498 23.571429
Average scores were used for ties.
Kruskal-Wallis Test
Chi-Square 18.1390 DF 3 Pr > Chi-Square 0.0004
38
A Note About the Output
We see that the value of kw is 18.1390, a value large enough to yield an approximate p-value of 0.0004... an extremely small value. At a level of significance of 5%, or even 1%, there is a strong suggestion that the methods are not equally effective, and that the Unitary Analysis Method seems to be the best choice.
39
Use this to check for differences between treatment groups.
Test statistic: ṝi - ṝj (the difference in rank averages). For large ni's, R̄i - R̄j is approximately normally distributed, so one may use
   zij = (ṝi - ṝj) / sqrt((N(N + 1)/12)(1/ni + 1/nj)).
Treatments i and j are declared different if |zij| > q_(a,∞,α).
40
INFERENCES FOR SEVERAL MATCHED SAMPLES
The Friedman test is a generalization of the sign test for a ≥ 2 matched samples.
It is also a nonparametric alternative to the ANOVA F-test for a randomized block design.
Since it is used for a block design, rankings are done separately within each block.
The steps for the test: rank the observations from the a treatments separately within each block, assigning tied values, where needed, the average of their ranks.
41
14.4.1 Friedman Test
Example 14.11
Ryan and Joiner give data on the percentage drip loss in meat loaves. The goal was to compare the eight oven positions, which might differ due to temperature variations. Three batches of eight loaves were baked. The loaves from each batch were randomly placed in the eight positions.
Analyze the data using the Friedman test.
Here the oven positions are treatments and batches are blocks.
42
14.4.1 Friedman Test
Example 14.11, SAS

data meatloaf;
  input ovenbatch ovenposition driploss @@;
  datalines;
1 1 7.33 1 2 3.22 1 3 3.28 1 4 6.44
1 5 3.83 1 6 3.28 1 7 5.06 1 8 4.44
2 1 8.11 2 2 3.72 2 3 5.11 2 4 5.78
2 5 6.50 2 6 5.11 2 7 5.11 2 8 4.28
3 1 8.06 3 2 4.28 3 3 4.56 3 4 8.61
3 5 7.72 3 6 5.56 3 7 7.83 3 8 6.33
;
proc rank data=meatloaf out=rankings;
  by ovenbatch;
  var driploss;
  ranks drip;
run;
proc print data=rankings; run;
proc means data=rankings sum;
  class ovenposition;
  var drip;
run;
proc freq data=rankings;
  tables ovenbatch*ovenposition*driploss / cmh2;
run;
proc freq data=meatloaf;
  tables ovenbatch*ovenposition*driploss / cmh2 scores=rank;
run;
The Friedman test is identical to the ANOVA CMH statistic when the analysis uses rank scores (SCORES=RANK)
43
14.4.1 Friedman Test
Example 14.11, SAS results
Obs ovenbatch ovenposition driploss drip
1 1 1 7.33 8.0
2 1 2 3.22 1.0
3 1 3 3.28 2.5
4 1 4 6.44 7.0
5 1 5 3.83 4.0
6 1 6 3.28 2.5
7 1 7 5.06 6.0
8 1 8 4.44 5.0
9 2 1 8.11 8.0
10 2 2 3.72 1.0
11 2 3 5.11 4.0
12 2 4 5.78 6.0
13 2 5 6.50 7.0
14 2 6 5.11 4.0
15 2 7 5.11 4.0
16 2 8 4.28 2.0
17 3 1 8.06 7.0
18 3 2 4.28 1.0
19 3 3 4.56 2.0
20 3 4 8.61 8.0
21 3 5 7.72 5.0
22 3 6 5.56 3.0
23 3 7 7.83 6.0
24 3 8 6.33 4.0
Analysis Variable: drip (Rank for Variable driploss)

ovenposition  N Obs  Sum
1 3 23.0000000
2 3 3.0000000
3 3 8.5000000
4 3 21.0000000
5 3 16.0000000
6 3 9.5000000
7 3 16.0000000
8 3 11.0000000
Summary Statistics for ovenposition by drip
Controlling for ovenbatch
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic Alternative Hypothesis DF Value Prob
1 Nonzero Correlation 1 0.1488 0.6997
2 Row Mean Scores Differ 7 17.9393 0.0122
Total Sample Size = 24
44
Calculate the Friedman statistic:
   fr = [12 / (ba(a + 1))] Σ_{i=1}^a ri² - 3b(a + 1).
Reject H0 for large values of fr. The null distribution of fr can be approximated by the chi-square distribution with a - 1 degrees of freedom; thus reject H0 if fr > χ²_(a-1, α).
Treatments can then be compared pairwise, as in the Kruskal-Wallis analysis, using the differences |ri - rj|.
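A minimal Python sketch of the Friedman statistic (our own helper), with within-block midranks and the standard tie correction, reproduces the CMH row-mean-scores value 17.9393 from the SAS output for Example 14.11:

```python
def friedman(blocks):
    """Tie-corrected Friedman statistic for b blocks (rows) x a treatments (cols)."""
    b, a = len(blocks), len(blocks[0])
    rank_sums = [0.0] * a
    tie_term = 0
    for block in blocks:
        order = sorted(range(a), key=lambda j: block[j])
        i = 0
        while i < a:                         # assign midranks within the block
            j = i
            while j + 1 < a and block[order[j + 1]] == block[order[i]]:
                j += 1
            tie_term += (j - i + 1) ** 3 - (j - i + 1)
            for k in range(i, j + 1):
                rank_sums[order[k]] += (i + j + 2) / 2   # avg of ranks i+1..j+1
            i = j + 1
    fr = 12 / (b * a * (a + 1)) * sum(r * r for r in rank_sums) - 3 * b * (a + 1)
    return fr / (1 - tie_term / (b * a * (a * a - 1)))   # tie correction

batches = [
    [7.33, 3.22, 3.28, 6.44, 3.83, 3.28, 5.06, 4.44],
    [8.11, 3.72, 5.11, 5.78, 6.50, 5.11, 5.11, 4.28],
    [8.06, 4.28, 4.56, 8.61, 7.72, 5.56, 7.83, 6.33],
]
print(round(friedman(batches), 4))  # 17.9393
```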
46
Rank Correlation Methods
The Pearson correlation coefficient ρ measures only the degree of linear association between random variables (and its usual inference assumes normality); it cannot capture nonlinear association.
Spearman's rank correlation coefficient ρs and Kendall's rank correlation coefficient τ measure the degree of monotone (increasing or decreasing) association between two variables.
Extreme correlation (1 or -1) does not imply a cause-and-effect relationship.
Zero correlation does not imply independence. A "strong" correlation is not necessarily statistically significant, and vice versa.
47
Researchers at the European Centre for Road Safety Testing are trying to find out how the age of cars affects their braking capability. They test a group of ten cars of differing ages and find out the minimum stopping distances that the cars can achieve. The results are set out in the table below:
Car   Age (months) Xi   Min. Stopping at 40 kph (metres) Yi   Age Rank (ui)   Stopping Rank (vi)   Rank Difference (di = ui - vi)
A 9 28.4 1 1 0
B 15 29.3 2 2 0
C 24 37.6 3 7 -4
D 30 36.2 4 4.5 -0.5
E 38 36.5 5 6 -1
F 46 35.3 6 3 3
G 53 36.2 7 4.5 2.5
H 60 44.1 8 8 0
I 64 44.8 9 9 0
J 76 47.2 10 10 0
Σ di² = 32.5
14.5.1 Spearman’s Rank Correlation Coefficient
48
14.5.1 Spearman's Rank Correlation Coefficient
• Ho: X and Y are independent => ρs = 0
• Ha: X and Y are positively (monotonically) associated <=> ρs > 0
   rs = 1 - 6 Σ_{i=1}^n di² / (n(n² - 1)) = 1 - (6)(32.5) / ((10)(99)) = 0.803
For large samples (n ≥ 10), rs ~ Normal(0, 1/(n - 1)) under Ho, so
   z = rs · sqrt(n - 1) = 0.803 · sqrt(9) = 2.409,  P-value = 0.0081
Since -1 < rs < 1, rs = 0.803 indicates a strong positive association between car age and minimum stopping distance; in other words, the older the car, the longer the distance we could expect it to take to stop.
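The rank-difference formula above is straightforward to code; this Python sketch (our own, using midranks for the one tied stopping distance, as the slide's table does) reproduces rs = 0.803 for the braking data:

```python
def spearman_rs(x, y):
    """Spearman's rs via rs = 1 - 6*sum(d_i^2) / (n(n^2 - 1)), with midranks
    for ties (the formula is exact only in the tie-free case)."""
    def midranks(v):
        sv = sorted(v)
        # midrank = average of the first and last 1-based positions of the value
        return [(sv.index(a) + 1 + len(sv) - sv[::-1].index(a)) / 2 for a in v]
    u, w = midranks(x), midranks(y)
    n = len(x)
    d2 = sum((ui - wi) ** 2 for ui, wi in zip(u, w))
    return 1 - 6 * d2 / (n * (n * n - 1))

age = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
stop = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]
print(round(spearman_rs(age, stop), 3))  # 0.803
```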
49
Car   Age (months) Xi   Min. Stopping at 40 kph (metres) Yi   Concordant Pairs (Nci)   Discordant Pairs (Ndi)   Tie Pairs (Nti)
A 9 28.4 9 0 0
B 15 29.3 8 0 0
C 24 37.6 3 4 0
D 30 36.2 4 1 1
E 38 36.5 3 2 0
F 46 35.3 4 0 0
G 53 36.2 3 0 0
H 60 44.1 2 0 0
I 64 44.8 1 0 0
J 76 47.2 0 0 0
Nc=37 Nd=7 Nt=1
Nci=#{j>i: xj>xi and yj>yi}Ndi=#{j>i: xj>xi and yj<yi}Nti=#{j>i: xj=xi or yj=yi}
14.5.2 Kendall’s Rank Correlation Coefficient
   τ̂ = (Nc - Nd) / N,  where N = Nc + Nd + Nt
50
14.5.2 Kendall's Rank Correlation Coefficient
• Ho: X and Y are independent => τ = 0
• Ha: X and Y are positively associated <=> τ > 0
Tie corrections: Tx = Σj gj(gj - 1)/2 = 0 and Ty = Σj hj(hj - 1)/2 = (2)(2 - 1)/2 = 1, where gj and hj are the sizes of the tied groups among the x's and y's, respectively.
   τ̂ = (Nc - Nd) / sqrt((N - Tx)(N - Ty)) = (37 - 7) / sqrt((45 - 0)(45 - 1)) = 0.67
For large samples (n ≥ 10), τ̂ ~ Normal(0, 2(2n + 5) / (9n(n - 1))) under Ho, so
   z = τ̂ · sqrt(9n(n - 1) / (2(2n + 5))) = 0.67 · sqrt((9)(10)(9) / ((2)(25))) = 2.697
P-value = 0.00355
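Counting concordant, discordant, and tied pairs is easily automated; this Python sketch (our own helper) computes the tie-adjusted τ̂ (SAS's "Kendall tau-b") for the braking data:

```python
from math import sqrt
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant/discordant pair counts with tie adjustment."""
    nc = nd = tx = ty = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            tx += 1                      # pair tied on x
        elif yi == yj:
            ty += 1                      # pair tied on y
        elif (xj - xi) * (yj - yi) > 0:
            nc += 1                      # concordant pair
        else:
            nd += 1                      # discordant pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (nc - nd) / sqrt((n_pairs - tx) * (n_pairs - ty))

age = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
stop = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]
print(round(kendall_tau_b(age, stop), 4))  # 0.6742
```

The result 0.6742 matches the PROC CORR KENDALL output (0.67420) shown later in the deck.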
51
Kendall τ and Spearman ρs imply different interpretations: While Spearman ρs can be thought of as the regular Pearson ρ but computed just from ranks of variables, Kendall τ rather represents a probability.
Spearman’s rank correlation coefficient is related to Kendall’s coefficient of concordance, by rs=2w-1 when a=2
A piece of SAS code: PROC CORR DATA=CAR SPEARMAN KENDALL;
will generate both correlation coefficients, just a click away!
52
14.5.3 Kendall’s Coefficient of Concordance
This measures the degree to which many judges agree on the ranking of several subjects. Suppose three employers rank six candidates for a job, giving the following data:

Candidate    a   b   c   d   e   f
----------------------------------
Judge A      1   6   3   2   4   5
Judge B      1   5   6   4   2   3
Judge C      6   3   2   5   4   1
----------------------------------
Rank Sum     8  14  11  11  10   9

• Ho: ranks are assigned at random by the judges (the judges are in disagreement)
• Ha: ranks are not assigned at random (the judges are in agreement)
53
14.5.3 Kendall's Coefficient of Concordance
Notation: a = number of treatments (candidates), b = number of blocks (judges), ri = sum of ranks for treatment i, fr = Friedman statistic.
   w = agreement / maximum possible agreement = fr / (b(a - 1))
   fr = [12 / (ba(a + 1))] Σ ri² - 3b(a + 1)
      = [12 / ((3)(6)(7))] [8² + 14² + 11² + 11² + 10² + 9²] - (3)(3)(7)
      = 65.05 - 63 = 2.05
   w = 2.05 / ((3)(6 - 1)) = 0.1367
0 ≤ w ≤ 1, with small values indicating disagreement and large values indicating agreement.
Since fr = 2.05 < χ²_(5, 0.05) = 11.07, we cannot reject the null hypothesis: the employers give different (discordant) rankings to the same candidates.
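The arithmetic above is a one-liner once the rank table is in hand; this Python sketch (our own helper) computes fr and w for the three judges. Note it returns fr = 2.0476 (matching the CMH statistic in the SAS output later in the deck) and w = 0.1365; the slide's 0.1367 comes from dividing the rounded value 2.05 by 15:

```python
def kendall_w(rank_table):
    """Kendall's coefficient of concordance W = fr / (b(a-1)), where fr is the
    Friedman statistic computed from within-judge ranks (no ties here)."""
    b = len(rank_table)          # judges (blocks)
    a = len(rank_table[0])       # candidates (treatments)
    r = [sum(judge[i] for judge in rank_table) for i in range(a)]
    fr = 12 / (b * a * (a + 1)) * sum(ri * ri for ri in r) - 3 * b * (a + 1)
    return fr, fr / (b * (a - 1))

judges = [
    [1, 6, 3, 2, 4, 5],   # Judge A
    [1, 5, 6, 4, 2, 3],   # Judge B
    [6, 3, 2, 5, 4, 1],   # Judge C
]
fr, w = kendall_w(judges)
print(round(fr, 4), round(w, 4))  # 2.0476 0.1365
```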
54
14.5 Rank Correlation Methods
Examples 14.12 and 14.13
Data are given on the yearly alcohol consumption from wine in liters per person
and yearly heart disease deaths per 100,000 people for 19 countries.
Test if there is an association between these two variables using Spearman’s rank correlation coefficient.
Test if there is an association between these two variables using Kendall’s rank correlation coefficient (Kendall’s tau).
55
14.5 Rank Correlation Methods
Examples 14.12 and 14.13 in SAS

data wineheart;
input country $ alcohol deaths @@;
datalines;
australia 2.5 211 austria 3.9 167 belgium 2.9 131
canada 2.4 191 denmark 2.9 220 finland 0.8 297
france 9.1 71 iceland 0.8 211 ireland 0.7 300
italy 7.9 107 netherlands 1.8 167 newzealand 1.9 266
norway 0.8 227 spain 6.5 86 sweden 1.6 207
switzerland 5.8 115 uk 1.3 285 us 1.2 199
wgermany 2.7 172
;
proc corr data=wineheart spearman;
run;
proc corr data=wineheart kendall;
run;
2 Variables: alcohol deaths
Simple Statistics
Variable N Mean Std Dev Median Minimum Maximum
alcohol 19 3.02632 2.50972 2.40000 0.70000 9.10000
deaths 19 191.05263 68.39629 199.00000 71.00000 300.00000
Spearman Correlation Coefficients, N = 19 (Prob > |r| under H0: Rho = 0)

          alcohol             deaths
alcohol    1.00000           -0.82886 (<.0001)
deaths    -0.82886 (<.0001)   1.00000

Kendall Tau b Correlation Coefficients, N = 19 (Prob > |r| under H0: Rho = 0)

          alcohol             deaths
alcohol    1.00000           -0.69644 (<.0001)
deaths    -0.69644 (<.0001)   1.00000
56
14.5 Rank Correlation Methods
Example
data brakestats;
input car $ age stoppingdistance @@;
datalines;
a 9 28.4 b 15 29.3 c 24 37.6 d 30 36.2 e 38 36.5
f 46 35.3 g 53 36.2 h 60 44.1 i 64 44.8 j 76 47.2
;
proc corr data=brakestats spearman kendall;
run;
2 Variables: age stoppingdistance
Simple Statistics
Variable N Mean Std Dev Median Minimum Maximum
age 10 41.50000 22.11209 42.00000 9.00000 76.00000
stoppingdistance 10 37.56000 6.23773 36.35000 28.40000 47.20000
Spearman Correlation Coefficients, N = 10 (Prob > |r| under H0: Rho = 0)

                   age                stoppingdistance
age                1.00000            0.80244 (0.0052)
stoppingdistance   0.80244 (0.0052)   1.00000

Kendall Tau b Correlation Coefficients, N = 10 (Prob > |r| under H0: Rho = 0)

                   age                stoppingdistance
age                1.00000            0.67420 (0.0071)
stoppingdistance   0.67420 (0.0071)   1.00000
57
14.5.3 Kendall’s Coefficient of Concordance
Kendall's coefficient of concordance is closely related to the Friedman statistic, so we can calculate the coefficient of concordance once we obtain the Friedman statistic using SAS.
Example: data election;
input judge $ candidate $ candrank @@;
datalines;
a a 1 a b 6 a c 3 a d 2 a e 4 a f 5
b a 1 b b 5 b c 6 b d 4 b e 2 b f 3
c a 6 c b 3 c c 2 c d 5 c e 4 c f 1
;
proc freq data=election;
tables judge*candidate*candrank
/cmh2 scores=rank noprint;
run;
Summary Statistics for candidate by candrank
Controlling for judge
Cochran-Mantel-Haenszel Statistics (Based on Rank Scores)
Statistic Alternative Hypothesis DF Value Prob
1 Nonzero Correlation 1 0.0667 0.7963
2 Row Mean Scores Differ 5 2.0476 0.8425
Total Sample Size = 18
59
Resampling Methods
“Resampling” is generating the sampling distribution by drawing repeated random samples from the observed sample itself. 1
This is useful for assessing the accuracies
(e.g. the bias and standard error) of complex statistics.
• Permutation Test
• Bootstrap Method
• Jackknife Method
60
Permutation Test
Developed by R.A. Fisher (1890-1962) and E.J.G. Pitman (1897-1993) in the 1930s. 2
Draws SRS (Simple Random Samples) without replacement
Tests whether two samples X and Y , of size n1 and n2 respectively, are drawn from the same common distribution.
Hypotheses: Ho: Differences between the samples are due to chance.
Ha1: Y tends to have greater values than X , not simply due to chance
Ha2: Y tends to have smaller values than X , not simply due to chance
Ha3: There are differences between X and Y , not due to chance.
This method may be used to compare many different test statistics. To illustrate the method, however, let us consider the permutation test based on the difference between the sample averages, d = ȳ - x̄.
61
Permutation Test: Methodology
1. Pool the samples into one group (of size n1 + n2).
2. List all C(n1 + n2, n1) possible regroupings of the observations into two groups of size n1 and n2.
3. For each possible regrouping i, compute the sample averages x̄i and ȳi, and then compute the difference di = ȳi - x̄i.
4. To assess how "unusual" the original observed difference d = ȳ - x̄ is, compute a p-value (a proportion) as follows:
   For Ha1: p-value = (# of times di ≥ d) / C(n1 + n2, n1)
   For Ha2: p-value = (# of times di ≤ d) / C(n1 + n2, n1)
   For Ha3: p-value = (# of times |di| ≥ |d|) / C(n1 + n2, n1)
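For a data set as small as the capacitor subset used in the SAS examples below (3 vs. 3), the enumeration is fully feasible; this Python sketch (our own, using the difference of means) computes the exact lower-tailed p-value, which is consistent with the PROC MULTTEST permutation estimate of 0.1474 based on 25000 resamples:

```python
from itertools import combinations

def perm_pvalue(x, y):
    """Exact permutation p-value for Ha: y tends to have smaller values than x,
    using d = mean(y) - mean(x) as the test statistic."""
    pooled = x + y
    n1, n2 = len(x), len(y)
    total = sum(pooled)
    d_obs = sum(y) / n2 - sum(x) / n1
    hits = count = 0
    for idx in combinations(range(n1 + n2), n2):   # every regrouping
        s = sum(pooled[i] for i in idx)            # candidate "y" total
        d_i = s / n2 - (total - s) / n1
        count += 1
        if d_i <= d_obs + 1e-9:                    # tolerance for float ties
            hits += 1
    return hits / count

control = [17.9, 23.7, 29.8]
stressed = [15.2, 18.3, 21.1]
print(perm_pvalue(control, stressed))  # 0.15
```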
62
Bootstrap Method
Introduced by B. Efron (1938- ) in the late 1970s.
Draws a very large number of SRS with replacement (note the difference from the permutation test).
A heavily computer-based method of deriving robust estimates of the standard errors of sample statistics.
63
Jackknife Method 1
First implemented by R.E. von Mises 2 (1883-1953), then developed (separately) by Tukey (1915-2000) and Quenouille in the 1950s.
Resamples by deleting one observation at a time.
This method is also useful for estimating the standard error of a statistic t = t(x1, x2, ..., xn) based on a random sample of size n drawn from some distribution F.
First, calculate the n leave-one-out values of the statistic,
   t*i = t(x1, ..., x_{i-1}, x_{i+1}, ..., xn),  i = 1, ..., n.
Let t̄* = (1/n) Σ_{i=1}^n t*i, and let s*_t be the standard deviation of t*1, t*2, ..., t*n.
The jackknife estimate of SE(t) is given by
   jse(t) = sqrt(((n - 1)/n) Σ_{i=1}^n (t*i - t̄*)²) = ((n - 1)/sqrt(n)) s*_t.
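The leave-one-out recipe translates directly into code; here is a minimal Python sketch (our own helper, applied to the six pooled capacitor failure times). For the sample mean, the jackknife SE reduces exactly to the familiar s/sqrt(n):

```python
from math import sqrt

def jackknife_se(data, stat):
    """Jackknife SE: recompute the statistic with each observation deleted once."""
    n = len(data)
    t_star = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    t_bar = sum(t_star) / n
    return sqrt((n - 1) / n * sum((t - t_bar) ** 2 for t in t_star))

def mean(v):
    return sum(v) / len(v)

sample = [17.9, 23.7, 29.8, 15.2, 18.3, 21.1]
print(round(jackknife_se(sample, mean), 3))  # 2.124, which equals s/sqrt(n)
```

The same function works unchanged for statistics with no closed-form SE (e.g. a trimmed mean), which is where the jackknife earns its keep.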
64
14.6 Resampling Methods
SAS can be used to perform permutation, bootstrap, and jackknife resampling.
For the most part macros are required. These can be written
and are also readily available on the web.
PROC MULTTEST can be used to perform several tests incorporating permutation or bootstrap resampling.
In the following two examples, we use permutation and bootstrap resampling to obtain t-test p-value adjustments.
65
14.6 Resampling Methods
Example 14.15: permutation test

data capacitor;
  input group $ failtime @@;
  datalines;
control 17.9 control 23.7 control 29.8
stressed 15.2 stressed 18.3 stressed 21.1
;
proc multtest data=capacitor permutation nsample=25000
              out=results outsamp=samp;
  test mean(failtime / lower);
  class group;
  contrast 'a vs b' -1 1;
run;

proc print data=samp(obs=18); run;
proc print data=results; run;

The PERMUTATION option in the PROC MULTTEST statement requests permutation resampling, and NSAMPLE=25000 requests 25000 permutation samples. The OUTSAMP=SAMP option creates an output SAS data set containing the permutation samples.
The TEST statement specifies the t-test. The test is lower-tailed. The grouping variable in the CLASS statement is group, and the coefficients across the groups are -1 and 1, as specified in the CONTRAST statement. (See Chapter 12.)
PROC PRINT displays the first 18 observations of the samp data set containing the permutation samples.
66
14.6 Resampling Methods
Obs _sample_ _class_ _obs_ Failtime
1 1 control 6 21.1
2 1 control 5 18.3
3 1 control 3 29.8
4 1 stressed 2 23.7
5 1 stressed 1 17.9
6 1 stressed 4 15.2
7 2 control 5 18.3
8 2 control 2 23.7
9 2 control 6 21.1
10 2 stressed 3 29.8
11 2 stressed 4 15.2
12 2 stressed 1 17.9
13 3 control 2 23.7
14 3 control 1 17.9
15 3 control 6 21.1
16 3 stressed 4 15.2
17 3 stressed 3 29.8
18 3 stressed 5 18.3
Model Information
Test for continuous variables Mean t-test
Tails for continuous tests Lower-tailed
Strata weights None
P-value adjustment Permutation
Center continuous variables No
Number of resamples 25000
Seed 356405001
Contrast Coefficients
Contrast
group
control stressed
a vs b -1 1
Continuous Variable Tabulations

Variable  group     NumObs  Mean     Standard Deviation
failtime  control   3       23.8000  5.9506
failtime  stressed  3       18.2000  2.9513

p-Values

Variable  Contrast  Raw     Permutation
failtime  a vs b    0.1090  0.1474
67
14.6 Resampling Methods
Example 14.17: bootstrap test

data capacitor;
  input group $ failtime @@;
  datalines;
control 17.9 control 23.7 control 29.8
stressed 15.2 stressed 18.3 stressed 21.1
;
proc multtest data=capacitor bootstrap nsample=25
              outsamp=res nocenter out=outboot;
  test mean(failtime / lower);
  class group;
  contrast 'a vs b' -1 1;
run;

proc print data=res(obs=18); run;
proc print data=outboot; run;
The BOOTSTRAP option in the PROC MULTTEST statement requests bootstrap resampling, and NSAMPLE=25 requests 25 bootstrap samples. The OUTSAMP=RES option creates an output SAS data set containing the 25 bootstrap samples.
The TEST statement specifies the t-test. The test is lower-tailed. The grouping variable in the CLASS statement is group, and the coefficients across the groups are -1 and 1, as specified in the CONTRAST statement. (See Chapter 12.)
PROC PRINT displays the first 18 observations of the Res data set containing the bootstrap samples.
68
14.6 Resampling Methods
Obs _sample_ _class_ _obs_ failtime
1 1 control 6 21.1
2 1 control 6 21.1
3 1 control 6 21.1
4 1 stressed 2 23.7
5 1 stressed 1 17.9
6 1 stressed 6 21.1
7 2 control 4 15.2
8 2 control 6 21.1
9 2 control 2 23.7
10 2 stressed 1 17.9
11 2 stressed 3 29.8
12 2 stressed 6 21.1
13 3 control 2 23.7
14 3 control 4 15.2
15 3 control 3 29.8
16 3 stressed 3 29.8
17 3 stressed 3 29.8
18 3 stressed 1 17.9
Model Information
Test for continuous variables Mean t-test
Tails for continuous tests Lower-tailed
Strata weights None
P-value adjustment Bootstrap
Center continuous variables No
Number of resamples 25
Seed 270752001
Contrast Coefficients
Contrast
group
control stressed
a vs b -1 1
Continuous Variable Tabulations

Variable  group     NumObs  Mean     Standard Deviation
failtime  control   3       23.8000  5.9506
failtime  stressed  3       18.2000  2.9513

p-Values

Variable  Contrast  Raw     Bootstrap
failtime  a vs b    0.1090  0.0400
69
Works Cited
1. Tamhane, Ajit and Dorothy Dunlop. Statistics and Data Analysis. Upper Saddle River, NJ: Prentice Hall, Inc., 2000.
2. "Resampling (statistics)." Wikipedia. <http://en.wikipedia.org/wiki/Resampling_(statistics)>. 2007.
3. "Ch. 14: Nonparametric Statistical Methods." Group project, Wei Zhu, instructor. 2006.