1
Nonparametric Statistical Methods
Svetlana StoyanchevLuke SchordineLi OuyangValencia JosephMinghui Lu Rachel Merrill Kleva CostaMichael JohnesJane Cerise
Statistics and Data Analysis
Chapter 14December 13, 2007
3
Why use nonparametric methods?
Lawyers’ income (http://www.nalp.org/)
Make very few assumptions about the data distribution:
• Ordinal scale
• Not all data is normally distributed
4
Inference on a single sample: Sign Test
Use the median μ as the measure of center instead of the mean (for skewed data such as incomes, the mean and the median can differ substantially):
s+ = the number of xi's that exceed μ0,  s- = n - s+
H0: μ = μ0 vs. H1: μ > μ0
Reject H0 if s+ is large (or s- is small). How large should s+ be in order to reject H0 at a given significance level α?
5
Inference on a single sample: Sign Test
Random sample X1, X2, ..., Xn from a continuous distribution with median μ.
Let Prob(Xi > μ0) = p, so Prob(Xi < μ0) = 1 - p. Then
H0: μ = μ0 vs. H1: μ > μ0
is equivalent to
H0: p = 1/2 vs. H1: p > 1/2,
and S+ ~ Bin(n, p), S- ~ Bin(n, 1 - p). (S+ and S- are random variables; under H0, p = 1/2.)
Apply the test of a binomial proportion from Chapter 9!
6
Inference on a single sample: Sign Test
Rejection criterion for H0. Let b(n, α) denote the upper α critical point of the Bin(n, 1/2) distribution, and let smin = min(s+, s-), smax = max(s+, s-), so that 0 ≤ smin ≤ smax ≤ n.
One-sided test: H0: μ = μ0 vs. H1: μ > μ0 (or μ < μ0). Reject H0 if s+ ≥ b(n, α) (or s- ≥ b(n, α)).
Two-sided test: H0: μ = μ0 vs. H1: μ ≠ μ0. Reject H0 if smax ≥ b(n, α/2), or equivalently if smin ≤ n - b(n, α/2).
7
Inference on a single sample: Sign Test
When n > 20, the distributions of S+ and S- can be approximated by a normal distribution with mean n/2 and variance n/4. Can use a Z-test with statistic
   z = (s+ - n/2 - 1/2) / sqrt(n/4)   (with a continuity correction of 1/2).
Reject H0 when z ≥ z_α (one-sided) or |z| ≥ z_(α/2) (two-sided).
8
Confidence interval for μ from the ordered data values x(1) ≤ x(2) ≤ ... ≤ x(n):
An interval of the form [x(b+1), x(n-b)] contains μ with a confidence level determined by the Bin(n, 1/2) distribution.
Compute a 95% CI for the temperature measurements and test
   H0: μ = 200 vs. H1: μ ≠ 200.
Because of discreteness we cannot find an exact 95% CI, so we take the closest achievable level.
Ordered data (n = 10): 198.0 199.0 200.5 200.8 201.3 202.2 202.5 203.4 203.7 206.3
The lower 1.1% critical point is 1 (from Table A.1, n = 10, p = 0.5); the upper 1.1% critical point is 9 (by symmetry).
So α/2 = 0.011 and the confidence level is 1 - 0.022 = 0.978.
97.8% CI = [x(2), x(9)] = [199.0, 203.7]
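As a supplement to the slides (the deck's own code is SAS), the binomial sign test is easy to reproduce in a few lines of Python; the function name `sign_test` and its return convention are our own choices, applied here to the temperature data:

```python
from math import comb

def sign_test(data, mu0):
    """Two-sided sign test for H0: median = mu0 (observations equal to mu0 are dropped)."""
    s_plus = sum(1 for x in data if x > mu0)
    s_minus = sum(1 for x in data if x < mu0)
    n = s_plus + s_minus
    s_max = max(s_plus, s_minus)
    # P(S >= s_max) under Bin(n, 1/2), doubled for the two-sided test
    tail = sum(comb(n, k) for k in range(s_max, n + 1)) / 2 ** n
    return s_plus, s_minus, min(1.0, 2 * tail)

temps = [198.0, 199.0, 200.5, 200.8, 201.3, 202.2, 202.5, 203.4, 203.7, 206.3]
print(sign_test(temps, 200))  # (8, 2, 0.109375)
```

With s+ = 8 of the 10 observations above 200, the exact two-sided p-value 0.109 agrees with the slide's conclusion that the 97.8% CI contains 200.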
10
14.1.2: Wilcoxon Signed Rank Test
Designed by Frank Wilcoxon (1892-1965) to improve on the Sign Test
Takes into account not only whether xi is greater or less than μ0, but also the magnitude of the difference di = xi - μ0.
11
Frank Wilcoxon: The Man Behind the Test
Born in Ireland, grew up in the Catskills in New York
Earned a B.S. at the Pennsylvania Military Academy, a master's at Rutgers, and a Ph.D. at Cornell, all in chemistry
Worked as a research scientist at several laboratories
Became interested in statistical methods after reading R.A. Fisher's Statistical Methods for Research Workers
In response to Fisher's Student's t-tests, he developed nonparametric tests for paired and unpaired data sets
Source: http://www.wikipedia.org
12
Wilcoxon Signed Rank Test
Wilcoxon's paired-sample test; it assumes symmetry about the median.
Assigns a rank ri to each difference di based on its absolute value |di|, with the smallest |di| receiving rank 1.
Take the sums of the ranks of the positive and negative deviations (W+ and W-, respectively). Writing Zi = 1 if the difference with rank i is positive and Zi = 0 otherwise,
   W+ = Σ_{i=1}^n i·Zi,
and
E(W+) = E(Σ i·Zi) = E(1·Z1 + 2·Z2 + ... + n·Zn)
      = 1·E(Z1) + 2·E(Z2) + ... + n·E(Zn)   [E(Z1) = E(Z2) = ... = E(Zn)]
      = (1 + 2 + 3 + ... + n)·E(Z1) = n(n + 1)/4 under H0, since E(Zi) = 1/2.
13
Wilcoxon Signed Rank Test
The actual test uses a normal approximation. Under H0,
   E(W+) = n(n + 1)/4,   Var(W+) = n(n + 1)(2n + 1)/24,
and the continuity-corrected statistic is
   z = (w+ - n(n + 1)/4 - 1/2) / sqrt(n(n + 1)(2n + 1)/24).
14
Wilcoxon Signed Rank Test
Use a two-tailed Z-test of H0: the median difference is 0.
Reject H0 if |z0| ≥ z_(α/2).
There are advantages:
• Uses the ranked differences, not just their signs
• Leads to increased power (relative to the sign test)
And disadvantages:
• Symmetry is assumed but may not be true
• Can lead to an increased Type I error when the symmetry assumption fails
15
Adapted from http://faculty.vassar.edu/lowry/ch12a.html; W = Σ Wi.

Subj.  XA  XB  di = XA-XB  |di|   ri    Wi
  1    78  78       0        0    --    --
  2    24  24       0        0    --    --
  3    64  62      +2        2     1    +1
  4    45  48      -3        3     2    -2
  5    64  68      -4        4   3.5  -3.5
  6    52  56      -4        4   3.5  -3.5
  7    30  25      +5        5     5    +5
  8    50  44      +6        6     6    +6
  9    64  56      +8        8     7    +7
 10    50  40     +10       10   8.5  +8.5
 11    78  68     +10       10   8.5  +8.5
 12    22  36     -14       14    10   -10
 13    84  68     +16       16    11   +11
 14    40  20     +20       20    12   +12
 15    90  58     +32       32    13   +13
 16    72  32     +40       40    14   +14

W = 67.0, N = 14 (the two zero differences are dropped)
16
Wilcoxon Signed Rank Test: Example
For the present example, with N = 14, W = 67, and σW = 31.86, the result is:
   z0 = (W - 0.5 - μW) / σW = (67 - 0.5 - 0) / 31.86 = 2.09
(Subtracting 0.5 is a continuity correction, due to the fact that W is greater than μW = 0.)
Since z0 > z_(α/2) = 1.96, reject the null hypothesis.
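The whole calculation above can be reproduced with a short Python sketch (our own helper, not the chapter's SAS code): midranks for tied |di|, the signed-rank sum W, and the continuity-corrected z:

```python
from math import sqrt

def signed_rank_z(xa, xb):
    """Wilcoxon signed rank test via the signed-rank sum W, normal approximation."""
    d = [a - b for a, b in zip(xa, xb) if a != b]   # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                    # midranks for tied |d|
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j + 2) / 2       # average of ranks i+1 .. j+1
        i = j + 1
    w = sum(r if di > 0 else -r for r, di in zip(ranks, d))
    sigma = sqrt(n * (n + 1) * (2 * n + 1) / 6)     # SD of W under H0
    z = (w - 0.5) / sigma                           # continuity correction (w > 0)
    return w, z

xa = [78, 24, 64, 45, 64, 52, 30, 50, 64, 50, 78, 22, 84, 40, 90, 72]
xb = [78, 24, 62, 48, 68, 56, 25, 44, 56, 40, 68, 36, 68, 20, 58, 32]
w, z = signed_rank_z(xa, xb)
print(w, round(z, 2))  # 67.0 2.09
```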
17
Wilcoxon Signed Rank Test: Confidence Interval
To get a 95% CI for the median:
Take the N = n(n + 1)/2 pairwise (Walsh) averages of the data: X̄ij = (xi + xj)/2 for 1 ≤ i ≤ j ≤ n.
Order these Walsh averages.
A (1 - α)-level CI runs between order statistics of the Walsh averages: [X̄(w+1), X̄(N-w)], where w is the lower α/2 critical point of the signed rank distribution.
18
14.2 Inferences for Two Independent Samples
By Li Ouyang
19
Problem: Is one population larger than another population?
How to solve? Two equivalent nonparametric tests: the Wilcoxon rank sum test and the Mann-Whitney U-test.

14.2.1 Wilcoxon-Mann-Whitney Test
1st: the Wilcoxon rank sum test.
Assumption: no ties in the two samples x1, x2, ..., x_n1 and y1, y2, ..., y_n2.
1. Rank all N = n1 + n2 observations in ascending order.
2. Denote by w1 the sum of the ranks of the x's and by w2 the sum of the ranks of the y's. The ranks range over the integers 1, 2, ..., N, so we have
   w1 + w2 = 1 + 2 + ... + N = N(N + 1)/2.
3. Reject H0 if w1 is large (or if w2 is small).
Note: at significance level α, when n1 ≠ n2 the null distributions of W1 and W2 differ.
20
2nd: the Mann-Whitney U-test.
1. Compare each xi with each yj:
   u1 = number of pairs with xi > yj,  u2 = number of pairs with xi < yj,  and u1 + u2 = n1n2.
2. Reject H0 if u1 is large (or u2 is small).
The two test statistics are related as follows:
   u1 = w1 - n1(n1 + 1)/2,   u2 = w2 - n2(n2 + 1)/2.
Advantage of the Mann-Whitney form: U1 and U2 have the same null distribution, with range [0, n1n2].
P-value = P{U ≥ u1} = P{U ≤ u2}. At significance level α, we reject H0 if the P-value ≤ α, i.e., if u1 ≥ u_(n1,n2,α).
Denote by u_(n1,n2,α) the upper α critical point.
21
For large n1 and n2, the null distribution of U is approximately normal with
   E(U) = n1n2/2,   Var(U) = n1n2(N + 1)/12.
Z-test (large sample) test statistic:
   z = (u1 - n1n2/2 - 1/2) / sqrt(n1n2(N + 1)/12).
We reject H0 at significance level α if z ≥ z_α, or equivalently if
   u1 ≥ u_(n1,n2,α) ≈ n1n2/2 + 1/2 + z_α · sqrt(n1n2(N + 1)/12).
Two-sided test: use the statistics umax = max(u1, u2) or umin = min(u1, u2);
   P-value = 2P{U ≥ umax} = 2P{U ≤ umin}.
22
Example: Failure Times of Capacitors (Wilcoxon-Mann-Whitney Test)
18 capacitors: 8 in the control group and 10 in the thermally stressed group.
Perform the Wilcoxon-Mann-Whitney test to determine if thermal stress significantly reduces the time to failure of capacitors. α = 0.05.
n1 = 8, n2 = 10. The rank sums are
w1 = 4+8+10+11+13+14+17+18 = 95
w2 = 1+2+3+5+6+7+9+12+15+16 = 76
Times to Failure for Two Capacitor Groups, and Ranks of Times to Failure

Control Group    Stressed Group    Control Ranks    Stressed Ranks
 5.2  17.1        1.1   7.2         4  13            1   7
 8.5  17.9        2.3   9.1         8  14            2   9
 9.8  23.7        3.2  15.2        10  17            3  12
12.3  29.8        6.3  18.3        11  18            5  15
                  7.0  21.1                          6  16

u1 = w1 - n1(n1 + 1)/2 = 95 - (8)(9)/2 = 59
u2 = w2 - n2(n2 + 1)/2 = 76 - (10)(11)/2 = 21
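The rank sums and U statistics for these data are easy to verify with a small Python helper (our own sketch, assuming no ties, which holds for the capacitor data):

```python
def rank_sums(x, y):
    """Wilcoxon rank sums w1, w2 and Mann-Whitney statistics u1, u2 (no ties assumed)."""
    rank = {v: i + 1 for i, v in enumerate(sorted(x + y))}   # ranks 1..N
    n1, n2 = len(x), len(y)
    w1 = sum(rank[v] for v in x)
    w2 = sum(rank[v] for v in y)
    return w1, w2, w1 - n1 * (n1 + 1) // 2, w2 - n2 * (n2 + 1) // 2

control = [5.2, 8.5, 9.8, 12.3, 17.1, 17.9, 23.7, 29.8]
stressed = [1.1, 2.3, 3.2, 6.3, 7.0, 7.2, 9.1, 15.2, 18.3, 21.1]
print(rank_sums(control, stressed))  # (95, 76, 59, 21)
```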
23
Let F1 be the c.d.f. of the control group and F2 the c.d.f. of the stressed group:
   H0: F1 = F2 vs. H1: F1 < F2.
Check that u1 + u2 = n1n2 = 80. From Table A.11, P-value = 0.051.
Large-sample Z-test:
   z = (u1 - n1n2/2 - 1/2) / sqrt(n1n2(N + 1)/12)
     = (59 - (8)(10)/2 - 1/2) / sqrt((8)(10)(19)/12) = 1.643
Conclusion: this yields P-value = 1 - Φ(1.643) = 0.0502.
Table A.11 Upper-Tail Probabilities of the Null Distribution of the Wilcoxon-
Mann-Whitney Statistic
n1   n2   w1   u1   P(W ≥ w1) = P(U ≥ u1)
8 8 84 48 0.052
8 87 51 0.025
8 90 54 0.010
8 92 56 0.005
9 89 53 0.057
9 93 57 0.023
9 96 60 0.010
9 98 62 0.006
10 95 59 0.051
10 98 62 0.027
10 102 66 0.010
10 104 68 0.006
24
Null distribution of the Wilcoxon-Mann-Whitney Test Statistic
Two r.v.'s, X and Y, with c.d.f.'s F1 and F2, respectively.
Assumption: under H0, all N = n1 + n2 observations come from the common distribution F1 = F2. Therefore all possible orderings of these observations, with n1 coming from F1 and n2 coming from F2, are equally likely. There are
   N! / (n1! n2!)
such orderings in total. For example, with n1 = 2, n2 = 3 there are 5!/(2!·3!) = 10.

All possible orderings for n1 = 2, n2 = 3:

Ranks: 1 2 3 4 5   w1  u1      Ranks: 1 2 3 4 5   w1  u1
       x x y y y    3   0             y x y x y    6   3
       x y x y y    4   1             y x y y x    7   4
       x y y x y    5   2             y y x x y    7   4
       x y y y x    6   3             y y x y x    8   5
       y x x x y    5   2             y y y x x    9   6

Null distribution of W1 and U1 (n1 = 2, n2 = 3):

w1  u1  P(W1 = w1) = P(U1 = u1)
 3   0   0.1
 4   1   0.1
 5   2   0.2
 6   3   0.2
 7   4   0.2
 8   5   0.1
 9   6   0.1
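This enumeration argument can be carried out mechanically; the sketch below (our own, not from the text) tabulates the null distribution of W1 by listing every equally likely placement of the x-ranks:

```python
from itertools import combinations

def null_dist_w1(n1, n2):
    """Null distribution of the rank sum W1: enumerate all C(N, n1) equally
    likely placements of the x-ranks among the integers 1..N."""
    N = n1 + n2
    counts = {}
    for x_ranks in combinations(range(1, N + 1), n1):
        w1 = sum(x_ranks)
        counts[w1] = counts.get(w1, 0) + 1
    total = sum(counts.values())          # equals N! / (n1! n2!)
    return {w: c / total for w, c in sorted(counts.items())}

print(null_dist_w1(2, 3))
# {3: 0.1, 4: 0.1, 5: 0.2, 6: 0.2, 7: 0.2, 8: 0.1, 9: 0.1}
```

The output reproduces the table above exactly.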
25
14.2.2 Wilcoxon-Mann-Whitney Confidence Interval
Assumption: F1 and F2 belong to a location parameter family with location parameters θ1 and θ2 (θ1 and θ2 are the respective population medians): F1(x) = F(x - θ1) and F2(y) = F(y - θ2), where F is a common unknown distribution function.
How to calculate a CI for θ1 - θ2?
Step 1: Calculate all N = n1n2 pairwise differences dij = xi - yj (1 ≤ i ≤ n1, 1 ≤ j ≤ n2) and rank them: d(1) ≤ d(2) ≤ ... ≤ d(N).
Step 2: Obtain the lower α/2 critical point u = u_(n1,n2,1-α/2).
The 100(1 - α)% CI for θ1 - θ2 is given by d(u+1) ≤ θ1 - θ2 ≤ d(N-u).
26
Example: Find a 95% CI for the difference between the median failure times of the control group and the thermally stressed group of capacitors.
n1 = 8, n2 = 10, N = n1n2 = 80.
The lower 2.2% critical point of the distribution of U is 17; the upper 2.2% critical point is 80 - 17 = 63.
With α/2 = 0.022, 1 - α = 1 - 0.044 = 0.956.
Therefore [d(18), d(63)] = [-1.1, 14.7] is a 95.6% CI.
Differences dij = xi - yj between the two groups

       yj:  1.1   2.3   3.2   6.3   7.0   7.2   9.1  15.2  18.3  21.1
xi
 5.2        4.1   2.9   2.0  -1.1  -1.8  -2.0  -3.9 -10.0 -13.1 -15.9
 8.5        7.4   6.2   5.3   2.2   1.5   1.3  -0.6  -6.7  -9.8 -12.6
 9.8        8.7   7.5   6.6   3.5   2.8   2.6   0.7  -5.4  -8.5 -11.3
12.3       11.2  10.0   9.1   6.0   5.3   5.1   3.2  -2.9  -6.0  -8.8
17.1       16.0  14.8  13.9  10.8  10.1   9.9   8.0   1.9  -1.2  -4.0
17.9       16.8  15.6  14.7  11.6  10.9  10.7   8.8   2.7  -0.4  -3.2
23.7       22.6  21.4  20.5  17.4  16.7  16.5  14.6   8.5   5.4   2.6
29.8       28.7  27.5  26.6  23.5  22.8  22.6  20.7  14.6  11.5   8.7
Table A.11:
n1  n2  u1 (80 - u1)        P(U ≥ u1)
 8  10  59 (80 - 59 = 21)   0.051
    10  62 (80 - 62 = 18)   0.027
    10  63 (80 - 63 = 17)   0.022
    10  66 (80 - 66 = 14)   0.010
    10  68 (80 - 68 = 12)   0.006
27
Example Using SAS
Two Groups A & B Both groups are exposed to a chemical that encourages tumor growth Group B has been treated with a drug to prevent tumor formation
The masses (in grams) of tumors in each group are
Group A: 3.1 2.2 1.7 2.7 2.5
Group B: 0.0 0.0 1.0 2.3
We want to see if there are any differences in tumor mass between groups A and B.
Thus we will use the Wilcoxon test: put all the data in increasing order and calculate the ranks.
Mass: 0.0 0.0 1.0 1.7 2.2 2.3 2.5 2.7 3.1
Group: B B B A A B A A A
Rank: 1.5 1.5 3 4 5 6 7 8 9
28
SAS Program
data Tumor;
  input Group $ Mass @@;
  datalines;
A 3.1 A 2.2 A 1.7 A 2.7 A 2.5
B 0.0 B 0.0 B 1.0 B 2.3
;
proc npar1way data=Tumor wilcoxon;
  title "Non Parametric Test to Compare Tumor Masses";
  class Group;
  var Mass;
  exact wilcoxon;
run;

proc univariate data=Tumor normal plot;
  title "More Descriptive Statistics";
  class Group;
  var Mass;
run;
29
The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable Mass Classified by Variable Group

Group  N  Sum of Scores  Expected Under H0  Std Dev Under H0  Mean Score
A      5      33.0             25.0             4.065437         6.60
B      4      12.0             20.0             4.065437         3.00

Wilcoxon Two-Sample Test

Statistic                          12.0000
Normal Approximation
  Z                                -1.8448
  One-Sided Pr < Z                  0.0325
  Two-Sided Pr > |Z|                0.0651
t Approximation
  One-Sided Pr < Z                  0.0511
  Two-Sided Pr > |Z|                0.1023
Exact Test
  One-Sided Pr <= S                 0.0317
  Two-Sided Pr >= |S - Mean|        0.0635

Z includes a continuity correction of 0.5.

Kruskal-Wallis Test
Chi-Square        3.8723
DF                1
Pr > Chi-Square   0.0491
30
The Univariate Procedure
Tests for Location: Mu0=0
Tests Statistic p Value
Student’s t T 10.3479 Pr > |t| 0.0005
Sign M 2.5 Pr >= |M| 0.0625
Signed Rank S 7.5 Pr >= |S| 0.0625
32
INFERENCES FOR SEVERAL INDEPENDENT SAMPLES
- The Kruskal-Wallis test is a generalization of the Wilcoxon-Mann-Whitney test for a ≥ 2 independent samples
- It is also a nonparametric alternative to the ANOVA F-test for a one-way layout
33
The steps of the test:
1) First rank all N values from smallest to largest, assigning tied values the average of their ranks.
2) Calculate the rank sums ri = Σj rij and the averages ṝi = ri/ni, i = 1, 2, ..., a. (The average of all N ranks is (N + 1)/2.)
3) Calculate the test statistic
   kw = [12 / (N(N + 1))] Σ_{i=1}^a ni (ṝi - (N + 1)/2)².
4) Reject H0 for large values of kw (if kw > χ²_(a-1, α)).
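These steps can be sketched in Python (our own helper, not the chapter's SAS). Applied to the tumor-mass data from the earlier SAS example, it reproduces the tie-corrected chi-square of 3.8723 reported by PROC NPAR1WAY:

```python
def kruskal_wallis(samples):
    """Kruskal-Wallis statistic with midranks for ties and the usual tie correction."""
    pooled = sorted(v for s in samples for v in s)
    N = len(pooled)
    first, count = {}, {}
    for i, v in enumerate(pooled):
        first.setdefault(v, i)
        count[v] = count.get(v, 0) + 1
    # midrank of a value = average of the 1-based positions it occupies
    rank = {v: first[v] + (count[v] + 1) / 2 for v in count}
    kw = 12 / (N * (N + 1)) * sum(
        len(s) * (sum(rank[v] for v in s) / len(s) - (N + 1) / 2) ** 2
        for s in samples)
    ties = 1 - sum(t ** 3 - t for t in count.values()) / (N ** 3 - N)
    return kw / ties

a = [3.1, 2.2, 1.7, 2.7, 2.5]   # tumor masses, group A
b = [0.0, 0.0, 1.0, 2.3]        # tumor masses, group B
print(round(kruskal_wallis([a, b]), 4))  # 3.8723
```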
34
The Pedagogy Problem
Consider Example 14.9 on page 581 of the text, in which four methods of teaching the concept of percentage to sixth graders are compared. There are 28 classes, 7 using each method: the Case Method, the Formula Method, the Equation Method, and the Unitary Analysis Method.
35
DATA Test_Score;
  INPUT Method $ Score @@;
DATALINES;
C 14.59 C 23.44 C 25.43 C 18.15 C 20.82 C 14.06 C 14.26
F 20.27 F 26.84 F 14.71 F 22.34 F 19.49 F 24.92 F 20.20
E 27.82 E 24.92 E 28.68 E 23.32 E 32.85 E 33.90 E 23.42
U 33.16 U 26.93 U 30.43 U 36.43 U 37.04 U 29.76 U 33.88
;
PROC NPAR1WAY DATA=Test_Score WILCOXON;
  CLASS Method;
  VAR Score;
  *EXACT WILCOXON;
RUN;
The Program
36
A Note About the ProgramYou might have noticed the asterisk in the program line:
*EXACT WILCOXON
The asterisk turns the line into a comment. Without it, SAS attempts to find an exact p-value for the test, which can take a very long time; were it not for the run time, this command would be highly recommended. We'll settle for a quicker approximation.
37
The Output The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable Score Classified by Variable Method
Sum of Expected Std Dev Mean Method N Scores Under H0 Under H0 Score C 7 49.00 101.50 18.845498 7.000000 F 7 66.50 101.50 18.845498 9.500000 E 7 125.50 101.50 18.845498 17.928571 U 7 165.00 101.50 18.845498 23.571429
Average scores were used for ties.
Kruskal-Wallis Test
Chi-Square 18.1390 DF 3 Pr > Chi-Square 0.0004
38
A Note About the Output
We see that the value of kw is 18.1390, a value large enough to yield an approximate p-value of 0.0004... an extremely small value. At a level of significance of 5%, or even 1%, there is a strong suggestion that the methods are not equally effective, and that the Unitary Analysis Method seems to be the best choice.
39
Use this to check for differences between treatment groups.
Test statistic: ṝi - ṝj (the difference in rank averages). For large ni's, R̄i - R̄j is approximately normally distributed, so one may use
   zij = (ṝi - ṝj) / sqrt((N(N + 1)/12)(1/ni + 1/nj)).
Treatments i and j are declared different if |zij| > q_(a,∞,α).
40
INFERENCES FOR SEVERAL MATCHED SAMPLES
The Friedman test is a generalization of the sign test for a ≥ 2 matched samples.
It is also a nonparametric alternative to the ANOVA F-test for a randomized block design.
Since it is used for a block design, rankings are done separately within each block.
The steps for the test: rank the observations from the a treatments separately within each block, assigning tied values, where needed, the average of their ranks.
41
14.4.1 Friedman Test
Example 14.11
Ryan and Joiner give data on the percentage drip loss in meat loaves. The goal was to compare the eight oven positions, which might differ due to temperature variations. Three batches of eight loaves were baked. The loaves from each batch were randomly placed in the eight positions.
Analyze the data using the Friedman test.
Here the oven positions are treatments and batches are blocks.
42
14.4.1 Friedman Test
Example 14.11, SAS

data meatloaf;
  input ovenbatch ovenposition driploss @@;
  datalines;
1 1 7.33 1 2 3.22 1 3 3.28 1 4 6.44
1 5 3.83 1 6 3.28 1 7 5.06 1 8 4.44
2 1 8.11 2 2 3.72 2 3 5.11 2 4 5.78
2 5 6.50 2 6 5.11 2 7 5.11 2 8 4.28
3 1 8.06 3 2 4.28 3 3 4.56 3 4 8.61
3 5 7.72 3 6 5.56 3 7 7.83 3 8 6.33
;
proc rank data=meatloaf out=rankings;
  by ovenbatch;
  var driploss;
  ranks drip;
run;
proc print data=rankings; run;
proc means data=rankings sum;
  class ovenposition;
  var drip;
run;
proc freq data=rankings;
  tables ovenbatch*ovenposition*driploss / cmh2;
run;
proc freq data=meatloaf;
  tables ovenbatch*ovenposition*driploss / cmh2 scores=rank;
run;
The Friedman test is identical to the ANOVA CMH statistic when the analysis uses rank scores (SCORES=RANK)
43
14.4.1 Friedman Test
Example 14.11, SAS results
Obs ovenbatch ovenposition driploss drip
1 1 1 7.33 8.0
2 1 2 3.22 1.0
3 1 3 3.28 2.5
4 1 4 6.44 7.0
5 1 5 3.83 4.0
6 1 6 3.28 2.5
7 1 7 5.06 6.0
8 1 8 4.44 5.0
9 2 1 8.11 8.0
10 2 2 3.72 1.0
11 2 3 5.11 4.0
12 2 4 5.78 6.0
13 2 5 6.50 7.0
14 2 6 5.11 4.0
15 2 7 5.11 4.0
16 2 8 4.28 2.0
17 3 1 8.06 7.0
18 3 2 4.28 1.0
19 3 3 4.56 2.0
20 3 4 8.61 8.0
21 3 5 7.72 5.0
22 3 6 5.56 3.0
23 3 7 7.83 6.0
24 3 8 6.33 4.0
Analysis Variable: drip (Rank for Variable driploss)

ovenposition  N Obs  Sum
1 3 23.0000000
2 3 3.0000000
3 3 8.5000000
4 3 21.0000000
5 3 16.0000000
6 3 9.5000000
7 3 16.0000000
8 3 11.0000000
Summary Statistics for ovenposition by drip
Controlling for ovenbatch
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic Alternative Hypothesis DF Value Prob
1 Nonzero Correlation 1 0.1488 0.6997
2 Row Mean Scores Differ 7 17.9393 0.0122
Total Sample Size = 24
44
Calculate the Friedman statistic:
   fr = [12 / (ba(a + 1))] Σ_{i=1}^a ri² - 3b(a + 1).
Reject H0 for large values of fr. The null distribution of fr can be approximated by the chi-square distribution with a - 1 degrees of freedom; thus reject H0 if fr > χ²_(a-1, α).
Treatments can then be compared pairwise, as in the Kruskal-Wallis analysis, using the differences |ri - rj|.
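A minimal Python sketch of the Friedman statistic (our own helper), with within-block midranks and the standard tie correction, reproduces the CMH row-mean-scores value 17.9393 from the SAS output for Example 14.11:

```python
def friedman(blocks):
    """Tie-corrected Friedman statistic for b blocks (rows) x a treatments (cols)."""
    b, a = len(blocks), len(blocks[0])
    rank_sums = [0.0] * a
    tie_term = 0
    for block in blocks:
        order = sorted(range(a), key=lambda j: block[j])
        i = 0
        while i < a:                         # assign midranks within the block
            j = i
            while j + 1 < a and block[order[j + 1]] == block[order[i]]:
                j += 1
            tie_term += (j - i + 1) ** 3 - (j - i + 1)
            for k in range(i, j + 1):
                rank_sums[order[k]] += (i + j + 2) / 2   # avg of ranks i+1..j+1
            i = j + 1
    fr = 12 / (b * a * (a + 1)) * sum(r * r for r in rank_sums) - 3 * b * (a + 1)
    return fr / (1 - tie_term / (b * a * (a * a - 1)))   # tie correction

batches = [
    [7.33, 3.22, 3.28, 6.44, 3.83, 3.28, 5.06, 4.44],
    [8.11, 3.72, 5.11, 5.78, 6.50, 5.11, 5.11, 4.28],
    [8.06, 4.28, 4.56, 8.61, 7.72, 5.56, 7.83, 6.33],
]
print(round(friedman(batches), 4))  # 17.9393
```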
46
Rank Correlation Methods
The Pearson correlation coefficient ρ measures only the degree of linear association between random variables (and its usual inference assumes normality); it cannot capture nonlinear association.
Spearman's rank correlation coefficient ρs and Kendall's rank correlation coefficient τ measure the degree of monotone (increasing or decreasing) association between two variables.
Extreme correlation (1 or -1) does not imply a cause-and-effect relationship.
Zero correlation does not imply independence. A "strong" correlation is not necessarily statistically significant, and vice versa.
47
Researchers at the European Centre for Road Safety Testing are trying to find out how the age of cars affects their braking capability. They test a group of ten cars of differing ages and find out the minimum stopping distances that the cars can achieve. The results are set out in the table below:
Car   Age (months) Xi   Min. Stopping at 40 kph (metres) Yi   Age Rank (ui)   Stopping Rank (vi)   Rank Difference (di = ui - vi)
A 9 28.4 1 1 0
B 15 29.3 2 2 0
C 24 37.6 3 7 -4
D 30 36.2 4 4.5 -0.5
E 38 36.5 5 6 -1
F 46 35.3 6 3 3
G 53 36.2 7 4.5 2.5
H 60 44.1 8 8 0
I 64 44.8 9 9 0
J 76 47.2 10 10 0
Σ di² = 32.5
14.5.1 Spearman’s Rank Correlation Coefficient
48
14.5.1 Spearman's Rank Correlation Coefficient
• Ho: X and Y are independent => ρs = 0
• Ha: X and Y are positively (monotonically) associated <=> ρs > 0
   rs = 1 - 6 Σ_{i=1}^n di² / (n(n² - 1)) = 1 - (6)(32.5) / ((10)(99)) = 0.803
For large samples (n ≥ 10), rs ~ Normal(0, 1/(n - 1)) under Ho, so
   z = rs · sqrt(n - 1) = 0.803 · sqrt(9) = 2.409,  P-value = 0.0081
Since -1 < rs < 1, rs = 0.803 indicates a strong positive association between car age and minimum stopping distance; in other words, the older the car, the longer the distance we could expect it to take to stop.
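The rank-difference formula above is straightforward to code; this Python sketch (our own, using midranks for the one tied stopping distance, as the slide's table does) reproduces rs = 0.803 for the braking data:

```python
def spearman_rs(x, y):
    """Spearman's rs via rs = 1 - 6*sum(d_i^2) / (n(n^2 - 1)), with midranks
    for ties (the formula is exact only in the tie-free case)."""
    def midranks(v):
        sv = sorted(v)
        # midrank = average of the first and last 1-based positions of the value
        return [(sv.index(a) + 1 + len(sv) - sv[::-1].index(a)) / 2 for a in v]
    u, w = midranks(x), midranks(y)
    n = len(x)
    d2 = sum((ui - wi) ** 2 for ui, wi in zip(u, w))
    return 1 - 6 * d2 / (n * (n * n - 1))

age = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
stop = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]
print(round(spearman_rs(age, stop), 3))  # 0.803
```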
49
Car   Age (months) Xi   Min. Stopping at 40 kph (metres) Yi   Concordant Pairs (Nci)   Discordant Pairs (Ndi)   Tie Pairs (Nti)
A 9 28.4 9 0 0
B 15 29.3 8 0 0
C 24 37.6 3 4 0
D 30 36.2 4 1 1
E 38 36.5 3 2 0
F 46 35.3 4 0 0
G 53 36.2 3 0 0
H 60 44.1 2 0 0
I 64 44.8 1 0 0
J 76 47.2 0 0 0
Nc=37 Nd=7 Nt=1
Nci=#{j>i: xj>xi and yj>yi}Ndi=#{j>i: xj>xi and yj<yi}Nti=#{j>i: xj=xi or yj=yi}
14.5.2 Kendall’s Rank Correlation Coefficient
   τ̂ = (Nc - Nd) / N,  where N = Nc + Nd + Nt
50
14.5.2 Kendall's Rank Correlation Coefficient
• Ho: X and Y are independent => τ = 0
• Ha: X and Y are positively associated <=> τ > 0
Tie corrections: Tx = Σj gj(gj - 1)/2 = 0 and Ty = Σj hj(hj - 1)/2 = (2)(2 - 1)/2 = 1, where gj and hj are the sizes of the tied groups among the x's and y's, respectively.
   τ̂ = (Nc - Nd) / sqrt((N - Tx)(N - Ty)) = (37 - 7) / sqrt((45 - 0)(45 - 1)) = 0.67
For large samples (n ≥ 10), τ̂ ~ Normal(0, 2(2n + 5) / (9n(n - 1))) under Ho, so
   z = τ̂ · sqrt(9n(n - 1) / (2(2n + 5))) = 0.67 · sqrt((9)(10)(9) / ((2)(25))) = 2.697
P-value = 0.00355
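Counting concordant, discordant, and tied pairs is easily automated; this Python sketch (our own helper) computes the tie-adjusted τ̂ (SAS's "Kendall tau-b") for the braking data:

```python
from math import sqrt
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant/discordant pair counts with tie adjustment."""
    nc = nd = tx = ty = 0
    for (xi, yi), (xj, yj) in combinations(zip(x, y), 2):
        if xi == xj:
            tx += 1                      # pair tied on x
        elif yi == yj:
            ty += 1                      # pair tied on y
        elif (xj - xi) * (yj - yi) > 0:
            nc += 1                      # concordant pair
        else:
            nd += 1                      # discordant pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (nc - nd) / sqrt((n_pairs - tx) * (n_pairs - ty))

age = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
stop = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]
print(round(kendall_tau_b(age, stop), 4))  # 0.6742
```

The result 0.6742 matches the PROC CORR KENDALL output (0.67420) shown later in the deck.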
51
Kendall τ and Spearman ρs imply different interpretations: While Spearman ρs can be thought of as the regular Pearson ρ but computed just from ranks of variables, Kendall τ rather represents a probability.
Spearman’s rank correlation coefficient is related to Kendall’s coefficient of concordance, by rs=2w-1 when a=2
A piece of SAS code: PROC CORR DATA=CAR SPEARMAN KENDALL;
will generate both correlation coefficients, just a click away!
52
14.5.3 Kendall’s Coefficient of Concordance
This measures the degree to which many judges agree on the ranking of several subjects. Suppose three employers rank six candidates for a job, giving the following data:

Candidate    a   b   c   d   e   f
----------------------------------
Judge A      1   6   3   2   4   5
Judge B      1   5   6   4   2   3
Judge C      6   3   2   5   4   1
----------------------------------
Rank Sum     8  14  11  11  10   9

• Ho: ranks are assigned at random by the judges (the judges are in disagreement)
• Ha: ranks are not assigned at random (the judges are in agreement)
53
14.5.3 Kendall's Coefficient of Concordance
Notation: a = number of treatments (candidates), b = number of blocks (judges), ri = sum of ranks for treatment i, fr = Friedman statistic.
   w = agreement / maximum possible agreement = fr / (b(a - 1))
   fr = [12 / (ba(a + 1))] Σ ri² - 3b(a + 1)
      = [12 / ((3)(6)(7))] [8² + 14² + 11² + 11² + 10² + 9²] - (3)(3)(7)
      = 65.05 - 63 = 2.05
   w = 2.05 / ((3)(6 - 1)) = 0.1367
0 ≤ w ≤ 1, with small values indicating disagreement and large values indicating agreement.
Since fr = 2.05 < χ²_(5, 0.05) = 11.07, we cannot reject the null hypothesis: the employers give different (discordant) rankings to the same candidates.
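The arithmetic above is a one-liner once the rank table is in hand; this Python sketch (our own helper) computes fr and w for the three judges. Note it returns fr = 2.0476 (matching the CMH statistic in the SAS output later in the deck) and w = 0.1365; the slide's 0.1367 comes from dividing the rounded value 2.05 by 15:

```python
def kendall_w(rank_table):
    """Kendall's coefficient of concordance W = fr / (b(a-1)), where fr is the
    Friedman statistic computed from within-judge ranks (no ties here)."""
    b = len(rank_table)          # judges (blocks)
    a = len(rank_table[0])       # candidates (treatments)
    r = [sum(judge[i] for judge in rank_table) for i in range(a)]
    fr = 12 / (b * a * (a + 1)) * sum(ri * ri for ri in r) - 3 * b * (a + 1)
    return fr, fr / (b * (a - 1))

judges = [
    [1, 6, 3, 2, 4, 5],   # Judge A
    [1, 5, 6, 4, 2, 3],   # Judge B
    [6, 3, 2, 5, 4, 1],   # Judge C
]
fr, w = kendall_w(judges)
print(round(fr, 4), round(w, 4))  # 2.0476 0.1365
```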
54
14.5 Rank Correlation Methods
Examples 14.12 and 14.13
Data are given on the yearly alcohol consumption from wine in liters per person
and yearly heart disease deaths per 100,000 people for 19 countries.
Test if there is an association between these two variables using Spearman’s rank correlation coefficient.
Test if there is an association between these two variables using Kendall’s rank correlation coefficient (Kendall’s tau).
55
14.5 Rank Correlation Methods
Examples 14.12 and 14.13 in SAS

data wineheart;
input country $ alcohol deaths @@;
datalines;
australia 2.5 211 austria 3.9 167 belgium 2.9 131
canada 2.4 191 denmark 2.9 220 finland 0.8 297
france 9.1 71 iceland 0.8 211 ireland 0.7 300
italy 7.9 107 netherlands 1.8 167 newzealand 1.9 266
norway 0.8 227 spain 6.5 86 sweden 1.6 207
switzerland 5.8 115 uk 1.3 285 us 1.2 199
wgermany 2.7 172
;
proc corr data=wineheart spearman;
run;
proc corr data=wineheart kendall;
run;
2 Variables: alcohol deaths
Simple Statistics
Variable N Mean Std Dev Median Minimum Maximum
alcohol 19 3.02632 2.50972 2.40000 0.70000 9.10000
deaths 19 191.05263 68.39629 199.00000 71.00000 300.00000
Spearman Correlation Coefficients, N = 19 (Prob > |r| under H0: Rho = 0)

          alcohol             deaths
alcohol    1.00000           -0.82886 (<.0001)
deaths    -0.82886 (<.0001)   1.00000

Kendall Tau b Correlation Coefficients, N = 19 (Prob > |r| under H0: Rho = 0)

          alcohol             deaths
alcohol    1.00000           -0.69644 (<.0001)
deaths    -0.69644 (<.0001)   1.00000
56
14.5 Rank Correlation Methods
Example
data brakestats;
input car $ age stoppingdistance @@;
datalines;
a 9 28.4 b 15 29.3 c 24 37.6 d 30 36.2 e 38 36.5
f 46 35.3 g 53 36.2 h 60 44.1 i 64 44.8 j 76 47.2
;
proc corr data=brakestats spearman kendall;
run;
2 Variables: age stoppingdistance
Simple Statistics
Variable N Mean Std Dev Median Minimum Maximum
age 10 41.50000 22.11209 42.00000 9.00000 76.00000
stoppingdistance 10 37.56000 6.23773 36.35000 28.40000 47.20000
Spearman Correlation Coefficients, N = 10 (Prob > |r| under H0: Rho = 0)

                   age                stoppingdistance
age                1.00000            0.80244 (0.0052)
stoppingdistance   0.80244 (0.0052)   1.00000

Kendall Tau b Correlation Coefficients, N = 10 (Prob > |r| under H0: Rho = 0)

                   age                stoppingdistance
age                1.00000            0.67420 (0.0071)
stoppingdistance   0.67420 (0.0071)   1.00000
57
14.5.3 Kendall’s Coefficient of Concordance
Kendall's coefficient of concordance is closely related to the Friedman statistic, so we can calculate the coefficient of concordance once we obtain the Friedman statistic using SAS.
Example: data election;
input judge $ candidate $ candrank @@;
datalines;
a a 1 a b 6 a c 3 a d 2 a e 4 a f 5
b a 1 b b 5 b c 6 b d 4 b e 2 b f 3
c a 6 c b 3 c c 2 c d 5 c e 4 c f 1
;
proc freq data=election;
tables judge*candidate*candrank
/cmh2 scores=rank noprint;
run;
Summary Statistics for candidate by candrank
Controlling for judge
Cochran-Mantel-Haenszel Statistics (Based on Rank Scores)
Statistic Alternative Hypothesis DF Value Prob
1 Nonzero Correlation 1 0.0667 0.7963
2 Row Mean Scores Differ 5 2.0476 0.8425
Total Sample Size = 18
59
Resampling Methods
“Resampling” is generating the sampling distribution by drawing repeated random samples from the observed sample itself. 1
This is useful for assessing the accuracies
(e.g. the bias and standard error) of complex statistics.
• Permutation Test
• Bootstrap Method
• Jackknife Method
60
Permutation Test
Developed by R.A. Fisher (1890-1962) and E.J.G. Pitman (1897-1993) in the 1930s. 2
Draws SRS (Simple Random Samples) without replacement
Tests whether two samples X and Y , of size n1 and n2 respectively, are drawn from the same common distribution.
Hypotheses: Ho: Differences between the samples are due to chance.
Ha1: Y tends to have greater values than X , not simply due to chance
Ha2: Y tends to have smaller values than X , not simply due to chance
Ha3: There are differences between X and Y , not due to chance.
This method may be used to compare many different test statistics. To illustrate the method, however, let us consider the permutation test based on the difference between the sample averages, d = ȳ - x̄.
61
Permutation Test: Methodology
1. Pool the samples into one group (of size n1 + n2).
2. List all C(n1 + n2, n1) possible regroupings of the observations into two groups of size n1 and n2.
3. For each possible regrouping i, compute the sample averages x̄i and ȳi, and then compute the difference di = ȳi - x̄i.
4. To assess how "unusual" the original observed difference d = ȳ - x̄ is, compute a p-value (a proportion) as follows:
   For Ha1: p-value = (# of times di ≥ d) / C(n1 + n2, n1)
   For Ha2: p-value = (# of times di ≤ d) / C(n1 + n2, n1)
   For Ha3: p-value = (# of times |di| ≥ |d|) / C(n1 + n2, n1)
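For a data set as small as the capacitor subset used in the SAS examples below (3 vs. 3), the enumeration is fully feasible; this Python sketch (our own, using the difference of means) computes the exact lower-tailed p-value, which is consistent with the PROC MULTTEST permutation estimate of 0.1474 based on 25000 resamples:

```python
from itertools import combinations

def perm_pvalue(x, y):
    """Exact permutation p-value for Ha: y tends to have smaller values than x,
    using d = mean(y) - mean(x) as the test statistic."""
    pooled = x + y
    n1, n2 = len(x), len(y)
    total = sum(pooled)
    d_obs = sum(y) / n2 - sum(x) / n1
    hits = count = 0
    for idx in combinations(range(n1 + n2), n2):   # every regrouping
        s = sum(pooled[i] for i in idx)            # candidate "y" total
        d_i = s / n2 - (total - s) / n1
        count += 1
        if d_i <= d_obs + 1e-9:                    # tolerance for float ties
            hits += 1
    return hits / count

control = [17.9, 23.7, 29.8]
stressed = [15.2, 18.3, 21.1]
print(perm_pvalue(control, stressed))  # 0.15
```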
62
Bootstrap Method
Introduced by B. Efron (1938- ) in the late 1970s.
Draws a very large number of SRS with replacement (note the difference from the permutation test).
A heavily computer-based method of deriving robust estimates of the standard errors of sample statistics.
63
Jackknife Method 1
First implemented by R.E. von Mises 2 (1883-1953), then developed (separately) by Tukey (1915-2000) and Quenouille in the 1950s.
Resamples by deleting one observation at a time.
This method is also useful for estimating the standard error of a statistic t = t(x1, x2, ..., xn) based on a random sample of size n drawn from some distribution F.
First, calculate the n leave-one-out values of the statistic,
   t*i = t(x1, ..., x_{i-1}, x_{i+1}, ..., xn),  i = 1, ..., n.
Let t̄* = (1/n) Σ_{i=1}^n t*i, and let s*_t be the standard deviation of t*1, t*2, ..., t*n.
The jackknife estimate of SE(t) is given by
   jse(t) = sqrt(((n - 1)/n) Σ_{i=1}^n (t*i - t̄*)²) = ((n - 1)/sqrt(n)) s*_t.
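The leave-one-out recipe translates directly into code; here is a minimal Python sketch (our own helper, applied to the six pooled capacitor failure times). For the sample mean, the jackknife SE reduces exactly to the familiar s/sqrt(n):

```python
from math import sqrt

def jackknife_se(data, stat):
    """Jackknife SE: recompute the statistic with each observation deleted once."""
    n = len(data)
    t_star = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    t_bar = sum(t_star) / n
    return sqrt((n - 1) / n * sum((t - t_bar) ** 2 for t in t_star))

def mean(v):
    return sum(v) / len(v)

sample = [17.9, 23.7, 29.8, 15.2, 18.3, 21.1]
print(round(jackknife_se(sample, mean), 3))  # 2.124, which equals s/sqrt(n)
```

The same function works unchanged for statistics with no closed-form SE (e.g. a trimmed mean), which is where the jackknife earns its keep.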
64
14.6 Resampling Methods
SAS can be used to perform permutation, bootstrap, and jackknife resampling.
For the most part macros are required. These can be written
and are also readily available on the web.
PROC MULTTEST can be used to perform several tests incorporating permutation or bootstrap resampling.
In the following two examples, we use permutation and bootstrap resampling to obtain t-test p-value adjustments.
65
14.6 Resampling Methods
Example 14.15: permutation test

data capacitor;
  input group $ failtime @@;
  datalines;
control 17.9 control 23.7 control 29.8
stressed 15.2 stressed 18.3 stressed 21.1
;
proc multtest data=capacitor permutation nsample=25000
              out=results outsamp=samp;
  test mean(failtime / lower);
  class group;
  contrast 'a vs b' -1 1;
run;

proc print data=samp(obs=18); run;
proc print data=results; run;

The PERMUTATION option in the PROC MULTTEST statement requests permutation resampling, and NSAMPLE=25000 requests 25000 permutation samples. The OUTSAMP=SAMP option creates an output SAS data set containing the permutation samples.
The TEST statement specifies the t-test. The test is lower-tailed. The grouping variable in the CLASS statement is group, and the coefficients across the groups are -1 and 1, as specified in the CONTRAST statement. (See Chapter 12.)
PROC PRINT displays the first 18 observations of the samp data set containing the permutation samples.
66
14.6 Resampling Methods
Obs _sample_ _class_ _obs_ Failtime
1 1 control 6 21.1
2 1 control 5 18.3
3 1 control 3 29.8
4 1 stressed 2 23.7
5 1 stressed 1 17.9
6 1 stressed 4 15.2
7 2 control 5 18.3
8 2 control 2 23.7
9 2 control 6 21.1
10 2 stressed 3 29.8
11 2 stressed 4 15.2
12 2 stressed 1 17.9
13 3 control 2 23.7
14 3 control 1 17.9
15 3 control 6 21.1
16 3 stressed 4 15.2
17 3 stressed 3 29.8
18 3 stressed 5 18.3
Model Information
Test for continuous variables Mean t-test
Tails for continuous tests Lower-tailed
Strata weights None
P-value adjustment Permutation
Center continuous variables No
Number of resamples 25000
Seed 356405001
Contrast Coefficients
Contrast
group
control stressed
a vs b -1 1
Continuous Variable Tabulations

Variable  group     NumObs  Mean     Standard Deviation
failtime  control   3       23.8000  5.9506
failtime  stressed  3       18.2000  2.9513

p-Values

Variable  Contrast  Raw     Permutation
failtime  a vs b    0.1090  0.1474
67
14.6 Resampling Methods
Example 14.17: bootstrap test

data capacitor;
  input group $ failtime @@;
  datalines;
control 17.9 control 23.7 control 29.8
stressed 15.2 stressed 18.3 stressed 21.1
;
proc multtest data=capacitor bootstrap nsample=25
              outsamp=res nocenter out=outboot;
  test mean(failtime / lower);
  class group;
  contrast 'a vs b' -1 1;
run;

proc print data=res(obs=18); run;
proc print data=outboot; run;
The BOOTSTRAP option in the PROC MULTTEST statement requests bootstrap resampling, and NSAMPLE=25 requests 25 bootstrap samples. The OUTSAMP=RES option creates an output SAS data set containing the 25 bootstrap samples.
The TEST statement specifies the t-test. The test is lower-tailed. The grouping variable in the CLASS statement is group, and the coefficients across the groups are -1 and 1, as specified in the CONTRAST statement. (See Chapter 12.)
PROC PRINT displays the first 18 observations of the Res data set containing the bootstrap samples.
68
14.6 Resampling Methods
Obs _sample_ _class_ _obs_ failtime
1 1 control 6 21.1
2 1 control 6 21.1
3 1 control 6 21.1
4 1 stressed 2 23.7
5 1 stressed 1 17.9
6 1 stressed 6 21.1
7 2 control 4 15.2
8 2 control 6 21.1
9 2 control 2 23.7
10 2 stressed 1 17.9
11 2 stressed 3 29.8
12 2 stressed 6 21.1
13 3 control 2 23.7
14 3 control 4 15.2
15 3 control 3 29.8
16 3 stressed 3 29.8
17 3 stressed 3 29.8
18 3 stressed 1 17.9
Model Information
Test for continuous variables Mean t-test
Tails for continuous tests Lower-tailed
Strata weights None
P-value adjustment Bootstrap
Center continuous variables No
Number of resamples 25
Seed 270752001
Contrast Coefficients
Contrast
group
control stressed
a vs b -1 1
Continuous Variable Tabulations

Variable  group     NumObs  Mean     Standard Deviation
failtime  control   3       23.8000  5.9506
failtime  stressed  3       18.2000  2.9513

p-Values

Variable  Contrast  Raw     Bootstrap
failtime  a vs b    0.1090  0.0400
69
Works Cited
1. Tamhane, Ajit and Dorothy Dunlop. Statistics and Data Analysis. Upper Saddle River, NJ: Prentice Hall, Inc., 2000.
2. "Resampling (statistics)." Wikipedia. <http://en.wikipedia.org/wiki/Resampling_(statistics)>. 2007.
3. "Ch. 14: Nonparametric Statistical Methods." Group project, Wei Zhu, instructor. 2006.