STAT 135 Lab 8 Hypothesis Testing Review, Mann-Whitney Test by Normal Approximation, and Wilcoxon Signed Rank Test. Rebecca Barter March 30, 2015



Mann-Whitney Test


Mann-Whitney Test

Recall that the Mann-Whitney test is a test for the difference between two independent populations, and can be conducted as follows:

1. Concatenate the Xi and Yj into a single vector Z

2. Let n1 be the sample size of the smaller sample

3. Compute R = sum of the ranks of the smaller sample in Z

4. Compute R′ = n1(m + n + 1) − R

5. Compute R∗ = min(R, R′)

6. Compare the value of R∗ to critical values in a table: if the value is less than or equal to the tabulated value, reject the null that F = G
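Steps 1 to 5 can be sketched in a few lines of Python. This is only an illustration: the helper names `ranks` and `rank_sum_statistic` are made up here, tied values are given average ranks (an assumption, since the steps above do not say how to treat ties), and the sample values are borrowed from the worked example later in these slides. Step 6 still needs a table of critical values, which is not reproduced.

```python
def ranks(values):
    """Return 1-based ranks of `values`, averaging ranks over ties (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            result[order[k]] = avg
        pos = end + 1
    return result


def rank_sum_statistic(x, y):
    """Return (R, R_prime, R_star) following steps 1-5 of the procedure."""
    smaller, larger = (x, y) if len(x) <= len(y) else (y, x)
    z = list(smaller) + list(larger)       # step 1: concatenate into Z
    n1 = len(smaller)                      # step 2: size of the smaller sample
    r = sum(ranks(z)[:n1])                 # step 3: rank sum of the smaller sample
    r_prime = n1 * (len(z) + 1) - r        # step 4: n1(m + n + 1) - R
    return r, r_prime, min(r, r_prime)     # step 5: R* = min(R, R')


x = [6.2, 3.7, 4.7, 1.3]
y = [1.2, 0.8, 1.4, 2.5, 1.1]
R, R_prime, R_star = rank_sum_statistic(x, y)
print(R, R_prime, R_star)
```

Here X is the smaller sample, so R is the sum of the ranks of the X values in the combined vector.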


Mann-Whitney test: the normal approximation

The Mann-Whitney test tests the null hypothesis H0 : F = G. This approach is based on the fact that when H0 is true, we should have

$$\pi = P(X < Y) = \frac{1}{2}$$

i.e. that it is equally likely that X < Y and that Y < X.


Mann-Whitney test: the normal approximation

Under H0 : F = G, we have $\pi = P(X < Y) = \frac{1}{2}$.

Estimate the observed value of π by

- ranking the combined observations, X1, ..., Xn, Y1, ..., Ym, from smallest (rank = 1) to largest (rank = n + m)

- looking at the number of times that we have elements of X that are smaller than elements of Y.

Then our observed proportion is:

$$\hat{\pi} = \frac{\#\{(i, j) : X_i < Y_j\}}{nm}$$

so if we observe a value of π̂ that is significantly different from 1/2, then this provides evidence against H0.


Mann-Whitney test: the normal approximation

Suppose, for example, that we had

X = (6.2, 3.7, 4.7, 1.3)

Y = (1.2, 0.8, 1.4, 2.5, 1.1)

Then our ranks are

rank(X ) = (9, 7, 8, 4)

rank(Y ) = (3, 1, 5, 6, 2)

We only see elements of X being less than elements of Y twice:

X4 = 1.3 < Y3 = 1.4

X4 = 1.3 < Y4 = 2.5

so $\hat{\pi} = \frac{2}{nm} = \frac{2}{4 \times 5} = 0.1$, which is much less than the expected 0.5!
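The pair counting in this example is easy to check by brute force. The snippet below is a minimal sketch that loops over all n × m pairs; the variable names are chosen here for illustration.

```python
x = [6.2, 3.7, 4.7, 1.3]
y = [1.2, 0.8, 1.4, 2.5, 1.1]

# Count the (i, j) pairs for which x_i < y_j.
count = sum(1 for xi in x for yj in y if xi < yj)

# Observed proportion: count divided by the number of pairs, n * m.
pi_hat = count / (len(x) * len(y))
print(count, pi_hat)
```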


Mann-Whitney test: the normal approximation

It turns out that

$$\hat{\pi} = \frac{1}{mn}\left(R' - \frac{m(m+1)}{2}\right)$$

where R′ is the sum of the ranks of the Yj's, n is the sample size for the Xi's and m is the sample size for the Yj's.


Mann-Whitney test: the normal approximation

Thus, for our example, where

X = (6.2, 3.7, 4.7, 1.3) rank(X ) = (9, 7, 8, 4)

Y = (1.2, 0.8, 1.4, 2.5, 1.1) rank(Y ) = (3, 1, 5, 6, 2)

we can verify the given formula:

$$\hat{\pi} = \frac{1}{mn}\left(R' - \frac{m(m+1)}{2}\right) = \frac{1}{4 \times 5}\left((3 + 1 + 5 + 6 + 2) - \frac{5 \times 6}{2}\right) = 0.1$$

which is the same as we got before!


Mann-Whitney test: the normal approximation

Thus the Mann-Whitney test can be constructed as follows:

Define our test statistic to be

$$U_Y = R' - \frac{m(m+1)}{2}$$

and recall that $U_Y = nm\hat{\pi}$, where under H0 we have π = 0.5, so we expect π̂ to be close to 0.5.

- To calculate an exact p-value for the test, we would need to know the distribution of UY under H0.

- The exact distribution can be tabulated for small samples (as in the table lookup described earlier), but for larger samples it is easier to use the normal approximation to compute an approximate p-value.


Mann-Whitney test: the normal approximation

Under H0 : F = G , we have that

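$$E_{H_0}(U_Y) = \frac{nm}{2}, \qquad \mathrm{Var}_{H_0}(U_Y) = \frac{nm(n + m + 1)}{12}$$

and it is known that, asymptotically, under H0, the standardized statistic tends to a normal distribution:

$$\frac{U_Y - E_{H_0}(U_Y)}{\sqrt{\mathrm{Var}_{H_0}(U_Y)}} \overset{\text{approx.}}{\sim} N(0, 1)$$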
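As a rough illustration, the normal approximation can be applied to the small X, Y example from the earlier slides. The sketch below assumes there are no ties, reports a two-sided p-value (the slides do not fix an alternative, so that choice is an assumption), and uses helper names invented here. With samples this small the approximation is crude, so the number is illustrative only.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mann_whitney_normal_approx(x, y):
    """Return (U_Y, z, two-sided p) via the normal approximation; assumes no ties."""
    n, m = len(x), len(y)
    combined = list(x) + list(y)
    # Rank of each y_j in the combined sample = 1 + number of smaller values.
    r_prime = sum(1 + sum(v < yj for v in combined) for yj in y)
    u_y = r_prime - m * (m + 1) / 2
    mean = n * m / 2                       # E(U_Y) under H0
    var = n * m * (n + m + 1) / 12         # Var(U_Y) under H0
    z = (u_y - mean) / math.sqrt(var)
    p_two_sided = 2 * (1 - normal_cdf(abs(z)))
    return u_y, z, p_two_sided


x = [6.2, 3.7, 4.7, 1.3]
y = [1.2, 0.8, 1.4, 2.5, 1.1]
u_y, z, p = mann_whitney_normal_approx(x, y)
print(u_y, round(z, 3), round(p, 3))
```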


Exercise


Exercise: Mann-Whitney

Researchers have asked several smokers how many cigarettes they had smoked in the previous day. The data is as follows

Women Men

4 27 3

20 521 6

816

Why might we prefer to use a non-parametric (Mann-Whitney) test rather than a t-test? Is there enough evidence to conclude that there is a difference between the number of cigarettes smoked per day between the sexes?


Hypothesis testing for comparing paired samples


Comparing samples from paired populations

So far we have focused on testing hypotheses for

- a parameter from a single population (H0 : µ = 2)

- comparing a parameter from two different, but arbitrary and independent, populations (H0 : µ1 = µ2)

We now want to compare a parameter from paired populations.

e.g. the (Xi ,Yi )’s might be independent pairs of twins.


Comparing samples from paired populations

We can treat paired tests as a one-sample problem: if we assume that Di = Xi − Yi has true mean µD, we are testing

$$H_0 : \mu_D = 0$$

versus one of $H_1 : \mu_D > 0$, $H_1 : \mu_D < 0$, or $H_1 : \mu_D \neq 0$

by observing whether the sample mean of the differences, $\bar{D}$, is far from 0.

Can you see why this is different to observing $\bar{X} - \bar{Y}$ for two independent samples?


Hypothesis testing for comparing paired samples

The paired Z-test


Comparing samples from paired populations

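If we know the true variance $\sigma_D^2$, then we can use a Z-test:

H0 : µD = 0    H1 : µD < 0

corresponds to p-value:

$$P_{H_0}(\bar{D} \le \bar{d}) = P_{H_0}\left(\frac{\bar{D}}{\sigma_D/\sqrt{n}} \le \frac{\bar{d}}{\sigma_D/\sqrt{n}}\right) \overset{\text{CLT}}{\approx} \Phi\left(\frac{\bar{d}}{\sigma_D/\sqrt{n}}\right)$$

where $\bar{d}$ is the observed value of $\bar{D}$.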


Comparing samples from paired populations

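If we know the true variance $\sigma_D^2$, then we can use a Z-test:

H0 : µD = 0    H1 : µD > 0

corresponds to p-value:

$$P_{H_0}(\bar{D} \ge \bar{d}) = P_{H_0}\left(\frac{\bar{D}}{\sigma_D/\sqrt{n}} \ge \frac{\bar{d}}{\sigma_D/\sqrt{n}}\right) \overset{\text{CLT}}{\approx} 1 - \Phi\left(\frac{\bar{d}}{\sigma_D/\sqrt{n}}\right)$$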


Comparing samples from paired populations

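If we know the true variance $\sigma_D^2$, then we can use a Z-test:

H0 : µD = 0    H1 : µD ≠ 0

corresponds to p-value:

$$P_{H_0}(|\bar{D}| \ge |\bar{d}|) = 2 P_{H_0}\left(\frac{\bar{D}}{\sigma_D/\sqrt{n}} \ge \left|\frac{\bar{d}}{\sigma_D/\sqrt{n}}\right|\right) \overset{\text{CLT}}{\approx} 2\left(1 - \Phi\left(\left|\frac{\bar{d}}{\sigma_D/\sqrt{n}}\right|\right)\right)$$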
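The three paired Z-test p-values can be collected in one small function. This is a sketch rather than a reference implementation: the function name, the `alternative` labels, the differences and the value σD = 6 are all hypothetical and chosen only to make the example run.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def paired_z_pvalue(differences, sigma_d, alternative):
    """Approximate p-value for H0: mu_D = 0 when sigma_D is known."""
    n = len(differences)
    d_bar = sum(differences) / n              # observed mean difference
    z = d_bar / (sigma_d / math.sqrt(n))      # standardized statistic
    if alternative == "less":                 # H1: mu_D < 0
        return normal_cdf(z)
    if alternative == "greater":              # H1: mu_D > 0
        return 1 - normal_cdf(z)
    if alternative == "two-sided":            # H1: mu_D != 0
        return 2 * (1 - normal_cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


diffs = [2, -4, -1, 10]   # hypothetical differences d_1, ..., d_n
sigma_d = 6.0             # hypothetical known standard deviation of D
for alt in ("less", "greater", "two-sided"):
    print(alt, round(paired_z_pvalue(diffs, sigma_d, alt), 3))
```

Note that the one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of the two.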


Hypothesis testing for comparing paired samples

The paired t-test


Comparing samples from paired populations

If we don't know the true variance $\sigma_D^2$, but the data comes from normal distributions, then we can use a t-test:

H0 : µD = 0    H1 : µD < 0

corresponds to p-value:

$$P_{H_0}(\bar{D} \le \bar{d}) = P_{H_0}\left(\frac{\bar{D}}{s_d/\sqrt{n}} \le \frac{\bar{d}}{s_d/\sqrt{n}}\right) = P\left(t_{n-1} \le \frac{\bar{d}}{s_d/\sqrt{n}}\right)$$

where $s_d$ is the sample standard deviation from our observed differences d1, ..., dn,

and similarly for the other alternative hypotheses.
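The paired t statistic itself needs only the mean and sample standard deviation of the differences. The sketch below computes the statistic and its degrees of freedom using Python's `statistics` module; the function name is made up here, the before/after values are the ones used in the Wilcoxon example later in these slides, and taking D = after − before is an assumption about which measurement plays the role of X.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for H0: mu_D = 0, with D = after - before."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = statistics.mean(d)     # observed mean difference
    s_d = statistics.stdev(d)      # sample standard deviation (n - 1 denominator)
    t = d_bar / (s_d / math.sqrt(n))
    return t, n - 1


before = [25, 29, 60, 27]
after = [27, 25, 59, 37]
t_stat, df = paired_t_statistic(before, after)
print(round(t_stat, 4), df)
```

Turning the statistic into a p-value requires the CDF of the t distribution with n − 1 degrees of freedom, which can be read from a table or, assuming SciPy is installed, obtained from `scipy.stats.t.cdf`.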


Hypothesis testing for comparing paired samples

The non-parametric Wilcoxon signed rank test


Comparing samples from paired populations

What if we don't know the true variance, $\sigma_D^2$, and we also don't think our data comes from a normal distribution?

We can use a non-parametric rank test to test the hypothesis. This test is called the Wilcoxon signed rank test, and is the paired sample analog to the Mann-Whitney test.


Wilcoxon signed rank test

Technically, the WSRT is testing whether the Di come from a distribution that is symmetric about 0, but think of this as testing the familiar

H0 : µd = 0

since if there is no difference between the two paired conditions, we expect about half of the Di to be positive and half negative.


Wilcoxon signed rank test

The general procedure is:

- Remove observations with no difference (Di = 0)

- Calculate Ri = the rank of |Di|

- Calculate Wi = sign(Di) × Ri

- Compute the test statistic

$$W_+ = \sum_{i : W_i > 0} W_i$$

If H0 is true, then

$$E(W_+) = \frac{n(n + 1)}{4}, \qquad \mathrm{Var}(W_+) = \frac{n(n + 1)(2n + 1)}{24}$$

so we can use a normal approximation to calculate a p-value.


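Wilcoxon signed rank test

If H0 is true, then

$$E(W_+) = \frac{n(n + 1)}{4}, \qquad \mathrm{Var}(W_+) = \frac{n(n + 1)(2n + 1)}{24}$$

and it turns out that the standardized version of the statistic tends to a normal distribution:

$$\frac{W_+ - E(W_+)}{\sqrt{\mathrm{Var}(W_+)}} \overset{\text{approx.}}{\sim} N(0, 1)$$

so our p-value for the alternative H1 : µd < 0 is

$$P\left(\frac{W_+ - E(W_+)}{\sqrt{\mathrm{Var}(W_+)}} \le \frac{w_+ - E(W_+)}{\sqrt{\mathrm{Var}(W_+)}}\right) \approx \Phi\left(\frac{w_+ - E(W_+)}{\sqrt{\mathrm{Var}(W_+)}}\right)$$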

and similarly for the other alternatives.

Note that w+ is the observed value of our test statistic.


Example: Wilcoxon signed rank test

Suppose we have four pairs of “before” and “after” measurements, and we want to test the hypothesis that the “after” measurements are larger than the “before” measurements. That is

H0 : µd = 0 H1 : µd > 0

Before   After   Difference (D)   |Difference| (|D|)   Rank (R)   Signed Rank (W)
25       27        2                2                  2            2
29       25       -4                4                  3           -3
60       59       -1                1                  1           -1
27       37       10               10                  4            4

Thus our observed test statistic is

w+ = 2 + 4 = 6


Example: Wilcoxon signed rank test

H0 : µd = 0 H1 : µd > 0

Our test statistic is w+ = 6

And

$$E(W_+) = \frac{n(n + 1)}{4} = \frac{4 \times 5}{4} = 5$$

$$\mathrm{Var}(W_+) = \frac{n(n + 1)(2n + 1)}{24} = \frac{4 \times 5 \times 9}{24} = \frac{15}{2}$$

So we can estimate our p-value using the normal approximation

$$p \approx P\left(Z > \frac{6 - 5}{\sqrt{15/2}}\right) = 1 - \Phi(0.365) = 0.358$$

And we thus fail to reject the null hypothesis: there is not enough evidence to show that the “after” measurements are larger.
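The whole calculation for this example can be reproduced with a short script. The sketch below follows the general procedure (drop zero differences, rank the absolute differences, sum the positive signed ranks) and then applies the normal approximation for H1 : µd > 0. It assumes no ties among the |Di|, and the function names are invented for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, Phi(z), written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def wilcoxon_signed_rank(before, after):
    """Return (w_plus, z, p) for H1: mu_d > 0; assumes no tied |D| values."""
    # D = after - before, dropping pairs with no difference.
    d = [a - b for a, b in zip(after, before) if a != b]
    n = len(d)
    # Rank |D| from smallest (rank 1) to largest (rank n).
    order = sorted(range(n), key=lambda i: abs(d[i]))
    rank = [0] * n
    for position, i in enumerate(order, start=1):
        rank[i] = position
    # W+ is the sum of the ranks attached to positive differences.
    w_plus = sum(r for r, di in zip(rank, d) if di > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    z = (w_plus - mean) / math.sqrt(var)
    return w_plus, z, 1 - normal_cdf(z)


before = [25, 29, 60, 27]
after = [27, 25, 59, 37]
w_plus, z, p = wilcoxon_signed_rank(before, after)
print(w_plus, z, p)
```

With only four pairs the normal approximation is rough, so the p-value should be read as indicative rather than exact.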


Exercise


Exercise (Rice Exercise 11.6.22)

An experiment was done to compare two methods of measuring the calcium content of animal feeds. The standard method uses calcium oxalate precipitation followed by titration and is quite time-consuming. A new method using flame photometry is faster. Measurements of the percent calcium content made by each method of 118 routine feed samples are contained in the file calcium.csv. Analyze the data to see if there is any systematic difference between the two methods. Use both parametric and nonparametric tests.