Post on 23-Jun-2018
Wilcoxon Test and Calculating Sample Sizes
Dan Spencer
UC Santa Cruz
Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33
Differences in the Means of Two Independent Groups
When using the t, t′ or tp test statistics, we assume that the responses in both groups are normally distributed
What if they are not normally distributed?
- If n1 and n2 are large enough, it is still okay to use the t-distribution
- However, if n1 and n2 are small, this is a problem
This non-normality sometimes occurs in animal studies
Wilcoxon Rank-Sum Test
Sometimes called the Mann-Whitney-Wilcoxon test, the Mann-Whitney U test, or the Wilcoxon-Mann-Whitney test
Test to see if the location of the responses between the groups is different
Interpreted as a test for a difference in medians
An example of a nonparametric test, as it does not test about parameters in an assumed distribution
Wilcoxon Rank-Sum: Assumptions
Responses are either continuous or ordinal
Observations from both groups are independent
The shape and spread of the response in the two different populations is the same, but not necessarily normal
t-Test Group Density Assumption
[Figure: Density Assumption for t-Tests. Density versus Values for Groups 1 and 2.]
Wilcoxon Group Density Assumption
[Figure: Wilcoxon Density Assumption. Density versus Values for Group1 and Group2.]
Wilcoxon Rank-Sum: Hypotheses
Null Hypothesis (H0): The probability of a randomly-selected response from the first population exceeding that of a randomly-selected response from the second population is equal to 0.5
- A slightly stronger hypothesis is that the distributions are equal in terms of location
- This hypothesis implies the above null hypothesis
Alternative Hypothesis (H1): The probability of a randomly-selected response from the first population exceeding that of a randomly-selected response from the second population is:
- Not equal to 0.5
- Greater than 0.5
- Less than 0.5
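As a rough illustration of the quantity these hypotheses refer to, the probability can be estimated from two samples as the fraction of cross-group pairs in which the first group's response is the larger one. The Python sketch below is not part of the original slides: the function name `prob_first_exceeds` is hypothetical, counting ties as one half is an assumed convention, and the data are the chick weights from the case study in these slides.

```python
from itertools import product

def prob_first_exceeds(first, second):
    """Estimate P(X > Y) as the share of (x, y) pairs with x > y.

    Ties are counted as one half (an assumed convention).
    """
    wins = 0.0
    for x, y in product(first, second):
        if x > y:
            wins += 1.0
        elif x == y:
            wins += 0.5
    return wins / (len(first) * len(second))

# Chick weights (grams) from the case study.
horsebean = [266.84, 264.07, 263.82, 263.47, 264.33, 264.25, 263.22, 263.92]
sunflower = [267.75, 266.02, 266.29, 264.89, 269.24,
             271.63, 264.74, 268.36, 264.99, 263.69]

p_hat = prob_first_exceeds(horsebean, sunflower)
print(p_hat)  # 0.1375, i.e. 11 of the 80 pairs
```

An estimate far from 0.5 suggests that one group tends to give larger responses than the other; the test itself decides whether the departure is larger than chance would explain.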
Case Study: Chick Weights
Newly hatched chicks were separated into two groups
- Sunflower seed diet
- Horsebean seed diet
After six weeks, the weights of the chicks were measured in grams
Case Study: Chick Weights
[Figure: Boxplots of Chick Weights by Feed Type. Weight (grams) versus Feed Type (horsebean, sunflower).]
Case Study: Chick Weights
Both distributions look to be somewhat skewed to the right because they either have a long tail or an outlier (shown as a solitary point)
Sample sizes are small (8 and 10, respectively), so t and t′ are not appropriate here
Hypotheses:
- H0: The distribution of chick weights in the two groups is equal
- H1: The distribution of chick weights is lower for the horsebean group
Wilcoxon Rank-Sum Test Statistic
Combine groups, and rank all responses from smallest to largest
- The ranks number from 1 to n
- n = n1 + n2
If there are ties, the ranks should be averaged
- Values 7, 5, 6, 6
- Their ranks would be 4, 1, 2.5, 2.5
The test statistic T is the sum of the ranks for the group with the smallest sample size
- If n1 = n2, T falls between the two rank sums
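The ranking rule with averaged ties can be sketched in a few lines of Python. This is an illustration rather than the slides' own procedure, and `average_ranks` is a hypothetical helper name; it reproduces the tie example given above.

```python
def average_ranks(values):
    """Rank values from smallest to largest, averaging the ranks of ties."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Find the end of the run of tied values beginning at `start`.
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        # Sorted positions start..end hold ranks start+1..end+1; use their mean.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

print(average_ranks([7, 5, 6, 6]))  # [4.0, 1.0, 2.5, 2.5]
```

The two tied values of 6 would occupy ranks 2 and 3, so each receives the average, 2.5.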
Rank Sums
Horsebean
Weights   Ranks
266.84    14
264.07     6
263.82     4
263.47     2
264.33     8
264.25     7
263.22     1
263.92     5
Sum = 47

Sunflower
Weights   Ranks
267.75    15
266.02    12
266.29    13
264.89    10
269.24    17
271.63    18
264.74     9
268.36    16
264.99    11
263.69     3
Sum = 124
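The rank sums above can be checked with a short Python sketch. It is an illustration only and assumes the data contain no ties (true for these weights), so each rank is simply the position in the pooled, sorted list; the variable names are hypothetical.

```python
# Chick weights (grams) from the two tables.
horsebean = [266.84, 264.07, 263.82, 263.47, 264.33, 264.25, 263.22, 263.92]
sunflower = [267.75, 266.02, 266.29, 264.89, 269.24,
             271.63, 264.74, 268.36, 264.99, 263.69]

# Pool the two groups, remembering which group each weight came from.
pooled = [(w, "horsebean") for w in horsebean]
pooled += [(w, "sunflower") for w in sunflower]
pooled.sort(key=lambda pair: pair[0])

# No ties here, so the rank is the 1-based position in the sorted list.
rank_sums = {"horsebean": 0, "sunflower": 0}
for rank, (_, group) in enumerate(pooled, start=1):
    rank_sums[group] += rank

print(rank_sums)  # {'horsebean': 47, 'sunflower': 124}

# Sanity check: ranks 1..n always add up to n(n + 1)/2, here 171.
n = len(pooled)
assert sum(rank_sums.values()) == n * (n + 1) // 2

# The horsebean group is the smaller one (n1 = 8), so its rank sum is T.
T = rank_sums["horsebean"]
```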
Case Study: Chick Weights
T = 47
Wilcoxon Rank-Sum rejection region values can be found in a table at https://metxstats.soe.ucsc.edu/node/5
Since the research hypothesis is that the horsebean group has a lower-shifted distribution than the sunflower group, reject H0 if T is less than the values in the table when n1 = 8 and n2 = 10
- T is larger than the critical value for α = 0.025, 0.05, and 0.10
- Fail to reject H0 and conclude that distributions are not significantly shifted from one another
Normal Approximation
When both treatment groups are larger than 10, the normal distribution approximates the distribution of the Wilcoxon Rank-Sum test statistic rather well
\[ z = \frac{T - \mu_T}{\sigma_T} \]
\[ \mu_T = \frac{n_1(n_1 + n_2 + 1)}{2} \]
\[ \sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \]
Normal Approximation: Our Example
\[ \mu_T = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{8(8 + 10 + 1)}{2} = 76 \]
\[ \sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(8)(10)(8 + 10 + 1)}{12}} = 11.25463 \]
Normal Approximation: Our Example
\[ z = \frac{47 - 76}{11.25463} = -2.576717 \]
This z-score certainly does fall in the rejection region
P-value ≈ 0.00499
This is a contradictory conclusion!
Use this approximation only when samples are large enough!
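The normal-approximation arithmetic for this example can be reproduced with Python's standard library. This is a sketch under stated assumptions: `rank_sum_z` is a hypothetical helper, it applies no correction for ties or continuity, and the p-value is the one-sided lower-tail probability that matches the research hypothesis.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(T, n1, n2):
    """z-score of the rank sum T for the group of size n1 (no tie correction)."""
    mu_T = n1 * (n1 + n2 + 1) / 2
    sigma_T = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (T - mu_T) / sigma_T

z = rank_sum_z(T=47, n1=8, n2=10)

# Lower-tail probability under the standard normal distribution.
p_lower = NormalDist().cdf(z)

print(round(z, 6), round(p_lower, 5))
```

The printed values should agree with the z-score and approximate P-value shown above, up to rounding.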
Wilcoxon Rank-Sum Test in JMP
Analyze → Fit Y by X
Drag your variables to the appropriate Response and Factor boxes and click OK
Click the red triangle menu → Nonparametric → Wilcoxon Test
Wilcoxon Rank-Sum Test in JMP
JMP calls the test statistic S instead of T
Only the two-sided p-value for the normal approximation is given
- For the one-sided p-value, divide by 2
Sample Size
Researchers aim to present evidence to support their hypotheses about how the world works
Most of the time, this hypothesis aims to show that treatments are significantly different from one another
- Usually, the aim is to reject H0
Ideally, sample sizes would be as big as possible
- However, time and money often limit sample sizes
Power
We want to minimize the chance of failing to reject a false H0
- This chance is often represented by β
An experiment’s power is the chance that a false H0 is correctly rejected
- 1 − β
When the chance of incorrectly rejecting H0 is fixed at some value α, the power of a test can be estimated for different sample sizes
Power: t Distributions
When H0 is true, the test statistic is centered around 0
When H1 is true, the test statistic is proportionally centered at
\[ \Delta^* = \frac{\mu_1 - \mu_2 - D_0}{\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
- For simplicity, the quantity µ1 − µ2 − D0 is represented as ∆
Calculating Power
An experiment where n1 = n2 = 5, σ = 10, and ∆ = 25
α is fixed at 0.05 for the hypotheses
- H0: µ1 − µ2 = 0
- H1: µ1 − µ2 ≠ 0
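For a rough feel for the numbers in this setup, the sketch below computes ∆* and an approximate power. It is an illustration under a loud assumption: the standard normal distribution stands in for the t distributions used in these slides, which tends to overstate power when samples are this small. The function names `delta_star` and `approx_power_two_sided` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def delta_star(delta, sigma, n1, n2):
    """Standardized shift of the test statistic when H1 is true."""
    return delta / (sigma * sqrt(1 / n1 + 1 / n2))

def approx_power_two_sided(delta, sigma, n1, n2, alpha=0.05):
    """Approximate power of a two-sided test.

    Assumption: normal approximation, not the exact t-based calculation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta_star(delta, sigma, n1, n2)
    # Chance that the statistic falls beyond either critical value under H1.
    upper = 1 - std_normal.cdf(z_crit - shift)
    lower = std_normal.cdf(-z_crit - shift)
    return upper + lower

print(delta_star(25, 10, 5, 5))              # roughly 3.95
print(approx_power_two_sided(25, 10, 5, 5))  # roughly 0.98 under this approximation

# A smaller sigma or larger groups both push the H1 curve further from 0.
print(delta_star(25, 8, 5, 5), delta_star(25, 10, 10, 10))
```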
Power Illustrated
[Figure: β, α, and t. Density versus t under hypotheses H0 and H1, with t* marked.]
Changing σ
[Figure: two panels, σ = 10 and σ = 8. Density versus t under hypotheses H0 and H1, with t* marked.]
Changing n
[Figure: two panels, n1 = n2 = 5 and n1 = n2 = 10. Density versus t under hypotheses H0 and H1, with t* marked.]
Maximizing Power
Increase n1 and n2 and decrease experimental error as much as possible
We have previously discussed reducing experimental error by standardizing measurement practices
How do we choose the smallest possible sample size while achieving a fixed α and β?
Calculating n
Fix or estimate
- α - Chance of incorrectly rejecting H0
- β - Chance of incorrectly failing to reject H0
- σ - Estimated population standard deviation
- ∆ - The size of difference that is desirable to detect
One-sided tests for µ1 − µ2:
\[ n_1 = n_2 = \frac{2\sigma^2 (z_{\alpha} + z_{\beta})^2}{\Delta^2} \]
Two-sided tests for µ1 − µ2:
\[ n_1 = n_2 = \frac{2\sigma^2 (z_{\alpha/2} + z_{\beta})^2}{\Delta^2} \]
Calculating n
If |µ1 − µ2 − D0| ≥ ∆, type II error probability ≤ β
Typically, β is chosen to be ≤ 0.2
σ is estimated as s calculated from previous experiments
∆ is set as the minimum difference that is desirable to detect
- A treatment is only preferable if it increases CD4 cell count by 100 or more, so ∆ ≥ 100
Calculating n: Tooth Growth
In a previous lesson, we examined the effects of the source of vitamin C on tooth growth in guinea pigs
Let’s say we want to conduct another study, but this time, we want to be able to detect a true difference of 3 millimeters in tooth length
- We’ll estimate that σ = 7.5, which was our estimate sp
- Fix α = 0.05
- Fix β = 0.20
We’ll assume a two-sided test
Calculating n: Tooth Growth
\[ n_1 = n_2 = \frac{2(7.5^2)(z_{0.05/2} + z_{0.20})^2}{3^2} = \frac{2(7.5^2)(1.959964 + 0.8416212)^2}{3^2} = 98.111 \]
In order to have power = 1 − 0.2 = 0.8, the minimum sample size for each group is 99 guinea pigs
- In the case where a non-integer sample size is found, round up to the next whole number
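The tooth growth calculation can be reproduced with Python's standard library. This is a minimal sketch rather than the JMP procedure; `n_per_group` is a hypothetical helper that implements the one-sided and two-sided formulas from the Calculating n slide.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, beta=0.20, two_sided=True):
    """Unrounded per-group sample size, n1 = n2, from the z-based formula."""
    std_normal = NormalDist()
    # Two-sided tests split alpha across both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(1 - beta)
    return 2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2

raw_n = n_per_group(sigma=7.5, delta=3, alpha=0.05, beta=0.20)

# Always round up, since rounding down would fall short of the target power.
print(round(raw_n, 3), ceil(raw_n))  # 98.111 99
```

Passing `two_sided=False` uses z_α in place of z_α/2 and gives a smaller n for the same α, β, σ, and ∆.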
Calculating Sample Size in JMP
DOE → Sample Size and Power
Two Sample Means
- Enter α
- σ (Std Dev)
- Difference to detect (∆)
- Power (1 − β)
- Continue
Note, small differences may exist due to rounding errors
JMP Output
Notes on JMP
Note that this tool can also be used to evaluate the power of a proposed study
A plot of power versus sample size can also be useful in determining sample size
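Outside JMP, the points of such a power-versus-sample-size curve can be tabulated with a short Python sketch. It assumes the normal approximation behind the sample size formula (JMP's own calculation may differ slightly), uses the tooth growth settings σ = 7.5, ∆ = 3, α = 0.05, and the name `approx_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n, sigma, delta, alpha=0.05):
    """Approximate two-sided power with n subjects in each group.

    Assumption: normal approximation to the test statistic.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta / (sigma * sqrt(2 / n))
    # Chance of landing beyond either critical value when H1 is true.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Tabulate power over a range of per-group sample sizes.
for n in (25, 50, 75, 98, 99, 125, 150):
    power = approx_power(n, sigma=7.5, delta=3)
    print(f"n per group = {n:3d}   approx. power = {power:.3f}")
```

Under this approximation the power first reaches 0.8 at n = 99 per group, in line with the sample size found for the tooth growth study.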