Uses and Abuses of Non-parametric Statistics in Medical ...

Sponsored by the Clinical and Translational Science Institute (CTSI) and the Department of Population Health / Division of Biostatistics Uses and Abuses of Non-parametric Statistics in Medical Research John P Klein, PhD

Transcript of Uses and Abuses of Non-parametric Statistics in Medical ...

Page 1: Uses and Abuses of Non-parametric Statistics in Medical ...

Sponsored by the Clinical and Translational Science Institute (CTSI) and the Department of Population Health / Division of Biostatistics

Uses and Abuses of Non-parametric Statistics in Medical Research

John P Klein, PhD

Page 2: Uses and Abuses of Non-parametric Statistics in Medical ...

5/6/2009 CTSI Biostatistics 2

Speaker Disclosure

In accordance with the ACCME policy on speaker disclosure, the speaker and planners who are in a position to control the educational activity of this program were asked to disclose all relevant financial relationships with any commercial interest to the audience. The speaker and program planners have no relationships to disclose.

Page 3: Uses and Abuses of Non-parametric Statistics in Medical ...

Outline

• Introduction
  – What is a nonparametric test?
  – Motivating example
• Why do we use nonparametric tests
• Why not to use nonparametric tests
• Specific tests for some common situations
  – One sample, paired sample
  – Two independent samples
  – One way layout
  – Two way layout
  – Measures of association
• Concluding remarks

Page 4: Uses and Abuses of Non-parametric Statistics in Medical ...

Parametric Statistics

• Body of statistical methods based on an assumed model for the underlying population from which the data were sampled
• Inference is about some parameter (mean, variance, correlation) of the population
• If the model is incorrect, the inference may be misleading

Page 5: Uses and Abuses of Non-parametric Statistics in Medical ...

Example of Parametric Procedures

• Classical t-test assumes normal populations
• Analysis of variance assumes normal populations
• Logistic regression assumes a binomial population
• Pearson's correlation coefficient assumes a bivariate normal population

Page 6: Uses and Abuses of Non-parametric Statistics in Medical ...

Non Parametric Statistics or Distribution Free Statistics

• Body of statistical methods that relax the assumptions about the underlying population model
• Typically statistics based on ranks or simple counts
• Can be used for both ordinal and nominal data

Page 7: Uses and Abuses of Non-parametric Statistics in Medical ...

Simple Example

• Study of the effects of a tranquilizer on the Hamilton Depression Score
  X: Pre-trial depression score
  Y: Second-visit depression score
  H0: No change in depression score
  HA: Score is lower after treatment

Page 8: Uses and Abuses of Non-parametric Statistics in Medical ...

Patient   Pre Score X   Post Score Y   Change in Score X-Y
   1         1.83           0.80              1.03
   2         0.50           0.65             -0.15
   3         1.62           0.60              1.02
   4         2.48           2.05              0.43
   5         1.68           1.06              0.62
   6         1.88           1.29              0.59
   7         1.55           1.06              0.49
   8         3.16           3.14              0.02
   9         1.30           1.29              0.01

Page 9: Uses and Abuses of Non-parametric Statistics in Medical ...

Data: 1.03 -0.15 1.02 0.43 0.62 0.59 0.49 0.02 0.01

Sign Test

• Count the number of positive differences: s = 8
• If the null hypothesis is true, the number of positive differences is like the number of heads when flipping a fair coin n times, so
  P-value = Pr[S ≥ s | n, 1/2] (one sided)
          = 2 × smaller of {Pr[S ≥ s | n, 1/2], Pr[S ≤ s | n, 1/2]} (two sided)
• P-value = Pr[S ≥ 8 | 9, 1/2] = 10/512 ≈ 0.0195
• Hence the Hamilton score was decreased by the tranquilizer
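The sign test calculation above can be reproduced with a short sketch. It uses only the Python standard library; the function name `sign_test_p_values` is illustrative, and dropping zero differences before counting is an assumed convention that the slide does not address.

```python
from math import comb

def sign_test_p_values(diffs):
    """Return (s, n, one-sided p, two-sided p) for the sign test.

    Assumption: differences equal to zero are dropped before counting.
    """
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    s = sum(1 for d in nonzero if d > 0)          # number of positive differences
    upper = sum(comb(n, k) for k in range(s, n + 1)) / 2**n   # Pr[S >= s | n, 1/2]
    lower = sum(comb(n, k) for k in range(0, s + 1)) / 2**n   # Pr[S <= s | n, 1/2]
    two_sided = min(1.0, 2 * min(upper, lower))
    return s, n, upper, two_sided

# Differences X - Y from the depression score example
diffs = [1.03, -0.15, 1.02, 0.43, 0.62, 0.59, 0.49, 0.02, 0.01]
s, n, p_one, p_two = sign_test_p_values(diffs)
print(s, n, round(p_one, 4), round(p_two, 4))   # 8 9 0.0195 0.0391
```

The one-sided value is 10/512, since only the outcomes S = 8 (9 ways) and S = 9 (1 way) are at least as extreme as the observed count.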

Page 10: Uses and Abuses of Non-parametric Statistics in Medical ...

Wilcoxon Signed Rank Test

Patient   d=X-Y   Rank |d|
   1       1.03      9
   2      -0.15      3
   3       1.02      8
   4       0.43      4
   5       0.62      7
   6       0.59      6
   7       0.49      5
   8       0.02      2
   9       0.01      1

Page 11: Uses and Abuses of Non-parametric Statistics in Medical ...

Wilcoxon Signed Rank Test

• Add up the ranks associated with the positive (or negative) differences: T+ (T-)
• Note that the sum of all ranks is T+ + T- = n(n+1)/2
• Under H0 the + and – values should be mixed, so T+ should be close to its average value n(n+1)/4
• Exact p-values can be found in tables, which were built by enumeration of all possible samples, or by using a large sample approximation:
  Z = (T+ - n(n+1)/4) / {n(n+1)(2n+1)/24}^(1/2)

Page 12: Uses and Abuses of Non-parametric Statistics in Medical ...

Calculations

• n(n+1)/2 = 9×10/2 = 45
• T- = 3, T+ = 45 - 3 = 42
• Exact P-value (one sided): Pr[T+ ≥ 42] = 5/512 ≈ 0.010
• Normal approximation
  – Mean = n(n+1)/4 = 22.5
  – Var = n(n+1)(2n+1)/24 = 71.25
  – Z = (42 - 22.5)/71.25^(1/2) = 2.31, p = 0.0104
• Hence the Hamilton score was decreased by the tranquilizer
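The signed rank calculations can be checked with a minimal sketch like the one below. It assumes no zero differences and no ties among the |d| values (true for this data set), and the function names are illustrative. Direct enumeration of all 2^9 sign patterns gives Pr[T+ ≥ 42] = 5/512 ≈ 0.0098, which may differ slightly from a value read off a printed table.

```python
from itertools import combinations
from math import erf, sqrt

def signed_rank(diffs):
    """Return (T_plus, T_minus); assumes no zeros and no ties in |d|."""
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus

def exact_upper_p(t_plus, n):
    """Pr[T+ >= t_plus] under H0: each subset of ranks 1..n is equally likely to be positive."""
    hits = sum(
        1
        for k in range(n + 1)
        for subset in combinations(range(1, n + 1), k)
        if sum(subset) >= t_plus
    )
    return hits / 2**n

def normal_upper_p(t_plus, n):
    """Large sample approximation: returns (Z, one-sided upper-tail p)."""
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24
    z = (t_plus - mean) / sqrt(var)
    return z, 0.5 * (1 - erf(z / sqrt(2)))

diffs = [1.03, -0.15, 1.02, 0.43, 0.62, 0.59, 0.49, 0.02, 0.01]
t_plus, t_minus = signed_rank(diffs)
p_exact = exact_upper_p(t_plus, len(diffs))
z, p_normal = normal_upper_p(t_plus, len(diffs))
print(t_plus, t_minus, p_exact, round(z, 2), round(p_normal, 4))
```

Enumeration is feasible here because n = 9 gives only 512 sign patterns; for larger n a table, a recursion, or the normal approximation would be the practical choice.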

Page 13: Uses and Abuses of Non-parametric Statistics in Medical ...

Why Nonparametrics?

1. Fewer assumptions
2. Exact p-values for small sample sizes
3. Usually easier to apply. Involves counts and ranks
4. Often easier to understand why the methods work
5. Since it works on ranks, it does not require numerical data but can be performed on ordinal data

Page 14: Uses and Abuses of Non-parametric Statistics in Medical ...

Why Nonparametrics?

6. Provides simple tests for complicated hypotheses
   • Example: k samples
     H0: μ1 = μ2 = . . . = μk versus HA: at least one μj is different
     Parametric: Analysis of Variance (ANOVA)
     Non-parametric: Kruskal-Wallis test

     H0: μ1 = μ2 = . . . = μk versus HA: μ1 < μ2 < . . . < μk
     Parametric: Isotonic regression ??
     Non-parametric: Jonckheere test

Page 15: Uses and Abuses of Non-parametric Statistics in Medical ...

Why Nonparametrics?

7. Exact confidence intervals are available
8. Gives protection against outliers
9. Software for nonparametric methods is available in most packages
10. When software is not available, replace the data by ranks in normal theory software

Page 16: Uses and Abuses of Non-parametric Statistics in Medical ...

Why not Nonparametrics?

1. Loss of power when the parametric model is chosen correctly, i.e., more likely to not reject the null hypothesis when it is false
   • Loss of power depends on the tails of the underlying distribution.
   • Measured by the "asymptotic relative efficiency" e(1, 2):
     e(1, 2) = (sample size for test 2) / (sample size for test 1) needed to achieve equal power
   • If e(1, 2) < 1 then test 2 is more powerful: n2 = e(1, 2) n1

Page 17: Uses and Abuses of Non-parametric Statistics in Medical ...

                              Normal   Uniform   Logistic   Double Exponential   Cauchy
e(sign test, paired t test)    .637     .333      .822            2.00             ∞
e(signed rank, paired t)       .955    1.000     1.097            1.50             ∞
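To make the efficiency table concrete, the sketch below applies the relation n2 = e(1, 2) n1, with test 1 the nonparametric test and test 2 the paired t test. The dictionary `ARE` simply transcribes the table; the function name and the choice of n1 = 100 are illustrative assumptions. Because these efficiencies are asymptotic, the resulting sample sizes are rough guides rather than exact requirements.

```python
import math

# Asymptotic relative efficiencies e(nonparametric test, paired t test), from the table
ARE = {
    "sign test": {
        "Normal": 0.637, "Uniform": 0.333, "Logistic": 0.822,
        "Double Exponential": 2.00, "Cauchy": math.inf,
    },
    "signed rank": {
        "Normal": 0.955, "Uniform": 1.000, "Logistic": 1.097,
        "Double Exponential": 1.50, "Cauchy": math.inf,
    },
}

def paired_t_sample_size(n_nonparametric, e):
    """Approximate paired t sample size matching the power of the
    nonparametric test run on n_nonparametric pairs: n2 = e(1, 2) * n1."""
    if math.isinf(e):
        return math.inf   # no finite t-test sample size matches the power
    return math.ceil(e * n_nonparametric)

for test, row in ARE.items():
    for dist, e in row.items():
        print(f"{test:12s} {dist:20s} n_t = {paired_t_sample_size(100, e)}")
```

For example, under normality a sign test on 100 pairs has roughly the power of a paired t test on 64 pairs, while for double exponential data the t test would need about 200 pairs to match it.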

Page 18: Uses and Abuses of Non-parametric Statistics in Medical ...

Why Not Nonparametrics?

2. For large samples the Central Limit Theorem says the mean is approximately normally distributed so normal theory tests such as t-test, ANOVA which compare means are appropriate for any underlying distribution.

Page 19: Uses and Abuses of Non-parametric Statistics in Medical ...

One or Paired sample tests

• Location
  – Parametric test: t-test
  – Nonparametric tests
    • Sign test (no assumptions)
    • Wilcoxon signed rank test
      – Works best for symmetric populations
• Confidence intervals based on the Hodges-Lehmann statistic

Page 20: Uses and Abuses of Non-parametric Statistics in Medical ...

Two Independent Samples: Location Tests

• Parametric tests (assume normality)
  – t-test (assumes equal variances)
  – t-test (Satterthwaite's approximation)
• Nonparametric tests
  – Mann-Whitney-Wilcoxon test
    • Looks at the sum of the ranks of the observations from the 1st sample in the pooled sample
  – Linear rank tests Σ φ(Ri), where Ri is the rank of the ith observation in sample 1 in the pooled sample (e.g., log rank test, normal scores test, etc.)
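As a rough illustration of the rank sum idea, the sketch below ranks the pooled sample (averaging ranks over ties, a common convention assumed here) and sums the scores of the first sample. The two samples are made-up numbers, and the function names are illustrative. Passing a different `score` function gives a general linear rank statistic Σ φ(Ri).

```python
def midranks(values):
    """Ranks 1..N in the pooled sample; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(sample1, sample2, score=lambda r: r):
    """Sum of score(R_i) over sample 1, with R_i the rank in the pooled sample."""
    ranks = midranks(list(sample1) + list(sample2))
    return sum(score(r) for r in ranks[:len(sample1)])

# Hypothetical data, for illustration only
sample1 = [1.1, 2.3, 3.5, 4.0]
sample2 = [0.5, 1.9, 2.0]

W = rank_sum(sample1, sample2)                      # Wilcoxon rank sum
U = W - len(sample1) * (len(sample1) + 1) / 2       # equivalent Mann-Whitney count
print(W, U)   # 20.0 10.0
```

Here sample 1 holds pooled ranks 2, 5, 6 and 7, so W = 20; the p-value would then come from tables or a normal approximation, which this sketch does not compute.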

Page 21: Uses and Abuses of Non-parametric Statistics in Medical ...

Two Independent Samples: Tests for Equal Variance

• Parametric test (assumes normality)
  – F test
• Nonparametric tests
  – Ansari-Bradley test (assumes equal medians)
    1. Rank data in pooled sample
    2. Assign scores 1 2 3 4 5 5 4 3 2 1 (n=10)
    3. Add up scores for 1st sample and reject if too small or too large
  – Miller's jackknife test if unequal medians
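The three Ansari-Bradley steps can be sketched as follows. The score for pooled rank r is taken as min(r, n + 1 - r), which reproduces the pattern 1 2 3 4 5 5 4 3 2 1 for n = 10. The data are hypothetical, ties are assumed absent, and the function names are illustrative; the rejection threshold is left to tables or software.

```python
def ansari_bradley_scores(n):
    """Scores for pooled ranks 1..n: small in both tails, large in the middle."""
    return [min(r, n + 1 - r) for r in range(1, n + 1)]

def ansari_bradley_statistic(sample1, sample2):
    """Sum of the scores attached to sample 1 (assumes no ties in the pooled sample)."""
    pooled = list(sample1) + list(sample2)
    n = len(pooled)
    order = sorted(range(n), key=lambda i: pooled[i])   # step 1: rank the pooled data
    scores = [0] * n
    for r, i in enumerate(order, start=1):
        scores[i] = min(r, n + 1 - r)                   # step 2: assign scores
    return sum(scores[:len(sample1)])                   # step 3: add up sample 1 scores

# Hypothetical data: sample1 is tightly clustered, sample2 is spread out
sample1 = [4.8, 5.1, 5.0, 4.9, 5.2]
sample2 = [3.0, 7.1, 4.0, 6.5, 2.2]

print(ansari_bradley_scores(10))                   # [1, 2, 3, 4, 5, 5, 4, 3, 2, 1]
print(ansari_bradley_statistic(sample1, sample2))  # 21
```

All scores add to 30 here, so a value of 21 for sample 1 means its observations sit mostly in the middle of the pooled sample, which points toward a smaller spread in sample 1.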

Page 22: Uses and Abuses of Non-parametric Statistics in Medical ...

Two Independent Samples: Omnibus Tests

• Empirical distribution function: Fn(x) = [# data points ≤ x]/n
• Kolmogorov-Smirnov test: max |Fn(x) - Gn(x)|
• Cramer-von Mises test: area between Fn(x) and Gn(x)
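A minimal sketch of the empirical distribution function and the two-sample Kolmogorov-Smirnov statistic follows. It uses the absolute difference between the two empirical distribution functions (the usual two-sided form, assumed here), evaluated at the observed data points, which is where the maximum must occur. The samples are made up and the function names are illustrative.

```python
def ecdf(sample):
    """Return the empirical distribution function F_n(x) = (# data points <= x) / n."""
    data = sorted(sample)
    n = len(data)

    def F(x):
        return sum(1 for v in data if v <= x) / n

    return F

def ks_statistic(sample1, sample2):
    """Two-sample Kolmogorov-Smirnov statistic: max over x of |F_n(x) - G_n(x)|."""
    F, G = ecdf(sample1), ecdf(sample2)
    points = sorted(set(sample1) | set(sample2))   # both step functions jump only here
    return max(abs(F(x) - G(x)) for x in points)

# Hypothetical data, for illustration only
a = [1, 2, 3, 4]
b = [3, 4, 5, 6]

print(ecdf(a)(2.5))        # 0.5
print(ks_statistic(a, b))  # 0.5
```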

Page 23: Uses and Abuses of Non-parametric Statistics in Medical ...

One Way Layout: k = 3 or more independent groups

• Parametric (assumes normality, equal variances): Analysis of Variance for case 1
• Nonparametric model Xij = θ + τj + Eij, i = 1, …, nj, j = 1, …, k
  – Case 1 HA: [τ1, . . ., τk not all equal]
    • Kruskal-Wallis test
  – Case 2 HA: [τ1 ≤ τ2 ≤ . . . ≤ τk with at least one strict inequality]
    • Jonckheere-Terpstra test
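The slide names the two tests without giving their statistics. As a hedged sketch, the code below uses the standard textbook forms: the Kruskal-Wallis statistic H = 12/(N(N+1)) Σ Rj²/nj - 3(N+1), with Rj the rank sum of group j in the pooled sample, and the Jonckheere-Terpstra statistic J, the number of pairs (x from group u, y from group v, u < v) with x < y. No tie correction is applied to H, the data are hypothetical, and the function names are illustrative.

```python
def pooled_ranks(groups):
    """Rank all observations together (assumes no ties); ranks are grouped like the input."""
    pooled = sorted((x, g) for g, grp in enumerate(groups) for x in grp)
    ranks = [[] for _ in groups]
    for r, (_, g) in enumerate(pooled, start=1):
        ranks[g].append(r)
    return ranks

def kruskal_wallis(groups):
    """H = 12 / (N (N + 1)) * sum_j R_j^2 / n_j - 3 (N + 1), without tie correction."""
    ranks = pooled_ranks(groups)
    N = sum(len(r) for r in ranks)
    total = sum(sum(r) ** 2 / len(r) for r in ranks)
    return 12 * total / (N * (N + 1)) - 3 * (N + 1)

def jonckheere(groups):
    """J = number of pairs (x in group u, y in group v, u < v) with x < y; ties count 1/2."""
    J = 0.0
    for u in range(len(groups)):
        for v in range(u + 1, len(groups)):
            for x in groups[u]:
                for y in groups[v]:
                    J += 1.0 if x < y else 0.5 if x == y else 0.0
    return J

# Hypothetical data with a clear increasing trend across the three groups
groups = [[1.2, 2.1, 2.9], [3.4, 4.0, 4.4], [5.1, 6.3, 7.0]]

H = kruskal_wallis(groups)
J = jonckheere(groups)
print(round(H, 2), J)   # 7.2 27.0
```

With perfectly ordered groups, J reaches its maximum of 27 (every one of the 3×3 pairs in each of the three group comparisons is in order), which is the pattern the ordered alternative in Case 2 is designed to detect.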

Page 24: Uses and Abuses of Non-parametric Statistics in Medical ...

One Way Layout: k = 3 or more independent groups

• Nonparametric model Xij = θ + τj + Eij, i = 1, …, nj, j = 1, …, k
  – Case 3 HA: [τ1 ≤ τ2 ≤ . . . ≤ τp-1 ≤ τp ≥ τp+1 ≥ . . . ≥ τk with at least one strict inequality] (umbrella alternative with peak at p)
    • Mack-Wolfe test

Page 25: Uses and Abuses of Non-parametric Statistics in Medical ...

Two-Way Layout

Xijt = θ + βi + τj + Eijt
  i = 1, …, n subjects (blocks)
  j = 1, …, k treatments
  t = 1, …, cij replicates

• Parametric test: ANOVA
• Nonparametric test
  – Friedman's test (HA: τ's not equal)
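The slide does not spell out Friedman's statistic. The sketch below assumes the common special case of one observation per subject-treatment cell (cij = 1) and no ties within a subject, and uses the standard form S = 12/(n k (k+1)) Σ Rj² - 3 n (k+1), where Rj is the sum over subjects of the within-subject rank of treatment j. The data and the function name are hypothetical.

```python
def friedman_statistic(blocks):
    """blocks[i][j] = response of subject i under treatment j.

    Assumes one observation per cell and no ties within a subject.
    """
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        order = sorted(range(k), key=lambda j: row[j])   # rank within the block
        for r, j in enumerate(order, start=1):
            rank_sums[j] += r
    return 12 * sum(R * R for R in rank_sums) / (n * k * (k + 1)) - 3 * n * (k + 1)

# Hypothetical data: 3 subjects (rows) by 3 treatments (columns),
# with every subject ordering the treatments the same way
blocks = [
    [1.0, 2.0, 3.0],
    [0.5, 1.5, 2.5],
    [2.0, 2.2, 2.9],
]
print(friedman_statistic(blocks))   # 6.0
```

Ranking within each subject removes the block effects βi, so only the treatment ordering contributes to the statistic.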

Page 26: Uses and Abuses of Non-parametric Statistics in Medical ...

Measures of Association

• Parametric: Pearson's product moment correlation coefficient, R, with -1 ≤ R ≤ +1
  – Measures strength of linear association when the data are bivariate normal
  – When the marginals are not normal you can have perfect association and an R ≠ ±1

Page 27: Uses and Abuses of Non-parametric Statistics in Medical ...

Measures of Association: Nonparametric

• Spearman's rho: usual correlation coefficient using ranks
• Kendall's tau
  Pairs (Xi, Yi), (Xj, Yj) are concordant if (Xi - Xj)(Yi - Yj) > 0; discordant if (Xi - Xj)(Yi - Yj) < 0
  Tau = (# concordant pairs - # discordant pairs) / (n(n-1)/2)
• Both measures are
  – Between -1 and +1
  – Equal to 0 if X and Y are independent
  – Equal to ±1 if Y = a + bX
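The two rank-based measures can be sketched in a few lines. The code takes a pair as concordant when (Xi - Xj)(Yi - Yj) > 0, computes Spearman's rho as the ordinary correlation of the ranks, and assumes no ties. The data are a made-up monotone but nonlinear relationship (Y = X³), chosen to show a case where both rank measures equal 1 while Pearson's R does not. Function names are illustrative.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Ordinary product moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for r, i in enumerate(order, start=1):
        out[i] = r
    return out

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """(# concordant pairs - # discordant pairs) / (n (n - 1) / 2)."""
    n = len(x)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical data: perfectly monotone, but not linear
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print(round(pearson(x, y), 3), spearman_rho(x, y), kendall_tau(x, y))
```

Both rank measures depend only on the ordering of the observations, so any strictly increasing transformation of Y leaves them unchanged, whereas Pearson's R for this data is about 0.94.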

Page 28: Uses and Abuses of Non-parametric Statistics in Medical ...

Remarks

• Most of these methods are in most software packages. One particularly good package is MINITAB
• A good reference is Hollander and Wolfe, Nonparametric Statistical Methods, Wiley, 1999

Page 29: Uses and Abuses of Non-parametric Statistics in Medical ...

Summary

• Nonparametric statistics provide a viable alternative to normal theory statistics
• Nonparametric statistics are particularly useful when
  – The sample size is small
  – You want protection from outliers
  – You have ordinal data

Page 30: Uses and Abuses of Non-parametric Statistics in Medical ...

Summary

• Nonparametric statistics are particularly useful when
  – You are modeling association between variables
  – You are doing tests with ordered or umbrella alternatives
  – You have survival data

Page 31: Uses and Abuses of Non-parametric Statistics in Medical ...

Resources

• The Clinical and Translational Science Institute (CTSI) supports education, collaboration, and research in clinical and translational science: www.ctsi.mcw.edu
• The Biostatistics Consulting Service provides comprehensive statistical support: http://www.mcw.edu/biostatsconsult.htm

Page 32: Uses and Abuses of Non-parametric Statistics in Medical ...

Free drop-in consulting

• MCW/Froedtert/CHW:
  – Monday, Wednesday, Friday 1 – 3 PM @ CTSI Administrative offices (LL772A)
  – Tuesday, Thursday 1 – 3 PM @ Health Research Center, H2400
• VA: 1st and 3rd Monday, 8:30-11:30 am
  – VA Medical Center, Building 70, Room D-21
• Marquette: 2nd and 4th Monday, 8:30-11:30 am
  – Olin Engineering Building, Room 338D