Nonparametric tests
European Molecular Biology Laboratory, Predoc Bioinformatics Course
17th Nov 2009
Tim Massingham, [email protected]
What is a nonparametric test?
Parametric: assume the data come from some family of distribution functions
Normal distribution
• mean
• variance
Gamma distribution
• shape
• scale
etc…
Most traditional tests assume a normal distribution
Non-parametric means that no assumptions are made about the family of the distribution
Generally means just looking at the ranks of the data
[Figure: Gamma distribution with different parameters]
Shape  Scale
1      2
2      2
3      2
5      1
10     0.5
Robustness
A single observation can change the outcome of many tests
Robust tests are resistant to outliers but require more data
200 observations from normal distributions: x ~ normal(0,1), y ~ normal(1,3)
Pearson’s correlation test (parametric)
Correlation = -0.05076632 (p-value = 0.02318)
Correlation = -0.06499109 (p-value = 0.003632)
Correlation = -0.1011426 (p-value = 5.81e-06)
Correlation = 0.1204287 (p-value = 6.539e-08)
Spearman’s correlation test (non-parametric)
Correlation = -0.03822845 (p-value = 0.08742)
Correlation = -0.03966966 (p-value = 0.07604)
Correlation = -0.03966966 (p-value = 0.07604)
Correlation = -0.03667266 (p-value = 0.101)
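The effect of a single outlier on the two correlation tests can be reproduced with a short simulation. This is only a sketch: the seed, the outlier coordinates (50, 150) and the reading of normal(1,3) as mean 1 and standard deviation 3 are assumptions made for illustration, not the values behind the slide.

```r
# Simulated data in the spirit of the slide: x ~ normal(0, 1), y ~ normal(1, 3).
# Assumption: the second parameter is taken to be a standard deviation.
set.seed(1)
x <- rnorm(200, mean = 0, sd = 1)
y <- rnorm(200, mean = 1, sd = 3)

# Add one extreme observation (hypothetical values chosen for illustration).
x_out <- c(x, 50)
y_out <- c(y, 150)

# Pearson (parametric) and Spearman (rank-based) tests, without and with the outlier.
pearson_clean  <- cor.test(x, y, method = "pearson")
pearson_out    <- cor.test(x_out, y_out, method = "pearson")
spearman_clean <- cor.test(x, y, method = "spearman")
spearman_out   <- cor.test(x_out, y_out, method = "spearman")

# How far does a single observation move each estimate?
pearson_change  <- unname(pearson_out$estimate - pearson_clean$estimate)
spearman_change <- unname(spearman_out$estimate - spearman_clean$estimate)
print(c(pearson = pearson_change, spearman = spearman_change))
```

The Spearman estimate only sees the outlier as the largest rank in each variable, so it moves far less than the Pearson estimate.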
Newcomb’s speed of light data
[Figure: Newcomb’s lab (1878); Washington monument (~12 ms later)]
Analysis                              Estimate      95% confidence interval   Width
Standard test of all data             Mean 26.2     23.6 to 28.9              5.3
Newcomb dropped the outlier           Mean 27.3     25.7 to 28.8              3.1
Robust test (Sign test for median)    Median 27.0   26.0 to 28.5              2.5
Efficiency of robust tests
Few results, mostly for large samples
• Using median rather than mean: 50% more data
• Wilcoxon test vs. t-test: 20% more data (no more than)
Asymptotic Relative Efficiency
• Asymptotic ≈ valid for large samples
• Relative efficiency ≈ ratio of variances
Percentage extra data for same tests, from Potvin and Roff (1993) Ecology 74:1617-1628:
5 57 31 15 100 48 25 0 0
-33 0 -17 -27 -21
Negative percentages: the robust test requires less data!
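The "ratio of variance" reading of relative efficiency can be checked by simulation. The sketch below assumes normally distributed data, a sample size of 100 and 2000 replicates (all arbitrary choices), and compares the sampling variance of the median with that of the mean.

```r
# Relative efficiency of the median versus the mean for normal data,
# estimated as a ratio of sampling variances (simulation settings are assumptions).
set.seed(42)
n    <- 100    # observations per simulated sample
reps <- 2000   # number of simulated samples

samples <- matrix(rnorm(n * reps), nrow = reps)   # one sample per row
means   <- rowMeans(samples)
medians <- apply(samples, 1, median)

# A ratio above 1 means the median needs more data for the same precision.
variance_ratio <- var(medians) / var(means)
print(variance_ratio)
```

For normal data the ratio comes out around one and a half, which is in line with the "50% more data" rule of thumb on the slide; for heavy-tailed data the ratio can drop below 1.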
Kolmogorov test
AKA Kolmogorov-Smirnov test
Type of data: continuous
Parametric equivalent: none
Distribution of statistic: exact when no ties in data
Does this data follow a specific distribution?
Are two sets of data from the same distribution?
Test statistic: maximum difference between the two cumulative distribution functions
Kolmogorov test
Why does it work?
The rank difference stays constant under any order-preserving transformation of the data
(stretching and contracting the x axis)
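The invariance can be seen directly in R. In this sketch the two samples, their sizes and the transformations (exp and cubing, both strictly increasing) are arbitrary choices for illustration.

```r
# The Kolmogorov-Smirnov statistic depends only on the ordering of the data,
# so any strictly increasing transformation leaves it unchanged.
set.seed(2)
a <- rnorm(50)
b <- rnorm(60, mean = 0.5)

d_raw  <- unname(ks.test(a, b)$statistic)
d_exp  <- unname(ks.test(exp(a), exp(b))$statistic)   # stretch the right tail
d_cube <- unname(ks.test(a^3, b^3)$statistic)         # stretch both tails

print(c(raw = d_raw, exp = d_exp, cube = d_cube))
```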
Kolmogorov test
Is Studentized expression data normal?
ks.test(stud_logexp, pnorm)
One-sample Kolmogorov-Smirnov test
data: stud_logexp
D = 0.0526, p-value < 2.2e-16
alternative hypothesis: two-sided
Not valid when the null distribution has been fitted to the data, e.g. testing against a normal distribution whose mean and variance were estimated from the same data
For testing whether data is normally distributed or not, the Shapiro-Wilk test is preferred. See shapiro.test in R
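A small sketch of the caveat about fitted parameters, using simulated normal data (mean 5, standard deviation 2, chosen arbitrarily) rather than the expression data from the slides.

```r
# One-sample Kolmogorov-Smirnov test versus Shapiro-Wilk on simulated normal data.
set.seed(3)
x <- rnorm(200, mean = 5, sd = 2)

# Against the standard normal: rejected, because mean and sd differ from (0, 1).
ks_standard <- ks.test(x, "pnorm")

# Against a normal fitted to the same data: the call runs, but the p-value is
# not valid because the null distribution was estimated from x.
ks_fitted <- ks.test(x, "pnorm", mean = mean(x), sd = sd(x))

# Shapiro-Wilk tests normality without needing the mean and variance.
sw <- shapiro.test(x)

print(c(ks_standard = ks_standard$p.value,
        ks_fitted   = ks_fitted$p.value,
        shapiro     = sw$p.value))
```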
Kolmogorov two-sample test
Are two sets of data from the same distribution?
Gene expression data from Arabidopsis thaliana
• sprayed with 1.6mM Tween
• sprayed with water
ks.test(logexp1,logexp2)
Two-sample Kolmogorov-Smirnov test
data: logexp1 and logexp2
D = 0.0207, p-value = 0.0001146
alternative hypothesis: two-sided
Biggest deviations for low expression
Sign test
Is the median of the data zero?
Is the median x? (Subtract x from data and test against zero)
Type of data: continuous
Parametric equivalent: Student’s t-test (one sample)
Distribution of statistic: exact when no ties in data
If the median is zero, each observation has a 50:50 chance of falling on either side of it
Count the observations on each side and use a binomial test
Gene expression differences
<0      0    >0
12334   321  10155
binom.test( c(10155,12334) )
Exact binomial test
data: c(10155, 12334)
number of successes = 10155, number of trials = 22489, p-value < 2.2e-16
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.4450344 0.4580863
sample estimates:
probability of success
 0.4515541
Expect difference in expression to be zero
Discard differences of exactly zero
Confidence interval is on the proportion, not the expression difference
SIGN.test in the PASWR package is a more convenient way of doing a sign test and gives confidence intervals.
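The sign test is short enough to write by hand with base R. The helper name sign_test and its argument m are hypothetical, introduced here for illustration; it is a minimal sketch of the recipe above (subtract the hypothesised median, discard zeros, count, binomial test), not a replacement for SIGN.test.

```r
# Minimal sign test: is the median of x equal to m?
# (sign_test is a hypothetical helper, not part of any package.)
sign_test <- function(x, m = 0) {
  d <- x - m
  d <- d[d != 0]                          # discard differences of exactly zero
  binom.test(sum(d > 0), length(d), p = 0.5)
}

# Small worked example: 4 positive and 2 negative values, one zero discarded.
toy <- sign_test(c(-2, -1, 0, 1, 3, 4, 5))

# The counts from the gene expression slide: 10155 above zero, 12334 below.
slide_counts <- binom.test(c(10155, 12334))
print(slide_counts$estimate)
```

The estimate and confidence interval returned by binom.test describe the proportion of positive differences, not the size of the expression difference.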
Wilcoxon Signed Rank test
Type of data: ordinal (interval for paired data)
Parametric equivalent: Student’s t-test
Distribution of statistic: exact
Is the data symmetric about zero?
Is the data symmetric about x? (Subtract x and test against zero)
Much stronger assumption than the sign test
Test rejects non-symmetric data, even after centring it on its median
a <- rweibull(1000,1,1)
wilcox.test( a-median(a) )
p-value = 1.087e-05
Sample median of a: 0.72
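To contrast the two assumptions, the same skewed sample can be put through both tests. This sketch repeats the Weibull example with a fixed seed (an arbitrary choice), so the p-value will differ from the one quoted above.

```r
# Skewed data centred on its own median: the median is zero by construction,
# but the distribution is not symmetric about zero.
set.seed(4)
a <- rweibull(1000, shape = 1, scale = 1)
centred <- a - median(a)

# Sign test: only asks whether half of the data lies on each side of zero.
sign_p <- binom.test(sum(centred > 0), sum(centred != 0))$p.value

# Signed rank test: also uses the sizes of the deviations, so it reacts to skew.
rank_p <- wilcox.test(centred)$p.value

print(c(sign = sign_p, signed_rank = rank_p))
```

With an even number of observations exactly half fall on each side of the sample median, so the sign test has nothing to reject, while the signed rank test responds to the long right tail.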
Wilcoxon Signed Rank test
Special case when we do expect symmetry
Paired data
• Same gene under two different conditions
• Measuring response (before and after)
• Paired control, e.g. sibling pairs
Look at a pair X & Y
X = Intrinsic + RandomX
Y = Intrinsic + RandomY
Random property
• measurement error
• natural variation
Subtracting Y from X, the intrinsic part cancels:
X - Y = RandomX - RandomY
When RandomX and RandomY have the same distribution, the distribution of the difference is symmetric about zero
Wilcoxon Signed Rank test
Have gene expression data in two matched Arabidopsis thaliana plants
• one sprayed with 1.6mM Tween and left for one hour
• one sprayed with distilled water and left for one hour
The genes form matched pairs
[Figure panels: Water, Tween, Difference]
wilcox.test( lexp1 , lexp2, paired=TRUE )
Wilcoxon signed rank test with continuity correction
data: lexp1 and lexp2
V = 108347390, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
wilcox.test( lexp1 , lexp2, paired=TRUE , conf.int=TRUE)
Wilcoxon signed rank test with continuity correction
data: lexp1 and lexp2
V = 108347390, p-value < 2.2e-16
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.05204535 -0.04207491
sample estimates:
(pseudo)median
 -0.04705803
Wilcoxon Rank Sum Test
Also referred to as Mann-Whitney or Mann-Whitney-Wilcoxon test
Type of data: ordinal
Parametric equivalent: two-sample Student’s t-test
Distribution of statistic: exact
Do two samples have the same median?
Look at same expression data but ignore pairing
wilcox.test( lexp1 , lexp2, conf.int=TRUE)
Wilcoxon rank sum test with continuity correction
data: lexp1 and lexp2
W = 256243890, p-value = 0.005504
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
 -0.08455685 -0.01295834
sample estimates:
difference in location
 -0.04910699
Paired vs two-sample tests
Pairing can make a huge difference to the power of a test
Look at a case where the variation in the intrinsic part is greater than the effect
wilcox.test(sample1,sample2)
Wilcoxon rank sum test
data: sample1 and sample2
W = 4930, p-value = 0.8652
alternative hypothesis: true location shift is not equal to 0
wilcox.test(sample1,sample2,paired=TRUE)
Wilcoxon signed rank test
data: sample1 and sample2
V = 1609, p-value = 0.001645
alternative hypothesis: true location shift is not equal to 0
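A comparison of this kind can be simulated with the Intrinsic + Random model from the earlier slides. The numbers below (100 pairs, intrinsic standard deviation 10, noise standard deviation 1, effect 1) are assumptions chosen so that the intrinsic variation swamps the effect; they are not the values behind sample1 and sample2 above.

```r
# Paired data: each pair shares an intrinsic value that varies far more than the effect.
set.seed(7)
n         <- 100
intrinsic <- rnorm(n, mean = 0, sd = 10)   # shared within each pair
effect    <- 1                             # shift applied to the second sample

sample1 <- intrinsic + rnorm(n, sd = 1)
sample2 <- intrinsic + effect + rnorm(n, sd = 1)

# Ignoring the pairing: the effect is buried in the intrinsic variation.
unpaired <- wilcox.test(sample1, sample2)

# Using the pairing: the intrinsic part cancels in the differences.
paired <- wilcox.test(sample1, sample2, paired = TRUE, conf.int = TRUE)

print(c(unpaired = unpaired$p.value, paired = paired$p.value))
print(paired$conf.int)
```

The pairing only helps when the order of sample1 and sample2 really matches pair by pair; shuffling one vector destroys the cancellation.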
Kruskal-Wallis
Type of data: ordinal
Parametric equivalent: ANOVA
Distribution of statistic: approximate
What if we have several groups?
Arabidopsis gene expression data consisted of 6 experiments
6 groups of expression data; do they have different medians?
kruskal.test(gene_expression)
Kruskal-Wallis rank sum test
data: gene_expression
Kruskal-Wallis chi-squared = 58.421, df = 5, p-value = 2.575e-11
For two samples, Kruskal-Wallis is equivalent to Wilcoxon Rank Sum
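The two-sample equivalence can be checked numerically. In this sketch the data are simulated, and the rank sum test is run with its normal approximation and no continuity correction (exact = FALSE, correct = FALSE), which is the form that matches the chi-squared approximation used by kruskal.test.

```r
# Two groups: Kruskal-Wallis versus Wilcoxon rank sum.
set.seed(5)
g1 <- rnorm(30)
g2 <- rnorm(35, mean = 0.4)

kw <- kruskal.test(list(g1, g2))

# Normal approximation without continuity correction, to match Kruskal-Wallis.
rs <- wilcox.test(g1, g2, exact = FALSE, correct = FALSE)

print(c(kruskal = kw$p.value, rank_sum = rs$p.value))
```

With the default settings wilcox.test may use an exact distribution or a continuity correction, so its p-value will be close to, but not identical with, the Kruskal-Wallis one.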
Friedman test
Type of data: ordinal
Parametric equivalent: ANOVA with blocks
Distribution of statistic: approximate
[Figure: genes (rows) arranged by groups (columns)]
• Paired observations (two groups, G1 and G2): Wilcoxon Signed Rank test
• Many groups: Kruskal-Wallis test
• Many groups in distinct units: Friedman test
Friedman test
Classic example: wine tasting
Ask 4 women to rank 3 different wines, is one wine preferred?
wine
       Merlot  Shiraz  Pinot Noir
Agnes  1       2       3
Clara  2       1       3
Mona   1       3       2
Pam    1       2       3
friedman.test(wine)
Friedman rank sum test
data: wine
Friedman chi-squared = 4.5, df = 2, p-value = 0.1054
Flip the question:
Are judges ranking wines in a consistent manner?
t(wine)
            Agnes  Clara  Mona  Pam
Merlot      1      2      1     1
Shiraz      2      1      3     2
Pinot Noir  3      3      2     3
friedman.test(t(wine))
Friedman rank sum test
data: t(wine)
Friedman chi-squared = 0.1429, df = 3, p-value = 0.9862
Expected since forcing judges to rank
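The wine example can be run as is once the table is entered as a matrix. The matrix construction below is the only addition; rows are judges (blocks) and columns are wines (treatments), which is the layout friedman.test expects.

```r
# Ranks given by each judge (rows) to each wine (columns).
wine <- matrix(c(1, 2, 3,
                 2, 1, 3,
                 1, 3, 2,
                 1, 2, 3),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("Agnes", "Clara", "Mona", "Pam"),
                               c("Merlot", "Shiraz", "Pinot Noir")))

# Is one wine preferred? Judges are blocks, wines are treatments.
by_wine <- friedman.test(wine)

# Flipped question: wines are blocks, judges are treatments.
by_judge <- friedman.test(t(wine))

print(c(by_wine = by_wine$p.value, by_judge = by_judge$p.value))
```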
Friedman test
Another look at the Arabidopsis data - look at first 20 genes
Gene  Exp 1   Exp 2   Exp 3   Exp 4   Exp 5   Exp 6
1     1360.8  638.2   839.8   807.9   1252.4  1421.9
2     12.4    3.6     0.9     2.1     3.4     12.0
3     1297.0  1354.8  1401.5  1198.6  1017.4  1322.2
4     73.9    83.4    87.4    156.4   150.3   69.0
5     943.6   938.9   904.8   1133.4  958.2   940.1
6     1301.4  1089.4  1153.5  1173.5  1157.8  1337.5
7     908.4   837.0   795.4   1227.2  1008.2  1027.6
8     1585.4  1699.7  1747.8  2093.3  1851.6  2118.7
9     2837.8  3848.7  3960.2  3438.9  3608.9  3987.4
10    1498.7  1095.0  1213.8  1719.0  1914.5  1836.2
11    1296.1  1033.5  1212.2  1256.6  1333.1  1345.0
12    35.2    29.8    23.3    8.6     10.5    22.7
13    41.1    27.3    26.8    13.6    15.2    29.6
14    64.2    31.9    32.5    14.8    13.1    37.3
15    60.3    45.2    41.2    28.8    24.0    38.6
16    136.6   89.6    83.6    42.4    39.7    95.0
17    518.9   333.1   347.8   229.8   206.3   421.3
18    108.9   70.0    61.5    80.4    78.9    80.4
19    1516.0  967.1   1038.5  600.7   565.2   1381.3
20    1377.4  853.8   834.7   415.4   366.6   965.8
Friedman Test: p-value = 0.0006611
Kruskal-Wallis Test: p-value = 0.8761
Friedman test
Friedman Test: p-value = 0.0006611
Friedman / Kruskal-Wallis: at least one experiment shows a difference
Does not say which experiment
Pairwise Wilcoxon Signed Rank (multiple comparisons problem)
Raw p-values
       Exp 1  Exp 2  Exp 3  Exp 4  Exp 5  Exp 6
Exp 1         0.027  0.033  0.409  0.330  0.784
Exp 2                0.117  0.841  0.985  0.004
Exp 3                       0.869  0.927  0.001
Exp 4                              0.245  0.021
Exp 5                                     0.004
Exp 6
Adjusted p-values
       Exp 1  Exp 2  Exp 3  Exp 4  Exp 5  Exp 6
Exp 1         0.293  0.328  1.000  1.000  1.000
Exp 2                1.000  1.000  1.000  0.051
Exp 3                       1.000  1.000  0.015
Exp 4                              0.245  0.248
Exp 5                                     0.051
Exp 6
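Pairwise signed rank tests with a multiplicity adjustment are available in base R through pairwise.wilcox.test. The sketch below uses simulated data for four hypothetical experiments (E1 to E4) on 20 genes, and assumes Holm's adjustment, which is R's default; the slides do not say which adjustment was used for the table above.

```r
# Simulated expression for 20 genes (blocks) in 4 hypothetical experiments.
set.seed(6)
n_genes <- 20
block   <- rnorm(n_genes, sd = 5)            # gene-specific intrinsic level
expr <- cbind(E1 = block + rnorm(n_genes),
              E2 = block + rnorm(n_genes),
              E3 = block + rnorm(n_genes) + 1,
              E4 = block + rnorm(n_genes) + 1)

# Long format: values stacked column by column, so genes stay in the same
# order within every experiment (required for the paired tests).
values <- as.vector(expr)
group  <- factor(rep(colnames(expr), each = n_genes))

raw <- pairwise.wilcox.test(values, group, paired = TRUE,
                            p.adjust.method = "none")$p.value
adj <- pairwise.wilcox.test(values, group, paired = TRUE,
                            p.adjust.method = "holm")$p.value

print(round(raw, 3))
print(round(adj, 3))
```

Both results are lower-triangular matrices with NA in the unused cells; an adjusted p-value is never smaller than the corresponding raw one.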
Friedman test
Adjusted p-values from Signed Rank test (table above)
Experiment map
Actually have three pairs of experiments
A  Exp 6 & Exp 1: with and without Tween, 1 hour
B  Exp 2 & Exp 3: with and without Tween, 2.5 hours
C  Exp 5 & Exp 4: with and without Tween, 1 hour (replicate of A)
Difference detected may not be a useful one
But note:
• Looked at first 20 genes
• Full set has 22810
Aside on blocking
[Figure: grid of Gene (rows) by Experiment (columns)]
Statistical lingo
• Experiments are “treatments”
• Genes are “blocks”
The Friedman test assumes that all treatments are applied to all blocks: a “balanced complete design”
Might not be able to do this
• too expensive
• blocks only available in packs of fixed size
Incomplete experimental design
Which treatments go with which blocks is a critical issue
Talk to a statistician before you start