Why multiple tests are a problem? Rafael A. Irizarry.
Other names
• Multiple comparisons
• Data snooping
• Others?
References
• H. Scheffé (1953), “A method for judging all contrasts in the analysis of variance”, Biometrika 40:87-104.
• D.B. Duncan (1965), “A Bayesian approach to multiple comparisons”, Technometrics 7:171-222.
• J.W. Tukey (1953), “The problem of multiple comparisons”, reprinted in CWJWT Vol. VIII (1994).
• R.G. Miller, Simultaneous Statistical Inference, 2nd ed. (Springer 1981).
Thanks to Yoav Benjamini
Benjamini and Hochberg (1995), “Controlling the false discovery rate: a practical and powerful approach to multiple testing”, J. R. Stat. Soc. Ser. B.
Example
E. Giovannucci, A. Ascherio, E. Rimm, M. Stampfer, G. Colditz, W. Willett: “Intake of Carotenoids and Retinol in Relation to Risk of Prostate Cancer”, Journal of the National Cancer Institute 87(23):1767-1776 (6 Dec 1995).
“Using responses to a validated, semiquantitative food frequency questionnaire mailed to participants in the Health Professionals Follow-up Study in 1986, we assessed dietary intake for a 1-year period for a cohort of 47,894 eligible subjects initially free of diagnosed cancer.... We calculate the relative risk (RR) for each of the upper categories of intake of a specific food or nutrient by dividing the incidence of prostate cancer among men in each of these categories by the rate among men in the lowest intake level....”
“Of 46 vegetables and fruits or related products, four were significantly associated with lower prostate cancer risk; of the four --- tomato sauce (P for trend = 0.001), tomatoes (P for trend = 0.03), and pizza (P for trend = 0.05), but not strawberries --- were primary sources of lycopene.”
BUT the Methods section one page later states:
“For each of 131 food and beverage items listed ...”
And the (presumably strongest) carotenoids and p-values are listed in Table 2 (p. 1770):
Tomato sauce   Tomatoes   Tomato juice   Pizza
0.001          0.03       0.67           0.05
“Our findings ... suggest that tomato-based foods may be especially beneficial regarding prostate cancer risk.”
What is a p-value again?
When nothing protects, we expect
131 x 0.05 ≈ 7
foods/nutrients to have p-values < 0.05
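The arithmetic above can be checked in a few lines. This is only a sketch: the function names are made up for illustration, and the second function assumes the 131 tests are independent, which the slides do not claim.

```python
# Sketch: how many small p-values to expect when every null hypothesis is true.

def expected_false_positives(m: int, alpha: float) -> float:
    """Expected number of p-values below alpha when all m nulls are true.

    This holds whether or not the tests are independent.
    """
    return m * alpha


def prob_at_least_one(m: int, alpha: float) -> float:
    """P(at least one p-value < alpha), ASSUMING the m tests are independent."""
    return 1.0 - (1.0 - alpha) ** m


foods = expected_false_positives(131, 0.05)  # 6.55, which the slide rounds to 7
any_hit = prob_at_least_one(131, 0.05)       # close to 1 under independence

print(f"expected p-values below 0.05: {foods:.2f}")
print(f"P(at least one below 0.05):   {any_hit:.4f}")
```

Under these assumptions, finding a handful of "significant" foods among 131 is what chance alone would produce.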
Microarrays
When no genes are changing between two groups, we expect
20,000 x 0.01 = 200
genes to have p-values < 0.01
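A small simulation makes the 200 concrete. It is a sketch under assumptions that are not in the slides: 5 arrays per group, expression values drawn from a standard normal, and a two-sided z-test with known variance standing in for whatever test would be used in practice.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

N_GENES = 20_000   # number of genes, as on the slide
N_PER_GROUP = 5    # hypothetical number of arrays per group
ALPHA = 0.01       # p-value cutoff, as on the slide
std_normal = NormalDist()


def null_p_value(n: int) -> float:
    """Two-sided z-test p-value for two groups drawn from the same N(0, 1)."""
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(0.0, 1.0) for _ in range(n)]
    # Difference of two means, each with variance 1/n, has variance 2/n.
    z = (mean(a) - mean(b)) / (2.0 / n) ** 0.5
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


p_values = [null_p_value(N_PER_GROUP) for _ in range(N_GENES)]
n_small = sum(p < ALPHA for p in p_values)

print(f"{n_small} of {N_GENES} unchanged genes have p < {ALPHA}")
```

The count varies from run to run but should land near 200, even though no gene differs between the groups.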
However, false positives are not as bad as in other fields
What can we do?
• p-values no longer mean what they used to… no argument
• A histogram of the p-values is a useful plot
• What can we do… lots of argument
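One way to see why the histogram helps: p-values from true nulls are spread evenly over [0, 1], while real effects pile up near 0. The sketch below prints a text histogram for a hypothetical mixture; the 9,000/1,000 split and the effect size of 3 are invented for illustration, not taken from the slides.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable
std_normal = NormalDist()


def p_value(shift: float) -> float:
    """Two-sided p-value for one z-statistic drawn from N(shift, 1)."""
    z = random.gauss(shift, 1.0)
    return 2.0 * (1.0 - std_normal.cdf(abs(z)))


# Hypothetical mixture: 9,000 true nulls and 1,000 tests with a real effect.
p_values = [p_value(0.0) for _ in range(9_000)]
p_values += [p_value(3.0) for _ in range(1_000)]

N_BINS = 10
counts = [0] * N_BINS
for p in p_values:
    # A p-value of exactly 1.0 is placed in the last bin.
    counts[min(int(p * N_BINS), N_BINS - 1)] += 1

for i, c in enumerate(counts):
    lo, hi = i / N_BINS, (i + 1) / N_BINS
    print(f"[{lo:.1f}, {hi:.1f}) {'#' * (c // 50)} {c}")
```

A flat histogram suggests mostly nulls; a spike in the leftmost bin on top of a flat background suggests that some of the small p-values reflect real differences.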
Multiple Hypothesis Testing
               Called Significant   Not Called Significant   Total
Null True      V                    m0 - V                   m0
Altern. True   S                    m1 - S                   m1
Total          R                    m - R                    m
Null = Equivalent Expression; Alternative = Differential Expression
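The counts in the table can be tallied directly when the truth is known, as in a simulation. A minimal sketch follows; `OutcomeTable` and `tabulate` are hypothetical names, and the five-gene example is made up.

```python
from typing import NamedTuple, Sequence


class OutcomeTable(NamedTuple):
    V: int   # true nulls called significant (Type I errors)
    S: int   # true alternatives called significant
    R: int   # total called significant, R = V + S
    m0: int  # number of true nulls
    m1: int  # number of true alternatives
    m: int   # total number of hypotheses, m = m0 + m1


def tabulate(null_true: Sequence[bool], called: Sequence[bool]) -> OutcomeTable:
    """Cross-tabulate the truth of each null against the significance call."""
    if len(null_true) != len(called):
        raise ValueError("null_true and called must have the same length")
    v = sum(1 for t, c in zip(null_true, called) if t and c)
    s = sum(1 for t, c in zip(null_true, called) if not t and c)
    m0 = sum(1 for t in null_true if t)
    m = len(null_true)
    return OutcomeTable(V=v, S=s, R=v + s, m0=m0, m1=m - m0, m=m)


# Made-up example: 3 unchanged genes, 2 changed genes.
null_true = [True, True, True, False, False]
called = [True, False, False, True, False]
table = tabulate(null_true, called)

print(table)
print("not called: null", table.m0 - table.V, "| alt", table.m1 - table.S)
```

With real data only R and m are observed; V and S are unknown, which is why the error rates are defined as expectations and probabilities.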
Error Rates
• Per comparison error rate (PCER): the expected number of Type I errors divided by the number of hypotheses
PCER = E(V)/m
• Per family error rate (PFER): the expected number of Type I errors
PFER = E(V)
• Family-wise error rate (FWER): the probability of at least one Type I error
FWER = Pr(V ≥ 1)
• False discovery rate (FDR): the rate at which false discoveries occur, with V/R counted as 0 when R = 0
FDR = E(V/R; R>0) = E(V/R | R>0) Pr(R>0)
• Positive false discovery rate (pFDR): the rate at which discoveries are false
pFDR = E(V/R | R>0)
• Many others
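These definitions can be compared by Monte Carlo. The sketch below is illustrative only: the setup (90 true nulls, 10 true alternatives, independent z-tests, an effect size of 3, and an unadjusted cutoff of 0.05 per test) is an assumption chosen for the example, not something from the slides.

```python
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the run is repeatable
std_normal = NormalDist()

M0, M1 = 90, 10   # hypothetical numbers of true nulls and true alternatives
SHIFT = 3.0       # hypothetical effect size for the alternatives
ALPHA = 0.05      # unadjusted per-test cutoff
N_SIM = 2_000     # number of simulated experiments
m = M0 + M1


def rejects(shift: float) -> bool:
    """True if a two-sided z-test on one N(shift, 1) draw gives p < ALPHA."""
    z = random.gauss(shift, 1.0)
    return 2.0 * (1.0 - std_normal.cdf(abs(z))) < ALPHA


sum_v = 0        # running total of V
n_any_v = 0      # runs with V >= 1
n_any_r = 0      # runs with R > 0
sum_ratio = 0.0  # running total of V/R over runs with R > 0

for _ in range(N_SIM):
    v = sum(rejects(0.0) for _ in range(M0))
    s = sum(rejects(SHIFT) for _ in range(M1))
    r = v + s
    sum_v += v
    n_any_v += int(v >= 1)
    if r > 0:
        n_any_r += 1
        sum_ratio += v / r

pfer = sum_v / N_SIM      # estimate of E(V)
pcer = pfer / m           # estimate of E(V)/m
fwer = n_any_v / N_SIM    # estimate of Pr(V >= 1)
fdr = sum_ratio / N_SIM   # V/R counted as 0 when R = 0
pfdr = sum_ratio / n_any_r if n_any_r else float("nan")  # E(V/R | R > 0)

print(f"PCER={pcer:.3f} PFER={pfer:.2f} FWER={fwer:.3f} FDR={fdr:.3f} pFDR={pfdr:.3f}")
```

In this setup the PCER stays near the nominal 0.05 while the FWER is close to 1, which shows how differently the same procedure looks depending on which error rate is reported.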
Conclusions
• Let's do a multiple comparison of the different beers sold by the IF