MARE 250 Dr. Jason Turner Analysis of Variance (ANOVA)
Date posted: 19-Dec-2015
MARE 250 Dr. Jason Turner
Analysis of Variance(ANOVA)
First name: Mister, Last name: T
Hypothesis Testing for Two Means
Test for comparing the means of two populations
One of the most common (and simple?) tests in statistics
What is the procedure if there are more than two populations?
“I pity myself that I got to be with these fools!”– Mr. T
Mr. T – Part 2: Electric Boogaloo
Why not run multiple t-tests?
1. Number of t-tests increases with # of groups; becomes cognitively difficult
2. ↑ Number of analyses = ↑ probability of committing Type I error
Probability of committing at least one type I error = experiment-wise error rate
μ1 μ2 μ3
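A rough sketch of how quickly both problems grow, assuming every pairwise t-test is run at the same alpha and that the tests are independent (an approximation, since pairwise tests share groups). The function names are illustrative only.

```python
from math import comb


def pairwise_tests(k):
    """Number of two-sample t-tests needed to compare k groups pairwise."""
    return comb(k, 2)


def experimentwise_error(k, alpha=0.05):
    """Approximate chance of at least one Type I error across all pairwise
    tests, assuming the tests are independent (a simplifying assumption)."""
    m = pairwise_tests(k)
    return 1 - (1 - alpha) ** m


for k in (3, 4, 5):
    print(k, "groups:", pairwise_tests(k), "tests, error rate about",
          round(experimentwise_error(k), 3))
```

With four groups, as in the Roi example, six t-tests would be needed, and the chance of at least one false positive is well above the nominal 5%.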
Analysis of Variance (ANOVA)
Method for comparing multiple sample means
Compares the means of a variable for populations that result from a classification by one other variable – Factor
Levels of the Factor – the possible values of the factor
At this point dealing with: One-Way ANOVA
Analysis of Variance (ANOVA)
Want to compare egg production in four populations of Roi in Hawaii
Variable of interest = egg production
Four populations result from classifying Roi in Hawaii by the Factor “Beach locations” whose Levels are Ka Lae, Kapoho, Richardson’s, Keauhou Bay
Assumptions for One-Way ANOVA
Four assumptions for t-test hypothesis testing:
1. Random Samples
2. Independent Samples
3. Normal Populations (or large samples)
4. Variances (std. dev.) are equal
One-Way ANOVA
Other similarities T & ANOVA
A one-way analysis of variance (ANOVA) tests the hypothesis that the means of several populations are equal.
The method is an extension of the two-sample t-test, specifically for the case where the population variances are assumed to be equal.
“I pity the fools that think T and ANOVA are similar!”– Mr. T
Other similarities T & ANOVA
A one-way analysis of variance requires the following:
Response - measurement taken from the units sampled.
Factor - discrete variable that is altered systematically. The different values chosen for the factor variable are called levels of the factor.
Each level of the factor in the analysis corresponds to a larger population with its own mean.
The sample mean is an estimate of the level mean for the whole population.
“That sucka stole my van!”– Mr. T
A one-way ANOVA can be used to tell you if there are statistically significant differences among the level means.
The null hypothesis for the test is that all population means (level means) are the same.
The alternative hypothesis is that one or more population means differ from the others.
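A minimal sketch of the arithmetic behind the test, in plain Python. It splits the variation into a between-level part (factor) and a within-level part (error) and forms their ratio, the F statistic that the ANOVA table reports. The data are made up for illustration and are not the course data; the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples,
    one list of response values per factor level."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    level_means = [sum(g) / len(g) for g in groups]

    # Variation of the level means around the grand mean (factor)
    ss_factor = sum(len(g) * (m - grand_mean) ** 2
                    for g, m in zip(groups, level_means))
    # Variation of the observations around their own level mean (error)
    ss_error = sum((x - m) ** 2
                   for g, m in zip(groups, level_means) for x in g)

    df_factor = k - 1
    df_error = n_total - k
    ms_factor = ss_factor / df_factor
    ms_error = ss_error / df_error
    return {
        "df": (df_factor, df_error),
        "ss": (ss_factor, ss_error),
        "ms": (ms_factor, ms_error),
        "f": ms_factor / ms_error,
    }


# Hypothetical samples for three factor levels
example = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(example)
print(result)
```

A large F means the level means vary more than the scatter within levels would explain, which is evidence against the null hypothesis. Turning F into a p-value requires the F distribution, which statistical software such as MINITAB handles.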
Mr. T Says…
A one-way analysis of variance (ANOVA) tests the hypothesis that the means of several populations are equal
The method is an extension of the two-sample t-test, specifically for the case where the population variances are assumed to be equal
The null hypothesis for the test is that all population means (level means) are the same - H0: μ1 = μ2 = μ3 = μ4
The alternative hypothesis is that one or more population means differ from the others - Ha: Not all means are equal
“I pity the fools that think T and ANOVA are similar!”– Mr. T
also “Stay in School fool!” and the popular “Drugs are for Chumps!”
One-Way ANOVA : Analysis of Variance Table
One-way ANOVA: Egg production versus Location
Source  DF     SS    MS     F      P
Paint    3  281.7  93.9  6.02  0.004
Error   20  312.1  15.6
Total   23  593.8

DF: Degrees of Freedom
SS: Sum of Squares
MS: Mean Squares
F: F statistic
P: P value
F
F is the statistic used to test the hypothesis that all the factor level means are equal
It is calculated as the mean square for the factor divided by the mean square for error
F is used to determine the p-value
Like the t statistic for a t-test, or the R-J value for a Ryan-Joiner test
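As a quick check, the MS and F columns of the egg production table can be reproduced from its DF and SS columns. This is only arithmetic on the numbers shown in the table; the variable and function names are illustrative.

```python
def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df


# DF and SS values taken from the ANOVA table above
ss_factor, df_factor = 281.7, 3
ss_error, df_error = 312.1, 20

ms_factor = mean_square(ss_factor, df_factor)
ms_error = mean_square(ss_error, df_error)
f_stat = ms_factor / ms_error  # mean square for factor / mean square for error

print("MS factor:", round(ms_factor, 1))
print("MS error: ", round(ms_error, 1))
print("F:        ", round(f_stat, 2))
```

The Total row is the sum of the other two rows: 3 + 20 = 23 degrees of freedom and 281.7 + 312.1 = 593.8 for the sum of squares.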
Analysis of Variance (ANOVA)
ANOVA is a statistical test (like t-test, normality, etc.)
However, MINITAB has different programs we can use to run an ANOVA
1-way ANOVA - easy to use, formatted
General Linear Model – (GLM) – less structured, more freedom
Four Sample Means, Only One Result?
“I pity the fools that has only one result for 4 sample means!”– Mr. T
A one-way analysis of variance (ANOVA) tests the hypothesis that the means of several populations are equal
The null hypothesis for the test is that all population means (level means) are the same - H0: μ1 = μ2 = μ3 = μ4
The alternative hypothesis is that one or more population means differ from the others - Ha: Not all means are equal

Source  DF     SS    MS     F      P
Pop      3  281.7  93.9  6.02  0.004
Error   20  312.1  15.6
Total   23  593.8
We reject the null: not all means are equal (at least one differs)
Is that all?
Multiple Comparisons
Allow you to determine the relations among all the means
Several methods: Tukey, Fisher’s LSD, Dunnett’s, Bonferroni, Scheffe, etc
Most focus on Tukey
2 Ways to Assess:
1) Confidence intervals (for 1-way ANOVA only)
2) Tukey test (for all other ANOVAs)
95% confidence intervals for each level of the factor
If two intervals do not overlap, this suggests the population means are different
Interpret with caution; rate of type I error increases when making multiple comparisons
Should use one of the four available methods for controlling the rate of type I error
Intervals for Pop 2 and Pop 4 do not overlap; suggests population means for these levels are different
Level  N    Mean  StDev
Pop 1  6  14.733  3.363
Pop 2  6   8.567  5.500
Pop 3  6  12.983  3.730
Pop 4  6  18.067  2.636
[Plot: 95% confidence interval for each level mean]
Tukey's method
Tukey's method compares the means for each pair of factor levels using a family error rate to control the rate of type I error
Results are presented as a set of confidence intervals for the difference between pairs of means
Use the intervals to determine whether the means are different:
If an interval does not contain zero, there is a statistically significant difference between the corresponding means
If the interval does contain zero, the difference between the means is not statistically significant
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Ahi
Individual confidence level = 98.89%

Ahi = Pop 1 subtracted from:
Ahi      Lower  Center   Upper
Pop 2  -12.553  -6.167   0.219
Pop 3   -8.136  -1.750   4.636
Pop 4   -3.053   3.333   9.719

Ahi = Pop 2 subtracted from:
Ahi      Lower  Center   Upper
Pop 3   -1.969   4.417  10.803
Pop 4    3.114   9.500  15.886

Ahi = Pop 3 subtracted from:
Ahi      Lower  Center   Upper
Pop 4   -1.303   5.083  11.469
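The "does the interval contain zero?" rule can be applied mechanically to the Tukey output above. The sketch below copies the lower and upper limits from that output; the dictionary layout and function name are just one illustrative way to organize them.

```python
# (level, level subtracted from it): (lower, upper), copied from the output above
tukey_intervals = {
    ("Pop 2", "Pop 1"): (-12.553, 0.219),
    ("Pop 3", "Pop 1"): (-8.136, 4.636),
    ("Pop 4", "Pop 1"): (-3.053, 9.719),
    ("Pop 3", "Pop 2"): (-1.969, 10.803),
    ("Pop 4", "Pop 2"): (3.114, 15.886),
    ("Pop 4", "Pop 3"): (-1.303, 11.469),
}


def is_significant(lower, upper):
    """A difference is statistically significant if its interval excludes zero."""
    return not (lower <= 0 <= upper)


significant_pairs = [pair for pair, (lower, upper) in tukey_intervals.items()
                     if is_significant(lower, upper)]

for pair, (lower, upper) in tukey_intervals.items():
    verdict = "different" if is_significant(lower, upper) else "not different"
    print(pair[0], "vs", pair[1], (lower, upper), verdict)
```

Only the Pop 4 versus Pop 2 interval excludes zero, which agrees with the non-overlapping confidence intervals for those two levels.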
Which multiple comparisons to use
At this point, unless otherwise advised – Tukey’s is fine
Bonferroni, Fisher’s LSD, Dunnett’s, Scheffe all used heavily as well
Statistical versus practical significance
Even if the level means are significantly different from a statistical standpoint, the difference may not be of any practical importance
In the Ahi data, the smallest mean is 8.567 and the largest is 18.067. Is this 9.5-unit difference of any practical consequence?
Only knowledge of the subject area, not statistics, can be used to answer this question
Assumptions for One-Way ANOVA
Four assumptions for t-test hypothesis testing:
1. Random Samples
2. Independent Samples
3. Normal Populations (or large samples)
4. Variances (std. dev.) are equal
One-Way ANOVA
Non-Parametric Version of ANOVA
Requires independent samples with similarly distributed data
Use the nonparametric test regardless of normality or sample size
Is based upon mean of ranks of the data – not the mean or variance (Like Mann-Whitney)
If the variation in mean ranks is large – reject null
Uses p-value like ANOVA
Last Resort/Not Resort –low sample size, “bad” data
Kruskal-Wallis
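A minimal sketch of the rank-based idea, assuming the standard Kruskal-Wallis H statistic without the tie correction that statistical packages usually apply. The data are made up for illustration and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), no tie correction."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical samples for three populations
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 3))
```

The more the mean ranks differ between groups, the larger H becomes. The p-value is usually obtained by comparing H to a chi-square distribution with (number of groups - 1) degrees of freedom, an approximation that is weakest for very small samples.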
When Do I Do the What Now?
If you are reasonably sure that the distributions are normal – use ANOVA
Otherwise – use Kruskal-Wallis
“Well, whenever I'm confused, I just check my underwear. It holds the answer to all the important questions.” – Grandpa Simpson