Comparing Many Group Means One Way Analysis of Variance.
Date posted: 15-Jan-2016
Comparing Many Group Means
One Way Analysis of Variance
Data Situation
The data situation has k populations, and we wish to determine whether there are any differences in the population means.
If there are differences, we need to describe them.
Each population is sampled, so that n_1 to n_k observations are obtained.
Examples
An agricultural researcher wishes to know what color of bug trap catches the most bugs. Four colors are considered: yellow, white, green, blue.
An education researcher wishes to know if where a student sits in a classroom has any relationship to the grade the student will receive in the class. Three seating locations are considered: front, middle, and back.
Hypotheses
The null hypothesis is that all population means are equal. That is, there are no differences.
The alternative hypothesis is that not all of the population means are equal.
Note that in this situation the Ha cannot be represented using symbols alone.
Hypotheses
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_a: \text{not all } \mu\text{'s equal}$$
Test Statistic
The idea is that if the population means differ, the variation between the group sample means will be large relative to the variation within the groups.
The test statistic is based on this idea: it is a ratio of the variation between groups divided by the variation within groups.
This ratio has an F distribution under Ho.
F Distribution
The F distribution is a right-skewed distribution with minimum value zero, and it can extend out to infinity.
The F distribution is indexed by two sets of degrees of freedom: the first is the numerator degrees of freedom, and the second is the denominator degrees of freedom.
Degrees of Freedom
The numerator degrees of freedom are the number of groups, k, minus one, giving k-1.
The denominator degrees of freedom are n-k, the total number of observations (n = n_1 + ... + n_k) minus the number of groups.
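As a small illustration of the two degrees-of-freedom rules, the sketch below computes them from a list of group sizes. The function name `anova_degrees_of_freedom` is made up for this example; the group sizes are the ones reported in the trap color and seating examples of this deck.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (numerator df, denominator df) for a one-way ANOVA."""
    k = len(group_sizes)   # number of groups
    n = sum(group_sizes)   # total number of observations
    return k - 1, n - k

print(anova_degrees_of_freedom([6, 6, 6, 6]))   # trap colors: (3, 20)
print(anova_degrees_of_freedom([35, 25, 19]))   # seating locations: (2, 76)
```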
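Hypothesis Test Formula
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
$$H_a: \text{not all } \mu\text{'s equal}$$
$$F = \frac{MS(\text{Between})}{MS(\text{Within})} = \frac{\sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2 / (k-1)}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2 / (n-k)}$$
$$P\text{-value} = P(F_{k-1,\,n-k} > F_{obs})$$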
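The F ratio of between-group to within-group mean squares can be computed directly from raw observations. The following is a minimal sketch, not the software used for the slides: the function name `one_way_f` and the tiny data set are invented purely for illustration.

```python
def one_way_f(groups):
    """Return (F, numerator df, denominator df) for a list of samples."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total observations
    grand_mean = sum(sum(g) for g in groups) / n     # overall average y-bar
    means = [sum(g) / len(g) for g in groups]        # group averages y-bar_i

    # Between: each group mean around the overall mean, weighted by n_i.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within: each observation around its own group mean.
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical data, invented for illustration only.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_obs, df1, df2 = one_way_f(samples)
print(f"F = {f_obs:.3f} on ({df1}, {df2}) degrees of freedom")  # F = 13.000 on (2, 6)
```

For this toy data the between sum of squares is 26 and the within sum of squares is 6, so the mean squares are 13 and 1.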
Test Statistic
The test statistic formula looks complex, but is pretty easy to understand: the denominator measures the squared variation of each observation y_ij around its group average y-bar_i. This variation is summed over all observations within each group and then over the groups. This gives the total squared variation within groups, which is divided by n-k.
The numerator gives the squared variation of each group sample mean y-bar_i around the overall average y-bar, weighted by the group size n_i and divided by k-1.
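The two sums of squares fit together through a standard identity: the total squared variation of the observations around the overall average splits exactly into the between-group part and the within-group part. The SS column of an ANOVA table reflects this, with the Total row equal to the sum of the Treatments and Error rows.

$$\sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y})^2 = \sum_{i=1}^{k} n_i(\bar{y}_i - \bar{y})^2 + \sum_{i=1}^{k}\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_i)^2$$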
P-Value
The p-value is found using an F-table with the appropriate degrees of freedom.
If the null hypothesis is not true, the test statistic F will tend to be large, and the probability under Ho of getting beyond a large positive F will be small.
This means that a small p-value implies evidence to doubt the Ho, just like it always does.
The test will always be one-sided, positive tail.
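Instead of an F-table, the upper-tail probability can be approximated numerically. The sketch below is one possible approach using only the Python standard library, not the method used to produce the slides; the function name `f_upper_tail` is made up. It relies on the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1·f), where I_x is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. It assumes f_obs > 0 and a denominator df of at least 2. In practice a library routine such as `scipy.stats.f.sf` would normally be preferred.

```python
import math

def f_upper_tail(f_obs, df1, df2, steps=2000):
    """Approximate P(F_{df1, df2} > f_obs) with Simpson's rule.

    Assumes f_obs > 0, df2 >= 2, and an even number of steps.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_obs)        # upper limit of the beta integral

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3

    # Normalize by the complete beta function B(a, b).
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)

# F statistics and degrees of freedom taken from the ANOVA tables in this deck.
print(f"{f_upper_tail(3.428296, 2, 76):.4f}")    # seating example: 0.0375
print(f_upper_tail(30.551935, 3, 20) < 0.0001)   # trap color example: True
```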
[Figure: F-Distribution]
[Figure: Color Boxplots]
Trap Color Problem
$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$$
$$H_a: \text{not all } \mu\text{'s equal}$$
$$F = \frac{MS(\text{Between})}{MS(\text{Within})} = \frac{1406.153}{46.025} = 30.552$$
$$P\text{-value} = P(F_{3,\,20} > 30.552) < .0001$$
Conclusion
The data are unlikely to occur if the Ho were true. The data are inconsistent with the Ho.
There is evidence to doubt the Ho.
There is evidence to support the Ha.
There is evidence that not all of the trap color means are equal. There are differences between the colors in how many bugs they catch.
Conclusion
The sample summary shows how the colors differ.
Sample means:
Treatment   n   Mean        Std. Error
Yellow      6   47.166668   2.7738862
White       6   15.666667   1.3581033
Green       6   31.5        4.047633
Blue        6   14.833333   2.181997
Conclusion
The means show that yellow is by far the best color at catching bugs.
The next best color is green.
ANOVA Table
ANOVA table:
Source       df   SS          MS          F-Stat      P-value
Treatments   3    4218.4585   1406.1528   30.551935   <0.0001
Error        20   920.5       46.025
Total        23   5138.9585
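The trap color ANOVA table can be rebuilt from the sample summary alone. The sketch below assumes that the reported Std. Error is s / sqrt(n), so each group variance is n times the squared standard error; that assumption is not stated on the slides, but it reproduces the table's sums of squares. The function name `anova_from_summary` is made up for this example.

```python
def anova_from_summary(summary):
    """Rebuild one-way ANOVA quantities from (n, mean, std_error) per group.

    Assumes std_error = s / sqrt(n), i.e. group variance = n * std_error**2.
    """
    k = len(summary)
    n_total = sum(n for n, _, _ in summary)
    grand_mean = sum(n * mean for n, mean, _ in summary) / n_total

    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summary)
    # (n - 1) * s^2 per group, with s^2 = n * std_error^2.
    ss_within = sum((n - 1) * n * se ** 2 for n, _, se in summary)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

# (n, mean, std. error) for yellow, white, green, blue, from the sample means table.
trap = [
    (6, 47.166668, 2.7738862),
    (6, 15.666667, 1.3581033),
    (6, 31.5, 4.047633),
    (6, 14.833333, 2.181997),
]
result = anova_from_summary(trap)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Up to rounding in the reported summary values, the output agrees with the Treatments and Error rows of the table above.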
[Figure: Means Plot]
Row Seating Analysis
The response variable is grade in the course (grade points).
The grouping variable is seating location at three levels: front, middle, back.
The idea is that academic performance may be related to seating location in the classroom.
Row Seating Analysis
Sample means:
Treatment   n    Mean        Std. Error
Front       35   14.285714   0.9886511
Middle      25   11.12       1.3245376
Back        19   10.105263   1.4446812
ANOVA table:
Source       df   SS          MS          F-Stat     P-value
Treatments   2    264.3011    132.15054   3.428296   0.0375
Error        76   2929.5723   38.547005
Total        78   3193.8735
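Because the seating analysis has exactly 2 numerator degrees of freedom, its p-value can be checked by hand. For an F distribution with 2 and d denominator degrees of freedom, the upper-tail probability has the closed form (1 + 2F/d)^(-d/2). This shortcut holds only when the numerator degrees of freedom equal 2, so it does not apply to the four-group examples.

$$P(F_{2,\,76} > 3.428296) = \left(1 + \frac{2 \times 3.428296}{76}\right)^{-76/2} = (1.09022)^{-38} \approx 0.0375$$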
Conclusion
The data are unlikely to occur if the Ho is true. The data are inconsistent with the Ho.
There is evidence to doubt the Ho.
There is evidence to support the Ha.
There is evidence the group means differ by seating location. That is, the mean grade points differ by seating location.
Conclusion
The table of means and the plots show clearly that the front of the classroom has higher grades than the other locations.
From the graphs it is clear that the middle and the back of the room have similar grades.
Where do you sit?
Math Scores
Group      n    Mean         Std. Error
Computer   20   0.95         0.45581043
None       14   0.78571427   0.85278195
Piano      34   3.6176472    0.52396184
Singing    10   -0.3         0.47258157
ANOVA table:
Source       df   SS          MS          F-Stat     P-value
Treatments   3    184.10191   61.367302   8.418377   <0.0001
Error        74   539.4366    7.2896833
Total        77   723.53845