Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with...
Chapter 15
The Analysis of Variance
Copyright (c) 2001 Brooks/Cole, a division of Thomson Learning, Inc.
A Problem
A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or breast when treated with ascorbate [1]. In this study, the authors wanted to determine whether the survival times differ depending on the affected organ.
[1] Cameron, E. and Pauling, L. (1978) Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival time in terminal human cancer. Proceedings of the National Academy of Sciences, USA, 75, 4538-4542.
A comparative dotplot of the survival times is shown below.
[Figure: Dotplot for Survival Time. Horizontal axis: Survival Time (in days), 0 to 3000. One row of dots per Cancer Type: Breast, Bronchus, Colon, Ovary, Stomach.]
The hypotheses used to answer the question of interest are
H0: $\mu_{\text{stomach}} = \mu_{\text{bronchus}} = \mu_{\text{colon}} = \mu_{\text{ovary}} = \mu_{\text{breast}}$
Ha: At least two of the $\mu$'s are different
The question is similar to ones encountered in Chapter 11, where we looked at tests for the difference between the means of two populations or treatments. In this case we are interested in comparing more than two means.
Single-factor Analysis of Variance (ANOVA)
A single-factor analysis of variance (ANOVA) problem involves a comparison of k population or treatment means $\mu_1, \mu_2, \ldots, \mu_k$. The objective is to test the hypotheses:
H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Ha: At least two of the $\mu$'s are different
The analysis is based on k independently selected samples, one from each population or for each treatment.
In the case of populations, a random sample from each population is selected independently of that from any other population.
When comparing treatments, the experimental units (subjects or objects) that receive any particular treatment are chosen at random from those available for the experiment.
A comparison of treatments based on independently selected experimental units is often referred to as a completely randomized design.
[Figure: Dotplots of Yield by Fertilizer (group means are indicated by lines). Axes: Yield (40 to 70) against Fertilizer (Type 1, Type 2, Type 3).]
[Figure: Dotplots of Price by Subject (group means are indicated by lines). Axes: Price (65 to 85) against Subject (Statistics, Psychology, Economics, Business).]
Notice that in the comparative dotplot on the left, the differences in the treatment means are large relative to the variability within the samples, while in the comparative dotplot on the right, the differences in the sample means relative to the sample variability are not so clear cut. ANOVA techniques will allow us to determine whether those differences are significant.
ANOVA Notation
k = number of populations or treatments being compared

| Population or treatment | 1 | 2 | … | k |
|---|---|---|---|---|
| Population or treatment mean | $\mu_1$ | $\mu_2$ | … | $\mu_k$ |
| Population or treatment variance | $\sigma_1^2$ | $\sigma_2^2$ | … | $\sigma_k^2$ |
| Sample size | $n_1$ | $n_2$ | … | $n_k$ |
| Sample mean | $\bar{x}_1$ | $\bar{x}_2$ | … | $\bar{x}_k$ |
| Sample variance | $s_1^2$ | $s_2^2$ | … | $s_k^2$ |

N = $n_1 + n_2 + \cdots + n_k$ (total number of observations in the data set)
T = grand total = sum of all N observations = $n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k$
$\bar{\bar{x}}$ = grand mean = $T/N$
Assumptions for ANOVA
1. For each of the k populations or treatments, the response distribution is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (the k normal distributions have identical standard deviations).
3. The observations in the sample from any particular one of the k populations or treatments are independent of one another.
4. When comparing population means, k random samples are selected independently of one another. When comparing treatment means, treatments are assigned at random to subjects or objects.
Definitions
A measure of disparity among the sample means is the treatment sum of squares, denoted by SSTr and given by

$$SSTr = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$

A measure of variation within the k samples, called the error sum of squares and denoted by SSE, is given by

$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$
Definitions
The error df comes from adding the df's associated with each of the sample variances:
$(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1) = n_1 + n_2 + \cdots + n_k - 1 - 1 - \cdots - 1 = N - k$
A mean square is a sum of squares divided by its df. In particular,
mean square for treatments = $MSTr = \dfrac{SSTr}{k - 1}$
mean square for error = $MSE = \dfrac{SSE}{N - k}$
Example
Three filling machines are used by a bottler to fill 12 oz cans of soda. In an attempt to determine whether the three machines are filling the cans to the same (mean) level, independent samples of cans filled by each machine were selected and the amounts of soda in the cans were measured. The samples are given below.
Machine 1: 12.033 11.985 12.009 12.009 12.033 12.025 12.054 12.050
Machine 2: 12.031 11.985 11.998 11.992 11.985 12.027 11.987
Machine 3: 12.034 12.021 12.038 12.058 12.001 12.020 12.029 12.011 12.021
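The summary statistics used in the rest of this example can be reproduced from the raw fills. The following is a minimal sketch using only the Python standard library; the names `fills`, `summarize`, `N`, `T` and `grand_mean` are illustrative choices and are not part of the original slides.

```python
from statistics import mean, stdev

# Fill amounts (oz) copied from the example above.
fills = {
    "Machine 1": [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    "Machine 2": [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    "Machine 3": [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021],
}


def summarize(samples):
    """Return {name: (n, sample mean, sample standard deviation)}."""
    return {name: (len(x), mean(x), stdev(x)) for name, x in samples.items()}


summary = summarize(fills)
N = sum(n for n, _, _ in summary.values())   # total number of observations
T = sum(sum(x) for x in fills.values())      # grand total
grand_mean = T / N                           # grand mean = T / N

for name, (n, xbar, s) in summary.items():
    print(f"{name}: n = {n}, mean = {xbar:.4f}, s = {s:.5f}")
print(f"N = {N}, grand mean = {grand_mean:.6f}")
```

The sample standard deviations here use the n - 1 divisor, which is what the ANOVA formulas for SSE assume.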
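Example
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$

$$\begin{aligned}
SSE &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= 7(0.0230078)^2 + 6(0.0198890)^2 + 8(0.01649579)^2 \\
&= 0.0037055 + 0.0023734 + 0.0021769 \\
&= 0.00825582
\end{aligned}$$

$$\begin{aligned}
SSTr &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 8(0.0065833)^2 + 7(-0.0174524)^2 + 9(0.0077222)^2 \\
&= 0.00034672 + 0.00213210 + 0.00053669 \\
&= 0.00301552
\end{aligned}$$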
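As a check on the hand computation, SSTr and SSE can be recomputed directly from the raw fills. This is a sketch only; the helper name `sums_of_squares` is hypothetical, and the results agree with the slide values up to rounding in the last digit.

```python
from statistics import mean, variance

# Fill amounts (oz) for machines 1, 2 and 3, copied from the example.
fills = [
    [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021],
]


def sums_of_squares(samples):
    """Return (SSTr, SSE) for a list of independent samples."""
    N = sum(len(x) for x in samples)
    grand_mean = sum(sum(x) for x in samples) / N
    # Treatment sum of squares: disparity among the sample means.
    sstr = sum(len(x) * (mean(x) - grand_mean) ** 2 for x in samples)
    # Error sum of squares: pooled within-sample variation.
    sse = sum((len(x) - 1) * variance(x) for x in samples)
    return sstr, sse


sstr, sse = sums_of_squares(fills)
print(f"SSTr = {sstr:.8f}")
print(f"SSE  = {sse:.8f}")
```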
Example
Using $SSTr = 0.00301552$, $SSE = 0.00825582$, $k = 3$ and $N = 8 + 7 + 9 = 24$:
mean square for treatments: $MSTr = \dfrac{SSTr}{k - 1} = \dfrac{0.00301552}{3 - 1} = 0.0015078$
mean square for error: $MSE = \dfrac{SSE}{N - k} = \dfrac{0.00825582}{24 - 3} = 0.00039313$
Comments
Both MSTr and MSE are quantities that are calculated from sample data. As such, both MSTr and MSE are statistics and have sampling distributions.
More specifically, when H0 is true ($\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$), $\mu_{MSTr} = \mu_{MSE}$.
However, when H0 is false, $\mu_{MSTr} > \mu_{MSE}$, and the greater the differences among the $\mu$'s, the larger $\mu_{MSTr}$ will be relative to $\mu_{MSE}$.
The Single-Factor ANOVA F Test
Null hypothesis: H0: $\mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
Alternative hypothesis: Ha: At least two of the $\mu$'s are different
Test statistic: $F = \dfrac{MSTr}{MSE}$
The Single-Factor ANOVA F Test
When H0 is true and the ANOVA assumptions are reasonable, F has an F distribution with df1 = k - 1 and df2 = N - k.
Values of F more contradictory to H0 than what was calculated are values even farther out in the upper tail, so the P-value is the area captured in the upper tail of the corresponding F curve.
Example
Consider the earlier example involving the three filling machines.
Machine 1: 12.033 11.985 12.009 12.009 12.033 12.025 12.054 12.050
Machine 2: 12.031 11.985 11.998 11.992 11.985 12.027 11.987
Machine 3: 12.034 12.021 12.038 12.058 12.001 12.020 12.029 12.011 12.021
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
$SSE = 0.00825582$, $SSTr = 0.00301552$
$MSTr = 0.0015078$, $MSE = 0.00039313$
Example
1. Let $\mu_1$, $\mu_2$ and $\mu_3$ denote the true mean amount of soda in the cans filled by machines 1, 2 and 3, respectively.
2. H0: $\mu_1 = \mu_2 = \mu_3$
3. Ha: At least two among $\mu_1$, $\mu_2$ and $\mu_3$ are different
4. Significance level: $\alpha = 0.01$
5. Test statistic: $F = \dfrac{MSTr}{MSE}$
Example
6. Looking at the comparative dotplot, it seems reasonable to assume that the distributions have the same $\sigma$'s. We shall look at the normality assumption on the next slide.*
[Figure: Dotplot for Fill by Machine. Horizontal axis: Fill, 11.99 to 12.06; one row of dots each for Machine 1, Machine 2 and Machine 3.]
*When the sample sizes are large, we can make judgments about both the equality of the standard deviations and the normality of the underlying populations with a comparative boxplot.
Example
6. (continued)
Looking at normal plots for the samples, it certainly appears reasonable to assume that the samples from Machines 1 and 3 are samples from normal distributions. Unfortunately, the normal plot for the sample from Machine 2 does not appear to be consistent with a sample from a normal population. So as to have a computational example, we shall continue and finish the test, treating the result with a "grain of salt."

[Figures: Normal Probability Plots for Machine 1, Machine 2 and Machine 3, each reporting an Anderson-Darling Normality Test.]

| Sample | N | Average | StDev | A-Squared | P-Value |
|---|---|---|---|---|---|
| Machine 1 | 8 | 12.0248 | 0.0230078 | 0.235 | 0.692 |
| Machine 2 | 7 | 12.0007 | 0.0198890 | 0.729 | 0.031 |
| Machine 3 | 9 | 12.0259 | 0.0164958 | 0.237 | 0.702 |
Example
7. Computation:
$n_1 = 8,\ \bar{x}_1 = 12.0248,\ s_1 = 0.02301$
$n_2 = 7,\ \bar{x}_2 = 12.0007,\ s_2 = 0.01989$
$n_3 = 9,\ \bar{x}_3 = 12.0259,\ s_3 = 0.01650$
$\bar{\bar{x}} = 12.018167$
$SSE = 0.00825582$, $SSTr = 0.00301552$
$MSTr = 0.0015078$, $MSE = 0.00039313$
$N = n_1 + n_2 + n_3 = 8 + 7 + 9 = 24,\ k = 3$
$F = \dfrac{MSTr}{MSE} = \dfrac{0.0015078}{0.00039313} = 3.835$
$df_1$ = treatment df = $k - 1 = 3 - 1 = 2$
$df_2$ = error df = $N - k = 24 - 3 = 21$
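The arithmetic in step 7 can be written as a small function that turns the two sums of squares into mean squares and the F statistic. This is a sketch; the function name `f_statistic` and its argument names are illustrative, and the inputs are the values quoted in the example.

```python
def f_statistic(sstr, sse, k, N):
    """Return (MSTr, MSE, F, df1, df2) for a single-factor ANOVA."""
    df1 = k - 1          # treatment degrees of freedom
    df2 = N - k          # error degrees of freedom
    mstr = sstr / df1    # mean square for treatments
    mse = sse / df2      # mean square for error
    return mstr, mse, mstr / mse, df1, df2


# Values from the filling-machine example.
mstr, mse, F, df1, df2 = f_statistic(sstr=0.00301552, sse=0.00825582, k=3, N=24)
print(f"MSTr = {mstr:.7f}, MSE = {mse:.8f}")
print(f"F = {F:.3f} with df1 = {df1}, df2 = {df2}")
```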
Example
8. P-value:
Recall that $F = \dfrac{MSTr}{MSE} = 3.835$ with $df_1 = 2$ and $df_2 = 21$.
From the F table with numerator df1 = 2 and denominator df2 = 21 we can see that
0.025 < P-value < 0.05
(Minitab reports this value to be 0.038.)

| Denominator df | Tail area | F critical value (numerator df = 2) |
|---|---|---|
| 21 | 0.100 | 2.57 |
| 21 | 0.050 | 3.47 |
| 21 | 0.025 | 4.42 |
| 21 | 0.010 | 5.78 |
| 21 | 0.001 | 9.77 |

The computed value F = 3.835 falls between 3.47 and 4.42.
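The table bounds can be checked numerically. When the numerator df is exactly 2, the upper-tail area of the F distribution has a simple closed form, $P(F > f) = (1 + 2f/df_2)^{-df_2/2}$. The sketch below relies on that special case only (it does not apply for other numerator df), and the function name is hypothetical.

```python
def f_upper_tail_df1_2(f, df2):
    """Upper-tail area P(F > f) for an F distribution with df1 = 2.

    Valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


p_value = f_upper_tail_df1_2(3.835, 21)
print(f"P-value = {p_value:.3f}")

# The tabled critical values should give back (approximately) their tail areas.
for area, crit in [(0.100, 2.57), (0.050, 3.47), (0.025, 4.42), (0.010, 5.78)]:
    print(f"table area {area:.3f}: computed {f_upper_tail_df1_2(crit, 21):.4f}")
```

The computed P-value rounds to the 0.038 reported by Minitab and lies between the 0.025 and 0.05 table entries.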
Example
9. Conclusion:
Since P-value > $\alpha$ = 0.01, we fail to reject H0. We are unable to show that the mean fills are different; the differences among the sample mean fills of the three machines are not statistically significant.
Total Sum of Squares
Total sum of squares, denoted by SSTo, is given by

$$SSTo = \sum_{\text{all } N \text{ obs.}} (x - \bar{\bar{x}})^2$$

with associated df = N - 1.
The relationship between the three sums of squares is SSTo = SSTr + SSE, which is often called the fundamental identity for single-factor ANOVA.
Informally this relation is expressed as
Total variation = Explained variation + Unexplained variation
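The fundamental identity can be verified numerically on the filling-machine data. The following sketch computes the three sums of squares independently and compares them; the variable names are illustrative.

```python
from statistics import mean, variance

# Fill amounts (oz) for machines 1, 2 and 3, copied from the earlier example.
fills = [
    [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021],
]

all_obs = [x for sample in fills for x in sample]
grand_mean = mean(all_obs)

# Each quantity is computed from its own definition.
ssto = sum((x - grand_mean) ** 2 for x in all_obs)
sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in fills)
sse = sum((len(s) - 1) * variance(s) for s in fills)

print(f"SSTo        = {ssto:.6f}  (df = {len(all_obs) - 1})")
print(f"SSTr + SSE  = {sstr + sse:.6f}")
```

Both printed values should agree with the Total row of the Minitab table for this example (0.011271 with 23 df).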
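Single-factor ANOVA Table
The following is a fairly standard way of presenting the important calculations from a single-factor ANOVA. The output from most statistical packages will contain an additional column giving the P-value.

| Source of variation | df | Sum of squares | Mean square | F |
|---|---|---|---|---|
| Treatments | k - 1 | SSTr | MSTr = SSTr/(k - 1) | F = MSTr/MSE |
| Error | N - k | SSE | MSE = SSE/(N - k) | |
| Total | N - 1 | SSTo | | |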
Single-factor ANOVA Table
The ANOVA table supplied by Minitab

One-way ANOVA: Fills versus Machine
Analysis of Variance for Fills

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Machine | 2 | 0.003016 | 0.001508 | 3.84 | 0.038 |
| Error | 21 | 0.008256 | 0.000393 | | |
| Total | 23 | 0.011271 | | | |
Another Example
A food company that sells iced tea produces 4 different flavored sweetened iced teas (lemon, raspberry, peach, green tea). A dietician working for the company needed to determine whether the current formulations gave the same mean sodium levels for the four flavors. To do so, 15 bottles of each flavor were randomly (and independently) obtained and the sodium content in milligrams (mg) per 12 ounce serving was measured. The sample data are given on the next slide. Use the data to perform an appropriate hypothesis test at the 0.05 level of significance.
Another Example
Flavor 1: 35.0 35.6 34.1 39.6 35.6 32.3 36.6 34.5 35.2 33.8 36.7 37.2 34.0 33.8 35.8
Flavor 2: 37.3 37.4 38.3 34.9 39.0 36.5 36.9 37.6 34.9 40.4 37.5 33.5 38.2 34.6 34.5
Flavor 3: 35.2 33.4 34.5 38.1 36.2 35.4 38.5 31.5 36.7 35.6 36.7 39.3 36.8 31.5 33.2
Flavor 4: 35.4 35.7 31.4 34.5 34.1 31.2 37.5 37.3 31.7 33.2 33.8 35.8
Another Example
1. Let $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ denote the true mean sodium content per 12 ounce serving of iced tea for each of the 4 flavors (lemon, raspberry, peach, green tea), in that order.
2. H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
3. Ha: At least two among $\mu_1$, $\mu_2$, $\mu_3$ and $\mu_4$ are different
4. Significance level: $\alpha = 0.05$
5. Test statistic: $F = \dfrac{MSTr}{MSE}$
Another Example
6. Looking at the following comparative boxplot, it seems reasonable to assume that the distributions have the same $\sigma$'s and that the samples are samples from normal distributions (i.e., it is reasonable to assume that the distributions of sodium content per 12 ounce serving are normal for each of the four flavors).
[Figure: Boxplots of Sodium by Flavor (means are indicated by solid circles). Axes: Sodium content (mg / 12 ounce serving), 30 to 40, against Flavor (Flavor 1, Flavor 2, Flavor 3, Flavor 4).]
Another Example
7. Computation:

| Flavor | $n_i$ | $\bar{x}_i$ | $s_i$ |
|---|---|---|---|
| Flavor 1 | 15 | 35.320 | 1.764 |
| Flavor 2 | 15 | 36.767 | 1.929 |
| Flavor 3 | 15 | 35.507 | 2.361 |
| Flavor 4 | 15 | 33.893 | 2.421 |

$\bar{\bar{x}} = 35.372$

$$\begin{aligned}
SSTr &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2 + n_4(\bar{x}_4 - \bar{\bar{x}})^2 \\
&= 15(35.320 - 35.372)^2 + 15(36.767 - 35.372)^2 + 15(35.507 - 35.372)^2 + 15(33.893 - 35.372)^2 \\
&\approx 62.32
\end{aligned}$$

Treatment df = k - 1 = 4 - 1 = 3

Another Example
7. Computation (continued):

$$\begin{aligned}
SSE &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2 \\
&= 14(1.764)^2 + 14(1.929)^2 + 14(2.361)^2 + 14(2.421)^2 \\
&= 255.74
\end{aligned}$$

Error df = N - k = 60 - 4 = 56

$$F = \frac{MSTr}{MSE} = \frac{SSTr/df_1}{SSE/df_2} = \frac{62.32/3}{255.74/56} = 4.55$$

Another Example
8. P-value:
F = 4.55 with numerator df = 3 and denominator df = 56

| Denominator df | Tail area | F critical value (numerator df = 3) |
|---|---|---|
| 60 | 0.100 | 2.18 |
| 60 | 0.050 | 2.76 |
| 60 | 0.025 | 3.34 |
| 60 | 0.010 | 4.13 |
| 60 | 0.001 | 6.17 |

The computed value F = 4.55 falls between 4.13 and 6.17.
Using df = 60 (the closest entry to 56 in the table) we find
0.001 < P-value < 0.01
Another Example
9. Conclusion:
Since P-value < $\alpha$ = 0.05, we reject H0. We can conclude that the mean sodium content is different for at least two of the flavors.
We need to learn how to interpret the results and will spend some time developing techniques to describe the differences among the $\mu$'s.
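The whole test for the iced tea example can be reproduced from the four rows of summary statistics. The sketch below uses only the standard library. The P-value is approximated by Simpson's rule applied to the incomplete beta integral, using the identity $P(F > f) = I_x(df_2/2,\ df_1/2)$ with $x = df_2/(df_2 + df_1 f)$. The function names, the `steps` parameter and the restriction to $df_2 \ge 2$ are assumptions of this sketch, not part of the original slides.

```python
import math

# (n_i, sample mean, sample sd) for Flavors 1 to 4, from the computation slide.
flavors = [(15, 35.320, 1.764), (15, 36.767, 1.929),
           (15, 35.507, 2.361), (15, 33.893, 2.421)]


def anova_from_summary(groups):
    """Return (SSTr, SSE, F, df1, df2) from (n, mean, sd) triples."""
    k = len(groups)
    N = sum(n for n, _, _ in groups)
    grand_mean = sum(n * xbar for n, xbar, _ in groups) / N
    sstr = sum(n * (xbar - grand_mean) ** 2 for n, xbar, _ in groups)
    sse = sum((n - 1) * s ** 2 for n, _, s in groups)
    df1, df2 = k - 1, N - k
    return sstr, sse, (sstr / df1) / (sse / df2), df1, df2


def f_upper_tail(f, df1, df2, steps=20000):
    """Approximate P(F > f) for f > 0 and df2 >= 2 (steps must be even)."""
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Composite Simpson's rule on [0, x].
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return (total * h / 3.0) / math.exp(log_beta)


sstr, sse, F, df1, df2 = anova_from_summary(flavors)
p = f_upper_tail(F, df1, df2)
print(f"SSTr = {sstr:.2f}, SSE = {sse:.2f}")
print(f"F = {F:.2f} with df1 = {df1}, df2 = {df2}, P-value = {p:.4f}")
```

Because the inputs are rounded summary statistics, SSTr and SSE differ slightly from the values Minitab computes from the raw data, but F still comes out at about 4.55 and the P-value falls inside the table bounds 0.001 < P-value < 0.01.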
Multiple Comparisons
A multiple comparison procedure is a method for identifying differences among the $\mu$'s once the hypothesis of overall equality (H0) has been rejected.
The technique we will present is based on computing confidence intervals for the difference of means for each pair.
Specifically, if k populations or treatments are studied, we would create k(k-1)/2 differences (e.g., with 3 treatments one would generate confidence intervals for $\mu_1 - \mu_2$, $\mu_1 - \mu_3$ and $\mu_2 - \mu_3$). Notice that it is only necessary to look at a confidence interval for $\mu_1 - \mu_2$ to see if $\mu_1$ and $\mu_2$ differ.
The Tukey-Kramer Multiple Comparison Procedure
When there are k populations or treatments being compared, k(k-1)/2 confidence intervals must be computed. If we denote the relevant Studentized range critical value by q, the intervals are as follows:
For $\mu_i - \mu_j$:

$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{MSE}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

Two means are judged to differ significantly if the corresponding interval does not include zero.
The Tukey-Kramer Multiple Comparison Procedure
When all of the sample sizes are the same, we denote the common sample size by $n = n_1 = n_2 = n_3 = \cdots = n_k$, and the confidence intervals (for $\mu_i - \mu_j$) simplify to

$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{MSE}{n}}$$
Example (continued)
Continuing with the example dealing with the sodium content for the four flavors of iced tea, we shall compute the 95% Tukey-Kramer confidence intervals for $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, $\mu_1 - \mu_4$, $\mu_2 - \mu_3$, $\mu_2 - \mu_4$ and $\mu_3 - \mu_4$.
$MSE = \dfrac{255.74}{56} = 4.567,\quad n_1 = n_2 = n_3 = n_4 = n = 15$
$q = 3.74$ (approximation with df = 60 rather than 56)
$q\sqrt{\dfrac{MSE}{n}} = 3.74\sqrt{\dfrac{4.567}{15}} = 2.06$
Example (continued)

| Difference | 95% Confidence Limits | 95% Confidence Interval |
|---|---|---|
| $\bar{x}_1 - \bar{x}_2 = -1.45$ | -1.45 ± 2.06 | (-3.51, 0.61) |
| $\bar{x}_1 - \bar{x}_3 = -0.19$ | -0.19 ± 2.06 | (-2.25, 1.87) |
| $\bar{x}_1 - \bar{x}_4 = 1.43$ | 1.43 ± 2.06 | (-0.63, 3.49) |
| $\bar{x}_2 - \bar{x}_3 = 1.26$ | 1.26 ± 2.06 | (-0.80, 3.32) |
| $\bar{x}_2 - \bar{x}_4 = 2.87$ | 2.87 ± 2.06 | (0.81, 4.93) |
| $\bar{x}_3 - \bar{x}_4 = 1.61$ | 1.61 ± 2.06 | (-0.45, 3.67) |

Notice that the confidence interval for $\mu_2 - \mu_4$ does not contain 0, so we can infer that the mean sodium contents for flavors 2 and 4 differ.
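The six intervals can be generated with the general Tukey-Kramer formula, which reduces to the equal-sample-size version when every $n_i$ is 15. This is a minimal sketch: the function name `tukey_kramer` is hypothetical, and the critical value q = 3.74 is taken from the example rather than computed.

```python
import math
from itertools import combinations

# Sample means and sizes for Flavors 1 to 4, from the example.
means = {1: 35.320, 2: 36.767, 3: 35.507, 4: 33.893}
sizes = {1: 15, 2: 15, 3: 15, 4: 15}
MSE = 255.74 / 56   # error mean square from the ANOVA
q = 3.74            # Studentized range critical value quoted in the example


def tukey_kramer(means, sizes, mse, q):
    """Return {(i, j): (lower, upper)} for every pair i < j."""
    intervals = {}
    for i, j in combinations(sorted(means), 2):
        half_width = q * math.sqrt((mse / 2.0) * (1.0 / sizes[i] + 1.0 / sizes[j]))
        diff = means[i] - means[j]
        intervals[(i, j)] = (diff - half_width, diff + half_width)
    return intervals


intervals = tukey_kramer(means, sizes, MSE, q)
for (i, j), (lo, hi) in intervals.items():
    verdict = "differ" if lo > 0 or hi < 0 else "not distinguishable"
    print(f"mu{i} - mu{j}: ({lo:.2f}, {hi:.2f})  {verdict}")
```

Only the interval for flavors 2 and 4 excludes zero, in agreement with the table above.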
Example (continued)
Notice that the confidence interval for $\mu_2 - \mu_4$ does not contain 0, so we can infer that the mean sodium contents for flavors 2 and 4 differ.
We also illustrate the differences with the following listing of the sample means in increasing order, with lines underneath those blocks of means that are indistinguishable.

| Flavor 4 | Flavor 1 | Flavor 3 | Flavor 2 |
|---|---|---|---|
| 33.893 | 35.320 | 35.507 | 36.767 |

[Underlines: one spanning Flavor 4, Flavor 1 and Flavor 3; another spanning Flavor 1, Flavor 3 and Flavor 2.]
![Page 42: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/42.jpg)
Minitab Output for Example
One-way ANOVA: Sodium versus Flavor
Analysis of Variance for Sodium

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Flavor | 3 | 62.29 | 20.76 | 4.55 | 0.006 |
| Error | 56 | 255.74 | 4.57 | | |
| Total | 59 | 318.02 | | | |

Individual 95% CIs for the mean, based on the pooled StDev (the character plot of the intervals, drawn on an axis from 33.0 to 37.5, is not reproduced here):

| Level | N | Mean | StDev |
|---|---|---|---|
| Flavor 1 | 15 | 35.320 | 1.764 |
| Flavor 2 | 15 | 36.767 | 1.929 |
| Flavor 3 | 15 | 35.507 | 2.361 |
| Flavor 4 | 15 | 33.893 | 2.421 |

Pooled StDev = 2.137
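The ANOVA entries can be rebuilt, up to rounding, from the per-flavor means and standard deviations printed in the same output. The sketch below assumes equal sample sizes (n = 15), so the grand mean is the plain average of the group means. Because the printed means and StDevs are rounded, the results agree with Minitab only to about two decimal places.

```python
import math

# Summary statistics as printed in the Minitab output.
n = 15
means = [35.320, 36.767, 35.507, 33.893]
sds = [1.764, 1.929, 2.361, 2.421]
k = len(means)

grand = sum(means) / k                           # valid because every group has n = 15
sstr = n * sum((m - grand) ** 2 for m in means)  # treatment (flavor) sum of squares
sse = (n - 1) * sum(s ** 2 for s in sds)         # error sum of squares
mstr = sstr / (k - 1)
mse = sse / (k * (n - 1))                        # error df = 56
f_stat = mstr / mse
pooled_sd = math.sqrt(mse)

print(f"SSTr = {sstr:.2f}, SSE = {sse:.2f}")
print(f"F = {f_stat:.2f}, pooled StDev = {pooled_sd:.3f}")
```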
![Page 43: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/43.jpg)
Minitab Output for Example
Tukey's pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.0106

Critical value = 3.74

Intervals for (column level mean) - (row level mean)

| | Flavor 1 | Flavor 2 | Flavor 3 |
|---|---|---|---|
| Flavor 2 | (-3.510, 0.617) | | |
| Flavor 3 | (-2.250, 1.877) | (-0.804, 3.324) | |
| Flavor 4 | (-0.637, 3.490) | (0.810, 4.937) | (-0.450, 3.677) |
![Page 44: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/44.jpg)
Simultaneous Confidence Level
The Tukey-Kramer intervals are created in a manner that controls the simultaneous confidence level.
For example, at the 95% level, if the procedure is used repeatedly on many different data sets, then in the long run only about 5% of the time would at least one of the intervals fail to include the value it is estimating.
We then speak of a family error rate of 5%, which is the maximum probability that one or more of the confidence intervals for the differences of means fails to contain the true difference of means.
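The long-run interpretation can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the original example: it draws four groups of 15 standard normal observations with equal true means (so every true difference is 0), reuses the critical value q = 3.74 from the iced tea example, and counts how often at least one interval misses 0. The function name `family_miss_rate`, the number of replications and the seed are arbitrary choices.

```python
import math
import random
from itertools import combinations


def family_miss_rate(reps=2000, k=4, n=15, q=3.74, seed=1):
    """Fraction of simulated data sets in which at least one Tukey
    interval fails to contain the true difference (0 for every pair)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(reps):
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
        means = [sum(g) / n for g in groups]
        sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
        mse = sse / (k * (n - 1))
        half_width = q * math.sqrt(mse / n)   # equal sample sizes
        if any(abs(means[i] - means[j]) > half_width
               for i, j in combinations(range(k), 2)):
            misses += 1
    return misses / reps


rate = family_miss_rate()
print(f"simulated family error rate: {rate:.3f}")
```

The simulated rate should land near 0.05, with some Monte Carlo noise, even though six intervals are formed for each data set.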
![Page 45: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/45.jpg)
Randomized Block Experiment
Suppose that experimental units (individuals or objects to which the treatments are applied) are first separated into groups consisting of k units in such a way that the units within each group are as similar as possible. Within any particular group, the treatments are then randomly allocated so that each unit in a group receives a different treatment. The groups are often called blocks and the experimental design is referred to as a randomized block design.
![Page 46: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/46.jpg)
Example
When choosing a variety of melon to plant, one thing that a farmer might be interested in is the length of time (in days) for the variety to bear harvestable fruit. Since the growing conditions (soil, temperature, humidity) also affect this, a farmer might experiment with three hybrid melons (denoted hybrid A, hybrid B and hybrid C) by taking each of the four fields that he wants to use for growing melons and subdividing each field into 3 subplots (1, 2 and 3) and then planting each hybrid in one subplot of each field. The blocks are the fields and the treatments are the hybrid that is planted. The question of interest would be “Are the mean times to bring harvestable fruit the same for all three hybrids?”
![Page 47: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/47.jpg)
Assumptions and Hypotheses
The single observation made on any particular treatment in a given block is assumed to be selected from a normal distribution. The variance of this distribution is $\sigma^2$, the same for each block-treatment combination. However, the mean value may depend separately both on the treatment applied and on the block. The hypotheses of interest are as follows:
H0: The mean value does not depend on which treatment is applied
Ha: The mean value does depend on which treatment is applied
![Page 48: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/48.jpg)
Summary of the Randomized Block F Test
Notation: Let
k = number of treatments
l = number of blocks
$\bar{x}_i$ = average of all observations for treatment i
$\bar{b}_i$ = average of all observations in block i
$\bar{\bar{x}}$ = average of all kl observations in the experiment (the grand mean)
![Page 49: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/49.jpg)
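Summary of the Randomized Block F Test
Sums of squares and associated df's are as follows.

| Sum of Squares | Symbol | df | Formula |
|---|---|---|---|
| Treatments | SSTr | k - 1 | $l\left[(\bar{x}_1 - \bar{\bar{x}})^2 + (\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{x}_k - \bar{\bar{x}})^2\right]$ |
| Blocks | SSBl | l - 1 | $k\left[(\bar{b}_1 - \bar{\bar{x}})^2 + (\bar{b}_2 - \bar{\bar{x}})^2 + \cdots + (\bar{b}_l - \bar{\bar{x}})^2\right]$ |
| Error | SSE | (k - 1)(l - 1) | by subtraction |
| Total | SSTo | kl - 1 | $\sum_{\text{all } kl \text{ obs.}} (x - \bar{\bar{x}})^2$ |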
![Page 50: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/50.jpg)
Summary of the Randomized Block F Test
SSE is obtained by subtraction through the use of the fundamental identity
SSTo = SSTr + SSBl + SSE
Test statistic:
$$F = \frac{MSTr}{MSE}$$
where
$$MSTr = \frac{SSTr}{k - 1} \quad \text{and} \quad MSE = \frac{SSE}{(k - 1)(l - 1)}$$
The test is based on df1 = k - 1 and df2 = (k - 1)(l - 1).
![Page 51: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/51.jpg)
The ANOVA Table for a Randomized Block Experiment

| Source of Variation | df | Sum of Squares | Mean Square | F |
|---|---|---|---|---|
| Treatments | k - 1 | SSTr | $MSTr = \frac{SSTr}{k - 1}$ | $F = \frac{MSTr}{MSE}$ |
| Blocks | l - 1 | SSBl | $MSBl = \frac{SSBl}{l - 1}$ | |
| Error | (k - 1)(l - 1) | SSE | $MSE = \frac{SSE}{(k - 1)(l - 1)}$ | |
| Total | kl - 1 | SSTo | | |
![Page 52: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/52.jpg)
Example (Food Prices)
In an attempt to determine which of 3 grocery chains has the best overall prices, it was felt that prices would vary a great deal if items were randomly selected from each of the chains, so a randomized block experiment was devised to answer the question. A list of standard items was developed (typically a fairly large list would be used, but in the interest of providing a small example, 15 items were chosen), and the price was supposed to be recorded for each of these items in each of the stores. There were problems with the collection of data, so only 7 items appeared in all the stores. The data are given on the next slide.
![Page 53: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/53.jpg)
Example (Food Prices)
Prices are in dollars.

| Product | Tops | Wegmans | Walmart |
|---|---|---|---|
| Tide (100 oz liquid detergent) | 6.39 | 5.59 | 5.24 |
| 1 lb Land O'Lakes Butter | 3.99 | 3.49 | 2.98 |
| 1 dozen Large Grade AA eggs | 1.49 | 1.49 | 0.72 |
| Tropicana (no pulp, non-conc) OJ (64 oz) | 3.99 | 2.99 | 2.50 |
| 2 Liter Diet Coke | 1.39 | 1.50 | 1.04 |
| 1 loaf Wonderbread | 2.09 | 2.09 | 1.43 |
| 18 oz jar Skippy Peanut Butter | 2.49 | 2.49 | 1.77 |
![Page 54: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/54.jpg)
ANOVA Table
The resulting P-value for the treatments (stores) is reported as 0.000 (that is, less than 0.0005). Our interpretation is that the stores have different mean costs for the selected items. We need to do some multiple comparisons to determine the actual differences.

| Source of Variation | df | Sum of Squares | Mean Square | F |
|---|---|---|---|---|
| Treatments (Store) | 2 | 2.7762 | 1.3881 | 22.85 |
| Blocks | 6 | 45.1303 | 7.5217 | |
| Error | 12 | 0.7289 | 0.0607 | |
| Total | 20 | 48.6355 | | |
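The table entries can be reproduced directly from the price data with the sum-of-squares formulas for a randomized block design. The following is a minimal sketch in plain Python; the function name `randomized_block_anova` and the dictionary keys are illustrative choices, while the prices are those listed in the example (columns: Tops, Wegmans, Walmart).

```python
# Each row is one block (product); columns are the treatments (stores).
prices = [
    (6.39, 5.59, 5.24),  # Tide
    (3.99, 3.49, 2.98),  # butter
    (1.49, 1.49, 0.72),  # eggs
    (3.99, 2.99, 2.50),  # orange juice
    (1.39, 1.50, 1.04),  # Diet Coke
    (2.09, 2.09, 1.43),  # bread
    (2.49, 2.49, 1.77),  # peanut butter
]


def randomized_block_anova(rows):
    """Sums of squares and F for l blocks (rows) by k treatments (columns)."""
    l, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (k * l)
    trt_means = [sum(r[j] for r in rows) / l for j in range(k)]
    blk_means = [sum(r) / k for r in rows]
    sstr = l * sum((m - grand) ** 2 for m in trt_means)
    ssbl = k * sum((b - grand) ** 2 for b in blk_means)
    ssto = sum((x - grand) ** 2 for r in rows for x in r)
    sse = ssto - sstr - ssbl                  # by subtraction
    mstr = sstr / (k - 1)
    mse = sse / ((k - 1) * (l - 1))
    return {"SSTr": sstr, "SSBl": ssbl, "SSE": sse, "SSTo": ssto,
            "MSTr": mstr, "MSE": mse, "F": mstr / mse}


result = randomized_block_anova(prices)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Most of the total variation comes from the blocks (the products), which is the reason for blocking: removing it leaves a small error mean square against which the store differences stand out.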
![Page 55: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/55.jpg)
Multiple Comparisons
As in single-factor ANOVA, once H0 has been rejected, declare that treatments i and j differ significantly if the interval
$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{MSE}{l}}$$
does not include zero, where q is based on a comparison of k treatments and error df = (k - 1)(l - 1).
![Page 56: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/56.jpg)
Example

| Store | Mean Price (dollars) |
|---|---|
| Walmart | 2.240 |
| Wegmans | 2.806 |
| Tops | 3.119 |

With 3 populations and error df = 12, the q value from the Table of Critical Values for the Studentized range distribution for a 95% Tukey confidence interval is 3.77, so
$$q\sqrt{\frac{MSE}{l}} = 3.77\sqrt{\frac{0.0607}{7}} = 0.351$$
and the intervals are

| Difference (dollars) | Interval |
|---|---|
| $\bar{x}_{Weg} - \bar{x}_{Wal} = 0.566$ | (0.215, 0.917) |
| $\bar{x}_{Tops} - \bar{x}_{Wal} = 0.879$ | (0.528, 1.230) |
| $\bar{x}_{Tops} - \bar{x}_{Weg} = 0.313$ | (-0.038, 0.664) |
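As a quick check of the half-width and the three intervals, here is a sketch that uses only the values quoted in this example (store means, MSE = 0.0607, l = 7 blocks, q = 3.77); the variable names are illustrative.

```python
import math

means = {"Walmart": 2.240, "Wegmans": 2.806, "Tops": 3.119}
mse, l, q = 0.0607, 7, 3.77

# In the randomized block setting every treatment mean averages l observations.
half_width = q * math.sqrt(mse / l)

pairs = [("Wegmans", "Walmart"), ("Tops", "Walmart"), ("Tops", "Wegmans")]
intervals = {}
for a, b in pairs:
    diff = means[a] - means[b]
    intervals[(a, b)] = (diff - half_width, diff + half_width)
    lo, hi = intervals[(a, b)]
    verdict = "differ" if lo > 0 or hi < 0 else "no significant difference"
    print(f"{a} - {b}: ({lo:.3f}, {hi:.3f})  {verdict}")
```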
![Page 57: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/57.jpg)
Example
The resulting schematic display is

| Walmart | Wegmans | Tops |
|---|---|---|
| 2.240 | 2.806 | 3.119 |

A single line runs under Wegmans and Tops, marking them as not significantly different; Walmart stands alone.

With a P-value of 0.000 for the F test and the Tukey intervals for the pairwise differences, we have established that the mean cost of the selected items is lower at Walmart than at Wegmans or Tops, but we have not shown a significant difference between Wegmans and Tops.
![Page 58: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/58.jpg)
Two-Factor ANOVA
Notation:
k = number of levels of factor A
l = number of levels of factor B
kl = number of treatments (each one a combination of a factor A level and a factor B level)
m = number of observations on each treatment
![Page 59: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/59.jpg)
Two-Factor ANOVA Example
A grocery store has two stocking supervisors, Fred and Wilma. The store is open 24 hours a day and would like to schedule these two individuals in the most effective manner. To help determine how to schedule them, a sample of their work was obtained by scheduling each of them 5 times on each of the three shifts and tracking the number of cases of groceries that were emptied and stacked during each shift. The data follow on the next slide.
![Page 60: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/60.jpg)
Two-Factor ANOVA Example
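Number of cases, by supervisor and shift (5 observations per cell):

| Supervisor | Day | Swing | Night |
|---|---|---|---|
| Fred | 495, 547, 481, 607, 517 | 457, 500, 578, 515, 428 | 504, 496, 485, 518, 497 |
| Wilma | 481, 520, 498, 533, 507 | 508, 471, 560, 518, 578 | 572, 550, 583, 625, 598 |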
![Page 61: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/61.jpg)
Interactions
There is said to be an interaction between the factors if the change in true average response when the level of one factor changes depends on the level of the other factor.
One can look at the possible interaction between two factors by drawing an interactions plot, which is a graph of the means of the response for one factor plotted against the values of the other factor.
![Page 62: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/62.jpg)
Two-Factor ANOVA Example
A table of the sample means for the 30 observations.

| Supervisor | Day | Swing | Night | Mean Output for Each Supervisor |
|---|---|---|---|---|
| Fred | 529.40 | 495.60 | 500.00 | 508.33 |
| Wilma | 507.80 | 527.00 | 585.60 | 540.13 |
| Mean Output for Each Shift | 518.60 | 511.30 | 542.80 | 524.23 |
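The cell means are the points drawn in an interaction plot, and they can be recomputed from the raw counts. In the sketch below, the assignment of the 30 values to supervisor-shift cells is reconstructed from the flattened data slide; it is treated as an assumption, although it does reproduce every cell mean in the table. The dictionary `gap` (Wilma's mean minus Fred's mean on each shift) is an illustrative device: if there were no interaction the gap would be about the same on every shift.

```python
cases = {
    ("Fred", "Day"):    [495, 547, 481, 607, 517],
    ("Fred", "Swing"):  [457, 500, 578, 515, 428],
    ("Fred", "Night"):  [504, 496, 485, 518, 497],
    ("Wilma", "Day"):   [481, 520, 498, 533, 507],
    ("Wilma", "Swing"): [508, 471, 560, 518, 578],
    ("Wilma", "Night"): [572, 550, 583, 625, 598],
}
shifts = ["Day", "Swing", "Night"]

# Sample mean for each supervisor-shift combination.
cell_mean = {key: sum(vals) / len(vals) for key, vals in cases.items()}

# Difference between the supervisors on each shift.
gap = {s: cell_mean[("Wilma", s)] - cell_mean[("Fred", s)] for s in shifts}

for s in shifts:
    print(f"{s:5s}  Fred {cell_mean[('Fred', s)]:.2f}  "
          f"Wilma {cell_mean[('Wilma', s)]:.2f}  gap {gap[s]:+.2f}")
```

The gap changes sign between the day shift and the other two shifts, which is what non-parallel lines in the interaction plot look like numerically. Whether the change is larger than chance variation is a question for the F test.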
![Page 63: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/63.jpg)
Two-Factor ANOVA Example
Typically, only one of these interaction plots will be constructed. As you can see from these diagrams, there is a suggestion that Fred does better during the day and Wilma is better at night or during the swing shift. The question to ask is “Are these differences significant?” Specifically, is there an interaction between the supervisor and the shift?
[Figure: two interaction plots, each titled "Interaction Plot - Data Means for Cases", with the mean number of cases (500 to 590) on the vertical axis. The first plots the mean against Shift (Day, Night, Swing) with one line per Supervisor (Fred, Wilma); the second plots the mean against Supervisor (Fred, Wilma) with one line per Shift.]
![Page 64: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/64.jpg)
Interactions
If the graphs of true average responses are connected line segments that are parallel, there is no interaction between the factors. In this case, the change in true average response when the level of one factor is changed is the same for each level of the other factor. Special cases of no interaction are as follows:
1. The true average response is the same for each level of factor A (no factor A main effects).
2. The true average response is the same for each level of factor B (no factor B main effects).
![Page 65: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/65.jpg)
Basic Assumptions for Two-Factor ANOVA
The observations on any particular treatment are independently selected from a normal distribution with variance $\sigma^2$ (the same variance for each treatment), and samples from different treatments are independent of one another.
![Page 66: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/66.jpg)
Two-Factor ANOVA Table
The following is a fairly standard way of presenting the important calculations for a two-factor ANOVA.
The fundamental identity is
SSTo = SSA + SSB + SSAB + SSE
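For reference, the usual textbook layout of the two-factor table can be written with the notation k, l and m introduced earlier. This is the standard form rather than a copy of the original slide; the symbols MSA, MSB, MSAB and the subscripted F ratios are the conventional names and are assumptions here.

```latex
\begin{array}{lllll}
\text{Source} & \text{df} & \text{Sum of squares} & \text{Mean square} & F \\
\hline
\text{Factor A} & k-1 & SSA & MSA = \dfrac{SSA}{k-1} & F_A = \dfrac{MSA}{MSE} \\[2ex]
\text{Factor B} & l-1 & SSB & MSB = \dfrac{SSB}{l-1} & F_B = \dfrac{MSB}{MSE} \\[2ex]
\text{AB interaction} & (k-1)(l-1) & SSAB & MSAB = \dfrac{SSAB}{(k-1)(l-1)} & F_{AB} = \dfrac{MSAB}{MSE} \\[2ex]
\text{Error} & kl(m-1) & SSE & MSE = \dfrac{SSE}{kl(m-1)} & \\[2ex]
\text{Total} & klm-1 & SSTo & &
\end{array}
```

The degrees of freedom add up in the same way as the sums of squares: (k - 1) + (l - 1) + (k - 1)(l - 1) + kl(m - 1) = klm - 1.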
![Page 67: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/67.jpg)
Two-Factor ANOVA Example
Minitab output for the Two-Factor ANOVA
Two-way ANOVA: Cases versus Shift, Supervisor

Analysis of Variance for Cases

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Shift | 2 | 5437 | 2719 | 1.82 | 0.184 |
| Supervis | 1 | 7584 | 7584 | 5.07 | 0.034 |
| Interaction | 2 | 14365 | 7183 | 4.80 | 0.018 |
| Error | 24 | 35878 | 1495 | | |
| Total | 29 | 63265 | | | |
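1. Test of H0: no interaction between supervisor and shift
With a P-value of 0.018, there is strong evidence of an interaction.
We go no further and draw the conclusion that there is an interaction. When an interaction is present, the effect of one factor depends on the level of the other, so the main-effect tests are not interpreted on their own.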
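The Minitab sums of squares can be reproduced from the raw counts with a few lines of plain Python. In this sketch, factor A is taken to be the supervisor and factor B the shift (an arbitrary labeling, since the notation does not say which is which), the cell assignment of the 30 values is reconstructed from the flattened data slide, and the function name `two_factor_anova` is illustrative.

```python
# data[a][b] holds the m observations for supervisor a and shift b.
# Rows: Fred, Wilma. Columns: Day, Swing, Night.
data = [
    [[495, 547, 481, 607, 517], [457, 500, 578, 515, 428], [504, 496, 485, 518, 497]],
    [[481, 520, 498, 533, 507], [508, 471, 560, 518, 578], [572, 550, 583, 625, 598]],
]


def two_factor_anova(data):
    """Balanced two-factor ANOVA with m observations in each of k*l cells."""
    k, l, m = len(data), len(data[0]), len(data[0][0])
    obs = [x for row in data for cell in row for x in cell]
    grand = sum(obs) / len(obs)
    cell = [[sum(c) / m for c in row] for row in data]
    a_means = [sum(row) / l for row in cell]
    b_means = [sum(cell[i][j] for i in range(k)) / k for j in range(l)]
    ssa = l * m * sum((a - grand) ** 2 for a in a_means)
    ssb = k * m * sum((b - grand) ** 2 for b in b_means)
    ss_cells = m * sum((c - grand) ** 2 for row in cell for c in row)
    ssab = ss_cells - ssa - ssb               # interaction
    ssto = sum((x - grand) ** 2 for x in obs)
    sse = ssto - ss_cells                     # within-cell variation
    mse = sse / (k * l * (m - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse, "SSTo": ssto,
        "F_A": (ssa / (k - 1)) / mse,
        "F_B": (ssb / (l - 1)) / mse,
        "F_AB": (ssab / ((k - 1) * (l - 1))) / mse,
    }


anova = two_factor_anova(data)
for name, value in anova.items():
    print(f"{name}: {value:.2f}")
```

The P-values in the Minitab output come from the F distribution with the corresponding degrees of freedom. They are not computed here because that requires an F-distribution routine outside the standard library.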
![Page 68: Chapter 15 The Analysis of Variance. 2 A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or.](https://reader035.fdocuments.us/reader035/viewer/2022070403/56649f2c5503460f94c471cd/html5/thumbnails/68.jpg)
Two-Factor ANOVA Example
Looking at either of the interaction plots, it becomes clear that from a practical standpoint, Fred should be scheduled for days, and if someone is to be scheduled for the night or swing shifts it should be Wilma, although it appears that Wilma would do best at night.
[Figure: "Interaction Plot - Data Means for Cases". Mean number of cases (500 to 590) plotted against Shift (Day, Night, Swing), with one line per Supervisor (Fred, Wilma).]