Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA...
Transcript of Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA...
![Page 1: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/1.jpg)
19-1
Lecture 19
Introduction to ANOVA
STAT 512
Spring 2011
Background Reading
KNNL: 15.1-15.3, 16.1-16.2
![Page 2: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/2.jpg)
19-2
Topic Overview
• Categorical Variables
• Analysis of Variance
• Lots of Terminology
• An ANOVA example
![Page 3: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/3.jpg)
19-3
Categorical Variables
• To this point, with the exception of the last
lecture, all explanatory variables have been
quantitative; e.g. comparing X = 3 to X = 5
makes sense numerically
• For categorical or qualitative variables there
is no ‘numerical’ labeling; or if there is, it
isn’t meaningful.
![Page 4: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/4.jpg)
19-4
Example
• Five medical treatments – ten subjects on
each treatment.
• Goal: Compare the treatments in terms of
their effectiveness
� If there were two treatments, what would
we use?
![Page 5: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/5.jpg)
19-5
ANOVA
• ANOVA = Analysis of Variance
• Compare means among treatment groups,
without assuming any parametric
relationships (regression does assume such
a relationship).
• Example: Price vs. Sales Volume
![Page 6: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/6.jpg)
19-6
Regression Model
![Page 7: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/7.jpg)
19-7
ANOVA Model
KEY DIFFERENCE: No assumption is made
about the manner in which Price and Sales
Volume are related.
![Page 8: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/8.jpg)
19-8
Similarities to Regression
• Assumptions on errors identical as to
regression
• We assume each population is normal and
the variances are identical. We also
assume independence.
• Can get “predicted values” for each group,
as well as CI’s.
![Page 9: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/9.jpg)
19-9
Differences
• No specific relationship is assumed.
• Goal becomes: look for differences among
the groups.
![Page 10: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/10.jpg)
19-10
Terminology
• We may refer to any qualitative predictor
variable as a factor.
• Each factor has a certain number of levels.
• Experimental factors are “set” or
“assigned” to the experimental units;
observational factors are characteristics of
the experimental units that cannot be
assigned.
![Page 11: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/11.jpg)
19-11
Terminology (2)
• Factors are qualitative if they represent traits
that could not be placed in some logical
numerical order.
� GENDER, BRAND, DRUG
• Factors are quantitative if levels are
described by numerical quantities on an
equal interval scale.
� AGE, TEMPERATURE
![Page 12: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/12.jpg)
19-12
Terminology (3)
• A Treatment is a specific experimental
condition (determined by factors and levels
of each factor).
• The Experimental Unit (Basic Unit of
Study) is the smallest unit to which a
treatment can be assigned.
• A design is called balanced if each
treatment is replicated the same number of
times (i.e. same number of EU’s per
treatment).
![Page 13: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/13.jpg)
19-13
Examples
Five medications – each used for 10 subjects
• Medication is an experimental factor; EU is the subject
(person) receiving the medication.
• There are five treatments, which may or may not have
any logical “ordering”
• Design is balanced (generally) since we are able to
assign the treatments.
Ten age groups – 50 subjects
• Age is an observational, quantitative factor; subject is
again the EU; Design is probably not balanced
![Page 14: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/14.jpg)
19-14
Examples (2)
Blood Type
• Observational factor
• Qualitative factor
• Again design probably not balanced
Brand of Product
• Observational, qualitative factor
• Design likely balanced by arrangement
![Page 15: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/15.jpg)
19-15
Multiple Factors
• With two or more factors, each combination
of levels is generally called a treatment
combination
• Can treat as single variable if desired
• Example: Blood Type * Medication
� 4 blood types
� 5 medications
� 20 treatment combinations
![Page 16: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/16.jpg)
19-16
Crossed Factors
• Two factors are crossed if all factor
combinations are represented.
• Example: Blood Type * Medication
1 2 3 4 5
A xx xx xx xx xx
B xx xx xx xx xx
AB xx xx xx xx xx
O xx xx xx xx xx
Note: This type of table is called a design
chart.
![Page 17: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/17.jpg)
19-17
Nested Factors
• One factor has levels that are unique to a
given level of another factor
• Example: Plant * Operator
Plant #1 Plant #2 Plant #3
Op #1
Op #2
Op #3
Op #4
Op #5
Op #6
Op #7
Op #8
Op #9
• We say: Operators are nested within
manufacturing plants.
![Page 18: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/18.jpg)
19-18
Control Groups
• Often a control or placebo treatment is used.
This treatment is more of a “standard” than
a treatment, as it is the case of no treatment
at all.
• Comparing treatments to controls can be a
very effective way of showing that a
treatment is effective.
![Page 19: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/19.jpg)
19-19
Fixed vs. Random Factors
• For the most part, we will consider only
fixed effect models in this class. A factor
is called fixed because the levels are
chosen in advance of the experiment and
we were interested in differences in
response among those specific levels.
• Note: Random factors will need to be
treated differently, since their levels are
chosen randomly from a large population
of possible levels.
![Page 20: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/20.jpg)
19-20
Randomization
• Completely separate concept from random
effects.
• In an experimental study, generally want to
avoid any potential bias in the design by
randomizing treatments to experimental
units whenever possible.
• Randomization may be constrained.
Example: Have 100 people, 50 men and
50 women. Randomly assign each of the 5
treatments to 10 men and 10 women.
![Page 21: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/21.jpg)
19-21
Experimental Designs
• Completely Randomized Design
• Factorial Experiments
• Randomized Complete Block Designs
• Nested Designs
• Repeated Measures Designs
• Incomplete Block Designs
• We’ll discuss some of these. More thorough
experimental design course: STAT 514.
![Page 22: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/22.jpg)
19-22
Example
• Kenton Food Company Example (p685)
• Compare four different package designs
(numbered 1, 2, 3, 4 in no particular order)
• Response: # of cases sold
• 20 stores, but one was destroyed by fire
during the study; 19 observations
• SAS file: kenton.sas
![Page 23: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/23.jpg)
19-23
Data
Design 1 Design 2 Design 3 Design 4
11
17
16
14
15
12
10
15
19
11
23
20
18
17
27
33
22
26
28
![Page 24: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/24.jpg)
19-24
Scatter Plot
![Page 25: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/25.jpg)
19-25
ANOVA Code (SAS)
proc glm data=kenton; class design; model cases=design; means design /bon lines cldiff;
• Class statement identifies ALL categorical
variables (separate by spaces as in model)
• Means statement requests comparisons of
the group means (lots of options)
![Page 26: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/26.jpg)
19-26
Output
Source DF SS MS F Value Pr > F
Model 3 588 196 18.59 <.0001
Error 15 158 10.5
Total 18 746
R-Square Coeff Var Root MSE cases Mean
0.788055 17.43042 3.247563 18.63158
![Page 27: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/27.jpg)
19-27
Output (2)
Bonferroni (Dunn) t Tests for cases
NOTE: This test controls the Type I
experimentwise error rate, but it generally
has a higher Type II error rate than Tukey's
for all pairwise comparisons.
Alpha 0.05
Error Degrees of Freedom 15
Error Mean Square 10.54667
Critical Value of t 3.03628
Comparisons significant at the 0.05 level
are indicated by ***.
![Page 28: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/28.jpg)
19-28
Output (3)
design Difference Simultaneous 95%
Comparison Means Confidence Limits
4 - 3 7.700 1.085 14.315 ***
4 - 1 12.600 6.364 18.836 ***
4 - 2 13.800 7.564 20.036 ***
3 - 4 -7.700 -14.315 -1.085 ***
3 - 1 4.900 -1.715 11.515
3 - 2 6.100 -0.515 12.715
1 - 4 -12.600 -18.836 -6.364 ***
1 - 3 -4.900 -11.515 1.715
1 - 2 1.200 -5.036 7.436
2 - 4 -13.800 -20.036 -7.564 ***
2 - 3 -6.100 -12.715 0.515
2 - 1 -1.200 -7.436 5.036
![Page 29: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/29.jpg)
19-29
Output (4)
Group Mean N design
A 27.200 5 4
B 19.500 4 3
B
B 14.600 5 1
B
B 13.400 5 2
![Page 30: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/30.jpg)
19-30
Assumptions
• Should always check normality, constancy
of variance assumptions
• Plots to check these are as before
• No obvious problems for this dataset
![Page 31: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/31.jpg)
19-31
Residual Plot
![Page 32: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/32.jpg)
19-32
Normal QQ Plot
![Page 33: Lecture 19 Introduction to ANOVA - Purdue · PDF file19-1 Lecture 19 Introduction to ANOVA STAT 512 Spring 2011 Background Reading KNNL: 15.1-15.3, 16.1-16.2](https://reader033.fdocuments.us/reader033/viewer/2022051203/5aab80877f8b9a2b4c8c1901/html5/thumbnails/33.jpg)
19-33
Upcoming in Lecture 20...
• ANOVA Model I (Cell Means)
• Sections 16.3 – 16.6