Analysis of Covariance - Southern Methodist...
Theory Henson Data Another Example Gain Score Model
Analysis of Covariance
Dr. J. Kyle Roberts
Southern Methodist University
Simmons School of Education and Human Development
Department of Teaching and Learning
ANCOVA Theory
• The thought behind the analysis of covariance is that someone might want to conduct an analysis in which they “control” for certain variables before doing an ANOVA.
• This type of analysis is frequently used in the psychological literature. In education, this might happen when we want to test for differences in student mathematics ability among ethnic groups while controlling for student reading level.
• ANCOVA “combines” regression and ANOVA in that we do an ANOVA on the residualized (sort of) results from the linear model.
• This is like doing regression and then ANOVA.
ANCOVA Theory, cont.
• Glass and Stanley (1970) said that ANCOVA can be thought of as “an analysis of variance performed on the (Y − Ŷ) scores where the Ŷs are predicted in the usual b1X + b0 way” (p. 499).
• Thompson (2006) gives three very strong cautions:
  1. Homogeneity of regression assumption must be met.
  2. Covariate data must be extremely reliable.
  3. Residualized dependent variable scores must be interpretable.
The ANCOVA Model
• The one-factor ANCOVA fixed-effects model can be written as:
$$Y_{ij} = \mu_Y + \alpha_j + \beta_w (X_{ij} - \mu_X) + \varepsilon_{ij}$$
where
• $Y_{ij}$ is the dependent variable score for individual i in group j
• $\mu_Y$ is the grand mean of the dv
• $\alpha_j$ is the group effect for group j
• $\beta_w$ is the within-groups regression slope
• $X_{ij}$ is the observed score on the covariate for individual i in group j
• $\mu_X$ is the grand mean of the covariate
• $\varepsilon_{ij}$ is the random residual error
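As a minimal sketch of how the terms of this model map onto a fitted lm object, the following uses simulated data. The names x, y, and g and all generating values are hypothetical and are not the Henson data. It checks that the covariate coefficient in the ANCOVA fit is the pooled within-groups slope, and it computes adjusted means by moving each group mean along that common slope to the grand mean of the covariate.

```r
# Hypothetical simulated data: two groups, one covariate
set.seed(1)
n <- 20
g <- factor(rep(c("a", "b"), each = n))
x <- rnorm(2 * n, mean = ifelse(g == "a", 45, 55), sd = 5)
y <- 10 + 0.8 * x + ifelse(g == "b", 4, 0) + rnorm(2 * n, sd = 3)

fit <- lm(y ~ x + g)           # covariate first, factor last
b_w <- unname(coef(fit)["x"])  # estimate of the within-groups slope

# Pooled within-groups slope by hand:
# within-group cross-products divided by within-group SS of x
xc <- x - ave(x, g)            # x centered within each group
yc <- y - ave(y, g)            # y centered within each group
b_w_hand <- sum(xc * yc) / sum(xc^2)

# Adjusted means: group mean of y shifted along the common slope
# to the grand mean of the covariate
adj_means <- tapply(y, g, mean) - b_w * (tapply(x, g, mean) - mean(x))

all.equal(b_w, b_w_hand)
adj_means
```

With two groups, the difference between the two adjusted means equals the group coefficient reported by coef(fit).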
Null Hypothesis and Assumptions
• For the ANCOVA, the null hypothesis is stated as:
$$H_0: \mu'_{.1} = \mu'_{.2} = \cdots = \mu'_{.J}$$
where $\mu'_{.j}$ is the adjusted mean of the dv for group j in the presence of the covariates.
• Assumptions
  1. Random and independent errors
  2. Homogeneity of variance
  3. Homogeneity of regression
$$F = \frac{(SS_{\text{with(adj)}} - SS_{\text{res}})/(J - 1)}{SS_{\text{res}}/(N - 2J)}$$
where $SS_{\text{res}}$ is the sum of squared residuals
$$SS_{\text{res}} = \sum_{j=1}^{J} SS_j (1 - r_j^2)$$
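The F ratio above can be read as a test of the homogeneity of regression assumption: it compares the residual sum of squares from a common-slope model with the residual sum of squares from separate within-group regressions. The sketch below computes it by hand on simulated data and compares it with a model comparison from anova(). All variable names and values are hypothetical, and the mapping of SSwith(adj) onto the common-slope residual sum of squares is an interpretation of the slide, not something it states.

```r
# Hypothetical simulated data: J = 2 groups, N = 40 cases
set.seed(2)
g <- factor(rep(c("a", "b"), each = 20))
x <- rnorm(40, mean = 50, sd = 8)
y <- 5 + 0.9 * x + ifelse(g == "b", 3, 0) + rnorm(40, sd = 4)
N <- length(y)
J <- nlevels(g)

# SS_with(adj): residual SS from the common-slope ANCOVA model (assumed mapping)
ss_with_adj <- sum(resid(lm(y ~ x + g))^2)

# SS_res: sum over groups of SS_j * (1 - r_j^2), where SS_j is the
# within-group sum of squares of y and r_j the within-group correlation
ss_res <- sum(sapply(levels(g), function(lev) {
  yj <- y[g == lev]
  xj <- x[g == lev]
  sum((yj - mean(yj))^2) * (1 - cor(xj, yj)^2)
}))

F_hand <- ((ss_with_adj - ss_res) / (J - 1)) / (ss_res / (N - 2 * J))
p_hand <- pf(F_hand, J - 1, N - 2 * J, lower.tail = FALSE)

# Same test via model comparison: common slope vs. separate slopes
cmp <- anova(lm(y ~ x + g), lm(y ~ x * g))
c(F_hand = F_hand, F_anova = cmp$F[2])
```

A small F (large p-value) gives no evidence against a common slope; it does not prove the slopes are equal.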
Henson (1998) - ANCOVA with Intact Groups
We are working with the Henson (1998) data. The data are at http://faculty.smu.edu/kyler/courses/7312/henson.txt
> henson <- read.table("henson.txt", header = TRUE)
> str(henson)
'data.frame': 12 obs. of 3 variables:
$ read : int 30 30 40 40 45 45 50 50 55 55 ...
$ achieve: int 34 36 46 46 49 50 60 68 67 70 ...
$ edu : Factor w/ 2 levels "regular ed","special ed": 2 2 2 2 2 2 1 1 1 1 ...
> head(henson)
read achieve edu
1 30 34 special ed
2 30 36 special ed
3 40 46 special ed
4 40 46 special ed
5 45 49 special ed
6 45 50 special ed
Graphing Data
[Figure: two scatterplot panels of achieve (y-axis, 40 to 80) against read (x-axis, 30 to 60)]
Graphing Data (again)
[Figure: scatterplot of achieve (y-axis, 40 to 80) against read (x-axis, 30 to 65)]
First running the ANOVA
> m0 <- aov(achieve ~ edu, henson)
> anova(m0)
Analysis of Variance Table
Response: achieve
Df Sum Sq Mean Sq F value Pr(>F)
edu 1 2380.08 2380.08 36.458 0.0001256
Residuals 10 652.83 65.28
Running the ANCOVA
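In the ANCOVA case, it is imperative that you list the covariate(s) first in lm and the factor(s) at the end, because anova() reports sequential sums of squares: each term is tested after adjusting only for the terms listed before it.
> m1 <- lm(achieve ~ read + edu, henson)
> anova(m1)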
Analysis of Variance Table
Response: achieve
Df Sum Sq Mean Sq F value Pr(>F)
read 1 2912.04 2912.04 397.3828 9.346e-09
edu 1 54.93 54.93 7.4953 0.02293
Residuals 9 65.95 7.33
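To illustrate why the order of terms matters, here is a small sketch on simulated data (hypothetical names and values, not the Henson data). It fits the same two predictors in both orders and compares the sum of squares credited to the factor. As an order-independent alternative it also calls drop1(), which tests each term after all the others.

```r
# Hypothetical simulated data in which the groups differ on the covariate
set.seed(3)
g <- factor(rep(c("a", "b"), each = 15))
x <- rnorm(30, mean = ifelse(g == "a", 40, 55), sd = 6)
y <- 2 + 1.1 * x + ifelse(g == "b", -4, 0) + rnorm(30, sd = 3)

cov_first    <- anova(lm(y ~ x + g))  # g tested after adjusting for x
factor_first <- anova(lm(y ~ g + x))  # g tested with no adjustment

cov_first["g", "Sum Sq"]     # adjusted (ANCOVA) sum of squares for g
factor_first["g", "Sum Sq"]  # same as the one-way ANOVA sum of squares

# Order-independent alternative: each term tested after all the others
each_last <- drop1(lm(y ~ x + g), test = "F")
each_last
```

The residual line is identical in both orderings; only the split of the explained sum of squares between the covariate and the factor changes.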
Testing the Homogeneity of Regression Assumption
> m2 <- lm(achieve ~ read * edu, henson)
> summary(m2)
Call:
lm(formula = achieve ~ read * edu, data = henson)
Residuals:
Min 1Q Median 3Q Max
-3.2857 -1.5000 0.1786 0.8571 4.7143
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4286 9.9931 0.043 0.967
read 1.2571 0.1753 7.172 9.5e-05
eduspecial ed 5.2857 12.0917 0.437 0.674
read:eduspecial ed -0.2714 0.2479 -1.095 0.305
Residual standard error: 2.678 on 8 degrees of freedom
Multiple R-squared: 0.9811, Adjusted R-squared: 0.974
F-statistic: 138.3 on 3 and 8 DF, p-value: 3.124e-07
Lomax (2001) - Chapter 16
The data are at http://faculty.smu.edu/kyler/courses/7312/lomax.txt
> lomax <- read.table("lomax.txt", header = TRUE)
> str(lomax)
'data.frame': 12 obs. of 3 variables:
$ quiz : int 1 2 3 4 5 6 1 2 4 5 ...
$ aptitude: int 4 3 5 6 7 9 1 3 2 4 ...
$ group : Factor w/ 2 levels "group 1","group 2": 1 1 1 1 1 1 2 2 2 2 ...
> head(lomax)
quiz aptitude group
1 1 4 group 1
2 2 3 group 1
3 3 5 group 1
4 4 6 group 1
5 5 7 group 1
6 6 9 group 1
Graphical Exploration
> library(lattice)
> print(xyplot(quiz ~ aptitude, groups = group,
+     data = lomax, type = c("p", "r"), pch = 16))
[Figure: lattice scatterplot of quiz (y-axis, 1 to 6) against aptitude (x-axis, 2 to 8), with points and a fitted regression line for each group]
Running the ANOVA
> new0 <- aov(quiz ~ group, lomax)
> anova(new0)
Analysis of Variance Table
Response: quiz
Df Sum Sq Mean Sq F value Pr(>F)
group 1 0.75 0.75 0.1899 0.6723
Residuals 10 39.50 3.95
Running the ANCOVA
> new1 <- lm(quiz ~ aptitude + group, lomax)
> anova(new1)
Analysis of Variance Table
Response: quiz
Df Sum Sq Mean Sq F value Pr(>F)
aptitude 1 20.8807 20.8807 21.961 0.001142
group 1 10.8122 10.8122 11.372 0.008228
Residuals 9 8.5571 0.9508
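In the two tables above, the group effect is not significant in the ANOVA (p = 0.6723) but is significant in the ANCOVA (p = 0.008228). One way such a pattern can arise is when the groups differ on the covariate, so that raw means look similar while adjusted means do not. The sketch below shows this with simulated data; the names and generating values are hypothetical and are not the Lomax data.

```r
# Hypothetical simulated data: group "a" scores higher on the covariate
set.seed(4)
g <- factor(rep(c("a", "b"), each = 25))
x <- rnorm(50, mean = ifelse(g == "a", 6, 3), sd = 1.5)
y <- 0.7 * x + ifelse(g == "b", 2, 0) + rnorm(50, sd = 1)
d <- data.frame(y, x, g)

# Raw group means of the outcome
raw_means <- sapply(split(d$y, d$g), mean)

# Adjusted means: predicted y for each group at the grand mean of x
fit <- lm(y ~ x + g, data = d)
grid <- data.frame(x = mean(d$x),
                   g = factor(levels(d$g), levels = levels(d$g)))
adj_means <- predict(fit, newdata = grid)
names(adj_means) <- levels(d$g)

rbind(raw = raw_means, adjusted = adj_means)
```

The difference between the two adjusted means is the group coefficient from the ANCOVA fit, which is the quantity the group line of the ANCOVA table tests.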
Covariance Adjustment Model for Gain Scores
• Suppose that we want to do an analysis of gain scores and we want to see if there are some variables that help us explain differences in gain.
• For example, we might want to see if there were differing levels of gain for two different groups.
• This is typical in a pre- and post-test design, whereby we might look at the differences in gain between a treatment and control group.
• This type of design does have some important assumptions, though.
  • Equivalence at pretest
  • Homogeneity of variance at pretest
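A minimal sketch of how the two pretest assumptions listed above might be checked, using simulated pretest scores (hypothetical names and values). Equivalence of means is examined with t.test() and homogeneity of variance with var.test(); the choice of var.test() is an assumption here, and other tests of equal variances could be used instead.

```r
# Hypothetical simulated pretest scores for two groups
set.seed(5)
pre <- c(rnorm(15, 20, 8), rnorm(12, 20, 7))
treat <- factor(rep(c("control", "treatment"), c(15, 12)))
d <- data.frame(pre, treat)

# Equivalence at pretest: compare the two group means
mean_check <- t.test(pre ~ treat, data = d)

# Homogeneity of variance at pretest: F test of the ratio of variances
var_check <- var.test(pre ~ treat, data = d)

c(means_p = mean_check$p.value, variances_p = var_check$p.value)
```

A nonsignificant result in either test is only a lack of evidence for a difference, which matters with groups this small.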
Gain Score Data
Consider the following dataset.
> gains1 <- data.frame(pre1 = c(rnorm(15, 20, 8),
+     rnorm(12, 20, 7)), post1 = c(rnorm(15, 22,
+     7), rnorm(12, 35, 4)), treat = rep(c("control",
+     "treatment"), c(15, 12)))
> head(gains1)
pre1 post1 treat
1 4.457924 14.77789 control
2 31.403569 19.65162 control
3 12.772580 19.23911 control
4 14.929620 29.75043 control
5 33.888733 21.20869 control
6 3.592474 11.58902 control
> t.test(pre1 ~ treat, gains1)
Welch Two Sample t-test
data: pre1 by treat
t = 0.3426, df = 22.626, p-value = 0.735
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-5.433108 7.587736
sample estimates:
mean in group control mean in group treatment
17.39818 16.32086
> t.test(post1 ~ treat, gains1)
Welch Two Sample t-test
data: post1 by treat
t = -8.3247, df = 21.546, p-value = 3.555e-08
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-17.32007 -10.40477
sample estimates:
mean in group control mean in group treatment
21.31432 35.17674
Paired Samples Test
> with(gains1[gains1$treat == "treatment", ], t.test(pre1,
+     post1, paired = TRUE))
Paired t-test
data: pre1 and post1
t = -11.7569, df = 11, p-value = 1.435e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-22.38586 -15.32590
sample estimates:
mean of the differences
-18.85588
Covariance Adjustment Model
> gains1$gainscore <- gains1$post1 - gains1$pre1
> summary(m.gain <- lm(gainscore ~ pre1 + treat,
+     gains1))
Call:
lm(formula = gainscore ~ pre1 + treat, data = gains1)
Residuals:
Min 1Q Median 3Q Max
-7.996 -2.626 -1.551 2.162 12.097
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.1347 2.1878 8.746 6.29e-09
pre1 -0.8747 0.1059 -8.259 1.79e-08
treattreatment 13.9974 1.7731 7.894 3.99e-08
Residual standard error: 4.569 on 24 degrees of freedom
Multiple R-squared: 0.8532, Adjusted R-squared: 0.841
F-statistic: 69.75 on 2 and 24 DF, p-value: 1.000e-10
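A useful check on the covariance adjustment model is that regressing the gain score on the pretest and group gives the same group effect as regressing the posttest on the pretest and group; only the pretest slope changes, by exactly 1. The sketch below verifies this on simulated data generated the same way as gains1 but with a fixed seed, so the values differ from those shown above. The object names are hypothetical.

```r
# Hypothetical simulated pre/post data (not the gains1 values shown above)
set.seed(6)
pre <- c(rnorm(15, 20, 8), rnorm(12, 20, 7))
post <- c(rnorm(15, 22, 7), rnorm(12, 35, 4))
treat <- factor(rep(c("control", "treatment"), c(15, 12)))
gain <- post - pre

m_gain <- lm(gain ~ pre + treat)  # covariance adjustment on gain scores
m_post <- lm(post ~ pre + treat)  # ANCOVA on posttest scores

# The two fits differ only in the pretest slope, which shifts by exactly 1
rbind(gain = coef(m_gain), post = coef(m_post))
```

Because the posttest scores here are generated independently of the pretest scores, the posttest slope is near 0 and the gain-score slope is near -1, which is consistent with the pre1 estimate of -0.8747 in the output above.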
Covariance Adjustment Model
> summary(m.gain2 <- lm(gainscore ~ pre1 * treat,
+     gains1))
Call:
lm(formula = gainscore ~ pre1 * treat, data = gains1)
Residuals:
Min 1Q Median 3Q Max
-8.097 -2.729 -1.530 2.298 12.069
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 19.26184 2.41921 7.962 4.65e-08
pre1 -0.88203 0.12059 -7.314 1.93e-07
treattreatment 13.38055 4.85528 2.756 0.0113
pre1:treattreatment 0.03731 0.27251 0.137 0.8923
Residual standard error: 4.665 on 23 degrees of freedom
Multiple R-squared: 0.8533, Adjusted R-squared: 0.8342
F-statistic: 44.61 on 3 and 23 DF, p-value: 9.5e-10
Homework for ANCOVA and Covariance Adjustment Model
1. Look at current journal articles in which the study used ANCOVA. You will probably have the best luck in the psychology literature. Try to find an article that DOES report checking for homogeneity of regression. Bring enough copies of just that page for everyone in the class.
2. Look back at your homework assignment from http://faculty.smu.edu/kyler/courses/7311/paired_hw.pdf. Re-run this analysis as a covariance adjustment model (just controlling for pretest differences on the gain scores). Does your output change your interpretation of your findings?