Psych 5510/6510

Post on 04-Jan-2016

Psych 5510/6510. Chapter 13: ANCOVA: Models with Continuous and Categorical Predictors. Part 1: Increasing Power in True Experimental Designs. Spring, 2009. ANCOVA: “The analysis of Covariance”.

1

Psych 5510/6510

Chapter 13:

ANCOVA: Models with Continuous and Categorical Predictors

Part 1: Increasing Power in True Experimental Designs

Spring, 2009

2

ANCOVA: “The analysis of Covariance”

Originated in the analysis of experimental designs. The goal was to investigate the effects of the categorical variables while controlling some continuously measured variable, called a covariate.

3

Model Comparison Approach

In the model comparison approach we are simply putting both continuous and categorical variables into our model. Usually this will involve using categorical variables to code our independent variable (i.e. which experimental group the subject belongs in) and continuous variables to measure some other aspect of each subject (something that is not being manipulated by the experimenter, e.g. height or age).

4

Contexts

We will be looking at three contexts in which this will be useful:

1. Within a ‘true experimental’ design, where we can use this approach to increase the power of the design and to add sophistication to our model.

2. Within a ‘quasi-experimental’ or ‘static group’ design, where we can use this approach to control a confounding variable.

3. Within a correlational design, where we can introduce a categorical variable to better understand a continuous variable.

5

Context 1: True Experimental Designs

True experimental design: subjects are randomly divided into groups, the independent variable is then manipulated by the experimenter.

As the subjects are randomly assigned to groups, it is assumed that the group means start off being fairly equal. If, after the independent variable has been applied, a statistically significant difference between the group means is found it is interpreted as being the result of the independent variable.

6

‘Priming’ Example

60 subjects are randomly divided into two groups. Each subject is shown two words, a ‘priming’ word for 2 seconds, followed by a ‘test’ word. The perceptual threshold of the test word is measured. For Group 1, the priming word is similar in shape to the test word; for Group 2, the priming word is similar in meaning to the test word.

• IV: Type of prime (shape or meaning)

• DV: perceptual threshold

For the analysis, contrast coding (1 and –1) was used to code the independent variable.

7

Results: t Test for Independent Groups

Mean Group 1: 31.13

Mean Group 2: 29.42

t(58)=1.942, p=.057

8

Results: Linear Regression

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .247 (a)   .061       .045                3.40635

a. Predictors: (Constant), Group

ANOVA (b)

Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   43.776           1    43.776        3.773   .057 (a)
Residual     672.985          58   11.603
Total        716.761          59

a. Predictors: (Constant), Group

b. Dependent Variable: Priming

Coefficients (a)

Model 1      B        Std. Error   Beta   t        Sig.
(Constant)   30.271   .440                68.835   .000
Group        .854     .440         .247   1.942    .057

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.

a. Dependent Variable: Priming

Ŷi = 30.27 + 0.854(Groupi)
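The link between the t test and the regression output can be checked by hand. The sketch below assumes equal group sizes (60 subjects split in two) and that Group 1 is coded +1 and Group 2 is coded –1; the variable names are only for illustration.

```python
# Group means reported on the t-test slide (equal group sizes assumed).
mean_g1 = 31.13  # Group 1, assumed to be coded +1
mean_g2 = 29.42  # Group 2, assumed to be coded -1

# With +1/-1 contrast codes and equal n, the intercept is the mean of the
# two group means and the slope is half the difference between them.
b0 = (mean_g1 + mean_g2) / 2
b1 = (mean_g1 - mean_g2) / 2

# With a single predictor, the regression F is the square of the t value.
t_group = 1.942
f_group = t_group ** 2

print(round(b0, 3), round(b1, 3), round(f_group, 3))
```

The results should land close to the reported B values (30.271 and .854) and the reported F (3.773); small differences come from rounding in the published means and t.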

9

Interpretation

Well, the results were not quite statistically significant, so we cannot conclude that the independent variable had an effect. Perhaps if the experiment had just a little more power we could have rejected the null hypothesis. In the previous semester we examined what influences power; in this case we will focus on the variance within the groups. If we can reduce the variance of the scores within the groups, then we can increase the power of the experiment.

10

Within-Group Variance

Group 1: Standard deviation=3.07

Group 2: Standard deviation=3.71

Is that a lot? Well, it’s hard to say, but we can think about reducing it. To do that we can ask the question, ‘why do the scores differ within each group?’ For our purposes in this chapter we will refine the question to ‘what measurable attribute of the subjects might be correlated with the dependent variable (perceptual threshold)?’ Age comes to mind. There was a wide variety of ages within each group. If age is correlated with perceptual threshold, and the ages of the participants vary within each group, then age could account for some of the within-group variance.
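To see how much the within-group standard deviations matter, here is a rough sketch that rebuilds the t value from the reported means and standard deviations. The group size of 30 each is an assumption (60 subjects, df = 58), and the 10% shrinkage is purely hypothetical.

```python
import math


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Independent-groups t from summary statistics (pooled variance)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff


# Reported means and SDs; n = 30 per group is an assumption.
t_reported = pooled_t(31.13, 29.42, 3.07, 3.71, 30, 30)

# Hypothetical: same mean difference, but each SD is 10% smaller.
t_smaller_sd = pooled_t(31.13, 29.42, 3.07 * 0.9, 3.71 * 0.9, 30, 30)

print(round(t_reported, 2), round(t_smaller_sd, 2))
```

The first value should come out near the reported t of 1.942 (the inputs are rounded), and the second is larger because the same difference between means is divided by a smaller standard error.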

11

Is Age Correlated with Perceptual Threshold?

MODEL C: Ŷi = β0

MODEL A: Ŷi = β0 + β1Agei

ANOVA (b)

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   113.783          1    113.783       11.790   .001 (a)
Residual     559.754          58   9.651
Total        673.536          59

a. Predictors: (Constant), Age

b. Dependent Variable: Y
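The size of the correlation can be recovered from the sums of squares in this table. This is only a sketch using the printed values; the sign of r is not given by the sums of squares, so only its magnitude is computed.

```python
import math

# Values copied from the ANOVA table for Age.
ss_regression = 113.783
ss_residual = 559.754
df_residual = 58

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total   # proportion of variance accounted for by Age
r_magnitude = math.sqrt(r_squared)     # |r| between Age and the dependent variable
f_value = ss_regression / (ss_residual / df_residual)

print(round(r_squared, 3), round(r_magnitude, 3), round(f_value, 2))
```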

12

Adding Age to the Experiment

Up to now our approach has been:

MODEL C: Ŷi = β0

MODEL A: Ŷi = β0 + β1Groupi

H0: β1 = 0   Ha: β1 ≠ 0

Now we are going to move to:

MODEL C: Ŷi = β0 + β2Agei

MODEL A: Ŷi = β0 + β2Agei + β1Groupi

H0: β1 = 0   Ha: β1 ≠ 0

13

Adding Age to the Exp. (cont.)

MODEL A: Ŷi = β0 + β2Agei + β1Groupi

The mechanics of this are simple: we measure each subject’s age and add that variable to the model, along with the variable that uses a contrast code to indicate which group the subject is in.
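As a rough illustration of those mechanics, the sketch below fits Model A by ordinary least squares using only the Python standard library. The data are made up (not the study's), constructed so that Y is exactly 2 + 1(Group) + 0.5(Age), and the helper names are hypothetical; in practice a statistics package would do this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_least_squares(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data, NOT the study's: Y is built exactly as 2 + 1*Group + 0.5*Age.
group = [1, 1, 1, 1, -1, -1, -1, -1]     # contrast code for the two groups
age = [20, 30, 40, 50, 25, 35, 45, 55]   # covariate
y = [2 + 1 * g + 0.5 * a for g, a in zip(group, age)]

# Each row of the Model A design: [constant, Group, Age].
rows = [[1.0, g, a] for g, a in zip(group, age)]
b0, b1, b2 = fit_least_squares(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy Y values contain no error, the fit should recover the coefficients used to build them, which makes it easy to confirm that the contrast code and the covariate are simply two columns in the same model.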

14

Why This Helps (ANOVA)

If we think about this in terms of the t test for independent groups, what we are accomplishing is to remove from each group the variance that can be accounted for by knowing the subject’s age. If that reduces the variance in each group then the power of the experiment should increase.

15

Why This Helps (Regression)

But let’s think about this in terms of the model comparison approach.

Previous:

MODEL C: Ŷi = β0

MODEL A: Ŷi = β0 + β1Groupi

Now:

MODEL C: Ŷi = β0 + β2Agei

MODEL A: Ŷi = β0 + β2Agei + β1Groupi

If we are interested in the worthwhileness of adding variable ‘Group’ to the model, why would adding it to a model that already contains ‘Age’ be better than adding it to a model that doesn’t contain ‘Age’?

16

Why This Helps (Regression)

To understand the explanation we need to note that variables ‘Age’ and ‘Group’ are likely to be fairly non-redundant. Remember that redundancy can be thought of as how much you can use one variable to predict the other. We have randomly divided people into groups, so the groups probably have a fairly similar distribution of ages. Consequently, we shouldn’t be able to use what group a person is in to predict their age, or vice versa. If the mean age in each group is the same, then age and group are completely independent (non-redundant). The mean age in the two groups, however, will probably not be exactly the same, so there could be a small amount of redundancy.

17

Why This Helps (Regression)

In the following diagrams I show how adding X1 to a model of Y that already contains X2 is more powerful than adding X1 to a model without X2, but only if the predictor variables are not very redundant.

18

Note while the amount of Y that can be explained by X1 is the same in both cases, the PRE is greater below:

19

The situation would be different if X1 and X2 were quite redundant:

20

Redundancy of Age and Group

The correlation between Age and Group is r=.007 (very low). If we square that to get the value of R² we get a value very close to zero, which would make the tolerance of Age and Group essentially 1. The redundancy doesn’t need to be this low for the covariate to add power, but I’m not complaining.
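The arithmetic behind that statement is short. With only two predictors in the model, the tolerance of each one is 1 minus the squared correlation between them:

```python
# Correlation between Age and Group reported on this slide.
r_age_group = 0.007

# Shared variance between the two predictors.
r_squared = r_age_group ** 2

# With only two predictors, tolerance = 1 - r^2 for each of them.
tolerance = 1 - r_squared

print(r_squared, tolerance)
```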

21

Results with Covariate (Age) Included

Overall Analysis

Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       .469 (a)   .220       .192                3.13247

a. Predictors: (Constant), Age, Group

ANOVA (b)

Model 1      Sum of Squares   df   Mean Square   F       Sig.
Regression   157.456          2    78.728        8.023   .001 (a)
Residual     559.305          57   9.812
Total        716.761          59

a. Predictors: (Constant), Age, Group

b. Dependent Variable: Priming

22

Results with Covariate (Age) Included

Ŷi = 21.65 + 0.864(Groupi) + 0.281(Agei)

Both Group (p=.0367) and Age (p=.001) are worthwhile when added last to a model that contains the other predictor. Without Age in the model (our previous analysis) Group was not significant. The tolerances (not shown above) are very close to 1.00, indicating that Age and Group have very little redundancy.

Coefficients (a)

Model 1      B        Std. Error   Beta   t       Sig.   Zero-order   Partial   Part
(Constant)   21.651   2.564               8.443   .000
Age          .281     .083         .398   3.404   .001   .397         .411      .398
Group        .864     .404         .250   2.135   .037   .247         .272      .250

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Zero-order, Partial, and Part are correlations.

a. Dependent Variable: Priming
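One way to read the fitted model is to plug values into it. The sketch below uses the B values from the Coefficients table (21.651, .864, .281); the function name and the example age of 30 are hypothetical, and Group 1 (shape prime) is assumed to be the group coded +1.

```python
# Unstandardized coefficients from the Coefficients table.
B0, B_GROUP, B_AGE = 21.651, 0.864, 0.281


def predicted_threshold(group_code, age):
    """Predicted perceptual threshold for a contrast code (+1 or -1) and an age."""
    return B0 + B_GROUP * group_code + B_AGE * age


age = 30  # hypothetical age, chosen only for illustration
shape_prime = predicted_threshold(+1, age)    # assumed coding: Group 1 = +1
meaning_prime = predicted_threshold(-1, age)  # assumed coding: Group 2 = -1

# Because the codes are +1 and -1, the predicted gap between the groups at
# any given age is twice the Group coefficient.
gap = shape_prime - meaning_prime
print(round(shape_prime, 3), round(meaning_prime, 3), round(gap, 3))
```

Since this model has no interaction term, the gap is the same at every age.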

23

Summary Table

Source       SS       df   MS       F       p      PRE
Regression   157.46   2    78.73    8.023   .001   .220
  Group      44.73    1    44.73    4.56    .037   .074
  Age        113.70   1    113.70   11.59   .001   .169
Residual     559.31   57   9.81
Total        716.76   59

24

Summary

Without Age in the model the t value (from both the t test for independent groups as well as from the regression analysis) for the effect of Group (the independent variable) was 1.942, p=.057.

With Age in the model the t value for the effect of Group was 2.135, p=.037.

What happened? In terms of the t test for independent groups, including Age in the model removed the variance that could be accounted for by Age before the evaluation of the effect of Group, thus power was increased. In terms of model comparison, including Age in Model C lowered SSE(C) in a manner that was not redundant with Group, so the proportional reduction of error by adding Group was greater.
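The model-comparison account can be checked with the numbers already reported. This is a sketch, not SPSS output: SSE(C) for the model containing only Age is derived here as the residual SS of the full model plus the SS for Group added last (559.305 + 44.73), and the function name is hypothetical.

```python
def pre_and_f(sse_c, sse_a, params_c, params_a, n):
    """PRE and F for comparing Model C (compact) with Model A (augmented)."""
    pre = (sse_c - sse_a) / sse_c
    f = (pre / (params_a - params_c)) / ((1 - pre) / (n - params_a))
    return pre, f


n = 60

# Without Age: Model C has only the constant, Model A adds Group.
pre_no_age, f_no_age = pre_and_f(716.761, 672.985, 1, 2, n)

# With Age: Model C = constant + Age, Model A adds Group.
# SSE(C) is derived as SSE(A) + SS for Group added last.
pre_with_age, f_with_age = pre_and_f(559.305 + 44.73, 559.305, 2, 3, n)

print(round(pre_no_age, 3), round(f_no_age ** 0.5, 3))
print(round(pre_with_age, 3), round(f_with_age ** 0.5, 3))
```

The reduction in error due to Group is about the same in both comparisons (roughly 44 units of SS), but it is a larger proportion of a smaller SSE(C) once Age is in the model, and the square roots of the two F values should match the two reported t values.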

25

Fancier Designs

With the tools we now have we can add covariates to any type of experimental design. For example, suppose we have two independent variables, ‘A’ and ‘B’, where A has two levels (contrast X1 can handle that) and B has three levels (contrasts X2 and X3 can handle that), we are also looking at the interaction of A and B (contrasts X4 and X5 can handle that), and we want to add two covariates, ‘Age’ (X6) and ‘Height’ (X7), to gain power. Then we regress Y on:

Ŷi = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7

If Age and Height are not very redundant with the contrasts that code the independent variables then they will increase the power of the tests of those contrasts.
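To make the predictors X1 through X7 concrete, here is one possible way to build a subject's row of predictor values. The particular contrast codes are an assumption (the slides do not give them): X1 = (1, –1) for A, X2 = (2, –1, –1) and X3 = (0, 1, –1) for B, with the interaction contrasts formed as products. The level names and function name are hypothetical.

```python
# Assumed contrast codes; the slides do not specify which codes were used.
A_CODES = {"a1": 1, "a2": -1}                             # X1
B_CODES = {"b1": (2, 0), "b2": (-1, 1), "b3": (-1, -1)}   # (X2, X3)


def design_row(a_level, b_level, age, height):
    """Return [constant, X1, X2, X3, X4, X5, X6, X7] for one subject."""
    x1 = A_CODES[a_level]
    x2, x3 = B_CODES[b_level]
    x4, x5 = x1 * x2, x1 * x3  # A x B interaction contrasts
    return [1, x1, x2, x3, x4, x5, age, height]


# Hypothetical subject: level a2 of A, level b2 of B, age 30, height 170.
print(design_row("a2", "b2", 30, 170))
```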

26

A Powerful Tool

This procedure provides a very simple tool for increasing the power of a true experimental design.

1. Think of some reason why scores will differ within groups (e.g. age, income level, height, gender).

2. Measure that.

3. Test to see if that measure is significantly correlated with the dependent variable (i.e. regress the dependent variable on the measure).

4. If it is, add it to the model to increase the power of the test for the independent variable(s).
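Step 3 above can be sketched without any statistics package. The scores below are made up purely for illustration, and the function names are hypothetical; the resulting t would be compared against a t distribution with n – 2 degrees of freedom.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic for testing whether r differs from zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Made-up scores, for illustration only.
covariate = [1, 2, 3, 4, 5]
outcome = [2, 1, 4, 3, 5]

r = pearson_r(covariate, outcome)
t = t_for_r(r, len(covariate))
print(round(r, 3), round(t, 3))
```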

27

Ways of Thinking About It

If we approach our original example from the perspective of a t test for independent groups then our focus is on whether or not the independent variable (type of priming) had an effect on the dependent variable (perceptual threshold). Including the covariate of Age is simply a means of increasing the power of the experiment, but our focus remains on the effect of the independent variable.

28

Ways of Thinking About It

If we approach what we are doing from the model comparison approach, then our focus is on trying to model perceptual thresholds, and we are interested in whether it would be good to have both Age and Type of Priming be part of our model. Our analysis shows that both are worthwhile. I wonder if they interact? Wouldn’t it be interesting if the effect of ‘type of priming’ was different across ages? Let’s check it out. We’ll create another variable that is (Group)x(Age) to test for an interaction...

29

Interactive Model

MODEL C: Ŷi = β0 + β1Groupi + β2Agei

MODEL A: Ŷi = β0 + β1Groupi + β2Agei + β3GroupiAgei

PRE = (-.150)² = .02, p = .260

Coefficients (a)

Model 1      B        Std. Error   Beta    t        Sig.   Zero-order   Partial   Part
(Constant)   22.432   2.648                8.471    .000
Age          .256     .085         .373    2.993    .004   .411         .371      .360
Group        2.895    2.648        .864    1.093    .279   -.029        .145      .132
GxA          -.097    .085         -.901   -1.139   .260   -.066        -.150     -.137

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Zero-order, Partial, and Part are correlations.

a. Dependent Variable: Y

Moving to an interactive model was not worthwhile, and we can see that including the interaction term made the effect of Group no longer statistically significant. Why? A look at the tolerances (not shown in this table) shows a great deal of redundancy between Group and the Group x Age interaction. So, let’s leave the interaction out of our model.
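Why would Group and Group x Age be so redundant? When Group is coded +1/–1 and every age is a positive number, the product is positive for one group and negative for the other, so it carries much the same information as Group itself. The sketch below shows this with made-up ages (not the study's data), chosen so that Age and Group are themselves uncorrelated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: the same ages appear in both groups.
group = [1, 1, 1, 1, -1, -1, -1, -1]
age = [20, 30, 40, 50, 20, 30, 40, 50]
gxa = [g * a for g, a in zip(group, age)]  # the interaction predictor

r_group_gxa = pearson_r(group, gxa)
shared_variance = r_group_gxa ** 2  # pairwise redundancy between Group and GxA

print(round(r_group_gxa, 4), round(shared_variance, 4))
```

Even with Age and Group uncorrelated, Group and the product term share most of their variance in this toy example, which leaves little unique variance for estimating the Group coefficient.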

30

The Model

MODEL: Ŷi = β0 + β1Groupi + β2Agei

This is our best model of perceptual threshold so far. I wonder what variable to try next? We could think of another continuous variable that we could measure in our next study, or we might want to manipulate some independent variable and see if it adds to the model. The goal is to work towards a better and better model of perceptual threshold. This is the flavor of the model comparison approach.