Comparing Regression Lines From Independent Samples.


The Design

• You have two or more groups.
• One or more continuous predictors (C).
• And one continuous outcome variable (Y).
• You want to know if Y = a + b1C1 + … + bpCp + error is the same across groups.

Poteat, Wuensch, & Gregg

• Predictive validity study
• Children referred for school psychology services
• Does Grades = a + bIQ + error
• Differ across races?
• Called a “Potthoff analysis” by school psychologists
• Differences fell short of significance.

Two Groups, One X

• Y = a + b1C + b2G + b3CG
• If there are more than two groups, group membership is represented by k-1 dummy variables
• Wuensch, Jenkins, & Poteat
• Y = Attitude to animals
• C = Misanthropy
• G = High idealism or not

SAS

• Potthoff.sas (download)
• Potthoff.dat (download)
• Point program file to data file
• Run the program
• Data step: MxI = Misanth * Idealism
• Page 1: Ignoring idealism, there is a .2 correlation between misanthropy and attitude to animals.

Zero-Order Correlations

Interpretation

• Misanthropy is significantly related to support for animal rights.

• The two idealism groups do not differ significantly on support for animal rights – rpb = .092.

• The two idealism groups do not differ significantly on misanthropy – rpb = -.099.

Four Regression Models

Proc Reg;
CGI: model ar = misanth idealism MxI;
C: model ar = misanth;
CG: model ar = misanth idealism;
CI: model ar = misanth MxI;

Model CGI

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3          4.05237       1.35079      5.10   0.0022
Error            150         39.73945       0.26493
Corrected Total  153         43.79182

Model C

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1          2.13252       2.13252      7.78   0.0060
Error            152         41.65930       0.27407
Corrected Total  153         43.79182

Test of Coincidence

• Compare model CGI with model C• Use a partial F test

$$F = \frac{SS_{\text{reg, full}} - SS_{\text{reg, reduced}}}{(f - r)\,MSE_{\text{full}}}$$

$$F(2, 150) = \frac{4.05237 - 2.13252}{(3 - 1)(.26493)} = 3.623$$

p = .029
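The partial F test can be checked by hand from the two ANOVA tables. The sketch below is a minimal Python illustration, not part of Potthoff.sas: the function names `partial_f` and `p_value_2df` are made up here, and the p-value uses a closed form for the F distribution that holds only when the numerator has exactly 2 df.

```python
def partial_f(ss_reg_full, ss_reg_reduced, n_pred_full, n_pred_reduced, mse_full):
    """Partial F for dropping (n_pred_full - n_pred_reduced) predictors."""
    df_num = n_pred_full - n_pred_reduced
    return (ss_reg_full - ss_reg_reduced) / (df_num * mse_full)


def p_value_2df(f, df_den):
    """Upper-tail p for F(2, df_den); valid only for 2 numerator df."""
    return (1 + 2 * f / df_den) ** (-df_den / 2)


# Values from the Model CGI (full) and Model C (reduced) ANOVA tables.
f_coincidence = partial_f(4.05237, 2.13252, 3, 1, 0.26493)
p_coincidence = p_value_2df(f_coincidence, 150)
print(round(f_coincidence, 3), round(p_coincidence, 3))
```

For tests with other numerator df, the p-value would need a general F distribution function rather than this shortcut.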

Conclusion

• This was a simultaneous test of intercept and slopes.

• We conclude that the two groups differ with respect to
• The intercepts, or
• The slopes,
• Or both.

An Easier Way

proc reg; model ar = misanth idealism MxI;

TEST idealism=0, MxI=0; run;

Test 1 Results for Dependent Variable ar

Source        DF   Mean Square   F Value   Pr > F
Numerator      2       0.95992      3.62   0.0291
Denominator  150       0.26493

Model CI

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2          2.29525       1.14763      4.18   0.0172
Error            151         41.49657       0.27481
Corrected Total  153         43.79182

Test of Intercepts

• Compare model CGI with model CI.

• The intercepts differ significantly.

$$F(1, 150) = \frac{4.05237 - 2.29525}{(3 - 2)(.26493)} = 6.632$$

p = .011

F(1, 150) = 6.632, p = .011

• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of main effect of idealism, t = SQRT(6.632) = 2.58, p = .011.
• If we had more than two groups we could not take this shortcut.

Model CGI

Parameter Estimates

Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              1.62581          0.19894      8.17     <.0001
misanth      1              0.30006          0.08059      3.72     0.0003
idealism     1              0.77869          0.30236      2.58     0.0110
MxI          1             -0.28472          0.12641     -2.25     0.0258
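With one numerator df, the squared t from this table should reproduce the partial F obtained by comparing regression sums of squares. A quick Python check follows, using only numbers from the Model CGI, Model CI, and Model CG tables in this handout; the variable names are illustrative.

```python
mse_full = 0.26493   # Model CGI error mean square
ss_full = 4.05237    # Model CGI regression sum of squares

# t = estimate / standard error, from the Parameter Estimates table
t_idealism = 0.77869 / 0.30236
t_mxi = -0.28472 / 0.12641

# Partial F for dropping one term: Model CI drops idealism, Model CG drops MxI
f_intercepts = (ss_full - 2.29525) / mse_full
f_parallelism = (ss_full - 2.70839) / mse_full

print(round(t_idealism ** 2, 2), round(f_intercepts, 2))
print(round(t_mxi ** 2, 2), round(f_parallelism, 2))
```

The two numbers on each printed line agree up to rounding in the published tables.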

Test of Parallelism

• Do the slopes differ significantly?
• Compare model CGI with model CG
• Is model fit significantly reduced when we remove the interaction term?

$$F(1, 150) = \frac{4.05237 - 2.70839}{(3 - 2)(.26493)} = 5.073$$

p = .026

Model CG

Analysis of Variance

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2          2.70839       1.35419      4.98   0.0081
Error            151         41.08343       0.27208
Corrected Total  153         43.79182

F(1, 150) = 5.073, p = .026

• As you know, on one df, t = SQRT(F)
• Look back at Model CGI
• For the test of the interaction, t = 2.25, p = .026.
• If we had more than two groups we could not take this shortcut.

Get the Separate Regression Lines

• Sort by groups.
• Run the bivariate regressions
• For nonidealists, $\widehat{AR} = 1.63 + .30\,\text{Misanth}$
• For idealists, $\widehat{AR} = 2.40 + .02\,\text{Misanth}$
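The same two lines can also be recovered from the Model CGI coefficients without sorting and refitting, because idealism is coded 0/1. A small Python sketch (the dictionary and function names are illustrative, not from the SAS program):

```python
# Model CGI parameter estimates
b = {"intercept": 1.62581, "misanth": 0.30006, "idealism": 0.77869, "MxI": -0.28472}


def group_line(idealism):
    """Return (intercept, slope) of AR on misanthropy for idealism coded 0 or 1."""
    intercept = b["intercept"] + b["idealism"] * idealism
    slope = b["misanth"] + b["MxI"] * idealism
    return intercept, slope


for g, label in ((0, "nonidealists"), (1, "idealists")):
    a, s = group_line(g)
    print(f"{label}: AR = {a:.2f} + {s:.2f} * Misanth")
```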

Prepare Plots

proc sgplot; scatter x = misanth y = ar;
reg x = misanth y = ar;
yaxis label='Attitude to Animals' grid values=(1 to 5 by 1);
xaxis label='Misanthropy' grid values=(1 to 5 by 1);
by idealism; run;

Another Plot

proc sgplot; reg x = misanth

y = ar / group = idealism nomarkers;

yaxis label='Attitude to Animals';

xaxis label='Misanthropy'; run;

Full Model Slope 1

• AR = 1.626 + (.300)Misanthropy + (.779)Idealism + (- .285)Interaction.

• This is a conditional slope.
• The predicted increase in AR accompanying a one-point increase in misanthropy is .3 given that idealism has value zero (the nonidealists).

$$\theta_{X \to Y} = b_X + b_I M$$

• the conditional effect of X on Y given a particular value of the moderator is the conditional slope for predictor X + the interaction slope times the value of the moderator.

• Idealism as moderator, simple effect of misanthropy

$$\theta_{X \to Y} = .3 + (-.285)M$$

Simple Slopes for Misanthropy

• Idealism = 0 (nonidealists)

$$\theta_{X \to Y} = .3 + (-.285)(0) = .3$$

• Each one point increase in Misanthropy leads to a .3 point increase in AR.

• Idealism = 1 (idealists)

$$\theta_{X \to Y} = .3 + (-.285)(1) = .015$$

• Each one point increase in Misanthropy leads to a .015 point increase in AR.

• Misanthropy as moderator, simple effects of idealism (group differences)

$$\theta_{X \to Y} = b_X + b_I M$$

$$\theta_{X \to Y} = .779 + (-.285)M$$

Full Model Slope 2

• AR = 1.626 + (.300)Misanthropy + (.779)Idealism + (- .285)Interaction.

• This is a conditional slope.
• The predicted increase in AR accompanying a one-point increase in idealism (idealism groups were coded 0,1) is .779 given that misanthropy has value zero.

$$\theta_{X \to Y} = b_X + b_I M$$

• Treating misanthropy as the moderator, the simple slope for the effect of idealism on AR is

$$\theta_{X \to Y} = .779 + (-.285)M$$

Simple Slopes for Idealism

• Predict the difference between the two idealism groups (idealist minus nonidealist) when misanthropy = 1:
• .779 - .285(1) = .494.
• If misanthropy = 4, the predicted difference in means is .779 - .285(4) = -.361
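These predicted differences follow directly from the conditional-effect formula. A minimal Python sketch using the rounded coefficients from the slides (the names are illustrative); it also locates the misanthropy value at which the predicted group difference is zero, which is a by-product of the same formula rather than something reported in the SAS output.

```python
B_IDEALISM = 0.779   # conditional effect of idealism when misanthropy = 0
B_MXI = -0.285       # interaction slope


def group_difference(misanth):
    """Predicted AR difference (idealist minus nonidealist) at a given misanthropy."""
    return B_IDEALISM + B_MXI * misanth


for m in (1, 4):
    print(m, round(group_difference(m), 3))

# The two regression lines cross where the predicted difference is zero.
crossover = -B_IDEALISM / B_MXI
print(round(crossover, 2))
```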

Probing the Interaction

• Same as simple effects analysis in ANOVA
• We have already shown that the relationship between misanthropy and support for animal rights is significant for nonidealists but not for idealists.

• Change perspectives -- how does misanthropy moderate the relationship between idealism (group) and support of animal rights.

Analysis of Simple Slopes

• Arbitrarily pick two or more values of misanthropy and compare the groups at those points.

• The points are often 1 SD below the mean, the mean, and 1 SD above the mean.

• Here, that would be misanthropy = 1.65, 2.32, and 2.99.

Testing the Simple Slopes

• To test the null that mean AR does not differ between groups when misanthropy = 1.65, we center the misanthropy scores around 1.65, recompute the interaction term, and run the full model again.

• We repeat this action with the scores centered around 2.32 and then again centered around 2.99.

• See the code in the program.

The Code

Data Centered; set kevin;
MisanthLow = misanth - 1.65; InteractLow = MisanthLow * Idealism;
MisanthMean = misanth - 2.32; InteractMean = MisanthMean * Idealism;
MisanthHigh = misanth - 2.99; InteractHigh = MisanthHigh * Idealism;
proc reg;
Low: model ar = MisanthLow idealism InteractLow;
Mean: model ar = MisanthMean idealism InteractMean;
High: model ar = MisanthHigh idealism InteractHigh;
run; Quit;
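Why centering works: in the recentered model, the idealism coefficient is the predicted group difference at the centering value c, which equals (idealism coefficient) + (interaction coefficient) × c from the uncentered fit. The Python sketch below illustrates this with made-up data (not the Potthoff.dat scores) and a small hand-written least-squares solver; every name in it is hypothetical. It reproduces only the point estimates, whereas the SAS run also supplies the standard errors and t tests.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X)b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up data: covariate c, 0/1 group g, outcome y with a group-by-covariate interaction.
random.seed(1)
data = []
for _ in range(200):
    g = random.randint(0, 1)
    c = random.uniform(1, 4)
    y = 1.6 + 0.3 * c + 0.8 * g - 0.3 * c * g + random.gauss(0, 0.5)
    data.append((c, g, y))


def fit(center):
    """Fit y = b0 + b1(c - center) + b2 g + b3 (c - center) g."""
    X = [[1.0, c - center, g, (c - center) * g] for c, g, _ in data]
    y = [row[2] for row in data]
    return ols(X, y)


b0, b_c, b_g, b_cg = fit(0.0)   # uncentered fit
for center in (1.65, 2.32, 2.99):
    refit_b_g = fit(center)[2]  # group coefficient after recentering
    print(center, round(refit_b_g, 4), round(b_g + b_cg * center, 4))
```

The last two columns printed on each line match, and the interaction coefficient itself is unchanged by centering.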

Low Misanthropy

• When MIS is low, AR is significantly higher (by .309) in the idealistic group than it is in the nonidealistic group.

Average Misanthropy

• The groups do not differ significantly when MIS is average.

High Misanthropy

• The groups do not differ significantly when MIS is high.

Process Hayes

• Makes it way easier to do this analysis.
• Bring the process.sas program into SAS and run it.
• You have already read the data into the work file “kevin.”
• Hayes also provides a script to do the same in SPSS.

The SAS Macro

%process (data=kevin,vars=ar misanth idealism,y=ar,x=idealism,m=misanth,model=1,jn=1,plot=1);

• Data= points to the SAS data file
• Vars= identifies the variables
• Y= identifies the outcome variable
• X= identifies the focal predictor variable
• M= identifies the moderator variable
• Model=1 identifies the simple moderation model – see the templates document

The SAS Macro

%process (data=kevin,vars=ar misanth idealism,y=ar,x=idealism,m=misanth,model=1,jn=1,plot=1);

• jn=1 invokes the Johnson-Neyman technique
• Plot=1 requests the values for making a plot to visualize the interaction.
• Notice that the output includes all of the tests we did earlier, the hard way.

Johnson-Neyman Technique

• Maps out the values of the moderator for which the effect of the focal predictor is significant versus those values for which it is not significant.

• I’ll use idealism groups as the focal predictor and misanthropy as the moderator.

The Boundary

• When misanthropy = 2.1286 or less, the difference between the groups is statistically significant (higher for the idealists), otherwise it is not.

• If we were to extrapolate beyond misanthropy = 4, we would find a second region where the difference between the groups would be significant (with the mean higher for the nonidealists).

Don’t Confuse Test of Slopes with Test of Correlation Coefficients

• If the slopes are the same across groups, the correlation coefficients (standardized slopes) may or may not be.

• If the correlation coefficients are the same across groups, the slopes may or may not be.

Different Slopes, Similar Correlations

Identical Slopes, Different Correlation Coefficients
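A small Python illustration with made-up numbers (not data from the study): both groups have exactly the same least-squares slope, but the second group has more scatter about the line, so its correlation is weaker. The function and variable names are hypothetical.

```python
from math import sqrt


def slope_and_r(x, y):
    """Least-squares slope of y on x and the Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
noise = [1, -2, 2, -2, 1]   # sums to zero and is uncorrelated with x
y_tight = [2 * xi + 0.1 * e for xi, e in zip(x, noise)]   # little scatter
y_loose = [2 * xi + 1.0 * e for xi, e in zip(x, noise)]   # more scatter

print(slope_and_r(x, y_tight))
print(slope_and_r(x, y_loose))
```

Because the noise pattern is orthogonal to x, it inflates the variance of y without changing the slope, which is why only the correlation drops.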

Comparing the Groups on r

• The code for this analysis is at “SPSS and SAS programs for comparing Pearson correlations and OLS regression coefficients.”

• r is significant for nonidealists, not for idealists.

• For more than two groups, use this chi-square test (also available at the link above).
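One standard way to compare two independent Pearson correlations is Fisher's r-to-z test; whether the linked programs use exactly this procedure is an assumption here. The sketch below is Python with hypothetical correlations and sample sizes, not the values from the idealism groups.

```python
from math import atanh, erf, sqrt


def compare_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two independent correlations; returns (z, two-tailed p)."""
    z_diff = atanh(r1) - atanh(r2)              # difference of Fisher-transformed r's
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # standard error of that difference
    z = z_diff / se
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))   # standard normal tails
    return z, p


# Hypothetical inputs for illustration only.
z, p = compare_correlations(0.50, 100, 0.20, 100)
print(round(z, 2), round(p, 3))
```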

Analysis of Covariance

• You already know how to do this.
• Just drop the interaction term from the model.
• Here that would not be appropriate, as we have heterogeneity of regression.

A Couple of t Tests

• You may also want to compare the groups on the Y (ignoring C) and/or C.

• I have included those tests in the program.
• This is redundant with the initial Proc Corr output (point biserial correlations).

SPSS

• This analysis is easy to do with SPSS too.
• See my handout.
• You can do the analysis in a sequential fashion.
• And get the partial F tests from SPSS, even with df > 1: Leave your calculator in the desk drawer.

Three Groups

• Two dummy variables, G1 and G2

• Two interaction terms, G1C and G2C

• To test the slopes you would see if the model fit were significantly reduced by simultaneously removing G1C and G2C.

• That would be an F with two df in its numerator.

• To test the intercepts, remove both G1 and G2
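A sketch of the coding scheme in Python, to make the k - 1 dummies and their products with the covariate concrete. The group labels and the choice of the last group as the reference category are assumptions for illustration; any group can serve as the reference.

```python
def design_row(group, c, groups=("A", "B", "C")):
    """Return [C, G1, G2, G1C, G2C] for one case; the last group is the reference."""
    dummies = [1 if group == g else 0 for g in groups[:-1]]   # k - 1 dummy variables
    interactions = [d * c for d in dummies]                   # dummy-by-covariate products
    return [c] + dummies + interactions


for g in ("A", "B", "C"):
    print(g, design_row(g, 2.5))
```

Dropping the last two columns gives the reduced model for the test of slopes; dropping the two dummy columns gives the reduced model for the test of intercepts.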

Let’s Go Fishing

• Length = a + bWeight for flounder
• Does the relationship differ across regions?
– Pamlico Sound
– Pamlico River
– Tar River
• Potthoff3.sas and Potthoff3.dat
• Download and run.

Proc GLM

• Proc GLM; class Location; model Length = WeightSR|Location;

• GLM creates the (2) dummy variables for you

• And the (2) interaction terms.

Full Model

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5      1927227.450    385445.490   3088.09   <.0001
Error            745        92988.508       124.817
Corrected Total  750      2020215.957

Covariate Only Model

Proc GLM; class Location;
model Length = WeightSR / solution;

Weights are significantly correlated with lengths, r² = .95, F(1, 749) = 14,544, p < .001.

Test of Coincidence

• On 4, 745 df, p < .001
• The lines are not coincident.

$$F = \frac{SS_{\text{reg, full}} - SS_{\text{reg, reduced}}}{(f - r)\,MSE_{\text{full}}} = \frac{1927227.450 - 1921272.704}{(5 - 1)(124.817)} = 11.927$$

Look at Full Model

• The slopes do not differ significantly, F(2, 745) = 1.63, p = .20.
• The intercepts do differ significantly, F(2, 745) = 4.90, p = .008.
• Since the slopes do not differ significantly
• But the intercepts do,
• The group means must differ.

Source              DF   Type III SS    Mean Square   F Value   Pr > F
WeightSR             1   692152.6411    692152.6411   5545.35   <.0001
Location             2     1223.2971       611.6486      4.90   0.0077
WeightSR*Location    2      407.5286       203.7643      1.63   0.1961
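Each F in the Type III table is the effect's mean square (Type III SS divided by its df) over the full-model error mean square, 124.817 on 745 df. A short Python check of the two group tests, using only values printed in the tables (the variable names are illustrative):

```python
MSE_FULL = 124.817   # full-model error mean square, 745 df

# effect: (df, Type III SS)
type3 = {"Location": (2, 1223.2971), "WeightSR*Location": (2, 407.5286)}

f_values = {}
for effect, (df, ss) in type3.items():
    ms = ss / df                        # mean square = Type III SS / df
    f_values[effect] = ms / MSE_FULL    # F = effect mean square / error mean square
    print(effect, round(f_values[effect], 2))
```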

Analysis of Covariance

• Location significantly affects mean length of flounder, after adjusting for the effect of weight.

Unadjusted Means (notice the different pattern)