Specification errors for interaction models: Implications for the shape of the overall pattern

Page 1: Specification errors for interaction models:  Implications for the shape of the overall pattern

Specification errors for interaction models:

Implications for the shape of the overall pattern

Jane E. Miller, PhD

The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

Page 2: Specification errors for interaction models:  Implications for the shape of the overall pattern

Overview

• Review: Model specification with main effects and interaction terms

• Implications of leaving the main effects terms out of a model intended to test for interactions

• Repercussions for
  – An interaction between two categorical independent variables
  – An interaction between one categorical and one continuous independent variable

Page 3: Specification errors for interaction models:  Implications for the shape of the overall pattern

List of variables used in examples

• Dependent variable = birth weight in grams (BW)
• Independent variables:
  – Main effects terms:
    • Race
      – Two nominal categories (non-Hispanic black; non-Hispanic white is the reference category)
      – One main effect dummy variable: NHB
        » Coded 1 = non-Hispanic black, 0 = non-Hispanic white
    • Mother’s education
      – Three ordinal categories (< HS; = HS; > HS is the reference category)
      – Two main effects dummies: <HS, =HS
        » Each coded 1 = named category, 0 = all other values

Page 4: Specification errors for interaction models:  Implications for the shape of the overall pattern

List of variables, continued

• Interaction between race and mother’s education
  – Two interaction term dummies: NHB_<HS; NHB_=HS
    • Each named using the “_” convention to link the names of the component variables.
    • Each coded 1 = named category, 0 = all other values
      – E.g., NHB_<HS = 1 for those who are both NHB and < HS, = 0 for all other combinations of race and education
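The coding rules for the main effect and interaction dummies can be sketched in a few lines of Python. This is only an illustration: the function name `code_race_education` and the string labels used for the categories are assumptions made for the example, not part of the original materials.

```python
def code_race_education(race, education):
    """Return the main-effect and interaction dummies for one case.

    race: "NHB" or "NHW" (NHW is the reference category)
    education: "<HS", "=HS", or ">HS" (">HS" is the reference category)
    The labels are hypothetical; use whatever codes your data set has.
    """
    nhb = 1 if race == "NHB" else 0
    lt_hs = 1 if education == "<HS" else 0
    eq_hs = 1 if education == "=HS" else 0
    return {
        "NHB": nhb,
        "<HS": lt_hs,
        "=HS": eq_hs,
        # Each interaction dummy is the product of its component dummies,
        # so it equals 1 only for cases in both named categories.
        "NHB_<HS": nhb * lt_hs,
        "NHB_=HS": nhb * eq_hs,
    }


# Show the coding for all six combinations of race and education.
for race in ("NHB", "NHW"):
    for education in ("<HS", "=HS", ">HS"):
        print(race, education, code_race_education(race, education))
```

Non-Hispanic white mothers with > HS are coded 0 on all five variables, which is what makes them the reference category in the full specification.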

Page 5: Specification errors for interaction models:  Implications for the shape of the overall pattern

Model specification with interactions: race and education

• BW = f (race, education, race_education)
  – Birth weight is a function of race, education, and the race-by-education interaction
• To specify a model that does not impose assumptions about the shape of the association, need ALL of the main effects and interaction term variables related to race and mother’s education
• BW = f (NHB, <HS, =HS, NHB_<HS, NHB_=HS)
  – Main effects terms (yellow on the slide): NHB, <HS, =HS
  – Interaction terms (green on the slide): NHB_<HS, NHB_=HS
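Written out as a regression equation, the full specification might look as follows. The β subscripts follow the variable names used in this lecture; the error term ε is an assumption added here for a standard OLS model.

```latex
\begin{align*}
BW ={}& \beta_0
      + \beta_{NHB}\,\mathrm{NHB}
      + \beta_{<HS}\,\mathrm{({<}HS)}
      + \beta_{=HS}\,\mathrm{({=}HS)} \\
     & + \beta_{NHB\_<HS}\,\mathrm{(NHB\_{<}HS)}
      + \beta_{NHB\_=HS}\,\mathrm{(NHB\_{=}HS)}
      + \varepsilon
\end{align*}
```

With three main effects terms, two interaction terms, and the intercept, the model has six parameters, one for each of the six combinations of race and mother's education.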

Page 6: Specification errors for interaction models:  Implications for the shape of the overall pattern

Some possible patterns of race, education, and birth weight

[Figure: five panels, each plotting birth weight (BW) against mother’s education (< HS, = HS, > HS) separately for black and white infants. Panel titles: Main effect: race; Main effect: education; Main effects: race & education; Interaction: magnitude; Interaction: direction & magnitude.]

Page 7: Specification errors for interaction models:  Implications for the shape of the overall pattern

What happens if the specification omits the main effects terms?

• If we omit the main effects terms for the two independent variables involved in the interaction, the implied model specification is
    BW = f (NHB_<HS, NHB_=HS)
• Then the estimated βs for those two variables compare those groups against everyone else
  – In this case all whites (regardless of mother’s educational attainment) plus blacks whose mothers have > HS
  – This implicitly assumes that those four groups all have equal mean birth weight, rather than testing for differences across those groups

Page 8: Specification errors for interaction models:  Implications for the shape of the overall pattern

Repercussions of misspecification

• Any differences among
  – NHB & > HS
  – NHW & < HS
  – NHW & = HS
  – and NHW & > HS
  will be overlooked because there are no terms in the model to test for such differences.
• β0 (the constant or intercept term) will be a weighted average of birth weight for those four groups combined
• βNHB_<HS and βNHB_=HS will estimate the difference in mean birth weight for those groups compared to that combined reference category

“&” is used to denote a group with that combination of characteristics, not an interaction term
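A small numeric sketch can make these repercussions concrete. The cell sizes and mean birth weights below are hypothetical values invented for illustration, not NHANES III estimates, and the function name is likewise an assumption. The calculation relies on a standard OLS result: when the only regressors are mutually exclusive dummies, the intercept equals the mean of the cases coded 0 on all of them, and each coefficient is a difference in means from that pooled group.

```python
# Hypothetical (n, mean birth weight in grams) for each race-by-education cell.
cells = {
    ("NHB", "<HS"): (200, 3050.0),
    ("NHB", "=HS"): (300, 3100.0),
    ("NHB", ">HS"): (100, 3200.0),
    ("NHW", "<HS"): (150, 3250.0),
    ("NHW", "=HS"): (400, 3350.0),
    ("NHW", ">HS"): (350, 3450.0),
}


def interaction_only_coefficients(cells):
    """Coefficients implied by BW = f(NHB_<HS, NHB_=HS), from cell means."""
    named = {("NHB", "<HS"), ("NHB", "=HS")}
    # Everyone coded 0 on both interaction dummies lands in one pooled group.
    pooled = [value for key, value in cells.items() if key not in named]
    n_ref = sum(n for n, _ in pooled)
    # The intercept is the size-weighted average of the four pooled cell means.
    b0 = sum(n * mean for n, mean in pooled) / n_ref
    return {
        "b0": b0,
        "NHB_<HS": cells[("NHB", "<HS")][1] - b0,
        "NHB_=HS": cells[("NHB", "=HS")][1] - b0,
    }


coefs = interaction_only_coefficients(cells)
print(coefs)

# Spread among the four pooled cell means that the model cannot detect.
pooled_means = [m for k, (_, m) in cells.items()
                if k not in {("NHB", "<HS"), ("NHB", "=HS")}]
print("hidden range (grams):", max(pooled_means) - min(pooled_means))
```

In this made-up example the four pooled groups differ by as much as 250 grams, yet the misspecified model assigns all of them the same predicted value, β0.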

Page 9: Specification errors for interaction models:  Implications for the shape of the overall pattern

Implied pattern if main effects of race and education are omitted

[Figure: BW for non-Hispanic black and non-Hispanic white infants by mother’s education (< HS, = HS, > HS), marking the implied reference category for the specification BW = f (NHB_<HS, NHB_=HS). Labeled quantities: β0, βNHB_<HS, βNHB_=HS.]

Page 10: Specification errors for interaction models:  Implications for the shape of the overall pattern

Implied pattern if main effects of race and education are omitted

[Figure: BW plotted against mother’s education (< HS, = HS, > HS) for non-Hispanic black and non-Hispanic white infants. Labeled quantities: β0, βNHB_<HS, βNHB_=HS.]

BW = f (NHB_<HS, NHB_=HS)

Page 11: Specification errors for interaction models:  Implications for the shape of the overall pattern

Observed pattern based on model of NHANES III data with main effects and interaction terms

Model estimates separate levels (intercepts) for each combination of race and education

[Figure: BW plotted against mother’s education (< HS, = HS, > HS) for non-Hispanic black and non-Hispanic white infants. Labeled quantities: β0; βNHB; β<HS; β=HS; βNHB + β<HS + βNHB_<HS; βNHB + β=HS + βNHB_=HS.]

BW = f (NHB, <HS, =HS, NHB_<HS, NHB_=HS)
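To see how the full specification yields a separate level for each combination of race and education, the coefficients can be summed for each cell. The sketch below uses hypothetical coefficient values chosen for illustration (they are not the NHANES III estimates), and `predicted_bw` is a name introduced here.

```python
# Hypothetical coefficients (grams) for BW = f(NHB, <HS, =HS, NHB_<HS, NHB_=HS).
coefs = {
    "b0": 3450.0,      # reference category: non-Hispanic white, > HS
    "NHB": -250.0,
    "<HS": -200.0,
    "=HS": -100.0,
    "NHB_<HS": 50.0,
    "NHB_=HS": -30.0,
}


def predicted_bw(coefs, nhb, lt_hs, eq_hs):
    """Sum the coefficients that apply to a case with the given dummy values."""
    return (
        coefs["b0"]
        + coefs["NHB"] * nhb
        + coefs["<HS"] * lt_hs
        + coefs["=HS"] * eq_hs
        + coefs["NHB_<HS"] * nhb * lt_hs
        + coefs["NHB_=HS"] * nhb * eq_hs
    )


predictions = {}
for race, nhb in (("NHB", 1), ("NHW", 0)):
    for education, lt_hs, eq_hs in (("<HS", 1, 0), ("=HS", 0, 1), (">HS", 0, 0)):
        predictions[(race, education)] = predicted_bw(coefs, nhb, lt_hs, eq_hs)
        print(race, education, predictions[(race, education)])
```

Because every cell has its own combination of terms, none of the six predicted values is forced to equal another, unlike the interaction-only specification.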

Page 12: Specification errors for interaction models:  Implications for the shape of the overall pattern

Interaction between a continuous and a categorical independent variable (IV)

• Example: Race and income-to-poverty ratio (IPR)
  – Race is a two-category IV, specified with a dummy variable NHB, coded
    • 1 = non-Hispanic black
    • 0 = non-Hispanic white (the reference category)
  – IPR is a continuous variable calculated as annual family income (in dollars) divided by the Federal Poverty Level for a family of that size and age composition
  – The interaction between race and IPR is a continuous variable calculated as the product of the NHB dummy and IPR
      NHB_IPR = NHB × IPR

Page 13: Specification errors for interaction models:  Implications for the shape of the overall pattern

Model specification to test an interaction between continuous and categorical IVs

• For a model with an interaction between two independent variables, need ALL of the main effects and interaction term variables related to those two independent variables

• E.g., for a model of birth weight by race and IPR, include the main effect and interaction terms related to race and family IPR:

BW = f (NHB, IPR, NHB_IPR)
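Substituting the two values of NHB into this specification shows why all three terms are needed. The derivation below assumes a linear model with an error term ε, which is added here for completeness.

```latex
\begin{align*}
BW &= \beta_0 + \beta_{NHB}\,\mathrm{NHB} + \beta_{IPR}\,\mathrm{IPR}
      + \beta_{NHB\_IPR}\,(\mathrm{NHB} \times \mathrm{IPR}) + \varepsilon \\[4pt]
\text{NHB} = 0:\quad BW &= \beta_0 + \beta_{IPR}\,\mathrm{IPR} + \varepsilon \\
\text{NHB} = 1:\quad BW &= (\beta_0 + \beta_{NHB})
      + (\beta_{IPR} + \beta_{NHB\_IPR})\,\mathrm{IPR} + \varepsilon
\end{align*}
```

The main effect of NHB lets the intercepts differ, and the interaction term lets the slopes differ; dropping either main effect removes one of those freedoms.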

Page 14: Specification errors for interaction models:  Implications for the shape of the overall pattern

What happens if the specification omits the main effects terms?

• If we omit the main effects terms for the two independent variables involved in the interaction, the implied model specification is
    BW = f (NHB_IPR)
• Then the coefficient βNHB_IPR estimates the slope of the IPR/birth weight curve for blacks, but does not
  – Allow for a different intercept for blacks than for whites
  – Test for a difference in slopes of the IPR/birth weight curves for blacks and for whites
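The consequences of the interaction-only specification can be illustrated with a tiny synthetic data set. Everything in this sketch is an assumption made for illustration: the data are noise-free values generated from two made-up lines (not NHANES III), and `simple_ols` is a hand-written one-predictor least-squares helper.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data from hypothetical lines:
#   white infants: BW = 3300 + 40 * IPR
#   black infants: BW = 3100 + 30 * IPR
ipr_values = [0.5, 1.0, 2.0, 3.0, 4.0]
rows = [(0, ipr, 3300 + 40 * ipr) for ipr in ipr_values]
rows += [(1, ipr, 3100 + 30 * ipr) for ipr in ipr_values]

# Misspecified model: BW = f(NHB_IPR), with no NHB or IPR main effects.
nhb_ipr = [nhb * ipr for nhb, ipr, _ in rows]
bw = [y for _, _, y in rows]
b0_mis, b_nhb_ipr_mis = simple_ols(nhb_ipr, bw)

# Fitting each group separately gives the same lines as the full model
# BW = f(NHB, IPR, NHB_IPR).
black = [(ipr, y) for nhb, ipr, y in rows if nhb == 1]
white = [(ipr, y) for nhb, ipr, y in rows if nhb == 0]
b0_black, slope_black = simple_ols(*zip(*black))
b0_white, slope_white = simple_ols(*zip(*white))

print(f"interaction only: intercept={b0_mis:.1f}, slope={b_nhb_ipr_mis:.1f}")
print(f"separate fits: black slope={slope_black:.1f}, white slope={slope_white:.1f}")
```

In this example both groups have positive slopes by construction, yet the interaction-only fit produces a negative coefficient on NHB_IPR, because the single line for black infants is forced to start from the shared intercept.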

Page 15: Specification errors for interaction models:  Implications for the shape of the overall pattern

Some possible patterns among income, race, and birth weight

[Figure: six panels, each plotting birth weight (BW) against income for white and black infants. Panel titles: Income main effect; Income & race main effects; Income & race main effects, and interaction: converging; Income & race main effects, and interaction: diverging from same intercept; Income & race main effects, and interaction: diverging from different intercepts; Income & race main effects, and interaction: disordinal.]

Page 16: Specification errors for interaction models:  Implications for the shape of the overall pattern

Implied pattern based on NHANES III data if main effects of race and IPR are omitted

BW = f (NHB_IPR) specification forces
• The intercept to be the same for black and white infants
• The slope of the IPR/birth weight curve for white infants to be zero (flat)
• The estimated slope of the IPR/birth weight curve for black infants to be negative in these data

[Figure: BW plotted against IPR for white and black infants. Labeled quantities: β0 (shared intercept) and βNHB_IPR (slope for black infants).]

Page 17: Specification errors for interaction models:  Implications for the shape of the overall pattern

Observed pattern based on model of NHANES III data with main effects and interaction terms

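• BW = f (NHB, IPR, NHB_IPR) specification estimates
  – Different intercepts for blacks and for whites
  – Different slopes for blacks and for whites
• Slopes for both racial/ethnic groups are positive

[Figure: BW plotted against IPR for white and black infants. Labels: intercept for whites = β0; intercept for blacks = β0 + βNHB; slope for whites = βIPR; slope for blacks = βIPR + βNHB_IPR.]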
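The points needed to graph a pattern like this one can be computed directly from the coefficients of the full specification. The sketch below is a minimal illustration: the coefficient values are hypothetical (not the NHANES III estimates), and `predicted_lines` is a name introduced here.

```python
def predicted_lines(coefs, ipr_values):
    """Predicted BW by race at each IPR value from BW = f(NHB, IPR, NHB_IPR)."""
    lines = {}
    for label, nhb in (("white", 0), ("black", 1)):
        # The NHB main effect shifts the intercept; the interaction shifts the slope.
        intercept = coefs["b0"] + coefs["NHB"] * nhb
        slope = coefs["IPR"] + coefs["NHB_IPR"] * nhb
        lines[label] = [intercept + slope * ipr for ipr in ipr_values]
    return lines


# Hypothetical coefficients in grams (per unit of IPR for the slope terms).
coefs = {"b0": 3300.0, "NHB": -200.0, "IPR": 40.0, "NHB_IPR": -10.0}
lines = predicted_lines(coefs, [0.0, 1.0, 2.0, 3.0])
for label, values in lines.items():
    print(label, values)
```

Plotting the two lists against the IPR values gives one line per group, each with its own intercept and slope.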

Page 18: Specification errors for interaction models:  Implications for the shape of the overall pattern

Summary

• Models intended to test for interactions should initially include all main effects and interaction terms for the independent variables involved
• Such a specification
  – Does not impose a priori assumptions about the shape of the association among the IVs and DV
  – Allows the data to reveal the shape and size of that pattern
• Empirical criteria can be used to simplify the specification if βs for some term(s) are not statistically significantly different from one another

Page 19: Specification errors for interaction models:  Implications for the shape of the overall pattern

Suggested resources

• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapter 16.

Page 20: Specification errors for interaction models:  Implications for the shape of the overall pattern

Suggested online resources

• Podcasts on
  – Visualizing shapes of interaction patterns
  – Creating variables and specifying models to test for interactions
  – Calculating the shape of an interaction pattern from regression coefficients
    • Two categorical independent variables
    • One categorical and one continuous independent variable
  – Testing whether a multivariate specification can be simplified

Page 21: Specification errors for interaction models:  Implications for the shape of the overall pattern

Suggested practice exercises

• Using your own data, estimate the following models for an interaction between two categorical independent variables
  – Main effects only
  – Main effects and interactions
  – Interaction terms only (omit the associated main effects terms)
• Using a spreadsheet, calculate and graph the implied overall pattern of the association between the two IVs involved in the interaction and your DV for EACH of the three specifications
  – See spreadsheet template
• Repeat the exercise for an interaction between one categorical and one continuous independent variable

Page 22: Specification errors for interaction models:  Implications for the shape of the overall pattern

Contact information

Jane E. Miller, [email protected]

Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html
