Post on 23-Jun-2015
A gentle introduction to growth curves
Dr. Sean P. Mackinnon, Dalhousie University
When to use a growth curve
Growth curves measure patterns of change over time
Specifically, mean-level changes over time
Patterns can be linear, quadratic, cubic, etc.
Mean-Level Change
        Time 1  Time 2  Time 3
John    10      7       5
Mary    8       5       4
Zoe     7       9       9
Sarah   5       2       1
Bill    2       4       3
MEAN    6.4     5.4     4.4
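As a rough check on the five-person example, the mean at each time point can be reproduced in SPSS. This is only a sketch: the variable names name, t1, t2 and t3 are made up for illustration and do not come from the handout.

```spss
* Enter the five example participants in wide format (hypothetical variable names).
DATA LIST FREE / name (A8) t1 t2 t3.
BEGIN DATA
John 10 7 5
Mary 8 5 4
Zoe 7 9 9
Sarah 5 2 1
Bill 2 4 3
END DATA.

* Mean at each time point.
DESCRIPTIVES VARIABLES=t1 t2 t3
  /STATISTICS=MEAN.
```

The three means should match the MEAN row (6.4, 5.4, 4.4). Note that Zoe and Bill do not follow the average downward trend, which is the kind of individual variation a growth curve can model.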
Limitations of RM-ANOVA
Requires a balanced design (i.e., no missing data)
Requires equal spacing between time points
Requires independence of observations (not often possible in longitudinal data)
Requires homogeneity of variance
Growth Curves overcome these limitations
Accounts for missing data using a full information maximum likelihood (FIML) approach
Does not require equal spacing between time points (can specify unequal time points, e.g., 1, 2, 5, 7, 10)
Does not require independence of observations (can model different types of correlated error structures)
Is robust to violations of homogeneity of variance assumptions required by RM-ANOVA
So… what are growth curves?
Growth curves are a type of mixed (or multilevel) model
Simply put, multilevel models are a way of dealing with clustered data
For example…
[Figure: a two-level data structure. Level 2 (between-subjects): 2 participants, ID001 and ID002, each summarized by an average. Level 1 (within-subjects): 6 measurement occasions.]
Growth Curves are Multilevel Models
All multilevel models (MLMs) partition variance into its appropriate levels
E.g., students nested within schools
Multilevel models also use maximum likelihood estimation, which handles missing data better and is more flexible when dealing with real data
Growth curves are a specific type of MLM where:
The lowest level of observation is repeated measures
The predictor variable is TIME
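One common way to write a linear growth curve as a two-level model is sketched below. The notation (Y, π, β, r, e) follows a widespread textbook convention and is an assumption here; it does not appear in the slides.

```latex
\begin{aligned}
\text{Level 1 (within-person):}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{TIME}_{ti} + e_{ti} \\
\text{Level 2 (between-person):}\quad & \pi_{0i} = \beta_{00} + r_{0i} \\
& \pi_{1i} = \beta_{10} + r_{1i}
\end{aligned}
```

Here Y_{ti} is the score of person i at occasion t, β_{00} and β_{10} are the average intercept and slope, r_{0i} and r_{1i} are person-specific deviations from those averages, and e_{ti} is the within-person residual. Dropping r_{1i} forces every person to share the same slope.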
Application to a clinical context
The RCT is a common design
Growth curves can be used instead of ANOVA
The time*interv interaction is most important
Leiter et al., 2012
How do you do this in SPSS?
First, you need to convert your data from “WIDE” format to “LONG” format
[Figure: example data in wide format]
[Figure: example data in long format] (Use the syntax provided in the handout to get this)
Coding the Time Variable is Important
The choices you make for your time variable will influence your analyses!
If time points are equally spaced and the relationship is linear, use equidistant codes: 1, 2, 3 OR -1, 0, 1, etc.
If you are expecting a quadratic relationship, you also need to calculate time-squared: 1, 4, 9 OR 1, 0, 1
For unequal time points, use codes that match the actual spacing: 1 month, 3 months, 12 months = 1, 3, 12
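The coding options above could be set up along these lines. This is a sketch that assumes long-format data in which time is already coded 1, 2, 3; the new variable names (time_c, quadtime_c, months) are hypothetical.

```spss
* Centred linear time, giving codes -1, 0, 1.
COMPUTE time_c = time - 2.
* Quadratic term from the centred codes, giving 1, 0, 1.
COMPUTE quadtime_c = time_c * time_c.
* Unequal spacing, recoding occasions 1, 2, 3 into months 1, 3, 12.
RECODE time (1=1) (2=3) (3=12) INTO months.
EXECUTE.
```

Whichever coding is used, the intercept is the predicted score where the time variable equals 0, so centring changes what the intercept means.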
Decision 1: ML vs REML
Maximum Likelihood Estimation (ML) vs Restricted Maximum Likelihood Estimation (REML)
REML is generally preferred because it provides less biased estimates of the variance components
ML is preferred if you need to compare nested models that differ in their fixed effects, as REML likelihoods are not comparable across such models
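A nested-model comparison under ML might look like the sketch below. It reuses the variable names from the annotated syntax (ASItotal, time, interv, id); the choice of a random intercept only is an assumption made to keep the example short.

```spss
* Reduced model without the time*interv interaction, estimated with ML.
MIXED ASItotal WITH time interv
  /METHOD = ML
  /FIXED = time interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV.

* Full model adding the interaction, with the same random effects, also ML.
MIXED ASItotal WITH time interv
  /METHOD = ML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV.
```

The difference between the two "-2 Log Likelihood" values in the Information Criteria tables can then be compared against a chi-square distribution with 1 degree of freedom, because the full model adds one fixed-effect parameter.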
Decision 2: Fixed vs Random
Random vs. Fixed Slopes & Intercepts
Random (varying): Allow to vary across people
Fixed (constant): Force them to be equal across people
Random vs. Fixed has no single, agreed-upon definition (Gelman, 2005); I’m presenting a practical conceptualization
Fixed (constant) intercepts and slopes are more parsimonious and less computationally intensive, but may not be as good a fit to the data. Select the most parsimonious model that fits the data best.
[Figure: Random (varying) intercepts, random (varying) slopes]
[Figure: Random (varying) intercepts, fixed (constant) slopes]
[Figure: Fixed (constant) intercepts, random (varying) slopes]
Figure source: http://www.spss.ch/upload/1126184451_Linear%20Mixed%20Effects%20Modeling%20in%20SPSS.pdf
Decision 3: Linear, Quadratic, or Cubic?
If slopes are allowed to be random (varying), then you need at least:
3 time points for linear
4 time points for quadratic (add time*time as a predictor)
5 time points for cubic (add time*time and time*time*time as predictors)
One less time point needed if using fixed slopes
Today, I’m focusing on LINEAR relationships
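For completeness, a cubic model with fixed slopes could be specified roughly as follows. This is a sketch, not syntax from the handout: it assumes a data set with at least four time points (the three-wave example used in this talk would not be enough), and cubtime is a hypothetical variable name.

```spss
* Build the polynomial terms from the time variable.
COMPUTE quadtime = time*time.
COMPUTE cubtime = time*time*time.
EXECUTE.

* Cubic growth curve with a random intercept and fixed slopes.
MIXED ASItotal WITH time quadtime cubtime
  /METHOD = REML
  /FIXED = time quadtime cubtime | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV.
```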
Decision 4: Covariance Structure
Is there a predictable pattern to the errors?
If you are unsure, specify an “unstructured” matrix
Less parsimony because it lets things freely vary
AR(1) correlated error structure is also fairly common
Autoregressive: correlations between errors get smaller as time points get further apart
You can test multiple models with different plausible structures, and choose the one that fits the data best
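One way to request an AR(1) structure for the residuals is through the REPEATED subcommand, sketched below. The copy of time named occasion is a hypothetical helper variable, and pairing AR(1) residuals with a random intercept is an assumption made for illustration rather than a recommendation from the slides.

```spss
* Copy of time used only to index the repeated occasions (hypothetical name).
COMPUTE occasion = time.
EXECUTE.

* Random intercept plus first-order autoregressive residuals within each person.
MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /REPEATED = occasion | SUBJECT(id) COVTYPE(AR1)
  /PRINT = SOLUTION TESTCOV.
```

Running the same model with a different COVTYPE on the REPEATED line and comparing the information criteria is one way to choose between plausible structures.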
Annotated Syntax
MIXED ASItotal WITH time interv
/METHOD = REML
/FIXED = time interv time*interv | SSTYPE(3)
/RANDOM = INTERCEPT time interv | SUBJECT(id) COVTYPE(UN)
/PRINT = SOLUTION TESTCOV HISTORY.
*Mixed model: dependent variable (ASItotal) predicted by time and intervention
*Restricted Maximum Likelihood Estimation (usually better than ML)
*Put all predictors after FIXED. Indicate interactions by Var1*Var2
*The intercept and the slopes for time and interv are random. The slope for the interaction is fixed because I omitted it from this part.
*"UN" specifies an unstructured covariance matrix (other types are possible, but require thought)
Annotated Output: Model Comparison
Use the BIC values to compare nested models (e.g., random slopes vs fixed slopes)
Lower values are better; treat a difference as meaningful when ∆BIC > 4
Annotated Output: Covariance Parameters
UN(1,1) = Variance of the Intercept. Significant, so random intercepts are important to include.
UN(2,2) = Variance of the slope for time. Non-significant, which suggests that a more parsimonious model with fixed slopes for time may fit the data just as well.
Annotated Output
Interpret like ANOVA; parameters adjusted for clustering
Time -> Main effect for time (linear, in this case)
Interv -> Main effect for intervention
Time * interv -> 2-way Interaction
Graphing the interaction is usually important to understand it
Dummy coding (0, 1) intervention helps a LOT
Graphing the interaction
With this kind of model, you can graph the interaction using tools meant for moderation in linear regression
The parameters in the output are interpreted the same way; they are just adjusted to account for the clustering due to repeated measurement and for missing data
http://www.jeremydawson.co.uk/slopes.htm
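To see what gets plotted, it can help to write out the fixed part of the model. In the sketch below, b_0 to b_3 are assumed labels for the fixed-effect estimates reported in the SOLUTION output (intercept, time, interv, time*interv), with interv dummy coded 0 and 1.

```latex
\begin{aligned}
\hat{Y} &= b_0 + b_1\,\mathrm{time} + b_2\,\mathrm{interv} + b_3\,(\mathrm{time}\times\mathrm{interv}) \\
\text{interv} = 0:\quad \hat{Y} &= b_0 + b_1\,\mathrm{time} \\
\text{interv} = 1:\quad \hat{Y} &= (b_0 + b_2) + (b_1 + b_3)\,\mathrm{time}
\end{aligned}
```

Under this coding, b_1 is the slope over time in the group coded 0, and b_3 is how much the slope differs in the group coded 1, which is why the interaction term is the one of most interest in an RCT.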
A few closing points
Other software can implement this (e.g., SAS, Mplus, HLM)
Non-normal data may be better modeled with different distributional assumptions (e.g., Poisson)
Modeling of covariance structures may be important, but can be challenging to figure out
Some programs (e.g., Mplus) may use a latent variable approach
Questions? Comments?
Thank you!
P.S. In the handout I provided, there are some syntax and instructions that may be helpful!
Email me if you want an electronic copy of the presentation:
mackinnon.sean@dal.ca
Appendix: Syntax
*Convert data from LONG to WIDE format.
SORT CASES BY id time.
CASESTOVARS
  /ID=id
  /INDEX=time
  /GROUPBY=VARIABLE.
*Convert data from WIDE to LONG format.
VARSTOCASES
  /MAKE ASItotal FROM ASItotal.0 ASItotal.1 ASItotal.2
  /INDEX=time(3)
  /KEEP=id interv
  /NULL=KEEP.
Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Random Slopes).
MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT time interv time*interv | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.
Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).
MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.
Appendix: Syntax
*Linear Growth Curve with Intervention Group as Moderator (Fixed Intercept, Random Slopes).
MIXED ASItotal WITH time interv
  /METHOD = REML
  /FIXED = time interv time*interv | SSTYPE(3)
  /RANDOM = time interv time*interv | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.
Appendix: Syntax
*Quadratic Growth Curve with Intervention Group as Moderator (Random Intercept, Fixed Slopes).
COMPUTE quadtime = time*time.
EXECUTE.
*quadtime must be listed after WITH so that it can be used in FIXED.
MIXED ASItotal WITH time quadtime interv
  /METHOD = REML
  /FIXED = time quadtime interv time*interv quadtime*interv | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(id) COVTYPE(UN)
  /PRINT = SOLUTION TESTCOV HISTORY.