Longitudinal Data Analysis in Stata

William Johnston

March 5, 2013

www.gse.harvard.edu

Accessing Workshop Materials

Go to:

isites.harvard.edu/research_technologies

– Click on Workshops tab (on the left) and then the Longitudinal Analysis with Stata folder (near the bottom)

– Save all of the files to the desktop (right click and ‘Save Link As’)

Agenda

• Pre-Estimation Data Setup

• xtmixed Estimation

• xtmelogit Estimation & Post-Estimation

Data Preparation for xt___ Commands

• Data needs to be in long format

• The append command adds the new data file to the bottom, rather than the right, of your current dataset.

• Prior to appending, make sure time-varying variables are in this format:

  – var1 (in the first wave)
  – var2 (in the second wave)
  – var3 (in the third wave), etc.
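A minimal sketch of the wide-to-long step, assuming a toy wide file in which a hypothetical time-varying variable score is stored as score1, score2, score3 and id identifies the person. It uses reshape rather than append, since the numbered-suffix naming above is also what reshape long expects. The variable names and values are made up for illustration.

```stata
* Toy wide data: one row per person, one column per wave (hypothetical values)
clear
input id score1 score2 score3
1 10 12 15
2  8  9 11
end

* Stack the waves: one row per person-wave, with wave taken from the suffix
reshape long score, i(id) j(wave)

* Declare the panel structure so xt commands know the cluster and time variables
xtset id wave
list, sepby(id)
```

After the reshape, each person contributes one row per wave, which is the layout the xt commands in the rest of this workshop expect.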

Data Preparation, cont’d

• Please refer to the syntax and PowerPoint slides from my Data Management workshop for more info:

  – Merging to make a wide data file
  – Reshaping data from wide to long or vice versa
  – Looping commands over multiple variables
  – Many other tips and tricks

Is Missing Data an Issue?

• Chances are, you have missing data.

• The bad news:

  – You can’t ignore it entirely.
  – If it’s non-random, it can lead to bias.
  – There is no one right way to handle this issue.

• The good news:

  – mi estimate works with most xt commands.
  – The multilevel model for change can accommodate missingness.

Estimation with xtmixed

Anatomy of an xt command:

xtmixed y x || id: z, cov(un) variance mle

• y: outcome

• x: predictor

• ||: divider between fixed and random portions

• id: cluster variable (random intercepts)

• z: variable across which random slopes are estimated

• cov(un): allow covariances among the random effects to be estimated

• variance: display variances and covariances rather than standard deviations and correlations

• mle: maximum likelihood estimation (default)

An example of xtmixed

• Chapter 4 from ALDA / Curran, Stice & Chassin (1997)

• Outcome: alcuse

• Predictors: age, coa, peer

• Model A: unconditional means (p. 92)

xtmixed alcuse || id: , variance mle

• Model B: unconditional growth model (p. 99)

xtmixed alcuse age_14 || id: age_14, cov(un) variance mle

• Model F: final conditional model with centered predictors cpeer and ccoa (p. 114)

xtmixed alcuse ccoa cpeer age_14 c.cpeer#c.age_14 || id: age_14, cov(un) variance mle
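The models above use age_14, cpeer and ccoa, which are derived from the raw predictors age, peer and coa. A minimal sketch of how those variables could be built and how Models A and B could be compared, assuming the alcohol-use data are already in memory in long format with id, alcuse, age, coa and peer. Centering at the sample mean and the stored-estimate names modelA and modelB are assumptions for illustration.

```stata
* Time centered at age 14, so the intercept describes status at the first wave
gen age_14 = age - 14

* Center the person-level predictors at their sample means (assumed centering)
summarize peer, meanonly
gen cpeer = peer - r(mean)
summarize coa, meanonly
gen ccoa = coa - r(mean)

* Model A: unconditional means
xtmixed alcuse || id: , variance mle
estimates store modelA

* Model B: unconditional growth
xtmixed alcuse age_14 || id: age_14, cov(un) variance mle
estimates store modelB

* Likelihood-ratio test of Model A against Model B (both fit by ML)
lrtest modelA modelB
```

Because both models are fit with mle, the likelihood-ratio test can compare specifications that differ in fixed as well as random parts. Stata flags the test as conservative when variance components sit on the boundary of the parameter space.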

Interpreting Results

To the handout!!!

An example of xtmelogit

The data: toenails!!!!

• This is the example from chapter 10 of Rabe-Hesketh & Skrondal (2012), Multilevel and Longitudinal Modeling Using Stata, Vol. II

• outcome: outcome (onycholysis, the separation of the nail plate from the nail bed)

• treatment: treatment (0: itraconazole; 1: terbinafine)

• visit: visit number (1, 2, 3, ..., 7)

• month: exact timing of visit in months

xtmelogit, continued

• xtdescribe: what are the missing data patterns?

  – MLE allows all data to be used... no list-wise deletion! (MAR is assumed)

• What is the nature of the outcome across the treatment groups?

  – See .do file
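A minimal sketch of these two descriptive checks, assuming the toenail data are already in memory with the variables patient, visit, treatment and outcome listed earlier. This is an illustration and not necessarily what the workshop .do file contains.

```stata
* Declare the panel structure; xtdescribe needs it to tabulate visit patterns
xtset patient visit

* Participation patterns across the seven visits (shows where visits are missing)
xtdescribe

* Proportion with onycholysis at each visit, by treatment arm
tabulate visit treatment, summarize(outcome) means
```

Because outcome is coded 0/1, the cell means in the last table are the observed proportions with onycholysis in each arm at each visit.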

xtmelogit, continued

• We want to relax the assumption of conditional independence among the outcome responses for each person

• We want a patient-specific random intercept

gen trt_month = treatment*month

xtmelogit outcome treatment month trt_month || patient:

• The or option provides odds ratios rather than log odds

xtmelogit outcome treatment month trt_month || patient: , or

xtmelogit: Post-Estimation

• How do the two groups compare over time?

margins treatment, at(month=(1(2)18)) predict(mu fixedonly) vsquish

marginsplot

• Is there a difference between the two groups after 10 months?

lincom 1.treatment + trt_month*10, or

lincom 0.treatment + trt_month*10, or

lincom (1.treatment + trt_month*10) - (0.treatment + trt_month*10), or
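The margins and lincom commands above address treatment with factor-variable notation (treatment as a margins variable, 1.treatment as a coefficient), which Stata accepts only when the model was fit with treatment as a factor variable. A minimal sketch of one way to set that up, assuming the toenail data are in memory. The i.treatment##c.month specification is an assumption and may differ from the workshop .do file.

```stata
* Same random-intercept model, letting Stata build the treatment-by-month interaction
xtmelogit outcome i.treatment##c.month || patient: , or

* Predicted probabilities from the fixed portion only (random intercept set to 0),
* for each treatment arm at months 1, 3, ..., 17
margins treatment, at(month=(1(2)18)) predict(mu fixedonly) vsquish
marginsplot

* Odds ratio for terbinafine vs. itraconazole at month 10:
* treatment main effect plus 10 times the interaction coefficient
lincom 1.treatment + 10*1.treatment#c.month, or
```

With this specification the interaction coefficient is named 1.treatment#c.month rather than trt_month, and margins knows that the interaction changes with both treatment and month.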

An example of xtmepoisson

The data - Breslow & Clayton (1993)

use http://www.stata-press.com/data/r12/epilepsy

• subject (n=59)

• seizures: number of seizures

• treat: 1=medication 0=placebo

• visit: doctor visit

• lage: log of age

• lbas: log(.25*baseline seizures)

• lbas_trt: lbas x treatment

• v4: 4th visit indicator

xtmepoisson, continued

Model 1: Random intercepts, fixed effect for 4th visit

xtmepoisson seizures treat lbas lbas_trt lage v4 || subject:

Model 2: Remove fixed effect of v4, add random slopes for each visit

xtmepoisson seizures treat lbas lbas_trt lage visit || subject: visit, cov(unstructured) intpoints(9)

xtmepoisson, continued

Other display possibilities

• variance option will display variances instead of standard deviations

• irr option will display incidence-rate ratios

Interpretation of Model 1:

• Significant drop in number of seizures at the fourth visit

• Treatment group has fewer seizures than placebo group
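A minimal sketch of the two display options applied to Model 1, using the epilepsy dataset and variable names given earlier. Reading the fourth-visit effect off with lincom is one possible approach, offered here as an illustration.

```stata
* Load the epilepsy data used in the earlier slides
use http://www.stata-press.com/data/r12/epilepsy, clear

* Model 1 with incidence-rate ratios and variance components reported as variances
xtmepoisson seizures treat lbas lbas_trt lage v4 || subject: , irr variance

* Fourth-visit effect as an incidence-rate ratio with its confidence interval
lincom v4, irr
```

An incidence-rate ratio below 1 for v4 corresponds to the drop in seizures at the fourth visit. The treatment comparison involves both treat and lbas_trt, so the treatment effect implied by the model depends on the baseline seizure count.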
