Multilevel Modeling using Stata

Andrew Hicks
CCPR Statistics and Methods Core

Workshop based on the book:
Multilevel and Longitudinal Modeling Using Stata (Second Edition)
by Sophia Rabe-Hesketh and Anders Skrondal

[Figure: Mini Wright measurements (y-axis, ticks from 200 to 700) by Subject ID (x-axis, subjects 1 to 17), with separate markers for Occasion 1 and Occasion 2]

Within-Subject Dependence

Within-Subject Dependence: We can predict the occasion 2 measurement if we know the subject’s occasion 1 measurement.

Between-Subject Heterogeneity: Large differences between subjects (compare subjects 9 and 15).

Within-subject dependence is due to between-subject heterogeneity

Standard Regression Model

$y_{ij} = \beta + \xi_{ij}$

$y_{ij}$: measurement on occasion i for subject j

$\beta$: population mean

$\xi_{ij}$: residuals (error terms), independent over subjects and occasions

Clearly ignores information about within-subject dependence

Variance Component Model

$y_{ij} = \beta + \xi_{ij}$

$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$

$\zeta_j$: random intercept, the deviation of subject j’s mean from the overall mean

$\epsilon_{ij}$: within-subject residual, the deviation of observation i from subject j’s mean


[Figure: the variance component model for one subject j, showing the overall mean $\beta$, the random intercept $\zeta_j$, the subject mean $\beta + \zeta_j$, and the within-subject residuals $\epsilon_{1j}$ and $\epsilon_{2j}$]

Variance Component Model

$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$

$\zeta_j \sim N(0, \psi)$, $\epsilon_{ij} \sim N(0, \theta)$

$\mathrm{Var}(y_{ij}) = \mathrm{Var}(\beta) + \mathrm{Var}(\zeta_j) + \mathrm{Var}(\epsilon_{ij}) = 0 + \psi + \theta$

$\mathrm{Var}(y_{ij}) = \psi + \theta$

Variance Component Model

$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$

Proportion of total variance due to subject differences:

$\rho = \frac{\mathrm{Var}(\zeta_j)}{\mathrm{Var}(y_{ij})} = \frac{\psi}{\psi + \theta}$

Intraclass correlation (within-cluster correlation), for two occasions $i \neq i'$ of the same subject:

$\mathrm{Cor}(y_{ij}, y_{i'j}) = \frac{\psi}{\psi + \theta}$
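The proportion of variance due to subjects and the within-cluster correlation are the same quantity, which can be checked directly. A short derivation, assuming (as is standard for this model, though not stated on the slide) that $\zeta_j$ and $\epsilon_{ij}$ are independent of each other and that the residuals on two different occasions $i \neq i'$ are independent:

```latex
\begin{aligned}
\mathrm{Cov}(y_{ij}, y_{i'j})
  &= \mathrm{Cov}(\zeta_j + \epsilon_{ij},\ \zeta_j + \epsilon_{i'j})
   = \mathrm{Var}(\zeta_j) = \psi \\
\mathrm{Cor}(y_{ij}, y_{i'j})
  &= \frac{\mathrm{Cov}(y_{ij}, y_{i'j})}
          {\sqrt{\mathrm{Var}(y_{ij})\,\mathrm{Var}(y_{i'j})}}
   = \frac{\psi}{\psi + \theta}
\end{aligned}
```

The shared random intercept is the only term the two measurements have in common, so it alone drives their covariance.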

Random or Fixed Effect?

Since every subject has a different effect, we can think of subjects as categorical explanatory variables. Since the effect of each subject is random, we have been using a random effect model:

$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$, $\zeta_j \sim N(0, \psi)$

What if we want to fix our model so that each effect is for a specific subject? Then we would use a fixed effect model:

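$y_{ij} = \beta + \alpha_j + \epsilon_{ij}$, where $\alpha_j$ is a fixed, subject-specific parameter

. xtreg wm, fe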
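The two specifications can be fitted side by side. The sketch below is not from the workshop: it simulates a small stand-in dataset and then fits the random effect model by maximum likelihood and the fixed effect model with `xtreg`. The identifier `id`, the seed, and all numeric values are made up for illustration; only the outcome name `wm` comes from the slide.

```stata
* Hypothetical stand-in data: 17 subjects, 2 occasions each.
* id, zeta, occasion, the seed, and all numeric values are assumptions.
clear
set seed 12345
set obs 17
generate id = _n
* Subject-level random intercept zeta_j with standard deviation 100
generate zeta = rnormal(0, 100)
* Duplicate each subject row to get two occasions per subject
expand 2
bysort id: generate occasion = _n
* wm = beta + zeta_j + epsilon_ij with beta = 450 and sd(epsilon) = 20
generate wm = 450 + zeta + rnormal(0, 20)

* Declare the cluster (panel) identifier
xtset id

* Random effect model: reports beta, sigma_u = sqrt(psi), sigma_e = sqrt(theta)
xtreg wm, mle

* Fixed effect model: one fixed intercept per subject
xtreg wm, fe
```

In the `xtreg, mle` output, `sigma_u` and `sigma_e` are the estimated standard deviations of the random intercept and of the within-subject residual, and `rho` is the estimated share of the total variance due to subjects.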

Random or Fixed Effect?

random effect model:

if the interest concerns the population of clusters

“generalize the potential effect”, e.g. the nurse giving the drug

fixed effect model:

if we are interested in the “effect” of the specific clusters in a particular dataset

“replicable in life”, e.g. the actual drug

Random Intercept Model with Covariates

without covariates:

$y_{ij} = \beta + \xi_{ij} = \beta + \zeta_j + \epsilon_{ij}$

with covariates:

$y_{ij} = \beta_1 + \beta_2 x_{2ij} + \dots + \beta_p x_{pij} + \xi_{ij}$

$y_{ij} = \beta_1 + \beta_2 x_{2ij} + \dots + \beta_p x_{pij} + \zeta_j + \epsilon_{ij}$

$\zeta_j$: a random parameter that is not estimated along with the fixed parameters $\beta_1, \dots, \beta_p$, but whose variance $\psi$ is estimated together with the variance $\theta$ of $\epsilon_{ij}$

Ecological Fallacy

Occurs when between-cluster relationships differ substantially from within-cluster relationships.

• Can be caused by cluster-level confounding

For example, mothers who smoke during pregnancy may also adopt other behaviors such as drinking and poor nutritional intake, or have lower socioeconomic status and be less educated. These variables adversely affect birthweight and have not been adequately controlled for. In these cases the covariate is correlated with the error term (endogeneity).

• Because of this, the between-effect may be an overestimate of the true effect.

• In contrast, for within-effects each mother serves as her own control, so within-mother estimates may be closer to the true causal effect.
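The gap between between-effects and within-effects under cluster-level confounding can be seen in a small simulation. This is an illustrative sketch, not the birthweight data discussed above: every variable name and numeric value is hypothetical, and the covariate `x` is deliberately built to be correlated with the unobserved cluster effect `zeta`. The true coefficient on `x` is set to 0.5.

```stata
* Hypothetical simulated data: 200 clusters with 5 observations each
clear
set seed 2024
set obs 200
generate id = _n
* Unobserved cluster-level confounder (ends up in the error term)
generate zeta = rnormal(0, 1)
expand 5
* Covariate correlated with the confounder: the source of endogeneity
generate x = 0.8*zeta + rnormal(0, 1)
* True effect of x on y is 0.5
generate y = 1 + 0.5*x + zeta + rnormal(0, 1)
xtset id

* Between-effect: regression on cluster means, contaminated by zeta
xtreg y x, be
scalar b_between = _b[x]

* Within-effect: each cluster serves as its own control
xtreg y x, fe
scalar b_within = _b[x]

display "between: " b_between "   within: " b_within
```

By construction the between estimate is pulled upward by `zeta`, while the within estimate should land near the true value of 0.5.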


How to test for endogeneity?

Use the Hausman test to compare two alternative estimators of $\beta$.
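A common way to run this comparison in Stata is to store the fixed-effects and random-effects fits and pass them to `hausman`. A minimal sketch on simulated data follows. The variable names and values are hypothetical, with `x` constructed to be correlated with the cluster effect so that the two estimators disagree.

```stata
* Hypothetical simulated data with an endogenous covariate
clear
set seed 99
set obs 200
generate id = _n
generate zeta = rnormal(0, 1)
expand 5
generate x = 0.8*zeta + rnormal(0, 1)
generate y = 1 + 0.5*x + zeta + rnormal(0, 1)
xtset id

* Fixed-effects (within) estimator: consistent even if x is correlated with zeta
xtreg y x, fe
estimates store fixed

* Random-effects estimator: efficient, but inconsistent under endogeneity
xtreg y x, re
estimates store random

* Hausman test: a small p-value suggests the estimators differ systematically
hausman fixed random
```

The consistent estimator is listed first and the efficient one second. A rejection is usually read as evidence against the random-effects assumption that the covariate is uncorrelated with the cluster effect.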

Random-coefficient model

We’ve already considered random intercept models, where the intercept is allowed to vary over clusters after controlling for covariates.

What if we would also like the coefficients (or slopes) to vary across clusters?

Models that involve both random intercepts and random slopes are called Random Coefficient Models.

Random-coefficient model

Random Intercept Model:

$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_j + \epsilon_{ij}$

Random Coefficient Model:

$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j} x_{ij} + \epsilon_{ij}$

$y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) x_{ij} + \epsilon_{ij}$

$(\beta_1 + \zeta_{1j})$: cluster-specific random intercept

$(\beta_2 + \zeta_{2j})$: cluster-specific random slope
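In Stata, a model of this form can be fitted with `mixed` (named `xtmixed` in releases before Stata 13), listing the covariate after the cluster identifier to request a random slope. The sketch below uses simulated data, and all names and parameter values are hypothetical.

```stata
* Hypothetical simulated data: 100 clusters with 10 observations each
clear
set seed 777
set obs 100
generate id = _n
* Cluster-specific deviations for the intercept (zeta1) and the slope (zeta2)
generate zeta1 = rnormal(0, 2)
generate zeta2 = rnormal(0, 0.5)
expand 10
generate x = rnormal(0, 1)
* y_ij = (beta1 + zeta1_j) + (beta2 + zeta2_j) x_ij + epsilon_ij
generate y = (1 + zeta1) + (2 + zeta2)*x + rnormal(0, 1)

* Random intercept only, for comparison
mixed y x || id:

* Random intercept and random slope on x, with a freely estimated covariance
mixed y x || id: x, covariance(unstructured)
```

The `covariance(unstructured)` option lets the random intercept and random slope be correlated. Without it, `mixed` assumes they are independent.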