201ab Quantitative Methods: Linear Mixed Effects Models
ED VUL | UCSD Psychology

Linear mixed effect models

• What are linear mixed effects models?
• Fixed vs random effects?
  – Fixed effect: no distribution on coefficients
  – Random effect: some distribution on coefficients
  – Consequences: bias/variance, shrinkage, overcomplete models
• Significance tests in linear mixed effect models
  – Use lmerTest (Satterthwaite)

Linear mixed effect models

• Linear models (i.e., y = sum(B*X))...
• ...with more random effects than just the error/residual term
  – e.g., variation by trial, by subject, by item, etc.

Distributions on parameters

• Basic linear model: y_i = a + b*x_i + e_i
  – We simultaneously estimate a, b, and all the e_i.
  – Without any distributional constraints or minimization criteria, there are infinitely many solutions that satisfy the equality.
  – But we say we are "minimizing squared error" or maximizing likelihood under the following distributional assumption: e_i ~ Normal(0, s)
  – This has several consequences:
    (1) It creates a new parameter (s).
    (2) It constrains the error terms, so they are no longer free parameters.
    (3) It makes some solutions above better than others, which lets us find one best-fitting model, best under that distributional constraint.
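To make the least-squares/ML equivalence concrete, here is a minimal R sketch (simulated data; the names x, y, and negloglik are illustrative, not from the slides): fitting by lm() and by explicitly maximizing the Normal likelihood recover essentially the same a and b.

  # Minimal sketch: least squares vs. explicit ML under e_i ~ Normal(0, s).
  set.seed(1)
  x <- rnorm(100)
  y <- 2 + 3 * x + rnorm(100)

  fit <- lm(y ~ x)  # minimizes squared error

  # Maximize the Normal likelihood over (a, b, log s):
  negloglik <- function(p) {
    -sum(dnorm(y, mean = p[1] + p[2] * x, sd = exp(p[3]), log = TRUE))
  }
  ml <- optim(c(0, 0, 0), negloglik)

  coef(fit)    # a, b from least squares
  ml$par[1:2]  # a, b from ML: essentially identical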

Fixed vs Random effects

y_i = a + b*x_i + e_i
e_i ~ Normal(0, s)

• Fixed effects are coefficients / offsets / parameters that are completely unconstrained. We have no distribution over the values of those parameters. This means that all possible values of those parameters are equally good a priori. Here: a, b, s.
• Random effects are coefficients / offsets / parameters that have a specified distribution. This means that some values of those parameters are a priori more likely than others, regardless of the data. Here: e_i.

Many sources of variability

• subject: some subjects score higher than others.
• test: some tests are harder than others.
• subject:test (or test|subject): some tests are harder for some subjects.
• measurement: measurements of a person on the same test will not repeat perfectly.
• Etc.
• Some may be indistinguishable in a given dataset.
  – e.g., subject:test and measurement are indistinguishable with 1 measurement per test per subject.
  – e.g., variation in scores by subject age likely exists, but cannot be separated from other sources of subject variation here.

Models of the same data

• No subject effect:
  y_i = a + b*test_B_i + e_i

Models of the same data

• No subject effect:
  y_i = a + b*test_B_i + e_i
• Adding a subject effect:
  y_i = a + b*test_B_i + c_2*sub_2_i + c_3*sub_3_i + c_4*sub_4_i + ... + e_i
• Note the consequence on the significance of the test effect.

Models of the same data

• No subject effect:
  y_i = a + b*test_B_i + e_i
• Adding a subject effect:
  y_i = a + b*test_B_i + c_2*sub_2_i + c_3*sub_3_i + c_4*sub_4_i + ... + e_i
• Explicit variable intercept:
  y_i = c[sub_i] + b*test_B_i + e_i

Models of the same data

• Subject effect as k-1 offsets from intercept:
  y_i = a + b*test_B_i + c_2*sub_2_i + c_3*sub_3_i + c_4*sub_4_i + ... + e_i
• Subject effect as k unique intercepts:
  y_i = c[sub_i] + b*test_B_i + e_i

Fixed effect models

• Subject effect as k-1 offsets from intercept:
  y_i = a + b*test_B_i + c_2*sub_2_i + c_3*sub_3_i + c_4*sub_4_i + ... + e_i
• Subject effect as k unique intercepts:
  y_i = c[sub_i] + b*test_B_i + e_i
• We cannot fit an overall intercept and an offset for each subject, because together they are overcomplete, indeterminable...
  y_i = a + c[sub_i] + b*test_B_i + e_i
• ...unless we put a distribution on them.

Random effect model

• We can fit an overall intercept and an offset for each subject, if subject offsets are random effects with a distribution...
  y_i = a + c[sub_i] + b*test_B_i + e_i
  e_i ~ Normal(0, s_e)
  c[] ~ Normal(0, s_c)
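In lme4 syntax, this random-intercept model can be written roughly as below; a sketch assuming a hypothetical data frame dat with columns y, test_B, and sub (not from the slides):

  # Sketch: random intercept per subject, c[] ~ Normal(0, s_c).
  library(lme4)
  m <- lmer(y ~ test_B + (1 | sub), data = dat)
  summary(m)  # reports a, b, the residual SD (s_e), and the (1|sub) SD (s_c)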

Consequences of putting distributions on parameters

(1) We can fit overcomplete models by putting distributional constraints on groups of parameters.

Fixed vs Random effect model

Distributions on coefficients make them "shrink" toward the average, because we can balance different types of errors.

• Random effect model:
  y_i = a + c[sub_i] + b*test_B_i + e_i
  e_i ~ Normal(0, s_e)
  c[] ~ Normal(0, s_c)
• Fixed effect model:
  y_i = c[sub_i] + b*test_B_i + e_i
  e_i ~ Normal(0, s_e)
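A quick way to see the shrinkage is to fit both versions and compare the per-subject intercepts; a sketch, assuming the same hypothetical dat as above:

  # Fixed version: k free subject intercepts; random version: c[] ~ Normal(0, s_c).
  library(lme4)
  m.fixed  <- lm(y ~ 0 + sub + test_B, data = dat)
  m.random <- lmer(y ~ test_B + (1 | sub), data = dat)

  coef(m.fixed)       # unconstrained per-subject intercepts
  coef(m.random)$sub  # shrunken: pulled toward the overall intercept a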

Consequences of putting distributions on parameters

(1) We can fit overcomplete models by putting distributional constraints on groups of parameters.
(2) Shrinkage: values are biased toward the prior average.
(3) Bias-variance tradeoff: with no distribution on parameters we favor variance of estimates. With a distribution on parameters we reduce variance, in exchange for some bias.


Linear mixed effect models

• What are linear mixed effects models?
• Fixed vs random effects?
  – Fixed effect: no distribution on coefficients
  – Random effect: some distribution on coefficients
  – Consequences: bias/variance, shrinkage, overcomplete models
• Significance tests in linear mixed effect models
  – Use lmerTest (Satterthwaite)

Significance tests in lmer

• Significance tests are complicated, because of the many sources of error that balance against one another.

Significance tests in lmer

• Standard lme4::lmer() summary output produces a t-value but no p-value, because there is no df.
  – You could assert that df is infinite and use a Wald z-test.
  – But you shouldn't: it is fairly anticonservative for low n.
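For reference, the Wald z-test the slide warns against is a one-liner; tval here is a hypothetical t-value read off summary(m):

  tval <- 2.1            # hypothetical t from summary(m)
  2 * pnorm(-abs(tval))  # Wald z p-value: anticonservative for low n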

Significance tests in lmer

• A more common standard recommendation is to do nested model comparisons via a likelihood ratio test. I have used this approach for a while.
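A sketch of that nested-comparison approach, using the hypothetical dat and model from earlier: refit with and without the fixed effect of interest (as ML fits, not REML) and compare likelihoods.

  library(lme4)
  m1 <- lmer(y ~ test_B + (1 | sub), data = dat, REML = FALSE)
  m0 <- lmer(y ~ 1      + (1 | sub), data = dat, REML = FALSE)
  anova(m0, m1)  # likelihood ratio chi-square test for the test_B fixed effect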

Significance tests in lmer

• Finally: lmerTest::lmer(), which includes the Satterthwaite approximation to get df and p-values, just like standard LM output. I suggest doing this, as it seems to be the prevailing current recommendation.
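A sketch of the lmerTest workflow (same hypothetical dat): load lmerTest, and its lmer() adds Satterthwaite df and p-values to the usual output.

  library(lmerTest)  # masks lme4::lmer with a p-value-producing version
  m <- lmer(y ~ test_B + (1 | sub), data = dat)
  summary(m)  # coefficient t-tests with Satterthwaite df and p-values
  anova(m)    # F-tests with Satterthwaite df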

Significance tests in lmer

• The Satterthwaite approximation/correction yields answers consistent with classic tests when those are options.
• This requires an REML=TRUE fit, rather than an ML fit... and I'm not sure what the consistency guarantees are.

Linear mixed effect models

• What are linear mixed effects models?
• Fixed vs random effects?
  – Fixed effect: no distribution on coefficients
  – Random effect: some distribution on coefficients
  – Consequences: bias/variance, shrinkage, overcomplete models
• Significance tests in linear mixed effect models
  – Use lmerTest (Satterthwaite)
• Reminder (from last term) about a common use case

Motivating data

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

str(test.data)
'data.frame': 400 obs. of 6 variables:
 $ student: Factor w/ 20 levels "S.1","S.10","S.11",..: 1 1 1 1 1 1 1 1 ...
 $ sex    : Factor w/ 2 levels "female","male": 2 2 2 2 2 2 2 2 2 2 ...
 $ test   : Factor w/ 2 levels "post","pre": 2 2 2 2 2 2 2 2 2 2 ...
 $ part   : Factor w/ 10 levels "P.1","P.10","P.2",..: 1 3 4 5 6 7 8 9 ...
 $ section: Factor w/ 2 levels "qualitative",..: 1 1 1 1 1 2 2 2 2 2 ...
 $ score  : num 53 50 79 67 70 68 68 62 65 79 ...

head(test.data)
      student  sex test part      section score
S.1       S.1 male  pre  P.1  qualitative    53
S.1.2     S.1 male  pre  P.2  qualitative    50
S.1.3     S.1 male  pre  P.3  qualitative    79
S.1.4     S.1 male  pre  P.4  qualitative    67
S.1.5     S.1 male  pre  P.5  qualitative    70
S.1.6     S.1 male  pre  P.6 quantitative    68

Motivation

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
So: 20 (students) x 20 (test x parts) = 400 measurements.

What's the correlation structure?
How do you analyze this?

Random vs fixed effects: lay theory

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
So: 20 (students) x 20 (test x parts) = 400 measurements.

1) Do students improve from pre to post?
2) Do females outperform males?
3) Is qualitative or quantitative easier?
4) Is improvement different for males and females? For qual. vs quant.?
5) Does qual./quant. improve more? Easier for males/females?
6) Is the male/female difference different for qual/quant? Pre/post?
7) Does learning [pre vs post] alter any qual/quant disparity between males/females?
8) Are some parts easier or harder?
9) Are some parts easier for males/females? Do they improve more?
10) Do some parts improve more for males than females?
11) Do some students do better or worse?
12) Are some students better at qual/quant? Do they improve more?
13) Do some students improve more on qual/quant?

We can classify these kinds of questions as:
• Fixed main effects.
• Fixed 2-way interactions.
• Fixed 3-way interactions.
• Random main effects.
• Random 2-way interactions.
• Random 3-way interactions.

Random vs fixed effects: lay theory

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
So: 20 (students) x 20 (test x parts) = 400 measurements.

Fixed effects:
• pre/post
• male/female
• qual/quant
Factor levels are of general relevance. We care about offsets for specific levels.

Random effects:
• student
• part
Factor levels are specific to our study. They are random samples of possible levels that might occur in the world. Maybe we care about variance across levels in general, but not the actual offsets for specific levels.

Crossed random effects

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.
So: 20 (students) x 20 (test x parts) = 400 measurements.

• Repeated measures ANOVA worldview: two options:
  – Students as the unit of analysis: sex as a "between student" factor; section and time as "within student" factors.
  – Test part as the unit of analysis: section as a "between part" factor; sex and time as "within part" factors.

Analysis by pooling (subject)

summary(aov(data=test.data,
  score ~ sex*test*section + Error(student/(test*section))))

Error: student
          Df Sum Sq Mean Sq F value Pr(>F)
sex        1   3570    3570   1.443  0.245
Residuals 18  44539    2474

Error: student:test
          Df Sum Sq Mean Sq F value   Pr(>F)
test       1   0.90    0.90   0.879    0.361
sex:test   1  33.06   33.06  32.195 2.21e-05 ***
Residuals 18  18.48    1.03

Error: student:section
            Df Sum Sq Mean Sq F value   Pr(>F)
section      1   5048    5048   661.6 1.20e-15 ***
sex:section  1   1604    1604   210.2 2.27e-11 ***
Residuals   18    137       8

Error: student:test:section
                 Df Sum Sq Mean Sq F value   Pr(>F)
test:section      1 1242.6  1242.6  1238.1  < 2e-16 ***
sex:test:section  1   61.6    61.6    61.4 3.29e-07 ***
Residuals        18   18.1     1.0

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals 320  31652   98.91

Analysis by pooling (item)

summary(aov(data=test.data,
  score ~ sex*test*section + Error(part/(test*sex))))

Error: part
          Df Sum Sq Mean Sq F value Pr(>F)
section    1   5048    5048   1.302  0.287
Residuals  8  31014    3877

Error: part:test
             Df Sum Sq Mean Sq F value   Pr(>F)
test          1    0.9     0.9   0.046    0.836
test:section  1 1242.6  1242.6  62.891 4.65e-05 ***
Residuals     8  158.1    19.8

Error: part:sex
            Df Sum Sq Mean Sq F value   Pr(>F)
sex          1   3570    3570  205.83 5.44e-07 ***
sex:section  1   1604    1604   92.48 1.14e-05 ***
Residuals    8    139      17

Error: part:test:sex
                 Df Sum Sq Mean Sq F value   Pr(>F)
sex:test          1  33.06   33.06   23.53 0.001270 **
sex:test:section  1  61.62   61.62   43.86 0.000165 ***
Residuals         8  11.24    1.41

Error: Within
           Df Sum Sq Mean Sq F value Pr(>F)
Residuals 360  45042   125.1

Analysis by item/subject pooling

summary(aov(data=test.data, score ~ sex*test*section + Error(student/(test*section))))
summary(aov(data=test.data, score ~ sex*test*section + Error(part/(test*sex))))

Both strategies (outputs shown on the previous two slides) give us the wrong answer, because each neglects one source of covariation.

• Subject analysis (pooling over parts): ignores within-section, among-part variation, which is most relevant for assessing differences between sections.
• Item analysis (pooling over students): ignores within-sex, among-student variation, which is most relevant for assessing differences between sexes.

We want both

summary(aov(data=test.data, score ~ sex*test*section + Error(student/(test*section))))
summary(aov(data=test.data, score ~ sex*test*section + Error(part/(test*sex))))

But we can't do this:
summary(aov(data=test.data,
  score ~ sex*test*section + Error(student/(test*section)) + Error(part/(test*sex))))

We need a more flexible way to specify the correlation structure.

lme4::lmer() / lmerTest::lmer() syntax

• Specify all the random effects manually!
• (thing that varies | grouping variable it varies over)
• E.g., (1|student): intercept varies randomly with student
       (section|student): section effect varies across students

m = lmer(data=test.data, score ~ sex*test*section +
  (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section) +
  (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex) +
  (1|student:part))
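Having fit m, the estimated random effect structure can be inspected directly; a brief sketch:

  VarCorr(m)        # Std.Dev. of each random-effect grouping term
  ranef(m)$student  # the (shrunken) per-student offsets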

Full random effect structure

We have pre- and post-class exam scores for 10 males and 10 females. The exam is divided into a qualitative and quantitative section, with 5 parts in each section.

m = lmer(data=test.data, score ~ sex*test*section +
  (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section))
This is the full "student" random effect structure.

m = lmer(data=test.data, score ~ sex*test*section +
  (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex))
This is the full "part" random effect structure.

m = lmer(data=test.data, score ~ sex*test*section +
  (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section) +
  (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex) +
  (1|student:part))
This is the full random effect structure of the design.

lmerTest::lmer output

[Figure: lmerTest::lmer summary output shown on slide.]

Fixed effects shrink ranef variance

Fixed effects improve model fit by shrinking variances. Which variances are shrunk by a given fixed effect term is thorny in imperfectly balanced designs, which is why df is ambiguous.

summary(m0)
Random effects:
 Groups               Name  Std.Dev.
 student:part         (Int)  0.88364
 student:test:section (Int)  0.36681
 part:sex:test        (Int)  0.95617
 student:section      (Int)  0.71859
 student:test         (Int)  0.01777
 part:sex             (Int)  2.94028
 part:test            (Int)  2.54545
 student              (Int) 11.08337
 part                 (Int)  9.27407
 Residual                    0.60322

summary(mF)
Random effects:
 Groups               Name  Std.Dev.
 student:part         (Int)  0.88375
 student:test:section (Int)  0.34807
 part:sex:test        (Int)  0.28981
 student:section      (Int)  0.70360
 student:test         (Int)  0.04714
 part:sex             (Int)  0.80917
 part:test            (Int)  0.85788
 student              (Int) 10.69397
 part                 (Int)  8.95206
 Residual                    0.60474

Linear mixed effect models

• What are linear mixed effects models?
• Fixed vs random effects?
  – Fixed effect: no distribution on coefficients
  – Random effect: some distribution on coefficients
  – Consequences: bias/variance, shrinkage, overcomplete models
• Significance tests in linear mixed effect models
  – Use lmerTest (Satterthwaite)
• Reminder (from last term) about a common use case
• What kind of random effect structure?

What kind of random effect structure?

• The dummy-coded random effect error structure (first sketch below) only works for categorical factors. It makes a few assumptions that may be violated. It is most similar to what the 'repeated measures' aov() command does. It will be anticonservative in a few particular cases with odd covariances for the subject:X interactions.
• The full "varying intercept and varying slope" random effect structure (second sketch below) is what folks who think deeply about these things believe to be the most generally effective specification. It uses quite a few more parameters to capture the (more flexible) random effect covariance structure. It estimates more parameters and is sometimes harder to fit.
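Here is a sketch of the two specifications for test.data; the exact maximal form shown on the slide isn't in the transcript, so the second model is an assumed rendering of the "varying intercept and varying slope" idea:

  # Dummy-coded random effect structure (intercept-only grouping terms):
  m.dummy <- lmer(data=test.data, score ~ sex*test*section +
    (1|student) + (1|student:test) + (1|student:section) + (1|student:test:section) +
    (1|part) + (1|part:test) + (1|part:sex) + (1|part:test:sex))

  # Varying intercept and slope, with ranef covariance (assumed form):
  m.slope <- lmer(data=test.data, score ~ sex*test*section +
    (test*section | student) + (test*sex | part))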


Why use linear mixed effects?

• Crossed random effects.
• Repeated measures designs that are unbalanced (e.g., due to missing data).
• Account for lower-level variation when estimating higher-level coefficients.
• Model variation of lower-level coefficients between groups.
• Estimate coefficients for specific groups that might have little data by partially pooling across all groups.
• Make predictions at different grouping levels.
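For the last point, a minimal sketch with a fitted lmer model m:

  predict(m, re.form = NA)  # population-level predictions (fixed effects only)
  predict(m)                # predictions including the estimated random effects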

Linear mixed effect suggestions

• If analyzing a factorial experiment, use the full factorial, maximal random effect design.
  – If analyzing real-world data, you'll need to be more parsimonious.
• Use the variable-slope specification with ranef covariance.
  – If it has trouble converging... switch specifications? Or use brms::?
• Use lmerTest::lmer() and the resulting t and F tests.