
Department of Statistics

Introduction to Modeling Change Over Time

with Generalized Mixed Models

using SAS PROC GLIMMIX

A Short Course – 14 May 2007

Instructor: Walt Stroup, Ph.D.

Professor & Chair, UNL Department of Statistics

14 May 2007, SSP Core Facility

Outline of Short Course (G/C = Growth/Change Model)

1. Introduction
   a. motivating examples
   b. Social Science HLM-speak vs. BioStat GLMM-speak
2. GLMM / HLM
   a. essential background
   b. recurring modeling issues
3. SAS / GLIMMIX syntax
4. G/C Models - 1st part of the picture: Factorial trt designs
   a. with various error structures & distributions
   b. with repeated measures & correlated errors
5. G/C Models - 2nd part of the picture: Random Effects issues
   a. random coefficients
   b. prediction vs. estimation
6. G/C Models - 3rd part of the picture - GLM issues: Binary, count, rate, zero-inflated models
7. Power & Planning
8. Nonlinear mixed models



Recurring Themes

“Mixed Model” Issues
− fixed or random?
− error terms: which one & are they correlated?
− std error & d.f.
− prediction or estimate? (“inference space”)

“GLM” Issues
− what distribution? incl. “is it really a distribution & does it matter?”
− what link: “data” vs “model” scale?
− overdispersion
− computational issues



Recurring Themes

George Bernard Shaw:

“America and England are two peoples separated by a common language.”

Generalized Mixed Models have
− AgStat-speak
− BioStat-speak
− Social/Behavioral Science Stat (HLM) speak

One goal: serve as translator

[Figure: picture of G. B. Shaw]



I. Introduction

General considerations for modeling
Several examples illustrating generalized and mixed models
Typology of models
Background theory
Decision chart to match model with software available in SAS



General Model Considerations

A model is a description of the components of an observation
observation = systematic + random
Nelder: random = ephemeral + noise, or random = random model + random error
Alternative: random = design components + remaining variation

“All models are wrong but some are useful” – G.E.P. Box



General Mixed Model Setting

Y is vector of responses (observable)
u is vector of random (design induced) effects [not (directly) observable]
relevant distributions
o $Y \mid u \sim f_C(\mu, R)$
o $u \sim f_R(0, G)$
Model is of conditional mean of Y|u:
$\mu = E(Y \mid u) = h(X, \beta, Z, u)$

Inexact (but useful) correspondences:
• distribution of Y|u: HLM Level 1; Biostat "subject-specific"
• distribution of u: Level 2



Typology of Models

Type    Mean Model            Distribution
NLMM    $h(X, \beta, Z, u)$   y|u general, u normal **
GLMM    $h(X\beta + Zu)$      y|u general, u normal *
LMM     $X\beta + Zu$         u, y|u normal
NLM     $h(X, \beta)$         y normal
GLM     $h(X\beta)$           y general
LM      $X\beta$              y normal

* for PROC GLIMMIX
** for this course
(G/N)LMM can be more general



Example 1: Random Effects Model

Data: Output 4.1, p. 94, SAS for Linear Models, 4th ed.
20 packages of ground beef
3 samples per package
2 counts per sample
response variable: microbial count
response = mean + package + sample(package) + error
i.e. observation = systematic + random model + error



Model for Example 1

$y_{ijk} = \mu + p_i + s(p)_{ij} + e_{ijk}$;  $i = 1, 2, \ldots, 20$;  $j = 1, 2, 3$;  $k = 1, 2$
$p_i$ i.i.d. $N(0, \sigma_P^2)$;  $s(p)_{ij}$ i.i.d. $N(0, \sigma_S^2)$;  $e_{ijk}$ i.i.d. $N(0, \sigma^2)$

$y_{ijk}$ is observation [ log(count) ]
$\mu$ is overall mean (systematic / fixed)
$p_i$, $s(p)_{ij}$ are random model effects
$e_{ijk}$ is random error

Convention: fixed Greek; random Latin



Hierarchical Levels

Unit         Size      Level
students     small     1
classroom    medium    2
school       large     3



Hierarchical Level to Statistical Model

$y_{ijk}$ = $k$th student, $j$th classroom, $i$th school

GLIMMIX-speak:
$y_{ijk}$ = mean + school + classroom(school) + student
$y_{ijk} = \mu + s_i + c(s)_{ij} + e_{ijk}$

HLM-speak:
Level 1 (student): $y_{ijk} = \beta_{0ij} + e_{ijk}$, with $\beta_{0ij} = \mu + s_i + c(s)_{ij}$
Level 2 (classroom): $y_{ijk} = \beta_{0i} + c(s)_{ij} + e_{ijk}$
Level 3 (school): $\beta_{0i} = \mu + s_i$



Modeling Issues

1. Estimate the $\sigma_i^2$’s

2. Estimate, standard error, and interval estimate of $\mu$

3. Estimates of package, sample effects

4. a.k.a. Estimates of school and classroom effects



Singer: HLM to MIXED

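Unconditional means model

HLM-speak, Raudenbush & Bryk (2002):
$y_{ij} = \beta_{0j} + r_{ij}$;  $r_{ij} \sim N(0, \sigma^2)$
$\beta_{0j} = \gamma_{00} + u_{0j}$;  $u_{0j} \sim N(0, \tau_{00})$

GLIMMIX-speak (one-way random effects model):
$y_{ij} = \mu + a_i + e_{ij}$;  $a_i \sim N(0, \sigma_A^2)$;  $e_{ij} \sim N(0, \sigma^2)$

Include Level 2 Covariate

"HLM-speak":
$\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{MEANSES}_j + u_{0j}$
$y_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{MEANSES}_j + u_{0j} + r_{ij}$

"GLIMMIX-speak":
$y_{ij} = \beta_0 + \beta_1 X_j + s_j + e_{ij}$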



Example 2: Blocking & Multi-Location

Data: SAS for Linear Models: Output 3.7, discussed as mixed model in section 4.3; Output 11.30; SAS for Mixed Models, 2nd ed. Section 6.6
Output 11.30 discussed here
3 treatments
8 locations
locations represent a population
3-12 blocks depending on location
response = trt + loc + blk(loc) + trt×loc + error
i.e. observation = systematic + random model + error



Example 2 framed by Extending School / Classroom Example

[Figure: the school / classroom / students hierarchy drawn twice, each copy labeled with a Treatment]



Model with Treatment

[Figure: Treatment applied to the school / classroom / students hierarchy]

$y_{ijkl}$ = trt + school(trt) + classroom(school) + student
$y_{ijkl} = \mu + \tau_i + s(\tau)_{ij} + c(\tau, s)_{ijk} + e_{ijkl}$

Level 1: $y_{ijkl} = \beta_{0ijk} + e_{ijkl}$
Level 2: $y_{ijkl} = \beta_{0ij} + c(\tau, s)_{ijk} + e_{ijkl}$
Level 3: between school model + trt as above



Modeling Issues

1. Appropriate error term to test treatment

2. Standard error of treatment mean

− (inference space)

3. Intra-block vs. inter-block analysis



ANOVA (ignoring block)

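Locations random:

Source       d.f.   Expected Mean Square
Treatment    2      $\sigma^2 + k_1\sigma_{LT}^2 + Q_{TRT}$
Location     7      $\sigma^2 + k_1\sigma_{LT}^2 + k_2\sigma_L^2$
Loc × Trt    14     $\sigma^2 + k_1\sigma_{LT}^2$
error        dfe    $\sigma^2$

If Location fixed:

Source       d.f.   Expected Mean Square
Treatment    2      $\sigma^2 + Q_{TRT}$
Location     7      $\sigma^2 + Q_{LOC}$
Loc × Trt    14     $\sigma^2 + Q_{LT}$
error        dfe    $\sigma^2$

Test of TRT affected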



Inference Space

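Assuming Locations are Fixed:
$\mathrm{Var}(\text{trt mean}) = \dfrac{\sigma^2}{\#\,\text{obs/trt}}$
$\text{Std. error(trt mean)} = \sqrt{\dfrac{MS(\text{error})}{\#\,\text{obs/trt}}} = 0.91$

HOWEVER... if Locations are Random:
$\mathrm{Var}(\text{trt mean}) = \dfrac{\sigma^2 + k(\sigma_L^2 + \sigma_{LT}^2)}{\#\,\text{obs/trt}}$
$\text{Std. error(trt mean)} = \sqrt{\dfrac{\hat\sigma^2 + k(\hat\sigma_L^2 + \hat\sigma_{LT}^2)}{\#\,\text{obs/trt}}} = 3.62$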



Where does Uncertainty Arise?

[Figure: Loc 1, Loc 2, ..., Loc 7, Loc 8]

Only from variation among obs within locations? → Locations fixed
Or does variation among locations also contribute? → Locations random



Intra- vs. Inter-block analysis

Intra- (fixed) block analysis based only on within block treatment differences

Inter-block analysis also accounts for variance among blocks (random combines inter- and intra-)

Lead to equivalent tests when all treatments appear equally in each block

Not equivalent otherwise

In most cases, combined inter-/intra-block analysis is more efficient



Example 3: Repeated Measures/Longitudinal

Data: SAS for Linear Models, Output 8.1; SAS for Mixed Models, Chapter 5
3 treatments (2 test drugs + placebo)
$n_i$ patients per treatment
8 times of measurement (1, 2, 3, ..., 8 hours post trt)
baseline measurement at time 0
response = trt + hour + trt×hour + pat(trt) + error
i.e. observation = systematic + random model + error

Variations on this theme are “latent growth models”



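Growth Models: Singer HLM-speak to GLIMMIX-speak

Unconditional Linear Growth Model

HLM        GLIMMIX
Level 1    Within subjects
Level 2    Between subjects

Level 1 (within individual): $y_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{time}_{ij} + r_{ij}$;  $r_{ij} \sim N(0, \sigma^2)$
Level 2: $\beta_{0j} = \gamma_{00} + u_{0j}$;  $\beta_{1j} = \gamma_{10} + u_{1j}$;  $\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim MVN\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{bmatrix}\right)$

Combined:
$y_{ij} = \gamma_{00} + u_{0j} + (\gamma_{10} + u_{1j})\,\mathrm{time}_{ij} + r_{ij}$
$y_{ij} = [\gamma_{00} + \gamma_{10}\,\mathrm{time}_{ij}] + [u_{0j} + u_{1j}\,\mathrm{time}_{ij} + r_{ij}]$
first bracket: between subject, population-averaged (PA); second bracket: within subjects, subject-specific (SS)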



Singer (1998)

Excellent paper translating HLM-speak to PROC MIXED
Uses Raudenbush & Bryk examples
Fair warning to readers, however: it's dated
− new features & output revisions in SAS
− some of the output encouraged confusion or poor practice
− specifics:
    revised output of Fit Statistics
    misleading output for variance estimates deleted
    Kenward-Roger procedure for d.f. & std errors

I’ll update & make switch to PROC GLIMMIX



Modeling Issues

1. Errors may be correlated
a. May affect conclusions
b. How to select covariance model

2. Denominator degrees of freedom

3. Bias in standard errors and test statistics



Impact of Correlated Errors

Covariance Model                                    den df       F-value        Pr>F
errors independent                                  483          7.11           <0.0001
errors correlated, no structure (bias corrected)    69 (98.1)    4.06 (3.66)    <0.0001
AR(1)                                               483          3.93           <0.0001
AR(1) bias corrected                                424          3.89           <0.0001



Example 4

Data: SAS for Mixed Models, Section 14.5
2 treatments (Test Drug, Control)
8 clinics
clinics represent a population
$n_{ij}$ subjects at $j$th location on $i$th treatment
response: favorable or unfavorable ($f_{ij}$ = # fav)
response = trt + clinic + trt×clinic + error
i.e. observation = systematic + random model + error



Modeling Issues

1. Response (fij / nij) is binomial, not normal

2. Response may not be linear in model parameters

3. Errors may not be additive

4. Variance of binomial & normal are different

a. heterogeneous

b. depends on location parameter



Generalized Linear Mixed Model

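let $\pi_{ij}$ = Pr{favorable response | trt $i$, clinic $j$}
observations: $f_{ij} \mid c_j, (tc)_{ij} \sim \text{Binomial}(n_{ij}, \pi_{ij})$;  proportion = $f_{ij}/n_{ij}$
$c_j$ i.i.d. $N(0, \sigma_C^2)$;  $(tc)_{ij}$ i.i.d. $N(0, \sigma_{TC}^2)$

Model: $\eta_{ij} = \log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + \tau_i + c_j + (tc)_{ij}$

i.e. $\pi_{ij}$ modeled by $\dfrac{\exp[\mu + \tau_i + c_j + (tc)_{ij}]}{1 + \exp[\mu + \tau_i + c_j + (tc)_{ij}]}$

e.g. Logistic mixed model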



Example 5

SAS for Linear Models, Output 10.39
2 treatments
$n_i$ persons per treatment
4 times of measurement
response = number of seizures (count)
baseline and age observations
response = trt + hour + trt×hour + baseline & age + pat(trt) + error
i.e. observation = systematic + random model + error



Modeling Issues

Count typically not ~ normal
Poisson (or negative binomial) more likely

Generalized Linear Model Issues
− Linear model not good direct model of mean
− Variance depends on mean

Repeated Measures Issues
− Observations within subjects correlated over time
− Between subject variance



Example 6

SAS for Mixed Models, Section 1.5.6
5 treatments observed in each of 4 randomized blocks
several measurements at days between 130 and 180 growing degree days
response = $\mu$(trt, day) + block + blk×trt + error
i.e. observation = systematic + random model + error



[Figure: Emergence over TIME by TRT. Legend: Black = NoTill; Red = SumBlade (summer); Cyan = SB&SD; Green = SpDisk (spring); Blue = SpPlow]



Modeling Issues

“Usual” mixed model and repeated measures issues, plus

Linear model is poor model of trt×day means



Nonlinear Mixed Model

Mixed Model: $y_{ijk} = \mu_{ij} + b_k + w_{ik} + e_{ijk}$
$\mu_{ij}$ is trt × day mean;  $b_k$ is block effect;  $w_{ik}$ is between subject error

Gompertz Model: $\mu_{ij} = \alpha_i \exp\{-\exp[-\beta_i(d_j - \delta_i)]\}$
$\alpha_i$ is asymptote of $i$th treatment
$\beta_i$ is "slope" of $i$th treatment
$\delta_i$ is inflection point of $i$th treatment
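To make the roles of the asymptote, "slope", and inflection point concrete, here is a minimal Python sketch of a Gompertz mean function. It assumes the common parameterization alpha * exp(-exp(-beta * (day - delta))); the names `alpha`, `beta`, `delta` and the numeric values are illustrative assumptions, not estimates from the emergence data.

```python
import math

def gompertz_mean(day, alpha, beta, delta):
    """Gompertz mean: alpha * exp(-exp(-beta * (day - delta)))."""
    return alpha * math.exp(-math.exp(-beta * (day - delta)))

# Hypothetical parameter values, chosen only for illustration
alpha, beta, delta = 100.0, 0.2, 150.0

for day in (130, 150, 180, 400):
    print(day, round(gompertz_mean(day, alpha, beta, delta), 3))
```

Under this parameterization the mean equals alpha/e at day = delta (the inflection point) and approaches alpha as day grows, which is why a straight-line model of the treatment-by-day means fits poorly.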



Typology of Models

Type    Mean Model            Distribution
NLMM    $h(X, \beta, Z, u)$   y|u general, u normal **
GLMM    $h(X\beta + Zu)$      y|u general, u normal *
LMM     $X\beta + Zu$         u, y|u normal
NLM     $h(X, \beta)$         y normal
GLM     $h(X\beta)$           y general
LM      $X\beta$              y normal

* for PROC GLIMMIX
** for this course
(G/N)LMM can be more general



Generalized Mixed Model SAS Software Decision Table

Response Normal Errors Indep Corr

Random Effects no yes Mean Model

Linear? yes no yes no yes no

SAS Proc

GLM MIXED

GLIMMIX

NLIN MIXED GLIMMIX

NLMIXED %NLINMIX

MIXED GLIMMIX

NLMIXED %NLINMIX

Response Non-Normal

Errors Indep Correl Random Effects

no yes

Mean Model Linear?

yes no yes no yes no

SAS Proc

GENMOD GLIMMIX

GLIMMIX NLMIXED NLMIXED

GLIMMIX (GENMOD)



Essential GLMM Background



First

How do I run a SAS Program?

???????

It’s easier than the urban legends would have you believe



Basic Parts of SAS Program

DATA Step

    Data your_choice_of_name;
      Input list of variables;      /* comment: $ after alphameric var */
      Datalines;
    data: one line / obs, one column per variable
    ;

PROC Step

    Proc GLIMMIX Data=your_choice_of_name;
      CLASS block group & trt var;
      MODEL response=block trt covar / options;
      ...
    Run;

Modify existing data set (Data __; Set __;)

    Data new_data_set_name;
      Set [old – e.g.] your_choice_of_name;
      program & data manipulation statements, e.g. LogY=Log(Y);


Example of SAS Program

DATA Step

    data demo1;
      input classroom trt $ time count;
      sc=sqrt(count);               /* square root of the count */
      datalines;
    1 std 1 12
    1 std 2 16
    1 std 4 17
    1 std 8 24
    2 exper 1 17
    2 exper 2 24
    2 exper 4 30
    2 exper 8 32
    11 std 1 16
    11 std 2 15
    11 std 4 22
    11 std 8 23
    8 exper 1 15
    8 exper 2 20
    8 exper 4 24
    8 exper 8 27
    ;

PROC Step

    proc glimmix data=demo1;
      class classroom trt time;
      model sc=trt time trt*time / dist=normal ddfm=kr;
      random classroom(trt);
      lsmeans trt*time;
      ods output lsmeans=lsm;       /* save LS means in data set lsm */
    run;

Data; Set; + new PROC

    data plot_growth;
      set lsm;
      log_time=log2(time);
    run;
    symbol i=join value=circle;
    proc gplot data=plot_growth;
      plot estimate*log_time=trt;
    run;



II. Generalized Mixed Model Theory

Clarify Fixed vs Random effects
Linear Models
− LM to LMM + GLM to GLMM
Estimation and Inference for
− LMM
− GLM
− GLMM
For GLMM:
− what follows naturally from GLM and LMM
− Special Issues



Fixed vs. Random Effects?

Fixed Effect?
− levels observed = population of interest (except regression)
− levels deliberately chosen
− inference: systematic relationship between y and

Random Effect?
− observed levels represent target population
− random sample? -- ideal (but seldom perfectly realized)
− makes sense to conceptualize probability distribution

Bottom Line: do observed levels of effect plausibly represent a probability distribution?
− yes → random effect
− no → fixed effect



General Structure of Model

Nelder: observation = systematic + random
General approach:
− likelihood consists of two parts: observation (y | u); random effects u
− model is mathematical description of $\mu = E(y \mid u)$
Distribution:
− observation $y \mid u \sim f(\mu, R)$
− random effects $u \sim MVN(0, G)$
Model: $\mu = h(X, \beta, Z, u)$;  $h(\cdot)$ called “inverse link”



Linear Model (LM)

No random effects
simple ANOVA (one error term)
multiple regression

Assumption: $y \sim MVN(\mu, R)$
LM: Model $\mu$ by $X\beta$, usually represented as
$y = X\beta + e$;  $e \sim N(0, R)$
alternative representation (helpful for transition to GLMM):
$y \sim MVN(X\beta, R)$



Generalizations of LM

LM (Linear Model): obs ~ normal, fixed effects only
GLM (Generalized Linear Model): obs ~ non-normal, fixed effects only
LMM (Linear Mixed Model): obs ~ normal, random effects
GLMM (Generalized Linear Mixed Model): obs ~ non-normal, random effects



GLM: Generalized Linear Model

Binomial: Logistic regression; Probit models
Poisson: Log-linear models

Assumption: $y \sim \mathrm{dist}(\mu, R)$
$R$ is a function of $\mu$: $V(\mu)$, called "variance function" -- more later
GLM: model $\eta = g(\mu)$ by $X\beta$ -- $g(\cdot)$ called "link function"
alternatively, model $\mu$ by $h(X\beta)$ -- $h(\cdot)$ called "inverse link"
Note: here writing $y$ or $g(\mu)$ as $X\beta + e$ makes no sense
Instead: $y \sim \mathrm{dist}(h(X\beta), R)$



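LMM: Linear Mixed Model

Multi-error models; split-plot, multi-location
Repeated measures a.k.a. Longitudinal data

Assume: $y \mid u \sim MVN(\mu, R)$;  $u \sim MVN(0, G)$
LMM: Model $\mu$ by $X\beta + Zu$
Familiar notation:
$y = X\beta + Zu + e$;  $\begin{bmatrix} u \\ e \end{bmatrix} \sim MVN\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}\right)$
alternatively:
$y \mid u \sim MVN(X\beta + Zu, R)$;  $u \sim MVN(0, G)$
or (marginal model):
$y \sim MVN(X\beta, V)$;  $V = ZGZ' + R$

More vocabulary:
“G-side” concerns V(u)
“R-side” concerns V(e)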



GLMM: Generalized Linear Mixed Model

Assume: $y \mid u \sim \mathrm{dist}(\mu, R)$ as with GLM
$R$ depends on $V(\mu)$
$u \sim MVN(0, G)$
GLMM models $\mu = E(y \mid u)$ by $X\beta + Zu$
link function: $\eta = g(\mu) = X\beta + Zu$
inverse link: $\mu = h(X\beta + Zu)$
GLMM: $y \mid u \sim \mathrm{dist}(h(X\beta + Zu), R)$
Marginal Model: $f(y) = \int f(y \mid u)\,f(u)\,du$ (more later)

Modelling will involve
• Distribution
• Link (or inv link)
• G-side
• R-side



Some Grounding Before Moving On

“Hessian Fly” example, Gotway & Stroup (1997, JABES)

“Hessian Fly” not so important, but design & data structure are

16 treatments, 4 replications: 4x4 Lattice
− 16 incomplete blocks organized into 4 complete blocks
Response: $Y_{ij}/n_{ij}$ (damaged / obs per trt x block unit)

 1  2  5  6     1  5  2  6
 3  4  7  8     9 13 10 14
 9 10 13 14     3  7  4  8
11 12 15 16    11 15 12 16

 1  6  2  5     1 14 13  2
11 16 12 15     7 12 11  8
 9 14 13 10     5 10  9  6
 3  8  7  4     3 16 15  4



Linear Model (LM)

Randomized Complete Block
$y_{ij} = \mu + \rho_i + \tau_j + e_{ij}$;  $e_{ij}$ i.i.d. $N(0, \sigma^2)$
$\rho_i$: block effect;  $\tau_j$: treatment effect

    proc glimmix;
      class block entry;
      model pct=block entry;

Incomplete Block Model - Intra-block analysis
incomplete block replaces complete block in denoting $\rho_i$

    proc glimmix;
      class inc_block entry;
      model pct=inc_block entry;



Linear Mixed Model (LMM)

Randomized Complete Block - Random block effects
$y_{ij} = \mu + r_i + \tau_j + e_{ij}$;  $r_i$ i.i.d. $N(0, \sigma_R^2)$;  $e_{ij}$ i.i.d. $N(0, \sigma^2)$
$r_i$: block effect;  $\tau_j$: treatment effect

    proc glimmix;
      class block entry;
      model pct=entry;
      random block;                 /* G-side modeling of block effect */

Incomplete block (recovery of interblock information): replace “block” by “inc_block”



LMM: G-side / R-side

Two alternative “G-side” specifications:

    proc glimmix;
      class block entry;
      model pct=entry;
      random block;

    proc glimmix;
      class block entry;
      model pct=entry;
      random intercept / subject=block;

R-side specification:

    proc glimmix;
      class block entry;
      model pct=entry;
      random _residual_ / type=cs subject=block;

Here, it doesn’t matter (all equivalent), but for more complex models the distinctions will matter



Generalized Linear Model (GLM)

$y_{ij} \sim \text{Binomial}(n_{ij}, \pi_{ij})$
GLM ("Logit ANOVA" model): $\eta_{ij} = \log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + \rho_i + \tau_j$

    proc glimmix;
      class block entry;
      model y/n = block entry;

or replace “block” by “inc_block” for intra-block logit ANOVA

More on GLIMMIX syntax later
Here, note Y/N causes default to Binomial distribution & Logit link (same as GENMOD)



Generalized Linear Mixed Model (GLMM)

$y_{ij} \mid$ block effects $\sim \text{Binomial}(n_{ij}, \pi_{ij})$;  block effects $r_i$ i.i.d. $N(0, \sigma_R^2)$
GLMM ("Logit ANOVA" mixed model): $\eta_{ij} = \log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + r_i + \tau_j$

    proc glimmix;
      class block entry;
      model y/n = entry;
      random intercept / subject=block;

    proc glimmix;
      class block entry;
      model y/n = entry;
      random block;

Marginal model (not equivalent):

    proc glimmix;
      class block entry;
      model y/n = entry;
      random _residual_ / type=cs subject=block;



II. Inference in LM, GLM, LMM, and GLMM

Inference for fixed effects based on estimable functions
In LM theory, $K'\beta$ is estimable if it can be expressed as $A'E(y)$, i.e. $K' = A'X$
OLS: $\hat\beta = (X'X)^{-}X'y$
theorem: $K'\beta$ estimable iff $K' = K'(X'X)^{-}(X'X)$
Main advantage: $K'\hat\beta$ is invariant to choice of $(X'WX)^{-}$
i.e. when $X$ is not full rank, $\hat\beta$ has no intrinsic interpretation; $K'\hat\beta$ does
(e.g. treatment difference, marginal (least squares) mean)



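II. Examples of Estimable Functions

e.g. one way model: $y_{ij} = \mu + \tau_i + e_{ij}$;  $i = 1, 2, 3, 4$;  $j = 1, \ldots, n_i$
Estimable functions include
Trt marginal ("Least Squares") mean (LSMean): $\mu + \tau_i$, e.g. $k' = [1\ \ 1\ \ 0\ \ 0\ \ 0]$ for $i = 1$
Trt differences: $\tau_1 - \tau_2$, e.g. $k' = [0\ \ 1\ \ {-1}\ \ 0\ \ 0]$
SS(trt): $K$ such that all $\tau_i$ equal, e.g.
$K' = \begin{bmatrix} 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \end{bmatrix}$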



II. Common Inference Results for GLM

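$K'\hat\beta \sim \text{approx } MVN(K'\beta,\ K'(X'WX)^{-}K)$ (exact for LM)

Wald statistic:
purpose: test $H_0: K'\beta = 0$
$\text{Wald} = (K'\hat\beta)'[K'(X'WX)^{-}K]^{-1}(K'\hat\beta) \sim \text{approx } \chi^2_{\mathrm{rank}(K)}$

Note in OLS: $\text{Wald} = \dfrac{SS(H_0)}{\sigma^2}$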



II. GLM: Inference with Unknown Scale Parameter

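Recall, in OLS $\text{Wald} = \dfrac{SS(H_0)}{\sigma^2}$
But what if $\sigma^2$ unknown?
Think ANOVA: Use $\dfrac{SS(H_0)}{\hat\sigma^2} = \dfrac{SS(H_0)}{MSE}$
Thus, $\dfrac{\text{Wald}}{\mathrm{rank}(K)} = \dfrac{SS(H_0)/dfh}{MSE} \sim F_{(dfh,\ dfe)}$

Generalization: in GLM, scale parameter $\hat\phi = \dfrac{\text{Pearson } \chi^2}{dfe}$ or $\dfrac{\text{Deviance}}{dfe}$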



II. Extension of GLM Scale Parameter: Quasi-Likelihood

Overdispersion
Counts → Poisson → $E(y) = \mathrm{Var}(y)$
but in practice $E(y) < \mathrm{Var}(y)$
Quasi-likelihood: you specify $E(y)$ and $\mathrm{Var}(y)$

“Working Correlation”
Repeated Measures
Assumed distribution → $\mathrm{Var}(y) = \mathrm{diag}[V(\mu)]$
But in reality, errors are correlated, so model variance as
$\mathrm{Var}(y) = R_\mu^{1/2} A R_\mu^{1/2}$ where $R_\mu^{1/2} = \mathrm{diag}[\sqrt{V(\mu)}]$
$A$ is working correlation - structure analogous to true R-side in LMM



II. GLM: Deviance and Likelihood Ratio Test

Full model: $X\beta$, i.e. $\mu = h(X\beta)$
Decompose $X\beta$ as $X_1\beta_1 + X_2\beta_2$
Suppose we want to test $H_0: \beta_2 = 0$
1. Fit full model: $Dev(X\beta) = -2\log[\ell(X\hat\beta)/\ell(y)]$
2. Fit reduced model: $Dev(X_1\beta_1) = -2\log[\ell(X_1\hat\beta_1)/\ell(y)]$
3. LR statistic: $Dev(X_1\beta_1) - Dev(X\beta)$



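II. LMM: The “Mixed Model Equations”

$\ell(y, u) = -\tfrac{1}{2}\left[(y - X\beta - Zu)'R^{-1}(y - X\beta - Zu) + u'G^{-1}u\right]$ + terms not involving $\beta$ or $u$

$\dfrac{\partial \ell(y, u)}{\partial \beta} = X'R^{-1}(y - X\beta - Zu)$ and $\dfrac{\partial \ell(y, u)}{\partial u} = Z'R^{-1}(y - X\beta - Zu) - G^{-1}u$

solving yields the Mixed Model Solution:
$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat\beta \\ \hat u \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}$

note (Marginal Model Solution): $\hat u = GZ'V^{-1}(y - X\hat\beta)$ and $\hat\beta = (X'V^{-1}X)^{-}X'V^{-1}y$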



II. LMM Inference – G and R known

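Inference based on Predictable functions
$K'\beta + M'u$ is "predictable" if $K'\beta$ is estimable
(reduces to estimable function if focus on fixed effects only)

1. $\mathrm{Var}[K'\hat\beta + M'(\hat u - u)] = [K'\ \ M']\; C \begin{bmatrix} K \\ M \end{bmatrix}$, where
$C = \begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix}^{-}$

2. Let $L' = [K'\ \ M']$ and $\theta = \begin{bmatrix} \beta \\ u \end{bmatrix}$
Wald statistic for tests on $L'\theta$ is
$(L'\hat\theta)'[L'CL]^{-1}(L'\hat\theta) \sim \chi^2_{\mathrm{rank}(L)}$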



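II. LMM Inference – G and R unknown

1. Replace $G$ and $R$ by $\hat G$ and $\hat R$: estimate variance and covariance components
2. Denote $C$ as $\hat C$ with estimated var/cov components
3. "Naive" $\mathrm{Var}[L'(\hat\theta - \theta)] = L'\hat C L$, where $\theta = [\beta'\ \ u']'$
   but $E(L'\hat C L) \ne L'CL$ → Kenward-Roger adjustment
4. Approximate $F$:
   $\dfrac{\text{Wald}}{\mathrm{rank}(L)} = \dfrac{(L'\hat\theta)'[L'\hat C L]^{-1}(L'\hat\theta)}{\mathrm{rank}(L)} \sim \text{approx } F_{(\mathrm{rank}(L),\ \nu)}$
   $F$ may be biased; $\nu$ often must be approximated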



II. LMM: Variance Component Estimation

Several methods
1. For variance-component-only models: use EMS from ANOVA
2. Maximum likelihood
   − problem: biased
3. Restricted maximum likelihood
4. Several computational approaches
   a. Newton-Raphson
   b. Fisher Scoring
   c. EM



What’s Wrong with ML?

An example to illustrate
SAS for Mixed Models, Data Set 1.5.1
Incomplete Block design from Cochran & Cox, Experimental Designs, p 456
15 treatments
15 blocks
4 treatments observed per block



C&C Example: ML and two alternatives

Intrablock (fixed block) analysis (equivalent to PROC GLM):

    proc glimmix data=cc456;
      class trt bloc;
      model y=trt bloc;

Inter/Intra-block (random block) analysis, default (PROC MIXED default gives same result):

    proc glimmix data=cc456;
      class trt bloc;
      model y=trt;
      random bloc;

Inter/Intra-block (random block) analysis, ML (same as PROC MIXED METHOD=ML;):

    proc glimmix data=cc456 method=mspl;
      class trt bloc;
      model y=trt;
      random bloc;



ML vs Alternative Results: Which is Right?

Intrablock (fixed block)

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       31       1.23      0.3012

$\hat\sigma^2 = 8.62$

Intra/interblock (random block), default

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       36.2     1.48      0.1676

$\hat\sigma_R^2 = 4.65$;  $\hat\sigma^2 = 8.56$

Intra/interblock (random block), ML

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       49.04    2.02      0.0352

$\hat\sigma_R^2 = 4.50$;  $\hat\sigma^2 = 6.04$



Simulation ML or REML

1000 simulated data sets using C & C, p 456 design
$\sigma_B^2/\sigma^2 = 0.5$
Recorded type I error rate for $F_{trt}$
− intrablock
− REML random block
− ML random block

Variable      N       Mean
fixd_rej05    1000    0.0590000
REML_rej05    1000    0.0610000
ML_rej05      1000    0.2140000



II. LMM with estimated G and R: Bias in std error and test statistics

Kenward & Roger (Biometrics, 1997)
Consider estimable function $K'\beta$
When $V$ unknown, estimates used to obtain $\hat V$
"naive" estimate $\widehat{\mathrm{Var}}(K'\hat\beta) = K'(X'\hat V^{-1}X)^{-}K$
Using Taylor series expansion, can show
$E[K'(X'\hat V^{-1}X)^{-}K] \approx K'(X'V^{-1}X)^{-}K + \dfrac{1}{2}\sum_{i,j} \mathrm{cov}(\hat\sigma_i, \hat\sigma_j)\, K'\,\dfrac{\partial^2 (X'V^{-1}X)^{-}}{\partial\sigma_i\,\partial\sigma_j}\,K$



II. LMM: Degrees of Freedom

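Simple Case
model: $y_{ijk} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ijk}$
$b_j \sim N(0, \sigma_B^2)$;  $(ab)_{ij} \sim N(0, \sigma_{AB}^2)$;  $e_{ijk} \sim N(0, \sigma^2)$

ANOVA
Source   EMS
A        $\sigma^2 + n\sigma_{AB}^2 + Q_A$
B        $\sigma^2 + n\sigma_{AB}^2 + na\sigma_B^2$
AB       $\sigma^2 + n\sigma_{AB}^2$
error    $\sigma^2$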



II. Degrees of Freedom (2)

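Trt diff: $\alpha_1 - \alpha_2$
$\mathrm{Var}(\hat\alpha_1 - \hat\alpha_2) = \dfrac{2(\sigma^2 + n\sigma_{AB}^2)}{nb}$, estimated by $\dfrac{2\,MS(AB)}{nb}$
denominator d.f. = $df(AB)$

Trt mean: $\mu + \alpha_i$
$\mathrm{Var}(\hat\mu + \hat\alpha_i) = \dfrac{1}{nb}(\sigma^2 + n\sigma_{AB}^2 + n\sigma_B^2)$, estimated by $\dfrac{1}{nb}\left[\dfrac{a-1}{a}MS(AB) + \dfrac{1}{a}MS(B)\right]$
(with $a$ the number of levels of A, so that the combination matches the EMS of B, $\sigma^2 + n\sigma_{AB}^2 + na\sigma_B^2$)
denominator d.f. approximated via Satterthwaite's procedure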



II. Satterthwaite Approximation

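for linear combination of MS: $MS^* = \sum_i c_i MS_i$
approximate d.f. for $MS^*$ is
$\nu = \dfrac{\left(\sum_i c_i MS_i\right)^2}{\sum_i (c_i MS_i)^2 / df_i}$

e.g. $\dfrac{a-1}{a}MS(AB) + \dfrac{1}{a}MS(B)$, with $a$ the number of levels of A:
$df = \dfrac{\left[\frac{a-1}{a}MS(AB) + \frac{1}{a}MS(B)\right]^2}{\left[\frac{a-1}{a}MS(AB)\right]^2/df(AB) + \left[\frac{1}{a}MS(B)\right]^2/df(B)}$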
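The Satterthwaite formula is easy to try by hand. The following Python sketch computes the approximate d.f. for a weighted sum of mean squares; the function name `satterthwaite_df` and all numeric values are hypothetical and chosen only for illustration, not taken from any data set in this course.

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Approximate d.f. for the linear combination sum(c_i * MS_i)."""
    terms = [c * ms for c, ms in zip(coefs, mean_squares)]
    numerator = sum(terms) ** 2
    denominator = sum(t ** 2 / df for t, df in zip(terms, dfs))
    return numerator / denominator

# Hypothetical inputs: two mean squares, their weights, and their d.f.
coefs = [0.75, 0.25]          # weights c_i
mean_squares = [12.0, 40.0]   # e.g. MS(AB) and MS(B)
dfs = [14, 7]                 # e.g. df(AB) and df(B)

nu = satterthwaite_df(coefs, mean_squares, dfs)
print(round(nu, 2))
```

With a single mean square the function returns that mean square's own d.f.; for a combination of positive terms the result falls between the smallest component d.f. and the sum of the component d.f.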



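II. Satterthwaite Approximation in LMM

For vector $K$ (e.g. treatment contrast):
Approximation: $\nu \approx \dfrac{2\{E[K'(X'V^{-1}X)^{-}K]\}^2}{\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]}$ or $\dfrac{2[K'(X'\hat V^{-1}X)^{-}K]^2}{\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]}$

Approximate $\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]$ by $g'Ag$, where
$g = \dfrac{\partial[K'(X'V^{-1}X)^{-}K]}{\partial\sigma}$, $\sigma$ = vector of (co)variance components
$A = 2\left\{\mathrm{trace}\left(P\,\dfrac{\partial V}{\partial\sigma_i}\,P\,\dfrac{\partial V}{\partial\sigma_j}\right)\right\}^{-1}$
$V = ZGZ' + R$;  $P = V^{-1} - V^{-1}XCX'V^{-1}$, with $C = (X'V^{-1}X)^{-}$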



II. GLMM Estimation

GLMM is model of $E(y \mid u)$
Link form: $g[E(y \mid u)] = X\beta + Zu$
Inverse link form: $E(y \mid u) = h(X\beta + Zu)$
More general expression of distribution of $y \mid u$:
$\mathrm{Var}(y \mid u) = R = R_\mu^{1/2} A R_\mu^{1/2}$;  $R_\mu^{1/2} = \mathrm{diag}[\sqrt{V(\mu)}]$;  $A$ is "working correlation matrix"

Estimation: as with LMM, may choose to focus on
1. $\beta$ only: GLS equations in LMM; Generalized Estimating Equations with GLMM
2. $\beta$ and $u$: several approaches



II. Working Correlation

Recall Gotway & Stroup (1997) Hessian Fly Example

 1  2  5  6     1  5  2  6
 3  4  7  8     9 13 10 14
 9 10 13 14     3  7  4  8
11 12 15 16    11 15 12 16

 1  6  2  5     1 14 13  2
11 16 12 15     7 12 11  8
 9 14 13 10     5 10  9  6
 3  8  7  4     3 16 15  4

Gotway and Stroup considered spatial variation among e.u.

    proc glimmix;
      class block entry;
      model y/n=entry;
      random intercept / subject=block;
      random _residual_ / type=sp(sph)(row col) subject=block;

MODEL sets up Binomial GLM, Logit link
RANDOM _RESIDUAL_ sets up a working correlation based on SPHERICAL semivariogram



II. Marginal (PA) vs Subject-Specific Inference

Marginal Mean: $E(y)$ (Population Averaged, PA)
Conditional Mean: $E(y \mid u)$ (SS, true GLMM)
Note: $E(y) = E[E(y \mid u)] = E[h(X\beta + Zu)]$
In general, cannot be further simplified

Example: log link, $u \sim$ normal
$E(y \mid u) = \exp(X\beta + Zu)$
$E(y) = E[\exp(X\beta + Zu)] = \exp(X\beta)\,M_u(Z)$
$M_u(Z)$ is moment generating function of $u$ evaluated at $Z$
e.g. for a single random effect $u \sim N(0, \sigma_u^2)$ with coefficient $z$:
$E(y) = \exp(x'\beta)\exp(\sigma_u^2 z^2/2)$, i.e. $\log E(y) = x'\beta + \sigma_u^2 z^2/2$
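A quick numerical check can make the gap between the conditional and marginal means tangible. The Python sketch below integrates exp(eta + u) against a normal density for u and compares the result with the moment-generating-function shortcut exp(eta + sigma_u^2 / 2); it assumes a single random intercept with coefficient 1, and the values of `eta_fixed` and `sigma_u` are hypothetical.

```python
import math

def marginal_mean_log_link(eta_fixed, sigma_u, n_grid=4001, width=10.0):
    """E[exp(eta_fixed + u)] for u ~ N(0, sigma_u^2), by trapezoidal integration."""
    lo = -width * sigma_u
    step = 2.0 * width * sigma_u / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * step
        density = math.exp(-0.5 * (u / sigma_u) ** 2) / (sigma_u * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(eta_fixed + u) * density * step
    return total

eta_fixed, sigma_u = 1.0, 0.8                           # hypothetical values
conditional_at_zero = math.exp(eta_fixed)               # E(y | u = 0), subject-specific
closed_form = math.exp(eta_fixed + sigma_u ** 2 / 2.0)  # exp(eta) * M_u(1), population-averaged
numeric = marginal_mean_log_link(eta_fixed, sigma_u)
print(conditional_at_zero, closed_form, numeric)
```

The marginal mean exceeds the conditional mean at u = 0 whenever sigma_u > 0, so with a log link the PA and SS formulations differ by an intercept shift of sigma_u^2 / 2 in this simple case.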



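II. More on PA (marginal) vs. SS

Probit-normal model:
$\Pr(y = 1 \mid u) = \Phi(X\beta + Zu)$;  $u \sim N(0, G)$
can show, for an observation with rows $x'$ of $X$ and $z'$ of $Z$:
$E(y) = \Phi\left(\dfrac{x'\beta}{\sqrt{1 + z'Gz}}\right)$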
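The attenuation of the probit coefficient can be checked numerically. This Python sketch assumes a single random intercept with variance sigma_u^2 (so z'Gz = sigma_u^2) and compares a direct numerical integral of Phi(eta + u) with the closed form Phi(eta / sqrt(1 + sigma_u^2)); the parameter values are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def marginal_probit(eta_fixed, sigma_u, n_grid=4001, width=10.0):
    """E[Phi(eta_fixed + u)] for u ~ N(0, sigma_u^2), by trapezoidal integration."""
    lo = -width * sigma_u
    step = 2.0 * width * sigma_u / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * step
        density = math.exp(-0.5 * (u / sigma_u) ** 2) / (sigma_u * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * std_normal_cdf(eta_fixed + u) * density * step
    return total

eta_fixed, sigma_u = 1.0, 1.5                  # hypothetical values
subject_specific = std_normal_cdf(eta_fixed)   # Pr(y = 1 | u = 0)
closed_form = std_normal_cdf(eta_fixed / math.sqrt(1.0 + sigma_u ** 2))
numeric = marginal_probit(eta_fixed, sigma_u)
print(subject_specific, closed_form, numeric)
```

For positive eta the population-averaged probability is pulled toward 0.5 relative to the subject-specific probability at u = 0, which is one way to see that PA and SS parameters answer different questions.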

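in LMM, the models
$y = X\beta + Zu + e$;  $u \sim N(0, I\sigma_u^2)$;  $e \sim N(0, I\sigma_e^2)$
and
$y = X\beta + e$;  $e \sim N(0, R)$;  $R = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}$
are equivalent. However, in GLMM, they are not. They yield different estimates, std. errors, etc.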



II. Estimation of GLMM

model $E(y \mid u)$
inverse link: $E(y \mid u) = h(X\beta + Zu)$
link: $g[E(y \mid u)] = \eta = X\beta + Zu$
to estimate $\beta$ and $u$, need to evaluate $f(y)$, $f(y \mid u)$
− approximate e.g. by Taylor series expansion
    Penalized Quasi-Likelihood (SAS %GLIMMIX)
    SAS PROC GLIMMIX (next slides)
− numerically integrate joint density
    Gauss-Hermite Quadrature (PROC NLMIXED)
− stochastically evaluate integral
    Markov Chain Monte Carlo (WinBUGS – not in this course)



II. Computational Method Comparison

GEE
− Computationally easy
− Meaning of marginal results in GLM?

Linearized GLMM (current PROC GLIMMIX)
− uses familiar LMM analogs (but many are ad hoc & need further research)
− allows considerable R-side flexibility
− adequate for many GLMM; breaks down for certain cases (binary data)

Integral Approximation (PROC NLMIXED)
− better approximation than Linearized GLMM
− BUT: ML only, simple G-side models only, no R-side

Laplace
− computationally less demanding than Integral approximation but often “accurate enough”; same limitations as Integral approximations

MCMC
− simple models only; limited & temperamental software
− but in extreme cases, only way to get accurate results



Modeling Considerations






III. Modeling Considerations

Overdispersion

Marginal (PA) vs Conditional (SS) models

“Data” vs “Model” Scale



III. Model Considerations

Variance Model & Overdispersion
Choice of Link Function
Choice of Distribution
Choice of Model Effects
Correlated Errors?

Any of the above could show up as “overdispersion”



III. GLMM: Model Considerations

Common dilemma
Design, e.g. like “Hessian fly” example
BINOMIAL data
Recover interblock information - BLOCK random

 1  2  5  6     1  5  2  6
 3  4  7  8     9 13 10 14
 9 10 13 14     3  7  4  8
11 12 15 16    11 15 12 16

 1  6  2  5     1 14 13  2
11 16 12 15     7 12 11  8
 9 14 13 10     5 10  9  6
 3  8  7  4     3 16 15  4

Model (Logit GLMM): $\pi_{ij} = \dfrac{\exp(\mu + r_i + \tau_j)}{1 + \exp(\mu + r_i + \tau_j)}$
or equivalently $\log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + r_i + \tau_j$

Analysis reveals that the data are overdispersed



III. Hessian Fly Example

    proc glimmix data=HessianFly;
      class block entry;
      model y/n = entry;
      random block;

Fit Statistics
-2 Res Log Pseudo-Likelihood    182.21
Generalized Chi-Square          107.96
Gener. Chi-Square / DF          2.25     ← evidence of overdispersion when >> 1

III. Overdispersion

Observed variance > variance under presumed model

Symptom: Deviance/DFE or chi-square/DFE >> 1

Uniquely a GLM / GLMM issue

−not a consideration with LM, LMM

−y|u ~ normal implies variance not a function of mean

When is there an issue

− If Var(y) = f[E(y)] and

−using scale adjustment requires unrealistic assumptions



III. Common fix for Overdispersion

Multiply variance by scale parameter. Here: $\phi\,\pi(1-\pi)$

    proc glimmix data=HessianFly;
      class block entry;
      model y/n=entry;
      random block;
      random _residual_;

Issue: not a true likelihood

Covariance Parameter Estimates (with scale parameter)
Cov Parm         Subject   Estimate   Standard Error
Intercept        block     0          .
Residual (VC)              2.2668     0.4627          ← estimates $\phi$

vs. w/o $\hat\phi$:
Cov Parm         Subject   Estimate   Standard Error
Intercept        block     0.01116    0.03116

Impact of Scale Parameter on Inference


no scale parameter:
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       6.90      <.0001

with scale parameter adjustment:
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

but is this the best way to address the problem?

failure to account for overdispersion tends to increase type I error rate



III. Mean – Variance Overdispersion Models

$\mathrm{Var}(y) = f(\mu, \phi)$

No scale parameter ($\phi \equiv 1$): binomial, poisson
Nonlinear scale parameter (e.g. $\mu(1-\mu)/(1+\phi)$): negative binomial, gen. poisson, beta
Linear scale parameter: gamma, inverse gaussian
No mean parameter: normal



III. Marginal or Conditional Formulation

For many models (notably LMM) there are equivalent forms
− conditional (mixed, SS) model
− marginal (PA) model
− lead to the same marginal log-likelihood

Distinction results from
− G-side model; random model effects
− R-side model; marginal model



III. Example: variance component (G-side) vs. Compound symmetry (R-side)

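$y_{ij} = \mu + r_i + \tau_j + e_{ij}$;  $r_i$ i.i.d. $N(0, \sigma_R^2)$;  $e_{ij}$ i.i.d. $N(0, \sigma^2)$

$\mathrm{Var}(Y_i) = \begin{bmatrix} \sigma_R^2 + \sigma^2 & \sigma_R^2 & \cdots \\ \sigma_R^2 & \sigma_R^2 + \sigma^2 & \cdots \\ \cdots & \cdots & \sigma_R^2 + \sigma^2 \end{bmatrix} = \sigma_R^2 J + \sigma^2 I$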



III. Compound Symmetry Equivalent

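Let $\sigma_C^2 = \sigma_R^2 + \sigma^2$ and $\rho = \dfrac{\sigma_R^2}{\sigma_R^2 + \sigma^2}$

Model: $y_{ij} = \mu + \tau_j + E_{ij}$
$\mathrm{Var}(E_{ij}) = \sigma_C^2$;  $\mathrm{Corr}(E_{ij}, E_{kl}) = \rho$ if $i = k$ (same block), 0 otherwise

$\mathrm{Var}(Y_i) = \sigma_C^2 \begin{bmatrix} 1 & \rho & \cdots \\ \rho & 1 & \cdots \\ \cdots & \cdots & 1 \end{bmatrix}$

Models equivalent if $\rho \ge 0$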
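The equivalence of the variance-component (G-side) and compound-symmetry (R-side) covariance structures in the normal case can be verified directly. The Python sketch below builds both within-block covariance matrices and compares them; the variance values and function names are hypothetical, used only for illustration.

```python
def vc_covariance(sigma2_r, sigma2_e, n):
    """G-side form: sigma2_r * J + sigma2_e * I for a block of n observations."""
    return [[sigma2_r + (sigma2_e if i == j else 0.0) for j in range(n)]
            for i in range(n)]

def cs_covariance(sigma2_c, rho, n):
    """R-side compound symmetry: sigma2_c on the diagonal, sigma2_c * rho elsewhere."""
    return [[sigma2_c * (1.0 if i == j else rho) for j in range(n)]
            for i in range(n)]

sigma2_r, sigma2_e, n = 0.5, 2.0, 4          # hypothetical variance components
sigma2_c = sigma2_r + sigma2_e               # total variance
rho = sigma2_r / (sigma2_r + sigma2_e)       # intra-block correlation

vc = vc_covariance(sigma2_r, sigma2_e, n)
cs = cs_covariance(sigma2_c, rho, n)
max_diff = max(abs(vc[i][j] - cs[i][j]) for i in range(n) for j in range(n))
print(max_diff)
```

Going the other way, a negative rho in the CS form has no counterpart in the variance-component form because it would require a negative block variance, which is why the equivalence holds only for rho >= 0.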



III. G-side / R-side

G-side (same model):

    proc glimmix;
      class block entry;
      model y/n=entry;
      random block;

    proc glimmix;
      class block entry;
      model y/n=entry;
      random intercept / subject=block;

R-side model:

    proc glimmix;
      class block entry;
      model y/n=entry;
      random _residual_ / type=CS subject=block;

    proc mixed;
      class block entry;
      model y=entry;
      repeated / type=CS subject=block;



III. Variance Component vs CS in GLMM

Variance component model is GLMM
CS model is GEE
They are not equivalent

Conditional model: logit $\eta_{ij} = \mu + r_i + \tau_j$
$y_{ij} \mid r_i \sim \text{Binomial}\left(n_{ij},\ \dfrac{\exp(\mu + r_i + \tau_j)}{1 + \exp(\mu + r_i + \tau_j)}\right)$
marginal distribution is $p(y_{ij}) = \int p(y_{ij} \mid r_i)\,p(r_i)\,dr_i$

Marginal model: logit $\eta_{ij} = \mu + \tau_j$ with working correlation matrix defined by CS form
$y_{ij}$ is NOT Binomial, merely borrow Binomial-like quasi-likelihood form
Does such a distribution actually exist?



III. Conditional vs. Marginal Results

Conditional (G-side random intercept, with residual scale)

Cov Parm         Subject   Estimate
Intercept        block     0
Residual (VC)              2.2668

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

Marginal (R-side CS)

Cov Parm   Subject   Estimate
CS         block     -0.03247
Residual             2.2992

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       2.99      0.0023

which is right?
• fit statistic?
• can you simulate data using mechanism implied by model?



III. Marginal or Conditional?

How to choose?
− Conditional: G-side; Marginal: R-side

−Fit statistic? (may help; may deceive)

General recommendation−G-side formulation preferred for non-normal data

−G-side effects operate inside the link function & hence always lead to valid conditional & marginal distributions

−R-side effects operate outside the link function

− for non-normal data, models implied by R-side effects may be vacuous



III. Impact of Model Effects

Back to Hessian Fly Data
Incomplete Block Design
Try more appropriate model

    proc glimmix;
      class inc_block entry;
      model y/n=entry;
      random intercept / subject=inc_block;

Fit Statistics (values not shown)

Cov Parm    Subject     Estimate
Intercept   inc_block   0.4971

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       33       6.33      <.0001



III. Inference

After model fit & estimation, inference begins
Also want at least some of the following for comparisons among groups (trt, entry...)

− test hypotheses

−obtain confidence intervals

−obtain predictions

− further model checking



III. Scale issue for GLM, GLMM

For GLM, GLMM there are two “natural scales”
−linear (or model) scale (e.g. logit)
−data scale

May be other scales, depending on context
−odds
−odds ratio



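III. Choosing the Scale
Example: Hessian Fly – binomial dist, logit link
Data: measured as 0/1; per e.u. as Y/N
Main focus: entry effect on P{indiv resp = 1}

Link: $\eta_{ij} = \log\left(\dfrac{\pi_{ij}}{1 - \pi_{ij}}\right) = \eta + r_i + \tau_j$

Inverse Link: $\hat{\pi}_{ij} = \dfrac{\exp(\hat{\eta}_{ij})}{1 + \exp(\hat{\eta}_{ij})}$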



III. Scale and Inference

Main tool of inference: estimable functions
e.g. entry "LS Mean": $\hat{\eta} + \hat{\tau}_j$ (can denote $\hat{\eta}_j$)
entry difference: $\hat{\tau}_j - \hat{\tau}_{j'}$
These are estimated on the "linear" or "model" scale

Main focus of inference: on data scale
e.g. $P\{\text{resp} = 1 \mid \text{entry } j\}$
difference between entry probabilities

Require "inverse linking": $\hat{\pi}_j = \dfrac{\exp(\hat{\eta}_j)}{1 + \exp(\hat{\eta}_j)}$



III. Inverse Linking

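Estimation occurs on model scale
But reporting typically must occur on data scale

Estimate: $\mathbf{K}'\hat{\boldsymbol\beta}$

Std error: $\text{s.e.}(\mathbf{K}'\hat{\boldsymbol\beta}) = \sqrt{\mathbf{K}'\,\widehat{\mathrm{Var}}(\hat{\boldsymbol\beta})\,\mathbf{K}}$

Confidence interval: $\mathbf{K}'\hat{\boldsymbol\beta} \pm z_{\alpha/2} \times \text{s.e.}$

Inverse linked estimate: $h(\mathbf{K}'\hat{\boldsymbol\beta})$, e.g. $\dfrac{\exp(\mathbf{K}'\hat{\boldsymbol\beta})}{1 + \exp(\mathbf{K}'\hat{\boldsymbol\beta})}$

Inverse linked std error (“delta” rule): $\text{s.e.}[h(\mathbf{K}'\hat{\boldsymbol\beta})] = \left|\dfrac{\partial h}{\partial \eta}\right| \times \text{s.e.}(\mathbf{K}'\hat{\boldsymbol\beta})$

Inverse linked confidence interval: $\big(h(\text{LowerB}),\ h(\text{UpperB})\big)$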



III. Model & Data Scale – Hessian Fly Example

Solutions for Fixed Effects

Effect      entry   Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept           -1.9057    0.4886           15   -3.90     0.0014
entry       1       3.8001     0.6327           33   6.01      <.0001
entry       2       3.4821     0.6186           33   5.63      <.0001

Estimates

                 ------- linear or model scale -------   ------------- data scale -------------
Label            Estimate  Std Error  Lower    Upper     Mean    Std Error Mean  Lower Mean  Upper Mean
entry 1          1.8944    0.4608     0.9568   2.8319    0.8693  0.05237         0.7225      0.9444
entry 2          1.5765    0.4321     0.6974   2.4555    0.8287  0.06133         0.6676      0.9210
diff entry 1-2   0.3179    0.5793     -0.8607  1.4965    0.5788  0.1412          0.2972      0.8171

which of these make NO sense?
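The data-scale columns can be reproduced by hand from the model-scale columns. The sketch below assumes the logit link used in this example and applies the inverse link and the delta rule to the numbers in the Estimates table; the function names are made up for illustration.

```python
import math


def inverse_logit(eta):
    """Inverse link h(eta) for the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


def to_data_scale(estimate, std_error, lower, upper):
    """Return (mean, s.e. of mean, lower, upper) on the data scale."""
    mean = inverse_logit(estimate)
    # Delta rule: dh/d(eta) = h(eta) * (1 - h(eta)) for the logit link.
    se_mean = mean * (1.0 - mean) * std_error
    return mean, se_mean, inverse_logit(lower), inverse_logit(upper)


# Model-scale values copied from the Estimates table above.
entry1 = to_data_scale(1.8944, 0.4608, 0.9568, 2.8319)
entry2 = to_data_scale(1.5765, 0.4321, 0.6974, 2.4555)
print("entry 1:", [round(v, 4) for v in entry1])
print("entry 2:", [round(v, 4) for v in entry2])

# Inverse-linking a model-scale difference is not a difference of probabilities.
inverse_linked_difference = inverse_logit(0.3179)
difference_of_probabilities = entry1[0] - entry2[0]
print("inverse link of diff:", round(inverse_linked_difference, 4))
print("diff of probabilities:", round(difference_of_probabilities, 4))
```

The last two lines bear on the question posed on the slide: the "Mean" reported for `diff entry 1-2` is the inverse link applied to a logit difference, which is a different quantity from the difference between the two entry probabilities.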



on to GLIMMIX



IV. GLIMMIX Syntax

SAS software for GLMs & Mixed models

Basic GLIMMIX syntax

Similarities & Differences vs existing SAS Procs

New features



IV. SAS Software for Linear Models

LM
−Proc GLM, MIXED
−Proc GLIMMIX

GLM
−Proc GENMOD, Proc NLMIXED
−Proc GLIMMIX

LMM
−Proc MIXED
−Proc GLIMMIX

GLMM
−Proc GLIMMIX, Proc NLMIXED



IV. PROC GLIMMIX Syntax

What’s familiar (from MIXED & GENMOD)
− CLASS
− MODEL
− DIST and LINK options in MODEL (like GENMOD)
− RANDOM (for G-side)
− ESTIMATE, CONTRAST, LSMEANS
− ODS

What’s new or different
− RANDOM _RESIDUAL_ (replaces REPEATED for R-side)
− LSMESTIMATE
− new options in LSMEANS (e.g. better options for factorial exp)
− NLOPTIONS
− Model diagnostics



IV. Relation between GLMM Structure and GLIMMIX Code

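GLMM:

$$\mathbf{y} \mid \mathbf{u} \sim \text{dist (R-side)}, \qquad \mathrm{Var}(\mathbf{u}) = \mathbf{G}$$

$$g(\boldsymbol\mu \mid \mathbf{u}) = \mathbf{X}\boldsymbol\beta + \mathbf{Z}\mathbf{u}$$

$$\mathrm{Var}(\mathbf{y} \mid \mathbf{u}) = \mathbf{V}_\mu^{1/2}\, \mathbf{P}\, \mathbf{V}_\mu^{1/2}$$

    proc glimmix;
      class variables;
      model <resp>=<fixed effects> / dist= link= ;
      random <g-side effects> / <options>;
      random _residual_ / type= subject= ;
    run;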



IV. NLOPTIONS Statement

New Statement in GLIMMIX
Controls Optimization technique, Line Search Method, number of Iterations, etc

    proc glimmix;
      class id a b;
      model y=a b a*b;
      random _residual_ / type=cs subject=id(a);
      nloptions tech=nrridg maxiter=100;

TECH=NRRIDG causes GLIMMIX to use MIXED computing algorithm (good for comparison...)



IV. Programming Statements
Similar to GENMOD, NLIN, NLMIXED
GLIMMIX supports statements using DATA step syntax
Use to transform variables, define quantities to output, user-defined link, variance, etc.
For example....

    proc glimmix;
      class block entry;
      pct=y/n;
      model pct=entry;
      random intercept / subject=block;



IV. Some GLIMMIX Defaults Useful to Know

In MODEL statement
−response Y= NORMAL distribution & IDENTITY link
−response Y/N= BINOMIAL distribution and LOGIT link

For distributions without scale parameter in variance function (e.g. Binomial, Poisson)
−no scale parameter assumed (unlike %GLIMMIX macro)
−obtain scale parameter with RANDOM _RESIDUAL_

Optimization method automatically matched based on DISTRIBUTION & LINK



IV. Estimation Methods in PROC GLIMMIX

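Defaults depend on model, distribution, and link
May be altered with METHOD= option
−in PROC statement

METHOD= options
−variations on pseudo-likelihood

Method   Objective function                Model
RSPL     Restricted obj fct (like REML)    subject specific (conditional or mixed) model
RMPL     Restricted obj fct (like REML)    population averaged (marginal) model
MSPL     Unrestricted obj fct (like ML)    subject specific (conditional or mixed) model
MMPL     Unrestricted obj fct (like ML)    population averaged (marginal) model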



IV. Defaults & Methods (continued)

GLMM Default Method is RSPL
For LMM, this is REML
−GLIMMIX uses different algorithm than MIXED; TECH=NRRIDG uses MIXED algorithm
−you can get slightly different numbers with MIXED/GLIMMIX

METHOD=MSPL yields ML estimates
Methods appear in literature as MPL, PQL
Gaussian adaptive quadrature and Laplace algorithms will be added to V 9.2
−not available yet & not discussed here



IV. Examples

Poisson regression, log link, change variance function:

    proc glimmix;
      class id;
      _variance_=_mu_*_mu_;
      model y=x / dist=poisson;
    run;

Poisson regression, log link, add scale parameter:

    proc glimmix;
      class id;
      model y=x / dist=poisson;
      random _residual_;
    run;

Poisson regression, log link:

    proc glimmix;
      class id;
      model y=x / dist=poisson;
    run;



IV. “GLM-mode” vs “GLMM-mode”

Use following trick to get GLM (GENMOD) type model via pseudo-likelihood

“GLM-mode”: max likelihood

    proc glimmix;
      class id;
      model y=x / dist=poisson;
      random _residual_;

“GLMM-mode”: pseudo likelihood (this is a GEE with indep working corr)

    proc glimmix;
      class id;
      model y=x / dist=poisson;
      random _residual_ / subject=id;



IV. Distributions supported by GLIMMIX

Continuous
−Beta
−Normal
−Lognormal
−Gamma
−Exponential
−Inverse Gaussian
−Shifted T

Discrete
−Binary
−Binomial
−Poisson
−Geometric
−Negative Binomial
−Multinomial (Nominal, Ordinal)



IV. MIXED to GLIMMIX – R-side

    proc mixed;
      class loc id trt time;
      model y=trt | time;
      random loc;
      repeated / type=ar(1) subject=id(loc);

    proc glimmix;
      class loc id trt time;
      model y=trt | time;
      random intercept / subject=loc;
      random _residual_ / type=ar(1) subject=id(loc);

When you use GLIMMIX, you will notice it is much fussier about the SUBJECT= option when nested subject structure is present (MIXED is more likely to let you get away with ignoring SUBJECT)



IV. More on R-side

    proc mixed;
      class loc id trt time;
      model y=trt | time;
      random loc;
      repeated time / type=ar(1) subject=id(loc);

    proc glimmix;
      class loc id trt time;
      model y=trt | time;
      random intercept / subject=loc;
      random time / type=ar(1) subject=id(loc) residual;
      ** vs random _residual_ / type=ar(1) subject=id(loc);

RANDOM time / ... RESIDUAL is an alternative form of RANDOM _RESIDUAL_, e.g. when time points are missing, unsorted, etc.



IV. MIXED to GLIMMIX - Estimate

MIXED: single row ESTIMATE statements
GLIMMIX: multi-row with multiplicity adjustment

    proc mixed;
      class trt;
      model y=trt a x trt*a trt*x;
      estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3;
      estimate '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3;
      estimate '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3;

    proc glimmix;
      class trt;
      model y=trt a x trt*a trt*x;
      estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3,
               '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3,
               '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3 / adjust=scheffe;



IV. MIXED vs. GLIMMIX - LSMEANS

Example: Factorial

    PROC MIXED;
      class A B;
      model y=A|B;
      lsmeans A B/diff;
      lsmeans A*B/diff slice=(A B);

−DIFF gives you table of all possible differences
−SLICE tests – but does not estimate – simple effects A given B, vice versa

    PROC GLIMMIX;
      class A B;
      model y=A|B;
      lsmeans A B/diff lines;
      lsmeans A*B / slice=(A B) slicediff=(A B);

−LINES gives multiple range display users love
−SLICEDIFF restricts A*B diffs to actual simple effects, e.g. A1-A2|Bj



IV. GLIMMIX – LSMEANS (1) Main Effects

    proc glimmix data=AxB_example;
      class block A B;
      model y=A|B/ddfm=satterth;
      random block block*B;
      lsmeans A B/diff lines;
      lsmeans A*B/slicediff=(A B);
    run;

B Least Squares Means

B   Estimate   Standard Error   DF      t Value   Pr > |t|
1   18.5300    1.3226           13.69   14.01     <.0001
2   26.5200    1.3226           13.69   20.05     <.0001
4   28.2800    1.3226           13.69   21.38     <.0001
8   25.3000    1.3226           13.69   19.13     <.0001

T Grouping for B Least Squares Means
LS-means with the same letter are not significantly different.

B   Estimate   Grouping
4   28.2800    A
2   26.5200    A
8   25.3000    A
1   18.5300    B



IV. GLIMMIX – LSMEANS (2) Simple Effects

    proc glimmix data=AxB_example;
      class block A B;
      model y=A|B/ddfm=satterth;
      random block block*B;
      lsmeans A B/diff lines;
      lsmeans A*B/slicediff=(A B);
    run;

Simple Effect Comparisons of A*B Least Squares Means By B

Simple Effect Level   A   _A   Estimate   Standard Error   DF   t Value   Pr > |t|
B 1                   r   s    2.9400     1.3144           16   2.24      0.0399
B 2                   r   s    2.6400     1.3144           16   2.01      0.0618
B 4                   r   s    -0.2000    1.3144           16   -0.15     0.8810
B 8                   r   s    -1.0000    1.3144           16   -0.76     0.4578

A*B Least Squares Means

A   B   Estimate   Standard Error
r   1   20.0000    1.4769
r   2   27.8400    1.4769
r   4   28.1800    1.4769
r   8   24.8000    1.4769
s   1   17.0600    1.4769
s   2   25.2000    1.4769
s   4   28.3800    1.4769
s   8   25.8000    1.4769



IV. GLIMMIX – LSMEANS (3)

lsmeans a*b / diff; gave you this

Differences of A*B Least Squares Means

A   B   _A   _B   Estimate   Standard Error   DF      t Value   Pr > |t|
r   1   r    2    -7.8400    1.8796           19.49   -4.17     0.0005
r   1   r    4    -8.1800    1.8796           19.49   -4.35     0.0003
r   1   r    8    -4.8000    1.8796           19.49   -2.55     0.0192
r   1   s    1    2.9400     1.3144           16      2.24      0.0399
r   1   s    2    -5.2000    1.8796           19.49   -2.77     0.0121
r   1   s    4    -8.3800    1.8796           19.49   -4.46     0.0003
r   1   s    8    -5.8000    1.8796           19.49   -3.09     0.0060
r   2   r    4    -0.3400    1.8796           19.49   -0.18     0.8583
r   2   r    8    3.0400     1.8796           19.49   1.62      0.1219
r   2   s    1    10.7800    1.8796           19.49   5.74      <.0001
etc



IV. GLIMMIX -- LSMESTIMATE
Example: Simple Effect in 2-Factor Factorial

Model: $y_{ijk} = \mu_{ij} + e_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$

Simple Effect, e.g. A|B: $\mu_{ij} - \mu_{i'j} = \alpha_i - \alpha_{i'} + (\alpha\beta)_{ij} - (\alpha\beta)_{i'j}$

not estimable:

    estimate 'A|B' a*b 1 0 0 0 -1 0 0 0;

must write:

    estimate 'A|B' a 1 -1 a*b 1 0 0 0 -1 0 0 0;

new GLIMMIX alternative:

    lsmestimate a*b 'A|B' 1 0 0 0 -1 0 0 0;

Defined on $\mu_{ij}$, not on model effects

Allows multiple LSMESTIMATES & ADJUST= for multiplicity



IV. ODS Graphics With GLIMMIX

Not available with MIXED

    ods html;
    ods graphics on;
    ods select MeanPlot;
    proc glimmix data=AxB_example;
      class block A B;
      model y=A|B/ddfm=satterth;
      random block block*B;
      lsmeans A*B/plot=MeanPlot(sliceby=A join cl);
    run;
    ods graphics off;
    ods html close;
    run;



Factorial Treatment Design

Treatment Design vs Experiment (or study) Design

Factorial is type of treatment design
Factor A, a levels; Factor B, b levels; etc
Main inference tools:
−simple effects; e.g. method effect | variety j
−interaction; i.e. simple effects equal for all j
−main effects



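Model: $y_{ijk} = \mu_{ij} + E_{ijk}$
$y_{ijk}$ = $k^{th}$ obs on $ij^{th}$ A × B combination
$\mu_{ij}$ = $ij^{th}$ A × B mean
$E_{ijk}$ is generic random structure; specific form depends on design

Simple effect:
A | B$_j$: $\mu_{ij} - \mu_{i'j}$
B | A$_i$: $\mu_{ij} - \mu_{ij'}$

Interaction:
equal simple effects ⇔ no interaction
e.g. $\mu_{ij} - \mu_{i'j} = \mu_{ij'} - \mu_{i'j'}$

Main effect: $\bar{\mu}_{i\cdot} - \bar{\mu}_{i'\cdot}$ or $\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot j'}$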



GLIMMIX Features

Can estimate / test
−simple effects
−main effect
−depending on which is appropriate

ODS graphics can graph / plot effects of interest
SLICE can focus on simple effects in presence of interaction
SLICEDIFF can estimate simple effects of interest



Modeling & Design



But My Study is not a Designed Experiment!

Comparative Study: any study whose purpose is to compare treatments or conditions (includes assessing change over time). Includes “quasi-experiments” & surveys with comparative objectives + designed experiments. Design principles apply to all!

Most modeling issues are study design issues

Most modeling errors result from poor understanding of design principles



If you are modeling, you need to understand design principles!!



Key Terms in Design

Treatment Design: factors and levels & how they are structured in the study. E.g. factorial, planned obs over time

Experiment Design: Organization of experimental units (e.g. into matched pairs, blocks, strata, clusters); plan by which they are assigned to treatment levels.

Experimental Unit (e.u.): Smallest entity to which treatment levels (or treatment combinations) are independently assigned. E.U.s are legitimate units of replication

Sampling Unit (s.u.): Unit on which measurement is taken. May be e.u. itself or subset of e.u. A.k.a. pseudo-replicate

Pseudo-replication: use of S.U.s as units of replication; common form of inappropriate design & analysis



Factorial & Experiment Designs
idea: experimental unit is smallest entity to which treatment level independently applied
e.u. may be different size for different factors
e.g. from SAS for Mixed Models, Section 4.6
−2 type × 3 dose example
−dose applied to cage; type to animal in cage
−e.u. for dose: cage with 2 animals
−e.u. for type (and dose × type): animal
−split-plot
many variations (including repeated measures)



Adding to Model

[Diagram: two schools, each containing classrooms, each containing students. Treatment applied to school: Participate in Prof Devel vs. Do Not Participate. Curriculum (exp or std) applied to classrooms within each school.]



V. Factorial Treatment Designs

Basic Features

Come in Many (many, many) design forms

Experiment design & “quasi-experiment” or survey “study design”

−key to deciding what’s random & what’s fixed

−non-mixed (LM and GLM only) software is UNACCEPTABLE for these types of problems

Includes repeated measures (change... growth)

Normal and non-normal data



Type x Dose Design

[Diagram: units for Dose 1, Dose 2 and Dose 3, each split between Type 1 and Type 2]

or... Dose = Professional Development Trt; Type = Curriculum


Figure 4.1 Possible design layouts for 2 x 2 factorial experiment (from SAS for Mixed Models)

Treatment codes: A1B1, A1B2, A2B1, A2B2

a. Completely Randomized
b. Randomized complete block (Blk 1 to Blk 4)
c. Row-Column (Latin Square) (row 1 to row 4 by col 1 to col 4)
d. Split-plot 1, whole plot completely randomized
e. Split-plot 2, whole plot in randomized complete blocks (Blk 1 to Blk 4)
f. Split-block, a.k.a. strip-split-plot (Blk 1 to Blk 4)
g. Split-plot 3, whole plot in row-column (2 Latin squares)

Treatment design: 2 x 2 factorial
Experiment design: many, many variations
Here are 7 (seven)

Even with 2 x 2 factorial, these seven are not all
we’re just getting started!



Split Block Example

[Diagram: microchip wafer. Side: L, R. Position: same meaning on both sides.]



Choosing right model – step 1
What is the experimental unit?

figure   4.1.a      4.1.b      4.1.c     4.1.d           4.1.e            4.1.f         4.1.g
effect   CRD        RCB        LS        split plot CR   split plot RCB   split-block   split-plot LS
block?   no         yes        row col   no              yes              yes           row col
A        eu(A*B)    blk*A*B    row*col   eu(A)           blk*A            blk*A         row*col
B        eu(A*B)    blk*A*B    row*col   B*eu(A)         blk*A*B          blk*B         row*col*B
A*B      eu(A*B)    blk*A*B    row*col   B*eu(A)         blk*A*B          blk*A*B       row*col*B



Common Models in PROC MIXED/GLIMMIX

Design                    SAS – class, model and random statements
CRD (Figure 4.1.a)        class eu a b; model y=a b a*b;
RCB (Fig 4.1.b)           class block a b; model y=a b a*b; random block; (or random intercept / subject=block;)
Latin Square (4.1.c)      class row col a b; model y=a b a*b; random row col;
Split-plot CR (4.1.d)     class eu a b; model y=a b a*b; random eu(a);
Split-plot RCB (4.1.e)    class block a b; model y=a b a*b; random block block*a;
Split-block (4.1.f)       class block a b; model y=a b a*b; random block block*a block*b;
Split-plot LS (4.1.g)     class row col a b; model y=a b a*b; random row col row*col; (or, equivalently, random row col row*col*a;)

MODEL: treatment design
RANDOM: experiment (study) design



Model for split-plot: school-classroom example

Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.

Effect          e.u.
prof dev trt    school
curriculum      classroom(school)
p.d × curr      classroom(school)

model: $y_{ijk} = \mu_{ij} + s(t)_{ik} + e_{ijk}$

or alternative expression

$\mu_{ij} = \mu + p_i + c_j + (pc)_{ij}$
$E_{ijk} = school(trt)_{ik} + e_{ijk}$

note! student is sampling unit (not an e.u.)



Model for split-plot – Dose x Type example

Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.

Effect         e.u.
dose           block × dose
type           block × dose × type
dose × type    block × dose × type

model: $y_{ijk} = \mu_{ij} + bloc_k + (b \times d)_{ik} + e_{ijk}$

or alternative expression

$\mu_{ij} = \mu + d_i + t_j + (dt)_{ij}$
$E_{ijk} = bloc_k + (b \times d)_{ik} + e_{ijk}$

note! bloc × type NOT in model (not an e.u.)



Conventional ANOVA

Source                       EMS
bloc
dose                         $\sigma_S^2 + t\sigma_W^2 + Q_D$
w.p. error† (bloc × dose)    $\sigma_S^2 + t\sigma_W^2$
type                         $\sigma_S^2 + Q_T$
dose × type                  $\sigma_S^2 + Q_{DT}$
s.p. error††                 $\sigma_S^2$

† a.k.a. between subjects error
†† a.k.a. within subjects error



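Standard errors of various terms

Main effects
of dose: $\bar{\mu}_{i\cdot} - \bar{\mu}_{i'\cdot}$, Var $= \dfrac{2}{rt}\left(\sigma_S^2 + t\sigma_W^2\right)$
of type: $\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot j'}$, Var $= \dfrac{2}{rd}\,\sigma_S^2$

Simple effects
type|dose$_i$: $\mu_{ij} - \mu_{ij'}$, Var $= \dfrac{2}{r}\,\sigma_S^2$
dose|type$_j$: $\mu_{ij} - \mu_{i'j}$, Var $= \dfrac{2}{r}\left(\sigma_S^2 + \sigma_W^2\right)$

Note: you can use MS() directly except for dose|type$_j$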



Programming in Proc GLIMMIX

    proc glimmix;
      class bloc type dose;
      model y=type|dose;
      random intercept dose / subject=bloc;  ** i.e. random bloc bloc*dose;
      lsmeans type*dose / diff lines slicediff=(type dose) slice=(type dose);
      ods output lsmeans=lsm;
    run;

−DIFF: all possible mean differences
−LINES: with “MRT lines”
−SLICEDIFF: simple effect differences only
−SLICE: simple effect tests only

You can use ODS to output LSMEANS and GPLOT for interaction plots, or use ODS graphics directly



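Type x Dose: Selected Output

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   block     2.0735     2.7320
dose        block     4.5132     2.8291
Residual              4.3189     1.5270

Type III Tests of Fixed Effects

Effect      Num DF   Den DF   F Value   Pr > F
type        1        16       2.78      0.1151
dose        3        12       13.63     0.0004
type*dose   3        16       2.29      0.1176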
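The variance formulas on the "Standard errors of various terms" slide can be checked against this output. The sketch below plugs the covariance parameter estimates into those formulas. Assumptions made here, not stated on the slides: r = 5 blocks (inferred from the denominator degrees of freedom), t = 2 types, d = 4 doses, and the usual split-plot formula for the standard error of a single type*dose mean, which the slides do not give.

```python
import math


def split_plot_standard_errors(sigma2_block, sigma2_w, sigma2_s, r, t, d):
    """Standard errors for a split-plot with dose on whole plots, type on split plots.

    sigma2_w: whole-plot (block*dose) variance; sigma2_s: split-plot (residual) variance.
    """
    return {
        "dose main effect": math.sqrt(2 * (sigma2_s + t * sigma2_w) / (r * t)),
        "type main effect": math.sqrt(2 * sigma2_s / (r * d)),
        "type | dose": math.sqrt(2 * sigma2_s / r),
        "dose | type": math.sqrt(2 * (sigma2_s + sigma2_w) / r),
        # Not on the slide: s.e. of one type*dose least squares mean.
        "type*dose mean": math.sqrt((sigma2_block + sigma2_w + sigma2_s) / r),
    }


# Estimates from the Covariance Parameter Estimates table; r, t, d are assumed.
se = split_plot_standard_errors(2.0735, 4.5132, 4.3189, r=5, t=2, d=4)
for label, value in se.items():
    print(f"{label:18s} {value:.4f}")
```

Under these assumptions the `type | dose` and `dose | type` values agree with the standard errors 1.3144 and 1.8796 reported for the simple-effect differences, and the cell-mean value agrees with 1.4769.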



Type x Dose LSMeans

type*dose Least Squares Means

type dose Estimate Standard Error DF t Value Pr > |t|


r 1 20.0000 1.4769 20.23 13.54 <.0001

r 2 27.8400 1.4769 20.23 18.85 <.0001

r 4 28.1800 1.4769 20.23 19.08 <.0001

r 8 24.8000 1.4769 20.23 16.79 <.0001

s 1 17.0600 1.4769 20.23 11.55 <.0001

s 2 25.2000 1.4769 20.23 17.06 <.0001

s 4 28.3800 1.4769 20.23 19.22 <.0001

s 8 25.8000 1.4769 20.23 17.47 <.0001



Type x Dose: “MRT Lines”

T Grouping for type*dose Least Squares Means
LS-means with the same letter are not significantly different.

type   dose   Estimate   Grouping
s      4      28.3800    A
r      4      28.1800    A
r      2      27.8400    A
s      8      25.8000    A
s      2      25.2000    A
r      8      24.8000    A
r      1      20.0000    B
s      1      17.0600    C

however ...



A Factorial Inference Flowchart

The Prime Directive: Interactions first!!!!!

Interaction?

Negligible

Interpret Main Effects

Non-ignorable

Interpret Simple Effects

Full Wheelbarrow



Plots of Differences between Means

LSMEANS allows various plots of mean differences

DIFFPlot: plots interval estimates of mean differences

ANoMPlot: (ANalysis of Means) plots difference between each treatment and the overall mean

ControlPlot: Plots each treatment vs control (e.g. like Dunnett test)



SAS for Mean Difference Plots

From Type x Dose example

    ods html;
    ods graphics on;
    ods select Anomplot DiffPlot;
    proc glimmix data=variety_eval;
      class block type dose;
      model y=type|dose/ddfm=satterth;
      random block block*dose;
      lsmeans dose/plot=DiffPlot;
      lsmeans dose/plot=AnomPlot;
      *lsmeans type*dose/plot=DiffPlot;
      *lsmeans type*dose/plot=AnomPlot;
    run;
    ods graphics off;
    ods html close;
    run;



SAS for Mean Difference Plots: DIFFPLOT



SAS for Mean Difference Plots: ANoMPLOT



Mean Difference Plots – Control Plots

From SAS for Linear Models – Output 3.17-3.22
Randomized Complete Block
5 Irrigation Treatments: Flood (control), Basin, Spray, Sprinkler, Trickle

    ods html;
    ods graphics on;
    ods select ControlPlot;
    proc glimmix order=data;
      class bloc irrig;
      model fruitwt=irrig;
      random bloc;
      lsmeans irrig/diff=control('flood') plot=controlplot adjust=dunnett;
    run;
    ods graphics off;
    ods html close;
    run;



Dunnett-style Control Plot



Back to Type x Dose Data: Interaction Plot



Type x Dose: Simple Effects

SLICE: test only

Tests of Effect Slices for type*dose Sliced By type

type   Num DF   Den DF   F Value   Pr > F
r      3        19.49    8.12      0.0010
s      3        19.49    13.58     <.0001

Tests of Effect Slices for type*dose Sliced By dose

dose   Num DF   Den DF   F Value   Pr > F
1      1        16       5.00      0.0399
2      1        16       4.03      0.0618
4      1        16       0.02      0.8810
8      1        16       0.58      0.4578

SLICEDIFF: estimates etc

Simple Effect Comparisons of type*dose Least Squares Means By dose

Simple Effect Level   type   _type   Estimate   Standard Error   DF   t Value   Pr > |t|
dose 1                r      s       2.9400     1.3144           16   2.24      0.0399
dose 2                r      s       2.6400     1.3144           16   2.01      0.0618
dose 4                r      s       -0.2000    1.3144           16   -0.15     0.8810
dose 8                r      s       -1.0000    1.3144           16   -0.76     0.4578



Type x Dose: Simple Effect Estimates by Type

Simple Effect Comparisons of type*dose Least Squares Means By type

Simple Effect Level dose _dose Estimate Standard Error DF t Value Pr > |t|


type r 1 2 -7.8400 1.8796 19.49 -4.17 0.0005

type r 1 4 -8.1800 1.8796 19.49 -4.35 0.0003

type r 1 8 -4.8000 1.8796 19.49 -2.55 0.0192

type r 2 4 -0.3400 1.8796 19.49 -0.18 0.8583

type r 2 8 3.0400 1.8796 19.49 1.62 0.1219

type r 4 8 3.3800 1.8796 19.49 1.80 0.0876

type s 1 2 -8.1400 1.8796 19.49 -4.33 0.0003

type s 1 4 -11.3200 1.8796 19.49 -6.02 <.0001

type s 1 8 -8.7400 1.8796 19.49 -4.65 0.0002

type s 2 4 -3.1800 1.8796 19.49 -1.69 0.1066

type s 2 8 -0.6000 1.8796 19.49 -0.32 0.7530

type s 4 8 2.5800 1.8796 19.49 1.37 0.1855



Effect of dose?

Log(Dose):

    contrast 'logdose linear' dose -3 -1 1 3;
    contrast 'logdose quad'   dose 1 -1 -1 1;
    contrast 'logdose cubic'  dose -1 3 -3 1;
    contrast 'type x linear'  dose*type -3 -1 1 3 3 1 -1 -3;
    contrast 'type x quad'    dose*type 1 -1 -1 1 -1 1 1 -1;
    contrast 'type x cubic'   dose*type -1 3 -3 1 1 -3 3 -1;

otherwise.....

    contrast 'dose linear'    dose -11 -7 1 17;
    contrast 'dose quad'      dose 20 -4 -29 13;
    contrast 'dose cubic'     dose -8 14 -7 1;
    contrast 'type x linear'  dose*type -11 -7 1 17 11 7 -1 -17;
    contrast 'type x quad'    dose*type 20 -4 -29 13 -20 4 29 -13;
    contrast 'type x cubic'   dose*type -8 14 -7 1 8 -14 7 -1;
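The two sets of coefficients differ because doses 1, 2, 4, 8 are equally spaced on the log scale but not on the original scale. One way to derive such coefficients, sketched below under the assumption that plain Gram-Schmidt orthogonalization of the powers of the levels is acceptable, is to orthogonalize in exact rational arithmetic and rescale to the smallest integers. The function names are made up for this illustration.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_smallest_integers(vector):
    """Rescale a vector of Fractions to the smallest proportional integers."""
    common = reduce(lambda a, b: a * b // gcd(a, b),
                    (f.denominator for f in vector), 1)
    ints = [int(f * common) for f in vector]
    divisor = reduce(gcd, (abs(i) for i in ints))
    return [i // divisor for i in ints]


def orthogonal_polynomial_contrasts(levels, max_degree):
    """Gram-Schmidt on 1, x, x^2, ... evaluated at the factor levels."""
    xs = [Fraction(v) for v in levels]
    basis = [[Fraction(1)] * len(xs)]          # degree 0: the constant
    contrasts = []
    for degree in range(1, max_degree + 1):
        v = [x ** degree for x in xs]
        for b in basis:                        # remove lower-degree components
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
        contrasts.append(to_smallest_integers(v))
    return contrasts


# log2(dose) = 0, 1, 2, 3 is equally spaced; dose = 1, 2, 4, 8 is not.
print(orthogonal_polynomial_contrasts([0, 1, 2, 3], 3))
print(orthogonal_polynomial_contrasts([1, 2, 4, 8], 3))
```

The first call reproduces the log-dose coefficients and the second the dose-scale coefficients used in the CONTRAST statements above.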



LogDose contrast results

Contrasts

Num Den

Label DF DF F Value Pr > F

logdose linear 1 12 18.25 0.0011

logdose quad 1 12 22.54 0.0005

logdose cubic 1 12 0.08 0.7780

type x linear 1 16 6.22 0.0240

type x quad 1 16 0.04 0.8515

type x cubic 1 16 0.61 0.4472



Direct Regression – borrow from ANCOVA

    proc glimmix data=variety_eval;
      class block type dose;
      model y=type logdose(type) ld_sq(type) / noint ddfm=satterth solution;
      random intercept dose / subject=block;
      contrast 'equal quad by type?' ld_sq(type) 1 -1;
    run;

Solutions for Fixed Effects

Effect          type   Estimate   Standard Error   DF      t Value
type            r      20.1890    1.4204           19.62   14.21
type            s      17.0200    1.4204           19.62   11.98
logdose(type)   r      9.8890     2.0181           21.45   4.90
logdose(type)   s      10.9800    2.0181           21.45   5.44
ld_sq(type)     r      -2.8050    0.6447           21.45   -4.35
ld_sq(type)     s      -2.6800    0.6447           21.45   -4.16

Contrasts

Label                 Num DF   Den DF   F Value   Pr > F
equal quad by type?   1        17       0.04      0.8497

can re-fit with LD_SQ common to both types



Example 3

From SAS for Mixed Models, Section 4.7
4 “conditions”
3 diets
Condition applied in incomplete block design
2 conditions per block
Diet applied to cages within condition
Condition is whole plot, diet is split-plot



“Plot plan”

diet 1 diet 2 diet 3 diet 2 diet 1 diet 3

diet 2 diet 1 diet 3 diet 1 diet 3 diet 2



Model?

blocking? yes
e.u. with respect to condition: “1/2 block”
e.u. with respect to diet: “1/3 condition e.u.”
e.u. w.r.t. cond x diet: same as diet

Model: $y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$



SAS Program

    proc glimmix data=fix2;
      class cage condition diet;
      model gain=condition diet condition*diet / ddfm=satterth;
      random intercept condition / subject=cage;
    run;

data & program: file ch4-ex3.sas



Selected Output

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   cage      3.0376     5.0791
condition   cage      0          .
Residual              27.8429    8.7672

Type III Tests of Fixed Effects

Effect           Num DF   Den DF   F Value   Pr > F
condition        3        23.61    2.71      0.0677
diet             2        20.17    0.93      0.4090
condition*diet   6        20.17    1.73      0.1661

how should one deal with negative variance component estimate?
• revert to ANOVA via PROC GLM ?
• in MIXED, use NOBOUND option ?
• in GLIMMIX, use LOWERB= in the PARMS statement
• alternatively, redefine model
• may be CS with plots in block negatively correlated



Comparison with SAS Proc GLM

    proc glm data=fix2;
      class cage condition diet;
      model gain=cage condition cage*condition diet condition*diet;
      random cage cage*condition/test;
      lsmeans condition diet condition*diet;

Tests of Hypotheses for Mixed Model Analysis of Variance

Source        DF   Type III SS   Mean Square   F Value   Pr > F
cage          5    198.277778    39.655556     2.73      0.2185
* condition   3    171.666667    57.222222     3.95      0.1446
Error         3    43.500000     14.500000
Error: MS(cage*condition)
* This test assumes one or more other fixed effects are zero.

Source             DF   Type III SS   Mean Square   F Value   Pr > F
cage*condition     3    43.500000     14.500000     0.46      0.7144
* diet             2    52.055556     26.027778     0.82      0.4561
condition*diet     6    288.388889    48.064815     1.52      0.2333
Error: MS(Error)   16   504.888889    31.555556



More GLM output

Least Squares Means

condition   gain LSMEAN
1           Non-est
2           Non-est
3           Non-est
4           Non-est

diet       gain LSMEAN
normal     57.9166667
restrict   55.5000000
suppleme   58.1666667

condition   diet       gain LSMEAN
1           normal     Non-est
1           restrict   Non-est
1           suppleme   Non-est
2           normal     Non-est
2           restrict   Non-est
2           suppleme   Non-est
3           normal     Non-est
3           restrict   Non-est
3           suppleme   Non-est
4           normal     Non-est
4           restrict   Non-est
4           suppleme   Non-est

non-estimability results from inappropriate definition of estimability (based on fixed & random eff)
inescapable consequence of Proc GLM with mixed model

DON’T use Proc GLM with mixed models!



GLM vs MIXED issues

REML default: negative variance component estimates set to 0
−if BLOCK affected, type I error rate is affected
−if error term affected, power may be affected
−better to allow negative estimates
−In MIXED: NOBOUND or METHOD=TYPE3
−In GLIMMIX: LOWERB= in the PARMS statement
vs. GLM uses implied MS regardless

GLM: inappropriate NON-EST is an artifact of incomplete block design

Standard errors for means, many simple effects (including SLICE) incorrect in GLM (no fix!!)



GLIMMIX Option (1) – Like NOBOUND in MIXED

    proc glimmix data=fix2;
      class cage condition diet;
      model gain=condition|diet/ddfm=kr;
      random intercept condition / subject=cage;
      parms / lowerb=(1e-4,-10,1e-4);
    run;

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   cage      5.0288     4.7149
condition   cage      -6.2404    4.8693
Residual              31.5556    11.1566

Type III Tests of Fixed Effects

Effect           Num DF   Den DF   F Value   Pr > F
condition        3        4.718    4.31      0.0798
diet             2        16       0.82      0.4561
condition*diet   6        16       1.52      0.2333



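GLIMMIX Option (2) – is it really correlation?

    proc glimmix data=fix2;
      class cage condition diet;
      model gain=condition|diet/ddfm=kr;
      random intercept / subject=cage;
      random _residual_ / type=cs subject=condition*cage;
    run;

Covariance Parameter Estimates

Cov Parm    Subject          Estimate
Intercept   cage             5.0271
CS          cage*condition   -6.2402
Residual                     31.5567

Type III Tests of Fixed Effects

Effect           Num DF   Den DF   F Value   Pr > F
condition        3        4.717    4.31      0.0798
diet             2        16       0.82      0.4561
condition*diet   6        16       1.52      0.2334

Interblock correlation $= \dfrac{\sigma^2_{CC}}{\sigma^2 + \sigma^2_{CC}}$, magnitude 0.2466 (negative here, since the CS estimate is negative)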



Modeling Change over Time

Regression over time
Latent growth / change models
Random coefficients over time
Repeated measures experiment
Longitudinal Data



From Acock – BMI Data

[Plot: bmi (axis 10 to 50) by year, 1997 to 2003]

Note – my sample differs from Acock’s, so the numbers won’t match



Basic Growth Model

Simplest model involves slope & intercept
In “Stat-speak”

obs = intercept + slope × time + error

$$y_{ij} = \beta_0 + \beta_1\, time_i + e_{ij}$$

this is just linear regression

$e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$
or
may be correlated (more later)



Basic Growth Model in SAS

in PROC GLM

    proc glm;
      model bmi=year;
    run;

Source            DF    Sum of Squares   Mean Square   F Value   Pr > F
Model             1     432.856378       432.856378    19.68     <.0001
Error             229   5037.468822      21.997680
Corrected Total   230   5470.325200

R-Square   Coeff Var   Root MSE   bmi Mean
0.079128   20.01197    4.690168   23.43682

Parameter   Estimate      Standard Error   t Value   Pr > |t|
Intercept   21.38349324   0.55631931       38.44     <.0001
year        0.68444085    0.15429522       4.44      <.0001

very deceptive – more shortly

regression equation: $\hat{y} = 21.38 + 0.684 \times \text{Year}$



Growth Model in SAS - II

in PROC GLIMMIX

    proc glimmix;
      class id;
      model bmi=year/solution;
      random _residual_ / subject=id;
      estimate 'y-hat in 1997' intercept 1 year 0 / cl;
      estimate 'y-hat in 2000' intercept 1 year 3 / cl;
      estimate 'y-hat in 2003' intercept 1 year 6 / cl;
    run;

selected output next page



Basic Growth Model – Selected GLIMMIX Output

Covariance Parameter Estimates

Cov Parm        Estimate   Standard Error
Residual (VC)   21.9977    2.0558

Solutions for Fixed Effects

Effect      Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   21.3835    0.5563           32    38.44     <.0001
year        0.6844     0.1543           197   4.44      <.0001

Estimates

Label           Estimate   Standard Error   DF    t Value   Pr > |t|   Alpha   Lower     Upper
y-hat in 1997   21.3835    0.5563           197   38.44     <.0001     0.05    20.2864   22.4806
y-hat in 2000   23.4368    0.3086           197   75.95     <.0001     0.05    22.8283   24.0454
y-hat in 2003   25.4901    0.5563           197   45.82     <.0001     0.05    24.3930   26.5872

Note: residual VC est = MSE from GLM ANOVA
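Each ESTIMATE statement is a linear combination of the fixed-effect solutions. The sketch below recomputes the three point estimates from the rounded solutions, assuming (as the ESTIMATE coefficients 0, 3, 6 suggest) that year is coded as years since 1997. The helper name is made up for illustration, and agreement is only to rounding error.

```python
def linear_estimate(coefficients, solution):
    """k'beta: the sum of coefficient times fixed-effect solution."""
    return sum(k * b for k, b in zip(coefficients, solution))


# Rounded solutions for (intercept, year) from the table above.
beta_hat = [21.3835, 0.6844]

# Assumption: year is coded 0 for 1997, 3 for 2000, 6 for 2003.
estimates = {
    calendar_year: linear_estimate([1, calendar_year - 1997], beta_hat)
    for calendar_year in (1997, 2000, 2003)
}
for calendar_year, value in estimates.items():
    print(calendar_year, round(value, 4))
```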



G/C Model – Issue I – Account for ID

Recall R² for Basic Growth Model very low
You must account for variation among subjects (ID)

okay:

    proc glm;
      class id;
      model bmi=id year;
    run;

better:

    proc glimmix;
      class id;
      model bmi=year/solution;
      random id;   /* or random intercept / subject=id; */



Selected Output

from GLM

R-Square
0.815282     (vs. 0.079)

from GLIMMIX

Cov Parm    Subject   Estimate   Standard Error
Intercept   id        17.2449    4.4950
Residual              5.1293     0.5168           (vs. 21.998)

Effect      Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   21.3835    0.7712           32    27.73     <.0001
year        0.6844     0.07451          197   9.19      <.0001

estimates don’t change; std errors do



Growth Change Modeling Issue - II

Correlated Errors

In Model $y_{ij} = \beta_0 + \beta_1\, year_i + e_{ij}$

$e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$
or
may be correlated

Recall:

Correlation Modeled by Covariance Model

• Failure to model correlation increases P{type I error}

• Over-modeling correlation decreases Power



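Covariance models

$$\text{Indep} = \sigma^2 \mathbf{I} \qquad \text{(identical to split-plot)}$$

$$\text{CS} = \sigma^2
\begin{bmatrix}
1 & \rho & \rho & \rho \\
\rho & 1 & \rho & \rho \\
\rho & \rho & 1 & \rho \\
\rho & \rho & \rho & 1
\end{bmatrix}$$

NOTE: CS is reparameterization of Indep

$$\text{AR(1)} = \sigma^2
\begin{bmatrix}
1 & \rho & \rho^2 & \rho^3 \\
\rho & 1 & \rho & \rho^2 \\
\rho^2 & \rho & 1 & \rho \\
\rho^3 & \rho^2 & \rho & 1
\end{bmatrix}$$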
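To make the difference between these structures concrete, the sketch below builds CS and AR(1) matrices for four time points. The variance and correlation values are hypothetical and the function names are made up for this illustration; they are not GLIMMIX output.

```python
def cs_covariance(sigma2, rho, n_times):
    """Compound symmetry: the same correlation rho for every pair of times."""
    return [[sigma2 * (1.0 if a == b else rho) for b in range(n_times)]
            for a in range(n_times)]


def ar1_covariance(sigma2, rho, n_times):
    """AR(1): correlation rho**lag, decaying with the distance between times."""
    return [[sigma2 * rho ** abs(a - b) for b in range(n_times)]
            for a in range(n_times)]


# Hypothetical parameter values, for illustration only.
sigma2, rho = 2.0, 0.5
V_cs = cs_covariance(sigma2, rho, 4)
V_ar1 = ar1_covariance(sigma2, rho, 4)

print("CS:")
for row in V_cs:
    print(row)
print("AR(1):")
for row in V_ar1:
    print(row)
```

With these values, the covariance between the first and last time points is 1.0 under CS but only 0.25 under AR(1).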



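More covariance models

$$\text{Toep} = \sigma^2
\begin{bmatrix}
1 & \rho_1 & \rho_2 & \rho_3 \\
 & 1 & \rho_1 & \rho_2 \\
 & & 1 & \rho_1 \\
 & & & 1
\end{bmatrix}$$

$$\text{ANTE(1)} =
\begin{bmatrix}
\sigma_1^2 & \sigma_1\sigma_2\rho_1 & \sigma_1\sigma_3\rho_1\rho_2 & \sigma_1\sigma_4\rho_1\rho_2\rho_3 \\
 & \sigma_2^2 & \sigma_2\sigma_3\rho_2 & \sigma_2\sigma_4\rho_2\rho_3 \\
 & & \sigma_3^2 & \sigma_3\sigma_4\rho_3 \\
 & & & \sigma_4^2
\end{bmatrix}$$

$$\text{UN} =
\begin{bmatrix}
\sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
 & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\
 & & \sigma_3^2 & \sigma_{34} \\
 & & & \sigma_4^2
\end{bmatrix}$$

(symmetric; lower triangles omitted)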



Issues in Repeated Measures

Impact of covariance structure?
Selection of appropriate covariance?
Bias in std errors, test statistics
Degrees of freedom
Nonlinear models over time
Non-normal errors



Basic G/C Model with Covariance Model

Also known as Autocorrelation

    proc glimmix;
      class id;
      model bmi=year / solution ddfm=kr;
      random intercept / subject=id;
      random _residual_ / subject=id type=ar(1);
    run;

Competing Covariance Models compared via Fit Statistics
•AICC, BIC
•HQIC, CAIC

degree of freedom and std error bias must be dealt with (more later)



Selected Output for G/C Model w/ Autocorrelation

Covariance Parameter Estimates (variance, covariance & correlation estimates)

Cov Parm    Subject   Estimate   Standard Error
Intercept   id        14.8587    4.6202
AR(1)       id        0.5623     0.1144
Residual              7.7165     1.8981

Fit Statistics (used to assess cov model)

-2 Res Log Likelihood       1111.69
AIC (smaller is better)     1117.69
AICC (smaller is better)    1117.79
BIC (smaller is better)     1122.18
CAIC (smaller is better)    1125.18
HQIC (smaller is better)    1119.20

Solutions for Fixed Effects

Effect      Estimate   Standard Error   DF    t Value   Pr > |t|
Intercept   21.3238    0.8042           32    26.52     <.0001
year        0.6896     0.1102           197   6.26      <.0001

estimate – slight effect
std error – bigger effect



random coeff correl errors prediction add Gender add emotional prob



Repeated Measure Experimentsa.k.a. Longitudinal Data

Assign e.u. to treatments
May use any design (completely random, blocked, row-column, split-plot ....)
Observations at planned times
Objectives
1. assess changes in response over time
2. assess treatment effect on (1)



Typical repeated Measures Data

from SAS for Linear Models, Chapter 8 SAS for Mixed Models, 2nd ed, Chapter 5



From BMI Data: Are G/C Curves Equal by Gender?

interactionplot of G/Ccurve by gender



FYI – SAS Code to Get Interaction Plot

ods html;
ods graphics on;
ods select MeanPlot;
proc glimmix data=bmi_uni_anc;
 class gender id year;
 model bmi=gender|year / solution ddfm=kr;
 random intercept / subject=id(gender);
 random _residual_ / type=ar(1) subject=id(gender);
 lsmeans gender*year / plot=MeanPlot (sliceby=gender join cl);
run;
ods graphics off;
ods html close;
run;



Model

proc glimmix data=bmi_uni_anc;
 class gender id year;
 model bmi=gender|year / solution ddfm=kr;
 random intercept / subject=id(gender);
 random _residual_ / type=ar(1) subject=id(gender);

translates to:

Model: $y_{ijk} = \mu_{ij} + id(gender)_{ik} + e_{ijk}$

where $\mu_{ij}$ = gender $i$ × year $j$ mean

can express as: $\mu_{ij} = \mu + g_i + yr_j + (g \cdot yr)_{ij}$

$id(gender)_{ik}$ is between subjects error, $NI(0, \sigma_B^2)$, like whole-plot error

$e_{ijk}$ is within subjects error, like split-plot error, except:

Let $e_{ik}' = [e_{i1k} \; e_{i2k} \; \dots \; e_{iTk}]$, with $e_{ik} \sim MVN(0, \Sigma)$



Back to SAS for Mixed Models Example

Model: $y_{ijk} = \mu_{ij} + s(trt)_{ik} + e_{ijk}$

where $\mu_{ij}$ = trt $i$ × time $j$ mean

$s(trt)_{ik}$ is between subjects error, $NI(0, \sigma_B^2)$, like whole-plot error

$e_{ijk}$ is within subjects error, like split-plot error, except:

Let $e_{ik}' = [e_{i1k} \; e_{i2k} \; \dots \; e_{iTk}]$, with $e_{ik} \sim MVN(0, \Sigma)$

Hence $Var(y_{ik}) = V_{ik} = Z_S Z_S' \sigma_B^2 + \Sigma = J_T \sigma_B^2 + \Sigma$; typically $V = Var(y) = I_{AK} \otimes V_{ik}$

A = # trt's, K = # subj/trt



Middle Ground between MANOVA and Split-Plot in Time via Proc GLIMMIX

PROC GLIMMIX;
 CLASSES SUBJ TRT TIME;
 MODEL Y= TRT TIME TRT*TIME;
 RANDOM INTERCEPT / SUBJECT=SUBJ(TRT);
 RANDOM TIME / TYPE=AR(1) SUBJECT=SUBJ(TRT) RESIDUAL;
 *LSMEANS TRT TIME TRT*TIME;
 TITLE 'MIXED - AR(1) ERRORS';
RUN;

RANDOM specifies between subjects effects (G-side)

RANDOM...RESIDUAL specifies within subjects effect (R-side)

in many models, G- and R-side effects are not identifiable



Modeling Covariance among Repeated Measures

PROC MIXED DATA=univ;
 CLASSES SUBJ TRT TIME;
 MODEL Y= TRT TIME TRT*TIME;
 REPEATED TIME / TYPE=UN SSCP SUBJECT=SUBJ(TRT);
 ODS OUTPUT CovParms=cp;
run;

data times;
 do time1=1 to 8;
  do time2=1 to time1;
   dist=time1-time2;
   output;
  end;
 end;

data covplot;
 merge times cp;

proc gplot data=covplot;
 plot adjcorr*dist=time1;

Computes covariance between pairs of measurements (same subject, different times) based on the sum of squares & cross-products matrix, then plots them by distance



Plot of Covariance by Distance



Idealized Plots: CS=Subj(Trt), AR(1), AR(1)+Subj(Trt)

AR(1) + Subj(Trt)

CS= random Subj(Trt)

AR(1) only



Model Fitting Criteria in Version 8

1. Compound Symmetry

proc glimmix;
 classes subj trt time;
 model y= trt time trt*time;
 random time / residual type=cs subject=subj(trt);
 title 'mixed - compound symmetry';

Fit Statistics

-2 Res Log Likelihood 839.39










Comparison of Models (Smaller is Better)

Compound Symmetry

Neg2LogLike   Parms   AIC     AICC    HQIC    BIC     CAIC
839.4         2       843.4   843.5   844.0   845.7   847.7

AR(1) + Subj(TRT) random effect


Unstructured




ANTE(1)

TOEP



How do Model Fitting Criteria Compare?

Guerin & Stroup (2000) compared AIC, BIC, HQIC, CAIC for simulated AR(1) and ARH(1) data

CAIC tends to select simpler models
AIC tends to select most complex models*
complex -- AIC > HQIC > BIC > CAIC -- simple
Model too simple (correlation model not adequate): Type I error rate too high
Model too complex (correlation over-modeled): Type I error control not affected, but power suffers

*Since 2000, SAS added AICC to address AIC issue
Best choice depends on severity of Type I vs II error



An Inference Issue

CS: Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20       0.74      0.5425
TIME       7        140      109.04    <.0001
TRT*TIME   21       140      1.98      0.0106

AR(1)+between subj: Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20       0.75      0.5344
TIME       7        140      60.55     <.0001
TRT*TIME   21       140      1.48      0.0921

UN: Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20       0.74      0.5425
TIME       7        20       101.31    <.0001
TRT*TIME   21       20       1.37      0.2450

UN similar to MANOVA, but MANOVA Trt*Time p-value was 0.50



Bias & Options for Adjusting

SAS default uses estimated (co)variance components in V: std errors biased; t-, F-statistics biased

“Robust” (a.k.a. “sandwich”) estimate of K’V^{-1}K available using EMPIRICAL option in MIXED

Kenward & Roger (Biometrics, 1997) proposed adjustment; available using DDFM=KR option in MODEL statement of MIXED

Guerin & Stroup (2000) evaluated KR option of SAS Version 8 with simulated AR(1) and ARH(1) data

Biased F resulted in inflated Type I error rates unless KR option used (for α=0.05, rejection rates >0.10 for TYPE=AR(1), up to 0.20 with TYPE=ANTE(1), UN)



Sandwich (“Robust”) Estimator

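OLS estimate of $\beta$: $\hat{\beta}_{OLS} = (X'X)^{-1}X'y$

$Var(\hat{\beta}_{OLS}) = (X'X)^{-1}X'\,Var(y)\,X(X'X)^{-1} = (X'X)^{-1}X'VX(X'X)^{-1}$

GLS estimate is: $\hat{\beta}_{GLS} = (X'\hat{V}^{-1}X)^{-1}X'\hat{V}^{-1}y$

$Var(\hat{\beta}_{GLS}) = (X'\hat{V}^{-1}X)^{-1}X'\hat{V}^{-1}\,V\,\hat{V}^{-1}X(X'\hat{V}^{-1}X)^{-1}$

Let $\hat{V}_0 = \hat{e}\hat{e}'$ based on residuals $\hat{e} = y - X\hat{\beta}$

Yields "Sandwich" estimator: $Var(\hat{\beta}_{GLS}) = (X'\hat{V}^{-1}X)^{-1}X'\hat{V}^{-1}\,\hat{e}\hat{e}'\,\hat{V}^{-1}X(X'\hat{V}^{-1}X)^{-1}$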



How does the sandwich estimator perform?

proc glimmix empirical;
 classes subj trt time;
 model y=trt time trt*time;
 random intercept / subject=subj(trt);
 random time / type=ar(1) subject=subj(trt) residual;
run;

Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20       1.31      0.2981
TIME       7        140      121.57    <.0001
TRT*TIME   20       140      9.04      <.0001

TRT*TIME: vs. F=1.48; p=0.0921 using default



Kenward and Roger

proc glimmix;
 classes subj trt time;
 model y= trt time trt*time / ddfm=kr;
 random intercept / subject=subj(trt);
 random time / type=ar(1) subject=subj(trt) residual;

Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20.5     0.77      0.5219
TIME       7        109      50.90     <.0001
TRT*TIME   21       117      1.24      0.2330



Alternative KR adjustment
• in SAS, KR adjustment uses Hessian matrix by default
• you can cause it to use the Information matrix instead
• no documented advantage one way or another

PROC glimmix scoremod scoring=51;
 CLASSES SUBJ TRT TIME;
 MODEL Y= TRT TIME TRT*TIME / ddfm=kr;
 RANDOM intercept / subject=SUBJ(TRT);
 Random _resid_ / TYPE=AR(1) SUBJECT=SUBJ(TRT);
 nloptions technique=nrridg;

Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
TRT        3        20.5     0.77      0.5264
TIME       7        112      54.18     <.0001
TRT*TIME   21       119      1.28      0.2010

TRT*TIME: vs. F=1.24, p=0.2330 using Hessian



Alternative Model for Change in BMI by Gender

Repeated Measures ANCOVA Model

Level 1: $y_{tj} = \beta_{0j} + \beta_{1j}\,yr_t + e_{tj}$

Level 2: $\beta_{0j} = gender_{0i} + id(gender)_{ij}$
         $\beta_{1j} = g_{1i}$

Combined: $y_{ijk} = gender_{0i} + id(gender)_{ij} + g_{1i}\,yr_t + e_{ijk}$

proc glimmix data=bmi_uni_anc;
 class gender id year;
 model bmi=gender yr(gender) / noint solution ddfm=kr;
 random intercept / subject=id(gender);
 random _residual_ / type=ar(1) subject=id(gender);
 contrast 'male vs female intercept' gender 1 -1;
 contrast 'male vs female slope' yr(gender) 1 -1;
run;



Selected Output



Cov Parm    Subject      Estimate
Intercept   id(gender)   15.1933
AR(1)       id(gender)   0.2928
Residual                 7.8871

Effect       gender   Estimate   Standard Error   DF      t Value   Pr > |t|
gender       0        20.1988    0.6084           165.9   33.20     <.0001
gender       1        21.8298    0.5596           165.9   39.01     <.0001
yr(gender)   0        0.7860     0.08207          204.5   9.58      <.0001
yr(gender)   1        0.6462     0.07549          204.5   8.56      <.0001

Contrasts

Label                      Num DF   Den DF   F Value   Pr > F
male vs female intercept   1        165.9    3.89      0.0501
male vs female slope       1        204.5    1.57      0.2111



Alternative Model

proc glimmix data=bmi_uni;
 class gender id;
 model bmi=gender year(gender) / noint solution ddfm=kr;
 random intercept year(gender) / subject=id type=un;
 contrast 'male vs female intercept' gender 1 -1;
 contrast 'male vs female slope' year(gender) 1 -1;
run;

This is a random coefficient model

Next section



Response Surface Split Plot with Repeated Measures

4 treatment factors (A, B, C, D)
− 2 levels each

3 factors (A, B, C) applied to P (subject)
treatment design: central composite design
subjects split into 2 sub-units
level of D randomly assigned to each sub-unit
observations at 3 planned times (H)



Central Composite Design



Model for Central Composite Split-Split Plot

Effect                                       e.u.
A, B, C main effects & interactions          P(A B C)
D                                            D × P(A B C)
D × (A, B, C)                                D × P(A B C)
H and all interactions involving H           H × D × P(A B C)

$y_{hijklm} = f(X_{Ai}, X_{Bj}, X_{Ck}) + d_l + f_l(X_{Ai}, X_{Bj}, X_{Ck}) + h_m + dh_{lm} + f_m(X_{Ai}, X_{Bj}, X_{Ck}) + p(abc)_{hijk} + dp(abc)_{hijkl} + e_{hijklm}$



SAS Statements

proc glimmix;
 class ca cb cc p d u;
 *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
 model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
       / noint solution htype=1 ddfm=kr;
 random p(ca cb cc) d*p(ca cb cc);



Key output



Intercept p(ca*cb*cc) 24.3200

d p(ca*cb*cc) 4.5151

Residual 11.4944


Effect d Estimate Standard Error

d 0 53.5687 2.3344

d 1 31.7168 2.3344

a(d) 0 16.8226 1.8101

a(d) 1 11.2226 1.8101

b(d) 0 19.5049 1.8101

b(d) 1 12.3715 1.8101

c(d) 0 4.4019 1.8101

c(d) 1 3.5352 1.8101

a*a 0.4980 3.2427

b*b -2.5020 3.2427

c*c 5.1647 3.2427

a*b 6.2083 1.8872

a*c -2.8333 1.8872

b*c 1.2083 1.8872

t(d) 0 9.4200 0.5504

t(d) 1 0.02442 0.5504

t*t -0.1487 1.1114

a*t 0.1160 0.5078

b*t 1.7331 0.5078

c*t 0.3513 0.5078

Fit Statistics




Complex Split-split-plot revisited

Recall A, B, C applied to units P
P split in two, levels of D to each half
Measured at 3 times
Previous analysis assumed split on time
Actually repeated measures
Split-plot + repeated measures



CCD Split-plot + repeated measures

proc glimmix data=CCD_SpltPlt;
 class ca cb cc p d u;
 *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
 model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
       / noint solution htype=1 ddfm=kr;
 random intercept / subject=p(ca cb cc);
 random _residual_ / type=sp(pow)(t) subject=d*p(ca cb cc);
run;

AICC: 573.4 as split-split-plot; 551.1 as repeated measures using SP(POW)
note SP(POW) is generalization of AR(1) for unequally spaced times



Unreplicated Split-Plot

SAS for Mixed Models, Section 16.7
Quilt divided in half
Each “half sheet” received 2 x 2 x 3 factorial
− 2 pH levels (low, high)
− 2 temp (cold, hot)
− 3 dry cycles (air, machine-delicate, machine-normal)
Material cut from each unit
− washed 10, 20, 30, 40, 50 times
Breaking strength monitored
Materials observed so reps by sheet lost



Model for Breaking Strength Experiment

$y_{ijklm} = \mu_{ijkl} + r_m + w_{ijkm} + e_{ijklm}$

where

$\mu_{ijkl}$ is the mean of the ijk-th pH × water temperature × dry cycle combination (i=8,10; j=35,55; k=air, delicate, normal) at the l-th time of washing (l=10,20,30,40,50),

$r_m$ is the effect of the m-th block (m=1,2 in the design, but m=1 only in the data),

$w_{ijkm}$ is the ijkm-th between subjects (or whole-plot) error effect, assumed $NID(0, \sigma_W^2)$,

$e_{ijklm}$ is the within subjects (or split-plot) error effect, assumed $NID(0, \sigma^2)$



ANOVA for Breaking Strength Experiment

Source of Variation d.f.

block 1

pH (P) 1

wash temp (T) 1

dry cycle (D) 2

PT 1

PD 2

TD 2

PTD 2

between subject error 11

no. of washes (W) 4

WP 4

WT 4

WD 8

WPT 4

WPD 8

WTD 8

WPTD 8

within subjects error 48

but these become 0 when blocking by “half quilt” distinction lost



Breaking Strength vs # Washes by pH



Breaking Strength vs # Washes by Temp



Breaking Strength vs # Washes by Dry Cycle



Revised ANOVA

Pool negligible effects to get between & within error

Source of Variation d.f.

pH (P) 1

wash temp (T) 1

dry cycle (D) 2

between subject error 7

linear effect of no. of washes (W Lin) 1

W LinP 1

W LinT 1

W LinD 2

within subjects error 43



GLIMMIX Program for Breaking Strength Experiment

proc glimmix data=shellie;
 class pH water_temp dry_cycle;
 model breaking_strength=pH water_temp dry_cycle w w*pH w*water_temp w*dry_cycle / solution;
 random pH*water_temp*dry_cycle;
 contrast 'air vs dryer effect on wear' w*dry_cycle 2 -1 -1;
 contrast 'delicate v normal effect on wear' w*dry_cycle 0 1 -1;
run;



Revised GLIMMIX - Estimate Regression over # of Washes

proc glimmix data=shellie;
 class pH water_temp dry_cycle;
 model breaking_strength= w(pH) w(water_temp) w(dry_cycle) / noint solution;
 random pH*water_temp*dry_cycle;
 estimate 'slope: ph 8, cold, air'      w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 1 0 0;
 estimate 'slope: ph 8, cold, delicate' w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 1 0;
 estimate 'slope: ph 8, cold, normal'   w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 0 1;
 estimate 'slope: ph 8, hot, air'       w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 1 0 0;
 estimate 'slope: ph 8, hot, delicate'  w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 0 1 0;

etc for all pH – temp – dry cycle combinations



Regression – Selected Output

Solution for Fixed Effects

Effect      water temp   Dry cycle   pH   Estimate   Standard Error
Intercept                                 0.1070     0.001895

Estimates

Label   Estimate   Standard Error

slope: ph 8, cold, air -0.00024 0.000077

slope: ph 8, cold, delicate -0.00047 0.000077

slope: ph 8, cold, normal -0.00050 0.000077

slope: ph 8, hot, air -0.00050 0.000077

slope: ph 8, hot, delicate -0.00073 0.000077

slope: ph 8, hot, normal -0.00076 0.000077

slope: ph 10, cold, air -0.00082 0.000077

slope: ph 10, cold, delicate -0.00105 0.000077

slope: ph 10, cold, normal -0.00108 0.000077

slope: ph 10, hot, air -0.00108 0.000077

slope: ph 10, hot, delicate -0.00131 0.000077

slope: ph 10, hot, normal -0.00134 0.000077

avg slope: ph 8 -0.00053 0.000054

avg slope: ph 10 -0.00111 0.000054

avg slope: cold water -0.00069 0.000054

avg slope: hot water -0.00095 0.000054

avg slope: air dry -0.00066 0.000063

avg slope: delicate dry -0.00089 0.000063

avg slope: normal dry -0.00092 0.000063



Prediction & Inference Space



VI. Prediction, “BLUP” and Inference Space

Estimation vs. Prediction

When “BLUP” is a good thing

Inference Space

−what is it?

−how can we use it?

Performance evaluation issues

Multi-location issues



Estimation, Prediction, and Inference Space

Estimation based on estimable functions: $K'\beta$

Estimation applies to fixed effects only, inference is to entire population

Prediction based on “predictable functions”: $K'\beta + M'u$

Prediction applies to fixed & random effects, narrows scope of inference to specific subset defined by M’u

Examples: locations, workers, teachers, patients...



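Prediction Example 1: Growth Change Modeling Issue - III

Random Coefficients

Recall Basic Growth Model: $y_{ij} = \beta_{0i} + \beta_{1i}\,year_j + e_{ij}$

Level 2: $\beta_{0i} = \beta_0 + b_{0i}$, $\beta_{1i} = \beta_1 + b_{1i}$

$\begin{bmatrix} b_{0i} \\ b_{1i} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} \right)$

proc glimmix data=bmi_uni;
 class id;
 model bmi=year / solution ddfm=kr;
 random intercept year / subject=id type=un solution;
 random _residual_ / subject=id type=ar(1);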



Selected Output

Covariance Parameter Estimates

Cov Parm   Subject   Estimate
UN(1,1)    id        10.8070
UN(2,1)    id        0.5873
UN(2,2)    id        0.2676
AR(1)      id        0.3024
Residual             4.6021

Effect      Estimate   Standard Error   t Value
Intercept   21.3577    0.6480           32.96
year        0.6870     0.1212           5.67

Solution for Random Effects

Effect   Subject   Estimate   Std Err Pred   DF

Intercept id 73 2.1023 1.3487 165

year id 73 -0.1608 0.3118 165

Intercept id 281 -1.3178 1.3487 165

year id 281 -0.1353 0.3118 165

Intercept id 496 -1.8137 1.3487 165

year id 496 -0.07237 0.3118 165

partial listing



You can obtain Subject-Specific Estimates

proc glimmix data=bmi_uni;
 class id;
 model bmi=year / solution ddfm=kr;
 random intercept year / subject=id type=un solution;
 random _residual_ / subject=id type=ar(1);
 estimate 'popn avg slope' year 1 / cl;
 estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
 estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;
 estimate 'popn avg intercept' intercept 1 / cl;
 estimate 'predicted bmi in 1997' intercept 1 year 0 / cl;
 estimate 'id (73) specific intercept' intercept 1 | intercept 1 / subject 1 0 cl e;
 estimate 'id (496) specific intercept' intercept 1 | intercept 1 / subject 0 0 1 0 cl;
 estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
 estimate 'id (73) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 1 0 cl;
 estimate 'id (496) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 0 0 1 0 cl;
 estimate 'predicted bmi in 2003' intercept 1 year 6 / cl;
 estimate 'id (73) specific 2003 bmi' intercept 1 year 6 | intercept 1 year 6 / subject 1 0 cl;
 estimate 'id (496) specific 2003 bmi' intercept 1 year 6 | intercept 1 year 6 / subject 0 0 1 0 cl;
run;



Best Linear Unbiased Prediction

Look closer at Estimate statement

 estimate 'popn avg slope' year 1 / cl;
 estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
 estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;

 estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
 estimate 'id (73) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 1 0 cl;
 estimate 'id (496) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 0 0 1 0 cl;

Coefficients to right of vertical bar ( | ) apply to random effects – this is a new idea

BLUP - - - estimation (prediction) of random effects



Selected Estimates from Random Coeff BMI Model

Estimates

Label   Estimate   Standard Error   DF   Lower   Upper

popn avg slope 0.6870 0.1214 31.57 0.4396 0.9344

id (73) specific slope 0.5262 0.3833 18.35 -0.2779 1.3303

id (496) specific slope 0.6146 0.3833 18.35 -0.1895 1.4187

popn avg intercept 21.3577 0.6459 31.5 20.0413 22.6742

predicted bmi in 1997 21.3577 0.6459 31.5 20.0413 22.6742

id (73) specific intercept 23.4601 1.4916 33.36 20.4266 26.4935

id (496) specific intercept 19.5440 1.4916 33.36 16.5105 22.5775

predicted bmi in 2000 23.4186 0.7330 31.99 21.9255 24.9117

id (73) specific 2000 bmi 25.0387 0.9928 9.56 22.8127 27.2646

id (496) specific 2000 bmi 21.3878 0.9928 9.56 19.1618 23.6138

predicted bmi in 2003 25.4795 0.9605 31.84 23.5226 27.4365

id (73) specific 2003 bmi 26.6173 1.5462 20.15 23.3936 29.8410

id (496) specific 2003 bmi 23.2316 1.5462 20.15 20.0079 26.4553
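The subject-specific rows are just the fixed-effect estimates plus that subject's predicted random effects. A minimal Python sketch, assuming year is coded 0 = 1997, 3 = 2000, 6 = 2003 as the ESTIMATE labels suggest; the dictionary layout and the function name predict are illustrative, and the numbers are copied from the fixed-effect and random-effect solutions reported for this model.

```python
# Fixed effects (population average) from the random coefficient BMI model
fixed = {"intercept": 21.3577, "year": 0.6870}

# Predicted random effects (BLUPs) from the partial "Solution for Random Effects" listing
blup = {
    73: {"intercept": 2.1023, "year": -0.1608},
    496: {"intercept": -1.8137, "year": -0.07237},
}


def predict(year, subject=None):
    """Population-average prediction, or subject-specific if a subject id is given."""
    b0, b1 = fixed["intercept"], fixed["year"]
    if subject is not None:
        b0 += blup[subject]["intercept"]   # coefficient right of the | on intercept
        b1 += blup[subject]["year"]        # coefficient right of the | on year
    return b0 + b1 * year


for yr, label in [(0, 1997), (3, 2000), (6, 2003)]:
    print(label, round(predict(yr), 4), round(predict(yr, 73), 4), round(predict(yr, 496), 4))
```

The results match the Estimates table up to rounding of the printed coefficients (for example about 25.04 for id 73 in 2000), which is exactly what the coefficients on either side of the vertical bar request.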



Inference Space Example II:

Workers and machines
From McLean, Sanders & Stroup (1991, American Statistician)
Also Chapter 6, ex 2, SAS for Mixed Models
2 machines
3 operators (sample from population)
inference can apply to population of workers or specific worker
KEY CONCEPT: Inference Space



Worker-Machine Example: Fixed Effect Inference

proc glimmix;
 class machine operator;
 model y=machine / ddfm=kr;
 random operator machine*operator;
 lsmeans machine / diff;
 estimate 'BLUE - machine 1' intercept 1 machine 1 0;
 estimate 'BLUE - diff' machine 1 -1;

Effect    Num DF   Den DF   F Value   Pr > F
machine   1        2        20.26     0.0460

based on MS(mach) / MS(Mach*oper)

machine Least Squares Means

machine   Estimate   Std Error   DF      t Value   Pr > |t|
1         50.9483    0.2467      2.973   206.50    <.0001
2         51.9567    0.2467      2.973   210.59    <.0001

Differences of machine Least Squares Means

machine   _machine   Estimate   Std Error   DF   t Value   Pr > |t|
1         2          -1.0083    0.2240      2    -4.50     0.0460

these ESTIMATE statements give same result



Worker-Machine Example: Prediction

estimate 'BLUP - m1 narrow' intercept 3 machine 3 0 | operator 1 1 1 machine*operator 1 1 1 0 0 0 / divisor=3;
estimate 'BLUP - diff nrw' machine 3 -3 | machine*operator 1 1 1 -1 -1 -1 / divisor=3;

estimate 'BLUP - oper 1' intercept 2 machine 1 1 | operator 2 0 0 machine*operator 1 0 0 1 0 0 / divisor=2;
estimate 'BLUP - m1 op1' intercept 1 machine 1 0 | operator 1 0 0 machine*operator 1 0 0 0 0 0;
estimate 'BLUP - diff op1' machine 1 -1 | machine*operator 1 0 0 -1 0 0;

these statements apply inference to specific workers or worker-machine combinations
• machine 1 averaged over ONLY THE WORKERS IN THE STUDY
• diff between machines for workers in study ONLY
• operator 1 averaged over machines, with machine 1 only, oper-specific difference between machines



Worker-Machine Example: Prediction (2)

Estimates

Label   Estimate   Standard Error   DF   t Value   Pr > |t|

BLUE - machine 1 50.9483 0.2467 2.973 206.50 <.0001

BLUE - diff -1.0083 0.2240 2 -4.50 0.0460

BLUP - m1 narrow 50.9483 0.08993 6 566.53 <.0001

BLUP - diff nrw -1.0083 0.1272 6 -7.93 0.0002

BLUP - oper 1 51.7366 0.1151 6.698 449.30 <.0001

BLUP - m1 op1 51.2979 0.1724 7.885 297.48 <.0001

BLUP - diff op1 -0.8773 0.2567 7.976 -3.42 0.0092

BLUE – inference to population of workers
BLUP – inference to specific worker or set of workers

note impact on standard error
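The gap between the broad (BLUE) and narrow (BLUP) standard errors comes from which variance components are allowed to contribute. The Python sketch below is a hedged illustration using the variance component estimates reported for this example (operator 0.1073, machine*operator 0.05100, residual 0.04852). It assumes a balanced layout with 3 operators and 2 observations per machine-operator cell; the replicate count is an assumption inferred from the reported standard errors, not stated on the slide.

```python
import math

# Variance component estimates reported for the worker-machine example
var_operator = 0.1073
var_mach_oper = 0.05100
var_resid = 0.04852

n_oper = 3   # operators in the study
n_rep = 2    # ASSUMPTION: observations per machine-operator cell (not stated on the slide)

# Broad inference: operators are a sample, so their variation adds uncertainty
se_mean_broad = math.sqrt((var_operator + var_mach_oper) / n_oper
                          + var_resid / (n_oper * n_rep))
se_diff_broad = math.sqrt(2 * (var_mach_oper / n_oper + var_resid / (n_oper * n_rep)))

# Narrow inference: only these operators, so only residual variation remains
se_mean_narrow = math.sqrt(var_resid / (n_oper * n_rep))
se_diff_narrow = math.sqrt(2 * var_resid / (n_oper * n_rep))

print(f"machine 1 mean: broad {se_mean_broad:.4f}, narrow {se_mean_narrow:.5f}")
print(f"machine diff:   broad {se_diff_broad:.4f}, narrow {se_diff_narrow:.4f}")
```

Under these assumptions the four values line up with the reported 0.2467, 0.08993, 0.2240 and 0.1272. The operator variance cancels from the broad difference because both machines are measured on the same operators.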



BLUP a.k.a. “Shrinkage Estimator”

BLUP is regressed toward mean

BLUP is E(u|Y)

Degree of shrinkage depends on variance component estimates

Cov Parm           Estimate
operator           0.1073
machine*operator   0.05100
Residual           0.04852

e.g. operator BLUP is

$\hat{E}(o_j \mid y) = Cov(o_j, \bar{y}_{\cdot j \cdot})\,[Var(\bar{y}_{\cdot j \cdot})]^{-1}\,(\bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot\cdot})$
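The shrinkage can be checked by hand for the 'BLUP - oper 1' estimate. This Python sketch is an illustration under stated assumptions: a balanced design with 2 machines and 2 observations per machine-operator cell (the replicate count is inferred, not given on the slide), the raw operator 1 mean of 51.7625 taken from the PROC GLM LSMEAN for this example, and the grand mean taken as the average of the two machine least squares means.

```python
# Variance component estimates from the Cov Parm table
var_operator = 0.1073
var_mach_oper = 0.05100
var_resid = 0.04852

n_mach = 2   # machines
n_rep = 2    # ASSUMPTION: observations per machine-operator cell

grand_mean = (50.9483 + 51.9567) / 2   # average of the two machine LS-means
oper1_mean = 51.7625                   # raw operator 1 mean (PROC GLM LSMEAN)

# Variance of an operator mean, and the part of it due to operator-level effects
var_oper_mean = var_operator + var_mach_oper / n_mach + var_resid / (n_mach * n_rep)
signal = var_operator + var_mach_oper / n_mach

# Cov / Var ratio: the fraction of the raw deviation that is kept
shrink = signal / var_oper_mean
blup_oper1 = grand_mean + shrink * (oper1_mean - grand_mean)

print(f"shrinkage factor {shrink:.4f}, operator 1 BLUP {blup_oper1:.4f}")
```

With these inputs the raw deviation of 0.31 is pulled back by a factor of about 0.92, giving roughly 51.74, in line with the 'BLUP - oper 1' estimate of 51.7366. Larger residual variance or fewer observations per operator would shrink the prediction further toward the grand mean.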



Relationship to Proc GLM

proc glm;
 class machine operator;
 model y=machine|operator;
 random operator machine*operator / test;
 lsmeans machine operator machine*operator / stderr;
 lsmeans machine / stderr e=machine*operator;
 estimate 'diff' machine 1 -1 / e;
run;

operator   y LSMEAN      Standard Error
1          51.7625000    0.1101420
vs. 51.74, 0.1151

machine   operator   y LSMEAN      Standard Error
1         1          51.3550000    0.1557642
vs. 51.30, 0.1724

machine   y LSMEAN      Standard Error
1         50.9483333    0.1583947
std error is neither the Mixed broad nor narrow value; produced by
estimate “m1” intercept 3 machine 3 0 | operator 1 1 1 machine*operator 0 / divisor=3

machine   y LSMEAN      Standard Error
1         50.9483333    0.0899305
same as BLUP specific to workers in GLIMMIX



Prediction Example II: Multi-Location Data

From SAS for Mixed Models
9 Locations
3 blocks per location
4 treatments
Major issues
− are blocks fixed or random?
− if random, how does one estimate location-specific treatment effects?



ANOVA (ignoring block)

Source      d.f.   Expected Mean Square
Treatment   3      $\sigma^2 + k_1\sigma_{LT}^2 + Q_{TRT}$
Location    8      $\sigma^2 + k_1\sigma_{LT}^2 + k_2\sigma_L^2$
Loc × Trt   24     $\sigma^2 + k_1\sigma_{LT}^2$
error       dfe    $\sigma^2$

If Location fixed:

Source      d.f.   Expected Mean Square
Treatment   3      $\sigma^2 + Q_{TRT}$
Location    8      $\sigma^2 + Q_{LOC}$
Loc × Trt   24     $\sigma^2 + Q_{LT}$
error       dfe    $\sigma^2$

Test of TRT affected



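Inference Space

Assuming Locations are Fixed

$Var(\text{trt mean}) = \dfrac{\sigma^2}{\#\,\text{obs/trt}}$

$\text{Std. error(trt mean)} = \sqrt{\dfrac{MS(\text{error})}{\#\,\text{obs/trt}}}$

HOWEVER... if Locations are Random

$Var(\text{trt mean}) = \dfrac{\sigma^2 + k(\sigma_L^2 + \sigma_{LT}^2)}{\#\,\text{obs/trt}}$

$\text{Std. error(trt mean)} = \sqrt{\dfrac{\hat{\sigma}^2 + k(\hat{\sigma}_L^2 + \hat{\sigma}_{LT}^2)}{\#\,\text{obs/trt}}}$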



Where does Uncertainty Arise?

[Figure: observations grouped by location: Loc 1, Loc 2, ..., Loc 7, Loc 8]

Only from variation among obs within locations? (Locations fixed)

Or does variation among locations also contribute? (Locations random)



Location-Specific Effects: BLUP

In Multi-Location trial, location-specific effect is, e.g., trt 1 vs trt 2 | location $j$

$= \tau_1 - \tau_2 + (L\tau)_{1j} - (L\tau)_{2j}$

Implies linear combination of fixed and random effect (predictable function = BLUP)



Basic SAS Programs

for fixed location:

proc glimmix data=MultiCenter;
 class location block treatment;
 model response=location treatment location*treatment;
 random block(location);
 lsmeans treatment;
 lsmeans location*treatment / slice=location slicediff=location;
run;

for random locations:

proc glimmix data=MultiCenter;
 class location block treatment;
 model response=treatment / ddfm=KR;
 random location block(location) location*treatment;
 lsmeans treatment / diff;
 estimate 'trt1 vs trt2' treatment 1 -1 0;
 estimate 'loc A vs loc B' | location 1 -1 0;
 estimate 'trt 1 BLUP' intercept 8 treatment 8 | location 1 1 1 1 1 1 1 1 / divisor=8;
 estimate 'trt1 at loc A blup' intercept 1 treatment 1 0 0 0 | location 1 0 location*treatment 1 0;

etc – see ch6 MultiCenter.sas for program in detail



“Take Home” points

Inference space usually implies random locations
“Broad” inference on treatments applies to entire population
Location-specific inference may be of interest
Requires BLUP
Hans Peter Piepho has proposed mixed-model based measures of commonality among locations
Making locations fixed to maximize error d.f. to test TRT is inappropriate



GLM Issues



VII. “GLM” Issues

Bernoulli data

−as a binomial

−special problems with BINARY data

Counts

Rates



Common Non-Normal Models

Bernoulli (binary) observations
Categorical data
− Binomial
− multinomial
Counts
− Poisson
− Over dispersed (e.g. negative binomial)
Rates
Survival times
− Gamma, Weibull
Dispersion measures
− variance
Contingency tables



Elements of GLM (Generalized Linear Model)

Systematic model: $X\beta$
Assumed distribution
− implied variance structure
Link function
Examples
□ y ~ Bernoulli(p): $p = \Phi(X\beta)$ or $\mathrm{logit}(p) = X\beta$
□ Y ~ Poisson($\lambda$): $\log(\lambda) = X\beta$



GLM Example

From SAS for Linear Models

Output 10.1, re-expressed in 10.5

Challenger space shuttle data

relate prob{failure} to temperature at launch

DATA: TEMP, TD (# times thermal distress in O-ring), NO_TD



Approach to modeling

Assess relationship between TEMP and Prob{TD=1}, i.e. O-rings show thermal distress
Distribution: Bernoulli
Natural parameter: logit = log[p/(1-p)]
Model: logit(Pr{TD})=a+b(Temp)
Inverse link form: Pr{TD}=exp[a+b(Temp)]/{1+exp[a+b(Temp)]}



SAS Program: Proc GLIMMIX

proc glimmix data=Challenger;
 model td/total=temp;
 estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
 estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
 estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
 estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
 estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
 estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;



Relevant Output

$\mathrm{logit}(\pi) = 15.04 - 0.23X$

where $\pi = \Pr\{TD = 1\}$ and $X$ = temp (°F)

Fit Statistics

Pearson Chi-Square 11.13

Pearson Chi-Square / DF 0.80

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   15.0429    7.3786           14   2.04      0.0608
temp        -0.2322    0.1082           14   -2.14     0.0500

Pearson Chi-Square / DF = 0.80: no evidence of overdispersion



Relevant Output (2)

Estimates

Label               Estimate   Standard Error   DF   t Value   Pr > |t|   Mean      Standard Error Mean
logit at 50 deg     3.4348     2.0232           14   1.70      0.1117     0.9688    0.06121
logit at 60 deg     1.1131     1.0259           14   1.09      0.2962     0.7527    0.1909
logit at 64.7 deg   0.02197    0.6576           14   0.03      0.9738     0.5055    0.1644
logit at 64.8 deg   -0.00125   0.6518           14   -0.00     0.9985     0.4997    0.1630
logit at 70 deg     -1.2085    0.5953           14   -2.03     0.0618     0.2300    0.1054
logit at 80 deg     -3.5301    1.4140           14   -2.50     0.0256     0.02847   0.03911

Estimate, Standard Error: logit scale. Mean, Standard Error Mean: data scale.
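The data-scale columns follow from the logit-scale columns through the inverse link. A minimal Python sketch: the inverse logit gives the Mean, and a delta-method approximation, SE(mean) ≈ p(1-p) × SE(logit), gives the Standard Error Mean. The helper names are illustrative; the numbers are copied from the Estimates and Parameter Estimates output for this model.

```python
import math


def inv_logit(eta):
    """Inverse link: probability on the data scale from a logit-scale estimate."""
    return 1.0 / (1.0 + math.exp(-eta))


def se_data_scale(eta, se_eta):
    """Delta-method standard error of the mean: d p / d eta = p (1 - p)."""
    p = inv_logit(eta)
    return p * (1.0 - p) * se_eta


# (label, logit estimate, logit standard error) from the Estimates table
rows = [("50 deg", 3.4348, 2.0232), ("70 deg", -1.2085, 0.5953), ("80 deg", -3.5301, 1.4140)]
for label, eta, se in rows:
    print(f"{label}: Pr(TD) = {inv_logit(eta):.4f}, SE = {se_data_scale(eta, se):.4f}")

# Temperature at which Pr(TD) = 0.5, i.e. logit = 0, from the parameter estimates
temp_half = 15.0429 / 0.2322
print(f"Pr(TD) = 0.5 at about {temp_half:.1f} deg F")
```

The crossover near 64.8 °F explains why the 64.7 and 64.8 degree estimates straddle a logit of zero.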



Alternatives

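Express data in binomial form
− SAS for Linear Models, 4th ed., output 10.5

Probit link

link function is $\eta = \Phi^{-1}(\pi) = X\beta$

inverse link is $\pi = \Phi(X\beta) = \int_{-\infty}^{X\beta} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$

where $\Phi$ is the std normal c.d.f.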



Logit vs Probit

[Figure: fitted Pr{TD} vs temperature. Red: probit; Blue: logit]



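Probit Model

proc glimmix data=Challenger;
 model td/total=temp / link=probit solution;
 estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
 estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
 estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
 estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
 estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
 estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;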



Probit Output

Fit Statistics

Pearson Chi-Square / DF   0.78

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   8.7750     4.0286           14   2.18      0.0470
temp        -0.1351    0.05839          14   -2.31     0.0364

Estimates (labels carried over from the logit program; estimates are on the probit scale)

Label               Estimate   Standard Error   DF   t Value   Pr > |t|   Mean      Standard Error Mean
logit at 50 deg     2.0201     1.1413           14   1.77      0.0985     0.9783    0.05917
logit at 60 deg     0.6692     0.6024           14   1.11      0.2854     0.7483    0.1921
logit at 64.7 deg   0.03421    0.3960           14   0.09      0.9324     0.5136    0.1579
logit at 64.8 deg   0.02070    0.3925           14   0.05      0.9587     0.5083    0.1566
logit at 70 deg     -0.6818    0.3244           14   -2.10     0.0541     0.2477    0.1026
logit at 80 deg     -2.0328    0.7277           14   -2.79     0.0144     0.02104   0.03678
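For the probit link the inverse link is the standard normal c.d.f., and the data-scale standard error again follows from a delta-method approximation, this time with the normal density as the derivative. A minimal Python sketch using only the standard library; the function names are illustrative and the inputs are the probit-scale estimates reported for this model.

```python
import math


def std_normal_cdf(z):
    """Phi(z), the probit inverse link, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def std_normal_pdf(z):
    """phi(z), the derivative of Phi, used for the delta-method standard error."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def se_data_scale(eta, se_eta):
    """Delta-method standard error of the mean on the data scale."""
    return std_normal_pdf(eta) * se_eta


# (label, probit estimate, probit standard error) from the Estimates table
rows = [("50 deg", 2.0201, 1.1413), ("70 deg", -0.6818, 0.3244), ("80 deg", -2.0328, 0.7277)]
for label, eta, se in rows:
    print(f"{label}: Pr(TD) = {std_normal_cdf(eta):.4f}, SE = {se_data_scale(eta, se):.4f}")
```

The probit and logit estimates differ on the link scale (2.02 vs 3.43 at 50 degrees), but the data-scale means are close (0.978 vs 0.969), which is the usual reason the choice between the two links matters little in practice.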



Option 3: Use Binary Data

proc glimmix data=O_Ring;
 * Careful!! Normal is the default distribution, so this statement alone fits a normal model:
   model td_bin=temp / solution;
 model td_bin=temp / dist=binomial link=logit solution;
 estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
 estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
 estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
 estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
 estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
 estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;



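Binary Output

Fit Statistics

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   15.0429    7.3786           21   2.04      0.0543
temp        -0.2322    0.1082           21   -2.14     0.0438

Estimates

Label   Estimate   Standard Error   DF   t Value   Pr > |t|   Mean   Standard Error Mean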

logit at 50 deg 3.4348 2.0232 21 1.70 0.1043 0.9688 0.06121

logit at 60 deg 1.1131 1.0259 21 1.09 0.2902 0.7527 0.1909

logit at 64.7 deg 0.02197 0.6576 21 0.03 0.9737 0.5055 0.1644

logit at 64.8 deg -0.00124 0.6518 21 -0.00 0.9985 0.4997 0.1630

logit at 70 deg -1.2085 0.5953 21 -2.03 0.0552 0.2300 0.1054

logit at 80 deg -3.5301 1.4140 21 -2.50 0.0209 0.02847 0.03911

no evidence of overdispersion



Binary Data + Random Effects

Binary data in GLM with random effect can be troublesome

Pseudo-likelihood tends to produce biased variance / covariance component estimates

e.g. variance estimates biased down for small cluster size

Larger sample sizes tend to be required

No overdispersion estimate



Binary GLMM example

courtesy of Oliver Schabenberger

200 subjects
random intercept
logistic link

data binary;
 do subject = 1 to 200;
  ranint = rannor(&seed);
  do i = 1 to &n;
   linp = &b0 + ranint;
   pi = 1/(1 + exp(-linp));
   y = ranbin(0,1,pi);
   output;
  end;
 end;
 drop i;
run;



Binary GLMM

Schabenberger used two programs

proc glimmix data=binary;
 class subject;
 model y(event='1') = / dist=binary link=logit s;
 random intercept / subject=subject;
 ods select ParameterEstimates CovParms;
run;

proc nlmixed data=binary;
 parms s2 1 intercept -1;
 model y ~ binary(1/(1+exp(-intercept+gamma)));
 random gamma ~ normal(0,s2) subject=subject;
 ods select Dimensions ParameterEstimates;
run;



GLIMMIX vs NLMIXED Binary Results

cluster size n=4

Procedure   Parameter                      Estimate   Standard Error   DF
GLIMMIX     Intercept variance (subject)   0.5251     0.1699
GLIMMIX     Intercept                      -0.7159    0.09211          199
NLMIXED     s2                             0.8159     0.2718           199
NLMIXED     intercept                      -0.8092    0.1085           199

cluster size n=20

Procedure   Parameter                      Estimate   Standard Error   DF
GLIMMIX     Intercept variance (subject)   0.9905     0.1373
GLIMMIX     Intercept                      -0.9239    0.08020          199
NLMIXED     s2                             1.1512     0.1659           199
NLMIXED     intercept                      -0.9854    0.08691          199



Diagnostics & Alternative Models

Example using count data
SAS Linear Models, Output 10.24
Historically, count data assumed ~ Poisson
Implies mean=variance
In practice, often variance>mean: overdispersion
Requires modification
− scale to correct std error, test statistics for overdispersion
− use different distribution



Basic analysis + model checking

Model checking plots:
1. Residuals vs pred
 a. use std resid
 b. or deviance res
 c. std’ize pred scale
 look for unequal scatter (wrong dist or var fct), pattern in resid (wrong model or link)
2. y* vs. (xbeta): linear, or wrong link

proc glimmix data=a;
 class BLOCK CTL_TRT a b;
 model count=CTL_TRT a b a*b / dist=poisson;
 random intercept / subject=BLOCK;
 output out=check pred=xbeta pred(ilink)=pred residual=r pearson=resid_pearson;
run;

data plot;
 merge check;
 adjlamda=2*sqrt(pred);
 ystar=xbeta+(count-pred)/pred;
 absres=abs(resid_pearson);

proc gplot;
 plot resid_pearson*(pred xbeta);
 plot (resid_pearson)*adjlamda;
 plot ystar*xbeta;
 plot absres*adjlamda;
run;



Evidence of Overdispersion

Gener. chi-square / DF should be about 1
>1 indicates overdispersion
<1 indicates underdispersion

Fit Statistics

-2 Res Log Pseudo-Likelihood   124.06
Generalized Chi-Square         100.15
Gener. Chi-Square / DF         3.34



Example: plot of residuals x adjlamda



Another look – absolute value resid vs adjlamda



Link? Plot ystar x XBeta

should be linear – no strong evidence of problem



Strategy 1: Adjust using scale parameter

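Poisson log-likelihood is $y\log(\lambda) - \lambda - \log(y!)$

$E(y) = \lambda = Var(y)$

Quasi-likelihood allows scale parameter $\phi$:

$Q = \int_y^{\lambda} \frac{y - t}{\phi\, t}\, dt$, which equals $\frac{y\log(\lambda) - \lambda}{\phi}$ up to a term not involving $\lambda$

Now, $E(y) = \lambda$ and $Var(y) = \phi\lambda$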



Implementation with GLIMMIX

SCALE estimated from RANDOM _RESIDUAL_:

$\hat{\phi} = \dfrac{\text{Generalized } \chi^2}{N - rank(X)}$

alternatively can use $\dfrac{\text{deviance}}{N - rank(X)}$

proc glimmix data=a;
 class BLOCK CTL_TRT a b;
 model count=CTL_TRT a b a*b / dist=poisson htype=1,3;
 random intercept / subject=BLOCK;
 random _residual_;
run;



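Selected Output

UnScaled

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   1        27       55.83     <.0001

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   0        .        .         .
A         2        27       9.19      0.0009
B         2        27       0.06      0.9402
A*B       4        27       3.11      0.0315

Scaled

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   1        27       16.23     0.0004

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   0        .        .         .
A         2        27       2.67      0.0875
B         2        27       0.02      0.9822
A*B       4        27       0.90      0.4753

Note discrepancy for CTL_TRT and A main effect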
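A rough way to read the scaled tests: each F statistic is deflated by approximately the estimated scale parameter. The Python sketch below is a hedged back-of-envelope check, not what GLIMMIX computes internally. It takes the scale as Generalized Chi-Square / DF from the unscaled Poisson fit and divides the unscaled F values by it. The match is only approximate, presumably because the scaled fit re-estimates the other model parameters.

```python
# Fit statistics from the unscaled Poisson fit
gen_chisq = 100.15
gen_chisq_per_df = 3.34

# Recover the integer DF implied by the two printed values
df = round(gen_chisq / gen_chisq_per_df)
phi = gen_chisq / df   # estimated scale (overdispersion) parameter

# F values copied from the unscaled and scaled outputs
unscaled_f = {"CTL_TRT": 55.83, "A": 9.19, "B": 0.06, "A*B": 3.11}
reported_scaled_f = {"CTL_TRT": 16.23, "A": 2.67, "B": 0.02, "A*B": 0.90}

# Back-of-envelope approximation: scaled F is roughly unscaled F / phi
approx_scaled_f = {effect: f / phi for effect, f in unscaled_f.items()}

print(f"DF = {df}, phi = {phi:.3f}")
for effect in unscaled_f:
    print(f"{effect:8s} approx {approx_scaled_f[effect]:6.2f}   reported {reported_scaled_f[effect]:6.2f}")
```

The practical message is the same as on the slide: with a scale of about 3.3, effects that looked clearly significant under the plain Poisson model (A, A*B) no longer are once overdispersion is accounted for.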



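Alternative 2: different distribution e.g. Negative Binomial

Standard math-stat text form: $P(y) = \dfrac{(y+N-1)!}{y!\,(N-1)!}\,\pi^N (1-\pi)^y$

More useful form: let $k = N$ and $\lambda = N(1-\pi)/\pi$; yields p.d.f.

$f(y) = \dfrac{(y+k-1)!}{y!\,(k-1)!} \left(\dfrac{\lambda}{\lambda+k}\right)^{y} \left(\dfrac{k}{\lambda+k}\right)^{k}$

$\log L = y\log\left(\dfrac{\lambda}{\lambda+k}\right) + k\log\left(\dfrac{k}{\lambda+k}\right) + \log\dfrac{(y+k-1)!}{y!\,(k-1)!}$

not expon family, but quasi-likelihood

$E(y) = \lambda$, $Var(y) = \lambda + \lambda^2/k$; natural param is $\log\left(\dfrac{\lambda}{\lambda+k}\right)$

$\lambda$ is the mean and k is the aggregation parameter
small k: aggregation; k → ∞: Poisson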
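The mean-aggregation parameterization can be checked numerically. This Python sketch is illustrative only: it evaluates the negative binomial probability function with mean lam and aggregation parameter k (factorials written with the log-gamma function so that k need not be an integer) and confirms that the probabilities sum to 1 with mean lam and variance lam + lam²/k. The values lam = 5 and k = 2 are hypothetical, not estimates from the course data.

```python
import math


def negbin_logpmf(y, lam, k):
    """Log probability of count y for a negative binomial with mean lam, aggregation k."""
    return (math.lgamma(y + k) - math.lgamma(y + 1) - math.lgamma(k)
            + y * math.log(lam / (lam + k))
            + k * math.log(k / (lam + k)))


def moments(lam, k, y_max=500):
    """Total probability, mean and variance by direct summation up to y_max."""
    probs = [math.exp(negbin_logpmf(y, lam, k)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


# Hypothetical illustration values
total, mean, var = moments(lam=5.0, k=2.0)
print(f"sum = {total:.6f}, mean = {mean:.4f}, variance = {var:.4f}")
print(f"Poisson variance would be 5.0; lam + lam^2/k = {5.0 + 25.0 / 2.0}")
```

Rerunning with a large k (say 1000) brings the variance back toward the mean, which is the Poisson limit noted above.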



Negative Binomial with GLIMMIX

proc glimmix data=a;
 class BLOCK CTL_TRT a b;
 model count=CTL_TRT a b a*b / dist=negbin htype=1,3;
 random intercept / subject=BLOCK;
run;

Fit Statistics

-2 Res Log Pseudo-Likelihood   84.48

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   1        27       10.08     0.0037

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT   0        .        .         .
A         2        27       3.53      0.0436
B         2        27       0.03      0.9753
A*B       4        27       1.02      0.4139



Modeling with Offsets

There are cases when modeling count alone is naive

This occurs when counts are “per unit”

− Number of plants per plot

− Number of patients per county

− Number of students per district

− Number of boating accidents per year per lake

− Number of defects per lot

Accurate model must take units into account

Essentially, based on log(count/unit)

Log(count) is link; log(unit) is “offset”



Offset defined

Idea: raw count may be artifact of unit size

Count / unit more informative

Offset

−adjusts for size

− is a regressor whose coefficient is assumed to be 1.0

−used especially in conjunction with Poisson models with log link

−accounts for heterogeneity in rates resulting from difference in size



Modeling with Offsets

$y_i \sim Poisson(\lambda_i \cdot size_i)$

$\lambda_i = \exp(\eta_i)$ = rate per unit size

$\log E(y_i) = \log(\lambda_i) + \log(size_i) = X_i\beta + offset_i$



Example: Courtesy of Oliver Schabenberger

Some of the data

X is predictor variable

SIZE is the “unit” to be taken into account

Obs size x count

1 5001 4.597 4

2 7550 4.245 76

3 1744 3.918 2

4 1451 3.273 2

5 5313 4.140 12

6 3687 3.438 4

7 3022 4.763 2

8 8809 4.445 9

9 4436 4.191 3

10 2621 4.835 6



Naive Modeling (not accounting for SIZE)

proc glimmix data=test; model count = x / s dist=poisson; ods select FitStatistics ParameterEstimates;run;

Fit Statistics

-2 Log Likelihood   647.12

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   2.0978     0.4143           38   5.06      <.0001
x           -0.01619   0.1002           38   -0.16     0.8725



Poisson Model with Offset

proc glimmix data=test; offs = log(size); model count = x /s dist=poisson offset=offs; ods select FitStatistics ParameterEstimates;run;

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept   -7.3168    0.5052           38   -14.48    <.0001
x           0.2247     0.1225           38   1.83      0.0746



Alternative to Offset?? Could count/size be treated as binomial?

proc glimmix data=test; offs = log(size); model count = x /s dist=poisson offset=offs; output out=gmxout1 pred(ilink)=mu; id _xbeta_ offs _linp_; ods exclude all;run;

proc glimmix data=test; model count/size = x /s dist=binomial; output out=gmxout2 pred(ilink)=prob; ods exclude all;run; data gmxout2; set gmxout2; predcount= prob * size;



Compare Poisson/Offset vs Binomial Results

Obs _xbeta_ offs _linp_ mu

1 -6.28394 8.51739 2.23346 9.3321

2 -6.36302 8.92930 2.56628 13.0173

3 -6.43649 7.46394 1.02745 2.7939

4 -6.58140 7.28001 0.69860 2.0109

5 -6.38661 8.57791 2.19130 8.9468

6 -6.54433 8.21257 1.66823 5.3028

7 -6.24664 8.01367 1.76703 5.8535

8 -6.31809 9.08353 2.76544 15.8860

9 -6.37516 8.39751 2.02235 7.5561

10 -6.23047 7.87131 1.64085 5.1595

Poisson results (table above): MU = pred count

Binomial results:

Obs size x count prob predcount

1 5001 4.597 4 .001866023 9.3320

2 7550 4.245 76 .001724158 13.0174

3 1744 3.918 2 .001602034 2.7939

4 1451 3.273 2 .001385890 2.0109

5 5313 4.140 12 .001683963 8.9469

6 3687 3.438 4 .001438241 5.3028

7 3022 4.763 2 .001936911 5.8533

8 8809 4.445 9 .001803387 15.8860

9 4436 4.191 3 .001703368 7.5561

10 2621 4.835 6 .001968487 5.1594

predicted counts nearly identical
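The arithmetic behind the first row of both tables can be retraced by hand. The DATA step below is a sketch using only numbers printed on the slides (the Poisson-with-offset estimates, and x, size, and prob for observation 1). Because the estimates are rounded, the results should land close to, but not exactly on, the tabled _xbeta_, _linp_, mu, and predcount values.

```sas
/* Values copied from the slides for observation 1 */
data offset_check;
  b0 = -7.3168;  b1 = 0.2247;   /* Poisson-with-offset estimates */
  x  = 4.597;    size = 5001;   /* observation 1 */
  xbeta = b0 + b1*x;            /* linear predictor without offset */
  offs  = log(size);            /* offset: coefficient fixed at 1 */
  linp  = xbeta + offs;         /* full linear predictor */
  mu    = exp(linp);            /* predicted count, Poisson/offset */
  rate  = exp(xbeta);           /* predicted count per unit of size */
  prob  = 0.001866023;          /* binomial fit, observation 1 */
  predcount = prob*size;        /* predicted count, binomial */
run;

proc print data=offset_check noobs;
run;
```

Note that rate and prob are nearly the same number, which is why the two approaches give nearly identical predicted counts when the probabilities are this small.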



ZIP and Hurdle Models

Mixture models for count data
−ZIP = “zero-inflated Poisson”
−ZINB = “zero-inflated Negative Binomial”
− in principle, other zero-inflated models limited only by imagination

Accommodate excess zeros
−Excess zeros cause overdispersion

Are not in exponential family

Cannot be fit with PROC GLIMMIX

Can be fit using PROC NLMIXED



ZIP Model

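$y_i$ = Observation; $z_i \sim Poisson(\lambda_i)$

$$\Pr(y_i = j) = \begin{cases} \pi_i + (1-\pi_i)\Pr(z_i = 0) & j = 0 \\ (1-\pi_i)\Pr(z_i = j) & j > 0 \end{cases} \;=\; \begin{cases} \pi_i + (1-\pi_i)\,e^{-\lambda_i} & j = 0 \\ (1-\pi_i)\,\dfrac{e^{-\lambda_i}\lambda_i^{\,j}}{j!} & j > 0 \end{cases}$$

$\pi_i$ = prob of 0 from Bernoulli process

$(1-\pi_i)\,e^{-\lambda_i}$ = prob of zero from Poisson process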
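The claim that excess zeros cause overdispersion can be seen from the first two moments of a ZIP variable. The short derivation below is a standard result, written here with $\pi$ for the zero-inflation probability and $\lambda$ for the Poisson mean (subscripts dropped); that notation is an assumption about the slide's symbols.

```latex
E(y) = (1-\pi)\lambda, \qquad
E(y^2) = (1-\pi)(\lambda + \lambda^2)
\\[4pt]
Var(y) = (1-\pi)(\lambda + \lambda^2) - (1-\pi)^2\lambda^2
       = (1-\pi)\lambda\,(1 + \pi\lambda)
\\[4pt]
\frac{Var(y)}{E(y)} = 1 + \pi\lambda \;>\; 1 \quad \text{whenever } \pi > 0
```

So the variance-to-mean ratio exceeds the Poisson value of 1 by $\pi\lambda$, and the excess grows with both the inflation probability and the Poisson mean.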



Hurdle Model

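Two part model
−One process generates zeros
−Another process generates non-zeros

$$\Pr(y_i = j) = \begin{cases} \Pr(z_i = 0) & j = 0 \\ \left[1 - \Pr(z_i = 0)\right] \dfrac{\Pr(u_i = j)}{1 - \Pr(u_i = 0)} & j > 0 \end{cases}$$

$y_i$ = observation

$\Pr(z_i = 0)$ = zeros from Z process

$\dfrac{\Pr(u_i = j)}{1 - \Pr(u_i = 0)}$ = truncated at zero distribution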



ZIP or Hurdle?

Number of doctor visits per year

Number of fish caught by sport fishermen

Cancer mortality



From SAS for Mixed Models, 2nd ed, Ch 15 (Credit: Oliver Schabenberger)

%let pi = 0.27;
data zip;
  do s = 1 to 100;
    u = rannor(556712);
    do i = 1 to 20;
      x = int(ranuni(0)*100);
      y = int(rannor(0)*100);
      if (ranuni(0) < &pi) then do;
        count = 0;
        lambda = .;
      end;
      else do;
        lambda = exp(-2 + 0.01*x + 0.01*y + u);
        count = ranpoi(0,lambda);
      end;
      output;
    end;
  end;
  drop i u lambda;
run;



ZIP Model with Random Effects

proc nlmixed data=zip;
  parameters b0=0 b1=0 b2=0 a0=0 s2u=1;
  /* linear predictor for the inflation probability */
  linpinfl = a0;
  /* infprob = inflation probability for zeros */
  /*         = logistic transform of the linear predictor */
  infprob = 1/(1+exp(-linpinfl));
  /* Poisson mean */
  lambda = exp(b0 + b1*x + b2*y + u);
  /* Build the ZIP log likelihood */
  if count=0 then
    ll = log(infprob + (1-infprob)*exp(-lambda));
  else
    ll = log((1-infprob)) + count*log(lambda) - lgamma(count+1) - lambda;
  model count ~ general(ll);
  random u ~ normal(0,s2u) subject=s;
  estimate "inflation probability" infprob;
run;



ZIP NLMIXED Selected Results

Parameter Estimates

Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha   Lower      Upper     Gradient
b0          -1.9979    0.1530           99   -13.06    <.0001     0.05    -2.3014    -1.6944   -0.00224
b1          0.01011    0.001299         99   7.78      <.0001     0.05    0.007535   0.01269   -0.15649
b2          0.01016    0.000394         99   25.78     <.0001     0.05    0.009378   0.01094   -0.0434
a0          -1.0934    0.1594           99   -6.86     <.0001     0.05    -1.4097    -0.7771   -0.00034
s2u         1.0828     0.2095           99   5.17      <.0001     0.05    0.6671     1.4985    -0.00145

Additional Estimates

Label                   Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha   Lower    Upper
inflation probability   0.2510     0.02997          99   8.38      <.0001     0.05    0.1915   0.3104

true parameter values: b0=-2, b1=b2=0.01, a0=-0.9946, s2u=1
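The link between the simulation setting (pi = 0.27) and the parameter a0 is the logit transform used in the NLMIXED program. As a small sketch, the DATA step below computes the true a0 from pi and back-transforms the estimated a0; the variable names are made up for this illustration. The two values written to the log should agree with the slide's a0 = -0.9946 and inflation probability 0.2510.

```sas
/* Sketch: logit and inverse-logit for the inflation probability */
data _null_;
  pi_true = 0.27;                         /* value used to simulate the data */
  a0_true = log(pi_true/(1 - pi_true));   /* logit(0.27) */
  a0_hat  = -1.0934;                      /* NLMIXED estimate of a0 */
  pi_hat  = 1/(1 + exp(-a0_hat));         /* estimated inflation probability */
  put a0_true= 8.4 pi_hat= 8.4;
run;
```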



GLMM Multi-Clinic Binomial Data

SAS for Linear Models, Output 10.9

also SAS for Mixed Models, Ch 14

from Beitler & Landis, Biometrics, 1985

2 treatments (drug, cntl)

8 clinics, represent population

nij patients observed on trt i at clinic j

yij have favorable response



GLMM for Beitler Landis Data

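$\pi_{ij} = \Pr\{\text{favorable} \mid \text{trt } i, \text{clinic } j\}$

Model: $\log\left(\dfrac{\pi_{ij}}{1-\pi_{ij}}\right) = \mu + \tau_i + c_j + (ct)_{ij}$

$c_j$ iid $N(0, \sigma^2_C)$; $(ct)_{ij}$ iid $N(0, \sigma^2_{CT})$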

proc glimmix data=a; class clinic trt; model fav/nij= trt/dist=binomial link=logit; random intercept trt / subject=clinic; lsmeans trt/odds; estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink; estimate 'lsm - drug' intercept 1 trt 0 1 / ilink; estimate 'diff' trt 1 -1; contrast 'diff' trt 1 -1;run;



Cov Parm    Subject   Estimate
Intercept   clinic    2.0103
trt         clinic    0.06057



If you drop Clinic x Trt

conditional (SS) model:

proc glimmix data=a;
  class clinic trt;
  model fav/nij= trt/dist=binomial link=logit;
  random intercept / subject=clinic;
  lsmeans trt/odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;

marginal (PA) model:

proc glimmix data=a;
  class clinic trt;
  model fav/nij= trt/dist=binomial link=logit;
  random _residual_ / type=cs subject=clinic;
  lsmeans trt/odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;



Selected Output – Conditional Model


Cov Parm   Estimate   Standard Error
clinic     2.0327     1.2637

Effect   Num DF   Den DF   F Value   Pr > F
trt      1        7        5.98      0.0444

Estimates

Label        Estimate   Standard Error   DF   t Value   Pr > |t|   Mean     Standard Error Mean
lsm - cntl   -1.1464    0.5586           7    -2.05     0.0793     0.2411   0.1022
lsm - drug   -0.4220    0.5552           7    -0.76     0.4720     0.3960   0.1328
diff         -0.7244    0.2963           7    -2.45     0.0444

trt Least Squares Means

trt    Estimate   Standard Error   DF   t Value   Pr > |t|   Odds
cntl   -1.1464    0.5586           7    -2.05     0.0793     0.3178
drug   -0.4220    0.5552           7    -0.76     0.4720     0.6557
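The 'diff' estimate is on the logit scale, so exponentiating it gives an odds ratio. The arithmetic below uses only the numbers in the output above and is a hand check, not additional GLIMMIX output; the same ratio can be recovered from the two Odds values in the LSMeans table.

```latex
\text{diff} = \text{logit}_{\text{cntl}} - \text{logit}_{\text{drug}} = -1.1464 - (-0.4220) = -0.7244
\\[4pt]
\text{odds ratio (cntl vs drug)} = e^{-0.7244} \approx 0.485 \approx \frac{0.3178}{0.6557}
\\[4pt]
\text{odds ratio (drug vs cntl)} = e^{0.7244} \approx 2.06 \approx \frac{0.6557}{0.3178}
```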



GLMM with NLMIXED

1. data step to define indicator for Trt=1 (because NLMIXED lacks CLASS statement)

data a;
  input clinic trt $ fav unfav;
  nij=fav+unfav;
  t1=(trt='drug');

2. then, run NLMIXED

proc nlmixed;
  parms mu=1 tau=0 s2c=2;
  eta=mu+tau*t1+cj;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj~normal(0,s2c) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;



NLMIXED with CxT term included

first, also define Trt=2 indicator, here denoted t2

proc nlmixed;
  parms mu=1 tau=0 s2c=2 s2ct=0.08;
  eta=mu+tau*t1+cj+c1j*t1+c2j*t2;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj c1j c2j~normal([0,0,0],[s2c,0,s2ct,0,0,s2ct]) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;



Binary Repeated Measures

2 treatments

20 subjects (animals) per trt

5 times of measurement

response at each measurement 0/1

suggested by companion animal vaccine trials



Several approaches

GEE using GENMOD

PQL using %GLIMMIX

− random subj(trt), or

− CS

G-H quadrature using NLMIXED

(not shown) but you could use MIXED

type 1 error control of PQL + random subj(trt) not acceptable

power of PQL/CS or NLMIXED > GEE



various SAS pgm for binary rpt-M data

GEE:

proc genmod;
  class trt animal day;
  model y=trt|day/dist=bin type1 type3;
  repeated subject=animal(trt)/ type=exch;

PQL, random an(trt):

proc glimmix;
  class trt animal day;
  model y=trt|day / dist=binomial link=logit;
  random animal(trt);

PQL, CS:

proc glimmix;
  class trt animal day;
  model y=trt|day / dist=binomial link=logit;
  random day / rside type=cs subject=animal(trt);

NLMixed next page



NLMixed

data nlmx;
  set univar;
  t1=(trt=1); t2=(trt=2);
  d1=(day=1); d2=(day=2); d3=(day=3); d4=(day=4); d5=(day=5);

proc nlmixed;
  parms mu=1 a1=1 b1=1 b2=1 b3=1 b4=1
        ab11=1 ab12=1 ab13=1 ab14=1 sb2=1;
  eta=mu+a1*t1+b1*d1+b2*d2+b3*d3+b4*d4+
      ab11*t1*d1+ab12*t1*d2+ab13*t1*d3+ab14*t1*d4;
  pi=exp(eta+bse)/(1+exp(eta+bse));
  model y~binary(pi);
  random bse~normal(0,sb2) subject=id;
  contrast 'trt' a1;
  contrast 'day' b1,b2,b3,b4;
  contrast 'trt x day' ab11,ab12,ab13,ab14;



Poisson Repeated Measures

Output 10.39 SAS for Linear Models

Leppik, et al (1985); Thall & Vail (1990)

2 treatments

28 patients on trt=0; 31 on trt=1

4 times of measurement

epilepsy: # seizures in 4 test periods

baseline & age covariates



Model for seizure data

$\lambda_{ij}$ denote mean count (# seizures) for trt $i$, time $j$

GL Model is:

$$\log(\lambda_{ij}) = \mu + \alpha_i + \tau_j + (\alpha\tau)_{ij} + \beta_{1i}(\text{log\_base}) + \beta_2(\text{log\_age})$$

Assume CS working correlation structure among repeated measures

using GEE

proc genmod data=seizure;
  class id trt time;
  /* this model first */
  *model y=trt time trt*time log_base trt*log_base log_age/ dist=poisson link=log type1 type3;
  /* then this model */
  model y=trt time log_base(trt) log_age/ dist=poisson link=log type1 type3;
  repeated subject=id / type=exch corrw;

see SAS file for %GLIMMIX approach



GENMOD to GLIMMIX

using GEE

proc genmod data=seizure; class id trt time;model y=trt time log_base(trt)log_age/ dist=poisson link=log type1 type3; repeated subject=id / type=exch corrw;

equivalent GLIMMIX

proc glimmix data=seizure; class id trt time;model y=trt time log_base(trt)log_age/ dist=poisson link=log; random time / type=cs subject=id residual;



Degrees of Freedom & Standard Errors

Recall Satterthwaite approximation & Kenward-Roger bias adjustment in LMM

Same issues exist with GLMM

But not nearly as well researched

You can use SATTERTH and KR options in GLIMMIX with non-normal data & non-identity link

But what do they do?



Power



VIII. Power

Many software packages for power & sample size
−e.g. SAS PROC POWER
− for FIXED effect models only

What if you have “Mixed Model Issues”?
− random effects
−split-plot structure
−errors potentially correlated: longitudinal or spatial data
−any other non-standard model structure

Methods based on PROC GLIMMIX
−adapted from Stroup (2002, JABES)



Mixed Model Background – G, R unknown

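Test $H_0: K'\beta = 0$ using

$$F = \frac{(K'\hat{\beta})'\,[L'\hat{C}L]^{-1}\,(K'\hat{\beta})}{\mathrm{rank}(K)}$$

$\hat{C}$ is estimate of $C$ using estimated components of $G$ and $R$

$F \sim$ approx $F[\mathrm{rank}(K), \nu, \phi]$

$\nu$ may be obvious from design or may need to be approximated, e.g. Satterthwaite, Kenward-Roger

$\phi = (K'\beta)'\,[L'CL]^{-1}\,(K'\beta)$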



Computing Power using SAS

create data set like proposed design (O’Brien: “exemplary data set”)

run PROC GLIMMIX with covariance components fixed

use GLIMMIX to compute the noncentrality parameter: φ = (F computed by GLIMMIX) × rank(K) [or chi-sq with GLM]

critical F (Fcrit) is value s.t. P{F(rank(K), ν, 0) > Fcrit} = α [or chi-square]

Power = P{F[rank(K), ν, φ] > Fcrit}

SAS functions can compute Fcrit & Power



Compute Power with GLIMMIX – CRD example

/* step 1 - create data set with same structure as proposed design
   use MU (expected mean) instead of observed Y_ij values */
/* this example shows power for 5, 10, and 15 e.u. per trt */

data crdpwrx1;
  input trt mu;
  do n=5 to 15 by 5;
    do eu=1 to n;
      output;
    end;
  end;
cards;
1 100
2 94
3 90
;



Compute Power with GLIMMIX – CRD example

/* step 2 - use PROC GLIMMIX to compute non-centrality parameters for ANOVA tests & contrasts ODS statements output them to new data sets */proc sort data=crdpwrx1;by n;

proc glimmix data=crdpwrx1;by n; class trt; model mu=trt; parms (100)/hold=1; contrast 'et1 v et2' trt 0 1 -1; contrast 'c vs et' trt 2 -1 -1; ods output tests3=b; ods output contrasts=c;run;



/* step 3: combine ANOVA & contrast n-c parameter data sets use SAS functions PROBF and FINV to compute power */data power; set b c; alpha=0.05; ncparm=numdf*fvalue; fcrit=finv(1-alpha,numdf,dendf,0); power=1-probf(fcrit,numdf,dendf,ncparm);proc print;


Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
trt      2        12       1.27      0.3169

Contrasts

Label       Num DF   Den DF   F Value   Pr > F
et1 v et2   1        12       0.40      0.5390
c vs et     1        12       2.13      0.1698

Obs   n   Effect   NumDF   DenDF   FValue   ProbF    Label       alpha   ncparm    fcrit     power
1     5   trt      2       12      1.27     0.3169               0.05    2.53333   3.88529   0.22361
2     5            1       12      0.40     0.5390   et1 v et2   0.05    0.40000   4.74723   0.08980
3     5            1       12      2.13     0.1698   c vs et     0.05    2.13333   4.74723   0.26978
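The ncparm values for n = 5 can be reproduced by hand from the design inputs: means 100, 94, 90 and the variance of 100 held fixed in the PARMS statement. The sketch below uses the usual balanced one-way formulas; the symbols $\mu_i$, $\bar{\mu}$, $c_i$, and $\sigma^2$ are introduced here for illustration and do not appear on the slides.

```latex
\text{trt: } \phi = \frac{n\sum_i(\mu_i - \bar{\mu})^2}{\sigma^2}
 = \frac{5\,(5.333^2 + 0.667^2 + 4.667^2)}{100}
 = \frac{5(50.667)}{100} \approx 2.533
\\[6pt]
\text{contrast: } \phi = \frac{\left(\sum_i c_i\mu_i\right)^2}{\sigma^2\sum_i c_i^2/n}
\\[6pt]
\text{et1 v et2: } \frac{(94-90)^2}{100(2)/5} = \frac{16}{40} = 0.40
\qquad
\text{c vs et: } \frac{(200-94-90)^2}{100(6)/5} = \frac{256}{120} \approx 2.133
```

These agree with the ncparm column, which is NumDF × FValue from the exemplary-data run.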



More Advanced Example

Plots in 8 x 3 grid

Main variation along 8 “rows”

3 x 2 treatment design

Alternative designs

− randomized complete block (4 blocks, size 6)

− incomplete block (8 blocks, size 3)

−split plot

RCBD “easy” but ignores natural variation



Picture the 8 x 3 Grid

Gradient



SAS Programs to Compare 8 x 3 Design

Split-Plot

data a;
  input bloc trtmnt @@;
  do s_plot=1 to 3;
    input dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 3
1 2 1 2 3
2 1 1 2 3
2 2 1 2 3
3 1 1 2 3
3 2 1 2 3
4 1 1 2 3
4 2 1 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  random trtmnt/subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;



8 x 3 – Incomplete Block

data a;
  input bloc @@;
  do eu=1 to 3;
    input trtmnt dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 1 2 1 3
2 1 1 1 2 2 2
3 1 1 1 3 2 3
4 1 1 2 1 2 2
5 1 2 1 3 2 2
6 1 2 2 1 2 3
7 1 3 2 1 2 3
8 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=trtmnt|dose;
  random intercept / subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin'




8 x 3 Example - RCBD

data a;
  input trtmnt dose @@;
  do bloc=1 to 4;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 1 3 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  parms (10) / hold=1;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin'




Power for GLMs

2 treatments

P{favorable outcome}: for trt 1 p=0.30; for trt 2 p=0.25

power if n1=300; n2=600

data a;
  input trt y n;
datalines;
1 90 300
2 150 600
;

proc glimmix; class trt; model y/n=trt / chisq; ods output tests3=pwr;run;

data power; set pwr; alpha=0.05; ncparm=numdf*chisq; fcrit=cinv(1-alpha,numdf,0); power=1-probchi(fcrit,numdf,ncparm); proc print; run;
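As a cross-check on the GLIMMIX-based calculation, the noncentrality parameter for this two-group binomial case can be sketched by hand as a Wald chi-square on the logit scale. This is an assumption-laden sketch: it uses the large-sample variance of a log odds ratio, and the variable names are invented here. It is expected to be close to what the exemplary-data GLIMMIX run reports, but the slides do not show that output, so no agreement is claimed.

```sas
/* Hand calculation: Wald chi-square noncentrality on the logit scale */
data approx_power;
  p1 = 0.30;  n1 = 300;
  p2 = 0.25;  n2 = 600;
  alpha = 0.05;
  lor   = log( (p1/(1-p1)) / (p2/(1-p2)) );          /* log odds ratio */
  v_lor = 1/(n1*p1*(1-p1)) + 1/(n2*p2*(1-p2));       /* approx Var(log OR) */
  ncparm = lor**2 / v_lor;                           /* noncentrality, 1 df */
  ccrit  = cinv(1-alpha, 1, 0);                      /* critical chi-square */
  power  = 1 - probchi(ccrit, 1, ncparm);            /* noncentral chi-square */
run;

proc print data=approx_power noobs;
run;
```

The last three statements of the DATA step mirror the CINV and PROBCHI calls used in the power step above.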



Power for GLMM

Same trt and sample size per location as before

10 locations

Var(Location)=0.25; Var(Trt*Loc)=0.125

Variance Components: variation in log(OddsRatio)

Power?

data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;

proc glimmix data=a initglm; class trt loc; model y/n = trt / oddsratio; random intercept trt / subject=loc; random _residual_; parms (0.25) (0.125) (1) / hold=1,2,3; ods output tests3=pwr;run;



GLMM Power Analysis Results

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       9       0.05    2.29868   5.11736   0.27370

Gives you the power of the test of TRT effect on prob(favorable)

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      9    0.884   1.871

Gives you expected Conf Limits for # Locations & N / Loc contemplated



GLMM Power: Impact of Sample Size?

N of subjects per trt per location?

N of Locations?

Three cases

1. n=300/600, 10 loc
2. n=600/1200, 10 loc
3. n=300/600, 20 loc

data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;





GLMM Power: Impact of Sample Size?

Recall, for 10 locations, N=300/600, CI for OddsRatio was (0.884, 1.871); Power was 0.274

For 10 locations, N=600 / 1200

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      9    0.891   1.855

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       9       0.05    2.40715   5.11736   0.28421

For 20 locations, N=300 / 600

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      19   1.006   1.643

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       19      0.05    4.59736   4.38075   0.53003

N alone has almost no impact
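Why N matters so little can be seen from an approximate variance for the estimated log odds ratio. The decomposition below is a sketch, not taken from the slides: it assumes the location effect cancels in the within-location treatment difference, that the Trt*Loc variance (0.125) enters twice, and that the binomial part uses the usual large-sample variance of a logit. With $L$ locations, it reproduces the reported ncparm values.

```latex
Var(\widehat{\log OR}) \approx \frac{1}{L}\left[\,2\sigma^2_{TL} + \frac{1}{n_1p_1(1-p_1)} + \frac{1}{n_2p_2(1-p_2)}\right],
\qquad
\phi = \frac{(\log OR)^2}{Var(\widehat{\log OR})},
\qquad
\log OR = \log\frac{0.30/0.70}{0.25/0.75} \approx 0.2513
\\[6pt]
L=10,\ n=300/600:\quad \phi \approx \frac{0.06316}{(0.25 + 0.02476)/10} \approx 2.299
\\[4pt]
L=10,\ n=600/1200:\quad \phi \approx \frac{0.06316}{(0.25 + 0.01238)/10} \approx 2.407
\\[4pt]
L=20,\ n=300/600:\quad \phi \approx \frac{0.06316}{(0.25 + 0.02476)/20} \approx 4.597
```

The $2\sigma^2_{TL} = 0.25$ term dominates the bracket, so doubling N only shrinks the small binomial term, while doubling the number of locations halves the whole variance.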



Spatial Data



Example 5 - Spatial

from SAS for Mixed Models, Sect. 11.7

“Alliance” Data from Stroup, Baenziger, and Mulitze (1994)

in GLIMMIX-speak:

data two; set alliance; obs = _n_;proc glimmix data=two; class Entry Rep obs; model Yield=Entry/ddfm=kr; random intercept/subject=rep; random obs / type=sp(sph)(latitude longitude); parms (0.1) (43.4) (27.5) (11.5); lsmeans entry;



IX. Spatial Data

Example from SAS for Mixed Models

−Spatial errors in Treatment Comparison studies only

−No spatial mapping, Kriging

Standard parametric models from Geostatistics

RSMOOTH alternative

Issues



[Figure: field layout of reps 1 to 4, plotted by LAT (4.30 to 47.30) and LNG (1.2 to 26.4)]

From Stroup, Baenziger & Mulitze (Crop Science, 1994)

56 varieties, 4 blocks, e.u. = 4.3 × 1.2 m plots



Contour Plot of Response

[Figure: contour plot of response, with plots of two entries marked B and N]

B = Buckskin; N = NE86503



Additional GLIMMIX Code to Plot Spatial Variability

output out=gmxout2 pred=p; ods output lsmeans=lsm2; id entry latitude longitude _zgamma_; run; proc means data=gmxout2; var _zgamma_; run; proc print data=gmxout2(OBS=20); run; proc g3d data=gmxout2; plot latitude*longitude=_zgamma_ /grid;



Plot of Spherical Covariance



Alternative Using RSMOOTH

Advantage in Theory: RSMOOTH does not require parametric model of spatial variation, which can be unrealistic

e.g. Alliance data spatial variation is from winter kill

proc glimmix data=alliance; class Entry Rep; model Yield=Entry /ddfm=kr; *model Yield=Entry latitude longitude/ddfm=kr; random intercept/subject=rep; random latitude longitude / type=rsmooth;



RSMOOTH?

From Penalized Spline
−Ruppert, Wand, and Carroll (2003, Semiparametric Regression, Cambridge)

Prediction: $\hat{y} = B(x)\hat{\beta}^*$

Objective Function: $Q(\beta^*; \lambda) = (y - B(x)\beta^*)'(y - B(x)\beta^*) + \lambda\,\beta^{*\prime} D\,\beta^*$



RSMOOTH (2)

Rewrite the model

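$y_i = \beta_0 + \beta_1 x_i + \sum_j \gamma_j (x_i - \kappa_j)_+ + e_i$

$\kappa_j$ is "knot" a.k.a. "join point"

Re-express: $y = X\beta + Z\gamma + e$

then $Q(\beta^*; \lambda) = \|y - X\beta - Z\gamma\|^2 + \lambda\,\|\gamma\|^2$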



RSMOOTH (2)

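Spline: $\hat{y}$ minimizes $(y - XB)'(y - XB) + \lambda\,B'DB$

LMM: $\hat{y}$ minimizes $(y - X\beta - Zu)'(y - X\beta - Zu) + \dfrac{\sigma^2_e}{\sigma^2_u}\,u'u$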



RSMOOTH yields following Spatial Plot



RSMOOTH vs SP(SPH)

Sp(SPH)

Effect   Num DF   Den DF   F Value   Pr > F
Entry    55       148.2    1.77      0.0038

RSMOOTH

Effect   Num DF   Den DF   F Value   Pr > F
Entry    55       138.1    1.85      0.0021



However... Plot of LSMeans from two approaches

LSM_RSMOOTH average 31.06

LSM_SP_SPH average 24.40

????



Some NLMM Issues

Consulting problem at UNL

Why nonlinear mixed model (NLMM) seemed appropriate

Problems in implementation

NLMM issues

Alternatives whose implications are not adequately understood



Wheat Sawfly Study

Gary Hein, Research Entomologist, Scottsbluff, NE RREC

Sawflies inhabit/damage wheat

5 tillage treatments: impact on sawflies

Exp design used 4 randomized blocks

Sawfly emergence measured at planned times during growing season



Emergence over TIME by TRT

Black: NoTill
Red: SumBlade (summer)
Cyan: SB&SD
Green: SpDisk (spring)
Blue: SpPlow



“Conventional” Analysis

Emerge = μ + TRT + blk + blk*trt + DATE + TRT*DATE + date*blk(trt)

• blk*trt a.k.a. between subjects or “whole-plot” error

• date*blk(trt) = within subjects or “split-plot” error

ANOVA:

Source              df
blk                 3
TRT                 4
betw subj error     12
DATE                12
TRT*DATE            48
within subj error   180



Standard ANOVA model: emerge = μ + blk + TRT + w.p. error + TIME + TRT*TIME + s.p. error

The Mixed Procedure

Cov Parm   Estimate
blk        0.002177
blk*trt    0.005199
Residual   0.01845

Effect     Num DF   Den DF   F Value   Pr > F
trt        4        12       13.18     0.0002
date       12       180      157.38    <.0001
trt*date   48       180      5.18      <.0001

CS covariance fit adequately



Break out TRT*DATE effect



Effect      Num DF   Den DF   F Value   Pr > F
trt         4        12       15.62     0.0001
lin         1        177      2273.39   <.0001
quad        1        177      7.24      0.0078
cubic       1        177      161.10    <.0001
date        9        177      2.95      0.0027
lin*trt     4        177      0.59      0.6716
quad*trt    4        177      26.69     <.0001
cubic*trt   4        177      2.13      0.0792
trt*date    36       177      3.08      <.0001



Alternative Modeling Considerations

Basic form of Model: $y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$

$\mu_{ij}$ = mean of $i^{th}$ trt at $j^{th}$ time

$w_{ik}$ = whole plot error ~ i.i.d. $N(0, \sigma^2_W)$

$e_{ijk}$ = split plot error ~ i.i.d. $N(0, \sigma^2_S)$

Modeling $\mu_{ij}$

1. Decompose $\mu_{ij}$ in “standard ANOVA”: $\mu$ + Trt + Time + Trt*Time

2. Further decompose via polynomial regression

3. Nonlinear decomposition, e.g. Gompertz

4. Transform $y_{ijk}$ to “linearize” response profile over date

a. logit or probit (assume sigmoid profile is symmetric)

b. complementary log-log (allows asymmetry)



Gompertz Model:

$$\mu_{ij} = \alpha_i \exp\{-\exp[-\beta_i(date_j - \gamma_i)]\}$$

$\alpha_i$ is asymptote of $i^{th}$ treatment

$\beta_i$ is "slope" of $i^{th}$ treatment

$\gamma_i$ is inflection point of $i^{th}$ treatment



Parameter Estimates

Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|
a1          0.9949     0.03629          19   27.42     <.0001
a2          0.9666     0.03793          19   25.48     <.0001
a3          0.9868     0.04609          19   21.41     <.0001
a4          1.0037     0.06284          19   15.97     <.0001
a5          0.9236     0.04390          19   21.04     <.0001
b1          0.5435     0.08104          19   6.71      <.0001
b2          0.4822     0.08743          19   5.52      <.0001
b3          0.4506     0.09845          19   4.58      0.0002
b4          0.3431     0.06859          19   5.00      <.0001
b5          0.8544     0.1810           19   4.72      0.0001
c1          0.3615     0.05388          19   6.71      <.0001
c2          0.3224     0.05841          19   5.52      <.0001
c3          0.2940     0.06370          19   4.62      0.0002
c4          0.2186     0.04360          19   5.01      <.0001
c5          0.5319     0.1125           19   4.73      0.0001
s2w         0.002926   0.001355
s2s         0.01598    0.001462

These are ML estimates. Bias?



Fit of Gompertz



Trt Comparisons with NLMIXED

Contrasts

Label                  Num DF   Den DF   F Value   Pr > F
among a                4        19       0.50      0.7383
among b                4        19       2.19      0.1085
among c                4        19       2.30      0.0966
a: nt vs sum bld       1        19       0.29      0.5956
a: nt+sb vs sb&sd      1        19       0.01      0.9108
a: sp dsk vs sp plow   1        19       1.09      0.3089
a: nt+sb vs sp d+p     1        19       0.14      0.7169
b: nt vs sum bld       1        19       0.26      0.6132
b: nt+sb vs sb&sd      1        19       0.29      0.5950
b: sp dsk vs sp plow   1        19       6.97      0.0161
b: nt+sb vs sp d+p     1        19       0.57      0.4590
c: nt vs sum bld       1        19       0.24      0.6279
c: nt+sb vs sb&sd      1        19       0.41      0.5305
c: sp dsk vs sp plow   1        19       6.74      0.0177
c: nt+sb vs sp d+p     1        19       0.21      0.6497



Issues with Test Results

denominator degrees of freedom?
−DF in NLMIXED based on simple N-1 rule
−MIXED uses Satterthwaite/KR
−NLMIXED analog?

bias in test statistics?
−In MIXED, ML variance estimates biased
−Test statistics biased
−Excessive type I error rates familiar in MIXED
−Same in NLMIXED?



Alternative NLMIXED Analysis

1. Use MIXED to obtain REML estimates of $\sigma^2_W$ and $\sigma^2_S$

2. Include REML variance component estimates in NLMIXED as known

3. NLMIXED will compute std errors and test statistics using REML estimates



NLMIXED REML Tests

MLE: $\sigma^2_W$ = 0.002926, $\sigma^2_S$ = 0.01598

REML: $\sigma^2_W$ = 0.005199, $\sigma^2_S$ = 0.01845

Label                  Num DF   Den DF   F Value   Pr > F   Vs. ML
among a                4        19       0.38      0.8188
among b                4        19       1.81      0.1690   .1085
among c                4        19       1.89      0.1537   .0966
a: nt vs sum bld       1        19       0.26      0.6138
a: nt+sb vs sb&sd      1        19       0.00      0.9796
a: sp dsk vs sp plow   1        19       0.77      0.3918
a: nt+sb vs sp d+p     1        19       0.15      0.7046
b: nt vs sum bld       1        19       0.22      0.6419
b: nt+sb vs sb&sd      1        19       0.18      0.6737
b: sp dsk vs sp plow   1        19       5.88      0.0255   .0161
b: nt+sb vs sp d+p     1        19       0.52      0.4788
c: nt vs sum bld       1        19       0.21      0.6555
c: nt+sb vs sb&sd      1        19       0.27      0.6114
c: sp dsk vs sp plow   1        19       5.68      0.0277   .0177
c: nt+sb vs sp d+p     1        19       0.20      0.6586

Hein: “What if we transform the data to linearize it, then use MIXED?”

Denote response variable emerge by y; then:

$$y = \alpha \exp\{-\exp[-\beta(date - \gamma)]\}$$

if we assume $\alpha = 1$, then

$$\log[-\log(y)] = -\beta(date - \gamma)$$
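The linearizing effect of the log(-log) transform can be illustrated numerically. The DATA step below is a sketch under stated assumptions: it takes the Gompertz curve in the form y = exp{-exp[-beta*(date - gamma)]} with the asymptote fixed at 1, and the values of beta and gamma and the range of dates are hypothetical, chosen only for illustration rather than taken from the sawfly data.

```sas
/* Hypothetical parameter values, chosen only for illustration */
data gompertz;
  alpha = 1;      /* asymptote, assumed equal to 1 */
  beta  = 0.5;    /* "slope" (assumed value) */
  gamma = 6;      /* inflection point (assumed value) */
  do date = 1 to 13;
    y       = alpha*exp(-exp(-beta*(date - gamma)));  /* Gompertz curve */
    lnlny   = log(-log(y));                           /* transformed response */
    line    = -beta*(date - gamma);                   /* straight line in date */
    output;
  end;
run;

proc print data=gompertz noobs;
  var date y lnlny line;
run;
```

The lnlny and line columns coincide, which is the sense in which the transformed response can be analyzed with a model that is linear in date.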



Plot of CLogLog over Date by Trt



MIXED Analysis of CLogLog



Effect     Num DF   Den DF   F Value   Pr > F
trt        4        12       15.69     0.0001
lin        1        180      1402.85   <.0001
lin*trt    4        180      3.58      0.0077
trt*date   55       180      7.02      <.0001

Test of Lin and Lin*Trt correspond to equality of $\beta_i$ and $\gamma_i$ for all treatments in Gompertz NLMM



Decomposing Contrasts

Label               Num DF   Den DF   F Value   Pr > F   Vs NLMM
trt (b)             4        15       6.12      0.0040   .169
c                   4        120      3.62      0.0080   .154
b: nt v sum bld     1        15       2.15      0.1631
b: nt&sb vs sb&sd   1        15       4.37      0.0541   .674
b: sp d v p         1        15       2.27      0.1526   .026
b: nt&sb v sp d&p   1        15       19.96     0.0005
c: nt v sum bld     1        120      2.11      0.1491
c: nt&sb vs sb&sd   1        120      3.49      0.0644   .611
c: sp d v p         1        120      0.99      0.3214   .028
c: nt&sb v sp d&p   1        120      11.08     0.0012

NLMM too conservative? or is Linearized LMM too liberal?



Unresolved Issues



Unresolved NLMIXED Issues

REML vs. ML variance component estimates

Degrees of Freedom

Starting Values and Convergence

Are NLMIXED tests too conservative?

Implications for standard errors??

Correlated error repeated measures?

When are linearized models analyzed using LMM (e.g. Proc Mixed) preferable?

Design

GLIMMIX vs MIXED/GENMOD

GLIMMIX has very useful mean comparison options not available in MIXED

−especially for Factorial Simple Effects

GLIMMIX can model true GLMM’s

GLIMMIX is “touchy” (e.g. use of SUBJECT=)

Many Research Issues

−RSMOOTH

−Properties of NonNormal KR, working correlation, DDF, etc.

−Computational Methods



Does GLIMMIX replace MIXED/GENMOD?

For GLMMs – no question

For GLMs / LMMs

− for the most part – YES

Most GENMOD & MIXED programs can be duplicated in GLIMMIX

−Mean Comparison features

−no need to “trick” GENMOD into GLMM with marginal model (e.g. split-plot, rpt measures)
