
Department of Statistics

Introduction to Modeling Change Over Time

with Generalized Mixed Models

using SAS PROC GLIMMIX

A Short Course – 14 May 2007

Instructor: Walt Stroup, Ph.D.

Professor & Chair, UNL Department of Statistics

14 May 2007 SSP Core Facility 2

Outline of Short Course (G/C = Growth/Change Model)

1. Introduction
   a. motivating examples
   b. Social Science HLM-speak vs. BioStat GLMM-speak
2. GLMM / HLM
   a. essential background
   b. recurring modeling issues
3. SAS / GLIMMIX syntax
4. G/C Models - 1st part of the picture: Factorial trt designs
   a. with various error structures & distributions
   b. with repeated measures & correlated errors
5. G/C Models - 2nd part of the picture: Random Effects issues
   a. random coefficients
   b. prediction vs. estimation
6. G/C Models - 3rd part of the picture: GLM issues: Binary, count, rate, zero-inflated models
7. Power & Planning
8. Nonlinear mixed models

Recurring Themes

“Mixed Model” Issues
− fixed or random?
− error terms: which one & are they correlated?
− std error & d.f.
− prediction or estimate? (“inference space”)

“GLM” Issues
− what distribution? incl. “is it really a distribution & does it matter?”
− what link: “data” vs “model” scale?
− overdispersion
− computational issues

Recurring Themes

George Bernard Shaw:

“America and England are two peoples separated by a common language.”

Generalized Mixed Models have
− AgStat-speak
− BioStat-speak
− Social/Behavioral Science Stat (HLM) speak

One goal: serve as translator

I. Introduction

− General considerations for modeling
− Several examples illustrating generalized and mixed models
− Typology of models
− Background theory
− Decision chart to match model with software available in SAS

General Model considerations
− A model is a description of the components of an observation
− observation = systematic + random
− Nelder: random = ephemeral + noise, or random = random model + random error
− Alternative: random = design components + remaining variation

“All models are wrong but some are useful” – G.E.P. Box

General Mixed Model Setting

− Y is vector of responses (observable)
− u is vector of random (design induced) effects [not (directly) observable]
− relevant distributions
  o $Y \mid u \sim f_C(\mu, R)$
  o $u \sim f_R(0, G)$
− Model is of conditional mean of Y|u:
  $E(Y \mid u) = \mu = h(X, \beta, Z, u)$

Inexact (but useful) correspondences:
• Y|u: HLM Level 1; Biostat "subject-specific"
• u: Level 2

Typology of Models

Type   Mean Model             Distribution
NLMM   $h(X, \beta, Z, u)$    y|u general, u normal **
GLMM   $h(X\beta + Zu)$       y|u general, u normal *
LMM    $X\beta + Zu$          u, y|u normal
NLM    $h(X, \beta)$          y normal
GLM    $h(X\beta)$            y general
LM     $X\beta$               y normal

* for PROC GLIMMIX
** for this course
(G/N)LMM can be more general

Example 1: Random Effects Model

Data: Output 4.1, p. 94, SAS for Linear Models, 4th ed.
− 20 packages of ground beef
− 3 samples per package
− 2 counts per sample
− response variable: microbial count
− response = mean + package + sample(package) + error, i.e. observation = systematic + random model + error

Model for Example 1

$y_{ijk} = \mu + p_i + s(p)_{ij} + e_{ijk}$; $i = 1, 2, \ldots, 20$; $j = 1, 2, 3$; $k = 1, 2$

$p_i$ i.i.d. $N(0, \sigma_P^2)$; $s(p)_{ij}$ i.i.d. $N(0, \sigma_S^2)$; $e_{ijk}$ i.i.d. $N(0, \sigma^2)$

− $y_{ijk}$ is observation [ log(count) ]
− $\mu$ is overall mean (systematic / fixed)
− $p_i$, $s(p)_{ij}$ are random model effects
− $e_{ijk}$ is random error

Convention: fixed Greek; random Latin
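
A minimal sketch of how the Example 1 model might be requested in PROC GLIMMIX. The data set name `beef` and the variables `package`, `sample`, and `logcount` are hypothetical placeholders, not names from the course data.

```sas
/* Hypothetical data set BEEF: one row per count, with
   package (1-20), sample (1-3 within package), logcount = log(count) */
proc glimmix data=beef;
  class package sample;
  model logcount = / solution;        /* intercept only: overall mean mu */
  random package sample(package);     /* random effects p_i and s(p)_ij  */
run;
```

The two terms on the RANDOM statement correspond to the two variance components for package and sample within package; the residual variance is estimated by default.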


Hierarchical Levels

school      Level 3
classroom   Level 2
students    Level 1

size     level
small    1
medium   2
large    3

Hierarchical Level to Statistical Model

$y_{ijk}$ = $k$th student, $j$th classroom, $i$th school

$y_{ijk}$ = mean + school + classroom(school) + student

GLIMMIX-speak:
$y_{ijk} = \mu + s_i + c(s)_{ij} + e_{ijk}$

HLM-speak:
Level 1 (student): $y_{ijk} = \beta_{0ij} + e_{ijk}$, with $\beta_{0ij} = \mu + s_i + c(s)_{ij}$
Level 2 (classroom): $y_{ijk} = \beta_{0i} + c(s)_{ij} + e_{ijk}$
Level 3: $\beta_{0i} = \mu + s_i$

Modeling Issues

1. Estimate the variance components ($\sigma_i^2$'s)

2. Estimate, standard error, and interval estimate of $\mu$

3. Estimates of package, sample effects

4. a.k.a. estimates of school and classroom effects

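Singer: HLM to MIXED

Unconditional means model

HLM-speak (Raudenbush & Bryk, 2002):
$y_{ij} = \beta_{0j} + r_{ij}$, $r_{ij} \sim N(0, \sigma^2)$
$\beta_{0j} = \gamma_{00} + u_{0j}$, $u_{0j} \sim N(0, \tau_{00})$

GLIMMIX-speak:
$y_{ij} = \mu + a_i + e_{ij}$, $a_i \sim N(0, \sigma_A^2)$, $e_{ij} \sim N(0, \sigma^2)$

Include Level 2 Covariate

"HLM-speak":
$\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{MEANSES}_j + u_{0j}$
$y_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{MEANSES}_j + u_{0j} + r_{ij}$

"GLIMMIX-speak":
$y_{ij} = \mu + \beta_1 X_j + s_j + e_{ij}$

one-way random effects model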
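
As a hedged illustration of the translation, the two models above could be fit along the following lines. The data set `hsb` and the variables `school`, `mathach`, and `meanses` are assumed names in the spirit of Singer's examples, not taken from these slides.

```sas
/* Unconditional means model: y_ij = mu + a_i + e_ij */
proc glimmix data=hsb;
  class school;
  model mathach = / solution ddfm=kr;
  random intercept / subject=school;   /* school-level variance */
run;

/* Add the Level 2 covariate MEANSES as a fixed effect */
proc glimmix data=hsb;
  class school;
  model mathach = meanses / solution ddfm=kr;
  random intercept / subject=school;
run;
```

In both runs the RANDOM statement plays the role of the HLM Level 2 random intercept $u_{0j}$, and the residual variance plays the role of the Level 1 variance.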


Example 2: Blocking & Multi-Location

Data: SAS for Linear Models: Output 3.7, discussed as mixed model in Section 4.3; Output 11.30; SAS for Mixed Models, 2nd ed., Section 6.6

Output 11.30 discussed here
− 3 treatments
− 8 locations
− locations represent a population
− 3-12 blocks depending on location
− response = trt + loc + blk(loc) + trt×loc + error, i.e. observation = systematic + random model + error

Example 2 framed by Extending School / Classroom Example

[Diagram: two school / classroom / students hierarchies, each labeled with a Treatment]

Model with Treatment

$y_{ijkl}$ = trt + school(trt) + classroom(school) + student

$y_{ijkl} = \tau_i + s(\tau)_{ij} + c(\tau, s)_{ijk} + e_{ijkl}$

Level 1: $y_{ijkl} = \beta_{0ijk} + e_{ijkl}$

Level 2: $y_{ijkl} = \beta_{0ij} + c(\tau, s)_{ijk} + e_{ijkl}$

Level 3: between school model + trt as above

Modeling Issues

1. Appropriate error term to test treatment

2. Standard error of treatment mean

− (inference space)

3. Intra-block vs. inter-block analysis


ANOVA (ignoring block)

Source      d.f.   Expected Mean Square
Treatment   2      $\sigma^2 + k_1\sigma_{LT}^2 + Q_{TRT}$
Location    7      $\sigma^2 + k_1\sigma_{LT}^2 + k_2\sigma_L^2$
Loc×Trt     14     $\sigma^2 + k_1\sigma_{LT}^2$
error       dfe    $\sigma^2$

If Location fixed:

Source      d.f.   Expected Mean Square
Treatment   2      $\sigma^2 + Q_{TRT}$
Location    7      $\sigma^2 + Q_{LOC}$
Loc×Trt     14     $\sigma^2 + Q_{LT}$
error       dfe    $\sigma^2$

Test of TRT affected

Inference Space

Assuming Locations are Fixed:

$\mathrm{Var}(\text{trt mean}) = \dfrac{\sigma^2}{\text{\# obs/trt}}$

Std. error(trt mean) $= \sqrt{\dfrac{MS(\text{error})}{\text{\# obs/trt}}} = 0.91$

HOWEVER... if Locations are Random:

$\mathrm{Var}(\text{trt mean}) = \dfrac{\sigma^2 + k(\sigma_L^2 + \sigma_{LT}^2)}{\text{\# obs/trt}}$

Std. error(trt mean) $= \sqrt{\dfrac{\hat\sigma^2 + k(\hat\sigma_L^2 + \hat\sigma_{LT}^2)}{\text{\# obs/trt}}} = 3.62$

Where does Uncertainty Arise?

[Diagram: observations within Loc 1, Loc 2, ..., Loc 7, Loc 8]

Only from variation among obs within locations? Locations fixed

Or does variation among locations also contribute? Locations random

Intra- vs. Inter-block analysis

Intra- (fixed) block analysis based only on within block treatment differences

Inter-block analysis also accounts for variance among blocks (random combines inter- and intra-)

Lead to equivalent tests when all treatments appear equally in each block

Not equivalent otherwise

In most cases, combined inter-/intra-block analysis is more efficient

Example 3: Repeated Measures/Longitudinal

Data: SAS for Linear Models, Output 8.1; SAS for Mixed Models, Chapter 5

3 treatments (2 test drugs + placebo)

$n_i$ patients per treatment

8 times of measurement (1, 2, 3, ..., 8 hours post trt)

baseline measurement at time 0

response = trt + hour + trt×hour + pat(trt) + error, i.e. observation = systematic + random model + error

Variations on this theme are “latent growth models”

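Growth Models – Singer: HLM-speak to GLIMMIX-speak

Unconditional Linear Growth Model

HLM       GLIMMIX
Level 1   Within subjects
Level 2   Between subjects

Level 1 (within individual): $y_{ij} = \pi_{0j} + \pi_{1j}\,\mathrm{time}_{ij} + r_{ij}$, $r_{ij} \sim N(0, \sigma^2)$

Level 2: $\pi_{0j} = \beta_{00} + u_{0j}$; $\pi_{1j} = \beta_{10} + u_{1j}$; $\begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{bmatrix} \right)$

Combined:
$y_{ij} = (\beta_{00} + u_{0j}) + (\beta_{10} + u_{1j})\,\mathrm{time}_{ij} + r_{ij}$
$\phantom{y_{ij}} = [\beta_{00} + \beta_{10}\,\mathrm{time}_{ij}] + [u_{0j} + u_{1j}\,\mathrm{time}_{ij}] + r_{ij}$

− $\beta_{00} + \beta_{10}\,\mathrm{time}_{ij}$: population-averaged (PA)
− $u_{0j} + u_{1j}\,\mathrm{time}_{ij}$: subject-specific (SS)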
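
A sketch of the unconditional linear growth model in GLIMMIX-speak, assuming a hypothetical long-format data set `growth` with one row per subject and occasion and variables `id`, `time`, and `y`.

```sas
/* Random intercept and random slope for each subject */
proc glimmix data=growth;
  class id;
  model y = time / solution ddfm=kr;           /* fixed part: population-averaged line */
  random intercept time / subject=id type=un;  /* u_0j, u_1j with unstructured 2x2 covariance */
run;
```

TYPE=UN lets the random intercept and slope covary, which matches a Level 2 covariance matrix with an off-diagonal term; dropping it would force that covariance to zero.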


Singer (1998)

− Excellent paper translating HLM-speak to PROC MIXED
− Uses Raudenbush & Bryk examples
− Fair warning to readers, however: it’s dated
  − new features & output revisions in SAS
  − some of the output encouraged confusion or poor practice
  − specifics:
    − revised output of Fit Statistics
    − misleading output for variance estimates deleted
    − Kenward-Roger procedure for d.f. & std errors
− I’ll update & make switch to PROC GLIMMIX

Modeling Issues

1. Errors may be correlated
   a. May affect conclusions
   b. How to select covariance model

2. Denominator degrees of freedom

3. Bias in standard errors and test statistics

Impact of Correlated Errors

Covariance Model                                      den df      F-value       Pr>F
errors independent                                    483         7.11          <0.0001
errors correlated, no structure (bias corrected)      69 (98.1)   4.06 (3.66)   <0.0001
AR(1)                                                 483         3.93          <0.0001
AR(1), bias corrected                                 424         3.89          <0.0001
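
The rows of this comparison correspond to different R-side covariance specifications. The sketch below shows one way such fits could be requested. The data set `drug` and the variables `trt`, `patient`, `hour`, and `resp` are hypothetical, and these statements are not claimed to be the ones that produced the numbers above.

```sas
/* AR(1) errors within patient, with Kenward-Roger bias correction and d.f. */
proc glimmix data=drug;
  class trt patient hour;
  model resp = trt hour trt*hour / ddfm=kr;
  random hour / subject=patient(trt) type=ar(1) residual;
run;

/* Unstructured within-patient covariance: change only the TYPE= option */
proc glimmix data=drug;
  class trt patient hour;
  model resp = trt hour trt*hour / ddfm=kr;
  random hour / subject=patient(trt) type=un residual;
run;
```

Omitting the RANDOM statement altogether gives the "errors independent" analysis, and omitting DDFM=KR gives the uncorrected versions.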


Example 4

Data: SAS for Mixed Models, Section 14.5
− 2 treatments (Test Drug, Control)
− 8 clinics
− clinics represent a population
− $n_{ij}$ subjects at $j$th location on $i$th treatment
− response: favorable or unfavorable ($f_{ij}$ = # fav)
− response = trt + clinic + trt×clinic + error, i.e. observation = systematic + random model + error

Modeling Issues

1. Response (fij / nij) is binomial, not normal

2. Response may not be linear in model parameters

3. Errors may not be additive

4. Variance of binomial & normal are different

a. heterogeneous

b. depends on location parameter

Generalized Linear Mixed Model

observations: $(f_{ij} \mid c_j, (tc)_{ij}) \sim \mathrm{Binomial}(n_{ij}, \pi_{ij})$

$c_j$ i.i.d. $N(0, \sigma_C^2)$; $(tc)_{ij}$ i.i.d. $N(0, \sigma_{TC}^2)$

let $\pi_{ij}$ = Pr{favorable response | trt $i$, clinic $j$}

Model: $\eta_{ij} = \log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + \tau_i + c_j + (tc)_{ij}$

$f_{ij}/n_{ij}$ = proportion, modeled by $\pi_{ij} = \dfrac{\exp[\mu + \tau_i + c_j + (tc)_{ij}]}{1 + \exp[\mu + \tau_i + c_j + (tc)_{ij}]}$

e.g. Logistic mixed model
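
A minimal sketch of this logistic mixed model in PROC GLIMMIX, assuming a hypothetical data set `clinics` with one row per clinic × treatment and variables `clinic`, `trt`, `fav` (number favorable), and `nij` (number of subjects).

```sas
proc glimmix data=clinics;
  class clinic trt;
  model fav/nij = trt / solution;          /* events/trials: binomial, logit link by default */
  random intercept trt / subject=clinic;   /* c_j and (tc)_ij */
  lsmeans trt / ilink;                     /* also report means on the probability scale */
run;
```

The ILINK option applies the inverse link to the least squares means, so the treatment estimates are reported both as logits and as probabilities.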


Example 5

SAS for Linear Models, Output 10.39
− 2 treatments
− $n_i$ persons per treatment
− 4 times of measurement
− response = number of seizures (count)
− baseline and age observations
− response = trt + hour + trt×hour + baseline & age + pat(trt) + error, i.e. observation = systematic + random model + error

Modeling Issues

− Count typically not ~ normal
− Poisson (or negative binomial) more likely
− Generalized Linear Model Issues
  − Linear model not good direct model of mean
  − Variance depends on mean
− Repeated Measures Issues
  − Observations within subjects correlated over time
  − Between subject variance

Example 6

SAS for Mixed Models, Section 1.5.6
− 5 treatments observed in each of 4 randomized blocks
− several measurements at days between 130 and 180 growing degree days
− response = (trt,day) + block + blk×trt + error, i.e. observation = systematic + random model + error

Emergence over TIME by TRT

[Figure: emergence over time by treatment. Legend: Black: NoTill; Red: SumBlade (summer); Cyan: SB&SD; Green: SpDisk (spring); Blue: SpPlow]

Modeling Issues

“Usual” mixed model and repeated measures issues, plus

Linear model is poor model of trt×day means

Nonlinear Mixed Model

Mixed Model: $y_{ijk} = \mu_{ij} + b_k + w_{ik} + e_{ijk}$
− $\mu_{ij}$ is trt × day mean; $b_k$ is block effect
− $w_{ik}$ is between subject error

Gompertz Model: $\mu_{ij} = \alpha_i \exp\{-\exp[-\beta_i(d_j - \delta_i)]\}$
− $\alpha_i$ is asymptote of $i$th treatment
− $\beta_i$ is "slope" of $i$th treatment
− $\delta_i$ is inflection point of $i$th treatment

Typology of Models

Type   Mean Model             Distribution
NLMM   $h(X, \beta, Z, u)$    y|u general, u normal **
GLMM   $h(X\beta + Zu)$       y|u general, u normal *
LMM    $X\beta + Zu$          u, y|u normal
NLM    $h(X, \beta)$          y normal
GLM    $h(X\beta)$            y general
LM     $X\beta$               y normal

* for PROC GLIMMIX
** for this course
(G/N)LMM can be more general

Generalized Mixed Model SAS Software Decision Table

Response Normal Errors Indep Corr

Random Effects no yes Mean Model

Linear? yes no yes no yes no

SAS Proc

GLM MIXED

GLIMMIX

NLIN MIXED GLIMMIX

NLMIXED %NLINMIX

MIXED GLIMMIX

NLMIXED %NLINMIX

Response Non-Normal

Errors Indep Correl Random Effects

no yes

Mean Model Linear?

yes no yes no yes no

SAS Proc

GENMOD GLIMMIX

GLIMMIX NLMIXED NLMIXED

GLIMMIX (GENMOD)


Essential GLMM Background


First

How do I run a SAS Program?

???????

It’s easier than the urban legends would have you believe


Basic Parts of SAS Program

DATA Step

Data your_choice_of_name;
  Input list of variables;   /* $ after alphameric var */
  Datalines;
data – one line / obs, one column per variable
;

(text between /* and */ is a comment)

PROC Step

Proc GLIMMIX Data=your_choice_of_name;
  CLASS block group & trt var;
  MODEL response=block trt covar / options;
  ...
Run;

Modify existing data set (Data __; Set __;)

Data new_data_set_name;
  Set [old – e.g.] your_choice_of_name;
  program & data manipulation statements, e.g. LogY=Log(Y);

Example of SAS Program

DATA Step

data demo1;
  input classroom trt $ time count;
  sc=sqrt(count);
datalines;
1 std 1 12
1 std 2 16
1 std 4 17
1 std 8 24
2 exper 1 17
2 exper 2 24
2 exper 4 30
2 exper 8 32
11 std 1 16
11 std 2 15
11 std 4 22
11 std 8 23
8 exper 1 15
8 exper 2 20
8 exper 4 24
8 exper 8 27
;

PROC Step

proc glimmix data=demo1;
  class classroom trt time;
  model sc=trt time trt*time / dist=normal ddfm=kr;
  random classroom(trt);
  lsmeans trt*time;
  ods output lsmeans=lsm;
run;

Data; Set; + new PROC

data plot_growth;
  set lsm;
  log_time=log2(time);
symbol i=join value=circle;
proc gplot data=plot_growth;
  plot estimate*log_time=trt;
run;

II. Generalized Mixed Model Theory

− Clarify Fixed vs Random effects
− Linear Models
  − LM to LMM + GLM to GLMM
− Estimation and Inference for
  − LMM
  − GLM
  − GLMM
− For GLMM:
  − what follows naturally from GLM and LMM
  − Special Issues

Fixed vs. Random Effects?

Fixed Effect?
− levels observed = population of interest (except regression)
− levels deliberately chosen
− inference: systematic relationship between y and

Random Effect?
− observed levels represent target population
− random sample? -- ideal (but seldom perfectly realized)
− makes sense to conceptualize probability distribution

Bottom Line: do observed levels of effect plausibly represent a probability distribution?
− yes: random effect
− no: fixed effect

General Structure of Model

− Nelder: observation = systematic + random
− General approach:
  − likelihood consists of two parts: observation (y | u) and random effects u
  − model is mathematical description of $\mu = E(y \mid u)$
− Distribution:
  − observation: $y \mid u \sim f(\mu, R)$
  − random effects: $u \sim MVN(0, G)$
− Model: $\mu = h(X, \beta, Z, u)$; $h(\cdot)$ called “inverse link”

Linear Model (LM)

− No random effects
− simple ANOVA (one error term)
− multiple regression

Assumption: $y \sim MVN(\mu, R)$

LM: Model $\mu$ by $X\beta$, usually represented as $y = X\beta + e$; $e \sim N(0, R)$

alternative representation (helpful for transition to GLMM): $y \sim MVN(X\beta, R)$

Generalizations of LM

− LM (Linear Model): obs ~ normal; fixed effects only
− GLM (Generalized Linear Model): obs ~ non-normal; fixed effects only
− LMM (Linear Mixed Model): obs ~ normal; random effects
− GLMM (Generalized Linear Mixed Model): obs ~ non-normal; random effects

GLM: Generalized Linear Model

− Binomial: Logistic regression; Probit models
− Poisson: Log-linear models

Assumption: $y \sim \mathrm{dist}(\mu, R)$

$R$ is a function of $\mu$: $V(\mu)$ called "variance function" -- more later

GLM: model $\eta = g(\mu)$ by $X\beta$; $g(\cdot)$ called "link function"

alternatively, model $\mu$ by $h(X\beta)$, the "inverse link"

Note: here writing $y$ or $g(\mu) = X\beta + e$ makes no sense

Instead: $y \sim \mathrm{dist}(h(X\beta), R)$

LMM: Linear Mixed Model

− Multi-error models; split-plot, multi-location
− Repeated measures a.k.a. Longitudinal data

Assume: $y \mid u \sim MVN(\mu, R)$; $u \sim MVN(0, G)$

LMM: Model $\mu$ by $X\beta + Zu$

Familiar notation: $y = X\beta + Zu + e$; $\begin{bmatrix} u \\ e \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix} \right)$

alternatively: $y \mid u \sim MVN(X\beta + Zu, R)$; $u \sim MVN(0, G)$

or (marginal model): $y \sim MVN(X\beta, V)$; $V = ZGZ' + R$

More vocabulary:
− “G-side” concerns V(u)
− “R-side” concerns V(e)

GLMM: Generalized Linear Mixed Model

Assume: $y \mid u \sim \mathrm{dist}(\mu, R)$ as with GLM; $R$ depends on $V(\mu)$

$u \sim MVN(0, G)$

GLMM models $\mu = E(y \mid u)$ by
− link function: $\eta = g(\mu) = X\beta + Zu$
− inverse link: $\mu = h(X\beta + Zu)$

GLMM: $y \mid u \sim \mathrm{dist}(h(X\beta + Zu), R)$

Marginal Model: $f(y) = \int f(y \mid u)\, f(u)\, du$ (more later)

Modeling will involve
• Distribution
• Link (or inv link)
• G-side
• R-side

Some Grounding Before Moving On

“Hessian Fly” example, Gotway & Stroup (1997, JABES)

“Hessian Fly” not so important, but design & data structure are

16 treatments, 4 replications: 4x4 Lattice
− 16 incomplete blocks organized into 4 complete blocks

Response: $Y_{ij}/n_{ij}$ (damaged / obs per trt x block unit)

1 2 5 6 1 5 2 6

3 4 7 8 9 13 10 14

9 10 13 14 3 7 4 8

11 12 15 16 11 15 12 16

1 6 2 5 1 14 13 2

11 16 12 15 7 12 11 8

1 14 13 10 5 10 9 6

3 8 7 4 3 16 15 4


Linear Model (LM)

Randomized Complete Block:
$y_{ij} = \mu + \rho_i + \tau_j + e_{ij}$; $e_{ij}$ i.i.d. $N(0, \sigma^2)$
$\rho_i$ = block effect; $\tau_j$ = treatment effect

proc glimmix;
  class block entry;
  model pct=block entry;

Incomplete Block Model - Intra-block analysis:
incomplete block replaces complete block in denoting $\rho_i$

proc glimmix;
  class inc_block entry;
  model pct=inc_block entry;

Linear Mixed Model (LMM)

Randomized Complete Block - Random block effects:
$y_{ij} = \mu + r_i + \tau_j + e_{ij}$
$r_i$ i.i.d. $N(0, \sigma_R^2)$; $e_{ij}$ i.i.d. $N(0, \sigma^2)$
$r_i$ = block effect; $\tau_j$ = treatment effect

proc glimmix;
  class block entry;
  model pct=entry;
  random block;      /* G-side modeling of block effect */

Incomplete block (recovery of interblock information): replace “block” by “inc_block”

LMM: G-side / R-side

Two alternative “G-side” specifications:

proc glimmix;
  class block entry;
  model pct=entry;
  random block;

proc glimmix;
  class block entry;
  model pct=entry;
  random intercept / subject=block;

R-side specification:

proc glimmix;
  class block entry;
  model pct=entry;
  random _residual_ / type=cs subject=block;

Here, it doesn’t matter (all equivalent), but for more complex models the distinctions will matter

Generalized Linear Model (GLM)

$y_{ij} \sim \mathrm{Binomial}(n_{ij}, \pi_{ij})$

GLM ("Logit ANOVA" model): $\log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + \rho_i + \tau_j$

proc glimmix;
  class block entry;
  model y/n = block entry;

or replace “block” by “inc_block” for intra-block logit ANOVA

More on GLIMMIX syntax later

Here, note Y/N causes default to Binomial distribution & Logit link (same as GENMOD)

Generalized Linear Mixed Model (GLMM)

$y_{ij} \mid$ block effects $\sim \mathrm{Binomial}(n_{ij}, \pi_{ij})$; block effects $r_i$ i.i.d. $N(0, \sigma_R^2)$

GLMM ("Logit ANOVA" mixed model): $\log\dfrac{\pi_{ij}}{1 - \pi_{ij}} = \mu + r_i + \tau_j$

proc glimmix;
  class block entry;
  model y/n = entry;
  random intercept / subject=block;

proc glimmix;
  class block entry;
  model y/n = entry;
  random block;

Marginal model (not equivalent):

proc glimmix;
  class block entry;
  model y/n = entry;
  random _residual_ / type=cs subject=block;

II. Inference in LM, GLM, LMM, and GLMM

Inference for fixed effects based on estimable functions

In LM theory, $K'\beta$ is estimable if it can be expressed as $A'E(y)$, i.e. $K' = A'X$

OLS: $\hat\beta = (X'X)^{-}X'y$

theorem: $K'\beta$ estimable iff $K' = K'(X'X)^{-}(X'X)$

Main advantage: $K'\hat\beta$ invariant to choice of $(X'WX)^{-}$
− i.e. when $X$ not full rank, $\hat\beta$ has no intrinsic interpretation
− $K'\hat\beta$ does (e.g. treatment difference, marginal (least squares) mean)

II. Examples of Estimable Functions

e.g. one way model: $y_{ij} = \mu + \tau_i + e_{ij}$; $i = 1, 2, 3, 4$; $j = 1, \ldots, n$

Estimable functions include

− Trt marginal ("Least Squares") mean (LSMean): $\mu + \tau_i$, e.g. $k' = [1\ 1\ 0\ 0\ 0]$ for $i = 1$

− Trt differences: $\tau_1 - \tau_2$, e.g. $k' = [0\ 1\ {-1}\ 0\ 0]$

− SS(trt): $K$ such that all $\tau_i$ equal, e.g.
  $K' = \begin{bmatrix} 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \end{bmatrix}$

II. Common Inference Results for GLM

$K'\hat\beta \sim \text{approx. } MVN\left(K'\beta,\ K'(X'WX)^{-}K\right)$ (exact for LM)

Wald statistic:
− purpose: test $H_0: K'\beta = 0$
− $\mathrm{Wald} = (K'\hat\beta)'\,[K'(X'WX)^{-}K]^{-1}\,(K'\hat\beta) \sim \text{approx. } \chi^2_{\mathrm{rank}(K)}$

Note: in OLS, $\mathrm{Wald} = SS(H_0)/\sigma^2$

II. GLM: Inference with Unknown Scale Parameter

Recall, in OLS $\mathrm{Wald} = SS(H_0)/\sigma^2$

But what if $\sigma^2$ unknown?

Think ANOVA: use $\dfrac{SS(H_0)}{\hat\sigma^2} = \dfrac{SS(H_0)}{MSE}$

Thus, $\dfrac{\mathrm{Wald}}{\mathrm{rank}(K)} = \dfrac{SS(H_0)/dfh}{MSE} \sim F_{(dfh,\,dfe)}$

Generalization: in GLM, scale parameter estimated by $\dfrac{\text{Pearson } \chi^2}{dfe}$ or $\dfrac{\text{Deviance}}{dfe}$

II. Extension of GLM Scale Parameter: Quasi-Likelihood

Overdispersion
− Counts ~ Poisson implies $E(y) = \mathrm{Var}(y)$
− but in practice $\mathrm{Var}(y) > E(y)$
− Quasi-likelihood: you specify $E(y)$ and $\mathrm{Var}(y)$, e.g. $\mathrm{Var}(y) = \phi\,E(y)$

“Working Correlation”
− Repeated Measures
− Assumed distribution implies $\mathrm{Var}(y) = \mathrm{diag}[V(\mu)]$
− But in reality, errors are correlated, so model variance as $\mathrm{Var}(y) = R^{1/2} A R^{1/2}$ where $R^{1/2} = \mathrm{diag}[V(\mu)^{1/2}]$
− $A$ is working correlation; structure analogous to true R-side in LMM
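
One way the overdispersion scale parameter can be requested in PROC GLIMMIX is a RANDOM _RESIDUAL_ statement with no options. The sketch below assumes a hypothetical data set `counts` with a count response `count` and a treatment factor `trt`.

```sas
/* Poisson log-linear model with a multiplicative overdispersion (scale) parameter */
proc glimmix data=counts;
  class trt;
  model count = trt / dist=poisson link=log;
  random _residual_;    /* adds the scale parameter: Var(y) = phi * mean */
run;
```

Without the RANDOM _RESIDUAL_ statement the scale parameter is held at 1, which is the strict Poisson assumption.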


II. GLM: Deviance and Likelihood Ratio Test

Full model: $X\beta$, i.e. $\mu = h(X\beta)$

Decompose as $X\beta = X_1\beta_1 + X_2\beta_2$

Suppose we want to test $H_0: \beta_2 = 0$

1. Fit full model: $\mathrm{Dev}(X\beta) = -2\log\left[\ell(X\hat\beta)/\ell(y)\right]$

2. Fit reduced model: $\mathrm{Dev}(X_1\beta_1) = -2\log\left[\ell(X_1\hat\beta_1)/\ell(y)\right]$

3. LR statistic: $\mathrm{Dev}(X_1\beta_1) - \mathrm{Dev}(X\beta)$

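II. LMM: The “Mixed Model Equations”

$\ell(y) \propto -\tfrac{1}{2}(y - X\beta - Zu)'R^{-1}(y - X\beta - Zu) - \tfrac{1}{2}u'G^{-1}u$

$\dfrac{\partial \ell(y)}{\partial \beta} = X'R^{-1}(y - X\beta - Zu)$ and $\dfrac{\partial \ell(y)}{\partial u} = Z'R^{-1}(y - X\beta - Zu) - G^{-1}u$

setting these to zero and solving yields the Mixed Model Solution:

$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat\beta \\ \hat u \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}$

note (Marginal Model Solution): $\hat u = GZ'V^{-1}(y - X\hat\beta)$ and $\hat\beta = (X'V^{-1}X)^{-}X'V^{-1}y$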

II. LMM Inference – G and R known

Inference based on predictable functions

$K'\beta + M'u$ is "predictable" if $K'\beta$ is estimable (reduces to estimable function if focus on fixed effects only)

1. $\mathrm{Var}[K'\hat\beta + M'(\hat u - u)] = [K'\ M']\; C \begin{bmatrix} K \\ M \end{bmatrix}$ where $C = \begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix}^{-}$

2. Let $L' = [K'\ M']$ and $\theta = \begin{bmatrix} \beta \\ u \end{bmatrix}$. The Wald statistic for tests on $L'\theta$ is $(L'\hat\theta)'[L'CL]^{-1}(L'\hat\theta) \sim \chi^2_{\mathrm{rank}(L)}$

II. LMM Inference – G and R unknown

1. Replace $G$ and $R$ by $\hat G$ and $\hat R$: estimate variance and covariance components

2. Denote $C$ as $\hat C$ with estimated var/cov components

3. "Naive" $\widehat{\mathrm{Var}}[L'(\hat\theta - \theta)] = L'\hat C L$, but $E(L'\hat C L) \ne L'CL$: Kenward-Roger adjustment

4. Approximate F: $\dfrac{\mathrm{Wald}}{\mathrm{rank}(L)} = \dfrac{(L'\hat\theta)'[L'\hat C L]^{-1}(L'\hat\theta)}{\mathrm{rank}(L)} \sim \text{approx. } F_{(\mathrm{rank}(L),\ \nu)}$

$F$ may be biased; $\nu$ often must be approximated

II. LMM: Variance Component Estimation

Several methods
1. For variance-component-only models: use EMS from ANOVA
2. Maximum likelihood
   − problem: biased
3. Restricted maximum likelihood
4. Several computational approaches
   a. Newton-Raphson
   b. Fisher Scoring
   c. EM

What’s Wrong with ML?

An example to illustrate
− SAS for Mixed Models, Data Set 1.5.1
− Incomplete Block design from Cochran & Cox, Experimental Designs, p 456
− 15 treatments
− 15 blocks
− 4 treatments observed per block

C&C Example: ML and two alternatives

Intrablock (fixed block) analysis (equivalent to PROC GLM)

proc glimmix data=cc456;
  class trt bloc;
  model y=trt bloc;

Inter/Intra-block (random block) analysis – default (PROC MIXED default gives same result)

proc glimmix data=cc456;
  class trt bloc;
  model y=trt;
  random bloc;

Inter/Intra-block (random block) analysis – ML (same as PROC MIXED METHOD=ML)

proc glimmix data=cc456 method=mspl;
  class trt bloc;
  model y=trt;
  random bloc;

ML vs Alternative Results: Which is Right?

Intrablock (fixed block): $\hat\sigma^2 = 8.62$

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       31       1.23      0.3012

Intra/inter-block (random block), default: $\hat\sigma_R^2 = 4.65$ (block), $\hat\sigma^2 = 8.56$ (residual)

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       36.2     1.48      0.1676

Intra/inter-block (random block), ML: $\hat\sigma_R^2 = 4.50$ (block), $\hat\sigma^2 = 6.04$ (residual)

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       49.04    2.02      0.0352

Simulation ML or REML

− 1000 simulated data sets using C & C, p 456 design
− $\sigma_B^2/\sigma^2 = 0.5$
− Recorded type I error rate for $F_{trt}$
  − intrablock
  − REML random block
  − ML random block

Variable      N      Mean
fixd_rej05    1000   0.0590000
REML_rej05    1000   0.0610000
ML_rej05      1000   0.2140000

II. LMM with estimated G and R: Bias in std error and test statistics

Kenward & Roger (Biometrics, 1997)

Consider estimable function $K'\beta$

When $V$ unknown, estimates used to obtain $\hat V$

"naive" estimate: $\widehat{\mathrm{Var}}(K'\hat\beta) = K'(X'\hat V^{-1}X)^{-}K$

Using Taylor series expansion, can show

$E[K'(X'\hat V^{-1}X)^{-}K] \approx K'(X'V^{-1}X)^{-}K + \dfrac{1}{2}\sum_{i,j} \mathrm{cov}(\hat\sigma_i, \hat\sigma_j)\, K'\dfrac{\partial^2 (X'V^{-1}X)^{-}}{\partial\sigma_i\,\partial\sigma_j}K$

II. LMM: Degrees of Freedom

Simple Case

model: $y_{ijk} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ijk}$

$b_j \sim N(0, \sigma_B^2)$; $(ab)_{ij} \sim N(0, \sigma_{AB}^2)$; $e_{ijk} \sim N(0, \sigma^2)$

ANOVA
Source   EMS
A        $\sigma^2 + n\sigma_{AB}^2 + Q_A$
B        $\sigma^2 + n\sigma_{AB}^2 + na\,\sigma_B^2$
AB       $\sigma^2 + n\sigma_{AB}^2$
error    $\sigma^2$

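II. Degrees of Freedom (2)

Trt diff: $\alpha_1 - \alpha_2$

$\mathrm{Var}(\hat\alpha_1 - \hat\alpha_2) = \dfrac{2(\sigma^2 + n\sigma_{AB}^2)}{nb}$, estimated by $\dfrac{2\,MS(AB)}{nb}$

denominator d.f. = df(AB)

Trt mean: $\mu + \alpha_i$

$\mathrm{Var}(\hat\mu + \hat\alpha_i) = \dfrac{1}{nb}\left(\sigma^2 + n\sigma_{AB}^2 + n\sigma_B^2\right)$, estimated by $\dfrac{1}{nb}\left[\dfrac{a-1}{a}MS(AB) + \dfrac{1}{a}MS(B)\right]$

d.f. approximated via Satterthwaite's procedure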

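II. Satterthwaite Approximation

for linear combination of MS: $MS^* = \sum_i c_i MS_i$

approximate d.f. for $MS^*$ is $\nu \approx \dfrac{\left(\sum_i c_i MS_i\right)^2}{\sum_i (c_i MS_i)^2 / df_i}$

e.g. $MS^* = \dfrac{a-1}{a}MS(AB) + \dfrac{1}{a}MS(B)$

$\nu \approx \dfrac{\left[\frac{a-1}{a}MS(AB) + \frac{1}{a}MS(B)\right]^2}{\dfrac{\left[\frac{a-1}{a}MS(AB)\right]^2}{df(AB)} + \dfrac{\left[\frac{1}{a}MS(B)\right]^2}{df(B)}}$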
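
The arithmetic of the Satterthwaite formula for a two-term combination $c_1 MS_1 + c_2 MS_2$ can be sketched in a short DATA step. The coefficients, mean squares, and degrees of freedom below are made-up illustration values, not results from any example in this course.

```sas
/* Satterthwaite d.f. for MS* = c1*MS1 + c2*MS2 (all inputs hypothetical) */
data _null_;
  c1 = 2/3;  ms1 = 12.5;  df1 = 14;   /* e.g. an interaction mean square */
  c2 = 1/3;  ms2 = 40.0;  df2 = 7;    /* e.g. a main-effect mean square  */
  ms_star = c1*ms1 + c2*ms2;
  df_satt = ms_star**2 / ( (c1*ms1)**2/df1 + (c2*ms2)**2/df2 );
  put ms_star= df_satt=;              /* writes both values to the SAS log */
run;
```

The approximate d.f. always falls between the smaller of the component d.f. and their sum, which is a quick sanity check on any hand computation.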

14 May 2007 SSP Core Facility 75

Department of Statistics

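II. Satterthwaite Approximation in LMM

For vector K (e.g. treatment contrast):

Approximation: $\nu \approx \dfrac{2\left[E\left(K'(X'V^{-1}X)^{-}K\right)\right]^2}{\mathrm{Var}\left(K'(X'\hat V^{-1}X)^{-}K\right)}$ or $\dfrac{2\left(K'(X'\hat V^{-1}X)^{-}K\right)^2}{\mathrm{Var}\left(K'(X'\hat V^{-1}X)^{-}K\right)}$

Approximate $\mathrm{Var}\left(K'(X'\hat V^{-1}X)^{-}K\right)$ by $g'Ag$, where

− $g = \dfrac{\partial\left(K'(X'V^{-1}X)^{-}K\right)}{\partial\sigma}$, with $\sigma$ the vector of (co)variance components

− $A = 2\left\{\mathrm{trace}\left(P\dfrac{\partial V}{\partial\sigma_i}P\dfrac{\partial V}{\partial\sigma_j}\right)\right\}^{-1}$

− $V = ZGZ' + R$; $P = V^{-1} - V^{-1}XCX'V^{-1}$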

II. GLMM Estimation

GLMM is model of $E(y \mid u)$

Link form: $g[E(y \mid u)] = \eta = X\beta + Zu$

Inverse link form: $E(y \mid u) = h(X\beta + Zu)$

More general expression of distribution of $y \mid u$:

$\mathrm{Var}(y \mid u) = R = R_\mu^{1/2} A R_\mu^{1/2}$

$R_\mu^{1/2} = \mathrm{diag}[V(\mu)^{1/2}]$; $A$ is "working correlation matrix"

Estimation: as with LMM, may choose to focus on
1. $\beta$ only: GLS equations in LMM; Generalized Estimating Equations with GLMM
2. $\beta$ and $u$: several approaches

II. Working Correlation

Recall Gotway & Stroup (1997) Hessian Fly Example

1 2 5 6 1 5 2 6

3 4 7 8 9 13 10 14

9 10 13 14 3 7 4 8

11 12 15 16 11 15 12 16

1 6 2 5 1 14 13 2

11 16 12 15 7 12 11 8

1 14 13 10 5 10 9 6

3 8 7 4 3 16 15 4

Gotway and Stroup considered spatial variation among e.u.

proc glimmix;
  class block entry;
  model y/n=entry;
  random intercept / subject=block;
  random _residual_ / type=sp(sph)(row col) subject=block;

− MODEL sets up Binomial GLM, Logit link
− RANDOM _RESIDUAL_ sets up a working correlation based on SPHERICAL semivariogram

II. Marginal (PA) vs Subject-Specific Inference

Marginal Mean: $E(y)$

Conditional Mean: $E(y \mid u)$

Note: $E(y) = E[E(y \mid u)] = E[h(X\beta + Zu)]$

In general, cannot be further simplified

Example: log link, $u \sim$ normal

$E(y \mid u) = \exp(X\beta + Zu)$: SS (true GLMM)

$E(y) = E[\exp(X\beta + Zu)] = \exp(X\beta)\,M_u(Z)$

$M_u(Z)$ is moment generating function of $u$ evaluated at $Z$

$E(y) = \exp(X\beta)\exp(\sigma_u^2/2)$, so $\log E(y) = X\beta + \sigma_u^2/2$: Population Averaged (PA)
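
The step from the moment generating function to the closed form can be made explicit. The sketch below assumes the simplest case, a single scalar random intercept $u \sim N(0, \sigma_u^2)$ with $Z = 1$; the symbol $\eta$ for the fixed-effects part $X\beta$ is introduced here only for brevity.

```latex
% Assumption: one scalar random effect u ~ N(0, sigma_u^2), Z = 1, eta = X beta
\begin{align*}
E(y \mid u) &= \exp(\eta + u) \\
E(y) &= E_u\!\left[\exp(\eta + u)\right]
      = \exp(\eta)\, E_u\!\left[e^{u}\right]
      = \exp(\eta)\, M_u(1) \\
M_u(t) &= \exp\!\left(\tfrac{1}{2}\sigma_u^2 t^2\right)
      \quad\Longrightarrow\quad
      M_u(1) = \exp\!\left(\sigma_u^2 / 2\right) \\
\log E(y) &= \eta + \sigma_u^2 / 2
\end{align*}
```

So on the log scale the marginal (PA) mean exceeds the conditional (SS) mean at $u = 0$ by $\sigma_u^2/2$, and the two coincide only when $\sigma_u^2 = 0$.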

14 May 2007 SSP Core Facility 79

Department of Statistics

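II. More on PA (marginal) vs. SS

Probit-normal model:

$\Pr(y = 1 \mid u) = \Phi(X\beta + Zu)$; $u \sim N(0, G)$

can show $E(y) = \Phi\left(\dfrac{X\beta}{\sqrt{1 + Z'GZ}}\right)$

in LMM, the models

$y = X\beta + Zu + e$; $u \sim N(0, I\sigma_u^2)$; $e \sim N(0, I\sigma_e^2)$

and

$y = X\beta + e$; $e \sim N(0, R)$; $R = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}$

are equivalent. However, in GLMM, they are not. They yield different estimates, std. errors, etc.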

II. Estimation of GLMM

− model $E(y \mid u)$
− inverse link: $E(y \mid u) = h(X\beta + Zu)$
− link: $g[E(y \mid u)] = \eta = X\beta + Zu$
− to estimate $\beta$ and $u$, need to evaluate $f(y)$, $f(y \mid u)$
  − approximate, e.g. by Taylor series expansion: Penalized Quasi-Likelihood (SAS %GLIMMIX); SAS PROC GLIMMIX (next slides)
  − numerically integrate joint density: Gauss-Hermite Quadrature (PROC NLMIXED)
  − stochastically evaluate integral: Markov chain Monte Carlo (WinBUGS; not in this course)
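
For contrast with the linearization approach, here is a hedged sketch of the integral-approximation route for a random-intercept logistic model in PROC NLMIXED. The data set `clinics`, the 0/1 treatment indicator `x`, the event count `fav`, the trial count `n`, and the starting values are all hypothetical.

```sas
/* Data should be sorted by the SUBJECT= variable before running NLMIXED */
proc nlmixed data=clinics qpoints=10;       /* 10-point Gauss-Hermite quadrature */
  parms b0=0 b1=0 s2u=1;                    /* starting values (assumed) */
  eta = b0 + b1*x + u;                      /* linear predictor with random intercept u */
  p   = 1/(1 + exp(-eta));                  /* inverse logit link */
  model fav ~ binomial(n, p);
  random u ~ normal(0, s2u) subject=clinic;
run;
```

Here the link and linear predictor are written out with programming statements, whereas PROC GLIMMIX builds them from the MODEL and RANDOM statements.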

14 May 2007 SSP Core Facility 81

Department of Statistics

II. Computational Method Comparison

GEE
− Computationally easy
− Meaning of marginal results in GLM?

Linearized GLMM (current PROC GLIMMIX)
− uses familiar LMM analogs (but many are ad hoc & need further research)
− allows considerable R-side flexibility
− adequate for many GLMM; breaks down for certain cases (binary data)

Integral Approximation (PROC NLMIXED)
− better approximation than Linearized GLMM
− BUT: ML only, simple G-side models only, no R-side

Laplace
− computationally less demanding than integral approximation but often “accurate enough”; same limitations as integral approximations

MCMC
− simple models only; limited & temperamental software
− but in extreme cases, only way to get accurate results

Modeling Considerations

14 May 2007 SSP Core Facility 83

Department of Statistics

Basic Parts of SAS Program

DATA Step

data your_choice_of_name;
  input list of variables;   /* comment: $ after alphameric var */
  datalines;
data – one line / obs, one column per variable
;

PROC Step

proc glimmix data=demo1;
  class classroom trt time;
  model sc=trt time trt*time / dist=normal ddfm=kr;
  random classroom(trt);
  lsmeans trt*time;
  ods output lsmeans=lsm;
run;

III. Modeling Considerations

Overdispersion

Marginal (PA) vs Conditional (SS) models

“Data” vs “Model” Scale


III. Model Considerations

Variance Model & Overdispersion
Choice of Link Function
Choice of Distribution
Choice of Model Effects
Correlated Errors?

Any of the above could show up as “overdispersion”

III. GLMM: Model Considerations

Common dilemma
Design, e.g. like “Hessian fly” example
BINOMIAL data
Recover interblock information - BLOCK random

1 2 5 6 1 5 2 6

3 4 7 8 9 13 10 14

9 10 13 14 3 7 4 8

11 12 15 16 11 15 12 16

1 6 2 5 1 14 13 2

11 16 12 15 7 12 11 8

1 14 13 10 5 10 9 6

3 8 7 4 3 16 15 4

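Model (Logit GLMM), with $r_i$ the random block effect and $\tau_j$ the entry effect:

$\pi_{ij} = \dfrac{\exp(\eta + r_i + \tau_j)}{1 + \exp(\eta + r_i + \tau_j)}$

or equivalently $\log\!\left(\dfrac{\pi_{ij}}{1 - \pi_{ij}}\right) = \eta + r_i + \tau_j$

Analysis reveals that the data are overdispersed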

III. Hessian Fly Example

proc glimmix data=HessianFly;
  class block entry;
  model y/n = entry;
  random block;
run;

Evidence of overdispersion when Gener. Chi-Square / DF >> 1

Fit Statistics

-2 Res Log Pseudo-Likelihood   182.21
Generalized Chi-Square         107.96
Gener. Chi-Square / DF           2.25
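
The ratio in the last line of the table can be reproduced by hand. Assuming the data set has 64 observations (16 entries in each of 4 blocks, which is an inference from the design and the denominator DF reported later, not something stated on the slide), the residual degrees of freedom are $64 - 16 = 48$:

```latex
\[
\frac{\text{Generalized Chi-Square}}{\text{DF}} = \frac{107.96}{48} \approx 2.25
\]
```

A value near 1 would be consistent with the binomial variance; here the observed variation is roughly twice what the model presumes.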


III. Overdispersion

Observed variance > variance under presumed model

Symptom: Deviance/DFE or chi-square/DFE >> 1

Uniquely a GLM / GLMM issue

−not a consideration with LM, LMM

−y|u ~ normal implies variance not a function of mean

When is there an issue

− If Var(y) = f[E(y)] and

−using scale adjustment requires unrealistic assumptions


III. Common fix for Overdispersion

Multiply variance by scale parameter. Here: variance function $\phi\,\pi(1 - \pi)$

proc glimmix data=HessianFly;
  class block entry;
  model y/n= entry;
  random block;
  random _residual_;   /* estimates the scale parameter phi */

Issue: not a true likelihood

Covariance Parameter Estimates (with scale parameter)

Cov Parm        Subject   Estimate   Standard Error
Intercept       block     0          .
Residual (VC)             2.2668     0.4627

vs. w/o $\hat\phi$

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   block     0.01116    0.03116

Impact of Scale Parameter on Inference

with scale parameter adjustment

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

no scale parameter

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       6.90      <.0001

failure to account for overdispersion tends to increase type I error rate

but is this the best way to address the problem?
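
The two F values are tied together by the scale parameter. As a rough check, using the estimate $\hat\phi = 2.2668$ reported for the `random _residual_` fit of these data, dividing the larger F by $\hat\phi$ approximately recovers the smaller one:

```latex
\[
\frac{6.90}{2.2668} \approx 3.04 \approx 3.03
\]
```

The match is only approximate because the block variance estimate also changes between the two fits (0.01116 without the scale parameter, 0 with it).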


III. Mean – Variance Overdispersion Models

$\mathrm{Var}(y) = f(\mu, \phi)$

No scale parameter: e.g. $\mu(1 - \mu)$, $\mu$
   binomial, poisson

Nonlinear scale parameter: e.g. $\mu + \phi\mu^2$, $\dfrac{\mu(1 - \mu)}{1 + \phi}$
   negative binomial, gen. poisson, beta

Linear scale parameter
   gamma, inverse gaussian

No mean parameter
   normal

III. Marginal or Conditional Formulation

For many models (notably LMM) there are equivalent forms
− conditional (mixed, SS) model
− marginal (PA) model
− lead to the same marginal log-likelihood

Distinction results from
− G-side model; random model effects
− R-side model; marginal model

III. Example: variance component (G-side) vs. Compound symmetry (R-side)

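$y_{ij} = \mu + r_i + \tau_j + e_{ij}$

$r_i \ \text{i.i.d.}\ N(0, \sigma_R^2); \quad e_{ij} \ \text{i.i.d.}\ N(0, \sigma^2)$

$\mathrm{Var}(Y_i) = \sigma_R^2 J + \sigma^2 I = \begin{bmatrix} \sigma_R^2 + \sigma^2 & \sigma_R^2 & \cdots & \sigma_R^2 \\ \sigma_R^2 & \sigma_R^2 + \sigma^2 & \cdots & \sigma_R^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_R^2 & \sigma_R^2 & \cdots & \sigma_R^2 + \sigma^2 \end{bmatrix}$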

III. Compound Symmetry Equivalent

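Let $\sigma_C^2 = \sigma_R^2 + \sigma^2$ and $\rho = \dfrac{\sigma_R^2}{\sigma_R^2 + \sigma^2}$

Model: $y_{ij} = \mu + \tau_j + E_{ij}$

$\mathrm{Var}(E_{ij}) = \sigma_C^2; \quad \mathrm{Corr}(E_{ij}, E_{kl}) = \rho$ if $i = k$ (same block), $0$ otherwise

$\mathrm{Var}(Y_i) = \sigma_C^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}$

Models equivalent if $\rho \ge 0$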
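
A small numerical sketch of the mapping between the two parameterizations, with variance components $\sigma_R^2 = 2$ and $\sigma^2 = 3$ invented purely for illustration:

```latex
\begin{align*}
\sigma_C^2 &= \sigma_R^2 + \sigma^2 = 2 + 3 = 5 \\
\rho &= \frac{\sigma_R^2}{\sigma_R^2 + \sigma^2} = \frac{2}{5} = 0.4
\end{align*}
```

Going the other way, a compound symmetry fit with $\rho < 0$ would require $\sigma_R^2 = \rho\,\sigma_C^2 < 0$, which a variance component cannot represent. That is where the condition $\rho \ge 0$ comes from.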


III. G-side / R-side

G-side (the next two programs specify the same model):

proc glimmix; class block entry; model y/n=entry; random block;

proc glimmix; class block entry; model y/n=entry; random intercept / subject=block;

R-side model:

proc glimmix; class block entry; model y/n=entry; random _residual_ / type=CS subject=block;

proc mixed; class block entry; model y=entry; repeated / type=CS subject=block;

III. Variance Component vs CS in GLMM

Variance component model is GLMM
CS model is GEE
They are not equivalent

Conditional model: $\text{logit}(\pi_{ij}) = \eta + r_i + \tau_j$

$y_{ij} \mid u \sim \text{Binomial}$, with $\pi_{ij} = \dfrac{\exp(\eta + r_i + \tau_j)}{1 + \exp(\eta + r_i + \tau_j)}$

marginal distribution is $p(y_{ij}) = \int p(y_{ij} \mid u_i)\, p(u_i)\, du_i$ (here $u_i = r_i$)

Marginal model: $\text{logit}(\pi_{ij}) = \eta + \tau_j$ with working correlation matrix defined by CS form

$y_{ij}$ is NOT Binomial, merely borrow Binomial-like quasi-likelihood form

Does such a distribution actually exist?

III. Conditional vs. Marginal Results

Marginal (R-side, CS)

Fit Statistics

Gener. Chi-Square / DF   2.30

Covariance Parameter Estimates

Cov Parm   Subject   Estimate
CS         block     -0.03247
Residual             2.2992

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       2.99      0.0023

Conditional (G-side block effect plus scale parameter)

Fit Statistics

Gener. Chi-Square / DF   2.27

Covariance Parameter Estimates

Cov Parm        Subject   Estimate
Intercept       block     0
Residual (VC)             2.2668

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

which is right?
• fit statistic?
• can you simulate data using mechanism implied by model?

III. Marginal or Conditional?

How to choose?
− Conditional: G-side; Marginal: R-side
− Fit statistic? (may help; may deceive)

General recommendation
− G-side formulation preferred for non-normal data
− G-side effects operate inside the link function & hence always lead to valid conditional & marginal distributions
− R-side effects operate outside the link function
− for non-normal data, models implied by R-side effects may be vacuous

III. Impact of Model Effects

Back to Hessian Fly Data
Incomplete Block Design
Try more appropriate model

proc glimmix;
  class inc_block entry;
  model y/n=entry;
  random intercept / subject=inc_block;
run;

Fit Statistics

Gener. Chi-Square / DF   1.41

Covariance Parameter Estimates

Cov Parm    Subject     Estimate
Intercept   inc_block   0.4971

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
entry    15       33       6.33      <.0001

III. Inference

After model fit & estimation, inference begins

Also want at least some of following comparisons among groups (trt, entry...)
− test hypotheses
− obtain confidence intervals
− obtain predictions
− further model checking

III. Scale issue for GLM, GLMM

For GLM, GLMM there are two “natural scales”
− linear (or model) scale (e.g. logit)
− data scale

May be other scales, depending on context
− odds
− odds ratio

III. Choosing the Scale

Example: Hessian Fly – binomial dist, logit link
Data: measured as 0/1; per e.u. as Y/N
Main focus: entry effect on P{indiv resp = 1}

Link: $\eta_{ij} = \log\!\left(\dfrac{\pi_{ij}}{1 - \pi_{ij}}\right) = \eta + r_i + \tau_j$

Inverse Link: $\hat\pi_{ij} = \dfrac{\exp(\hat\eta_{ij})}{1 + \exp(\hat\eta_{ij})}$

III. Scale and Inference

Main tool of inference: estimable functions

e.g. entry "LS Mean" $\hat\eta + \hat\tau_j$ (can denote $\hat\eta_j$); entry difference $\hat\tau_j - \hat\tau_{j'}$

These are estimated on the "linear" or "model" scale

Main focus of inference: on data scale

e.g. $\hat P\{\text{resp} = 1 \mid \text{entry } j\}$; entry difference between probabilities

Require "inverse linking": $\hat\pi_j = \dfrac{\exp(\hat\eta_j)}{1 + \exp(\hat\eta_j)}$; difference $\hat\pi_j - \hat\pi_{j'}$

III. Inverse Linking

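Estimation occurs on model scale
But reporting typically must occur on data scale

Estimate: $\hat\eta = k'\hat\beta$

Std error: $\text{s.e.}(\hat\eta) = \sqrt{k'\,\mathrm{Var}(\hat\beta)\,k}$

Confidence interval: $\hat\eta \pm z \times \text{s.e.}(\hat\eta)$

Inverse linked estimate: $h(\hat\eta)$, e.g. $\hat\pi = \dfrac{\exp(\hat\eta)}{1 + \exp(\hat\eta)}$

Inverse linked std error (“delta” rule): $\text{s.e.}(\hat\pi) = \left|\dfrac{\partial h(\hat\eta)}{\partial \eta}\right| \times \text{s.e.}(\hat\eta)$

Inverse linked confidence interval: $\left(h(\text{LowerB}),\ h(\text{UpperB})\right)$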

III. Model & Data Scale – Hessian Fly Example

Solutions for Fixed Effects

Effect      entry   Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept           -1.9057    0.4886           15   -3.90     0.0014
entry       1       3.8001     0.6327           33   6.01      <.0001
entry       2       3.4821     0.6186           33   5.63      <.0001

Estimates

Columns Estimate through Upper: linear or model scale. Columns Mean through Upper Mean: data scale.

Label            Estimate   Standard Error   Lower     Upper    Mean     Standard Error Mean   Lower Mean   Upper Mean
entry 1          1.8944     0.4608           0.9568    2.8319   0.8693   0.05237               0.7225       0.9444
entry 2          1.5765     0.4321           0.6974    2.4555   0.8287   0.06133               0.6676       0.9210
diff entry 1-2   0.3179     0.5793           -0.8607   1.4965   0.5788   0.1412                0.2972       0.8171

which of these make NO sense?
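
The data-scale columns can be reproduced from the model-scale ones. The DATA step below is a minimal sketch, not GLIMMIX's internal code: it applies the inverse logit and the delta rule to the three model-scale estimates in the table. The data set and variable names (`ilink_check`, `eta`, `se_eta`, `p`, `se_p`) are made up for this illustration.

```sas
/* Sketch: inverse link and delta-rule std. error for the logit link.  */
/* Inputs are the model-scale estimates and std. errors from the       */
/* Estimates table; names are hypothetical.                            */
data ilink_check;
  input label $ 1-14 eta se_eta;
  p    = exp(eta) / (1 + exp(eta));   /* inverse link h(eta)            */
  se_p = p * (1 - p) * se_eta;        /* delta rule: |dh/d eta| * s.e.  */
  datalines;
entry 1        1.8944 0.4608
entry 2        1.5765 0.4321
diff entry 1-2 0.3179 0.5793
;
run;

proc print data=ilink_check noobs;
run;
```

Note what the third row does: it inverse-links the difference 0.3179, giving the 0.5788 shown in the table. The difference between the two entry probabilities is 0.8693 − 0.8287, about 0.04, which is a different quantity altogether.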


on to GLIMMIX


IV. GLIMMIX Syntax

SAS software for GLMs & Mixed models

Basic GLIMMIX syntax

Similarities & Differences vs existing SAS Procs

New features


IV. SAS Software for Linear Models

LM
− Proc GLM, MIXED
− Proc GLIMMIX

GLM
− Proc GENMOD, Proc NLMIXED
− Proc GLIMMIX

LMM
− Proc MIXED
− Proc GLIMMIX

GLMM
− Proc GLIMMIX, Proc NLMIXED

IV. PROC GLIMMIX Syntax

What’s familiar (from MIXED & GENMOD)
− CLASS
− MODEL
− DIST and LINK options in MODEL (like GENMOD)
− RANDOM (for G-side)
− ESTIMATE, CONTRAST, LSMEANS
− ODS

What’s new or different
− RANDOM _RESIDUAL_ (replaces REPEATED for R-side)
− LSMESTIMATE
− new options in LSMEANS (e.g. better options for factorial exp)
− NLOPTIONS
− Model diagnostics

IV. Relation between GLMM Structure and GLIMMIX Code

GLMM:

$y \mid u \sim \text{dist}(\mu, R); \quad \mathrm{Var}(u) = G$

$g(\mu \mid u) = X\beta + Zu$

$\mathrm{Var}(y \mid u) = V^{1/2} P V^{1/2}$

proc glimmix;
  class variables;
  model <resp>=<fixed effects> / dist= link= ;
  random <g-side effects> / <options>;
  random _residual_ / type= subject= ;
run;

IV. NLOPTIONS Statement

New Statement in GLIMMIX
Controls Optimization technique, Line Search Method, number of Iterations, etc

proc glimmix;
  class id a b;
  model y=a b a*b;
  random _residual_ / type=cs subject=id(a);
  nloptions tech=nrridg maxiter=100;

TECH=NRRIDG causes GLIMMIX to use MIXED computing algorithm (good for comparison...)

IV. Programming Statements

Similar to GENMOD, NLIN, NLMIXED
GLIMMIX supports statements using DATA step syntax
Use to transform variables, define quantities to output, user-defined link, variance, etc.

For example....

proc glimmix;
  class block entry;
  pct=y/n;
  model pct=entry;
  random intercept / subject=block;

IV. Some GLIMMIX Defaults Useful to Know

In MODEL statement
− response Y= : NORMAL distribution & IDENTITY link
− response Y/N= : BINOMIAL distribution and LOGIT link

For distributions without scale parameter in variance function (e.g. Binomial, Poisson)
− no scale parameter assumed (unlike %GLIMMIX macro)
− obtain scale parameter with RANDOM _RESIDUAL_

Optimization method automatically matched based on DISTRIBUTION & LINK

IV. Estimation Methods in PROC GLIMMIX

Defaults depend on model, distribution, and link
May be altered with METHOD= option
− in PROC statement

METHOD= options
− variations on pseudo-likelihood

Option   Objective function               Model
RSPL     Restricted obj fct (like REML)   subject specific (conditional or mixed) model
RMPL     Restricted obj fct (like REML)   population averaged (marginal) model
MSPL     Unrestricted obj fct (like ML)   subject specific (conditional or mixed) model
MMPL     Unrestricted obj fct (like ML)   population averaged (marginal) model

IV. Defaults & Methods (continued)

GLMM Default Method is RSPL
For LMM, this is REML
− GLIMMIX uses different algorithm than MIXED; TECH=NRRIDG uses MIXED algorithm
− you can get slightly different numbers with MIXED/GLIMMIX

METHOD=MSPL yields ML estimates
Methods appear in literature as MPL, PQL
Gaussian adaptive quadrature and Laplace algorithms will be added to V 9.2
− not available yet & not discussed here
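
As a sketch of how the estimation method is switched, the two programs below reuse the HessianFly example from the earlier slides. The only difference between them is the METHOD= option on the PROC statement; everything else is taken from the earlier example and is assumed to match that data set.

```sas
/* Restricted, subject-specific pseudo-likelihood (the GLMM default) */
proc glimmix data=HessianFly method=rspl;
  class block entry;
  model y/n = entry;
  random block;
run;

/* Unrestricted (ML-like), subject-specific pseudo-likelihood */
proc glimmix data=HessianFly method=mspl;
  class block entry;
  model y/n = entry;
  random block;
run;
```

Comparing the covariance parameter estimates from the two runs is one way to see the REML-like versus ML-like distinction in practice.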


IV. Examples

Poisson regression, log link, change variance function:

proc glimmix; class id; _variance_=_mu_*_mu_; model y=x / dist=poisson; run;

Poisson regression, log link, add scale parameter:

proc glimmix; class id; model y=x / dist=poisson; random _residual_; run;

Poisson regression, log link:

proc glimmix; class id; model y=x / dist=poisson; run;

IV. “GLM-mode” vs “GLMM-mode”

Use following trick to get GLM (GENMOD) type model via pseudo-likelihood

“GLM-mode”, max likelihood:

proc glimmix; class id; model y=x / dist=poisson; random _residual_;

“GLMM-mode”, pseudo likelihood (this is a GEE with indep working corr):

proc glimmix; class id; model y=x / dist=poisson; random _residual_ / subject=id;

IV. Distributions supported by GLIMMIX

Continuous
− Beta
− Normal
− Lognormal
− Gamma
− Exponential
− Inverse Gaussian
− Shifted T

Discrete
− Binary
− Binomial
− Poisson
− Geometric
− Negative Binomial
− Multinomial (Nominal, Ordinal)

IV. MIXED to GLIMMIX – R-side

MIXED:

proc mixed; class loc id trt time; model y=trt | time; random loc; repeated / type=ar(1) subject=id(loc);

GLIMMIX:

proc glimmix; class loc id trt time; model y=trt | time; random intercept / subject=loc; random _residual_ / type=ar(1) subject=id(loc);

When you use GLIMMIX, you will notice it is much fussier about the SUBJECT= option when nested subject structure is present (MIXED is more likely to let you get away with ignoring SUBJECT=).

IV. More on R-side

proc mixed; class loc id trt time; model y=trt | time; random loc; repeated time / type=ar(1) subject=id(loc);

proc glimmix; class loc id trt time; model y=trt | time; random intercept / subject=loc; random time / type=ar(1) subject=id(loc) residual;

** vs random _residual_ / type=ar(1) subject=id(loc);

RANDOM time / ... RESIDUAL is an alternative form of RANDOM _RESIDUAL_, e.g. when time points are missing, unsorted, etc.

IV. MIXED to GLIMMIX - Estimate

MIXED: single row ESTIMATE statements

proc mixed;
  class trt;
  model y=trt a x trt*a trt*x;
  estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3;
  estimate '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3;
  estimate '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3;

GLIMMIX: multi-row with multiplicity adjustment

proc glimmix;
  class trt;
  model y=trt a x trt*a trt*x;
  estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3,
           '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3,
           '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3 / adjust=scheffe;

IV. MIXED vs. GLIMMIX - LSMEANS

Example: Factorial

PROC MIXED;
  class A B;
  model y=A|B;
  lsmeans A B/diff;
  lsmeans A*B/diff slice=(A B);

− DIFF gives you table of all possible differences
− SLICE tests – but does not estimate – simple effects A given B, vice versa

PROC GLIMMIX;
  class A B;
  model y=A|B;
  lsmeans A B/diff lines;
  lsmeans A*B / slice=(A B) slicediff=(A B);

− LINES gives multiple range display users love
− SLICEDIFF restricts A*B diffs to actual simple effects, e.g. A1-A2|Bj

IV. GLIMMIX – LSMEANS (1) Main Effects

proc glimmix data=AxB_example;
  class block A B;
  model y=A|B/ddfm=satterth;
  random block block*B;
  lsmeans A B/diff lines;
  lsmeans A*B/slicediff=(A B);
run;

B Least Squares Means

B   Estimate   Standard Error   DF      t Value   Pr > |t|
1   18.5300    1.3226           13.69   14.01     <.0001
2   26.5200    1.3226           13.69   20.05     <.0001
4   28.2800    1.3226           13.69   21.38     <.0001
8   25.3000    1.3226           13.69   19.13     <.0001

T Grouping for B Least Squares Means
LS-means with the same letter are not significantly different.

B   Estimate   Group
4   28.2800    A
2   26.5200    A
8   25.3000    A
1   18.5300    B

IV. GLIMMIX – LSMEANS (2) Simple Effects

proc glimmix data=AxB_example; class block A B; model y=A|B/ddfm=satterth; random block block*B; lsmeans A B/diff lines; lsmeans A*B/slicediff=(A B);run;

Simple Effect Comparisons of A*B Least Squares Means By B

Simple Effect Level   A   _A   Estimate   Standard Error   DF   t Value   Pr > |t|
B 1                   r   s    2.9400     1.3144           16   2.24      0.0399
B 2                   r   s    2.6400     1.3144           16   2.01      0.0618
B 4                   r   s    -0.2000    1.3144           16   -0.15     0.8810
B 8                   r   s    -1.0000    1.3144           16   -0.76     0.4578

A*B Least Squares Means

A   B   Estimate   Standard Error
r   1   20.0000    1.4769
r   2   27.8400    1.4769
r   4   28.1800    1.4769
r   8   24.8000    1.4769
s   1   17.0600    1.4769
s   2   25.2000    1.4769
s   4   28.3800    1.4769
s   8   25.8000    1.4769

IV. GLIMMIX – LSMEANS (3)

lsmeans a*b / diff; gave you this

Differences of A*B Least Squares Means

A B _A _B Estimate Standard Error DF t Value Pr > |t|

r 1 r 2 -7.8400 1.8796 19.49 -4.17 0.0005

r 1 r 4 -8.1800 1.8796 19.49 -4.35 0.0003

r 1 r 8 -4.8000 1.8796 19.49 -2.55 0.0192

r 1 s 1 2.9400 1.3144 16 2.24 0.0399

r 1 s 2 -5.2000 1.8796 19.49 -2.77 0.0121

r 1 s 4 -8.3800 1.8796 19.49 -4.46 0.0003

r 1 s 8 -5.8000 1.8796 19.49 -3.09 0.0060

r 2 r 4 -0.3400 1.8796 19.49 -0.18 0.8583

r 2 r 8 3.0400 1.8796 19.49 1.62 0.1219

r 2 s 1 10.7800 1.8796 19.49 5.74 <.0001

etc


IV. GLIMMIX -- LSMESTIMATE

Example: Simple Effect in 2-Factor Factorial

Model: $y_{ijk} = \mu_{ij} + e_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$

Simple Effect, e.g. A|B: $\mu_{ij} - \mu_{i'j} = \alpha_i - \alpha_{i'} + (\alpha\beta)_{ij} - (\alpha\beta)_{i'j}$

estimate 'A|B' a*b 1 0 0 0 -1 0 0 0;              /* not estimable */
estimate 'A|B' a 1 -1 a*b 1 0 0 0 -1 0 0 0;       /* must write */
lsmestimate a*b 'A|B' 1 0 0 0 -1 0 0 0;           /* new GLIMMIX alternative */

Defined on $\mu_{ij}$, not on model effects

Allows multiple LSMESTIMATES & ADJUST= for multiplicity
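
The multi-row form mentioned above can be sketched as follows. This is an illustration built on the AxB_example program from the LSMEANS slides, assuming A has two levels (r, s) and B has four (1, 2, 4, 8), so that the A*B cell means are ordered r1, r2, r4, r8, s1, s2, s4, s8. The labels are made up, and the coefficient order should be checked against the A*B LS-means table before relying on it.

```sas
/* Sketch: simple effect of A (r minus s) at each level of B,        */
/* with a Bonferroni adjustment across the four rows.                */
/* Assumes cell-mean order r1 r2 r4 r8 s1 s2 s4 s8.                  */
proc glimmix data=AxB_example;
  class block A B;
  model y = A|B / ddfm=satterth;
  random block block*B;
  lsmestimate A*B 'A r-s at B=1' 1 0 0 0 -1  0  0  0,
                  'A r-s at B=2' 0 1 0 0  0 -1  0  0,
                  'A r-s at B=4' 0 0 1 0  0  0 -1  0,
                  'A r-s at B=8' 0 0 0 1  0  0  0 -1 / adjust=bon cl;
run;
```

If the ordering assumption holds, the four estimates should agree with the "By B" simple effect comparisons that SLICEDIFF produced for the same data.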


IV. ODS Graphics With GLIMMIX

Not available with MIXED

ods html;
ods graphics on;
ods select MeanPlot;
proc glimmix data=AxB_example;
  class block A B;
  model y=A|B/ddfm=satterth;
  random block block*B;
  lsmeans A*B/plot=MeanPlot(sliceby=A join cl);
run;
ods graphics off;
ods html close;
run;

Factorial Treatment Design

Treatment Design vs Experiment (or study) Design

Factorial is type of treatment design
Factor A, a levels; Factor B, b levels; etc
Main inference tools:
− simple effects; e.g. method effect | variety j
− interaction; i.e. simple effects equal for all j
− main effects

Model: $y_{ijk} = \mu_{ij} + E_{ijk}$

$y_{ijk}$ = $k^{th}$ obs on $ij^{th}$ A × B
$\mu_{ij}$ = $ij^{th}$ A × B mean
$E_{ijk}$ is generic random structure; specific form depends on design

Simple effect:
A | B$_j$: $\mu_{ij} - \mu_{i'j}$
B | A$_i$: $\mu_{ij} - \mu_{ij'}$

Interaction:
equal simple effects ⇔ no interaction, e.g. $\mu_{ij} - \mu_{i'j} = \mu_{ij'} - \mu_{i'j'}$

Main effect: $\bar\mu_{i\cdot} - \bar\mu_{i'\cdot}$ or $\bar\mu_{\cdot j} - \bar\mu_{\cdot j'}$

GLIMMIX Features

Can estimate / test
− simple effects
− main effect
− depending on which is appropriate

ODS graphics can graph / plot effects of interest
SLICE can focus on simple effects in presence of interaction
SLICEDIFF can estimate simple effects of interest

Modeling & Design


But My Study is not a Designed Experiment!

Comparative Study: any study whose purpose is to compare treatments or conditions (includes assessing change over time). Includes “quasi-experiments” & surveys with comparative objectives + designed experiments. Design principles apply to all!

Most modeling issues are study design issues

Most modeling errors result from poor understanding of design principles


If you are modeling, you need to understand design principles!!


Key Terms in Design

Treatment Design: factors and levels & how they are structured in the study. E.g. factorial, planned obs over time

Experiment Design: Organization of experimental units (e.g. into matched pairs, blocks, strata, clusters); plan by which they are assigned to treatment levels.

Experimental Unit (e.u.): Smallest entity to which treatment levels (or treatment combinations) are independently assigned. E.U.s are legitimate units of replication

Sampling Unit: Unit on which measurement is taken. May be e.u. itself or subset of e.u. A.k.a. pseudo-replicate

Pseudo-replication: use of S.U.s as units of replication; common form of inappropriate design & analysis

Factorial & Experiment Designs

idea: experimental unit is smallest entity to which treatment level independently applied
e.u. may be different size for different factors
e.g. from SAS for Mixed Models, Section 4.6
− 2 type × 3 dose example
− dose applied to cage; type to animal in cage
− e.u. for dose: cage with 2 animals
− e.u. for type (and dose × type): animal
− split-plot
many variations (including repeated measures)

Adding to Model

[Figure: school → classroom → students hierarchy, drawn once for each treatment (Participate in Prof Devel; Do Not Participate), with curriculum (exp, std) assigned within each school]

V. Factorial Treatment Designs

Basic Features

Come in Many (many, many) design forms

Experiment design & “quasi-experiment” or survey “study design”

−key to deciding what’s random & what’s fixed

−non-mixed (LM and GLM only) software is UNACCEPTABLE for these types of problems

Includes repeated measures (change... growth)

Normal and non-normal data


Type x Dose Design

[Figure: layout with rows Dose 1, Dose 2, Dose 3, each holding a pair of Type cells]

or... Dose = Professional Development Trt; Type = Curriculum

Figure 4.1 Possible design layouts for 2² factorial experiment (from SAS for Mixed Models)

Treatment codes: A1B1, A1B2, A2B1, A2B2

Treatment design: 2 x 2 factorial
Experiment design: many, many variations. Here are 7 (seven):

a. Completely Randomized
b. Randomized complete block
c. Row-Column (Latin Square)
d. Split-plot 1, whole plot completely randomized

[Figure: layouts a–d, drawn on Blk 1–4 and row 1–4 × col 1–4 grids]

e. Split-plot 2, whole plot in randomized complete blocks
f. Split-block, a.k.a. strip-split-plot
g. Split-plot 3, whole plot in row-column (2 Latin squares)

[Figure: layouts e–g, drawn on Blk 1–4 and row 1–4 × col 1–4 grids]

Even with 2 x 2 factorial, these seven are not all: we’re just getting started!

Split Block Example

[Figure: microchip wafer; Side (L, R) crossed with Position (same meaning both sides)]

Choosing right model – step 1

What is the experimental unit?

figure   4.1.a     4.1.b     4.1.c     4.1.d           4.1.e            4.1.f         4.1.g
design   CRD       RCB       LS        split plot CR   split plot RCB   split-block   split-plot LS
block?   no        yes       row col   no              yes              yes           row col
A        eu(A*B)   blk*A*B   row*col   eu(A)           blk*A            blk*A         row*col
B        eu(A*B)   blk*A*B   row*col   B*eu(A)         blk*A*B          blk*B         row*col*B
A*B      eu(A*B)   blk*A*B   row*col   B*eu(A)         blk*A*B          blk*A*B       row*col*B

Common Models in PROC MIXED/GLIMMIX

MODEL: treatment design. RANDOM: experiment (study) design.

Design: SAS class, model and random statements

CRD (Figure 4.1.a):
  class eu a b;
  model y=a b a*b;

RCB (Fig 4.1.b):
  class block a b;
  model y=a b a*b;
  random block;     (or: random intercept / subject=block;)

Latin Square (4.1.c):
  class row col a b;
  model y=a b a*b;
  random row col;

Split-plot CR (4.1.d):
  class eu a b;
  model y=a b a*b;
  random eu(a);

Split-plot RCB (4.1.e):
  class block a b;
  model y=a b a*b;
  random block block*a;

Split-block (4.1.f):
  class block a b;
  model y=a b a*b;
  random block block*a block*b;

Split-plot LS (4.1.g):
  class row col a b;
  model y=a b a*b;
  random row col row*col;     (or, equivalently: random row col row*col*a;)

Model for split-plot: school-classroom example

Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.

Effect         e.u.
prof dev trt   school
curriculum     classroom(school)
p.d × curr     classroom(school)

model: $y_{ijk} = \mu_{ij} + s(t)_{ik} + e_{ijk}$

or alternative expression

$\mu_{ij} = \mu + p_i + c_j + (pc)_{ij}$

$E_{ijk} = \text{school}(\text{trt})_{ik} + e_{ijk}$

note! student is sampling unit (not an e.u.)

Model for split-plot – Dose x Type example

Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.

Effect        e.u.
dose          block × dose
type          block × dose × type
dose × type   block × dose × type

model: $y_{ijk} = \mu_{ij} + \text{bloc}_k + (b \times d)_{ik} + e_{ijk}$

or alternative expression

$\mu_{ij} = \mu + d_i + t_j + (dt)_{ij}$

$E_{ijk} = \text{bloc}_k + (b \times d)_{ik} + e_{ijk}$

note! bloc × type NOT in model (not an e.u.)

Conventional ANOVA

Source                      EMS
bloc
dose                        $\sigma_S^2 + t\,\sigma_W^2 + Q_D$
w.p. error† (bloc × dose)   $\sigma_S^2 + t\,\sigma_W^2$
type                        $\sigma_S^2 + Q_T$
dose × type                 $\sigma_S^2 + Q_{DT}$
s.p. error††                $\sigma_S^2$

† a.k.a. between subjects error
†† a.k.a. within subjects error

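Standard errors of various terms

Main effects
of dose: $\bar\mu_{i\cdot} - \bar\mu_{i'\cdot}$, Var = $\dfrac{2}{rt}\left(\sigma_S^2 + t\,\sigma_W^2\right)$
of type: $\bar\mu_{\cdot j} - \bar\mu_{\cdot j'}$, Var = $\dfrac{2}{rd}\,\sigma_S^2$

Simple effects
type | dose$_i$: $\mu_{ij} - \mu_{ij'}$, Var = $\dfrac{2}{r}\,\sigma_S^2$
dose | type$_j$: $\mu_{ij} - \mu_{i'j}$, Var = $\dfrac{2}{r}\left(\sigma_S^2 + \sigma_W^2\right)$

Note: you can use MS() directly except for dose | type$_j$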

Programming in Proc GLIMMIX

proc glimmix;
  class bloc type dose;
  model y=type|dose;
  random intercept dose / subject=bloc;   /* i.e. random bloc bloc*dose; */
  lsmeans type*dose / diff lines slicediff=(type dose) slice=(type dose);
  ods output lsmeans=lsm;
run;

− DIFF: all possible mean differences
− LINES: with “MRT lines”
− SLICEDIFF: simple effect differences only
− SLICE: simple effect tests only

You can use ODS to output LSMEANS and GPLOT for interaction plots, or use ODS graphics directly

Type x Dose: Selected Output

Covariance Parameter Estimates

Cov Parm    Subject   Estimate   Standard Error
Intercept   block     2.0735     2.7320
dose        block     4.5132     2.8291
Residual              4.3189     1.5270

Type III Tests of Fixed Effects

Effect      Num DF   Den DF   F Value   Pr > F
type        1        16       2.78      0.1151
dose        3        12       13.63     0.0004
type*dose   3        16       2.29      0.1176
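
These covariance estimates can be plugged into the simple-effect variance formulas, $\tfrac{2}{r}\sigma_S^2$ for type | dose and $\tfrac{2}{r}(\sigma_S^2 + \sigma_W^2)$ for dose | type. The sketch below assumes $r = 5$ blocks, which is inferred from the denominator DF for dose, $12 = (5 - 1)(4 - 1)$, rather than stated on the slides. It takes $\hat\sigma_S^2 = 4.3189$ (Residual) and $\hat\sigma_W^2 = 4.5132$ (dose within block):

```latex
\begin{align*}
\text{s.e.}(\text{type} \mid \text{dose}) &= \sqrt{\tfrac{2}{5}\,(4.3189)} \approx 1.3144 \\
\text{s.e.}(\text{dose} \mid \text{type}) &= \sqrt{\tfrac{2}{5}\,(4.3189 + 4.5132)} \approx 1.8796
\end{align*}
```

Both values agree with the standard errors that GLIMMIX reports for the corresponding simple effect comparisons (1.3144 with 16 DF and 1.8796 with 19.49 DF).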


Type x Dose LSMeans

type*dose Least Squares Means

type dose Estimate Standard Error DF t Value Pr > |t|

r 1 20.0000 1.4769 20.23 13.54 <.0001

r 2 27.8400 1.4769 20.23 18.85 <.0001

r 4 28.1800 1.4769 20.23 19.08 <.0001

r 8 24.8000 1.4769 20.23 16.79 <.0001

s 1 17.0600 1.4769 20.23 11.55 <.0001

s 2 25.2000 1.4769 20.23 17.06 <.0001

s 4 28.3800 1.4769 20.23 19.22 <.0001

s 8 25.8000 1.4769 20.23 17.47 <.0001


Type x Dose: “MRT Lines”

T Grouping for type*dose Least Squares Means

LS-means with the same letter are not significantly different.

type dose Estimate

s 4 28.3800 A

A

r 4 28.1800 A

A

r 2 27.8400 A

A

s 8 25.8000 A

A

s 2 25.2000 A

A

r 8 24.8000 A

r 1 20.0000 B

s 1 17.0600 C

however ...


A Factorial Inference Flowchart

The Prime Directive: Interactions first!!!!!

Interaction?

Negligible

Interpret Main Effects

Non-ignorable

Interpret Simple Effects

Full Wheelbarrow


Plots of Differences between Means

LSMEANS allows various plots of mean differences

DIFFPlot: plots interval estimates of mean differences

ANoMPlot: (ANalysis of Means) plots difference between each treatment and the overall mean

ControlPlot: Plots each treatment vs control (e.g. like Dunnett test)


SAS for Mean Difference Plots

From Type x Dose example

ods html;
ods graphics on;
ods select Anomplot DiffPlot;
proc glimmix data=variety_eval;
  class block type dose;
  model y=type|dose/ddfm=satterth;
  random block block*dose;
  lsmeans dose/plot=DiffPlot;
  lsmeans dose/plot=AnomPlot;
  *lsmeans type*dose/plot=DiffPlot;
  *lsmeans type*dose/plot=AnomPlot;
run;
ods graphics off;
ods html close;
run;

SAS for Mean Difference Plots: DIFFPLOT


SAS for Mean Difference Plots: ANoMPLOT


Mean Difference Plots – Control Plots

From SAS for Linear Models – Output 3.17-3.22
Randomized Complete Block
5 Irrigation Treatments: Flood (control), Basin, Spray, Sprinkler, Trickle

ods html;
ods graphics on;
ods select ControlPlot;
proc glimmix order=data;
  class bloc irrig;
  model fruitwt=irrig;
  random bloc;
  lsmeans irrig/diff=control('flood') plot=controlplot adjust=dunnett;
run;
ods graphics off;
ods html close;
run;

Dunnett-style Control Plot


Back to Type x Dose Data: Interaction Plot


Type x Dose: Simple Effects

Tests of Effect Slices for type*dose Sliced By type (SLICE: test only)

type   Num DF   Den DF   F Value   Pr > F
r      3        19.49    8.12      0.0010
s      3        19.49    13.58     <.0001

Tests of Effect Slices for type*dose Sliced By dose (SLICE: test only)

dose   Num DF   Den DF   F Value   Pr > F
1      1        16       5.00      0.0399
2      1        16       4.03      0.0618
4      1        16       0.02      0.8810
8      1        16       0.58      0.4578

Simple Effect Comparisons of type*dose Least Squares Means By dose (SLICEDIFF: estimates, etc.)

Simple Effect Level   type   _type   Estimate   Standard Error   DF   t Value   Pr > |t|
dose 1                r      s       2.9400     1.3144           16   2.24      0.0399
dose 2                r      s       2.6400     1.3144           16   2.01      0.0618
dose 4                r      s       -0.2000    1.3144           16   -0.15     0.8810
dose 8                r      s       -1.0000    1.3144           16   -0.76     0.4578

Type x Dose: Simple Effect Estimates by Type

Simple Effect Comparisons of type*dose Least Squares Means By type

Simple Effect Level dose _dose Estimate Standard Error DF t Value Pr > |t|

type r 1 2 -7.8400 1.8796 19.49 -4.17 0.0005

type r 1 4 -8.1800 1.8796 19.49 -4.35 0.0003

type r 1 8 -4.8000 1.8796 19.49 -2.55 0.0192

type r 2 4 -0.3400 1.8796 19.49 -0.18 0.8583

type r 2 8 3.0400 1.8796 19.49 1.62 0.1219

type r 4 8 3.3800 1.8796 19.49 1.80 0.0876

type s 1 2 -8.1400 1.8796 19.49 -4.33 0.0003

type s 1 4 -11.3200 1.8796 19.49 -6.02 <.0001

type s 1 8 -8.7400 1.8796 19.49 -4.65 0.0002

type s 2 4 -3.1800 1.8796 19.49 -1.69 0.1066

type s 2 8 -0.6000 1.8796 19.49 -0.32 0.7530

type s 4 8 2.5800 1.8796 19.49 1.37 0.1855


Effect of dose?

Log(Dose): levels equally spaced on the log scale

contrast 'logdose linear' dose -3 -1 1 3;
contrast 'logdose quad' dose 1 -1 -1 1;
contrast 'logdose cubic' dose -1 3 -3 1;
contrast 'type x linear' dose*type -3 -1 1 3 3 1 -1 -3;
contrast 'type x quad' dose*type 1 -1 -1 1 -1 1 1 -1;
contrast 'type x cubic' dose*type -1 3 -3 1 1 -3 3 -1;

otherwise (dose on its original, unequally spaced scale) .....

contrast 'dose linear' dose -11 -7 1 17;
contrast 'dose quad' dose 20 -4 -29 13;
contrast 'dose cubic' dose -8 14 -7 1;
contrast 'type x linear' dose*type -11 -7 1 17 11 7 -1 -17;
contrast 'type x quad' dose*type 20 -4 -29 13 -20 4 29 -13;
contrast 'type x cubic' dose*type -8 14 -7 1 8 -14 7 -1;
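
Both sets of coefficients are orthogonal polynomial contrasts; for unequally spaced levels they depend on the actual level values. The following is a minimal Python sketch (not part of the original SAS program; the function names are illustrative) that derives them by Gram-Schmidt orthogonalization. It assumes the dose levels are 1, 2, 4, 8, so that log2(dose) is 0, 1, 2, 3.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_integers(v):
    """Scale a vector of Fractions to the smallest proportional integers."""
    lcm = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in v), 1)
    ints = [int(f * lcm) for f in v]
    g = reduce(gcd, (abs(i) for i in ints))
    return [i // g for i in ints]


def orthogonal_contrasts(levels):
    """Linear, quadratic, ... contrasts for the given (distinct) levels."""
    xs = [Fraction(x) for x in levels]
    basis = [[Fraction(1)] * len(xs)]      # constant term; not returned
    contrasts = []
    for power in range(1, len(xs)):
        v = [x ** power for x in xs]
        for b in basis:                    # remove lower-order components
            coef = sum(vi * bi for vi, bi in zip(v, b)) / sum(bi * bi for bi in b)
            v = [vi - coef * bi for vi, bi in zip(v, b)]
        basis.append(v)
        contrasts.append(to_integers(v))
    return contrasts


log_dose = orthogonal_contrasts([0, 1, 2, 3])   # equally spaced
raw_dose = orthogonal_contrasts([1, 2, 4, 8])   # unequally spaced
print(log_dose)
print(raw_dose)
```

Exact rational arithmetic is used so that the integer coefficients can be compared directly with those in the CONTRAST statements.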


LogDose contrast results

Contrasts

Label           Num DF  Den DF  F Value  Pr > F
logdose linear  1       12      18.25    0.0011
logdose quad    1       12      22.54    0.0005
logdose cubic   1       12      0.08     0.7780
type x linear   1       16      6.22     0.0240
type x quad     1       16      0.04     0.8515
type x cubic    1       16      0.61     0.4472


Direct Regression – borrow from ANCOVA

proc glimmix data=variety_eval;
  class block type dose;
  model y=type logdose(type) ld_sq(type) / noint ddfm=satterth solution;
  random intercept dose / subject=block;
  contrast 'equal quad by type?' ld_sq(type) 1 -1;
run;

Solutions for Fixed Effects

Effect         type  Estimate  Standard Error  DF     t Value
type           r     20.1890   1.4204          19.62  14.21
type           s     17.0200   1.4204          19.62  11.98
logdose(type)  r     9.8890    2.0181          21.45  4.90
logdose(type)  s     10.9800   2.0181          21.45  5.44
ld_sq(type)    r     -2.8050   0.6447          21.45  -4.35
ld_sq(type)    s     -2.6800   0.6447          21.45  -4.16

Contrasts

Label                Num DF  Den DF  F Value  Pr > F
equal quad by type?  1       17      0.04     0.8497

can re-fit with LD_SQ common to both types


Example 3

• From SAS for Mixed Models, Section 4.7
• 4 “conditions”
• 3 diets
• Condition applied in incomplete block design
• 2 conditions per block
• Diet applied to cages within condition
• Condition is whole plot, diet is split-plot


“Plot plan”

diet 1 diet 2 diet 3 diet 2 diet 1 diet 3

diet 2 diet 1 diet 3 diet 1 diet 3 diet 2


Model?

• blocking? yes
• e.u. with respect to condition: “1/2 block”
• e.u. with respect to diet: “1/3 condition e.u.”
• e.u. w.r.t. cond x diet: same as diet

Model:

$$y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$$


SAS Program

proc glimmix data=fix2;
  class cage condition diet;   /* DDFM= is a MODEL option, not a CLASS option */
  model gain=condition diet condition*diet / ddfm=satterth;
  random intercept condition / subject=cage;
run;

data & program: file ch4-ex3.sas


Selected Output

Covariance Parameter Estimates

Cov Parm   Subject  Estimate  Standard Error
Intercept  cage     3.0376    5.0791
condition  cage     0         .
Residual            27.8429   8.7672

How should one deal with a negative variance component estimate?
• revert to ANOVA via PROC GLM?
• in MIXED, use NOBOUND option?
• in GLIMMIX, use LowerB
• alternatively, redefine model
• may be CS with plots in block negatively correlated

Type III Tests of Fixed Effects

Effect          Num DF  Den DF  F Value  Pr > F
condition       3       23.61   2.71     0.0677
diet            2       20.17   0.93     0.4090
condition*diet  6       20.17   1.73     0.1661


Comparison with SAS Proc GLM

proc glm data=fix2;
  class cage condition diet;
  model gain=cage condition cage*condition diet condition*diet;
  random cage cage*condition / test;
  lsmeans condition diet condition*diet;
run;

Tests of Hypotheses for Mixed Model Analysis of Variance

Source       DF  Type III SS  Mean Square  F Value  Pr > F
cage         5   198.277778   39.655556    2.73     0.2185
* condition  3   171.666667   57.222222    3.95     0.1446
Error        3   43.500000    14.500000
Error: MS(cage*condition)
* This test assumes one or more other fixed effects are zero.

Source            DF  Type III SS  Mean Square  F Value  Pr > F
cage*condition    3   43.500000    14.500000    0.46     0.7144
* diet            2   52.055556    26.027778    0.82     0.4561
condition*diet    6   288.388889   48.064815    1.52     0.2333
Error: MS(Error)  16  504.888889   31.555556


More GLM output

Least Squares Means

condition  gain LSMEAN
1          Non-est
2          Non-est
3          Non-est
4          Non-est

diet      gain LSMEAN
normal    57.9166667
restrict  55.5000000
suppleme  58.1666667

condition  diet      gain LSMEAN
1          normal    Non-est
1          restrict  Non-est
1          suppleme  Non-est
2          normal    Non-est
2          restrict  Non-est
2          suppleme  Non-est
3          normal    Non-est
3          restrict  Non-est
3          suppleme  Non-est
4          normal    Non-est
4          restrict  Non-est
4          suppleme  Non-est

non-estimability results from inappropriate definition of estimability (based on fixed & random eff)

inescapable consequence of Proc GLM with mixed model

DON’T use Proc GLM with mixed models!


GLM vs MIXED issues

• REML default: variance component estimates set to 0
  − if BLOCK affected, type I error rate
  − if error term affected, power may
  − better to allow negative estimates
  − In MIXED: NOBOUND or METHOD=TYPE3
  − In GLIMMIX: LowerB
• vs. GLM uses implied MS regardless
• GLM: inappropriate NON-EST artifact of incomplete block design
• Standard errors for means, many simple effects (including SLICE) incorrect in GLM (no fix!!)


GLIMMIX Option (1) – Like NOBOUND in MIXED

proc glimmix data=fix2;
  class cage condition diet;
  model gain=condition|diet / ddfm=kr;
  random intercept condition / subject=cage;
  parms / lowerb=(1e-4,-10,1e-4);
run;

Covariance Parameter Estimates

Cov Parm   Subject  Estimate  Standard Error
Intercept  cage     5.0288    4.7149
condition  cage     -6.2404   4.8693
Residual            31.5556   11.1566

Type III Tests of Fixed Effects

Effect          Num DF  Den DF  F Value  Pr > F
condition       3       4.718   4.31     0.0798
diet            2       16      0.82     0.4561
condition*diet  6       16      1.52     0.2333


GLIMMIX Option (2) – is it really correlation?

proc glimmix data=fix2;
  class cage condition diet;
  model gain=condition|diet / ddfm=kr;
  random intercept / subject=cage;
  random _residual_ / type=cs subject=condition*cage;
run;

Covariance Parameter Estimates

Cov Parm   Subject         Estimate
Intercept  cage            5.0271
CS         cage*condition  -6.2402
Residual                   31.5567

Type III Tests of Fixed Effects

Effect          Num DF  Den DF  F Value  Pr > F
condition       3       4.717   4.31     0.0798
diet            2       16      0.82     0.4561
condition*diet  6       16      1.52     0.2334

Interblock correlation = $\sigma_{CS} / (\sigma^2 + \sigma_{CS})$ = -0.2466 (negative because the CS estimate is negative)


Modeling Change over Time

• Regression over time
• Latent growth / change models
• Random coefficients over time
• Repeated measures experiment
• Longitudinal Data


From Acock – BMI Data

[Figure: bmi (vertical axis, 10 to 50) plotted against year, 1997 to 2003]

Note – my sample differs from Acock’s, so the numbers won’t match


Basic Growth Model

• Simplest model involves slope & intercept
• In “Stat-speak”:

obs = intercept + slope × time + error

$$y_{ij} = \beta_0 + \beta_1\,time_i + e_{ij}$$

this is just linear regression

$e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$, or may be correlated (more later)


Basic Growth Model in SAS

in PROC GLM

proc glm;
  model bmi=year;
run;

Source           DF   Sum of Squares  Mean Square  F Value  Pr > F
Model            1    432.856378      432.856378   19.68    <.0001
Error            229  5037.468822     21.997680
Corrected Total  230  5470.325200

R-Square  Coeff Var  Root MSE  bmi Mean
0.079128  20.01197   4.690168  23.43682

Parameter  Estimate     Standard Error  t Value  Pr > |t|
Intercept  21.38349324  0.55631931      38.44    <.0001
year       0.68444085   0.15429522      4.44     <.0001

very deceptive – more shortly

regression equation: $\hat{y} = 21.38 + 0.684 \times Year$


Growth Model in SAS - II

in PROC GLIMMIX

proc glimmix;
  class id;
  model bmi=year / solution;
  random _residual_ / subject=id;
  estimate 'y-hat in 1997' intercept 1 year 0 / cl;
  estimate 'y-hat in 2000' intercept 1 year 3 / cl;
  estimate 'y-hat in 2003' intercept 1 year 6 / cl;
run;

selected output next page


Basic Growth Model – Selected GLIMMIX Output

Covariance Parameter Estimates

Cov Parm       Estimate  Standard Error
Residual (VC)  21.9977   2.0558

Solutions for Fixed Effects

Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3835   0.5563          32   38.44    <.0001
year       0.6844    0.1543          197  4.44     <.0001

Estimates

Label          Estimate  Standard Error  DF   t Value  Pr > |t|  Alpha  Lower    Upper
y-hat in 1997  21.3835   0.5563          197  38.44    <.0001    0.05   20.2864  22.4806
y-hat in 2000  23.4368   0.3086          197  75.95    <.0001    0.05   22.8283  24.0454
y-hat in 2003  25.4901   0.5563          197  45.82    <.0001    0.05   24.3930  26.5872

Note: residual VC est = MSE from GLM ANOVA
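
Each ESTIMATE statement is a linear combination of the fixed-effect solutions. The following minimal Python sketch (illustrative only, not SAS) reproduces the three "y-hat" values from the intercept and slope printed by PROC GLM; it assumes, as the ESTIMATE coefficients suggest, that year is coded 0 for 1997, 3 for 2000 and 6 for 2003.

```python
# Fixed-effect estimates as printed in the PROC GLM parameter table.
beta = {"intercept": 21.38349324, "year": 0.68444085}


def estimate(coefficients, beta):
    """Linear combination of the fixed effects, as in an ESTIMATE statement."""
    return sum(coefficients[name] * beta[name] for name in coefficients)


# Coefficients mirror: estimate '...' intercept 1 year <offset>;
y_hat = {
    1997: estimate({"intercept": 1, "year": 0}, beta),
    2000: estimate({"intercept": 1, "year": 3}, beta),
    2003: estimate({"intercept": 1, "year": 6}, beta),
}

for calendar_year, value in y_hat.items():
    print(calendar_year, round(value, 4))
```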


G/C Model – Issue I – Account for ID

• Recall R2 for Basic Growth Model very low
• You must account for variation among subjects (ID)

okay:

proc glm;
  class id;
  model bmi=id year;
run;

better:

proc glimmix;
  class id;
  model bmi=year / solution;
  random id;   /* or: random intercept / subject=id; */
run;


Selected Output

from GLM: R-Square 0.815282 (vs. 0.079)

from GLIMMIX:

Covariance Parameter Estimates

Cov Parm   Subject  Estimate  Standard Error
Intercept  id       17.2449   4.4950
Residual            5.1293    0.5168

(Residual vs. 21.998)

Solutions for Fixed Effects

Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3835   0.7712          32   27.73    <.0001
year       0.6844    0.07451         197  9.19     <.0001

estimates don’t change; std errors do


Growth Change Modeling Issue - II

Correlated Errors

Recall: In Model

$$y_{ij} = \beta_0 + \beta_1\,year_i + e_{ij}$$

$e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$, or may be correlated

Correlation Modeled by Covariance Model

• Failure to model correlation increases P{type I error}

• Over-modeling correlation decreases Power


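Covariance models

$$\text{Indep} = \sigma^2 I \quad \text{(identical to split-plot)}$$

$$\text{CS} = \sigma^2 \begin{bmatrix} 1 & \rho & \rho & \rho \\ & 1 & \rho & \rho \\ & & 1 & \rho \\ & & & 1 \end{bmatrix}$$

NOTE: CS is reparameterization of Indep

$$\text{AR(1)} = \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \rho^3 \\ & 1 & \rho & \rho^2 \\ & & 1 & \rho \\ & & & 1 \end{bmatrix}$$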
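
The structures above can be sketched numerically. The following minimal Python example (illustrative helper names, made-up variance values) builds the CS and AR(1) matrices and shows one way to read the reparameterization note: a random subject intercept with variance sigma2_b plus independent errors with variance sigma2_e gives the same matrix as CS with total variance sigma2_b + sigma2_e and rho = sigma2_b / (sigma2_b + sigma2_e). This reading is an assumption on my part, not a statement from the slide.

```python
def cs_matrix(sigma2, rho, t):
    """Compound symmetry: sigma2 on the diagonal, sigma2 * rho elsewhere."""
    return [[sigma2 * (1.0 if i == j else rho) for j in range(t)]
            for i in range(t)]


def ar1_matrix(sigma2, rho, t):
    """AR(1): covariance sigma2 * rho**|i - j| between occasions i and j."""
    return [[sigma2 * rho ** abs(i - j) for j in range(t)] for i in range(t)]


def random_intercept_cov(sigma2_b, sigma2_e, t):
    """Var(y) for one subject: random intercept plus independent errors."""
    return [[sigma2_b + (sigma2_e if i == j else 0.0) for j in range(t)]
            for i in range(t)]


# Made-up values chosen so the arithmetic is exact in floating point.
split_plot = random_intercept_cov(sigma2_b=3.0, sigma2_e=1.0, t=4)
cs = cs_matrix(sigma2=4.0, rho=0.75, t=4)
ar1 = ar1_matrix(sigma2=1.0, rho=0.5, t=4)

print(split_plot == cs)   # same covariance matrix, two parameterizations
print(ar1[0])             # first row shows the geometric decay with lag
```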


More covariance models

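$$\text{Toep} = \sigma^2 \begin{bmatrix} 1 & \rho_1 & \rho_2 & \rho_3 \\ & 1 & \rho_1 & \rho_2 \\ & & 1 & \rho_1 \\ & & & 1 \end{bmatrix}$$

$$\text{ANTE(1)} = \begin{bmatrix} \sigma_1^2 & \sigma_1\sigma_2\rho_1 & \sigma_1\sigma_3\rho_1\rho_2 & \sigma_1\sigma_4\rho_1\rho_2\rho_3 \\ & \sigma_2^2 & \sigma_2\sigma_3\rho_2 & \sigma_2\sigma_4\rho_2\rho_3 \\ & & \sigma_3^2 & \sigma_3\sigma_4\rho_3 \\ & & & \sigma_4^2 \end{bmatrix}$$

$$\text{UN} = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ & & \sigma_3^2 & \sigma_{34} \\ & & & \sigma_4^2 \end{bmatrix}$$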
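
As a rough guide to what each structure costs, the number of covariance parameters for t repeated measurements can be read off the matrices. The sketch below is an illustrative Python helper (the name n_cov_parms is hypothetical, and it is not SAS output).

```python
def n_cov_parms(structure, t):
    """Number of covariance parameters for t repeated measurements."""
    counts = {
        "CS": 2,                  # sigma^2 and rho
        "AR(1)": 2,               # sigma^2 and rho
        "TOEP": t,                # sigma^2 and rho_1 .. rho_(t-1)
        "ANTE(1)": 2 * t - 1,     # t variances and t-1 adjacent correlations
        "UN": t * (t + 1) // 2,   # every variance and covariance
    }
    return counts[structure]


for name in ("CS", "AR(1)", "TOEP", "ANTE(1)", "UN"):
    print(name, n_cov_parms(name, t=8))
```

With 8 measurement occasions the unstructured model already needs 36 parameters, which is why the simpler structures are usually tried first.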


Issues in Repeated Measures

• Impact of covariance structure?
• Selection of appropriate covariance?
• Bias in std errors, test statistics
• Degrees of freedom
• Nonlinear models over time
• Non-normal errors


Basic G/C Model with Covariance Model

Also known as Autocorrelation

proc glimmix;
  class id;
  model bmi=year / solution ddfm=kr;
  random intercept / subject=id;
  random _residual_ / subject=id type=ar(1);
run;

Competing Covariance Models compared via Fit Statistics
• AICC
• BIC
• HQIC
• CAIC

degree of freedom and std error bias must be dealt with; more later


Selected Output for G/C Model w/ Autocorrelation

Covariance Parameter Estimates (variance, covariance & correlation estimates)

Cov Parm   Subject  Estimate  Standard Error
Intercept  id       14.8587   4.6202
AR(1)      id       0.5623    0.1144
Residual            7.7165    1.8981

Fit Statistics (used to assess cov model)

-2 Res Log Likelihood     1111.69
AIC (smaller is better)   1117.69
AICC (smaller is better)  1117.79
BIC (smaller is better)   1122.18
CAIC (smaller is better)  1125.18
HQIC (smaller is better)  1119.20
Generalized Chi-Square    1767.07
Gener. Chi-Square / DF    7.72

Solutions for Fixed Effects

Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3238   0.8042          32   26.52    <.0001
year       0.6896    0.1102          197  6.26     <.0001

estimate – slight effect; std error – bigger effect


random coeff correl errors prediction add Gender add emotional prob


Repeated Measures Experiments
a.k.a. Longitudinal Data

• Assign e.u. to treatments
• May use any design (completely random, blocked, row-column, split-plot ....)
• Observations at planned times
• Objectives

1. assess changes in response over time

2. assess treatment effect on (1)


Typical repeated Measures Data

from SAS for Linear Models, Chapter 8 SAS for Mixed Models, 2nd ed, Chapter 5


From BMI Data: Are G/C Curves Equal by Gender?

interaction plot of G/C curve by gender


FYI – SAS Code to Get Interaction Plot

ods html;
ods graphics on;
ods select MeanPlot;
proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender|year / solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);
  lsmeans gender*year / plot=MeanPlot (sliceby=gender join cl);
run;
ods graphics off;
ods html close;
run;


Model

Model: $y_{ijk} = \mu_{ij} + id(gender)_{ik} + e_{ijk}$

where $\mu_{ij}$ is the gender $i$ × year $j$ mean

can express as: $\mu_{ij} = \mu + g_i + yr_j + (g \times yr)_{ij}$

$id(gender)_{ik}$ is between subjects error, $NI(0, \sigma_B^2)$, like whole-plot error

$e_{ijk}$ is within subjects error, like split-plot error, except ...

Let $e_{ik}' = [e_{i1k}\ e_{i2k}\ \ldots\ e_{iTk}]$; then $e_{ik} \sim MVN(0, \Sigma)$

translates to:

proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender|year / solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);


Back to SAS for Mixed Models Example

Model: $y_{ijk} = \mu_{ij} + s(trt)_{ik} + e_{ijk}$

where $\mu_{ij}$ is the trt $i$ × time $j$ mean

$s(trt)_{ik}$ is between subjects error, $NI(0, \sigma_B^2)$, like whole-plot error

$e_{ijk}$ is within subjects error, like split-plot error, except ...

Let $e_{ik}' = [e_{i1k}\ e_{i2k}\ \ldots\ e_{iTk}]$; then $e_{ik} \sim MVN(0, \Sigma)$

Hence $Var(y_{ik}) = V_{ik} = Z_S Z_S'\,\sigma_B^2 + \Sigma = \sigma_B^2 J_T + \Sigma$; typically $V = Var(y) = I_{AK} \otimes V_{ik}$

$A$ = # trt's, $K$ = # subj/trt


Middle Ground between MANOVA and Split-Plot in Time via Proc GLIMMIX

PROC GLIMMIX;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME;
  RANDOM INTERCEPT / SUBJECT=SUBJ(TRT);
  RANDOM TIME / TYPE=AR(1) SUBJECT=SUBJ(TRT) RESIDUAL;
  *LSMEANS TRT TIME TRT*TIME;
  TITLE 'MIXED - AR(1) ERRORS';
RUN;

RANDOM specifies between subjects effects (G-side)

RANDOM...RESIDUAL specifies within subjects effect (R-side)

in many models, G- and R-side effects are not identifiable


Modeling Covariance among Repeated Measures

PROC MIXED DATA=univ;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME;
  REPEATED TIME / TYPE=UN SSCP SUBJECT=SUBJ(TRT);
  ODS OUTPUT CovParms=cp;
run;

data times;
  do time1=1 to 8;
    do time2=1 to time1;
      dist=time1-time2;
      output;
    end;
  end;

data covplot;
  merge times cp;

proc gplot data=covplot;
  plot adjcorr*dist=time1;

Computes covariance between pairs of measurements (same subject, different times) based on the sum of squares & cross-products matrix, then plots them by distance


Plot of Covariance by Distance


Idealized Plots: CS=Subj(Trt), AR(1), AR(1)+Subj(Trt)

AR(1) + Subj(Trt)

CS= random Subj(Trt)

AR(1) only


Model Fitting Criteria in Version 8

1. Compound Symmetry

proc glimmix;
  classes subj trt time;
  model y= trt time trt*time;
  random time / residual type=cs subject=subj(trt);
  title 'mixed - compound symmetry';

Fit Statistics

-2 Res Log Likelihood     839.39
AIC (smaller is better)   843.39
AICC (smaller is better)  843.47
BIC (smaller is better)   845.75
CAIC (smaller is better)  847.75
HQIC (smaller is better)  844.02
Generalized Chi-Square    767.61
Gener. Chi-Square / DF    4.80
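
The fit statistics are all penalized versions of the -2 residual log likelihood. The following minimal Python sketch uses the usual textbook definitions and reproduces the values above under two assumptions that are inferred from the output rather than stated in it: an effective sample size n* = 160 (Generalized Chi-Square divided by Gener. Chi-Square / DF) for AICC, and 24 subjects for BIC, CAIC and HQIC. The function and argument names are illustrative.

```python
import math


def fit_criteria(neg2_res_loglik, n_parms, n_star, n_subjects):
    """Information criteria from -2 Res Log Likelihood (smaller is better)."""
    log_s = math.log(n_subjects)
    return {
        "AIC": neg2_res_loglik + 2 * n_parms,
        "AICC": neg2_res_loglik + 2 * n_parms * n_star / (n_star - n_parms - 1),
        "BIC": neg2_res_loglik + n_parms * log_s,
        "CAIC": neg2_res_loglik + n_parms * (log_s + 1),
        "HQIC": neg2_res_loglik + 2 * n_parms * math.log(log_s),
    }


# Compound symmetry fit: 2 covariance parameters.
# n_star=160 and n_subjects=24 are assumptions inferred from the output.
cs_fit = fit_criteria(839.39, n_parms=2, n_star=160, n_subjects=24)

for name, value in cs_fit.items():
    print(f"{name}: {value:.2f}")
```

The same function can be applied to the other covariance structures by changing the -2 Res Log Likelihood and the parameter count.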


Comparison of Models
Smaller is Better

Covariance model                 Neg2LogLike  Parms  AIC    AICC   HQIC   BIC    CAIC
Compound Symmetry                839.4        2      843.4  843.5  844.0  845.7  847.7
AR(1) + Subj(TRT) random effect  788.7        3      794.7  794.8  795.6  798.2  801.2
Unstructured                     760.5        36     832.5  854.1  843.7  874.9  910.9
ANTE(1)                          780.7        15     810.7  814.0  815.3  828.3  843.3
TOEP                             784.9        8      800.9  801.9  803.4  810.4  818.4


How do Model Fitting Criteria Compare?

Guerin & Stroup (2000) compared AIC, BIC, HQIC, CAIC for simulated AR(1) and ARH(1) data

• CAIC tends to select simpler models
• AIC tends to select most complex models *
• complex -- AIC > HQIC > BIC > CAIC -- simple
• Model too simple (correlation model not adequate): Type I error rate too high
• Model too complex (correlation over-modeled): Type I error control not affected, but power suffers
• Best choice depends on severity of Type I vs II error

*Since 2000, SAS added AICC to address AIC issue


An Inference Issue

CS: Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.74     0.5425
TIME      7       140     109.04   <.0001
TRT*TIME  21      140     1.98     0.0106

AR(1)+between subj: Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.75     0.5344
TIME      7       140     60.55    <.0001
TRT*TIME  21      140     1.48     0.0921

UN: Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.74     0.5425
TIME      7       20      101.31   <.0001
TRT*TIME  21      20      1.37     0.2450

UN similar to MANOVA but MANOVA Trt*Time p-value was 0.50


Bias & Options for Adjusting

• SAS Default uses estimated (co)variance components in V
  − std errors biased, t-, F-statistics biased
• “Robust” (a.k.a. “sandwich”) estimate of K’V-1K available using EMPIRICAL option in MIXED
• Kenward & Roger (Biometrics, 1997) proposed adjustment; available using DDFM=KR option in MODEL statement of MIXED
• Guerin & Stroup (2000) evaluated KR option of SAS Version 8 with simulated AR(1) and ARH(1) data
• Biased F resulted in inflated Type I error rates unless KR option used (for α=0.05, rejection rates >0.10 for TYPE=AR(1), up to 0.20 with TYPE=ANTE(1), UN)


Sandwich (“Robust”) Estimator

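OLS estimate of $\beta$:

$$\hat\beta_{OLS} = (X'X)^{-1}X'y$$

$$Var(\hat\beta_{OLS}) = (X'X)^{-1}X'\,Var(y)\,X(X'X)^{-1} = (X'X)^{-1}X'VX(X'X)^{-1}$$

GLS estimate is:

$$\hat\beta_{GLS} = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}y$$

$$Var(\hat\beta_{GLS}) = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}\,V\,\hat V^{-1}X(X'\hat V^{-1}X)^{-1}$$

Let $\hat V_0 = \hat e\hat e'$ based on residuals $\hat e = y - X\hat\beta$

Yields $\hat V^{-1}\hat V_0\hat V^{-1} = \hat V^{-1}\hat e\hat e'\hat V^{-1}$

"Sandwich" estimator:

$$Var(\hat\beta_{GLS}) = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}\hat e\hat e'\hat V^{-1}X(X'\hat V^{-1}X)^{-1}$$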
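
To make the "bread, meat, bread" structure concrete, here is a deliberately simplified Python sketch. It assumes an intercept-only model (X is a column of ones), a working covariance equal to the identity, and a residual cross-product "meat" formed subject by subject, which is how the estimator is usually applied to clustered data as far as I know. The data values and the function name are made up for illustration; this is not what PROC MIXED or GLIMMIX computes internally.

```python
def sandwich_variance_of_mean(clusters):
    """Cluster-robust variance of an overall mean.

    clusters: list of lists, one inner list of responses per subject.
    With X = 1 and working V = I, the bread is 1/n and the meat is the
    sum over subjects of (sum of that subject's residuals) squared.
    """
    n = sum(len(c) for c in clusters)
    beta_hat = sum(sum(c) for c in clusters) / n           # OLS estimate
    meat = sum(sum(y - beta_hat for y in c) ** 2 for c in clusters)
    return beta_hat, meat / n ** 2                         # bread*meat*bread


# Two subjects with two observations each (made-up numbers).
beta_hat, robust_var = sandwich_variance_of_mean([[1.0, 2.0], [3.0, 4.0]])
print(beta_hat, robust_var)
```

Because residuals within a subject are summed before squaring, positive within-subject correlation inflates the robust variance relative to the independence formula.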


How does the sandwich estimator perform?

proc mixed empirical;
  classes subj trt time;
  model y=trt time trt*time;
  random intercept / subject=subj(trt);
  repeated time / type=ar(1) subject=subj(trt);   /* R-side structure in PROC MIXED */
run;

Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      1.31     0.2981
TIME      7       140     121.57   <.0001
TRT*TIME  20      140     9.04     <.0001

vs. F=1.48; p=0.0921 using default


Kenward and Roger

proc glimmix;
  classes subj trt time;
  model y= trt time trt*time / ddfm=kr;
  random intercept / subject=subj(trt);
  random time / type=ar(1) subject=subj(trt) residual;

Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20.5    0.77     0.5219
TIME      7       109     50.90    <.0001
TRT*TIME  21      117     1.24     0.2330


Alternative KR adjustment
• in SAS, KR adjustment uses Hessian matrix by default
• you can cause it to use the Information matrix instead
• no documented advantage one way or another

PROC glimmix scoremod scoring=51;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME / ddfm=kr;
  RANDOM intercept / subject=SUBJ(TRT);
  Random _resid_ / TYPE=AR(1) SUBJECT=SUBJ(TRT);
  nloptions technique=nrridg;

Type 3 Tests of Fixed Effects

Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20.5    0.77     0.5264
TIME      7       112     54.18    <.0001
TRT*TIME  21      119     1.28     0.2010

vs. F=1.24, p=0.2330 using Hessian


Alternative Model for Change in BMI by Gender

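Repeated Measures ANCOVA Model

Level 1: $y_{tj} = \beta_{0j} + \beta_{1j}\,yr_t + e_{tj}$

Level 2: $\beta_{0j} = \beta_0 + gender_i + id(gender)_{ij}$

$\beta_{1j} = \beta_1 + g_i$

$$y_{ijk} = \beta_0 + gender_i + id(gender)_{ij} + (\beta_1 + g_i)\,yr_t + e_{ijk} = \beta_{0i} + \beta_{1i}\,yr_t + id(gender)_{ij} + e_{ijk}$$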

proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender yr(gender) / noint solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);
  contrast 'male vs female intercept' gender 1 -1;
  contrast 'male vs female slope' yr(gender) 1 -1;
run;


Selected Output

Covariance Parameter Estimates

Cov Parm   Subject     Estimate
Intercept  id(gender)  15.1933
AR(1)      id(gender)  0.2928
Residual               7.8871

Solutions for Fixed Effects

Effect      gender  Estimate  Standard Error  DF     t Value  Pr > |t|
gender      0       20.1988   0.6084          165.9  33.20    <.0001
gender      1       21.8298   0.5596          165.9  39.01    <.0001
yr(gender)  0       0.7860    0.08207         204.5  9.58     <.0001
yr(gender)  1       0.6462    0.07549         204.5  8.56     <.0001

Contrasts

Label                     Num DF  Den DF  F Value  Pr > F
male vs female intercept  1       165.9   3.89     0.0501
male vs female slope      1       204.5   1.57     0.2111


Alternative Model

proc glimmix data=bmi_uni;
  class gender id;
  model bmi=gender year(gender) / noint solution ddfm=kr;
  random intercept year(gender) / subject=id type=un;
  contrast 'male vs female intercept' gender 1 -1;
  contrast 'male vs female slope' year(gender) 1 -1;
run;

This is a random coefficient model

Next section


Response Surface Split Plot with Repeated Measures

• 4 treatment factors (A, B, C, D)
  − 2 levels each
• 3 factors (A, B, C) applied to P (subject)
• treatment design: central composite design
• subjects split into 2 sub-units
• level of D randomly assigned to each sub-unit
• observations at 3 planned times (H)


Central Composite Design


Model for Central Composite Split-Split Plot

Effect                                        e.u.
A, B, C main effects & interactions           P(A B C)
D                                             D × P(A B C)
D × (A, B, C)                                 D × P(A B C)
H and all interactions involving H            H × D × P(A B C)

$$y_{hijklm} = f(X_{Ai}, X_{Bj}, X_{Ck}) + d_l + d_l \times f(X_{Ai}, X_{Bj}, X_{Ck}) + h_m + dh_{lm} + h_m \times f(X_{Ai}, X_{Bj}, X_{Ck}) + p(abc)_{hijk} + dp(abc)_{hijkl} + e_{hijklm}$$


SAS Statements

proc glimmix;
  class ca cb cc p d u;
  *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
  model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
        / noint solution htype=1 ddfm=kr;
  random p(ca cb cc) d*p(ca cb cc);


Key output

Covariance Parameter Estimates

Cov Parm   Subject      Estimate
Intercept  p(ca*cb*cc)  24.3200
d          p(ca*cb*cc)  4.5151
Residual                11.4944

Solutions for Fixed Effects

Effect  d  Estimate  Standard Error
d       0  53.5687   2.3344
d       1  31.7168   2.3344
a(d)    0  16.8226   1.8101
a(d)    1  11.2226   1.8101
b(d)    0  19.5049   1.8101
b(d)    1  12.3715   1.8101
c(d)    0  4.4019    1.8101
c(d)    1  3.5352    1.8101
a*a        0.4980    3.2427
b*b        -2.5020   3.2427
c*c        5.1647    3.2427
a*b        6.2083    1.8872
a*c        -2.8333   1.8872
b*c        1.2083    1.8872
t(d)    0  9.4200    0.5504
t(d)    1  0.02442   0.5504
t*t        -0.1487   1.1114
a*t        0.1160    0.5078
b*t        1.7331    0.5078
c*t        0.3513    0.5078

Fit Statistics

AICC (smaller is better)  573.40


Complex Split-split-plot revisited

• Recall A, B, C applied to units P
• P split in two, levels of D to each half
• Measured at 3 times
• Previous analysis assumed split on time
• Actually repeated measures
• Split-plot + repeated measures


CCD Split-plot + repeated measures

proc glimmix data=CCD_SpltPlt;
  class ca cb cc p d u;
  *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
  model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
        / noint solution htype=1 ddfm=kr;
  random intercept / subject=p(ca cb cc);
  random _residual_ / type=sp(pow)(t) subject=d*p(ca cb cc);
run;

AICC: 573.4 as split-split-plot
      551.1 as repeated measures using SP(POW)

note SP(POW) is generalization of AR(1) for unequally spaced times
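
A minimal Python sketch of that last note, under the usual definition of the spatial power structure (correlation rho raised to the absolute time difference); the helper names and the numeric values are illustrative, not taken from the example data.

```python
def sp_pow_corr(rho, times):
    """Spatial power correlation: rho ** |t_i - t_j| for observation times."""
    return [[rho ** abs(ti - tj) for tj in times] for ti in times]


def ar1_corr(rho, t):
    """AR(1) correlation for t equally spaced occasions: rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(t)] for i in range(t)]


equal = sp_pow_corr(0.5, [1, 2, 3])      # unit spacing
unequal = sp_pow_corr(0.5, [0, 1, 3])    # gap of 2 between last two times

print(equal == ar1_corr(0.5, 3))         # identical to AR(1)
print(unequal[1][2], unequal[0][2])      # correlation decays with actual gap
```

With unit spacing the two structures coincide; with unequal spacing SP(POW) lets the correlation depend on the actual elapsed time rather than on the occasion number.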


Unreplicated Split-Plot

• SAS for Mixed Models, Section 16.7
• Quilt divided in half
• Each “half sheet” received 2 x 2 x 3 factorial
  − 2 pH levels (low, high)
  − 2 temp (cold, hot)
  − 3 dry cycles (air, machine-delicate, machine-normal)
• Material cut from each unit
  − washed 10, 20, 30, 40, 50 times
• Breaking strength monitored
• Materials observed so reps by sheet lost


Model for Breaking Strength Experiment

$$y_{ijklm} = \mu_{ijkl} + r_m + w_{ijkm} + e_{ijklm}$$

where

• $\mu_{ijkl}$ is the mean of the $ijk$th pH × water temperature × dry cycle combination ($i$ = 8, 10; $j$ = 35, 55; $k$ = air, delicate, normal) at the $l$th time of washing ($l$ = 10, 20, 30, 40, 50),
• $r_m$ is the effect of the $m$th block ($m$ = 1, 2 in the design, but $m$ = 1 only in the data),
• $w_{ijkm}$ is the $ijkm$th between subjects (or whole-plot) error effect, assumed $NID(0, \sigma_W^2)$,
• $e_{ijklm}$ is the within subjects (or split-plot) error effect, assumed $NID(0, \sigma^2)$


ANOVA for Breaking Strength Experiment

Source of Variation    d.f.
block                  1
pH (P)                 1
wash temp (T)          1
dry cycle (D)          2
PT                     1
PD                     2
TD                     2
PTD                    2
between subject error  11
no. of washes (W)      4
WP                     4
WT                     4
WD                     8
WPT                    4
WPD                    8
WTD                    8
WPTD                   8
within subjects error  48

but these become 0 when blocking by “half quilt” distinction lost


Breaking Strength vs # Washes by pH


Breaking Strength vs # Washes by Temp


Breaking Strength vs # Washes by Dry Cycle


Revised ANOVA

Pool negligible effects to get between & within error

Source of Variation                     d.f.
pH (P)                                  1
wash temp (T)                           1
dry cycle (D)                           2
between subject error                   7
linear effect of no. of washes (W Lin)  1
W Lin × P                               1
W Lin × T                               1
W Lin × D                               2
within subjects error                   43


GLIMMIX Program for Breaking Strength Experiment

proc glimmix data=shellie;
  class pH water_temp dry_cycle;
  model breaking_strength=pH water_temp dry_cycle w w*pH w*water_temp w*dry_cycle / solution;
  random pH*water_temp*dry_cycle;
  contrast 'air vs dryer effect on wear' w*dry_cycle 2 -1 -1;
  contrast 'delicate v normal effect on wear' w*dry_cycle 0 1 -1;
run;


Revised GLIMMIX - Estimate Regression over # of Washes

proc glimmix data=shellie;
  class pH water_temp dry_cycle;
  model breaking_strength= w(pH) w(water_temp) w(dry_cycle) / noint solution;
  random pH*water_temp*dry_cycle;
  estimate 'slope: ph 8, cold, air'
           w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 1 0 0;
  estimate 'slope: ph 8, cold, delicate'
           w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 1 0;
  estimate 'slope: ph 8, cold, normal'
           w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 0 1;
  estimate 'slope: ph 8, hot, air'
           w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 1 0 0;
  estimate 'slope: ph 8, hot, delicate'
           w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 0 1 0;

etc for all pH – temp – dry cycle combinations


Regression – Selected Output

Solution for Fixed Effects

Effect     water temp  Dry cycle  pH  Estimate  Standard Error
Intercept                             0.1070    0.001895

Label                         Estimate  Standard Error
slope: ph 8, cold, air        -0.00024  0.000077
slope: ph 8, cold, delicate   -0.00047  0.000077
slope: ph 8, cold, normal     -0.00050  0.000077
slope: ph 8, hot, air         -0.00050  0.000077
slope: ph 8, hot, delicate    -0.00073  0.000077
slope: ph 8, hot, normal      -0.00076  0.000077
slope: ph 10, cold, air       -0.00082  0.000077
slope: ph 10, cold, delicate  -0.00105  0.000077
slope: ph 10, cold, normal    -0.00108  0.000077
slope: ph 10, hot, air        -0.00108  0.000077
slope: ph 10, hot, delicate   -0.00131  0.000077
slope: ph 10, hot, normal     -0.00134  0.000077
avg slope: ph 8               -0.00053  0.000054
avg slope: ph 10              -0.00111  0.000054
avg slope: cold water         -0.00069  0.000054
avg slope: hot water          -0.00095  0.000054
avg slope: air dry            -0.00066  0.000063
avg slope: delicate dry       -0.00089  0.000063
avg slope: normal dry         -0.00092  0.000063
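
The "avg slope" rows appear to be marginal averages of the twelve combination-specific slopes. A minimal Python sketch checking this against the printed (rounded) estimates follows; the dictionary layout and function name are illustrative, and agreement is only expected up to rounding of the printed values.

```python
# Combination-specific slopes copied from the table above,
# keyed by (pH, water temperature, dry cycle).
slopes = {
    (8, "cold", "air"): -0.00024, (8, "cold", "delicate"): -0.00047,
    (8, "cold", "normal"): -0.00050, (8, "hot", "air"): -0.00050,
    (8, "hot", "delicate"): -0.00073, (8, "hot", "normal"): -0.00076,
    (10, "cold", "air"): -0.00082, (10, "cold", "delicate"): -0.00105,
    (10, "cold", "normal"): -0.00108, (10, "hot", "air"): -0.00108,
    (10, "hot", "delicate"): -0.00131, (10, "hot", "normal"): -0.00134,
}


def average_slope(position, level):
    """Average the slopes over all combinations sharing one factor level.

    position: 0 = pH, 1 = water temperature, 2 = dry cycle.
    """
    selected = [s for key, s in slopes.items() if key[position] == level]
    return sum(selected) / len(selected)


for position, level in [(0, 8), (0, 10), (1, "cold"), (1, "hot"),
                        (2, "air"), (2, "delicate"), (2, "normal")]:
    print(level, round(average_slope(position, level), 5))
```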


Prediction & Inference Space


VI. Prediction, “BLUP” and Inference Space

Estimation vs. Prediction

When “BLUP” is a good thing

Inference Space

−what is it?

−how can we use it?

Performance evaluation issues

Multi-location issues


Estimation, Prediction, and Inference Space

• Estimation based on estimable functions: $K'\beta$
• Estimation applies to fixed effects only, inference is to entire population
• Prediction based on “predictable functions”: $K'\beta + M'u$
• Prediction applies to fixed & random effects, narrows scope of inference to specific subset defined by M’u
• Examples: locations, workers, teachers, patients...


Prediction Example 1 Growth Change Modeling Issue - III

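Random Coefficients

Recall Basic Growth Model: $y_{ij} = \beta_0 + \beta_1\,year_i + e_{ij}$

Level 2 (subject $j$ has its own intercept and slope):

$$\beta_{0j} = \beta_0 + b_{0j}, \qquad \beta_{1j} = \beta_1 + b_{1j}$$

$$\begin{bmatrix} b_{0j} \\ b_{1j} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} \right)$$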

proc glimmix data=bmi_uni;
  class id;
  model bmi=year / solution ddfm=kr;
  random intercept year / subject=id type=un solution;
  random _residual_ / subject=id type=ar(1);


Selected Output

Covariance Parameter Estimates

Cov Parm  Subject  Estimate
UN(1,1)   id       10.8070
UN(2,1)   id       0.5873
UN(2,2)   id       0.2676
AR(1)     id       0.3024
Residual           4.6021

Solutions for Fixed Effects

Effect     Estimate  Standard Error  t Value
Intercept  21.3577   0.6480          32.96
year       0.6870    0.1212          5.67

Solution for Random Effects (partial listing)

Effect     Subject  Estimate  Std Err Pred  DF
Intercept  id 73    2.1023    1.3487        165
year       id 73    -0.1608   0.3118        165
Intercept  id 281   -1.3178   1.3487        165
year       id 281   -0.1353   0.3118        165
Intercept  id 496   -1.8137   1.3487        165
year       id 496   -0.07237  0.3118        165


You can obtain Subject-Specific Estimates

proc glimmix data=bmi_uni;
  class id;
  model bmi=year / solution ddfm=kr;
  random intercept year / subject=id type=un solution;
  random _residual_ / subject=id type=ar(1);
  estimate 'popn avg slope' year 1 / cl;
  estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
  estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;
  estimate 'popn avg intercept' intercept 1 / cl;
  estimate 'predicted bmi in 1997' intercept 1 year 0 / cl;
  estimate 'id (73) specific intercept' intercept 1 | intercept 1 / subject 1 0 cl e;
  estimate 'id (496) specific intercept' intercept 1 | intercept 1 / subject 0 0 1 0 cl;
  estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
  estimate 'id (73) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 1 0 cl;
  estimate 'id (496) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 0 0 1 0 cl;
  estimate 'predicted bmi in 2003' intercept 1 year 6 / cl;
  estimate 'id (73) specific 2003 bmi' intercept 1 year 6 | intercept 1 year 6 / subject 1 0 cl;
  estimate 'id (496) specific 2003 bmi' intercept 1 year 6 | intercept 1 year 6 / subject 0 0 1 0 cl;
run;

Best Linear Unbiased Prediction

Look closer at ESTIMATE statement

  estimate 'popn avg slope' year 1 / cl;
  estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
  estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;

  estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
  estimate 'id (73) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 1 0 cl;
  estimate 'id (496) specific 2000 bmi' intercept 1 year 3 | intercept 1 year 3 / subject 0 0 1 0 cl;

Coefficients to right of vertical bar ( | ) apply to random effects – this is a new idea

BLUP: estimation (prediction) of random effects

Selected Estimates from Random Coeff BMI Model

Estimates

Label                         Estimate   Standard Error      DF     Lower     Upper
popn avg slope                  0.6870           0.1214   31.57    0.4396    0.9344
id (73) specific slope          0.5262           0.3833   18.35   -0.2779    1.3303
id (496) specific slope         0.6146           0.3833   18.35   -0.1895    1.4187
popn avg intercept             21.3577           0.6459    31.5   20.0413   22.6742
predicted bmi in 1997          21.3577           0.6459    31.5   20.0413   22.6742
id (73) specific intercept     23.4601           1.4916   33.36   20.4266   26.4935
id (496) specific intercept    19.5440           1.4916   33.36   16.5105   22.5775
predicted bmi in 2000          23.4186           0.7330   31.99   21.9255   24.9117
id (73) specific 2000 bmi      25.0387           0.9928    9.56   22.8127   27.2646
id (496) specific 2000 bmi     21.3878           0.9928    9.56   19.1618   23.6138
predicted bmi in 2003          25.4795           0.9605   31.84   23.5226   27.4365
id (73) specific 2003 bmi      26.6173           1.5462   20.15   23.3936   29.8410
id (496) specific 2003 bmi     23.2316           1.5462   20.15   20.0079   26.4553
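Each subject-specific line in this table is the population-average coefficient plus that subject's predicted random effect (BLUP). The Python sketch below reproduces the point estimates by hand from the fixed-effect and random-effect solutions printed earlier. It assumes year is coded as years since 1997, as the 'predicted bmi in 1997' statement suggests, and the function names are illustrative only. It does not reproduce the standard errors, which need the full mixed-model machinery.

```python
# Fixed-effect estimates and BLUPs copied from the GLIMMIX output on the slides.
FIXED = {"intercept": 21.3577, "year": 0.6870}
BLUP = {
    73: {"intercept": 2.1023, "year": -0.1608},
    496: {"intercept": -1.8137, "year": -0.07237},
}


def population_average(years_since_1997):
    """Broad-inference prediction: fixed effects only."""
    return FIXED["intercept"] + FIXED["year"] * years_since_1997


def subject_specific(subject, years_since_1997):
    """Narrow-inference prediction: fixed effects plus the subject's BLUPs."""
    b0 = FIXED["intercept"] + BLUP[subject]["intercept"]
    b1 = FIXED["year"] + BLUP[subject]["year"]
    return b0 + b1 * years_since_1997


for offset in (0, 3, 6):
    print(1997 + offset,
          round(population_average(offset), 4),
          round(subject_specific(73, offset), 4),
          round(subject_specific(496, offset), 4))
```

Small differences in the last digit relative to the table come from rounding in the printed output.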

Inference Space Example II: Workers and machines

From McLean, Sanders & Stroup (1991, American Statistician)
Also Chapter 6, ex 2, SAS for Mixed Models
2 machines
3 operators (sample from population)
inference can apply to population of workers or specific worker
KEY CONCEPT: Inference Space

Worker-Machine Example: Fixed Effect Inference

proc glimmix;
  class machine operator;
  model y=machine / ddfm=kr;
  random operator machine*operator;
  lsmeans machine / diff;
  estimate 'BLUE - machine 1' intercept 1 machine 1 0;
  estimate 'BLUE - diff' machine 1 -1;

these ESTIMATE statements give same result as the LSMEANS output below

Type III Tests of Fixed Effects

Effect    Num DF   Den DF   F Value   Pr > F
machine        1        2     20.26   0.0460

based on MS(mach) / MS(Mach*oper)

machine Least Squares Means

machine   Estimate   Std Error      DF   t Value   Pr > |t|
1          50.9483      0.2467   2.973    206.50     <.0001
2          51.9567      0.2467   2.973    210.59     <.0001

Differences of machine Least Squares Means

machine   _machine   Estimate   Std Error   DF   t Value   Pr > |t|
1         2           -1.0083      0.2240    2     -4.50     0.0460

Worker-Machine Example: Prediction

  estimate 'BLUP - m1 narrow' intercept 3 machine 3 0 |
           operator 1 1 1 machine*operator 1 1 1 0 0 0 / divisor=3;
  estimate 'BLUP - diff nrw' machine 3 -3 |
           machine*operator 1 1 1 -1 -1 -1 / divisor=3;

  estimate 'BLUP - oper 1' intercept 2 machine 1 1 |
           operator 2 0 0 machine*operator 1 0 0 1 0 0 / divisor=2;
  estimate 'BLUP - m1 op1' intercept 1 machine 1 0 |
           operator 1 0 0 machine*operator 1 0 0 0 0 0;
  estimate 'BLUP - diff op1' machine 1 -1 |
           machine*operator 1 0 0 -1 0 0;

these statements apply inference to specific workers or worker-machine combinations
• machine 1 averaged over ONLY THE WORKERS IN THE STUDY
• diff between machines for workers in study ONLY
• operator 1 averaged over machines, operator 1 with machine 1 only, operator-1-specific difference between machines

Worker-Machine Example: Prediction (2)

Estimates

Label               Estimate   Standard Error      DF   t Value   Pr > |t|
BLUE - machine 1     50.9483           0.2467   2.973    206.50     <.0001
BLUE - diff          -1.0083           0.2240       2     -4.50     0.0460
BLUP - m1 narrow     50.9483          0.08993       6    566.53     <.0001
BLUP - diff nrw      -1.0083           0.1272       6     -7.93     0.0002
BLUP - oper 1        51.7366           0.1151   6.698    449.30     <.0001
BLUP - m1 op1        51.2979           0.1724   7.885    297.48     <.0001
BLUP - diff op1      -0.8773           0.2567   7.976     -3.42     0.0092

BLUE – inference to population of workers
BLUP – inference to specific worker or set of workers

note impact on standard error

BLUP a.k.a. “Shrinkage Estimator”

BLUP is regressed toward mean

BLUP is E(u|Y)

Degree of shrinkage depends on variance component estimates

Covariance Parameter Estimates

Cov Parm            Estimate
operator              0.1073
machine*operator     0.05100
Residual             0.04852

e.g. operator BLUP is

$E(o_j \mid \bar{y}_{\cdot j}) = Cov(o_j, \bar{y}_{\cdot j})\,[Var(\bar{y}_{\cdot j})]^{-1}\,(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})$
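The shrinkage can be checked numerically against the 'BLUP - oper 1' estimate (51.7366) reported for this example. The Python sketch below is a hand calculation, not the GLIMMIX algorithm. It assumes balanced data with 2 observations per machine-operator cell (an assumption inferred from the GLM standard error, not stated on the slides), and it treats the operator effect plus the average machine*operator effect as one operator-level random effect.

```python
# Variance component estimates from the GLIMMIX output.
var_operator = 0.1073
var_mach_oper = 0.05100
var_residual = 0.04852

n_machines = 2
n_reps = 2  # ASSUMPTION: replicates per machine-operator cell, inferred, not stated

# Means taken from the slides' output.
machine_means = [50.9483, 51.9567]
grand_mean = sum(machine_means) / len(machine_means)
operator1_mean = 51.7625  # raw (PROC GLM) least squares mean for operator 1

# Signal: variance of o_j plus the machine-averaged interaction effect.
var_signal = var_operator + var_mach_oper / n_machines
# Noise: variance of the residual average within one operator's mean.
var_noise = var_residual / (n_machines * n_reps)

# Cov / Var ratio from the BLUP formula: a number between 0 and 1.
shrinkage = var_signal / (var_signal + var_noise)
blup_operator1 = grand_mean + shrinkage * (operator1_mean - grand_mean)

print(round(shrinkage, 4), round(blup_operator1, 4))
```

Under these assumptions the operator mean is pulled about 8% of the way back toward the grand mean. A larger residual variance or fewer observations per operator would shrink it further.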

Relationship to Proc GLM

proc glm;
  class machine operator;
  model y=machine|operator;
  random operator machine*operator / test;
  lsmeans machine operator machine*operator / stderr;
  lsmeans machine / stderr e=machine*operator;
  estimate 'diff' machine 1 -1 / e;
run;

operator   y LSMEAN      Standard Error
1          51.7625000         0.1101420

vs. 51.74, 0.1151 (GLIMMIX 'BLUP - oper 1')

machine   operator   y LSMEAN      Standard Error
1         1          51.3550000         0.1557642

vs. 51.30, 0.1724 (GLIMMIX 'BLUP - m1 op1')

machine   y LSMEAN      Standard Error
1         50.9483333         0.1583947

std error is neither the Mixed broad nor narrow; in GLIMMIX it is produced by
  estimate 'm1' intercept 3 machine 3 0 | operator 1 1 1 machine*operator 0 / divisor=3;

machine   y LSMEAN      Standard Error
1         50.9483333         0.0899305

same as BLUP specific to workers in GLIMMIX

Prediction Example II: Multi-Location Data

From SAS for Mixed Models
9 Locations
3 blocks per location
4 treatments
Major issues
− are blocks fixed or random?
− if random how does one estimate location-specific treatment effects?

ANOVA (ignoring block)

Source        d.f.   Expected Mean Square
Treatment     3      $\sigma^2 + k_1\sigma^2_{LT} + Q_{TRT}$
Location      8      $\sigma^2 + k_1\sigma^2_{LT} + k_2\sigma^2_{L}$
Loc × Trt     24     $\sigma^2 + k_1\sigma^2_{LT}$
error         dfe    $\sigma^2$

If Location fixed:

Source        d.f.   Expected Mean Square
Treatment     3      $\sigma^2 + Q_{TRT}$
Location      8      $\sigma^2 + Q_{LOC}$
Loc × Trt     24     $\sigma^2 + Q_{LT}$
error         dfe    $\sigma^2$

Test of TRT affected

Inference Space

Assuming Locations are Fixed

$Var(\text{trt mean}) = \dfrac{\sigma^2}{\#\,\text{obs/trt}}$

$\text{Std. error(trt mean)} = \sqrt{\dfrac{MS(\text{error})}{\#\,\text{obs/trt}}}$

HOWEVER... if Locations are Random

$Var(\text{trt mean}) = \dfrac{\sigma^2 + k(\sigma^2_L + \sigma^2_{LT})}{\#\,\text{obs/trt}}$

$\text{Std. error(trt mean)} = \sqrt{\dfrac{\hat{\sigma}^2 + k(\hat{\sigma}^2_L + \hat{\sigma}^2_{LT})}{\#\,\text{obs/trt}}}$

Where does Uncertainty Arise?

[Figure: observations grouped by location (Loc 1, Loc 2, ..., Loc 7, Loc 8)]

Only from variation among obs within locations?
− Locations fixed

Or does variation among locations also contribute?
− Locations random

Location-Specific Effects: BLUP

In Multi-Location trial, location-specific effect is

e.g. trt 1 vs trt 2 | location j

$= \tau_1 - \tau_2 + (L\tau)_{1j} - (L\tau)_{2j}$

Implies linear combination of fixed and random effect (predictable function = BLUP)

Basic SAS Programs

for fixed location:

proc glimmix data=MultiCenter;
  class location block treatment;
  model response=location treatment location*treatment;
  random block(location);
  lsmeans treatment;
  lsmeans location*treatment / slice=location slicediff=location;
run;

for random locations:

proc glimmix data=MultiCenter;
  class location block treatment;
  model response=treatment / ddfm=KR;
  random location block(location) location*treatment;
  lsmeans treatment / diff;
  estimate 'trt1 vs trt2' treatment 1 -1 0;
  estimate 'loc A vs loc B' | location 1 -1 0;
  estimate 'trt 1 BLUP' intercept 8 treatment 8 |
           location 1 1 1 1 1 1 1 1 / divisor=8;
  estimate 'trt1 at loc A blup' intercept 1 treatment 1 0 0 0 |
           location 1 0 location*treatment 1 0;

etc – see ch6 MultiCenter.sas for program in detail

“Take Home” points

Inference space usually implies random locations
“Broad” inference on treatments applies to entire population
Location-specific inference may be of interest
− Requires BLUP
Hans Peter Piepho has proposed mixed-model based measures of commonality among locations
Making locations fixed to maximize error d.f. to test TRT is inappropriate


GLM Issues


VII. “GLM” Issues

Bernoulli data

−as a binomial

−special problems with BINARY data

Counts

Rates

Common Non-Normal Models

Bernoulli (binary) observations
Categorical data
− Binomial
− multinomial
Counts
− Poisson
− Over dispersed (e.g. negative binomial)
Rates
Survival times
− Gamma, Weibull
Dispersion measures
− variance
Contingency tables

Elements of GLM (Generalized Linear Model)

Systematic model $X\beta$
Assumed distribution
− implied variance structure
Link function
Examples
□ y ~ Bernoulli(p), $p = \Phi(X\beta)$ or $\text{logit}(p) = X\beta$
□ Y ~ Poisson($\lambda$), $\log(\lambda) = X\beta$


GLM Example

From SAS for Linear Models

Output 10.1, re-expressed in 10.5

Challenger space shuttle data

relate prob{failure} to temperature at launch

DATA: TEMP, TD (# times thermal distress in O-ring), NO_TD

Approach to modeling

Assess relationship between TEMP and Prob{TD=1}, i.e. O-rings show thermal distress
Distribution: Bernoulli
Natural parameter: logit = log[p/(1-p)]
Model: logit(Pr{TD})=a+b(Temp)
Inverse link form:
Pr{TD}=exp[a+b(Temp)]/{1+exp[a+b(Temp)]}

SAS Program: PROC GLIMMIX

proc glimmix data=Challenger;
  model td/total=temp;
  estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
  estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
  estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
  estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
  estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
  estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;

Relevant Output

$\text{logit}(\hat{p}) = 15.04 - 0.23X$, where $p = \Pr\{TD = 1\}$ and $X$ = temp (°F)

Fit Statistics

Pearson Chi-Square        11.13
Pearson Chi-Square / DF    0.80

no evidence of overdispersion

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept    15.0429           7.3786   14      2.04     0.0608
temp         -0.2322           0.1082   14     -2.14     0.0500

Relevant Output (2)

Estimates

                    |------------------ logit scale ------------------|   |---- data scale -----|
Label               Estimate   Standard Error   DF   t Value   Pr > |t|      Mean   Std Error Mean
logit at 50 deg       3.4348           2.0232   14      1.70     0.1117    0.9688          0.06121
logit at 60 deg       1.1131           1.0259   14      1.09     0.2962    0.7527           0.1909
logit at 64.7 deg    0.02197           0.6576   14      0.03     0.9738    0.5055           0.1644
logit at 64.8 deg   -0.00125           0.6518   14     -0.00     0.9985    0.4997           0.1630
logit at 70 deg      -1.2085           0.5953   14     -2.03     0.0618    0.2300           0.1054
logit at 80 deg      -3.5301           1.4140   14     -2.50     0.0256   0.02847          0.03911
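The two halves of this table are linked by the inverse link. The Python sketch below is a hand check, not SAS's internal code. It recomputes the data-scale mean from each logit-scale estimate, and it approximates the data-scale standard error with the delta method, p(1-p) times the logit-scale standard error. That this is how the ILINK standard error arises is an assumption, though the numbers agree closely. The function names are illustrative.

```python
import math


def inv_logit(eta):
    """Inverse logit link: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def delta_method_se(eta, se_eta):
    """Approximate data-scale std error: |dp/d(eta)| * se(eta) = p(1-p) * se(eta)."""
    p = inv_logit(eta)
    return p * (1.0 - p) * se_eta


# (temperature, logit-scale estimate, logit-scale std error) copied from the table.
rows = [(50, 3.4348, 2.0232), (60, 1.1131, 1.0259),
        (70, -1.2085, 0.5953), (80, -3.5301, 1.4140)]
for temp, eta, se in rows:
    print(temp, round(inv_logit(eta), 4), round(delta_method_se(eta, se), 4))

# Temperature at which the fitted probability equals 0.5 (logit = 0).
intercept, slope = 15.0429, -0.2322
temp_at_half = -intercept / slope
print(round(temp_at_half, 2))
```

The crossover temperature falls between 64.7 and 64.8 degrees, which is why those two ESTIMATE statements bracket a logit of zero.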

Alternatives

Express data in binomial form
−SAS for Linear Models, 4th ed., output 10.5

Probit link

link function is $X\beta = \Phi^{-1}(p)$

inverse link is $p = \Phi(X\beta) = \displaystyle\int_{-\infty}^{X\beta} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$

where $\Phi$ = std normal c.d.f.

Logit vs Probit

[Figure: fitted logit and probit curves. Red: probit; Blue: logit]

Probit Model

proc glimmix data=Challenger;
  model td/total=temp / link=probit solution;
  estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
  estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
  estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
  estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
  estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
  estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;

Probit Output

Fit Statistics

Pearson Chi-Square        10.98
Pearson Chi-Square / DF    0.78

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept     8.7750           4.0286   14      2.18     0.0470
temp         -0.1351          0.05839   14     -2.31     0.0364

Estimates (labels carried over from the logit program; estimates are on the probit scale)

Label               Estimate   Standard Error   DF   t Value   Pr > |t|      Mean   Std Error Mean
logit at 50 deg       2.0201           1.1413   14      1.77     0.0985    0.9783          0.05917
logit at 60 deg       0.6692           0.6024   14      1.11     0.2854    0.7483           0.1921
logit at 64.7 deg    0.03421           0.3960   14      0.09     0.9324    0.5136           0.1579
logit at 64.8 deg    0.02070           0.3925   14      0.05     0.9587    0.5083           0.1566
logit at 70 deg      -0.6818           0.3244   14     -2.10     0.0541    0.2477           0.1026
logit at 80 deg      -2.0328           0.7277   14     -2.79     0.0144   0.02104          0.03678
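The probit means can be reproduced the same way by pushing the linear predictor through the standard normal c.d.f. The Python sketch below is a hand check that uses only the printed probit intercept and slope. The helper names are illustrative and are not part of any SAS interface.

```python
import math


def std_normal_cdf(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Probit parameter estimates from the output above.
intercept, slope = 8.7750, -0.1351


def probit_prob(temp):
    """Fitted Pr{TD = 1} at a given launch temperature under the probit model."""
    return std_normal_cdf(intercept + slope * temp)


for temp in (50, 60, 70, 80):
    print(temp, round(probit_prob(temp), 4))
```

Compared with the logit fit, the predicted probabilities differ mostly in the second decimal place, so the two links give similar answers for these data.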

Option 3: Use Binary Data

proc glimmix data=O_Ring;
  /* Careful!! Normal default: without DIST=, this MODEL statement fits a normal model */
  /* model td_bin=temp / solution; */
  model td_bin=temp / dist=binomial link=logit solution;
  estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
  estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
  estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
  estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
  estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
  estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;

Binary Output

Fit Statistics

Pearson Chi-Square        23.17
Pearson Chi-Square / DF    1.10

no evidence of overdispersion

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept    15.0429           7.3786   21      2.04     0.0543
temp         -0.2322           0.1082   21     -2.14     0.0438

Estimates

Label               Estimate   Standard Error   DF   t Value   Pr > |t|      Mean   Std Error Mean
logit at 50 deg       3.4348           2.0232   21      1.70     0.1043    0.9688          0.06121
logit at 60 deg       1.1131           1.0259   21      1.09     0.2902    0.7527           0.1909
logit at 64.7 deg    0.02197           0.6576   21      0.03     0.9737    0.5055           0.1644
logit at 64.8 deg   -0.00124           0.6518   21     -0.00     0.9985    0.4997           0.1630
logit at 70 deg      -1.2085           0.5953   21     -2.03     0.0552    0.2300           0.1054
logit at 80 deg      -3.5301           1.4140   21     -2.50     0.0209   0.02847          0.03911

Binary Data + Random Effects

Binary data in GLM with random effect can be troublesome

Pseudo-likelihood tends to produce biased variance / covariance component estimates

e.g. variance estimates biased down for small cluster size

Larger sample sizes tend to be required

No overdispersion estimate

Binary GLMM example

courtesy of Oliver Schabenberger

200 subjects
random intercept
logistic link

data binary;
  do subject = 1 to 200;
    ranint = rannor(&seed);
    do i = 1 to &n;
      linp = &b0 + ranint;
      pi = 1/(1 + exp(-linp));
      y = ranbin(0,1,pi);
      output;
    end;
  end;
  drop i;
run;

Binary GLMM

Schabenberger used two programs

proc glimmix data=binary;
  class subject;
  model y(event='1') = / dist=binary link=logit s;
  random intercept / subject=subject;
  ods select ParameterEstimates CovParms;
run;

proc nlmixed data=binary;
  parms s2 1 intercept -1;
  model y ~ binary(1/(1+exp(-intercept+gamma)));
  random gamma ~ normal(0,s2) subject=subject;
  ods select Dimensions ParameterEstimates;
run;

GLIMMIX vs NLMIXED Binary Results

cluster size n=4

GLIMMIX

Covariance Parameter Estimates
Cov Parm    Subject   Estimate   Standard Error
Intercept   subject     0.5251           0.1699

Solutions for Fixed Effects
Effect      Estimate   Standard Error    DF
Intercept    -0.7159          0.09211   199

NLMIXED

Parameter Estimates
Parameter   Estimate   Standard Error    DF
s2            0.8159           0.2718   199
intercept    -0.8092           0.1085   199

cluster size n=20

GLIMMIX

Covariance Parameter Estimates
Cov Parm    Subject   Estimate   Standard Error
Intercept   subject     0.9905           0.1373

Solutions for Fixed Effects
Effect      Estimate   Standard Error    DF
Intercept    -0.9239          0.08020   199

NLMIXED

Parameter Estimates
Parameter   Estimate   Standard Error    DF
s2            1.1512           0.1659   199
intercept    -0.9854          0.08691   199

Diagnostics & Alternative Models

Example using count data
SAS Linear Models, Output 10.24
Historically, count data assumed ~ Poisson
Implies mean=variance
In practice, often variance>mean, overdispersion
Requires modification
−scale to correct std error, test statistics for overdispersion
−use different distribution

Basic analysis + model checking

Model checking plots:
1. Residuals vs pred
   a. use std resid
   b. or deviance res
   c. std’ize pred scale
   look for unequal scatter (wrong dist or var fct)
   pattern in resid (wrong model or link)
2. y* vs. $\hat{\eta}$ (xbeta)
   linear, or wrong link

proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b / dist=poisson;
  random intercept / subject=BLOCK;
  output out=check pred=xbeta pred(ilink)=pred residual=r pearson=resid_pearson;
run;

data plot;
  merge check;
  adjlamda=2*sqrt(pred);
  ystar=xbeta+(count-pred)/pred;
  absres=abs(resid_pearson);

proc gplot;
  plot resid_pearson*(pred xbeta);
  plot (resid_pearson)*adjlamda;
  plot ystar*xbeta;
  plot absres*adjlamda;
run;
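The quantities computed in the DATA step can be spelled out for a single observation. The Python sketch below is a hypothetical helper, not part of the course files. It assumes a Poisson model with log link and no extra scale parameter, so the Pearson residual divides by the square root of the predicted mean. The input values in the usage lines are made up for illustration.

```python
import math


def poisson_diagnostics(count, xbeta):
    """Plotting quantities for one observation under a Poisson model, log link.

    xbeta is the linear predictor (model scale); pred is its inverse link.
    """
    pred = math.exp(xbeta)
    return {
        "pred": pred,
        # Poisson variance function: Var(y) = mean, so scale by sqrt(pred).
        "pearson": (count - pred) / math.sqrt(pred),
        # Variance-stabilized prediction scale used on the x-axis.
        "adjlamda": 2.0 * math.sqrt(pred),
        # Working (pseudo) variable: should be linear in xbeta if the link is right.
        "ystar": xbeta + (count - pred) / pred,
    }


# Illustrative values only: observed count 9 where the model predicts 4.
d = poisson_diagnostics(9, math.log(4.0))
print(d)
```

An observation that matches its prediction exactly has a Pearson residual of zero and a ystar equal to xbeta, which is why the ystar plot should follow a straight line when the link is adequate.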

Evidence of Overdispersion

Gener. chi-square / DF should be 1
>1 indicates overdispersion
<1 indicates underdispersion

Fit Statistics

-2 Res Log Pseudo-Likelihood   124.06
Generalized Chi-Square         100.15
Gener. Chi-Square / DF           3.34


Example: plot of residuals x adjlamda


Another look – absolute value resid vs adjlamda


Link? Plot ystar x XBeta

should be linear – no strong evidence of problem

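Strategy 1: Adjust using scale parameter

Poisson log-likelihood is $y\log(\lambda) - \lambda - \log(y!)$

$E(y) = Var(y) = \lambda$

Quasi-likelihood allows scale parameter $\phi$

$Q = \displaystyle\int_{y}^{\lambda} \frac{y - t}{\phi\, t}\, dt = \frac{y\log(\lambda) - \lambda}{\phi} + \text{terms not involving } \lambda$

Now, $E(y) = \lambda$, $Var(y) = \phi\lambda$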

Implementation with GLIMMIX

SCALE estimated from RANDOM _RESIDUAL_

$\hat{\phi} = \dfrac{\text{Generalized } \chi^2}{N - \text{rank}(X)}$

alternatively can use $\dfrac{\text{deviance}}{N - \text{rank}(X)}$

proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b / dist=poisson htype=1,3;
  random intercept / subject=BLOCK;
  random _residual_;
run;

Selected Output

UnScaled

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        1       27     55.83   <.0001

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        0        .         .        .
A              2       27      9.19   0.0009
B              2       27      0.06   0.9402
A*B            4       27      3.11   0.0315

Scaled

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        1       27     16.23   0.0004

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        0        .         .        .
A              2       27      2.67   0.0875
B              2       27      0.02   0.9822
A*B            4       27      0.90   0.4753

Note discrepancy for CTL_TRT and A main effect

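Alternative 2: different distribution e.g. Negative Binomial

Standard math-stat text form:

$P(y) = \dfrac{(y+N-1)!}{y!\,(N-1)!}\; p^{N} (1-p)^{y}$

More useful form: let $\lambda = N(1-p)/p$ and $k = N$; yields p.d.f.

$\dfrac{(y+k-1)!}{y!\,(k-1)!} \left(\dfrac{\lambda}{\lambda+k}\right)^{y} \left(\dfrac{k}{\lambda+k}\right)^{k}$

$\log L = y\log\left(\dfrac{\lambda}{\lambda+k}\right) + k\log\left(\dfrac{k}{\lambda+k}\right) + \log\left[\dfrac{(y+k-1)!}{y!\,(k-1)!}\right]$

exponential family, but quasi-likelihood

$E(y) = \lambda$, $Var(y) = \lambda + \dfrac{\lambda^2}{k}$, natural param is $\log\left(\dfrac{\lambda}{\lambda+k}\right)$

$\lambda$ is the mean and k is the aggregation parameter
small k ⇒ aggregation; k → ∞ ⇒ Poisson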
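The mean-variance relationship can be verified by brute force. The Python sketch below assumes the parameterization in which small k means strong aggregation and large k approaches the Poisson, so that Var(y) = λ + λ²/k. This is an assumption about the slide's garbled formula. PROC GLIMMIX's own negative binomial scale parameter is defined differently, so the k here should not be read as the SAS scale estimate. The function names and the truncation point are illustrative.

```python
import math


def negbin_logpmf(y, lam, k):
    """log P(Y = y) for a negative binomial with mean lam and aggregation parameter k."""
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + y * math.log(lam / (lam + k))
            + k * math.log(k / (lam + k)))


def moments(lam, k, upper=2000):
    """Total probability, mean, and variance by summing the p.m.f. over 0..upper-1."""
    probs = [math.exp(negbin_logpmf(y, lam, k)) for y in range(upper)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


for k in (0.5, 5.0, 50.0):
    total, mean, var = moments(3.0, k)
    print(k, round(total, 6), round(mean, 4), round(var, 4), round(3.0 + 9.0 / k, 4))
```

As k grows, the variance approaches the mean, which is the Poisson limit noted on the slide.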

Negative Binomial with GLIMMIX

proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b / dist=negbin htype=1,3;
  random intercept / subject=BLOCK;
run;

Fit Statistics

-2 Res Log Pseudo-Likelihood   84.48
Generalized Chi-Square         28.32
Gener. Chi-Square / DF          0.94

Type I Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        1       27     10.08   0.0037

Type III Tests of Fixed Effects
Effect    Num DF   Den DF   F Value   Pr > F
CTL_TRT        0        .         .        .
A              2       27      3.53   0.0436
B              2       27      0.03   0.9753
A*B            4       27      1.02   0.4139

Modeling with Offsets

There are cases when modeling count alone is naive
This occurs when counts are “per unit”
− Number of plants per plot
− Number of patients per county
− Number of students per district
− Number of boating accidents per year per lake
− Number of defects per lot
Accurate model must take units into account
Essentially, based on log(count/unit)
Log(count) is link; log(unit) is “offset”

Offset defined

Idea: raw count may be artifact of unit size
Count / unit more informative
Offset
−adjusts for size
−is a regressor whose coefficient is assumed to be 1.0
−used especially in conjunction with Poisson models with log link
−accounts for heterogeneity in rates resulting from difference in size

Modeling with Offsets

$y_i \sim \text{Poisson}(\lambda_i)$

$\lambda_i = \text{size}_i \cdot \exp(\eta_i)$, where $\exp(\eta_i)$ is the rate per unit size

$\log E(y_i) = \log(\lambda_i) = \eta_i + \log(\text{size}_i) = X_i\beta + \text{offset}_i$

Example: Courtesy of Oliver Schabenberger

Some of the data
X is predictor variable
SIZE is the “unit” to be taken into account

Obs   size       x   count
  1   5001   4.597       4
  2   7550   4.245      76
  3   1744   3.918       2
  4   1451   3.273       2
  5   5313   4.140      12
  6   3687   3.438       4
  7   3022   4.763       2
  8   8809   4.445       9
  9   4436   4.191       3
 10   2621   4.835       6

Naive Modeling (not accounting for SIZE)

proc glimmix data=test;
  model count = x / s dist=poisson;
  ods select FitStatistics ParameterEstimates;
run;

Fit Statistics

-2 Log Likelihood            647.12
AIC (smaller is better)      651.12
AICC (smaller is better)     651.45
BIC (smaller is better)      654.50
CAIC (smaller is better)     656.50
HQIC (smaller is better)     652.35
Pearson Chi-Square          1078.66
Pearson Chi-Square / DF       28.39

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept     2.0978           0.4143   38      5.06     <.0001
x           -0.01619           0.1002   38     -0.16     0.8725

Poisson Model with Offset

proc glimmix data=test;
  offs = log(size);
  model count = x / s dist=poisson offset=offs;
  ods select FitStatistics ParameterEstimates;
run;

Fit Statistics

-2 Log Likelihood            318.41
AIC (smaller is better)      322.41
AICC (smaller is better)     322.73
BIC (smaller is better)      325.79
CAIC (smaller is better)     327.79
HQIC (smaller is better)     323.63
Pearson Chi-Square           347.09
Pearson Chi-Square / DF        9.13

Parameter Estimates

Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept    -7.3168           0.5052   38    -14.48     <.0001
x             0.2247           0.1225   38      1.83     0.0746

Alternative to Offset??

Could count/size be treated as binomial?

proc glimmix data=test;
  offs = log(size);
  model count = x / s dist=poisson offset=offs;
  output out=gmxout1 pred(ilink)=mu;
  id _xbeta_ offs _linp_;
  ods exclude all;
run;

proc glimmix data=test;
  model count/size = x / s dist=binomial;
  output out=gmxout2 pred(ilink)=prob;
  ods exclude all;
run;

data gmxout2;
  set gmxout2;
  predcount = prob * size;

Compare Poisson/Offset vs Binomial Results

Poisson results (MU = pred count)

Obs    _xbeta_      offs    _linp_        mu
  1   -6.28394   8.51739   2.23346    9.3321
  2   -6.36302   8.92930   2.56628   13.0173
  3   -6.43649   7.46394   1.02745    2.7939
  4   -6.58140   7.28001   0.69860    2.0109
  5   -6.38661   8.57791   2.19130    8.9468
  6   -6.54433   8.21257   1.66823    5.3028
  7   -6.24664   8.01367   1.76703    5.8535
  8   -6.31809   9.08353   2.76544   15.8860
  9   -6.37516   8.39751   2.02235    7.5561
 10   -6.23047   7.87131   1.64085    5.1595

Binomial results

Obs   size       x   count          prob   predcount
  1   5001   4.597       4   .001866023      9.3320
  2   7550   4.245      76   .001724158     13.0174
  3   1744   3.918       2   .001602034      2.7939
  4   1451   3.273       2   .001385890      2.0109
  5   5313   4.140      12   .001683963      8.9469
  6   3687   3.438       4   .001438241      5.3028
  7   3022   4.763       2   .001936911      5.8533
  8   8809   4.445       9   .001803387     15.8860
  9   4436   4.191       3   .001703368      7.5561
 10   2621   4.835       6   .001968487      5.1594

predicted counts nearly identical
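The Poisson column can be rebuilt by hand from the offset model's parameter estimates (intercept -7.3168, slope 0.2247): the linear predictor gives the log rate per unit size, and adding log(size) turns it into a log count. The Python sketch below is a check on the printed numbers, with illustrative function names. Small last-digit differences come from rounding in the printed estimates.

```python
import math

# Parameter estimates from the Poisson model with offset.
intercept, slope = -7.3168, 0.2247


def predicted_rate(x):
    """Rate per unit size: exp of the linear predictor without the offset."""
    return math.exp(intercept + slope * x)


def predicted_count(size, x):
    """Predicted count: the offset log(size) enters with coefficient fixed at 1."""
    xbeta = intercept + slope * x
    offset = math.log(size)
    return math.exp(xbeta + offset)


# (size, x) for the first three observations in the listing.
for size, x in [(5001, 4.597), (7550, 4.245), (1744, 3.918)]:
    print(size, x, round(predicted_rate(x), 6), round(predicted_count(size, x), 4))
```

Because the rates are tiny (about 0.002 per unit), the Poisson rate and the binomial probability nearly coincide, which is one way to see why the two sets of predicted counts agree.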

ZIP and Hurdle Models

Mixture models for count data
−ZIP = “zero-inflated Poisson”
−ZINB = “zero-inflated Negative Binomial”
−in principle, other zero-inflated models limited only by imagination
Accommodate excess zeros
−Excess zeros cause overdispersion
Are not in exponential family
Cannot be fit with PROC GLIMMIX
Can be fit using PROC NLMIXED

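ZIP Model

$z_i \sim \text{Poisson}(\lambda_i)$

$\Pr(y_i = j) = \begin{cases} \pi_i + (1-\pi_i)\Pr(z_i = 0), & j = 0 \\ (1-\pi_i)\Pr(z_i = j), & j > 0 \end{cases} = \begin{cases} \pi_i + (1-\pi_i)\,e^{-\lambda_i}, & j = 0 \\ (1-\pi_i)\,\dfrac{e^{-\lambda_i}\lambda_i^{\,j}}{j!}, & j > 0 \end{cases}$

$y_i$: observation
$\pi_i$: prob of 0 from Bernoulli process
$\Pr(z_i = 0)$: prob of zero from Poisson process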

Hurdle Model

Two part model
−One process generates zeros
−Another process generates non-zeros

$\Pr(y_i = j) = \begin{cases} \Pr(z_i = 0), & j = 0 \\ \left[1 - \Pr(z_i = 0)\right] \dfrac{\Pr(u_i = j)}{1 - \Pr(u_i = 0)}, & j > 0 \end{cases}$

$y_i$: observation
$\Pr(z_i = 0)$: zeros from Z process
$\Pr(u_i = j) / [1 - \Pr(u_i = 0)]$: truncated at zero distribution
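A short Python sketch of the hurdle probabilities may help fix the two-part structure. It assumes the count process u is Poisson, which the slide leaves unspecified, and the function and argument names are illustrative. Zeros come only from the hurdle part, and positive counts come from the Poisson renormalized to exclude zero.

```python
import math


def poisson_pmf(j, lam):
    """Ordinary Poisson probability of count j with mean lam."""
    return math.exp(j * math.log(lam) - lam - math.lgamma(j + 1))


def hurdle_pmf(j, p_zero, lam):
    """Hurdle model P(y = j).

    p_zero is the probability of a zero from the Z process; positive counts
    follow a zero-truncated Poisson(lam) (ASSUMPTION: Poisson count process).
    """
    if j == 0:
        return p_zero
    truncated = poisson_pmf(j, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - p_zero) * truncated


probs = [hurdle_pmf(j, 0.4, 2.0) for j in range(100)]
print(round(sum(probs), 6), [round(p, 4) for p in probs[:4]])
```

Unlike the ZIP model, the probability of a zero here does not depend on the Poisson mean at all, so a hurdle model can also describe data with fewer zeros than a Poisson would produce.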


ZIP or Hurdle?

Number of doctor visits per year

Number of fish caught by sport fishermen

Cancer mortality

From SAS for Mixed Models, 2nd ed, Ch 15

Credit: Oliver Schabenberger

%let pi = 0.27;
data zip;
  do s = 1 to 100;
    u = rannor(556712);
    do i = 1 to 20;
      x = int(ranuni(0)*100);
      y = int(rannor(0)*100);
      if (ranuni(0) < &pi) then do;
        count = 0;
        lambda = .;
      end;
      else do;
        lambda = exp(-2 + 0.01*x + 0.01*y + u);
        count = ranpoi(0,lambda);
      end;
      output;
    end;
  end;
  drop i u lambda;
run;

ZIP Model with Random Effects

proc nlmixed data=zip;
  parameters b0=0 b1=0 b2=0 a0=0 s2u=1;
  /* linear predictor for the inflation probability */
  linpinfl = a0;
  /* infprob = inflation probability for zeros          */
  /*         = logistic transform of the linear predictor */
  infprob = 1/(1+exp(-linpinfl));
  /* Poisson mean */
  lambda = exp(b0 + b1*x + b2*y + u);
  /* Build the ZIP log likelihood */
  if count=0 then
    ll = log(infprob + (1-infprob)*exp(-lambda));
  else
    ll = log((1-infprob)) + count*log(lambda) - lgamma(count+1) - lambda;
  model count ~ general(ll);
  random u ~ normal(0,s2u) subject=s;
  estimate "inflation probability" infprob;
run;

ZIP NLMIXED Selected Results

Fit Statistics

-2 Log Likelihood          2803.6
AIC (smaller is better)    2813.6
AICC (smaller is better)   2813.7
BIC (smaller is better)    2826.7

Parameter Estimates

Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha      Lower      Upper   Gradient
b0           -1.9979           0.1530   99    -13.06     <.0001    0.05    -2.3014    -1.6944   -0.00224
b1           0.01011         0.001299   99      7.78     <.0001    0.05   0.007535    0.01269   -0.15649
b2           0.01016         0.000394   99     25.78     <.0001    0.05   0.009378    0.01094    -0.0434
a0           -1.0934           0.1594   99     -6.86     <.0001    0.05    -1.4097    -0.7771   -0.00034
s2u           1.0828           0.2095   99      5.17     <.0001    0.05     0.6671     1.4985   -0.00145

Additional Estimates

Label                   Estimate   Standard Error   DF   t Value   Pr > |t|   Alpha    Lower    Upper
inflation probability     0.2510          0.02997   99      8.38     <.0001    0.05   0.1915   0.3104

true parameter values: b0=-2, b1=b2=0.01, a0=-0.9946, s2u=1

GLMM Multi-Clinic Binomial Data

SAS for Linear Models, Output 10.9
also SAS for Mixed Models, Ch 14
from Beitler & Landis, Biometrics, 1985
2 treatments (drug, cntl)
8 clinics, represent population
nij patients observed on trt i at clinic j
yij have favorable response

GLMM for Beitler Landis Data

$\pi_{ij} = \Pr\{\text{favorable} \mid \text{trt } i, \text{clinic } j\}$

Model: $\log\left(\dfrac{\pi_{ij}}{1-\pi_{ij}}\right) = \eta_{ij} = \mu + \tau_i + c_j + (ct)_{ij}$

$c_j \ \text{iid} \ N(0, \sigma^2_C)$; $(ct)_{ij} \ \text{iid} \ N(0, \sigma^2_{CT})$

proc glimmix data=a;
  class clinic trt;
  model fav/nij = trt / dist=binomial link=logit;
  random intercept trt / subject=clinic;
  lsmeans trt / odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 / ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;

Covariance Parameter Estimates

Cov Parm    Subject   Estimate
Intercept   clinic      2.0103
trt         clinic     0.06057

If you drop Clinic x Trt

conditional (SS) model

proc glimmix data=a;
  class clinic trt;
  model fav/nij = trt / dist=binomial link=logit;
  random intercept / subject=clinic;
  lsmeans trt / odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 / ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;

marginal (PA) model

proc glimmix data=a;
  class clinic trt;
  model fav/nij = trt / dist=binomial link=logit;
  random _residual_ / type=cs subject=clinic;
  lsmeans trt / odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 / ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;

Selected Output – Conditional Model

Covariance Parameter Estimates

Cov Parm   Estimate   Standard Error
clinic       2.0327           1.2637

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
trt           1        7      5.98   0.0444

Estimates

Label        Estimate   Standard Error   DF   t Value   Pr > |t|     Mean   Std Error Mean
lsm - cntl    -1.1464           0.5586    7     -2.05     0.0793   0.2411           0.1022
lsm - drug    -0.4220           0.5552    7     -0.76     0.4720   0.3960           0.1328
diff          -0.7244           0.2963    7     -2.45     0.0444

trt Least Squares Means

trt    Estimate   Standard Error   DF   t Value   Pr > |t|     Odds
cntl    -1.1464           0.5586    7     -2.05     0.0793   0.3178
drug    -0.4220           0.5552    7     -0.76     0.4720   0.6557

GLMM with NLMIXED

1. data step to define indicator for Trt=1 (because NLMIXED lacks CLASS statement)

data a;
  input clinic trt $ fav unfav;
  nij=fav+unfav;
  t1=(trt='drug');

2. then, run NLMIXED

proc nlmixed;
  parms mu=1 tau=0 s2c=2;
  eta=mu+tau*t1+cj;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj~normal(0,s2c) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;

NLMIXED with CxT term included

first, also define Trt=2 indicator, here denoted t2

proc nlmixed;
  parms mu=1 tau=0 s2c=2 s2ct=0.08;
  eta=mu+tau*t1+cj+c1j*t1+c2j*t2;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj c1j c2j~normal([0,0,0],[s2c,0,s2ct,0,0,s2ct]) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;

Binary Repeated Measures

2 treatments
20 subjects (animals) per trt
5 times of measurement
response at each measurement 0/1
suggested by companion animal vaccine trials

Several approaches

GEE using GENMOD
PQL using %GLIMMIX
− random subj(trt), or
− CS
G-H quadrature using NLMIXED (not shown) but you could use MIXED
type 1 error control of PQL + random subj(trt) not acceptable
power of PQL/CS or NLMIXED > GEE

various SAS pgm for binary rpt-M data

GEE

proc genmod;
  class trt animal day;
  model y=trt|day / dist=bin type1 type3;
  repeated subject=animal(trt) / type=exch;

PQL, random an(trt)

proc glimmix;
  class trt animal day;
  model y=trt|day / dist=binomial link=logit;
  random animal(trt);

PQL, CS

proc glimmix;
  class trt animal day;
  model y=trt|day / dist=binomial link=logit;
  random day / rside type=cs subject=animal(trt);

NLMixed next page


NLMixed

data nlmx;
  set univar;
  t1=(trt=1); t2=(trt=2);
  d1=(day=1); d2=(day=2); d3=(day=3); d4=(day=4); d5=(day=5);
run;

proc nlmixed;
  parms mu=1 a1=1 b1=1 b2=1 b3=1 b4=1
        ab11=1 ab12=1 ab13=1 ab14=1 sb2=1;
  eta=mu+a1*t1+b1*d1+b2*d2+b3*d3+b4*d4+
      ab11*t1*d1+ab12*t1*d2+ab13*t1*d3+ab14*t1*d4;
  pi=exp(eta+bse)/(1+exp(eta+bse));
  model y~binary(pi);
  random bse~normal(0,sb2) subject=id;
  contrast 'trt' a1;
  contrast 'day' b1,b2,b3,b4;
  contrast 'trt x day' ab11,ab12,ab13,ab14;
run;


Poisson Repeated Measures

Output 10.39, SAS for Linear Models
Leppik, et al (1985); Thall & Vail (1990)
2 treatments
28 patients on trt=0; 31 on trt=1
4 times of measurement
epilepsy: # seizures in 4 test periods
baseline & age covariates


Model for seizure data

$\lambda_{ij}$ denote mean count (# seizures), trt $i$, time $j$

GL Model is:

$$\log(\lambda_{ij}) = \mu + \alpha_i + \tau_j + (\alpha\tau)_{ij} + \beta_{1i}\,(\text{log\_base}) + \beta_2\,(\text{log\_age})$$

Assume CS working correlation structure among repeated measures

using GEE

proc genmod data=seizure;
  class id trt time;
  /* this model first */
  *model y=trt time trt*time log_base trt*log_base log_age / dist=poisson link=log type1 type3;
  /* then this model */
  model y=trt time log_base(trt) log_age / dist=poisson link=log type1 type3;
  repeated subject=id / type=exch corrw;
run;

see SAS file for %GLIMMIX approach


GENMOD to GLIMMIX

using GEE

proc genmod data=seizure;
  class id trt time;
  model y=trt time log_base(trt) log_age / dist=poisson link=log type1 type3;
  repeated subject=id / type=exch corrw;
run;

equivalent GLIMMIX

proc glimmix data=seizure;
  class id trt time;
  model y=trt time log_base(trt) log_age / dist=poisson link=log;
  random time / type=cs subject=id residual;
run;


Degrees of Freedom & Standard Errors

Recall Satterthwaite approximation & Kenward-Roger bias adjustment in LMM

Same issues exist with GLMM

But not nearly as well researched

You can use SATTERTH and KR options in GLIMMIX with non-normal data & non-identity link

But what do they do?
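As a syntax sketch only, the adjustments are requested through the DDFM= option on the MODEL statement. The data set name binary_rm is a hypothetical placeholder, and the model is borrowed from the binary repeated measures example; the sketch shows how to ask for the adjustment, not what it does with non-normal data.

```sas
/* Sketch: request Kenward-Roger d.f. and std error adjustment in a GLMM.   */
/* binary_rm is a hypothetical data set name; replace ddfm=kr with          */
/* ddfm=satterth for the Satterthwaite approximation.                        */
proc glimmix data=binary_rm;
  class trt animal day;
  model y=trt|day / dist=binomial link=logit ddfm=kr;
  random animal(trt);
run;
```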


Power


VIII. Power

Many software packages for power & sample size
− e.g. SAS PROC POWER
− for FIXED effect models only

What if you have “Mixed Model Issues”?
− random effects
− split-plot structure
− errors potentially correlated: longitudinal or spatial data
− any other non-standard model structure

Methods based on PROC GLIMMIX
− adapted from Stroup (2002, JABES)


Mixed Model Background – G, R unknown

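$$F = \frac{(K'\hat{\beta})'\,[K'\hat{C}K]^{-1}\,(K'\hat{\beta})}{\operatorname{rank}(K)} \;\overset{\text{approx}}{\sim}\; F[\operatorname{rank}(K),\ \nu,\ \varphi]$$

$\hat{C}$ is estimate of $C$ using estimated components of $G$ and $R$

$\nu$ (denominator d.f.) may be obvious from design or may need to be approximated, e.g. Satterthwaite, Kenward-Roger

noncentrality parameter:

$$\varphi = (K'\beta)'\,[K'CK]^{-1}\,(K'\beta)$$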


Computing Power using SAS

create data set like proposed design (O’Brien: “exemplary data set”)

run PROC GLIMMIX with covariance components fixed

use GLIMMIX to compute the noncentrality parameter: φ = (F computed by GLIMMIX) × rank(K) [or chi-sq with GLM]

critical F (Fcrit) is value s.t. P{F[rank(K), ν, 0] > Fcrit} = α [or chi-square]

Power = P{F[rank(K), ν, φ] > Fcrit}

SAS functions can compute Fcrit & Power


Compute Power with GLIMMIX – CRD example

/* step 1 - create data set with same structure as proposed design
   use MU (expected mean) instead of observed Y_ij values */
/* this example shows power for 5, 10, and 15 e.u. per trt */

data crdpwrx1;
  input trt mu;
  do n=5 to 15 by 5;
    do eu=1 to n;
      output;
    end;
  end;
cards;
1 100
2 94
3 90
;


Compute Power with GLIMMIX – CRD example

/* step 2 - use PROC GLIMMIX to compute non-centrality parameters
   for ANOVA tests & contrasts
   ODS statements output them to new data sets */
proc sort data=crdpwrx1;
  by n;
run;

proc glimmix data=crdpwrx1;
  by n;
  class trt;
  model mu=trt;
  parms (100) / hold=1;
  contrast 'et1 v et2' trt 0 1 -1;
  contrast 'c vs et' trt 2 -1 -1;
  ods output tests3=b;
  ods output contrasts=c;
run;


/* step 3: combine ANOVA & contrast n-c parameter data sets
   use SAS functions PROBF and FINV to compute power */
data power;
  set b c;
  alpha=0.05;
  ncparm=numdf*fvalue;
  fcrit=finv(1-alpha,numdf,dendf,0);
  power=1-probf(fcrit,numdf,dendf,ncparm);
run;

proc print;
run;

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
trt           2       12      1.27   0.3169

Contrasts

Label        Num DF   Den DF   F Value   Pr > F
et1 v et2         1       12      0.40   0.5390
c vs et           1       12      2.13   0.1698

Obs   n   Effect   NumDF   DenDF   FValue   ProbF    Label       alpha   ncparm    fcrit     power
  1   5   trt          2      12     1.27   0.3169               0.05    2.53333   3.88529   0.22361
  2   5                1      12     0.40   0.5390   et1 v et2   0.05    0.40000   4.74723   0.08980
  3   5                1      12     2.13   0.1698   c vs et     0.05    2.13333   4.74723   0.26978
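
The ncparm column for n = 5 can be checked by hand. The arithmetic below is a sketch that assumes the residual variance is the value held at 100 by the PARMS statement, with treatment means 100, 94 and 90 (grand mean 94.667) and contrast coefficients $c_i$ as given in the CONTRAST statements:

```latex
\varphi_{trt} = \frac{n\sum_i(\mu_i-\bar{\mu})^2}{\sigma^2}
             = \frac{5\,(5.333^2 + 0.667^2 + 4.667^2)}{100}
             = \frac{253.33}{100} = 2.5333

\varphi_{contrast} = \frac{\left(\sum_i c_i\mu_i\right)^2}{\sigma^2\sum_i c_i^2/n}:
\qquad
\varphi_{et1\,v\,et2} = \frac{(94-90)^2}{100\,(2/5)} = \frac{16}{40} = 0.4000,
\qquad
\varphi_{c\,vs\,et} = \frac{(200-94-90)^2}{100\,(6/5)} = \frac{256}{120} = 2.1333
```

These agree with the ncparm values printed in the listing, which is what NumDF × FValue should return when the response is the expected mean rather than observed data.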


More Advanced Example

Plots in 8 x 3 grid
Main variation along 8 “rows”
3 x 2 treatment design
Alternative designs
− randomized complete block (4 blocks, size 6)
− incomplete block (8 blocks, size 3)
− split plot
RCBD “easy” but ignores natural variation


Picture the 8 x 3 Grid

Gradient


SAS Programs to Compare 8 x 3 Design

Split-Plot

data a;
  input bloc trtmnt @@;
  do s_plot=1 to 3;
    input dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 3
1 2 1 2 3
2 1 1 2 3
2 2 1 2 3
3 1 1 2 3
3 2 1 2 3
4 1 1 2 3
4 2 1 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  random trtmnt / subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;


8 x 3 – Incomplete Block

data a;
  input bloc @@;
  do eu=1 to 3;
    input trtmnt dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 1 2 1 3
2 1 1 1 2 2 2
3 1 1 1 3 2 3
4 1 1 2 1 2 2
5 1 2 1 3 2 2
6 1 2 2 1 2 3
7 1 3 2 1 2 3
8 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=trtmnt|dose;
  random intercept / subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;


8 x 3 Example - RCBD

data a;
  input trtmnt dose @@;
  do bloc=1 to 4;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 1 3 2 1 2 2 2 3
;

proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  parms (10) / hold=1;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;


Power for GLMs

2 treatments
P{favorable outcome}: for trt 1 p=0.30; for trt 2 p=0.25
power if n1=300; n2=600

data a;
  input trt y n;
datalines;
1 90 300
2 150 600
;

proc glimmix;
  class trt;
  model y/n=trt / chisq;
  ods output tests3=pwr;
run;

data power;
  set pwr;
  alpha=0.05;
  ncparm=numdf*chisq;
  fcrit=cinv(1-alpha,numdf,0);
  power=1-probchi(fcrit,numdf,ncparm);
run;

proc print;
run;


Power for GLMM

Same trt and sample size per location as before
10 locations
Var(Location)=0.25; Var(Trt*Loc)=0.125
Variance Components: variation in log(OddsRatio)
Power?

data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;

proc glimmix data=a initglm;
  class trt loc;
  model y/n = trt / oddsratio;
  random intercept trt / subject=loc;
  random _residual_;
  parms (0.25) (0.125) (1) / hold=1,2,3;
  ods output tests3=pwr;
run;


GLMM Power Analysis Results

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
  1   trt          1       9   0.05    2.29868   5.11736   0.27370

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
  1      2      1.286    9      0.884      1.871

Gives you expected Conf Limits for # Locations & N / Loc contemplated

Gives you the power of the test of TRT effect on prob(favorable)
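
As a quick check on the odds ratio estimate, the value 1.286 follows directly from the two probabilities assumed in the exemplary data set (p = 0.30 for trt 1 and p = 0.25 for trt 2):

```latex
\text{OR} = \frac{0.30/0.70}{0.25/0.75} = \frac{0.4286}{0.3333} = 1.286,
\qquad
\log(\text{OR}) = 0.2513
```

The point estimate therefore depends only on the assumed probabilities; the number of locations and N per location enter through the confidence limits and the power.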


GLMM Power: Impact of Sample Size?

N of subjects per trt per location?

N of Locations?

Three cases

1. n=300/600, 10 loc
2. n=600/1200, 10 loc
3. n=300/600, 20 loc

Case 1:

data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;

Case 2:

data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 180 600
2 300 1200
;

Case 3:

data a;
  input trt y n;
  do loc=1 to 20;
    output;
  end;
datalines;
1 90 300
2 150 600
;


GLMM Power: Impact of Sample Size?

Recall, for 10 locations, N=300/600, CI for OddsRatio was (0.884, 1.871); Power was 0.274

For 10 locations, N=600 / 1200

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
  1      2      1.286    9      0.891      1.855

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
  1   trt          1       9   0.05    2.40715   5.11736   0.28421

For 20 locations, N=300 / 600

Odds Ratio Estimates

trt   _trt   Estimate   DF   95% Confidence Limits
  1      2      1.286   19      1.006      1.643

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
  1   trt          1      19   0.05    4.59736   4.38075   0.53003

N alone has almost no impact
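
One way to see why N alone matters so little is a back-of-envelope approximation to the variance of the estimated log odds ratio: [2 Var(Trt*Loc) + 1/(n1 p1 q1) + 1/(n2 p2 q2)] / (# locations). This formula is an assumption used for illustration, not a description of what GLIMMIX computes internally, and the data set name ncp_check is hypothetical. Worked by hand, it gives about 2.2987, 2.4071 and 4.5974 for the three cases, in line with the printed ncparm values.

```sas
/* Sketch: approximate noncentrality and power for the three cases.      */
/* p1, p2 and the Trt*Loc variance (0.125) are the values on the slides; */
/* the variance formula is an illustrative approximation.                */
data ncp_check;
  input ncase n1 n2 nloc;
  p1=0.30; p2=0.25;
  s2tl=0.125;
  alpha=0.05;
  log_or=log((p1/(1-p1))/(p2/(1-p2)));
  /* Trt*Loc term does not shrink with N; binomial terms do */
  var_diff=(2*s2tl + 1/(n1*p1*(1-p1)) + 1/(n2*p2*(1-p2)))/nloc;
  ncparm=log_or**2/var_diff;
  dendf=nloc-1;
  fcrit=finv(1-alpha,1,dendf,0);
  power=1-probf(fcrit,1,dendf,ncparm);
datalines;
1 300 600 10
2 600 1200 10
3 300 600 20
;

proc print data=ncp_check;
run;
```

With these inputs the 2 × 0.125 term dominates the binomial terms, so doubling N barely changes the variance, while doubling the number of locations halves it.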


Spatial Data


Example 5 - Spatial
from SAS for Mixed Models, Sect. 11.7

“Alliance” Data from Stroup, Baenziger, and Mulitze (1994)

in GLIMMIX-speak:

data two;
  set alliance;
  obs = _n_;
run;

proc glimmix data=two;
  class Entry Rep obs;
  model Yield=Entry / ddfm=kr;
  random intercept / subject=rep;
  random obs / type=sp(sph)(latitude longitude);
  parms (0.1) (43.4) (27.5) (11.5);
  lsmeans entry;


IX. Spatial Data

Example from SAS for Mixed Models
− Spatial errors in Treatment Comparison studies only
− No spatial mapping, Kriging

Standard parametric models from Geostatistics

RSMOOTH alternative

Issues


[Figure: field layout of plots by rep (1–4); axes LAT (4.30 to 47.30) and LNG (1.2 to 26.4)]

From Stroup, Baenziger & Mulitze (Crop Science, 1994)
56 varieties, 4 blocks, e.u. = 4.3 × 1.2 m plots


Contour Plot of Response

[Figure: contour plot of response; marked plots: B = Buckskin, N = NE86503]


Additional GLIMMIX Code to Plot Spatial Variability

  output out=gmxout2 pred=p;
  ods output lsmeans=lsm2;
  id entry latitude longitude _zgamma_;
run;

proc means data=gmxout2;
  var _zgamma_;
run;

proc print data=gmxout2(obs=20);
run;

proc g3d data=gmxout2;
  plot latitude*longitude=_zgamma_ / grid;
run;


Plot of Spherical Covariance


Alternative Using RSMOOTH

Advantage in Theory: RSMOOTH does not require parametric model of spatial variation, which can be unrealistic

e.g. Alliance data spatial variation is from winter kill

proc glimmix data=alliance;
  class Entry Rep;
  model Yield=Entry / ddfm=kr;
  *model Yield=Entry latitude longitude / ddfm=kr;
  random intercept / subject=rep;
  random latitude longitude / type=rsmooth;
run;


RSMOOTH?

From Penalized Spline
− Ruppert, Wand, and Carroll (2003, Semiparametric Regression, Cambridge)

Prediction: $\hat{y} = B(x)\hat{\theta}$

Objective Function:

$$Q(\theta;\lambda) = [y - B(x)\theta]'\,[y - B(x)\theta] + \lambda\,\theta' D\,\theta$$


RSMOOTH (2)

Rewrite the model

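$$y_i = \beta_0 + \beta_1 x_i + \sum_j \gamma_j\,(x_i - \kappa_j)_+ + e_i$$

$\kappa_j$ is "knot" a.k.a. "join point"

Re-express: $y = X\beta + Z\gamma + e$

then

$$Q^*(\beta,\gamma;\lambda) = \|y - X\beta - Z\gamma\|^2 + \lambda\,\|\gamma\|^2$$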


RSMOOTH (2)

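Spline: minimize $(y - X_B\theta)'(y - X_B\theta) + \lambda\,\theta' D\,\theta$, where $X_B$ is the spline basis matrix

LMM: minimize $(y - X\beta - Zu)'(y - X\beta - Zu) + (\sigma^2/\sigma_u^2)\,u'u$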


RSMOOTH yields following Spatial Plot


RSMOOTH vs SP(SPH)

Sp(SPH):

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
Entry        55    148.2      1.77   0.0038

RSMOOTH:

Type III Tests of Fixed Effects

Effect   Num DF   Den DF   F Value   Pr > F
Entry        55    138.1      1.85   0.0021


However... Plot of LSMeans from two approaches

LSM_RSMOOTH average 31.06
LSM_SP_SPH average 24.40

????


Some NLMM Issues

Consulting problem at UNL
Why nonlinear mixed model (NLMM) seemed appropriate
Problems in implementation
NLMM issues
Alternatives whose implications are not adequately understood


Wheat Sawfly Study

Gary Hein, Research Entomologist, Scottsbluff, NE RREC

Sawflies inhabit/damage wheat
5 tillage treatments: impact on sawflies
Exp design used 4 randomized blocks
Sawfly emergence measured at planned times during growing season


Emergence over TIME by TRT

Black: NoTill
Red: SumBlade (summer)
Cyan: SB&SD
Green: SpDisk (spring)
Blue: SpPlow


“Conventional” Analysis

Emerge = μ + TRT + blk + blk*trt + DATE + TRT*DATE + date*blk(trt)

• blk*trt a.k.a. between subjects or “whole-plot” error

• date*blk(trt) = within subjects or “split-plot” error

ANOVA:

Source               df
blk                   3
TRT                   4
betw subj error      12
DATE                 12
TRT*DATE             48
within subj error   180
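
The degrees of freedom can be verified from the design, assuming 13 measurement dates (inferred here from the 12 d.f. for DATE) along with the 4 blocks and 5 treatments stated earlier:

```latex
\text{betw subj error} = (4-1)(5-1) = 12, \qquad
\text{TRT*DATE} = (5-1)(13-1) = 48, \qquad
\text{within subj error} = 5\,(4-1)(13-1) = 180

3 + 4 + 12 + 12 + 48 + 180 = 259 = (4 \times 5 \times 13) - 1
```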


Standard ANOVA

model: emerge = μ + blk + TRT + w.p. error + TIME + TRT*TIME + s.p. error

The Mixed Procedure

Covariance Parameter Estimates

Cov Parm    Estimate
blk         0.002177
blk*trt     0.005199
Residual    0.01845

Type 3 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
trt             4       12     13.18   0.0002
date           12      180    157.38   <.0001
trt*date       48      180      5.18   <.0001

CS covariance fit adequately


Break out TRT*DATE effect

Type 1 Tests of Fixed Effects

Effect      Num DF   Den DF   F Value   Pr > F
trt              4       12     15.62   0.0001
lin              1      177   2273.39   <.0001
quad             1      177      7.24   0.0078
cubic            1      177    161.10   <.0001
date             9      177      2.95   0.0027
lin*trt          4      177      0.59   0.6716
quad*trt         4      177     26.69   <.0001
cubic*trt        4      177      2.13   0.0792
trt*date        36      177      3.08   <.0001


Alternative Modeling Considerations

Basic form of Model:

$$y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$$

$\mu_{ij}$: mean of $i^{th}$ trt at $j^{th}$ time
$w_{ik}$: whole plot error, i.i.d. $N(0, \sigma_W^2)$
$e_{ijk}$: split plot error, i.i.d. $N(0, \sigma_S^2)$

Modeling $\mu_{ij}$

1. Decompose $\mu_{ij}$ in “standard ANOVA”: $\mu$ + Trt + Time + Trt*Time
2. Further decompose via polynomial regression
3. Nonlinear decomposition, e.g. Gompertz
4. Transform $y_{ijk}$ to “linearize” response profile over date
   a. logit or probit (assume sigmoid profile is symmetric)
   b. complementary log-log (allows asymmetry)


Gompertz Model:

$$\mu_{ij} = \alpha_i \exp\{-\exp[-\beta_i\,(date_j - \gamma_i)]\}$$

$\alpha_i$ is asymptote of $i^{th}$ treatment
$\beta_i$ is "slope" of $i^{th}$ treatment
$\gamma_i$ is inflection point of $i^{th}$ treatment


Parameter Estimates

Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|
a1          0.9949     0.03629          19     27.42   <.0001
a2          0.9666     0.03793          19     25.48   <.0001
a3          0.9868     0.04609          19     21.41   <.0001
a4          1.0037     0.06284          19     15.97   <.0001
a5          0.9236     0.04390          19     21.04   <.0001
b1          0.5435     0.08104          19      6.71   <.0001
b2          0.4822     0.08743          19      5.52   <.0001
b3          0.4506     0.09845          19      4.58   0.0002
b4          0.3431     0.06859          19      5.00   <.0001
b5          0.8544     0.1810           19      4.72   0.0001
c1          0.3615     0.05388          19      6.71   <.0001
c2          0.3224     0.05841          19      5.52   <.0001
c3          0.2940     0.06370          19      4.62   0.0002
c4          0.2186     0.04360          19      5.01   <.0001
c5          0.5319     0.1125           19      4.73   0.0001
s2w         0.002926   0.001355
s2s         0.01598    0.001462

These are ML estimates. Bias?


Fit of Gompertz


Trt Comparisons with NLMIXED

Contrasts

Label                  Num DF   Den DF   F Value   Pr > F
among a                     4       19      0.50   0.7383
among b                     4       19      2.19   0.1085
among c                     4       19      2.30   0.0966
a: nt vs sum bld            1       19      0.29   0.5956
a: nt+sb vs sb&sd           1       19      0.01   0.9108
a: sp dsk vs sp plow        1       19      1.09   0.3089
a: nt+sb vs sp d+p          1       19      0.14   0.7169
b: nt vs sum bld            1       19      0.26   0.6132
b: nt+sb vs sb&sd           1       19      0.29   0.5950
b: sp dsk vs sp plow        1       19      6.97   0.0161
b: nt+sb vs sp d+p          1       19      0.57   0.4590
c: nt vs sum bld            1       19      0.24   0.6279
c: nt+sb vs sb&sd           1       19      0.41   0.5305
c: sp dsk vs sp plow        1       19      6.74   0.0177
c: nt+sb vs sp d+p          1       19      0.21   0.6497


Issues with Test Results

denominator degrees of freedom?
− DF in NLMIXED based on simple N-1 rule
− MIXED uses Satterthwaite/KR
− NLMIXED analog?

bias in test statistics?
− In MIXED, ML variance estimates biased
− Test statistics biased
− Excessive type I error rates familiar in MIXED
− Same in NLMIXED?


Alternative NLMIXED Analysis

1. Use MIXED to obtain REML estimates of $\sigma_W^2$ and $\sigma_S^2$

2. Include REML variance component estimates in NLMIXED as known

3. NLMIXED will compute std errors and test statistics using REML estimates
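
The three steps above might be sketched as follows. This is a simplified illustration, not the course's actual program: the data set name sawfly, the subject variable plot (one level per blk*trt unit), the starting values, and the Gompertz parameterization are all assumptions, and a single common curve is fitted for brevity where the analysis on the slides has separate a, b, c per treatment. The variance components are the REML estimates reported by MIXED; they are assigned as constants and left out of the PARMS statement so that NLMIXED treats them as known.

```sas
/* Sketch only: sawfly, plot, and the starting values are hypothetical. */
proc nlmixed data=sawfly;
  parms a=1 b=0.5 c=0.3;          /* curve parameters only              */
  s2w=0.005199;                   /* REML whole-plot variance, held     */
  s2s=0.01845;                    /* REML split-plot variance, held     */
  /* assumed Gompertz form plus whole-plot random effect w */
  mu=a*exp(-exp(-b*(date-c))) + w;
  model emerge ~ normal(mu, s2s);
  random w ~ normal(0, s2w) subject=plot;
run;
```

Because s2w and s2s never appear in PARMS, the reported standard errors and tests are conditional on the REML values, which is the intent of step 3.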


NLMIXED REML Tests

MLE:  $\sigma_W^2$ = 0.002926, $\sigma_S^2$ = 0.01598
REML: $\sigma_W^2$ = 0.005199, $\sigma_S^2$ = 0.01845

Label                  Num DF   Den DF   F Value   Pr > F   Vs. ML
among a                     4       19      0.38   0.8188
among b                     4       19      1.81   0.1690   .1085
among c                     4       19      1.89   0.1537   .0966
a: nt vs sum bld            1       19      0.26   0.6138
a: nt+sb vs sb&sd           1       19      0.00   0.9796
a: sp dsk vs sp plow        1       19      0.77   0.3918
a: nt+sb vs sp d+p          1       19      0.15   0.7046
b: nt vs sum bld            1       19      0.22   0.6419
b: nt+sb vs sb&sd           1       19      0.18   0.6737
b: sp dsk vs sp plow        1       19      5.88   0.0255   .0161
b: nt+sb vs sp d+p          1       19      0.52   0.4788
c: nt vs sum bld            1       19      0.21   0.6555
c: nt+sb vs sb&sd           1       19      0.27   0.6114
c: sp dsk vs sp plow        1       19      5.68   0.0277   .0177
c: nt+sb vs sp d+p          1       19      0.20   0.6586


Hein: “What if we transform the data to linearize it, then use MIXED?”

Denote response variable emerge by $y$; then:

$$y = \alpha \exp\{-\exp[-\beta\,(date - \gamma)]\}$$

if we assume $\alpha = 1$, then

$$\log[-\log(y)] = -\beta\,(date - \gamma)$$
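
A minimal sketch of the linearizing step, assuming a data set named sawfly with variables emerge, blk, trt and a numeric date (all hypothetical names): the complementary log-log transform is only defined for 0 < emerge < 1, so other values are set to missing, and a numeric copy of date serves as the linear regressor.

```sas
/* Sketch only: data set and variable names are assumptions. */
data cll;
  set sawfly;
  lin=date;                                  /* numeric copy for linear term */
  if 0 < emerge < 1 then cll=log(-log(emerge));
  else cll=.;                                /* undefined at 0 and 1         */
run;

proc mixed data=cll;
  class blk trt date;
  model cll=trt lin lin*trt trt*date / htype=1;
  random blk blk*trt;
run;
```

Observations at exactly 0 or 1 drop out of this analysis, which is one of the costs of the transformation worth weighing against the NLMM fit.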


Plot of CLogLog over Date by Trt


MIXED Analysis of CLogLog

Type 1 Tests of Fixed Effects

Effect     Num DF   Den DF   F Value   Pr > F
trt             4       12     15.69   0.0001
lin             1      180   1402.85   <.0001
lin*trt         4      180      3.58   0.0077
trt*date       55      180      7.02   <.0001

Tests of Lin and Lin*Trt correspond to equality of $\beta_i$ and $\gamma_i$ for all treatments in Gompertz NLMM


Decomposing Contrasts

Label                Num DF   Den DF   F Value   Pr > F   Vs NLMM
trt (b)                   4       15      6.12   0.0040   .169
c                         4      120      3.62   0.0080   .154
b: nt v sum bld           1       15      2.15   0.1631
b: nt&sb vs sb&sd         1       15      4.37   0.0541   .674
b: sp d v p               1       15      2.27   0.1526   .026
b: nt&sb v sp d&p         1       15     19.96   0.0005
c: nt v sum bld           1      120      2.11   0.1491
c: nt&sb vs sb&sd         1      120      3.49   0.0644   .611
c: sp d v p               1      120      0.99   0.3214   .028
c: nt&sb v sp d&p         1      120     11.08   0.0012

NLMM too conservative? or is Linearized LMM too liberal?


Unresolved Issues


Unresolved NLMIXED Issues

REML vs. ML variance component estimates
Degrees of Freedom
Starting Values and Convergence
Are NLMIXED tests too conservative?
Implications for standard errors??
Correlated error repeated measures?
When are linearized models analyzed using LMM (e.g. Proc Mixed) preferable?
Design


GLIMMIX vs MIXED/GENMOD

GLIMMIX has very useful mean comparison options not available in MIXED

−especially for Factorial Simple Effects

GLIMMIX can model true GLMM’s

GLIMMIX is “touchy” (e.g. use of SUBJECT=)

Many Research Issues

− RSMOOTH
− Properties of NonNormal KR, working correlation, DDF, etc.
− Computational Methods


Does GLIMMIX replace MIXED/GENMOD?

For GLMMs – no question
For GLMs / LMMs
− for the most part – YES

Most GENMOD & MIXED programs can be duplicated in GLIMMIX
− Mean Comparison features
− no need to “trick” GENMOD into GLMM with marginal model (e.g. split-plot, rpt measures)