Department of Statistics
Introduction to Modeling Change Over Time
with Generalized Mixed Models
using SAS PROC GLIMMIX
A Short Course – 14 May 2007
Instructor: Walt Stroup, Ph.D.
Professor & Chair, UNL Department of Statistics
14 May 2007 SSP Core Facility 2

Outline of Short Course (G/C = Growth/Change Model)
1. Introduction
   a. motivating examples
   b. Social Science HLM-speak vs. BioStat GLMM-speak
2. GLMM / HLM
   a. essential background
   b. recurring modeling issues
3. SAS / GLIMMIX syntax
4. G/C Models - 1st part of the picture: Factorial trt designs
   a. with various error structures & distributions
   b. with repeated measures & correlated errors
5. G/C Models - 2nd part of the picture: Random Effects issues
   a. random coefficients
   b. prediction vs. estimation
6. G/C Models - 3rd part of the picture: GLM issues: binary, count, rate, zero-inflated models
7. Power & Planning
8. Nonlinear mixed models

Recurring Themes
“Mixed Model” Issues
− fixed or random?
− error terms: which one & are they correlated?
− std error & d.f.
− prediction or estimate? (“inference space”)
“GLM” Issues
− what distribution? incl. “is it really a distribution & does it matter?”
− what link: “data” vs “model” scale?
− overdispersion
− computational issues

Recurring Themes
George Bernard Shaw:
“America and England are two peoples separated by a common language.”
Generalized Mixed Models have
− AgStat-speak
− BioStat-speak
− Social/Behavioral Science Stat (HLM) speak
One goal: serve as translator

I. Introduction
− General considerations for modeling
− Several examples illustrating generalized and mixed models
− Typology of models
− Background theory
− Decision chart to match model with software available in SAS

General Model considerations
− A model is a description of the components of an observation
− observation = systematic + random
− Nelder: random = ephemeral + noise, or random = random model + random error
− Alternative: random = design components + remaining variation
“All models are wrong but some are useful” – G.E.P. Box

General Mixed Model Setting
− $Y$ is vector of responses (observable)
− $u$ is vector of random (design induced) effects [not (directly) observable]
− relevant distributions
  o $Y \mid u \sim f_C(\mu, R)$
  o $u \sim f_R(0, G)$
− Model is of conditional mean of $Y \mid u$: $E(Y \mid u) = \mu = h(X, \beta, Z, u)$
Inexact (but useful) correspondences: $Y \mid u$ is HLM Level 1 (Biostat: subject-specific); $u$ is Level 2

Typology of Models
Type   Mean Model             Distribution
NLMM   $h(X, \beta, Z, u)$    $y \mid u$ general, $u$ normal **
GLMM   $h(X\beta + Zu)$       $y \mid u$ general, $u$ normal *
LMM    $X\beta + Zu$          $u$, $y \mid u$ normal
NLM    $h(X, \beta)$          $y$ normal
GLM    $h(X\beta)$            $y$ general
LM     $X\beta$               $y$ normal
* for PROC GLIMMIX   ** for this course; (G/N)LMM can be more general

Example 1: Random Effects Model
Data: Output 4.1, p. 94, SAS for Linear Models, 4th ed.
− 20 packages of ground beef
− 3 samples per package
− 2 counts per sample
− response variable: microbial count
− response = mean + package + sample(package) + error
  i.e. observation = systematic + random model + error

Model for Example 1
$$y_{ijk} = \mu + p_i + s(p)_{ij} + e_{ijk}, \quad i = 1, 2, \ldots, 20;\; j = 1, 2, 3;\; k = 1, 2$$
$$p_i \ \text{i.i.d.}\ N(0, \sigma_P^2); \quad s(p)_{ij} \ \text{i.i.d.}\ N(0, \sigma_S^2); \quad e_{ijk} \ \text{i.i.d.}\ N(0, \sigma^2)$$
− $y_{ijk}$ is observation [log(count)]
− $\mu$ is overall mean (systematic / fixed)
− $p_i$, $s(p)_{ij}$ are random model effects
− $e_{ijk}$ is random error
Convention: fixed Greek; random Latin
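
A minimal Python sketch (Python rather than SAS, only so the arithmetic can be checked without a SAS session) of how the three variance components of the nested package / sample / count model combine in the variance of one package mean. The function name and the variance values are hypothetical, chosen for illustration only; they are not estimates from the ground beef data.

```python
def var_package_mean(sigma2_p, sigma2_s, sigma2_e, n_samples=3, n_counts=2):
    """Variance of one package mean under y = mu + p + s(p) + e.

    Averaging over samples and counts shrinks the sample and error
    components but leaves the package component untouched.
    """
    return sigma2_p + sigma2_s / n_samples + sigma2_e / (n_samples * n_counts)


# Hypothetical variance components (NOT estimates from the course data).
sigma2_p, sigma2_s, sigma2_e = 0.60, 0.30, 0.12

var_single_obs = sigma2_p + sigma2_s + sigma2_e          # variance of one y_ijk
var_mean = var_package_mean(sigma2_p, sigma2_s, sigma2_e)  # 3 samples x 2 counts
```

With these made-up values the package component dominates the variance of the mean, which is why taking more counts per sample would help little if packages are what vary.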

Hierarchical Levels
Unit        Size     Level
students    small    1
classroom   medium   2
school      large    3

Hierarchical Level to Statistical Model
$y_{ijk}$ = $k$th student, $j$th classroom, $i$th school
GLIMMIX-speak:
$$y_{ijk} = \text{mean} + \text{school} + \text{classroom(school)} + \text{student}$$
$$y_{ijk} = \mu + s_i + c(s)_{ij} + e_{ijk}$$
HLM-speak:
Level 1 (student): $y_{ijk} = \beta_{0ij} + e_{ijk}$, where $\beta_{0ij} = \mu + s_i + c(s)_{ij}$
Level 2 (classroom): $y_{ijk} = \beta_{0i} + c(s)_{ij} + e_{ijk}$, where $\beta_{0i} = \mu + s_i$
Level 3 (school)

Modeling Issues
1. Estimate the variance components (the $\sigma_i^2$'s)
2. Estimate, standard error, and interval estimate of $\mu$
3. Estimates of package, sample effects
4. a.k.a. estimates of school and classroom effects
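
Singer: HLM to MIXED
Unconditional means model (one-way random effects model)
HLM, Raudenbush & Bryk (2002):
$$y_{ij} = \beta_{0j} + r_{ij}, \quad r_{ij} \sim N(0, \sigma^2)$$
$$\beta_{0j} = \gamma_{00} + u_{0j}, \quad u_{0j} \sim N(0, \tau_{00})$$
GLIMMIX:
$$y_{ij} = \mu + a_i + e_{ij}, \quad a_i \sim N(0, \sigma_A^2), \quad e_{ij} \sim N(0, \sigma^2)$$
Include Level 2 Covariate
“HLM-speak”:
$$\beta_{0j} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_j + u_{0j}$$
$$y_{ij} = \gamma_{00} + \gamma_{01}\,\text{MEANSES}_j + u_{0j} + r_{ij}$$
“GLIMMIX-speak”:
$$y_{ij} = \mu + \beta_1 X_j + s_j + e_{ij}$$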

Example 2: Blocking & Multi-Location
Data: SAS for Linear Models: Output 3.7, discussed as mixed model in Section 4.3; Output 11.30; SAS for Mixed Models, 2nd ed., Section 6.6
Output 11.30 discussed here
− 3 treatments
− 8 locations
− locations represent a population
− 3-12 blocks depending on location
− response = trt + loc + blk(loc) + trt×loc + error
  i.e. observation = systematic + random model + error

Example 2 framed by Extending School / Classroom Example
[Diagram: two school → classroom → students hierarchies, each assigned a Treatment]

Model with Treatment
[Diagram: Treatment → school → classroom → students]
$$y_{ijkl} = \text{trt} + \text{school(trt)} + \text{classroom(school)} + \text{student}$$
$$y_{ijkl} = \mu + \tau_i + s(\tau)_{ij} + c(s, \tau)_{ijk} + e_{ijkl}$$
Level 1: $y_{ijkl} = \beta_{0ijk} + e_{ijkl}$
Level 2: $y_{ijkl} = \beta_{0ij} + c(s, \tau)_{ijk} + e_{ijkl}$
Level 3: between school model + trt as above
Modeling Issues
1. Appropriate error term to test treatment
2. Standard error of treatment mean
− (inference space)
3. Intra-block vs. inter-block analysis

ANOVA (ignoring block)
Location random:
Source       d.f.   Expected Mean Square
Treatment    2      $\sigma^2 + k_1\sigma_{LT}^2 + Q_{TRT}$
Location     7      $\sigma^2 + k_1\sigma_{LT}^2 + k_2\sigma_L^2$
Loc × Trt    14     $\sigma^2 + k_1\sigma_{LT}^2$
error        dfe    $\sigma^2$
If Location fixed:
Source       d.f.   Expected Mean Square
Treatment    2      $\sigma^2 + Q_{TRT}$
Location     7      $\sigma^2 + Q_{LOC}$
Loc × Trt    14     $\sigma^2 + Q_{LT}$
error        dfe    $\sigma^2$
Test of TRT affected
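
Inference Space
Assuming Locations are Fixed:
$$\mathrm{Var}(\text{trt mean}) = \frac{\sigma^2}{\#\,\text{obs/trt}}, \qquad \text{Std. error(trt mean)} = \sqrt{\frac{MS(\text{error})}{\#\,\text{obs/trt}}} = 0.91$$
HOWEVER... if Locations are Random:
$$\mathrm{Var}(\text{trt mean}) = \frac{\sigma^2 + k(\sigma_L^2 + \sigma_{LT}^2)}{\#\,\text{obs/trt}}, \qquad \text{Std. error(trt mean)} = \sqrt{\frac{\hat\sigma^2 + k(\hat\sigma_L^2 + \hat\sigma_{LT}^2)}{\#\,\text{obs/trt}}} = 3.62$$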
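
To make the inference-space contrast concrete, here is a small Python sketch of the two standard-error formulas for a treatment mean: the narrow one (locations fixed, residual variance only) and the broad one (locations random, location and location-by-treatment variance added). The helper name and every numeric input are hypothetical; they are not the values behind the 0.91 and 3.62 reported for the multi-location example.

```python
import math


def se_trt_mean(sigma2, n_per_trt, k=0.0, sigma2_loc=0.0, sigma2_loc_trt=0.0):
    """Std. error of a treatment mean.

    With the defaults (k = 0) this is the locations-fixed formula
    sqrt(sigma2 / n); supplying k and the two location components gives
    the locations-random formula sqrt((sigma2 + k*(s2_L + s2_LT)) / n).
    """
    return math.sqrt((sigma2 + k * (sigma2_loc + sigma2_loc_trt)) / n_per_trt)


# Hypothetical inputs (illustration only).
sigma2, n_per_trt, k = 4.0, 64, 8.0
sigma2_loc, sigma2_loc_trt = 1.5, 0.5

se_fixed = se_trt_mean(sigma2, n_per_trt)
se_random = se_trt_mean(sigma2, n_per_trt, k, sigma2_loc, sigma2_loc_trt)
```

The broad-inference standard error can only be larger, because it adds non-negative variance components to the numerator.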

Where does Uncertainty Arise?
[Diagram: Loc 1, Loc 2, ..., Loc 7, Loc 8]
− Only from variation among obs within locations? → Locations fixed
− Or does variation among locations also contribute? → Locations random
Intra- vs. Inter-block analysis
Intra- (fixed) block analysis based only on within block treatment differences
Inter-block analysis also accounts for variance among blocks (random combines inter- and intra-)
Lead to equivalent tests when all treatments appear equally in each block
Not equivalent otherwise
In most cases, combined inter-/intra-block analysis is more efficient

Example 3: Repeated Measures / Longitudinal
Data: SAS for Linear Models, Output 8.1; SAS for Mixed Models, Chapter 5
− 3 treatments (2 test drugs + placebo)
− $n_i$ patients per treatment
− 8 times of measurement (1, 2, 3, ..., 8 hours post trt)
− baseline measurement at time 0
− response = trt + hour + trt×hour + pat(trt) + error
  i.e. observation = systematic + random model + error
Variations on this theme are “latent growth models”
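
Growth Models: Singer HLM-speak to GLIMMIX-speak
Unconditional Linear Growth Model
HLM       GLIMMIX
Level 1   Within subjects
Level 2   Between subjects
Level 1 (within individual):
$$y_{ij} = \beta_{0j} + \beta_{1j}\,\text{time}_{ij} + r_{ij}, \quad r_{ij} \sim N(0, \sigma^2)$$
Level 2:
$$\beta_{0j} = \gamma_{00} + u_{0j}, \quad \beta_{1j} = \gamma_{10} + u_{1j}, \quad \begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim MVN\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{bmatrix}\right)$$
Combined:
$$y_{ij} = \gamma_{00} + u_{0j} + (\gamma_{10} + u_{1j})\,\text{time}_{ij} + r_{ij}$$
$$y_{ij} = \underbrace{\gamma_{00} + \gamma_{10}\,\text{time}_{ij}}_{\text{population-averaged (PA)}} + \underbrace{u_{0j} + u_{1j}\,\text{time}_{ij}}_{\text{subject-specific (SS)}} + r_{ij}$$
($u_{0j}$, $u_{1j}$: between subject; $r_{ij}$: within subjects)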

Singer (1998)
− Excellent paper translating HLM-speak to PROC MIXED
− Uses Raudenbush & Bryk examples
− Fair warning to readers, however: it’s dated
  − new features & output revisions in SAS
  − some of the output encouraged confusion or poor practice
  − specifics: revised output of Fit Statistics; misleading output for variance estimates deleted; Kenward-Roger procedure for d.f. & std errors
− I’ll update & make switch to PROC GLIMMIX

Modeling Issues
1. Errors may be correlated
   a. May affect conclusions
   b. How to select covariance model
2. Denominator degrees of freedom
3. Bias in standard errors and test statistics

Impact of Correlated Errors
Covariance Model                     den df       F-value        Pr>F
errors independent                   483          7.11           <0.0001
errors correlated, no structure      69 (98.1)    4.06 (3.66)    <0.0001
  (bias-corrected values in parentheses)
AR(1)                                483          3.93           <0.0001
AR(1), bias corrected                424          3.89           <0.0001

Example 4
Data: SAS for Mixed Models, Section 14.5
− 2 treatments (Test Drug, Control)
− 8 clinics
− clinics represent a population
− $n_{ij}$ subjects at $j$th location (clinic) on $i$th treatment
− response: favorable or unfavorable ($f_{ij}$ = # fav)
− response = trt + clinic + clinic×trt + error
  i.e. observation = systematic + random model + error

Modeling Issues
1. Response ($f_{ij} / n_{ij}$) is binomial, not normal
2. Response may not be linear in model parameters
3. Errors may not be additive
4. Variance of binomial & normal are different
   a. heterogeneous
   b. depends on location parameter

Generalized Linear Mixed Model
observations: $f_{ij} \mid c_j, (tc)_{ij} \sim \text{Binomial}(n_{ij}, \pi_{ij})$; proportion $= f_{ij} / n_{ij}$
let $\pi_{ij} = \Pr\{\text{favorable response} \mid \text{trt}_i, \text{clinic}_j\}$
Model:
$$\eta_{ij} = \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \mu + \tau_i + c_j + (tc)_{ij}$$
$$c_j \ \text{i.i.d.}\ N(0, \sigma_C^2); \quad (tc)_{ij} \ \text{i.i.d.}\ N(0, \sigma_{TC}^2)$$
$\pi_{ij}$ modeled by
$$\pi_{ij} = \frac{\exp[\mu + \tau_i + c_j + (tc)_{ij}]}{1 + \exp[\mu + \tau_i + c_j + (tc)_{ij}]}$$
e.g. logistic mixed model
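
The link / inverse-link pair of the logistic mixed model can be sketched in a few lines of Python. The effect values below (overall mean, treatment effect, clinic effect, clinic-by-treatment effect) are hypothetical numbers chosen only to show the mapping from the linear predictor to a probability; they are not estimates from the clinic data.

```python
import math


def logit(p):
    """Link: probability -> log odds."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse link: linear predictor -> probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical effects on the logit ("model") scale.
mu, tau_i, c_j, tc_ij = -0.4, 0.9, 0.3, -0.1

eta_ij = mu + tau_i + c_j + tc_ij   # linear predictor (model scale)
pi_ij = inv_logit(eta_ij)           # conditional probability (data scale)
```

Effects add on the model scale, while the probability on the data scale always stays strictly between 0 and 1.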

Example 5
SAS for Linear Models, Output 10.39
− 2 treatments
− $n_i$ persons per treatment
− 4 times of measurement
− response = number of seizures (count)
− baseline and age observations
− response = trt + hour + trt×hour + baseline & age + pat(trt) + error
  i.e. observation = systematic + random model + error

Modeling Issues
− Count typically not ~ normal
− Poisson (or negative binomial) more likely
− Generalized Linear Model issues
  − Linear model not good direct model of mean
  − Variance depends on mean
− Repeated Measures issues
  − Observations within subjects correlated over time
  − Between subject variance

Example 6
SAS for Mixed Models, Section 1.5.6
− 5 treatments observed in each of 4 randomized blocks
− several measurements at days between 130 and 180 growing degree days
− response = (trt, day) + block + blk×trt + error
  i.e. observation = systematic + random model + error

Emergence over TIME by TRT
[Figure: emergence over time by treatment. Legend: Black: NoTill; Red: SumBlade (summer); Cyan: SB&SD; Green: SpDisk (spring); Blue: SpPlow]

Modeling Issues
− “Usual” mixed model and repeated measures issues, plus
− Linear model is poor model of trt×day means

Nonlinear Mixed Model
Mixed Model:
$$y_{ijk} = \mu_{ij} + b_k + w_{ik} + e_{ijk}$$
− $\mu_{ij}$ is trt × day mean; $b_k$ is block effect
− $w_{ik}$ is between subject error
Gompertz Model:
$$\mu_{ij} = \alpha_i \exp\{-\exp[-\beta_i(\text{date}_j - \delta_i)]\}$$
− $\alpha_i$ is asymptote of $i$th treatment
− $\beta_i$ is “slope” of $i$th treatment
− $\delta_i$ is inflection point of $i$th treatment
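
A short Python sketch of a Gompertz mean function with an asymptote, a "slope" and an inflection point, as described on this slide. The parameterization used here is one common form and is an assumption, as are the parameter values; neither is taken from the emergence data.

```python
import math


def gompertz(day, asymptote, slope, inflection):
    """Gompertz curve: asymptote * exp(-exp(-slope * (day - inflection)))."""
    return asymptote * math.exp(-math.exp(-slope * (day - inflection)))


# Hypothetical parameters for one treatment (illustration only).
asymptote, slope, inflection = 90.0, 0.15, 150.0

at_inflection = gompertz(inflection, asymptote, slope, inflection)  # asymptote / e
late_season = gompertz(400.0, asymptote, slope, inflection)         # close to asymptote
curve = [gompertz(d, asymptote, slope, inflection) for d in range(130, 181, 10)]
```

In this form the curve passes through asymptote / e at the inflection point and rises monotonically toward the asymptote, which a straight line in day cannot mimic.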

Generalized Mixed Model SAS Software Decision Table
Response Normal
Errors   Random Effects   Mean Model Linear?   SAS Proc
Indep    no               yes                  GLM, MIXED, GLIMMIX
Indep    no               no                   NLIN
Indep    yes              yes                  MIXED, GLIMMIX
Indep    yes              no                   NLMIXED, %NLINMIX
Corr     -                yes                  MIXED, GLIMMIX
Corr     -                no                   NLMIXED, %NLINMIX
Response Non-Normal
Errors   Random Effects   Mean Model Linear?   SAS Proc
Indep    no               yes                  GENMOD, GLIMMIX
Indep    no               no                   NLMIXED
Indep    yes              yes                  GLIMMIX
Indep    yes              no                   NLMIXED
Correl   -                yes                  GLIMMIX (GENMOD)
Correl   -                no                   -
Essential GLMM Background
First
How do I run a SAS Program?
???????
It’s easier than the urban legends would have you believe

Basic Parts of SAS Program
DATA Step:
  Data your_choice_of_name;
    Input list of variables;   /* $ after alphameric var */   (a comment)
    Datalines;
    data: one line / obs, one column per variable
    ;
PROC Step:
  Proc GLIMMIX Data=your_choice_of_name;
    CLASS block group & trt var;
    MODEL response=block trt covar / options;
    ...
  Run;
Modify existing data set (Data __; Set __;):
  Data new_data_set_name;
    Set [old, e.g.] your_choice_of_name;
    program & data manipulation statements, e.g. LogY=Log(Y);

Example of SAS Program
DATA Step:
  data demo1;
    input classroom trt $ time count;
    sc=sqrt(count);
  datalines;
  1 std 1 12
  1 std 2 16
  1 std 4 17
  1 std 8 24
  2 exper 1 17
  2 exper 2 24
  2 exper 4 30
  2 exper 8 32
  11 std 1 16
  11 std 2 15
  11 std 4 22
  11 std 8 23
  8 exper 1 15
  8 exper 2 20
  8 exper 4 24
  8 exper 8 27
  ;
PROC Step:
  proc glimmix data=demo1;
    class classroom trt time;
    model sc=trt time trt*time / dist=normal ddfm=kr;
    random classroom(trt);
    lsmeans trt*time;
    ods output lsmeans=lsm;
  run;
Data; Set; + new PROC:
  data plot_growth;
    set lsm;
    log_time=log2(time);
  symbol i=join value=circle;
  proc gplot data=plot_growth;
    plot estimate*log_time=trt;
  run;

II. Generalized Mixed Model Theory
− Clarify Fixed vs Random effects
− Linear Models
  − LM to LMM + GLM to GLMM
− Estimation and Inference for
  − LMM
  − GLM
  − GLMM
− For GLMM:
  − what follows naturally from GLM and LMM
  − Special Issues

Fixed vs. Random Effects?
Fixed Effect?
− levels observed = population of interest (except regression)
− levels deliberately chosen
− inference: systematic relationship between $y$ and $X$
Random Effect?
− observed levels represent target population
− random sample? ideal (but seldom perfectly realized)
− makes sense to conceptualize probability distribution
Bottom Line: do observed levels of effect plausibly represent a probability distribution?
− yes → random effect
− no → fixed effect

General Structure of Model
− Nelder: observation = systematic + random
− General approach:
  − likelihood consists of two parts: observation ($y \mid u$) and random effects $u$
  − model is mathematical description of $\mu = E(y \mid u)$
− Distribution:
  − observation: $y \mid u \sim f(\mu, R)$
  − random effects: $u \sim MVN(0, G)$
− Model: $\mu = h(X, \beta, Z, u)$; $h(\cdot)$ called “inverse link”

Linear Model (LM)
− No random effects
− simple ANOVA (one error term)
− multiple regression
Assumption: $y \sim MVN(\mu, R)$
LM: model $\mu$ by $X\beta$, usually represented as
$$y = X\beta + e; \quad e \sim N(0, R)$$
alternative representation (helpful for transition to GLMM):
$$y \sim MVN(X\beta, R)$$

Generalizations of LM
− LM (Linear Model): obs ~ normal, fixed effects only
− GLM (Generalized Linear Model): obs ~ non-normal, fixed effects only
− LMM (Linear Mixed Model): obs ~ normal, random effects
− GLMM (Generalized Linear Mixed Model): obs ~ non-normal, random effects

GLM: Generalized Linear Model
− Binomial: logistic regression; probit models
− Poisson: log-linear models
Assumption: $y \sim \text{dist}(\mu, R)$
$R$ is a function of $V(\mu)$, called the “variance function” (more later)
GLM: model $\eta = g(\mu)$ by $X\beta$; $g$ is called the “link function”
alternatively, model $\mu$ by $h(X\beta)$; $h$ is the “inverse link”
Note: here $y = X\beta + e$ or $g(\mu) = X\beta + e$ makes no sense
Instead: $y \sim \text{dist}(h(X\beta), R)$

LMM: Linear Mixed Model
− Multi-error models; split-plot, multi-location
− Repeated measures a.k.a. longitudinal data
Assume: $y \mid u \sim MVN(\mu, R)$, $u \sim MVN(0, G)$
LMM: model $\mu$ by $X\beta + Zu$
Familiar notation:
$$y = X\beta + Zu + e; \quad \begin{bmatrix} u \\ e \end{bmatrix} \sim MVN\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix}\right)$$
alternatively:
$$y \mid u \sim MVN(X\beta + Zu, R); \quad u \sim MVN(0, G)$$
or (marginal model):
$$y \sim MVN(X\beta, V); \quad V = ZGZ' + R$$
More vocabulary:
“G-side” concerns V(u)
“R-side” concerns V(e)
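
As a small illustration of the marginal covariance V = ZGZ' + R, the Python sketch below builds V for a random-block model with independent residuals, without any matrix library. The block labels and variance values are hypothetical. The result has the compound-symmetry pattern within a block, which is why a G-side random block effect and an R-side compound-symmetry structure can describe the same LMM.

```python
def marginal_cov(block_of_obs, sigma2_block, sigma2_resid):
    """V = Z G Z' + R for G = sigma2_block * I and R = sigma2_resid * I.

    Z is the block indicator matrix, so (Z G Z')[i][j] equals sigma2_block
    when observations i and j share a block and 0 otherwise.
    """
    n = len(block_of_obs)
    return [
        [
            (sigma2_block if block_of_obs[i] == block_of_obs[j] else 0.0)
            + (sigma2_resid if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical layout: four observations, two per block.
V = marginal_cov([1, 1, 2, 2], sigma2_block=2.0, sigma2_resid=3.0)
within_block_corr = V[0][1] / V[0][0]   # intraclass correlation
```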

GLMM: Generalized Linear Mixed Model
Assume: $y \mid u \sim \text{dist}(\mu, R)$ as with GLM
− $R$ depends on $V(\mu)$
− $u \sim MVN(0, G)$
GLMM models $\mu = E(y \mid u)$ by $X\beta + Zu$
− link function: $\eta = g(\mu) = X\beta + Zu$
− inverse link: $\mu = h(X\beta + Zu)$
GLMM: $y \mid u \sim \text{dist}(h(X\beta + Zu), R)$
Marginal Model: $f(y) = \int f(y \mid u)\, f(u)\, du$ (more later)
Modelling will involve
• Distribution
• Link (or inv link)
• G-side
• R-side

Some Grounding Before Moving On
− “Hessian Fly” example, Gotway & Stroup (1997, JABES)
− “Hessian Fly” not so important, but design & data structure are
− 16 treatments, 4 replications: 4x4 lattice
  − 16 incomplete blocks organized into 4 complete blocks
− Response: $Y_{ij} / n_{ij}$ (damaged / obs per trt x block unit)
Field layout (entry numbers):
1 2 5 6 1 5 2 6
3 4 7 8 9 13 10 14
9 10 13 14 3 7 4 8
11 12 15 16 11 15 12 16
1 6 2 5 1 14 13 2
11 16 12 15 7 12 11 8
1 14 13 10 5 10 9 6
3 8 7 4 3 16 15 4

Linear Model (LM)
Randomized Complete Block:
$$y_{ij} = \mu + \rho_i + \tau_j + e_{ij}; \quad e_{ij} \ \text{i.i.d.}\ N(0, \sigma^2)$$
$\rho_i$ = block effect; $\tau_j$ = treatment effect
  proc glimmix;
    class block entry;
    model pct=block entry;
Incomplete Block Model (intra-block analysis): incomplete block replaces complete block in denoting $\rho_i$
  proc glimmix;
    class inc_block entry;
    model pct=inc_block entry;

Linear Mixed Model (LMM)
Randomized Complete Block, random block effects:
$$y_{ij} = \mu + r_i + \tau_j + e_{ij}; \quad r_i \ \text{i.i.d.}\ N(0, \sigma_R^2); \quad e_{ij} \ \text{i.i.d.}\ N(0, \sigma^2)$$
$r_i$ = block effect; $\tau_j$ = treatment effect
  proc glimmix;
    class block entry;
    model pct=entry;
    random block;          (G-side modeling of block effect)
Incomplete block (recovery of interblock information): replace “block” by “inc_block”

LMM: G-side / R-side
Two alternative “G-side” specifications:
  proc glimmix;
    class block entry;
    model pct=entry;
    random block;

  proc glimmix;
    class block entry;
    model pct=entry;
    random intercept / subject=block;
R-side specification:
  proc glimmix;
    class block entry;
    model pct=entry;
    random _residual_ / type=cs subject=block;
Here, it doesn’t matter (all equivalent), but for more complex models the distinctions will matter

Generalized Linear Model (GLM)
$$y_{ij} \sim \text{Binomial}(n_{ij}, \pi_{ij})$$
GLM (“Logit ANOVA” model):
$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \mu + \rho_i + \tau_j$$
  proc glimmix;
    class block entry;
    model y/n = block entry;
or replace “block” by “inc_block” for intra-block logit ANOVA
More on GLIMMIX syntax later
Here, note Y/N causes default to Binomial distribution & Logit link (same as GENMOD)

Generalized Linear Mixed Model (GLMM)
$$y_{ij} \mid \text{block effects} \sim \text{Binomial}(n_{ij}, \pi_{ij}); \quad \text{block effects } r_i \ \text{i.i.d.}\ N(0, \sigma_R^2)$$
GLMM (“Logit ANOVA” mixed model):
$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \mu + r_i + \tau_j$$
  proc glimmix;
    class block entry;
    model y/n = entry;
    random intercept / subject=block;

  proc glimmix;
    class block entry;
    model y/n = entry;
    random block;
Marginal model (not equivalent):
  proc glimmix;
    class block entry;
    model y/n = entry;
    random _residual_ / type=cs subject=block;

II. Inference in LM, GLM, LMM, and GLMM
Inference for fixed effects is based on estimable functions
In LM theory, $K'\beta$ is estimable if it can be expressed as $A'E(y)$, i.e. $K' = A'X$
OLS: $\hat\beta = (X'X)^{-}X'y$
Theorem: $K'\beta$ is estimable iff $K' = K'(X'X)^{-}(X'X)$
Main advantage: $K'\hat\beta$ is invariant to choice of $(X'WX)^{-}$
i.e. when $X$ is not full rank, $\hat\beta$ has no intrinsic interpretation; $K'\hat\beta$ does (e.g. treatment difference, marginal (least squares) mean)

II. Examples of Estimable Functions
e.g. one-way model: $y_{ij} = \mu + \tau_i + e_{ij}$; $i = 1, 2, 3, 4$; $j = 1, \ldots, n$
Estimable functions include
− Trt marginal (“Least Squares”) mean (LSMean): $\mu + \tau_i$, e.g. $k' = [1\ 1\ 0\ 0\ 0]$ for $i = 1$
− Trt differences: $\tau_1 - \tau_2$, e.g. $k' = [0\ 1\ {-1}\ 0\ 0]$
− SS(trt): $K$ such that $K'\beta = 0$ means all $\tau_i$ equal, e.g.
$$K' = \begin{bmatrix} 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \end{bmatrix}$$

II. Common Inference Results for GLM
$$K'\hat\beta \sim \text{approx. } MVN\left(K'\beta,\ K'(X'WX)^{-}K\right) \quad \text{(exact for LM)}$$
Wald statistic; purpose: test $H_0: K'\beta = 0$
$$\text{Wald} = (K'\hat\beta)'\,[K'(X'WX)^{-}K]^{-1}\,(K'\hat\beta) \sim \text{approx. } \chi^2_{\text{rank}(K)}$$
Note: in OLS, $\text{Wald} = SS(H_0) / \sigma^2$

II. GLM: Inference with Unknown Scale Parameter
Recall, in OLS: $\text{Wald} = SS(H_0) / \sigma^2$
But what if $\sigma^2$ unknown?
Think ANOVA: use $SS(H_0) / \hat\sigma^2 = SS(H_0) / MSE$
Thus,
$$\frac{\text{Wald}}{\text{rank}(K)} = \frac{SS(H_0) / dfh}{MSE} \sim F_{(dfh,\, dfe)}$$
Generalization: in GLM, scale parameter $\hat\phi = \text{Pearson } \chi^2 / dfe$ or $\text{Deviance} / dfe$

II. Extension of GLM Scale Parameter: Quasi-Likelihood
Overdispersion
− Counts → Poisson → $E(y) = \mathrm{Var}(y)$
− but in practice $E(y) < \mathrm{Var}(y)$
− Quasi-likelihood: you specify $E(y)$ and $\mathrm{Var}(y)$
“Working Correlation”
− Repeated measures
− Assumed distribution → $\mathrm{Var}(y) = \text{diag}[V(\mu)]$
− But in reality, errors are correlated, so model variance as
$$\mathrm{Var}(y) = R^{1/2} A R^{1/2}, \quad \text{where } R = \text{diag}[V(\mu)]$$
− $A$ is working correlation; structure analogous to true R-side in LMM

II. GLM: Deviance and Likelihood Ratio Test
Full model: $X\beta$, i.e. $\mu = h(X\beta)$
Decompose $X\beta$ as $X_1\beta_1 + X_2\beta_2$
Suppose we want to test $H_0: \beta_2 = 0$
1. Fit full model: $\text{Dev}(X\beta) = -2\log[\ell(X\hat\beta) / \ell(y)]$
2. Fit reduced model: $\text{Dev}(X_1\beta_1) = -2\log[\ell(X_1\hat\beta_1) / \ell(y)]$
3. LR statistic $= \text{Dev}(X_1\beta_1) - \text{Dev}(X\beta)$
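
II. LMM: The “Mixed Model Equations”
$$\ell(y) = -\tfrac{1}{2}\left[(y - X\beta - Zu)'R^{-1}(y - X\beta - Zu) + u'G^{-1}u\right] + \text{constant}$$
$$\frac{\partial \ell(y)}{\partial \beta} = X'R^{-1}(y - X\beta - Zu) \quad \text{and} \quad \frac{\partial \ell(y)}{\partial u} = Z'R^{-1}(y - X\beta - Zu) - G^{-1}u$$
setting both to zero and solving yields the Mixed Model Solution:
$$\begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix} \begin{bmatrix} \hat\beta \\ \hat u \end{bmatrix} = \begin{bmatrix} X'R^{-1}y \\ Z'R^{-1}y \end{bmatrix}$$
note (Marginal Model Solution):
$$\hat u = GZ'V^{-1}(y - X\hat\beta) \quad \text{and} \quad \hat\beta = (X'V^{-1}X)^{-}X'V^{-1}y$$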
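
II. LMM Inference: G and R known
Inference based on predictable functions $K'\beta + M'u$
− “predictable” if $K'\beta$ is estimable
− (reduces to estimable function if focus on fixed effects only)
1. $$\mathrm{Var}[K'\hat\beta + M'(\hat u - u)] = \begin{bmatrix} K' & M' \end{bmatrix} C \begin{bmatrix} K \\ M \end{bmatrix}, \quad \text{where } C = \begin{bmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{bmatrix}^{-}$$
2. Let $L' = [K'\ M']$ and $\theta = [\beta'\ u']'$. The Wald statistic for tests on $L'\theta$ is
$$(L'\hat\theta)'\,[L'CL]^{-1}\,(L'\hat\theta) \sim \chi^2_{\text{rank}(L)}$$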

II. LMM Inference: G and R unknown
1. Replace $G$ and $R$ by $\hat G$ and $\hat R$: estimate variance and covariance components
2. Denote $C$ as $\hat C$ with estimated var/cov components
3. “Naive” $\mathrm{Var}[L'(\hat\theta - \theta)] = L'\hat C L$, but $E(L'\hat C L) \neq L'CL$ → Kenward-Roger adjustment
4. Approximate $F$:
$$\frac{\text{Wald}}{\text{rank}(L)} = \frac{(L'\hat\theta)'\,[L'\hat C L]^{-1}\,(L'\hat\theta)}{\text{rank}(L)} \sim \text{approx. } F_{(\text{rank}(L),\ \nu)}$$
$L'\hat C L$ may be biased; $\nu$ often must be approximated
(here $\theta = [\beta'\ u']'$ and $L' = [K'\ M']$)

II. LMM: Variance Component Estimation
Several methods
1. For variance-component-only models: use EMS from ANOVA
2. Maximum likelihood
   − problem: biased
3. Restricted maximum likelihood
4. Several computational approaches
   a. Newton-Raphson
   b. Fisher scoring
   c. EM
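
The bias of maximum likelihood is easiest to see in the simplest possible case: a single normal sample with only a mean to estimate. There the ML variance estimate divides the sum of squares by n, while REML divides by n - 1 and reproduces the usual unbiased sample variance. The Python sketch below shows this with four illustrative numbers; it is a toy case and not the mixed-model REML algorithm itself.

```python
import statistics

# Four illustrative observations (treated here as one i.i.d. normal sample).
y = [12.0, 16.0, 17.0, 24.0]
n = len(y)

ml_estimate = statistics.pvariance(y)    # SS / n      (maximum likelihood)
reml_estimate = statistics.variance(y)   # SS / (n - 1) (REML for this model)

# ML shrinks the estimate by the factor (n - p) / n, with p = 1 fixed effect.
shrink_factor = ml_estimate / reml_estimate
```

The shrinkage grows with the number of fixed-effect parameters relative to n, so the downward bias of ML is most noticeable in small designs with many treatment parameters.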

What’s Wrong with ML?
An example to illustrate
− SAS for Mixed Models, Data Set 1.5.1
− Incomplete block design from Cochran & Cox, Experimental Designs, p 456
− 15 treatments
− 15 blocks
− 4 treatments observed per block

C&C Example: ML and two alternatives
Intrablock (fixed block) analysis (equivalent to PROC GLM):
  proc glimmix data=cc456;
    class trt bloc;
    model y=trt bloc;
Inter/intra-block (random block) analysis, default (PROC MIXED default gives same result):
  proc glimmix data=cc456;
    class trt bloc;
    model y=trt;
    random bloc;
Inter/intra-block (random block) analysis, ML (same as PROC MIXED METHOD=ML):
  proc glimmix data=cc456 method=mspl;
    class trt bloc;
    model y=trt;
    random bloc;

ML vs Alternative Results: Which is Right?
Intrablock (fixed block): $\hat\sigma^2 = 8.62$
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       31       1.23      0.3012
Intra/inter-block (random block), default: $\hat\sigma_R^2 = 4.65$, $\hat\sigma^2 = 8.56$
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       36.2     1.48      0.1676
Intra/inter-block (random block), ML: $\hat\sigma_R^2 = 4.50$, $\hat\sigma^2 = 6.04$
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      14       49.04    2.02      0.0352

Simulation: ML or REML
− 1000 simulated data sets using C & C, p 456 design
− $\sigma_B^2 / \sigma^2 = 0.5$
− Recorded type I error rate for $F_{trt}$
  − intrablock
  − REML random block
  − ML random block
Variable      N      Mean
fixd_rej05    1000   0.0590000
REML_rej05    1000   0.0610000
ML_rej05      1000   0.2140000
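
One way to read the simulated rejection rates is against the Monte Carlo standard error of a proportion estimated from 1000 data sets. The Python sketch below does that arithmetic for a nominal 0.05 level; the z-score framing is an illustrative choice of this sketch, not part of the original simulation.

```python
import math

n_sims = 1000
nominal = 0.05

# Standard error of an estimated rejection rate if the true rate is 0.05.
mc_se = math.sqrt(nominal * (1.0 - nominal) / n_sims)

# Rejection rates reported on the slide.
rates = {"fixed block": 0.059, "REML": 0.061, "ML": 0.214}

# Distance of each observed rate from nominal, in Monte Carlo standard errors.
z_scores = {name: (rate - nominal) / mc_se for name, rate in rates.items()}
```

By this yardstick the fixed-block and REML rates are within about two standard errors of 0.05, while the ML rate is far outside anything simulation noise could explain.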

II. LMM with estimated G and R: Bias in std error and test statistics
Kenward & Roger (Biometrics, 1997)
Consider estimable function $K'\beta$
When $V$ unknown, estimates used to obtain $\hat V$
“naive” estimate: $\mathrm{Var}(K'\hat\beta) = K'(X'\hat V^{-1}X)^{-}K$
Using Taylor series expansion, can show
$$E[K'(X'\hat V^{-1}X)^{-}K] \approx K'(X'V^{-1}X)^{-}K + \frac{1}{2}\sum_{i,j} \mathrm{cov}(\hat\sigma_i, \hat\sigma_j)\, \frac{\partial^2 [K'(X'V^{-1}X)^{-}K]}{\partial\sigma_i\, \partial\sigma_j}$$

II. LMM: Degrees of Freedom
Simple case
model: $y_{ijk} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ijk}$
$b_j \sim N(0, \sigma_B^2)$; $(ab)_{ij} \sim N(0, \sigma_{AB}^2)$; $e_{ijk} \sim N(0, \sigma^2)$
ANOVA
Source   EMS
A        $\sigma^2 + n\sigma_{AB}^2 + Q_A$
B        $\sigma^2 + n\sigma_{AB}^2 + na\sigma_B^2$
AB       $\sigma^2 + n\sigma_{AB}^2$
error    $\sigma^2$
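
II. Degrees of Freedom (2)
Trt diff: $\alpha_1 - \alpha_2$
$$\mathrm{Var}(\hat\alpha_1 - \hat\alpha_2) = \frac{2(\sigma^2 + n\sigma_{AB}^2)}{nb}, \quad \text{estimated by } \frac{2\,MS(AB)}{nb}$$
denominator d.f. = df(AB)
Trt mean: $\mu + \alpha_i$
$$\mathrm{Var}(\hat\mu + \hat\alpha_i) = \frac{\sigma^2 + n\sigma_{AB}^2 + n\sigma_B^2}{nb}, \quad \text{estimated by } \frac{1}{nb}\left[\frac{a - 1}{a}\,MS(AB) + \frac{1}{a}\,MS(B)\right]$$
(the weights follow from the EMS: $n\sigma_B^2 = [E\,MS(B) - E\,MS(AB)] / a$)
denominator d.f. approximated via Satterthwaite's procedure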
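
II. Satterthwaite Approximation
for linear combination of MS: $MS^* = \sum_i c_i MS_i$
approximate d.f. for $MS^*$ is
$$\nu \approx \frac{\left(\sum_i c_i MS_i\right)^2}{\sum_i (c_i MS_i)^2 / df_i}$$
e.g. for $MS^* = \frac{a - 1}{a} MS(AB) + \frac{1}{a} MS(B)$:
$$\nu \approx \frac{\left[\frac{a - 1}{a} MS(AB) + \frac{1}{a} MS(B)\right]^2}{\dfrac{\left[\frac{a - 1}{a} MS(AB)\right]^2}{df(AB)} + \dfrac{\left[\frac{1}{a} MS(B)\right]^2}{df(B)}}$$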
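
The Satterthwaite formula for a linear combination of mean squares is short enough to sketch directly in Python. The function name and the mean squares below are hypothetical (a = 3 treatments, b = 6 blocks, so df(AB) = 10 and df(B) = 5); only the formula itself comes from the slide.

```python
def satterthwaite_df(coefs, mean_squares, dfs):
    """Approximate d.f. of sum(c_i * MS_i):
    (sum c_i MS_i)^2 / sum((c_i MS_i)^2 / df_i)."""
    terms = [c * ms for c, ms in zip(coefs, mean_squares)]
    return sum(terms) ** 2 / sum(t ** 2 / df for t, df in zip(terms, dfs))


# Hypothetical design and mean squares (illustration only).
a, b = 3, 6
ms_ab, ms_b = 10.0, 40.0
df_ab, df_b = (a - 1) * (b - 1), b - 1

# Weights for the treatment-mean variance: ((a-1)/a) MS(AB) + (1/a) MS(B).
nu = satterthwaite_df([(a - 1) / a, 1 / a], [ms_ab, ms_b], [df_ab, df_b])

# Sanity check: a single mean square keeps its own d.f.
nu_single = satterthwaite_df([1.0], [7.0], [12])
```

With these numbers the approximate d.f. works out to about 10, sitting between min(df) and the sum of the d.f., as it always must.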
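
II. Satterthwaite Approximation in LMM
Approximation:
$$\nu \approx \frac{2\{E[K'(X'V^{-1}X)^{-}K]\}^2}{\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]} \quad \text{or} \quad \frac{2[K'(X'\hat V^{-1}X)^{-}K]^2}{\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]}$$
For vector $K$ (e.g. treatment contrast):
Approximate $\mathrm{Var}[K'(X'\hat V^{-1}X)^{-}K]$ by $g'Ag$, where
$$g = \frac{\partial [K'(X'V^{-1}X)^{-}K]}{\partial \sigma}, \quad \sigma = \text{vector of (co)variance components}$$
$$A = 2\left\{\text{trace}\left(P\,\frac{\partial V}{\partial\sigma_i}\,P\,\frac{\partial V}{\partial\sigma_j}\right)\right\}^{-1}$$
$$V = ZGZ' + R; \quad P = V^{-1} - V^{-1}XCX'V^{-1}, \quad C = (X'V^{-1}X)^{-}$$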

II. GLMM Estimation
GLMM is model of $E(y \mid u)$
− Link form: $g[E(y \mid u)] = X\beta + Zu$
− Inverse link form: $E(y \mid u) = h(X\beta + Zu)$
More general expression of distribution of $y \mid u$:
$$\mathrm{Var}(y \mid u) = R = R_\mu^{1/2} A R_\mu^{1/2}, \quad R_\mu^{1/2} = \text{diag}\left[\sqrt{V(\mu)}\right]$$
$A$ is the “working correlation matrix”
Estimation: as with LMM, may choose to focus on
1. $\beta$ only → GLS equations in LMM; Generalized Estimating Equations with GLMM
2. $\beta$ and $u$ → several approaches

II. Working Correlation
Recall Gotway & Stroup (1997) Hessian Fly example
1 2 5 6 1 5 2 6
3 4 7 8 9 13 10 14
9 10 13 14 3 7 4 8
11 12 15 16 11 15 12 16
1 6 2 5 1 14 13 2
11 16 12 15 7 12 11 8
1 14 13 10 5 10 9 6
3 8 7 4 3 16 15 4
Gotway and Stroup considered spatial variation among experimental units
  proc glimmix;
    class block entry;
    model y/n=entry;
    random intercept / subject=block;
    random _residual_ / type=sp(sph)(row col) subject=block;
− MODEL sets up binomial GLM, logit link
− RANDOM _RESIDUAL_ sets up a working correlation based on SPHERICAL semivariogram

II. Marginal (PA) vs Subject-Specific Inference
Marginal mean: $E(y)$ (Population Averaged, PA)
Conditional mean: $E(y \mid u)$ (SS, the true GLMM)
Note: $E(y) = E[E(y \mid u)] = E[h(X\beta + Zu)]$
In general, this cannot be further simplified
Example: log link, $u \sim$ normal
$$E(y \mid u) = \exp(X\beta + Zu)$$
$$E(y) = E[\exp(X\beta + Zu)] = \exp(X\beta)\,M_u(Z)$$
$M_u(Z)$ is the moment generating function of $u$ evaluated at $Z$
e.g. for a random intercept ($Z = 1$, $u \sim N(0, \sigma_u^2)$):
$$E(y) = \exp(X\beta)\exp\left(\frac{\sigma_u^2}{2}\right) \;\Rightarrow\; \log E(y) = X\beta + \frac{\sigma_u^2}{2}$$
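
The log-link result can be checked numerically. The Python sketch below averages the conditional mean exp(eta + u) over a normal random intercept using a plain trapezoid rule and compares it with the closed form exp(eta + sigma_u^2 / 2). The linear predictor value, the random-effect standard deviation and the grid settings are all hypothetical choices for illustration.

```python
import math


def marginal_mean_log_link(eta, sigma_u, n_grid=20001, width=10.0):
    """E[exp(eta + u)] for u ~ N(0, sigma_u^2), by the trapezoid rule
    on the interval [-width * sigma_u, +width * sigma_u]."""
    lo = -width * sigma_u
    step = 2.0 * width * sigma_u / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        u = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * (u / sigma_u) ** 2) / (sigma_u * math.sqrt(2.0 * math.pi))
        total += weight * math.exp(eta + u) * density * step
    return total


eta, sigma_u = 1.2, 0.8                              # hypothetical values
numeric_pa = marginal_mean_log_link(eta, sigma_u)    # population-averaged mean
closed_form_pa = math.exp(eta + sigma_u ** 2 / 2.0)  # exp(eta) * mgf of u at 1
conditional_at_zero = math.exp(eta)                  # subject-specific mean at u = 0
```

The population-averaged mean exceeds the mean of a "typical" subject (u = 0) by the factor exp(sigma_u^2 / 2), so the two kinds of inference answer different questions even though they share the same fixed effects.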
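
II. More on PA (marginal) vs. SS
Probit-normal model:
$$\Pr(y = 1 \mid u) = \Phi(X\beta + Zu); \quad u \sim N(0, G)$$
can show
$$E(y) = \Phi\left(\frac{X\beta}{\sqrt{1 + Z'GZ}}\right)$$
In LMM, the models
$$y = X\beta + Zu + e; \quad u \sim N(0, I\sigma_u^2); \quad e \sim N(0, I\sigma_e^2)$$
and
$$y = X\beta + e; \quad e \sim N(0, R); \quad R = \sigma^2 \begin{bmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{bmatrix}$$
are equivalent. However, in GLMM they are not: they yield different estimates, std. errors, etc.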
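
The attenuation in the probit-normal model can also be verified numerically for the scalar case (one random intercept with variance g, Z = 1). This Python sketch integrates the conditional probability over the random effect with a trapezoid rule and compares it with the closed form. The values of the linear predictor and of g are hypothetical.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def marginal_prob(xb, g, n_grid=20001, width=10.0):
    """E[Phi(xb + u)] for u ~ N(0, g), by the trapezoid rule."""
    sd = math.sqrt(g)
    lo = -width * sd
    step = 2.0 * width * sd / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        u = lo + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * (u / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * std_normal_cdf(xb + u) * density * step
    return total


xb, g = 1.0, 2.0                                       # hypothetical values
numeric_pa = marginal_prob(xb, g)                      # population-averaged probability
closed_form_pa = std_normal_cdf(xb / math.sqrt(1.0 + g))
subject_specific = std_normal_cdf(xb)                  # probability at u = 0
```

For a positive linear predictor the population-averaged probability is pulled toward 0.5 relative to the subject-specific one, which is the sense in which marginal coefficients are attenuated.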

II. Estimation of GLMM
− model $E(y \mid u)$
  − inverse link: $E(y \mid u) = h(X\beta + Zu)$
  − link: $g[E(y \mid u)] = \eta = X\beta + Zu$
− to estimate $\beta$ and $u$, need to evaluate $f(y)$, $f(y \mid u)$
  − approximate, e.g. by Taylor series expansion: Penalized Quasi-Likelihood (SAS %GLIMMIX); SAS PROC GLIMMIX (next slides)
  − numerically integrate joint density: Gauss-Hermite quadrature (PROC NLMIXED)
  − stochastically evaluate integral: Markov chain Monte Carlo (WinBUGS, not in this course)

II. Computational Method Comparison
GEE
− Computationally easy
− Meaning of marginal results in GLM?
Linearized GLMM (current PROC GLIMMIX)
− uses familiar LMM analogs (but many are ad hoc & need further research)
− allows considerable R-side flexibility
− adequate for many GLMM; breaks down for certain cases (binary data)
Integral Approximation (PROC NLMIXED)
− better approximation than Linearized GLMM
− BUT: ML only, simple G-side models only, no R-side
Laplace
− computationally less demanding than integral approximation but often “accurate enough”; same limitations as integral approximations
MCMC
− simple models only; limited & temperamental software
− but in extreme cases, only way to get accurate results
Modeling Considerations
III. Modeling Considerations
Overdispersion
Marginal (PA) vs Conditional (SS) models
“Data” vs “Model” Scale

III. Model Considerations
− Variance Model & Overdispersion
− Choice of Link Function
− Choice of Distribution
− Choice of Model Effects
− Correlated Errors?
Any of the above could show up as “overdispersion”

III. GLMM: Model Considerations
Common dilemma
− Design, e.g. like “Hessian fly” example
− BINOMIAL data
− Recover interblock information: BLOCK random
1 2 5 6 1 5 2 6
3 4 7 8 9 13 10 14
9 10 13 14 3 7 4 8
11 12 15 16 11 15 12 16
1 6 2 5 1 14 13 2
11 16 12 15 7 12 11 8
1 14 13 10 5 10 9 6
3 8 7 4 3 16 15 4
Model (Logit GLMM):
$$\pi_{ij} = \frac{\exp(\mu + r_i + \tau_j)}{1 + \exp(\mu + r_i + \tau_j)} \quad \text{or equivalently} \quad \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \mu + r_i + \tau_j$$
Analysis reveals that the data are overdispersed
14 May 2007 SSP Core Facility 87
Department of Statistics
III. Hessian Fly Example
proc glimmix data=HessianFly;
  class block entry;
  model y/n = entry;
  random block;

Evidence of overdispersion when Gener. Chi-Square / DF >> 1

Fit Statistics
-2 Res Log Pseudo-Likelihood   182.21
Generalized Chi-Square         107.96
Gener. Chi-Square / DF           2.25
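As a quick arithmetic check on these fit statistics, and assuming the data set has 64 observations (16 entries in each of 4 blocks, which is inferred from the layout and the denominator degrees of freedom rather than stated on the slide), the degrees of freedom would be 64 − 16 = 48, and the ratio follows directly:

```latex
\frac{\text{Generalized Chi-Square}}{\text{DF}} = \frac{107.96}{48} \approx 2.25
```

A value near 1 would be consistent with the assumed binomial variance; 2.25 suggests roughly twice as much variation as the model allows.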
III. Overdispersion
Observed variance > variance under presumed model
Symptom: Deviance/DFE or chi-square/DFE >> 1
Uniquely a GLM / GLMM issue
−not a consideration with LM, LMM
−y|u ~ normal implies variance not a function of mean
When is there an issue?
− if Var(y) = f[E(y)], and
− using scale adjustment requires unrealistic assumptions
III. Common fix for Overdispersion
Multiply variance by scale parameter $\phi$. Here: $\mathrm{Var}(y_{ij} \mid r_i) = \phi\, n_{ij}\, \pi_{ij}(1 - \pi_{ij})$

proc glimmix data=HessianFly;
  class block entry;
  model y/n= entry;
  random block;
  random _residual_;

Issue: not a true likelihood

Covariance Parameter Estimates (with scale parameter; the Residual (VC) row estimates $\phi$)
Cov Parm        Subject   Estimate   Standard Error
Intercept       block     0          .
Residual (VC)             2.2668     0.4627

vs. Covariance Parameter Estimates w/o $\hat{\phi}$
Cov Parm        Subject   Estimate   Standard Error
Intercept       block     0.01116    0.03116
Impact of Scale Parameter on Inference
Type III Tests of Fixed Effects (with scale parameter adjustment)
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

Type III Tests of Fixed Effects (no scale parameter)
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       6.90      <.0001

failure to account for overdispersion tends to increase type I error rate
but is this the best way to address the problem?
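A rough check of how the scale parameter acts on the test, assuming the only material change between the two fits is that the variance is multiplied by the estimated scale parameter $\hat{\phi} = 2.2668$ (the block variance estimate also shifts slightly between fits, so the agreement is only approximate):

```latex
F_{\text{with scale}} \approx \frac{F_{\text{no scale}}}{\hat{\phi}} = \frac{6.90}{2.2668} \approx 3.04
\qquad (\text{reported: } 3.03)
```

So the larger F value (6.90) belongs to the fit without the scale parameter, and the adjustment deflates it by about the overdispersion factor.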
III. Mean – Variance Overdispersion Models
$\mathrm{Var}(y) = f(\mu, \phi)$
No scale parameter: $\phi \equiv 1$
− binomial, poisson
Nonlinear scale parameter, e.g. $\mu + \phi\mu^2$ or $\mu(1-\mu)/(1+\phi)$
− negative binomial, gen. poisson, beta
Linear scale parameter: $\phi\, v(\mu)$
− gamma, inverse gaussian
No mean parameter: $\sigma^2$
− normal
III. Marginal or Conditional Formulation
For many models (notably LMM) there are equivalent forms
− conditional (mixed, SS) model
− marginal (PA) model
− lead to the same marginal log-likelihood
Distinction results from
− G-side model; random model effects
− R-side model; marginal model
III. Example: variance component (G-side) vs. Compound symmetry (R-side)
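$$y_{ij} = \mu + r_i + \tau_j + e_{ij}, \qquad r_i \ \text{i.i.d.}\ N(0, \sigma_R^2), \quad e_{ij} \ \text{i.i.d.}\ N(0, \sigma^2)$$
$$\mathrm{Var}(\mathbf{Y}_i) = \sigma_R^2 \mathbf{J} + \sigma^2 \mathbf{I} =
\begin{bmatrix}
\sigma_R^2 + \sigma^2 & \sigma_R^2 & \cdots & \sigma_R^2 \\
\sigma_R^2 & \sigma_R^2 + \sigma^2 & \cdots & \sigma_R^2 \\
\vdots & & \ddots & \vdots \\
\sigma_R^2 & \sigma_R^2 & \cdots & \sigma_R^2 + \sigma^2
\end{bmatrix}$$
where $\mathbf{Y}_i$ is the vector of observations in block $i$.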
III. Compound Symmetry Equivalent
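Let $\sigma_C^2 = \sigma_R^2 + \sigma^2$ and $\rho = \dfrac{\sigma_R^2}{\sigma_R^2 + \sigma^2}$

Model: $y_{ij} = \mu + \tau_j + E_{ij}$, with
$$\mathrm{Var}(E_{ij}) = \sigma_C^2, \qquad
\mathrm{Corr}(E_{ij}, E_{kl}) =
\begin{cases}
\rho & \text{if } i = k \text{ (same block)}, \\
0 & \text{otherwise}
\end{cases}$$
$$\mathrm{Var}(\mathbf{Y}_i) = \sigma_C^2
\begin{bmatrix}
1 & \rho & \cdots & \rho \\
\rho & 1 & \cdots & \rho \\
\vdots & & \ddots & \vdots \\
\rho & \rho & \cdots & 1
\end{bmatrix}$$
Models equivalent if $\rho \ge 0$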
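A short derivation, sketched here under the usual assumption that the block effects $r_i$ (variance $\sigma_R^2$) and residuals $e_{ij}$ (variance $\sigma^2$) are mutually independent, shows where the equivalence condition comes from. For two different entries $j \ne l$ in the same block $i$ of the variance component model, and writing $b$ for the block size (a symbol introduced only for this sketch):

```latex
\begin{aligned}
\mathrm{Cov}(y_{ij}, y_{il}) &= \mathrm{Cov}(r_i + e_{ij},\; r_i + e_{il}) = \mathrm{Var}(r_i) = \sigma_R^2 \ge 0, \\
\mathrm{Var}(y_{ij}) &= \sigma_R^2 + \sigma^2, \\
\mathrm{Corr}(y_{ij}, y_{il}) &= \frac{\sigma_R^2}{\sigma_R^2 + \sigma^2} \in [0, 1), \\
\text{whereas compound symmetry only needs } & -\frac{1}{b-1} < \rho < 1 .
\end{aligned}
```

The compound symmetry form can therefore represent a negative within-block correlation, which no variance component model can reproduce.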
III. G-side / R-side
G-side (same model, written two ways)
proc glimmix;
  class block entry;
  model y/n=entry;
  random block;

proc glimmix;
  class block entry;
  model y/n=entry;
  random intercept / subject=block;

R-side model
proc glimmix;
  class block entry;
  model y/n=entry;
  random _residual_ / type=CS subject=block;

proc mixed;
  class block entry;
  model y=entry;
  repeated / type=CS subject=block;
III. Variance Component vs CS in GLMM
Variance component model is GLMM
CS model is GEE
They are not equivalent

Conditional model: logit
$$\eta_{ij} = \eta + r_i + \tau_j, \qquad
\pi_{ij} \mid u_i = \frac{\exp(\eta + r_i + \tau_j)}{1 + \exp(\eta + r_i + \tau_j)}, \qquad
y_{ij} \mid u_i \sim \text{Binomial}$$
(here the random effect $u_i$ is the block effect $r_i$)
marginal distribution is
$$p(y_{ij}) = \int p(y_{ij} \mid u_i)\, p(u_i)\, du_i$$

Marginal model: logit
$$\eta_{ij} = \eta + \tau_j$$
with working correlation matrix defined by CS form
$y_{ij}$ is NOT Binomial, merely borrow Binomial-like quasi-likelihood form
Does such a distribution actually exist?
III. Conditional vs. Marginal Results
Marginal (R-side, CS)
Fit Statistics
Gener. Chi-Square / DF   2.30
Covariance Parameter Estimates
Cov Parm   Subject   Estimate
CS         block     -0.03247
Residual             2.2992
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       2.99      0.0023

Conditional (G-side, with scale parameter)
Fit Statistics
Gener. Chi-Square / DF   2.27
Covariance Parameter Estimates
Cov Parm        Subject   Estimate
Intercept       block     0
Residual (VC)             2.2668
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       45       3.03      0.0020

which is right?
• fit statistic?
• can you simulate data using mechanism implied by model?
III. Marginal or Conditional?
How to choose?
− Conditional: G-side; Marginal: R-side
− Fit statistic? (may help; may deceive)
General recommendation
− G-side formulation preferred for non-normal data
− G-side effects operate inside the link function & hence always lead to valid conditional & marginal distributions
− R-side effects operate outside the link function
− for non-normal data, models implied by R-side effects may be vacuous
III. Impact of Model Effects
Back to Hessian Fly Data
Incomplete Block Design
Try more appropriate model

proc glimmix;
  class inc_block entry;
  model y/n=entry;
  random intercept / subject=inc_block;

Fit Statistics
Gener. Chi-Square / DF   1.41
Covariance Parameter Estimates
Cov Parm    Subject     Estimate
Intercept   inc_block   0.4971
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
entry    15       33       6.33      <.0001
III. Inference
After model fit & estimation, inference begins
Also want at least some of following comparisons among groups (trt, entry...)
− test hypotheses
− obtain confidence intervals
− obtain predictions
− further model checking
III. Scale issue for GLM, GLMM
For GLM, GLMM there are two “natural scales”
− linear (or model) scale (e.g. logit)
− data scale
May be other scales, depending on context
− odds
− odds ratio
III. Choosing the Scale
Example: Hessian Fly – binomial dist, logit link
Data: measured as 0/1; per e.u. as Y/N
Main focus: entry effect on P{indiv resp = 1}
Link: $\eta_{ij} = \log\left(\dfrac{\pi_{ij}}{1 - \pi_{ij}}\right) = \eta + r_i + \tau_j$
Inverse Link: $\hat{\pi}_{ij} = \dfrac{\exp(\hat{\eta}_{ij})}{1 + \exp(\hat{\eta}_{ij})}$
III. Scale and Inference
Main tool of inference: estimable functions
− e.g. entry "LS Mean" $\hat{\eta} + \hat{\tau}_j$ (can denote: $\hat{\eta}_j$)
− entry difference $\hat{\tau}_j - \hat{\tau}_{j'}$
These are estimated on the "linear" or "model" scale
Main focus of inference: on data scale
− e.g. $\hat{\pi}_j = \hat{P}\{\text{resp} = 1 \mid \text{entry } j\}$
− entry difference between probabilities $\hat{\pi}_j - \hat{\pi}_{j'}$
Require "inverse linking": $\hat{\pi}_j = \dfrac{\exp(\hat{\eta}_j)}{1 + \exp(\hat{\eta}_j)}$
III. Inverse Linking
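Estimation occurs on model scale
But reporting typically must occur on data scale
Estimate: $\hat{\eta} = \mathbf{k}'\hat{\boldsymbol{\beta}}$
Std error: $\mathrm{s.e.}(\hat{\eta}) = \sqrt{\mathbf{k}'\,\mathrm{Var}(\hat{\boldsymbol{\beta}})\,\mathbf{k}}$
Confidence interval: $\hat{\eta} \pm z_{\alpha/2}\,\mathrm{s.e.}(\hat{\eta})$
Inverse linked estimate: $h(\hat{\eta})$, e.g. $\dfrac{\exp(\hat{\eta})}{1 + \exp(\hat{\eta})}$
Inverse linked std error (“delta” rule): $\mathrm{s.e.}[h(\hat{\eta})] \approx \left|\dfrac{\partial h(\hat{\eta})}{\partial \eta}\right| \mathrm{s.e.}(\hat{\eta})$
Inverse linked confidence interval: $(h(\text{LowerB}),\ h(\text{UpperB}))$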
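For the logit link the “delta” rule has a simple closed form. The sketch below writes $h(\eta)$ for the inverse link and $\hat{\pi} = h(\hat{\eta})$ for the inverse linked estimate; it is a first-order approximation, not an exact result:

```latex
h(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}, \qquad
\frac{\partial h}{\partial \eta} = \frac{e^{\eta}}{(1 + e^{\eta})^{2}} = h(\eta)\,[1 - h(\eta)]
\quad\Longrightarrow\quad
\mathrm{s.e.}(\hat{\pi}) \approx \hat{\pi}\,(1 - \hat{\pi})\;\mathrm{s.e.}(\hat{\eta})
```

Because $h$ is monotone increasing, applying it to the model-scale limits gives a valid interval on the data scale, although that interval is generally not symmetric about $\hat{\pi}$.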
III. Model & Data Scale – Hessian Fly Example
Solutions for Fixed Effects
Effect      entry   Estimate   Standard Error   DF   t Value   Pr > |t|
Intercept           -1.9057    0.4886           15   -3.90     0.0014
entry       1        3.8001    0.6327           33    6.01     <.0001
entry       2        3.4821    0.6186           33    5.63     <.0001

Estimates (Estimate through Upper: linear or model scale; Mean through Upper Mean: data scale)
Label            Estimate   Standard Error   Lower     Upper    Mean     Standard Error Mean   Lower Mean   Upper Mean
entry 1          1.8944     0.4608            0.9568   2.8319   0.8693   0.05237               0.7225       0.9444
entry 2          1.5765     0.4321            0.6974   2.4555   0.8287   0.06133               0.6676       0.9210
diff entry 1-2   0.3179     0.5793           -0.8607   1.4965   0.5788   0.1412                0.2972       0.8171

which of these make NO sense?
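The data-scale columns of the Estimates table can be reproduced by hand from the model-scale columns. The DATA step below is a minimal sketch, not part of the original course code: the data set name `ilink_check` and all variable names are made up for illustration, and the input values are copied from the table.

```sas
/* Reproduce the data-scale columns from the model-scale columns. */
data ilink_check;
  input label :$10. eta se_eta lower upper;
  mean       = exp(eta) / (1 + exp(eta));       /* inverse logit of the estimate   */
  se_mean    = mean * (1 - mean) * se_eta;      /* "delta" rule for the logit link */
  lower_mean = exp(lower) / (1 + exp(lower));   /* inverse logit of the limits     */
  upper_mean = exp(upper) / (1 + exp(upper));
  datalines;
entry1    1.8944 0.4608  0.9568 2.8319
entry2    1.5765 0.4321  0.6974 2.4555
diff_1_2  0.3179 0.5793 -0.8607 1.4965
;
run;

proc print data=ilink_check noobs;
run;
```

Working the arithmetic through gives means of about 0.8693, 0.8287 and 0.5788, matching the table. The third value is the point of the slide's question: the difference between the two inverse linked means is 0.8693 − 0.8287 ≈ 0.04, so 0.5788 (the inverse logit of a difference of logits) is not an estimate of the difference between probabilities.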
on to GLIMMIX
IV. GLIMMIX Syntax
SAS software for GLMs & Mixed models
Basic GLIMMIX syntax
Similarities & Differences vs existing SAS Procs
New features
IV. SAS Software for Linear Models
LM
− Proc GLM, MIXED
− Proc GLIMMIX
GLM
− Proc GENMOD, Proc NLMIXED
− Proc GLIMMIX
LMM
− Proc MIXED
− Proc GLIMMIX
GLMM
− Proc GLIMMIX, Proc NLMIXED
IV. PROC GLIMMIX Syntax
What’s familiar (from MIXED & GENMOD)
− CLASS
− MODEL
− DIST and LINK options in MODEL (like GENMOD)
− RANDOM (for G-side)
− ESTIMATE, CONTRAST, LSMEANS
− ODS
What’s new or different
− RANDOM _RESIDUAL_ (replaces REPEATED for R-side)
− LSMESTIMATE
− new options in LSMEANS (e.g. better options for factorial exp)
− NLOPTIONS
− Model diagnostics
IV. Relation between GLMM Structure and GLIMMIX Code
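GLMM:
$$\mathbf{y} \mid \mathbf{u} \sim \text{dist} \ (\text{R-side covariance } \mathbf{R}), \qquad \mathrm{Var}(\mathbf{u}) = \mathbf{G}$$
$$g(\boldsymbol{\mu} \mid \mathbf{u}) = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u}$$
$$\mathrm{Var}(\mathbf{y} \mid \mathbf{u}) = \mathbf{V}^{1/2}\, \mathbf{P}\, \mathbf{V}^{1/2}$$

proc glimmix;
  class variables;
  model <resp>=<fixed effects> / dist= link= ;
  random <g-side effects> / <options>;
  random _residual_ / type= subject= ;
run;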
IV. NLOPTIONS Statement
New Statement in GLIMMIX
Controls Optimization technique, Line Search Method, number of Iterations, etc

proc glimmix;
  class id a b;
  model y=a b a*b;
  random _residual_ / type=cs subject=id(a);
  nloptions tech=nrridg maxiter=100;

TECH=NRRIDG causes GLIMMIX to use MIXED computing algorithm (good for comparison...)
IV. Programming Statements
Similar to GENMOD, NLIN, NLMIXED
GLIMMIX supports statements using DATA step syntax
Use to transform variables, define quantities to output, user-defined link, variance, etc.
For example....

proc glimmix;
  class block entry;
  pct=y/n;
  model pct=entry;
  random intercept / subject=block;
IV. Some GLIMMIX Defaults Useful to Know
In MODEL statement− response Y= NORMAL distribution & IDENTITY link
− response Y/N= BINOMIAL distribution and LOGIT link
For distributions without scale parameter in variance function (e.g. Binomial, Poisson)−no scale parameter assumed (unlike %GLIMMIX macro)
−obtain scale parameter with RANDOM _RESIDUAL_
Optimization method automatically matched based on DISTRIBUTION & LINK
IV. Estimation Methods in PROC GLIMMIX
Defaults depend on model, distribution, and link
May be altered with METHOD= option
− in PROC statement
METHOD= options
− variations on pseudo-likelihood
− RSPL: restricted obj fct (like REML); subject specific (conditional or mixed) model
− RMPL: restricted obj fct (like REML); population averaged (marginal) model
− MSPL: unrestricted obj fct (like ML); subject specific (conditional or mixed) model
− MMPL: unrestricted obj fct (like ML); population averaged (marginal) model
IV. Defaults & Methods (continued)
GLMM Default Method is RSPL
For LMM, this is REML
− GLIMMIX uses different algorithm than MIXED; TECH=NRRIDG uses MIXED algorithm
− you can get slightly different numbers with MIXED/GLIMMIX
METHOD=MSPL yields ML estimates
Methods appear in literature as MPL, PQL
Gaussian adaptive quadrature and Laplace algorithms will be added to V 9.2
− not available yet & not discussed here
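As an illustration of where the METHOD= option goes, the sketch below fits the Hessian fly model from earlier in the course under two of the pseudo-likelihood variants. It assumes the `HessianFly` data set with variables `block`, `entry`, `y` and `n` as used on the earlier slides; it has not been run against the course data.

```sas
/* Restricted pseudo-likelihood, subject-specific expansion (the GLMM default) */
proc glimmix data=HessianFly method=rspl;
  class block entry;
  model y/n = entry;
  random intercept / subject=block;
run;

/* Unrestricted (ML-like) pseudo-likelihood, subject-specific expansion */
proc glimmix data=HessianFly method=mspl;
  class block entry;
  model y/n = entry;
  random intercept / subject=block;
run;
```

Comparing the covariance parameter estimates from the two runs shows how much the restricted versus unrestricted objective function matters for a given data set.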
IV. Examples
Poisson regression, log link, change variance function
proc glimmix;
  class id;
  _variance_=_mu_*_mu_;
  model y=x / dist=poisson;
run;

Poisson regression, log link, add scale parameter
proc glimmix;
  class id;
  model y=x / dist=poisson;
  random _residual_;
run;

Poisson regression, log link
proc glimmix;
  class id;
  model y=x / dist=poisson;
run;
IV. “GLM-mode” vs “GLMM-mode”
Use following trick to get GLM (GENMOD) type model via pseudo-likelihood

“GLM-mode”: max likelihood
proc glimmix;
  class id;
  model y=x / dist=poisson;
  random _residual_;

“GLMM-mode”: pseudo likelihood (this is a GEE with indep working corr)
proc glimmix;
  class id;
  model y=x / dist=poisson;
  random _residual_ / subject=id;
IV. Distributions supported by GLIMMIX
Continuous
− Beta
− Normal
− Lognormal
− Gamma
− Exponential
− Inverse Gaussian
− Shifted T
Discrete
− Binary
− Binomial
− Poisson
− Geometric
− Negative Binomial
− Multinomial (Nominal, Ordinal)
IV. MIXED to GLIMMIX – R-side
proc mixed;
  class loc id trt time;
  model y=trt | time;
  random loc;
  repeated / type=ar(1) subject=id(loc);

proc glimmix;
  class loc id trt time;
  model y=trt | time;
  random intercept / subject=loc;
  random _residual_ / type=ar(1) subject=id(loc);

When you use GLIMMIX, you will notice it is much fussier about the SUBJECT= option when a nested subject structure is present (MIXED is more likely to let you get away with ignoring SUBJECT)
IV. More on R-side
proc mixed;
  class loc id trt time;
  model y=trt | time;
  random loc;
  repeated time / type=ar(1) subject=id(loc);

proc glimmix;
  class loc id trt time;
  model y=trt | time;
  random intercept / subject=loc;
  random time / type=ar(1) subject=id(loc) residual;
  ** vs random _residual_ / type=ar(1) subject=id(loc);

alternative form of random residual, e.g. when time points missing, unsorted etc.
IV. MIXED to GLIMMIX - Estimate
MIXED: single row ESTIMATE statements
GLIMMIX: multi-row with multiplicity adjustment

proc mixed;
  class trt;
  model y=trt a x trt*a trt*x;
  estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3;
  estimate '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3;
  estimate '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3;

proc glimmix;
  class trt;
  model y=trt a x trt*a trt*x;
  estimate '10 3' trt 1 -1 trt*a 10 -10 trt*x 3 -3,
           '20 3' trt 1 -1 trt*a 20 -20 trt*x 3 -3,
           '30 3' trt 1 -1 trt*a 30 -30 trt*x 3 -3 / adjust=scheffe;
IV. MIXED vs. GLIMMIX - LSMEANS
Example: Factorial

PROC MIXED;
  class A B;
  model y=A|B;
  lsmeans A B/diff;
  lsmeans A*B/diff slice=(A B);
− DIFF gives you table of all possible differences
− SLICE= tests – but does not estimate – simple effects A given B, vice versa

PROC GLIMMIX;
  class A B;
  model y=A|B;
  lsmeans A B/diff lines;
  lsmeans A*B / slice=(A B) slicediff=(A B);
− LINES gives multiple range display users love
− SLICEDIFF= restricts A*B diffs to actual simple effects, e.g. A1-A2|Bj
IV. GLIMMIX – LSMEANS (1) Main Effects

B Least Squares Means
B   Estimate   Standard Error   DF      t Value   Pr > |t|
1   18.5300    1.3226           13.69   14.01     <.0001
2   26.5200    1.3226           13.69   20.05     <.0001
4   28.2800    1.3226           13.69   21.38     <.0001
8   25.3000    1.3226           13.69   19.13     <.0001

T Grouping for B Least Squares Means
LS-means with the same letter are not significantly different.
B   Estimate   Grouping
4   28.2800    A
2   26.5200    A
8   25.3000    A
1   18.5300    B

proc glimmix data=AxB_example;
  class block A B;
  model y=A|B/ddfm=satterth;
  random block block*B;
  lsmeans A B/diff lines;
  lsmeans A*B/slicediff=(A B);
run;
IV. GLIMMIX – LSMEANS (2) Simple Effects
proc glimmix data=AxB_example;
  class block A B;
  model y=A|B/ddfm=satterth;
  random block block*B;
  lsmeans A B/diff lines;
  lsmeans A*B/slicediff=(A B);
run;

Simple Effect Comparisons of A*B Least Squares Means By B
Simple Effect Level   A   _A   Estimate   Standard Error   DF   t Value   Pr > |t|
B 1                   r   s     2.9400    1.3144           16    2.24     0.0399
B 2                   r   s     2.6400    1.3144           16    2.01     0.0618
B 4                   r   s    -0.2000    1.3144           16   -0.15     0.8810
B 8                   r   s    -1.0000    1.3144           16   -0.76     0.4578

A*B Least Squares Means
A   B   Estimate   Standard Error
r   1   20.0000    1.4769
r   2   27.8400    1.4769
r   4   28.1800    1.4769
r   8   24.8000    1.4769
s   1   17.0600    1.4769
s   2   25.2000    1.4769
s   4   28.3800    1.4769
s   8   25.8000    1.4769
IV. GLIMMIX – LSMEANS (3)
lsmeans a*b / diff; gave you this

Differences of A*B Least Squares Means
A   B   _A   _B   Estimate   Standard Error   DF      t Value   Pr > |t|
r   1   r    2     -7.8400   1.8796           19.49   -4.17     0.0005
r   1   r    4     -8.1800   1.8796           19.49   -4.35     0.0003
r   1   r    8     -4.8000   1.8796           19.49   -2.55     0.0192
r   1   s    1      2.9400   1.3144           16       2.24     0.0399
r   1   s    2     -5.2000   1.8796           19.49   -2.77     0.0121
r   1   s    4     -8.3800   1.8796           19.49   -4.46     0.0003
r   1   s    8     -5.8000   1.8796           19.49   -3.09     0.0060
r   2   r    4     -0.3400   1.8796           19.49   -0.18     0.8583
r   2   r    8      3.0400   1.8796           19.49    1.62     0.1219
r   2   s    1     10.7800   1.8796           19.49    5.74     <.0001
etc
IV. GLIMMIX -- LSMESTIMATE
Example: Simple Effect in 2-Factor Factorial

Model: $y_{ijk} = \mu_{ij} + e_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$
Simple Effect, e.g. A|B: $\mu_{ij} - \mu_{i'j} = \alpha_i - \alpha_{i'} + (\alpha\beta)_{ij} - (\alpha\beta)_{i'j}$

estimate 'A|B' a*b 1 0 0 0 -1 0 0 0;               /* not estimable */
estimate 'A|B' a 1 -1 a*b 1 0 0 0 -1 0 0 0;        /* must write */
lsmestimate a*b 'A|B' 1 0 0 0 -1 0 0 0;            /* new GLIMMIX alternative */

LSMESTIMATE is defined on $\mu_{ij}$, not on model effects
Allows multiple LSMESTIMATES & ADJUST= for multiplicity
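The multi-row form with a multiplicity adjustment might look like the sketch below. It assumes the `AxB_example` data set used on the LSMEANS slides, with A at 2 levels and B at 4 levels, so that the A*B cell means are ordered A1B1, A1B2, A1B3, A1B4, A2B1, ..., A2B4; the labels and the choice of Bonferroni adjustment are illustrative only.

```sas
proc glimmix data=AxB_example;
  class block A B;
  model y = A|B / ddfm=satterth;
  random block block*B;
  /* One row per simple effect of A within a level of B;   */
  /* coefficients apply to the A*B least squares means.     */
  lsmestimate A*B 'A1-A2 | B1' 1 0 0 0 -1  0  0  0,
                  'A1-A2 | B2' 0 1 0 0  0 -1  0  0,
                  'A1-A2 | B3' 0 0 1 0  0  0 -1  0,
                  'A1-A2 | B4' 0 0 0 1  0  0  0 -1 / adjust=bon;
run;
```

Because the coefficients act on cell means rather than on model effects, there is no need to carry the matching main-effect coefficients that the ESTIMATE statement requires.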
IV. ODS Graphics With GLIMMIX
Not available with MIXED
ods html;
ods graphics on;
ods select MeanPlot;
proc glimmix data=AxB_example;
  class block A B;
  model y=A|B/ddfm=satterth;
  random block block*B;
  lsmeans A*B/plot=MeanPlot(sliceby=A join cl);
run;
ods graphics off;
ods html close;
run;
Factorial Treatment Design
Treatment Design vs Experiment (or study) Design
Factorial is type of treatment design
Factor A, a levels; Factor B, b levels; etc
Main inference tools:
− simple effects; e.g. method effect | variety j
− interaction; i.e. simple effects equal for all j
− main effects
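Model: $y_{ijk} = \mu_{ij} + E_{ijk}$
− $y_{ijk}$ = $k$th obs on $ij$th A × B combination
− $\mu_{ij}$ = $ij$th A × B mean
− $E_{ijk}$ is generic random structure; specific form depends on design
Simple effect:
− A | B$_j$: $\mu_{ij} - \mu_{i'j}$
− B | A$_i$: $\mu_{ij} - \mu_{ij'}$
Interaction:
− equal simple effects ⇒ no interaction, e.g. $\mu_{ij} - \mu_{i'j} = \mu_{ij'} - \mu_{i'j'}$
Main effect: $\bar{\mu}_{i\cdot} - \bar{\mu}_{i'\cdot}$ or $\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot j'}$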
GLIMMIX Features
Can estimate / test
− simple effects
− main effect
− depending on which is appropriate
ODS graphics can graph / plot effects of interest
SLICE can focus on simple effects in presence of interaction
SLICEDIFF can estimate simple effects of interest
Modeling & Design
But My Study is not a Designed Experiment!
Comparative Study: any study whose purpose is to compare treatments or conditions (includes assessing change over time). Includes “quasi-experiments” & surveys with comparative objectives + designed experiments. Design principles apply to all!
Most modeling issues are study design issues
Most modeling errors result from poor understanding of design principles
If you are modeling, you need to understand design principles!!
Key Terms in Design
Treatment Design: factors and levels & how they are structured in the study. E.g. factorial, planned obs over time
Experiment Design: Organization of experimental units (e.g. into matched pairs, blocks, strata, clusters); plan by which they are assigned to treatment levels.
Experimental Unit (e.u.): Smallest entity to which treatment levels (or treatment combinations) are independently assigned. E.U.s are legitimate units of replication
Sampling Unit (S.U.): Unit on which measurement is taken. May be e.u. itself or subset of e.u. A.k.a. pseudo-replicate
Pseudo-replication: use of S.U.s as units of replication; common form of inappropriate design & analysis
Factorial & Experiment Designs
idea: experimental unit is smallest entity to which treatment level independently applied
e.u. may be different size for different factors
e.g. from SAS for Mixed Models, Section 4.6
− 2 type × 3 dose example
− dose applied to cage; type to animal in cage
− e.u. for dose: cage with 2 animals
− e.u. for type (and dose × type): animal
− split-plot
many variations (including repeated measures)
Adding to Model
[Diagram: two treatment arms, “Participate in Prof Devel” and “Do Not Participate”; each arm shows school > classroom > students, with a curriculum (exp or std) assigned to each classroom]
V. Factorial Treatment Designs
Basic Features
Come in Many (many, many) design forms
Experiment design & “quasi-experiment” or survey “study design”
−key to deciding what’s random & what’s fixed
−non-mixed (LM and GLM only) software is UNACCEPTABLE for these types of problems
Includes repeated measures (change... growth)
Normal and non-normal data
Type x Dose Design
[Diagram: three units labeled Dose 1, Dose 2, Dose 3; each unit is split into two halves labeled Type 1 and Type 2, in varying order]
or... Dose = Professional Development Trt; Type = Curriculum
Figure 4.1 Possible design layouts for 2 x 2 factorial experiment (from SAS for Mixed Models)
Treatment codes: A1B1 A1B2 A2B1 A2B2
a. Completely Randomized
b. Randomized complete block
c. Row-Column (Latin Square)
d. Split-plot 1, whole plot completely randomized
[Layout diagrams for panels a-d not reproduced]
Treatment design: 2 x 2 factorial
Experiment design: many many variations
Here are 7 (seven)
e. Split-plot 2, whole plot in randomized complete blocks
f. Split-block, a.k.a. strip-split-plot
g. Split-plot 3, whole plot in row-column (2 Latin squares)
[Layout diagrams for panels e-g not reproduced]
Even with 2 x 2 factorial, these seven are not all
we’re just getting started!
Split Block Example
[Diagram: microchip wafer; factor Side (L, R) crossed with factor Position (same meaning both sides)]
Choosing right model – step 1
What is the experimental unit?

figure    4.1.a      4.1.b      4.1.c      4.1.d           4.1.e            4.1.f         4.1.g
effect    CRD        RCB        LS         split plot CR   split plot RCB   split-block   split-plot LS
block?    no         yes        row col    no              yes              yes           row col
A         eu(A*B)    blk*A*B    row*col    eu(A)           blk*A            blk*A         row*col
B         eu(A*B)    blk*A*B    row*col    B*eu(A)         blk*A*B          blk*B         row*col*B
A*B       eu(A*B)    blk*A*B    row*col    B*eu(A)         blk*A*B          blk*A*B       row*col*B
Common Models in PROC MIXED/GLIMMIX
Design: SAS – class, model and random statements

CRD (Figure 4.1.a)
  class eu a b;
  model y=a b a*b;
RCB (Fig 4.1.b)
  class block a b;
  model y=a b a*b;
  random block;   or   random intercept / subject=block;
Latin Square (4.1.c)
  class row col a b;
  model y=a b a*b;
  random row col;
Split-plot CR (4.1.d)
  class eu a b;
  model y=a b a*b;
  random eu(a);
Split-plot RCB (4.1.e)
  class block a b;
  model y=a b a*b;
  random block block*a;
Split-block (4.1.f)
  class block a b;
  model y=a b a*b;
  random block block*a block*b;
Split-plot LS (4.1.g)
  class row col a b;
  model y=a b a*b;
  random row col row*col;   (or, equivalently, random row col row*col*a;)

MODEL: treatment design
RANDOM: experiment (study) design
Model for split-plot: school-classroom example
Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.
Effect          e.u.
prof dev trt    school
curriculum      classroom(school)
p.d × curr      classroom(school)

model: $y_{ijk} = \mu_{ij} + s(t)_{ik} + e_{ijk}$
or alternative expression
$\mu_{ij} = \mu + p_i + c_j + (pc)_{ij}$
$E_{ijk} = \mathrm{school}(\mathrm{trt})_{ik} + e_{ijk}$
note! student is sampling unit (not an e.u.)
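One way the three-step strategy could translate into GLIMMIX statements is sketched below. Every name here is hypothetical: the sketch assumes a data set `pd_study` with one record per classroom (for example, a classroom mean score `y`), and class variables `school`, `pd` (professional development treatment) and `curr` (curriculum).

```sas
proc glimmix data=pd_study;
  class school pd curr;
  model y = pd|curr / ddfm=kr;
  /* e.u. for prof dev trt: school, nested within treatment */
  random intercept / subject=school(pd);
  /* e.u. for curriculum and pd*curr: classroom(school).         */
  /* With one record per classroom, this is the residual term.   */
run;
```

If the data instead had one record per student, the classroom-level unit would need its own RANDOM term so that students are treated as sampling units rather than as replicates.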
Model for split-plot – Dose x Type example
Strategy:
1. list factor effects
2. list e.u. for that effect
3. each e.u. a random model effect

e.g.
Effect         e.u.
dose           block × dose
type           block × dose × type
dose × type    block × dose × type

model: $y_{ijk} = \mu_{ij} + \mathrm{bloc}_k + (b \times d)_{ik} + e_{ijk}$
or alternative expression
$\mu_{ij} = \mu + d_i + t_j + (dt)_{ij}$
$E_{ijk} = \mathrm{bloc}_k + (b \times d)_{ik} + e_{ijk}$
note! bloc × type NOT in model (not an e.u.)
Conventional ANOVA
Source                        EMS
bloc
dose                          $\sigma_S^2 + t\sigma_W^2 + Q_D$
w.p. error† (bloc × dose)     $\sigma_S^2 + t\sigma_W^2$
type                          $\sigma_S^2 + Q_T$
dose × type                   $\sigma_S^2 + Q_{DT}$
s.p. error††                  $\sigma_S^2$

† a.k.a. between subjects error
†† a.k.a. within subjects error
Standard errors of various terms
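Main effects
− of dose: $\bar{\mu}_{i\cdot} - \bar{\mu}_{i'\cdot}$, Var $= \dfrac{2}{rt}\left(\sigma_S^2 + t\sigma_W^2\right)$
− of type: $\bar{\mu}_{\cdot j} - \bar{\mu}_{\cdot j'}$, Var $= \dfrac{2}{rd}\,\sigma_S^2$
Simple effects
− type|dose$_i$: $\mu_{ij} - \mu_{ij'}$, Var $= \dfrac{2}{r}\,\sigma_S^2$
− dose|type$_j$: $\mu_{ij} - \mu_{i'j}$, Var $= \dfrac{2}{r}\left(\sigma_S^2 + \sigma_W^2\right)$

Note: you can use MS() directly except for dose|type$_j$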
Programming in Proc GLIMMIX
proc glimmix;
  class bloc type dose;
  model y=type|dose;
  random intercept dose / subject=bloc;   ** i.e. random bloc bloc*dose;
  lsmeans type*dose / diff lines slicediff=(type dose) slice=(type dose);
  ods output lsmeans=lsm;
run;

LSMEANS options
− DIFF: all possible mean differences
− LINES: with “MRT lines”
− SLICEDIFF=: simple effect differences only
− SLICE=: simple effect tests only

You can use ODS to output LSMEANS and GPLOT for interaction plots, or use ODS graphics directly
Type x Dose: Selected Output
Covariance Parameter Estimates
Cov Parm    Subject   Estimate   Standard Error
Intercept   block     2.0735     2.7320
dose        block     4.5132     2.8291
Residual              4.3189     1.5270

Type III Tests of Fixed Effects
Effect      Num DF   Den DF   F Value   Pr > F
type        1        16       2.78      0.1151
dose        3        12       13.63     0.0004
type*dose   3        16       2.29      0.1176
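The variance formulas for simple effects in a split-plot can be checked against these covariance parameter estimates. The DATA step below is a sketch with made-up names (`se_check` and its variables); the number of blocks, r = 5, is an assumption inferred from the denominator degrees of freedom for dose, 12 = (4 − 1)(r − 1), and is not stated on the slides.

```sas
data se_check;
  s2_block      = 2.0735;   /* Intercept / block                     */
  s2_block_dose = 4.5132;   /* dose / block: whole-plot variance     */
  s2_resid      = 4.3189;   /* Residual: split-plot variance         */
  r             = 5;        /* number of blocks (assumed, see above) */

  /* type | dose: difference within a whole plot */
  se_type_given_dose = sqrt(2 * s2_resid / r);
  /* dose | type: difference across whole plots */
  se_dose_given_type = sqrt(2 * (s2_resid + s2_block_dose) / r);
  /* a single type*dose least squares mean */
  se_cell_mean       = sqrt((s2_block + s2_block_dose + s2_resid) / r);
run;

proc print data=se_check noobs;
run;
```

Worked by hand, these come to about 1.3144, 1.8796 and 1.4769, which agree with the standard errors shown for the simple effects by dose, the simple effects by type, and the type*dose least squares means in this example.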
Type x Dose LSMeans
type*dose Least Squares Means
type dose Estimate Standard Error DF t Value Pr > |t|
r 1 20.0000 1.4769 20.23 13.54 <.0001
r 2 27.8400 1.4769 20.23 18.85 <.0001
r 4 28.1800 1.4769 20.23 19.08 <.0001
r 8 24.8000 1.4769 20.23 16.79 <.0001
s 1 17.0600 1.4769 20.23 11.55 <.0001
s 2 25.2000 1.4769 20.23 17.06 <.0001
s 4 28.3800 1.4769 20.23 19.22 <.0001
s 8 25.8000 1.4769 20.23 17.47 <.0001
Type x Dose: “MRT Lines”
T Grouping for type*dose Least Squares Means
LS-means with the same letter are not significantly different.
type   dose   Estimate   Grouping
s      4      28.3800    A
r      4      28.1800    A
r      2      27.8400    A
s      8      25.8000    A
s      2      25.2000    A
r      8      24.8000    A
r      1      20.0000    B
s      1      17.0600    C

however ...
A Factorial Inference Flowchart
The Prime Directive: Interactions first!!!!!
Interaction?
− Negligible: Interpret Main Effects
− Non-ignorable: Interpret Simple Effects
Full Wheelbarrow
Plots of Differences between Means
LSMEANS allows various plots of mean differences
DIFFPlot: plots interval estimates of mean differences
ANoMPlot: (ANalysis of Means) plots difference between each treatment and the overall mean
ControlPlot: Plots each treatment vs control (e.g. like Dunnett test)
SAS for Mean Difference Plots
From Type x Dose example
ods html;
ods graphics on;
ods select Anomplot DiffPlot;
proc glimmix data=variety_eval;
  class block type dose;
  model y=type|dose/ddfm=satterth;
  random block block*dose;
  lsmeans dose/plot=DiffPlot;
  lsmeans dose/plot=AnomPlot;
  *lsmeans type*dose/plot=DiffPlot;
  *lsmeans type*dose/plot=AnomPlot;
run;
ods graphics off;
ods html close;
run;
SAS for Mean Difference Plots: DIFFPLOT
SAS for Mean Difference Plots: ANoMPLOT
Mean Difference Plots – Control Plots
From SAS for Linear Models – Output 3.17-3.22
Randomized Complete Block
5 Irrigation Treatments: Flood (control), Basin, Spray, Sprinkler, Trickle

ods html;
ods graphics on;
ods select ControlPlot;
proc glimmix order=data;
  class bloc irrig;
  model fruitwt=irrig;
  random bloc;
  lsmeans irrig/diff=control('flood') plot=controlplot adjust=dunnett;
run;
ods graphics off;
ods html close;
run;
Dunnett-style Control Plot
Back to Type x Dose Data: Interaction Plot
Type x Dose: Simple Effects
Tests of Effect Slices for type*dose Sliced By type (SLICE: test only)
type   Num DF   Den DF   F Value   Pr > F
r      3        19.49    8.12      0.0010
s      3        19.49    13.58     <.0001

Tests of Effect Slices for type*dose Sliced By dose (SLICE: test only)
dose   Num DF   Den DF   F Value   Pr > F
1      1        16       5.00      0.0399
2      1        16       4.03      0.0618
4      1        16       0.02      0.8810
8      1        16       0.58      0.4578

Simple Effect Comparisons of type*dose Least Squares Means By dose (SLICEDIFF: estimates etc)
Simple Effect Level   type   _type   Estimate   Standard Error   DF   t Value   Pr > |t|
dose 1                r      s        2.9400    1.3144           16    2.24     0.0399
dose 2                r      s        2.6400    1.3144           16    2.01     0.0618
dose 4                r      s       -0.2000    1.3144           16   -0.15     0.8810
dose 8                r      s       -1.0000    1.3144           16   -0.76     0.4578
Type x Dose: Simple Effect Estimates by Type
Simple Effect Comparisons of type*dose Least Squares Means By type
Simple Effect Level dose _dose Estimate Standard Error DF t Value Pr > |t|
type r 1 2 -7.8400 1.8796 19.49 -4.17 0.0005
type r 1 4 -8.1800 1.8796 19.49 -4.35 0.0003
type r 1 8 -4.8000 1.8796 19.49 -2.55 0.0192
type r 2 4 -0.3400 1.8796 19.49 -0.18 0.8583
type r 2 8 3.0400 1.8796 19.49 1.62 0.1219
type r 4 8 3.3800 1.8796 19.49 1.80 0.0876
type s 1 2 -8.1400 1.8796 19.49 -4.33 0.0003
type s 1 4 -11.3200 1.8796 19.49 -6.02 <.0001
type s 1 8 -8.7400 1.8796 19.49 -4.65 0.0002
type s 2 4 -3.1800 1.8796 19.49 -1.69 0.1066
type s 2 8 -0.6000 1.8796 19.49 -0.32 0.7530
type s 4 8 2.5800 1.8796 19.49 1.37 0.1855
Effect of dose?
Log(Dose) scale:
contrast 'logdose linear' dose -3 -1 1 3;
contrast 'logdose quad'   dose 1 -1 -1 1;
contrast 'logdose cubic'  dose -1 3 -3 1;
contrast 'type x linear'  dose*type -3 -1 1 3 3 1 -1 -3;
contrast 'type x quad'    dose*type 1 -1 -1 1 -1 1 1 -1;
contrast 'type x cubic'   dose*type -1 3 -3 1 1 -3 3 -1;

otherwise.....
contrast 'dose linear'    dose -11 -7 1 17;
contrast 'dose quad'      dose 20 -4 -29 13;
contrast 'dose cubic'     dose -8 14 -7 1;
contrast 'type x linear'  dose*type -11 -7 1 17 11 7 -1 -17;
contrast 'type x quad'    dose*type 20 -4 -29 13 -20 4 29 -13;
contrast 'type x cubic'   dose*type -8 14 -7 1 8 -14 7 -1;
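The coefficients for unequally spaced doses do not have to be looked up; orthogonal polynomial contrasts can be generated. The sketch below assumes SAS/IML is licensed and that the doses are 1, 2, 4 and 8, as in the Type x Dose output; the rescaling constants are chosen only so that the results are easy to compare with the integer coefficients in the CONTRAST statements.

```sas
proc iml;
  dose = {1, 2, 4, 8};
  P = orpol(dose, 3);             /* columns: constant, linear, quadratic, cubic (orthonormal) */
  lin  = P[,2] / P[3,2];          /* rescale so the coefficient for dose 4 is 1  */
  quad = 13 * P[,3] / P[4,3];     /* rescale so the coefficient for dose 8 is 13 */
  cub  = P[,4] / P[4,4];          /* rescale so the coefficient for dose 8 is 1  */
  print lin quad cub;
quit;
```

Up to rounding, the three columns should be proportional to the 'dose linear', 'dose quad' and 'dose cubic' coefficients. Replacing `dose` with its base-2 logarithm, {0, 1, 2, 3}, gives the equally spaced case used for the logdose contrasts.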
LogDose contrast results
Contrasts
Label Num DF Den DF F Value Pr > F
logdose linear 1 12 18.25 0.0011
logdose quad 1 12 22.54 0.0005
logdose cubic 1 12 0.08 0.7780
type x linear 1 16 6.22 0.0240
type x quad 1 16 0.04 0.8515
type x cubic 1 16 0.61 0.4472
Direct Regression – borrow from ANCOVA
proc glimmix data=variety_eval;
  class block type dose;
  model y=type logdose(type) ld_sq(type) / noint ddfm=satterth solution;
  random intercept dose / subject=block;
  contrast 'equal quad by type?' ld_sq(type) 1 -1;
run;

Solutions for Fixed Effects
Effect          type   Estimate   Standard Error   DF      t Value
type            r      20.1890    1.4204           19.62   14.21
type            s      17.0200    1.4204           19.62   11.98
logdose(type)   r       9.8890    2.0181           21.45    4.90
logdose(type)   s      10.9800    2.0181           21.45    5.44
ld_sq(type)     r      -2.8050    0.6447           21.45   -4.35
ld_sq(type)     s      -2.6800    0.6447           21.45   -4.16

Contrasts
Label                 Num DF   Den DF   F Value   Pr > F
equal quad by type?   1        17       0.04      0.8497

can re-fit with LD_SQ common to both types
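The regression variables `logdose` and `ld_sq` have to exist in the data set before the procedure runs. A minimal sketch of how they might be created follows; it assumes `dose` is stored as a numeric variable and that the logarithm is base 2, which appears consistent with the reported solutions (for type r, 20.189 + 9.889 − 2.805 ≈ 27.3 at dose 2, close to the LS mean of 27.84) but is not stated on the slide. The output data set name `variety_reg` is made up.

```sas
data variety_reg;
  set variety_eval;
  logdose = log2(dose);     /* doses 1, 2, 4, 8 become 0, 1, 2, 3 */
  ld_sq   = logdose**2;     /* quadratic term                     */
run;
```

Keeping `dose` in the CLASS statement while using `logdose` and `ld_sq` as continuous covariates lets the same run fit the block × dose random effect and the regression on log dose.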
Example 3
From SAS for Mixed Models, Section 4.7
4 “conditions”
3 diets
Condition applied in incomplete block design
2 conditions per block
Diet applied to cages within condition
Condition is whole plot, diet is split-plot
“Plot plan”
diet 1 diet 2 diet 3 diet 2 diet 1 diet 3
diet 2 diet 1 diet 3 diet 1 diet 3 diet 2
Model?
• blocking? yes
• e.u. with respect to condition: “1/2 block”
• e.u. with respect to diet: “1/3 condition e.u.”
• e.u. w.r.t. cond x diet: same as diet
Model:
$y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$
SAS Program
proc glimmix data=fix2;
  class cage condition diet;
  model gain=condition diet condition*diet / ddfm=satterth;
  random intercept condition / subject=cage;
run;
data & program: file ch4-ex3.sas
Selected Output
Covariance Parameter Estimates
Cov Parm   Subject  Estimate  Standard Error
Intercept  cage     3.0376    5.0791
condition  cage     0         .
Residual            27.8429   8.7672
How should one deal with a negative variance component estimate?
• revert to ANOVA via PROC GLM?
• in MIXED, use NOBOUND option?
• in GLIMMIX, use LowerB
• alternatively, redefine model
• may be CS with plots in block negatively correlated
Type III Tests of Fixed Effects
Effect          Num DF  Den DF  F Value  Pr > F
condition       3       23.61   2.71     0.0677
diet            2       20.17   0.93     0.4090
condition*diet  6       20.17   1.73     0.1661
Comparison with SAS Proc GLM
proc glm data=fix2;
  class cage condition diet;
  model gain=cage condition cage*condition diet condition*diet;
  random cage cage*condition/test;
  lsmeans condition diet condition*diet;
run;
Tests of Hypotheses for Mixed Model Analysis of Variance
Source                     DF  Type III SS  Mean Square  F Value  Pr > F
cage                       5   198.277778   39.655556    2.73     0.2185
* condition                3   171.666667   57.222222    3.95     0.1446
Error: MS(cage*condition)  3   43.500000    14.500000
* This test assumes one or more other fixed effects are zero.
Source                     DF  Type III SS  Mean Square  F Value  Pr > F
cage*condition             3   43.500000    14.500000    0.46     0.7144
* diet                     2   52.055556    26.027778    0.82     0.4561
condition*diet             6   288.388889   48.064815    1.52     0.2333
Error: MS(Error)           16  504.888889   31.555556
More GLM output
Least Squares Means
condition  gain LSMEAN
1          Non-est
2          Non-est
3          Non-est
4          Non-est
diet       gain LSMEAN
normal     57.9166667
restrict   55.5000000
suppleme   58.1666667
condition  diet      gain LSMEAN
1          normal    Non-est
1          restrict  Non-est
1          suppleme  Non-est
2          normal    Non-est
2          restrict  Non-est
2          suppleme  Non-est
3          normal    Non-est
3          restrict  Non-est
3          suppleme  Non-est
4          normal    Non-est
4          restrict  Non-est
4          suppleme  Non-est
• non-estimability results from inappropriate definition of estimability (based on fixed & random effects)
• inescapable consequence of Proc GLM with mixed model
DON’T use Proc GLM with mixed models!
GLM vs MIXED issues
• REML default: negative variance component estimates set to 0
  − if BLOCK affected, type I error rate is affected
  − if error term affected, power may be affected
  − better to allow negative estimates
  − In MIXED: NOBOUND or METHOD=TYPE3
  − In GLIMMIX: LowerB
• vs. GLM uses implied MS regardless
• GLM: inappropriate NON-EST is an artifact of incomplete block design
• Standard errors for means, many simple effects (including SLICE) incorrect in GLM (no fix!!)
GLIMMIX Option (1) – Like NOBOUND in MIXED
proc glimmix data=fix2;
  class cage condition diet;
  model gain=condition|diet/ddfm=kr;
  random intercept condition / subject=cage;
  parms / lowerb=(1e-4,-10,1e-4);
run;
Covariance Parameter Estimates
Cov Parm   Subject  Estimate  Standard Error
Intercept  cage     5.0288    4.7149
condition  cage     -6.2404   4.8693
Residual            31.5556   11.1566
Type III Tests of Fixed Effects
Effect          Num DF  Den DF  F Value  Pr > F
condition       3       4.718   4.31     0.0798
diet            2       16      0.82     0.4561
condition*diet  6       16      1.52     0.2333
GLIMMIX Option (2) – is it really correlation?
proc glimmix data=fix2;
  class cage condition diet;
  model gain=condition|diet/ddfm=kr;
  random intercept / subject=cage;
  random _residual_ / type=cs subject=condition*cage;
run;
Covariance Parameter Estimates
Cov Parm   Subject         Estimate
Intercept  cage            5.0271
CS         cage*condition  -6.2402
Residual                   31.5567
Type III Tests of Fixed Effects
Effect          Num DF  Den DF  F Value  Pr > F
condition       3       4.717   4.31     0.0798
diet            2       16      0.82     0.4561
condition*diet  6       16      1.52     0.2334
Interblock correlation $= \dfrac{\sigma^2_{CC}}{\sigma^2 + \sigma^2_{CC}} = -0.2466$
Modeling Change over Time
• Regression over time
• Latent growth / change models
• Random coefficients over time
• Repeated measures experiment
• Longitudinal Data
From Acock – BMI Data
[Plot: bmi (vertical axis, 10 to 50) versus year (horizontal axis, 1997 to 2003)]
Note – my sample differs from Acock’s, so the numbers won’t match
Basic Growth Model
• Simplest model involves slope & intercept
• In “Stat-speak”: obs = intercept + slope × time + error
$y_{ij} = \beta_0 + \beta_1\, time_i + e_{ij}$
• this is just linear regression
• $e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$, or may be correlated (more later)
Basic Growth Model in SAS
in PROC GLM
proc glm;
  model bmi=year;
run;
Source           DF   Sum of Squares  Mean Square  F Value  Pr > F
Model            1    432.856378      432.856378   19.68    <.0001
Error            229  5037.468822     21.997680
Corrected Total  230  5470.325200
R-Square  Coeff Var  Root MSE  bmi Mean
0.079128  20.01197   4.690168  23.43682
Parameter  Estimate     Standard Error  t Value  Pr > |t|
Intercept  21.38349324  0.55631931      38.44    <.0001
year       0.68444085   0.15429522      4.44     <.0001
very deceptive – more shortly
regression equation: $\hat{y} = 21.38 + 0.684 \times Year$
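The summary statistics in the PROC GLM output follow from the sums of squares by simple arithmetic. A minimal Python sketch, using only the numbers printed in the ANOVA table (the variable names are illustrative, not part of the course material):

```python
import math

# Sums of squares and degrees of freedom copied from the PROC GLM ANOVA table.
ss_model, df_model = 432.856378, 1
ss_error, df_error = 5037.468822, 229
ss_total = ss_model + ss_error

ms_model = ss_model / df_model
ms_error = ss_error / df_error      # mean square error (MSE)

r_square = ss_model / ss_total      # share of variation explained by year
f_value = ms_model / ms_error       # F test for the slope
root_mse = math.sqrt(ms_error)

print(f"R-Square = {r_square:.6f}")
print(f"F Value  = {f_value:.2f}")
print(f"Root MSE = {root_mse:.6f}")
```

An R-Square below 0.08 says that year alone explains little of the variation in BMI, which is why subject-to-subject variation has to enter the model.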
Growth Model in SAS - II
in PROC GLIMMIX
proc glimmix;
  class id;
  model bmi=year/solution;
  random _residual_ /subject=id;
  estimate 'y-hat in 1997' intercept 1 year 0 / cl;
  estimate 'y-hat in 2000' intercept 1 year 3 / cl;
  estimate 'y-hat in 2003' intercept 1 year 6 / cl;
run;
selected output next page
Basic Growth Model – Selected GLIMMIX Output
Covariance Parameter Estimates
Cov Parm       Estimate  Standard Error
Residual (VC)  21.9977   2.0558
Solutions for Fixed Effects
Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3835   0.5563          32   38.44    <.0001
year       0.6844    0.1543          197  4.44     <.0001
Estimates
Label          Estimate  Standard Error  DF   t Value  Pr > |t|  Alpha  Lower    Upper
y-hat in 1997  21.3835   0.5563          197  38.44    <.0001    0.05   20.2864  22.4806
y-hat in 2000  23.4368   0.3086          197  75.95    <.0001    0.05   22.8283  24.0454
y-hat in 2003  25.4901   0.5563          197  45.82    <.0001    0.05   24.3930  26.5872
Note: residual VC est = MSE from GLM ANOVA
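Each ESTIMATE statement is a linear combination of the fixed-effect solutions. The sketch below redoes that arithmetic in Python; it assumes year is coded as years since 1997, which is inferred from the ESTIMATE coefficients 0, 3 and 6 rather than stated in the slides, and the function name is hypothetical.

```python
# Fixed-effect estimates from the "Solutions for Fixed Effects" table.
intercept = 21.3835
slope = 0.6844

def predicted_bmi(calendar_year, base_year=1997):
    """Fitted mean BMI, with year coded as years since base_year (assumed)."""
    return intercept + slope * (calendar_year - base_year)

for yr in (1997, 2000, 2003):
    print(yr, round(predicted_bmi(yr), 2))
```

The results agree with the Estimates table up to rounding of the printed coefficients.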
G/C Model – Issue I – Account for ID
• Recall R2 for Basic Growth Model very low
• You must account for variation among subjects (ID)
okay:
proc glm;
  class id;
  model bmi=id year;
run;
better:
proc glimmix;
  class id;
  model bmi=year/solution;
  random id; /* or: random intercept / subject=id; */
run;
Selected Output
R-Square from GLM: 0.815282 (vs. 0.079)
Covariance Parameter Estimates (from GLIMMIX)
Cov Parm   Subject  Estimate  Standard Error
Intercept  id       17.2449   4.4950
Residual            5.1293    0.5168   (vs. 21.998)
Solutions for Fixed Effects
Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3835   0.7712          32   27.73    <.0001
year       0.6844    0.07451         197  9.19     <.0001
estimates don’t change; std errors do
Growth Change Modeling Issue - II
Correlated Errors
In Model $y_{ij} = \beta_0 + \beta_1\, year_i + e_{ij}$,
$e_{1j}, e_{2j}, \ldots, e_{tj}$ may be independent $N(0, \sigma^2)$ or may be correlated
Recall:
Correlation Modeled by Covariance Model
• Failure to model correlation increases P{type I error}
• Over-modeling correlation decreases Power
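Covariance models
Indep: $\Sigma = \sigma^2 I$ (identical to split-plot)
CS: $\Sigma = \sigma^2 \begin{bmatrix} 1 & \rho & \rho & \rho \\ & 1 & \rho & \rho \\ & & 1 & \rho \\ & & & 1 \end{bmatrix}$
NOTE: CS is reparameterization of Indep
AR(1): $\Sigma = \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \rho^3 \\ & 1 & \rho & \rho^2 \\ & & 1 & \rho \\ & & & 1 \end{bmatrix}$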
More covariance models
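Toep: $\Sigma = \sigma^2 \begin{bmatrix} 1 & \rho_1 & \rho_2 & \rho_3 \\ & 1 & \rho_1 & \rho_2 \\ & & 1 & \rho_1 \\ & & & 1 \end{bmatrix}$
ANTE(1): $\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_1\sigma_2\rho_1 & \sigma_1\sigma_3\rho_1\rho_2 & \sigma_1\sigma_4\rho_1\rho_2\rho_3 \\ & \sigma_2^2 & \sigma_2\sigma_3\rho_2 & \sigma_2\sigma_4\rho_2\rho_3 \\ & & \sigma_3^2 & \sigma_3\sigma_4\rho_3 \\ & & & \sigma_4^2 \end{bmatrix}$
UN: $\Sigma = \begin{bmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ & \sigma_2^2 & \sigma_{23} & \sigma_{24} \\ & & \sigma_3^2 & \sigma_{34} \\ & & & \sigma_4^2 \end{bmatrix}$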
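To make the structures concrete, here is a small Python sketch that builds the CS, AR(1) and Toeplitz matrices for t time points and counts the covariance parameters of each structure. The function names are hypothetical and the matrices are plain nested lists; this only illustrates the patterns, not how SAS stores them.

```python
def cs(sigma2, rho, t):
    """Compound symmetry: equal variance, same correlation rho for every pair."""
    return [[sigma2 * (1.0 if i == j else rho) for j in range(t)] for i in range(t)]

def ar1(sigma2, rho, t):
    """AR(1): correlation rho**|i-j| decays with the distance between times."""
    return [[sigma2 * rho ** abs(i - j) for j in range(t)] for i in range(t)]

def toep(sigma2, rhos, t):
    """Toeplitz: a separate correlation rhos[d-1] for each lag d = 1..t-1."""
    lagged = [1.0] + list(rhos)
    return [[sigma2 * lagged[abs(i - j)] for j in range(t)] for i in range(t)]

def n_params(structure, t):
    """Number of covariance parameters for t time points."""
    return {"cs": 2, "ar1": 2, "toep": t, "ante1": 2 * t - 1,
            "un": t * (t + 1) // 2}[structure]

for row in ar1(1.0, 0.5, 4):
    print(row)
print([n_params(s, 8) for s in ("toep", "ante1", "un")])
```

With 8 time points the counts are 8, 15 and 36, which shows how quickly the unstructured model grows relative to the others.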
Issues in Repeated Measures
• Impact of covariance structure?
• Selection of appropriate covariance?
• Bias in std errors, test statistics
• Degrees of freedom
• Nonlinear models over time
• Non-normal errors
Basic G/C Model with Covariance Model
Also known as Autocorrelation
proc glimmix;
  class id;
  model bmi=year / solution ddfm=kr;
  random intercept / subject=id;
  random _residual_ / subject=id type=ar(1);
run;
Competing Covariance Models compared via Fit Statistics
• AICC, BIC
• HQIC, CAIC
degree of freedom and std error bias must be dealt with (more later)
Selected Output for G/C Model w/ Autocorrelation
Covariance Parameter Estimates (variance, covariance & correlation estimates)
Cov Parm   Subject  Estimate  Standard Error
Intercept  id       14.8587   4.6202
AR(1)      id       0.5623    0.1144
Residual            7.7165    1.8981
Fit Statistics (used to assess cov model)
-2 Res Log Likelihood      1111.69
AIC (smaller is better)    1117.69
AICC (smaller is better)   1117.79
BIC (smaller is better)    1122.18
CAIC (smaller is better)   1125.18
HQIC (smaller is better)   1119.20
Generalized Chi-Square     1767.07
Gener. Chi-Square / DF     7.72
Solutions for Fixed Effects
Effect     Estimate  Standard Error  DF   t Value  Pr > |t|
Intercept  21.3238   0.8042          32   26.52    <.0001
year       0.6896    0.1102          197  6.26     <.0001
estimate – slight effect; std error – bigger effect
random coeff correl errors prediction add Gender add emotional prob
Repeated Measures Experiments a.k.a. Longitudinal Data
• Assign e.u. to treatments
• May use any design (completely random, blocked, row-column, split-plot ....)
• Observations at planned times
• Objectives
  1. assess changes in response over time
  2. assess treatment effect on (1)
Typical repeated Measures Data
from SAS for Linear Models, Chapter 8 SAS for Mixed Models, 2nd ed, Chapter 5
From BMI Data: Are G/C Curves Equal by Gender?
[Interaction plot of G/C curve by gender]
FYI – SAS Code to Get Interaction Plot
ods html;
ods graphics on;
ods select MeanPlot;
proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender|year / solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);
  lsmeans gender*year / plot=MeanPlot (sliceby=gender join cl);
run;
ods graphics off;
ods html close;
run;
Model
proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender|year / solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);
translates to:
Model: $y_{ijk} = \mu_{ij} + id(gender)_{ik} + e_{ijk}$
where $\mu_{ij}$ = gender$_i$ × year$_j$ mean
can express as: $\mu_{ij} = \mu + g_i + yr_j + (g \times yr)_{ij}$
$id(gender)_{ik}$ is between subjects error $NI(0, \sigma^2_B)$, like whole-plot error
$e_{ijk}$ is within subjects error, like split-plot error, except ...
Let $e_{ik}' = [e_{i1k}, e_{i2k}, \ldots, e_{iTk}]$; $e_{ik} \sim MVN(0, \Sigma)$
Back to SAS for Mixed Models Example
Model: $y_{ijk} = \mu_{ij} + s(trt)_{ik} + e_{ijk}$
where $\mu_{ij}$ = trt$_i$ × time$_j$ mean
$s(trt)_{ik}$ is between subjects error $NI(0, \sigma^2_B)$, like whole-plot error
$e_{ijk}$ is within subjects error, like split-plot error, except ...
Let $e_{ik}' = [e_{i1k}, e_{i2k}, \ldots, e_{iTk}]$; $e_{ik} \sim MVN(0, \Sigma)$
Hence $Var(y_{ik}) = V_{ik} = Z_S\, \sigma^2_B\, Z_S' + \Sigma = \sigma^2_B J_T + \Sigma$
typically $V = Var(y) = I_{AK} \otimes V_{ik}$; $A$ = # trt's, $K$ = # subj/trt
Middle Ground between MANOVA and Split-Plot in Time via Proc GLIMMIX
PROC GLIMMIX;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME;
  RANDOM INTERCEPT / SUBJECT=SUBJ(TRT);
  RANDOM TIME / TYPE=AR(1) SUBJECT=SUBJ(TRT) RESIDUAL;
  *LSMEANS TRT TIME TRT*TIME;
  TITLE 'MIXED - AR(1) ERRORS';
RUN;
RANDOM specifies between subjects effects (G-side)
RANDOM...RESIDUAL specifies within subjects effect (R-side)
in many models, G- and R-side effects are not identifiable
Modeling Covariance among Repeated Measures
PROC MIXED DATA=univ;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME;
  REPEATED TIME / TYPE=UN SSCP SUBJECT=SUBJ(TRT);
  ODS OUTPUT CovParms=cp;
run;
data times;
  do time1=1 to 8;
    do time2=1 to time1;
      dist=time1-time2;
      output;
    end;
  end;
data covplot;
  merge times cp;
proc gplot data=covplot;
  plot adjcorr*dist=time1;
Computes covariance between pairs of measurements (same subject, different times) based on sum of squares & cross-products matrix, then plots them by distance
Plot of Covariance by Distance
Idealized Plots: CS=Subj(Trt), AR(1), AR(1)+Subj(Trt)
AR(1) + Subj(Trt)
CS= random Subj(Trt)
AR(1) only
Model Fitting Criteria in Version 8
1. Compound Symmetry
proc glimmix;
  classes subj trt time;
  model y= trt time trt*time;
  random time / residual type=cs subject=subj(trt);
  title 'mixed - compound symmetry';
Fit Statistics
-2 Res Log Likelihood 839.39
AIC (smaller is better) 843.39
AICC (smaller is better) 843.47
BIC (smaller is better) 845.75
CAIC (smaller is better) 847.75
HQIC (smaller is better) 844.02
Generalized Chi-Square 767.61
Gener. Chi-Square / DF 4.80
Comparison of Models (Smaller is Better)
Model                            Neg2LogLike  Parms  AIC    AICC   HQIC   BIC    CAIC
Compound Symmetry                839.4        2      843.4  843.5  844.0  845.7  847.7
AR(1) + Subj(TRT) random effect  788.7        3      794.7  794.8  795.6  798.2  801.2
Unstructured                     760.5        36     832.5  854.1  843.7  874.9  910.9
ANTE(1)                          780.7        15     810.7  814.0  815.3  828.3  843.3
TOEP                             784.9        8      800.9  801.9  803.4  810.4  818.4
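The criteria in this comparison differ only in how heavily they penalize the number of covariance parameters. The Python sketch below recomputes them from the -2 Res Log Likelihood and parameter count using the standard penalty formulas. Two inputs are assumptions inferred from the test tables rather than stated in the slides: 24 subjects (TRT is tested with 3 and 20 df) and 160 residual degrees of freedom (24 subjects × 8 times, minus 32 treatment-by-time means). The function name is hypothetical.

```python
import math

def fit_criteria(neg2ll, p, n_subjects, n_resid):
    """Information criteria in 'smaller is better' form.

    neg2ll     : -2 residual log likelihood
    p          : number of covariance parameters
    n_subjects : number of subjects (assumed 24 here; used by HQIC, BIC, CAIC)
    n_resid    : observations minus rank of X (assumed 160 here; used by AICC)
    """
    log_s = math.log(n_subjects)
    return {
        "AIC": neg2ll + 2 * p,
        "AICC": neg2ll + 2 * p * n_resid / (n_resid - p - 1),
        "HQIC": neg2ll + 2 * p * math.log(log_s),
        "BIC": neg2ll + p * log_s,
        "CAIC": neg2ll + p * (log_s + 1),
    }

# -2 Res Log Likelihood and parameter counts taken from the comparison table.
models = {"CS": (839.4, 2), "AR(1)+Subj(TRT)": (788.7, 3),
          "TOEP": (784.9, 8), "ANTE(1)": (780.7, 15), "UN": (760.5, 36)}

for name, (neg2ll, p) in models.items():
    crit = fit_criteria(neg2ll, p, n_subjects=24, n_resid=160)
    print(name, {k: round(v, 1) for k, v in crit.items()})
```

Under these assumptions the output agrees with the table to within rounding of the likelihood values. The unstructured model has the smallest -2 log likelihood, yet its 36 parameters push every penalized criterion above that of AR(1) + Subj(TRT).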
How do Model Fitting Criteria Compare?
• Guerin & Stroup (2000) compared AIC, BIC, HQIC, CAIC for simulated AR(1) and ARH(1) data
• CAIC tends to select simpler models
• AIC tends to select most complex models *
• complex -- AIC > HQIC > BIC > CAIC -- simple
• Model too simple (correlation model not adequate): Type I error rate too high
• Model too complex (correlation over-modeled): Type I error control not affected, but power suffers
* Since 2000, SAS added AICC to address AIC issue
• Best choice depends on severity of Type I vs II error
An Inference Issue
CS: Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.74     0.5425
TIME      7       140     109.04   <.0001
TRT*TIME  21      140     1.98     0.0106
AR(1)+between subj: Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.75     0.5344
TIME      7       140     60.55    <.0001
TRT*TIME  21      140     1.48     0.0921
UN: Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      0.74     0.5425
TIME      7       20      101.31   <.0001
TRT*TIME  21      20      1.37     0.2450
UN similar to MANOVA, but MANOVA Trt*Time p-value was 0.50
Bias & Options for Adjusting
• SAS default uses estimated (co)variance components in V: std errors biased; t-, F-statistics biased
• “Robust” (a.k.a. “sandwich”) estimate of $K'V^{-1}K$ available using EMPIRICAL option in MIXED
• Kenward & Roger (Biometrics, 1997) proposed adjustment; available using DDFM=KR option in MODEL statement of MIXED
• Guerin & Stroup (2000) evaluated KR option of SAS Version 8 with simulated AR(1) and ARH(1) data
• Biased F resulted in inflated Type I error rates unless KR option used (for α=0.05, rejection rates >0.10 for TYPE=AR(1), up to 0.20 with TYPE=ANTE(1), UN)
Sandwich (“Robust”) Estimator
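OLS estimate of $\beta$: $\hat\beta_{OLS} = (X'X)^{-1}X'y$
$Var(\hat\beta_{OLS}) = (X'X)^{-1}X'\,Var(y)\,X(X'X)^{-1} = (X'X)^{-1}X'VX(X'X)^{-1}$
GLS estimate is: $\hat\beta_{GLS} = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}y$
$Var(\hat\beta_{GLS}) = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}V\hat V^{-1}X(X'\hat V^{-1}X)^{-1}$
Let $\hat V_0 = \hat e\hat e'$, based on residuals $\hat e = y - X\hat\beta$
Yields $\hat V^{-1}\hat V_0\hat V^{-1} = \hat V^{-1}\hat e\hat e'\hat V^{-1}$
"Sandwich" estimator: $Var(\hat\beta_{GLS}) = (X'\hat V^{-1}X)^{-1}X'\hat V^{-1}\hat e\hat e'\hat V^{-1}X(X'\hat V^{-1}X)^{-1}$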
How does the sandwich estimator perform?
proc mixed empirical;
  classes subj trt time;
  model y=trt time trt*time;
  random intercept / subject=subj(trt);
  repeated time / type=ar(1) subject=subj(trt);
run;
Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20      1.31     0.2981
TIME      7       140     121.57   <.0001
TRT*TIME  20      140     9.04     <.0001
vs. F=1.48; p=0.0921 using default
Kenward and Roger
proc glimmix;
  classes subj trt time;
  model y= trt time trt*time/ddfm=kr;
  random intercept / subject=subj(trt);
  random time / type=ar(1) subject=subj(trt) residual;
Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20.5    0.77     0.5219
TIME      7       109     50.90    <.0001
TRT*TIME  21      117     1.24     0.2330
Alternative KR adjustment
• in SAS, KR adjustment uses Hessian matrix by default
• you can cause it to use the Information matrix instead
• no documented advantage one way or another
PROC glimmix scoremod scoring=51;
  CLASSES SUBJ TRT TIME;
  MODEL Y= TRT TIME TRT*TIME/ddfm=kr;
  RANDOM intercept / subject=SUBJ(TRT);
  Random _resid_ / TYPE=AR(1) SUBJECT=SUBJ(TRT);
  nloptions technique=nrridg;
Type 3 Tests of Fixed Effects
Effect    Num DF  Den DF  F Value  Pr > F
TRT       3       20.5    0.77     0.5264
TIME      7       112     54.18    <.0001
TRT*TIME  21      119     1.28     0.2010
vs. F=1.24, p=0.2330 using Hessian
Alternative Model for Change in BMI by Gender
Level 1: $y_{tj} = \beta_{0j} + \beta_{1j}\, yr_t + e_{tj}$
Level 2: $\beta_{0j} = \beta_0 + gender_i + id(gender)_{ij}$
$\beta_{1j} = \beta_1 + g_{1i}$
Repeated Measures ANCOVA Model:
$y_{ijk} = \beta_0 + gender_i + id(gender)_{ij} + (\beta_1 + g_{1i})\, yr_t + e_{ijk}$
$\quad\;\; = \beta_{0i} + \beta_{1i}\, yr_t + id(gender)_{ij} + e_{ijk}$
proc glimmix data=bmi_uni_anc;
  class gender id year;
  model bmi=gender yr(gender) / noint solution ddfm=kr;
  random intercept / subject=id(gender);
  random _residual_ / type=ar(1) subject=id(gender);
  contrast 'male vs female intercept' gender 1 -1;
  contrast 'male vs female slope' yr(gender) 1 -1;
run;
Selected Output
Covariance Parameter Estimates
Cov Parm   Subject     Estimate
Intercept  id(gender)  15.1933
AR(1)      id(gender)  0.2928
Residual               7.8871
Solutions for Fixed Effects
Effect      gender  Estimate  Standard Error  DF     t Value  Pr > |t|
gender      0       20.1988   0.6084          165.9  33.20    <.0001
gender      1       21.8298   0.5596          165.9  39.01    <.0001
yr(gender)  0       0.7860    0.08207         204.5  9.58     <.0001
yr(gender)  1       0.6462    0.07549         204.5  8.56     <.0001
Contrasts
Label                     Num DF  Den DF  F Value  Pr > F
male vs female intercept  1       165.9   3.89     0.0501
male vs female slope      1       204.5   1.57     0.2111
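The two contrasts can be checked by hand from the Solutions table. The sketch below assumes the gender-specific estimates are uncorrelated, which is plausible because each subject belongs to a single gender but is an assumption rather than something the output states; the helper name is hypothetical.

```python
def contrast_f(est_a, se_a, est_b, se_b):
    """F statistic (1 numerator df) for the difference of two estimates,
    assuming the two estimates are uncorrelated."""
    diff = est_a - est_b
    var_diff = se_a ** 2 + se_b ** 2
    return diff ** 2 / var_diff

# Estimates and standard errors for gender 0 and gender 1.
f_intercept = contrast_f(20.1988, 0.6084, 21.8298, 0.5596)
f_slope = contrast_f(0.7860, 0.08207, 0.6462, 0.07549)
print(round(f_intercept, 2), round(f_slope, 2))
```

Under that assumption the arithmetic reproduces the F values in the Contrasts table (3.89 for the intercepts, 1.57 for the slopes).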
Alternative Model
proc glimmix data=bmi_uni;
  class gender id;
  model bmi=gender year(gender) / noint solution ddfm=kr;
  random intercept year(gender) / subject=id type=un;
  contrast 'male vs female intercept' gender 1 -1;
  contrast 'male vs female slope' year(gender) 1 -1;
run;
This is a random coefficient model
Next section
Response Surface Split Plot with Repeated Measures
• 4 treatment factors (A, B, C, D)
  − 2 levels each
• 3 factors (A, B, C) applied to P (subject)
• treatment design: central composite design
• subjects split into 2 sub-units
• level of D randomly assigned to each sub-unit
• observations at 3 planned times (H)
Central Composite Design
Model for Central Composite Split-Split Plot
Effect                                   e.u.
A, B, C main effects & interactions      P(A B C)
D                                        D × P(A B C)
D × (A, B, C)                            D × P(A B C)
H and all interactions involving H       H × D × P(A B C)
$y_{hijklm} = f(X_{Ai}, X_{Bj}, X_{Ck}) + d_l + f_l(X_{Ai}, X_{Bj}, X_{Ck}) + h_m + dh_{lm} + f_m(X_{Ai}, X_{Bj}, X_{Ck}) + p(abc)_{hijk} + dp(abc)_{hijkl} + e_{hijklm}$
SAS Statements
proc glimmix;
  class ca cb cc p d u;
  *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
  model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
        / noint solution htype=1 ddfm=kr;
  random p(ca cb cc) d*p(ca cb cc);
run;
Key output
Covariance Parameter Estimates
Cov Parm Subject Estimate
Intercept p(ca*cb*cc) 24.3200
d p(ca*cb*cc) 4.5151
Residual 11.4944
Solutions for Fixed Effects
Effect d Estimate Standard Error
d 0 53.5687 2.3344
d 1 31.7168 2.3344
a(d) 0 16.8226 1.8101
a(d) 1 11.2226 1.8101
b(d) 0 19.5049 1.8101
b(d) 1 12.3715 1.8101
c(d) 0 4.4019 1.8101
c(d) 1 3.5352 1.8101
a*a 0.4980 3.2427
b*b -2.5020 3.2427
c*c 5.1647 3.2427
a*b 6.2083 1.8872
a*c -2.8333 1.8872
b*c 1.2083 1.8872
t(d) 0 9.4200 0.5504
t(d) 1 0.02442 0.5504
t*t -0.1487 1.1114
a*t 0.1160 0.5078
b*t 1.7331 0.5078
c*t 0.3513 0.5078
Fit Statistics
AICC (smaller is better) 573.40
Complex Split-split-plot revisited
• Recall A, B, C applied to units P
• P split in two, levels of D to each half
• Measured at 3 times
• Previous analysis assumed split on time
• Actually repeated measures
• Split-plot + repeated measures
CCD Split-plot + repeated measures
proc glimmix data=CCD_SpltPlt;
  class ca cb cc p d u;
  *model y=a b c a*a b*b c*c a*b a*c b*c d d*a d*b d*c t t*t t*a t*b t*c t*d/htype=1 htype=3 ddfm=kr;
  model y=d a(d) b(d) c(d) a*a b*b c*c a*b a*c b*c t(d) t*t t*a t*b t*c
        / noint solution htype=1 ddfm=kr;
  random intercept / subject=p(ca cb cc);
  random _residual_ / type=sp(pow)(t) subject=d*p(ca cb cc);
run;
AICC: 573.4 as split-split-plot; 551.1 as repeated measures using SP(POW)
note: SP(POW) is generalization of AR(1) for unequally spaced times
Unreplicated Split-Plot
• SAS for Mixed Models, Section 16.7
• Quilt divided in half
• Each “half sheet” received 2 x 2 x 3 factorial
  − 2 pH levels (low, high)
  − 2 temp (cold, hot)
  − 3 dry cycles (air, machine-delicate, machine-normal)
• Material cut from each unit
  − washed 10, 20, 30, 40, 50 times
• Breaking strength monitored
• Materials observed so reps by sheet lost
Model for Breaking Strength Experiment
$y_{ijklm} = \mu_{ijkl} + r_m + w_{ijkm} + e_{ijklm}$
where
• $\mu_{ijkl}$ is the mean of the $ijk$th pH × water temperature × dry cycle combination ($i$=8,10; $j$=35,55; $k$=air, delicate, normal) at the $l$th time of washing ($l$=10,20,30,40,50),
• $r_m$ is the effect of the $m$th block ($m$=1,2 in the design, but $m$=1 only in the data),
• $w_{ijkm}$ is the $ijkm$th between subjects (or whole-plot) error effect, assumed $NID(0, \sigma^2_W)$,
• $e_{ijklm}$ is the within subjects (or split-plot) error effect, assumed $NID(0, \sigma^2)$.
ANOVA for Breaking Strength Experiment
Source of Variation d.f.
block 1
pH (P) 1
wash temp (T) 1
dry cycle (D) 2
PT 1
PD 2
TD 2
PTD 2
between subject error 11
no. of washes (W) 4
WP 4
WT 4
WD 8
WPT 4
WPD 8
WTD 8
WPTD 8
within subjects error 48
but these become 0 when blocking by “half quilt” distinction lost
Breaking Strength vs # Washes by pH
Breaking Strength vs # Washes by Temp
Breaking Strength vs # Washes by Dry Cycle
Revised ANOVA
Pool negligible effects to get between & within error
Source of Variation d.f.
pH (P) 1
wash temp (T) 1
dry cycle (D) 2
between subject error 7
linear effect of no. of washes (W Lin) 1
W LinP 1
W LinT 1
W LinD 2
within subjects error 43
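The error degrees of freedom in the revised ANOVA can be recovered by bookkeeping: whatever is not assigned to a retained effect is pooled into error. A short Python sketch, assuming 12 observed whole-plot units (the 2 x 2 x 3 treatment combinations in the one block present in the data) each measured at 5 wash counts; variable names are illustrative.

```python
# Levels of each treatment factor in the breaking strength experiment.
levels = {"P": 2, "T": 2, "D": 3}
n_units = 12    # pH x temp x dry-cycle combinations observed (one block only)
n_washes = 5    # 10, 20, 30, 40, 50 washes

# Between-subjects stratum: main effects kept, everything else pooled into error.
between_effects = {name: k - 1 for name, k in levels.items()}
between_error = (n_units - 1) - sum(between_effects.values())

# Within-subjects stratum: linear trend in washes and its interactions kept.
within_total = n_units * (n_washes - 1)
within_effects = {"W Lin": 1}
within_effects.update({f"W Lin x {name}": k - 1 for name, k in levels.items()})
within_error = within_total - sum(within_effects.values())

print(between_effects, between_error)
print(within_effects, within_error)
```

The pooled error terms come out to 7 and 43 degrees of freedom, matching the revised table.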
GLIMMIX Program for Breaking Strength Experiment
proc glimmix data=shellie;
  class pH water_temp dry_cycle;
  model breaking_strength=pH water_temp dry_cycle w w*pH w*water_temp w*dry_cycle / solution;
  random pH*water_temp*dry_cycle;
  contrast 'air vs dryer effect on wear' w*dry_cycle 2 -1 -1;
  contrast 'delicate v normal effect on wear' w*dry_cycle 0 1 -1;
run;
Revised GLIMMIX - Estimate Regression over # of Washes
proc glimmix data=shellie;
  class pH water_temp dry_cycle;
  model breaking_strength= w(pH) w(water_temp) w(dry_cycle)/noint solution;
  random pH*water_temp*dry_cycle;
  estimate 'slope: ph 8, cold, air'
    w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 1 0 0;
  estimate 'slope: ph 8, cold, delicate'
    w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 1 0;
  estimate 'slope: ph 8, cold, normal'
    w(ph) 1 0 w(water_temp) 1 0 w(dry_cycle) 0 0 1;
  estimate 'slope: ph 8, hot, air'
    w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 1 0 0;
  estimate 'slope: ph 8, hot, delicate'
    w(ph) 1 0 w(water_temp) 0 1 w(dry_cycle) 0 1 0;
  /* etc for all pH – temp – dry cycle combinations */
run;
Regression – Selected Output
Solution for Fixed Effects
Effect     water temp  Dry cycle  pH  Estimate  Standard Error
Intercept                             0.1070    0.001895
Estimates
Label  Estimate  Standard Error
slope: ph 8, cold, air -0.00024 0.000077
slope: ph 8, cold, delicate -0.00047 0.000077
slope: ph 8, cold, normal -0.00050 0.000077
slope: ph 8, hot, air -0.00050 0.000077
slope: ph 8, hot, delicate -0.00073 0.000077
slope: ph 8, hot, normal -0.00076 0.000077
slope: ph 10, cold, air -0.00082 0.000077
slope: ph 10, cold, delicate -0.00105 0.000077
slope: ph 10, cold, normal -0.00108 0.000077
slope: ph 10, hot, air -0.00108 0.000077
slope: ph 10, hot, delicate -0.00131 0.000077
slope: ph 10, hot, normal -0.00134 0.000077
avg slope: ph 8 -0.00053 0.000054
avg slope: ph 10 -0.00111 0.000054
avg slope: cold water -0.00069 0.000054
avg slope: hot water -0.00095 0.000054
avg slope: air dry -0.00066 0.000063
avg slope: delicate dry -0.00089 0.000063
avg slope: normal dry -0.00092 0.000063
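The "avg slope" rows are simple averages of the twelve cell slopes over the levels of the other two factors. The Python sketch below recomputes them from the printed cell estimates; the dictionary layout and helper name are illustrative only.

```python
# Cell slopes copied from the output: (pH, water temp, dry cycle) -> slope.
slopes = {
    (8, "cold", "air"): -0.00024, (8, "cold", "delicate"): -0.00047,
    (8, "cold", "normal"): -0.00050, (8, "hot", "air"): -0.00050,
    (8, "hot", "delicate"): -0.00073, (8, "hot", "normal"): -0.00076,
    (10, "cold", "air"): -0.00082, (10, "cold", "delicate"): -0.00105,
    (10, "cold", "normal"): -0.00108, (10, "hot", "air"): -0.00108,
    (10, "hot", "delicate"): -0.00131, (10, "hot", "normal"): -0.00134,
}

def avg_slope(position, level):
    """Average the cell slopes over all cells whose factor at `position`
    (0 = pH, 1 = water temp, 2 = dry cycle) equals `level`."""
    vals = [s for key, s in slopes.items() if key[position] == level]
    return sum(vals) / len(vals)

for position, level in [(0, 8), (0, 10), (1, "cold"), (1, "hot"),
                        (2, "air"), (2, "delicate"), (2, "normal")]:
    print(level, round(avg_slope(position, level), 5))
```

Breaking strength declines about twice as fast per wash at pH 10 as at pH 8, the largest of the three treatment effects on the slope.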
Prediction & Inference Space
VI. Prediction, “BLUP” and Inference Space
Estimation vs. Prediction
When “BLUP” is a good thing
Inference Space
−what is it?
−how can we use it?
Performance evaluation issues
Multi-location issues
Estimation, Prediction, and Inference Space
• Estimation based on estimable functions $K'\beta$
• Estimation applies to fixed effects only, inference is to entire population
• Prediction based on “predictable functions” $K'\beta + M'u$
• Prediction applies to fixed & random effects, narrows scope of inference to specific subset defined by M’u
• Examples: locations, workers, teachers, patients...
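Prediction Example 1
Growth Change Modeling Issue - III
Random Coefficients
Recall Basic Growth Model: $y_{ij} = \beta_0 + \beta_1\, year_i + e_{ij}$
Level 2: $\beta_{0j} = \beta_0 + b_{0j}$
$\beta_{1j} = \beta_1 + b_{1j}$
$\begin{bmatrix} b_{0j} \\ b_{1j} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{01} & \sigma_1^2 \end{bmatrix} \right)$
proc glimmix data=bmi_uni;
  class id;
  model bmi=year/solution ddfm=kr;
  random intercept year / subject=id type=un solution;
  random _residual_ /subject=id type=ar(1);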
Selected Output
Covariance Parameter Estimates
Cov Parm  Subject  Estimate
UN(1,1)   id       10.8070
UN(2,1)   id       0.5873
UN(2,2)   id       0.2676
AR(1)     id       0.3024
Residual           4.6021
Solutions for Fixed Effects
Effect     Estimate  Standard Error  t Value
Intercept  21.3577   0.6480          32.96
year       0.6870    0.1212          5.67
Solution for Random Effects (partial listing)
Effect     Subject  Estimate  Std Err Pred  DF
Intercept  id 73    2.1023    1.3487        165
year       id 73    -0.1608   0.3118        165
Intercept  id 281   -1.3178   1.3487        165
year       id 281   -0.1353   0.3118        165
Intercept  id 496   -1.8137   1.3487        165
year       id 496   -0.07237  0.3118        165
You can obtain Subject-Specific Estimates
proc glimmix data=bmi_uni;
  class id;
  model bmi=year/solution ddfm=kr;
  random intercept year / subject=id type=un solution;
  random _residual_ /subject=id type=ar(1);
  estimate 'popn avg slope' year 1 / cl;
  estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
  estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;
  estimate 'popn avg intercept' intercept 1 / cl;
  estimate 'predicted bmi in 1997' intercept 1 year 0 / cl;
  estimate 'id (73) specific intercept' intercept 1 | intercept 1 / subject 1 0 cl e;
  estimate 'id (496) specific intercept' intercept 1 | intercept 1 / subject 0 0 1 0 cl;
  estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
  estimate 'id (73) specific 2000 bmi' intercept 1 year 3 |
                                       intercept 1 year 3 / subject 1 0 cl;
  estimate 'id (496) specific 2000 bmi' intercept 1 year 3 |
                                        intercept 1 year 3 / subject 0 0 1 0 cl;
  estimate 'predicted bmi in 2003' intercept 1 year 6 / cl;
  estimate 'id (73) specific 2003 bmi' intercept 1 year 6 |
                                       intercept 1 year 6 / subject 1 0 cl;
  estimate 'id (496) specific 2003 bmi' intercept 1 year 6 |
                                        intercept 1 year 6 / subject 0 0 1 0 cl;
run;
Best Linear Unbiased Prediction
Look closer at Estimate statement
  estimate 'popn avg slope' year 1 / cl;
  estimate 'id (73) specific slope' year 1 | year 1 / subject 1 0 cl e;
  estimate 'id (496) specific slope' year 1 | year 1 / subject 0 0 1 0 cl;
  estimate 'predicted bmi in 2000' intercept 1 year 3 / cl;
  estimate 'id (73) specific 2000 bmi' intercept 1 year 3 |
                                       intercept 1 year 3 / subject 1 0 cl;
  estimate 'id (496) specific 2000 bmi' intercept 1 year 3 |
                                        intercept 1 year 3 / subject 0 0 1 0 cl;
Coefficients to right of vertical bar ( | ) apply to random effects – this is a new idea
BLUP - - - estimation (prediction) of random effects
Selected Estimates from Random Coeff BMI Model
Estimates
Label Estimate Standard Error DF Lower Upper
popn avg slope 0.6870 0.1214 31.57 0.4396 0.9344
id (73) specific slope 0.5262 0.3833 18.35 -0.2779 1.3303
id (496) specific slope 0.6146 0.3833 18.35 -0.1895 1.4187
popn avg intercept 21.3577 0.6459 31.5 20.0413 22.6742
predicted bmi in 1997 21.3577 0.6459 31.5 20.0413 22.6742
id (73) specific intercept 23.4601 1.4916 33.36 20.4266 26.4935
id (496) specific intercept 19.5440 1.4916 33.36 16.5105 22.5775
predicted bmi in 2000 23.4186 0.7330 31.99 21.9255 24.9117
id (73) specific 2000 bmi 25.0387 0.9928 9.56 22.8127 27.2646
id (496) specific 2000 bmi 21.3878 0.9928 9.56 19.1618 23.6138
predicted bmi in 2003 25.4795 0.9605 31.84 23.5226 27.4365
id (73) specific 2003 bmi 26.6173 1.5462 20.15 23.3936 29.8410
id (496) specific 2003 bmi 23.2316 1.5462 20.15 20.0079 26.4553
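A subject-specific estimate is the population-average (fixed) coefficient plus that subject's predicted random effect. The Python sketch below rebuilds several rows of the table from the fixed-effect solutions and the random-effect solutions for ids 73 and 496; the dictionary layout and function name are illustrative, and year is taken as years since 1997, as the ESTIMATE coefficients imply.

```python
# Fixed effects and BLUPs copied from the random coefficient model output.
fixed = {"intercept": 21.3577, "year": 0.6870}
blup = {
    73: {"intercept": 2.1023, "year": -0.1608},
    496: {"intercept": -1.8137, "year": -0.07237},
}

def predict(years_since_1997, subject=None):
    """Population-average prediction, or subject-specific when a subject id
    with known random-effect solutions is supplied."""
    b0, b1 = fixed["intercept"], fixed["year"]
    if subject is not None:
        b0 += blup[subject]["intercept"]
        b1 += blup[subject]["year"]
    return b0 + b1 * years_since_1997

print(round(predict(3), 2))          # population average, 2000
print(round(predict(3, 73), 2))      # id 73, 2000
print(round(predict(6, 496), 2))     # id 496, 2003
```

The values agree with the Estimates table up to rounding of the printed solutions. The standard errors cannot be reproduced this way, because they depend on the joint covariance of the fixed and random effect solutions.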
Inference Space Example II: Workers and machines
• From McLean, Sanders & Stroup (1991, American Statistician)
• Also Chapter 6, ex 2, SAS for Mixed Models
• 2 machines
• 3 operators (sample from population)
• inference can apply to population of workers or specific worker
• KEY CONCEPT: Inference Space
Worker-Machine Example: Fixed Effect Inference
proc glimmix;
  class machine operator;
  model y=machine/ddfm=kr;
  random operator machine*operator;
  lsmeans machine / diff;
  estimate 'BLUE - machine 1' intercept 1 machine 1 0;
  estimate 'BLUE - diff' machine 1 -1;
Type III Tests of Fixed Effects
Effect   Num DF  Den DF  F Value  Pr > F
machine  1       2       20.26    0.0460
based on MS(mach) / MS(Mach*oper)
machine Least Squares Means
machine  Estimate  Std Error  DF     t Value  Pr > |t|
1        50.9483   0.2467     2.973  206.50   <.0001
2        51.9567   0.2467     2.973  210.59   <.0001
Differences of machine Least Squares Means
machine  _machine  Estimate  Std Error  DF  t Value  Pr > |t|
1        2         -1.0083   0.2240     2   -4.50    0.0460
these ESTIMATE statements give same result
Worker-Machine Example: Prediction
estimate 'BLUP - m1 narrow' intercept 3 machine 3 0 |
    operator 1 1 1 machine*operator 1 1 1 0 0 0 / divisor=3;
estimate 'BLUP - diff nrw' machine 3 -3 |
    machine*operator 1 1 1 -1 -1 -1 / divisor=3;
estimate 'BLUP - oper 1' intercept 2 machine 1 1 |
    operator 2 0 0 machine*operator 1 0 0 1 0 0 / divisor=2;
estimate 'BLUP - m1 op1' intercept 1 machine 1 0 |
    operator 1 0 0 machine*operator 1 0 0 0 0 0;
estimate 'BLUP - diff op1' machine 1 -1 |
    machine*operator 1 0 0 -1 0 0;
these statements apply inference to specific workers or worker-machine
• machine 1 averaged over ONLY THE WORKERS IN THE STUDY
• diff between machines for workers in study ONLY
• operator 1 averaged over machines, with machine 1 only, oper-specific difference between machines
Worker-Machine Example: Prediction (2)
Estimates
Label Estimate Standard Error DF t Value Pr > |t|
BLUE - machine 1 50.9483 0.2467 2.973 206.50 <.0001
BLUE - diff -1.0083 0.2240 2 -4.50 0.0460
BLUP - m1 narrow 50.9483 0.08993 6 566.53 <.0001
BLUP - diff nrw -1.0083 0.1272 6 -7.93 0.0002
BLUP - oper 1 51.7366 0.1151 6.698 449.30 <.0001
BLUP - m1 op1 51.2979 0.1724 7.885 297.48 <.0001
BLUP - diff op1 -0.8773 0.2567 7.976 -3.42 0.0092
BLUE – inference to population of workers
BLUP – inference to specific worker or set of workers
note impact on standard error
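The broad and narrow standard errors in this table can be checked by hand from the variance component estimates reported for this example (operator 0.1073, machine*operator 0.05100, residual 0.04852). The sketch below is written in Python rather than SAS so the arithmetic can be checked independently of the SAS output. It assumes a balanced design with 2 observations per machine-operator cell; that cell size is not stated on the slides, it is an assumption that happens to reproduce the reported standard errors. All function names are illustrative.

```python
import math

# Variance component estimates reported for the worker-machine example.
VAR_OPERATOR = 0.1073
VAR_MACH_OPER = 0.05100
VAR_RESIDUAL = 0.04852

N_OPER = 3  # operators in the study
N_REP = 2   # ASSUMPTION: observations per machine-operator cell (not stated on the slides)


def broad_se_mean():
    """Machine mean, inference to the population of workers (BLUE)."""
    var = (VAR_OPERATOR + VAR_MACH_OPER) / N_OPER + VAR_RESIDUAL / (N_OPER * N_REP)
    return math.sqrt(var)


def broad_se_diff():
    """Machine difference, broad inference: operator effects cancel."""
    var = 2 * (VAR_MACH_OPER / N_OPER + VAR_RESIDUAL / (N_OPER * N_REP))
    return math.sqrt(var)


def narrow_se_mean():
    """Machine mean for ONLY the workers in the study: residual variation only."""
    return math.sqrt(VAR_RESIDUAL / (N_OPER * N_REP))


def narrow_se_diff():
    """Machine difference for ONLY the workers in the study."""
    return math.sqrt(2 * VAR_RESIDUAL / (N_OPER * N_REP))


if __name__ == "__main__":
    print("BLUE - machine 1 :", round(broad_se_mean(), 4))
    print("BLUE - diff      :", round(broad_se_diff(), 4))
    print("BLUP - m1 narrow :", round(narrow_se_mean(), 5))
    print("BLUP - diff nrw  :", round(narrow_se_diff(), 4))
```

Under that assumption the four values agree with the Standard Error column for the BLUE and narrow BLUP rows: narrowing the inference space removes the operator and machine*operator variance from the standard error.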
BLUP a.k.a. “Shrinkage Estimator”
BLUP is regressed toward mean
BLUP is E(u|Y)
Degree of shrinkage depends on variance component estimates
Covariance Parameter Estimates
Cov Parm Estimate
operator 0.1073
machine*operator 0.05100
Residual 0.04852
e.g. operator BLUP is
$$E(o_i) + Cov(o_i, y_j)\,[Var(y_j)]^{-1}\,(y_j - \bar{y})$$
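As a rough numerical illustration of the shrinkage, the 'BLUP - oper 1' estimate (51.7366) can be recovered from the raw operator 1 mean that PROC GLM reports for this example (51.7625), the two machine least squares means, and the covariance parameter estimates above. This is a sketch in Python, not the computation GLIMMIX performs internally; it assumes a balanced design with 2 machines and 2 observations per machine-operator cell (the cell size is an assumption, not stated on the slides), and the variable names are illustrative.

```python
# Covariance parameter estimates from the table above.
VAR_OPERATOR = 0.1073
VAR_MACH_OPER = 0.05100
VAR_RESIDUAL = 0.04852

N_MACH = 2  # machines
N_REP = 2   # ASSUMPTION: observations per machine-operator cell

# Machine least squares means and the raw operator 1 mean, as reported in the output.
grand_mean = (50.9483 + 51.9567) / 2
oper1_raw_mean = 51.7625

# Variance of an operator's raw mean (averaged over machines and replicates).
var_oper_mean = VAR_OPERATOR + VAR_MACH_OPER / N_MACH + VAR_RESIDUAL / (N_MACH * N_REP)

# Shrinkage factor for the operator effect alone: Cov(o_i, ybar_i) / Var(ybar_i).
shrink_operator = VAR_OPERATOR / var_oper_mean

# 'BLUP - oper 1' also carries the machine*operator effects averaged over machines,
# so the covariance in the numerator gains VAR_MACH_OPER / N_MACH.
shrink_oper_avg = (VAR_OPERATOR + VAR_MACH_OPER / N_MACH) / var_oper_mean

blup_operator_effect = shrink_operator * (oper1_raw_mean - grand_mean)
blup_oper1 = grand_mean + shrink_oper_avg * (oper1_raw_mean - grand_mean)

if __name__ == "__main__":
    print("shrinkage factor, operator effect:", round(shrink_operator, 4))
    print("BLUP - oper 1:", round(blup_oper1, 4))
```

The raw mean 51.7625 is pulled toward the overall mean 51.4525; the larger the operator variance relative to the other components, the less shrinkage.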
Relationship to Proc GLM
proc glm;
  class machine operator;
  model y=machine|operator;
  random operator machine*operator/test;
  lsmeans machine operator machine*operator/stderr;
  lsmeans machine/stderr e=machine*operator;
  estimate 'diff' machine 1 -1/e;
run;
operator y LSMEAN Standard Error
1 51.7625000 0.1101420
vs. 51.74, 0.1151 for the GLIMMIX BLUP
machine operator y LSMEAN Standard Error
1 1 51.3550000 0.1557642
vs. 51.30, 0.1724 for the GLIMMIX BLUP
machine y LSMEAN Standard Error
1 50.9483333 0.1583947
std error is neither the mixed-model broad nor narrow std error; produced by
estimate 'm1' intercept 3 machine 3 0 | operator 1 1 1 machine*operator 0 / divisor=3
machine y LSMEAN Standard Error
1 50.9483333 0.0899305
same as BLUP specific to workers in GLIMMIX
Prediction Example II: Multi-Location Data
From SAS for Mixed Models
9 Locations
3 blocks per location
4 treatments
Major issues
− are blocks fixed or random?
− if random, how does one estimate location-specific treatment effects?
ANOVA (ignoring block)
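If Location random:
Source       d.f.  Expected Mean Square
Treatment    3     $\sigma^2 + k_1\sigma^2_{LT} + Q_{TRT}$
Location     8     $\sigma^2 + k_1\sigma^2_{LT} + k_2\sigma^2_{L}$
Loc × Trt    24    $\sigma^2 + k_1\sigma^2_{LT}$
error        dfe   $\sigma^2$
If Location fixed:
Source       d.f.  Expected Mean Square
Treatment    3     $\sigma^2 + Q_{TRT}$
Location     8     $\sigma^2 + Q_{LOC}$
Loc × Trt    24    $\sigma^2 + Q_{LT}$
error        dfe   $\sigma^2$
Test of TRT affected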
Inference Space
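Assuming Locations are Fixed
$$Var(\text{trt mean}) = \frac{\sigma^2}{\#\ \text{obs/trt}} \qquad \text{Std. error(trt mean)} = \sqrt{\frac{MS(\text{error})}{\#\ \text{obs/trt}}}$$
HOWEVER... if Locations are Random
$$Var(\text{trt mean}) = \frac{\sigma^2 + k(\sigma^2_{L} + \sigma^2_{LT})}{\#\ \text{obs/trt}} \qquad \text{Std. error(trt mean)} = \sqrt{\frac{\hat\sigma^2 + k(\hat\sigma^2_{L} + \hat\sigma^2_{LT})}{\#\ \text{obs/trt}}}$$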
Where does Uncertainty Arise?
[Figure: diagram of locations, labeled Loc 1, Loc 2, ..., Loc 7, Loc 8]
Only from variation among obs within locations?
Locations fixed
Or does variation among locations also contribute?
Locations random
Location-Specific Effects: BLUP
In Multi-Location trial, location-specific effect is
e.g. trt 1 vs trt 2 | location j
$$= \tau_1 - \tau_2 + (\tau L)_{1j} - (\tau L)_{2j}$$
Implies linear combination of fixed and random effects (predictable function = BLUP)
Basic SAS Programs
for fixed location:
proc glimmix data=MultiCenter;
  class location block treatment;
  model response=location treatment location*treatment;
  random block(location);
  lsmeans treatment;
  lsmeans location*treatment/slice=location slicediff=location;
run;
for random locations:
proc glimmix data=MultiCenter;
  class location block treatment;
  model response=treatment/ddfm=KR;
  random location block(location) location*treatment;
  lsmeans treatment/diff;
  estimate 'trt1 vs trt2' treatment 1 -1 0;
  estimate 'loc A vs loc B' | location 1 -1 0;
  estimate 'trt 1 BLUP' intercept 8 treatment 8 | location 1 1 1 1 1 1 1 1/divisor=8;
  estimate 'trt1 at loc A blup' intercept 1 treatment 1 0 0 0 | location 1 0 location*treatment 1 0;
etc – see ch6 MultiCenter.sas for program in detail
“Take Home” points
Inference space usually implies random locations
“Broad” inference on treatments applies to entire population
Location-specific inference may be of interest
− Requires BLUP
Hans Peter Piepho has proposed mixed-model based measures of commonality among locations
Making locations fixed to maximize error d.f. to test TRT is inappropriate
GLM Issues
VII. “GLM” Issues
Bernoulli data
−as a binomial
−special problems with BINARY data
Counts
Rates
Common Non-Normal Models
Bernoulli (binary) observations
Categorical data
− Binomial
− Multinomial
Counts
− Poisson
− Overdispersed (e.g. negative binomial)
Rates
Survival times
− Gamma, Weibull
Dispersion measures
− variance
Contingency tables
Elements of GLM (Generalized Linear Model)
Systematic model $X\beta$
Assumed distribution
− implied variance structure
Link function
Examples
□ y ~ Bernoulli(p): $p = \Phi(X\beta)$ or $\text{logit}(p) = X\beta$
□ y ~ Poisson($\lambda$): $\log(\lambda) = X\beta$
GLM Example
From SAS for Linear Models
Output 10.1, re-expressed in 10.5
Challenger space shuttle data
relate prob{failure} to temperature at launch
DATA: TEMP, TD (# times thermal distress in O-ring), NO_TD
Approach to modeling
Assess relationship between TEMP and Prob{TD=1}, i.e. O-rings show thermal distress
Distribution: Bernoulli
Natural parameter: logit = log[p/(1-p)]
Model: logit(Pr{TD})=a+b(Temp)
Inverse link form: Pr{TD}=exp[a+b(Temp)]/{1+exp[a+b(Temp)]}
SAS Program: Proc GLIMMIX
proc glimmix data=Challenger;
  model td/total=temp;
  estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
  estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
  estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
  estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
  estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
  estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;
Relevant Output
$$\text{logit}(\pi) = 15.04 - 0.23X, \qquad \pi = \Pr\{TD = 1\}, \quad X = \text{temp } (^{\circ}\text{F})$$
Fit Statistics
Pearson Chi-Square 11.13
Pearson Chi-Square / DF 0.80
Parameter Estimates
Effect Estimate Standard Error DF t Value Pr > |t|
Intercept 15.0429 7.3786 14 2.04 0.0608
temp -0.2322 0.1082 14 -2.14 0.0500
no evidence of overdispersion
Relevant Output (2)
Estimates
Label Estimate Standard Error DF t Value Pr > |t| Mean Standard Error Mean
logit at 50 deg 3.4348 2.0232 14 1.70 0.1117 0.9688 0.06121
logit at 60 deg 1.1131 1.0259 14 1.09 0.2962 0.7527 0.1909
logit at 64.7 deg 0.02197 0.6576 14 0.03 0.9738 0.5055 0.1644
logit at 64.8 deg -0.00125 0.6518 14 -0.00 0.9985 0.4997 0.1630
logit at 70 deg -1.2085 0.5953 14 -2.03 0.0618 0.2300 0.1054
logit at 80 deg -3.5301 1.4140 14 -2.50 0.0256 0.02847 0.03911
Estimate and Standard Error are on the logit scale; Mean and Standard Error Mean are on the data scale
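The data-scale columns can be reproduced from the logit-scale columns. The sketch below, in Python rather than SAS so the numbers can be checked independently, applies the inverse link to each Estimate and a delta-method approximation, p(1-p) times the logit-scale standard error, for the Standard Error Mean. The function names are illustrative; the inputs are copied from the Estimates table.

```python
import math


def inv_logit(eta):
    """Inverse logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def delta_method_se(eta, se_eta):
    """Approximate data-scale standard error: |d p / d eta| * se(eta)."""
    p = inv_logit(eta)
    return p * (1.0 - p) * se_eta


# (label, Estimate, Standard Error) on the logit scale, copied from the output.
LOGIT_ROWS = [
    ("logit at 50 deg", 3.4348, 2.0232),
    ("logit at 60 deg", 1.1131, 1.0259),
    ("logit at 70 deg", -1.2085, 0.5953),
    ("logit at 80 deg", -3.5301, 1.4140),
]

if __name__ == "__main__":
    for label, eta, se_eta in LOGIT_ROWS:
        print(label, round(inv_logit(eta), 4), round(delta_method_se(eta, se_eta), 4))
```

Note how a large standard error on the logit scale (2.0232 at 50 degrees) becomes a small one on the data scale (about 0.061), because the inverse link is nearly flat when the probability is close to 1.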
Alternatives
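Express data in binomial form
−SAS for Linear Models, 4th ed., output 10.5
Probit link
$$\Phi(x) = \text{std normal c.d.f.} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-z^2/2}\,dz$$
link function is $\Phi^{-1}(\pi) = X\beta$
inverse link is $\pi = \Phi(X\beta)$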
Logit vs Probit
[Figure: probit and logit curves. Red: probit; Blue: logit]
Probit Model
proc glimmix data=Challenger;
model td/total=temp/link=probit solution;
estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;
Probit Output
Estimates
Label Estimate Standard Error DF t Value Pr > |t| Mean Standard Error Mean
logit at 50 deg 2.0201 1.1413 14 1.77 0.0985 0.9783 0.05917
logit at 60 deg 0.6692 0.6024 14 1.11 0.2854 0.7483 0.1921
logit at 64.7 deg 0.03421 0.3960 14 0.09 0.9324 0.5136 0.1579
logit at 64.8 deg 0.02070 0.3925 14 0.05 0.9587 0.5083 0.1566
logit at 70 deg -0.6818 0.3244 14 -2.10 0.0541 0.2477 0.1026
logit at 80 deg -2.0328 0.7277 14 -2.79 0.0144 0.02104 0.03678
Fit Statistics
Pearson Chi-Square 10.98
Pearson Chi-Square / DF 0.78
Parameter Estimates
Effect Estimate Standard Error DF t Value Pr > |t|
Intercept 8.7750 4.0286 14 2.18 0.0470
temp -0.1351 0.05839 14 -2.31 0.0364
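To see how close the two links are for these data, the fitted probabilities can be computed directly from the two sets of parameter estimates (logit: 15.0429, -0.2322; probit: 8.7750, -0.1351). This is a Python sketch using only the standard library; the standard normal c.d.f. is written in terms of the error function, and the function names are illustrative. Small differences from the Mean columns are expected because the printed coefficients are rounded.

```python
import math


def std_normal_cdf(x):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def probit_prob(temp, b0=8.7750, b1=-0.1351):
    """Inverse probit link applied to the fitted linear predictor."""
    return std_normal_cdf(b0 + b1 * temp)


def logit_prob(temp, b0=15.0429, b1=-0.2322):
    """Inverse logit link applied to the fitted linear predictor."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * temp)))


if __name__ == "__main__":
    for temp in (50, 60, 64.7, 64.8, 70, 80):
        print(temp, round(logit_prob(temp), 4), round(probit_prob(temp), 4))
```

The coefficients differ by scale (the probit slope is smaller in magnitude than the logit slope), yet the predicted probabilities from the two models stay within a few hundredths of each other over this temperature range.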
Option 3: Use Binary Data
proc glimmix data=O_Ring;
  /* first attempt, with no DIST= option: */
  /* model td_bin=temp / solution; */
  model td_bin=temp /dist=binomial link=logit solution;
  estimate 'logit at 50 deg' intercept 1 temp 50 / ilink;
  estimate 'logit at 60 deg' intercept 1 temp 60 / ilink;
  estimate 'logit at 64.7 deg' intercept 1 temp 64.7 / ilink;
  estimate 'logit at 64.8 deg' intercept 1 temp 64.8 / ilink;
  estimate 'logit at 70 deg' intercept 1 temp 70 / ilink;
  estimate 'logit at 80 deg' intercept 1 temp 80 / ilink;
run;
Careful!! Normal default
Binary Output
Fit Statistics
Pearson Chi-Square 23.17
Pearson Chi-Square / DF 1.10
Parameter Estimates
Effect Estimate Standard Error DF t Value Pr > |t|
Intercept 15.0429 7.3786 21 2.04 0.0543
temp -0.2322 0.1082 21 -2.14 0.0438
Estimates
Label Estimate Standard Error DF t Value Pr > |t| Mean Standard Error Mean
logit at 50 deg 3.4348 2.0232 21 1.70 0.1043 0.9688 0.06121
logit at 60 deg 1.1131 1.0259 21 1.09 0.2902 0.7527 0.1909
logit at 64.7 deg 0.02197 0.6576 21 0.03 0.9737 0.5055 0.1644
logit at 64.8 deg -0.00124 0.6518 21 -0.00 0.9985 0.4997 0.1630
logit at 70 deg -1.2085 0.5953 21 -2.03 0.0552 0.2300 0.1054
logit at 80 deg -3.5301 1.4140 21 -2.50 0.0209 0.02847 0.03911
no evidence of overdispersion
Binary Data + Random Effects
Binary data in GLM with random effect can be troublesome
Pseudo-likelihood tends to produce biased variance / covariance component estimates
e.g. variance estimates biased down for small cluster size
Larger sample sizes tend to be required
No overdispersion estimate
Binary GLMM example
courtesy of Oliver Schabenberger
200 subjects
random intercept
logistic link
data binary;
  do subject = 1 to 200;
    ranint = rannor(&seed);
    do i = 1 to &n;
      linp = &b0 + ranint;
      pi = 1/(1 + exp(-linp));
      y = ranbin(0,1,pi);
      output;
    end;
  end;
  drop i;
run;
Binary GLMM
Schabenberger used two programs
proc glimmix data=binary;
  class subject;
  model y(event='1') = / dist=binary link=logit s;
  random intercept / subject=subject;
  ods select ParameterEstimates CovParms;
run;
proc nlmixed data=binary;
  parms s2 1 intercept -1;
  model y ~ binary(1/(1+exp(-intercept+gamma)));
  random gamma ~ normal(0,s2) subject=subject;
  ods select Dimensions ParameterEstimates;
run;
GLIMMIX vs NLMIXED Binary Results
cluster size n=4
GLIMMIX
Covariance Parameter Estimates
Cov Parm Subject Estimate Standard Error
Intercept subject 0.5251 0.1699
Solutions for Fixed Effects
Effect Estimate Standard Error DF
Intercept -0.7159 0.09211 199
NLMIXED
Parameter Estimates
Parameter Estimate Standard Error DF
s2 0.8159 0.2718 199
intercept -0.8092 0.1085 199
cluster size n=20
GLIMMIX
Covariance Parameter Estimates
Cov Parm Subject Estimate Standard Error
Intercept subject 0.9905 0.1373
Solutions for Fixed Effects
Effect Estimate Standard Error DF
Intercept -0.9239 0.08020 199
NLMIXED
Parameter Estimates
Parameter Estimate Standard Error DF
s2 1.1512 0.1659 199
intercept -0.9854 0.08691 199
Diagnostics & Alternative Models
Example using count data
SAS for Linear Models, Output 10.24
Historically, count data assumed ~ Poisson
Implies mean=variance
In practice, often variance>mean, i.e. overdispersion
Requires modification
−scale to correct std error, test statistics for overdispersion
−use different distribution
Basic analysis + model checking
Model checking plots:
1. Residuals vs pred
  a. use std resid
  b. or deviance res
  c. std’ize pred scale
  look for unequal scatter (wrong dist or var fct)
  pattern in resid (wrong model or link)
2. y* vs. $\hat\eta$ (xbeta)
  linear or wrong link
proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b/dist=poisson;
  random intercept / subject=BLOCK;
  output out=check pred=xbeta pred(ilink)=pred residual=r pearson=resid_pearson;
run;
data plot;
  merge check;
  adjlamda=2*sqrt(pred);
  ystar=xbeta+(count-pred)/pred;
  absres=abs(resid_pearson);
proc gplot;
  plot resid_pearson*(pred xbeta);
  plot (resid_pearson)*adjlamda;
  plot ystar*xbeta;
  plot absres*adjlamda;
run;
Evidence of Overdispersion
Gener. chi-square / DF should be 1
>1 indicates overdispersion
<1 indicates underdispersion
Fit Statistics
-2 Res Log Pseudo-Likelihood 124.06
Generalized Chi-Square 100.15
Gener. Chi-Square / DF 3.34
Example: plot of residuals x adjlamda
Another look – absolute value resid vs adjlamda
Link? Plot ystar x XBeta
should be linear – no strong evidence of problem
Strategy 1: Adjust using scale parameter
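Poisson log-likelihood is
$$y\log(\lambda) - \lambda - \log y!, \qquad E(y) = \lambda, \quad Var(y) = \lambda$$
Quasi-likelihood allows scale parameter $\phi$
$$Q = \int_{y}^{\lambda} \frac{y - t}{\phi\, t}\,dt = \frac{y\log(\lambda) - \lambda}{\phi} + \text{terms not involving } \lambda$$
$$\text{Now, } E(y) = \lambda, \quad Var(y) = \phi\lambda$$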
Implementation with GLIMMIX
SCALE estimated from RANDOM _RESIDUAL_
$$\hat\phi = \frac{\text{Generalized } \chi^2}{N - rank(X)}$$
alternatively can use
$$\hat\phi = \frac{\text{deviance}}{N - rank(X)}$$
proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b/dist=poisson htype=1,3;
  random intercept / subject=BLOCK;
  random _residual_;
run;
Selected Output
UnScaled
Type I Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 1 27 55.83 <.0001
Type III Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 0 . . .
A 2 27 9.19 0.0009
B 2 27 0.06 0.9402
A*B 4 27 3.11 0.0315
Scaled
Type I Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 1 27 16.23 0.0004
Type III Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 0 . . .
A 2 27 2.67 0.0875
B 2 27 0.02 0.9822
A*B 4 27 0.90 0.4753
Note discrepancy for CTL_TRT and A main effect
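One way to read the two sets of tests: adding the scale parameter divides each F value by roughly the estimated scale. The short Python sketch below backs out the implied scale from the printed F values; it is only an arithmetic check on the output above, not a description of how GLIMMIX computes the scaled tests, and the dictionary names are illustrative.

```python
# F values copied from the unscaled and scaled fits (B is omitted because
# its printed values, 0.06 and 0.02, are too heavily rounded to be useful).
UNSCALED_F = {"CTL_TRT": 55.83, "A": 9.19, "A*B": 3.11}
SCALED_F = {"CTL_TRT": 16.23, "A": 2.67, "A*B": 0.90}

# Ratio of unscaled to scaled F for each effect.
implied_scale = {effect: UNSCALED_F[effect] / SCALED_F[effect] for effect in UNSCALED_F}

if __name__ == "__main__":
    for effect, ratio in implied_scale.items():
        print(effect, round(ratio, 2))
```

The ratios all come out near 3.4, close to the Gener. Chi-Square / DF of 3.34 from the unscaled fit. They are not identical to it; that may reflect rounding in the printed values or re-estimation of the scale in the second fit, which the output shown here does not settle.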
Alternative 2: different distribution e.g. Negative Binomial
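Standard math-stat text form:
$$P(y) = \frac{(y+N-1)!}{y!\,(N-1)!}\,\pi^{N}(1-\pi)^{y}$$
More useful form: let $N = k$ and $\pi = \dfrac{k}{\lambda+k}$; yields p.d.f.
$$f(y) = \frac{(y+k-1)!}{y!\,(k-1)!}\left(\frac{\lambda}{\lambda+k}\right)^{y}\left(\frac{k}{\lambda+k}\right)^{k}$$
$$\log L = y\log\left(\frac{\lambda}{\lambda+k}\right) + k\log\left(\frac{k}{\lambda+k}\right) + \log\left[\frac{(y+k-1)!}{y!\,(k-1)!}\right]$$
expon family, but quasi-likelihood
$$E(y) = \lambda, \qquad Var(y) = \lambda + \frac{\lambda^{2}}{k}, \qquad \text{natural param} = \log\left(\frac{\lambda}{\lambda+k}\right)$$
$\lambda$ is the mean and k is the aggregation parameter
small k ⇒ aggregation; k → ∞ ⇒ Poisson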
Negative Binomial with GLIMMIX
proc glimmix data=a;
  class BLOCK CTL_TRT a b;
  model count=CTL_TRT a b a*b/dist=negbin htype=1,3;
  random intercept / subject=BLOCK;
run;
Fit Statistics
-2 Res Log Pseudo-Likelihood 84.48
Generalized Chi-Square 28.32
Gener. Chi-Square / DF 0.94
Type I Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 1 27 10.08 0.0037
Type III Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
CTL_TRT 0 . . .
A 2 27 3.53 0.0436
B 2 27 0.03 0.9753
A*B 4 27 1.02 0.4139
Modeling with Offsets
There are cases when modeling count alone is naive
This occurs when counts are “per unit”
− Number of plants per plot
− Number of patients per county
− Number of students per district
− Number of boating accidents per year per lake
− Number of defects per lot
Accurate model must take units into account
Essentially, based on log(count/unit)
Log(count) is link; log(unit) is “offset”
Offset defined
Idea: raw count may be artifact of unit size
Count / unit more informative
Offset
−adjusts for size
−is a regressor whose coefficient is assumed to be 1.0
−used especially in conjunction with Poisson models with log link
−accounts for heterogeneity in rates resulting from difference in size
Modeling with Offsets
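$$y_i \sim \text{Poisson}(\lambda_i)$$
$$\lambda_i = size_i \times \exp(\eta_i), \qquad \exp(\eta_i) = \text{rate per unit size}$$
$$\log E(y_i) = \log(\lambda_i) = \eta_i + \log(size_i) = X\beta + \text{offset}$$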
Example: Courtesy of Oliver Schabenberger
Some of the data
X is predictor variable
SIZE is the “unit” to be taken into account
Obs size x count
1 5001 4.597 4
2 7550 4.245 76
3 1744 3.918 2
4 1451 3.273 2
5 5313 4.140 12
6 3687 3.438 4
7 3022 4.763 2
8 8809 4.445 9
9 4436 4.191 3
10 2621 4.835 6
Naive Modeling (not accounting for SIZE)
proc glimmix data=test;
  model count = x / s dist=poisson;
  ods select FitStatistics ParameterEstimates;
run;
Fit Statistics
-2 Log Likelihood 647.12
AIC (smaller is better) 651.12
AICC (smaller is better) 651.45
BIC (smaller is better) 654.50
CAIC (smaller is better) 656.50
HQIC (smaller is better) 652.35
Pearson Chi-Square 1078.66
Pearson Chi-Square / DF 28.39
Parameter Estimates
Effect Estimate Standard Error DF t Value Pr > |t|
Intercept 2.0978 0.4143 38 5.06 <.0001
x -0.01619 0.1002 38 -0.16 0.8725
Poisson Model with Offset
proc glimmix data=test;
  offs = log(size);
  model count = x /s dist=poisson offset=offs;
  ods select FitStatistics ParameterEstimates;
run;
Fit Statistics
-2 Log Likelihood 318.41
AIC (smaller is better) 322.41
AICC (smaller is better) 322.73
BIC (smaller is better) 325.79
CAIC (smaller is better) 327.79
HQIC (smaller is better) 323.63
Pearson Chi-Square 347.09
Pearson Chi-Square / DF 9.13
Parameter Estimates
Effect Estimate Standard Error DF t Value Pr > |t|
Intercept -7.3168 0.5052 38 -14.48 <.0001
x 0.2247 0.1225 38 1.83 0.0746
Alternative to Offset??
Could count/size be treated as binomial?
proc glimmix data=test;
  offs = log(size);
  model count = x /s dist=poisson offset=offs;
  output out=gmxout1 pred(ilink)=mu;
  id _xbeta_ offs _linp_;
  ods exclude all;
run;
proc glimmix data=test;
  model count/size = x /s dist=binomial;
  output out=gmxout2 pred(ilink)=prob;
  ods exclude all;
run;
data gmxout2;
  set gmxout2;
  predcount= prob * size;
Compare Poisson/Offset vs Binomial Results
Obs _xbeta_ offs _linp_ mu
1 -6.28394 8.51739 2.23346 9.3321
2 -6.36302 8.92930 2.56628 13.0173
3 -6.43649 7.46394 1.02745 2.7939
4 -6.58140 7.28001 0.69860 2.0109
5 -6.38661 8.57791 2.19130 8.9468
6 -6.54433 8.21257 1.66823 5.3028
7 -6.24664 8.01367 1.76703 5.8535
8 -6.31809 9.08353 2.76544 15.8860
9 -6.37516 8.39751 2.02235 7.5561
10 -6.23047 7.87131 1.64085 5.1595
Poisson results (first table): MU = pred count
Binomial results (second table):
Obs size x count prob predcount
1 5001 4.597 4 .001866023 9.3320
2 7550 4.245 76 .001724158 13.0174
3 1744 3.918 2 .001602034 2.7939
4 1451 3.273 2 .001385890 2.0109
5 5313 4.140 12 .001683963 8.9469
6 3687 3.438 4 .001438241 5.3028
7 3022 4.763 2 .001936911 5.8533
8 8809 4.445 9 .001803387 15.8860
9 4436 4.191 3 .001703368 7.5561
10 2621 4.835 6 .001968487 5.1594
predicted counts nearly identical
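The two listings can be tied together with a few lines of arithmetic: under the Poisson/offset model the predicted count is exp(_xbeta_ + log(size)), that is, size times the fitted rate, while under the binomial model it is prob times size. The Python sketch below checks this for the first three observations, using values copied from the listings; the tuple layout and function names are illustrative.

```python
import math

# (size, _xbeta_ from the Poisson fit, mu from the Poisson fit, prob from the binomial fit)
ROWS = [
    (5001, -6.28394, 9.3321, 0.001866023),
    (7550, -6.36302, 13.0173, 0.001724158),
    (1744, -6.43649, 2.7939, 0.001602034),
]


def poisson_pred_count(size, xbeta):
    """Poisson with offset: linear predictor is xbeta + log(size)."""
    linp = xbeta + math.log(size)
    return math.exp(linp)


def binomial_pred_count(size, prob):
    """Binomial alternative: fitted probability times the number of units."""
    return prob * size


if __name__ == "__main__":
    for size, xbeta, mu, prob in ROWS:
        print(size,
              round(poisson_pred_count(size, xbeta), 4),
              round(binomial_pred_count(size, prob), 4))
```

With probabilities this small (about 0.002), exp(_xbeta_) and the binomial probability are nearly the same number, which is why the two sets of predicted counts agree to three or four digits.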
ZIP and Hurdle Models
Mixture models for count data
−ZIP = “zero-inflated Poisson”
−ZINB = “zero-inflated Negative Binomial”
−in principle, other zero-inflated models limited only by imagination
Accommodate excess zeros
−Excess zeros cause overdispersion
Are not in exponential family
Cannot be fit with PROC GLIMMIX
Can be fit using PROC NLMIXED
ZIP Model
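$$z_i \sim \text{Poisson}(\lambda_i)$$
$$\Pr\{y_i = j\} = \begin{cases} \pi_i + (1-\pi_i)\Pr\{z_i = 0\} & j = 0 \\ (1-\pi_i)\Pr\{z_i = j\} & j > 0 \end{cases} \;=\; \begin{cases} \pi_i + (1-\pi_i)\,e^{-\lambda_i} & j = 0 \\ (1-\pi_i)\,\dfrac{e^{-\lambda_i}\lambda_i^{\,j}}{j!} & j > 0 \end{cases}$$
$y_i$: observation
$\pi_i$: prob of 0 from Bernoulli process
$\Pr\{z_i = 0\}$: prob of zero from Poisson process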
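A minimal Python sketch of the ZIP probability function may help make the two sources of zeros concrete. The parameter values in the example run (inflation probability 0.27, Poisson mean 2) are arbitrary illustrations, and the function names are not from the course materials. The second function writes the same probabilities on the log scale, in the two-branch form typically coded by hand when the model is fit by maximum likelihood.

```python
import math


def zip_pmf(j, infl_prob, lam):
    """Pr{y = j} for a zero-inflated Poisson with inflation probability infl_prob."""
    poisson = math.exp(-lam) * lam ** j / math.factorial(j)
    if j == 0:
        # Zeros come from the Bernoulli process OR from the Poisson process.
        return infl_prob + (1.0 - infl_prob) * poisson
    return (1.0 - infl_prob) * poisson


def zip_loglik(count, infl_prob, lam):
    """Log of zip_pmf, written with separate branches for zero and non-zero counts."""
    if count == 0:
        return math.log(infl_prob + (1.0 - infl_prob) * math.exp(-lam))
    return (math.log(1.0 - infl_prob) + count * math.log(lam)
            - math.lgamma(count + 1) - lam)


if __name__ == "__main__":
    infl_prob, lam = 0.27, 2.0  # illustrative values only
    print("Pr{y = 0}:", round(zip_pmf(0, infl_prob, lam), 4))
    print("total probability:", round(sum(zip_pmf(j, infl_prob, lam) for j in range(60)), 6))
```

The mean of this distribution is (1 - infl_prob) * lam, smaller than the Poisson mean, while the extra zeros inflate the variance relative to the mean, which is the overdispersion mentioned earlier.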
Hurdle Model
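Two part model
−One process generates zeros
−Another process generates non-zeros
$$\Pr\{y_i = j\} = \begin{cases} \Pr\{z_i = 0\} & j = 0 \\ \left[1 - \Pr\{z_i = 0\}\right]\dfrac{\Pr\{u_i = j\}}{1 - \Pr\{u_i = 0\}} & j > 0 \end{cases}$$
$y_i$: observation
$\Pr\{z_i = 0\}$: zeros from Z process
$\Pr\{u_i = j\} / [1 - \Pr\{u_i = 0\}]$: distribution truncated at zero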
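For comparison with ZIP, here is a small Python sketch of a hurdle probability function. It assumes the truncated distribution is a zero-truncated Poisson, which is one common choice; the slide itself leaves the distribution unspecified. The function names and the example values are illustrative.

```python
import math


def poisson_pmf(j, lam):
    """Ordinary Poisson probability Pr{u = j}."""
    return math.exp(-lam) * lam ** j / math.factorial(j)


def hurdle_pmf(j, p_zero, lam):
    """Pr{y = j} for a hurdle model.

    p_zero is the probability of a zero from the Z process; non-zero counts
    follow a Poisson(lam) truncated at zero (ASSUMED distribution).
    """
    if j == 0:
        return p_zero
    truncated = poisson_pmf(j, lam) / (1.0 - poisson_pmf(0, lam))
    return (1.0 - p_zero) * truncated


if __name__ == "__main__":
    p_zero, lam = 0.40, 2.0  # illustrative values only
    print("Pr{y = 0}:", hurdle_pmf(0, p_zero, lam))
    print("total probability:", round(sum(hurdle_pmf(j, p_zero, lam) for j in range(60)), 6))
```

The design difference from ZIP is that here every zero comes from the Z process, so Pr{y = 0} equals p_zero exactly, whereas in ZIP the Poisson part contributes additional zeros.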
ZIP or Hurdle?
Number of doctor visits per year
Number of fish caught by sport fishermen
Cancer mortality
From SAS for Mixed Models, 2nd ed, Ch 15
Credit: Oliver Schabenberger
%let pi = 0.27;
data zip;
  do s = 1 to 100;
    u = rannor(556712);
    do i = 1 to 20;
      x = int(ranuni(0)*100);
      y = int(rannor(0)*100);
      if (ranuni(0) < &pi) then do;
        count = 0;
        lambda = .;
      end;
      else do;
        lambda = exp(-2 + 0.01*x + 0.01*y + u);
        count = ranpoi(0,lambda);
      end;
      output;
    end;
  end;
  drop i u lambda;
run;
ZIP Model with Random Effects
proc nlmixed data=zip;
  parameters b0=0 b1=0 b2=0 a0=0 s2u=1;
  /* linear predictor for the inflation probability */
  linpinfl = a0;
  /* infprob = inflation probability for zeros */
  /*         = logistic transform of the linear predictor */
  infprob = 1/(1+exp(-linpinfl));
  /* Poisson mean */
  lambda = exp(b0 + b1*x + b2*y + u);
  /* Build the ZIP log likelihood */
  if count=0 then ll = log(infprob + (1-infprob)*exp(-lambda));
  else ll = log((1-infprob)) + count*log(lambda)-lgamma(count+1)-lambda;
  model count ~ general(ll);
  random u ~ normal(0,s2u) subject=s;
  estimate "inflation probability" infprob;
run;
ZIP NLMIXED Selected Results
Fit Statistics
-2 Log Likelihood 2803.6
AIC (smaller is better) 2813.6
AICC (smaller is better) 2813.7
BIC (smaller is better) 2826.7
Parameter Estimates
Parameter Estimate Standard Error DF t Value Pr > |t| Alpha Lower Upper Gradient
b0 -1.9979 0.1530 99 -13.06 <.0001 0.05 -2.3014 -1.6944 -0.00224
b1 0.01011 0.001299 99 7.78 <.0001 0.05 0.007535 0.01269 -0.15649
b2 0.01016 0.000394 99 25.78 <.0001 0.05 0.009378 0.01094 -0.0434
a0 -1.0934 0.1594 99 -6.86 <.0001 0.05 -1.4097 -0.7771 -0.00034
s2u 1.0828 0.2095 99 5.17 <.0001 0.05 0.6671 1.4985 -0.00145
Additional Estimates
Label Estimate Standard Error DF t Value Pr > |t| Alpha Lower Upper
inflation probability 0.2510 0.02997 99 8.38 <.0001 0.05 0.1915 0.3104
true parameter values
b0=-2 b1=b2=0.01
a0=-0.9946 s2u=1
GLMM Multi-Clinic Binomial Data
SAS for Linear Models, Output 10.9
also SAS for Mixed Models, Ch 14
from Beitler & Landis, Biometrics, 1985
2 treatments (drug, cntl)
8 clinics, represent population
nij patients observed on trt i at clinic j
yij have favorable response
GLMM for Beitler Landis Data
$$\pi_{ij} = \Pr\{\text{favorable} \mid \text{trt } i, \text{clinic } j\}$$
$$\text{Model: } \log\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right) = \mu + \tau_i + c_j + (ct)_{ij}$$
$$c_j \ \text{iid}\ N(0, \sigma^2_{C}); \qquad (ct)_{ij} \ \text{iid}\ N(0, \sigma^2_{CT})$$
proc glimmix data=a;
  class clinic trt;
  model fav/nij= trt/dist=binomial link=logit;
  random intercept trt / subject=clinic;
  lsmeans trt/odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;
Covariance Parameter Estimates
Cov Parm Subject Estimate
Intercept clinic 2.0103
trt clinic 0.06057
If you drop Clinic x Trt
conditional (SS) model:
proc glimmix data=a;
  class clinic trt;
  model fav/nij= trt/dist=binomial link=logit;
  random intercept / subject=clinic;
  lsmeans trt/odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;
marginal (PA) model:
proc glimmix data=a;
  class clinic trt;
  model fav/nij= trt/dist=binomial link=logit;
  random _residual_ / type=cs subject=clinic;
  lsmeans trt/odds;
  estimate 'lsm - cntl' intercept 1 trt 1 0 /ilink;
  estimate 'lsm - drug' intercept 1 trt 0 1 / ilink;
  estimate 'diff' trt 1 -1;
  contrast 'diff' trt 1 -1;
run;
Selected Output – Conditional Model
Covariance Parameter Estimates
Cov Parm Estimate Standard Error
clinic 2.0327 1.2637
Type III Tests of Fixed Effects
Effect Num DF Den DF F Value Pr > F
trt 1 7 5.98 0.0444
Estimates
Label Estimate Standard Error DF t Value Pr > |t| Mean Standard Error Mean
lsm - cntl -1.1464 0.5586 7 -2.05 0.0793 0.2411 0.1022
lsm - drug -0.4220 0.5552 7 -0.76 0.4720 0.3960 0.1328
diff -0.7244 0.2963 7 -2.45 0.0444
trt Least Squares Means
trt Estimate Standard Error DF t Value Pr > |t| Odds
cntl -1.1464 0.5586 7 -2.05 0.0793 0.3178
drug -0.4220 0.5552 7 -0.76 0.4720 0.6557
GLMM with NLMIXED
1. data step to define indicator for Trt=1 (because NLMIXED lacks CLASS statement)
data a;
  input clinic trt $ fav unfav;
  nij=fav+unfav;
  t1=(trt='drug');
2. then, run NLMIXED
proc nlmixed;
  parms mu=1 tau=0 s2c=2;
  eta=mu+tau*t1+cj;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj~normal(0,s2c) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;
NLMIXED with CxT term included
first, also define Trt=2 indicator, here denoted t2
proc nlmixed;
  parms mu=1 tau=0 s2c=2 s2ct=0.08;
  eta=mu+tau*t1+cj+c1j*t1+c2j*t2;
  pij=exp(eta)/(1+exp(eta));
  model fav~binomial(nij,pij);
  random cj c1j c2j~normal([0,0,0],[s2c,0,s2ct,0,0,s2ct]) subject=clinic;
  estimate 'trt effect' tau;
  estimate 'ctl p_hat' exp(mu)/(1+exp(mu));
  estimate 'drug p_hat' exp(mu+tau)/(1+exp(mu+tau));
  estimate 'diff on p_hat scale' exp(mu+tau)/(1+exp(mu+tau)) - exp(mu)/(1+exp(mu));
run;
Binary Repeated Measures
2 treatments
20 subjects (animals) per trt
5 times of measurement
response at each measurement 0/1
suggested by companion animal vaccine trials
Several approaches
GEE using GENMOD
PQL using %GLIMMIX
− random subj(trt), or
− CS
G-H quadrature using NLMIXED
(not shown) but you could use MIXED
type 1 error control of PQL + random subj(trt) not acceptable
power of PQL/CS or NLMIXED > GEE
various SAS pgm for binary rpt-M data
GEE:
proc genmod;
  class trt animal day;
  model y=trt|day/dist=bin type1 type3;
  repeated subject=animal(trt)/ type=exch;
PQL, random an(trt):
Proc GLIMMIX;
  CLASS trt animal day;
  MODEL y=trt|day / dist=binomial link=logit;
  random animal(trt);
PQL, CS:
Proc GLIMMIX;
  CLASS trt animal day;
  MODEL y=trt|day / dist=binomial link=logit;
  random day / rside type=cs subject=animal(trt);
NLMixed next page
NLMixed
data nlmx;
  set univar;
  t1=(trt=1); t2=(trt=2);
  d1=(day=1); d2=(day=2); d3=(day=3); d4=(day=4); d5=(day=5);
proc nlmixed;
  parms mu=1 a1=1 b1=1 b2=1 b3=1 b4=1
        ab11=1 ab12=1 ab13=1 ab14=1 sb2=1;
  eta=mu+a1*t1+b1*d1+b2*d2+b3*d3+b4*d4+
      ab11*t1*d1+ab12*t1*d2+ab13*t1*d3+ab14*t1*d4;
  pi=exp(eta+bse)/(1+exp(eta+bse));
  model y~binary(pi);
  random bse~normal(0,sb2) subject=id;
  contrast 'trt' a1;
  contrast 'day' b1,b2,b3,b4;
  contrast 'trt x day' ab11,ab12,ab13,ab14;
Poisson Repeated Measures
Output 10.39 SAS for Linear Models
Leppik, et al (1985); Thall & Vail (1990)
2 treatments
28 patients on trt=0; 31 on trt=1
4 times of measurement
epilepsy: # seizures in 4 test periods
baseline & age covariates
Model for seizure data
$\lambda_{ij}$ denote mean count (# seizures) for trt $i$, time $j$
GL Model is:
$$\log(\lambda_{ij}) = \mu + \tau_i + \delta_j + (\tau\delta)_{ij} + \beta_{1i}(\text{log\_base}) + \beta_2(\text{log\_age})$$
Assume CS working correlation structure among repeated measures
using GEE
proc genmod data=seizure;
  class id trt time;
  /* this model first */
  *model y=trt time trt*time log_base trt*log_base log_age/ dist=poisson link=log type1 type3;
  /* then this model */
  model y=trt time log_base(trt) log_age/ dist=poisson link=log type1 type3;
  repeated subject=id / type=exch corrw;
see SAS file for %GLIMMIX approach
GENMOD to GLIMMIX
using GEE
proc genmod data=seizure;
  class id trt time;
  model y=trt time log_base(trt) log_age/ dist=poisson link=log type1 type3;
  repeated subject=id / type=exch corrw;
equivalent GLIMMIX
proc glimmix data=seizure;
  class id trt time;
  model y=trt time log_base(trt) log_age/ dist=poisson link=log;
  random time / type=cs subject=id residual;
Degrees of Freedom & Standard Errors
Recall Satterthwaite approximation & Kenward-Roger bias adjustment in LMM
Same issues exist with GLMM
But not nearly as well researched
You can use SATTERTH and KR options in GLIMMIX with non-normal data & non-identity link
But what do they do?
Power
VIII. Power
Many software packages for power & sample size
−e.g. SAS PROC POWER
−for FIXED effect models only
What if you have “Mixed Model Issues”?
−random effects
−split-plot structure
−errors potentially correlated: longitudinal or spatial data
−any other non-standard model structure
Methods based on PROC GLIMMIX
−adapted from Stroup (2002, JABES)
Mixed Model Background – G, R unknown
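Test $H_0: K'\beta = 0$ with
$$F = \frac{(K'\hat\beta)'\,[L'\hat{C}L]^{-1}\,(K'\hat\beta)}{rank(K)}$$
$\hat{C}$ is estimate of $C$ using estimated components of $G$ and $R$
$$F \sim \text{approx } F_{[rank(K),\ \upsilon,\ \phi]}$$
$\upsilon$ may be obvious from design or may need to be approximated
e.g. Satterthwaite, Kenward-Roger
noncentrality parameter
$$\phi = (K'\beta)'\,[L'CL]^{-1}\,(K'\beta)$$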
Computing Power using SAS
create data set like proposed design (O’Brien: “exemplary data set”)
run PROC GLIMMIX with covariance components fixed
use GLIMMIX to compute noncentrality parameter φ = (F computed by GLIMMIX) × rank(K) [or chi-sq with GLM]
critical F (Fcrit) is value s.t. P{F(rank(K), υ, 0) > Fcrit} = α [or chi-square]
Power = P{F[rank(K), υ, φ] > Fcrit}
SAS functions can compute Fcrit & Power
Compute Power with GLIMMIX – CRD example
/* step 1 - create data set with same structure as proposed design
   use MU (expected mean) instead of observed Y_ij values */
/* this example shows power for 5, 10, and 15 e.u. per trt */
data crdpwrx1;
  input trt mu;
  do n=5 to 15 by 5;
    do eu=1 to n;
      output;
    end;
  end;
cards;
1 100
2 94
3 90
;
Compute Power with GLIMMIX – CRD example
/* step 2 - use PROC GLIMMIX to compute non-centrality parameters
   for ANOVA tests & contrasts
   ODS statements output them to new data sets */
proc sort data=crdpwrx1;
  by n;
proc glimmix data=crdpwrx1;
  by n;
  class trt;
  model mu=trt;
  parms (100)/hold=1;
  contrast 'et1 v et2' trt 0 1 -1;
  contrast 'c vs et' trt 2 -1 -1;
  ods output tests3=b;
  ods output contrasts=c;
run;
/* step 3: combine ANOVA & contrast n-c parameter data sets
   use SAS functions PROBF and FINV to compute power */
data power;
  set b c;
  alpha=0.05;
  ncparm=numdf*fvalue;
  fcrit=finv(1-alpha,numdf,dendf,0);
  power=1-probf(fcrit,numdf,dendf,ncparm);
run;
proc print;
run;

Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
trt      2        12       1.27      0.3169

Contrasts
Label       Num DF   Den DF   F Value   Pr > F
et1 v et2   1        12       0.40      0.5390
c vs et     1        12       2.13      0.1698

Obs   n   Effect   NumDF   DenDF   FValue   ProbF    Label       alpha   ncparm    fcrit     power
1     5   trt      2       12      1.27     0.3169               0.05    2.53333   3.88529   0.22361
2     5            1       12      0.40     0.5390   et1 v et2   0.05    0.40000   4.74723   0.08980
3     5            1       12      2.13     0.1698   c vs et     0.05    2.13333   4.74723   0.26978
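As a hand check of the n = 5 rows, the noncentrality parameters can be computed directly from the design. This sketch assumes the PARMS value of 100 is the residual variance $\sigma^2$ and uses the step 1 means (100, 94, 90), so $\bar{\mu} = 94.667$:

$$\phi_{trt} = \frac{n\sum_i(\mu_i-\bar{\mu})^2}{\sigma^2} = \frac{5\,(5.333^2 + 0.667^2 + 4.667^2)}{100} = \frac{5\,(50.667)}{100} \approx 2.533$$

$$\phi_{et1\ v\ et2} = \frac{(94-90)^2}{\sigma^2\,(1^2+1^2)/n} = \frac{16}{40} = 0.400$$

$$\phi_{c\ vs\ et} = \frac{(2\cdot 100-94-90)^2}{\sigma^2\,(2^2+1^2+1^2)/n} = \frac{256}{120} \approx 2.133$$

These agree with the ncparm column above, which is why fitting the expected means with the variance held fixed returns the noncentrality as numdf × F.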
More Advanced Example
Plots in 8 x 3 grid
Main variation along 8 “rows”
3 x 2 treatment design
Alternative designs
− randomized complete block (4 blocks, size 6)
− incomplete block (8 blocks, size 3)
− split plot
RCBD “easy” but ignores natural variation
Picture the 8 x 3 Grid
[Figure: the 8 x 3 grid of plots, with the direction of the gradient marked]
SAS Programs to Compare 8 x 3 Design
Split-Plot
data a;
  input bloc trtmnt @@;
  do s_plot=1 to 3;
    input dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 3
1 2 1 2 3
2 1 1 2 3
2 2 1 2 3
3 1 1 2 3
3 2 1 2 3
4 1 1 2 3
4 2 1 2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  random trtmnt/subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
8 x 3 – Incomplete Block
data a;
  input bloc @@;
  do eu=1 to 3;
    input trtmnt dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 1 2 1 3
2 1 1 1 2 2 2
3 1 1 1 3 2 3
4 1 1 2 1 2 2
5 1 2 1 3 2 2
6 1 2 2 1 2 3
7 1 3 2 1 2 3
8 2 1 2 2 2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=trtmnt|dose;
  random intercept / subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
8 x 3 Example - RCBD
data a;
  input trtmnt dose @@;
  do bloc=1 to 4;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 1 3 2 1 2 2 2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  parms (10) / hold=1;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
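Each of the three design programs writes the contrast table to data set c and the LSMEANS differences to data set b, but the slides do not show the power step for them. A minimal sketch, run after each PROC GLIMMIX step, might look like the following. It reuses the logic of step 3 from the CRD example; the output data set names power_contrast and power_diffs are hypothetical, and the column names (NumDF, DenDF, FValue for contrasts; DF, tValue for differences) are assumed to be the usual ODS names.

```sas
/* Sketch: power for the 'trt x lin' contrast in the current design */
data power_contrast;
  set c;                              /* ODS contrasts table         */
  alpha  = 0.05;
  ncparm = numdf * fvalue;            /* noncentrality parameter     */
  fcrit  = finv(1 - alpha, numdf, dendf, 0);
  power  = 1 - probf(fcrit, numdf, dendf, ncparm);
run;

/* Sketch: power for each pairwise difference of trtmnt*dose means. */
/* A t test with DF d.f. is a 1-d.f. F test, so ncparm = t**2.      */
data power_diffs;
  set b;                              /* ODS lsmeans diffs table     */
  alpha  = 0.05;
  ncparm = tvalue**2;
  fcrit  = finv(1 - alpha, 1, df, 0);
  power  = 1 - probf(fcrit, 1, df, ncparm);
run;

proc print data=power_contrast; run;
proc print data=power_diffs; run;
```

Comparing the power column across the split-plot, incomplete block, and RCBD runs is what allows the designs to be ranked for the effects of interest.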
Power for GLMs
2 treatments
P{favorable outcome}: for trt 1 p=0.30; for trt 2 p=0.25
power if n1=300; n2=600
data a;
  input trt y n;
datalines;
1 90 300
2 150 600
;
proc glimmix;
  class trt;
  model y/n=trt / chisq;
  ods output tests3=pwr;
run;
data power;
  set pwr;
  alpha=0.05;
  ncparm=numdf*chisq;
  fcrit=cinv(1-alpha,numdf,0);
  power=1-probchi(fcrit,numdf,ncparm);
run;
proc print;
run;
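The chi-square that GLIMMIX reports for this exemplary data set can be anticipated by hand. The following is a sketch that assumes the default logit link and the usual Wald statistic for a two-group binomial model; the numerical values are approximate.

$$\hat{\eta}_1 - \hat{\eta}_2 = \log\frac{0.30}{0.70} - \log\frac{0.25}{0.75} = \log(1.2857) \approx 0.2513$$

$$\widehat{Var}(\hat{\eta}_1 - \hat{\eta}_2) = \frac{1}{300(0.30)(0.70)} + \frac{1}{600(0.25)(0.75)} = \frac{1}{63} + \frac{1}{112.5} \approx 0.02476$$

$$\phi = \chi^2 \approx \frac{0.2513^2}{0.02476} \approx 2.55$$

With 1 d.f. and $\alpha = 0.05$ the critical value is about 3.84, and a noncentral chi-square with $\phi \approx 2.55$ exceeds it with probability of roughly 0.36.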
Power for GLMM
Same trt and sample size per location as before
10 locations
Var(Location)=0.25; Var(Trt*Loc)=0.125
Variance Components: variation in log(OddsRatio)
Power?
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;
proc glimmix data=a initglm;
  class trt loc;
  model y/n = trt / oddsratio;
  random intercept trt / subject=loc;
  random _residual_;
  parms (0.25) (0.125) (1) / hold=1,2,3;
  ods output tests3=pwr;
run;
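The slide stops at the ODS OUTPUT statement. A plausible follow-up step, sketched here by analogy with step 3 of the CRD example rather than taken from the slides, converts the F value in pwr to power. It uses the F distribution because the random effects give the test a denominator d.f.; the VAR list assumes the usual Tests3 column names.

```sas
/* Sketch: power for the TRT test in the GLMM, from the Tests3 table */
data power;
  set pwr;
  alpha  = 0.05;
  ncparm = numdf * fvalue;            /* noncentrality parameter      */
  fcrit  = finv(1 - alpha, numdf, dendf, 0);
  power  = 1 - probf(fcrit, numdf, dendf, ncparm);
run;

proc print data=power;
  var effect numdf dendf alpha ncparm fcrit power;
run;
```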
GLMM Power Analysis Results
Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       9       0.05    2.29868   5.11736   0.27370

Odds Ratio Estimates
trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      9    0.884   1.871

Gives you expected Conf Limits for # Locations & N / Loc contemplated
Gives you the power of the test of TRT effect on prob(favorable)
GLMM Power: Impact of Sample Size?
N of subjects per trt per location?
N of Locations?
Three cases
1. n=300/600, 10 loc
2. n=600/1200, 10 loc
3. n=300/600, 20 loc
/* case 1 */
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;
/* case 2 */
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 180 600
2 300 1200
;
/* case 3 */
data a;
  input trt y n;
  do loc=1 to 20;
    output;
  end;
datalines;
1 90 300
2 150 600
;
GLMM Power: Impact of Sample Size?
Recall, for 10 locations, N=300/600, CI for OddsRatio was (0.884, 1.871); Power was 0.274

For 10 locations, N=600/1200
Odds Ratio Estimates
trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      9    0.891   1.855

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       9       0.05    2.40715   5.11736   0.28421

For 20 locations, N=300/600
Odds Ratio Estimates
trt   _trt   Estimate   DF   95% Confidence Limits
1     2      1.286      19   1.006   1.643

Obs   Effect   NumDF   DenDF   alpha   ncparm    fcrit     power
1     trt      1       19      0.05    4.59736   4.38075   0.53003
N alone has almost no impact
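A rough variance calculation shows why. This is a sketch, not taken from the slides: it assumes $L$ locations, the residual scale held at 1, and the location variance cancelling out of the within-location treatment difference. The symbols $L$ and $\sigma^2_{TL}$ (the Trt*Loc variance, 0.125) are introduced here for illustration.

$$Var(\widehat{\log OR}) \approx \frac{1}{L}\left[2\sigma^2_{TL} + \frac{1}{n_1 p_1(1-p_1)} + \frac{1}{n_2 p_2(1-p_2)}\right], \qquad \phi \approx \frac{(\log OR)^2}{Var(\widehat{\log OR})}$$

With $\log OR = \log(1.2857) \approx 0.2513$:

$$\text{Case 1: } \frac{0.25 + 0.02476}{10} = 0.02748 \Rightarrow \phi \approx 2.30 \qquad \text{Case 2: } \frac{0.25 + 0.01238}{10} = 0.02624 \Rightarrow \phi \approx 2.41 \qquad \text{Case 3: } \frac{0.25 + 0.02476}{20} = 0.01374 \Rightarrow \phi \approx 4.60$$

These match the ncparm values reported above. The term $2\sigma^2_{TL} = 0.25$ dominates and does not shrink with $n$, so doubling $n$ barely moves $\phi$, while doubling the number of locations doubles it.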
Spatial Data
Example 5 - Spatial
from SAS for Mixed Models, Sect. 11.7
“Alliance” Data from Stroup, Baenziger, and Mulitze (1994)
in GLIMMIX-speak:
data two;
  set alliance;
  obs = _n_;
run;
proc glimmix data=two;
  class Entry Rep obs;
  model Yield=Entry/ddfm=kr;
  random intercept/subject=rep;
  random obs / type=sp(sph)(latitude longitude);
  parms (0.1) (43.4) (27.5) (11.5);
  lsmeans entry;
run;
IX. Spatial Data
Example from SAS for Mixed Models
− Spatial errors in Treatment Comparison studies only
− No spatial mapping, Kriging
Standard parametric models from Geostatistics
RSMOOTH alternative
Issues
[Figure: field layout of plots by rep (1 to 4); axes LAT (4.30 to 47.30) and LNG (1.2 to 26.4)]
From Stroup, Baenziger & Mulitze (Crop Science, 1994)
56 varieties, 4 blocks, e.u. = 4.3 × 1.2 m plots
Contour Plot of Response
[Figure: contour plot of response over the field, with plot locations of two entries marked: B = Buckskin, N = NE86503]
Additional GLIMMIX Code to Plot Spatial Variability
/* the next three statements go inside the PROC GLIMMIX step */
  output out=gmxout2 pred=p;
  ods output lsmeans=lsm2;
  id entry latitude longitude _zgamma_;
run;
proc means data=gmxout2;
  var _zgamma_;
run;
proc print data=gmxout2(OBS=20);
run;
proc g3d data=gmxout2;
  plot latitude*longitude=_zgamma_ /grid;
run;
Plot of Spherical Covariance
Alternative Using RSMOOTH
Advantage in Theory: RSMOOTH does not require parametric model of spatial variation, which can be unrealistic
e.g. Alliance data spatial variation is from winter kill
proc glimmix data=alliance;
  class Entry Rep;
  model Yield=Entry /ddfm=kr;
  *model Yield=Entry latitude longitude/ddfm=kr;
  random intercept/subject=rep;
  random latitude longitude / type=rsmooth;
run;
RSMOOTH?
From Penalized Spline
− Ruppert, Wand, and Carroll (2003, Semiparametric Regression, Cambridge)
Prediction: $\hat{y} = B(x)\hat{\beta}$
Objective Function: $Q(y; \beta) = (y - B(x)\beta)'(y - B(x)\beta) + \lambda\,\beta' D \beta$
RSMOOTH (2)
Rewrite the model
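$$y_i = \beta_0 + \beta_1 x_i + \sum_j \gamma_j\,(x_i - \kappa_j)_+ + e_i$$
$\kappa_j$ is "knot" a.k.a. "join point"
Reexpress: $y = X\beta + Z\gamma + e$
then
$$Q^*(y; \beta, \gamma) = \|y - X\beta - Z\gamma\|^2 + \lambda\,\|\gamma\|^2$$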
RSMOOTH (2)
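Spline: $Q(y) = (y - X\beta - B\gamma)'(y - X\beta - B\gamma) + \lambda\,\gamma' D \gamma$
LMM: $Q(y) = (y - X\beta - Zu)'(y - X\beta - Zu) + (\sigma^2/\sigma_u^2)\,u'u$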
RSMOOTH yields following Spatial Plot
RSMOOTH vs SP(SPH)
Sp(SPH)
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
Entry    55       148.2    1.77      0.0038

RSMOOTH
Type III Tests of Fixed Effects
Effect   Num DF   Den DF   F Value   Pr > F
Entry    55       138.1    1.85      0.0021
However... Plot of LSMeans from two approaches
LSM_RSMOOTH average 31.06
LSM_SP_SPH average 24.40
????
Some NLMM Issues
Consulting problem at UNL
Why nonlinear mixed model (NLMM) seemed appropriate
Problems in implementation
NLMM issues
Alternatives whose implications are not adequately understood
Wheat Sawfly Study
Gary Hein, Research Entomologist, Scottsbluff, NE RREC
Sawflies inhabit/damage wheat
5 tillage treatments: impact on sawflies
Exp design used 4 randomized blocks
Sawfly emergence measured at planned times during growing season
Emergence over TIME by TRT
[Figure: emergence over time by treatment; line colors as follows]
Black: NoTill
Red: SumBlade (summer)
Cyan: SB&SD
Green: SpDisk (spring)
Blue: SpPlow
“Conventional” Analysis
Emerge = μ + TRT + blk + blk*trt + DATE + TRT*DATE + date*blk(trt)
• blk*trt a.k.a. between subjects or “whole-plot” error
• date*blk(trt) = within subjects or “split-plot” error

ANOVA:
Source              df
blk                 3
TRT                 4
betw subj error     12
DATE                12
TRT*DATE            48
within subj error   180
Standard ANOVA
model: emerge = μ + blk + TRT + w.p. error + TIME + TRT*TIME + s.p. error

The Mixed Procedure

Covariance Parameter Estimates
Cov Parm   Estimate
blk        0.002177
blk*trt    0.005199
Residual   0.01845

Type 3 Tests of Fixed Effects
Effect     Num DF   Den DF   F Value   Pr > F
trt        4        12       13.18     0.0002
date       12       180      157.38    <.0001
trt*date   48       180      5.18      <.0001

CS covariance fit adequately
Break out TRT*DATE effect
Type 1 Tests of Fixed Effects
Effect      Num DF   Den DF   F Value   Pr > F
trt         4        12       15.62     0.0001
lin         1        177      2273.39   <.0001
quad        1        177      7.24      0.0078
cubic       1        177      161.10    <.0001
date        9        177      2.95      0.0027
lin*trt     4        177      0.59      0.6716
quad*trt    4        177      26.69     <.0001
cubic*trt   4        177      2.13      0.0792
trt*date    36       177      3.08      <.0001
Alternative Modeling Considerations
Basic form of Model:
$$y_{ijk} = \mu_{ij} + blk_k + w_{ik} + e_{ijk}$$
$\mu_{ij}$ = mean of $i^{th}$ trt at $j^{th}$ time
$w_{ik}$ = whole plot error, i.i.d. $N(0, \sigma_W^2)$
$e_{ijk}$ = split plot error, i.i.d. $N(0, \sigma_S^2)$

Modeling $\mu_{ij}$
1. Decompose $\mu_{ij}$ in “standard ANOVA”: $\mu$ + Trt + Time + Trt*Time
2. Further decompose via polynomial regression
3. Nonlinear decomposition, e.g. Gompertz
4. Transform $y_{ijk}$ to “linearize” response profile over date
a. logit or probit (assume sigmoid profile is symmetric)
b. complementary log-log (allows asymmetry)
Gompertz Model:
$$\mu_{ij} = \alpha_i \exp\{-\exp[\beta_i - \gamma_i(date_j)]\}$$
$\alpha_i$ is asymptote of $i^{th}$ treatment
$\gamma_i$ is "slope" of $i^{th}$ treatment
$\beta_i/\gamma_i$ is inflection point of $i^{th}$ treatment
Parameter Estimates
Parameter   Estimate   Standard Error   DF   t Value   Pr > |t|
a1          0.9949     0.03629          19   27.42     <.0001
a2          0.9666     0.03793          19   25.48     <.0001
a3          0.9868     0.04609          19   21.41     <.0001
a4          1.0037     0.06284          19   15.97     <.0001
a5          0.9236     0.04390          19   21.04     <.0001
b1          0.5435     0.08104          19   6.71      <.0001
b2          0.4822     0.08743          19   5.52      <.0001
b3          0.4506     0.09845          19   4.58      0.0002
b4          0.3431     0.06859          19   5.00      <.0001
b5          0.8544     0.1810           19   4.72      0.0001
c1          0.3615     0.05388          19   6.71      <.0001
c2          0.3224     0.05841          19   5.52      <.0001
c3          0.2940     0.06370          19   4.62      0.0002
c4          0.2186     0.04360          19   5.01      <.0001
c5          0.5319     0.1125           19   4.73      0.0001
s2w         0.002926   0.001355
s2s         0.01598    0.001462

These are ML estimates. Bias?
Fit of Gompertz
Trt Comparisons with NLMIXED
Contrasts
Label                    Num DF   Den DF   F Value   Pr > F
among a                  4        19       0.50      0.7383
among b                  4        19       2.19      0.1085
among c                  4        19       2.30      0.0966
a: nt vs sum bld         1        19       0.29      0.5956
a: nt+sb vs sb&sd        1        19       0.01      0.9108
a: sp dsk vs sp plow     1        19       1.09      0.3089
a: nt+sb vs sp d+p       1        19       0.14      0.7169
b: nt vs sum bld         1        19       0.26      0.6132
b: nt+sb vs sb&sd        1        19       0.29      0.5950
b: sp dsk vs sp plow     1        19       6.97      0.0161
b: nt+sb vs sp d+p       1        19       0.57      0.4590
c: nt vs sum bld         1        19       0.24      0.6279
c: nt+sb vs sb&sd        1        19       0.41      0.5305
c: sp dsk vs sp plow     1        19       6.74      0.0177
c: nt+sb vs sp d+p       1        19       0.21      0.6497
Issues with Test Results
denominator degrees of freedom?
− DF in NLMIXED based on simple N-1 rule
− MIXED uses Satterthwaite/KR
− NLMIXED analog?
bias in test statistics?
− In MIXED, ML variance estimates biased
− Test statistics biased
− Excessive type I error rates familiar in MIXED
− Same in NLMIXED?
Alternative NLMIXED Analysis
1. Use MIXED to obtain REML estimates of σ_W² and σ_S²
2. Include REML variance component estimates in NLMIXED as known
3. NLMIXED will compute std errors and test statistics using REML estimates
NLMIXED REML Tests
MLE: σ_W² = 0.002926, σ_S² = 0.01598
REML: σ_W² = 0.005199, σ_S² = 0.01845

Label                    Num DF   Den DF   F Value   Pr > F   Vs. ML
among a                  4        19       0.38      0.8188
among b                  4        19       1.81      0.1690   .1085
among c                  4        19       1.89      0.1537   .0966
a: nt vs sum bld         1        19       0.26      0.6138
a: nt+sb vs sb&sd        1        19       0.00      0.9796
a: sp dsk vs sp plow     1        19       0.77      0.3918
a: nt+sb vs sp d+p       1        19       0.15      0.7046
b: nt vs sum bld         1        19       0.22      0.6419
b: nt+sb vs sb&sd        1        19       0.18      0.6737
b: sp dsk vs sp plow     1        19       5.88      0.0255   .0161
b: nt+sb vs sp d+p       1        19       0.52      0.4788
c: nt vs sum bld         1        19       0.21      0.6555
c: nt+sb vs sb&sd        1        19       0.27      0.6114
c: sp dsk vs sp plow     1        19       5.68      0.0277   .0177
c: nt+sb vs sp d+p       1        19       0.20      0.6586
Hein: “What if we transform the data to linearize it, then use MIXED?”
Denote response variable emerge by y
then:
$$y = \alpha \exp\{-\exp[\beta - \gamma(date)]\}$$
if we assume $\alpha = 1$
then
$$\log[-\log(y)] = \beta - \gamma(date)$$
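A minimal sketch of the transformation step follows. The data set names sawfly and sawfly_ll and the new variable cll are hypothetical; only the response name emerge comes from the slides. The transform is undefined when emerge is 0 or 1, which can happen at the first and last dates, so those observations are set to missing here. That is one possible choice, not necessarily what was done in the original analysis.

```sas
/* Sketch: log(-log(y)) transform of emergence before using PROC MIXED. */
/* Data set and variable names other than EMERGE are hypothetical.      */
data sawfly_ll;
  set sawfly;
  if 0 < emerge < 1 then cll = log(-log(emerge));
  else cll = .;   /* transform undefined at 0 and 1: set to missing     */
run;
```

The transformed variable would then replace emerge as the response in the same split-plot-in-time model used for the conventional analysis.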
Plot of CLogLog over Date by Trt
MIXED Analysis of CLogLog
Type 1 Tests of Fixed Effects
Effect     Num DF   Den DF   F Value   Pr > F
trt        4        12       15.69     0.0001
lin        1        180      1402.85   <.0001
lin*trt    4        180      3.58      0.0077
trt*date   55       180      7.02      <.0001

Tests of Lin and Lin*Trt correspond to equality of β_i and γ_i for all treatments in Gompertz NLMM
Decomposing Contrasts
Label                 Num DF   Den DF   F Value   Pr > F   Vs NLMM
trt (b)               4        15       6.12      0.0040   .169
c                     4        120      3.62      0.0080   .154
b: nt v sum bld       1        15       2.15      0.1631
b: nt&sb vs sb&sd     1        15       4.37      0.0541   .674
b: sp d v p           1        15       2.27      0.1526   .026
b: nt&sb v sp d&p     1        15       19.96     0.0005
c: nt v sum bld       1        120      2.11      0.1491
c: nt&sb vs sb&sd     1        120      3.49      0.0644   .611
c: sp d v p           1        120      0.99      0.3214   .028
c: nt&sb v sp d&p     1        120      11.08     0.0012
NLMM too conservative? or is Linearized LMM too liberal?
Unresolved Issues
Unresolved NLMIXED Issues
REML vs. ML variance component estimates
Degrees of Freedom
Starting Values and Convergence
Are NLMIXED tests too conservative?
Implications for standard errors??
Correlated error repeated measures?
When are linearized models analyzed using LMM (e.g. Proc Mixed) preferable?
Design
GLIMMIX vs MIXED/GENMOD
GLIMMIX has very useful mean comparison options not available in MIXED
− especially for Factorial Simple Effects
GLIMMIX can model true GLMM’s
GLIMMIX is “touchy” (e.g. use of SUBJECT=)
Many Research Issues
− RSMOOTH
− Properties of NonNormal KR, working correlation, DDF, etc.
− Computational Methods
Does GLIMMIX replace MIXED/GENMOD?
For GLMMs – no question
For GLMs / LMMs
− for the most part – YES
Most GENMOD & MIXED programs can be duplicated in GLIMMIX
− Mean Comparison features
− no need to “trick” GENMOD into GLMM with marginal model (e.g. split-plot, rpt measures)