Power in Mixed Effects

Gary W. Oehlert

School of Statistics, University of Minnesota

December 1, 2014

Power is an important aspect of designing an experiment; we now return to power in mixed effects.

We will compute power for “old school” tests. This technique works for balanced designs; it will be exact for some situations with REML and approximate in other situations.

Modeling Assumptions

For pure random terms $\alpha\beta_{ij}$ where all contributing factors are random, we assume all $\alpha\beta_{ij}$s are independent of each other.

For mixed terms $\alpha\beta_{ij}$, say where A is fixed and B is random, we have a choice:

Unrestricted assumptions say that all elements of $\alpha\beta_{ij}$ are independent.

Restricted assumptions say that elements of $\alpha\beta_{ij}$ will add to zero across any fixed subscript ($i$ in this case) but are otherwise independent.

Restricted assumptions induce negative correlation among some random effects:

Under restricted assumptions, two random effects from the same mixed term are negatively correlated if all of their subscripts corresponding to random factors are the same.

Otherwise, they are independent.

The text usually defaults to restricted assumptions, but either could be appropriate depending on the situation, or something else could be better still.

It is usually very tricky to decide which mixed modeling assumptions are appropriate.

In general, unrestricted assumptions are more conservative (it is typically more difficult to reject the null).

The lme and lmer functions in R fit using the unrestricted model assumptions.

Via heroic effort, one can make lme fit some models under the restricted assumptions.

Given R's predilections, we will concentrate on the unrestricted approach.

Anova and Expected Mean Squares

Old school testing in mixed effects proceeds as follows:

1. Compute an Anova table as if everything in the model is a fixed effect.

2. Compute the expectation of every mean square (the expected mean squares) using the complete Hasse diagram.

3. Use the Hasse diagram (or EMS) to determine the correct ratio of MS (correct F test) for every term of interest.

4. Do the tests.

We need the EMS and df for the F-test to compute power.

The complete Hasse diagram includes super- and subscripts on every term.

For each node on the diagram, add a superscript that indicates the number of different levels of the effect in that term.

For each node on the diagram, add a subscript that indicates the degrees of freedom. Compute the df for a term U by starting with the superscript for U and subtracting the subscripts (df) for all terms above U.

[Figure: three Hasse diagrams. A has 5 levels, B has 4 levels, C has 2 levels, 2 replications.]

[Figure: Hasse diagram for the cheese raters example.]

1. The representative element for a random term is its variance.

2. The representative element for a fixed term is the sum of the squared fixed effects divided by degrees of freedom.

3. Contribution from a term is N, divided by the superscript, times the representative element.

4. Using unrestricted model assumptions, the EMS for a term is the contribution from that term and all random terms below it.
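The four rules above can be sketched as a short Python routine. The encoding of the diagram (superscript, random flag, and list of terms below) is an assumption made for illustration; the design is A fixed with 5 levels crossed with B random with 4 levels and 2 replications, so N = 40. The routine returns only the integer multipliers N/superscript; each multiplies the representative element of its term (a variance for a random term, sum of squared effects over df for a fixed term).

```python
# EMS multipliers under unrestricted assumptions (balanced design assumed).
N = 40

# Assumed encoding: term -> (superscript, is_random, terms below it in the diagram)
terms = {
    "A":  (5,  False, ["AB", "E"]),
    "B":  (4,  True,  ["AB", "E"]),
    "AB": (20, True,  ["E"]),
    "E":  (40, True,  []),
}

def ems(term):
    """Return {component: multiplier} for the EMS of `term`:
    the term's own contribution plus every random term below it."""
    superscript, _, below = terms[term]
    out = {term: N // superscript}
    for t in below:
        sup_t, is_random, _ = terms[t]
        if is_random:
            out[t] = N // sup_t
    return out

for name in terms:
    print(name, ems(name))
# A  -> {'A': 8, 'AB': 2, 'E': 1}
# B  -> {'B': 10, 'AB': 2, 'E': 1}
# AB -> {'AB': 2, 'E': 1}
# E  -> {'E': 1}
```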

A fixed, B random, crossed (first diagram)

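MSE: $\frac{40}{40}\sigma^2 = \sigma^2$

MSAB: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta}$

MSA: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} + \frac{40}{5}\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4} = \sigma^2 + 2\sigma^2_{\alpha\beta} + 8\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4}$

MSB: $\frac{40}{40}\sigma^2 + \frac{40}{20}\sigma^2_{\alpha\beta} + \frac{40}{4}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta} + 10\sigma^2_{\beta}$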

A random, B random, C fixed, crossed (second diagram)

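MSE: $\frac{80}{80}\sigma^2 = \sigma^2$

MSABC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma}$

MSAB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta}$

MSBC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{8}\sigma^2_{\beta\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 10\sigma^2_{\beta\gamma}$

MSAC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{10}\sigma^2_{\alpha\gamma} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma}$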

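A random, B random, C fixed, crossed (second diagram), continued.

MSC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{10}\sigma^2_{\alpha\gamma} + \frac{80}{8}\sigma^2_{\beta\gamma} + \frac{80}{2}\,\frac{\sum_{k=1}^{2}\gamma_k^2}{1} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma} + 10\sigma^2_{\beta\gamma} + 40\,\frac{\sum_{k=1}^{2}\gamma_k^2}{1}$

MSB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} + \frac{80}{8}\sigma^2_{\beta\gamma} + \frac{80}{4}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 10\sigma^2_{\beta\gamma} + 20\sigma^2_{\beta}$

MSA: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\alpha\beta\gamma} + \frac{80}{20}\sigma^2_{\alpha\beta} + \frac{80}{10}\sigma^2_{\alpha\gamma} + \frac{80}{5}\sigma^2_{\alpha} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$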

A random, B random, C random, fully nested (third diagram).

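MSE: $\frac{80}{80}\sigma^2 = \sigma^2$

MSC: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} = \sigma^2 + 2\sigma^2_{\gamma}$

MSB: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} + \frac{80}{20}\sigma^2_{\beta} = \sigma^2 + 2\sigma^2_{\gamma} + 4\sigma^2_{\beta}$

MSA: $\frac{80}{80}\sigma^2 + \frac{80}{40}\sigma^2_{\gamma} + \frac{80}{20}\sigma^2_{\beta} + \frac{80}{5}\sigma^2_{\alpha} = \sigma^2 + 2\sigma^2_{\gamma} + 4\sigma^2_{\beta} + 16\sigma^2_{\alpha}$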

Cheese raters.

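MSE: $\frac{160}{160}\sigma^2 = \sigma^2$

MSRC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} = \sigma^2 + 2\sigma^2_{\rho\gamma}$

MSBC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{8}\,\frac{\sum_{j=1}^{2}\sum_{k=1}^{4}\beta\gamma_{jk}^2}{3} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 20\,\frac{\sum_{j=1}^{2}\sum_{k=1}^{4}\beta\gamma_{jk}^2}{3}$

MSC: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{4}\,\frac{\sum_{k=1}^{4}\gamma_k^2}{3} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 40\,\frac{\sum_{k=1}^{4}\gamma_k^2}{3}$

MSR: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{20}\sigma^2_{\rho} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 8\sigma^2_{\rho}$

MSB: $\frac{160}{160}\sigma^2 + \frac{160}{80}\sigma^2_{\rho\gamma} + \frac{160}{20}\sigma^2_{\rho} + \frac{160}{2}\,\frac{\sum_{j=1}^{2}\beta_j^2}{1} = \sigma^2 + 2\sigma^2_{\rho\gamma} + 8\sigma^2_{\rho} + 80\,\frac{\sum_{j=1}^{2}\beta_j^2}{1}$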

A has a levels, B has b levels, C has c levels, n replications.

Notice the pattern[1] of the integer multiplier for regular models like these:

Term   Multiplier
A      nbc
B      nac
C      nab
AB     nc
AC     nb
BC     na
ABC    n

The multiplier is the product of n and the numbers of levels of the factors not in the term.

[1] Also notice the pattern that computing EMS is pretty damn tedious.

F Tests

In the old school approach, we test a null hypothesis such as $\sigma^2_\alpha = 0$ or $\sum_j \beta_j^2 = 0$ by

Finding two MSs with EMSs that differ by a multiple of the item of interest.

Computing the F ratio of those two MSs and using the df for the two MSs to find a p-value.

I asserted that as if it were always possible; it is not always possible.

A fixed, B random, crossed (first diagram)

Item                        Num. MS   Den. MS
$\sum_i \alpha_i^2$         MSA       MSAB
$\sigma^2_\beta$            MSB       MSAB
$\sigma^2_{\alpha\beta}$    MSAB      MSE

No trouble here.

Fully nested design (third diagram)

Item                 Num. MS   Den. MS
$\sigma^2_\alpha$    MSA       MSB
$\sigma^2_\beta$     MSB       MSC
$\sigma^2_\gamma$    MSC       MSE

No trouble here.

Cheese raters

Item                               Num. MS   Den. MS
$\sum_j \beta_j^2$                 MSB       MSR
$\sum_k \gamma_k^2$                MSC       MSRC
$\sum_{j,k} \beta\gamma_{jk}^2$    MSBC      MSRC
$\sigma^2_\rho$                    MSR       MSRC
$\sigma^2_{\rho\gamma}$            MSRC      MSE

No trouble here.

And that brings us to the second diagram.

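Item                             Num. MS   Den. MS
$\sum_k \gamma_k^2$              none      none
$\sigma^2_\beta$                 none      none
$\sigma^2_\alpha$                none      none
$\sigma^2_{\alpha\beta}$         MSAB      MSABC
$\sigma^2_{\alpha\gamma}$        MSAC      MSABC
$\sigma^2_{\beta\gamma}$         MSBC      MSABC
$\sigma^2_{\alpha\beta\gamma}$   MSABC     MSE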

In the second diagram/model, there are no ordinary F-tests for main effects!

Looking at the Hasse diagrams, the denominator for a term (using unrestricted model assumptions) is the first random term below the term of interest.

If there is more than one random term you can get to without going through another random term, then there is no exact test.

We will eventually get to approximate tests for these cases.
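The search for an exact denominator can be sketched mechanically: a valid denominator is an MS whose EMS equals the EMS of the term of interest with that term's own contribution removed. The dictionary layout below is an assumption made for illustration; the multipliers are the second-diagram EMS values from this lecture (A and B random, C fixed, all crossed).

```python
# EMS for the second diagram, written as {component: multiplier}.
ems = {
    "E":   {"E": 1},
    "ABC": {"E": 1, "ABC": 2},
    "AB":  {"E": 1, "ABC": 2, "AB": 4},
    "AC":  {"E": 1, "ABC": 2, "AC": 8},
    "BC":  {"E": 1, "ABC": 2, "BC": 10},
    "A":   {"E": 1, "ABC": 2, "AB": 4, "AC": 8, "A": 16},
    "B":   {"E": 1, "ABC": 2, "AB": 4, "BC": 10, "B": 20},
    "C":   {"E": 1, "ABC": 2, "AC": 8, "BC": 10, "C": 40},
}

def exact_denominator(term):
    """Return the MS whose EMS matches EMS(term) minus the term's own
    contribution, or None when no single MS works (no exact F test)."""
    target = {k: v for k, v in ems[term].items() if k != term}
    for other, parts in ems.items():
        if other != term and parts == target:
            return other
    return None

for term in ems:
    if term != "E":
        print(term, "->", exact_denominator(term))
# The interactions find a denominator (ABC or E); A, B, and C return None.
```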

Power for Fixed Effects

Recall that the noncentrality parameter controls power for fixed effects (along with degrees of freedom and the error rate).

The expected mean square for a fixed term has the form:

$$\text{random stuff} + \frac{N}{\text{superscript}} \times \frac{\text{sum of squared effects}}{\text{df}}$$

The noncentrality parameter is then

$$\frac{\frac{N}{\text{superscript}} \times \text{sum of squared effects}}{\text{random stuff}}$$

The random stuff goes to the denominator and the df disappears.

Try this approach on Chapter 7 problems . . . it works.

For testing A in the first diagram, the EMS is

$$\sigma^2 + 2\sigma^2_{\alpha\beta} + 8\,\frac{\sum_{i=1}^{5}\alpha_i^2}{4}$$

and the noncentrality parameter is

$$\frac{8\sum_{i=1}^{5}\alpha_i^2}{\sigma^2 + 2\sigma^2_{\alpha\beta}}$$

Note that you need to know, or make assumptions about, two variances to compute the NCP.

In more general form, the NCP for this test is

$$\frac{nb\sum_{i=1}^{5}\alpha_i^2}{\sigma^2 + n\sigma^2_{\alpha\beta}}$$

You cannot always make a noncentrality parameter in a mixed effects model arbitrarily large by increasing n.

For this problem, you need to increase b, the number of levels of the random term B, to make the power go to 1.
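A small numerical sketch makes the point about n versus b concrete. The function implements the general NCP formula above; the effect sizes and variances are made-up illustrative values, not numbers from the lecture.

```python
def ncp(n, b, sum_alpha_sq, sigma2, sigma2_ab):
    """NCP for testing fixed A against MSAB: n*b*sum(alpha^2) / (sigma^2 + n*sigma^2_ab)."""
    return n * b * sum_alpha_sq / (sigma2 + n * sigma2_ab)

# Assumed illustrative values (hypothetical, chosen only for the demonstration).
sum_alpha_sq, sigma2, sigma2_ab = 2.0, 1.0, 0.5

# Increasing n with b fixed at 4: the NCP levels off near b * sum_alpha_sq / sigma2_ab.
for n in (2, 20, 200, 2000):
    print("b=4, n=%4d: NCP = %.3f" % (n, ncp(n, 4, sum_alpha_sq, sigma2, sigma2_ab)))

limit = 4 * sum_alpha_sq / sigma2_ab
print("limit as n grows:", limit)

# Increasing b with n fixed at 2: the NCP grows without bound.
for b in (4, 40, 400):
    print("n=2, b=%3d: NCP = %.1f" % (b, ncp(2, b, sum_alpha_sq, sigma2, sigma2_ab)))
```

With these values the NCP approaches 16 as n grows, while it is proportional to b for fixed n. The denominator df, which also affects power, depend on b as well.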

Power for Random Effects

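Power for random effects is actually easier than power for fixed effects.

Suppose we want to test $H_0: \sigma^2_\eta = 0$.

We have two MS with $EMS_1 = \tau + k\sigma^2_\eta$ and $EMS_2 = \tau$.

The F test is $MS_1/MS_2$ with $\nu_1$ and $\nu_2$ df.

Under the null,

$$\frac{MS_1}{MS_2} \sim \frac{\tau + k \times 0}{\tau}\,F_{\nu_1,\nu_2} = F_{\nu_1,\nu_2}$$

Under the alternative,

$$\frac{MS_1}{MS_2} \sim \frac{\tau + k\sigma^2_\eta}{\tau}\,F_{\nu_1,\nu_2}$$

We reject $H_0$ if $MS_1/MS_2 > F_{\mathcal{E},\nu_1,\nu_2}$.

Looking at the distribution under the alternative, we reject when

$$\frac{\tau + k\sigma^2_\eta}{\tau}\,F_{\nu_1,\nu_2} > F_{\mathcal{E},\nu_1,\nu_2}$$

or, put another way, when

$$F_{\nu_1,\nu_2} > \frac{\tau}{\tau + k\sigma^2_\eta}\,F_{\mathcal{E},\nu_1,\nu_2}$$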

Consider testing $H_0: \sigma^2_{\alpha\gamma} = 0$ in the second model (fully crossed three-way design).

The test is MSAC/MSABC. The df are 4 and 12.

Test at $\mathcal{E} = .01$ assuming $\sigma^2 = 1$, $\sigma^2_{\alpha\beta\gamma} = 2$, and $\sigma^2_{\alpha\gamma} = .5$.

The EMS are:

$$EMS_{AC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma} = 9$$

$$EMS_{ABC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} = 5$$

$F_{.01,4,12} = 5.41$ and $5/9 \times 5.41 = 3.01$.

Power is the probability that F with 4 and 12 df is larger than 3.01, which is .062 (which is not much power).
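The calculation above can be reproduced with a short, standard-library-only Python sketch. The helper names (`f_sf`, `f_critical`) are made up for this illustration; the F tail probability is computed from the regularized incomplete beta function with a textbook continued-fraction routine, and the critical value is found by bisection. In practice a statistics package would supply both.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))         # even step
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))  # odd step
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _betai(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log(1.0 - x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b

def f_sf(f, d1, d2):
    """P(F_{d1,d2} > f)."""
    return _betai(d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f))

def f_critical(alpha, d1, d2, lo=0.0, hi=1e4):
    """Upper-alpha critical value of F_{d1,d2}, found by bisection."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f_sf(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Values from the example: sigma^2 = 1, sigma^2_abg = 2, sigma^2_ag = 0.5.
ems_ac = 1 + 2 * 2 + 8 * 0.5   # tau + k * sigma^2_eta = 9
ems_abc = 1 + 2 * 2            # tau = 5
f_crit = f_critical(0.01, 4, 12)
power = f_sf(ems_abc / ems_ac * f_crit, 4, 12)
print(round(f_crit, 2), round(power, 3))  # the lecture reports 5.41 and .062
```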

Approximate Tests

When there is no F test we can usually construct an approximate test.

We want to find a sum of two MS for the numerator and a sum of two MS for the denominator such that the sum of the EMS on the top is equal to the sum of the EMS on the bottom plus the term of interest.

On the Hasse diagram, if there are two random terms immediately below the term of interest, the bottom will be the sum of those two random terms, and the top will be the sum of the term of interest plus the term where the denominator terms “intersect.”

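For testing $\sigma^2_\alpha$ in our second example, the numerator MS are:

MSA, $EMS_A = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$

MSABC, $EMS_{ABC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma}$

The denominator MS are:

MSAB, $EMS_{AB} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta}$

MSAC, $EMS_{AC} = \sigma^2 + 2\sigma^2_{\alpha\beta\gamma} + 8\sigma^2_{\alpha\gamma}$

Numerator and denominator sums are:

$$EMS_A + EMS_{ABC} = 2\sigma^2 + 4\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma} + 16\sigma^2_{\alpha}$$

$$EMS_{AB} + EMS_{AC} = 2\sigma^2 + 4\sigma^2_{\alpha\beta\gamma} + 4\sigma^2_{\alpha\beta} + 8\sigma^2_{\alpha\gamma}$$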

The approximate test is a decent statistic; the problem is that it doesn’t follow an F distribution (or any other standard distribution).

What we do is treat it as if it follows an F distribution, and compute some approximate degrees of freedom separately for the numerator and denominator.

The df approximation is called the Satterthwaite approximation.

Suppose we’re trying to get an approximate df for $MS_1 + MS_2$ with $\nu_1$ and $\nu_2$ df.

Let $E_i = MS_i$ and $V_i = 2E_i^2/\nu_i$.

Then the approximate df for the sum is:

$$\frac{2(E_1 + E_2)^2}{V_1 + V_2}$$

For computing power, let $E_i = EMS_i$. (In fact, even for data $E_i$ is supposed to be $EMS_i$; we’re just using $MS_i$ to estimate $EMS_i$.)
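As a sketch of the df formula used for power, the following Python computes approximate numerator and denominator df for the test of $\sigma^2_\alpha$ in the second example. The values $\sigma^2 = 1$, $\sigma^2_{\alpha\beta\gamma} = 2$, and $\sigma^2_{\alpha\gamma} = .5$ are the ones used earlier in the lecture; $\sigma^2_{\alpha\beta} = 1$ and $\sigma^2_{\alpha} = 1$ are hypothetical values added here for illustration. The df (4 for A and AC, 12 for AB and ABC) follow from 5, 4, and 2 levels.

```python
def satterthwaite_df(e1, nu1, e2, nu2):
    """Approximate df for MS1 + MS2: 2 (E1 + E2)^2 / (V1 + V2), with V_i = 2 E_i^2 / nu_i."""
    v1 = 2.0 * e1 ** 2 / nu1
    v2 = 2.0 * e2 ** 2 / nu2
    return 2.0 * (e1 + e2) ** 2 / (v1 + v2)

# Variance components: the first three come from the lecture's earlier example,
# the last two are assumed (hypothetical) values.
s2, s2_abg, s2_ag = 1.0, 2.0, 0.5
s2_ab, s2_a = 1.0, 1.0

ems_a = s2 + 2 * s2_abg + 4 * s2_ab + 8 * s2_ag + 16 * s2_a   # 29
ems_abc = s2 + 2 * s2_abg                                     # 5
ems_ab = s2 + 2 * s2_abg + 4 * s2_ab                          # 9
ems_ac = s2 + 2 * s2_abg + 8 * s2_ag                          # 9

num_df = satterthwaite_df(ems_a, 4, ems_abc, 12)    # numerator: MSA + MSABC
den_df = satterthwaite_df(ems_ab, 12, ems_ac, 4)    # denominator: MSAB + MSAC
print(round(num_df, 2), round(den_df, 2))  # 5.44 12.0
```

The approximate df need not be integers; they are plugged into the F distribution as they are.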
