Single-Factor Studies
KNNL – Chapter 16
Single-Factor Models
• Independent Variable can be qualitative or quantitative
• If Quantitative, we typically assume a linear, polynomial, or no “structural” relation
• If Qualitative, we typically have no “structural” relation
• Balanced designs have equal numbers of replicates at each level of the independent variable
• When no structure is assumed, we refer to the models as “Analysis of Variance” models, and use indicator variables for the treatments in the regression model
Single-Factor ANOVA Model
• Model Assumptions for Model Testing
  • All probability distributions are normal
  • All probability distributions have equal variance
  • Responses are random samples from their probability distributions, and are independent
• Analysis Procedure
  • Test for differences among factor level means
  • Follow-up (post-hoc) comparisons among pairs or groups of factor level means
Cell Means Model
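$r$ = # of levels of the study factor
$n_i$ = # of replicates (cases, units) for the $i$th level of the study factor, $i = 1, \ldots, r$
$n_T = n_1 + \cdots + n_r$ = overall sample size (number of cases)

$$Y_{ij} = \mu_i + \varepsilon_{ij}, \qquad i = 1, \ldots, r; \quad j = 1, \ldots, n_i$$

$Y_{ij}$ = Response for the $j$th case within the $i$th level of the study factor
$\mu_i$ = Population mean for the $i$th level of the study factor
$\varepsilon_{ij} \sim NID(0, \sigma^2)$, where NID = Normally and Independently Distributed

$$E\{Y_{ij}\} = \mu_i, \qquad \sigma^2\{Y_{ij}\} = \sigma^2\{\varepsilon_{ij}\} = \sigma^2$$

The $Y_{ij}$ are independent $N(\mu_i, \sigma^2)$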
Cell Means Model – Regression Form
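Suppose $r = 3$ and $n_1 = n_2 = n_3 = 2$:

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}: \qquad
\begin{bmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \end{bmatrix} +
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \end{bmatrix},
\qquad \sigma^2\{\boldsymbol{\varepsilon}\} = \sigma^2 \mathbf{I}$$

$$E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} =
\begin{bmatrix} \mu_1 \\ \mu_1 \\ \mu_2 \\ \mu_2 \\ \mu_3 \\ \mu_3 \end{bmatrix},
\qquad \sigma^2\{\mathbf{Y}\} = \sigma^2 \mathbf{I}$$

$$\mathbf{X'X} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \qquad
\mathbf{X'Y} = \begin{bmatrix} Y_{11} + Y_{12} \\ Y_{21} + Y_{22} \\ Y_{31} + Y_{32} \end{bmatrix}, \qquad
(\mathbf{X'X})^{-1} = \begin{bmatrix} 0.5 & 0 & 0 \\ 0 & 0.5 & 0 \\ 0 & 0 & 0.5 \end{bmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X'X})^{-1}\mathbf{X'Y} =
\begin{bmatrix} (Y_{11} + Y_{12})/2 \\ (Y_{21} + Y_{22})/2 \\ (Y_{31} + Y_{32})/2 \end{bmatrix} =
\begin{bmatrix} \bar{Y}_{1\cdot} \\ \bar{Y}_{2\cdot} \\ \bar{Y}_{3\cdot} \end{bmatrix} =
\begin{bmatrix} \hat{\mu}_1 \\ \hat{\mu}_2 \\ \hat{\mu}_3 \end{bmatrix}$$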
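The matrix calculation on this slide can be checked numerically. The following is a minimal sketch in plain Python; the six response values are made up for illustration and are not from the slides.

```python
# Hypothetical responses: r = 3 factor levels, n_i = 2 replicates each.
y = [10.0, 12.0, 20.0, 22.0, 30.0, 32.0]
level = [0, 0, 1, 1, 2, 2]  # 0-based factor level of each case
r = 3
n_cases = len(y)

# Indicator design matrix with no intercept: X[k][i] = 1 if case k is in level i.
X = [[1.0 if level[k] == i else 0.0 for i in range(r)] for k in range(n_cases)]

# Normal-equation pieces X'X (r x r) and X'Y (length r).
xtx = [[sum(X[k][a] * X[k][b] for k in range(n_cases)) for b in range(r)]
       for a in range(r)]
xty = [sum(X[k][a] * y[k] for k in range(n_cases)) for a in range(r)]

# X'X is diagonal for this coding, so inverting it means taking reciprocals
# of the diagonal; beta_hat is then the vector of per-level sample means.
beta_hat = [xty[i] / xtx[i][i] for i in range(r)]

print(xtx)       # diagonal matrix holding the n_i
print(beta_hat)  # sample mean of each level
```

Because each case belongs to exactly one level, the columns of X are orthogonal, which is why X'X comes out diagonal and the estimates reduce to the level means.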
Model Interpretations
• Factor Level Means
  • Observational Studies – The $\mu_i$ represent the population means among units from the populations of factor levels
  • Experimental Studies – The $\mu_i$ represent the means of the various factor levels, had they been assigned to a population of experimental units
• Fixed and Random Factors
  • Fixed Factors – All levels of interest are observed in the study
  • Random Factors – Factor levels included in the study represent a sample from a population of factor levels
Fitting ANOVA Models
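Notation:

$$Y_{i\cdot} = \sum_{j=1}^{n_i} Y_{ij}, \qquad \bar{Y}_{i\cdot} = \frac{\sum_{j=1}^{n_i} Y_{ij}}{n_i} = \frac{Y_{i\cdot}}{n_i}$$

$$Y_{\cdot\cdot} = \sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}, \qquad \bar{Y}_{\cdot\cdot} = \frac{\sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}}{n_T} = \frac{Y_{\cdot\cdot}}{n_T} = \frac{\sum_{i=1}^{r} n_i \bar{Y}_{i\cdot}}{n_T}$$

Least Squares and Maximum Likelihood Estimation

Error Sum of Squares:

$$Q = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \mu_i)^2$$

$$\frac{\partial Q}{\partial \mu_k} = -2\sum_{j=1}^{n_k} (Y_{kj} - \mu_k), \qquad k = 1, \ldots, r$$

Setting $\partial Q / \partial \mu_k = 0$:

$$\sum_{j=1}^{n_k} Y_{kj} = n_k \hat{\mu}_k \quad \Rightarrow \quad \hat{\mu}_k = \frac{\sum_{j=1}^{n_k} Y_{kj}}{n_k} = \bar{Y}_{k\cdot}, \qquad k = 1, \ldots, r$$

Likelihood:

$$L(\mu_1, \ldots, \mu_r, \sigma^2 \mid Y_{11}, \ldots, Y_{r n_r}) = \frac{1}{(2\pi\sigma^2)^{n_T/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \mu_i)^2 \right]$$

Maximizing the likelihood with respect to $\mu_1, \ldots, \mu_r$ is equivalent to minimizing $Q$.

Fitted values: $\hat{Y}_{ij} = \hat{\mu}_i = \bar{Y}_{i\cdot}$
Residuals: $e_{ij} = Y_{ij} - \hat{Y}_{ij} = Y_{ij} - \bar{Y}_{i\cdot}$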
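As a numerical illustration of the least squares result on this slide, the sketch below fits the cell means as the per-level sample means, forms the residuals, and confirms that moving the estimates away from the sample means increases Q. The data and the helper names `fit_cell_means` and `error_ss` are hypothetical.

```python
def fit_cell_means(groups):
    """Return (means, residuals) for a list of per-level response lists."""
    means = [sum(g) / len(g) for g in groups]
    residuals = [[y - m for y in g] for g, m in zip(groups, means)]
    return means, residuals


def error_ss(groups, mus):
    """Q = sum over levels and cases of (Y_ij - mu_i)^2 for candidate mus."""
    return sum((y - mu) ** 2 for g, mu in zip(groups, mus) for y in g)


# Hypothetical unbalanced data: n_1 = 3, n_2 = 4, n_3 = 2.
groups = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0, 8.0], [5.0, 7.0]]

means, residuals = fit_cell_means(groups)
q_at_means = error_ss(groups, means)
# Shifting every estimate by 0.5 adds 0.25 * n_T to Q.
q_shifted = error_ss(groups, [m + 0.5 for m in means])

print(means)                   # least squares estimates of mu_i
print(q_at_means, q_shifted)   # Q is smaller at the sample means
```

The residuals sum to zero within each level, which is the normal equation for that level restated.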
Analysis of Variance
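$$\underbrace{Y_{ij} - \bar{Y}_{\cdot\cdot}}_{\text{Total deviation}} = \underbrace{(\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})}_{\text{Deviation of trt mean from overall mean}} + \underbrace{(Y_{ij} - \bar{Y}_{i\cdot})}_{\text{Deviation from trt mean (residual)}}$$

$$\sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2
\quad \text{since} \quad \sum_{i=1}^{r}\sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})(Y_{ij} - \bar{Y}_{i\cdot}) = 0$$

Total (Corrected) Sum of Squares:

$$SSTO = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2, \qquad df_{TO} = n_T - 1$$

Treatment Sum of Squares:

$$SSTR = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 = \sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, \qquad df_{TR} = r - 1$$

Error Sum of Squares:

$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2, \qquad df_{E} = n_T - r$$

Note: $SSTO = SSTR + SSE$ and $df_{TO} = df_{TR} + df_{E}$

Useful result:

$$s_i^2 = \frac{\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2}{n_i - 1} \quad \Rightarrow \quad (n_i - 1)s_i^2 = \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 \quad \Rightarrow \quad SSE = \sum_{i=1}^{r} (n_i - 1)s_i^2, \qquad df_E = \sum_{i=1}^{r} (n_i - 1) = n_T - r$$

Mean Squares:

$$MSTR = \frac{SSTR}{r - 1}, \qquad MSE = \frac{SSE}{n_T - r}$$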
ANOVA Table
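Source | df | SS | MS | E{MS}
Treatments | $r - 1$ | $SSTR = \sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2$ | $MSTR = \dfrac{SSTR}{r - 1}$ | $\sigma^2 + \dfrac{\sum_{i=1}^{r} n_i (\mu_i - \mu_\cdot)^2}{r - 1}$
Error | $n_T - r$ | $SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2$ | $MSE = \dfrac{SSE}{n_T - r}$ | $\sigma^2$
Total | $n_T - 1$ | $SSTO = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2$ | |

Note:

$$\mu_\cdot = \frac{\sum_{i=1}^{r} n_i \mu_i}{n_T}, \qquad SSTR = \sum_{i=1}^{r} n_i \bar{Y}_{i\cdot}^2 - n_T \bar{Y}_{\cdot\cdot}^2, \qquad SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}^2 - \sum_{i=1}^{r} n_i \bar{Y}_{i\cdot}^2$$

The expected mean squares follow from $E\{W^2\} = \sigma^2\{W\} + (E\{W\})^2$:

$$E\{Y_{ij}^2\} = \sigma^2 + \mu_i^2 \quad \Rightarrow \quad E\left\{\sum_{i=1}^{r}\sum_{j=1}^{n_i} Y_{ij}^2\right\} = n_T\sigma^2 + \sum_{i=1}^{r} n_i \mu_i^2$$

$$E\{\bar{Y}_{i\cdot}^2\} = \frac{\sigma^2}{n_i} + \mu_i^2 \quad \Rightarrow \quad E\left\{\sum_{i=1}^{r} n_i \bar{Y}_{i\cdot}^2\right\} = r\sigma^2 + \sum_{i=1}^{r} n_i \mu_i^2$$

$$E\{\bar{Y}_{\cdot\cdot}^2\} = \frac{\sigma^2}{n_T} + \mu_\cdot^2 \quad \Rightarrow \quad E\{n_T \bar{Y}_{\cdot\cdot}^2\} = \sigma^2 + n_T \mu_\cdot^2$$

$$\Rightarrow \quad E\{SSE\} = (n_T - r)\sigma^2, \qquad E\{SSTR\} = (r - 1)\sigma^2 + \sum_{i=1}^{r} n_i \mu_i^2 - n_T\mu_\cdot^2 = (r - 1)\sigma^2 + \sum_{i=1}^{r} n_i (\mu_i - \mu_\cdot)^2$$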
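The entries of the ANOVA table can be computed directly from the definitions. The sketch below does so in plain Python for a small made-up data set; the function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the slides.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for per-level response lists."""
    n = [len(g) for g in groups]
    r, n_t = len(groups), sum(n)
    means = [sum(g) / len(g) for g in groups]          # Ybar_i.
    grand = sum(sum(g) for g in groups) / n_t          # Ybar_..
    sstr = sum(ni * (m - grand) ** 2 for ni, m in zip(n, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ssto = sum((y - grand) ** 2 for g in groups for y in g)
    mstr = sstr / (r - 1)
    mse = sse / (n_t - r)
    return {
        "df_tr": r - 1, "df_e": n_t - r, "df_to": n_t - 1,
        "SSTR": sstr, "SSE": sse, "SSTO": ssto,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


# Hypothetical data: r = 3 levels with 3 replicates each.
table = one_way_anova([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]])
for name, value in table.items():
    print(name, value)
```

SSTO is computed independently here rather than as SSTR + SSE, so the additivity of the sums of squares serves as a check on the arithmetic.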
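F-Test for $H_0: \mu_1 = \cdots = \mu_r$

$$H_0: \mu_1 = \cdots = \mu_r \qquad H_A: \text{Not all } \mu_i \text{ are equal}$$

Test Statistic:

$$F^* = \frac{MSTR}{MSE}$$

Under the null hypothesis (and independence and normality of errors):

$$\frac{SSTR}{\sigma^2} \sim \chi^2_{r-1}, \qquad \frac{SSE}{\sigma^2} \sim \chi^2_{n_T - r}, \qquad \text{and they are independent (independent even if } H_0 \text{ is false)}$$

$$F^* = \frac{\left(\dfrac{SSTR}{\sigma^2}\right) \Big/ (r - 1)}{\left(\dfrac{SSE}{\sigma^2}\right) \Big/ (n_T - r)} = \frac{MSTR}{MSE} \overset{H_0}{\sim} F(r - 1,\; n_T - r)$$

Decision Rule: Reject $H_0$ if $F^* = \dfrac{MSTR}{MSE} \geq F(1 - \alpha;\; r - 1,\; n_T - r)$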
General Linear Test of Equal Means
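$H_0: \mu_1 = \cdots = \mu_r = \mu_c$ Common Mean (Reduced Model)
$H_A$: Not all $\mu_i$ are equal (Complete Model)

Reduced Model: $Y_{ij} = \mu_c + \varepsilon_{ij}$, with $\hat{\mu}_c = \bar{Y}_{\cdot\cdot}$ and $\hat{Y}_{ij} = \bar{Y}_{\cdot\cdot}$

$$SSE(R) = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \hat{Y}_{ij})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = SSTO, \qquad df_R = n_T - 1$$

Complete (Full) Model: $Y_{ij} = \mu_i + \varepsilon_{ij}$, with $\hat{\mu}_i = \bar{Y}_{i\cdot}$ and $\hat{Y}_{ij} = \bar{Y}_{i\cdot}$

$$SSE(F) = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \hat{Y}_{ij})^2 = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 = SSE, \qquad df_F = n_T - r$$

Test Statistic:

$$F^* = \frac{\dfrac{SSE(R) - SSE(F)}{df_R - df_F}}{\dfrac{SSE(F)}{df_F}} = \frac{\dfrac{SSTO - SSE}{(n_T - 1) - (n_T - r)}}{\dfrac{SSE}{n_T - r}} = \frac{\dfrac{SSTR}{r - 1}}{\dfrac{SSE}{n_T - r}} = \frac{MSTR}{MSE}$$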
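To see the general linear test reduce to MSTR/MSE on actual numbers, the sketch below fits the reduced (common mean) and full (cell means) models and forms the test statistic from their error sums of squares. The data and the function name `general_linear_test` are hypothetical.

```python
def general_linear_test(groups):
    """Return (SSE_R, SSE_F, F*) comparing common-mean and cell-means models."""
    all_y = [y for g in groups for y in g]
    n_t, r = len(all_y), len(groups)

    # Reduced model: every case is fitted by the overall mean.
    grand = sum(all_y) / n_t
    sse_r = sum((y - grand) ** 2 for y in all_y)
    df_r = n_t - 1

    # Full model: each case is fitted by its own level mean.
    sse_f = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    df_f = n_t - r

    f_star = ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
    return sse_r, sse_f, f_star


# Hypothetical data: r = 3 levels with 3 replicates each.
sse_r, sse_f, f_star = general_linear_test(
    [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]
)
print(sse_r, sse_f, f_star)
```

For these values SSE(R) − SSE(F) is the treatment sum of squares, so the statistic matches the F* obtained from the ANOVA decomposition of the same data.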
Factor Effects Model
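Alternative Form of Model (Necessary for interactions in multi-factor models):

$$Y_{ij} = \mu_\cdot + \tau_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim NID(0, \sigma^2)$$

$\tau_i = \mu_i - \mu_\cdot$ = "Effect" of the $i$th factor level

Defining $\mu_\cdot$:

Unweighted Mean:

$$\mu_\cdot = \frac{\sum_{i=1}^{r} \mu_i}{r} \quad \Rightarrow \quad \sum_{i=1}^{r} \tau_i = 0$$

Weighted Mean:

$$\mu_\cdot = \sum_{i=1}^{r} w_i \mu_i \quad \text{s.t.} \quad \sum_{i=1}^{r} w_i = 1 \quad \Rightarrow \quad \sum_{i=1}^{r} w_i \tau_i = 0$$

Weights may represent the population sizes in observational studies

Note: $\mu_1 = \cdots = \mu_r \iff \tau_1 = \cdots = \tau_r = 0$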
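The two definitions of the overall mean lead to different effects for the same cell means. A small sketch follows, with made-up means and weights; `factor_effects` is a hypothetical helper name.

```python
def factor_effects(mus, weights=None):
    """Return (mu_dot, taus); unweighted mean when weights is None."""
    r = len(mus)
    if weights is None:
        weights = [1.0 / r] * r            # unweighted mean: w_i = 1/r
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    mu_dot = sum(w * m for w, m in zip(weights, mus))
    taus = [m - mu_dot for m in mus]       # tau_i = mu_i - mu_dot
    return mu_dot, taus


mus = [10.0, 20.0, 36.0]                   # hypothetical cell means
w = [0.5, 0.3, 0.2]                        # hypothetical weights

mu_u, tau_u = factor_effects(mus)
mu_w, tau_w = factor_effects(mus, w)

print(mu_u, tau_u, sum(tau_u))                               # effects sum to 0
print(mu_w, tau_w, sum(wi * t for wi, t in zip(w, tau_w)))   # weighted sum is 0
```

In both cases the constraint holds by construction, but the individual effects differ, so the choice of weights must be stated when effects are reported.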
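Regression Approach – Factor Effects Model

Suppose $r = 3$ and $n_1 = n_2 = n_3 = 2$ and Unweighted Mean Model: $\tau_1 + \tau_2 + \tau_3 = 0 \Rightarrow \tau_3 = -\tau_1 - \tau_2$

$$\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}: \qquad
\begin{bmatrix} Y_{11} \\ Y_{12} \\ Y_{21} \\ Y_{22} \\ Y_{31} \\ Y_{32} \end{bmatrix} =
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \\ 1 & -1 & -1 \\ 1 & -1 & -1 \end{bmatrix}
\begin{bmatrix} \mu_\cdot \\ \tau_1 \\ \tau_2 \end{bmatrix} +
\begin{bmatrix} \varepsilon_{11} \\ \varepsilon_{12} \\ \varepsilon_{21} \\ \varepsilon_{22} \\ \varepsilon_{31} \\ \varepsilon_{32} \end{bmatrix}$$

$$E\{\mathbf{Y}\} = \mathbf{X}\boldsymbol{\beta} =
\begin{bmatrix} \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot - \tau_1 - \tau_2 \\ \mu_\cdot - \tau_1 - \tau_2 \end{bmatrix} =
\begin{bmatrix} \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_1 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_2 \\ \mu_\cdot + \tau_3 \\ \mu_\cdot + \tau_3 \end{bmatrix}$$

$$\mathbf{X'X} = \begin{bmatrix} 6 & 0 & 0 \\ 0 & 4 & 2 \\ 0 & 2 & 4 \end{bmatrix}, \qquad
\mathbf{X'Y} = \begin{bmatrix} Y_{11} + Y_{12} + Y_{21} + Y_{22} + Y_{31} + Y_{32} \\ Y_{11} + Y_{12} - Y_{31} - Y_{32} \\ Y_{21} + Y_{22} - Y_{31} - Y_{32} \end{bmatrix}, \qquad
(\mathbf{X'X})^{-1} = \begin{bmatrix} 1/6 & 0 & 0 \\ 0 & 1/3 & -1/6 \\ 0 & -1/6 & 1/3 \end{bmatrix}$$

$$\hat{\boldsymbol{\beta}} = (\mathbf{X'X})^{-1}\mathbf{X'Y} =
\begin{bmatrix} \hat{\mu}_\cdot \\ \hat{\tau}_1 \\ \hat{\tau}_2 \end{bmatrix} =
\begin{bmatrix} \bar{Y}_{\cdot\cdot} \\ \bar{Y}_{1\cdot} - \bar{Y}_{\cdot\cdot} \\ \bar{Y}_{2\cdot} - \bar{Y}_{\cdot\cdot} \end{bmatrix}$$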
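The effect-coded regression for the r = 3, n = 2 case can be verified numerically. This sketch assumes the unweighted-mean constraint, so the third level is coded (1, −1, −1), and uses the inverse of X'X for this design, which has −1/6 in its off-diagonal positions. The response values are made up.

```python
# Hypothetical responses for r = 3 levels with 2 replicates each.
y = [10.0, 12.0, 20.0, 22.0, 30.0, 32.0]

# Effect coding: columns are intercept (mu_dot), tau_1, tau_2.
X = [
    [1, 1, 0], [1, 1, 0],
    [1, 0, 1], [1, 0, 1],
    [1, -1, -1], [1, -1, -1],
]

xtx = [[sum(row[a] * row[b] for row in X) for b in range(3)] for a in range(3)]
xty = [sum(row[a] * yk for row, yk in zip(X, y)) for a in range(3)]

# Inverse of X'X for this balanced design.
xtx_inv = [
    [1 / 6, 0.0, 0.0],
    [0.0, 1 / 3, -1 / 6],
    [0.0, -1 / 6, 1 / 3],
]

beta_hat = [sum(xtx_inv[a][b] * xty[b] for b in range(3)) for a in range(3)]
mu_hat, tau1_hat, tau2_hat = beta_hat
tau3_hat = -tau1_hat - tau2_hat  # recovered from the sum-to-zero constraint

print(xtx)
print(mu_hat, tau1_hat, tau2_hat, tau3_hat)
```

With level means 11, 21, and 31, the estimates come out as the overall mean and each level mean's deviation from it.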
Factor Effects Model with Weighted Mean
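Weights are relative sample sizes:

$$w_i = \frac{n_i}{n_T}$$

$$\sum_{i=1}^{r} w_i \tau_i = 0 \quad \Rightarrow \quad \sum_{i=1}^{r} \frac{n_i}{n_T}\tau_i = 0 \quad \Rightarrow \quad \sum_{i=1}^{r} n_i \tau_i = 0$$

$$\Rightarrow \quad n_r \tau_r = -\sum_{i=1}^{r-1} n_i \tau_i \quad \Rightarrow \quad \tau_r = -\frac{n_1}{n_r}\tau_1 - \cdots - \frac{n_{r-1}}{n_r}\tau_{r-1}$$

$$Y_{ij} = \mu_\cdot + \tau_1 X_{ij1} + \cdots + \tau_{r-1} X_{ij,r-1} + \varepsilon_{ij}$$

$$X_{ij1} = \begin{cases} 1 & \text{if } i = 1 \\ -\dfrac{n_1}{n_r} & \text{if } i = r \\ 0 & \text{otherwise} \end{cases}
\qquad \cdots \qquad
X_{ij,r-1} = \begin{cases} 1 & \text{if } i = r - 1 \\ -\dfrac{n_{r-1}}{n_r} & \text{if } i = r \\ 0 & \text{otherwise} \end{cases}$$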
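The weighted coding can be generated mechanically from the sample sizes. The sketch below builds the design rows for a made-up unbalanced layout and checks one consequence of the coding: each effect column sums to zero over all cases. The helper name `weighted_effects_row` is hypothetical.

```python
def weighted_effects_row(i, n):
    """Design row (1, X_ij1, ..., X_ij,r-1) for a case in level i (1-based)."""
    r = len(n)
    row = [1.0]                       # intercept column for mu_dot
    for k in range(1, r):             # columns for tau_1 .. tau_{r-1}
        if i == k:
            row.append(1.0)
        elif i == r:
            row.append(-n[k - 1] / n[r - 1])
        else:
            row.append(0.0)
    return row


n = [2, 3, 4]  # hypothetical sample sizes n_1, n_2, n_3

# One row per case, levels in order.
X = [weighted_effects_row(i, n)
     for i in range(1, len(n) + 1)
     for _ in range(n[i - 1])]

col_sums = [sum(row[c] for row in X) for c in range(len(n))]
print(X[-1])      # row for a case in the last level
print(col_sums)   # intercept column sums to n_T, effect columns to 0
```

Column k picks up n_k ones and n_r copies of −n_k/n_r, so it sums to zero and is orthogonal to the intercept column, which makes the intercept estimate the overall sample mean.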
Regression for Cell Means Model
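$$Y_{ij} = \mu_1 X_{ij1} + \cdots + \mu_r X_{ijr} + \varepsilon_{ij}$$

$$X_{ij1} = \begin{cases} 1 & \text{if } i = 1 \\ 0 & \text{if } i \neq 1 \end{cases}
\qquad \cdots \qquad
X_{ijr} = \begin{cases} 1 & \text{if } i = r \\ 0 & \text{if } i \neq r \end{cases}$$

When fitting with a regression package, no intercept is used

$$\boldsymbol{\beta} = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_r \end{bmatrix}, \qquad
\hat{\boldsymbol{\beta}} = \begin{bmatrix} \bar{Y}_{1\cdot} \\ \vdots \\ \bar{Y}_{r\cdot} \end{bmatrix}$$

Under $H_0: \mu_1 = \cdots = \mu_r = \mu_c$:

$$\mathbf{X} = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}, \qquad \boldsymbol{\beta} = \mu_c, \qquad \hat{\boldsymbol{\beta}} = \bar{Y}_{\cdot\cdot}$$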
Randomization (aka Permutation) Tests
• Treats the units in the study as a finite population of units, each with a fixed error term $\varepsilon_{ij}$
• When the randomization procedure assigns the unit to treatment $i$, we observe $Y_{ij} = \tau_i + \varepsilon_{ij}$
• When there are no treatment effects (all $\tau_i = 0$), $Y_{ij} = \varepsilon_{ij}$
• We can compute a test statistic, such as $F^*$, under all (or in practice, many) potential treatment arrangements of the observed units (responses)
• The p-value is the proportion of arrangements whose test statistic is as or more extreme than the one computed from the original arrangement
• Total number of potential permutations = $n_T!/(n_1! \cdots n_r!)$
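The procedure in the bullets above can be sketched as a Monte Carlo permutation test: reshuffle the observed responses across the treatment labels many times and compare each reshuffled F* with the observed one. The data, the number of reshuffles, the seed, and the function names are all illustrative assumptions.

```python
import math
import random


def f_statistic(values, sizes):
    """F* = MSTR / MSE for values split into consecutive groups of given sizes."""
    groups, start = [], 0
    for size in sizes:
        groups.append(values[start:start + size])
        start += size
    n_t, r = len(values), len(sizes)
    grand = sum(values) / n_t
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (sstr / (r - 1)) / (sse / (n_t - r))  # assumes SSE > 0


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Return (observed F*, proportion of reshuffles with F* >= observed)."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    values = [y for g in groups for y in g]
    observed = f_statistic(values, sizes)
    hits = 0
    for _ in range(n_perm):
        shuffled = values[:]
        rng.shuffle(shuffled)  # random reassignment of units to treatments
        if f_statistic(shuffled, sizes) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_perm


def n_arrangements(sizes):
    """Total number of distinct arrangements: n_T! / (n_1! ... n_r!)."""
    total = math.factorial(sum(sizes))
    for size in sizes:
        total //= math.factorial(size)
    return total


groups = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [5.0, 6.0, 7.0]]  # hypothetical
observed, p_value = permutation_p_value(groups)
print(observed, p_value, n_arrangements([3, 3, 3]))
```

Sampling reshuffles rather than enumerating every arrangement is the "in practice, many" route; with only 1680 arrangements in this toy layout, full enumeration would also be feasible.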
Power Approach to Sample Size Choice - Tables
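When the means are not all equal, the $F$-statistic is non-central $F$:

$$F^* \sim F(r - 1,\; n_T - r,\; \phi), \quad \text{where } \phi = \frac{1}{\sigma}\sqrt{\frac{\sum_{i=1}^{r} n_i (\mu_i - \mu_\cdot)^2}{r}}, \quad \text{where } \mu_\cdot = \frac{\sum_{i=1}^{r} n_i \mu_i}{n_T}$$

When all sample sizes are equal ($n_i = n$):

$$\phi = \frac{1}{\sigma}\sqrt{\frac{n \sum_{i=1}^{r} (\mu_i - \mu_\cdot)^2}{r}}, \quad \text{where } \mu_\cdot = \frac{\sum_{i=1}^{r} \mu_i}{r}$$

The power of the test, when conducted at the $\alpha$ significance level:

$$\text{Power} = \Pr\{F^* > F(1 - \alpha;\; r - 1,\; n_T - r) \mid \phi\}, \quad \text{See Table B.11}$$

Choose sample sizes so that the power is sufficiently high for specific mean levels of interest $\mu_1, \ldots, \mu_r$ or effects levels of interest $\tau_1, \ldots, \tau_r$

Table B.12 is simple to use for equal sample sizes and a specified range $\max(\mu_i) - \min(\mu_i)$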
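Before looking up power, the noncentrality parameter has to be computed from the means, sample sizes, and error standard deviation of interest. The sketch below returns it in two forms: the φ used to enter Table B.11, and the sum-of-squares form (Σ n_i(μ_i − μ.)²/σ²) that software such as R takes as its noncentrality argument. The input values and the function name `noncentrality` are hypothetical.

```python
import math


def noncentrality(mus, n, sigma):
    """Return (phi, ncp) for cell means mus, sample sizes n, error SD sigma."""
    r = len(mus)
    n_t = sum(n)
    mu_dot = sum(ni * mu for ni, mu in zip(n, mus)) / n_t   # weighted mean
    weighted_ss = sum(ni * (mu - mu_dot) ** 2 for ni, mu in zip(n, mus))
    phi = math.sqrt(weighted_ss / r) / sigma   # table-lookup form
    ncp = weighted_ss / sigma ** 2             # software (ncp) form
    return phi, ncp


# Hypothetical planning values: three treatments, 5 replicates each, sigma = 2.
phi, ncp = noncentrality([10.0, 12.0, 14.0], [5, 5, 5], 2.0)
print(phi, ncp)
```

The two forms are related by ncp = r·φ², so either can be recovered from the other once r is known.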
Power Approach to Sample Size Choice – R Code
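When the means are not all equal, the $F$-statistic is non-central $F$:

$$F^* \sim F(r - 1,\; n_T - r,\; \phi), \quad \text{where } \phi = \frac{\sum_{i=1}^{r} n_i (\mu_i - \mu_\cdot)^2}{\sigma^2}, \quad \text{where } \mu_\cdot = \frac{\sum_{i=1}^{r} n_i \mu_i}{n_T}$$

When all sample sizes are equal ($n_i = n$):

$$\phi = \frac{n \sum_{i=1}^{r} (\mu_i - \mu_\cdot)^2}{\sigma^2}, \quad \text{where } \mu_\cdot = \frac{\sum_{i=1}^{r} \mu_i}{r}$$

Here $\phi$ is on the scale of R's noncentrality argument; it equals $r$ times the square of the $\phi$ tabulated in Table B.11.

The power of the test, when conducted at the $\alpha$ significance level:

$$\text{Power} = \Pr\{F^* > F(1 - \alpha;\; r - 1,\; n_T - r) \mid F^* \sim F(r - 1,\; n_T - r,\; \phi)\}$$

In R:

$F(1 - \alpha;\; r - 1,\; n_T - r)$ = qf(1 − α, r − 1, n_T − r)
Power = 1 − pf(qf(1 − α, r − 1, n_T − r), r − 1, n_T − r, φ)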
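The slide computes power from the non-central F distribution in R. Python's standard library has no non-central F functions, so the sketch below swaps in a different technique: it estimates the same power by simulation, generating normal data under the null to get the critical value and under the chosen means to get the rejection rate. The planning values, simulation size, seed, and function names are assumptions for illustration.

```python
import random


def f_star(groups):
    """F* = MSTR / MSE for a list of per-level response lists."""
    n_t, r = sum(len(g) for g in groups), len(groups)
    grand = sum(sum(g) for g in groups) / n_t
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (sstr / (r - 1)) / (sse / (n_t - r))


def simulated_power(mus, n, sigma, alpha=0.05, n_sim=4000, seed=1):
    """Monte Carlo estimate of (critical value, power) for the one-way F-test."""
    rng = random.Random(seed)

    def draw(means):
        data = [[rng.gauss(m, sigma) for _ in range(ni)]
                for m, ni in zip(means, n)]
        return f_star(data)

    # Empirical (1 - alpha) quantile of F* when all means are equal.
    null_stats = sorted(draw([0.0] * len(mus)) for _ in range(n_sim))
    critical = null_stats[round((1 - alpha) * n_sim) - 1]

    # Share of simulated data sets under the stated means that reject H0.
    rejections = sum(draw(mus) > critical for _ in range(n_sim))
    return critical, rejections / n_sim


# Hypothetical planning values: means 10, 12, 14; 5 replicates each; sigma = 2.
critical, power = simulated_power([10.0, 12.0, 14.0], [5, 5, 5], 2.0)
print(critical, power)
```

The simulated critical value approximates what qf(1 − α, r − 1, n_T − r) returns exactly, so the estimate carries Monte Carlo error in both steps; the R expression on the slide is the direct calculation.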
Power Approach to Finding “Best” Treatment
Goal: Determining the best treatment (one with highest or lowest mean):
$1 - \alpha$ = Probability the treatment with highest (lowest) sample mean has highest (lowest) population mean
$\lambda$ = Difference between highest (lowest) mean and 2nd highest (lowest) mean
$r$ = Number of treatments
Table B.13 gives $\lambda\sqrt{n}/\sigma$ for various $1 - \alpha$, $r$
Solve for $n$ for given $\lambda$, $\sigma$
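Once the tabled value of λ√n/σ has been read off for the desired probability and number of treatments, solving for n is a one-line rearrangement. The sketch below assumes the tabled value is supplied by the user; the 2.0 in the example is a placeholder, not an actual Table B.13 entry, and the function name is hypothetical.

```python
import math


def replicates_for_best_treatment(table_value, lam, sigma):
    """Smallest integer n with lam * sqrt(n) / sigma >= table_value."""
    return math.ceil((table_value * sigma / lam) ** 2)


# Placeholder inputs: tabled value 2.0 (not a real table entry),
# lambda = 1.0 (difference to detect), sigma = 3.0.
n_per_treatment = replicates_for_best_treatment(
    table_value=2.0, lam=1.0, sigma=3.0
)
print(n_per_treatment)
```

Rounding up keeps the achieved probability at or above the target; halving λ quadruples the required n because n grows with the square of σ/λ.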