Principles in the design of multiphase experiments with a later
laboratory phase: orthogonal designs Chris Brien 1, Bronwyn Harch
2, Ray Correll 2 & Rosemary Bailey 3 1 University of South
Australia, 2 CSIRO Mathematics, Informatics & Statistics, 3
Queen Mary University of London [email protected]
http://chris.brien.name/multitier
Slide 2
Outline 1.Primary experimental design principles.
2.Factor-allocation description for standard designs. 3.Single set
description. 4.Principles for simple multiphase experiments.
5.Principles leading to complications, even with orthogonality.
6.Summary. 2
Slide 3
1)Primary experimental design principles Principle 1 (Evaluate
designs with skeleton ANOVA tables) Use whether or not data to be
analyzed by ANOVA. Principle 2 (Fundamentals): Use randomization,
replication and blocking (local control). Principle 3 (Minimize
variance): Block entities to form new entities, within new entities
being more homogeneous; assign treatments to least variable
entity-type. Principle 4 (Split units): confound some treatment
sources with more variable sources if some treatment factors: i.
require larger units than others, ii. are expected to have a larger
effect, or iii. are of less interest than others. 3
Slide 4
A standard athlete training example 9 training conditions
combinations of 3 surfaces and 3 intensities of training to be
investigated. Assume the prime interest is in surface differences
intensities are only included to observe the surfaces over a range
of intensities. Testing is to be conducted over 4 Months: In each
month, 3 endurance athletes are to be recruited. Each athlete will
undergo 3 tests, separated by 7 days, under 3 different training
conditions. On completion of each test, the heart rate of the
athlete will be measured. Randomize 3 intensities to 3 athletes in
a month and 3 surfaces to 3 tests in an athlete. A split-unit
design, employing Principles 2, 3 and 4(iii). 4 Peeling et al.
(2009)
Slide 5
2) Factor-allocation description for standard designs Standard
designs involve a single allocation in which a set of treatments is
assigned to a set of units: treatments are whatever are allocated;
units are what treatments are allocated to; treatments and units
each referred to as a set of objects; Often do by randomization
using a permutation of the units. More generally treatments are
allocated to units e.g. using a spatial design or systematically
Each set of objects is indexed by a set of factors: Unit or
unallocated factors (indexing units); Treatment or allocated
factors (indexing treatments). Represent the allocation using
factor-allocation diagrams that have a panel for each set of
objects with: a list of the factors; their numbers of levels; their
nesting relationships. 5 (Nelder, 1965; Brien, 1983; Brien &
Bailey, 2006)
Slide 6
Factor-allocation diagram for the standard athlete training
experiment 6 One allocation (randomization): a set of training
conditions to a set of tests. 3Intensities 3Surfaces 9 training
conditions 4Months 3Athletes in M 3Tests in M, A 36 tests The set
of factors belonging to a set of objects forms a tier: they have
the same status in the allocation (randomization): {Intensities,
Surfaces} or {Months, Athletes, Tests} Textbook experiments are
two-tiered. A crucial feature is that diagram automatically shows
EU and restrictions on randomization/allocation.
Slide 7
Some derived items Sets of generalized factors (terms in the
mixed model): Months, Months Athletes, Months Athletes Tests;
Intensities, Surfaces, Intensities Surfaces. Corresponding types of
entities (groupings of objects): month, athlete, test (last two are
Eus); intensity, surface, training condition (intensity-surface
combination). Corresponding sources (in an ANOVA): Months,
Athletes[M], Tests[M A]; Intensities, Surfaces,
Intensities#Surfaces. 7 3Intensities 3Surfaces 9 training
conditions 4Months 3Athletes in M 3Tests in M, A 36 tests
Slide 8
Skeleton ANOVA Intensities is confounded with the more-variable
Athletes[M] & Surfaces with Tests[M^A]. 8 3Intensities
3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M,
A 36 tests
Slide 9
Mixed model Take generalized factors derived from
factor-allocation diagram and assign to either fixed or random
model: Intensities + Surfaces + Intensities Surfaces | Months +
Months Athletes + Months Athletes Tests Corresponds to the mixed
model: Y = X I I + X S S + X I S I S + Z M u M + Z M A u M A + Z M
A T u M A T. where the Xs and Zs are indicator variable matrices
for the generalized factors in its subscript, and s and us are
fixed and random parameters, respectively, with 9 This is an ANOVA
model, equivalent to the randomization model, and is also written:
Y = X I I + X S S + X I S I S + Z M u M + Z M A u M A + e. To fit
in SAS must set the DDFM option of MODEL to KENWARDROGER. Brien
& Demtrio (2009).
Slide 10
3)Single-set description Single set of factors that uniquely
indexes observations: {Months, Intensities, Surfaces} (Athletes and
Tests omitted). What are the EUs in the single-set approach? A set
of units that are indexed by Months-Intensities combinations and
another set by the Months-Intensities-Surfaces combinations. Of
course, Months-Intensities(-Surfaces) are not actual EUs, as
Intensities (Surfaces) are not randomized to those combinations.
They act as a proxy for the unnamed units. Mixed model is: I + S +
I S | M + M I + M I S. Previous model: I + S + I S | M + M A + M A
T. Former more economical as A and T not needed. In SAS, default
DDFM option of MODEL ( CONTAIN ) works. However, M A and M I are
different sources of variability: inherent variability vs
block-treatment interaction. This "trick" is confusing, false
economy and not always possible. 10 e.g. Searle, Casella &
McCulloch (1992); Littel et al. (2006).
Slide 11
Single-set description ANOVA Single set of factors that
uniquely indexes observations: {Months, Intensities, Surfaces} Use
factors to derive skeleton ANOVA 11 Confounding not exhibited, and
need E[MSq] to see that M#I is denominator for Intensities.
Slide 12
Summary of factor-allocation versus single-set description
Factor-allocation description is based on the tiers and so is
multi-set. It has a specific factor for the EUs and so their
identity not obscured. Single-set description factors are a subset
of those identified in the factor-allocation description and so
more economical. Skeleton ANOVA from factor-allocation description
shows: The confounding resulting from the allocation; The origin of
the sources of variation more accurately (Athletes[Months] versus
Months#Intensities). 12
Slide 13
4)Principles for simple multiphase experiments Suppose in the
athlete training experiment: in addition to heart rate taken
immediately upon completion of a test, the free haemoglobin is to
be measured using blood specimens taken from the athletes after
each test, and the specimens are transported to the laboratory for
analysis. The experiment is two phase: testing and laboratory
phases. The outcome of the testing phase is heart rate and a blood
specimen. The outcome of the laboratory phase is the free
haemoglobin. How to process the specimens from the first phase in
the laboratory phase? 13
Slide 14
Some principles Principle 5 (Simplicity desirable): assign
first-phase units to laboratory units so that each first-phase
source is confounded with a single laboratory source. Use composed
randomizations with an orthogonal design. Principle 6 (Preplan
all): if possible. Principle 7 (Allocate all and randomize in
laboratory): always allocate all treatment and unit factors and
randomize first-phase units and lab treatments. Principle 8 (Big
with big): Confound big first-phase sources with big laboratory
sources, provided no confounding of treatment with first-phase
sources. 14
Slide 15
A simple two-phase athlete training experiment Simplest is to
randomize specimens from a test to locations (in time or space)
during the laboratory phase. 15 3Intensities 3Surfaces 9 training
conditions 4Months 3Athletes in M 3Tests in M, A 36 tests
36Locations 36 locations Composed randomizations
Slide 16
A simple two-phase athlete training experiment (contd) No.
tests = no. locations = 36 and so tests sources exhaust the
locations source. Cannot separately estimate locations and tests
variability, but can estimate their sum. But do not want to hold
blood specimens for 4 months. 16
Slide 17
A simple two-phase athlete training experiment (contd) Simplest
is to align lab-phase and first-phase blocking. 17 3Intensities
3Surfaces 9 training conditions 4Months 3Athletes in M 3Tests in M,
A 36 tests 4Batches 9Locations in B 36 locations Note Months
confounded with Batches (i.e. Big with Big). Composed
randomizations
Slide 18
A simple two-phase athlete training experiment mixed model Form
generalized factors and assign to fixed or random: I + S + I S | M
+ M A + M A T + B + B L. ANOVA shows us i) there will be aliasing
and ii) model without lab terms will fit and be sufficient a model
of convenience. 18 3Intensities 3Surfaces 9 training conditions
4Months 3Athletes in M 3Tests in M, A 36 tests 4Batches 9Locations
in B 36 locations
Slide 19
The multiphase law DF for sources from a previous phase can
never be increased as a result of the laboratory-phase design.
However, it is possible that first-phase sources are split into two
or more sources, each with fewer degrees of freedom than the
original source. 19 DF for first phase sources unaffected.
Slide 20
Factor-allocation in multiphase experiments While multiphase
experiments will often involve multiple allocations, not always: A
two-phase experiment will not if the first phase involves a survey
i.e. no allocation e.g. tissues sampled from animals of different
sexes. A two-phase experiment may include more than two
allocations: e.g. a grazing trial in the first phase that involves
two composed randomizations. Factor-allocation description is
particularly helpful in understanding multiphase experiments with
multiple allocations. 20
Slide 21
5)Principles leading to complications, even with orthogonality
Principle 9 (Use pseudofactors): An elegant way to split sources
(as opposed to introducing grouping factors unconnected to real
sources of variability). Principle 10 (Compensating across phases):
Sometimes, if something is confounded with more variable first-
phase source, can confound with less variable lab source. Principle
11 (Laboratory replication): Replicate laboratory analysis of
first-phase units if lab variability much greater than 1 st -phase
variation; Often involves splitting product from the first phase
into portions (e.g. batches of harvested crop, wines, blood
specimens into aliquots, drops, lots, samples and fractions).
Principle 12 (Laboratory treatments): Sometimes treatments are
introduced in the laboratory phase and this involves extra
randomization. 21
Slide 22
A replicated two-phase athlete training experiment Suppose
duplicates of free haemoglobin to be done: 2 fractions taken from
each specimen; One fraction taken from 9 specimens for the month
and analyzed; Then, the other fraction from 9 specimens analyzed.
22 3Intensities 3Surfaces 9 training conditions 4Months 2Fractions
in M, A, T 3Athletes in M 3Tests in M, A 72 fractions 4Batches
2Rounds in B 9Locations in B, R 72 locations 2F 1 in M Problem: 18
fractions in a month to assign to 2 rounds in a batch: Use F 1 to
group the 9 fractions to be analyzed in the same round. An
alternative is to introduce the grouping factor FGroups, but, while
in the analysis, not an anticipated variability source.
Slide 23
A replicated two-phase athlete training experiment (contd) 23
3Intensities 3Surfaces 9 training conditions 4Months 2Fractions in
M, A, T 3Athletes in M 3Tests in M, A 72 fractions 2F 1 in M Note
split source 4Batches 2Rounds in B 9Locations in B, R 72
locations
Slide 24
A replicated two-phase athlete training experiment (contd) 24
Last line measures laboratory variation. Again, no. fractions = no.
locations = 72 and so fractions sources exhaust locations sources.
Consequently, not all terms could be included in a mixed model.
Pseudofactors not needed in mixed model (cf. grouping
factors).
Slide 25
A replicated two-phase athlete training experiment mixed model
25 Form generalized factors and assign to fixed or random: I + S +
I S | M + M A + M A T + M A T F + B + B R + B R L. Removing aliased
terms gives one model of convenience: I + S + I S | M A + M A T + B
+ B R + B R L (M, M A T F omitted). Another model would result from
removing B and B R L (need B R).
Slide 26
Single-set description for example Single set of factors that
uniquely indexes observations: {4 Months x 2 Rounds x 3 Intensities
x 3 Surfaces}. 26 3Intensities 3Surfaces 9 training conditions
4Months 2Fractions in M, A, T 3Athletes in M 3Tests in M, A 72
fractions 2F 1 in M 4Batches 2Rounds in B 9Locations in B, R 72
locations How do: the locations sources; Fractions; affect the
response? Similar to model of convenience
Slide 27
27 6)Summary Factor-allocation, rather than single-set,
description used, being more informative and particularly helpful
in multiphase experiments. Multiphase experiments usually have
multiple allocations. Have provided 4 standard principles and 8
principles specific to orthogonal, multiphase designs. In practice,
will be important to have some idea of likely sources of laboratory
variation. Are laboratory treatments to be incorporated? Will
laboratory replicates be necessary?
Slide 28
28 References Brien, C. J. (1983). Analysis of variance tables
based on experimental structure. Biometrics, 39, 53-59. Brien,
C.J., and Bailey, R.A. (2006) Multiple randomizations (with
discussion). J. Roy. Statist. Soc., Ser. B, 68, 571609. Brien, C.J.
and Demtrio, C.G.B. (2009) Formulating mixed models for
experiments, including longitudinal experiments. J. Agr. Biol. Env.
Stat., 14, 253-80. Brien, C.J., Harch, B.D., Correll, R.L. and
Bailey, R.A. (2010) Multiphase experiments with laboratory phases
subsequent to the initial phase. I. Orthogonal designs. Journal of
Agricultural, Biological and Environmental Statistics, submitted.
Littell, R. C., G. A. Milliken, et al. (2006). SAS for Mixed
Models. Cary, N.C., SAS Press. Nelder, J. A. (1965). The analysis
of randomized experiments with orthogonal block structure.
Proceedings of the Royal Society of London, Series A, 283(1393),
147-162, 163-178. Peeling, P., B. Dawson, et al. (2009). Training
Surface and Intensity: Inflammation, Hemolysis, and Hepcidin
Expression. Medicine & Science in Sports & Exercise, 41,
1138-1145. Searle, S. R., G. Casella, et al. (1992). Variance
components. New York, Wiley.
Slide 29
29 Web address for link to Multitiered experiments site
http://chris.brien.name/multitier