1
Stat 232
Experimental Design
Spring 2008
2
Ching-Shui Cheng
Office: 419 Evans Hall
Phone: 642-9968
Email: [email protected]
Office Hours: Tu Th 2:00-3:00 and by appointment
3
Course webpage:
http://www.stat.berkeley.edu/~cheng/232.htm
4
No textbook
Recommended (for first half of the course):
Design of Comparative Experiments by R. A. Bailey, to appear in 2008
http://www.maths.qmul.ac.uk/~rab/DOEbook/
Experiments: Planning, Analysis, and Parameter Design Optimization by C. F. J. Wu and M. Hamada
Statistics for Experimenters: Design, Innovation and Discovery by Box, Hunter and Hunter
Useful software: GenStat
5
Experimental Design
Planning of experiments to produce valid information as efficiently as possible
6
Comparative Experiments
Treatments (varieties)
Varieties of grain, fertilizers, drugs, ….
Experimental units (plots): smallest division of the experimental material so that different units can receive different treatments
Plots, patients, ….
7
Design: How to assign the treatments to the experimental units
Fundamental difficulty: variability among the units; no two units are exactly the same.
Each unit can be assigned only one treatment.
Different responses may be observed even when the same treatment is assigned to different units.
Systematic assignments may lead to bias.
8
R. A. Fisher worked at the Rothamsted Experimental Station in the United Kingdom to evaluate the success of various fertilizer treatments.
9
Fisher found the data from experiments going on for decades to be basically worthless because of poor experimental design.
Fertilizer had been applied to a field one year and not in another in order to compare the yield of grain produced in the two years.
But it may have rained more, or been sunnier, in one year than in the other. The seeds used may have differed between years as well.
Or fertilizer was applied to one field and not to a nearby field in the same year.
But the fields might have had different soil, water, drainage, and history of previous use.
Too many factors affecting the results were “uncontrolled.”
10
Fisher’s solution: Randomization
In the same field and same year, apply fertilizer to randomly spaced plots within the field.
This averages out the effect of variation within the field in drainage and soil composition on yield, as well as controlling for weather, etc.
[Figure: map of the field with fertilizer (F) applied to randomly located plots]
11
Randomization prevents any particular treatment from
receiving more than its fair share of better units, thereby
eliminating potential systematic bias. Some treatments may
still get lucky, but if we assign many units to each treatment,
then the effects of chance will average out.
Replications
In addition to guarding against potential systematic biases,
randomization also provides a basis for doing statistical
inference.
(Randomization model)
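As a minimal sketch of how randomization can serve as a basis for inference, the Python block below runs an approximate randomization test for two treatments: the treatment labels are repeatedly re-shuffled over the units, and the observed difference in means is compared with the re-shuffled differences. The function name `randomization_test`, the number of re-shuffles, and the yield values are all made up for illustration; they do not come from the course.

```python
import random


def randomization_test(responses, labels, n_perm=5000, seed=0):
    """Approximate two-sided randomization p-value for a two-treatment comparison."""
    rng = random.Random(seed)

    def mean_diff(lab):
        treated = [y for y, l in zip(responses, lab) if l == 1]
        control = [y for y, l in zip(responses, lab) if l == 0]
        return sum(treated) / len(treated) - sum(control) / len(control)

    observed = mean_diff(labels)
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_perm):
        # Re-assign the labels as the randomization could have done.
        rng.shuffle(shuffled)
        if abs(mean_diff(shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical yields (illustration only): 6 fertilized plots (1), 6 unfertilized (0).
yields = [29.1, 30.4, 31.0, 28.7, 30.9, 29.8, 27.2, 26.9, 28.1, 27.5, 26.4, 27.9]
labels = [1] * 6 + [0] * 6
observed, p_value = randomization_test(yields, labels)
print(f"observed difference = {observed:.2f}, approximate p-value = {p_value:.4f}")
```

The p-value is the fraction of re-shuffled assignments that give a difference at least as large as the observed one, so it relies only on the random assignment itself and not on a distributional assumption for the units.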
12
F F F F F F F F F F F F
F F F F F F F F F F F F
F F F F F F F F F F F F
Start with an initial design
Randomly permute (labels of) the experimental units
Complete randomization: pick one of the 72! permutations at random
13
1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3 3 3
3 3 3 3 3 3 4 4 4 4 4 4
4 4 4 4 4 4 4 4 4 4 4 4
Pick one of the 72! permutations at random
4 treatments
Completely randomized design
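The procedure on this slide (start from a systematic initial design, then apply a uniformly random permutation of the 72 unit labels) can be sketched in Python as follows. The function name `completely_randomized_design` and the seed are illustrative assumptions.

```python
import random


def completely_randomized_design(n_units=72, n_treatments=4, seed=None):
    """Assign treatments 1..n_treatments to units by a random permutation."""
    if n_units % n_treatments != 0:
        raise ValueError("n_units must be a multiple of n_treatments")
    reps = n_units // n_treatments
    # Initial (systematic) design: 18 ones, then 18 twos, and so on.
    initial = [t for t in range(1, n_treatments + 1) for _ in range(reps)]
    rng = random.Random(seed)
    units = list(range(n_units))
    rng.shuffle(units)  # one of the n_units! permutations, chosen uniformly
    design = [None] * n_units
    for position, unit in enumerate(units):
        design[unit] = initial[position]
    return design


design = completely_randomized_design(seed=2008)
# Print the layout as 6 rows of 12 plots.
for row in range(6):
    print(" ".join(str(t) for t in design[12 * row: 12 * row + 12]))
```

Every treatment still appears on exactly 18 units; only the positions are random.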
14
Blocking
A disadvantage of complete randomization is that when variations among the experimental units are large, the treatment comparisons do not have good precision. Blocking is an effective way to reduce experimental error. The experimental units are divided into more homogeneous groups called blocks. Better precision can be achieved by comparing the treatments within blocks.
15
Randomized complete block design
After randomization:
16
Wine tasting
Four wines are tasted and evaluated by each of eight judges.
A unit is one tasting by one judge; judges are blocks. So there are eight blocks and 32 units.
Units within each judge are identified by order of tasting.
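A randomized complete block design for this example can be sketched by shuffling the four wines independently within each judge, so that every judge tastes every wine once in a random order. The wine names, the function name `randomized_complete_block`, and the seed are placeholders, not part of the original example.

```python
import random


def randomized_complete_block(treatments, n_blocks, seed=None):
    """Return one independently shuffled copy of the treatments per block."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # separate randomization inside each block
        plan.append(order)
    return plan


wines = ["wine 1", "wine 2", "wine 3", "wine 4"]  # placeholder labels
tasting_plan = randomized_complete_block(wines, n_blocks=8, seed=1)
for judge, order in enumerate(tasting_plan, start=1):
    print(f"judge {judge}: {', '.join(order)}")
```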
17
18
Block what you can and randomize what you cannot.
19
Randomization
Blocking
Replication
20
Incomplete block design
7 treatments
21
Each of ten housewives does four washloads in an experiment to compare five new detergents.
5 treatments and 10 blocks of size 4.
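With 5 treatments and blocks of size 4, each block omits exactly one treatment. One design that fits these numbers (an assumption for illustration; the slide does not say which design was used) takes each of the five 4-subsets twice. The Python sketch below builds it and counts replications and pairwise concurrences.

```python
from collections import Counter
from itertools import combinations

treatments = [1, 2, 3, 4, 5]  # the five detergents
# Assumed design: every 4-subset of the 5 treatments, used twice (10 blocks).
blocks = [tuple(b) for b in combinations(treatments, 4)] * 2

# How many times each treatment is used.
replications = Counter(t for b in blocks for t in b)
# How many blocks contain each unordered pair of treatments.
concurrences = Counter(pair for b in blocks for pair in combinations(b, 2))

print("blocks:", len(blocks))
print("replications:", dict(replications))
print("concurrences:", dict(concurrences))
```

In this assumed design every treatment is replicated 8 times and every pair of treatments occurs together in 6 blocks, so all pairwise comparisons are treated symmetrically.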
23
Incomplete block design
Balanced incomplete block design
Randomize by randomly permuting the block labels and independently permuting the unit labels within each block.
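The two-step randomization described above can be sketched in Python. The starting design here is an assumption chosen for illustration: a balanced incomplete block design for 7 treatments in 7 blocks of size 3, obtained by developing the initial block {0, 1, 3} modulo 7. It is not necessarily the design shown on the slides.

```python
import random


def randomize_block_design(blocks, seed=None):
    """Permute the block labels, then independently permute units within each block."""
    rng = random.Random(seed)
    randomized = [list(b) for b in blocks]
    rng.shuffle(randomized)      # step 1: random permutation of block labels
    for block in randomized:
        rng.shuffle(block)       # step 2: independent permutation inside each block
    return randomized


# Assumed initial design: cyclic development of {0, 1, 3} mod 7.
initial = [[(s + i) % 7 for s in (0, 1, 3)] for i in range(7)]
layout = randomize_block_design(initial, seed=7)
for label, block in enumerate(layout, start=1):
    print(f"block {label}: {block}")
```

The randomization changes which physical block and which unit receive each treatment, but the set of treatments appearing together in a block is preserved.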
24
Two simple block (unit) structures
Nesting
block/unit
Crossing
row * column
25
Two simple block structures
Nesting
block/unit
Crossing
row * column
Latin square
26
27
Wine tasting
28
Simple block structures
Iterated crossing and nesting
cover most, though not all, block structures encountered in practice
Nelder (1965)
29
Consumer testing
A consumer organization wishes to compare 8 brands of
vacuum cleaner. There is one sample for each brand.
Each of four housewives tests two cleaners in her home
for a week. To allow for housewife effects, each housewife
tests each cleaner and therefore takes part in the trial for 4
weeks.
8 treatments
Block structure:
30
A α B β C γ D δ
B γ A δ D α C β
C δ D γ A β B α
D β C α B δ A γ
Trojan square
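The array above superimposes a Latin square in the letters A to D on a Latin square in the letters α to δ. The short Python check below confirms that each square is Latin and that the two are orthogonal, meaning every Latin-Greek pair occurs exactly once. The function names are illustrative.

```python
# Rows of the two superimposed squares, read off the array above.
latin = ["ABCD", "BADC", "CDAB", "DCBA"]
greek = ["αβγδ", "γδαβ", "δγβα", "βαδγ"]


def is_latin_square(square):
    """Every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return len(symbols) == n and rows_ok and cols_ok


def are_orthogonal(first, second):
    """All n*n ordered pairs (first[i][j], second[i][j]) are distinct."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


print(is_latin_square(latin), is_latin_square(greek), are_orthogonal(latin, greek))
```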
31
Treatment structures
No structure
Treatments vs. control
Factorial structure
A fertilizer may be a combination of three factors (variables): N (nitrogen), P (phosphate), and K (potassium)
32
Treatment structure
Block structure (unit structure)
Design
Randomization
Analysis
33
Choice of design
Efficiency
Combinatorial considerations
Practical considerations
34
McLeod and Brewster (2004) Technometrics
A company was experiencing problems with one of its chrome-plating processes in that when a particular complex-shaped part was being plated, excessive pitting and cracking, as well as poor adhesion and uneven deposition of chrome across the part, were observed. With the goal being the identification of key factors affecting the quality of the process, a screening experiment was planned.
In collaboration with the company’s process engineers, six factors were identified for consideration in the experiment.
35
Hard-to-vary treatment factors
A: chrome concentration
B: chrome-to-sulfate ratio
C: bath temperature
Easy-to-vary treatment factors
p: etching current density
q: plating current density
r: part geometry
36
The responses included the numbers of pits and cracks, in addition to hardness and thickness readings at various locations on the part.
If each of the six factors has two levels, then there are 2^6 = 64 treatments.
A complete factorial design needs 64 experimental runs.
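The count of 64 can be checked by enumerating every combination of levels of the six two-level factors. In this Python sketch the levels are coded 0 and 1, which is a coding assumption made for illustration.

```python
from itertools import product

factors = ["A", "B", "C", "p", "q", "r"]
# One dictionary per treatment: a level (0 or 1) for each of the six factors.
treatments = [
    dict(zip(factors, levels))
    for levels in product((0, 1), repeat=len(factors))
]

print(len(treatments))   # number of treatment combinations
print(treatments[0])     # all factors at level 0
print(treatments[-1])    # all factors at level 1
```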
37
Block structure: 4 weeks/4 days/2 runs
Treatment structure: A * B * C * p * q * r
Each of the six factors has two levels
Fractional factorial design
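The block structure provides 4 × 4 × 2 = 32 runs, half of the 64 treatment combinations, so a half fraction is needed. As a rough illustration only, the Python sketch below selects a half fraction using the defining word ABCpqr, keeping the combinations whose 0/1 levels sum to an even number. This defining word is an assumption for the example and is not claimed to be the fraction used by McLeod and Brewster.

```python
from itertools import product

factors = ["A", "B", "C", "p", "q", "r"]

# Assumed defining relation I = ABCpqr: keep runs with an even number of 1s.
half_fraction = [
    dict(zip(factors, levels))
    for levels in product((0, 1), repeat=len(factors))
    if sum(levels) % 2 == 0
]

n_units = 4 * 4 * 2  # weeks x days x runs
print(len(half_fraction), n_units)

# Each factor appears at each level in half of the selected runs.
for f in factors:
    print(f, sum(run[f] for run in half_fraction))
```

Which fraction to choose, and how to allocate hard-to-vary and easy-to-vary factors to weeks, days, and runs, is the design question the course goes on to address.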
38
Miller (1997) Technometrics
Experimental objective: Investigate methods of
reducing the wrinkling of clothes being laundered
39
Miller (1997)
The experiment is run in 2 blocks and employs 4 washers and 4 dryers. Sets of cloth samples are run through the washers, and the samples are divided into groups such that each group contains exactly one sample from each washer. Each group of samples is then assigned to one of the dryers. Once dried, the extent of wrinkling on each sample is evaluated.
40
Treatment structure:
A, B, C, D, E, F: configurations of washers
a, b, c, d: configurations of dryers
41
Block structure: 2 blocks/(4 washers * 4 dryers)
42
Block 1 Block 2 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 10 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 1 0 1 0 0 0 0 0 0 0 1 1 1 1 0 0 0 1 1 1 1 0 0 10 1 1 0 1 1 0 0 0 0 0 1 1 1 0 0 0 1 1 00 1 1 0 1 1 0 0 1 1 0 1 1 1 0 0 0 1 0 10 1 1 0 1 1 1 1 0 0 0 1 1 1 0 0 1 0 1 00 1 1 0 1 1 1 1 1 1 0 1 1 1 0 0 1 0 0 11 0 1 1 0 1 0 0 0 0 1 0 1 0 1 0 0 1 1 01 0 1 1 0 1 0 0 1 1 1 0 1 0 1 0 0 1 0 11 0 1 1 0 1 1 1 0 0 1 0 1 0 1 0 1 0 1 01 0 1 1 0 1 1 1 1 1 1 0 1 0 1 0 1 0 0 11 1 0 1 1 0 0 0 0 0 1 1 0 0 0 1 0 1 1 0 1 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 0 1 0 1 1 1 0 1 1 0 1 1 0 0 1 1 0 0 0 1 1 0 1 01 1 0 1 1 0 1 1 1 1 1 1 0 0 0 1 1 0 0 1
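The table has a product structure. Within each block, four settings of the first six columns are crossed with four settings of the last four columns, which matches the 4 washers * 4 dryers block structure. The Python sketch below rebuilds the 32 runs from those settings and checks some basic properties. Reading the first six columns as the washer factors A to F and the last four as the dryer factors a to d is an assumption based on the order in which the factors are listed.

```python
from itertools import product

# Settings read off the table; the column interpretation is an assumption.
washers = {
    1: ["000000", "011011", "101101", "110110"],
    2: ["000111", "011100", "101010", "110001"],
}
dryers = {
    1: ["0000", "0011", "1100", "1111"],
    2: ["0110", "0101", "1010", "1001"],
}

# Each block crosses its four washer settings with its four dryer settings.
runs = [
    (block, w + d)
    for block in (1, 2)
    for w, d in product(washers[block], dryers[block])
]

n_runs = len(runs)
n_distinct = len({combo for _, combo in runs})
# Number of runs at level 1 for each of the ten two-level factors.
level_one_counts = [sum(int(combo[k]) for _, combo in runs) for k in range(10)]

print(n_runs, n_distinct, level_one_counts)
```

All 32 treatment combinations are distinct, and each of the ten factors is at each level in half of the runs.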
43
GenStat code
factor [nvalue=32;levels=2] block,A,B,C,D,E,F,a,b,c,d
& [levels=4] wash, dryer
generate block,wash,dryer
blockstructure block/(wash*dryer)
treatmentstructure
(A+B+C+D+E+F)*(A+B+C+D+E+F)
+(a+b+c+d)*(a+b+c+d)
+(A+B+C+D+E+F)*(a+b+c+d)
44
matrix [rows=10; columns=5; values=" b r1 r2 c1 c2"
0, 0, 1, 0, 0,
0, 1, 0, 0, 0,
0, 1, 1, 0, 0,
1, 0, 1, 0, 0,
1, 1, 0, 0, 0,
1, 1, 1, 0, 0,
0, 0, 0, 0, 1,
1, 0, 0, 0, 1,
1, 0, 0, 1, 0,
0, 0, 0, 1, 0] Mkey
45
Akey [blockfactors=block,wash,dryer; Key=Mkey; rowprimes=!(10(2)); colprimes=!(5(2)); colmappings=!(1,2,2,3,3)]
Pdesign
Arandom [blocks=block/(wash*dryer); seed=12345]
PDESIGN
ANOVA
46
Outline
Introduction; randomization and blocking
Some mathematical preliminaries
Linear models
Block structures; strata, null ANOVA
Computation of estimates; ANOVA table
Orthogonal designs
Non-orthogonal designs
Factorial designs
Response surface methodology
Other topics as time permits