Factorial Experiments: Blocking, Confounding, and Fractional Factorial Designs
Wednesday, July 30, 2014, 4:30 pm – 6:30 pm, 1020 Torgersen Hall
Emanuel Msemo
ABOUT THE INSTRUCTOR
Graduate student in Virginia Tech Department of Statistics
B.A. ECONOMICS AND STATISTICS (UDSM, TANZANIA)
MSc. STATISTICS (VT, USA)
LEAD/ASSOCIATE COLLABORATOR IN LISA
“If your experiment needs a statistician, you need a better experiment.” Ernest Rutherford
MORE ABOUT LISA
What? Laboratory for Interdisciplinary Statistical Analysis
Why? Mission: to provide statistical advice, analysis, and education to Virginia Tech researchers
How? Collaboration requests, Walk-in Consulting, Short Courses
Where? Walk-in Consulting in GLC and various other locations. Collaboration meetings are typically held in Sandy 312
Who? Graduate students and faculty members in the VT statistics department
www.lisa.vt.edu
HOW TO SUBMIT A COLLABORATION REQUEST
Go to www.lisa.stat.vt.edu
Click the link for “Collaboration Request Form”
Sign into the website using your VT PID and password
Enter your information (email, college, etc.)
Describe your project (project title, research goals, specific research questions, whether you have already collected data, special requests, etc.)
Contact the assigned LISA collaborators as soon as possible to schedule a meeting
LISA helps VT researchers benefit from the use of Statistics
Short Courses: Designed to help graduate students apply statistics in their research
Walk-In Consulting: M-F 1-3 PM, GLC Video Conference Room; 11 AM-1 PM, Old Security Building Room 103. For questions requiring <30 mins
All services are FREE for VT researchers. We assist with research—not class projects or homework.
Collaboration:
Visit our website to request personalized statistical advice and assistance with:
Experimental Design • Data Analysis • Interpreting Results
Grant Proposals • Software (R, SAS, JMP, SPSS...)
LISA statistical collaborators aim to explain concepts in ways useful for your research.
Great advice right now: Meet with LISA before collecting your data.
COURSE CONTENTS:
1. INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS
1.1 Introduction
1.2 Basic Principles
1.3 Some standard experimental designs
2. INTRODUCTION TO FACTORIAL DESIGNS
2.1 Basic Definitions and Principles
2.2 The advantage of factorials
2.3 The two-factor factorial design
2.4 The general factorial design
2.5 Blocking in a factorial design
3. THE 2^k FACTORIAL DESIGNS
3.1 Introduction
3.2 The 2^2 and 2^3 designs and the general 2^k design
3.3 A single replicate of the 2^k design
4. BLOCKING AND CONFOUNDING IN THE 2^k FACTORIAL DESIGNS
4.1 Introduction
4.2 Blocking a replicated 2^k factorial design
4.3 Confounding in the 2^k factorial designs
5. TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS
5.1 Why do we need fractional factorial designs?
5.2 The one-half fraction of the 2^k factorial design
5.3 The one-quarter fraction of the 2^k factorial design
INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS
Questions:
What is the main purpose of running an experiment?
What does one hope to be able to show?
Typically, an experiment may be run for one or more of the following reasons:
1. To determine the principal causes of variation in a measured response
2. To find conditions that give rise to a maximum or minimum response
3. To compare the response achieved at different settings of controllable variables
4. To obtain a mathematical model in order to predict future responses
An Experiment involves the manipulation of one or more variables by an experimenter in order to determine the effects of this manipulation on another variable.
Much research departs from this pattern in that nature, rather than the experimenter, manipulates the variables. Such research is referred to as an observational study.
This course is concerned with COMPARATIVE EXPERIMENTS. These allow conclusions to be drawn about cause and effect (causal relationships).
Sources of Variation
A source of variation is anything that could cause an observation to be different from another observation.
Independent variables: The variables that are under the control of the experimenter. The terms independent variables, treatments, experimental conditions, and controllable variables can be used interchangeably.
Dependent variable: The dependent variable (response) reflects any effects associated with manipulation of the independent variable.
Sources of variation are of two types:
Those that can be controlled and are of interest are called treatments or treatment factors.
Those that are not of interest but are difficult to control are called nuisance factors.
[Figure: general model of a process. INPUTS enter the PROCESS, which produces the OUTPUT (response). Controllable factors X1, X2, ..., XP and uncontrollable factors Z1, Z2, ..., ZP act on the process.]
The primary goal of an experiment is to determine the amount of variation caused by the treatment factors in the presence of other sources of variation
Adapted from Montgomery (2013)
The objectives of the experiment may include the following:
Determine which conditions are most influential on the response
Determine where to set the influential conditions so that the response is always near the desired nominal value
Determine where to set the influential conditions so that variability in the response is small
Determine where to set the influential conditions so that the effects of the uncontrollable variables are minimized
EXAMPLE:
Researchers wanted to study the food consumption of albino rats exposed to microwave radiation.
“If albino rats are subjected to microwave radiation, then their food consumption will decrease”
TRY! Identify each of the following:
Independent variable? ……………………….
Dependent variable? ……………………….
Nuisance factor(s)? ……………………….
BASIC PRINCIPLES
The three basic principles of experimental design are:
Randomization: The allocation of experimental material and the order in which the individual runs of the experiment are to be performed are randomly determined.
Replication: An independent repeat run of each factor combination. The number of replicates is the number of experimental units to which a treatment is assigned.
Blocking: A block is a set of experimental units sharing a common characteristic thought to affect the response, and to which a separate random assignment is made.
Blocking is used to reduce or eliminate the variability transmitted from a nuisance factor.
SOME STANDARD EXPERIMENTAL DESIGNS
The term experimental design refers to a plan of assigning experimental conditions to subjects and the statistical analysis associated with the plan.
OR An experimental design is a rule that determines the assignment of the experimental units to the treatments.
Some standard designs that are frequently used include:
Completely randomized design
A completely randomized design (CRD) refers to a design in which the experimenter assigns the experimental units (EUs) to the treatments completely at random, subject only to the number of observations to be taken on each treatment.
The model is of the form:
Response = constant + effect of a treatment + error
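The random assignment in a completely randomized design can be sketched in a few lines of Python. This is only an illustration: the function name, the unit numbers, and the treatment labels T1 to T3 are hypothetical and do not come from the course material.

```python
import random

def completely_randomized_design(units, treatments, reps, seed=None):
    """Assign experimental units to treatments completely at random.

    The only constraint is the number of units per treatment (`reps`).
    """
    units = list(units)
    if len(units) != len(treatments) * reps:
        raise ValueError("need exactly len(treatments) * reps experimental units")
    rng = random.Random(seed)
    # One label per unit, each treatment repeated `reps` times, then shuffled
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical example: 12 units, 3 treatments, 4 units per treatment
assignment = completely_randomized_design(range(1, 13), ["T1", "T2", "T3"], reps=4, seed=1)
for unit, treatment in assignment.items():
    print(unit, treatment)
```

Fixing the seed only makes the illustration reproducible; in a real experiment the randomization would be generated once and recorded.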
Block designs
This is a design in which the experimenter partitions the EUs into blocks, determines the allocation of treatments to blocks, and assigns the EUs within each block to the treatments completely at random.
The model is of the form
Response = Constant + effect of a block + effect of treatment + error
Designs with two or more blocking factors
These involve two major sources of variation that have been designated as blocking factors.
The model is of the form
Response = Constant + effect of row block + effect of column block + effect of treatment + error
INTRODUCTION TO FACTORIAL DESIGNS
Experiments often involve several factors, and usually the objective of the experimenter is to determine the influence these factors have on the response.
Several approaches can be employed when there is more than one treatment factor:
Best-guess approach: The experimenter selects an arbitrary combination of treatments, tests it, and sees what happens.
One-factor-at-a-time (OFAT): Consists of selecting a starting point, or baseline set of levels, for each factor, and then successively varying each factor over its range with the other factors held constant at the baseline level.
The preferred approach to dealing with several factors is to conduct a FACTORIAL EXPERIMENT.
This is an experimental strategy in which factors are varied together, instead of one at a time.
In a factorial design, in each complete trial or replicate of the experiment, all possible combinations of the levels of the factors are investigated.
e.g. If there are a levels of factor A and b levels of factor B, each replicate contains all ab treatment combinations.
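As a small illustration of what "all ab treatment combinations" means, the sketch below enumerates one replicate of a factorial design. The factor names and level labels are made up for the example.

```python
from itertools import product

def treatment_combinations(levels_by_factor):
    """Return every combination of factor levels (one replicate of a factorial)."""
    names = list(levels_by_factor)
    return [dict(zip(names, combo)) for combo in product(*levels_by_factor.values())]

# Hypothetical factors: A with a = 2 levels, B with b = 3 levels
combos = treatment_combinations({"A": ["low", "high"], "B": ["low", "mid", "high"]})
for combo in combos:
    print(combo)
print(len(combos), "treatment combinations")  # a * b = 6
```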
The model is of the form
Response = Constant + Effect of factor A + Effect of factor B + Interaction effect + Error term
Consider the following example (adapted from Montgomery, 2013) of a two-factor (A and B) factorial experiment with both design factors at two levels (Low and High). The response at each treatment combination is:

            A Low    A High
B High      30       52
B Low       20       40

Main effect: the change in response produced by a change in the level of a factor.
Factor A
$\text{Main effect of } A = \frac{40 + 52}{2} - \frac{20 + 30}{2} = 21$
Increasing factor A from the low level to the high level causes an average response increase of 21 units.
Factor B
Main effect = ?
Interaction
Now consider a second set of responses:

            A Low    A High
B High      40       12
B Low       20       50

At the low level of factor B:
The A effect = 50 - 20 = 30
At the high level of factor B:
The A effect = 12 - 40 = -28
The effect of A depends on the level chosen for factor B.
“If the difference in response between the levels of one factor is not the same at all levels of the other factors then we say there is an interaction between the factors” (Montgomery 2013)
The magnitude of the interaction effect is the average difference in the two factor A effects:
$AB = \frac{-28 - 30}{2} = -29$
The main effect of A, averaged over the levels of B, is $A = \frac{50 + 12}{2} - \frac{20 + 40}{2} = 1$, which on its own would suggest that A hardly matters.
In this case factor A does have an effect, but it depends on the level of factor B chosen.
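The arithmetic for this interaction example can be checked with a short script. The dictionary layout and variable names are just one convenient way to organize the four responses.

```python
# Responses keyed by (level of A, level of B)
y = {
    ("low", "low"): 20,
    ("high", "low"): 50,
    ("low", "high"): 40,
    ("high", "high"): 12,
}

# Effect of A at each level of B (simple effects)
a_effect_at_b_low = y[("high", "low")] - y[("low", "low")]      # 50 - 20
a_effect_at_b_high = y[("high", "high")] - y[("low", "high")]   # 12 - 40

# Interaction: half the difference between the two simple effects of A
interaction_ab = (a_effect_at_b_high - a_effect_at_b_low) / 2

# Main effect of A: average response at A high minus average at A low
main_effect_a = (y[("high", "low")] + y[("high", "high")]) / 2 \
              - (y[("low", "low")] + y[("low", "high")]) / 2

print(a_effect_at_b_low, a_effect_at_b_high)  # 30 -28
print(interaction_ab)                         # -29.0
print(main_effect_a)                          # 1.0
```

A main effect near zero together with a large interaction is exactly the situation in which looking at main effects alone is misleading.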
Interaction Graphically
[Figure: two plots of Response against Factor A (Low, High), each with one line for B Low and one for B High. Panel titles: "A factorial experiment without interaction" and "A factorial experiment with interaction".]
Factorial designs have several advantages:
They are more efficient than one-factor-at-a-time experiments.
A factorial design is necessary when interactions may be present, to avoid misleading conclusions.
Factorial designs allow the effect of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.
The Two-Factor Factorial Design
The simplest type of factorial design involves only two factors.
There are a levels of factor A and b levels of factor B, and these are arranged in a factorial design.
There are n replicates, and each replicate of the experiment contains all ab treatment combinations.
Example
An engineer is designing a battery for use in a device that will be subjected to some extreme variations in temperature. The only design parameter that he can select is the plate material for the battery. For the purpose of testing, temperature can be controlled in the product development laboratory (Montgomery, 2013).
Life (in hours) data

Material    Temperature (°F)
type        15                    70                    125
1           130, 155, 74, 180     34, 40, 80, 75        20, 70, 82, 58
2           150, 188, 159, 126    136, 122, 106, 115    25, 70, 58, 45
3           138, 110, 168, 160    174, 120, 150, 139    96, 104, 82, 60
The design has two factors, each at three levels, and is therefore regarded as a 3^2 factorial design.
The engineer wants to answer the following questions:
1. What effects do material type and temperature have on the life of the battery?
2. Is there a choice of material that would give uniformly long life regardless of temperature?
Both factors are assumed to be fixed, hence we have a fixed effects model.
The design is a completely randomized design.
Analysis of Variance for Battery life (in hours)
Source                       DF    Seq SS     Adj SS     Adj MS     F       P-value
Material Type                 2    10683.7    10683.7     5341.9     7.91   0.002
Temperature                   2    39118.7    39118.7    19559.4    28.97   0.000
Material Type*Temperature     4     9613.8     9613.8     2403.4     3.56   0.019
Error                        27    18230.7    18230.7      675.2
Total                        35    77647.0
We have a significant interaction between temperature and material type.
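The sums of squares in the ANOVA table can be reproduced from the battery life data with the usual two-factor fixed effects formulas. The sketch below uses plain Python and hypothetical variable names; statistical software would normally be used instead, and p-values are not computed here.

```python
# Battery life (hours): life[material][temperature] -> n = 4 observations
life = {
    1: {15: [130, 155, 74, 180], 70: [34, 40, 80, 75], 125: [20, 70, 82, 58]},
    2: {15: [150, 188, 159, 126], 70: [136, 122, 106, 115], 125: [25, 70, 58, 45]},
    3: {15: [138, 110, 168, 160], 70: [174, 120, 150, 139], 125: [96, 104, 82, 60]},
}
materials = sorted(life)
temps = sorted(life[materials[0]])
a, b, n = len(materials), len(temps), 4

all_obs = [y for m in materials for t in temps for y in life[m][t]]
correction = sum(all_obs) ** 2 / (a * b * n)  # (grand total)^2 / N

ss_total = sum(y * y for y in all_obs) - correction
ss_material = sum(sum(sum(life[m][t]) for t in temps) ** 2 for m in materials) / (b * n) - correction
ss_temp = sum(sum(sum(life[m][t]) for m in materials) ** 2 for t in temps) / (a * n) - correction
ss_cells = sum(sum(life[m][t]) ** 2 for m in materials for t in temps) / n - correction
ss_interaction = ss_cells - ss_material - ss_temp
ss_error = ss_total - ss_cells

ms_error = ss_error / (a * b * (n - 1))
f_material = (ss_material / (a - 1)) / ms_error
f_temp = (ss_temp / (b - 1)) / ms_error
f_interaction = (ss_interaction / ((a - 1) * (b - 1))) / ms_error

print(round(ss_material, 1), round(ss_temp, 1), round(ss_interaction, 1), round(ss_error, 1))
print(round(f_material, 2), round(f_temp, 2), round(f_interaction, 2))
```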
Interaction plot
Significant interaction is indicated by the lack of parallelism of the lines. Longer life is attained at low temperature, regardless of material type.
The General Factorial Design
The results for the two-factor factorial design may be extended to the general case where there are a levels of factor A, b levels of factor B, c levels of factor C, and so on, arranged in a factorial experiment.
Sometimes it is not feasible or practical to completely randomize all of the runs in a factorial.
The presence of a nuisance factor may require that the experiment be run in blocks.
The model is of the form
Response = Constant + Effect of factor A + Effect of factor B + interaction effect + Block Effect + Error term
The 2^k Factorial Designs
This is a case of a factorial design with k factors, each at only two levels.
These levels may be quantitative or qualitative.
A complete replicate of this design requires 2^k observations and is called a 2^k factorial design.
Assumptions
1. The factors are fixed.
2. The designs are completely randomized.
3. The usual normality assumptions are satisfied.
The design with only two factors, each at two levels, is called a 2^2 factorial design.
The levels of the factors may be arbitrarily called “Low” and “High”.
Factor
A    B    Treatment combination    Label
-    -    A Low, B Low             (1)
+    -    A High, B Low            a
-    +    A Low, B High            b
+    +    A High, B High           ab

The order in which the runs are made is completely randomized.
The four treatment combinations in the design can be represented by lower case letters.
The high level of a factor in any treatment combination is denoted by the corresponding lower case letter.
The low level of a factor in a treatment combination is represented by the absence of the corresponding letter.
The average effect of a factor is the change in the response produced by a change in the level of that factor, averaged over the levels of the other factor.
The symbols (1), a, b, ab represent the totals of the observations over all n replicates taken at each treatment combination.
$A \text{ main effect} = \frac{1}{2n}\,[ab + a - b - (1)]$
$B \text{ main effect} = \frac{1}{2n}\,[ab + b - a - (1)]$
$AB \text{ effect} = \frac{1}{2n}\,[ab + (1) - a - b]$
In experiments involving 2^k designs, it is always important to examine the magnitude and direction of the factor effects to determine which factors are likely to be important.
Effect magnitude and direction should always be considered along with the ANOVA, because the ANOVA alone does not convey this information.
We can write the treatment combinations in the order (1), a, b, ab. This is called the standard order (or Yates order).

Treatment      Factorial effect
combination    I    A    B    AB
(1)            +    -    -    +
a              +    +    -    -
b              +    -    +    -
ab             +    +    +    +

The above is also called the table of plus and minus signs.
We define:
Contrast A = ab + a - b - (1) = total effect of A
Suppose that three factors, A, B and C, each at two levels, are of interest. The design is referred to as a 2^3 factorial design.

Treatment      Factorial effects
combination    I    A    B    AB   C    AC   BC   ABC
(1)            +    -    -    +    -    +    +    -
a              +    +    -    -    -    -    +    +
b              +    -    +    -    -    +    -    +
ab             +    +    +    +    -    -    -    -
c              +    -    -    +    +    -    -    +
ac             +    +    -    -    +    +    -    -
bc             +    -    +    -    +    -    +    -
abc            +    +    +    +    +    +    +    +

A contrast = ab + a + ac + abc - (1) - b - c - bc
B contrast = ?
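The standard order and the table of plus and minus signs follow a simple pattern, so they can be generated for any number of factors. The sketch below is one possible way to do it; the function names are hypothetical, and the identity column I (all plus signs) is left out.

```python
from itertools import combinations

def standard_order(factors):
    """Treatment combinations of a 2^k design in standard (Yates) order."""
    labels = [""]
    for f in factors.lower():
        # Each new factor repeats the existing list with its letter appended
        labels += [label + f for label in labels]
    return ["(1)" if label == "" else label for label in labels]

def sign(effect, run):
    """+1 or -1 for an effect such as "AC" in a run such as "ab"."""
    s = 1
    for factor in effect.lower():
        s *= 1 if factor in run else -1
    return s

def sign_table(factors):
    """Map each run to the signs of all main effects and interactions."""
    runs = standard_order(factors)
    letters = factors.upper()
    effects = ["".join(c) for r in range(1, len(letters) + 1)
               for c in combinations(letters, r)]
    return {run: {e: sign(e, run) for e in effects} for run in runs}

table = sign_table("abc")
for run, signs in table.items():
    print(f"{run:>4}", " ".join("+" if s > 0 else "-" for s in signs.values()))
```

The columns are printed in the order A, B, C, AB, AC, BC, ABC. A contrast is obtained by multiplying each treatment total by the sign in the corresponding column and adding up.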
The design with k factors, each at two levels, is called a 2^k factorial design.
The treatment combinations are written in standard order using the notation introduced for the 2^2 and 2^3 designs.
In general:
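As a sketch of the general case, the standard expressions for a 2^k design with n replicates (as given in Montgomery, 2013) can be written as follows. In the contrast, the sign in each set of parentheses is minus if the factor is included in the effect and plus if it is not, and 1 is replaced by (1) after expanding.

```latex
\text{Contrast}_{AB\cdots K} = (a \pm 1)(b \pm 1)\cdots(k \pm 1)

\text{Effect}_{AB\cdots K} = \frac{2}{n\,2^{k}}\,\text{Contrast}_{AB\cdots K}

SS_{AB\cdots K} = \frac{1}{n\,2^{k}}\left(\text{Contrast}_{AB\cdots K}\right)^{2}
```

For k = 2 this reduces to the 2^2 case: the A contrast is (a - 1)(b + 1) = ab + a - b - (1), and the effect is that contrast divided by 2n.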
A single replicate of the 2^k design
For even a moderate number of factors, the total number of treatment combinations in a 2^k factorial design is large.
A 2^5 design has 32 treatment combinations.
A 2^6 design has 64 treatment combinations.
Resources are usually limited, and the number of replicates that the experimenter can employ may be restricted.
Frequently, available resources only allow a single replicate of the design to be run, unless the experimenter is willing to omit some of the original factors.
The analysis of an unreplicated factorial assumes that certain high-order interactions are negligible and combines their mean squares to estimate the error.
This is an appeal to the sparsity of effects principle: most systems are dominated by some of the main effects and low-order interactions, and most high-order interactions are negligible.
When analyzing data from unreplicated factorial designs, it is suggested to use a normal probability plot of the effect estimates.
Example
A chemical product is produced in a pressure vessel. A factorial experiment is carried out in the pilot plant to study the factors thought to influence the filtration rate of this product. The four factors are temperature (A), pressure (B), concentration of formaldehyde (C), and stirring rate (D). Each factor is present at two levels. The process engineer is interested in maximizing the filtration rate. The current process gives a filtration rate of around 75 gal/h. The process currently uses factor C at the high level. The engineer would like to reduce the formaldehyde concentration as much as possible but has been unable to do so because it always results in lower filtration rates (Montgomery, 2013).
The design matrix and response data obtained from a single replicate of the 2^4 experiment

Treatment      Factors              Response
combination    A    B    C    D     (filtration rate, gal/h)
(1)            -    -    -    -      45
a              +    -    -    -      71
b              -    +    -    -      48
ab             +    +    -    -      65
c              -    -    +    -      68
ac             +    -    +    -      60
bc             -    +    +    -      80
abc            +    +    +    -      65
d              -    -    -    +      43
ad             +    -    -    +     100
bd             -    +    -    +      45
abd            +    +    -    +     104
cd             -    -    +    +      75
acd            +    -    +    +      86
bcd            -    +    +    +      70
abcd           +    +    +    +      96
The normal probability plot of the effect estimates is given below.
The important effects that emerge from this analysis are the main effects of A, C and D and the AC and AD interactions.
The main effect plot for temperature
The plot indicates that it is better to run the temperature at the high level.
The main effect plot for concentration of formaldehyde
The plot indicates that it is better to run the concentration of formaldehyde at the high level.
The main effect plot for stirring rate
The plot indicates that it is better to run the stirring rate at the high level.
However, it is necessary to examine any interactions that are important.
The AC interaction indicates that the best results are obtained with a low concentration of formaldehyde and high temperature.
The AD interaction indicates that stirring rate D has little effect at low temperature but a very positive effect at high temperature.
Therefore the best filtration rates would appear to be obtained when A and D are at the high level and C is at the low level. This will allow formaldehyde to be reduced to the lower level.
Factor effect estimates and sums of squares for the 2^4 design

Model term    Effect estimate    Sum of squares    Percent contribution
A              21.63             1870.56           32.64
B               3.13               39.06            0.68
C               9.88              390.06            6.81
D              14.63              855.56           14.93
AB              0.13                0.063           0.001091
AC            -18.13             1314.06           22.93
AD             16.63             1105.56           19.29
BC              2.38               22.56            0.39
BD             -0.38                0.56            0.009815
CD             -1.12                5.06            0.088
ABC             1.88               14.06            0.25
ABD             4.13               68.06            1.19
ACD            -1.62               10.56            0.18
BCD            -2.63               27.56            0.48
ABCD            1.38                7.56            0.13
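The effect estimates, sums of squares, and percent contributions can be recomputed from the 16 responses of the filtration rate experiment. The sketch below assumes a single replicate (n = 1) and uses hypothetical helper names; each effect is its contrast divided by n·2^(k-1), and each sum of squares is the squared contrast divided by n·2^k.

```python
from itertools import combinations

# Filtration rate (gal/h) by treatment combination
y = {"(1)": 45, "a": 71, "b": 48, "ab": 65, "c": 68, "ac": 60, "bc": 80, "abc": 65,
     "d": 43, "ad": 100, "bd": 45, "abd": 104, "cd": 75, "acd": 86, "bcd": 70, "abcd": 96}

def contrast(effect, responses):
    """Signed sum of the responses using the plus/minus column of `effect`."""
    total = 0
    for run, value in responses.items():
        s = 1
        for factor in effect.lower():
            s *= 1 if factor in run else -1
        total += s * value
    return total

k, n = 4, 1
effects = ["".join(c) for r in range(1, k + 1) for c in combinations("ABCD", r)]
estimates = {e: contrast(e, y) / (n * 2 ** (k - 1)) for e in effects}
sums_of_squares = {e: contrast(e, y) ** 2 / (n * 2 ** k) for e in effects}
ss_total = sum(sums_of_squares.values())  # with one replicate the effects account for all variation
percent = {e: 100 * ss / ss_total for e, ss in sums_of_squares.items()}

# List the effects from largest to smallest contribution
for e in sorted(effects, key=lambda name: -sums_of_squares[name]):
    print(f"{e:<5}{estimates[e]:>9.3f}{sums_of_squares[e]:>11.4f}{percent[e]:>8.2f}")
```

Sorting by contribution makes the dominance of A, AC, AD, D and C easy to see, which matches what the normal probability plot shows.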
ANOVA for A, C and D

Source            DF    Seq SS     Adj SS     Adj MS     F        P
A                  1    1870.56    1870.56    1870.56    83.36    <0.0001
C                  1     390.06     390.06     390.06    17.38    <0.0001
D                  1     855.56     855.56     855.56    38.13    <0.0001
A*C                1    1314.06    1314.06    1314.06    58.56    <0.0001
A*D                1    1105.56    1105.56    1105.56    49.27    <0.0001
C*D                1       5.06       5.06       5.06    <1
A*C*D              1      10.56      10.56      10.56    <1
Residual Error     8     179.52                 22.44
Total             15    5730.94
Blocking and Confounding in the 2^k Factorial Designs
There are situations that may prevent the experimenter from performing all of the runs in a 2^k factorial experiment under homogeneous conditions:
A single batch of raw material might not be large enough to make all of the required runs.
An experimenter with prior knowledge may decide to run a pilot experiment with different batches of raw materials.
The design technique used in these situations is blocking.
Blocking a Replicated 2^k Factorial Design
Suppose that the 2^k factorial design has been replicated n times.
With n replicates, each set of homogeneous conditions defines a block, and each replicate is run in one of the blocks.
The runs in each block (or replicate) are made in random order.
Confounding in the 2^k Factorial Designs
In many situations it is impossible to perform a complete replicate of a factorial design in one block.
Confounding is a design technique for arranging a complete factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one replicate.
The technique causes information about certain treatment effects (usually higher-order interactions) to be indistinguishable from, or confounded with, blocks.
Confounding the 2^k Factorial Design in Two Blocks
Suppose we want to run a single replicate of the 2^2 design.
Each of the 2^2 = 4 treatment combinations requires a quantity of raw material.
Suppose each batch of raw material is only large enough for two treatment combinations to be tested, so two batches of raw material are required.
If batches of raw material are considered as blocks, then we must assign two of the four treatment combinations to each block.
Consider the table of plus and minus signs for the 2^2 design:

Treatment      Factorial effect
combination    I    A    B    AB   Block
(1)            +    -    -    +    1
a              +    +    -    -    2
b              +    -    +    -    2
ab             +    +    +    +    1

The block effect and the AB interaction are identical. That is, AB is confounded with blocks.
The order in which the treatment combinations are run within a block is randomly determined.

Block 1    Block 2
(1)        a
ab         b
This scheme can be used to confound any 2^k design in two blocks.
Consider a 2^3 design run in two blocks.
Suppose we wish to confound the ABC interaction with blocks.

Treatment      Factorial effect
combination    I    A    B    AB   C    AC   BC   ABC  Block
(1)            +    -    -    +    -    +    +    -    1
a              +    +    -    -    -    -    +    +    2
b              +    -    +    -    -    +    -    +    2
ab             +    +    +    +    -    -    -    -    1
c              +    -    -    +    +    -    -    +    2
ac             +    +    -    -    +    +    -    -    1
bc             +    -    +    -    +    -    +    -    1
abc            +    +    +    +    +    +    +    +    2

We assign the treatment combinations that are minus on ABC to block 1 and the rest to block 2.
The treatment combinations within a block are run in random order.

Block 1    Block 2
(1)        a
ab         b
ac         c
bc         abc

ABC is confounded with blocks.
Alternative method for constructing the blocks
The method uses the linear combination
$L = a_1 x_1 + a_2 x_2 + \cdots + a_k x_k$
This is called a defining contrast.
For the 2^k design, x_i = 0 (low level) or x_i = 1 (high level) is the level of the ith factor in a treatment combination, and a_i = 1 if the ith factor appears in the effect to be confounded and a_i = 0 otherwise.
Treatment combinations that produce the same value of L (mod 2) are placed in the same block.
The only possible values of L (mod 2) are 0 and 1, hence we have exactly two blocks.
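The defining contrast method translates directly into code. The sketch below is a minimal illustration with a hypothetical function name; it evaluates L (mod 2) for every treatment combination and groups the runs accordingly.

```python
from itertools import product

def blocks_by_defining_contrast(factors, confounded):
    """Split a 2^k design into two blocks using L = sum(a_i * x_i) mod 2.

    factors:    lower case factor letters, e.g. "abc"
    confounded: the effect to confound with blocks, e.g. "ABC"
    """
    alphas = [1 if f in confounded.lower() else 0 for f in factors]
    blocks = {0: [], 1: []}
    # Reverse each tuple so the first factor changes fastest (standard order)
    for reversed_levels in product([0, 1], repeat=len(factors)):
        x = reversed_levels[::-1]
        label = "".join(f for f, xi in zip(factors, x) if xi) or "(1)"
        L = sum(a * xi for a, xi in zip(alphas, x)) % 2
        blocks[L].append(label)
    return blocks

blocks = blocks_by_defining_contrast("abc", "ABC")
print(blocks[0])  # runs with L = 0 (mod 2)
print(blocks[1])  # runs with L = 1 (mod 2)
```

For the 2^3 design with ABC confounded, the block with L = 0 contains (1), ab, ac, bc, which agrees with the assignment obtained from the table of plus and minus signs.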
If resources are sufficient to allow the replication of confounded designs, it is generally better to use a slightly different method of designing the blocks in each replicate
We can confound different effects in each replicate so that some information on all effects is obtained
This approach is called partial confounding
Consider our previous example (the filtration rate experiment), with two modifications:
1. The 16 treatment combinations cannot all be run using one batch of raw material. The experimenter will use two batches of raw material, hence two blocks, each with 8 runs.
2. A block effect is introduced by considering one batch to be of poor quality, such that all the responses will be 20 units lower in this block.
The experimenter will confound the highest-order interaction, ABCD, with blocks.
The defining contrast is
$L = x_1 + x_2 + x_3 + x_4$
The two resulting blocks are:
L = 0 (mod 2): (1), ab, ac, bc, ad, bd, cd, abcd
L = 1 (mod 2): a, b, c, d, abc, bcd, acd, abd
The half Normal plot for the blocked design
Source            DF    Seq SS     Adj SS     Adj MS      F        P-value
Blocks             1    1387.56    1387.56    1387.56
A                  1    1870.56    1870.56    1870.56     89.76    <0.0001
C                  1     390.06     390.06     390.06     18.72    <0.0001
D                  1     855.56     855.56     855.56     41.05    <0.0001
A*C                1    1314.06    1314.06    1314.06     63.05    <0.0001
A*D                1    1105.56    1105.56    1105.56     53.05    <0.0001
Residual Error     9     187.56     187.56      20.8403
Total             15    7110.94
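The simulated block effect can be reproduced from the original filtration rate data. The sketch below assumes that the poor-quality batch is the block containing (1), i.e. the runs with L = 0 (mod 2); that choice is an assumption, made because it reproduces the Blocks sum of squares in the table. The helper name is hypothetical.

```python
# Original filtration rates (gal/h) by treatment combination
y = {"(1)": 45, "a": 71, "b": 48, "ab": 65, "c": 68, "ac": 60, "bc": 80, "abc": 65,
     "d": 43, "ad": 100, "bd": 45, "abd": 104, "cd": 75, "acd": 86, "bcd": 70, "abcd": 96}

def contrast(effect, responses):
    """Signed sum of the responses using the plus/minus column of `effect`."""
    total = 0
    for run, value in responses.items():
        s = 1
        for factor in effect.lower():
            s *= 1 if factor in run else -1
        total += s * value
    return total

# Assumption: the poor batch is the block with L = 0 (mod 2), which contains (1)
poor_block = {"(1)", "ab", "ac", "bc", "ad", "bd", "cd", "abcd"}
y_blocked = {run: (value - 20 if run in poor_block else value) for run, value in y.items()}

n_runs = len(y)
a_before = contrast("A", y) / (n_runs / 2)
a_after = contrast("A", y_blocked) / (n_runs / 2)
ss_blocks = contrast("ABCD", y_blocked) ** 2 / n_runs

# Residual: effects that are neither in the model (A, C, D, AC, AD) nor confounded (ABCD)
pooled = ["B", "AB", "BC", "BD", "CD", "ABC", "ABD", "ACD", "BCD"]
ss_residual = sum(contrast(e, y_blocked) ** 2 / n_runs for e in pooled)

print(a_before, a_after)  # the A effect is unchanged by the block shift
print(ss_blocks, ss_residual)
```

Because ABCD is confounded with blocks, the 20-unit shift is absorbed entirely by the ABCD contrast and the estimates of the other effects are unchanged.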
Similar methods can be used to confound the 2^k design in four blocks, and so on, depending on requirements.
NOTE: Blocking is a noise reduction technique. If we don’t block, then the added variability from the nuisance variable effect ends up getting distributed across the other design factors.
Two-Level Fractional Factorial Designs
As the number of factors in a 2^k factorial design increases, the number of runs required for a complete replicate of the design rapidly outgrows the resources of most experimenters.
If the experimenter can reasonably assume that certain high-order interactions are negligible, information on the main effects and lower-order interactions may be obtained by running a fraction of a complete factorial experiment.
Fractional factorial designs are widely used for product and process design, process improvement, and industrial/business experimentation.
Fractional factorials are used for screening experiments.
The successful use of fractional factorial designs is based on three key ideas:
1. The sparsity of effects principle
2. The projection property
3. Sequential experimentation
The One-Half Fraction of the 2^k Design
Suppose an experimenter has three factors, each at two levels, but cannot afford to run all 2^3 = 8 treatment combinations.
They can, however, afford four runs.
This suggests a one-half fraction of a 2^3 design.
A one-half fraction of the 2^3 design is often called a 2^(3-1) design.
Recall the table of plus and minus signs for a 2^3 design.
Suppose we select those treatment combinations that have a plus in the ABC column to form the 2^(3-1) design. Then ABC is called a generator of this particular design.
Usually a generator such as ABC is referred to as a word.
The identity column is always plus, so we call
I = ABC the defining relation for our design.
Now it is impossible to differentiate between A and BC, B and AC, and C and AB.
We say the effects are aliased.
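The alias structure may be easily determined by multiplying any column by the defining relation, using the fact that the square of any column is the identity I:
$A \cdot I = A \cdot ABC = A^2BC = BC$, so A = BC
$B \cdot I = B \cdot ABC = AB^2C = AC$, so B = AC
$C \cdot I = C \cdot ABC = ABC^2 = AB$, so C = AB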
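Multiplying effects with the rule that any squared letter drops out is the same as taking the symmetric difference of the two sets of letters. A minimal sketch, with hypothetical function names:

```python
def multiply(word1, word2):
    """Product of two effects; letters appearing in both cancel (X * X = I)."""
    letters = set(word1) ^ set(word2)  # symmetric difference
    return "".join(sorted(letters)) or "I"

def alias(effect, defining_word):
    """Alias of `effect` in a one-half fraction with I = defining_word."""
    return multiply(effect, defining_word)

for effect in ["A", "B", "C"]:
    print(effect, "=", alias(effect, "ABC"))
```

The same function works for other defining relations; with I = ABCD, for example, `alias("AB", "ABCD")` gives CD.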
This half fraction with I = ABC is called the Principal fraction
Design Resolution
A design is of resolution R if no p-factor effect is aliased with another effect containing fewer than R - p factors.
Roman numeral subscripts are usually used to denote design resolution.
Designs of resolution III, IV and V are particularly important.
Resolution III designs
These are designs in which no main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions and some two-factor interactions may be aliased with each other.
e.g. the 2^(3-1) design with I = ABC is of resolution III
Resolution IV designs
No main effect is aliased with any other main effect or with any two-factor interaction, but two-factor interactions are aliased with each other.
e.g. a 2^(4-1) design with I = ABCD is a resolution IV design
Resolution V designs
No main effect or two-factor interaction is aliased with any other main effect or two-factor interaction, but two-factor interactions are aliased with three-factor interactions.
e.g. a 2^(5-1) design with I = ABCDE is a resolution V design
Construction of One half Fraction
A one-half fraction of the 2^k design is obtained by writing down a basic design consisting of the runs for the full 2^(k-1) factorial and then adding the kth factor by identifying its plus and minus levels with the plus and minus signs of the highest-order interaction ABC...(K-1).
The 2^(3-1) resolution III design is obtained by writing down the full 2^2 factorial as the basic design and then equating C to the AB interaction.

One-half fraction of the 2^3 design

       Full 2^2 factorial      2^(3-1), resolution III, I = ABC
       (basic design)
Run    A    B                  A    B    C = AB
1      -    -                  -    -    +
2      +    -                  +    -    -
3      -    +                  -    +    -
4      +    +                  +    +    +
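The construction can be sketched for any k: write down the full factorial in the first k - 1 factors, then set the last factor equal to the product of their signs. The function name is hypothetical.

```python
from itertools import product

def half_fraction(k):
    """Runs of a 2^(k-1) fractional factorial as lists of -1/+1 levels.

    The first k - 1 columns are the basic design in standard order and
    the last column is the product of the others (kth factor = AB...(K-1)).
    """
    runs = []
    for reversed_levels in product([-1, 1], repeat=k - 1):
        basic = list(reversed_levels[::-1])  # first factor changes fastest
        last = 1
        for s in basic:
            last *= s
        runs.append(basic + [last])
    return runs

for run in half_fraction(3):
    print(" ".join("+" if s > 0 else "-" for s in run))
```

Setting the last factor equal to the product of the others gives the principal fraction (I = +ABC for k = 3); multiplying the last column by -1 would give the alternate fraction.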
Consider the filtration rate example.
We will simulate what would have happened if a half fraction of the 2^4 design had been run instead of the full factorial.
We will use the 2^(4-1) design with I = ABCD, as this gives the highest resolution possible.
• We first write down the basic design, which is a 2^3 design.
• The basic design has eight runs but only three factors.
• To find the levels of the fourth factor, we solve I = ABCD for D:
$D \cdot I = D \cdot ABCD = ABCD^2 = ABC$, so D = ABC
The resolution IV design with I = ABCD

       Basic design
Run    A    B    C    D = ABC    Treatment combination
1      -    -    -    -          (1)
2      +    -    -    +          ad
3      -    +    -    +          bd
4      +    +    -    -          ab
5      -    -    +    +          cd
6      +    -    +    -          ac
7      -    +    +    -          bc
8      +    +    +    +          abcd
Estimates of effects

Term    Effect
A        19.000
B         1.500
C        14.000
D        16.500
A*B      -1.000
A*C     -18.500
A*D      19.000

A, C and D have large effects, and so do the interactions involving them.
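These half-fraction estimates can be reproduced by keeping only the eight runs of the full filtration rate experiment that have a plus sign on ABCD and recomputing the contrasts. The helper names below are hypothetical; with eight runs, each effect is its contrast divided by 4.

```python
# Full 2^4 filtration rate data (gal/h) by treatment combination
y = {"(1)": 45, "a": 71, "b": 48, "ab": 65, "c": 68, "ac": 60, "bc": 80, "abc": 65,
     "d": 43, "ad": 100, "bd": 45, "abd": 104, "cd": 75, "acd": 86, "bcd": 70, "abcd": 96}

def sign(effect, run):
    """+1 or -1 for `effect` (e.g. "AC") in treatment combination `run`."""
    s = 1
    for factor in effect.lower():
        s *= 1 if factor in run else -1
    return s

# Principal half fraction: the runs with a plus sign on ABCD (I = ABCD)
fraction = {run: value for run, value in y.items() if sign("ABCD", run) == 1}

def effect_estimate(effect, responses):
    """Contrast divided by half the number of runs."""
    contrast = sum(sign(effect, run) * value for run, value in responses.items())
    return contrast / (len(responses) / 2)

half_estimates = {e: effect_estimate(e, fraction)
                  for e in ["A", "B", "C", "D", "AB", "AC", "AD"]}
for term, value in half_estimates.items():
    print(f"{term:<4}{value:>8.3f}")
```

Each number actually estimates an effect plus its alias: with I = ABCD, A is aliased with BCD, AB with CD, AC with BD, and AD with BC. Interpreting the AC and AD estimates as those two interactions relies on the assumption that their aliases are negligible.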
Thank you
Reference
Montgomery, D. C. (2013). Design and Analysis of Experiments. Wiley, New York.