
Factorial Experiments: Blocking, Confounding, and Fractional Factorial Designs

Wednesday, July 30, 2014, 4:30 pm – 6:30 pm, 1020 Torgersen Hall

Emanuel Msemo

ABOUT THE INSTRUCTOR

Graduate student in Virginia Tech Department of Statistics

B.A. ECONOMICS AND STATISTICS (UDSM, TANZANIA)

MSc. STATISTICS (VT, USA)

LEAD/ASSOCIATE COLLABORATOR IN LISA

“If your experiment needs a statistician, you need a better experiment.” Ernest Rutherford

MORE ABOUT LISA

What? Laboratory for Interdisciplinary Statistical Analysis

Why? Mission: to provide statistical advice, analysis, and education to Virginia Tech researchers

How? Collaboration requests, Walk-in Consulting, Short Courses

Where? Walk-in Consulting in GLC and various other locations. Collaboration meetings are typically held in Sandy 312.

Who? Graduate students and faculty members in the VT statistics department

www.lisa.vt.edu

HOW TO SUBMIT A COLLABORATION REQUEST

1. Go to www.lisa.stat.vt.edu

2. Click the link for “Collaboration Request Form”

3. Sign into the website using your VT PID and password

4. Enter your information (email, college, etc.)

5. Describe your project (project title, research goals, specific research questions, whether you have already collected data, special requests, etc.)

6. Contact your assigned LISA collaborators as soon as possible to schedule a meeting

LISA helps VT researchers benefit from the use of Statistics

Short Courses: Designed to help graduate students apply statistics in their research

Walk-In Consulting: M-F 1-3 PM, GLC Video Conference Room; 11 AM-1 PM, Old Security Building Room 103. For questions requiring <30 mins

All services are FREE for VT researchers. We assist with research—not class projects or homework.

Collaboration:

Visit our website to request personalized statistical advice and assistance with:

Experimental Design • Data Analysis • Interpreting Results • Grant Proposals • Software (R, SAS, JMP, SPSS...)

LISA statistical collaborators aim to explain concepts in ways useful for your research.

Great advice right now: Meet with LISA before collecting your data.

COURSE CONTENTS:

1. INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS

1.1 Introduction

1.2 Basic Principles

1.3 Some standard experimental designs

2. INTRODUCTION TO FACTORIAL DESIGNS

2.1 Basic Definitions and Principles

2.2 The advantage of factorials

2.3 The two-factor factorial design

2.4 The general factorial design

2.5 Blocking in a factorial design

3. THE 2^k FACTORIAL DESIGNS

3.1 Introduction

3.2 The 2^2 and 2^3 designs and the general 2^k design

3.3 A single replicate of the 2^k design

4. BLOCKING AND CONFOUNDING IN THE 2^k FACTORIAL DESIGNS

4.1 Introduction

4.2 Blocking a replicated 2^k factorial design

4.3 Confounding in the 2^k factorial designs

5. TWO-LEVEL FRACTIONAL FACTORIAL DESIGNS

5.1 Why do we need fractional factorial designs?

5.2 The one-half fraction of the 2^k factorial design

5.3 The one-quarter fraction of the 2^k factorial design

INTRODUCTION TO DESIGN AND ANALYSIS OF EXPERIMENTS

Questions:

What is the main purpose of running an experiment?

What does one hope to be able to show?

Typically, an experiment may be run for one or more of the following reasons:

1. To determine the principal causes of variation in a measured response

2. To find conditions that give rise to a maximum or minimum response

3. To compare the response achieved at different settings of controllable variables

4. To obtain a mathematical model in order to predict future responses

An Experiment involves the manipulation of one or more variables by an experimenter in order to determine the effects of this manipulation on another variable.

Much research departs from this pattern in that nature, rather than the experimenter, manipulates the variables. Such research is referred to as observational studies.

This course is concerned with COMPARATIVE EXPERIMENTS. These allow conclusions to be drawn about cause and effect (causal relationships).

Sources of Variation

A source of variation is anything that could cause one observation to differ from another.

Sources of variation are of two types:

Those that can be controlled and are of interest are called treatments or treatment factors.

Those that are not of interest but are difficult to control are called nuisance factors.

Independent variables: The variables that are under the control of the experimenter. The terms independent variables, treatments, experimental conditions, and controllable variables can be used interchangeably.

Dependent variable: The dependent variable (response) reflects any effects associated with manipulation of the independent variable.

[Figure: general model of a process. INPUTS enter the PROCESS and produce an OUTPUT (response). Controllable factors X1, X2, ..., XP and uncontrollable factors Z1, Z2, ..., ZP act on the process.]

The primary goal of an experiment is to determine the amount of variation caused by the treatment factors in the presence of other sources of variation

Adapted from Montgomery (2013)

The objectives of the experiment may include the following:

Determine which conditions are most influential on the response

Determine where to set the influential conditions so that the response is always near the desired nominal value

Determine where to set the influential conditions so that variability in the response is small

Determine where to set the influential conditions so that the effects of the uncontrollable variables are minimized

EXAMPLE:

Researchers were interested in the food consumption of albino rats exposed to microwave radiation.

“If albino rats are subjected to microwave radiation, then their food consumption will decrease.”

TRY!

Independent variable? ……………………….

Dependent variable? ……………………….

Nuisance factor(s)? ……………………….

BASIC PRINCIPLES

The three basic principles of experimental design are:

Randomization: The allocation of experimental material and the order in which the individual runs of the experiment are performed are randomly determined.

Replication: An independent repeat run of each factor combination. The number of replicates is the number of experimental units to which a treatment is assigned.

Blocking: A block is a set of experimental units sharing a common characteristic thought to affect the response, and to which a separate random assignment is made. Blocking is used to reduce or eliminate the variability transmitted from a nuisance factor.
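The three principles can be illustrated together in a few lines of Python. This is only a minimal sketch under assumed names: the function `randomized_block_plan` and the treatment labels `T1`, `T2`, `T3` are hypothetical and not part of the course material. Each block receives every treatment once (replication across blocks), and the run order is shuffled separately inside each block (randomization within blocks).

```python
import random

def randomized_block_plan(treatments, n_blocks, seed=None):
    """Return one independently shuffled run order of the treatments per block."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)  # a separate random assignment is made inside each block
        plan.append(order)
    return plan

# Hypothetical example: three treatments, four blocks (so four replicates each).
plan = randomized_block_plan(["T1", "T2", "T3"], n_blocks=4, seed=1)
for block_number, order in enumerate(plan, start=1):
    print(f"Block {block_number}: {order}")
```

Fixing the seed only makes the plan reproducible for record keeping; in practice any documented random mechanism would serve the same purpose.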

SOME STANDARD EXPERIMENTAL DESIGNS

The term experimental design refers to a plan of assigning experimental conditions to subjects and the statistical analysis associated with the plan.

OR An experimental design is a rule that determines the assignment of the experimental units to the treatments.

Some standard designs that are used frequently include:

Completely randomized design

A completely randomized design (CRD) refers to a design in which the experimenter assigns the experimental units (EUs) to the treatments completely at random, subject only to the number of observations to be taken on each treatment.

The model is of the form;

Response = constant + effect of a treatment + error

Block designs

This is a design in which the experimenter partitions the EUs into blocks, determines the allocation of treatments to blocks, and assigns the EUs within each block to the treatments completely at random.

The model is of the form

Response = Constant + effect of a block + effect of treatment + error

Designs with two or more blocking factors

These involve two major sources of variation that have been designated as blocking factors.

The model is of the form

Response = Constant + effect of row block + effect of column block + effect of treatment + error

INTRODUCTION TO FACTORIAL DESIGNS

Experiments often involve several factors, and usually the objective of the experimenter is to determine the influence these factors have on the response.

Several approaches can be employed when there is more than one factor:

Best-guess approach: The experimenter selects an arbitrary combination of treatments, tests it, and sees what happens.

One-factor-at-a-time (OFAT): Consists of selecting a starting point, or baseline set of levels, for each factor, and then successively varying each factor over its range with the other factors held constant at the baseline level.

The more valuable approach to dealing with several factors is to conduct a FACTORIAL EXPERIMENT.

This is an experimental strategy in which factors are varied together, instead of one at a time.

In a factorial design, in each complete trial or replicate of the experiment, all possible combinations of the levels of the factors are investigated.

e.g. If there are a levels of factor A and b levels of factor B, each replicate contains all ab treatment combinations.

The model is of the form

Response = Constant + Effect of factor A + Effect of factor B + Interaction effect + Error term

Consider the following example (adapted from Montgomery, 2013) of a two-factor (A and B) factorial experiment with both design factors at two levels (High and Low). The response at each treatment combination is:

            A Low    A High
B High        30       52
B Low         20       40

Main effect: the change in response produced by a change in the level of a factor.

Factor A

Main effect $= \frac{40 + 52}{2} - \frac{20 + 30}{2} = 21$

Increasing factor A from the low level to the high level causes an average response increase of 21 units.

Factor B

Main effect = ?

Interaction

            A Low    A High
B High        40       12
B Low         20       50

At the low level of factor B, the A effect = 50 – 20 = 30

At the high level of factor B, the A effect = 12 – 40 = -28

The effect of A depends on the level chosen for factor B

“If the difference in response between the levels of one factor is not the same at all levels of the other factors then we say there is an interaction between the factors” (Montgomery 2013)

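The magnitude of the interaction effect is the average difference in the two factor A effects:

$AB = \frac{-28 - 30}{2} = -29$

The main effect of A is (50 + 12)/2 – (20 + 40)/2 = 1, which on its own would suggest that A has almost no effect.

In this case, factor A has an effect, but it depends on the level of factor B chosen.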
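The arithmetic of the two 2 × 2 examples can be checked with a short Python sketch. The function name `a_and_ab_effects` and the dictionary layout are assumptions made for illustration; the responses are the ones quoted in the two examples. The B main effect is deliberately left out, since it is posed as an exercise.

```python
def a_and_ab_effects(y):
    """y maps (level of A, level of B) -> response, with levels 'low' / 'high'."""
    a_effect_at_b_low = y[("high", "low")] - y[("low", "low")]
    a_effect_at_b_high = y[("high", "high")] - y[("low", "high")]
    main_a = (a_effect_at_b_low + a_effect_at_b_high) / 2  # average of the two simple effects
    ab = (a_effect_at_b_high - a_effect_at_b_low) / 2      # average difference of the two
    return {"A": main_a, "AB": ab}

first_example = {("low", "low"): 20, ("high", "low"): 40,
                 ("low", "high"): 30, ("high", "high"): 52}
second_example = {("low", "low"): 20, ("high", "low"): 50,
                  ("low", "high"): 40, ("high", "high"): 12}

print(a_and_ab_effects(first_example))
print(a_and_ab_effects(second_example))
```

In the first example the interaction is small relative to the A main effect; in the second the A main effect is close to zero while the interaction is large, which is why the main effect alone is misleading there.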

Interaction Graphically

[Figure: two plots of response against factor A (Low, High), each with one line for B Low and one for B High. One panel shows a factorial experiment without interaction; the other shows a factorial experiment with interaction.]

Factorial designs have several advantages:

They are more efficient than one-factor-at-a-time experiments.

A factorial design is necessary when interactions may be present, to avoid misleading conclusions.

Factorial designs allow the effect of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.

The two factor Factorial Design

The simplest type of factorial design involves only two factors.

There are a levels of factor A and b levels of factor B, and these are arranged in a factorial design.

There are n replicates, and each replicate of the experiment contains all ab treatment combinations.

Example: An engineer is designing a battery for use in a device that will be subjected to some extreme variations in temperature. The only design parameter that he can select is the plate material for the battery. For the purpose of testing, temperature can be controlled in the product development laboratory (Montgomery, 2013).

Life (in hours) data

Material                          Temperature
type        15                    70                    125
1           130  155   74  180     34   40   80   75     20   70   82   58
2           150  188  159  126    136  122  106  115     25   70   58   45
3           138  110  168  160    174  120  150  139     96  104   82   60

The design has two factors, each at three levels, and is therefore regarded as a 3^2 factorial design.

The engineer wants to answer the following questions:

1. What effects do material type and temperature have on the life of the battery?

2. Is there a choice of material that would give uniformly long life regardless of temperature?

Both factors are assumed to be fixed, hence we have a fixed effects model.

The design is a completely randomized design.

Analysis of Variance for Battery life (in hours)

Source                       DF    Seq SS     Adj SS     Adj MS      F      P-value
Material Type                 2   10683.7    10683.7     5341.9     7.91    0.002
Temperature                   2   39118.7    39118.7    19559.4    28.97    0.000
Material Type*Temperature     4    9613.8     9613.8     2403.4     3.56    0.019
Error                        27   18230.7    18230.7      675.2
Total                        35   77647.0

We have a significant interaction between temperature and material type.
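The sums of squares in the ANOVA table can be reproduced directly from the battery life data. The sketch below is one way to do it in plain Python for a balanced two-factor design; the function name `two_way_anova` and the dictionary layout are assumptions for illustration, and P-values are omitted because they would need an F-distribution routine from a statistics library.

```python
# Battery life (hours): life[material][temperature] holds the 4 replicates of a cell.
life = {
    1: {15: [130, 155, 74, 180], 70: [34, 40, 80, 75], 125: [20, 70, 82, 58]},
    2: {15: [150, 188, 159, 126], 70: [136, 122, 106, 115], 125: [25, 70, 58, 45]},
    3: {15: [138, 110, 168, 160], 70: [174, 120, 150, 139], 125: [96, 104, 82, 60]},
}

def two_way_anova(data):
    """Fixed-effects two-factor ANOVA for a balanced design (equal replicates per cell)."""
    rows = sorted(data)
    cols = sorted(data[rows[0]])
    a, b, n = len(rows), len(cols), len(data[rows[0]][cols[0]])
    observations = [y for r in rows for c in cols for y in data[r][c]]
    correction = sum(observations) ** 2 / (a * b * n)

    ss_total = sum(y * y for y in observations) - correction
    ss_a = sum(sum(sum(data[r][c]) for c in cols) ** 2 for r in rows) / (b * n) - correction
    ss_b = sum(sum(sum(data[r][c]) for r in rows) ** 2 for c in cols) / (a * n) - correction
    ss_cells = sum(sum(data[r][c]) ** 2 for r in rows for c in cols) / n - correction

    ss = {"A": ss_a, "B": ss_b, "AB": ss_cells - ss_a - ss_b, "Error": ss_total - ss_cells}
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "Error": a * b * (n - 1)}
    ms_error = ss["Error"] / df["Error"]

    table = {}
    for source in ss:
        ms = ss[source] / df[source]
        f_ratio = None if source == "Error" else ms / ms_error
        table[source] = {"df": df[source], "SS": ss[source], "MS": ms, "F": f_ratio}
    return table

anova = two_way_anova(life)  # A = material type, B = temperature
for source, row in anova.items():
    f_text = "" if row["F"] is None else format(row["F"], ".2f")
    print(f"{source:6s} {row['df']:3d} {row['SS']:10.1f} {row['MS']:10.1f} {f_text:>7s}")
```

Here A stands for material type and B for temperature, so the rows A, B, AB, and Error correspond to the four lines of the table above the total.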

Interaction plot

Significant interaction is indicated by the lack of parallelism of the lines. Longer life is attained at low temperature, regardless of material type.

The General Factorial Design

The results for the two – factor factorial design may be extended to the general case where there are a levels of factor A, b levels of factor B, c levels of factor C, and so on, arranged in a factorial experiment.

Blocking in a Factorial Design

Sometimes it is not feasible or practical to completely randomize all of the runs in a factorial.

The presence of a nuisance factor may require that the experiment be run in blocks.

The model is of the form

Response = Constant + Effect of factor A + Effect of factor B + Interaction effect + Block effect + Error term

The 2^k Factorial Designs

This is a case of a factorial design with k factors, each at only two levels.

These levels may be quantitative or qualitative.

A complete replicate of this design requires 2^k observations and is called a 2^k factorial design.

Assumptions

1. The factors are fixed.

2. The designs are completely randomized.

3. The usual normality assumptions are satisfied.

The design with only two factors, each at two levels, is called a 2^2 factorial design.

The levels of the factors may be arbitrarily called “Low” and “High”.

  Factor
A    B    Treatment combination
-    -    A Low, B Low       (1)
+    -    A High, B Low      a
-    +    A Low, B High      b
+    +    A High, B High     ab

The order in which the runs are made is completely randomized.

The four treatment combinations in the design can be represented by lower case letters.

The high level of a factor in any treatment combination is denoted by the corresponding lower case letter.

The low level of a factor in a treatment combination is represented by the absence of the corresponding letter.

The average effect of a factor is the change in the response produced by a change in the level of that factor averaged over the levels of the other factor

The symbols (1), a, b, and ab represent the totals of the observations over all n replicates taken at each treatment combination.

A main effect $= \frac{1}{2n}\,[ab + a - b - (1)]$

B main effect $= \frac{1}{2n}\,[ab + b - a - (1)]$

AB effect $= \frac{1}{2n}\,[ab + (1) - a - b]$

In experiments involving 2^k designs, it is always important to examine the magnitude and direction of the factor effects to determine which factors are likely to be important.

Effect magnitude and direction should always be considered along with the ANOVA, because the ANOVA alone does not convey this information.

Contrast A = ab + a – b – (1) = Total effect of A

We can write the treatment combinations in the order (1), a, b, ab. This is called the standard order (or Yates order).

Treatment          Factorial effect
combination     I    A    B    AB
(1)             +    -    -    +
a               +    +    -    -
b               +    -    +    -
ab              +    +    +    +

The above is also called the table of plus and minus signs.

We define;

Suppose that three factors, A, B and C, each at two levels, are of interest. The design is referred to as a 2^3 factorial design.

Treatment                  Factorial effects
combination     I    A    B    AB   C    AC   BC   ABC
(1)             +    -    -    +    -    +    +    -
a               +    +    -    -    -    -    +    +
b               +    -    +    -    -    +    -    +
ab              +    +    +    +    -    -    -    -
c               +    -    -    +    +    -    -    +
ac              +    +    -    -    +    +    -    -
bc              +    -    +    -    +    -    +    -
abc             +    +    +    +    +    +    +    +

A contrast = ab + a + ac + abc – (1) – b – c – bc

B contrast = ?
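The table of plus and minus signs follows a mechanical rule: the sign of an effect in a run is the product of the coded levels (+1 or -1) of the factors that appear in the effect. A minimal Python sketch of that rule is given below; the function name `sign_table` is a hypothetical helper, not something from the course. Reading down any column gives the signs of the corresponding contrast.

```python
from itertools import product
from math import prod

def sign_table(factors):
    """Map each treatment combination (in standard order) to its +1/-1 effect signs."""
    # Effects in the order I, A, B, AB, C, AC, BC, ABC, ...
    effects = [""]
    for factor in factors:
        effects += [effect + factor for effect in effects]

    table = {}
    for bits in product((-1, 1), repeat=len(factors)):
        # Reverse so that the first factor changes fastest (standard order).
        levels = dict(zip(factors, reversed(bits)))
        label = "".join(f.lower() for f in factors if levels[f] == 1) or "(1)"
        table[label] = {effect or "I": prod(levels[f] for f in effect) for effect in effects}
    return table

table = sign_table("ABC")
columns = list(table["(1)"])
print("      " + " ".join(f"{c:>3s}" for c in columns))
for label, row in table.items():
    marks = ["+" if row[c] == 1 else "-" for c in columns]
    print(f"{label:5s} " + " ".join(f"{m:>3s}" for m in marks))
```

Calling `sign_table("AB")` or `sign_table("ABCD")` gives the corresponding tables for the 2^2 and 2^4 designs.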

In general:

The design with k factors, each at two levels, is called a 2^k factorial design.

The treatment combinations are written in standard order using the notation introduced for the 2^2 and 2^3 designs.

A Single Replicate of the 2^k Design

For even a moderate number of factors, the total number of treatment combinations in a 2^k factorial design is large.

A 2^5 design has 32 treatment combinations.

A 2^6 design has 64 treatment combinations.

Resources are usually limited, and the number of replicates that the experimenter can employ may be restricted.

Frequently, available resources only allow a single replicate of the design to be run, unless the experimenter is willing to omit some of the original factors.

The analysis of an unreplicated factorial assumes that certain high-order interactions are negligible and combines their mean squares to estimate the error.

This is an appeal to the sparsity of effects principle: most systems are dominated by some of the main effects and low-order interactions, and most high-order interactions are negligible.

When analyzing data from unreplicated factorial designs, it is suggested to use a normal probability plot of the effect estimates.

Example: A chemical product is produced in a pressure vessel. A factorial experiment is carried out in the pilot plant to study the factors thought to influence the filtration rate of this product. The four factors are temperature (A), pressure (B), concentration of formaldehyde (C), and stirring rate (D). Each factor is present at two levels. The process engineer is interested in maximizing the filtration rate. The current process gives a filtration rate of around 75 gal/h. The process currently uses factor C at the high level. The engineer would like to reduce the formaldehyde concentration as much as possible but has been unable to do so because it always results in lower filtration rates (Montgomery, 2013).

     Factors          Treatment
A    B    C    D      combination    Response
-    +    -    -      b                 48
-    -    -    +      d                 43
-    +    +    +      bcd               70
-    +    +    -      bc                80
-    -    +    -      c                 68
+    -    -    -      a                 71
+    -    +    -      ac                60
-    +    -    +      bd                45
+    +    +    +      abcd              96
+    -    +    +      acd               86
+    +    +    -      abc               65
+    -    -    +      ad               100
-    -    +    +      cd                75
+    +    -    -      ab                65
-    -    -    -      (1)               45
+    +    -    +      abd              104

The design matrix and response data obtained from a single replicate of the 2^4 experiment

[Figure: normal probability plot of the effect estimates]

The important effects that emerge from this analysis are the main effects of A, C and D and the AC and AD interactions.

The main effect plot for Temperature

The plot indicates that it is better to run the temperature at the high level.

The main effect plot for Concentration of Formaldehyde

The plot indicates that it is better to run the concentration of formaldehyde at the high level.

The main effect plot for Stirring rate

The plot indicates that it is better to run the stirring rate at the high level.

However, it is necessary to examine any interactions that are important.

The AC interaction indicates that the best results are obtained with low concentration of formaldehyde and high temperature.

The AD interaction indicates that stirring rate D has little effect at low temperature but a large positive effect at high temperature.

Therefore the best filtration rates would appear to be obtained when A and D are at the high level and C is at the low level. This will allow the formaldehyde to be reduced to the lower level.

Model    Effect       Sum of      Percent
term     estimate     squares     contribution
A         21.63      1870.56      32.64
B          3.13        39.06       0.68
C          9.88       390.06       6.81
D         14.63       855.56      14.93
AB         0.13         0.063      1.091E-003
AC       -18.13      1314.06      22.93
AD        16.63      1105.56      19.29
BC         2.38        22.56       0.39
BD        -0.38         0.56       9.815E-003
CD        -1.12         5.06       0.088
ABC        1.88        14.06       0.25
ABD        4.13        68.06       1.19
ACD       -1.62        10.56       0.18
BCD       -2.63        27.56       0.48
ABCD       1.38         7.56       0.13

Factor effect estimates and sums of squares for the 2^4 design
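The effect estimates and sums of squares in this table can be recomputed from the 16 responses. In the sketch below, each contrast is the signed sum of the responses; for a single replicate (n = 1) the effect estimate is contrast / 2^(k-1) and the sum of squares is contrast^2 / 2^k. The function name `effect_table` is an assumption for illustration, and percent contribution is taken relative to the sum of all effect sums of squares, which equals the total sum of squares when there is only one replicate.

```python
from math import prod

# Filtration rate (gal/h) for each treatment combination of the 2^4 experiment.
response = {
    "(1)": 45, "a": 71, "b": 48, "ab": 65, "c": 68, "ac": 60, "bc": 80, "abc": 65,
    "d": 43, "ad": 100, "bd": 45, "abd": 104, "cd": 75, "acd": 86, "bcd": 70, "abcd": 96,
}

def effect_table(response, factors, n=1):
    """Effect estimate, sum of squares, and percent contribution for every effect.

    `response` holds the total over the n replicates at each treatment combination.
    """
    effects = [""]
    for factor in factors:
        effects += [effect + factor for effect in effects]
    effects = effects[1:]  # drop the identity column I

    k = len(factors)
    rows = {}
    for effect in effects:
        # Sign of a run = product of the coded levels of the factors in the effect.
        contrast = sum(
            prod(1 if f.lower() in treatment else -1 for f in effect) * y
            for treatment, y in response.items()
        )
        rows[effect] = {
            "estimate": contrast / (n * 2 ** (k - 1)),
            "ss": contrast ** 2 / (n * 2 ** k),
        }
    ss_all = sum(row["ss"] for row in rows.values())
    for row in rows.values():
        row["percent"] = 100 * row["ss"] / ss_all
    return rows

rows = effect_table(response, "ABCD")
for effect, row in rows.items():
    print(f"{effect:5s} {row['estimate']:8.3f} {row['ss']:10.2f} {row['percent']:7.2f}")
```

The printed values carry one more decimal than the slide table, so small differences in the last digit are rounding only.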

ANOVA for A, C and D

Source            DF    Seq SS     Adj SS     Adj MS      F        P
A                  1   1870.56    1870.56    1870.56    83.36    <0.0001
C                  1    390.06     390.06     390.06    17.38    <0.0001
D                  1    855.56     855.56     855.56    38.13    <0.0001
A*C                1   1314.06    1314.06    1314.06    58.56    <0.0001
A*D                1   1105.56    1105.56    1105.56    49.27    <0.0001
C*D                1      5.06       5.06       5.06    <1
A*C*D              1     10.56      10.56      10.56    <1
Residual Error     8    179.52                 22.44
Total             15   5730.94

Blocking and Confounding in the 2^k Factorial Designs

There are situations that may prevent the experimenter from performing all of the runs in a 2^k factorial experiment under homogeneous conditions.

A single batch of raw material might not be large enough to make all of the required runs.

An experimenter with prior knowledge may decide to run a pilot experiment with different batches of raw materials.

The design technique used in these situations is blocking.

Blocking a Replicated 2^k Factorial Design

Suppose that the 2^k factorial design has been replicated n times.

With n replicates, each set of homogeneous conditions defines a block, and each replicate is run in one of the blocks.

The runs in each block (or replicate) are made in random order.

Confounding in the 2^k Factorial Designs

In many situations it is impossible to perform a complete replicate of a factorial design in one block.

Confounding is a design technique for arranging a complete factorial experiment in blocks, where the block size is smaller than the number of treatment combinations in one replicate.

The technique causes information about certain treatment effects (usually higher-order interactions) to be indistinguishable from, or confounded with, blocks.

Confounding the 2^k Factorial Design in Two Blocks

Suppose we want to run a single replicate of the 2^2 design.

Each of the 2^2 = 4 treatment combinations requires a quantity of raw material.

Suppose each batch of raw material is only large enough for two treatment combinations to be tested; thus two batches of raw material are required.

If batches of raw material are considered as blocks, then we must assign two of the four treatment combinations to each block.

Consider the table of plus and minus signs for the 2^2 design.

Treatment          Factorial effect
combination     I    A    B    AB    Block
(1)             +    -    -    +     1
a               +    +    -    -     2
b               +    -    +    -     2
ab              +    +    +    +     1

The block effect and the AB interaction are identical. That is, AB is confounded with blocks.

The order in which the treatment combinations are run within a block is randomly determined.

Block 1: (1), ab

Block 2: a, b

This scheme can be used to confound any 2^k design in two blocks.

Consider a 2^3 design run in two blocks.

Suppose we wish to confound the ABC interaction with blocks.


Treatment                  Factorial effect
combination     I    A    B    AB   C    AC   BC   ABC    Block
(1)             +    -    -    +    -    +    +    -      1
a               +    +    -    -    -    -    +    +      2
b               +    -    +    -    -    +    -    +      2
ab              +    +    +    +    -    -    -    -      1
c               +    -    -    +    +    -    -    +      2
ac              +    +    -    -    +    +    -    -      1
bc              +    -    +    -    +    -    +    -      1
abc             +    +    +    +    +    +    +    +      2

Again, we assign the treatment combinations that are minus on ABC to Block 1 and the rest to Block 2.

The treatment combinations within a block are run in a random order.

Block 1: (1), ab, ac, bc

Block 2: a, b, c, abc

ABC is confounded with blocks.

Alternative method for constructing the blocks

The method uses the linear combination

$L = a_1 x_1 + a_2 x_2 + \cdots + a_k x_k$

This is called a defining contrast.

For the 2^k design, x_i = 0 (low level) or x_i = 1 (high level) is the level of the ith factor in a treatment combination, and a_i = 0 or 1 is the exponent of the ith factor in the effect to be confounded.

Treatment combinations that produce the same value of L (mod 2) are placed in the same block.

The only possible values of L (mod 2) are 0 and 1, hence we will have exactly two blocks.
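The defining-contrast rule translates almost line by line into code. The following Python sketch is an illustration under assumed names (`confound_in_two_blocks` is hypothetical): it evaluates L (mod 2) for every treatment combination and groups the combinations by that value. Which of the two groups is called Block 1 is arbitrary.

```python
from itertools import product

def confound_in_two_blocks(factors, confounded_effect):
    """Split the 2^k treatment combinations into two blocks using L (mod 2)."""
    # a_i = 1 if factor i appears in the effect to be confounded, else 0.
    a = [1 if f in confounded_effect else 0 for f in factors]
    blocks = {0: [], 1: []}
    for bits in product((0, 1), repeat=len(factors)):
        x = bits[::-1]  # reverse so the first factor changes fastest (standard order)
        L = sum(a_i * x_i for a_i, x_i in zip(a, x))
        label = "".join(f.lower() for f, x_i in zip(factors, x) if x_i) or "(1)"
        blocks[L % 2].append(label)
    return blocks

print(confound_in_two_blocks("ABC", "ABC"))
print(confound_in_two_blocks("ABCD", "ABCD"))
```

For the 2^3 design with ABC confounded, the group with L = 0 (mod 2) is (1), ab, ac, bc, which matches the block obtained from the minus signs in the ABC column.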

If resources are sufficient to allow the replication of confounded designs, it is generally better to use a slightly different method of designing the blocks in each replicate

We can confound different effects in each replicate so that some information on all effects is obtained

This approach is called partial confounding

Consider our previous example (the filtration rate experiment), with two modifications:

1. The 16 treatment combinations cannot all be run using one batch of raw material. The experimenter will use two batches of raw material, hence two blocks, each with 8 runs.

2. Introduce a block effect by considering one batch to be of poor quality, such that all the responses will be 20 units lower in this block.

The experimenter will confound the highest-order interaction, ABCD, with blocks.

The defining contrast is

$L = x_1 + x_2 + x_3 + x_4$

The two resulting blocks are:

Block 1: (1), ab, ac, bc, ad, bd, cd, abcd

Block 2: a, b, c, d, abc, bcd, acd, abd

The half Normal plot for the blocked design

Source            DF    Seq SS     Adj SS     Adj MS       F       P-Value
Blocks             1   1387.56    1387.56    1387.56
A                  1   1870.56    1870.56    1870.56     89.76    <0.0001
C                  1    390.06     390.06     390.06     18.72    <0.0001
D                  1    855.56     855.56     855.56     41.05    <0.0001
A*C                1   1314.06    1314.06    1314.06     63.05    <0.0001
A*D                1   1105.56    1105.56    1105.56     53.05    <0.0001
Residual Error     9    187.56     187.56     20.8403
Total             15   7110.94
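The effect of the poor batch can be simulated in a few lines. The sketch below subtracts 20 units from every response in one block and recomputes two effects. It assumes that the poor batch is the block containing (1); the slides do not say which batch it is, but this choice reproduces the Blocks sum of squares of 1387.56 in the table above. The helper name `effect` is hypothetical.

```python
from math import prod

# Original filtration rates (gal/h) from the unblocked 2^4 experiment.
original = {
    "(1)": 45, "a": 71, "b": 48, "ab": 65, "c": 68, "ac": 60, "bc": 80, "abc": 65,
    "d": 43, "ad": 100, "bd": 45, "abd": 104, "cd": 75, "acd": 86, "bcd": 70, "abcd": 96,
}

# Assumption: the poor-quality batch is the block that contains (1).
poor_batch = ["(1)", "ab", "ac", "bc", "ad", "bd", "cd", "abcd"]
blocked = {t: (y - 20 if t in poor_batch else y) for t, y in original.items()}

def effect(name, response):
    """Effect estimate for a single replicate: contrast / (number of runs / 2)."""
    contrast = sum(
        prod(1 if f.lower() in t else -1 for f in name) * y
        for t, y in response.items()
    )
    return contrast / (len(response) / 2)

for name in ["A", "C", "D", "AC", "AD", "ABCD"]:
    print(f"{name:5s} original {effect(name, original):8.3f}   blocked {effect(name, blocked):8.3f}")

# Sum of squares for blocks = sum of squares of the confounded ABCD contrast.
ss_blocks = (effect("ABCD", blocked) * 8) ** 2 / 16
print(f"SS blocks = {ss_blocks:.2f}")
```

Every effect other than ABCD is unchanged, because each block contains four runs at the plus level and four at the minus level of that effect, so the 20-unit shift cancels. Only the ABCD estimate absorbs the block difference.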

Similar methods can be used to confound the 2^k designs in four blocks, and so on, depending on requirements.

NOTE: Blocking is a noise reduction technique. If we don’t block, then the added variability from the nuisance variable effect ends up getting distributed across the other design factors.

Two-Level Fractional Factorial Designs

As the number of factors in a 2^k factorial design increases, the number of runs required for a complete replicate of the design rapidly outgrows the resources of most experimenters.

If the experimenter can reasonably assume that certain high-order interactions are negligible, information on the main effects and lower-order interactions may be obtained by running a fraction of a complete factorial experiment.

Fractional factorial designs are widely used for product and process design, process improvement, and industrial/business experimentation.

Fractional factorials are used for screening experiments.

The successful use of fractional factorial designs is based on three key ideas:

1. The sparsity of effects principle

2. The projection property

3. Sequential experimentation

The One-Half Fraction of the 2^k Design

Suppose an experimenter has three factors, each at two levels, but cannot afford to run all 2^3 = 8 treatment combinations.

They can, however, afford four runs.

This suggests a one-half fraction of a 2^3 design.

A one-half fraction of the 2^3 design is often called a 2^(3-1) design.

Recall the table of plus and minus signs for a 2^3 design.

Suppose we select those treatment combinations that have a plus in the ABC column to form the 2^(3-1) design. Then ABC is called a generator of this particular design.

Usually a generator such as ABC is referred to as a WORD.

The identity column is always plus, so we call

I = ABC

the defining relation for our design.

Now, it is impossible to differentiate between A and BC, B and AC, and C and AB.

We say the effects are aliased.

The alias structure is easily determined by multiplying any column by the defining relation, using the fact that any column multiplied by itself gives the identity column:

$A \cdot I = A \cdot ABC = A^2BC = BC$, so A = BC

$B \cdot I = B \cdot ABC = AB^2C = AC$, so B = AC

This half fraction with I = ABC is called the principal fraction.
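Multiplying an effect by the defining word and cancelling squared letters is the same as taking the symmetric difference of the two sets of letters. A minimal Python sketch of this rule follows; the function names `multiply` and `aliases` are assumptions for illustration.

```python
def multiply(effect_1, effect_2):
    """Multiply two effects; any squared letter cancels (e.g. A * A = I)."""
    letters = set(effect_1) ^ set(effect_2)  # letters appearing in exactly one effect
    return "".join(sorted(letters)) or "I"

def aliases(effects, defining_word):
    """Alias of each effect in a half fraction with I = defining_word."""
    return {effect: multiply(effect, defining_word) for effect in effects}

print(aliases(["A", "B", "C"], "ABC"))
print(aliases(["A", "AB", "AC", "AD"], "ABCD"))
```

With I = ABC every main effect is aliased with a two-factor interaction, while with I = ABCD main effects are aliased with three-factor interactions and two-factor interactions are aliased with each other.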

Design Resolution

A design is of resolution R if no p-factor effect is aliased with another effect containing fewer than R – p factors.

Roman numeral subscripts are usually used to denote design resolution.

Designs of resolution III, IV and V are particularly important.

Resolution III designs: These are designs in which no main effect is aliased with any other main effect, but main effects are aliased with two-factor interactions, and some two-factor interactions may be aliased with each other.

e.g. the 2^(3-1) design with I = ABC is of resolution III

Resolution IV designs: No main effect is aliased with any other main effect or with any two-factor interaction, but two-factor interactions are aliased with each other.

e.g. a 2^(4-1) design with I = ABCD is a resolution IV design

Resolution V designs: No main effect or two-factor interaction is aliased with any other main effect or two-factor interaction, but two-factor interactions are aliased with three-factor interactions.

e.g. a 2^(5-1) design with I = ABCDE is a resolution V design

Construction of the One-Half Fraction

A one-half fraction of the 2^k design is obtained by writing down a basic design consisting of the runs for the full 2^(k-1) factorial and then adding the kth factor by identifying its plus and minus levels with the plus and minus signs of the highest-order interaction ABC...(K-1).

The 2^(3-1) resolution III design is obtained by writing down the full 2^2 factorial as the basic design and then equating C to the AB interaction.

One-half fraction of the 2^3 design

        Full 2^2 factorial
        (basic design)         Resolution III, I = ABC
Run     A    B                 A    B    C = AB
1       -    -                 -    -    +
2       +    -                 +    -    -
3       -    +                 -    +    -
4       +    +                 +    +    +
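The construction rule for a one-half fraction can be sketched in Python as follows. The function name `half_fraction` is hypothetical; it writes down the full factorial in the first k-1 factors and sets the last factor equal to the product of the others, which corresponds to the defining relation I = ABC...K.

```python
from itertools import product
from math import prod

def half_fraction(factors):
    """Runs of the half fraction with the last factor = product of all the others."""
    basic = factors[:-1]
    runs = []
    for bits in product((-1, 1), repeat=len(basic)):
        levels = list(bits[::-1])    # first factor changes fastest (standard order)
        levels.append(prod(levels))  # generator: last factor = interaction of the rest
        runs.append(dict(zip(factors, levels)))
    return runs

for run_number, run in enumerate(half_fraction("ABC"), start=1):
    signs = "  ".join("+" if level == 1 else "-" for level in run.values())
    print(run_number, signs)
```

For three factors this gives the four runs with C = AB shown in the table, and the product of the three coded levels is +1 in every run, as the defining relation I = ABC requires.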

Consider the filtration rate example.

We will simulate what would have happened if a half fraction of the 2^4 design had been run instead of the full factorial.

We will use the 2^(4-1) design with I = ABCD, as this gives the highest resolution possible.

• We will first write down the basic design, which is a 2^3 design.

• The basic design has eight runs but only three factors.

• To find the levels of the fourth factor, we solve I = ABCD for D:

$D \cdot I = D \cdot ABCD = ABCD^2 = ABC$, so D = ABC

The resolution IV design with I = ABCD

        Basic design                      Treatment
Run     A    B    C      D = ABC          combination
1       -    -    -      -                (1)
2       +    -    -      +                ad
3       -    +    -      +                bd
4       +    +    -      -                ab
5       -    -    +      +                cd
6       +    -    +      -                ac
7       -    +    +      -                bc
8       +    +    +      +                abcd

Term     Effect
A        19.000
B         1.500
C        14.000
D        16.500
A*B      -1.000
A*C     -18.500
A*D      19.000

Estimates of effects

A, C and D have large effects, and so do the interactions involving them.
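These estimates can be reproduced from the eight runs of the half fraction, using the filtration rates observed at those treatment combinations in the full experiment. The sketch below is an illustration with a hypothetical helper named `effect`. Each number is really an estimate of an effect plus its alias (for example A + BCD, or AB + CD), which is the price paid for running only half of the design.

```python
from math import prod

# Filtration rates (gal/h) at the eight runs of the half fraction with I = ABCD.
half = {"(1)": 45, "ad": 100, "bd": 45, "ab": 65, "cd": 75, "ac": 60, "bc": 80, "abcd": 96}

def effect(name, response):
    """Estimate = (mean response where the effect is +) - (mean where it is -)."""
    contrast = sum(
        prod(1 if f.lower() in t else -1 for f in name) * y
        for t, y in response.items()
    )
    return contrast / (len(response) / 2)

estimates = {name: effect(name, half) for name in ["A", "B", "C", "D", "AB", "AC", "AD"]}
for name, value in estimates.items():
    print(f"{name:3s} {value:8.3f}")
```

The A, C, D, AC, and AD estimates are close to those from the full factorial, which supports the conclusion that the half fraction would have identified the same important effects.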

Thank you

Reference

Montgomery, D. C. (2013). Design and Analysis of Experiments. Wiley, New York.