Further exploratory analysis of split-plot experiments to study certain stratified effects

This article was downloaded by: [Tufts University]
On: 14 November 2014, At: 14:16
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied Statistics
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20

Further exploratory analysis of split-plot experiments to study certain stratified effects
Tony Cooper (a) & Mary G. Leitnaker (b)
(a) Six Sigma Associates, Knoxville, TN, USA
(b) University of Tennessee, Knoxville, TN, USA
Published online: 22 Jan 2007.

To cite this article: Tony Cooper & Mary G. Leitnaker (2006) Further exploratory analysis of split-plot experiments to study certain stratified effects, Journal of Applied Statistics, 33:8, 773-786, DOI: 10.1080/02664760600742201

To link to this article: http://dx.doi.org/10.1080/02664760600742201

Further Exploratory Analysis of

Split-plot Experiments to Study

Certain Stratified Effects

TONY COOPER* & MARY G. LEITNAKER**

*Six Sigma Associates, Knoxville, TN, USA; **University of Tennessee, Knoxville, TN, USA

ABSTRACT Designed experiments are a key component in many companies’ improvement strategies. Because completely randomized experiments are not always reasonable from a cost or physical perspective, split-plot experiments are prevalent. The recommended analysis accounts for the different sources of variation affecting whole-plot and split-plot error. However, experiments on industrial processes must be run and, consequently, analyzed quite differently from ones run in a controlled environment. Such experiments are typically subject to a wide array of uncontrolled, and barely understood, variation. In particular, it is important to examine the experimental results for additional, unanticipated sources of variation. In this paper, we consider how unanticipated, stratified effects may influence a split-plot experiment and discuss further exploratory analysis to indicate the presence of stratified effects. Examples of such experiments are provided, additional tests are suggested and discussed in light of their power, and recommendations are given.

KEY WORDS: Designed experiment, split-plot error, source of variation

Introduction

Although, as the name suggests, split-plot experiments were introduced as a strategy for

agricultural experiments, the split-plot arrangement is also common in industry. Box

(1996) notes that it is often ‘not only most convenient but also most efficient to run an

industrial experiment in a split-plot mode.’ In this same article, Box also notes that

although the split plot structure is often run in industry, it is sometimes not recognized

as such and consequently the experiment is not analyzed correctly. Daniel (1976)

stated, although possibly with some exaggeration, that ‘All industrial experiments are

split-plot experiments’. A primary reason that split plot experiments are so common in

industry is that by reducing the number of times some factors are changed, the cost and

time of running the experiment is greatly reduced. In fact, many industrial experiments

would be prohibitively costly or time-consuming were they not run as split plots. And

when an experiment is not randomized the experimenter will have to accept certain

Journal of Applied Statistics

Vol. 33, No. 8, 773–786, September 2006

Correspondence Address: Mary G. Leitnaker, College of Business Administration, Department of Statistics,

University of Tennessee, 331 Stokely Management Center, Knoxville, TN 37996 0532, USA. Email:

[email protected]

0266-4763 Print/1360-0532 Online/06/080773–14 © 2006 Taylor & Francis
DOI: 10.1080/02664760600742201

compromises (Hahn, 1978). Other authors have discussed the use of split plots in industrial

experiments. Hicks (1993) uses an industrial example to introduce his discussion of split-

plot experiments. Anderson & McLean (1974) have an extensive discussion of the use of

split-plots in industrial experiments.

Experimenting in an Industrial Setting

Statistically designed experiments on industrial processes, as opposed to experiments con-

ducted in more controlled settings, are almost always conducted in an environment subject

to a wide array of uncontrolled variation. In fact, experiments are often performed on an

industrial process to gain a better understanding of the myriad sources of variation acting

on the process. An experiment that is to be conducted on an ongoing process must consider

the nature of the sources of variation that affect measured outcomes of the process. In par-

ticular, the time span over which the variation occurs is an important consideration. For

example, variation that can be attributed to a machine performing repeated cycles but

in a slightly different fashion each time would likely be observed to create variation in

outcomes measured in a short time span. On the other hand, variation in incoming

raw materials would have a different impact on variation. Variation within a lot of raw

materials may affect short-term variation. Significant variation between lots of raw

material may create ‘long-term’ variation, i.e. variation that would be captured when

measurements are made over a long time span. And of course any process will be

subject to variation from many other sources stemming from different methods, materials,

practices, etc that occur in the process. So before designing an experiment, it would be

necessary to have a good understanding of what these likely sources of variation are

and how their impact is likely to be felt on measured process outcomes. Clearly, a com-

plete knowledge of these sources is not feasible. Yet, our best understanding of their beha-

vior guides our choice of experimental design.

The time span over which suspected sources of variation are likely to affect process out-

comes is one major element in choosing an experimental design. For example, if we know

that the process is affected by variation in incoming lots of material, we would like to

conduct experiments in blocks to specifically study the manner in which factor effects

might change across different lots of incoming material. A discussion of this use of block-

ing and how our analysis of an industrial experiment is performed when blocking is used is

described in papers by Sanders et al. (2001) and Leitnaker & Mee (2001). As we shall see

later, this approach to analyzing blocked experiments also informs our analysis of split-

plot experiments.

Another manner in which variation might present itself in an industrial process is

spatially. For example, a process may require that material pass through a drying oven.

Material that moves through one area of this oven may result in a different measured

outcome than material that was exposed to a different area of the oven. Many similar situ-

ations exist in industrial processes. Incoming batches of raw material may have had a criti-

cal element settle in transportation, meaning that the process will experience a fixed

change across the time that the process is using this batch. Multiple spindles exist on a

machine forming paper cups as well as on the one forming lids. Multiple lanes of

product are stamped by a cutting machine. Sheets of glass, aluminum, paper, etc are

subject to variations in the cross as well as machine direction. Again, our knowledge

about the likely occurrence of this type of ‘spatial’ variation will be needed to design

an experiment. Here again, we cannot expect complete knowledge of such sources of vari-

ation. But the more complete our knowledge the better will be the experiment we design.

Furthermore, it is the objective of this paper to illustrate how our analysis of experiments

run in an industrial setting should consider the possibility that such spatial variation exists,

even though we may have been unaware of its possibility prior to experimentation.

Manufacturing Example: Adhesive Experiment

The example that generated our present discussion of split-plot experiments occurred in

the electronics industry and involved an experiment to study the application of an

epoxy. In this instance, the type of ‘spatial variation’ referred to in the preceding paragraph

was expected to exist in the application process. Applying the correct amount or thickness

of an epoxy, as measured by height (H ), is critical to proper functioning of a component.

A robot is programmed to dispense the epoxy on the perimeter of a part. The specification

of 2–3 mm is to ensure proper adhesion and size at the next step in the process. The engin-

eers who designed the experiment wanted to study the effects of three settings on the robot

that applied the epoxy. These settings were:

• the distance (D) that the robot nozzle was from the surface,

• the pressure (P) at which the epoxy was being dispensed, and

• the speed (S) at which the robot nozzle moved along the epoxy ledge.

Furthermore, the epoxy used by the robot was purchased in tubes. Engineers who worked

on the process suspected that the consistency of the epoxy material affected the height. In

particular, the adhesive would age once the tube was opened and this aging might affect

height. A factor (E), end of tube, was used to capture the effect of this aging. A set of eight

experimental runs was completed on each end (E) of each tube, the front and the back.

Thus, the factor E captures a spatial effect (front to back of tube) thought to be active

in creating variation in the application process. The eight runs were performed by rando-

mizing two levels of each of the three factors, D, P, and S.

The engineers wanted to examine both the effect of within-tube variation as well as

tube-to-tube variation. Although previous studies have shown that the epoxy material is

within specifications for composition and viscosity, it is still believed that variation

within these specifications might affect the final height of applied epoxy. To address

the above experimental issues, five tubes (T ) of epoxy were used. Following the notation

of Anderson & McLean (1974), the tubes are considered a replication factor, E, the tube

end, is a whole-plot factor, and D, P, and S are split-plot factors.

The ANOVA table for the above split-plot experiment appears in Table 1. Following the

suggested analysis of Anderson & McLean (1974), the interaction term of the replication

(T) and whole-plot factor (E) has been considered to be whole-plot error; the F test for E

uses this term in the denominator.

A summary of the results of this experiment should note that the whole-plot factor (E)

has little effect when compared to whole-plot error. In addition, an examination of the

magnitude of the interaction terms involving T and other split-plot terms, shows that

they all appear to be of the same size. There is no evidence that results are inconsistent

from tube-to-tube. In other words, the ‘long-term’ tube-to-tube variation does not

appear to behave inconsistently across time. Consequently, the split-plot error term was

formed by pooling the sums of squares of all split-plot terms that contain the replication

factor T. The main effects for P and D appear to be significant. Better management of these

factors may aid in better controlling the height dimension in this process.
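The pooling and the F ratios described above can be retraced from the mean squares reported in Table 1. The following is a minimal sketch in Python, not the software used for the original analysis; the dictionary keys and variable names are ours, chosen only for illustration.

```python
# Mean squares (Table 1) of the 14 terms that contain the replication
# factor T and were pooled into split-plot error; each has 4 df.
pooled_ms = {
    "TxD": 0.2250, "TxP": 0.1969, "TxS": 0.0857,
    "TxExD": 0.1106, "TxExP": 0.2552, "TxExS": 0.6621,
    "TxPxD": 0.1512, "TxDxS": 0.1907, "TxPxS": 0.0612,
    "TxExPxS": 0.0172, "TxExPxD": 0.0510, "TxExDxS": 0.3621,
    "TxPxSxD": 0.1537, "TxExPxSxD": 0.1791,
}
DF_PER_TERM = 4

# Pooling adds sums of squares and degrees of freedom.
split_df = DF_PER_TERM * len(pooled_ms)
split_ss = sum(ms * DF_PER_TERM for ms in pooled_ms.values())
split_ms = split_ss / split_df

whole_ms = 0.0788  # MS(TxE), the estimate of whole-plot error

# The whole-plot factor E is tested against whole-plot error;
# split-plot factors such as D and P are tested against split-plot error.
f_E = 0.0924 / whole_ms
f_D = 3.9485 / split_ms
f_P = 4.2888 / split_ms

print(f"split-plot error: df = {split_df}, MS = {split_ms:.4f}")
print(f"F(E) = {f_E:.3f}, F(D) = {f_D:.3f}, F(P) = {f_P:.3f}")
```

Small differences in the last digit relative to Table 1 can arise because the tabled mean squares are themselves rounded.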

An interesting facet of the ANOVA table is that the estimate of whole-plot error seems

considerably smaller than the split-plot error. Of course, the expected mean square of

whole-plot error is larger than or equal to the expected mean square of split-plot error

under the usual model assumptions. So if it were concluded that split-plot error was ‘too

large’ as compared to whole-plot error, some unexplained variation is being shown. And a

better understanding of what is inducing this variation is likely to result in significant

process improvements.

In the present situation it is instructive to note that the engineers included ‘tube end’ as a

factor and they studied multiple tubes. They were concerned with the effect of the consist-

ency of the epoxy material within and between tubes. These potential sources of variation

need to be considered further. In this experiment tube-to-tube was addressed by replicating

on different tubes. The within tube variability was thought to be primarily the result of

aging once a tube was opened. The factor E captures and quantifies this effect. The exper-

imenter’s expectation as to within-tube variability is described in Figure 1(a). However,

the inflated split-plot error could be explained if the experimenter misunderstood the

nature of the within-tube variation. Figure 1(b) describes alternate within-tube variability

Table 1. ANOVA table for split-plot analysis of adhesive experiment

Source              DF   Mean square   F        p-value
T                    4   0.3258
E                    1   0.0924        1.173    0.3397
T×E                  4   0.0788        (estimate of whole-plot error)
D                    1   3.9485        20.460   0.0000
P                    1   4.2888        22.223   0.0000
S                    1   0.5922        3.069    0.0853
P×D                  1   0.4317        2.237    0.1403
D×S                  1   0.0644        0.333    0.5659
P×S                  1   0.0047        0.024    0.8762
D×P×S                1   0.2978        1.543    0.2193
E×D                  1   0.2041        1.058    0.3082
E×P                  1   0.0064        0.033    0.8563
E×S                  1   0.5061        2.622    0.1110
E×P×D                1   0.4604        2.386    0.1281
E×D×S                1   0.0281        0.146    0.7043
E×P×S                1   0.2321        1.203    0.2775
E×D×P×S              1   0.1632        0.846    0.3618

Terms pooled into split-plot error
T×D                  4   0.2250
T×P                  4   0.1969
T×S                  4   0.0857
T×E×D                4   0.1106
T×E×P                4   0.2552
T×E×S                4   0.6621
T×P×D                4   0.1512
T×D×S                4   0.1907
T×P×S                4   0.0612
T×E×P×S              4   0.0172
T×E×P×D              4   0.0510
T×E×D×S              4   0.3621
T×P×S×D              4   0.1537
T×E×P×S×D            4   0.1791
Split-plot error    56   0.1930


that due to the way the experiment was run would inflate the split-plot error but not

increase the whole-plot error!

The experimental strategy is to run the eight runs making up the whole-plot at each end

of the tube. If the alternative view (b) were true, then the inconsistency would affect these

runs. However, since it is systematic, the average adhesive quality is the same at both ends

of the tube – the whole-plot error estimate would not include the effect of adhesive

consistency.
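A small numerical illustration may help here. The sketch below uses hypothetical height values and a hypothetical shift g (neither is data from the adhesive experiment) to show that a balanced, systematic shift within one tube end leaves the whole-plot mean unchanged while increasing the variation among the eight runs.

```python
from statistics import fmean

def within_ss(values):
    """Sum of squared deviations of the runs about their whole-plot mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)

g = 0.5  # hypothetical size of the systematic shift

# Hypothetical heights for the eight runs made at one tube end.
base = [2.4, 2.6, 2.5, 2.3, 2.7, 2.5, 2.4, 2.6]

# Half of the runs are raised by g and half lowered by g (sums to zero).
shift = [+g, -g, +g, -g, -g, +g, -g, +g]
shifted = [y + s for y, s in zip(base, shift)]

# The whole-plot mean, which feeds whole-plot error, is unchanged ...
print(fmean(base), fmean(shifted))
# ... but the within-whole-plot variation, which feeds split-plot error, grows.
print(within_ss(base), within_ss(shifted))
```

In this constructed example the shift pattern happens to be uncorrelated with the base values, so the within-whole-plot sum of squares grows by exactly 8g².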

One can imagine other industrial situations where the split-plot error estimate would be

larger than the whole-plot error estimate due to a stratified effect being present. Some

additional examples would be:

(1) Ovens often have airflow patterns and hot spots that are not clearly understood. An

experimenter may fill the oven with experimental units from several treatment com-

binations, but still require multiple oven runs. The experimenter may not realize

that there is a consistent temperature gradient through the oven, top-to-bottom,

front-to-back, or maybe middle to outside. This variation would affect the units in

the run, but on average, each run would be the same.

(2) Mixing tanks could have systematic variation from the top to the bottom of the tank

or middle to outer edges. Multiple treatment combinations could be run from each

tank, but the whole experiment might require several tanks. Again, the systematic

effect will average out from tank-load to tank-load, but affect individual runs

within a tank-load. Thus the split-plot error will be inflated in comparison to the

whole-plot error.

In each of these cases, an inflated split-plot error could result from misunderstanding a spatial source of variation present in the process, so that the experiment was not designed to account for this source.

In the following section we explore a model that incorporates a stratified effect within

the split-plot. Calculation of the expected mean squares for this model provides a basis for

speculating on the reasons for large split-plot error relative to whole-plot error.

ANOVA Table for Model with Stratified Fixed Effects

In this section we consider a small split-plot experiment with one whole-plot factor and

one split-plot factor. The usual statistical model for this situation is written as:

$$Y_{ijk} = \mu + R_i + \lambda_j + (R\lambda)_{ij} + \delta_{ij} + \phi_k + (\lambda\phi)_{jk} + (R\phi)_{ik} + (R\lambda\phi)_{ijk} + \varepsilon_{ijk}$$

Figure 1. Within-tube consistency. (Density of Grey Represents Adhesive Quality.)


for $i = 1, \ldots, r$; $j = 1, \ldots, a$; and $k = 1, \ldots, b$.

We will assume that the $\delta_{ij}$ are iid normal with mean 0 and variance $\sigma^2_{wp}$ and the $\varepsilon_{ijk}$ are iid normal with mean 0 and variance $\sigma^2$.

In the above model, $r$ is the number of replications, $a$ is the number of levels of the whole-plot factor, A, and $b$ is the number of levels of the split-plot factor, B. The term $\delta_{ij}$ is the whole-plot error term. Its variance is estimated by the mean square of R×A. And $\varepsilon_{ijk}$ is the split-plot error term. Assuming no replication by treatment interaction, as was determined to be true in our previous example, the split-plot variance can be estimated by the combined mean squares of R×B and R×A×B. An ANOVA table for this model is provided in Table 2.
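The degrees of freedom attached to each line of this ANOVA follow directly from r, a and b. As a minimal sketch (the function name and dictionary labels are ours, for illustration only), they can be tabulated and checked against the total of rab − 1:

```python
def split_plot_df(r, a, b):
    """Degrees of freedom for each line of the split-plot ANOVA.

    r: number of replications
    a: number of levels of the whole-plot factor A
    b: number of runs made within each whole plot
    """
    if min(r, a, b) < 2:
        raise ValueError("r, a and b must each be at least 2")
    return {
        "R": r - 1,
        "A": a - 1,
        "RxA (whole-plot error)": (r - 1) * (a - 1),
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "RxB + RxAxB (split-plot error)": a * (r - 1) * (b - 1),
    }

# Adhesive experiment: 5 tubes, 2 tube ends, 8 runs per tube end.
df = split_plot_df(r=5, a=2, b=8)
assert sum(df.values()) == 5 * 2 * 8 - 1  # the lines account for all rab - 1 df
print(df)
```

For the adhesive experiment this gives 4 df for whole-plot error and 56 df for split-plot error, in agreement with Table 1.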

We wish to consider the situation, described in the previous section, where there is a

stratification occurring within each whole-plot. The nature of this stratification is likely

itself to be subject to variation. But in order to gain insight into the effect that the stra-

tification has on the expected mean squares, we will assume a fixed stratification within

each whole-plot. In fact, we will assume that the nature of the stratification is to simply

increase the value of half of the observations within a whole-plot by an amount g and

decrease the value of the remaining observations by the same amount, g. Since the

split-plot levels are assigned at random within a whole-plot, the observations that are

increased are considered to be ‘randomly chosen.’ Thus, whether an observation is

increased or decreased by the stratification will be assumed independent of the

effects of the other factors. Furthermore, our derivation of the expected mean

squares for the split-plot error terms will assume that there is an even number of

levels of the split-plot factor, B. In other words, b is assumed to be an even

number. A model for this experiment can be written as:

$$Z_{ijk} = Y_{ijk} + G_{ijk}$$

Table 2. ANOVA for split-plot experiment

Source                                        df                 MS         EMS
R                                             r − 1              MS(R)      $\sigma^2 + ab\sigma^2_{Rep}$
A                                             a − 1              MS(A)      $\sigma^2 + b\sigma^2_{wp} + \theta_A$
R×A (estimates whole-plot error)              (r − 1)(a − 1)     MS(RA)     $\sigma^2 + b\sigma^2_{wp}$
B                                             b − 1              MS(B)      $\sigma^2 + \theta_B$
A×B                                           (a − 1)(b − 1)     MS(A×B)    $\sigma^2 + \theta_{AB}$
R×B, R×A×B (estimates split-plot error)       a(r − 1)(b − 1)    MSE        $\sigma^2$


where $Y_{ijk}$ is as defined above and $G_{ijk}$ is such that

$$P(G_{ijk} = g) = P(G_{ijk} = -g) = 1/2$$

and

$$\sum_{k=1}^{b} G_{ijk} = G_{ij\cdot} = 0$$

The ANOVA table for the model that includes this stratification is provided in Table 3. We note that the expected mean squares are unchanged for the whole-plot effects, R, A, and R×A. For example, the sum of squares, SSR, for Replication will be:

$$SSR = \frac{\sum_{i=1}^{r} z_{i\cdot\cdot}^2}{ab} - CM = \frac{\sum_{i=1}^{r} y_{i\cdot\cdot}^2}{ab} - CM$$

since summing $G_{ijk}$ over $k$ will be zero. And the expected mean squares for the split-plot terms, including the split-plot error term (MSE), are increased by the same amount,

$$\frac{b}{b-1} g^2$$

For example, the sum of squares for B will be:

$$SSB = \frac{\sum_{k=1}^{b} (y_{\cdot\cdot k} + G_{\cdot\cdot k})^2}{ra} - CM$$

$$SSB = \frac{\sum_{k=1}^{b} y_{\cdot\cdot k}^2}{ra} - CM + \frac{2\sum_{k=1}^{b} y_{\cdot\cdot k} G_{\cdot\cdot k} + \sum_{k=1}^{b} G_{\cdot\cdot k}^2}{ra}$$

Since $y$ and $G$ are independent and $E(G_{\cdot\cdot k}) = 0$, the cross-product term will drop out in expectation.

Table 3. ANOVA for split-plot experiment with stratification in the whole-plot

Source                                        df                 MS         EMS
R                                             r − 1              MS(R)      $\sigma^2 + ab\sigma^2_{Rep}$
A                                             a − 1              MS(A)      $\sigma^2 + b\sigma^2_{wp} + \theta_A$
R×A (estimates whole-plot error)              (r − 1)(a − 1)     MS(RA)     $\sigma^2 + b\sigma^2_{wp}$
B                                             b − 1              MS(B)      $\sigma^2 + \theta_B + \frac{b}{b-1} g^2$
A×B                                           (a − 1)(b − 1)     MS(A×B)    $\sigma^2 + \theta_{AB} + \frac{b}{b-1} g^2$
R×B, R×A×B (estimates split-plot error)       a(r − 1)(b − 1)    MSE        $\sigma^2 + \frac{b}{b-1} g^2$


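Further,

$$P\big(G_{\cdot\cdot k} = g(ra - 2h)\big) = \binom{ra}{h} \left(\frac{1}{2}\right)^{ra}$$

So $E(G_{\cdot\cdot k}^2) = g^2 ra$ for all $k$, which means

$$E(SSB) = E\left(\frac{\sum_{k=1}^{b} y_{\cdot\cdot k}^2}{ra} - CM\right) + b g^2$$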

We would ask the reader to make note from the above derivation that the multiplier, $b/(b-1)$, of $g^2$ reflects the fact that there are $b$ runs made within the whole-plot. In the present case, these $b$ runs are the number of levels of the factor B. In experiments with more than one split-plot factor, $b$ would be the number of runs made within a whole-plot.
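Two steps of the derivation above can be checked numerically. The sketch below (function names are ours, for illustration) first sums the binomial distribution of the column total of G to confirm that its second moment is g²ra, and then evaluates the expected split-plot mean square with σ² = 1 for the values of b and g considered later in the simulation.

```python
from math import comb

def second_moment_column_total(g, ra):
    """E(G..k^2) when G..k is the sum of ra independent +/-g terms.

    h counts the terms equal to -g, so the total equals g * (ra - 2h)
    with binomial probability C(ra, h) * (1/2)^ra.
    """
    return sum(comb(ra, h) * 0.5 ** ra * (g * (ra - 2 * h)) ** 2
               for h in range(ra + 1))

def expected_split_plot_ms(sigma2, b, g):
    """Expected split-plot error mean square under the stratified model."""
    return sigma2 + b / (b - 1) * g ** 2

# Second moment from the binomial sum versus the closed form g^2 * ra.
print(second_moment_column_total(g=1.5, ra=6), 1.5 ** 2 * 6)

# Expected split-plot mean squares for sigma^2 = 1.
for b in (2, 4, 8):
    row = [round(expected_split_plot_ms(1.0, b, g), 2)
           for g in (0, 0.5, 1, 1.5, 2, 4)]
    print(b, row)
```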

The above derivation of the expected mean squares for split-plot error with stratification

justifies the statement that stratification is a likely candidate to consider when split-plot

error is large as compared to whole-plot error. If such a determination were made, it

would be useful to explore the physical reasons for stratification that may be present. In

other words, an important direction for future study learned from the experiment might

be the need to investigate a possibly large, unexpected source of variation that was not pre-

viously evident. However, it remains to be determined how large split-plot error as com-

pared to whole-plot error should be before we could reasonably suggest such an

investigation.

Using Simulations to Evaluate Power

A program was written using Visual Basic in Microsoft Excel. It develops advice for

evaluating the case where a stratified source of variation is included within the whole-

plots. The intention of the simulation is to understand how powerful an F test would

be for evaluating a large split-plot error with various levels of stratification. Due to the

particular interest in the case where the split-plot error estimate is larger than the

whole-plot error estimate, a one-sided F test was used. The simulation program generated

10,000 experiments for each of the following 108 combinations of cases:

• two and three replicates,

• two, three and four levels of A (the whole-plot factor),

• two, four and eight levels of B (the split-plot factor).

• Additionally, each combination considered cases where g = 0, 0.5, 1, 1.5, 2 and 4.

• The standard deviation for random split-plot error was held at 1 (so $\sigma^2 = 1$) and the additional whole-plot component ($\sigma^2_{wp}$) was 0.

The simulation builds on the situation described in the previous section, i.e. the nature of the stratification is simply to increase the value of half of the observations within a whole-plot by an amount g and decrease the value of the remaining observations by the same amount, g. As part of the simulation the averages of the split-plot error and the whole-plot error for all 108 × 10,000 simulations were calculated. This comparison was performed as a means of checking on the results of the simulation. The results of the simulation compare well with the derived expected mean squares as illustrated in Table 4. As


expected, the split-plot error estimate depends on $\sigma^2$, $b$, and g (Figure 2). The corresponding estimates of whole-plot error are not reproduced in this paper; they are all about 1 as indicated by the derivation.

The simulation considers the power of an F test to assess the presence of stratification within the whole-plots. These values of g represent a shift of 2g due to a systematic source of variability unintentionally included within the whole-plots. An F test comparing the ratio estimated split-plot error / estimated whole-plot error, or in terms of expected values, $\left(\sigma^2 + \frac{b}{b-1} g^2\right) / \sigma^2$ when $\sigma^2_{wp} = 0$, could be used to assess the presence of a g effect. Using the estimates for the split-plot and whole-plot error terms from the simulation, tables for $\alpha$ = 0.05 (Table 5), $\alpha$ = 0.10 (Table 6) and $\alpha$ = 0.25 (Table 7) are

Table 4. Mean estimates and derived values for split-plot error (derived values are in parentheses)

r  a  b   g = 0          g = 0.5        g = 1          g = 1.5        g = 2          g = 4
2  2  2   1.01 (1.00)    1.51 (1.50)    2.93 (3.00)    5.53 (5.50)    9.07 (9.00)    32.83 (33.00)
2  2  4   1.00 (1.00)    1.34 (1.33)    2.32 (2.33)    4.02 (4.00)    6.31 (6.33)    22.20 (22.33)
2  2  8   1.00 (1.00)    1.29 (1.29)    2.13 (2.14)    3.57 (3.57)    5.57 (5.57)    19.26 (19.29)
2  3  2   0.99 (1.00)    1.51 (1.50)    3.02 (3.00)    5.48 (5.50)    8.90 (9.00)    33.23 (33.00)
2  3  4   1.00 (1.00)    1.33 (1.33)    2.33 (2.33)    4.00 (4.00)    6.34 (6.33)    22.31 (22.33)
2  3  8   0.99 (1.00)    1.28 (1.29)    2.15 (2.14)    3.57 (3.57)    5.59 (5.57)    19.27 (19.29)
2  4  2   1.00 (1.00)    1.50 (1.50)    3.01 (3.00)    5.43 (5.50)    8.93 (9.00)    33.24 (33.00)
2  4  4   1.01 (1.00)    1.32 (1.33)    2.33 (2.33)    3.99 (4.00)    6.30 (6.33)    22.30 (22.33)
2  4  8   1.00 (1.00)    1.29 (1.29)    2.15 (2.14)    3.56 (3.57)    5.58 (5.57)    19.24 (19.29)
3  2  2   1.00 (1.00)    1.50 (1.50)    3.02 (3.00)    5.50 (5.50)    9.07 (9.00)    32.87 (33.00)
3  2  4   1.00 (1.00)    1.34 (1.33)    2.33 (2.33)    3.99 (4.00)    6.32 (6.33)    22.26 (22.33)
3  2  8   1.00 (1.00)    1.29 (1.29)    2.15 (2.14)    3.57 (3.57)    5.58 (5.57)    19.30 (19.29)
3  3  2   1.00 (1.00)    1.49 (1.50)    3.01 (3.00)    5.49 (5.50)    8.99 (9.00)    32.88 (33.00)
3  3  4   1.00 (1.00)    1.34 (1.33)    2.04 (2.33)    4.00 (4.00)    6.33 (6.33)    22.30 (22.33)
3  3  8   1.00 (1.00)    1.29 (1.29)    2.15 (2.14)    3.57 (3.57)    5.57 (5.57)    19.27 (19.29)
3  4  2   0.99 (1.00)    1.49 (1.50)    2.99 (3.00)    5.50 (5.50)    9.03 (9.00)    32.91 (33.00)
3  4  4   1.00 (1.00)    1.33 (1.33)    2.33 (2.33)    4.00 (4.00)    6.33 (6.33)    22.32 (22.33)
3  4  8   1.00 (1.00)    1.29 (1.29)    2.15 (2.14)    3.57 (3.57)    5.58 (5.57)    19.29 (19.29)

Figure 2. Estimated split-plot error vs. g for levels of b


Table 5. Estimated power for tests for stratification within the whole-plots, $\alpha$ = 0.05 (situations with power > 0.75 are shown in bold)

For $\alpha$ = 0.05; power entries are percentages.

Reps  Levels  Levels  df for      df for      F critical  g = 0  g = 0.5  g = 1  g = 1.5  g = 2  g = 4
      of A    of B    whole-plot  split-plot
                      error       error
2     2       2       1           2           199.50      5.1    6.3      8.5    12.5     15.2   28.1
2     2       4       1           6           233.99      4.9    5.6      7.5    10.1     12.6   23.1
2     2       8       1           14          245.36      5.1    5.7      7.2    9.4      11.6   22.1
2     3       2       2           3           19.16       4.7    7.3      14.4   23.9     35.3   72.0
2     3       4       2           9           19.38       5.0    6.2      11.3   17.7     27.5   65.8
2     3       8       2           21          19.45       5.0    6.3      10.6   17.0     25.0   62.7
2     4       2       3           4           9.12        5.2    8.3      19.3   36.3     54.0   89.6
2     4       4       3           12          8.74        5.0    7.3      14.4   28.5     44.7   91.4
2     4       8       3           28          8.62        4.9    7.0      14.0   25.6     41.2   90.0
3     2       2       2           4           19.25       5.2    7.0      14.5   24.3     36.9   75.5
3     2       4       2           12          19.41       4.8    6.7      11.5   18.7     27.8   66.4
3     2       8       2           28          19.46       5.1    6.5      10.7   17.3     24.9   62.4
3     3       2       4           6           6.16        4.6    9.5      26.2   49.7     72.9   97.7
3     3       4       4           18          5.82        4.7    8.5      16.9   40.2     63.5   99.0
3     3       8       4           42          5.71        5.1    8.2      17.7   35.2     56.8   98.7
3     4       2       6           8           4.15        5.0    11.6     36.1   69.1     89.2   99.6
3     4       4       6           24          3.84        5.3    9.2      27.8   58.4     84.2   100.0
3     4       8       6           56          3.74        5.2    8.3      24.6   54.2     81.3   100.0

Table 6. Estimated power for tests for stratification within the whole-plots, $\alpha$ = 0.10

For $\alpha$ = 0.1; power entries are percentages.

Reps  Levels  Levels  df for      df for      F critical  g = 0  g = 0.5  g = 1  g = 1.5  g = 2  g = 4
      of A    of B    whole-plot  split-plot
                      error       error
2     2       2       1           2           49.50       10.0   12.6     16.9   24.0     29.5   50.2
2     2       4       1           6           58.20       10.1   11.4     15.6   19.4     25.1   43.9
2     2       8       1           14          61.07       10.5   10.8     14.5   18.6     23.2   42.7
2     3       2       2           3           9.16        9.9    14.4     26.4   41.1     55.9   85.0
2     3       4       2           9           9.38        9.7    13.1     22.1   33.1     48.1   86.6
2     3       8       2           21          9.44        9.7    12.4     20.6   31.8     44.0   85.3
2     4       2       3           4           5.34        10.2   16.5     33.2   55.1     71.8   93.6
2     4       4       3           12          5.22        9.7    14.2     27.5   47.5     66.3   97.8
2     4       8       3           28          5.17        9.7    14.1     25.7   43.5     62.9   98.1
3     2       2       2           4           9.24        9.9    14.2     26.7   42.4     59.8   89.8
3     2       4       2           12          9.41        9.6    13.5     22.2   34.1     47.6   88.4
3     2       8       2           28          9.46        10.2   13.2     20.3   31.6     44.5   86.1
3     3       2       4           6           4.01        9.9    17.9     42.1   68.7     87.0   98.5
3     3       4       4           18          3.85        9.7    16.5     29.2   59.4     81.6   99.9
3     3       8       4           42          3.80        9.7    15.3     31.1   55.1     77.9   99.9
3     4       2       6           8           2.98        10.0   20.8     53.0   83.4     95.4   99.7
3     4       4       6           24          2.82        10.4   17.3     44.2   76.5     94.4   100.0
3     4       8       6           56          2.76        10.0   16.3     40.8   73.2     93.2   100.0

Table 7. Estimated power for tests for stratification within the whole-plots, $\alpha$ = 0.25

For $\alpha$ = 0.25; power entries are percentages.

Reps  Levels  Levels  df for      df for      F critical  g = 0  g = 0.5  g = 1  g = 1.5  g = 2  g = 4
      of A    of B    whole-plot  split-plot
                      error       error
2     2       2       1           2           7.50        24.7   29.9     40.5   53.6     62.5   78.8
2     2       4       1           6           8.98        25.6   28.1     38.1   47.8     57.4   84.2
2     2       8       1           14          9.47        25.6   27.8     35.7   45.7     54.1   83.2
2     3       2       2           3           3.15        24.8   34.6     54.0   71.8     81.8   91.0
2     3       4       2           9           3.37        24.8   31.4     48.5   65.8     81.0   98.4
2     3       8       2           21          3.43        24.7   30.6     45.8   64.0     78.8   99.1
2     4       2       3           4           2.39        25.4   37.8     61.5   80.3     89.5   95.6
2     4       4       3           12          2.45        25.0   33.9     55.1   77.6     90.6   99.8
2     4       8       3           28          2.46        24.9   33.1     53.5   75.3     90.4   99.9
3     2       2       2           4           3.23        24.7   34.8     55.5   74.0     86.6   94.9
3     2       4       2           12          3.39        24.7   32.2     47.8   66.6     81.7   99.2
3     2       8       2           28          3.44        24.9   31.5     46.4   63.9     79.1   99.3
3     3       2       4           6           2.08        24.9   39.2     70.1   89.8     96.3   98.9
3     3       4       4           18          2.08        24.9   35.7     54.8   85.9     96.8   100.0
3     3       8       4           42          2.08        24.6   34.9     60.2   84.5     96.6   100.0
3     4       2       6           8           1.78        25.1   43.3     78.4   95.4     98.9   99.8
3     4       4       6           24          1.75        25.6   38.4     72.1   94.3     99.3   100.0
3     4       8       6           56          1.74        25.6   38.0     70.4   93.6     99.4   100.0

reproduced. An initial look at these tables considered cases with $\beta$ < 0.25 (or power greater than 0.75). These situations are highlighted in bold.
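The simulation described in this section can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the Visual Basic program used for the paper: all fixed treatment effects are set to zero (the two error mean squares do not depend on them), $\sigma^2_{wp}$ = 0 as in the paper, the function names and the number of simulated experiments are ours, and the critical value 3.84 is taken from the row of Table 5 for three replicates, four levels of A and four levels of B.

```python
import random
from statistics import fmean

def simulate_once(r, a, b, g, rng, sigma=1.0):
    """Return (whole-plot error MS, split-plot error MS) for one experiment."""
    # z[i][j][k]: replicate i, whole-plot level j, split-plot run k.
    z = []
    for _ in range(r):
        rep = []
        for _ in range(a):
            # Half the runs in a whole plot are raised by g, half lowered by g,
            # with the affected runs chosen at random (b is assumed even).
            shifts = [g] * (b // 2) + [-g] * (b // 2)
            rng.shuffle(shifts)
            rep.append([rng.gauss(0.0, sigma) + s for s in shifts])
        z.append(rep)

    grand = fmean(z[i][j][k] for i in range(r) for j in range(a) for k in range(b))
    wp = [[fmean(z[i][j]) for j in range(a)] for i in range(r)]   # whole-plot means
    rep_mean = [fmean(wp[i]) for i in range(r)]
    a_mean = [fmean(wp[i][j] for i in range(r)) for j in range(a)]
    ab_mean = [[fmean(z[i][j][k] for i in range(r)) for k in range(b)]
               for j in range(a)]

    # R x A interaction: whole-plot error.
    ss_wp = b * sum((wp[i][j] - rep_mean[i] - a_mean[j] + grand) ** 2
                    for i in range(r) for j in range(a))
    # Pooled R x B and R x A x B: split-plot error.
    ss_sp = sum((z[i][j][k] - wp[i][j] - ab_mean[j][k] + a_mean[j]) ** 2
                for i in range(r) for j in range(a) for k in range(b))
    return ss_wp / ((r - 1) * (a - 1)), ss_sp / (a * (r - 1) * (b - 1))

def estimate(r, a, b, g, f_crit, n_sims=2000, seed=0):
    """Average split-plot MS and share of experiments with F above f_crit."""
    rng = random.Random(seed)
    total_sp = 0.0
    rejections = 0
    for _ in range(n_sims):
        ms_wp, ms_sp = simulate_once(r, a, b, g, rng)
        total_sp += ms_sp
        if ms_sp / ms_wp > f_crit:   # one-sided test: split-plot error too large
            rejections += 1
    return total_sp / n_sims, rejections / n_sims

for g in (0.0, 2.0):
    avg_sp, power = estimate(r=3, a=4, b=4, g=g, f_crit=3.84)
    print(f"g = {g}: mean split-plot MS = {avg_sp:.2f}, estimated power = {power:.3f}")
```

With only 2,000 simulated experiments per case the estimates are noisier than those in Tables 4 and 5, which rest on 10,000 experiments each; the sketch is meant to show the mechanics rather than to reproduce the tabled values exactly.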

Discussion and Recommendations

An examination of the simulation results indicates that, even for moderately large stratified effects, the usual F test comparing split-plot error to whole-plot error is not very powerful for detecting a stratified effect. Only when alpha is as high as 0.25 can we be reasonably confident of detecting a stratification effect as large as 1.5. Our recommendation is therefore to use an alpha level of 0.25. Our reasoning is that we are not conducting a hypothesis test in the usual sense; our work at this point is exploratory. If we decide that split-plot error is large compared to whole-plot error, that decision impels us to examine the process more closely for a stratified effect. It does not, by itself, recommend a process change. Consequently, we think that recommending a closer study of the process to identify stratification when none exists is a less costly mistake than deciding that no stratification exists when it does.
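The trade-off between alpha and power can be illustrated with a small Monte Carlo sketch. It is not the simulation used to produce the tables. It assumes that the test statistic is the ratio of the split-plot error mean square to the whole-plot error mean square, that this ratio follows a central F distribution when no stratification is present, and that a stratified effect turns the split-plot error sum of squares into a noncentral chi-square. The function name and the `noncentrality` parameter are hypothetical, and the mapping from the size of the stratified effect to that parameter depends on the design and is not attempted here. The degrees of freedom (3 and 2) and the critical value 3.15 are taken from what appears to be one tabulated case at alpha = 0.25.

```python
import random


def simulate_rejection_rate(df_split, df_whole, critical_f,
                            noncentrality=0.0, n_sims=20000, seed=0):
    """Estimate how often MS_split / MS_whole exceeds critical_f.

    Assumption (not from the paper): a stratified effect enters only as a
    noncentrality parameter on the split-plot error sum of squares.
    """
    rng = random.Random(seed)
    shift = noncentrality ** 0.5
    rejections = 0
    for _ in range(n_sims):
        # Noncentral chi-square: put the whole mean shift on one coordinate.
        ss_split = (rng.gauss(0.0, 1.0) + shift) ** 2
        ss_split += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_split - 1))
        # Central chi-square for the whole-plot error sum of squares.
        ss_whole = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_whole))
        f_ratio = (ss_split / df_split) / (ss_whole / df_whole)
        if f_ratio > critical_f:
            rejections += 1
    return rejections / n_sims


# No stratification: the rejection rate estimates the alpha level.
null_rate = simulate_rejection_rate(3, 2, 3.15)
# Hypothetical noncentrality of 4.0: the rejection rate estimates power.
alt_rate = simulate_rejection_rate(3, 2, 3.15, noncentrality=4.0)
print(null_rate, alt_rate)
```

With no stratification the estimated rejection rate should sit close to the nominal 0.25, and it should rise as the assumed noncentrality grows. Rerunning with a larger critical value (a smaller alpha) shows how quickly power is lost when error degrees of freedom are this small.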

In our example of the adhesive experiment, an F test comparing split-plot error to whole-plot error would result in a p-value of 0.198. Our suggestion would be to investigate possible stratification within the whole-plots. It is unlikely that the stratification described by our model in equation (1) is a reasonable representation of this physical situation. This model incorporates an abrupt change from γ to −γ, whereas it is more likely that a gradual stratification exists from one end of a tube to the other. However, this model would provide a conservative means of speculating about the likely size of a gradient, whether the change is gradual or abrupt. If we take the expected mean square of whole-plot error, 0.0788, as an estimate of σ², then a simple method-of-moments estimate of γ would be 0.32. (We note that b would be 8 here, as there are eight runs within each whole-plot.) Figure 3 compares this ‘speculated’ value of γ to the other effects estimated in the adhesive experiment. We believe this figure indicates the merits of investigating the possible stratification further, as this stratification may be the largest effect active in this process.

Figure 3. Comparison of parameter estimates for fixed effects to estimated γ



Conclusions

Our first conclusion, and probably the most important, is that the split-plot experiment as conducted is not a powerful method for detecting stratification. In other words, if there is reason to suspect stratification, a different means of understanding its nature should be used. In the present example, we might have recommended that the engineers collect process data on the observed differences in height throughout the use of a tube. Prior to conducting the experiment, it was suspected that heights differed from one end of the tube to the other. Baseline data on heights throughout a tube, for a number of tubes studied across different times and process conditions, might have indicated the kind of stratification we suspect occurred within a tube.

A second conclusion we draw from our study of the adhesive experiment is the value of performing additional, exploratory analysis of experimental data. Many useful industrial experiments are performed on actual industrial processes rather than in a controlled laboratory setting. This fact means that our learning from an experiment may not be limited to those factors we pre-selected for our experimental study. Important process knowledge may be gained by our ability to recognize unanticipated sources of variation. A thoughtful consideration of what variation is being captured in our experimental work can serve as a guide to further, beneficial process study. In this paper, we have achieved this examination by exploring possible stratification within split-plot experiments and evaluating the effects this stratification would have on experimental results.

References

Anderson, V. & McLean, R. (1974) Design of Experiments: A Realistic Approach (New York: Marcel Dekker).

Box, G. (1996) Quality quandaries: split-plot experiments, Quality Engineering, 8(3), pp. 515–520.

Daniel, C. (1976) Applications of Statistics to Industrial Experimentation (New York: Wiley).

Hahn, G. (1978) More on randomization, ChemTech, March, pp. 164–165.

Hicks, C. (1993) Fundamental Concepts in the Design of Experiments (Fort Worth: Saunders College Publishing).

Leitnaker, M.G. & Mee, R. (2001) The analytic use of two-level factorials in incomplete blocks to examine the stability of factor effects, Quality Engineering, 14(1), pp. 49–58.

Sanders, D., Leitnaker, M.G. & McLean, R.A. (2001) Randomized block designs in analytical studies, Quality Engineering, 14(1), pp. 1–8.

