Further exploratory analysis of split-plot experiments to study certain stratified effects
Publisher: Taylor & Francis. Informa Ltd, Registered in England and Wales, Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Journal of Applied Statistics. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20
Tony Cooper (Six Sigma Associates, Knoxville, TN, USA) & Mary G. Leitnaker (University of Tennessee, Knoxville, TN, USA). Published online: 22 Jan 2007.
To cite this article: Tony Cooper & Mary G. Leitnaker (2006) Further exploratory analysis of split-plot experiments to study certain stratified effects, Journal of Applied Statistics, 33:8, 773-786, DOI: 10.1080/02664760600742201
To link to this article: http://dx.doi.org/10.1080/02664760600742201
Further Exploratory Analysis of
Split-plot Experiments to Study
Certain Stratified Effects
TONY COOPER* & MARY G. LEITNAKER**
*Six Sigma Associates, Knoxville, TN, USA; **University of Tennessee, Knoxville, TN, USA
ABSTRACT Designed experiments are a key component in many companies’ improvement strategies. Because completely randomized experiments are not always reasonable from a cost or physical perspective, split-plot experiments are prevalent. The recommended analysis accounts for the different sources of variation affecting whole-plot and split-plot error. However, experiments on industrial processes must be run and, consequently, analyzed quite differently from ones run in a controlled environment. Such experiments are typically subject to a wide array of uncontrolled, and barely understood, variation. In particular, it is important to examine the experimental results for additional, unanticipated sources of variation. In this paper, we consider how unanticipated, stratified effects may influence a split-plot experiment and discuss further exploratory analysis to indicate the presence of stratified effects. Examples of such experiments are provided, additional tests are suggested and discussed in light of their power, and recommendations are given.
KEY WORDS: Designed experiment, split-plot error, source of variation
Introduction
Although, as the name suggests, split-plot experiments were introduced as a strategy for
agricultural experiments, the split-plot arrangement is also common in industry. Box
(1996) notes that it is often ‘not only most convenient but also most efficient to run an
industrial experiment in a split-plot mode.’ In this same article, Box also notes that
although the split plot structure is often run in industry, it is sometimes not recognized
as such and consequently the experiment is not analyzed correctly. Daniel (1976)
stated, although possibly with some exaggeration, that ‘All industrial experiments are
split-plot experiments’. A primary reason that split-plot experiments are so common in
industry is that by reducing the number of times some factors are changed, the cost and
time of running the experiment are greatly reduced. In fact, many industrial experiments
would be prohibitively costly or time-consuming were they not run as split plots. And
when an experiment is not randomized the experimenter will have to accept certain
compromises (Hahn, 1978). Other authors have discussed the use of split plots in industrial
experiments. Hicks (1993) uses an industrial example to introduce his discussion of
split-plot experiments. Anderson & McLean (1974) have an extensive discussion of the use of
split-plots in industrial experiments.
Journal of Applied Statistics
Vol. 33, No. 8, 773–786, September 2006
Correspondence Address: Mary G. Leitnaker, College of Business Administration, Department of Statistics,
University of Tennessee, 331 Stokely Management Center, Knoxville, TN 37996 0532, USA. Email:
0266-4763 Print/1360-0532 Online/06/080773–14 © 2006 Taylor & Francis. DOI: 10.1080/02664760600742201
Experimenting in an Industrial Setting
Statistically designed experiments on industrial processes, as opposed to experiments con-
ducted in more controlled settings, are almost always conducted in an environment subject
to a wide array of uncontrolled variation. In fact, experiments are often performed on an
industrial process to gain a better understanding of the myriad sources of variation acting
on the process. An experiment that is to be conducted on an ongoing process must consider
the nature of the sources of variation that affect measured outcomes of the process. In par-
ticular, the time span over which the variation occurs is an important consideration. For
example, variation that can be attributed to a machine performing repeated cycles but
in a slightly different fashion each time would likely be observed to create variation in
outcomes measured in a short time span. On the other hand, variation in incoming
raw materials would have a different impact on variation. Variation within a lot of raw
materials may affect short-term variation. Significant variation between lots of raw
material may create ‘long-term’ variation, i.e. variation that would be captured when
measurements are made over a long time span. And of course any process will be
subject to variation from many other sources stemming from different methods, materials,
practices, etc. that occur in the process. So before designing an experiment, it would be
necessary to have a good understanding of what these likely sources of variation are
and how their impact is likely to be felt on measured process outcomes. Clearly, a com-
plete knowledge of these sources is not feasible. Yet, our best understanding of their beha-
vior guides our choice of experimental design.
The time span over which suspected sources of variation are likely to affect process out-
comes is one major element in choosing an experimental design. For example, if we know
that the process is affected by variation in incoming lots of material, we would like to
conduct experiments in blocks to specifically study the manner in which factor effects
might change across different lots of incoming material. A discussion of this use of block-
ing and how our analysis of an industrial experiment is performed when blocking is used is
described in papers by Sanders et al. (2001) and Leitnaker & Mee (2001). As we shall see
later, this approach to analyzing blocked experiments also informs our analysis of split-
plot experiments.
Another manner in which variation might present itself in an industrial process is
spatially. For example, a process may require that material pass through a drying oven.
Material that moves through one area of this oven may result in a different measured
outcome than material that was exposed to a different area of the oven. Many similar situ-
ations exist in industrial processes. Incoming batches of raw material may have had a criti-
cal element settle in transportation, meaning that the process will experience a fixed
change across the time that the process is using this batch. Multiple spindles exist on a
machine forming paper cups as well as on the one forming lids. Multiple lanes of
product are stamped by a cutting machine. Sheets of glass, aluminum, paper, etc. are
subject to variations in the cross direction as well as the machine direction. Again, our knowledge
about the likely occurrence of this type of ‘spatial’ variation will be needed to design
an experiment. Here again, we cannot expect complete knowledge of such sources of vari-
ation. But the more complete our knowledge the better will be the experiment we design.
Furthermore, it is the objective of this paper to illustrate how our analysis of experiments
run in an industrial setting should consider the possibility that such spatial variation exists,
even though we may have been unaware of its possibility prior to experimentation.
Manufacturing Example: Adhesive Experiment
The example that generated our present discussion of split-plot experiments occurred in
the electronics industry and involved an experiment to study the application of an
epoxy. In this instance, the type of ‘spatial variation’ referred to in the preceding paragraph
was expected to exist in the application process. Applying the correct amount or thickness
of an epoxy, as measured by height (H), is critical to proper functioning of a component.
A robot is programmed to dispense the epoxy on the perimeter of a part. The specification
of 2–3 mm is to ensure proper adhesion and size at the next step in the process. The engin-
eers who designed the experiment wanted to study the effects of three settings on the robot
that applied the epoxy. These settings were:
• the distance (D) that the robot nozzle was from the surface,
• the pressure (P) at which the epoxy was being dispensed, and
• the speed (S) at which the robot nozzle moved along the epoxy ledge.
Furthermore, the epoxy used by the robot was purchased in tubes. Engineers who worked
on the process suspected that the consistency of the epoxy material affected the height. In
particular, the adhesive would age once the tube was opened and this aging might affect
height. A factor (E), end of tube, was used to capture the effect of this aging. A set of eight
experimental runs was completed on each end (E) of each tube, the front and the back.
Thus, the factor E captures a spatial effect (front to back of tube) thought to be active
in creating variation in the application process. The eight runs were performed by rando-
mizing two levels of each of the three factors, D, P, and S.
The engineers wanted to examine the effects of both within-tube variation and
tube-to-tube variation. Although previous studies have shown that the epoxy material is
within specifications for composition and viscosity, it is still believed that variation
within these specifications might affect the final height of applied epoxy. To address
the above experimental issues, five tubes (T) of epoxy were used. Following the notation
of Anderson & McLean (1974), the tubes (T) are considered a replication factor; E, the tube
end, is a whole-plot factor; and D, P, and S are split-plot factors.
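The layout just described (five tubes, two ends per tube, eight randomized runs per end) can be written out as a run list. The following Python sketch is illustrative only: the function name `adhesive_design`, the coded levels -1/+1, the labels "front" and "back", and the random seed are assumptions, not details reported for the experiment.

```python
import itertools
import random

def adhesive_design(n_tubes=5, seed=0):
    """Run list: tubes (replicates) x tube ends (whole plots) x 2^3 split-plot runs."""
    rng = random.Random(seed)
    # The eight treatment combinations of D, P and S at two coded levels each.
    combos = list(itertools.product((-1, 1), repeat=3))
    runs = []
    for tube in range(1, n_tubes + 1):          # replication factor T
        for end in ("front", "back"):           # whole-plot factor E
            order = combos[:]
            rng.shuffle(order)                  # randomize D, P, S within the whole plot
            for d, p, s in order:
                runs.append({"T": tube, "E": end, "D": d, "P": p, "S": s})
    return runs

design = adhesive_design()
print(len(design))  # 5 tubes x 2 ends x 8 runs
```

Each (tube, end) pair is one whole plot, so the design has 10 whole plots and 80 runs in total, which is consistent with the 79 total degrees of freedom in the analysis.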
The ANOVA table for the above split-plot experiment appears in Table 1. Following the
suggested analysis of Anderson & McLean (1974), the interaction term of the replication
(T) and whole-plot factor (E) has been considered to be whole-plot error; the F test for E
uses this term in the denominator.
A summary of the results of this experiment should note that the whole-plot factor (E)
has little effect when compared to whole-plot error. In addition, an examination of the
magnitude of the interaction terms involving T and other split-plot terms shows that
they all appear to be of the same size. There is no evidence that results are inconsistent
from tube-to-tube. In other words, the ‘long-term’ tube-to-tube variation does not
appear to behave inconsistently across time. Consequently, the split-plot error term was
formed by pooling the sums of squares of all split-plot terms that contain the replication
factor T. The main effects for P and D appear to be significant. Better management of these
factors may aid in better controlling the height dimension in this process.
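The pooling step and the F ratios described above can be reproduced from the mean squares reported in Table 1. The sketch below is a check on the arithmetic, not the authors' software; the variable names are assumptions, and the inputs are the rounded values printed in the table, so results agree only to rounding.

```python
# Mean squares (4 df each) of the 14 split-plot terms containing T, from Table 1.
pooled_terms = [0.2250, 0.1969, 0.0857, 0.1106, 0.2552, 0.6621, 0.1512,
                0.1907, 0.0612, 0.0172, 0.0510, 0.3621, 0.1537, 0.1791]
df_each = 4

# Pool by adding sums of squares and degrees of freedom.
ss_pooled = sum(ms * df_each for ms in pooled_terms)
df_pooled = df_each * len(pooled_terms)
ms_split_plot = ss_pooled / df_pooled      # split-plot error

ms_whole_plot = 0.0788                     # T x E, the whole-plot error

f_E = 0.0924 / ms_whole_plot               # whole-plot factor tested against T x E
f_D = 3.9485 / ms_split_plot               # split-plot factors tested against pooled error
f_P = 4.2888 / ms_split_plot

print(df_pooled, round(ms_split_plot, 4), round(f_E, 3), round(f_D, 2), round(f_P, 2))
```

Because every pooled term has the same degrees of freedom, the pooled mean square is simply the average of the 14 mean squares.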
An interesting facet of the ANOVA table is that the estimate of whole-plot error seems
considerably smaller than the split-plot error. Of course, the expected mean square of
whole-plot error is larger than or equal to the expected mean square of split-plot error
under the usual model assumptions. So if it were concluded that split-plot error was ‘too
large’ as compared to whole-plot error, some unexplained variation is being shown. And a
better understanding of what is inducing this variation is likely to result in significant
process improvements.
In the present situation it is instructive to note that the engineers included ‘tube end’ as a
factor and they studied multiple tubes. They were concerned with the effect of the consist-
ency of the epoxy material within and between tubes. These potential sources of variation
need to be considered further. In this experiment tube-to-tube was addressed by replicating
on different tubes. The within tube variability was thought to be primarily the result of
aging once a tube was opened. The factor E captures and quantifies this effect. The exper-
imenter’s expectation as to within-tube variability is described in Figure 1(a). However,
the inflated split-plot error could be explained if the experimenter misunderstood the
nature of the within-tube variation. Figure 1(b) describes alternate within-tube variability
that, due to the way the experiment was run, would inflate the split-plot error but not
increase the whole-plot error!

Table 1. ANOVA table for split-plot analysis of adhesive experiment

Source             DF   Mean square   F        p-value
T                   4   0.3258
E                   1   0.0924         1.173   0.3397
T×E                 4   0.0788        (estimate of whole-plot error)
D                   1   3.9485        20.460   0.0000
P                   1   4.2888        22.223   0.0000
S                   1   0.5922         3.069   0.0853
P×D                 1   0.4317         2.237   0.1403
D×S                 1   0.0644         0.333   0.5659
P×S                 1   0.0047         0.024   0.8762
D×P×S               1   0.2978         1.543   0.2193
E×D                 1   0.2041         1.058   0.3082
E×P                 1   0.0064         0.033   0.8563
E×S                 1   0.5061         2.622   0.1110
E×P×D               1   0.4604         2.386   0.1281
E×D×S               1   0.0281         0.146   0.7043
E×P×S               1   0.2321         1.203   0.2775
E×D×P×S             1   0.1632         0.846   0.3618
Terms pooled into split-plot error
T×D                 4   0.2250
T×P                 4   0.1969
T×S                 4   0.0857
T×E×D               4   0.1106
T×E×P               4   0.2552
T×E×S               4   0.6621
T×P×D               4   0.1512
T×D×S               4   0.1907
T×P×S               4   0.0612
T×E×P×S             4   0.0172
T×E×P×D             4   0.0510
T×E×D×S             4   0.3621
T×P×S×D             4   0.1537
T×E×P×S×D           4   0.1791
Split-plot error   56   0.1930

The experimental strategy is to run the eight runs making up the whole-plot at each end
of the tube. If the alternative view (b) were true, then the inconsistency would affect these
runs. However, since it is systematic, the average adhesive quality is the same at both ends
of the tube – the whole-plot error estimate would not include the effect of adhesive
consistency.
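A small numerical illustration may help here. The sketch below assumes a noise-free whole plot of eight runs and a hypothetical systematic pattern that raises half of the runs by an amount g and lowers the other half by g; the base height 2.5 and the value g = 0.3 are made up for illustration and are not taken from the experiment.

```python
from statistics import mean, variance

b = 8                      # runs within one whole plot (one tube end)
g = 0.3                    # hypothetical size of the systematic within-whole-plot shift
base = [2.5] * b           # noise-free heights, purely illustrative

# A systematic pattern that sums to zero within the whole plot.
offsets = [g] * (b // 2) + [-g] * (b // 2)
shifted = [y + o for y, o in zip(base, offsets)]

whole_plot_mean_change = mean(shifted) - mean(base)
within_variance = variance(shifted)      # sample variance, b - 1 in the denominator

print(whole_plot_mean_change, within_variance)
```

The whole-plot average is unchanged (up to floating-point error), so comparisons between whole plots do not see the pattern, while the variation among runs inside the whole plot grows by g² · b/(b − 1).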
One can imagine other industrial situations where the split-plot error estimate would be
larger than the whole-plot error estimate due to a stratified effect being present. Some
additional examples would be:
(1) Ovens often have airflow patterns and hot spots that are not clearly understood. An
experimenter may fill the oven with experimental units from several treatment com-
binations, but still require multiple oven runs. The experimenter may not realize
that there is a consistent temperature gradient through the oven, top-to-bottom,
front-to-back, or maybe middle to outside. This variation would affect the units in
the run, but on average, each run would be the same.
(2) Mixing tanks could have systematic variation from the top to the bottom of the tank
or middle to outer edges. Multiple treatment combinations could be run from each
tank, but the whole experiment might require several tanks. Again, the systematic
effect will average out from tank-load to tank-load, but affect individual runs
within a tank-load. Thus the split-plot error will be inflated in comparison to the
whole-plot error.
In each of these cases, an inflated split-plot error could result from misunderstanding a
spatial source of variation present in the process, so that the experiment was not designed to
account for this source.
In the following section we explore a model that incorporates a stratified effect within
the split-plot. Calculation of the expected mean squares for this model provides a basis for
speculating on the reasons for large split-plot error relative to whole-plot error.
ANOVA Table for Model with Stratified Fixed Effects
In this section we consider a small split-plot experiment with one whole-plot factor and
one split-plot factor. The usual statistical model for this situation is written as:
$$Y_{ijk} = \mu + R_i + \lambda_j + (R\lambda)_{ij} + \delta_{ij} + \phi_k + (\lambda\phi)_{jk} + (R\phi)_{ik} + (R\lambda\phi)_{ijk} + \varepsilon_{ijk}$$
for $i = 1, \ldots, r$; $j = 1, \ldots, a$; and $k = 1, \ldots, b$.
Figure 1. Within-tube consistency. (Density of grey represents adhesive quality.)
We will assume that the $\delta_{ij}$ are iid normal with mean 0 and variance $\sigma^2_{wp}$ and the $\varepsilon_{ijk}$ are
iid normal with mean 0 and variance $\sigma^2$.
In the above model, r is the number of replications, a is the number of levels of the
whole-plot factor, A, and b is the number of levels of the split-plot factor, B. The term
$\delta_{ij}$ is the whole-plot error term. Its variance is estimated by the mean square of RA.
And $\varepsilon_{ijk}$ is the split-plot error term. Assuming no replication by treatment interaction,
as was determined to be true in our previous example, the split-plot variance can be
estimated by the combined mean squares of RB and RAB. An ANOVA table for this model is
provided in Table 2.
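The degrees of freedom in Table 2 follow directly from r, a and b. The helper below is a minimal sketch with an assumed function name and assumed dictionary keys; it simply encodes the df column of the table.

```python
def split_plot_df(r, a, b):
    """Degrees of freedom for a split-plot design with r replicates,
    a whole-plot levels and b split-plot levels (or runs per whole plot)."""
    return {
        "R": r - 1,
        "A": a - 1,
        "whole_plot_error": (r - 1) * (a - 1),      # R x A
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "split_plot_error": a * (r - 1) * (b - 1),  # R x B pooled with R x A x B
    }

adhesive = split_plot_df(r=5, a=2, b=8)
print(adhesive["whole_plot_error"], adhesive["split_plot_error"])
```

Treating the eight D, P, S combinations of the adhesive experiment as b = 8 runs per whole plot gives 4 and 56 degrees of freedom for the two error terms, in agreement with Table 1.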
We wish to consider the situation, described in the previous section, where there is a
stratification occurring within each whole-plot. The nature of this stratification is likely
itself to be subject to variation. But in order to gain insight into the effect that the stra-
tification has on the expected mean squares, we will assume a fixed stratification within
each whole-plot. In fact, we will assume that the nature of the stratification is to simply
increase the value of half of the observations within a whole-plot by an amount $\gamma$ and
decrease the value of the remaining observations by the same amount, $\gamma$. Since the
split-plot levels are assigned at random within a whole-plot, the observations that are
increased are considered to be ‘randomly chosen.’ Thus, whether an observation is
increased or decreased by the stratification will be assumed independent of the
effects of the other factors. Furthermore, our derivation of the expected mean
squares for the split-plot error terms will assume that there is an even number of
levels of the split-plot factor, B. In other words, b is assumed to be an even
number. A model for this experiment can be written as:
$$Z_{ijk} = Y_{ijk} + G_{ijk}$$
where $Y_{ijk}$ is as defined above and $G_{ijk}$ is such that
$$P(G_{ijk} = \gamma) = P(G_{ijk} = -\gamma) = 1/2$$
and
$$\sum_{k=1}^{b} G_{ijk} = G_{ij\cdot} = 0$$

Table 2. ANOVA for split-plot experiment

Source                                          df                  MS        EMS
R                                               $r - 1$             MS(R)     $\sigma^2 + ab\sigma^2_{Rep}$
A                                               $a - 1$             MS(A)     $\sigma^2 + b\sigma^2_{wp} + \theta_A$
R×A (estimates whole-plot error)                $(r - 1)(a - 1)$    MS(RA)    $\sigma^2 + b\sigma^2_{wp}$
B                                               $b - 1$             MS(B)     $\sigma^2 + \theta_B$
A×B                                             $(a - 1)(b - 1)$    MS(A×B)   $\sigma^2 + \theta_{AB}$
R×B, R×A×B (estimates split-plot error)         $a(r - 1)(b - 1)$   MSE       $\sigma^2$

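The stratification offsets $G_{ijk}$ defined by the two conditions above can be generated by shuffling a balanced list of +g and -g values within each whole plot. This is a minimal sketch; the function name and the use of Python's `random` module are assumptions made for illustration.

```python
import random

def stratification_offsets(b, g, rng):
    """Offsets for one whole plot: half +g, half -g, in random order."""
    if b % 2:
        raise ValueError("b must be even so the offsets can sum to zero")
    offsets = [g] * (b // 2) + [-g] * (b // 2)
    rng.shuffle(offsets)   # each position is +g or -g with probability 1/2
    return offsets

rng = random.Random(1)
example = stratification_offsets(8, 0.5, rng)
print(example, sum(example))
```

By symmetry each run receives +g with probability 1/2, and the offsets in every whole plot sum to zero, which is why whole-plot totals are unaffected.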
The ANOVA table for the model that includes this stratification is provided in Table 3.
We note that the expected mean squares are unchanged for the whole-plot effects, R, A,
and R×A. For example, the sums of squares, SSR, for Replication will be:
$$SSR = \frac{\sum_{i=1}^{r} z_{i\cdot\cdot}^2}{ab} - CM = \frac{\sum_{i=1}^{r} y_{i\cdot\cdot}^2}{ab} - CM$$
since summing $G_{ijk}$ over k will be zero. And the expected mean squares for the split-plot
terms, including the split-plot error term (MSE), are increased by the same amount,
$$\frac{b}{b-1}\gamma^2$$
For example, the sums of squares for B will be:
$$SSB = \frac{\sum_{k=1}^{b} (y_{\cdot\cdot k} + G_{\cdot\cdot k})^2}{ra} - CM$$
$$SSB = \frac{\sum_{k=1}^{b} y_{\cdot\cdot k}^2}{ra} - CM + \frac{2\sum_{k=1}^{b} y_{\cdot\cdot k} G_{\cdot\cdot k} + \sum_{k=1}^{b} G_{\cdot\cdot k}^2}{ra}$$
Since y and G are independent and $E(G_{\cdot\cdot k}) = 0$, the cross-product term will drop out.

Table 3. ANOVA for split-plot experiment with stratification in the whole-plot

Source                                          df                  MS        EMS
R                                               $r - 1$             MS(R)     $\sigma^2 + ab\sigma^2_{Rep}$
A                                               $a - 1$             MS(A)     $\sigma^2 + b\sigma^2_{wp} + \theta_A$
R×A (estimates whole-plot error)                $(r - 1)(a - 1)$    MS(RA)    $\sigma^2 + b\sigma^2_{wp}$
B                                               $b - 1$             MS(B)     $\sigma^2 + \theta_B + \frac{b}{b-1}\gamma^2$
A×B                                             $(a - 1)(b - 1)$    MS(A×B)   $\sigma^2 + \theta_{AB} + \frac{b}{b-1}\gamma^2$
R×B, R×A×B (estimates split-plot error)         $a(r - 1)(b - 1)$   MSE       $\sigma^2 + \frac{b}{b-1}\gamma^2$

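Further,
$$P(G_{\cdot\cdot k} = \gamma(ra - 2h)) = \binom{ra}{h}\left(\frac{1}{2}\right)^{ra}$$
So $E(G_{\cdot\cdot k}^2) = \gamma^2 ra$ for all k, which means
$$E(SSB) = E\left(\frac{\sum_{k=1}^{b} y_{\cdot\cdot k}^2}{ra} - CM\right) + b\gamma^2$$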
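The second moment used in this step can be checked exactly by summing over the binomial distribution of h. The sketch below uses exact rational arithmetic; the function name is an assumption, and the check covers only the moment of the column total, not the full expected-mean-square derivation.

```python
from fractions import Fraction
from math import comb

def expected_col_total_sq(ra, g=1):
    """E[G..k^2] when G..k = g*(ra - 2h) and h ~ Binomial(ra, 1/2)."""
    total = Fraction(0)
    for h in range(ra + 1):
        prob = Fraction(comb(ra, h), 2 ** ra)
        total += prob * (g * (ra - 2 * h)) ** 2
    return total

for ra in (4, 6, 12):
    print(ra, expected_col_total_sq(ra))
```

For g = 1 the result equals ra in each case, so summing over the b levels and dividing by ra leaves b·g² as the extra term in the expected sum of squares for B.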
We would ask the reader to make note from the above derivation that the multiplier,
$b/(b-1)$, of $\gamma^2$ reflects the fact that there are b runs made within the whole-plot. In the
present case, these b runs are the number of levels of the factor B. In experiments with
more than one split-plot factor, b would be the number of runs made within a whole-plot.
The above derivation of the expected mean squares for split-plot error with stratification
justifies the statement that stratification is a likely candidate to consider when split-plot
error is large as compared to whole-plot error. If such a determination were made, it
would be useful to explore the physical reasons for stratification that may be present. In
other words, an important direction for future study learned from the experiment might
be the need to investigate a possibly large, unexpected source of variation that was not pre-
viously evident. However, it remains to be determined how large split-plot error as com-
pared to whole-plot error should be before we could reasonably suggest such an
investigation.
Using Simulations to Evaluate Power
A program was written using Visual Basic in Microsoft Excel. It develops advice for
evaluating the case where a stratified source of variation is included within the whole-
plots. The intention of the simulation is to understand how powerful an F test would
be for evaluating a large split-plot error with various levels of stratification. Due to the
particular interest in the case where the split-plot error estimate is larger than the
whole-plot error estimate, a one-sided F test was used. The simulation program generated
10,000 experiments for each of the following 108 combinations of cases:
• two and three replicates,
• two, three and four levels of A (the whole-plot factor),
• two, four and eight levels of B (the split-plot factor).
• Additionally, each combination considered cases where $\gamma$ = 0, 0.5, 1, 1.5, 2 and 4.
• The variance of the random split-plot error ($\sigma^2$) was held at 1 (so its standard deviation is also 1) and the additional
whole-plot component ($\sigma^2_{wp}$) was 0.
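A simulation of this kind can be sketched in a few lines of Python. This is not the authors' Visual Basic program: the function names, the default of 2,000 simulated experiments, and the seed are assumptions. All fixed effects are set to zero, which does not change either error mean square, and the critical value 4.15 used in the usage lines is the one printed in Table 5 for r = 3, a = 4, b = 2 at the 0.05 level.

```python
import random

def simulate_ratio(r, a, b, g, rng, sigma=1.0):
    """One simulated experiment; returns MS(split-plot error) / MS(whole-plot error)."""
    # z[i][j][k]: noise plus a +/-g offset balanced within each whole plot (i, j).
    z = []
    for i in range(r):
        rep = []
        for j in range(a):
            offsets = [g] * (b // 2) + [-g] * (b // 2)
            rng.shuffle(offsets)
            rep.append([rng.gauss(0.0, sigma) + o for o in offsets])
        z.append(rep)
    wp = [[sum(z[i][j]) / b for j in range(a)] for i in range(r)]        # whole-plot means
    rep_mean = [sum(wp[i]) / a for i in range(r)]                        # replicate means
    a_mean = [sum(wp[i][j] for i in range(r)) / r for j in range(a)]     # means for levels of A
    grand = sum(rep_mean) / r
    cell = [[sum(z[i][j][k] for i in range(r)) / r for k in range(b)]    # A x B cell means
            for j in range(a)]
    ss_wp = b * sum((wp[i][j] - rep_mean[i] - a_mean[j] + grand) ** 2
                    for i in range(r) for j in range(a))
    ss_sp = sum((z[i][j][k] - wp[i][j] - cell[j][k] + a_mean[j]) ** 2
                for i in range(r) for j in range(a) for k in range(b))
    ms_wp = ss_wp / ((r - 1) * (a - 1))
    ms_sp = ss_sp / (a * (r - 1) * (b - 1))
    return ms_sp / ms_wp

def estimate_power(r, a, b, g, f_critical, n_sims=2000, seed=0):
    """Fraction of simulated experiments whose error ratio exceeds the critical value."""
    rng = random.Random(seed)
    hits = sum(simulate_ratio(r, a, b, g, rng) > f_critical for _ in range(n_sims))
    return hits / n_sims

power_null = estimate_power(3, 4, 2, g=0.0, f_critical=4.15)
power_g2 = estimate_power(3, 4, 2, g=2.0, f_critical=4.15)
print(power_null, power_g2)
```

With no stratification the rejection rate should sit near the nominal 0.05, and with an offset of 2 it should be far higher. A short run like this will not reproduce the published percentages exactly.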
The simulation builds on the situation described in the previous section, i.e. the nature of
the stratification is simply to increase the value of half of the observations within a
whole-plot by an amount $\gamma$ and decrease the value of the remaining observations by the same
amount, $\gamma$. As part of the simulation the averages of the split-plot error and the
whole-plot error for all 108 × 10,000 simulations were calculated. This comparison was
performed as a means of checking on the results of the simulation. The results of the
simulation compare well with the derived expected mean squares as illustrated in Table 4. As
expected the split-plot error estimate depends on $\sigma^2$, b, and $\gamma$ (Figure 2). The
corresponding estimates of whole-plot error are not reproduced in this paper; they are all about 1 as
indicated by the derivation.
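The derived values shown in parentheses in Table 4 can be recomputed from the expected mean square for split-plot error. The function name below is an assumption; the formula is the one derived in the previous section, with the split-plot error variance set to 1.

```python
def expected_split_plot_ms(b, g, sigma2=1.0):
    """Expected mean square for split-plot error with a balanced +/-g stratification."""
    return sigma2 + b / (b - 1) * g ** 2

derived = {b: [round(expected_split_plot_ms(b, g), 2) for g in (0, 0.5, 1, 1.5, 2, 4)]
           for b in (2, 4, 8)}
for b, row in derived.items():
    print(b, row)
```

The value depends on b and the size of the offset but not on r or a, which is why the derived values in Table 4 repeat across the replicate and whole-plot settings.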
The simulation considers the power of an F test to assess the presence of stratification
within the whole-plots. These values of $\gamma$ represent a shift of $2\gamma$ due to a systematic source
of variability unintentionally included within the whole-plots. An F test comparing the
ratio (estimated split-plot error)/(estimated whole-plot error), or in terms of expected
values, $\left(\sigma^2 + \frac{b}{b-1}\gamma^2\right)/\sigma^2$ when $\sigma^2_{wp} = 0$, could be used to assess the presence of a $\gamma$
effect. Using the estimates for the split-plot and whole-plot error terms from the
simulation, tables for $\alpha = 0.05$ (Table 5), $\alpha = 0.10$ (Table 6) and $\alpha = 0.25$ (Table 7) are
reproduced. An initial look at these tables considered cases with $\beta < 0.25$ (or power
greater than 0.75). These situations are highlighted in bold.

Table 4. Mean estimates and derived values for split-plot error (derived values are in parentheses)

r  a  b   $\gamma$=0          $\gamma$=0.5        $\gamma$=1          $\gamma$=1.5        $\gamma$=2          $\gamma$=4
2  2  2   1.01 (1.00)   1.51 (1.50)   2.93 (3.00)   5.53 (5.50)   9.07 (9.00)   32.83 (33.00)
2  2  4   1.00 (1.00)   1.34 (1.33)   2.32 (2.33)   4.02 (4.00)   6.31 (6.33)   22.20 (22.33)
2  2  8   1.00 (1.00)   1.29 (1.29)   2.13 (2.14)   3.57 (3.57)   5.57 (5.57)   19.26 (19.29)
2  3  2   0.99 (1.00)   1.51 (1.50)   3.02 (3.00)   5.48 (5.50)   8.90 (9.00)   33.23 (33.00)
2  3  4   1.00 (1.00)   1.33 (1.33)   2.33 (2.33)   4.00 (4.00)   6.34 (6.33)   22.31 (22.33)
2  3  8   0.99 (1.00)   1.28 (1.29)   2.15 (2.14)   3.57 (3.57)   5.59 (5.57)   19.27 (19.29)
2  4  2   1.00 (1.00)   1.50 (1.50)   3.01 (3.00)   5.43 (5.50)   8.93 (9.00)   33.24 (33.00)
2  4  4   1.01 (1.00)   1.32 (1.33)   2.33 (2.33)   3.99 (4.00)   6.30 (6.33)   22.30 (22.33)
2  4  8   1.00 (1.00)   1.29 (1.29)   2.15 (2.14)   3.56 (3.57)   5.58 (5.57)   19.24 (19.29)
3  2  2   1.00 (1.00)   1.50 (1.50)   3.02 (3.00)   5.50 (5.50)   9.07 (9.00)   32.87 (33.00)
3  2  4   1.00 (1.00)   1.34 (1.33)   2.33 (2.33)   3.99 (4.00)   6.32 (6.33)   22.26 (22.33)
3  2  8   1.00 (1.00)   1.29 (1.29)   2.15 (2.14)   3.57 (3.57)   5.58 (5.57)   19.30 (19.29)
3  3  2   1.00 (1.00)   1.49 (1.50)   3.01 (3.00)   5.49 (5.50)   8.99 (9.00)   32.88 (33.00)
3  3  4   1.00 (1.00)   1.34 (1.33)   2.04 (2.33)   4.00 (4.00)   6.33 (6.33)   22.30 (22.33)
3  3  8   1.00 (1.00)   1.29 (1.29)   2.15 (2.14)   3.57 (3.57)   5.57 (5.57)   19.27 (19.29)
3  4  2   0.99 (1.00)   1.49 (1.50)   2.99 (3.00)   5.50 (5.50)   9.03 (9.00)   32.91 (33.00)
3  4  4   1.00 (1.00)   1.33 (1.33)   2.33 (2.33)   4.00 (4.00)   6.33 (6.33)   22.32 (22.33)
3  4  8   1.00 (1.00)   1.29 (1.29)   2.15 (2.14)   3.57 (3.57)   5.58 (5.57)   19.29 (19.29)

Figure 2. Estimated split-plot error vs. $\gamma$ for levels of b

Table 5. Estimated power for tests for stratification within the whole-plots, $\alpha = 0.05$ (situations with power > 0.75 are shown in bold)

                                df for       df for                  Estimated power (%)
      Levels  Levels  whole-plot   split-plot   F
Reps  of A    of B    error        error        critical  $\gamma$=0   $\gamma$=0.5  $\gamma$=1   $\gamma$=1.5  $\gamma$=2   $\gamma$=4
2     2       2       1            2            199.50    5.1    6.3     8.5    12.5    15.2   28.1
2     2       4       1            6            233.99    4.9    5.6     7.5    10.1    12.6   23.1
2     2       8       1            14           245.36    5.1    5.7     7.2    9.4     11.6   22.1
2     3       2       2            3            19.16     4.7    7.3     14.4   23.9    35.3   72.0
2     3       4       2            9            19.38     5.0    6.2     11.3   17.7    27.5   65.8
2     3       8       2            21           19.45     5.0    6.3     10.6   17.0    25.0   62.7
2     4       2       3            4            9.12      5.2    8.3     19.3   36.3    54.0   89.6
2     4       4       3            12           8.74      5.0    7.3     14.4   28.5    44.7   91.4
2     4       8       3            28           8.62      4.9    7.0     14.0   25.6    41.2   90.0
3     2       2       2            4            19.25     5.2    7.0     14.5   24.3    36.9   75.5
3     2       4       2            12           19.41     4.8    6.7     11.5   18.7    27.8   66.4
3     2       8       2            28           19.46     5.1    6.5     10.7   17.3    24.9   62.4
3     3       2       4            6            6.16      4.6    9.5     26.2   49.7    72.9   97.7
3     3       4       4            18           5.82      4.7    8.5     16.9   40.2    63.5   99.0
3     3       8       4            42           5.71      5.1    8.2     17.7   35.2    56.8   98.7
3     4       2       6            8            4.15      5.0    11.6    36.1   69.1    89.2   99.6
3     4       4       6            24           3.84      5.3    9.2     27.8   58.4    84.2   100.0
3     4       8       6            56           3.74      5.2    8.3     24.6   54.2    81.3   100.0

Table 6. Estimated power for tests for stratification within the whole-plots, $\alpha = 0.1$ (situations with power > 0.75 are shown in bold)

                                df for       df for                  Estimated power (%)
      Levels  Levels  whole-plot   split-plot   F
Reps  of A    of B    error        error        critical  $\gamma$=0   $\gamma$=0.5  $\gamma$=1   $\gamma$=1.5  $\gamma$=2   $\gamma$=4
2     2       2       1            2            49.50     10.0   12.6    16.9   24.0    29.5   50.2
2     2       4       1            6            58.20     10.1   11.4    15.6   19.4    25.1   43.9
2     2       8       1            14           61.07     10.5   10.8    14.5   18.6    23.2   42.7
2     3       2       2            3            9.16      9.9    14.4    26.4   41.1    55.9   85.0
2     3       4       2            9            9.38      9.7    13.1    22.1   33.1    48.1   86.6
2     3       8       2            21           9.44      9.7    12.4    20.6   31.8    44.0   85.3
2     4       2       3            4            5.34      10.2   16.5    33.2   55.1    71.8   93.6
2     4       4       3            12           5.22      9.7    14.2    27.5   47.5    66.3   97.8
2     4       8       3            28           5.17      9.7    14.1    25.7   43.5    62.9   98.1
3     2       2       2            4            9.24      9.9    14.2    26.7   42.4    59.8   89.8
3     2       4       2            12           9.41      9.6    13.5    22.2   34.1    47.6   88.4
3     2       8       2            28           9.46      10.2   13.2    20.3   31.6    44.5   86.1
3     3       2       4            6            4.01      9.9    17.9    42.1   68.7    87.0   98.5
3     3       4       4            18           3.85      9.7    16.5    29.2   59.4    81.6   99.9
3     3       8       4            42           3.80      9.7    15.3    31.1   55.1    77.9   99.9
3     4       2       6            8            2.98      10.0   20.8    53.0   83.4    95.4   99.7
3     4       4       6            24           2.82      10.4   17.3    44.2   76.5    94.4   100.0
3     4       8       6            56           2.76      10.0   16.3    40.8   73.2    93.2   100.0

Table 7. Estimated power for tests for stratification within the whole-plots, $\alpha = 0.25$ (situations with power > 0.75 are shown in bold)

                                df for       df for                  Estimated power (%)
      Levels  Levels  whole-plot   split-plot   F
Reps  of A    of B    error        error        critical  $\gamma$=0   $\gamma$=0.5  $\gamma$=1   $\gamma$=1.5  $\gamma$=2   $\gamma$=4
2     2       2       1            2            7.50      24.7   29.9    40.5   53.6    62.5   78.8
2     2       4       1            6            8.98      25.6   28.1    38.1   47.8    57.4   84.2
2     2       8       1            14           9.47      25.6   27.8    35.7   45.7    54.1   83.2
2     3       2       2            3            3.15      24.8   34.6    54.0   71.8    81.8   91.0
2     3       4       2            9            3.37      24.8   31.4    48.5   65.8    81.0   98.4
2     3       8       2            21           3.43      24.7   30.6    45.8   64.0    78.8   99.1
2     4       2       3            4            2.39      25.4   37.8    61.5   80.3    89.5   95.6
2     4       4       3            12           2.45      25.0   33.9    55.1   77.6    90.6   99.8
2     4       8       3            28           2.46      24.9   33.1    53.5   75.3    90.4   99.9
3     2       2       2            4            3.23      24.7   34.8    55.5   74.0    86.6   94.9
3     2       4       2            12           3.39      24.7   32.2    47.8   66.6    81.7   99.2
3     2       8       2            28           3.44      24.9   31.5    46.4   63.9    79.1   99.3
3     3       2       4            6            2.08      24.9   39.2    70.1   89.8    96.3   98.9
3     3       4       4            18           2.08      24.9   35.7    54.8   85.9    96.8   100.0
3     3       8       4            42           2.08      24.6   34.9    60.2   84.5    96.6   100.0
3     4       2       6            8            1.78      25.1   43.3    78.4   95.4    98.9   99.8
3     4       4       6            24           1.75      25.6   38.4    72.1   94.3    99.3   100.0
3     4       8       6            56           1.74      25.6   38.0    70.4   93.6    99.4   100.0

Discussion and Recommendations
An examination of the simulation results indicates that even for moderately large stratified
effects, the usual F test for comparing split-plot to whole-plot error is not very powerful for
detecting the presence of a stratified effect. It is only when alpha becomes as high as 0.25
that we can be reasonably confident of detecting a stratification effect which is as large as
1.5. Our recommendation is to use an alpha level of 0.25. Our reasoning is that we are not
conducting a hypothesis test in the usual sense. Our work at this point is more
exploratory in nature. If we decide that split-plot error is large as compared to whole-plot
error, our decision impels us to more closely examine the process for a stratified effect.
It does not recommend a process change based on this result. Consequently, we
think that recommending a better understanding of the process to identify stratification
when none exists is a less costly mistake than deciding that no stratification exists when
it does.
In our example of the adhesive experiment, an F test comparing split-plot error to
whole-plot error would result in a p-value of 0.198. Our suggestion would be to investigate
possible stratification within the whole-plots. Now it is unlikely that the stratification
described by our model in equation (1) is a reasonable representation of this physical
situation. This model incorporates an abrupt change from $\gamma$ to $-\gamma$. It is more likely that a
gradual stratification exists throughout the ends of the tubes. However, this model
would provide a conservative means of speculating about the likely size of a gradient,
whether a gradual or abrupt change. If we take the mean square of whole-plot
error, 0.0788, as an estimate of $\sigma^2$, then a simple method of moments estimator of $\gamma$
would be 0.32. (We note that b would be 8 here as there are eight runs within each
whole-plot.) Figure 3 compares this ‘speculated’ value of $\gamma$ to the other effects estimated
in the adhesive experiment. We believe this figure indicates the merits of investigating the
possible stratification further as this stratification may be the largest effect active in this
process.
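Both numbers quoted in this paragraph can be recovered from the mean squares in Table 1. The sketch below is one way to arrive at them, under two assumptions that are not spelled out in the text: the method of moments estimate equates the split-plot error mean square to the whole-plot error mean square plus b/(b − 1) times the squared offset, and the p-value is the upper tail of an F distribution with 56 and 4 degrees of freedom. The helper name is hypothetical, and it relies on the standard binomial-sum form of the regularized incomplete beta function, which applies only when both degrees of freedom are even.

```python
from math import comb, sqrt

def f_upper_tail_even_df(f, df_num, df_den):
    """P(F > f) for an F distribution whose two degrees of freedom are both even.

    Uses P(F > f) = I_x(df_den/2, df_num/2) with x = df_den / (df_den + df_num*f),
    where I_x(a, b) for integers a, b equals P(Binomial(a + b - 1, x) >= a).
    """
    if df_num % 2 or df_den % 2:
        raise ValueError("both degrees of freedom must be even")
    a, b = df_den // 2, df_num // 2
    n = a + b - 1
    x = df_den / (df_den + df_num * f)
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(a, n + 1))

ms_split, df_split = 0.1930, 56     # pooled split-plot error, Table 1
ms_whole, df_whole = 0.0788, 4      # T x E whole-plot error, Table 1
runs_per_whole_plot = 8

ratio = ms_split / ms_whole
p_value = f_upper_tail_even_df(ratio, df_split, df_whole)
offset_hat = sqrt((ms_split - ms_whole) * (runs_per_whole_plot - 1) / runs_per_whole_plot)

print(round(ratio, 2), round(p_value, 3), round(offset_hat, 2))
```

Under these assumptions the ratio of the two error estimates is about 2.45, and the computed p-value and offset estimate agree with the 0.198 and 0.32 reported above.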
Figure 3. Comparison of parameter estimates for fixed effects to estimated $\gamma$
Conclusions
Our first conclusion, and probably the most important, is that the split-plot experiment as
conducted is not a powerful method for detecting stratification. In other words, if a sus-
pected reason for stratification existed, a different means of gaining an understanding of
the nature of the stratification should be used. In the present example, we might have
recommended that the engineers collect process data on the observed differences in
height throughout the use of a tube. Prior to conducting the experiment it was suspected
that heights differed from one end of the tube to another. Baseline data on heights through-
out a tube for a number of tubes studied across different times and process conditions
might have indicated the kind of stratification we expect occurred within a tube.
A second conclusion we draw from our study of the adhesive experiment is the value of
performing additional, exploratory analysis of experimental data. Many useful industrial
experiments are performed on actual industrial processes rather than in a controlled
laboratory setting. This fact means that our learning from an experiment may not be
limited to those factors we pre-selected for our experimental study. Important process
knowledge may be gained by our ability to recognize unanticipated sources of variation.
A thoughtful consideration of what variation is being captured in our experimental work
can serve as a guide to further, beneficial process study. In this paper, we have achieved
this examination by exploring possible stratification within split-plot experiments and
evaluating the effects this stratification would have on experimental results.
References
Anderson, V. & McLean, R. (1974) Design of Experiments; A Realistic Approach (New York: Marcel Dekker).
Box, G. (1996) Quality quandaries; split-plot experiments, Quality Engineering, 8(3), pp. 515–520.
Daniel, C. (1976) Applications of Statistics to Industrial Experimentation (New York: Wiley).
Hahn, G. (1978) More on randomization, ChemTech, March, pp. 164–165.
Hicks, C. (1993) Fundamental Concepts in the Design of Experiments (Fort Worth: Saunders College Publishing).
Leitnaker, M.G. & Mee, R. (2001) The analytic use of two-level factorials in incomplete blocks to examine the
stability of factor effects, Quality Engineering, 14(1), pp. 49–58.
Sanders, D., Leitnaker, M.G. & McLean, R.A. (2001) Randomized block designs in analytical studies, with Doug
Sanders and Bob McLean, Quality Engineering, 14(1), pp. 1–8.