Random and Fixed Factor ANOVA Models - Gauge R&R Studies

GE Research & Development Center

______________________________________________________________

Technical Information Series

Random and Fixed Factor ANOVA Models: Gauge R&R Studies

T.A. Early and R. Neagu

99CRD094, July 1999

Class 1

Corporate Research and Development

Technical Report Abstract Page

Title Random and Fixed Factor ANOVA Models: Gauge R&R Studies

Author(s) T.A. Early, R. Neagu*    Phone (518) 387-6590, 8*833-6590

Component Characterization and Environmental Technology Laboratory

Report Number 99CRD094 Date July 1999

Number of Pages 10 Class 1

Key Words ANOVA, Gauge R&R, factor, fixed, random, effects

The effects of random and fixed factors on gauge repeatability and reproducibility analysis are discussed. The context of the gauge study, along with the desired calculated outcome, guides the gauge practitioner to the proper ANOVA model to be applied to the result of a gauge study. This paper discusses the practical aspects of gauge study factors and their impact on the interpretation of the experimental results. Sample calculations are given.

Manuscript received June 29, 1999

*Information Technology Laboratory


Random and Fixed Factor ANOVA Models: Gauge R&R Studies

T.A. Early and R. Neagu

The performance of the measurement system is an essential ingredient in any program designed to improve the quality of a product or process. At a minimum, the measurement system must be able to detect random fluctuations in the process or product to be improved¹. If this performance level is not available from the measurement system, the improvement program simply cannot proceed. Instead, program activity must focus on improving the measurement system.

A gage repeatability and reproducibility (R&R) study is a design of experiments (DOE) with the special purpose of understanding the source(s) of variability in a measurement system² that can be attributed to human, instrument or random effects. In a typical study involving a non-destructive test, several parts are measured by several operators several times. The variation observed when a single operator measures the same part repeatedly is called repeatability or pure error. The variation observed when one operator duplicates the measurement of another operator on the same part is called reproducibility. The parts themselves also contribute to the overall variation of the data obtained in a gage R&R study. Certainly, this can be a large source of variation if the parts used in the gage R&R study are very different. This variation due to different parts is not of primary interest, however. Rather, the intended purpose of the study is to determine how well similar parts can be differentiated by the measurement system. Also of importance in a gage R&R study is the possible interaction of the parts and operators. In other words, do different operators measure different parts differently? For example, does one operator consistently get results below a second operator when measuring small parts while consistently getting results above the second operator when measuring large parts?

A two-factor ANOVA model for a gage R&R study DOE is given by the equation³:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}. \qquad [1]$$

This model states that any measurement, $Y_{ijk}$, is made up of several terms. First is the grand mean of all possible observations, $\mu$. The next two terms are the main effect from the ith operator, $\alpha_i$, and the main effect from the jth part, $\beta_j$. The fourth term is the effect from the part-operator interaction, $(\alpha\beta)_{ij}$. The final term, $\varepsilon_{ijk}$, is the contribution of operator repeatability on the kth measurement, or pure error. For this paper, we will assume a balanced design, i.e., each one of the a operators measures each one of the b parts the same number of times, n.

1 Early, T. A., “Effective Management of Six Sigma Projects: A Balanced Approach”, November 1997, GE CR&D Technical Information Series, 97CRD153.

2 Montgomery, D. C. and Runger, G. C. 1993-4. “Gauge Capability and Designed Experiments. Part I: Basic Methods”, Quality Engineering 6(1), pp. 115–135.

3 Neter, J., Kutner, M. H., Nachtsheim, C. J. and Wasserman, W., Applied Linear Statistical Models, 4th edition, IRWIN, 1996.

For this ANOVA model, formulas for computing sums of squares can be established (see Appendix A). A mean square can then be calculated by dividing the sum of squares for each contribution by the degrees of freedom for that contribution. These mean square values can now be used to test the different properties of the terms used in the model. The null hypothesis, H0, states that none of the factors, parts, operators, or their interaction, has any effect on the individual measurements. That is to say, variation in the data cannot be attributed to any factor or interaction of factors. It is important to note that the correct definition of a specific null hypothesis depends on the context of the gage R&R experiment. Of particular importance is the nature and source of the levels of the factors selected for the DOE gage study as well as the kind of inference to be made from the results of the experiment⁴.

In one scenario, parts and operators chosen for a gage R&R study might represent a small sample taken at random from a larger pool of parts and operators. In this scenario, the study is generally performed to infer the behavior of the larger pool of parts and operators based on the random sample taken from those collections. This is called a random factor levels model. Tests of significance are usually performed to decide whether the variance of each model term is significantly different from zero. For example, the null hypothesis for testing whether an operator-part interaction exists is:

$$H_0: \sigma_{\alpha\beta}^2 = 0.$$

In this scenario, the test statistic for the interaction term is calculated by dividing the mean square for the interaction by the mean square for pure error. If this value is larger than the tabular F-statistic, then the null hypothesis can be rejected. The alternative hypothesis, $H_a: \sigma_{\alpha\beta}^2 \neq 0$, is assumed. If the test statistic is less than the tabular F-statistic, then the null hypothesis cannot be rejected. That is, compared to the pure noise in the measurement system, there is no supportable evidence that there is an operator-part interaction. In this case, the model can be recast without the interaction term. This simpler model can also be treated by the methods described in this paper and that exercise is left to the reader. For the model described in Equation [1], the test statistics for the main effects terms (part and operator) are calculated by dividing the mean square for the main effect by the mean square of the interaction⁵. The test statistic is generated this way because the underlying assumption of randomly chosen factor levels implies certain characteristics about the expected mean squares of these models (vide infra).

The random factor level model can be contrasted with another scenario in which parts and operators are chosen for the gage R&R for a specific reason. Perhaps all the operators of a measurement are selected for the DOE, or specific operators are chosen because the experimenter has particular interest in comparing the performance of the chosen operators.

4 Feder, P. I., “Some Differences between Fixed, Mixed and Random Effects Analysis of Variance Models”, Journal of Quality Technology, April 1974, pp. 98-106.

5 For an alternate way of computing F statistics, see Wolach, A. H., McHale, M. A. (1987), “F ratios and quasi F ratios for fixed, mixed and random model ANOVAs”, Behavior Research Methods, Instruments, & Computers 19, no. 4, pp. 409-412 and Burdick R. K., “Using confidence intervals to test variance components”, Journal of Quality Technology, January 1994, pp. 30-38.


Specifically, in this fixed factor level model, inference about a larger population is not the goal of the study and in fact may have no meaning. Rather, specific comparisons of different factor levels are the point of the study. In this case, Equation [1] still defines the experimental model, but the meaning of the terms in the model is different: they no longer represent random variables but are constants subject to some restrictions. To test whether an operator-part interaction exists, the null hypothesis is:

$$H_0: \text{all } (\alpha\beta)_{ij} = 0.$$

This hypothesis is fundamentally different from that formulated previously. The test statistic would be calculated in the same way as described in the first scenario. Because of the different inferential model in this scenario, however, test statistics for the presence of factor main effects are calculated differently: the mean square for each of the main effects terms is divided by the mean square for pure error. As in the first scenario, the underlying assumption regarding this fixed factor level model implies certain characteristics about the expected mean squares of the model.

Fixed factor level models are called ANOVA model I. Random factor level models are called ANOVA model II. Mixed models, where some factors have random levels and other factors have fixed levels, are also possible and are called ANOVA model III. Table 1 lists the expected mean squares for each of these types for the model described in Equation [1]:

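Table 1. Expected mean squares for each of the three ANOVA models when interactions are important (model Equation [1]).

$$
\begin{array}{l|l|l|l}
\text{Mean square} & \text{Model I (A and B fixed)} & \text{Model II (A and B random)} & \text{Model III (A fixed, B random)} \\
\hline
\text{MSA} & \sigma^2 + nb\,\dfrac{\sum_i \alpha_i^2}{a-1} & \sigma^2 + n\sigma_{\alpha\beta}^2 + nb\,\sigma_\alpha^2 & \sigma^2 + n\sigma_{\alpha\beta}^2 + nb\,\dfrac{\sum_i \alpha_i^2}{a-1} \\
\text{MSB} & \sigma^2 + na\,\dfrac{\sum_j \beta_j^2}{b-1} & \sigma^2 + n\sigma_{\alpha\beta}^2 + na\,\sigma_\beta^2 & \sigma^2 + na\,\sigma_\beta^2 \\
\text{MSAB} & \sigma^2 + n\,\dfrac{\sum_i \sum_j (\alpha\beta)_{ij}^2}{(a-1)(b-1)} & \sigma^2 + n\sigma_{\alpha\beta}^2 & \sigma^2 + n\sigma_{\alpha\beta}^2 \\
\text{MSE} & \sigma^2 & \sigma^2 & \sigma^2
\end{array}
$$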

Each expected mean square in this table is made up of one or more terms. For random factors, the terms are based on the expected variance of the entire population of levels of factors and interactions in the model, $\sigma^2$, $\sigma_\alpha^2$, $\sigma_\beta^2$, or $\sigma_{\alpha\beta}^2$. For fixed factors, the expected mean squares are based on the true variance of the factor main effects. For mixed models, if a fixed factor interacts with a random factor, the expected mean square for that fixed factor also includes random factor variance terms. The test statistic for making inferences about a model term is generated by dividing mean squares that have the same expectation under H0. In addition, under Ha the numerator mean square has a larger expectation than the denominator mean square. This is the only way to correctly generate a test statistic that can then be compared to the tabular F-statistic. The test statistic will be larger than the comparable F-statistic if the model term contributes to the variance of the experimental data more than would be expected by pure chance. Each test statistic for the three models is listed in Table 2:

Table 2. The test statistic for each model term in a two-factor DOE when interactions are important (model Equation [1]).

Test for significance of   Model I (A and B fixed)   Model II (A and B random)   Model III (A fixed, B random)
A                          MSA/MSE                   MSA/MSAB                    MSA/MSAB
B                          MSB/MSE                   MSB/MSAB                    MSB/MSE
AB interaction             MSAB/MSE                  MSAB/MSE                    MSAB/MSE
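The choice of denominator in Table 2 can be expressed as a small lookup. The following Python sketch is an illustration only and is not part of the original report: the names `F_DENOMINATOR` and `f_statistics` are hypothetical, and the mean squares used in the usage lines are the ones reported in the ANOVA table of Appendix B.

```python
# Denominator mean square for each F test, keyed by ANOVA model and term
# (transcribed from Table 2; A = operator, B = part).
F_DENOMINATOR = {
    "I":   {"A": "MSE",  "B": "MSE",  "AB": "MSE"},
    "II":  {"A": "MSAB", "B": "MSAB", "AB": "MSE"},
    "III": {"A": "MSAB", "B": "MSE",  "AB": "MSE"},
}


def f_statistics(mean_squares, model):
    """Return {term: F ratio} for a two-factor crossed design.

    mean_squares: dict with keys "MSA", "MSB", "MSAB", "MSE".
    model: "I" (both fixed), "II" (both random) or "III" (A fixed, B random).
    """
    numerators = {"A": "MSA", "B": "MSB", "AB": "MSAB"}
    return {
        term: mean_squares[numerators[term]] / mean_squares[denominator]
        for term, denominator in F_DENOMINATOR[model].items()
    }


# Usage with the Appendix B mean squares (MSE = 886 / 30).
ms = {"MSA": 2220.0, "MSB": 2047.5, "MSAB": 653.25, "MSE": 886 / 30}
for model in ("I", "II", "III"):
    ratios = f_statistics(ms, model)
    print(model, {term: round(value, 3) for term, value in ratios.items()})
```

Keeping the denominators in a table rather than in branching code makes it easy to see that only the main-effect tests change between models; the interaction is always tested against MSE.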

The expected mean squares provide a way to estimate the components of variance associated with each term in the model. We solve for the variance component of interest, then we replace the expected mean squares by the observed mean squares; by doing so, we know that the estimator we obtain is unbiased. For example, with a model II (random factor) ANOVA, the estimate⁶ of the variance of the AB interaction⁷ term is just:

$$s_{\alpha\beta}^2 = \frac{MSAB - MSE}{n}. \qquad [2]$$

This is the expected variance in interaction of the entire population of operators with the entire population of parts. (We are estimating variance based on the finite data of the gage study, so here we replace σ with s.) The variance of all operators and the variance of all parts can be estimated by:

$$s_\alpha^2 = \frac{MSA - MSAB}{nb} \quad \text{and} \quad s_\beta^2 = \frac{MSB - MSAB}{na}.$$

Thus, an estimate of the gage variance is just the sum of the appropriate components:

$$\sigma_{gage}^2 = \sigma^2 + \sigma_\alpha^2 + \sigma_{\alpha\beta}^2.$$

The first component, pure error (estimated by MSE), is assigned to within-operator repeatability. The final two terms can be assigned to operator-to-operator reproducibility. It is interesting to compare this analysis with that from a mixed, model III analysis.
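As a worked illustration (a sketch that assumes only the Model II column of Table 1), the random-model estimators follow from differencing expected mean squares so that everything but the component of interest cancels, and then substituting the observed mean squares:

```latex
\begin{aligned}
E[MSAB] - E[MSE] &= \left(\sigma^2 + n\sigma_{\alpha\beta}^2\right) - \sigma^2
  = n\,\sigma_{\alpha\beta}^2
  &&\Rightarrow\quad s_{\alpha\beta}^2 = \frac{MSAB - MSE}{n}, \\
E[MSA] - E[MSAB] &= \left(\sigma^2 + n\sigma_{\alpha\beta}^2 + nb\,\sigma_\alpha^2\right)
  - \left(\sigma^2 + n\sigma_{\alpha\beta}^2\right) = nb\,\sigma_\alpha^2
  &&\Rightarrow\quad s_\alpha^2 = \frac{MSA - MSAB}{nb}, \\
E[MSB] - E[MSAB] &= \left(\sigma^2 + n\sigma_{\alpha\beta}^2 + na\,\sigma_\beta^2\right)
  - \left(\sigma^2 + n\sigma_{\alpha\beta}^2\right) = na\,\sigma_\beta^2
  &&\Rightarrow\quad s_\beta^2 = \frac{MSB - MSAB}{na}.
\end{aligned}
```

The same pairs of mean squares appear as numerator and denominator of the Model II tests in Table 2, which is why each F ratio has expectation near one when the corresponding variance component is zero.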

6 Confidence intervals for the estimates can be computed. For example, see Burdick, R. K., Larsen, G. A., “Confidence Intervals on Measures of Variability in R&R Studies”, Journal of Quality Technology, July 1997, pp. 261-273.

7 Montgomery, D. C. and Runger, G. C., 1993-4, “Gauge Capability Analysis and Designed Experiments. Part II: Experimental Design Models and Variance Component Estimation”, Quality Engineering 6, 289.


Suppose all of the operators involved in a measurement are included in the gage R&R study. Suppose also that ANOVA model III is applicable, that is, the part factor is random. Although the operator factor is fixed, the part factor being random makes the operator-part interaction a random factor. The component of variance due to the operator-part interaction can be estimated as shown in Equation [2]⁸. In this case, the operator factor is fixed, not random. The uninteresting variance component due to parts is easily estimated by:

$$s_\beta^2 = \frac{MSB - MSE}{na}.$$

The more interesting variation due to the operator cannot be estimated as a random variance because of the context of the problem. The variance due to operators cannot be modeled by a random variable. The gage data represent the performance of all the operators and not a random sample of operators from a larger pool. However, the operator-to-operator variability is accounted for when we compute the error sum of squares, SSE. The total sum of squares is broken down into SSOperator + SSPart + SSOperator*Part + SSE. Therefore, the variance components (see formulas in Appendix B) take into account the operator-to-operator variability by using MSE. Estimates of contrasts (comparisons) of operator level means can be computed. For example, in a two-operator gage study, operator means can be calculated for each of the operators according to Appendix A. Of great interest is the bias difference, also called the contrast, L, of the two operators:

$$L = \mu_1 - \mu_2 = \alpha_1 - \alpha_2.$$

This contrast allows the direct comparison of measurements obtained by the two operators, irrespective of the magnitude of their bias. With the contrast definition of:

$$L = \sum_i c_i \mu_i, \quad \text{where } \sum_i c_i = 0,$$

for this two-operator study, $c_1 = 1$ and $c_2 = -1$. The contrast between operators is not known exactly, however, because each $Y_{ijk}$ observation includes pure error.

While the estimate of the contrast, $\hat{L}$, is simply the difference in the operator means, $\hat{L} = \bar{Y}_{1\cdot\cdot} - \bar{Y}_{2\cdot\cdot}$, the estimated value of the variance of the contrast is:

$$s_L^2 = \frac{MSE}{nb} \sum_i c_i^2.$$

Simply stated, the expected bias of the operators is obtained through a simple comparison of the average measurement of each operator in the gage R&R study. Therefore, this operator bias can be removed as a source of deviation in the measurement system. The gage variation becomes:

$$\sigma_{gage}^2 = \sigma^2 + \sigma_{\alpha\beta}^2.$$
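The contrast estimate and its variance can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function name `contrast` is hypothetical, the divisor is taken to be the number of observations behind each operator mean (15 per operator in Appendix B), and the operator means and MSE in the usage lines are the values reported in Appendix B.

```python
def contrast(level_means, coefficients, mse, obs_per_mean):
    """Estimate a contrast of fixed-factor level means and its variance.

    level_means:  observed mean for each level of the fixed factor.
    coefficients: contrast weights c_i, which must sum to zero.
    mse:          mean square error from the ANOVA.
    obs_per_mean: number of observations averaged into each level mean.
    """
    if len(level_means) != len(coefficients):
        raise ValueError("need one coefficient per level mean")
    if abs(sum(coefficients)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coefficients, level_means))
    variance = mse / obs_per_mean * sum(c * c for c in coefficients)
    return estimate, variance


# Usage with the Appendix B operator means and MSE = 886 / 30.
operator_means = [992.0, 1014.0, 994.0]
mse = 886 / 30

# Operator 3 versus operator 1.
l_31, var_31 = contrast(operator_means, [-1.0, 0.0, 1.0], mse, 15)

# Operator 2 versus the average of operators 1 and 3.
l_2, var_2 = contrast(operator_means, [-0.5, 1.0, -0.5], mse, 15)

print(l_31, round(var_31, 2))
print(l_2, round(var_2, 2))
```

The zero-sum check is what makes the result a comparison of operators rather than a shifted mean; weights that do not sum to zero would mix the grand mean into the estimate.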

8 Dolezal, K. K., Burdick, R. K. and Birch, N. J., April 1998, “Analysis of a two-factor R & R study with fixed operators”, Journal of Quality Technology, pp. 163-170.


What about the part factor in a gage R&R study? Are parts a random or fixed factor? Parts selected for a gage R&R study need to be as close as possible to the actual parts the measurement is designed to test. In fact, the parts selected may be a random selection of the actual parts to which the measurement is applied. On the face of it, it sounds like parts is a random factor. Nonetheless, variation of a gage study due to differences in parts is not a direct goal of the study. This disinterest in the variability of parts does not imply that parts are a fixed factor. The quantitative measure of part-operator interactions, which is a part of reproducibility, is important. This part-operator interaction needs to be characterized for all possible parts a measurement system might encounter, not just the interaction with the specific parts selected for the gage study. Therefore, parts should be randomly selected for a gage study. Thus, model I ANOVA, where all factors are fixed, will almost never be applied to a gage study. Whether model II or III is applied depends on the inferences to be made about the operator.

While non-destructive, fully crossed designs have been discussed in this article, destructive, nested designs can also be treated in a similar manner. In a simple destructive gage R&R study involving two factors of operator and part (nested within operator), an understanding of part-operator interaction is not possible because part-to-part variability is confounded with all terms in the model. That is, different operators cannot measure the same part because it was destroyed in the measurement. Whether a factor is fixed or random still has the same significance in the generation of expected mean squares (and therefore, statistical significance). It is of utmost importance that parts chosen for a destructive gage study be as similar as possible, so that in this case, parts will be a fixed factor.

In conclusion, the context of a gage R&R study and a complete understanding of the inferences to be made from the results of the study determine how the experimental results can be statistically interpreted. Statistical tests and their associated probabilities vary in structure according to the nature of the levels of factors chosen in the study. In particular, fixed operator gage studies allow for the successful unbiasing of the main effect of operator on the experimental results. Finally, the interested reader is directed to Appendix B, where an example data set is used to illustrate the points of this article.


Appendix A Calculations of Mean Squares

The sum of squares calculation is described for the simple, two-factor experiment with the model equation:

$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}.$$

Levels for the first factor, A, are specified by the index i running from one to a. For the second factor, B, levels are indexed by j, which runs from one to b. Replicate data at the same factor levels are indexed by k, which runs from one to n. Several means are defined:

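Grand mean:

$$\bar{Y}_{\cdot\cdot\cdot} = \frac{\sum_i \sum_j \sum_k Y_{ijk}}{nab}$$

A level means:

$$\bar{Y}_{i\cdot\cdot} = \frac{\sum_j \sum_k Y_{ijk}}{nb}$$

B level means:

$$\bar{Y}_{\cdot j\cdot} = \frac{\sum_i \sum_k Y_{ijk}}{na}$$

Treatment means:

$$\bar{Y}_{ij\cdot} = \frac{\sum_k Y_{ijk}}{n}$$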

With these definitions, sums of squares (variances) can be calculated for all components of the model:

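Term   Description                         Formula
SSTO   Total variance                      $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2$
SSE    Replication (pure error) variance   $\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2$
SSA    Variance due to factor A            $nb \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$
SSB    Variance due to factor B            $na \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2$
SSAB   Variance due to AB interaction      $n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 = n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot j\cdot})$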

SSTO = SSE + SSA + SSB + SSAB

It is interesting to point out that the second form of SSAB contains no squared terms, but is always guaranteed to be ≥ 0.
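A short derivation shows why the product form of SSAB cannot be negative in a balanced design. The shorthand symbols $d_{ij}$, $u_i$ and $v_j$ are introduced here only for this sketch and do not appear in the original report:

```latex
\begin{aligned}
d_{ij} &= \bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot},
\qquad u_i = \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot},
\qquad v_j = \bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}, \\[4pt]
(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot j\cdot})
 &= (d_{ij} + v_j)(d_{ij} + u_i)
 = d_{ij}^2 + u_i d_{ij} + v_j d_{ij} + u_i v_j, \\[4pt]
\sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot j\cdot})
 &= \sum_i \sum_j d_{ij}^2
 + \sum_i u_i \underbrace{\sum_j d_{ij}}_{=\,0}
 + \sum_j v_j \underbrace{\sum_i d_{ij}}_{=\,0}
 + \underbrace{\sum_i u_i}_{=\,0} \sum_j v_j
 = \sum_i \sum_j d_{ij}^2 \;\ge\; 0 .
\end{aligned}
```

The cross terms vanish because, with equal replication, the interaction residuals $d_{ij}$ sum to zero over either index and the level-mean deviations $u_i$ sum to zero, so the product form equals the sum of squared interaction residuals.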


Appendix B An Example

A fictitious non-destructive gage R&R was performed where all three operators used in a process measurement measured five parts three times each with the following results:

                          Part
Operator       1       2       3       4       5
   1        1019     977     992     988     967
            1017     980    1004     991     981
            1018     992    1001     982     971
   2        1031    1001    1010    1018     997
            1031    1007    1025    1018     992
            1025    1010    1019    1024    1002
   3         990     962    1015    1023     980
             991     966    1020    1019     990
             986     952    1013    1027     976

First, averages are calculated for each component:

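Grand Mean: $\bar{Y}_{\cdot\cdot\cdot} = 1000$

Operator Means: $\bar{Y}_{1\cdot\cdot} = 992$, $\bar{Y}_{2\cdot\cdot} = 1014$, $\bar{Y}_{3\cdot\cdot} = 994$

Part Means: $\bar{Y}_{\cdot 1\cdot} = 1012$, $\bar{Y}_{\cdot 2\cdot} = 983$, $\bar{Y}_{\cdot 3\cdot} = 1011$, $\bar{Y}_{\cdot 4\cdot} = 1010$, $\bar{Y}_{\cdot 5\cdot} = 984$

Treatment Means, $\bar{Y}_{ij\cdot}$ (operator i, part j):

          j = 1       2       3       4       5
i = 1      1018     983     999     987     973
i = 2      1029    1006    1018    1020     997
i = 3       989     960    1016    1023     982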

From these means, sums of squares (SS) can be calculated for each component of the model (see Appendix A). Degrees of freedom (df) for each term for this non-destructive test are also listed along with the mean squares (MS). Using Table 2, F test statistics can be calculated for each term except, of course, the error term, which is the basic variance against which all other variances are referenced. Finally, probabilities (P) are calculated for each test statistic. This is the probability that random chance can account for the variance that is assigned to this term:


Term               SS    df       MS        F       P
Operator         4440     2   2220.0    3.398   0.085
Part             8190     4   2047.5   69.328   0.000
Operator*Part    5226     8    653.3   22.119   0.000
Error             886    30     29.5
Total           18742    44

Note that the part factor test statistic and significance are very different from what would be calculated if this were a pure random factor model. (In that case, Part would have an F-test statistic of 3.134 and a P-value of 0.079.)
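The sums of squares, mean squares and F ratios above can be checked directly from the raw readings using the Appendix A formulas. The following Python sketch is illustrative only: the function names `mean` and `two_factor_anova` are hypothetical, the design is assumed to be balanced, and P-values are not computed because that would need an F-distribution routine outside the standard library.

```python
# Appendix B readings: data[i][j] holds the n = 3 replicate readings
# for operator i + 1 measuring part j + 1.
data = [
    [[1019, 1017, 1018], [977, 980, 992], [992, 1004, 1001],
     [988, 991, 982], [967, 981, 971]],
    [[1031, 1031, 1025], [1001, 1007, 1010], [1010, 1025, 1019],
     [1018, 1018, 1024], [997, 992, 1002]],
    [[990, 991, 986], [962, 966, 952], [1015, 1020, 1013],
     [1023, 1019, 1027], [980, 990, 976]],
]


def mean(values):
    return sum(values) / len(values)


def two_factor_anova(data):
    """Sums of squares, df and mean squares for a balanced crossed design."""
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    op = [mean(cell[i]) for i in range(a)]                          # A level means
    part = [mean([cell[i][j] for i in range(a)]) for j in range(b)]  # B level means
    grand = mean(op)
    ss = {
        "Operator": n * b * sum((m - grand) ** 2 for m in op),
        "Part": n * a * sum((m - grand) ** 2 for m in part),
        "Operator*Part": n * sum(
            (cell[i][j] - op[i] - part[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "Error": sum(
            (y - cell[i][j]) ** 2
            for i in range(a) for j in range(b) for y in data[i][j]
        ),
    }
    df = {
        "Operator": a - 1,
        "Part": b - 1,
        "Operator*Part": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
    }
    ms = {term: ss[term] / df[term] for term in ss}
    return ss, df, ms


ss, df, ms = two_factor_anova(data)
for term in ss:
    print(f"{term:14s} SS={ss[term]:8.1f} df={df[term]:2d} MS={ms[term]:8.1f}")

# Model III tests (operator fixed, part random), following Table 2.
f_operator = ms["Operator"] / ms["Operator*Part"]
f_part = ms["Part"] / ms["Error"]
f_interaction = ms["Operator*Part"] / ms["Error"]
# Part test if Model II (both factors random) were applied instead.
f_part_model_ii = ms["Part"] / ms["Operator*Part"]
print(round(f_operator, 3), round(f_part, 3),
      round(f_interaction, 3), round(f_part_model_ii, 3))
```

Only the denominator of the part test changes between the two models, which is enough to move the part F ratio from roughly 69 to roughly 3.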

Using the expected mean squares for model III given in Table 1, estimated variances are calculated for three terms:

Estimated variance             Value    Formula
$\sigma^2$                      29.5    MSE
$\sigma^2_{Operator*Part}$     207.9    (MSAB − MSE)/n
$\sigma^2_{Part}$              224.2    (MSB − MSE)/(na)

Variance for the operator factor cannot be estimated because the operator levels cannot be represented by a random variable⁹. Since operator is a fixed factor, it is reasonable to unbias each operator’s measurement. (Note that this is justified even with an ANOVA P-value of greater than 0.05. This is because the P-value of the simple operator means is much less than 0.05, primarily based on the relatively low pure error and the nb = 15 different observations for each operator.) A contrast for any operator can be defined with the constraints defined in the main article. For example, a contrast between operators #1 and #3 can be developed:

$$\hat{L} = \bar{Y}_{3\cdot\cdot} - \bar{Y}_{1\cdot\cdot} = 994 - 992 = 2,$$

with $c_1 = -1$ and $c_3 = 1$.

We can now estimate the variance of this contrast:

$$s_L^2 = MSE\left[\frac{(1)^2}{15} + \frac{(-1)^2}{15}\right] = 3.93.$$

The 95% confidence interval for the contrast is:

$$\hat{L} \pm t(0.5,\, nb - 2)\, s_L = 2 \pm 4.1.$$

9 Minitab 12.2 Reference Manual: Anova Models and Gage R&R Studies.


This confidence interval contains zero, so the gage study does not support a significant contrast between operators #1 and #3. The contrast (bias) for operator #2 compared to operators #1 and #3 can be defined as:

$$\hat{L} = \bar{Y}_{2\cdot\cdot} - \frac{\bar{Y}_{1\cdot\cdot} + \bar{Y}_{3\cdot\cdot}}{2} = 21,$$

so that $c_2 = 1$ and $c_1 = c_3 = -0.5$.

The estimated variance of this contrast is:

$$s_L^2 = MSE\left[\frac{(1)^2}{15} + \frac{(-0.5)^2}{15} + \frac{(-0.5)^2}{15}\right] = 2.95,$$

and the 95% confidence interval is 21 ± 3.5. This confidence interval does not contain zero, so the contrast is significant. We are statistically justified in unbiasing the measurements of operator #2. Based on 15-sample operator means, the total gage variation now becomes:

$$\sigma_{gage}^2 = \sigma^2 + \sigma_{Operator*Part}^2 = 237.4.$$

This represents a 31% lower value than the value predicted by ANOVA model II erroneously applied to the same data.
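The comparison between the two models can be reproduced from the mean squares in the ANOVA table. This is a minimal Python sketch under stated assumptions: the variable names are hypothetical, the Model II operator component is taken as (MSA − MSAB)/(nb) as in the main text, and the inputs are the Appendix B mean squares with MSE = 886/30.

```python
# Mean squares from the Appendix B ANOVA table.
msa, msab, mse = 2220.0, 653.25, 886 / 30
n, b = 3, 5  # replicates per cell and number of parts

# Components shared by both models.
var_repeatability = mse
var_interaction = (msab - mse) / n

# Model II treats operators as random, so their variance is added to the gage.
var_operator = (msa - msab) / (n * b)
gage_model_ii = var_repeatability + var_operator + var_interaction

# Model III treats operators as fixed; their bias is removed via contrasts.
gage_model_iii = var_repeatability + var_interaction

reduction = 1 - gage_model_iii / gage_model_ii
print(round(gage_model_ii, 1), round(gage_model_iii, 1), round(100 * reduction))
```

The whole difference between the two totals is the operator variance component, which Model III replaces with an explicit bias correction for each operator.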
