Statistics for managers using microsoft excel cp-11

48
Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-1 Statistics for Managers Using Microsoft® Excel 5th Edition Chapter 11 Analysis of Variance

Transcript of Statistics for managers using microsoft excel cp-11

Page 1: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-1

Statistics for ManagersUsing Microsoft® Excel

5th Edition

Chapter 11Analysis of Variance

Page 2: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-2

Learning Objectives

In this chapter, you learn: The basic concepts of experimental design How to use the one-way analysis of variance

to test for differences among the means of several groups

How to use the two-way analysis of variance and interpret the interaction

Page 3: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-3

Chapter Overview

Analysis of Variance (ANOVA)

F-test

Tukey-Kramer

test

One-Way ANOVA

Two-Way ANOVA

InteractionEffects

Page 4: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-4

General ANOVA Setting

Investigator controls one or more factors of interest Each factor contains two or more levels Levels can be numerical or categorical Different levels produce different groups Think of the groups as populations

Observe effects on the dependent variable Are the groups the same?

Experimental design: the plan used to collect the data

Page 5: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-5

Completely Randomized Design

Experimental units (subjects) are assigned randomly to the different levels (groups) Subjects are assumed homogeneous

Only one factor or independent variable With two or more levels (groups)

Analyzed by one-factor analysis of variance (one-way ANOVA)

Page 6: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-6

One-Way Analysis of Variance

Evaluate the difference among the means of three or more groups

Examples: Accident rates for 1st, 2nd, and 3rd shift Expected mileage for five brands of tires

Assumptions Populations are normally distributed Populations have equal variances Samples are randomly and independently drawn

Page 7: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-7

Hypotheses: One-Way ANOVA

All population means are equal i.e., no treatment effect (no variation in means among groups)

At least one population mean is different i.e., there is a treatment (groups) effect Does not mean that all population means are different (at

least one of the means is different from the others)

c3210 μμμμ:H

same the are means population the of all Not:H1

Page 8: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-8

Hypotheses: One-Way ANOVA

All Means are the same:The Null Hypothesis is True

(No Group Effect)

c3210 μμμμ:H

same theare μ allNot :H j1

321 μμμ

Page 9: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-9

Hypotheses: One-Way ANOVA

At least one mean is different:The Null Hypothesis is NOT true (Treatment Effect is present)

c3210 μμμμ:H

same theare μ allNot :H j1

321 μμμ 321 μμμ

or

Page 10: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-10

Partitioning the Variation

Total variation can be split into two parts:

SST = Total Sum of Squares (Total variation)

SSA = Sum of Squares Among Groups (Among-group variation)

SSW = Sum of Squares Within Groups (Within-group variation)

SST = SSA + SSW

Page 11: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-11

Partitioning the Variation

Total Variation = the aggregate dispersion of the individual data values around the overall (grand) mean of all factor levels (SST)

Within-Group Variation = dispersion that exists among the data values within the particular factor levels (SSW)

Among-Group Variation = dispersion between the factor sample means (SSA)

SST = SSA + SSW

Page 12: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-12

Partitioning the Variation

Among Group Variation (SSA)

Within Group Variation (SSW)

Total Variation (SST)

= +

Page 13: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-13

The Total Sum of Squares

c

j

n

iij

j

XXSST1 1

2)(Where:

SST = Total sum of squares

c = number of groups

nj = number of values in group j

Xij = ith value from group j

X = grand mean (mean of all data values)

SST = SSA + SSW

Page 14: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-14

The Total Sum of Squares

2212

211 )(...)()( XXXXXXSST nc

c

j

n

iij

j

XXSST1 1

2)(

Page 15: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-15

Among-Group Variation

Where:

SSA = Sum of squares among groups

c = number of groups

nj = sample size from group j

Xj = sample mean from group j

X = grand mean (mean of all data values)

2

1

)( XXnSSA j

c

jj

SST = SSA + SSW

Page 16: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-16

Among-Group Variation

2j

c

1jj )XX(nSSA

2cc

222

211 )XX(n...)XX(n)XX(nSSA

µ1 µ2 µc

Page 17: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-17

Within-Group Variation

Where:

SSW = Sum of squares within groups

c = number of groups

nj = sample size from group j

Xj = sample mean from group j

Xij = ith value in group j

2

11

)( j

j

XXSSW ij

n

i

c

j

SST = SSA + SSW

Page 18: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-18

Within-Group Variation

2221

211 )(...)()( 11 cXXXXXXSSW nc

2

11

)( jij

n

i

c

j

XXSSWj

j

Page 19: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-19

Obtaining the Mean Squares

cnSSWMSW

1cSSAMSA

1nSSTMST

Mean Squares Among

Mean Squares Within

Mean Squares Within

Page 20: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-20

One-Way ANOVA Table

Source of Variation

df SS MS(Variance)

F-Ratio

Among Groups

c-1 SSA MSA

Within Groups

n-c SSW MSW

Total n-1 SST = SSA+SSW

c = number of groupsn = sum of the sample sizes from all groupsdf = degrees of freedom

MSWMSAF

Page 21: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-21

One-Way ANOVATest Statistic

Test statistic

MSA is mean squares among variances MSW is mean squares within variances

Degrees of freedom df1 = c – 1 (c = number of groups) df2 = n – c (n = sum of all sample sizes)

MSWMSAF

H0: μ1= μ2 = … = μc

H1: At least two population means are different

Page 22: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-22

One-Way ANOVATest Statistic

The F statistic is the ratio of the among variance to the within variance The ratio must always be positive df1 = c -1 will typically be small df2 = n - c will typically be large

Decision Rule:Reject H0 if F > FU,

otherwise do not reject H00

= .05

Reject H0Do not reject H0 FU

Page 23: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-23

One-Way ANOVAExample

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204

You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?

Page 24: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-24

One-Way ANOVAExample

227.0 x

205.8 x 226.0x 249.2x 321

••••

270

260

250

240

230

220

210

200

190

•••••

•••••

Distance

1X

2X

3XX

Club1 2 3

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204

Page 25: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-25

One-Way ANOVAExample

X1 = 249.2

X2 = 226.0

X3 = 205.8

X = 227.0

n1 = 5

n2 = 5

n3 = 5

n = 15

c = 3SSA = 5 (249.2 – 227)2 + 5 (226 – 227)2 + 5 (205.8 – 227)2 = 4716.4

SSW = (254 – 249.2)2 + (263 – 249.2)2 +…+ (204 – 205.8)2 = 1119.6

Page 26: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-26

One-Way ANOVAExample

MSA = 4716.4 / (3-1) = 2358.2

MSW = 1119.6 / (15-3) = 93.325.275

93.32358.2F

F = 25.2750

= .05

FU = 3.89Reject H0Do not

reject H0

Critical Value:

FU = 3.89

Page 27: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-27

One-Way ANOVAExample

H0: μ1 = μ2 = μ3

H1: μj not all equal = .05 df1= 2 df2 = 12

Decision:

Reject H0 at α = 0.05

Conclusion: There is evidence that at least one μj differs from the rest

Page 28: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-28

One-Way ANOVA in Excel

EXCEL:Select Tools ____Data

Analysis_____

ANOVA: single factor

SUMMARYGroups Count Sum Average Variance

Club 1 5 1246 249.2 108.2

Club 2 5 1130 226 77.5

Club 3 5 1029 205.8 94.2

ANOVASource of Variation SS df MS F P-value F crit

Between Groups 4716.4 2 2358.2 25.275 4.99E-05 3.89

Within Groups 1119.6 12 93.3

Total 5836.0 14        

Page 29: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-29

The Tukey-Kramer Procedure

Tells which population means are significantly different e.g.: μ1 = μ2 ≠ μ3

Done after rejection of equal means in ANOVA Allows pair-wise comparisons

Compare absolute mean differences with critical range

xμ 1 = μ 2 μ 3

Page 30: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-30

Tukey-Kramer Critical Range

where:QU = Value from Studentized Range Distribution with c and n - c degrees of freedom for the desired level of (see appendix E.9 table)MSW = Mean Square Withinnj and nj’ = Sample sizes from groups j and j’

j'j n1

n1

2MSWRange Critical UQ

Page 31: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-31

The Tukey-Kramer Procedure1. Compute absolute mean differences:

Club 1 Club 2 Club 3254 234 200263 218 222241 235 197237 227 206251 216 204 20.2205.8226.0xx

43.4205.8249.2xx

23.2226.0249.2xx

32

31

21

2. Find the QU value from the table in appendix E.9 with c = 3 and (n – c) = (15 – 3) = 12 degrees of freedom for the desired level of ( = .05 used here):

3.77Q U

Page 32: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-32

The Tukey-Kramer Procedure

5. All of the absolute mean differences are greater than critical range. Therefore there is a significant difference between each pair of means at the 5% level of significance.

16.28551

51

293.33.77

n1

n1

2MSWQRange Critical

j'jU

3. Compute Critical Range:

20.2xx43.4xx23.2xx 323121 4. Compare:

Page 33: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-33

ANOVA Assumptions

Randomness and Independence Select random samples from the c groups (or

randomly assign the levels) Normality

The sample values from each group are from a normal population

Homogeneity of Variance Can be tested with Levene’s Test

Page 34: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-34

ANOVA AssumptionsLevene’s Test Tests the assumption that the variances of each group

are equal. First, define the null and alternative hypotheses:

H0: σ21 = σ2

2 = …=σ2c

H1: Not all σ2j are equal

Second, compute the absolute value of the difference between each value and the median of each group.

Third, perform a one-way ANOVA on these absolute differences.

Page 35: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-35

Two-Way ANOVA

Examines the effect of Two factors of interest on the dependent

variable e.g., Percent carbonation and line speed on soft

drink bottling process Interaction between the different levels of these

two factors e.g., Does the effect of one particular carbonation

level depend on which level the line speed is set?

Page 36: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-36

Two-Way ANOVA

Assumptions

Populations are normally distributed Populations have equal variances Independent random samples are selected

Page 37: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-37

Two-Way ANOVASources of VariationTwo Factors of interest: A and B

r = number of levels of factor Ac = number of levels of factor Bn/ = number of replications for each celln = total number of observations in all cells(n = rcn/)Xijk = value of the kth observation of level i of factor A and level j of factor B

Page 38: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-38

Two-Way ANOVASources of Variation

SSTTotal Variation

SSAFactor A Variation

SSBFactor B Variation

SSABVariation due to interaction

between A and B

SSERandom variation (Error)

Degrees of Freedom:

r – 1

c – 1

(r – 1)(c – 1)

rc(n/ – 1)

n - 1

SST = SSA + SSB + SSAB + SSE

Page 39: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-39

Two-Way ANOVAEquations

r

1i

c

1j

n

1k

2ijk )XX(SST

2r

1i..i )XX(ncSSA

2c

1j.j. )XX(nrSSB

Total Variation:

Factor A Variation:

Factor B Variation:

Page 40: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-40

Two-Way ANOVAEquations

2

1 1..... )( XXXXnSSAB

r

i

c

jjiij

r

1i

c

1j

n

1k

2.ijijk )XX(SSE

Interaction Variation:

Sum of Squares Error:

Page 41: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-41

Two-Way ANOVAEquations

Mean Grandnrc

XX

r

1i

c

1j

n

1kijk

r) ..., 2, 1, (i A factor of level i of Meannc

XX th

c

1j

n

1kijk

..i

r = number of levels of factor Ac = number of levels of factor Bn/ = number of replications in each cell

Page 42: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-42

Two-Way ANOVAEquations

ij cell ofMean n

XX

n

1k

ijkij.

c) ..., 2, 1, (j Bfactor of level j ofMean nr

XX th

r

1i

n

1kijk

.j.

r = number of levels of factor Ac = number of levels of factor Bn/ = number of replications in each cell

Page 43: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-43

Two-Way ANOVAEquations

1rSSA Afactor square MeanMSA

1cSSBB factor square MeanMSB

)1c)(1r(SSABninteractio square MeanMSAB

)1'n(rcSSEerror square MeanMSE

Page 44: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-44

Two-Way ANOVA:The F Test Statistic

F Test for Factor B Effect

F Test for Interaction Effect

H0: μ1.. = μ2.. = • • • = μr..

H1: Not all μi.. are equal

H0: the interaction of A and B is equal to zero

H1: interaction of A and B isn’t zero

F Test for Factor A Effect

H0: μ.1. = μ.2. = • • • = μ.c.

H1: Not all μ.j. are equal

Reject H0 if

F > FUMSEMSAF

MSEMSBF

MSEMSABF

Reject H0 if

F > FU

Reject H0 if

F > FU

Page 45: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-45

Two-Way ANOVA:Summary Table

Source ofVariation

Degrees of Freedom

Sum ofSquares

Mean Squares

FStatistic

Factor A r – 1 SSA MSA = SSA /(r – 1)

MSA/MSE

Factor B c – 1 SSBMSB

= SSB /(c – 1)MSB/MSE

AB(Interaction) (r – 1)(c – 1) SSAB MSAB

= SSAB / (r – 1)(c – 1)MSAB/

MSE

Error rc(n’ – 1) SSEMSE

= SSE/rc(n’ – 1)

Total n – 1 SST

Page 46: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-46

Two-Way ANOVA:Features

Degrees of freedom always add up n-1 = rc(n/-1) + (r-1) + (c-1) + (r-1)(c-1) Total = error + factor A + factor B + interaction

The denominator of the F Test is always the same but the numerator is different

The sums of squares always add up SST = SSE + SSA + SSB + SSAB Total = error + factor A + factor B + interaction

Page 47: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-47

Two-Way ANOVA:Interaction

No Significant Interaction:

Factor B Level 1

Factor B Level 3

Factor B Level 2

Factor A Levels

Factor B Level 1

Factor B Level 3

Factor B Level 2

Factor A Levels

Mea

n R

espo

nse

Mea

n R

espo

nse

Interaction is present:

Page 48: Statistics for managers using microsoft excel cp-11

Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc. Chap 11-48

Chapter Summary

Described one-way analysis of variance The logic of ANOVA ANOVA assumptions F test for difference in c means The Tukey-Kramer procedure for multiple comparisons

Described two-way analysis of variance Examined effects of multiple factors Examined interaction between factors

In this chapter, we have