1 Analysis of Variance & One Factor Designs Y= DEPENDENT VARIABLE (“yield”) (“response variable”) (“quality indicator”) X = INDEPENDENT VARIABLE (A possibly influential FACTOR)


Analysis of Variance & One Factor Designs

Y = DEPENDENT VARIABLE (“yield”, “response variable”, “quality indicator”)

X = INDEPENDENT VARIABLE (a possibly influential FACTOR)


OBJECTIVE: To determine the impact of X on Y

Mathematical Model:

$Y = f(X, \varepsilon)$, where $\varepsilon$ = (impact of) all factors other than X

Ex: Y = Battery Life (hours)

X = Brand of Battery

$\varepsilon$ = Many other factors (possibly, some we’re unaware of)

Statistical Model

“LEVEL” OF BRAND (Brand is, of course, represented as “categorical”)

Data layout (rows = replications i, columns = levels j of the factor):

          1      2     • • •    C
   1     Y11    Y12    • • •   Y1C
   2     Y21           Yij
   •
   R     YR1           • • •   YRC

$Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

i = 1, . . . , R

j = 1, . . . , C

Where

$\mu$ = OVERALL AVERAGE

j = index for FACTOR (Brand) LEVEL

i = index for “replication”

$\tau_j$ = Differential effect (response) associated with jth level of X

and $\varepsilon_{ij}$ = “noise” or “error” associated with the (particular) (i,j)th data value.

Let $\mu_j$ = AVERAGE associated with jth level of X

$\tau_j = \mu_j - \mu$ and $\mu$ = AVERAGE of the $\mu_j$.

$Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

By definition, $\sum_{j=1}^{C} \tau_j = 0$

The experiment produces R x C $Y_{ij}$ data values.

The analysis produces estimates of $\mu, \tau_1, \tau_2, \ldots, \tau_C$. (We can then get estimates of the $\varepsilon_{ij}$ by subtraction.)

Data layout with column means:

          1      2     • • •    C
   1     Y11    Y12    • • •   Y1C
   2     Y21
   •
   R     YR1           • • •   YRC

$\bar{Y}_{\cdot 1}$, $\bar{Y}_{\cdot 2}$, etc., are Column Means (in general, $\bar{Y}_{\cdot j}$ for column j)

$\bar{Y}_{\cdot\cdot} = \sum_{j=1}^{C} \bar{Y}_{\cdot j} / C$ = “GRAND MEAN”

(assuming same # data points in each column)

(otherwise, $\bar{Y}_{\cdot\cdot}$ = mean of all the data)

MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

$\bar{Y}_{\cdot\cdot}$ estimates $\mu$

$\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}$ estimates $\tau_j$ (= $\mu_j - \mu$) (for all j)

These estimates are based on Gauss’ (1796)

PRINCIPLE OF LEAST SQUARES

and (I would argue) on COMMON SENSE

MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

If you insert the estimates into the MODEL,

(1) $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + \hat{\varepsilon}_{ij}$,

it follows that our estimate of $\varepsilon_{ij}$ is

(2) $\hat{\varepsilon}_{ij} = Y_{ij} - \bar{Y}_{\cdot j}$

Then, $Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$

or,

(3) $(Y_{ij} - \bar{Y}_{\cdot\cdot}) = (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{\cdot j})$

TOTAL VARIABILITY in Y = Variability in Y associated with X + Variability in Y associated with all other factors

If you square both sides of (3), and double sum both sides (over i and j), you get [after some unpleasant algebra, but lots of terms which “cancel”]:

$\sum_{j=1}^{C} \sum_{i=1}^{R} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = R \sum_{j=1}^{C} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_{j=1}^{C} \sum_{i=1}^{R} (Y_{ij} - \bar{Y}_{\cdot j})^2$

TSS (TOTAL SUM OF SQUARES) = SSBC (SUM OF SQUARES BETWEEN COLUMNS) + SSW (SSE) (SUM OF SQUARES WITHIN COLUMNS)
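The “unpleasant algebra” can be sketched in a few lines. This is one standard way to see why the cross term vanishes, written with $\bar{Y}_{\cdot j}$ for the column means and $\bar{Y}_{\cdot\cdot}$ for the grand mean, and assuming R observations in every column:

```latex
\begin{align*}
\sum_{j=1}^{C}\sum_{i=1}^{R} (Y_{ij}-\bar{Y}_{\cdot\cdot})^2
  &= \sum_{j=1}^{C}\sum_{i=1}^{R}
     \big[(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot}) + (Y_{ij}-\bar{Y}_{\cdot j})\big]^2 \\
  &= R\sum_{j=1}^{C}(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})^2
     + \sum_{j=1}^{C}\sum_{i=1}^{R}(Y_{ij}-\bar{Y}_{\cdot j})^2
     + 2\sum_{j=1}^{C}(\bar{Y}_{\cdot j}-\bar{Y}_{\cdot\cdot})
        \underbrace{\sum_{i=1}^{R}(Y_{ij}-\bar{Y}_{\cdot j})}_{=\,0} \\
  &= \mathrm{SSBC} + \mathrm{SSW}
\end{align*}
```

The inner sum is zero because deviations from a column's own mean always add up to zero, so only the two squared terms survive.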

ANOVA TABLE

SOURCE OF VARIABILITY            SSQ    DF            Mean square (M.S.)
Between Columns (due to brand)   SSBC   C - 1         MSBC = SSBC / (C - 1)
Within Columns (due to error)    SSW    (R - 1) • C   MSW = SSW / [(R - 1) • C]
TOTAL                            TSS    RC - 1

Example: Y = LIFETIME (HOURS)

3 replications per level

BRAND          1     2     3     4     5     6     7     8
              1.8   4.2   8.6   7.0   4.2   4.2   7.8   9.0
              5.0   5.4   4.6   5.0   7.8   4.2   7.0   7.4
              1.0   4.2   4.2   9.0   6.6   5.4   9.8   5.8
Column mean   2.6   4.6   5.8   7.0   6.2   4.6   8.2   7.4     (grand mean = 5.8)

$SSBC = 3\,([2.6 - 5.8]^2 + [4.6 - 5.8]^2 + \cdots + [7.4 - 5.8]^2) = 3\,(23.04) = 69.12$

Within-column squared deviations (each value minus its column mean):

Column 1                  Column 2                 • • •   Column 8
(1.8 - 2.6)^2 = .64       (4.2 - 4.6)^2 = .16              (9.0 - 7.4)^2 = 2.56
(5.0 - 2.6)^2 = 5.76      (5.4 - 4.6)^2 = .64              (7.4 - 7.4)^2 = 0
(1.0 - 2.6)^2 = 2.56      (4.2 - 4.6)^2 = .16              (5.8 - 7.4)^2 = 2.56
Sum: 8.96                 Sum: .96                         Sum: 5.12

SSW = Total of (8.96 + .96 + • • • + 5.12) = 46.72

ANOVA TABLE

Source of Variability   SSQ      df                     M.S.
BRAND                   69.12    7  (= 8 - 1)           9.87
ERROR                   46.72    16 (= 2 (8))           2.92
TOTAL                   115.84   23 (= (3 • 8) - 1)
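The hand computation for the battery example can be checked with a few lines of code. The sketch below is only an illustration of the balanced-design formulas from the slides (same number of replications R in each of the C columns); the variable names are my own.

```python
# Battery lifetimes (hours): 3 replications (rows) x 8 brands (columns).
data = [
    [1.8, 4.2, 8.6, 7.0, 4.2, 4.2, 7.8, 9.0],
    [5.0, 5.4, 4.6, 5.0, 7.8, 4.2, 7.0, 7.4],
    [1.0, 4.2, 4.2, 9.0, 6.6, 5.4, 9.8, 5.8],
]
R = len(data)       # replications per level
C = len(data[0])    # number of levels (brands)

# Column means and grand mean (balanced design, so the grand mean
# is the plain average of the column means).
col_means = [sum(row[j] for row in data) / R for j in range(C)]
grand_mean = sum(col_means) / C

# Sums of squares: between columns, within columns, and total.
ssbc = R * sum((m - grand_mean) ** 2 for m in col_means)
ssw = sum((data[i][j] - col_means[j]) ** 2 for i in range(R) for j in range(C))
tss = sum((data[i][j] - grand_mean) ** 2 for i in range(R) for j in range(C))

# Mean squares, using the degrees of freedom from the ANOVA table.
msbc = ssbc / (C - 1)
msw = ssw / ((R - 1) * C)

print(f"SSBC = {ssbc:.2f}, SSW = {ssw:.2f}, TSS = {tss:.2f}")
print(f"MSBC = {msbc:.2f}, MSW = {msw:.2f}")
```

Computing TSS directly, rather than as SSBC + SSW, gives an independent check that the decomposition holds for these data.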

We can show:

$E(MSBC) = \sigma^2 + V_{COL}$, where $V_{COL} = \frac{R}{C-1} \sum_{j=1}^{C} (\mu_j - \mu)^2$

(“VCOL” is a MEASURE OF DIFFERENCES AMONG COLUMN MEANS)

$E(MSW) = \sigma^2$

(Assuming each $Y_{ij}$ has (constant) standard deviation, $\sigma$)

(More about assumptions, Later)

$E(MSBC) = \sigma^2 + V_{COL}$

$E(MSW) = \sigma^2$

This suggests that

if MSBC / MSW > 1, there’s some evidence of non-zero VCOL, or that “level of X affects Y”;

if MSBC / MSW < 1, there’s no evidence that VCOL > 0, or that “level of X affects Y”.

With $H_0$: Level of X has no impact on Y

$H_1$: Level of X does have impact on Y,

we need MSBC / MSW >> 1 to reject $H_0$.

More Formally,

$H_0$: $\tau_1 = \tau_2 = \cdots = \tau_C = 0$

$H_1$: not all $\tau_j = 0$

OR

$H_0$: $\mu_1 = \mu_2 = \cdots = \mu_C$ (All column means are equal)

$H_1$: not all $\mu_j$ are EQUAL

The probability law of MSBC / MSW = “Fcalc” is the F-distribution with (C - 1, (R - 1)C) degrees of freedom, assuming $H_0$ true.

[Figure: F-distribution curve; c = Table Value (critical value)]

In our problem:

ANOVA TABLE

Source of Variability   SSQ     df   M.S.
BRAND                   69.12   7    9.87
ERROR                   46.72   16   2.92

Fcalc = 9.87 / 2.92 = 3.38

[Figure: F distribution with (7, 16) DF; $\alpha$ = .05, c = 2.66; Fcalc = 3.38 lies beyond c]

(F table coming up)


F-Table

Hence, at $\alpha$ = .05, reject $H_0$.

(i.e., Conclude that level of BRAND does have an impact on battery lifetime.)
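Instead of a table lookup, the upper-tail probability of Fcalc can be approximated numerically. The sketch below is a rough illustration, not what Excel, SPSS or Minitab do internally: it integrates the F density with Simpson's rule using only the standard library. The helper names `f_pdf` and `f_upper_tail` are hypothetical, and the integration assumes the numerator df is larger than 2 so that the density is zero at the origin.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom.

    Returns 0 at x <= 0, which is only correct for d1 > 2 (assumed here).
    """
    if x <= 0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_upper_tail(f_calc, d1, d2, steps=20000):
    """Approximate P(F > f_calc) as 1 minus the Simpson integral over [0, f_calc].

    `steps` must be even for Simpson's rule.
    """
    h = f_calc / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f_calc, d1, d2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f_pdf(k * h, d1, d2)
    return 1.0 - total * h / 3

# Battery example: MSBC = 69.12 / 7, MSW = 46.72 / 16.
f_calc = (69.12 / 7) / (46.72 / 16)
p_value = f_upper_tail(f_calc, 7, 16)
print(f"Fcalc = {f_calc:.4f}, approximate p-value = {p_value:.4f}")
```

A p-value below $\alpha$ = .05 corresponds to Fcalc lying beyond the table value c, so both routes lead to the same decision.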

ONE FACTOR ANOVA, Using EXCEL

Input data (one column per brand):

1.8   4.2   8.6   7     4.2   4.2   7.8   9
5     5.4   4.6   5     7.8   4.2   7     7.4
1     4.2   4.2   9     6.6   5.4   9.8   5.8

Anova: Single-Factor

Summary

Groups     Count   Sum    Average   Variance
Column 1   3       7.8    2.6       4.48
Column 2   3       13.8   4.6       0.48
Column 3   3       17.4   5.8       5.92
Column 4   3       21     7         4
Column 5   3       18.6   6.2       3.36
Column 6   3       13.8   4.6       0.48
Column 7   3       24.6   8.2       2.08
Column 8   3       22.2   7.4       2.56

ANOVA

Source of Variation   SS      df   MS        F        P-value   F crit
Between Groups        69.12   7    9.87429   3.3816   0.02064   2.657
Within Groups         46.72   16   2.92
Total                 115.8   23

SPSS/MINITAB INPUT

VAR001   VAR002
1.8      1
5.0      1
1.0      1
4.2      2
5.4      2
4.2      2
.        .
.        .
.        .
9.0      8
7.4      8
5.8      8
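The two-column input above is the “long” form of the same data that Excel takes as one column per brand. A minimal sketch of the conversion follows; the names `wide` and `long_rows` are my own and are not tied to any package.

```python
# Wide layout: one row per replication, one column per brand (as in the Excel input).
wide = [
    [1.8, 4.2, 8.6, 7.0, 4.2, 4.2, 7.8, 9.0],
    [5.0, 5.4, 4.6, 5.0, 7.8, 4.2, 7.0, 7.4],
    [1.0, 4.2, 4.2, 9.0, 6.6, 5.4, 9.8, 5.8],
]

# Long layout: one (lifetime, brand) pair per observation, brand by brand,
# matching the VAR001 / VAR002 columns.
long_rows = [
    (wide[i][j], j + 1)
    for j in range(len(wide[0]))
    for i in range(len(wide))
]

for lifetime, brand in long_rows:
    print(f"{lifetime:4.1f}  {brand}")
```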

ONE FACTOR ANOVA, using SPSS

- - - - - O N E W A Y - - - - -

Variable Lifetime By Variable Device

Analysis of Variance

Source           D.F.   Sum of Squares   Mean Squares   F Ratio   F Prob.
Between Groups   7      69.1200          9.8743         3.3816    .0206
Within Groups    16     46.7200          2.9200
Total            23     115.8400


ONE FACTOR ANOVA (MINITAB)

Analysis of Variance for life

Source DF SS MS F P

brand 7 69.12 9.87 3.38 0.021

Error 16 46.72 2.92

Total 23 115.84

MINITAB: STAT>>ANOVA>>ONE-WAY

[Figure: Dotplots of life by brand (group means are indicated by lines); x-axis: brand (1 to 8), y-axis: life (1 to 10)]

[Figure: Boxplots of life by brand (means are indicated by solid circles); x-axis: brand (1 to 8), y-axis: life (0 to 10)]

EXAMPLE: MORTAR

The tension bond strength of cement mortar is an important characteristic of the product. An engineer is interested in comparing the strength of a modified formulation in which polymer latex emulsions have been added during mixing to the strength of the unmodified mortar. The experimenter has collected 10 observations on strength for the modified formulation and another 10 observations for the unmodified formulation.

Modified   Unmodified
16.85      17.50
16.40      17.63
17.21      18.25
16.35      18.00
16.52      17.86
17.04      17.75
16.96      18.22
17.15      17.90
16.59      17.96
16.57      18.15

One-way ANOVA: strength versus type (Minitab)

Analysis of Variance for strength

Source   DF   SS       MS       F       P
type     1    6.7048   6.7048   82.98   0.000
Error    18   1.4544   0.0808
Total    19   8.1592
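The Minitab table can be reproduced from the raw mortar data. The sketch below uses a small helper, `one_way_anova` (a hypothetical name), written so that groups may have different sizes; it is an illustration of the sums-of-squares formulas, not Minitab's own routine.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of groups (lists of numbers)."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group SS weights each squared mean deviation by the group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group SS uses deviations from each group's own mean.
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

result = one_way_anova([modified, unmodified])
print(f"type : DF={result['df_between']}  SS={result['ss_between']:.4f}  "
      f"MS={result['ms_between']:.4f}  F={result['f']:.2f}")
print(f"Error: DF={result['df_within']}  SS={result['ss_within']:.4f}  "
      f"MS={result['ms_within']:.4f}")
```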

[Figure: Boxplots of strength by type (means are indicated by solid circles); x-axis: type (1, 2), y-axis: strength (16.5 to 18.5)]

ONE FACTOR ANOVA, using JMP

MVPC Survey Results

Amesbury   Andover   Methuen   Salem
66         55        56        64
66         50        56        70
66         51        57        62
67         47        58        64
70         57        61        66
64         48        54        62
71         52        62        67
66         50        57        60
71         48        61        68
67         50        58        68
63         48        54        66
60         49        51        66
66         52        57        61
70         48        60        63
69         48        59        67
66         48        56        67
70         51        61        70
65         49        55        62
71         46        62        62
63         51        53        68
69         54        59        70
67         54        58        62
64         49        54        63
68         55        58        65
65         47        55        68
67         47        58        68
65         53        55        64
70         51        60        65
68         50        58        69
73         54        64        62

[Figure: Score By Location; x-axis: Location (Amesbury, Andover, Methuen, Salem), y-axis: Score (45 to 75)]

Oneway Anova

Summary of Fit

RSquare                      0.841505
RSquare Adj                  0.837406
Root Mean Square Error       2.932527
Mean of Response             60.09167
Observations (or Sum Wgts)   120

Analysis of Variance

Source    DF    Sum of Squares   Mean Square   F Ratio
Model     3     5296.4250        1765.47       205.2947
Error     116   997.5667         8.60          Prob > F
C Total   119   6293.9917        52.89         <.0001

Means for Oneway Anova

Level      Number   Mean      Std Error
Amesbury   30       67.1000   0.53540
Andover    30       50.4000   0.53540
Methuen    30       57.5667   0.53540
Salem      30       65.3000   0.53540

Std Error uses a pooled estimate of error variance
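Most of the “Summary of Fit” figures follow directly from the sums of squares in the JMP table. The sketch below recomputes them from the reported SS and DF values only (not from the raw survey data), as a way of seeing how the pieces relate; the variable names are my own.

```python
import math

# Values copied from the JMP Analysis of Variance table.
ss_model, df_model = 5296.4250, 3
ss_error, df_error = 997.5667, 116
ss_total, df_total = 6293.9917, 119
n_per_level = 30  # observations at each location

ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_ratio = ms_model / ms_error

# Share of total variability associated with the factor (Location).
r_square = ss_model / ss_total
# Adjusted version compares mean squares instead of sums of squares.
r_square_adj = 1 - ms_error / (ss_total / df_total)

# Pooled estimate of sigma, and the standard error of each level mean.
root_mse = math.sqrt(ms_error)
std_error_mean = root_mse / math.sqrt(n_per_level)

print(f"F Ratio = {f_ratio:.4f}")
print(f"RSquare = {r_square:.6f}, RSquare Adj = {r_square_adj:.6f}")
print(f"Root MSE = {root_mse:.6f}, Std Error of a level mean = {std_error_mean:.5f}")
```

Because every level has 30 observations and the error variance is pooled, all four level means share the same standard error.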

Assumptions

Basically, the same as in Regression analysis:

MODEL: $Y_{ij} = \mu + \tau_j + \varepsilon_{ij}$

1.) the $\varepsilon_{ij}$ are indep. random variables (check: Run order plot)

2.) Each $\varepsilon_{ij}$ is Normally Distributed, with $E(\varepsilon_{ij}) = 0$ for all i, j (check: Normality plot)

3.) $\sigma^2(\varepsilon_{ij})$ = constant for all i, j (check: Residual plot)

Diagnosis: Normality

• The points on the normality plot must more or less follow a straight line to claim “normally distributed”.

• There are statistical tests to check this formally.

• The ANOVA method we learn here is not sensitive to the normality assumption. That is, a mild departure from the normal distribution will not change our conclusions much.

Normality plot: normal scores vs. residuals
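The points of a normality plot can be computed by hand: take the residuals (each value minus its group mean), sort them, and pair them with normal scores. The sketch below does this for the mortar data. The plotting positions (k - 0.375) / (n + 0.25) are one common convention and are an assumption here; the package that produced the slides may use a slightly different one.

```python
from statistics import NormalDist, mean

modified = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
unmodified = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]

# Residual = observation minus the mean of its own group.
residuals = [y - mean(group) for group in (modified, unmodified) for y in group]

n = len(residuals)
sorted_residuals = sorted(residuals)

# Normal score for the k-th smallest residual (assumed plotting-position rule).
normal_scores = [
    NormalDist().inv_cdf((k - 0.375) / (n + 0.25)) for k in range(1, n + 1)
]

# Each pair is one point on the plot; roughly linear points suggest normality.
for resid, score in zip(sorted_residuals, normal_scores):
    print(f"{resid:7.3f}  {score:7.3f}")
```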

From Mortar data:

[Figure: Normal Probability Plot of the Residuals (response is strength); x-axis: Residual (-0.5 to 0.5), y-axis: Normal Score (-2 to 2)]

Diagnosis: Constant Variances

• The points on the residual plot must be more or less within a horizontal band to claim “constant variances”.

• There are statistical tests to check this formally.

• The ANOVA method we learn here is not sensitive to the constant variances assumption. That is, slightly different variances within groups will not change our conclusions much.

Residual plot: fitted values vs. residuals
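A quick numerical companion to the residual plot is to compare the sample variances of the groups. The sketch below does this for the mortar data and also builds the (fitted value, residual) pairs that the plot displays; in a one-factor design the fitted value is simply the group mean. The names are my own, and the variance ratio is only an informal summary, not a formal test.

```python
from statistics import mean, variance

groups = {
    "Modified": [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57],
    "Unmodified": [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15],
}

# Sample variance within each group (divisor n - 1).
group_variances = {name: variance(values) for name, values in groups.items()}

# Points of the residual plot: x = fitted value (group mean), y = residual.
plot_points = [
    (mean(values), y - mean(values))
    for values in groups.values()
    for y in values
]

variance_ratio = max(group_variances.values()) / min(group_variances.values())

for name, var in group_variances.items():
    print(f"{name:10s} variance = {var:.4f}")
print(f"largest / smallest variance = {variance_ratio:.2f}")
```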

From Mortar data:

[Figure: Residuals Versus the Fitted Values (response is strength); x-axis: Fitted Value (17.0 to 18.0), y-axis: Residual (-0.5 to 0.5)]

Diagnosis: Randomness/Independence

• The run order plot must show no “systematic” patterns to claim “randomness”.

• There are statistical tests to check this formally.

• The ANOVA method is sensitive to the independence assumption. That is, even a small degree of dependence between data points can change our conclusions a lot.

Run order plot: order vs. residuals

From Mortar data:

[Figure: Residuals Versus the Order of the Data (response is strength); x-axis: Observation Order (2 to 20), y-axis: Residual (-0.5 to 0.5)]

This assumes a “fixed model”: Inherent interest in the specific levels of the factors under study; there’s no direct interest in extrapolating to other levels, and inference will be limited to levels that appear in the experiment. The experimenter selects the levels.

If a “random model”: Levels in the experiment are randomly selected from a population of such levels, and inference is to be made about the entire population of levels.

Then, besides assumptions 1 to 3, there is another assumption:

4) a) the $\tau_j$ are independent random variables which are normally distributed with constant variance

b) the $\tau_j$ and $\varepsilon_{ij}$ are independent

With these assumptions, the estimates ($\bar{Y}_{\cdot\cdot}$ and the $\bar{Y}_{\cdot j}$) are “Maximum likelihood estimates” (a statistical notion which could be thought of as “efficiency” [“most likely value”]),

and, more directly relevant:

The “Conventional” F- and t-tests are applicable (VALID) for a variety of hypothesis testing and confidence interval computations.

KRUSKAL - WALLIS TEST

(Non - Parametric Alternative)

$H_0$: The probability distributions are identical for each level of the factor

$H_1$: Not all the distributions are the same


Brand

A B C

32 32 28

30 32 21

30 26 15

29 26 15

26 22 14

23 20 14

20 19 14

19 16 11

18 14 9

12 14 8

BATTERY LIFETIME (hours)

(each column rank ordered, for simplicity)

Mean: 23.9 22.1 14.9 (here, irrelevant!!)

$H_0$: no difference in distribution among the three brands with respect to battery lifetime

$H_1$: At least one of the 3 brands differs in distribution from the others with respect to lifetime


Brand

A B C

32 (29) 32 (29) 28 (24)

30 (26.5) 32 (29) 21 (18)

30 (26.5) 26 (22) 15 (10.5)

29 (25) 26 (22) 15 (10.5)

26 (22) 22 (19) 14 (7)

23 (20) 20 (16.5) 14 (7)

20 (16.5) 19 (14.5) 14 (7)

19 (14.5) 16 (12) 11 (3)

18 (13) 14 (7) 9 (2)

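12 (4) 14 (7) 8 (1)

T1 = 197 T2 = 178 T3 = 90

n1 = 10 n2 = 10 n3 = 10

(Ranks, computed over all 30 values combined, are shown in parentheses; tied values share the average rank.)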

TEST STATISTIC:

$H = \frac{12}{N(N+1)} \sum_{j=1}^{K} \frac{T_j^2}{n_j} - 3(N+1)$

$n_j$ = # data values in column j

$N = \sum_{j=1}^{K} n_j$

K = # Columns (levels)

$T_j$ = SUM OF RANKS OF DATA IN COL j when all DATA COMBINED

(There is a slight adjustment in the formula as a function of the number of ties in rank.)

$H = \frac{12}{30\,(31)} \left[ \frac{197^2}{10} + \frac{178^2}{10} + \frac{90^2}{10} \right] - 3\,(31) = 8.41$

(with adjustment for ties, we get 8.46)
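The ranking and the H statistic can be reproduced in code. The sketch below assigns average ranks to tied values, sums the ranks per brand, and evaluates H. The tie adjustment shown divides H by 1 - Σ(t³ - t)/(N³ - N), where t is the size of each group of tied values; that is the commonly used correction and is an assumption about which adjustment the slide refers to. It lands close to the quoted 8.46. The function names are my own.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in order[pos:end + 1]:
            ranks[k] = avg_rank
        pos = end + 1
    return ranks

def kruskal_wallis(groups):
    """Return (rank sums T_j, H, H adjusted for ties) for a list of groups."""
    pooled = [y for g in groups for y in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)

    rank_sums, start = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)

    h = (12 / (n_total * (n_total + 1))
         * sum(t ** 2 / len(g) for t, g in zip(rank_sums, groups))
         - 3 * (n_total + 1))

    # Assumed tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    tie_counts = {}
    for y in pooled:
        tie_counts[y] = tie_counts.get(y, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n_total ** 3 - n_total)
    return rank_sums, h, h / correction

brand_a = [32, 30, 30, 29, 26, 23, 20, 19, 18, 12]
brand_b = [32, 32, 26, 26, 22, 20, 19, 16, 14, 14]
brand_c = [28, 21, 15, 15, 14, 14, 14, 11, 9, 8]

rank_sums, h, h_adjusted = kruskal_wallis([brand_a, brand_b, brand_c])
print("T1, T2, T3 =", rank_sums)
print(f"H = {h:.2f}, H adjusted for ties = {h_adjusted:.2f}")
```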

We can show that, under $H_0$, H is well approximated by a $\chi^2$ distribution with df = K - 1.

What do we do with H?

Here, df = 2, and at $\alpha$ = .05, the critical value = 5.99

[Figure: $\chi^2$ distribution with df = 2; upper-tail area $\alpha$ = .05 beyond 5.99; H = 8.41 lies beyond the critical value]

Reject $H_0$; conclude that the lifetime distributions are NOT the same for all 3 BRANDS
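For df = 2 specifically, the chi-square upper tail has a simple closed form, P($\chi^2$ > x) = exp(-x / 2), so the critical value and an approximate p-value can be checked without a table. The sketch below relies on that df = 2 special case only; other degrees of freedom need a table or a statistics library. The function names are my own.

```python
import math

def chi2_df2_upper_tail(x):
    """P(chi-square with 2 df > x); valid only for df = 2."""
    return math.exp(-x / 2)

def chi2_df2_critical(alpha):
    """Value c with P(chi-square with 2 df > c) = alpha; valid only for df = 2."""
    return -2 * math.log(alpha)

h = 8.41          # unadjusted Kruskal-Wallis statistic from the battery example
alpha = 0.05

critical_value = chi2_df2_critical(alpha)
p_value = chi2_df2_upper_tail(h)

print(f"critical value = {critical_value:.2f}")
print(f"approximate p-value for H = {h}: {p_value:.4f}")
print("Reject H0" if h > critical_value else "Do not reject H0")
```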