Asking and Answering Questions about More Than Two Means

Chapter 17: Asking and Answering Questions about More Than Two Means

Preview
Chapter Learning Objectives
17.1 The Analysis of Variance: Single-Factor ANOVA and the F Test
17.2 Multiple Comparisons
Appendix: ANOVA Computations
Are You Ready to Move On? Chapter Review Exercises
Technology Notes
Appendix Tables
  Table 7: Values That Capture Specified Upper-Tail F Curve Areas
  Table 8: Critical Values of q for the Studentized Range Distribution
Answers to Selected Exercises

James Woodson/Digital Vision/Getty Images
© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Preview

In Chapters 13 and 14, you learned methods for testing $H_0: \mu_1 - \mu_2 = 0$ (or equivalently, $\mu_1 = \mu_2$), where $\mu_1$ and $\mu_2$ are the means of two different populations or the mean responses when two different treatments are applied. However, many investigations involve comparing more than two population or treatment means, as illustrated in the following example.


Chapter Learning Objectives

Conceptual Understanding
After completing this chapter, you should be able to
C1 Understand how a research question about differences between three or more population or treatment means is translated into hypotheses.

Mastering the Mechanics
After completing this chapter, you should be able to
M1 Translate a research question or claim about differences between three or more population or treatment means into null and alternative hypotheses.
M2 Know the conditions for appropriate use of the ANOVA F test.
M3 Carry out an ANOVA F test.
M4 Use a multiple comparison procedure to identify differences in population or treatment means.

Putting It into Practice
After completing this chapter, you should be able to
P1 Recognize when a situation calls for testing hypotheses about differences between three or more population or treatment means.
P2 Carry out an ANOVA F test and interpret the conclusion in context.

Preview Example: Risky Soccer

In a study to see if the high incidence of head injuries among soccer players might be related to memory recall, researchers collected data from three samples of college students (“No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer Players,” The American Journal of Sports Medicine [2002]: 157–162). One sample consisted of soccer athletes, one sample consisted of athletes whose sport was not soccer, and one sample was a comparison group consisting of students who did not participate in sports. The following information on scores from the Hopkins Verbal Learning Test (which measures memory recall) was given in the paper.

| Group | Soccer Athletes | Nonsoccer Athletes | Comparison Group |
|---|---|---|---|
| Sample Size | 86 | 95 | 53 |
| Sample Mean Score | 29.90 | 30.94 | 29.32 |
| Sample Standard Deviation | 3.73 | 5.14 | 3.78 |

Notice that the three sample means are different. But even when the population means are equal, you would not expect the three sample means to be exactly equal. Are the differences in sample means consistent with what is expected simply due to chance differences from one sample to another when the population means are equal, or are the differences large enough that you should conclude that the three population means are not all equal? This is the type of problem considered in this chapter.
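To see why sample means differ even when the population means are equal, here is a small simulation sketch. The sample sizes come from the table above; the common population mean (30) and standard deviation (4) are assumed values chosen only for illustration, and `simulate_sample_means` is a hypothetical helper, not something from the paper.

```python
import random
import statistics


def simulate_sample_means(sizes, mu, sigma, seed=0):
    """Draw one sample per group from the SAME normal population
    and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for n in sizes:
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means


# Sample sizes from the study; mu and sigma are assumptions for illustration.
means = simulate_sample_means([86, 95, 53], mu=30.0, sigma=4.0)
print([round(m, 2) for m in means])
```

Rerunning with different seeds gives three means that differ from one another each time, even though all three samples come from one population. The question for this chapter is whether observed differences are larger than this kind of chance variation.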


SECTION 17.1 The Analysis of Variance: Single-Factor ANOVA and the F Test

When more than two populations or treatments are being compared, the characteristic that distinguishes the populations or treatments from one another is called the factor under investigation. For example, an experiment might be carried out to compare three different methods for teaching reading (three different treatments), in which case the factor of interest would be teaching method, a qualitative factor. If the growth of fish raised in waters having different salinity levels (0%, 10%, 20%, and 30%) is of interest, the factor salinity level is quantitative.

A single-factor analysis of variance (ANOVA) problem involves a comparison of $k$ population or treatment means $\mu_1, \mu_2, \ldots, \mu_k$. The objective is to test

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

against

$H_a$: At least two of the $\mu$'s are different

When comparing populations, the analysis is based on independently selected random samples, one from each population. When comparing treatment means, the data are from an experiment, and the analysis assumes random assignment of the experimental units (subjects or objects) to treatments. If, in addition, the experimental units are chosen at random from a population of interest, it is also possible to generalize the results of the analysis to this population.

Whether the null hypothesis in a single-factor ANOVA should be rejected depends on how much the samples from the different populations or treatments differ from one another. Figure 17.1 displays two sets of observations that might result when random samples are selected from each of three populations. Each display shows five observations from the first population, four observations from the second population, and six observations from the third population. For both displays, the three sample means are located by arrows. The mean of the sample from Population 1 is the same in both displays, as are the means of the samples from Population 2 and from Population 3.

[Figure 17.1: Two possible ANOVA data sets when three populations are compared, shown as panels (a) and (b), with arrows marking the means of Sample 1, Sample 2, and Sample 3. Green circle = observation from Population 1; orange circle = observation from Population 2; blue circle = observation from Population 3.]

Unless otherwise noted, all content on this page is © Cengage Learning.

After looking at the data in Figure 17.1(a), you would probably think that the claim $\mu_1 = \mu_2 = \mu_3$ appears to be false. Not only are the three sample means different, but also the three samples are clearly separated. In other words, differences between the three sample means are quite large relative to the variability within each sample.

The situation pictured in Figure 17.1(b) is much less clear-cut. The sample means are as different as they were in the first data set, but now there is considerable overlap among the three samples. The separation between sample means might be due to the substantial variability in the populations (and therefore the samples) rather than to differences between $\mu_1$, $\mu_2$, and $\mu_3$. The phrase analysis of variance comes from the idea of analyzing variability in the data to see how much can be attributed to differences in the $\mu$'s and how much is due to variability in the individual populations. In Figure 17.1(a), the within-sample variability is small relative to the between-sample variability, whereas in Figure 17.1(b), a great deal more of the total variability is due to variation within each sample. If differences between the sample means can be explained entirely by within-sample variability, there is no compelling reason to reject $H_0: \mu_1 = \mu_2 = \mu_3$.
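The contrast between the two panels of Figure 17.1 can be imitated numerically. The sketch below builds two hypothetical data sets with identical sample means but very different within-sample spread. The centers, deviation patterns, and scale factors are all invented for illustration and are not the values plotted in the figure.

```python
import statistics

# Deviation patterns that sum to zero, so scaling them changes the spread
# of a sample without changing its mean.
PATTERNS = [
    [-2, -1, 0, 1, 2],                   # five observations (Sample 1)
    [-1.5, -0.5, 0.5, 1.5],              # four observations (Sample 2)
    [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5],   # six observations (Sample 3)
]
CENTERS = [10.0, 14.0, 18.0]  # hypothetical sample means


def make_data(scale):
    """Build three samples centered at CENTERS with spread set by `scale`."""
    return [[c + scale * d for d in pattern]
            for c, pattern in zip(CENTERS, PATTERNS)]


def describe(samples):
    """Return (sample means, within-sample variances)."""
    means = [statistics.mean(s) for s in samples]
    variances = [statistics.variance(s) for s in samples]
    return means, variances


tight_means, tight_vars = describe(make_data(0.2))  # in the spirit of panel (a)
wide_means, wide_vars = describe(make_data(2.0))    # in the spirit of panel (b)

print("means (a):", [round(m, 2) for m in tight_means])
print("means (b):", [round(m, 2) for m in wide_means])
print("within-sample variances (a):", [round(v, 3) for v in tight_vars])
print("within-sample variances (b):", [round(v, 3) for v in wide_vars])
```

The sample means agree across the two data sets, so any difference in how convincing the evidence looks comes entirely from the within-sample variability.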

Notations and Assumptions

Notation in single-factor ANOVA is a natural extension of the notation used in earlier chapters for comparing two population or treatment means.

ANOVA Notation

$k$ = number of populations or treatments being compared

| Population or treatment | 1 | 2 | ... | $k$ |
|---|---|---|---|---|
| Population or treatment mean | $\mu_1$ | $\mu_2$ | ... | $\mu_k$ |
| Population or treatment variance | $\sigma_1^2$ | $\sigma_2^2$ | ... | $\sigma_k^2$ |
| Sample size | $n_1$ | $n_2$ | ... | $n_k$ |
| Sample mean | $\bar{x}_1$ | $\bar{x}_2$ | ... | $\bar{x}_k$ |
| Sample variance | $s_1^2$ | $s_2^2$ | ... | $s_k^2$ |

$N = n_1 + n_2 + \cdots + n_k$ (the total number of observations in the data set)

$T$ = grand total = sum of all $N$ observations in the data set $= n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k$

$\bar{\bar{x}}$ = grand mean $= \dfrac{T}{N}$

A decision between

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

and

$H_a$: At least two of the $\mu$'s are different

is based on examining the $\bar{x}$ values to see whether observed differences are small enough to be explained by sampling variability alone or whether an alternative explanation for the differences is more plausible.

Example 17.1 An Indicator of Heart Attack Risk

The article “Could Mean Platelet Volume Be a Predictive Marker for Acute Myocardial Infarction?” (Medical Science Monitor [2005]: 387–392) described a study in which four groups of patients seeking treatment for chest pain were compared with respect to the mean platelet volume (MPV, measured in fL). The four groups considered were based on the clinical diagnosis: (1) noncardiac chest pain, (2) stable angina, (3) unstable angina, and (4) heart attack. The purpose of the study was to determine if the mean MPV differed for the four groups, and in particular if the mean MPV was different for the heart attack group, because then MPV could be used as an indicator of heart attack risk.

To carry out this study, patients seen for chest pain were divided into groups according to diagnosis. The researchers then selected a random sample of 35 from each of the resulting $k = 4$ groups. The researchers believed that this sampling process would result in samples that were representative of the four populations of interest and that could be regarded as if they were random samples from these four populations. Table 17.1 presents summary values given in the paper.


Table 17.1 Summary Values for MPV Data of Example 17.1

| Group Number | Group Description | Sample Size | Sample Mean | Sample Standard Deviation |
|---|---|---|---|---|
| 1 | Noncardiac chest pain | 35 | 10.89 | 0.69 |
| 2 | Stable angina | 35 | 11.25 | 0.74 |
| 3 | Unstable angina | 35 | 11.37 | 0.91 |
| 4 | Heart attack | 35 | 11.75 | 1.07 |

With $\mu_i$ denoting the true mean MPV for group $i$ ($i = 1, 2, 3, 4$), consider the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Figure 17.2 shows a comparative boxplot of the four samples (based on data consistent with summary values given in the paper). The mean MPV for the heart attack sample is larger than for the other three samples, and the boxplot for the heart attack sample appears to be shifted a bit higher than the boxplots for the other three samples. However, because the four boxplots show substantial overlap, it is not obvious whether $H_0$ is plausible or should be rejected. In situations like this, a formal test procedure is helpful.

[Figure 17.2: Boxplots for Example 17.1. Comparative boxplots of MPV (axis from 9 to 13) for the Noncardiac, Stable angina, Unstable angina, and Heart attack groups.]

As with the inferential methods of previous chapters, the validity of the ANOVA test for $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ requires that some conditions be met.

Conditions for ANOVA

1. Each of the $k$ population or treatment response distributions is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (The $k$ normal distributions have equal standard deviations.)
3. The observations in the sample from any particular one of the $k$ populations or treatments are independent of one another.
4. When comparing population means, the $k$ random samples are selected independently of one another. When comparing treatment means, experimental units are assigned at random to treatments.

In practice, the test based on these assumptions works well as long as the conditions are not too badly violated. If the sample sizes are reasonably large, normal probability plots or boxplots of the data in each sample are helpful in checking the condition of normality. Often, however, sample sizes are so small that a separate normal probability plot or boxplot for each sample is of little value in checking normality. In this case, a single combined plot can be constructed by first subtracting $\bar{x}_1$ from each observation in the first sample, $\bar{x}_2$ from each value in the second sample, and so on, and then constructing a normal probability plot or boxplot of all $N$ deviations from their respective means. Figure 17.3 shows such a normal probability plot for the data of Example 17.1.

[Figure 17.3: A normal probability plot using the combined data of Example 17.1. Horizontal axis: Normal score; vertical axis: Deviation.]

There is a formal procedure for testing the equality of population standard deviations. Unfortunately, it is quite sensitive to even a small violation of the normality condition. However, the equal population or treatment standard deviation condition can be considered reasonably met if the largest of the sample standard deviations is at most twice the smallest one. For example, the largest standard deviation in Example 17.1 is $s_4 = 1.07$, which is only about 1.5 times the smallest standard deviation ($s_1 = 0.69$).

The analysis of variance test procedure is based on the following measures of variation in the data.

DEFINITION

A measure of differences among the sample means is the treatment sum of squares, denoted by SSTr and given by

$$\mathrm{SSTr} = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$

A measure of variation within the $k$ samples, called error sum of squares and denoted by SSE, is

$$\mathrm{SSE} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$

Each sum of squares has an associated df:

treatment df $= k - 1$
error df $= N - k$

A mean square is a sum of squares divided by its df. In particular,

mean square for treatments $= \mathrm{MSTr} = \dfrac{\mathrm{SSTr}}{k - 1}$

mean square for error $= \mathrm{MSE} = \dfrac{\mathrm{SSE}}{N - k}$

Note: The number of error degrees of freedom comes from adding the number of degrees of freedom associated with each of the sample variances:

$$(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1) = n_1 + n_2 + \cdots + n_k - 1 - 1 - \cdots - 1 = N - k$$
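The definitions of SSTr, SSE, and the mean squares need only the sample sizes, sample means, and sample standard deviations. As a minimal sketch (the function name `anova_from_summary` and the returned dictionary keys are illustrative choices, not notation from the text), the quantities can be computed from summary values like this, here using the Preview Example's soccer data:

```python
def anova_from_summary(sizes, means, sds):
    """Compute single-factor ANOVA quantities from summary statistics.

    sizes, means, sds: per-sample n_i, x-bar_i, and s_i.
    """
    k = len(sizes)
    n_total = sum(sizes)                                     # N
    grand_total = sum(n * m for n, m in zip(sizes, means))   # T
    grand_mean = grand_total / n_total                       # T / N
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    df_treatment, df_error = k - 1, n_total - k
    return {
        "grand_mean": grand_mean,
        "SSTr": sstr,
        "SSE": sse,
        "df_treatment": df_treatment,
        "df_error": df_error,
        "MSTr": sstr / df_treatment,
        "MSE": sse / df_error,
        # Rule of thumb from the text: this ratio should be at most 2.
        "sd_ratio": max(sds) / min(sds),
    }


# Summary values from the Preview Example (soccer, nonsoccer, comparison).
soccer = anova_from_summary(
    sizes=[86, 95, 53],
    means=[29.90, 30.94, 29.32],
    sds=[3.73, 5.14, 3.78],
)
for name, value in soccer.items():
    print(f"{name}: {value:.3f}")
```

Because the sample sizes here are unequal, the grand mean is a weighted average of the sample means rather than their simple average.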


Example 17.2 Heart Attack Calculations

Let’s return to the mean platelet volume (MPV) data of Example 17.1. The grand mean $\bar{\bar{x}}$ was calculated to be 11.315. Notice that because the sample sizes are all equal, the grand mean is just the average of the four sample means (this will not usually be the case when the sample sizes are unequal). With $\bar{x}_1 = 10.89$, $\bar{x}_2 = 11.25$, $\bar{x}_3 = 11.37$, $\bar{x}_4 = 11.75$, and $n_1 = n_2 = n_3 = n_4 = 35$,

$$\begin{aligned}
\mathrm{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 35(10.89 - 11.315)^2 + 35(11.25 - 11.315)^2 + 35(11.37 - 11.315)^2 + 35(11.75 - 11.315)^2 \\
&= 6.322 + 0.148 + 0.106 + 6.623 \\
&= 13.199
\end{aligned}$$

Because $s_1 = 0.69$, $s_2 = 0.74$, $s_3 = 0.91$, $s_4 = 1.07$,

$$\begin{aligned}
\mathrm{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= (35 - 1)(0.69)^2 + (35 - 1)(0.74)^2 + (35 - 1)(0.91)^2 + (35 - 1)(1.07)^2 \\
&= 101.888
\end{aligned}$$

The numbers of degrees of freedom are

treatment df $= k - 1 = 3$
error df $= N - k = 35 + 35 + 35 + 35 - 4 = 136$

from which

$$\mathrm{MSTr} = \frac{\mathrm{SSTr}}{k - 1} = \frac{13.199}{3} = 4.400$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{N - k} = \frac{101.888}{136} = 0.749$$

Both MSTr and MSE are quantities whose values can be calculated once sample data are available (they are statistics). Each of these statistics varies in value from data set to data set. Both statistics MSTr and MSE have sampling distributions, and these sampling distributions have mean values. The following box describes the relationship between the mean values of MSTr and MSE.

When $H_0$ is true ($\mu_1 = \mu_2 = \cdots = \mu_k$),

$$\mu_{\mathrm{MSTr}} = \mu_{\mathrm{MSE}}$$

However, when $H_0$ is false,

$$\mu_{\mathrm{MSTr}} > \mu_{\mathrm{MSE}}$$

and the greater the differences among the $\mu$'s, the larger $\mu_{\mathrm{MSTr}}$ will be relative to $\mu_{\mathrm{MSE}}$.

According to this result, when $H_0$ is true, you would expect the values of the two mean squares to be close. However, you would expect MSTr to be substantially greater than MSE when some $\mu$'s differ greatly from others. Thus, a calculated MSTr that is much larger than MSE is inconsistent with the null hypothesis. In Example 17.2, MSTr = 4.400 and MSE = 0.749, so MSTr is about six times as large as MSE. Can this be attributed solely to sampling variability, or is the ratio MSTr/MSE large enough to suggest that the null hypothesis is false? Before a formal test procedure can be described, you have to learn about a new family of probability distributions called F distributions.

An F distribution always arises in connection with a ratio. A particular F distribution is obtained by specifying both numerator degrees of freedom ($df_1$) and denominator degrees of freedom ($df_2$). Figure 17.4 shows an F curve for a particular choice of $df_1$ and $df_2$. The ANOVA test of this section is an upper-tailed test, so a P-value is the area under an appropriate F curve to the right of the calculated value of the test statistic.

[Figure: an F curve for a particular $df_1$ and $df_2$; the shaded area to the right of the calculated F is the P-value for an upper-tailed F test.]
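One informal way to see how the ratio MSTr/MSE behaves when the null hypothesis is true is to simulate it. The sketch below draws four samples of size 35 from a single normal population many times and records the ratio each time. The standard normal population, the number of repetitions, and the helper names are assumptions made for illustration; this is a simulation, not the formal test procedure.

```python
import random


def f_ratio(samples):
    """Return MSTr / MSE for a list of samples (lists of numbers)."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    sstr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Sum of squared deviations within each sample equals sum of (n_i - 1) * s_i^2.
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    return (sstr / (k - 1)) / (sse / (n_total - k))


def simulate_null_ratios(sizes, reps=2000, seed=0):
    """Simulate MSTr / MSE when every sample comes from one normal population."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(reps):
        samples = [[rng.gauss(0.0, 1.0) for _ in range(n)] for n in sizes]
        ratios.append(f_ratio(samples))
    return ratios


observed = 4.400 / 0.749  # ratio of the mean squares from Example 17.2
null_ratios = simulate_null_ratios([35, 35, 35, 35])
share = sum(r >= observed for r in null_ratios) / len(null_ratios)
print("average simulated ratio:", round(sum(null_ratios) / len(null_ratios), 3))
print("share of simulated ratios at least as large as observed:", share)
```

Under these assumptions the simulated ratios cluster near 1, and only a very small share of them reach the observed ratio. The F distribution makes this comparison exact without simulation.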

Constructing tables of these upper-tail areas is cumbersome, because there are two degrees of freedom rather than just one (as in the case of t distributions). For selected ($df_1$, $df_2$) pairs, the F table (Appendix Table 7) gives only the four numbers that capture tail areas 0.10, 0.05, 0.01, and 0.001, respectively. Here are the four numbers for $df_1 = 4$, $df_2 = 10$, along with the statements that can be made about the P-value:

| Tail area | 0.10 | 0.05 | 0.01 | 0.001 |
|---|---|---|---|---|
| Value | 2.61 | 3.48 | 5.99 | 11.28 |

The four values divide the F axis into five regions, labeled a through e from left to right:

a. $F < 2.61$ → tail area = P-value > 0.10
b. $2.61 < F < 3.48$ → 0.05 < P-value < 0.10
c. $3.48 < F < 5.99$ → 0.01 < P-value < 0.05
d. $5.99 < F < 11.28$ → 0.001 < P-value < 0.01
e. $F > 11.28$ → P-value < 0.001

For example, if $F = 7.12$, then 0.001 < P-value < 0.01. If a test with $\alpha = 0.05$ is used, $H_0$ should be rejected, because P-value < $\alpha$. The most frequently used statistical computer packages can provide exact P-values for F tests.
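The table lookup described above amounts to finding which pair of tabled values the calculated F falls between. A minimal sketch follows; the helper `bracket_p_value` is hypothetical, and the bounds 0.0 and 1.0 simply stand for "no bound available from the table".

```python
def bracket_p_value(f_value, critical_values,
                    tail_areas=(0.10, 0.05, 0.01, 0.001)):
    """Bound the P-value of an upper-tailed F test using one table row.

    critical_values: tabled F values, in the same order as tail_areas.
    Returns (lower, upper) with lower <= P-value <= upper.
    """
    lower, upper = 0.0, 1.0
    for critical, area in zip(critical_values, tail_areas):
        if f_value > critical:
            upper = area   # F is beyond this cutoff, so the tail area is smaller
        else:
            lower = area   # F is short of this cutoff, so the tail area is at least this
            break
    return lower, upper


# Row of Appendix Table 7 for df1 = 4, df2 = 10 (values quoted in the text).
row = [2.61, 3.48, 5.99, 11.28]
print(bracket_p_value(7.12, row))   # region d
print(bracket_p_value(1.50, row))   # region a
print(bracket_p_value(15.0, row))   # region e
```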

Single-Factor ANOVA F Test for Equality of Three or More Population Means

Appropriate when the following conditions are met:

1. Each of the $k$ population or treatment response distributions is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (The $k$ normal distributions have equal standard deviations.)
3. The observations in the sample from any particular one of the $k$ populations or treatments are independent of one another.
4. When comparing population means, the $k$ random samples are selected independently of one another. When comparing treatment means, experimental units are assigned at random to treatments.

When these conditions are met, the following test statistic can be used:

$$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}}$$

When the conditions above are met and the null hypothesis is true, the F statistic has an approximate F distribution with

$df_1 = k - 1$ and $df_2 = N - k$

Form of the null hypothesis: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$

Form of the alternative hypothesis: $H_a$: At least two of the $\mu$'s are different

The P-value is: Area under the F curve to the right of the calculated value of the test statistic

Figure 17.4 An F curve and P-value for an upper-tailed test
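Statistical software reports the exact upper-tail area rather than a bracket from a table. As a rough sketch of what such a calculation involves, the function below integrates numerically, using the standard fact that the upper-tail area of an F distribution equals a regularized incomplete beta function. The function name, the use of Simpson's rule, and the restriction to $df_2 > 2$ are choices made for this illustration. In practice a statistics package would be used instead.

```python
import math


def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F distribution with (df1, df2) df.

    Uses P(F > f) = I_x(df2 / 2, df1 / 2) with x = df2 / (df2 + df1 * f),
    integrating the beta density on [0, x] by Simpson's rule.
    Assumes df2 > 2 so the integrand is zero at t = 0; steps must be even.
    """
    if f_value <= 0:
        return 1.0
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log(1 - t) - log_beta)

    h = x / steps
    total = density(0.0) + density(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3


# The tabled value 3.48 for df1 = 4, df2 = 10 should capture a tail area near 0.05.
print(round(f_upper_tail(3.48, 4, 10), 4))
# Mean squares from Example 17.2, with df1 = 3 and df2 = 136.
print(f_upper_tail(4.400 / 0.749, 3, 136))
```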


Example 17.3 Heart Attacks Revisited

Recall that the two mean squares for the MPV data given in Example 17.1 were calculated in Example 17.2 to be

MSTr = 4.400    MSE = 0.749

You can now use the five-step process for hypothesis testing problems (HMC³) to test the hypotheses of interest.

Process Step

H Hypotheses
The question of interest is whether there are differences in mean MPV for the four different diagnosis groups.

Population characteristics of interest:
$\mu_1$ = mean MPV for the noncardiac chest pain group
$\mu_2$ = mean MPV for the stable angina group
$\mu_3$ = mean MPV for the unstable angina group
$\mu_4$ = mean MPV for the heart attack group

Hypotheses:
Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternative hypothesis: $H_a$: At least two of the $\mu$'s are different

M Method
Because the answers to the four key questions are hypothesis testing, sample data, one numerical variable, and four independently selected samples, consider an ANOVA F test.

Potential method: ANOVA F test. The test statistic for this test is

$$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}}$$

When the null hypothesis is true, this statistic has approximately an F distribution with

$df_1 = k - 1$ and $df_2 = N - k$

Once you have decided to proceed with the test, you need to select a significance level for the test. In this example, you might choose a value for $\alpha$ of 0.05.

Significance level: $\alpha = 0.05$

C Check
The samples were independently selected. The largest sample standard deviation (from Table 17.1, $s_4 = 1.07$) is not more than twice as large as the smallest sample standard deviation ($s_1 = 0.69$), so the equal population standard deviations condition is reasonably met. A normal probability plot (see Figure 17.3) indicates that the normality condition is also reasonably met.

C Calculate
MSTr = 4.400    MSE = 0.749 (from Example 17.2)

Test statistic:

$$F = \frac{\mathrm{MSTr}}{\mathrm{MSE}} = \frac{4.400}{0.749} = 5.87$$

Degrees of freedom:
$df_1 = k - 1 = 4 - 1 = 3$
$df_2 = N - k = 140 - 4 = 136$

Associated P-value:
P-value = area under F curve to the right of 5.87
Using $df_1 = 3$ and $df_2 = 120$ (the closest value to 136 that appears in the table), Appendix Table 7 shows that the area to the right of 5.78 is 0.001. Since 5.87 > 5.78, it follows that the P-value is less than 0.001.

C Communicate results
Because the P-value is less than the selected significance level, you reject the null hypothesis.

Decision: Reject $H_0$.

The final conclusion for the test should be stated in context and answer the question posed.

Conclusion: You can conclude that the mean MPV is not the same for all four patient groups.

Techniques for determining which means differ are introduced in Section 17.2.

Example 17.4 Hormones and Body Fat

The article “Growth Hormone and Sex Steroid Administration in Healthy Aged Women and Men” (Journal of the American Medical Association [2002]: 2282–2292) described an experiment to investigate the effect of four treatments on various body characteristics. In this double-blind experiment, each of 57 female subjects age 65 or older was assigned at random to one of the following four treatments: (1) placebo “growth hormone” and placebo “steroid” (denoted by P + P); (2) placebo “growth hormone” and the steroid estradiol (denoted by P + S); (3) growth hormone and placebo “steroid” (denoted by G + P); and (4) growth hormone and the steroid estradiol (denoted by G + S).

The following table lists data on change in body fat mass over the 26-week period following the treatments that are consistent with summary quantities given in the article.

Change in Body Fat Mass (kg)

| Treatment | P + P | P + S | G + P | G + S |
|---|---|---|---|---|
| | 0.1 | -0.1 | -1.6 | -3.1 |
| | 0.6 | 0.2 | -0.4 | -3.2 |
| | 2.2 | 0.0 | 0.4 | -2.0 |
| | 0.7 | -0.4 | -2.0 | -2.0 |
| | -2.0 | -0.9 | -3.4 | -3.3 |
| | 0.7 | -1.1 | -2.8 | -0.5 |
| | 0.0 | 1.2 | -2.2 | -4.5 |
| | -2.6 | 0.1 | -1.8 | -0.7 |
| | -1.4 | 0.7 | -3.3 | -1.8 |
| | 1.5 | -2.0 | -2.1 | -2.3 |
| | 2.8 | -0.9 | -3.6 | -1.3 |
| | 0.3 | -3.0 | -0.4 | -1.0 |
| | -1.0 | 1.0 | -3.1 | -5.6 |
| | -1.0 | 1.2 | | -2.9 |
| | | | | -1.6 |
| | | | | -0.2 |
| $n$ | 14 | 14 | 13 | 16 |
| $\bar{x}$ | 0.064 | -0.286 | -2.023 | -2.250 |
| $s$ | 1.545 | 1.218 | 1.264 | 1.468 |
| $s^2$ | 2.387 | 1.484 | 1.598 | 2.155 |

For this example, $N = 57$, grand total $= -65.4$, and $\bar{\bar{x}} = \dfrac{-65.4}{57} = -1.15$.
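The summary rows for Example 17.4 ($n$, $\bar{x}$, $s$, and the grand total) can be reproduced from the raw changes in body fat mass. The sketch below uses Python's standard `statistics` module; the dictionary layout and the helper `summarize` are illustrative. One assumption is flagged in the code: the twelfth P + S value is entered as -3.0, which is the sign that reproduces the reported P + S mean and the grand total.

```python
import statistics

# Change in body fat mass (kg) for each treatment in Example 17.4.
# Assumption: the twelfth P + S value is -3.0 (negative), the sign that
# matches the reported mean of -0.286 and grand total of -65.4.
body_fat_change = {
    "P + P": [0.1, 0.6, 2.2, 0.7, -2.0, 0.7, 0.0, -2.6, -1.4, 1.5, 2.8, 0.3,
              -1.0, -1.0],
    "P + S": [-0.1, 0.2, 0.0, -0.4, -0.9, -1.1, 1.2, 0.1, 0.7, -2.0, -0.9,
              -3.0, 1.0, 1.2],
    "G + P": [-1.6, -0.4, 0.4, -2.0, -3.4, -2.8, -2.2, -1.8, -3.3, -2.1,
              -3.6, -0.4, -3.1],
    "G + S": [-3.1, -3.2, -2.0, -2.0, -3.3, -0.5, -4.5, -0.7, -1.8, -2.3,
              -1.3, -1.0, -5.6, -2.9, -1.6, -0.2],
}


def summarize(groups):
    """Return {treatment: (n, sample mean, sample standard deviation)}."""
    return {
        name: (len(values), statistics.mean(values), statistics.stdev(values))
        for name, values in groups.items()
    }


summary = summarize(body_fat_change)
for name, (n, mean, sd) in summary.items():
    print(f"{name}: n = {n}, mean = {mean:.3f}, s = {sd:.3f}")

grand_total = sum(sum(values) for values in body_fat_change.values())
n_total = sum(len(values) for values in body_fat_change.values())
print(f"N = {n_total}, grand total = {grand_total:.1f}, "
      f"grand mean = {grand_total / n_total:.2f}")
```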


Process Step

H Hypotheses The question of interest is whether there are differences in mean change in body fat mass for the four treatments.

Population characteristics of interest:m

1 5 mean change in body fat mass for the P 1 P treatment

m2 5 mean change in body fat mass for the P 1 S treatment

m3 5 mean change in body fat mass for the G 1 P treatment

m4 5 mean change in body fat mass for the G 1 S treatment

Hypotheses:Null hypothesis: H

0: m

1 5 m

2 5 m

3 5 m

4

Alternative hypothesis: Ha: At least two of the m9s are different

M Method Because the answers to the four key questions are hypothesis testing, sample data, one numerical variable, and four independently selected samples, consider an ANOVA F test.

Potential method:ANOVA F test. The test statistic for this test is

F 5 MSTr _____ MSE

When the null hypothesis is true, this statistic has approximately an F distribution with

df1 5 k 2 1 and df

2 5 N 2 k

Once you have decided to proceed with the test, you need to select a significance level for the test. For this example, a significance level of 0.01 will be used.

Significance level:

C Check The subjects in the experiment were randomly assigned to treatments. The largest sample standard deviation (s

1 5 1.545) is not more than

twice as large as the smallest sample standard deviation (s2 5 1.218),

so the equal population standard deviations condition is reasonably met. Boxplots of the data from each of the four samples are shown in Figure 17.5. The boxplots are roughly symmetric, and there are no outliers, so the normality condition is also reasonably met.

C Calculate SSTr 5 n1 ( _ x 1 2

_

_ x ) 2 1 n

2 ( _ x 2 2

_

_ x ) 2 1 . . . 1 n

k ( _ x k 2

_

_ x ) 2

5 14(0.064 2 (21.15)) 2 1 14(20.286 2 (21.15)) 2 1 13(22.023 2 (21.15)) 2 1 16(22.250 2 (21.15)) 2 5 60.37

treatment df 5 k 2 1 5 3

SSE 5 (n1 2 1) s

1 2 1 (n

2 2 1) s

2 2 1 . . . 1 (n

k 2 1) s

k 2

5 13(2.387) 1 13(1.484) 1 12(1.598) 1 15(2.155) 5 101.81

Test statistic:

F 5 MSTr _____ MSE 5

SSTr treatment df _______________ SSE error df 5 60.37 3 _________ 101.81 53 5

20.12 _____ 1.92 5 10.48

Degrees of freedom

df1 5 k 2 1 5 4 2 1 5 3

df2 5 N 2 k 5 57 2 4 5 53

Associated P-value:P-value 5 area under F curve to the right of 10.48Using df

1 5 3 and df

2 5 60 (the closest value to 53 that appears in

the table), Appendix Table 7 shows that the area to the right of 6.17 is 0.001. Since 10.48 > 6.17 it follows that the P-value is less than 0.001.

(continued)

5 0.01a

Page 12: Asking and Answering Questions about More Than Two Means

13

© 2

014

Cen

gage

Lea

rnin

g. A

ll R

ight

s R

eser

ved.

May

not

be

scan

ned,

cop

ied

or d

uplic

ated

, or

post

ed to

a p

ublic

ly a

cces

sibl

e w

ebsi

te, i

n w

hole

or

in p

art.

17.1 The Analysis of Variance—Single-Factor ANOVA and the F Test

The quantity SSTo, the sum of squared deviations about the grand mean, is a measure of total variability in the data set consisting of all k samples. The quantity SSE results from measuring variability separately within each sample and then combining. Such within-sample variability is present regardless of whether or not H

0 is true. The magnitude of

SSTr, on the other hand, depends on whether the null hypothesis is true or false. The more the m’s differ from one another, the larger SSTr will tend to be. SSTr represents variation that can (at least to some extent) be explained by any differences between means. An infor-mal paraphrase of the fundamental identity for single-factor ANOVA is

total variation 5 explained variation 1 unexplained variation

Once any two of the sums of squares have been calculated, the remaining one is easily obtained from the fundamental identity. Often SSTo and SSTr are calculated first (using computational formulas given in the appendix to this chapter), and then SSE is obtained by subtraction: SSE 5 SSTo − SSTr. All the degrees of freedom, sums of squares, and mean squares are entered in an ANOVA table, as displayed in Table 17.2. The P-value usually appears to the right of F when the analysis is done by a statistical software package.

summarizing an anOvaANOVA calculations are often summarized in a tabular format called an ANOVA table. To understand such a table, one more sum of squares must be defined.

Process Step

C Communicate results

Because the P-value is less than the selected significance level, reject the null hypothesis.

Decision: Reject H0.

Conclusion: You can conclude that the mean change in body fat mass is not the same for all four patient groups.

Total sum of squares, denoted by SSTo, is given by

$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} (x - \bar{\bar{x}})^2$$

with associated df = N − 1.

The relationship between the three sums of squares SSTo, SSTr, and SSE is

$$\text{SSTo} = \text{SSTr} + \text{SSE}$$

which is called the fundamental identity for single-factor ANOVA.
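The fundamental identity can be checked numerically. The following is a minimal sketch, assuming the usual definitions of SSTr (sample-size-weighted squared deviations of the sample means about the grand mean) and SSE (squared deviations within each sample); the function name `sums_of_squares` and the data are hypothetical and chosen only for illustration.

```python
def sums_of_squares(samples):
    """Return (SSTo, SSTr, SSE) for a list of samples, each a list of numbers."""
    all_obs = [x for sample in samples for x in sample]
    grand_mean = sum(all_obs) / len(all_obs)

    # Total variation: squared deviations of every observation about the grand mean.
    ssto = sum((x - grand_mean) ** 2 for x in all_obs)

    sstr = 0.0  # between-sample (treatment) variation
    sse = 0.0   # within-sample (error) variation
    for sample in samples:
        mean = sum(sample) / len(sample)
        sstr += len(sample) * (mean - grand_mean) ** 2
        sse += sum((x - mean) ** 2 for x in sample)
    return ssto, sstr, sse


# Hypothetical data: three small samples of unequal size.
samples = [[4.0, 6.0, 5.0], [7.0, 9.0, 8.0, 8.0], [3.0, 2.0, 4.0]]
ssto, sstr, sse = sums_of_squares(samples)
print(f"SSTo = {ssto:.4f}, SSTr + SSE = {sstr + sse:.4f}")
```

The two printed values should agree up to floating-point rounding, which is why SSE can be obtained by subtraction once SSTo and SSTr are known.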

Figure 17.5 Boxplots for the data of Example 17.4 (one boxplot for each of the treatments P + P, P + S, G + P, and G + S; horizontal axis: change in body fat mass, from −6 to 3)


c. Answer the question posed in Part (b) if the F value given there resulted from sample sizes $n_1 = 9$, $n_2 = 8$, $n_3 = 7$, and $n_4 = 8$.

17.3 The authors of the paper “Age and Violent Content Labels Make Video Games Forbidden Fruits for Youth” (Pediatrics [2009]: 870–876) carried out an experiment to determine if restrictive labels on video games actually increased the attractiveness of the game for young game players. Participants read a description of a new video game and were asked how much they wanted to play the game. The description also included an age rating. Some participants read the description with an age restrictive label of 7+, indicating that the game was not appropriate for children under the age of 7. Others read the same description, but with an age restrictive label of 12+, 16+, or 18+. The following data for 12- to 13-year-old boys are fictitious but are consistent with summary statistics given in the paper. (The sample sizes in the actual experiment were larger.) For purposes of this exercise, you can assume that the boys were assigned at random to one of the four age label treatments (7+, 12+, 16+, and 18+). Data shown are the boys’ ratings of how much they wanted to play the game on a scale of 1 to 10. Do the data provide convincing evidence that the mean rating associated with the game description by 12- to

An ANOVA table from Minitab for the change in body fat mass data of Example 17.4 is shown in Table 17.3. The reported P-value is 0.000, consistent with the previous conclusion that P-value < 0.001.

Section 17.1 Exercise Set 1

17.1 Give as much information as you can about the P-value for an upper-tailed F test in each of the following situations.
a. $df_1 = 4$, $df_2 = 15$, $F = 5.37$
b. $df_1 = 4$, $df_2 = 15$, $F = 1.90$
c. $df_1 = 4$, $df_2 = 15$, $F = 4.89$
d. $df_1 = 3$, $df_2 = 20$, $F = 14.48$
e. $df_1 = 3$, $df_2 = 20$, $F = 2.69$
f. $df_1 = 4$, $df_2 = 50$, $F = 3.24$

17.2 Employees of a certain state university system can choose from among four different health plans. Each plan differs somewhat from the others in terms of hospitalization coverage. Four samples of recently hospitalized individuals were selected, each sample consisting of people covered by a different health plan. The length of the hospital stay (number of days) was determined for each individual selected.
a. What hypotheses would you test to decide whether the mean lengths of stay are not the same for all four health plans?
b. If each sample consisted of eight individuals and the value of the ANOVA F statistic was $F = 4.37$, what conclusion would be appropriate for a test with $\alpha = 0.01$?

Table 17.2 General Format for a Single-Factor ANOVA Table

Source of Variation   df       Sum of Squares   Mean Square            F
Treatments            k − 1    SSTr             MSTr = SSTr/(k − 1)    F = MSTr/MSE
Error                 N − k    SSE              MSE = SSE/(N − k)
Total                 N − 1    SSTo

Table 17.3 An ANOVA Table from Minitab for the Data of Example 17.4

One-way ANOVA

Source    DF    SS        MS       F        P
Factor     3    60.37     20.12    10.48    0.000
Error     53    101.81     1.92
Total     56    162.18
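The mean squares and F statistic in an ANOVA table follow directly from the sums of squares and degrees of freedom, as laid out in Table 17.2. The sketch below, with the hypothetical helper name `anova_table`, reproduces the Table 17.3 entries from SSTr = 60.37, SSE = 101.81, k = 4, and N = 57 (N is inferred from the total df of 56). It does not compute the P-value, which would require the F distribution.

```python
def anova_table(sstr, sse, k, n_total):
    """Fill in a single-factor ANOVA table from SSTr, SSE, k treatments, N observations."""
    df_treatments = k - 1
    df_error = n_total - k
    mstr = sstr / df_treatments   # MSTr = SSTr / (k - 1)
    mse = sse / df_error          # MSE = SSE / (N - k)
    return {
        "df_treatments": df_treatments,
        "df_error": df_error,
        "df_total": n_total - 1,
        "SSTo": sstr + sse,       # fundamental identity
        "MSTr": mstr,
        "MSE": mse,
        "F": mstr / mse,
    }


table = anova_table(sstr=60.37, sse=101.81, k=4, n_total=57)
for name, value in table.items():
    print(f"{name}: {round(value, 2)}")
```

Rounded to two decimal places, the results should match the Minitab table (MSTr 20.12, MSE 1.92, F 10.48).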

Section 17.1 Exercises

Each exercise set assesses the following chapter learning objectives: C1, M1, M2, M3, P1, P2


13-year-old boys is not the same for all four restrictive rating labels? Test the appropriate hypotheses using a significance level of 0.05.

7+ label   12+ label   16+ label   18+ label
6          8           7           10
6          7           9           9
6          8           8           6
5          5           6           8
4          7           7           7
8          9           4           6
6          5           8           8
1          8           9           9
2          4           6           10
4          7           7           8

17.4 The accompanying data on calcium content of wheat are consistent with summary quantities that appeared in the article “Mineral Contents of Cereal Grains as Affected by Storage and Insect Infestation” (Journal of Stored Products Research [1992]: 147–151). Four different storage times were considered. Partial output from the SAS computer package is also shown.

Storage Period Observations

0 months 58.75 57.94 58.91 56.85 55.21 57.30

1 month 58.87 56.43 56.51 57.67 59.75 58.48

2 months 59.13 60.38 58.01 59.95 59.51 60.34

4 months 62.32 58.76 60.03 59.36 59.61 61.95

7+ label   12+ label   16+ label   18+ label
4          4           6           8
7          5           4           6
6          4           8           6
5          6           6           5
3          3           10          7
6          5           8           4
4          3           6           10
5          8           6           6
10         5           8           8
5          9           5           7

a. Verify that the sums of squares and df’s are as given in the ANOVA table.

b. Is there sufficient evidence to conclude that the mean calcium content is not the same for the four different storage times? Use the value of F from the ANOVA table to test the appropriate hypotheses at significance level 0.05.

Section 17.1 Exercise Set 2

17.5 Give as much information as you can about the P-value of the single-factor ANOVA F test in each of the following situations.
a. $k = 5$, $n_1 = n_2 = n_3 = n_4 = n_5 = 4$, $F = 5.37$
b. $k = 5$, $n_1 = n_2 = n_3 = 5$, $n_4 = n_5 = 4$, $F = 2.83$
c. $k = 3$, $n_1 = 4$, $n_2 = 5$, $n_3 = 6$, $F = 5.02$
d. $k = 3$, $n_1 = n_2 = 4$, $n_3 = 6$, $F = 15.90$
e. $k = 4$, $n_1 = n_2 = 15$, $n_3 = 12$, $n_4 = 10$, $F = 1.75$

17.6 The paper referenced in Exercise 17.3 also gave data for 12- to 13-year-old girls. Data consistent with summary values in the paper are shown below. Do the data provide convincing evidence that the mean rating associated with the game description for 12- to 13-year-old girls is not the same for all four age restrictive rating labels? Test the appropriate hypotheses using $\alpha = 0.05$.

17.7 The experiment described in Example 17.4 also gave data on change in body fat mass for men (“Growth Hormone and Sex Steroid Administration in Healthy Aged Women and Men,” Journal of the American Medical Association [2002]: 2282–2292). Each of 74 male subjects who were over age 65 was assigned at random to one of the following four treatments: (1) placebo “growth hormone” and placebo “steroid” (denoted by P + P); (2) placebo “growth hormone” and the steroid testosterone (denoted by P + S); (3) growth hormone and placebo “steroid” (denoted by G + P); and (4) growth hormone and the steroid testosterone (denoted by G + S). The accompanying table lists data on change in body fat mass over the 26-week period following the treatment that are consistent with summary quantities given in the article.

Dependent Variable: CALCIUM

                        Sum of         Mean
Source            DF    Squares        Square         F Value   Pr>F
Model              3    32.13815000    10.71271667    6.51      0.0030
Error             20    32.90103333     1.64505167
Corrected Total   23    65.03918333

R-Square    C.V.        Root MSE    CALCIUM Mean
0.494135    2.180018    1.282596    58.8341667


Change in Body Fat Mass (kg)

Treatment    P + P     P + S     G + P     G + S
              0.3      −3.7      −3.8      −5.0
              0.4      −1.0      −3.2      −5.0
             −1.7       0.2      −4.9      −3.0
             −0.5      −2.3      −5.2      −2.6
             −2.1       1.5      −2.2      −6.2
              1.3      −1.4      −3.5      −7.0
              0.8       1.2      −4.4      −4.5
              1.5      −2.5      −0.8      −4.2
             −1.2      −3.3      −1.8      −5.2
             −0.2       0.2      −4.0      −6.2
              1.7       0.6      −1.9      −4.0
              1.2      −0.7      −3.0      −3.9
              0.6      −0.1      −1.8      −3.3
              0.4      −3.1      −2.9      −5.7
             −1.3       0.3      −2.9      −4.5
             −0.2      −0.5      −2.9      −4.3
              0.7      −0.8      −3.7      −4.0
                       −0.7                −4.2
                       −0.9                −4.7
                       −2.0
                       −0.6

n             17        21        17        19
$\bar{x}$     0.100    −0.933    −3.112    −4.605
$s$           1.139     1.443     1.178     1.122
$s^2$         1.297     2.082     1.388     1.259

Table for Exercise 17.9

Type of Box   Compression Strength (lb)                   Sample Mean   Sample SD
1             655.5  788.3  734.3  721.4  679.1  699.4    713.00        46.55
2             789.2  772.5  786.9  686.1  732.1  774.8    756.93        40.34
3             737.1  639.0  696.3  671.7  717.2  727.1    698.07        37.20
4             535.1  628.7  542.4  559.0  586.9  520.0    562.02        39.87

$\bar{\bar{x}} = 682.50$

Also, N = 74, grand total = −158.3, and $\bar{\bar{x}} = \frac{-158.3}{74} = -2.139$. Carry out an F test to see whether mean change in body fat mass differs for the four treatments.

17.8 In an experiment to investigate the performance of four different brands of spark plugs intended for use on a 125-cc motorcycle, five plugs of each brand were tested, and the number of miles (at a constant speed) until failure was observed. A partially completed ANOVA table is given. Fill in the missing entries, and test the relevant hypotheses using a 0.05 level of significance.

Additional Exercise for Section 17.1

17.9 The article “Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens” (Journal of Testing and Evaluation [1992]: 318–320) described an experiment in which several different types of boxes were compared with respect to compression strength (in pounds). The data in the Table for Exercise 17.9 resulted from a single-factor experiment involving k = 4 types of boxes (the sample means and standard deviations are in close agreement with values given in the paper). Do these data provide evidence to support the claim that the mean compression strength is not the same for all four box types? Test the relevant hypothesis using a significance level of 0.01.

Age Group      Youths   Young Adults   Adults   Seniors
Sample Size    106      255            314      36
$\bar{x}$      2.00     3.40           3.07     2.84
$s$            1.56     1.68           1.66     1.89

17.10 The accompanying summary statistics for a measure of social marginality for samples of youths, young adults, adults, and seniors appeared in the paper “Perceived Causes of Loneliness in Adulthood” (Journal of Social Behavior and Personality [2000]: 67–84). The social marginality score measured actual and perceived social rejection, with higher scores indicating greater social rejection. For purposes of this exercise, assume that it is reasonable to regard the four samples as representative of the U.S. population in the corresponding age groups and that the distributions of social marginality scores for these four groups are approximately normal with the same standard deviation. Is there evidence that the mean social marginality score is not the same for all four age groups? Test the relevant hypotheses using $\alpha = 0.05$.

Source of Variation    df    Sum of Squares    Mean Square    F
Treatments
Error                        235,419.04
Total                        310,500.76


17.11 The chapter Preview Example described a study comparing three groups of college students (soccer athletes, non–soccer athletes, and a comparison group consisting of students who did not participate in intercollegiate sports). The following is information on scores from the Hopkins Verbal Learning Test (which measures immediate memory recall).

Group                        Soccer Athletes   Nonsoccer Athletes   Comparison Group
Sample size                  86                95                   53
Sample mean score            29.90             30.94                29.32
Sample standard deviation    3.73              5.14                 3.78

1996 30,000 34,000 36,000 38,000 40,000

1997 30,000 35,000 37,000 38,000 40,000

1998 40,000 41,000 43,000 44,000 50,000

In addition, $\bar{\bar{x}} = 30.19$. Suppose that it is reasonable to regard these three samples as random samples from the three student populations of interest. Is there sufficient evidence to conclude that the mean Hopkins score is not the same for the three student populations? Use $\alpha = 0.05$.

17.12 Suppose that a random sample of size n = 5 was selected from the vineyard properties for sale in Sonoma County, California, in each of 3 years. The following data are consistent with summary information on price per acre (in dollars, rounded to the nearest thousand) for disease-resistant grape vineyards in Sonoma County (Wines and Vines, November 1999).

a. Construct boxplots for each of the 3 years on a common axis, and label each by year. Comment on the similarities and differences.

b. Carry out an ANOVA to determine whether there is evidence to support the claim that the mean price per acre for vineyard land in Sonoma County was not the same for the 3 years considered. Use a significance level of 0.05 for your test.

17.13 Parents are frequently concerned when their child seems slow to begin walking (although when the child finally walks, the resulting havoc sometimes has the parents wishing they could turn back the clock!). The article “Walking in the Newborn” (Science, 176 [1972]: 314–315) reported on an experiment in which the effects of several different treatments on the age at which a child first walks were compared. Children in the first group were given special walking exercises for 12 minutes per day beginning at age 1 week and lasting 7 weeks. The second group of children received daily exercises but not the walking exercises administered to the first group. The third and fourth groups were control groups. They received no special treatment and differed only in that the third group’s progress was checked weekly, whereas the fourth group’s progress was checked just once at the end of the study. Observations on age (in months) when the children first walked are shown in the accompanying table. Also given is the ANOVA table, obtained from the SPSS computer package.

a. Verify the entries in the ANOVA table.

b. State and test the relevant hypotheses using a significance level of 0.05.

Section 17.2 Multiple Comparisons

When $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ is rejected by the F test, you believe that there are differences among the k population or treatment means. A natural question to ask at this point is, which means differ? For example, with k = 4, it might be the case that $\mu_1 = \mu_2 = \mu_4$, with $\mu_3$ different from the other three means. Another possibility is that $\mu_1 = \mu_4$ and $\mu_2 = \mu_3$.

              Age                                           n    Total
Treatment 1    9.00    9.50    9.75   10.00   13.00    9.50    6    60.75
Treatment 2   11.00   10.00   10.00   11.75   10.50   15.00    6    68.25
Treatment 3   11.50   12.00    9.00   11.50   13.25   13.00    6    70.25
Treatment 4   13.25   11.50   12.00   13.50   11.50            5    61.75

Analysis of Variance

Source            df    Sum of Sq.    Mean Sq.    F Ratio    F Prob
Between Groups     3    14.779        4.926       2.142      .129
Within Groups     19    43.690        2.299
Total             22    58.467


Still another possibility is that all four means are different from one another. A multiple comparisons procedure is a method for identifying differences among the $\mu$'s once the hypothesis that all of the means are equal has been rejected. The Tukey-Kramer (T-K) multiple comparisons procedure is one method that can be used to identify differences.

The T-K procedure is based on computing confidence intervals for the difference between each possible pair of $\mu$'s. For example, for k = 3, there are three differences to consider:

$\mu_1 - \mu_2$     $\mu_1 - \mu_3$     $\mu_2 - \mu_3$

(The difference $\mu_2 - \mu_1$ is not considered, because the interval for $\mu_1 - \mu_2$ provides the same information. Similarly, intervals for $\mu_3 - \mu_1$ and $\mu_3 - \mu_2$ are not necessary.) Once all confidence intervals have been computed, each is examined to determine whether the interval includes 0. If a particular interval does not include 0, the two means are declared “significantly different” from one another. If an interval includes 0, there is no evidence of a significant difference between the means involved.

Suppose, for example, that k = 3 and that the three confidence intervals are

Difference          T-K Confidence Interval
$\mu_1 - \mu_2$     (−0.9, 3.5)
$\mu_1 - \mu_3$     (2.6, 7.0)
$\mu_2 - \mu_3$     (1.2, 5.7)

Because the interval for $\mu_1 - \mu_2$ includes 0, you would say that $\mu_1$ and $\mu_2$ do not differ significantly. The other two intervals do not include 0, so you would conclude that $\mu_1 \neq \mu_3$ and $\mu_2 \neq \mu_3$.

The T-K intervals are based on critical values for a probability distribution called the Studentized range distribution. These critical values appear in Appendix Table 8. To find a critical value, enter the table at the column corresponding to the number of populations or treatments being compared, move down to the rows corresponding to the number of error degrees of freedom (N 2 k), and select either the value for a 95% confidence level or the one for a 99% level.

If the sample sizes are all the same, you can use n to denote the common value of $n_1, n_2, \ldots, n_k$. In this case, the ± term for each interval is the same quantity

$$q\sqrt{\frac{\text{MSE}}{n}}$$

The Tukey–Kramer Multiple Comparison Procedure

When there are k populations or treatments being compared, $\frac{k(k-1)}{2}$ confidence intervals must be computed. Denoting the relevant Studentized range critical value (from Appendix Table 8) by q, the intervals are as follows:

For $\mu_i - \mu_j$:

$$(\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{\text{MSE}}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

Two means are judged to differ significantly if the corresponding interval does not include zero.
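The interval formula in the box translates directly into a short computation. The sketch below is illustrative only: the function name `tukey_kramer_intervals` is hypothetical, and the critical value q is supplied by hand (read from a table such as Appendix Table 8) rather than computed. The inputs are the summary values of Example 17.5 in this section (sample means, sample sizes, MSE = 1.92, q = 3.74).

```python
from itertools import combinations
from math import sqrt


def tukey_kramer_intervals(means, sizes, mse, q):
    """Return {(i, j): (lower, upper)} for mu_i - mu_j over all k(k-1)/2 pairs (1-based)."""
    intervals = {}
    for i, j in combinations(range(len(means)), 2):
        # Plus-or-minus term: q * sqrt((MSE / 2) * (1/n_i + 1/n_j))
        margin = q * sqrt((mse / 2) * (1 / sizes[i] + 1 / sizes[j]))
        diff = means[i] - means[j]
        intervals[(i + 1, j + 1)] = (diff - margin, diff + margin)
    return intervals


# Summary values from Example 17.5 (treatments P+P, P+S, G+P, G+S).
intervals = tukey_kramer_intervals(
    means=[0.064, -0.286, -2.023, -2.250],
    sizes=[14, 14, 13, 16],
    mse=1.92,
    q=3.74,
)
for pair, (lower, upper) in intervals.items():
    verdict = "includes 0" if lower <= 0 <= upper else "does not include 0"
    print(pair, f"({lower:.2f}, {upper:.2f})", verdict)
```

With k = 4 the loop produces 4(3)/2 = 6 intervals, and only the pairs (1, 2) and (3, 4) should yield intervals that include 0.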

Example 17.5 Hormones and Body Fat Revisited

Example 17.4 introduced the accompanying data on change in body fat mass resulting from a double-blind experiment designed to compare the following four treatments: (1) placebo “growth hormone” and placebo “steroid” (denoted by P + P); (2) placebo “growth hormone” and the steroid estradiol (denoted by P + S); (3) growth hormone and


placebo “steroid” (denoted by G + P); and (4) growth hormone and the steroid estradiol (denoted by G + S). From Example 17.4, MSTr = 20.12, MSE = 1.92, and F = 10.48 with an associated P-value < 0.001. It was concluded that the mean change in body fat mass is not the same for all four treatments.

Change in Body Fat Mass (kg)

Treatment    P + P     P + S     G + P     G + S
              0.1      −0.1      −1.6      −3.1
              0.6       0.2      −0.4      −3.2
              2.2       0.0       0.4      −2.0
              0.7      −0.4      −2.0      −2.0
             −2.0      −0.9      −3.4      −3.3
              0.7      −1.1      −2.8      −0.5
              0.0       1.2      −2.2      −4.5
             −2.6       0.1      −1.8      −0.7
             −1.4       0.7      −3.3      −1.8
              1.5      −2.0      −2.1      −2.3
              2.8      −0.9      −3.6      −1.3
              0.3      −3.0      −0.4      −1.0
             −1.0       1.0      −3.1      −5.6
             −1.0       1.2                −2.9
                                           −1.6
                                           −0.2

n             14        14        13        16
$\bar{x}$     0.064    −0.286    −2.023    −2.250
$s$           1.545     1.218     1.264     1.468
$s^2$         2.387     1.484     1.598     2.155

You would conclude that $\mu_1$ is not significantly different from $\mu_2$ and that $\mu_3$ is not significantly different from $\mu_4$. You would also conclude that $\mu_1$ and $\mu_2$ are significantly different from both $\mu_3$ and $\mu_4$. Note that Treatments 1 and 2 were treatments that

Appendix Table 8 gives the 95% Studentized range critical value q = 3.74 (using k = 4 and error df = 60, the closest tabled value to df = N − k = 53). The first two T-K intervals are

$$\mu_1 - \mu_2:\quad (0.064 - (-0.286)) \pm 3.74\sqrt{\left(\frac{1.92}{2}\right)\left(\frac{1}{14} + \frac{1}{14}\right)} = 0.35 \pm 1.39 = (-1.04,\ 1.74) \qquad \text{Includes 0}$$

$$\mu_1 - \mu_3:\quad (0.064 - (-2.023)) \pm 3.74\sqrt{\left(\frac{1.92}{2}\right)\left(\frac{1}{14} + \frac{1}{13}\right)} = 2.09 \pm 1.41 = (0.68,\ 3.50) \qquad \text{Does not include 0}$$

The remaining intervals are

$\mu_1 - \mu_4$    (0.97, 3.66)     Does not include 0
$\mu_2 - \mu_3$    (0.32, 3.15)     Does not include 0
$\mu_2 - \mu_4$    (0.62, 3.31)     Does not include 0
$\mu_3 - \mu_4$    (−1.14, 1.60)    Includes 0


Minitab can be used to construct T-K intervals if raw data are available. Typical output (based on Example 17.5) is shown in Figure 17.6. From the output, you can see that the confidence interval for $\mu_1$ (P + P) − $\mu_2$ (P + S) is (−1.039, 1.739), that for $\mu_2$ (P + S) − $\mu_4$ (G + S) is (0.619, 3.309), and so on.

Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons

Individual confidence level = 98.95%

G + S subtracted from:
          Lower    Center    Upper
G + P    -1.145     0.227    1.599
P + S     0.619     1.964    3.309
P + P     0.969     2.314    3.659

G + P subtracted from:
          Lower    Center    Upper
P + S     0.322     1.737    3.153
P + P     0.672     2.087    3.503

P + S subtracted from:
          Lower    Center    Upper
P + P    -1.039     0.350    1.739

(Character plots of the intervals, drawn on a common axis running from -2.0 to 4.0, are omitted.)

Figure 17.6 The T-K intervals for Example 17.5 (from Minitab)

Why calculate the T-K intervals rather than use the t confidence interval for a difference between $\mu$'s from Chapter 13? The answer is that the T-K intervals control the simultaneous confidence level at approximately 95% (or 99%). That is, if the procedure is used repeatedly on many different data sets, in the long run only about 5% (or 1%) of the time would at least one of the intervals not include the value of what the interval is estimating. Consider using separate 95% t intervals, each one having a 5% error rate. In those instances, the chance that at least one interval would make an incorrect statement about a difference in $\mu$'s increases dramatically with the number of intervals calculated. The Minitab output in Figure 17.6 shows that to achieve a simultaneous confidence level of about 95% (experimentwise or “family” error rate of 5%) when k = 4 and error df = 76, the individual interval confidence levels must be 98.95% (individual error rate 1.05%).
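To get a rough feel for how quickly the family error rate grows when separate 95% intervals are used, the sketch below computes the chance that at least one of several intervals misses its target. It assumes the intervals are independent, which is a simplifying assumption made only for illustration; pairwise intervals built from the same samples are not independent, so the numbers are indicative rather than exact. The function name `chance_of_at_least_one_miss` is hypothetical.

```python
def chance_of_at_least_one_miss(individual_level, n_intervals):
    """P(at least one interval misses), assuming independent intervals."""
    return 1 - individual_level ** n_intervals


for k in (3, 4, 5, 6):
    n_intervals = k * (k - 1) // 2  # number of pairwise comparisons
    rate = chance_of_at_least_one_miss(0.95, n_intervals)
    print(f"k = {k}: {n_intervals} intervals, family error rate about {rate:.3f}")
```

Under this assumption, the six separate 95% intervals needed for k = 4 would have roughly a 26% chance of containing at least one miss, far above the 5% that the T-K procedure targets.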

An effective display for summarizing the results of any multiple comparisons procedure involves listing the $\bar{x}$'s and underscoring pairs judged to be not significantly different. The process for constructing such a display is described in the following box.

administered a placebo in place of the growth hormone and Treatments 3 and 4 were treatments that included the growth hormone. This analysis was the basis of the researchers’ conclusion that growth hormone, with or without steroids, decreased body fat mass.


To illustrate this summary procedure, suppose that four samples with $\bar{x}_1 = 19$, $\bar{x}_2 = 27$, $\bar{x}_3 = 24$, and $\bar{x}_4 = 10$ are used to test $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ and that this hypothesis is rejected. Suppose the T-K confidence intervals indicate that $\mu_2$ is significantly different from both $\mu_1$ and $\mu_4$, and that there are no other significant differences. The resulting summary display would then be

Population     4     1     3     2
Sample mean   10    19    24    27
              --------------
                          --------

Summarizing the Results of the Tukey–Kramer Procedure

1. List the sample means in increasing order, identifying the corresponding population or treatment just above the value of each $\bar{x}$.

2. Use the T-K intervals to determine the group of means that do not differ significantly from the first in the list. Draw a horizontal line extending from the smallest mean to the last mean in the group identified. For example, if there are five means, arranged in order,

Population    3             2             1             4             5
Sample mean   $\bar{x}_3$   $\bar{x}_2$   $\bar{x}_1$   $\bar{x}_4$   $\bar{x}_5$

and $\mu_3$ is judged to be not significantly different from $\mu_2$ or $\mu_1$, but is judged to be significantly different from $\mu_4$ and $\mu_5$, draw the following line:

Population    3             2             1             4             5
Sample mean   $\bar{x}_3$   $\bar{x}_2$   $\bar{x}_1$   $\bar{x}_4$   $\bar{x}_5$
              ---------------------------------------

3. Use the T-K intervals to determine the group of means that are not significantly different from the second smallest. (You need consider only means that appear to the right of the mean under consideration.) If there is already a line connecting the second smallest mean with all means in the new group identified, no new line need be drawn. If this entire group of means is not underscored with a single line, draw a line extending from the second smallest to the last mean in the new group. Continuing with our example, if $\mu_2$ is not significantly different from $\mu_1$ but is significantly different from $\mu_4$ and $\mu_5$, no new line need be drawn. However, if $\mu_2$ is not significantly different from either $\mu_1$ or $\mu_4$ but is judged to be different from $\mu_5$, a second line is drawn as shown:

Population    3             2             1             4             5
Sample mean   $\bar{x}_3$   $\bar{x}_2$   $\bar{x}_1$   $\bar{x}_4$   $\bar{x}_5$
              ---------------------------------------
                            ---------------------------------------

4. Continue considering the means in the order listed, adding new lines as needed.
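The steps in the box can be sketched in code. The following is a minimal illustration, not a definitive implementation: the function name `underscore_groups` and its input format (labels already sorted by increasing sample mean, plus a set of pairs judged significantly different) are assumptions made for this sketch. It also assumes, as the box does, that each group of not-significantly-different means forms a contiguous run in the sorted list.

```python
def underscore_groups(labels_sorted, differs):
    """Return the underscoring lines as lists of labels.

    labels_sorted: population labels in increasing order of sample mean.
    differs: set of frozenset({a, b}) pairs judged significantly different.
    """
    lines = []
    covered_to = -1  # right-most position reached by any line drawn so far
    for start in range(len(labels_sorted)):
        end = start
        # Extend to the right while the next mean does not differ from the starting mean.
        while (end + 1 < len(labels_sorted)
               and frozenset((labels_sorted[start], labels_sorted[end + 1])) not in differs):
            end += 1
        # Draw a new line only if the group is not already under an existing line.
        if end > start and end > covered_to:
            lines.append(labels_sorted[start:end + 1])
            covered_to = end
    return lines


# Illustration from the text: means ordered as populations 4, 1, 3, 2,
# with mu_2 significantly different from mu_1 and mu_4 only.
differs = {frozenset((2, 1)), frozenset((2, 4))}
print(underscore_groups([4, 1, 3, 2], differs))
```

For the illustration in the text, the result is one line under populations 4, 1, 3 and a second line under populations 3, 2.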

Example 17.6 Sleep Time

A biologist studied the effects of ethanol on sleep time. A sample of 20 rats, matched for age and other characteristics, was selected, and each rat was given an oral injection having a particular concentration of ethanol per body weight. The rapid eye movement (REM) sleep time for each rat was then recorded for a 24-hour period, with the results shown in the following table:

Treatment         Observations                            $\bar{x}$
1. 0 (control)    88.6    73.2    91.4    68.0    75.2    79.28
2. 1 g/kg         63.0    53.9    69.2    50.1    71.5    61.54
3. 2 g/kg         44.9    59.5    40.2    56.3    38.7    47.92
4. 4 g/kg         31.0    39.6    45.3    25.2    22.7    32.76


Table 17.4 (an ANOVA table from SAS) leads to the conclusion that actual mean REM sleep time is not the same for all four treatments (the P-value for the F test is 0.0001).

Table 17.4 SAS ANOVA Table for Example 17.6

Analysis of Variance Procedure
Dependent Variable: TIME

                  Sum of        Mean
Source    DF      Squares       Square        F Value    Pr > F
Model      3      5882.35750    1960.78583    21.09      0.0001
Error     16      1487.40000      92.96250
Total     19      7369.75750

The T-K intervals are

Difference         Interval           Includes 0?
$\mu_1 - \mu_2$    17.74 ± 17.446     no
$\mu_1 - \mu_3$    31.36 ± 17.446     no
$\mu_1 - \mu_4$    46.52 ± 17.446     no
$\mu_2 - \mu_3$    13.62 ± 17.446     yes
$\mu_2 - \mu_4$    28.78 ± 17.446     no
$\mu_3 - \mu_4$    15.16 ± 17.446     yes

The only T-K intervals that include zero are those for $\mu_2 - \mu_3$ and $\mu_3 - \mu_4$. The corresponding underscoring pattern is

              $\bar{x}_4$   $\bar{x}_3$   $\bar{x}_2$   $\bar{x}_1$
              32.76         47.92         61.54         79.28
              -------------------
                            -------------------

Figure 17.7 displays the SAS output that agrees with our underscoring; letters are used to indicate groupings in place of the underscoring.

Alpha = 0.05   df = 16   MSE = 92.9625
Critical Value of Studentized Range = 4.046
Minimum Significant Difference = 17.446
Means with the same letter are not significantly different.

Tukey Grouping    Mean      N    Treatment
     A            79.280    5    0 (control)
     B            61.540    5    1 g/kg
C    B            47.920    5    2 g/kg
C                 32.760    5    4 g/kg

Figure 17.7 SAS output for Example 17.6
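When all sample sizes are equal, every T-K interval has the same plus-or-minus term, $q\sqrt{\text{MSE}/n}$, which SAS reports as the minimum significant difference. The sketch below recomputes that quantity for Example 17.6 from the values shown in the SAS output (q = 4.046, MSE = 92.9625, n = 5) and recomputes each difference from the sample means in the data table; the variable names are chosen only for illustration.

```python
from itertools import combinations
from math import sqrt

q, mse, n = 4.046, 92.9625, 5                      # values from the SAS output
means = {1: 79.28, 2: 61.54, 3: 47.92, 4: 32.76}   # sample means by treatment

margin = q * sqrt(mse / n)  # common plus-or-minus term for equal sample sizes
print(f"minimum significant difference = {margin:.3f}")

includes_zero = []
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    if diff - margin <= 0 <= diff + margin:
        includes_zero.append((i, j))
    print(f"mu_{i} - mu_{j}: {diff:.2f} +/- {margin:.3f}")

print("intervals that include 0:", includes_zero)
```

Two treatments are judged significantly different exactly when the absolute difference in sample means exceeds the minimum significant difference.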

Example 17.7 Roommate Satisfaction

How satisfied are college students with dormitory roommates? The article “Roommate Satisfaction and Ethnic Identity in Mixed-Race and White University Roommate Dyads” (Journal of College Student Development [1998]: 194–199) investigated differences among randomly assigned African American/white, Asian/white, Hispanic/white, and white/white roommate pairs. The researchers used a one-way ANOVA to analyze scores on the Roommate Relationship Inventory to see whether a difference in mean score existed for the four types of roommate pairs. They reported “significant differences among the means (P < 0.01). Follow-up Tukey [intervals] . . . indicated differences between White dyads (M = 77.49) and African American/White dyads (M = 71.27). No other significant differences were found.”

Although the mean satisfaction scores for the Asian/white and Hispanic/white groups were not given, they must have been between 77.49 (the mean for the white/white pairs)


Section 17.2 Exercises

Each exercise set assesses the following chapter learning objectives: M4, P1

Section 17.2 Exercise Set 1

17.14 Leaf surface area is an important variable in plant gas-exchange rates. Dry matter per unit surface area (mg/cm$^3$) was measured for trees raised under three different growing conditions. Let $\mu_1$, $\mu_2$, and $\mu_3$ represent the mean dry matter per unit surface area for the growing conditions 1, 2, and 3, respectively. The given 95% simultaneous confidence intervals are:

Difference    $\mu_1 - \mu_2$     $\mu_1 - \mu_3$     $\mu_2 - \mu_3$
Interval      (−3.11, −1.11)      (−4.06, −2.06)      (−1.95, 0.05)

Which of the following four statements do you think describes the relationship between $\mu_1$, $\mu_2$, and $\mu_3$? Explain your choice.

a. $\mu_1 = \mu_2$, and $\mu_3$ differs from $\mu_1$ and $\mu_2$.
b. $\mu_1 = \mu_3$, and $\mu_2$ differs from $\mu_1$ and $\mu_3$.
c. $\mu_2 = \mu_3$, and $\mu_1$ differs from $\mu_2$ and $\mu_3$.
d. All three $\mu$'s are different from one another.

17.15 The accompanying underscoring pattern appears in the article “Women’s and Men’s Eating Behavior Following Exposure to Ideal-Body Images and Text” (Communications Research [2006]: 507–529). Women either viewed slides depicting images of thin female models with no text (treatment 1); viewed the same slides accompanied by diet and exercise-related text (treatment 2); or viewed the same slides accompanied by text that was unrelated to diet and exercise (treatment 3). A fourth group of women did not view any slides (treatment 4). Participants were assigned at random to the four treatments. Participants were then asked to complete a questionnaire in a room where pretzels were set out on the tables. An observer recorded how many pretzels participants ate while completing the questionnaire. Write a few sentences interpreting this underscoring pattern.

Treatment: 2 1 4 3

Mean number of pretzels consumed:

0.97 1.03 2.20 2.65

17.16 The accompanying data resulted from a flammability study in which specimens of five different fabrics were tested to determine burn times.

Fabric 1   17.8  16.2  15.9  15.5
Fabric 2   13.2  10.4  11.3
Fabric 3   11.8  11.0   9.2  10.0
Fabric 4   16.5  15.3  14.1  15.0  13.9
Fabric 5   13.9  10.8  12.8  11.7

MSTr = 23.67   MSE = 1.39   F = 17.08   P-value = 0.000

The accompanying output gives the T-K intervals as calculated by Minitab. Identify significant differences and give the underscoring pattern.

Individual error rate = 0.00750
Critical value = 4.37
Intervals for (column level mean) - (row level mean)

          1        2        3        4
2     1.938
      7.495
3     3.278   -1.645
      8.422    3.912
4    -1.050   -5.983   -6.900
      3.830   -0.670   -2.020
5     1.478   -3.445   -4.372    0.220
      6.622    2.112    0.772    5.100
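As a rough check on how output of this kind arises, the sketch below recomputes the intervals from the burn-time data. It assumes the usual T-K form, (difference in sample means) ± q·sqrt((MSE/2)(1/n_i + 1/n_j)), and takes q = 4.37 directly from the output above; the function and variable names are illustrative only and are not Minitab's.

```python
from itertools import combinations
from math import sqrt

# Burn-time data from Exercise 17.16, keyed by fabric number.
fabrics = {
    1: [17.8, 16.2, 15.9, 15.5],
    2: [13.2, 10.4, 11.3],
    3: [11.8, 11.0, 9.2, 10.0],
    4: [16.5, 15.3, 14.1, 15.0, 13.9],
    5: [13.9, 10.8, 12.8, 11.7],
}
Q_CRIT = 4.37  # critical value as reported in the output (assumed, not recomputed)


def mean(xs):
    return sum(xs) / len(xs)


def tk_intervals(groups, q):
    """Return {(i, j): (lower, upper)} for mu_i - mu_j with i < j."""
    n_total = sum(len(xs) for xs in groups.values())
    k = len(groups)
    # Error sum of squares: squared deviations from each group's own mean.
    sse = sum(sum((x - mean(xs)) ** 2 for x in xs) for xs in groups.values())
    mse = sse / (n_total - k)
    result = {}
    for i, j in combinations(sorted(groups), 2):
        diff = mean(groups[i]) - mean(groups[j])
        half_width = q * sqrt(mse / 2 * (1 / len(groups[i]) + 1 / len(groups[j])))
        result[(i, j)] = (diff - half_width, diff + half_width)
    return result


intervals = tk_intervals(fabrics, Q_CRIT)
for (i, j), (lower, upper) in intervals.items():
    print(f"mu{i} - mu{j}: ({lower:.3f}, {upper:.3f})")
```

Each pair (i, j) with i < j corresponds to the "column i, row j" cell of the output, so the printed endpoints can be compared cell by cell with the Minitab values.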

Section 17.2 Exercise Set 2

17.17 The paper “Trends in Blood Lead Levels and Blood Lead Testing among U.S. Children Aged 1 to 5 Years” (Pediatrics [2009]: e376–e385) gave data on blood lead levels (in μg/dL) for samples of children living in homes that had been classified as being at low, medium, or high risk of lead exposure, based on when the home was constructed. After using a multiple comparison procedure, the authors reported the following:

1. The difference in mean blood lead level between low-risk housing and medium-risk housing was significant.

and 71.27 (the mean for the African American/white pairs). (If they had been larger than 77.49, they would have been significantly different from the African American/white pairs mean, and if they had been smaller than 71.27, they would have been significantly different from the white/white pairs mean.) An underscoring consistent with the reported information is

White/White     Hispanic/White and Asian/White     African-American/White


b. Is there evidence that seeds eaten and then excreted by lizards germinate at a higher rate than those eaten and then excreted by birds? Give statistical evidence to support your answer.

Additional Exercises for Section 17.2

17.20 Samples of six different brands of diet or imitation margarine were analyzed to determine the level of physiologically active polyunsaturated fatty acids (PAPUFA, in percent), resulting in the data shown in the accompanying table. (The data are fictitious, but the sample means agree with data reported in Consumer Reports.)

Imperial 14.1 13.6 14.4 14.3

Parkay 12.8 12.5 13.4 13.0 12.3

Blue Bonnet 13.5 13.4 14.1 14.3

Chiffon 13.2 12.7 12.6 13.9

Mazola 16.8 17.2 16.4 17.3 18.0

Fleischmann’s 18.1 17.2 18.7 18.4

a. Test for differences among the true mean PAPUFA percentages for the different brands. Use $\alpha = 0.05$.

b. Use the T-K procedure to compute 95% simultaneous confidence intervals for all differences between means and give the corresponding underscoring pattern.

17.21 The nutritional quality of shrubs commonly used for feed by rabbits was the focus of a study summarized in the article “Estimation of Browse by Size Classes for Snowshoe Hare” (Journal of Wildlife Management [1980]: 34–40). The energy contents (cal/g) of three sizes (4 mm or less, 5–7 mm, and 8–10 mm) of serviceberries were studied. Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the true mean energy content for the three size classes. Suppose that 95% simultaneous confidence intervals for $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, and $\mu_2 - \mu_3$ are (-10, 290), (150, 450), and (10, 310), respectively. How would you interpret these intervals?

17.22 Consider the accompanying data on plant growth after the application of five different types of growth hormone.

Hormone 1   13  17   7  14
Hormone 2   21  13  20  17
Hormone 3   18  14  17  21
Hormone 4    7  11  18  10
Hormone 5    6  11  15   8

a. Carry out the F test at level $\alpha = 0.05$.

b. What happens when the T-K procedure is applied? (Note: This “contradiction” can occur when $H_0$ is “barely” rejected. It happens because the test and the multiple comparison method are based on different distributions. Consult your friendly neighborhood statistician for more information.)

2. The difference in mean blood lead level between low-risk housing and high-risk housing was significant.

3. The difference in mean blood lead level between medium-risk housing and high-risk housing was significant.

Which of the following sets of T-K intervals (Set 1, 2, or 3) is consistent with the authors’ conclusions? Explain your choice.

$\mu_L$ = mean blood lead level for children living in low-risk housing

$\mu_M$ = mean blood lead level for children living in medium-risk housing

$\mu_H$ = mean blood lead level for children living in high-risk housing

17.18 The paper referenced in Exercise 17.15 also gave the following underscoring pattern for men.

Treatment:                          2     1     3     4
Mean number of pretzels consumed:   6.61  5.96  3.38  2.70

a. Write a few sentences interpreting this underscoring pattern.

b. Using your answers from Part (a) and from Exercise 17.15, write a few sentences describing the differences between how men and women respond to the treatments.

17.19 Do lizards play a role in spreading plant seeds? Some research carried out in South Africa would suggest so (“Dispersal of Namaqua Fig [Ficus cordata cordata] Seeds by the Augrabies Flat Lizard [Platysaurus broadleyi],” Journal of Herpetology [1999]: 328–330). The researchers collected 400 seeds of a particular type of fig, 100 of which were from each treatment: lizard dung, bird dung, rock hyrax dung, and uneaten figs. They planted these seeds in batches of 5, and for each group of 5 they recorded how many of the seeds germinated. This resulted in 20 observations for each treatment. The treatment means and standard deviations are given in the accompanying table.

Treatment       n    $\bar{x}$    s
Uneaten figs    20   2.40   0.30
Lizard dung     20   2.35   0.33
Bird dung       20   1.70   0.34
Hyrax dung      20   1.45   0.28

a. Construct the appropriate ANOVA table, and test the hypothesis that there is no difference between mean number of seeds germinating for the four treatments.

Difference           Set 1           Set 2           Set 3
$\mu_L - \mu_M$      (-0.6, 0.1)     (-0.6, -0.1)    (-0.6, -0.1)
$\mu_L - \mu_H$      (-1.5, -0.6)    (-1.5, -0.6)    (-1.5, -0.6)
$\mu_M - \mu_H$      (-0.9, -0.3)    (-0.9, 0.3)     (-0.9, -0.3)


Chapter 17 Appendix: ANOVA Computations

Single-Factor ANOVA

Let $T_1$ denote the sum of the observations in the sample from the first population or treatment, and let $T_2, \ldots, T_k$ denote the other sample totals. Also let $T$ represent the sum of all $N$ observations (the grand total), and

$$\text{CF} = \text{correction factor} = \frac{T^2}{N}$$

Then

$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} x^2 - \text{CF}$$

$$\text{SSTr} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \text{CF}$$

$$\text{SSE} = \text{SSTo} - \text{SSTr}$$

Example 15A.1

Treatment 1   4.2  3.7  5.0  4.8   $T_1 = 17.7$   $n_1 = 4$
Treatment 2   5.7  6.2  6.4        $T_2 = 18.3$   $n_2 = 3$
Treatment 3   4.6  3.2  3.5  3.9   $T_3 = 15.2$   $n_3 = 4$
                                   $T = 51.2$     $N = 11$

$$\text{CF} = \text{correction factor} = \frac{T^2}{N} = \frac{(51.2)^2}{11} = 238.31$$

$$\text{SSTr} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - \text{CF} = \frac{(17.7)^2}{4} + \frac{(18.3)^2}{3} + \frac{(15.2)^2}{4} - 238.31 = 9.40$$

$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} x^2 - \text{CF} = (4.2)^2 + (3.7)^2 + \cdots + (3.9)^2 - 238.31 = 11.81$$

$$\text{SSE} = \text{SSTo} - \text{SSTr} = 11.81 - 9.40 = 2.41$$
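The same arithmetic can be checked with a short script. This is only a sketch of the computational formulas above applied to the example data; the function name `anova_sums_of_squares` is made up for illustration.

```python
# Observations from the appendix example, one list per treatment.
samples = [
    [4.2, 3.7, 5.0, 4.8],
    [5.7, 6.2, 6.4],
    [4.6, 3.2, 3.5, 3.9],
]


def anova_sums_of_squares(groups):
    """Return (CF, SSTo, SSTr, SSE) using the sample-total formulas."""
    totals = [sum(g) for g in groups]          # T_1, ..., T_k
    n_total = sum(len(g) for g in groups)      # N
    grand_total = sum(totals)                  # T
    cf = grand_total ** 2 / n_total            # correction factor T^2 / N
    ssto = sum(x ** 2 for g in groups for x in g) - cf
    sstr = sum(t ** 2 / len(g) for t, g in zip(totals, groups)) - cf
    sse = ssto - sstr
    return cf, ssto, sstr, sse


cf, ssto, sstr, sse = anova_sums_of_squares(samples)
print(f"CF = {cf:.2f}, SSTo = {ssto:.2f}, SSTr = {sstr:.2f}, SSE = {sse:.2f}")
```

Working from unrounded intermediate values, the script agrees with the hand computation to two decimal places.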

ARE YOU READY TO MOVE ON? Chapter 17 Review Exercises

All chapter learning objectives are assessed in these exercises. The learning objectives assessed in each exercise are given in parentheses.

17.23 (C1, M1, M2, M3)
The paper “Women’s and Men’s Eating Behavior Following Exposure to Ideal-Body Images and Text” (Communication Research [2006]: 507–529) describes an experiment in which 74 men were assigned at random to one of four treatments:
1. Viewed slides of fit, muscular men
2. Viewed slides of fit, muscular men accompanied by diet and fitness-related text
3. Viewed slides of fit, muscular men accompanied by text not related to diet and fitness
4. Did not view any slides

The participants then went to a room to complete a questionnaire. In this room, bowls of pretzels were set out on the tables. A research assistant noted how many pretzels were consumed by each participant while completing the questionnaire. Data consistent with summary quantities given in the paper are given in the accompanying table. Do these data provide convincing evidence that the mean number of pretzels consumed is not the same for all four treatments? Test the relevant hypotheses using a significance level of 0.05.


17.24 (P1, P2)
Can use of an online plagiarism-detection system reduce plagiarism in student research papers? The paper “Plagiarism and Technology: A Tool for Coping with Plagiarism” (Journal of Education for Business [2005]: 149–152) describes a study in which randomly selected research papers submitted by students during five semesters were analyzed for plagiarism. For each paper, the percentage of plagiarized words in the paper was determined by an online analysis. In each of the five semesters, students were told during the first two class meetings that they would have to submit an electronic version of their research papers and that the papers would be reviewed for plagiarism. Suppose that the number of papers sampled in each of the five semesters and the means and standard deviations for percentage of plagiarized words are as given in the accompanying table. For purposes of this exercise, assume that the conditions necessary for the ANOVA F test are reasonable. Do these data provide evidence to support the claim that mean percentage of plagiarized words is not the same for all five semesters? Test the appropriate hypotheses using $\alpha = 0.05$.

treatment 1 treatment 2 treatment 3 treatment 4

8 6 1 5 7 8 5 2 4 0 2 513 4 0 7 2 9 3 5 1 8 0 2 5 6 3 0 8 2 4 011 7 4 3 5 8 5 4 1 8 5 2 0 5 7 4 6 14 8 1 4 9 4 110 0 0 7 6 6 0 3 312 12

5 610 8 6 210

Semester   n    Mean   Standard deviation
1          39   6.31   3.75
2          42   3.31   3.06
3          32   1.79   3.25
4          32   1.83   3.13
5          34   1.50   2.37

17.25 (M4, P2)
The paper referenced in Exercise 17.3 described an experiment to determine if restrictive age labeling on video games increased the attractiveness of the game for boys ages 12 to 13. In that exercise, the null hypothesis was $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, where $\mu_1$ is the population mean attractiveness rating for the game with the 7+ age label, and $\mu_2$, $\mu_3$, and $\mu_4$ are the population mean attractiveness scores for the 12+, 16+, and 18+ age labels, respectively. The sample data are given in the accompanying table.

7+ label   12+ label   16+ label   18+ label
6          8           7           10
6          7           9           9
6          8           8           6
5          5           6           8
4          7           7           7
8          9           4           6
6          5           8           8
1          8           9           9
2          4           6           10
4          7           7           8

a. Compute the 95% T-K intervals and then use the underscoring procedure described in this section to identify significant differences among the age labels.

b. Based on your answer to Part (a), write a few sentences commenting on the theory that the more restrictive the age label on a video game, the more attractive the game is to 12- to 13-year-old boys.

17.26 (M4)
The authors of the paper “Beyond the Shooter Game: Examining Presence and Hostile Outcomes among Male Game Players” (Communication Research [2006]: 448–466) studied how video game content might influence attitudes and behavior. Male students at a large Midwestern university were assigned at random to play one of three action-oriented video games. Two of the games involved some violence: one was a shooting game and one was a fighting game. The third game was a nonviolent race car driving game. After playing a game for 20 minutes, participants answered a set of questions. The responses were used to determine values of three measures of aggression: (1) a measure of aggressive behavior; (2) a measure of aggressive thoughts; and (3) a measure of aggressive feelings. The authors hypothesized that the means for the three measures of aggression would be greatest for the fighting game and lowest for the driving game.

a. For the measure of aggressive behavior, the paper reports that the mean score for the fighting game was significantly higher than the mean scores for the shooting and driving games, but that the mean scores for the shooting


and driving games were not significantly different. The three sample means were:

Driving Shooting Fighting

Sample mean 3.42 4.00 5.30

Use the underscoring procedure of this section to construct a display that shows any significant differences in mean aggressive behavior score among the three games.

b. For the measure of aggressive thoughts, the three sample means were:

Driving Shooting Fighting

Sample mean 2.81 3.44 4.01

The paper states that the mean score for the fighting game only significantly differed from the mean score for the driving game, and that the mean score for the shooting game did not significantly differ from either the fighting or driving games. Use the underscoring procedure of this section to construct a display that shows any significant differences in mean aggressive thoughts score among the three games.

TECHNOLOGY NOTES

ANOVA

JMP
1. Input the raw data into the first column
2. Input the group information into the second column
3. Click Analyze then select Fit Y by X
4. Click and drag the first column name from the box under Select Columns to the box next to Y, Response
5. Click and drag the second column name from the box under Select Columns to the box next to X, Factor
6. Click OK
7. Click the red arrow next to Oneway Analysis of… and select Means/ANOVA

MINITAB

Data stored in separate columns
1. Input each group’s data in a separate column
2. Click Stat then ANOVA then One-Way (unstacked)…
3. Click in the box under Responses (in separate columns):
4. Double-click the column name containing each group’s data
5. Click OK

Data stored in one column
1. Input the data into one column
2. Input the group information into a second column


3. Click Stat then ANOVA then One-Way…
4. Click in the box next to Response: and double-click the column name containing the raw data values
5. Click in the box next to Factor: and double-click the column name containing the group information
6. Click OK

SPSS
1. Input the raw data for all groups into one column
2. Input the group information into a second column (use group numbers)
3. Click Analyze then click Compare Means then click One-Way ANOVA…
4. Click the name of the column containing the raw data and click the arrow to move it to the box under Dependent List:
5. Click the name of the column containing the group data and click the arrow to move it to the box under Factor:
6. Click OK

Excel
1. Input the raw data for each group into a separate column
2. Click the Data ribbon
3. Click Data Analysis in the Analysis group
Note: If you do not see Data Analysis listed on the Ribbon, see the Technology Notes for Chapter 2 for instructions on installing this add-on.
4. Select Anova: Single Factor and click OK
5. Click on the box next to Input range and select ALL columns of data (if you typed and selected column titles, click the box next to Labels in First row)
6. Click in the box next to Alpha and type the significance level
7. Click OK
Note: The test statistic and P-value can be found in the first row of the table under F and P-value, respectively.

TI-83/84
1. Enter the data for each group into a separate list starting with L1 (In order to access lists press the STAT key, highlight the option called Edit… then press ENTER)
2. Press STAT
3. Highlight TESTS
4. Highlight ANOVA and press ENTER
5. Press 2nd then 1
6. Press ,
7. Press 2nd then 2
8. Press ,
9. Continue to input lists where data is stored separated by commas until you input the final list
10. When you are finished entering all lists, press )
11. Press ENTER

TI-Nspire

Summarized Data
1. Enter the summary information for the first group in a list in the following order: the value for n followed by a comma then the value of $\bar{x}$ followed by a comma then the value of s (In order to access data lists select the spreadsheet option and press enter)
Note: Be sure to title the lists by selecting the top row of the column and typing a title.
2. Enter the summary information for the second group in a list in the following order: the value for n followed by a comma then the value of $\bar{x}$ followed by a comma then the value of s
3. Continue to enter summary information for each group in this manner


4. When you are finished entering data for each group, press menu then 4:Statistics then 4:Stat Tests then C:ANOVA… then press enter
5. For Data Input Method choose Stats from the drop-down menu
6. For Number of Groups enter the number of groups, k
7. In the box next to Group 1 Stats select the list containing group one’s summary statistics
8. In the box next to Group 2 Stats select the list containing group two’s summary statistics
9. Continue entering summary statistics in this manner for all groups
10. Press OK

Raw Data
1. Enter each group’s data into separate data lists (In order to access data lists, select the spreadsheet option and press enter)
Note: Be sure to title the lists by selecting the top row of the column and typing a title.
2. Press the menu key and select 4:Statistics then 4:Stat Tests then C:ANOVA… and press enter
3. For Data Input Method choose Data from the drop-down menu
4. For Number of Groups input the number of groups, k
5. Press OK
6. For List 1 select the list title that contains group one’s data from the drop-down menu
7. For List 2 select the list title that contains group two’s data from the drop-down menu
8. Continue to select the appropriate lists for all groups
9. When you are finished inputting lists press OK
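For readers without one of the packages above, the summarized-data route (n, sample mean, and s for each group) can also be sketched in a few lines of Python. The sketch uses SSTr = Σ n_i(x̄_i − grand mean)² and SSE = Σ (n_i − 1)s_i²; the function name and the three groups are hypothetical and are not taken from any exercise in this chapter.

```python
def anova_from_summary(groups):
    """Single-factor ANOVA quantities from (n, mean, s) tuples, one per group."""
    n_total = sum(n for n, _, _ in groups)
    k = len(groups)
    # Grand mean is the sample-size-weighted average of the group means.
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    sstr = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    sse = sum((n - 1) * s ** 2 for n, _, s in groups)
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return {
        "SSTr": sstr,
        "SSE": sse,
        "MSTr": mstr,
        "MSE": mse,
        "F": mstr / mse,
        "df": (k - 1, n_total - k),
    }


# Hypothetical summary statistics: (n, sample mean, sample standard deviation).
example = [(3, 2.0, 1.0), (3, 4.0, 1.0), (3, 6.0, 1.0)]
result = anova_from_summary(example)
print(result)
```

The returned F value and degrees of freedom can then be compared against Table 7 to bound the P-value.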


Appendix Tables

TABLE 7 Values That Capture Specified Upper-Tail F Curve Areas

                df1
df2  Area   1        2        3        4        5        6        7        8        9        10
1    .10    39.86    49.50    53.59    55.83    57.24    58.20    58.91    59.44    59.86    60.19
     .05    161.40   199.50   215.70   224.60   230.20   234.00   236.80   238.90   240.50   241.90
     .01    4052.00  5000.00  5403.00  5625.00  5764.00  5859.00  5928.00  5981.00  6022.00  6056.00
2    .10    8.53     9.00     9.16     9.24     9.29     9.33     9.35     9.37     9.38     9.39
     .05    18.51    19.00    19.16    19.25    19.30    19.33    19.35    19.37    19.38    19.40
     .01    98.50    99.00    99.17    99.25    99.30    99.33    99.36    99.37    99.39    99.40
     .001   998.50   999.00   999.20   999.20   999.30   999.30   999.40   999.40   999.40   999.40
3    .10    5.54     5.46     5.39     5.34     5.31     5.28     5.27     5.25     5.24     5.23
     .05    10.13    9.55     9.28     9.12     9.01     8.94     8.89     8.85     8.81     8.79
     .01    34.12    30.82    29.46    28.71    28.24    27.91    27.67    27.49    27.35    27.23
     .001   167.00   148.50   141.10   137.10   134.60   132.80   131.60   130.60   129.90   129.20
4    .10    4.54     4.32     4.19     4.11     4.05     4.01     3.98     3.95     3.94     3.92
     .05    7.71     6.94     6.59     6.39     6.26     6.16     6.09     6.04     6.00     5.96
     .01    21.20    18.00    16.69    15.98    15.52    15.21    14.98    14.80    14.66    14.55
     .001   74.14    61.25    56.18    53.44    51.71    50.53    49.66    49.00    48.47    48.05
5    .10    4.06     3.78     3.62     3.52     3.45     3.40     3.37     3.34     3.32     3.30
     .05    6.61     5.79     5.41     5.19     5.05     4.95     4.88     4.82     4.77     4.74
     .01    16.26    13.27    12.06    11.39    10.97    10.67    10.46    10.29    10.16    10.05
     .001   47.18    37.12    33.20    31.09    29.75    28.83    28.16    27.65    27.24    26.92
6    .10    3.78     3.46     3.29     3.18     3.11     3.05     3.01     2.98     2.96     2.94
     .05    5.99     5.14     4.76     4.53     4.39     4.28     4.21     4.15     4.10     4.06
     .01    13.75    10.92    9.78     9.15     8.75     8.47     8.26     8.10     7.98     7.87
     .001   35.51    27.00    23.70    21.92    20.80    20.03    19.46    19.03    18.69    18.41
7    .10    3.59     3.26     3.07     2.96     2.88     2.83     2.78     2.75     2.72     2.70
     .05    5.59     4.74     4.35     4.12     3.97     3.87     3.79     3.73     3.68     3.64
     .01    12.25    9.55     8.45     7.85     7.46     7.19     6.99     6.84     6.72     6.62
     .001   29.25    21.69    18.77    17.20    16.21    15.52    15.02    14.63    14.33    14.08
8    .10    3.46     3.11     2.92     2.81     2.73     2.67     2.62     2.59     2.56     2.54
     .05    5.32     4.46     4.07     3.84     3.69     3.58     3.50     3.44     3.39     3.35
     .01    11.26    8.65     7.59     7.01     6.63     6.37     6.18     6.03     5.91     5.81
     .001   25.41    18.49    15.83    14.39    13.48    12.86    12.40    12.05    11.77    11.54
9    .10    3.36     3.01     2.81     2.69     2.61     2.55     2.51     2.47     2.44     2.42
     .05    5.12     4.26     3.86     3.63     3.48     3.37     3.29     3.23     3.18     3.14
     .01    10.56    8.02     6.99     6.42     6.06     5.80     5.61     5.47     5.35     5.26
     .001   22.86    16.39    13.90    12.56    11.71    11.13    10.70    10.37    10.11    9.89


TABLE 7 Values That Capture Specified Upper-Tail F Curve Areas (Continued)

                df1
df2  Area   1      2      3      4      5      6      7      8      9      10
10   .10    3.29   2.92   2.73   2.61   2.52   2.46   2.41   2.38   2.35   2.32
     .05    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
     .01    10.04  7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
     .001   21.04  14.91  12.55  11.28  10.48  9.93   9.52   9.20   8.96   8.75
11   .10    3.23   2.86   2.66   2.54   2.45   2.39   2.34   2.30   2.27   2.25
     .05    4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
     .01    9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
     .001   19.69  13.81  11.56  10.35  9.58   9.05   8.66   8.35   8.12   7.92
12   .10    3.18   2.81   2.61   2.48   2.39   2.33   2.28   2.24   2.21   2.19
     .05    4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
     .01    9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
     .001   18.64  12.97  10.80  9.63   8.89   8.38   8.00   7.71   7.48   7.29
13   .10    3.14   2.76   2.56   2.43   2.35   2.28   2.23   2.20   2.16   2.14
     .05    4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
     .01    9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
     .001   17.82  12.31  10.21  9.07   8.35   7.86   7.49   7.21   6.98   6.80
14   .10    3.10   2.73   2.52   2.39   2.31   2.24   2.19   2.15   2.12   2.10
     .05    4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
     .01    8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
     .001   17.14  11.78  9.73   8.62   7.92   7.44   7.08   6.80   6.58   6.40
15   .10    3.07   2.70   2.49   2.36   2.27   2.21   2.16   2.12   2.09   2.06
     .05    4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54
     .01    8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
     .001   16.59  11.34  9.34   8.25   7.57   7.09   6.74   6.47   6.26   6.08
16   .10    3.05   2.67   2.46   2.33   2.24   2.18   2.13   2.09   2.06   2.03
     .05    4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
     .01    8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
     .001   16.12  10.97  9.01   7.94   7.27   6.80   6.46   6.19   5.98   5.81
17   .10    3.03   2.64   2.44   2.31   2.22   2.15   2.10   2.06   2.03   2.00
     .05    4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
     .01    8.40   6.11   5.18   4.67   4.34   4.10   3.93   3.79   3.68   3.59
     .001   15.72  10.66  8.73   7.68   7.02   6.56   6.22   5.96   5.75   5.58
18   .10    3.01   2.62   2.42   2.29   2.20   2.13   2.08   2.04   2.00   1.98
     .05    4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
     .01    8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
     .001   15.38  10.39  8.49   7.46   6.81   6.35   6.02   5.76   5.56   5.39
19   .10    2.99   2.61   2.40   2.27   2.18   2.11   2.06   2.02   1.98   1.96
     .05    4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
     .01    8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
     .001   15.08  10.16  8.28   7.27   6.62   6.18   5.85   5.59   5.39   5.22


TABLE 7 Values That Capture Specified Upper-Tail F Curve Areas (Continued)

                df1
df2  Area   1      2      3      4      5      6      7      8      9      10
20   .10    2.97   2.59   2.38   2.25   2.16   2.09   2.04   2.00   1.96   1.94
     .05    4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35
     .01    8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
     .001   14.82  9.95   8.10   7.10   6.46   6.02   5.69   5.44   5.24   5.08
21   .10    2.96   2.57   2.36   2.23   2.14   2.08   2.02   1.98   1.95   1.92
     .05    4.32   3.47   3.07   2.84   2.68   2.57   2.49   2.42   2.37   2.32
     .01    8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
     .001   14.59  9.77   7.94   6.95   6.32   5.88   5.56   5.31   5.11   4.95
22   .10    2.95   2.56   2.35   2.22   2.13   2.06   2.01   1.97   1.93   1.90
     .05    4.30   3.44   3.05   2.82   2.66   2.55   2.46   2.40   2.34   2.30
     .01    7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
     .001   14.38  9.61   7.80   6.81   6.19   5.76   5.44   5.19   4.99   4.83
23   .10    2.94   2.55   2.34   2.21   2.11   2.05   1.99   1.95   1.92   1.89
     .05    4.28   3.42   3.03   2.80   2.64   2.53   2.44   2.37   2.32   2.27
     .01    7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
     .001   14.20  9.47   7.67   6.70   6.08   5.65   5.33   5.09   4.89   4.73
24   .10    2.93   2.54   2.33   2.19   2.10   2.04   1.98   1.94   1.91   1.88
     .05    4.26   3.40   3.01   2.78   2.62   2.51   2.42   2.36   2.30   2.25
     .01    7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
     .001   14.03  9.34   7.55   6.59   5.98   5.55   5.23   4.99   4.80   4.64
25   .10    2.92   2.53   2.32   2.18   2.09   2.02   1.97   1.93   1.89   1.87
     .05    4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24
     .01    7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
     .001   13.88  9.22   7.45   6.49   5.89   5.46   5.15   4.91   4.71   4.56
26   .10    2.91   2.52   2.31   2.17   2.08   2.01   1.96   1.92   1.88   1.86
     .05    4.23   3.37   2.98   2.74   2.59   2.47   2.39   2.32   2.27   2.22
     .01    7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
     .001   13.74  9.12   7.36   6.41   5.80   5.38   5.07   4.83   4.64   4.48
27   .10    2.90   2.51   2.30   2.17   2.07   2.00   1.95   1.91   1.87   1.85
     .05    4.21   3.35   2.96   2.73   2.57   2.46   2.37   2.31   2.25   2.20
     .01    7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
     .001   13.61  9.02   7.27   6.33   5.73   5.31   5.00   4.76   4.57   4.41
28   .10    2.89   2.50   2.29   2.16   2.06   2.00   1.94   1.90   1.87   1.84
     .05    4.20   3.34   2.95   2.71   2.56   2.45   2.36   2.29   2.24   2.19
     .01    7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
     .001   13.50  8.93   7.19   6.25   5.66   5.24   4.93   4.69   4.50   4.35
29   .10    2.89   2.50   2.28   2.15   2.06   1.99   1.93   1.89   1.86   1.83
     .05    4.18   3.33   2.93   2.70   2.55   2.43   2.35   2.28   2.22   2.18
     .01    7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
     .001   13.39  8.85   7.12   6.19   5.59   5.18   4.87   4.64   4.45   4.29


TABLE 7 Values That Capture Specified Upper-Tail F Curve Areas (Continued)

                df1
df2  Area   1      2      3      4      5      6      7      8      9      10
30   .10    2.88   2.49   2.28   2.14   2.05   1.98   1.93   1.88   1.85   1.82
     .05    4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16
     .01    7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
     .001   13.29  8.77   7.05   6.12   5.53   5.12   4.82   4.58   4.39   4.24
40   .10    2.84   2.44   2.23   2.09   2.00   1.93   1.87   1.83   1.79   1.76
     .05    4.08   3.23   2.84   2.61   2.45   2.34   2.25   2.18   2.12   2.08
     .01    7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
     .001   12.61  8.25   6.59   5.70   5.13   4.73   4.44   4.21   4.02   3.87
60   .10    2.79   2.39   2.18   2.04   1.95   1.87   1.82   1.77   1.74   1.71
     .05    4.00   3.15   2.76   2.53   2.37   2.25   2.17   2.10   2.04   1.99
     .01    7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
     .001   11.97  7.77   6.17   5.31   4.76   4.37   4.09   3.86   3.69   3.54
90   .10    2.76   2.36   2.15   2.01   1.91   1.84   1.78   1.74   1.70   1.67
     .05    3.95   3.10   2.71   2.47   2.32   2.20   2.11   2.04   1.99   1.94
     .01    6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
     .001   11.57  7.47   5.91   5.06   4.53   4.15   3.87   3.65   3.48   3.34
120  .10    2.75   2.35   2.13   1.99   1.90   1.82   1.77   1.72   1.68   1.65
     .05    3.92   3.07   2.68   2.45   2.29   2.18   2.09   2.02   1.96   1.91
     .01    6.85   4.79   3.95   3.48   3.17   2.96   2.79   2.66   2.56   2.47
     .001   11.38  7.32   5.78   4.95   4.42   4.04   3.77   3.55   3.38   3.24
240  .10    2.73   2.32   2.10   1.97   1.87   1.80   1.74   1.70   1.65   1.63
     .05    3.88   3.03   2.64   2.41   2.25   2.14   2.04   1.98   1.92   1.87
     .01    6.74   4.69   3.86   3.40   3.09   2.88   2.71   2.59   2.48   2.40
     .001   11.10  7.11   5.60   4.78   4.25   3.89   3.62   3.41   3.24   3.09
∞    .10    2.71   2.30   2.08   1.94   1.85   1.77   1.72   1.67   1.63   1.60
     .05    3.84   3.00   2.60   2.37   2.21   2.10   2.01   1.94   1.88   1.83
     .01    6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32
     .001   10.83  6.91   5.42   4.62   4.10   3.74   3.47   3.27   3.10   2.96
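Table 7 bounds a P-value rather than giving it exactly: the computed F statistic is located between neighboring entries in the block for the relevant df1 and df2. The sketch below illustrates that lookup for the df1 = 4, df2 = 60 entries (2.04, 2.53, 3.65, 5.31); the function name and the F values tried are hypothetical.

```python
# Table 7 entries for df1 = 4, df2 = 60: upper-tail area -> F value capturing it.
ROW_DF1_4_DF2_60 = {0.10: 2.04, 0.05: 2.53, 0.01: 3.65, 0.001: 5.31}


def bracket_p_value(f_stat, row):
    """Return (low, high) such that low < P-value <= high, based on one table block.

    low = 0.0 means the table only shows P-value < smallest tabled area;
    high = 1.0 means the table only shows P-value > largest tabled area.
    """
    high = 1.0
    for area in sorted(row, reverse=True):  # 0.10, 0.05, 0.01, 0.001
        if f_stat < row[area]:
            # F falls short of this critical value, so the tail area exceeds `area`.
            return (area, high)
        high = area
    return (0.0, high)


for f_value in (1.50, 2.62, 4.20, 6.00):  # hypothetical F statistics
    low, high = bracket_p_value(f_value, ROW_DF1_4_DF2_60)
    print(f"F = {f_value:.2f}: {low} < P-value <= {high}")
```

Statistical software reports the exact tail area instead, which is why answers based on software give a single P-value while table-based answers give a range.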


TABLE 8 Critical Values of q for the Studentized Range Distribution

                        Number of populations, treatments, or levels being compared
Error df  Confidence
          level         3      4      5      6      7      8      9      10
5         95%           4.60   5.22   5.67   6.03   6.33   6.58   6.80   6.99
          99%           6.98   7.80   8.42   8.91   9.32   9.67   9.97   10.24
6         95%           4.34   4.90   5.30   5.63   5.90   6.12   6.32   6.49
          99%           6.33   7.03   7.56   7.97   8.32   8.61   8.87   9.10
7         95%           4.16   4.68   5.06   5.36   5.61   5.82   6.00   6.16
          99%           5.92   6.54   7.01   7.37   7.68   7.94   8.17   8.37
8         95%           4.04   4.53   4.89   5.17   5.40   5.60   5.77   5.92
          99%           5.64   6.20   6.62   6.96   7.24   7.47   7.68   7.86
9         95%           3.95   4.41   4.76   5.02   5.24   5.43   5.59   5.74
          99%           5.43   5.96   6.35   6.66   6.91   7.13   7.33   7.49
10        95%           3.88   4.33   4.65   4.91   5.12   5.30   5.46   5.60
          99%           5.27   5.77   6.14   6.43   6.67   6.87   7.05   7.21
11        95%           3.82   4.26   4.57   4.82   5.03   5.20   5.35   5.49
          99%           5.15   5.62   5.97   6.25   6.48   6.67   6.84   6.99
12        95%           3.77   4.20   4.51   4.75   4.95   5.12   5.27   5.39
          99%           5.05   5.50   5.84   6.10   6.32   6.51   6.67   6.81
13        95%           3.73   4.15   4.45   4.69   4.88   5.05   5.19   5.32
          99%           4.96   5.40   5.73   5.98   6.19   6.37   6.53   6.67
14        95%           3.70   4.11   4.41   4.64   4.83   4.99   5.13   5.25
          99%           4.89   5.32   5.63   5.88   6.08   6.26   6.41   6.54
15        95%           3.67   4.08   4.37   4.59   4.78   4.94   5.08   5.20
          99%           4.84   5.25   5.56   5.80   5.99   6.16   6.31   6.44
16        95%           3.65   4.05   4.33   4.56   4.74   4.90   5.03   5.15
          99%           4.79   5.19   5.49   5.72   5.92   6.08   6.22   6.35
17        95%           3.63   4.02   4.30   4.52   4.70   4.86   4.99   5.11
          99%           4.74   5.14   5.43   5.66   5.85   6.01   6.15   6.27
18        95%           3.61   4.00   4.28   4.49   4.67   4.82   4.96   5.07
          99%           4.70   5.09   5.38   5.60   5.79   5.94   6.08   6.20
19        95%           3.59   3.98   4.25   4.47   4.65   4.79   4.92   5.04
          99%           4.67   5.05   5.33   5.55   5.73   5.89   6.02   6.14
20        95%           3.58   3.96   4.23   4.45   4.62   4.77   4.90   5.01
          99%           4.64   5.02   5.29   5.51   5.69   5.84   5.97   6.09
24        95%           3.53   3.90   4.17   4.37   4.54   4.68   4.81   4.92
          99%           4.55   4.91   5.17   5.37   5.54   5.69   5.81   5.92
30        95%           3.49   3.85   4.10   4.30   4.46   4.60   4.72   4.82
          99%           4.45   4.80   5.05   5.24   5.40   5.54   5.65   5.76
40        95%           3.44   3.79   4.04   4.23   4.39   4.52   4.63   4.73
          99%           4.37   4.70   4.93   5.11   5.26   5.39   5.50   5.60
60        95%           3.40   3.74   3.98   4.16   4.31   4.44   4.55   4.65
          99%           4.28   4.59   4.82   4.99   5.13   5.25   5.36   5.45
120       95%           3.36   3.68   3.92   4.10   4.24   4.36   4.47   4.56
          99%           4.20   4.50   4.71   4.87   5.01   5.12   5.21   5.30
∞         95%           3.31   3.63   3.86   4.03   4.17   4.29   4.39   4.47
          99%           4.12   4.40   4.60   4.76   4.88   4.99   5.08   5.16


Answers to Selected Exercises

Section 17.1

Exercise Set 1

17.1 (a) 0.001 < P-value < 0.01 (b) P-value > 0.10 (c) P-value = 0.01 (d) P-value < 0.001 (e) 0.05 < P-value < 0.10 (f) 0.01 < P-value < 0.05 (using $df_1 = 4$ and $df_2 = 60$)

17.2 (a) $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$, $H_a$: At least two of the four $\mu_i$'s are different. (b) P-value = 0.012, fail to reject $H_0$ (c) P-value = 0.012, fail to reject $H_0$

17.3 F = 6.687, P-value = 0.001, reject $H_0$

17.4 (a) $SSTr = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2 + n_4(\bar{x}_4 - \bar{\bar{x}})^2 = 32.13815000$; Treatment df $= k - 1 = 3$; $SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2 = 32.90103333$; Error df $= N - k = 20$

(b) $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$; $H_a$: At least two among $\mu_1, \mu_2, \mu_3, \mu_4$ are different; F = 6.51, P-value = 0.033; reject $H_0$.
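The F statistic in 17.4(b) can be checked from the sums of squares and degrees of freedom reported in 17.4(a). The following is a minimal sketch of that arithmetic; the variable names are illustrative and not from the text, and it computes only the F ratio (mean square for treatments divided by mean square for error), not the P-value.

```python
# Values reported in the answer to 17.4(a)
sstr = 32.13815000   # treatment sum of squares, SSTr
sse = 32.90103333    # error sum of squares, SSE
treatment_df = 3     # k - 1
error_df = 20        # N - k

# Mean squares are sums of squares divided by their degrees of freedom
mstr = sstr / treatment_df
mse = sse / error_df

# The single-factor ANOVA test statistic is the ratio of the mean squares
f_stat = mstr / mse
print(f"MSTr = {mstr:.4f}, MSE = {mse:.4f}, F = {f_stat:.2f}")
```

The rounded ratio agrees with the F = 6.51 reported in part (b).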

Additional Exercises

17.9 $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$; $H_a$: At least two among $\mu_1, \mu_2, \mu_3, \mu_4$ are different; F = 25.094, P-value ≈ 0; reject $H_0$.

17.11 F = 2.62, 0.05 < P-value < 0.10, fail to reject $H_0$

17.13 (a) See solutions manual for detailed computations. (b) F = 2.142, P-value > 0.10, fail to reject $H_0$

Section 17.2

Exercise Set 1

17.14 Since the interval for $\mu_2 - \mu_3$ is the only one that contains zero, we have evidence of a difference between $\mu_1$ and $\mu_2$, and between $\mu_1$ and $\mu_3$, but not between $\mu_2$ and $\mu_3$. Thus, statement c is the correct choice.

17.15 In increasing order of the resulting mean numbers of pretzels eaten, the treatments were: slides with related text, slides with no text, no slides, and slides with unrelated text. There were no significant differences between the results for slides with related text and slides with no text, or for no slides and slides with unrelated text. However, there was a significant difference between the mean numbers of pretzels eaten for no slides and slides with no text (and also between the results for no slides and slides with related text). Likewise, there was a significant difference between the mean numbers of pretzels eaten for slides with unrelated text and slides with no text (and also between the results for slides with unrelated text and slides with related text).

17.16

              Fabric 3   Fabric 2   Fabric 5   Fabric 4   Fabric 1
Sample mean   10.5       11.633     12.3       14.96      16.35

Additional Exercises

17.21 The interval for $\mu_1 - \mu_2$ contains zero, and hence $\mu_1$ and $\mu_2$ are judged not different. The intervals for $\mu_1 - \mu_3$ and $\mu_2 - \mu_3$ do not contain zero, so $\mu_1$ and $\mu_3$ are judged to be different, and $\mu_2$ and $\mu_3$ are judged to be different. There is evidence that $\mu_3$ is different from the other two means.

Are You Ready to Move On? Chapter 17 Review Exercises

17.23 F = 5.273, P-value = 0.002, reject $H_0$

17.25 (a)

Difference          Interval             Includes 0?
$\mu_1 - \mu_2$     (-4.027, 0.027)      Yes
$\mu_1 - \mu_3$     (-4.327, -0.273)     No
$\mu_1 - \mu_4$     (-5.327, -1.273)     No
$\mu_2 - \mu_3$     (-2.327, 1.727)      Yes
$\mu_2 - \mu_4$     (-3.327, 0.727)      Yes
$\mu_3 - \mu_4$     (-3.027, 1.027)      Yes

              7+ label   12+ label   16+ label   18+ label
Sample mean   4.8        6.8         7.1         8.1

(b) The more restrictive the age label on the video game, the higher the sample mean rating given by the boys used in the experiment. However, according to the T-K intervals, the only significant differences were between the means for the 7+ label and the 16+ label and between the means for the 7+ label and the 18+ label.
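The reasoning in 17.25 can be sketched in a few lines: a pair of means is judged significantly different when its T-K interval does not include zero. The snippet below uses the intervals and sample means from the answer above; the dictionary layout and the helper name `includes_zero` are illustrative choices, not from the text. It also checks that each interval is centered at the corresponding difference in sample means.

```python
# Sample means from 17.25(a); keys 1-4 stand for the 7+, 12+, 16+, 18+ labels
sample_means = {1: 4.8, 2: 6.8, 3: 7.1, 4: 8.1}

# T-K intervals for mu_i - mu_j, as reported in 17.25(a)
intervals = {
    (1, 2): (-4.027, 0.027),
    (1, 3): (-4.327, -0.273),
    (1, 4): (-5.327, -1.273),
    (2, 3): (-2.327, 1.727),
    (2, 4): (-3.327, 0.727),
    (3, 4): (-3.027, 1.027),
}

def includes_zero(interval):
    """Return True when zero lies between the interval's endpoints."""
    lower, upper = interval
    return lower <= 0 <= upper

# Pairs whose interval excludes zero are judged significantly different
significant = [pair for pair, iv in intervals.items() if not includes_zero(iv)]

# Each interval should be centered at the difference of the two sample means
centers_match = all(
    abs((lower + upper) / 2 - (sample_means[i] - sample_means[j])) < 1e-6
    for (i, j), (lower, upper) in intervals.items()
)

print("Significantly different pairs:", significant)
print("Intervals centered at sample mean differences:", centers_match)
```

The two pairs flagged are the ones named in part (b): the 7+ label versus the 16+ label, and the 7+ label versus the 18+ label.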
