
P1: PBU/OVY P2: PBU/OVY QC: PBU/OVY T1: PBU

GTBL011-21 GTBL011-Moore-v18.cls June 20, 2006 21:18

CHAPTER 21

In this chapter we cover...

Two-sample problems: proportions
The sampling distribution of a difference between proportions
Large-sample confidence intervals for comparing proportions
Using technology
Accurate confidence intervals for comparing proportions
Significance tests for comparing proportions

Michael S. Lewis/CORBIS

Comparing Two Proportions

In a two-sample problem, we want to compare two populations or the responses to two treatments based on two independent samples. When the comparison involves the means of two populations, we use the two-sample t methods of Chapter 19. Now we turn to methods to compare the proportions of successes in two populations.

Two-sample problems: proportions

We will use notation similar to that used in our study of two-sample t statistics. The groups we want to compare are Population 1 and Population 2. We have a separate SRS from each population or responses from two treatments in a randomized comparative experiment. A subscript shows which group a parameter or statistic describes. Here is our notation:

Population   Population proportion   Sample size   Sample proportion
1            p1                      n1            p̂1
2            p2                      n2            p̂2


We compare the populations by doing inference about the difference p1 − p2 between the population proportions. The statistic that estimates this difference is the difference between the two sample proportions, p̂1 − p̂2.

EXAMPLE 21.1 Young adults living with their parents

STATE: A surprising number of young adults (ages 19 to 25) still live in their parents’ home. A random sample by the National Institutes of Health included 2253 men and 2629 women in this age group.¹ The survey found that 986 of the men and 923 of the women lived with their parents. Is this good evidence that different proportions of young men and young women live with their parents? How large is the difference between the proportions of young men and young women who live with their parents?

FORMULATE: Take young men to be Population 1 and young women to be Population 2. The population proportions who live in their parents’ home are p1 for men and p2 for women. We want to test the hypotheses

H0: p1 = p2 (the same as H0: p1 − p2 = 0)

Ha: p1 ≠ p2 (the same as Ha: p1 − p2 ≠ 0)

We also want to give a confidence interval for the difference p1 − p2.

SOLVE: Inference about population proportions is based on the sample proportions

p̂1 = 986/2253 = 0.4376 (men)

p̂2 = 923/2629 = 0.3511 (women)

We see that about 44% of the men but only about 35% of the women lived with their parents. Because the samples are large and the sample proportions are quite different, we expect that a test will be highly significant (in fact, P < 0.0001). So we concentrate on the confidence interval. To estimate p1 − p2, start from the difference of sample proportions

p̂1 − p̂2 = 0.4376 − 0.3511 = 0.0865

To complete the Solve step, we must know how this difference behaves.

The sampling distribution of a difference between proportions

To use p̂1 − p̂2 for inference, we must know its sampling distribution. Here are the facts we need:

• When the samples are large, the distribution of p̂1 − p̂2 is approximately Normal.

• The mean of the sampling distribution is p1 − p2. That is, the difference between sample proportions is an unbiased estimator of the difference between population proportions.


[Figure: Normal curve showing the sampling distribution of p̂1 − p̂2. Horizontal axis: values of p̂1 − p̂2. Mean: p1 − p2. Standard deviation: √( p1(1 − p1)/n1 + p2(1 − p2)/n2 ).]

FIGURE 21.1 Select independent SRSs from two populations having proportions of successes p1 and p2. The proportions of successes in the two samples are p̂1 and p̂2. When the samples are large, the sampling distribution of the difference p̂1 − p̂2 is approximately Normal.

• The standard deviation of the distribution is

$$\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

Figure 21.1 displays the distribution of p̂1 − p̂2. The standard deviation of p̂1 − p̂2 involves the unknown parameters p1 and p2. Just as in the previous chapter, we must replace these by estimates in order to do inference. And just as in the previous chapter, we do this a bit differently for confidence intervals and for tests.
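The facts about the sampling distribution can be checked informally by simulation. The Python sketch below is illustrative only: the function names `sd_of_difference` and `simulate_differences`, the population proportions 0.44 and 0.35, the sample sizes 200 and 300, and the random seed are all assumptions chosen for the demonstration, not values from the text. It draws many pairs of independent samples and compares the mean and standard deviation of the simulated differences with the formulas above.

```python
import math
import random
import statistics


def sd_of_difference(p1, n1, p2, n2):
    """Theoretical standard deviation of p-hat1 - p-hat2."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Draw `reps` pairs of independent samples and return the
    differences between the two sample proportions."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Count successes in each sample, one Bernoulli trial at a time.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs


# Hypothetical populations and sample sizes, chosen only for illustration.
P1, N1, P2, N2 = 0.44, 200, 0.35, 300
diffs = simulate_differences(P1, N1, P2, N2, reps=2000)

print("simulated mean:", round(statistics.mean(diffs), 4),
      "theory:", round(P1 - P2, 4))
print("simulated SD:  ", round(statistics.stdev(diffs), 4),
      "theory:", round(sd_of_difference(P1, N1, P2, N2), 4))
```

With a few thousand repetitions the simulated mean and standard deviation should land close to the theoretical values, and a histogram of `diffs` should look roughly Normal.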

Large-sample confidence intervals for comparing proportions

To obtain a confidence interval, replace the population proportions p1 and p2 in the standard deviation by the sample proportions. The result is the standard error of the statistic p̂1 − p̂2:

$$\mathrm{SE} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

The confidence interval has the same form we met in the previous chapter:

$$\text{estimate} \pm z^{*}\,\mathrm{SE}_{\text{estimate}}$$


LARGE-SAMPLE CONFIDENCE INTERVAL FOR COMPARING TWO PROPORTIONS

Draw an SRS of size n1 from a large population having proportion p1 of successes and draw an independent SRS of size n2 from another large population having proportion p2 of successes. When n1 and n2 are large, an approximate level C confidence interval for p1 − p2 is

$$(\hat{p}_1 - \hat{p}_2) \pm z^{*}\,\mathrm{SE}$$

In this formula the standard error SE of p̂1 − p̂2 is

$$\mathrm{SE} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

and z* is the critical value for the standard Normal density curve with area C between −z* and z*.

Use this interval only when the numbers of successes and failures are each 10 or more in both samples.
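The recipe in the box translates directly into a few lines of code. The following Python sketch is one possible implementation, not the output of any package named in this chapter: the function name `large_sample_ci` and its argument names are assumptions made for illustration. It uses only the standard library, takes z* from `statistics.NormalDist`, and is applied to the counts from Example 21.1.

```python
import math
from statistics import NormalDist


def large_sample_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample level-C confidence interval for p1 - p2.

    x1, x2 are counts of successes; n1, n2 are sample sizes.
    Returns (estimate, standard error, lower limit, upper limit).
    """
    # Guideline from the text: successes and failures each 10 or more.
    if min(x1, n1 - x1, x2, n2 - x2) < 10:
        raise ValueError("large-sample interval needs all counts >= 10")

    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # z* leaves area C between -z* and z* under the standard Normal curve.
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    estimate = p1_hat - p2_hat
    return estimate, se, estimate - z_star * se, estimate + z_star * se


# Counts from Example 21.1: 986 of 2253 men, 923 of 2629 women.
estimate, se, lower, upper = large_sample_ci(986, 2253, 923, 2629)
print(f"{estimate:.4f} +/- {upper - estimate:.4f}: {lower:.3f} to {upper:.3f}")
```

Raising an error when a count falls below 10 is a design choice of this sketch; it simply enforces the guideline stated in the box rather than returning an interval that may be inaccurate.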

EXAMPLE 21.2 Men versus women living with their parents

We can now complete Example 21.1. Here is a summary of the basic information:

Population   Population description   Sample size   Number of successes   Sample proportion
1            men                      n1 = 2253     986                   p̂1 = 986/2253 = 0.4376
2            women                    n2 = 2629     923                   p̂2 = 923/2629 = 0.3511

SOLVE: We will give a 95% confidence interval for p1 − p2, the difference between the proportions of young men and young women who live with their parents. To check that the large-sample confidence interval is safe, look at the counts of successes and failures in the two samples. All of these four counts are much larger than 10, so the large-sample method will be accurate. The standard error is

$$\mathrm{SE} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = \sqrt{\frac{(0.4376)(0.5624)}{2253} + \frac{(0.3511)(0.6489)}{2629}} = \sqrt{0.0001959} = 0.01400$$


The 95% confidence interval is

(p̂1 − p̂2) ± z* SE = (0.4376 − 0.3511) ± (1.960)(0.01400)

= 0.0865 ± 0.0274

= 0.059 to 0.114

CONCLUDE: We are 95% confident that the percent of young men living with their parents is between 5.9 and 11.4 percentage points higher than the percent of young women who live with their parents.

Computer-assisted interviewing

The days of the interviewer with a clipboard are past. Interviewers now read questions from a computer screen and use the keyboard to enter responses. The computer skips irrelevant items: once a woman says that she has no children, further questions about her children never appear. The computer can even present questions in random order to avoid bias due to always following the same order. Software keeps records of who has responded and prepares a file of data from the responses. The tedious process of transferring responses from paper to computer, once a source of errors, has disappeared.

The sample survey in this example selected a single random sample of young adults, not two separate random samples of young men and young women. To get two samples, we divided the single sample by sex. This means that we did not know the two sample sizes n1 and n2 until after the data were in hand. The two-sample z procedures for comparing proportions are valid in such situations. This is an important fact about these methods.

Using technology

Figure 21.2 displays software output for Example 21.2 from a graphing calculator and two statistical software programs. As usual, you can understand the output even without knowledge of the program that produced it. Minitab gives the test as well as the confidence interval, confirming that the difference between men and women is highly significant. Excel spreadsheet output is not shown because Excel lacks menu items for inference about proportions. You must use the spreadsheet’s formula capability to program the confidence interval or test statistic and then to find the P-value of a test.

Texas Instruments TI-83 or TI-84

[Calculator screen image]

CrunchIt!

Two sample Proportion with summary
95% confidence interval results:
p1 - p2 : difference in proportions
p1 : proportion of successes for population 1
p2 : proportion of successes for population 2

Difference   Count1   Total1   Count2   Total2   Sample Diff.   Std. Err.     L. Limit      U. Limit
p1 - p2      986      2253     923      2629     0.08655464     0.013996254   0.059122488   0.1139868

Minitab

Test and CI for Two Proportions

Sample   X     N      Sample p
1        986   2253   0.437639
2        923   2629   0.351084

Difference = p (1) - p (2)
Estimate for difference: 0.0865546
95% CI for difference: (0.0591225, 0.113987)
Test for difference = 0 (vs not = 0): Z = 6.18 P-Value = 0.000

FIGURE 21.2 Output from the TI-83 graphing calculator, CrunchIt!, and Minitab for the 95% confidence interval of Example 21.2.

APPLY YOUR KNOWLEDGE

21.1 Who uses instant messaging? Teenagers (ages 12 to 17) are much more likely to use instant messaging online than are adults (ages 18 and older). How much more likely? A random sample of Internet users found that 736 out of 981 teens and 511 out of 1217 adults use instant messaging.² Give a 95% confidence interval for the difference between the proportions of teenage and adult Internet users who use instant messaging. Follow the four-step process as illustrated in Examples 21.1 and 21.2.

21.2 How to quit smoking. Nicotine patches are often used to help smokers quit. Does giving medicine to fight depression help? A randomized double-blind experiment assigned 244 smokers who wanted to stop to receive nicotine patches and another 245 to receive both a patch and the antidepression drug bupropion. After a year, 40 subjects in the nicotine patch group and 87 in the patch-plus-drug group had abstained from smoking.³ Give a 99% confidence interval for the difference (treatment minus control) in the proportion of smokers who quit. Follow the four-step process as illustrated in Examples 21.1 and 21.2.

Stockbyte Platinum/Alamy

Accurate confidence intervals for comparing proportions

CAUTION: Like the large-sample confidence interval for a single proportion p, the large-sample interval for p1 − p2 generally has true confidence level less than the level you asked for. The inaccuracy is not as serious as in the one-sample case, at least if our guidelines for use are followed. Once again, adding imaginary observations greatly improves the accuracy.⁴


PLUS FOUR CONFIDENCE INTERVAL FOR COMPARING TWO PROPORTIONS

Draw independent SRSs from two populations with population proportions of successes p1 and p2. To get the plus four confidence interval for the difference p1 − p2, add four imaginary observations, one success and one failure in each of the two samples. Then use the large-sample confidence interval with the new sample sizes (actual sample sizes + 2) and counts of successes (actual counts + 1).

Use this interval when the sample size is at least 5 in each group, with any counts of successes and failures.

If your software does not offer the plus four method, just enter the new plus four sample sizes and success counts into the large-sample procedure.
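Where no software offers the method, the adjustment is easy to program by hand. The Python sketch below is a minimal illustration under stated assumptions: the function name `plus_four_ci` and its arguments are hypothetical, and only the standard library is used. It adds one success and one failure to each sample and then applies the large-sample formula, here to the shrub counts of Example 21.3 (12 of 12 control shrubs and 8 of 12 burned shrubs resprouted).

```python
import math
from statistics import NormalDist


def plus_four_ci(x1, n1, x2, n2, level=0.95):
    """Plus four confidence interval for p1 - p2.

    Adds one success and one failure to each sample, then applies the
    large-sample formula to the adjusted counts and sample sizes.
    Returns (lower limit, upper limit).
    """
    # Guideline from the text: at least 5 observations in each group.
    if min(n1, n2) < 5:
        raise ValueError("plus four interval needs n >= 5 in each group")

    p1_tilde = (x1 + 1) / (n1 + 2)
    p2_tilde = (x2 + 1) / (n2 + 2)
    se = math.sqrt(p1_tilde * (1 - p1_tilde) / (n1 + 2)
                   + p2_tilde * (1 - p2_tilde) / (n2 + 2))
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    estimate = p1_tilde - p2_tilde
    return estimate - z_star * se, estimate + z_star * se


# Shrub counts from Example 21.3, with a 90% confidence level.
lower, upper = plus_four_ci(12, 12, 8, 12, level=0.90)
print(f"90% plus four interval: {lower:.3f} to {upper:.3f}")
```

Note that the function accepts a sample with no failures, which the large-sample method cannot handle safely.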

EXAMPLE 21.3 Shrubs that withstand fire

STATE: Some shrubs can resprout from their roots after their tops are destroyed. Fire is a serious threat to shrubs in dry climates, as it can injure the roots as well as destroy the tops. One study of resprouting took place in a dry area of Mexico.⁵ The investigators randomly assigned shrubs to treatment and control groups. They clipped the tops of all the shrubs. They then applied a propane torch to the stumps of the treatment group to simulate a fire. A shrub is a success if it resprouts. Here are the data for the shrub Xerospirea hartwegiana:

Population   Population description   Sample size   Number of successes   Sample proportion
1            control                  n1 = 12       12                    p̂1 = 12/12 = 1.000
2            treatment                n2 = 12       8                     p̂2 = 8/12 = 0.667

How much does burning reduce the proportion of shrubs of this species that resprout?

FORMULATE: Give a 90% confidence interval for the difference of population proportions, p1 − p2.

SOLVE: The conditions for the large-sample interval are not met. In fact, there are no failures in the control group. We will use the plus four method. Add four imaginary observations. The new data summary is

Population   Population description   Sample size   Number of successes   Plus four sample proportion
1            control                  n1 + 2 = 14   12 + 1 = 13           p̃1 = 13/14 = 0.9286
2            treatment                n2 + 2 = 14   8 + 1 = 9             p̃2 = 9/14 = 0.6429


The standard error based on the new facts is

$$\mathrm{SE} = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1 + 2} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2 + 2}} = \sqrt{\frac{(0.9286)(0.0714)}{14} + \frac{(0.6429)(0.3571)}{14}} = \sqrt{0.02113} = 0.1454$$

The plus four 90% confidence interval is

(p̃1 − p̃2) ± z* SE = (0.9286 − 0.6429) ± (1.645)(0.1454)

= 0.2857 ± 0.2392

= 0.047 to 0.525

CONCLUDE: We are 90% confident that burning reduces the percent of these shrubs that resprout by between 4.7 and 52.5 percentage points.

The plus four interval may be conservative (that is, the true confidence level may be higher than you asked for) for very small samples and population p’s close to 0 or 1, as in this example. It is generally much more accurate than the large-sample interval when the samples are small. Nevertheless, the plus four interval in Example 21.3 cannot save us from the fact that small samples produce wide confidence intervals.

APPLY YOUR KNOWLEDGE

21.3 Broken crackers. We don’t like to find broken crackers when we open the package. How can makers reduce breaking? One idea is to microwave the crackers for 30 seconds right after baking them. Breaks start as hairline cracks called “checking.” Assign 65 newly baked crackers to the microwave and another 65 to a control group that is not microwaved. After one day, none of the microwave group and 16 of the control group show checking.⁶ Give the 95% plus four confidence interval for the amount by which microwaving reduces the proportion of checking. The plus four method is particularly helpful when, as here, a count of successes is zero. Follow the four-step process as illustrated in Example 21.3.

Jim Cummins/Getty Images

21.4 In-line skaters. A study of injuries to in-line skaters used data from the National Electronic Injury Surveillance System, which collects data from a random sample of hospital emergency rooms. The researchers interviewed 161 people who came to emergency rooms with injuries from in-line skating. Wrist injuries (mostly fractures) were the most common.⁷

(a) The interviews found that 53 people were wearing wrist guards and 6 of these had wrist injuries. Of the 108 who did not wear wrist guards, 45 had wrist injuries. Why should we not use the large-sample confidence interval for these data?

(b) Give the plus four 95% confidence interval for the difference between the two population proportions of wrist injuries. State carefully what populations your inference compares. We would like to draw conclusions about all in-line skaters, but we have data only for injured skaters.


Significance tests for comparing proportions

An observed difference between two sample proportions can reflect an actual difference between the populations, or it may just be due to chance variation in random sampling. Significance tests help us decide if the effect we see in the samples is really there in the populations. The null hypothesis says that there is no difference between the two populations:

H0: p1 = p2

The alternative hypothesis says what kind of difference we expect.

EXAMPLE 21.4 Choosing a mate

STATE: “Would you marry a person from a lower social class than your own?” Researchers asked this question of a sample of 385 black, never-married students at two historically black colleges in the South. We will consider this to be an SRS of black students at historically black colleges. Of the 149 men in the sample, 91 said “Yes.” Among the 236 women, 117 said “Yes.”⁸ Is there reason to think that different proportions of men and women in this student population would be willing to marry beneath their class?

FORMULATE: Take men to be Population 1 and women to be Population 2. We had no direction for the difference in mind before looking at the data, so we have a two-sided alternative:

H0: p1 = p2

Ha: p1 ≠ p2

SOLVE: The men and women in a single SRS can be treated as if they were separate SRSs of men and women students. The sample proportions who would marry someone from a lower social class are

p̂1 = 91/149 = 0.611 (men)

p̂2 = 117/236 = 0.496 (women)

That is, about 61% of the men but only about 50% of the women would marry beneath their class. Is this apparent difference statistically significant? To continue the solution, we must learn the proper test.

To do a test, standardize p̂1 − p̂2 to get a z statistic. If H0 is true, all the observations in both samples come from a single population of students of whom a single unknown proportion p would marry someone from a lower social class. So instead of estimating p1 and p2 separately, we pool the two samples and use the overall sample proportion to estimate the single population parameter p. Call this the pooled sample proportion. It is

$$\hat{p} = \frac{\text{number of successes in both samples combined}}{\text{number of individuals in both samples combined}}$$

Use p̂ in place of both p̂1 and p̂2 in the expression for the standard error SE of p̂1 − p̂2 to get a z statistic that has the standard Normal distribution when H0 is true. Here is the test.

Statisticians honest and dishonest

Government statisticians ought to produce honest data. We trust the monthly unemployment rate and Consumer Price Index to guide both public and private decisions. Honesty can’t be taken for granted everywhere, however. In 1998, the Russian government arrested the top statisticians in the State Committee for Statistics. They were accused of taking bribes to fudge data to help companies avoid taxes. “It means that we know nothing about the performance of Russian companies,” said one newspaper editor.

SIGNIFICANCE TEST FOR COMPARING TWO PROPORTIONS

Draw an SRS of size n1 from a large population having proportion p1 of successes and draw an independent SRS of size n2 from another large population having proportion p2 of successes. To test the hypothesis H0: p1 = p2, first find the pooled proportion p̂ of successes in both samples combined. Then compute the z statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

In terms of a variable Z having the standard Normal distribution, the P-value for a test of H0 against

Ha: p1 > p2 is P(Z ≥ z)

Ha: p1 < p2 is P(Z ≤ z)

Ha: p1 ≠ p2 is 2P(Z ≥ |z|)

Use this test when the counts of successes and failures are each 5 or more in both samples.
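The test in the box can be sketched in code as follows. This is an illustrative Python version, not the procedure of any particular package: the function name `two_proportion_z_test` and the `alternative` labels ("greater", "less", "two-sided") are assumptions introduced here. It uses only the standard library and is applied to the counts from Example 21.4.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled z test of H0: p1 = p2.

    `alternative` is "greater" (Ha: p1 > p2), "less" (Ha: p1 < p2),
    or "two-sided" (Ha: p1 != p2). Returns (z, P-value).
    """
    # Guideline from the text: successes and failures each 5 or more.
    if min(x1, n1 - x1, x2, n2 - x2) < 5:
        raise ValueError("test needs all counts of successes and failures >= 5")

    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se

    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Counts from Example 21.4: 91 of 149 men and 117 of 236 women said "Yes."
z, p_value = two_proportion_z_test(91, 149, 117, 236)
print(f"z = {z:.3f}, two-sided P = {p_value:.4f}")
```

The pooled proportion appears only in the standard error; the numerator still uses the two separate sample proportions.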

EXAMPLE 21.5 Choosing a mate, continued

SOLVE: The data come from an SRS and the counts of successes and failures are all much larger than 5. The pooled proportion of students who would marry beneath their own social class is

$$\hat{p} = \frac{\text{number of ``Yes'' responses among men and women combined}}{\text{number of men and women combined}} = \frac{91 + 117}{149 + 236} = \frac{208}{385} = 0.5403$$

The z test statistic is

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.611 - 0.496}{\sqrt{(0.5403)(0.4597)\left(\dfrac{1}{149} + \dfrac{1}{236}\right)}} = \frac{0.115}{0.05215} = 2.205$$

The two-sided P-value is the area under the standard Normal curve more than 2.205 distant from 0. Figure 21.3 shows this area. Software tells us that P = 0.0275.

Without software, you can use the bottom row of Table C (standard Normal critical values) to approximate P with no calculations: z = 2.205 lies between the critical values 2.054 and 2.326 for two-sided P-values 0.04 and 0.02.

CONCLUDE: There is good evidence (P < 0.04) that men are more likely than women to say they will marry someone from a lower social class.

[Figure: standard Normal curve with z = −2.205 and z = 2.205 marked. The P-value is double the area under the curve to the right of z = 2.205.]

FIGURE 21.3 The P-value for the two-sided test of Example 21.5.


APPLY YOUR KNOWLEDGE

21.5 The Gold Coast. A historian examining British colonial records for the Gold Coast in Africa suspects that the death rate was higher among African miners than among European miners. In the year 1936, there were 223 deaths among 33,809 African miners and 7 deaths among 1541 European miners on the Gold Coast.⁹ (The Gold Coast became the independent nation of Ghana in 1957.)

Consider this year as a random sample from the colonial era in West Africa. Is there good evidence that the proportion of African miners who died was higher than the proportion of European miners who died? Follow the four-step process as illustrated in Example 21.5.

21.6 How to quit smoking, continued. Exercise 21.2 describes a randomized comparative experiment to test whether adding medicine to fight depression increases the effectiveness of nicotine patches in helping smokers to quit. How significant is the evidence that the medicine increases the success rate? Follow the four-step process as illustrated in Example 21.5.

Michael S. Lewis/CORBIS

CHAPTER 21 SUMMARY

The data in a two-sample problem are two independent SRSs, each drawn from a separate population.

Tests and confidence intervals to compare the proportions p1 and p2 of successes in the two populations are based on the difference p̂1 − p̂2 between the sample proportions of successes in the two SRSs.

When the sample sizes n1 and n2 are large, the sampling distribution of p̂1 − p̂2 is close to Normal with mean p1 − p2.

The level C large-sample confidence interval for p1 − p2 is

$$(\hat{p}_1 - \hat{p}_2) \pm z^{*}\,\mathrm{SE}$$

where the standard error of p̂1 − p̂2 is

$$\mathrm{SE} = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$$

and z* is a standard Normal critical value.

The true confidence level of the large-sample interval can be substantially less than the planned level C. Use this interval only if the counts of successes and failures in both samples are 10 or greater.

To get a more accurate confidence interval, add four imaginary observations: one success and one failure in each sample. Then use the same formula for the confidence interval. This is the plus four confidence interval. You can use it whenever both samples have 5 or more observations.

Significance tests for H0: p1 = p2 use the pooled sample proportion

$$\hat{p} = \frac{\text{number of successes in both samples combined}}{\text{number of individuals in both samples combined}}$$

and the z statistic

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

P-values come from the standard Normal distribution. Use this test when there are 5 or more successes and 5 or more failures in both samples.

CHECK YOUR SKILLS

A sample survey interviews SRSs of 500 female college students and 550 male college students. Each student is asked if he or she worked for pay last summer. In all, 410 of the women and 484 of the men say “Yes.” Exercises 21.7 to 21.11 are based on this survey.

21.7 Take pM and pF to be the proportions of all college males and females who worked last summer. We conjectured before seeing the data that men are more likely to work. The hypotheses to be tested are

(a) H0: pM = pF versus Ha: pM ≠ pF.

(b) H0: pM = pF versus Ha: pM > pF.

(c) H0: pM = pF versus Ha: pM < pF.

21.8 The sample proportions of college males and females who worked last summer are about

(a) p̂M = 0.88 and p̂F = 0.82.

(b) p̂M = 0.82 and p̂F = 0.88.

(c) p̂M = 0.75 and p̂F = 0.97.

21.9 The pooled sample proportion who worked last summer is about

(a) p̂ = 1.70. (b) p̂ = 0.89. (c) p̂ = 0.85.

21.10 The z statistic for a test comparing the proportions of college men and women who worked last summer is about

(a) z = 2.66. (b) z = 2.72. (c) z = 3.10.

21.11 The 95% large-sample confidence interval for the difference pM − pF in the proportions of college men and women who worked last summer is about

(a) 0.06 ± 0.00095. (b) 0.06 ± 0.043. (c) 0.06 ± 0.036.

21.12 In an experiment to learn if substance M can help restore memory, the brains of 20 rats were treated to damage their memories. The rats were trained to run a maze. After a day, 10 rats were given M and 7 of them succeeded in the maze; only 2 of the 10 control rats were successful. The z test for “no difference” against “a higher proportion of the M group succeeds” has

(a) z = 2.25, P < 0.02.

(b) z = 2.60, P < 0.005.

(c) z = 2.25, P < 0.04 but not < 0.02.

21.13 The z test in the previous exercise

(a) may be inaccurate because the populations are too small.

(b) may be inaccurate because some counts of successes and failures are too small.

(c) is reasonably accurate because the conditions for inference are met.


21.14 The plus four 90% confidence interval for the difference between the proportion of rats that succeed when given M and the proportion that succeed without it is

(a) 0.455 ± 0.312. (b) 0.417 ± 0.304. (c) 0.417 ± 0.185.

CHAPTER 21 EXERCISES

We recommend using the plus four method for all confidence intervals for proportions. However, the large-sample method is acceptable when the guidelines for its use are met.

Mark Harmel/Getty Images

21.15 Genetically altered mice. Genetic influences on cancer can be studied bymanipulating the genetic makeup of mice. One of the processes that turn geneson or off (so to speak) in particular locations is called “DNA methylation.” Dolow levels of this process help cause tumors? Compare mice altered to have lowlevels with normal mice. Of 33 mice with lowered levels of DNA methylation,23 developed tumors. None of the control group of 18 normal mice developedtumors in the same time period.10

(a) Explain why we cannot safely use either the large-sample confidence intervalor the test for comparing the proportions of normal and altered mice that developtumors.

(b) Give a 99% confidence interval for the difference in the proportions of thetwo populations that develop tumors.

(c) Based on your confidence interval, is the difference between normal andaltered mice significant at the 1% level?

21.16 Drug testing in schools. In 2002 the Supreme Court ruled that schools could require random drug tests of students participating in competitive after-school activities such as athletics. Does drug testing reduce use of illegal drugs? A study compared two similar high schools in Oregon. Wahtonka High School tested athletes at random and Warrenton High School did not. In a confidential survey, 7 of 135 athletes at Wahtonka and 27 of 141 athletes at Warrenton said they were using drugs.11 Regard these athletes as SRSs from the populations of athletes at similar schools with and without drug testing.

(a) You should not use the large-sample confidence interval. Why not?

(b) The plus four method adds two observations, a success and a failure, to each sample. What are the sample sizes and the numbers of drug users after you do this?

(c) Give the plus four 95% confidence interval for the difference between the proportion of athletes using drugs at schools with and without testing.
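
The plus four adjustment described in part (b) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `plus_four_diff_ci` is our own, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def plus_four_diff_ci(x1, n1, x2, n2, level=0.95):
    """Plus four confidence interval for p1 - p2.

    Adds one success and one failure to each sample, then applies the
    usual large-sample formula to the adjusted counts.
    """
    p1 = (x1 + 1) / (n1 + 2)  # adjusted proportion, sample 1
    p2 = (x2 + 1) / (n2 + 2)  # adjusted proportion, sample 2
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # e.g. about 1.96 for 95%
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se


# Counts from this exercise: Wahtonka (testing) first, Warrenton (no testing) second.
low, high = plus_four_diff_ci(7, 135, 27, 141, level=0.95)
print(f"95% plus four interval: ({low:.3f}, {high:.3f})")
```

Passing `level=0.90` or `level=0.99` gives the intervals asked for in other exercises without changing the function.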

21.17 I refuse! Do our emotions influence economic decisions? One way to examine the issue is to have subjects play an “ultimatum game” against other people and against a computer. Your partner (person or computer) gets $10, on the condition that it be shared with you. The partner makes you an offer. If you refuse, neither of you gets anything. So it’s to your advantage to accept even the unfair offer of $2 out of the $10. Some people get mad and refuse unfair offers. Here are data on the responses of 76 subjects randomly assigned to receive an offer of $2 from either a person they were introduced to or a computer:12

                  Accept   Reject
Human offers        20       18
Computer offers     32        6

We suspect that emotion will lead to offers from another person being rejected more often than offers from an impersonal computer. Do a test to assess the evidence for this conjecture. Follow the four-step process as illustrated in Example 21.5.

21.18 Drug testing in schools, continued. Exercise 21.16 describes a study that compared the proportions of athletes who use illegal drugs in two similar high schools, one that tests for drugs and one that does not. Drug testing is intended to reduce use of drugs. Do the data give good reason to think that drug use among athletes is lower in schools that test for drugs? State hypotheses, find the test statistic, and use either software or the bottom row of Table C for the P-value. Be sure to state your conclusion. (Because the study is not an experiment, the conclusion depends on the condition that athletes in these two schools can be considered SRSs from all similar schools.)

Call a statistician. Does involving a statistician to help with statistical methods improve the chance that a medical research paper will be published? A study of papers submitted to two medical journals found that 135 of 190 papers that lacked statistical assistance were rejected without even being reviewed in detail. In contrast, 293 of the 514 papers with statistical help were sent back without review.13 Exercises 21.19 to 21.21 are based on this study.

21.19 Does statistical help make a difference? Is there a significant difference in the proportions of papers with and without statistical help that are rejected without review? Use software or the bottom row of Table C to get a P-value. (This observational study does not establish causation: studies that include statistical help may also be better in other ways than those that do not.)

21.20 How often are statisticians involved? Give a 95% confidence interval for the proportion of papers submitted to these journals that include help from a statistician.

21.21 How big a difference? Give a 95% confidence interval for the difference between the proportions of papers rejected without review when a statistician is and is not involved in the research.

21.22 Steroids in high school. A study by the National Athletic Trainers Association surveyed 1679 high school freshmen and 1366 high school seniors in Illinois. Results showed that 34 of the freshmen and 24 of the seniors had used anabolic steroids. Steroids, which are dangerous, are sometimes used to improve athletic performance.14

(a) In order to draw conclusions about all Illinois freshmen and seniors, how should the study samples be chosen?

(b) Give a 95% confidence interval for the proportion of all high school freshmen in Illinois who have used steroids.

(c) Is there a significant difference between the proportions of freshmen and seniors who have used steroids?

21.23 Detecting genetically modified soybeans. Exercise 20.35 (page 509) describes a study in which batches of soybeans containing some genetically modified (GM) beans were submitted to 23 grain-handling facilities. When batches contained 1% of GM beans, 18 of the facilities detected the presence of GM beans. Only 7 of the facilities detected GM beans when they made up one-tenth of 1% of the beans in the batches. Explain why we cannot use the methods of this chapter to compare the proportions of facilities that will detect the two levels of GM soybeans.

21.24 Significant does not mean important. Never forget that even small effects can be statistically significant if the samples are large. To illustrate this fact, return to the study of 148 small businesses described in Exercise 20.37 (page 510). Of these, 106 were headed by men and 42 were headed by women. During a three-year period, 15 of the men’s businesses and 7 of the women’s businesses failed.

(a) Find the proportions of failures for businesses headed by women and businesses headed by men. These sample proportions are quite close to each other. Give the P-value for the z test of the hypothesis that the same proportion of women’s and men’s businesses fail. (Use the two-sided alternative.) The test is very far from being significant.

(b) Now suppose that the same sample proportions came from a sample 30 times as large. That is, 210 out of 1260 businesses headed by women and 450 out of 3180 businesses headed by men fail. Verify that the proportions of failures are exactly the same as in (a). Repeat the z test for the new data, and show that it is now significant at the α = 0.05 level.

(c) It is wise to use a confidence interval to estimate the size of an effect, rather than just giving a P-value. Give 95% confidence intervals for the difference between the proportions of women’s and men’s businesses that fail for the settings of both (a) and (b). What is the effect of larger samples on the confidence interval?
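
The comparison in parts (a) and (b) can be sketched in Python as follows. This is a minimal sketch, assuming the pooled two-sample z statistic for comparing proportions; the function name `two_proportion_z_test` is hypothetical, and the Normal probabilities come from the standard library's `statistics.NormalDist` rather than from Table C.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test of H0: p1 = p2 against the two-sided alternative.

    Returns the z statistic and the two-sided P-value.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled sample proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Counts from parts (a) and (b): women's businesses first, then men's.
z_small, p_small = two_proportion_z_test(7, 42, 15, 106)
z_large, p_large = two_proportion_z_test(210, 1260, 450, 3180)
print(f"original samples:   z = {z_small:.2f}, P = {p_small:.3f}")
print(f"30 times as large:  z = {z_large:.2f}, P = {p_large:.3f}")
```

The sample proportions are the same in both calls, so only the sample sizes differ. Multiplying every count by 30 divides the standard error by the square root of 30, which is why the z statistic grows by that factor.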

In responding to Exercises 21.25 to 21.34, follow the Formulate, Solve, and Conclude steps of the four-step process. It may be helpful to restate in your own words the State information given in the exercise.

21.25 Satisfaction with high schools. A sample survey asked 202 black parents and 201 white parents of high school children, “Are the public high schools in your state doing an excellent, good, fair or poor job, or don’t you know enough to say?” The investigators suspected that black parents are generally less satisfied with their public schools than are whites. Among the black parents, 81 thought high schools were doing a “good” or “excellent” job; 103 of the white parents felt this way.15 Is there good evidence that the proportion of all black parents who think their state’s high schools are good or excellent is lower than the proportion of white parents with this opinion?

21.26 College is important. The sample survey described in the previous exercise also asked respondents if they agreed with the statement “A college education has become as important as a high school diploma used to be.” In the sample, 125 of 201 white parents and 154 of 202 black parents said that they “strongly agreed.” Is there good reason to think that different percents of all black and white parents would strongly agree with the statement?

21.27 Seat belt use. The proportion of drivers who use seat belts depends on things like age (young people are more likely to go unbelted) and gender (women are more likely to use belts). It also depends on local law. In New York City, police can stop a driver who is not belted. In Boston at the time of the survey, police could cite a driver for not wearing a seat belt only if the driver had been stopped for some other violation. Here are data from observing random samples of female Hispanic drivers in these two cities:16

City       Drivers   Belted
New York     220      183
Boston       117       68

(a) Is this an experiment or an observational study? Why?

(b) Comparing local laws suggests the hypothesis that a smaller proportion of drivers wear seat belts in Boston than in New York. Do the data give good evidence that this is true for female Hispanic drivers?

21.28 Ethnicity and seat belt use. Here are data from the study described in the previous exercise for Hispanic and white male drivers in Chicago:

Group      Drivers   Belted
Hispanic     539      286
White        292      164

Is there a significant difference between Hispanic and white drivers? How large is the difference? Do inference to answer both questions. Be sure to explain exactly what inference you choose to do.

21.29 Lyme disease. Lyme disease is spread in the northeastern United States by infected ticks. The ticks are infected mainly by feeding on mice, so more mice result in more infected ticks. The mouse population in turn rises and falls with the abundance of acorns, their favored food. Experimenters studied two similar forest areas in a year when the acorn crop failed. They added hundreds of thousands of acorns to one area to imitate an abundant acorn crop, while leaving the other area untouched. The next spring, 54 of the 72 mice trapped in the first area were in breeding condition, versus 10 of the 17 mice trapped in the second area.17 Estimate the difference between the proportions of mice ready to breed in good acorn years and bad acorn years. (Use 90% confidence. Be sure to justify your choice of confidence interval.)

21.30 Are urban students more successful? North Carolina State University looked at the factors that affect the success of students in a required chemical engineering course. Students must get a C or better in the course in order to continue as chemical engineering majors, so a “success” is a grade of C or better. There were 65 students from urban or suburban backgrounds, and 52 of these students succeeded. Another 55 students were from rural or small-town backgrounds; 30 of these students succeeded in the course.18 Is there good evidence that the proportion of students who succeed is different for urban/suburban versus rural/small-town backgrounds? How large is the difference? (Use 90% confidence.)

21.31 Does preschool help? To study the long-term effects of preschool programs for poor children, the High/Scope Educational Research Foundation has followed two groups of Michigan children since early childhood.19 One group of 62 attended preschool as 3- and 4-year-olds. A control group of 61 children from the same area and similar backgrounds did not attend preschool. Over a ten-year period as adults, 38 of the preschool sample and 49 of the control sample needed social services (mainly welfare). Does the study provide significant evidence that children who attend preschool have less need for social services as adults? How large is the difference between the proportions of the preschool and no-preschool populations that require social services? Do inference to answer both questions. Be sure to explain exactly what inference you choose to do.

21.32 Female and male students. The North Carolina State University study (Exercise 21.30) also looked at possible differences in the proportions of female and male students who succeeded in the course. They found that 23 of the 34 women and 60 of the 89 men succeeded. Is there evidence of a difference between the proportions of women and men who succeed?

21.33 Study design. The study in Exercise 21.31 randomly assigned 123 children to the two groups. The same data could have come from a study that followed children whose parents did or did not enroll them in preschool. Explain carefully how the conclusions we can draw depend on which design was used.

21.34 Using credit cards. Are shoppers more or less likely to use credit cards for “impulse purchases” that they decide to make on the spot, as opposed to purchases that they had in mind when they went to the store? Stop every third person leaving a department store with a purchase. (This is in effect a random sample of people who buy at that store.) A few questions allow us to classify the purchase as impulse or not. Here are the data on how the customer paid:20

                      Credit Card?
                      Yes     No
Impulse purchases      13     18
Planned purchases      35     31

Estimate with 95% confidence the percent of all customers at this store who use a credit card. Give numerical summaries to describe the difference in credit card use between impulse and planned purchases. Is this difference statistically significant?