Copyright © 2010 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.

Comparing Two Proportions

Comparisons between two percentages are much more common than questions about isolated percentages. And they are more interesting.

We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year’s results are better than last year’s.

A county health department tried an experiment using several hundred volunteers who were planning to use a nicotine patch to help quit smoking. The subjects were split into two groups. Group 1 was given the patch and attended a weekly discussion support group; Group 2 just got the patch. After six months, 46 of 143 people in Group 1 and 30 of 151 people in Group 2 had successfully stopped smoking. Do these results suggest that such support groups could be an effective way to help people stop smoking?

Another Ruler

In order to examine the difference between two proportions, we need another ruler—the standard deviation of the sampling distribution model for the difference between two proportions.

Recall that standard deviations don’t add, but variances do. In fact, the variance of the sum or difference of two independent random quantities is the sum of their individual variances.
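
As a brief worked statement of this rule (a sketch; $X$ and $Y$ are placeholder names for any two independent random quantities and do not appear in the original slides):

$$\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y), \qquad SD(X \pm Y) = \sqrt{SD(X)^2 + SD(Y)^2}$$

For a single sample proportion, $\mathrm{Var}(\hat{p}) = pq/n$ with $q = 1 - p$, so for two independent samples the variance of $\hat{p}_1 - \hat{p}_2$ is $p_1 q_1 / n_1 + p_2 q_2 / n_2$.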

The Standard Deviation of the Difference Between Two Proportions

Proportions observed in independent random samples are independent. Thus, we can add their variances. So…

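The standard deviation of the difference between two sample proportions is

$$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

Thus, the standard error is

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$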

A survey of 886 randomly selected teenagers (12-17) found that more than half of them have online profiles. There appear to be differences between boys and girls in their online behavior. Among teens aged 15-17, 57% of the 248 boys had online profiles, compared to 70% of the 256 girls.

If we want to estimate how large the difference truly is, we first calculate the standard error of the difference in sample proportions.
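
One way to carry out this calculation is sketched below in Python. The function name `se_diff` is made up for this illustration; the numbers are the 15-17 age group figures quoted above.

```python
from math import sqrt

def se_diff(p1: float, n1: int, p2: float, n2: int) -> float:
    """Unpooled standard error of p1_hat - p2_hat for two independent samples."""
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Teens aged 15-17: 70% of 256 girls and 57% of 248 boys had online profiles.
se = se_diff(0.70, 256, 0.57, 248)
print(round(se, 4))  # 0.0425
```

The order of the two groups does not change the standard error, because the two variance terms are simply added.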

Assumptions and Conditions

Independence Assumptions:

Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.

The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.

Independent Groups Assumption: The two groups we’re comparing must be independent of each other.

Assumptions and Conditions (cont.)

Sample Size Condition: Each of the groups must be big enough…

Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.

Among a random sample of teens aged 15-17, 57% of the 248 boys had online profiles, compared to 70% of the 256 girls.

Can we use these results to make inferences about all 15- to 17-year-olds?

What are the assumptions and conditions?
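
The Success/Failure Condition is the one piece of this check that is pure arithmetic, so it can be sketched in Python as below. The helper name `success_failure_ok` is hypothetical. Randomization and independence of the groups have to be judged from how the data were collected, and the 10% Condition needs the population size, so none of those is checked here.

```python
def success_failure_ok(p_hat: float, n: int, minimum: int = 10) -> bool:
    """True when observed successes and failures are both at least `minimum`."""
    successes = round(n * p_hat)  # reported percentages give fractional counts, so round
    failures = n - successes
    return successes >= minimum and failures >= minimum

boys_ok = success_failure_ok(0.57, 248)   # about 141 successes, 107 failures
girls_ok = success_failure_ok(0.70, 256)  # about 179 successes, 77 failures
print(boys_ok, girls_ok)  # True True
```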

The Sampling Distribution

We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution.

The same is true of their difference.

The Sampling Distribution (cont.)

Provided that the sampled values are independent, the samples are independent, and the sample sizes are large enough, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with

Mean: $\mu = p_1 - p_2$

Standard deviation:

$$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

Two-Proportion z-Interval

When the conditions are met, we are ready to find the confidence interval for the difference of two proportions, $p_1 - p_2$.

The confidence interval is

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$$

where

$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$

The critical value z* depends on the particular confidence level, C, that you specify.

Among a random sample of teens aged 15-17, 57% of the 248 boys had online profiles, compared to 70% of the 256 girls. We calculated the SE for the difference in sample proportions to be SE(girls - boys) = 0.0425 and found that the assumptions and conditions have been met.

Construct a 95% confidence interval for the difference between the proportions of girls and boys who have online profiles.

Explain your result in context.
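
A minimal Python sketch of this interval follows, assuming the conditions hold as stated. The function name `two_prop_z_interval` is invented for the example, and the critical value is taken from the standard library's `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(p1, n1, p2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Girls minus boys: 70% of 256 girls versus 57% of 248 boys.
low, high = two_prop_z_interval(0.70, 256, 0.57, 248)
print(f"{low:.3f} to {high:.3f}")  # 0.047 to 0.213
```

Read in context, this would say we are 95% confident that the proportion of girls aged 15-17 with online profiles is between about 4.7 and 21.3 percentage points higher than the proportion of boys.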

A Gallup poll asked whether the attribute “intelligent” applied to men in general. The poll revealed that 28% of 506 men thought it did, but only 14% of 250 women agreed. We want to estimate the true size of the gender gap by creating a 95% confidence interval.
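
A worked version of this interval, as a sketch that assumes the men and women were independent random samples and that the conditions are met (values rounded; $z^* = 1.96$ for 95% confidence):

$$SE(\hat{p}_{men} - \hat{p}_{women}) = \sqrt{\frac{(0.28)(0.72)}{506} + \frac{(0.14)(0.86)}{250}} \approx 0.0297$$

$$(0.28 - 0.14) \pm 1.96 \times 0.0297 \approx 0.14 \pm 0.058$$

That gives an interval of roughly 8.2% to 19.8% for how much more likely men are than women to say the attribute applies.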

A charity looking for donations runs a test to see whether soliciting by email or by regular mail is more effective. They sent the same letter to two different random groups of people and received donations 26% of the time from the group that received an email, and 15% of the time from those who received the request by regular mail. A 90% confidence interval estimated the difference in donation rates to be 11% ± 7%.

Interpret this confidence interval in context.

Based on this confidence interval, what conclusion would we reach if we tested the hypothesis that there is no difference in the response rates to the two methods?
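
The reasoning for the second question can be sketched as a check on whether the interval contains zero. The helper `interval_excludes_zero` is a made-up name. The link between a 90% interval and a two-sided test at the 10% level is only approximate here, because the interval uses the unpooled standard error while the test pools.

```python
def interval_excludes_zero(estimate: float, margin: float) -> bool:
    """True when estimate +/- margin lies entirely on one side of zero."""
    low, high = estimate - margin, estimate + margin
    return low > 0 or high < 0

# Charity example: 11% +/- 7%, that is, 4% to 18% in favor of email.
print(interval_excludes_zero(0.11, 0.07))  # True
```

Since zero is not a plausible value for the difference, a test of no difference would be expected to reject the null hypothesis.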

Everyone into the Pool

The typical hypothesis test for the difference in two proportions is the one of no difference. In symbols, H0: p1 – p2 = 0.

Since we are hypothesizing that there is no difference between the two proportions, both samples are estimating the same underlying proportion, and the standard deviation of each sample proportion depends on that one common value.

Since this is the case, we combine (pool) the counts to get one overall proportion.

Everyone into the Pool (cont.)

The pooled proportion is

$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$

where $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$.

If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)

Everyone into the Pool (cont.)

We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:

$$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_2}}$$
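
As an illustration, the pooling steps can be sketched in Python using the nicotine patch counts from earlier in the chapter (46 of 143 and 30 of 151). The function name `pooled_se` is an assumption made for this example.

```python
from math import sqrt

def pooled_se(successes1: int, n1: int, successes2: int, n2: int):
    """Return (pooled proportion, pooled standard error of p1_hat - p2_hat)."""
    p_pooled = (successes1 + successes2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    se = sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    return p_pooled, se

# Nicotine patch experiment: 46 of 143 quit with a support group, 30 of 151 without.
p_pooled, se = pooled_se(46, 143, 30, 151)
print(round(p_pooled, 3), round(se, 3))  # 0.259 0.051
```

Because the counts here are already whole numbers, no rounding step is needed before pooling.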

Compared to What?

We’ll reject our null hypothesis if we see a large enough difference in the two proportions.

How can we decide whether the difference we see is large? Just compare it with its standard deviation.

Unlike previous hypothesis testing situations, the null hypothesis doesn’t provide a standard deviation, so we’ll use a standard error (here, pooled).

Two-Proportion z-Test

The conditions for the two-proportion z-test are the same as for the two-proportion z-interval.

We are testing the hypothesis H0: p1 – p2 = 0, or, equivalently, H0: p1 = p2.

Because we hypothesize that the proportions are equal, we pool them to find

$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$

Two-Proportion z-Test (cont.)

We use the pooled value to estimate the standard error:

$$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled} \hat{q}_{pooled}}{n_2}}$$

Now we find the test statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$$

When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.

The National Sleep Foundation conducted a study of 1010 randomly chosen people. Of the 995 respondents, 26% of the 184 people under 30 reported that they snored, while 39% of the 811 people over 30 reported snoring.

Use a two-proportion z-test to determine whether there really is a difference between the two age groups.
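
A sketch of the full test in Python is given below, assuming the conditions are met and a two-sided alternative. The function name `two_prop_z_test` is hypothetical, and the success counts are reconstructed by rounding the reported percentages, so the result may differ slightly from one based on the raw survey counts.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(successes1, n1, successes2, n2):
    """Two-sided two-proportion z-test of H0: p1 - p2 = 0, using the pooled SE."""
    p1_hat, p2_hat = successes1 / n1, successes2 / n2
    p_pooled = (successes1 + successes2) / (n1 + n2)
    # Same as sqrt(p*q/n1 + p*q/n2), with the common factor p*q pulled out.
    se_pooled = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat - 0) / se_pooled
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Reported percentages imply fractional counts, so round to whole people first.
under30 = round(0.26 * 184)  # 48 snorers among 184 people under 30
over30 = round(0.39 * 811)   # 316 snorers among 811 people over 30
z, p_value = two_prop_z_test(under30, 184, over30, 811)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

With these counts the statistic comes out near -3.3, which corresponds to a P-value of about 0.001, so the data would give strong evidence that the snoring rates differ between the two age groups.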

If we go back to the online habits survey, only 19% (62) of the 325 girls said they were easy to find from their online profiles, while 28% (75) of the 268 boys said the same.

Are these results evidence of a difference between boys and girls? Perform a two proportion z-test and discuss what you find.
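
A worked sketch of this test from the counts given (62 of 325 girls, 75 of 268 boys), assuming the conditions are met and a two-sided alternative; intermediate values are rounded for display:

$$\hat{p}_{pooled} = \frac{62 + 75}{325 + 268} = \frac{137}{593} \approx 0.231$$

$$SE_{pooled}(\hat{p}_{girls} - \hat{p}_{boys}) = \sqrt{\frac{(0.231)(0.769)}{325} + \frac{(0.231)(0.769)}{268}} \approx 0.0348$$

$$z = \frac{(62/325 - 75/268) - 0}{0.0348} \approx \frac{-0.0891}{0.0348} \approx -2.56$$

A z-score of about -2.56 corresponds to a two-sided P-value of about 0.01, which would be fairly strong evidence that boys and girls differ in how easy they say they are to find.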

What Can Go Wrong?

Don’t use two-sample proportion methods when the samples aren’t independent. These methods give wrong answers when the independence assumption is violated.

Don’t apply inference methods when there was no randomization. Our data must come from representative random samples or from a properly randomized experiment.

Don’t interpret a significant difference in proportions causally. Be careful not to jump to conclusions about causality.

What have we learned?

We’ve now looked at inference for the difference in two proportions.

Perhaps the most important thing to remember is that the concepts and interpretations are essentially the same—only the mechanics have changed slightly.

What have we learned? (cont.)

Hypothesis tests and confidence intervals for the difference in two proportions are based on Normal models.

Both require us to find the standard error of the difference in two proportions.

We do that by adding the variances of the two sample proportions, assuming our two groups are independent.

When we test a hypothesis that the two proportions are equal, we pool the sample data; for confidence intervals we don’t pool.