BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

20
BPS - 5th Ed. Chapter 20 1 Chapter 20 Comparing Two Proportions

Transcript of BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

Page 1: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 1

Chapter 20

Comparing Two Proportions

Page 2: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 2

Two-Sample Problems

The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations.

We have a separate sample from each treatment or each population. The units are not matched, and the samples can be of differing sizes.

Page 3: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 3

Case Study

A study is performed to test of the reliability of products produced by two machines. Machine A produced 8 defective parts in a run of 140, while machine B produced 10 defective parts in a run of 200.

This is an example of when to use the two-proportion z procedures.

n Defects

Machine A 140 8

Machine B 200 11

Machine Reliability

Page 4: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 4

Inference about the Difference p1 – p2Simple Conditions

The difference in the population proportions is estimated by the difference in the sample proportions:

When both of the samples are large, the sampling distribution of this difference is approximately Normal with mean p1 – p2 and standard deviation

21 pp ˆˆ

2

22

1

11 11

n

pp

n

pp

Page 5: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 5

Inference about the Difference p1 – p2 Sampling Distribution

Page 6: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 6

Since the population proportions p1 and p2 are unknown, the standard deviation of the difference in sample proportions will need to be estimated by substituting for p1 and p2:

Standard Error

21 and pp ˆˆ

2

22

1

11 11

n

pp

n

ppSE

ˆˆˆˆ

Page 7: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 7

Page 8: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 8

Case Study: Reliability

We are 90% confident that the difference in proportion of defectives for the two machines is between -3.97% and 4.39%. Since 0 is in this interval, it is unlikely that the two machines differ in reliability.

2

22

1

1121

11

n

pp

n

ppzpp

ˆˆˆˆˆˆ

Compute a 90% confidence interval for the difference in reliabilities (as measured by proportion of defectives) for the two machines.

Confidence Interval

0.0439 to 0.03970.04180.0021

200200111

20011

140140

81140

8

1.64520011

1408

Page 9: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 9

The standard confidence interval approach yields unstable or erratic inferences.

By adding four imaginary observations (one success and one failure to each sample), the inferences can be stabilized.

This leads to more accurate inference of the difference of the population proportions.

Adjustment to Confidence Interval“Plus Four” Confidence Interval for p1 – p2

Page 10: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 10

Adjustment to Confidence Interval“Plus Four” Confidence Interval for p1 – p2

Add 4 imaginary observations, one success and one failure to each sample.

Compute the “plus four” proportions.

Use the “plus four” proportions in the formula.

21successes of number

11

n

p~2

1successes of number

22

n

p~

2

1

2

1

2

22

1

1121

n

p~p~

n

p~p~zp~p~

Page 11: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 11

Case Study: Reliability“Plus Four” 90% Confidence Interval

1429

214018

1

p~20212

2200111

2 p~

2

1

2

1

2

22

1

1121

n

p~p~

n

p~p~zp~p~

04740 to 039400434000400

202202121

20212

142142

91142

9

645120212

1429

....

.

We are 90% confident that the difference in proportion of defectives for the two machines is between -3.94% and 4.74%. Since 0 is in this interval, it is unlikely that the two machines differ in reliability.

(This is more accurate.)

Page 12: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 12

The Hypotheses for TestingTwo Proportions

Null: H0: p1 = p2

One sided alternatives

Ha: p1 > p2

Ha: p1 < p2

Two sided alternative

Ha: p1 p2

Page 13: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 13

Pooled Sample Proportion If H0 is true (p1=p2), then the two proportions are equal

to some common value p. Instead of estimating p1 and p2 separately, we will

combine or pool the sample information to estimate p. This combined or pooled estimate is called the

pooled sample proportion, and we will use it in place of each of the sample proportions in the expression for the standard error SE.

pooled sample proportion

samples both in nsobservatio of number totalsamples both in successes of number totalp̂

Page 14: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 14

Test Statistic for Two Proportions Use the pooled sample proportion in place of each of

the individual sample proportions in the expression for the standard error SE in the test statistic:

2

22

1

11

21

11n

p̂p̂n

p̂p̂

p̂p̂z

21

21

11

n

ˆˆ

n

ˆˆ

p̂p̂z

pppp

21

21

111

nnp̂p̂

p̂p̂z

Page 15: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 15

P-value for Testing Two Proportions Ha: p1 > p2

P-value is the probability of getting a value as large or larger than the observed test statistic (z) value.

Ha: p1 < p2 P-value is the probability of getting a value as small or

smaller than the observed test statistic (z) value.

Ha: p1 ≠ p2 P-value is two times the probability of getting a value as

large or larger than the absolute value of the observed test statistic (z) value.

Page 16: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 16

Page 17: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 17

Case Study

A university financial aid office polled a simple random sample of undergraduate students to study their summer employment.

Not all students were employed the previous summer. Here are the results:

Is there evidence that the proportion of male students who had summer jobs differs from the proportion of female students who had summer jobs.

Summer Status Men Women

Employed 718 593

Not Employed 79 139

Total 797 732

Summer Jobs

Page 18: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 18

Null: The proportion of male students who had summer jobs is the same as the proportion of female students who had summer jobs. [H0: p1 = p2]

Alt: The proportion of male students who had summer jobs differs from the proportion of female students who had summer jobs. [Ha: p1 ≠ p2]

The HypothesesCase Study: Summer Jobs

Page 19: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 19

Case Study: Summer Jobs

n1 = 797 and n2 = 732 (both large, so test statistic follows a Normal distribution) Pooled sample proportion:

standardized score (test statistic):

Test Statistic

Page 20: BPS - 5th Ed. Chapter 201 Comparing Two Proportions.

BPS - 5th Ed. Chapter 20 20

Case Study: Summer Jobs1. Hypotheses: H0: p1 = p2

Ha: p1 ≠ p2

2. Test Statistic:

3. P-value:P-value = 2P(Z > 5.07) = 0.000000396 (using a

computer)

P-value = 2P(Z > 5.07) < 2(1 – 0.9998) = 0.0004 (Table A)

[since 5.07 > 3.49 (the largest z-value in the table)]

4. Conclusion:

Since the P-value is smaller than = 0.001, there is very strong evidence that the proportion of male students who had summer jobs differs from that of female students.