Chapter 20

Testing Hypotheses About Proportions

Confidence Interval

Confidence Interval for p

$\hat{p} - z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ to $\hat{p} + z^{*}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

Confidence Interval

Plausible values for the unknown population proportion, p.

We have confidence in the process that produced this interval.
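As a rough illustration of the interval formula, the sketch below computes a confidence interval for p using only the Python standard library. The function name `proportion_confidence_interval` is an assumption made for this example, and the counts (12 defects in 100 cars) are borrowed from the law-firm example later in this chapter.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for a one-proportion z-interval.

    Hypothetical helper for illustration; not part of the original slides.
    """
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 of the area in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(12, 100)
print(f"95% confidence interval for p: ({lower:.3f}, {upper:.3f})")
```

Note that the interval uses the sample proportion in its standard error, whereas the hypothesis tests in this chapter use the hypothesized value p0.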

Inference

Propose a value for the population proportion, p.

Does the sample data support this value?

Comparing Confidence Intervals with Hypothesis Tests

Confidence Interval: A level of confidence is chosen. We determine a range of possible values for the parameter that are consistent with the data (at the chosen confidence level).

Comparing Confidence Intervals with Hypothesis Tests

Hypothesis Test: Only one possible value for the parameter, called the hypothesized value, is tested. We determine the strength of the evidence provided by the data against the proposition that the hypothesized value is the true value.

Example - Hypothesis Test

A law firm will represent people in a class action lawsuit against a car manufacturer only if it is sure that more than 10% of the cars have a particular defect.

Population: Cars of a particular make and model.

Parameter: Proportion of this make and model of car that have a particular defect.

Example - Hypothesis Test

Test a claim about the population proportion, p

Start by formulating the hypotheses.

Null Hypothesis
• H0: p = 0.10

Alternative Hypothesis
• HA: p > 0.10

Parts to a Hypothesis Test

Null Hypothesis (H0): What the model is believed to be.
• H0: p = p0
• Ex: Fair coin: H0: p = 0.5

Alternative Hypothesis (Ha): The claim you would like to find evidence for.
• Ha: p < p0
• Ha: p > p0
• Ha: p ≠ p0
• Ex: Fair coin?: Ha: p ≠ 0.5

Writing Hypotheses

In the 1950s only about 40% of high school graduates went on to college. Has the percentage changed?
H0: p = 0.4 vs. Ha: p ≠ 0.4

Is a coin fair?
H0: p = 0.5 vs. Ha: p ≠ 0.5

Only about 20% of people who try to quit smoking succeed. Sellers of a motivational tape claim that listening to the recorded messages can help people quit.
H0: p = 0.2 vs. Ha: p > 0.2

Writing Hypotheses

A governor is concerned about his “negatives” (the percentage of state residents who express disapproval of his job performance). His political committee pays for a series of TV ads, hoping that they can keep the negatives below 30%. They will use follow-up polling to assess the ads’ effectiveness.
H0: p = 0.3 vs. Ha: p < 0.3

Coke will market its new zero-calorie soft drink only if it is sure that more than 60% of people like the flavor.
H0: p = 0.6 vs. Ha: p > 0.6

Relationship Between H0 and Ha

Law & Order

We assume people accused of a crime are innocent until proven guilty.
• H0: person is innocent

You, as the prosecutor, must gather enough evidence to prove that the person accused is guilty beyond a shadow of a doubt.
• Ha: person is not innocent.

Example - Hypothesis Test

Next, take a random sample and calculate p̂.

The law firm contacts 100 car owners at random and finds out that 12 of them have cars that have the defect. Thus p̂ = 12/100 = 0.12.

• Is this sufficient evidence for the law firm to proceed with the class action lawsuit?

Ask: How likely is it that our p̂ came from a population with proportion p0?
• If it is likely, then we have no reason to question the value of p0.
• If it is unlikely, then we do have reason to question the value of p0.


Check your assumptions!
• np0 and nq0 are both greater than or equal to 10 (where q0 = 1 - p0)
• n is less than 10% of the population
• Random sample
• Independent values
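The two numeric conditions can be checked mechanically. The sketch below is one possible way to do so; the function name `check_conditions` and its `population_size` parameter are assumptions made for illustration. Randomness and independence depend on how the data were collected and cannot be verified from the numbers alone.

```python
def check_conditions(n, p0, population_size=None):
    """Return {condition: bool} for the numeric one-proportion z-test checks.

    Hypothetical helper for illustration; not part of the original slides.
    """
    q0 = 1 - p0
    results = {
        "np0 >= 10": n * p0 >= 10,
        "nq0 >= 10": n * q0 >= 10,
    }
    # The 10% condition can only be checked when the population size is known.
    if population_size is not None:
        results["n < 10% of population"] = n < 0.10 * population_size
    return results


# Law-firm example: n = 100 car owners, hypothesized proportion p0 = 0.10.
law_firm = check_conditions(n=100, p0=0.10)
for condition, satisfied in law_firm.items():
    print(f"{condition}: {satisfied}")
```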


Sampling Distribution of p̂ if the Null Hypothesis Is True

$\hat{p} \sim N\left(p_0,\ \sqrt{\dfrac{p_0(1-p_0)}{n}}\right)$


Sampling distribution of p̂

Shape: approximately normal because the 10% condition and the success/failure condition are satisfied.

Mean: p = 0.10 (because we assume H0 is true)

Standard Deviation: $\sqrt{\dfrac{0.10(0.90)}{100}} = 0.03$


Calculate a Test Statistic

$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}}$

$z = \dfrac{0.12 - 0.10}{\sqrt{\dfrac{0.10(0.90)}{100}}} = \dfrac{0.02}{0.03} = 0.67$


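Use Z-Table

For z = 0.67, read row 0.6 and column 0.07 of the standard normal table: P(Z < 0.67) = 0.7486.

[Figure: normal curve for p̂ centered at 0.10, horizontal axis from 0.00 to 0.20, with p̂ = 0.12 marked and an area of 0.7486 to its left.]

P(Z > 0.67) = 1 - 0.7486 = 0.2514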
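The table lookup can be cross-checked in software. The sketch below, which assumes Python's standard-library `statistics.NormalDist` in place of the printed table, recomputes the law-firm test statistic and its upper-tail area. Because it does not round z to two decimals first, its p-value differs slightly from the table-based 0.2514, but both are roughly 0.25.

```python
from math import sqrt
from statistics import NormalDist

n = 100        # car owners sampled
p0 = 0.10      # hypothesized proportion under H0
p_hat = 0.12   # observed sample proportion (12 out of 100)

# Standard deviation of p-hat, computed from p0 because H0 is assumed true.
sd = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / sd

# Ha: p > p0, so the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(f"sd = {sd:.3f}, z = {z:.2f}, p-value = {p_value:.4f}")
```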


Interpretation

Getting a sample proportion of 0.12 or more will happen about 25% of the time (P-value = 0.25) when taking a random sample of 100 from a population whose proportion is p = 0.10.

Interpretation

Getting a sample proportion of 0.12 is consistent with random sampling from a population with proportion p = 0.10. This sample result does not contradict the null hypothesis. The P-value is not small; therefore, fail to reject H0.

Parts to a Hypothesis Test

P-value = The probability of getting the observed statistic (i.e., p̂) or one that is more extreme, given that the null hypothesis is true.

Ha: p < p0 (one-sided test)
p-value = P(Z < z)

Ha: p > p0 (one-sided test)
p-value = P(Z > z) = P(Z < -z)

Ha: p ≠ p0 (two-sided test)
p-value = 2*P(Z > |z|) = 2*P(Z < -|z|)
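The three p-value rules can be written as a single small function. This is a minimal sketch using the standard library; the function name `p_value` and the labels "less", "greater", and "two-sided" are assumptions chosen for this illustration. All three cases use the lower-tail forms P(Z < z), P(Z < -z), and 2*P(Z < -|z|).

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value of a z statistic for the given alternative hypothesis.

    Hypothetical helper for illustration; not part of the original slides.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":        # Ha: p < p0
        return std_normal.cdf(z)
    if alternative == "greater":     # Ha: p > p0, using P(Z > z) = P(Z < -z)
        return std_normal.cdf(-z)
    if alternative == "two-sided":   # Ha: p != p0, using 2 * P(Z < -|z|)
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


for alt in ("less", "greater", "two-sided"):
    print(f"z = 1.90, Ha {alt}: p-value = {p_value(1.90, alt):.4f}")
```

The same z gives three different p-values, which is why the direction of Ha has to be fixed before looking at the data.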

Parts to a Hypothesis Test

Decision
• Small p-values mean there is evidence that the null hypothesis is incorrect.
• Large p-values mean there is no evidence that the null hypothesis is incorrect.

What values are considered small or large?
• Alpha level (significance level) = α
• Typical values: 0.01, 0.05, 0.10


Decision (in terms of H0)

Reject H0
• When the p-value is smaller than α
• Enough evidence exists to say that H0 is most likely incorrect.

Do not reject H0
• When the p-value is larger than α
• Not enough evidence exists to say that H0 is incorrect.


Conclusion (in terms of Ha)

If we reject H0, the conclusion would be: There is evidence in favor of Ha

If we fail to reject H0, the conclusion would be: There is not enough evidence in favor of Ha


Decision

Just remember one phrase: “If the p-value < α, reject H0”

Conclusion
• What have you decided about p?
• Stated in terms of the problem


Step 1: Hypotheses Specify

Step 2: Test Statistic

Step 3: P-value

Step 4: Decision & Conclusion

Example #1

Many people have trouble programming their VCRs, so a company has developed what it hopes will be easier instructions. The goal is to have at least 90% of all customers succeed. The company tests the new system on 200 randomly selected people, and 188 of them were successful. Do you think the new system meets the company’s goal?

Example #1

Step 1:

Population parameter of concern:
• p = proportion of people who successfully program their VCRs.

Hypotheses:
• H0: p = 0.9
• Ha: p > 0.9
• We want to test this hypothesis at the α = 0.05 level.

$\hat{p} = \dfrac{188}{200} = 0.94$

Example #1

Step 2: Assumptions:
• Random sample
• Independence
• np0 = 200(0.9) = 180 > 10
• nq0 = 200(0.1) = 20 > 10
• n = 200 is less than 10% of the population size

Example #1

Step 2:

Model: $\hat{p} \sim N\left(0.9,\ \sqrt{\dfrac{(0.9)(0.1)}{200}} = 0.021\right)$

Test Statistic: $z = \dfrac{0.94 - 0.90}{0.021} = 1.90$

Example #1

Step 3: P-value:
• P(Z > z) = P(Z > 1.90) = 0.0287
• If p-value < α, reject H0.
• Is there strong enough evidence to reject H0?
• If we want strong evidence (beyond a shadow of a doubt), α should be small.

Example #1

Step 4: Decision:
• 0.0287 = p-value < α = 0.05, so reject H0.

Conclusion:
• There is evidence that more than 90% of customers succeed in programming their VCRs with the new instructions, so the new system appears to meet the company’s goal.

Example #1

What is the interpretation of the p-value in the context of the problem?

If the true proportion of people that can successfully program their VCR with the new instructions is 90%, the probability of getting a sample proportion of 94% or higher (more extreme) is about 2.9% (i.e. not very likely).

Example #2

In 1991, the state of New Mexico became concerned that its DWI rate was considerably above the national average. The national average that year was 0.00809. Suppose they set up roadblocks to allow them to randomly select drivers and record (and arrest) the number who were above the legal blood alcohol level. Out of a random sample of 100,000 drivers, 2213 were above the limit (and subsequently arrested). Was there strong evidence that the DWI rate in New Mexico was higher than the national average?

Example #2

Step 1:

Parameter of interest:
• p = proportion of New Mexicans that have a blood alcohol level above the limit.

Hypotheses:
• H0: p = 0.00809
• Ha: p > 0.00809

Example #2

Step 2: Check the necessary assumptions:
• np0 = 100,000(0.00809) = 809
• nq0 = 100,000(0.99191) = 99,191
• The population of New Mexico in 1991 was 1,547,115. Our sample size of 100,000 is less than 10% of the population.
• Random sample
• Independence

Example #2

Step 2: Model: $\hat{p} \sim N(0.00809,\ 0.000283)$

$\hat{p} = \dfrac{2213}{100000} = 0.02213$

Test Statistic: $z = \dfrac{\hat{p} - \mu(\hat{p})}{\sigma(\hat{p})} = \dfrac{0.02213 - 0.00809}{0.000283} = 49.61$

Example #2

Step 3: P-value:
• P(Z > 49.61) = P(Z < -49.61) ≈ 0 (< 0.0001)

Step 4: Decision:
• Use α = 0.05
• P-value < α, so reject H0.

Example #2

Step 4: Conclusion:
• There is enough evidence to say that New Mexico’s DWI rate is probably higher than the national average.
• Does driving in New Mexico cause you to be drunk?
• No, we are providing statistical inference based on data (evidence) gathered.

Example #2

What does the p-value mean in the context of this problem?

If the true proportion of New Mexicans that have a blood alcohol level above the legal limit is 0.809%, the probability of getting a sample proportion of 2.2% or higher (more extreme) is almost 0 (i.e. very unlikely).

General Notes

Always list both the null and alternative hypotheses for each problem.

Remember that the null states a value for the population parameter p.
• The null arises from the context of the problem, not from the sample.
• We start by assuming that the null is true.
• If we find evidence, we can reject the null, but we never accept the null. We can fail to reject the null.

The alternative states what your alternate assumption is if you reject the null.

General Notes

Know how to determine whether you should use a one-sided or two-sided model.
• It depends upon how the question is worded.
• The z-statistic will be the same for each model, but the final p-value will change.
• Know the way to determine the p-value in each case.

Always remember to interpret your conclusion in terms of the problem.
• State what the outcome is and what the likelihood of it occurring is.

Example #3

A large company hopes to improve satisfaction, setting as a goal no more than 5% negative comments. A random survey of 350 customers found only 13 with complaints. Is the company meeting its goal?

Example #3

Step 1: Population parameter of concern

• p = proportion of dissatisfied customers

H0: p = 0.05

Ha: p < 0.05

Example #3

Step 2: Assumptions
• np0 = 350(0.05) = 17.5
• nq0 = 350(0.95) = 332.5
• The company is large, so 350 is probably less than 10% of all of their customers
• Sample was random
• Customers are independent

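Example #3

Step 2:

$\hat{p} = \dfrac{13}{350} = 0.037$

Model: $\hat{p} \sim N\left(0.05,\ \sqrt{\dfrac{(0.05)(0.95)}{350}}\right)$, that is, $\hat{p} \sim N(0.05,\ 0.012)$

Test Statistic: $z = \dfrac{0.037 - 0.05}{0.012} = -1.08$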

Example #3

Step 3: P-value: P(Z < -1.08) = 0.1401

Step 4: Decision:
• Since p-value = 0.1401 > α = 0.05, fail to reject H0.

Conclusion:
• There is not enough evidence that the company is meeting its goal of receiving less than 5% negative comments.

Example #3

What is the interpretation of the p-value in the context of this problem?

If the true proportion of customers that are dissatisfied is 5%, the probability of getting a sample proportion of 3.7% or less (more extreme) is about 14%.

Example #4

An airline’s public relations department says that the airline rarely loses passengers’ luggage. It further claims that on those occasions when luggage is lost, 90% is recovered and delivered to its owner within 24 hours. A consumer group who surveyed a large number of air travelers found that only 103 out of 122 people who lost luggage on that airline were reunited with the missing items by the next day. What do you think about the airline’s claim? Use α = 0.05.

Example #4

Step 1: Population parameter of concern:

• p = proportion of people who lost their luggage and had it returned within 24 hours

H0: p = 0.9

Ha: p < 0.9

Example #4

Step 2: Assumptions
• np0 = 122(0.9) = 109.8
• nq0 = 122(0.1) = 12.2
• 122 is less than 10% of all people who have ever lost luggage on this airline.
• Random sample
• Independent values

Example #4

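Step 2: Model: $\hat{p} \sim N\left(0.9,\ \sqrt{\dfrac{0.9(0.1)}{122}}\right) = N(0.9,\ 0.027)$

Test Statistic: $z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \dfrac{0.844 - 0.9}{0.027} = -2.07$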

Example #4

Step 3: P-value: P(Z < -2.07) = 0.0192

Example #4

Step 4: Decision:
• p-value = 0.0192 < α = 0.05, reject H0

Conclusion: There is evidence that the population proportion of people who lost their luggage and have it returned within 24 hours on this airline is less than 90%. The airline’s claim is probably not true.

Example #4

What is the interpretation of the p-value in the context of this problem?

If the true proportion of people who lost their luggage and had it returned within 24 hours is 90%, then the probability of getting a sample proportion of 84% or less (more extreme) is about 1.9% (pretty unlikely).
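The hand calculations in Examples #1 through #4 round the standard deviation and p̂ before dividing, so software that keeps full precision gives slightly different z-values. The sketch below is one way to re-run the four one-sided tests without intermediate rounding. The function name `one_sided_z_test` and the dictionary labels are assumptions made for this illustration. Under these assumptions the reject / fail-to-reject decisions at α = 0.05 come out the same as in the worked examples.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_z_test(successes, n, p0, alternative, alpha=0.05):
    """One-sided one-proportion z-test; returns (z, p_value, reject_h0).

    Hypothetical helper for illustration; not part of the original slides.
    """
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Upper tail for Ha: p > p0 (via P(Z < -z)); lower tail for Ha: p < p0.
    p_value = NormalDist().cdf(-z if alternative == "greater" else z)
    return z, p_value, p_value < alpha


# (successes, n, p0, direction of Ha), taken from Examples #1 to #4.
examples = {
    "#1 VCR instructions": (188, 200, 0.90, "greater"),
    "#2 New Mexico DWI": (2213, 100_000, 0.00809, "greater"),
    "#3 customer complaints": (13, 350, 0.05, "less"),
    "#4 lost luggage": (103, 122, 0.90, "less"),
}

results = {name: one_sided_z_test(*args) for name, args in examples.items()}

for name, (z, p_value, reject) in results.items():
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{name}: z = {z:.2f}, p-value = {p_value:.4f}, {decision}")
```

For Example #2 the tail area is too small for floating-point arithmetic to represent, which matches the slide's report of a p-value of about 0 (< 0.0001).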
