Chapter 20
Testing Hypotheses About Proportions
Confidence Interval
Confidence Interval for p
$\hat{p} - z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$ to $\hat{p} + z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
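The interval for p can be sketched in a few lines of Python. This is only an illustration: the counts (12 of 100) are borrowed from the law firm example later in this chapter, the 95% confidence level is an assumption, and the function name `proportion_ci` is made up for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (low, high) for p-hat -/+ z* sqrt(p-hat (1 - p-hat) / n)."""
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Assumed inputs: 12 successes out of 100, 95% confidence
low, high = proportion_ci(12, 100)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```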
Confidence Interval
Plausible values for the unknown population proportion, p.
We have confidence in the process that produced this interval.
Inference
Propose a value for the population proportion, p.
Does the sample data support this value?
Comparing Confidence Intervals with Hypothesis Tests
Confidence Interval: A level of confidence is chosen. We determine a range of possible values for the parameter that are consistent with the data (at the chosen confidence level).
Comparing Confidence Intervals with Hypothesis Tests
Hypothesis Test: Only one possible value for the parameter, called the hypothesized value, is tested. We determine the strength of the evidence provided by the data against the proposition that the hypothesized value is the true value.
Example - Hypothesis Test
A law firm will represent people in a class action lawsuit against a car manufacturer only if it is sure that more than 10% of the cars have a particular defect.
Population: Cars of a particular make and model.
Parameter: Proportion of this make and model of car that have a particular defect.
Example - Hypothesis Test
Test a claim about the population proportion, p.
Start by formulating the hypotheses.
Null Hypothesis
• H0: p = 0.10
Alternative Hypothesis
• HA: p > 0.10
Parts to a Hypothesis Test
Null Hypothesis (H0)
What the model is believed to be
H0: p = p0
• Ex: Fair coin: H0: p = 0.5
Parts to a Hypothesis Test
Alternative Hypothesis (Ha)
Claim you would like to prove
Ha: p < p0
Ha: p > p0
Ha: p ≠ p0
• Ex: Fair coin?: Ha: p ≠ 0.5
Writing Hypotheses
In the 1950s only about 40% of high school graduates went on to college. Has the percentage changed?
H0: p = .4 vs. Ha: p ≠ .4
Is a coin fair?
H0: p = .5 vs. Ha: p ≠ .5
Only about 20% of people who try to quit smoking succeed. Sellers of a motivational tape claim that listening to the recorded messages can help people quit.
H0: p = .2 vs. Ha: p > .2
Writing Hypotheses
A governor is concerned about his “negatives” (the percentage of state residents who express disapproval of his job performance). His political committee pays for a series of TV ads, hoping that they can keep the negatives below 30%. They will use follow-up polling to assess the ads’ effectiveness.
H0: p = .3 vs. Ha: p < .3
Coke will market its new zero-calorie soft drink only if it is sure that more than 60% of people like the flavor.
H0: p = .6 vs. Ha: p > .6
Relationship Between H0 and Ha
Law & Order
We assume people accused of a crime are innocent until proven guilty.
• H0: person is innocent
You, as the prosecutor, must gather enough evidence to prove that the person accused is guilty beyond a shadow of a doubt.
• Ha: person is not innocent.
Example - Hypothesis Test
Next, take a random sample and calculate p̂.
The law firm contacts 100 car owners at random and finds out that 12 of them have cars that have the defect. Thus p̂ = 0.12.
• Is this sufficient evidence for the law firm to proceed with the class action lawsuit?
Ask: How likely is it that our p̂ came from a population with proportion p0?
If it is likely, then we have no reason to question the value of p0.
If it is unlikely, then we do have reason to question the value of p0.
Example - Hypothesis Test
Check your assumptions!
• np0 and nq0 are greater than or equal to 10 (where q0 = 1 - p0)
• n is less than 10% of the population
• random sample
• independent values
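The two numeric conditions can be checked mechanically; randomness and independence cannot, since they depend on how the data were collected. The sketch below is illustrative only: `check_conditions` is a hypothetical helper name, and the population size is optional because the slides do not give one for the car example.

```python
def check_conditions(n, p0, population_size=None):
    """Check the success/failure and 10% conditions for a one-proportion z-test."""
    tol = 1e-9  # guards against floating-point noise in products such as 100 * 0.10
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    results = {
        "success_failure": (expected_successes >= 10 - tol
                            and expected_failures >= 10 - tol),
    }
    # The 10% condition can only be checked when the population size is known
    if population_size is not None:
        results["ten_percent"] = n < 0.10 * population_size
    return results


# Law firm example: n = 100 cars, hypothesized p0 = 0.10, population size unknown
conditions = check_conditions(100, 0.10)
print(conditions)
```

Random sampling and independence still have to be argued from the study design.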
Example - Hypothesis Test
Sampling Distribution of p̂ if Null Hypothesis is True
$\hat{p} \sim N\left(p_0,\ \sqrt{\frac{p_0(1-p_0)}{n}}\right)$
Example - Hypothesis Test
Sampling distribution of p̂
Shape: approximately normal because the 10% condition and the success/failure condition are satisfied.
Mean: p = 0.10 (because we assume H0 is true)
Standard Deviation: $\sqrt{\frac{0.10(0.90)}{100}} = 0.03$
Example - Hypothesis Test
Calculate a Test Statistic
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Example - Hypothesis Test
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.12 - 0.10}{\sqrt{\frac{0.10(0.90)}{100}}} = \frac{0.02}{0.03} = 0.67$
Example - Hypothesis Test
Use Z-Table
The entry in row 0.6, column 0.07 gives P(Z < 0.67) = 0.7486.
Example - Hypothesis Test
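[Figure: sampling distribution of p̂ centered at 0.10, horizontal axis from 0.00 to 0.20, with p̂ = 0.12 marked. Area to the left of 0.12: 0.7486. Area to the right: 1 - 0.7486 = 0.2514.]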
Example - Hypothesis Test
Interpretation
Getting a sample proportion of 0.12 or more will happen about 25% (P-value = 0.25) of the time when taking a random sample of 100 from a population whose population proportion is p = 0.10.
Example - Hypothesis Test
Interpretation
Getting a value of the sample proportion of 0.12 is consistent with random sampling from a population with proportion p = 0.10. This sample result does not contradict the null hypothesis. The P-value is not small, therefore fail to reject H0.
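The arithmetic for the law firm example can be reproduced without a Z-table. This is a rough check using Python's standard library: the slides round z to 0.67 before the table lookup, so the unrounded P-value here differs slightly from 0.2514 while still coming out near 25%.

```python
from math import sqrt
from statistics import NormalDist

n, p0, p_hat = 100, 0.10, 0.12

# Standard deviation of p-hat, computed under H0: p = 0.10
sd = sqrt(p0 * (1 - p0) / n)

# Test statistic: how many standard deviations p-hat lies above p0
z = (p_hat - p0) / sd

# Ha: p > 0.10, so the P-value is the area in the upper tail
p_value = 1 - NormalDist().cdf(z)

print(f"sd = {sd:.3f}, z = {z:.2f}, P-value = {p_value:.4f}")
```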
Parts to a Hypothesis Test
P-value = The probability of getting the observed statistic (i.e. p̂) or one that is more extreme, given that the null hypothesis is true.
Ha: p < p0 (one-sided test) p-value = P(Z < z)
Ha: p > p0 (one-sided test) p-value = P(Z > z) = P(Z < -z)
Ha: p ≠ p0 (two-sided test) p-value = 2*P(Z > |z|) = 2*P(Z < -|z|)
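The three cases above map directly onto the standard normal CDF. A minimal sketch follows; the function name `p_value` and the labels "less", "greater" and "two-sided" are naming choices made here, not part of the slides.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """P-value for a z statistic under the three forms of Ha."""
    cdf = NormalDist().cdf  # standard normal: mean 0, standard deviation 1
    if alternative == "less":        # Ha: p < p0
        return cdf(z)
    if alternative == "greater":     # Ha: p > p0
        return 1 - cdf(z)
    if alternative == "two-sided":   # Ha: p != p0
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")


print(round(p_value(0.67, "greater"), 4))
print(round(p_value(0.67, "two-sided"), 4))
```

The same z gives a two-sided P-value that is twice the one-sided tail area.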
Parts to a Hypothesis Test
Decision
Small p-values mean there is evidence that the null hypothesis is incorrect.
Large p-values mean there is not enough evidence that the null hypothesis is incorrect.
What values are considered small or large?
alpha level (significance level) = α
Typical values (0.01, 0.05, 0.10)
Parts to a Hypothesis Test
Decision (in terms of H0)
Reject H0
• When p-value is smaller than α
• Enough evidence exists to say that H0 is most likely incorrect.
Do not reject H0
• When p-value is larger than α
• Not enough evidence exists to say that H0 is incorrect.
Parts to a Hypothesis Test
Conclusion (in terms of Ha)
If we reject H0, the conclusion would be: There is evidence in favor of Ha
If we fail to reject H0, the conclusion would be There is not enough evidence in favor of Ha
Parts to a Hypothesis Test
Decision
Just remember one phrase: “If the p-value < α, reject H0”
Conclusion
What have you decided about p? Stated in terms of the problem.
Parts to a Hypothesis Test
Step 1: Hypotheses Specify
Step 2: Test Statistic
Step 3: P-value
Step 4: Decision & Conclusion
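The four steps can be strung together in one small function. This is a sketch rather than a reference implementation: the function name, the returned dictionary, and the coin data (60 heads in 100 flips) are all hypothetical; only the fair-coin hypotheses H0: p = 0.5 vs. Ha: p ≠ 0.5 come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative, alpha=0.05):
    """Steps 2-4 of the test, given the hypotheses from Step 1."""
    # Step 2: test statistic, using the standard deviation implied by H0
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    # Step 3: P-value, depending on the direction of Ha
    cdf = NormalDist().cdf
    if alternative == "less":
        p = cdf(z)
    elif alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")

    # Step 4: decision in terms of H0
    decision = "reject H0" if p < alpha else "fail to reject H0"
    return {"p_hat": p_hat, "z": z, "p_value": p, "decision": decision}


# Hypothetical data: 60 heads in 100 flips, H0: p = 0.5 vs. Ha: p != 0.5
coin = one_proportion_z_test(60, 100, 0.5, "two-sided")
print(coin)
```

The conclusion in terms of the problem still has to be written by hand; the function only reports the decision about H0.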
Example #1
Many people have trouble programming their VCRs, so a company has developed what it hopes will be easier instructions. The goal is to have at least 90% of all customers succeed. The company tests the new system on 200 randomly selected people, and 188 of them were successful. Do you think the new system meets the company’s goal?
Example #1
Step 1:
Population parameter of concern:
• p = proportion of people who successfully program their VCRs.
Hypotheses:
• H0: p = 0.9
• Ha: p > 0.9
• We want to test this hypothesis at the α = 0.05 level.
$\hat{p} = \frac{188}{200} = 0.94$
Example #1
Step 2:
Assumptions:
• Random sample
• Independence
• np0 = 200(0.9) = 180 > 10
• nq0 = 200(0.1) = 20 > 10
• n = 200 is less than 10% of the population size
Example #1
Step 2:
Model: $\hat{p} \sim N\left(0.9,\ \sqrt{\frac{(0.9)(0.1)}{200}} = 0.021\right)$
Test Statistic: $z = \frac{0.94 - 0.90}{0.021} = 1.90$
Example #1
Step 3:
P-value:
• P(Z > z) = P(Z > 1.90) = 0.0287
• If p-value < α, reject H0.
• Is there strong enough evidence to reject H0?
• If we want strong evidence (beyond a shadow of a doubt), α should be small.
Example #1
Step 4:
Decision:
• 0.0287 = p-value < α = 0.05, so reject H0.
Conclusion:
• There is evidence that the new system works in helping customers succeed in programming their VCRs.
Example #1
What is the interpretation of the p-value in the context of the problem?
If the true proportion of people that can successfully program their VCR with the new instructions is 90%, the probability of getting a sample proportion of 94% or higher (more extreme) is about 2.9% (i.e. not very likely).
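As a rough check of Example #1, the same numbers can be run without rounding the standard deviation to 0.021. Under that assumption z comes out a little below the 1.90 on the slide and the P-value a little above 0.0287, but the decision at α = 0.05 is unchanged.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0, alpha = 200, 188, 0.9, 0.05

p_hat = successes / n                  # 0.94
sd = sqrt(p0 * (1 - p0) / n)           # unrounded; the slide uses 0.021
z = (p_hat - p0) / sd
p_value = 1 - NormalDist().cdf(z)      # Ha: p > 0.9, upper tail

print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```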
Example #2
In 1991, the state of New Mexico became concerned that its DWI rate was considerably above the national average. The national average that year was .00809. Suppose they set up road blocks to allow them to randomly select drivers and record (and arrest) the number who were above the legal blood alcohol level. Out of a random sample of 100,000 drivers, 2213 were above the limit (and subsequently arrested). Was there strong evidence that the DWI rate in New Mexico was higher than the national average?
Example #2
Step 1:
Parameter of interest:
• p = proportion of New Mexicans that have blood alcohol level above the limit.
Hypotheses:
• H0: p = 0.00809
• Ha: p > 0.00809
Example #2
Step 2:
Check the necessary assumptions:
• np0 = 100,000(0.00809) = 809
• nq0 = 100,000(0.99191) = 99191
• The population of New Mexico in 1991 was 1,547,115. Our sample size of 100,000 is less than 10% of the population.
• Random sample
• Independence
Example #2
Step 2:
Model: $\hat{p} \sim N(0.00809,\ 0.000283)$
$\hat{p} = \frac{2213}{100000} = 0.02213$
Test Statistic: $z = \frac{\hat{p} - \mu(\hat{p})}{\sigma(\hat{p})} = \frac{0.02213 - 0.00809}{0.000283} = 49.61$
Example #2
Step 3:
P-value:
• P(Z > 49.61) = P(Z < -49.61) = 0 (or < 0.0001)
Step 4:
Decision:
• Use α = 0.05
• P-value < α, reject H0.
Example #2
Step 4:
Conclusion:
• There is enough evidence to say that New Mexico’s DWI rate is probably higher than the national average.
• Does driving in New Mexico cause you to be drunk?
• No, we are providing statistical inference based on data (evidence) gathered.
Example #2
What does the p-value mean in the context of this problem?
If the true proportion of New Mexicans that have a blood alcohol level above the legal limit is 0.809%, the probability of getting a sample proportion of 2.2% or higher (more extreme) is almost 0 (i.e. very unlikely).
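A quick numeric check of Example #2, offered as a sketch: with the unrounded standard deviation, z lands very close to the 49.61 on the slide, and the tail area is so small that double-precision arithmetic reports it as essentially zero, in line with "0 (or < 0.0001)".

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 100_000, 2213, 0.00809

p_hat = successes / n                  # 0.02213
sd = sqrt(p0 * (1 - p0) / n)           # about 0.000283
z = (p_hat - p0) / sd

# Ha: p > 0.00809. By symmetry P(Z > z) = P(Z < -z); this far out in the
# tail the result is too small to distinguish from zero.
p_value = NormalDist().cdf(-z)

print(f"sd = {sd:.6f}, z = {z:.2f}, P-value = {p_value}")
```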
General Notes
Always list both the null and alternative hypotheses for each problem.
Remember that the null states a value for the population parameter p.
• The null arises from the context of the problem, not from the sample.
• We start by assuming that the null is true.
• If we find evidence, we can reject the null, but we never accept the null. We can fail to reject the null.
The alternative states what your alternate assumption is if you reject the null.
General Notes
Know how to determine whether you should use a one-sided or two-sided model. It depends upon how the question is worded.
The z-statistic will be the same for each model, but the final p-value will change. Know the way to determine the p-value in each case.
Always remember to interpret your conclusion in terms of the problem.
State what the outcome is and what the likelihood of it occurring is.
Example #3
A large company hopes to improve satisfaction, setting as a goal no more than 5% negative comments. A random survey of 350 customers found only 13 with complaints. Is the company meeting its goal?
Example #3
Step 1: Population parameter of concern
• p = proportion of dissatisfied customers
H0: p = 0.05
Ha: p < 0.05
Example #3
Step 2:
Assumptions
• 350(0.05) = 17.5
• 350(0.95) = 332.5
• The company is large, so 350 is probably less than 10% of all of their customers
• Sample was random
• Customers are independent
Example #3
Step 2:
Model: $\hat{p} \sim N\left(0.05,\ \sqrt{\frac{(0.05)(0.95)}{350}}\right) = N(0.05,\ 0.012)$
$\hat{p} = \frac{13}{350} = 0.037$
Test Statistic: $z = \frac{0.037 - 0.05}{0.012} = -1.08$
Example #3
Step 3:
P-value: P(Z < -1.08) = 0.1401
Step 4:
Decision:
• Since p-value = 0.1401 > α = 0.05, fail to reject H0
Conclusion:
• There is not enough evidence that the company is meeting its goal of receiving less than 5% negative comments.
Example #3
What is the interpretation of the p-value in the context of this problem?
If the true proportion of customers that are dissatisfied is 5%, the probability of getting a sample proportion of 3.7% or less (more extreme) is about 14%.
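Example #3 is a lower-tail test, so the P-value is the area to the left of z. A short sketch with unrounded intermediate values follows; it gives z near -1.10 instead of the slide's -1.08, and a P-value of roughly 13 to 14%, which leads to the same decision.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0, alpha = 350, 13, 0.05, 0.05

p_hat = successes / n                  # about 0.037
sd = sqrt(p0 * (1 - p0) / n)           # unrounded; the slide uses 0.012
z = (p_hat - p0) / sd                  # negative: p-hat is below p0
p_value = NormalDist().cdf(z)          # Ha: p < 0.05, lower tail

print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```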
Example #4
An airline’s public relations department says that the airline rarely loses passengers’ luggage. It further claims that on those occasions when luggage is lost, 90% is recovered and delivered to its owner within 24 hours. A consumer group who surveyed a large number of air travelers found that only 103 out of 122 people who lost luggage on that airline were reunited with the missing items by the next day. What do you think about the airline’s claim? Use α = 0.05
Example #4
Step 1: Population parameter of concern:
• p = proportion of people who lost their luggage and had it returned within 24 hours
HO: p = 0.9
HA: p < 0.9
Example #4
Step 2: Assumptions
• 122(0.9) = 109.8
• 122(0.1) = 12.2
• 122 is less than 10% of all people who have ever lost luggage on this airline.
• Random sample
• Independent values
Example #4
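Step 2:
Model: $\hat{p} \sim N\left(0.9,\ \sqrt{\frac{0.9(0.1)}{122}}\right) = N(0.9,\ 0.027)$
Test Statistic: $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.844 - 0.9}{0.027} = -2.07$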
Example #4
Step 3: P-value
$P(Z < -2.07) = 0.0192$
Example #4
Step 4: Decision:
• p-value = 0.0192 < α = 0.05, reject HO
Conclusion: There is evidence that the population proportion of people who lost their luggage and had it returned within 24 hours on this airline is less than 90%. The airline’s claim is probably not true.
Example #4
What is the interpretation of the p-value in the context of this problem?
If the true proportion of people who lost their luggage and had it returned within 24 hours is 90%, then the probability of getting a sample proportion of 84% or less (more extreme) is about 1.9% (pretty unlikely).
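The airline example can be checked the same way. This sketch carries full precision through the calculation, so z is about -2.05 where the slide has -2.07 and the P-value is about 2.0%; the decision at α = 0.05 stays the same.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0, alpha = 122, 103, 0.9, 0.05

p_hat = successes / n                  # about 0.844
sd = sqrt(p0 * (1 - p0) / n)           # unrounded; the slide uses 0.027
z = (p_hat - p0) / sd
p_value = NormalDist().cdf(z)          # Ha: p < 0.9, lower tail

print(f"z = {z:.3f}, P-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```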