Chapter 10 Handouts
Section 10.1 The Language of Hypothesis Testing
The big picture is:
Use a random sample to make inferences, or to learn
something about the larger population.
Example: A soft-drink company may claim that, on average, its cans contain
12 ounces of soda. A government agency may want to test whether or not such
cans really do contain, on average, 12 ounces of soda, or whether they are
“short-changing” the consumers by putting less than that amount in the cans. A
sample of 100 cans of the soft drink that is being investigated is taken. The
mean amount of soda in the 100 cans is 11.89 ounces. That is less than the
claimed amount of 12 ounces, so can we conclude that the soft-drink company is
lying to the public?
Two ways to do this:
1. Confidence intervals: Use sample data to estimate a population value.
Provides a range of plausible values for the population
parameter.
2. Hypothesis testing: Use sample data to test a specific claim about a
population. Provides a conclusion about a specific claimed value
for the population parameter, whether an assumed value is plausible or
not.
This is the heart of hypothesis testing: we make an assumption about reality, for
example the true value of the population mean or proportion. We gather sample
data. Then we compare the sample data to the assumed value, and see if it
supports the assumption.
In order to run a hypothesis test, you need TWO hypotheses, the null
hypothesis and the alternative hypothesis. A hypothesis is a
statement regarding a characteristic of a population. We will run hypothesis tests
on two parameters:
1. A proportion (Section 10.2)
2. A mean (Section 10.3)
Null Hypothesis:
Represented by H0.
Statement that a population parameter is equal to a specific value.
This hypothesis is assumed to be true UNLESS we have convincing
evidence to the contrary.
It is based on the expectation of no change, status quo,
“same old, same old”.
Often, hypothesis tests are designed in such a manner that we
hope to reject the null hypothesis.
The null hypothesis is always written in this form:
H0: µ = µ0 or H0: p = p0
where µ0 and p0 are the hypothesized values for the
population mean or proportion.
Alternative Hypothesis:
Represented by H1.
Statement that the population parameter is different from
the value stated in the null hypothesis.
It can be not equal to, less than, or greater than that value.
It is often the conclusion that we are TRYING to make.
There are three forms for the alternative hypothesis (see the table below).
Form of Alternative Hypothesis, Type of test, Purpose of test

H1: µ ≠ µ0 or p ≠ p0 (two-tailed test)
To decide whether the population mean or proportion is different from
the hypothesized value, µ0 or p0. In other words, it could be either
larger or smaller.

H1: µ < µ0 or p < p0 (left-tailed test)
To decide whether the population mean or proportion is less than
the hypothesized value, µ0 or p0. In other words, we already have
reason to believe it is less, we just want to test it.

H1: µ > µ0 or p > p0 (right-tailed test)
To decide whether the population mean or proportion is greater than
the hypothesized value, µ0 or p0. In other words, we already have
reason to believe it is more, we just want to test it.
Example: Soft drinks
The company is claiming that the cans contain 12 oz. of soda on average,
therefore the null hypothesis is that:
H0: µ = 12 oz.
The government agency is concerned that the manufacturer is putting LESS
THAN that amount into the cans, therefore the alternative hypothesis would be:
H1: µ < 12 oz.
The Logic of Hypothesis Testing
Start with an initial assumption (about population proportion or mean).
Collect sample data.
Find the point estimate from the sample data.
Compare the point estimate to the hypothesized population value,
using a standard mathematical procedure.
Decide if the hypothesized population value is reasonable.
There are basically two outcomes:
1. The sample data are consistent with the hypothesized
value. The sample mean or proportion is not that different
from the hypothesized population mean or proportion. Therefore, do
NOT reject the null hypothesis.
2. The sample data are not consistent with the
hypothesized value. The sample mean or proportion is significantly
different from the hypothesized mean or proportion, and more supportive
of the alternative hypothesis. In that case we reject the null hypothesis,
and accept the alternative hypothesis.
Notice that we have to have convincing evidence that the
null hypothesis is not true, before we end up rejecting it.
Example: Innocent until proven guilty!
Consider a jury trial:
H0: The defendant is not guilty.
H1: The defendant is guilty.
The null hypothesis is always assumed to be true, until we have
evidence to the contrary.
In a trial, the prosecution has to provide convincing evidence, “beyond a
reasonable doubt”, that the person is in fact guilty.
Source: Introductory Statistics, 8th edition, Prem S. Mann
A statistical hypothesis test works exactly like this!
There is some critical point, which can be calculated, beyond which the null
hypothesis is rejected.
Type I and II errors
There’s always a chance of making a mistake, and coming to the wrong
conclusion! We are making our decision based on sample data.
Four outcomes:
#1: The null hypothesis is true, but we reject it.
#2: The null hypothesis is false, and we reject it.
#3: The null hypothesis is false, but we fail to reject it.
#4: The null hypothesis is true, and we do not reject it.
Type I error: Rejecting the null hypothesis when it is actually true. The
probability of making a Type I error is called the level of significance, α,
of a hypothesis test.
Type II error: Not rejecting the null hypothesis when it is in fact false.
The probability of that happening is β.
Ideally, the probabilities of both of these types of errors would be small.
However, there is a relationship between the two errors:
For a fixed sample size, the smaller the Type I error (α) is, the
larger the probability of a Type II error (β).
Example: Criminal Trial Analogy
H0: defendant is not guilty
H1: defendant is guilty
Four possible outcomes:
1. Type I error: H0 is true, but it is rejected.
2. Correct decision: H0 is true, and it is NOT rejected.
3. Type II error: H0 is false, but it is NOT rejected.
4. Correct decision: H0 is false, and it IS rejected.
Which error (Type I or Type II) is more serious?
Star Trek – to Vaporize or NOT to Vaporize? (Type I and II errors)
“He’s dead, Jim,” said Dr. McCoy to Captain Kirk.
Mr. Spock, as the science officer, is put in charge of statistically determining the
correctness of Dr. McCoy’s statement and deciding the fate of the crew member (if dead
– to vaporize, if alive – to try and revive).
1. Presumably, the crew member was alive to start with, so “being alive” represents
the condition of “no change”, or the “status quo”, the assumed starting condition.
Use this information to formulate a null hypothesis and an alternative hypothesis:
H0:
H1:
2. Referring to the four outcomes listed in the Criminal Trial Analogy above, describe the four
possible outcomes as a result of Spock’s decision about whether the crew member is dead or alive.
1. Type I error:
2. Correct decision
3. Type II error:
4. Correct decision
3. Which of the two types of errors is more serious in this case, and why??
Possible Conclusions for a Hypothesis Test
After running a hypothesis test, there are in general two possible conclusions:
1. The null hypothesis is rejected. In this case, we
have concluded that the data provide sufficient evidence to support the
alternative hypothesis.
2. The null hypothesis is NOT rejected. In this case, we conclude that
the data do NOT provide sufficient evidence to support the
alternative hypothesis.
Note in this case: It does NOT mean that we proved the null hypothesis to
be true! It simply means that we are going to “reserve
judgment” about which hypothesis is true. It is impossible to
prove the null hypothesis! It just means that the sample data wasn’t
enough to reject it.
KEY: Stating the conclusion:
There (is / is not) sufficient evidence at the α level of
significance to conclude that “insert statement in alternative hypothesis”.
Note: If we did NOT reject H0, then there is not sufficient evidence. If we DID reject H0, then there is sufficient evidence.
Example: Jury Trial
H0: The person is not guilty.
H1: The person is guilty.
If the jury comes back with a “not guilty” verdict: There is not
sufficient evidence (at the α level of significance) to conclude that
“the person is guilty”.
Section 10.2 Hypothesis Tests for a Population Proportion
Test Statistic
The test statistic is:
A number that we calculate from the sample data.
It is a z-score, so in other words we are
converting the data to the standard normal distribution.
It shows how far the sample value is from the
hypothesized value.
The hypothesized value is always at the center of the
distribution, where z = 0.
Use the test statistic to make a decision about whether or not to
reject the null hypothesis.
o If the sample data is very far away from the hypothesized value
conclude that the null hypothesis is probably wrong (reject it).
o If the sample data is not that far away from the hypothesized value
conclude that the null hypothesis is not necessarily
wrong (do not reject it).
The test statistic for this section is as follows:
For proportion:
$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$
where p̂ = x/n is the sample proportion, p0 is the hypothesized proportion, and n is the sample size.
Compare this test statistic to the hypothesized value of the proportion, and see
how far away it is from that value.
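As a rough illustration of the test statistic for a proportion, the calculation can be sketched in a few lines of Python. The function name `one_proportion_z` and the sample numbers (55 successes out of 100, tested against p0 = 0.5) are made up for this sketch and do not come from the handout.

```python
from math import sqrt


def one_proportion_z(x, n, p0):
    """Return z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    p_hat = x / n                             # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)  # spread assumed under H0
    return (p_hat - p0) / standard_error


# Hypothetical data: 55 successes in 100 trials, H0: p = 0.5
z0 = one_proportion_z(x=55, n=100, p0=0.5)
print(round(z0, 2))  # the sample proportion sits one standard error above p0
```

A value of z0 near 0 means the sample proportion is close to the hypothesized value; a value far out in either tail means it is not.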
Critical Value for the Classical Approach
The critical value is the boundary beyond which we will
reject the null hypothesis, if the test statistic lies beyond the critical value.
The critical value for a hypothesis test is the z-score that separates the
rejection region from the non-rejection region.
The critical value depends on two things:
1. The significance level of the test, α
2. The type of test that it is: two-tailed, left-tailed, or right-tailed
The hypothesized value of the proportion, p0, is at the center of the
normal curve. It corresponds to z = 0 when it is standardized. That’s
why we want to see if our sample data is close to the center, or if it is way out in
the tails, far away from the center.
The type of test depends entirely on what the alternative
hypothesis is, H1.
Looking up Critical Values
“At a significance level of 5%”
1. Two-Tailed Test:
2. Left-Tailed Test:
3. Right-Tailed Test:
Summary Table of Critical Values:
Note: these values assume an area of to the of the
z-score.
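The critical z values for the three types of test can also be computed instead of looked up. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_values` and its `tail` labels are assumptions made for this illustration.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the critical z value(s) for a test at significance level alpha.

    tail is "two", "left", or "right" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        c = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-c, c)
    if tail == "left":
        return (z.inv_cdf(alpha),)    # all of alpha in the left tail
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)  # all of alpha in the right tail
    raise ValueError("tail must be 'two', 'left', or 'right'")


for tail in ("two", "left", "right"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

At the 5% level this gives about ±1.96 for a two-tailed test, about -1.645 for a left-tailed test, and about 1.645 for a right-tailed test.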
Example: In a 2011 National Institute on Alcohol Abuse and Alcoholism
survey, 33% of American adults said that they had never consumed alcohol. In a
more current (2014) random sample of 2300 adult Americans, 805 of them said
that they have never consumed alcohol. At the 5% significance level, do the data
provide sufficient evidence that the current percentage of American adults who
have never consumed alcohol has changed from the previous value of 33%?
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
Classical Approach:
Step 3: Determine the critical value(s).
What kind of a test?
Step 4: If the value of the test statistic falls in the critical region, reject H0;
otherwise, do not reject H0.
Step 5: State the conclusion.
There ______ sufficient evidence at the ______ level of
significance to conclude that the current percentage of American adults who
have never consumed alcohol ______ the previous
value of 33%.
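One way to check the hand calculation for this example is the short Python sketch below. It assumes a two-tailed test (the question asks whether the percentage "has changed") and takes the critical value from the standard normal distribution; the variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: 805 of 2300 adults, hypothesized proportion 33%
x, n, p0, alpha = 805, 2300, 0.33, 0.05

p_hat = x / n                                 # sample proportion
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test statistic

# Two-tailed test: alpha is split equally between the two tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z0) > z_crit

print(f"p_hat = {p_hat:.3f}, z0 = {z0:.2f}, critical values = +/-{z_crit:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

With these numbers the test statistic comes out near 2.04, which lies just beyond the critical value of about 1.96, so under the stated assumptions the null hypothesis is rejected at the 5% level.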
P-Value Approach to Hypothesis Testing
The only difference is in how we make the decision about whether we reject the
null hypothesis or not:
Classical Approach: find the critical value(s) based on the significance
level, if the test statistic is more extreme than the critical value, then
REJECT the null hypothesis.
P-Value Approach: based on the test statistic (from the sample data),
calculate the P-value. If the P-value is less than the
significance level of the test, then REJECT the null hypothesis.
P-Value Method
Instead of finding the critical value/region we find the P-Value.
The P-value gives us more information (vs. just a fail to reject/reject) about
the result of the test. It gives a measure of the strength of the
evidence against the null hypothesis.
The P-value is just a probability, so it’s a number between 0 and 1.
Represents how likely we would be to observe our sample or one more
extreme (further out) if the null hypothesis were true.
P-value close to 0 (very small) means the observed sample would be very unlikely if the null hypothesis were true.
So, if P-value is small (less than α), then reject the null hypothesis.
If the P-value is low, the null must go!
Finding the P-Value for a One-Proportion z-Test
1. Determine type of test:
2. Calculate test statistic:
3. Find the P-value, the probability of getting that sample, or one more extreme
than that.
a. For a one-tailed test (either left or right), the P-value = the area under
the standard normal curve beyond the test statistic. In
other words, the area in the tail of the curve, either to the left or the
right of the test statistic.
b. For a two-tailed test, the P-value = twice the area
under the standard normal curve BEYOND the test statistic.
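The tail-area rules above can be sketched as a small Python function. It uses `NormalDist` from the standard `statistics` module; the function name `p_value` and the `tail` labels are assumptions made for this illustration.

```python
from statistics import NormalDist


def p_value(z0, tail):
    """Return the P-value for a z test statistic.

    tail is "left", "right", or "two" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal distribution
    if tail == "left":
        return z.cdf(z0)                  # area to the left of z0
    if tail == "right":
        return 1 - z.cdf(z0)              # area to the right of z0
    if tail == "two":
        return 2 * (1 - z.cdf(abs(z0)))   # twice the area beyond |z0|
    raise ValueError("tail must be 'left', 'right', or 'two'")


print(round(p_value(1.96, "two"), 3))     # about 0.05
print(round(p_value(-1.645, "left"), 3))  # about 0.05
```

A test statistic sitting exactly on a 5% critical value gives a P-value of about 0.05, which is why the classical and P-value approaches lead to the same decision.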
Decision Criterion for a Hypothesis Test Using the P-Value
Reject null hypothesis (H0) if P-value < α, where α is
the stated significance level for the problem (such as 0.05).
Fail to reject null hypothesis (H0) if P-value ≥ α.
Rationale: Again, the P-value is just a probability, the probability of getting your
sample data (or something more extreme) assuming that the hypothesized value is true. If the
probability is very small, then the hypothesized value is
probably not true.
Example: In a 2011 National Institute on Alcohol Abuse and Alcoholism
survey, 33% of American adults said that they had never consumed alcohol. In a
more current (2014) random sample of 2300 adult Americans, 805 of them said
that they have never consumed alcohol. At the 5% significance level, do the data
provide sufficient evidence that the current percentage of American adults who
have never consumed alcohol has changed from the previous value of 33%?
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
Redo analysis (Steps 3 & 4) using the P-value method:
Step 3: Determine the P-value, P.
Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.
For the SAME problem, use a 0.01 significance level to test the claim that the
current percentage of American adults who have never consumed alcohol is
different from the previous value of 33%. Does the conclusion change?
P-Value Method:
Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.
Does the conclusion change?
Why the difference? Use the Classical Approach to visualize the problem:
With a significance of 0.01, the critical values have been moved farther
away from the center (the hypothesized value). Therefore, the sample data has
to be even more extreme for the null hypothesis to be rejected.
It’s essentially the same concept as in Chapter 9 with the confidence intervals.
With higher confidence (say 99% vs. 95%), the intervals would become wider,
to include more possible values. Now, with smaller significance (say 0.01 vs.
0.05), the interval of sample values that may have come from a population with
the hypothesized value becomes wider.
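The effect of changing the significance level can be seen by computing the P-value once and comparing it with both levels. The following is a sketch for the alcohol example, assuming a two-tailed test; the variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: 805 of 2300 adults, hypothesized proportion 33%
x, n, p0 = 805, 2300, 0.33

z0 = (x / n - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed test: the P-value is twice the area beyond |z0|
p = 2 * (1 - NormalDist().cdf(abs(z0)))
print(f"z0 = {z0:.2f}, P-value = {p:.4f}")

# Same P-value, two different significance levels
for alpha in (0.05, 0.01):
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

The P-value is roughly 0.04, so it falls below 0.05 but not below 0.01. The same sample therefore leads to rejecting the null hypothesis at the 5% level and not rejecting it at the 1% level.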
Using Technology - STATDISK
Analysis/Hypothesis Testing/Proportion One Sample
In one study of 71 smokers who tried to quit smoking with nicotine patch therapy,
39 were smoking one year after the treatment, and 32 were not smoking one
year after the treatment. At the 10% significance level, do the data provide
sufficient evidence to conclude that more than 50% of smokers who try to quit
with nicotine patch therapy are still smoking one year after the treatment?
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
Classical Approach:
Step 3: Determine the critical value(s).
What kind of a test?
Step 4: If the value of the test statistic falls in the critical region, reject H0;
otherwise, do not reject H0.
Step 5: State the conclusion.
There ______ sufficient evidence at the ______ level of
significance to conclude that a majority of smokers who try to quit with nicotine
patch therapy are still smoking one year after the treatment.
Now, go back and redo analysis (Steps 3 & 4) using the P-value Approach:
Step 3: Determine the P-value, P.
Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.
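For comparison with the hand calculation, here is a sketch of the nicotine patch example as a right-tailed test (the claim is "more than 50%"), covering both the classical and the P-value approach. The variable names are chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the example: 39 of 71 smokers still smoking, H0: p = 0.5
x, n, p0, alpha = 39, 71, 0.5, 0.10

p_hat = x / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

z = NormalDist()
z_crit = z.inv_cdf(1 - alpha)   # right-tailed: all of alpha in the right tail
p = 1 - z.cdf(z0)               # area to the right of the test statistic

print(f"p_hat = {p_hat:.3f}, z0 = {z0:.2f}")
print(f"critical value = {z_crit:.2f}, P-value = {p:.3f}")
print("Reject H0" if z0 > z_crit else "Do not reject H0")
```

Under these assumptions the test statistic is about 0.83, well short of the critical value of about 1.28, and the P-value of roughly 0.20 is larger than 0.10. Both approaches agree that the null hypothesis is not rejected.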
m&m’s – Testing a Claim about a Proportion
The information on the following page is from the m&m’s website, and it includes the
proportion (percentage) of each color of m&m’s that are included in a bag. According to
the manufacturer, the proportion of orange m&m’s equals 20%. At the 5% significance
level, do your data provide sufficient evidence to conclude that the percentage of orange
m&m’s in a bag is different from 20%?
Use BOTH the Classical Approach and the P-value Approach.
Gather your sample data by calculating the proportion of orange m&m’s in a bag of plain
m&m’s.
x = no. of orange m&m’s in the bag = ______
n = sample size = total no. of m&m’s in the bag = ______
p̂ = sample proportion = x/n = ______ (to 3 decimal places)
Step 1: State the null and alternative hypotheses.
H0: __________
H1: __________
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}} =$$
Classical Approach:
Step 3: Determine the critical value(s).
What kind of a test?
Step 4: If the value of the test statistic falls in the critical region, reject H0;
otherwise, do not reject H0.
Step 5: State the conclusion.
There ______ sufficient evidence at the ______ level of
significance to conclude that the percentage of orange m&m’s in a bag is
different from 20%.
Now, go back and redo analysis (Steps 3 & 4) using the P-value Approach:
Step 3: Determine the P-value, P.
Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.
Error or Correct Decision?
Assume that what the manufacturer tells us is true, that the proportion of orange
m&m’s is really 20%. Based on your analysis, did you make the correct decision,
or did you make an error? If so, what type of error?
Section 10.3 Hypothesis Tests for a Population Mean
The methods in this section are very similar to what we just learned for testing a
claim about a proportion. In this case, we will be testing a claim about a mean.
Example: Is the mean amount of soda in the pop cans really equal to 12
ounces, or is it actually less than that?
As in Section 9.2, because we don’t know σ, we have to use the t distribution
instead of the z distribution, because there is a higher level of uncertainty.
As a result of using the t distribution:
The critical regions are even further out in the tails than
they would be with the z distribution.
Therefore, the sample data must be more extreme
(further away from the hypothesized value) to fall in the critical region
and cause H0 to be rejected.
Notice that we calculate a t0 test statistic instead of a z0 test statistic:
$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
where,
n = sample size or number of trials
s = sample standard deviation
x̄ = sample mean
µ0 = population mean (value being tested)
Another difference is that we use Table VII to look up a critical t value.
We will focus on the Classical Approach in class, since the P-Value Approach is
best accomplished using technology, otherwise an approximation is required.
Using the t Distribution for Hypothesis Testing
Classical Approach – works the same as z distribution
By hand: use Table VII to look up the tcrit values.
Assume α = 0.05:
The critical value tcrit depends on the degrees of freedom, df = n − 1.
Whatever the critical value tcrit is, it will be farther from
the center than the corresponding critical z value would be.
Assume a sample size: n = ______
df = ______
For a two-tailed test, Area in Right Tail = α/2 = 0.025.
For a one-tailed test, Area in Right Tail = α = 0.05.
For a left-tailed test, YOU have to assign the negative sign to the
critical value!
As before, use the critical values to identify the “critical regions”. The null
hypothesis will be rejected if the test statistic falls in the
critical region.
[Figures: t distribution curves for the two-tailed, left-tailed, and right-tailed tests, each with tcrit = ? to be filled in]
Example:
Sixteen different cereals are randomly selected, and the sugar content (grams of
sugar per gram of cereal) is obtained for each cereal. Assume that the
population is normally distributed. The sample mean is 0.295 g of sugar, and the
sample standard deviation is 0.168g. Use a 0.05 significance level to test the
claim of a cereal lobbyist that the mean sugar content for all cereals is less than
0.3 g.
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
n =
x̄ =
µ0 =
s =
$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} =$$
Classical Approach:
Step 3: Determine the critical value(s) from Table VII.
What type of a test?
Area in =
df =
From Table VII: tcrit =
Step 4: If the value of the test statistic falls in the critical region, reject H0;
otherwise, do not reject H0.
Step 5: State the conclusion.
There ______ sufficient evidence at the ______ level of
significance to conclude that the mean sugar content for all cereals is less than
0.3 g.
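The cereal example can be checked with a few lines of Python. This is a sketch under two assumptions: the test is left-tailed (the claim is "less than 0.3 g"), and the critical value 1.753 is the usual t table entry for df = 15 with an area of 0.05 in one tail, typed in by hand here rather than computed.

```python
from math import sqrt

# Data from the example
n, x_bar, mu0, s = 16, 0.295, 0.3, 0.168

t0 = (x_bar - mu0) / (s / sqrt(n))   # test statistic
df = n - 1                            # degrees of freedom

# Assumption: t table value for df = 15, area 0.05 in one tail.
# The test is left-tailed, so the negative sign is assigned by hand.
t_crit = -1.753

reject_h0 = t0 < t_crit
print(f"t0 = {t0:.3f}, df = {df}, t_crit = {t_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The test statistic is about -0.119, nowhere near the critical region to the left of -1.753, so under these assumptions the null hypothesis is not rejected.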
P-Value Method:
Step 3: Find the P-value – have to use STATDISK for this!
Analysis/Hypothesis Testing/Mean – One Sample
P-value =
Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.
Step 5: State the conclusion.
Same as above.
In previous tests, baseballs were dropped 24 ft onto a concrete surface, and they
bounced an average of 92.84 in. In a test of a sample of 40 new baseballs, the
bounce heights had a mean of 92.67 in. and a standard deviation of 1.79 in. Use
a 0.05 significance level to determine whether there is sufficient evidence to
support the claim that the new baseballs have bounce heights with a mean
different from 92.84 in. Does it appear that the new baseballs are different?
Step 1: State the null and alternative hypotheses.
Step 2: Decide on the significance level, α.
Step 3: Compute the value of the test statistic.
n =
x̄ =
µ0 =
s =
$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} =$$
Classical Approach:
Step 3: Determine the critical value(s) from Table VII.
What type of a test?
Area in =
df =
From Table VII: tcrit =
Step 4: If the value of the test statistic falls in the critical region, reject H0;
otherwise, do not reject H0.
Step 5: State the conclusion.
There ______ sufficient evidence at the ______ level of
significance to conclude that the new baseballs have a different mean bounce
height compared to the old baseballs.
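The baseball example follows the same pattern as a two-tailed t test. In the sketch below the helper name `one_sample_t` is made up for this illustration, and the critical value 2.023 is an assumed table entry for df = 39 with an area of 0.025 in each tail; a table without a df = 39 row would use a neighbouring row instead.

```python
from math import sqrt


def one_sample_t(x_bar, mu0, s, n):
    """Return t0 = (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))


# Data from the example
n, x_bar, mu0, s = 40, 92.67, 92.84, 1.79

t0 = one_sample_t(x_bar, mu0, s, n)
df = n - 1

# Assumption: t table value for df = 39, area 0.025 in each tail
t_crit = 2.023

reject_h0 = abs(t0) > t_crit   # two-tailed: reject in either tail
print(f"t0 = {t0:.2f}, df = {df}, critical values = +/-{t_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The test statistic is about -0.60, which lies well inside the critical values, so under these assumptions the null hypothesis is not rejected.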