
Chapter 10 Handouts


Section 10.1 The Language of Hypothesis Testing

The big picture is:

Use a random sample to make ______, or to learn something about the larger population.

Example: A soft-drink company may claim that, on average, its cans contain

12 ounces of soda. A government agency may want to test whether or not such

cans really do contain, on average, 12 ounces of soda, or whether they are

“short-changing” the consumers by putting less than that amount in the cans. A

sample of 100 cans of the soft drink that is being investigated is taken. The

mean amount of soda in the 100 cans is 11.89 ounces. That is less than the

claimed amount of 12 ounces, so can we conclude that the soft-drink company is

lying to the public?

Two ways to do this:

1. Confidence intervals: Use sample data to estimate a population value. Provides a range of ______ for the population parameter.

2. Hypothesis testing: Use sample data to test a specific claim about a population. Provides a conclusion about a ______ for the population parameter, whether an assumed value is plausible or not.

This is the heart of hypothesis testing: we make an assumption about reality, for example the true value of the population mean or proportion. We gather sample data. Then we compare the sample data to the assumed value, and see if it ______ the assumption.


In order to run a hypothesis test, you need TWO hypotheses, the ______ hypothesis and the ______ hypothesis. A hypothesis is a statement regarding a characteristic of a population. We will run hypothesis tests on two parameters:

1. A proportion (Section 10.2)

2. A mean (Section 10.3)

Null Hypothesis:

Represented by ______.

Statement that a population parameter ______.

This hypothesis is assumed to be true UNLESS we have convincing evidence to the contrary.

It is based on the expectation of ______, status quo, “same old, same old”.

Often, hypothesis tests are designed in such a manner that we ______ to reject the null hypothesis.

The null hypothesis is always written in this form:

H0: ______

where ______ are the values for the population mean or proportion.

Alternative Hypothesis:

Represented by ______.

Statement that the population parameter is ______ the value stated in the null hypothesis.

It can either be ______

It is often the conclusion that we are TRYING to make.

There are three forms for the alternative hypothesis (see next page).


Form of Alternative Hypothesis / Type of test / Purpose of test

H1: ______
Type of test: ______
Purpose of test: To decide whether the population mean or proportion is ______ the hypothesized value, µ0 or p0. In other words, it could be either larger or smaller.

H1: ______
Type of test: ______
Purpose of test: To decide whether the population mean or proportion is ______ the hypothesized value, µ0 or p0. In other words, we already have reason to believe it is less, we just want to test it.

H1: ______
Type of test: ______
Purpose of test: To decide whether the population mean or proportion is ______ the hypothesized value, µ0 or p0. In other words, we already have reason to believe it is more, we just want to test it.

Example: Soft drinks

The company is claiming that the cans contain 12 oz. of soda on average,

therefore the null hypothesis is that:

H0:

The government agency is concerned that the manufacturer is putting LESS

THAN that amount into the cans, therefore the alternative hypothesis would be:

H1:


The Logic of Hypothesis Testing

Start with an initial assumption (about population proportion or mean).

Collect sample data.

Find the point estimate from the sample data.

Compare the point estimate to the hypothesized population value,

using a standard mathematical procedure.

Decide if the hypothesized population value is reasonable.

There are basically two outcomes:

1. The sample data are ______ with the hypothesized value. The sample mean or proportion is not that ______ from the hypothesized population mean or proportion. Therefore, do ______ reject the null hypothesis.

2. The sample data are ______ with the hypothesized value. The sample mean or proportion is ______ different from the hypothesized mean or proportion, and more supportive of the alternative hypothesis. In that case we reject the null hypothesis, and accept the alternative hypothesis.

Notice that we have to have ______ evidence that the null hypothesis is not true, before we end up rejecting it.


Example: Innocent until proven guilty!

Consider a jury trial:

H0:

H1:

The null hypothesis is always assumed to be ______, until we have evidence to the ______.

In a trial, the prosecution has to provide convincing evidence, “beyond a

reasonable doubt”, that the person is in fact guilty.

Source: Introductory Statistics, 8th edition, Prem S. Mann

A statistical hypothesis test works exactly like this!

There is some critical point, which can be calculated, beyond which the null hypothesis is ______.


Type I and II errors

There’s always a chance of making a mistake, and coming to the wrong conclusion! We are making our decision based on ______.

Four outcomes:

#1: The null hypothesis is ______, but we ______ it.

#2: The null hypothesis is ______, and we reject it.

#3: The null hypothesis is ______, but we ______ to reject it.

#4: The null hypothesis is ______, and we ______ it.

Type I error: Rejecting the null hypothesis when it is actually ______. The probability of making a Type I error is called the ______, ______, of a hypothesis test.

Type II error: Not rejecting the null hypothesis when it is in fact ______. The probability of that happening is ______.

Ideally, the probabilities of ______ of these types of errors would be small. However, there is a relationship between the two errors:

For a fixed sample size, the smaller the probability of a Type I error (α) is, the ______ the probability of a Type II error (β).
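The relationship between the two error probabilities can be explored numerically. The following is a minimal sketch, not part of the original handout: it assumes a right-tailed one-proportion z-test, and the values p0 = 0.50, a true proportion of 0.55, and n = 200 are hypothetical, chosen only for illustration. The function name `type_ii_error` is likewise an assumption.

```python
from statistics import NormalDist

def type_ii_error(alpha, p0, p_true, n):
    """Beta for a right-tailed one-proportion z-test (normal approximation)."""
    std_normal = NormalDist()
    # Smallest sample proportion that leads to rejecting H0 at this alpha.
    cutoff = p0 + std_normal.inv_cdf(1 - alpha) * (p0 * (1 - p0) / n) ** 0.5
    # Spread of the sample proportion if the true proportion is p_true.
    se_true = (p_true * (1 - p_true) / n) ** 0.5
    # Beta = chance the sample proportion falls below the cutoff,
    # so that H0 is NOT rejected even though it is false.
    return NormalDist(p_true, se_true).cdf(cutoff)

# Hypothetical values, for illustration only.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, p0=0.50, p_true=0.55, n=200)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

Running the loop with the sample size held fixed shows how beta responds as alpha is made smaller.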


Example: Criminal Trial Analogy

H0: defendant is not guilty

H1: defendant is guilty

Four possible outcomes:

1. Type I error: H0 is true, but it is rejected.

2. Correct decision: H0 is true, and it is NOT rejected.

3. Type II error: H0 is false, but it is NOT rejected.

4. Correct decision: H0 is false, and it IS rejected.

Which error (Type I or Type II) is more serious???


Star Trek – to Vaporize or NOT to Vaporize? (Type I and II errors)

“He’s dead, Jim,” said Dr. McCoy to Captain Kirk.

Mr. Spock, as the science officer, is put in charge of statistically determining the

correctness of Dr. McCoy’s statement and deciding the fate of the crew member (if dead

– to vaporize, if alive – to try and revive).

1. Presumably, the crew member was alive to start with, so “being alive” represents

the condition of “no change”, or the “status quo”, the assumed starting condition.

Use this information to formulate a null hypothesis and an alternative hypothesis:

H0:

H1:

2. Referring to the table on the previous page, describe the four possible outcomes as

a result of Spock’s decision about whether the crew member is dead or alive.

1. Type I error:

2. Correct decision:

3. Type II error:

4. Correct decision:

3. Which of the two types of errors is more serious in this case, and why??


Possible Conclusions for a Hypothesis Test

After running a hypothesis test, there are in general two possible conclusions:

1. The null hypothesis is ______. In this case, we have concluded that the data provide sufficient evidence to support the ______ hypothesis.

2. The null hypothesis is ______ rejected. In this case, we conclude that the data do NOT provide sufficient evidence to support the ______ hypothesis.

Note in this case: It does NOT mean that we proved the null hypothesis to be ______! It simply means that we are going to “reserve judgment” about which hypothesis is true. It is impossible to ______ the null hypothesis! It just means that the sample data weren’t enough to reject it.

KEY: Stating the conclusion:

There ______ sufficient evidence at the ______ level of significance to conclude that “insert statement in alternative hypothesis”.

Note: If we did NOT reject H0, then there ______ sufficient evidence. If we DID reject H0, then there ______ sufficient evidence.

Example: Jury Trial

H0: The person is not guilty.

H1: The person is guilty.

If the jury comes back with a “not guilty” verdict: There ______ sufficient evidence (at the ______ level of significance) to conclude that “______”.


Section 10.2 Hypothesis Tests for a Population Proportion

Test Statistic

The test statistic is:

A number that we calculate from the ______.

It is a ______, so in other words we are converting the data to the standard normal distribution.

It shows ______ the sample value is from the hypothesized value.

The hypothesized value is always at the ______ of the distribution, where ______.

Use the test statistic to make a decision about whether or not to ______ the null hypothesis.

o If the sample data is very far away from the hypothesized value, conclude that the null hypothesis is ______.

o If the sample data is not that far away from the hypothesized value, conclude that the null hypothesis is ______ wrong (do not reject it).

The test statistic for this section is as follows:

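For proportion:

$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$

where p̂ is the sample proportion, p0 is the hypothesized proportion, and n is the sample size.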

Compare this test statistic to the hypothesized value of the proportion, and see

how far away it is from that value.


Critical Value for the Classical Approach

The critical value is the ______ beyond which we will reject the null hypothesis, if the test statistic lies beyond the critical value.

The critical value for a hypothesis test is the z-score that separates the ______ region from the ______ region.

The critical value depends on two things:

1. The ______ of the test, ______

2. The type of test that it is: ______, ______

The hypothesized value of the proportion, p0, is at the ______ of the normal curve. It corresponds to ______ when it is standardized. That’s why we want to see if our sample data is close to the center, or if it is way out in the tails, far away from the center.

The type of test depends entirely on what the ______ hypothesis is, ______.


Looking up Critical Values

“At a significance level of 5%”

1. Two-Tailed Test:

2. Left-Tailed Test:

3. Right-Tailed Test:
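The table lookups for these three cases can be cross-checked with software. This is a small sketch using only Python's standard library; the helper name `critical_values` is hypothetical and not part of the handout.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Critical z-value(s) for a 'two', 'left', or 'right' tailed test."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        # Split alpha evenly between the two tails.
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    if tail == "left":
        # All of alpha sits in the left tail.
        return (z.inv_cdf(alpha),)
    if tail == "right":
        # All of alpha sits in the right tail.
        return (z.inv_cdf(1 - alpha),)
    raise ValueError("tail must be 'two', 'left', or 'right'")

for tail in ("two", "left", "right"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

At a significance level of 5%, the two-tailed values are about ±1.96, and the one-tailed values are about −1.645 (left) and 1.645 (right).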


Summary Table of Critical Values:

Note: these values assume an area of ______ to the ______ of the z-score.


Example: In a 2011 National Institute on Alcohol Abuse and Alcoholism

survey, 33% of American adults said that they had never consumed alcohol. In a

more current (2014) random sample of 2300 adult Americans, 805 of them said

that they have never consumed alcohol. At the 5% significance level, do the data

provide sufficient evidence that the current percentage of American adults who

have never consumed alcohol has changed from the previous value of 33%?

Step 1: State the null and alternative hypotheses.

Step 2: Decide on the significance level, α.

Step 3: Compute the value of the test statistic.


Classical Approach:

Step 3: Determine the critical value(s).

What kind of a test?

Step 4: If the value of the test statistic falls in the critical region, reject H0;

otherwise, do not reject H0.

Step 5: State the conclusion.

There ______ sufficient evidence at the ______ level of significance to conclude that the current percentage of American adults who have never consumed alcohol ______ the previous value of 33%.
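One way to check the by-hand work for this example is a short script. The sketch below uses the numbers given in the problem (805 out of 2300, hypothesized value 0.33, α = 0.05) and assumes a two-tailed test, since the question asks whether the percentage "has changed". The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 805, 2300      # sample count and sample size from the example
p0 = 0.33             # hypothesized proportion (H0: p = 0.33)
alpha = 0.05          # significance level

p_hat = x / n
# One-proportion z test statistic.
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Two-tailed test ("has changed"), so alpha is split between both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when the test statistic lies beyond either critical value.
reject_h0 = abs(z0) > z_crit
print(f"p-hat = {p_hat:.3f}, z0 = {z0:.2f}, critical values = ±{z_crit:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Here the test statistic (about 2.04) falls just beyond the critical value of 1.96, so it lands in the critical region.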


P-Value Approach to Hypothesis Testing

The only difference is in how we make the decision about whether we reject the

null hypothesis or not:

Classical Approach: find the critical value(s) based on the significance level. If the test statistic is more extreme than the critical value, then REJECT the null hypothesis.

P-Value Approach: based on the test statistic (from the sample data), calculate the P-value. If the P-value is ______ the significance level of the test, then REJECT the null hypothesis.

P-Value Method

Instead of finding the critical value/region we find the P-Value.

The P-value gives us more information (vs. just a fail to reject/reject) about the result of the test. It gives a measure of the ______ of the evidence against the null hypothesis.

The P-value is just a probability, so it’s a ______.

Represents how likely we would be to observe our sample or one more extreme (further out) if the null hypothesis were ______.

P-value close to 0 (very small) means ______.

So, if P-value is small (less than ______), then ______.

If the P-value is ______ the null must ______!


Finding the P-Value for a One-Proportion z-Test

1. Determine type of test:

2. Calculate test statistic:

3. Find the P-value, the probability of getting that sample, or one more extreme

than that.

a. For a one-tailed test (either left or right), the P-value = the area under the standard normal curve ______ the test statistic. In other words, the area in the tail of the curve, either to the left or the right of the test statistic.

b. For a two-tailed test, the P-value = ______ the area under the standard normal curve BEYOND the test statistic.
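The three cases in step 3 can be sketched as a single function. This is an illustrative helper (the name `p_value` and the tail labels are assumptions, not the handout's notation) built on the standard normal distribution from Python's standard library.

```python
from statistics import NormalDist

def p_value(z0, tail):
    """P-value for a one-proportion z-test, given the test statistic z0."""
    z = NormalDist()  # standard normal curve
    if tail == "left":
        # Area in the left tail, below the test statistic.
        return z.cdf(z0)
    if tail == "right":
        # Area in the right tail, above the test statistic.
        return 1 - z.cdf(z0)
    if tail == "two":
        # Area beyond |z0| in one tail, counted for both tails.
        return 2 * (1 - z.cdf(abs(z0)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Hypothetical test statistic, for illustration only.
print(round(p_value(1.50, "right"), 4))
print(round(p_value(1.50, "two"), 4))
```

For the same test statistic, the two-tailed P-value is larger than the one-tailed P-value, which is why the type of test has to be decided before the P-value is computed.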


Decision Criterion for a Hypothesis Test Using the P-Value

Reject null hypothesis (H0) if P-value < α, where α is the stated significance level for the problem (such as 0.05).

Fail to reject null hypothesis (H0) if P-value ≥ α.

Rationale: Again, the P-value is just a probability, the probability of getting your sample data assuming that the hypothesized value is ______. If the probability is very ______, then the hypothesized value is probably true.


Example: In a 2011 National Institute on Alcohol Abuse and Alcoholism

survey, 33% of American adults said that they had never consumed alcohol. In a

more current (2014) random sample of 2300 adult Americans, 805 of them said

that they have never consumed alcohol. At the 5% significance level, do the data

provide sufficient evidence that the current percentage of American adults who

have never consumed alcohol has changed from the previous value of 33%?




Redo analysis (Steps 3 & 4) using the P-value method:

Step 3: Determine the P-value, P.

Step 4: Compare the P-value to the significance level, α. If P < α, reject H0.


For the SAME problem, use a 0.01 significance level to test the claim that the

current percentage of American adults who have never consumed alcohol is

different from the previous value of 33%. Does the conclusion change?

P-Value Method:


Does the conclusion change?

Why the difference? Use the Classical Approach to visualize the problem:

With a significance of 0.01, the critical values have been moved ______ away from the center (the hypothesized value). Therefore, the sample data has to be even more ______ for the null hypothesis to be rejected.

It’s essentially the same concept as in Chapter 9 with the confidence intervals. With higher confidence (say 99% vs. 95%), the intervals would become ______, to include more possible values. Now, with smaller significance (say 0.01 vs. 0.05), the interval of sample values that may have come from a population with the hypothesized value becomes ______.
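The effect of changing the significance level can be seen by computing the P-value once and comparing it with both values of α. The sketch below reuses the numbers from the alcohol example and assumes a two-tailed test; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n, p0 = 805, 2300, 0.33   # data and hypothesized value from the example

z0 = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
# Two-tailed P-value: twice the area beyond |z0|.
p_val = 2 * (1 - NormalDist().cdf(abs(z0)))

decisions = {}
for alpha in (0.05, 0.01):
    # Critical value for the same alpha, for comparison with z0.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    decisions[alpha] = "reject H0" if p_val < alpha else "do not reject H0"
    print(f"alpha = {alpha}: critical values = ±{z_crit:.3f}, "
          f"z0 = {z0:.2f}, P-value = {p_val:.4f} -> {decisions[alpha]}")
```

The P-value (about 0.041) does not depend on α; only the cutoff it is compared against changes, and with it the critical values (about ±1.96 versus ±2.576).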


Using Technology - STATDISK

Analysis/Hypothesis Testing/Proportion One Sample


In one study of 71 smokers who tried to quit smoking with nicotine patch therapy,

39 were smoking one year after the treatment, and 32 were not smoking one

year after the treatment. At the 10% significance level, do the data provide

sufficient evidence to conclude that more than 50% of smokers who try to quit

with nicotine patch therapy are still smoking one year after the treatment?




Classical Approach:





(continued on next page)




There ______ sufficient evidence at the ______ level of significance to conclude that a majority of smokers who try to quit with nicotine patch therapy are still smoking one year after the treatment.

Now, go back and redo analysis (Steps 3 & 4) using the P-value Approach:
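Both approaches for the nicotine patch example can be checked with a short script. This sketch uses the numbers from the problem (39 of 71 still smoking, α = 0.10) and assumes a right-tailed test of H0: p = 0.50, since the question asks about "more than 50%". The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 39, 71        # smokers still smoking after one year, sample size
p0 = 0.50            # hypothesized proportion
alpha = 0.10         # significance level

p_hat = x / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Classical approach: right-tailed test, all of alpha in the right tail.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_classical = z0 > z_crit

# P-value approach: area to the right of the test statistic.
p_val = 1 - NormalDist().cdf(z0)
reject_p_value = p_val < alpha

print(f"p-hat = {p_hat:.3f}, z0 = {z0:.2f}, z_crit = {z_crit:.2f}")
print(f"P-value = {p_val:.3f}")
print("Reject H0" if reject_classical else "Do not reject H0")
```

The two approaches always lead to the same decision; here the test statistic (about 0.83) is below the critical value (about 1.28), and the P-value (about 0.20) is above α.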




m&m’s – Testing a Claim about a Proportion

The information on the following page is from the m&m’s website, and it includes the

proportion (percentage) of each color of m&m’s that are included in a bag. According to

the manufacturer, the proportion of orange m&m’s equals 20%. At the 5% significance

level, do your data provide sufficient evidence to conclude that the percentage of orange

m&m’s in a bag is different from 20%?

Use BOTH the Classical Approach and the P-value Approach.

Gather your sample data by calculating the proportion of orange m&m’s in a bag of plain

m&m’s.

x = no. of orange m&m’s in the bag = ______

n = sample size = total no. of m&m’s in the bag = ______

p̂ = sample proportion = x/n = ______ (to 3 decimal places)


H0: __________

H1: __________



$$z_0 = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1 - p_0)}{n}}}$$ = ______

Classical Approach:








There ______ sufficient evidence at the ______ level of significance to conclude that the percentage of orange m&m’s in a bag is different from 20%.

Now, go back and redo analysis (Steps 3 & 4) using the P-value Approach:



Error or Correct Decision?

Assume that what the manufacturer tells us is true, that the proportion of orange

m&m’s is really 20%. Based on your analysis, did you make the correct decision,

or did you make an error? If so, what type of error?


Section 10.3 Hypothesis Tests for a Population Mean

The methods in this section are very similar to what we just learned for testing a

claim about a proportion. In this case, we will be testing a claim about a mean.

Example: Is the mean amount of soda in the pop cans really equal to 12

ounces, or is it actually less than that?

As in Section 9.2, because we don’t know σ, we have to use the t distribution instead of the z distribution, because there is a higher level of ______.

As a result of using the t distribution:

The critical regions are even further out in the ______ than they would be with the z distribution.

Therefore, the sample data must be ______ (further away from the hypothesized value) to fall in the critical region and cause H0 to be rejected.


Notice that we calculate a t0 test statistic instead of a z0 test statistic:

$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where,

n = sample size or number of trials

s = sample standard deviation

x̄ = sample mean

µ0 = population mean (value being tested)

Another difference is that we use Table VII to look up a critical t value.

We will focus on the Classical Approach in class, since the P-Value Approach is

best accomplished using technology, otherwise an approximation is required.


Using the t Distribution for Hypothesis Testing

Classical Approach – works the same as z distribution

By hand: use Table VII to look up the tcrit values.

Assume α = 0.05:

The critical value tcrit depends on the ______.

Whatever the critical value tcrit is, it will be ______ from the center than the corresponding critical z value would be.

Assume a sample size: ______

df = ______

For a two-tailed test, Area in Right Tail = ______.

For a one-tailed test, Area in Right Tail = ______.

For a left-tailed test, YOU have to assign the negative sign to the critical value!

As before, use the critical values to identify the “critical regions”. The null hypothesis will be ______ if the test statistic falls in the critical region.

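[Figure: three t-distribution sketches, one per type of test, each with a horizontal t axis and the critical value(s) to be filled in: tcrit = ?]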


Example:

Sixteen different cereals are randomly selected, and the sugar content (grams of

sugar per gram of cereal) is obtained for each cereal. Assume that the

population is normally distributed. The sample mean is 0.295 g of sugar, and the

sample standard deviation is 0.168g. Use a 0.05 significance level to test the

claim of a cereal lobbyist that the mean sugar content for all cereals is less than

0.3 g.




n =

x̄ =

µ0 =

s =

$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$ = ______


Classical Approach:

Step 3: Determine the critical value(s) from Table VII.

What type of a test?

Area in ______ =

df =

From Table VII: tcrit =





There ______ sufficient evidence at the ______ level of significance to conclude that the mean sugar content for all cereals is less than 0.3 g.
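The classical approach for the cereal example can be sketched in a few lines. The sample values come from the problem; the critical value 1.753 is an assumption taken from a standard t table (area 0.05 in the right tail, df = 15) rather than computed, because Python's standard library has no t distribution. The test is assumed to be left-tailed, since the claim is "less than 0.3 g".

```python
from math import sqrt

n = 16           # number of cereals sampled
x_bar = 0.295    # sample mean (g of sugar per g of cereal)
s = 0.168        # sample standard deviation
mu0 = 0.3        # hypothesized mean (value being tested)

# t test statistic: how many standard errors x_bar is from mu0.
t0 = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Assumed table lookup: area 0.05 in the right tail, df = 15.
t_table = 1.753
# Left-tailed test, so the negative sign is assigned by hand.
t_crit = -t_table

reject_h0 = t0 < t_crit
print(f"t0 = {t0:.3f}, df = {df}, tcrit = {t_crit}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The test statistic (about −0.119) is much closer to the center than the critical value, so it does not fall in the critical region.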


P-Value Method:

Step 6: Find the P-value – have to use STATDISK for this!

Analysis/Hypothesis Testing/Mean – One Sample

P-value =



Same as above.


In previous tests, baseballs were dropped 24 ft onto a concrete surface, and they

bounced an average of 92.84 in. In a test of a sample of 40 new baseballs, the

bounce heights had a mean of 92.67 in. and a standard deviation of 1.79 in. Use

a 0.05 significance level to determine whether there is sufficient evidence to

support the claim that the new baseballs have bounce heights with a mean

different from 92.84 in. Does it appear that the new baseballs are different?




n =

x̄ =

µ0 =

s =

$$t_0 = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$ = ______


Classical Approach:

Step 3: Determine the critical value(s) from Table VII.

What type of a test?

Area in ______ =

df =

From Table VII: tcrit =





There ______ sufficient evidence at the ______ level of significance to conclude that the new baseballs have a different mean bounce height compared to the old baseballs.
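The baseball example differs from the cereal example only in being two-tailed, so the area in each tail is α/2. In the sketch below the sample values come from the problem, while the critical value 2.023 is an assumption taken from a standard t table (area 0.025 in the right tail, df = 39) rather than computed.

```python
from math import sqrt

n = 40           # number of new baseballs tested
x_bar = 92.67    # sample mean bounce height (in.)
s = 1.79         # sample standard deviation (in.)
mu0 = 92.84      # hypothesized mean from the previous tests (in.)

# t test statistic: how many standard errors x_bar is from mu0.
t0 = (x_bar - mu0) / (s / sqrt(n))
df = n - 1

# Assumed table lookup: area 0.025 in the right tail, df = 39.
t_table = 2.023
# Two-tailed test ("different from"), so there are two critical values.
t_crit_low, t_crit_high = -t_table, t_table

reject_h0 = t0 < t_crit_low or t0 > t_crit_high
print(f"t0 = {t0:.3f}, df = {df}, critical values = ±{t_table}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

The test statistic (about −0.60) lies between the two critical values, so it does not fall in either critical region.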
