AP Statistics Section 11.1 A Basics of Significance Tests.

Is the card RED or BLACK? Each student in the class will be asked whether the next card in a well-shuffled deck of cards is red or black. One point of EC will be given to each student who chooses correctly. Before we begin, what proportion of the cards in the deck do you expect to be black?

0.5

All of you will be expected to keep track of how many cards are red and how many are black.

RED: _______ BLACK: _______

What was our sample proportion of black cards? _______

Has your opinion on what proportion of the cards in the deck are black changed?

Let’s use STATKEY to explore this situation further.

Confidence intervals are one of the two most common types of statistical inference. Use a confidence interval when your goal is to estimate a population parameter.

The second type of inference, called significance tests, has a different goal: to assess the evidence provided by data about some claim concerning a population.

Example 1: I claim that I make 80% of my free throws. To test my claim, you ask me to shoot 20 free throws. I make only 8 out of 20. Assume p = 0.8, and find the probability of making exactly 8 of the 20 free throws. Also, find the probability of making 8 or fewer free throws.

binompdf(20, 0.8, 8) ≈ 0.000087

binomcdf(20, 0.8, 8) ≈ 0.0001
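The two calculator results in Example 1 can be checked by hand from the binomial formula. The sketch below is only an illustration: the helper names `binom_pmf` and `binom_cdf` are made up for this example and are not calculator or library functions.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p). Hypothetical helper name."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k) for X ~ Binomial(n, p). Hypothetical helper name."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))

# Example 1: 20 free throws, claimed success probability p = 0.8
exactly_8 = binom_pmf(20, 0.8, 8)   # counterpart of binompdf(20, 0.8, 8)
at_most_8 = binom_cdf(20, 0.8, 8)   # counterpart of binomcdf(20, 0.8, 8)

print(f"P(X = 8)  = {exactly_8:.6f}")
print(f"P(X <= 8) = {at_most_8:.6f}")
```

Both values come out at roughly one in ten thousand, matching the rounded calculator results.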

“Aha!” you say. “Someone who makes 80% of his free throws would almost never make only 8 out of 20. So I don’t believe your claim.” Your reasoning is based on asking what would happen if my claim were true and

we repeated the sample of 20 free throws many times. I would almost never make as few as 8. This outcome is so unlikely that it gives strong evidence that my claim is

not true.
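The "repeat the sample many times" reasoning can also be sketched as a simulation. This is a rough illustration, not part of the original notes: the function name `simulate_low_counts`, the number of trials, and the random seed are arbitrary choices made for this example.

```python
import random

def simulate_low_counts(p=0.8, shots=20, cutoff=8, trials=100_000, seed=1):
    """Return the fraction of simulated samples with at most `cutoff` makes.

    Each sample is `shots` independent free throws by a shooter who
    really does make a proportion `p` of them (the claim is assumed true).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    low = 0
    for _ in range(trials):
        made = sum(rng.random() < p for _ in range(shots))
        if made <= cutoff:
            low += 1
    return low / trials

proportion = simulate_low_counts()
print(f"Fraction of samples with 8 or fewer makes: {proportion:.5f}")
```

If the claim of 80% were true, only a tiny fraction of the simulated samples would show 8 or fewer makes, which is why the observed result counts as strong evidence against the claim.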

Significance tests use elaborate vocabulary but the basic idea is simple: getting an outcome that

would rarely happen if a claim were true is strong evidence that the claim

is not true.

A significance test is a formal procedure for comparing observed data with a hypothesis whose truth we want to assess. The hypothesis is a statement about a population parameter such as the population mean $\mu$ or population proportion $p$. The results of a test are expressed in terms of a probability that measures how well the data and the hypothesis agree.

The reasoning behind statistical tests, like that of confidence intervals, is based on asking what would happen if we repeated the sampling or experiment many times. We will begin with the unrealistic assumption that we know $\sigma$, the population standard deviation.

Example 2: Vehicle accidents can result in serious injuries to drivers and passengers. In the case of life-threatening injuries, victims generally need medical attention within 8 minutes of the crash. Several cities have begun to monitor paramedic response times. In one such city, the mean response time (RT) to all such accidents involving life-threatening injuries last year was $\mu = 6.7$ minutes with $\sigma = 2$ minutes. The city manager shares this information with emergency personnel and encourages them to “do better” next year. At the end of the following year, the city manager selects a simple random sample of 400 calls involving life-threatening injuries and examines the response times. For this sample, the mean response time was $\bar{x} = 6.48$ minutes. Do these data provide good evidence that response times have decreased since last year?

Remember, sample results may vary! Maybe the mean RT for the SRS is simply a result of sampling variability.

We want to use the same reasoning here as we did in the previous example. We make a claim and ask if the data give evidence against it. We would like to conclude that the mean RT decreased, so the claim we test is that RTs have not decreased. If we assume the RTs for calls involving life-threatening injury have not decreased, the mean RT for the population of all such calls would still be $\mu = 6.7$ minutes (assume $\sigma = 2$ minutes too).

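Consider the sampling distribution of $\bar{x}$ from 400 calls:

Shape: approximately Normal (CLT - Normal approx.)

Mean: $\mu_{\bar{x}} = \mu = 6.7$

Standard deviation: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 2/\sqrt{400} = 0.1$, as long as $N \ge 10n$

Find the probability of $\bar{x} \le 6.48$ minutes.

normalcdf(-1000, 6.48, 6.7, 0.1) ≈ 0.014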

An observed value this small would rarely occur by chance if the true $\mu$ were 6.7 minutes. This observed value is good evidence that the true $\mu$ is, in fact, less than 6.7 minutes. Thus we can conclude the average response time decreased this year.
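The Example 2 calculation can be reproduced step by step. The sketch below assumes, as the example does, that $\sigma = 2$ is known and that the sampling distribution of $\bar{x}$ is approximately Normal. The variable names are chosen for this illustration only.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 6.7     # mean RT if response times have not decreased
sigma = 2.0    # population standard deviation (assumed known)
n = 400        # sample size
x_bar = 6.48   # observed sample mean

# Standard deviation of the sampling distribution of x-bar
se = sigma / sqrt(n)

# How many standard deviations the observed mean lies below mu_0
z = (x_bar - mu_0) / se

# P(x-bar <= 6.48) when the true mean is 6.7,
# the counterpart of normalcdf(-1000, 6.48, 6.7, 0.1)
prob = NormalDist(mu_0, se).cdf(x_bar)

print(f"standard deviation of x-bar = {se:.2f}")
print(f"z = {z:.2f}")
print(f"P(x-bar <= {x_bar}) = {prob:.4f}")
```

The observed mean sits a little more than two standard deviations below 6.7, which is why a result this small would be rare if response times had not changed.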

In example 2, we asked whether the accident RT data are likely if, in fact, there is no decrease in paramedics’ RTs. Because the reasoning of significance tests looks for evidence against a claim, we start with the claim we seek evidence against, such as “no decrease in response time.” This claim is our null hypothesis ($H_0$). This is the statement being tested in a significance test.

The significance test is designed to assess the strength of the evidence

against the null hypothesis. Usually the null hypothesis is a statement of “no

change”, or “no difference” from historical values. The null hypothesis can be thought of as the “status quo”

hypothesis.

The claim about the population that we are trying to find evidence for is the alternative hypothesis ($H_a$).

In example 2, the null hypothesis says “no decrease in the mean RT of 6.7 min.”: $H_0: \mu = 6.7$, while the alternative hypothesis says “there is a decrease in the mean RT of 6.7 min.”: $H_a: \mu < 6.7$, where $\mu$ is the mean response time to all calls involving life-threatening injuries in the city this year.

In this instance the alternative hypothesis is one-sided because

we are interested only in deviations from the null hypothesis

in one direction.

Hypotheses always refer to some population, not to a particular outcome. Be sure to state $H_0$ and $H_a$ in terms of a population parameter.

Example 3: Does the job satisfaction of assembly workers differ when their work is machine-paced rather

than self-paced? One study chose 18 subjects at random from a group of people who assembled

electronic devices. Half of the subjects were assigned at random to each of two groups. Both groups did similar assembly work, but one work setup allowed workers to pace themselves, and the other featured an assembly

line that moved at fixed time intervals so that the workers were paced by the machine. After two weeks, all subjects took the Job Diagnosis Survey (JDS), a test

of job satisfaction. Then they switched work setups and took the JDS again after two more weeks.

This is a matched-pairs design experiment. The response variable is the difference in JDS scores, self-paced minus machine-paced.

The parameter of interest is the mean $\mu$ of the differences in JDS scores in the population of all assembly workers. The null hypothesis says that there is no difference in the scores:

$H_0: \mu = 0$

The authors of the study simply wanted to know if the two work conditions have different levels of job satisfaction. They did not specify the direction of difference. The alternative hypothesis is therefore two-sided; that is, either $\mu < 0$ or $\mu > 0$. For simplicity, we write this as $\mu \ne 0$.

$H_a: \mu \ne 0$

The alternative hypothesis should express the hopes or suspicions we have before

we see the data. It is cheating to first look at the data and then frame the alternative

hypothesis to fit what the data show. If you do not have a specific direction firmly

in mind in advance, use a two-sided alternative.
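To see what the choice between a one-sided and a two-sided alternative does to the tail probability, here is a minimal sketch. It reuses the standardized value from Example 2 ($z = -2.2$) purely as an illustration; no data are given for Example 3, so nothing is computed for it. The function name `tail_probability` and its `alternative` labels are assumptions made for this example.

```python
from statistics import NormalDist

def tail_probability(z, alternative):
    """Tail area under the standard Normal curve for a standardized value z.

    alternative: "less" (H_a: mu < mu_0), "greater" (H_a: mu > mu_0),
    or "two-sided" (H_a: mu != mu_0). Hypothetical helper for illustration.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":
        # count values at least as extreme as z in either direction
        return 2 * std_normal.cdf(-abs(z))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

z = -2.2  # standardized sample mean from Example 2
one_sided = tail_probability(z, "less")
two_sided = tail_probability(z, "two-sided")

print(f"one-sided tail area: {one_sided:.4f}")
print(f"two-sided tail area: {two_sided:.4f}")
```

The two-sided area is twice the one-sided area, so the same data look like weaker evidence under a two-sided alternative. That is one reason the direction must be chosen before looking at the data.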
