Hypothesis Testing:
Population Mean and Proportion
Chapter 9
Chapter 9 - Learning Objectives
• Conduct a HYPOTHESIS TEST for a SINGLE population MEAN or PROPORTION.
1. Describe the logic of developing the null and alternative hypotheses.
2. Describe what is meant by Type I and Type II errors.
3. Test a hypothesis about a population mean when the variance of the population is known.
4. Test a hypothesis about a population mean when the variance of the population is unknown.
5. Test a hypothesis about a population proportion.
6. Explain the relationship between confidence intervals and hypothesis tests.
Introduction Hypothesis:
A belief/presumption/claim/assertion about something which can be rejected or accepted.
A statement that could be accepted or refuted subject to the presence or lack of enough statistical evidence (data).
Introduction Hypothesis Testing:
A general procedure for making inferences about a population parameter.
The main purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.
Enables us to determine whether the belief about the population parameter is supported by the data (evidence).
Introduction Examples:
Developing a drug requires some minimum level of efficacy.
In a random sample of patients (half of whom are treated with a particular drug while the other half are given a placebo), is there enough evidence to support the belief that the drug is effective in curing a certain disease?
Entrepreneurs who develop new products want to know what proportion of potential customers would buy their products. Suppose that a purchase of the product by 10% of the consumers is the minimum threshold.
In a random sample of potential customers, is there enough statistical evidence to support the view that more than 10% of the potential customers will purchase the new product?
Introduction
Answering these kinds of day-to-day, real-world problems involves testing the prevailing wisdom or belief (the hypothesis).
If sufficient statistical evidence is not available, implementing a new idea, product, or policy is difficult.
Introduction Two Types of Hypothesis:
1. The Null Hypothesis (H0): The belief or statement about something. It is what is typical of the population. It represents the “business as usual” idea, where nothing out of the ordinary occurs.
2. The Alternative Hypothesis (H1): The alternative to the belief or statement. It is the challenge to the existing idea: a view about the population characteristic that, if true, would trigger some new action, some change in the procedures that had previously defined “business as usual.”
The Null and Alternative Hypotheses: Note
• Non-directional (two-tailed test): The alternative is formulated in such a way that the response is non-directional (the parameter could be greater or lower than the value stated in the null).
• Directional (one-tailed test):
  Right-tailed test: The alternative is specified in such a way that the parameter is greater than the value stated in the null.
  Left-tailed test: The alternative is specified in such a way that the parameter is lower than the value stated in the null.
Introduction Example:
A person is accused of committing a crime.
Plaintiffs/prosecutors bring the case to a court. A jury is selected and evidence is presented.
Based on the evidence, not the accusation, jurors either convict or acquit the person.
When the trial begins:
What are the jurors expected to presume about the defendant?
What would the plaintiff/prosecutors want the jurors to believe?
What would the plaintiff/prosecutors need to provide to the jury?
Example-1
Two Types of Hypothesis:
1. H0: The defendant is innocent (Business as usual)
2. H1: The defendant is guilty as charged
Example-2
Two Types of Hypothesis:
1. H0: The new drug is not effective (business as usual)
2. H1: The new drug is effective (challenge to the business as usual view)
Introduction Jury decision:
1. Convict the defendant: Find the defendant GUILTY, i.e., there is enough evidence to support the claim. This corresponds to rejecting the null hypothesis H0.
2. Acquit the defendant: Find the defendant NOT GUILTY as charged, i.e., sufficient evidence was not presented. This corresponds to failing to reject the null hypothesis H0.
Two Potential Errors While Testing Hypothesis

                          State of Reality
Test Result Says      H0 True          H0 False
H0 is True            No error         Type II error
H0 is False           Type I error     No error
Introduction Two Types of Errors while hypothesis testing:
1. Type-I error (α): Rejecting a true null hypothesis. The probability of rejecting a true null hypothesis, α, is also called the significance level.
Example: wrongly convicting an innocent person.
2. Type-II error (β): Failing to reject a false null hypothesis. β represents the probability of failing to reject a false null hypothesis.
Example: acquitting a guilty defendant.
Note that, for a given sample size, the two are inversely related: reducing the probability of one error increases the probability of the other.
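The inverse relationship between α and β can be illustrated numerically. The following is a minimal sketch for a right-tailed Z test; the function name `type_ii_error` and all numbers (H0: µ = 100, true mean 102, σ = 10, n = 50) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Beta for a right-tailed Z test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)        # critical value for the chosen alpha
    shift = (mu_true - mu0) / (sigma / sqrt(n))    # true mean, in standard-error units
    # Probability that the test statistic stays below the critical value
    # even though H0 is false (i.e., we fail to reject a false null).
    return std_normal.cdf(z_alpha - shift)


# Hypothetical setting: H0: mu = 100, true mean 102, sigma = 10, n = 50
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=100, mu_true=102, sigma=10, n=50)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, the printed β grows as α shrinks; increasing n is the usual way to reduce both at once.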
9.2 Hypothesis Testing in practice
Example: Three highway patrol officers were assigned to check whether the average speed of cars traveling on a certain stretch of a highway matches the posted limit of 55 mph.
One officer claims that, on average, the speed at which cars travel on the stretch is greater than the posted limit of 55 mph.
The other officer claims the contrary: The speed at which cars travel on the stretch is less than the posted limit of 55 mph.
The third officer claims that on the average
the speed at which cars travel on the stretch is different from the posted limit of 55 mph.
Example: To test these claims, a random sample of 200 cars traveling over the same stretch of highway was taken. Results from the sample indicate that the average speed of cars traveling on the stretch is 56 mph. It is known that the standard deviation for the entire population of cars traveling over the stretch is 5 mph.
At the 5% level of significance, test whether each of the officers’ claims can be supported by enough statistical evidence.
Developing a Hypothesis:
We want to know whether the mean speed of cars traveling on the highway is greater than 55 mph, as stated by the first officer.
Because he trusts his officers, the sheriff wants to maintain each officer's claim. However, he needs to know whether there is sufficient information (evidence) to support each of the claims.
Step-1: Determine the null and the alternative hypotheses.
In the null hypothesis, we should always specify a single value of the population parameter (in this case, the average speed of all cars traveling on the stretch, µ).
H0: µ = 55
In the alternative hypothesis, we state the belief we want to maintain (in this case, the officer’s claim), provided that there is sufficient information that enables us to reject the null.
H1: µ > 55
Sample information: n = 200; sample mean = 56; population standard deviation = 5.
Approaches to Testing
There are two methods (approaches) to test whether there is enough information in the sample to support the hypothesis we want to maintain, i.e., the alternative hypothesis (H1).
1. The critical value (rejection region) method.
2. The p-value method.
The Rejection Region Method
This is a method in which we establish a range of values (the rejection region, bounded by the critical value) such that if the test statistic falls into the region, we reject the null hypothesis in favor of the alternative hypothesis.
The Rejection Region Method: Right-Tail Test
• If the sample mean lies in the rejection region, then there is sufficient evidence to support the alternative claim (H1), that is, to reject the null (H0).
[Figure: sampling distribution of the sample mean, with the rejection region ("Reject H0 here") in the right tail beyond the critical value of the sample mean.]
Step-2: Compute the Test Statistic
If the population standard deviation (σ) is known and the population is considered normally distributed, the test statistic is computed as follows:
$$Z_{Cal} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
The standardized test statistic
Using the information from the sample, we calculate the standardized test statistic as follows:
$$Z_{Cal} = \frac{56 - 55}{5 / \sqrt{200}} = 2.828 \approx 2.83$$
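The arithmetic for the standardized test statistic can be checked with a few lines of Python. This is only a sketch; the helper name `z_statistic` is an assumption introduced here, while the inputs are the sample values from the highway example.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized test statistic when the population SD (sigma) is known."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Highway example: sample mean 56 mph, H0: mu = 55, sigma = 5, n = 200
z_cal = z_statistic(sample_mean=56, mu0=55, sigma=5, n=200)
print(round(z_cal, 3))  # 2.828
```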
Step-3: Establish the Critical Value
Establish the critical value (the boundary of the rejection region): the value beyond which we can say “yes, there is sufficient information to support the hypothesis that we want to keep.”
H0: µ = 55
H1: µ > 55
To establish the critical value, we must first determine the level of significance α (the probability of error that we are willing to accept in making the decision). Common choices are α = 1%, 5%, or 10%.
In our case, it is 5%.
For a 5% (one-sided) level of significance, the critical value is Z0.05 = 1.645.
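Critical values are normally read from a Z table, but they can also be computed. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_value` and its `tail` argument are assumptions made for this illustration.

```python
from statistics import NormalDist


def critical_value(alpha, tail="right"):
    """Critical Z value for a given significance level and type of test."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    if tail == "two":
        # Alpha is split equally between the two tails.
        return std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'right', 'left', or 'two'")


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f}: right-tail critical value = {critical_value(alpha):.3f}")
```

For α = 0.05 the right-tail value agrees with the 1.645 used in the example; the left-tail value is its negative.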
Step-4: Compare the Test Statistic with the Established Critical Value
Compare the test statistic with the established critical value, and make the appropriate decision.
Decision Rule: Reject the null hypothesis in favor of the alternative if the calculated value of Z is greater than the critical value (Z at the chosen level of significance α):
Reject H0 if $|Z_{Cal}| > |Z_{\alpha}|$, with $Z_{Cal}$ falling in the direction stated in H1.
For a two-tailed test, use $Z_{\alpha/2}$ in place of $Z_{\alpha}$.
Using the information from the sample, we computed the test statistic to be
$$Z_{Cal} = \frac{56 - 55}{5 / \sqrt{200}} = 2.828 \approx 2.83$$
Conclusion
The calculated value of Z (2.83) is greater than the critical value (1.645). Thus we reject the null hypothesis.
2.828 > 1.645
Step-5: Make a Conclusion
If the computed sample mean falls in the rejection region, then we reject the null hypothesis. Doing so means that there is sufficient evidence to support the idea we want to maintain (the one stated in the alternative hypothesis).
Thus we can say that, at the 5% level of significance, the sample supports the officer’s claim that on average people are driving above the speed limit.
Repeat the procedure for each of the other officers’ claims as well. In each case:
• Formulate the null and the alternative hypotheses, and decide on the level of significance at which you want to test the hypothesis.
• Compute the test statistic and the critical values.
• Make the business decision (reject or fail to reject the null hypothesis) and indicate the implication.
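One way to repeat the procedure for all three officers' claims is sketched below using the rejection region method. The function `z_test` and the labels "greater", "less", and "two-sided" are assumptions introduced for illustration; the data are the sample values from the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alternative, alpha=0.05):
    """Rejection region method for a mean with known population SD.

    Returns the calculated Z and whether H0 is rejected.
    """
    std_normal = NormalDist()
    z_cal = (sample_mean - mu0) / (sigma / sqrt(n))
    if alternative == "greater":      # H1: mu > mu0 (right-tailed)
        reject = z_cal > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":       # H1: mu < mu0 (left-tailed)
        reject = z_cal < std_normal.inv_cdf(alpha)
    elif alternative == "two-sided":  # H1: mu != mu0 (two-tailed)
        reject = abs(z_cal) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("unknown alternative")
    return z_cal, reject


for claim in ("greater", "less", "two-sided"):
    z_cal, reject = z_test(56, 55, 5, 200, alternative=claim)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"H1 ({claim}): Z = {z_cal:.2f} -> {decision}")
```

Under these inputs the null is rejected for the "greater than" and "different from" claims but not for the "less than" claim, since a positive Z cannot fall in a left-tail rejection region.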
The P value method
As an alternative to testing a hypothesis using the rejection region method, we can also use the p-value method.
The p-value provides information about the amount of statistical evidence that supports the alternative hypothesis.
The p-value Method
What is the p-value? It is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
In other words, it is the smallest level of significance at which the null hypothesis would be rejected. It should not be confused with α, the pre-chosen probability of rejecting a true null hypothesis.
The p-value Method
Decision rule when using the p-value method:
Reject the null hypothesis (H0) only if the p-value is LESS THAN the level of significance (α).
9.2 Hypothesis Testing in practice
Example: In a random sample of 200 cars traveling over the same stretch of highway, the average speed of cars is found to be 56 mph. It is known that the standard deviation for the entire population of cars traveling over the stretch is 5 mph, and the posted speed limit is 55 mph.
At the 5% level of significance, use the p-value method to test whether cars are traveling faster than the speed limit.
The p-value Method
To use the p-value method:
1. Formulate the null and alternative hypotheses.
2. Compute the test statistic (Z calculated).
3. Depending on H1, find the p-value. For this right-tailed test, find P(Z > Zcal):
   p-value = P(Z > 2.83) = 0.5000 - P(0 < Z < 2.83) = 0.5000 - 0.4977 = 0.0023
4. Compare the p-value with the level of significance.
5. Reject the null hypothesis only if the p-value is less than the significance level.
6. Make the appropriate inference.
The p-value Method
Decision
p-value (0.0023) < 0.05 (significance level). So we reject the null hypothesis.
What does 0.0023 mean? If the true mean speed were 55 mph, a sample mean of 56 mph or more would be expected in only about 0.23% of random samples of 200 cars.
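The p-value for the highway example can also be computed directly rather than read from a table. The following is a minimal sketch using the standard library's `NormalDist`; the variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Highway example: H0: mu = 55, H1: mu > 55
sample_mean, mu0, sigma, n = 56, 55, 5, 200
alpha = 0.05

z_cal = (sample_mean - mu0) / (sigma / sqrt(n))

# Right-tailed test: the p-value is the area to the right of z_cal.
p_value = 1 - NormalDist().cdf(z_cal)

print(f"Z = {z_cal:.2f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The computed value matches the table-based 0.0023 to four decimal places, and the decision is the same as with the rejection region method.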
Hands-On-Problem
In the past, patrons of a cinema complex have spent an average of $2.50 on popcorn and other snacks, with a standard deviation of $0.90.
It is known that the amount of expenditure is normally distributed. After an intensive public campaign about the health effects of popcorn, the mean expenditure of a sample of 18 patrons is found to be $2.10.
At the 5% significance level, test whether this recent experience suggests a decline in spending.
Hands-On-Problem
Formulate the null and the alternative hypotheses for the problem.
Establish the critical value in terms of the standardized (Z) value and test the hypothesis using the rejection region method.
Compute the p-value and test the hypothesis using the p-value method.
Interpret your results.
Hands-On-Problem Using the Rejection Region Method
Formulate the null and alternative hypotheses:
H0: µ = 2.50    H1: µ < 2.50
Z cal (test statistic):
$$Z_{Cal} = \frac{2.10 - 2.50}{0.90 / \sqrt{18}} = -1.89$$
Z table = -Z0.05 = -1.645
Decision: |Z cal| > |Z table| (1.89 > 1.645), and Z cal lies in the left tail. Thus reject H0.
Conclusion: There is enough evidence to support the view that the public campaign has worked effectively (the campaign has resulted in reduced consumption of popcorn).
Hands-On-Problem Using the p-Value Method
H0: µ = 2.50    H1: µ < 2.50
Z cal (test statistic):
$$Z_{Cal} = \frac{2.10 - 2.50}{0.90 / \sqrt{18}} = -1.89$$
p-value = P(Z < -1.89) = P(Z > 1.89) = 0.5000 - 0.4706 = 0.0294
Level of significance: α = 0.05
Decision: p-value = 0.0294 < α = 0.05. Thus reject H0.
Conclusion: There is enough evidence to support the view that the public campaign has worked effectively (it has resulted in reduced consumption of popcorn).
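Both methods for the cinema problem can be reproduced in a short script. This is a sketch only; the variable names are assumptions, and the data are those stated in the problem.

```python
from math import sqrt
from statistics import NormalDist

# Cinema example: H0: mu = 2.50, H1: mu < 2.50 (left-tailed test)
sample_mean, mu0, sigma, n = 2.10, 2.50, 0.90, 18
alpha = 0.05
std_normal = NormalDist()

z_cal = (sample_mean - mu0) / (sigma / sqrt(n))

# Rejection region method: reject if Z falls below the left-tail critical value.
z_critical = std_normal.inv_cdf(alpha)
reject_by_region = z_cal < z_critical

# p-value method: area to the left of the calculated Z.
p_value = std_normal.cdf(z_cal)
reject_by_p_value = p_value < alpha

print(f"Z = {z_cal:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.4f}")
print("rejection region:", "reject H0" if reject_by_region else "fail to reject H0")
print("p-value method:  ", "reject H0" if reject_by_p_value else "fail to reject H0")
```

The script's p-value differs slightly in the later decimal places from the table-based 0.0294 because the table lookup rounds Z to -1.89 first; the decision is unaffected, and the two methods always agree with each other.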
Testing Proportions
We apply the same procedure we used in the previous examples.
However, it is important to note the following when testing proportions:
1. We apply only the Z distribution when testing a sample proportion.
2. To apply the Z test, we need np > 5 and n(1-p) > 5.
3. Both the rejection region and the p-value methods can be used to test a hypothesis about proportions.
Interpreting the p-value
• If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis (at a significance level of 1% or more).
• If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis (at a significance level of 5% or more).
• If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis (at a significance level of 10% or more).
• If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis, unless we choose a level of significance that exceeds 10%.
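The interpretation scale above can be written as a small lookup. This is a sketch; the function name `evidence_strength` and the choice to place a p-value exactly on a boundary in the weaker category are assumptions, since the scale does not say how ties are handled.

```python
def evidence_strength(p_value):
    """Map a p-value to the evidence label used in the interpretation scale."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "overwhelming"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "weak"
    return "none"


# p-values from the highway and cinema examples, plus one hypothetical value
for p in (0.0023, 0.0294, 0.20):
    print(f"p-value = {p}: {evidence_strength(p)} evidence for H1")
```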
Testing Proportions
Example: An auto repair shop owner claims that no less than 70% of his customers are satisfied with his work. However, a survey of 150 customers reveals that only 66% are satisfied with the work performed. At the 5% level of significance, can we conclude that less than 70% of the customers are satisfied?
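A possible way to work the auto repair example is sketched below. It assumes the usual one-sample Z statistic for a proportion, Z = (p̂ - p0) / sqrt(p0(1 - p0)/n), with the standard error computed under H0; the slides do not show this formula, so treat it and the variable names as assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Auto repair example: H0: p = 0.70, H1: p < 0.70 (left-tailed test)
p0, p_hat, n = 0.70, 0.66, 150
alpha = 0.05
std_normal = NormalDist()

# Check the conditions for using the Z test: np > 5 and n(1 - p) > 5.
assert n * p0 > 5 and n * (1 - p0) > 5

# Standard error of the sample proportion, computed under H0.
standard_error = sqrt(p0 * (1 - p0) / n)
z_cal = (p_hat - p0) / standard_error

z_critical = std_normal.inv_cdf(alpha)   # left-tail critical value
p_value = std_normal.cdf(z_cal)          # area to the left of z_cal
reject = z_cal < z_critical

print(f"Z = {z_cal:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.3f}")
print("reject H0" if reject else "fail to reject H0")
```

Under these assumptions Z is about -1.07, which does not fall beyond -1.645, so the sample would not provide enough evidence at the 5% level to conclude that fewer than 70% of customers are satisfied.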
Testing a Hypothesis about the Mean of the Population: Variance of the Population is Unknown
We follow a procedure similar to the one used in testing a hypothesis about the mean when the variance of the population is known, except that we use a t-distribution instead of a Z-distribution.
Steps to solve such a problem:
1. Formulate the null and the alternative hypotheses.
2. Compute the test statistic (calculated t) using the information from the sample, and find the critical value (table value of t) at n-1 degrees of freedom and the selected level of significance.
3. If using the rejection region method, compare the calculated t with the critical value of t, and reject the null hypothesis if the calculated value is more extreme than the table value in the direction stated in H1 (for example, greater than the table value in a right-tailed test).
4. If using the p-value method, compute the p-value using the test statistic, and reject the null hypothesis if the p-value is less than the selected significance level.
5. Make the appropriate inference depending on the results of the test.
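The rejection region version of these steps can be sketched as follows. The example is hypothetical: it reuses the cinema numbers but treats 0.90 as a sample standard deviation, and the critical value 1.740 (one-tailed, 5%, 17 degrees of freedom) is taken from a standard t table and passed in by hand, because the Python standard library has no t-distribution. The function names are assumptions.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, sample_sd, n):
    """Calculated t when the population SD is unknown (sample SD used instead)."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


def reject_h0(t_cal, t_table, alternative):
    """Rejection region rule; t_table is the positive value read from a t table."""
    if alternative == "greater":
        return t_cal > t_table
    if alternative == "less":
        return t_cal < -t_table
    if alternative == "two-sided":
        # For a two-sided test, t_table must be the alpha/2 table value.
        return abs(t_cal) > t_table
    raise ValueError("unknown alternative")


# Hypothetical data: n = 18, sample mean 2.10, sample SD 0.90, H0: mu = 2.50
n = 18
t_cal = t_statistic(sample_mean=2.10, mu0=2.50, sample_sd=0.90, n=n)
t_table = 1.740  # t table, one-tailed alpha = 0.05, n - 1 = 17 degrees of freedom

print(f"t = {t_cal:.2f}, degrees of freedom = {n - 1}")
print("reject H0" if reject_h0(t_cal, t_table, "less") else "fail to reject H0")
```

The t critical value is larger in magnitude than the corresponding Z value of 1.645, which reflects the extra uncertainty from estimating the standard deviation from a small sample.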