Week 8
description
Transcript of Week 8
![Page 1: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/1.jpg)
Week 8
Chapter 8 - Hypothesis Testing I:
The One-Sample Case
![Page 2: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/2.jpg)
Chapter 8
Hypothesis Testing I:
The One-Sample Case
![Page 3: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/3.jpg)
Significant Differences
Hypothesis testing is designed to detect significant differences: differences that did not occur by random chance. Hypothesis testing is also called significance testing
This chapter focuses on the “one sample” case: we compare a random sample against a population
We compare a sample statistic to a (hypothesized) population parameter to see if there is a significant difference
![Page 4: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/4.jpg)
Scenario #1: Dependent variable is measured at the interval/ratio level with a large sample size and known population standard deviation
![Page 5: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/5.jpg)
Example
Effectiveness of rehabilitation center for alcoholics
Absentee rates for sample and community:
![Page 6: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/6.jpg)
Example
Main question: Does the population of all treated alcoholics have different absentee rates than the community as a whole?What is the cause of the difference between 6.8 and 7.2?
Real difference? Random chance?
![Page 7: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/7.jpg)
Example
![Page 8: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/8.jpg)
Example
Ho: μ = 7.2 days per year
![Page 9: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/9.jpg)
Example
![Page 10: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/10.jpg)
Example
The sample outcome (-3.15) falls in the shaded area,
thus we reject Ho
![Page 11: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/11.jpg)
The Five-Step Model
1. Make assumptions and meet test requirements
2. State the null hypothesis (Ho)
3. Select the sampling distribution and establish the critical region
4. Compute the test statistic
5. Making a decision and interpret the results of the test
![Page 12: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/12.jpg)
The Five-Step Model: Example
The education department at a university has been accused of “grade inflation” so education majors have much higher GPAs than students in general
The mean GPA for all students is 2.70 (μ) A random sample of 117 (N) education
majors has a mean GPA of 3.00, with a standard deviation (s) of 0.70
![Page 13: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/13.jpg)
Step 1: Make Assumptions and Meet Test Requirements Random sampling
Hypothesis testing assumes samples were selected according to EPSEM
The sample of 117 was randomly selected from all education majors
Level of measurement is interval-ratio GPA is an interval-ratio level variable, so the mean
is an appropriate statistic Sampling distribution is normal in shape
This is a “large” sample (N ≥ 100)
![Page 14: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/14.jpg)
Step 2: State the Null Hypothesis
Ho: μ = 2.7
The sample of 117 comes from a population that has a GPA of 2.7
The difference between 2.7 and 3.0 is trivial and caused by random chance
H1: μ ≠ 2.7
The sample of 117 comes from a population that does not have a GPA of 2.7
The difference between 2.7 and 3.0 reflects an actual difference between education majors and other students
![Page 15: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/15.jpg)
Step 3: Select Sampling Distribution and Establish the Critical Region Sampling Distribution= Z
Alpha (α) = 0.05 Alpha is the indicator of “rare” events Any difference with a probability less than α is rare
and will cause us to reject the Ho
Critical region begins at ±1.96 This is the critical Z score associated with a two-
tailed test and alpha equal 0.05 If the obtained Z score falls in the critical region,
reject Ho
![Page 16: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/16.jpg)
Step 4: Compute the Test Statistic
Z (obtained) = 4.62
62.4111770.0
70.200.3
1Ns
XZ
![Page 17: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/17.jpg)
Step 5: Make a Decision and Interpret Results
The obtained Z score fell in the critical region
so we reject the Ho
If the H0 were true, a sample outcome of
3.00 would be unlikely Therefore, the Ho is false and must be
rejected Education majors have a GPA that is
significantly different from the general student body
![Page 18: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/18.jpg)
The Five Step Model: Summary
In hypothesis testing, we try to identify statistically significant differences that did not occur by random chance
In this example, the difference between the parameter 2.70 and the statistic 3.00 was large and unlikely (p < 0.05) to have occurred by random chance
![Page 19: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/19.jpg)
The Five Step Model: Summary
We rejected the Ho and concluded that the difference
was significant It is very likely that education majors have GPAs higher
than the general student body
![Page 20: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/20.jpg)
Crucial Choices in Five-Step Model
Model is fairly rigid, but there are two crucial choices:
1. One-tailed or two-tailed test
2. Alpha (α) level
![Page 21: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/21.jpg)
Choosing a One or Two-Tailed Test
Two-tailed: States that population mean is “not equal” to value stated in null hypothesis
One-tailed: Differences in a specific direction Examples:
![Page 22: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/22.jpg)
Choosing a One or Two-Tailed Test
![Page 23: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/23.jpg)
Choosing a One or Two-Tailed Test
![Page 24: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/24.jpg)
Choosing a One or Two-Tailed Test
![Page 25: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/25.jpg)
Choosing a One or Two-Tailed Test
![Page 26: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/26.jpg)
Selecting an Alpha Level
By assigning an alpha level, one defines an “unlikely” sample outcome
Alpha level is the probability that the decision to reject the null hypothesis is incorrect
Examine this table for critical regions:
![Page 27: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/27.jpg)
Type I and Type II Errors
Type I, or alpha error: Rejecting a true null hypothesis
Type II, or beta error: Failing to reject a false null hypothesis
Examine table below for relationships between decision making and errors
![Page 28: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/28.jpg)
Scenario #2: Dependent variable is measured at the interval/ratio level with a large sample size and unknown population standard deviation
![Page 29: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/29.jpg)
Scenario #2: How can we test a hypothesis when the population
standard deviation (σ) is not known, as is usually the case? For large samples (N ≥ 100), can use the sample
standard deviation (s) as an estimator of the population standard deviation (σ)
Use standard (Z) normal distribution Thus, follow the same procedures as you would
for Scenario #1
![Page 30: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/30.jpg)
Scenario #3: Dependent variable is measured at the interval/ratio level with a small sample size and unknown population standard deviation
![Page 31: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/31.jpg)
Scenario #3
For small samples, s is too biased an estimator of σ so do not use standard normal distribution
Use Student’s t distribution
![Page 32: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/32.jpg)
The Student’s t Distribution
Compare the Z distribution to the Student’s t distribution:
![Page 33: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/33.jpg)
The Student’s t Distribution
![Page 34: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/34.jpg)
Student’s t: Using Appendix B
How t table differs from Z table:
1. Column at left for degrees of freedom (df) df = N – 1
2. Alpha levels along top two rows: one- and two-tailed
3. Entries in table are actual scores: t(critical) Mark beginning of critical region, not
areas under the curve
![Page 35: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/35.jpg)
Scenario #4: Dependent variable is measured at the nominal level with a large sample size and known population standard deviation
![Page 36: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/36.jpg)
The Five-Step Model: Proportions
When analyzing variables that are not measured at the interval-ratio level (and therefore a mean is inappropriate), we can test a hypothesis on a one sample proportion instead
The five step model remains primarily the same, with the following changes: The assumptions are: random sampling, nominal level
of measurement, and normal sampling distribution The formula for Z(obtained) is:
![Page 37: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/37.jpg)
The Five-Step Model: Proportions
A random sample of 122 households in a low-income neighborhood revealed that 53 (Ps = 0.43 = 53/122) of the households were headed by women
In the city as a whole, the proportion of women-headed households is 0.39 (Pu)
Are households in lower-income neighborhoods significantly different from the city as a whole?
Conduct a 90% hypothesis test (alpha = 0.10)
![Page 38: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/38.jpg)
Step 1: Make Assumptions and Meet Test Requirements Random sampling
Hypothesis testing assumes samples were selected according to EPSEM
The sample of 122 was randomly selected from all lower-income neighborhoods
Level of measurement is nominal Women-headed households is measured as a
proportion Sampling distribution is normal in shape
This is a “large” sample (N ≥ 100)
![Page 39: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/39.jpg)
Step 2: State the Null Hypothesis
Ho: Pu = 0.39
The sample of 122 comes from a population where 39% of households are headed by women
The difference between 0.43 and 0.39 is trivial and caused by random chance
H1: Pu ≠ 0.39
The sample of 122 comes from a population where the percent of women-headed households is not 39
The difference between 0.43 and 0.39 reflects an actual difference between lower-income neighborhoods and all neighborhoods
![Page 40: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/40.jpg)
Step 3: Select Sampling Distribution and Establish the Critical Region
Sampling Distribution= Z Alpha (α) = 0.10 (two-tailed)
Critical region begins at ±1.65 This is the critical Z score associated with
a two-tailed test and alpha equal 0.10 If the obtained Z score falls in the critical
region, reject Ho
![Page 41: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/41.jpg)
Step 4: Compute the Test Statistic
Z (obtained) = +0.91
91.0122)39.01)(39.0(
39.043.0
N)P1(P
PPZ
uu
us
![Page 42: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/42.jpg)
Step 5: Make a Decision and Interpret Results
The obtained Z score did not fall in the critical
region so we fail to reject the Ho
If the H0 were true, a sample outcome of
0.43 would be likely Therefore, the Ho is not false and cannot
be rejected The population of women-headed households
in lower-income neighborhoods is not significantly different from the city as a whole
![Page 43: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/43.jpg)
Scenario #5: Dependent variable is measured at the nominal level with a small sample size
This is not considered in your text
![Page 44: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/44.jpg)
Conclusion
Scenario #1: Dependent variable is measured at the interval/ratio level with a large sample size and known population standard deviation Use standard Z distribution formula
Scenario #2: Dependent variable is measured at the interval/ratio level with a large sample size and unknown population standard deviation Use standard Z distribution formula
Scenario #3: Dependent variable is measured at the interval/ratio level with a small sample size and unknown population standard deviation Use standard T distribution formula
![Page 45: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/45.jpg)
Scenario #4: Dependent variable is measured at the nominal level with a large sample size and known population standard deviation Use slightly modified Z distribution formula
Scenario #5: Dependent variable is measured at the nominal level with a small sample size\ This is not covered in your text or in class
![Page 46: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/46.jpg)
USING SPSS
On the top menu, click on “Analyze” Select “Compare Means” Select “One Sample T Test”
![Page 47: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/47.jpg)
Hypothesis testing in SPSS
One-sample test (value of the mean in the population) Analyze / Compare Means / One sample test
Click on OPTIONS to choose the confidence level
![Page 48: Week 8](https://reader035.fdocuments.us/reader035/viewer/2022062722/56813a46550346895da238fa/html5/thumbnails/48.jpg)
Output
T-TestOne-Sample Statistics
779 404.4871 115.41804 4.13528Amount spentN Mean Std. Deviation
Std. ErrorMean
One-Sample Test
1.085 778 .278 4.4871 -3.6305 12.6047Amount spentt df Sig. (2-tailed)
MeanDifference Lower Upper
95% ConfidenceInterval of the
Difference
Test Value = 400
p value
The null hypothesis is not rejected (as the p-value is larger than 0.05)