Chapter 13
![Page 1: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/1.jpg)
Essential Statistics Chapter 13 1
Chapter 13
Introduction to Inference
![Page 2: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/2.jpg)
What We’ll Learn
<Part I>
Confidence interval
Confidence level
Margin of error
Critical value z*
<Part II>
Null hypothesis H0
Alternative hypothesis Ha
z test statistic
P-value
Significance level
![Page 3: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/3.jpg)
Statistical Inference
Provides methods for drawing conclusions about a population from sample data.
– Confidence intervals: What is the population mean?
– A confidence interval gives an estimated range of values that is likely to include an unknown population parameter, such as the population mean µ.
![Page 4: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/4.jpg)
Inference about a Mean: Simple Conditions
◙ SRS from the population of interest
◙ Variable has a Normal distribution N(μ, σ) in the population
◙ Although the value of μ is unknown, the value of the population standard deviation σ is known
![Page 5: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/5.jpg)
Confidence Interval
A level C confidence interval has two parts:
1. An interval computed from the data, which usually has the form:
estimate ± margin of error
2. The confidence level C, which is the probability that the interval will capture the true parameter value in repeated samples; that is, C is the success rate for the method.
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
![Page 6: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/6.jpg)
Video references
http://www.youtube.com/watch?v=Ohz-PZqaMtk
http://www.youtube.com/watch?v=Q6Lj_8yt4Qk <short, brief definition of a confidence interval>
http://www.youtube.com/watch?v=U59Rbpus824
![Page 7: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/7.jpg)
Case Study: NAEP Quantitative Scores
(National Assessment of Educational Progress)
Rivera-Batiz, F. L., “Quantitative literacy and the likelihood of employment among young adults,” Journal of Human Resources, 27 (1992), pp. 313-328.
What is the average score for all young adult males?
![Page 8: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/8.jpg)
Case Study: NAEP Quantitative Scores
The NAEP survey includes a short test of quantitative skills, covering mainly basic arithmetic and the ability to apply it to realistic problems. Scores on the test range from 0 to 500, with higher scores indicating greater numerical abilities. It is known that NAEP scores have standard deviation σ = 60.
![Page 9: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/9.jpg)
Case Study: NAEP Quantitative Scores
In a recent year, 840 men 21 to 25 years of age were in the NAEP sample. Their mean quantitative score was 272.
On the basis of this sample, estimate the mean score in the population of all 9.5 million young men of these ages.
![Page 10: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/10.jpg)
Case Study: NAEP Quantitative Scores
1. To estimate the unknown population mean μ, use the sample mean x̄ = 272.
2. The law of large numbers suggests that x̄ will be close to μ, but there will be some error in the estimate.
3. The sampling distribution of x̄ has the Normal distribution with mean μ and standard deviation
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{60}{\sqrt{840}} \approx 2.1$$
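The standard deviation of the sample mean in step 3 can be checked with a few lines of Python. This is a minimal sketch; the function name `standard_error` is chosen here for illustration and does not come from the slides.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean when sigma is known."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


# NAEP case study values from the slide: sigma = 60, n = 840.
se = standard_error(60, 840)
print(round(se, 2))  # the slide rounds this value to 2.1
```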
![Page 11: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/11.jpg)
Case Study: NAEP Quantitative Scores
![Page 12: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/12.jpg)
Case Study: NAEP Quantitative Scores
4. The 68-95-99.7 rule indicates that x̄ and μ are within two standard deviations (4.2) of each other in about 95% of all samples.
$$\bar{x} - 4.2 = 272 - 4.2 = 267.8$$
$$\bar{x} + 4.2 = 272 + 4.2 = 276.2$$
![Page 13: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/13.jpg)
Case Study: NAEP Quantitative Scores
So, if we estimate that μ lies within 4.2 of x̄, we’ll be right about 95% of the time.
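The "right about 95% of the time" claim can be illustrated by simulation. The sketch below assumes a hypothetical true mean of 270 (the real value is unknown and is not given in the slides) and draws sample means directly from their sampling distribution rather than simulating 840 individual scores each time.

```python
import random


def coverage(true_mean, sigma, n, margin, trials=2000, seed=0):
    """Fraction of simulated samples whose mean lands within `margin` of the true mean."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5  # standard deviation of the sample mean
    hits = 0
    for _ in range(trials):
        # One simulated sample mean, drawn from N(true_mean, sigma / sqrt(n)).
        xbar = rng.gauss(true_mean, se)
        if abs(xbar - true_mean) <= margin:
            hits += 1
    return hits / trials


# true_mean = 270 is an assumption made only for this illustration.
rate = coverage(true_mean=270.0, sigma=60.0, n=840, margin=4.2)
print(f"fraction of intervals that captured the mean: {rate:.3f}")
```

The fraction should come out near 0.95 whatever true mean is assumed, because the margin is tied to the spread of x̄ and not to the location of μ.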
![Page 14: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/14.jpg)
Confidence Interval: Mean of a Normal Population
Take an SRS of size n from a Normal population with unknown mean μ and known standard deviation σ. A level C confidence interval for μ is:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
z* is called the critical value, and z* and –z* mark off the central area C under a standard Normal curve (next slide); values of z* for many choices of C can be found at the bottom of Table C in the back of the textbook, and the most common values are on the next slide.
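As an alternative to looking z* up in Table C, the critical value can be computed from the confidence level. This is a sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `critical_value` is an illustrative choice, not something from the textbook.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """z* such that the central area under the standard Normal curve equals `confidence`."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A central area C leaves (1 - C) / 2 in each tail.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}  z* = {critical_value(c):.3f}")
```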
![Page 15: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/15.jpg)
Confidence Interval: Mean of a Normal Population
![Page 16: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/16.jpg)
Case Study: NAEP Quantitative Scores
Using the 68-95-99.7 rule gave an approximate 95% confidence interval. A more precise 95% confidence interval can be found using the appropriate value of z* (1.960) with the previous formula.
$$\bar{x} - z^* \frac{\sigma}{\sqrt{n}} = 272 - (1.960)(2.1) = 272 - 4.116 = 267.884$$
$$\bar{x} + z^* \frac{\sigma}{\sqrt{n}} = 272 + (1.960)(2.1) = 272 + 4.116 = 276.116$$
We are 95% confident that the average NAEP quantitative score for all young adult males is between 267.884 and 276.116.
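The same interval can be reproduced in code. The sketch below uses a hypothetical helper named `z_interval`. Because it does not round the standard deviation of x̄ to 2.1 first, its endpoints differ slightly from 267.884 and 276.116.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Level-C confidence interval for a Normal mean with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


# NAEP case study: sample mean 272, sigma = 60, n = 840.
low, high = z_interval(xbar=272, sigma=60, n=840, confidence=0.95)
print(f"{low:.2f} to {high:.2f}")
```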
![Page 17: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/17.jpg)
Expressing a Confidence Interval
◙ The confidence level describes the uncertainty associated with a sampling method.
◙ A 95% confidence level means that we would expect 95% of the interval estimates to include the population parameter.
To express a confidence interval, you need three pieces of information:
◙ Confidence level
◙ Statistic
◙ Margin of error
![Page 18: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/18.jpg)
Constructing a Confidence Interval
◙ Identify a sample statistic (e.g., mean, standard deviation).
◙ Select a confidence level: 90%, 95%, 99%, or another level.
◙ Find the margin of error: margin of error = critical value × standard deviation of the statistic.
◙ Confidence interval = sample statistic ± margin of error
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$$
![Page 19: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/19.jpg)
Exercise 1
Suppose we want to estimate the average weight of an adult male in DeKalb County, GA. We draw a random sample of 1,000 men from a population of 1,000,000 men and weigh them. We find that the average weight in the sample is 180 pounds, and the standard deviation is 30 pounds. What is the 95% confidence interval?
(A) 180 ± 1.86 (B) 180 ± 3.0 (C) 180 ± 5.88 (D) 180 ± 30
![Page 20: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/20.jpg)
Hypothesis Test & P-Value
Hypotheses (H0, Ha) → Significance → Sample → P-value → Conclusion
Use the sample mean to draw a conclusion about the population mean.
![Page 21: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/21.jpg)
Stating Hypotheses: Null Hypothesis, H0
The statement being tested in a statistical test is called the null hypothesis.
Usually the null hypothesis is a statement of “no effect” or “no difference”, or it is a statement of equality.
When performing a hypothesis test, we assume that the null hypothesis is true until we have sufficient evidence against it.
![Page 22: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/22.jpg)
Stating Hypotheses: Alternative Hypothesis, Ha
The statement we are trying to find evidence for is called the alternative hypothesis.
Usually the alternative hypothesis is a statement of “there is an effect” or “there is a difference”, or it is a statement of inequality.
The alternative hypothesis is what we are trying to prove.
![Page 23: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/23.jpg)
Case Study I: Sweetening Colas
Diet colas use artificial sweeteners to avoid sugar. These sweeteners gradually lose their sweetness over time. Trained testers sip the cola and assign a “sweetness score” of 1 to 10. The cola is then retested after some time and the two scores are compared to determine the difference in sweetness after storage. Bigger differences indicate bigger loss of sweetness.
![Page 24: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/24.jpg)
Case Study I: Sweetening Colas
Suppose we know that for any cola, the sweetness loss scores vary from taster to taster according to a Normal distribution with standard deviation σ = 1.
The mean μ for all tasters measures loss of sweetness.
The sweetness losses for a new cola, as measured by 10 trained testers, yield an average sweetness loss of x̄ = 1.02. Do the data provide sufficient evidence that the new cola lost sweetness in storage?
![Page 25: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/25.jpg)
Case Study I: Sweetening Colas
If the claim that μ = 0 is true (no loss of sweetness, on average), the sampling distribution of x̄ from 10 tasters is Normal with mean μ = 0 and standard deviation
$$\frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{10}} = 0.316$$
The data yielded x̄ = 1.02, which is more than three standard deviations from μ = 0. This is strong evidence that the new cola lost sweetness in storage.
If the data yielded x̄ = 0.3, which is less than one standard deviation from μ = 0, there would be no evidence that the new cola lost sweetness in storage.
![Page 26: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/26.jpg)
Case Study I: Sweetening Colas
![Page 27: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/27.jpg)
The Hypotheses for Means
Null: H0: μ = μ0
One-sided alternatives:
Ha: μ > μ0
Ha: μ < μ0
Two-sided alternative:
Ha: μ ≠ μ0
![Page 28: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/28.jpg)
Case Study I: Sweetening Colas
The null hypothesis is no average sweetness loss occurs, while the alternative hypothesis (that which we want to show is likely to be true) is that an average sweetness loss does occur.
H0: μ = 0
Ha: μ > 0
This is considered a one-sided test because we are interested only in determining if the cola lost sweetness (gaining sweetness is of no consequence in this study).
![Page 29: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/29.jpg)
Case Study II: Studying Job Satisfaction
Does the job satisfaction of assembly workers differ when their work is machine-paced rather than self-paced? A matched pairs study was performed on a sample of workers, and each worker’s satisfaction was assessed after working in each setting. The response variable is the difference in satisfaction scores, self-paced minus machine-paced.
![Page 30: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/30.jpg)
Case Study II: Studying Job Satisfaction
The null hypothesis is no average difference in scores in the population of assembly workers, while the alternative hypothesis (that which we want to show is likely to be true) is that there is an average difference in scores in the population of assembly workers.
H0: μ = 0
Ha: μ ≠ 0
This is considered a two-sided test because we are interested in determining if a difference exists (the direction of the difference is not of interest in this study).
![Page 31: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/31.jpg)
Test Statistic: Testing the Mean of a Normal Population
Take an SRS of size n from a Normal population with unknown mean μ and known standard deviation σ. The test statistic for hypotheses about the mean (H0: μ = μ0) of a Normal distribution is the standardized version of x̄:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$$
![Page 32: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/32.jpg)
Case Study I: Sweetening Colas
If the null hypothesis of no average sweetness loss is true, the test statistic would be:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{1.02 - 0}{1 / \sqrt{10}} = 3.23$$
Because the sample result is more than 3 standard deviations above the hypothesized mean 0, it gives strong evidence that the mean sweetness loss is not 0, but positive.
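The test statistic for the cola data can be computed directly from its definition. This is a minimal sketch; `z_statistic` is a name introduced here for illustration.

```python
import math


def z_statistic(xbar, mu0, sigma, n):
    """Standardized sample mean: how many standard deviations xbar lies from mu0."""
    return (xbar - mu0) / (sigma / math.sqrt(n))


# Cola case study: xbar = 1.02, hypothesized mean 0, sigma = 1, n = 10 tasters.
z = z_statistic(xbar=1.02, mu0=0, sigma=1, n=10)
print(round(z, 2))
```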
![Page 33: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/33.jpg)
P-value
■ The P-value is the probability, computed assuming that the null hypothesis is true, of getting a test statistic at least as extreme as the z observed in the sample. It is the area marked off by z under the standard Normal curve.
■ The smaller the P-value, the more significant the difference between the null hypothesis and the sample results.
■ The smaller the P-value, the stronger the evidence the data provide against the null hypothesis.
![Page 34: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/34.jpg)
P-value for Testing Means
Ha: μ > μ0
When Ha contains the “greater than” symbol, perform a right-tailed test. The P-value is the probability of getting a value as large as or larger than the observed test statistic (z) value.
Ha: μ < μ0
When Ha contains the “less than” symbol, perform a left-tailed test. The P-value is the probability of getting a value as small as or smaller than the observed test statistic (z) value.
Ha: μ ≠ μ0
When Ha contains the “not equal to” symbol, perform a two-tailed test. The P-value is two times the probability of getting a value as large as or larger than the absolute value of the observed test statistic (z) value.
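The three tail rules above can be sketched as one small function. It uses the standard Normal CDF from `statistics.NormalDist` in place of a printed table; the function name `p_value` and the labels `"greater"`, `"less"`, and `"two-sided"` are illustrative choices, not notation from the slides.

```python
from statistics import NormalDist


def p_value(z: float, alternative: str) -> float:
    """P-value of a z test statistic for the given form of Ha."""
    cdf = NormalDist().cdf  # standard Normal cumulative distribution function
    if alternative == "greater":    # Ha: mu > mu0, right-tailed
        return 1 - cdf(z)
    if alternative == "less":       # Ha: mu < mu0, left-tailed
        return cdf(z)
    if alternative == "two-sided":  # Ha: mu != mu0, both tails
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")


# z = 3.23 is the cola test statistic computed earlier in the chapter.
print(f"{p_value(3.23, 'greater'):.4f}")
```

By symmetry of the Normal curve, the two-sided P-value is the same for z and –z.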
![Page 35: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/35.jpg)
![Page 36: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/36.jpg)
Case Study I: Sweetening Colas
For test statistic z = 3.23 and alternative hypothesis Ha: μ > 0, the P-value would be:
P-value = P(Z ≥ 3.23) = 1 – 0.9994 = 0.0006
If H0 is true, there is only a 0.0006 (0.06%) chance that we would see results at least as extreme as those in the sample. Since we saw results that are unlikely if H0 is true, we have evidence against H0 and in favor of Ha.
![Page 37: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/37.jpg)
Case Study I: Sweetening Colas
![Page 38: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/38.jpg)
Case Study II: Studying Job Satisfaction
Suppose job satisfaction scores follow a Normal distribution with standard deviation σ = 60. Data from 18 workers gave a sample mean score of 17. If the null hypothesis of no average difference in job satisfaction is true, the test statistic would be:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{17 - 0}{60 / \sqrt{18}} = 1.20$$
![Page 39: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/39.jpg)
Case Study II: Studying Job Satisfaction
For test statistic z = 1.20 and alternative hypothesis Ha: μ ≠ 0, the P-value would be:
P-value = P(Z < –1.20 or Z > 1.20) = 2 P(Z < –1.20) = 2 P(Z > 1.20) = (2)(0.1151) = 0.2302
If H0 is true, there is a 0.2302 (23.02%) chance that we would see results at least as extreme as those in the sample. Since we saw results that are likely if H0 is true, we do not have good evidence against H0 and in favor of Ha.
![Page 40: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/40.jpg)
Case Study II: Studying Job Satisfaction
![Page 41: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/41.jpg)
Statistical Significance α
If the P-value is as small as or smaller than the significance level α (i.e., P-value ≤ α), then we say that the data give results that are statistically significant at level α.
We reject the null hypothesis when the P-value is as small as or smaller than the significance level α.
If we choose α = 0.05, we are requiring that the data give evidence against H0 so strong that it would occur no more than 5% of the time when H0 is true.
If we choose α = 0.01, we are insisting on stronger evidence against H0, evidence so strong that it would occur only 1% of the time when H0 is true.
![Page 42: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/42.jpg)
Test Procedure for a Population Mean
The five steps in carrying out a significance test:
1. State the null and alternative hypotheses.
2. Set the level of significance, usually α = 0.05.
3. Take a sample from the population and compute the z test statistic.
4. Find the P-value from Table A.
5. Compare the P-value with α; reject the null hypothesis if P-value ≤ α.
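The five steps can be sketched as a single function. Steps 1 and 2 are supplied by the caller as the `alternative` and `alpha` arguments, and software stands in for Table A in step 4. The function name `z_test` and its argument names are assumptions made for this illustration.

```python
import math
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alternative, alpha=0.05):
    """Steps 3 to 5 of a one-sample z test with known sigma."""
    # Step 3: compute the z test statistic from the sample.
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Step 4: find the P-value (computed here instead of read from Table A).
    cdf = NormalDist().cdf
    if alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "less":
        p = cdf(z)
    elif alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # Step 5: compare the P-value with alpha.
    return z, p, p <= alpha


# The two case studies from this chapter, with the alpha levels used in their conclusions.
cola = z_test(xbar=1.02, mu0=0, sigma=1, n=10, alternative="greater", alpha=0.01)
jobs = z_test(xbar=17, mu0=0, sigma=60, n=18, alternative="two-sided", alpha=0.10)
for name, (z, p, reject) in (("cola", cola), ("job satisfaction", jobs)):
    print(f"{name}: z = {z:.2f}, P-value = {p:.4f}, reject H0: {reject}")
```

Computed P-values can differ from table-based ones in the last decimal place, because the table rounds z and the tail area before they are combined.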
![Page 43: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/43.jpg)
Case Study I: Sweetening Colas
1. Hypotheses: H0: μ = 0, Ha: μ > 0
2. Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{1.02 - 0}{1 / \sqrt{10}} = 3.23$$
3. P-value: P-value = P(Z > 3.23) = 1 – 0.9994 = 0.0006
4. Conclusion: Since the P-value is smaller than α = 0.01, there is very strong evidence that the new cola loses sweetness on average during storage at room temperature.
![Page 44: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/44.jpg)
Case Study II: Studying Job Satisfaction
1. Hypotheses: H0: μ = 0, Ha: μ ≠ 0
2. Test statistic:
$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{17 - 0}{60 / \sqrt{18}} = 1.20$$
3. P-value: P-value = 2P(Z > 1.20) = (2)(1 – 0.8849) = 0.2302
4. Conclusion: Since the P-value is larger than α = 0.10, there is not sufficient evidence that mean job satisfaction of assembly workers differs when their work is machine-paced rather than self-paced.
![Page 45: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/45.jpg)
Case Study II: Studying Job Satisfaction
A 90% confidence interval for μ is:
$$\bar{x} \pm z^* \frac{\sigma}{\sqrt{n}} = 17 \pm 1.645 \frac{60}{\sqrt{18}} = 17 \pm 23.26 = -6.26 \text{ to } 40.26$$
Since μ0 = 0 is in this confidence interval, it is plausible that the true value of μ is 0; thus, there is not sufficient evidence (at α = 0.10) that the mean job satisfaction of assembly workers differs when their work is machine-paced rather than self-paced.
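The agreement between the 90% confidence interval and the two-sided test at α = 0.10 can be checked numerically. This is a sketch using the job satisfaction numbers from the slides; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Job satisfaction case study values from the slides.
xbar, sigma, n, mu0, alpha = 17, 60, 18, 0, 0.10

se = sigma / math.sqrt(n)                      # standard deviation of the sample mean
z_star = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for a 90% interval
low, high = xbar - z_star * se, xbar + z_star * se

z = (xbar - mu0) / se                          # test statistic for H0: mu = mu0
p = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided P-value

inside = low <= mu0 <= high
print(f"90% interval: {low:.2f} to {high:.2f}")
print(f"mu0 inside interval: {inside}; P-value > alpha: {p > alpha}")
```

For a two-sided z test, the hypothesized value falls inside the level 1 – α interval exactly when the P-value exceeds α, so the two checks always agree.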
![Page 46: Chapter 13](https://reader035.fdocuments.us/reader035/viewer/2022070502/56814948550346895db694e8/html5/thumbnails/46.jpg)
Interesting Videos
http://www.youtube.com/watch?v=l9ueYYpYU_s <Tailed test>
http://www.youtube.com/watch?v=BX9iMIC6mcg <Five Steps>
http://www.youtube.com/watch?v=cW16A7hXbTo <P-Value>