Estimates and sample sizes Chapter 6 Prof. Felix Apfaltrer [email protected] Office:N763...
Date posted: 21-Dec-2015
Estimates and sample sizes Chapter 6
Prof. Felix Apfaltrer
Office:N763
Phone: 212-220 74 21
Office hours:
Tue, Thu 10am-11:30 am
2
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
– Estimation
– Hypothesis Testing
Inferential Statistics
3
• 1. Involves:
– Estimation
– Hypothesis Testing
• 2. Purpose:
– Make decisions about population characteristics
Population?
Inferential Statistics
4
[Figure labels: Population; Sample; Sample statistic ($\bar{x}$); Estimates & tests]
Inference Process
5
Estimation
• Point Estimate
• Confidence Interval
Point Estimates
6
Estimate population parameter...    with sample statistic
Mean           $\mu$                $\bar{x}$
Proportion     $p$                  $\hat{p}$
Variance       $\sigma^2$           $s^2$
Differences    $\mu_1 - \mu_2$      $\bar{x}_1 - \bar{x}_2$
Point Estimates 2
7
Samples and estimation
Assumptions:
• Samples should be simple random samples
• Binomial distribution requirements satisfied
• The normal distribution can be used to approximate the binomial distribution (np ≥ 5 and nq ≥ 5).
Example (survey about photo-cop):
• 829 Minnesotans surveyed
• 51% opposed to cameras used for issuing traffic tickets
• Point estimate $\hat{p} = 0.51$
Notation:
$p$ = population proportion
$\hat{p} = x/n$ = sample proportion of x successes in a sample of size n
$\hat{q} = 1 - \hat{p}$ = sample proportion of failures in a sample of size n
Definition:
A point estimate is a single value used to estimate a population parameter.
8
A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the point estimate of the population mean of all body temperatures.
Because the sample mean $\bar{x}$ is the best point estimate of the population mean $\mu$, we conclude that the best point estimate of the population mean of all body temperatures is 98.20°F.
Example Point Estimate
9
Estimation
• Point Estimate
• Confidence Interval
Confidence Intervals
10
[Figure: a confidence interval, marking the lower confidence limit, the sample statistic (point estimate), and the upper confidence limit]
Definition: A confidence interval is a range (or an interval) of values used to estimate the true value of the population parameter. The confidence level gives us the success rate of the procedure used to construct the confidence interval.
Confidence Interval Definition
11
Confidence intervals (cont)
Definition:
A confidence interval (CI) is a range of values used to estimate the true value of a population parameter.
Example (survey about photo-cop):
• 829 Minnesotans surveyed
• 51% opposed to cameras used for issuing traffic tickets
• The 0.95 confidence interval estimate of the population proportion p against photo-cop is 0.476 < p < 0.544
Confidence level associated with a confidence interval: the success rate of the procedure, that is, the proportion of intervals that contain the population parameter when the procedure is repeated many times
• given as probability $1 - \alpha$
• For example: confidence level of 0.95, $\alpha = 0.05$; confidence level of 0.99, $\alpha = 0.01$
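The photo-cop interval can be reproduced with a short sketch. The slides state only the resulting interval, so the margin-of-error formula used here, $E = z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, is the usual normal-approximation formula and is an assumption about how the interval was obtained. The function name `proportion_confidence_interval` is illustrative, not from the slides.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Return (lower, upper) limits for a population proportion.

    Assumes the normal approximation to the binomial is reasonable.
    """
    alpha = 1 - confidence
    # Critical value z_{alpha/2}: area alpha/2 lies in the right tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Margin of error E = z * sqrt(p_hat * q_hat / n), with q_hat = 1 - p_hat.
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(0.51, 829)
print(f"{lower:.3f} < p < {upper:.3f}")  # 0.476 < p < 0.544
```

Raising the confidence level to 0.99 widens the interval, because a larger critical value is needed to cover more of the center area.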
12
Confidence intervals (cont)
Definition: The critical values are the numbers $-z_{\alpha/2}$ and $z_{\alpha/2}$ that separate the areas $\alpha/2$ in the left and right tails from the center area $1 - \alpha$.
Example (survey about photo-cop):
• For a confidence level of 0.95, $\alpha = 0.05$, so $\alpha/2 = 0.025$
• In Table A-2, $-z_{\alpha/2} = -1.96$ and $z_{\alpha/2} = 1.96$
short visit from Navy.
• Calculate P(X>=308 days).
• What does this suggest?
• Premature if below 4%. Find length.
Q11: SAT scores, $\mu = 998$, $\sigma = 202$
College requires 1100 minimum.
• Find percentage satisfying requirement.
• Find the 40th percentile. Why does the college not ask for the top 40%?
[Figure: standard normal curve with center area $1 - \alpha$ and area $\alpha/2$ in each tail]
Confidence level    $\alpha$    Critical value $z_{\alpha/2}$
90%                 0.10        1.645
95%                 0.05        1.96
99%                 0.01        2.575
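As a rough check on the table, the critical values can be computed instead of read from Table A-2. This sketch uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    # The critical value leaves area alpha/2 in the right tail,
    # so it is the (1 - alpha/2) quantile of the standard normal.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  alpha = {1 - level:.2f}  z = {z_critical(level):.3f}")
```

The computed 99% value rounds to 2.576, while the table lists 2.575; the small difference comes from reading the value off a printed table.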
13
The confidence level is often expressed as the probability $1 - \alpha$, where $\alpha$ is the complement of the confidence level. For a 0.95 (95%) confidence level, $\alpha = 0.05$. For a 0.99 (99%) confidence level, $\alpha = 0.01$.
Level of Confidence
The margin of error is the maximum likely difference observed between the sample mean $\bar{x}$ and the population mean $\mu$, and is denoted by E.
$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
14
Confidence Interval (or Interval Estimate) for Population Mean $\mu$ when $\sigma$ is known
$\bar{x} - E < \mu < \bar{x} + E$
$(\bar{x} - E, \bar{x} + E)$
$\bar{x} \pm E$
15
Procedure for Constructing a Confidence Interval for $\mu$ when $\sigma$ is known
1. Verify that the required assumptions are met.
2. Find the critical value $z_{\alpha/2}$ that corresponds to the desired degree of confidence.
3. Evaluate the margin of error $E = z_{\alpha/2} \cdot \sigma / \sqrt{n}$.
4. Find the values of $\bar{x} - E$ and $\bar{x} + E$. Substitute those values in the general format of the confidence interval:
$\bar{x} - E < \mu < \bar{x} + E$
16
Round-Off Rule for Confidence Intervals Used to Estimate µ
1. When using the original set of data, round the confidence interval limits to one more decimal place than used in original set of data.
2. When the original set of data is unknown and only the summary statistics (n, $\bar{x}$, s) are used, round the confidence interval limits to the same number of decimal places used for the sample mean.
17
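Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error E and the 95% confidence interval for $\mu$.
n = 106
$\bar{x} = 98.20^\circ$
$s = 0.62^\circ$
$\alpha = 0.05$, $\alpha/2 = 0.025$
$z_{\alpha/2} = 1.96$
$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.62}{\sqrt{106}} = 0.12$ (with n > 30, s is used in place of $\sigma$)
$\bar{x} - E < \mu < \bar{x} + E$
$98.20^\circ - 0.12 < \mu < 98.20^\circ + 0.12$
$98.08^\circ < \mu < 98.32^\circ$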
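The body temperature calculation can be sketched in a few lines. The function name `mean_confidence_interval_z` is hypothetical, and the critical value 1.96 is passed in as the Table A-2 value for 95% confidence rather than computed.

```python
import math


def mean_confidence_interval_z(x_bar, sigma, n, z_crit=1.96):
    """Return (E, lower, upper) for the population mean using a z critical value."""
    # Margin of error E = z_{alpha/2} * sigma / sqrt(n).
    margin = z_crit * sigma / math.sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Body temperature example: the sample s = 0.62 stands in for sigma.
margin, lower, upper = mean_confidence_interval_z(98.20, 0.62, 106)
print(f"E = {margin:.2f}")                      # E = 0.12
print(f"{lower:.2f} < mu < {upper:.2f}")        # 98.08 < mu < 98.32
```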
18
$n = \left[ \frac{z_{\alpha/2} \, \sigma}{E} \right]^2$
Sample Size for Estimating Mean
When finding the sample size n, if the Formula above does not result in a whole number, always increase the value of n to the next larger whole number.
If $\sigma$ is unknown: a) use range/4 as an estimate of $\sigma$, or b) calculate the sample standard deviation s and use it in place of $\sigma$, or c) estimate the value of $\sigma$ by using the results of some other study that was done earlier.
19
Example: Assume that we want to estimate the mean IQ score for the population of statistics professors. How many statistics professors must be randomly selected for IQ tests if we want 95% confidence that the sample mean is within 2 IQ points of the population mean? Assume that $\sigma = 15$, as is found in the general population.
$\alpha = 0.05$, $\alpha/2 = 0.025$
$z_{\alpha/2} = 1.96$
$E = 2$
$\sigma = 15$
$n = \left[ \frac{1.96 \cdot 15}{2} \right]^2 = 216.09$, rounded up to 217
With a simple random sample of only 217 statistics professors, we will be 95% confident that the sample mean will be within 2 points of the true population mean $\mu$.
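The sample-size calculation, including the round-up rule, can be sketched as follows. The function name `sample_size_for_mean` is illustrative, and the critical value 1.96 is taken from the table rather than computed.

```python
import math


def sample_size_for_mean(sigma, margin, z_crit=1.96):
    """Return the smallest whole n with margin of error at most `margin`."""
    raw = (z_crit * sigma / margin) ** 2
    # Always round up: rounding down would give a margin larger than requested.
    return math.ceil(raw)


# IQ example: sigma = 15, desired margin of error E = 2, 95% confidence.
print(sample_size_for_mean(sigma=15, margin=2))  # 217
```

Halving the margin of error roughly quadruples the required sample size, since E appears squared in the denominator.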
20
[Figure: sampling distribution of $\bar{x}$ centered at $\mu$, marking $\mu \pm 1.65\sigma_{\bar{x}}$ (90% of samples), $\mu \pm 1.96\sigma_{\bar{x}}$ (95% of samples), and $\mu \pm 2.58\sigma_{\bar{x}}$ (99% of samples); $\bar{x} = \mu \pm Z\sigma_{\bar{x}}$]
Many Samples Have Same Interval
21
$\sigma$ Not Known: Assumptions
1) The sample is a simple random sample.
2) Either the sample is from a normally distributed population, or n > 30.
Use the Student t distribution.
If the distribution of a population is essentially normal, then the distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
is essentially a Student t distribution for all samples of size n, and is used to find critical values denoted by $t_{\alpha/2}$.
Degrees of freedom (df) corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values.
df = n - 1 in this section.
23
Confidence Interval for the Estimate of $\mu$ Based on an Unknown $\sigma$ and a Small Simple Random Sample from a Normally Distributed Population
$\bar{x} - E < \mu < \bar{x} + E$
where $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
$t_{\alpha/2}$ found in Table A-3
E is the margin of error and $t_{\alpha/2}$ has n - 1 degrees of freedom.
24
Example: A study found the body temperatures of 106 healthy adults. The sample mean was 98.2 degrees and the sample standard deviation was 0.62 degrees. Find the margin of error E and the 95% confidence interval for $\mu$.
n = 106
$\bar{x} = 98.20^\circ$
$s = 0.62^\circ$
$\alpha = 0.05$, $\alpha/2 = 0.025$
$t_{\alpha/2} = 1.984$
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = 1.984 \cdot \frac{0.62}{\sqrt{106}} = 0.1195$
$\bar{x} - E < \mu < \bar{x} + E$
$98.20^\circ - 0.1195 < \mu < 98.20^\circ + 0.1195$
$98.08^\circ < \mu < 98.32^\circ$
Based on the sample provided, the confidence interval for the population mean is $98.08^\circ < \mu < 98.32^\circ$. After rounding, the interval is the same here as in the earlier example that used z, but in other cases the difference can be much greater.
25
Important Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12).
2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has $\sigma = 1$).
5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution.
26
Using the z Normal and t Distribution
27
Example: Data Set 14 in Appendix B includes the Flesch ease of reading scores for 12 different pages randomly selected from J.K. Rowling's Harry Potter and the Sorcerer's Stone. Find the 95% interval estimate of $\mu$, the mean Flesch ease of reading score. (The 12 pages' distribution appears to be bell-shaped.)
n = 12
$\bar{x} = 80.75$
$s = 4.68$
$\alpha = 0.05$, $\alpha/2 = 0.025$
$t_{\alpha/2} = 2.201$
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = \frac{(2.201)(4.68)}{\sqrt{12}} = 2.97355$
$\bar{x} - E < \mu < \bar{x} + E$
$80.75 - 2.97355 < \mu < 80.75 + 2.97355$
$77.77645 < \mu < 83.72355$
$77.78 < \mu < 83.72$
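The same arithmetic can be sketched in code. Python's standard library has no Student t quantile function, so this sketch assumes the critical value is looked up in Table A-3 and passed in by hand; the function name `mean_confidence_interval_t` is hypothetical.

```python
import math


def mean_confidence_interval_t(x_bar, s, n, t_crit):
    """Return (E, lower, upper) for the mean when sigma is unknown.

    t_crit is t_{alpha/2} with n - 1 degrees of freedom, read from a t table.
    """
    # Margin of error E = t_{alpha/2} * s / sqrt(n).
    margin = t_crit * s / math.sqrt(n)
    return margin, x_bar - margin, x_bar + margin


# Flesch reading-ease example: n = 12, so df = 11 and t_{0.025} = 2.201.
margin, lower, upper = mean_confidence_interval_t(80.75, 4.68, 12, t_crit=2.201)
print(f"E = {margin:.5f}")
print(f"{lower:.2f} < mu < {upper:.2f}")  # 77.78 < mu < 83.72
```

Passing the body temperature values with `t_crit=1.984` gives a margin of about 0.1195, matching the earlier t-based example.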
28
Finding the Point Estimate and E from a
Confidence Interval
Margin of Error:
$E = \frac{(\text{upper confidence limit}) - (\text{lower confidence limit})}{2}$
Point estimate of $\mu$:
$\bar{x} = \frac{(\text{upper confidence limit}) + (\text{lower confidence limit})}{2}$
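A minimal sketch of recovering the point estimate and margin of error from a reported interval, applied to the body temperature limits from the earlier example. The function name `point_estimate_and_margin` is illustrative.

```python
def point_estimate_and_margin(lower, upper):
    """Recover (x_bar, E) from the limits of a confidence interval for the mean."""
    x_bar = (upper + lower) / 2   # midpoint of the interval
    margin = (upper - lower) / 2  # half the width of the interval
    return x_bar, margin


x_bar, margin = point_estimate_and_margin(98.08, 98.32)
print(f"x_bar = {x_bar:.2f}, E = {margin:.2f}")  # x_bar = 98.20, E = 0.12
```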