lecture7n.pptx
hoang-nguyen -
Transcript of lecture7n.pptx
27 September, 2011 STAT 101 -- Part VI 2
Highlights of the last lecture (cont’d)
[Flow chart: "Assumption: Population is normal" (yes/no), then "n is large" (yes/no); remaining options: resampling, nonparametric methods]
27 September, 2011 STAT 101 -- Part VII 6
VII. Confidence Intervals
Point and interval estimation of
• Means of normal and non-normal distributions
• Proportion parameter of the binomial distribution
Determining sample size
• For the mean
• For the proportion
Confidence Intervals for the population mean
Assumptions:
• Population is normally distributed
• Standard deviation of population is given
Example: protein intake
• Find the 95% confidence interval for the average daily protein intake of men aged 20-25.
• The population standard deviation is 58.6 grams.
• A random sample of 267 men aged 20-25 is observed.
• The margin of error is $1.96 \times 58.6/\sqrt{267} \approx 7.029$ grams.
• Before collecting the data from the 267 men, we can say that there is a 95% chance that the random interval $(\bar{X} - 7.029,\ \bar{X} + 7.029)$ will include $\mu$.
• Note that the sample mean is still a random variable before any data are collected, so we are still talking about probability.
• After collecting the daily protein intake of these 267 men, the sample mean is calculated to be 72.1 grams.
• Having obtained the numerical result from sampling, we cannot say that the population mean falls between 65.071 g and 79.129 g with 95% chance.
• The correct way to present the result is: the 95% confidence interval for the average daily protein intake of men aged 20-25 is (65.071 g, 79.129 g).
• Having determined a numerical result from one specific sample, it is no longer sensible to speak about the probability of its covering the fixed quantity $\mu$.
• If many repeated samples of the same size were taken from the same population and a confidence interval were constructed from each, the proportion of intervals containing $\mu$ would be approximately 0.95.
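The numbers in the protein-intake example can be checked with a short calculation. This is a minimal sketch, assuming the usual known-sigma interval (sample mean plus or minus z times sigma over root n); the function name `z_confidence_interval` is made up for illustration and is not from the slides.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Interval for the mean of a normal population with known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for 95%
    margin = z * sigma / sqrt(n)             # margin of error
    return sample_mean - margin, sample_mean + margin, margin

# Values from the example: mean 72.1 g, sigma 58.6 g, n = 267
lower, upper, margin = z_confidence_interval(72.1, 58.6, 267)
print(f"margin of error = {margin:.3f} g")              # 7.029 g
print(f"95% CI = ({lower:.3f} g, {upper:.3f} g)")       # (65.071 g, 79.129 g)
```

The printed interval matches the one quoted in the example.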
Interpretation of confidence intervals
[Figure: confidence intervals plotted against the true population mean; labels: "Values below true mean", "True population mean", "Values above true mean"]
http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
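The repeated-sampling interpretation can also be tried offline with a small simulation. The sketch below is only an illustration: the true mean `mu=70.0`, the seed, and the number of repeats are arbitrary assumptions, and the function name `coverage` is hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, confidence=0.95, repeats=2000, seed=1):
    """Fraction of known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(repeats):
        # Draw one sample of size n from a normal population and average it
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / repeats

# mu is an assumed "true" mean chosen only for the simulation
rate = coverage(mu=70.0, sigma=58.6, n=267)
print(f"proportion of intervals containing mu: {rate:.3f}")
```

Each simulated interval either contains the fixed mean or it does not; only the long-run proportion is close to 0.95.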
Confidence intervals for the mean with unknown population variance
• The only assumption is that the population distribution is normal.
• The population standard deviation is unknown.
• It is reasonable to estimate the population standard deviation by the sample standard deviation.
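When the sample standard deviation s replaces the unknown population value, the standard interval is the sample mean plus or minus a t critical value (n - 1 degrees of freedom) times s over root n. The sketch below assumes the critical value is read from a t table and passed in; the data are hypothetical and the function name `t_confidence_interval` is not from the slides.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(sample, t_crit):
    """Interval for the mean of a normal population with unknown sigma.

    t_crit is the table value t_{alpha/2, n-1} supplied by the caller.
    """
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)                 # sample standard deviation (divisor n - 1)
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical data with n = 10, so df = 9; t_{0.025, 9} = 2.262 from a t table
sample = [68.2, 75.1, 71.4, 80.3, 66.9, 73.8, 77.5, 70.0, 72.6, 74.2]
lower, upper = t_confidence_interval(sample, t_crit=2.262)
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```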
Large sample size cases
• No assumption is made about normality of the population distribution or about the population variance.
• If the sample size is sufficiently large, the Central Limit Theorem may be applied to guarantee that the sample mean is approximately normally distributed.
Flow chart for determining the distribution
Is the population distribution normal?
  yes: Is the population standard deviation given?
    yes: Normal tables
    no: t-distribution tables (Normal tables for a large sample size, > 120)
  no: Is the sample size sufficiently large (n >= 30) that the CLT applies?
    yes: apply the CLT
    no: Use other methods
Factors affecting the length of a confidence interval
• At a given confidence level, the shorter the confidence interval, the more precise the estimation.
• Consider the confidence interval for the population mean: $\bar{X} \pm t_{\alpha/2,\,n-1}\, S/\sqrt{n}$
• The length of the confidence interval is then $2\, t_{\alpha/2,\,n-1}\, S/\sqrt{n}$
• The length depends on S, n and $\alpha$:
  n: as n increases, the length decreases
  $\alpha$: as $\alpha$ increases (the confidence level decreases), the length decreases
  S: as S increases, the length increases
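The three effects can be seen numerically. This is a rough sketch that, for simplicity, uses the normal critical value in place of the t value (an assumption that changes the numbers slightly but not the direction of each effect); all inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_length(s, n, confidence):
    """Length of the interval: 2 * critical value * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z used instead of t
    return 2 * z * s / sqrt(n)

base = ci_length(s=10, n=25, confidence=0.95)
larger_n = ci_length(s=10, n=100, confidence=0.95)    # four times the sample size
lower_conf = ci_length(s=10, n=25, confidence=0.90)   # alpha rises from 0.05 to 0.10
larger_s = ci_length(s=20, n=25, confidence=0.95)     # twice the standard deviation

print(f"base       : {base:.2f}")
print(f"larger n   : {larger_n:.2f}")
print(f"lower conf.: {lower_conf:.2f}")
print(f"larger S   : {larger_s:.2f}")
```

Quadrupling n halves the length, because n enters through its square root.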
Determining sample size for the mean
• The required sample size can be found to reach a desired margin of error with a specified level of confidence.
• The margin of error is also called the sampling error.
• The margin of error can be interpreted as
  • the amount of imprecision in the estimate of the population parameter
  • the amount added to and subtracted from the point estimate to form the confidence interval
Numerical example
• A consumer group wants to estimate the mean electric bill for the month of July for single-family homes in a large city.
• Based on studies conducted in other cities, the standard deviation is assumed to be $25.
• The group wants to estimate the mean bill for July to within ± $5 with 99% confidence.
• What sample size is needed?
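One way to work this example is to solve the margin-of-error expression for n, giving n = (z * sigma / e)^2 rounded up. The sketch below assumes that standard formula and takes the critical value z as an input, since the answer depends on how the table value is rounded; the function name is hypothetical.

```python
from math import ceil

def sample_size_for_mean(sigma, margin, z):
    """Smallest n with z * sigma / sqrt(n) <= margin."""
    return ceil((z * sigma / margin) ** 2)

# sigma = 25 and margin = 5 come from the example; z is the 99% critical value
n_needed = sample_size_for_mean(sigma=25, margin=5, z=2.576)
n_rounded_z = sample_size_for_mean(sigma=25, margin=5, z=2.58)
print(n_needed)     # 166
print(n_rounded_z)  # 167
```

With z = 2.576 the result is 166 homes; a table that rounds the critical value to 2.58 gives 167.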
Estimation for the binomial distribution
Recall the common structure of the binomial distribution:
• A sample of n independent trials
• Each trial can have only two possible outcomes, which are denoted as ‘success’ and ‘failure’
• The probability of a success at each trial is assumed to be a constant p
• The parameters of the binomial distribution are n and p
Now, assume that p is unknown and we want to use the sample proportion to estimate p
Sampling distribution of sample proportion
[Figure: repeated samples of size n (1st, 2nd, 3rd, ..., kth) drawn from the population]
Sampling distribution of sample proportion
• In a previous section, we discussed the normal approximation to the binomial distribution.
• In fact, the normal approximation can be justified on the basis of the Central Limit Theorem, since the sample proportion is just a sample mean.
• The textbook uses a rule of thumb for applying the CLT. By the CLT, the sample proportion $\hat{p}$ is approximately normally distributed with mean $p$ and variance $p(1-p)/n$.
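The claim that the sample proportion behaves like a sample mean can be checked by simulation. The following is a sketch only: the values p = 0.3 and n = 200, the seed, and the number of repeats are arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

def simulate_sample_proportions(p, n, repeats=2000, seed=7):
    """Draw `repeats` samples of n Bernoulli(p) trials; return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(repeats)]

p, n = 0.3, 200  # hypothetical population proportion and sample size
p_hats = simulate_sample_proportions(p, n)

simulated_mean = fmean(p_hats)
simulated_sd = pstdev(p_hats)
theoretical_sd = sqrt(p * (1 - p) / n)  # standard deviation suggested by the CLT

print(f"mean of sample proportions: {simulated_mean:.4f} (p = {p})")
print(f"sd of sample proportions  : {simulated_sd:.4f} (theory {theoretical_sd:.4f})")
```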
Example
• During June and July of 2001, the European Union Executive Commission conducted a study of 6,543 European adults. Of those surveyed, 56% said that the euro single currency would promote economic growth and 73% knew the correct date of the changeover (January 1, 2002).
• Construct a 95% confidence interval estimate for the proportion of European adults who believe that the euro would promote economic growth.
• Interpret the interval constructed.
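A sketch of the calculation for this example, assuming the usual large-sample interval for a proportion (sample proportion plus or minus z times the square root of p-hat(1 - p-hat)/n). The function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Large-sample (normal approximation) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 56% of the 6,543 adults surveyed said the euro would promote economic growth
lower, upper = proportion_confidence_interval(0.56, 6543)
print(f"95% CI = ({lower:.4f}, {upper:.4f})")  # (0.5480, 0.5720)
```

Following the wording used earlier for the mean, the result would be reported as: the 95% confidence interval for the proportion of European adults who believe the euro would promote economic growth is about (0.548, 0.572).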
Requirements of determining sample size for the proportion
Numerical example:
• A study of 658 CEOs conducted by the Conference Board reported that 250 stated that their company’s greatest concern was sustained and steady top-line growth (“CEOs’ Greatest Concerns,” USA Today Snapshots, May 8, 2006, P1D).
• To conduct a follow-up study to estimate the population proportion of CEOs whose greatest concern was sustained and steady top-line growth to within ±0.01 with 95% confidence, how many CEOs would you survey?
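A possible way to answer the question is to solve the margin-of-error expression for n, giving n = z^2 p(1 - p) / e^2 rounded up, with the earlier study's proportion 250/658 used as the planning value for p. This is a sketch under that assumption; the function name is hypothetical and the table value z = 1.96 is passed in explicitly.

```python
from math import ceil

def sample_size_for_proportion(p, margin, z):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= margin."""
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)

p_prior = 250 / 658  # proportion observed in the earlier study of 658 CEOs
n_needed = sample_size_for_proportion(p_prior, margin=0.01, z=1.96)
print(n_needed)  # 9051
```

The required sample is far larger than the original 658 because the target margin of ±0.01 is very tight.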
Useful and interesting websites
Confidence Intervals simulations: http://www.socr.ucla.edu/Applets.dir/ConfidenceInterval.html
Confidence Intervals information: http://en.wikipedia.org/wiki/Confidence_interval
Assignment 2: Box-and-Whisker Plot
[Figure: box-and-whisker plots for groups G1 and G2; axis from 10 to 40]

Stem-and-Leaf Display
Stem unit: 1

Statistics
Sample Size      94
Mean             37.08511
Median           40
Std. Deviation   4.760183
Minimum          12
Maximum          40

12 | 0
13 |
14 |
15 |
16 |
17 |
18 |
19 |
20 |
21 |
22 |
23 | 0
24 |
25 |
26 | 0 0
27 | 0
28 | 0 0
29 |
30 | 0
31 | 0 0 0
32 | 0 0
33 | 0 0
34 | 0 0 0 0
35 | 0 0 0 0 0
36 | 0 0 0 0
37 | 0 0 0 0 0 0 0
38 | 0 0 0 0
39 | 0 0 0
40 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0