Transcript of Interval Estimates
Slide 1
Interval Estimates
A point estimate gives a plausible single-number estimate for a parameter.
We may also be interested in a range of plausible values for a parameter, or even for future measurements. This is called an interval estimate.
Confidence intervals
Prediction intervals
Tolerance intervals
Credibility intervals (Bayesian. No time, unfortunately.)
Slide 2
Confidence Intervals
$\theta$ is a parameter we are interested in. Usually:
a mean
a variance/standard deviation
a proportion
Consider an experiment that will collect n data points. Then, BEFORE we collect the data, we can devise a procedure such that:
$P(\hat{\theta}_{lo} \le \theta \le \hat{\theta}_{hi}) = 1 - \alpha$
Slide 3
Confidence Intervals
The PROCEDURE produces intervals $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$ that cover $\theta$ with probability $1-\alpha$.
The interval is random. We don't know the upper estimate or lower estimate yet.
Since it is random and is the result of a repeatable experiment, it can be associated with a probability under the frequentist definition.
Slide 4
Confidence Intervals
BUT, in order to get a numerical interval, we perform the experiment and plug in the data.
Once the sample of data is measured, the interval is no longer random.
We cannot say that $\theta$ is in the specific, realized interval with probability $1-\alpha$.
$\theta$ is either in the interval or it is not; we just don't know which.
We can't do repeatable experiments to compute the probability that $\theta$ is in the specific interval we realized with the sample we measured.
So let's make up a new word: confidence.
Slide 5
Confidence Intervals
Any realized interval $[\hat{\theta}_{lo}, \hat{\theta}_{hi}]$ contains $\theta$ with $1-\alpha$ confidence.
We can be $1-\alpha$ confident that the realized interval covers $\theta$.
The width of the confidence interval varies with sample size and level of confidence:
Confidence increases, width increases.
Sample size increases, width decreases.
Slide 6
Confidence Intervals
Caution: IT IS NOT CORRECT to say that there is a $(1-\alpha)100\%$ probability that the true value of a parameter is between the bounds of any given CI.
[Figure: graphical representation of 90% CIs for a parameter. Take a sample. Compute a CI. A line marks the true value of the parameter; here 90% of the CIs contain the true value.]
Slide 7
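Confidence Intervals
Construction of a large sample two-sided CI for a mean depends on:
Sample size: n
Standard error for means: $s/\sqrt{n}$
Desired level of confidence $1-\alpha$, where $\alpha$ is the significance level
Use $\alpha/2$ to compute the N(0,1) z-value $z_{\alpha/2}$ (the upper $\alpha/2$ critical value, 1.96 for 95%)
The $(1-\alpha)100\%$ CI for the population mean, using a sample average $\bar{x}$ and standard error $s/\sqrt{n}$, is:
$\left[\bar{x} - z_{\alpha/2}\, s/\sqrt{n},\ \bar{x} + z_{\alpha/2}\, s/\sqrt{n}\right]$
Assumes $\sigma = s$.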
Slide 8
Confidence Intervals
Compute a 95% confidence interval for the mean using this sample set:
Fragment #   n_D
1            1.52005
2            1.52003
3            1.52001
4            1.52004
5            1.52000
6            1.52001
7            1.52008
8            1.52011
9            1.52008
10           1.52008
11           1.52008
Putting this together, with sample average 1.52005, s = 0.000037 and standard error s/sqrt(11) = 0.000011:
[1.52005 - (1.96)(0.000011), 1.52005 + (1.96)(0.000011)]
95% CI for the mean = [1.52003, 1.52007]
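The arithmetic for this example can be checked with a short script. This is a minimal sketch, assuming the large-sample z-interval with the sample standard deviation standing in for sigma; the function name `z_interval_for_mean` is made up for illustration, and only the Python standard library is used.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Refractive index values for the 11 fragments listed on the slide.
n_d = [1.52005, 1.52003, 1.52001, 1.52004, 1.52000, 1.52001,
       1.52008, 1.52011, 1.52008, 1.52008, 1.52008]

def z_interval_for_mean(data, confidence=0.95):
    """Large-sample two-sided CI for a mean, treating sigma as equal to s."""
    n = len(data)
    x_bar = mean(data)
    std_err = stdev(data) / sqrt(n)              # s / sqrt(n)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 critical value
    return x_bar - z * std_err, x_bar + z * std_err

lower, upper = z_interval_for_mean(n_d)
print(f"95% CI for the mean: [{lower:.6f}, {upper:.6f}]")
```

Note that the half-width uses the standard error s/sqrt(n), not s itself; using s alone would give an interval roughly sqrt(11) times too wide.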
Slide 9
Confidence Intervals large sample two-sided CI for a mean
Slide 10
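Confidence Intervals
Construction of a small sample two-sided CI for a mean depends on:
Sample size: n
Standard error for means: $s/\sqrt{n}$
Desired level of confidence $1-\alpha$, where $\alpha$ is the significance level
Use $\alpha/2$ to compute the Student-t t-value $t_{\alpha/2,\,n-1}$ (upper $\alpha/2$ critical value, n-1 degrees of freedom)
The $(1-\alpha)100\%$ CI for the population mean, using a sample average $\bar{x}$ and standard error $s/\sqrt{n}$, is:
$\left[\bar{x} - t_{\alpha/2,\,n-1}\, s/\sqrt{n},\ \bar{x} + t_{\alpha/2,\,n-1}\, s/\sqrt{n}\right]$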
Slide 11
small sample two-sided CI for a mean
Slide 12
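Confidence Intervals
Construction of a large sample two-sided CI for a proportion p depends on:
Sample size: n
Standard error for proportion: $\sqrt{\hat{p}(1-\hat{p})/n}$
Desired level of confidence $1-\alpha$, where $\alpha$ is the significance level
Use $\alpha/2$ to compute the N(0,1) z-value $z_{\alpha/2}$ (the upper $\alpha/2$ critical value)
The $(1-\alpha)100\%$ CI for the proportion, using the sample proportion $\hat{p}$ and its standard error, is:
$\left[\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\right]$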
Slide 13
Mike Neel and Major Wells of the ATF carried out a comparison
of known matching (KM) striation pattern tool marks (AFTE J
39(3):176-198, 2007). Of the 914 KM comparisons they examined, it
was found that 19 had 3 4-line (4X) consecutive matching striation
(CMS) patches. Compute the large sample 95% confidence interval
around the proportion of KM pairs with 3 4X CMS patches.
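A minimal sketch of this calculation, assuming the usual large-sample (normal approximation) interval for a proportion with the 19-of-914 counts quoted above. The function name `z_interval_for_proportion` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval_for_proportion(successes, n, confidence=0.95):
    """Large-sample (normal approximation) two-sided CI for a proportion."""
    p_hat = successes / n
    std_err = sqrt(p_hat * (1.0 - p_hat) / n)    # sqrt(p(1-p)/n)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 critical value
    return p_hat - z * std_err, p_hat + z * std_err

# 19 of the 914 known-matching comparisons, as stated on the slide.
p_lower, p_upper = z_interval_for_proportion(19, 914)
print(f"p_hat = {19 / 914:.4f}, 95% CI = [{p_lower:.4f}, {p_upper:.4f}]")
```

With these counts the interval comes out at roughly 1.2% to 3.0%.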
Slide 14
Confidence Intervals
Construction of a small sample two-sided CI for a proportion:
For the desired level of confidence, the $(1-\alpha)100\%$ CI for the population proportion is given by a Lower Bound and an Upper Bound.
Slide 15
Confidence Intervals Compute the small sample 95% confidence
interval around the proportion of KM pairs with 3 4X CMS
patches.
Slide 16
Confidence Intervals
One-sided CIs: use $z_{\alpha}$ or $t_{\alpha}$ instead of $z_{\alpha/2}$ or $t_{\alpha/2}$, and only + or - depending on which bound is desired.
What is the lower one-sided small sample 95% confidence limit on the proportion of KM pairs with 3 4X CMS patches?
Slide 17
Prediction Intervals
What if we want to predict the value of a future data point for a given experiment?
What can be said about the values produced by a future experiment, based on data in hand? (Vardeman, Morris)
We want to predict at a specified level of confidence (ehhhh... probability).
We want an interval estimate for our prediction.
Slide 18
Prediction Intervals
When we know the distribution of the data generating process, along with the distribution's parameters, this is an easy question to answer.
The reliability of analytical balances in a narcotics lab is set by policy. A balance's lifetime is its reliable use time (in hr) before it needs major servicing.
A vendor states that its balances have a Weibull distributed lifetime with shape = 15 and scale = 5000.
What is the 90% prediction interval for the lifetime of the next balance shipped out of the factory?
Slide 19
Prediction Intervals
[Figure: distribution of balance lifetimes. Most of the units have a lifetime in the marked central region.]
Predict: 90% chance the next unit will have a lifetime between 4100 hr and 5400 hr.
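The two endpoints can be reproduced from the Weibull quantile function. This is a sketch under two assumptions: that the interval is equal-tailed (5% left in each tail), and that the vendor's shape and scale follow the common two-parameter form with CDF 1 - exp(-(x/scale)^shape). The function names are illustrative only.

```python
from math import log

def weibull_quantile(prob, shape, scale):
    """Inverse CDF of the two-parameter Weibull: scale * (-ln(1 - prob))**(1/shape)."""
    return scale * (-log(1.0 - prob)) ** (1.0 / shape)

def weibull_prediction_interval(shape, scale, coverage=0.90):
    """Equal-tailed prediction interval when the distribution is fully known."""
    tail = (1.0 - coverage) / 2.0
    return (weibull_quantile(tail, shape, scale),
            weibull_quantile(1.0 - tail, shape, scale))

life_lower, life_upper = weibull_prediction_interval(shape=15, scale=5000)
print(f"90% prediction interval: {life_lower:.0f} hr to {life_upper:.0f} hr")
```

The result, about 4102 hr to 5379 hr, agrees with the rounded 4100 hr and 5400 hr on the slide.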
Slide 20
Prediction Intervals
We pretty much never know the data's distribution and its parameters precisely. Usually we are stuck with just a sample of data.
What can we say about prediction intervals if all we have is a sample?
If the data's distribution is Gaussian (normal), the prediction interval for the (n+1)th data point, having observed n data points, is:
$\bar{x} \pm t_{\alpha/2,\,n-1}\; s\,\sqrt{1 + 1/n}$
for the $1-\alpha$ probability prediction.
Slide 21
Prediction Intervals 15 RI measurements from a pane of glass
were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882,
1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882,
1.51882, 1.51879
Assuming these RI measurements follow a normal distribution with unknown mean and variance, compute the 95% prediction interval for the 16th measurement.
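One way to work this exercise is sketched below, using the standard Gaussian prediction interval, sample mean plus or minus t times s times sqrt(1 + 1/n). The Python standard library has no Student-t quantile function, so the critical value for 14 degrees of freedom is hard-coded from a t-table; that constant and the function name are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

# The 15 RI measurements from the slide.
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877,
      1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

# Upper 2.5% critical value of Student-t with n - 1 = 14 degrees of freedom,
# taken from a standard t-table (hard-coded assumption, not computed here).
T_CRIT_975_DF14 = 2.144787

def gaussian_prediction_interval(data, t_crit):
    """Prediction interval for the next observation: x_bar +/- t * s * sqrt(1 + 1/n)."""
    n = len(data)
    x_bar = mean(data)
    half_width = t_crit * stdev(data) * sqrt(1.0 + 1.0 / n)
    return x_bar - half_width, x_bar + half_width

pi_lower, pi_upper = gaussian_prediction_interval(ri, T_CRIT_975_DF14)
print(f"95% PI for the 16th measurement: [{pi_lower:.6f}, {pi_upper:.6f}]")
```

The sqrt(1 + 1/n) factor is what separates this from a confidence interval for the mean: it adds the spread of a single new observation to the uncertainty in the sample mean.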
Slide 22
Prediction Intervals
Load some glass RI data from Prof. James Curran's (Auckland) dafs package.
[Figure: the 95% prediction interval for the 16th measurement after the first 15 RI measurements, with the actual 16th RI measurement marked. Did we make it in?]
Slide 23
Prediction Intervals
If the data's distribution is unknown (Vardeman, Morris), the prediction interval for the (n+1)th data point, having observed n data points $x_i$, is:
$\{\min(x_i),\ \max(x_i)\}$
Prediction probability: $(n-1)/(n+1)$
The prediction probability is dictated by the sample size.
Also called the non-parametric prediction interval.
Slide 24
Prediction Intervals 15 RI measurements from a pane of glass
were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882,
1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882,
1.51882, 1.51879
Compute the prediction interval for the 16th measurement, assuming an unknown distribution for the data. How does the width compare to the Gaussian prediction interval? What is the sample size required to have a prediction interval with a probability of at least 95%?
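These questions can be explored with a few lines of code. The sketch below assumes the standard result that the sample minimum and maximum of n independent draws from a continuous distribution contain the next draw with probability (n-1)/(n+1); the function names are hypothetical, and exact fractions are used to avoid rounding at the 95% boundary.

```python
from fractions import Fraction

# The 15 RI measurements from the slide.
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877,
      1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

def minmax_prediction_probability(n):
    """Probability that the next value falls in [min, max] of n observations."""
    return Fraction(n - 1, n + 1)

def smallest_sample_size(target):
    """Smallest n whose min/max interval reaches the target probability."""
    n = 2
    while minmax_prediction_probability(n) < target:
        n += 1
    return n

np_lower, np_upper = min(ri), max(ri)
coverage = minmax_prediction_probability(len(ri))
required_n = smallest_sample_size(Fraction(95, 100))

print(f"non-parametric PI: [{np_lower}, {np_upper}], width {np_upper - np_lower:.5f}")
print(f"prediction probability with n = {len(ri)}: {float(coverage):.3f}")
print(f"n needed for at least 95%: {required_n}")
```

With 15 points the min/max interval carries a probability of 87.5%, so it is both narrower and less certain than a 95% Gaussian interval built from the same data.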
Slide 25
Prediction Intervals
[Figure: the non-parametric prediction interval and its prediction probability, comparing the width of the 95% Gaussian PI with the width of the 87.5% non-parametric PI.]
Slide 26
Tolerance Intervals
What if we want to predict a large fraction of future data for a given experiment?
A prediction interval is good for only the next, not-yet-produced value.
We want to give an interval where 100p% of future values should fall with $1-\alpha$ confidence (ehhhh... probability, if they haven't been produced yet).
Slide 27
Tolerance Intervals
If the data's distribution is Gaussian (normal):
Two-sided tolerance interval for 100p% of the mass of future data values with $1-\alpha$ confidence (probability): $\bar{x} \pm k_2\, s$
The two-sided tolerance scale factor $k_2$, for $1-\alpha$ probability and p-fraction of the future values, is computed from a chi-squared quantile (qchisq) with its degrees of freedom (df).
Slide 28
Tolerance Intervals
If the data's distribution is Gaussian (normal):
One-sided tolerance interval for 100p% of the mass of future data values with $1-\alpha$ confidence (probability): $\bar{x} + k_1\, s$ (upper bound) or $\bar{x} - k_1\, s$ (lower bound)
The one-sided tolerance scale factor $k_1$, for $1-\alpha$ probability and p-fraction of the future values, is computed one way for large samples (n > 30) and another way for small samples. The small-sample form uses degrees of freedom (df) and a non-centrality parameter (ncp).
Slide 29
Tolerance Intervals
If the data's distribution is unknown (Vardeman, Morris):
Two-sided tolerance interval for 100p% of the mass of future data values: $\{\min(x_i),\ \max(x_i)\}$
Tolerance confidence (probability): $1 - p^{n} - n(1-p)\,p^{n-1}$
The probability depends on the proportion p and the sample size n.
Also called the non-parametric tolerance interval.
Slide 30
Tolerance Intervals 15 RI measurements from a pane of glass
were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882,
1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882,
1.51882, 1.51879
Based on this sample, with what probability can a tolerance interval be produced which covers 99% of future RI measurements, assuming the data's distribution is unknown?
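A short sketch of this calculation follows. It assumes the standard confidence formula for the min/max tolerance interval of n independent draws from a continuous distribution, 1 - p^n - n(1-p)p^(n-1); the function name is hypothetical.

```python
def minmax_tolerance_confidence(n, p):
    """Confidence that [min, max] of n observations covers at least a fraction p
    of the underlying distribution: 1 - p**n - n*(1 - p)*p**(n - 1)."""
    return 1.0 - p ** n - n * (1.0 - p) * p ** (n - 1)

# 15 RI measurements, 99% of future values to be covered.
tol_conf = minmax_tolerance_confidence(n=15, p=0.99)
print(f"confidence for 99% coverage with n = 15: {tol_conf:.4f}")
```

The confidence is just under 1%, which shows how little a 15-point sample can say about the outer 1% of an unknown distribution.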
Slide 31
Tolerance Intervals 15 RI measurements from a pane of glass
were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882,
1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882,
1.51882, 1.51879
Assuming the RI measurements follow a normal (Gaussian) distribution, compute the two-sided tolerance interval in which 83% of future measurements will lie with 95% probability.
Reference: Vardeman and Morris, notes for Industrial Engineering 361, Iowa State:
http://www.public.iastate.edu/~vardeman/IE361/Modules/Module%2018C.pdf
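A possible way to work this exercise by hand is sketched below. It assumes Howe's chi-squared approximation for the two-sided scale factor, k2 = sqrt((n-1)(1 + 1/n) z^2 / chi2), where z is the N(0,1) quantile at (1+p)/2 and chi2 is the lower-alpha chi-squared quantile with n-1 degrees of freedom. The slides may use a different form of the factor. The chi-squared quantile is hard-coded from a table because the Python standard library cannot compute it, and all names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# The 15 RI measurements from the slide.
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877,
      1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

# Lower 5% quantile of chi-squared with 14 degrees of freedom, from a standard
# table (hard-coded assumption; equivalent to qchisq(0.05, 14) in R).
CHI2_005_DF14 = 6.5706

def howe_k2(n, p, chi2_lower_alpha):
    """Howe's approximate two-sided tolerance factor (an assumed method)."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return sqrt((n - 1) * (1.0 + 1.0 / n) * z ** 2 / chi2_lower_alpha)

k2 = howe_k2(n=len(ri), p=0.83, chi2_lower_alpha=CHI2_005_DF14)
tol_lower = mean(ri) - k2 * stdev(ri)
tol_upper = mean(ri) + k2 * stdev(ri)
print(f"k2 = {k2:.3f}, tolerance interval = [{tol_lower:.6f}, {tol_upper:.6f}]")
```

Dedicated software may use an exact or differently approximated factor, so small differences from this hand calculation are to be expected.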
Slide 32
Tolerance Intervals
Make life a little easier and use the tolerance package to do the tolerance interval calcs.
Tolerance interval, assuming the data is Gaussian.