Interval Estimates A point estimate gives a plausible single number estimate for a parameter. We may...

Interval Estimates: A point estimate gives a single plausible value for a parameter. We may also be interested in a range of plausible values for a parameter, or even for future measurements. Such a range is called an interval estimate. Types: confidence intervals, prediction intervals, tolerance intervals, and credibility intervals (Bayesian; no time for these, unfortunately).

Confidence Intervals: $\theta$ is a parameter we are interested in. Usually it is a mean, a variance or standard deviation, or a proportion. Consider an experiment that will collect n data points. BEFORE we collect the data, we can devise a procedure such that:

Confidence Intervals: The PROCEDURE produces intervals $[\hat{\theta}_{\text{low}}, \hat{\theta}_{\text{high}}]$ that cover $\theta$ with probability $1-\alpha$. The interval is random: we don't know the lower or upper estimate yet. Because it is random and is the result of a repeatable experiment, it can be associated with a probability under the frequentist definition.

Confidence Intervals: BUT, to get a numerical interval we perform the experiment and plug in the data. Once the sample of data is measured, the interval is no longer random. We cannot say that $\theta$ is in the specific, realized interval with probability $1-\alpha$. $\theta$ is either in the interval or it is not; we just don't know which. We can't do repeatable experiments to compute the probability that $\theta$ is in the specific interval we realized with the sample we measured. So let's make up a new word: confidence.

Confidence Intervals: Any realized interval contains $\theta$ with $1-\alpha$ confidence. We can be $(1-\alpha)100\%$ confident that the realized interval covers $\theta$. The width of the confidence interval varies with sample size and level of confidence: as confidence increases, width increases; as sample size increases, width decreases.

Confidence Intervals: Caution: IT IS NOT CORRECT to say that there is a $(1-\alpha)100\%$ probability that the true value of a parameter is between the bounds of any given CI. [Figure: graphical representation of 90% CIs for a parameter. Take a sample, compute a CI, and repeat. Here 90% of the CIs contain the true value of the parameter.]
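The repeated-sampling idea behind the figure can be checked with a small simulation. The following is a minimal sketch, not the slide's own code: the population (normal, mean 10, standard deviation 2), the sample size, the number of repetitions, and the function name `coverage` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean, stdev

def coverage(true_mean=10.0, true_sd=2.0, n=50, reps=2000, conf=0.90, seed=1):
    """Fraction of simulated large-sample CIs that cover the true mean.

    All default values are illustrative assumptions, not from the slides.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # two-sided N(0,1) z-value
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = mean(sample)
        se = stdev(sample) / n ** 0.5  # standard error of the mean
        if xbar - z * se <= true_mean <= xbar + z * se:
            hits += 1
    return hits / reps

print(coverage())  # should land close to 0.90
```

Each individual interval either covers the true mean or it does not; the 90% describes the long-run behavior of the procedure, which is what the simulated fraction estimates.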

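Confidence Intervals: Construction of a large sample two-sided CI for a mean depends on: the sample size, $n$; the standard error for means, $s/\sqrt{n}$; and the desired level of confidence $1-\alpha$, where $\alpha$ is the significance level. Use $\alpha/2$ to compute the N(0,1) z-value, $z_{\alpha/2}$. The $(1-\alpha)100\%$ CI for the population mean $\mu$, using the sample average $\bar{x}$ and standard error, is $[\bar{x} - z_{\alpha/2}\, s/\sqrt{n},\ \bar{x} + z_{\alpha/2}\, s/\sqrt{n}]$. Assumes $\sigma = s$.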

Confidence Intervals: Compute a 95% confidence interval for the mean using this sample set:
Fragment #   Fragment nD
1            1.52005
2            1.52003
3            1.52001
4            1.52004
5            1.52000
6            1.52001
7            1.52008
8            1.52011
9            1.52008
10           1.52008
11           1.52008
Putting this together: [1.52005 - (1.96)(0.00001), 1.52005 + (1.96)(0.00001)], so the 95% CI for the mean is approximately [1.52003, 1.52007].

Confidence Intervals large sample two-sided CI for a mean
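The large sample calculation for the eleven fragment nD values can be reproduced with Python's standard library. This is a minimal sketch rather than the code shown on the original slide; the function name `z_ci_mean` is hypothetical.

```python
from statistics import NormalDist, mean, stdev

# The eleven fragment nD (refractive index) values from the example
nd = [1.52005, 1.52003, 1.52001, 1.52004, 1.52000, 1.52001,
      1.52008, 1.52011, 1.52008, 1.52008, 1.52008]

def z_ci_mean(data, conf=0.95):
    """Large-sample two-sided CI for a mean: xbar +/- z * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / n ** 0.5                   # standard error, s / sqrt(n)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    return xbar - z * se, xbar + z * se

lower, upper = z_ci_mean(nd)
print(f"95% CI for the mean: [{lower:.6f}, {upper:.6f}]")
```

With these data the standard error is about 0.000011, so the interval comes out at roughly [1.52003, 1.52007].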

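Confidence Intervals: Construction of a small sample two-sided CI for a mean depends on: the sample size, $n$; the standard error for means, $s/\sqrt{n}$; and the desired level of confidence $1-\alpha$, where $\alpha$ is the significance level. Use $\alpha/2$ to compute the Student-t value with $n-1$ degrees of freedom, $t_{\alpha/2,\,n-1}$. The $(1-\alpha)100\%$ CI for the population mean $\mu$, using the sample average $\bar{x}$ and standard error, is $[\bar{x} - t_{\alpha/2,\,n-1}\, s/\sqrt{n},\ \bar{x} + t_{\alpha/2,\,n-1}\, s/\sqrt{n}]$.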

small sample two-sided CI for a mean
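For comparison, the same eleven fragment nD values can be run through the small sample (Student-t) interval. Python's standard library has no t quantile function, so this sketch hard-codes the two-sided 95% critical value for 10 degrees of freedom (2.228139, taken from a standard t table); that constant and the name `t_ci_mean` are assumptions of the sketch, not part of the original slides.

```python
from statistics import mean, stdev

# The eleven fragment nD values from the large sample example
nd = [1.52005, 1.52003, 1.52001, 1.52004, 1.52000, 1.52001,
      1.52008, 1.52011, 1.52008, 1.52008, 1.52008]

# Two-sided 95% Student-t critical value for n - 1 = 10 degrees of freedom
# (hard-coded from a t table; in R this would be qt(0.975, df = 10)).
T_CRIT_95_DF10 = 2.228139

def t_ci_mean(data, t_crit):
    """Small-sample two-sided CI for a mean: xbar +/- t * s / sqrt(n)."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / n ** 0.5
    return xbar - t_crit * se, xbar + t_crit * se

t_lower, t_upper = t_ci_mean(nd, T_CRIT_95_DF10)
print(f"95% t-based CI for the mean: [{t_lower:.6f}, {t_upper:.6f}]")
```

Because the t critical value (about 2.23) is larger than the z-value (about 1.96), the t-based interval is somewhat wider than the large sample one.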

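Confidence Intervals: Construction of a large sample two-sided CI for a proportion $p$ depends on: the sample size, $n$; the standard error for the proportion, $\sqrt{\hat{p}(1-\hat{p})/n}$; and the desired level of confidence $1-\alpha$, where $\alpha$ is the significance level. Use $\alpha/2$ to compute the N(0,1) z-value, $z_{\alpha/2}$. The $(1-\alpha)100\%$ CI for the proportion, using the sample proportion $\hat{p}$ and its standard error, is $[\hat{p} - z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}]$.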

Mike Neel and Major Wells of the ATF carried out a comparison of known matching (KM) striation pattern tool marks (AFTE J 39(3):176-198, 2007). Of the 914 KM comparisons they examined, it was found that 19 had 3 4-line (4X) consecutive matching striation (CMS) patches. Compute the large sample 95% confidence interval around the proportion of KM pairs with 3 4X CMS patches.
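The large sample interval for this proportion (19 of 914 comparisons) can be sketched with the standard library as follows. The function name `z_ci_proportion` is hypothetical, and the sketch uses the usual normal-approximation formula $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$.

```python
from statistics import NormalDist

def z_ci_proportion(successes, n, conf=0.95):
    """Large-sample two-sided CI for a proportion (normal approximation)."""
    p_hat = successes / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5         # standard error of p_hat
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # about 1.96 for 95%
    return p_hat - z * se, p_hat + z * se

p_lower, p_upper = z_ci_proportion(19, 914)
print(f"95% CI for the proportion: [{p_lower:.4f}, {p_upper:.4f}]")
```

The point estimate is 19/914, about 2.1%, and the interval runs from roughly 1.2% to 3.0%.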

Confidence Intervals: Construction of a small sample two-sided CI for a proportion: for a desired level of confidence, the $(1-\alpha)100\%$ CI for the population proportion is given by a lower bound and an upper bound.

Confidence Intervals Compute the small sample 95% confidence interval around the proportion of KM pairs with 3 4X CMS patches.

Confidence Intervals: One-sided CIs: use $z_{\alpha}$ or $t_{\alpha,\,n-1}$ instead of $z_{\alpha/2}$ or $t_{\alpha/2,\,n-1}$, and only + or - depending on which bound is desired. What is the lower one-sided small sample 95% confidence limit on the proportion of KM pairs with 3 4X CMS patches?

Prediction Intervals: What if we want to predict the value of a future data point for a given experiment? "What can be said about the values produced by a future experiment based on data in hand?" (Vardeman, Morris). We want to predict at a specified level of confidence (ehhhh, probability). We want an interval estimate for our prediction.

Prediction Intervals: If we know the distribution of the data generating process, along with the distribution's parameters, this is an easy question to answer. Example: the reliability of analytical balances in a narcotics lab is set by policy. A balance's lifetime is its reliable use time (in hr) before it needs major servicing. A vendor states that its balances have a Weibull distributed lifetime with shape = 15 and scale = 5000. What is the 90% prediction interval for the lifetime of the next balance shipped out of the factory?

Prediction Intervals: [Figure: Weibull lifetime density; most of the units have a lifetime near the peak.] Predict: there is a 90% chance that the next unit will have a lifetime between about 4100 hr and 5400 hr.
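When the distribution and its parameters are known, the prediction interval is just a pair of quantiles. The sketch below uses the closed-form Weibull quantile function, $Q(u) = \text{scale} \cdot (-\ln(1-u))^{1/\text{shape}}$, and assumes an equal-tailed interval (5% in each tail), which is one reasonable reading of the example; the function names are hypothetical.

```python
import math

def weibull_quantile(prob, shape, scale):
    """Quantile function of the Weibull distribution."""
    return scale * (-math.log(1.0 - prob)) ** (1.0 / shape)

def weibull_prediction_interval(shape, scale, level=0.90):
    """Equal-tailed prediction interval for one future Weibull observation."""
    tail = (1.0 - level) / 2.0  # probability left in each tail
    return (weibull_quantile(tail, shape, scale),
            weibull_quantile(1.0 - tail, shape, scale))

life_lo, life_hi = weibull_prediction_interval(shape=15, scale=5000)
print(f"90% prediction interval: {life_lo:.0f} hr to {life_hi:.0f} hr")
```

The result is roughly 4100 hr to 5400 hr, matching the values read from the figure.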

Prediction Intervals: We pretty much never know the data's distribution and its parameters precisely. Usually we are stuck with just a sample of data. What can we say about prediction intervals if all we have is a sample? If the data's distribution is Gaussian (normal), the $1-\alpha$ probability prediction interval for the (n+1)th data point, having observed n data points, is $\bar{x} \pm t_{\alpha/2,\,n-1}\, s\,\sqrt{1 + 1/n}$.

Prediction Intervals: 15 RI measurements from a pane of glass were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879. Assuming these RI measurements follow a normal distribution with unknown mean and variance, compute the 95% prediction interval for the 16th measurement.

Prediction Intervals: [Code slide annotations] Load some glass RI data from Prof. James Curran's (Auckland) dafs package. Compute the 95% prediction interval for the 16th measurement after the first 15 RI measurements. Compare it with the actual 16th RI measurement: did we make it in?
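The Gaussian prediction interval for the 16th RI measurement can be sketched in Python using the standard formula $\bar{x} \pm t_{\alpha/2,\,n-1}\, s\sqrt{1 + 1/n}$. This is not the R code from the slide. The t critical value for 14 degrees of freedom (2.144787) is hard-coded from a standard t table because the standard library has no t quantile function; that constant and the function name are assumptions of the sketch.

```python
from statistics import mean, stdev

# The 15 RI measurements listed in the example
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878,
      1.51882, 1.51876, 1.51877, 1.51882, 1.51882,
      1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

# Two-sided 95% Student-t critical value for n - 1 = 14 degrees of freedom
# (hard-coded from a t table; in R this would be qt(0.975, df = 14)).
T_CRIT_95_DF14 = 2.144787

def gaussian_prediction_interval(data, t_crit):
    """PI for the next observation: xbar +/- t * s * sqrt(1 + 1/n)."""
    n = len(data)
    xbar = mean(data)
    half_width = t_crit * stdev(data) * (1 + 1 / n) ** 0.5
    return xbar - half_width, xbar + half_width

pi_lo, pi_hi = gaussian_prediction_interval(ri, T_CRIT_95_DF14)
print(f"95% PI for measurement 16: [{pi_lo:.6f}, {pi_hi:.6f}]")
```

With these data the interval is approximately [1.518685, 1.518899], a width of about 0.00021.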

Prediction Intervals: If the data's distribution is unknown (Vardeman, Morris), the prediction interval for the (n+1)th data point, having observed n data points $x_i$, is $[\min(x_i), \max(x_i)]$, with prediction probability $(n-1)/(n+1)$. The prediction probability is dictated by the sample size. This is also called the non-parametric prediction interval.

Prediction Intervals: 15 RI measurements from a pane of glass were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879. Compute the prediction interval for the 16th measurement, assuming an unknown distribution for the data. How does the width compare to the Gaussian prediction interval? What sample size is required to have a prediction interval with a probability of at least 95%?

Prediction Intervals: [Code slide annotations] Non-parametric prediction interval; prediction probability; width of the 95% Gaussian PI; width of the 87.5% non-parametric PI.
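The non-parametric prediction interval needs only the sample minimum, the sample maximum, and the sample size. The sketch below assumes the usual result that [min, max] of n observations captures the next observation with probability (n - 1)/(n + 1), which agrees with the 87.5% quoted for n = 15. The function names are hypothetical, and exact fractions are used to avoid floating-point edge cases at the 95% threshold.

```python
from fractions import Fraction

# The 15 RI measurements listed in the example
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878,
      1.51882, 1.51876, 1.51877, 1.51882, 1.51882,
      1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

def nonparametric_pi(data):
    """Return (min, max, prediction probability (n - 1)/(n + 1))."""
    n = len(data)
    return min(data), max(data), Fraction(n - 1, n + 1)

def min_sample_size(target):
    """Smallest n with (n - 1)/(n + 1) >= target."""
    n = 2
    while Fraction(n - 1, n + 1) < target:
        n += 1
    return n

np_lo, np_hi, np_prob = nonparametric_pi(ri)
print(f"[{np_lo}, {np_hi}] with probability {float(np_prob)}")
print("n needed for at least 95%:", min_sample_size(Fraction(95, 100)))
```

For the RI data this gives [1.51865, 1.51883] with probability 0.875, a width of 0.00018, and a required sample size of 39 to reach at least 95%.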

Tolerance Intervals: What if we want to predict a large fraction of future data for a given experiment? A prediction interval is good only for the next, not-yet-produced value. We want to give an interval in which 100p% of future values should fall, with $1-\alpha$ confidence (ehhhh, probability, if they haven't been produced yet).

Tolerance Intervals: If the data's distribution is Gaussian (normal), the two-sided tolerance interval for 100p% of the mass of future data values with $1-\alpha$ confidence (probability) is $\bar{x} \pm \tau_2\, s$, where $\tau_2$ is the two-sided tolerance scale factor for $1-\alpha$ probability and a fraction p of the future values. Computing $\tau_2$ uses a chi-squared quantile (qchisq) with degrees of freedom df.

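Tolerance Intervals: If the data's distribution is Gaussian (normal), the one-sided tolerance interval for 100p% of the mass of future data values with $1-\alpha$ confidence (probability) is bounded by $\bar{x} - \tau_1\, s$ (lower) or $\bar{x} + \tau_1\, s$ (upper), where $\tau_1$ is the one-sided tolerance scale factor for $1-\alpha$ probability and a fraction p of the future values. Separate expressions for $\tau_1$ apply for large samples (n > 30) and for small samples; the small sample expression takes degrees of freedom (df) and a non-centrality parameter (ncp).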

Tolerance Intervals: If the data's distribution is unknown (Vardeman, Morris), the two-sided tolerance interval for 100p% of the mass of future data values is $[\min(x_i), \max(x_i)]$, with tolerance confidence (probability) $1 - p^n - n(1-p)p^{n-1}$. The probability depends on the proportion p and the sample size n. This is also called the non-parametric tolerance interval.

Tolerance Intervals: 15 RI measurements from a pane of glass were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879. Based on this sample, with what probability can a tolerance interval be produced which covers 99% of future RI measurements, assuming the data's distribution is unknown?
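This question can be answered with a one-line calculation. The sketch below assumes the standard order-statistics result for the interval [min, max]: the confidence that it covers a fraction p of the population is $1 - p^n - n(1-p)p^{n-1}$. The function name is hypothetical.

```python
def nonparametric_tolerance_confidence(n, p):
    """Confidence that [min, max] of n observations covers a fraction p.

    Uses 1 - p**n - n*(1 - p)*p**(n - 1), the standard result for the
    interval between the smallest and largest of n observations.
    """
    return 1 - p ** n - n * (1 - p) * p ** (n - 1)

tol_conf = nonparametric_tolerance_confidence(n=15, p=0.99)
print(f"Confidence for 99% coverage with n = 15: {tol_conf:.4f}")
```

For n = 15 and p = 0.99 the confidence is just under 1%, so a sample of 15 says very little about where 99% of future values will fall unless a distribution is assumed.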

Tolerance Intervals: 15 RI measurements from a pane of glass were: 1.51881, 1.51874, 1.51883, 1.51865, 1.51878, 1.51882, 1.51876, 1.51877, 1.51882, 1.51882, 1.51883, 1.51882, 1.51882, 1.51882, 1.51879. Assuming the RI measurements follow a normal (Gaussian) distribution, compute the two-sided tolerance interval in which 83% of future measurements will lie with 95% probability. Reference: Vardeman and Morris, notes for Industrial Engineering 361, Iowa State: http://www.public.iastate.edu/~vardeman/IE361/Modules/Module%2018C.pdf

Tolerance Intervals: Make life a little easier and use the package tolerance to do the tolerance interval calculations. [Code slide annotation] Tolerance interval assuming the data is Gaussian.
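For readers without the package at hand, a Gaussian two-sided tolerance interval can be approximated by hand. The sketch below uses Howe's approximation for the scale factor, $\tau_2 \approx \sqrt{(n-1)(1+1/n)\, z_{(1+p)/2}^2 / \chi^2_{\alpha,\,n-1}}$, where $\chi^2_{\alpha,\,n-1}$ is the lower $\alpha$ quantile. This is an assumption: the slides do not say which method they or the package use. The chi-squared quantile for 14 degrees of freedom (6.570631) is hard-coded from a standard table because the standard library has no chi-squared quantile function, and the function name is hypothetical.

```python
from statistics import NormalDist, mean, stdev

# The 15 RI measurements listed in the examples
ri = [1.51881, 1.51874, 1.51883, 1.51865, 1.51878,
      1.51882, 1.51876, 1.51877, 1.51882, 1.51882,
      1.51883, 1.51882, 1.51882, 1.51882, 1.51879]

# Lower 5% chi-squared quantile for n - 1 = 14 degrees of freedom
# (hard-coded from a table; in R this would be qchisq(0.05, df = 14)).
CHISQ_05_DF14 = 6.570631

def howe_tolerance_interval(data, p, chisq_lower):
    """Two-sided Gaussian tolerance interval via Howe's approximation."""
    n = len(data)
    z = NormalDist().inv_cdf((1 + p) / 2)  # N(0,1) quantile for coverage p
    tau2 = ((n - 1) * (1 + 1 / n) * z ** 2 / chisq_lower) ** 0.5
    xbar, s = mean(data), stdev(data)
    return xbar - tau2 * s, xbar + tau2 * s

ti_lo, ti_hi = howe_tolerance_interval(ri, p=0.83, chisq_lower=CHISQ_05_DF14)
print(f"83% coverage, 95% confidence: [{ti_lo:.6f}, {ti_hi:.6f}]")
```

Under these assumptions the interval is approximately [1.518692, 1.518892]. A dedicated package remains the better choice in practice, since it computes the quantiles for any sample size and confidence level.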
