Uncertainty and confidence intervals
Statistical estimation methods, Finse
Friday 10.9.2010, 12.45–14.05
Andreas Lindén

Outline
• Point estimates and uncertainty
• Sampling distribution
  – Standard error
  – Covariation between parameters
• Finding the VC-matrix for the parameter estimates
  – Analytical formulas
  – From the Hessian matrix
  – Bootstrapping
• The idea behind confidence intervals
• General methods for constructing confidence intervals of parameters
  – CI based on the central limit theorem
  – Profile likelihood CI
  – CI by bootstrapping

Point estimates and uncertainty
• The main output of any statistical model fit is the set of parameter estimates
  – Point estimates: one value for each parameter
  – The effect sizes
  – Answer the question “how much”
• Point estimates are of little use without an assessment of uncertainty
  – Standard error
  – Confidence intervals
  – p-values
  – Estimated sampling distribution
  – Bayesian credible intervals
  – Plot of the Bayesian posterior distribution

Sampling distribution
• The probability distribution of a parameter estimate
  – Calculated from a sample
  – Variability due to sampling effects
• Typically depends on sample size or the number of degrees of freedom (df)
• Examples of common sampling distributions
  – Student’s t-distribution
  – F-distribution
  – χ²-distribution

Degrees of freedom
[Figure: plot of Y against X]
In a linear regression df = n – 2 (two parameters, intercept and slope, are estimated from n observations)

Properties of the sampling distribution
• The standard error (SE) of a parameter estimate is the estimated standard deviation of its sampling distribution
  – Square root of the parameter variance
• Parameter estimates are not necessarily unrelated
  – The sampling distribution of several parameters is multivariate
  – Example: regression slope and intercept

Linear regression – simulated data

Param.        a      b      σ²
True value    4.00   1.00   0.80
Estim. 1      4.29   0.96   0.70
Estim. 2      4.13   0.97   0.36
Estim. 3      3.86   0.98   0.83
Estim. 4      3.77   1.04   0.75
Estim. 5      3.63   1.06   0.63
Estim. 6      4.39   0.93   0.72
Estim. 7      3.80   0.98   0.91
Estim. 8      3.78   1.06   0.92
Estim. 9      3.74   1.07   0.69
Estim. 10     4.62   0.84   0.50
…             …      …      …
Estim. 100    3.54   1.06   0.71
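
A table like this can be produced by repeatedly simulating data from the true model and refitting the regression. The sketch below is one possible way to do it, not the code behind the slide. The true values a = 4, b = 1, σ² = 0.8 and the 100 replicates come from the table. The sample size n = 20, the uniform x-values on [0, 10] and the seed are assumptions made only for illustration.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns (a, b, residual variance), with the residual variance
    computed on n - 2 degrees of freedom.
    """
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss / (n - 2)


def correlation(u, v):
    """Pearson correlation between two equally long lists."""
    mu, mv = statistics.fmean(u), statistics.fmean(v)
    suv = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    suu = sum((ui - mu) ** 2 for ui in u)
    svv = sum((vi - mv) ** 2 for vi in v)
    return suv / (suu * svv) ** 0.5


def simulate_estimates(a=4.0, b=1.0, sigma2=0.8, n=20, reps=100, seed=1):
    """Simulate `reps` data sets from the true model and refit each one."""
    rng = random.Random(seed)
    # Assumed design: x-values drawn once and kept fixed across replicates.
    x = [rng.uniform(0, 10) for _ in range(n)]
    sd = sigma2 ** 0.5
    estimates = []
    for _ in range(reps):
        y = [a + b * xi + rng.gauss(0, sd) for xi in x]
        estimates.append(fit_line(x, y))
    return estimates


estimates = simulate_estimates()
a_hat = [e[0] for e in estimates]
b_hat = [e[1] for e in estimates]

# The spread of the estimates approximates the sampling distribution.
print("SD of a estimates:", statistics.stdev(a_hat))
print("SD of b estimates:", statistics.stdev(b_hat))
print("correlation(a, b):", correlation(a_hat, b_hat))
```

The standard deviations of the simulated estimates approximate the standard errors. With positive x-values the intercept and slope estimates come out negatively correlated.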

Variance-covariance (COV) and correlation (CORR) matrices of the estimates:

COV =
   0.1531  -0.0273   0.0031
  -0.0273   0.0059   0.0002
   0.0031   0.0002   0.0335

CORR =
   1.0000  -0.9085   0.0432
  -0.9085   1.0000   0.0159
   0.0432   0.0159   1.0000
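
The two matrices are linked: standard errors are the square roots of the diagonal of the VC-matrix, and each correlation is a covariance divided by the product of the two standard errors. A minimal sketch using the COV values shown above, assuming the rows and columns are ordered a, b, σ² as in the table of estimates:

```python
import math

# Variance-covariance matrix copied from the slide.
# Assumed order of rows and columns: a, b, sigma^2.
cov = [
    [0.1531, -0.0273, 0.0031],
    [-0.0273, 0.0059, 0.0002],
    [0.0031, 0.0002, 0.0335],
]


def standard_errors(vc):
    """Square roots of the diagonal elements (the parameter variances)."""
    return [math.sqrt(vc[i][i]) for i in range(len(vc))]


def to_correlation(vc):
    """Scale each covariance by the product of the two standard errors."""
    se = standard_errors(vc)
    k = len(vc)
    return [[vc[i][j] / (se[i] * se[j]) for j in range(k)] for i in range(k)]


se = standard_errors(cov)
corr = to_correlation(cov)

print("standard errors:", [round(s, 4) for s in se])
for row in corr:
    print([round(r, 4) for r in row])
```

Because the covariances are rounded to four decimals, the recomputed correlations differ slightly from the CORR matrix on the slide, most visibly for the smallest entries.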
• Methods to obtain the VC-matrix (or standard errors) for a set of parameters
  – Analytical formulas
  – Bootstrap
  – The inverse of the Hessian matrix

Parameter variances analytically
• For many common situations the SE and VC-matrix of a set of parameters can be calculated with analytical formulas
• Standard error of the sample mean: SE(x̄) = s / √n (s = sample standard deviation, n = sample size)
• Standard error of the estimated binomial probability: SE(p̂) = √(p̂(1 – p̂) / n)
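
A minimal sketch of the two standard formulas, s / √n for the sample mean and √(p̂(1 – p̂) / n) for a binomial probability. The function names and the example data are hypothetical and chosen only for illustration.

```python
import math
import statistics


def se_mean(values):
    """Standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def se_binomial(successes, n):
    """Standard error of an estimated binomial probability."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)


sample = [4.1, 5.3, 3.8, 4.9, 5.0, 4.4]  # hypothetical measurements
print("SE of the mean:", se_mean(sample))
print("SE of p-hat:", se_binomial(30, 100))  # hypothetical: 30 successes in 100 trials
```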

Bootstrap
• The bootstrap is a general and common resampling method
• Used to simulate the sampling distribution
• Information in the sample itself is used to mimic the original sampling procedure
  – Non-parametric bootstrap: sampling with replacement
  – Parametric bootstrap: simulation based on parameter estimates
• The procedure is repeated B times (e.g. B = 1000)
• To make inference from the bootstrapped estimates
  – Sample standard deviation = bootstrap estimate of SE
  – Sample VC-matrix = bootstrap estimate of VC-matrix
  – Bootstrap mean minus the original estimate = bootstrap estimate of bias
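
The non-parametric variant can be sketched in a few lines. The example below resamples a small data set with replacement B = 1000 times and uses the sample mean as the estimator. The data values, the seed and the choice of estimator are assumptions for illustration only.

```python
import random
import statistics


def bootstrap_means(sample, B=1000, seed=1):
    """Non-parametric bootstrap: resample with replacement, re-estimate the mean."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(B)]


# Hypothetical data, used only to make the example runnable.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 4.0]

boot = bootstrap_means(sample)
boot_se = statistics.stdev(boot)  # bootstrap estimate of SE
boot_bias = statistics.fmean(boot) - statistics.fmean(sample)  # bootstrap estimate of bias

print("bootstrap SE:", boot_se)
print("bootstrap bias:", boot_bias)
```

For the sample mean the bootstrap SE should land close to the analytic s / √n, and the estimated bias close to zero.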

VC-matrix from the Hessian
• The Hessian matrix (H)
  – Matrix of second partial derivatives of the (multivariate) negative log-likelihood, evaluated at the ML estimate
  – Typically given as output by software for numerical optimization
• The inverse of the Hessian is an estimate of the parameters’ variance-covariance matrix
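
For a model with a single parameter the Hessian is a single number, so the idea can be checked by hand. The sketch below uses a binomial likelihood with hypothetical data (30 successes in 100 trials), approximates the second derivative of the negative log-likelihood by a central finite difference, and compares its inverse with the analytic variance p̂(1 – p̂) / n. The step size h is an arbitrary choice. With several parameters the same idea requires a full matrix of second derivatives and a matrix inverse.

```python
import math


def neg_log_lik(p, x, n):
    """Binomial negative log-likelihood (constant term omitted)."""
    return -(x * math.log(p) + (n - x) * math.log(1 - p))


def second_derivative(f, theta, h=1e-4):
    """Central finite-difference approximation of f''(theta)."""
    return (f(theta + h) - 2 * f(theta) + f(theta - h)) / h ** 2


x, n = 30, 100  # hypothetical data
p_hat = x / n   # ML estimate of p

hessian = second_derivative(lambda p: neg_log_lik(p, x, n), p_hat)
var_from_hessian = 1 / hessian  # inverse of the 1x1 Hessian
se_from_hessian = math.sqrt(var_from_hessian)
analytic_var = p_hat * (1 - p_hat) / n

print("variance from Hessian:", var_from_hessian)
print("analytic variance:", analytic_var)
```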

Confidence interval (CI)
• A frequentist interval estimate of one or several parameters
• A fraction α of all correctly constructed CIs will fail to include the true parameter value
  – Trust your 95% CI and accept the risk α = 0.05
• NB! Should not be confused with Bayesian credible intervals
  – A CI should not be thought to contain the parameter with 95% probability
  – The CI is based on the sampling distribution, not on an estimated probability distribution for the parameter of interest
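
The statement that a fraction α of intervals miss the true value can be checked by simulation. A minimal sketch, assuming normally distributed data and the normal-approximation interval estimate ± 1.96*SE for the mean. The true mean, sample size, number of replicates and seed are arbitrary choices.

```python
import random
import statistics


def miss_fraction(true_mean=0.0, sd=1.0, n=100, reps=2000, z=1.96, seed=1):
    """Fraction of simulated 95% CIs that fail to include the true mean."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        est = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5
        # Count the interval as a miss if the true value lies outside it.
        if not (est - z * se <= true_mean <= est + z * se):
            misses += 1
    return misses / reps


frac = miss_fraction()
print("fraction of CIs missing the true value:", frac)
```

The printed fraction should be close to α = 0.05. Each individual interval either contains the true value or it does not; the 95% refers to the long-run behaviour of the procedure.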

CI based on central limit theorem
• The sum/mean of many random values is approximately normally distributed
  – Strictly, a t-distribution applies, with df depending on sample size and model complexity
  – Might matter with small sample sizes
• As a rule of thumb, a parameter estimate ± 2*SE gives an approximate 95% confidence interval
  – With infinitely many observations: ± 1.96*SE
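
The multiplier 1.96 is the 0.975 quantile of the standard normal distribution, which the rule of thumb rounds to 2. A minimal sketch of the normal-approximation interval. The function name and the example estimate and SE are hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate, se, alpha=0.05):
    """Symmetric CI: estimate +/- z * SE, with z the 1 - alpha/2 normal quantile."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se


z_95 = NormalDist().inv_cdf(0.975)
lo, hi = normal_ci(1.0, 0.1)  # hypothetical estimate and standard error

print("z for a 95% CI:", z_95)
print("95% CI:", (lo, hi))
```

With small samples a t quantile with the appropriate df would replace z, which widens the interval.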

CI from profile likelihood
• The profile deviance
  – The change in −2*log-likelihood relative to its value at the ML estimate
  – Asymptotically χ²-distributed (exact only in the limit of infinite sample size)
• Confidence intervals can be obtained as the range around the ML estimate for which the profile deviance is below a critical level
  – The 1 – α quantile of the χ²-distribution
  – One parameter -> df = 1 (e.g. 3.841 for α = 0.05)
  – k-dimensional profile deviance -> df = k
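
The procedure can be sketched for a model with one parameter, where the profile likelihood is simply the likelihood. The example uses a binomial model with hypothetical data (30 successes in 100 trials) and finds, by bisection, the two parameter values at which the deviance equals the critical level 3.841. The bisection helper and all function names are illustrative choices, not a prescribed method.

```python
import math

CRIT_95 = 3.841  # chi-square quantile for df = 1, alpha = 0.05


def log_lik(p, x, n):
    """Binomial log-likelihood (constant term omitted)."""
    return x * math.log(p) + (n - x) * math.log(1 - p)


def deviance(p, x, n):
    """Change in -2*log-likelihood relative to the ML estimate x / n."""
    p_hat = x / n
    return 2 * (log_lik(p_hat, x, n) - log_lik(p, x, n))


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def profile_ci(x, n, crit=CRIT_95):
    """Range of p for which the deviance stays below `crit` (needs 0 < x < n)."""
    p_hat = x / n
    eps = 1e-9  # keep p away from 0 and 1, where the log-likelihood diverges

    def excess(p):
        return deviance(p, x, n) - crit

    lower = bisect(excess, eps, p_hat)
    upper = bisect(excess, p_hat, 1 - eps)
    return lower, upper


lower, upper = profile_ci(30, 100)  # hypothetical data
print("95% profile likelihood CI:", (lower, upper))
```

Unlike the estimate ± 1.96*SE interval, this interval is generally not symmetric around the ML estimate.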

95% CI from profile deviance
[Figure: –2*LL plotted against parameter value, with reference levels Fmin (the minimum) and Fmin + 3.841]

2-D confidence regions
[Figure: confidence regions for parameter a and parameter b]
99% confidence region: deviance χ² (df = 2) = 9.210
95% confidence region: deviance χ² (df = 2) = 5.991
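
For df = 2 the χ² quantile has a closed form, because the χ²-distribution with two degrees of freedom is an exponential distribution with mean 2. The 1 – α quantile is therefore –2*ln(α). A short sketch; the function name is hypothetical.

```python
import math


def chi2_df2_quantile(alpha):
    """The 1 - alpha quantile of the chi-square distribution with df = 2."""
    return -2 * math.log(alpha)


crit_95 = chi2_df2_quantile(0.05)
crit_99 = chi2_df2_quantile(0.01)

print("95% critical deviance:", round(crit_95, 3))
print("99% critical deviance:", round(crit_99, 3))
```

This shortcut holds only for df = 2. Other df need tables or a statistics library.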

CI by bootstrapping
• A 100*(1 – α)% CI for a parameter can be calculated from the sampling distribution
  – The α/2 and 1 – α/2 quantiles (e.g. 0.025 and 0.975 with α = 0.05)
• In bootstrapping, simply use the sample quantiles of the simulated values
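
A minimal sketch of a percentile interval from a parametric bootstrap, assuming a binomial model. The estimate p̂ = 0.3 with n = 100, the number of replicates B = 2000 and the seed are hypothetical, and the quantile helper uses linear interpolation between order statistics, which is one of several common conventions.

```python
import random


def parametric_bootstrap(p_hat, n, B=2000, seed=1):
    """Simulate B new estimates of p from Binomial(n, p_hat)."""
    rng = random.Random(seed)
    return [sum(rng.random() < p_hat for _ in range(n)) / n for _ in range(B)]


def quantile(values, q):
    """Empirical quantile by linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    i = int(pos)
    frac = pos - i
    if i + 1 < len(s):
        return s[i] + frac * (s[i + 1] - s[i])
    return s[i]


def percentile_ci(values, alpha=0.05):
    """The alpha/2 and 1 - alpha/2 sample quantiles of the bootstrap estimates."""
    return quantile(values, alpha / 2), quantile(values, 1 - alpha / 2)


boot = parametric_bootstrap(p_hat=0.3, n=100)
ci_low, ci_high = percentile_ci(boot)
print("95% bootstrap CI:", (ci_low, ci_high))
```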

Exercises
• Data: The prevalence of an infectious disease in a human population is investigated. Infection is recorded with 100% detection efficiency. In a sample of N = 80 humans, X = 18 infections were found.
• Model: Assume that infection (x = 0 or 1) of a host individual is an independent Bernoulli trial with probability p_i, and that the probability of infection is constant over all hosts (p_i = p).
• (This equals a logistic regression with an intercept only. Host-specific explanatory variables, such as age, condition, etc., could be used to refine the model of p_i.)

Do the following in R:
a) Calculate and plot the profile (log-)likelihood of the infection probability p
b) What is the maximum likelihood estimate of p (called p̂)?
c) Construct 95% and 99% confidence intervals for p based on the profile likelihood
d) Calculate the analytic SE of p̂
e) Construct a symmetric 95% confidence interval for p based on the central limit theorem and the SE obtained in the previous exercise
f) Simulate and plot the sampling distribution of p̂ by parametric bootstrapping (B = 10000)
g) Calculate the bootstrap SE of p̂
h) Construct a 95% confidence interval for p based on the bootstrap