Session 1: Introduction
Geir-Arne Fuglstad <[email protected]>
Department of Mathematical Sciences, NTNU
April 14, 2016
www.ntnu.no G.-A. Fuglstad, Introduction
Instructors
— Geir-Arne Fuglstad
  • Postdoc with Håvard Rue
  • Working on sensitivity analysis and robustification of results from INLA
  • PhD on non-stationary spatial modelling with SPDE models
— Jingyi Guo
  • Ph.D. student with Andrea Riebler and Håvard Rue
  • Master in Statistics from Lund University
Practical information
Slides and code for the sessions are available at http://www.math.ntnu.no/~fuglstad/Lund2016
Practical information
Information about the software, examples, papers and help can be found at http://www.r-inla.org
What is INLA?
We distinguish between three different parts:
1. The INLA method
2. The SPDE models
3. The INLA R-package
The INLA method
An approach for fast Bayesian inference with “latent Gaussian models”
Read paper:
Rue, H., Martino, S. and Chopin, N. (2009) “Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations.” Journal of the Royal Statistical Society: Series B, 71, 319–392.
The SPDE models
A novel way to get around the computational inefficiencies of continuously indexed spatial fields (Gaussian random fields, GRFs)
Read paper:
Lindgren, F., Rue, H. and Lindström, J. (2011) “An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach.” Journal of the Royal Statistical Society: Series B, 73, 423–498.
The INLA package
The R-package is an implementation of the INLA method and the SPDE models with a flexible and simple interface
Download with:
source("http://www.math.ntnu.no/inla/givemeINLA-testing.R")
The history of INLA
— The development has been driven by Håvard Rue and is the result of many years of hard work
— Around 2002–2004 he and Leonhard Held started to realize the importance of the class of models that INLA handles
— In 2005 Håvard Rue and Leonhard Held wrote the book “Gaussian Markov Random Fields: Theory and Applications”
The history of INLA
— The first implementation in C was finished in 2007, but required hand-crafted input files
— Arnoldo Frigessi (Oslo) suggested that an R-interface was necessary to reach a broad audience
— Sara Martino wrote the first prototype of the R-interface in January/February 2008
— The source code now consists of many, many, many lines...
— The source code is available at https://bitbucket.org/hrue/r-inla/
Who develops INLA?
Håvard Rue, Finn Lindgren, Daniel Simpson, Andrea Riebler, (Sara Martino, Thiago Guerrera Martins, Rupali Akerkar) and others (photo from 2011)
Aims of the course
— Get an overview of latent Gaussian models
— Get an overview of the INLA method
— Learn how to use INLA for (generalized) linear models, and more
— Learn how to do spatial modelling with INLA
Structure of the course
Day 1:
10:30–11:45 Session 1: Introduction
13:15–15:00 Session 2: R-INLA
15:30–17:00 Session 3: Practical session with R-INLA
Day 2:
09:15–10:00 Session 4: Advanced example
10:30–11:45 Session 5: Spatial modelling with INLA
13:15–15:00 Session 6: Practical session with spatial modelling
In general
— Ask questions!
— Discuss with us!
— If you have questions, you can use the Google group or ...
Outline
Motivation
Bayesian hierarchical models
Latent Gaussian models
Deterministic inference
R and INLA
Why use INLA?
— Provides full Bayesian analysis
— Quick to write code; no need to write a sampler
— Runs quickly
— Can be used for a flexible class of models (latent Gaussian models)
Example: Ski flying records
We have ski flying world records y = (y1, . . . , yn) and their dates x1, . . . , xn, and want to fit a simple linear regression with Gaussian responses, where
E(yi) = µ + βxi, Var(yi) = τ⁻¹, i = 1, . . . , n
[Figure: world record Distance against Year, 1960–2010.]
Frequentist analysis
mod = lm(Length ~ Date, data = skiData)
summary(mod)
[Figure: data with fitted regression line; Distance against Year.]
Estimates: µ: −3986 (66), β: 2.10 (0.03), σ = 1/√τ: 3.98
Bayesian analysis
res = inla(Length ~ Date, data = skiData)
res$summary.fixed[, 1:2]; res$summary.hyperpar
[Figure: posterior marginals. mu: Mean = −3986, SD = 65; beta: Mean = 2.10, SD = 0.03; standard deviation.]
Real-world problems are typically more complicated!
Often we need to
— include complicated dependency structures
— stabilize the inference
Can be achieved with hierarchical Bayesian modelling, but...
Two main challenges:
— Need computationally efficient methods to calculate posteriors
— Select priors in a sensible way
Bayesian hierarchical models
INLA can analyze Bayesian hierarchical models specified in three stages:
Stage 1: What is the distribution of the responses?
Stage 1.5: How is the mean/variance/probability of the response linked to the underlying unobserved components?
Stage 2: What is the distribution of the underlying unobserved components?
Stage 3: What are our prior beliefs about the parameters controlling the components in the model?
Stage 1
How is the data (y) generated from the underlying components (x) and hyperparameters (θ) in the model:
— Gaussian response?
— Count data? (E.g., Poisson, negative binomial)
— Zero-inflation?
— Point pattern? (E.g., log-Gaussian Cox process)
— Binary data?
The response distribution is connected to x and θ through the likelihood π(y | x, θ)
Stage 1.5
In INLA, Stage 1 and Stage 2 must be connected through linear predictors by
π(y | x, θ) = ∏ᵢ₌₁ⁿ π(yi | ηi, θ),
where each ηi is a linear combination of the model components x.
For example, ηi = µ + βxi can be combined with
Gaussian: ηi = µi
Poisson: ηi = log(µi)
Binomial: ηi = logit(pi)
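In R-INLA the same linear predictor can be paired with different likelihoods just by changing the family argument; a minimal sketch, assuming a hypothetical data frame d with a response column y and covariate x (names not from these slides):

```r
library(INLA)

# Same formula eta_i = mu + beta * x_i, three Stage-1 choices
res.gauss <- inla(y ~ x, family = "gaussian", data = d)  # identity link
res.pois  <- inla(y ~ x, family = "poisson",  data = d)  # log link
res.binom <- inla(y ~ x, family = "binomial",
                  Ntrials = 1, data = d)                 # logit link
```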
Stage 2
The underlying unobserved components x are called latent components and can be:
— Covariates
— Unstructured random effects (individual effects, group effects)
— Structured random effects (AR(1), regional effects, continuously indexed spatial effects)
The distribution of the model components is specified by π(x | θ)
Stage 3
The likelihood and the latent model typically have hyperparameters that control their behavior. The hyperparameters θ can include:
— Variance of unstructured effects
— Range and variance of spatial effects
— Autocorrelation parameter
— Variance of observation noise
— Probability of a zero (zero-inflated models)
The a priori beliefs about these parameters are placed in the prior π(θ)
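In R-INLA such priors are set through the hyper argument of the relevant model component; a sketch under assumed names (data frame d, time index t), not taken from these slides:

```r
library(INLA)

# Stage 3: log-gamma prior on the log-precision of an AR(1) component,
# and a prior on (the internal transform of) its lag-one correlation rho
formula <- y ~ 1 + f(t, model = "ar1",
                     hyper = list(
                       prec = list(prior = "loggamma", param = c(1, 0.01)),
                       rho  = list(prior = "normal",   param = c(0, 0.15))))
res <- inla(formula, family = "gaussian", data = d)
```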
Statistical jargon
It can be phrased with equations as
Stage 1: y | x, θ ∼ π(y | x, θ) = ∏ᵢ₌₁ⁿ π(yi | ηi, θ)
Stage 1.5: Each ηi is a linear combination of elements of x
Stage 2: x | θ ∼ π(x | θ)
Stage 3: θ ∼ π(θ)
Example: Disease mapping in Germany
We have observed larynx cancer mortality counts for males in 544 districts of Germany from 1986 to 1990 and want to make a model.
Information given:
yi: the count at location i.
Ei: an offset; expected number of cases in district i.
ci: a covariate (level of smoking consumption) at location i.
si: spatial location i (here, district).
[Map of the districts; colour scale 0.0–2.5.]
Stage 1: The data
First we decide on the likelihood for our data y
— Need a distribution for counts
— We decide to model our responses as
yi | ηi ∼ Poisson(Ei exp(ηi)),
where ηi is a linear function of the latent components
Stage 2: The latent model
We choose four components
— Intercept µ
— Spatially structured effect fs and unstructured effect u
— Covariate effect f(ci) of the exposure covariate ci
Combine with the linear predictor ηi = µ + fs(si) + f(ci) + ui, and the full latent field x = (µ, {fs(·)}, {f(·)}, u1, u2, . . . , un)
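A hedged sketch of how this disease-mapping model might be written in R-INLA, assuming a data frame germany with columns Y, E, region (district index, duplicated as region2 for the iid term) and covariate x; the column names and graph file are assumptions, not from the slides:

```r
library(INLA)

# Stage 2 components: structured spatial effect (Besag/ICAR on the
# district adjacency graph), unstructured district effect (iid),
# and a smooth covariate effect (second-order random walk)
formula <- Y ~ 1 +
  f(region,  model = "besag", graph = "germany.graph") +
  f(region2, model = "iid") +
  f(x, model = "rw2")

# Stage 1: Poisson likelihood with expected counts E_i as offset
res <- inla(formula, family = "poisson", E = E, data = germany)
summary(res)
```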
Stage 3: Hyperparameters
The structured and unstructured spatial effects, as well as the smooth covariate effect, are each controlled by one parameter
— τc, τf, τη: the precisions (inverse variances) of the covariate effect, spatial effect and unstructured effect, respectively.
Hyperparameters θ = (τc, τf, τη) must be given a prior π(τc, τf, τη).
Quantities of interest
[Left: median of the structured spatial effect exp(fs(si)), scale 0.8–1.6. Right: covariate effect f(ci) with 0.025, 0.5 and 0.975 quantiles.]
Latent Gaussian models
A key feature of the example is that it is contained in the very flexible and useful class of models called latent Gaussian models
— The characteristic property is that the latent part of the hierarchical model is Gaussian, x | θ ∼ N(0, Q⁻¹)
  • The expected value is 0
  • The precision matrix (inverse covariance matrix) is Q
Together with the linear predictor restriction, this defines the class of models INLA can handle
The general set-up
The class contains GLMs, GLMMs, GAMs, GAMMs, and more. It can be constructed by connecting the mean µi to the linear predictor, ηi, through a link function g,
ηi = g(µi) = α + ziᵀβ + ui + ∑γ wγ,i fγ(cγ,i), i = 1, 2, . . . , n
where
α: intercept
β: fixed effects of covariates z
u: unstructured error terms
{fγ(·)}: non-linear/smooth effects of covariates c
{wγ,i}: known weights defined for each observed data point
Flexibility through f -functions
The functions {fγ} provide very different types of random effects
— f(time): e.g., an AR(1) process, RW1 or RW2
— f(spatial location): e.g., a Matérn field
— f(covariate): e.g., a RW1 or RW2 on the covariate values
— f(time, spatial location): spatio-temporal effect
— And much more
Additivity
— One of the most useful features of the framework is the additivity
— Effects can easily be removed and added without difficulty
— Each component might add a new latent part and new hyperparameters, but the modelling framework and computations stay the same
Example: Smoothing binary time-series
— Observed the sequence y1, y2, . . . , yn of 0s and 1s
— Each time t has an associated covariate xt
— We want to smooth the time series by inferring the sequence pt, for t = 1, 2, . . . , n, of probabilities of 1s at each time step
Example: Smoothing time series
Stage 1: Bernoulli distribution for the responses
yt | ηt ∼ Bernoulli(exp(ηt) / (1 + exp(ηt)))
Stage 2: Covariates, AR(1) component and random noise are connected to the likelihood by
ηt = β0 + β1xt + at + vt
Stage 3: ρ: dependence parameter of the AR(1) process
σa²: marginal variance of the AR(1) process
σv²: variance of the unstructured error
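A sketch of this model in R-INLA, assuming a hypothetical data frame d with columns y, x and time index t (with a copy t2 for the iid noise term); not taken verbatim from the course code:

```r
library(INLA)

# Stage 1: Bernoulli responses (binomial with Ntrials = 1, logit link)
# Stage 2: eta_t = beta0 + beta1 * x_t + a_t (AR(1)) + v_t (iid noise)
formula <- y ~ 1 + x +
  f(t,  model = "ar1") +   # structured temporal effect a_t
  f(t2, model = "iid")     # unstructured noise v_t

res <- inla(formula, family = "binomial", Ntrials = 1, data = d,
            control.predictor = list(compute = TRUE, link = 1))

# Smoothed probabilities p_t on the response scale
p <- res$summary.fitted.values$mean
```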
Loads of examples
— Dynamic linear models
— Stochastic volatility models
— Generalised linear (mixed) models
— Generalised additive (mixed) models
— Measurement error models
— Spline smoothing
— Semi-parametric regression
— Disease mapping
— Log-Gaussian Cox processes
— Spatio-temporal models
— Survival analysis
— And more!
Computations
Now we have a modelling framework.
But how are the calculations actually done?
It depends on what you want to compute!
What are we interested in?
— Quantiles for the fixed effects
— A linear combination of elements from the latent field (e.g. the average of a spatial effect over an area, or the difference of two effects)
— A single hyperparameter (e.g. the range)
— A non-linear combination of hyperparameters (e.g. breeding values for livestock)
— Predictions at unobserved locations
What do we need to compute?
Often we are interested in the marginal posteriors of the latent field

    π(x_i | y),

or the marginal posteriors of the hyperparameters

    π(θ_i | y),

or the posterior of some other statistic

    π(f(x, θ) | y).

However, these can almost never be computed analytically.
Traditional approach with MCMC
Construct Markov chains with the target posterior distribution as the stationary distribution.
— Extensively used for Bayesian inference since the 1980s
— Flexible and general
— Generic tools exist, such as JAGS or OpenBUGS
— As do more specialised tools for specific model classes, e.g. BayesX
Alternative approach
— MCMC “works” for everything, but it can be incredibly slow
— Is it possible to make a quicker, more specialised inference scheme which only needs to work for this limited class of models?
Our model framework
Latent Gaussian models:

Stage 1: y | x, θ ∼ ∏_i π(y_i | η_i, θ)
Stage 2: x | θ ∼ N(0, Q(θ)^(-1))   Gaussian!
Stage 3: θ ∼ π(θ)

where the precision matrix Q(θ) is sparse. Gaussian distributions with such sparse precision matrices are called Gaussian Markov random fields (GMRFs).

The sparseness can be exploited for very quick computations for the Gaussian part of the model through numerical algorithms for sparse matrices.
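The computational point can be seen already with the Matrix package (distributed with R): the precision matrix of an AR(1) process is tridiagonal, so storage and solves scale with n rather than n². A small sketch (the helper name is illustrative):

```r
library(Matrix)  # sparse-matrix algorithms, distributed with R

# Precision matrix of a stationary AR(1) process with dependence rho
# and unit innovation precision: tridiagonal, hence sparse
ar1.precision <- function(n, rho) {
  Q <- bandSparse(n, k = c(-1, 0, 1),
                  diagonals = list(rep(-rho, n - 1),
                                   c(1, rep(1 + rho^2, n - 2), 1),
                                   rep(-rho, n - 1)))
  forceSymmetric(Q)
}

n <- 1000
Q <- ar1.precision(n, rho = 0.9)

# Only 3n - 2 of the n^2 entries are non-zero
sum(Q != 0)   # 2998

# A sparse Cholesky factorisation makes solves with Q cheap,
# e.g. for computing conditional means in the Gaussian part
L <- Cholesky(Q)
b <- rnorm(n)
mu <- solve(L, b)   # solves Q mu = b
```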
The INLA idea
Directly approximate the posterior marginals
π(xi | y) and π(θj | y)
using the posterior distribution
π(x ,θ | y) ∝ π(θ)π(x | θ)π(y | x ,θ).
Toy example: Smoothing

Observations

    y_i = m(i) + ε_i,   i = 1, ..., n

— ε_i is i.i.d. Gaussian noise with known precision τ_0
— m(i) is an unknown smooth function of i
    n = 50
    idx = 1:n
    fun = 100*((idx - n/2)/n)^3
    y = fun + rnorm(n, mean=0, sd=1)
    plot(idx, y)

[Figure: scatter plot of y against idx, showing the noisy cubic trend.]
Assumed hierarchical model
1. Data: Gaussian observations with known precision τ_0

    y_i | x_i, θ ∼ N(x_i, 1/τ_0)

2. Latent model: A second-order random walk for the underlying smooth function (model="rw2" in INLA)

    π(x | θ) ∝ θ^((n-2)/2) exp( -(θ/2) ∑_{i=3}^{n} (x_i - 2x_{i-1} + x_{i-2})² )

3. Hyperparameter: The smoothing parameter θ is assigned a Γ(a, b) prior

    π(θ) ∝ θ^(a-1) exp(-bθ),   θ > 0
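The toy model can be fitted with a few lines of R-INLA. A sketch (fixing the observation precision at the known value τ_0 = 1 through control.family; the data simulation repeats the earlier slide):

```r
library(INLA)  # from http://www.r-inla.org

n <- 50
idx <- 1:n
y <- 100 * ((idx - n/2) / n)^3 + rnorm(n, mean = 0, sd = 1)

# Latent model: second-order random walk, no intercept
formula <- y ~ -1 + f(idx, model = "rw2")

# control.family fixes the Gaussian observation precision at
# exp(initial) = 1, i.e. the known tau_0
result <- inla(formula, family = "gaussian",
               data = data.frame(y = y, idx = idx),
               control.family = list(hyper = list(
                 prec = list(initial = 0, fixed = TRUE))))

# Posterior mean of the smooth function, overlaid on the data
plot(idx, y)
lines(idx, result$summary.random$idx$mean)
```

The Γ(a, b) prior on θ corresponds to INLA's default log-gamma prior on the log precision of the rw2 term.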
Derivation of posterior marginals (I)

Since

    x, y | θ ∼ N(·, ·)

(derived using π(x, y | θ) ∝ π(y | x, θ) π(x | θ)), we can compute (numerically) all marginals, using that

    π(θ | y) ∝ π(x, y | θ) π(θ) / π(x | y, θ),

where both π(x, y | θ) and π(x | y, θ) are Gaussian.
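For a Gaussian likelihood this identity holds exactly for any value of x, since π(x, y | θ)/π(x | y, θ) = π(y | θ). A base-R check on a deliberately simple stand-in model (iid Gaussian prior x | θ ∼ N(0, I/θ) instead of the rw2 prior; names and prior parameters are illustrative):

```r
set.seed(2)
tau0 <- 1                         # known observation precision
n <- 20
theta.true <- 4
y <- rnorm(n, sd = 1/sqrt(theta.true)) + rnorm(n, sd = 1/sqrt(tau0))

# log pi(theta | y) up to a constant, via the identity
# pi(theta | y) ∝ pi(x, y | theta) pi(theta) / pi(x | y, theta),
# evaluated at the arbitrary point x = 0
log.post <- function(theta, a = 1, b = 0.01, x = rep(0, n)) {
  lj <- sum(dnorm(y, x, 1/sqrt(tau0), log = TRUE)) +   # pi(y | x)
        sum(dnorm(x, 0, 1/sqrt(theta), log = TRUE))    # pi(x | theta)
  q <- theta + tau0                                    # precision of x | y, theta
  mu <- tau0 * y / q                                   # mean of x | y, theta
  lc <- sum(dnorm(x, mu, 1/sqrt(q), log = TRUE))       # pi(x | y, theta)
  lj + dgamma(theta, a, b, log = TRUE) - lc
}

# Cross-check against the directly available marginal likelihood
log.post.direct <- function(theta, a = 1, b = 0.01) {
  sum(dnorm(y, 0, sqrt(1/theta + 1/tau0), log = TRUE)) +
    dgamma(theta, a, b, log = TRUE)
}

thetas <- seq(0.5, 10, by = 0.5)
max(abs(sapply(thetas, log.post) - sapply(thetas, log.post.direct)))  # ~ 0
```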
Posterior marginal for hyperparameter

[Figure: two panels plotting the posterior marginal for theta against log(theta); the left panel shows the unnormalised density, the right panel the normalised density.]
Derivation of posterior marginals (II)

From

    x | y, θ ∼ N(·, ·)

we can compute

    π(x_i | y) = ∫ π(x_i | θ, y) π(θ | y) dθ
               ≈ ∑_k π(x_i | θ_k, y) π(θ_k | y) Δ_k,

where π(x_i | θ, y) is Gaussian, the θ_k, k = 1, ..., K, correspond to representative points of θ | y, and the Δ_k are the corresponding weights.
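The finite sum is just a weighted mixture of Gaussians. A base-R sketch with made-up integration points (all numbers below are illustrative, chosen to resemble the x_1 marginals on the following slides):

```r
# Hypothetical conditional moments of x_1 | theta_k, y at K = 3
# integration points, with weights w_k = pi(theta_k | y) * Delta_k
mu  <- c(-9.8, -9.5, -9.1)   # E[x_1 | theta_k, y]
sdv <- c(0.55, 0.50, 0.60)   # sd[x_1 | theta_k, y]
w   <- c(0.25, 0.50, 0.25)   # weights, normalised to sum to 1

# Approximate posterior marginal of x_1 as the weighted mixture
post.x1 <- function(x) {
  rowSums(sapply(1:3, function(k) w[k] * dnorm(x, mu[k], sdv[k])))
}

xs <- seq(-13, -6, by = 0.01)
dens <- post.x1(xs)
sum(dens) * 0.01   # ~ 1: the mixture is a proper density
plot(xs, dens, type = "l")
```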
Posterior marginal for latent parameters

[Figure: three plots of posterior marginals for x_1: first the conditional densities π(x_1 | θ_k, y) for each θ_k (unweighted), then the same densities weighted by π(θ_k | y)Δ_k, and finally the combined posterior marginal for x_1.]
Fitted spline

The posterior marginals are used to calculate summary statistics, such as means, variances and credible intervals:

[Figure: the data y plotted against idx together with the fitted spline.]
Comparison with maximum likelihood

The red line is the Bayesian posterior, the blue line is the “posterior” using the MLE of θ, and the vertical line is the observed value y_1.

[Figure: the two densities for x_1 overlaid.]
Extensions
This is the basic idea behind INLA.
However, we need to extend this basic idea to deal with
— More than one hyperparameter
— Non-Gaussian observations
Extension: More than one hyperparameter

Step 1: Explore π(θ | y)
— Locate the mode
— Use the Hessian to construct new variables
— Grid search
Step 2: Approximate the marginals based on these integration points
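Step 1 can be sketched in base R on a hypothetical two-dimensional log-posterior (the function, grid spacing and cut-off below are illustrative):

```r
# Illustrative log-posterior in theta = (theta_1, theta_2)
log.post <- function(theta) {
  -0.5 * (theta[1] - 1)^2 - 0.5 * (theta[2] + 0.5)^2 - 0.1 * theta[1] * theta[2]
}

# Locate the mode (and a numerical Hessian at it)
opt <- optim(c(0, 0), function(th) -log.post(th), hessian = TRUE)
mode <- opt$par
H <- opt$hessian                  # negative Hessian of log.post at the mode

# Use the Hessian to construct standardised variables z:
# theta(z) = mode + L z, with solve(H) = L %*% t(L)
L <- t(chol(solve(H)))

# Grid search in z-space, keeping points with non-negligible mass;
# these become the integration points theta_k
zgrid <- as.matrix(expand.grid(z1 = seq(-3, 3), z2 = seq(-3, 3)))
thetas <- t(mode + L %*% t(zgrid))
lp <- apply(thetas, 1, log.post)
keep <- lp > max(lp) - 5
```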
Non-Gaussian observations
— π(x | y, θ) is often very close to a Gaussian distribution even with a non-Gaussian likelihood, and can be replaced with a Laplace approximation
— All the difficult high-dimensional integrals with respect to the latent field are then easy, and only the integrals with respect to the hyperparameters remain
— These integrals can be done efficiently numerically when the number of hyperparameters is low
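The Laplace approximation is easy to visualise in one dimension. A base-R sketch for a Poisson observation with a Gaussian prior on the log-rate (the example and its numbers are illustrative, not from the slides):

```r
# Posterior for the log-rate eta of a Poisson observation y:
# pi(eta | y) ∝ exp(y * eta - exp(eta)) * exp(-0.5 * tau * eta^2)
y <- 7
tau <- 0.5
log.dens <- function(eta) y * eta - exp(eta) - 0.5 * tau * eta^2

# Laplace approximation: Gaussian centred at the mode, with the
# negative curvature of log.dens at the mode as precision
mode <- optimize(log.dens, c(-5, 5), maximum = TRUE)$maximum
prec <- exp(mode) + tau          # -(d^2/deta^2) log.dens at the mode
laplace <- function(eta) dnorm(eta, mode, 1/sqrt(prec))

# Compare with the exact posterior, normalised numerically
etas <- seq(-2, 5, by = 0.001)
exact <- exp(log.dens(etas))
exact <- exact / (sum(exact) * 0.001)
max(abs(exact - laplace(etas)))  # small: the Gaussian fit is close
```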
Limitations
— The dimension of the latent field x can be large (10^2–10^6)
— But the dimension of the hyperparameters θ must be small (≤ 9)

In other words, each random effect can be big, but there cannot be too many random effects unless they share parameters.
How to use INLA?
INLA is implemented through the package INLA in the R software, which
— is the most popular computing language in applied statistics
— is open source and free
— has a lot of packages that extend the base functionality
— has a very user-friendly formula interface

    linear_model <- lm(weight ~ group)

fits the linear model

    weight_i = µ + group_i + ε_i
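inla() reuses this formula syntax. A sketch using the PlantGrowth data set that ships with R (it has exactly these weight and group columns):

```r
library(INLA)  # from http://www.r-inla.org

# Same formula as the lm() call above, but inla() returns full
# posterior summaries instead of point estimates
result <- inla(weight ~ group, family = "gaussian", data = PlantGrowth)
summary(result)
```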
Summary of INLA
Three main ingredients in INLA:
— Latent Gaussian models
— Laplace approximations
— Gaussian Markov random fields

These ingredients lead to a very nice tool for Bayesian inference that is
— fast
— accurate
— well suited to problems of moderate size