Gaussian Processes
folk.ntnu.no/joeid/phdclass_16jan18.pdf
Schedule
- 17 Jan: Gaussian processes (Jo Eidsvik)
- 24 Jan: Hands-on project on Gaussian processes (team effort, work in groups)
- 31 Jan: Latent Gaussian models and INLA (Jo Eidsvik)
- 7 Feb: Hands-on project on INLA (team effort, work in groups)
- 12-13 Feb: Template Model Builder (guest lecturer Hans J. Skaug)
- To be decided...
Motivation
Large spatial (spatio-temporal) datasets
Gaussian processes are very commonly used in practice.
Other contexts
- Genetic data
- Functional data
- Response surface modeling and optimization
Gaussian processes are extremely common as a building block in several machine learning applications.
Preliminaries
Univariate Gaussian distribution
Figure: Univariate Gaussian pdf p(x), for x from −3 to 3.
Y ∼ N(µ, σ2).
p(y) = (1/√(2πσ²)) exp(−(y − µ)²/(2σ²)),  y ∈ R.
Z = (Y − µ)/σ,  Y = µ + σZ.
Z is standard normal, mean 0 and variance 1.
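A quick numerical sketch of this standardization (numpy; µ, σ, seed and sample size are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 1.5

# Y = mu + sigma * Z, with Z standard normal
z = rng.standard_normal(100_000)
y = mu + sigma * z

# Standardize back: Z = (Y - mu) / sigma
z_back = (y - mu) / sigma
print(z_back.mean(), z_back.var())  # close to 0 and 1
```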
Multivariate Gaussian distribution
Size n × 1 vector Y = (Y1, . . . , Yn), with density

p(y) = 1/((2π)^(n/2) |Σ|^(1/2)) exp(−(1/2)(y − µ)′Σ⁻¹(y − µ)),  y ∈ Rⁿ.
Mean is E (Y ) = µ = (µ1, . . . , µn). Variance-covariance matrix is
Σ =
[ Σ1,1 . . . Σ1,n
  . . . . . . . . .
  Σn,1 . . . Σn,n ],

Σi,i = σi² = Var(Yi),  Σi,j = Cov(Yi, Yj),  Corr(Yi, Yj) = Σi,j/(σiσj).
Illustrations, n = 2: correlation 0.9 (left), independent (right).
Figure: Scatter plots of bivariate Gaussian samples (x1, x2): correlation 0.9 (left) and independent components (right).
Joint for blocks
YA = (YA,1, . . . , YA,nA) and YB = (YB,1, . . . , YB,nB) are jointly Gaussian with mean and covariance

µ = (µA, µB),  Σ =
[ ΣA   ΣA,B
  ΣB,A ΣB ].
Conditioning
E(YA|YB) = µA + ΣA,B ΣB⁻¹ (YB − µB),

Var(YA|YB) = ΣA − ΣA,B ΣB⁻¹ ΣB,A.
The mean is linear in the conditioning variable (the data). The variance does not depend on the data values.
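A minimal numpy sketch of these conditioning formulas, for a bivariate example with correlation 0.9 (all numbers are illustrative):

```python
import numpy as np

mu_A, mu_B = 0.0, 0.0
Sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])
Sigma_A, Sigma_AB, Sigma_B = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]

y_B = 1.0  # observed value of Y_B

# E(Y_A | Y_B) = mu_A + Sigma_AB Sigma_B^{-1} (y_B - mu_B)
cond_mean = mu_A + Sigma_AB / Sigma_B * (y_B - mu_B)
# Var(Y_A | Y_B) = Sigma_A - Sigma_AB Sigma_B^{-1} Sigma_BA
cond_var = Sigma_A - Sigma_AB / Sigma_B * Sigma_AB

print(cond_mean, cond_var)  # 0.9 and 0.19
```

Note that cond_var = 0.19 < 1: conditioning reduces the variance, and by the same amount whatever value y_B is observed.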
Illustration
Figure: Conditional pdf for Y1 when Y2 = 1 or Y2 = −1.
Transformation
Z = L⁻¹(Y − µ),  Y = µ + LZ,  Σ = LL′.

Z = (Z1, . . . , Zn) are independent standard normal, mean 0 and variance 1.
Cholesky factorization
Σ =
[ Σ1,1 . . . Σ1,n
  . . . . . . . . .
  Σn,1 . . . Σn,n ] = LL′,
Lower triangular matrix
L =
[ L1,1  0     . . .  0
  L2,1  L2,2  . . .  0
  . . .  . . .  . . .  0
  Ln,1  Ln,2  . . .  Ln,n ],
Cholesky - example
Σ =
[ 1    0.9
  0.9  1 ],   L =
[ 1    0
  0.9  0.44 ].
Consider sampling from joint p(y1, y2) = p(y1)p(y2|y1):
- A sample from p(y1) is constructed by Y1 = µ1 + L1,1 Z1.
- A sample from p(y2|y1) is constructed by Y2 = µ2 + L2,1 Z1 + L2,2 Z2 = µ2 + L2,1 (Y1 − µ1)/L1,1 + L2,2 Z2.
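The same construction in numpy, using the Cholesky factor to draw correlated samples (a sketch; seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])

# Lower-triangular L with Sigma = L L'
L = np.linalg.cholesky(Sigma)

# Y = mu + L Z, with Z i.i.d. standard normal
Z = rng.standard_normal((2, 100_000))
Y = mu[:, None] + L @ Z

print(L)          # [[1, 0], [0.9, sqrt(0.19) ~ 0.44]]
print(np.cov(Y))  # close to Sigma
```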
Processes
Gaussian process
For any set of time locations t1, . . . , tn, the vector (Y(t1), . . . , Y(tn)) is jointly Gaussian, with mean µ(ti), i = 1, . . . , n, and covariance matrix

Σ =
[ Σ1,1 . . . Σ1,n
  . . . . . . . . .
  Σn,1 . . . Σn,n ].
Covariance function
The covariance tends to decay with distance:
Σi,j = γ(|ti − tj |),
for some covariance function γ. Examples:
γexp(h) = σ2 exp(−φh)
γmat(h) = σ2(1 + φh) exp(−φh)
γgauss(h) = σ2 exp(−φh2)
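These three covariance functions translate directly into code (a sketch; σ² = 1 and φ = 0.2 are arbitrary illustrative choices):

```python
import numpy as np

def gamma_exp(h, sigma2=1.0, phi=0.2):
    # Exponential covariance
    return sigma2 * np.exp(-phi * h)

def gamma_mat(h, sigma2=1.0, phi=0.2):
    # Matern-type covariance
    return sigma2 * (1.0 + phi * h) * np.exp(-phi * h)

def gamma_gauss(h, sigma2=1.0, phi=0.2):
    # Gaussian covariance
    return sigma2 * np.exp(-phi * h**2)

# Covariance matrix for locations t_1, ..., t_n on a line
t = np.arange(10.0)
H = np.abs(t[:, None] - t[None, :])  # pairwise distances |t_i - t_j|
Sigma = gamma_exp(H)

# A valid covariance matrix is symmetric positive definite
print(np.linalg.eigvalsh(Sigma).min() > 0)
```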
Illustration of covariance function
Figure: Correlation against distance |t − s| for the exponential, Matérn-type, and Gaussian correlation functions.
Illustration of samples of Gaussian processes
Figure: Sample paths x(t), t from 0 to 100, for Gaussian processes with exponential and Matérn-type correlation.
Markov property
The exponential correlation function gives a Markov process. For any t > s > u,
p(y(t)|y(s), y(u)) = p(y(t)|y(s)).
(Proof: write out the trivariate distribution and apply the conditioning formulas.)
E(YA|YB) = µA + ΣA,B ΣB⁻¹ (YB − µB),

Var(YA|YB) = ΣA − ΣA,B ΣB⁻¹ ΣB,A.
Conditional formula
E(YA|YB) = µA + ΣA,B ΣB⁻¹ (YB − µB),

Var(YA|YB) = ΣA − ΣA,B ΣB⁻¹ ΣB,A.
- The expectation is linear in the data.
- The variance depends only on the data locations, not on the data values.
- The expectation is close to the conditioning variables near data locations, and goes to µA far from data.
- The variance is small near data locations, and goes to ΣA far from data.
- Data locations close to each other do not count as double information.
Illustration of conditioning in Gaussian processes
Figure: Sample paths x(t), t from 0 to 100, illustrating conditioning in Gaussian processes (two panels).
Application of Gaussian processes: function optimization
Several applications involve very time-demanding or costly experiments. Which configuration of inputs gives the highest output? The goal is to find the optimal input without too many trials / tests.

Approach: Fit a GP to the function, based on the evaluation points and results. This allows fast assessment of which evaluation points to choose next!
Production quality: current prediction
Goal: Find the best temperature input, to give optimal production. Which temperature should be evaluated next? The experiment is costly, so we want to do few evaluations.
Expected improvement
Maximum so far: Y∗ = max(Y).

EIn(s) = E(max(Y(s) − Y∗, 0) | Y)
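For a Gaussian predictive distribution with mean m and standard deviation sd at site s, this expectation has a well-known closed form: EI = (m − Y∗)Φ(u) + sd·φ(u), with u = (m − Y∗)/sd. A stdlib-only sketch (the numbers below are illustrative):

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_improvement(m, sd, y_star):
    # Closed-form EI for maximization:
    # EI = (m - y*) Phi(u) + sd phi(u), with u = (m - y*) / sd
    u = (m - y_star) / sd
    return (m - y_star) * norm_cdf(u) + sd * norm_pdf(u)

print(expected_improvement(0.5, 1.0, 1.0))  # about 0.198
```

EI is always non-negative, and it trades off high predicted mean against high predictive uncertainty.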
Sequential uncertainty reduction
Perform test at t = 40, result is Y (40) = 50.
Sequential uncertainty reduction and optimization
Maximum so far: Y∗ = max(Y, Yn+1).

EIn+1(s) = E(max(Y(s) − Y∗, 0) | Y, Yn+1)
Project on function optimization using EI next week
Analytical solutions exist for parts of the computational challenge.
Gaussian Markov random fields
Precision matrix Q
Σ⁻¹ = Q =
[ QA   QA,B
  QB,A QB ].
Q holds the conditional variance structure.
Interpretation of precision
QA⁻¹ = Var(YA|YB),

E(YA|YB) = µA − QA⁻¹ QA,B (YB − µB).

(Proof by QΣ = I, or by writing out the quadratic form and using p(YA|YB) ∝ p(YA, YB).)
Sparse precision matrix Q
- For graphs the precision matrix is sparse.
- Qi,j = 0 if nodes i and j are not neighbors: conditionally independent.
- Qi,i±2 = 0 for the exponential covariance function on a regular grid in time.
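The last point can be checked numerically: with an exponential covariance on a regular time grid, the inverse covariance is tridiagonal (a sketch; grid size and φ are arbitrary):

```python
import numpy as np

n, phi = 8, 0.5
t = np.arange(n, dtype=float)
H = np.abs(t[:, None] - t[None, :])

Sigma = np.exp(-phi * H)      # exponential covariance on a regular grid
Q = np.linalg.inv(Sigma)      # precision matrix

# Entries two or more steps off the diagonal are numerically zero
mask = H >= 2
print(np.max(np.abs(Q[mask])))  # numerically zero
```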
Sparse precision matrix Q
This sparseness means that several techniques from numerical analysis can be used, e.g. solving Qb = a quickly for b.
Cholesky factorization of Q

A common method for sampling and evaluation:
Q =
[ Q1,1 . . . Q1,n
  . . . . . . . . .
  Qn,1 . . . Qn,n ] = LQ LQ′,
Lower triangular matrix
LQ =
[ LQ,1,1  0       . . .  0
  LQ,2,1  LQ,2,2  . . .  0
  . . .    . . .    . . .  0
  LQ,n,1  LQ,n,2  . . .  LQ,n,n ].

The Cholesky factor is often sparse, but not as sparse as Q, because it holds only partial conditional structure, according to an ordering. Sparsity is maintained for the exponential covariance function in the time dimension (Markov).
Sampling and evaluation using LQ
Q =
[ Q1,1 . . . Q1,n
  . . . . . . . . .
  Qn,1 . . . Qn,n ] = LQ LQ′,
LQ′Y = Z, so that Var(Y) = (LQ′)⁻¹(LQ)⁻¹ = Q⁻¹.
(Previously, for covariance we had Y = LZ .)
log |Q| = 2 log |LQ| = 2 Σi log LQ,i,i
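A sketch verifying this log-determinant identity on a small symmetric positive definite precision matrix (an AR(1)-type tridiagonal Q; values are arbitrary):

```python
import numpy as np

# Tridiagonal AR(1)-type precision matrix, rho = 0.8
n, rho = 6, 0.8
Q = np.diag(np.full(n, 1.0 + rho**2))
Q[0, 0] = Q[-1, -1] = 1.0
for i in range(n - 1):
    Q[i, i + 1] = Q[i + 1, i] = -rho
Q /= 1.0 - rho**2

L_Q = np.linalg.cholesky(Q)
logdet_chol = 2.0 * np.sum(np.log(np.diag(L_Q)))   # 2 sum_i log L_{Q,ii}
sign, logdet_direct = np.linalg.slogdet(Q)

print(logdet_chol, logdet_direct)  # the two values agree
```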
GMRF for spatial applications.
A Markovian model can be constructed for spatial Gaussian processes (Lindgren et al., 2011). The spatial process is viewed as the solution of a stochastic partial differential equation, and this solution is embedded in a triangulated graph over the spatial domain. More later (31 Jan).
Regression Model
Regression for spatial datasets
Applications with trends and residual spatial variation.
Spatial Gaussian regression model
Model: Y (s) = X (s)β + w(s) + ε(s).
1. Y(s): response variable at location / location-time position s.
2. β: regression effects; X(s): covariates at s.
3. w(s): structured (space-time correlated) Gaussian process.
4. ε(s): unstructured (independent) Gaussian measurement noise.
Gaussian model
Model: Y(s) = X(s)β + w(s) + ε(s).
Data at n locations: Y = (Y(s1), . . . , Y(sn))′.
Main goals are:
- Parameter estimation
- Prediction
Gaussian model
Likelihood for parameter estimation:
l(Y; β, θ) = −(1/2) log |C| − (1/2)(Y − Xβ)′C⁻¹(Y − Xβ),

C(θ) = C = Σ + τ²In,  Var(w) = Σ,  Var(ε(si)) = τ² for all i.

θ includes the parameters of the covariance model.
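A numpy sketch of this log-likelihood (up to the constant −(n/2) log 2π), using the Matérn-type covariance from earlier and made-up one-dimensional data:

```python
import numpy as np

def log_likelihood(Y, X, beta, sigma2, phi, tau2, coords):
    # C = Sigma + tau^2 I, with Matern-type Sigma
    h = np.abs(coords[:, None] - coords[None, :])        # pairwise distances
    Sigma = sigma2 * (1.0 + phi * h) * np.exp(-phi * h)
    C = Sigma + tau2 * np.eye(len(Y))
    r = Y - X @ beta                                     # residual
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * logdet - 0.5 * r @ np.linalg.solve(C, r)

# Tiny synthetic check: intercept-only trend on a 1-D transect
rng = np.random.default_rng(3)
coords = np.linspace(0.0, 10.0, 20)
X = np.ones((20, 1))
Y = rng.standard_normal(20)
ll = log_likelihood(Y, X, np.array([0.0]), 1.0, 1.0, 0.1, coords)
print(ll)
```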
Method
Maximum likelihood
MLE: (θ̂, β̂) = argmaxθ,β {l(Y; β, θ)}.
Analytical derivatives
Formulas for matrix derivatives.
Q(θ) = C⁻¹,
β̂ = [X′QX]⁻¹X′QY,
Z = Y − Xβ̂,

d log |C| / dθr = trace(Q dC/dθr),

d(Z′QZ)/dθr = −Z′Q (dC/dθr) QZ.
Score and Hessian for θ
dl/dθr = −(1/2) trace(Q dC/dθr) + (1/2) Z′Q (dC/dθr) QZ,

E(d²l/(dθr dθs)) = −(1/2) trace(Q (dC/dθs) Q (dC/dθr)).
Updates for each iteration
Q = Q(θp),
β̂p = [X′QX]⁻¹X′QY,

θp+1 = θp − E(d²l(Y; β̂p, θp)/dθ²)⁻¹ dl(Y; β̂p, θp)/dθ.

The iterative scheme usually starts from a preliminary guess, obtained via summary statistics.
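A one-parameter sketch of this scheme: Fisher scoring for φ in a zero-mean exponential-covariance model on simulated data (no trend, no nugget; the guard against negative φ and all numbers are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
t = np.arange(n, dtype=float)
H = np.abs(t[:, None] - t[None, :])

# Simulate data with true phi = 0.3
phi_true = 0.3
Y = np.linalg.cholesky(np.exp(-phi_true * H)) @ rng.standard_normal(n)

phi = 0.1  # preliminary guess
for _ in range(25):
    C = np.exp(-phi * H)
    dC = -H * C                     # dC/dphi
    Q = np.linalg.inv(C)
    QdC = Q @ dC
    # Score and expected information from the formulas above
    score = -0.5 * np.trace(QdC) + 0.5 * Y @ QdC @ Q @ Y
    fisher = 0.5 * np.trace(QdC @ QdC)
    phi = max(phi + score / fisher, 1e-3)  # guard against negative phi

print(phi)  # close to phi_true
```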
Illustration maximization
Exponential covariance with nugget effect. θ = (θ1, θ2, θ3)′: log precision, logistic range, log nugget precision.
Asymptotic properties
θ̂ ≈ N(θ, G⁻¹),  G = G(θ) = −E(d²l/dθ²).
Prediction
Prediction from joint Gaussian formulation
Prediction:

Ŷ0 = E(Y0|Y) = X0β̂ + C0,· C⁻¹ (Y − Xβ̂),

where C0,· is the 1 × n vector of cross-covariances between the prediction site s0 and the data sites. Prediction variance:

Var(Y0|Y) = C0 − C0,· C⁻¹ C0,·′.
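A numpy sketch of these prediction formulas in a simple setting (known zero trend, exponential covariance plus nugget; all numbers are illustrative):

```python
import numpy as np

sigma2, phi, tau2 = 1.0, 0.3, 0.01

s = np.array([1.0, 3.0, 7.0])     # data sites
y = np.array([0.5, -0.2, 1.1])    # observations (zero-mean model)
s0 = 4.0                          # prediction site

H = np.abs(s[:, None] - s[None, :])
C = sigma2 * np.exp(-phi * H) + tau2 * np.eye(len(s))
c0 = sigma2 * np.exp(-phi * np.abs(s0 - s))   # cross-covariances C_{0,.}

w = np.linalg.solve(C, c0)        # kriging weights C^{-1} c0
pred = w @ y                      # E(Y0 | Y), with X0 beta = 0
pred_var = sigma2 + tau2 - c0 @ w # Var(Y0 | Y), with C0 = sigma2 + tau2

print(pred, pred_var)
```

Here C0 = σ² + τ² because the sketch predicts a new noisy observation; drop τ² from C0 to predict the latent process value instead.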
Numerical examples
Synthetic data
Consider the unit square. Create a grid of 25² = 625 locations. Use 49 data locations, randomly assigned or along the center line (two designs).
Covariance C(h) = τ²I(h = 0) + σ²(1 + φh) exp(−φh),  h = |si − sj|. θ includes transformations of σ, τ and φ.
Likelihood optimization
True parameters: β = (−2, 3, 1), θ = (0.25, 9, 0.0025).

Random design:
β̂ = [−2 (0.486), 3.43 (0.552), 0.812 (0.538)]
θ̂ = [0.298 (0.118), 7.89 (1.98), 0.00563 (0.00679)]

Center design:
β̂ = [−2.06 (0.576), 3.4 (0.733), 0.353 (0.733)]
θ̂ = [0.255 (0.141), 7.19 (1.97), 0.00283 (0.00128)]
Predictions
Project on spatial Gaussian regression model next week
Mining example
Data giving oxide grade information at n = 1600 locations.
Figure: Map of the survey area (easting/northing), showing the 500 m prediction line from A to B.
Use spatial statistics to predict oxide grade. Questions: Will mining be profitable? Should one gather more data before making the mining decision?
Mining example
Model: Y(s) = X(s)β + w(s) + ε(s). Data at the n borehole locations are used to fit model parameters by MLE. Starting values come from the variogram (related to the sample covariance).
Figure: Empirical variograms against distance (m): in-strike, perpendicular, and omni-directional.
Mining example
Current prediction
Figure: Current prediction along the line from A to B: two panels of line distance (m) against altitude (m), with color scales 0 to 5 (left) and 0.3 to 0.6 (right).
Mining example

Develop or not? The decision is made using expected profits. Value based on current data:
PV = max(py, 0),  py = r′µ|y − k′1.
Will more data influence the mining decision? Expected value with more data:
PoV = ∫ max(py,yn, 0) π(yn|y) dyn,  py,yn = r′µ|y,yn − k′1.
Is more data valuable? Compute the value of information (VOI):
VOI = PoV− PV.
If the VOI exceeds the cost of the data, it is worthwhile gathering this information.
(Computations partly analytical, similar to EI.)
Mining example
Use this to compare two data types in planned boreholes:

- XRF: scanning in a lab (expensive, accurate).
- XMET: handheld device for grade scanning at the mining site (inexpensive, inaccurate).
Mining example
Figure: Preferred data type (XRF, XMET, or nothing) as a function of the cost of XMET ($) and the cost of XRF ($), both in units of 10^5.
Schedule
- 17 Jan: Gaussian processes
- 24 Jan: Hands-on project on Gaussian processes
- 31 Jan: Latent Gaussian models and INLA
- 7 Feb: Hands-on project on INLA
- 12-13 Feb: Template Model Builder (guest lecturer Hans J. Skaug)
Schedule
Project (24 Jan) will include
- Expected improvement for function maximization.
- Parameter estimation in a spatial regression model (forest example).