C:/Users/diggle/Desktop/teaching/geostatistics/short ... · PDF fileModel-based Geostatistics...
Transcript of C:/Users/diggle/Desktop/teaching/geostatistics/short ... · PDF fileModel-based Geostatistics...
Model-based Geostatistics
Peter J Diggle
Lancaster University and Johns Hopkins
University School of Public Health
July 2011
Approximate timetable
1 0.00 - 0.30 Introduction and motivating examples2 0.30 - 1.30 Linear models3 1.30 - 2.15 Generalized linear models4 2.15 - 2.30 Design5 2.30 - 3.00 Preferential sampling
Most of the material is based on selected chapters from:
Diggle, P.J and Ribeiro, P.J. (2007). Model-basedGeostatistics. New York : Springer
Many (but not all) of the methods discussed are implementedin the R package geoR, see http://www.leg.ufpr.br/geoR/
Geostatistics
• traditionally, a self-contained methodology for spatialprediction, developed at Ecole des Mines, Fontainebleau,France
• nowadays, that part of spatial statistics which isconcerned with data obtained by spatially discretesampling of a spatially continuous process
Example 1.2: Tropical disease mapping
9 10 11 12 13 14 15
23
45
67
8
X Coord
9 10 11 12 13 14 15
23
45
67
8
X Coord
Y C
oord
−3 −2 −1 0 1
−6
−5
−4
−3
−2
−1
0
RAPLOA
PAR
AS
ITO
LOG
Y
Example 1.3: Environmental monitoring
5.0 5.5 6.0 6.5
46.5
47.0
47.5
48.0
48.5
1997 sample2000 sample
Model-based Geostatistics
• the application of general principles of statisticalmodelling and inference to geostatistical problems
• Example: kriging as minimum mean square errorprediction under Gaussian modelling assumptions
Gaussian geostatistics
Model
• Stationary Gaussian process S(x) : x ∈ IR2
· E[S(x)] = µ
· Cov{S(x), S(x′)} = σ2ρ(‖x − x′‖)
• Mutually independent Yi|S(·) ∼ N(S(x), τ 2)
Point predictor S(x) = E[S(x)|Y ]
• linear in Y = (Y1, ..., Yn);
• interpolates Y if τ 2 = 0.
Parameter uncertainty?
• usually ignored in traditional geostatistics(plug-in prediction)
Plotting data
require(geoR)
data(elevation)
elevation<-read.geodata("elevation.txt")
?elevation
points(elevation,cex.min=1,cex.max=4)
?points.geodata
points(elevation,cex.min=1,cex.max=4,pt.div="quint")
plot(elevation)
Notation
• Y = {Yi : i = 1, ..., n} is the measurement data
• {xi : i = 1, ..., n} is the sampling design
• A is the region of interest
• Y = {Y (x) : x ∈ A} is the measurement process
• S∗ = {S(x) : x ∈ A} is the signal process
• T = F(S∗) is the target for prediction
• [S∗, Y ] = [S∗][Y |S∗] is the geostatistical model
Gaussian model-based geostatistics
Model specification:
• Stationary Gaussian process S(x) : x ∈ IR2
· E[S(x)] = µ
· Cov{S(x), S(x′)} = σ2ρ(‖x − x′‖)
• Mutually independent Yi|S(·) ∼ N(S(xi), τ2)
Minimum mean square error prediction
[S, Y ] = [S][Y |S]
• T = t(Y ) is a point predictor
• MSE(T ) = E[(T − T )2]
Theorem: MSE(T ) takes its minimum value when T = E(T |Y ).
Proof uses result that for any predictor T ,
E[(T − T )2] = EY [VarT (T |Y )] + EY {[ET (T |Y ) − T ]2}
Immediate corollary is that
E[(T − T )2] = EY [Var(T |Y )] ≈ Var(T |Y )
Simple and ordinary kriging
Recall Gaussian model:
• Stationary Gaussian process S(x) : x ∈ IR2
· E[S(x)] = µ
· Cov{S(x), S(x′)} = σ2ρ(‖x − x′‖)
• Mutually independent Yi|S(·) ∼ N(S(x), τ 2)
Gaussian model implies
Y ∼ MVN(µ1, σ2V )
V = R + (τ 2/σ2)I Rij = ρ(‖xi − xj‖)Target for prediction is T = S(x), write r = (r1, ..., rn) where
ri = ρ(‖x − xi‖)
Standard results on multivariate Normal then give [T |Y ] asmultivariate Gaussian with mean and variance
T = µ + r′V −1(Y − µ1) (1)
Var(T |Y ) = σ2(1 − r′V −1r). (2)
Simple kriging: µ = Y Ordinary kriging: µ = (1′V −11)−11′V −1Y
The Matern family of correlation functions
ρ(u) = 2κ−1(u/φ)κKκ(u/φ)
• parameters κ > 0 and φ > 0
• Kκ(·) : modified Bessel function of order κ
• κ = 0.5 gives ρ(u) = exp{−u/φ}
• κ → ∞ gives ρ(u) = exp{−(u/φ)2}
• κ and φ are not orthogonal:
– helpful re-parametrisation: α = 2φ√κ
– but estimation of κ is difficult
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
u
ρ(u)
κ = 0.5 , φ = 0.25κ = 1.5 , φ = 0.16κ = 2.5 , φ = 0.13
0.0 0.2 0.4 0.6 0.8 1.0
−2
−1
01
2
x
S(x
)
Simple kriging: three examples
1. Varying κ (smoothness of S(x))
0.2 0.4 0.6 0.8
−1.
5−
0.5
0.0
0.5
1.0
1.5
locations
Y(x
)
3. Varying τ 2/σ2 (noise-to-dignal ratio)
0.0 0.2 0.4 0.6 0.8 1.0
−1.
5−
0.5
0.5
1.5
locations
Y(x
)
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.5
1.0
1.5
locations
stan
d. d
evia
tion
Predicting non-linear functionals
• minimum mean square error prediction is not invariantunder non-linear transformation
• the complete answer to a prediction problem is thepredictive distribution, [T |Y ]
• Recommended strategy:
– draw repeated samples from [S∗|Y ](conditional simulation)
– calculate required summaries(examples to follow)
Theoretical variograms
• the variogram of a process Y (x) is the function
V (x, x′) =1
2Var{Y (x) − Y (x′)}
• for the spatial Gaussian model, with u = ||x − x′||,
V (u) = τ 2 + σ2{1 − ρ(u)}
• provides a summary of the basic structural parametersof the spatial Gaussian process
Structural parameters
(τ 2, σ2, φ)
0 1 2 3 4 5
0.0
0.2
0.4
0.6
0.8
1.0
u
V(u
)
sill
nugget
practical range
• the nugget variance: τ 2
• the sill: τ 2 + σ2, where σ2 = Var{S(x)}
• the practical range: ρ(u) = ρ(u/φ)
Empirical variograms
uij = ‖xi − xj‖ vij = 0.5[y(xi) − y(xj)]2
• the variogram cloud is a scatterplot of the points (uij , vij)
• the empirical variogram smooths the variogram cloud byaveraging within bins: u − h/2 ≤ uij < u + h/2
• for a process with non-constant mean (covariates), useresiduals r(xi) = y(xi) − µ(xi) to compute vij
Example: elevation data
0 2 4 6 8
010
000
2000
030
000
u
V(u)
0 2 4 6 8
010
0020
0030
0040
0050
0060
00
u
V(u)
1. vij ∼ V (uij)χ2
1
2. the vij are correlated
Consequences:
• variogram cloud is unstable, pointwise and in overall shape
• binning addresses point 1, but not point 2
Calculating an empirical variogram
vario1<-variog(elevation,uvec=0.5*(0:10))
plot(vario1)
plot(vario1,pch=19,col="red")
?variog
vario2<-variog(elevation,uvec=0.5*(0:10),trend="2nd")
plot(vario2)
names(vario1)
plot(vario1$u,vario1$v,type="l",xlim=c(0,5),ylim=c(0,6500),
xlab="u",ylab="V(u)")
lines(vario2$u,vario2$v,col="red")
Parameter estimation using the variogram
What not to do and how to do it
• weighted least squares criterion:
W (θ) =∑
k
nk{[Vk − V (uk; θ)]}2
where θ denotes vector of covariance parameters and Vk
is average of nk variogram ordinates vij.
• need to choose upper limit for u (arbitrary?)
• variations include:– fitting models to the variogram cloud– other estimators for the empirical variogram– different proposals for weights
Comments on variogram fitting
1. Can give equally good fits for different extrapolationsat origin.
0 2 4 6 8 10 12 14
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
u
V(u)
2. Correlation between variogram points inducessmoothness.
Empirical variograms for three simulationsfrom the same model.
0.0 0.2 0.4 0.6
0.0
0.5
1.0
1.5
u
V(u)
3. Fit is highly sensitive to specification of the mean.
Illustration with linear trend surface:
• solid smooth line: theoretical variogram;
• dotted line: from data;
• solid line: from true residuals;
• dashed line: from estimated residuals.
0 2 4 6
0.0
0.5
1.0
1.5
2.0
2.5
u
V(u)
Parameter estimation: maximum likelihood
Y ∼ MVN(µ1, σ2R + τ 2I)
R is n × n matrix with (i, j)th element ρ(uij) whereuij = ||xi − xj || (Euclidean distance between xi and xj)
Adding explanatory variables is straightforward:
µ(xi) =k∑
j=1
fk(xi)βk
where dk(xi) is a vector of covariates at location xi, hence
Y ∼ MVN(Dβ,σ2R + τ 2I)
Gaussian log-likelihood function
Y ∼ MVN(Dβ,σ2R + τ 2I)
• write ν2 = τ 2/σ2, hence σ2V = σ2(R + ν2I)
• log-likelihood function is maximised for
β(V ) = (D′V −1D)−1D′V −1y
σ2 = n−1(y − Dβ)′V −1(y − Dβ)
• substitute (β, σ2) to give reduced maximisation problem
L∗(ν2, φ, κ) ∝ −0.5{n log |σ2| + log |(R + ν2I)|}
In practice, choose κ from a discrete set, e.g. κ = 0.5, 1, 2
Maximum likelihood estimation
mlfit2<-likfit(elevation,trend="2nd",ini.cov.pars=c(1000,1),
cov.model="matern",kappa=2)
mlfit2
Comments on maximum likelihood
• likelihood-based methods preferable to variogram-basedmethods
• restricted maximum likelihood is widely recommendedbut in our experience is sensitive to mis-specification ofthe mean model.
• composite likelihood treats contributions from pairs (Yi, Yj)as if independent
• approximate likelihoods useful for handling largedata-sets
• examining profile likelihoods is advisable, to check forpoorly identified parameters
Trans-Gaussian models
• assume Gaussian model holds after point-wisetransformation
• Box-Cox family is widely used
Y ∗i = hλ(Yi) =
{
(Y λi − 1)/λ if λ 6= 0
log(Yi) if λ = 0
• bias-correction? only if point prediction is required
Example: log-Gaussian kriging
• T (x) = exp{S(x)} T (x) = exp{S(x) + v(x)/2}
• S1, ..., Sm are a sample from [S|Y ]
• Ti = exp(Si) ⇒ T1, ..., Tm are a sample from [T |Y ]
Plug-in prediction
region<-matrix(c(0,0,6.5,0,6.5,6.5,0,6.5),4,2,T)
grid<-as.matrix(pred_grid(region,by=0.25))
KC<-krige.control(obj.model=mlfit2,trend.d="2nd",trend.l="2nd")
OC<-output.control(n.predictive=100)
set.seed(24367)
predictions<-krige.conv(geodata=elevation,locations=grid,
borders=region,krige=KC,output=OC)
image(predictions)
points(elevation,add=T)
par(mfrow=c(1,2))
hist(elevation$data,main="data")
predict.max<-NULL
for (sim in 1:100) {
predict.max<-c(predict.max,max(predictions$simulations[,sim]))
}
hist(predict.max,main="predicted maximum")
Bayesian inference
Model specification
[Y, S, θ] = [θ][S|θ][Y |S, θ]
Parameter estimation
• integration gives
[Y, θ] =
∫
[Y, S, θ]dS
• Bayes’ Theorem gives posterior distribution
[θ|Y ] = [Y |θ][θ]/[Y ]
where [Y ] =∫
[Y |θ][θ]dθ
Prediction: S → S∗
• expand model specification to
[Y, S∗, θ] = [θ][S|θ][Y |S, θ][S∗|S, θ]
• plug-in predictive distribution is
[S∗|Y, θ]
• Bayesian predictive distribution is
[S∗|Y ] =
∫
[S∗|Y, θ][θ|Y ]dθ
• for any target T = t(S∗), required predictive distribution[T |Y ] follows by direct calculation
Notes
• likelihood function is central to both classical and Bayesianinference
• Bayesian prediction is a weighted average of plug-inpredictions, with different plug-in values of θweighted according to their conditional probabilitiesgiven the observed data.
• Bayesian prediction is usually more conservativethan plug-in prediction
• Sampling from posterior and predictive distributions usesdirect Monte Carlo:
– conjugate priors for mean and variance parameters
– discrete priors for covariance parameters
Bayesian estimation and prediction
MC<-model.control(trend.d="2nd",trend.l="2nd",kappa=2)
PC<-prior.control(beta.prior="flat",sigmasq.prior="sc.inv.chisq",
sigmasq=1000,df.sigmasq=4,phi.discrete=0.5*(1:5),
tausq.rel.prior="uniform",tausq.rel.discrete=0.1*(1:5))
OC<-output.control(n.posterior=100,n.predictive=100,
simulations.predictive=T,signal=T,moments=F)
set.seed(24367)
results.bayes<-krige.bayes(geodata=elevation,locations=grid,
borders=region,model=MC,prior=PC,output=OC)
Plotting posterior distributions
plot(results.bayes)
posteriors.bayes<-results.bayes$posterior
posterior.sample<-posteriors.bayes$sample
par(mfrow=c(3,3))
for (i in 1:9) {
hist(posterior.sample[,i],main=" ")
}
par(mfrow=c(1,1))
plot(posterior.sample[,2],posterior.sample[,3])
Plotting predictive distributions
predictions.bayes<-results.bayes$predictive
image(unique(grid[,1]),unique(grid[,2]),
matrix(predictions.bayes$mean.simulations,27,27))
points(elevation,add=T)
par(mfrow=c(1,2))
predict.max<-NULL
for (sim in 1:100) {
predict.max<-c(predict.max,max(predictions$simulations[,sim]))
}
hist(predict.max,xlab="maximum", main="plug-in")
predict.bayes.max<-NULL
for (sim in 1:100) {
predict.bayes.max<-c(predict.bayes.max,
max(predictions.bayes$simulations[,sim]))
}
hist(predict.bayes.max,xlab="maximum",main="Bayesian")
AfricanProgramme forOnchocerciasisControl
• “river blindness” – an endemic disease in wet tropicalregions
• donation programme of mass treatment with ivermectin
• approximately 30 million treatments to date
• serious adverse reactions experienced by some patientshighly co-infected with Loa loa parasites
• precautionary measures put in place before masstreatment in areas of high Loa loa prevalence
http://www.who.int/pbd/blindness/onchocerciasis/en/
The Loa loa prediction problem
Ground-truth survey data
• random sample of subjects in each of a number of villages
• blood-samples test positive/negative for Loa loa
Environmental data (satellite images)
• measured on regular grid to cover region of interest
• elevation, green-ness of vegetation
Objectives
• predict local prevalence throughout study-region (Cameroon)
• compute local exceedance probabilities,
P(prevalence > 0.2|data)
Loa loa: a generalised linear model
• Latent spatial process
S(x) ∼ SGP{0, σ2, ρ(u))}ρ(u) = exp(−|u|/φ)
• Linear predictor
d(x) = environmental variables at location x
η(x) = d(x)′β + S(x)
p(x) = log[η(x)/{1 − η(x)}]
• Error distribution
Yi|S(·) ∼ Bin{ni, p(xi)}
Schematic representation of Loa loa model
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
environmental gradient
local fluctuation
sampling errorprevalence
location
Computation
• likelihood is intractable
• MCMC solution is feasible, but:
– slow
– and needs careful tuning
geoR.glm package available from www.r-project.org
• INLA (Rue et al, 2009) a very promising alternative
R package and more available from www.r-inla.org
The modelling strategy
• use relationship between environmental variables and ground-truth prevalence to construct preliminary predictions vialogistic regression
• use local deviations from regression model to estimatesmooth residual spatial variation
• Bayesian paradigm for quantification of uncertainty inresulting model-based predictions
logit prevalence vs MAX = max NDVI
0.65 0.70 0.75 0.80 0.85 0.90
−5
−4
−3
−2
−1
0
Max Greeness
logi
t pre
vale
nce
Comparing non-spatial and spatial predictionsin Cameroon
Non-spatial
Predicted prevalence - 'without ground truth data'
3020100
Obse
rved p
reva
lence
(%
)60
50
40
30
20
10
0
Spatial
Predicted prevalence - 'with ground truth data' (%)
403020100
Obs
erve
d pr
eval
ence
(%
)
60
50
40
30
20
10
0
Next Steps
• analysis confirms value of local ground-truth prevalencedata
• in some areas, need more ground-truth data to reducepredictive uncertainty
• but parasitological surveys are expensive
RAPLOA
• a cheaper alternative to parasitological sampling:
– have you ever experienced eye-worm?
– did it look like this photograph?
– did it go away within a week?
• RAPLOA data to be collected:
– in sample of villages previously surveyedparasitologically (to calibrate parasitology vs RAPLOAestimates)
– in villages not surveyed parasitologically (to reducelocal uncertainty)
• bivariate model needed for combined analysis ofparasitological and RAPLOA prevalence estimates
UNDP/World Bank/WHO Special Programme for Research & Training in Tropical Disease (TDR)
R E P O RT O F A M U LT I - C E N T R E S T U DY
Rapid AssessmentProceduresfor Loiasis
TDR/IDE/RP/RAPL/01.1
RAPLOA calibration
0 20 40 60 80 100
020
4060
8010
0
RAPLOA prevalence
para
sito
logy
pre
vale
nce
−6 −4 −2 0 2
−6
−4
−2
02
RAPLOA logitpa
rasi
tolo
gy lo
git
locatio
n
Empirical logit transformation linearises relationshipColour-coding corresponds to four surveys in different regions
RAPLOA calibration (ctd)
0 20 40 60 80 100
020
4060
8010
0
RAPLOA prevalence
para
sito
logy
pre
vale
nce
Fit linear functional relationship on logit scale and back-transform
RAPLOA mapping: methodology
1. use geostatistical model to draw samples from predictivedistribution of logit-transformed RAPLOA prevalence
2. use calibration model to draw samples from predictivedistribution of logit-transformed parasitological prevalenceconditional on RAPLOA prevalence
3. back-transform to parasitological prevalence
4. map empirical exceedance proportions at each location
Method ensures correct propagation of uncertaintyat each stage
RAPLOA mapping: results
5 10 15 20 25 30 35
−15
−10
−5
05
1015
longitude
latit
ude
0.0−0.20.2−0.40.4−0.60.6−0.80.8−1.0
5 10 15 20 25 30 35
−15
−10
−5
05
10
0.0
0.2
0.4
0.6
0.8
1.0
Geostatistical design
• Retrospective
Add to, or delete from, an existing set of measurementlocations xi ∈ A : i = 1, ..., n.
• Prospective
Choose optimal positions for a new set of measurementlocations xi ∈ A : i = 1, ..., n.
Naive design folklore
• Spatial correlation decreases with increasing distance.
• Therefore, close pairs of points are wasteful.
• Therefore, spatially regular designs are a good thing.
Less naive design folklore
• Spatial correlation decreases with increasing distance.
• Therefore, close pairs of points are wasteful if you knowthe correct model.
• But in practice, at best, you need to estimate unknownmodel parameters.
• And to estimate model parameters, you need your designto include a wide range of inter-point distances.
• Therefore, spatially regular designs should be temperedby the inclusion of some close pairs of points.
Examples of compromise designs
0 1
0
1
A) Lattice plus close pairs design
0 1
0
1
B) Lattice plus in−fill design
Further remarks on geostatistical design
1. Conceptually more complex problems include:
(a) design when some sub-areas are more interestingthan others;
(b) design for best prediction of non-linear functionalsof S(·);
(c) multi-stage designs (see next session).
2. Theoretically optimal designs may not be practicable(eg Loa loa photo).
3. A more realistic goal is to suggest constructions for good,general-purpose designs.
Geostatistics re-visited
locations X signal S measurements Y
• Usually write geostatistical model as
[S, Y ] = [S][Y |S]
• What if X is stochastic? Usual implicit assumption is
[X,S, Y ] = [X][S][Y |S],hence can ignore [X] for inference about [S, Y ].
• Resulting likelihood:
L(θ) =
∫
[S][Y |S]dS
Preferential sampling
locations X signal S measurements Y
• Conventional model:
[X,S, Y ] = [S][X][Y |S] (1)
• Preferential sampling model:
[X,S, Y ] = [S][X|S][Y |S,X] (2)
• Key point for inference: even if [Y |S,X] in (2) and [Y |S]in (1) are algebraically the same, the term [X|S] in (1)cannot be ignored for inference about [S, Y ], because ofthe shared dependence on the unobserved process S
A model for preferential sampling
[X,S, Y ] = [S][X|S][Y |S,X]
• [S] = SGP(0, σ2, ρ) (stationary Gaussian process)
• [X|S] = inhomogenous Poisson process with intensity
λ(x) = exp{α + βS(x)}
• [Y |S,X] = N{µ + S(x), τ 2} (independent Gaussian)
Simulation of preferential sampling model
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Y C
oord
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Y C
oord
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Y C
oord
Locations (dots) and underlying signal process (grey-scale):
• left-hand panel: uniform non-preferential
• centre-panel: clustered preferential
• right-hand panel: clustered non-preferential
Likelihood inference
[X,S, Y ] = [S][X|S][Y |S,X]
• data are Xand Y , hence likelihood is
L(θ) =
∫
[X,S, Y ]dS = ES [[X|S][Y |S,X]]
• evaluate expectation by Monte Carlo,
LMC(θ) = m−1
m∑
j=1
[X|Sj ][Y |Sj, X]
• use anti-thetic pairs S2j = −S2j−1
Likelihood inference: an importance sampler
Re-write likelihood as
L(θ) =
∫
[X|S][Y |X,S][S|Y ]
[S|Y ][S]dS
• [S] = [S0][S1|S0]
• [S|Y ] = [S0|Y ][S1|S0, Y ] = [S0|Y ][S1|S0]
• [Y |X,S] = [Y |S0]
⇒
L(θ) =
∫
[X|S] [Y |S0]
[S0|Y ][S0][S|Y ]dS
= ES|Y
[
[X|S] [Y |S0]
[S0|Y ][S0]
]
Practical solutions to weak identifability
1. Strong Bayesian priors (if you can believe them)
2. Explanatory variables as surrogate for S
3. Two-stage sampling
Heavy metal bio-monitoring in Galicia
500000 550000 600000 650000
4650
000
4700
000
4750
000
4800
000
4850
000
1997 sample
alence
Heavy metal bio-monitoring in Galicia
5.0 5.5 6.0 6.5
46.5
47.0
47.5
48.0
48.5
1997 sample2000 sample
alence
Heavy metal bio-monitoring in Galicia
• 1997 sampling design highly non-uniform
• may lead to biased estimates of residual spatial variation
• 2000 sampling design better for fitting model of residualspatial variation
• possible modelling framework is:
– 2000 sampling is non-preferential
– 1997 sampling may be preferential
– some parameters in common between 1997 and 2000
Summary statistics
untransformed log-transformed1997 2000 1997 2000
Number of locations 63 132 63 132Mean 4.72 2.15 1.44 0.66Standard deviation 2.21 1.18 0.48 0.43Minimum 1.67 0.80 0.52 -0.22Maximum 9.51 8.70 2.25 2.16
Marginal distributions of log-transformed leadconcentrations
−0.5 0.0 0.5 1.0 1.5 2.0 2.5
0.0
0.2
0.4
0.6
0.8
1.0
ts
log(lead concentration)
cum
ula
tiv
eproportio
n
Conventional geostatistical analysis:fitted variograms
0.0 0.2 0.4 0.6 0.8 1.0 1.2
0.00
0.05
0.10
0.15
0.20
0.25
0.0 0.2 0.4 0.6 0.8 1.0 1.2
0.00
0.05
0.10
0.15
0.20
0.25
catio
n
uu
V(u)
V(u)
Likelihood ratio testing
1997 preferential? parameterisation logL(θ)no free parameters −96.79
common σ, φ, τ 100.62
yes free parameters −82.95common σ, φ, τ −86.04
Parameter estimation for joint model
θ θ SE Correlation matrix
µ97 1.68 0.19 1.00 0.25 0.30 0.56 0.13 -0.02µ00 0.74 0.10 0.25 1.00 0.10 0.26 0.11 -0.12log σ -0.94 0.04 0.30 0.10 1.00 0.18 -0.55 0.09log φ -1.40 0.06 0.56 0.26 0.18 1.00 0.47 -0.19log τ -1.48 0.04 0.13 0.11 -0.55 0.47 1.00 -0.23β -1.01 0.21 -0.02 -0.12 0.09 -0.19 -0.23 1.00
Spatial prediction of lead concentrations
Common scale runs from −0.756 (red) to 8.358 (white)
5.0 5.5 6.0 6.5
46.5
47.0
47.5
48.0
5.0 5.5 6.0 6.5
46.5
47.0
47.5
48.0
5.0 5.5 6.0 6.5
46.5
47.0
47.5
48.0
u)
preferential non-preferential difference
Heavy metal bio-monitoring in Galicia
• 1997 sampling design is good for monitoring effects ofindustrial activity
• but would lead to potential biased estimates of residualspatial variation
• 2000 sampling design is good for fitting model of residualspatial variation
• assuming stability of pollution levels over time, possibleanalysis strategy is:
– use 2000 data, or sub-set thereof, to model spatialvariation
– holding spatial correlation parameters fixed, use 1997data to model point-source effects of industriallocations.
Closing remarks
• There is nothing special about geostatistics.
• Parameter uncertainty can have a material impact onprediction.
• Bayesian paradigm deals naturally with parameteruncertainty.
• Implementation through MCMC is not whollysatisfactory:
– sensitivity to priors?
– convergence of algorithms?
– routine implementation on large data-sets?
• INLA shows great promise as alternative to MCMC
• Model-based approach clarifies distinctions between:
– the substantive problem;
– formulation of an appropriate model;
– inference within the chosen model;
– diagnostic checking and re-formulation.
• Analyse problems, not data:
– what is the scientific question?
– what data will best allow us to answer the question?
– what is a reasonable model to impose on the data?
– inference: avoid ad hoc methods if possible
– fit, reflect, re-formulate as necessary
– answer the question.
References
Crainiceanu, C., Diggle, P.J. and Rowlingson, B.S. (2008) Bivari-ate modelling and prediction of spatial variation in Loa loa preva-lence in tropical Africa (with Discussion). Journal of the American
Statistical Association, 103, 21–43.
Diggle, P.J. and Lophaven, S. (2006). Bayesian geostatistical de-sign. Scandinavian Journal of Statistics, 33, 55–64.
Diggle, P.J., Menezes, R. and Su, T.-L. (2010). Geostatistical anal-ysis under preferential sampling (with Discussion).Applied Statis-
tics, 59, 191–232.
Diggle, P.J., Moyeed, R.A. and Tawn, J.A. (1998). Model-basedGeostatistics (with Discussion). Applied Statistics 47 299–350.
Diggle, P.J. and Ribeiro, P.J. (2007). Model-based Geostatistics.New York: Springer.
Diggle, P.J., Thomson, M.C., Christensen, O.F., Rowlingson, B.,Obsomer, V., Gardon, J., Wanji, S., Takougang, I., Enyong, P.,Kamgno, J., Remme, H., Boussinesq, M. and Molyneux, D.H.(2007). Spatial modelling and prediction of Loa loa risk: deci-sion making under uncertainty. Annals of Tropical Medicine and
Parasitology, 101, 499–509.
Rue, H., Martino, S. and Chopin, N. (2009). Approximate Bayesianinference for latent Gaussian models by using integrated nestedLaplace approximations (with Discussion). Journal of the RoyalStatistical Society B 71, 319–392(see also http://www.r-inla.org/)