Lecture 20 - Spatial Random Effects Models + Point Reference...

36
Lecture 20 Spatial Random Effects Models + Point Reference Spatial Data Colin Rundel 04/03/2017 1

Transcript of Lecture 20 - Spatial Random Effects Models + Point Reference...

Page 1: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Lecture 20Spatial Random Effects Models + Point Reference Spatial Data

Colin Rundel04/03/2017

1

Page 2: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Spatial Random Effects Models

2

Page 3: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Scottish Lip Cancer Data

Observed Expected

8°W 6°W 4°W 2°W 8°W 6°W 4°W 2°W

55°N

56°N

57°N

58°N

59°N

60°N

61°N

0

20

40

60

80

value

3

Page 4: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

55°N

56°N

57°N

58°N

59°N

60°N

61°N

8°W 6°W 4°W 2°W

0

2

4

6

Obs/Exp

55°N

56°N

57°N

58°N

59°N

60°N

61°N

8°W 6°W 4°W 2°W

0

5

10

15

20

% Agg Fish Forest

4

Page 5: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Neighborhood / weight matrix

5

Page 6: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Moran’s I

morans_I = function(y, w){n = length(y)y_bar = mean(y)num = sum(w * (y-y_bar) %*% t(y-y_bar))denom = sum( (y-y_bar)^2 )(n/sum(w)) * (num/denom)

}

morans_I(y = lip_cancer$Observed, w = diag(rowSums(W)) %*% W)## [1] 0.2258758

morans_I(y = lip_cancer$Observed / lip_cancer$Expected,w = diag(rowSums(W)) %*% W)

## [1] 0.5323161

ape::Moran.I(lip_cancer$Observed / lip_cancer$Expected,weight = diag(rowSums(W)) %*% W) %>% str()

## List of 4## $ observed: num 0.666## $ expected: num -0.0182## $ sd : num 0.0784## $ p.value : num 0

6

Page 7: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

A hierachical model for lip cancer

We have observed counts of lip cancer for 56 districts in Scotland. Let yirepresent the number of lip cancer for district i.

yi ∼ Poisson(λi)

log(λi) = log(Ei) + xiβ + ωi

ω ∼ N (0, σ2(D − ϕW)−1)

where Ei is the expected counts for each region (and serves as an offet).

7

Page 8: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Data prep & JAGS model

D = diag(rowSums(W))X = model.matrix(~scale(lip_cancer$pcaff))log_offset = log(lip_cancer$Expected)y = lip_cancer$Observed

## model{## for(i in 1:length(y)) {## y[i] ~ dpois(lambda[i])## y_pred[i] ~ dpois(lambda[i])## log(lambda[i]) <- log_offset[i] + X[i,] %*% beta + omega[i]## }#### for(i in 1:2) {## beta[i] ~ dnorm(0,1)## }#### omega ~ dmnorm(rep(0,length(y)), tau * (D - phi*W))## sigma2 <- 1/tau## tau ~ dgamma(2, 2)## phi ~ dunif(0,0.99)## }

8

Page 9: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Model Results

35000 40000 45000 50000 55000 60000

−0.

40.

2

Iterations

Trace of beta[1]

−0.5 0.0 0.5

0.0

1.0

2.0

Density of beta[1]

N = 1000 Bandwidth = 0.04924

35000 40000 45000 50000 55000 60000

0.0

0.2

0.4

Iterations

Trace of beta[2]

−0.1 0.0 0.1 0.2 0.3 0.4 0.5

01

23

Density of beta[2]

N = 1000 Bandwidth = 0.02561

9

Page 10: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

35000 40000 45000 50000 55000 60000

0.4

0.7

1.0

Iterations

Trace of phi

0.4 0.5 0.6 0.7 0.8 0.9 1.0

04

812

Density of phi

N = 1000 Bandwidth = 0.01076

35000 40000 45000 50000 55000 60000

0.5

1.5

Iterations

Trace of sigma2

0.5 1.0 1.5 2.00.

01.

0

Density of sigma2

N = 1000 Bandwidth = 0.06232

10

Page 11: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Predictions & Residuals

55°N

56°N

57°N

58°N

59°N

60°N

61°N

8°W 6°W 4°W 2°W

10

20

30

Predicted Cases

55°N

56°N

57°N

58°N

59°N

60°N

61°N

8°W 6°W 4°W 2°W

−2

0

2

Residuals

11

Page 12: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Residuals + RMSE + Moran’s I

#RMSElip_cancer_pred$resid %>% .^2 %>% mean() %>% sqrt()## [1] 1.498675

#Moran’s Imorans_I(y = lip_cancer_pred$resid, w = diag(rowSums(W)) %*% W)## [1] 0.05661104

ape::Moran.I(lip_cancer_pred$resid,weight = diag(rowSums(W)) %*% W) %>% str()

## List of 4## $ observed: num 0.0642## $ expected: num -0.0182## $ sd : num 0.0803## $ p.value : num 0.305

12

Page 13: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Point Referenced Data

13

Page 14: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Example - PM2.5 from CSN

The Chemical Speciation Network are a series of air quality monitors run byEPA (221 locations in 2007). We’ll look at a subset of the data from Nov 11th,2007 (n=191) for just PM2.5.

25

30

35

40

45

50

−120 −100 −80

long

lat

pm25

10

20

30

40

10

20

30

40

pm25

14

Page 15: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

csn## # A tibble: 191 × 5## site longitude latitude date pm25## <int> <dbl> <dbl> <dttm> <dbl>## 1 10730023 -86.81500 33.55306 2007-11-14 19.43555## 2 10732003 -86.92417 33.49972 2007-11-14 26.40000## 3 10890014 -86.58637 34.68767 2007-11-14 13.40000## 4 11011002 -86.25637 32.40712 2007-11-14 19.70000## 5 11130001 -84.99917 32.47639 2007-11-14 22.60000## 6 40139997 -112.09577 33.50383 2007-11-14 12.30000## 7 40191028 -110.98230 32.29515 2007-11-14 7.20000## 8 51190007 -92.28130 34.75619 2007-11-14 12.70000## 9 60070002 -121.84222 39.75750 2007-11-14 10.00000## 10 60190008 -119.77222 36.78139 2007-11-14 32.26205## # ... with 181 more rows

15

Page 16: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Aside - Splines

16

Page 17: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Splines in 1d - Smoothing Splines

These are a mathematical analogue to the drafting splines representedusing a penalized regression model.

We want to find a function f(x) that best fits our observed datay = y1, . . . , yn while being as smooth as possible.

arg minf(x)

n∑i=1

(yi − f(xi))2 + λ

∫f′′(x)2 dx

Interestingly, this minimization problem has an exact solution which is givenby a mixture of weighted natural cubic splines (cubic splines that are linearin the tails) with knots at the observed data locations (xs).

17

Page 18: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Splines in 1d - Smoothing Splines

These are a mathematical analogue to the drafting splines representedusing a penalized regression model.

We want to find a function f(x) that best fits our observed datay = y1, . . . , yn while being as smooth as possible.

arg minf(x)

n∑i=1

(yi − f(xi))2 + λ

∫f′′(x)2 dx

Interestingly, this minimization problem has an exact solution which is givenby a mixture of weighted natural cubic splines (cubic splines that are linearin the tails) with knots at the observed data locations (xs).

17

Page 19: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Splines in 2d - Thin Plate Splines

Now imagine we have observed data of the form (xi, yi, zi) where we wish topredict zi given xi and yi for all i. We can naturally extend the smoothingspline model in two dimensions,

arg minf(x,y)

n∑i=1

(zi − f(xi, yi))2 + λ

∫ ∫ (∂2f∂x2

+ 2∂2f

∂x ∂y+

∂2f∂y2

)dx dy

The solution to this equation has a natural representation using a weightedsum of radial basis functions with knots at the observed data locations

f(x, y) =n∑i=1

wi d(xi, yi)2 log d(xi, yi).

18

Page 20: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Fitting a TPS

library(fields)coords = select(csn, longitude, latitude) %>% as.matrix()tps = Tps(x = coords, Y=csn$pm25)

pm25_pred = rpm25_pred[cells] = predict(tps, pred_coords)

plot(pm25_pred)points(coords, pch=16, cex=0.5)

−120 −110 −100 −90 −80 −70

2530

3540

4550

5101520253035

19

Page 21: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Gaussin Process Models / Kriging

20

Page 22: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Variogram

library(geoR)coords = csn %>% select(latitude, longitude) %>% as.matrix()d = dist(coords) %>% as.matrix()

variog(coords = coords, data = csn$pm25, messages = FALSE,uvec = seq(0, max(d)/2, length.out=50)) %>% plot()

0 5 10 15 20 25

020

6010

0

distance

sem

ivar

ianc

e

21

Page 23: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

variog(coords = coords, data = csn$pm25, messages = FALSE,uvec = seq(0, max(d)/4, length.out=50)) %>% plot()

0 2 4 6 8 10 12 14

020

4060

80

distance

sem

ivar

ianc

e

22

Page 24: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Isotropy / Anisotropy

v4 = variog4(coords = coords, data = csn$pm25, messages = FALSE,uvec = seq(0, max(d)/4, length.out = 50))

plot(v4)

0 2 4 6 8 10 12 14

020

4060

8010

012

0

distance

sem

ivar

ianc

e

0°45°90°135°

23

Page 25: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

GP Spatial Model

If we assume that our data is stationary and isotropic then we can use a GaussianProcess model to fit the data. We will assume an exponential covariance structure.

y ∼ N (µ1, Σ)

{Σ}ij = σ2 exp(−r ∥si − sj∥) + σ2n 1i=j

we can also view this as a spatial random effects model where

y(s) = µ(s) + w(s) + ϵ(s)

w(s) ∼ N (0,Σ′)

ϵ(si) ∼ N (0, σ2n)

{Σ′}ij = σ2 exp(−r ∥si − sj∥)

24

Page 26: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

GP Spatial Model

If we assume that our data is stationary and isotropic then we can use a GaussianProcess model to fit the data. We will assume an exponential covariance structure.

y ∼ N (µ1, Σ)

{Σ}ij = σ2 exp(−r ∥si − sj∥) + σ2n 1i=j

we can also view this as a spatial random effects model where

y(s) = µ(s) + w(s) + ϵ(s)

w(s) ∼ N (0,Σ′)

ϵ(si) ∼ N (0, σ2n)

{Σ′}ij = σ2 exp(−r ∥si − sj∥)

24

Page 27: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Fitting with spBayes

library(spBayes)

n = nrow(csn)n_samp = 20000coords = select(csn, longitude, latitude) %>% as.matrix()max_range = max(dist(coords)) / 4

starting = list(phi = 3/3, sigma.sq = 33, tau.sq = 17)tuning = list(”phi”=0.1, ”sigma.sq”=0.1, ”tau.sq”=0.1)priors = list(

beta.Norm = list(0, 1000),phi.Unif = c(3/max_range, 6),sigma.sq.IG = c(2, 2),tau.sq.IG = c(2, 2)

)

25

Page 28: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

m = spLM(pm25 ~ 1, data = csn, coords = coords, starting = starting, priors = priors,cov.model = ”exponential”, n.samples = n_samp, tuning = tuning,n.report = n_samp/2)

## ----------------------------------------## General model description## ----------------------------------------## Model fit with 191 observations.#### Number of covariates 1 (including intercept if specified).#### Using the exponential spatial correlation model.#### Number of MCMC samples 20000.#### Priors and hyperpriors:## beta normal:## mu: 0.000## cov:## 1000.000#### sigma.sq IG hyperpriors shape=2.00000 and scale=2.00000## tau.sq IG hyperpriors shape=2.00000 and scale=2.00000## phi Unif hyperpriors a=0.21888 and b=6.00000## -------------------------------------------------## Sampling## -------------------------------------------------## Sampled: 10000 of 20000, 50.00%## Report interval Metrop. Acceptance rate: 33.25%## Overall Metrop. Acceptance rate: 33.25%## -------------------------------------------------## Sampled: 20000 of 20000, 100.00%## Report interval Metrop. Acceptance rate: 33.53%## Overall Metrop. Acceptance rate: 33.39%## -------------------------------------------------

26

Page 29: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

m = spRecover(m, start=n_samp/2+1, thin = (n_samp/2)/1000)## -------------------------------------------------## Recovering beta and w## -------------------------------------------------## Sampled: 99 of 1000, 9.90%## Sampled: 199 of 1000, 19.90%## Sampled: 299 of 1000, 29.90%## Sampled: 399 of 1000, 39.90%## Sampled: 499 of 1000, 49.90%## Sampled: 599 of 1000, 59.90%## Sampled: 699 of 1000, 69.90%## Sampled: 799 of 1000, 79.90%## Sampled: 899 of 1000, 89.90%## Sampled: 999 of 1000, 99.90%

27

Page 30: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Parameter values

m$p.theta.recover.samples %>% mcmc() %>% plot()

0 200 400 600 800 1000

3070

Iterations

Trace of sigma.sq

20 40 60 80 100

0.00

0.03

Density of sigma.sq

N = 1000 Bandwidth = 2.766

0 200 400 600 800 1000

1020

Iterations

Trace of tau.sq

5 10 15 20 25

0.00

0.10

Density of tau.sq

N = 1000 Bandwidth = 0.802

0 200 400 600 800 1000

0.2

0.6

Iterations

Trace of phi

0.2 0.3 0.4 0.5 0.6 0.7 0.8

02

4Density of phi

N = 1000 Bandwidth = 0.02228

28

Page 31: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

m$p.beta.recover.samples %>% mcmc() %>% plot()

0 200 400 600 800 1000

1015

20

Iterations

Trace of (Intercept)

5 10 15 20 250.

000.

050.

100.

150.

200.

25

Density of (Intercept)

N = 1000 Bandwidth = 0.4045

29

Page 32: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

Predictions

m_pred = spPredict(m, pred_coords, pred.covars = matrix(1, nrow=nrow(pred_coords)),start=n_samp/2+1, thin=(n_samp/2)/1000)

## ----------------------------------------## General model description## ----------------------------------------## Model fit with 191 observations.#### Prediction at 900 locations.#### Number of covariates 1 (including intercept if specified).#### Using the exponential spatial correlation model.#### -------------------------------------------------## Sampling## -------------------------------------------------## Sampled: 100 of 1000, 9.90%## Sampled: 200 of 1000, 19.90%## Sampled: 300 of 1000, 29.90%## Sampled: 400 of 1000, 39.90%## Sampled: 500 of 1000, 49.90%## Sampled: 600 of 1000, 59.90%## Sampled: 700 of 1000, 69.90%## Sampled: 800 of 1000, 79.90%## Sampled: 900 of 1000, 89.90%## Sampled: 1000 of 1000, 99.90%m_pred_summary = post_summary(t(m_pred$p.y.predictive.samples))

30

Page 33: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

splm_pm25_pred = rsplm_pm25_pred[cells] = m_pred_summary$post_mean

plot(splm_pm25_pred)points(coords, pch=16, cex=0.5)

−120 −110 −100 −90 −80 −70

2530

3540

4550

5101520253035

31

Page 34: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

JAGs Model

## model{## for(i in 1:length(y)){## y[i] ~ dnorm(beta + w[i], tau)## mu_w[i] <- 0## }#### for(i in 1:length(y)){## for(j in 1:length(y)){## Sigma_w[i,j] <- sigma2_w * exp(-phi * d[i,j])## }## }## Sigma_w_inv <- inverse(Sigma_w)## w ~ dmnorm(mu_w, Sigma_w_inv)#### beta ~ dnorm(0, 1/1000)## sigma2_w ~ dgamma(2, 2)## sigma2 ~ dgamma(2, 2)## tau <- 1/sigma2## phi ~ dunif(3/14, 6)## }

32

Page 35: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

10000 11000 12000 13000 14000 15000

1012

14

Iterations

Trace of beta

10 12 14 16

0.0

0.2

0.4

Density of beta

N = 1000 Bandwidth = 0.2477

10000 11000 12000 13000 14000 15000

0.3

0.5

0.7

Iterations

Trace of phi

0.2 0.3 0.4 0.5 0.6 0.7 0.80

24

Density of phi

N = 1000 Bandwidth = 0.02054

33

Page 36: Lecture 20 - Spatial Random Effects Models + Point Reference ...cr173/Sta444_Sp17/slides/Lec20.pdfModelResults 350004000045000500005500060000-0.4 0.2 Iterations Trace of beta[1]-0.5

10000 11000 12000 13000 14000 15000

1015

20

Iterations

Trace of sigma2

5 10 15 20

0.00

0.10

Density of sigma2

N = 1000 Bandwidth = 0.5785

10000 11000 12000 13000 14000 15000

1525

Iterations

Trace of sigma2_w

10 15 20 25 300.

000.

08

Density of sigma2_w

N = 1000 Bandwidth = 0.7447

34