Regression Model: Standard, semi-conjugate priors
Fuel Consumption

Y: 3.4 1.9 2.7 2.0 2.9 2.6 2.2 9.1 3.8 3.4
X: 4.9 3.1 3.6 2.9 4.6 3.6 3.3 6.5 5.9 5.5

Plot Y versus X

Least Squares fit: Y ~ X.ctr (R2 = 61.5%)

             Est.  Std.Error  t value  Pr(>|t|)
(Intercept)  3.56  0.48       7.43     0.000146
X.ctr        1.38  0.41       3.345    0.01234
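The least-squares fit can be checked by hand. A minimal pure-Python sketch using the slide's data (with a centered predictor, the intercept is exactly the mean of Y):

```python
# Least-squares fit of Y on centered X for the fuel-consumption data.
X = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]
Y = [3.4, 3.8, 9.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]

n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
Sxx = sum((x - xbar) ** 2 for x in X)
b = Sxy / Sxx      # slope
a = ybar           # intercept for Y ~ I(X - mean(X)) is exactly ybar
print(round(a, 3), round(b, 3))   # → 3.4 1.305
```

Note that these values agree closely with the Bayesian posterior means under the vague priors reported later (a 3.396, b 1.305), as they should.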
Regression Model

The model takes the following form:

  y_i = x_i'β + ε_i,   ε_i ~ N(0, σ²)

Standard, semi-conjugate priors:

  β ~ N(b0, B0⁻¹)
  σ⁻² ~ Gamma(c0/2, d0/2)
WinBUGS code:

model {
# Likelihood
for (i in 1:N) {
Y[i] ~ dnorm(mu[i], tau)
mu[i] <- a + b*X[i]
}
# Prior
a ~ dnorm(0.0, 1.0E-6)
b ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sd <- 1/sqrt(tau)
}
Data
list(N=10)
Inits
list(tau=1.0)
# Data in separate file
Y[] X[]
3.4 5.5
3.8 5.9
9.1 6.5
2.2 3.3
2.6 3.6
2.9 4.6
2.0 2.9
2.7 3.6
1.9 3.1
3.4 4.9
END
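A Gibbs sampler for this semi-conjugate model can be sketched directly in Python, mirroring the WinBUGS priors above (a, b ~ N(0, precision 1e-6), tau ~ Gamma(0.001, 0.001)); centering X makes the conditionals for a and b simple. This is a minimal stdlib sketch, not the author's code:

```python
import random, math

X = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]
Y = [3.4, 3.8, 9.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
n = len(X)
xbar = sum(X) / n
xc = [x - xbar for x in X]          # centering decorrelates a and b
Sxx = sum(v * v for v in xc)

random.seed(1)
a, b, tau = 0.0, 0.0, 1.0
prior_prec = 1.0e-6
draws_a, draws_b = [], []
for it in range(3000):
    # a | b, tau  (normal full conditional)
    prec = n * tau + prior_prec
    mean = tau * sum(y - b * v for y, v in zip(Y, xc)) / prec
    a = random.gauss(mean, 1.0 / math.sqrt(prec))
    # b | a, tau
    prec = Sxx * tau + prior_prec
    mean = tau * sum(v * (y - a) for y, v in zip(Y, xc)) / prec
    b = random.gauss(mean, 1.0 / math.sqrt(prec))
    # tau | a, b ~ Gamma(0.001 + n/2, rate 0.001 + SSE/2)
    sse = sum((y - a - b * v) ** 2 for y, v in zip(Y, xc))
    tau = random.gammavariate(0.001 + n / 2.0, 1.0 / (0.001 + sse / 2.0))
    if it >= 500:                    # discard burn-in
        draws_a.append(a); draws_b.append(b)

print(sum(draws_a) / len(draws_a), sum(draws_b) / len(draws_b))
```

The posterior means should land near the WinBUGS results on the next slide (a ≈ 3.4, b ≈ 1.3).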
Input Data in Rectangular Format

• Data columns in rectangular format need to be headed by the array name.
• The arrays must be of equal size.
  – Array names must have explicit brackets:

age[] sex[]
26 0
52 1
.....
34 0
END

Place the 'END' statement after the last row, followed by at least one blank line.
Multidimensional Arrays

• Multi-dimensional arrays can be specified by explicit indexing.
  – The first index position for any array must always be empty.
Y[,1] Y[,2] Y[,3] Y[,4] Y[,5]
151 199 246 283 320
145 199 249 293 354
147 214 263 312 328
.......
153 200 244 286 324
END
WinBUGS code (centered predictor):

model {
# Likelihood
for (i in 1:N) {
Y[i] ~ dnorm(mu[i], tau)
mu[i] <- a + b*(X[i] - mean(X[]))
}
# Prior
a ~ dnorm(0.0, 1.0E-6)
b ~ dnorm(0.0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sd <- 1/sqrt(tau)
}
• Regular estimate
node mean sd 2.5% median 97.5% start sample
a 3.396 0.4952 2.403 3.397 4.346 5001 5000
b 1.305 0.4121 0.4571 1.307 2.118 5001 5000
sd 1.514 0.4442 0.9118 1.426 2.597 5001 5000
• Robust estimate
node mean sd 2.5% median 97.5% start sample
a 2.929 0.1428 2.673 2.92 3.236 5001 5000
b 0.6268 0.1412 0.3879 0.6139 0.9326 5001 5000
sd 0.352 0.192 0.1452 0.3009 0.8453 5001 5000
Fuel Consumption Example
Regression & Prediction
• Relationship between rainfall in
December and rainfall in November
in York, U.K. (1971 – 1980)
– y_i = rainfall in Dec. of year i (i = 1,…,10)
– x_i = rainfall in Nov. of year i (i = 1,…,10)

  y_i | μ_i, σ² ~ N(μ_i, σ²),   E[Y_i | a, b, x_i] = μ_i = a + b·x_i
  a ~ N(0, 0.0001)   (precision)
  b ~ N(0, 0.001)    (precision)
  1/σ² ~ Gamma(0.001, 0.001)
Ten Years of Data

[Scatterplot: Dec. rainfall (roughly 20–50 mm) versus Nov. rainfall (roughly 20–140 mm)]
Prediction

• Predict 1981 Dec. rainfall
  – Nov. rainfall was 34.1 mm
• Actual rainfall was 12.3 mm
• 34.1 mm rain in Nov. 1984 & 106.2 mm rain in Dec. 1984!

  E[y_new | a, b] = a + b·34.1
model {
for (i in 1:n) {
y[i] ~ dnorm(mu[i],tau.y)
mu[i] <- a + b*x[i]
}
# priors
a ~ dnorm(0, 0.0001)
b ~ dnorm(0, 0.001)
tau.y ~ dgamma(0.001, 0.001)
sig2.y <- 1/tau.y
# prediction
mu.new <- a+b*x.new
y.new ~ dnorm(mu.new, tau.y)
}
Rainfall (mm.) Data
list(x=c(23.9, 43.3, 36.3, 40.6, 57,
52.5, 46.1,142, 112.6, 23.7),
y=c(41, 52, 18.7, 55, 40,
29.2, 51, 17.6, 46.6, 57),
n=10,
x.new = 34.1)
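The prediction step mu.new <- a + b*x.new can be checked with a plain least-squares fit to the same data, which the vague priors roughly reproduce. A small pure-Python sketch:

```python
# Plug-in check of the prediction: fit least squares to the rainfall
# data and evaluate a + b*34.1 (x.new from the data list).
x = [23.9, 43.3, 36.3, 40.6, 57, 52.5, 46.1, 142, 112.6, 23.7]
y = [41, 52, 18.7, 55, 40, 29.2, 51, 17.6, 46.6, 57]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar              # uncentered intercept
mu_new = a + b * 34.1
print(round(b, 3), round(mu_new, 1))   # → -0.161 44.6
```

The slope matches the posterior mean for b reported later (-0.1618); the plug-in prediction is in the same range as the posterior mean of mu.new.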
Improvement With Centering

• Substitute: mu[i] <- a + b*(x[i] - mean(x[]))
• Don't forget: mu.new <- a + b*(x.new - mean(x[]))
More Robust Regression

• Use a t dist'n with small d.f. for the residuals
  – Scale mixture of normals
• Can also treat the d.f. as a parameter in the model

  y_i | μ_i, σ² ~ t_ν(μ_i, σ²)   is equivalent to
  y_i | μ_i, V_i ~ N(μ_i, V_i),   V_i ~ Inv-χ²(ν, σ²)
Scale Mixture of Normals

• New model:

  y_i | μ_i, σ² ~ t_ν(μ_i, σ²)   is equivalent to
  y_i | μ_i, V, λ_i ~ N(μ_i, 1/(V·λ_i))   (precision V·λ_i)
  λ_i ~ Gamma(ν/2, ν/2)
  V ~ Gamma(a, b)
Scaled Mixture of Normals (WinBUGS)

model {
for (i in 1:n) {
y[i] ~ dnorm(mu[i],tau.y[i])
mu[i] <- a + b*(x[i]-mean(x[]))
tau.y[i] <- V*lambda[i]
lambda[i] ~ dgamma(nu.2, nu.2)
}
a ~ dnorm(0, 0.0001)
b ~ dnorm(0, 0.001)
V ~ dgamma(0.001, 0.001)
nu.2 <- nu/2   # nu fixed (supplied as data) or given its own prior
}
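The scale-mixture identity behind this model is easy to verify by simulation. A minimal sketch (stdlib only): drawing lambda ~ Gamma(nu/2, nu/2) and then y | lambda ~ N(mu, sigma²/lambda) yields t_nu draws, whose variance should be sigma²·nu/(nu−2):

```python
import random

# For nu = 5 the t variance is 5/3 ≈ 1.67.
random.seed(2)
nu, mu, sigma = 5.0, 0.0, 1.0
draws = []
for _ in range(200000):
    lam = random.gammavariate(nu / 2.0, 2.0 / nu)   # rate nu/2 -> scale 2/nu
    draws.append(random.gauss(mu, sigma / lam ** 0.5))

m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / (len(draws) - 1)
print(round(v, 2))   # close to 5/3
```

This is exactly what the WinBUGS code does on the precision scale: tau.y[i] <- V*lambda[i] with lambda[i] ~ dgamma(nu.2, nu.2).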
Data & Fit

Results
• Regular model
• Robust model

Regular Model

node    mean     sd     2.5%     median   97.5%   start  sample
y.new   42.43    17.05  9.454    42.46    76.94   5001   5000
mu.new  42.57    5.264  32.01    42.63    52.86   5001   5000
b       -0.1618  0.137  -0.4368  -0.1596  0.1118  5001   5000
a       40.68    5.031  30.53    40.71    50.78   5001   5000

Robust Model

node    mean    sd     2.5%     median   97.5%   start  sample
y.new   44.31   20.62  3.618    44.45    84.19   5001   5000
mu.new  44.3    5.186  33.81    44.45    54.18   5001   5000
b       -0.18   0.146  -0.4451  -0.1871  0.1243  5001   5000
a       42.19   4.936  32.02    42.43    51.51   5001   5000
Predicted December Rainfall

Weight Gain of Rats

• 30 young rats (controls)
• Weights measured weekly for 5 weeks
• Linear

[Figure: weight (100–400 g) versus day (8, 15, 22, 29, 36) for the 30 rats]
Hierarchical Model

[Figure: Y versus Time, showing individual mean profiles #1 and #2 around the population mean]
Growth Curve Model
• Assume linear weight gain
  y_ij | μ_ij, σ²_y ~ N(μ_ij, σ²_y),   μ_ij = α_i + β_i·(t_j − t̄)
  α_i ~ N(α_c, σ²_α)   &   β_i ~ N(β_0, σ²_β)
  α_c ~ N(0.0, 10⁶)    &   β_0 ~ N(0.0, 10⁶)
  1/σ²_y ~ Gamma(10⁻³, 10⁻³)
  1/σ²_α ~ Gamma(10⁻³, 10⁻³)   &   1/σ²_β ~ Gamma(10⁻³, 10⁻³)
model {
for (i in 1:N ) {
for (j in 1:T ) {
Y[i,j] ~ dnorm(mu[i,j], tau.y)
mu[i,j] <- alpha[i]+beta[i]*(x[j]-xbar)
}
alpha[i] ~ dnorm(alpha.c, tau.alpha)
beta[i] ~ dnorm(beta.0, tau.beta)
}
tau.y ~ dgamma(0.001,0.001)
alpha.c ~ dnorm(0.0,1.0E-6)
alpha.0 <- alpha.c - (xbar * beta.0)
beta.0 ~ dnorm(0.0,1.0E-6)
# Old: inv-Gamma priors on the precisions
# tau.alpha ~ dgamma(0.001,0.001)
# tau.beta ~ dgamma(0.001,0.001)
# New: uniform priors on the SDs
sigma.alpha ~ dunif(0,100)
sigma.beta ~ dunif(0,100)
tau.alpha <- 1/(sigma.alpha*sigma.alpha)
tau.beta <- 1/(sigma.beta*sigma.beta)
}
Burn-in 20,000; 2 chains; 10,000 saves/chain

[Trace plots, iterations 20001–30000: alpha.0 (chains 1:2, roughly 90–130) and beta.0 (chains 1:2, roughly 5.5–6.75)]
CODA

Bring MCMC output into S-Plus (R, etc.)
• What is CODA?
– CODA stands for Convergence Diagnostic
and Output Analysis
– Menu-driven set of S-Plus/R functions
• an output processor for BUGS
– May be used with MCMC output from a user's
own programs
• Format MCMC output appropriately (see
the CODA manual for details).
What does CODA do?
• CODA computes convergence
diagnostics and statistical and
graphical summaries for the
samples produced by the Gibbs
sampler.
– Can use CODA for any MCMC output
BOA
Bayesian Output Analysis Program
• Another program for working with
MCMC output
• boa.init()
– starts a BOA session.
– Sets up internal data structures &
initializes with default values.
www.public-health.uiowa.edu/boa/
BOA
Import data from BUGS output files

• boa.importBUGS(prefix, path=NULL)
  – prefix = prefix for BUGS output files: "prefix.ind" & "prefix.out"
  – path = directory path for the files
• Example
  BugsOut <- boa.importBUGS("FuelConsumption",
                            path="E:/2007/WinBUGS examples/")
Save indexing file as ASCII
with .ind extension
Save MCMC output as ASCII
with .out extension
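If BOA is not available, the pair of files can be read directly. A minimal Python sketch, assuming the usual classic-CODA layout (the .ind file lists "name first_row last_row"; the .out file holds "iteration value" pairs):

```python
# Minimal reader for classic CODA index/output files (layout assumed).
def read_coda(ind_text, out_text):
    rows = [line.split() for line in out_text.strip().splitlines()]
    values = [float(v) for _, v in rows]
    chains = {}
    for line in ind_text.strip().splitlines():
        name, first, last = line.split()
        chains[name] = values[int(first) - 1:int(last)]   # 1-based, inclusive
    return chains

# Toy example in the assumed format:
ind = "alpha 1 3\nbeta 4 6\n"
out = "1 0.1\n2 0.2\n3 0.3\n1 1.0\n2 1.1\n3 1.2\n"
print(read_coda(ind, out))
```

In practice ind and out would be read from the saved .ind and .out files.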
Use BOA to Read Into S-Plus or R

• rat1 <- boa.importBUGS("rat1",
          path="d:/BayesianDataAnalysis/")
> dimnames(rat1)[[2]]
[1] "alpha[1]" "alpha[2]" "alpha[3]" "alpha[4]" "alpha[5]"
"alpha[6]" "alpha[7]" "alpha[8]" "alpha[9]"
[10] "alpha[10]" "alpha[11]" "alpha[12]" "alpha[13]" "alpha[14]"
"alpha[15]" "alpha[16]" "alpha[17]" "alpha[18]"
...
Plot Data & Fits

alpha.0 <- rat1[,31]; beta.0 <- rat1[,63]
expect.y <- mean(alpha.0) + mean(beta.0)*Day
indx <- 1   # first of this rat's 5 monitored fitted values (assumed layout)
for (i in 1:nrow(RatData)) {
  plot(Day, RatData[i,], ylim=c(100,400),
       xlab="Day", ylab="Weight")
  y <- apply(rat1[,indx:(indx+4)], 2, mean)
  lines(Day, y, lwd=4)
  lines(Day, expect.y, lwd=4, col=3)
  indx <- indx+5
}
Fits To Data

[30 panels: weight (100–400 g) versus day (10–35) for each rat, with the rat's fitted line and the population-mean line]
Examples

[Figure: weight (100–400 g) versus day (10–35) for selected rats, with the population mean]
Example: Logistic Regression with Random Effects

– The prop'n of seeds that germinated
– 21 plates
– 2-by-2 factorial: seed & type of root extract

        seed O. aegyptiaco 75            seed O. aegyptiaco 73
        Bean          Cucumber           Bean          Cucumber
    r   n   r/n   r   n   r/n        r   n   r/n   r   n   r/n
    10  39  0.26  5   6   0.83       8   16  0.50  3   12  0.25
    23  62  0.37  53  74  0.72       10  30  0.33  22  41  0.54
    23  81  0.28  55  72  0.76       8   28  0.29  15  30  0.50
    26  51  0.51  32  51  0.63       23  45  0.51  32  51  0.63
    17  39  0.44  46  79  0.58       0   4   0.00  3   7   0.43
                  10  13  0.77
Seed Germination Model (extra-binomial variation)

x1i = seed type for ith exp't
x2i = type of root extract

  r_i | p_i ~ Bin(p_i, n_i)
  logit(p_i) = α0 + α1·x1i + α2·x2i + α12·x1i·x2i + b_i
  b_i ~ N(0, σ²)
Seed Germination Model(extra-binomial variation)
Graphical Model for
Seeds Experiments
for(i IN 1 : N)
beta[i]
p[i]
sigma
tau
alpha12alpha2alpha1alpha0
n[i]
x1[i]
x2[i]
mu[i]
r[i]
WinBUGS Code

model {
  for ( i in 1 : N ) {
    r[i] ~ dbin(p[i],n[i])
    mu[i] <- alpha0 + alpha1 * x1[i] + alpha2 * x2[i] +
             alpha12 * x1[i] * x2[i]
    b[i] ~ dnorm(0.0, tau)
    beta[i] <- mu[i] + b[i]
    logit(p[i]) <- beta[i]
  }
  # Priors
  alpha0 ~ dnorm(0.0,1.0E-6)
  alpha1 ~ dnorm(0.0,1.0E-6)
  alpha2 ~ dnorm(0.0,1.0E-6)
  alpha12 ~ dnorm(0.0,1.0E-6)
  tau ~ dgamma(0.001,0.001)
  sigma <- 1 / sqrt(tau)
}

Recall: b_i ~ N(0, σ²), with σ = 1/√tau
What Are These Priors?

• WinBUGS calls them "proper but minimally informative"
  – Gaussian:
    • mean = 0
    • variance = 1,000,000
  – Gamma:
    • mean = 1
    • variance = 1,000
Noninformative Prior? alpha0 ~ N(0, 1e6)

[Histogram of 10,000 draws of alpha0, roughly -4000 to 4000; density near 0.0004 at the center]
Noninformative Prior? alpha0 ~ N(0, 1e6)

[Same 10,000 draws, zoomed to roughly -100 to 90: the density is essentially flat over this range]
Noninformative Prior? tau ~ gamma(0.001, 0.001)

[Histogram of 10,000 draws; the implied log(σ²) spreads over an enormous range]
Noninformative for Probability of Germination?

[Four panels of prior densities for the germination probability, each over (0,1): X1=0 & X2=0; X1=1 & X2=0; X1=0 & X2=1; X1=1 & X2=1. In every case the prior mass piles up at 0 and 1.]
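The piling-up at 0 and 1 is easy to reproduce: pushed through the inverse logit, a N(0, variance 1e6) prior puts almost no mass on moderate probabilities. A quick sketch:

```python
import random, math

# Fraction of prior draws of alpha0 ~ N(0, 1e6) whose implied
# probability lands away from 0 and 1.
random.seed(3)
inside = 0
N = 20000
for _ in range(N):
    alpha0 = random.gauss(0.0, 1000.0)            # sd = sqrt(1e6)
    z = min(max(alpha0, -700), 700)               # avoid exp overflow
    p = 1.0 / (1.0 + math.exp(-z))
    if 0.01 < p < 0.99:
        inside += 1
print(inside / N)   # only a tiny fraction
```

Analytically, P(0.01 < p < 0.99) = P(|alpha0| < logit(0.99) ≈ 4.6), which is under 0.4% for sd = 1000.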
Inits
list(alpha0 = 0, alpha1 = 0, alpha2 = 0, alpha12 = 0, tau = 1)
list(alpha0 = -5, alpha1 = 2, alpha2 = -2, alpha12 = 1, tau = 0.5)
list(alpha0 = 5, alpha1 = -2, alpha2 = 2, alpha12 = -1, tau = 1.5)
[Trace plots, iterations 1–10000: alpha0 (chains 1:3, roughly -1.5 to 0.5) and alpha12 (chains 1:3, roughly -4 to 4)]
Results: 3 Chains (1 - 10,000)

node     mean     sd      MC error  2.50%    median   97.50%
alpha0   -0.5403  0.1882  0.006046  -0.8985  -0.5433  -0.1539
alpha1   0.0688   0.3093  0.009449  -0.5701  0.08043  0.6427
alpha12  -0.8068  0.4229  0.01322   -1.648   -0.7997  0.001344
alpha2   1.341    0.2654  0.008576  0.8121   1.34     1.87
sigma    0.2786   0.1354  0.005259  0.04954  0.2704   0.5779
[Gelman-Rubin diagnostic plots for alpha0 and alpha12, chains 1:3, iterations 1–5000+]

3 Chains: iters 5,000 - 10,000
Predictive Risks

[Four panels of posterior predictive densities for the germination risk, each over (0,1): X1=0; X2=0 / X1=1; X2=0 / X1=0; X2=1 / X1=1; X2=1]
Interaction Term

[Posterior density of alpha12, roughly -3 to 1]

Pr(alpha12 > 0) = 0.027
Logistic Regression Example
O-Ring Failures

23 binary obs: O-ring failures & temp (ºF)

Fail  ºF
…     …
0     66
1     63
1     58
1     57
1     53
Results (centered temperatures)

          Mean    SD     Naive SE  Time-series SE
Int'cept  -1.249  0.634  0.0063    0.0116
temp      -0.289  0.128  0.0013    0.0023

          2.5%    25%     50%     75%     97.5%
Int'cept  -2.637  -1.646  -1.204  -0.803  -0.141
temp      -0.587  -0.362  -0.273  -0.197  -0.089
Fitted Failure Probabilities

[Figure: fitted probability of O-ring failure versus temperature]

Set a Prior for Parameters

• Engineers think
  – π_t = Pr(fail at t ºF)
  – π55 ~ Beta(1, 0.577)  &  π75 ~ Beta(0.577, 1)
π55 ~ Beta(1, 0.577)  &  π75 ~ Beta(0.577, 1)
logit(π55) = β0 + β1·55  &  logit(π75) = β0 + β1·75
Convert to Normal Prior

• Generate random Betas
• Fit linear model to logits
• Use these estimates for the mean & variance matrix

  β̂1(i) = [logit(p75(i)) − logit(p55(i))] / (75 − 55)
  β̂0(i) = logit(p55(i)) − β̂1(i)·(55 − t̄)
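The recipe above can be sketched in a few lines of Python: simulate the two Beta draws, push them through the two logit equations, and summarize the induced (β0, β1) sample. The value of t̄ (mean launch temperature) is an assumption here:

```python
import random, math

def logit(p):
    p = min(max(p, 1e-12), 1 - 1e-12)   # guard against 0/1 underflow
    return math.log(p / (1.0 - p))

random.seed(4)
tbar = 69.6          # assumed mean of the 23 launch temperatures
b0s, b1s = [], []
for _ in range(20000):
    p55 = random.betavariate(1.0, 0.577)
    p75 = random.betavariate(0.577, 1.0)
    b1 = (logit(p75) - logit(p55)) / (75.0 - 55.0)
    b0 = logit(p55) - b1 * (55.0 - tbar)
    b1s.append(b1); b0s.append(b0)

mean_b1 = sum(b1s) / len(b1s)
print(round(mean_b1, 3))   # negative: failures more likely when cold
```

The sample means and covariance of (b0s, b1s) give the normal prior used in the refit below.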
Results With New Prior

          Mean    SD
Int'cept  -0.765  0.375
temp      -0.255  0.111

          2.5%    25%     50%     75%     97.5%
Int'cept  -1.508  -1.015  -0.753  -0.515  -0.059
temp      -0.499  -0.321  -0.243  -0.179  -0.067
Summary
Distribution of Fits
What is the LD50?

• LD50 is the temperature at which Pr(failure) = 50%

  logit(0.5) = 0 = β0 + β1·(LD50 − t̄)
  LD50 = −β0/β1 + t̄

  2.5%   25%    50%    75%    97.5%
  58.3   64.9   66.5   67.6   69.4
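As a quick plug-in check of the formula, using the posterior medians from the new-prior fit (β0 = -0.753, β1 = -0.243, centered temperatures) and an assumed t̄ ≈ 69.6:

```python
# Plug-in LD50 from the posterior medians (tbar is an assumed value).
beta0, beta1, tbar = -0.753, -0.243, 69.6
ld50 = -beta0 / beta1 + tbar
print(round(ld50, 1))   # → 66.5
```

This matches the posterior median of 66.5 in the table; for a full posterior, the same formula would be applied draw by draw.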
Latent Variables (Tolerance)

• General model
  – Latent tolerance (Z):  Pr(Y_i = 1 | β) = Pr(Z_i > 0 | β) = 1 − F(−X_i β)
  – Logistic model:  Z_i ~ Logistic(X_i β, 1)
  – Probit model:  Z_i ~ N(X_i β, 1)
• How do we fit it?
  – Albert & Chib
    • Data augmentation
Probit Model Example

• 30 math students at a university
• Relationship between grade & math SAT
  – Y = 1 if C or higher; 0 otherwise
  – Consider underlying latent ability

  π_i = Pr(y_i = 1) = Pr(Z_i > 0 | μ_i) = 1 − F(−μ_i)
  μ_i = β0 + β1·(x_i − x̄)
Model

• Choices for distribution of latent "math ability"
  – Logistic
  – Normal
  – Extreme-value
    • Complementary log-log link:

      F(x) = 1 − exp[−exp(x)],  −∞ < x < ∞
      log[−log(1 − p_i)] = X_i β

Different scales make it difficult to compare fits based on coefficients. Look at predicted probabilities.
Results

• Higher math SAT → higher probability of grade ≥ C

          Mean   SD
Int'cept  0.739  0.309
SAT-M     0.037  0.013

          2.5%   25%    50%    75%    97.5%
Int'cept  0.143  0.529  0.731  0.945  1.367
SAT-M     0.016  0.029  0.036  0.046  0.064
Compare 2 Chains

[Trace plots for Intercept and SAT-M: beta.start = c(-0.5, -0.05) versus beta.start = c(1, 0.05)]

Residual Plot

  r_i = z_i − x_i β

Can also examine  r_i* = y_i − p̂_i
Diagnostics

• Model
• Residuals
• Convergence

Some Checking Functions

1. Residual:  y_i − E(Y_i)
2. Standardized residual:  [y_i − E(Y_i)] / √Var(Y_i)
3. Chance of a more extreme obs:  min{ Pr(Y_i < y_i), Pr(Y_i ≥ y_i) }
4. Chance of a more "surprising" obs:  p[ Y_i : p(Y_i) ≤ p(y_i) ]
5. Predictive ordinate (of the observation):  p(y_i)
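Given a posterior-predictive sample for one observation, the first three checking functions are simple Monte Carlo averages. A toy sketch (checks 4 and 5 additionally need a density estimate of p(Y_i)):

```python
# Checking functions 1-3 from a posterior-predictive sample (toy numbers).
y_obs = 2.0
pred = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]   # draws of Y_i

n = len(pred)
mean = sum(pred) / n
var = sum((p - mean) ** 2 for p in pred) / (n - 1)

resid = y_obs - mean                                # 1. residual
std_resid = resid / var ** 0.5                      # 2. standardized residual
lower = sum(p < y_obs for p in pred) / n
upper = sum(p >= y_obs for p in pred) / n
extreme = min(lower, upper)                         # 3. more extreme obs
print(resid, round(std_resid, 2), extreme)
```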
Predictive Distributions

• Split data into training & validation sets.
• If Y_val is cond. indep. of y_train given θ, then

  p(Y_val | y_train) = ∫ p(Y_val | y_train, θ) p(θ | y_train) dθ
                     = ∫ p(Y_val | θ) p(θ | y_train) dθ

  and only the new random nodes Y_val are needed in WinBUGS.
• Compare observed y_val to predicted Y_val via stats.
Case-Deleted or Cross-Validation Predictions

• Bayes rule (in stages):

  p(θ | y) ∝ p(y_i | θ) p(θ | y_[−i])

• Reweight the sample
  – θ* sampled from p(θ | y)
  – form weights

  w_i ∝ p(θ* | y_[−i]) / p(θ* | y) ∝ 1 / p(y_i | θ*)
Cross-Validation Predictive Density Estimate

• With sample  {θ_j, j = 1, …, B} ~ p(θ | y)
• and

  p(y_[−i]) / p(y) = ∫ [ 1 / p(y_i | y_[−i], θ) ] p(θ | y) dθ

• estimate

  f̂(y_i | y_[−i]) = [ (1/B) Σ_{j=1}^B 1 / f(y_i | θ_j) ]⁻¹

Harmonic mean: may be unstable!
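The harmonic-mean estimate is a one-liner given posterior draws. A toy sketch for a normal likelihood with known sigma (in practice the draws come from the MCMC output):

```python
import math

def normal_pdf(y, mu, sigma):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

y_i = 1.0
theta_draws = [0.8, 0.9, 1.0, 1.1, 1.2]   # posterior draws of mu, sigma fixed at 1
B = len(theta_draws)
# CPO_i: harmonic mean of the likelihood over the posterior draws.
cpo = 1.0 / (sum(1.0 / normal_pdf(y_i, mu, 1.0) for mu in theta_draws) / B)
print(round(cpo, 4))
```

The instability warning applies because 1/f(y_i | θ_j) can blow up for draws under which y_i is unlikely.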
Example: Onions

• Weight vs. time
• Logistic growth:

  E(y | x, b) = b0 / (1 + b1·b2^x)

[Scatterplot: y (0–600) versus x (2–14)]
Data

1       2       3       4       5
16.08   33.83   65.80   97.20   191.55
6       7       8       9       10
326.20  386.87  520.53  590.03  651.92
11      12      13      14      15
724.93  699.56  689.86  637.56  717.41

WinBUGS

model {
for (i in 1:N) {
model[i] <- beta[1]/(1+beta[2]*pow(beta[3], X[i]))
Y[i] ~ dnorm(model[i], tau)
resid[i] <- Y[i]-model[i]
density[i] <- exp(-resid[i]*resid[i]*tau/2)/(sqrt(2*pi/tau))
w[i] <- 1/density[i]   # case-deletion weight: w ∝ 1/f(y[i]|theta)
}
model14 <- beta[1]/(1+beta[2]*pow(beta[3], 14))
pred.Y14 ~ dnorm(model14, tau)
beta[1] <- b0
beta[2] <- exp(b1)
beta[3] <- exp(b2)/(1+exp(b2))
b0 ~ dnorm(0, 1.0E-8)
b1 ~ dnorm(0, 1.0E-6)
b2 ~ dnorm(0, 1.0E-6)
tau ~ dgamma(0.001, 0.001)
sigma <- 1/sqrt(tau)
}
Example: Onion Data
14th Observation = 637.6

  p̂ = 0.034
  p(Y14 ≤ y14,obs | y) = 0.0024
  p(Y14 ≤ y14,obs | y[−14]) = 0.0031
Example: Onion Data
14th Observation = 637.6

• Predictive value w/o reweighting: 697.0
• Reweighted predictive value: 718.8
• Actual predictive value (w/o observation 14): 717.1

Predictive Density (reweighted)

[Densities of y14 over 500–900: weighted (cv) sample, unweighted sample, and sample w/o case 14]
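The reweighting trick itself is generic: draws from p(θ | y) become an approximate case-deleted sample for observation i under weights w_j ∝ 1/p(y_i | θ_j). A toy normal-mean sketch (the "posterior" draws here are simulated stand-ins, not the onion fit):

```python
import math, random

def normal_pdf(y, mu, sigma=1.0):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

random.seed(5)
y_i = 3.0                                              # the held-out observation
post = [random.gauss(0.5, 0.3) for _ in range(5000)]   # stand-in posterior draws of mu

w = [1.0 / normal_pdf(y_i, mu) for mu in post]         # case-deletion weights
wsum = sum(w)
plain = sum(post) / len(post)
reweighted = sum(wj * mu for wj, mu in zip(w, post)) / wsum
print(round(plain, 2), round(reweighted, 2))
```

Deleting an observation pulls the estimate away from it, so the reweighted mean sits below the plain posterior mean here, mirroring how the reweighted onion prediction moves toward the true case-deleted value.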
Residuals

• Cross-validation predictive density
  – compare the expected value given the remaining obsns to the observed value
• Need to compute from p(y_i | y_(i))

  r_i^d = y_i − E[y_i | y_j, j ≠ i]
Convergence Diagnostics
• Start multiple chains with
different initial values
– Dispersed starting points
• Monitor quantities of interest
periodically
– Compare within-chain variation to
between-chain variation
• Check acceptance prop’n (M-H)
Gelman-Rubin

• For scalar quantities (e.g., θ)
  – J chains of length n
  – Compute B (between) & W (within)

  B = [n/(J−1)] Σ_{j=1}^J (θ̄.j − θ̄..)²,
      θ̄.j = (1/n) Σ_{i=1}^n θ_ij,   θ̄.. = (1/J) Σ_{j=1}^J θ̄.j

  W = (1/J) Σ_{j=1}^J s_j²,
      s_j² = [1/(n−1)] Σ_{i=1}^n (θ_ij − θ̄.j)²
Gelman-Rubin (2)

• The pooled estimate

  var̂⁺(θ | y) = [(n−1)/n] W + (1/n) B

  is unbiased but overestimates the marginal posterior variance var(θ | y) until convergence
  – cluster sampling
  – if started from overdispersed values
• W underestimates var(θ | y) until convergence
Gelman-Rubin (3)

• As chains converge, W → var(θ | y)
  – "Estimated potential scale reduction" if sampling were to continue to ∞
    • Ratio of upper to lower estimates

  R̂ = var̂⁺(θ | y) / W = { [(n−1)/n] W + (1/n) B } / W → 1

• Rule of thumb: OK when R̂ < 1.2
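The B, W, and R̂ formulas above translate directly into code. A minimal sketch using two toy chains drawn from the same distribution, so R̂ should be near 1 (this uses the slide's R̂ = var̂⁺/W; some versions take a square root):

```python
import random

def gelman_rubin(chains):
    J, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]            # theta-bar_.j
    grand = sum(means) / J                          # theta-bar_..
    B = n / (J - 1) * sum((m - grand) ** 2 for m in means)
    W = sum(sum((x - m) ** 2 for x in c) / (n - 1)
            for c, m in zip(chains, means)) / J
    var_plus = (n - 1) / n * W + B / n
    return var_plus / W                             # R-hat as on the slide

random.seed(6)
chains = [[random.gauss(0, 1) for _ in range(2000)] for _ in range(2)]
print(round(gelman_rubin(chains), 3))
```

Starting one chain far from the others would inflate B, and hence R̂, until the chains mix.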
In WinBUGS

• GR diag: Gelman-Rubin convergence statistic
• Modified Brooks & Gelman (1998)
  – Width of the central 80% interval of the pooled runs is green
  – Average width of the 80% intervals within the individual runs is blue
  – Their ratio R (= pooled/within) is red
  – For plotting, pooled & within interval widths are normalized to have an overall maximum of one.
Implementation

• Statistics calculated in bins of length 50:
  – Generally expect R to be greater than 1 if starting values are suitably overdispersed.
• Want
  – convergence of R to 1, and
  – convergence of both the pooled & within interval widths to stability.
In WinBUGS
• Can list values underlying the
GR plots in a window by double-
clicking on the plot followed by
ctrl-left-mouse-click on the
window.