1 Introduction to MCMC methods, the Gibbs Sampler, and Data Augmentation.
Simulation Methods
Problem:
Bayes' theorem allows us to write down an unnormalized density, $\pi^*(\theta)$, which is proportional to the posterior $\pi(\theta)$ for virtually any model. We need to construct a simulator from $\pi$, even when $\dim(\theta) > 100$.
Solutions:
1. direct iid simulator (use asymptotics)
2. Importance sampling
3. MCMC
MCMC Methods
"Solution": exploit the special structure of the problem to:
formulate a Markov chain on the parameter space with π as its long-run or "equilibrium" distribution;
simulate from the Markov chain, starting from some point;
use a sub-sequence of draws as the simulator.
MCMC Methods
Start from $\theta_0$ and construct a sequence of random variables $\theta_1, \ldots, \theta_r, \ldots$ with
$$p(\theta_{r+1} \mid \theta_r, \theta_{r-1}, \ldots, \theta_0) = p(\theta_{r+1} \mid \theta_r) \quad \text{(Markovian property)}$$
$$\theta_{r+1} \mid \theta_r \sim F(\theta_r)$$
Under some conditions on $F$, the distribution of $\theta_r \mid \theta_0$ "converges" to $\pi$.
Ergodicity
Denote the sequence of draws as:
$$\theta_1, \ldots, \theta_r, \ldots$$
To estimate
$$E_\pi[g(\theta)] = \int g(\theta)\,\pi(\theta)\,d\theta$$
use
$$\hat g = \frac{1}{R}\sum_{r=1}^{R} g(\theta_r), \qquad \lim_{R \to \infty} \hat g = E_\pi[g(\theta)] \quad \text{(ergodic property)}$$
i) $p_A = \Pr(\theta \in A) = \int_A \pi(\theta)\,d\theta$; $\hat p_A = \frac{1}{R}\sum_{r} I_A(\theta_r)$
ii) $g(\theta) = \theta_i^m$
This means that we can estimate any aspect of the joint distribution using sequences of draws from the Markov chain.
Practical Considerations
Effect of initial conditions:
"burn-in": run for B iterations, discard them, and use only the last R-B draws.
Non-iid simulator:
Is this a problem?
No: the LLN works for dependent sequences.
Yes: simulation error is larger than for an iid sequence.
Method for constructing the chain!
Asymptotics
Any simulation-based method relies on asymptotics for justification.
We have made fun of asymptotics for inference problems. Classical Econometrics – “approximate answer” to the wrong question.
We are not using asymptotics to approximate for a fixed sample size.
The sample size is large and under our control!
Simulating from Bivariate Normal
$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} \sim N\left(0, \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}\right)$$
$$\theta_1 \sim N(0,1) \quad \text{and} \quad \theta_2 \mid \theta_1 \sim N(\rho\theta_1, 1-\rho^2)$$
In R, we would use the Cholesky root to simulate:
$$\theta = Lz; \quad z \sim N(0, I); \quad L = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1-\rho^2} \end{bmatrix}$$
$$\theta_1 = z_1, \qquad \theta_2 = \rho z_1 + \sqrt{1-\rho^2}\, z_2$$
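The Cholesky construction can be sketched in a few lines of R. This is a minimal illustration rather than code from the slides; the value `rho = 0.9`, the number of draws `n`, and all variable names are assumptions chosen for the example.

```r
set.seed(1)
rho <- 0.9                               # assumed correlation, for illustration
Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
L <- t(chol(Sigma))                      # chol() returns the upper factor, so transpose
n <- 10000
z <- matrix(rnorm(2 * n), nrow = 2)      # each column is one z ~ N(0, I)
theta <- t(L %*% z)                      # n x 2 matrix of iid draws of theta
L                                        # rows are (1, 0) and (rho, sqrt(1 - rho^2))
cor(theta[, 1], theta[, 2])              # should be close to rho
```

Because the draws are iid, this simulator is a useful benchmark against which the Gibbs Sampler draws can be compared.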
Gibbs Sampler
A joint distribution can always be factored into a marginal × a conditional. There is also a sense in which the conditional distributions fully summarize the joint.
$$\theta_2 \mid \theta_1 \sim N(\rho\theta_1, 1-\rho^2), \qquad \theta_1 \mid \theta_2 \sim N(\rho\theta_2, 1-\rho^2)$$
A simulator: start at point
$$\theta_0 = \begin{bmatrix} \theta_{0,1} \\ \theta_{0,2} \end{bmatrix}$$
Draw $\theta_1$ in two steps:
$$\theta_{1,2} \mid \theta_{0,1} \sim N(\rho\theta_{0,1}, 1-\rho^2)$$
$$\theta_{1,1} \mid \theta_{1,2} \sim N(\rho\theta_{1,2}, 1-\rho^2)$$
Note: this is a Markov Chain. The current point entirely summarizes the past.
Gibbs Sampler
A simulator: start at point $\theta_0 = (\theta_{0,1}, \theta_{0,2})'$. Given the current point $\theta_r$, draw $\theta_{r+1}$ in two steps:
$$\theta_{r+1,2} \mid \theta_{r,1} \sim N(\rho\theta_{r,1}, 1-\rho^2)$$
$$\theta_{r+1,1} \mid \theta_{r+1,2} \sim N(\rho\theta_{r+1,2}, 1-\rho^2)$$
Repeat!
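The two-step scheme can be sketched as a short R function. The helper name `gibbs_binorm`, the starting point, the number of draws, and the burn-in length are all assumptions made for this illustration; it is not the `rbiNormGibbs` function referred to later in the slides.

```r
# Hypothetical helper: two-step Gibbs sampler for a standard
# bivariate normal with correlation rho.
gibbs_binorm <- function(R, rho, theta0 = c(0, 0)) {
  draws <- matrix(NA_real_, nrow = R, ncol = 2)
  theta <- theta0
  s <- sqrt(1 - rho^2)                                   # conditional std deviation
  for (r in 1:R) {
    theta[2] <- rnorm(1, mean = rho * theta[1], sd = s)  # theta2 | theta1
    theta[1] <- rnorm(1, mean = rho * theta[2], sd = s)  # theta1 | theta2
    draws[r, ] <- theta
  }
  draws
}

set.seed(66)
draws <- gibbs_binorm(R = 20000, rho = 0.9, theta0 = c(2, -2))
keep <- draws[-(1:1000), ]       # discard an assumed burn-in of 1000 draws
colMeans(keep)                   # should be near (0, 0)
cor(keep[, 1], keep[, 2])        # should be near rho
```

Each draw conditions only on the most recent value of the other component, which is what makes the sequence a Markov chain.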
Hammersley-Clifford Theorem
Existence of GS for a bivariate distribution implies that the complete set of conditionals summarizes all info in the joint.
H-C Construction:
$$p(\theta_1, \theta_2) = \frac{p(\theta_2 \mid \theta_1)}{\displaystyle\int \frac{p(\theta_2 \mid \theta_1)}{p(\theta_1 \mid \theta_2)}\, d\theta_2}$$
Why?
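Hammersley-Clifford Theorem
$$\int \frac{p(\theta_2 \mid \theta_1)}{p(\theta_1 \mid \theta_2)}\, d\theta_2 = \int \frac{p(\theta_1, \theta_2)/p(\theta_1)}{p(\theta_1, \theta_2)/p(\theta_2)}\, d\theta_2 = \int \frac{p(\theta_2)}{p(\theta_1)}\, d\theta_2 = \frac{1}{p(\theta_1)}$$
$$\frac{p(\theta_2 \mid \theta_1)}{\displaystyle\int \frac{p(\theta_2 \mid \theta_1)}{p(\theta_1 \mid \theta_2)}\, d\theta_2} = \frac{p(\theta_2 \mid \theta_1)}{1/p(\theta_1)} = p(\theta_2 \mid \theta_1)\,p(\theta_1) = p(\theta_1, \theta_2)$$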
rbiNormGibbs
[Figure: four plots of theta2 against theta1 (both axes from -3 to 3), each titled "Gibbs Sampler with Intermediate Moves: Rho = 0.9".]
Intuition for dependence
This is a Markov Chain!
Average step "size": $\sqrt{1-\rho^2}$
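To make the dependence concrete, the two conditional draws of the bivariate normal Gibbs Sampler can be composed into a single recursion. The innovations $e_{r+1,1}$ and $e_{r+1,2}$ are notation introduced here for illustration (independent standard normals); they do not appear on the slides.

$$\theta_{r+1,2} = \rho\,\theta_{r,1} + \sqrt{1-\rho^2}\; e_{r+1,2}$$
$$\theta_{r+1,1} = \rho\,\theta_{r+1,2} + \sqrt{1-\rho^2}\; e_{r+1,1} = \rho^2\,\theta_{r,1} + \rho\sqrt{1-\rho^2}\; e_{r+1,2} + \sqrt{1-\rho^2}\; e_{r+1,1}$$

Under these assumptions the draws of $\theta_1$ follow an AR(1) process with coefficient $\rho^2$ and innovation variance $(1-\rho^2)(1+\rho^2) = 1-\rho^4$, so the stationary variance is 1 as required. With $\rho = 0.9$ the lag-one autocorrelation is $0.81$: small steps relative to the spread of the target produce highly dependent draws.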
rbiNormGibbs
[Figure: autocorrelation (ACF) and cross-correlation plots of the draws against lag, in four panels: "Series 1", "Series 1 & Series 2", "Series 2 & Series 1", "Series 2".]
non-iid draws!
Who cares?
Loss of Efficiency
Ergodicity
[Figure: two panels. "ACF of Theta1" against lag (0 to 30), and "Convergence of Sample Correlation" against R (0 to 1000), comparing iid draws with Gibbs Sampler draws.]
$$\hat\rho_r = \frac{\frac{1}{r}\sum_{i=1}^{r} (\theta_1^i - \bar\theta_1)(\theta_2^i - \bar\theta_2)}{\sqrt{\frac{1}{r}\sum_{i=1}^{r} (\theta_1^i - \bar\theta_1)^2 \;\frac{1}{r}\sum_{i=1}^{r} (\theta_2^i - \bar\theta_2)^2}}$$
Relative Numerical Efficiency
Draws from the Gibbs Sampler come from a stationary yet autocorrelated process. We can compute the sampling error of averages of these draws.
Assume we wish to estimate
$$\mu = E_\pi[g(\theta)]$$
We would use:
$$\hat\mu = \frac{1}{R}\sum_{r} g(\theta_r) = \frac{1}{R}\sum_{r} g_r$$
$$\operatorname{var}(\hat\mu) = \frac{1}{R^2}\Big[\operatorname{var}(g_1) + \operatorname{cov}(g_1, g_2) + \cdots + \operatorname{cov}(g_1, g_R) + \operatorname{cov}(g_2, g_1) + \operatorname{var}(g_2) + \cdots + \operatorname{var}(g_R)\Big]$$
Relative Numerical Efficiency
$$\operatorname{var}(\hat\mu) = \frac{\operatorname{var}(g)}{R}\left[1 + 2\sum_{j=1}^{R-1} \frac{R-j}{R}\,\rho_j\right] = \frac{\operatorname{var}(g)}{R}\, f_R$$
$f_R$ is the ratio of the variance to the variance if the draws were iid.
$$\hat f_R = 1 + 2\sum_{j=1}^{m} \frac{m+1-j}{m+1}\,\hat\rho_j$$
Here we truncate the lag at m. Choice of m?
numEff in bayesm
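The truncated estimate of $f_R$ can be sketched directly with base R's `acf()`. The helper name `num_eff_factor`, the AR(1) test series, and the choice `m = 100` are assumptions for this illustration; bayesm's `numEff` may differ in details such as its default for `m`.

```r
# Hypothetical helper: truncated, weighted-autocorrelation estimate of f_R.
num_eff_factor <- function(g, m) {
  rho_hat <- acf(g, lag.max = m, plot = FALSE)$acf[-1]   # sample autocorrelations, lags 1..m
  w <- (m + 1 - 1:m) / (m + 1)                           # declining weights
  1 + 2 * sum(w * rho_hat)
}

set.seed(1)
R <- 20000
phi <- 0.81                                # assumed AR(1) coefficient (rho^2 when rho = 0.9)
g <- as.numeric(arima.sim(list(ar = phi), n = R))
f_hat <- num_eff_factor(g, m = 100)
f_hat                                      # compare with (1 + phi) / (1 - phi)
sqrt(f_hat * var(g) / R)                   # numerical standard error of mean(g)
```

For an AR(1) series the untruncated factor is $(1+\phi)/(1-\phi)$, which gives a rough check on the estimate; the weights shrink the estimate slightly toward 1.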
General Gibbs sampler
$\theta' = (\theta_1, \theta_2, \ldots, \theta_p)$
Sample from:
$$\theta_{1,1} \sim f_1(\theta_1 \mid \theta_{0,2}, \ldots, \theta_{0,p})$$
$$\theta_{1,2} \sim f_2(\theta_2 \mid \theta_{1,1}, \theta_{0,3}, \ldots, \theta_{0,p})$$
$$\vdots$$
$$\theta_{1,p} \sim f_p(\theta_p \mid \theta_{1,1}, \ldots, \theta_{1,p-1})$$
to obtain the first iterate, where
$$f_i(\theta_i \mid \theta_{-i}) = \frac{\pi(\theta)}{\int \pi(\theta)\, d\theta_i}, \qquad \theta_{-i} = (\theta_1, \theta_2, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_p)$$
"Blocking"
Different prior for Bayes Regression
Suppose the prior for β does not depend on σ²: p(β,σ²) = p(β) p(σ²). That is, prior belief about β does not depend on σ². Why should views about β depend on the scale of the error terms? That is only true for data-based prior information, NOT for subject matter information!
$$p(\beta) \propto \exp\left\{-\tfrac{1}{2}(\beta - \bar\beta)' A (\beta - \bar\beta)\right\}$$
$$p(\sigma^2) \propto (\sigma^2)^{-\left(\frac{\nu_0}{2}+1\right)} \exp\left\{-\frac{\nu_0 s_0^2}{2\sigma^2}\right\}$$
Different posterior
The posterior for σ² now depends on β:
$$[\beta \mid y, X, \sigma^2] \sim N\big(\tilde\beta, (\sigma^{-2}X'X + A)^{-1}\big)$$
$$\text{with } \tilde\beta = (\sigma^{-2}X'X + A)^{-1}(\sigma^{-2}X'X\hat\beta + A\bar\beta), \qquad \hat\beta = (X'X)^{-1}X'y$$
$$[\sigma^2 \mid y, X, \beta] \sim \frac{\nu_1 s_1^2}{\chi^2_{\nu_1}} \quad \text{with } \nu_1 = \nu_0 + n, \qquad s_1^2 = \frac{\nu_0 s_0^2 + (y - X\beta)'(y - X\beta)}{\nu_0 + n}$$
The conditional for σ² depends on β.
Different simulation strategy
Scheme: [y|X, β, σ²] [β] [σ²]
1) Draw [β | y, X, σ²]
2) Draw [σ² | y, X, β] (conditional on β!)
3) Repeat
runiregGibbs

    runiregGibbs=function(Data,Prior,Mcmc){
    #
    # Purpose:
    #   perform Gibbs iterations for Univ Regression Model using
    #   prior with beta, sigma-sq indep
    #
    # Arguments:
    #   Data -- list of data
    #     y,X
    #   Prior -- list of prior hyperparameters
    #     betabar,A      prior mean, prior precision
    #     nu, ssq        prior on sigmasq
    #   Mcmc -- list of MCMC parms
    #     sigmasq=initial value for sigmasq
    #     R number of draws
    #     keep -- thinning parameter
    #
    # Output:
    #   list of beta, sigmasq
    #
    # Model:
    #   y = Xbeta + e   e ~ N(0,sigmasq)
    #   y is n x 1
    #   X is n x k
    #   beta is k x 1 vector of coefficients
    #
    # Priors:  beta ~ N(betabar,A^-1)
    #          sigmasq ~ (nu*ssq)/chisq_nu
    #
    # check arguments
    # (argument checking and unpacking of Data, Prior, Mcmc are omitted on the slide)
    #
    sigmasqdraw=double(floor(Mcmc$R/keep))
    betadraw=matrix(double(floor(Mcmc$R*nvar/keep)),ncol=nvar)
    XpX=crossprod(X)
    Xpy=crossprod(X,y)
    sigmasq=as.vector(sigmasq)

    itime=proc.time()[3]
    cat("MCMC Iteration (est time to end - min) ",fill=TRUE)
    flush.console()

    for (rep in 1:Mcmc$R)
    {
    #
    # first draw beta | sigmasq
    #
      IR=backsolve(chol(XpX/sigmasq+A),diag(nvar))
      btilde=crossprod(t(IR))%*%(Xpy/sigmasq+A%*%betabar)
      beta = btilde + IR%*%rnorm(nvar)
    #
    # now draw sigmasq | beta
    #
      res=y-X%*%beta
      s=t(res)%*%res
      sigmasq=(nu*ssq + s)/rchisq(1,nu+nobs)
      sigmasq=as.vector(sigmasq)
    #
    # print time to completion and draw # every 100th draw
    #
      if(rep%%100 == 0)
        {ctime=proc.time()[3]
        timetoend=((ctime-itime)/rep)*(R-rep)
        cat(" ",rep," (",round(timetoend/60,1),")",fill=TRUE)
        flush.console()}

      if(rep%%keep == 0)
        {mkeep=rep/keep; betadraw[mkeep,]=beta; sigmasqdraw[mkeep]=sigmasq}
    }
    ctime = proc.time()[3]
    cat(' Total Time Elapsed: ',round((ctime-itime)/60,2),'\n')

    list(betadraw=betadraw,sigmasqdraw=sigmasqdraw)
    }
R session

    set.seed(66)
    n=100
    X=cbind(rep(1,n),runif(n),runif(n),runif(n))
    beta=c(1,2,3,4)
    sigsq=1.0
    y=X%*%beta+rnorm(n,sd=sqrt(sigsq))

    A=diag(c(.05,.05,.05,.05))
    betabar=c(0,0,0,0)
    nu=3
    ssq=1.0

    R=1000

    Data=list(y=y,X=X)
    Prior=list(A=A,betabar=betabar,nu=nu,ssq=ssq)
    Mcmc=list(R=R,keep=1)

    out=runiregGibbs(Data=Data,Prior=Prior,Mcmc=Mcmc)
R session (continued)

    Starting Gibbs Sampler for Univariate Regression Model
      with 100 observations

    Prior Parms:
    betabar
    [1] 0 0 0 0
    A
         [,1] [,2] [,3] [,4]
    [1,] 0.05 0.00 0.00 0.00
    [2,] 0.00 0.05 0.00 0.00
    [3,] 0.00 0.00 0.05 0.00
    [4,] 0.00 0.00 0.00 0.05
    nu = 3 ssq= 1

    MCMC parms:
    R= 1000 keep= 1

    MCMC Iteration (est time to end - min)
      100 ( 0 )
      200 ( 0 )
      300 ( 0 )
      400 ( 0 )
      500 ( 0 )
      600 ( 0 )
      700 ( 0 )
      800 ( 0 )
      900 ( 0 )
      1000 ( 0 )
     Total Time Elapsed: 0.01
[Figure: "Draws of Beta". out$betadraw plotted against draw number (0 to 1000).]
[Figure: "Draws of Sigma Squared". out$sigmasqdraw plotted against draw number (0 to 1000).]
Data Augmentation
GS is well-suited for linear models. It extends to conditionally conjugate models, e.g. SUR.
Data Augmentation extends the class of models which can be analyzed via GS.
Origins: missing data
Traditional approach:
$$p(y \mid \theta), \qquad y = (y_{obs}, y_{miss})$$
$$p(\theta \mid y_{obs}) \propto p(y_{obs} \mid \theta)\,p(\theta) = \left[\int p(y_{obs}, y_{miss} \mid \theta)\, dy_{miss}\right] p(\theta)$$
Data Augmentation
Solution: regard $y_{miss}$ as what it is: an unobservable! Tanner and Wong (87)
$$p(\theta, y_{miss} \mid y_{obs}) \propto p(y_{obs}, y_{miss} \mid \theta)\,p(\theta) = p(y_{miss} \mid y_{obs}, \theta)\,p(y_{obs} \mid \theta)\,p(\theta)$$
GS:
$$\theta \mid y_{miss}, y_{obs} \quad \text{(complete data posterior!)}$$
$$y_{miss} \mid \theta, y_{obs}$$
under the "ignorable" missing data assumption
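A toy version of this scheme can be sketched in R. Everything specific here is an assumption made for illustration and is not from the slides: the model $y_i \sim N(\theta, 1)$, the prior $\theta \sim N(0, 1/a)$ with `a = 0.01`, 30 observed and 10 ignorably missing values, and all variable names. In this simple case the missing values carry no information, so the Gibbs output can be checked against the exact posterior mean given the observed data alone.

```r
set.seed(2)
theta_true <- 1.5                       # assumed value used to simulate data
y_obs <- rnorm(30, mean = theta_true, sd = 1)
n_miss <- 10                            # number of (ignorably) missing observations
a <- 0.01                               # assumed prior precision, theta ~ N(0, 1/a)
R <- 5000
n <- length(y_obs) + n_miss

theta <- 0
thetadraw <- numeric(R)
for (r in 1:R) {
  # draw y_miss | theta, y_obs
  y_miss <- rnorm(n_miss, mean = theta, sd = 1)
  # draw theta | y_miss, y_obs from the complete-data posterior
  theta <- rnorm(1, mean = (sum(y_obs) + sum(y_miss)) / (n + a),
                 sd = sqrt(1 / (n + a)))
  thetadraw[r] <- theta
}

post_mean_exact <- sum(y_obs) / (length(y_obs) + a)   # posterior mean given y_obs only
c(gibbs = mean(thetadraw[-(1:500)]), exact = post_mean_exact)
```

The point of the sketch is the structure: each conditional is a standard draw, even though the marginal posterior was obtained without ever integrating out the missing data.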
Data Augmentation-Probit Ex
Consider the Binary Probit model:
$$y_i = \begin{cases} 1 & \text{if } z_i > 0 \\ 0 & \text{otherwise} \end{cases} \qquad z_i = x_i'\beta + \varepsilon_i, \quad \varepsilon_i \sim N(0,1)$$
Z is a latent, unobserved variable.
Integrate out z to obtain the likelihood:
$$p(y \mid x, \beta) = \int p(y, z \mid x, \beta)\, dz = \int p(y \mid z, x, \beta)\, p(z \mid x, \beta)\, dz = \int f(y \mid z)\, p(z \mid x, \beta)\, dz$$
$$\Pr(y = 1) = \int_0^\infty p(z \mid x, \beta)\, dz = \Pr(\varepsilon > -x'\beta) = \Phi(x'\beta)$$
$$\Pr(y = 0) = \Phi(-x'\beta)$$
Data augmentation
All unobservables are objects of inference, including parameters and latent variables. Augment β with z.
For Probit, we desire the joint posterior of latents and β:
$$p(z, \beta \mid y), \quad \text{with conditionals} \quad p(z \mid \beta, y) \quad \text{and} \quad p(\beta \mid z, y) = p(\beta \mid z)$$
The last equality follows from the conditional independence of y and β given z.
Gibbs Sampler:
$$z \mid \beta, y$$
$$\beta \mid z$$
Probit conditional distributions
[z|β, y]
This is a truncated normal distribution:
if y = 1, truncation is from below at 0 (z > 0, z = x'β + ε, ε > -x'β)
if y = 0, truncation is from above at 0
How do we make these draws? We use the inverse CDF method.
Inverse cdf
If $X \sim F$ and $U \sim \text{Uniform}[0,1]$, then $F^{-1}(U)$ has the same distribution as $X$.
[Figure: a cdf rising from 0 to 1, plotted against x.]
Let G be the cdf of X truncated to [a,b]:
$$G(x) = \frac{F(x) - F(a)}{F(b) - F(a)}$$
Inverse cdf
What is $G^{-1}$? Solve $G(x) = y$:
$$y = \frac{F(x) - F(a)}{F(b) - F(a)}$$
$$F(x) = y\,(F(b) - F(a)) + F(a)$$
$$x = F^{-1}\big(y\,(F(b) - F(a)) + F(a)\big)$$
Draw $u \sim U(0,1)$ and set
$$x = F^{-1}\big(u\,(F(b) - F(a)) + F(a)\big)$$
rtrun

    rtrun=function(mu,sigma,a,b){
    # function to draw from univariate truncated norm
    # a is vector of lower bounds for truncation
    # b is vector of upper bounds for truncation
    #
    FA=pnorm(((a-mu)/sigma))
    FB=pnorm(((b-mu)/sigma))
    mu+sigma*qnorm(runif(length(mu))*(FB-FA)+FA)
    }
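A quick check of the inverse cdf draw can be sketched as follows. The function is restated under the hypothetical name `rtrun_sketch` so the example is self-contained; the mean `0.5`, the bounds, and the sample size are assumptions for illustration. The sample mean is compared with the known mean of a normal truncated from below, $\mu + \sigma\,\phi(\alpha)/(1-\Phi(\alpha))$ with $\alpha = (a-\mu)/\sigma$.

```r
# Restatement of the inverse cdf draw under a hypothetical name.
rtrun_sketch <- function(mu, sigma, a, b) {
  FA <- pnorm((a - mu) / sigma)
  FB <- pnorm((b - mu) / sigma)
  mu + sigma * qnorm(runif(length(mu)) * (FB - FA) + FA)
}

set.seed(3)
n <- 10000
# y = 1 case: truncate N(0.5, 1) from below at 0 (upper bound 100, as on the slides)
z_pos <- rtrun_sketch(rep(0.5, n), rep(1, n), a = rep(0, n), b = rep(100, n))
range(z_pos)                                   # all draws should lie above 0
exact_mean <- 0.5 + dnorm(0.5) / pnorm(0.5)    # truncated-normal mean for this case
c(sample = mean(z_pos), exact = exact_mean)
```

Because the function is vectorized over `mu`, `a`, and `b`, a single call can draw every latent variable in a probit model, each with its own mean and truncation region.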
Probit conditional distributions
$$[\beta \mid z, X] \propto [z \mid X, \beta]\,[\beta]$$
$$[\beta \mid \bar\beta, A^{-1}] \sim N(\bar\beta, A^{-1})$$
$$[\beta \mid z, X] \sim \text{Normal}\big(\tilde\beta, (X'X + A)^{-1}\big)$$
$$\tilde\beta = (X'X + A)^{-1}(X'X\hat\beta + A\bar\beta), \qquad \hat\beta = (X'X)^{-1}X'z$$
Standard Bayes regression with unit error variance!
rbprobitGibbs

    rbprobitGibbs=function(Data,Prior,Mcmc)
    {
    #
    # purpose:
    #   draw from posterior for binary probit using Gibbs Sampler
    #
    # Arguments:
    #   Data - list of X,y
    #     X is nobs x nvar, y is nobs vector of 0,1
    #   Prior - list of A, betabar
    #     A is nvar x nvar prior preci matrix
    #     betabar is nvar x 1 prior mean
    #   Mcmc
    #     R is number of draws
    #     keep is thinning parameter
    #
    # Output:
    #   list of betadraws
    #
    # Model:   y = 1 if  w=Xbeta + e   > 0  e ~N(0,1)
    #
    # Prior:   beta ~ N(betabar,A^-1)
    #
    # define functions needed
    #
    breg1=
    function(root,X,y,Abetabar)
    {
    #
    # Purpose: draw from posterior for linear regression, sigmasq=1.0
    #
    # Arguments:
    #   root is chol((X'X+A)^-1)
    #   Abetabar = A*betabar
    #
    # Output: draw from posterior
    #
    # Model: y = Xbeta + e  e ~ N(0,I)
    #
    # Prior:  beta ~ N(betabar,A^-1)
    #
    cov=crossprod(root,root)
    betatilde=cov%*%(crossprod(X,y)+Abetabar)
    betatilde+t(root)%*%rnorm(length(betatilde))
    }
    #
    # (error checking part of code omitted on the slide)
    #
rbprobitGibbs (continued)

    betadraw=matrix(double(floor(R/keep)*nvar),ncol=nvar)
    beta=c(rep(0,nvar))
    sigma=c(rep(1,nrow(X)))
    root=chol(chol2inv(chol((crossprod(X,X)+A))))
    Abetabar=crossprod(A,betabar)
    # if y = 0, truncate to (-100,0)
    # if y = 1, truncate to (0, 100)
    a=ifelse(y == 0,-100, 0)
    b=ifelse(y == 0, 0, 100)
    #
    # start main iteration loop
    #
    itime=proc.time()[3]
    cat("MCMC Iteration (est time to end - min) ",fill=TRUE)
    flush.console()
rbprobitGibbs (continued)

    for (rep in 1:R)
    {
      mu=X%*%beta
      z=rtrun(mu,sigma,a,b)
      beta=breg1(root,X,z,Abetabar)
    }
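The pieces above can be assembled into a compact, self-contained sketch of the probit Gibbs Sampler that also stores the draws. This is an illustration under stated assumptions, not the bayesm implementation: the simulated data, `beta_true`, the prior settings `A` and `betabar`, the use of infinite truncation bounds, the burn-in of 500, and all variable names are choices made for the example.

```r
set.seed(66)
n <- 500
X <- cbind(1, runif(n), runif(n))
beta_true <- c(-1, 1, 2)                   # assumed values used to simulate data
y <- as.numeric(X %*% beta_true + rnorm(n) > 0)

A <- diag(0.01, ncol(X))                   # assumed prior precision
betabar <- rep(0, ncol(X))                 # assumed prior mean
R <- 2000

V <- chol2inv(chol(crossprod(X) + A))      # (X'X + A)^{-1}
root <- chol(V)                            # upper factor: t(root) %*% root = V
lower <- ifelse(y == 1, 0, -Inf)           # truncation region implied by y
upper <- ifelse(y == 1, Inf, 0)

beta <- rep(0, ncol(X))
betadraw <- matrix(NA_real_, R, ncol(X))
for (r in 1:R) {
  # draw z | beta, y by the inverse cdf method (unit error variance)
  mu <- as.vector(X %*% beta)
  FA <- pnorm(lower - mu)
  FB <- pnorm(upper - mu)
  z <- mu + qnorm(runif(n) * (FB - FA) + FA)
  # draw beta | z: Bayes regression with unit error variance
  btilde <- V %*% (crossprod(X, z) + A %*% betabar)
  beta <- as.vector(btilde + t(root) %*% rnorm(ncol(X)))
  betadraw[r, ] <- beta
}
colMeans(betadraw[-(1:500), ])             # compare with beta_true
```

Since the error variance is fixed at 1, the factor `root` is computed once outside the loop, which is the same design choice the slide code makes.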
Binary probit example

    ##
    ## rbprobitGibbs example
    ##
    set.seed(66)
    simbprobit=
    function(X,beta) {
    ## function to simulate from binary probit including x variable
      y=ifelse((X%*%beta+rnorm(nrow(X)))<0,0,1)
      list(X=X,y=y,beta=beta)
    }

    nobs=100
    X=cbind(rep(1,nobs),runif(nobs),runif(nobs),runif(nobs))
    beta=c(-2,-1,1,2)
    nvar=ncol(X)
    simout=simbprobit(X,beta)

    Data=list(X=simout$X,y=simout$y)
    Mcmc=list(R=2000,keep=1)

    out=rbprobitGibbs(Data=Data,Mcmc=Mcmc)

    cat(" Betadraws ",fill=TRUE)
    mat=apply(out$betadraw,2,quantile,probs=c(.01,.05,.5,.95,.99))
    mat=rbind(beta,mat); rownames(mat)[1]="beta"; print(mat)
[Figure: "Probit Beta Draws". out$betadraw plotted against draw number (0 to 2000).]
Summary statistics

     Betadraws
              [,1]        [,2]        [,3]     [,4]
    beta -2.000000 -1.00000000  1.00000000 2.000000
    1%   -4.113488 -2.69028853 -0.08326063 1.392206
    5%   -3.588499 -2.19816304  0.20862118 1.867192
    50%  -2.504669 -1.04634198  1.17242924 2.946999
    95%  -1.556600 -0.06133085  2.08300392 4.166941
    99%  -1.233392  0.34910141  2.43453863 4.680425
Binary probit example
Example from BSM:
$$\Pr(y = 1 \mid x, \beta) = \Phi(x'\beta)$$
[Figure: two panels showing the distribution of this probability, titled "Probability | x=(0,.1,0)" and "Probability | x=(0,4,0)".]
Mixtures of normals
$$y_i \sim N(\mu_{ind_i}, \Sigma_{ind_i}), \qquad ind_i \sim \text{Multinomial}(pvec)$$
A general flexible model or a non-parametric method of density approximation?
$ind_i$ is an augmented variable that points to which normal distribution is associated with observation i. ind is an indicator variable that classifies observations into one of the length(pvec) components.
$$y_i \sim \sum_k pvec_k \, N(\mu_k, \Sigma_k)$$
Model hierarchy
[Diagram relating pvec, ind, $\mu_k$, $\Sigma_k$, and $y_i$.]
Model: [pvec] [ind|pvec] [$\Sigma_k$|ind] [$\mu_k$|ind,$\Sigma_k$] [Y|$\mu_k$,$\Sigma_k$]
Conditionals: [pvec|ind,priors] [ind|pvec,{$\mu_k$,$\Sigma_k$},y] [{$\mu_k$,$\Sigma_k$}|ind,y,priors]
Priors:
$$pvec \sim \text{Dirichlet}(\alpha)$$
$$\Sigma_k \sim IW(\nu, V)$$
$$\mu_k \mid \Sigma_k \sim N(\bar\mu, a^{-1}\Sigma_k)$$
$$k = 1, \ldots, K$$
Gibbs Sampler for Mixture of Normals
Conditionals:
[pvec|ind,priors]
$$pvec \sim \text{Dirichlet}(\tilde\alpha), \qquad \tilde\alpha_k = n_k + \alpha_k; \quad n_k = \sum_{i=1}^{n} I(ind_i = k)$$
[ind|pvec,{$\mu_k$,$\Sigma_k$},y]
$$ind_i \sim \text{multinomial}(\pi_i); \qquad \pi_i' = (\pi_{i,1}, \ldots, \pi_{i,K})$$
$$\pi_{i,k} = \frac{pvec_k\,\varphi(y_i \mid \mu_k, \Sigma_k)}{\sum_k pvec_k\,\varphi(y_i \mid \mu_k, \Sigma_k)}$$
φ( ) is the multivariate normal density
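These two conditionals can be sketched for a univariate mixture in R. To keep the example short, the component means and standard deviations are held fixed at assumed values, so only `ind` and `pvec` are drawn; the Dirichlet prior parameters `alpha`, the simulated data, and the helper `rdirichlet_sketch` (a Dirichlet draw built from normalized gamma variates) are all assumptions for illustration.

```r
set.seed(4)
K <- 2
mu <- c(0, 4); sigma <- c(1, 1)          # component parameters held fixed (assumed)
pvec_true <- c(0.7, 0.3)                 # assumed mixing probabilities for simulation
n <- 1000
ind_true <- sample(1:K, n, replace = TRUE, prob = pvec_true)
y <- rnorm(n, mean = mu[ind_true], sd = sigma[ind_true])
alpha <- rep(1, K)                       # assumed Dirichlet prior parameters

# Hypothetical helper: Dirichlet draw via normalized gamma variates
rdirichlet_sketch <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

R <- 200
pvec <- rep(1 / K, K)
pvecdraw <- matrix(NA_real_, R, K)
for (r in 1:R) {
  # [ind | pvec, {mu_k, sigma_k}, y]: multinomial with probabilities pi_{i,k}
  dens <- sapply(1:K, function(k) pvec[k] * dnorm(y, mu[k], sigma[k]))  # n x K
  prob <- dens / rowSums(dens)
  ind <- apply(prob, 1, function(p) sample.int(K, 1, prob = p))
  # [pvec | ind, priors]: Dirichlet(n_k + alpha_k)
  nk <- tabulate(ind, nbins = K)
  pvec <- rdirichlet_sketch(alpha + nk)
  pvecdraw[r, ] <- pvec
}
colMeans(pvecdraw[-(1:50), ])            # compare with the simulated class shares
```

A full sampler would add a third step that redraws each component's mean and variance given the current classification.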
Gibbs Sampler for Mixtures of Normals
[{$\mu_k$,$\Sigma_k$}|ind,y,priors]
Given ind (classification), this is just a MRM!
$$Y_k = \iota\mu_k' + U; \qquad U = \begin{bmatrix} u_1' \\ \vdots \\ u_{n_k}' \end{bmatrix}; \qquad u_i \sim N(0, \Sigma_k)$$
$$\Sigma_k \mid Y_k, \nu, V \sim IW(\nu + n_k, V + S)$$
$$\mu_k \mid Y_k, \Sigma_k, \bar\mu, a \sim N\big(\tilde\mu_k, (n_k + a)^{-1}\Sigma_k\big)$$
$$S = (Y_k^* - \iota^*\tilde\mu_k')'(Y_k^* - \iota^*\tilde\mu_k'), \qquad Y_k^* = \begin{bmatrix} Y_k \\ \sqrt{a}\,\bar\mu' \end{bmatrix}, \qquad \iota^* = \begin{bmatrix} \iota \\ \sqrt{a} \end{bmatrix}$$
$$\tilde\mu_k = (n_k + a)^{-1}(n_k \bar Y_k + a\bar\mu), \qquad \bar Y_k = Y_k'\iota / n_k$$
Identification for Mixtures of Normals
Likelihood for mixture of K normals can have up to K! modes of equal height!
So-called “label” switching problem: I can permute the labels of each component without changing likelihood.
Implies the Gibbs Sampler may not navigate all modes! Who cares?
Joint density or any function of this is identified!
Label-Switching Example
Consider a mixture of two univariate normals that are not very "separated", with a relatively small amount of data. The density of y is unimodal with its mode at 1.5.
$$y \sim 0.5\,N(1,1) + 0.5\,N(2,1)$$
[Figure: draws of the two component means (mudraw), plotted with the symbols 1 and 2 against draw number (0 to 100); annotation "Label-switches".]
Label-Switching Example
The density of y is identified. Using the Gibbs Sampler, we get R draws from the posterior of the joint density:
$$p(y) = p\,\varphi(y \mid \mu_1, \sigma_1) + (1 - p)\,\varphi(y \mid \mu_2, \sigma_2)$$
[Figure: density plotted for y from -1 to 4.]
Identification for Mixtures of Normals
We use the unconstrained Gibbs Sampler (rnmixGibbs). Others advocate restrictions or post-processing of draws to identify components.
Pros:
superior mixing
focuses attention on identified quantities
Cons:
can't make inferences about component parms
must summarize posterior of joint density!
Identification for Mixtures of Normals
In practice, what is the implication of label-switching?
We can't use:
$$\hat E[\mu_k] = \frac{1}{R}\sum_r \mu_k^r, \qquad \hat E[\Sigma_k] = \frac{1}{R}\sum_r \Sigma_k^r$$
but we can use:
$$\hat E[p(y)] = \frac{1}{R}\sum_r \sum_k p_k^r\,\varphi(y \mid \mu_k^r, \Sigma_k^r)$$
Multivariate Mix of Norms Ex
$$\mu_1 = (1, 2, 3, 4, 5)'; \qquad \mu_2 = 2\mu_1; \qquad \mu_3 = 3\mu_1$$
$$\Sigma_k = \begin{bmatrix} 1 & .5 & \cdots & .5 \\ .5 & 1 & & \vdots \\ \vdots & & \ddots & .5 \\ .5 & \cdots & .5 & 1 \end{bmatrix}; \qquad pvec = (1/2,\ 1/3,\ 1/6)'$$
[Figure: "Normal Component" (1 to 9) plotted against r (0 to 400).]
Multivariate Mix of Norms Ex
$$\hat p(y) = \frac{1}{R}\sum_{r=1}^{R} p\big(y \mid \{\mu_k^r, \Sigma_k^r\}, p^r\big) = \frac{1}{R}\sum_{r=1}^{R}\sum_{k=1}^{K} p_k^r\,\varphi(y \mid \mu_k^r, \Sigma_k^r)$$
[Figure: estimated density for y from 0 to 15, labeled "draw 100".]
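Averaging the mixture density over draws can be sketched in R for the univariate case. The three "draws" below are made-up numbers used only to show the mechanics (they are not sampler output), and the helper `mix_dens`, the grid, and the list layout are assumptions for illustration. The second draw has its labels swapped on purpose: the density is unaffected, which is why this summary is safe under label-switching.

```r
# Hypothetical helper: mixture density on a grid for one draw of (pvec, mu, sigma)
mix_dens <- function(ygrid, pvec, mu, sigma) {
  dens <- numeric(length(ygrid))
  for (k in seq_along(pvec)) {
    dens <- dens + pvec[k] * dnorm(ygrid, mu[k], sigma[k])
  }
  dens
}

ygrid <- seq(-6, 9, length.out = 601)
# Made-up draws for illustration only; labels are swapped in the second one
draws <- list(
  list(pvec = c(0.50, 0.50), mu = c(1.1, 1.9), sigma = c(1.0, 1.1)),
  list(pvec = c(0.40, 0.60), mu = c(2.1, 0.9), sigma = c(0.9, 1.0)),
  list(pvec = c(0.55, 0.45), mu = c(1.0, 2.2), sigma = c(1.0, 1.0))
)
dens_by_draw <- sapply(draws, function(d) mix_dens(ygrid, d$pvec, d$mu, d$sigma))
p_hat <- rowMeans(dens_by_draw)          # average of the density over draws

# Permuting the component labels within a draw leaves the density unchanged
d1 <- mix_dens(ygrid, c(0.4, 0.6), c(2.1, 0.9), c(0.9, 1.0))
d2 <- mix_dens(ygrid, c(0.6, 0.4), c(0.9, 2.1), c(1.0, 0.9))
all.equal(d1, d2)
```

Averaging `mu` or `sigma` across the same draws would mix values belonging to different components, which is the failure the slides warn about.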
Bivariate Distributions and Marginals
[Figure: a density plot (x from 0 to 10) and two bivariate panels titled "True Bivariate Marginal" and "Posterior Mean of Bivariate Marginal".]