A vanilla Rao–Blackwellisation of Metropolis–Hastings algorithms
Randal DOUC and Christian ROBERT
Telecom SudParis, France
April 2009
Main themes
1 Rao–Blackwellisation on MCMC.
2 Can be performed in any Hastings–Metropolis algorithm.
3 Asymptotically more efficient than usual MCMC with a controlled amount of calculations.
Introduction Some properties of the HM algorithm Rao–Blackwellisation Illustrations Conclusion
Outline
1 Introduction
2 Some properties of the HM algorithm
3 Rao–Blackwellisation
   Variance reduction
   Asymptotic results
4 Illustrations
5 Conclusion
Metropolis–Hastings algorithm
1 We wish to approximate
$$I = \frac{\int h(x)\,\pi(x)\,dx}{\int \pi(x)\,dx} = \int h(x)\,\bar\pi(x)\,dx .$$
2 $x \mapsto \pi(x)$ is known, but $\int \pi(x)\,dx$ is not.
3 Approximate $I$ with $\delta = \frac{1}{n}\sum_{t=1}^{n} h(x^{(t)})$, where $(x^{(t)})$ is a Markov chain with limiting distribution $\bar\pi$.
4 Convergence is obtained from the Law of Large Numbers or the CLT for Markov chains.
Metropolis–Hastings algorithm
Suppose that $x^{(t)}$ is drawn.
1 Simulate $y_t \sim q(\cdot|x^{(t)})$.
2 Set $x^{(t+1)} = y_t$ with probability
$$\alpha(x^{(t)}, y_t) = \min\left\{1,\ \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right\}.$$
Otherwise, set $x^{(t+1)} = x^{(t)}$.
3 $\alpha$ is such that the detailed balance equation is satisfied:
$$\pi(x)\,q(y|x)\,\alpha(x,y) = \pi(y)\,q(x|y)\,\alpha(y,x).$$
⊲ $\bar\pi$ is the stationary distribution of $(x^{(t)})$.
◮ The accepted candidates are simulated with the rejection algorithm.
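The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes a Gaussian random-walk proposal (so the $q$ ratio cancels in $\alpha$) and an unnormalised standard normal target, and the names `metropolis_hastings`, `log_pi`, `scale` and `seed` are hypothetical.

```python
import math
import random

def metropolis_hastings(log_pi, x0, n_iter, scale=1.0, seed=0):
    """Random-walk Metropolis-Hastings; returns the chain x^(1), ..., x^(n_iter)."""
    rng = random.Random(seed)
    x, chain = x0, []
    for _ in range(n_iter):
        y = x + rng.gauss(0.0, scale)  # y_t ~ q(.|x^(t)), symmetric proposal
        # Symmetric q cancels in the ratio, so alpha = min{1, pi(y)/pi(x)}.
        log_alpha = min(0.0, log_pi(y) - log_pi(x))
        if rng.random() < math.exp(log_alpha):
            x = y  # accept: x^(t+1) = y_t
        chain.append(x)  # on rejection the current value is repeated
    return chain

def log_pi(x):
    """Log of the unnormalised target pi(x) = exp(-x^2 / 2) (assumed example)."""
    return -0.5 * x * x

chain = metropolis_hastings(log_pi, x0=0.0, n_iter=20000, scale=2.0, seed=1)
delta = sum(chain) / len(chain)  # estimate of E[X] with h(x) = x
```

Working with `log_pi` instead of `pi` is a design choice to avoid underflow in the tails; only the ratio $\pi(y_t)/\pi(x^{(t)})$ is ever needed, which is why the normalising constant never appears.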
1 An alternative representation of the estimator $\delta$ is
$$\delta = \frac{1}{N}\sum_{t=1}^{N} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i),$$
where
the $z_i$'s are the accepted $y_j$'s,
$M_N$ is the number of accepted $y_j$'s up to time $N$,
$n_i$ is the number of times $z_i$ appears in the sequence $(x^{(t)})_t$.
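As a small check of this representation, the sketch below collapses a chain into its pairs $(z_i, n_i)$ and compares the two averages. It assumes a continuous state space, so that consecutive equal values can only come from rejections; the toy chain and the helper name `accepted_with_counts` are made up for illustration.

```python
from itertools import groupby

def accepted_with_counts(chain):
    """Collapse a chain into [(z_i, n_i)]: accepted values and sojourn lengths."""
    return [(z, sum(1 for _ in run)) for z, run in groupby(chain)]

def h(x):
    return x * x

chain = [0.3, 0.3, 1.2, 1.2, 1.2, -0.5, 0.7, 0.7]  # toy MH output, N = 8
pairs = accepted_with_counts(chain)
N = len(chain)
delta_chain = sum(h(x) for x in chain) / N           # (1/N) sum_t h(x^(t))
delta_runs = sum(n * h(z) for z, n in pairs) / N     # (1/N) sum_i n_i h(z_i)
```

Here `pairs` is `[(0.3, 2), (1.2, 3), (-0.5, 1), (0.7, 2)]` and the two averages agree up to floating-point rounding.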
$$\tilde q(\cdot|z_i) = \frac{\alpha(z_i,\cdot)\,q(\cdot|z_i)}{p(z_i)} \le \frac{q(\cdot|z_i)}{p(z_i)},$$
where $p(z_i) = \int \alpha(z_i,y)\,q(y|z_i)\,dy$. To simulate according to $\tilde q(\cdot|z_i)$:
1 Propose a candidate $y \sim q(\cdot|z_i)$.
2 Accept it with probability
$$\tilde q(y|z_i) \Big/ \left(\frac{q(y|z_i)}{p(z_i)}\right) = \alpha(z_i,y).$$
Otherwise, reject it and start again.
3 ◮ This is the transition of the HM algorithm.
The transition kernel $\tilde q$ admits $\tilde\pi$ as a stationary distribution:
$$\tilde\pi(x)\,\tilde q(y|x) = \underbrace{\frac{\pi(x)\,p(x)}{\int \pi(u)\,p(u)\,du}}_{\tilde\pi(x)}\ \underbrace{\frac{\alpha(x,y)\,q(y|x)}{p(x)}}_{\tilde q(y|x)}$$
$$\tilde\pi(x)\,\tilde q(y|x) = \frac{\pi(x)\,\alpha(x,y)\,q(y|x)}{\int \pi(u)\,p(u)\,du}$$
$$\tilde\pi(x)\,\tilde q(y|x) = \frac{\pi(y)\,\alpha(y,x)\,q(x|y)}{\int \pi(u)\,p(u)\,du}$$
$$\tilde\pi(x)\,\tilde q(y|x) = \tilde\pi(y)\,\tilde q(x|y).$$
Lemma
The sequence $(z_i, n_i)$ satisfies
1 $(z_i, n_i)_i$ is a Markov chain;
2 $z_{i+1}$ and $n_i$ are independent given $z_i$;
3 $n_i$ is distributed as a geometric random variable with probability parameter
$$p(z_i) := \int \alpha(z_i,y)\,q(y|z_i)\,dy ; \tag{1}$$
4 $(z_i)_i$ is a Markov chain with transition kernel $\tilde Q(z,dy) = \tilde q(y|z)\,dy$ and stationary distribution $\tilde\pi$ such that
$$\tilde q(\cdot|z) \propto \alpha(z,\cdot)\,q(\cdot|z) \quad\text{and}\quad \tilde\pi(\cdot) \propto \pi(\cdot)\,p(\cdot) .$$
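Item 3 of the lemma can be checked numerically. The sketch below is an illustration under assumptions not in the slides: a Gaussian random-walk proposal, the unnormalised target $\exp(-y^2/2)$, and a fixed current value $z = 1$. It estimates $p(z)$ by averaging $\alpha$ over proposals and compares $1/\hat p$ with the empirical mean of simulated sojourn times; the helper names are hypothetical.

```python
import math
import random

def alpha(z, y):
    """Acceptance probability for a symmetric proposal and target exp(-y^2/2)."""
    return min(1.0, math.exp(0.5 * (z * z - y * y)))

def sojourn_time(z, scale, rng):
    """Number of iterations the chain spends at z (n_i in the lemma)."""
    n = 1
    while rng.random() >= alpha(z, z + rng.gauss(0.0, scale)):
        n += 1  # proposal rejected: the chain stays at z one more step
    return n

rng = random.Random(42)
z, scale, draws = 1.0, 1.0, 20000
# Monte Carlo estimate of p(z) = int alpha(z, y) q(y|z) dy
p_hat = sum(alpha(z, z + rng.gauss(0.0, scale)) for _ in range(draws)) / draws
# Empirical mean of the geometric sojourn time, expected to be close to 1/p(z)
mean_n = sum(sojourn_time(z, scale, rng) for _ in range(draws)) / draws
```

Up to Monte Carlo error, `mean_n` and `1 / p_hat` should be close, which is what a geometric variable with success probability $p(z_i)$ predicts.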
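[Diagram: the chain $z_{i-1}, z_i, z_{i+1}$ with the sojourn times $n_{i-1}$ and $n_i$ attached to $z_{i-1}$ and $z_i$; the links are labelled "indep".]
$$\delta = \frac{1}{N}\sum_{t=1}^{N} h(x^{(t)}) = \frac{1}{N}\sum_{i=1}^{M_N} n_i\,h(z_i) .$$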
1 A natural idea:
$$\delta^* = \frac{1}{N}\sum_{i=1}^{M_N} \frac{h(z_i)}{p(z_i)},$$
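Self-normalising the weights gives
$$\delta^* \simeq \frac{\sum_{i=1}^{M_N} h(z_i)/p(z_i)}{\sum_{i=1}^{M_N} 1/p(z_i)} = \frac{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}\,h(z_i)}{\sum_{i=1}^{M_N} \frac{\pi(z_i)}{\tilde\pi(z_i)}} .$$
2 But $p$ is not available in closed form.
3 The geometric $n_i$ is the obvious solution, and it is the one used in the original Metropolis–Hastings estimate:
$$n_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\} ,$$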
Lemma
If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$, the quantity
$$\hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}$$
is an unbiased estimator of $1/p(z_i)$ whose variance, conditional on $z_i$, is lower than the conditional variance of $n_i$, $\{1 - p(z_i)\}/p^2(z_i)$.
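The unbiasedness claim can be traced in three lines. The derivation below is a sketch that assumes the uniforms $u_\ell$ are independent of the proposals $y_\ell$ and that $p(z_i) > 0$; the exchange of sum and expectation is justified by the non-negativity of the terms.

```latex
\begin{align*}
\mathbb{E}\big[\mathbb{I}\{u_\ell \ge \alpha(z_i,y_\ell)\} \,\big|\, y_\ell\big]
  &= 1-\alpha(z_i,y_\ell), \\
\mathbb{E}\Big[\prod_{\ell\le j}\{1-\alpha(z_i,y_\ell)\} \,\Big|\, z_i\Big]
  &= \{1-p(z_i)\}^{j}, \\
\mathbb{E}\big[\hat\xi_i \,\big|\, z_i\big]
  &= \sum_{j=0}^{\infty}\{1-p(z_i)\}^{j} = \frac{1}{p(z_i)} .
\end{align*}
```

The first line shows that $\hat\xi_i$ is the conditional expectation of $n_i$ given $z_i$ and the proposals, which is the Rao–Blackwell step and the reason the conditional variance cannot increase.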
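$$\hat\xi_i = 1 + \sum_{j=1}^{\infty} \prod_{\ell \le j} \{1 - \alpha(z_i, y_\ell)\}$$
1 The sum is infinite in principle, but it is sometimes finite, since the product vanishes as soon as one factor has an acceptance probability equal to 1:
$$\alpha(x^{(t)}, y_t) = \min\left\{1,\ \frac{\pi(y_t)}{\pi(x^{(t)})}\,\frac{q(x^{(t)}|y_t)}{q(y_t|x^{(t)})}\right\}.$$
For example, take a symmetric random walk as a proposal: then $\alpha = 1$ whenever $\pi(y_t) \ge \pi(x^{(t)})$.
2 What if we wish to be sure that the sum is finite?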
Variance reduction
Proposition
If $(y_j)_j$ is an iid sequence with distribution $q(y|z_i)$ and $(u_j)_j$ is an iid uniform sequence, for any $k \ge 0$, the quantity
$$\hat\xi_i^k = 1 + \sum_{j=1}^{\infty}\ \prod_{1 \le \ell \le k \wedge j} \{1 - \alpha(z_i, y_\ell)\} \prod_{k+1 \le \ell \le j} \mathbb{I}\{u_\ell \ge \alpha(z_i, y_\ell)\} \tag{2}$$
is an unbiased estimator of $1/p(z_i)$ with an almost surely finite number of terms. Moreover, for $k \ge 1$,
$$\mathbb{V}\big[\hat\xi_i^k \,\big|\, z_i\big] = \frac{1 - p(z_i)}{p^2(z_i)} - \frac{1 - (1 - 2p(z_i) + r(z_i))^k}{2p(z_i) - r(z_i)} \left(\frac{2 - p(z_i)}{p^2(z_i)}\right) (p(z_i) - r(z_i)) ,$$
where $p(z_i) := \int \alpha(z_i,y)\,q(y|z_i)\,dy$ and $r(z_i) := \int \alpha^2(z_i,y)\,q(y|z_i)\,dy$.
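Two limiting cases make the variance formula easier to read. The lines below are a worked check, not part of the original slides: $\rho$ is a shorthand introduced here, the formula is stated for $k \ge 1$ so the $k = 0$ line is a formal substitution, and $p(z_i) > 0$ is assumed (which gives $1 - \rho = 2p - r \ge p > 0$ because $\alpha^2 \le \alpha$).

```latex
\begin{align*}
\rho(z_i) := 1 - 2p(z_i) + r(z_i)
  &= \int \{1-\alpha(z_i,y)\}^2\, q(y|z_i)\,dy \in [0,1), \\
k = 0:\qquad \mathbb{V}\big[\hat\xi_i^0 \,\big|\, z_i\big]
  &= \frac{1-p(z_i)}{p^2(z_i)} \quad\text{(the geometric variance, since } 1-\rho^0 = 0\text{)}, \\
k \to \infty:\qquad \mathbb{V}\big[\hat\xi_i^k \,\big|\, z_i\big]
  &\to \frac{1-p}{p^2} - \frac{p-r}{2p-r}\,\frac{2-p}{p^2}
   = \frac{r-p^2}{p^2\,(2p-r)} .
\end{align*}
```

Since $\rho^k$ decreases in $k$ and $p \ge r$, the subtracted term grows with $k$, so each additional Rao–Blackwellised proposal can only lower the conditional variance.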
Therefore, we have
$$\mathbb{V}\big[\hat\xi_i \,\big|\, z_i\big] \le \mathbb{V}\big[\hat\xi_i^k \,\big|\, z_i\big] \le \mathbb{V}\big[\hat\xi_i^0 \,\big|\, z_i\big] = \mathbb{V}[n_i \,|\, z_i] .$$
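The truncated weight in (2) is straightforward to simulate. The sketch below is an illustration under assumptions of our own (Gaussian random-walk proposal, unnormalised target $\exp(-y^2/2)$, fixed $z_i = 1$): the first $k$ rejection indicators are replaced by $1 - \alpha$, the later ones are simulated, and the loop stops as soon as the running product is zero. The function names are hypothetical.

```python
import math
import random
import statistics

def alpha(z, y):
    """Acceptance probability for a symmetric proposal and target exp(-y^2/2)."""
    return min(1.0, math.exp(0.5 * (z * z - y * y)))

def xi_hat_k(z, k, scale, rng):
    """One draw of the weight in (2): only the first k proposals are integrated out."""
    total, prod, j = 1.0, 1.0, 0
    while True:
        j += 1
        a = alpha(z, z + rng.gauss(0.0, scale))  # alpha(z_i, y_j)
        if j <= k:
            prod *= 1.0 - a  # Rao-Blackwellised rejection indicator
        else:
            prod *= 1.0 if rng.random() >= a else 0.0  # simulated indicator
        if prod == 0.0:
            return total  # every later term of the sum vanishes
        total += prod

rng = random.Random(0)
stats = {}
for k in (0, 1, 5):
    draws = [xi_hat_k(1.0, k, 1.0, rng) for _ in range(20000)]
    stats[k] = (statistics.mean(draws), statistics.variance(draws))
```

With `k = 0` the function returns a simulated sojourn time $n_i$; the sample means in `stats` should agree across `k` up to Monte Carlo error, while the sample variance should shrink as `k` grows.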
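[Diagram: the chain $z_{i-1}, z_i, z_{i+1}$ with the weights $\hat\xi_{i-1}^k$ and $\hat\xi_i^k$ attached to $z_{i-1}$ and $z_i$; the links are now labelled "not indep".]
$$\delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\,h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .$$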
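Computing $\delta_M^k$ from the accepted values and their weights is a one-line weighted average. The sketch below uses made-up numbers: `counts` plays the role of the sojourn times $n_i$ and `rb_weights` stands for hypothetical Rao–Blackwellised weights; `self_normalised_estimate` is an illustrative name.

```python
def self_normalised_estimate(zs, weights, h):
    """Return sum_i w_i h(z_i) / sum_i w_i for accepted values zs."""
    numerator = sum(w * h(z) for z, w in zip(zs, weights))
    return numerator / sum(weights)

zs = [0.0, 1.0, 2.0]          # accepted values z_1, z_2, z_3
counts = [2, 1, 3]            # sojourn times n_i: recovers the usual MH average
rb_weights = [1.5, 1.0, 2.5]  # hypothetical Rao-Blackwellised weights

delta_mh = self_normalised_estimate(zs, counts, lambda x: x)
delta_rb = self_normalised_estimate(zs, rb_weights, lambda x: x)
```

With the counts as weights the result is the plain average of the chain `[0, 0, 1, 2, 2, 2]`, that is 7/6; with the second set of weights it is 6/5.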
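Asymptotic results
Let
$$\delta_M^k = \frac{\sum_{i=1}^{M} \hat\xi_i^k\,h(z_i)}{\sum_{i=1}^{M} \hat\xi_i^k} .$$
For any positive function $\varphi$, we denote $C_\varphi = \{h;\ |h/\varphi|_\infty < \infty\}$. Assume that there exists a positive function $\varphi \ge 1$ such that
$$\forall h \in C_\varphi,\qquad \frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} \xrightarrow{P} \pi(h) . \tag{3}$$
Theorem
Under the assumption that $\pi(p) > 0$, the following convergence property holds:
i) If $h$ is in $C_\varphi$, then
$$\delta_M^k \xrightarrow[M \to \infty]{P} \pi(h) \qquad (◮\ \text{Consistency})$$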
Assume moreover that there exists a positive function $\psi$ such that
$$\forall h \in C_\psi,\qquad \sqrt{M}\left(\frac{\sum_{i=1}^{M} h(z_i)/p(z_i)}{\sum_{i=1}^{M} 1/p(z_i)} - \pi(h)\right) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Gamma(h)) .$$
Theorem (continued)
Under the same assumption $\pi(p) > 0$:
ii) If, in addition, $h^2/p \in C_\varphi$ and $h \in C_\psi$, then
$$\sqrt{M}\,(\delta_M^k - \pi(h)) \xrightarrow[M \to \infty]{\mathcal{L}} \mathcal{N}(0, V_k[h - \pi(h)]) , \qquad (◮\ \text{CLT})$$
where $V_k(h) := \pi(p) \int \pi(dz)\, \mathbb{V}\big[\hat\xi_i^k \,\big|\, z\big]\, h^2(z)\,p(z) + \Gamma(h)$.
We will need some additional assumptions. Assume a maximal inequality for the Markov chain $(z_i)_i$: there exists a measurable function $\zeta$ such that, for any starting point $x$,
$$\forall h \in C_\zeta,\qquad \mathbb{P}_x\left( \left| \sup_{0 \le i \le N} \sum_{j=0}^{i} [h(z_j) - \tilde\pi(h)] \right| > \epsilon \right) \le \frac{N\,C_h(x)}{\epsilon^2} .$$
Moreover, assume that there exists $\phi \ge 1$ such that, for any starting point $x$,
$$\forall h \in C_\phi,\qquad \tilde Q^n(x, h) \xrightarrow{P} \tilde\pi(h) = \pi(ph)/\pi(p) .$$
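Theorem
Assume that $h$ is such that $h/p \in C_\zeta$ and $\{C_{h/p},\, h^2/p^2\} \subset C_\phi$. Assume moreover that
$$\sqrt{M}\,(\delta_M^0 - \pi(h)) \xrightarrow{\mathcal{L}} \mathcal{N}(0, V_0[h - \pi(h)]) .$$
Then, for any starting point $x$,
$$\sqrt{M_N}\left(\frac{\sum_{t=1}^{N} h(x^{(t)})}{N} - \pi(h)\right) \xrightarrow[N \to \infty]{\mathcal{L}} \mathcal{N}(0, V_0[h - \pi(h)]) ,$$
where $M_N$ is defined by
$$\sum_{i=1}^{M_N} \hat\xi_i^0 \le N < \sum_{i=1}^{M_N + 1} \hat\xi_i^0 .$$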
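The definition of $M_N$ amounts to locating $N$ among the cumulative sums of the sojourn times $\hat\xi_i^0 = n_i$. A minimal sketch, assuming positive weights and made-up sojourn times (the name `accepted_count` is hypothetical):

```python
from bisect import bisect_right
from itertools import accumulate

def accepted_count(weights, n_iter):
    """Largest M with sum(weights[:M]) <= n_iter, assuming positive weights."""
    return bisect_right(list(accumulate(weights)), n_iter)

sojourns = [2, 3, 1, 2]  # toy values of n_i; cumulative sums are 2, 5, 6, 8
m_at_6 = accepted_count(sojourns, 6)  # 2 + 3 + 1 <= 6 < 2 + 3 + 1 + 2
m_at_4 = accepted_count(sojourns, 4)  # 2 <= 4 < 2 + 3
m_at_1 = accepted_count(sojourns, 1)  # empty sum 0 <= 1 < 2
```

The three calls return 3, 1 and 0. If `n_iter` is at least the total of all weights, the function returns `len(weights)`, in which case the upper bound in the definition cannot be checked from the available weights.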
Figure: Overlay of the variations of 250 iid realisations of the estimates $\delta$ (gold) and $\delta^\infty$ (grey) of $E[X] = 0$ for 1000 iterations, along with the 90% interquantile range for the estimates $\delta$ (brown) and $\delta^\infty$ (pink), in the setting of a random walk Gaussian proposal with scale $\tau = 10$.
Figure: Overlay of the variations of 500 iid realisations of the estimates $\delta$ (deep grey), $\delta^\infty$ (medium grey) and of the importance sampling version (light grey) of $E[X] = 10$ when $X \sim \mathrm{Exp}(0.1)$ for 100 iterations, along with the 90% interquantile ranges (same colour code), in the setting of an independent exponential proposal with scale $\mu = 0.02$.
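$$\pi(x) = \beta(1-\beta)^x \quad\text{and}\quad 2q(y|x) = \begin{cases} \mathbb{I}_{|x-y|=1} & \text{if } x > 0 , \\ \mathbb{I}_{|y| \le 1} & \text{if } x = 0 . \end{cases}$$
For this problem,
$$p(x) = 1 - \beta/2 \quad\text{and}\quad r(x) = 1 - \beta + \beta^2/2 .$$
We can therefore compute the gain in variance
$$\frac{p(x) - r(x)}{2p(x) - r(x)}\,\frac{2 - p(x)}{p^2(x)} = \frac{2\,\beta(1-\beta)(2+\beta)}{(2-\beta^2)(2-\beta)^2} ,$$
which is optimal for $\beta = 0.174$, leading to a gain of 0.578, while the relative gain in variance is
$$\frac{p(x) - r(x)}{2p(x) - r(x)}\,\frac{2 - p(x)}{1 - p(x)} = \frac{(1-\beta)(2+\beta)}{2-\beta^2} ,$$
which is decreasing in $\beta$.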
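The two closed forms follow from substituting $p$ and $r$, and the algebra can be checked exactly with rational arithmetic. The sketch below only verifies the identities and the monotonicity of the relative gain on a grid of $\beta$ values chosen here for illustration; the function names are hypothetical.

```python
from fractions import Fraction

def p(beta):
    return 1 - beta / 2

def r(beta):
    return 1 - beta + beta**2 / 2

def gain(beta):
    """(p - r) / (2p - r) * (2 - p) / p^2, evaluated from p and r."""
    return (p(beta) - r(beta)) / (2 * p(beta) - r(beta)) * (2 - p(beta)) / p(beta) ** 2

def relative_gain(beta):
    """(p - r) / (2p - r) * (2 - p) / (1 - p), evaluated from p and r."""
    return (p(beta) - r(beta)) / (2 * p(beta) - r(beta)) * (2 - p(beta)) / (1 - p(beta))

betas = [Fraction(k, 10) for k in range(1, 10)]  # exact grid 1/10, ..., 9/10
gain_ok = all(
    gain(b) == 2 * b * (1 - b) * (2 + b) / ((2 - b**2) * (2 - b) ** 2) for b in betas
)
relative_ok = all(relative_gain(b) == (1 - b) * (2 + b) / (2 - b**2) for b in betas)
values = [relative_gain(b) for b in betas]
decreasing = all(x > y for x, y in zip(values, values[1:]))
```

Using `Fraction` keeps every comparison exact, so the equality tests are not affected by floating-point rounding.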
a) Rao–Blackwellisation of any HM algorithm with a controlled amount of additional calculation.
b) Link with the importance sampling of Markov chains.
c) Analysis with asymptotic results on triangular arrays.