“reflections on the probability space induced by moment conditions with implications for Bayesian inference”: a discussion
Christian P. Robert, Université Paris-Dauphine, Paris & University of Warwick, Coventry
Outline
what is the question?
what could the question be?
what is the answer?
what could the answer be?
what is the question?
“If one specifies a set of moment functions collected together into a vector m(x, θ) of dimension M, regards θ as random and asserts that some transformation Z(x, θ) has distribution ψ, then what is required to use this information and then possibly a prior to make valid inference?” R. Gallant, p. 4
Priors without efforts
- quest for model-induced priors dating back to the early 1900's [Lhoste, 1923]
- reference priors, such as Jeffreys' prior, induced by the sampling distribution [Jeffreys, 1939]
- fiducial distributions as Fisher's attempted answer [Fisher, 1956]
Fisher’s t fiducial distribution
When considering
t = (x − θ) / (s/√n)
the ratio has a frequentist t distribution with n − 1 degrees of freedom
Fisher’s t fiducial distribution
However, there is no equivalent justification for asserting that
t = (x − θ) / (s/√n)
has a t posterior distribution with n − 1 degrees of freedom on θ, given (x, s), except when using the non-informative and improper prior π(θ, σ²) ∝ 1/σ², since then
θ ∼ Tn−1(x, s/√n)
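This special case can be checked by simulation: under π(θ, σ²) ∝ 1/σ², the posterior factors as σ²|x ∼ InvGamma((n−1)/2, (n−1)s²/2) and θ|σ², x ∼ N(x, σ²/n), so standardized posterior draws should behave like a Tn−1 variate. A minimal sketch, with made-up summary statistics:

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up summary statistics for a sample of size n
n, xbar, s = 10, 1.3, 2.1

# posterior under pi(theta, sigma^2) ∝ 1/sigma^2:
#   sigma^2 | x ~ InvGamma((n-1)/2, (n-1) s^2 / 2)
#   theta | sigma^2, x ~ N(xbar, sigma^2 / n)
N = 200_000
sigma2 = 1.0 / rng.gamma((n - 1) / 2, 2.0 / ((n - 1) * s**2), size=N)
theta = rng.normal(xbar, np.sqrt(sigma2 / n))

# standardized draws should match a t with n - 1 degrees of freedom
t_draws = (theta - xbar) / (s / np.sqrt(n))
print(t_draws.mean())   # close to 0
print(t_draws.var())    # close to (n-1)/(n-3) = 9/7
```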
Fisher’s t fiducial distribution
Furthermore, neither the Bayesian nor the frequentist interpretation implies that
t = (x − θ) / (s/√n)
has a t distribution with n − 1 degrees of freedom jointly in θ and (x, s)
what could the question be?
Given a set of moment equations
E[m(X1, . . . , Xn, θ)] = 0
(where both the Xi's and θ are random), can one derive a likelihood function and a prior distribution compatible with those constraints?
coherence across sample sizes n
Highly complex question, since it implies that the integral equation
∫Θ×Xn m(x1, . . . , xn, θ) π(θ) f(x1|θ) · · · f(xn|θ) dθ dx1 · · · dxn = 0
must or should have a solution in (π, f) for all n's.
Is this possible outside of a likelihood × prior modelling?
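For at least one toy moment function the equation does have a solution: with m(x, θ) = x − θ, the pair π = N(0, 1) and f(·|θ) = N(θ, 1) satisfies the constraint for every n. A quick Monte Carlo check of this (illustrative choices only, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy solution for m(x, theta) = x - theta:
# prior pi = N(0, 1), sampling density f(.|theta) = N(theta, 1)
theta = rng.normal(0.0, 1.0, size=500_000)   # theta ~ pi
x = rng.normal(theta, 1.0)                   # x | theta ~ f(.|theta)

# the integral of m against pi(theta) f(x|theta) should vanish
print(np.mean(x - theta))   # close to 0
```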
Zellner’s Bayesian method of moments
Given moment conditions on the parameters θ and σ²
E[θ|x1, . . . , xn] = x̄n   E[σ²|x1, . . .] = s²n   var(θ|σ², x1, . . .) = σ²/n
derivation of a maximum entropy posterior
θ|σ², x1, . . . ∼ N(x̄n, σ²/n)   σ⁻²|x1, . . . ∼ Exp(s²n)
[Zellner, 1996]
but incompatible with the corresponding predictive distribution [Geisser & Seidenfeld, 1999]
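The maximum entropy step can be illustrated on its own: among all densities with a fixed variance, the Gaussian has the largest differential entropy, which is what singles out the normal conditional above. A small check using closed-form entropies at unit variance (unrelated to any particular dataset):

```python
import numpy as np

# differential entropies of three unit-variance densities (closed forms)
h_normal = 0.5 * np.log(2 * np.pi * np.e)      # N(0, 1)
h_laplace = 1.0 + np.log(2.0 / np.sqrt(2.0))   # Laplace, scale b = 1/sqrt(2)
h_uniform = np.log(np.sqrt(12.0))              # Uniform, width sqrt(12)

# the Gaussian maximizes entropy under a variance constraint
print(h_normal, h_laplace, h_uniform)
```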
what is the answer?
Under the condition that Z(·, θ) is surjective,
p⋆(x|θ) = ψ(Z(x, θ))
and an arbitrary choice of prior π(θ)
- lhs and rhs operate on different spaces
- no reason why the density ψ should integrate against Lebesgue measure in the n-dimensional Euclidean space
- no direct connection with a genuine likelihood function, i.e., a product of the densities of the Xi's (conditional on θ)
what could the answer be?
“A common situation that requires consideration of the notions that follow is that deriving the likelihood from a structural model is analytically intractable and one cannot verify that the numerical approximations one would have to make to circumvent the intractability are sufficiently accurate.” R. Gallant, p. 7
Approximative Bayesian answers
Defining the joint distribution on (θ, x1, . . . , xn) through moment equations prevents regular Bayesian inference, as the likelihood is unavailable; there may however be alternatives available:
- approximate Bayesian computation (ABC) and empirical-likelihood-based Bayesian inference [Tavaré et al., 1999; Owen, 2001; Mengersen et al., 2013]
- INLA (Laplace), EP (expectation propagation) [Martino et al., 2008; Barthelmé & Chopin, 2014]
- variational Bayes [Jaakkola & Jordan, 2000]
Bayesian approximative answers
- Using a fake likelihood does not prohibit Bayesian analysis, as shown in the paper with the model in eqn. (45)
- However, this requires a case-by-case consistency analysis, since pseudo-likelihoods do not offer the same guarantees
- Example of ABC model choice based on insufficient statistics [Marin et al., 2014]
Empirical likelihood (EL)
Dataset x made of n independent replicates x = (x1, . . . , xn) of a rv X ∼ F
Generalized moment condition pseudo-model
EF[h(X, φ)] = 0,
where h is a known function and φ an unknown parameter
Induced empirical likelihood
Lel(φ|x) = max_p ∏_{i=1}^{n} pi
for all p such that 0 ≤ pi ≤ 1, ∑i pi = 1, ∑i pi h(xi, φ) = 0
[Owen, 1988, B'ka, and Empirical Likelihood, 2001]
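For the simplest moment function h(x, φ) = x − φ, the profile Lel can be computed in a few lines by solving the dual problem in the Lagrange multiplier. A sketch of one possible implementation, using a damped Newton iteration and returning log Lel:

```python
import numpy as np

def el_log_likelihood(x, phi, tol=1e-10, max_iter=100):
    """Profile empirical log-likelihood for the mean constraint
    h(x, phi) = x - phi: maximize sum_i log p_i subject to
    sum_i p_i = 1 and sum_i p_i (x_i - phi) = 0.

    The maximizing weights are p_i = 1 / (n (1 + lam (x_i - phi))),
    with the Lagrange multiplier lam solving
        sum_i (x_i - phi) / (1 + lam (x_i - phi)) = 0.
    """
    z = np.asarray(x, dtype=float) - phi
    if z.min() >= 0 or z.max() <= 0:
        return -np.inf              # phi outside the convex hull: L_el = 0
    n, lam = len(z), 0.0
    for _ in range(max_iter):
        denom = 1.0 + lam * z
        step = np.sum(z / denom) / np.sum(z**2 / denom**2)  # Newton step
        lam += step
        while np.min(1.0 + lam * z) <= 0:   # damp to keep all p_i > 0
            step /= 2.0
            lam -= step
        if abs(step) < tol:
            break
    return -np.sum(np.log(n * (1.0 + lam * z)))

x = np.arange(10.0)
print(el_log_likelihood(x, x.mean()))   # maximum: -n log n = -23.0258...
print(el_log_likelihood(x, 3.0))        # smaller away from the sample mean
```

At φ = x̄ the multiplier is zero and all weights equal 1/n, so Lel attains its maximal value n⁻ⁿ; outside the convex hull of the data no valid weight vector exists and Lel = 0.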
Raw ABCel sampler
Naïve implementation: act as if EL was an exact likelihood [Lazar, 2003, B'ka]
for i = 1 → N do
  generate φi from the prior distribution π(·)
  set the weight ωi = Lel(φi|xobs)
end for
return (φi, ωi), i = 1, . . . , N
- Output: weighted sample of size N
[Mengersen et al., 2013, PNAS]
- Performance evaluated through the effective sample size
ESS = 1 / ∑_{i=1}^{N} ( ωi / ∑_{j=1}^{N} ωj )²
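Both the raw sampler and the ESS diagnostic fit in a dozen lines. The sketch below uses a mean-constraint EL, h(x, φ) = x − φ, with an illustrative N(0, 10²) prior and synthetic data; none of these choices come from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)

def el_log_lik(x, phi, iters=100):
    """Profile empirical log-likelihood for h(x, phi) = x - phi."""
    z = x - phi
    if z.min() >= 0 or z.max() <= 0:
        return -np.inf                    # phi outside the data's convex hull
    lam = 0.0
    for _ in range(iters):                # damped Newton on the multiplier
        denom = 1.0 + lam * z
        step = np.sum(z / denom) / np.sum(z**2 / denom**2)
        lam += step
        while np.min(1.0 + lam * z) <= 0:
            step /= 2.0
            lam -= step
    return -np.sum(np.log(len(x) * (1.0 + lam * z)))

x_obs = rng.normal(2.0, 1.0, size=50)     # synthetic observed sample

# raw ABCel: simulate from the prior, weight each draw by the EL
N = 2000
phi = rng.normal(0.0, 10.0, size=N)       # prior draws, phi ~ N(0, 10^2)
log_w = np.array([el_log_lik(x_obs, p) for p in phi])
w = np.exp(log_w - log_w.max())           # stabilized weights
w /= w.sum()

ess = 1.0 / np.sum(w**2)                  # effective sample size
post_mean = np.sum(w * phi)               # close to the sample mean of x_obs
print(ess, post_mean)
```

Most prior draws fall outside the convex hull of the data and receive weight zero, so the ESS comes out far below N, which is exactly what the diagnostic is meant to reveal; the weighted mean still lands near the sample mean of x_obs.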