Computing Bayesian posterior with empiricallikelihood in population genetics
Pierre Pudlo
INRA & U. Montpellier 2
SSB, 18 sept. 2012
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 1 / 29
Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 2 / 29
Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 3 / 29
When the likelihood is not completely known
Likelihood intractable: `(y|φ) =
∫U`(y,u|φ)du
where u is a latent process
I Hidden Markov and other dynamic models: latent processwhich is not observed
↪→ Classical answer: Markov chain Monte Carlo,. . .
I Population genetics: the whole gene genealogy is unobservedLikelihood is an integral over
I all possible gene genealogiesI all possible mutations along the genealogies
↪→ Classical answer: Approximate Bayesian computation (ABC)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 4 / 29
When the likelihood is not completely known
Likelihood intractable: `(y|φ) =
∫U`(y,u|φ)du
where u is a latent processI Hidden Markov and other dynamic models: latent process
which is not observed
↪→ Classical answer: Markov chain Monte Carlo,. . .
I Population genetics: the whole gene genealogy is unobservedLikelihood is an integral over
I all possible gene genealogiesI all possible mutations along the genealogies
↪→ Classical answer: Approximate Bayesian computation (ABC)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 4 / 29
When the likelihood is not completely known
Likelihood intractable: `(y|φ) =
∫U`(y,u|φ)du
where u is a latent processI Hidden Markov and other dynamic models: latent process
which is not observed
↪→ Classical answer: Markov chain Monte Carlo,. . .
I Population genetics: the whole gene genealogy is unobservedLikelihood is an integral over
I all possible gene genealogiesI all possible mutations along the genealogies
↪→ Classical answer: Approximate Bayesian computation (ABC)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 4 / 29
ABC in a nutshell
Posterior distribution is the conditional distribution ofπ(φ)`(x|φ) (∗)
knowing that x = xobs
MethodologyDraw a (large) set of particles (φi, xi) from (∗) and use anonparametric estimate of the conditional density
π(φ|xobs) ∝ π(φ)`(xobs|φ)
Seminal papersI Tavare, Balding, Griffith and Donnelly (1997, Genetics)I Pritchard, Seielstad, Perez-Lezuan, Feldman (1999, Molecular
Biology and Evolution)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 5 / 29
ABC in a nutshellPosterior distribution is the conditional distribution of
π(φ)`(x|φ) (∗)knowing that x = xobs
MethodologyDraw a (large) set of particles (φi, xi) from (∗) and use anonparametric estimate of the conditional density
π(φ|xobs) ∝ π(φ)`(xobs|φ)
Shortcomings.I time consuming – If simulation of the latent process is not
straightforward
I curse of dimensionality vs. lost of information –I If x lies in a high dimensional space X (often), we are unable to
estimate of the conditional densityI Hence, we project the (observed and simulated) datasets on a
space with smaller dimension (trough summary statistics)η : X → Rd (summary statistics)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 5 / 29
ABC in a nutshellPosterior distribution is the conditional distribution of
π(φ)`(x|φ) (∗)knowing that x = xobs
MethodologyDraw a (large) set of particles (φi, xi) from (∗) and use anonparametric estimate of the conditional density
π(φ|xobs) ∝ π(φ)`(xobs|φ)
Shortcomings.I time consuming – If simulation of the latent process is not
straightforwardI curse of dimensionality vs. lost of information –
I If x lies in a high dimensional space X (often), we are unable toestimate of the conditional density
I Hence, we project the (observed and simulated) datasets on aspace with smaller dimension (trough summary statistics)η : X → Rd (summary statistics)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 5 / 29
You went to school to learn, girl (. . . )Why 2 plus 2 makes fourNow, now, now, I’m gonna teach you (. . . )
All you gotta do is repeat after me!A, B, C!It’s easy as 1, 2, 3!Or simple as Do, re, mi! (. . . )
SurveysI Beaumont (2010, Annual Review of Ecology, Evolution, and
Systematics)I Lopes, Beaumont (2010, Genetics and Evolution)I Csillery, Blum, Gaggiotti, Francois (2010, Trends in Ecology and
Evolution)I Marin, Pudlo, Robert, Ryder (to appear, Statis. and Comput.)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 6 / 29
Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 7 / 29
Empirical likelihood (EL)
Owen (1988, Biometrika),Owen (2001, Chapman &Hall)
EmpiricalLikelihood
Data
Parameters' definition in term of a moment constraint
x = (x1, ..., xK)
xProfiled likelihood
xi
Black boxInput:I Independent data
x = (x1, . . . , xK)I Definition of the parameters
E(h(xi,φ)) = 0Output:I A profiled “likelihood function”
Lel(φ|x)
EL do not require a fully defined andoften complex (hence debatable)parametric model
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 8 / 29
Empirical likelihood (EL)
Owen (1988, Biometrika), Owen (2001, Chapman & Hall)
Assume that the dataset x is composed of n independent replicatesx = (x1, . . . , xn) of some X ∼ F
Generalized moment condition modelThe law F of X satisfy
EF[h(X,φ)
]= 0,
where h is a known function, and φ an unknown parameter
Empirical likelihood
Lel(φ|x) = maxp
n∏i=1
pi (∗)
for all p such that 0 6 pi 6 1,∑pi = 1,
∑i pih(xi,φ) = 0.
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 9 / 29
Empirical likelihood (EL)
Empirical likelihood
Lel(φ|x) = maxp
n∏i=1
pi (∗)
for all p such that 0 6 pi 6 1,∑pi = 1,
∑i pih(xi,φ) = 0.
Why (∗)?
(1) L(p) :=n∏i=1
pi is the likehood of
discrete distribution p =∑i piδxi
(2)∑i
pih(xi,φ) = 0 implies that p
fulfils the moment condition
How to compute?
Newton-Lagrange scheme.
The constraint is linear (in p)
See Owen (2001) chap. 12& package emplik in R
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 10 / 29
EL examples
EmpiricalLikelihood
Data
Parameters' definition in term of a moment constraint
x = (x1, ..., xK)
xProfiled likelihood
xi
Simple case: φ = meanh(xi,φ) = xi − φ
I Lel(φ|x) is maximal atφ = 1
K(x1 + · · ·+ xK)
I if enough data (K large),& if data from a true parametricmodel,then Lel ≈ true likelihood
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 11 / 29
EL examples
EmpiricalLikelihood
Data
Parameters' definition in term of a moment constraint
x = (x1, ..., xK)
xProfiled likelihood
xi
h(xi,φ) = xi − φred = param. likelihoodblack = Lel
Example 1: Gaussian modelK = 20
1.0 1.5 2.0 2.5 3.0
0.0
0.2
0.4
0.6
0.8
1.0
theta
el.n
orm
al
K = 200
1.0 1.5 2.0 2.5 3.0
0.0
0.2
0.4
0.6
0.8
1.0
theta
el.n
orm
al
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 12 / 29
EL examples
EmpiricalLikelihood
Data
Parameters' definition in term of a moment constraint
x = (x1, ..., xK)
xProfiled likelihood
xi
red = true likelihoodblue = Gaussian likelihoodblack = Lel
Example 2: non Gaussian modelK = 200
0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
0.0
0.2
0.4
0.6
0.8
1.0
theta
el.n
orm
al
I Lel is better than a false model.
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 13 / 29
EL examples
EmpiricalLikelihood
Data
Parameters' definition in term of a moment constraint
x = (x1, ..., xK)
xProfiled likelihood
xi
red = true likelihoodblue = Gaussian likelihoodblack = Lel
Example 2: non Gaussian modelK = 200
0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
0.0
0.2
0.4
0.6
0.8
1.0
theta
el.n
orm
al
I Lel is better than a false model.
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 13 / 29
Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 14 / 29
Neutral model at a given microsatellite locus, in aclosed panmictic population at equilibrium
Sample of 8 genes
Mutations according tothe Simple stepwiseMutation Model (SMM)• date of the mutations ∼
Poisson process withintensity θ/2 over thebranches• MRCA = 100• independent mutations:±1 with pr. 1/2
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 15 / 29
Neutral model at a given microsatellite locus, in aclosed panmictic population at equilibrium
Kingman’s genealogyWhen time axis isnormalized,T(k) ∼ Exp(k(k− 1)/2)
Mutations according tothe Simple stepwiseMutation Model (SMM)• date of the mutations ∼
Poisson process withintensity θ/2 over thebranches• MRCA = 100• independent mutations:±1 with pr. 1/2
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 15 / 29
Neutral model at a given microsatellite locus, in aclosed panmictic population at equilibrium
Kingman’s genealogyWhen time axis isnormalized,T(k) ∼ Exp(k(k− 1)/2)
Mutations according tothe Simple stepwiseMutation Model (SMM)• date of the mutations ∼
Poisson process withintensity θ/2 over thebranches
• MRCA = 100• independent mutations:±1 with pr. 1/2
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 15 / 29
Neutral model at a given microsatellite locus, in aclosed panmictic population at equilibrium
Observations: leafs of the treeθ = ?
Kingman’s genealogyWhen time axis isnormalized,T(k) ∼ Exp(k(k− 1)/2)
Mutations according tothe Simple stepwiseMutation Model (SMM)• date of the mutations ∼
Poisson process withintensity θ/2 over thebranches• MRCA = 100• independent mutations:±1 with pr. 1/2
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 15 / 29
Much more interesting models. . .
I several independent locusIndependent gene genealogies and mutations
I different populationslinked by an evolutionary scenario made of divergences,admixtures, migrations between populations, etc.
I larger sample sizeusually between 50 and 100 genes
A typical evolutionary scenario:
MRCA
POP 0 POP 1 POP 2
τ1
τ2
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 16 / 29
Moment condition in population genetics?
Main difficultyDerive a constraint
EF[h(X,φ)
]= 0,
on the parameters of interest φ when X is the genotypes of the sampleof individuals at a given locus
E.g., in phylogeography, φ is composed ofI dates of divergence between populations,I ratio of population sizes,I mutation rates, etc.
None of them are moments of the distribution of the allelic states of thesample
↪→ h = pairwise composite scores whose zero is the pairwisemaximum likelihood estimator
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 17 / 29
Moment condition in population genetics?
Main difficultyDerive a constraint
EF[h(X,φ)
]= 0,
on the parameters of interest φ when X is the genotypes of the sampleof individuals at a given locus
E.g., in phylogeography, φ is composed ofI dates of divergence between populations,I ratio of population sizes,I mutation rates, etc.
None of them are moments of the distribution of the allelic states of thesample
↪→ h = pairwise composite scores whose zero is the pairwisemaximum likelihood estimator
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 17 / 29
Pairwise composite likelihood?
The intra-locus pairwise likelihood
`2(xk|φ) =∏i<j
`2(xik, xjk|φ)
with x1k, . . . , xnk : allelic states of the gene sample at the k-th locus
Sample of size 2⇒ coalescent tree is a single line joining xik to xjk⇒ closed form of `2(xik, xjk|φ)
The pairwise score function
∇φ log `2(xk|φ) =∑i<j
∇φ log `2(xik, xjk|φ)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 18 / 29
Composite lik , approximation of lik� Composite likelihoods are often much more narrow than the
distribution of the model (See, e.g., Cooley et al., 2012)
Hence,I produces too narrow posterior distributionsI confidence intervals or credibility intervals are not reliable
But,
Eφ(∇φ log `2(x|φ)) =∫∇φ
[log `2(x|φ)
]`(x|φ)dx
=∑i<j
∫∇φ
[log `2(xi, xj|φ)
]`2(xi, xj|φ)dxidxj
=∑i<j
∫ ∇φ`2(xi, xj|φ)`2(xi, xj|φ)
`2(xi, xj|φ)dxidxj
=∑i<j
∇φ[ ∫`2(xi, xj|φ)dxidxj
]= 0
Safe with EL because we only use position of its modePudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 19 / 29
Pairwise likelihood: a simple case
Assumptions
I sample ⊂ closed, panmicticpopulation at equilibrium
I marker: microsatelliteI mutation rate: θ/2
if xik et xjk are two genes of thesample,
`2(xik, xjk|θ) depends only on
δ = xik − xjk
`2(δ|θ) =1√1+ 2θ
ρ (θ)|δ|
withρ(θ) =
θ
1+ θ+√1+ 2θ
Pairwise score function∂θ log `2(δ|θ) =
−1
1+ 2θ+
|δ|
θ√1+ 2θ
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 20 / 29
Pairwise likelihood: a simple case
Assumptions
I sample ⊂ closed, panmicticpopulation at equilibrium
I marker: microsatelliteI mutation rate: θ/2
if xik et xjk are two genes of thesample,
`2(xik, xjk|θ) depends only on
δ = xik − xjk
`2(δ|θ) =1√1+ 2θ
ρ (θ)|δ|
withρ(θ) =
θ
1+ θ+√1+ 2θ
Pairwise score function∂θ log `2(δ|θ) =
−1
1+ 2θ+
|δ|
θ√1+ 2θ
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 20 / 29
Pairwise likelihood: a simple case
Assumptions
I sample ⊂ closed, panmicticpopulation at equilibrium
I marker: microsatelliteI mutation rate: θ/2
if xik et xjk are two genes of thesample,
`2(xik, xjk|θ) depends only on
δ = xik − xjk
`2(δ|θ) =1√1+ 2θ
ρ (θ)|δ|
withρ(θ) =
θ
1+ θ+√1+ 2θ
Pairwise score function∂θ log `2(δ|θ) =
−1
1+ 2θ+
|δ|
θ√1+ 2θ
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 20 / 29
Pairwise likelihood: 2 diverging populations
MRCA
POP a POP b
τ
AssumptionsI τ: divergence date of pop.a and b
I θ/2: mutation rateLet xik and xjk be two genescoming resp. from pop. a and bSet δ = xik − x
jk.
Then `2(δ|θ, τ) =e−τθ√1+ 2θ
+∞∑k=−∞ ρ(θ)
|k|Iδ−k(τθ).
whereIn(z) nth-order modified Besselfunction of the first kind
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 21 / 29
Pairwise likelihood: 2 diverging populations
MRCA
POP a POP b
τ
AssumptionsI τ: divergence date of pop.a and b
I θ/2: mutation rateLet xik and xjk be two genescoming resp. from pop. a and bSet δ = xik − x
jk.
A 2-dim score function∂τ log `2(δ|θ, τ) =
−θ+θ
2
`2(δ− 1|θ, τ) + `2(δ+ 1|θ, τ)`2(δ|θ, τ)
∂θ log `2(δ|θ, τ) =
−τ−1
1+ 2θ+
τ
2
`2(δ− 1|θ, τ) + `2(δ+ 1|θ, τ)`2(δ|θ, τ)
+
q(δ|θ, τ)`2(δ|θ, τ)
whereq(δ|θ, τ) :=
e−τθ√1+ 2θ
ρ ′(θ)
ρ(θ)
∞∑k=−∞ |k|ρ(θ)|k|Iδ−k(τθ)
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 21 / 29
Table of contents
1 Likelihood free methods
2 Empirical likelihood
3 Population genetics
4 Numerical experiments
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 22 / 29
A first experimentEvolutionary scenario:
MRCA
POP 0 POP 1
τ
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = (log10 θ, log10 τ)I uniform prior over(−1., 1.5)× (−1., 1.)
Comparison of the originalABC with ABCel
ESS=7034
log(theta)
Den
sity
0.00 0.05 0.10 0.15 0.20 0.25
05
1015
log(tau1)
Den
sity
−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3
01
23
45
67
histogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 23 / 29
A first experimentEvolutionary scenario:
MRCA
POP 0 POP 1
τ
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = (log10 θ, log10 τ)I uniform prior over(−1., 1.5)× (−1., 1.)
Comparison of the originalABC with ABCel
ESS=7034
log(theta)
Den
sity
0.00 0.05 0.10 0.15 0.20 0.25
05
1015
log(tau1)
Den
sity
−0.3 −0.2 −0.1 0.0 0.1 0.2 0.3
01
23
45
67
histogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 23 / 29
ABC vs. ABCel on 100 replicates of the 1st experiment
Accuracy:log10 θ log10 τ
ABC ABCel ABC ABCel(1) 0.097 0.094 0.315 0.117(2) 0.071 0.059 0.272 0.077(3) 0.68 0.81 1.0 0.80
(1) Root Mean Square Error of the posterior mean(2) Median Absolute Deviation of the posterior median(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)I ABC ≈ 4 hoursI ABCel≈ 2 minutes
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 24 / 29
Second experimentEvolutionary scenario:
MRCA
POP 0 POP 1 POP 2
τ1
τ2
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = log10(θ, τ1, τ2)I non-informative uniform prior
Comparison of the original ABCwith ABCelhistogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 25 / 29
Second experimentEvolutionary scenario:
MRCA
POP 0 POP 1 POP 2
τ1
τ2
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = log10(θ, τ1, τ2)I non-informative uniform prior
Comparison of the original ABCwith ABCel
histogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 25 / 29
Second experimentEvolutionary scenario:
MRCA
POP 0 POP 1 POP 2
τ1
τ2
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = log10(θ, τ1, τ2)I non-informative uniform prior
Comparison of the original ABCwith ABCel
histogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 25 / 29
Second experimentEvolutionary scenario:
MRCA
POP 0 POP 1 POP 2
τ1
τ2
Dataset:I 50 genes per populations,I 100 microsat. loci
Assumptions:I Ne identical over all
populationsI φ = log10(θ, τ1, τ2)I non-informative uniform prior
Comparison of the original ABCwith ABCel
histogram = ABCelcurve = original ABCvertical line = “true” parameter
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 25 / 29
ABC vs. ABCel on 100 replicates of the 2ndexperimentAccuracy:
log10 θ log10 τ1 log10 τ2ABC ABCel ABC ABCel ABC ABCel
(1) 0.0059 0.0794 0.472 0.483 29.3 4.76(2) 0.048 0.053 0.32 0.28 4.13 3.36(3) 0.79 0.76 0.88 0.76 0.89 0.79
(1) Root Mean Square Error of the posterior mean(2) Median Absolute Deviation of the posterior median(3) Coverage of the credibility interval of probability 0.8
Computation time: on a recent 6-core computer (C++/OpenMP)I ABC ≈ 6 hoursI ABCel≈ 8 minutes
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 26 / 29
Why?
On large datasets, ABCel gives more accurate results than ABC
ABC simplifies the dataset through summary statisticsDue to the large dimension of x, the original ABC algorithm estimates
π(θ∣∣∣η(xobs)
),
where η(xobs) is some (non-linear) projection of the observed dataseton a space with smaller dimension↪→ Some information is lost
ABCelsimplifies the model through a generalized moment conditionmodel.↪→ Here, the moment condition model is based on pairwisecomposition likelihood
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 27 / 29
Joint work withI Christian P. Robert (U. Dauphine & IUF)I Kerrie Mengersen (QUT, Australia)I Raphael Leblois (INRA CBGP, Montpellier)
Grant from ANR throughProject “Emile”
First preprint on arXivApproximate Bayesian computation viaempirical likelihood
Coming soon: population genetic modelswhich are too slow to simulate
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 28 / 29
Joint work withI Christian P. Robert (U. Dauphine & IUF)I Kerrie Mengersen (QUT, Australia)I Raphael Leblois (INRA CBGP, Montpellier)
Grant from ANR throughProject “Emile”
First preprint on arXivApproximate Bayesian computation viaempirical likelihood
Coming soon: population genetic modelswhich are too slow to simulate
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 28 / 29
Empirical likelihood
Wilk’s theorem to EL likelihood ratio.
Theorem (Owen, 1991)Let X,X1, . . . ,Xn be iid random vector with common distribution F0.For φ ∈ Θ, x ∈X , let h(x,φ) ∈ Rs.Let φ0 ∈ Θ be such that Var(h(X, θ0)) is finite and has rank q > 0.
If φ0 satisfies E(h(X,φ0)) = 0, then −2 log(Lel(φ0|X1:n)
n−n
) → χ2(q)
Note thatI n−n is the maximum value of Lel
I no condition to ensure a unique value for θ0I no condition to make φel = argmaxLel a good estimate of φ0I provides tests and confidence intervals
Pudlo (INRA & U. Montpellier 2) ABCel GenPop SSB 09/2012 29 / 29
Top Related