Bayesian inference

J. Daunizeau
Institute of Empirical Research in Economics, Zurich, Switzerland
Brain and Spine Institute, Paris, France
Overview of the talk
1 Probabilistic modelling and representation of uncertainty
1.1 Bayesian paradigm
1.2 Hierarchical models
1.3 Frequentist versus Bayesian inference
2 Numerical Bayesian inference methods
2.1 Sampling methods
2.2 Variational methods (ReML, EM, VB)
3 SPM applications
3.1 aMRI segmentation
3.2 Decoding of brain images
3.3 Model-based fMRI analysis (with spatial priors)
3.4 Dynamic causal modelling
Bayesian paradigm: probability theory basics

Degrees of plausibility desiderata:
- should be represented using real numbers (D1)
- should conform with intuition (D2)
- should be consistent (D3)

(Figure: Venn diagram of the events a = 2 and b = 5.)

• normalization: $\sum_a p(a) = 1$
• marginalization: $p(a) = \sum_b p(a, b)$
• conditioning (Bayes rule): $p(a \mid b) = \dfrac{p(a, b)}{p(b)}$
Bayesian paradigm: deriving the likelihood function

- Model of data with unknown parameters: $y = f(\theta)$, e.g., GLM: $f(\theta) = X\theta$
- But data are noisy: $y = f(\theta) + \varepsilon$
- Assume the noise/residuals are "small", e.g. Gaussian:
$$p(\varepsilon) \propto \exp\left(-\frac{1}{2\sigma^2}\,\varepsilon^2\right), \qquad P(|\varepsilon| > 2\sigma) \approx 0.05$$
→ Distribution of the data, given fixed parameters (the likelihood):
$$p(y \mid \theta) \propto \exp\left(-\frac{1}{2\sigma^2}\,\big(y - f(\theta)\big)^2\right)$$
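As a minimal numerical sketch of this Gaussian GLM likelihood (the design matrix, parameter values and noise level below are made up for illustration, not from the slides):

```python
import numpy as np

def glm_log_likelihood(y, X, theta, sigma):
    """Gaussian log-likelihood of data y under the GLM y = X @ theta + noise."""
    residuals = y - X @ theta
    n = y.size
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - residuals @ residuals / (2 * sigma**2)

# Hypothetical example: 2-regressor design, true theta = [1.0, -0.5], sigma = 0.3
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
theta_true = np.array([1.0, -0.5])
y = X @ theta_true + 0.3 * rng.normal(size=100)

print(glm_log_likelihood(y, X, theta_true, sigma=0.3))   # high at the true parameters
print(glm_log_likelihood(y, X, np.zeros(2), sigma=0.3))  # much lower away from them
```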
Bayesian paradigm: likelihood, priors and the model evidence

Likelihood: $p(y \mid \theta, m)$
Prior: $p(\theta \mid m)$
Bayes rule: $p(\theta \mid y, m) = \dfrac{p(y \mid \theta, m)\, p(\theta \mid m)}{p(y \mid m)}$

Together, likelihood and prior define the generative model m.
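A toy grid-based illustration of Bayes rule (my own example, not the slides': one noisy observation of a scalar parameter, Gaussian prior and likelihood):

```python
import numpy as np
from scipy.stats import norm

# Grid over theta; prior p(theta|m) = N(0, 2^2); observation y = theta + N(0, 1)
theta = np.linspace(-10, 10, 2001)
dtheta = theta[1] - theta[0]
prior = norm.pdf(theta, loc=0.0, scale=2.0)
y_obs = 1.5
likelihood = norm.pdf(y_obs, loc=theta, scale=1.0)   # p(y|theta,m) as a function of theta

evidence = np.sum(likelihood * prior) * dtheta       # p(y|m), the normalizing constant
posterior = likelihood * prior / evidence            # Bayes rule
print("posterior mean:", np.sum(theta * posterior) * dtheta)  # analytic value: 1.2
print("model evidence:", evidence)                   # analytic value: N(1.5; 0, 5) ≈ 0.142
```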
Bayesian paradigm: forward and inverse problems

forward problem: the likelihood $p(y \mid \theta, m)$
inverse problem: the posterior distribution $p(\theta \mid y, m)$
Bayesian paradigm: model comparison

Principle of parsimony ("Occam's razor"): « plurality should not be assumed without necessity »

Model evidence:
$$p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta$$

(Figure: model fits y = f(x), and the model evidence p(y|m) plotted over the space of all data sets; a simple model concentrates its evidence on few data sets, a complex one spreads it thinly.)
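The Occam effect can be made concrete with a toy example (mine, not the slides'): two models of a single scalar observation that differ only in how widely their priors spread the same likelihood:

```python
import numpy as np
from scipy.stats import norm

def evidence(y, prior_scale, sigma=1.0):
    """p(y|m) = ∫ N(y; theta, sigma^2) N(theta; 0, prior_scale^2) dtheta (closed form)."""
    return norm.pdf(y, loc=0.0, scale=np.sqrt(sigma**2 + prior_scale**2))

y = 0.5
print("simple model  (prior sd 0.5):", evidence(y, 0.5))   # higher: parsimonious
print("complex model (prior sd 10): ", evidence(y, 10.0))  # lower: evidence spread thin
```

The complex model can explain many more data sets, so it must spread its evidence over all of them; when the data are unremarkable, the parsimonious model wins.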
Hierarchical models: principle

(Figure: a hierarchy of levels linked by causality, each level generating the one below.)

Hierarchical models: directed acyclic graphs (DAGs)

Hierarchical models: univariate linear hierarchical model

(Figure: prior densities at each level of the hierarchy are updated into posterior densities.)
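For the univariate linear case, the first-level posterior has a closed form: precisions add, and the posterior mean is a precision-weighted average of the data and the prediction from the level above. A small sketch with made-up numbers:

```python
import numpy as np

# Two-level univariate linear hierarchical model (toy numbers, mine):
#   level 1: y     = theta + eps1,  eps1 ~ N(0, s1^2)   (likelihood)
#   level 2: theta = mu0  + eps2,   eps2 ~ N(0, s2^2)   (prior from the level above)
y, s1 = 2.0, 1.0
mu0, s2 = 0.0, 2.0

pi1, pi2 = 1 / s1**2, 1 / s2**2          # precisions (inverse variances)
post_prec = pi1 + pi2                     # precisions add
post_mean = (pi1 * y + pi2 * mu0) / post_prec  # precision-weighted average
print(post_mean, 1 / np.sqrt(post_prec))  # 1.6, 0.894
```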
Frequentist versus Bayesian inference: a (quick) note on hypothesis testing

classical SPM:
• define the null, e.g.: $H_0 : \theta = 0$
• estimate parameters (obtain test statistic): $t \equiv t(Y)$, with observed value $t^*$
• apply decision rule, i.e.: if $P(t > t^* \mid H_0) \leq \alpha$ then reject H0
(Figure: the null distribution $p(t \mid H_0)$.)

Bayesian PPM:
• define the null, e.g.: $H_0 : \theta > 0$
• invert model (obtain posterior pdf $p(\theta \mid y)$)
• apply decision rule, i.e.: if $P(H_0 \mid y) \geq \alpha$ then accept H0
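A sketch of the PPM decision rule on a toy Gaussian posterior (the posterior moments and the probability threshold below are invented for illustration):

```python
from scipy.stats import norm

# Bayesian PPM decision on a toy posterior p(theta|y) = N(m, s^2); null H0: theta > 0.
# Here alpha plays the role of the posterior probability threshold from the slide.
m, s, alpha = 0.8, 0.5, 0.95
p_H0 = 1.0 - norm.cdf(0.0, loc=m, scale=s)   # P(theta > 0 | y)
print(f"P(H0|y) = {p_H0:.3f}", "-> accept H0" if p_H0 >= alpha else "-> do not accept H0")
```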
Frequentist versus Bayesian inference: what about bilateral tests?

• define the null and the alternative hypothesis in terms of priors, e.g.:
$$H_0 : p(\theta \mid H_0) = \begin{cases} 1 & \text{if } \theta = 0 \\ 0 & \text{otherwise} \end{cases}
\qquad H_1 : p(\theta \mid H_1) = N(0, \Sigma)$$
• apply decision rule, i.e.: if $\dfrac{P(H_0 \mid y)}{P(H_1 \mid y)} \leq 1$ then reject H0

(Figure: the marginal likelihoods $p(y \mid H_0)$ and $p(y \mid H_1)$ over the space of all data sets.)

• Savage-Dickey ratios (nested models, i.i.d. priors):
$$p(y \mid H_0) = p(y \mid \theta = 0, H_1)
\quad\Rightarrow\quad
\frac{p(y \mid H_0)}{p(y \mid H_1)} = \frac{p(\theta = 0 \mid y, H_1)}{p(\theta = 0 \mid H_1)}$$
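A numerical check of the Savage-Dickey identity in the conjugate Gaussian case (toy numbers, mine): the ratio of posterior to prior density at θ = 0 equals the direct evidence ratio:

```python
import numpy as np
from scipy.stats import norm

# H1: theta ~ N(0, s0^2), y = theta + N(0, sigma^2); H0 is nested: theta = 0.
s0, sigma, y = 2.0, 1.0, 1.5

# Exact posterior under H1 is Gaussian (conjugacy):
post_var = 1 / (1 / s0**2 + 1 / sigma**2)
post_mean = post_var * y / sigma**2

# Bayes factor B01 = p(theta=0 | y, H1) / p(theta=0 | H1)  (Savage-Dickey)
bf01_sd = norm.pdf(0, post_mean, np.sqrt(post_var)) / norm.pdf(0, 0, s0)

# Check against the direct evidence ratio p(y|H0) / p(y|H1):
bf01_direct = norm.pdf(y, 0, sigma) / norm.pdf(y, 0, np.sqrt(sigma**2 + s0**2))
print(bf01_sd, bf01_direct)  # identical up to numerical error (≈ 0.909)
```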
Overview of the talk
2 Numerical Bayesian inference methods
2.1 Sampling methods
2.2 Variational methods (ReML, EM, VB)
Sampling methods: MCMC example (Gibbs sampling)
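The Gibbs-sampling figure does not survive transcription; as a stand-in, here is a generic Gibbs sampler for a correlated bivariate Gaussian, alternating draws from the two full conditionals:

```python
import numpy as np

# Gibbs sampler for a bivariate Gaussian target N(0, [[1, rho], [rho, 1]])
# (a textbook example, not the specific one from the slides).
rho, n_samples = 0.8, 5000
rng = np.random.default_rng(42)
samples = np.empty((n_samples, 2))
x1 = x2 = 0.0
for t in range(n_samples):
    # Each full conditional of a Gaussian is itself Gaussian:
    x1 = rng.normal(rho * x2, np.sqrt(1 - rho**2))  # draw from p(x1 | x2)
    x2 = rng.normal(rho * x1, np.sqrt(1 - rho**2))  # draw from p(x2 | x1)
    samples[t] = x1, x2

print(np.corrcoef(samples[1000:].T)[0, 1])  # ≈ rho, after discarding burn-in
```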
Variational methods: VB / EM / ReML

→ VB: maximize the free energy F(q) with respect to the "variational" posterior q(θ), under some (e.g., mean-field, Laplace) approximation.

(Figure: the joint posterior $p(\theta_1, \theta_2 \mid y, m)$ over the $(\theta_1, \theta_2)$ plane, its exact marginals $p(\theta_{1 \text{ or } 2} \mid y, m)$, and the factorized variational approximations $q(\theta_{1 \text{ or } 2})$.)
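A minimal mean-field sketch (a standard textbook construction, not the slides' example): coordinate ascent on F(q) for a factorized Gaussian approximation q(θ1)q(θ2) to a correlated bivariate Gaussian posterior:

```python
import numpy as np

# Mean-field VB on a toy 2-D Gaussian posterior p(theta|y) = N(mu, inv(Lambda)),
# approximated by q(theta1) q(theta2); numbers are mine.
mu = np.array([1.0, -1.0])
Lam = np.array([[2.0, 1.2],
                [1.2, 2.0]])   # posterior precision matrix

m = np.zeros(2)                # variational means, updated in turn
for _ in range(50):            # coordinate ascent on the free energy F(q)
    m[0] = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (m[1] - mu[1])
    m[1] = mu[1] - (Lam[1, 0] / Lam[1, 1]) * (m[0] - mu[0])

print(m)                              # converges to the exact posterior mean mu
print(1 / Lam[0, 0], 1 / Lam[1, 1])   # q variances under-estimate the true marginal
                                      # variances: the classic mean-field effect
```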
Overview of the talk
3 SPM applications
3.1 aMRI segmentation
3.2 Decoding of brain images
3.3 Model-based fMRI analysis (with spatial priors)
3.4 Dynamic causal modelling
(Figure: the SPM analysis pipeline: realignment → smoothing → normalisation (against a template) → general linear model → Gaussian field theory → statistical inference at p < 0.05; the Bayesian components covered here: segmentation and normalisation, posterior probability maps (PPMs), multivariate decoding, and dynamic causal modelling.)
aMRI segmentation: mixture of Gaussians (MoG) model

(Figure: directed graphical model of the MoG for tissue classification into grey matter, white matter and CSF: $y_i$ is the i-th voxel value, $c_i$ the i-th voxel label, $\lambda$ the class frequencies, $\mu_1 \dots \mu_k$ the class means, and $\sigma_1 \dots \sigma_k$ the class variances.)
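A self-contained EM sketch for this MoG model in 1-D (the synthetic intensities and initial values are mine; SPM's actual unified segmentation additionally uses spatial tissue priors and bias-field correction):

```python
import numpy as np
from scipy.stats import norm

# EM for a 1-D mixture of k Gaussians, in the spirit of the MoG segmentation model.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(30, 5, 300),    # CSF-like intensities (toy)
                    rng.normal(60, 5, 300),    # grey-matter-like
                    rng.normal(90, 5, 300)])   # white-matter-like

k = 3
lam = np.full(k, 1 / k)                  # class frequencies
mu = np.array([20.0, 50.0, 100.0])       # class means (rough initial guesses)
sd = np.full(k, 10.0)                    # class standard deviations

for _ in range(50):
    # E-step: posterior label probabilities p(c_i = k | y_i) (responsibilities)
    r = lam * norm.pdf(y[:, None], mu, sd)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate frequencies, means and variances
    nk = r.sum(axis=0)
    lam = nk / y.size
    mu = (r * y[:, None]).sum(axis=0) / nk
    sd = np.sqrt((r * (y[:, None] - mu) ** 2).sum(axis=0) / nk)

print(np.round(mu, 1), np.round(sd, 1), np.round(lam, 2))
```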
Decoding of brain images: recognizing brain states from fMRI

(Figure: experimental paradigm with a fixation cross ("+"), a pacing cue (">>") and a response phase.)

log-evidence of X-Y sparse mappings: effect of lateralization
log-evidence of X-Y bilateral mappings: effect of spatial deployment
fMRI time series analysis: spatial priors and model comparison

(Figure: hierarchical model of the fMRI time series: GLM coefficients with a prior variance, a prior variance on the data noise, and AR coefficients modelling correlated noise.)

Competing models: short-term memory design matrix (X) versus long-term memory design matrix (X).
PPM: regions best explained by the short-term memory model.
PPM: regions best explained by the long-term memory model.
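Model comparison between design matrices can be sketched with the closed-form evidence of a linear-Gaussian model (the regressors and variances below are toy values, mine; the SPM implementation additionally uses spatial priors and AR noise models):

```python
import numpy as np

def log_evidence(y, X, prior_var=1.0, noise_var=1.0):
    """ln p(y|m) for y = X theta + noise with theta ~ N(0, prior_var * I):
    marginally, y ~ N(0, noise_var * I + prior_var * X X^T)."""
    n = y.size
    S = noise_var * np.eye(n) + prior_var * X @ X.T
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(S, y))

# Hypothetical comparison of two candidate design matrices (toy regressors):
rng = np.random.default_rng(3)
t = np.arange(120)
X_short = np.column_stack([np.sin(t / 4.0), np.ones_like(t, float)])
X_long = np.column_stack([np.sin(t / 20.0), np.ones_like(t, float)])
y = X_long @ np.array([1.0, 0.2]) + rng.normal(0, 1, t.size)  # generated under X_long

print("ln p(y|short):", log_evidence(y, X_short))
print("ln p(y|long): ", log_evidence(y, X_long))   # higher: the data favour X_long
```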
Dynamic Causal Modelling: network structure identification

(Figure: four candidate DCMs m1-m4 over regions V1, V5 and PPC, with stimulus input to V1 and attention entering at different connections; a bar plot of the models' marginal likelihoods $\ln p(y \mid m)$; and the estimated effective synaptic strengths for the best model, m4.)

(Figure: candidate network structures over nodes 1, 2 and 3.)
DCMs and DAGs: a note on causality

Neural states evolve in continuous time according to $\dot{x} = f(x, u, \theta)$.

(Figure: unrolling the dynamics over time, with $\Delta t \to 0$, turns the cyclic network (connections $\theta_{21}$, $\theta_{32}$, $\theta_{13}$, with input $u_t$ modulating $\theta_{13}$) into a directed acyclic graph over the states at times $t - \Delta t$, $t$, $t + \Delta t$.)
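A toy Euler integration of DCM-style dynamics (the connection values, input timing and bilinear modulation below are illustrative, not fitted values):

```python
import numpy as np

# Euler integration of toy dynamics dx/dt = f(x, u, theta), with connections
# theta_21 (1->2), theta_32 (2->3), theta_13 (3->1) and an input u that both
# drives region 1 and modulates the 3->1 connection.
dt, T = 0.01, 10.0
steps = int(T / dt)
theta_21, theta_32, theta_13 = 0.8, 0.8, 0.8
x = np.zeros(3)
trajectory = np.empty((steps, 3))
for i in range(steps):
    t = i * dt
    u = 1.0 if 1.0 < t < 2.0 else 0.0                 # boxcar input
    A = np.array([[-1.0, 0.0, theta_13 * (1 + u)],    # u modulates the 3->1 strength
                  [theta_21, -1.0, 0.0],
                  [0.0, theta_32, -1.0]])
    drive = np.array([u, 0.0, 0.0])                   # input drives region 1
    x = x + dt * (A @ x + drive)                      # Euler step: x(t+dt) ≈ x(t) + dt*f
    trajectory[i] = x

print(trajectory.max(axis=0))                         # peak responses of the three regions
```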
Dynamic Causal Modelling: model comparison for group studies

(Figure: differences in log-model evidences, $\ln p(y \mid m_1) - \ln p(y \mid m_2)$, plotted for each subject.)

fixed effect: assume all subjects correspond to the same model
random effect: assume different subjects might correspond to different models
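Under the fixed-effect assumption, group-level comparison reduces to summing subject-wise log-evidences (equivalently, multiplying Bayes factors); a sketch with invented numbers:

```python
import numpy as np

# Fixed-effects group model comparison: if all subjects express the same model,
# subject-wise log-evidences simply add. (Toy values, mine; in SPM these come
# from each subject's DCM inversion.)
log_ev = np.array([            # rows: subjects, columns: models m1, m2
    [-120.3, -118.9],
    [-101.7, -104.2],
    [-130.5, -126.1],
    [-115.0, -113.2],
])
group_log_bf_21 = (log_ev[:, 1] - log_ev[:, 0]).sum()
print("group log Bayes factor (m2 vs m1):", group_log_bf_21)
# A random-effects analysis would instead let the model vary across subjects
# (e.g., estimating model frequencies) rather than summing evidences.
```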
I thank you for your attention.
A note on statistical significance: lessons from the Neyman-Pearson lemma

• Neyman-Pearson lemma: the likelihood ratio (or Bayes factor) test
$$\Lambda = \frac{p(y \mid H_1)}{p(y \mid H_0)} \geq u$$
is the most powerful test of size $\alpha = P(\Lambda \geq u \mid H_0)$ for testing the null.

• what is the threshold u above which the Bayes factor test yields a type I error rate of 5%?

(Figure: ROC analysis, plotting 1 - type II error rate against type I error rate: MVB (Bayes factor) with u = 1.09, power = 56%; CCA (F-statistics) with F = 2.20, power = 20%.)
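The threshold question can be answered by Monte-Carlo calibration under the null; a sketch in a toy Gaussian setting (mine, not the MVB/CCA analysis of the figure):

```python
import numpy as np
from scipy.stats import norm

# Calibrate the Bayes-factor threshold u so that P(Lambda >= u | H0) = 5%.
# Toy setup: H0: y ~ N(0, 1); H1: y = theta + N(0, 1) with theta ~ N(0, s0^2).
rng = np.random.default_rng(7)
s0 = 2.0
y0 = rng.normal(0.0, 1.0, size=200_000)            # data simulated under H0

# Lambda = p(y|H1) / p(y|H0); by conjugacy p(y|H1) = N(y; 0, 1 + s0^2)
lam = norm.pdf(y0, 0, np.sqrt(1 + s0**2)) / norm.pdf(y0, 0, 1)
u = np.quantile(lam, 0.95)                          # 95th percentile under the null
print("threshold u for a 5% type I error rate:", round(u, 3))
```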