Statistical Inference
Rik Henson
With thanks to: Karl Friston, Andrew Holmes, Stefan Kiebel, Will Penny
Overview

[Figure: analysis pipeline - fMRI time-series → motion correction → spatial normalisation (to a standard template) → smoothing (kernel) → General Linear Model (design matrix) → parameter estimates → Statistical Parametric Map]
Overview
1. General Linear Model: design matrix, global normalisation
2. fMRI timeseries: highpass filtering, HRF convolution, temporal autocorrelation
3. Statistical Inference: Gaussian Field Theory
4. Random Effects
5. Experimental Designs
6. Effective Connectivity
Multiple comparisons…
• If n=100,000 voxels are tested with pu=0.05 of falsely rejecting H0, then approximately n × pu (e.g. 5,000) will do so by chance (false positives, or “type I” errors)
• Therefore we need to “correct” p-values for the number of comparisons
• A severe correction would be Bonferroni, where pc = pu/n…
• …but this is only appropriate when the n tests are independent…
• …SPMs are smooth, meaning that nearby voxels are correlated
=> Gaussian Field Theory…
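The arithmetic above can be checked directly (illustrative numbers from the slide; the family-wise error calculation assumes the n tests are independent, which is exactly the assumption GFT relaxes):

```python
n = 100_000   # voxels tested
p_u = 0.05    # uncorrected per-voxel threshold

expected_false_positives = n * p_u   # voxels crossing threshold by chance alone
p_c = p_u / n                        # Bonferroni-corrected per-voxel threshold

# Family-wise error under independence: P(at least one false positive)
fwe_uncorrected = 1 - (1 - p_u) ** n   # effectively 1
fwe_corrected = 1 - (1 - p_c) ** n     # just under 0.05

print(expected_false_positives)  # 5000.0
```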
[Figure: SPM{t} of pure Gaussian random noise (10mm FWHM, 2mm pixels) thresholded at pu = 0.05 - many suprathreshold voxels arise by chance alone]
Gaussian Field Theory
• Consider the SPM as a lattice representation of a continuous random field
• The “Euler characteristic” is a topological measure of the “excursion set” (e.g. # components - # “holes”)
• Smoothness is estimated from the covariance of the partial derivatives of the residuals (expressed as “resels” or FWHM)
• Assumes: 1) residuals are multivariate normal; 2) smoothness » voxel size (practically, FWHM ≥ 3 × VoxDim)
• Not necessarily stationary: smoothness can be estimated locally as resels-per-voxel
Generalised Form
• General form for the expected Euler characteristic in D dimensions:

  E[χ(Au)] = Σd Rd(Ω) ρd(u),  d = 0…D

Rd(Ω): d-dimensional Minkowski functional
– a function of the dimension d, the search space and its smoothness:
  R0(Ω) = χ(Ω), the Euler characteristic of Ω
  R1(Ω) = resel diameter
  R2(Ω) = resel surface area
  R3(Ω) = resel volume

ρd(u): d-dimensional EC density of Z(x)
– a function of the dimension d, the threshold u, and the statistic, e.g. for a Z-statistic:
  ρ0(u) = 1 - Φ(u)
  ρ1(u) = (4 ln 2)^(1/2) exp(-u²/2) / (2π)
  ρ2(u) = (4 ln 2) u exp(-u²/2) / (2π)^(3/2)
  ρ3(u) = (4 ln 2)^(3/2) (u² - 1) exp(-u²/2) / (2π)²
  ρ4(u) = (4 ln 2)² (u³ - 3u) exp(-u²/2) / (2π)^(5/2)
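These densities and the expected-EC sum can be sketched in a few lines of Python (pure math module; the resel counts used below are hypothetical, not from the slide):

```python
import math

def ec_density(d, u):
    """EC densities rho_d(u) for a Gaussian Z-field, d = 0..4."""
    f = math.exp(-u * u / 2)
    a = 4 * math.log(2)
    if d == 0:
        return 0.5 * math.erfc(u / math.sqrt(2))        # 1 - Phi(u)
    if d == 1:
        return math.sqrt(a) * f / (2 * math.pi)
    if d == 2:
        return a * u * f / (2 * math.pi) ** 1.5
    if d == 3:
        return a ** 1.5 * (u * u - 1) * f / (2 * math.pi) ** 2
    if d == 4:
        return a ** 2 * (u ** 3 - 3 * u) * f / (2 * math.pi) ** 2.5
    raise ValueError("d must be 0..4")

def expected_ec(resels, u):
    """E[chi(A_u)] = sum_d R_d(Omega) * rho_d(u); at high thresholds u
    this approximates the corrected (family-wise) p-value."""
    return sum(R * ec_density(d, u) for d, R in enumerate(resels))

# Hypothetical 3D search region: [R0, R1, R2, R3]
p_corrected = expected_ec([1.0, 33.0, 360.0, 1300.0], u=4.8)
```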
Levels of Inference
• Three levels of inference:
  – extreme voxel values → voxel-level inference
  – big suprathreshold clusters → cluster-level inference
  – many suprathreshold clusters → set-level inference

[Figure: example SPM with three clusters of n=82, n=32 and n=12 voxels]

Parameters:
  “Height” threshold, u: t > 3.09
  “Extent” threshold, k: 12 voxels
  Dimension, D: 3
  Volume, S: 32³ voxels
  Smoothness, FWHM: 4.7 voxels

  omnibus: P(c ≥ 7, t ≥ u) = 0.031
  voxel-level: P(t ≥ 4.37) = 0.048
  set-level: P(c ≥ 3, n ≥ k, t ≥ u) = 0.019
  cluster-level: P(n ≥ 82, t ≥ u) = 0.029
(Spatial) Specificity vs. Sensitivity
Small-volume correction
• If you have an a priori region of interest, there is no need to correct for the whole brain!
• But GFT can be used to correct for a Small Volume (SVC)
• The volume can be based on:
  – an anatomically-defined region
  – a geometric approximation to the above (e.g. rhomboid/sphere)
  – a functionally-defined mask (based on an ORTHOGONAL contrast!)
• The extent of the correction can be APPROXIMATED by a Bonferroni correction for the number of resels…
• …but the correction also depends on the shape (surface area) as well as the size (volume) of the region (so you may want to smooth the volume if it is rough)
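The resel-Bonferroni approximation mentioned above can be sketched as follows (hypothetical ROI numbers; a real SVC also accounts for the region's shape through the full GFT expression):

```python
def resels_3d(volume_mm3, fwhm_mm):
    """Approximate resel count of a 3D region: volume / FWHM^3."""
    return volume_mm3 / fwhm_mm ** 3

p_u = 0.001                                       # uncorrected threshold
R = resels_3d(volume_mm3=4000.0, fwhm_mm=10.0)    # small, roughly spherical ROI
p_svc = min(1.0, p_u * R)                         # Bonferroni over resels, not voxels
```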
Overview
1. General Linear Model: design matrix, global normalisation
2. fMRI timeseries: highpass filtering, HRF convolution, temporal autocorrelation
3. Statistical Inference: Gaussian Field Theory
4. Random Effects
5. Experimental Designs
6. Effective Connectivity
Fixed vs. Random Effects
[Figure: multi-subject Fixed Effect model - Subjects 1-6 in a single design matrix; error df ≈ 300]

• Subjects can be Fixed or Random variables
• If subjects are a Fixed variable in a single design matrix (SPM “sessions”), the error term conflates within- and between-subject variance
  – In PET this is not such a problem, because the within-subject (between-scan) variance can be as great as the between-subject variance; but in fMRI the between-scan variance is normally much smaller than the between-subject variance
• If one wishes to make an inference from a subject sample to the population, one needs to treat subjects as a Random variable, with a proper mixture of within- and between-subject variance
• In SPM, this is achieved by a two-stage procedure:
  1) (contrasts of) parameters are estimated from a (Fixed Effect) model for each subject
  2) images of these contrasts become the data for a second design matrix (usually a simple t-test or ANOVA)
In the special case of n independent observations per subject:

  var(θ̂pop) = σ²b/N + σ²w/(N·n)
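A quick numerical reading of this formula (hypothetical variances) shows why, when between-subject variance dominates as in fMRI, adding subjects helps more than adding scans:

```python
def var_pop(sigma2_b, sigma2_w, N, n):
    """Variance of the population-mean estimate from N subjects,
    each contributing n independent observations."""
    return sigma2_b / N + sigma2_w / (N * n)

base = var_pop(sigma2_b=1.0, sigma2_w=100.0, N=12, n=200)
more_scans = var_pop(1.0, 100.0, N=12, n=400)      # doubling scans: small gain
more_subjects = var_pop(1.0, 100.0, N=24, n=200)   # doubling subjects: halves it
```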
Two-stage “Summary Statistic” approach
[Figure: 1st-level (within-subject) - contrast images of β̂1…β̂6 for the six subjects, each with within-subject error σ̂w; 2nd-level (between-subject) - one-sample t-test on the contrast images (N=6 subjects, error df = 5), giving an SPM{t} at p < 0.001 (uncorrected) and the population estimate θ̂pop]
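The two-stage procedure can be sketched with simulated data (pure Python, no SPM; all numbers hypothetical): stage 1 computes one contrast value per subject from a within-subject model, stage 2 runs a one-sample t-test across those N summary values, with df = N - 1.

```python
import math
import random

random.seed(1)
N, half = 6, 20                       # subjects; scans per condition
true_effect, sigma_b, sigma_w = 1.0, 0.5, 1.0

# Stage 1: per-subject (fixed-effect) contrast, active minus rest
contrasts = []
for _ in range(N):
    subj = true_effect + random.gauss(0, sigma_b)          # random subject effect
    active = [subj + random.gauss(0, sigma_w) for _ in range(half)]
    rest = [random.gauss(0, sigma_w) for _ in range(half)]
    contrasts.append(sum(active) / half - sum(rest) / half)

# Stage 2: one-sample t-test on the N contrast "images" (df = N - 1 = 5)
mean_c = sum(contrasts) / N
s2 = sum((c - mean_c) ** 2 for c in contrasts) / (N - 1)
t = mean_c / math.sqrt(s2 / N)
```

Because stage 2 sees only one number per subject, its error term automatically mixes within- and between-subject variance, which is the point of the summary-statistic approach.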
Limitations of 2-stage approach
• The summary statistic approach is a special case, valid only when each subject’s design matrix is identical (“balanced designs”)
• In practice, the approach is reasonably robust to unbalanced designs (Penny, 2004)
• More generally, exact solutions to any hierarchical GLM can be obtained using ReML
• This is computationally expensive to perform at every voxel (so not implemented in SPM2)
• Plus, modelling of nonsphericity at the 2nd-level can minimise the potential bias of unbalanced designs…
New in SPM2
Nonsphericity again!
• When tests at the 2nd-level are more complicated than 1/2-sample t-tests, errors can be non-i.i.d.
• For example, two groups (e.g. patients and controls) may have different variances (non-identically distributed; inhomogeneity of variance)
New in SPM2

[Figure: covariance components Q for inhomogeneous variance (3 groups of 4 subjects) and for repeated measures (3 groups of 4 subjects)]

• Or, when taking more than one parameter per subject (repeated measures, e.g. multiple basis functions in event-related fMRI), errors may be non-independent
• (If the nonsphericity correction is selected, inhomogeneity is assumed, with a further option for repeated measures)
• The same method of variance component estimation with ReML (as used for autocorrelation) is used
• (The Greenhouse-Geisser correction for repeated-measures ANOVAs is a special case approximation)
Hierarchical Models
• The two-stage approach is a special case of the Hierarchical GLM
• In a Bayesian framework, parameters of one level can be made priors on the distribution of parameters at the lower level: “Parametric Empirical Bayes” (Friston et al, 2002)
• The parameters and hyperparameters at each level can be estimated using the EM algorithm (a generalisation of ReML)
• Note that parameters and hyperparameters at the final level do not differ from the classical framework
• The second level could be subjects; it could also be voxels…

  y = X(1) θ(1) + ε(1)
  θ(1) = X(2) θ(2) + ε(2)
  …
  θ(n-1) = X(n) θ(n) + ε(n)

  Cε(i) = Σk λk(i) Qk(i)

New in SPM2
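A minimal simulation of the two-level hierarchy above (hypothetical sizes; the point is only that data come from subject parameters, which themselves come from a group parameter):

```python
import random

random.seed(0)
N, n = 8, 50
theta2 = 1.0                                             # group-level parameter
# Level 2: subject parameters theta1_i = X2 theta2 + eps(2), with X2 = ones
theta1 = [theta2 + random.gauss(0, 0.5) for _ in range(N)]
# Level 1: each subject's data y_i = X1 theta1_i + eps(1), with X1 = ones
y = [[theta1[i] + random.gauss(0, 2.0) for _ in range(n)] for i in range(N)]

# The summary-statistic route estimates each theta1_i by the subject mean,
# then theta2 by the mean of those estimates
theta1_hat = [sum(row) / n for row in y]
theta2_hat = sum(theta1_hat) / N
```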
Parametric Empirical Bayes & PPMs (New in SPM2)
• Bayes rule:

  p(θ|y) ∝ p(y|θ) p(θ)
  Posterior ∝ Likelihood × Prior
  (PPM)       (SPM)

• What are the priors?
  – In “classical” SPM, no (flat) priors
  – In “full” Bayes, priors might come from theoretical arguments, or from independent data
  – In “empirical” Bayes, priors derive from the same data, assuming a hierarchical model for the generation of that data
Parametric Empirical Bayes & PPMs (New in SPM2)

[Figure: classical t-test compares the statistic t = f(y) against its null distribution p(t|θ=0) at threshold u; the Bayesian test thresholds the posterior p(θ|y) directly]

• For PPMs in SPM2, the priors come from the distribution over voxels
• If the mean over voxels is removed, the prior mean can be set to zero (a “shrinkage” prior)
• One can threshold the posterior for a given probability that a parameter estimate is greater than some value…
• …to give a posterior probability map (PPM)
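A one-voxel sketch of this shrinkage-prior posterior (my notation, not SPM2's: prior θ ~ N(0, τ²), likelihood β̂ ~ N(θ, s²); SPM2 estimates the prior variance empirically from the distribution over voxels):

```python
import math

def posterior(beta_hat, s2, tau2):
    """Posterior mean and variance of theta under a zero-mean shrinkage prior."""
    v = 1.0 / (1.0 / s2 + 1.0 / tau2)
    m = v * beta_hat / s2            # estimate shrunk toward the prior mean, 0
    return m, v

def prob_exceeds(beta_hat, s2, tau2, gamma):
    """P(theta > gamma | y): threshold this at e.g. 0.95 to build a PPM."""
    m, v = posterior(beta_hat, s2, tau2)
    return 0.5 * math.erfc((gamma - m) / math.sqrt(2 * v))
```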
Parametric Empirical Bayes & PPMs (New in SPM2)

[Figure: maximum-intensity projections and design matrix for a PET analysis - a PPM for the “rest” contrast (effect-size threshold 2.06; height threshold P = 0.95, extent threshold k = 0 voxels) alongside the corresponding SPM{T} (height threshold T = 5.50, extent threshold k = 0 voxels)]
PPM vs. SPM:
• PPM: activations greater than a certain amount; SPM: voxels with non-zero activations
• PPM: can infer no responses; SPM: cannot “prove the null hypothesis”
• PPM: no fallacy of inference; SPM: fallacy of inference (large df)
• PPM: inference independent of search volume; SPM: must correct for search volume
• PPM: computationally expensive; SPM: computationally faster
Overview
1. General Linear Model: design matrix, global normalisation
2. fMRI timeseries: highpass filtering, HRF convolution, temporal autocorrelation
3. Statistical Inference: Gaussian Field Theory
4. Random Effects
5. Experimental Designs
6. Effective Connectivity
A taxonomy of design
• Categorical designs
  Subtraction - additive factors and pure insertion
  Conjunction - testing multiple hypotheses
• Parametric designs
  Linear - cognitive components and dimensions
  Nonlinear - polynomial expansions
• Factorial designs
  Categorical - interactions and pure insertion
              - adaptation, modulation and dual-task inference
  Parametric - linear and nonlinear interactions
             - psychophysiological interactions
A categorical analysis
Experimental design:
  Word generation (G), Word repetition (R)
  R G R G R G R G R G R G

G - R = intrinsic word generation
…under the assumption of pure insertion, i.e. that G and R do not differ in other ways
A taxonomy of design
• Categorical designs
  Subtraction - additive factors and pure insertion
  Conjunction - testing multiple hypotheses
• Parametric designs
  Linear - cognitive components and dimensions
  Nonlinear - polynomial expansions
• Factorial designs
  Categorical - interactions and pure insertion
              - adaptation, modulation and dual-task inference
  Parametric - linear and nonlinear interactions
             - psychophysiological interactions
Cognitive Conjunctions
• One way to minimise the problem of pure insertion is to isolate the same process in several different ways (i.e. multiple subtractions of different conditions)

[Table: 2×2 design - Stimuli (A/B): Objects vs. Colours; Task (1/2): Viewing vs. Naming - cells A1 (Object viewing), A2 (Object naming), B1 (Colour viewing), B2 (Colour naming)]

Components: Visual Processing (V), Object Recognition (R), Phonological Retrieval (P)
  Object viewing: R,V
  Colour viewing: V
  Object naming: P,R,V
  Colour naming: P,V

(Object - Colour viewing) [1 -1 0 0] & (Object - Colour naming) [0 0 1 -1]
[ R,V - V ] & [ P,R,V - P,V ] = R & R = R
(assuming RxP = 0; see later)

Common object recognition response (R) - Price et al, 1997
Cognitive Conjunctions (New in SPM2)
• The original (SPM97) definition of conjunctions entailed the sum of two simple effects (A1-A2 + B1-B2), plus exclusive masking with the interaction (A1-A2) - (B1-B2)
• i.e. “effects significant and of similar size”

[Figure: regions in the (A1-A2, B1-B2) plane - masking requires p(A1=A2) < p and p(B1=B2) < p; the conjunction requires p(A1=A2 + B1=B2) < P1 with p((A1-A2) = (B1-B2)) > P2]

• (The difference between conjunctions and masking is that conjunction p-values reflect the conjoint probabilities of the contrasts)
• The SPM2 definition of conjunctions uses advances in Gaussian Field Theory (e.g. T² fields), allowing corrected p-values
• However, the logic has changed slightly, in that voxels can survive a conjunction even though they show an interaction
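The intuition can be sketched as a minimum-statistic test, a simplification of the conjoint-probability machinery described above (hypothetical t-values):

```python
u = 3.09            # height threshold
t_viewing = 4.1     # t for (Object - Colour viewing), hypothetical
t_naming = 3.6      # t for (Object - Colour naming), hypothetical

# A voxel survives the conjunction only if the WEAKER effect clears threshold
survives = min(t_viewing, t_naming) > u
```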
A taxonomy of design
• Categorical designs
  Subtraction - additive factors and pure insertion
  Conjunction - testing multiple hypotheses
• Parametric designs
  Linear - cognitive components and dimensions
  Nonlinear - polynomial expansions
• Factorial designs
  Categorical - interactions and pure insertion
              - adaptation, modulation and dual-task inference
  Parametric - linear and nonlinear interactions
             - psychophysiological interactions
Nonlinear parametric responses

[Figure: SPM{F} showing an inverted-‘U’ response to increasing word presentation rate in the DLPFC, with fitted linear and quadratic components]

Polynomial expansion:
  f(x) ≈ β1 x + β2 x² + ...
  …(N-1)th order for N levels

E.g. an F-contrast [0 1 0] on the quadratic parameter =>
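A least-squares sketch of the quadratic expansion (simulated inverted-U data, hypothetical numbers; SPM would fit the same design matrix voxel-wise and test the quadratic column with an F-contrast):

```python
import numpy as np

rates = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])     # presentation rates
x = (rates - rates.mean()) / rates.std()               # centred, scaled covariate
X = np.column_stack([np.ones_like(x), x, x ** 2])      # constant, linear, quadratic

# Simulated inverted-U response peaking at intermediate rates
noise = np.array([0.10, -0.20, 0.15, 0.00, -0.10, 0.05])
y = 5.0 - 1.5 * x ** 2 + noise

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# The contrast [0 0 1] isolates the quadratic parameter here (cf. [0 1 0]
# on the slide; the exact contrast depends on the design-matrix column order)
quad = float(beta @ np.array([0.0, 0.0, 1.0]))
```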
A taxonomy of design
• Categorical designs
  Subtraction - additive factors and pure insertion
  Conjunction - testing multiple hypotheses
• Parametric designs
  Linear - cognitive components and dimensions
  Nonlinear - polynomial expansions
• Factorial designs
  Categorical - interactions and pure insertion
              - adaptation, modulation and dual-task inference
  Parametric - linear and nonlinear interactions
             - psychophysiological interactions
Interactions and pure insertion
• The presence of an interaction can show a failure of pure insertion (using the earlier example)…

[Table: the same 2×2 design - Stimuli (A/B): Objects vs. Colours; Task (1/2): Viewing vs. Naming]

Components: Visual Processing (V), Object Recognition (R), Phonological Retrieval (P)
  Object viewing: R,V
  Colour viewing: V
  Object naming: P,R,V,RxP
  Colour naming: P,V

[Figure: Object - Colour differences plotted separately for viewing and naming, showing a naming-specific object recognition effect]

(Object - Colour) x (Viewing - Naming):
  [1 -1 0 0] - [0 0 1 -1] = [1 -1] ⊗ [1 -1] = [1 -1 -1 1]
  [ R,V - V ] - [ P,R,V,RxP - P,V ] = R - (R,RxP) = RxP
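The interaction contrast is just the Kronecker product of the two main-effect contrasts, which can be computed in one line:

```python
import numpy as np

stim = np.array([1, -1])              # Object - Colour
task = np.array([1, -1])              # Viewing - Naming
interaction = np.kron(task, stim)     # the interaction contrast vector
print(interaction)                    # [ 1 -1 -1  1]
```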
A taxonomy of design
• Categorical designs
  Subtraction - additive factors and pure insertion
  Conjunction - testing multiple hypotheses
• Parametric designs
  Linear - cognitive components and dimensions
  Nonlinear - polynomial expansions
• Factorial designs
  Categorical - interactions and pure insertion
              - adaptation, modulation and dual-task inference
  Parametric - linear and nonlinear interactions
             - psychophysiological interactions
SPM{Z}

Attentional modulation of the V1 - V5 contribution

[Figure: V5 activity plotted against V1 activity over time, with a steeper regression slope under attention than under no attention]
Psycho-physiological Interaction (PPI)

• Parametric, factorial design, in which one factor is psychological (eg attention)

...and the other is physiological (viz. activity extracted from a brain region of interest)
Psycho-physiological Interaction (PPI)

• PPIs tested by a GLM with form:

y = (V1×A).β1 + V1.β2 + A.β3 + ε        c = [1 0 0]
• However, the interaction term of interest, V1×A, is the product of V1 activity and the Attention block AFTER convolution with the HRF
• We are really interested in the interaction at the neural level, but:

(HRF ⊗ V1) × (HRF ⊗ A)  ≠  HRF ⊗ (V1 × A)

(unless A is low frequency, eg blocked; so a problem for event-related PPIs)
• SPM2 can effect a deconvolution of the physiological regressor (V1) before calculating the interaction term and reconvolving with the HRF

• Deconvolution is ill-constrained, so it is regularised using smoothness priors (using ReML)
New in SPM2
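The PPI GLM above can be sketched with ordinary least squares on simulated data. This is a toy illustration, not SPM's implementation (no HRF convolution or deconvolution here); the regressors, effect sizes, and the added constant column are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical regressors: a V1 time-series and a mean-centred attention block
v1 = rng.standard_normal(n)
att = ((np.arange(n) // 20) % 2) * 2.0 - 1.0

ppi = v1 * att                                    # the interaction term (V1 x A)
X = np.column_stack([ppi, v1, att, np.ones(n)])   # design matrix (constant added)

# Simulate data with a true interaction (PPI) effect of 0.5
y = 0.5 * ppi + 0.3 * v1 + 0.1 * rng.standard_normal(n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
c = np.array([1, 0, 0, 0])                        # contrast on the PPI regressor
print(c @ beta)                                   # recovers roughly 0.5
```

A significant contrast on the interaction regressor is then interpreted as attention changing the contribution of V1 to the target region's response.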
Overview

1. General Linear Model
   Design Matrix
   Global normalisation

2. fMRI timeseries
   Highpass filtering
   HRF convolution
   Temporal autocorrelation

3. Statistical Inference
   Gaussian Field Theory

4. Random Effects

5. Experimental Designs

6. Effective Connectivity
Effective vs. functional connectivity

No connection between B and C, yet B and C are correlated because of common input from A, eg:

A = V1 fMRI time-series
B = 0.5 * A + e1
C = 0.3 * A + e2

Correlations:

      A     B     C
A     1
B     0.49  1
C     0.30  0.12  1
[Figure: path diagram over A, B, C with estimated path coefficients A→B = 0.49, A→C = 0.31, B→C = -0.02; the B→C path: χ² = 0.5, ns.]

Functional connectivity

Effective connectivity
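The common-input point is easy to reproduce: simulating the generative model shown (A driving both B and C, with independent noise) yields a B–C correlation close to the 0.12 in the matrix, even though B and C are unconnected. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# A drives both B and C; there is no direct B-C connection
A = rng.standard_normal(n)
B = 0.5 * A + rng.standard_normal(n)   # e1
C = 0.3 * A + rng.standard_normal(n)   # e2

R = np.corrcoef([A, B, C])
# B and C still correlate (~0.13) purely via the common input A
print(R[0, 1], R[0, 2], R[1, 2])
```

This is why a nonzero functional connectivity (correlation) between two regions does not, by itself, imply an effective (causal) connection between them.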
Dynamic Causal Modelling

• PPIs allow a simple (restricted) test of effective connectivity
• Structural Equation Modelling is more powerful (Büchel & Friston, 1997)
• However, in SPM2, Dynamic Causal Modelling (DCM) is preferred
• DCMs are dynamic models specified at the neural level
• The neural dynamics are transformed into predicted BOLD signals using a realistic biological haemodynamic forward model (HDM)
• The neural dynamics comprise a deterministic state-space model and a bilinear approximation to model interactions between variables
New in SPM2
Dynamic Causal Modelling

• The variables consist of:
  connections between regions
  self-connections
  direct inputs (eg, visual stimulation)
  contextual inputs (eg, attention)
• Connections can be bidirectional
• Variables are estimated using an EM algorithm
• Priors are:
  empirical (for the haemodynamic model)
  principled (dynamics constrained to be convergent)
  shrinkage (zero-mean, for connections)
• Inference using posterior probabilities
• Methods for Bayesian model comparison
New in SPM2
[Figure: three-region DCM — direct inputs u1 (e.g. visual stimuli) drive z1 (V1); z1 connects to z2 (V5) and z3 (SPC); contextual inputs u2 (e.g. attention) modulate connections; each region generates a BOLD output y1, y2, y3]

ż = f(z, u, θz) ≈ Az + uBz + Cu
y = h(z, θh) + ε

z = state vector
u = inputs
θ = parameters (connection/haemodynamic)
Dynamic Causal Modelling

New in SPM2
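The bilinear state equation ż ≈ Az + uBz + Cu can be integrated directly to see how a contextual input gates a connection. The two-region A, B, C matrices and input timing below are toy assumptions, and a simple Euler scheme stands in for SPM's integrator:

```python
import numpy as np

# Toy two-region bilinear model: dz/dt = (A + u2*B) @ z + C*u1
A = np.array([[-1.0, 0.0],
              [ 0.8, -1.0]])     # intrinsic connections (self-decay on diagonal)
B = np.array([[0.0, 0.0],
              [0.5, 0.0]])       # u2 (context) modulates the z1 -> z2 connection
C = np.array([1.0, 0.0])         # u1 (stimulus) drives region 1 directly

dt, T = 0.01, 10.0
z = np.zeros(2)
for k in range(int(T / dt)):
    u1 = 1.0                               # sustained driving input
    u2 = 1.0 if k * dt > 5.0 else 0.0      # context switches on at t = 5
    z = z + dt * ((A + u2 * B) @ z + C * u1)

# Steady states: z1 -> 1.0 throughout; z2 -> 0.8 without context, 1.3 with it
print(z)
```

The B term is what makes the model bilinear: the effective z1→z2 coupling is 0.8 without context and 1.3 with it, which is exactly the kind of input-dependent connectivity change DCM estimates.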
[Figure: two-region example — stimulus input u1 drives z1; contextual input u2 modulates the z1→z2 connection; excitatory (+) and inhibitory (−) influences shown]
Dynamic Causal Modelling

New in SPM2
[Figure: DCM of the attention-to-motion network (V1, V5, SPC, IFG), with Photic and Motion driving inputs and Attention as a modulatory input; posterior connection estimates (posterior probabilities): .82 (100%), .42 (100%), .37 (90%), .69 (100%), .47 (100%), .65 (100%), .52 (98%), .56 (99%)]

Friston et al. (2003)
Büchel & Friston (1997)
Effects

Photic – dots vs fixation
Motion – moving vs static
Attention – detect changes
• Attention modulates the backward connections IFG→SPC and SPC→V5
• The intrinsic connection V1→V5 is insignificant in the absence of motion
Friston KJ, Holmes AP, Worsley KJ, Poline J-B, Frith CD, Frackowiak RSJ (1995) “Statistical parametric maps in functional imaging: A general linear approach” Human Brain Mapping 2:189-210

Worsley KJ & Friston KJ (1995) “Analysis of fMRI time series revisited — again” NeuroImage 2:173-181

Friston KJ, Josephs O, Zarahn E, Holmes AP, Poline J-B (2000) “To smooth or not to smooth” NeuroImage

Zarahn E, Aguirre GK, D'Esposito M (1997) “Empirical Analyses of BOLD fMRI Statistics” NeuroImage 5:179-197

Holmes AP, Friston KJ (1998) “Generalisability, Random Effects & Population Inference” NeuroImage 7(4-2/3):S754

Worsley KJ, Marrett S, Neelin P, Evans AC (1992) “A three-dimensional statistical analysis for CBF activation studies in human brain” Journal of Cerebral Blood Flow and Metabolism 12:900-918

Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC (1995) “A unified statistical approach for determining significant signals in images of cerebral activation” Human Brain Mapping 4:58-73

Friston KJ, Worsley KJ, Frackowiak RSJ, Mazziotta JC, Evans AC (1994) “Assessing the Significance of Focal Activations Using their Spatial Extent” Human Brain Mapping 1:214-220

Cao J (1999) “The size of the connected components of excursion sets of χ², t and F fields” Advances in Applied Probability (in press)

Worsley KJ, Marrett S, Neelin P, Evans AC (1995) “Searching scale space for activation in PET images” Human Brain Mapping 4:74-90

Worsley KJ, Poline J-B, Vandal AC, Friston KJ (1995) “Tests for distributed, non-focal brain activations” NeuroImage 2:183-194

Friston KJ, Holmes AP, Poline J-B, Price CJ, Frith CD (1996) “Detecting Activations in PET and fMRI: Levels of Inference and Power” NeuroImage 4:223-235
Some References
PCA/SVD and Eigenimages

A time-series of 1D images: 128 scans of 32 “voxels”

Expression of the 1st 3 “eigenimages”

Eigenvalues and spatial “modes”

The time-series ‘reconstituted’
PCA/SVD and Eigenimages
Y (DATA): time × voxels

Y = USV^T = s1·U1·V1^T + s2·U2·V2^T + ...

[Figure: successive rank-1 terms s1·U1·V1^T, + s2·U2·V2^T, + s3·U3·V3^T, each improving the approximation of Y]
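The eigenimage decomposition Y = USV^T can be sketched with NumPy's SVD on synthetic "128 scans × 32 voxels" data; the two planted spatio-temporal modes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic time x voxels data: two spatio-temporal modes plus noise
t = np.arange(128)
U_true = np.column_stack([np.sin(t / 8.0), np.cos(t / 16.0)])  # temporal modes
V_true = rng.standard_normal((32, 2))                          # spatial "eigenimages"
Y = U_true @ np.diag([10.0, 5.0]) @ V_true.T \
    + 0.1 * rng.standard_normal((128, 32))

U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# Rank-2 reconstitution: Y ~ s1*U1*V1' + s2*U2*V2'
Y2 = (U[:, :2] * s[:2]) @ Vt[:2, :]
err = np.linalg.norm(Y - Y2) / np.linalg.norm(Y)
print(err)  # small: the first two eigenimages capture most of the variance
```

The singular values s play the role of the eigenvalue spectrum: a sharp drop after the first few values indicates that a handful of spatial modes "reconstitute" the time-series.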
Time x Condition interaction

Time x condition interactions (i.e. adaptation) assessed with the SPM{T}
Minimise the difference between the observed (S) and implied (Σ) covariances by adjusting the path coefficients (B)

The implied covariance structure:
x = x.B + z
x = z.(I - B)^-1

x: matrix of time-series of Regions 1-3
B: matrix of unidirectional path coefficients

Variance-covariance structure:
x^T.x = Σ = (I - B)^-T.C.(I - B)^-1
where C = z^T.z

x^T.x is the implied variance-covariance structure
C contains the residual variances (u, v, w) and covariances

The free parameters are estimated by minimising a [maximum likelihood] function of S and Σ
Structural Equation Modelling (SEM)
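The implied covariance Σ = (I - B)^-T.C.(I - B)^-1 can be computed directly from a candidate path model; the three-region path coefficients and residual variances below are arbitrary illustrative values:

```python
import numpy as np

# B[i, j] = path coefficient from region i+1 to region j+1 (illustrative values)
B = np.array([[0.0, 0.7, 0.2],
              [0.0, 0.0, 0.5],
              [0.0, 0.0, 0.0]])
C = np.diag([1.0, 0.5, 0.5])   # residual (innovation) variances, C = z'z

# From x = x.B + z  =>  x = z.(I - B)^-1, so:
# Sigma = (I - B)^-T @ C @ (I - B)^-1
inv = np.linalg.inv(np.eye(3) - B)
Sigma = inv.T @ C @ inv
print(Sigma)  # implied variance-covariance structure of the three regions
```

Fitting the SEM then amounts to adjusting the free entries of B (and C) so that Sigma matches the sample covariance S under the chosen discrepancy function.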
[Figure: three-region path model — Region 1 → 2 (B12), Region 2 → 3 (B23), Region 1 → 3 (B13), each region receiving an innovation z]
Attention - No attention

[Figure: path coefficients under Attention vs No attention — 0.76 vs 0.47, and 0.75 vs 0.43]

Changes in “effective connectivity”