Post on 24-Oct-2020
• temporal noise: modelling temporal autocorrelation
• temporal signal: FLOBS HRF optimal basis functions• temporal signal: HRF deconvolution
• spatiotemporally structured signal / noise: ICA• “functional grand-plan”: integrating ICA+GLM
Modelling temporal structure(in noise and signal)
Mark Woolrich, Christian Beckmann*, Salima Makni & Steve SmithFMRIB, Oxford *Imperial/FMRIB
frequency
powerColoured / autocorrelated
White / independent
Even after high-pass filtering, FMRI noise has
extra power at low frequencies (positive
autocorrelation or temporal smoothness)
Uncorrected, this causes:
- biased stats ( increased false positives)
- decreased sensitivity
Non-independent/Autocorrelation/Coloured FMRI noise
FMRIB’s Improved Linear Modelling (FILM)
• FILM is used to fit the GLM voxel-wise in FEAT• Deals with the autocorrelation locally and uses prewhitening
FILM estimates autocorrelation by looking at the residuals of the GLM fit:
residuals = Y −Xβ̂
Y = Xβ + "
FMRIB’s Improved Linear Modelling (FILM)
1) Fit the GLM and estimate the autocorrelation on the residuals
frequency
power
Power vs. freq in the residuals
FMRIB’s Improved Linear Modelling (FILM)
1) Fit the GLM and estimate the autocorrelation on the residuals
frequency
2) Spatially and spectrally smooth the data
frequency
power
Power vs. freq in the residuals
power
FMRIB’s Improved Linear Modelling (FILM)
1) Fit the GLM and estimate the autocorrelation on the residuals
frequency
2) Spatially and spectrally smooth the data
frequency
power
3) Construct prewhitening filter to “undo” autocorrelation
frequency
power
Power vs. freq in the residuals
power
FMRIB’s Improved Linear Modelling (FILM)
1) Fit the GLM and estimate the autocorrelation on the residuals
frequency
2) Spatially and spectrally smooth the data
frequency
power
3) Construct prewhitening filter to “undo” autocorrelation
4) Apply filter to data and design matrix and refit
frequency
power
power
power
frequency
FMRIB’s Improved Linear Modelling (FILM)
Dealing with Variations in Haemodynamics
• The haemodynamic responses vary between subjects and areas of the brain
• How do we allow haemodynamics to be flexible but remain plausible?
Reminder: the haemodynamic response function (HRF) describes the BOLD response to a short burst of neural activity
Temporal Derivatives
• A very simple approach to providing HRF variability is to include (alongside each EV) the EV temporal derivative
• Including the temporal derivative of an EV allows for a small shift in time of that EV
• This is based upon a first order Taylor series expansion
EVTemporal derivative
datamodel fit without
derivative
model fit with derivative
• We need to allow flexibility in the shape of the fitted HRF
Parameterise HRF shape and fit shape parameters to the data
Needs nonlinear fitting - HARD
Using Parameterised HRFs
Using Basis Sets
• We need to allow flexibility in the shape of the fitted HRF
Parameterise HRF shape and fit shape parameters to the data
We can use linear basis sets to span the space of expected HRF shapes
Needs nonlinear fitting - HARD Linear fitting (use GLM) - EASY
How do HRF Basis Sets Work?
Different linear combinations of the basis functions can be used to create different HRF shapes
+ -0.1*+ 0.3* 1.0* =
basis fn 1 basis fn 2 basis fn 3 HRF
How do HRF Basis Sets Work?
Different linear combinations of the basis functions can be used to create different HRF shapes
+ -0.1*+ 0.3* 1.0* =
+ 0.5*+ -0.2* 0.7* =
basis fn 1 basis fn 2 basis fn 3 HRF
FMRIB’s Linear Optimal Basis Set (FLOBS)
Using FLOBS we can:
• Specify a priori expectations of parameterised HRF shapes
• Generate an appropriate basis set
Generating FLOBs
(1) Take samples of the HRF
Generating FLOBs
(2) Perform SVD (3) Select the top eigenvectors as the optimal basis set
“Canonical HRF”
Generating FLOBs
(2) Perform SVD (3) Select the top eigenvectors as the optimal basis set
“Canonical HRF”
temporal derivative
Generating FLOBs
(2) Perform SVD (3) Select the top eigenvectors as the optimal basis set
The resulting basis set can then be used in FEAT
“Canonical HRF”dispersion derivative temporal
derivative
Woolrich et al. , TMI, 2004
Bayesian Inference
⊗Neural Activity
*A +BOLD FMRI data
Gaussian noise
HRF
Priors on HRF parameters
p(A, c, m1,m2 . . . |Y )Joint posterior distribution
Infer using MCMCWoolrich et al. , TMI, 2004
Bayesian Inference
⊗Neural Activity
*A +BOLD FMRI data
Gaussian noise
HRF
Priors on HRF parameters
p(A|Y ) =∫
p(A, c, m1,m2 . . . |Y )dcdm1dm2 . . .
Marginal posterior distribution
• Inputs are raw paradigm (stimulation and “modulation”) timecourses
• Model based on Bilinear Dynamical Systems (Penny 2005), where modulatory input changes neural response to stimulation
• What’s new:• estimate HRF from data• full Bayesian inference on model, using VB
Makni, NeuroImage 2008
Temporal deconvolution of FMRI timecourses
UNCO
RREC
TED P
ROOF
307 VB does not guarantee the optimal global solution, that is308 why a good initialization of the different BDS model309 parameters is important. We initialize the HRF using the310 canonical HRF that is widely used in fMRI to model the311 response. The neuronal response is initialized using a simple
312linear least squares optimization. The neurodynamic para-313meters, the state and the space noise terms are all randomly314initialized. This initialisation is sufficient for our VB315algorithm to converge within 10–20 iterations, taking only3161.5–3 min per single time series of length 250 on a standard
Fig. 1. Results on simulated data using EM and VB (low SNR=11.5, Scan number=250). (a) Driving input v used to generate the data. (b, c) Modulatory inputs un(1) and un(2) used togenerate the data. (d, f) HRF using EM and VB, respectively. (e, g) neuronal response using EM and VB, respectively. In all cases dashed line is for the true value of the parameter andcontinuous line is for the estimated parameter.
4 S. Makni et al. / NeuroImage xxx (2008) xxx–xxx
ARTICLE IN PRESS
Please cite this article as: Makni, S., et al., Bayesian deconvolution fMRI data using bilinear dynamical systems, NeuroImage (2008),doi:10.1016/j.neuroimage.2008.05.052
UNCO
RREC
TED P
ROOF
307 VB does not guarantee the optimal global solution, that is308 why a good initialization of the different BDS model309 parameters is important. We initialize the HRF using the310 canonical HRF that is widely used in fMRI to model the311 response. The neuronal response is initialized using a simple
312linear least squares optimization. The neurodynamic para-313meters, the state and the space noise terms are all randomly314initialized. This initialisation is sufficient for our VB315algorithm to converge within 10–20 iterations, taking only3161.5–3 min per single time series of length 250 on a standard
Fig. 1. Results on simulated data using EM and VB (low SNR=11.5, Scan number=250). (a) Driving input v used to generate the data. (b, c) Modulatory inputs un(1) and un(2) used togenerate the data. (d, f) HRF using EM and VB, respectively. (e, g) neuronal response using EM and VB, respectively. In all cases dashed line is for the true value of the parameter andcontinuous line is for the estimated parameter.
4 S. Makni et al. / NeuroImage xxx (2008) xxx–xxx
ARTICLE IN PRESS
Please cite this article as: Makni, S., et al., Bayesian deconvolution fMRI data using bilinear dynamical systems, NeuroImage (2008),doi:10.1016/j.neuroimage.2008.05.052
To do the Bayes, either:• MCMC (computer takes ages)• Variational Bayes (maths takes ages)
UNCO
RREC
TED P
ROOF
307 VB does not guarantee the optimal global solution, that is308 why a good initialization of the different BDS model309 parameters is important. We initialize the HRF using the310 canonical HRF that is widely used in fMRI to model the311 response. The neuronal response is initialized using a simple
312linear least squares optimization. The neurodynamic para-313meters, the state and the space noise terms are all randomly314initialized. This initialisation is sufficient for our VB315algorithm to converge within 10–20 iterations, taking only3161.5–3 min per single time series of length 250 on a standard
Fig. 1. Results on simulated data using EM and VB (low SNR=11.5, Scan number=250). (a) Driving input v used to generate the data. (b, c) Modulatory inputs un(1) and un(2) used togenerate the data. (d, f) HRF using EM and VB, respectively. (e, g) neuronal response using EM and VB, respectively. In all cases dashed line is for the true value of the parameter andcontinuous line is for the estimated parameter.
4 S. Makni et al. / NeuroImage xxx (2008) xxx–xxx
ARTICLE IN PRESS
Please cite this article as: Makni, S., et al., Bayesian deconvolution fMRI data using bilinear dynamical systems, NeuroImage (2008),doi:10.1016/j.neuroimage.2008.05.052
To do the Bayes, either:• MCMC (computer takes ages)• Variational Bayes (maths takes ages)
Model-free Functional Data Analysis
• decomposes data into a set of statistically independent spatial component maps and associated time courses
• can perform multi-subject/ multi-session analysis
• fully automated (incl. estimation of the number of components)
• inference on IC maps using alternative hypothesis testing
MELODICMultivariate Exploratory Linear Optimised Decomposition
into Independent Components
Model-free Functional Data Analysis
• decomposes data into a set of statistically independent spatial component maps and associated time courses
• can perform multi-subject/ multi-session analysis
• fully automated (incl. estimation of the number of components)
• inference on IC maps using alternative hypothesis testing
MELODICMultivariate Exploratory Linear Optimised Decomposition
into Independent Components
EDA techniques for FMRI
• are mostly multivariate• often provide a multivariate linear decomposition:
space# m
aps=
time Scan #k
FMRI dataspatial maps
time
# mapsspace
Data is represented as a 2D matrix and decomposed into factor matrices (or modes)
Model Order Selection
• can estimate the model order from the Eigenspectrum of the data covariance matrix (corrected using Wishart random matrix theory)
• approximate the Bayesian evidence for the model order for a probabilistic PCA model (PPCA)
Minka, TR 514 MIT Media Lab 2000
Laplace approximation BIC AIC
C.F. Beckmann , J.A. Noble , S.M. SmithOxford Centre for Functional Magnetic Resonance Imaging of the Brain (FMRIB)
Medical Vision Laboratory, Department of Engineering, University of Oxfordemail: beckmann,steve @fmrib.ox.ac.uk
IntroductionAnalysing FMRI data using linear techniques like the
GLM, PCA or ICA can be understood as a form of di-
mensionality reduction where a small set of spatial maps
(Z-scores; IC maps; eigenimages) is sought that, together
with pre-specified (GLM) or estimated (PCA/ICA) time-
courses, represent the signal. The signal subspace is gen-
erally expected to be of lower dimensionality than the data,
e.g. in the case of simple block designs, the signal is implic-
itly assumed to be contained in a subspace roughly of size
less than or equal to the length of one stimulation cycle.
In ICA, it is common to whiten the data using singular value
decomposition. The data is often projected onto a set of
dominant eigenvectors in order to reduce the data dimen-
sionality. It is common practice to arbitrarily choose the
number of principal directions to retain such that the vari-
ance in the discarded directions is negligible [4]. In con-
trast, we present a probabilistic ICA model that will allow
us to address issues of model order or equivalently the num-
ber of latent source signals contained in the data by esti-
mating independent components in lower-dimensional sub-
spaces spanned by the dominant eigenvectors of the covari-
ance matrix of the data.
The choice of the number of components to extract is a
problem of model order selection. Overestimating the di-
mensionality results in a large number of spurious compo-
nents due to underconstrained estimation and a factoriza-
tion that will overfit the data, harming later inference and
dramatically increasing computational costs. Underestima-
tion, however, will discard valuable information and result
in suboptimal signal extraction.
We show that the accuracy of probabilistic ICA estimates
is consistent over a wide range of subspace dimensions be-
yond a value that appears to represent the ‘intrinsic’ dimen-
sionality of the data and discuss different approaches to de-
termining this dimensionality from the data prior to ICA
decomposition.
Probilistic ICAIn the probabilistic ICA model, the -dimensional data vec-
tors are modelled as a mixture of statistically in-
dependent latent sources which are linearly mixed by
and corrupted by additive noise such that
(1)
where denotes -dimensional random noise with zero
mean and unit variance.
In the presence of noise, the covariance matrix of the obser-
vations will be the sum of and the noise covariance [2]
i.e. will be of full rank and its rank will no longer equal
the rank of .
Determining a cutoff value for the eigenvalues of an initial
PCA using simplistic criteria like the reconstruction error
or predictive likelihood [4] that do not incorporate a noise
model will naturally predict that the accuracy steadily in-
creases with increased dimensionality. Many other infor-
mal methods have been proposed, the most popular choice
being the ”scree plot” where one looks for a ”knee” in the
plot of ordered eigenvalues that signifies a split between
significant and unimportant directions of the data.
This naturally raises the issues of sensitivity of ICA to the
subspace dimensionality and possible techniques to deter-
mine the optimal subspace dimensionality from the data.
Estimation Accuracy vs DimensionalityWe generated artificial FMRI data based on autoregres-
sive noise (where the AR coefficients were extracted from
real ‘null’ data ) and in vivo ‘null’ data plus additive sig-
nal as outlined in [1]. The 180-dimensional data was de-
composed using ICA after initially projecting the data into
lower dimensional subspaces (3-180 dimensions) by pro-
jection onto the dominant eigenvectors.
0.75
0.85
0.95
0.75
0.85
0
0.005
0.01
0.015
0.02
0 30 60 90 120 150 1800
0.1
0.2
0.3
0.4
0.5
false neg. rate
false pos. rate
correlation
Figure 1: Spatio-temporal accuracy of ICA esti-
mates as a function of subspace dimensionality
After convergence, we calculated (i) the correlation be-
tween the best extracted and the ‘true’ activation time
courses (temporal accuracy) and (ii) the false positive / false
negative rates between thresholded IC maps and clusters of
voxels that have been used to generate the artificial data
(spatial accuracy) [1].
Figure 1 shows an example for data with two types of artifi-
cial activation introduced in a 10on/10off and 15on/15off
block design. These results suggest that the quality of
the estimates does not significantly improve (and actually
reduces in the case of temporal correlation) if more than
dimensions are retained.
Eigenspectrum AnalysisUnder the assumption of Gaussian noise, the sample co-
variance matrix has a Wishart distribution and we can
utilise results from random matrix theory [3] on the em-
pirical distribution function for the eigenvalues of
the covariance matrix of a random -dimensional ma-
trix . Suppose that as , then
almost surely, where the limiting distribu-
tion has a density
and where .
This can be used to obatin a modification to the scree-plot
where one compares the eigenspectrum of the observations
against the quantiles of the predicted cumulative distribu-
tion , i.e. against the expected eigenspectrum of a
randomGaussian matrix. Given the probabilistic model, we
project the data into a ‘signal’ subspace spanned by those
eigenvectors that violate the ‘null’ hypothesis of random
Gaussian signal in the data. Figure 2 shows an example
of the eigenspectrum for different artificial data sets and
the predicted eigenspectrum of a random Gaussian matrix.
Note, that the increase in AR dependency will render the
data to have more Eigenvectors that cannot be explained as
resulting from a random Gaussian distribution. Note also,
that artificial data based on true ‘null’ data differs signifi-
cantly from AR 16 noise.
0 30 60 90 120 150
0.5
1
1.5
2
2.5
AR 0 noise + signal
0 30 60 90 120 1500
1
2
3
4
5
AR 4 noise + signal
0 30 60 90 120 1500
1
2
3
4
5
6
AR 16 noise + signal
0 30 60 90 120 150
0.5
1
1.5
2
2.5
3
3.5
4
’null’ data + signal
50 100 150
0.6
0.8
1
60 90 120 150
0.4
0.6
0.8
50 100 150
0.60.8
11.2
1.4
0 10 20
1
1.5
2
Figure 2: Eigenvalues (blue) and predicted distri-
bution (red) for different artificial FMRI data sets
0 30 60 90 120 150
0.6
0.7
0.8
0.9
1
AR 0 noise + signal
0 30 60 90 120 150
0.6
0.7
0.8
0.9
1
AR 4 noise + signal
0 30 60 90 120 150
0.6
0.7
0.8
0.9
1
AR 16 noise + signal
0 30 60 90 120 150
0.6
0.7
0.8
0.9
1
‘null’ data + signal
0 1 2 3 4 5 60.95
1
15 20 25 300.95
1
15 20 25 300.95
1
0 1000.96
0.98
1
Figure 3: Bayesian estimates of the intrin-
sic dimensionality: Laplace approximation
due to Minka [5](blue) and Raftery [6](green)
and Bayesian score function of Rajan &
Rayner [7](red).
Bayesian AnalysisAs an alternative to methods based on the expected eigen-
spectrum, the problem of model order can also be ap-
proched within the framework of Bayesian model selection.
Minka [5] develops a simple criterion based on the Laplace
approximation of the Bayesian evidence for the model or-
der and gives a review of other existing techniques, includ-
ing the related technique of Raftery [6] to calculate approx-
imate Bayes factors and the selection criterion of Rajan and
Rayner [7] derived for the probabilistic model (1) with as-
sumed Gaussian noise and unit variance of the sources. Fig-
ure 3 shows the normalized score function for all three tech-
niques. For simple AR noise, all estimators essentially pre-
dict similar dimensionality close to the results in figure 2.
For ‘real’ data, techniques based on Laplace approximation
of Bayesian evidence are similar to the modified scree plot
and predict a dimensionality of 92. The selection crite-
rion due to Rajan& Rayner, however, predicts a much lower
dimensionality (33) that matches the results from the analy-
sis of spatio-temporal accuracy of IC estimates. This differ-
ence appears to be due to the different model assumptions
for the source variances.
DiscussionProjecting the data into lower-dimensional signal subspaces
appears to not interfere with the accuracy to estimate pat-
terns of activation, provided a sufficient number of com-
ponents are extracted. The optimal dimensionality is hard
to estimate exactly and probabilistic models vary depend-
ing on the specific noise model. In all cases, however, the
amount of dimensionality reduction proposed well exceeds
the standard level commonly used for ICA. A high level
of reduction does not only significantly reduce computa-
tional demand, but also allows for an estimate of the noise-
subspace which can be used to infer the validity of IC esti-
mates in the signal subspace.
AcknowledgementsThe authors gratefully acknowledge funding from the UK
Medical Research Council.
References
[1] C.F. Beckmann, J.A. Noble, and S.M. Smith. Investigating the intrinsic dimen-
sionality of FMRI data for ICA. In Seventh Int. Conf. on Functional Mapping of
the Human Brain, 2001.
[2] R. Everson and S.J. Roberts. Inferring the eigenvalues of covariance matrices from
limited, noisy data. IEEE Trans. Signal Processing, 1998. to appear.
[3] I.M. Johnstone. On the distribution of the largest principal component. Technical
report, Department of Statistics, Stanford University, 2000.
[4] M. J. McKeown, S. Makeig, G. G. Brown, T. P. Jung, S. S. Kindermann, A. J. Bell,
and T. J. Sejnowski. Analysis of fMRI data by blind separation into independent
spatial components. Human Brain Mapping, 6(3):160–88, 1998.
[5] T. Minka. Automatic choice of dimensionality for PCA. Technical Report 514,
M.I.T. Media Laboratory – Perceptual Computing Section, 2000.
[6] A.E. Raftery. Approximate bayes factors and accounting for model uncertainty
in generalized linear models. Technical Report 255, Department of Statistics,
University of Washington, 1994.
[7] J. Rajan and P.J.W Rayner. Model order selection for the singular value decompo-
sition and the discrete Karhunen-Lo’eve transform using Bayesian approach. IEE
Vision, Image and Signal Processing, 144:123–166, 1997.
Address for correspondence: Oxford Centre for Functional Magnetic Resonance Imaging of the Brain (FMRIB), Department of Clinical Neurology, University of Oxford, John Radcliffe Hospital, Headington, Oxford OX3 9DU, UK
Probabilistic ICA
GLM analysis standard ICA (unconstrained)
Probabilistic ICA
GLM analysis probabilistic ICA
Probabilistic ICA
• designed to address the ‘overfitting problem’:
• tries to avoid generation of ‘spurious’ results
• high spatial sensitivity and specificity
GLM analysis probabilistic ICA
Applications
EDA techniques can be useful to
‣ investigate the BOLD response• estimate artefacts in the data • find areas of ‘activation’ which respond in a non-
standard way
• analyse data for which no model of the BOLD response is available
Investigate BOLD response
estimated signal time
course
standard hrf model
Applications
EDA techniques can be useful to
• investigate the BOLD response‣ estimate artefacts in the data • find areas of ‘activation’ which respond in a non-
standard way
• analyse data for which no model of the BOLD response is available
slice drop-outs
gradient instability
EPI ghost
high-frequency noise
head motion
field inhomogeneity
eye-related artefacts
eye-related artefacts
eye-related artefacts
Wrap around
Structured Noise and the GLM
• ‘structured noise’ appears:• in the GLM residuals and inflate variance estimates
(more false negatives)
• in the parameter estimates (more false positives and/or false negatives)
• In either case lead to suboptimal estimates and wrong inference!
Structured noise and GLM Z-stats bias
• Correlations of the noise time courses with ‘typical’ FMRI regressors can cause a shift in the histogram of the Z-statistics
• Thresholded maps will have wrong false-positive rate
!1000 !500 0 500 10000
100
200
300
400
500PE raw
!10 !5 0 5 100
50
100
150
200
250
300Z raw
!1000 !500 0 500 10000
100
200
300
400
500PE clean
!10 !5 0 5 100
50
100
150
200
250
300Z cleanRIGHT
Denoising FMRI
• Example: left vs right hand finger tapping
before denoising
after denoisingJohansen-Berg et al.PNAS 2002
!1000 !500 0 500 10000
100
200
300
400
500
600PE raw
!10 !5 0 5 100
50
100
150
200
250
300
350Z raw
!1000 !500 0 500 10000
100
200
300
400
500
600PE clean
!10 !5 0 5 100
50
100
150
200
250
300
350Z cleanLEFT
Denoising FMRI
• Example: left vs right hand finger tapping
before denoising
after denoisingJohansen-Berg et al.PNAS 2002
LEFT - RIGHT contrast
Denoising FMRI
• Example: left vs right hand finger tapping
before denoising
after denoisingJohansen-Berg et al.PNAS 2002
Apparent variability
McGonigle et al.: 33 Sessions under motor paradigm
‘de-noising’ data by regressing out noise:reduced ‘apparent’ session variability
Applications
EDA techniques can be useful to
• investigate the BOLD response• estimate artefacts in the data • find areas of ‘activation’ which respond in a non-
standard way
‣ analyse data for which no model of the BOLD response is available
PICA on resting data
• perform ICA on null data and compare spatial maps between subjects/scans
• ICA maps depict spatially localised and temporally coherent signal changes
Example: ICA maps - 1 subject at 3
different sessions
Spatial characteristics
(a) x=17 y=-73 z=-12
R L
(b) x=-13 y=-61 z=6
R L
(c) x=3 y=-17 z=1.5
R L
(d) x=1 y=-21 z=51
R L
(e) x=-4 y=-29 z=33
R L
(f) x=5 y=6 z=27
R L
(g) x=45 y=-42 z=47
R L
(h) x=-45 y=-42 z=47
R L
Medial visual cortex Lateral Visual Cortex
(a) x=17 y=-73 z=-12
R L
(b) x=-13 y=-61 z=6
R L
(c) x=3 y=-17 z=1.5
R L
(d) x=1 y=-21 z=51
R L
(e) x=-4 y=-29 z=33
R L
(f) x=5 y=6 z=27
R L
(g) x=45 y=-42 z=47
R L
(h) x=-45 y=-42 z=47
R L
Auditory system Sensori-motor system
Spatial characteristics
(a) x=17 y=-73 z=-12
R L
(b) x=-13 y=-61 z=6
R L
(c) x=3 y=-17 z=1.5
R L
(d) x=1 y=-21 z=51
R L
(e) x=-4 y=-29 z=33
R L
(f) x=5 y=6 z=27
R L
(g) x=45 y=-42 z=47
R L
(h) x=-45 y=-42 z=47
R L
Visuospatial system Executive control
(a) x=17 y=-73 z=-12
R L
(b) x=-13 y=-61 z=6
R L
(c) x=3 y=-17 z=1.5
R L
(d) x=1 y=-21 z=51
R L
(e) x=-4 y=-29 z=33
R L
(f) x=5 y=6 z=27
R L
(g) x=45 y=-42 z=47
R L
(h) x=-45 y=-42 z=47
R L
Visual Stream
Temporal deconvolution of ICA timecourses
• ‘What are the “task-related” components?• Use explicit time series model on the IC-
generated temporal modes (using BDS)(a)
(b) (c)
0 50 100 150 200 250 300 350 400 450 500
0
0.5
1
0 50 100 150 200 250 300 350 400 450 500
!1.5
0
1.5
Time (s)
(d) (e)
Fig. 6. Results on the estimated time course for IC1 (using PICA approach), (a):thresholded (at p > 0.5) spatial IC map. (b): Estimated mean of the posteriordistribution of the neuronal response, the shaded regions represent the error bars.(c): Sample of the neuronal response using the estimated mean and covariancematrix of its posterior distribution. (d): Estimated mean of the Gaussian posteriordistribution of the HRF, the shaded regions represent the error bars. (e): VB-BDS fitto fMRI data (using the mean of the variable distributions), the time-series relativeto IC1 (dashed line) and VB-BDS model fit (continuous line).
B Model priors
First we choose for the parameters; a, b and d; zero mean Gaussian priorswith precisions equal to φa, φb and φd, respectively.
The parameter h is a non-parametric unknown function, and is given a zero
13
Example: BDS/ICA integration
• Estimate the mean of the posterior distribution over the neuronal response
(a)
(b) (c)
0 50 100 150 200 250 300 350 400 450 500
0
0.5
1
0 50 100 150 200 250 300 350 400 450 500
!1.5
0
1.5
Time (s)
(d) (e)
Fig. 6. Results on the estimated time course for IC1 (using PICA approach), (a):thresholded (at p > 0.5) spatial IC map. (b): Estimated mean of the posteriordistribution of the neuronal response, the shaded regions represent the error bars.(c): Sample of the neuronal response using the estimated mean and covariancematrix of its posterior distribution. (d): Estimated mean of the Gaussian posteriordistribution of the HRF, the shaded regions represent the error bars. (e): VB-BDS fitto fMRI data (using the mean of the variable distributions), the time-series relativeto IC1 (dashed line) and VB-BDS model fit (continuous line).
B Model priors
First we choose for the parameters; a, b and d; zero mean Gaussian priorswith precisions equal to φa, φb and φd, respectively.
The parameter h is a non-parametric unknown function, and is given a zero
13
(a)
(b) (c)
0 50 100 150 200 250 300 350 400 450 500
0
0.5
1
0 50 100 150 200 250 300 350 400 450 500
!1.5
0
1.5
Time (s)
(d) (e)
Fig. 6. Results on the estimated time course for IC1 (using PICA approach), (a):thresholded (at p > 0.5) spatial IC map. (b): Estimated mean of the posteriordistribution of the neuronal response, the shaded regions represent the error bars.(c): Sample of the neuronal response using the estimated mean and covariancematrix of its posterior distribution. (d): Estimated mean of the Gaussian posteriordistribution of the HRF, the shaded regions represent the error bars. (e): VB-BDS fitto fMRI data (using the mean of the variable distributions), the time-series relativeto IC1 (dashed line) and VB-BDS model fit (continuous line).
B Model priors
First we choose for the parameters; a, b and d; zero mean Gaussian priorswith precisions equal to φa, φb and φd, respectively.
The parameter h is a non-parametric unknown function, and is given a zero
13
• Estimate the mean of the Gaussian posterior distribution of the HRF
(a)
(b) (c)
0 50 100 150 200 250 300 350 400 450 500
0
0.5
1
0 50 100 150 200 250 300 350 400 450 500
!1.5
0
1.5
Time (s)
(d) (e)
Fig. 6. Results on the estimated time course for IC1 (using PICA approach), (a):thresholded (at p > 0.5) spatial IC map. (b): Estimated mean of the posteriordistribution of the neuronal response, the shaded regions represent the error bars.(c): Sample of the neuronal response using the estimated mean and covariancematrix of its posterior distribution. (d): Estimated mean of the Gaussian posteriordistribution of the HRF, the shaded regions represent the error bars. (e): VB-BDS fitto fMRI data (using the mean of the variable distributions), the time-series relativeto IC1 (dashed line) and VB-BDS model fit (continuous line).
B Model priors
First we choose for the parameters; a, b and d; zero mean Gaussian priorswith precisions equal to φa, φb and φd, respectively.
The parameter h is a non-parametric unknown function, and is given a zero
13
• Assess the total model fit
(a)
(b) (c)
0 50 100 150 200 250 300 350 400 450 500
0
0.5
1
0 50 100 150 200 250 300 350 400 450 500
!1.5
0
1.5
Time (s)
(d) (e)
Fig. 6. Results on the estimated time course for IC1 (using PICA approach), (a):thresholded (at p > 0.5) spatial IC map. (b): Estimated mean of the posteriordistribution of the neuronal response, the shaded regions represent the error bars.(c): Sample of the neuronal response using the estimated mean and covariancematrix of its posterior distribution. (d): Estimated mean of the Gaussian posteriordistribution of the HRF, the shaded regions represent the error bars. (e): VB-BDS fitto fMRI data (using the mean of the variable distributions), the time-series relativeto IC1 (dashed line) and VB-BDS model fit (continuous line).
B Model priors
First we choose for the parameters; a, b and d; zero mean Gaussian priorswith precisions equal to φa, φb and φd, respectively.
The parameter h is a non-parametric unknown function, and is given a zero
13
• posterior probability of neuronal response > 0 for the IC
Data
= + Noise
Predicted response
Estimated effect
Integrating GLM and ICAfor FMRI analysis
GLM
βi
Data
= + Noise
Predicted response
Estimated effect
Integrating GLM and ICAfor FMRI analysis
GLM
βi
+ clear conclusions on a particular question
- results depend on the model
PICA
time
=
time
# maps
+ NoiseScan #k
FMRI data
space
time
space
spatial maps
# maps
Data Estimated spatially independent mapsEstimated artefact time courses
Integrating GLM and ICAfor FMRI analysis
PICA
time
=
time
# maps
+ Noise
+ data driven and multivariate approach
- no knowledge about the fMRI paradigm is used
- can be hard to interpret activation results
Scan #k
FMRI data
space
time
space
spatial maps
# maps
Data Estimated spatially independent mapsEstimated artefact time courses
Integrating GLM and ICAfor FMRI analysis
GLM + ICA
space
# maps=
Scan #k
FMRI data
spatial maps
time
# mapsspace
+ Noise
Data Estimated spatially independent mapsEstimated artefact time courses
time
Predicted response
Integrating GLM and ICAfor FMRI analysis
GLM + ICA
space
# maps=
Scan #k
FMRI data
spatial maps
time
# mapsspace
+ Noise
Data Estimated spatially independent mapsEstimated artefact time courses
time
• Stimulus model-based hypothesis testing • Adaptive model-free artefact modelling
- e.g. stimulus correlated motion, physiological noise, networks of spontaneous neuronal activity
Predicted response
Integrating GLM and ICAfor FMRI analysis
0 5 10 15 20 25 30 35 40 45
0
2
5
10
auditory stimulusvisual stimulus
Time (s)
Modelled ICs
Example “model free” IC
Estimated HRFs
Time courseSpatial map
Time courseSpatial map
space
# maps
spatial maps
time
# maps