The General Linear Model (GLM) · Statistical Parametric Mapping: The Analysis of Functional Brain...

With many thanks for slides & images to:

FIL Methods group, Virginia Flanagin and Klaas Enno Stephan

Dr. Frederike Petzschner

Translational Neuromodeling Unit (TNU)Institute for Biomedical Engineering, University of Zurich & ETH Zurich

The General Linear Model (GLM)

Translational Neuromodeling Unit

Overview of SPM

Realignment Smoothing

Normalisation

General linear model

Statistical parametric map (SPM)Image time-series

Parameter estimates

Design matrix

Template

Kernel

Gaussian field theory

p <0.05

Statisticalinference

Research Question:

Where in the brain do we represent listening to sounds?

Image a very simple experiment…

Time

SINGLE VOXEL TIME SERIES…

TIME

Image a very simple experiment…

Time

Question: Is there a change in the BOLD response betweenlistening and rest?

linear model

effectsestimate

error estimatestatistic

You need a model of your data…

ß

BOLD signal

Time = + + erro

r

x1 x2 e

Explain your data… as a combination of experimental manipulation, confounds and errors

Single voxel regression model:regressor

!β1 !β2

y = x1β1 + x2β2 + e

BOLD signal

Time = + + erro

r

x1 x2 e

y = x1β1 + x2β2 + eSingle voxel regression model:

Explain your data… as a combination of experimental manipulation,confounds and errors

!β2!β1

eXy += b

n

= + + erro

r

e

eXy += b

1

n

p

p

1

n

1

n: number of scansp: number of regressors

The black and white version in SPM

Des

ignm

atrix

erro

r!β2!β1

The design matrix embodies all available knowledge about experimentally controlled factors and potential confounds.

Model assumptions

Designmatrix

errorYou want to estimate your parameters such that you minimize:

This can be done using an Ordinary least squares estimation(OLS) assuming an i.i.d. error

et2

t=1

N

∑

error

GLM assumes identical and independently distributed errors

i.i.d. = error covariance is a scalar multiple of the identity matrix

úû

ùêë

é=

1001

)(eCov úû

ùêë

é=

1004

)(eCov úû

ùêë

é=

2112

)(eCov

non-identity non-independencet1 t2

t1

t2

e ≈ N(0,σ 2I )

= +

e

úû

ùêë

é

2

1

bb

y X

„Option 1“: Per hand

How to fit the model and estimate the parameters?

erro

r

= +

e

úû

ùêë

é

2

1

bb

y X

How to fit the model and estimate the parameters?

erro

r

y = Xβe = y− y

e = y− Xβ

min(eTe) =min((y− Xβ)T (y− Xβ))

Data predicted by our model

Error between predicted and actual data

Goal is to determine the betas such that we minimize the quadratic error

OLS (Ordinary Least Squares)

eTe = (y− Xβ)T (y− Xβ)

eTe = (yT − βT XT )(y− Xβ)

eTe = yT y− yTXβ − βT XT y+ βT XTXβ

eTe = yT y− 2βT XT y+ βT XTXβ∂eTe∂β

= −2XT y+ 2XTXβ

0 = −2XT y+ 2XTXβ

β = (XTX)−1XT y

OLS (Ordinary Least Squares)The goal is to minimize the quadratic error between data and model





= −2XT y+ 2XTXβ


β = (XTX)−1XT y


This is a scalar and the transpose of a scalar is a scalar J

The goal is to minimize the quadratic error between data and model





= −2XT y+ 2XTXβ


β = (XTX)−1XT y


This is a scalar and the transpose of a scalar is a scalar J

You find the extremum of a function by taking its derivative and setting it to zero

The goal is to minimize the quadratic error between data and model

y

e

Design space defined by X

x1

x2

A geometric perspective on the GLM

bˆ Xy =

yXXX TT 1)(ˆ -=bOLS estimates

x1

x2x2*

y

Correlated and orthogonal regressors

When x2 is orthogonalized with regard to x1, only the parameter estimate for x1 changes, not that for x2!

Correlated regressors = explained variance is shared between regressors

121

2211

==++=

bbbb exxy

1;1 *21

*2

*211

=>

++=

bb

bb exxy

Design space defined by X

linear model

effectsestimate

error estimatestatisticß

…but we are dealing with fMRI data

We are nearly there…

What are the problems?

n

= + + erro

r

n

p

1

n

!β2!β1

ò -=Ät

dtgftgf0

)()()( ttt

The response of a linear time-invariant (LTI) system is the convolution of the input with the system's response to an impulse (delta function).

Problem 1: Shape of BOLD response

Solution: Convolution model of the BOLD response

expected BOLD response = input function x impulse response function (HRF) ò -=Ä

t

dtgftgf0

)()()( ttt

blue = datagreen = predicted response, taking convolved with HRFred = predicted response, NOT taking into account the HRF

Problem 2: Low frequency noise

blue = datablack = mean + low-frequency driftgreen = predicted response, taking into account low-frequency driftred = predicted response, NOT taking into account low-frequency drift

Problem 2: Low frequency noise

blue = datablack = mean + low-frequency driftgreen = predicted response, taking into account low-frequency driftred = predicted response, NOT taking into account low-frequency drift

Linear model

discrete cosine transform (DCT) set

Solution 2: High pass filtering

Problem 3: Serial correlations

non-identity non-independencet1 t2

t1

t2

i.i.d

Cov(e) = 1 00 1

!

"#

$

%& Cov(e) = 4 0

0 1

!

"#

$

%& Cov(e) = 2 1

1 2

!

"#

$

%&

Cov(e)

n: number of scans

n

n

Problem 3: Serial correlations

• Transform the signal into a space where the error is iid

• Pre-whitening:

1. Use an enhanced noise model with multiple error covariance components, i.e. e ~ N(0,σ2V) instead of e ~ N(0, σ2I).

2. Use estimated serial correlation to specify filter matrix W for whitening the data.

WeWXWy += b

This is i.i.d

Problem 3: How to find W à Model the noise

Cov(e)

n: number of scans

n

n

autocovariancefunction

withttt aee e+= -1 ),0(~ 2se Nt

1st order autoregressive process: AR(1)

Model the noise: Multiple covariance components

),0(~ 2VNe s å=µ

iiQVeCovV

l)(

= Q1 Q2

Estimation of hyperparameters with EM (expectation maximisation) or ReML (restricted maximum likelihood).

V

enhanced noise model error covariance components Qand hyperparameters

!λ1 !+λ2

How do we define W ?

• Enhanced noise model

• Remember linear transform for Gaussians

• Choose W such that error covariance becomes spherical

• Conclusion: W is a simple function of V

WeWXWy += b

),0(~ 2VNe s

),(~),,(~

22

2

sµ

sµ

aaNyaxyNx

Þ

=

2/1

2

22 ),0(~

-=Þ

=Þ

VWIVW

VWNWe s

!ys = Xsβ +es

We are there…

• the GLM models the effect of your experimental manipulation on the acquired data

• GLM includes all known experimental effects and confounds

• estimates effects an errors on a voxel-by-voxel basis

Because we are dealing with fMRI data there are a number of problems we need to take care of:

• Convolution with a canonical HRF

• High-pass filtering to account for low-frequency drifts

• Estimation of multiple variance components (e.g. to account for serial correlations)

linear model

effectsestimate

error estimatestatisticß

We are there…c = 1 0 0 0 0 0 0 0 0 0 0

Null hypothesis: 01 =b

)ˆ(

ˆ

bbT

T

cStdct =

So far we have looked at a single voxel…

single voxel time series

• Mass-univariateapproach: GLM applied to > 100,000 voxels

• Threshold of p<0.05 more than 5000 voxels significant by chance!

• Massive problem with multiple comparisons!

• Solution: Gaussian random field theory

Outlook: further challenges

• correction for multiple comparisons• variability in the HRF across voxels• limitations of frequentist statistics• GLM ignores interactions among voxels

Thank you for listening!

• Friston, Ashburner, Kiebel, Nichols, Penny (2007) Statistical Parametric Mapping: The Analysis of Functional Brain Images. Elsevier.

• Christensen R (1996) Plane Answers to Complex Questions: The Theory of Linear Models. Springer.

• Friston KJ et al. (1995) Statistical parametric maps in functional imaging: a general linear approach. Human Brain Mapping 2: 189-210.

The General Linear Model (GLM) · Statistical Parametric Mapping: The Analysis of Functional Brain...

Documents

Transcript of The General Linear Model (GLM) · Statistical Parametric Mapping: The Analysis of Functional Brain...