
Volumetric Intersubject Registration

John Ashburner

Wellcome Department of Imaging Neuroscience, 12 Queen Square, London, UK.

Intersubject registration for fMRI

* Inter-subject averaging
  * Increase sensitivity with more subjects
    * Fixed-effects analysis
  * Extrapolate findings to the population as a whole
    * Mixed-effects analysis
* Standard coordinate system
  * e.g., Talairach & Tournoux space

Typical overview of fMRI analysis

[Figure: flowchart of the analysis. Box labels: fMRI time-series, Motion Correction, Spatial Normalisation, Anatomical Reference, Smoothing, General Linear Model, Design matrix, Parameter Estimates, Statistical Parametric Map.]

Overview

* Part I: General Inter-subject registration
  * Spatial transformations
    * Affine
    * Global nonlinear
    * Local nonlinear
  * Objective functions for registration
    * Likelihood Models
      * Mean squared difference
      * Information Theoretic measures
    * Prior Models
* Part II: The Segmentation Method in SPM5

Image Registration

Registration - i.e. Optimise the parameters that describe a spatial transformation between the source and reference (template) images

Transformation - i.e. Re-sample according to the determined transformation parameters

A Mapping from one image to another

Need x, y and z coordinates in one image that correspond to those of another

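Affine Transforms

* Rigid-body transformations are a subset
* Parallel lines remain parallel
* Operations can be represented by:

x’ = m11 x + m12 y + m13 z + m14

y’ = m21 x + m22 y + m23 z + m24

z’ = m31 x + m32 y + m33 z + m34

* Or as matrices: y = Mx

$$\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$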

2D Affine Transforms

* Translations by tx and ty
  * x’ = x + tx
  * y’ = y + ty
* Rotation around the origin by θ radians
  * x’ = cos(θ) x + sin(θ) y
  * y’ = -sin(θ) x + cos(θ) y
* Zooms by sx and sy
  * x’ = sx x
  * y’ = sy y
* Shear
  * x’ = x + h y
  * y’ = y

2D Affine Transforms (written in full)

* Translations by tx and ty
  * x’ = 1 x + 0 y + tx
  * y’ = 0 x + 1 y + ty
* Rotation around the origin by θ radians
  * x’ = cos(θ) x + sin(θ) y + 0
  * y’ = -sin(θ) x + cos(θ) y + 0
* Zooms by sx and sy
  * x’ = sx x + 0 y + 0
  * y’ = 0 x + sy y + 0
* Shear
  * x’ = 1 x + h y + 0
  * y’ = 0 x + 1 y + 0
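Writing every transform with the same three coefficients per row means each one is a 3x3 homogeneous matrix, and a sequence of transforms becomes a matrix product. The sketch below illustrates this in plain Python. The function names (`translation`, `rotation`, `zoom`, `shear`, `apply`) and the example values are assumptions made for illustration; the rotation sign convention follows the equations above.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    # Same sign convention as the slide: x' = cos(theta) x + sin(theta) y
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def zoom(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]

def shear(h):
    return [[1.0, h, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def apply(m, x, y):
    """Map the point (x, y) through the homogeneous matrix m."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Compose: zoom first, then rotate, then translate (the right-most matrix acts first).
M = matmul(translation(10.0, -5.0),
           matmul(rotation(math.pi / 2), zoom(2.0, 3.0)))
point = apply(M, 1.0, 1.0)
sheared = apply(shear(0.5), 2.0, 4.0)
```

Because matrix multiplication is not commutative, changing the order of the factors in `M` generally gives a different mapping.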

Polynomial Basis Functions

$$x' = a_{11} + a_{12} x + a_{13} x^2 + a_{14} y + a_{15} x y + a_{16} y^2$$

$$y' = a_{21} + a_{22} x + a_{23} x^2 + a_{24} y + a_{25} x y + a_{26} y^2$$

As used by Roger Woods’ AIR Software

Cosine Transform Basis Functions

As used by SPM software
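As a rough illustration of what a cosine-transform basis looks like, the sketch below builds the lowest-frequency 1D basis functions (in the orthonormal DCT-II form) and forms a smooth displacement as a linear combination of them. This is a minimal 1D sketch, not SPM's implementation; the function names and coefficients are hypothetical.

```python
import math

def dct_basis(n_points, n_basis):
    """Return n_basis cosine basis functions sampled at n_points positions.
    basis[k][i] is the k-th function at sample i (orthonormal DCT-II form)."""
    basis = []
    for k in range(n_basis):
        scale = math.sqrt(1.0 / n_points) if k == 0 else math.sqrt(2.0 / n_points)
        basis.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n_points))
                      for i in range(n_points)])
    return basis

def displacement(coeffs, basis):
    """Displacement field as a linear combination of the basis functions."""
    n_points = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n_points)]

B = dct_basis(n_points=64, n_basis=4)
# Hypothetical coefficients; in registration these are the parameters being optimised.
d = displacement([0.0, 1.5, -0.5, 0.25], B)
```

Using only a few low-frequency functions restricts the deformation to smooth, global shapes, which is why such a basis gives a "global nonlinear" transformation.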

SPM Spatial Normalisation

Non-linear registration

* Begin with affine registration
* Refine with some non-linear registration

Affine registration

Accuracy of Automated Volumetric Inter-subject Registration

[Figure: bar chart of sulcal misregistration. Vertical axis: distance (mm), 0 to 12. Horizontal axis: method (A, D, M, P, R, SPM2).]

Hellier et al. Inter subject registration of functional and anatomical data using SPM. MICCAI'02 LNCS 2489 (2002)

Hellier et al. Retrospective evaluation of inter-subject brain registration. MIUA (2001)

Local Basis Functions

* More detailed deformations use lots of basis functions with local support.
* Local support means that each basis function is zero over most of the image
  * Faster computations

Simple addition of displacements

Notice that there is no longer a one-to-one mapping

Generating large one-to-one deformations

The principle behind the one-to-one mappings of viscous fluid registration

Y2 = Y1∘Y1, Y3 = Y1∘Y2, Y4 = Y1∘Y3

Y5 = Y1∘Y4, Y6 = Y1∘Y5, Y7 = Y1∘Y6, Y8 = Y1∘Y7

Faster to repeatedly square the deformation

Y1, Y2 = Y1∘Y1

Y4 = Y2∘Y2, Y8 = Y4∘Y4

Y16 = Y8∘Y8

Note that this is analogous to computing a matrix exponential (cf. Lie groups and exponential mappings)
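The saving from repeated squaring can be seen in a small 1D experiment. The sketch below composes a small deformation with itself, once by three squarings and once by seven sequential compositions, and both routes arrive at (approximately) the same Y8. The grid, the form of the small deformation Y1, and the helper names `interp` and `compose` are assumptions chosen for illustration. Composition is approximated by linear interpolation.

```python
def interp(field, grid, x):
    """Linearly interpolate a sampled 1D mapping at position x (clamped to the grid)."""
    x = min(max(x, grid[0]), grid[-1])
    step = grid[1] - grid[0]
    i = min(int((x - grid[0]) / step), len(grid) - 2)
    w = (x - grid[i]) / step
    return (1.0 - w) * field[i] + w * field[i + 1]

def compose(a, b, grid):
    """Return the sampled mapping x -> a(b(x))."""
    return [interp(a, grid, bx) for bx in b]

n = 101
grid = [i / (n - 1) for i in range(n)]
# Hypothetical small deformation Y1: identity plus a small smooth displacement
# that vanishes at both ends of the interval.
y1 = [x + 0.01 * x * (1.0 - x) for x in grid]

# Three squarings: Y2 = Y1 o Y1, Y4 = Y2 o Y2, Y8 = Y4 o Y4.
y = y1
for _ in range(3):
    y = compose(y, y, grid)
y8_squaring = y

# Seven sequential compositions reach the same Y8 more slowly.
y = y1
for _ in range(7):
    y = compose(y1, y, grid)
y8_sequential = y
```

Because each small step is one-to-one (monotonic in 1D), the composed mapping stays one-to-one, which is the point of building large deformations this way rather than by simply adding displacements.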

One-to-One Mappings

* One-to-one mappings break down beyond a certain scale
* The concept of a single “best” mapping may be meaningless at higher resolution

Pictures taken from http://www.messybeast.com/freak-face.htm

Optimisation

* Optimisation involves finding some “best” parameters according to an “objective function”, which is either minimised or maximised

* The “objective function” is often related to a probability based on some model

[Figure: objective function plotted against the value of a parameter, marking the most probable solution (global optimum) and two local optima.]

Bayes Rule

* Most registration procedures maximise a joint probability of the deformation (warp) and the images (data).
  * P(Warp, Data) = P(Warp | Data) x P(Data) = P(Data | Warp) x P(Warp)
* In practice, this can be done by minimising
  * -log P(Warp, Data) = -log P(Data | Warp) - log P(Warp)
  * The first term is the likelihood and the second is the prior.

Mean Squared Difference Objective Function

* Assumes one image is a warped version of the other with Gaussian noise added…
  * $P(f_i \mid t) = (2\pi\sigma^2)^{-1/2} \exp\left(-(f_i - g_i(t))^2 / (2\sigma^2)\right)$

so

  * $-\log P(f_i \mid t) = (f_i - g_i(t))^2 / (2\sigma^2) + \tfrac{1}{2}\log(2\pi\sigma^2)$

* Assumes that voxels are independent...
  * $P(f_1, f_2, \ldots, f_N) = P(f_1)\, P(f_2) \cdots P(f_N)$

so

  * $-\log P(f_1, f_2, \ldots, f_N) = \left((f_1 - g_1(t))^2 + (f_2 - g_2(t))^2 + \ldots + (f_N - g_N(t))^2\right) / (2\sigma^2) + \tfrac{1}{2} N \log(2\pi\sigma^2)$
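The final expression above is a sum of squared differences scaled by the noise variance, plus a constant. A minimal sketch of evaluating it follows. The image values are hypothetical, and `g` is assumed to already hold the warped image values g_i(t).

```python
import math

def neg_log_likelihood(f, g, sigma2):
    """-log P(f | t) for independent Gaussian noise of variance sigma2,
    where g holds the warped image values g_i(t)."""
    n = len(f)
    ssd = sum((fi - gi) ** 2 for fi, gi in zip(f, g))
    return ssd / (2.0 * sigma2) + 0.5 * n * math.log(2.0 * math.pi * sigma2)

f = [1.0, 2.0, 3.0, 4.0]
g_good = [1.1, 1.9, 3.0, 4.2]   # warp that nearly matches f
g_bad = [4.0, 3.0, 2.0, 1.0]    # warp that matches poorly

nll_good = neg_log_likelihood(f, g_good, sigma2=1.0)
nll_bad = neg_log_likelihood(f, g_bad, sigma2=1.0)
```

For a fixed noise variance, the second term does not depend on the warp, so minimising the negative log-likelihood is the same as minimising the sum of squared differences.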

Information Theoretic Approaches

* Used when there is no simple relationship between intensities in one image and those of another

Joint Probability Density

* Intensities in one image predict those of another.
* Joint probability often represented by a histogram.

Mutual Information

* $MI = \sum_a \sum_b P(a,b) \log_2 \left[ P(a,b) / (P(a)\, P(b)) \right]$
* Related to entropy: $MI = -H(a,b) + H(a) + H(b)$
  * $H(a) = -\int P(a) \log P(a)\, da$
  * $H(a,b) = -\iint P(a,b) \log P(a,b)\, da\, db$
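The sketch below estimates mutual information from a joint histogram of two discrete-valued images, using the base-2 logarithm as in the formula above. The images are given as flat lists of intensities, and the example values are hypothetical.

```python
import math
from collections import Counter

def mutual_information(a, b):
    """Mutual information (in bits) of two equal-length sequences of
    discrete intensities, estimated from their joint histogram."""
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram
    pa = Counter(a)              # marginal histogram of a
    pb = Counter(b)              # marginal histogram of b
    mi = 0.0
    for (va, vb), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((pa[va] / n) * (pb[vb] / n)))
    return mi

# Intensities related by an inversion: no simple "same intensity" relationship,
# yet one image fully predicts the other.
a = [0, 0, 1, 1, 2, 2, 3, 3]
b = [3 - v for v in a]
mi_dependent = mutual_information(a, b)

# Intensities that carry no information about each other.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The inverted pair scores as highly as an identical pair would, which is why such measures suit registration across modalities.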

More Joint Probabilities

4x256 Joint Histograms


Joint Probabilities generated from Tissue Probability Maps

Rather than using an image of discrete values, multiple images showing which voxels are in which class can be used.

These can be constructed from an average of many subjects

4x256 Joint Histogram

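Priors enforce “smooth” deformations

* Membrane Energy

$$-\log P(\mathbf{Y}) = \frac{\lambda}{2} \int \sum_{i=1}^{3} \left\| \frac{\partial \mathbf{y}(\mathbf{x})}{\partial x_i} \right\|^2 d\mathbf{x}$$

* Bending Energy

$$-\log P(\mathbf{Y}) = \frac{\lambda}{2} \int \sum_{i=1}^{3} \sum_{j=1}^{3} \left\| \frac{\partial^2 \mathbf{y}(\mathbf{x})}{\partial x_i\, \partial x_j} \right\|^2 d\mathbf{x}$$

* Linear Elastic Energy

$$-\log P(\mathbf{Y}) = \int \sum_{i=1}^{3} \sum_{j=1}^{3} \left[ \frac{\lambda}{2} \frac{\partial y_i(\mathbf{x})}{\partial x_i} \frac{\partial y_j(\mathbf{x})}{\partial x_j} + \frac{\mu}{4} \left( \frac{\partial y_i(\mathbf{x})}{\partial x_j} + \frac{\partial y_j(\mathbf{x})}{\partial x_i} \right)^2 \right] d\mathbf{x}$$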

Priors enforce “smooth” deformations

* The form of prior determines how the deformations behave in regions with no matching information

Overview

* Part I: General Inter-subject registration
* Part II: The Segmentation Method in SPM5
  * Modelling intensities by a Mixture of Gaussians
  * Bias correction
  * Tissue Probability Maps to assist the segmentation
  * Warping the tissue probability maps to match the image

Traditional View of Pre-processing

* Brain image processing is often thought of as a pipeline procedure.
  * One tool applied before another etc...
* For example…

Original Image → Skull Strip → Non-uniformity Correct → Classify Brain Tissues → Extract Brain Surfaces

Segmentation in SPM5

* Uses a generative model, which involves:
  * Mixture of Gaussians (MOG)
  * Bias Correction Component
  * Warping (Non-linear Registration) Component

[Figure: graphical model relating the class labels c1, c2, c3, …, cI to the intensities y1, y2, y3, …, yI.]

Ashburner & Friston. Unified Segmentation. NeuroImage 26:839-851 (2005).

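Gaussian Probability Density

* If intensities are assumed to be Gaussian of mean μ_k and variance σ²_k, then the probability of a value y_i is:

$$P(y_i \mid \mu_k, \sigma_k^2) = \frac{1}{(2\pi\sigma_k^2)^{1/2}} \exp\left(-\frac{(y_i - \mu_k)^2}{2\sigma_k^2}\right)$$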

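Non-Gaussian Probability Distribution

* A non-Gaussian probability density function can be modelled by a Mixture of Gaussians (MOG):

$$P(y_i \mid \mu, \sigma, \gamma) = \sum_{k=1}^{K} \frac{\gamma_k}{(2\pi\sigma_k^2)^{1/2}} \exp\left(-\frac{(y_i - \mu_k)^2}{2\sigma_k^2}\right)$$

* γ_k is the mixing proportion - positive and sums to one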

Belonging Probabilities

Belonging probabilities are assigned by normalising to one.

Mixing Proportions

* The mixing proportion γ_k represents the prior probability of a voxel being drawn from class k - irrespective of its intensity.
* So: P(c_i = k | γ) = γ_k
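To make the roles of the mixing proportions and belonging probabilities concrete, the sketch below evaluates a three-class mixture of Gaussians at one intensity. Each class contributes its mixing proportion times its Gaussian density, and the belonging probabilities are these contributions normalised to sum to one. The class means, variances and proportions are hypothetical values chosen only for illustration.

```python
import math

def gaussian(y, mu, var):
    """Gaussian probability density with mean mu and variance var."""
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def belonging_probabilities(y, mus, variances, gammas):
    """P(c = k | y) for each class k: joint probabilities
    gamma_k * N(y; mu_k, var_k), normalised so that they sum to one."""
    joint = [g * gaussian(y, m, v) for g, m, v in zip(gammas, mus, variances)]
    total = sum(joint)   # this total is the mixture density P(y)
    return [j / total for j in joint]

# Hypothetical three-class model.
mus = [30.0, 70.0, 110.0]
variances = [100.0, 100.0, 100.0]
gammas = [0.2, 0.5, 0.3]   # mixing proportions: positive, summing to one

p = belonging_probabilities(68.0, mus, variances, gammas)
```

An intensity of 68 lies closest to the second class mean, so that class receives the largest belonging probability.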

Non-Gaussian Intensity Distributions

* Multiple Gaussians per tissue class allow non-Gaussian intensity distributions to be modelled.
  * E.g. accounting for partial volume effects

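Probability of Whole Dataset

* If the voxels are assumed to be independent, then the probability of the whole image is the product of the probabilities of each voxel:

$$P(\mathbf{y} \mid \mu, \sigma, \gamma) = \prod_{i=1}^{I} \sum_{k=1}^{K} \frac{\gamma_k}{(2\pi\sigma_k^2)^{1/2}} \exp\left(-\frac{(y_i - \mu_k)^2}{2\sigma_k^2}\right)$$

* A maximum-likelihood solution can be found by minimising the negative log-probability:

$$E = -\log P(\mathbf{y} \mid \mu, \sigma, \gamma) = -\sum_{i=1}^{I} \log \left[ \sum_{k=1}^{K} \frac{\gamma_k}{(2\pi\sigma_k^2)^{1/2}} \exp\left(-\frac{(y_i - \mu_k)^2}{2\sigma_k^2}\right) \right]$$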

Modelling a Bias Field

* A bias field is included, such that the required scaling at voxel i, parameterised by β, is ρ_i(β).
* Replace the means by μ_k/ρ_i(β)
* Replace the variances by (σ_k/ρ_i(β))²

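Modelling a Bias Field

* After rearranging...

$$P(y_i \mid c_i = k, \mu_k, \sigma_k, \beta) = \frac{\rho_i(\beta)}{(2\pi\sigma_k^2)^{1/2}} \exp\left(-\frac{(\rho_i(\beta)\, y_i - \mu_k)^2}{2\sigma_k^2}\right)$$

[Figure panels: original image y, bias field ρ(β), and corrected image ρ(β)y.]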

Tissue Probability Maps

* Tissue probability maps (TPMs) are used instead of the proportion of voxels in each Gaussian as the prior.

ICBM Tissue Probabilistic Atlases. These tissue probability maps are kindly provided by the International Consortium for Brain Mapping, John C. Mazziotta and Arthur W. Toga.

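“Mixing Proportions”

* Tissue probability maps for each class are included.
* The probability of obtaining class k at voxel i, given weights γ, is then:

$$P(c_i = k \mid \gamma) = \frac{\gamma_k\, b_{ik}}{\sum_{j=1}^{K} \gamma_j\, b_{ij}}$$

where b_ik is the value of the tissue probability map for class k at voxel i.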
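As a small illustration of tissue probability maps acting as spatially varying priors, the sketch below computes the prior class probabilities at a single voxel. Each map value b_ik is weighted by γ_k and the results are normalised to sum to one, following the form given in the Unified Segmentation paper cited above. The function name and the numbers are hypothetical.

```python
def class_priors(tpm_values, gammas):
    """Prior probability of each class at one voxel, from the tissue
    probability map values b_ik and the class weights gamma_k."""
    weighted = [g * b for g, b in zip(gammas, tpm_values)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Hypothetical TPM values at one voxel (e.g. grey matter, white matter, CSF).
b_i = [0.6, 0.3, 0.1]

equal_weights = class_priors(b_i, [1.0, 1.0, 1.0])   # reproduces the TPM values
reweighted = class_priors(b_i, [0.5, 2.0, 1.0])      # favours the second class
```

With equal weights the maps are used unchanged. Unequal weights let the model adapt the overall proportion of each tissue to the image being segmented.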

Deforming the Tissue Probability Maps

* Tissue probability images are deformed according to parameters α.
* The probability of obtaining class k at voxel i is then:

$$P(c_i = k \mid \gamma, \alpha) = \frac{\gamma_k\, b_{ik}(\alpha)}{\sum_{j=1}^{K} \gamma_j\, b_{ij}(\alpha)}$$

The Extended Model

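* By combining the modified P(c_i = k | θ) and P(y_i | c_i = k, θ), where θ denotes all the model parameters, the overall objective function (E) becomes:

$$E = -\sum_{i=1}^{I} \log \left[ \frac{\rho_i(\beta)}{\sum_{k=1}^{K} \gamma_k\, b_{ik}(\alpha)} \sum_{k=1}^{K} \gamma_k\, b_{ik}(\alpha)\, (2\pi\sigma_k^2)^{-1/2} \exp\left(-\frac{(\rho_i(\beta)\, y_i - \mu_k)^2}{2\sigma_k^2}\right) \right]$$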

The Objective Function

Optimisation

* The “best” parameters are those that minimise this objective function.
* Optimisation involves finding them.
* Begin with starting estimates, and repeatedly change them so that the objective function decreases each time.

Steepest Descent

[Figure: descent path from the start point to the optimum.]

Alternate between optimising different groups of parameters

Schematic of optimisation

(γ: mixing weights; μ and σ²: Gaussian means and variances; α: warp parameters; β: bias field parameters)

Repeat until convergence…

* Hold γ, μ, σ² and α constant, and minimise E w.r.t. β
  * Levenberg-Marquardt strategy, using dE/dβ and d²E/dβ²
* Hold γ, μ, σ² and β constant, and minimise E w.r.t. α
  * Levenberg-Marquardt strategy, using dE/dα and d²E/dα²
* Hold α and β constant, and minimise E w.r.t. γ, μ and σ²
  * Use an Expectation Maximisation (EM) strategy.

end

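Levenberg-Marquardt Optimisation

* LM optimisation is used for nonlinear registration (α) and bias correction (β).
* Requires first and second derivatives of the objective function (E).
* Parameters α and β are updated by

$$\alpha^{(n+1)} = \alpha^{(n)} - \left( \frac{\partial^2 E}{\partial \alpha^2} + \lambda I \right)^{-1} \frac{\partial E}{\partial \alpha}, \qquad \beta^{(n+1)} = \beta^{(n)} - \left( \frac{\partial^2 E}{\partial \beta^2} + \lambda I \right)^{-1} \frac{\partial E}{\partial \beta}$$

* Increase λ to improve stability (at expense of decreasing speed of convergence).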
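The effect of the damping term can be seen in a single-parameter toy problem. The sketch below applies a damped Newton update, theta <- theta - dE / (d2E + lam), raising `lam` whenever a step would not decrease E and lowering it after a successful step. The objective, the starting point and the factor-of-ten schedule are assumptions chosen for illustration. With several parameters, the division becomes a linear solve with the Hessian plus lam times the identity.

```python
def levenberg_marquardt(E, dE, d2E, theta, lam=1.0, n_iter=50):
    """Minimise a scalar objective E(theta) with damped Newton updates
    theta <- theta - dE / (d2E + lam)."""
    for _ in range(n_iter):
        g, h = dE(theta), d2E(theta)
        while True:
            denom = h + lam
            candidate = theta - g / denom if denom > 0 else None
            if candidate is not None and E(candidate) <= E(theta):
                theta = candidate
                lam = max(lam / 10.0, 1e-8)   # accepted: damp less next time
                break
            lam *= 10.0                       # rejected: damp more strongly
            if lam > 1e12:                    # give up if no step helps
                return theta
    return theta

# Toy objective with a minimum at sqrt(2). Its second derivative is negative
# at the starting point, so an undamped Newton step would move the wrong way.
E = lambda t: (t * t - 2.0) ** 2
dE = lambda t: 4.0 * t * (t * t - 2.0)
d2E = lambda t: 12.0 * t * t - 8.0

theta_hat = levenberg_marquardt(E, dE, d2E, theta=0.5)
```

Large values of `lam` turn the update into a short gradient-descent step (stable but slow), while small values recover the Newton step (fast near the optimum).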

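Expectation Maximisation is used to update μ, σ² and γ

* For iteration (n), alternate between:
  * E-step: Estimate belonging probabilities by:

$$q_{ik}^{(n)} = \frac{P(y_i, c_i = k \mid \theta^{(n)})}{\sum_{j=1}^{K} P(y_i, c_i = j \mid \theta^{(n)})}$$

  * M-step: Set θ^(n+1) to values that reduce:

$$-\sum_{i=1}^{I} \sum_{k=1}^{K} q_{ik}^{(n)} \log P(y_i, c_i = k \mid \theta)$$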
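The alternation can be sketched for a plain 1D mixture of Gaussians, without bias field or tissue probability maps, where the M-step has closed-form updates. This is a minimal illustration under those simplifying assumptions, not the SPM5 implementation. The data and starting estimates are hypothetical.

```python
import math

def em_mog(y, mus, variances, gammas, n_iter=50):
    """Fit a 1D mixture of Gaussians to the intensities y by EM."""
    mus, variances, gammas = list(mus), list(variances), list(gammas)
    K = len(mus)
    for _ in range(n_iter):
        # E-step: belonging probabilities q[i][k] = P(c_i = k | y_i, theta)
        q = []
        for yi in y:
            joint = [gammas[k]
                     * math.exp(-(yi - mus[k]) ** 2 / (2.0 * variances[k]))
                     / math.sqrt(2.0 * math.pi * variances[k])
                     for k in range(K)]
            total = sum(joint)
            q.append([j / total for j in joint])
        # M-step: weighted means, variances and proportions
        for k in range(K):
            w = sum(qi[k] for qi in q)
            mus[k] = sum(qi[k] * yi for qi, yi in zip(q, y)) / w
            var = sum(qi[k] * (yi - mus[k]) ** 2 for qi, yi in zip(q, y)) / w
            variances[k] = max(var, 1e-6)   # guard against a collapsing variance
            gammas[k] = w / len(y)
    return mus, variances, gammas

# Hypothetical intensities drawn from two well-separated classes.
y = [9.0, 10.0, 11.0, 10.5, 9.5, 29.0, 30.0, 31.0, 30.5, 29.5]
mus, variances, gammas = em_mog(y, [5.0, 35.0], [25.0, 25.0], [0.5, 0.5])
```

Each iteration cannot increase the negative log-likelihood, which is what makes EM a convenient choice for the mixture parameters.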

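Regularisation

* Some bias fields and warps are more probable (a priori) than others.
* Encoded using Bayes rule (for a maximum a posteriori solution):

$$P(\mathbf{y}, \beta, \alpha \mid \gamma, \mu, \sigma^2) = P(\mathbf{y} \mid \beta, \alpha, \gamma, \mu, \sigma^2)\, P(\beta)\, P(\alpha)$$

* Prior probability distributions modelled by a multivariate normal distribution.
  * Mean vectors α_0 and β_0
  * Covariance matrices Σ_α and Σ_β
  * $-\log[P(\alpha)] = \tfrac{1}{2} (\alpha - \alpha_0)^T \Sigma_\alpha^{-1} (\alpha - \alpha_0) + \text{const}$
  * $-\log[P(\beta)] = \tfrac{1}{2} (\beta - \beta_0)^T \Sigma_\beta^{-1} (\beta - \beta_0) + \text{const}$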

Tissue probability maps of GM and WM

Spatially normalised BrainWeb phantoms (T1, T2 and PD)

Cocosco, Kollokian, Kwan & Evans. “BrainWeb: Online Interface to a 3D MRI Simulated Brain Database”. NeuroImage 5(4):S425 (1997)

Summary

* Part I: General Inter-subject registration
  * Spatial transformations
    * Affine
    * Global nonlinear
    * Local nonlinear
  * Objective functions for registration
    * Likelihood Models
      * Mean squared difference
      * Information Theoretic measures
    * Prior Models
* Part II: The Segmentation Method in SPM5
  * Modelling intensities by a Mixture of Gaussians
  * Bias correction
  * Tissue Probability Maps to assist the segmentation
  * Warping the tissue probability maps to match the image

References

* Friston et al. Spatial registration and normalisation of images. Human Brain Mapping 3:165-189 (1995).
* Collignon et al. Automated multi-modality image registration based on information theory. IPMI’95 pp 263-274 (1995).
* Ashburner et al. Incorporating prior knowledge into image registration. NeuroImage 6:344-352 (1997).
* Ashburner & Friston. Nonlinear spatial normalisation using basis functions. Human Brain Mapping 7:254-266 (1999).
* Thévenaz et al. Interpolation revisited. IEEE Trans. Med. Imaging 19:739-758 (2000).
* Andersson et al. Modeling geometric deformations in EPI time series. NeuroImage 13:903-919 (2001).
* Ashburner & Friston. Unified Segmentation. NeuroImage 26:839-851 (2005).

Spare slides

Very hard to define a one-to-one mapping of cortical folding

Use only approximate registration.

Smooth

[Figure panels: before convolution; convolved with a circle; convolved with a Gaussian.]

Smoothing is done by convolution.

Each voxel after smoothing effectively becomes the result of applying a weighted region of interest (ROI).
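The "weighted region of interest" view can be made concrete in 1D. The sketch below builds a normalised Gaussian kernel from a full width at half maximum (FWHM) and convolves it with a signal. The function names, the truncation at three standard deviations, and the zero padding at the edges are assumptions made for illustration.

```python
import math

def gaussian_kernel(fwhm):
    """Sampled, normalised 1D Gaussian kernel; fwhm is given in voxels."""
    sigma = fwhm / math.sqrt(8.0 * math.log(2.0))
    radius = int(math.ceil(3.0 * sigma))   # truncate at about 3 standard deviations
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def smooth(signal, kernel):
    """Convolve signal with a symmetric kernel, treating samples outside
    the signal as zero."""
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - r
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out

# A single bright voxel spreads into a copy of the kernel.
impulse = [0.0] * 21
impulse[10] = 1.0
smoothed = smooth(impulse, gaussian_kernel(fwhm=4.0))
```

Each output value is a weighted average of its neighbours, with weights given by the kernel. In 3D the same kernel can be applied along each axis in turn, because the Gaussian is separable.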

Voxel-to-world Transforms

* Affine transform associated with each image
  * Maps from voxels (x=1..nx, y=1..ny, z=1..nz) to some world co-ordinate system, e.g.,
    * Scanner co-ordinates - images from DICOM toolbox
    * T&T/MNI coordinates - spatially normalised
* Registering image B (source) to image A (target) will update B’s voxel-to-world mapping
  * Mapping from voxels in A to voxels in B is by
    * A-to-world using $M_A$, then world-to-B using $M_B^{-1}$
    * $M_B^{-1} M_A$
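A sketch of the voxel-to-voxel mapping described above follows, using plain nested lists for the 4x4 matrices. The two voxel-to-world matrices (2 mm and 3 mm voxels with different origins) are hypothetical, and the helper names are assumptions. Only the composition of the inverse of B's matrix with A's matrix comes from the text.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def map_point(m, x, y, z):
    """Apply a 4x4 affine matrix to the point (x, y, z)."""
    v = [x, y, z, 1.0]
    return tuple(sum(m[i][k] * v[k] for k in range(4)) for i in range(3))

# Hypothetical voxel-to-world matrices for images A and B.
M_A = [[2, 0, 0, -90], [0, 2, 0, -126], [0, 0, 2, -72], [0, 0, 0, 1]]
M_B = [[3, 0, 0, -93], [0, 3, 0, -129], [0, 0, 3, -75], [0, 0, 0, 1]]

A_to_B = matmul(inverse(M_B), M_A)   # voxels in A -> world -> voxels in B

world_from_A = map_point(M_A, 46, 64, 37)
voxel_in_B = map_point(A_to_B, 46, 64, 37)
world_from_B = map_point(M_B, *voxel_in_B)
```

The same world position is reached from either image, which is the check that the composed mapping is consistent. The resulting voxel coordinates in B are generally not integers, so the image must be interpolated when it is re-sampled.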

Left- and Right-handed Coordinate Systems

* Analyze™ files are stored in a left-handed system
* Talairach & Tournoux uses a right-handed system
* Mapping between them requires a flip
  * Affine transform with a negative determinant

Transforming an image

* Images are re-sampled. An example in 2D:

for y = 1:ny                      % loop over rows
    for x = 1:nx                  % loop over pixels in row
        xp = tx(x, y, a);         % x’: transform according to a
        yp = ty(x, y, a);         % y’
        if 1 <= xp && xp <= nx && 1 <= yp && yp <= ny   % voxel in range
            f(x, y) = fp(xp, yp); % assign re-sampled value; fp samples the source image f’
        end                       % voxel in range
    end                           % loop over pixels in row
end                               % loop over rows

* What happens if x’ and y’ are not integers?

Simple Interpolation

* Nearest neighbour
  * Take the value of the closest voxel
* Tri-linear
  * Just a weighted average of the neighbouring voxels
  * f5 = f1 x2 + f2 x1
  * f6 = f3 x2 + f4 x1
  * f7 = f5 y2 + f6 y1
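The two schemes can be sketched in 2D (bilinear rather than trilinear) as follows. Each neighbour is weighted by the distance to the opposite neighbour, which is what the f5, f6 and f7 expressions describe. The function names and the tiny 2x2 image are hypothetical, and 0-based coordinates are assumed, with the caller keeping (x, y) inside the image.

```python
import math

def bilinear(image, x, y):
    """Sample image (a list of rows, indexed image[row][col]) at the
    non-integer position (x, y), using 0-based coordinates."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    wx, wy = x - x0, y - y0   # distances from the lower neighbours
    top = image[y0][x0] * (1.0 - wx) + image[y0][x1] * wx
    bottom = image[y1][x0] * (1.0 - wx) + image[y1][x1] * wx
    return top * (1.0 - wy) + bottom * wy

def nearest(image, x, y):
    """Take the value of the closest pixel."""
    return image[int(math.floor(y + 0.5))][int(math.floor(x + 0.5))]

image = [[0.0, 10.0],
         [20.0, 30.0]]

centre = bilinear(image, 0.5, 0.5)       # equal weights on all four pixels
on_top_edge = bilinear(image, 0.25, 0.0)
closest = nearest(image, 0.75, 0.25)
```

Interpolating first along x and then along y gives the same result as the reverse order, so the weighted average is well defined.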

B-spline Interpolation

B-splines are piecewise polynomials

A continuous function is represented by a linear combination of basis functions

2D B-spline basis functions of degrees 0, 1, 2 and 3

Nearest neighbour and trilinear interpolation are the same as B-spline interpolation with degrees 0 and 1.

Inverse

Template images: EPI, T2, T1, Transm, PD, PET

“Canonical” images: 305T1, PD, T2, SS

A wider range of contrasts can be registered to a linear combination of template images.

Spatial normalisation can be weighted so that non-brain voxels do not influence the result.

Similar weighting masks can be used for normalising lesioned brains.

Spatial Normalisation - Templates

[Figure: T1, PD and PET template images.]

[Figure panels: template image; affine registration (χ² = 472.1); non-linear registration without regularisation (χ² = 287.3); non-linear registration using regularisation (χ² = 302.7).]

Without regularisation, the non-linear spatial normalisation can introduce unnecessary warps.

Spatial Normalisation - Overfitting

A Growing Trend

* Larger and more complex models are being produced to explain brain imaging data.
  * Bigger and better computers
    * allow more powerful models to be used
  * More experience among software developers
    * Older and wiser
    * More engineers - rather than e.g. psychiatrists & biochemists
* This presentation is about combining various preprocessing procedures for anatomical images into a single generative model.

Another example (for VBM)

Bias Correction helps Registration

* MRI images are corrupted by a smooth intensity non-uniformity (bias).
* Image intensity non-uniformity artefact has a negative impact on most registration approaches.
* Much better if this artefact is corrected.

Image with bias artefact

Corrected image

Bias Correction helps Segmentation

* With a bias artefact, similar tissues no longer have similar intensities.
* Artefact should be corrected to enable intensity-based tissue classification.

Registration helps Segmentation

* SPM99 and SPM2 require tissue probability maps to be overlaid prior to segmentation.

Segmentation helps Bias Correction

* Bias correction should not eliminate differences between tissue classes.
* Can be done by
  * making all white matter about the same intensity
  * making all grey matter about the same intensity
  * etc
* Currently fairly standard practice to combine bias correction and tissue classification

Segmentation helps Registration

[Figure: flowchart titled “A convoluted method using SPM2”. Box and arrow labels: Original MRI, Template, Affine register, Tissue probability maps, Segment, Grey Matter, Spatial Normalisation - estimation, Deformation, Affine Transform, Spatial Normalisation - writing, Spatially Normalised MRI.]