
A Data Mining Approach to the

Study of Dynamic Changes in

Brain White Matter

Melvin Gelbard

Master of Science

Artificial Intelligence

School of Informatics

University of Edinburgh

2019


Abstract

In brain magnetic resonance imaging, white matter hyper-intensities are a visual indi-

cator of small vessel disease. Furthermore, they have been associated with the develop-

ment of degenerative conditions such as vascular dementia. Therefore, a lot of research

is dedicated to the analysis of such pathological features. However, poor segmentation

and characterisation of these lesions have led to ambiguous or contradictory hypotheses.

To resolve this issue, we present our two contributions: the LOTS-IAM-3D and a lesion

characterisation pipeline. The LOTS-IAM-3D outperforms its 2D implementation and

the state-of-the-art unsupervised segmentation method for the identification of brain

white matter hyper-intensities in terms of both accuracy and processing speed. Then,

the characterisation pipeline produced in this work introduces a robust and precise

way of carrying out lesion characterisation that accounts for inter- and intra-individual

differences, as well as for differences across a whole population. We applied our segmentation module on

two datasets of patients with mild strokes and presented a thorough evaluation of the

LOTS-IAM-3D followed by a complete lesion analysis based on our pipeline.


Acknowledgements

First of all, I would like to thank my supervisor, Professor Taku Komura, for offering

me the opportunity to work on such an inspiring project and giving me the willingness

to pursue a career in biomedical imaging.

Second of all, I want to show my gratitude towards Dr. María Valdés Hernández

and Febrian Rachmadi, for their constant support and ideas that helped me go through

this challenging task. This project would not have been possible without you.

Finally, I would like to thank my family, Marc, Carine, Olivia, Dan, Tzuki and

Quito for their support from far away and Elisabetta, Anisha, Katie, Gaurav, Jordi and

Marko for their support in Appleton Tower level 9.


Table of Contents

1 Introduction
    1.1 Motivation
    1.2 Research Goal
    1.3 Significance
    1.4 Beneficiaries

2 Background Research
    2.1 Ground Medical Knowledge
        2.1.1 Stroke
        2.1.2 Magnetic Resonance Imaging
        2.1.3 MRI Analysis
        2.1.4 White Matter Hyper-intensities
    2.2 Ground Technical Knowledge
        2.2.1 LOTS-IAM
        2.2.2 Gaussian Mixture Models
        2.2.3 White Matter Damage Metric

3 Methodology
    3.1 Ethics
    3.2 Dataset
    3.3 Modification of the LOTS-IAM
        3.3.1 LOTS-IAM-3D
        3.3.2 Data Structure
        3.3.3 Target Patch Selection From Prior
    3.4 Multi-Spectral Clustering
        3.4.1 Gaussian Mixture Model
    3.5 Tools and Libraries

4 Segmentation Evaluation
    4.1 Metrics Used
        4.1.1 Measures of Relevance
        4.1.2 Jaccard Index & Dice coefficient
        4.1.3 Bland-Altman Measure
    4.2 WMH Segmentation
        4.2.1 Total WMH Load
        4.2.2 Target Patch Selector Performance
        4.2.3 WMH Subtle Changes
    4.3 Multi-Spectral Clustering Performance
    4.4 Execution Speed

5 Lesion Characterisation
    5.1 Stability Index
    5.2 Analysis Pipeline
    5.3 Initial Analysis
    5.4 Lesion State Analysis
    5.5 Trend Separation
    5.6 Correlation Analysis
    5.7 Growth Curve Analysis

6 Conclusions

Bibliography


Chapter 1

Introduction

It is well established that white matter hyper-intensities (short: WMHs) are connected

to the progression of neurodegenerative conditions such as dementia, Alzheimer's disease

(short: AD) and the occurrence of strokes [46, 9, 2]. These neurological syndromes

usually appear in individuals of older age and are believed to be accelerated

by small vessel disease (short: SVD), also predominant in elderly subjects [35].

Every year, around 15 million people are affected by brain strokes and 850,000 people

by dementia [38]. Despite the considerable improvement in quality of life in the past

decades, these numbers are a sign that age-related neurodegenerative diseases remain

a struggle for modern medicine [48], as they can occur in any subset of the world

population.

Research strongly suggests that the presence of WMH is an indicator of SVD,

which leads to the development of cognitive impairment. However, their aetiology

is not very well understood [52]. The analysis of these brain white matter lesions

of presumed vascular origin is clinically important [54] as they “predict an increased

risk of stroke, dementia, and death” [12] and methods could potentially be found to

minimise their development.

Nowadays, the progress in modern technology allows a more precise analysis of

these lesions with the use of magnetic resonance imaging (short: MRI). This imaging

method facilitates the examination of pathology in the brain. Naturally, the automatic

identification of the pathological brain features became a rapidly growing field of re-

search. Most of these automatic methods rely on expert-generated data. However,

the skills and experience of the specialist performing the analysis highly influence the

quality of the reference data. To solve this issue, these labels are generated by groups

of experts, usually blinded to each other. Nevertheless, this procedure suffers from


intra-observer and inter-observer inconsistencies [47] and is time-consuming as well

as costly. Furthermore, despite the extended research on the subject, many subtleties

lying behind the general knowledge of these pathological features are still unexplained

or unexplored [20]. This is reinforced by two common problems: (1) the lack of a standard

quantification metric regarding the evolution of lesions in the brain and (2) the

variability in the segmentation and classification of white matter damage [20]. Both

have led to contradictory results and ambiguous hypotheses [16].

Recently, promising work carried out at the Centre for Clinical Brain Sciences

(short: CCBS) at the University of Edinburgh attempted to solve the issue of lesion

quantification by introducing a new metric for quantifying white matter damage on

MRI [24]. This measure is more robust to variations in the scanning protocols and

manages to capture any level of white matter damage in the brain. However, no useful

analysis can be drawn without an accurate segmentation of the white matter volume

first, which generally lacks specificity due to brain tissue irregularities.

Many tools such as deep convolutional networks [26] and support vector machines

[22] have been used for solving the issue of inaccurate WMH segmentation. However,

they usually lack practicality as they are tuned for specific MRI settings. Furthermore,

they require large labelled datasets to perform supervised learning, which are

challenging to acquire in the field of medical research. Nonetheless, an interesting ap-

proach was taken by Rachmadi et al. in their recent publication [34]. This innovative

method, based on texture analysis from computer graphics [5], allows the generation

of probability maps to detect abnormalities in the input image. It is fully unsupervised

and therefore does not require any training on labelled datasets. Even though it suffers

from some shortcomings, such as lesion underestimation in highly damaged brains and

slow processing time, it was taken as the basis for this project as it is believed that it can

outperform current state-of-the-art approaches. An overview of the shortcomings is

detailed in section 2.2.1 and an explanation of the modification is given in section 3.3.

This thesis aims at helping to establish a robust pipeline for the segmentation and

characterisation of white matter damage and its progression in brain MRI. To do so,

it will use two brain imaging datasets of patients with mild stroke. They comprise

MRI scans from 17 and 43 patients respectively, taken at three different time points.

The acquisition of these datasets received ethical approval from the Lothian Research Ethics

Committee (09/S1101/54, LREC/2003/2/29, REC 09/81101/54), the NHS Lothian R+D

Office (2009/W/NEU/14) and the Multi-Centre Research Ethics Committee for Scotland

(MREC/01/0/56) and was conducted according to the principles expressed in the

Declaration of Helsinki [24].

This thesis is structured by dividing each aspect of the project into specific chap-

ters and sections that do not require the reader to have previous knowledge in biology

or data mining. First, chapter 1 will present the main motivation and objective of this

dissertation (sections 1.1 & 1.2), along with a detailed account of its significance in the

scientific world and an explanation of who will be the beneficiaries of this project (sec-

tions 1.3 & 1.4). Then, chapter 2 will present the reader with background information

regarding the main contribution of the thesis.

The contribution of this thesis is twofold. First, the initial stage finds an innovative way to accurately segment out white matter lesions appearing in brain MRI, as detailed in chapter 3 and evaluated in chapter 4. Second, the following stage quantifies and characterises the recently segmented white matter volume with a pipeline of analyses designed by the author, as detailed in chapter 5.

Finally, chapter 6 will discuss the results presented in the previous chapters, along

with a small investigation about the limitations of the project and future work.

1.1 Motivation

In their review, Lladó et al. call for an automatic segmentation and quantification

system for better analysis and identification of brain pathology [29]. Their

legitimate concerns come from the fact that there is no common ground, at the time of

writing, for the assessment of abnormalities (including WMH) and the characterisation

of their evolution. However, there is a crucial need for such work to be carried out as

many fully-automated algorithms have been found to perform better than experts on

similar tasks [42].

This project is motivated by the willingness to move the field forward and con-

tribute with an improved and time-efficient segmentation module and characterisation

pipeline for brain MRI analysis. It was carried out with the hope that it will be used by

the research community for the establishment of strong common grounds among dif-

ferent institutions. This would allow future findings to be more trustworthy and enable

cooperation between research centres that currently use different tools for segmenta-

tion and analysis.


1.2 Research Goal

The objective of this research project is to address the shortcomings in the clinical ap-

plication of the LOTS-IAM to WMH segmentation by developing a complete system

that: (1) allows the accurate segmentation of WMH from brain MRI taken at different

time points and (2) allows accurate information to be gleaned from it. The segmenta-

tion module takes a 3D-array MAT file representing the scan as input and performs a

patch comparison-based abnormality detection. Then, it generates a probability map

for each voxel, representing how likely it is to be WMH. The focus of this thesis is not

the separation of WMH of presumed vascular origin from WMH caused by stroke incidents.

Therefore, the regions segmented by the module could originate from both.

Next, the second part of the system is an analysis pipeline designed for the characterisation

of the previously-found abnormalities. It takes the output from the first

module and generates a full description of the lesions over time and a complete analy-

sis of the generated numbers.

1.3 Significance

In collaboration with Dr. Taku Komura, Dr. María Valdés Hernández and Febrian

Rachmadi, the project is at the centre of their current aim in this field: being able to

describe and characterise subtle changes in white matter hyper-intensity volumes in

the brain over time. Currently, the full analysis (segmentation and characterisation) is

semi-automatically performed, with a great deal of manual input. A substantial amount

of time is spent on these tasks, whereas an efficient algorithm could allow researchers

and doctors to fully focus on the diagnosis and treatment administration.

1.4 Beneficiaries

The first motivation behind this project was the curiosity for the exploration of new

techniques and improvements for WMH segmentation and characterisation. However,

due to its interesting results, it will benefit any academic party interested in the devel-

opment of brain lesion segmentation and analysis algorithms. Furthermore, as a result

of its original implementation, the system is beneficial to experts in both the field of

computer science and medicine. In the long run, clinicians and patients with WMH in

need of an accurate diagnosis will be able to benefit from it.


Chapter 2

Background Research

2.1 Ground Medical Knowledge

2.1.1 Stroke

According to the World Health Organization, brain strokes are the second leading

cause of death worldwide, with more than 15 million people affected every year [38].

Even if about a third do not leave any irreversible damage, the rest of them can have

dreadful consequences. Great progress in medicine, coupled with an increase in the

quality of life worldwide led to an impressive decline in the number of recorded

strokes. However, recent statistics from the Office for National Statistics

rank strokes as the 4th leading cause of death in the United Kingdom [19].

Strokes are neurological dysfunctions happening in the brain for a short period of

time. When they occur, the brain is deprived of its oxygen and nutrients, preventing

it from operating as designed [18]. Strokes can cause significant damage to the body

and affect the patient’s ability to carry out tasks such as talking or moving. We observe

two main types of strokes: haemorrhagic and ischaemic strokes (see figure 2.1). They

might inflict the same damage on the body but originate from different phenomena.

Figure 2.1: Tree representing the different kinds of strokes. Strokes divide into ischaemic strokes (thrombotic or embolic) and haemorrhagic strokes (intracerebral or subarachnoid).


Haemorrhagic strokes, also called aneurysms, are the least common types of strokes,

according to the National Stroke Association. Only around 15% of all stroke incidents

are diagnosed as haemorrhagic. Nevertheless, they are known to be fatal for the patient

in most cases. During such an attack, the blood vessels leak or burst, and

flood the cavities of the brain, preventing blood from reaching its destination. This

is common in patients with high blood pressure, which induces weakened and more

fragile blood vessels. Haemorrhagic strokes can be divided further into two sub-types.

The first sub-type, intracerebral haemorrhage, appears when the leak or burst occurs

inside the brain and affects the surrounding brain cells first. The second sub-type,

subarachnoid haemorrhage, appears when the damaged blood vessel is positioned in the

subarachnoid space.

On the other hand, ischaemic strokes are the most common types of strokes, with

around 85% of all strokes recorded. They appear in specific cases where the blood

cannot circulate properly to the brain because the blood vessels are either too narrow

or obstructed by a clot. The origin of the clot defines the subtypes of ischaemic attacks.

Embolic strokes are caused by a clot, called an embolus, that forms outside the brain area and is

not problematic until it reaches a thinner capillary and prevents good blood circulation.

Thrombotic strokes are caused by a clot, called a thrombus, that forms directly in the capillaries

of the brain. Ischaemic strokes are more prevalent in individuals suffering from high

blood pressure, obesity and drug addiction. Depending on the length of the stroke, the

damage may have drastic consequences on the patient’s body [50].

According to the annual report of the National Stroke Organisation, there are cur-

rently 1.2 million patients who survived a recent stroke incident in the United King-

dom. 84% of these patients need intensive care and constant support with daily tasks

and around 33% of these patients suffer from difficulty with speech, motor and per-

ceptual skills. These numbers show that the heavy consequences of strokes remain a

major concern in modern health care. In the datasets used throughout this project, patients

were imaged 1-4 weeks after presenting to clinic with a mild to moderate ischaemic

stroke, although a few patients had previous small haemorrhagic episodes.

2.1.2 Magnetic Resonance Imaging

Magnetic Resonance Imaging (short: MRI) is an imaging method widely used in the

medical field. With the help of strong magnetic fields and radio waves, it allows the

inspection of pathology inside the body [25]. MRI is very useful as different scanning


parameters and contrast agents can reveal different elements and ease the identification

of these abnormalities [28].

Magnetic resonance imaging machines rely on the magnetic properties of the water

particles present in the body tissues. By measuring their reaction time after the emis-

sion of radio frequency (short: RF) energies, scanning machines can recreate an image

representing the different tissues of the region captured. However, different scanning

parameters will yield different types of images. The two main parameters to adjust for

controlling the weight and intensity of the final image are the repetition time (short:

TR) and the echo time (short: TE). TR is the time separating two consecutive RF

emissions, whereas TE is the time separating the emission of the RF signal and the

measurement of its echo [20].

Different parameter settings will generate different images. As the various tissues

present in the brain react differently to shorter and longer TR/TE, it is possible to

control the resulting image contrasts and facilitate their identification. Figure 2.2 shows

the three sequences that this thesis experiments with: T1-weighted, T2-weighted and

Fluid Attenuated Inversion Recovery (short: FLAIR).

Figure 2.2: T1-weighted, T2-weighted and FLAIR MRIs.

Source: https://casemed.case.edu.

T1-weighted sequences, or anatomy scans, use short TR/TE, which brightens fatty

tissues such as white matter and darkens fluids such as the cerebrospinal fluid (CSF).

They are very useful for identifying the boundaries between the tissues of the brain.

T2-weighted sequences use long TR/TE and are widely used for the evaluation of ligaments

and cartilage because of their contrasting properties. Furthermore, since the water

content of tissues usually grows in the presence of a pathology, such pathologies are easily

captured on T2-weighted sequences.


FLAIR sequences use very long TR/TE and are very similar to T2-weighted sequences.

However, the longer capture times completely suppress the intensity of

some parts of the brain, like the CSF [23]. Nowadays, FLAIR is widely used for its ability

to identify a range of intricate lesions that in the past were more difficult to visualise

[31]. As it is the most widely-used sequence for the identification of WMH, the main

part of the project focuses on the processing of FLAIR sequences. However, some

experiments involving T1- and T2-weighted sequences will be presented in section 3.4.

Table 2.1 summarises the appearance of the main tissues in the different MRI se-

quences.

TISSUE          T1-WEIGHTED    T2-WEIGHTED    FLAIR
CSF             DARK           BRIGHT         DARK
WHITE MATTER    LIGHT          DARK GREY      DARK GREY
CORTEX          GREY           LIGHT GREY     LIGHT GREY
WMH             DARK           BRIGHT         BRIGHT

Table 2.1: Appearance of different tissues in MRI sequences.

Extended from: https://casemed.case.edu.

2.1.3 MRI Analysis

Nowadays, the most widespread method for MRI analysis remains manual inspection. Other

than the fact that it is remarkably time-consuming, it is also prone to mistakes, depend-

ing on the experience of the analyst [13]. Furthermore, a fully supervised examination

made by experts requires an excessive amount of money to be carried out.

However, these past years have seen a rise in the number of publications involving

automatic (or semi-automatic) methods for brain MRI analysis [4]. Their aim is to

serve as guidance for the expert in charge and facilitate the administration of a proper

treatment. Most of the research focuses on the segmentation of brain MRI into WMH

and normal appearing white matter (short: NAWM). Nevertheless, little attention is

given to the characterisation of the results and the longitudinal study of the lesions.

We observe two main operations for the analysis of lesions in brain MRI: lesion

segmentation and lesion characterisation [20].

Segmentation is the task of separating the input image into different regions, each

classified with a different label, typically numeric. Each label will then be assigned to

a class (NAWM, WMH, ...), which, in the case of automatic methods, is usually determined

from some probability-based estimation [1]. Unlike classification, segmentation

is a task carried out at the pixel level and requires specific evaluation metrics (see

section 4.1). Manual segmentation is nowadays considered the standard approach

and the “gold standard” for algorithm comparison, as Patriarche et al. report in their

review [40]. Manual segmentation requires the experts to cautiously highlight the re-

gions of interest using an MRI segmentation-designed tool, slice by slice. Depending

on the scan format and shape, this task can take an extraordinary amount of time and

financial resources.
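As a rough illustration of the probability-based labelling mentioned above, the sketch below turns a per-pixel probability map into a numeric label map. The threshold of 0.5, the label values and the function name are assumptions made for this example only; they are not taken from any of the methods cited here.

```python
# Hypothetical numeric labels for the two classes discussed in the text.
NAWM, WMH = 0, 1


def label_slice(prob_map, threshold=0.5):
    """Assign a numeric label to every pixel of a 2D probability map.

    prob_map is a list of rows; each value is the estimated probability
    that the pixel belongs to a WMH. The threshold is an assumption.
    """
    return [[WMH if p >= threshold else NAWM for p in row] for row in prob_map]


# Toy 2 x 3 probability map (made-up values).
probabilities = [
    [0.05, 0.10, 0.80],
    [0.02, 0.60, 0.95],
]
labels = label_slice(probabilities)
```

Every pixel receives its own label, which is what distinguishes segmentation from whole-image classification.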

Next, lesion characterisation is the act of quantifying the segmented volumes

and modelling their evolution over time. Regarding MRI analysis, this could mean the

measurement of the WMH/NAWM (volume, ratio, intensity, ...), the analysis of their

growth/shrinkage (shape, scale, variance, ...) or more. Such analyses can reveal valuable

information for the treatment of patients, and their precision is therefore of utmost

importance.
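To make the idea of quantifying segmented volumes over time more concrete, here is a minimal sketch that measures WMH volume, the WMH/NAWM voxel ratio and the change between consecutive scans. It assumes a label map stored as nested lists indexed as [slice][row][column], with label 1 for WMH and label 0 for NAWM, and a voxel volume of 1 mm³; all names and values are hypothetical and the characterisation pipeline of chapter 5 is not reproduced here.

```python
def count_label(label_map, label):
    """Count the voxels carrying `label` in a 3D label map (nested lists)."""
    return sum(1 for slice_ in label_map for row in slice_ for v in row if v == label)


def wmh_volume_ml(label_map, voxel_volume_mm3, wmh_label=1):
    """WMH volume in millilitres (1 ml = 1000 mm^3)."""
    return count_label(label_map, wmh_label) * voxel_volume_mm3 / 1000.0


def wmh_to_nawm_ratio(label_map, wmh_label=1, nawm_label=0):
    """Ratio of WMH voxels to NAWM voxels; None when there is no NAWM voxel."""
    nawm = count_label(label_map, nawm_label)
    return count_label(label_map, wmh_label) / nawm if nawm else None


def volume_changes(volumes):
    """Volume change between consecutive time points (positive means growth)."""
    return [later - earlier for earlier, later in zip(volumes, volumes[1:])]


# Made-up label maps (one slice of 2 x 3 voxels) at three time points.
scans = [
    [[[0, 1, 1], [0, 0, 0]]],
    [[[1, 1, 1], [0, 1, 0]]],
    [[[0, 1, 1], [0, 1, 0]]],
]
volumes = [wmh_volume_ml(scan, voxel_volume_mm3=1.0) for scan in scans]
changes = volume_changes(volumes)
ratios = [wmh_to_nawm_ratio(scan) for scan in scans]
```

In this toy example the lesion grows between the first two scans and shrinks between the last two, which is the kind of pattern a longitudinal analysis has to capture.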

2.1.4 White Matter Hyper-intensities

As pointed out by Debette et al., there is a strong belief that the presence of WMH in

FLAIR and T2-weighted scans is associated with a variety of pathologies such as stroke

incidents and Alzheimer's disease [12]. Therefore, an effective analysis of the MRI can

allow a better follow-up of the patient and more educated treatment decisions [36].

A popular visual rating metric still widely used today is the Fazekas scale [17].

In their original paper, Fazekas et al. described white matter damage (referred to

as leukoaraiosis) by first separating it into deep white matter hyper-intensity (short:

DWMH) and periventricular white matter hyper-intensity (short: PVWMH). Then, a

score from 0 (no damage) to 3 (“large confluent areas” for DWMH and “large conflu-

ent disease” for PVWMH [8]) is given to the segmented region. It is a useful metric

for visual analysis of WMH but it suffers from limitations: it is very sensitive to inter-rater

variability [39] and is not precise enough to be used as an evaluation metric for

automatic tools.

We observe in the literature various quantification metrics for the assessment of

WMH in brain MRI. Most of them are intensity-based and can be divided into simple,

statistical and temporal quantification methods. However, these usually fail to deliver

accurate representations of the WMH as they are too sensitive to over-damaged brains

and fluctuations in the images over time [29].


On the other hand, the damage metric developed by Valdes et al., briefly reviewed

in section 2.2.3, resolves most of the limitations of the previously-mentioned quantifi-

cation metrics. As we believe it is more accurate when it comes to automatic WMH

quantification, it was used in this thesis to assess the WMH burden of our datasets.

Only a few years ago, it was believed that the mere presence of WMH in the brain

was a sign that it would progress and extend over time [44], leading to the development

of dementia, SVD and Alzheimer's disease. However, studies with more advanced methods

using automatic segmentation observed that WMH on longitudinal data may also

regress over time [10]. For instance, Wardlaw et al. observed a clear regression of the

WMH in minor stroke patients, shortly after the incident [53]. As mentioned earlier, the

poor segmentation of WMH burden in new studies can lead to ambiguous hypotheses.

Therefore, this thesis will also aim at discovering if Wardlaw et al.’s observation can

be validated on our datasets.

Overall, the knowledge about the progression of WMH and stroke lesions is weak.

However, accurate automatic segmentation tools applied to longitudinal studies might

give us the tools necessary to sharpen our knowledge about these pathological features.

2.2 Ground Technical Knowledge

2.2.1 LOTS-IAM

Recently, the inspiring work of Rachmadi et al. produced an automatic unsupervised

detector of tissue irregularities in brain MRI [34], called Limited One-Time Sampling

Irregularity Map (short: LOTS-IAM). It runs a patch comparison method borrowed

from a weathered-texture analysis technique [5] to estimate abnormalities in an input

image. It was used and evaluated for the task of WMH segmentation in brain MRI. To

do so, it implements a full segmentation pipeline, as shown in figure 2.3.

The inputs of the LOTS-IAM are NIfTI-1 files (developed by the Neuroimaging

Informatics Technology Initiative). This format provides a common ground for MRI

analysis worldwide as it stores essential information such as the transformation from pixel

coordinates to spatial locations and can be read by extensively used programs such as

ANALYZE 7.5 and MATLAB [30]. The first input to this pipeline is the set of FLAIR file paths,

as the FLAIR images are used for age map generation. Additionally, the intracranial

volume and cerebrospinal fluid masks are needed to pre-process the data. These

are subtracted from the original NIfTI-formatted FLAIR scan as they would otherwise introduce

irrelevant information to the main module of the program.

Figure 2.3: Overview of the LOTS-IAM architecture. The pipeline stages are: NIfTI files input → pre-processing → source patch extraction → target patch extraction → distance function calculation → post-processing → final age map.
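The mask-based pre-processing described above can be sketched as follows. This is only an illustration under stated assumptions: the masks are taken to be binary, "subtracting" them is interpreted as keeping the intensities that lie inside the intracranial volume and outside the cerebrospinal fluid, and a single 2D slice stored as nested lists stands in for the NIfTI volume. The function name and the background value are hypothetical.

```python
def apply_masks(flair, icv_mask, csf_mask, background=0.0):
    """Blank out every pixel outside the intracranial volume or inside the CSF.

    flair, icv_mask and csf_mask are 2D lists of identical shape; the masks
    hold truthy values where the corresponding structure is present.
    """
    return [
        [
            value if in_icv and not in_csf else background
            for value, in_icv, in_csf in zip(flair_row, icv_row, csf_row)
        ]
        for flair_row, icv_row, csf_row in zip(flair, icv_mask, csf_mask)
    ]


# Toy 2 x 3 slice with made-up intensities and masks.
flair = [[5.0, 7.0, 9.0], [4.0, 6.0, 8.0]]
icv_mask = [[0, 1, 1], [0, 1, 1]]
csf_mask = [[0, 0, 1], [0, 0, 0]]
cleaned = apply_masks(flair, icv_mask, csf_mask)
```

Only the pixels that survive both masks are passed on to patch extraction, so the later comparisons are not dominated by skull or fluid intensities.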

From the pre-processed data, the LOTS-IAM extracts brain tissue irregularities by

performing comparisons between source and target patches. Source patches are ex-

tracted by partitioning a slice into non-overlapping patches of prefixed sizes, whereas

target patches are randomly sampled from the entire scan slice under the condition that

they are located inside a non-masked region. The program requires the source patch

size to be the same as the target patch size. To get a more accurate representation of

the abnormalities, a substantial number of target patches is needed. Rachmadi et al.

recommend a target patch number of 64, which can take a tremendous amount of time

to process. To solve this issue, the original paper describes the use of GPU computing

to speed up the heavy computation induced by the high number of source and target

patches. The LOTS-IAM relies on the assumption that abnormalities are outliers and

therefore appear different from the average brain tissue. The age value is given by the

comparison between source and target patches and is calculated with the use of the

following distance function:

\[ d = \alpha \cdot |\max(s - t)| + (1 - \alpha) \cdot |\mathrm{mean}(s - t)| \qquad (2.1) \]

Where α = 0.5, and s & t are the source and target patches compared, respectively.
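
As an illustration of equation 2.1, the distance between one source patch and one target patch can be sketched as follows. This is a minimal sketch and not the code of the original program: the function name `patch_distance` and the toy intensities are assumptions made for the example.

```python
import numpy as np

ALPHA = 0.5  # weighting reported in the text


def patch_distance(source, target, alpha=ALPHA):
    """Distance of equation 2.1 between one source and one target patch."""
    diff = np.asarray(source, dtype=float) - np.asarray(target, dtype=float)
    # Blend the absolute maximum difference with the absolute mean difference.
    return alpha * abs(diff.max()) + (1.0 - alpha) * abs(diff.mean())


# Toy 2x2 patches (hypothetical intensities).
source = np.array([[4.0, 2.0], [2.0, 2.0]])
target = np.ones((2, 2))
d = patch_distance(source, target)  # 0.5 * |3| + 0.5 * |1.5|
```

Identical patches give a distance of 0, which is consistent with the assumption that abnormal tissue stands out from the average brain tissue.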

The program allows multiple patch sizes to be used individually and in conjunction.

Typically, patch sizes are 1, 2, 4 and 8. Using a hierarchical set of patch sizes allows

for different levels of context to be taken into account. However, when multiple patch

sizes are used, multiple age maps are generated. To combine these into a visually

sensible output, the resulting age maps go through a post-processing pipeline.

The post-processing performed at the end of the age map calculation can be summed

up in three steps: blending age maps sampled from different patch sizes, penalty appli-

cation and global normalisation. First, the age maps generated by the distance calcula-

tions are up-sampled and Gaussian smoothed. Then, a combined age map is produced

by summing the probability maps according to some pre-defined weighting (specific

weight values are suggested by the authors). Next, the age map is penalised by

multiplying its values by the corresponding pixel values in the original FLAIR MRI

scan. This step emphasises WMH against abnormally darker regions. Finally, the age

map is normalised between 0 and 1 to be interpreted as the probability for a pixel to

belong to WMH volume.
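
The three post-processing steps described above can be sketched on a single 2D slice as follows. This is a simplified sketch under stated assumptions: the Gaussian smoothing is omitted, up-sampling is done by nearest-neighbour repetition, and the function names, weights and toy arrays are hypothetical rather than taken from the original program.

```python
import numpy as np


def upsample(age_map, factor):
    """Nearest-neighbour up-sampling of a 2D map by an integer factor."""
    return np.repeat(np.repeat(age_map, factor, axis=0), factor, axis=1)


def postprocess(age_maps, patch_sizes, weights, flair):
    """Blend per-patch-size age maps, penalise by FLAIR intensity, normalise."""
    blended = np.zeros_like(flair, dtype=float)
    for age_map, size, weight in zip(age_maps, patch_sizes, weights):
        # Step 1: bring every map back to full resolution and sum with weights.
        blended += weight * upsample(age_map, size)
    # Step 2: penalty, multiply by the original FLAIR intensities.
    penalised = blended * flair
    # Step 3: global normalisation to the range [0, 1].
    span = penalised.max() - penalised.min()
    if span == 0:
        return np.zeros_like(penalised)
    return (penalised - penalised.min()) / span


# Toy 4x4 slice with age maps computed for patch sizes 1 and 2 (hypothetical).
flair = np.arange(16.0).reshape(4, 4)
age_maps = [np.ones((4, 4)), np.array([[0.0, 1.0], [2.0, 3.0]])]
final_map = postprocess(age_maps, [1, 2], [0.5, 0.5], flair)
```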

The original paper compares the results of the LOTS-IAM under different thresholds

with other popular WMH segmentation tools, both supervised and unsupervised, and

reports that it outperforms them on most metrics. In particular, the LOTS-IAM performed

better than the current state-of-the-art unsupervised WMH segmentation framework,

LST-LGA. The performance metrics used were the dice coefficient, positive predictive

value (PPV), specificity (SPC) and true positive rate (TPR).

However, experiments at the Centre of Clinical Brain Sciences of the University of

Edinburgh demonstrated that the LOTS-IAM could not be used as a WMH segmenta-

tion tool in clinical research or practice per se as it suffers from important weaknesses.

First of all, a notable number of false positives and false negatives are present in the fi-

nal age maps, which invalidates its use for automatic quantification of lesions. Second

of all, it tends to underestimate lesions when processing over-damaged brains. Finally,

it is only tuned for FLAIR sequences and was not optimised to work for T1- and T2-

weighted sequences. Additionally, its execution time makes it impractical to use for

very large datasets (e.g. UK Biobank [49]).

The LOTS-IAM runs on the terminal (or any virtual environment) and consists of

five main files:

1. iam_lots_gpu.py: contains the main() function that calls the main module.

2. iam_params.py: contains the 10 hyper-parameters that the user can tune.

3. IAM_GPU_lib.py: contains the main module (defined in the iam_lots_gpu_compute()

function), called by the main() function.

4. IAM_lib.py: contains the additional function definitions used in the main module

(including CUDA functions).

5. input.csv: contains the file paths to be processed by the main module.

From the terminal, a simple call to the main script (iam_lots_gpu.py) is needed

to run the main module and process all the files specified in input.csv.

It was decided, in agreement with researchers at the Centre of Clinical Brain Sciences (CCBS) and the authors of the LOTS-IAM, that

the LOTS-IAM would be used and improved in the scope of this project. Therefore,

the segmentation module is based on the LOTS-IAM’s architecture and the character-

isation of the lesions is based on its output.

2.2.2 Gaussian Mixture Models

Gaussian Mixture Models (short: GMMs) are a soft clustering method [14].

Unlike K-means, which assigns each data point to exactly one cluster, GMMs compute the

probability that a data point belongs to each cluster. Therefore, each cluster corresponds to

a probability distribution based on a Gaussian [6]. To perform this, they rely on ex-

pectation maximisation (short: EM) for the estimation of the parameters. Since it is a

Gaussian mixture model, the underlying density function is a Gaussian, defined as

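\[ \mathcal{N}(X \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu)^{T} \Sigma^{-1} (X - \mu)\right) \qquad (2.2) \]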

Where D is the number of input dimensions, X is the set of input data points and Σ and

µ are the covariance matrix and the mean of the distribution, respectively. These two

last variables represent the parameters of the distribution that the model will attempt

to estimate, in order to fit the input data [3].

The algorithm proceeds in four main steps:

1. Initialisation step: Randomly initialise the parameters.

2. Expectation step: Assign a cluster probability to each data point based on current

means and variances.

3. Maximisation step: Update means and variances based on the new probabilities

for each data point (maximum likelihood calculation).

4. Repeat steps 2 & 3 until convergence.
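
The four steps above can be sketched for one-dimensional data as follows. This is a didactic sketch and not the implementation used in this thesis. The function names are hypothetical, a fixed number of iterations stands in for a convergence test, and the best of several random restarts is kept according to the log-likelihood, which is a simplification swapped in for the example.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Univariate Gaussian density (equation 2.2 with D = 1)."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def fit_gmm_1d(x, n_clusters=2, n_iter=50, seed=0):
    """Fit a 1D Gaussian mixture with expectation maximisation."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: random initialisation (means drawn from the data points).
    means = rng.choice(x, size=n_clusters, replace=False)
    variances = np.full(n_clusters, x.var())
    weights = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iter):  # Step 4: repeat (fixed count, an assumption).
        # Step 2: expectation, cluster probabilities for each data point.
        joint = weights * gaussian_pdf(x[:, None], means, variances)
        resp = joint / joint.sum(axis=1, keepdims=True)
        # Step 3: maximisation, update parameters from the probabilities.
        counts = resp.sum(axis=0)
        means = (resp * x[:, None]).sum(axis=0) / counts
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / counts + 1e-6
        weights = counts / x.size
    joint = weights * gaussian_pdf(x[:, None], means, variances)
    log_likelihood = np.log(joint.sum(axis=1)).sum()
    return means, variances, weights, log_likelihood


# Two well-separated toy clusters centred on 0 and 5 (hypothetical data).
x = np.concatenate([np.linspace(-1.0, 1.0, 200), np.linspace(4.0, 6.0, 200)])
runs = [fit_gmm_1d(x, seed=s) for s in range(20)]
best_means, best_vars, best_weights, best_ll = max(runs, key=lambda r: r[3])
```

Running several restarts reflects the fact that a single run depends on its random initialisation.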

Generally, the algorithm is run multiple times as the outcome is not deterministic,

due to the random initialisation step [33]. Therefore, the most frequent cluster assigned

to each datapoint is taken as final cluster assignment. This results in the partitioning

of the dataset into the number of clusters specified. Gaussian mixture models are used

for post-processing the outputs of the improved version of the LOTS-IAM developed

in this thesis, when applied to different image sequences.

2.2.3 White Matter Damage Metric

In their recent publication, Valdes et al. call for a harmonisation of the quantification

results of WMH segmentation [24]. To do so, they propose a new metric to assess

the white matter damage in the brain based on both its volume and intensity. The

advantages of this damage metric are that it can be used for any MRI sequence, it is

fast to compute and holds significant information for the researcher. It is calculated as

follows:

\[ WM_{Damage} = \frac{I_{WMH} - I_{NAWM}}{I_{NAWM}} \cdot \frac{WMH_{volume}}{WMH_{volume} + NAWM_{volume}} \qquad (2.3) \]

Where IWMH and INAWM are the average intensities of WMH and of NAWM respec-

tively and where WMHvolume and NAWMvolume are the volume of WMH identified and

of the NAWM, respectively. The resulting number falls between 0 and 1. For minimal

damage, the metric approximates 0 whereas for predominant damage, it approximates

1. If no WMH is detected in the brain, the WMDamage is set to 0.
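
Equation 2.3 can be computed from a scan and two binary masks, as in the sketch below. The function name `wm_damage`, the `voxel_volume` parameter and the toy values are assumptions made for the example, not part of the published metric.

```python
import numpy as np


def wm_damage(scan, wmh_mask, nawm_mask, voxel_volume=1.0):
    """White matter damage metric of equation 2.3."""
    scan = np.asarray(scan, dtype=float)
    wmh_mask = np.asarray(wmh_mask, dtype=bool)
    nawm_mask = np.asarray(nawm_mask, dtype=bool)
    if not wmh_mask.any():
        return 0.0  # no WMH detected: the metric is set to 0
    i_wmh = scan[wmh_mask].mean()    # average WMH intensity
    i_nawm = scan[nawm_mask].mean()  # average NAWM intensity
    wmh_volume = wmh_mask.sum() * voxel_volume
    nawm_volume = nawm_mask.sum() * voxel_volume
    contrast = (i_wmh - i_nawm) / i_nawm
    return contrast * wmh_volume / (wmh_volume + nawm_volume)


# Four toy voxels: three NAWM voxels at 100 and one WMH voxel at 150.
scan = np.array([100.0, 100.0, 100.0, 150.0])
wmh = np.array([False, False, False, True])
nawm = ~wmh
damage = wm_damage(scan, wmh, nawm)  # (50 / 100) * (1 / 4)
```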

The damage metric introduced by Valdes et al. will be used to quantify the lesions

and their evolution over time. This has been decided with the objective of making it

one of the de facto measures for WMH analysis in brain MRI.

Chapter 3

Methodology

3.1 Ethics

This thesis received approval from the Informatics Ethics Panel (ref. num: 64736)

and was conducted in accordance with the GDPR and the ethical standards set by the

Informatics department of the University of Edinburgh.

3.2 Dataset

Two datasets were used throughout this project. Both belong to the Mild Stroke Study

(short: MSS) initiated at the University of Edinburgh. More specifically, subsets of the

MSS2 and MSS3 datasets were created and processed by the algorithms. The segmen-

tation algorithm uses the MSS2 subset only, since ground truth was only generated for

MSS2. On the other hand, the characterisation of the lesions was performed using both

MSS2 and MSS3 scans.

• The subset taken from the MSS2 contains 43 patients, scanned at three time steps

separated by roughly 3 months (visit 1 and 2) and a year (visit 1 and 3). Each

scan is represented as a 3D array of size 256x256x42.

• The subset taken from the MSS3 contains 17 patients, scanned at three time steps

separated by roughly 3 months (visit 1 and 2) and 6 months (visit 1 and 3). Each

scan is represented as a 3D array of size 256x256x176.

Each patient is represented by 3 different MRI scans (T1-, T2-weighted and FLAIR)

at 3 different time points, all originally registered in NIfTI format. However, for

easier manipulation and portability of the data, it was decided to extract the 3D arrays

from the NIfTI files after masking them, to avoid extra pre-processing unrelated to the

segmentation/characterisation goals of this project. Therefore, the data used for the

rest of the project are 3D MAT files of the region of interest (i.e. the whole white

matter region), resulting from the subtraction of the intracranial volume, cortex and

cerebrospinal fluid masks.

Ground truth was produced for the MSS2 set in order to allow the evaluation of dif-

ferent segmentation tools. However, it is not perfectly accurate. Due to the extensive

time needed to label a dataset, the medical authority in charge decided not to generate

ground truth for scans that are separated by only 1-2 months. Therefore, the ground

truth for the second visit of MSS2 is the same as the first visit ground truth. Further-

more, the ground truth was originally generated for clinical studies and includes the

tissue loss due to stroke in the third (i.e. last) time point, which should not be picked

up by the algorithm since it is not the focus of this thesis. This lowers the measured

accuracy of the algorithm at these time points (i.e. second and third). Nevertheless,

it is still possible to compare the performance of different algorithms on this dataset,

as long as the same ground truth is used for all the evaluations. No ground truth was

generated for MSS3.

3.3 Modification of the LOTS-IAM

As mentioned earlier, the LOTS-IAM suffers from important shortcomings. Therefore,

the base code was taken and modified to improve its final output and make it suitable

for clinical studies. The main experiments carried out with the LOTS-IAM can be

summed up as follows:

1. Development of a 3-dimensional version of the LOTS-IAM, conveniently called

LOTS-IAM-3D, to account for contextual information and improve accuracy.

2. Modification and refactoring of the main module of the program to decrease its

execution time by a factor of 4.

3. Introduction of a target patch selection function based on priors to improve ac-

curacy and allow inter-subject comparisons.

4. Experiments with T1 and T2 sequences, separately and in conjunction (multi-

spectral clustering).

3.3.1 LOTS-IAM-3D

The first main contribution of this project is the development of a 3-dimensional ver-

sion of the original LOTS-IAM. The original algorithm processes the MRI slice by

slice whereas the LOTS-IAM-3D analyses the whole MRI volume, which allows more

contextual information to be taken into account. This means that the algorithm per-

forms a cube-based comparison of the voxel intensities in FLAIR scans, using cubic

source and target patches (see figure 3.1(b)).

(a) LOTS-IAM source patch (4x4). (b) LOTS-IAM-3D source patch (4x4x4).

Figure 3.1: Examples of source patches in both versions of the algorithm (red).

The addition of a dimension for source patch extraction, as seen on figure 3.1, is

also performed during the target patch extraction. However, the operations applied

on source and target patches for age value generation are the same as in the original

LOTS-IAM.
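
The cube-based extraction described above can be sketched with NumPy as follows. This is an illustrative sketch rather than the LOTS-IAM-3D code: the function names are hypothetical, and a target cube is accepted when its corner voxel lies inside the brain mask, which is a simplification of the non-masked-region condition.

```python
import numpy as np


def source_patches_3d(volume, size):
    """Partition a volume into non-overlapping cubes of edge `size`."""
    nx, ny, nz = (dim // size for dim in volume.shape)
    trimmed = volume[: nx * size, : ny * size, : nz * size]
    blocks = trimmed.reshape(nx, size, ny, size, nz, size)
    # Group the three block indices first, then the three in-cube offsets.
    return blocks.transpose(0, 2, 4, 1, 3, 5).reshape(-1, size, size, size)


def sample_target_patches(volume, mask, size, n_targets, rng):
    """Randomly sample cubes whose corner voxel lies inside the mask."""
    limit = np.array(volume.shape) - size + 1
    candidates = np.argwhere(mask[: limit[0], : limit[1], : limit[2]])
    picks = candidates[rng.integers(len(candidates), size=n_targets)]
    return np.stack(
        [volume[i : i + size, j : j + size, k : k + size] for i, j, k in picks]
    )


# Toy 8x8x8 volume split into 4x4x4 cubes (hypothetical data).
volume = np.arange(512).reshape(8, 8, 8)
mask = np.ones(volume.shape, dtype=bool)
sources = source_patches_3d(volume, 4)
targets = sample_target_patches(volume, mask, 4, 5, np.random.default_rng(0))
```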

3.3.2 Data Structure

The execution times of the algorithm and of its distinct parts were all measured on

an MSI GE60 2PE Apache Pro (GTX 860M) with an Intel Core i7, using the MSS3

dataset. This was decided in order to have a common ground for all

comparisons.

Processing an additional dimension (i.e. depth) drastically slowed the algorithm

down. Where the original LOTS-IAM takes around 10 minutes, the 3D version with

the same architecture takes around 40 minutes to process one MRI scan of dimension

256x256x176 with patch sizes [1,2,4,8].

This drastic increase in execution time is due to a non-optimal management of re-

sources and data structure. As an excessive execution time hurts the practicality of

the method, we performed an execution time analysis of the code and restructured the

architecture from the original paper. Once done, we decreased the LOTS-IAM-3D pro-

cessing time from 40 minutes to under 40 seconds (see section 4.4). The gain in speed

is due to the creation of specific functions to avoid repetition of heavy calculation and

the use of more efficient data structures and operations.

3.3.3 Target Patch Selection From Prior

One of the main shortcomings of the original LOTS-IAM is its ineffectiveness in recog-

nising the irregularities when applied to highly-damaged brains. Indeed, the algorithm

is based on the average brain intensity. Therefore, it tends to underestimate damaged

regions when the average intensity of the whole brain is high (due to a high number of

lesions). This results in an irregularity map with a large number of false positives and

false negatives.

To solve this problem, we introduce a patch selection function based on a prior. The

patch selector is controlled by an input parameter (e.g. None, or a value in the range [0, 1]). While

the original LOTS-IAM samples target patches randomly, the LOTS-IAM-3D’s patch

selector filters out, if enabled, target patches that do not comply with some require-

ments, as explained below.

When the patch selection function is enabled, an additional step is added to the

target extraction process. First, a mask from the original 3D brain array is created by

applying the following pipeline:

[Figure 3.2 diagram: 3D brain input -> Calculate 3D mean and std -> Estimate intense WMH (iWMH) -> Estimate subtle WMH (WMH) -> Apply 3D Gaussian -> Apply 3D erosion -> Threshold result -> Masked 3D brain]

Figure 3.2: Mask generation pipeline for the target patch selection function.

Where each brain voxel v is defined as (i)WMHest if:

\[ v \geq \mathrm{mean}_{intensity} + CI \cdot \mathrm{std}_{intensity} \qquad (3.1) \]

And where CI = 1.282 (80% confidence) for WMHest and CI = 1.960 (95% con-

fidence) for iWMHest . Then, the target patch extraction is executed. However, each

randomly sampled target patch is this time tested by checking the ratio of masked pix-

els in its corresponding patch from the masked brain volume, generated earlier. If the

ratio of WMHest exceeds the prefixed threshold, the patch is rejected.
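
The thresholding of equation 3.1 and the rejection test can be sketched as follows. This is a partial sketch under stated assumptions: the Gaussian and erosion steps of the mask pipeline are left out, the mean and standard deviation are taken over non-zero voxels only, and the function names and toy volume are hypothetical.

```python
import numpy as np


def estimate_wmh_mask(brain, ci=1.282):
    """Flag voxels satisfying equation 3.1 (CI = 1.282 for WMHest)."""
    voxels = brain[brain > 0]  # assumption: background voxels are zero
    threshold = voxels.mean() + ci * voxels.std()
    return brain >= threshold


def accept_patch(mask_patch, max_ratio=0.05):
    """Reject a target patch when its ratio of flagged voxels is too high."""
    return mask_patch.mean() <= max_ratio


# Toy 4x4x4 brain with two bright voxels (hypothetical intensities).
brain = np.full((4, 4, 4), 100.0)
brain[0, 0, 0:2] = 300.0
mask = estimate_wmh_mask(brain)
bright_ok = accept_patch(mask[:2, :2, :2])  # 2 of 8 voxels flagged
normal_ok = accept_patch(mask[2:, 2:, 2:])  # no voxel flagged
```

With a threshold of 0.05, the patch containing the bright voxels is rejected while the patch of normal tissue is kept.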

Figure 3.3 illustrates the patch selection function. Note that it shows

the different steps on a 2D slice for visualisation purposes only.

Figure 3.3: Overview of the patch selection architecture.

3.4 Multi-Spectral Clustering

Most research related to WMH in brain MRI focuses on one modality only. Generally,

FLAIR or T2-weighted scans are used as they allow the distinction between normal and

abnormal brain tissues. However, T1-weighted scans can be useful for the distinction

between the stroke area and other age-related WMH.

Other than unsuccessful and experimental work on multi-spectral clustering using

T1 and T2 [27], very little work was done to incorporate the information coming from

multiple modalities in a single segmentation task. In this thesis, we believed that the

area of multi-spectral clustering incorporating the irregularity maps from the LOTS-

IAM-3D was worth exploring. Therefore, we ran the LOTS-IAM-3D on the three

sequences of scans available for all the patients in the MSS2 dataset: T1-, T2-weighted

and FLAIR.

Another important note is the experimental optimisation of the LOTS-IAM-3D for

T1 modality. The original penalisation step aims at suppressing the low intensities of

the original scan to remove any irrelevant darker abnormalities (generally present in T2

and FLAIR). To make it relevant to T1-weighted scans, the penalisation process (only

when applied to T1 scans) was modified to reverse its effect and improve the results

overall. The probability maps used were produced by the LOTS-IAM-3D with patch

selector (threshold = 0.05), as it produced the best results on its own (see section 4.2.2).

We implemented two types of clustering methods, namely Gaussian Mixture Model

and Self-Organising Maps (short: SOMs), to process the LOTS-IAM-3D output. However,

SOMs are a type of artificial neural network that requires extensive hyper-parameter

tuning, and the limited time available did not allow significant results to be achieved.

Therefore, it was decided not to include SOM results here. Nonetheless, the code of

the SOM experiments is available on request and could potentially be used as a baseline

for future research.

3.4.1 Gaussian Mixture Model

To perform Gaussian clustering, the data needs to be pre-processed. Therefore, the

LOTS-IAM-3D with patch selector (threshold = 0.05) was run on the three modalities

available: T1-, T2-weighted and FLAIR. Then, the three irregularity maps resulting

from the run were flattened and stacked into a two-dimensional array of shape (N, 3), with

one column per modality, where N is the product of the original x, y and z dimensions of

the brain MRI. Then, a Gaussian mixture model with 4 clusters is fitted to this array. The

rationale behind this is to separate the main visible parts of the MRI scan ROI given

as input: background (black), subcortical grey matter, NAWM and WMH. To label

the clusters, a function calculates the average intensity of all the clusters’ voxels on the

original scan. From this, we can assume that the highest intensities are WMH volumes.

Section 4.3 presents the results to the reader, introducing very experimental work to

the LOTS-IAM literature.
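
The data preparation and the cluster labelling described above can be sketched as follows. This sketch covers only the stacking into shape (N, 3) and the choice of the brightest cluster. The mixture model fit itself is not shown, so the cluster labels are hard-coded stand-ins for its output, and the function names and toy arrays are hypothetical.

```python
import numpy as np


def stack_irregularity_maps(t1_map, t2_map, flair_map):
    """Flatten three 3D maps into one (N, 3) feature array."""
    return np.stack([t1_map.ravel(), t2_map.ravel(), flair_map.ravel()], axis=1)


def brightest_cluster(labels, scan):
    """Return the cluster label with the highest mean scan intensity."""
    intensities = np.asarray(scan, dtype=float).ravel()
    cluster_ids = np.unique(labels)
    means = [intensities[labels == c].mean() for c in cluster_ids]
    return int(cluster_ids[int(np.argmax(means))])


# Toy 2x2x2 maps, so N = 8 voxels (hypothetical values).
t1_map = np.zeros((2, 2, 2))
t2_map = np.ones((2, 2, 2))
flair_map = np.arange(8.0).reshape(2, 2, 2)
features = stack_irregularity_maps(t1_map, t2_map, flair_map)

# Stand-in labels for 4 clusters, as a fitted mixture model would return.
labels = np.array([0, 0, 1, 1, 2, 2, 3, 3])
scan = np.array([0.0, 0.0, 40.0, 50.0, 200.0, 220.0, 90.0, 100.0])
wmh_label = brightest_cluster(labels, scan)
```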

3.5 Tools and Libraries

As the LOTS-IAM was originally written in Python 3.5, it was decided to extend and

improve it using the same programming language and the same version. The NumPy

library [37] was extensively used for its implementation. The Gaussian mixture models

were implemented using the scikit-learn library [41] in Python. The SOM models (not

presented here) were implemented using the MiniSom library [51] in Python. Finally,

the evaluation and characterisation of the WMH load were implemented using both the

MATLAB environment and Jupyter notebooks (available on request) in Python.

Chapter 4

Segmentation Evaluation

4.1 Metrics Used

The evaluation was carefully carried out to validate not only the segmentation accuracy

of the different algorithms produced during this project, but also the fidelity in repre-

senting the evolution of the white matter damage in terms of lesion progression, stabil-

ity and/or shrinkage. The evaluation of the segmentation algorithms was treated as a

binary classification problem. Therefore, both the output of the LOTS-IAM-3D and the

ground truth are binary masks of equal size. As the raw output of the LOTS-IAM-3D is

a probability map, different thresholds were tested to find the optimal value at which a

voxel can be considered white matter damage with enough confidence. Along with the

damage metric described in section 2.2.3, the metrics from the following subsections

have been used for the evaluation of the segmentation module.

4.1.1 Measures of Relevance

Measures of relevance give some statistics regarding the true/false positives/negatives.

They all yield values between 0 and 1.

Sensitivity, also called recall, measures the ratio of positive instances that were

correctly labelled by the algorithm (equation 4.1a). Specificity, on the other hand,

measures the ratio of negative instances that were correctly classified by the algorithm

(equation 4.1b). In the case of brain MRI segmentation, specificity tends to be very

high, as WMH (positive instances) usually represents only about 5% of the brain area.

Precision is defined as the ratio of true positives over all predicted positives (equation

4.2a). It is usually used for the calculation of the F1 score, along with sensitivity.

The F1 score is the harmonic mean of precision and sensitivity (recall), as seen in

equation 4.2b. Its value is a good indicator of the relevance of a model as it balances

out precision and recall measures.

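\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN} \quad (4.1a) \qquad \mathrm{Specificity} = \frac{TN}{TN + FP} \quad (4.1b) \]

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \quad (4.2a) \qquad F_1 = 2 \cdot \frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (4.2b) \]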
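
Equations 4.1 and 4.2 translate directly into code once the four confusion counts are known. The sketch below assumes that every denominator is non-zero, and the function name and counts are hypothetical.

```python
def relevance_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)  # equation 4.1a (recall)
    specificity = tn / (tn + fp)  # equation 4.1b
    precision = tp / (tp + fp)    # equation 4.2a
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # 4.2b
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy counts for 1000 voxels, 40 of which are true WMH (hypothetical).
scores = relevance_metrics(tp=30, fp=10, tn=950, fn=10)
```

In this toy case the negatives vastly outnumber the positives, so specificity is close to 1 even though a quarter of the WMH voxels are missed.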

4.1.2 Jaccard Index & Dice coefficient

The Jaccard Index is used to evaluate the spatial coincidence and dissimilarities be-

tween two sets. It is defined by the intersection of the two sets divided by their union.

The resulting index ranges from 0 to 1. The dice coefficient is another similarity metric

commonly used in evaluating image processing algorithms. It can be seen as the over-

lapping percentage between the ground truth and the prediction. It can be calculated

with the Jaccard index and ranges from 0 to 1.

\[ \mathrm{Jaccard} = \frac{|A \cap B|}{|A \cup B|} \quad (4.3a) \qquad \mathrm{Dice} = \frac{2 \cdot \mathrm{Jaccard}}{\mathrm{Jaccard} + 1} \quad (4.3b) \]
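
The relationship between the two indices can be checked on a pair of binary masks, as in the sketch below. The function names and the toy masks are assumptions made for the example, and two empty masks are treated as a perfect match.

```python
import numpy as np


def jaccard(a, b):
    """Jaccard index of two binary masks (equation 4.3a)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # assumption: two empty masks agree perfectly
    return np.logical_and(a, b).sum() / union


def dice_from_jaccard(j):
    """Dice coefficient computed from the Jaccard index (equation 4.3b)."""
    return 2 * j / (j + 1)


# Toy prediction and ground truth over five voxels (hypothetical).
prediction = np.array([1, 1, 1, 0, 0])
truth = np.array([0, 1, 1, 1, 0])
j = jaccard(prediction, truth)  # 2 shared voxels out of 4 in the union
dice = dice_from_jaccard(j)
# Direct definition of the dice coefficient, for comparison.
direct_dice = 2 * 2 / (prediction.sum() + truth.sum())
```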

4.1.3 Bland-Altman Measure

In 1983, Bland and Altman proposed a method for assessing the agreement between

measures from different clinical methods [7]. In other words, it evaluates a variable,

measured by two different procedures. This type of analysis is mainly applied to the

medical field of research, where comparison between different quantitative assess-

ments is frequently carried out. For any general comparison task, simple measures

such as correlation and linear regression are implemented. Commonly-used correla-

tion metrics include the Pearson correlation coefficient, r, which is calculated from

the division of the covariance by the product of the standard deviations. However, to

assess the degree of agreement between two distinct methods, not only the correla-

tion but also the difference between them should be measured [21]. This is where the

Bland-Altman analysis proves its usefulness.

The Bland-Altman analysis makes use of statistical notions such as limits of agreement

to investigate the correspondence between two measures. These limits of agreement

are based on the mean and standard deviation of the differences between the two measures.

They define a margin within which a certain percentage of these differences is expected to

fall. It is common to fix them to 90% or 95% (1.645 and 1.960 times the

standard deviation, respectively). This yields degrees of confidence as to where the

bulk of the data lies and is easily readable on a single plot, as seen in section 4.2.2.
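
The limits of agreement can be computed in a few lines, as sketched below. The function name and the toy measurements are hypothetical, and the sample standard deviation (ddof=1) is an assumption, since the text does not state which estimator is used.

```python
import numpy as np


def bland_altman_limits(measure_a, measure_b, z=1.96):
    """Mean difference and limits of agreement between two measures."""
    diff = np.asarray(measure_a, dtype=float) - np.asarray(measure_b, dtype=float)
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation (assumption)
    return bias, bias - z * sd, bias + z * sd


# Toy volumes measured by two methods on four subjects (hypothetical).
method_a = [10.0, 12.0, 14.0, 16.0]
method_b = [9.0, 13.0, 13.0, 17.0]
bias, lower, upper = bland_altman_limits(method_a, method_b)
```

Passing `z=1.645` instead gives the 90% limits mentioned above.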

4.2 WMH Segmentation

4.2.1 Total WMH Load

A full evaluation was performed to assess the LOTS-IAM-3D’s ability to segment the

total WMH load in the original scans. As the LOTS-IAM-3D’s output heavily relies

on a probability threshold parameter, an initial evaluation of the optimal value was

performed. We based this optimal threshold on the model’s average dice coefficient

when run on MSS2. Figure 4.1 displays the results of this experiment. The best dice

coefficient was 0.619 at a threshold of 0.15. Therefore, the optimal threshold for the

remaining experiments is set to 0.15.
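
The threshold search can be sketched as a simple sweep over candidate values. For brevity this sketch scores a single toy probability map, whereas the experiment above averages the dice coefficient over the whole MSS2 subset. The function names and values are hypothetical.

```python
import numpy as np


def dice(prediction, truth):
    """Dice coefficient between two binary masks."""
    intersection = np.logical_and(prediction, truth).sum()
    total = prediction.sum() + truth.sum()
    return 2.0 * intersection / total if total else 1.0


def best_threshold(prob_map, truth, thresholds):
    """Return the threshold with the highest dice, and that dice."""
    scores = [dice(prob_map >= t, truth) for t in thresholds]
    best = int(np.argmax(scores))
    return thresholds[best], scores[best]


# Toy probability map and ground truth over five voxels (hypothetical).
prob_map = np.array([0.05, 0.10, 0.20, 0.60, 0.90])
truth = np.array([False, False, True, True, True])
threshold, score = best_threshold(prob_map, truth, [0.10, 0.15, 0.50])
```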

[Figure 4.1 plot: Algorithm comparison; x-axis: Threshold (0.0 to 1.0); y-axis: Dice average (0.0 to 0.6); curves: LST-LGA, LOTS-IAM, LOTS-IAM-3D]

Figure 4.1: Average dice curves of LST-LGA, original LOTS-IAM and LOTS-IAM-3D.

For a better comparison, we decided to include an evaluation of the LST-LGA in

the graph. The LST-LGA [43] is a WMH segmentation algorithm currently considered

the unsupervised state-of-the-art method and commonly used as standard comparison

for segmentation algorithms. It is implemented in MATLAB and makes use of the

Statistical Parametric Mapping package (spm12) to segment the WMH in T2-weighted

scans. As figure 4.1 shows, the LOTS-IAM-3D outperforms both its predecessor

and the LST-LGA on the MSS2 samples.

4.2.2 Target Patch Selector Performance

To evaluate the efficiency of the patch selector, we divided our dataset into three different

groups, according to their WMH load (given by the ground truth). This experiment

aims at visualising where the LOTS-IAM-3D outperforms its predecessor and whether

it also struggles to identify WMH in over-damaged brains. Figure 4.2 shows the performance

of both LOTS-IAMs on different load levels. The division was designed in

order to have around a third of the data in each group. Therefore, WMH volumes lower

than 4cc were considered “Low load”, WMH volumes between 4cc and 10cc were considered

“Medium load”, and higher volumes were considered “High load”.

[Figure 4.2 plot: Patch Selector Performance; x-axis: Load (Low, Medium, High); y-axis: Average Dice (0.0 to 0.8); legend (Model): LOTS-IAM, LOTS-IAM-3D T=0.05]

Figure 4.2: Patch selector evaluation according to different load levels.

From figure 4.2, we can observe that the use of the patch selector results in a clear

improvement compared to the basic LOTS-IAM (i.e. 2D approach). Therefore, it was

included in the final version of the algorithm. Figure 4.3 illustrates the “underestima-

tion effect” of the LOTS-IAM-3D without patch selector, clearly inducing mistakes in

the segmentation.

(a) IAM of LOTS-IAM-3D. (b) IAM of LOTS-IAM-3D with PS (T=0.05).

(c) Original FLAIR scan.

Figure 4.3: Target Patch Selector (PS) Performance.

Figure 4.4 shows the agreement plots between the LOTS-IAM-3D and the ground

truth. Figure 4.4(a) compares the volume change between any two time points. Each

point in the graph represents the value of the slope that captures the change as per

both methods. They were generated by plotting these values for each patient (i.e. the

slope that characterises the volume change between visits 1 & 2, 2 & 3 and 1 & 3)

in a graph as calculated by the LOTS-IAM-3D (x-axis) and according to the ground

truth (y-axis). The Pearson correlation between the prediction and the ground truth

is 0.735 with a p-value of 8.10 × 10^-16 (high confidence), and most of the data in the

Bland-Altman plot is confined within the limits of agreement, which demonstrates clear

agreement.

[Figure 4.4(a) plot: Predicted vs ground truth slopes; x-axis: Predicted Slopes (3D-0.05); y-axis: Ground Truth Slopes; annotations: r: 0.735, p: 8.10E-16; lines: Perfect Agreement, Linear Regression]

(a) Correlation between slopes.

[Figure 4.4(b) plot: Bland Altman (prediction vs ground truth); x-axis: Average of measures; y-axis: Difference between measures; annotations: MEAN: -0.1257, +1.96SD: +12.03, -1.96SD: -12.282]

(b) Prediction and ground truth agreement.

Figure 4.4: Measures of agreement.

4.2.3 WMH Subtle Changes

The LOTS-IAM-3D is capable of recognising the unhealthy parts of the brain. This

means that it is able to segment the WMH in an MRI scan. However, clinical studies

are usually highly interested in the dynamic changes of the lesions. Therefore, it is

interesting to evaluate our module for its ability to capture the dynamic changes present

in a patient’s brain from one visit to another.

To do so, the lesion evaluation and quantification was performed in two steps. First,

the preceding visit prediction mask was subtracted from the second and third visit

prediction masks (Visit 2 - Visit 1, Visit 3 - Visit 2), leaving only the dynamic changes

between the consecutive visits. Then, the first visit was subtracted from the last visit

(Visit 3 - Visit 1), leaving the dynamic changes overall. Finally, an individual extra

evaluation was performed on the subtle changes in-between visits, in a similar way as

the total WMH load segmentation evaluation.

This aims at revealing whether the new LOTS-IAM-3D has the ability to also de-

tect small dynamic changes, in addition to the main WMH load. The procedure for

identifying lesion changes over two consecutive visits is demonstrated in figure 4.5.

Figure 4.5(a) is subtracted from figure 4.5(b), which highlights the changes, identified

by their category (“Extended”, “Healed” and “Stable”) in figure 4.5(c).
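
The categorisation of changes between two visits can be sketched with boolean masks, as below. Logical operations stand in for the mask subtraction and give the same three categories. The function name and the toy masks are hypothetical.

```python
import numpy as np


def lesion_changes(earlier, later):
    """Split two binary WMH masks into extended, healed and stable voxels."""
    earlier = np.asarray(earlier, dtype=bool)
    later = np.asarray(later, dtype=bool)
    return {
        "Extended": later & ~earlier,  # later - earlier = +1
        "Healed": earlier & ~later,    # later - earlier = -1
        "Stable": earlier & later,     # lesion present at both visits
    }


# Toy masks over four voxels for two consecutive visits (hypothetical).
visit_1 = np.array([1, 1, 0, 0])
visit_2 = np.array([0, 1, 1, 0])
changes = lesion_changes(visit_1, visit_2)
```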


(a) First visit FLAIR WMH. (b) Second visit FLAIR WMH.

(c) Evolution of lesions.

Figure 4.5: Visualisation of lesion evolution between two visits.

Table 4.1 presents the results of the evaluation: the average prediction volume, the
average ground truth volume, and the average Dice coefficient, PPV, sensitivity,
specificity and F1 score. Intermediate changes (“Inter.”) are also included, although
they were nonexistent in the ground truth. The LOTS-IAM-3D matches or outperforms
the original LOTS-IAM on every overlap metric (Dice, PPV, sensitivity and F1 score)
in all the categories; the only exception is specificity. This can be explained by the
slightly lower number of false positives given by the LOTS-IAM on these very small
regions. Therefore, we can conclude that the LOTS-IAM-3D can be preferred to its
predecessor for lesion evolution characterisation, which will be performed in chapter 5.


                   LOTS-IAM                           LOTS-IAM-3D
AVERAGE            HEALS   EXT.    STAB.   INTER.     HEALS   EXT.    STAB.   INTER.
PRED. VOL. (CC)    3.891   3.753   4.957   1.989      5.025   4.380   7.942   2.476
GT. VOL. (CC)      4.266   3.892   7.828   0          4.266   3.892   7.828   0
DICE               0.300   0.254   0.562   0          0.349   0.321   0.608   0
PPV                0.235   0.278   0.662   0          0.294   0.376   0.669   0
SENSITIVITY        0.255   0.285   0.567   0          0.286   0.354   0.708   0
SPECIFICITY        0.987   0.981   0.993   0.988      0.979   0.978   0.979   0.984
F1-SCORE           0.223   0.291   0.669   0          0.223   0.340   0.682   0

Table 4.1: Evaluation of the LOTS-IAMs performances for subtle changes in WMH.
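
For reference, the voxel-wise metrics of table 4.1 can be derived from the confusion counts of a predicted mask against its ground truth. The sketch below uses the standard definitions; the function name `segmentation_metrics` is hypothetical, and returning 0 when a denominator is empty is an assumption (the thesis does not say how empty masks were handled).

```python
from typing import Dict, Sequence


def segmentation_metrics(predicted: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Compute Dice, PPV, sensitivity and specificity for two binary masks."""
    if len(predicted) != len(truth):
        raise ValueError("both masks must have the same number of voxels")
    tp = fp = fn = tn = 0
    for p, t in zip(predicted, truth):
        if p and t:
            tp += 1  # lesion voxel correctly detected
        elif p:
            fp += 1  # voxel predicted as lesion but healthy in the ground truth
        elif t:
            fn += 1  # lesion voxel missed by the prediction
        else:
            tn += 1  # healthy voxel correctly left out

    def ratio(numerator: int, denominator: int) -> float:
        # Assumption: an undefined ratio (empty denominator) is reported as 0.
        return numerator / denominator if denominator else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "ppv": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }
```

Because healthy voxels vastly outnumber lesion voxels in such small regions, specificity stays close to 1 even when the overlap metrics are modest, as in table 4.1.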

4.3 Multi-Spectral Clustering Performance

Unfortunately, the results of the multi-spectral clustering using Gaussian mixture
models were not as positive as expected. As explained in section 3.4, all the possible
combinations of the FLAIR, T1-weighted and T2-weighted sequences were used to perform
the clustering and segmentation. Figure 4.6 shows the results of each combination, along
with the results of the LOTS-IAM-3D (threshold = 0.05) from section 4.2.2. Combining
different sequences did not improve the overall performance of the segmentation.

[Figure 4.6: plot "Clustering Performance". x-axis: modality ([FLAIR], [T1-FLAIR], [T1-T2-FLAIR], [T2-FLAIR]); y-axes: average Dice and average F1 score.]

Figure 4.6: Gaussian mixture model performances.


4.4 Execution Speed

As mentioned earlier, the architecture of the original LOTS-IAM was completely changed

to optimise its overall execution speed. Table 4.2 shows the execution time of the dif-

ferent LOTS-IAM versions, for different input sizes. Note that “Improved” refers to

the code refactoring of the model’s architecture from section 3.3.2 and “PS” refers to

the target patch selector from section 3.3.3.

                   LOTS-IAM                               LOTS-IAM-3D
INPUT SIZE         ORIGINAL            IMPROVED           PS                  NO PS
256X256X176        110.19 ± 12.26 s    65.91 ± 0.31 s     129.99 ± 12.96 s    36.43 ± 4.86 s
256X256X42         33.89 ± 4.71 s      15.35 ± 0.33 s     32.35 ± 6.90 s      15.00 ± 2.81 s
MEAN PER SLICE     0.72 ± 0.09 s       0.36 ± 0.00 s      0.92 ± 0.11 s       0.36 ± 0.05 s

Table 4.2: Total execution times for different input sizes. PS = Patch Selector.

Table 4.3 shows the time improvement for each function when tested with the

MSS3 subset (input size= 256x256x176). However, it is important to note that the

original LOTS-IAM makes N calls to each function listed below, where N is the num-

ber of layers per scan. On the other hand, the LOTS-IAM-3D calls them only once,

making it much faster overall.

                   LOTS-IAM                               LOTS-IAM-3D
FUNCTION           ORIGINAL            IMPROVED           PS                  NO PS
SOURCE EXTR.       0.118 ± 0.142 s     0.002 ± 0.180 s    0.139 ± 0.030 s     0.139 ± 0.030 s
TARGET EXTR.       0.099 ± 0.040 s     0.019 ± 0.027 s    4.641 ± 1.825 s     1.520 ± 0.931 s
GPU COMP.          0.953 ± 0.981 s     0.919 ± 1.090 s    0.972 ± 1.435 s     0.972 ± 1.435 s

Table 4.3: Execution times of the LOTS-IAM main functions. PS = Patch Selector.


Chapter 5

Lesion Characterisation

5.1 Stability Index

The Stability Index [45] was developed with the idea of capturing the volume change

in brain MRI over time. It responds to the need for a straightforward and relevant way

of measuring variability in brain MRI volumes for a single individual over time. It is

calculated as follows:

$$\sum_{n=1}^{N} \left[ \frac{X_n - X_{n+1}}{X_n} \times 100 \right] \qquad (5.1)$$

where $N$ is the total number of visits recorded and $X_n$ is the WMH volume at time
$n$ (the terms are defined only for pairs of consecutive visits, i.e. up to
$n = N - 1$). Overall, the stability index captures how stable a volume is over time.
A high value corresponds to a high estimate of variability, whereas a value close to 0
corresponds to a more stable mass.

However, for this project, a slight variant of the stability index is used, as shown

on equation 5.2:

$$\sum_{n=1}^{N} \left[ \frac{X_{n+1} - X_n}{X_n} \times 100 \right] \qquad (5.2)$$

where a positive value represents an overall increase in the WMH volume (extends)
and a negative value represents a decrease in the WMH volume (heals).
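
As a worked illustration of equation 5.2, the sketch below sums the percentage changes over the pairs of consecutive visits. The function name `stability_index_variant` is hypothetical, and raising an error on a zero volume is an assumption, since the relative change is undefined in that case.

```python
from typing import Sequence


def stability_index_variant(volumes: Sequence[float]) -> float:
    """Sum of percentage volume changes between consecutive visits (eq. 5.2)."""
    if len(volumes) < 2:
        raise ValueError("at least two visits are required")
    total = 0.0
    # Pair each visit with the following one: (X_1, X_2), (X_2, X_3), ...
    for previous, current in zip(volumes, volumes[1:]):
        if previous == 0:
            raise ValueError("a zero volume makes the relative change undefined")
        total += (current - previous) / previous * 100.0
    return total
```

For example, volumes of 10, 12 and 9 cc give +20% followed by -25%, i.e. an index of -5, which would be read as a slight overall decrease (heals).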

5.2 Analysis Pipeline

The analysis pipeline aims at characterising the brain lesions per individual, per pop-

ulation, per consecutive visit and over time. To do so, the pipeline is divided into five


analyses. For a more detailed account of each analysis, please refer to the correspond-

ing section. Note that by population, we refer to a group of individuals (e.g. MSS2

dataset).

Firstly, an initial investigation is performed on the data to get some general infor-

mation about the distribution and scale of each population, treating each visit indepen-

dently. We use the volume, damage metric and stability index for each population.

Secondly, a lesion state analysis, similar to the subtle change segmentation of

section 4.2.3, gives some more in-depth information about the possible progression of

the data. This is crucial information regarding the subtle changes occurring in-between

visits. However, it can only capture progression accurately if the whole population

follows the same trends, which is not always the case.

Thirdly, we divide the dataset into different trends, based on its mean and standard

deviation. We perform this to account for inter-individual differences that can occur in

clinical studies. The separation of the data allows for a more relevant analysis and cat-

egorisation of the individuals belonging to a population into smaller subgroups. This

solves the issue faced by the simple temporal load subtraction, performed during the

lesion state analysis. The subgroups are “Decreasing” and “Increasing”. Further sub-

categories (“Highly increasing”, “Slightly increasing”, ...) can be designed, according

to the distribution of the data and the presence of “outliers”.

Next, we perform a time-correlation analysis for each category of our dataset.

This attempts to underline any relationship common to individuals of the same trend,

identified in the previous step. For all consecutive visits, the volume change is anal-

ysed and modelled through a linear regression. The quality of the fit is given by its

Pearson correlation coefficient and p-value.

Finally, a growth curve model is fitted to each trend of the dataset. This allows
the shape of the lesion evolution to be captured, in addition to its direction. For
each fit, an intercept, a slope and the associated errors describe the quality of the
model and the variance it explains.

For this thesis, we deployed our analysis pipeline on the subsets of MSS2 & MSS3
described in section 3.2, using the output of the LOTS-IAM-3D (PS = 0.05). The next sections

show the results for both sets. The analysis and the graphs were fully generated with

Jupyter notebooks and are available on request.


5.3 Initial Analysis

First, the initial measurements take into account the total WMH load per time step.

This generates metrics to quantify the WMH volume separately at each visit (visit 1,

visit 2, visit 3), without capturing the relationships between the attributes.

Table 5.1 shows the initial measurements captured for MSS2 and MSS3.

                   MSS2                           MSS3
                   V1       V2       V3           V1       V2       V3
VOLUME (CC)
  MEDIAN           9.818    9.780    9.874        28.950   34.205   35.105
  MEAN             13.848   13.554   13.203       31.654   31.693   36.399
  STD              13.836   11.972   10.905       17.129   15.114   19.299
DAMAGE MET.
  MEDIAN           0.0223   0.0212   0.0243       0.0209   0.0212   0.0237
  MEAN             0.0309   0.0292   0.0362       0.0240   0.0240   0.0266
  STD              0.0301   0.0245   0.0317       0.0145   0.0141   0.0156
STAB. INDEX
  MEDIAN           14.258   14.258   14.258       15.369   15.369   15.369
  MEAN             17.761   17.761   17.761       26.864   26.864   26.867
  STD              50.000   50.000   50.000       58.027   58.027   58.027

Table 5.1: Initial measures of the MSS2 & MSS3 subsets.

From table 5.1, we can see that the volumes in MSS2 are clearly lower than those in
MSS3. This is caused by the MRI dimension size of MSS3 being bigger than that of MSS2.
Also, the large difference between the mean and median of MSS2 suggests a skewed
distribution, whereas the mean and median of MSS3 are closer to each other.
Furthermore, the means and standard deviations across visits suggest that the WMH
volume does not vary significantly over time. However, this might not hold if positive
and negative changes cancel each other out at population level, which would make these
aggregate numbers uninformative. Only a deeper analysis can reveal whether this is the case.

5.4 Lesion State Analysis

The lesion state analysis of the volume is performed by subtracting consecutive visits

from each other (visit 2 - visit 1, visit 3 - visit 2, ...) and labelling the different values

as “heals”, “stable” or “extends”. Table 5.2 shows that MSS2 has roughly an equal

amount of healing and extending regions. However, MSS3’s extending lesions seem


to overpower the healing ones.

                   MSS2                           MSS3
                   HEALS    STABLE   EXT.         HEALS    STABLE   EXT.
VOLUME (CC)
  MEAN             5.025    7.942    4.380        9.3677   19.270   14.113
  STD              13.836   11.972   10.905       8.6829   13.401   9.3742
DAMAGE MET.
  MEAN             0        0.023    0.008        0        0.017    0.007
  STD              0        0.023    0.011        0        0.013    0.003

Table 5.2: Lesion state analysis of the MSS2 & MSS3 subsets.

5.5 Trend Separation

As every individual is different, it is impossible to describe a large population with a

single metric. This is why we perform a split of the population, according to the trends

present in its individuals. To identify the different trends, we observe the box plots of

the volume variability (i.e. visit subtraction) in MSS2 & MSS3, for all visits.

[Figure 5.1(a): box plots "Volume change distribution over time" for MSS2. x-axis: Visit 1, Visit 2, Visit 3; y-axis: volume (cc).]

(a) Box plot of MSS2 volume changes.

[Figure 5.1(b): box plots "Volume change distribution over time" for MSS3. x-axis: Visit 1, Visit 2, Visit 3; y-axis: volume (cc).]

(b) Box plot of MSS3 volume changes.

Figure 5.1: Distribution of the volume changes over time.

As shown on figure 5.1(a), MSS2’s volume variability ranges from a low negative
value to a high positive value, with most of the data lying around 0. However, a
substantial number of outliers lie more than one standard deviation (STD) away from
the mean, in both directions. Therefore, the separation for MSS2 will be


as follows: “high increase” (over one STD away from the mean, positively), “slight

increase” (between 0 and one STD away from the mean, positively), “slight decrease”

(between 0 and one STD away from the mean, negatively), “high decrease” (over one

STD away from the mean, negatively). If instances are placed in different folds from

one visit to another, only the fold of the last visit change is taken into consideration.
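
The four-way split can be sketched as below. This is one possible reading of the rule, under stated assumptions: each individual is represented by the volume change of the last pair of visits, distances are measured from the population mean in units of the sample STD, and the function name `assign_trends` is hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def assign_trends(changes: Sequence[float]) -> List[str]:
    """Assign each volume change to one of the four MSS2 trends."""
    if len(changes) < 2:
        raise ValueError("at least two individuals are required")
    centre = mean(changes)
    spread = stdev(changes)  # assumption: sample standard deviation
    if spread == 0:
        raise ValueError("all changes are identical, no trend can be separated")
    labels = []
    for change in changes:
        # Signed distance from the mean, expressed in STDs.
        z = (change - centre) / spread
        if z > 1:
            labels.append("high increase")
        elif z >= 0:
            labels.append("slight increase")
        elif z >= -1:
            labels.append("slight decrease")
        else:
            labels.append("high decrease")
    return labels
```

For a population such as MSS3, where no change lies beyond one STD, the same function would only ever return the two "slight" labels, which matches the simpler increase/decrease division.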

As shown on figure 5.1(b), MSS3’s volume variability is not as spread out as
MSS2’s. All the data lie within one STD of the mean, in both directions. Therefore,
a simpler division suffices: the data will be divided into volume “increase” and
volume “decrease”.

A closer look at figures 5.2 and 5.3 gives valuable information regarding the
linearity of change from one visit to another. The next section tests this linearity
assumption with a correlation analysis and linear regression.

[Figure 5.2: four panels (“High volume increase”, “Slight volume increase”, “High volume decrease”, “Slight volume decrease”), each plotting volume against Visit 1, Visit 2 and Visit 3 together with the trend average.]

Figure 5.2: Volume variability of the MSS2 subset, divided into 4 main trends.


[Figure 5.3: two panels (“Volume increase”, “Volume decrease”), each plotting volume against Visit 1, Visit 2 and Visit 3 together with the trend average.]

Figure 5.3: Volume variability of the MSS3 subset, divided into 2 main trends.

5.6 Correlation Analysis

The analysis of the correlation between the different visits can reveal valuable
information regarding the evolution of the lesions over time. We consider the
correlation between consecutive visits as well as between the first and last visit.

For our datasets, since only 3 visits are available, the correlation is calculated be-

tween visits 1 & 2, between visits 2 & 3 and between visits 1 & 3. To describe the

correlation and capture the relationship between visits, we fit a linear regression model

to the data and measure its goodness-of-fit (Pearson’s r and p-value).
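
A least-squares fit between two visits, together with Pearson's r, can be sketched with the standard library alone. The function name `linear_fit_with_r` is hypothetical. The p-value is deliberately left out, because it requires a t-distribution that the standard library does not provide; a statistics package such as SciPy (`scipy.stats.linregress`) reports it alongside the slope, intercept and r.

```python
from math import sqrt
from typing import Sequence, Tuple


def linear_fit_with_r(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (slope, intercept, Pearson r) of the regression of y on x."""
    n = len(x)
    if n != len(y):
        raise ValueError("both visits must contain the same individuals")
    if n < 2:
        raise ValueError("at least two individuals are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r
```

Here `x` would hold the volumes of one trend at the earlier visit and `y` the volumes of the same individuals at the later visit.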

[Figure 5.4: three correlation panels, each showing the linear regression and residuals. Visit 1 vs Visit 2: r = 0.835, p-value = 1.36E-3; Visit 2 vs Visit 3: r = 0.941, p-value = 1.62E-5; Visit 1 vs Visit 3: r = 0.935, p-value = 2.45E-5.]

Figure 5.4: Correlation plots of the highly increasing volumes in MSS2.


As an example, figure 5.4 demonstrates how the “highly increasing” trend of MSS2
evolves linearly with high confidence (r ≥ 0.83 and p-value < 0.01 in all fits). Due
to space limitations, the visit correlations of the other trends are reported in table
format (see tables 5.3 & 5.4). These metrics help to reveal the evolution of the data
from one visit to another.

TRENDS    HIGH DEC.          SLIGHT DEC.        SLIGHT INC.        HIGH INC.
          r      p           r      p           r      p           r      p
V1-V2     0.98   1.55E-6     0.94   6.13E-4     0.97   2.32E-5     0.84   1.36E-3
V2-V3     0.96   4.59E-5     0.96   1.89E-4     0.92   4.18E-4     0.94   1.62E-5
V1-V3     0.92   3.98E-4     0.99   4.69E-6     0.97   1.38E-5     0.94   2.45E-5

Table 5.3: Correlation measures for MSS2.

TRENDS    DECREASE           INCREASE
          r      p           r      p
V1-V2     0.87   1.26E-1     0.98   2.63E-7
V2-V3     0.74   2.64E-1     0.97   1.98E-6
V1-V3     0.97   3.00E-2     0.98   8.83E-7

Table 5.4: Correlation measures for MSS3.

The evolution according to trends seems to be strongly linear in MSS2 between
all the visits (r ≥ 0.84 and p-value < 0.01 in all cases). In MSS3, the increasing
trend is also well explained by a linear regression (r ≥ 0.97). However, the
decreasing trend of the MSS3 data is more unstable. The correlation between visits 1
and 3 is high (r = 0.97 and p-value = 0.03 in table 5.4), whereas the fits involving
visit 2 are weaker, which suggests that visit 2 presents higher fluctuations that
could not be captured as well by a linear model. Their correlation coefficients remain
fairly high (r ≥ 0.74), but the associated p-values (up to 0.264) exceed the
conventional 0.05 level, so these fits should be interpreted with caution.

5.7 Growth Curve Analysis

To further explore the area of longitudinal studies, the last analysis of the characterisa-

tion pipeline involves the computation of a growth curve model [32]. For each volume


variability trend, a growth curve model is fitted and reported in tables 5.5 & 5.6.
However, it was decided to test only for linear changes over time, due to the small
number of time points (visit 1, visit 2, visit 3). A higher number of time points could

give additional information about the shape of the growth [15], which would allow a

deeper understanding of the data. For instance, the use of polynomials could yield

better results by explaining the shape of the evolution (trajectories) with a quadratic or

cubic model [11].

The equation for a linear growth curve is given by

$$y_i = \text{intercept} + t_i \cdot \text{slope} + \text{error}_i \qquad (5.3)$$

where $y_i$ and $\text{error}_i$ represent the volume (variability) prediction and the
prediction error associated with time $i$, respectively, and $t_i$ represents the
time-specific order of the fit (e.g. [0, 1, 2] for a linear fit with 3 time points).
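
To make the roles of the intercept and slope concrete, the sketch below fits equation 5.3 to each individual by least squares and then summarises the fitted parameters by their mean (population level) and variance (individual level). This two-stage procedure is a simplified stand-in, not the latent growth curve estimation used in the thesis, and the names `fit_line` and `summarise_growth` are hypothetical.

```python
from statistics import mean, variance
from typing import Dict, Sequence, Tuple


def fit_line(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares (intercept, slope) of one individual's trajectory."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need one value per time point and at least two points")
    t_mean = mean(times)
    y_mean = mean(values)
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def summarise_growth(
    trajectories: Sequence[Sequence[float]], times: Sequence[float] = (0, 1, 2)
) -> Dict[str, float]:
    """Mean and variance of the per-individual intercepts and slopes."""
    if len(trajectories) < 2:
        raise ValueError("at least two individuals are required for a variance")
    fits = [fit_line(times, trajectory) for trajectory in trajectories]
    intercepts = [intercept for intercept, _ in fits]
    slopes = [slope for _, slope in fits]
    return {
        "intercept_mean": mean(intercepts),
        "intercept_var": variance(intercepts),
        "slope_mean": mean(slopes),
        "slope_var": variance(slopes),
    }
```

With three visits, `times` defaults to (0, 1, 2), so the intercept is the fitted volume at the first visit and the slope is the fitted change per visit.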

Table 5.5 & 5.6 show the result of applying growth curve modelling to capture the

evolution of the trajectories in MSS2 & MSS3.

TRENDS               HIGH DEC.        SLIGHT DEC.      SLIGHT INC.      HIGH INC.
INTER. MEAN (VAR)    16.12 (49.29)    10.42 (30.81)    6.99 (6.90)      11.527 (44.25)
SLOPE MEAN (VAR)     -1.08 (-6.54)    -0.14 (6.13)     0.46 (-0.05)     2.50 (0.10)
RESID. VAR. 1        30.395           -8.32            -0.263           9.969
RESID. VAR. 2        5.132            6.46             1.225            10.368
RESID. VAR. 3        9.289            -8.653           1.028            -2.978
P-VALUE              0.007            0.843            0.013            0.385

Table 5.5: Growth Curve Analysis for MSS2.

The p-values of the highly decreasing and slightly increasing volumes suggest a
significant fit to the data. However, the p-values of the slightly decreasing and
highly increasing volumes are higher, which indicates a lower confidence regarding
the fit of the model.


TRENDS               DECREASE         INCREASE
INTER. MEAN (VAR)    29.80 (22.91)    27.23 (140.7)
SLOPE MEAN (VAR)     -1.58 (0.04)     3.79 (-28.7)
RESID. VAR. 1        -2.87            56.58
RESID. VAR. 2        13.07            -24.22
RESID. VAR. 3        6.74             58.68
P-VALUE              0.829            0.160

Table 5.6: Growth Curve Analysis for MSS3.

The growth curve model does not reach a statistically significant fit for the
decreasing trend of MSS3 (p-value = 0.829). The fit is better for the increasing
trend (p-value = 0.160), although it remains above the conventional 0.05 level.

The intercept indicates the level and scale of the volume variability at each visit at
population level (mean) and at individual level (variance). Similarly, the slope relates
to the relationship between visits at population level (mean) and at individual level
(variance). The residual variance reports the error fluctuations of the growth model at
each visit.

Overall, the growth curve model allows a better understanding of a set of individu-

als presenting some similar traits (presence of WMH), from a whole population point

of view (i.e. whole dataset), as well as from an individual point of view (i.e. patient).

In our case, more data would improve the fit of the model and allow a better under-

standing of the datasets. However, as the data is still currently being collected, it is not

possible to include more trajectories into the model.


Chapter 6

Conclusions

The segmentation and characterisation of brain lesions in MRI scans is a challenging
task. However, with the constant progress made by the scientific community in
bio-imaging analysis, it is now possible to see the emergence of new unsupervised
segmentation methods that outperform experts on similar tasks. Furthermore, these
techniques have the advantage of not requiring labelled datasets, which are difficult
to acquire in the medical field. We chose one of these original segmentation methods,
called the LOTS-IAM, as the basis for this project, as we believed it could outperform
current state-of-the-art unsupervised methods for the task of WMH segmentation.

From this, we presented our first contribution, the LOTS-IAM-3D with a prior-based
target patch selection function, which improves the original algorithm by
(1) adding the processing of a depth dimension to the irregularity map generation, (2)
giving it the ability to filter out over-damaged patches during patch comparisons and
(3) drastically decreasing its processing time while improving its overall accuracy.

Our second contribution is a pipeline for the analysis of the previously segmented
lesions. It consists of five sub-analyses that outline different characteristics of brain
lesions on MRI from a population of individuals. The analyses are: overall description
of the population (“initial analysis”, section 5.3), lesion state analysis (section 5.4),
trend identification and separation (section 5.5), correlation analysis (section 5.6) and
growth curve modelling (section 5.7).

To validate our findings, we ran the LOTS-IAM-3D on our two datasets: MSS2 &
MSS3. Then, we evaluated it and compared it with the original LOTS-IAM and a standard
public segmentation tool, the LST-LGA. The evaluation demonstrated that the
LOTS-IAM-3D outperformed both the original LOTS-IAM and LST-LGA based on its Dice
coefficient and F1 score (age probability map threshold = 0.15). Furthermore, the


LOTS-IAM-3D overcomes the issue of lesion underestimation in over-damaged brains
with the implementation of the patch selector. Finally, we evaluated the
LOTS-IAM-3D’s ability to segment the subtle dynamic changes of the brain in-between
visits, which revealed again that it outperformed the original LOTS-IAM in all
categories except specificity.

Following the segmentation evaluation, we applied our analysis pipeline to the out-

puts and produced a full description of the lesions appearing in both clinical samples

(MSS2 & MSS3). The analysis revealed that the two populations show similar but

not identical characteristics. Both populations presented increasing and decreasing le-

sion volumes over time, which validates Wardlaw et al.’s findings. However, MSS2

contains individuals whose lesion evolution was steadier and presented strong linear

relationships between visits in all cases, whereas MSS3 seems to evolve linearly for

positive volume changes but not for negative volume changes. The growth curve anal-

yses give overviews of both populations and their respective categories regarding the

shape and parameters of their evolution.

Despite the good results of the evaluations, this thesis suffers from some limitations.
First of all, had more time been available, a deeper comparison of the LOTS-IAM-3D’s
performance would have been produced to also include supervised learning methods
(e.g. UResNet). Also, more work would have been put into the tuning of SOMs, as we
still believe this to be a promising area of research for multi-spectral analysis. Finally,
more time would have allowed more patient data to be registered (MSS3 is an ongoing
study), which would have improved the quality of the dataset analysis. Nevertheless,
all these limitations could be the subject of future work.

In conclusion, we bring two contributions that we believe are of importance to clinical
research. The LOTS-IAM-3D showed consistently superior performance for unsupervised
segmentation and the analysis pipeline builds strong foundations for longitudinal
studies of brain lesions. The work produced here was designed in the hope that it will
be considered by the scientific community and used for future clinical studies.


Bibliography

[1] Petronella Anbeek, Koen L Vincken, Matthias JP van Osch, Robertus HC Biss-

chops, and Jeroen van der Grond. Automatic segmentation of different-sized

white matter lesions by voxel probability estimation. Medical image analysis,

8(3):205–215, 2004.

[2] Rhoda Au, Joseph M Massaro, Philip A Wolf, Megan E Young, Alexa Beiser,

Sudha Seshadri, Ralph B DAgostino, and Charles DeCarli. Association of white

matter hyperintensity volume with decreased cognitive functioning: the framing-

ham heart study. Archives of neurology, 63(2):246–250, 2006.

[3] Mohammad Balafar. Gaussian mixture model based segmentation methods for

brain mri images. Artificial Intelligence Review, 41, 03 2014.

[4] Mohd Ali Balafar, Abdul Rahman Ramli, M Iqbal Saripan, and Syamsiah

Mashohor. Review of brain mri image segmentation methods. Artificial Intel-

ligence Review, 33(3):261–274, 2010.

[5] Rachele Bellini, Yanir Kleiman, and Daniel Cohen-Or. Time-varying weathering

in texture space. ACM Transactions on Graphics (TOG), 35(4):141, 2016.

[6] Jeff A Bilmes et al. A gentle tutorial of the em algorithm and its application to

parameter estimation for gaussian mixture and hidden markov models. Interna-

tional Computer Science Institute, 4(510):126, 1998.

[7] J Martin Bland and Douglas G Altman. Statistical methods for assessing agree-

ment between two methods of clinical measurement. The lancet, 327(8476):307–

310, 1986.

[8] Neurology Unit Cambridge University. Preserve: How intensively should we

treat blood pressure in established cerebral small vessel disease? 2012.


[9] Arıstides Andres Capizzano, L Acion, T Bekinschtein, M Furman, H Gomila,

A Martinez, R Mizrahi, and SE Starkstein. White matter hyperintensities are

significantly associated with cortical atrophy in alzheimers disease. Journal of

Neurology, Neurosurgery & Psychiatry, 75(6):822–827, 2004.

[10] A-Hyun Cho, Hyeong-Ryul Kim, Woojun Kim, and Dong Won Yang. White mat-

ter hyperintensity in ischemic stroke patients: it may regress over time. Journal

of stroke, 17(1):60, 2015.

[11] Patrick J Curran, Khawla Obeidat, and Diane Losardo. Twelve frequently asked

questions about growth curve modeling. Journal of cognition and development,

11(2):121–136, 2010.

[12] Stephanie Debette and HS Markus. The clinical importance of white matter hy-

perintensities on brain magnetic resonance imaging: systematic review and meta-

analysis. Bmj, 341:c3666, 2010.

[13] Ivana Despotovic, Bart Goossens, and Wilfried Philips. Mri segmentation of

the human brain: challenges, methods, and applications. Computational and

mathematical methods in medicine, 2015, 2015.

[14] Chuong B Do and Serafim Batzoglou. What is the expectation maximization

algorithm? Nature biotechnology, 26(8):897, 2008.

[15] Terry E Duncan and Susan C Duncan. The abcs of lgm: An introductory guide

to latent variable growth curve modeling. Social and personality psychology

compass, 3(6):979–991, 2009.

[16] Franz Fazekas. Incidental periventricular white matter hyperintensities revisited:

what detailed morphologic image analyses can tell us. American Journal of Neu-

roradiology, 35(1):63–64, 2014.

[17] Franz Fazekas, John B Chawluk, Abass Alavi, Howard I Hurtig, and Robert A

Zimmerman. Mr signal abnormalities at 1.5 t in alzheimer’s dementia and normal

aging. American journal of roentgenology, 149(2):351–356, 1987.

[18] Urs Fischer, Adrian Baumgartner, Marcel Arnold, Krassen Nedeltchev, Jan

Gralla, Gian Marco De Marchis, Liliane Kappeler, Marie-Luise Mono, Caspar

Brekenfeld, Gerhard Schroth, et al. What is a minor stroke? Stroke, 41(4):661–

666, 2010.


[19] Office for National Statistics. Deaths registered in england and wales, 2017.

[20] Melvin Gelbard. Data mining with dynamic contrast enhanced magnetic reso-

nance imaging (dce-mri) data, 2019.

[21] Davide Giavarina. Understanding bland altman analysis. Biochemia medica,
25(2):141–151, 2015.

[22] Lei Guo, Xuena Liu, Youxi Wu, Weili Yan, and Xueqin Shen. Research on

the segmentation of mri image based on multi-classification support vector ma-

chine. In 2007 29th Annual International Conference of the IEEE Engineering in

Medicine and Biology Society, pages 6019–6022. IEEE, 2007.

[23] Joseph V Hajnal, David J Bryant, Larry Kasuboski, Pradip M Pattany, et al. Use

of fluid attenuated inversion recovery (flair) pulse sequences in mri of the brain.

Journal of computer assisted tomography, 16:841–841, 1992.

[24] Maria Del C Valdes Hernandez, Francesca M Chappell, Susana Munoz Man-

iega, David Alexander Dickie, Natalie A Royle, Zoe Morris, Devasuda Anblagan,

Eleni Sakka, Paul A Armitage, Mark E Bastin, et al. Metric to quantify white mat-

ter damage on brain magnetic resonance images. Neuroradiology, 59(10):951–

962, 2017.

[25] Scott A Huettel, Allen W Song, Gregory McCarthy, et al. Functional magnetic

resonance imaging, volume 1. Sinauer Associates Sunderland, MA, 2004.

[26] Ali Isın, Cem Direkoglu, and Melike Sah. Review of mri-based brain tumor

image segmentation using deep learning methods. Procedia Computer Science,

102:317–324, 2016.

[27] Xiao-li Jin. Multi-spectral mri brain image segmentation based on kernel cluster-

ing analysis.

[28] Raymond J. Kim, Edwin Wu, Allen Rafael, Enn-Ling Chen, Michele A. Parker,

Orlando Simonetti, Francis J. Klocke, Robert O. Bonow, and Robert M. Judd.

The use of contrast-enhanced magnetic resonance imaging to identify reversible

myocardial dysfunction. New England Journal of Medicine, 343(20):1445–1453,

2000. PMID: 11078769.


[29] Xavier Llado, Onur Ganiler, Arnau Oliver, Robert Martı, Jordi Freixenet, Laia

Valls, Joan C Vilanova, Lluıs Ramio-Torrenta, and Alex Rovira. Automated

detection of multiple sclerosis lesions in serial brain mri. Neuroradiology,

54(8):787–807, 2012.

[30] MATLAB. version 9.2.0 (R2017a). The MathWorks Inc., Natick, Massachusetts,

2017.

[31] Katie L. McMahon, Gary Cowin, and Graham Galloway. Magnetic resonance

imaging: The underlying principles. Journal of Orthopaedic & Sports Physical

Therapy, 41(11):806–819, 2011. PMID: 21654095.

[32] Daniel McNeish and Tyler Matta. Differentiating between mixed-effects and

latent-curve approaches to growth modeling. Behavior research methods,

50(4):1398–1414, 2018.

[33] Todd K Moon. The expectation-maximization algorithm. IEEE Signal processing

magazine, 13(6):47–60, 1996.

[34] Muhammad Febrian Rachmadi, Maria del C. Valdes-Hernandez, Hongwei Li, Ricardo Guerrero, Rozanna Meijboom, Stewart Wiseman, Adam Waldman, Jianguo Zhang, Daniel Rueckert, and Taku Komura. Limited one-time sampling irregularity map (LOTS-IM): Automatic unsupervised quantitative assessment of white matter hyperintensities in structural brain magnetic resonance images. 2019.

[35] D Mungas, WJ Jagust, Bruce R Reed, JH Kramer, MW Weiner, N Schuff, D Norman, WJ Mack, L Willis, and HC Chui. MRI predictors of cognition in subcortical ischemic vascular disease and Alzheimer’s disease. Neurology, 57(12):2229–2235, 2001.

[36] John T O’Brien, David Ames, and Isaac Schweitzer. White matter changes in depression and Alzheimer’s disease: a review of magnetic resonance imaging studies. International Journal of Geriatric Psychiatry, 11(8):681–694, 1996.

[37] Travis E Oliphant. A guide to NumPy, volume 1. Trelgol Publishing USA, 2006.

[38] World Health Organization. World Health Report. 2002.

[39] Leonardo Pantoni, Michela Simoni, Giovanni Pracucci, Reinhold Schmidt, Frederik Barkhof, and Domenico Inzitari. Visual rating scales for age-related white matter changes (leukoaraiosis): can the heterogeneity be reduced? Stroke, 33(12):2827–2833, 2002.

[40] Julia Patriarche and Bradley Erickson. A review of the automated detection

of change in serial imaging studies of the brain. Journal of digital imaging,

17(3):158–174, 2004.

[41] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel,

M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos,

D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine

learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.

[42] Richard J Radke, Srinivas Andra, Omar Al-Kofahi, and Badrinath Roysam. Image change detection algorithms: a systematic survey. IEEE transactions on image processing, 14(3):294–307, 2005.

[43] Paul Schmidt, Christian Gaser, Milan Arsic, Dorothea Buck, Annette Förschler, Achim Berthele, Muna Hoshi, Rüdiger Ilg, Volker J Schmid, Claus Zimmer, et al. An automated tool for detection of FLAIR-hyperintense white-matter lesions in multiple sclerosis. Neuroimage, 59(4):3774–3783, 2012.

[44] Reinhold Schmidt, Helena Schmidt, Peter Kapeller, Christian Enzinger, Stefan Ropele, Ronald Saurugg, and Franz Fazekas. The natural course of MRI white matter hyperintensities. Journal of the neurological sciences, 203:253–257, 2002.

[45] Christopher James Martin Scott. Master’s thesis, University of Edinburgh, 2016.

[46] Eric E Smith, Svetlana Egorova, Deborah Blacker, Ronald J Killiany, Alona

Muzikansky, Bradford C Dickerson, Rudolph E Tanzi, Marilyn S Albert,

Steven M Greenberg, and Charles RG Guttmann. Magnetic resonance imaging

white matter hyperintensities and brain volume in the prediction of mild cognitive

impairment and dementia. Archives of neurology, 65(1):94–100, 2008.

[47] Stephen M Smith, Yongyue Zhang, Mark Jenkinson, Jacqueline Chen,

PM Matthews, Antonio Federico, and Nicola De Stefano. Accurate, robust, and

automated longitudinal and cross-sectional brain change analysis. Neuroimage,

17(1):479–489, 2002.

[48] Els Steeman, Bernadette Dierckx De Casterle, Jan Godderis, and Mieke Grypdonck. Living with early-stage dementia: A review of qualitative studies. Journal of advanced nursing, 54(6):722–738, 2006.

[49] Cathie Sudlow, John Gallacher, Naomi Allen, Valerie Beral, Paul Burton, John Danesh, Paul Downey, Paul Elliott, Jane Green, Martin Landray, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS medicine, 12(3):e1001779, 2015.

[50] CLM Sudlow and CP Warlow. Comparing stroke incidence worldwide: what

makes studies comparable? Stroke, 27(3):550–558, 1996.

[51] G. Vettigli. Minisom: minimalistic and numpy based implementation of the self

organizing maps. https://github.com/JustGlowing/minisom, 2013.

[52] Joanna M Wardlaw, Michael Allerhand, Fergus N Doubal, Maria Valdes Hernandez, Zoe Morris, Alan J Gow, Mark Bastin, John M Starr, Martin S Dennis, and Ian J Deary. Vascular risk factors, large-artery atheroma, and brain white matter hyperintensities. Neurology, 82(15):1331–1338, 2014.

[53] Joanna M Wardlaw, Francesca M Chappell, Maria del Carmen Valdes Hernandez, Stephen DJ Makin, Julie Staals, Kirsten Shuler, Michael J Thrippleton, Paul A Armitage, Susana Munoz-Maniega, Anna K Heye, et al. White matter hyperintensity reduction and outcomes after minor stroke. Neurology, 89(10):1003–1010, 2017.

[54] Joanna M Wardlaw, Maria C Valdes Hernandez, and Susana Munoz-Maniega. What are white matter hyperintensities made of? Relevance to vascular cognitive impairment. Journal of the American Heart Association, 4(6):e001140, 2015.