A Data Mining Approach to the
Study of Dynamic Changes in
Brain White Matter
Melvin Gelbard
Master of Science
Artificial Intelligence
School of Informatics
University of Edinburgh
2019
Abstract
In brain magnetic resonance imaging, white matter hyper-intensities are a visual indi-
cator of small vessel disease. Furthermore, they have been associated with the develop-
ment of degenerative conditions such as vascular dementia. Therefore, a lot of research
is dedicated to the analysis of such pathological features. However, poor segmentation
and characterisation of these lesions has led to ambiguous or contradictory hypotheses.
To resolve this issue, we present our two contributions: the LOTS-IAM-3D and a lesion
characterisation pipeline. The LOTS-IAM-3D outperforms its 2D implementation and
the state-of-the-art unsupervised segmentation method for the identification of brain
white matter hyper-intensities in terms of both accuracy and processing speed. Then,
the characterisation pipeline produced in this work introduces a robust and precise
way of carrying out lesion characterisation to account for inter- and intra-individual
differences, as well as in a whole population. We applied our segmentation module on
two datasets of patients with mild strokes and presented a thorough evaluation of the
LOTS-IAM-3D followed by a complete lesion analysis based on our pipeline.
Acknowledgements
First of all, I would like to thank my supervisor, Professor Taku Komura, for offering
me the opportunity to work on such an inspiring project and giving me the willingness
to pursue a career in biomedical imaging.
Second of all, I want to show my gratitude towards Dr. María Valdés Hernández
and Febrian Rachmadi, for their constant support and ideas that helped me go through
this challenging task. This project would not have been possible without you.
Finally, I would like to thank my family, Marc, Carine, Olivia, Dan, Tzuki and
Quito for their support from far away and Elisabetta, Anisha, Katie, Gaurav, Jordi and
Marko for their support in Appleton tower level 9.
Table of Contents
1 Introduction
  1.1 Motivation
  1.2 Research Goal
  1.3 Significance
  1.4 Beneficiaries
2 Background Research
  2.1 Ground Medical Knowledge
    2.1.1 Stroke
    2.1.2 Magnetic Resonance Imaging
    2.1.3 MRI Analysis
    2.1.4 White Matter Hyper-intensities
  2.2 Ground Technical Knowledge
    2.2.1 LOTS-IAM
    2.2.2 Gaussian Mixture Models
    2.2.3 White Matter Damage Metric
3 Methodology
  3.1 Ethics
  3.2 Dataset
  3.3 Modification of the LOTS-IAM
    3.3.1 LOTS-IAM-3D
    3.3.2 Data Structure
    3.3.3 Target Patch Selection From Prior
  3.4 Multi-Spectral Clustering
    3.4.1 Gaussian Mixture Model
  3.5 Tools and Libraries
4 Segmentation Evaluation
  4.1 Metrics Used
    4.1.1 Measures of Relevance
    4.1.2 Jaccard Index & Dice coefficient
    4.1.3 Bland-Altman Measure
  4.2 WMH Segmentation
    4.2.1 Total WMH Load
    4.2.2 Target Patch Selector Performance
    4.2.3 WMH Subtle Changes
  4.3 Multi-Spectral Clustering Performance
  4.4 Execution Speed
5 Lesion Characterisation
  5.1 Stability Index
  5.2 Analysis Pipeline
  5.3 Initial Analysis
  5.4 Lesion State Analysis
  5.5 Trend Separation
  5.6 Correlation Analysis
  5.7 Growth Curve Analysis
6 Conclusions
Bibliography
Chapter 1
Introduction
It is well established that white matter hyper-intensities (short: WMHs) are connected
to the progression of neurodegenerative conditions such as dementia and Alzheimer’s
disease (short: AD), and to the occurrence of strokes [46, 9, 2]. These neurological
syndromes usually appear in individuals of older age and are believed to be accelerated
by small vessel disease (short: SVD), also predominant in elderly subjects [35]. Ev-
ery year, around 15 million people are affected by brain strokes and 850,000 people
by dementia [38]. Despite the considerable improvement in quality of life in the past
decades, these numbers are a sign that age-related neurodegenerative diseases remain
a struggle for modern medicine [48], as they can occur in any subset of the world
population.
Research strongly suggests that the presence of WMH is an indicator of SVD,
which leads to the development of cognitive impairment. However, their aetiology
is not very well understood [52]. The analysis of these brain white matter lesions
of presumed vascular origin is clinically important [54] as they “predict an increased
risk of stroke, dementia, and death” [12] and methods could potentially be found to
minimise their development.
Nowadays, the progress in modern technology allows a more precise analysis of
these lesions with the use of magnetic resonance imaging (short: MRI). This imaging
method facilitates the examination of pathology in the brain. Naturally, the automatic
identification of the pathological brain features became a rapidly growing field of re-
search. Most of these automatic methods rely on expert-generated data. However,
the skills and experience of the specialist performing the analysis highly influence the
quality of the reference data. To solve this issue, these labels are generated by groups
of experts, usually blinded to each other. Nevertheless, this procedure suffers from
intra-observer and inter-observer inconsistencies [47] and is time-consuming as well
as costly. Furthermore, despite the extended research on the subject, many subtleties
lying behind the general knowledge of these pathological features are still unexplained
or unexplored [20]. This is reinforced by two common problems: (1) the lack of a
standard quantification metric for the evolution of lesions in the brain and (2) the
variability in the segmentation and classification of white matter damage [20]. Both
have led to contradictory results and ambiguous hypotheses [16].
Recently, promising work carried out at the Centre of Clinical Brain Sciences
(short: CCBS) of the University of Edinburgh attempted to solve the issue of lesion
quantification by introducing a new metric for quantifying white matter damage on
MRI [24]. This measure is more robust to variations in the scanning protocols and
manages to capture any level of white matter damage in the brain. However, no useful
analysis can be drawn without an accurate segmentation of the white matter volume
first, which generally lacks specificity due to brain tissue irregularities.
Many tools such as deep convolutional networks [26] and support vector machines
[22] have been used for solving the issue of inaccurate WMH segmentation. However,
they usually lack practicality as they are tuned for specific MRI settings. Furthermore,
they require large labelled datasets to perform supervised learning, which are
challenging to acquire in the field of medical research. Nonetheless, an interesting
approach was taken by Rachmadi et al. in their recent publication [34]. This innovative
method, based on texture analysis from computer graphics [5], generates
probability maps to detect abnormalities in the input image. It is fully unsupervised
and therefore does not require any training on labelled datasets. Even though it suffers
from some shortcomings, such as lesion underestimation in highly damaged brains and
long processing times, it was taken as the basis for this project as it is believed that it
can outperform current state-of-the-art approaches. An overview of the shortcomings is
given in section 2.2.1 and an explanation of the modifications is given in section 3.3.
This thesis aims at helping to establish a robust pipeline for the segmentation and
characterisation of white matter damage and its progression in brain MRI. To do so,
it will use two brain imaging datasets of patients with mild stroke. They comprise
MRI scans from 17 and 43 patients respectively, taken at three different time points.
The datasets received ethical approval from the Lothian Research Ethics Committee
(09/S1101/54, LREC/2003/2/29, REC 09/81101/54), the NHS Lothian R+D Office
(2009/W/NEU/14) and the Multi-Centre Research Ethics Committee for Scotland
(MREC/01/0/56), and the studies were conducted according to the principles expressed
in the Declaration of Helsinki [24].
This thesis is structured by dividing each aspect of the project into specific chap-
ters and sections that do not require the reader to have previous knowledge in biology
or data mining. First, chapter 1 will present the main motivation and objective of this
dissertation (sections 1.1 & 1.2), along with a detailed account of its significance in the
scientific world and an explanation of who will be the beneficiaries of this project (sec-
tions 1.3 & 1.4). Then, chapter 2 will present the reader with background information
regarding the main contribution of the thesis.
The contribution of this thesis is twofold. First, the initial stage finds an
innovative way to accurately segment out white matter lesions appearing in brain
MRI, as detailed in chapter 3 and evaluated in chapter 4. Second, the following
stage quantifies and characterises the recently segmented white matter volume
with a pipeline of analyses designed by the author, as detailed in chapter 5.
Finally, chapter 6 will discuss the results presented in the previous chapters, along
with a small investigation about the limitations of the project and future work.
1.1 Motivation
In their review, Llado et al. call for the need of an automatic segmentation and quan-
tification system for better analysis and identification of brain pathology [29]. Their
legitimate concerns come from the fact that there is no common ground, at the time of
writing, for the assessment of abnormalities (including WMH) and the characterisation
of their evolution. However, there is a crucial need for such work to be carried out as
many fully-automated algorithms have been found to perform better than experts on
similar tasks [42].
This project is motivated by the desire to move the field forward and to contribute
an improved and time-efficient segmentation module and characterisation
pipeline for brain MRI analysis. It was carried out with the hope that it will be used by
the research community for the establishment of strong common grounds among dif-
ferent institutions. This would allow future findings to be more trustworthy and enable
cooperation between research centres that currently use different tools for segmenta-
tion and analysis.
1.2 Research Goal
The objective of this research project is to address the shortcomings in the clinical ap-
plication of the LOTS-IAM to WMH segmentation by developing a complete system
that: (1) allows the accurate segmentation of WMH from brain MRI taken at different
time points and (2) allows accurate information to be gleaned from it. The segmenta-
tion module takes a 3D-array MAT file representing the scan as input and performs a
patch comparison-based abnormality detection. Then, it generates a probability map
for each voxel, representing how likely it is to be WMH. This thesis does not focus on
separating WMH of presumed vascular origin from WMH caused by stroke incidents.
Therefore, the regions segmented by the module could originate from either.
The second part of the system is an analysis pipeline designed for the characterisation
of the previously-found abnormalities. It takes the output from the first
module and generates a full description of the lesions over time and a complete
analysis of the generated numbers.
1.3 Significance
In collaboration with Dr. Taku Komura, Dr. María Valdés Hernández and Febrian
Rachmadi, the project is at the centre of their current aim in this field: being able to
describe and characterise subtle changes in white matter hyper-intensity volumes in
the brain over time. Currently, the full analysis (segmentation and characterisation) is
semi-automatically performed, with a great deal of manual input. A substantial amount
of time is spent on these tasks, whereas an efficient algorithm could allow researchers
and doctors to fully focus on the diagnosis and treatment administration.
1.4 Beneficiaries
The first motivation behind this project was the curiosity for the exploration of new
techniques and improvements for WMH segmentation and characterisation. However,
due to its interesting results, it will benefit any academic party interested in the
development of brain lesion segmentation and analysis algorithms. Furthermore, as a
result of its original implementation, the system is beneficial to experts in the fields
of both computer science and medicine. In the long run, clinicians and patients with
WMH in need of an accurate diagnosis will be able to profit from it.
Chapter 2
Background Research
2.1 Ground Medical Knowledge
2.1.1 Stroke
According to the World Health Organization, brain strokes are the second leading
cause of death worldwide, with more than 15 million people affected every year [38].
Even if about a third do not leave any irreversible damage, the rest can have
dreadful consequences. Great progress in medicine, coupled with an increase in the
quality of life worldwide, has led to an impressive decline in the number of recorded
strokes. However, recent statistics from the Office for National Statistics
estimate strokes to be the 4th leading cause of death in the United Kingdom [19].
Strokes are neurological dysfunctions happening in the brain for a short period of
time. When they occur, the brain is deprived of its oxygen and nutrients, preventing
it from operating as designed [18]. Strokes can cause significant damage to the body
and affect the patient’s ability to carry out tasks such as talking or moving. We observe
two main types of strokes: haemorrhagic and ischaemic strokes (see figure 2.1). They
might inflict the same damage on the body but originate from different phenomena.
[Tree diagram: Stroke
  Ischaemic: Thrombotic, Embolic
  Haemorrhagic: Intracerebral, Subarachnoid]
Figure 2.1: Tree representing the different kinds of strokes.
Haemorrhagic strokes, which can be caused by ruptured aneurysms, are the less common
type of stroke, according to the National Stroke Association. Only around 15% of all
stroke incidents are diagnosed as haemorrhagic. Nevertheless, they are known to be fatal
for the patient in most cases. During such an attack, blood vessels leak or burst and
flood the cavities of the brain, preventing blood from reaching its destination. This
is common in patients with high blood pressure, which weakens the blood vessels and
makes them more fragile. Haemorrhagic strokes can be divided further into two sub-types.
The first sub-type, intracerebral haemorrhage, appears when the leak or burst occurs
inside the brain and affects the surrounding brain cells first. The second sub-type,
subarachnoid haemorrhage, appears when the damaged blood vessel is positioned in the
subarachnoid space.
On the other hand, ischaemic strokes are the most common type of stroke, accounting
for around 85% of all strokes recorded. They appear in specific cases where the blood
cannot circulate properly to the brain because the blood vessels are either too narrow
or obstructed by a clot. The origin of the clot defines the sub-types of ischaemic stroke.
Embolic strokes are caused by a clot, called an embolus, that forms outside the brain area
and is not problematic until it reaches a thinner capillary and prevents good blood
circulation. Thrombotic strokes are caused by a clot, called a thrombus, that forms
directly in the capillaries of the brain. Ischaemic strokes are more prevalent in
individuals suffering from high blood pressure, obesity and drug addiction. Depending on
the length of the stroke, the damage may have drastic consequences on the patient’s body [50].
According to the annual report of the National Stroke Organisation, there are cur-
rently 1.2 million patients who survived a recent stroke incident in the United King-
dom. 84% of these patients need intensive care and constant support with daily tasks
and around 33% of these patients suffer from difficulty with speech, motor and per-
ceptual skills. These numbers show that the heavy consequences of strokes remain a
major concern in modern health care. In the datasets used throughout this project,
patients were imaged 1-4 weeks after presenting to clinic with a mild to moderate
ischaemic stroke, although a few patients had previous small haemorrhagic episodes.
2.1.2 Magnetic Resonance Imaging
Magnetic Resonance Imaging (short: MRI) is an imaging method widely used in the
medical field. With the help of strong magnetic fields and radio waves, it allows the
inspection of pathology inside the body [25]. MRI is very useful as different scanning
parameters and contrast agents can reveal different elements and ease the
identification of these abnormalities [28].
Magnetic resonance imaging machines rely on the magnetic properties of the hydrogen
nuclei in the water molecules present in the body tissues. By measuring their response
after the emission of radio frequency (short: RF) pulses, scanning machines can recreate
an image representing the different tissues of the region captured. However, different
scanning parameters will yield different types of images. The two main parameters to
adjust for controlling the weighting and intensity of the final image are the repetition
time (short: TR) and the echo time (short: TE). TR is the time separating two consecutive
RF emissions, whereas TE is the time separating the emission of the RF signal and the
measurement of its echo [20].
Different parameter settings will generate different images. As the various tissues
present in the brain react differently to shorter and longer TR/TE, it is possible to
control the resulting image contrasts and facilitate their identification. Figure 2.2 shows
the three sequences that this thesis experiments with: T1-weighted, T2-weighted and
Fluid Attenuated Inversion Recovery (short: FLAIR).
Figure 2.2: T1-weighted, T2-weighted and FLAIR MRIs.
Source: https://casemed.case.edu.
T1-weighted sequences, or anatomy scans, use short TR/TE. Tissues with high water
content, such as the cerebrospinal fluid (CSF), appear dark, while white matter appears
light (see table 2.1). These sequences are very useful to identify the boundaries between
the tissues of the brain.
T2-weighted sequences use long TR/TE and are widely used for the evaluation of
ligaments and cartilages because of their contrasting properties. Furthermore, since the
water content of the tissues usually grows in the presence of a pathology, lesions are
easily captured on T2-weighted sequences.
FLAIR sequences use very long TR/TE and are very similar to T2-weighted sequences.
However, the longer capture times completely suppress the intensity of
some parts of the brain, like the CSF [23]. Nowadays, FLAIR is widely used for its ability
to identify a range of intricate lesions that in the past were more difficult to visualise
[31]. As it is the most widely-used sequence for the identification of WMH, the main
part of the project focuses on the processing of FLAIR sequences. However, some
experiments involving T1- and T2-weighted sequences will be presented in section 3.4.
Table 2.1 summarises the appearance of the main tissues in the different MRI se-
quences.
TISSUE          T1-WEIGHTED   T2-WEIGHTED   FLAIR
CSF             DARK          BRIGHT        DARK
WHITE MATTER    LIGHT         DARK GREY     DARK GREY
CORTEX          GREY          LIGHT GREY    LIGHT GREY
WMH             DARK          BRIGHT        BRIGHT
Table 2.1: Appearance of different tissues in MRI sequences.
Extended from: https://casemed.case.edu.
2.1.3 MRI Analysis
Nowadays, the most widespread method for MRI analysis remains manual. Other
than the fact that it is remarkably time-consuming, it is also prone to mistakes, depend-
ing on the experience of the analyst [13]. Furthermore, a fully supervised examination
made by experts requires an excessive amount of money to be carried out.
However, these past years have seen a rise in the number of publications involving
automatic (or semi-automatic) methods for brain MRI analysis [4]. Their aim is to
serve as guidance for the expert in charge and facilitate the administration of a proper
treatment. Most of the research focuses on the segmentation of brain MRI into WMH
and normal appearing white matter (short: NAWM). Nevertheless, little attention is
given to the characterisation of the results and the longitudinal study of the lesions.
We observe two main operations for the analysis of lesions in brain MRI: lesion
segmentation and lesion characterisation [20].
Segmentation is the task of separating the input image into different regions, each
classified with a different label, typically numeric. Each label is then assigned to
a class (NAWM, WMH, ...), which, in the case of automatic methods, is usually
determined from some probability-based estimation [1]. Unlike classification, segmentation
is carried out at the pixel level and requires specific evaluation metrics (see section
4.1). Manual segmentation is nowadays considered the standard approach
and the “gold standard” for algorithm comparison, as Patriarche et al. report in their
review [40]. It requires the experts to carefully highlight the regions of interest
slice by slice, using a tool designed for MRI segmentation. Depending
on the scan format and shape, this task can take an extraordinary amount of time and
financial resources.
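As a minimal illustration of how a probability-based estimation can be turned into class labels, the sketch below thresholds a toy 3D probability map voxel by voxel. The function name `segment`, the toy values and the threshold of 0.5 are assumptions made for this example only; they are not taken from any of the tools discussed in this thesis.

```python
def segment(prob_map, threshold=0.5):
    """Label each voxel 1 (WMH) if its probability reaches the threshold, else 0.

    prob_map is a nested list indexed as [slice][row][column].
    The default threshold is a hypothetical value chosen for illustration.
    """
    return [
        [[1 if p >= threshold else 0 for p in row] for row in scan_slice]
        for scan_slice in prob_map
    ]


# Toy "scan": 2 slices of 2 x 3 voxels holding WMH probabilities.
prob_map = [
    [[0.05, 0.10, 0.80], [0.20, 0.95, 0.60]],
    [[0.00, 0.40, 0.55], [0.10, 0.30, 0.90]],
]
labels = segment(prob_map, threshold=0.5)
```

Lowering the threshold labels more voxels as WMH, which typically trades specificity for sensitivity.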
Lesion characterisation, in turn, is the act of quantifying the segmented volumes
and modelling their evolution over time. Regarding MRI analysis, this could mean the
measurement of the WMH/NAWM (volume, ratio, intensity, ...), the analysis of their
growth/shrinkage (shape, scale, variance, ...) or more. Such analyses can reveal precious
information for the treatment of the patients and therefore their precision is of utmost
importance.
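The simplest of these measurements, lesion volume and its change between time points, can be sketched as follows. This is an illustrative example only: the helper names, the toy masks and the voxel size of 1 x 1 x 4 mm are assumptions, and it does not reproduce the characterisation pipeline of chapter 5.

```python
def lesion_volume_ml(mask, voxel_volume_mm3):
    """Volume of a binary lesion mask in millilitres (1 ml = 1000 mm^3)."""
    n_voxels = sum(v for scan_slice in mask for row in scan_slice for v in row)
    return n_voxels * voxel_volume_mm3 / 1000.0


def volume_changes(volumes):
    """Difference between each pair of consecutive time points."""
    return [later - earlier for earlier, later in zip(volumes, volumes[1:])]


# Toy binary masks for one patient at three time points (1 slice of 2 x 2 voxels).
masks = [
    [[[1, 1], [0, 0]]],  # time point 1: 2 lesion voxels
    [[[1, 1], [1, 0]]],  # time point 2: 3 lesion voxels
    [[[1, 0], [0, 0]]],  # time point 3: 1 lesion voxel
]
voxel_volume_mm3 = 1.0 * 1.0 * 4.0  # hypothetical voxel dimensions in mm
volumes = [lesion_volume_ml(m, voxel_volume_mm3) for m in masks]
changes = volume_changes(volumes)
```

A positive change indicates growth between two scans and a negative change indicates shrinkage.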
2.1.4 White Matter Hyper-intensities
As pointed out by Debette et al., there is a strong belief that the presence of WMH in
FLAIR and T2-weighted scans is associated with a variety of pathologies such as stroke
incidents and Alzheimer’s disease [12]. Therefore, an effective analysis of the MRI can
allow a better follow-up of the patient and more educated treatment decisions [36].
A popular visual rating metric still widely used today is the Fazekas scale [17].
In their original paper, Fazekas et al. described white matter damage (referred to
as leukoaraiosis) by first separating it into deep white matter hyper-intensity (short:
DWMH) and periventricular white matter hyper-intensity (short: PVWMH). Then, a
score from 0 (no damage) to 3 (“large confluent areas” for DWMH and “large confluent
disease” for PVWMH [8]) is given to the segmented region. It is a useful metric
for visual analysis of WMH but it suffers from limitations: it is very sensitive to
inter-rater variability [39] and is not precise enough to be used as an evaluation metric
for automatic tools.
We observe in the literature various quantification metrics for the assessment of
WMH in brain MRI. Most of them are intensity-based and can be divided into simple,
statistical and temporal quantification methods. However, these usually fail to deliver
accurate representations of the WMH as they are too sensitive to over-damaged brains
and fluctuations in the images over time [29].
On the other hand, the damage metric developed by Valdes et al., briefly reviewed
in section 2.2.3, resolves most of the limitations of the previously-mentioned quantifi-
cation metrics. As we believe it is more accurate when it comes to automatic WMH
quantification, it was used in this thesis to assess the WMH burden of our datasets.
Only a few years ago, it was believed that the mere presence of WMH in the brain
was a sign that it would progress and extend over time [44], leading to the development
of dementia, SVD and Alzheimer’s disease. However, studies with more advanced methods
using automatic segmentation observed that WMH on longitudinal data may also
regress over time [10]. For instance, Wardlaw et al. observed a clear regression of the
WMH in minor stroke patients, shortly after the incident [53]. As mentioned earlier, the
poor segmentation of WMH burden in new studies can lead to ambiguous hypotheses.
Therefore, this thesis will also aim at discovering if Wardlaw et al.’s observation can
be validated on our datasets.
Overall, the knowledge about the progression of WMH and stroke lesions is weak.
However, accurate automatic segmentation tools applied to longitudinal studies might
give us the tools necessary to sharpen our knowledge about these pathological features.
2.2 Ground Technical Knowledge
2.2.1 LOTS-IAM
Recently, the inspiring work of Rachmadi et al. produced an automatic unsupervised
detector of tissue irregularities in brain MRI [34], called Limited One-Time Sampling
Irregularity Map (short: LOTS-IAM). It runs a patch comparison method borrowed
from a weathered-texture analysis technique [5] to estimate abnormalities in an input
image. It was used and evaluated for the task of WMH segmentation in brain MRI. To
do so, it implements a full segmentation pipeline, as shown in figure 2.3.
The inputs of the LOTS-IAM are NIfTI-1 files (a format developed by the Neuroimaging
Informatics Technology Initiative). This format provides a common ground for MRI
analysis worldwide as it stores essential information such as the transformation from
pixel coordinates to spatial location and can be read by extensively used programs such
as ANALYZE 7.5 and MATLAB [30]. The first input to this pipeline is the list of FLAIR
file paths, as the FLAIR images are used for age map generation. Additionally, the
intracranial volume and cerebrospinal fluid masks are needed to pre-process the data.
These are subtracted from the original NIfTI-formatted FLAIR scan as they would
otherwise introduce irrelevant information to the main module of the program.
[Flow diagram. Stages shown: NIfTI files input, pre-processing, source patch
extraction, target patch extraction, distance function calculation, post-processing,
final age map.]
Figure 2.3: Overview of the LOTS-IAM architecture.
From the pre-processed data, the LOTS-IAM extracts brain tissue irregularities by
performing comparisons between source and target patches. Source patches are ex-
tracted by partitioning a slice into non-overlapping patches of prefixed sizes, whereas
target patches are randomly sampled from the entire scan slice under the condition that
they are located inside a non-masked region. The program requires the source patch
size to be the same as the target patch size. To get a more accurate representation of
the abnormalities, a substantial number of target patches is needed. Rachmadi et al.
recommend a target patch number of 64, which can take a tremendous amount of time
to process. To solve this issue, the original paper describes the use of GPU computing
to speed up the heavy computation induced by the high number of source and target
patches. The LOTS-IAM relies on the assumption that abnormalities are outliers and
therefore appear different from the average brain tissue. The age value is given by the
comparison between source and target patches and is calculated with the use of the
following distance function:
d = \alpha \cdot |\max(s - t)| + (1 - \alpha) \cdot |\mathrm{mean}(s - t)| \qquad (2.1)
Where α = 0.5, and s & t are the source and target patches compared, respectively.
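Equation 2.1 can be written as a short function. The sketch below is a plain-Python reading of the formula as stated above, applied to patches flattened into lists of intensities; the function name `patch_distance` and the toy values are assumptions for this example, and the code is not taken from the LOTS-IAM source.

```python
def patch_distance(source, target, alpha=0.5):
    """Distance of equation 2.1 between two equally sized, flattened patches."""
    diffs = [s - t for s, t in zip(source, target)]
    max_term = abs(max(diffs))                # |max(s - t)|
    mean_term = abs(sum(diffs) / len(diffs))  # |mean(s - t)|
    return alpha * max_term + (1 - alpha) * mean_term


# Toy 2 x 2 patches, flattened row by row.
bright_source = [0.9, 0.8, 0.7, 0.6]
normal_target = [0.5, 0.5, 0.5, 0.5]
d_different = patch_distance(bright_source, normal_target)
d_identical = patch_distance(normal_target, normal_target)
```

A source patch that is brighter than the sampled target patch yields a larger distance, whereas identical patches yield a distance of zero.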
The program allows multiple patch sizes to be used individually and in conjunction.
Typically, patch sizes are 1, 2, 4 and 8. Using a hierarchical set of patch sizes allows
for different levels of context to be taken into account. However, when multiple patch
sizes are used, multiple age maps are generated. To combine these into a visually
sensible output, the resulting age maps go through a post-processing pipeline.
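The two kinds of patches can be illustrated with the following sketch, which partitions a toy slice into non-overlapping source patches and randomly samples target patches from the non-masked region. Several choices are assumptions made for this example: a mask value of 1 marks valid brain tissue, a target patch must lie entirely inside the valid region, and sampling is done with replacement. The function names are hypothetical.

```python
import random


def source_patches(slice_2d, size):
    """Partition a 2D slice into non-overlapping size x size patches (flattened)."""
    rows, cols = len(slice_2d), len(slice_2d[0])
    patches = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patches.append(
                [slice_2d[r + i][c + j] for i in range(size) for j in range(size)]
            )
    return patches


def target_patches(slice_2d, mask, size, count, rng):
    """Randomly sample `count` patches lying fully inside the valid (mask == 1) region."""
    rows, cols = len(slice_2d), len(slice_2d[0])
    candidates = [
        (r, c)
        for r in range(rows - size + 1)
        for c in range(cols - size + 1)
        if all(mask[r + i][c + j] == 1 for i in range(size) for j in range(size))
    ]
    if not candidates:
        return []
    corners = rng.choices(candidates, k=count)  # sampling with replacement (assumption)
    return [
        [slice_2d[r + i][c + j] for i in range(size) for j in range(size)]
        for r, c in corners
    ]


# Toy 4 x 4 slice whose first row is masked out.
toy_slice = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_mask = [[0] * 4] + [[1] * 4 for _ in range(3)]
rng = random.Random(0)
sources = source_patches(toy_slice, size=2)
targets = target_patches(toy_slice, toy_mask, size=2, count=3, rng=rng)
```

Repeating the two calls for each patch size (for instance 1, 2, 4 and 8) would produce one set of patches, and therefore one age map, per size.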
The post-processing performed at the end of the age map calculation can be summed
up in three steps: blending age maps sampled from different patch sizes, penalty
application and global normalisation. First, the age maps generated by the distance
calculations are up-sampled and Gaussian smoothed. Then, a combined age map is produced
by summing the probability maps according to some pre-defined weighting (specific
weight values are suggested by the authors). Next, the age map is penalised by
multiplying its values by the corresponding pixel values in the original FLAIR MRI
scan. This step emphasises WMH against abnormally darker regions. Finally, the age
map is normalised between 0 and 1 to be interpreted as the probability for a pixel to
belong to WMH volume.
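The three post-processing steps can be sketched on flattened toy maps as follows. The up-sampling and Gaussian smoothing are omitted, the maps are assumed to already share the same resolution, min-max scaling is assumed for the normalisation, and the weights are hypothetical values rather than those suggested by the authors.

```python
def blend(age_maps, weights):
    """Weighted sum of several age maps of identical length."""
    n = len(age_maps[0])
    return [sum(w * m[i] for w, m in zip(weights, age_maps)) for i in range(n)]


def penalise(age_map, flair):
    """Multiply each age value by the corresponding FLAIR intensity."""
    return [a * f for a, f in zip(age_map, flair)]


def normalise(values):
    """Min-max normalisation to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Two toy age maps (one per patch size) over three pixels.
age_maps = [[0.2, 0.8, 0.4], [0.4, 0.6, 0.0]]
weights = [0.5, 0.5]     # hypothetical blending weights
flair = [1.0, 2.0, 0.5]  # toy FLAIR intensities for the same pixels
final_map = normalise(penalise(blend(age_maps, weights), flair))
```

In this toy case the second pixel is both irregular and bright on FLAIR, so it receives the highest final value.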
The original paper compares the results from the LOTS-IAM under different thresholds
with other popular WMH segmentation tools and reports that it outperformed
them on most metrics. The other methods included both supervised and
unsupervised methods. Furthermore, the LOTS-IAM performed better than the current
state-of-the-art unsupervised WMH segmentation framework, LST-LGA. The performance
metrics used were the Dice coefficient, the positive predictive value (PPV), the
specificity (SPC) and the true positive rate (TPR).
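For reference, these four metrics can be computed from a predicted and a reference binary mask as in the sketch below. It uses the standard definitions based on true/false positives and negatives; the function names and toy masks are assumptions, and returning 0.0 for an empty denominator is a convention chosen for this example.

```python
def confusion_counts(predicted, reference):
    """Count true/false positives and negatives over two flattened binary masks."""
    tp = fp = fn = tn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1
        elif p and not r:
            fp += 1
        elif not p and r:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ratio(numerator, denominator):
    """Safe division: returns 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(predicted, reference):
    tp, fp, fn, tn = confusion_counts(predicted, reference)
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "ppv": ratio(tp, tp + fp),  # positive predictive value (precision)
        "tpr": ratio(tp, tp + fn),  # true positive rate (sensitivity)
        "spc": ratio(tn, tn + fp),  # specificity
    }


predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 0, 0, 0, 1, 1, 0, 0]
metrics = segmentation_metrics(predicted, reference)
```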
However, experiments at the Centre of Clinical Brain Sciences of the University of
Edinburgh demonstrated that the LOTS-IAM could not be used as a WMH segmenta-
tion tool in clinical research or practice per se as it suffers from important weaknesses.
First of all, a notable number of false positives and false negatives are present in the fi-
nal age maps, which invalidates its use for automatic quantification of lesions. Second
of all, it tends to underestimate lesions when processing over-damaged brains. Finally,
it is only tuned for FLAIR sequences and was not optimised to work for T1- and T2-
weighted sequences. Additionally, its execution time makes it impractical to use for
very large datasets (e.g. UK Biobank [49]).
The LOTS-IAM runs from the terminal (or any virtual environment) and consists of
five main files:
1. iam_lots_gpu.py: contains the main() function that calls the main module.
2. iam_params.py: contains the 10 hyper-parameters that the user can tune.
3. IAM_GPU_lib.py: contains the main module (defined in the iam_lots_gpu_compute()
function), called by the main() function.
4. IAM_lib.py: contains the additional function definitions used in the main module
(including CUDA functions).
5. input.csv: contains the paths of the files to be processed by the main module.
From the terminal, a simple call to the main script (iam_lots_gpu.py) is needed
to run the main module and process all the files specified in input.csv.
It was decided, in agreement with researchers at the CCBS and with the authors of
the LOTS-IAM, that the LOTS-IAM would be used and improved in the scope of this
project. Therefore, the segmentation module is based on the LOTS-IAM’s architecture
and the characterisation of the lesions is based on its output.
2.2.2 Gaussian Mixture Models
Gaussian Mixture Models (short: GMMs) are a soft clustering method [14].
Unlike K-means, which assigns each data point to exactly one cluster, GMMs compute
the probability that a data point belongs to each cluster. Each cluster therefore
corresponds to a Gaussian probability distribution [6]. The parameters of these
distributions are estimated by expectation maximisation (short: EM). The underlying
density function of each component is a Gaussian, defined as
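\[ \mathcal{N}(X \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} \, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (X - \mu)^{T} \Sigma^{-1} (X - \mu) \right) \qquad (2.2) \]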
where D is the number of input dimensions, X is an input data point and Σ and
µ are the covariance matrix and the mean of the distribution, respectively. These
last two variables are the parameters of the distribution that the model attempts
to estimate in order to fit the input data [3].
The algorithm proceeds in four main steps:
1. Initialisation step: Randomly initialise the parameters.
2. Expectation step: Assign a cluster probability to each data point based on current
means and variances.
3. Maximisation step: Update means and variances based on the new probabilities
for each data point (maximum likelihood calculation).
4. Repeat steps 2 & 3 until convergence.
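The four steps above can be sketched with NumPy for one-dimensional data. This is only an illustration, not the implementation used in this thesis (which relies on scikit-learn). The function names, the fixed number of iterations (instead of a convergence test), the shared initial variance and the choice of keeping the restart with the highest log-likelihood are all assumptions made for the example.

```python
import numpy as np

def _weighted_densities(x, weights, means, variances):
    """Weighted Gaussian density of every point under every component."""
    norm = np.sqrt(2.0 * np.pi * variances)
    return weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances) / norm

def em_run(x, n_clusters, n_iter, rng):
    """One EM run; returns (log-likelihood, means, responsibilities)."""
    # Step 1 (initialisation): random data points as means, shared variance.
    means = rng.choice(x, size=n_clusters, replace=False)
    variances = np.full(n_clusters, x.var())
    weights = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iter):  # Step 4: fixed iteration count (assumption)
        # Step 2 (expectation): cluster probabilities for each data point.
        dens = _weighted_densities(x, weights, means, variances)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # Step 3 (maximisation): maximum-likelihood parameter update.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-9
    dens = _weighted_densities(x, weights, means, variances)
    log_likelihood = np.log(dens.sum(axis=1)).sum()
    return log_likelihood, means, dens / dens.sum(axis=1, keepdims=True)

def fit_gmm_1d(x, n_clusters=2, n_iter=50, n_restarts=10, seed=0):
    """Run EM several times and keep the run with the best log-likelihood."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    runs = [em_run(x, n_clusters, n_iter, rng) for _ in range(n_restarts)]
    return max(runs, key=lambda run: run[0])
```

The restarts reflect the sensitivity to random initialisation: a run whose initial means fall in the same cluster can converge very slowly, so several runs are compared.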
Generally, the algorithm is run multiple times as the outcome is not deterministic,
due to the random initialisation step [33]. The most frequent cluster assigned
to each data point is then taken as its final cluster assignment. This results in
the partitioning of the dataset into the number of clusters specified. Gaussian
mixture models are used for post-processing the outputs of the improved version of
the LOTS-IAM developed in this thesis, when applied to different image sequences.
2.2.3 White Matter Damage Metric
In their recent publication, Valdes et al. call for a harmonisation of the quantification
results of WMH segmentation [24]. To do so, they propose a new metric to assess
the white matter damage in the brain based on both its volume and intensity. The
advantages of this damage metric are that it can be used for any MRI sequence, is
fast to compute and holds significant information for the researcher. It is calculated as
follows:
\[ WM_{Damage} = \frac{I_{WMH} - I_{NAWM}}{I_{NAWM}} \cdot \frac{WMH_{volume}}{WMH_{volume} + NAWM_{volume}} \qquad (2.3) \]
where I_WMH and I_NAWM are the average intensities of WMH and of NAWM respectively
and where WMH_volume and NAWM_volume are the volume of WMH identified and
of the NAWM, respectively. The resulting number falls between 0 and 1. For minimal
damage, the metric approximates 0 whereas for predominant damage, it approximates
1. If no WMH is detected in the brain, WM_Damage is set to 0.
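As an illustration, equation 2.3 can be computed from a scan and two binary masks as sketched below. The function name `wm_damage`, its arguments and the `voxel_volume` parameter are assumptions made for this example and are not taken from the original publication.

```python
import numpy as np

def wm_damage(scan, wmh_mask, nawm_mask, voxel_volume=1.0):
    """White matter damage metric of equation 2.3 (hypothetical helper)."""
    wmh_mask = np.asarray(wmh_mask, dtype=bool)
    nawm_mask = np.asarray(nawm_mask, dtype=bool)
    # The metric is set to 0 when no WMH is detected.
    if not wmh_mask.any():
        return 0.0
    i_wmh = scan[wmh_mask].mean()      # average WMH intensity
    i_nawm = scan[nawm_mask].mean()    # average NAWM intensity
    wmh_volume = wmh_mask.sum() * voxel_volume
    nawm_volume = nawm_mask.sum() * voxel_volume
    intensity_term = (i_wmh - i_nawm) / i_nawm
    volume_term = wmh_volume / (wmh_volume + nawm_volume)
    return float(intensity_term * volume_term)
```

For example, a scan with 10 WMH voxels of intensity 150 and 90 NAWM voxels of intensity 100 gives (50 / 100) * (10 / 100) = 0.05.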
The damage metric introduced by Valdes et al. will be used to quantify the lesions
and their evolution over time. This has been decided with the objective of making it
one of the de facto measures when it comes to WMH analysis in brain MRI.
Chapter 3
Methodology
3.1 Ethics
This thesis received approval from the Informatics Ethics Panel (ref. num: 64736)
and was conducted in accordance with the GDPR and the ethical standards set by the
Informatics department of the University of Edinburgh.
3.2 Dataset
Two datasets were used throughout this project. Both belong to the Mild Stroke Study
(short: MSS) initiated at the University of Edinburgh. More specifically, subsets of the
MSS2 and MSS3 datasets were created and processed by the algorithms. The segmen-
tation algorithm uses the MSS2 subset only, since ground truth was only generated for
MSS2. On the other hand, the characterisation of the lesions was performed using both
MSS2 and MSS3 scans.
• The subset taken from the MSS2 contains 43 patients, scanned at three time steps
separated by roughly 3 months (visit 1 and 2) and a year (visit 1 and 3). Each
scan is represented as a 3D array of size 256x256x42.
• The subset taken from the MSS3 contains 17 patients, scanned at three time steps
separated by roughly 3 months (visit 1 and 2) and 6 months (visit 1 and 3). Each
scan is represented as a 3D array of size 256x256x176.
Each patient is represented by 3 different MRI scans (T1-, T2-weighted and FLAIR)
at 3 different time points, all originally registered in NIfTI format. However, for
easier manipulation and portability of the data, it was decided to extract the 3D arrays
from the NIfTI files after masking them, to avoid extra pre-processing unrelated to the
segmentation/characterisation goals of this project. Therefore, the data used for the
rest of the project are 3D MAT files of the region of interest (i.e. the whole white
matter region), resulting from the subtraction of the intracranial volume, cortex and
cerebrospinal fluid masks.
Ground truth was produced for the MSS2 set in order to allow the evaluation of dif-
ferent segmentation tools. However, it is not perfectly accurate. Due to the extensive
time needed to label a dataset, the medical authority in charge decided not to generate
ground truth for scans that are separated by only 1-2 months. Therefore, the ground
truth for the second visit of MSS2 is the same as the first visit ground truth. Further-
more, the ground truth was originally generated for clinical studies and includes the
tissue loss due to stroke in the third (i.e. last) time point, which should not be picked
up by the algorithm since it is not the focus of this thesis. This will lower the measured
accuracy of the algorithm at these time points (i.e. second and third). Nevertheless,
it is still possible to compare the performance of different algorithms on this dataset,
as long as the same ground truth is used for all the evaluations. No ground truth was
generated for MSS3.
3.3 Modification of the LOTS-IAM
As mentioned earlier, the LOTS-IAM suffers from important shortcomings. Therefore,
the base code was taken and modified to improve its final output and make it suitable
for clinical studies. The main experiments carried out with the LOTS-IAM can be
summed up as follows:
1. Development of a 3-dimensional version of the LOTS-IAM, conveniently called
LOTS-IAM-3D, to account for contextual information and improve accuracy.
2. Modification and refactoring of the main module of the program to decrease its
execution time by a factor of 4.
3. Introduction of a target patch selection function based on priors to improve ac-
curacy and allow inter-subject comparisons.
4. Experiments with T1 and T2 sequences, separately and in conjunction (multi-
spectral clustering).
3.3.1 LOTS-IAM-3D
The first main contribution of this project is the development of a 3-dimensional ver-
sion of the original LOTS-IAM. The original algorithm processes the MRI slice by
slice whereas the LOTS-IAM-3D analyses the whole MRI volume, which allows more
contextual information to be taken into account. This means that the algorithm per-
forms a cube-based comparison of the voxel intensities in FLAIR scans, using cubic
source and target patches (see figure 3.1(b)).
(a) LOTS-IAM source patch (4x4). (b) LOTS-IAM-3D source patch (4x4x4).
Figure 3.1: Examples of source patches in both versions of the algorithm (red).
The addition of a dimension for source patch extraction, as seen in figure 3.1, is
also applied to the target patch extraction. However, the operations applied
to source and target patches for age value generation are the same as in the original
LOTS-IAM.
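The cube-based extraction can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: source patches are taken on a non-overlapping grid and target patches are drawn uniformly at random. The function names are hypothetical and do not correspond to the LOTS-IAM code base.

```python
import numpy as np

def extract_source_patches(volume, patch_size):
    """Split a 3D volume into non-overlapping cubes of side patch_size."""
    p = patch_size
    d0, d1, d2 = (s // p for s in volume.shape)
    # Drop the voxels that do not fill a complete cube (assumption).
    trimmed = volume[:d0 * p, :d1 * p, :d2 * p]
    blocks = trimmed.reshape(d0, p, d1, p, d2, p).transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(-1, p, p, p)

def sample_target_patches(volume, patch_size, n_patches, rng):
    """Draw cubic target patches at random positions inside the volume."""
    p = patch_size
    corners = [rng.integers(0, s - p + 1, size=n_patches) for s in volume.shape]
    return np.stack([volume[x:x + p, y:y + p, z:z + p]
                     for x, y, z in zip(*corners)])
```

The 2D version of the algorithm would apply the same idea slice by slice with square patches, so each patch sees no information from neighbouring slices.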
3.3.2 Data Structure
The execution times of the algorithm and of its distinct parts were all measured on
an MSI GE60 2PE Apache Pro (GTX 860M) with an Intel Core i7, using
the MSS3 dataset. This was decided in order to have a common ground for all
comparisons.
Processing an additional dimension (i.e. depth) drastically slowed the algorithm
down. With the same architecture as the original LOTS-IAM, the 3D version of
the algorithm takes around 40 minutes, instead of 10 minutes, to process one MRI
scan of dimension 256x256x176 with patch sizes [1,2,4,8].
This drastic increase in execution time is due to a non-optimal management of re-
sources and data structure. As an excessive execution time hurts the practicality of
the method, we performed an execution time analysis of the code and restructured the
architecture from the original paper. Once done, we decreased the LOTS-IAM-3D pro-
cessing time from 40 minutes to under 40 seconds (see section 4.4). The gain in speed
is due to the creation of specific functions to avoid repetition of heavy calculation and
the use of more efficient data structures and operations.
3.3.3 Target Patch Selection From Prior
One of the main shortcomings of the original LOTS-IAM is its ineffectiveness in recog-
nising the irregularities when applied to highly-damaged brains. Indeed, the algorithm
is based on the average brain intensity. Therefore, it tends to underestimate damaged
regions when the average intensity of the whole brain is high (due to a high number of
lesions). This results in an irregularity map with a large number of false positives and
false negatives.
To solve this problem, we introduce a patch selection function based on a prior. The
patch selector is enabled via the tuning of an input parameter (e.g. None, [0-1]). While
the original LOTS-IAM samples target patches randomly, the LOTS-IAM-3D’s patch
selector filters out, if enabled, target patches that do not comply with some require-
ments, as explained below.
When the patch selection function is enabled, an additional step is added to the
target extraction process. First, a mask from the original 3D brain array is created by
applying the following pipeline:
[Flowchart: 3D brain input → Calculate 3D mean and std → Estimate intense WMH (iWMH) → Estimate subtle WMH (WMH) → Apply 3D Gaussian → Apply 3D erosion → Threshold result → Masked 3D brain]
Figure 3.2: Mask generation pipeline for the target patch selection function.
Where each brain voxel v is defined as (i)WMHest if:
\[ \text{voxel} \geq \text{mean}_{intensity} + CI \cdot \text{std}_{intensity} \qquad (3.1) \]
And where CI = 1.282 (80% confidence) for WMHest and CI = 1.960 (95% confidence)
for iWMHest. Then, the target patch extraction is executed. This time, however, each
randomly sampled target patch is tested by checking the ratio of masked voxels
in its corresponding patch from the masked brain volume generated earlier. If the
ratio of WMHest exceeds the predefined threshold, the patch is rejected.
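A minimal NumPy sketch of the thresholding of equation 3.1 and of the patch rejection test is given below. It deliberately omits the 3D Gaussian smoothing and erosion steps of the mask pipeline. The function names, the convention that zero marks background voxels and the default ratio of 0.05 are assumptions made for the example.

```python
import numpy as np

CI_WMH = 1.282    # multiplier quoted for WMHest
CI_IWMH = 1.960   # multiplier quoted for iWMHest

def estimate_wmh_mask(brain, ci=CI_WMH):
    """Flag voxels whose intensity satisfies equation 3.1."""
    tissue = brain[brain > 0]  # assumption: zero marks background voxels
    threshold = tissue.mean() + ci * tissue.std()
    return brain >= threshold

def accept_target_patch(mask, corner, patch_size, max_ratio=0.05):
    """Return False when the ratio of masked voxels in the patch is too high."""
    x, y, z = corner
    p = patch_size
    ratio = mask[x:x + p, y:y + p, z:z + p].mean()
    return bool(ratio <= max_ratio)
```

In use, a randomly sampled target patch would only be kept when `accept_target_patch` returns True, so that target patches are drawn mostly from tissue that is estimated to be healthy.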
Figure 3.3 illustrates the patch selection function. Note that the different steps
are shown on a 2D slice for visualisation purposes only.
Figure 3.3: Overview of the patch selection architecture.
3.4 Multi-Spectral Clustering
Most research related to WMH in brain MRI focuses on one modality only. Generally,
FLAIR or T2-weighted scans are used as they allow the distinction between normal and
abnormal brain tissues. However, T1-weighted scans can be useful for the distinction
between the stroke area and other age-related WMH.
Other than unsuccessful and experimental work on multi-spectral clustering using
T1 and T2 [27], very little work was done to incorporate the information coming from
multiple modalities in a single segmentation task. In this thesis, we believed that the
area of multi-spectral clustering incorporating the irregularity maps from the LOTS-
IAM-3D was worth exploring. Therefore, we ran the LOTS-IAM-3D on the three
sequences of scans available for all the patients in the MSS2 dataset: T1-, T2-weighted
and FLAIR.
Another important note is the experimental optimisation of the LOTS-IAM-3D for
T1 modality. The original penalisation step aims at suppressing the low intensities of
the original scan to remove any irrelevant darker abnormalities (generally present in T2
and FLAIR). To make it relevant to T1-weighted scans, the penalisation process (only
when applied to T1 scans) was modified to reverse its effect and improve the results
overall. The probability maps used were produced by the LOTS-IAM-3D with patch
selector (threshold = 0.05), as it produced the best results on its own (see section 4.2.2).
We implemented two types of clustering methods, namely Gaussian Mixture Models
and Self-Organising Maps (short: SOMs), to process the LOTS-IAM-3D output. However,
SOMs are a type of artificial neural network that requires extended hyper-parameter
tuning. The limited amount of time did not allow for significant results to be achieved.
Therefore, it was decided not to include SOM results here. Nonetheless, the code of
the SOM experiments is available on request and could potentially be used as a baseline
for future research.
3.4.1 Gaussian Mixture Model
To perform Gaussian clustering, the data needs to be pre-processed. Therefore, the
LOTS-IAM-3D with patch selector (threshold = 0.05) was run on the three modalities
available: T1-, T2-weighted and FLAIR. Then, the three irregularity maps resulting
from the run were flattened and stored in a two-dimensional array of shape (N, 3), where
N is the product of the original x, y and z dimensions of the brain MRI. Then,
a Gaussian mixture model with 4 clusters is fitted to this array. The
rationale behind this is to separate the main visible parts of the MRI scan ROI given
as input: background (black), subcortical grey matter, NAWM and WMH. To label
the clusters, a function calculates the average intensity of each cluster’s voxels on the
original scan. The cluster with the highest average intensity is then assumed to
correspond to WMH. Section 4.3 presents the results of this very experimental
addition to the LOTS-IAM literature.
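The clustering procedure described above could look roughly as follows with scikit-learn. This is a sketch, not the code used for the experiments: the function name, the `n_init` and `seed` values and the way empty clusters are handled are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def cluster_irregularity_maps(maps, reference_scan, n_clusters=4, seed=0):
    """Cluster voxels on several irregularity maps; return the brightest cluster."""
    # One row per voxel and one column per modality: shape (N, n_modalities).
    features = np.stack([m.ravel() for m in maps], axis=1)
    gmm = GaussianMixture(n_components=n_clusters, n_init=5, random_state=seed)
    labels = gmm.fit_predict(features)
    # Label the clusters by their average intensity on the original scan.
    intensities = reference_scan.ravel()
    cluster_means = [
        intensities[labels == k].mean() if np.any(labels == k) else -np.inf
        for k in range(n_clusters)
    ]
    wmh_cluster = int(np.argmax(cluster_means))
    # The brightest cluster is assumed to be WMH.
    return (labels == wmh_cluster).reshape(reference_scan.shape)
```

Passing a list with one, two or three maps makes it possible to compare single-modality and multi-modality clustering with the same function.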
3.5 Tools and Libraries
As the LOTS-IAM was originally written in Python 3.5, it was decided to extend and
improve it using the same programming language and the same version. The NumPy
library [37] was extensively used for its implementation. The Gaussian mixture models
were implemented using the scikit-learn library [41] in Python. The SOM models (not
presented here) were implemented using the MiniSom library [51] in Python. Finally,
the evaluation and characterisation of the WMH load were implemented using both the
MATLAB environment and Jupyter notebooks (available on request) in Python.
Chapter 4
Segmentation Evaluation
4.1 Metrics Used
The evaluation was carefully carried out to validate not only the segmentation accuracy
of the different algorithms produced during this project, but also the fidelity in repre-
senting the evolution of the white matter damage in terms of lesion progression, stabil-
ity and/or shrinkage. The evaluation of the segmentation algorithms was treated as a
binary classification problem. Therefore, both the output of the LOTS-IAM-3D and the
ground truth are binary masks of equal size. As the raw output of the LOTS-IAM-3D is
a probability map, different thresholds were tested to find the optimal value at which a
voxel can be considered white matter damage with enough confidence. Along with the
damage metric described in section 2.2.3, the metrics from the following subsections
have been used for the evaluation of the segmentation module.
4.1.1 Measures of Relevance
Measures of relevance give some statistics regarding the true/false positives/negatives.
They all yield values between 0 and 1.
Sensitivity, also called recall, measures the ratio of positive instances that were
correctly labelled by the algorithm (equation 4.1a). Specificity, on the other hand,
measures the ratio of negative instances that were correctly classified by the algorithm
(equation 4.1b). In the case of brain MRI segmentation, specificity tends to be very
high, as WMH (positive instances) usually represents only about 5% of the brain area.
Precision is defined as the ratio of true positives to all predicted positives (equation
4.2a). It is usually used for the calculation of the F1 score, along with sensitivity.
F1 score is a harmonic mean between precision and sensitivity (recall), as seen in
equation 4.2b. Its value is a good indicator of the relevance of a model as it balances
out precision and recall measures.
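\[ \text{Sensitivity} = \frac{TP}{TP + FN} \quad (4.1a) \qquad \text{Specificity} = \frac{TN}{TN + FP} \quad (4.1b) \]
\[ \text{Precision} = \frac{TP}{TP + FP} \quad (4.2a) \qquad F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \quad (4.2b) \]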
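For two binary masks of equal size, the measures of equations 4.1 and 4.2 can be computed as sketched below. The function name and the convention of returning 0 when a denominator is zero are assumptions made for this example.

```python
import numpy as np

def relevance_measures(prediction, ground_truth):
    """Sensitivity, specificity, precision and F1 score of a binary mask."""
    pred = np.asarray(prediction, dtype=bool)
    truth = np.asarray(ground_truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    tn = int(np.sum(~pred & ~truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))

    def ratio(num, den):
        # Assumption: an undefined ratio is reported as 0.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}
```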
4.1.2 Jaccard Index & Dice coefficient
The Jaccard Index is used to evaluate the spatial coincidence and dissimilarities be-
tween two sets. It is defined by the intersection of the two sets divided by their union.
The resulting index ranges from 0 to 1. The dice coefficient is another similarity metric
commonly used in evaluating image processing algorithms. It can be seen as the over-
lapping percentage between the ground truth and the prediction. It can be calculated
with the Jaccard index and ranges from 0 to 1.
\[ \text{Jaccard} = \frac{|A \cap B|}{|A \cup B|} \quad (4.3a) \qquad \text{Dice} = \frac{2 \cdot \text{Jaccard}}{\text{Jaccard} + 1} \quad (4.3b) \]
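Equations 4.3a and 4.3b translate directly to binary masks, as in the short sketch below. The function names and the value returned for two empty masks are assumptions made for the example.

```python
import numpy as np

def jaccard_index(prediction, ground_truth):
    """Intersection over union of two binary masks (equation 4.3a)."""
    pred = np.asarray(prediction, dtype=bool)
    truth = np.asarray(ground_truth, dtype=bool)
    union = np.sum(pred | truth)
    # Assumption: two empty masks are treated as perfectly overlapping.
    return float(np.sum(pred & truth) / union) if union else 1.0

def dice_coefficient(prediction, ground_truth):
    """Dice coefficient derived from the Jaccard index (equation 4.3b)."""
    jaccard = jaccard_index(prediction, ground_truth)
    return 2.0 * jaccard / (jaccard + 1.0)
```

With an intersection of 2 voxels and a union of 4 voxels, the Jaccard index is 0.5 and the dice coefficient is 2 * 0.5 / 1.5, i.e. about 0.667.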
4.1.3 Bland-Altman Measure
In 1983, Bland and Altman proposed a method for assessing the agreement between
measures from different clinical methods [7]. In other words, it evaluates a variable,
measured by two different procedures. This type of analysis is mainly applied to the
medical field of research, where comparison between different quantitative assess-
ments is frequently carried out. For any general comparison task, simple measures
such as correlation and linear regression are implemented. Commonly-used correla-
tion metrics include the Pearson correlation coefficient, r, which is calculated from
the division of the covariance by the product of the standard deviations. However, to
assess the degree of agreement between two distinct methods, not only the correla-
tion but also the difference between them should be measured [21]. This is where the
Bland-Altman analysis proves its usefulness.
The Bland-Altman analysis makes use of statistical notions such as limits of agreement
to investigate the correspondence between two measures. These limits of agreement
are based on the mean and standard deviation of the differences between the two
measures. They define a margin within which a given percentage of the differences
is expected to fall. It is common to fix them to 90% or 95% (the mean difference
plus or minus 1.645 or 1.960 times the standard deviation, respectively). This yields
degrees of confidence as to where the bulk of the data lies and is easily readable
on a single plot, as seen in section 4.2.2.
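The quantities plotted in a Bland-Altman analysis can be computed as sketched below. The function name and the use of the sample standard deviation (`ddof=1`) are assumptions made for the example.

```python
import numpy as np

def bland_altman(measure_a, measure_b, z=1.960):
    """Return averages, differences, mean difference and limits of agreement."""
    a = np.asarray(measure_a, dtype=float)
    b = np.asarray(measure_b, dtype=float)
    averages = (a + b) / 2.0           # x-axis of the plot
    differences = a - b                # y-axis of the plot
    mean_diff = differences.mean()
    sd = differences.std(ddof=1)       # assumption: sample standard deviation
    lower, upper = mean_diff - z * sd, mean_diff + z * sd
    return averages, differences, float(mean_diff), float(lower), float(upper)
```

Setting `z=1.645` gives the 90% limits instead of the 95% limits.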
4.2 WMH Segmentation
4.2.1 Total WMH Load
A full evaluation was performed to assess the LOTS-IAM-3D’s ability to segment the
total WMH load in the original scans. As the LOTS-IAM-3D’s output heavily relies
on a probability threshold parameter, an initial evaluation of the optimal value was
performed. We based this optimal threshold on the model’s average dice coefficient
when run on MSS2. Figure 4.1 displays the results of this experiment. The best dice
coefficient was 0.619 at a threshold of 0.15. Therefore, the optimal threshold for the
remaining experiments is set to 0.15.
[Plot “Algorithm comparison”: average dice (y-axis) against threshold (x-axis, 0.0 to 1.0) for LST-LGA, LOTS-IAM and LOTS-IAM-3D.]
Figure 4.1: Average dice curves of LST-LGA, original LOTS-IAM and LOTS-IAM-3D.
For a better comparison, we decided to include an evaluation of the LST-LGA in
the graph. The LST-LGA [43] is a WMH segmentation algorithm currently considered
the state-of-the-art unsupervised method and commonly used as a standard comparison
for segmentation algorithms. It is implemented in MATLAB and makes use of the
Statistical Parametric Mapping package (spm12) to segment the WMH in T2-weighted
scans. Figure 4.1 shows that the LOTS-IAM-3D outperforms both its predecessor
and the LST-LGA on the MSS2 samples.
4.2.2 Target Patch Selector Performance
To evaluate the efficiency of the patch selector, we divided our dataset into three
groups, according to their WMH load (given by the ground truth). This experiment
aims at visualising where the LOTS-IAM-3D outperforms its predecessor and whether
it also struggles to identify WMH in over-damaged brains. Figure 4.2 shows the
performance of both LOTS-IAMs on different load levels. The division was designed
to have around a third of the data in each group. Therefore, WMH volumes lower
than 4cc were considered “Low load”, WMH volumes between 4cc and 10cc were
considered “Medium load”, and higher volumes were considered “High load”.
[Bar plot “Patch Selector Performance”: average Dice (y-axis) per load level (Low, Medium, High) for LOTS-IAM and LOTS-IAM-3D T=0.05.]
Figure 4.2: Patch selector evaluation according to different load levels.
From figure 4.2, we can observe that the use of the patch selector results in a clear
improvement compared to the basic LOTS-IAM (i.e. 2D approach). Therefore, it was
included in the final version of the algorithm. Figure 4.3 illustrates the “underestima-
tion effect” of the LOTS-IAM-3D without patch selector, clearly inducing mistakes in
the segmentation.
(a) IAM of LOTS-IAM-3D. (b) IAM of LOTS-IAM-3D with PS (T=0.05).
(c) Original FLAIR scan.
Figure 4.3: Target Patch Selector (PS) Performance.
Figure 4.4 shows the agreement plots between the LOTS-IAM-3D and the ground
truth. Figure 4.4(a) compares the volume change between any two time points. Each
point in the graph represents the slope that captures the change according to
both methods: for each patient, the slope that characterises the volume change
between visits 1 & 2, 2 & 3 and 1 & 3 is plotted as calculated by the LOTS-IAM-3D
(x-axis) against its value according to the ground truth (y-axis). The Pearson
correlation between the prediction and the ground truth is 0.735 with a p-value of
8.10 × 10^-16 (high confidence), and most of the data in the Bland-Altman plot
(figure 4.4(b)) lies within the limits of agreement, which indicates clear agreement.
[Plot “Predicted vs ground truth slopes”: predicted slopes (3D-0.05, x-axis) against ground truth slopes (y-axis), with perfect agreement and linear regression lines; r: 0.735, p: 8.10E-16.]
(a) Correlation between slopes.
[Plot “Bland Altman (prediction vs ground truth)”: difference between measures (y-axis) against average of measures (x-axis); mean: -0.1257, +1.96 SD: +12.03, -1.96 SD: -12.282.]
(b) Prediction and ground truth agreement.
Figure 4.4: Measures of agreement.
4.2.3 WMH Subtle Changes
The LOTS-IAM-3D is capable of recognising the unhealthy parts of the brain. This
means that it is able to segment the WMH in an MRI scan. However, clinical studies
are usually highly interested in the dynamic changes of the lesions. Therefore, it is
interesting to evaluate our module for its ability to capture the dynamic changes present
in a patient’s brain from one visit to another.
To do so, the lesion evaluation and quantification were performed in three steps. First,
each visit’s prediction mask was subtracted from that of the following visit
(Visit 2 - Visit 1, Visit 3 - Visit 2), leaving only the dynamic changes
between consecutive visits. Then, the first visit was subtracted from the last visit
(Visit 3 - Visit 1), leaving the dynamic changes overall. Finally, a separate
evaluation was performed on the subtle changes in-between visits, in a similar way to
the total WMH load segmentation evaluation.
This aims at revealing whether the new LOTS-IAM-3D has the ability to also de-
tect small dynamic changes, in addition to the main WMH load. The procedure for
identifying lesion changes over two consecutive visits is demonstrated on figure 4.5.
Figure 4.5(a) is subtracted from figure 4.5(b), which highlights the changes, identified
by their category (“Extended”, “Healed” and “Stable”) in figure 4.5(c).
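The subtraction of two consecutive binary prediction masks can be sketched as follows. The function name and the dictionary keys are chosen for this example. A voxel is taken to be “Extended” when it is WMH only at the later visit, “Healed” when it is WMH only at the earlier visit and “Stable” when it is WMH at both visits.

```python
import numpy as np

def lesion_changes(previous_mask, current_mask):
    """Categorise WMH voxels between two visits (current minus previous)."""
    prev = np.asarray(previous_mask, dtype=bool)
    curr = np.asarray(current_mask, dtype=bool)
    # Signed difference: +1 for new WMH voxels, -1 for vanished ones.
    diff = curr.astype(np.int8) - prev.astype(np.int8)
    return {
        "extended": diff == 1,
        "healed": diff == -1,
        "stable": prev & curr,
    }
```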
(a) First visit FLAIR WMH. (b) Second visit FLAIR WMH.
(c) Evolution of lesions.
Figure 4.5: Visualisation of lesion evolution between two visits.
Table 4.1 presents the results of the evaluation, according to the average prediction
volume, average ground truth volume, average dice coefficient and average F1 score.
Intermediate changes (“Inter.”) are also included, although they were nonexistent in
the ground truth. The best measures are recorded in bold. The LOTS-IAM-3D
outperforms the original LOTS-IAM on most measures. The exceptions are
specificity, where the original is slightly better, and the F1 score of healed lesions,
where both versions are tied. The difference in specificity can be explained by the
slightly lower number of false positives given by the LOTS-IAM on these very small
regions. Therefore, we conclude that the LOTS-IAM-3D is preferable to its predecessor
for lesion evolution characterisation, which will be performed in chapter 5.
                   LOTS-IAM                          LOTS-IAM-3D
AVERAGE            HEALS   EXT.    STAB.   INTER.    HEALS   EXT.    STAB.   INTER.
PRED. VOL. (CC)    3.891   3.753   4.957   1.989     5.025   4.380   7.942   2.476
GT. VOL. (CC)      4.266   3.892   7.828   0         4.266   3.892   7.828   0
DICE               0.300   0.254   0.562   0         0.349   0.321   0.608   0
PPV                0.235   0.278   0.662   0         0.294   0.376   0.669   0
SENSITIVITY        0.255   0.285   0.567   0         0.286   0.354   0.708   0
SPECIFICITY        0.987   0.981   0.993   0.988     0.979   0.978   0.979   0.984
F1-SCORE           0.223   0.291   0.669   0         0.223   0.340   0.682   0
Table 4.1: Evaluation of the LOTS-IAMs performances for subtle changes in WMH.
4.3 Multi-Spectral Clustering Performance
Unfortunately, the results for the multi-spectral clustering using Gaussian mixture
models were not as positive as expected. As explained in section 3.4, all the possible
combinations of FLAIR, T1- and T2-weighted were used to perform the clustering
and segmentation. Figure 4.6 shows the results of each combination, along with the
results of the LOTS-IAM-3D (threshold = 0.05) from section 4.2.2. The combination
of different sequences did not improve the overall performance of the segmentation.
[Plot “Clustering Performance”: average Dice and average F1 score per modality combination ([FLAIR], [T1-FLAIR], [T1-T2-FLAIR], [T2-FLAIR]).]
Figure 4.6: Gaussian mixture model performances.
4.4 Execution Speed
As mentioned earlier, the architecture of the original LOTS-IAM was completely changed
to optimise its overall execution speed. Table 4.2 shows the execution time of the dif-
ferent LOTS-IAM versions, for different input sizes. Note that “Improved” refers to
the code refactoring of the model’s architecture from section 3.3.2 and “PS” refers to
the target patch selector from section 3.3.3.
                   LOTS-IAM                             LOTS-IAM-3D
INPUT SIZE         ORIGINAL           IMPROVED          PS                 NO PS
256X256X176        110.19 ± 12.26 s   65.91 ± 0.31 s    129.99 ± 12.96 s   36.43 ± 4.86 s
256X256X42         33.89 ± 4.71 s     15.35 ± 0.33 s    32.35 ± 6.90 s     15.00 ± 2.81 s
MEAN PER SLICE     0.72 ± 0.09 s      0.36 ± 0.00 s     0.92 ± 0.11 s      0.36 ± 0.05 s
Table 4.2: Total execution times for different input sizes. PS = Patch Selector.
Table 4.3 shows the time improvement for each function when tested with the
MSS3 subset (input size= 256x256x176). However, it is important to note that the
original LOTS-IAM makes N calls to each function listed below, where N is the num-
ber of layers per scan. On the other hand, the LOTS-IAM-3D calls them only once,
making it much faster overall.
                   LOTS-IAM                             LOTS-IAM-3D
FUNCTION           ORIGINAL           IMPROVED          PS                 NO PS
SOURCE EXTR.       0.118 ± 0.142 s    0.002 ± 0.180 s   0.139 ± 0.030 s    0.139 ± 0.030 s
TARGET EXTR.       0.099 ± 0.040 s    0.019 ± 0.027 s   4.641 ± 1.825 s    1.520 ± 0.931 s
GPU COMP.          0.953 ± 0.981 s    0.919 ± 1.090 s   0.972 ± 1.435 s    0.972 ± 1.435 s
Table 4.3: Execution times of the LOTS-IAM main functions. PS = Patch Selector.
Chapter 5
Lesion Characterisation
5.1 Stability Index
The Stability Index [45] was developed with the idea of capturing the volume change
in brain MRI over time. It responds to the need for a straightforward and relevant way
of measuring variability in brain MRI volumes for a single individual over time. It is
calculated as follows:
\[ \sum_{n=1}^{N} \left[ \frac{X_n - X_{n+1}}{X_n} \cdot 100 \right] \qquad (5.1) \]
Where N is the total number of visits recorded and X_n is the WMH volume at time
n. Overall, the stability index captures how stable a volume is over time. A high value
corresponds to a high estimate of variability, whereas a value close to 0 corresponds to
a more stable mass.
However, for this project, a slight variant of the stability index is used, as shown
in equation 5.2:
\[ \sum_{n=1}^{N} \left[ \frac{X_{n+1} - X_n}{X_n} \cdot 100 \right] \qquad (5.2) \]
Where a positive value represents an overall increase in the WMH volume (the lesions
extend) and a negative value represents a decrease in the WMH volume (the lesions heal).
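Both versions of the index can be sketched in a few lines. The function name and the `signed` flag are assumptions made for the example. The sketch also assumes that the sum runs over the consecutive pairs of recorded visits, so that three visits contribute two terms.

```python
import numpy as np

def stability_index(volumes, signed=True):
    """Sum of percentage volume changes between consecutive visits.

    signed=True follows equation 5.2, signed=False follows equation 5.1.
    """
    x = np.asarray(volumes, dtype=float)
    # (X_{n+1} - X_n) / X_n * 100 for each pair of consecutive visits.
    changes = np.diff(x) / x[:-1] * 100.0
    total = float(changes.sum())
    return total if signed else -total
```

For WMH volumes of 10, 12 and 9 cc, the consecutive changes are +20% and -25%, so the variant of equation 5.2 gives -5 and the original index of equation 5.1 gives +5.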
5.2 Analysis Pipeline
The analysis pipeline aims at characterising the brain lesions per individual, per pop-
ulation, per consecutive visit and over time. To do so, the pipeline is divided into five
analyses. For a more detailed account of each analysis, please refer to the correspond-
ing section. Note that by population, we refer to a group of individuals (e.g. MSS2
dataset).
Firstly, an initial investigation is performed on the data to get some general infor-
mation about the distribution and scale of each population, treating each visit indepen-
dently. We use the volume, damage metric and stability index for each population.
Secondly, a lesion state analysis, similar to the subtle change segmentation of
section 4.2.3, gives some more in-depth information about the possible progression of
the data. This is crucial information regarding the subtle changes occurring in-between
visits. However, it can only capture progression accurately if the whole population
follows the same trends, which is not always the case.
Thirdly, we divide the dataset into different trends, based on its mean and standard
deviation. We perform this to account for inter-individual differences that can occur in
clinical studies. The separation of the data allows for a more relevant analysis and cat-
egorisation of the individuals belonging to a population into smaller subgroups. This
solves the issue faced by the simple temporal load subtraction, performed during the
lesion state analysis. The subgroups are “Decreasing” and “Increasing”. Further sub-
categories (“Highly increasing”, “Slightly increasing”, ...) can be designed, according
to the distribution of the data and the presence of “outliers”.
Next, we perform a time-correlation analysis for each category of our dataset.
This attempts to underline any relationship common to individuals of the same trend,
identified in the previous step. For all consecutive visits, the volume change is anal-
ysed and modelled through a linear regression. The quality of the fit is given by its
Pearson correlation coefficient and p-value.
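A minimal sketch of such a fit with NumPy is given below. It assumes that the WMH volumes of one visit are regressed on those of the preceding visit, which is one possible reading of the analysis; the function name is hypothetical. The p-value is not computed here, although `scipy.stats.linregress` returns it alongside the slope, intercept and correlation coefficient.

```python
import numpy as np

def fit_volume_trend(volumes_first, volumes_second):
    """Least-squares line and Pearson r between two consecutive visits."""
    x = np.asarray(volumes_first, dtype=float)
    y = np.asarray(volumes_second, dtype=float)
    # Degree-1 polynomial fit returns (slope, intercept).
    slope, intercept = np.polyfit(x, y, deg=1)
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)
```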
Finally, a growth curve model is fitted to each trend of the dataset. This allows
the shape of the lesion evolution to be captured, in addition to its direction. For
each time step, an intercept, a slope and standard errors describe the quality of the
fit and the variance explained by the model.
For this thesis, we deployed our analysis pipeline on the subset of MSS2 & MSS3
described in 3.2, using the output of the LOTS-IAM-3D (PS = 0.05). The next sections
show the results for both sets. The analysis and the graphs were fully generated with
Jupyter notebooks and are available on request.
Chapter 5. Lesion Characterisation 32
5.3 Initial Analysis
First, the initial measurements take into account the total WMH load per time step.
This generates metrics to quantify the WMH volume separately at each visit (visit 1,
visit 2, visit 3), without capturing the relationships between the attributes.
Table 5.1 shows the initial measurements captured for MSS2 and MSS3.
                         MSS2                          MSS3
                  V1       V2       V3          V1       V2       V3
VOLUME (CC)
  MEDIAN       9.818    9.780    9.874      28.950   34.205   35.105
  MEAN        13.848   13.554   13.203      31.654   31.693   36.399
  STD         13.836   11.972   10.905      17.129   15.114   19.299
DAMAGE MET.
  MEDIAN      0.0223   0.0212   0.0243      0.0209   0.0212   0.0237
  MEAN        0.0309   0.0292   0.0362      0.0240   0.0240   0.0266
  STD         0.0301   0.0245   0.0317      0.0145   0.0141   0.0156
STAB. INDEX
  MEDIAN      14.258   14.258   14.258      15.369   15.369   15.369
  MEAN        17.761   17.761   17.761      26.864   26.864   26.867
  STD         50.000   50.000   50.000      58.027   58.027   58.027
Table 5.1: Initial measures of the MSS2 & MSS3 subsets.
From table 5.1, we can see that the volumes from the MSS2 data are clearly lower than
those from MSS3. This is caused by the MRI dimension size of MSS3 being bigger than
that of MSS2. Also, the large difference between the mean and median of MSS2 suggests
a skewed distribution in the data, whereas the mean and median of MSS3 seem to be
closer to each other. Furthermore, the means and standard deviations across the
different visits suggest that the WMH volume does not vary significantly over time.
However, this might not hold if the positive and negative changes balance each other
out, making these aggregate numbers uninformative. Only a deeper analysis can reveal
whether this is the case.
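The per-visit summary described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook code used for the thesis. The function name `describe_visits` and the example volumes are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics

def describe_visits(volumes_by_visit):
    """Return median, mean and sample standard deviation for each visit."""
    summary = {}
    for visit, volumes in volumes_by_visit.items():
        summary[visit] = {
            "median": statistics.median(volumes),
            "mean": statistics.mean(volumes),
            # Assumption: sample standard deviation (n - 1 in the denominator).
            "std": statistics.stdev(volumes),
        }
    return summary

# Hypothetical WMH volumes (cc) for four patients; not taken from MSS2 or MSS3.
example = {
    "V1": [2.0, 4.0, 6.0, 20.0],
    "V2": [3.0, 4.0, 5.0, 20.0],
    "V3": [1.0, 5.0, 6.0, 20.0],
}
summary = describe_visits(example)
```

In this toy example a single large volume pulls the mean of visit 1 well above its median, which is the kind of gap that the text reads as a sign of a skewed distribution.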
5.4 Lesion State Analysis
The lesion state analysis of the volume is performed by subtracting consecutive visits
from each other (visit 2 - visit 1, visit 3 - visit 2, ...) and labelling the different values
as “heals”, “stable” or “extends”. Table 5.2 shows that MSS2 has roughly an equal
amount of healing and extending regions. However, MSS3’s extending lesions seem
to overpower the healing ones.
                         MSS2                          MSS3
               HEALS   STABLE     EXT.       HEALS   STABLE     EXT.
VOLUME (CC)
  MEAN         5.025    7.942    4.380      9.3677   19.270   14.113
  STD         13.836   11.972   10.905      8.6829   13.401   9.3742
DAMAGE MET.
  MEAN             0    0.023    0.008           0    0.017    0.007
  STD              0    0.023    0.011           0    0.013    0.003
Table 5.2: Lesion state analysis of the MSS2 & MSS3 subsets.
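The subtraction and labelling step can be illustrated with a small sketch. It is a simplification that works on one scalar volume per visit, whereas the thesis applies the idea to segmented regions. The function name `lesion_states` and the `tolerance` parameter (the change, in cc, below which a lesion counts as stable) are hypothetical; the text does not give the threshold that was used.

```python
def lesion_states(volumes, tolerance=0.5):
    """Label each consecutive-visit change as 'heals', 'stable' or 'extends'.

    `volumes` holds one patient's WMH volume (cc) at each visit, in order.
    `tolerance` is a hypothetical threshold (cc), not a value from the thesis.
    """
    labels = []
    for previous, current in zip(volumes, volumes[1:]):
        change = current - previous  # visit k+1 minus visit k
        if change > tolerance:
            labels.append("extends")
        elif change < -tolerance:
            labels.append("heals")
        else:
            labels.append("stable")
    return labels

# Hypothetical patient: a clear increase, then a negligible change.
states = lesion_states([10.0, 12.0, 11.8])
```

With three visits the function returns two labels, one for visit 2 minus visit 1 and one for visit 3 minus visit 2.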
5.5 Trend Separation
As every individual is different, it is impossible to describe a large population with a
single metric. This is why we perform a split of the population, according to the trends
present in its individuals. To identify the different trends, we observe the box plots of
the volume variability (i.e. visit subtraction) in MSS2 & MSS3, for all visits.
[Figure: two box-plot panels titled "Volume change distribution over time"; x-axis: Visit 1, Visit 2, Visit 3; y-axis: Volume (cc). (a) Box plot of MSS2 volume changes. (b) Box plot of MSS3 volume changes.]
Figure 5.1: Distribution of the volume changes over time.
As shown in figure 5.1(a), MSS2’s volume variability ranges from a low negative
value to a high positive value, with most of the data lying around 0. However, a
significant number of outliers lie more than one standard deviation (STD)
away from the mean, in both directions. Therefore, the separation for MSS2 will be
as follows: “high increase” (over one STD away from the mean, positively), “slight
increase” (between 0 and one STD away from the mean, positively), “slight decrease”
(between 0 and one STD away from the mean, negatively), “high decrease” (over one
STD away from the mean, negatively). If instances are placed in different folds from
one visit to another, only the fold of the last visit change is taken into consideration.
As shown in figure 5.1(b), MSS3’s volume variability is not as spread out as
MSS2’s. All the data lies within one STD of the mean, in both directions.
Therefore, a simpler division suffices. The data will be divided into volume
“increase” and volume “decrease”.
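One possible reading of the four-way separation used for MSS2 is sketched below. The function name `separate_trends`, the example changes and the exact handling of the boundaries (for instance, a change of exactly 0 is counted as a slight increase) are assumptions made for illustration; the thesis does not spell them out.

```python
import statistics

def separate_trends(changes):
    """Assign each volume change to one of four trends using the mean and STD."""
    mean = statistics.mean(changes)
    std = statistics.stdev(changes)  # assumption: sample standard deviation
    trends = []
    for change in changes:
        if change > mean + std:
            trends.append("high increase")    # over one STD above the mean
        elif change >= 0:
            trends.append("slight increase")  # non-negative, within one STD
        elif change >= mean - std:
            trends.append("slight decrease")  # negative, within one STD
        else:
            trends.append("high decrease")    # over one STD below the mean
    return trends

# Hypothetical last-visit volume changes (cc) for six patients.
changes = [-12.0, -1.0, -0.5, 0.5, 1.0, 12.0]
trends = separate_trends(changes)
```

For a population such as MSS3, where no change lies beyond one STD of the mean, the same rule only ever produces the two "slight" labels, which corresponds to the simpler increase and decrease split.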
A closer look at figures 5.2 and 5.3 reveals valuable information regarding
the linearity of change from one visit to another. The next section tests this
linearity assumption with correlation analysis and linear regression.
[Figure: four line-plot panels, each showing individual trajectories and their average; x-axis: Visit 1, Visit 2, Visit 3; y-axis: Volume. Panels: High volume increase, Slight volume increase, High volume decrease, Slight volume decrease.]
Figure 5.2: Volume variability of the MSS2 subset, divided into 4 main trends.
[Figure: two line-plot panels, each showing individual trajectories and their average; x-axis: Visit 1, Visit 2, Visit 3. Panels: Volume increase, Volume decrease.]
Figure 5.3: Volume variability of the MSS3 subset, divided into 2 main trends.
5.6 Correlation Analysis
The analysis of the correlation between the different visits can reveal valuable
information regarding the evolution of the lesions over time. We consider the
correlation between consecutive visits as well as between the first and last visit.
For our datasets, since only 3 visits are available, the correlation is calculated
between visits 1 & 2, between visits 2 & 3 and between visits 1 & 3. To describe the
correlation and capture the relationship between visits, we fit a linear regression model
to the data and measure its goodness-of-fit (Pearson’s r and p-value).
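The fit between two visits can be sketched with the standard library alone. The function name `linear_fit` and the example volumes are hypothetical. The sketch returns the slope, the intercept, Pearson's r and the t statistic of r; the p-value reported in the thesis corresponds to a two-sided test of this statistic against a t-distribution with n - 2 degrees of freedom, which is not computed here because the standard library has no t-distribution. In practice a routine such as `scipy.stats.linregress` returns r and the p-value directly.

```python
import math
import statistics

def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, with Pearson's r and its t statistic."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    n = len(x)
    # t statistic for the null hypothesis of no linear correlation (n - 2 dof).
    # It is infinite when the fit is perfect (|r| == 1).
    if abs(r) >= 1:
        t = math.copysign(math.inf, r)
    else:
        t = r * math.sqrt((n - 2) / (1 - r * r))
    return slope, intercept, r, t

# Hypothetical WMH volumes (cc) of five patients at two visits.
visit_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
visit_2 = [2.0, 4.0, 5.0, 4.0, 5.0]
slope, intercept, r, t = linear_fit(visit_1, visit_2)
```

The same function would be called three times per trend, once for each pair of visits (1 & 2, 2 & 3, 1 & 3).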
[Figure: three scatter plots titled "Correlation analysis", each with a linear regression line and residuals. Visit 1 vs. Visit 2: r = 0.835, p-value = 1.36E-3. Visit 2 vs. Visit 3: r = 0.941, p-value = 1.62E-5. Visit 1 vs. Visit 3: r = 0.935, p-value = 2.45E-5.]
Figure 5.4: Correlation plots of the highly increasing volumes in MSS2.
As an example, figure 5.4 demonstrates how the “highly increasing” trend of MSS2
evolves linearly with high confidence (r ≥ 0.83 and p-value < 0.01 in all fits). Due
to space limitations, the visit correlations of the other trends are reported in table
format (see tables 5.3 & 5.4). These metrics help to reveal the evolution of the data
from one visit to another.
TRENDS     HIGH DEC.         SLIGHT DEC.       SLIGHT INC.       HIGH INC.
           r      p          r      p          r      p          r      p
V1-V2      0.98   1.55E-6    0.94   6.13E-4    0.97   2.32E-5    0.84   1.36E-3
V2-V3      0.96   4.59E-5    0.96   1.89E-4    0.92   4.18E-4    0.94   1.62E-5
V1-V3      0.92   3.98E-4    0.99   4.69E-6    0.97   1.38E-5    0.94   2.45E-5
Table 5.3: Correlation measures for MSS2.
TRENDS     DECREASE          INCREASE
           r      p          r      p
V1-V2      0.87   1.26E-1    0.98   2.63E-7
V2-V3      0.74   2.64E-1    0.97   1.98E-6
V1-V3      0.97   3.00E-2    0.98   8.83E-7
Table 5.4: Correlation measures for MSS3.
The evolution according to trends seems to be strongly linear in MSS2 between
all the visits (r ≥ 0.84 and p-value < 0.01 in all cases). In MSS3, the relationships
between visits are also mostly explained by a linear regression. However, the
decreasing trend of the MSS3 data is more unstable. The high correlation between
visits 1 and 3 (r = 0.97 and p-value = 0.03) seems to suggest that visit 2 presents
higher fluctuations that could not be captured as well by a linear model: the
correlations involving visit 2 remain fairly high (r ≥ 0.74), but their p-values
(0.126 and 0.264) do not reach the conventional 0.05 significance level.
5.7 Growth Curve Analysis
To further explore the area of longitudinal studies, the last analysis of the characterisa-
tion pipeline involves the computation of a growth curve model [32]. For each volume
variability trend, a growth curve model is fitted and reported in tables 5.5 & 5.6.
However, it was decided to only test it for linear changes over time, due to the small
number of time points (visit 1, visit 2, visit 3). A higher number of time points could
give additional information about the shape of the growth [15], which would allow a
deeper understanding of the data. For instance, the use of polynomials could yield
better results by explaining the shape of the evolution (trajectories) with a quadratic or
cubic model [11].
The equation for a linear growth curve is given by
$$y_i = \text{intercept} + t_i \cdot \text{slope} + \text{error}_i \qquad (5.3)$$
where $y_i$ and $\text{error}_i$ represent the volume (variability) prediction and the
prediction error associated with time $i$, respectively, and $t_i$ represents the
time-specific order of the fit (e.g. [0, 1, 2] for a linear fit with 3 time points).
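To make the roles of the intercept and slope concrete, the sketch below uses a simple two-stage approximation: it fits an ordinary least-squares line to each individual trajectory with time scores [0, 1, 2], then summarises the intercepts and slopes by their mean and variance across individuals. This is not the latent growth curve estimation of [32] used for tables 5.5 and 5.6, and it will not reproduce those numbers. The function names and the example trajectories are hypothetical.

```python
import statistics

def fit_line(times, values):
    """Ordinary least-squares intercept and slope for one trajectory."""
    mean_t, mean_v = statistics.mean(times), statistics.mean(values)
    stt = sum((t - mean_t) ** 2 for t in times)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = stv / stt
    return mean_v - slope * mean_t, slope

def growth_summary(trajectories, times=(0, 1, 2)):
    """Mean and variance of the per-individual intercepts and slopes."""
    fits = [fit_line(times, values) for values in trajectories]
    intercepts = [fit[0] for fit in fits]
    slopes = [fit[1] for fit in fits]
    return {
        "intercept_mean": statistics.mean(intercepts),     # population level
        "intercept_var": statistics.variance(intercepts),  # individual level
        "slope_mean": statistics.mean(slopes),
        "slope_var": statistics.variance(slopes),
    }

# Hypothetical volumes (cc) of three patients over three visits.
trajectories = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 3.0, 3.0]]
summary = growth_summary(trajectories)
```

With the time scores starting at 0, the intercept corresponds to the expected volume at the first visit and the slope to the expected change per visit.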
Table 5.5 & 5.6 show the result of applying growth curve modelling to capture the
evolution of the trajectories in MSS2 & MSS3.
TRENDS               HIGH DEC.        SLIGHT DEC.      SLIGHT INC.      HIGH INC.
INTER. MEAN (VAR)    16.12 (49.29)    10.42 (30.81)    6.99 (6.90)      11.527 (44.25)
SLOPE MEAN (VAR)     -1.08 (-6.54)    -0.14 (6.13)     0.46 (-0.05)     2.50 (0.10)
RESID. VAR. 1        30.395           -8.32            -0.263           9.969
RESID. VAR. 2        5.132            6.46             1.225            10.368
RESID. VAR. 3        9.289            -8.653           1.028            -2.978
P-VALUE              0.007            0.843            0.013            0.385
Table 5.5: Growth Curve Analysis for MSS2.
The p-values of the highly decreasing and slightly increasing volumes suggest a
significant fit to the data. However, the p-values of the slightly decreasing and highly
increasing volumes are higher, which indicates a lower confidence regarding the fit of
the model.
TRENDS               DECREASE         INCREASE
INTER. MEAN (VAR)    29.80 (22.91)    27.23 (140.7)
SLOPE MEAN (VAR)     -1.58 (0.04)     3.79 (-28.7)
RESID. VAR. 1        -2.87            56.58
RESID. VAR. 2        13.07            -24.22
RESID. VAR. 3        6.74             58.68
P-VALUE              0.829            0.160
Table 5.6: Growth Curve Analysis for MSS3.
The growth curve model for the decreasing trend of MSS3 is not statistically
significant (p-value = 0.829). The fit is better for the increasing trend
(p-value = 0.160), although it does not reach the conventional 0.05 significance
level either.
The intercept indicates the level and scale of the volume variability at each visit at
population level (mean) and at individual level (variance). Similarly, the slope relates
to the relationship between each visit at population level (mean) and at individual level
(variance). The residual variance reports the error fluctuations of the growth model at
each visit.
Overall, the growth curve model allows a better understanding of a set of individu-
als presenting some similar traits (presence of WMH), from a whole population point
of view (i.e. whole dataset), as well as from an individual point of view (i.e. patient).
In our case, more data would improve the fit of the model and allow a better under-
standing of the datasets. However, as the data is still currently being collected, it is not
possible to include more trajectories into the model.
Chapter 6
Conclusions
The segmentation and characterisation of brain lesions in MRI scans is a challenging
task. However, with the constant progress made by the scientific community in
bio-imaging analysis, it is now possible to see the emergence of new unsupervised
segmentation methods that outperform experts on similar tasks. Furthermore, these
techniques have the advantage of not requiring labelled datasets, which are difficult
to acquire in the medical field. We chose one of these original segmentation methods,
called the LOTS-IAM, as the basis for this project as it is believed it could outperform
current state-of-the-art unsupervised methods for the task of WMH segmentation.
From this, we presented our first contribution, the LOTS-IAM-3D with target-patch
selection function based on prior, which improves the original algorithm by
(1) adding the processing of a depth dimension to the irregularity map generation, (2)
giving it the ability to filter out over-damaged patches during patch comparisons and
(3) drastically decreasing its processing time while improving its overall accuracy.
Our second contribution is a pipeline for the analysis of the previously segmented
lesions. It consists of five sub-analyses that outline different characteristics of brain
lesions on MRI from a population of individuals. The analyses are: overall description
of the population (“initial analysis”, section 5.3), lesion state analysis (section 5.4),
trend identification and separation (section 5.5), correlation analysis (section 5.6) and
growth curve modelling (section 5.7).
To validate our findings, we ran the LOTS-IAM-3D on our two datasets: MSS2 &
MSS3. Then, we evaluated it and compared it with the original LOTS-IAM and a standard
public segmentation tool, the LST-LGA. The evaluation demonstrated that the
LOTS-IAM-3D outperformed both the original LOTS-IAM and LST-LGA based on its Dice
coefficient and F1 score (age probability map threshold = 0.15). Furthermore, the
LOTS-IAM-3D overcomes the issue of lesion underestimation in over-damaged brains
with the implementation of the patch selector. Finally, we evaluated the LOTS-IAM-
3D’s ability to segment the subtle dynamic changes of the brain in-between visits,
which revealed again that it outperformed the original LOTS-IAM in all categories.
Following the segmentation evaluation, we applied our analysis pipeline to the out-
puts and produced a full description of the lesions appearing in both clinical samples
(MSS2 & MSS3). The analysis revealed that the two populations show similar but
not identical characteristics. Both populations presented increasing and decreasing le-
sion volumes over time, which validates Wardlaw et al.’s findings. However, MSS2
contains individuals whose lesion evolution was steadier and presented strong linear
relationships between visits in all cases, whereas MSS3 seems to evolve linearly for
positive volume changes but not for negative volume changes. The growth curve anal-
yses give overviews of both populations and their respective categories regarding the
shape and parameters of their evolution.
Despite the good results of the evaluations, this thesis suffers from some limitations.
First of all, had more time been available, a deeper comparison of the LOTS-IAM-3D’s
performance would have been produced to also include supervised learning methods
(e.g. UResNet). Also, more work would have been put into the tuning of SOMs, as
they are still believed to be a promising area of research for multi-spectral analysis.
Finally, more time would have allowed more patient data to be registered (MSS3 is an
ongoing study), which would have improved the quality of the dataset analysis.
Nevertheless, all these limitations could be the subject of future work.
In conclusion, we bring two contributions that we believe are of importance to clinical
research. The LOTS-IAM-3D showed consistent superiority in terms of performance
for unsupervised segmentation and the analysis pipeline builds strong foundations for
longitudinal studies of brain lesions. The work produced here was designed in the
hope that it will be considered by the scientific community and used for future clinical
studies.
Bibliography
[1] Petronella Anbeek, Koen L Vincken, Matthias JP van Osch, Robertus HC Biss-
chops, and Jeroen van der Grond. Automatic segmentation of different-sized
white matter lesions by voxel probability estimation. Medical image analysis,
8(3):205–215, 2004.
[2] Rhoda Au, Joseph M Massaro, Philip A Wolf, Megan E Young, Alexa Beiser,
Sudha Seshadri, Ralph B D’Agostino, and Charles DeCarli. Association of white
matter hyperintensity volume with decreased cognitive functioning: the Framingham
Heart Study. Archives of neurology, 63(2):246–250, 2006.
[3] Mohammad Balafar. Gaussian mixture model based segmentation methods for
brain mri images. Artificial Intelligence Review, 41, 03 2014.
[4] Mohd Ali Balafar, Abdul Rahman Ramli, M Iqbal Saripan, and Syamsiah
Mashohor. Review of brain mri image segmentation methods. Artificial Intel-
ligence Review, 33(3):261–274, 2010.
[5] Rachele Bellini, Yanir Kleiman, and Daniel Cohen-Or. Time-varying weathering
in texture space. ACM Transactions on Graphics (TOG), 35(4):141, 2016.
[6] Jeff A Bilmes et al. A gentle tutorial of the em algorithm and its application to
parameter estimation for gaussian mixture and hidden markov models. Interna-
tional Computer Science Institute, 4(510):126, 1998.
[7] J Martin Bland and Douglas G Altman. Statistical methods for assessing agree-
ment between two methods of clinical measurement. The lancet, 327(8476):307–
310, 1986.
[8] Neurology Unit Cambridge University. Preserve: How intensively should we
treat blood pressure in established cerebral small vessel disease? 2012.
[9] Arístides Andrés Capizzano, L Acion, T Bekinschtein, M Furman, H Gomila,
A Martinez, R Mizrahi, and SE Starkstein. White matter hyperintensities are
significantly associated with cortical atrophy in Alzheimer’s disease. Journal of
Neurology, Neurosurgery & Psychiatry, 75(6):822–827, 2004.
[10] A-Hyun Cho, Hyeong-Ryul Kim, Woojun Kim, and Dong Won Yang. White mat-
ter hyperintensity in ischemic stroke patients: it may regress over time. Journal
of stroke, 17(1):60, 2015.
[11] Patrick J Curran, Khawla Obeidat, and Diane Losardo. Twelve frequently asked
questions about growth curve modeling. Journal of cognition and development,
11(2):121–136, 2010.
[12] Stephanie Debette and HS Markus. The clinical importance of white matter hy-
perintensities on brain magnetic resonance imaging: systematic review and meta-
analysis. Bmj, 341:c3666, 2010.
[13] Ivana Despotovic, Bart Goossens, and Wilfried Philips. Mri segmentation of
the human brain: challenges, methods, and applications. Computational and
mathematical methods in medicine, 2015, 2015.
[14] Chuong B Do and Serafim Batzoglou. What is the expectation maximization
algorithm? Nature biotechnology, 26(8):897, 2008.
[15] Terry E Duncan and Susan C Duncan. The abcs of lgm: An introductory guide
to latent variable growth curve modeling. Social and personality psychology
compass, 3(6):979–991, 2009.
[16] Franz Fazekas. Incidental periventricular white matter hyperintensities revisited:
what detailed morphologic image analyses can tell us. American Journal of Neu-
roradiology, 35(1):63–64, 2014.
[17] Franz Fazekas, John B Chawluk, Abass Alavi, Howard I Hurtig, and Robert A
Zimmerman. Mr signal abnormalities at 1.5 t in alzheimer’s dementia and normal
aging. American journal of roentgenology, 149(2):351–356, 1987.
[18] Urs Fischer, Adrian Baumgartner, Marcel Arnold, Krassen Nedeltchev, Jan
Gralla, Gian Marco De Marchis, Liliane Kappeler, Marie-Luise Mono, Caspar
Brekenfeld, Gerhard Schroth, et al. What is a minor stroke? Stroke, 41(4):661–
666, 2010.
[19] Office for National Statistics. Deaths registered in england and wales, 2017.
[20] Melvin Gelbard. Data mining with dynamic contrast enhanced magnetic reso-
nance imaging (dce-mri) data, 2019.
[21] Davide Giavarina. Understanding bland altman analysis. Biochemia medica:
Biochemia medica, 25(2):141–151, 2015.
[22] Lei Guo, Xuena Liu, Youxi Wu, Weili Yan, and Xueqin Shen. Research on
the segmentation of mri image based on multi-classification support vector ma-
chine. In 2007 29th Annual International Conference of the IEEE Engineering in
Medicine and Biology Society, pages 6019–6022. IEEE, 2007.
[23] Joseph V Hajnal, David J Bryant, Larry Kasuboski, Pradip M Pattany, et al. Use
of fluid attenuated inversion recovery (flair) pulse sequences in mri of the brain.
Journal of computer assisted tomography, 16:841–841, 1992.
[24] Maria Del C Valdes Hernandez, Francesca M Chappell, Susana Munoz Man-
iega, David Alexander Dickie, Natalie A Royle, Zoe Morris, Devasuda Anblagan,
Eleni Sakka, Paul A Armitage, Mark E Bastin, et al. Metric to quantify white mat-
ter damage on brain magnetic resonance images. Neuroradiology, 59(10):951–
962, 2017.
[25] Scott A Huettel, Allen W Song, Gregory McCarthy, et al. Functional magnetic
resonance imaging, volume 1. Sinauer Associates Sunderland, MA, 2004.
[26] Ali Isın, Cem Direkoglu, and Melike Sah. Review of mri-based brain tumor
image segmentation using deep learning methods. Procedia Computer Science,
102:317–324, 2016.
[27] Xiao-li Jin. Multi-spectral mri brain image segmentation based on kernel cluster-
ing analysis.
[28] Raymond J. Kim, Edwin Wu, Allen Rafael, Enn-Ling Chen, Michele A. Parker,
Orlando Simonetti, Francis J. Klocke, Robert O. Bonow, and Robert M. Judd.
The use of contrast-enhanced magnetic resonance imaging to identify reversible
myocardial dysfunction. New England Journal of Medicine, 343(20):1445–1453,
2000. PMID: 11078769.
[29] Xavier Llado, Onur Ganiler, Arnau Oliver, Robert Martı, Jordi Freixenet, Laia
Valls, Joan C Vilanova, Lluıs Ramio-Torrenta, and Alex Rovira. Automated
detection of multiple sclerosis lesions in serial brain mri. Neuroradiology,
54(8):787–807, 2012.
[30] MATLAB. version 9.2.0 (R2017a). The MathWorks Inc., Natick, Massachusetts,
2017.
[31] Katie L. McMahon, Gary Cowin, and Graham Galloway. Magnetic resonance
imaging: The underlying principles. Journal of Orthopaedic & Sports Physical
Therapy, 41(11):806–819, 2011. PMID: 21654095.
[32] Daniel McNeish and Tyler Matta. Differentiating between mixed-effects and
latent-curve approaches to growth modeling. Behavior research methods,
50(4):1398–1414, 2018.
[33] Todd K Moon. The expectation-maximization algorithm. IEEE Signal processing
magazine, 13(6):47–60, 1996.
[34] Muhammad Febrian Rachmadi, Maria del C. Valdes-Hernandez, Hongwei Li,
Ricardo Guerrero, Rozanna Meijboom, Stewart Wiseman, Adam Waldman, Jianguo
Zhang, Daniel Rueckert, and Taku Komura. Limited one-time sampling irregularity
map (lots-im): Automatic unsupervised quantitative assessment of white matter
hyperintensities in structural brain magnetic resonance images. 2019.
[35] D Mungas, WJ Jagust, Bruce R Reed, JH Kramer, MW Weiner, N Schuff, D Norman,
WJ Mack, L Willis, and HC Chui. Mri predictors of cognition in subcortical
ischemic vascular disease and Alzheimer’s disease. Neurology, 57(12):2229–2235,
2001.
[36] John T O’Brien, David Ames, and Isaac Schwietzer. White matter changes in de-
pression and alzheimer’s disease: a review of magnetic resonance imaging stud-
ies. International Journal of Geriatric Psychiatry, 11(8):681–694, 1996.
[37] Travis E Oliphant. A guide to NumPy, volume 1. Trelgol Publishing USA, 2006.
[38] World Health Organization. World Health Report. 2002.
[39] Leonardo Pantoni, Michela Simoni, Giovanni Pracucci, Reinhold Schmidt, Fred-
erik Barkhof, and Domenico Inzitari. Visual rating scales for age-related white
matter changes (leukoaraiosis) can the heterogeneity be reduced? Stroke,
33(12):2827–2833, 2002.
[40] Julia Patriarche and Bradley Erickson. A review of the automated detection
of change in serial imaging studies of the brain. Journal of digital imaging,
17(3):158–174, 2004.
[41] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel,
M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos,
D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine
learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
[42] Richard J Radke, Srinivas Andra, Omar Al-Kofahi, and Badrinath Roysam. Im-
age change detection algorithms: a systematic survey. IEEE transactions on
image processing, 14(3):294–307, 2005.
[43] Paul Schmidt, Christian Gaser, Milan Arsic, Dorothea Buck, Annette Forschler,
Achim Berthele, Muna Hoshi, Rudiger Ilg, Volker J Schmid, Claus Zimmer, et al.
An automated tool for detection of flair-hyperintense white-matter lesions in mul-
tiple sclerosis. Neuroimage, 59(4):3774–3783, 2012.
[44] Reinhold Schmidt, Helena Schmidt, Peter Kapeller, Christian Enzinger, Stefan
Ropele, Ronald Saurugg, and Franz Fazekas. The natural course of mri white
matter hyperintensities. Journal of the neurological sciences, 203:253–257, 2002.
[45] Christopher James Martin Scott. Master’s thesis, University of Edinburgh, 2016.
[46] Eric E Smith, Svetlana Egorova, Deborah Blacker, Ronald J Killiany, Alona
Muzikansky, Bradford C Dickerson, Rudolph E Tanzi, Marilyn S Albert,
Steven M Greenberg, and Charles RG Guttmann. Magnetic resonance imaging
white matter hyperintensities and brain volume in the prediction of mild cognitive
impairment and dementia. Archives of neurology, 65(1):94–100, 2008.
[47] Stephen M Smith, Yongyue Zhang, Mark Jenkinson, Jacqueline Chen,
PM Matthews, Antonio Federico, and Nicola De Stefano. Accurate, robust, and
automated longitudinal and cross-sectional brain change analysis. Neuroimage,
17(1):479–489, 2002.
[48] Els Steeman, Bernadette Dierckx De Casterle, Jan Godderis, and Mieke Gryp-
donck. Living with early-stage dementia: A review of qualitative studies. Journal
of advanced nursing, 54(6):722–738, 2006.
[49] Cathie Sudlow, John Gallacher, Naomi Allen, Valerie Beral, Paul Burton, John
Danesh, Paul Downey, Paul Elliott, Jane Green, Martin Landray, et al. Uk
biobank: an open access resource for identifying the causes of a wide range of
complex diseases of middle and old age. PLoS medicine, 12(3):e1001779, 2015.
[50] CLM Sudlow and CP Warlow. Comparing stroke incidence worldwide: what
makes studies comparable? Stroke, 27(3):550–558, 1996.
[51] G. Vettigli. Minisom: minimalistic and numpy based implementation of the self
organizing maps. https://github.com/JustGlowing/minisom, 2013.
[52] Joanna M Wardlaw, Michael Allerhand, Fergus N Doubal, Maria Valdes Hernan-
dez, Zoe Morris, Alan J Gow, Mark Bastin, John M Starr, Martin S Dennis, and
Ian J Deary. Vascular risk factors, large-artery atheroma, and brain white matter
hyperintensities. Neurology, 82(15):1331–1338, 2014.
[53] Joanna M Wardlaw, Francesca M Chappell, Maria del Carmen Valdes Hernandez,
Stephen DJ Makin, Julie Staals, Kirsten Shuler, Michael J Thrippleton, Paul A
Armitage, Susana Munoz-Maniega, Anna K Heye, et al. White matter hyper-
intensity reduction and outcomes after minor stroke. Neurology, 89(10):1003–
1010, 2017.
[54] Joanna M Wardlaw, Maria C Valdes Hernandez, and Susana Munoz-Maniega.
What are white matter hyperintensities made of? relevance to vascular cognitive
impairment. Journal of the American Heart Association, 4(6):e001140, 2015.