Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain...

16
Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform and multi-institution clinical images Yangming Ou a,b, , Randy L. Gollub a,b , Kallirroi Retzepi a,b , Nathaniel Reynolds a,b , Rudolph Pienaar c , Steve Pieper d , Shawn N. Murphy e,f , P. Ellen Grant c , Lilla Zöllei b a Psychiatric Neuroimaging, Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, 120 2nd Ave, Charlestown, MA 02129, USA b Laboratory for Computational Neuroimaging, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, 149 13th St, Charlestown, MA 02129, USA c Fetal-Neonatal Neuroimaging and Developmental Science Center, Children's Hospital Boston, Harvard Medical School, 1 Autumn St, Boston, MA 02115, USA d Isomics, Inc., 55 Kirkland St, Cambridge, MA 02138, USA e Research Computing, Partners HealthCare, 1 Constitution Center, Charlestown, MA 02129, USA f Laboratory of Computer Science, Massachusetts General Hospital, Harvard Medical School, 50 Staniford St, Boston, MA 02114, USA abstract article info Article history: Received 9 November 2014 Accepted 3 August 2015 Available online 7 August 2015 Apparent Diffusion Coefcient (ADC) maps can be used to characterize myelination and to detect abnormalities in the developing brain. However, given the normal variation in regional ADC with myelination, detection of ab- normalities is difcult when based on visual assessment. Quantitative and automated analysis of pediatric ADC maps is thus desired but requires accurate brain extraction as the rst step. Currently, most existing brain extrac- tion methods are optimized for structural T1-weighted MR images of fully myelinated brains. Due to differences in age and image contrast, these approaches do not translate well to pediatric ADC maps. To address this problem, we present a multi-atlas brain extraction framework that has 1) specicity: designed and optimized specically for pediatric ADC maps; 2) generality: applicable to multi-platform and multi-institution data, and to subjects at various neuro-developmental stages across the rst 6 years of life; 3) accuracy: highly accurate compared to expert annotations; and 4) consistency: consistently accurate regardless of sources of data and ages of subjects. We show how we achieve these goals, via optimizing major components in a multi-atlas brain extraction frame- work, and via developing and evaluating new criteria for its atlas ranking component. Moreover, we demonstrate that these goals can be achieved with a xed set of atlases and a xed set of parameters, which opens doors for our optimized framework to be used in large-scale and multi-institution neuro-developmental and clinical stud- ies. In a pilot study, we use this framework in a dataset containing scanner-generated ADC maps from 308 pedi- atric patients collected during the course of routine clinical care. Our framework leads to successful quantications of the changes in whole-brain volumes and mean ADC values across the rst 6 years of life. © 2015 Elsevier Inc. All rights reserved. Introduction ADC maps measure the magnitude of (intra- and extra-cellular) water diffusion (Le Bihan et al., 1986; Neil et al., 1998). They provide valuable information about diffusivity and myelination in the central nervous system (Gerstner et al., 2010; Grant et al., 2001; Menze et al., in press; Verma et al., 2008; Woodhams et al., 2005). Pediatric ADC maps have been found to be very useful to characterize myelination- related early neuro-development (Erus et al., 2014; McKinstry, 2011; Ou et al., 2014a; Retzepis et al., submitted for publication; Snook et al., 2005; Twomey et al., 2010; Yoshida et al., 2013), and to identify neuro-developmental abnormalities (Barkovich, 2000; Dubois et al., 2014a; Guleria and Gross Kelly, 2014; Hüppi and Dubois, 2006). For both purposes, automatic analysis of pediatric ADC maps is preferred. This, however, requires brain extraction as one of the rst preprocessing steps, which is a challenging problem with a lack of algorithms specic to pediatric ADC maps. Here, brain extraction, or skull stripping, refers to an image processing step to remove the extra-meningeal tissues and non-brain structures (eyes, nose, neck, etc.) from whole-head im- ages (e.g., the red contours in Fig. 1). Existing brain extraction approaches have been primarily developed for structural T1-weighted MR images of fully myelinated brains of teens and adults. They use morphological operators (e.g., Chiverton et al., 2007; Dogdas et al., 2005), deformable surface evolution (e.g., Smith, 2002), edge detection (e.g., Shattuck and Leahy, 2002), region growing (e.g., Park and Lee, 2009), level sets (e.g., Zhuang et al., NeuroImage 122 (2015) 246261 Corresponding author at: Massachusetts General Hospital, Harvard Medical School, 149 13th St, Rm 2301, Charlestown, MA 02129, USA. Fax: +1 617 726 7422. E-mail address: [email protected] (Y. Ou). http://dx.doi.org/10.1016/j.neuroimage.2015.08.002 1053-8119/© 2015 Elsevier Inc. All rights reserved. Contents lists available at ScienceDirect NeuroImage journal homepage: www.elsevier.com/locate/ynimg

Transcript of Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain...

Page 1: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

NeuroImage 122 (2015) 246–261

Contents lists available at ScienceDirect

NeuroImage

j ourna l homepage: www.e lsev ie r .com/ locate /yn img

Brain extraction in pediatric ADC maps, toward characterizingneuro-development in multi-platform and multi-institutionclinical images

Yangming Ou a,b,⁎, Randy L. Gollub a,b, Kallirroi Retzepi a,b, Nathaniel Reynolds a,b, Rudolph Pienaar c,Steve Pieper d, Shawn N. Murphy e,f, P. Ellen Grant c, Lilla Zöllei b

a Psychiatric Neuroimaging, Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, 120 2nd Ave, Charlestown, MA 02129, USAb Laboratory for Computational Neuroimaging, Athinoula A. Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Harvard Medical School, 149 13th St, Charlestown,MA 02129, USAc Fetal-Neonatal Neuroimaging and Developmental Science Center, Children's Hospital Boston, Harvard Medical School, 1 Autumn St, Boston, MA 02115, USAd Isomics, Inc., 55 Kirkland St, Cambridge, MA 02138, USAe Research Computing, Partners HealthCare, 1 Constitution Center, Charlestown, MA 02129, USAf Laboratory of Computer Science, Massachusetts General Hospital, Harvard Medical School, 50 Staniford St, Boston, MA 02114, USA

⁎ Corresponding author at: Massachusetts General Ho149 13th St, Rm 2301, Charlestown, MA 02129, USA. Fax:

E-mail address: [email protected] (Y. O

http://dx.doi.org/10.1016/j.neuroimage.2015.08.0021053-8119/© 2015 Elsevier Inc. All rights reserved.

a b s t r a c t

a r t i c l e i n f o

Article history:Received 9 November 2014Accepted 3 August 2015Available online 7 August 2015

Apparent Diffusion Coefficient (ADC) maps can be used to characterize myelination and to detect abnormalitiesin the developing brain. However, given the normal variation in regional ADC with myelination, detection of ab-normalities is difficult when based on visual assessment. Quantitative and automated analysis of pediatric ADCmaps is thus desired but requires accurate brain extraction as the first step. Currently,most existing brain extrac-tion methods are optimized for structural T1-weightedMR images of fully myelinated brains. Due to differencesin age and image contrast, these approaches do not translatewell to pediatric ADCmaps. To address this problem,we present a multi-atlas brain extraction framework that has 1) specificity: designed and optimized specificallyfor pediatric ADC maps; 2) generality: applicable to multi-platform and multi-institution data, and to subjectsat various neuro-developmental stages across the first 6 years of life; 3) accuracy: highly accurate compared toexpert annotations; and 4) consistency: consistently accurate regardless of sources of data and ages of subjects.We show howwe achieve these goals, via optimizing major components in a multi-atlas brain extraction frame-work, and via developing and evaluating new criteria for its atlas ranking component.Moreover,we demonstratethat these goals can be achieved with a fixed set of atlases and a fixed set of parameters, which opens doors forour optimized framework to be used in large-scale andmulti-institution neuro-developmental and clinical stud-ies. In a pilot study, we use this framework in a dataset containing scanner-generated ADCmaps from 308 pedi-atric patients collected during the course of routine clinical care. Our framework leads to successfulquantifications of the changes in whole-brain volumes and mean ADC values across the first 6 years of life.

© 2015 Elsevier Inc. All rights reserved.

Introduction

ADC maps measure the magnitude of (intra- and extra-cellular)water diffusion (Le Bihan et al., 1986; Neil et al., 1998). They providevaluable information about diffusivity and myelination in the centralnervous system (Gerstner et al., 2010; Grant et al., 2001; Menze et al.,in press; Verma et al., 2008; Woodhams et al., 2005). Pediatric ADCmaps have been found to be very useful to characterize myelination-related early neuro-development (Erus et al., 2014; McKinstry, 2011;Ou et al., 2014a; Retzepis et al., submitted for publication; Snook et al.,2005; Twomey et al., 2010; Yoshida et al., 2013), and to identify

spital, Harvard Medical School,+1 617 726 7422.u).

neuro-developmental abnormalities (Barkovich, 2000; Dubois et al.,2014a; Guleria and Gross Kelly, 2014; Hüppi and Dubois, 2006). Forboth purposes, automatic analysis of pediatric ADC maps is preferred.This, however, requires brain extraction as one of thefirst preprocessingsteps, which is a challenging problem with a lack of algorithms specificto pediatric ADC maps. Here, brain extraction, or skull stripping, refersto an image processing step to remove the extra-meningeal tissuesand non-brain structures (eyes, nose, neck, etc.) from whole-head im-ages (e.g., the red contours in Fig. 1).

Existing brain extraction approaches have been primarily developedfor structural T1-weighted MR images of fully myelinated brains ofteens and adults. They use morphological operators (e.g., Chivertonet al., 2007; Dogdas et al., 2005), deformable surface evolution(e.g., Smith, 2002), edge detection (e.g., Shattuck and Leahy, 2002),region growing (e.g., Park and Lee, 2009), level sets (e.g., Zhuang et al.,

Page 2: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 1. Brain extraction in a pediatric ADCmap (top row) and an adult structural (T1w) image (bottom row). Red contours aremanually annotated brain boundaries. The differences in ageand imaging modality are clearly visible.

247Y. Ou et al. / NeuroImage 122 (2015) 246–261

2006), watershed (e.g., Hahn and Peitgen, 2000; Ségonne et al., 2004),graph cut (e.g., Mahapatra, 2012; Sadananthan et al., 2010), atlas priors(e.g., Doshi et al., 2013a; Leung et al., 2011), and hydrid techniques(e.g., Eskildsen et al., 2012; Galdames et al., 2012; Iglesias et al., 2011).These approaches are built on age- and imaging-modality as well ascontrast-specific assumptions that do not completely hold true forpediatric ADC maps. In terms of the difference in age, adults have amore mature anatomy; also, due to less motion and the completedmyelination in adults, their images tend to have higher tissue contrast,a higher signal-to-noise ratio (SNR) and higher voxel resolution thanimages of pediatric subjects; and, there is a larger distance betweenthe outer surface of the brain and the inner surface of the skull in adultsas well (Hoeksma et al., 2005; Mahapatra, 2012; Shi et al., 2012). Interms of the difference in MR imaging sequences, structural T1-weighted images show non-brain structures (e.g., skull, eyes, nose,mouth and neck) more clearly than diffusion ADC maps, and have dif-ferent inherent distortions (Counsell et al., 2006; Dubois et al., 2014b;Giménez et al., 2008). Fig. 1 displays an adult structural (T1-weighted)image and a pediatric ADCmap, where the differences in age and imag-ing sequence are clearly visible. Because of these fundamental differ-ences, brain extraction approaches for adult structural images usuallydo not directly apply to pediatric ADC maps. The inapplicability can beseen in our experiment (Table 3) and further experiments in recent lit-erature (Geng et al., 2012b; Mahapatra, 2012; Shi et al., 2011).

A few brain extraction methods do exist for pediatric T1 weightedimages (Chiverton et al., 2007; Mahapatra, 2012; Shi et al., 2012).However, directly applying them to pediatric ADC maps may also en-counter problems, due to differences in contrast properties and differenttypes of distortions of the two image sequences. Other approaches existfor other image modalities of infant or pediatric subjects (e.g., for T2-weighted MRI of infants (Leroy et al., 2011), for post mortem MRI ofnewborns and fetus Orasanu et al., 2014, for infant structural MRI withexpert post-correction of brain extraction results (Wang et al., 2012),for fetal brain MRI (Kuklisova-Murgasova et al., 2012)) or for ADCmaps of adults (e.g., Liu et al., 2007). However, there is a lack of a vali-dated algorithm for neonatal/pediatric ADCmaps. Consequently, one ei-ther has to (semi-) manually annotate the diffusion images (e.g., Hasanet al., 2007; Irimia et al., 2011), or transfer the brainmask obtained fromthe structural image to the ADC map of the same pediatric subject. Theformer is a subjective and not fully reproducible process. The latter re-quires an additional structural-to-diffusion multi-modal registration

that is non-trivial even in the case of adult images (Geng et al., 2012b;Malinsky et al., 2013).

Two other very important factors need to be considered when de-signing brain extraction algorithms for pediatric ADC maps. First, MRIscans of normative pediatric subjects are rare. Clinicians order MRIscans for infants and pediatric subjects with great caution, even thoughsedation is not always required (Kulikova et al., 2014). Additionally,even fewer datasets exist for research purposes, due to the difficultyin recruiting volunteers and the safety concerns (Arthurs and Sury,2013; Barkovich and Raybaud, 2012; Edwards and Arthurs, 2011;Rozovsky et al., 2013). As a result, many studies of neuro-development have to pool together clinical images collected from mul-tiple institutions to assure statistically significant outcomes (e.g., Evans,2006). Therefore, it is preferable for an algorithm designed for pediatricimages to be applied and validated in images from different sources (in-stitutions, scanner platforms and scanner vendors). This is a very chal-lenging requirement, given that images from different data sourcesmay differ greatly in resolution, contrast properties, sequences, fields-of-views (FOVs) and distortions (Guggenberger et al., 2013;Hameeteman et al., 2011; Kirişli et al., 2010; Kıvrak et al., 2013;Malyarenko et al., 2013; Ou et al., 2014c). The second factor to consideris the inter-subject structural variations among pediatric subjects evenat the same age, which are greater than among adolescents and adults.This is due to the rapid neuro-development in neonatal and early pedi-atric stages. Therefore, an ideal algorithm should also be robust to thelarge structural variations in pediatric subjects of the same or differentages.

Considering the above, our goal is to develop a brain extractionframework with four desirable properties: 1) specificity: designed spe-cifically for pediatric ADC maps; 2) generality: applicable to datafrom different sources (institutions, platforms), and to subjects of dif-ferent ages across the first 6, and in particular the first 2, years of life(during which the neuro-development is most dynamic (Kazemi et al.,2007; Knickmeyer et al., 2008; Geng et al., 2012a; Mori et al., 2013; 3)accuracy: highly accurate compared to expert annotations; and 4) con-sistency: consistently accurate in different data sources and subject ages.

This paper presents a multi-atlas framework for the brain extractionof pediatric ADC maps. An atlas in our framework is comprised of a pe-diatric ADCmap and a corresponding expert-annotated brainmask (seeFig. 2). We used multiple atlases to collectively infer the brain mask inthe target images. We optimized each component in the framework

Page 3: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 2. The proposedmulti-atlas brain extraction framework. Given a set of atlas ADCmaps {Ii}Ni = 1 and their corresponding expert-annotated brainmasks {Li}Ni = 1, and the ADCmap IT tobe processed, this framework aims to compute the brain mask LT in the space of IT.

248 Y. Ou et al. / NeuroImage 122 (2015) 246–261

and proposed new criteria to improve the atlas ranking component. Theoptimization of the framework was done in one dataset and was thentested in 2 other unseen datasets acquired at a different institution oron a different scanner.

Having amulti-atlas framework is not our contribution. Such frame-works have been frequently used to segment brain (e.g., Cabezas et al.,2011; Doshi et al., 2013b; Heckemann et al., 2006; Iglesias et al., 2013;Lötjönen et al., 2010; Ou et al., 2012b; Sabuncu et al., 2010; Wanget al., 2011; Wang et al., 2014; Wu et al., 2013), prostate (e.g., Kleinet al., 2008; Ou et al., 2012a), heart (e.g., Isgum et al., 2009; vanRikxoort et al., 2010) and other organs. Two multi-atlas frameworkseven exist for skull stripping adult structural images (Doshi et al.,2013a; Leung et al., 2011). Our contributions are: (i) our framework isthe first algorithm specifically designed to skull strip clinical pediatricADC maps; it is fully-automated after having obtained a set of expert-annotated atlases; (ii) unlike many other multi-atlas frameworks, oursdoes not require adoption or annotations of additional, often dataset-specific, atlases when new images/datasets are to be processed, nordoes our framework require additional, dataset-specific parametertuning; and (iii) we have shown that an optimized framework canreach the desired specificity, generality, accuracy and consistency,which have been rarely demonstrated in the literature even for adultimages. We further used the proposed framework in a pilot neuro-developmental study containing clinical ADC maps from a cohort of308 normative pediatric subjects. As a result, we quantified changes inthe whole-brain volume and in mean ADC values across the first6 years of life.

Data

The Institutional Review Boards (IRBs) at both Partners Healthcareand Boston Children's Hospital approved our study. We queried and re-trieved clinical pediatric ADCmaps for subjects 1) younger than 6 yearsof age at the time of the scan, and also 2) with clinical radiology reportsand clinical records documenting the absence ofmajor neurological dis-ease. This data query and retrieval was made possible by the ResearchPatient Data Registry (RPDR)1 data warehouse, and the recently re-leased Medical Imaging Informatics Bench to Bedside (mi2b22) work-bench (Murphy et al., 2014). The clinical data obtained from the RPDRquery, including the Radiology Reports, were analyzed using databasequeries to identify potential normative cases that were then confirmedby an expert review of the medical records. The mi2b2 workbench wasthen used to access the brain MRI scans of these potential cases directly

1 RPDR: http://rc.partners.org/rpdr.2 Mi2b2: https://community.i2b2.org/wiki/display/mi2b2/mi2b2+Home.

from our institutions' clinical Picture Archiving and CommunicationSystem (PACS), in compliance with the Health Insurance Portabilityand Accountability Act (HIPAA) Omnibus Rule.

Three datasets with expert annotations (Section 2.1) were used foralgorithm development and validation. Another, relatively large dataset(Section 2.2), was used for applying the optimized framework in awhole-brain volumetric and ADC neuro-development study.

Three datasets for algorithm development and validation

As detailed in Table 1, we used 3 datasets acquired on different scan-ner platforms from different institutions for development and valida-tion of our framework. These datasets are: MGH_3T (N = 15),MGH_1.5T (N = 10) and BCH_3T (N = 14). Each dataset containsADC maps of multiple pediatric subjects, which were retrieved as de-scribed in Section 2. A trained expert (KR) annotated brain masks inall datasets, whichwe used as references in our experiments. The anno-tations took 0.5–1 h per subject, using the FreeView3 visualization andannotation tool from the FreeSurfer software package (Fischl, 2012).

One dataset for whole-brain neuro-development study

Wealso applied the proposed framework to a relatively large datasetnamed Pediatric Dataset 308, or PD308 for quantitatively characterizingvolumetric and diffusion aspects of neuro-development from birth to6 years old. This PD308 dataset contains clinical ADC maps from N =308 pediatric subjects younger than 6 years of age at the time of thescan. The ADC maps were all acquired by the same protocol as inMGH_3T. A pediatric intensive care nurse (KM) confirmed that theywere normative at the time of the scan, by reviewing their radiologicalreports and clinical records. Table 2 lists the age distribution of thisdataset.

Methods

Overview

FormulationWe denote an atlas An = (In, Ln), which comprises of an ADCmap In,

and the associatedmanually-annotated brainmask Ln. Similarly, we de-note a target as T=(IT, LT), where IT is the input ADCmap (i.e., the targetimage), and LT is the brain mask to be calculated in the target imagespace.

3 FreeView: https://surfer.nmr.mgh.harvard.edu/fswiki/FreeviewGuide.

Page 4: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Table 1The 3 datasets that we used for developing and validating our framework.

Datasets MGH_3T MGH_1.5T BCH_3T

Imaging institution Massachusetts General Hospital (MGH) Massachusetts General Hospital (MGH) Boston Children's Hospital (BCH)MR scanner 3T Siemens Trio 1.5T Siemens Avanto 3T Siemens TrioRepetition time (TR) 7500–9500 ms 6500–8000 ms 7000–8000 msEcho time (TE) 80–115 ms 80–100 ms 80–100 msb value 1000 s/mm2 1000 s/mm2 1000 s/mm2

Sequence Echo planar, segmented k-space\spoiled,partial Fourier-phase, fat saturation.

Echo planar, research mode, partialFourier-frequency, spatial presaturation,fat saturation.

Echo planar, segmented k-space, spatialpresaturation, partial Fourier-Phase.

Typical image size 256 × 256 × 60 voxels 256 × 256 × 50 voxels 128 × 128 × 70 voxelsTypical voxel size 0.86 × 0.86 × 2.0 mm3 0.86 × 0.86 × 2.2 mm3 1.7 × 1.7 × 2.0 mm3

Manually-annotated? Yes, brain masks Yes, brain masks Yes, brain masksNumber of subjects 15 10 14Age distribution Month 1 (4 subjects) Month 1 (1 subject) Month 1 (4 subjects)

Month 3 (1 subject) Month 4 (1 subject) Month 2 (1 subject)Month 4 (1 subject) Month 6–12 (2 subjects) Month 4 (1 subject)Month 6–12 (4 subjects) Year 1–2 (2 subjects) Month 6–12 (2 subjects)Year 1–2 (1 subject) Year 2–3 (1 subject) Year 1–2 (1 subject)Year 3–4 (2 subjects) Year 3–4 (1 subject) Year 2–3 (1 subject)Year 4–5 (1 subject) Year 4–5 (1 subject) Year 3–4 (1 subject)Year 5–6 (1 subject) Year 5–6 (1 subject) Year 4–5 (1 subject)

Year 5–6 (1 subject)

249Y. Ou et al. / NeuroImage 122 (2015) 246–261

FrameworkSimilarly to other multi-atlas consensus frameworks (e.g.,

Heckemann et al., 2006; Klein et al., 2008), ourmulti-atlas brain extrac-tion has the following components (also illustrated in Fig. 2):

a) a fixed library of N = 15 atlases (Section 3.2);b) deformable registration and label propagation (Section 3.3). We

used the deformation field hn: In- N IT, calculated from ADC maps,to transfer the atlas brain mask Ln into the target space, i.e., hn(Ln).

c) atlas ranking (Section 3.4) andd) atlas selection (Section 3.5) for choosing the most relevant atlases;e) label fusion to generate final target brain mask LT (Section 3.6).

In the following,we describe each component in the framework. Theemphasis is on how we selected the optimal methods for each compo-nent, in order to establish a frameworkwith the desirable generality, ac-curacy and consistency.

Atlas library

Our atlas library contains the 15 ADC maps from the MGH_3Tdataset. To represent the anatomical variations due to rapid neuro-development, we chose a bigger number of subjects that were scannedduring the first year (especially during the first month) of their lives,and also some representative ones up until 6 years of age. Fig. 3 showsthe central sagittal slices of the chosen atlases. They had different con-trasts and signal-to-noise ratios. Please note that some atlases (namely2, 5, 6, 12, 14, 15) have different fields of views (FOVs), where someslices in the top of the brain are missing. This alone highlights a hugevariability within the same dataset. Also, we point out that the chosenatlases are not spatially aligned in Fig. 3. Thus, the central sagittal slicesin different subjects may not exactly correspond across atlases.

Table 2Age distribution in the PD308 dataset (W—Week, Q—Quarter and Y—Year).

Age Y0–1 Y1–2 Y2–3 Y3–4 Y4–5 Y5–6

W1–2 Rest of Q1 Q2 Q3 Q4

# Subjects 20 22 13 17 25 47 45 38 31 50

Optimizing the registration and label propagation component

One consequence of having afixed atlas library is that the atlases andthe target may come from datasets of different sources (i.e., institutions,platforms). To this end, we need a deformable registration algorithmthat is robust to inter-subject structural variability, and also robust tochallenges in data acquired from different sources.

We tested four registration algorithms to see which one can satisfythis need. If an algorithm could demonstrate generality, accuracy andconsistency regarding single- and multi-institution/platform images,then selecting it would optimize the registration and label propagationcomponents in our framework. The four candidate algorithms that wetested are frequently used in the literature with demonstrated high ac-curacy (Klein et al., 2009; Ou et al., 2014c), are publicly available and aregeneral-purpose. Three of the algorithms are intensity-based: ANTs4

(the “SyN” symmetric diffeomorphism option) (Avants et al., 2008),Diffeomorphic Demons5 (Thirion, 1998; Vercauteren et al., 2009), andFSL's FNIRT6 (Andersson et al., 2008) and the fourth one is a texture-attribute-based one called DRAMMS7 (Ou et al., 2011). They representa variety of transformation models, image similarity metrics, and opti-mization strategies (please see Klein et al., 2009; Ou et al., 2014c for de-tailed reviews of their algorithmic differences). In the literature, theperformance of these algorithms has been extensively evaluated withrespect to adult brain images (Klein et al., 2009; Avants et al., 2011;Christensen et al., 2006; Diez et al., 2013; Hellier et al., 2001; Kleinet al., 2010; Ou et al., 2014c; West et al., 1997; Yassa and Stark, 2009);and occasionally in pediatric structural images (4–11 years of age)(Ghosh et al., 2010). However, no comparable evaluation was complet-ed for how they would perform in pediatric brain ADC maps.

We performed a total of 2280 inter-subject registrations to evaluatethese 4 algorithms in 3 different scenarios. In the first scenario, the twopediatric ADC maps to be registered were both from the MGH_3Tdataset (within-dataset registrations). We performed all possible pair-wise registrations among the 15 ADC maps in this collection, resultingin 210 (=15 × 14) registrations for each algorithm. In the secondscenario, the two ADC maps to be registered were from datasets ac-quired in different scanner platforms (across-scanner-platform regis-trations). We registered each of the 15 ADC maps in our atlas library

4 ANTs: http://stnava.github.io/ANTs/.5 Demons: http://www.insight-journal.org/browse/publication/154.6 FNIRT: http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FNIRT.7 DRAMMS: http://www.cbica.upenn.edu/sbia/software/dramms/.

Page 5: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 3. The library of 15 ADC atlases with manual annotations of brain masks outlined by red contours. ADC atlases are 3D images, but for display purpose, this figure shows only a singlesagittal slice. Ages are also shown.

250 Y. Ou et al. / NeuroImage 122 (2015) 246–261

(all from the MGH_3T dataset) to each of the 10 ADC maps in theMGH_1.5T dataset. This resulted in 150 pair-wise registrations foreach algorithm. In the third scenario, the two ADCmaps to be registeredwere from datasets acquired at different institutions (across-institutionregistrations). We registered each of the 15 ADC maps in our atlas li-brary to each of the 14 ADC maps in the BCH_3T dataset. This resultedin 210 pair-wise registrations for each algorithm. Therefore, there are210+150+210 registrations per each of the 4 registration algorithms,totaling at 2280 registrations.

The deformation fields that were computed when registering twopediatric ADC maps were used to resample/deform the correspondingmanually-annotated brain masks into a common image space. Usingthese, we evaluated registration accuracy by computing the Dice coeffi-cient (Dice, 1945) between the deformed brain mask from the atlas tothe target space and the brainmask of the target image. The registrationalgorithm with a higher Dice coefficient score was regarded as beingbetter suited for the brain extraction task in our framework. Additional-ly, we also computed the Hausdorff Distance (Huttenlocher et al., 1993)between the deformed manually-annotated brain masks. Since theDice coefficients and Hausdorff Distances agreed in their accuracy rank-ings, we only report the results of the Dice coefficients in this paper. LetP be themanually-annotated atlasmask deformed into the target space,and Q the manually-annotated brain mask in the target space. The Dicecoefficient between them is defined as

DICE P; Qð Þ ¼ 2 � V P ∩Qð ÞV Pð Þ þ V Qð Þ ; ð1Þ

where V(⋅) is the volume of a region.

Optimizing the atlas ranking component

Atlases with inferior registrationwill cause poor inference regardingthe target mask. To optimize the atlas ranking component in our frame-work, we propose a set of novel atlas ranking criteria, and comparethem with respect to several existing ones.

Atlas ranking is a much studied problem. It is not feasible to directlymeasure and rank how accurately an atlas can infer the segmentation of

the target image, since the ground truth for target segmentation is un-known (and to be computed). Existing studies instead rely on atlasranking criteria that consist of demographic information (e.g., age, gen-der, race, etc.), and/or atlas-to-target image similaritymeasures (Aljabaret al., 2009; Bai et al., 2013; Klein et al., 2008; Langerak et al., 2013;Leung et al., 2011; Lötjönen et al., 2010). The assumption is that thoseatlases that can better infer the target segmentation originate from sub-jects having similar demographic information to the target subject, orthe ones having higher atlas-to-target image similarities. This assump-tion, however, was recently found to be biased (Duc et al., 2013;Sanroma et al., 2014; Wolz et al., 2010; Yang et al., 2013). The funda-mental problem is that intensity-based similarity metrics do not faith-fully reflect registration accuracy (Klein et al., 2009; Avants et al.,2011; Ou et al., 2012c; Ou et al., 2014b; Ou et al., 2014c; Rohlfing,2012), and hence do not necessarily imply the accuracy of the inferredsegmentation. To address this issue, new ranking criteria have been pro-posed. For example, (Sanroma et al., 2014) suggested a supervised ma-chine learning approach, assuming that the utility of an atlas can beimplied and learnt by image appearances and a more comprehensiveset of registration properties other than purely image similarity. This su-pervised training process may have to be re-established for each newdataset to be analyzed, especially when the new datasets are from dif-ferent institutions or scanners of different platforms. Thus, although itimproves the atlas ranking accuracy, it is not an optimal choice in ourframework.

For the generality of our framework, we instead propose severalsimple and unsupervised atlas ranking criteria. Whether an atlas is use-ful in deriving the brain mask of a target subject is determined by howaccurate the atlas-to-target registration is. So, ranking atlases translatesto measuring and ranking the accuracy of registering different atlasesonto the same target. We thus measure registration accuracy by:(a) the Dice overlap coefficient between the deformed atlas brainmask in the target space and the tentative target brain mask (TTM).Here TTM is obtained by fusing the atlas-impliedmasks from all atlases,which we assume to be a good approximation of the unknown brainmask in the target image space; (b) intensity similarity (MI, CC) be-tween the deformed atlas and the target ADCmaps. Since intensity sim-ilarity alone is not necessarily an accurate indicator for registration

Page 6: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

8 NiftySeg: http://sourceforge.net/projects/niftyseg/.9 BET: http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/BET.

10 BSE: http://brainsuite.org/processing/surfaceextraction/bse/.

251Y. Ou et al. / NeuroImage 122 (2015) 246–261

performance (Klein et al., 2009; Avants et al., 2011; Ou et al., 2012c; Ouet al., 2014b; Ou et al., 2014c; Rohlfing, 2012),we used the overlapmea-surement alone, or the overlapmeasurementwith the intensity similar-ity as in the three proposed ranking criteria below:

EstDice ¼ DICE hn Lnð Þ; LTTTM

� �: ð2Þ

EstDice�CCdef ¼ DICE hn Lnð Þ; LTTTM

� �� CC hn Inð Þ; ITð Þ ð3Þ

EstDice�MIde f ¼ DICE hn Lnð Þ; LTTTM

� ��MI hn Inð Þ; ITð Þ ð4Þ

CCdef andMIdef refer to the correlation coefficient CC(.,.) andmutu-al information MI(.,.) image similarity metrics between the registeredADC maps hn(In) and IT. EstDice is the Dice coefficient between theatlas-implied brain mask hn(Ln) and TTM LT

TTM = FUSE({hn(Ln)}Nn = 1),where we used the STAPLE (Simultaneous Truth And PerformanceLevel Estimation) label fusion tool (Warfield et al., 2004) to implementthe FUSE() function (more in Sections 3.6 and 4.1.3). Please note that,the novelty in Eq. (2) is the use of tentative template mask (LTTTM),which we hypothesize and will later show is a good approximation ofthe true yet unknown brain mask.

We evaluated the performance of the above three atlas rankingcriteria by comparing them with four conventional image-similarity-based criteria listed below (Aljabar et al., 2009; Bai et al., 2013; Kleinet al., 2008; Langerak et al., 2013; Leung et al., 2011; Lötjönen et al.,2010).

CCaff ¼ CC an Inð Þ; ITð Þ; ð5Þ

MIaff ¼ MI an Inð Þ; ITð Þ; ð6Þ

CCdef ¼ CC hn Inð Þ; ITð Þ; ð7Þ

MIde f ¼ MI hn Inð Þ; ITð Þ: ð8Þ

In the above, an(In) and hn(In) are the transformed atlas ADC maps,by affine (a(⋅)) and deformable (h(⋅)) registration, respectively. Foraffine registration, we used FLIRT (Jenkinson and Smith, 2001); fordeformable registration, we used the one that empirically performedbest in our study (see Sections 3.3 and 4.1.1).

To quantify the atlas ranking accuracy, we measured the errorsbetween the true and estimated ranks. We determined the true rank(RTruen, T) of an atlas (An)with regard to target T based on the Dice coef-ficient between the atlas-implied brain mask and the expert-annotatedtarget brain mask. We determined the estimated rank (REstn, T) basedon the criteria in Eqs. (2)–(8). The average ranking error (ARET) withrespect to target T is defined as:

ARET ¼ 1N∑N

n¼1 REstn; T−RTrue

n; T

� �: ð9Þ

Optimizing the atlas selection component

Selectingmultiple atlases (e.g., Han et al., 2008; Rohlfing et al., 2004)usually leads to higher segmentation accuracy than selecting merely asingle top ranked atlas (e.g., Martin et al., 2010; Park et al., 2003). Weevaluated two strategies with various parameter settings. One strategywas to keep K (a user-specified number) top-ranked atlases (Aljabaret al., 2009; Leung et al., 2011):

keep atlas Αn if REstn;T≤K: ð10Þ

In our experiments, we investigated how the final brain extractionaccuracy changes as K varies.

The other strategy was to keep those atlases whose values, as mea-sured by the objective function in a certain atlas ranking criteria, wereabove a user-specified percentage λ of the maximum value among allatlases (Klein et al., 2008). We call this λ a threshold ratio. The valueof an atlas Ai with regard to a target T as measured by the cost functionC in a certain atlas ranking criterion (Eqs. (2)–(8)) is denoted by VT

C(Ai).Then this second set of atlas selection is defined as

keep atlas An if VCT An≥λ max

i¼1;…;NVCTAi: ð11Þ

In our experiments, we studied how the final brain extraction accu-racy changes as λ varies.

Optimizing the label fusion component

We evaluated 3 different label fusion methods: majority voting(MV), STAPLE (Warfield et al., 2004), and Shape-Based Averaging(SBA) (Rohlfing and Maurer, 2007). We used their implementations asprovided in the NiftySeg8 software package (Cardoso et al., 2011). In-cluding them for comparison was based on their popularity and theirpublic availability. Other label fusion algorithms to be considered inthe future, including, but are not limited to, SIMPLE (Langerak et al.,2010), COLLATE (Asman and Landman, 2011), improved STAPLE(Commowick et al., 2012), STEPS (Cardoso et al., 2013), etc.

Comparison with other algorithms

There are no brain extraction methods in the literature that arespecifically designed for pediatric ADC maps. Consequently, we com-pared our approach to methods such as BET (Smith, 2002) and BSE(Shattuck and Leahy, 2002), which are widely used methods, primarilydesigned for adult structural T1-weighted MR images. Below we pres-ent how we optimized BET and BSE for pediatric ADC maps.

BET9

The brain extraction tool (BET) is part of the FSL toolbox. It evolves adeformablemodel to the brain's surface, based on local intensity profilesand subject to surface smoothness constraints. It is widely used becauseof its public availability, its simplicity of use, its speed (seconds) andthe overall good quality of the results it generates. Unlike other brainextraction methods that only apply to T1-weighted structural images,BET can be also used with T2-weighted images, B0 images (Acosta-Cabronero et al., 2010; Focke et al., 2008; Parker et al., 2005; Xu et al.,2007), and even on FA images (Zhang et al., 2010) with certain param-eter adjustments.

We adjusted BET parameters for pediatric ADC maps based exclu-sively on the MGH_3T dataset, in the same way as we did for BSE andour own method. In particular, we ran BET multiple times by varyingtwo key parameters—the fractional intensity threshold and the verticalgradient. The former implicitly determines the brain size andwhere theboundary should be located. It is controlled by the “-f” option (default:0.5). We varied this parameter from 0.2 to 0.8, with an increment of0.05. The latter parameter applies a vertical gradient change of the in-tensity threshold as specified by the “-g” option (default: 0). We variedthis parameter from −0.2 to 0.2, with an increment of 0.05. Fig. 4(a)shows how the brain extraction accuracy changes as these two param-eter values change. The highest accuracy occurs with the two parame-ters set at f = 0.45 and g = 0, which we will use in this paper.

Page 7: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 4.Visualization of brain extraction accuracy changes as parameters change. This set of results is based on theMGH_3T dataset. In each panel, an orange arrowpoints out the location ofthe maximum accuracy.

252 Y. Ou et al. / NeuroImage 122 (2015) 246–261

BSE10

The brain surface extractor (BSE) is a part of the BrainSuite softwarepackage. It conducts a series of anisotropic smoothing, edge detectionandmorphological operations. It is easy to run and finishes fast (typical-ly less than 1 min). It is designed for T1-weighted structural imagesonly. Shi et al. (2011) reported that BSE with carefully adjusted param-eter values could be used for pediatric structural images. Therefore weincluded BSE to test the hypothesis that BSE with adjusted parametersmay improve its accuracy in pediatric ADC maps.

We ran BSE on theMGH_3T dataset multiple times by varying threekey parameters—the diffusion constant, the edge detection constantand the erosion size. The first parameter determines the strength ofthe edges to be kept. It is controlled by the “-d” option (default: 25).We varied it from 10 to 40, with an increment of 5. The second param-eter influences howwide the edgemust be in order to be identified. It iscontrolled by the “-s” option (default: 0.62). We varied it from 0.42 to0.82, with an increment of 0.04. The third onemodulates how far to ex-tend the edge voxels. It is controlled by the “-r” option (default: 1). Wetried to set it to values 1 and 2, and found that 2 usually gives better re-sults. Fig. 4(b) shows how the brain extraction accuracy changes asthese parameters change. The highest accuracy occurs with d = 40,s = 0.54 and r = 2, which we will use in this paper.

Results

In this section, we first present performance evaluation results foreach component in our framework. Then we demonstrate the overallperformance, both qualitatively and quantitatively. The third set of re-sults corresponds to the application of our framework to the PD308population of a normative cohort, to characterize volumetric and diffu-sion changes across the first 6 years of life.

Results for each component

It should be emphasized that the optimization of each component ofour framework was based exclusively on the MGH_3T dataset.

Results for optimizing the registration componentFig. 5 shows the comparison results among the 4 deformable regis-

tration algorithms. Within-dataset (MGH_3T) registration results inFig. 5(a) show that DRAMMS had the highest mean Dice coefficient(0.92) and the smallest standarddeviation (0.06). ANTs followed closely(mean Dice 0.89) with a similar standard deviation (0.06). Demons hada good mean Dice (0.84), but showed larger standard deviation (0.16).Perhaps this outcome can be best explained by the fact that this algo-rithm uses intensity difference as image similarity, which had beenknown to be more vulnerable to variations in image contrasts and

anatomies (Klein et al., 2009; Diez et al., 2013; Ou et al., 2012c; Ouet al., 2014b; Ou et al., 2014c; Sotiras et al., 2013; Sotiras et al., 2014).

Next we investigate how the registration algorithm of our choice,DRAMMS, performed in more challenging across-scanner-platformand across-institution registration tasks. ANTs and FNIRT showed ro-bustness registering multi-platform data (Fig. 5(b)). Conversely, theyobtained lower Dice coefficients when registering multi-institutiondata (mean Dice down from 0.89 to 0.82 for ANTs, and from 0.66to 0.56 for FNIRT, Fig. 5(c)). Demons showed robustness registeringmulti-institution data (Fig. 5(c)). Nevertheless, it had decreased Dicecoefficients when registering multi-platform data (mean Dice downfrom 0.84 to 0.73, Fig. 5(b)). In contrast, DRAMMS performed consis-tently well with a high mean Dice (0.91–0.92) and a low standard devi-ation (0.04–0.06), regardless of differences in platform or institution(Figs. 5(a–c)). A summary of these results is provided in Fig. 5(d), andthe difference between DRAMMS and others in our datasets are statisti-cally significant (p b 0.001). The performance fromDRAMMSprovided afoundation for our framework's generality, accuracy and consistencywhen facing data from different sources and subjects of different ages.

Results for optimizing the atlas ranking componentTable A in Appendix A shows an example of how to calculate the av-

erage ranking errors (AREs, defined in Eq. (9)) for different atlas rankingcriteria, with regard to a randomly selected target. Fig. 6(a) shows thedistribution of AREs using all possible targets in the MGH_3T dataset.The results indicated that the criterion best indicative of an atlas' infer-ence ability was the estimated Dice coefficient with the TTM (Eq. (2),EstDice). It led to an ARE less than 2. In contrast, the conventionalimage similarity based atlas ranking criterion led to relatively largerAREs at around 3 and larger variations. The same finding was also truein two other datasets. For example, Fig. 6(b) shows the same trend inthe MGH_1.5T dataset. The findings here support our hypothesis fromSection 3.4, that overlap-based metrics are more faithful indicators forthe accuracy of atlas-based segmentation than image similarity metrics.

Results for optimizing the atlas selection and label fusion componentsWe aimed to select a combination of atlas selection and label fusion

strategies balancing the following conditions: (1) high accuracy, interms of the final brain extraction; (2) consistent accuracy in a dataset,(i.e., small standard deviation); (3) consistent accuracywhen the valuesof the atlas selection strategy slightly changed; and (4) computationalefficiency (i.e., using as few atlases as possible, or a threshold ratio asbig as possible).

Fig. 7(a) demonstrates results of leave-one-out experimentsusing the MGH_3T dataset. For label fusion, STAPLE and SBA per-formed almost equally well, and significantly better than MV. Foratlas selection, parameter values had a big impact on the final brainextraction accuracy. Overall, two combinations of atlas selection

Page 8: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 5. The Box-and-Whisker plots of brain mask Dice coefficients for four registration algorithms: FNIRT, Demons, ANTs and DRAMMS. P-values are also provided to show whether theperformances of registration algorithms are significantly different.

253Y. Ou et al. / NeuroImage 122 (2015) 246–261

and label fusion strategies performed better than others. One was tokeep the top 5 ranked atlases, followed by the STAPLE label fusion(Dice coefficients at 0.958 ± 0.012, the highest in Fig. 7(a) leftpanel); the other was to keep those atlases with which the estimatedDice coefficients (EstDice) were above 98% of the maximum EstDiceover all atlases, followed by the STAPLE label fusion (Dice coefficientsat 0.958 ± 0.011, the highest in Fig. 7(b) right panel). The accuraciesof the two combinations were statistically equivalent (p = 0.97 in atwo-sample t-test). Since their difference was statistically insignifi-cant, for simplicity and efficiency in communication we chose topresent only one, in this case, STAPLE, just because its accuracy wasslightly (only slightly) higher than SBA, although statistically not sig-nificantly. By this atlas selection strategy, we recorded that 7–9atlases were selected on average, in the leave-one-out tests in theMGH_3T dataset.

The chosen atlas selection and label fusion strategies also performedwell in two other datasets. For example, Fig. 7(b) shows the results fromtheMGH_1.5T dataset. The chosen strategies led to a highmeanDice co-efficient (0.929), a stable performance in multiple subjects in thisdataset (small standard deviation 0.023), and a stable performancewhen the threshold ratio for atlas selection slightly changed (meanand standard deviation of the Dice coefficients remained almost thesame).

Results of the overall framework

The previous section presented quantitative comparison resultsaiming to optimize each component in our framework. To summarize,our optimized framework started with a fixed set of 15 pediatric atlases(MGH_3T), transferred the brain mask annotations into the targetimage space by the DRAMMS deformable registration, then rankedand selected atlases by the proposed Dice-coefficient-based ranking cri-terion and a thresholding-based atlas selection strategy, and finallyfused the selected atlases by STAPLE. The output is the final brainmask for the target image.

We tested how the optimized framework performed qualitatively(Section 4.2.1) and quantitatively (Section 4.2.2), recorded computationtimes (Section 4.2.3) and examined whether the brain extraction accu-racy was consistent for subjects at different neuro-developmentalstages (Section 4.2.4).

Qualitative resultsFig. 8 displays the brain extraction results in all three datasets as

compared with expert annotations. In each dataset, we show resultsfor subjects having the highest, medium and lowest brain extractionaccuracies. Except for one subject in the MGH_1.5T dataset having abrain extraction accuracy at 0.88, all subjects in all three datasets had

Page 9: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 6. The distribution of AREs for various atlas ranking criteria in (a) MGH_3T and (b) MGH_1.5T datasets. In each figure, the columns from left to right correspond to criteria inEqs. (2)–(8), respectively.

254 Y. Ou et al. / NeuroImage 122 (2015) 246–261

accuracies between 0.9 and 0.972. Qualitatively, in Fig. 8, the resultsare consistently close to expert annotations, even though ADC mapsclearly demonstrate differences in contrast, signal-to-noise ratio,and anatomy.

Quantitative results and comparison with other approachesTable 3 compares the mean and standard deviation of accuracies by

BET, BSE, and our proposedmethod, asmeasured by the Dice coefficientbetween the computed and the manually-annotated brain masks. Theproposedmethod outperformed the other two approaches by a statisti-cally significant margin (p b 0.001). Its performance was accurate (highmean Dice coefficient values above 0.92 and even 0.95), consistent insubjects with different ages (very narrow standard deviation, 0.01–0.02), and consistent in datasets from different scanner platforms andinstitutions.

We also note that, compared to the single-atlas accuracy inFig. 5(d), the optimized multi-atlas pipeline not only improved themean Dice coefficient (from 0.91–0.92 to 0.93–0.96 in differentdatasets), but more importantly, significantly reduced the standarddeviation (from 0.04–0.06 to 0.01–0.02). This meant that our opti-mized multi-atlas framework significantly increased the desirableconsistency with respect to data sources and subject ages. Sincebrain extraction is often times the first step in automated imageanalysis pipelines, the increased accuracy and consistency of ourframework should have a significant impact and provide an im-proved initialization for subsequent steps (e.g., volume calculation,ADC calculation, tissue and structural segmentation, atlas construction,longitudinal registration).

Computational timeOur method took about 10–12 min or 15–20 min per pediatric ADC

map of image size 128 × 128 × 70 or 256 × 256 × 60. These times wererecorded on a high-performance computer cluster that has 127 comput-er nodes, each node having 8 CPUs and 56 GB of shared virtualmemory.The majority of time was spent in the atlas-to-target registration step.BET and BSE took about 5–10 s per case.

Consistency of accuracyBesides accuracy, we also investigated the consistency of the accura-

cy, with regard to data sources and subject ages. We plot the brain ex-traction of all subjects in all 3 data sources in Fig. 9. For BET and BSE(Figs. 9(a,b)), the brain extraction accuracies had larger variations forsubjects under the age of 2 years than for older subjects. Thus, age andthe associated structural maturation seem to be two dominant factorsof accuracy for BET and BSE.

For our framework (Fig. 9(c)), in contrast, brain extraction accuracyremained very consistent, regardless of the data sources (different panelsin Fig. 9) and the ages of the subjects (x axes in each panel). Thismight bedue to three reasons: 1) our atlas library had included atlases frommanysubjects in the first year (see Fig. 3); 2) we had used a deformableregistration algorithm that was relatively robust to variations in age andanatomy; and 3) we optimized each component in the framework.

Applications to neuro-development study of 308 pediatric ADC maps

We applied the proposed framework to the PD308 cohort. With theextracted ADC maps, we were able to characterize whole-brain volu-metric and diffusion changes across the first 6 years of life.

Page 10: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 7.Brain extraction accuracywhen atlas selection and label fusion parameters change, in (a)MGH_3T and (b)MGH_1.5Tdatasets. Panels in the left and right columns are based on atlasselection strategies in Eqs. (10) and (11).

255Y. Ou et al. / NeuroImage 122 (2015) 246–261

We first looked at the brain volume changes. We computed themean and standard deviation of brain volumes in the 10 age groups de-scribed in Table 2. Results are shown in Fig. 10(a). At birth, the brain hasan average volume of around 500 cc. The brain expands quickly in thefirst year, especially the first 6 months (steeply ascending curve in thisperiod). At the end of the first year, the brain volume almost doublesto 900–1000 cc. Then the brain continues to expand, but at a muchslower pace. At about 4 years of age, the brain volume reaches a plateau,and ranges between 1150 and 1450 cc, with the average at around1300 cc. This is almost 2.5 times as big as the volume at birth and isalso very close to the reported brain volume of young adults. All thesefindings agreed with those reported in the literature. For exampleLenroot and Giedd (2006) reported a mean brain volume of close to1000 cc at around 1 year of age and about 1200 cc at around 2 yearsof age (N = 511). Reiss et al. (1996) reported a mean brain volume ofaround 1200 cc for children 5–17 years of age (N = 21 male and 64female). (Ge et al., 2002) reported brain volume of 1367 ± 147 cc foradults (N = 32, ages 29–40).

We also report a preliminary estimate of how the average ADCvalues change across the first 6 years of life. When brain grows, neuro-nal myelination should restrict water diffusion, and hence decrease theADC values in white matter (WM) and graymatter (GM) (Dubois et al.,2014a; Guleria and Gross Kelly, 2014; Hüppi and Dubois, 2006). Tofocus on WM and GM regions, we removed CSF by thresholding theADC maps at a manually selected value for each image, usually around1800 μm2/s. This thresholdwas based on observations of the histogramsand prior knowledge from clinical literature (Helenius et al., 2002;Hüppi and Dubois, 2006; Morriss et al., 1999; Naganawa et al., 2003;Provenzale et al., 2010). Fig. 10(b) plots how the average ADC changesover time, after excluding the CSF tissue. The average ADC drops from1300 to 1100 μm2/s, or almost 15% in the first 6 months; and from

1300 to 1000 μm2/s, or almost 25% in 1 year. From 1–5 years of age,the decrease in the mean ADC values happens at a much slower pace,at less than 10% rate over the whole 4 years. After 5 years of age, thewhole-brain ADC value plateaus at 900–950 μm2/s, which is about thelevel that has been found in young adults (Bonekamp et al., 2007;Counsell et al., 2003; Engelbrecht et al., 2002; Helenius et al., 2002;Hüppi andDubois, 2006;Naganawa et al., 2003; Provenzale et al., 2010).

We want to emphasize that we are not showing new informationabout neuro-development that was previously unknown. Rather, theresults herein highlight the fact that our optimized framework hasreached a high level of accuracy and consistency even when applied tomedical images collected during the delivery of routine clinical care.These clinically collected images yield comparable valuable benchmarkinformation as expected from the research literature.

Discussion

Wepresented andoptimized amulti-atlas brain extraction frameworkspecifically built for pediatric ADCmaps beginning in infancy.Wedemon-strated its generality, accuracy and consistency in datasets from differentplatforms/institutions and for subjects of various ages. We also appliedthe proposed framework to study the volumetric and ADC changes in alarger population across the first 6 years of life. We showed that theoptimized framework could achieve the desirable specificity, generality,accuracy and consistency using a fixed set of atlases and a fixed set ofparameters. Because of these compelling properties, our framework wassuccessfully used in challenging clinically-acquired data and still obtainedinformation consistent with the literature (Fig. 10).

Diffusion imaging for infants and pediatric subjects often havedistortions arising from factors such as head size/shape, during-scanmotion and scanner eddy currents (Counsell et al., 2006; Doshi et al.,

Page 11: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 8. Examples of algorithm-computed (green) and expert-annotated (red) brain masks in different datasets.

256 Y. Ou et al. / NeuroImage 122 (2015) 246–261

2013b; Giménez et al., 2008). Distortion correction is a non-trivial anal-ysis task additionally requiring the use of all diffusion weighted images.We purposely chose not to correct for distortions and show that ourframework achieved high accuracy, consistency and robustness in ourexperiments in multi-institutional datasets. We believe that this ismainly because the DRAMMS registration method uses attribute-based matching which better deals with distortions than intensity-based registration algorithms. This advantage has been shown inFig. 5, and has also been reported in other registration comparativestudies (Ou et al., 2014b; Ou et al., 2014c). The independence from dis-tortion correction highlights the generality and robustness of our pro-posed framework.

Similarly, our final results in Table 3 and especially Fig. 9(c) showedthat, even though we have used a fixed set of atlases from subjectswhose ages were not necessarily matched to that of the target, ourfinal brain extraction accuracies remain consistently high in a very nar-row range (0.9–0.95 Dicewith expert annotation), regardless of the ageof the subject (Fig. 9(c)), whereas BSE/BET's brain extraction accuracies

Table 3Brain extraction accuracies (Dice coefficients) by BET, BSE and our proposed framework.The default parameters of BET and BSE are primarily for adult brain images and structuralMR images, so they are not necessarily directly applicable to pediatric ADC maps. Tuningthe parameters of BET and BSE to the MGH_3T pediatric ADC maps shows improvementbut does not perform as well as the proposed work. For how we optimized BET and BSEfor this task, please refer to Section 3.7.

MGH_3T MGH_1.5T BCH_3T

BET (default parameters) 0.795 ± 0.101 0.748 ± 0.029 0.853 ± 0.057BSE (default parameters) 0.334 ± 0.160 0.162 ± 0.110 0.154 ± 0.115BET (optimized in MGH_3T) 0.813 ± 0.117 0.835 ± 0.044 0.855 ± 0.059BSE (optimized in MGH_3T) 0.881 ± 0.035 0.867 ± 0.039 0.861 ± 0.047Ours (optimized in MGH_3T) 0.958 ± 0.011 0.929 ± 0.023 0.936 ± 0.020

were in a more variable, wider range, and their performance droppedwhen the subjects were under 2 years old, and even more when theywere 1 year old. Having a consistently high accuracy, independent of asubject's age, is a strength in our framework.

Webelieve that the resulting generality and consistency can be largelyattributed to the use of the DRAMMS registration algorithm that achievesconsistently high accuracy in atlas-to-target alignment. This was shownin Fig. 5 and also documented inOuet al. (2014c). Several studies have re-ported that attribute-based registration algorithms such asDRAMMSmayperform more stable than intensity-based registration algorithms whenanatomy varies greatly or when the image contrast is low, such in ourpediatric ADC maps. Nevertheless, conclusively demonstrating this, orunderstanding the exact role of various components (transformation/similarity/optimization) in various registration algorithms is anotherlarge-scale study (such as (Klein et al., 2009; Ou et al., 2014c)) thatwe be-lieve is outside the scope of this paper. Another reason for the generalityand consistency in our framework is in comparatively choosing propermeasurements in many components of the framework. Especially, theatlas ranking and atlas selection components can effectively reject atlasesthat fail to indicate a good brain mask.

There are several technical issues to be addressed in future studies.One issue is the optimization of the atlas library. We used 15 atlases,since existing studies (e.g., Isgum et al., 2009; Leung et al., 2011)showed that 9 to 15 atlases are usually sufficient to saturate thesegmentation accuracy. However, with amore advanced registration al-gorithm as used in this paper, our brain extraction accuracy seems tosaturate at around 7–9 atlases. We would like to better understandwhether fewer atlases could be used tomaintain the same level of accu-racy and consistency, and how to choose the optimal atlas library tobegin with. Another issue is the optimization of parameters in ourframework. We optimized the choice of methods in each componentof our framework. For the chosen methods, we used their default

Page 12: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

Fig. 9. Plots of brain extraction accuracies in different data sources and for subjects of different ages, to test the consistency of brain extraction algorithms. a) BET, b) BSE and c) ourmethod.

257Y. Ou et al. / NeuroImage 122 (2015) 246–261

parameters. Even though they performed at a high level, we would liketo examine whether another set of parameters could further improvethe overall accuracy and consistency. A third issue is that, althoughour visual inspection led us to believe that the margin between ouralgorithm (average Dice at 0.93–0.96 with expert annotations) andthe parameter-tuned BET/BSE (average Dice at 0.81–0.86 with expertannotations) should far exceed the intra-/inter-rater variability, ourfuture work would include 2 or more experts annotating the brainmasks in order to rigorously quantify intra-/inter-rater variability, orto employ unsupervised segmentation evaluation when the absolutegold-standard is absent (Hoppin et al., 2002; Lebenberg et al., 2012).Last, due to limited accessibility of intermediate results in BET andBSE, we compared only the final results with BET and BSE.

For neuro-development studies, the access to large-scale multi-institution/platform/vendor data is quickly becoming widespread, dueto advancements in data query tools such as the mi2b2 workbench

Fig. 10. Characterization of neuro-development in the PD308 dataset. (a) Mean and standard dwhole-brain ADC values (CSF excluded) in various age groups.

(Murphy et al., 2014) and the availability of large-scale databases suchas the NIH Pediatric MRI Data Repository (NIH-PD) (Evans, 2006).Large-scale data help to better model the normal range of variation,which is important for accurate identification of structural brain abnor-malities (Liauw et al., 2009; Volkher et al., 2002). While many brainextraction algorithms that were developed for adult structural T1-weighted images have difficulties in large-scale multi-site structuraldata (Fennema-Notestine et al., 2006; Lee et al., 2003), our frameworkshows consistently high accuracy in different datasets. In the future,we will further test our framework's performance in data from differ-ent vendors (GE, Philips). We also plan to repeat the experiments inSection 4.3 (whole-brain ADC and volumetric changes) for imagesacquired from 1.5T scanners, and compare the results with the onesfrom 3T scanners.

Neuro-development studies based on automatedMR image analysis,while more often based on T1-weighted structural images (Fonov et al.,

eviation of whole-brain volume in various age groups. (b) Mean and standard deviation of

Page 13: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

258 Y. Ou et al. / NeuroImage 122 (2015) 246–261

2011; Lenroot and Giedd, 2006; Shi et al., 2011), are increasingly usingpediatric diffusion weighted images, either independently or as anadditional sequence (Geng et al., 2012a; Melbourne et al., 2013).Therefore, the development of our framework is timely. In this paper,the automated analysis of N = 308 pediatric subjects has enabled usto plot volumetric and ADC changes in the first 6 years of life. In thefuture, we plan to construct age-specific atlases to better quantify themean shape and mean ADC values at the regional level. This will allowthe development of abnormality detection tools. These are essentialfirst steps toward the application of medical informatics approaches todata mining of clinical images to assist in computerized diagnosis forpediatric neurological and neuropsychiatric diseases.

Table AAn example of how to calculate the average ranking error (ARE) of an atlas ranking criterion, wthe remaining 14 as the atlases.

To A15 Truth Proposed 1EstDice(Eq. (2))

Proposed 2EstDice ×CCdef(Eq. (3))

Proposed 3EstDice ×MIdef(Eq. (4))

True Dice True Rank Value Rank Value Rank Value R

A2 .949 1 .953 3 .534 13 .261 1A3 .943 2 .959 1 .609 11 .276A10 .934 3 .941 4 .606 12 .263 1A9 .932 4 .938 8 .613 7 .266A4 .930 5 .930 10 .616 5 .270A7 .929 6 .939 6 .614 6 .270A8 .928 7 .938 7 .631 1 .286A6 .928 8 .922 13 .617 4 .272A1 .927 9 .958 2 .475 14 .201 1A5 .917 10 .930 11 .613 9 .264 1A14 .913 11 .939 5 .621 3 .274A12 .912 12 .929 12 .613 8 .265 1A11 .910 13 .935 9 .628 2 .279A15 .866 14 .915 14 .611 10 .268ARE 0 2.57 3.86 3.86

Acknowledgments

The authors would like to thank Katie Murphy for her extensiveand careful reviewing of the radiological reports and patient records,which guaranteed that the pediatric subjects in the PD308 cohort wereall normative at the time of the scan. We also thank other members ofthe mi2b2 team, for the support of RPDR and the mi2b2 workbench,which we used to query and retrieve radiological data for this study:Christopher Herrick, Yanbing Wang, Taowei David Wang, DarrenSack, and Katherine P. Andriole. The authors would also like toacknowledge Instrumentation Grants 1S10RR023401, 1S10RR019307,and 1S10RR023043.

Appendix A. Measuring atlas ranking errors

This Appendix provides an example to explain how to calculate the average ranking error (ARE), with regard to a randomly chosen target ADCmap. In this example, a randomly chosen ADC map A13 from the MGH_3T dataset was used as the target image. The remaining 14 ADC maps wereused as atlases.

ith regard to a given target. We randomly chose the 13th atlas (A13) as the target, and used

Conventional 1CCaff (Eq. (5))

Conventional 2MIaff (Eq. (6))

Conventional 3CCdef (Eq. (7))

Conventional 4MIdef (Eq. (8))

ank Value Rank Value Rank Value Rank Value Rank

3 .639 13 .318 2 .609 5 .302 23 .639 12 .301 9 .613 2 .289 32 .641 11 .300 11 .604 10 .282 99 .643 9 .297 13 .603 11 .279 126 .649 5 .303 7 .604 9 .281 87 .645 7 .300 10 .606 6 .282 101 .652 2 .306 3 .611 3 .287 45 .649 6 .305 5 .598 12 .281 114 .661 1 .325 1 .633 1 .311 11 .651 4 .306 4 .605 7 .284 64 .644 8 .301 8 .605 8 .283 70 .642 10 .297 12 .597 13 .276 132 .652 3 .305 6 .610 4 .285 58 .637 14 .294 14 .583 14 .269 14

5.43 5 5.14 4.43

References

Acosta-Cabronero, Julio, Williams, Guy B., Pengas, George, Nestor, Peter J., 2010. Absolutediffusivities define the landscape of whitematter degeneration in Alzheimer's disease.Brain 133 (2), 529–539.

Aljabar, Paul, Heckemann, Rolf A., Hammers, Alexander, Hajnal, Joseph V., Rueckert,Daniel, 2009. Multi-atlas based segmentation of brain images: atlas selection and itseffect on accuracy. NeuroImage 46 (3), 726–738.

Andersson, J., Smith, S., Jenkinson, M., 2008. Fnirt-fmrib's non-linear image registrationtool. Hum. Brain Mapp. 15–19.

Arthurs, Owen J., Sury, Michael, 2013. Anaesthesia or sedation for paediatric MRI: advan-tages and disadvantages. Curr. Opin. Anesthesiol. 26 (4), 489–494.

Asman, Andrew J., Landman, Bennett A., 2011. Robust statistical label fusion through con-sensus level, labeler accuracy, and truth estimation (COLLATE). IEEE Trans. Med. Im-aging 30 (10), 1779–1794.

Avants, Brian B., Epstein, Charles L., Grossman, Murray, Gee, James C., 2008. Symmetricdiffeomorphic image registration with cross-correlation: evaluating automated label-ing of elderly and neurodegenerative brain. Med. Image Anal. 12 (1), 26–41.

Avants, Brian B., Tustison, Nicholas J., Song, Gang, Cook, Philip A., Klein, Arno, Gee, JamesC., 2011. A reproducible evaluation of ANTs similarity metric performance in brainimage registration. NeuroImage 54 (3), 2033–2044.

Bai, Wenjia, Shi, Wenzhe, O'Regan, Declan P., Tong, Tong, Wang, Haiyan, Jamil-Copley, Shahnaz, Peters, Nicholas S., Rueckert, Daniel, 2013. A probabilisticpatch-based label fusion model for multi-atlas segmentation with registrationrefinement: application to cardiac MR images. IEEE Trans. Med. Imaging1302–1315.

Barkovich, A. James, 2000. Concepts of myelin and myelination in neuroradiology. Am.J. Neuroradiol. 21 (6), 1099–1109.

Barkovich, A. James, Raybaud, Charles, 2012. Pediatric Neuroimaging. Lippincott Williams& Wilkins.

Bonekamp, David, Nagae, Lidia M., Degaonkar, Mahaveer, Matson, Melissa, Abdalla, Wael,Barker, Peter B., Mori, Susumu, Horská, Alena, 2007. Diffusion tensor imaging in chil-dren and adolescents: reproducibility, hemispheric, and age-related differences.NeuroImage 34 (2), 733–742.

Cabezas, Mariano, Oliver, Arnau, Lladó, Xavier, Freixenet, Jordi, Cuadra, Meritxell Bach,2011. A review of atlas-based segmentation for magnetic resonance brain images.Comput. Methods Prog. Biomed. 104 (3), e158–e177.

Cardoso, M. Jorge, Clarkson, Matthew J., Ridgway, Gerard R., Modat, Marc, Fox, Nick C.,Ourselin, Sebastien, 2011. LoAd: a locally adaptive cortical segmentation algorithm.NeuroImage 56 (3), 1386–1397.

Cardoso, M. Jorge, Leung, Kelvin, Modat, Marc, Keihaninejad, Shiva, Cash, David, Barnes,Josephine, Fox, Nick C., Ourselin, Sebastien, 2013. STEPS: Similarity and Truth

Page 14: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

259Y. Ou et al. / NeuroImage 122 (2015) 246–261

Estimation for Propagated Segmentations and its application to hippocampal seg-mentation and brain parcellation. Med. Image Anal. 17 (6), 671–684.

Chiverton, John, Wells, Kevin, Lewis, Emma, Chen, Chao, Podda, Barbara, Johnson, Declan,2007. Statistical morphological skull stripping of adult and infant MRI data. Comput.Biol. Med. 37 (3), 342–357.

Christensen, Gary E., Geng, Xiujuan, Kuhl, Jon G., Bruss, Joel, Grabowski, Thomas J.,Pirwani, Imran A., Vannier, Michael W., Allen, John S., Damasio, Hanna, 2006. Intro-duction to the non-rigid image registration evaluation project (NIREP). BiomedicalImage Registration. Springer, Berlin Heidelberg, pp. 128–135.

Commowick, Olivier, Akhondi-Asl, Alireza, Warfield, Simon K., 2012. Estimating a refer-ence standard segmentation with spatially varying performance parameters: localMAP STAPLE. IEEE Trans. Med. Imaging 31 (8), 1593–1606.

Counsell, Serena J., Allsop, Joanna M., Harrison, Michael C., Larkman, David J., Kennea,Nigel L., Kapellou, Olga, Cowan, Frances M., Hajnal, Joseph V., Edwards, A. David,Rutherford, Mary A., 2003. Diffusion-weighted imaging of the brain in preterm in-fants with focal and diffuse white matter abnormality. Pediatrics 112 (1), 1–7.

Counsell, Serena J., Shen, Yuji, Boardman, James P., Larkman, David J., Kapellou, Olga,Ward, Philip, Allsop, Joanna M., et al., 2006. Axial and radial diffusivity in preterm in-fants who have diffuse white matter changes on magnetic resonance imaging atterm-equivalent age. Pediatrics 117 (2), 376–386.

Dice, Lee R., 1945. Measures of the amount of ecologic association between species. Ecol-ogy 26 (3), 297–302.

Diez, Yago, Oliver, Arnau, Cabezas, Mariano, Valverde, Sergi, Martí, Robert, Vilanova, JoanCarles, Ramió-Torrentà, Lluís, Rovira, Àlex, Lladó, Xavier, 2013. Intensity basedmethods for brain MRI longitudinal registration. A study on multiple sclerosis pa-tients. Neuroinformatics 1–15.

Dogdas, Belma, Shattuck, David W., Leahy, Richard M., 2005. Segmentation of skull andscalp in 3-D human MRI using mathematical morphology. Hum. Brain Mapp. 26(4), 273–285.

Doshi, Jimit, Erus, Guray, Ou, Yangming, Gaonkar, Bilwaj, Davatzikos, Christos, 2013a.Multi-atlas skull-stripping. Acad. Radiol. 20 (12), 1566–1576.

Doshi, Jimit, Erus, Guray, Ou, Yangming, Davatzikos, Christos, 2013b. Ensemble-basedmedical image labeling via sampling morphological appearance manifolds. MICCAIChallenge Workshop on Segmentation. MICCAI, Nagoya, Japan.

Dubois, J., Dehaene-Lambertz, G., Kulikova, S., Poupon, C., Hüppi, P.S., Hertz-Pannier, L.,2014a. The early development of brain white matter: a review of imaging studiesin fetuses, newborns and infants. Neuroscience 276, 48–71.

Dubois, Jessica, Kulikova, Sofya, Hertz-Pannier, Lucie, Mangin, Jean-François, Dehaene-Lambertz, Ghislaine, Poupon, Cyril, 2014b. Correction strategy for diffusion-weighted images corrupted with motion: application to the DTI evaluation of infants'white matter. Magn. Reson. Imaging 32 (8), 981–992.

Duc, Albert K. Hoang, Modat, Marc, Leung, Kelvin K., Cardoso, M. Jorge, Barnes, Josephine,Kadir, Timor, Ourselin, Sébastien, Alzheimer's Disease Neuroimaging Initiative, 2013.Using manifold learning for atlas selection in multi-atlas segmentation. PLoS One 8(8), e70059. http://dx.doi.org/10.1371/journal.pone.0070059.

Edwards, Andrea D., Arthurs, Owen J., 2011. Paediatric MRI under sedation: is it neces-sary? What is the evidence for the alternatives? Pediatr. Radiol. 41 (11), 1353–1364.

Erus, Guray, Battapady, Harsha, Satterthwaite, Theodore D., Hakonarson, Hakon, Gur,Raquel E., Davatzikos, Christos, Gur, Ruben C., 2014. Imaging patterns of brain devel-opment and their relationship to cognition. Cereb. Cortex http://dx.doi.org/10.1093/cercor/bht425 (bht425).

Eskildsen, Simon F., Coupé, Pierrick, Fonov, Vladimir, Manjón, José V., Leung, Kelvin K.,Guizard, Nicolas, Wassef, Shafik N., Østergaard, Lasse Riis, Collins, D. Louis, 2012.BEaST: brain extraction based on nonlocal segmentation technique. NeuroImage 59(3), 2362–2373.

Evans, Alan C., 2006. The NIH MRI study of normal brain development. NeuroImage 30(1), 184–202.

Fennema-Notestine, Christine, Ozyurt, I. Burak, Clark, Camellia P., Morris, Shaunna,Bischoff-Grethe, Amanda, Bondi, MarkW., Jernigan, Terry L., et al., 2006. Quantitativeevaluation of automated skull-stripping methods applied to contemporary and lega-cy images: effects of diagnosis, bias correction, and slice location. Hum. Brain Mapp.27 (2), 99–113.

Fischl, Bruce, 2012. FreeSurfer. NeuroImage 62 (2), 774–781.Focke, Niels K., Yogarajah, Mahinda, Bonelli, Silvia B., Bartlett, Philippa A., Symms,Mark R.,

Duncan, John S., 2008. Voxel-based diffusion tensor imaging in patients with mesialtemporal lobe epilepsy and hippocampal sclerosis. NeuroImage 40 (2), 728–737.

Fonov, Vladimir, Evans, Alan C., Botteron, Kelly, Almli, C. Robert, McKinstry, Robert C.,Collins, D. Louis, 2011. Unbiased average age-appropriate atlases for pediatric studies.NeuroImage 54 (1), 313–327.

Galdames, Francisco J., Jaillet, Fabrice, Perez, Claudio A., 2012. An accurate skull strippingmethod based on simplex meshes and histogram analysis for magnetic resonanceimages. J. Neurosci. Methods 206 (2), 103–119.

Ge, Y., Grossman, R.I., Babb, J.S., Rabin, M.L., Mannon, L.J., Kolson, D.L., 2002. Age-relatedtotal gray matter and white matter changes in normal adult brain. Part I: volumetricMR imaging analysis. Am. J. Neuroradiol. 23 (8), 1327–1333.

Geng, Xiujuan, Gouttard, Sylvain, Sharma, Anuja, Gu, Hongbin, Styner, Martin, Lin, Weili,Gerig, Guido, Gilmore, John H., 2012a. Quantitative tract-based white matter devel-opment from birth to age 2 years. NeuroImage 61 (3), 542–557.

Geng, Xiujuan, Styner, Martin, Gupta, Aditya, Shen, Dinggang, Gilmore, John H., 2012b.Multi-contrast diffusion tensor image registration with structural MRI. BiomedicalImaging (ISBI), 2012 9th IEEE International Symposium on. IEEE, pp. 684–687.

Gerstner, Elizabeth R., Chen, Poe-Jou, Wen, Patrick Y., Jain, Rakesh K., Batchelor, Tracy T.,Sorensen, Gregory, 2010. Infiltrative patterns of glioblastoma spread detected via dif-fusion MRI after treatment with cediranib. Neuro-Oncology 12 (5), 466–472.

Ghosh, Satrajit S., Kakunoori, Sita, Augustinack, Jean, Nieto-Castanon, Alfonso, Kovelman,Ioulia, Gaab, Nadine, Christodoulou, Joanna A., Triantafyllou, Christina, Gabrieli, John

D.E., Fischl, Bruce, 2010. Evaluating the validity of volume-based and surface-basedbrain image registration for developmental cognitive neuroscience studies in chil-dren 4 to 11 years of age. NeuroImage 53 (1), 85–93.

Giménez, Mónica, Miranda, Maria J., Born, A. Peter, Nagy, Zoltan, Rostrup, Egill, Jernigan,Terry L., 2008. Accelerated cerebral white matter development in preterm infants: avoxel-based morphometry study with diffusion tensor MR imaging. NeuroImage 41(3), 728–734.

Grant, P. Ellen, He, Julian, Halpern, Elkan F., Wu, Ona, Schaefer, PamelaW., Schwamm, LeeH., Budzik, Ronald F., Sorensen, A. Gregory, Koroshetz, Walter J., Gonzalez, R. Gilberto,2001. Frequency and clinical context of decreased apparent diffusion coefficient re-versal in the human brain. Radiology 221 (1), 43–50.

Guggenberger, Roman, Nanz, Daniel, Bussmann, Lorenz, Chhabra, Avneesh, Fischer,Michael A., Hodler, Jürg, Pfirrmann, ChristianW.A., Andreisek, Gustav, 2013. Diffusiontensor imaging of the median nerve at 3.0T using different MR scanners: agreementof FA and ADC measurements. Eur. J. Radiol. 82 (10), e590–e596.

Guleria, Saurabh, Kelly, Teresa Gross, 2014. Myelin, myelination, and corresponding mag-netic resonance imaging changes. Radiol. Clin. N. Am. 52 (2), 227–239.

Hahn, Horst K., Peitgen, Heinz-Otto, 2000. The skull stripping problem in MRI solved by asingle 3D watershed transform. Medical Image Computing and Computer-AssistedIntervention—MICCAI 2000. Springer, Berlin Heidelberg, pp. 134–143.

Hameeteman, K., Zuluaga, Maria A., Freiman, Moti, Joskowicz, Leo, Cuisenaire, Olivier,Valencia, L. Flórez, Gülsün, Mehmet Akif, et al., 2011. Evaluation framework for carot-id bifurcation lumen segmentation and stenosis grading. Med. Image Anal. 15 (4),477–488.

Han, Xiao, Hoogeman, Mischa S., Levendag, Peter C., Hibbard, Lyndon S., Teguh, David N.,Voet, Peter, Cowen, AndrewC.,Wolf, TheresaK., 2008. Atlas-based auto-segmentationof head and neck CT images. Medical Image Computing and Computer-AssistedIntervention—MICCAI 2008. Springer, Berlin Heidelberg, pp. 434–441.

Hasan, Khader M., Halphen, Christopher, Sankar, Ambika, Eluvathingal, Thomas J.,Kramer, Larry, Stuebing, Karla K., Ewing-Cobbs, Linda, Fletcher, Jack M., 2007. Diffu-sion tensor imaging-based tissue segmentation: validation and application to the de-veloping child and adolescent brain. NeuroImage 34 (4), 1497–1505.

Heckemann, Rolf A., Hajnal, Joseph V., Aljabar, Paul, Rueckert, Daniel, Hammers,Alexander, 2006. Automatic anatomical brain MRI segmentation combining labelpropagation and decision fusion. NeuroImage 33 (1), 115–126.

Helenius, Johanna, Soinne, Lauri, Perkiö, Jussi, Salonen, Oili, Kangasmäki,Aki, Kaste,Markku,Carano, Richard A.D., Aronen, Hannu J., Tatlisumak, Turgut, 2002. Diffusion-weightedMR imaging in normal human brains in various age groups. Am. J. Neuroradiol. 23(2), 194–199.

Hellier, P., Barillot, C., Corouge, I., Gibaud, B., Le Goualher, G., Collins, L., Evans, A.,Malandain, G., Ayache, N., 2001. Retrospective evaluation of inter-subject brain regis-tration. Medical Image Computing and Computer-Assisted Intervention—MICCAI2001. Springer, pp. 258–265.

Hoeksma, Marco R., Kenemans, J. Leon, Kemner, Chantal, van Engeland, Herman, 2005.Variability in spatial normalization of pediatric and adult brain images. Clin.Neurophysiol. 116 (5), 1188–1194.

Hoppin, JohnW., Kupinski, Matthew A., Kastis, George A., Clarkson, Eric, Barrett, HarrisonH., 2002. Objective comparison of quantitative imaging modalities without the use ofa gold standard. IEEE Trans. Med. Imaging 21 (5), 441–449.

Hüppi, Petra S., Dubois, Jessica, 2006. Diffusion tensor imaging of brain development.Seminars in Fetal and Neonatal Medicine vol. 11. WB Saunders, pp. 489–497 (no. 6).

Huttenlocher, Daniel P., Klanderman, Gregory A., Rucklidge, William J., 1993. Comparingimages using the Hausdorff distance. IEEE Trans. Pattern Anal. Mach. Intell. 15 (9),850–863.

Iglesias, Juan Eugenio, Liu, Cheng-Yi, Thompson, PaulM., Tu, Zhuowen, 2011. Robust brainextraction across datasets and comparison with publicly available methods. IEEETrans. Med. Imaging 30 (9), 1617–1634.

Iglesias, Juan Eugenio, Sabuncu, Mert Rory, Van Leemput, Koen, 2013. A unified frame-work for cross-modality multi-atlas segmentation of brain MRI. Med. Image Anal.17 (8), 1181–1191.

Irimia, Andrei, Chambers, Micah C., Alger, Jeffry R., Filippou, Maria, Prastawa, Marcel W.,Wang, Bo, Hovda, David A., et al., 2011. Comparison of acute and chronic traumaticbrain injury using semi-automatic multimodal segmentation of MR volumes.J. Neurotrauma 28 (11), 2287–2306.

Isgum, Ivana, Staring, Marius, Rutten, Annemarieke, Prokop, Mathias, Viergever, Max A.,van Ginneken, Bram, 2009. Multi-atlas-based segmentation with local decisionfusion—application to cardiac and aortic segmentation in CT scans. IEEE Trans. Med.Imaging 28 (7), 1000–1010.

Jenkinson, M., Smith, S., 2001. A global optimisation method for robust affine registrationof brain images. Medical image analysis 5 (2), 143–156.

Kazemi, K., Moghaddam, H.A., Grebe, R., Gondry-Jouet, C., Wallois, F., 2007. A neonatalatlas template for spatial normalization of whole-brain magnetic resonance imagesof newborns: preliminary results. Neuroimage 37 (2), 463–473.

Kirişli, H.A., Schaap, Michiel, Klein, Stefan, Papadopoulou, S.L., Bonardi, M., Chen, C.H.,Weustink, A.C., et al., 2010. Evaluation of a multi-atlas based method for segmenta-tion of cardiac CTA data: a large-scale, multicenter, and multivendor study. Med.Phys. 37 (12), 6279–6291.

Kıvrak, Ali Sami, Paksoy, Yahya, Erol, Cengiz, Koplay, Mustafa, Özbek, Seda, Kara, Fatih,2013. Comparison of apparent diffusion coefficient values among different MRI plat-forms: a multicenter phantom study. Diagn. Interv. Radiol. (Ank. Turk.) 19 (6), 433.

Klein, Stefan, van der Heide, Uulke A., Lips, Irene M., van Vulpen, Marco, Staring, Marius,Pluim, Josien P.W., 2008. Automatic segmentation of the prostate in 3DMR images byatlas matching using localized mutual information. Med. Phys. 35 (4), 1407–1417.

Klein, A., Andersson, J., Ardekani, B.A., Ashburner, J., Avants, B., Chiang, M.-G., Christensen,G.E., et al., 2009. Evaluation of 14 nonlinear deformation algorithms applied tohuman brain MRI registration. Neuroimage 46 (3), 786–802.

Page 15: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

260 Y. Ou et al. / NeuroImage 122 (2015) 246–261

Klein, A., Ghosh, S.S., Avants, B., Yeo, B.T., Fischl, B., Ardekani, B., Gee, J.C., Mann, J.J., Parsey,R.V., 2010. Evaluation of volume-based and surface-based brain image registrationmethods. NeuroImage 51 (1), 214–220.

Knickmeyer, R.C., Gouttard, S., Kang, C., Evans, D., Wilber, K., Smith, J.K., Hamer, R.M., Lin,W., Gerig, G., Gilmore, J.H., 2008. A structural MRI study of human brain developmentfrom birth to 2 years. J. Neurosci. 28 (47), 12176–12182.

Kuklisova-Murgasova, Maria, Quaghebeur, Gerardine, Rutherford, Mary A., Hajnal, JosephV., Schnabel, Julia A., 2012. Reconstruction of fetal brain MRI with intensity matchingand complete outlier removal. Med. Image Anal. 16 (8), 1550–1564.

Kulikova, S., Hertz-Pannier, L., Dehaene-Lambertz, G., Buzmakov, A., Poupon, C., Dubois, J.,2014. Multi-parametric evaluation of the white matter maturation. Brain Struct.Funct. 1–16.

Langerak, Thomas Robin, van der Heide, Uulke A., Kotte, Alexis N.T.J., Viergever, Max A.,van Vulpen, Marco, Pluim, Josien P.W., 2010. Label fusion in atlas-based segmentationusing a selective and iterative method for performance level estimation (SIMPLE).IEEE Trans. Med. Imaging 29 (12), 2000–2008.

Langerak, Thomas R., Berendsen, Floris F., Van der Heide, Uulke A., Kotte, Alexis N.T.J.,Pluim, Josien P.W., 2013. Multiatlas-based segmentation with preregistration atlasselection. Med. Phys. 40 (9), 091701.

Le Bihan, Denis, Breton, Eric, Lallemand, Denis, Grenier, Philippe, Cabanis,Emmanuel, Laval-Jeantet, Maurice, 1986. MR imaging of intravoxel incoherentmotions: application to diffusion and perfusion in neurologic disorders. Radiolo-gy 161 (2), 401–407.

Lebenberg, Jessica, Buvat, Irène, Lalande, Alain, Clarysse, Patrick, Casta, Christopher,Cochet, Alexandre, Constantinidés, Constantin, et al., 2012. Nonsupervised rankingof different segmentation approaches: application to the estimation of the left ven-tricular ejection fraction from cardiac cine MRI sequences. IEEE Trans. Med. Imaging31 (8), 1651–1660.

Lee, Jong-Min, Yoon, Uicheul, Hee Nam, Sang, Kim, Jung-Hyun, Kim, In-Young, Kim, Sun I.,2003. Evaluation of automated and semi-automated skull-stripping algorithms usingsimilarity index and segmentation error. Comput. Biol. Med. 33 (6), 495–507.

Lenroot, Rhoshel K., Giedd, Jay N., 2006. Brain development in children and adolescents:insights from anatomical magnetic resonance imaging. Neurosci. Biobehav. Rev. 30(6), 718–729.

Leroy, François, Mangin, Jean-François, Rousseau, François, Glasel, Hervé, Hertz-Pannier,L., Dubois, Jessica, Dehaene-Lambertz, Ghislaine, 2011. Atlas-free surface reconstruc-tion of the cortical grey–white interface in infants. PLoS One 6 (11), e27128.

Leung, Kelvin K., Barnes, Josephine, Modat, Marc, Ridgway, Gerard R., Bartlett, JonathanW., Fox, Nick C., Ourselin, Sébastien, 2011. Brain MAPS: an automated, accurate androbust brain extraction technique using a template library. NeuroImage 55 (3),1091–1108.

Liauw, L., van Wezel-Meijler, G., Veen, S., Van Buchem, M.A., Van der Grond, J., 2009. Doapparent diffusion coefficient measurements predict outcome in children with neo-natal hypoxic–ischemic encephalopathy. Am. J. Neuroradiol. 30 (2), 264–270.

Liu, Tianming, Li, Hai, Wong, Kelvin, Tarokh, Ashley, Guo, Lei, Wong, Stephen T.C., 2007.Brain tissue segmentation based on DTI data. NeuroImage 38 (1), 114–123.

Lötjönen, Jyrki M.P., Wolz, Robin, Koikkalainen, Juha R., Thurfjell, Lennart, Waldemar,Gunhild, Soininen, Hilkka, Rueckert, Daniel, 2010. Fast and robust multi-atlas seg-mentation of brain magnetic resonance images. NeuroImage 49 (3), 2352–2365.

Mahapatra, Dwarikanath, 2012. Skull stripping of neonatal brain MRI: using prior shapeinformation with graph cuts. J. Digit. Imaging 25 (6), 802–814.

Malinsky, Milos, Peter, Roman, Hodneland, Erlend, Lundervold, Astri J., Lundervold, Arvid,Jan, Jiri, 2013. Registration of FA and T1-weighted MRI data of healthy human brainbased on template matching and normalized cross-correlation. J. Digit. Imaging 26(4), 774–785.

Malyarenko, Dariya, Galbán, Craig J., Londy, Frank J., Meyer, Charles R., Johnson, TimothyD., Rehemtulla, Alnawaz, Ross, Brian D., Chenevert, Thomas L., 2013. Multi-system re-peatability and reproducibility of apparent diffusion coefficient measurement usingan ice-water phantom. J. Magn. Reson. Imaging 37 (5), 1238–1246.

Martin, Sébastien, Troccaz, Jocelyne, Daanen, Vincent, 2010. Automated segmentation ofthe prostate in 3D MR images using a probabilistic atlas and a spatially constraineddeformable model. Med. Phys. 37 (4), 1579–1590.

McKinstry, Robert C., 2011. Advances in pediatric diffusion tensor imaging. Pediatr.Radiol. 41, 137–138.

Melbourne, Andrew, Eaton-Rosen, Zach, Bainbridge, Alan, Kendall, Giles S., Cardoso,Manuel Jorge, Robertson, Nicola J., Marlow, Neil, Ourselin, Sebastien, 2013. Measure-ment of myelin in the preterm brain: multi-compartment diffusion imaging andmulti-component T2 relaxometry. Medical Image Computing and Computer-Assisted Intervention—MICCAI 2013. Springer, Berlin Heidelberg, pp. 336–344.

Menze, Bjoern, Jakab, Andras, Bauer, Stefan, Kalpathy-Cramer, Jayashree, Farahani,Keyvan, Kirby, Justin, Burren, Yuliya, et al., 2014. The Multimodal Brain TumorImage Segmentation Benchmark (BRATS). IEEE Trans. Med. Imaging, http://dx.doi.org/10.1109/TMI.2014.2377694 (in press).

Mori, Susumu, Oishi, Kenichi, Faria, Andreia V., Miller, Michael I., 2013. Atlas-basedneuroinformatics via mri: harnessing information from past clinical cases and quan-titative image analysis for patient care. Annu. Rev. Biomed. Eng. 15, 71–92.

Morriss, M.C., Zimmerman, R.A., Bilaniuk, L.T., Hunter, J.V., Haselgrove, J.C., 1999. Changesin brain water diffusion during childhood. Neuroradiology 41 (12), 929–934.

Murphy, Shawn N., Herrick, Christopher, Wang, Yanbing, David Wang, Taowei, Sack,Darren, Andriole, Katherine P., Wei, Jesse, et al., 2014. High throughput tools to accessimages from clinical archives for research. J. Digit. Imaging 1–11.

Naganawa, Shinji, Sato, Kimihide, Katagiri, Toshio, Mimura, Takeo, Ishigaki, Takeo, 2003.Regional ADC values of the normal brain: differences due to age, gender, andlaterality. Eur. Radiol. 13 (1), 6–11.

Neil, Jeffrey J., Shiran, Shelly I., McKinstry, Robert C., Schefft, Georgia L., Snyder, Avi Z.,Almli, C. Robert, Akbudak, Erbil, et al., 1998. Normal brain in human newborns:

apparent diffusion coefficient and diffusion anisotropy measured by using diffusiontensor MR imaging. Radiology 209 (1), 57–66.

Orasanu, Eliza, Andrew, Melbourne, Cardoso, M. Jorge, Modat, Marc, Taylor, Andrew M.,Thayyil, Sudhin, Ourselin, Sebastien, 2014. Brain volume estimation from post-mortem newborn and fetal MRI. NeuroImage Clin. 6, 438–444.

Ou, Yangming, Sotiras, Aristeidis, Paragios, Nikos, Davatzikos, Christos, 2011. DRAMMS:deformable registration via attribute matching and mutual-saliency weighting.Med. Image Anal. 15 (4), 622–639.

Ou, Yangming, Doshi, Jimit, Erus, Guray, Davatzikos, Christos, 2012a. Multi-atlas segmen-tation of the prostate: a zooming process with robust registration and atlas selection.Medical Image Computing and Computer Assisted Intervention (MICCAI) GrandChallenge: Prostate MR Image Segmentation 7.

Ou, Yangming, Doshi, Jimit, Erus, Guray, Davatzikos, Christos, 2012b. Attribute similarityand mutual-saliency weighting for registration and label fusion. MICCAI Workshopon Multi-Atlas Segmentation, pp. 95–98.

Ou, Yangming, Ye, Dong Hye, Pohl, Kilian M., Davatzikos, Christos, 2012c. Validation ofDRAMMS among 12 popular methods in cross-subject cardiac MRI registration. Bio-medical Image Registration. Springer, Berlin Heidelberg, pp. 209–219.

Ou, Yangming, Reynolds, Nathaniel, Gollub, Randy, Pienaar, Rudolph, Wang, Yanbing,Wang, Taowei, Sack, Darren, Andriole, Katherine, Pieper, Steve, Herrick, Christopher,Murphy, Shawn, Grant, P. Ellen, Zollei, Lilla, 2014a. Developmental Brain ADC AtlasCreation from Clinical Images. Organization for Human Brain Mapping (OHBM).

Ou, Yangming, Weinstein, Susan P., Conant, Emily F., Englander, Sarah, Da, Xiao, Gaonkar,Bilwaj, Hsieh, Meng-Kang, et al., 2014b. Deformable registration for quantifying lon-gitudinal tumor changes during neoadjuvant chemotherapy. Magn. Reson. Med.http://dx.doi.org/10.1002/mrm.25368.

Ou, Yangming, Akbari, Hamed, Bilello, Michello, Da, Xiao, Davatzikos, Christos, 2014c.Comparative evaluation of registration algorithms in different brain databases withvarying difficulty: results and insights. IEEE Trans. Med. Imaging 2039–2065.

Park, Jong Geun, Lee, Chulhee, 2009. Skull stripping based on region growing for magneticresonance brain images. NeuroImage 47 (4), 1394–1407.

Park, Hyunjin, Bland, Peyton H., Meyer, Charles R., 2003. Construction of an abdominalprobabilistic atlas and its application in segmentation. IEEE Trans. Med. Imaging 22(4), 483–492.

Parker, Geoffrey J.M., Luzzi, Simona, Alexander, Daniel C., AMWheeler-Kingshott, Claudia,Ciccarelli, Olga, Lambon Ralph, Matthew A., 2005. Lateralization of ventral and dorsalauditory–language pathways in the human brain. NeuroImage 24 (3), 656–666.

Provenzale, James M., Isaacson, Jared, Chen, Steven, Stinnett, Sandra, Liu, Chunlei, 2010.Correlation of apparent diffusion coefficient and fractional anisotropy values in thedeveloping infant brain. AJ. Am. J. Roentgenol. 195 (6), W456.

Reiss, Allan L., Abrams, Michael T., Singer, Harvey S., Ross, Judith L., Denckla, Martha B.,1996. Brain development, gender and IQ in children. A volumetric imaging study.Brain 119 (5), 1763–1774.

Retzepis, K., Ou, Y., Zollei, L., Reynolds, N., Castro, V., Pieper, S., Murphy, S.N., Grant, P.E.,Gollub, R.L., 2014. Using clinical images to study the evolution of mean ADC valuesand brain volume of healthy pediatric subjects. Society of Neuroscience Annual Meeting(SfN), Washington D.C. (submitted for publication)

Rohlfing, Torsten, 2012. Image similarity and tissue overlaps as surrogates for image reg-istration accuracy: widely used but unreliable. IEEE Trans. Med. Imaging 31 (2),153–163.

Rohlfing, Torsten, Maurer Jr., Calvin R., 2007. Shape-based averaging. IEEE Trans. ImageProcess. 16 (1), 153–161.

Rohlfing, Torsten, Brandt, Robert, Menzel, Randolf, Maurer Jr., Calvin R., 2004. Evaluationof atlas selection strategies for atlas-based image segmentation with application toconfocal microscopy images of bee brains. NeuroImage 21 (4), 1428–1442.

Rozovsky, Katya, Ventureyra, Enrique C.G., Miller, Elka, 2013. Fast-brainMRI in children isquick, without sedation, and radiation-free, but beware of limitations. J. Clin.Neurosci. 20 (3), 400–405.

Sabuncu, Mert R., Yeo, B.T. Thomas, Van Leemput, Koen, Fischl, Bruce, Golland, Polina,2010. A generative model for image segmentation based on label fusion. IEEETrans. Med. Imaging 29 (10), 1714–1729.

Sadananthan, Suresh A., Zheng, Weili, WL Chee, Michael, Zagorodnov, Vitali, 2010. Skullstripping using graph cuts. NeuroImage 49 (1), 225–239.

Sanroma, G., Wu, G., Gao, Y., Shen, D., 2014. Learning to rank atlases for multiple-atlassegmentation. IEEE Trans. Med. Imaging 33 (10), 1939–1953.

Ségonne, Florent, Dale, A.M., Busa, E., Glessner,M., Salat, D., Hahn,H.K., Fischl, B., 2004. A hy-brid approach to the skull stripping problem in MRI. NeuroImage 22 (3), 1060–1075.

Shattuck, David W., Leahy, Richard M., 2002. BrainSuite: an automated cortical surfaceidentification tool. Med. Image Anal. 6 (2), 129–142.

Shi, Feng, Yap, Pew-Thian, Wu, Guorong, Jia, Hongjun, Gilmore, John H., Lin, Weili, Shen,Dinggang, 2011. Infant brain atlases from neonates to 1-and 2-year-olds. PLoS One 6(4), e18746.

Shi, Feng,Wang, Li, Dai, Yakang, Gilmore, JohnH., Lin,Weili, Shen, Dinggang, 2012. LABEL:pediatric brain extraction using learning-based meta-algorithm. NeuroImage 62 (3),1975–1986.

Smith, Stephen M., 2002. Fast robust automated brain extraction. Hum. Brain Mapp. 17(3), 143–155.

Snook, Lindsay, Paulson, Lori-Anne, Roy, Dawne, Phillips, Linda, Beaulieu, Christian, 2005.Diffusion tensor imaging of neurodevelopment in children and young adults.NeuroImage 26 (4), 1164–1173.

Sotiras, Aristeidis, Davatzikos, Christos, Paragios, Nikos, 2013. Deformable medical imageregistration: a survey. IEEE Trans. Med. Imaging 32 (7), 1153–1190.

Sotiras, Aristeidis, Ou, Yangming, Paragios, Nikos, Davatzikos, Christos, 2014. Graph-baseddeformable image registration. In: Paragios, N., Duncan, J., Ayache, N. (Eds.), Handbookof Biomedical Imaging; Methodologies and Clinical Research. Springer (ISBN-10:0387097481).

Page 16: Brain extraction in pediatric ADC maps, toward ...you2/publications/Ou15_NeuroImage.pdf · Brain extraction in pediatric ADC maps, toward characterizing neuro-development in multi-platform

261Y. Ou et al. / NeuroImage 122 (2015) 246–261

Thirion, J.-P., 1998. Image matching as a diffusion process: an analogy with Maxwell's de-mons. Med. Image Anal. 2 (3), 243–260.

Twomey, Eilish, Twomey, Anne, Ryan, Stephanie, Murphy, John, Donoghue, Veronica B.,2010. MR imaging of term infants with hypoxic–ischaemic encephalopathy as a pre-dictor of neurodevelopmental outcome and late MRI appearances. Pediatr. Radiol. 40(9), 1526–1535.

van Rikxoort, Eva M., Isgum, Ivana, Arzhaeva, Yulia, Staring, Marius, Klein, Stefan,Viergever, Max A., Pluim, Josien P.W., van Ginneken, Bram, 2010. Adaptive localmulti-atlas segmentation: application to the heart and the caudate nucleus. Med.Image Anal. 14 (1), 39–49.

Vercauteren, Tom, Pennec, Xavier, Perchant, Aymeric, Ayache, Nicholas, 2009.Diffeomorphic demons: efficient non-parametric image registration. NeuroImage45 (1), S61–S72.

Verma, Ragini, Zacharaki, Evangelia I., Yangming, Ou., Cai, Hongmin, Chawla, Sanjeev, Lee,Seung-Koo, Melhem, Elias R., Wolf, Ronald, Davatzikos, Christos, 2008. Multi-parametric tissue characterization of brain neoplasms and their recurrence using pat-tern classification of MR images. Acad. Radiol. 15 (8), 966–977.

Volkher, Engelbrecht, Scherer, Axel, Rassek, Margarethe, Witsack, Hans J., Modder, Ulrich,2002. Diffusion-weightedMR imaging in the brain in children: findings in the normalbrain and in the brain with white matter diseases 1. Radiology 222 (2), 410–418.

Wang, Hongzhi, Das, Sandhitsu R., Wook Suh, Jung, Altinay, Murat, Pluta, John, Craige,Caryne, Avants, Brian, Yushkevich, Paul A., 2011. A learning-based wrapper methodto correct systematic errors in automatic image segmentation: consistently improvedperformance in hippocampus, cortex and brain segmentation. NeuroImage 55 (3),968–985.

Wang, Li, Shi, Feng, Yap, Pew-Thian, Gilmore, John H., Lin,Weili, Shen, Dinggang, 2012. 4Dmulti-modality tissue segmentation of serial infant images. PLoS One 7 (9), e44596.

Wang, Jiahui, Vachet, Clement, Rumple, Ashley, Gouttard, Sylvain, Ouziel, Clementine,Perrot, Emilie, Guangwei, Du., Huang, Xuemei, Gerig, Guido, Styner, Martin A.,2014. Multi-atlas segmentation of subcortical brain structures via the AutoSeg soft-ware pipeline. Name Front. Neuroinforma. 8 (7).

Warfield, Simon K., Zou, Kelly H., Wells, WilliamM., 2004. Simultaneous truth and perfor-mance level estimation (STAPLE): an algorithm for the validation of image segmen-tation. IEEE Trans. Med. Imaging 23 (7), 903–921.

West, J., Fitzpatrick, J.M., Wang, M.Y., Dawant, B.M., Maurer Jr., C.R., Kessler, R.M.,Maciunas, R.J., Barillot, C., Lemoine, D., Collignon, A., et al., 1997. Comparison andevaluation of retrospective intermodality brain image registration techniques.J. Comput. Assist. Tomogr. 21 (4), 554–568.

Wolz, Robin, Aljabar, Paul, Hajnal, Joseph V., Hammers, Alexander, Rueckert, Daniel, 2010.LEAP: learning embeddings for atlas propagation. NeuroImage 49 (2), 1316–1325.

Woodhams, Reiko, Matsunaga, Keiji, Iwabuchi, Keiichi, Kan, Shinichi, Hata, Hirofumi,Kuranami, Masaru, Watanabe, Masahiko, Hayakawa, Kazushige, 2005. Diffusion-weighted imaging of malignant breast tumors: the usefulness of apparent diffusioncoefficient (ADC) value and ADC map for the detection of malignant breast tumorsand evaluation of cancer extension. J. Comput. Assist. Tomogr. 29 (5), 644–649.

Wu, Guorong, Wang, Qian, Zhang, Daoqiang, Nie, Feiping, Huang, Heng, Shen, Dinggang,2013. A generative probability model of joint label fusion for multi-atlas basedbrain segmentation. Med. Image Anal. 18 (6), 881–890.

Xu, Jian, Rasmussen, Inge-Andre, Lagopoulos, Jim, Håberg, Asta, 2007. Diffuse axonal inju-ry in severe traumatic brain injury visualized using high-resolution diffusion tensorimaging. J. Neurotrauma 24 (5), 753–765.

Yang, J., Beadle, B., Garden, A., Balter, P., Court, L., 2013. Atlas Ranking and Selection forMulti-Atlas Segmentation. The American Association of Physicists in Medicine(AAPM) Annual Meeting, Indianapolis, IN, USA.

Yassa, Michael A., Stark, Craig E., 2009. A quantitative evaluation of cross-participant reg-istration techniques for MRI studies of the medialtemporal lobe. NeuroImage 44 (2),319–327.

Yoshida, Shoko, Oishi, Kenichi, Faria, Andreia V., Mori, Susumu, 2013. Diffusion tensor im-aging of normal brain development. Pediatr. Radiol. 43 (1), 15–27.

Zhang, K., Johnson, B., Pennell, D., Ray, W., Sebastianelli, W., Slobounov, S., 2010. Are func-tional deficits in concussed individuals consistent with white matter structural alter-ations: combined FMRI & DTI study. Exp. Brain Res. 204 (1), 57–70.

Zhuang, Audrey H., Valentino, Daniel J., Toga, Arthur W., 2006. Skull-stripping magneticresonance brain images using a model-based level set. NeuroImage 32 (1), 79–92.