CBRAIN and LONI: Multi-site Brain Imaging Networks for the Study of Neurodegenerative Disorders
Alan C. Evans, Ph.D.
McConnell Brain Imaging Centre
Montreal Neurological Institute
McGill University
TOWARDS THE DEVELOPMENT OF EFFECTIVE DRUGS FOR ALZHEIMER’S DISEASE. HOW E‐SCIENCE CAN HELP SOLVE PRESSING SOCIETAL PROBLEMS
Bruxelles, January 26th, 2011
Grand Vision
outGRID
http://www.LONI.ucla.edu
Brain Atlasing
• Atlas Construction
  – Web-based Vervet monkey brain atlas
    • Woods, et al., NeuroImage (2010)
  – Chinese Brain Atlas
    • Tang, et al., NeuroImage (2010)
Biological Shape Modeling and Analysis
• Shape metrics, correspondences & mapping
– Pattern of hippocampal shape & volume differences in blind subjects
• Lepore, et al. (2009)
• http://dx.doi.org/10.1016/j.neuroimage.2009.01.071
Top view
Bottom view
LONI Infrastructure
• Databases
  – Raw Data (e.g., imaging, genetics, phenotypic, meta-data)
  – Derived Data (e.g., atlases, models, shapes, masks, labels)
• Grid Computing
  – Managing efficient back-end hardware computational framework
  – Job submission, user management and support
• Pipeline Environment (http://pipeline.loni.ucla.edu)
  – Design, validation, dissemination and execution of heterogeneous workflows
  – Tool discovery (http://iTools.ccb.ucla.edu)
  – Tool interoperability
  – Distributed computing
  – Friendly access to data, hardware and tools
• Dinov et al. (2010), PLoS ONE, doi:10.1371/journal.pone.0013070
• Dinov et al. (2009), Frontiers NeuroInfo, doi:10.3389/neuro.11.022.2009
LONI Computational Infrastructure
(Description: Value)
• Grid
  – Number of Grid Nodes: 380 nodes / 1,256 cores
  – RAM: 8-16 Gigabytes / node
  – Speed: 2.5+ GHz per core
  – Specs: Sun V20z and Sun X2200
  – Usage Stats: ~16,000 average jobs completed/day (past 3 months)
  – Number Users: 165 unique users (past 3 months)
• Networking
  – Specs: Mixed 1GB production and 10GB HPC networks
  – Usage: Average: 20GB/sec. Max: 80GB/sec
  – Bandwidth: 100Gb+ total throughput to cluster
• Disks
  – Capacity (online/offline): 250TB online capacity w/ 4PB+ offline (tape) virtual storage
  – Specs (latency, bandwidth): Peak max 3 Gigabytes/sec
  – Number of Files: 10,000,000,000's
• Web Services
  – IDA: 1,000's users per week
  – iTools: 100's users per week
  – Pipeline web-server: 100's users per week
• Pipeline
  – Queue: pipeline.q
  – Usage: ~12,000 avg jobs completed/day (past 3 months)
  – Node Allocation: Dynamic, approximately 75% of LONI's HPC Resources
  – Users/Accounts: 700+ authenticated users
• IDA (database)
  – Number of projects: 55
  – Number of users: >1,200
  – Number of volumes: DTI: 2,748; fMRI: 1,569; HISTO: 4; MRA: 1,204; MRI: 56,248; PET: 2,678
  – Disk-space: 1PB
  – Average Monthly Uploads (2009): 1,200
  – Average Monthly Downloads (2009): 25,000
The LONI Pipeline http://pipeline.loni.ucla.edu
Distributed Architecture
Data Computing
Users
CBRAIN
• Canadian Distributed Neuroimaging Platform (5 Canadian Centers)
• Prototype Collaborative 3D Visualization of a High Resolution Brain
CANARIE: Canada's Advanced Research and Innovation Network
DCC
CCC DPC SPC
NIH MRI Study of Normal Brain Development (N=500)
Behavior/MRI for ages 0-18 yrs
Structure-behavior relationships
Disseminate results
[Figure: NIHPD Network Architecture. Diagram labels: MRI Scanner; MRI Console; MRI Study Workstation; Behavioral PC (laptop); BVL; PSC; Internet; DCC; Internet & DBMS Server(s); Mass Storage System; Backup System; Data Analysis Pipeline; Data Warehouse; Data Marts; Scientific Community]
• Acquisition management
  – Project management tools
  – Double data entry / range checking
  – Automated 3D image QC
  – Java-based remote 3D image QC
  – 150 behavioral instruments
  – MANTIS bug-tracking
• Analysis pipelines
  – External pipelines for analysis (MNI, SPM, FSL, LONI, AFNI)
  – Integrated with grid-computing networks (CBRAIN, NeuGrid)
• Repository / download
  – Data types: behavior, clinical, imaging, genetic
  – On-line remote MRI browser
  – Data querying GUI (volumes, surfaces, behavior)
  – e.g. NIH database of normal brain development
• 50 man-years of development
• Web-based, secure data transfer of multi-site data
• Generalized open-source MySQL architecture: flexible, extensible
• Applications in development, neurodegeneration (US, Europe, Asia)
LORIS
Cortical atrophy in Alzheimer’s Disease (Lerch J et al., Cerebral Cortex 2005)
[Figure: Real-World Example. Diagram labels: CBRAIN Scientist; CBRAIN Portal & Metadata DB; CBRAIN controller; CBRAIN data provider; Rotman Database; DRMAA; CIVET on HPC; RQCHP; sites: Toronto, Montreal, Sherbrooke]
CBRAIN Network of Centres of Excellence
• Enhanced public education network
• Novel ethical framework for imaging genomics
• Cross-disciplinary HQP Training
• National clinical trial network
• Biomarker validation
• Multivariate analytic techniques
• Commercialization
• Translation to clinical practice
• Advanced e-communication
• CBRAIN
• Mechanisms of disease
• Development of new treatments
• Early detection and prevention
• Broadband Network
• HPC Grid
• National imaging database
• International neuroinformatics partnerships
• Informing health policy
• Regenerative strategies
LORIS in Europe
• Innomed / AddNeuroMed
  – Alzheimer’s Disease
• NeuGrid – Grid Computing
  – Distributed processing
  – Distributed databasing
GBRAIN Access to HPCs (November 2009)
Top 500 Rank   Name         Location                # of Cores
13             JUROPA       Jülich, Germany         26304
22             SciNet       UofT, Canada            30240
63             CLUMEQ 2     U. Laval, Canada        7616
230            WESTGRID     UBC, Canada             3088
322            SHARCNET     UWO, Canada             2688
462            RQCHP        U. Sherbrooke, Canada   2464
TBD            CLUMEQ 1&2   McGill, Canada          ~20000
NA             LONI         UCLA, USA               ~600
NA             MNI/BIC      McGill, Canada          ~400
• Computing Infrastructure
  – Number of HPCs: 9 (8 in Canada, 1 in Germany)
  – Number of Cores Available: 33,948
  – Cores per HPC: From 200 to 20,000
  – Number of Nodes Available: 4,678
• Systems: Heterogeneous (SGI XE320, SUN Blade 6048/6275, HP Blade 2x220c, HP DL145 G2, SUN Fire x4100)
• Network: Heterogeneous (Gigabit Ethernet, InfiniBand)
• Storage
  – Total files stored / Number of Data Providers / Available Space on Main Storage: 2,060,00075TB
• Tools
  – Number of tools: 25
  – Number of jobs submitted: 12,316
• Users
  – Total Users: 55
  – Registered User Projects: 47
  – Registered Scientific Sites: 20
  – Countries: Canada, Scotland, USA, Germany
CBRAIN Infrastructure
Platform: Web app, not a desktop app. Available anywhere you have a connection and a browser
General approach: Highly integrated with a single point of entry, i.e. no need for re-authentication
Web-databasing: LORIS supports n-D images, genomics, psych testing
Pipeline processing: Fully automated. 20 years' experience with pipelines
Compute Capacity: Huge processing power and great flexibility in adding new clusters
Collaboration: Built-in. Files, tools, servers, storage, visualization can all be shared as desired
Visualization: Brainbrowser 3D web-viewer. QC mosaic for rapid assessment of pipeline results
CBRAIN Summary
N.B. not restricted to human brain MRI
Species: human, primate, rodent
Organs: brain, heart, liver
Modalities: PET, fMRI, histology, IHC, DNA/RNA
Ruby on Rails Framework
Rails enforces separation of the presentation layer, database access and business logic through MVC (as do Struts, WebObjects and JavaServer Faces)
Rails is RESTful by design
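The separation of layers and the REST-style routing mentioned above can be illustrated with a minimal sketch. It is written in plain Python rather than Rails, and it is not CBRAIN code: the names `TaskModel`, `render_tasks` and `TaskController`, the `/tasks` path, and the in-memory list standing in for a database are all hypothetical.

```python
class TaskModel:
    """Model: owns the data and business rules (a list stands in for a database)."""

    def __init__(self):
        self._tasks = []

    def create(self, tool):
        task = {"id": len(self._tasks) + 1, "tool": tool, "status": "queued"}
        self._tasks.append(task)
        return task

    def all(self):
        return list(self._tasks)


def render_tasks(tasks):
    """View: turns model data into text and does nothing else."""
    return "\n".join(f"#{t['id']} {t['tool']} [{t['status']}]" for t in tasks)


class TaskController:
    """Controller: maps a REST-style (verb, path) pair onto model calls and a view."""

    def __init__(self, model):
        self.model = model

    def handle(self, verb, path, params=None):
        if (verb, path) == ("POST", "/tasks"):
            self.model.create(params["tool"])
        elif (verb, path) != ("GET", "/tasks"):
            return "404 Not Found"
        return render_tasks(self.model.all())


controller = TaskController(TaskModel())
controller.handle("POST", "/tasks", {"tool": "CIVET"})
print(controller.handle("GET", "/tasks"))
```

Because the view never touches storage and the model never formats output, either one could be swapped (for example, an HTML view or a MySQL-backed model) without changing the other.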
NIHPD-Obj1
• Up to 3 visits per subject, each visit consisting of 3 (t1, t2, pd) volumes @ 256x256x256 (~5Mb)
• 866 CIVET runs to generate cortical thickness maps
  – Input: 866 x 3 x 5Mb = 15Gb
  – Output: 866 x 250 Mb = 211Gb
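The input and output sizes quoted for the 866 CIVET runs can be checked with a short calculation. This is a sketch that assumes the slide's "Mb" and "Gb" mean megabytes and gigabytes, and that 1 GB = 1024 MB; neither assumption is stated in the source.

```python
# Figures taken from the slide; the 1024 MB-per-GB conversion is an assumption.
RUNS = 866
VOLUMES_PER_RUN = 3        # t1, t2, pd
MB_PER_VOLUME = 5          # approximate size of one 256x256x256 volume
OUTPUT_MB_PER_RUN = 250


def mb_to_gb(megabytes):
    """Convert megabytes to gigabytes using 1 GB = 1024 MB."""
    return megabytes / 1024


input_mb = RUNS * VOLUMES_PER_RUN * MB_PER_VOLUME
output_mb = RUNS * OUTPUT_MB_PER_RUN

print(f"input:  {input_mb} MB = {mb_to_gb(input_mb):.1f} GB")
print(f"output: {output_mb} MB = {mb_to_gb(output_mb):.1f} GB")
```

Under these assumptions the output figure reproduces the quoted 211, while the input comes to roughly 12.7, which suggests the quoted 15 is a rounded-up estimate.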
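Columns: Cluster; Total CPU-hrs; Maximum Performance (# cores, execution time in h); Typical Performance (# cores, execution time in h)
• mammouth-ms2 (RQCHP - Sherbrooke): 866 x 4 = 3464 CPU-hrs; maximum 2112 cores, 1.6 h; typical 176 cores, 20 h
• CLUMEQ-1 (McGill): 866 x 6 = 5196 CPU-hrs; maximum 90 cores, 58 h; typical 24 cores, 216 h
• BIC (Linux): 866 x 8 = 6928 CPU-hrs; maximum 100 cores, 69 h; typical 40 cores, 173 h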
Illustrative Performance Comparison
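The execution times in this comparison are consistent with dividing each cluster's total CPU-hours by the number of cores used. The sketch below reproduces that arithmetic; the ideal linear scaling it relies on is an assumption, and the `Cluster` class and `wall_clock_hours` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

RUNS = 866  # CIVET runs, from the slide


@dataclass
class Cluster:
    name: str
    cpu_hours_per_run: int   # per-run cost quoted on the slide
    max_cores: int
    typical_cores: int


def wall_clock_hours(cluster, cores):
    """Estimated execution time, assuming work divides evenly across cores."""
    total_cpu_hours = RUNS * cluster.cpu_hours_per_run
    return total_cpu_hours / cores


clusters = [
    Cluster("mammouth-ms2", 4, 2112, 176),
    Cluster("CLUMEQ-1", 6, 90, 24),
    Cluster("BIC (Linux)", 8, 100, 40),
]

for c in clusters:
    best = wall_clock_hours(c, c.max_cores)
    typical = wall_clock_hours(c, c.typical_cores)
    print(f"{c.name}: maximum {best:.1f} h, typical {typical:.1f} h")
```

The per-run cost differs between clusters (4, 6 and 8 CPU-hours), so the same 866 runs cost different totals on different hardware before core count is considered.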
Toga USA
Zilles Germany
Lee S. Korea
GBRAIN
Infrastructure – Grid Computing
• Managing efficient back-end hardware computational framework
• Job submission, user management and support
– SGE
– Permissions
– Ticketing
– Tutorials
– Batch/Pipeline
– SVN/CVS
– Dashboard
www.loni.ucla.edu/Resources/clustervisualization
• 50K lines of code
• 9 integrated cluster installations (8 Canadian, 1 German)
• 50,000+ CPUs
• 21 Virtual Sites (1 USA, 1 Spain, 3 Germany, 1 China, 15 Canada)
• ~50 users (1 Chinese, 5 German, 1 UK, 1 Spanish, 1 American, 44 Canadian)
• ~70 user projects, approx. 5TB of data
• 12,000+ HPC tasks performed
• Typical sustained compute: 130 compute days/24 hours
CBRAIN Statistics
• SPMbatch: Statistics
• NIAK-fMRIstat-SurfStat: Statistics
• N-PAIRS: Statistics
• MINC tools: General
• GRETNA: Graph theoretical modelling
• PLS: Network modelling
• CIVET/CLASP: Surface Extraction
• N3: Inhomogeneity Correction
• AFNI: General
• FSL: General
• NITRC: Clearing house
25 tools/converters/pipelines