Using MapReduce for Large-scale Medical Image Analysis
![Page 1: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/1.jpg)
Using MapReduce for Large-scale Medical Image Analysis
HISB 2012
Presented by : Roger Schaer - HES-SO Valais
![Page 2: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/2.jpg)
Summary
Introduction
Methods
Results & Interpretation
Conclusions
![Page 3: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/3.jpg)
Introduction
![Page 4: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/4.jpg)
Introduction
Exponential growth of imaging data (past 20 years)
[Figure: amount of images produced per day at the HUG, by year]
![Page 5: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/5.jpg)
Introduction (continued)
Mainly caused by :
Modern imaging techniques (3D, 4D) : Large files !
Large collections (available on the Internet)
Increasingly complex algorithms make processing this data more challenging
Requires a lot of computation power, storage and network bandwidth
![Page 6: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/6.jpg)
Introduction (continued)
Flexible and scalable infrastructures are needed
Several approaches exist :
Single, powerful machine
Local cluster / grid
Alternative infrastructures (Graphics cards)
Cloud computing solutions
First two approaches have been tested and compared
![Page 7: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/7.jpg)
Introduction (continued)
3 large-scale medical image processing use cases
Parameter optimization for Support Vector Machines
Content-based image feature extraction & indexing
3D texture feature extraction using the Riesz transform
NOTE : I mostly handled the infrastructure aspects !
![Page 8: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/8.jpg)
Methods
![Page 9: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/9.jpg)
Methods
MapReduce
Hadoop Cluster
Support Vector Machines
Image Indexing
Solid 3D Texture Analysis Using the Riesz Transform
![Page 10: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/10.jpg)
MapReduce
MapReduce is a programming model
Developed by Google
Map phase : takes key/value pairs as input and emits intermediate key/value pairs
Reduce phase : for each intermediate key, processes the list of associated values
Trivial example : Word Count application
![Page 37: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/37.jpg)
MapReduce : WordCount
INPUT
#1 hello world
#2 goodbye world
#3 hello hadoop
#4 bye hadoop
...
MAP
hello 1
world 1
goodbye 1
world 1
hello 1
hadoop 1
bye 1
hadoop 1
REDUCE
hello 2
world 2
goodbye 1
hadoop 2
bye 1
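The WordCount walk-through above can be reproduced in a few lines of plain Python. This is only a single-process sketch of the programming model, not Hadoop code: the function names `map_phase`, `shuffle` and `reduce_phase` are made up for illustration, and the grouping step that the framework performs between the two phases is simulated with a dictionary.

```python
from collections import defaultdict

def map_phase(record_id, line):
    """Emit an intermediate (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word, 1

def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(word, counts):
    """Sum the list of values associated with one intermediate key."""
    return word, sum(counts)

# The four input records shown on the slide.
records = {1: "hello world", 2: "goodbye world", 3: "hello hadoop", 4: "bye hadoop"}
intermediate = [pair for rid, line in records.items() for pair in map_phase(rid, line)]
word_counts = dict(reduce_phase(w, c) for w, c in shuffle(intermediate).items())
print(word_counts)
```

On a real cluster the map calls would run on different worker nodes and the grouping would happen over the network, but the input and output of each phase have the same shape as here.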
![Page 38: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/38.jpg)
Hadoop
Apache’s implementation of MapReduce
Consists of
Distributed storage system : HDFS
Execution framework : Hadoop MapReduce
Master node which orchestrates the task distribution
Worker nodes which perform the tasks
Typical node runs a DataNode and TaskTracker
![Page 39: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/39.jpg)
Support Vector Machines
Computes a decision boundary (hyperplane) that separates inputs of different classes represented in a given feature space transformed by a given kernel
The values of two parameters need to be adapted to the data:
Cost C of errors
σ of the Gaussian kernel
![Page 45: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/45.jpg)
SVM (continued)
Goal : find the optimal value couple (C, σ) to train an SVM
Allowing best classification performance of 5 lung texture patterns
Execution on 1 PC (without Hadoop) can take weeks
Due to extensive leave-one-patient-out cross-validation with 86 patients
Parallelization : Split job by parameter value couples
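Splitting the job by parameter value couples can be sketched as follows. This is an illustration under stated assumptions, not the code used in the talk: the grid values are invented, and `toy_accuracy` is a placeholder standing in for the expensive step (training an SVM and running leave-one-patient-out cross-validation for one couple).

```python
from itertools import product

def parameter_couples(costs, sigmas):
    """Enumerate every (C, sigma) couple of the search grid."""
    return list(product(costs, sigmas))

def map_task(couple, evaluate):
    """One map task: score a single (C, sigma) couple and emit (couple, accuracy)."""
    cost, sigma = couple
    return couple, evaluate(cost, sigma)

def reduce_best(results):
    """Reduce step: keep the couple with the highest accuracy."""
    return max(results, key=lambda item: item[1])

def toy_accuracy(cost, sigma):
    """Placeholder score, NOT a real SVM: it simply peaks at C=10, sigma=1."""
    return 100.0 - abs(cost - 10.0) - 10.0 * abs(sigma - 1.0)

# Hypothetical grid; each couple is independent, so each can be its own task.
couples = parameter_couples(costs=[1.0, 10.0, 100.0], sigmas=[0.1, 1.0, 10.0])
results = [map_task(couple, toy_accuracy) for couple in couples]
best_couple, best_accuracy = reduce_best(results)
print(best_couple, best_accuracy)
```

Because no couple depends on the result of another, the list comprehension is the part that a cluster would distribute across worker nodes.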
![Page 46: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/46.jpg)
Image Indexing
[Diagram components: Vocabulary File, Image Files, Feature Extractor, Feature Vectors Files, Bag of Visual Words Factory, Index File]
Two phases
Extract features from images
Construct bags of visual words by quantization
Component-based / Monolithic approaches
Parallelization : Each task processes N images
[Diagram, second build: Feature Extractor + Bag of Visual Words Factory]
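The second phase, building a bag of visual words by quantization, can be illustrated with a small sketch. The slides do not say which features or distance measure were used, so the nearest-word assignment with squared Euclidean distance below is an assumption (a common choice), and the tiny 2D vocabulary and feature vectors are invented.

```python
def nearest_word(feature, vocabulary):
    """Index of the vocabulary entry closest to one feature vector (squared Euclidean distance)."""
    def squared_distance(word):
        return sum((f - w) ** 2 for f, w in zip(feature, word))
    return min(range(len(vocabulary)), key=lambda i: squared_distance(vocabulary[i]))

def bag_of_visual_words(features, vocabulary):
    """Histogram counting how many feature vectors fall on each visual word."""
    histogram = [0] * len(vocabulary)
    for feature in features:
        histogram[nearest_word(feature, vocabulary)] += 1
    return histogram

# Made-up vocabulary of three visual words and four feature vectors from one image.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
image_features = [(0.1, 0.2), (0.9, 1.1), (4.8, 5.2), (1.2, 0.8)]
print(bag_of_visual_words(image_features, vocabulary))
```

The resulting histogram is what would be written to the index for that image; since each image is quantized independently, the work splits naturally across tasks.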
![Page 48: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/48.jpg)
3D Texture Analysis (Riesz)
Features are extracted from 3D images
[Figure: 3D images]
Parallelization : Each task processes N images
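Giving each task N images amounts to cutting the list of image paths into batches. The helper below is a minimal sketch of that idea; the function name, the file names and the value of N are all hypothetical.

```python
def split_into_tasks(image_paths, images_per_task):
    """Cut the image list into consecutive batches of at most images_per_task paths."""
    if images_per_task < 1:
        raise ValueError("images_per_task must be at least 1")
    return [image_paths[start:start + images_per_task]
            for start in range(0, len(image_paths), images_per_task)]

# Ten made-up image names split with N = 4; the last task gets the remainder.
paths = [f"image_{i:04d}.dat" for i in range(10)]
tasks = split_into_tasks(paths, images_per_task=4)
print([len(batch) for batch in tasks])
```

A smaller N gives more tasks and finer load balancing, at the price of more per-task overhead.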
![Page 49: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/49.jpg)
Results & Interpretation
![Page 50: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/50.jpg)
Hadoop Cluster
Minimally invasive setup (>=2 free cores per node)
![Page 51: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/51.jpg)
Support Vector Machines
Optimization : Longer tasks = bad performance
Because the optimization of the hyperplane is more difficult to compute (more iterations needed)
After 2 patients (out of 86), check if : $t_i \geq F \cdot t_{ref}$
If time exceeds average (+margin), terminate task
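The termination rule can be written as a one-line check. The sketch below assumes that the reference time is the average time other tasks needed for their first two patients and that F is a margin factor greater than 1; the slides only say "average (+margin)", so both the way the reference is computed and all the numbers are illustrative.

```python
def reference_time(finished_times):
    """Average time that already-checked tasks needed for their first patients."""
    return sum(finished_times) / len(finished_times)

def should_terminate(elapsed, reference, factor):
    """Return True when a task's elapsed time reaches factor * reference time."""
    return elapsed >= factor * reference

t_ref = reference_time([100.0, 120.0, 110.0])  # seconds, made-up values
F = 1.5                                         # made-up margin factor
decisions = {task_id: should_terminate(t_i, t_ref, F)
             for task_id, t_i in {"a": 130.0, "b": 165.0, "c": 400.0}.items()}
print(decisions)
```

Checking after only 2 of the 86 patients keeps the cost of a wrong guess low: a slow parameter couple is abandoned after a small fraction of its full cross-validation.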
![Page 52: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/52.jpg)
Support Vector Machines
[Figure: accuracy (%) as a function of σ (Sigma) and C (Cost)]
Black : tasks to be interrupted by the new algorithm
Optimized algorithm : ~50h → ~9h15min
None of the best tasks (highest accuracy) are killed
![Page 55: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/55.jpg)
Image Indexing
[Figure panels: 1K IMAGES, 10K IMAGES, 100K IMAGES]
Shows the calculation time as a function of the number of tasks
Both experiments were executed using Hadoop
Once on a single computer, then on our cluster of PCs
![Page 57: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/57.jpg)
Riesz 3D
Particularity : code was a series of Matlab® scripts
Instead of rewriting the whole application :
Used Hadoop streaming feature (uses stdin/stdout)
To maximize scalability, GNU Octave was used
Great compatibility between Matlab® and Octave
RESULTS

| 1 task (no Hadoop) | 42 tasks (idle) | 42 tasks (normal) |
| --- | --- | --- |
| 131h32m42s | 6h29m51s | 5h51m31s |
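A streaming-style mapper only has to read lines from stdin and write tab-separated key/value lines to stdout, which is why existing scripts can be wrapped instead of rewritten. The sketch below shows one possible shape of such a wrapper. It is not the code from the talk: `riesz_features` is a hypothetical Octave function name, the image file names are invented, and the Octave call is injected as a parameter so the mapper logic can be tried without Octave installed.

```python
import io
import subprocess

def octave_extract(image_path):
    """Run a hypothetical Octave function on one image and return its printed output."""
    command = ["octave", "--quiet", "--eval", f"riesz_features('{image_path}')"]
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.strip()

def run_mapper(stdin, stdout, extract=octave_extract):
    """Streaming-style mapper: one image path per input line, one 'key<TAB>value' line out."""
    for line in stdin:
        image_path = line.strip()
        if image_path:
            stdout.write(f"{image_path}\t{extract(image_path)}\n")

# Dry run with in-memory streams and a stand-in extractor (no Octave needed).
fake_input = io.StringIO("scan_001.dat\nscan_002.dat\n")
fake_output = io.StringIO()
run_mapper(fake_input, fake_output, extract=lambda path: "0.1,0.2")
print(fake_output.getvalue(), end="")
```

In an actual mapper script the last lines would be replaced by a call such as `run_mapper(sys.stdin, sys.stdout)`, so that the framework can feed image paths on stdin and collect the features from stdout.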
![Page 58: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/58.jpg)
Conclusions
![Page 59: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/59.jpg)
Conclusions
MapReduce is
Flexible (worked with very varied use cases)
Easy to use (2-phase programming model is simple)
Efficient (>=20x speedup for all use cases)
Hadoop is
Easy to deploy & manage
User-friendly (nice Web UIs)
![Page 60: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/60.jpg)
Conclusions (continued)
Speedups for the different use cases

| | SVMs | Image Indexing | 3D Feature Extraction |
| --- | --- | --- | --- |
| Single task | 990h* | 21h* | 131h30 |
| 42 tasks on Hadoop | 50h / 9h15** | 1h | 5h50 |
| Speedup | 20x / 107x** | 21x | 22.5x |

\* estimation \*\* using the optimized algorithm
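The speedup figures follow directly from dividing the single-task time by the 42-task time. The short check below recomputes them from the durations in the table; the helper names are made up, and the single-task times for SVMs and image indexing are the estimates marked with an asterisk.

```python
def hours(h, m=0):
    """Convert hours and minutes to fractional hours."""
    return h + m / 60.0

def speedup(single_task_hours, parallel_hours):
    """Ratio of single-task runtime to parallel runtime."""
    return single_task_hours / parallel_hours

runs = {
    "SVMs": (hours(990), hours(50)),
    "SVMs (optimized)": (hours(990), hours(9, 15)),
    "Image Indexing": (hours(21), hours(1)),
    "3D Feature Extraction": (hours(131, 30), hours(5, 50)),
}
speedups = {name: round(speedup(single, parallel), 1)
            for name, (single, parallel) in runs.items()}
print(speedups)
```

The unoptimized SVM run comes out at 19.8, which the table rounds to 20x; the other three values match the table as printed.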
![Page 61: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/61.jpg)
Lessons Learned
It is important to use physically distributed resources
Overloading a single machine hurts performance
Data locality notably speeds up jobs
Not every application is infinitely scalable
Performance improvements level off at some point
![Page 62: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/62.jpg)
Future work
Take it to the next level : The Cloud
Amazon Elastic Compute Cloud (IaaS)
Amazon Elastic MapReduce (PaaS)
Cloudbursting
Use both local resources + Cloud (for peak usage)
![Page 63: Using MapReduce for Large–scale Medical Image Analysis](https://reader033.fdocuments.us/reader033/viewer/2022051819/54c6a3464a7959d9148b4581/html5/thumbnails/63.jpg)
Thank you ! Questions ?