Speedup for Multi-Level Parallel Computing School of Computer Engineering Nanyang Technological...
Speedup for Multi-Level Parallel Computing
School of Computer Engineering
Nanyang Technological University
21st May 2012
Shanjiang Tang, Bu-Sung Lee, Bingsheng He
Outline
• Background & Motivation
• Multi-Level Parallel Speedup
• Evaluation
• Conclusion
Multi-Level Computing Architecture and Paradigm
• MPI+OpenMP
• MPI+CUDA
• MPI+OpenMP+CUDA
• ...
Multi-Level Parallel Computing Model
[Figure: tree of processing elements across parallelism levels L1, L2, L3, ..., Lm: PE1,1 at level 1; PE2,1 and PE2,2 at level 2; PE3,1 to PE3,8 at level 3. Legend: sequential part, parallel part.]
Parallel Speedup
• Definition
$$\text{Speedup} = \frac{\text{Sequential Execution Time}}{\text{Parallel Execution Time}}$$
• Classification
Absolute Speedup
$$\text{Speedup} = \frac{\text{Execution Time of the Best Sequential Algorithm}}{\text{Execution Time of the Parallel Algorithm}}$$
Relative Speedup
$$\text{Speedup} = \frac{\text{Sequential Execution Time of the Parallel Algorithm}}{\text{Execution Time of the Parallel Algorithm}}$$
Relative Speedup Model
• Fixed-size Speedup
Amdahl’s Law
$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{1}{(1-\alpha) + \dfrac{\alpha}{p}}$$
• Fixed-time Speedup
Gustafson’s Law
$$\text{Speedup} = \frac{\text{sequentialTime}}{\text{parallelTime}} = \frac{(1-\alpha) + \alpha p}{(1-\alpha) + \alpha} = (1-\alpha) + \alpha p$$
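The two single-level laws can be compared numerically with a minimal sketch. It assumes `alpha` is the parallel fraction of the program and `p` the number of processing elements; the function names and the sample value `alpha = 0.9` are illustrative and do not come from the slides.

```python
def amdahl_speedup(alpha: float, p: int) -> float:
    """Fixed-size speedup: 1 / ((1 - alpha) + alpha / p)."""
    return 1.0 / ((1.0 - alpha) + alpha / p)


def gustafson_speedup(alpha: float, p: int) -> float:
    """Fixed-time speedup: (1 - alpha) + alpha * p."""
    return (1.0 - alpha) + alpha * p


# Hypothetical parallel fraction, chosen only for illustration.
alpha = 0.9
for p in (1, 2, 4, 8, 64):
    print(p, round(amdahl_speedup(alpha, p), 3), round(gustafson_speedup(alpha, p), 3))
```

Under Amdahl's Law the speedup is bounded by 1 / (1 - alpha) however large `p` becomes, while under Gustafson's Law it grows linearly in `p`.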
Motivation Example: NAS Benchmark (MPI+OpenMP)
Amdahl’s Law is UNSUITABLE for Multi-Level Parallel Computing
E-Amdahl’s Law
• Awareness of Different Grained-Level Parallelism
$$sp(i) = \begin{cases} \dfrac{1}{(1 - f(m)) + \dfrac{f(m)}{p(m)}} & i = m \\[2ex] \dfrac{1}{(1 - f(i)) + \dfrac{f(i)}{p(i)\, sp(i+1)}} & 1 \le i < m \end{cases}$$
[Figure: multi-level parallel computing model with levels L1, L2, L3, ..., Lm. Legend: sequential part, parallel part.]
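The recursion in E-Amdahl's Law can be evaluated bottom-up, from the finest level m to the coarsest level 1. The sketch below assumes that f(i) is the parallel fraction at level i and p(i) the number of processing elements each level-i unit spawns; the function name and the list-based interface are illustrative choices, not part of the slides.

```python
from typing import Sequence


def e_amdahl_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) for an m-level program.

    f[k] and p[k] describe level k + 1, so index 0 is the coarsest
    level (L1) and the last index is the finest level (Lm).
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # Below the finest level there is no further parallelism, so the
    # speedup contributed from "level m + 1" is 1.
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = 1.0 / ((1.0 - fi) + fi / (pi * sp))
    return sp


# Hypothetical three-level configuration, for illustration only.
print(e_amdahl_speedup([0.95, 0.9, 0.8], [4, 2, 8]))
```

With a single level the function reduces to the traditional Amdahl's Law, which is a quick sanity check on the recursion.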
E-Amdahl’s Law
• Two-Level Parallelism Speedup Model (MPI+OpenMP)
$$sp(\alpha, \beta, p, t) = \frac{1}{(1-\alpha) + \dfrac{\alpha\left((1-\beta) + \dfrac{\beta}{t}\right)}{p}}$$
where $\alpha$ is the parallel fraction of coarse-grained (MPI-level) parallelism, $\beta$ is the parallel fraction of fine-grained (OpenMP-level) parallelism, $p$ is the number of processes spawned, and $t$ is the number of threads spawned per process.
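A short sketch of the two-level model shows why the split between processes and threads matters even when the total core count is fixed. The values `alpha = 0.9`, `beta = 0.8`, and the 64-core budget are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def two_level_speedup(alpha: float, beta: float, p: int, t: int) -> float:
    """sp(alpha, beta, p, t) = 1 / ((1 - alpha) + alpha * ((1 - beta) + beta / t) / p)."""
    return 1.0 / ((1.0 - alpha) + alpha * ((1.0 - beta) + beta / t) / p)


# Hypothetical fractions and a fixed budget of p * t = 64 cores.
alpha, beta = 0.9, 0.8
for p, t in [(64, 1), (32, 2), (16, 4), (8, 8)]:
    print(f"p={p:2d} t={t} speedup={two_level_speedup(alpha, beta, p, t):.3f}")
```

Traditional Amdahl's Law sees only the 64 processing elements and predicts one speedup for all four configurations, whereas the two-level model yields a different value for each (p, t) pair. With t = 1 the model collapses to the single-level law.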
E-Gustafson’s Law
• Awareness of Different Grained-Level Parallelism
$$sp(i) = \begin{cases} (1 - f(m)) + f(m)\, p(m) & i = m \\ (1 - f(i)) + f(i)\, p(i)\, sp(i+1) & 1 \le i < m \end{cases}$$
[Figure: multi-level parallel computing model with levels L1, L2, L3, ..., Lm. Legend: sequential part, parallel part.]
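E-Gustafson's Law can be evaluated with the same bottom-up pass as any multi-level recursion. The sketch below assumes f(i) is the parallel fraction at level i and p(i) the number of processing elements per level-i unit; the function name and the sample inputs are illustrative only.

```python
from typing import Sequence


def e_gustafson_speedup(f: Sequence[float], p: Sequence[int]) -> float:
    """Evaluate sp(1) for an m-level program under fixed-time scaling.

    f[k] and p[k] describe level k + 1, so index 0 is the coarsest
    level (L1) and the last index is the finest level (Lm).
    """
    if len(f) != len(p) or len(f) == 0:
        raise ValueError("f and p must be non-empty and of equal length")
    # No parallelism exists below the finest level, so start from 1.
    sp = 1.0
    for fi, pi in zip(reversed(f), reversed(p)):
        sp = (1.0 - fi) + fi * pi * sp
    return sp


# Hypothetical two-level (process x thread) configuration.
print(e_gustafson_speedup([0.9, 0.8], [8, 8]))
```

For a single level the result equals the traditional Gustafson's Law, (1 - f) + f * p.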
Experiment Setup
• Platform and Configuration
– A Linux cluster consisting of eight computing nodes, each with two quad-core chips
– Configuration: one thread per CPU core
• Benchmarks
– NAS Parallel Benchmark (NPB) Multi-Zone (MZ) version:
– BT-MZ (unbalanced workload partitioning)
– SP-MZ (balanced workload partitioning)
– LU-MZ (balanced workload partitioning)
Performance Prediction
Prediction Result Comparison
Conclusion
• Traditional speedup models are unsuitable for multi-level parallelism
– They are not aware of the different granularities of parallelism in multi-level parallel computing.
• Multi-level Parallelism Model
– A guidance model for multi-level optimization.
– A prediction model for multi-level parallelism.
Argument Estimation
Speedup Under E-Amdahl’s Law
Speedup Under E-Gustafson’s Law