Vivek Kumar (1), Stephen M Blackburn (1), David Grove (2), Daniel Frampton (1, 3)
(1) The Australian National University, (2) IBM T.J. Watson Research, (3) Microsoft
High Performance Runtime for Next Generation Parallel Programming Languages
Hardware and Software Today
Background
High Performance Runtime for Next Generation Parallel Programming Languages | Kumar 2
The Challenge
Background
• Productivity
  • Language-based features to expose parallelism – X10, Cilk, Habanero etc.
• Performance
  • Work–stealing scheduling
• Portability
  • Managed runtime to hide the hardware complexities
Options ?
• Productivity
  • Language-based features to expose parallelism – X10, Cilk, Habanero-Java etc.
• Performance
  • Work–stealing scheduling
• Portability
  • Managed runtime to hide the hardware complexities
Background
Thesis Statement
High performance languages are using managed platforms for productivity and portability, but performance is inadequate. By exploiting and extending the underlying mechanisms of managed runtimes, implementations of these languages will be able to deliver scalability and performance at the levels necessary for widespread uptake.
Contributions
• High Productivity
• High Performance
• Highly Competitive
Understanding Work–Stealing
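As a minimal sketch of the work-stealing discipline, assuming one deque of pending tasks per worker: the owner pushes and pops at one end, and an idle thief takes from the other end. The class and method names here are hypothetical, and `ConcurrentLinkedDeque` is only a convenient stand-in; it is not claimed to be the data structure used by any runtime compared in this talk.

```java
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical illustration: one worker's deque of pending tasks.
class WorkStealingDequeSketch {
    private final ConcurrentLinkedDeque<String> tasks = new ConcurrentLinkedDeque<>();

    // Owner: make a new task available at the tail.
    void push(String task) { tasks.addLast(task); }

    // Owner: take the most recently pushed task (LIFO); null if empty.
    String pop() { return tasks.pollLast(); }

    // Thief: take the oldest task from the head (FIFO); null if empty.
    String steal() { return tasks.pollFirst(); }

    public static void main(String[] args) {
        WorkStealingDequeSketch victim = new WorkStealingDequeSketch();
        victim.push("A");
        victim.push("B");
        victim.push("C");
        System.out.println("thief steals " + victim.steal()); // oldest task
        System.out.println("owner pops " + victim.pop());     // newest task
    }
}
```

Because the owner and the thief work at opposite ends, they only contend when the deque is nearly empty, which is why steals can be treated as the rare case.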
Methodology
• Hardware Platform
  – 2x8 cores Intel Xeon E5-2450
• Software Platform
  – Jikes RVM (3.1.3)
• Benchmarks
  – UTS, BarnesHut, FFT, Jacobi, LUDecomposition, JGF_SeriesTest, HeatDiffusion, PointCorrelation, NQueens, Matmul, CilkSort and Fibonacci
    • To evaluate performance
  – JMetal (SourceForge project with 327 Java files)
    • To evaluate the productivity of our system
Big…… But How Big ??
Motivating Analysis
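Sequential Overhead
Motivating Analysis
[Figure: bar chart of sequential overhead (geomean, log-scale axis from 1 to 4) for Habanero-Java, ManagedX10, and Fork-Join; bar labels as extracted: 3.7x, 2.5x, 1.6x]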
Steal to Task Ratio
Motivating Analysis
[Figure: steal-to-task ratio (log scale, 0.0000001 to 1) versus number of threads (2 to 16)]
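To make the steal-to-task ratio concrete, here is a minimal sketch that measures it for a recursive Fibonacci on Java's Fork/Join framework. The class name, the fork counter, and the choice of Fibonacci are assumptions made for illustration; `ForkJoinPool.getStealCount()` is documented as an estimate, and the numbers this prints are not the measurements behind the plot.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical measurement harness, not the talk's methodology.
class StealRatioSketch {
    // Counts tasks that were forked, i.e. made available for stealing.
    static final AtomicLong FORKS = new AtomicLong();

    static final class Fib extends RecursiveTask<Long> {
        final int n;
        Fib(int n) { this.n = n; }

        @Override protected Long compute() {
            if (n < 2) return (long) n;
            Fib left = new Fib(n - 1);
            FORKS.incrementAndGet();
            left.fork();                            // stealable task
            long right = new Fib(n - 2).compute();  // run inline on this worker
            return left.join() + right;
        }
    }

    // Returns {fib(n), forked tasks, steals reported by the pool}.
    static long[] measure(int n, int threads) {
        FORKS.set(0);
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            long result = pool.invoke(new Fib(n));
            return new long[] { result, FORKS.get(), pool.getStealCount() };
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] r = measure(25, 4);
        System.out.println("fib=" + r[0] + " forks=" + r[1] + " steals=" + r[2]);
        System.out.println("steal-to-task ratio=" + (double) r[2] / r[1]);
    }
}
```

A small ratio means most forked tasks are executed by the worker that created them, so any cost paid at every fork is mostly wasted.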
Insights
• Move the overheads from the common case to the rare case
• Re-use existing mechanisms inside modern managed runtimes
Motivating Analysis
foo() {
  finish {
    async X = S1();
    Y = S2();
  }
}
Implementation
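[Diagram: the VICTIM's stack holds frames A, B, C, foo, and S1, with the stack growth direction indicated. A THIEF issues a steal; through the yieldpoint mechanism it obtains frames A, B, C, and foo and runs S2.]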
• Yieldpoint mechanism
• On-stack replacement
• Java try/catch exceptions
• Dynamic code patching
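For comparison, the `finish`/`async` pattern in `foo()` can be approximated with Java's Fork/Join framework, one of the systems evaluated in this talk. This is a sketch under stated assumptions, not the implementation described here: `s1()` and `s2()` are hypothetical stand-ins for S1 and S2, and Fork/Join exposes the forked task to thieves, whereas the diagram shows the thief taking the continuation (S2).

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical stand-ins for S1 and S2; the slide does not give their bodies.
class FinishAsyncSketch {
    static int s1() { return 1; }
    static int s2() { return 2; }

    // Wraps S1 so it can be forked and possibly run by another worker.
    static final class S1Task extends RecursiveTask<Integer> {
        @Override protected Integer compute() { return s1(); }
    }

    // Approximates: foo() { finish { async X = S1(); Y = S2(); } }
    static int[] foo() {
        return ForkJoinPool.commonPool().invoke(new RecursiveTask<int[]>() {
            @Override protected int[] compute() {
                S1Task async = new S1Task();
                async.fork();          // "async": S1 becomes stealable
                int y = s2();          // the current worker carries on with S2
                int x = async.join();  // "finish": wait for S1 before leaving
                return new int[] { x, y };
            }
        });
    }

    public static void main(String[] args) {
        int[] xy = foo();
        System.out.println("X=" + xy[0] + " Y=" + xy[1]);
    }
}
```

Note that this version allocates a task object and touches the deque on every `async`, whether or not a steal ever happens; that per-task cost is the kind of common-case overhead the insights above aim to move to the rare case.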
Evaluation
Sequential Overhead
[Figure: bar chart of sequential overhead (geomean, log-scale axis from 1 to 4) for Habanero-Java, ManagedX10, Fork-Join, and TryCatchWS; bar labels as extracted: 3.7x, 2.5x, 1.6x, and 7%]
Steal Rate
Motivating Analysis
[Figure: steals per millisecond (0 to 40) versus number of threads (2 to 16)]
Dynamic Overhead
Motivating Analysis
[Figure: two panels plotted against number of threads (2 to 16), with y-axis ranges 0 to 40 and 0 to 14; the extracted y-axis label is dynamic overhead (%)]
Insights
• Still the same
  – Re-use existing mechanisms inside modern managed runtimes
Motivating Analysis
Return Barrier
Hijack a return and bridge to some other method
[Diagram: stack of frames A, B, C, D, E with the stack growth direction indicated]
Implementation
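The idea of hijacking a return can be shown with a toy model of a call stack. This is only a sketch: a real return barrier patches return addresses inside the runtime, which plain Java code cannot do, so the `Frame` class, the `barrier` flag, and the trace strings below are all hypothetical. Placing the barrier on frame C is an arbitrary choice for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy model; it simulates returns instead of patching real ones.
class ReturnBarrierSketch {
    static final class Frame {
        final String method;
        boolean barrier; // true once this frame's return has been hijacked
        Frame(String method) { this.method = method; }
    }

    // Build a stack with A at the bottom and E on top.
    static Deque<Frame> buildStack() {
        Deque<Frame> stack = new ArrayDeque<>();
        for (String m : new String[] { "A", "B", "C", "D", "E" }) {
            stack.push(new Frame(m));
        }
        return stack;
    }

    // Install a barrier on the frame for the named method.
    static void installBarrier(Deque<Frame> stack, String method) {
        for (Frame f : stack) {
            if (f.method.equals(method)) f.barrier = true;
        }
    }

    // Unwind the stack, recording each return and any bridge that runs first.
    static List<String> unwind(Deque<Frame> stack) {
        List<String> trace = new ArrayList<>();
        while (!stack.isEmpty()) {
            Frame f = stack.pop();
            if (f.barrier) trace.add("bridge before return from " + f.method);
            trace.add("return from " + f.method);
        }
        return trace;
    }

    public static void main(String[] args) {
        Deque<Frame> stack = buildStack();
        installBarrier(stack, "C");
        unwind(stack).forEach(System.out::println);
    }
}
```

Frames above the barrier (E and D here) return at full speed; the extra work runs only when execution returns through the hijacked frame.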
Dynamic Overhead
Evaluation
[Figure: bar chart of dynamic overhead (%) at 16 threads (geomean, axis 0 to 2), comparing Old and New]
Productivity in a Large Code Base
• Project with several hundred files
• Multiple dependencies (inheritance…)
• Achieving parallelism
  – Minimal changes
  – Track fields with atomic updates
  – Avoid deadlocks
Motivating Analysis
Java Language Annotations
Implementation
• Annotate and leave the rest to the compiler
• Parallelism
  – syncsteal {…}
  – steal {…}
• Data-centric concurrency control (Dolby et al. 2012)
  – @Atomicsets(X)
  – @Atomic(X)
  – @AliasAtomic(Y=this.X)
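As a rough picture of what data-centric concurrency control asks for, the sketch below hand-writes the kind of locking a programmer would otherwise add manually: fields grouped into one set are always updated together under a single lock. This is an assumption about the intent of the annotations, not the code the compiler generates; the `Account` class, its fields, and `setLock` are hypothetical.

```java
// Hypothetical hand-written approximation of data-centric synchronization.
class AtomicSetSketch {
    static final class Account {
        // Fields `balance` and `deposits` are treated as one atomic set,
        // so a single lock guards every access to either of them.
        private final Object setLock = new Object();
        private long balance;
        private long deposits;

        void deposit(long amount) {
            synchronized (setLock) {
                balance += amount;
                deposits++;
            }
        }

        long[] snapshot() {
            synchronized (setLock) {
                return new long[] { balance, deposits };
            }
        }
    }

    // Run `threads` workers that each deposit 1 unit `perThread` times.
    static long[] run(int threads, int perThread) {
        Account account = new Account();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) account.deposit(1);
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.snapshot();
    }

    public static void main(String[] args) {
        long[] s = run(4, 1000);
        System.out.println("balance=" + s[0] + " deposits=" + s[1]);
    }
}
```

With annotations, the programmer would only declare which fields belong together and leave lock placement to the compiler, which is what keeps the changes to a large code base small.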
So Where Do We Stand …?
Work–Stealing Performance
Evaluation
[Figure: Jacobi, speedup over sequential (0 to 7) versus number of threads (1 to 16) for Habanero-Java, ManagedX10, Fork-Join, and TryCatchWS]
Work–Stealing Performance
Evaluation
[Figure: UTS, speedup over sequential (0 to 7) versus number of threads (1 to 16) for Habanero-Java, ManagedX10, Fork-Join, and TryCatchWS]
Summary and Conclusion
• Work–stealing overheads – sequential and dynamic
• Reused existing mechanisms inside modern managed runtimes
  – Yieldpoint mechanism
  – On-stack replacement
  – Java try/catch exception handling
  – Dynamic code patching
  – Return barrier
• Effectively eliminated sequential overhead (only 7%)
• Halved the dynamic overhead
• Annotations in Java to generate work-stealing calls and synchronization blocks
Summary