Dynamic Determinism Checking for Structured Parallelism
description
Transcript of Dynamic Determinism Checking for Structured Parallelism
![Page 1: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/1.jpg)
Dynamic Determinism Checking for Structured Parallelism
Edwin Westbrook1, Raghavan Raman2, Jisheng Zhao3, Zoran Budimlić3, Vivek Sarkar3
1Kestrel Institute 2Oracle Labs 3Rice University
![Page 2: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/2.jpg)
Deterministic Parallelism for the 99%
• Parallelism is crucial for performance• Hard to understand for most programmers• Deterministic parallelism is much easier• In many cases, non-determinism is a bug
How to catch / prevent non-determinism bugs?
![Page 3: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/3.jpg)
Two Sorts of Code
1. High-performance parallel libraries– Uses complex and subtle parallel constructs– Written by concurrency experts: the 1%
2. Deterministic application code– Uses parallel libraries in a deterministic way– Parallelism behavior is straightforward– Written by everybody else: the 99%
We focus on application code
![Page 4: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/4.jpg)
Determinism via Commutativity
1. Identify pairs of operations which commute– Operations = parallel library primitives (the 1%)– Verified independently of this work
2. Ensure operations in parallel tasks commute– I.e., verify the application code (the 99%)
![Page 5: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/5.jpg)
Our Approach: HJd
• HJd = Habanero Java with determinism– Builds on our prior race-freedom work
[RV’11,ECOOP’12]• Determinism is checked dynamically– For application code, not parallel libraries
• Determinism failures throw exceptions– Because non-determinism is a bug!
• Checking itself uses a deterministic structure• Leads to low overhead: 1.26x slowdown!
![Page 6: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/6.jpg)
Example: Counting Factors in Parallel
class CountFactors { int countFactors (int n) { AtomicInteger cnt = new AtomicInteger(); finish { for (int i = 2; i < n; ++i) async { if (n % i == 0) cnt.increment(); }} return cnt.get ();}}
Fork task
Join child tasks
Increment cnt in parallel
Get result after finish
![Page 7: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/7.jpg)
Specifying Commutativity for Libraries
• Methods annotated with “commutativity sets”– Each pair of methods in set commute
• Syntax:@CommSets{S1, …,Sn} <method
sig>– States method is in sets S1 through Sn
– Commutes with all other methods in these sets
![Page 8: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/8.jpg)
Commutativity Sets for AtomicInteger
final class AtomicInteger { @CommSets{"read"} int get () { ... } @CommSets{"modify"} void increment() { ... } @CommSets{"modify"} void decrement() { ... } @CommSets{"read","modify"} int initValue() { ... } int incrementAndGet () { ... }
get commutes with itself
inc/dec commute with themselves and each other
Commutes with anything
Commutes with nothing (not even itself)
![Page 9: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/9.jpg)
Irreflexive Commutativity
• Some methods do not commute with themselves, but commute with other methods
• Specified with @Irreflexive
![Page 10: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/10.jpg)
Example #2: Blocking Atomic Queues
final class AtomicQueue { @CommSets{"modify"} @Irreflexive void add (Object o) { ... }
@CommSets{"modify"} @Irreflexive Object remove () { ... }}
Add and remove commute: queue order unchanged
Self-commuting add (or remove) changes the queue!
![Page 11: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/11.jpg)
Commutativity means Commutativity
• Queue add might self-commute for some uses– E.g. worklist-based algorithms: each queue item is
consumed in the same way• Still cannot mark add as self-commuting• Instead, change library to capture use case
![Page 12: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/12.jpg)
Example #3: Atomic Worklistsfinal class AtomicWorklist { interface Handler () { void handle (Object o); }
void setHandler (Handler h) { ... }
@CommSets{"modify"} void add (Object o) { ... }}
Now add self-commutes: all elems get same handler
![Page 13: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/13.jpg)
Dynamically Verifying Determinism
• Each parallel library call a dynamic check– Ensures no non-commuting methods could
possibly run in parallel• HJd exposes these checks to the user– Construct called permission region [RV ‘11]– Many calls can be grouped into one check– See our paper for more details
![Page 14: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/14.jpg)
The Key to Dynamic Checks: DPST
• DPST = Dynamic Program Structure Tree• Deterministic representation of parallelism• May-happen-in-parallel check with no synch– Low overhead!
![Page 15: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/15.jpg)
DPST by Examplefinish { finish { async { S1; } S2; } S3;}
F
F S3
A
S1
S2
![Page 16: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/16.jpg)
May-Happen-In-Parallel with DPST
1. Find least common ancestor (LCA) of 2 nodes2. Check if leftmost child of LCA on LCA node
path is an async3. If so, return true, otherwise return false
![Page 17: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/17.jpg)
MHIP with DPSTfinish { finish { async { S1; } S2; } S3;}
F
F S3
A
S1
S2
Not in parallel
![Page 18: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/18.jpg)
MHIP with DPSTfinish { finish { async { S1; } S2; } S3;}
F
F S3
A
S1
S2
Maybe in parallel
![Page 19: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/19.jpg)
Results: Slowdown ≈ 1.26x
![Page 20: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/20.jpg)
Conclusions: Determinism for the 99%
• Deterministic parallelism is easier• Non-determinism is often a bug– For application code, the 99%
• HJd reports non-determinism bugs with low overhead: 1.26x
![Page 21: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/21.jpg)
Backup Slides
![Page 22: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/22.jpg)
Related Work on HJ
• Race freedom determinism for HJ (PPPJ ’11)• Finish accumulators (WoDet ‘13)• Race-freedom for HJ (RV ‘11, ECOOP ‘12)
![Page 23: Dynamic Determinism Checking for Structured Parallelism](https://reader035.fdocuments.us/reader035/viewer/2022062410/568160b8550346895dcfdece/html5/thumbnails/23.jpg)
DPST for Permission Regionsfinish { permit m1(x) { async { S1; }} async { permit m2(y) { S2; } } permit m3(x) { S3;}}
F
P1 P3
A1
S1
P2
A2
S2
S3