Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Amir Kamil and Katherine Yelick, Titanium Group (http://titanium.cs.berkeley.edu), U.C. Berkeley. LCPC ’05, October 20, 2005.


Page 1: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

LCPC ’05: Concurrency Analysis (Amir Kamil)

Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Amir Kamil and Katherine Yelick
Titanium Group
http://titanium.cs.berkeley.edu
U.C. Berkeley
October 20, 2005

Page 2: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Motivation/Goals

• Many program analyses and optimizations benefit from knowing which expressions can run concurrently
• Develop basic concurrency analysis for Titanium programs
• Refine analysis to ignore infeasible paths
• Evaluate analysis using two applications
  • Race detection
  • Memory model enforcement

Page 3: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Barrier Alignment

• Many parallel languages make no attempt to ensure that barriers line up
• Example code that is legal but will deadlock:

    if (Ti.thisProc() % 2 == 0)
      Ti.barrier(); // even ID threads
    else
      ; // odd ID threads

Page 4: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Structural Correctness

• Aiken and Gay introduced structural correctness (POPL ’98)
  • Ensures that every thread executes the same number of barriers
• Example of structurally correct code:

    if (Ti.thisProc() % 2 == 0)
      Ti.barrier(); // even ID threads
    else
      Ti.barrier(); // odd ID threads

Page 5: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Textual Barrier Alignment

• Titanium has textual barriers: all threads must execute the same textual sequence of barriers
  • Stronger guarantee than structural correctness; this example is illegal:

    if (Ti.thisProc() % 2 == 0)
      Ti.barrier(); // even ID threads
    else
      Ti.barrier(); // odd ID threads

• Single-valued expressions used to enforce textual barriers
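The difference between the two guarantees can be illustrated with a small Python sketch. It is not how Titanium enforces alignment (Titanium checks it statically, using single-valued expressions); recording a per-thread trace of barrier sites, and the names `structurally_correct`, `textually_aligned`, and `run_example`, are assumptions made purely for illustration.

```python
def structurally_correct(traces):
    """True if every thread executed the same number of barriers."""
    return len({len(trace) for trace in traces}) <= 1


def textually_aligned(traces):
    """True if every thread executed the same sequence of textual barrier sites."""
    return len({tuple(trace) for trace in traces}) <= 1


def run_example(num_threads):
    # Hypothetical trace of the example above: even threads reach the barrier
    # in the 'then' branch, odd threads reach the textually different barrier
    # in the 'else' branch.
    traces = []
    for tid in range(num_threads):
        site = "then-branch barrier" if tid % 2 == 0 else "else-branch barrier"
        traces.append([site])
    return traces


traces = run_example(4)
print(structurally_correct(traces))  # same barrier count on every thread
print(textually_aligned(traces))     # but not the same textual barriers
```

With four threads the example counts as structurally correct but not textually aligned, which matches the slide's point that textual alignment is the stronger guarantee.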

Page 6: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Single-Valued Expressions

• A single-valued expression has the same value on all threads when evaluated
  • Example: Ti.numProcs() > 1
• All threads guaranteed to take the same branch of a conditional guarded by a single-valued expression
  • Only single-valued conditionals may have barriers
  • Example of legal barrier use:

    if (Ti.numProcs() > 1)
      Ti.barrier(); // multiple threads
    else
      ; // only one thread

Page 7: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Concurrency Graph

• Represents concurrency as a graph
  • Nodes are program expressions
  • If a path exists between two nodes, they can run concurrently
• Generated from control flow graph

    x++;
    Ti.barrier();
    if (b) z--;
    else z++;
    if (c [single]) w--;
    else y++;
    foo();

[Figure: concurrency graph for the code above, with nodes x++, b, z--, z++, c, w--, y++, call foo, method foo, ret foo]

Page 8: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Barriers

• Barriers prevent code before and after from running concurrently
• Nodes for barrier expressions removed from concurrency graph

    x++;
    Ti.barrier();
    ...

[Figure: the barrier node between x++ and the code that follows it is removed from the graph]

Page 9: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Non-Single Conditionals

• Branches of a non-single conditional can run concurrently
  • Different threads can take different branches
• Edge added in the concurrency graph between branches

    if (b)
      z--;
    else
      z++;
    ...

[Figure: node b leads to z-- and z++, with an added edge between the two branches]

Page 10: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Single Conditionals

• Branches of a single conditional cannot run concurrently
  • All threads take the same branch
• No edge added in the concurrency graph between branches

    if (c [single])
      w--;
    else
      y++;
    ...

[Figure: node c leads to w-- and y++, with no edge between the two branches]

Page 11: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Method Calls

• Method call nodes split into a call and return node
• Edges added from call node to target method’s subgraph, and from target method to return node

    ...
    foo();
    ...

[Figure: call foo leads into the subgraph for method foo, which leads to ret foo]
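The construction rules from the last four slides can be sketched in Python as follows. This is a simplified illustration, not the Titanium compiler's implementation: the function name `build_concurrency_graph`, the edge-list input format, and the assumption that each branch of a conditional is a single node are all hypothetical.

```python
def build_concurrency_graph(cfg_edges, barriers, non_single_branches):
    """Derive a concurrency graph from a control flow graph.

    cfg_edges: list of (src, dst) control flow edges; method calls are assumed
        to be already split into call and return nodes.
    barriers: set of barrier nodes.
    non_single_branches: list of (then_node, else_node) pairs for conditionals
        whose guard is not single-valued.
    """
    graph = {}
    for src, dst in cfg_edges:
        graph.setdefault(src, set())
        graph.setdefault(dst, set())
        # Dropping every edge that touches a barrier removes the barrier node.
        if src not in barriers and dst not in barriers:
            graph[src].add(dst)
    for node in barriers:
        graph.pop(node, None)
    # Different threads may take different branches of a non-single
    # conditional, so connect the two branches in both directions.
    for then_node, else_node in non_single_branches:
        graph[then_node].add(else_node)
        graph[else_node].add(then_node)
    return graph


# Control flow of the running example from the Concurrency Graph slide.
cfg = [
    ("x++", "barrier"), ("barrier", "b"),
    ("b", "z--"), ("b", "z++"), ("z--", "c"), ("z++", "c"),
    ("c", "w--"), ("c", "y++"), ("w--", "call foo"), ("y++", "call foo"),
    ("call foo", "method foo"), ("method foo", "ret foo"),
]
graph = build_concurrency_graph(cfg, {"barrier"}, [("z--", "z++")])
```

In the resulting graph, x++ has no outgoing edges because the barrier was removed, z-- and z++ are connected, and w-- and y++ are not.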

Page 12: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Concurrency Algorithm

• Two accesses can run concurrently if at least one is reachable from the other
• Concurrent accesses computed by doing N depth-first searches

    x++;
    Ti.barrier();
    if (b) z--;
    else z++;
    if (c [single]) w--;
    else y++;
    foo();

[Figure: concurrency graph for the code above, with nodes x++, b, z--, z++, c, w--, y++, call foo, method foo, ret foo]
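A minimal Python sketch of this reachability computation follows. The adjacency list is a hand-written guess at the concurrency graph for the example (barrier removed, edges added between z-- and z++), and running one search per node is how the sketch interprets "N depth-first searches"; both are assumptions rather than details taken from the slides.

```python
def reachable_from(graph, start):
    """Iterative depth-first search: nodes reachable from start via one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


def concurrent_pairs(graph):
    """Unordered pairs of nodes where at least one is reachable from the other."""
    pairs = set()
    for a in graph:                      # one depth-first search per node
        for b in reachable_from(graph, a):
            pairs.add(frozenset((a, b)))
    return pairs


# Hypothetical concurrency graph for the example code.
graph = {
    "x++": [],                           # cut off by the removed barrier
    "b": ["z--", "z++"],
    "z--": ["z++", "c"],                 # non-single branches are connected
    "z++": ["z--", "c"],
    "c": ["w--", "y++"],
    "w--": ["call foo"],                 # single branches are not connected
    "y++": ["call foo"],
    "call foo": ["method foo"],
    "method foo": ["ret foo"],
    "ret foo": [],
}
pairs = concurrent_pairs(graph)
print(frozenset(("z--", "z++")) in pairs)
print(frozenset(("w--", "y++")) in pairs)
```

Because z-- and z++ lie on a cycle, each is also paired with itself, which corresponds to two threads executing the same expression at the same time.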

Page 13: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Infeasible Paths (I)

• Handling of method calls allows infeasible control flow paths
  • Path exists into one call site and out of another

[Figure: methods bar and baz each contain a call foo node and a ret foo node, all connected to the single subgraph for method foo]

Page 14: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Infeasible Paths (II)

• Solution: label call and return edges with matching parentheses
  • Only follow paths that correspond to balanced parentheses

[Figure: the same graph for methods bar, baz, and foo, with each call edge and its return edge labeled by a matching pair of parentheses]
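The matching-parentheses idea can be sketched in Python as a search that carries a stack of open call sites. This is an illustrative sketch only: the function name `feasible_reachable`, the edge-label encoding, and the choice to allow a return edge when the stack is empty (on the assumption that the path started inside the callee) are not taken from the slides. The sketch also assumes the call graph has no recursion, since otherwise the stack could grow without bound.

```python
def feasible_reachable(edges, start):
    """Nodes reachable from start (including start) along balanced paths.

    edges: dict mapping node -> list of (target, label), where label is None
    for an ordinary edge, ("call", site) for an open parenthesis, and
    ("ret", site) for the matching close parenthesis.
    """
    seen = set()
    reached = set()
    work = [(start, ())]
    while work:
        node, stack = work.pop()
        if (node, stack) in seen:
            continue
        seen.add((node, stack))
        reached.add(node)
        for target, label in edges.get(node, ()):
            if label is None:
                work.append((target, stack))
            elif label[0] == "call":
                work.append((target, stack + (label[1],)))   # push call site
            elif stack and stack[-1] == label[1]:
                work.append((target, stack[:-1]))            # matching return
            elif not stack:
                # Assumption: the path began inside the callee, so it may
                # return to any caller.
                work.append((target, stack))
    return reached


# bar and baz both call foo, as in the figure.
edges = {
    "bar: call foo": [("method foo", ("call", "bar"))],
    "baz: call foo": [("method foo", ("call", "baz"))],
    "method foo": [("bar: ret foo", ("ret", "bar")),
                   ("baz: ret foo", ("ret", "baz"))],
}
from_bar = feasible_reachable(edges, "bar: call foo")
print("baz: ret foo" in from_bar)
```

Starting from the call in bar, the search reaches the return node in bar but not the one in baz, so the infeasible path into one call site and out of another is excluded.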

Page 15: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Bypass Edges (I)

• Reachability now depends on context
• Inefficient to revisit method in every context
• Solution: add edges to bypass method calls

[Figure: call foo, method foo, and ret foo nodes, illustrating bypass edges]

Page 16: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Bypass Edges (II)

• Can only bypass method calls that can actually complete (without executing a barrier)
• Iteratively compute set of methods that can complete

    CanComplete ← ∅
    Do (until a fixed point is reached):
      CanComplete ← CanComplete ∪ {all methods that can complete by only calling methods in CanComplete}
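The fixed-point iteration can be sketched in Python as below. The representation is a deliberate simplification and an assumption of this sketch: each method is given as a list of alternative paths from entry to exit, and each path lists only the barriers and calls it contains. The names `can_complete_set` and `path_completes` are hypothetical.

```python
def path_completes(path, can_complete):
    """A path completes if it has no barrier and calls only completing methods."""
    for step in path:
        if step == "barrier":
            return False
        kind, callee = step              # otherwise step is ("call", name)
        if callee not in can_complete:
            return False
    return True


def can_complete_set(methods):
    """methods: dict mapping method name -> list of entry-to-exit paths."""
    can_complete = set()
    changed = True
    while changed:                       # iterate until a fixed point is reached
        changed = False
        for name, paths in methods.items():
            if name in can_complete:
                continue
            if any(path_completes(p, can_complete) for p in paths):
                can_complete.add(name)
                changed = True
    return can_complete


# Hypothetical program: names and bodies are made up for illustration.
methods = {
    "leaf": [[]],                                        # no barriers, no calls
    "sync": [["barrier"]],                               # always hits a barrier
    "calls_leaf": [[("call", "leaf")]],
    "calls_sync": [[("call", "sync")]],
    "either": [["barrier"], [("call", "calls_leaf")]],   # one barrier-free path
}
result = can_complete_set(methods)
print(sorted(result))
```

Only methods in the resulting set would be eligible for a bypass edge; in this example sync and calls_sync are excluded because every path through them executes a barrier.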

Page 17: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Static Race Detection

• Two heap accesses compose a data race if they can concurrently access the same location, and at least one is a write
• Alias and concurrency analysis used to statically compute set of possible data races
  • Analyses are sound, so all real races are detected
• Goal is to minimize number of false races detected

    Initially, x = 0
    T1: set x = x + 1
    T2: set x = x - 1
    Possible final values of x: -1, 0, 1
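The three possible final values can be reproduced with a short Python sketch that enumerates every interleaving of the two updates. It assumes each update is a separate read of x followed by a write of x, which the slide does not spell out; the function name `final_values` is hypothetical.

```python
from itertools import permutations


def final_values():
    """Final values of x over all interleavings of T1 (x = x + 1) and T2 (x = x - 1)."""
    results = set()
    # Each thread contributes two steps: first a read of x, then a write of x.
    for order in set(permutations(["T1", "T1", "T2", "T2"])):
        x = 0
        local = {}                        # value each thread read from x
        steps_done = {"T1": 0, "T2": 0}
        for tid in order:
            if steps_done[tid] == 0:
                local[tid] = x            # read step
            else:
                delta = 1 if tid == "T1" else -1
                x = local[tid] + delta    # write step
            steps_done[tid] += 1
        results.add(x)
    return results


print(sorted(final_values()))
```

The value 0 arises when one thread finishes before the other starts; -1 and 1 arise when both threads read the initial value and one write overwrites the other.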

Page 18: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Sequential Consistency

Definition: A parallel execution must behave as if it were an interleaving of the serial executions by individual threads, with each individual execution sequence preserving the program order [Lamport79].

    Initially, flag = data = 0
    T1: a [set data = 1]; x [set flag = 1]
    T2: y [read flag]; b [read data]

    Legal execution: a x y b
    Illegal execution: x y b a

[Figure: the accesses a, x, y, b form a critical cycle]

Titanium and most other languages do not provide sequential consistency due to the (perceived) cost of enforcing it.
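The definition can be checked on this example with a small Python sketch that enumerates every interleaving preserving each thread's program order and records what T2 reads. The function names `interleavings` and `sc_outcomes` are hypothetical, and the sketch models only this four-access example.

```python
def interleavings(t1, t2):
    """Yield every merge of t1 and t2 that preserves the order within each list."""
    if not t1:
        yield list(t2)
        return
    if not t2:
        yield list(t1)
        return
    for rest in interleavings(t1[1:], t2):
        yield [t1[0]] + rest
    for rest in interleavings(t1, t2[1:]):
        yield [t2[0]] + rest


def sc_outcomes():
    """Set of (flag read by y, data read by b) under sequential consistency."""
    outcomes = set()
    for order in interleavings(["a", "x"], ["y", "b"]):
        mem = {"data": 0, "flag": 0}
        reads = {}
        for op in order:
            if op == "a":
                mem["data"] = 1
            elif op == "x":
                mem["flag"] = 1
            elif op == "y":
                reads["flag"] = mem["flag"]
            else:                          # op == "b"
                reads["data"] = mem["data"]
        outcomes.add((reads["flag"], reads["data"]))
    return outcomes


print(sorted(sc_outcomes()))
```

The outcome in which y reads flag = 1 but b reads data = 0 never appears. That is the result of the illegal execution x y b a, which can only occur if a and x (or y and b) are reordered.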

Page 19: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Enforcing Sequential Consistency

• Compiler and architecture must not reorder memory accesses that are part of a critical cycle
• Fences inserted into program to enforce order
  • Potentially costly: can prevent optimizations such as code motion and communication aggregation
  • At runtime, can cost a round trip (RTT) on a distributed machine
• Goal is to minimize number of inserted fences

Page 20: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Benchmarks

    Benchmark   Lines [1]   Description
    lu-fact     420         Dense linear algebra
    gsrb        1090        Computational fluid dynamics kernel
    spmv        1493        Sparse matrix-vector multiply
    pps         3673        Parallel Poisson equation solver
    gas         8841        Hyperbolic solver for gas dynamics

[1] Line counts do not include the reachable portion of the ~37,000 line Titanium/Java 1.0 libraries

Page 21: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Analysis Levels

• We tested analyses of varying levels of precision
• All levels use alias analysis to distinguish memory locations

    Analysis    Description
    base        All expressions assumed concurrent
    concur      Basic concurrency analysis
    feasible    Feasible paths concurrency analysis

Page 22: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Race Detection Results

[Figure: bar chart "Number of Data Races Detected". x-axis: Benchmark (gas, gsrb, lu-fact, pps, spmv); y-axis: Fraction Compared to base (0 to 1.2); series: base, concur, feasible]

Page 23: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Sequential Consistency Results

[Figure: bar chart "Number of Static Memory Barriers". x-axis: Benchmark (gas, gsrb, lu-fact, pps, spmv); y-axis: Fraction Compared to base (0 to 1.2); series: base, concur, feasible]

Page 24: Concurrency Analysis for Parallel Programs with Textually Aligned Barriers

Conclusion

• Textual barriers and single-valued expressions allow for simple but precise concurrency analysis
• Concurrency analysis is useful both for detecting races and for enforcing sequential consistency
  • Not sufficient for race detection: too many false positives
  • Good enough for sequential consistency to be provided at low cost (SC|05)
• Ignoring infeasible paths significantly improves analysis results