Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice...

107
Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

Transcript of Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice...

Page 1: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Barrier Synchronization

Companion slides forThe Art of Multiprocessor

Programmingby Maurice Herlihy & Nir Shavit

Page 2: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

2

Simple Video Game

• Prepare frame for display– By graphics coprocessor

• “soft real-time” application– Need at least 35 frames/second– OK to mess up rarely

Page 3: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

3

Simple Video Gamewhile (true) {

frame.prepare();

frame.display();

}

Page 4: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

4

Simple Video Gamewhile (true) {

frame.prepare();

frame.display();

}

• What about overlapping work?– 1st thread displays frame– 2nd prepares next frame

Page 5: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

5

Two-Phase Renderingwhile (true) {

if (phase) {

frame[0].display();

} else {

frame[1].display();

}

phase = !phase;

}

while (true) {

if (phase) {

frame[1].prepare();

} else {

frame[0].prepare();

}

phase = !phase;

}

Page 6: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

6

Two-Phase Renderingwhile (true) {

if (phase) {

frame[0].display();

} else {

frame[1].display();

}

phase = !phase;

}

while (true) {

if (phase) {

frame[1].prepare();

} else {

frame[0].prepare();

}

phase = !phase;

}

Even phases

Page 7: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

7

Two-Phase Renderingwhile (true) {

if (phase) {

frame[0].display();

} else {

frame[1].display();

}

phase = !phase;

}

while (true) {

if (phase) {

frame[1].prepare();

} else {

frame[0].prepare();

}

phase = !phase;

}

odd phases

Page 8: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

8

Synchronization Problems

• How do threads stay in phase?• Too early?

– “we render no frame before its time”

• Too late?– Recycle memory before frame is

displayed

Page 9: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

9

Ideal Parallel Computation

00

0 11

1

Page 10: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

10

Ideal Parallel Computation

22

2 11

1

Page 11: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

11

Real-Life Parallel Computation

00

0 11zzz…

Page 12: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

12

Real-Life Parallel Computation

2

11zzz…

Uh, oh

0

Page 13: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

13

Barrier Synchronization

00

0

barrier

Page 14: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

14

Barrier Synchronization

barrier

11

1

Page 15: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

15

Barrier Synchronization

barrier

No thread enters here

Until every thread has left here

Page 16: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

16

Why Do We Care?

• Mostly of interest to– Scientific & numeric computation

• Elsewhere– Garbage collection– Less common in systems

programming– Still important topic

Page 17: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

17

Duality

• Dual to mutual exclusion– Include others, not exclude them

• Same implementation issues– Interaction with caches …

• Invalidation?• Local spinning?

Page 18: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

18

Example: Parallel Prefix

a b c d

a a+b a+b+ca+b+c

+d

before

after

Page 19: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

19

Parallel Prefix

a b c dOne threadPer entry

Page 20: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

20

Parallel Prefix: Phase 1

a b c d

a a+b b+c c+d

Page 21: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

21

Parallel Prefix: Phase 2

a b c d

a a+b a+b+ca+b+c

+d

Page 22: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

22

Parallel Prefix

• N threads can compute– Parallel prefix– Of N entries

– In log2 N rounds

• What if system is asynchronous?– Why we need barriers

Page 23: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

23

Prefixclass Prefix extends Thread { private int[] a; private int i; private Barrier b; public Prefix(int[] a, Barrier b, int i) { a = a; b = b; i = i; }

Page 24: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

24

Prefixclass Prefix extends Thread { private int[] a; private int i; private Barrier b; public Prefix(int[] a, Barrier b, int i) { a = a; b = b; i = i; }

Array of input values

Page 25: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

25

Prefixclass Prefix extends Thread { private int[] a; private int i; private Barrier b; public Prefix(int[] a, Barrier b, int i) { a = a; b = b; i = i; }

Thread index

Page 26: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

26

Prefixclass Prefix extends Thread { private int[] a; private int i; private Barrier b; public Prefix(int[] a, Barrier b, int i) { a = a; b = b; i = i; }

Shared barrier

Page 27: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

27

Prefixclass Prefix extends Thread { private int[] a; private int i; private Barrier b; public Prefix(int[] a, Barrier b, int i) { a = a; b = b; i = i; }

Initialize fields

Page 28: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

28

Where Do the Barriers Go?public void run() { int d = 1, sum = 0; while (d < N) { if (i >= d) sum = a[i-d]; if (i >= d) a[i] += sum; d = d * 2;}}}

Page 29: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

29

Where Do the Barriers Go?public void run() { int d = 1, sum = 0; while (d < N) { if (i >= d) sum = a[i-d]; b.await(); if (i >= d) a[i] += sum; d = d * 2;}}}

Page 30: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

30

Where Do the Barriers Go?public void run() { int d = 1, sum = 0; while (d < N) { if (i >= d) sum = a[i-d]; b.await(); if (i >= d) a[i] += sum; d = d * 2;}}}

Make sure everyone reads before anyone writes

Page 31: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

31

Where Do the Barriers Go?public void run() { int d = 1, sum = 0; while (d < N) { if (i >= d) sum = a[i-d]; b.await(); if (i >= d) a[i] += sum; b.await(); d = d * 2;}}}

Make sure everyone reads before anyone writes

Page 32: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

32

Where Do the Barriers Go?public void run() { int d = 1, sum = 0; while (d < N) { if (i >= d) sum = a[i-d]; b.await(); if (i >= d) a[i] += sum; b.await(); d = d * 2;}}}

Make sure everyone writes before anyone reads

Make sure everyone reads before anyone writes

Page 33: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

33

Barrier Implementations

• Cache coherence– Spin on locally-cached locations?– Spin on statically-defined locations?

• Latency– How many steps?

• Symmetry– Do all threads do the same thing?

Page 34: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

34

Barrierspublic class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}}

Page 35: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming35

Barriers

Number threads not yet arrived

Page 36: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming36

Barriers

Number threads participating

Page 37: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming37

BarriersInitialization

Page 38: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming38

Barriers

Principal method

Page 39: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming39

BarriersIf I’m last, reset

fields for next time

Page 40: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming40

Barriers

Otherwise, wait for everyone else

Page 41: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming41

Barriers

What’s wrong with this protocol?

Page 42: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

42

ReuseBarrier b = new Barrier(n);while ( mumble() ) { work(); b.await()}

Do work

synchronizerepeat

Page 43: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming43

Barriers

Page 44: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming44

Barriers

Waiting for Phase 1 to

finish

Page 45: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming45

Barriers

Waiting for Phase 1 to

finish

Phase 1 is so over

Page 46: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming46

Barriers

ZZZZZ….Prepare

for phase 2

Page 47: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; public Barrier(int n){ count = AtomicInteger(n); size = n; } public void await() { if (count.getAndDecrement()==1) { count.set(size); } else { while (count.get() != 0); }}}} Art of Multiprocessor

Programming47

Uh-Oh

Waiting for Phase 1 to

finish

Waiting for Phase 2 to

finish

Page 48: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

48

Basic Problem

• One thread “wraps around” to start phase 2

• While another thread is still waiting for phase 1

• One solution:– Always use two barriers

Page 49: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

49

Sense-Reversing Barrierspublic class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Page 50: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

50

Sense-Reversing Barriers

Completed odd or even-numbered

phase?

Page 51: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

51

Sense-Reversing Barriers

Store sense for next phase

Page 52: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

52

Sense-Reversing Barriers

Get new sense determined by last

phase

Page 53: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

53

Sense-Reversing Barriers

If I’m last, reverse sense for next time

Page 54: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

54

Sense-Reversing Barriers

Otherwise, wait for sense to flip

Page 55: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Barrier { AtomicInteger count; int size; boolean sense = false; threadSense = new ThreadLocal<boolean>…

public void await { boolean mySense = threadSense.get(); if (count.getAndDecrement()==1) { count.set(size); sense = !mySense } else { while (sense != mySense) {} } threadSense.set(!mySense)}}}

Art of Multiprocessor Programming

55

Sense-Reversing Barriers

Prepare sense for next phase

Page 56: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

56

2-barrier

2-barrier 2-barrier

Combining Tree Barriers

Page 57: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

57

2-barrier

2-barrier 2-barrier

Combining Tree Barriers

Page 58: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

58

Combining Tree Barrierpublic class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Page 59: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

59

Combining Tree BarrierParent barrier in

tree

Page 60: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

60

Combining Tree BarrierAm I last?

Page 61: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await();} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

61

Combining Tree BarrierProceed to parent barrier

Page 62: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

62

Combining Tree BarrierPrepare for next phase

Page 63: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

63

Combining Tree BarrierNotify others at this node

Page 64: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

public class Node{ AtomicInteger count; int size; Node parent; Volatile boolean sense;

public void await() {… if (count.getAndDecrement()==1) { if (parent != null) { parent.await()} count.set(size); sense = mySense } else { while (sense != mySense) {} }…}}}

Art of Multiprocessor Programming

64

Combining Tree BarrierI’m not last, so wait for

notification

Page 65: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

65

Combining Tree Barrier

• No sequential bottleneck– Parallel getAndDecrement() calls

• Low memory contention– Same reason

• Cache behavior– Local spinning on bus-based

architecture– Not so good for NUMA

Page 66: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

66

Remarks

• Everyone spins on sense field– Local spinning on bus-based (good)– Network hot-spot on distributed

architecture (bad)

• Not really scalable

Page 67: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

67

Tournament Tree Barrier

• If tree nodes have fan-in 2– Don’t need to call getAndDecrement()– Winner chosen statically

• At level i– If i-th bit of id is 0, move up– Otherwise keep back

Page 68: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

68

Tournament Tree Barriers

winner loser winner loser

winner loser

root

Page 69: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

69

Tournament Tree Barriers

All flags blue

Page 70: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

70

Tournament Tree Barriers

Loser thread sets winner’s

flag

Page 71: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

71

Tournament Tree Barriers

Loser spins on own flag

Page 72: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

72

Tournament Tree Barriers

Winner spins on own flag

Page 73: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

73

Tournament Tree Barriers

Winner sees own flag, moves up,

spins

Page 74: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

74

Tournament Tree BarriersBingo!

Page 75: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

75

Tournament Tree Barriers

Sense-reversing: next time use blue flags

Page 76: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

76

Tournament Barrierclass TBarrier { boolean flag; TBarrier partner; TBarrier parent; boolean top; …}

Page 77: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

77

Tournament Barrierclass TBarrier { boolean flag; TBarrier partner; TBarrier parent; boolean top; …}

Notifications delivered here

Page 78: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

78

Tournament Barrierclass TBarrier { boolean flag; TBarrier partner; TBarrier parent; boolean top; …}

Other thead at same level

Page 79: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

79

Tournament Barrierclass TBarrier { boolean flag; TBarrier partner; TBarrier parent; boolean top; …}

Parent (winner) or null (loser)

Page 80: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

80

Tournament Barrierclass TBarrier { boolean flag; TBarrier partner; TBarrier parent; boolean top; …}

Am I the root?

Page 81: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

81

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Page 82: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

82

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Le root, c’est moi

Sense is a Parameter to save

space…

Page 83: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

83

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

I am already a winner

Page 84: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

84

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Wait for partner

Page 85: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

85

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Synchronize upstairs

Page 86: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

86

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Inform partner

Page 87: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

87

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Natural-born loser

Page 88: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

88

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Tell partner I’m here

Page 89: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

89

Tournament Barriervoid await(boolean mySense) { if (top) { return; } else if (parent != null) { while (flag != mySense) {}; parent.await(mySense); partner.flag = mySense; } else { partner.flag = mySense; while (flag != mySense) {};}}}

Wait for notification from

partner

Page 90: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

90

Remarks

• No need for read-modify-write calls• Each thread spins on fixed location

– Good for bus-based architectures– Good for NUMA architectures

Page 91: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

91

Dissemination Barrier

• At round i– Thread A notifies thread A+2i (mod n)

• Requires log n rounds

Page 92: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

92

Dissemination Barrier+1 +2 +4

Page 93: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

93

Remarks

• Great for lovers of combinatorics• Not so much fun for those who

track memory footprints

Page 94: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

94

Ideas So Far

• Sense-reversing– Reuse without reinitializing

• Combining tree– Like counters, locks …

• Tournament tree– Optimized combining tree

• Dissemination barrier– Intellectually Pleasing (matter of taste)

Page 95: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

95

Which is best for Multicore?

• On a cache coherent multicore chip: none of the above…

• Use a static tree barrier!

Page 96: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

96

Static Tree BarrierAll spin on cached copy of Done flag

0

2

2

0

0

Counter in parent:Num of children that have not arrived

Page 97: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

97

Static Tree BarrierAll spin on cached copy of Done flag

0

2

2

0

0

10

10All detect change in Done flag

Page 98: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

98

Remarks

• Very little cache traffic• Minimal space overhead• Can be laid out optimally on NUMA

– Instead of Done bit, notification must be performed by passing sense down the static tree

Page 99: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Back to Amdahl’s Law:

Speedup = 1/(ParallelPart/N + SequentialPart)

Pay for N = 8 cores SequentialPart = 25%

Speedup = only 2.9 times!Must parallelize applications on a very fine grain!

So…How will we make use of multicores?

Page 100: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Need Fine-Grained Locking

75%Unshared

25%Shared

c cc c

c cc c

CoarseGrained

c

cc

c

c

c

c c

c cc c

c cc c

FineGrained c c

cc

cc

cc

The reason we get

only 2.9 speedup

75%Unshared

25%Shared

Page 101: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

101

Traditional Scaling Process

User code

TraditionalUniprocessor

Speedup1.8x1.8x

7x7x

3.6x3.6x

Time: Moore’s law

Page 102: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

102

Multicore Scaling Process

User code

Multicore

Speedup 1.8x1.8x

7x7x

3.6x3.6x

As noted, not so simple…

Page 103: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

103

Real-World Scaling Process

1.8x1.8x 2x2x 2.9x2.9x

User code

Multicore

Speedup

Parallelization and Synchronization will require great care…

Page 104: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Future: Smoother Scaling …

Abstractions(like TM)

Speedup 1.8x1.8x

7x7x

3.6x3.6x

Multicore

User code

Page 105: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Multicore Programming

• We are at the dawn of a new era…• Still a lot of research and

development to be done• And a body of multicore

programming experience to be accumelated by us all

© 2006 Herlihy & Shavit 105

Page 106: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

© 2006 Herlihy & Shavit 106

Thanks for Attending!

• Good luck with your programming…

• If you have questions or run into interesting problems, just email us: – [email protected][email protected]

Page 107: Barrier Synchronization Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Art of Multiprocessor Programming

107

         This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

• You are free:– to Share — to copy, distribute and transmit the work – to Remix — to adapt the work

• Under the following conditions:– Attribution. You must attribute the work to “The Art of

Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work).

– Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license.

• For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to– http://creativecommons.org/licenses/by-sa/3.0/.

• Any of the above conditions can be waived if you get permission from the copyright holder.

• Nothing in this license impairs or restricts the author's moral rights.