Transactional Memory The Art of Multiprocessor Programming Herlihy and Shavit.


From the New York Times …

SAN FRANCISCO, May 7, 2004 - Intel said on Friday that it was scrapping its development of two microprocessors, a move that is a shift in the company's business strategy….


Moore’s Law

(hat tip: Herb Sutter)

Clock speed flattening sharply

Transistor count still rising

© 2006 Herlihy & Shavit

The Problem

• Cannot exploit cheap threads
• Today’s Software
  – Non-scalable methodologies
• Today’s Hardware
  – Poor support for scalable synchronization


Locking

Coarse-Grained Locking

Easily made correct … but not scalable.


Fine-Grained Locking

Here comes trouble …

Why Locking Doesn’t Scale

• Not Robust
• Relies on conventions
• Hard to Use
  – Conservative
  – Deadlocks
  – Lost wake-ups
• Not Composable

Locks are not Robust

If a thread holding a lock is delayed, no one else can make progress.


Locking Relies on Conventions

• Relation between
  – Lock bit and object bits
  – Exists only in programmer’s mind

/*
 * When a locked buffer is visible to the I/O layer
 * BH_Launder is set. This means before unlocking
 * we must clear BH_Launder,mb() on alpha and then
 * clear BH_Lock, so no reader can see BH_Launder set
 * on an unlocked buffer and then risk to deadlock.
 */

Actual comment from Linux Kernel (hat tip: Bradley Kuszmaul)


Sadistic Homework

Double-ended queue: enq(x) and enq(y) at opposite ends

No interference if ends “far enough” apart

Sadistic Homework

Double-ended queue: deq() and deq() at opposite ends

Interference OK if ends “close enough” together

Sadistic Homework

Double-ended queue: deq() and deq() at opposite ends

Make sure suspended dequeuers awake as needed

In Search of the Lost Wake-Up

• Waiting thread doesn’t realize when to wake up
• It’s a real problem in big systems
  – “Calling pthread_cond_signal() or pthread_cond_broadcast() when the thread does not hold the mutex lock associated with the condition can lead to lost wake-up bugs.”

from Google™ search for “lost wake-up”
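A minimal Java sketch of the usual defence against lost wake-ups, assuming a monitor-based queue. The class name GuardedQueue is hypothetical and not from the slides. The two habits it illustrates are signalling only while holding the lock, and rechecking the condition in a loop after every wake-up.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical unbounded blocking queue, for illustration only.
final class GuardedQueue<T> {
    private final Deque<T> items = new ArrayDeque<>();

    // Signal while holding the monitor, so a waiter cannot check the
    // condition and go to sleep between our update and our notification.
    public synchronized void enq(T x) {
        items.addLast(x);
        notifyAll(); // wake every waiter; each one rechecks the condition
    }

    // Recheck the condition in a loop: a wake-up is a hint, not a guarantee.
    public synchronized T deq() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        return items.removeFirst();
    }
}
```

Using notifyAll() rather than notify() costs some extra wake-ups, but it avoids waking a single thread that then finds nothing to do while others stay asleep.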

You Try It …

• One lock?
  – Too Conservative
• Locks at each end?
  – Deadlock, too complicated, etc.
• Waking blocked dequeuers?
  – Harder than it looks

Actual Solution

• Clean solution would be a publishable result
  – [Michael & Scott, PODC 96]
• High performance fine-grained lock-based solutions are good for libraries…
• not general consumption by programmers


Locks do not compose

Hash Table: add(T1, item)
Must lock T1 before adding item

Move from T1 to T2: delete(T1, item), add(T2, item)
Must lock T2 before deleting from T1

Exposing lock internals breaks abstraction
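A rough Java sketch of the move operation, assuming each table exposes its lock and a unique id. LockedTable, its fields and move() are hypothetical names, not from the slides. It shows why composition leaks: the caller has to see both locks and agree on an acquisition order.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical table whose lock is exposed so that callers can compose operations.
final class LockedTable<T> {
    final Object lock = new Object(); // exposed: this is the abstraction leak
    final int id;                     // assumed unique, used only for lock ordering
    private final Set<T> items = new HashSet<>();

    LockedTable(int id) { this.id = id; }

    boolean add(T x)      { synchronized (lock) { return items.add(x); } }
    boolean delete(T x)   { synchronized (lock) { return items.remove(x); } }
    boolean contains(T x) { synchronized (lock) { return items.contains(x); } }

    // Move x from 'from' to 'to' while holding both table locks.
    static <T> boolean move(LockedTable<T> from, LockedTable<T> to, T x) {
        // Acquire in id order so that two opposite moves cannot deadlock.
        LockedTable<T> first = from.id < to.id ? from : to;
        LockedTable<T> second = first == from ? to : from;
        synchronized (first.lock) {
            synchronized (second.lock) {
                if (!from.delete(x)) return false; // monitors are reentrant
                to.add(x);
                return true;
            }
        }
    }
}
```

Every new composite operation needs the same care, and a single caller that ignores the ordering convention can reintroduce deadlock.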

Monitor Wait and Signal

zzz

If buffer is empty, wait for item to show up

Empty buffer

Yes!


Wait and Signal do not Compose

empty

empty

zzz…

Wait for either?

The Transactional Manifesto

• What we do now is inadequate to meet the multicore challenge
• Research Agenda
  – Replace locking with a transactional API
  – Design languages to support this model
  – Implement the run-time to be fast enough

Transactions

• Atomic
  – Commit: takes effect
  – Abort: effects rolled back
    • Usually retried
• Linearizable
  – Appear to happen in one-at-a-time order

Atomic Blocks

atomic {
  x.remove(3);
  y.add(3);
}

atomic {
  y = null;
}

No data race

Sadistic Homework Revisited

public void LeftEnq(item x) {
  Qnode q = new Qnode(x);
  q.left = this.left;
  this.left.right = q;
  this.left = q;
}

Write sequential code

Sadistic Homework Revisited

public void LeftEnq(item x) {
  atomic {
    Qnode q = new Qnode(x);
    q.left = this.left;
    this.left.right = q;
    this.left = q;
  }
}

Enclose in atomic block

Warning

• Not always this simple
  – Conditional waits
  – Enhanced concurrency
  – Overlapping locks
• But often it is
  – Works for sadistic homework

Composition

public void Transfer(Queue<T> q1, Queue<T> q2) {
  atomic {
    T x = q1.deq();
    q2.enq(x);
  }
}

Trivial or what?

Wake-ups: lost and found

public T LeftDeq() {
  atomic {
    if (this.left == null)
      retry;
    …
  }
}

Roll back transaction and restart when something changes

OrElse Composition

atomic {
  x = q1.deq();
} orElse {
  x = q2.deq();
}

Run 1st method. If it retries …
Run 2nd method. If it retries …
Entire statement retries


Transactional Memory

• Software transactional memory (STM)

• Hardware transactional memory (HTM)

• Hybrid transactional memory (HyTM, try in hardware and default to software if unsuccessful)

Hardware versus Software

• Do we need hardware at all?
  – Analogies:
    • Virtual memory: yes!
    • Garbage collection: no!
  – Probably do need HW for performance
• Do we need software?
  – Policy issues don’t make sense for hardware


Design Issues

• Implementation choices

• Language design issues

• Semantic issues

Granularity

• Object
  – managed languages, Java, C#, …
  – Easy to control interactions between transactional & non-trans threads
• Word
  – C, C++, …
  – Hard to control interactions between transactional & non-trans threads

Direct/Deferred Update

• Deferred
  – Modify private copies & install on commit
  – Commit requires work
  – Consistency easier
• Direct
  – Modify in place, roll back on abort
  – Makes commit efficient
  – Consistency harder
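The two update policies can be contrasted in a small single-threaded Java sketch. It assumes memory is a plain int array and leaves out all synchronization and conflict detection; DeferredTxn and DirectTxn are hypothetical names used only to show where the work lands at commit and at abort.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Deferred update: buffer writes privately, install them on commit.
final class DeferredTxn {
    private final Map<Integer, Integer> writeBuffer = new HashMap<>(); // index -> new value
    private final int[] memory;

    DeferredTxn(int[] memory) { this.memory = memory; }

    int read(int i)          { return writeBuffer.getOrDefault(i, memory[i]); }
    void write(int i, int v) { writeBuffer.put(i, v); }       // memory untouched
    void commit() {                                           // commit does the work
        writeBuffer.forEach((i, v) -> memory[i] = v);
        writeBuffer.clear();
    }
    void abort()             { writeBuffer.clear(); }         // abort is trivial
}

// Direct update: write in place, keep an undo log for rollback.
final class DirectTxn {
    private final Deque<int[]> undoLog = new ArrayDeque<>();  // {index, old value}
    private final int[] memory;

    DirectTxn(int[] memory) { this.memory = memory; }

    int read(int i)          { return memory[i]; }
    void write(int i, int v) {
        undoLog.push(new int[] {i, memory[i]});               // remember the old value
        memory[i] = v;                                        // visible immediately
    }
    void commit()            { undoLog.clear(); }             // commit is trivial
    void abort() {                                            // abort does the work
        while (!undoLog.isEmpty()) {
            int[] entry = undoLog.pop();                      // newest entry first
            memory[entry[0]] = entry[1];
        }
    }
}
```

In the direct variant other threads could observe values that are later rolled back, which is one reason the slide lists consistency as harder there.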

Conflict Detection

• Eager
  – Detect before conflict arises
  – “Contention manager” module resolves
• Lazy
  – Detect on commit/abort
• Mixed
  – Eager write/write, lazy read/write …


Conflict Detection

• Eager detection may abort transaction that could have committed.

• Lazy detection discards more computation.

Contention Managers

• Oracle that resolves conflicts
  – For eager conflict detection
• CM decides
  – Whether to abort other transaction
  – Or give it a chance to finish …

Contention Manager Strategies

• Exponential backoff
• Oldest gets priority
• Most work done gets priority
• Non-waiting has priority over waiting
• Lots of alternatives
  – None seems to dominate
  – But choice can have big impact
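As one illustration of the first strategy, here is a small Java sketch of randomized exponential backoff. The class name Backoff and its parameters are assumptions for this example, and minDelayNanos is assumed to be positive. A transaction that loses a conflict would call backoff() before retrying.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.LockSupport;

// Hypothetical backoff policy; the names and limits are illustrative.
final class Backoff {
    private final long maxDelayNanos;
    private long limitNanos;

    Backoff(long minDelayNanos, long maxDelayNanos) {
        this.limitNanos = minDelayNanos;   // assumed > 0
        this.maxDelayNanos = maxDelayNanos;
    }

    // Pick a random delay in [1, limit], then double the limit up to the cap.
    long nextDelayNanos() {
        long delay = ThreadLocalRandom.current().nextLong(limitNanos) + 1;
        limitNanos = Math.min(maxDelayNanos, 2 * limitNanos);
        return delay;
    }

    // Called by a transaction that lost a conflict, before it retries.
    void backoff() {
        LockSupport.parkNanos(nextDelayNanos());
    }
}
```

The randomization keeps two conflicting transactions from retrying in lockstep; the doubling spreads retries out as contention persists.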

I/O & System Calls?

• Some I/O revocable
  – Provide transaction-safe libraries
  – Undoable file system/DB calls
• Some not
  – Opening cash drawer
  – Firing missile

I/O & System Calls

• One solution: make transaction irrevocable
  – If transaction tries I/O, switch to irrevocable mode.
• There can be only one …
  – Requires serial execution
• No explicit aborts
  – In irrevocable transactions

Exceptions

int i = 0;
try {
  atomic {
    i++;
    node = new Node();
  }
} catch (Exception e) {
  print(i);
}

Throws OutOfMemoryException!

What is printed?

Unhandled Exceptions

• Aborts transaction
  – Preserves invariants
  – Safer
• Commits transaction
  – Like locking semantics
  – What if exception object refers to values modified in transaction?

Nested Transactions

atomic void foo() {
  bar();
}

atomic void bar() {
  …
}

Nested Transactions

• Needed for modularity
  – Who knew that cosine() contained a transaction?
• Flat nesting
  – If child aborts, so does parent
• First-class nesting
  – If child aborts, partial rollback of child only

Open Nested Transactions

• Normally, child commit
  – Visible only to parent
• In open nested transactions
  – Commit visible to all
  – Escape mechanism
  – Dangerous, but useful
• What escape mechanisms are needed?

Privatization

Thread 0

Node n;
atomic {
  n = head;
  if (n != null)
    head = head.next;
}
// use n.val many times

“privatize” node

Access without transactional synchronization

Transactions

Thread 0

Node n;
atomic {
  n = head;
  if (n != null)
    head = head.next;
}
// use n.val many times

Thread 1

atomic {
  Node n = head;
  while (n != null) {
    n.val++;
    n = n.next;
  }
}

Is this safe? For locks, yes.

For STMs, maybe not.

Strong vs Weak Isolation

• Non-transactional accesses don’t synchronize
• Non-transactional thread may see delayed updates/recovery
• Solutions
  – Privatization barrier?
  – Other techniques?

Single Global Lock Semantics

• A transaction acts as if it acquires a single global lock (SGL)
• Good:
  – Intuitively appealing
• Bad:
  – What about aborted transactions?
  – May be expensive to implement
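To make the reference semantics concrete, here is a Java sketch in which every atomic block literally runs under one process-wide lock. Sgl and atomic() are hypothetical names. This is a model of the semantics only, not how a real STM runs transactions, and it has no rollback, which is exactly the open question about aborted transactions.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical helper: every "atomic" block runs under one global lock.
final class Sgl {
    private static final ReentrantLock GLOBAL = new ReentrantLock();

    static <T> T atomic(Supplier<T> body) {
        GLOBAL.lock();           // reentrant, so nested atomic blocks flatten
        try {
            return body.get();
        } finally {
            GLOBAL.unlock();     // released even if the body throws; no undo
        }
    }
}
```

Under this model composition is trivial, since a body that calls Sgl.atomic again simply re-enters the same lock, but no two transactions ever overlap.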

Simple Lock-Based STM

• STMs come in different forms
  – Lock-based
  – Lock-free
• Here we will describe a simple lock-based STM

Synchronization

• Transaction keeps
  – Read set: locations & values read
  – Write set: locations & values to be written
• Deferred update
  – Changes installed at commit
• Lazy conflict detection
  – Conflicts detected at commit

STM: Transactional Locking

[Figure: Application Memory locations map to an Array of Versioned-Write-Locks, each holding a version number V#]

Reading an Object

• Put V#s & value in RS
• If not already locked

[Figure: Mem and Locks arrays; each lock holds a V#]

To Write an Object

• Add V# and new value to WS

[Figure: Mem and Locks arrays; each lock holds a V#]

To Commit

• Acquire W locks
• Check V#s unchanged
  – In RS & WS
• Install new values
• Increment V#s
• Release …

[Figure: Mem and Locks arrays; locations X and Y are written and their V#s become V#+1]
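The read, write and commit steps can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: Cell, Txn and AbortedException are hypothetical names, values are plain ints, locks are taken in a fixed id order to avoid deadlock, and an abort is signalled by an exception that the caller is expected to catch and retry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical memory cell guarded by a versioned write lock (lock + V#).
final class Cell implements Comparable<Cell> {
    private static final AtomicInteger IDS = new AtomicInteger();
    final int id = IDS.getAndIncrement();        // fixed order for lock acquisition
    final ReentrantLock lock = new ReentrantLock();
    volatile long version;                       // V#
    volatile int value;

    Cell(int initial) { value = initial; }
    @Override public int compareTo(Cell o) { return Integer.compare(id, o.id); }
}

final class AbortedException extends RuntimeException {}

final class Txn {
    private final Map<Cell, Long> seen = new HashMap<>();            // cell -> V# observed
    private final TreeMap<Cell, Integer> writeSet = new TreeMap<>(); // cell -> new value

    int read(Cell c) {
        Integer pending = writeSet.get(c);
        if (pending != null) return pending;     // read our own deferred write
        long v = c.version;
        int val = c.value;
        // Abort if a committer holds the lock or the version moved under us.
        if (c.lock.isLocked() || c.version != v) throw new AbortedException();
        seen.putIfAbsent(c, v);
        return val;
    }

    void write(Cell c, int newValue) {
        seen.putIfAbsent(c, c.version);          // record V# for commit-time validation
        writeSet.put(c, newValue);               // deferred: memory is untouched
    }

    void commit() {
        for (Cell c : writeSet.keySet()) c.lock.lock();   // iterated in id order
        try {
            for (Map.Entry<Cell, Long> e : seen.entrySet()) {
                Cell c = e.getKey();
                boolean lockedByOther = c.lock.isLocked() && !c.lock.isHeldByCurrentThread();
                if (lockedByOther || c.version != e.getValue()) throw new AbortedException();
            }
            for (Map.Entry<Cell, Integer> e : writeSet.entrySet()) {
                e.getKey().value = e.getValue();  // install new value
                e.getKey().version++;             // increment V# (we hold the lock)
            }
        } finally {
            for (Cell c : writeSet.keySet()) c.lock.unlock();
        }
    }
}
```

Because validation happens only at commit, a doomed transaction can keep running on a mix of old and new values until it tries to commit.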

Transactional Consistency

Invariant x = 2y

Application Memory: x = 4, y = 2 before Transaction A; x = 8, y = 4 after

Transaction A: Write x, Write y

Transaction B: Read x, Read y, Compute z = 1/(x-y) = 1/2

Zombies!

• A transaction that will abort because of a synchronization conflict
• But doesn’t know it yet
• And keeps running

Zombies!

Invariant x = 2y

Application Memory: x = 4, y = 2 before Transaction A; x = 8, y = 4 after

Transaction B: Read x = 4

Transaction A: Write x (kills B), Write y

Transaction B (zombie): Read y = 4, Compute z = 1/(x-y)

DIV by 0 ERROR

Solution: The “Global Clock”

• Have one shared global clock
• Incremented by (small subset of) writing transactions
• Read by all transactions
• Used to validate that state worked on is always consistent

Read-Only Transactions

• Copy V Clock to RV
• Read lock, V#
• Read mem
• Check unlocked
• Recheck V# unchanged
• Check V# < RV

[Figure: Mem and Locks arrays with V#s 12, 32, 56, 19, 17; Shared Version Clock = 100; Private Read Version = 100]

Reads form a snapshot of memory. No read set!
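A hedged Java sketch of the read-only path, replaying the x = 2y scenario single-threaded. GlobalClock, VCell, Writer and SnapshotReader are hypothetical names. Two choices here are assumptions of this sketch rather than statements about the slides: the writer uses incrementAndGet, so its write version is strictly greater than any read version sampled earlier, and the reader accepts a cell when its V# is at most RV.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

final class GlobalClock {
    static final AtomicLong CLOCK = new AtomicLong();   // shared version clock
}

final class VCell {
    final ReentrantLock lock = new ReentrantLock();
    volatile long version;                               // V#
    volatile int value;
    VCell(int initial) { value = initial; }
}

// Simplified writer: blind writes only, so there is no read set to validate.
final class Writer {
    static void commit(VCell[] cells, int[] newValues) {
        for (VCell c : cells) c.lock.lock();             // assumes a fixed cell order
        try {
            long wv = GlobalClock.CLOCK.incrementAndGet(); // assumption: WV > old clock
            for (int i = 0; i < cells.length; i++) {
                cells[i].value = newValues[i];
                cells[i].version = wv;
            }
        } finally {
            for (VCell c : cells) c.lock.unlock();
        }
    }
}

// Read-only transaction: samples RV once and keeps no read set.
final class SnapshotReader {
    private final long rv = GlobalClock.CLOCK.get();

    int read(VCell c) {
        long before = c.version;
        int val = c.value;
        // Abort if locked, if the version changed during the read,
        // or if the cell was written after this transaction started.
        if (c.lock.isLocked() || c.version != before || before > rv) {
            throw new IllegalStateException("abort: inconsistent snapshot");
        }
        return val;
    }
}
```

With this check a reader that saw x = 4 aborts on its read of y after the writer commits, instead of going on to compute with x = 4 and y = 4.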

Regular Transactions

• Copy V Clock to RV
• On read/write, check:
  – Unlocked
  – V# ≤ RV
• Add to R/W set

[Figure: Mem and Locks arrays with V#s 12, 32, 56, 19, 17; Shared Version Clock = 100; Private Read Version = 100]

Regular Transactions

• Acquire locks
• WV = F&Inc(V Clock)
• Check each V# ≤ RV
• Update memory
• Set write V#s to WV

[Figure: Mem and Locks arrays with V#s 12, 32, 56, 19, 17; Shared Version Clock 100 → 101; Private Read Version = 100; written locations x and y shown with V# 100]

Hardware Transactional Memory

• Exploit Cache coherence protocols
• Already do almost what we need
  – Invalidation
  – Consistency checking
• Exploit Speculative execution
  – Branch prediction = optimistic synch!

HW Transactional Memory

[Figure: caches and memory joined by an interconnect; labels: read, active, T]

Transactional Memory

[Figure: caches and memory; labels: read, active, active, T, T]

Transactional Memory

[Figure: caches and memory; labels: active, committed, T, T]

Transactional Memory

[Figure: caches and memory; labels: write, active, committed, T, D]

Rewind

[Figure: caches and memory; labels: active, write, aborted, T, T, D]

Transaction Commit

• At commit point
  – If no cache conflicts, we win.
• Mark transactional entries
  – Read-only: valid
  – Modified: dirty (eventually written back)
• That’s all, folks!
  – Except for a few details …

Not all Skittles and Beer

• Limits to
  – Transactional cache size
  – Scheduling quantum
• Transaction cannot commit if it is
  – Too big
  – Too slow
  – Actual limits platform-dependent

Approaches

• Virtualization
  – Spill transactional cache to memory
• Hybrid
  – Use limited HTM as HW accelerator for STM
  – Sun’s Rock™ processor …

Back to Amdahl’s Law:

Speedup = 1/(ParallelPart/N + SequentialPart)

Pay for N = 8 cores, SequentialPart = 25%

Speedup = only 2.9 times!
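The 2.9 figure follows from substituting into the formula, assuming ParallelPart = 1 - SequentialPart = 0.75:

```latex
\[
\text{Speedup}
  = \frac{1}{\dfrac{\text{ParallelPart}}{N} + \text{SequentialPart}}
  = \frac{1}{\dfrac{0.75}{8} + 0.25}
  = \frac{1}{0.09375 + 0.25}
  = \frac{1}{0.34375}
  \approx 2.9
\]
```

Even with unlimited cores the ParallelPart/N term only shrinks to zero, so the speedup is bounded by 1/0.25 = 4.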

Must parallelize applications on a very fine grain!

So…How will we make use of multicores?

Traditional Scaling Process

[Figure: user code on a traditional uniprocessor; speedup 1.8x, 3.6x, 7x; time axis: Moore’s law]

Multicore Scaling Process

[Figure: user code on multicore; speedup 1.8x, 3.6x, 7x]

Unfortunately, not so simple…

Actual Scaling Process

[Figure: user code on multicore; speedup 1.8x, 2x, 2.9x]

Parallelization and Synchronization require great care…

Need Ways to Make Scaling Smoother…

[Figure: user code and TM code on multicore; speedup 1.8x, 3.6x, 7x]

I, for one, Welcome our new Multicore Overlords …

• Multicore architectures force us to rethink how we do synchronization
• Standard locking model won’t work
• Transactional model might
  – Software
  – Hardware
• A full-employment act!