Analyzing Aborts in Software Transactional Memory


Page 1: Analyzing Aborts in  Software Transactional Memory

ANALYZING ABORTS IN SOFTWARE TRANSACTIONAL MEMORY

Presented by: Ofer Kiselov & Omer Kiselov

Supervised by: Dmitri Perelman

Final Presentation

Page 2: Analyzing Aborts in  Software Transactional Memory

Overview

  * Repeating the midterm presentation on the following subjects:
      * Software Transactional Memory abstraction
      * STM implementation example - TL2 overview
      * Aborts in STM
      * Unnecessary aborts in STM
  * Project goal
  * Implementation
      * Overview
      * Online part – implementation
      * Online logging
  * Evaluation
  * Hardware
  * Deuce
  * Benchmarks
  * Results
  * Conclusion and analysis
  * Nice to have
  * Future work

Page 3: Analyzing Aborts in  Software Transactional Memory

Importance Of Parallel Programming

  * Frequency barrier – the performance of a single-core processor cannot keep improving.
  * Switch to multi-cores.
  * Parallel programs allow utilizing multi-core processors.
  * Need for synchronization when accessing shared data.

Page 4: Analyzing Aborts in  Software Transactional Memory

Transactional Memory – why?

  * Current synchronization – locks
      * Coarse-grained – limits parallelism
      * Fine-grained – high programming complexity
      * Error-prone (deadlocks / livelocks)
  * Transactional memory solution
      * Intuitive for a programmer
      * Provides a “transaction” abstraction for a critical section (operations executed atomically)
      * Implemented in both software and hardware.

Page 5: Analyzing Aborts in  Software Transactional Memory

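Why Do Aborts Happen?

[Diagram: transactions T1, T2, T3 and T4 accessing OBJECT1 (O1) and OBJECT2 (O2); transactions are marked Committed or Aborted]

  * T1, T2 and T3 read from O1.
  * T4 reads from O2 and writes to O1.
  * T1, T2 and T3 write to O2.
  * To maintain consistency, if T4 commits, then T1, T2 and T3 must abort!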

Page 6: Analyzing Aborts in  Software Transactional Memory

Unnecessary Aborts

  * Aborts are bad: work is lost, resources are wasted, throughput decreases.
  * Some aborts are necessary: continuing the run would violate correctness.
  * And some aborts are not: an exact analysis of whether the algorithm should abort is too expensive.
  * “Unnecessary” abort: one that could be avoided, for example by keeping more versions or by a better check of transactional dependencies.

[Diagram: transactions T1, T2 and T3 accessing objects o1 and o2, with labels C and A]

Page 7: Analyzing Aborts in  Software Transactional Memory

Project Goals

Build a software analysis tool that:

  * measures abort statistics for a given run
  * evaluates how many of the aborts were unnecessary
  * evaluates the damage to performance

“Will it pay off to add designs to stop the unnecessary aborts?”

Page 8: Analyzing Aborts in  Software Transactional Memory

Project Formation

  * An offline part for analyzing the run:
      * reads the log of the run
      * gathers statistics
      * analyzes unnecessary aborts
  * An online part for logging the run:
      * is inserted into a specific algorithm run in a benchmark
      * flushes the run info to an XML log file

Page 9: Analyzing Aborts in  Software Transactional Memory

Offline Part

[Diagram boxes: XML Log, Parser, Analyzer, RUN DESCRIPTOR, Abort Analyzer, Matlab histograms and final analysis]

Output of the analysis is a precedence graph showing the transactions and their actions.

Parser

  * Every log line represents a transactional action, represented by the LogLine abstract class.
  * Parser responsibility:
      * iterate over the XML
      * create appropriate LogLine instances
  * LogLine factories for the different operation types:
      * transactional start
      * read operation
      * write operation
      * transactional commit

Analyzer

  * Gives basic statistics regarding the transactions run:
      * counts aborts per reason
      * counts reads and writes
      * counts transactions
  * Inserts the path into the Run Descriptor ADT struct.
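
The parser design can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's code: the subclass names, the `LogLineFactory` registry, the fields and the operation-type strings (`start`, `read`, `write`, `commit`) are hypothetical. Only the `LogLine` abstract class and the idea of one factory per operation type come from the slides. XML iteration is left out; the factory receives fields that are assumed to be already extracted from a log element.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Abstract base named in the slides; the fields are illustrative assumptions.
abstract class LogLine {
    final int txId;       // transaction that performed the action
    final String object;  // object touched, or null for start/commit

    LogLine(int txId, String object) {
        this.txId = txId;
        this.object = object;
    }

    abstract String kind();
}

final class StartLine extends LogLine {
    StartLine(int txId) { super(txId, null); }
    String kind() { return "start"; }
}

final class ReadLine extends LogLine {
    ReadLine(int txId, String object) { super(txId, object); }
    String kind() { return "read"; }
}

final class WriteLine extends LogLine {
    WriteLine(int txId, String object) { super(txId, object); }
    String kind() { return "write"; }
}

final class CommitLine extends LogLine {
    CommitLine(int txId) { super(txId, null); }
    String kind() { return "commit"; }
}

// Hypothetical registry holding one factory per operation type.
final class LogLineFactory {
    private static final Map<String, BiFunction<Integer, String, LogLine>> FACTORIES =
            new HashMap<>();

    static {
        FACTORIES.put("start", (tx, obj) -> new StartLine(tx));
        FACTORIES.put("read", (tx, obj) -> new ReadLine(tx, obj));
        FACTORIES.put("write", (tx, obj) -> new WriteLine(tx, obj));
        FACTORIES.put("commit", (tx, obj) -> new CommitLine(tx));
    }

    // Creates the LogLine subclass registered for the given operation type.
    static LogLine create(String type, int txId, String object) {
        BiFunction<Integer, String, LogLine> factory = FACTORIES.get(type);
        if (factory == null) {
            throw new IllegalArgumentException("unknown operation type: " + type);
        }
        return factory.apply(txId, object);
    }
}
```

A parser loop would call `LogLineFactory.create(...)` once per log element and hand the resulting instances to the analyzer.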

Page 10: Analyzing Aborts in  Software Transactional Memory

Transactional Dependencies

Run Descriptor is a precedence graph!

Page 11: Analyzing Aborts in  Software Transactional Memory

RUN DESCRIPTOR

[Diagram: transactions T1 and T4 attached as Reader and Writer to OBJECT1 and OBJECT2 and to OBJECT1 Version2 and OBJECT2 Version2, with two WaR edges]

In order to create the graph, we needed to establish a way to turn the basic run into a graph.

Page 12: Analyzing Aborts in  Software Transactional Memory

ABORTS ANALYZER

  * Searches for unnecessary aborts in the RUN DESCRIPTOR.
  * Speculatively adds the edges of the aborted transaction to the RUN DESCRIPTOR.
  * Using DFS, finds cycles in the precedence graph. Cycles represent necessary aborts.
  * Removes the edges at the end of the analysis.
  * Built as a visitor pattern: flexible for more complex analysis.
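
The speculative check can be sketched as below. This is a minimal illustration, assuming a plain adjacency-list graph over integer transaction IDs; the class `PrecedenceGraph` and its method names are hypothetical, and the visitor structure mentioned on the slide is left out. An abort is treated as necessary when adding the aborted transaction's edges closes a cycle.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative precedence graph: an edge u -> v means "u must precede v".
final class PrecedenceGraph {
    private final Map<Integer, List<Integer>> succ = new HashMap<>();

    void addEdge(int from, int to) {
        succ.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        succ.computeIfAbsent(to, k -> new ArrayList<>());
    }

    void removeEdge(int from, int to) {
        List<Integer> out = succ.get(from);
        if (out != null) {
            out.remove(Integer.valueOf(to)); // removes one occurrence of the edge
        }
    }

    // DFS keeping the set of nodes on the current path:
    // reaching a node that is still on the path closes a cycle.
    boolean hasCycle() {
        Set<Integer> visited = new HashSet<>();
        Set<Integer> onPath = new HashSet<>();
        for (Integer node : succ.keySet()) {
            if (dfs(node, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(Integer node, Set<Integer> visited, Set<Integer> onPath) {
        if (onPath.contains(node)) {
            return true;
        }
        if (!visited.add(node)) {
            return false; // already fully explored, no cycle through it
        }
        onPath.add(node);
        for (Integer next : succ.get(node)) {
            if (dfs(next, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        return false;
    }

    // Speculatively adds the aborted transaction's edges, checks for a cycle,
    // then removes the edges again so the graph is left unchanged.
    boolean abortWasNecessary(int[][] abortedTxEdges) {
        for (int[] e : abortedTxEdges) {
            addEdge(e[0], e[1]);
        }
        boolean cycle = hasCycle();
        for (int[] e : abortedTxEdges) {
            removeEdge(e[0], e[1]);
        }
        return cycle;
    }
}
```

With edges 1 -> 2 and 2 -> 3 already in the graph, a speculative edge 3 -> 1 closes a cycle (necessary abort), while 1 -> 3 does not (unnecessary abort).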

Page 13: Analyzing Aborts in  Software Transactional Memory

Online part

Our goals:

  * Run benchmarks to prepare the statistics for the offline part.
  * Be sure that the measurements don’t distort the scheduling picture.

Page 14: Analyzing Aborts in  Software Transactional Memory

Platform Supporting STM

Introducing: Deuce STM!

  * Deuce STM is an open source Java STM environment.
  * Created by: Guy Korland, Nir Shavit, Pascal Felber, Igor Berman.
  * With Deuce STM, if the method

        public void doThing() {…}

    is not thread-safe, then

        @Atomic
        public void doThing() {…}

    is!

Source code (excerpt):

    import java.util.concurrent.atomic.AtomicInteger;

    final public class Context implements org.deuce.transaction.Context {

        // Builds the identifier used in the log from the owner's identity hash and the field argument.
        private static String objectId(Object reference, long field) {
            return Long.toString(System.identityHashCode(reference) + field);
        }

        // Clock shared by all transactions.
        final static AtomicInteger clock = new AtomicInteger(0);
        …

[Diagram: Deuce Framework containing the TL2 work method with logging]

Page 15: Analyzing Aborts in  Software Transactional Memory

How To Utilize Deuce for Logging

  * Modified the code to call logging utils.
  * Added more exception types to distinguish between the different abort types.

[Diagram: Logger, Deuce Framework, TL2 Algorithm; transaction code: Start, Read, Write, Commit]

A Perfectly Scalable Code
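
The abort types reported in the results (Version Too High, Object Locked, Readset Invalid) suggest one exception type per abort reason. The sketch below shows one way this might look. Everything in it is an assumption for illustration: the class names, the `AbortStats` counter and the simplified read check are hypothetical and are not Deuce's actual classes. The check follows the general TL2 rule that a read fails when the object is locked or when its version is newer than the transaction's read version.

```java
import java.util.EnumMap;
import java.util.Map;

// Abort reasons, named after the legend of the result charts.
enum AbortReason { VERSION_TOO_HIGH, OBJECT_LOCKED, READSET_INVALID }

// Hypothetical base exception carrying the abort reason.
class AbortException extends RuntimeException {
    final AbortReason reason;

    AbortException(AbortReason reason) {
        super(reason.name());
        this.reason = reason;
    }
}

final class VersionTooHighException extends AbortException {
    VersionTooHighException() { super(AbortReason.VERSION_TOO_HIGH); }
}

final class ObjectLockedException extends AbortException {
    ObjectLockedException() { super(AbortReason.OBJECT_LOCKED); }
}

final class ReadsetInvalidException extends AbortException {
    ReadsetInvalidException() { super(AbortReason.READSET_INVALID); }
}

// Single-threaded illustration of counting aborts per reason.
final class AbortStats {
    private final Map<AbortReason, Integer> counts = new EnumMap<>(AbortReason.class);

    void record(AbortException e) {
        counts.merge(e.reason, 1, Integer::sum);
    }

    int count(AbortReason reason) {
        return counts.getOrDefault(reason, 0);
    }

    // Simplified TL2-style read check (an assumption about where each type is raised).
    static void checkRead(boolean locked, long objectVersion, long readVersion) {
        if (locked) {
            throw new ObjectLockedException();
        }
        if (objectVersion > readVersion) {
            throw new VersionTooHighException();
        }
    }
}
```

Catching the common base type lets the logging code record the reason without changing the retry logic around the transaction.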

Page 16: Analyzing Aborts in  Software Transactional Memory

Online Part Implementation Version 1

Main problem: adding to the priority queue damages parallelism and lowers performance.

Page 17: Analyzing Aborts in  Software Transactional Memory

Online Part Implementation Version 2

  * The Back End: Collector
  * The threads don’t do any extra actions to log the run.

[Diagram states: “The Loglines have ended”, “The program has ended”]
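
The slides do not spell out how the collector works. One common way to keep logging off the critical path, shown here purely as an assumption, is to let each thread append to its own buffer and have a collector merge the buffers once the run has ended. The class `BufferedRunLogger` and all of its members are hypothetical. Note that the shared sequence counter is itself a small point of contention.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

final class BufferedRunLogger {
    // One record per transactional action; seq gives a global order for the merge.
    static final class Entry {
        final long seq;
        final String text;

        Entry(long seq, String text) {
            this.seq = seq;
            this.text = text;
        }
    }

    private final AtomicLong clock = new AtomicLong();
    private final ConcurrentLinkedQueue<List<Entry>> buffers = new ConcurrentLinkedQueue<>();

    // Each thread lazily registers its own private buffer.
    private final ThreadLocal<List<Entry>> local = ThreadLocal.withInitial(() -> {
        List<Entry> buffer = new ArrayList<>();
        buffers.add(buffer);
        return buffer;
    });

    // Called by worker threads: appends to the thread's own buffer, no shared queue.
    void log(String text) {
        local.get().add(new Entry(clock.getAndIncrement(), text));
    }

    // Collector: call only after all worker threads have been joined.
    List<String> collect() {
        List<Entry> all = new ArrayList<>();
        for (List<Entry> buffer : buffers) {
            all.addAll(buffer);
        }
        all.sort(Comparator.comparingLong((Entry e) -> e.seq));
        List<String> out = new ArrayList<>();
        for (Entry e : all) {
            out.add(e.text);
        }
        return out;
    }
}
```

Joining the worker threads before calling `collect()` is what makes their buffered entries visible to the collector, so no extra locking is needed on the logging path.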

Page 18: Analyzing Aborts in  Software Transactional Memory

What Do We Check?

  * Commit rate
  * Unnecessary aborts (classified by types)
  * Wasted work
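
As a rough illustration of how these quantities might be derived from the counters gathered by the offline analyzer, consider the sketch below. The exact definitions are assumptions: commit ratio is taken as commits over all transaction attempts, the unnecessary share as unnecessary aborts over all aborts, and wasted reads as reads performed inside aborted transactions over all reads. The class `RunMetrics` and its field names are hypothetical.

```java
// Hypothetical container for per-run counters and the ratios derived from them.
final class RunMetrics {
    final long commits;
    final long aborts;
    final long unnecessaryAborts;
    final long totalReads;
    final long readsInAbortedTx;

    RunMetrics(long commits, long aborts, long unnecessaryAborts,
               long totalReads, long readsInAbortedTx) {
        this.commits = commits;
        this.aborts = aborts;
        this.unnecessaryAborts = unnecessaryAborts;
        this.totalReads = totalReads;
        this.readsInAbortedTx = readsInAbortedTx;
    }

    // Fraction of transaction attempts that committed.
    double commitRatio() {
        long attempts = commits + aborts;
        return attempts == 0 ? 0.0 : (double) commits / attempts;
    }

    // Fraction of aborts that the analysis classified as unnecessary.
    double unnecessaryAbortRatio() {
        return aborts == 0 ? 0.0 : (double) unnecessaryAborts / aborts;
    }

    // Fraction of reads whose work was thrown away by an abort.
    double wastedReadRatio() {
        return totalReads == 0 ? 0.0 : (double) readsInAbortedTx / totalReads;
    }
}
```

For example, with made-up counters of 75 commits and 25 aborts, `commitRatio()` evaluates to 0.75; these numbers are illustrative only and are not measurements from the project.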

Page 19: Analyzing Aborts in  Software Transactional Memory

Testbenches

  * SSCA2 – short transactions, low contention, high memory utilization.
  * Vacation – high contention, medium-length transactions, mostly reads.
  * AVL tree – customizable contention, medium-length transactions.
      * Random choice between add, remove or search for a random integer in the tree.
      * Ability to change the integer range for custom contention.
      * Created by us.
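
The AVL tree workload can be pictured with the following sketch. It is not the project's benchmark: `java.util.TreeSet` (a red-black tree) stands in for the AVL tree, the code is single-threaded, and the class `TreeWorkload` with its counters is hypothetical. In the real benchmark each step would presumably run as a transaction on several threads; a smaller key range makes threads collide on the same keys more often, which is how contention would be tuned.

```java
import java.util.Random;
import java.util.TreeSet;

// Single-threaded stand-in for the random add/remove/search workload.
final class TreeWorkload {
    private final TreeSet<Integer> tree = new TreeSet<>(); // stand-in for the AVL tree
    private final Random random;
    private final int keyRange;

    int adds;
    int removes;
    int searches;

    TreeWorkload(long seed, int keyRange) {
        this.random = new Random(seed);
        this.keyRange = keyRange;
    }

    // One operation: pick a random key, then add, remove or search for it.
    void step() {
        int key = random.nextInt(keyRange);
        switch (random.nextInt(3)) {
            case 0:
                tree.add(key);
                adds++;
                break;
            case 1:
                tree.remove(key);
                removes++;
                break;
            default:
                tree.contains(key);
                searches++;
                break;
        }
    }

    int size() {
        return tree.size();
    }
}
```

Calling `step()` in a loop from several threads, each call wrapped in a transaction, would give the shape of the benchmark described above.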

Page 20: Analyzing Aborts in  Software Transactional Memory

Hardware

Benchmarks run on Trinity:

  * 8 quad-cores
  * 132 GB RAM
  * Machine was idle for our use.

Page 21: Analyzing Aborts in  Software Transactional Memory

Simulation Results – AVL tree

All graphs are a function of the thread amount.

[Figure: four plots against Number Of Threads (logarithmic axis, 10^0 to 10^2)]

  * Commit Ratio (y-axis: Percentage Of Successful Commits, 0 to 1)
  * Amount of Aborts & Unnecessary Aborts (y-axis: Amount Of Unnecessary Aborts, 0 to 6000)
  * Percentage of Unnecessary Aborts (y-axis: 0 to 1)
  * Percentage of Wasted Reads (y-axis: 0 to 1)

Page 22: Analyzing Aborts in  Software Transactional Memory

Simulation Results – SSCA2

All graphs are a function of the thread amount.

[Figure: four plots against Number Of Threads (logarithmic axis, 10^0 to 10^2)]

  * Commit Ratio (y-axis: Percentage Of Successful Commits, 0 to 1)
  * Amount of Aborts & Unnecessary Aborts (y-axis: Amount Of Unnecessary Aborts, 0 to 1500)
  * Percentage of Unnecessary Aborts (y-axis: 0 to 1)
  * Percentage of Wasted Reads (y-axis: 0 to 1)

Page 23: Analyzing Aborts in  Software Transactional Memory

Simulation Results – Vacation

All graphs are a function of the thread amount.

[Figure: four plots against Number Of Threads (logarithmic axis, 10^0 to 10^2)]

  * Commit Ratio (y-axis: Percentage Of Successful Commits, 0 to 1)
  * Amount of Aborts & Unnecessary Aborts (y-axis: Amount Of Unnecessary Aborts, 0 to 700)
  * Percentage of Unnecessary Aborts (y-axis: 0 to 1)
  * Percentage of Wasted Reads (y-axis: 0 to 1)

Page 24: Analyzing Aborts in  Software Transactional Memory

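Simulation Results – AVL tree

All graphs are a function of the thread amount.

[Figure: one pie chart of abort types per thread count. Legend: Version Too High, Object Locked, Readset Invalid. Slice shares are listed in extraction order.]

  * 2 threads: 46%, 11%, 43%
  * 4 threads: 43%, 14%, 43%
  * 8 threads: 39%, 25%, 36%
  * 16 threads: 51%, 28%, 22%
  * 32 threads: 57%, 27%, 16%
  * 64 threads: 16%, 79%, 5%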

Page 25: Analyzing Aborts in  Software Transactional Memory

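Simulation Results – SSCA2

All graphs are a function of the thread amount.

[Figure: one pie chart of abort types per thread count. Legend: Version Too High, Object Locked, Readset Invalid. Slice shares are listed in extraction order.]

  * 2 threads: 23%, 12%, 65%
  * 4 threads: 26%, 14%, 60%
  * 8 threads: 22%, 19%, 60%
  * 16 threads: 36%, 18%, 45%
  * 32 threads: 35%, 24%, 41%
  * 64 threads: 28%, 36%, 36%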

Page 26: Analyzing Aborts in  Software Transactional Memory

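Simulation Results – Vacation

All graphs are a function of the thread amount.

[Figure: one pie chart of abort types per thread count. Legend: Version Too High, Object Locked, Readset Invalid. Slice shares are listed in extraction order.]

  * 2 threads: 55%, 12%, 34%
  * 4 threads: 61%, 10%, 29%
  * 8 threads: 62%, 6%, 32%
  * 16 threads: 68%, 5%, 27%
  * 32 threads: 62%, 15%, 23%
  * 64 threads: 38%, 49%, 13%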

Page 27: Analyzing Aborts in  Software Transactional Memory

Simulation Results – AVL tree

All graphs are a function of the thread amount.

[Figure: two plots against log2 of Number Of Threads (1 to 6), one series per abort type: Version Too High, Object Locked, Readset Invalid]

  * Amount of Aborts by types (y-axis: Amount Of Aborts, 0 to 3500)
  * Percentage of Aborts by types (y-axis: 0 to 0.7)

Page 28: Analyzing Aborts in  Software Transactional Memory

Simulation Results – SSCA2

All graphs are a function of the thread amount.

[Figure: two plots against log2 of Number Of Threads (1 to 6), one series per abort type: Version Too High, Object Locked, Readset Invalid]

  * Amount of Aborts by types (y-axis: Amount Of Aborts, 0 to 700)
  * Percentage of Aborts by types (y-axis: 0 to 0.7)

Page 29: Analyzing Aborts in  Software Transactional Memory

Simulation Results – Vacation

All graphs are a function of the thread amount.

[Figure: two plots against log2 of Number Of Threads (1 to 6), one series per abort type: Version Too High, Object Locked, Readset Invalid]

  * Amount of Aborts by types (y-axis: Amount Of Aborts, 0 to 400)
  * Percentage of Aborts by types (y-axis: 0 to 0.7)

Page 30: Analyzing Aborts in  Software Transactional Memory

Logger impact on performance

Logger access obviously demands more from the Deuce framework:

  * More memory accesses
  * More exception types
  * On every read & write

How much distortion does the logger cause?

[Figure: AVL test with logging – commit ratio. Percentage Of Successful Commits (0 to 1) against Number Of Threads (logarithmic axis, 10^0 to 10^2); two series: With logger, Without logger]

Page 31: Analyzing Aborts in  Software Transactional Memory

Conclusions

Parallelism increases → the abort rate, the unnecessary abort rate and the wasted work rate increase as well.

Parallelism increases → more aborts are caused by locked objects.

To improve STM performance over highly parallel workloads, algorithms may be improved to prevent unnecessary aborts.

Page 32: Analyzing Aborts in  Software Transactional Memory

Nice To Have

  * Drawing the precedence graph automatically as a drawing in Microsoft Visio.
  * Possibility to analyze according to abort types.
  * GUI.
  * Expansion of the simulation to more algorithms and test benches, which would make it possible to compare performance between algorithms.

Page 33: Analyzing Aborts in  Software Transactional Memory

Future Work

Drop in abort rates after 128 threads due to a drop in concurrency – further analysis is required.

Unfit versions cause a lot of aborts. The new SMV algorithm may solve this problem.

Page 34: Analyzing Aborts in  Software Transactional Memory

BIBLIOGRAPHY

  * I. Keidar and D. Perelman. On avoiding spare aborts in transactional memory. In Proceedings of the twenty-first annual symposium on Parallelism in algorithms and architectures, pages 59–68, 2009.
  * I. Keidar and D. Perelman. SMV: Selective Multi-Versioning STM.
  * D. Dice, O. Shalev, and N. Shavit. Transactional locking II. In Proceedings of the 20th International Symposium on Distributed Computing, pages 194–208, 2006.
  * M. Herlihy, V. Luchangco, M. Moir, and W. N. Scherer, III. Software transactional memory for dynamic-sized data structures. In Proceedings of the twenty-second annual symposium on Principles of distributed computing, pages 92–101, 2003.

Page 35: Analyzing Aborts in  Software Transactional Memory

QUESTIONS?