CAR-STM: Scheduling-based Collision Avoidance and Reduction for Software Transactional Memory
Shlomi Dolev, Danny Hendler and Adi Suissa
PODC 2008
CAR-STM: rationale
“Transaction-ignorant” thread scheduling is problematic
TM scheduler handles transactional threads
This permits:
o Serializing contention management
o Proactive collision avoidance
“Conventional” STM system high-level structure
[Figure: OS-scheduler-controlled application threads run on the TM system; contention detection asks the contention manager to arbitrate, and the outcome is either proceed or abort/retry/wait.]
CAR-STM's distinctive features
Proactive Collision avoidance
Proactively assign each transaction thread to the core with the “most conflicting” transactions, based on application-provided information
Serializing contention management
Serialize the execution of colliding transactions
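The collision-avoidance idea above can be sketched as a queue-selection function. This is a minimal illustration, not the CAR-STM implementation: the name `most_conflicting_queue`, the `conflict_prob` callback, and the use of a summed score are all assumptions; the slides only say that the choice is based on application-provided information.

```python
from typing import Any, Callable, Sequence


def most_conflicting_queue(
    t_info: Any,
    queues: Sequence[Sequence[Any]],
    conflict_prob: Callable[[Any, Any], float],
) -> int:
    """Return the index of the queue expected to conflict most with t_info.

    `queues[i]` holds the T-Info of the transactions pending on core i.
    `conflict_prob` is a hypothetical application-provided estimate of how
    likely two transactions are to collide. At least one queue is required.
    """
    def expected_conflicts(index: int) -> float:
        # Assumed scoring rule: sum the pairwise conflict estimates.
        return sum(conflict_prob(t_info, pending) for pending in queues[index])

    # Ties go to the lowest-numbered queue (an arbitrary choice).
    return max(range(len(queues)), key=expected_conflicts)


def same_region(a: str, b: str) -> float:
    """Toy estimate: transactions collide only if they touch the same region."""
    return 1.0 if a == b else 0.0
```

Placing a transaction behind the ones it is likely to collide with means they run one after another on the same core instead of colliding across cores.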
Relying on (current) OS scheduling is problematic!
OS scheduling of transaction threads:
1) Introduces pseudo-parallelism
2) Hurts TM performance stability/predictability
3) Does not allow proactive collision avoidance and serializing CM.
CAR-STM high-level architecture
[Figure: transaction threads (with T-Info), dispatcher, collision avoider, serializing contention manager, and per-core transaction queues #1..#k, each served by a TQ thread on cores #1..#k.]
TQ-Entry Structure
Wrapper method
Transaction data
T-Info
Transaction thread
Lock, condition variable
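The fields listed above map naturally onto a small record type. The sketch below is an assumption about how such an entry could look: the class name `TQEntry`, the field names, and the extra `done` flag are hypothetical, and Python's `threading.Condition` is used because it bundles a lock with a condition variable.

```python
import threading
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class TQEntry:
    """One transaction-queue entry, mirroring the fields named on the slide."""
    wrapper: Callable[[Any], Any]        # wrapper method that runs the transaction body
    data: Any                            # transaction data handed to the wrapper
    t_info: Any                          # application-provided T-Info
    thread: Optional[threading.Thread]   # transaction thread that submitted the entry
    # threading.Condition creates its own lock, covering "lock, condition variable".
    cond: threading.Condition = field(default_factory=threading.Condition)
    done: bool = False                   # assumption: completion flag guarded by cond
```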
Transaction dispatching process
Enqueue the transaction in the most-conflicting queue. Put the thread to sleep and notify the TQ thread.
Transaction execution
[Figure: TQ thread on core #i serving transaction queue #i; each queue entry holds the wrapper method, transaction data, T-Info, the transaction thread, and a lock with a condition variable.]
Dispatcher / TQ-thread synchronization
[Figure: dispatcher, transaction queue #i, and the TQ thread on core #i.]
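The dispatching and synchronization slides can be illustrated with a small handshake between a dispatcher and a TQ thread. This is a sketch under assumptions, not the RSTM-based implementation: `Entry`, `tq_thread`, `dispatch` and `demo` are hypothetical names, the TQ thread is assumed to run the wrapper on behalf of the sleeping transaction thread, and `queue.Queue` stands in for the explicit "notify TQ thread" step.

```python
import queue
import threading


class Entry:
    """Minimal queue entry: wrapper, data, and a lock/condition pair."""
    def __init__(self, wrapper, data):
        self.wrapper = wrapper
        self.data = data
        self.cond = threading.Condition()
        self.done = False
        self.result = None


def tq_thread(tq):
    """TQ thread loop: run queued transactions one at a time."""
    while True:
        entry = tq.get()              # sleeps until the dispatcher enqueues work
        if entry is None:             # sentinel used here to stop the loop
            break
        result = entry.wrapper(entry.data)
        with entry.cond:
            entry.result = result
            entry.done = True
            entry.cond.notify()       # wake the sleeping transaction thread


def dispatch(tq, wrapper, data):
    """Called by a transaction thread: enqueue, then sleep until executed."""
    entry = Entry(wrapper, data)
    tq.put(entry)                     # enqueue and implicitly notify the TQ thread
    with entry.cond:
        while not entry.done:         # loop guards against spurious wake-ups
            entry.cond.wait()
    return entry.result


def demo():
    tq = queue.Queue()
    worker = threading.Thread(target=tq_thread, args=(tq,), daemon=True)
    worker.start()
    result = dispatch(tq, lambda x: x * 2, 21)
    tq.put(None)
    worker.join()
    return result
```

Because a single TQ thread drains each queue, transactions placed in the same queue never run concurrently with one another.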
Serializing Contention Managers
When two transactions collide, fail the newer transaction and move it to the TQ of the older
Fast elimination of live-lock scenarios
Two SCMs implemented
o Basic (BSCM) – move the failed transaction to the end of the other transaction's TQ
o Permanent (PSCM) – make the failed transaction a subordinate transaction of the other transaction
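The basic policy can be sketched as follows. The names `Tx` and `bscm_resolve`, the integer timestamps (smaller means older, assumed unique), and the dictionary of deques are illustrative assumptions rather than the authors' data structures.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Tx:
    name: str
    timestamp: int   # assumption: smaller value means older transaction
    queue_id: int    # index of the TQ the transaction currently belongs to


def bscm_resolve(a, b, queues):
    """Fail the newer of two colliding transactions and requeue it.

    The loser is moved to the end of the winner's transaction queue, so the
    two are serialized on one core. Returns the winning transaction.
    """
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    if loser in queues[loser.queue_id]:
        queues[loser.queue_id].remove(loser)
    queues[winner.queue_id].append(loser)
    loser.queue_id = winner.queue_id
    return winner
```

Once both transactions sit in the same queue they cannot collide with each other again, which is what makes repeated mutual aborts impossible for that pair.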
PSCM
[Figure: before the collision. Transaction queue #1 (TQ thread, core #1) holds Ta; transaction queue #k (TQ thread, core #k) holds Tb; Tc, Td and Te appear as subordinate transactions.]
Transactions a and b collide, b is older
PSCM
[Figure: after the collision. Ta and Tc are shown together with Td and Te under Tb in transaction queue #k (TQ thread, core #k).]
Losing transaction and its subordinates are made subordinates of winning transaction
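The permanent policy can be sketched with the example from the slides, where Ta and Tb collide and Tb is older. The `Tx` class, the flat `subordinates` list, and the ordering of the merged list are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    name: str
    timestamp: int                                     # assumption: smaller means older
    subordinates: list = field(default_factory=list)   # assumed flat list of Tx


def pscm_resolve(a, b):
    """Make the newer transaction, and its subordinates, subordinates of the older.

    Returns the winning (older) transaction.
    """
    winner, loser = (a, b) if a.timestamp < b.timestamp else (b, a)
    winner.subordinates.append(loser)
    winner.subordinates.extend(loser.subordinates)
    loser.subordinates.clear()   # the loser no longer leads a group of its own
    return winner


def slide_example():
    """Rebuild the slide scenario: Ta (with Tc) collides with the older Tb (with Td, Te)."""
    tc, td, te = Tx("Tc", 3), Tx("Td", 4), Tx("Te", 5)
    ta = Tx("Ta", 2, [tc])
    tb = Tx("Tb", 1, [td, te])
    winner = pscm_resolve(ta, tb)
    return winner.name, [t.name for t in winner.subordinates]
```

Unlike the basic policy, the grouping persists: the whole group follows its winning transaction from then on.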
Experimental evaluation
Incorporated CAR-STM within RSTM
Tested on an 8-way 4 x XEON-7110M server
Serializing CM tests: Workloads generated by STMBench7 [Guerraoui, Kapalka, Vitek, '07]
Proactive collision avoidance tested on synthetic app
STMBench7
A benchmark for STM implementations
Generates realistic workloads representative of complex, object-oriented applications
Workloads composed of 45 operation types on a shared data structure
Operation categories
o Long / short traversals
o Short operations
o Structure modification operations
Metrics and workload types
Workload type     Reads   Writes
Read dominated    90%     10%
Read/Write        60%     40%
Write dominated   10%     90%

Metrics           Operation types              Comments
Execution time    All                          5 min + quiescence
Quiescence time   All
Throughput        All except long traversals
Execution time: R/W dominated workloads
Speed-up of between 1.7 and 36
Reduction of standard deviation by a factor of up to 40
Execution time: read dominated workloads
Execution time: Write dominated workloads
Quiescence time: a measure of live-lock
Speed-up of between 11 and 118
Throughput: write dominated workloads
Throughput increase of up to 15.7
Experimental evaluation: proactive collision avoidance
RegionedArray (RA) synthetic app (read, write, delete)
Each thread runs for 20 seconds
o Randomly select region
o Randomly select transaction length
o Randomly select operation
o Transaction repeatedly applies operation to randomly-selected region item
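The workload steps above can be sketched as a generator for a single transaction. Everything beyond the three random choices is an assumption: the function name `make_transaction`, the parameters `num_regions`, `region_size` and `max_length`, and the layout of regions as contiguous slices of one array are not taken from the slides.

```python
import random

OPERATIONS = ("read", "write", "delete")


def make_transaction(rng, num_regions, region_size, max_length):
    """Pick a region, a length and an operation, then the items to touch."""
    region = rng.randrange(num_regions)    # randomly select region
    length = rng.randint(1, max_length)    # randomly select transaction length
    operation = rng.choice(OPERATIONS)     # randomly select operation
    base = region * region_size            # assumed contiguous region layout
    # The operation is applied repeatedly, each time to a random item of the region.
    items = [base + rng.randrange(region_size) for _ in range(length)]
    return {"region": region, "operation": operation, "items": items}
```

In this sketch the selected region could double as the T-Info handed to the collision avoider, since transactions on the same region are the ones likely to collide.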
Transactional memory Dagstuhl, June 08
Experimental results
Most relevant prior art
[Yoo, Lee, 2008]: Adaptive transaction scheduling for TM systems
[Bai, Shen, Zhang, Scherer, Ding, Scott]: A key-based adaptive TM executor
Conclusions
Transaction-ignorant scheduling is problematic
Serializing contention management eliminates live-lock STM behavior
Proactive Collision avoidance contribution application-dependent
Some future work directions
o Robust scheduling
o Transaction-aware OS scheduling
o Better handling of page faults, local data access,…