Consistency Oblivious Programming Hillel Avni Tel Aviv University.
![Page 1: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/1.jpg)
Consistency Oblivious Programming
Hillel Avni, Tel Aviv University
![Page 2: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/2.jpg)
Agenda
Transactional Memory and Locking
Consistency Oblivious Programming (COP)
COP with STM
COP with HTM
Future Work
![Page 3: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/3.jpg)
Global Lock
Easy to use
Composable - Concatenate critical sections
Not scalable
![Page 4: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/4.jpg)
Fine Grain Locking
Hard to use
Not Composable
Scalable
Lazy linked list is a good example…
![Page 5: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/5.jpg)
Lazy Traversal
[Figure: sorted list a → b → d → e. add(c) traverses the list without locking and stops between b and d: "Aha!"]
![Page 6: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/6.jpg)
Lock and Validate
[Figure: the same list with b and d locked. add(c) validates: "Yes, b still points to d."]
![Page 7: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/7.jpg)
Perform Updates and Release Locks
[Figure: add(c) links c between b and d, giving a → b → c → d → e, and the locks are released.]
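The three steps on the preceding slides (lazy traversal, lock and validate, update and release) can be sketched in standard C++ as follows. This is a minimal illustration, not code from the talk: the names `LazyNode` and `LazyList`, the sentinel keys, and the per-node `std::mutex` are assumptions, and nodes are never freed, to keep the sketch short.

```cpp
#include <atomic>
#include <cassert>
#include <climits>
#include <mutex>

// Hypothetical node type: sorted singly linked list with sentinel head and tail.
struct LazyNode {
    int key;
    std::atomic<LazyNode*> next;
    std::atomic<bool> marked{false};   // set when a node is logically removed
    std::mutex lock;
    LazyNode(int k, LazyNode* n) : key(k), next(n) {}
};

struct LazyList {
    // tail is declared first so it is initialized before head uses it.
    LazyNode* tail = new LazyNode(INT_MAX, nullptr);
    LazyNode* head = new LazyNode(INT_MIN, tail);

    // Validation: both nodes are still in the list and still adjacent.
    static bool validate(LazyNode* pred, LazyNode* curr) {
        return !pred->marked && !curr->marked && pred->next == curr;
    }

    bool add(int key) {
        while (true) {
            // Step 1: lazy traversal, no locks taken.
            LazyNode* pred = head;
            LazyNode* curr = pred->next;
            while (curr->key < key) { pred = curr; curr = curr->next; }
            // Step 2: lock both nodes (always in list order) and validate.
            std::lock_guard<std::mutex> lock_pred(pred->lock);
            std::lock_guard<std::mutex> lock_curr(curr->lock);
            if (!validate(pred, curr)) continue;   // list changed: retry
            if (curr->key == key) return false;    // already present
            // Step 3: perform the update; locks are released on scope exit.
            pred->next = new LazyNode(key, curr);
            return true;
        }
    }

    bool contains(int key) const {
        LazyNode* curr = head;
        while (curr->key < key) curr = curr->next;
        return curr->key == key && !curr->marked;
    }
};
```

Only the two nodes around the insertion point are ever locked, which is what makes the scheme scalable, and also what makes it hard to compose with other operations.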
![Page 8: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/8.jpg)
Transactional Memory
Easy to use
Composable
Scalable
How is it done?
![Page 9: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/9.jpg)
Java (Deuce)
    bool CAS(int location, int expected, int new_val)
    {
        atomic {
            if (location != expected)
                return false;
            location = new_val;
        }
        return true;
    }
![Page 10: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/10.jpg)
C/C++ (GCC-4.7)
    bool CAS(int *location, int expected, int new_val)
    {
        __transaction_atomic {
            if (*location != expected)
                return false;       /* leaving the block commits the transaction */
            *location = new_val;
        }
        return true;
    }
![Page 11: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/11.jpg)
Software Transactional Memory
• Different algorithms are used for:
– consistency checking
– rollback
• Compiler recognizes shared accesses.
![Page 12: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/12.jpg)
STM Problem - Overhead
    template <typename V> static V load(const V* addr, ls_modifier mod)
    {
        if (unlikely(mod == RfW))
        {
            pre_write(addr, sizeof(V));
            return *addr;
        }
        if (unlikely(mod == RaW))
            return *addr;
        gtm_thread *tx = gtm_thr();
        gtm_rwlog_entry* log = pre_load(tx, addr, sizeof(V));
        V v = *addr;
        atomic_thread_fence(memory_order_acquire);
        post_load(tx, log);
        return v;
    }
load function from GCC 4.8.1
![Page 13: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/13.jpg)
STM Problem - Overhead
    static gtm_rwlog_entry* pre_load(gtm_thread *tx, const void* addr, size_t len)
    {
        size_t log_start = tx->readlog.size();
        gtm_word snapshot = tx->shared_state.load(memory_order_relaxed);
        gtm_word locked_by_tx = ml_mg::set_locked(tx);
        size_t orec = ml_mg::get_orec(addr);
        size_t orec_end = ml_mg::get_orec_end(addr, len);
        do
        {
            gtm_word o = o_ml_mg.orecs[orec].load(memory_order_acquire);
            if (likely (!ml_mg::is_more_recent_or_locked(o, snapshot)))
            {
            success:
                gtm_rwlog_entry *e = tx->readlog.push();
                e->orec = o_ml_mg.orecs + orec;
                e->value = o;
            }
            else if (!ml_mg::is_locked(o))
            {
                snapshot = extend(tx);
                goto success;
            }
            else
            {
                if (o != locked_by_tx)
                    tx->restart(RESTART_LOCKED_READ);
            }
            orec = o_ml_mg.get_next_orec(orec);
        }
        while (orec != orec_end);
        return &tx->readlog[log_start];
    }
load always calls pre_load
![Page 14: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/14.jpg)
STM Problem - Overhead
    static void post_load(gtm_thread *tx, gtm_rwlog_entry* log)
    {
        for (gtm_rwlog_entry *end = tx->readlog.end(); log != end; log++)
        {
            gtm_word o = log->orec->load(memory_order_relaxed);
            if (log->value != o)
                tx->restart(RESTART_VALIDATE_READ);
        }
    }
and post_load
Compare to mov eax, [ebx] on x86
![Page 15: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/15.jpg)
Hardware Transactional Memory
• Exploit native cache coherence for:
– consistency checking
– rollback
![Page 16: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/16.jpg)
HTM Problem – Resources
A transaction cannot commit if it is:
• too big: cache size limits data footprint
• too slow: quantum size limits duration
![Page 17: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/17.jpg)
All TM Problem – False Conflicts
Any address that was encountered during the transaction is monitored until the end of that transaction.
An address may abort a transaction long after it is no longer relevant…
![Page 18: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/18.jpg)
Agenda
Transactional Memory and Locking
Consistency Oblivious Programming (COP)
COP with STM
COP with HTM
Future Work
![Page 19: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/19.jpg)
COP Operation
• In non-transactional mode:
– Execute the read-only prefix of the operation and record its output.
• In transactional mode:
– Verify output is correct.
– Perform updates.
![Page 20: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/20.jpg)
COP Example – RB Tree
[Figure: red-black tree. Root 20 with children 10 and 30; 30 has children 27 and 40; 27 has children 25 and 28.]
![Page 21: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/21.jpg)
Add 26 – Tree Unbalanced
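[Figure: the same tree after 26 is inserted as a child of 25, before rebalancing. Root 20 with children 10 and 30; 30 has children 27 and 40; 27 has children 25 and 28.]
TM Search 26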
![Page 22: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/22.jpg)
Tree Balanced
[Figure: the tree after rebalancing. Root 27 with children 20 and 30; 20 has children 10 and 25; 30 has children 28 and 40; 26 is a child of 25.]
TM Search continues from 27
Conflict and Abort
![Page 23: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/23.jpg)
Add 26 – Tree Unbalanced
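[Figure: the same tree after 26 is inserted as a child of 25, before rebalancing. Root 20 with children 10 and 30; 30 has children 27 and 40; 27 has children 25 and 28.]
COP Search 26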
![Page 24: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/24.jpg)
Tree Balanced
[Figure: the tree after rebalancing. Root 27 with children 20 and 30; 20 has children 10 and 25; 30 has children 28 and 40; 26 is a child of 25.]
COP Search continues from 27
Found
![Page 25: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/25.jpg)
COP RB-Tree Verify
To facilitate verification:
• All nodes in the RB-Tree are connected in a successor-predecessor doubly linked list, and each node has a live mark.
• Search returns a node n with k or a leaf with k’s successor or predecessor.
![Page 26: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/26.jpg)
COP RB-Tree Suffix
• Resume a transaction.
• Verify:
– k found and n is live – done.
– k not found, check that k belongs directly under n, on a side where n has no child:
(n.pred.k < k < n.k && !n.left) or (n.k < k < n.succ.k && !n.right)
• If verification failed – abort the transaction.
• Complete updates, add / remove / rebalance, using n.
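A minimal sketch of the verification step for a node n returned by the search, written in plain C++. The names `RbNode`, `Verdict` and `verify` are hypothetical. Two details are assumptions rather than statements from the slides: a missing predecessor or successor is treated as minus or plus infinity, and n must be live in the not-found case as well. The child that is checked is the side where k would be attached: the left child when k < n.k, the right child when k > n.k.

```cpp
#include <cassert>

// Hypothetical node layout: tree links plus the successor-predecessor list and live mark.
struct RbNode {
    int key;
    bool live = true;
    RbNode* left = nullptr;
    RbNode* right = nullptr;
    RbNode* pred = nullptr;   // in-order predecessor
    RbNode* succ = nullptr;   // in-order successor
};

enum class Verdict { Found, AttachLeft, AttachRight, Invalid };

// Runs inside the transaction: checks that the output of the
// non-transactional search (node n for key k) is still correct.
Verdict verify(const RbNode* n, int k) {
    if (!n->live) return Verdict::Invalid;        // n was removed meanwhile
    if (n->key == k) return Verdict::Found;       // k found and n is live
    if (k < n->key) {
        // k must lie strictly between n.pred and n, and n.left must be free.
        bool above_pred = (n->pred == nullptr) || (n->pred->key < k);
        return (above_pred && n->left == nullptr) ? Verdict::AttachLeft
                                                  : Verdict::Invalid;
    }
    // k must lie strictly between n and n.succ, and n.right must be free.
    bool below_succ = (n->succ == nullptr) || (k < n->succ->key);
    return (below_succ && n->right == nullptr) ? Verdict::AttachRight
                                               : Verdict::Invalid;
}
```

An `Invalid` verdict corresponds to aborting the transaction; the other verdicts tell the suffix where to continue from n.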
![Page 27: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/27.jpg)
COP Template for op
    start-transaction
        any-code
        suspend-transaction
            output = op-rop();
        resume-transaction
        if (not(op-verify(output)))
            abort-transaction
        op-complete(output)
        any-code
    end-transaction
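The template can be illustrated on a deliberately tiny structure. The sketch below is an assumption-laden stand-in, not the author's code: `AppendOnlySet` is a hypothetical fixed-size, insert-only set of non-negative integers, and a `std::mutex` plays the role of the transaction, so "abort" simply means dropping the lock and retrying. Because there is no suspend mode here, the read-only prefix runs before the "transaction" starts.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <mutex>

constexpr int kEmpty = -1;   // marks a free slot; keys must be non-negative
constexpr int kSlots = 8;

struct AppendOnlySet {
    struct Output { int index; int seen; };        // what the read-only prefix observed

    std::array<std::atomic<int>, kSlots> slot;
    std::mutex tx;                                  // stand-in for a transaction

    AppendOnlySet() { for (auto& s : slot) s.store(kEmpty); }

    // op-rop: read-only prefix, runs with no synchronization at all.
    Output rop(int key) const {
        for (int i = 0; i < kSlots; ++i) {
            int v = slot[i].load();
            if (v == key || v == kEmpty) return {i, v};
        }
        return {kSlots, kEmpty};                    // table full, key absent
    }

    // op-verify: slots never change once filled, so it is enough to check
    // that the one slot the prefix stopped at still holds what was seen.
    bool verify(const Output& out) const {
        return out.index == kSlots || slot[out.index].load() == out.seen;
    }

    // op-complete: perform the update using the verified output.
    bool complete(int key, const Output& out) {
        if (out.index == kSlots || out.seen == key) return false;
        slot[out.index].store(key);
        return true;
    }

    bool insert(int key) {
        while (true) {
            Output out = rop(key);                  // outside the transaction
            std::lock_guard<std::mutex> guard(tx);  // transactional part begins
            if (!verify(out)) continue;             // abort and retry
            return complete(key, out);
        }
    }
};
```

The transactional part touches a single slot, however long the scan was; that is the reduction in monitored addresses COP aims for.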
![Page 28: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/28.jpg)
COP Correctness
The underlying TM:
• Transactional Regular Registers
The COP algorithm:
• Obliviousness
• Verifiability
• Separation
We prove that if the TM yields transactional regular registers, and the COP algorithm demonstrates obliviousness, verifiability, and separation, then the COP operation is linearizable.
![Page 29: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/29.jpg)
Agenda
Transactional Memory and Locking
Consistency Oblivious Programming (COP)
COP with STM
COP with HTM
Future Work
![Page 30: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/30.jpg)
STM Algorithm
• GCC default STM algorithm is the one that proved to be the most efficient and scalable in most scenarios:
– Write Through (WT)
– Encounter Time Locking (ETL)
– Multi Lock (ML)
![Page 31: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/31.jpg)
STM: WT – ETL - ML
1. RV ← Shared Version Clock
2. On read: check unlocked and v# <= RV, then add to read-set
3. On write: check v# <= RV, lock, and add to undo-set
4. WV = F&I(VClock)
5. Validate that in the read-set each v# <= RV
6. Release locks with v# ← WV
[Figure: memory words X and Y next to their versioned locks (version number v#, lock bit). The transaction starts with RV = 100; at commit the shared version clock advances 100 → 120 → 121, and the written locations are released with version 121.]
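The six steps above can be sketched as a toy STM in standard C++. This is a simplified illustration under stated assumptions, not GCC's libitm code: `Tx`, `orec`, `mem`, `vclock` and `TxAbort` are hypothetical names, memory is a four-word array, an abort is signalled by an exception, and snapshot extension is left out.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

constexpr int kWords = 4;
std::atomic<uint64_t> vclock{100};     // shared version clock
std::atomic<uint64_t> orec[kWords];    // per-word lock: (version << 1) | lock bit
std::atomic<int> mem[kWords];          // toy shared memory

struct TxAbort {};                     // thrown when the transaction must restart

class Tx {
    uint64_t rv_ = vclock.load();                   // step 1: RV <- version clock
    std::vector<int> read_set_;
    std::vector<std::pair<int, int>> undo_;         // (word, old value)
    std::vector<std::pair<int, uint64_t>> locks_;   // (word, orec before locking)

    bool owns(int w) const {
        for (auto& l : locks_) if (l.first == w) return true;
        return false;
    }
    static bool fresh(uint64_t o, uint64_t rv) {    // unlocked and v# <= RV
        return !(o & 1) && (o >> 1) <= rv;
    }
    [[noreturn]] void abort() {
        // Roll back in reverse order, then release locks with their old versions.
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it)
            mem[it->first].store(it->second);
        for (auto& l : locks_) orec[l.first].store(l.second);
        throw TxAbort{};
    }

public:
    int read(int w) {                               // step 2
        if (owns(w)) return mem[w].load();
        uint64_t before = orec[w].load();
        int v = mem[w].load();
        if (!fresh(before, rv_) || orec[w].load() != before) abort();
        read_set_.push_back(w);
        return v;
    }
    void write(int w, int v) {                      // step 3: encounter-time lock
        if (!owns(w)) {
            uint64_t o = orec[w].load();
            if (!fresh(o, rv_) || !orec[w].compare_exchange_strong(o, o | 1))
                abort();
            locks_.push_back({w, o});
        }
        undo_.push_back({w, mem[w].load()});
        mem[w].store(v);                            // write-through
    }
    void commit() {
        if (locks_.empty()) return;                 // read-only: nothing to publish
        uint64_t wv = vclock.fetch_add(1) + 1;      // step 4: WV = F&I(VClock)
        for (int w : read_set_)                     // step 5: validate read-set
            if (!owns(w) && !fresh(orec[w].load(), rv_)) abort();
        for (auto& l : locks_)                      // step 6: release with v# = WV
            orec[l.first].store(wv << 1);
        locks_.clear();
    }
};
```

Every transactional read pays for two loads of the lock word plus a read-set entry, which is the overhead the preceding slides contrast with a single `mov`.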
![Page 32: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/32.jpg)
GCC Constructs
__transaction_atomic { }: Mark the transaction.
__transaction_cancel: Explicit abort.
__attribute__((transaction_safe)): Instrument the code.
__attribute__((transaction_pure)): Do not instrument the code.
We will show that this attribute can be used efficiently as a __transaction_suspend with the WT – ETL – ML default STM algorithm in GCC.
![Page 33: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/33.jpg)
pure = suspend
• Transactional Regular Registers – All values up to one architecture-word size are written and read atomically. The rollback may use memcpy, but the memcpy is optimized to write at maximal alignment.
• Now we will compare the suspended mode of the future Power architecture HTM to transaction_pure with the WT-ETL-ML STM algorithm.
![Page 34: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/34.jpg)
Power tsuspend - tresume
1. Until failure occurs, load instructions that access memory locations that were transactionally written by the same thread will return the transactionally written data.
2. In the event of transaction failure, failure recording is performed, but failure handling is deferred until transactional execution is resumed.
3. The initiation of a new transaction is prevented.
4. Store instructions that access memory locations that have been accessed transactionally (due to load or store) by the same thread will cause the transaction to fail.
![Page 35: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/35.jpg)
RB – 1M sz – 20%U - 10 op/tx
![Page 36: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/36.jpg)
RB – 1K sz – 8 Threads – 20% U
![Page 37: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/37.jpg)
Agenda
Transactional Memory and Locking
Consistency Oblivious Programming (COP)
COP with STM
COP with HTM
Future Work
![Page 38: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/38.jpg)
Haswell HTM with COP
There is no suspend mode, so to compose COP operations, we execute all ROPs before the transaction. This limits the composition to at most one writing COP operation per transaction.
![Page 39: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/39.jpg)
Capacity and Cache Associativity
Packed Memory Array (PMA) search is done by divide and conquer. Assume the PMA size is 0x800000 and it starts at address 0. A search for an item that is found in addresses 0x0…0x7FFF must go through the addresses:
0x400000 0x200000 0x100000 0x80000
0x40000 0x20000 0x10000 0x8000
As the cache size in Haswell is 0x8000, all these addresses have the same cache index (0), and the transaction will always abort.
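The index arithmetic can be checked with a short C++ sketch. The cache geometry is an assumption stated here rather than on the slide: a 32 KiB (0x8000), 8-way set-associative L1 data cache with 64-byte lines, which gives 64 sets. The helper names `set_index`, `probes` and `lines_in_set` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed L1 data cache geometry: 32 KiB, 8 ways, 64-byte lines.
constexpr uint64_t kLineBytes = 64;
constexpr uint64_t kWays = 8;
constexpr uint64_t kCacheBytes = 0x8000;
constexpr uint64_t kSets = kCacheBytes / (kLineBytes * kWays);   // 64 sets

// Cache set that the line containing `addr` maps to.
constexpr uint64_t set_index(uint64_t addr) {
    return (addr / kLineBytes) % kSets;
}

// Midpoints visited by a divide-and-conquer search over [0, size)
// for an item stored at address `target`, down to one cache line.
std::vector<uint64_t> probes(uint64_t size, uint64_t target) {
    std::vector<uint64_t> out;
    uint64_t lo = 0, hi = size;
    while (hi - lo > kLineBytes) {
        uint64_t mid = lo + (hi - lo) / 2;
        out.push_back(mid);
        if (target < mid) hi = mid; else lo = mid;
    }
    return out;
}

// Number of probed addresses that fall into cache set `set`.
int lines_in_set(const std::vector<uint64_t>& addrs, uint64_t set) {
    int n = 0;
    for (uint64_t a : addrs)
        if (set_index(a) == set) ++n;
    return n;
}
```

Under these assumptions, `probes(0x800000, 0)` starts with the eight addresses listed on the slide, all of which map to set 0, and three further midpoints (0x4000, 0x2000, 0x1000) land in the same set. An 8-way set can hold only eight of those lines at once, which is the associativity limit the slide points to.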
![Page 40: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/40.jpg)
PMA
![Page 41: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/41.jpg)
RB-Tree Capacity Aborts
![Page 42: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/42.jpg)
RB-Tree Conflict Aborts
![Page 43: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/43.jpg)
Agenda
Transactional Memory and Locking
Consistency Oblivious Programming (COP)
COP with STM
COP with HTM
Future Work
![Page 44: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/44.jpg)
Data Structures
We already have COP versions of:
• RB-Tree
• Linked list
• PMA
• Cache Oblivious B-Tree
• Leaplist (k-ary skip list, tailored for range queries)
Can we design more COP data structures?
![Page 45: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/45.jpg)
Applications
Use COP in applications.
Many applications use shared data structures, so it is interesting to see the impact of COP on their performance.
![Page 46: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/46.jpg)
Infrastructure
Add statistics (transactional accesses, conflicts) to GCC.
Add a real suspend mode to GCC and to hardware.
![Page 47: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/47.jpg)
Theory
How to make the transformation to COP automatic?
Is COP applicable outside the data-structures area?
Bounds on the number of transactional accesses?
Bounds on the number of false conflicts?
![Page 48: Consistency Oblivious Programming Hillel Avni Tel Aviv University.](https://reader036.fdocuments.us/reader036/viewer/2022070413/5697bfe41a28abf838cb563c/html5/thumbnails/48.jpg)
Thank You