Concurrent Hashing and Natural Parallelism Chapter 13 in The Art of Multiprocessor Programming...

Post on 08-Jan-2018



Concurrent Hashing and Natural Parallelism

Chapter 13 in The Art of Multiprocessor Programming

Instructor: Erez Petrank

Presented by Tomer Hermelin

Hash-table Recap

Int hash_function(T item)

Void add(T item)

Void remove(T item)

Bool contains(T item)

resize and resize policy.

Closed address vs open address

Agenda

● Closed address: 3 gradual improvements, then a lock-free model

● Open address: 2 gradual improvements

Concurrent Closed address

Base class - abstract

● Constructor(int capacity): initializes int setSize and an array of lists of size capacity

● contains(T x): acquire(x), check for x, release(x)

● add/remove(T x): acquire(x); add x and increment setSize if it is not already in the list (remove is symmetric); release(x); then check the policy and resize if needed

Name of the game:

● acquire(T x): acquires the locks necessary to manipulate item x.

● release(T x) releases the relevant locks.

● policy() decides whether to resize the set.

● resize() doubles the capacity of the table.

Coarse-Grained Hash Set

Coarse-Grained

The naïve solution:

Add one main lock, which every method acquires.

The only thing to do:

When resizing, after locking, make sure no one else has already resized.

Why shouldn’t we do that for add, remove and contains?

Easy to understand and implement.

But every thread stops all the other threads…

Tomer Hermelin
they could have resized either way
Tomer Hermelin
because if someone already added/removed the item, we'll just fail
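The base class and its coarse-grained variant can be sketched as below. This is a minimal sketch, not the book's exact code: the wrapper name ClosedAddressSketch, the bucket() helper, the use of AtomicInteger for setSize, and the "average bucket size above 4" policy are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class ClosedAddressSketch {
    /** Template: subclasses decide only how to lock and when/how to resize. */
    static abstract class BaseHashSet<T> {
        protected volatile List<T>[] table;   // one bucket (list) per entry
        protected final AtomicInteger setSize = new AtomicInteger();

        @SuppressWarnings("unchecked")
        BaseHashSet(int capacity) {
            List<T>[] t = (List<T>[]) new List[capacity];
            for (int i = 0; i < capacity; i++) t[i] = new ArrayList<>();
            table = t;
        }

        abstract void acquire(T x);
        abstract void release(T x);
        abstract boolean policy();
        abstract void resize();

        protected List<T> bucket(T x) {       // helper name is an assumption
            List<T>[] t = table;
            return t[Math.floorMod(x.hashCode(), t.length)];
        }

        boolean contains(T x) {
            acquire(x);
            try { return bucket(x).contains(x); } finally { release(x); }
        }

        boolean add(T x) {
            boolean added = false;
            acquire(x);
            try {
                List<T> b = bucket(x);
                if (!b.contains(x)) { b.add(x); setSize.incrementAndGet(); added = true; }
            } finally { release(x); }
            if (policy()) resize();           // checked after releasing the lock
            return added;
        }

        boolean remove(T x) {
            acquire(x);
            try {
                boolean removed = bucket(x).remove(x);
                if (removed) setSize.decrementAndGet();
                return removed;
            } finally { release(x); }
        }
    }

    /** Coarse-grained: one lock for every method. */
    static class CoarseHashSet<T> extends BaseHashSet<T> {
        private final ReentrantLock lock = new ReentrantLock();
        CoarseHashSet(int capacity) { super(capacity); }
        void acquire(T x) { lock.lock(); }
        void release(T x) { lock.unlock(); }
        boolean policy() { return setSize.get() / table.length > 4; }  // assumed policy

        @SuppressWarnings("unchecked")
        void resize() {
            int oldCapacity = table.length;               // saved before locking
            lock.lock();
            try {
                if (oldCapacity != table.length) return;  // someone else already resized
                List<T>[] bigger = (List<T>[]) new List[2 * oldCapacity];
                for (int i = 0; i < bigger.length; i++) bigger[i] = new ArrayList<>();
                for (List<T> b : table)
                    for (T x : b) bigger[Math.floorMod(x.hashCode(), bigger.length)].add(x);
                table = bigger;
            } finally { lock.unlock(); }
        }
    }

    public static void main(String[] args) {
        CoarseHashSet<Integer> set = new CoarseHashSet<>(2);
        for (int i = 0; i < 100; i++) set.add(i);
        System.out.println(set.contains(42) + " " + set.contains(1000));
    }
}
```

Only resize() needs the "has someone already resized?" check; add and remove simply return false when another thread got there first.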

Striped Hash set

Striped Hash set

acquire: given an item with hash code k, we lock the lock at index k (mod IC), where IC is the initial capacity, i.e. the fixed length of the locks array.

Let’s say we create a hash table with capacity 8, and it was doubled in size once. Then:

Can modify buckets 0 and 5 in parallel

Can’t modify bucket 0 with two threads in parallel

Can’t modify buckets 0 and 8 in parallel
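The three cases above follow from simple modular arithmetic. A small sketch, assuming the lock index is the bucket index modulo the number of locks (StripeIndexDemo and lockIndex are names made up for this example):

```java
class StripeIndexDemo {
    /** Lock index for a bucket: the bucket index modulo the number of locks. */
    static int lockIndex(int bucket, int lockCount) {
        return bucket % lockCount;
    }

    public static void main(String[] args) {
        int initialCapacity = 8;   // the locks array keeps this length
        int capacity = 16;         // table size after one doubling
        for (int bucket : new int[] {0, 5, 8}) {
            System.out.println("bucket " + bucket + " of " + capacity
                    + " -> lock " + lockIndex(bucket, initialCapacity));
        }
    }
}
```

Buckets 0 and 8 map to the same lock, so they conflict; buckets 0 and 5 map to different locks.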

[Figure: the Locks array and the Table]

Resizing

● Save table size

● Validate table size

No Deadlock

contains, add, or remove cannot deadlock (also with resize), because they require only one lock to operate.

A resize call cannot deadlock with another resize call because both calls start without holding any locks, and acquire the locks in the same order

Drawback

After multiple resizes there are large groups of buckets that cannot be modified in parallel.

Reasons not to grow the locks array?

1. Associating a lock with every table entry could consume too much space, especially when tables are large and contention is low.

2. While resizing the table is straightforward, resizing the lock array (while in use) is more complex.

Striped Hash set - summary

● Striped locking permits some concurrency.

● add(), contains(), and remove() methods take constant expected time.

● After multiple resizes, the locks-to-buckets ratio is no longer ideal.
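A minimal sketch of a striped hash set along these lines. It is written as one standalone class rather than on top of the abstract base class, and the helper names (lockFor, bucket, newTable, capacity) and the resize policy are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class StripedHashSet<T> {
    private final ReentrantLock[] locks;   // fixed length: never grows
    private volatile List<T>[] table;      // doubles on resize
    private final AtomicInteger setSize = new AtomicInteger();

    StripedHashSet(int capacity) {
        locks = new ReentrantLock[capacity];
        for (int i = 0; i < capacity; i++) locks[i] = new ReentrantLock();
        table = newTable(capacity);
    }

    @SuppressWarnings("unchecked")
    private static <E> List<E>[] newTable(int n) {
        List<E>[] t = (List<E>[]) new List[n];
        for (int i = 0; i < n; i++) t[i] = new ArrayList<>();
        return t;
    }

    // The table length is always a multiple of locks.length, so every
    // bucket is covered by exactly one lock.
    private ReentrantLock lockFor(T x) {
        return locks[Math.floorMod(x.hashCode(), locks.length)];
    }

    private List<T> bucket(T x) {
        List<T>[] t = table;
        return t[Math.floorMod(x.hashCode(), t.length)];
    }

    boolean contains(T x) {
        ReentrantLock l = lockFor(x);
        l.lock();
        try { return bucket(x).contains(x); } finally { l.unlock(); }
    }

    boolean add(T x) {
        boolean added = false;
        ReentrantLock l = lockFor(x);
        l.lock();
        try {
            List<T> b = bucket(x);
            if (!b.contains(x)) { b.add(x); setSize.incrementAndGet(); added = true; }
        } finally { l.unlock(); }
        if (setSize.get() / table.length > 4) resize();   // assumed policy
        return added;
    }

    boolean remove(T x) {
        ReentrantLock l = lockFor(x);
        l.lock();
        try {
            boolean removed = bucket(x).remove(x);
            if (removed) setSize.decrementAndGet();
            return removed;
        } finally { l.unlock(); }
    }

    int capacity() { return table.length; }

    private void resize() {
        int oldCapacity = table.length;                  // save table size
        for (ReentrantLock l : locks) l.lock();          // always in index order
        try {
            if (oldCapacity != table.length) return;     // validate table size
            List<T>[] bigger = newTable(2 * oldCapacity);
            for (List<T> b : table)
                for (T x : b) bigger[Math.floorMod(x.hashCode(), bigger.length)].add(x);
            table = bigger;
        } finally {
            for (ReentrantLock l : locks) l.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StripedHashSet<Integer> set = new StripedHashSet<>(8);
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i += 2) set.add(i); });
        Thread b = new Thread(() -> { for (int i = 1; i < 1000; i += 2) set.add(i); });
        a.start(); b.start(); a.join(); b.join();
        System.out.println(set.contains(999) + ", capacity " + set.capacity());
    }
}
```

The table grows but the locks array does not, which is exactly why the ratio degrades after several resizes.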

Refinable Hash set

Refinable Hash set

Proposal: refine the granularity of locking when resizing

The main step – Making sure the lock array is not in use, while resizing.

Atomic Markable Reference

Add AtomicMarkableReference<Thread> owner

We use the owner as a mutual exclusion flag between the resize() call and all other calls (including other resizes)

acquire()

[Figure: Locks, Table and Owner; acquire() takes a lock and then validates]

Tomer Hermelin
how implemented?

Resize()

[Figure: steps of resize(), showing the Locks array, the Table and the Owner field]

Refinable Hash set - summary

● Control over the ratio between the locks array and the table size.

● Resize is a ‘stop the world’ method.
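A minimal sketch of how the owner flag, acquire() and resize() might fit together, assuming a standalone class instead of the abstract base class. Returning the held lock from acquire(), the helper names and the resize policy are choices made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicMarkableReference;
import java.util.concurrent.locks.ReentrantLock;

class RefinableHashSet<T> {
    // The mark is true while some thread (the owner) is resizing.
    private final AtomicMarkableReference<Thread> owner = new AtomicMarkableReference<>(null, false);
    private volatile ReentrantLock[] locks;   // grows together with the table
    private volatile List<T>[] table;
    private final AtomicInteger setSize = new AtomicInteger();

    RefinableHashSet(int capacity) { locks = newLocks(capacity); table = newTable(capacity); }

    private static ReentrantLock[] newLocks(int n) {
        ReentrantLock[] l = new ReentrantLock[n];
        for (int i = 0; i < n; i++) l[i] = new ReentrantLock();
        return l;
    }

    @SuppressWarnings("unchecked")
    private static <E> List<E>[] newTable(int n) {
        List<E>[] t = (List<E>[]) new List[n];
        for (int i = 0; i < n; i++) t[i] = new ArrayList<>();
        return t;
    }

    private List<T> bucket(T x) {
        List<T>[] t = table;
        return t[Math.floorMod(x.hashCode(), t.length)];
    }

    /** Locks the stripe for x and returns the lock that is now held. */
    private ReentrantLock acquire(T x) {
        boolean[] mark = {true};
        Thread me = Thread.currentThread();
        while (true) {
            Thread who;
            do { who = owner.get(mark); } while (mark[0] && who != me);  // wait out a resize
            ReentrantLock[] oldLocks = locks;
            ReentrantLock lock = oldLocks[Math.floorMod(x.hashCode(), oldLocks.length)];
            lock.lock();
            who = owner.get(mark);                                       // validate
            if ((!mark[0] || who == me) && locks == oldLocks) return lock;
            lock.unlock();                                               // resize intervened: retry
        }
    }

    boolean contains(T x) {
        ReentrantLock lock = acquire(x);
        try { return bucket(x).contains(x); } finally { lock.unlock(); }
    }

    boolean add(T x) {
        boolean added = false;
        ReentrantLock lock = acquire(x);
        try {
            List<T> b = bucket(x);
            if (!b.contains(x)) { b.add(x); setSize.incrementAndGet(); added = true; }
        } finally { lock.unlock(); }
        if (setSize.get() / table.length > 4) resize();   // assumed policy
        return added;
    }

    boolean remove(T x) {
        ReentrantLock lock = acquire(x);
        try {
            boolean removed = bucket(x).remove(x);
            if (removed) setSize.decrementAndGet();
            return removed;
        } finally { lock.unlock(); }
    }

    int lockCount() { return locks.length; }

    private void resize() {
        int oldCapacity = table.length;
        Thread me = Thread.currentThread();
        if (!owner.compareAndSet(null, me, false, true)) return;   // another resize is running
        try {
            if (table.length != oldCapacity) return;               // already resized
            for (ReentrantLock l : locks) while (l.isLocked()) Thread.yield();  // quiesce
            List<T>[] bigger = newTable(2 * oldCapacity);
            for (List<T> b : table)
                for (T x : b) bigger[Math.floorMod(x.hashCode(), bigger.length)].add(x);
            table = bigger;
            locks = newLocks(2 * oldCapacity);                     // lock array grows too
        } finally { owner.set(null, false); }
    }
}
```

Once the mark is set, no new thread can pass validation in acquire(), so waiting for every lock to become free is enough to guarantee the lock array is no longer in use.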

Lock free Models

We want to resize without “stopping the world”, while still doing contains, add, and remove in constant time.

Atomic operations work only on a single memory location, and resizing touches many locations.

We’ll take care of resizing incrementally, during add, remove and contains.

Tomer Hermelin
constant? really? check this out
Tomer Hermelin
also remove? yes?

Recursive Split-Ordering

A list structure

● All the buckets are part of one long list.

● add(), remove() and contains() reach the list through pointers in the table.

● To make our life easy, we add special (sentinel) nodes.

● Each one is initialized when first accessed.

Tomer Hermelin
correct?
Tomer Hermelin
MSB
Tomer Hermelin
find out how we make a node special

The order of the items

We want items not to move in resizing!

Every item is inserted according to the reverse order of its hash-code bit representation.
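A small sketch of how such keys might be computed, assuming non-negative hash codes that fit in 31 bits and 32-bit keys compared as unsigned values. The names ordinaryKey, sentinelKey and before are illustrative; the trick of forcing the lowest key bit to 1 for items and 0 for bucket sentinels keeps a sentinel ahead of every item in its bucket.

```java
class SplitOrderKeys {
    /** Key of an ordinary item: set the top bit, then reverse, so the key ends in 1. */
    static int ordinaryKey(int hash) {
        return Integer.reverse(hash | 0x80000000);
    }

    /** Key of a bucket's sentinel node: the reversed bucket index, which ends in 0. */
    static int sentinelKey(int bucket) {
        return Integer.reverse(bucket);
    }

    /** List order: keys are compared as unsigned 32-bit values. */
    static boolean before(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    public static void main(String[] args) {
        System.out.println("sentinel 1: " + Integer.toBinaryString(sentinelKey(1)));
        System.out.println("item 5:     " + Integer.toBinaryString(ordinaryKey(5)));
        System.out.println("sentinel 3: " + Integer.toBinaryString(sentinelKey(3)));
    }
}
```

With a table of size 4, hash code 5 belongs to bucket 1, and its key falls between the sentinel of bucket 1 and the sentinel of bucket 3.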

The order of the items

Size of the table = 2^n

[Figure: list items ordered by their bit-reversed hash codes, shown for n=1 and n=2]

Resize()

Triggered only by a small action: changing bucketSize.

The table has a fixed size, and each cell points to the correct ‘logical bucket’ in the list (a pointer is initialized when first accessed).

Adding Example

Resizing Example

When the capacity is 2, to add an item with hash code 3, the table directs us through index no. 1.

After changing the capacity from 2 to 4, we access the same item through index no. 3.
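The arithmetic of this example, as a small sketch. The parent() helper is an assumption for illustration: it takes the bucket a new bucket was split from to be the index with its most significant set bit cleared, which is where the new bucket's sentinel can be reached from before its own table entry is initialized.

```java
class BucketIndexDemo {
    /** Bucket index of a (non-negative) hash code for the current capacity. */
    static int bucketIndex(int hash, int capacity) {
        return hash % capacity;
    }

    /** Parent bucket: the index with its most significant set bit cleared. */
    static int parent(int bucket) {
        return bucket - Integer.highestOneBit(bucket);
    }

    public static void main(String[] args) {
        System.out.println("capacity 2: index " + bucketIndex(3, 2));
        System.out.println("capacity 4: index " + bucketIndex(3, 4));
        System.out.println("parent of bucket 3: " + parent(3));
    }
}
```

Bucket 3 is split off bucket 1, so an item that used to be found through index 1 is still in the same place in the list when index 3 starts pointing at it.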

So how do we implement?

The list is almost the same as LockFreeList:

● The items are sorted in recursive-split order

● While the LockFreeList class uses only two sentinels, we place a sentinel at the start of each new bucket.

So how do we implement?

[Figure: Table with entries 0 to 7 pointing into the list]

AtomicInteger bucketSize

AtomicInteger setSize

An item inserted before the table was resized must be accessible afterwards from both its previous and current buckets.

With our ordering, we ensure that these two groups of items are positioned one after the other in the list. This organization keeps each item in the second group accessible from bucket b.
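This property can be checked with a few lines of code. The sketch below only sorts small hash codes by their bit-reversed value; it is not a lock-free list, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SplitOrderDemo {
    /** Hash codes 0..count-1 sorted by their bit-reversed value (unsigned). */
    static List<Integer> splitOrder(int count) {
        List<Integer> hashes = new ArrayList<>();
        for (int h = 0; h < count; h++) hashes.add(h);
        hashes.sort((a, b) ->
                Integer.compareUnsigned(Integer.reverse(a), Integer.reverse(b)));
        return hashes;
    }

    public static void main(String[] args) {
        List<Integer> order = splitOrder(8);
        System.out.println("list order: " + order);
        for (int capacity : new int[] {2, 4}) {
            StringBuilder buckets = new StringBuilder();
            for (int h : order) buckets.append(h % capacity).append(' ');
            System.out.println("capacity " + capacity + " buckets: " + buckets);
        }
    }
}
```

The list order is 0, 4, 2, 6, 1, 5, 3, 7. With capacity 2 the second half is bucket 1; with capacity 4 the same run splits into bucket 1 (1, 5) followed directly by bucket 3 (3, 7), and no item has moved.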

Correctness while Resizing

Tomer Hermelin
Because the hash function depends on the table capacity, we must be careful when the table capacity changes.
Tomer Hermelin
When the capacity grows to 2^(i+1), the items in bucket b are split between two buckets: those for which k = b (mod 2^(i+1)) remain in bucket b, while those for which k = b + 2^i (mod 2^(i+1)) migrate to bucket b + 2^i.
Tomer Hermelin
animation of

Open-Addressed Hash Set

Cuckoo Hashing

Some Cuckoos are nest parasites: they lay their eggs in other birds’ nests. Cuckoo chicks hatch early, and quickly push the other eggs out of the nest.

Sequential Cuckoo

● Two tables, each with own hash function.

● Remove and contains: simply check in both tables.

● The add method works by ‘kicking out’ the item in the way and letting it find a new cell.

If no free cell can be found, we resize.
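A minimal sequential sketch of these rules, restricted to int items to keep it short. The two hash functions and the LIMIT on displacement rounds are assumptions chosen only for this example.

```java
class SequentialCuckooSet {
    private static final int LIMIT = 16;   // displacement rounds before resizing (assumed)
    private Integer[][] table;             // table[0] and table[1]
    private int capacity;

    SequentialCuckooSet(int capacity) {
        this.capacity = capacity;
        this.table = new Integer[2][capacity];
    }

    // Two hypothetical hash functions, one per table.
    private int hash0(int x) { return Math.floorMod(x, capacity); }
    private int hash1(int x) { return Math.floorMod((x * 0x9E3779B9) >>> 15, capacity); }

    private static boolean holds(Integer slot, int x) { return slot != null && slot == x; }

    boolean contains(int x) {
        return holds(table[0][hash0(x)], x) || holds(table[1][hash1(x)], x);
    }

    boolean remove(int x) {
        if (holds(table[0][hash0(x)], x)) { table[0][hash0(x)] = null; return true; }
        if (holds(table[1][hash1(x)], x)) { table[1][hash1(x)] = null; return true; }
        return false;
    }

    boolean add(int x) {
        if (contains(x)) return false;
        Integer item = x;
        for (int round = 0; round < LIMIT; round++) {
            // Put the item in its cell; whoever was there is kicked out.
            if ((item = swap(0, hash0(item), item)) == null) return true;
            if ((item = swap(1, hash1(item), item)) == null) return true;
        }
        resize();      // no free cell found: grow, then place the displaced item
        add(item);
        return true;
    }

    private Integer swap(int i, int index, Integer item) {
        Integer victim = table[i][index];
        table[i][index] = item;
        return victim;
    }

    private void resize() {
        Integer[][] old = table;
        capacity *= 2;
        table = new Integer[2][capacity];
        for (Integer[] row : old)
            for (Integer y : row)
                if (y != null) add(y);
    }

    public static void main(String[] args) {
        SequentialCuckooSet set = new SequentialCuckooSet(4);
        for (int i = 0; i < 50; i++) set.add(i);
        System.out.println(set.contains(7) + " " + set.contains(50));
    }
}
```

contains() and remove() look at exactly two cells, so they take constant time; all the work of keeping that true is pushed into add().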

● add()

● remove()

● contains()

● relocate()

The main problem in making the sequential Cuckoo concurrent is the add method

Concurrent Base Class - abstract

Concurrent Base Class - abstract

Probe set: a constant-sized set of items with the same hash code.

We use a two-dimensional table of probe sets.
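One way to picture how an item is placed into its two probe sets is the single-threaded sketch below. The constants PROBE_SIZE and THRESHOLD, their values, and the Outcome names are assumptions for illustration; locking, relocation and resizing themselves are left out.

```java
import java.util.ArrayList;
import java.util.List;

class ProbeSetSketch {
    static final int PROBE_SIZE = 4;   // hard limit on a probe set (assumed value)
    static final int THRESHOLD = 2;    // preferred maximum size (assumed value)

    enum Outcome { ADDED, ADDED_NEEDS_RELOCATE, NEEDS_RESIZE }

    /** Decide where x goes, given the two probe sets it hashes to. */
    static <T> Outcome place(T x, List<T> set0, List<T> set1) {
        if (set0.size() < THRESHOLD) { set0.add(x); return Outcome.ADDED; }
        if (set1.size() < THRESHOLD) { set1.add(x); return Outcome.ADDED; }
        // Both sets are at the threshold: still add, but ask for a relocation
        // to bring the chosen set back under the threshold.
        if (set0.size() < PROBE_SIZE) { set0.add(x); return Outcome.ADDED_NEEDS_RELOCATE; }
        if (set1.size() < PROBE_SIZE) { set1.add(x); return Outcome.ADDED_NEEDS_RELOCATE; }
        return Outcome.NEEDS_RESIZE;   // both probe sets are full
    }

    public static void main(String[] args) {
        List<Integer> set0 = new ArrayList<>();
        List<Integer> set1 = new ArrayList<>();
        for (int i = 0; i < 9; i++) {
            System.out.println(i + " -> " + place(i, set0, set1));
        }
    }
}
```

The gap between THRESHOLD and PROBE_SIZE is slack: an add can succeed immediately and leave the clean-up to a later relocation step.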

[Figure: two-dimensional table of probe sets]

Concurrent Cuckoo Hashing

remove() and contains():

[Figure: item k is checked in its probe set in Table 0 and in its probe set in Table 1]

Add()

[Figure: add(k) on Table 0 and Table 1, showing acquire(k), the probe-set thresholds, and the Relocate and Resize cases]

relocation

[Figure: relocation between Table 0 and Table 1, showing the thresholds, acquire(s) and acquire(a)]

And we start all over again!

Tomer Hermelin
why no deadlock?
Tomer Hermelin
only acquire one at a time

Name of the game:

● acquire(T x): acquires the locks necessary to manipulate item x.

● release(T x) releases the relevant locks.

● resize() doubles the capacity of the table.

● policy() decides whether to resize the set.

Striped Concurrent Cuckoo Hashing

Striped Concurrent Cuckoo

Adding a fixed 2-by-L array of reentrant locks

As before, lock[i][j] protects table[i][k], where k (mod L) = j

[Figure: Table 0 and Table 1 with their lock arrays, Locks 0 and Locks 1]

Still no deadlock

The acquire() method locks lock[0][h0(x)] and only then lock[1][h1(x)], to avoid deadlock.

When resizing we only acquire the locks in lock[0].
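The locking order can be sketched as follows. Only the locks are modelled here, not the tables; the hash functions and the method names acquireForResize, releaseAfterResize and holds are assumptions made for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

class StripedCuckooLocks {
    private final ReentrantLock[][] lock;   // fixed 2-by-L array

    StripedCuckooLocks(int L) {
        lock = new ReentrantLock[2][L];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < L; j++) lock[i][j] = new ReentrantLock();
    }

    // Two hypothetical hash functions; both return non-negative values.
    static int hash0(Object x) { return x.hashCode() & 0x7fffffff; }
    static int hash1(Object x) { return (x.hashCode() * 0x9E3779B9) >>> 1; }

    /** Every thread takes its row-0 lock first and its row-1 lock second. */
    void acquire(Object x) {
        lock[0][hash0(x) % lock[0].length].lock();
        lock[1][hash1(x) % lock[1].length].lock();
    }

    void release(Object x) {
        lock[0][hash0(x) % lock[0].length].unlock();
        lock[1][hash1(x) % lock[1].length].unlock();
    }

    /** Resize needs only row 0: nobody reaches row 1 without a row-0 lock. */
    void acquireForResize() {
        for (ReentrantLock l : lock[0]) l.lock();   // in index order
    }

    void releaseAfterResize() {
        for (ReentrantLock l : lock[0]) l.unlock();
    }

    boolean holds(int i, int j) { return lock[i][j].isHeldByCurrentThread(); }

    public static void main(String[] args) {
        StripedCuckooLocks locks = new StripedCuckooLocks(4);
        locks.acquire("item");
        System.out.println("row 0 held: " + locks.holds(0, hash0("item") % 4));
        locks.release("item");
    }
}
```

Because every operation starts with a lock from row 0, holding all of row 0 is enough to keep all other threads out during a resize.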

Refinable Concurrent Cuckoo Hashing

Refinable Concurrent Cuckoo

[Figure: the Owner field, Table 0 and Table 1, and the lock arrays Locks 0 and Locks 1]

The end

Questions?