Designing Concurrent Search Structure Algorithms
![Page 1: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/1.jpg)
Designing Concurrent Search Structure Algorithms
Dennis Shasha
![Page 2: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/2.jpg)
What is a Search Structure?
• A data structure (typically a B-tree, hash structure, R-tree, etc.) that supports a dictionary.
• Operations: insert a key-value pair, delete a key-value pair, and search for a key-value pair.
![Page 3: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/3.jpg)
How to make a search structure algorithm concurrent
• Naïve approach: use two-phase locking (but then the root, at the very least, is read-locked, so lock conflicts are frequent).
• Semi-naïve algorithm: use hierarchical tree locking: lock root; afterwards lock node n only if you hold lock on parent of n. (Still tends to hold locks high in tree.)
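A minimal sketch of the semi-naïve protocol on a binary search tree, assuming one mutex per node. The `Node` class and `locked_search` function are hypothetical names for illustration and do not come from the slides. Each step locks a child while the parent's lock is still held, then releases the parent.

```python
import threading


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.lock = threading.Lock()  # assumption: one plain mutex per node


def locked_search(root, key):
    """Search a BST, locking node n only while holding the lock on n's parent."""
    node = root
    node.lock.acquire()               # every search starts by locking the root
    while True:
        if key == node.key:
            node.lock.release()
            return True
        child = node.left if key < node.key else node.right
        if child is None:
            node.lock.release()
            return False
        child.lock.acquire()          # the parent's lock is still held here
        node.lock.release()           # only now is the parent released
        node = child


# Example tree: 50 at the root, children 10 and 70, and 35 below 10.
root = Node(50)
root.left, root.right = Node(10), Node(70)
root.left.right = Node(35)
```

Every search has to take the root's lock first, which is why this approach still concentrates contention high in the tree.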
![Page 4: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/4.jpg)
How can we do better: fundamental insight
• In a search structure algorithm, all that we really care about is that we implement the dictionary operations correctly.
• Operations on structure need not even be serializable provided they maintain certain constraints.
![Page 5: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/5.jpg)
Train Your Intuition: parable of the library
• Imagine a library with books.
• It’s a little old-fashioned, so there are still card catalogues that identify the shelf where a book is held.
• Bob wants to get a book B.
• Alice is working on reorganizing the library by moving books from shelf to shelf and then changing the card catalogue.
![Page 6: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/6.jpg)
Parable of the library: interleaving of ops
• Bob 1: look up book B in the catalogue.
• Bob 2: read “go to shelf S”.
• Bob 3: start walking, but see a friend.
• Alice 1: move several books from S to S’, leaving a note.
• Alice 2: change the catalogue so B maps to S’.
• Bob 4: go to S, follow the note to S’.
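The interleaving above can be replayed step by step in a single thread. This is only a sketch under simplifying assumptions: shelves are sets of book names, a note is one forwarding pointer per shelf, and `S2` stands in for S’. The names `catalogue`, `shelves`, `notes`, and `fetch` are hypothetical.

```python
catalogue = {"B": "S"}                      # book -> shelf
shelves = {"S": {"B", "C"}, "S2": set()}    # "S2" plays the role of S'
notes = {}                                  # shelf -> shelf named on its note

# Bob 1-2: look up B and read "go to shelf S". Bob 3: he is delayed.
bob_target = catalogue["B"]

# Alice 1: move B from S to S', leaving a note on S.
shelves["S2"].add("B")
shelves["S"].discard("B")
notes["S"] = "S2"

# Alice 2: change the catalogue so B maps to S'.
catalogue["B"] = "S2"


def fetch(book, shelf):
    """Bob 4: go to the shelf he read earlier and follow notes if needed."""
    visited = [shelf]
    while book not in shelves[shelf] and shelf in notes:
        shelf = notes[shelf]
        visited.append(shelf)
    return book in shelves[shelf], visited


found, path = fetch("B", bob_target)
```

Bob ends up visiting two shelves, S and then S’, and still gets his book even though his catalogue reading was stale.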
![Page 7: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/7.jpg)
Parable of the library: observations
• Not conflict-preserving serializable: Bob → Alice (Bob reads the catalogue, then Alice changes it), but also Alice → Bob (Alice modifies S before Bob reads it).
• Indeed in no serial execution would Bob go to two shelves.
• Yet execution is completely ok!
![Page 8: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/8.jpg)
Parable of the library: what’s going on?
• All we care about is that (1) the structure is OK after Alice finishes and (2) Bob gets his book if it’s there.
• We want to find a general theory for this.
• Ref: the Vossen and Weikum book, and “Concurrent Search Structure Algorithms,” D. Shasha and N. Goodman, ACM Transactions on Database Systems, vol. 13, no. 1, pp. 53-90, March 1988.
![Page 9: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/9.jpg)
Good Structure for any Dictionary Data Structure
• Dictionary holds a set of key-value pairs. Values don’t matter for our theory so consider just the set of keys that could be present, denoted keyspace. Example: all natural numbers.
• From the root (in general, any root), must be able to navigate to a node n such that n either has a key being sought or no node has that key.
![Page 10: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/10.jpg)
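Example: binary search tree
• Tree: root 50 with children 10 and 70; node 35 is the right child of 10.
• Node 50: Inset = Keyspace
• Node 10: Inset = {x | x < 50}
• Node 70: Inset = {x | x > 50}
• Node 35: Inset = {x | x < 50 and x > 10}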
![Page 11: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/11.jpg)
Inset, Outset, Keyset
• Inset(n) is the subset of Keyspace consisting of keys that are either in n or could be reachable (according to the rules of the structure) from n.
• Edgeset(n,n’) is the subset of Keyspace directed to descendant n’ of n. The union of all edgesets with source n is Outset(n).
• Keyset(n) = Inset(n) – Outset(n): the set of keys that are in node n or nowhere.
![Page 12: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/12.jpg)
Notes
• Inset(n) = union, over all edges (m,n), of Inset(m) ∩ Edgeset(m,n).
• Note that Edgeset(n,n’) need not always be a subset of Inset(n). You’ll see why this is good later.
![Page 13: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/13.jpg)
Example: binary search tree
Keyspace is all integers
• Node 50 (root): Inset = Keyspace; Keyset = {50}; Outset = {x | x != 50}
• Node 70: Inset = {x | x > 50} = edgeset(node 50, node 70); Keyset = Inset
• Node 10: Inset = {x | x < 50}; Keyset = Inset – {x | x > 10} = {x | x <= 10}; edgeset(node 10, node 35) = {x | x > 10}
• Node 35: Inset = {x | x < 50 and x > 10}; Keyset = Inset
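The sets in this example can be recomputed mechanically from the edgesets. The sketch below assumes a finite keyspace 0..100 as a stand-in for all integers and hard-codes the three edgesets of the tree with nodes 50, 10, 70, and 35. `compute_sets` is a hypothetical helper, and it assumes the node list is ordered parents first.

```python
KEYSPACE = set(range(0, 101))   # assumption: finite stand-in for "all integers"

# edgeset[(source, target)] for the tree 50 -> (10, 70) and 10 -> 35
edgeset = {
    (50, 10): {x for x in KEYSPACE if x < 50},
    (50, 70): {x for x in KEYSPACE if x > 50},
    (10, 35): {x for x in KEYSPACE if x > 10},
}
nodes = [50, 10, 70, 35]        # parents listed before their children
ROOT = 50


def compute_sets(nodes, edgeset, root):
    inset, outset, keyset = {}, {}, {}
    for n in nodes:
        if n == root:
            inset[n] = set(KEYSPACE)          # the root sees the whole keyspace
        else:
            # Inset(n) = union over edges (m, n) of Inset(m) & Edgeset(m, n)
            inset[n] = set()
            for (m, target), keys in edgeset.items():
                if target == n:
                    inset[n] |= inset[m] & keys
        # Outset(n) = union of all edgesets with source n
        outset[n] = set()
        for (source, _), keys in edgeset.items():
            if source == n:
                outset[n] |= keys
        keyset[n] = inset[n] - outset[n]
    return inset, outset, keyset


inset, outset, keyset = compute_sets(nodes, edgeset, ROOT)
```

Note that edgeset(10, 35) contains keys above 50 that are not in Inset(10); the intersection in the inset formula filters them out, so Inset(35) is still {x | 10 < x < 50}.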
![Page 14: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/14.jpg)
Structure Goodness Conditions
• The keysets of the nodes partition the keyspace. So U {Keyset(n) | n is a node} = Keyspace, and if n != n’ then Keyset(n) is disjoint from Keyset(n’).
• Edgesets leaving node n are disjoint.
• Let Existkeys(n) be the keys actually present at node n. Existkeys(n) is a subset of Keyset(n).
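These three conditions are easy to check on a small finite structure. The sketch below is illustrative only: it assumes a keyspace of 0..100 and hard-codes the keysets and edgesets of a binary search tree with nodes 50, 10, 70, and 35. The function name `is_good_structure` and its parameters are hypothetical.

```python
from itertools import combinations


def is_good_structure(keyspace, keyset, edgesets_by_source, existkeys):
    # 1. Keysets cover the keyspace and are pairwise disjoint (a partition).
    covers = set().union(*keyset.values()) == keyspace
    disjoint_keysets = all(
        keyset[a].isdisjoint(keyset[b]) for a, b in combinations(keyset, 2)
    )
    # 2. Edgesets leaving the same node are pairwise disjoint.
    disjoint_edges = all(
        e1.isdisjoint(e2)
        for edges in edgesets_by_source.values()
        for e1, e2 in combinations(edges, 2)
    )
    # 3. Keys actually stored at a node lie inside that node's keyset.
    exist_ok = all(existkeys[n] <= keyset[n] for n in existkeys)
    return covers and disjoint_keysets and disjoint_edges and exist_ok


keyspace = set(range(0, 101))
keyset = {
    50: {50},
    10: set(range(0, 11)),
    35: set(range(11, 50)),
    70: set(range(51, 101)),
}
edgesets_by_source = {
    50: [set(range(0, 50)), set(range(51, 101))],
    10: [set(range(11, 101))],
}
good = is_good_structure(
    keyspace, keyset, edgesets_by_source,
    {50: {50}, 10: {10}, 35: {35}, 70: {70}},
)
# Storing key 35 at node 70 breaks the Existkeys condition.
bad = is_good_structure(
    keyspace, keyset, edgesets_by_source,
    {50: {50}, 10: {10}, 35: set(), 70: {70, 35}},
)
```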
![Page 15: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/15.jpg)
Structure Goodness Conditions (applies to each root)
• In the library, suppose that initially Inset(shelf S) = {books | author names begin with “S”}. Afterwards, Outset(S) = {books | author names begin with “Sh” or later}.
• At the end, Keyset(S) = books with author names starting with Sa through Sg. Inset(S’) = books with author names starting with Sh through Sz.
![Page 16: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/16.jpg)
Example: library at beginning
• Nodes: Cat (the catalogue), with shelves A, …, S below it.
• Cat: Inset = Keyspace; Outset = Keyspace; Keyset = {}
• Shelf A: Inset = {x | x begins with “A”} = edgeset(Cat, A); …
• Shelf S: Inset = {x | x begins with “S”} = edgeset(Cat, S); Keyset = Inset
![Page 17: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/17.jpg)
Example: library after reshelving
• Nodes: Cat, with shelves A, …, S below it; S’ is reached from S by the note.
• Cat: Inset = Keyspace; Outset = Keyspace; Keyset = {}
• Shelf A: Inset = {x | x begins with “A”}; …
• Shelf S: Inset = {x | x begins with “S”} = edgeset(Cat, S); Outset = {x | x begins with “Sh” or greater}
• Shelf S’: Inset = {x | x begins with “Sh” .. “Sz”}; Keyset = Inset
![Page 18: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/18.jpg)
Example: library after reshelving and catalog change
• Nodes: Cat, with shelves A, …, S, S’ below it; the note from S to S’ is still in place.
• Cat: Inset = Keyspace; Outset = Keyspace; Keyset = {}
• Shelf A: Inset = {x | x begins with “A”}; …
• Shelf S: Inset = {x | x begins with “S” through “Sg”} = edgeset(Cat, S); Outset = {x | x begins with “Sh” or greater}
• Shelf S’: Inset = {x | x begins with “Sh” .. “Sz”} = edgeset(Cat, S’); Keyset = Inset
![Page 19: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/19.jpg)
Observe
• Without the note from S to S’, there would be keys on S’ yet S’ would have a null inset and hence a null keyset.
• This violates the Existkeys part of the structural condition.
• Note also that we can’t eliminate the note from S to S’ even after the catalog is updated. Why?
![Page 20: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/20.jpg)
Execution Goodness
• For a search for an item B beginning at node m, the following invariant holds:
• After any operation of any process, if the search for item B is at node x, then B is in keyset(x) or there is a path from x to node y such that B is in keyset(y) and every edge E along that path has B in its edgeset.
![Page 21: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/21.jpg)
Execution Goodness Proof Sketch
• Provided the search reaches the node having B in its keyset, the search will find B there or will find it nowhere.
• The invariant ensures that the search will not end anywhere else.
![Page 22: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/22.jpg)
Execution Goodness Proof
• Why is it that Bob is fine in spite of the fact that the Bob and Alice concurrent execution could never execute serially?
• Because even when Bob is at shelf S, the book Bob is looking for is in edgeset(S,S’) and B is in keyset(S’).
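A small sketch of this argument, checking the invariant for Bob at shelf S after Alice has reshelved. It assumes keysets and edgesets are modelled as predicates on author names and that `S2` stands for S’. `invariant_holds` and the dictionaries are hypothetical names, and the book key "Shasha" is just an example.

```python
def invariant_holds(node, key, keyset, edges):
    """True if key is in keyset(node), or a path whose edgesets all
    contain key leads from node to a node whose keyset contains key."""
    seen = set()
    while node not in seen:
        seen.add(node)
        if keyset[node](key):
            return True
        matching = [child for child, accepts in edges.get(node, []) if accepts(key)]
        if not matching:
            return False
        node = matching[0]   # edgesets leaving a node are disjoint: at most one match
    return False


def starts_s(k):
    return k.startswith("S")


keyset = {
    "Cat": lambda k: False,                          # the catalogue holds no books
    "S": lambda k: starts_s(k) and k[:2] < "Sh",     # Sa .. Sg stay on S
    "S2": lambda k: "Sh" <= k[:2] <= "Sz",           # Sh .. Sz moved to S'
}
edges_with_note = {
    "Cat": [("S", starts_s)],                        # catalogue not yet updated
    "S": [("S2", lambda k: k[:2] >= "Sh")],          # the note Alice left
}
edges_without_note = {"Cat": [("S", starts_s)]}

with_note = invariant_holds("S", "Shasha", keyset, edges_with_note)
without_note = invariant_holds("S", "Shasha", keyset, edges_without_note)
```

With the note in place the invariant holds for Bob at S; without it, Bob would be stranded at a shelf whose keyset no longer contains his book.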
![Page 23: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/23.jpg)
Practical Applications
• Most sophisticated database management systems use some version of the library parable in their B-trees, hash structures, etc.
• Reason: locks need not be held as long and can be held lower in the tree.
• B trees for example have links at the leaf level. So a split looks like this:
![Page 24: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/24.jpg)
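B-tree, simplified (two values per node)
• Nodes: root (50), left leaf (1, 7), right leaf (70).
• Root (50): Inset = {x | 0 <= x <= 90}; Keyset = {}; Outset = Inset
• Left leaf (1, 7): Inset = {x | x < 50}; Keyset = Inset
• Right leaf (70): Inset = {x | x > 50 and x <= 90} = edgeset(node 50, node 70); Keyset = Inset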
![Page 25: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/25.jpg)
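B-tree insert(32): split left leaf at 15
Only the (1, 7) node needs to be locked.
• Nodes: root (50), left leaf (1, 7) linked to new leaf (32), right leaf (70).
• Root (50): Inset = {x | 0 <= x <= 90}; Keyset = {}; Outset = Inset
• Left leaf (1, 7): Inset = {x | x < 50}; Keyset = Inset – {x | x > 15} = {x | x <= 15}; Edgeset to new leaf (32) = {x | x > 15}
• Right leaf (70): Inset = {x | x > 50 and x <= 90} = edgeset(node 50, node 70); Keyset = Inset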
![Page 26: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/26.jpg)
Readjust parent (so lock it briefly)
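• Nodes: root (15, 50), left leaf (1, 7) linked to leaf (32), right leaf (70).
• Root (15, 50): Inset = {x | 0 <= x <= 90}; Keyset = {}; Outset = Inset
• Left leaf (1, 7): Inset = {x | x < 50}; Keyset = Inset – {x | x > 15} = {x | x <= 15}; Edgeset to leaf (32) = {x | x > 15}
• Right leaf (70): Inset = {x | x > 50 and x <= 90} = edgeset(node 50, node 70); Keyset = Inset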
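A single-threaded sketch of the leaf split with a sideways link, in the spirit of the slides above. It is not the published algorithm: locking is omitted, and the `Leaf` class, its `high` bound, and the `split` and `search` functions are hypothetical names. The upper bound 49 for the left leaf is an assumption standing in for "x < 50" over integers.

```python
class Leaf:
    def __init__(self, keys, high=None, right=None):
        self.keys = sorted(keys)
        self.high = high      # largest key this leaf is responsible for
        self.right = right    # link to the right sibling, if any


def split(leaf, at):
    """Move keys greater than `at` into a new right sibling, then link to it."""
    # Build the sibling completely before publishing the link to it.
    sibling = Leaf([k for k in leaf.keys if k > at], leaf.high, leaf.right)
    leaf.keys = [k for k in leaf.keys if k <= at]
    leaf.high = at            # keyset of the old leaf shrinks to {x | x <= at}
    leaf.right = sibling      # edgeset of the link is {x | x > at}
    return sibling


def search(leaf, key):
    """Start at the leaf the (possibly stale) parent pointed to; follow links."""
    while leaf.high is not None and key > leaf.high and leaf.right is not None:
        leaf = leaf.right
    return key in leaf.keys


left = Leaf([1, 7], high=49)
left.keys.append(32)          # insert(32) overflows the two-value leaf
sibling = split(left, 15)     # split at 15 before the parent is readjusted
```

A search for 32 that was routed to the old leaf by the not-yet-readjusted parent still succeeds, because 32 is in the edgeset of the link.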
![Page 27: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/27.jpg)
Can Generalize Using Model
• The above algorithm is due to Lehman and Yao and is called the B-link algorithm. It took a long journal article to present and prove.
• Now can generalize to any structure. Ensure structure works and invariant holds on execution.
• Also possible to invent a new algorithm making direct use of the model.
![Page 28: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/28.jpg)
High Concurrency Without Links: Give-up algorithm
• Explicitly record the description of inset of each node in the node.
• Search(B) descends. If B is ever not in the inset of the current node, then give up and start over.
• Happens rarely enough that performance is as good as B-link for searches. Less work for deletions.
• Proof is immediate.
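A single-threaded sketch of the give-up idea, assuming each node records its inset as an inclusive range `[low, high]` and each internal node routes by upper bounds. All names (`Node`, `descend`, `search`, `before_retry`) and the particular ranges are hypothetical. The `before_retry` hook only simulates another process finishing its update between attempts.

```python
class Node:
    def __init__(self, low, high, keys=(), routes=()):
        self.low, self.high = low, high   # recorded inset: low <= key <= high
        self.keys = set(keys)
        self.routes = list(routes)        # (upper_bound, child), ascending; empty for a leaf


def descend(root, key):
    """One attempt: True/False if found/absent, None to signal 'give up'."""
    node = root
    while True:
        if not (node.low <= key <= node.high):
            return None                   # key is not in this node's inset
        if not node.routes:
            return key in node.keys
        node = next(child for bound, child in node.routes if key <= bound)


def search(root, key, before_retry=lambda: None, max_attempts=10):
    for _ in range(max_attempts):
        result = descend(root, key)
        if result is not None:
            return result
        before_retry()                    # give up, then start over from the root
    raise RuntimeError("gave up too many times")


# The left leaf has been split at 15, but the parent still routes 16..49 to it.
left = Node(0, 15, keys={1, 7})
new = Node(16, 49, keys={32})
right = Node(50, 90, keys={70})
root = Node(0, 90, routes=[(49, left), (90, right)])


def readjust_parent():
    root.routes = [(15, left), (49, new), (90, right)]


first_attempt = descend(root, 32)         # lands on `left`, whose inset excludes 32
found = search(root, 32, before_retry=readjust_parent)
```

No link from the old leaf to the new one is needed: the recorded inset is what tells the search it has landed in the wrong place.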
![Page 29: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/29.jpg)
Conclusion
• Simple framework for all search structures. Handful of concepts: keyspace, inset, edgeset, outset, keyset.
• Can be a guide to coding.
![Page 30: Designing Concurrent Search Structure Algorithms](https://reader036.fdocuments.us/reader036/viewer/2022062321/5681364c550346895d9dccc8/html5/thumbnails/30.jpg)
Exercise
• When can Alice remove the note directing those seeking certain books to go from S to S’?
• Try to design a merge algorithm for a B-tree in the give-up setting. Lock as little and as low as possible.