CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
![Page 1: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/1.jpg)
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Fall 2011
Prof. Jennifer Welch
Set 18: Wait-Free Simulations Beyond Registers
![Page 2: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/2.jpg)
Data Types Beyond Registers
Registers support the operations read and write
We've seen wait-free simulations of one kind of register out of another kind: different numbers of values, readers, writers
What about (wait-free) simulating a significantly different kind of data type out of registers?
More generally, what about (wait-free) simulating an object of type X out of objects of type Y?
![Page 3: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/3.jpg)
Key Insight
Ability of objects of type Y to be used to simulate an object of type X is related to the ability of those data types to solve consensus!
We are focusing on systems that are asynchronous, shared memory, and wait-free
![Page 4: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/4.jpg)
FIFO Queue Example
Sequential specification of a FIFO queue:
  operation with invocation enq(x) and response ack
  operation with invocation deq and response return(x)
  a sequence of operations is allowable iff each deq returns the oldest enqueued value that has not yet been dequeued (returns ⊥ if queue is empty)
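The sequential specification can be made concrete with a small Python sketch. The class name `FifoQueue`, the helper `is_allowable`, and the use of `None` as the "queue is empty" response are assumptions made for illustration; they are not part of the slides.

```python
from collections import deque

EMPTY = None  # assumed stand-in for the special "queue is empty" response


class FifoQueue:
    """Sequential (single-threaded) model of the FIFO queue specification."""

    def __init__(self, initial=()):
        self._items = deque(initial)

    def enq(self, x):
        # Invocation enq(x), response ack.
        self._items.append(x)
        return "ack"

    def deq(self):
        # Invocation deq, response return(x): the oldest value not yet dequeued.
        if not self._items:
            return EMPTY
        return self._items.popleft()


def is_allowable(ops):
    """Check a sequence of (invocation, argument, response) triples against the spec."""
    q = FifoQueue()
    for name, arg, response in ops:
        if name == "enq":
            expected = q.enq(arg)
        elif name == "deq":
            expected = q.deq()
        else:
            return False
        if response != expected:
            return False
    return True
```

For example, enqueuing 1 then 2 and then dequeuing 1 is allowable, while dequeuing 2 first is not.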
![Page 5: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/5.jpg)
Consensus Algorithm for n = 2 Using FIFO Queue
Shared objects: one shared FIFO queue Q and two shared registers Prefer[0], Prefer[1]
Initially Q = [0] and Prefer[i] = ⊥

Code for pi:
    Prefer[i] := pi's input      // write my input into my register
    val := deq(Q)
    if val = 0 then
        decide on pi's input
    else
        temp := Prefer[1 - i]
        decide temp

Use the shared queue to arbitrate between the 2 procs: the first one to dequeue the initial 0 wins, and the decision value is its input.
The loser obtains the decision value from the other proc's register.
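A minimal Python sketch of the 2-proc algorithm follows. It assumes that a lock-protected deque is an acceptable stand-in for an atomic shared FIFO queue and that plain list slots can play the role of the two registers; `AtomicFifoQueue` and `run_two_proc_consensus` are hypothetical names introduced here.

```python
import threading
from collections import deque


class AtomicFifoQueue:
    """Shared FIFO queue; the lock only models the atomicity of deq (assumption)."""

    def __init__(self, initial):
        self._items = deque(initial)
        self._lock = threading.Lock()

    def deq(self):
        with self._lock:
            # None plays the role of the "queue is empty" response.
            return self._items.popleft() if self._items else None


def run_two_proc_consensus(inputs):
    """Run procs 0 and 1 on the given inputs and return their decisions."""
    Q = AtomicFifoQueue([0])   # initially Q = [0]
    prefer = [None, None]      # the two shared registers
    decisions = [None, None]

    def proc(i):
        prefer[i] = inputs[i]             # write my input into my register
        val = Q.deq()                     # arbitrate via the shared queue
        if val == 0:
            decisions[i] = inputs[i]      # winner: decide on own input
        else:
            decisions[i] = prefer[1 - i]  # loser: read the winner's register

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

The loser can safely read the other register because the winner wrote its input before dequeuing the initial 0.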
![Page 6: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/6.jpg)
Implications of Consensus Algorithm Using FIFO Queue
Suppose we want to wait-free simulate a FIFO queue using read/write registers.
Is this possible?
No! If it were possible, we could solve consensus:
  simulate a FIFO queue using registers
  use simulated queue and previous algorithm to solve consensus
This would contradict the impossibility of wait-free consensus using only read/write registers.
![Page 7: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/7.jpg)
Extend Algorithm to More Procs?
Can we use FIFO queues to solve consensus with more than 2 procs?
The ability to atomically dequeue a value was key to the 2-proc alg:
  one proc. learns it is the winner
  the other learns it is the loser, therefore the id of the winner is obvious
Not clear how to handle 3 procs.
Suppose we have a different data type:
![Page 8: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/8.jpg)
Compare & Swap Specification
compare&swap(X : shared memory address, old : value, new : value)
    previous := X        // previous is a local var.
    if previous = old then X := new
    return previous
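The specification can be modeled in Python as below. This is only a sketch: Python has no built-in compare&swap on ordinary variables, so a lock is assumed here purely to model the atomicity of the operation, and the class name `CompareAndSwap` is hypothetical.

```python
import threading


class CompareAndSwap:
    """Model of a compare&swap object holding one shared value."""

    def __init__(self, initial=None):
        self._value = initial
        self._lock = threading.Lock()  # models atomicity only (assumption)

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value     # previous is a local variable
            if previous == old:
                self._value = new      # swap only if the value matched old
            return previous            # the prior value is returned either way
```

A caller learns whether its swap took effect by checking whether the returned value equals `old`.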
![Page 9: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/9.jpg)
Consensus Algorithm Using Compare-and-Swap
Shared object: one shared C&S object First
Initially First = ⊥

val := compare&swap(First, ⊥, my input)    // if First = ⊥ then replace with my input
if val = ⊥ then decide on my input
else decide val

The single compare&swap simultaneously indicates the winner and the value to be decided by all the losers.
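The compare&swap consensus algorithm works for any number of procs, which the following Python sketch illustrates with threads. The lock-based `CompareAndSwap` class, the `UNSET` sentinel for the initial value of First, and the driver `run_consensus` are assumptions made for the example.

```python
import threading

UNSET = object()  # assumed stand-in for the initial "no value yet" content of First


class CompareAndSwap:
    """Model of a compare&swap object; the lock only models atomicity."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous is old:
                self._value = new
            return previous


def run_consensus(inputs):
    """Run one proc per input value and return the list of decisions."""
    first = CompareAndSwap(UNSET)
    decisions = [None] * len(inputs)

    def proc(i):
        val = first.compare_and_swap(UNSET, inputs[i])
        # The winner sees UNSET and decides its own input;
        # every loser sees the winner's input and decides that.
        decisions[i] = inputs[i] if val is UNSET else val

    threads = [threading.Thread(target=proc, args=(i,)) for i in range(len(inputs))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return decisions
```

Each proc performs exactly one shared-memory operation, so no proc ever waits for another.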
![Page 10: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/10.jpg)
Impossibility of 3-Proc Consensus with FIFO Queue
Theorem (15.3): Wait-free consensus is impossible using FIFO queues and registers if n > 2.
Proof: Same structure as for registers. The key difference arises when considering the situation in which C is bivalent, p0(C) is 0-valent, and p1(C) is 1-valent.
![Page 11: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/11.jpg)
Impossibility of 3-Proc Consensus with FIFO Queues
p0 and p1 must be accessing the same FIFO queue.
Case 1: Both steps are deq's.
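[Figure: from the bivalent configuration C, p0's deq leads to a 0-valent configuration and p1's deq leads to a 1-valent configuration. Following p0's deq with p1's deq (still 0-valent) and following p1's deq with p0's deq (still 1-valent) gives two configurations that look the same to p2.]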
![Page 12: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/12.jpg)
Impossibility Proof
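Case 2: p0 deq's and p1 enq's.
Case 2.1: The queue is not empty in C
[Figure: from the bivalent configuration C, p0's deq leads to a 0-valent configuration and p1's enq leads to a 1-valent configuration. Following p0's deq with p1's enq stays 0-valent, and following p1's enq with p0's deq stays 1-valent.]
Because the queue is not empty, the deq and the enq commute, so the two resulting configurations are identical, yet one is 0-valent and the other is 1-valent: a contradiction.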
![Page 13: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/13.jpg)
Impossibility Proof
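Case 2: p0 deq's and p1 enq's.
Case 2.2: The queue is empty in C
[Figure: the queue is empty in the bivalent configuration C. p0's deq leads to a 0-valent configuration in which the queue is still empty. p1's enq leads to a 1-valent configuration; following it with p0's deq leaves a 1-valent configuration in which the queue is empty again. These two configurations with an empty queue look the same to p2.]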
![Page 14: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/14.jpg)
Impossibility Proof
Case 3: Both p0 and p1 enq (on same queue).
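[Figure: from the bivalent configuration C, p0's enq of A leads to a 0-valent configuration and p1's enq of B leads to a 1-valent configuration. On the left, p1 then enq's B (still 0-valent), p0 takes steps until deq'ing A, and p1 takes steps until deq'ing B. On the right, p0 then enq's A (still 1-valent), p0 takes steps until deq'ing B, and p1 takes steps until deq'ing A. The resulting 0-valent and 1-valent configurations look the same to p2.]
Why do these schedules of p0 and p1 exist?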
![Page 15: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/15.jpg)
Impossibility Proof
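Case 3 cont'd: Suppose the schedule in which p0 takes steps until deq'ing A does not exist:
[Figure: from the bivalent configuration C, the left branch is p0 enq's A then p1 enq's B (0-valent) and the right branch is p1 enq's B then p0 enq's A (1-valent). On the left, p0 takes steps until deciding but never deq's A; it decides 0. On the right, p0 takes the same number of steps as on the left and never deq's B; it also decides 0, although that configuration is 1-valent.]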
![Page 16: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/16.jpg)
Impossibility Proof
Case 3 cont'd: The existence of p1's schedule (p1 takes steps until its deq) is proved similarly.
Thus there is no wait-free algorithm for consensus with 3 procs using FIFO queues and registers.
![Page 17: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/17.jpg)
Implications
Suppose we want to wait-free simulate a compare&swap object using FIFO queues (and registers).
Is this possible?
Not if n > 2! If it were possible, we could solve consensus using FIFO queues (and registers):
  simulate a compare&swap object using FIFO queues (and registers)
  use simulated compare&swap object and c&s algorithm to solve consensus
![Page 18: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/18.jpg)
Generalize these Arguments
Previous results concerning FIFO queues and compare&swap suggest a criterion for determining if wait-free simulations exist:
based on ability of the data types to solve consensus for a certain number of procs.
![Page 19: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/19.jpg)
Consensus Number
Data type X has consensus number n if n is the largest number of procs. for which consensus can be solved using only objects of type X and read/write registers.
| data type | consensus number |
| --- | --- |
| read/write register | 1 |
| FIFO queue | 2 |
| compare&swap | ∞ |
![Page 20: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/20.jpg)
Using Consensus Numbers
Theorem (15.5): If data type X has consensus number m and data type Y has consensus number n with n > m, then there is no wait-free simulation of an object of type Y using objects of type X and read/write registers in a system with more than m procs.
[Figure: an object of type Y simulated on top of objects of type X and read/write registers.]
![Page 21: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/21.jpg)
Using Consensus Numbers
Proof: Suppose in contradiction there is a wait-free simulation S of Y using X and registers in a system with k procs, where m < k ≤ n.
Construct consensus algorithm for k > m procs using objects of type X (and registers):
  Use S to simulate some objects of type Y using objects of type X (and registers)
  Use the (simulated) type Y objects (and registers) in the k-proc consensus algorithm that exists since CN(Y) = n.
This contradicts CN(X) = m.
![Page 22: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/22.jpg)
Corollaries
There is no wait-free simulation of any object with consensus number > 1 using just read/write registers.
There is no wait-free simulation of any object with consensus number > 2 using just FIFO queues and read/write registers.
![Page 23: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/23.jpg)
Universality
Let's now consider positive results relating to consensus number.
A data type is universal if objects of that type (together with read/write registers) can wait-free simulate any data type.
Theorem: If data type X has consensus number n, then it is universal in a system with at most n procs.
![Page 24: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/24.jpg)
Proving Universality Result
1. Describe an algorithm that simulates any data type
   uses compare&swap (instead of any object with consensus number n)
   simulation is only non-blocking, weaker than wait-free
2. Modify to use any object with consensus number n
3. Modify to be wait-free
4. Modify to bound shared memory used
![Page 25: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/25.jpg)
Non-Blocking
Non-blocking vs. wait-free is analogous to no-deadlock vs. no-lockout for mutual exclusion.
Non-blocking simulation: at any point in an execution, if at least one operation is pending (response is not yet ready to be done), then there is a finite sequence of steps by a single proc that completes one of the pending operations.
Does not ensure that every pending operation is eventually completed.
![Page 26: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/26.jpg)
Universal Construction
Keep history of operations that have been applied to the simulated object as a shared linked list.
To apply an operation on the simulated object, the invoking proc. must insert an appropriate "node" into the linked list:
  it is convenient to put the newest node at the head of the list
A compare&swap object is used to keep track of the head of the list
![Page 27: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/27.jpg)
Details on Linked List
Each linked list node has
  operation invocation
  new state of the simulated object
  operation response
  pointer to previous node (previous op)
[Figure: Head points to the newest node; each node has the fields invocation, state, response, and before; the before pointers chain back to the anchor node, which holds the initial state.]
![Page 28: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/28.jpg)
Simulation
Initially Head points to anchor node, which represents initial state of simulated object

When inv is invoked:
    allocate a new linked list node in shared memory, pointed to by local var point
    point.inv := inv
    repeat
        h := Head        // h is a local var
        point.state, point.response := apply(inv, h.state)    // depends on simulated data type
        point.before := h
    until compare&swap(Head, h, point) = h    // if Head still points to same node h points to, then make Head point to new node
    do the output indicated by point.response
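The non-blocking simulation above can be sketched in Python as follows. Several pieces are assumptions made for illustration: the lock-based `CompareAndSwap` class (the lock only models atomicity), its extra `read` method, the class name `NonBlockingUniversal`, and the convention that `apply(inv, state)` returns a pair `(new_state, response)` without mutating `state`.

```python
import threading


class CompareAndSwap:
    """Model of a compare&swap object; nodes are compared by identity."""

    def __init__(self, initial):
        self._value = initial
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, old, new):
        with self._lock:
            previous = self._value
            if previous is old:
                self._value = new
            return previous


class Node:
    def __init__(self, inv=None, state=None, response=None, before=None):
        self.inv = inv            # operation invocation
        self.state = state        # new state of the simulated object
        self.response = response  # operation response
        self.before = before      # pointer to previous node


class NonBlockingUniversal:
    def __init__(self, initial_state, apply):
        self._apply = apply
        # Head initially points to the anchor node holding the initial state.
        self._head = CompareAndSwap(Node(state=initial_state))

    def invoke(self, inv):
        point = Node(inv=inv)
        while True:
            h = self._head.read()
            point.state, point.response = self._apply(inv, h.state)
            point.before = h
            # Succeeds only if Head still points to h; otherwise retry.
            if self._head.compare_and_swap(h, point) is h:
                return point.response


def queue_apply(inv, state):
    """Sequential FIFO queue over immutable tuples, usable as `apply`."""
    op, arg = inv
    if op == "enq":
        return state + (arg,), "ack"
    if op == "deq":
        return (state[1:], state[0]) if state else (state, None)
    raise ValueError(op)
```

A simulated queue is then created with `NonBlockingUniversal((), queue_apply)` and driven through `invoke(("enq", x))` and `invoke(("deq", None))`. A proc whose compare&swap fails retries only because some other proc's operation succeeded, which is why this version is non-blocking but not wait-free.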
![Page 29: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/29.jpg)
Simulation Figure
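[Figure: pi's local variable point refers to the new node, and h refers to the node that Head pointed to when pi read it; the new node's before field points to that node.]
If compare&swap indicates that Head has moved on, then try again to insert the new node, at the new location.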
![Page 30: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/30.jpg)
Strengthenings of Algorithm
To replace compare&swap object with any object with consensus number n (the number of procs):
  define a consensus object (data type version of consensus problem)
  get around the difficulty that a consensus object can only be used once by adding a consensus object to each linked list node that points to next node in the list
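One way to picture the per-node consensus objects is the Python sketch below. It is a simplified illustration rather than the construction from the slides: the one-shot `ConsensusObject` is modeled with a lock standing in for any type with consensus number n, each thread remembers the newest node it has seen in a thread-local variable, and all class and field names (`after`, `UniversalFromConsensus`) are hypothetical. Like the compare&swap version, it is only non-blocking.

```python
import threading


class ConsensusObject:
    """One-shot consensus object: every decide() returns the first proposal."""

    _UNSET = object()

    def __init__(self):
        self._value = ConsensusObject._UNSET
        self._lock = threading.Lock()  # stand-in for a type with consensus number n

    def decide(self, proposal):
        with self._lock:
            if self._value is ConsensusObject._UNSET:
                self._value = proposal
            return self._value


class Node:
    def __init__(self, inv=None, state=None, response=None):
        self.inv = inv
        self.state = state
        self.response = response
        self.after = ConsensusObject()  # agrees on which node comes next


class UniversalFromConsensus:
    def __init__(self, initial_state, apply):
        self._apply = apply                  # apply(inv, state) -> (new_state, response)
        self._anchor = Node(state=initial_state)
        self._latest = threading.local()     # per-thread: newest node known so far

    def invoke(self, inv):
        latest = getattr(self._latest, "node", self._anchor)
        point = Node(inv=inv)
        while True:
            point.state, point.response = self._apply(inv, latest.state)
            # Propose my node as the successor of the newest node I know.
            winner = latest.after.decide(point)
            latest = winner
            if winner is point:
                self._latest.node = point
                return point.response
            # Another node won this slot: move forward and try the next one.
```

Because each consensus object is used for exactly one decision (the successor of one node), the once-only restriction is respected while the list keeps growing.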
![Page 31: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/31.jpg)
Strengthenings of Algorithm
To get a wait-free implementation, use idea of helping: procs help each other to finish pending operations (not just their own)
To reduce the size of the linked list (so it doesn't grow without bound), need to keep track of which list nodes can be recycled.
![Page 32: CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS](https://reader036.fdocuments.us/reader036/viewer/2022062816/568156de550346895dc48432/html5/thumbnails/32.jpg)
Effect of Randomization
Suppose we relax the liveness condition for linearizable shared memory:
  operations must terminate with high probability
Now a randomized consensus algorithm can be used to simulate any data type out of any other data type, including read/write registers
I.e., hierarchy collapses.