CS542 Topics in Distributed Systems
![Page 1: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/1.jpg)
CS542 Topics in Distributed Systems
Diganta Goswami
![Page 2: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/2.jpg)
Algorithms to Find Global States
• Why?
– (Distributed) garbage collection [think multiple processes sharing and referencing objects]
– (Distributed) deadlock detection, termination [think database transactions]
– Global states are most useful for detecting stable predicates: once true, always stays true (unless you do something about it)
» e.g., once a deadlock, always stays a deadlock
• What?
– Global state = states of all processes + states of all communication channels
– Capture the instantaneous state of each process
– And the instantaneous state of each communication channel, i.e., messages in transit on the channels
• How?
– We’ll see this lecture!
![Page 3: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/3.jpg)
Obvious First Solution…
• Synchronize clocks of all processes
• Ask all processes to record their states at known time t
• Problems?
– Time synchronization is possible only approximately (but distributed banking applications cannot take approximations)
– Does not record the state of messages in the channels
• Again: synchronization not required – causality is enough!
![Page 4: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/4.jpg)
Two Processes and Their Initial States
[Figure: processes p1 and p2 connected by channels c2 and c1]
p1: account $1000, widgets (none)
p2: account $50, widgets 2000
![Page 5: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/5.jpg)
Execution of the Processes
Each global state lists <account, widgets> for p1 and p2, followed by the contents of channels c2 and c1.
1. Global state S0: p1 = <$1000, 0>, p2 = <$50, 2000>, c2 = (empty), c1 = (empty)
2. Global state S1: p1 = <$900, 0>, p2 = <$50, 2000>, c2 = (Order 10, $100), c1 = (empty)
3. Global state S2: p1 = <$900, 0>, p2 = <$50, 1995>, c2 = (Order 10, $100), c1 = (five widgets)
4. Global state S3: p1 = <$900, 5>, p2 = <$50, 1995>, c2 = (Order 10, $100), c1 = (empty)
S1 → S2: Send 5 freebie widgets!
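The role of channel state in these four global states can be checked with a short sketch. It assumes Python, a hypothetical `total_money` helper, and that the $100 carried by the order on c2 is the only money in transit; the amounts themselves come from the slide.

```python
# Global states S0..S3 from the slide, reduced to the money they contain:
# (p1 account, p2 account, money in transit on c2).
# This tuple encoding is an assumption made for the sketch.
STATES = {
    "S0": (1000, 50, 0),
    "S1": (900, 50, 100),
    "S2": (900, 50, 100),
    "S3": (900, 50, 100),
}

def total_money(state, include_channels=True):
    """Sum the money visible in a global state, optionally ignoring channels."""
    p1_account, p2_account, in_transit = state
    total = p1_account + p2_account
    if include_channels:
        total += in_transit
    return total

for name, state in STATES.items():
    with_channels = total_money(state)
    without_channels = total_money(state, include_channels=False)
    print(name, with_channels, without_channels)
```

With channel contents included, every state accounts for the same $1050. Without them, S1 to S3 appear to have lost $100, which is why a global state must capture messages in transit.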
![Page 6: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/6.jpg)
Cuts
Cut = time frontier, one at each process
$f \in$ cut $C$ iff $f$ is to the left of the frontier $C$
[Figure: timelines of P1 (events $e_1^0, \ldots, e_1^3$), P2 ($e_2^0, \ldots, e_2^2$) and P3 ($e_3^0, \ldots, e_3^2$), crossed by one inconsistent cut and one consistent cut]
![Page 7: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/7.jpg)
Consistent Cuts
$f \in$ cut $C$ iff $f$ is to the left of the frontier $C$
A cut $C$ is consistent if and only if
$\forall e \in C$ (if $f \rightarrow e$ then $f \in C$), where $\rightarrow$ is Lamport’s “happens-before”
A global state S is consistent if and only if it corresponds to a consistent cut
A consistent cut == a global snapshot
[Figure: the P1, P2, P3 timelines from the previous slide, with the inconsistent cut and the consistent cut]
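The consistency condition can be sketched as a check over message pairs. The representation is an assumption made for illustration: a cut is given by its frontier (how many events of each process it includes), and each message is a hypothetical `(sender, send_index, receiver, receive_index)` tuple with event indices counted from 1. Because a frontier always contains a prefix of each process's events, only the send-to-receive edges need checking.

```python
def is_consistent_cut(frontier, messages):
    """Return True if the cut described by `frontier` is consistent.

    frontier: dict mapping process name -> number of events inside the cut.
    messages: iterable of (sender, send_index, receiver, receive_index),
              with event indices counted from 1 on each process.
    """
    for sender, send_index, receiver, receive_index in messages:
        receive_in_cut = receive_index <= frontier[receiver]
        send_in_cut = send_index <= frontier[sender]
        # A receive inside the cut whose send lies outside violates causality.
        if receive_in_cut and not send_in_cut:
            return False
    return True

# Hypothetical execution: P1's 2nd event sends to P2's 1st event,
# and P2's 2nd event sends to P3's 2nd event.
messages = [("P1", 2, "P2", 1), ("P2", 2, "P3", 2)]

print(is_consistent_cut({"P1": 2, "P2": 1, "P3": 1}, messages))  # True
print(is_consistent_cut({"P1": 1, "P2": 1, "P3": 1}, messages))  # False
```

The second cut is inconsistent because it contains P2's receive event but not the P1 send that caused it.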
![Page 8: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/8.jpg)
The “Snapshot” Algorithm
Problem: Record a set of process and channel states such that the combination is a global snapshot/consistent cut.
System Model:
• There is a uni-directional communication channel between each ordered process pair ($P_j \rightarrow P_i$ and $P_i \rightarrow P_j$)
• Communication channels are FIFO-ordered
• No failure, all messages arrive intact, exactly once
• Any process may initiate the snapshot (by sending a special message called “Marker”)
• Snapshot does not require application to stop sending messages, does not interfere with normal execution
• Each process is able to record its state and the state of its incoming channels (no central collection)
![Page 9: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/9.jpg)
The “Snapshot” Algorithm (2)
1. Marker sending rule for initiator process P0
After P0 has recorded its own state
• for each outgoing channel C, send a marker message on C
2. Marker receiving rule for a process Pk
on receipt of a marker over channel C
if Pk has not yet received a marker
- record Pk’s own state
- record the state of C as “empty”
- for each outgoing channel C', send a marker on C'
- turn on recording of messages over other incoming channels
else
- record the state of C as all the messages received over C since Pk saved its own state; stop recording state of C
![Page 10: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/10.jpg)
Chandy and Lamport’s ‘Snapshot’ Algorithm
Marker receiving rule for process pi
  On pi’s receipt of a marker message over channel c:
    if (pi has not yet recorded its state) it
      records its process state now;
      records the state of c as the empty set;
      turns on recording of messages arriving over other incoming channels;
    else
      pi records the state of c as the set of messages it has received over c since it saved its state.
    end if
Marker sending rule for process pi
  After pi has recorded its state, for each outgoing channel c:
    pi sends one marker message over c (before it sends any other message over c).
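The two marker rules can be sketched as a small single-threaded simulation. Everything here is an assumption made for illustration, not the lecture's own code: the `Process` and `Network` classes, integer balances as process state, and money transfers as application messages. Channels are FIFO queues, matching the system model.

```python
from collections import deque

MARKER = "MARKER"  # special snapshot message (the name is this sketch's choice)

class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance        # application state
        self.recorded_state = None    # process state captured by the snapshot
        self.channel_state = {}       # sender -> messages recorded for that channel
        self.recording = {}           # sender -> messages seen while still recording

class Network:
    def __init__(self, balances):
        self.procs = {n: Process(n, b) for n, b in balances.items()}
        # One FIFO channel for every ordered pair of processes.
        self.channels = {(s, d): deque()
                         for s in self.procs for d in self.procs if s != d}

    def send(self, src, dst, amount):
        """Application message: move `amount` from src to dst."""
        self.procs[src].balance -= amount
        self.channels[(src, dst)].append(amount)

    def _record(self, proc, marker_from=None):
        """Record proc's state, send markers, start recording incoming channels."""
        proc.recorded_state = proc.balance
        for (s, _), channel in self.channels.items():
            if s == proc.name:
                channel.append(MARKER)
        if marker_from is not None:
            proc.channel_state[marker_from] = []   # channel the marker arrived on
        proc.recording = {s: [] for s in self.procs
                          if s not in (proc.name, marker_from)}

    def start_snapshot(self, initiator):
        self._record(self.procs[initiator])

    def deliver(self, src, dst):
        """Deliver the oldest message on channel src -> dst."""
        msg = self.channels[(src, dst)].popleft()
        proc = self.procs[dst]
        if msg == MARKER:
            if proc.recorded_state is None:
                self._record(proc, marker_from=src)
            else:
                proc.channel_state[src] = proc.recording.pop(src)
        else:
            proc.balance += msg
            if src in proc.recording:
                proc.recording[src].append(msg)

def snapshot_total(net):
    """Money captured by the snapshot: process states plus channel states."""
    return sum(p.recorded_state + sum(sum(m) for m in p.channel_state.values())
               for p in net.procs.values())

net = Network({"P1": 100, "P2": 50})
net.send("P2", "P1", 10)     # 10 is now in transit on P2 -> P1
net.start_snapshot("P1")     # P1 records 100 and sends a marker to P2
net.deliver("P2", "P1")      # the 10 arrives while P1 is recording P2 -> P1
net.deliver("P1", "P2")      # P2 gets the marker, records 40, replies with a marker
net.deliver("P2", "P1")      # P1 gets the marker and closes channel P2 -> P1
print(snapshot_total(net))   # 150: 100 + 40 + the 10 recorded in transit
```

In this run the snapshot records P1 = 100, P2 = 40 and the in-transit 10 on the P2-to-P1 channel. The total matches the 150 in the system, even though no process was ever paused.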
![Page 11: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/11.jpg)
Snapshot Example
[Figure: timelines of P1, P2 and P3 exchanging application messages a and b and Marker messages M]
1- P1 initiates snapshot: records its state (S1); sends Markers to P2 & P3; turns on recording for channels C21 and C31
2- P2 receives Marker over C12, records its state (S2), sets state(C12) = {}, sends Marker to P1 & P3; turns on recording for channel C32
3- P1 receives Marker over C21, sets state(C21) = {a}
4- P3 receives Marker over C13, records its state (S3), sets state(C13) = {}, sends Marker to P1 & P2; turns on recording for channel C23
5- P2 receives Marker over C32, sets state(C32) = {b}
6- P3 receives Marker over C23, sets state(C23) = {}
7- P1 receives Marker over C31, sets state(C31) = {}
![Page 12: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/12.jpg)
Provable Assertion: Chandy-Lamport algo. determines a consistent cut
• Let $e_i$ and $e_j$ be events occurring at $p_i$ and $p_j$, respectively, such that $e_i \rightarrow e_j$
• The snapshot algorithm ensures that if $e_j$ is in the cut then $e_i$ is also in the cut.
• That is, if $e_j \rightarrow$ <$p_j$ records its state>, then it must be true that $e_i \rightarrow$ <$p_i$ records its state>.
• By contradiction, suppose <$p_i$ records its state> $\rightarrow e_i$
• Consider the path of app messages (through other processes) that go from $e_i$ to $e_j$
• Due to FIFO ordering, markers on each link in the above path precede regular app messages
• Thus, since <$p_i$ records its state> $\rightarrow e_i$, it must be true that $p_j$ received a marker before $e_j$
• Thus $p_j$ recorded its state before $e_j$, so $e_j$ is not in the cut => contradiction
![Page 13: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/13.jpg)
Formally Speaking…. Process Histories
For a process $P_i$, where events $e_i^0, e_i^1, \ldots$ occur:
history($P_i$) = $h_i = \langle e_i^0, e_i^1, \ldots \rangle$
prefix history($P_i^k$) = $h_i^k = \langle e_i^0, e_i^1, \ldots, e_i^k \rangle$
$S_i^k$: $P_i$’s state immediately after the $k$th event
For a set of processes $P_1, \ldots, P_i, \ldots$:
global history: $H = \bigcup_i (h_i)$
global state: $S = \bigcup_i (S_i^{k_i}) \cup$ channels
a cut $C \subseteq H = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_n^{c_n}$
the frontier of $C = \{e_i^{c_i}, i = 1, 2, \ldots, n\}$
![Page 14: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/14.jpg)
Global States useful for detecting Global Predicates
A cut is consistent if and only if it does not violate causality
A Run is a total ordering of events in H that is consistent with each $h_i$’s ordering
A Linearization is a run consistent with the happens-before ($\rightarrow$) relation in H (history of all events).
Linearizations pass through consistent global states.
A global state $S_k$ is reachable from global state $S_i$ if there is a linearization, L, that passes through $S_i$ and then through $S_k$.
The distributed system evolves as a series of transitions between global states $S_0, S_1, \ldots$
![Page 15: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/15.jpg)
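Reachability between states in the snapshot algorithm
[Figure: the actual execution $e_0, e_1, \ldots$ leads from $S_{init}$ (recording begins) to $S_{final}$ (recording ends). The recorded state $S_{snap}$ lies between them: pre-snap events $e'_0, e'_1, \ldots, e'_{R-1}$ lead from $S_{init}$ to $S_{snap}$, and post-snap events $e'_R, e'_{R+1}, \ldots$ lead from $S_{snap}$ to $S_{final}$.]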
![Page 16: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/16.jpg)
Distributed debugging
Examine the problem of recording a system’s global state so that we may make useful statements about whether a transitory state – as opposed to a stable state – occurred in an actual execution
This is what we require, in general, when debugging a distributed system
Example: is $|x_i - x_j| \le \delta$, where $x_i$ is a variable in process $P_i$ and $\delta$ is a constant bound?
![Page 17: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/17.jpg)
Distributed debugging
Chandy and Lamport’s algorithm collects state in a distributed fashion
The processes in the system can send the state they gather to a monitor process for collection
Algorithm [Marzullo and Neiger, ‘91] – The observed processes send their states to a process called a monitor, which assembles globally consistent states from what it receives
The monitor lies outside the system, observing its execution
![Page 18: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/18.jpg)
Distributed debugging
Goal is to determine cases when a given global state predicate $\phi$ was definitely True at some point in the execution we observed, and cases when it was possibly True
Possibly – because we may extract a consistent global state S from an executing system and find that $\phi(S)$ is True.
No single observation of a consistent global state allows us to conclude whether a non-stable predicate ever evaluated to True in the actual execution
![Page 19: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/19.jpg)
Distributed debugging
Possibly $\phi$: There is a consistent global state S through which a linearization of H passes such that $\phi(S)$ is True
Definitely $\phi$: For all linearizations L of H, there is a consistent global state S through which L passes such that $\phi(S)$ is True
![Page 20: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/20.jpg)
Distributed debugging
We now describe
• How the process states are collected
• How the monitor extracts consistent global states
• How the monitor evaluates possibly $\phi$ and definitely $\phi$ in both asynchronous and synchronous systems
![Page 21: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/21.jpg)
Distributed debugging
The observed processes $p_i$ (i = 1, 2, …, N) send their initial state to the monitor process initially, and thereafter from time to time, in state messages
• No need to send state except initially and when it changes
• Global state predicate may depend only on certain parts of the processes’ states – hence need only send relevant state
• Need only send state at times when the predicate may become True or cease to be True
The monitor process records the state messages from process $p_i$ in a separate queue $Q_i$, for each i = 1, 2, …, N
![Page 22: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/22.jpg)
Distributed debugging
In order that the monitor can distinguish consistent global states from inconsistent global states, the observed processes enclose their vector clock values with their state messages
Each queue $Q_i$ is kept ordered in sending order (can be established by examining the i-th component of the vector clock)
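One way the monitor's queues might be kept in sending order is sketched below. The `Monitor` class and its `receive` method are hypothetical names, processes are numbered from 0 rather than 1, and a state message is assumed to be a `(vector_clock, state)` pair.

```python
import bisect

class Monitor:
    """Collects state messages from N observed processes (0-based ids here)."""

    def __init__(self, n):
        self.queues = [[] for _ in range(n)]   # queues[i] plays the role of Q_i

    def receive(self, i, vector_clock, state):
        """Store a state message from process i, keeping Q_i in sending order."""
        queue = self.queues[i]
        # The i-th component of the vector clock counts p_i's own events,
        # so it orders p_i's state messages even if they arrive out of order.
        keys = [vc[i] for vc, _ in queue]
        position = bisect.bisect_right(keys, vector_clock[i])
        queue.insert(position, (tuple(vector_clock), state))

monitor = Monitor(2)
monitor.receive(0, (2, 0), {"x1": 100})   # arrives first but was sent second
monitor.receive(0, (1, 0), {"x1": 1})
print(monitor.queues[0])
```

After both messages arrive, the queue for process 0 holds the state stamped (1, 0) ahead of the one stamped (2, 0), regardless of arrival order.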
![Page 23: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/23.jpg)
Distributed debugging
Let $S = (s_1, s_2, \ldots, s_N)$ be a global state drawn from the state messages that the monitor has received. Let $V(s_i)$ be the vector clock of the state $s_i$ received from $p_i$
S is a consistent global state iff
$V(s_i)[i] \ge V(s_j)[i]$ for i, j = 1, 2, …, N
That is, the no. of $p_i$’s events known at $p_j$ when it sent $s_j$ is no more than the no. of events that had occurred at $p_i$ when it sent $s_i$.
Hence, if one process’s state depends upon another, then the global state also encompasses the state upon which it depends
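The condition translates almost directly into code. This is a minimal sketch assuming each vector clock is a tuple and processes are numbered from 0; the function name is hypothetical.

```python
def is_consistent_global_state(vector_clocks):
    """vector_clocks[i] is V(s_i), the vector clock attached to p_i's state s_i.

    The state is consistent iff V(s_i)[i] >= V(s_j)[i] for all i, j.
    Processes are numbered from 0 here (an assumption of this sketch).
    """
    n = len(vector_clocks)
    return all(
        vector_clocks[i][i] >= vector_clocks[j][i]
        for i in range(n)
        for j in range(n)
    )

# p2's state (2, 1) knows of two events at p1, so p1 must be at least that far.
print(is_consistent_global_state([(2, 0), (2, 1)]))  # True
print(is_consistent_global_state([(1, 0), (2, 1)]))  # False
```

In the second call, p2 has already seen two of p1's events while p1's chosen state reflects only one, so the pair cannot have occurred together.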
![Page 24: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/24.jpg)
Distributed debugging
The monitor process may establish whether a given global state is consistent, using the vector timestamps sent by the observed processes
It can construct a lattice of consistent global states corresponding to the execution of the processes – captures the relation of reachability between consistent global states
The nodes denote global states, and the edges denote possible transitions between these states
![Page 25: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/25.jpg)
Vector timestamps and variable values
[Figure: timelines of p1 and p2 against physical time, with messages m1 and m2 and cuts C1 and C2]
p1: (1,0) x1 = 1; (2,0) x1 = 100; (3,0) x1 = 105; (4,3) x1 = 90
p2: (2,1) x2 = 100; (2,2) x2 = 95; (2,3) x2 = 90
![Page 26: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/26.jpg)
The lattice of global states for the execution of the previous figure
$S_{ij}$ = global state after i events at process 1 and j events at process 2
Level 0: S00
Level 1: S10
Level 2: S20
Level 3: S21, S30
Level 4: S31, S22
Level 5: S32, S23
Level 6: S33
Level 7: S43
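The lattice can be reproduced from the vector timestamps alone. The sketch below assumes the timestamps from the "Vector timestamps and variable values" figure, adds an all-zero timestamp for each process's initial state (an assumption), and uses hypothetical helper names.

```python
# Vector timestamps of each process's successive states. Index 0 is the
# initial state, whose all-zero timestamp is an assumption of this sketch.
P1_CLOCKS = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 3)]
P2_CLOCKS = [(0, 0), (2, 1), (2, 2), (2, 3)]

def is_consistent(i, j):
    """S_ij is consistent iff neither state knows of more events at the
    other process than that process has actually performed."""
    v1, v2 = P1_CLOCKS[i], P2_CLOCKS[j]
    return v1[0] >= v2[0] and v2[1] >= v1[1]

def lattice_levels():
    """Group the consistent states S_ij by level i + j."""
    levels = {}
    for i in range(len(P1_CLOCKS)):
        for j in range(len(P2_CLOCKS)):
            if is_consistent(i, j):
                levels.setdefault(i + j, []).append(f"S{i}{j}")
    return levels

for level, states in sorted(lattice_levels().items()):
    print(level, states)
```

The eleven states it prints are the ones in the lattice above. States such as S11 are missing because p2's first event already depends on p1's second.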
![Page 27: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/27.jpg)
Distributed debugging
A linearization traverses the lattice from any global state to any global state reachable from it on the next level – that is, in each step some process experiences one event. For example, S22 is reachable from S20, but S22 is not reachable from S30.
The lattice shows all linearizations corresponding to a history
A monitor process can now evaluate possibly $\phi$ and definitely $\phi$
![Page 28: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/28.jpg)
Distributed debugging
To evaluate possibly $\phi$, the monitor process starts at the initial state and steps through all consistent states reachable from that point, evaluating $\phi$ at each stage. It stops when $\phi$ evaluates to True
To evaluate definitely $\phi$, the monitor process must attempt to find a set of states through which all linearizations must pass, and at each of which $\phi$ evaluates to True
Note that the state $S' = (s_1, \ldots, s'_i, \ldots, s_N)$, in which only $p_i$ has advanced to its next state $s'_i$, is reachable from $S = (s_1, \ldots, s_i, \ldots, s_N)$ iff
$V(s_j)[j] \ge V(s'_i)[j]$ for j = 1, 2, …, N, j ≠ i
![Page 29: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/29.jpg)
Algorithms to evaluate possibly $\phi$ and definitely $\phi$
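A level-by-level traversal of the lattice is one way to evaluate both modalities. The sketch below is an illustration under stated assumptions, not the slide's own algorithm: it reuses the timestamps and values from the earlier two-process figure, adds an initial state with an all-zero clock and unknown values, and picks the thresholds 50 and 12 arbitrarily. A state is a tuple of per-process indices, and all function names are hypothetical.

```python
# Vector timestamps and variable values of each process's successive states.
# Index 0 is an assumed initial state (all-zero clock, unknown value None).
CLOCKS = [
    [(0, 0), (1, 0), (2, 0), (3, 0), (4, 3)],   # p1
    [(0, 0), (2, 1), (2, 2), (2, 3)],           # p2
]
VALUES = [
    [None, 1, 100, 105, 90],                    # x1
    [None, 100, 95, 90],                        # x2
]
INITIAL = tuple(0 for _ in CLOCKS)
FINAL = tuple(len(c) - 1 for c in CLOCKS)

def consistent(state):
    """Check V(s_i)[i] >= V(s_j)[i] for all i, j."""
    n = len(state)
    return all(CLOCKS[i][state[i]][i] >= CLOCKS[j][state[j]][i]
               for i in range(n) for j in range(n))

def successors(state):
    """Consistent states on the next level: one process takes one step."""
    result = []
    for i in range(len(state)):
        if state[i] + 1 < len(CLOCKS[i]):
            nxt = state[:i] + (state[i] + 1,) + state[i + 1:]
            if consistent(nxt):
                result.append(nxt)
    return result

def possibly(phi):
    """True if some consistent state reachable from the initial one satisfies phi."""
    level = {INITIAL}
    while level:
        if any(phi(s) for s in level):
            return True
        level = {t for s in level for t in successors(s)}
    return False

def definitely(phi):
    """True if every linearization passes through a state satisfying phi."""
    # Keep only states reachable through states where phi is False.
    level = {s for s in [INITIAL] if not phi(s)}
    while level:
        if FINAL in level:
            return False   # some linearization avoided phi entirely
        level = {t for s in level for t in successors(s) if not phi(t)}
    return True

def gap_exceeds(delta):
    """Predicate phi(S): |x1 - x2| > delta (False while a value is unknown)."""
    def phi(state):
        x1, x2 = (VALUES[i][k] for i, k in enumerate(state))
        return x1 is not None and x2 is not None and abs(x1 - x2) > delta
    return phi

def at_s32(state):
    """Predicate that holds only in S32 (x1 = 105, x2 = 95)."""
    return state == (3, 2)

print(possibly(gap_exceeds(50)))
print(possibly(gap_exceeds(12)), definitely(gap_exceeds(12)))
print(possibly(at_s32), definitely(at_s32))
```

With these inputs, a gap above 50 is never possible: x1 = 1 and x2 = 100 only coincide in an inconsistent cut. A gap above 12 holds only in S33, which every linearization must cross, so it is both possible and definite. The S32 predicate is possible but not definite, since a linearization through S23 bypasses it.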
![Page 30: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/30.jpg)
Global State Predicates
A global-state-predicate is a function from the set of global states to {true, false}, e.g., deadlock, termination
A global state $S_0$ satisfies liveness property P iff:
liveness($P(S_0)$) $\equiv \forall L \in$ linearizations from $S_0$, L passes through an $S_L$ & $P(S_L)$ = true
Ex: P(S) = the computation will terminate
A global state $S_0$ satisfies this safety property P if:
safety($P(S_0)$) $\equiv \forall S$ reachable from $S_0$, P(S) = false
Ex: P(S) = S has a deadlock
Global states are often useful for detecting a stable global-state-predicate: one that, once it becomes true, remains true in subsequent global states, e.g., an object O is orphaned, or deadlock
A stable predicate may be a safety or liveness predicate
![Page 31: CS542 Topics in Distributed Systems](https://reader036.fdocuments.us/reader036/viewer/2022081506/568131f1550346895d98517f/html5/thumbnails/31.jpg)
Liveness versus Safety
Can be confusing, but terms are very important:
• Liveness = guarantee that something good will happen, eventually
– “Guarantee of termination” is a liveness property
– Guarantee that “at least one of the athletes in the 100m final will win gold” is liveness
– A criminal will eventually be jailed
– Completeness in failure detectors
• Safety = guarantee that something bad will never happen
– Deadlock avoidance algorithms provide safety
– A peace treaty between two nations provides safety
– An innocent person will never be jailed
– Accuracy in failure detectors
• Can be difficult to satisfy both liveness and safety!