Operating Systems, 2011, Danny Hendler & Amnon Meisels 1 Distributed Synchronization: outline ...
-
Upload
blanche-poole -
Category
Documents
-
view
218 -
download
0
Transcript of Operating Systems, 2011, Danny Hendler & Amnon Meisels 1 Distributed Synchronization: outline ...
Operating Systems, 2011, Danny Hendler & Amnon Meisels 1
Distributed Synchronization: outline
Introduction
Causality and timeo Lamport timestamps
o Vector timestamps
o Causal communication
Snapshots
This presentation is based on the book: “Distributed operating-systems & algorithms” by Randy Chow and Theodore Johnson
2Operating Systems, 2011, Danny Hendler & Amnon Meisels
A distributed system is a collection of independent computational nodes, communicating over a network, that is abstracted as a single coherent systemo Grid computingo Cloud computing (“software as a
service”)o Peer-to-peer computingo Sensor networkso …
A distributed operating system allows sharing of resources and coordination of distributed computation in a transparent manner
Distributed systems
(a), (b) – a distributed system(c) – a multiprocessor
3Operating Systems, 2011, Danny Hendler & Amnon Meisels
Underlies distributed operating systems and algorithms
Processes communicate via message passing (no shared memory)
Inter-node coordination in distributed systems challenged byo Lack of global stateo Lack of global clocko Communication links may failo Messages may be delayed or losto Nodes may fail
Distributed synchronization supports correct coordination in distributed systemso May no longer use shared-memory based locks and semaphores
Distributed synchronization
4Operating Systems, 2011, Danny Hendler & Amnon Meisels
Distributed Synchronization: outline
Introduction
Causality and timeo Lamport timestamps
o Vector timestamps
o Causal communication
Snapshots
5Operating Systems, 2011, Danny Hendler & Amnon Meisels
Eventso Sending a messageo Receiving a messageo Timeout, internal interrupt
Processors send control messages to each othero send(destination, action; parameters)
Processes may declare that they are waiting for events:o Wait for A1, A2, …., An
A1(source; parameters) code to handle A1
. . An(source; parameters) code to handle An
Distributed computation model
6Operating Systems, 2011, Danny Hendler & Amnon Meisels
A distributed system has no global state nor global clock no global order on all events may be determined
Each processor knows total orders on events occurring in it
There is a causal relation between the sending of a message and its receipt
Causality and events ordering
Lamport's happened-before relation H
1. e1 <p e2 e1 <H e2 (events within same processor are ordered)
2. e1 <m e2 e1 <H e2 (each message m is sent before it is received)
3. e1 <H e2 AND e2 <H e3 e1 <H e3 (transitivity)
Leslie Lamport (1978): “Time, clocks, and the ordering of events in a distributed system”
7Operating Systems, 2011, Danny Hendler & Amnon Meisels
Causality and events ordering (cont'd)TimeP1 P2 P3
e1
e2
e3
e4
e5
e6
e7
e8
e1 <H e7 ? Yes. e1 <H e3 ? Yes. e1 <H e8 ? Yes.
e5 <H e7 ? No.
8Operating Systems, 2011, Danny Hendler & Amnon Meisels
1 Initially my_TS=0
2 Upon event e,3 if e is the receipt of message m4 my_TS=max(m.TS, my_TS)5 my_TS++6 e.TS=my_TS7 If e is the sending of message m8 m.TS=my_TS
Lamport's timestamp algorithm
To create a total order, ties are broken by process ID
9Operating Systems, 2011, Danny Hendler & Amnon Meisels
Lamport's timestamps (cont'd)TimeP1 P2 P3
e1
e2
e3
e4
e5
e6
e7
e8
1.1
2.1
3.1
1.3
2.2
3.2
4.3
1.2
10Operating Systems, 2011, Danny Hendler & Amnon Meisels
Distributed Synchronization: outline
Introduction
Causality and timeo Lamport timestamps
o Vector timestamps
o Causal communication
Snapshots
11Operating Systems, 2011, Danny Hendler & Amnon Meisels
Lamport's timestamps define a total ordero e1 <H e2 e1.TS < e2.TS
o However, e1.TS < e2.TS e1 <H e2 does not hold, in general.
Vector timestamps - motivation
Definition: Message m1 casually precedes message m2 (written as m1 <c m2) if
s(m1) <H s(m2) (sending m1 happens before sending m2)
Definition: causality violation occurs if m1 <c m2 but r(m2) <p r(m1).
In other words, m1 is sent to processor p before m2 but is received after it.
Lamport's timestamps do not allow to detect (nor prevent) causality violations.
12Operating Systems, 2011, Danny Hendler & Amnon Meisels
Causality violations – an exampleP1 P2 P3
Migrate O to P2
Where is O?
M1
On p2
Where is O?
I don’t know
M2
M3
ERROR!
Causality violation between… M1 and M3.
13Operating Systems, 2011, Danny Hendler & Amnon Meisels
Vector timestamps1 Initially my_VT=[0,…,0]
2 Upon event e,3 if e is the receipt of message m4 for i=1 to M5 my_VT[i]=max(m.VT[i], my_VT[i])6 My_VT[self]++7 e.VT=my_VT8 if e is the sending of message m9 m.VT=my_VT
For vector timestamps it does hold that: e1 <VT e2 e1 <H e2
e1.VT ≤V e2.VT if and only if e1.VT[i] ≤ e2.VT[i], for every i=1,…,M
e1.VT <V e2.VT if and only if e1.VT ≤V e2.VT and e1.VT ≠ e2.VT
14Operating Systems, 2011, Danny Hendler & Amnon Meisels
Vector timestamps (cont'd)P1 P2 P3 P4
1 1
1
1
22
232
334
54
3
5
4
6e
e.VT=(3,6,4,2)
15Operating Systems, 2011, Danny Hendler & Amnon Meisels
VTs can be used to detect causality violations P1 P2 P3
Migrate O to P2
Where is O?
M1
On p2
Where is O?
I don’t know
M2
M3
ERROR!
Causality violation between… M1 and M3.
(1,0,0) (0,0,1)
(2,0,1)
(3,0,1) (3,0,2)
(3,0,3)(3,1,3)
(3,2,3)
(3,3,3)
(3,2,4)
16Operating Systems, 2011, Danny Hendler & Amnon Meisels
Distributed Synchronization: outline
Introduction
Causality and timeo Lamport timestamps
o Vector timestamps
o Causal communication
Snapshots
17Operating Systems, 2011, Danny Hendler & Amnon Meisels
A processor cannot control the order in which it receives messages…But it may control the order in which they are delivered to applications
Preventing causality violations
Application
FIFO ordering
Message passing
Deliver in FIFO order
Assigntimestamps
Message arrival order
x1
Deliverx1
x2
Deliverx2
x4
Bufferx4
x5
Bufferx5
x3
Delieverx3 x4 x5
Protocol for FIFO message delivery(as in, e.g., TCP)
18Operating Systems, 2011, Danny Hendler & Amnon Meisels
Preventing causality violations (cont'd) Senders attach a timestamp to each message
Destination delays the delivery of out-of-order messages
Hold back a message m until we are assured that no message m’ <H m will be deliveredo For every other process p, maintain the earliest timestamp of a message m
that may be delivered from po Do not deliver a message if an earlier message may still be delivered from
another process
19Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
For each processor p, the earliest timestamp with which a message from p may still be delivered
20Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
For each process p, the messages from p that were received but were not delivered yet
21Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
List of messages to be delivered as a result of m’s receipt
22Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
If no blocked messages from p, update earliest[p]
23Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
m is now the most recent message from p not yet delivered
24Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
If there is a process k with delayed messages for which it is now safe to deliver its earliest message…
25Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
Remove k'th earliest message from blocked queue, make sure it is delivered
26Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
If there are additional blocked messages of k, update earliest[k] to be the timestamp of the earliest such message
27Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
Otherwise the earliest message that may be delivered from k would have previous timestamp with the k'th component incremented
28Operating Systems, 2011, Danny Hendler & Amnon Meisels
Algorithm for preventing causality violations1 earliest[1..M] initially [ <1,0,…0>, <0,1,…,0>, …, <0,0,…,1> ]2 blocked[1…M] initially [ {}, …, {} ]
3 Upon the receipt of message m from processor p4 Delivery_list = {}5 If (blocked[p] is empty)6 earliest[p]=m.timestamp7 Add m to the tail of blocked[p]8 While ( k such that blocked[k] is non-empty AND
i {1,…,M} (except k and Self) not_earlier(earliest[i], earliest[k])9 remove the message at the head of blocked[k], put it in delivery_list10 if blocked[k] is non-empty11 earliest[k] m'.timestamp, where m' is at the head of blocked[k]12 else13 increment the k'th element of earliest[k]14 End While 15 Deliver the messages in delivery_list, in causal order
Finally, deliver set of messages that will not cause causality violation (if there are any).
29Operating Systems, 2011, Danny Hendler & Amnon Meisels
Execution of the algorithm as multicast
P1 P2 P3
(0,1,0)
Since the algorithm is “interested” only in causal order of sending events, vector timestamp is incremented only upon send events.
(1,1,0)
p3 state
earliest[1] earliest[2] earliest[3]
(1,0,0) (0,1,0) (0,0,1)m1
m2
(1,1,0) (0,1,0) (0,0,1)
Upon receipt of m2
Upon receipt of m1
(1,1,0) (0,1,0) (0,0,1)deliver m1
(1,1,0) (0,2,0) (0,0,1)deliver m2
(2,1,0) (0,2,0) (0,0,1)
30Operating Systems, 2011, Danny Hendler & Amnon Meisels
Introduction
Causality and timeo Lamport timestamps
o Vector timestamps
o Causal communication
Snapshots
Distributed Synchronization: outline
31Operating Systems, 2011, Danny Hendler & Amnon Meisels
Assume we would like to implement a distributed deadlock-detector We record process states to check if there is a waits-for-cycle
Snapshots motivation – phantom deadlock
P1 r1 r2 P2
request
OK
request
OK
release request
OK
12
3
4
p2
r1
r2
p1
Actual waits-for graph
p2
r1
r2
p1
Observed waits-for graph
request
Global system state:o S=(s1, s2, …, sM) – local processor states
o The contents Li,j =(m1, m2, …, mk) of each communication channel Ci,j
(channels assumed to be FIFO)These are messages sent but not yet received
Global state must be consistento If we observe in state si that pi received message m from pk, then in
observation sk ,k must have sent m.
o Each Li,j must contain exactly the set of messages sent by pi but not yet received by pj, as reflected by si, sj.
What is a snapshot?
• Snapshot state much be consistent, one that might have existed during the computation.
• Observations must be mutually concurrent that is, no observation casually precedes another observation
Upon joining the algorithm a process records its local state
The process that initiates the snapshot sends snapshot tokens to its neighbors (before sending any other messages)o Neighbors send them to their neighbors – broadcast
Upon receiving a snapshot token:o a process must record its state prior to receiving additional messageso Must then send tokens to all its other neighbors
How shall we record sets Lp,q?o q receives token from p and that is the first time q receives token
Lp,q={ }o q receives token from p but q received token before
Lp,q={all messages received by q from p since q received token}
Snapshot algorithm – informal description
34Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – data structures The algorithm supports multiple ongoing snapshots – one per process
Different snapshots from same process distinguished by version number
Per process variables
integer my_version initially 0integer current_snap[1..M] initially [0,…,0] integer tokens_received[1..M]processor_state S[1…M]channel_state [1..M][1..M]
The version number of my current snapshot
35Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – data structures The algorithm supports multiple ongoing snapshots – one per process
Different snapshots from same process distinguished by version number
Per process variables
integer my_version initially 0integer current_snap[1..M] initially [0,…,0] integer tokens_received[1..M]processor_state S[1…M]channel_state [1..M][1..M]
current_snap[r] contains version number of snapshot initiated by processor r
36Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – data structures The algorithm supports multiple ongoing snapshots – one per process
Different snapshots from same process distinguished by version number
Per process variables
integer my_version initially 0integer current_snap[1..M] initially [0,…,0] integer tokens_received[1..M]processor_state S[1…M]channel_state [1..M][1..M] tokens_received[r]
contains the number of tokens received for the snapshot initiated by processor r
37Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – data structures The algorithm supports multiple ongoing snapshots – one per process
Different snapshots from same process distinguished by version number
Per process variables
integer my_version initially 0integer current_snap[1..M] initially [0,…,0] integer tokens_received[1..M]processor_state S[1…M]channel_state [1..M][1..M]
processor_state[r] contains the state recorded for the snapshot of processor r
38Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – data structures The algorithm supports multiple ongoing snapshots – one per process
Different snapshots from same process distinguished by version number
Per process variables
integer my_version initially 0integer current_snap[1..M] initially [0,…,0] integer tokens_received[1..M]processor_state S[1…M]channel_state [1..M][1..M]
channel_state[r][q] records the state of channel from q to current processor for the snapshot initiated by processor r
39Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
40Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
41Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Increment version number of this snapshot, record my local state
42Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Send snapshot-token on all outgoing channels, initialize number of received tokens for my snapshot to 0
43Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Upon receipt from q of TOKEN for snapshot (r,version)
44Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
If this is first token for snapshot (r,version)
45Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Record local state for r'th snapshotUpdate version number of r'th snapshot
46Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Set of messages on channel from q is emptySend token on all outgoing channelsInitialize number of received tokens to 1
47Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Not the first token for (r,version)
48Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
Yet another token for snapshot (r,version)
49Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version) 13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
These messages are the state of the channel from q for snapshot (r,version)
50Operating Systems, 2011, Danny Hendler & Amnon Meisels
Snapshot algorithm – pseudo-codeexecute_snapshot()
Wait for a snapshot request or a token
Snapshot request:1 my_version++, current_snap[self]=my_version2 S[self] my_state3 for each outgoing channel q, send(q, TOEKEN; self, my_version)4 tokens_received[self]=0
TOKEN(q; r,version):5. If current_snap[r] < version6. S[r] my state7. current_snap[r]=version8. L[r][q] empty, send token(r, version) on each outgoing channel9. tokens_received[r] 110. else11. tokens_received[r]++12. L[r][q] all messages received from q since first receiving token(r,version)13. if tokens_received = #incoming channels, local snapshot for (r,version) is finished
If all tokens of snapshot (r,version) arrived, snapshot computation is over