Introduction 1-1 1DT066 Distributed Information Systems Chapter 1 Introduction.
1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.
-
Upload
christiana-arling -
Category
Documents
-
view
214 -
download
0
Transcript of 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.
![Page 1: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/1.jpg)
1DT066DISTRIBUTED INFORMATION SYSTEM
Time, Coordination and Agreement
1
![Page 2: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/2.jpg)
OUTLINE
TimePhysical timeLogical time
Coordination and agreement Multicast communication Summary
2
![Page 3: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/3.jpg)
1 TIME
The notation of time External synchronization Internal synchronization Physical clocks and their synchronization Logical time and logical clocks
3
![Page 4: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/4.jpg)
1.1 SYNCHRONIZING PHYSICAL CLOCKS
Computer each contains its physical clock. Physical clock is limited by its resolution - the
period between updates of the clock register. Clock drift often happens to physical clocks. To compensate for clock drifts, computers are
synchronized to a time service, e.g., UTC - Coordinated universal time.
Several other algorithms for synchronization.
4
![Page 5: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/5.jpg)
1.1 CRISTIAN’S CLOCK SYNCHRONIZATION
A process P can record the total round-trip time Tround taken to send the request mr and receive the reply mt.
A simple estimate of the time to which P should set its clock is t + Tround/2.
5
![Page 6: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/6.jpg)
1.1 THE BERKELEY ALGORITHM
A coordinator computer is chosen to act as the master. Master periodically polls to slaves whose clocks are to be synchronized.
The master estimates their local clock times by observing the round-trip times, and it averages the values obtained.
The master takes a fault-tolerant average. Should the master fail, then another can be
elected to take over.
6
![Page 7: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/7.jpg)
1.1 THE NETWORK TIME PROTOCOL NTP distributes time information to provide:
a service to synchronize clients in Internet a reliable service that survives loss of connection a frequent resynchronization for client’s clock drift protection against interference with time server
NTP service is provided by various servers: Primary servers, secondary servers, and servers of
other levels (called strata). Synchronization subnet: the servers which are
connected in a logical hierarchy.
7
![Page 8: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/8.jpg)
1.2 LOGICAL TIME AND LOGICAL CLOCKS The order of the events
two events occurred in the order they appear in a process. event of sending occurred before event of receiving.
happened-before relation, denoted by HB1: If process p: x p y, then x y.HB2: For any message m, send(m) rcv(m),HB3: If x, y and z are events such that x y and y z, then x z.
8
![Page 9: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/9.jpg)
1.2 LOGICAL TIMESTAMPS EXAMPLE
Events occurring at three processes
9
![Page 10: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/10.jpg)
1.2 LAMPORT LOGICAL TIMESTAMPS Logical clock - a monotonically increasing
software counter. Cp: logical clock for process p; Cp(a): timestamp
of event a at p; C(b): timestamp of event b LC1: event issued at process p: Cp := Cp + 1
LC2: a) p sends message m to q with value t = Cp
b) Cq := max(Cq,t) and applies LC1 to rcv(m). If a b then C(a) < C(b), but not visa versa! Total order logical clock and vector clock. 10
![Page 11: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/11.jpg)
1.2 LAMPORT TIMESTAMPS EXAMPLE
Events occurring at three processes
1 2
3 4
51
3
6
=> 7
11
![Page 12: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/12.jpg)
1.3 VECTOR CLOCKS Vector clock
A vector clock of N processes is an array of N integers
Each process keeps its own vector clock Vi, which it uses to timestamp a local event
VC1: Initially, Vi[j] = 0, for i, j = 1, 2…, N.
VC2: Just before pi timestamps an event, it sets Vi[i] := Vi[i] + 1.
VC3: pi includes the value t = Vi in every message it sends.
VC4: When pi receives a timestamp t in a message, it sets Vi[j] := max(Vi[j], t[j]), for j = 1, 2…, N. Taking the component-wise maximum of two vector timestamps in this way is known as a merge operation.
12
![Page 13: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/13.jpg)
1.3 VECTOR CLOCKS EXAMPLE
Events occurring at three processes
(1,0,0) (2,0,0)
(0,0,1)
(2,1,0) (2,2,0)
(2,2,2)
(2,2,3)
(3,2,3)
![Page 14: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/14.jpg)
1.4 COMPARISON In Lamport’s clock, C(e)<C(e’) does not imply e e’;
while in Vector timestamp, V(e)<V(e’) implies e e’. Vector timestamps take up an amount of storage
and message payload that is proportional to N, the number of process; while Lamport’s clock does not.
14
![Page 15: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/15.jpg)
1.5 LAMPORT TIMESTAMPS EXERCISE
15
![Page 16: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/16.jpg)
1.5 LAMPORT TIMESTAMPS ANSWER
1 4 5
3 4 52
61 7
16
![Page 17: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/17.jpg)
1.5 VECTOR CLOCKS EXERCISE
17
![Page 18: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/18.jpg)
1.5 VECTOR CLOCKS ANSWER
(1,0,0) (2,2,1) (3,2,1)
(0,1,1) (0,2,1) (1,3,1) (1,4,1)
(0,0,1) (3,2,2) (3,4,3)
18
![Page 19: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/19.jpg)
2 COORDINATION
Distributed processes need to coordinate their activities.
Distributed mutual exclusion is required for safety, liveness, and ordering properties.
Election algorithms: methods for choosing a unique process for a particular role.
19
![Page 20: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/20.jpg)
2.1 DISTRIBUTED MUTUAL EXCLUSION The basic requirements for mutual exclusion:
ME1 (safety): At most one process may execute in the critical section (CS) at a time.
ME2 (liveness): A process requesting entry to the CS is eventually granted.
ME3 (ordering): Entry to the CS should be granted in happened-before order.
The central server algorithm. A ring-based algorithm. A distributed algorithm using logical clocks.
20
![Page 21: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/21.jpg)
2.2 ELECTIONS
An election is a procedure carried out to choose a process from a group.
A ring-based election algorithm. The bully algorithm.
21
![Page 22: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/22.jpg)
2.2.1 RING-BASED ELECTION ALGORITHM
Each process P(i) has a communication channel to the next process P(i+1) mod N.
Messages are sent clockwise. The goal is to elect a single process called the
coordinator, which is the process with the largest identifier.
22
![Page 23: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/23.jpg)
2.2.1 RING-BASED ALGORITHM
3
12
34
5
7
1Process number status
1 Non-participant
3 Non-participant
5 Non-participant
7 Non-participant
12 Non-participant
34 Non-participantDirection of message flow
![Page 24: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/24.jpg)
2.2.1 RING-BASED ALGORITHM
Every process can begin an election A process begins an election by marking itself as a participant, and
sends an election message to its neighbor by placing its identifier Suppose process 7 now begins the election
3
12
34
5
7
1Process number status
1
3
5
7
12
34
Election message
7
Election message
7
Election message
34
34
34
34
34
34
ParticipantNon-participant
ParticipantNon-participant
ParticipantNon-participant
ParticipantNon-participant
ParticipantNon-participant
ElectedCoordinator is 34
Non-participant
Participant
Non-participant
Non-participant
Non-participant
Non-participant
Non-participant
24
![Page 25: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/25.jpg)
2.2.2 BULLY ALGORITHM The processes themselves are synchronous. I.e. they use timeouts to
detect a process failure. Unlike the ring-based algorithm in which processes only know their
neighbors, bully algorithm allows processes to know those processes with a higher identifier.
There are three types of message: Election Answer Coordinator
121 13 5
Coordinator is now 13, because it has the highest identifier
25
![Page 26: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/26.jpg)
2.2.2 BULLY ALGORITHM The election begins when a process notices that the coordinator is failed. Several processes may discover this concurrently A process which detects the failure will send an election message to those with a
higher identifier When a process receives an election message, it sends back an answer message and
begins another election
121 13 5
Election message
Answer Message
Election message
Coordinator
Coordinator
Process 12 will know that it is the highest identifier now as all its higher identifier process (i.e. process 13) have failed, this process will then send back the coordinator message to all its lower identifier process.
26
![Page 27: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/27.jpg)
3 MULTICAST COMMUNICATION Group (multicast) communication requires
coordination and agreement. One multicast operation is much better than
multiple send operation in terms of efficiency and delivery guarantees.
Basic multicast: guarantees a correct process will eventually deliver the message.
Reliable multicast: requires that all correct processes in the group must receive a message if any of them does.
27
![Page 28: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/28.jpg)
3.1 OPEN AND CLOSED GROUPS
Closed group Open group
28
![Page 29: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/29.jpg)
3 BULLETIN BOARD EXAMPLE
29
![Page 30: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/30.jpg)
3.2 CONSISTENCY AND REQUEST ORDERING Criteria: correctness vs. expenses. Total, causal, and FIFO ordering requirements. Implementing request ordering. Implementing total ordering. Implementing causal ordering with vector
timestamps.
30
![Page 31: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/31.jpg)
3.2.1 TOTAL, FIFO, CAUSAL ORDERING
Let m1 and m2 be messages delivered to the group.
Total ordering: Either m1 is delivered before m2 or m2 is delivered before m1, at all processes.
Causal ordering: If m1 happened-before m2 then m1 is delivered before m2 at all processes.
FIFO ordering: If m1 is issued before m2 then m1 is delivered before m2 at all processes.
31
![Page 32: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/32.jpg)
3.2.1 ORDERING OF MULTICAST MESSAGES
F3
F1
F2
T2
T1
P1 P2 P3
Time
C3
C1
C2
Totally ordered messages T1 ,T2 and F1,
FIFO-related messagesF1 and F2 ; C1 and C2
Causally related messages C1 and C3 (assuming C3 is a
reply to C1 at P3 )
32
![Page 33: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/33.jpg)
3.2.2 IMPLEMENTING MESSAGE ORDERING Hold-back: A received message is not delivered until
ordering constraints can be met. Stable message: all prior messages processed. Hold-back queue vs. delivery queue. Safety property: no message will be delivered out of
order by being prematurely transferred. Liveness property: no message should wait on the
hold-back queue forever.
33
![Page 34: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/34.jpg)
3.2.2 THE HOLD-BACK QUEUE
Messageprocessing
Delivery queueHold-back
queue
deliver
Incomingmessages
When delivery guarantees aremet
34
![Page 35: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/35.jpg)
3.2.3 IMPLEMENTING TOTAL ORDERING Basic approach: assign totally ordered sequence to
messages. Sequencer Distributed agreement in assigning sequence.
21
1
2
2
1 Message
2 Proposed Seq
P2
P3
P1
P4
3 Agreed Seq
3
3
35
![Page 36: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/36.jpg)
3.2.4 IMPLEMENTING CAUSAL ORDERING Vector timestamp: a list of counts of update events,
one for each of the processes. Merging vector timestamps: choose the largest values
from the two vectors, component-wise.
P3
P2
P1
P1 P2 P3
P2 P2 P3
P1 P2 P3
… 36
![Page 37: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/37.jpg)
3.3 MULTICAST COMMUNICATION Sample multicasting operation
void Multicast (in orderType order, in groupId group, in msg m, in int nReplies, out msgSeq replies) raises (…);
Order types:unorderedtotal orderingcausal orderingsync-ordering 37
![Page 38: 1DT066 D ISTRIBUTED I NFORMATION S YSTEM Time, Coordination and Agreement 1.](https://reader035.fdocuments.us/reader035/viewer/2022070306/551771f755034645368b4caa/html5/thumbnails/38.jpg)
4 SUMMARY Timing issues
Synchronizing physical clocks.Logical time and logical clocks.
Distributed coordination and mutual exclusions.
Coordination. Elections
Ring-based algorithmBully algorithm
Multicast communication. Read textbook Chapters 14 and 15.
38