Logical Clocks
![Page 1: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/1.jpg)
Logical Clocks
Ken Birman
![Page 2: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/2.jpg)
Time: A major issue in distributed systems
We tend to casually use temporal concepts
Example: “p suspects that q has failed”
Implies a notion of time: first q was believed correct, later q is suspected faulty
Challenge: relating local notion of time in a single process to a global notion of time
Discuss this issue before developing practical tools for dealing with other aspects, such as system state
![Page 3: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/3.jpg)
Time in Distributed Systems
Three notions of time:
Time seen by external observer: a global clock of perfect accuracy
Time seen on clocks of individual processes: each has its own clock, and clocks may drift out of sync
Logical notion of time: event a occurs before event b and this is detectable because information about a may have reached b
![Page 4: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/4.jpg)
External Time
The “gold standard” against which many protocols are defined
Not implementable: no system can avoid uncertain details that limit temporal precision!
Use of external time is also risky: many protocols that seek to provide properties defined by external observers are extremely costly and, sometimes, are unable to cope with failures
![Page 5: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/5.jpg)
Time seen on internal clocks
Most workstations have reasonable clocks
Clock synchronization is the big problem (will visit topic later in course): clocks can drift apart and resynchronization, in software, is inaccurate
Unpredictable speeds are a feature of all computing systems, hence can’t predict how long events will take (e.g. how long it will take to send a message and be sure it was delivered to the destination)
![Page 6: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/6.jpg)
Logical notion of time
Has no clock in the sense of “real-time”
Focus is on definition of the “happens before” relationship: “a happens before b” if:
both occur at same place and a finished before b started, or
a is the send of message m, b is the delivery of m, or
a and b are linked by a chain of such events
![Page 7: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/7.jpg)
Logical time as a time-space picture
[Figure: time-space diagram with processes p0, p1, p2, p3 and events a, b, c, d]
a, b are concurrent
c happens after a, b
d happens after a, b, c
![Page 8: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/8.jpg)
Notation
Use “arrow” to represent happens-before relation
For previous slide: a → c, b → c, c → d; hence a → d, b → d
a, b are concurrent
Also called the “potential causality” relation
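A minimal sketch of the relation as graph reachability, assuming the direct edges a → c, b → c, c → d from the example above. The names `EDGES`, `happens_before` and `concurrent` are illustrative only.

```python
from itertools import combinations

# Direct happens-before edges (same-process succession or send/deliver pairs).
# Assumed from the example: a -> c, b -> c, c -> d.
EDGES = {"a": {"c"}, "b": {"c"}, "c": {"d"}, "d": set()}

def happens_before(x, y, edges=EDGES):
    """True if y can be reached from x through a chain of direct edges."""
    stack, seen = list(edges[x]), set()
    while stack:
        e = stack.pop()
        if e == y:
            return True
        if e not in seen:
            seen.add(e)
            stack.extend(edges[e])
    return False

def concurrent(x, y, edges=EDGES):
    """Distinct events are concurrent if neither happens before the other."""
    return (x != y
            and not happens_before(x, y, edges)
            and not happens_before(y, x, edges))

# Every unordered pair of events that is concurrent.
concurrent_pairs = [(x, y) for x, y in combinations(sorted(EDGES), 2)
                    if concurrent(x, y)]
```

Transitivity falls out of the reachability search: a → d holds because the chain a → c → d exists, and (a, b) is the only concurrent pair.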
![Page 9: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/9.jpg)
Logical clocks
Proposed by Lamport to represent causal order
Write: LT(e) to denote logical timestamp of an event e, LT(m) for a timestamp on a message, LT(p) for the timestamp associated with process p
Algorithm ensures that if a → b, then LT(a) < LT(b)
![Page 10: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/10.jpg)
Algorithm
Each process maintains a counter, LT(p)
For each event other than message delivery: set LT(p) = LT(p)+1
When sending message m, set LT(m) = LT(p)
When delivering message m to process q, set LT(q) = max(LT(m), LT(q))+1
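The rules above can be sketched as a small class. This is an illustration, not a reference implementation: the class and method names are hypothetical, and it assumes that sending counts as an event, so the counter is incremented before the message is stamped.

```python
class LamportProcess:
    """One process with a Lamport logical clock LT(p)."""

    def __init__(self, name):
        self.name = name
        self.lt = 0  # LT(p), starts at zero

    def local_event(self):
        # Any event other than message delivery: LT(p) = LT(p) + 1
        self.lt += 1
        return self.lt

    def send(self):
        # Assumption: the send is itself an event, so increment first,
        # then stamp the message with LT(m) = LT(p).
        self.lt += 1
        return self.lt

    def deliver(self, lt_m):
        # Delivery at q: LT(q) = max(LT(m), LT(q)) + 1
        self.lt = max(lt_m, self.lt) + 1
        return self.lt


p, q = LamportProcess("p"), LamportProcess("q")
first = p.local_event()      # event at p
lt_m = p.send()              # p sends m
at_delivery = q.deliver(lt_m)  # q delivers m
```

Because the send happens before the delivery, the delivery timestamp is strictly larger than the message timestamp, as the algorithm guarantees.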
![Page 11: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/11.jpg)
Illustration of logical timestamps
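[Figure: time-space diagram for processes p0, p1, p2, p3 with a logical timestamp on each event; timestamp sequences shown: 0 1 2 7; 0 1 6; 0 2 3 4 5 6; 0 1]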
![Page 12: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/12.jpg)
Concurrent events
If a, b are concurrent, LT(a) and LT(b) may have arbitrary values!
Thus, logical time lets us determine that a potentially happened before b, but not that a definitely did so!
Example: processes p and q never communicate. Both will have events 1, 2, ..., but even if LT(e)<LT(e’), e may not have happened before e’
![Page 13: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/13.jpg)
Vector timestamps
Extend logical timestamps into a list of counters, one per process in the system
Again, each process keeps its own copy
Event e occurs at process p: p increments VT(p)[p] (p’th entry in its own vector clock)
q receives a message from p: q sets VT(q)=max(VT(q),VT(p)) (element-by-element)
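A minimal sketch of these update rules. The class name and the fixed process count `n` are assumptions for illustration. The sketch also assumes that a receive is itself an event at q, so q increments its own entry after taking the element-by-element maximum.

```python
class VectorProcess:
    """One process with a vector clock over a fixed set of n processes."""

    def __init__(self, index, n):
        self.index = index      # this process's position in the vector
        self.vt = [0] * n       # VT(p), one counter per process

    def event(self):
        # Event at p: increment VT(p)[p]; return a copy as the event's timestamp
        self.vt[self.index] += 1
        return list(self.vt)

    def send(self):
        # Sending is an event; the message carries a copy of the sender's vector
        return self.event()

    def receive(self, vt_msg):
        # Element-by-element maximum of the two vectors...
        self.vt = [max(a, b) for a, b in zip(self.vt, vt_msg)]
        # ...then (assumption) count the receive as an event at q
        return self.event()


p0, p1 = VectorProcess(0, 4), VectorProcess(1, 4)
e1 = p0.event()          # first event at p0
m = p0.send()            # second event at p0: send a message
e2 = p1.receive(m)       # p1 receives the message from p0
```

The receive timestamp records everything p0 had seen when it sent the message, plus one event at p1.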
![Page 14: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/14.jpg)
Illustration of vector timestamps
[Figure: time-space diagram for processes p0, p1, p2, p3 with vector timestamps [1,0,0,0], [2,0,0,0], [0,0,1,0], [2,1,1,0], [2,2,1,0], [0,0,0,1]]
![Page 15: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/15.jpg)
Vector timestamps accurately represent happens-before relation
Define VT(e)<VT(e’) if, for all i, VT(e)[i]≤VT(e’)[i], and for some j, VT(e)[j]<VT(e’)[j]
Example: if VT(e)=[2,1,1,0] and VT(e’)=[2,3,1,0] then VT(e)<VT(e’)
Notice that not all VT’s are “comparable” under this rule: consider [4,0,0,0] and [0,0,0,4]
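The comparison rule can be sketched as follows: less-than-or-equal in every entry, and strictly less in at least one. The function names are illustrative.

```python
def vt_less(u, v):
    """VT(e) < VT(e'): u[i] <= v[i] for all i, and u[j] < v[j] for some j."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))

def vt_concurrent(u, v):
    """Distinct timestamps that are not comparable in either direction."""
    return u != v and not vt_less(u, v) and not vt_less(v, u)


ordered = vt_less([2, 1, 1, 0], [2, 3, 1, 0])
incomparable = vt_concurrent([4, 0, 0, 0], [0, 0, 0, 4])
```

The first pair is ordered because only the second entry differs and it grows; the second pair is incomparable because each vector is larger in a different entry.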
![Page 16: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/16.jpg)
Vector timestamps accurately represent happens-before relation
Now can show that VT(e)<VT(e’) if and only if e → e’:
If e → e’, then there exists a chain e0 → e1 → ... → en on which vector timestamps increase “hop by hop”
If VT(e)<VT(e’), it suffices to look at VT(e’)[proc(e)], where proc(e) is the place that e occurred. By definition, we know that VT(e’)[proc(e)] is at least as large as VT(e)[proc(e)], and by construction, this implies a chain of events from e to e’
![Page 17: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/17.jpg)
Examples of VT’s and happens-before
Example: suppose that VT(e)=[2,1,0,1] and VT(e’)=[2,3,0,1], so VT(e)<VT(e’)
How did e’ “learn” about the 3 and the 1? Either these events occurred at the same place as e’, or some chain of send/receive events carried the values!
If VT’s are not comparable, the corresponding events are concurrent
![Page 18: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/18.jpg)
Notice that vector timestamps require a static notion of system membership
For vector to make sense, must agree on the number of entries
Later will see that vector timestamps are useful within groups of processes
Will also find ways to compress them and to deal with dynamic group membership changes
![Page 19: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/19.jpg)
What about “real-time” clocks?
Accuracy of clock synchronization is ultimately limited by uncertainty in communication latencies
These latencies are “large” compared with speed of modern processors (typical latency may be 35us to 500us, time for thousands of instructions)
Limits use of real-time clocks to “coarse-grained” applications
![Page 20: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/20.jpg)
Interpretations of temporal terms
Understand now that “a happens before b” means that information can flow from a to b
Understand that “a is concurrent with b” means that there is no information flow between a and b
What about the notion of an “instant in time”, over a set of processes?
![Page 21: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/21.jpg)
Neither clock is appropriate
Problem is that with both clocks, there can be many events that are concurrent with a given event
Leads to a philosophical question: event e has happened at process p. Which events are “really” simultaneous with e?
![Page 22: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/22.jpg)
Perspectives on logical time
One view is based on intuition from physics
Imagine a time-space diagram
Cones of causality define past and future
“Now” is any consistent cut across the system, including no future events and no past events
Next Tuesday will see algorithms based on this
![Page 23: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/23.jpg)
Causal notions of past, future
[Figure: time-space diagram with processes p0, p1, p2, p3 and events a, b, c, d, e, f, g]
![Page 24: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/24.jpg)
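Causal notions of past, future
[Figure: the same time-space diagram with processes p0, p1, p2, p3 and events a, b, c, d, e, f, g, now with regions labeled PAST and FUTURE]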
![Page 25: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/25.jpg)
Issues raised by time
Time is a tool
Typical uses of time?
To put events into some sort of order
Example: the order of updates on a replicated data item
With one item, logical time may make sense
With multiple items, consider VT with one element per item
![Page 26: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/26.jpg)
Ways to extend time to a total order
Often extend a logical timestamp or vector timestamp with actual clock time when the event occurred and process id where it occurred
Combination breaks any possible ties
Or can use event “names”
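A small sketch of the tie-breaking idea. For brevity it extends a logical timestamp with the process id only and leaves out the actual clock time; the `Stamped` type and the sample events are made up for illustration.

```python
from typing import NamedTuple

class Stamped(NamedTuple):
    lt: int      # logical timestamp of the event
    pid: int     # id of the process where the event occurred
    name: str    # label used only to show the resulting order

# Hypothetical events; several share the same logical timestamp.
events = [
    Stamped(2, 1, "x"),
    Stamped(1, 2, "y"),
    Stamped(2, 0, "z"),
    Stamped(1, 0, "w"),
]

# Sort by logical timestamp first, then by process id to break ties.
total_order = [e.name for e in sorted(events, key=lambda e: (e.lt, e.pid))]
```

Since no two events at the same process share a logical timestamp, the pair (timestamp, process id) is unique, so every process that sorts the same set of events gets the same total order.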
![Page 27: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/27.jpg)
An example
Suppose we are broadcasting messages
Atomic broadcast is:
Fault-tolerant: unless every process with a copy fails, the message is delivered everywhere (often expressed as all or nothing delivery)
Ordered: if p, q both receive m, n, either both receive m before n, or both receive n before m
How should we implement this policy?
![Page 28: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/28.jpg)
Easy case
In many systems there is really just one source of broadcasts
Typically we see this pattern when there is really one reference copy of a replicated object and the replicas are viewed as cached copies
Accordingly we can use a FIFO ordered broadcast and reduce the problem to fault-tolerance
FIFO ordering simply requires a counter from sender
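A sketch of the receiver side of FIFO ordering with a single sender, assuming the sender numbers its broadcasts 0, 1, 2, ... The class name is hypothetical, and fault-tolerance (retransmission, sender failure) is out of scope here.

```python
class FifoReceiver:
    """Delivers messages from one sender in the order of the sender's counter."""

    def __init__(self):
        self.next_seq = 0     # sequence number expected next
        self.pending = {}     # out-of-order messages, keyed by sequence number
        self.delivered = []   # payloads in delivery order

    def receive(self, seq, payload):
        # Ignore duplicates of messages already delivered or already buffered
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver as long as the next expected message is available
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


r = FifoReceiver()
r.receive(1, "b")   # arrives early, is held back
held = list(r.delivered)
r.receive(0, "a")   # fills the gap, both are delivered
r.receive(0, "a")   # duplicate, ignored
r.receive(2, "c")
```

The counter alone gives the order; the hard part that remains is what to do when a numbered message never arrives.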
![Page 29: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/29.jpg)
A more complex example
Sender-ordered multicast:
Sender places a timestamp in the broadcast
Receiver waits until it has full set of messages
Orders them by logical timestamp, breaks ties with sender-id
Then delivers in this order
How can it tell when it has the “full set”?
![Page 30: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/30.jpg)
A more complex example
[Figure: time-space diagram with two concurrent broadcasts, m and n]
Deliver m,n or n,m?
![Page 31: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/31.jpg)
A more complex example
Solution implicitly depends upon membership
In fact, most distributed systems depend upon membership
Membership is “the most fundamental” idea in many systems for this reason
Receiver can simply wait until all members have sent one message
System ends up running in rounds, where each member contributes zero or one messages per round
Use a “null” message if you have nothing to send
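A sketch of one round at a receiver, assuming a fixed, agreed membership list. The function name, the `NULL` marker and the message layout (timestamp, payload) are assumptions for illustration.

```python
NULL = None  # stands in for the "null" message of a member with nothing to send

def deliver_round(members, received):
    """Return the payloads of one round in delivery order.

    received maps sender id -> (logical timestamp, payload or NULL).
    Returns None while some member has not yet contributed to the round.
    """
    if not set(members) <= set(received):
        return None  # full set not yet known: keep waiting
    # Order by logical timestamp, break ties with the sender id
    ordered = sorted(received.items(), key=lambda kv: (kv[1][0], kv[0]))
    # Null messages only complete the round; they are not delivered
    return [payload for _, (_, payload) in ordered if payload is not NULL]


members = ["p", "q", "r"]
partial = deliver_round(members, {"q": (3, "n"), "p": (3, "m")})
full = deliver_round(members, {"q": (3, "n"), "p": (3, "m"), "r": (1, NULL)})
```

With m and n carrying the same timestamp, the sender id decides: every receiver delivers m (from p) before n (from q), but only after r's null message shows the round is complete.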
![Page 32: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/32.jpg)
A more complex example
[Figure: time-space diagram with broadcasts m and n]
![Page 33: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/33.jpg)
Optimizations
We could agree in advance on “permission to send”
Now, perhaps only p, q have permission
We treat their messages in rounds but others must get permission before sending
Avoids all the null messages and ensures fairness if p, q send at same rate
Dolev: explored extensions for varied rates, gets quite elaborate…
![Page 34: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/34.jpg)
Optimizations
In the limit, we end up with a token scheme
While holding the token, p has permission to send
If q requests the token p must release it (perhaps after a small delay)
Token carries the sequence number to use
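A sketch of the token scheme in a single address space, ignoring the network and failures. The `Token` and `Member` classes are hypothetical; the only state the token carries is the next sequence number.

```python
class Token:
    """The token carries the sequence number to use for the next broadcast."""

    def __init__(self):
        self.next_seq = 1


class Member:
    def __init__(self, name):
        self.name = name
        self.token = None  # set only while this member holds the token

    def broadcast(self, payload):
        # Only the token holder has permission to send
        if self.token is None:
            raise RuntimeError(f"{self.name} does not hold the token")
        seq = self.token.next_seq
        self.token.next_seq += 1
        return (seq, self.name, payload)

    def release_to(self, other):
        # Hand the token, with its current sequence number, to the requester
        other.token, self.token = self.token, None


p, q = Member("p"), Member("q")
p.token = Token()
first = p.broadcast("m")    # p holds the token
p.release_to(q)             # q requested it, p releases
second = q.broadcast("n")   # q continues the same sequence
```

Receivers can then deliver in sequence-number order, which is the same everywhere because only one token exists.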
![Page 35: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/35.jpg)
A more complex example
[Figure, three animation frames: message m is labeled m:1, then message n is labeled n:2]
![Page 38: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/38.jpg)
An example
Such solutions are expressed in many ways
With a ring: Chang and Maxemchuk; messages are like a “train” with new message tacked onto end and old ones delivered from front
Direct all-to-all broadcast
Like a token moving around the ring, but it carries the messages with it (inspired by FDDI)
Tree structured in various ways
![Page 39: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/39.jpg)
More examples
Old Isis system uses logical clocks
Sender says “here is a message”
Receivers maintain logical clocks. Each proposes a delivery time
Sender gathers votes, picks maximum, says “commit delivery at time t”
Receivers deliver committed messages in timestamp order from front of a queue
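A sketch of the propose/commit idea described above, not the actual Isis code. It assumes timestamps are (time, process id) pairs so that ties break by id, runs everything in one address space, and ignores failures. The `Receiver` class and its method names are hypothetical.

```python
class Receiver:
    def __init__(self, pid):
        self.pid = pid
        self.clock = 0
        self.queue = {}       # msg id -> [timestamp, committed?]
        self.delivered = []

    def propose(self, msg_id):
        # "Here is a message": propose a delivery time from the local clock
        self.clock += 1
        proposal = (self.clock, self.pid)
        self.queue[msg_id] = [proposal, False]
        return proposal

    def commit(self, msg_id, final):
        # "Commit delivery at time t": record the final time, then deliver
        self.clock = max(self.clock, final[0])
        self.queue[msg_id] = [final, True]
        while self.queue:
            # Look at the front of the queue (smallest timestamp)
            front, (_, committed) = min(self.queue.items(),
                                        key=lambda kv: kv[1][0])
            if not committed:
                break  # an uncommitted message could still end up earlier
            del self.queue[front]
            self.delivered.append(front)


p, q, r = Receiver("p"), Receiver("q"), Receiver("r")
# p and r see m before n; q sees n before m
m_votes = [p.propose("m")]
n_votes = [p.propose("n")]
n_votes.append(q.propose("n"))
m_votes.append(q.propose("m"))
m_votes.append(r.propose("m"))
n_votes.append(r.propose("n"))
# Each sender gathers the votes and picks the maximum
m_final, n_final = max(m_votes), max(n_votes)
for proc in (p, q, r):
    proc.commit("m", m_final)
for proc in (p, q, r):
    proc.commit("n", n_final)
```

Although the receivers saw m and n in different orders, the committed times (2, q) for m and (2, r) for n are the same everywhere, so all three deliver m and then n.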
![Page 40: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/40.jpg)
More examples
[Figure, three animation frames for processes p, q, r and messages m, n. Frame 1, proposals: p has m:[1,p], n:[2,p]; q has n:[1,q], m:[2,q]; r has m:[1,r], n:[2,r]. Frame 2, commits: m:[2,q], n:[2,r]. Frame 3: every process delivers m, then n]
![Page 43: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/43.jpg)
More examples
Later versions of Isis used vector times
Membership is handled separately
Each message is assigned a vector time
Delivered in vector time order, with ties broken using process id of the sender
![Page 44: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/44.jpg)
Totem and Transis
These systems represent time using partial order information
Message m arrives and includes ordering fields:
Deliver m after n and o
By transitivity, if n is after p, then m is after p
Break ties using process id number
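A loose sketch of delivery from ordering fields, not the Totem or Transis protocols themselves. It assumes every message names the messages it must follow, and that among messages whose predecessors have all been delivered, the one from the lowest process id goes first. The message set, the process ids and the rule that o follows p are made up for illustration.

```python
def delivery_order(messages):
    """messages maps name -> (sender process id, set of names it must follow)."""
    delivered, order = set(), []
    pending = dict(messages)
    while pending:
        # A message is ready once everything it must follow has been delivered
        ready = [name for name, (_, after) in pending.items()
                 if after <= delivered]
        if not ready:
            break  # still waiting for a predecessor that has not arrived
        # Break ties among ready messages using the sender's process id
        nxt = min(ready, key=lambda name: (pending[name][0], name))
        order.append(nxt)
        delivered.add(nxt)
        del pending[nxt]
    return order


msgs = {
    "m": (0, {"n", "o"}),   # deliver m after n and o
    "n": (2, {"p"}),        # n is after p
    "o": (1, {"p"}),        # assumption: o is also after p
    "p": (3, set()),
}
order = delivery_order(msgs)

# If p never arrives, nothing that depends on it can be delivered
stuck = delivery_order({k: v for k, v in msgs.items() if k != "p"})
```

The second call shows why waiting on ordering fields needs help from membership and atomicity: a missing predecessor blocks everything after it.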
![Page 45: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/45.jpg)
Totem and Transis
[Figure: ordering graph over messages m, n, o, p]
![Page 46: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/46.jpg)
Things to notice
Time is just a programming tool
But membership and message atomicity are very fundamental
Waiting for m won’t work if m never arrives
And VT is only meaningful if we can agree on the meaning of the indices
With failures, these algorithms get surprisingly complicated: suppose p fails while sending m?
![Page 47: Logical Clocks](https://reader036.fdocuments.us/reader036/viewer/2022062721/56813657550346895d9de039/html5/thumbnails/47.jpg)
Major uses of time
To order updates on replicated data
To define versions of objects
To deal with processes that come and go in dynamic networked applications:
Processes that joined earlier often have more complete knowledge of system state
Process that leaves and rejoins often needs some form of incrementing “incarnation number”
To prove correctness of complex protocols