Partial Synchrony: Realizing an Ideal
http://parasol.tamu.edu
Srikanth Sastry Parasol Lab, Texas A&M University
Outline
• Classic Partial Synchrony
• Empirical Systems
• Problem Statement and Methodology
• Preliminary Results
  • The Celeration Problem
  • Fair Schedulers
• Future Work
Partial Synchrony
• Temporal guarantees on computation and communication
  • Guarantees themselves are incomplete
  • Knowledge is incomplete
• Introduced to circumvent the FLP impossibility
  • Formalizes the notion of ‘somewhat timely’
• Classic model (ParSync) [DLS1988]
  • (Eventual) Reliable communication
  • (Eventual, Unknown) Bound on message delay
  • (Eventual, Unknown) Bound on relative process speeds
Closer Look At ParSync
Each ParSync assumption (•) is paired with what empirical systems provide (‒):
• Reliable message delivery
  ‒ Unreliable message delivery
• Unbounded-size messages delivered in bounded time
  ‒ Larger messages experience greater delays
• Arbitrary number of messages received per step
  ‒ Fixed number of messages per step
• Agnostic to absolute process speeds
  ‒ Aware of and affected by absolute process speeds
• Agnostic to channel capacity
  ‒ Sensitive to channel capacity
• Non-blocking communication
  ‒ Blocking communication
Characterizing Empirical Systems
Computation
• Processes take atomic steps
  • Receive at most one message
  • Make a state transition
  • Send at most one message
• Processes can crash
• Processes execute at finite rate
• Processes have bounded relative execution rate
Communication
• Fair lossy
• Detectable message corruption
• Some infinite subset of timely messages
• Message delay proportional to message size
• Timely messages not too sparse
• Bounded FIFO delivery of messages
Problem Statement and Methodology
• Problem: Construct ParSync on top of empirical systems
• Methodology
  ‒ Step 1. Encapsulate the underlying synchronism
  ‒ Step 2. Construct reliable channels
  ‒ Step 3. Construct a fair distributed scheduler
  ‒ Step 4. Construct timely channels
Methodology
Empirical Distributed Systems
Encapsulate Synchronism
Distributed Fair Scheduler
Reliable Channels
ParSync Environment
Encapsulating Synchronism: Failure Detection Oracles
• A system service that can be queried for (potentially unreliable) information about process crashes [CT96]
  • False negatives (crash, but no suspicion)
  • False positives (suspicion, but no crash)
• ◊P – Eventually Perfect Failure Detector
  • No false negatives
  • Finitely many false positives
• Strong Completeness
  • Every crashed process is eventually and permanently suspected
• Eventual Strong Accuracy
  • Every correct process is eventually and permanently trusted
Implementation Challenges
• Absolute Process Speeds [SPW2009]
• Message Loss [SP2007]
• Bounded Channel Capacity [SP2007]
• Size-Sensitive Message Delay [SP2007]
Absolute Process Speeds: Celeration
• Every crash-fault detector … uses some kind of timeout mechanism … which requires some way to measure time.
• But which time base should be measured?
  • Real Time ≈ Ticks of a physical clock
  • Action Time ≈ Steps of an executing process
• Negative result: Neither time base is sufficient for crash-fault detection in celerating environments.
• Celerating processes can:
  • Accelerate (e.g., via hardware upgrades)
  • Decelerate (e.g., via increased loads)
Acceleration and Action Time
[Figure: transmission and processing intervals shown against real time and action time]
Deceleration and Real Time
[Figure: transmission and processing intervals shown against real time and action time]
The Celeration Problem
[Figure: transmission and processing intervals]
• Action time diverges for acceleration
• Real time diverges for deceleration
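A minimal numeric sketch of this divergence, under simplifying assumptions that are not from the slides: one ping-ack exchange needs a fixed amount of real time in transmission and a fixed number of steps in processing, and the processes run at `steps_per_real_unit` steps per real-time unit. The function names and sample values are hypothetical.

```python
def rtt_in_action_time(transmission_real, processing_steps, steps_per_real_unit):
    """Round trip as counted in steps of the monitoring process."""
    # Transmission takes a fixed amount of real time, so a faster process
    # counts more of its own steps while the message is in flight.
    return transmission_real * steps_per_real_unit + processing_steps


def rtt_in_real_time(transmission_real, processing_steps, steps_per_real_unit):
    """Round trip as measured by a physical clock."""
    # Processing takes a fixed number of steps, so a slower process
    # needs more real time to finish them.
    return transmission_real + processing_steps / steps_per_real_unit


# Hypothetical workload: 2 real-time units in flight, 8 steps of processing.
for speed in (0.25, 1.0, 4.0):
    print(speed,
          rtt_in_action_time(2, 8, speed),
          rtt_in_real_time(2, 8, speed))
```

In this model the action-time round trip grows without bound as the speed increases, and the real-time round trip grows without bound as the speed approaches zero, so a fixed timeout in either single time base can eventually be exceeded.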
Significance of Celeration
• Existing crash-detection mechanisms based only on real-time or action-time clocks are actually broken!
• Our positive result: We construct a new bichronal timeout mechanism that is immune to process celeration
System Model
• Temporal Assumptions
  • Δ - unknown upper bound on absolute message delay
  • Φ - unknown bound on relative process speeds
• Reliability Assumptions
  • Reliable communication: no message loss/corruption
  • Unreliable computation: processes may crash
Φ and Processing Delays
• Suppose P sends a ping to Q
  • Ping will be delivered within Δ real-time units
  • But when will the ping actually be received?
• Depends on the local processing delay at Q
  • Q takes (at least) 1 step for every Φ steps at P
  • If Q has c local actions executed in round-robin order, then Q executes all such actions within Φ·c steps at P
• So, processing delay at Q is at most Φ·c steps at P
Ping-Ack ◊P Implementations
• Adaptive timeout values should
  • Exceed RTT after finitely many false positives
  • Converge to a constant timeout value (if efficient)
  • Guarantee accuracy forever thereafter
Round-Trip Time (RTT) ≤ (2Δ + Φ·c + c)
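The bound can be read as the sum of the four legs of one ping-ack exchange. The worked numbers in the comment are illustrative values chosen here, not figures from the talk.

```latex
\begin{align*}
\mathrm{RTT} &\le \underbrace{\Delta}_{\text{transmit ping}}
              + \underbrace{\Phi \cdot c}_{\text{process ping at } Q}
              + \underbrace{\Delta}_{\text{transmit ack}}
              + \underbrace{c}_{\text{process ack at } P} \\
             &= 2\Delta + \Phi \cdot c + c
\end{align*}
% Illustrative values: \Delta = 10, \Phi = 3, c = 4
% give 2(10) + 3(4) + 4 = 36.
```

The 2Δ term is counted in real-time units, while the Φ·c + c term is counted in steps at P, so the sum mixes two time bases.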
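[Figure: P sends PING to Q (transmission ≤ Δ); Q processes the ping (≤ Φ·c); Q returns ACK (transmission ≤ Δ); P processes the ack (≤ c)]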
Celeration and False Positives
[Figure: RTT in action-time units plotted against RTT in real-time units, with the worst-case RTT marked at 2Δ and Φ·c. Acceleration and deceleration each lead to an unbounded RTT.]
A New Timeout Mechanism
• Timeouts in ParSync are inherently bichronal
  • Transmission delays are bounded in real-time units
  • Processing delays are bounded in action-time units
• We define Bichronal Clocks for bichronal time
  • Model as an ordered pair <real-time, action-time>
  • Measure both time components concurrently
  • Expire only after both components expire!
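A minimal sketch of such a clock, assuming the caller delivers the ticks (`real_tick` driven by a physical clock, `action_tick` called once per process step). The class and method names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class BichronalClock:
    """Timer over an ordered pair <real-time, action-time>."""
    real_limit: int
    action_limit: int
    real_elapsed: int = 0
    action_elapsed: int = 0

    def start(self) -> None:
        # Restart both components together.
        self.real_elapsed = 0
        self.action_elapsed = 0

    def real_tick(self, n: int = 1) -> None:
        # Called when the physical clock advances.
        self.real_elapsed += n

    def action_tick(self, n: int = 1) -> None:
        # Called when the process takes a step.
        self.action_elapsed += n

    def expired(self) -> bool:
        # Expire only after BOTH components have run out.
        return (self.real_elapsed >= self.real_limit
                and self.action_elapsed >= self.action_limit)


clock = BichronalClock(real_limit=5, action_limit=8)
clock.start()
clock.real_tick(5)
clock.action_tick(3)
print(clock.expired())  # False: only the real-time component has run out
clock.action_tick(5)
print(clock.expired())  # True: both components have run out
```

Requiring both components means acceleration cannot shorten the wait below the real-time limit and deceleration cannot shorten it below the action-time limit.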
Bichronal Clock Expiry
• Clock.Start (Real=5, Action=8)
[Figure: timeline of real ticks and action ticks, axis marked 0 to 12, with markers at 5, 8, and 11. Expiry at bichronal time (5,11).]
Celeration-Immune ◊P
• Adaptive ping-ack protocol
  • Start bichronal clock after sending ping
  • Run bichronal clock 4 consecutive times
  • Timeout if no ack received by 4th expiry
• Upon receiving any ack
  • Trust sending process
  • Adapt bichronal values after false-positive mistakes
  • Increase both real-time and action-time components!
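The protocol steps above can be sketched as an event-driven monitor for a single remote process. This is a sketch under assumptions: bichronal clock expiries arrive as calls to `on_clock_expiry`, and each false positive raises both components by 1, an increment chosen here for illustration. All names are hypothetical.

```python
class PingAckMonitor:
    """Monitors one remote process with an adaptive bichronal timeout."""

    RUNS_PER_PING = 4  # the clock runs 4 consecutive times per ping

    def __init__(self, real_limit=1, action_limit=1):
        self.real_limit = real_limit      # real-time component
        self.action_limit = action_limit  # action-time component
        self.expiries = 0
        self.suspected = False
        self.mistakes = 0                 # false positives so far

    def send_ping(self):
        # The bichronal clock is started right after the ping is sent.
        self.expiries = 0

    def on_clock_expiry(self):
        # Time out only if no ack has arrived by the 4th expiry.
        self.expiries += 1
        if self.expiries >= self.RUNS_PER_PING:
            self.suspected = True

    def on_ack(self):
        if self.suspected:
            # The suspicion was a false positive: increase BOTH components.
            self.mistakes += 1
            self.real_limit += 1
            self.action_limit += 1
        # Any ack restores trust in the sender.
        self.suspected = False
        self.expiries = 0


monitor = PingAckMonitor(real_limit=5, action_limit=8)
monitor.send_ping()
for _ in range(4):
    monitor.on_clock_expiry()
print(monitor.suspected)                         # True
monitor.on_ack()
print(monitor.suspected)                         # False
print(monitor.real_limit, monitor.action_limit)  # 6 9
```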
◊P – Strong Completeness
• Requirement: Suspect crashed processes permanently
• This one is easy
  • Crashed processes stop sending acks
  • Bichronal timer eventually expires 4th time
  • Permanent suspicion after final ack
◊P – Eventual Strong Accuracy
• Requirement: Trust correct processes eventually and permanently
• After finitely many false-positive mistakes
  • Bichronal values exceed <real=Δ, action=Φ·c>
[Figure: P sends PING to Q (transmission ≤ Δ); Q processes the ping (≤ Φ·c); Q returns ACK (transmission ≤ Δ)]
◊P – Eventual Strong Accuracy
[Figure: ping-ack exchange between P and Q, split into four legs]
              Transmit Ping   Process Ping   Transmit Ack   Process Ack
Real Time     Δ               Unbounded      Δ              Unbounded
Action Time   Unbounded       Φ·c            Unbounded      c
Bichronal     max(Δ, Φ·c)     max(Δ, Φ·c)    max(Δ, Φ·c)    max(Δ, c)
Take Home Lessons
• Diagnosed the celeration problem
  • Many existing ◊P implementations are actually broken
• Defined bichronal clocks
  • Effective timeout mechanism for celerating environments
• Implemented celeration-immune ◊P
  • Ping-ack implementation based on bichronal clocks
• Practical advantages
  • Performance: Reduces ◊P mistakes during system volatility
  • Portable: Easy to incorporate into existing implementations
Why ◊P ?
• Strong enough to implement
  • Fair distributed schedulers [SP2008], [PSS2008]
  • Quiescent reliable communication [ACT2004]
• In fact, it is the weakest such failure detector!
  • For fair distributed schedulers [SPW2009]
  • For quiescent reliable communication [ACT2004]
Checkpoint
Empirical Distributed Systems
Eventually Perfect Failure Detector ◊P
Distributed Fair Scheduler
Reliable Channels
ParSync Environment
Celeration [IPDPS 2009]
Message Loss [ISPA 2007]
Crash Quiescence [DISC 2009]
[ACT 2004]
Dining Philosophers As Schedulers
• Arbitrary graph topology
  • Nodes = processes (diners)
  • Edges = potential conflicts
• Diners cycle among three states: Thinking, Hungry, Eating
• Constraints
  – Thinking may last forever
  – Eating must be finite for correct diners
Dining Specifications
• Wait Freedom
  • Progress despite crashes
• Eventual Weak Exclusion ◊WX
  • Eventually live neighbors never eat together
• Eventual fairness
  • Eventually a hungry process is never overtaken more than k times
• ◊P is sufficient to implement wait-free dining under ◊WX [PSS2008] [SP2007]
• But is ◊P necessary? In other words, is it the weakest?
Utility of ◊WX: Duty Cycle Scheduling
Methodology To Show ◊P Is The Weakest Failure Detector
• Based on definitions in [CT96] and [CHT96]
• Suppose a strictly weaker failure detector D could solve dining under ◊WX
• If we can implement ◊P using a black-box solution to dining under ◊WX
• Then we can use D to implement ◊P
• Contradiction!
• Hence, ◊P is the weakest.
Construction
• Given two processes X and Y
• Y has two witnesses detecting X's liveness
• Subject and witness compete in a dining instance
• Each subject-witness pair throttles the other
• Careful hand-off of eating sessions
[Figure: subjects S0 and S1 at X, witnesses W0 and W1 at Y, connected by dining instances Dining0 and Dining1]
Witness Actions
• Wi becomes hungry
• Upon eating
  – Trusts X if trust bit is true
  – Else, suspects X
  – Resets trust bit to false
  – Triggers W1-i to become hungry
  – Exits eating
• Upon receiving a ping from Sx
  – Set trust bit to true
  – Send an ack to Sx
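The witness actions can be sketched as two event handlers. This is a sketch under assumptions: the dining instance is a black box that calls `on_eat` when the witness starts eating, `trigger_hungry` and `send_ack` are hypothetical callbacks standing in for the scheduler and the channel, and the initial output (suspect X) is chosen here for illustration.

```python
class Witness:
    """Witness W_i at process Y, monitoring process X."""

    def __init__(self, index, trigger_hungry, send_ack):
        self.index = index            # i in W_i, either 0 or 1
        self.trust_bit = False
        self.suspects_x = True        # assumed initial output
        self._trigger_hungry = trigger_hungry
        self._send_ack = send_ack

    def on_eat(self):
        # Trust X if the trust bit is set, otherwise suspect X.
        self.suspects_x = not self.trust_bit
        # Reset the trust bit for the next eating session.
        self.trust_bit = False
        # Hand off: make the other witness W_{1-i} hungry.
        self._trigger_hungry(1 - self.index)
        # Returning from this handler models exiting eating.

    def on_ping(self, subject_index):
        # A ping from subject S_x shows that X was recently alive.
        self.trust_bit = True
        self._send_ack(subject_index)


events = []
w0 = Witness(0,
             trigger_hungry=lambda i: events.append(("hungry", i)),
             send_ack=lambda x: events.append(("ack", x)))
w0.on_ping(0)
w0.on_eat()
print(w0.suspects_x)  # False: a ping arrived since the last eating session
w0.on_eat()
print(w0.suspects_x)  # True: no ping arrived since the last eating session
```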
[Figure: subjects S0 and S1 at X, witnesses W0 and W1 at Y, with dining instances DX0 and DX1. Legend: Thinking, Hungry, Eating]
Subject Actions
• Sx becomes hungry
• Upon eating
  – Waits until S1-x exits
  – Sends ping to Wx
  – Waits for ack
  – Upon receiving ack
  – Triggers S1-x to become hungry
  – Waits until S1-x is eating
  – Exits eating
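[Figure: subjects S0 and S1 at X, witnesses W0 and W1 at Y, with dining instances DX0 and DX1 and PING/ACK messages between each subject and its witness. Legend: Thinking, Hungry, Eating]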
Witness Actions – Timeline
[Figure: timelines of Y.w0 and Y.w1 with eating sessions numbered 1 to 4 and "Enable" arrows between the two witnesses. Legend: Thinking, Hungry, Eating]
Subject Action - Timeline
[Figure: timelines of X.s0 and X.s1 with eating sessions numbered 1 to 6, exchanging PING and ACK messages with Y.w0 and Y.w1. Legend: Thinking, Hungry, Eating, Trigger]
Eventual Strong Accuracy
[Figure: timelines of X.s0 and X.s1 with eating sessions numbered 1 to 6, exchanging PING and ACK messages with Y.w0 and Y.w1; both witness timelines are annotated "Trust X". Legend: Thinking, Hungry, Eating, Trigger]
Strong Completeness
[Figure: timelines of X.s0, X.s1, Y.w0, and Y.w1 with PING and ACK messages; the witness timelines are annotated "Trust X", "Suspect X", ..., followed by a "Crash!" event at X and "Suspect X" on both witnesses. Legend: Thinking, Hungry, Eating, Trigger]
Take Home Lesson
• ◊P is the ‘weakest’ failure detector to implement wait-free dining under ◊WX
• ◊P and wait-free dining under ◊WX encapsulate equivalent synchronism in the underlying system
Checkpoint
Empirical Distributed Systems
Eventually Perfect Failure Detector ◊P
Dining under ◊WX
Reliable Channels
ParSync Environment
Celeration [IPDPS 2009]
Message Loss [ISPA 2007]
Crash Quiescence [DISC 2009]
[ACT 2004]
Necessity [SPAA 2009]
Sufficiency [ICDCN 2008]
Next Steps
• Implement ‘timely’ systems using ◊P
• Challenge
  • Failure detectors provide no real-time guarantees!
    • “In Search of Lost Time” [CBHW2008]
  • So, failure detectors do not encapsulate temporal guarantees
    • Synchronous System and P [CBGS2000]
  • Then, what do failure detectors encapsulate?
    • Our assertion: Fairness
  • Theta [WS2009]
  • Asynchronous Bounded Cycle [RS2008]
Big Picture
Empirical Distributed Systems
Eventually Perfect Failure Detector ◊P
Dining under ◊WX
Reliable Channels
ParSync Environment
Celeration [IPDPS 2009]
Message Loss [ISPA 2007]
Crash Quiescence [DISC 2009]
[ACT 2004]
Necessity [SPAA 2009]
Sufficiency [ICDCN 2008]
[On Going]
Future Work
• Extend the results to:
  ‒ Other models of partial synchrony
    • Abstract MAC Layer
  ‒ Other fault models
    • Crash-recovery faults
    • Transient faults
  ‒ Other kinds of networks
    • VANETs, MANETs
    • Anonymous Networks