CSC 536 Lecture 7
description
Transcript of CSC 536 Lecture 7
![Page 1: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/1.jpg)
CSC 536 Lecture 7
![Page 2: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/2.jpg)
Outline
Fault toleranceReliable client-server communicationReliable group communicationDistributed commit
![Page 3: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/3.jpg)
Reliable client-server communication
![Page 4: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/4.jpg)
Process-to-process communication
Reliable process-to-process communications is achieved using the Transmission Control Protocol (TCP)
TCP masks omission failures using acknowledgments and retransmissionsCompletely hidden to client and serverNetwork crash failures are not masked
![Page 5: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/5.jpg)
RPC/RMI Semantics inthe Presence of Failures
Five different classes of failures that can occur in RPC/RMI systems:
The client is unable to locate the server.
The request message from the client to the server is lost.
The server crashes after receiving a request.The reply message from the server to the client is lost.
The client crashes after sending a request.
![Page 6: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/6.jpg)
RPC/RMI Semantics inthe Presence of Failures
Five different classes of failures that can occur in RPC/RMI systems:
The client is unable to locate the server.Throw exception
The request message from the client to the server is lost.Resend request
The server crashes after receiving a request.The reply message from the server to the client is lost.
Assign each request a unique id and have the server keep track or request ids
The client crashes after sending a request.What to do with orphaned RPC/RMIs?
![Page 7: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/7.jpg)
Server Crashes
A server in client-server communication
(a) The normal case. (b) Crash after execution. (c) Crash before execution.
![Page 8: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/8.jpg)
What should the client do?
Try again: at least once semantics
Report back failure: at most once semantics
Want: exactly once semantics
Impossible to do in generalExample: Server is a print server
![Page 9: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/9.jpg)
Print server crash
Three events that can happen at the print server:
1. Send the completion message (M), 2. Print the text (P), 3. Crash (C).
Note: M could be sent by the server just before it sends the file to be printed to the printer or just after
![Page 10: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/10.jpg)
Server Crashes
These events can occur in six different orderings:1. M →P →C: A crash occurs after sending the completion
message and printing the text.2. M →C (→P): A crash happens after sending the completion
message, but before the text could be printed.3. P →M →C: A crash occurs after sending the completion
message and printing the text.4. P→C(→M): The text printed, after which a crash occurs
before the completion message could be sent.5. C (→P →M): A crash happens before the server could do
anything.6. C (→M →P): A crash happens before the server could do
anything.
![Page 11: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/11.jpg)
Client strategies
If the server crashes and subsequently recovers, it will announce to all clients that it is running again
The client does not know whether its request to print some text has been carried out
Strategies for the client:Never reissue a requestAlways reissue a requestReissue a request only if client did not receive a completion messageReissue a request only if client did receive a completion message
![Page 12: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/12.jpg)
Server Crashes
Different combinations of client and server strategies in the presence of server crashes.Note that exactly once semantics is not achievable under any client/server strategy.
![Page 13: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/13.jpg)
Akka client-server communication
At most once semantics
Developer is left the job of implementing any additional guarantees required by the application
![Page 14: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/14.jpg)
Reliable group communication
![Page 15: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/15.jpg)
Reliable group communication
Process replication helps in fault tolerance but gives rise to a new problem:
How to construct a reliable multicast service, one that provides a guarantee that all processes in a group receive a message?
A simple solution that does not scale:Use multiple reliable point-to-point channels
Other problems that we will consider later:Process failuresProcesses join and leave groups
We assume that unreliable multicasting is availableWe assume processes are reliable, for now
![Page 16: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/16.jpg)
Implementing reliable multicasting on top of unreliable multicasting
Solution attempt 1:(Unreliably) multicast message to process groupA process acknowledges receipt with an ack messageResend message if no ack received from one or more processes
Problem:
![Page 17: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/17.jpg)
Implementing reliable multicasting on top of unreliable multicasting
Solution attempt 1:(Unreliably) multicast message to process groupA process acknowledges receipt with an ack messageResend message if no ack received from one or more processes
Problem:Sender needs to process all the acks which may be hugeSolution does not scale
![Page 18: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/18.jpg)
Implementing reliable multicasting on top of unreliable multicasting
Solution attempt 2:(Unreliably) multicast numbered message to process groupA receiving process replies with a feedback message only to inform that it is missing a messageResend missing message to process
Problems with solution attempt:Sender must keep a log of all message it multicast forever, andThe number of feedback message may still be huge
![Page 19: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/19.jpg)
Implementing reliable multicasting on top of unreliable multicasting
We look at two more solutionsThe key issue is the reduction of feedback messages!Also care about garbage collection
![Page 20: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/20.jpg)
Nonhierarchical Feedback Control
Feedback suppressionSeveral receivers have scheduled a request for retransmission, but the first retransmission request leads to the suppression of others.
![Page 21: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/21.jpg)
Hierarchical Feedback Control
Each local coordinator forwards the message to its childrenA local coordinator handles retransmission requestsHow to construct the tree dynamically?
![Page 22: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/22.jpg)
Hierarchical Feedback Control
Each local coordinator forwards the message to its children.A local coordinator handles retransmission requests.How to construct the tree dynamically?
Use the multicast tree in the underlying network
![Page 23: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/23.jpg)
Atomic Multicast
We assume now that processes could fail, while multicast communication is reliable
We focus on atomic (reliable) multicasting in the following sense:
A multicast protocol is atomic if every message multicast to group view G is delivered to each non-faulty process in G
if the sender crashes during the multicast, the message is delivered to all non-faulty processes or none
Group view: the view on the set of processes contained in the group which sender has at the time message M was multicastAtomic = all or nothing
![Page 24: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/24.jpg)
Virtual Synchrony
The principle of virtual synchronous multicast.
![Page 25: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/25.jpg)
The model for atomic multicasting
The logical organization of a distributed system to distinguish between message receipt and message delivery
![Page 26: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/26.jpg)
Implementing atomic multicasting
How to implement atomic multicasting using reliable multicasting
Reliable multicasting could be simply sending a separate message to each member using TCP or use one of the methods described in slides 19-22.
Note 1: Sender could fail before sending all messagesNote 2: A message is delivered to the application only when all non-faulty processes have received it (at which point the message is referred to as stable)Note 3: A message is therefore buffered at a local process until it can be delivered
![Page 27: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/27.jpg)
Implementing atomic multicasting
On initialization, for every process:Received = {}
To A-multicast message m to group G:R-multicast message m to group G
When process q receives an R-multicast message m:if message m not in Received set:
Add message m to Received setR-multicast message m to group GA-deliver message m
![Page 28: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/28.jpg)
Implementing atomic multicasting
We must especially guarantee that all messages sent to view G are delivered to all non-faulty processes in G before the next group membership change takes place
![Page 29: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/29.jpg)
Implementing Virtual Synchrony
a) Process 4 notices that process 7 has crashed, sends a view changeb) Process 6 sends out all its unstable messages, followed by a flush messagec) Process 6 installs the new view when it has received a flush message from
everyone else
![Page 30: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/30.jpg)
Distributed Commit
![Page 31: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/31.jpg)
Distributed Commit
The problem is to insure that an operation is performed by all processes of a group, or none at allSafety property:
The same decision is made by all non-faulty processesDone in a durable (persistent) way so faulty processes can learn committed value when they recover
Liveness property:Eventually a decision is made
Examples:Atomic multicastingDistributed transactions
![Page 32: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/32.jpg)
The commit problem
An initiating process communicates with a group of actors, who vote
Initiator is often a group member, tooIdeally, if all vote to commit we perform the actionIf any votes to abort, none does so
Assume synchronous model
To handle asynchronous model, introduce timeoutsIf timeout occurs the leader can presume that a member wants to abort. Called the presumed abort assumption.
![Page 33: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/33.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
![Page 34: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/34.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
Vote?
![Page 35: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/35.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
Vote?
All vote “commit”
![Page 36: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/36.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
Vote?
All vote “commit”
Commit!
![Page 37: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/37.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
Vote?
All vote “commit”
Commit!
Phase 1
![Page 38: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/38.jpg)
Two-Phase Commit Protocol
2PC initiator
pqrst
Vote?
All vote “commit”
Commit!
Phase 1 Phase 2
![Page 39: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/39.jpg)
Fault tolerance
If no failures, protocol works
However, any member can abort any time, even before the protocol runs
If failure, we can separate this into three casesGroup member fails; initiator remains healthyInitiator fails; group members remain healthyBoth initiator and group member fail
We also need to handleHandling recovery of a failed member
![Page 40: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/40.jpg)
Fault tolerance
Some cases are pretty easy
E.g. if a member fails before voting we just treat it as an abort
If a member fails after voting commit, we assume that when it recovers it will finish up the commit protocol and perform whatever action was decided
All members keep a log of their actions in persistent memory.
Hard cases involve crash of initiator
![Page 41: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/41.jpg)
Two-Phase Commit
a) The finite state machine for the coordinator in 2PC.b) The finite state machine for a participant.
![Page 42: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/42.jpg)
Two-Phase Commit
If failure at coordinator, note that participants may block in one of two states: Init or Ready
Use timeouts
If participant blocked in INIT state:Abort
If participant P blocked in READY:Wait for coordinator or contact another participant Q
![Page 43: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/43.jpg)
Two-Phase Commit
Actions taken by a participant P when residing in state READY and having contacted another participant Q.
If all other participants are READY, P must block!
Contact another participantREADY
Make transition to ABORTINIT
Make transition to ABORTABORT
Make transition to COMMITCOMMIT
Action by PState of Q
![Page 44: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/44.jpg)
As a time-line picture
2PC initiator
qprst
Vote?
All vote “commit”
Commit!
Phase 1 Phase 2
![Page 45: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/45.jpg)
Why do we get stuck?
If process q voted “commit”, the coordinator may have committed the protocol
And q may have learned the outcomePerhaps it transferred $10M from a bank account…So we want to be consistent with that
If q voted “abort”, the protocol must abortAnd in this case we can’t risk committing
Important: In this situation we must block; we cannot restart the transaction, say.
![Page 46: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/46.jpg)
Two-Phase Commit
Outline of the steps taken by the coordinator in a two phase commit protocol
actions by coordinator:
while START _2PC to local log;multicast VOTE_REQUEST to all participants;while not all votes have been collected { wait for any incoming vote; if timeout { while GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants; exit; } record vote;}if all participants sent VOTE_COMMIT and coordinator votes COMMIT{ write GLOBAL_COMMIT to local log; multicast GLOBAL_COMMIT to all participants;} else { write GLOBAL_ABORT to local log; multicast GLOBAL_ABORT to all participants;}
![Page 47: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/47.jpg)
Two-Phase Commit
Steps taken by participant process in 2PC.
actions by participant:write INIT to local log;wait for VOTE_REQUEST from coordinator;if timeout { write VOTE_ABORT to local log; exit;}if participant votes COMMIT { write VOTE_COMMIT to local log; send VOTE_COMMIT to coordinator; wait for DECISION from coordinator; if timeout { multicast DECISION_REQUEST to other participants; wait until DECISION is received; /* remain blocked */ write DECISION to local log; } if DECISION == GLOBAL_COMMIT write GLOBAL_COMMIT to local log; else if DECISION == GLOBAL_ABORT write GLOBAL_ABORT to local log;} else { write VOTE_ABORT to local log; send VOTE ABORT to coordinator;}
![Page 48: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/48.jpg)
Two-Phase Commit
Steps taken for handling incoming decision requests.
actions for handling decision requests: /* executed by separate thread */
while true { wait until any incoming DECISION_REQUEST is received; /* remain blocked */ read most recently recorded STATE from the local log; if STATE == GLOBAL_COMMIT send GLOBAL_COMMIT to requesting participant; else if STATE == INIT or STATE == GLOBAL_ABORT send GLOBAL_ABORT to requesting participant; else skip; /* participant remains blocked */
![Page 49: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/49.jpg)
Three-phase commit protocol
Unlike the Two Phase Commit Protocol, it is non-blocking
Idea is to add an extra PRECOMMIT (“prepared to commit”) stage
![Page 50: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/50.jpg)
3 phase commit
3PC initiator
pqrst
Vote?
All vote “commit”
Phase 1Prepare to commit
All say “ok”
Phase 2
They commit
Commit!
Phase 3
![Page 51: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/51.jpg)
Three-Phase Commit
a) Finite state machine for the coordinator in 3PCb) Finite state machine for a participant
![Page 52: CSC 536 Lecture 7](https://reader030.fdocuments.us/reader030/viewer/2022033105/5681626a550346895dd2db1a/html5/thumbnails/52.jpg)
Why 3 phase commit?
A process can deduce the outcomes when this protocol is used
Main insight?Nobody can enter the commit state unless all are first in the precommit stateMakes it possible to determine the state, then push the protocol forward (or back)