Chapter 8 Coordination. Topics Election algorithms Mutual exclusion Deadlock Transaction.

Date posted: 19-Dec-2015

Topics
• Election algorithms
• Mutual exclusion
• Deadlock
• Transactions

Election Algorithms
• An election algorithm is the way the nodes in a distributed system (DS) elect a new coordinator when the old one has failed or been cut off from the network.
• In the following algorithms, each processor (node) has a unique ID.
• Communications are reliable (messages are not dropped or corrupted).

Requirements
• Safety: each process Pi has coordinator = null or coordinator = P, where P is a live process (in the algorithms below, the live process with the highest ID).
• Liveness: each process Pi eventually has coordinator ≠ null, or it has failed.

The Bully Algorithm (Garcia-Molina)
• “Node with highest ID bullies his way into leadership.”
• When a process P notices that the coordinator has failed, it holds an election:
  1. P sends an ELECTION message (E-message) to all processes with higher numbers.
  2. If no one responds, P wins the election and becomes coordinator.
  3. If one of the higher-ups answers, say Q, it takes over. P’s job is done.
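The three steps can be traced with a small simulation. This is a minimal sketch rather than the full protocol: the name `bully_election`, the `all_ids` and `alive` parameters, and the message counting are illustrative assumptions, the initiator is assumed to be alive, and a missing answer stands in for a timeout.

```python
def bully_election(initiator, all_ids, alive):
    """Simulate one bully election and return (coordinator, messages_sent).

    all_ids: every process ID in the system (unique integers).
    alive:   the IDs that are currently up (the initiator must be one of them).
    """
    alive = set(alive)
    messages = 0
    pending = [initiator]   # processes that still have to hold an election
    started = set()         # assumption: each process holds at most one election
    coordinator = None
    while pending:
        p = pending.pop(0)
        if p in started:
            continue
        started.add(p)
        higher = [q for q in all_ids if q > p]
        messages += len(higher)                 # step 1: ELECTION to every higher ID
        responders = [q for q in higher if q in alive]
        messages += len(responders)             # each live higher-up answers
        if not responders:
            coordinator = p                     # step 2: nobody answered, p wins
            messages += len(alive - {p})        # p tells everyone else
        else:
            pending.extend(responders)          # step 3: the higher-ups take over
    return coordinator, messages


# Hypothetical setup: IDs 0..7, the old coordinator 7 has crashed, 4 notices.
coordinator, messages = bully_election(4, range(8), {0, 1, 2, 3, 4, 5, 6})
```

With this setup the highest live ID, 6, ends up as coordinator, whichever live process starts the election.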

An Example
a) Process 4 holds an election.
b) Processes 5 and 6 respond, telling 4 to stop.
c) Now 5 and 6 each hold an election.
d) Process 6 tells 5 to stop.
e) Process 6 wins and tells everyone.

The Cost
• In a network of N nodes, assume the coordinator with ID N fails.
• If the process with ID (N-1) starts an election, the cost is O(N) messages.
• If the lowest-numbered node starts an election, the cost is O(N^2) messages.
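A rough count shows where the two bounds come from. It assumes IDs 1..N, that node N has crashed, and that every live process holds its election exactly once; only ELECTION messages are counted in the worst case, since answers and the final announcement add terms of the same or lower order.

```latex
% Worst case: node 1 starts; process i sends ELECTION to i+1, ..., N
\[
\sum_{i=1}^{N-1} (N - i) \;=\; \frac{N(N-1)}{2} \;=\; O(N^2)
\]
% Best case: node N-1 starts; one unanswered ELECTION to N,
% then an announcement to the other N-2 live processes
\[
1 + (N - 2) \;=\; N - 1 \;=\; O(N)
\]
```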

A Ring Election Algorithm
• Nodes are physically or logically organized in a ring. Nodes know their successors.
• Node states are: Normal, Election, Leader.
• Any node that notices that the leader is not functioning changes its state to Election, creates an election message containing its ID, and sends it to its clockwise neighbor.


An Example

A Ring Election Algorithm (2)
• When a node receives an election message:
  • If the message does not contain its own ID, it adds its ID to the message and sends it to its successor.
  • If the message contains its own ID, it sends a COORDINATOR message naming the list member with the highest number as the coordinator. This message circulates once.
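The two rules can be sketched as a simulation of a single election started by one node. The function name, the `ring`/`alive` parameters, and the choice to skip crashed successors are assumptions made for illustration; the slides only state that nodes know their successors.

```python
def ring_election(ring, initiator, alive):
    """Return (coordinator, messages_sent) for one ring election.

    ring:      node IDs listed in clockwise order.
    initiator: the live node that noticed the leader is not functioning.
    alive:     set of IDs that are currently up.
    """
    n = len(ring)

    def successor(i):
        # Assumption: a sender skips over crashed nodes to the next live one.
        j = (i + 1) % n
        while ring[j] not in alive:
            j = (j + 1) % n
        return j

    members = [initiator]              # ID list carried in the election message
    i = successor(ring.index(initiator))
    messages = 1
    while ring[i] != initiator:        # message does not yet contain the receiver's ID
        members.append(ring[i])        # add own ID, pass to the successor
        i = successor(i)
        messages += 1
    coordinator = max(members)         # message is back at the initiator
    messages += len(members)           # COORDINATOR message circulates once
    return coordinator, messages


# Hypothetical ring 0..7 in which node 7 (the old leader) is down and 2 notices.
coordinator, messages = ring_election(list(range(8)), 2, {0, 1, 2, 3, 4, 5, 6})
```

With m live nodes, the election message and the COORDINATOR message each take m hops, which is the 2N figure for a single initiator.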

An Example

Complexity
• In the best case, only one node starts an election message, so the number of messages is 2N.
• In the worst case, N nodes start an election message, resulting in O(N^2) messages.
Improvements
• Drop election messages arriving in less than the time a message takes to traverse the ring.
• Does it work?

LCR Ring Election
• Each node sends a message with its ID around the ring.
• When a process receives an incoming message, it compares the ID with its own:
  • if the incoming ID is greater than its own, it passes it to the next node;
  • if it is less than its own, it discards it;
  • if it is equal to its own, it declares itself leader.
[Figure: ring of nodes 3, 5, and 0 sending Elect 3, Elect 5, and Elect 0]
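The comparison rule can be tried out with a round-based simulation that also counts messages. This is a sketch under the assumption of synchronous rounds (every pending message moves one hop per round); the name `lcr_election` and the list-based ring are illustrative.

```python
def lcr_election(ring):
    """Return (leader, messages_sent).

    ring: unique node IDs listed in the direction in which messages travel.
    """
    n = len(ring)
    outgoing = list(ring)            # round 1: every node sends its own ID
    messages = 0
    leader = None
    while leader is None:
        arriving = [None] * n
        for i, uid in enumerate(outgoing):
            if uid is not None:
                arriving[(i + 1) % n] = uid    # deliver to the next node
                messages += 1
        outgoing = [None] * n
        for i, uid in enumerate(arriving):
            if uid is None:
                continue
            if uid > ring[i]:
                outgoing[i] = uid              # greater: pass it on
            elif uid == ring[i]:
                leader = uid                   # own ID came back: leader
            # smaller: discard
    return leader, messages


best = lcr_election([0, 1, 2, 3])    # IDs increase along the direction of travel
worst = lcr_election([3, 2, 1, 0])   # IDs decrease along the direction of travel
```

On this four-node ring the first ordering needs 7 messages and the second 10, which illustrates how the direction of travel relative to the ID order separates the best case from the worst.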

Complexity
• If messages are passed clockwise… only one survives after the first round.
• If messages are passed counter-clockwise...
• Best case O(N), worst case O(N^2).
[Figure: ring of nodes 0, 1, 2, 3 sending Elect 0, Elect 1, Elect 2, Elect 3]

HS (Hirschberg-Sinclair) Ring Election (1)
• Motivation: O(N^2) is a lot of messages. Improve it to O(N log N).
• Assumptions:
  • The ring size can be unknown.
  • The communications must be bidirectional.
  • All nodes start more or less at the same time.
• Each node operates in phases and sends out tokens. The tokens carry hop counts and direction flags in addition to the ID of the sender.
[Figure: node 3 sending two tokens with ID=3 and 2 hops, one clockwise and one counter-clockwise]

HS Ring Election (2)
• Phases are numbered 0, 1, 2, 3, …, log2 N.
• In each phase k, node j sends out tokens uj containing its ID in both directions.
• The tokens travel 2^k hops (only that distance) and then return to their origin j.
• If both tokens make it back, process j continues with the next phase (increments k).
• If either token fails to make it back, process j simply waits to be told the result of the election.
[Figure: node 3 with outbound and inbound tokens]

HS Ring Election (3)
• All processes always relay inbound tokens.
• If a process i receives a token uj going in the outbound direction, it compares the token’s ID with its own:
  • If its own ID is larger, it simply discards the token.
  • If its own ID is smaller, it relays the token as requested.
  • If its own ID is equal to the token ID, it has received its own token in the outbound direction, so the token has gone clear around the ring and the process declares itself leader.
[Figure: node 4 receiving a clockwise token with ID=3 and 2 hops]

Complexity
• Communications complexity: in the first phase, every process sends out 2 tokens, and they go one hop and return. This is a total of 4N messages for the tokens to go out and return.
• In phase k, where k > 0, a node sends out tokens only if it was not overruled in the previous phase, that is, by a process within a distance of 2^(k-1) in either direction.
• This implies that within any group of 2^(k-1) + 1 consecutive nodes, at most one goes on to send out tokens in phase k.
• This limits the message complexity to O(N log N).
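The step from "at most one sender per group" to O(N log N) can be written out as follows. The symbol M_k for the number of messages in phase k is notation introduced here, and the bound is a sketch of the usual counting argument under the assumptions above: each sender issues 2 tokens that travel at most 2^k hops out and 2^k hops back.

```latex
\begin{align*}
M_0 &= 4N, \\
M_k &\le \left\lfloor \frac{N}{2^{k-1}+1} \right\rfloor \cdot 4 \cdot 2^{k}
     \;<\; 8N \qquad (k \ge 1), \\
\sum_{k=0}^{\lceil \log_2 N \rceil} M_k
    &\le 8N \bigl(1 + \lceil \log_2 N \rceil\bigr) \;=\; O(N \log N).
\end{align*}
```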

Mutual Exclusion in DS
• Mutual exclusion is needed for restricting access to a shared resource.
• We use semaphores, monitors, and similar constructs to enforce mutual exclusion on a centralized system.
• We need the same capabilities on DS.
• As in the one-processor case, we are interested in safety (mutual exclusion), progress, and bounded waiting (fairness).

Solutions
• Centralized lock manager
• Token-passing lock manager
• Distributed lock manager
  • Ricart/Agrawala algorithm
  • Voting
  • Quorum

A Centralized Algorithm
a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted.
b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply.
c) When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
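Steps a) to c) map onto a coordinator that keeps the current holder and a FIFO queue of waiting requests. The class and method names below are hypothetical, and message passing is reduced to method calls: returning `False` from `request` models "the coordinator does not reply".

```python
from collections import deque


class Coordinator:
    """Centralized lock manager for a single critical region (sketch)."""

    def __init__(self):
        self.holder = None        # process currently in the critical region
        self.waiting = deque()    # requests that have not been answered yet

    def request(self, pid):
        """Return True if permission is granted now; otherwise queue the request."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)  # no reply: the requester stays blocked
        return False

    def release(self, pid):
        """Called when the holder exits; returns the next process granted, if any."""
        if pid != self.holder:
            raise ValueError("only the current holder can release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coordinator = Coordinator()
granted_1 = coordinator.request(1)    # a) process 1 is granted permission
granted_2 = coordinator.request(2)    # b) process 2 is queued, no reply
next_holder = coordinator.release(1)  # c) the coordinator now replies to 2
```

One entry costs a request, a grant, and a release, which matches the 3 messages per entry/exit usually quoted for the centralized scheme.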


Problems with Centralized Locking?

Other issues?

The Token Ring Algorithm
• Assumptions:
  • Processes are ordered in a ring.
  • Communications are reliable and can be limited to one direction.
  • The size of the ring can be unknown, and each process is only required to know its immediate neighbor.
• A single token circulates around the ring (in one direction only).
[Figure: ring of nodes 3, 5, and 0 with a circulating token]

Algorithm Details
• When a process has the token, it can enter the CR at most once. Then it must pass the token on.
• Only the process with the token can enter the CR, thus mutual exclusion is ensured.
• Bounded waiting holds since the token circulates.
• Liveness: as long as the process with the token doesn’t fail, progress is ensured.
• Global snapshots can be used if a lost token is suspected.

Problems with Token-Algorithm
1. How to distinguish whether the token is lost or is just being used for a very long time?
2. What happens if the token holder crashes for some time?
3. How to maintain a logical ring if a participant drops out of the system (voluntarily or by failure)?
4. How to identify and add new participants?
5. The token is perpetually passed around the ring even when none of the participants wants to enter its CS ⇒ unnecessary overhead consuming bandwidth.
6. The ring imposes an average delay of N/2 hops, limiting scalability.

Distributed Algorithm: Ricart and Agrawala Timestamp Algorithm
• Assumption: there is a total ordering of all events in the system (Lamport’s timestamps will provide this).
• Communications are reliable.
• Each process must maintain a queue for each critical region or resource if there is more than one resource to be shared.
[Figure: processes 0, 1, and 2 sharing a resource]

Ricart and Agrawala (2)
• When a process wants to enter the critical region or obtain a resource, it sends a request message carrying a Lamport timestamp and its ID, (t, pid), to all other processes.
• It can proceed to enter the CR when it gets an “OK” message from all other processes.
• When it is done with the CR, it sends an “OK” message to every process on its wait queue and removes them from the queue.

Ricart and Agrawala (3)
When a process P1 receives a request for the resource from process P2:
• If P1 is not in the CR and does not want the CR, it sends back an “OK” message.
• If P1 is currently in the CR, it does not reply, but queues P2’s request.
• If P1 wants to enter the CR but has not yet received all the permissions, it compares the timestamp in P2’s message with the one in the message that P1 sent out to request the CR. The lowest timestamp wins.
  • If TS(P1) < TS(P2), then P2’s message is put on the queue.
  • If TS(P1) > TS(P2), then P1 sends P2 an “OK” message.
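The three cases for an incoming request can be sketched as a small state machine. The class name, the state names RELEASED/WANTED/HELD, and the sample timestamps 8 and 12 are assumptions for illustration; timestamps are (t, pid) pairs, so ties on t are broken by process ID, which gives the total order the algorithm needs.

```python
class RAProcess:
    """One participant in Ricart and Agrawala's algorithm (sketch, no networking)."""

    def __init__(self, pid):
        self.pid = pid
        self.state = "RELEASED"   # RELEASED, WANTED, or HELD
        self.request_ts = None    # (t, pid) of our own outstanding request
        self.queue = []           # senders whose requests we have deferred

    def want(self, lamport_time):
        """Start requesting the CR; returns the timestamp to send to all others."""
        self.state = "WANTED"
        self.request_ts = (lamport_time, self.pid)
        return self.request_ts

    def on_request(self, ts, sender):
        """Handle another process's request: reply 'OK' or queue it."""
        in_cr = self.state == "HELD"
        we_win = self.state == "WANTED" and self.request_ts < ts
        if in_cr or we_win:
            self.queue.append(sender)
            return "QUEUED"
        return "OK"

    def enter(self):
        self.state = "HELD"       # call once an OK has arrived from everyone

    def leave(self):
        """Exit the CR; returns the processes that now receive an OK."""
        self.state, self.request_ts = "RELEASED", None
        released, self.queue = self.queue, []
        return released


p0, p1, p2 = RAProcess(0), RAProcess(1), RAProcess(2)
ts0, ts2 = p0.want(8), p2.want(12)          # hypothetical Lamport times
replies_to_0 = [p1.on_request(ts0, 0), p2.on_request(ts0, 0)]
replies_to_2 = [p0.on_request(ts2, 2), p1.on_request(ts2, 2)]
```

Here process 0 collects two OKs and can enter, while process 2 is queued at process 0 until `p0.leave()` releases it.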

Ricart and Agrawala (4)
a) Two processes want to enter the same critical region at the same moment.
b) Process 0 has the lowest timestamp, so it wins.
c) When process 0 is done, it sends an OK also, so 2 can now enter the critical region.

Analysis
• No tokens anymore.
• Cooperative voting to determine the sequence of CSs.
• Does not rely on an interconnection medium offering ordered messages.
• Serialization based on logical timestamps (total ordering).
• If a participant wants to enter its CS, it asks all others for permission and does not proceed until all others have agreed.
• If a participant gets a permission request and is not interested in its CS, it returns permission immediately to the requester.
• Message complexity: 2(N-1).
• The algorithm ensures:
  • mutual exclusion (no 2 have the lowest timestamp)
  • progress (someone has the lowest timestamp)
  • bounded waiting

Voting for Mutual Exclusion
• Potential problems: you must be sure you have more votes than any other process to enter the CR. If P1 has 4, P2 has 3, and P3 has 2, P1 has the most votes, but how does it know without communicating (costly) with the other contenders? Just having 4 votes is not enough: what if P1 has 4 and P2 has 5?
• Potential solution: require a simple majority to win. But 4 is not a majority of 9, so in this example no one can go. Worse: the processes are deadlocked.
• There must be a way to resolve this kind of deadlock.

Timestamp Resolution
• When a process makes a request, it attaches a Lamport timestamp. Voters will prefer candidates with the smaller timestamp.
• If voter V has voted for P1 and then receives a request for a vote from P2 with an earlier timestamp, V will try to retrieve its vote. V does so by sending an INQUIRE message to P1. If P1 has not yet received all the needed votes, it must relinquish V’s vote, in which case V gives its vote to P2. This avoids deadlock.
• When P1 is finished with the CR, it sends release messages to all its voters, so they can give their votes to new candidates.

Anti-quorum Resolution
• An anti-quorum is any set of nodes that has a non-empty intersection with all quorums.
• A voter votes YES to one process and NO to other processes seeking the same resource.
• When a process gets a quorum of YES votes, it proceeds to the CR.
• When it gets an anti-quorum of NO votes, it knows it will not get enough YES votes, so it “withdraws its candidacy” and releases its votes.
• After waiting a specified time, it tries again to gain enough votes.

Quorums
• Do we need to get a majority of votes, or is there some smaller set of votes that will do? Different nodes could have different voting districts as long as any two districts have a non-empty intersection.
• Quorums have the property that any 2 have a non-empty intersection.
• Simple majorities are quorums. Any 2 sets whose sizes are simple majorities must have at least one element in common.

Quorums (2)
• Grid quorum: arrange nodes in a logical grid (square). A quorum is all of a row and all of a column. Quorum size is 2*sqrt(n) - 1.
• Finite Projective Plane (Maekawa): if N=7, form coteries of 3.
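The intersection property can be checked by brute force for each construction. The helper names are illustrative, node numbering is an assumption, and `FANO_QUORUMS` is one possible set of 3-element quorums for N=7 (the lines of the Fano plane), not necessarily the labelling used in the lecture.

```python
import math
from itertools import combinations


def majority_quorums(nodes):
    """Every subset whose size is a simple majority of the nodes."""
    nodes = list(nodes)
    k = len(nodes) // 2 + 1
    return [set(c) for c in combinations(nodes, k)]


def grid_quorums(n):
    """One quorum per grid cell: the cell's whole row plus its whole column."""
    side = math.isqrt(n)
    if side * side != n:
        raise ValueError("n must be a perfect square")
    quorums = []
    for r in range(side):
        for c in range(side):
            row = {r * side + j for j in range(side)}
            col = {i * side + c for i in range(side)}
            quorums.append(row | col)       # size 2*side - 1
    return quorums


# Seven quorums of size 3 over nodes 1..7 (lines of the Fano plane).
FANO_QUORUMS = [{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6},
                {2, 5, 7}, {3, 4, 7}, {3, 5, 6}]


def pairwise_intersecting(quorums):
    """True if every two quorums share at least one node."""
    return all(a & b for a, b in combinations(quorums, 2))
```

For 9 nodes the grid quorums have 5 members each, the same size as a simple majority, but the grid size grows as 2*sqrt(n) - 1 rather than n/2 + 1, so the saving shows up for larger n.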

Comparison

Algorithm    | Messages per entry/exit | Delay before entry (in message times) | Problems
Centralized  | 3                       | 2                                     | Coordinator crash
Token ring   | 1 to ∞                  | 0 to n-1                              | Lost token, process crash
Distributed  | 2(n-1)                  | 2(n-1)                                | Crash of process
Voting       | 2(n-1)                  | 2(n-1)                                | Crash of process

Transaction Properties
• Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the consistency of the database.
• Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions.
• Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.

Example: Funds Transfer
• Transaction to transfer $50 from account A to account B:
  1. read(A)
  2. A := A - 50
  3. write(A)
  4. read(B)
  5. B := B + 50
  6. write(B)
• Consistency requirement: the sum of A and B is unchanged by the execution of the transaction.
• Atomicity requirement: if the transaction fails after step 3 and before step 6, the system ensures that its updates are not reflected in the database.
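The six steps and the atomicity requirement can be sketched with an in-memory dictionary standing in for the database. The names `transfer` and `InsufficientFunds`, the overdraft check used to force a failure after step 3, and the snapshot-based rollback are all assumptions for illustration, not how a real DBMS implements atomicity.

```python
class InsufficientFunds(Exception):
    """Hypothetical failure raised between step 3 and step 6."""


def transfer(db, src, dst, amount):
    """Move `amount` from db[src] to db[dst]; all six steps happen or none do."""
    snapshot = dict(db)              # old values, kept for rollback
    try:
        a = db[src]                  # 1. read(A)
        a = a - amount               # 2. A := A - amount
        db[src] = a                  # 3. write(A)
        if a < 0:
            raise InsufficientFunds(src)   # failure after step 3
        b = db[dst]                  # 4. read(B)
        b = b + amount               # 5. B := B + amount
        db[dst] = b                  # 6. write(B)
    except Exception:
        db.clear()
        db.update(snapshot)          # undo the partial update
        raise


accounts = {"A": 100, "B": 20}
transfer(accounts, "A", "B", 50)     # the sum A + B stays at 120
```

A second call such as `transfer(accounts, "A", "B", 80)` raises `InsufficientFunds` and leaves both balances as they were, so the partially applied write to A is never visible afterwards.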

Example: Funds Transfer (continued)
• Durability requirement: once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the DB must persist despite failures.
• Isolation requirement: if, between steps 3 and 6, another transaction is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be). Isolation can be ensured by running transactions serially.

The Transaction Model

Primitive          | Description
BEGIN_TRANSACTION  | Mark the start of a transaction
END_TRANSACTION    | Terminate the transaction and try to commit
ABORT_TRANSACTION  | Kill the transaction and restore the old values
READ               | Read data from a file, a table, or otherwise
WRITE              | Write data to a file, a table, or otherwise

Transaction Types
• Flat transactions: no partial results available.
• A nested transaction is a transaction that is logically decomposed into a hierarchy of sub-transactions. It allows partial results to be committed.
• A distributed transaction is a logically flat, indivisible transaction that operates on distributed data.


Distributed Transactions: Illustration

Private Workspace
a) The file index and disk blocks for a three-block file.
b) The situation after a transaction has modified block 0 and appended block 3.
c) After committing.
Q: What is the cost of copying data?

More Efficient Implementation
• Two common methods of implementation are write-ahead logs and before/after images.
• With write-ahead logs, the transactions act on the permanent workspace, but before they can make a change, a log record is written to stable storage with the transaction ID, the data item ID, and the old and new values.
• This log can then be used if the transaction aborts and the changes need to be rolled back.

Write-ahead Log

(a) A transaction:
    x = 0;
    y = 0;
    BEGIN_TRANSACTION;
      x = x + 1;
      y = y + 2;
      x = y * y;
    END_TRANSACTION;

(b) Log: [x = 0/1]
(c) Log: [x = 0/1] [y = 0/2]
(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]

a) A transaction. b) to d) The log before each statement is executed.
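The log in (b) to (d) can be reproduced with a few lines of code. This is a minimal sketch: the class name is hypothetical, a Python list stands in for stable storage, and the transaction ID is omitted because only one transaction runs.

```python
class WriteAheadLog:
    """Records (item, old value, new value) before each write is applied."""

    def __init__(self, store):
        self.store = store    # the permanent workspace (here: a dict)
        self.log = []         # stand-in for stable storage

    def write(self, item, new_value):
        self.log.append((item, self.store[item], new_value))  # log first
        self.store[item] = new_value                           # then update in place

    def rollback(self):
        """Abort: undo the logged writes, newest first."""
        while self.log:
            item, old_value, _new_value = self.log.pop()
            self.store[item] = old_value


store = {"x": 0, "y": 0}
wal = WriteAheadLog(store)
wal.write("x", store["x"] + 1)            # x = x + 1
wal.write("y", store["y"] + 2)            # y = y + 2
wal.write("x", store["y"] * store["y"])   # x = y * y
log_at_d = list(wal.log)                  # [("x", 0, 1), ("y", 0, 2), ("x", 1, 4)]
wal.rollback()                            # store is {"x": 0, "y": 0} again
```

Undoing the records in reverse order matters here: x is written twice, and only the oldest record holds its value from before the transaction.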

Before- and After-Images
• A before-image and an after-image are kept for each data item.
• When a data item is changed, the old value is written to the before-image and the new value to the after-image.
• Other transactions are not allowed to “see” the new value until the current transaction commits.
• The after-image is made permanent and durable once the transaction that wrote it commits.
• If the transaction aborts, the before-image is restored.

DBMS Organization
General organization of managers for handling transactions.

Levels of Consistency (SQL92)
• Serializable: the default.
• Repeatable read: only committed records can be read, and repeated reads of the same record must return the same value. However, a transaction may not be serializable.
• Read committed: only committed records can be read, but successive reads of a record may return different (but committed) values.
• Read uncommitted: even uncommitted records may be read (browse).

Serializability

(a) BEGIN_TRANSACTION  x = 0;  x = x + 1;  END_TRANSACTION
(b) BEGIN_TRANSACTION  x = 0;  x = x + 2;  END_TRANSACTION
(c) BEGIN_TRANSACTION  x = 0;  x = x + 3;  END_TRANSACTION

Schedule 1 | x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 2 | x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3; | Legal
Schedule 3 | x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3; | Illegal
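One way to see why schedule 3 is illegal is to execute each schedule and compare the final value of x with what the serial orders of (a), (b), and (c) can produce. The sketch below does exactly that; the tuple encoding of the statements is an assumption, and comparing final values is a simplified check that suits this example, not a general serializability test.

```python
from itertools import permutations

SET, ADD = "set", "add"

TRANSACTIONS = {
    "a": [(SET, 0), (ADD, 1)],
    "b": [(SET, 0), (ADD, 2)],
    "c": [(SET, 0), (ADD, 3)],
}


def run(schedule):
    """Execute the statements in order and return the final value of x."""
    x = None
    for op, value in schedule:
        x = value if op == SET else x + value
    return x


def serial_outcomes(transactions):
    """Final values of x over every serial order of the transactions."""
    return {
        run([step for name in order for step in transactions[name]])
        for order in permutations(transactions)
    }


def is_legal(schedule):
    return run(schedule) in serial_outcomes(TRANSACTIONS)


SCHEDULE_1 = [(SET, 0), (ADD, 1), (SET, 0), (ADD, 2), (SET, 0), (ADD, 3)]
SCHEDULE_2 = [(SET, 0), (SET, 0), (ADD, 1), (ADD, 2), (SET, 0), (ADD, 3)]
SCHEDULE_3 = [(SET, 0), (SET, 0), (ADD, 1), (SET, 0), (ADD, 2), (ADD, 3)]
```

Serial execution can only end with x equal to 1, 2, or 3. Schedules 1 and 2 end with x = 3, while schedule 3 ends with x = 5, a value no serial order produces.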


Two-Phase Locking (2PL)


Strict 2PL

Pessimistic Timestamp Ordering
• Target: enforce serializability.
• Every transaction gets a (Lamport, totally ordered) timestamp.
• Every data item has a read timestamp (read-ts), a write timestamp (write-ts), and a commit bit c.
• The commit bit c is true if and only if the most recent transaction to write to that item has committed.
• The scheduler maintains the item timestamps and checks to make sure the reads and writes are correct.

Read Too Late
[Timeline: T1 starts, T2 starts, T2 writes X, T1 reads X?]
• T1 tries to read X, but ts(T1) < write-ts(X), meaning X has been written to by a later transaction.
• T1 should not be allowed to read X because it was written by a transaction that occurs later in the serialization order (transactions are serialized by start time).
• Solution: T1 is aborted.

Write Too Late
[Timeline: T1 starts, T2 starts, T2 reads X, T1 writes X?]
• T1 tries to write X, but the read-ts (ts(T1) < read-ts(X)) indicates that some other transaction should have read the value about to be written.
• Solution: T1 is aborted.

Dirty Reads
[Timeline: T2 starts, T1 starts, T2 writes X, T1 reads X?, T2 aborts]
• T1 reads X that was last written by T2. The timestamps are properly ordered, but the commit bit c = false, so if T2 later aborts then T1 must abort.
• Solution: we can avoid cascading aborts by delaying T1’s read until T2 has committed (though this is not necessary to ensure serializability).

Thomas Write Rule
[Timeline: T1 starts, T2 starts, T2 writes X, T1 writes X?]
• T2 has written to X before T1. When T1 tries to write, the appropriate action is to do nothing.
• No other transaction T3 that should have read T1’s value of X got T2’s value instead, because it would have been aborted because of a too-late read.
• Future reads of X want T2’s value or a later value, not T1’s value.
• Solution: T1’s write can be skipped.

TS Ordering Rules
When the scheduler receives a read request from transaction T:
• If ts(T) >= write-ts(X) and c(X) is true, grant the request and set read-ts(X) to MAX{ts(T), read-ts(X)}.
• If ts(T) >= write-ts(X) and c(X) is false, delay T until c(X) becomes true or the transaction that wrote X aborts.
• If ts(T) < write-ts(X), abort T and restart it with a new timestamp.

TS Ordering Rules, continued
When the scheduler receives a write request from transaction T:
• If ts(T) >= read-ts(X) and ts(T) >= write-ts(X), grant the request, set write-ts(X) to ts(T), and set c(X) = false.
• If ts(T) >= read-ts(X) and ts(T) < write-ts(X), don’t do the operation but allow T to continue as if done (Thomas write rule).
• If ts(T) < read-ts(X), abort T and restart it with a new timestamp.
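The read and write rules translate almost line for line into two decision functions. The sketch below assumes integer timestamps and a hypothetical `Item` record holding read-ts, write-ts, and the commit bit; it returns the scheduler's decision as a string and leaves out the actual delaying, restarting, and commit handling.

```python
from dataclasses import dataclass


@dataclass
class Item:
    read_ts: int = 0          # read-ts(X)
    write_ts: int = 0         # write-ts(X)
    committed: bool = True    # commit bit c(X)


def on_read(ts, x):
    """Decision for a read of item x by a transaction with timestamp ts."""
    if ts < x.write_ts:
        return "ABORT"                    # read too late
    if not x.committed:
        return "DELAY"                    # wait for the writer to commit or abort
    x.read_ts = max(ts, x.read_ts)
    return "GRANT"


def on_write(ts, x):
    """Decision for a write of item x by a transaction with timestamp ts."""
    if ts < x.read_ts:
        return "ABORT"                    # write too late
    if ts < x.write_ts:
        return "SKIP"                     # Thomas write rule
    x.write_ts = ts
    x.committed = False
    return "GRANT"


x = Item()
decisions = [
    on_write(2, x),   # T2 writes X
    on_read(1, x),    # T1, which started earlier, reads too late
    on_read(3, x),    # T3 would read uncommitted data
]
```

With these hypothetical timestamps the three calls yield GRANT, ABORT, and DELAY, matching the "Read Too Late" and "Dirty Reads" cases.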

Optimistic Timestamp Ordering
• In any optimistic concurrency control, each transaction does its writes to a private workspace until completion of a validation phase.
• In the validation phase, the scheduler validates the transaction by comparing its read set and write set with those of other transactions.
• After validation, the write set values are written to the database and the transaction commits.
• Validation is frequently done with the help of timestamps.

Two-Phase Commit (2PC)
• When several databases take part in a single transaction, a protocol called Two-Phase Commit is used.
• Each database is assumed to have its own local “resource manager”.
• A single system component called the Coordinator controls the whole process.

Steps
• Phase 1:
  • Coordinator sends a VOTE_REQUEST message.
  • Clients return VOTE_COMMIT or VOTE_ABORT.
• Phase 2:
  • Coordinator collects all votes and sends GLOBAL_COMMIT or GLOBAL_ABORT.
  • Each client commits or aborts.
• Important factor: time-out.
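The two phases can be sketched with method calls standing in for messages. The class `Participant` (the "client" above), its state names, and the convention that a missing vote (`None`) models a time-out are assumptions for illustration; logging and recovery after a crash are left out.

```python
class Participant:
    """A client/resource manager taking part in the transaction (sketch)."""

    def __init__(self, name, can_commit=True, responds=True):
        self.name = name
        self.can_commit = can_commit
        self.responds = responds      # False models a vote that never arrives
        self.state = "INIT"

    def on_vote_request(self):
        if not self.responds:
            return None               # the coordinator's time-out fires
        if self.can_commit:
            self.state = "READY"
            return "VOTE_COMMIT"
        self.state = "ABORT"
        return "VOTE_ABORT"

    def on_decision(self, decision):
        self.state = "COMMIT" if decision == "GLOBAL_COMMIT" else "ABORT"


def two_phase_commit(participants):
    # Phase 1: send VOTE_REQUEST and collect the votes.
    votes = [p.on_vote_request() for p in participants]
    # Phase 2: commit only if every vote arrived and was VOTE_COMMIT.
    unanimous = all(v == "VOTE_COMMIT" for v in votes)
    decision = "GLOBAL_COMMIT" if unanimous else "GLOBAL_ABORT"
    for p in participants:
        p.on_decision(decision)
    return decision


outcome = two_phase_commit([Participant("db1"), Participant("db2")])
```

Treating a timed-out vote as an abort is the conservative choice: the coordinator may only send GLOBAL_COMMIT once it has seen a VOTE_COMMIT from every participant.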

2PC (2)
a) The finite state machine for the coordinator in 2PC.
b) The finite state machine for a participant.
1) What if a client fails?
2) What if the coordinator fails?