UNIT-II Distributed Synchronization 1 Mutual exclusion Mutual exclusion : makes sure that concurrent...

UNIT-II Distributed Synchronization 1


Page 1: UNIT-II Distributed Synchronization 1 Mutual exclusion Mutual exclusion : makes sure that concurrent process access shared resources or data in a serialized.

UNIT-II Distributed Synchronization


Mutual exclusion

Mutual exclusion: ensures that concurrent processes access shared resources or data in a serialized way.

If a process, say Pi, is executing in its critical section, then no other process can be executing in its critical section.

Examples:
– Updating a database
– Directory management
– Sending control signals to an I/O device


Mutual Exclusion Algorithms

• Non-token based:
  • A site/process can enter a critical section when an assertion (condition) becomes true.
  • The algorithm should ensure that the assertion is true at only one site/process at a time.

• Token based:
  • A unique token (a known, unique message) is shared among cooperating sites/processes.
  • The possessor of the token has access to the critical section.
  • Conditions such as loss of the token, crash of the token holder, and the possibility of multiple tokens must be handled.


General System Model

At any instant, a site may have several requests for critical section (CS), queued up, and serviced one at a time.

Site states: requesting CS, executing CS, idle (neither requesting nor executing CS).
– Requesting CS: the site is blocked until it is granted access and cannot make additional requests for the CS.
– Executing CS: the site is using the CS.
– Idle: in token-based approaches, an idle site can hold the token.


Mutual Exclusion: Requirements

Freedom from deadlocks: two or more sites should not endlessly wait on conditions/messages that never become true/arrive.

Freedom from starvation: no site should wait indefinitely to execute the CS while other sites execute it repeatedly.

Fairness: requests for the CS are executed in the order in which they are made (all requests have equal priority).

Fault tolerance: recognize “faults”, reorganize, continue. (e.g., loss of token).


Performance

– Number of messages per CS invocation: should be minimized.
– Synchronization delay, i.e., the time between one site leaving the CS and the next site entering it: should be minimized.
– Response time: the time interval between the transmission of a site's request messages and its exit from the CS.
– System throughput, i.e., the rate at which the system executes requests for the CS: should be maximized.

If sd is the synchronization delay and E the average CS execution time: system throughput = 1 / (sd + E).
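The throughput formula can be tried out with a few lines of Python. This is only an illustrative sketch; the function name and the numbers are assumptions, not part of the slides.

```python
def throughput(sd: float, e: float) -> float:
    """System throughput = 1 / (sd + E), in CS executions per time unit."""
    return 1.0 / (sd + e)

# Hypothetical values in milliseconds: sd = 2 ms, E = 8 ms.
rate_per_ms = throughput(sd=2.0, e=8.0)
print(rate_per_ms)  # CS executions per millisecond
```

Because sd and E are simply added, reducing the synchronization delay helps most when E is small compared with sd.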


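Performance metrics

Figure (two timelines):
– Synchronization delay: the time from the last site exiting the CS to the next site entering the CS.
– Response time: the time from the request messages being sent (after the CS request arrives) to the exit from the CS. E is the part between entering and exiting the CS.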


Performance ...

Low and high load:
– Low load: no more than one request for the CS at a given point in time.
– High load: there is always a pending mutual exclusion request at a site.

Best and worst case:
– Best case (low loads): response time = round-trip message delay + execution time = 2T + E.
– Worst case: occurs at high loads.

Message traffic: low at low loads, high at high loads.

Average performance: relevant when load conditions fluctuate widely.


Simple Solution

Control site: grants permission for CS execution.
– A site sends a REQUEST message to the control site.
– The controller grants access to one site at a time.
– Synchronization delay: 2T, because a site releases the CS by sending a message to the controller, and the controller then sends permission to another site.
– System throughput: 1 / (2T + E). If the synchronization delay is reduced to T, the throughput rises to 1 / (T + E), which approaches double when E is small compared with T.
– The controller becomes a bottleneck; congestion can occur.
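A minimal sketch of the control-site idea, assuming that messages arrive as method calls and that pending requests are served in arrival order. The class and method names are hypothetical.

```python
from collections import deque

class Controller:
    """Central control site: grants the CS to one site at a time."""

    def __init__(self):
        self.holder = None        # site currently in the CS, if any
        self.waiting = deque()    # pending REQUESTs in arrival order

    def request(self, site):
        """Handle a REQUEST; return the site granted the CS now, or None."""
        if self.holder is None:
            self.holder = site
            return site
        self.waiting.append(site)
        return None

    def release(self, site):
        """Handle a release; return the next site granted the CS, or None."""
        assert site == self.holder, "only the current holder may release"
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder

ctrl = Controller()
grants = [ctrl.request("S1"), ctrl.request("S2"),
          ctrl.release("S1"), ctrl.release("S2")]
print(grants)
```

Every grant passes through the single `Controller` object, which is exactly why the control site becomes a bottleneck.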


Non-token Based Algorithms

Notation:
– Si: site i.
– Ri: request set of Si, containing the ids of all sites from which Si must receive permission before accessing the CS.
– Non-token based approaches use time stamps to order requests for the CS.
– Smaller time stamps get priority over larger ones.

Lamport’s Algorithm
– Ri = {S1, S2, …, Sn}, i.e., all sites.
– Request queue: maintained at each Si, ordered by time stamps.
– Assumption: messages are delivered in FIFO order.


Lamport’s Algorithm

Requesting CS:
– Si sends REQUEST(tsi, i) to all sites in its request set and places the request in request_queuei. (tsi, i) is the time stamp of the request.
– On receiving the REQUEST, Sj places Si’s request in request_queuej and sends a time-stamped REPLY message to Si.

Executing CS: Si enters the CS when both conditions hold:
– Si has received a message with a time stamp larger than (tsi, i) from all other sites.
– Si’s request is at the top of request_queuei.

Releasing CS:
– On exiting the CS, Si removes its request from the top of request_queuei and sends a time-stamped RELEASE message to all sites in its request set.
– On receiving the RELEASE message, Sj removes Si’s request from its queue.
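The three phases above can be sketched in Python. This is a simplified illustration, not a full implementation: messages are modelled as direct method calls (so delivery is trivially FIFO), and the class and attribute names are assumptions.

```python
import heapq

class LamportSite:
    """One site running Lamport's algorithm; messages are direct calls."""

    def __init__(self, site_id, n_sites):
        self.id = site_id
        self.n = n_sites
        self.clock = 0
        self.queue = []       # request queue: heap of (ts, site_id)
        self.latest = {}      # sender id -> time stamp of its latest message
        self.mine = None      # own pending request (ts, site_id), if any

    def _receive(self, sender, ts):
        """Lamport clock update on message receipt."""
        self.clock = max(self.clock, ts) + 1
        self.latest[sender] = ts

    def request(self, others):
        self.clock += 1
        self.mine = (self.clock, self.id)
        heapq.heappush(self.queue, self.mine)
        for s in others:
            reply_ts = s.on_request(self.mine)   # REQUEST, answered by REPLY
            self._receive(s.id, reply_ts)

    def on_request(self, req):
        ts, sender = req
        self._receive(sender, ts)
        heapq.heappush(self.queue, req)
        self.clock += 1
        return self.clock                        # time stamp of the REPLY

    def can_enter(self):
        if self.mine is None or self.queue[0] != self.mine:
            return False
        later = [s for s, ts in self.latest.items() if (ts, s) > self.mine]
        return len(later) == self.n - 1

    def release(self, others):
        heapq.heappop(self.queue)                # own request is at the top
        self.clock += 1
        for s in others:
            s.on_release(self.id, self.mine, self.clock)
        self.mine = None

    def on_release(self, sender, req, ts):
        self._receive(sender, ts)
        self.queue.remove(req)
        heapq.heapify(self.queue)

s1, s2, s3 = (LamportSite(i, 3) for i in (1, 2, 3))
s2.request([s1, s3])
s1.request([s2, s3])
print(s2.can_enter(), s1.can_enter())   # S2 requested first
s2.release([s1, s3])
print(s1.can_enter())
```

Each CS execution in this sketch costs (N - 1) REQUEST, (N - 1) REPLY and (N - 1) RELEASE messages, here represented by the method calls.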


Lamport’s Algorithm…

Performance:
– 3(N - 1) messages per CS invocation.
  • (N - 1) REQUEST, (N - 1) REPLY, (N - 1) RELEASE messages.
– Synchronization delay: T


Lamport’s Algorithm: Example-1

Figure (three sites S1, S2, S3):
– Step 1: S1 and S2 request the CS with time stamps (2,1) and (1,2).
– Step 2: the request queues hold (1,2), (2,1). S2’s request is at the top, so S2 enters CS.
– Step 3: S2 leaves CS and sends RELEASE messages.
– Step 4: (1,2) is removed from the queues, leaving (2,1). S1 enters CS.


Ricart-Agrawala Algorithm

Requesting CS:
– Si sends a time-stamped REQUEST message to all sites in its request set.
– Sj sends a REPLY to Si if
  • Sj is neither requesting nor executing the CS, or
  • Sj is requesting the CS and the time stamp of Si’s request is smaller than that of Sj’s own request.
– Otherwise, the request is deferred (postponed).

Executing CS: Si enters the CS after it has received a REPLY from all sites in its request set.

Releasing CS: Si sends a REPLY to all deferred requests, i.e., a site’s request is blocked only by sites whose requests have smaller time stamps.
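The REPLY rule is the heart of the algorithm and fits in one small function. The sketch below assumes requests are (time stamp, site id) tuples, so that ties on the time stamp are broken by site id; the function and state names are hypothetical.

```python
def should_reply_now(state, own_request, incoming_request):
    """Decide whether Sj replies immediately to an incoming REQUEST.

    state: "idle", "requesting" or "executing" (state of Sj).
    own_request, incoming_request: (time stamp, site id) tuples.
    """
    if state == "idle":
        return True
    if state == "requesting":
        return incoming_request < own_request   # smaller time stamp wins
    return False                                # executing: defer the REPLY

# S1 requests with (2,1) and S2 with (1,2), both concurrently.
print(should_reply_now("requesting", (2, 1), (1, 2)))  # S1 handling S2's request
print(should_reply_now("requesting", (1, 2), (2, 1)))  # S2 handling S1's request
```

With these time stamps S1 replies at once while S2 defers, so S2 enters the CS first.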


Ricart-Agrawala: Performance

– 2(N - 1) messages per CS execution.
  • (N - 1) REQUEST + (N - 1) REPLY.
– Synchronization delay: T.


Ricart-Agrawala: Example

Figure (three sites S1, S2, S3):
– Step 1: S1 and S2 request the CS with time stamps (2,1) and (1,2).
– Step 2: S2 enters CS. S1’s request (2,1) is deferred at S2.
– Step 3: S2 leaves CS and replies to the deferred request (2,1). S1 enters CS.


Maekawa’s Algorithm

A site requests permission only from a subset of sites.

The request sets Ri and Rj of any two sites Si and Sj have at least one common site Sk. Sk mediates conflicts between Si and Sj.

A site can send only one REPLY message at a time, i.e., a site can send a REPLY message only after receiving a RELEASE message for the previous REPLY message.

Request set rules:
– Sets Ri and Rj have at least one common site.
– Si is always in Ri.
– The cardinality of Ri, i.e., the number of sites in Ri, is K.
– Any site Si is in K of the request sets.
– N = K(K - 1) + 1, so K is approximately the square root of N.
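The request set rules can be checked mechanically. The sketch below uses one possible assignment of request sets for N = 7 and K = 3 (so that N = K(K - 1) + 1); the concrete sets and the function name are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations

# Hypothetical request sets for N = 7 sites, K = 3.
R = {
    1: {1, 2, 3}, 2: {2, 4, 6}, 3: {3, 5, 6}, 4: {1, 4, 5},
    5: {2, 5, 7}, 6: {1, 6, 7}, 7: {3, 4, 7},
}

def satisfies_rules(request_sets, k):
    """Check the four request set rules for a {site: set_of_sites} mapping."""
    sites = sorted(request_sets)
    overlap = all(request_sets[i] & request_sets[j]
                  for i, j in combinations(sites, 2))
    contains_self = all(i in request_sets[i] for i in sites)
    size_k = all(len(r) == k for r in request_sets.values())
    in_k_sets = all(sum(s in r for r in request_sets.values()) == k
                    for s in sites)
    return overlap and contains_self and size_k and in_k_sets

print(satisfies_rules(R, 3))
```

Because any two request sets overlap, two sites can never both collect REPLY messages from all members of their sets at the same time.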


Maekawa’s Algorithm ...

Requesting CS:
– Si sends REQUEST(i) to all sites in Ri.
– Sj sends a REPLY to Si if Sj has NOT sent a REPLY message to any site since it received the last RELEASE message.
– Otherwise, Sj queues up Si’s request.

Executing CS: after getting a REPLY from all sites in Ri.

Releasing CS:
– Si sends RELEASE(i) to all sites in Ri.
– Any Sj, after receiving the RELEASE message, sends a REPLY message to the next request in its queue.
– If the queue is empty, Sj updates its status to indicate receipt of the RELEASE.


Maekawa’s Algorithm ...

Performance:
– Synchronization delay: 2T.
– Messages: 3 times the square root of N (square root of N each for REQUEST, REPLY, and RELEASE messages).

Deadlocks:
– Message deliveries are not ordered.
– Assume Si, Sj, Sk concurrently request the CS.
– Ri ∩ Rj = {Sij}, Rj ∩ Rk = {Sjk}, Rk ∩ Ri = {Ski}.
– It is possible that:
  • Sij is locked by Si (forcing Sj to wait at Sij),
  • Sjk is locked by Sj (forcing Sk to wait at Sjk),
  • Ski is locked by Sk (forcing Si to wait at Ski),
  • which results in a deadlock among Si, Sj, and Sk.


Token-based Algorithms

– A unique token circulates among the participating sites.
– A site can enter the CS if it has the token.
– Token-based approaches use sequence numbers instead of time stamps.
  • A request for the token contains a sequence number.
  • The sequence numbers of the sites advance independently.
– Correctness is trivial: since only one token is present, only one site can be in the CS.
– Deadlock and starvation issues still have to be addressed.


Suzuki-Kasami Algorithm

– If a site without the token needs to enter the CS, it broadcasts a REQUEST for the token to all other sites.
– Token: (a) a queue Q of requesting sites; (b) an array LN[1..N], where LN[j] is the sequence number of the request most recently executed by site Sj.
– The token holder sends the token to the requestor if it is not inside the CS. Otherwise, it sends the token after exiting the CS.
– The token holder can make multiple CS accesses.

Design issues:
– Distinguishing outdated REQUEST messages.
  • Format: REQUEST(j, n), i.e., the jth site making its nth request.
  • Each site Si has an array RNi[1..N], where RNi[j] is the largest sequence number of a request received from Sj.
– Determining which sites have an outstanding token request.
  • If LN[j] = RNi[j] - 1, then Sj has an outstanding request.


Suzuki-Kasami Algorithm ...

Passing the token (assuming Si has the token):
– After finishing the CS, Si sets LN[i] := RNi[i].
– The token consists of Q and LN. Q is a queue of requesting sites.
– For every site Sj not already in Q, the token holder checks whether RNi[j] = LN[j] + 1. If so, it places j in Q.
– If Q is not empty, Si sends the token to the site at the head of Q.

Performance:
– 0 to N messages per CS invocation.
– Synchronization delay is 0 (if the token holder repeats the CS) or T.
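The bookkeeping with RN, LN and Q can be sketched as follows. This is a simplified illustration under stated assumptions: sites are numbered from 0, messages are direct method calls, and the token holder is assumed to be inside the CS while requests arrive (so the immediate hand-over by an idle holder is left out). All class and method names are hypothetical.

```python
from collections import deque

class Token:
    def __init__(self, n):
        self.LN = [0] * n     # LN[j]: sequence number of Sj's last executed request
        self.Q = deque()      # queue of sites waiting for the token

class Site:
    def __init__(self, i, n):
        self.i = i
        self.RN = [0] * n     # RN[j]: largest sequence number seen from Sj
        self.token = None

    def make_request(self, sites):
        """Broadcast REQUEST(i, n) unless this site already holds the token."""
        self.RN[self.i] += 1
        if self.token is None:
            for s in sites:
                if s is not self:
                    s.on_request(self.i, self.RN[self.i])

    def on_request(self, j, n):
        self.RN[j] = max(self.RN[j], n)   # outdated REQUESTs change nothing

    def exit_cs(self, sites):
        """Update the token after the CS and pass it on if a site is waiting."""
        tok = self.token
        tok.LN[self.i] = self.RN[self.i]
        for j in range(len(self.RN)):
            if j not in tok.Q and self.RN[j] == tok.LN[j] + 1:
                tok.Q.append(j)
        if tok.Q:
            nxt = tok.Q.popleft()
            self.token, sites[nxt].token = None, tok

sites = [Site(i, 5) for i in range(5)]
sites[0].token = Token(5)          # site 0 starts with the token, inside the CS
sites[1].make_request(sites)
sites[2].make_request(sites)
sites[0].exit_cs(sites)            # token goes to site 1, site 2 stays queued
print(sites[1].token is not None, list(sites[1].token.Q))
```

A site that still holds the token sends no messages at all, which is the 0-message case in the performance figures above.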


Example

Five sites, 0 to 4. req is a site’s array of request sequence numbers; last and Q travel with the token.

– Initial state: site 0 holds the token. req=[1,0,0,0,0] at every site; last=[0,0,0,0,0].
– 1 & 2 send requests: req=[1,1,1,0,0] at every site; the token is at site 0 with last=[0,0,0,0,0].
– 0 prepares to exit CS: req=[1,1,1,0,0] at every site; the token is at site 0 with last=[1,0,0,0,0], Q=(1,2).
– 0 passes token to 1: req=[1,1,1,0,0] at every site; the token is at site 1 with last=[1,0,0,0,0], Q=(2).
– 0 and 3 send requests: req=[2,1,1,1,0] at every site; the token is at site 1 with last=[1,0,0,0,0], Q=(2,0,3).
– 1 sends token to 2: req=[2,1,1,1,0] at every site; the token is at site 2 with last=[1,1,0,0,0], Q=(0,3).


Raymond’s Algorithm

– Sites are arranged in a logical directed tree. Root: token holder. Edges: directed towards the root.
– Every site has a variable holder that points to an immediate neighbor node on the directed path towards the root. (The root’s holder points to itself.)

Requesting CS:
– If Si does not hold the token and requests the CS, it sends a REQUEST upwards provided its request_q is empty. It then adds its request to request_q.
– If request_q is non-empty, a REQUEST message is sent for its top entry (if not done before).
– A site on the path to the root that receives a REQUEST propagates it up if its request_q is empty, then adds the request to request_q.
– The root, on receiving a REQUEST, sends the token to the site that forwarded the message and sets holder to that forwarding site.
– Any Si receiving the token deletes the top entry from request_q, sends the token to that site, and sets holder to point to it. If request_q is still non-empty, Si sends a REQUEST message to the holder site.


Raymond’s Algorithm …

Executing CS: a site enters the CS when it receives the token and its own entry is at the top of its request_q. It deletes the top of request_q and enters the CS.

Releasing CS:
– If request_q is non-empty, delete the top entry from the queue, send the token to that site, and set holder to that site.
– If request_q is still non-empty after that, send a REQUEST message to the holder site.

Performance:
– Average messages: O(log N), as the average distance between 2 nodes in the tree is O(log N).
– Synchronization delay: (T log N) / 2, as the average distance between 2 sites that successively execute the CS is (log N) / 2.
– Greedy approach: an intermediate site that gets the token may enter the CS instead of forwarding it down. This affects fairness and may cause starvation.
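The request, token-passing and release rules can be combined into one small state machine per site. The sketch below is a simplified illustration, not a faithful distributed implementation: REQUEST and token messages are direct method calls, and the names `Node`, `asked` and `_progress` are assumptions.

```python
from collections import deque

class Node:
    """A site in Raymond's tree; messages are modelled as direct calls."""

    def __init__(self, name, holder=None):
        self.name = name
        self.holder = holder or self   # neighbor on the path to the token
        self.request_q = deque()
        self.in_cs = False
        self.asked = False             # REQUEST already sent to holder?

    def want_cs(self):
        self.request_q.append(self)
        self._progress()

    def on_request(self, sender):
        self.request_q.append(sender)
        self._progress()

    def on_token(self):
        self.holder, self.asked = self, False
        self._progress()

    def exit_cs(self):
        self.in_cs = False
        self._progress()

    def _progress(self):
        if self.holder is self and not self.in_cs and self.request_q:
            nxt = self.request_q.popleft()
            if nxt is self:
                self.in_cs = True      # own request on top: enter the CS
                return
            self.holder = nxt          # pass the token along the tree edge
            nxt.on_token()
        if self.holder is not self and self.request_q and not self.asked:
            self.asked = True
            self.holder.on_request(self)

# Hypothetical tree: S1 is the root; S2 and S3 are its children; S4 is below S2.
s1 = Node("S1")
s2, s3 = Node("S2", s1), Node("S3", s1)
s4 = Node("S4", s2)
s4.want_cs()       # request travels S4 -> S2 -> S1, token travels back
s3.want_cs()       # queued along the path S3 -> S1 -> S2 -> S4
s4.exit_cs()       # token travels S4 -> S2 -> S1 -> S3
print(s4.in_cs, s3.in_cs)
```

Note how the holder variables are reversed along the path as the token moves, so they always point towards the current token holder.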


Raymond’s Algorithm: Example

Figure (tree of sites S1 to S7):
– Step 1: a token request travels towards the token holder.
– Step 2: the token travels back towards the requesting site.
– Step 3: the requesting site is the new token holder.

Example (tree of sites 1 to 7; the labels next to the sites are request_q contents):
– 1, 4, 7 want to enter their CS.
– 2 sends the token to 6.
– 6 forwards the token to 1.


Singhal’s Heuristic Algorithm

Instead of broadcasting, each site maintains information on other sites and uses it to guess which sites are likely to have the token.

Data structures:
– Si maintains SVi[1..N] and SNi[1..N] for storing information on other sites: state and highest sequence number.
– The token contains 2 arrays: TSV[1..N] and TSN[1..N].
– States of a site:
  • R: requesting CS
  • E: executing CS
  • H: holding token, idle
  • N: none of the above
– Initialization:
  • SVi[j] := N, for j = N .. i; SVi[j] := R, for j = i-1 .. 1; SNi[j] := 0, for j = 1 .. N. S1 (site 1) is in state H.
  • Token: TSV[j] := N and TSN[j] := 0, for j = 1 .. N.


Singhal’s Heuristic Algorithm …

Requesting CS:
– If Si has no token and requests the CS:
  • SVi[i] := R. SNi[i] := SNi[i] + 1.
  • Send REQUEST(i, sn) to all sites Sj for which SVi[j] = R. (sn: sequence number, the updated value of SNi[i].)
– Sj receiving REQUEST(i, sn): if sn <= SNj[i], ignore. Otherwise, update SNj[i] and do:
  • SVj[j] = N -> SVj[i] := R.
  • SVj[j] = R -> if SVj[i] != R, set it to R and send REQUEST(j, SNj[j]) to Si. Else do nothing.
  • SVj[j] = E -> SVj[i] := R.
  • SVj[j] = H -> SVj[i] := R, TSV[i] := R, TSN[i] := sn, SVj[j] := N. Send the token to Si.

Executing CS: after getting the token. Set SVi[i] := E.


Singhal’s Heuristic Algorithm …

Releasing CS:
– SVi[i] := N, TSV[i] := N. Then, for every other site Sj:
  • if SNi[j] > TSN[j], then {TSV[j] := SVi[j]; TSN[j] := SNi[j]}
  • else {SVi[j] := TSV[j]; SNi[j] := TSN[j]}
– If SVi[j] = N for all j, then set SVi[i] := H. Else send the token to a site Sj for which SVi[j] = R.

The fairness of the algorithm depends on the choice of Sj, since no queue is maintained in the token. Arbitration rules are used to ensure fairness.

Performance:
– Low to moderate loads: average of N/2 messages.
– High loads: N messages (all sites request the CS).
– Synchronization delay: T.
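The release step, in which the site and the token exchange their most recent information, can be sketched as a single function. The sketch assumes 0-based Python lists for SVi, SNi, TSV and TSN and picks the first requesting site as the next holder, which is an arbitrary stand-in for the arbitration rules; the function name is hypothetical.

```python
def release_cs(i, SV, SN, TSV, TSN):
    """Apply the release rules at site i; return the next holder or None."""
    SV[i] = "N"
    TSV[i] = "N"
    for j in range(len(SV)):
        if j == i:
            continue
        if SN[j] > TSN[j]:
            TSV[j], TSN[j] = SV[j], SN[j]     # the site's view is newer
        else:
            SV[j], SN[j] = TSV[j], TSN[j]     # the token's view is newer or equal
    waiting = [j for j in range(len(SV)) if SV[j] == "R"]
    if not waiting:
        SV[i] = "H"                           # nobody is requesting: hold the token
        return None
    return waiting[0]                         # arbitrary choice, no fairness rule

# Site 2 of 3 (index 1) leaves the CS; nobody else has requested.
SV, SN = ["R", "E", "N"], [0, 1, 0]
TSV, TSN = ["N", "R", "N"], [0, 1, 0]
print(release_cs(1, SV, SN, TSV, TSN), SV)
```

In this run the stale entry SV[0] = R is overwritten by the token's view, so the site ends up in state H and keeps the token.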


Singhal: Example

• Assume there are 3 sites in the system. Initially:
  – Site 1: SV1[1] = H, SV1[2] = N, SV1[3] = N. SN1[1], SN1[2], SN1[3] are 0.
  – Site 2: SV2[1] = R, SV2[2] = N, SV2[3] = N. SNs are 0.
  – Site 3: SV3[1] = R, SV3[2] = R, SV3[3] = N. SNs are 0.
  – Token: TSVs are N. TSNs are 0.
• Assume site 2 is requesting the token. S2 sets SV2[2] = R, SN2[2] = 1. S2 sends REQUEST(2,1) to S1 (since only S1 is set to R in SV2).
• S1 receives the REQUEST and accepts it, since SN1[2] is smaller than the message sequence number. Since SV1[1] is H: SV1[2] = R, TSV[2] = R, TSN[2] = 1, SV1[1] = N. S1 sends the token to S2.
• S2 receives the token. SV2[2] = E. After exiting the CS, SV2[2] = TSV[2] = N. S2 updates SN, SV, TSN, TSV. Since nobody is requesting, SV2[2] = H.
• Assume S3 makes a REQUEST now. It will be sent to both S1 and S2. Only S2 responds, since only SV2[2] is H (SV1[1] is N now).


Comparison

(ll = low load, hl = high load)

Non-token         Resp. time (ll)   Sync. delay    Messages (ll)   Messages (hl)
Lamport           2T + E            T              3(N-1)          3(N-1)
Ricart-Agrawala   2T + E            T              2(N-1)          2(N-1)
Maekawa           2T + E            2T             3*sq.rt(N)      5*sq.rt(N)

Token             Resp. time (ll)   Sync. delay    Messages (ll)   Messages (hl)
Suzuki-Kasami     2T + E            T              N               N
Singhal           2T + E            T              N/2             N
Raymond           T(log N) + E      T log(N) / 2   log(N)          4


Deadlock

A deadlock is a situation in which two or more competing actions are each waiting for the other to finish, and thus neither ever does.

Put differently, a deadlock occurs when a process enters a waiting state because a resource it requested is held by another waiting process, which in turn is waiting for another resource.


Deadlock (cont.)

If a process is unable to change its state indefinitely because the resources it requested are being used by another waiting process, then the system is said to be in a deadlock.

Deadlock is a common problem in multiprocessing systems, parallel computing and distributed systems.


Example

Suppose a computer has three CD drives and three processes. Each of the three processes holds one of the drives.

If each process now requests another drive, the three processes will be in a deadlock.

Each process will be waiting for the "CD drive released" event, which can only be caused by one of the other waiting processes.

Thus, the result is a circular chain of waiting processes.


Necessary conditions

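A deadlock can arise only if all of the following conditions hold simultaneously in a system:

– Mutual exclusion: at least one resource is held in a non-sharable mode.
– Hold and wait (resource holding): a process holds at least one resource while requesting others.
– No preemption: a resource can be released only voluntarily by the process holding it.
– Circular wait: there is a cycle of processes, each waiting for a resource held by the next.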


DISTRIBUTED DEADLOCK DETECTION


System Model

The system has only reusable resources.

Processes are allowed only exclusive access to resources.

There is only one copy of each resource.

A process can be in one of two states: running or blocked.


Deadlocks in Distributed Systems

Deadlocks in distributed systems are similar to deadlocks in single-processor systems, only worse:
– They are harder to avoid, prevent, or even detect.
– They are hard to cure when tracked down because all relevant information is scattered over many machines.


Tulika Ringan (AL_IT)


Types of Deadlocks

Deadlocks are sometimes classified into the following types:
– Communication deadlocks: processes compete for buffers for send/receive.
– Resource deadlocks: processes compete for exclusive access to I/O devices, files, locks, and other resources.

We treat everything as a resource, so we only have resource deadlocks.


Strategies to Handle Deadlock

Best-known strategies to handle deadlocks:
– Detection (let deadlocks occur, detect them, and try to recover)
– Prevention (statically make deadlocks structurally impossible)
– Avoidance (avoid deadlocks by allocating resources carefully)


• Prevention: too expensive in time and network traffic in a distributed system.
• Avoidance: determining safe and unsafe states would require a huge number of messages in a DS.
• Detection: may be practical, and is the primary chapter focus.
• Resolution: more complex than in non-distributed systems.


DS Deadlock Detection

The bipartite graph strategy is modified:
– Use a Wait-For Graph (WFG or TWF).
  • All nodes are processes (threads).
  • Resource allocation is done by a process (thread) sending a request message to another process (thread) which manages the resource (client-server communication model, RPC paradigm).
– A system is deadlocked if and only if there is a directed cycle (or knot) in the global WFG.


DS Deadlock Detection, Cycle vs. Knot

The AND model of requests requires all resources currently being requested to be granted to unblock a computation.
– A cycle is sufficient to declare a deadlock with this model.

The OR model of requests allows a computation making multiple different resource requests to unblock as soon as any one is granted.
– A cycle is a necessary condition.
– A knot is a sufficient condition.
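The difference between a cycle and a knot can be checked on a small wait-for graph. The sketch below assumes the WFG is given as a dictionary mapping each process to the set of processes it waits for, and treats a knot as a non-empty set of nodes in which every node can reach exactly the nodes of that set. The function names and the example graph are illustrative assumptions.

```python
def reachable(graph, start):
    """Nodes reachable from start via one or more edges."""
    seen, stack = set(), list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def has_cycle(graph):
    return any(node in reachable(graph, node) for node in graph)

def has_knot(graph):
    for node in graph:
        reach = reachable(graph, node)
        if reach and all(reachable(graph, m) == reach for m in reach):
            return True
    return False

# P1 and P2 wait for each other, but P2 also waits for P3, which is not blocked.
wfg = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": set()}
print(has_cycle(wfg), has_knot(wfg))
```

In this graph there is a cycle but no knot: under the AND model P1 and P2 are deadlocked, while under the OR model P2 can still be unblocked by P3.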


Figure: WFG over processes P1 to P10 on sites S1, S2, S3. Deadlock in the AND model; there is a cycle but no knot. No deadlock in the OR model.

Figure: WFG over processes P1 to P10 on sites S1, S2, S3. Deadlock in both the AND model and the OR model; there are cycles and a knot.


DS Detection Requirements

Progress:
– No undetected deadlocks
  • All deadlocks found
  • Deadlocks found in finite time

Safety:
– No false deadlock detection
  • Phantom (false) deadlocks are caused by network latencies
  • Principal problem in building correct DS deadlock detection algorithms


Resolution

Resolution means breaking existing wait-for dependencies in the system.

It involves rolling back one or more deadlocked processes and assigning their resources to blocked processes in the deadlock.

When a wait-for dependency is broken, the corresponding information should be cleaned up immediately; otherwise stale information may lead to the detection of phantom deadlocks.


Control Framework

Approaches to DS deadlock detection fall in three domains:
– Centralized control
  • one node is responsible for building and analyzing a real WFG for cycles
– Distributed control
  • each node participates equally in detecting deadlocks … abstracted WFG
– Hierarchical control
  • nodes are organized in a tree which tends to look like a business organizational chart


Total Centralized Control

Simple conceptually:
– Each node reports to the master detection node
– The master detection node builds and analyzes the WFG
– The master detection node manages resolution when a deadlock is detected

Some serious problems:
– Single point of failure
– Network congestion issues
– False deadlock detection


Total Centralized Control (cont)

The Ho-Ramamoorthy Algorithms
– Two phase (can be for AND or OR model)
  • each site has a status table of locked and waited resources
  • the control site periodically asks for this table from each node
  • the control site searches for cycles and, if one is found, requests the table again from each node
  • only the information common to both reports is analyzed for confirmation of a cycle
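The confirmation step of the two-phase scheme can be sketched as follows. As a simplifying assumption, each site's status table is reduced to a set of wait-for edges (waiter, holder); the function names are hypothetical.

```python
def has_cycle(edges):
    """True if the directed graph given as (waiter, holder) pairs has a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)

    def reaches(start):
        seen, stack = set(), list(graph.get(start, ()))
        while stack:
            n = stack.pop()
            if n not in seen:
                seen.add(n)
                stack.extend(graph.get(n, ()))
        return seen

    return any(n in reaches(n) for n in graph)

def two_phase_detect(first_reports, second_reports):
    """Report a deadlock only if a cycle survives in both rounds of reports."""
    first = set().union(*first_reports)
    if not has_cycle(first):
        return False
    second = set().union(*second_reports)
    return has_cycle(first & second)       # keep only the common information

round1 = [{("P1", "P2")}, {("P2", "P1")}]   # one report per site
round2 = [{("P1", "P2")}, set()]            # P2 stopped waiting in the meantime
print(two_phase_detect(round1, round2))
```

Here the cycle seen in the first round disappears from the second report, so no deadlock is declared.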


Total Centralized Control (cont)

The Ho-Ramamoorthy Algorithms (cont)
– One phase (can be for AND or OR model)
  • each site keeps 2 tables: process status and resource status
  • the control site periodically asks for these tables (both together in a single message) from each node
  • the control site then builds and analyzes the WFG, looking for cycles and resolving them when found


Distributed Control

Each node has the same responsibility for, and will expend the same amount of effort in, detecting deadlock.
– The WFG becomes an abstraction, with any single node knowing just some small part of it.
– Generally, detection is launched from a site when some thread at that site has been waiting for a “long” time in a resource request message.


Tulika Ringan (AL_IT)


Distributed Control Models

Four common models are used in building distributed deadlock control algorithms:
– Path-pushing
• path information is sent from the waiting node to the blocking node
– Edge-chasing
• probe messages are sent along graph edges
– Diffusion computation
• echo messages are sent along graph edges
– Global state detection
• sweep-out, sweep-in WFG construction and reduction


Path-pushing

Obermarck’s algorithm for path propagation (an AND model)
– based on a database model using transaction processing
– sites which detect a cycle in their partial WFG views convey the paths discovered to other members of the (totally ordered) set of transactions
– the highest priority transaction detects the deadlock “Ex => T1 => T2 => Ex”
– the algorithm can detect phantom (false) deadlocks because of its asynchronous snapshot method


Edge Chasing Algorithms

Chandy-Misra-Haas Algorithm (an AND model)
– probe messages M(i, j, k)
• detection is initiated on behalf of Pi; the probe is sent by Pj to Pk
• probe messages work their way through the WFG, and if one returns to its initiator, a deadlock is detected
• make sure you can follow the example in Figure 7.1 of the book


Chandy-Misra-Haas Algorithm

[Figure: WFG of processes P1–P10 spread over sites S1, S2 and S3. P1 launches the detection; the probes shown are Probe (1, 3, 4), Probe (1, 7, 10), Probe (1, 6, 8) and Probe (1, 9, 1).]
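The probe mechanism can be simulated in a few lines. This is a minimal sketch rather than the full algorithm: it ignores site boundaries and message delays, and the wait-for graph `EXAMPLE_WFG` is an assumption reconstructed from the probes and message traces that appear on these slides, not a copy of the book's figure.

```python
from collections import deque
from typing import Dict, Set

# Assumed wait-for edges (process number -> processes it waits for),
# reconstructed from the probes shown on the slide.
EXAMPLE_WFG: Dict[int, Set[int]] = {
    1: {2}, 2: {3}, 3: {4}, 4: {5}, 5: {6, 7},
    6: {8}, 7: {10}, 8: {9}, 10: {9}, 9: {1},
}


def cmh_detect(wfg: Dict[int, Set[int]], initiator: int) -> bool:
    """Return True if a probe (i, j, k) launched for `initiator` returns to it."""
    probes = deque((initiator, initiator, k) for k in sorted(wfg.get(initiator, ())))
    forwarded = set()  # processes that already forwarded a probe for this initiator
    while probes:
        i, j, k = probes.popleft()
        if k == i:
            return True            # the probe came back: Pi is deadlocked
        if k in forwarded or not wfg.get(k):
            continue               # already handled, or Pk is active and discards it
        forwarded.add(k)
        for nxt in sorted(wfg[k]):
            probes.append((i, k, nxt))  # Pk forwards the probe to everything it waits for
    return False


print(cmh_detect(EXAMPLE_WFG, 1))
```

Removing the edge from P9 to P1 from the assumed graph breaks the cycle, and the same call then returns `False` because every probe eventually reaches an active process.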


Edge Chasing Algorithms (cont)

Mitchell-Merritt Algorithm (an AND model)
– propagates messages in the reverse direction
– uses public-private labeling of messages
– messages may replace their labels at each site
– when a message arrives at a site with a matching public label, a deadlock is detected (only by the process with the largest public label in the cycle), which normally resolves it by self-destructing


Mitchell-Merritt Algorithm

[Figure: WFG of processes P1–P10 over sites S1, S2 and S3, annotated with the labels “Public 1 => 3, Private 1”, “Public 3, Private 3” and “Public 2 => 3, Private 2”, and with step markers 1, 2 and 3.]

1. P6 initially asks P8 for its Public label and changes its own from 2 to 3
2. P3 asks P4 and changes its Public label from 1 to 3
3. P9 asks P1, finds its own Public label 3, and thus detects the deadlock P1=>P2=>P3=>P4=>P5=>P6=>P8=>P9=>P1


Diffusion Computation

Deadlock detection computations are diffused through the WFG of the system
– Queries are sent from a computation (process or thread) on a node and diffused across the edges of the WFG
– When a query reaches an active (non-blocked) computation, the query is discarded; when a query reaches a blocked computation, it is echoed back to the originator when (and if) all outstanding queries of the blocked computation are returned to it
– If all queries sent are echoed back to an initiator, there is deadlock


Diffusion Computation of Chandy et al (an OR model)

A waiting computation on node x periodically sends queries to all computations it is waiting for (the dependent set), marked with the originator ID and target ID

Each of these computations in turn queries its own dependent set members (only if it is blocked itself), marking each query with the originator ID, its own ID and a new target ID it is waiting on

A computation cannot echo a reply to its requestor until it has received replies from its entire dependent set, at which time it sends a reply marked with the originator ID, its own ID and the most distant dependent ID

When (and if) the original requestor receives echo replies from all members of its dependent set, it can declare a deadlock when an echo reply’s originator ID and most distant ID are its own
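The query/reply bookkeeping described above can be sketched with a single message queue. This is a simplified simulation under stated assumptions: messages are delivered in FIFO order, the dependent sets do not change while detection runs, a blocked computation answers any query after its first one immediately, and the function name `diffusion_deadlocked` is hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set


def diffusion_deadlocked(dependents: Dict[str, Set[str]], initiator: str) -> bool:
    """OR model: the initiator is deadlocked only if every query it sent is echoed back."""
    if not dependents.get(initiator):
        return False                      # an active computation is not deadlocked
    pending: Dict[str, int] = {initiator: len(dependents[initiator])}
    engager: Dict[str, Optional[str]] = {initiator: None}
    msgs = deque(("query", initiator, k) for k in sorted(dependents[initiator]))
    while msgs:
        kind, src, dst = msgs.popleft()
        if kind == "query":
            if not dependents.get(dst):
                continue                  # active computation: discard the query
            if dst not in pending:        # first (engaging) query: diffuse it further
                engager[dst] = src
                pending[dst] = len(dependents[dst])
                msgs.extend(("query", dst, k) for k in sorted(dependents[dst]))
            else:                         # later query: reply straight away
                msgs.append(("reply", dst, src))
        else:                             # a reply arrived at dst
            pending[dst] -= 1
            if pending[dst] == 0:
                if dst == initiator:
                    return True           # all of the initiator's queries were echoed
                msgs.append(("reply", dst, engager[dst]))
    return False


ring = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P1"}}
escape = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": set()}   # P3 is active
print(diffusion_deadlocked(ring, "P1"), diffusion_deadlocked(escape, "P1"))
```

In the second graph P1 waits for P2 or P3, and P3 is active, so one query is never echoed and no deadlock is declared. That is the behaviour the OR model requires.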


Diffusion Computation of Chandy et al

[Figure: WFG of processes P1–P10 spread over sites S1, S2 and S3.]


Diffusion Computation of Chandy et al (cont)

Queries:
P1 => P2    message at P2 from P1 (P1, P1, P2)
P2 => P3    message at P3 from P2 (P1, P2, P3)
P3 => P4    message at P4 from P3 (P1, P3, P4)
P4 => P5    etc.
P5 => P6
P5 => P7
P6 => P8
P7 => P10
P8 => P9    (P1, P8, P9), now reply (P1, P9, P1)
P10 => P9   (P1, P10, P9), now reply (P1, P9, P1)

Replies:
P8 <= P9    reply (P1, P9, P8)
P10 <= P9   reply (P1, P9, P10)
P6 <= P8    reply (P1, P8, P6)
P7 <= P10   reply (P1, P10, P7)
P5 <= P6    etc.
P5 <= P7
P4 <= P5
P3 <= P4
P2 <= P3
P1 <= P2    reply (P1, P2, P1)

P5 cannot reply until both the P6 and P7 replies arrive!

Figure callouts: “end condition” and “deadlock condition”.


Global State Detection

Based on 2 facts of distributed systems:
– A consistent snapshot of a distributed system can be obtained without freezing the underlying computation
– A consistent snapshot may not represent the system state at any moment in time, but if a stable property holds in the system before the snapshot collection is initiated, this property will still hold in the snapshot


Global State Detection (the P-out-of-Q request model)

The Kshemkalyani-Singhal algorithm is demonstrated in the text
◦ An initiator computation snapshots the system by sending FLOOD messages along all its outbound edges in an outward sweep
◦ A computation receiving a FLOOD message either returns an ECHO message (if it has no dependencies itself) or propagates the FLOOD message to its dependencies. An ECHO message is analogous to dropping a request edge in a resource allocation graph (RAG)
◦ As ECHOs arrive in response to FLOODs, the region of the WFG the initiator is involved with becomes reduced
◦ If a dependency does not return an ECHO by termination, such a node represents part (or all) of a deadlock with the initiator
◦ Termination is achieved by summing weighted ECHO and SHORT messages (returning initial FLOOD weights)


Hierarchical Deadlock Detection

These algorithms represent a middle ground between fully centralized and fully distributed

Sets of nodes are required to report periodically to a control site node (as with centralized algorithms) but control sites are organized in a tree

The master control site forms the root of the tree, with leaf nodes having no control responsibility, and interior nodes serving as controllers for their branches


Hierarchical Deadlock Detection

[Figure: tree of control nodes with the Master Control Node at the root and Level 1, Level 2 and Level 3 Control Nodes below it.]


Hierarchical Deadlock Detection

The Menasce-Muntz Algorithm
– Leaf controllers allocate resources
– Branch controllers are responsible for finding deadlock among the resources that their children span in the tree
– Network congestion can be managed
– Node failure is less critical than in fully centralized detection
– Detection can be done in many ways:
• Continuous allocation reporting
• Periodic allocation reporting


Hierarchical Deadlock Detection (cont’d)

The Ho-Ramamoorthy Algorithm
– Uses only 2 levels
• Master control node
• Cluster control nodes
– Cluster control nodes are responsible for detecting deadlock among their members and reporting dependencies outside their cluster to the Master control node (they use the one-phase version of the Ho-Ramamoorthy algorithm discussed earlier for centralized detection)
– The Master control node is responsible for detecting intercluster deadlocks
– Node assignment to clusters is dynamic


Agreement Protocols


When distributed systems engage in cooperative efforts like enforcing distributed mutual exclusion algorithms, processor failure can become a critical factor

Processors may fail in various ways, and their failure modes and communication interfaces are central to the ability of healthy processors to detect and respond to such failures


The System Model

There are n processors in the system and at most m of them can be faulty

The processors can communicate directly with other processors via messages (fully connected system)

A receiver computation always knows the identity of a sending computation

The communication system is pipelined and reliable


Faulty Processors

May fail in various ways
– Drop out of sight completely
– Start sending spurious messages
– Start to lie in their messages (behave maliciously)
– Send only occasional messages (fail to reply when expected to)

May believe themselves to be healthy

Are not known to be faulty initially by non-faulty processors


Communication Requirements

Synchronous model communication is assumed in this section:
– Healthy processors receive, process and reply to messages in a lockstep manner
– The receive, process, reply sequence is called a round
– In the synchronous-communication model, processes know what messages they expect to receive in a round

The synchronous model is critical to agreement protocols, and the agreement problem is not solvable in an asynchronous system


Processor Failures

Crash fault
– Abrupt halt, never resumes operation

Omission fault
– Processor “omits” to send required messages to some other processors

Malicious fault
– Processor behaves randomly and arbitrarily
– Known as Byzantine faults


Authenticated vs. Non-Authenticated Messages

Authenticated messages (also called signed messages)
– assure the receiver of correct identification of the sender
– assure the receiver that the message content was not modified in transit

Non-authenticated messages (also called oral messages)
– are subject to intermediate manipulation
– may lie about their origin


Authenticated vs. Non-Authenticated Messages (cont’d)

To be generally useful, agreement protocols must be able to handle non-authenticated messages

The classification of agreement problems includes:
– The Byzantine agreement problem
– The consensus problem
– The interactive consistency problem


Agreement Problems

Problem                    Who initiates value    Final agreement
Byzantine Agreement        One Processor          Single Value
Consensus                  All Processors         Single Value
Interactive Consistency    All Processors         A Vector of Values


Agreement Problems (cont’d)

Byzantine Agreement
– One processor broadcasts a value to all other processors
– All non-faulty processors agree on this value; faulty processors may agree on any (or no) value

Consensus
– Each processor broadcasts a value to all other processors
– All non-faulty processors agree on one common value from among those sent out; faulty processors may agree on any (or no) value


Interactive Consistency
– Each processor broadcasts a value to all other processors
– All non-faulty processors agree on the same vector of values (v1, ..., vn) such that vi is the initial broadcast value of non-faulty processor i. Faulty processors may agree on any (or no) value


Agreement Problems (cont’d)

The Byzantine Agreement problem is a primitive to the other 2 problems

The focus here is thus the Byzantine Agreement problem

Lamport showed the first solutions to the problem
– An initial broadcast of a value to all processors
– A following set of messages exchanged among all (healthy) processors within a set of message rounds


The Byzantine Agreement problem

The upper bound on the number of faulty processors:
– It is impossible to reach a consensus (in a fully connected network) if the number of faulty processors m exceeds (n - 1) / 3 (from Pease et al)
– Lamport et al were the first to provide a protocol to reach Byzantine agreement, which requires m + 1 rounds of message exchanges
– Fischer et al showed that m + 1 rounds is the lower bound to reach agreement in a fully connected network where only processors are faulty
– Thus, in a three-processor system with one faulty processor (m = 1 exceeds (3 - 1) / 3), agreement cannot be reached
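The bound is easy to evaluate for concrete system sizes. The helper names below are illustrative only; the arithmetic follows directly from m ≤ (n - 1) / 3 and the m + 1 round requirement stated above.

```python
def max_faulty(n: int) -> int:
    """Largest m that satisfies n >= 3m + 1, i.e. floor((n - 1) / 3)."""
    return (n - 1) // 3


def rounds_needed(m: int) -> int:
    """Rounds of message exchange required to tolerate m faulty processors."""
    return m + 1


for n in (3, 4, 7, 10):
    m = max_faulty(n)
    print(f"n={n}: tolerates m={m} faulty processor(s), needs {rounds_needed(m)} round(s)")
```

With n = 3 the tolerable number of faulty processors is 0, which matches the three-processor impossibility case on the slide; n = 4 is the smallest system that tolerates one faulty processor.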


Lamport-Shostak-Pease Algorithm

The Oral Message algorithm OM(m), with m > 0 (some faulty processor(s)), solves the Byzantine agreement problem for 3m + 1 or more processors with at most m faulty processors
– The initiator sends n - 1 messages, one to everyone else, to start the algorithm
– Everyone else begins OM(m - 1) activity, sending messages to n - 2 processors
– Each of these messages causes OM(m - 2) activity, etc., until OM(0) is reached, when the algorithm stops
– When the algorithm stops, each processor has input from all others and chooses the majority value as its value
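The recursion can be written down compactly. The following is a sketch under several assumptions that are not in the slides: values are the strings "ATTACK" and "RETREAT", ties in the majority vote fall back to a default of "RETREAT", and a faulty processor "lies" by flipping the value it sends to odd-numbered receivers (real Byzantine behaviour is arbitrary; this is just one deterministic liar that makes the example reproducible).

```python
from collections import Counter
from typing import Dict, List, Set


def majority(values: List[str], default: str = "RETREAT") -> str:
    """Majority value, or the default when the top two counts tie."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return default
    return counts[0][0]


def send(sender: int, receiver: int, value: str, faulty: Set[int]) -> str:
    """Hypothetical fault model: a faulty sender flips the value for odd-numbered receivers."""
    if sender in faulty and receiver % 2 == 1:
        return "RETREAT" if value == "ATTACK" else "ATTACK"
    return value


def om(m: int, commander: int, lieutenants: List[int], value: str,
       faulty: Set[int]) -> Dict[int, str]:
    """Run OM(m); returns the value each lieutenant settles on."""
    received = {l: send(commander, l, value, faulty) for l in lieutenants}
    if m == 0:
        return received
    # Each lieutenant j relays what it received by acting as commander in OM(m - 1).
    relayed = {
        j: om(m - 1, j, [l for l in lieutenants if l != j], received[j], faulty)
        for j in lieutenants
    }
    return {
        i: majority([received[i]] + [relayed[j][i] for j in lieutenants if j != i])
        for i in lieutenants
    }


print(om(1, 0, [1, 2, 3], "ATTACK", faulty={3}))   # faulty lieutenant
print(om(1, 0, [1, 2, 3], "ATTACK", faulty={0}))   # faulty initiator
```

With four processors and one fault, the non-faulty lieutenants end up with the same value in both runs, and with the initiator's value when the initiator is non-faulty. Running `om(1, 0, [1, 2], "ATTACK", faulty={2})` under the same assumptions leaves the single non-faulty lieutenant with a tie, which illustrates why three processors are not enough.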


Lamport-Shostak-Pease Algorithm (cont’d)

The algorithm has O(n^m) message complexity, with m + 1 rounds of message exchange, where n ≥ (3m + 1)
– See the examples on pages 186-187 in the book, where, with 4 nodes, m can only be 1 and the OM(1) and OM(0) rounds must be exchanged
– The algorithm meets the Byzantine conditions:
• A single value is agreed upon by healthy processors
• That single value is the initiator’s value if the initiator is non-faulty


Dolev et al Algorithm

Since the message complexity of the Oral Message algorithm is exponential in m, polynomial solutions were sought.

Dolev et al found an algorithm which runs with polynomial message complexity and requires 2m + 3 rounds to reach agreement

The algorithm is a trade-off between message complexity and time delay (rounds)
◦ see the description of the algorithm on page 87


Additional Considerations to Dolev

Consider the case where n > (3m + 1)
– more messages are sent than needed
– a set of processors can be selected such that the set size is 3m + 1 (called active processors), and messages can be limited to a degree among these processors
– all active and passive processors using Dolev’s algorithm this way reach Byzantine agreement in 2m + 3 rounds of these limited messages


Applications

Atomic Commit in Distributed Database systems
– In distributed systems, each system performs its individual transaction independently
– Each decides individually whether to commit or abort
– Once they decide, each system transfers its decision to all others
– The final decision is then taken depending upon the common agreement
– In this way it follows the Byzantine agreement solution to the problem


Atomic Commit Protocol

Two-phase commit protocol: the most commonly used atomic commit protocol.

Implemented as an exchange of messages between the coordinator and the cohorts.

Guarantees global atomicity of the transaction even if failures occur while the protocol is executing.
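The coordinator's decision rule can be sketched as follows. This is a deliberately reduced model, not a full protocol implementation: logging, timeouts and recovery are left out, the function name `two_phase_commit` is hypothetical, and a vote of `None` is an assumption used here to stand for a cohort that did not answer.

```python
from typing import Dict, Optional, Tuple


def two_phase_commit(votes: Dict[str, Optional[bool]]) -> Tuple[str, Dict[str, str]]:
    """votes maps each cohort to True (ready to commit), False (abort) or None (no answer)."""
    # Phase 1 (voting): the coordinator asks every cohort and collects the votes.
    # The transaction commits only if every cohort explicitly voted to commit.
    decision = "COMMIT" if votes and all(v is True for v in votes.values()) else "ABORT"
    # Phase 2 (decision): the same decision is sent to every cohort.
    outcome = {cohort: decision for cohort in votes}
    return decision, outcome


print(two_phase_commit({"A": True, "B": True}))
print(two_phase_commit({"A": True, "B": None}))
```

Every cohort receives the same decision in phase 2, which is what makes the outcome atomic across sites.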


DEADLOCK EXTRA NOTES


DEADLOCK DETECTION ALGORITHMS IN DISTRIBUTED SYSTEMS

Advanced Operating System


Overview

• Deadlocks – An Introduction.

• Deadlocks in Distributed Systems.

• Deadlock Handling Techniques.

• Algorithms For Deadlock Detection in Distributed Systems.

• Summary.


Deadlocks – An Introduction

• What Are DEADLOCKS?
A deadlock is a state in which a process is blocked and the block can never be resolved unless there is some outside intervention.

• For Example:
Resource R1 is requested by Process P1 but is held by Process P2.


Illustrating A Deadlock

• Wait-For-Graph (WFG)
Nodes – Processes in the system
Directed Edges – Wait-For blocking relation
• A Cycle represents a Deadlock
• Starvation – A process’ execution is permanently halted.

[Figure: Process 1, Process 2, Resource 1 and Resource 2 connected in a cycle by “Waits For” and “Held By” edges.]


Causes Of Deadlocks

• Mutual Exclusion – Resources being held must be in non-shareable mode.

• Hold and Wait – A Process is holding one resource and is waiting for another, which is held by another process.

• No Preemption – Resource cannot be preempted even if it is being requested.

• Circular Wait – Presence of a cycle of waiting processes.


Deadlocks in Distributed Systems

• Resource Deadlock
Most common. Occurs due to lack of the requested resource.

• Communication Deadlock
A Process waits for certain messages before it can proceed.


Handling Deadlocks

• Deadlock Avoidance
Only fulfill those resource requests that won’t cause deadlock in the future.
Simulate resource allocation and determine if resultant state is safe or not.

• Drawbacks
Inefficient.
Requires prior resource requirement information for all processes.
High cost of scalability.
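One well-known way to carry out the "simulate, then test for a safe state" step is a Banker's-style safety check. The slides do not name a specific method, so treat the following as an illustrative sketch; the data layout (one row per process, one column per resource type) and the names `is_safe` and `can_grant` are assumptions.

```python
from typing import List

Matrix = List[List[int]]


def is_safe(available: List[int], allocation: Matrix, need: Matrix) -> bool:
    """A state is safe if the processes can finish in some order using what is free
    plus whatever earlier finishers release."""
    work = list(available)
    finished = [False] * len(allocation)
    progressed = True
    while progressed:
        progressed = False
        for p in range(len(allocation)):
            if not finished[p] and all(n <= w for n, w in zip(need[p], work)):
                work = [w + a for w, a in zip(work, allocation[p])]  # p finishes and releases
                finished[p] = True
                progressed = True
    return all(finished)


def can_grant(request: List[int], p: int, available: List[int],
              allocation: Matrix, need: Matrix) -> bool:
    """Tentatively grant the request to process p and test the resulting state."""
    if any(r > n for r, n in zip(request, need[p])):
        return False  # asks for more than it declared it would need
    if any(r > a for r, a in zip(request, available)):
        return False  # not enough free resources right now
    new_available = [a - r for a, r in zip(available, request)]
    new_allocation = [row[:] for row in allocation]
    new_need = [row[:] for row in need]
    new_allocation[p] = [x + r for x, r in zip(allocation[p], request)]
    new_need[p] = [x - r for x, r in zip(need[p], request)]
    return is_safe(new_available, new_allocation, new_need)


# One resource type, 12 units in total: 3 free, processes hold 5, 2 and 2.
available, allocation, need = [3], [[5], [2], [2]], [[5], [2], [7]]
print(is_safe(available, allocation, need))
print(can_grant([1], 2, available, allocation, need))
```

The check needs every process's remaining need in advance, which is exactly the prior-information drawback listed on the slide.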


Handling Deadlocks

• Deadlock Prevention
Provide all required resources from start itself.
Prioritize processes. Assign resources accordingly.
Make Prior Rules: For Ex. – Process P1 cannot request resource R1 unless it releases resource R2.

• Drawbacks
Inefficient and affects concurrency.
Future resource requirement unpredictable.
Starvation possible.


Handling Deadlocks

• Deadlock Detection

Resource allocation with an optimistic outlook.
Periodically examine process status.
Detect, then break the deadlock.

• Resolution – Roll back 1 or more processes and break the dependency.


Deadlock Detection

CONTROL ORGANIZATIONS

• Centralized Deadlock Detection
One control node (Coordinator) maintains the Global WFG and searches for cycles.

• Distributed Deadlock Detection
Each node is equally responsible for maintaining the Global WFG and detecting deadlocks.

• Hierarchical Deadlock Detection
Nodes are organized in a tree, where each site detects deadlocks involving only its descendants.


Deadlock Detection Algorithms

• Centralized Deadlock Detection
Ho-Ramamoorthy’s one and two phase algorithms.

• Distributed Deadlock Detection
Obermarck’s Path Pushing Algorithm.
Chandy-Misra-Haas Edge Chasing algorithm.

• Hierarchical Deadlock Detection
Menasce-Muntz Algorithm.
Ho-Ramamoorthy’s Algorithm.


Centralized Deadlock Detection

• Ho-Ramamoorthy’s 1-Phase Algorithm
Each site maintains 2 status tables:
– Process Table.
– Resource Table.
One of the sites becomes the Central Control site.
The Central Control site periodically asks for the status tables.

Contd…


Centralized Deadlock Detection

Ho-Ramamoorthy’s 1-Phase Algorithm Contd…
The Control site builds the WFG using the status tables.
The Control site analyzes the WFG and resolves any cycles present.

• Shortcomings
Phantom Deadlocks.
High Storage & Communication Costs.


Phantom Deadlocks

[Figure: processes P0, P1 and P2 and resources R, S and T, spread over System A and System B.]

P1 releases resource S and asks for resource T.
2 messages are sent to the Control Site:
1. Releasing S.
2. Waiting for T.

Message 2 arrives at the Control Site first. The Control Site builds a WFG with a cycle, detecting a phantom deadlock.
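The effect of message reordering can be reproduced with a toy control site. The initial wait-for edges below (P0 waits for P1, P2 waits for P0) are an assumption chosen to fit the slide's scenario, since the figure itself is not recoverable; each process waits for at most one other process, which keeps the cycle check to a simple chain walk.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, str, str]  # ("wait" | "release", waiter, holder); assumed message format


def in_cycle(waits_for: Dict[str, str], start: str) -> bool:
    """Follow the single outgoing wait-for edge of each process from `start`."""
    seen = set()
    p = start
    while p in waits_for and p not in seen:
        seen.add(p)
        p = waits_for[p]
    return p == start and bool(seen)


def control_site_reports_deadlock(initial: Dict[str, str], arrivals: List[Event]) -> bool:
    """Apply messages in arrival order and check for a cycle after each one."""
    waits_for = dict(initial)
    for kind, waiter, holder in arrivals:
        if kind == "wait":
            waits_for[waiter] = holder
        elif waits_for.get(waiter) == holder:
            del waits_for[waiter]          # the awaited resource was released
        if any(in_cycle(waits_for, p) for p in list(waits_for)):
            return True
    return False


initial = {"P0": "P1", "P2": "P0"}          # assumed: P0 waits for S (held by P1), P2 waits for P0
release_s = ("release", "P0", "P1")         # message 1: P1 releases S, so P0 stops waiting
wait_t = ("wait", "P1", "P2")               # message 2: P1 now waits for T (held by P2)

print(control_site_reports_deadlock(initial, [release_s, wait_t]))   # sent order
print(control_site_reports_deadlock(initial, [wait_t, release_s]))   # reordered in transit
```

In the order the messages were sent, no cycle ever exists. When message 2 overtakes message 1, the control site briefly sees P0 → P1 → P2 → P0 and reports a deadlock that never occurred.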


Centralized Deadlock Detection

• Ho-Ramamoorthy’s 2-Phase Algorithm

Each site maintains a status table for processes:
Resources Locked & Resources Awaited.

Phase 1
The Control Site periodically asks for these Locked & Waited tables.
It then searches for the presence of cycles in these tables.

Contd…


Centralized Deadlock Detection

Ho-Ramamoorthy’s 2-Phase Algorithm Contd…

Phase 2
If cycles are found in the phase 1 search, the Control site makes a 2nd request for the tables.
The details found common in both table requests are analyzed for cycle confirmation.

• Shortcomings
Phantom Deadlocks.


Distributed Deadlock Detection

• Obermarck’s Path-Pushing Algorithm

Individual sites maintain a local WFG.
A virtual node ‘x’ exists at each site.
Node ‘x’ represents external processes.

Detection Process
Case 1: If Site Sn finds a cycle not involving ‘x’ -> Deadlock exists.
Case 2: If Site Sn finds a cycle involving ‘x’ -> Deadlock possible.

Contd…


Obermarck’s Path-Pushing Algorithm

• If Case 2 ->
Site Sn sends a message containing its detected cycles to other sites. All sites receive the message, update their WFG and re-evaluate the graph.

Consider that Site Sj receives the message:
Site Sj checks for local cycles. If a cycle is found not involving ‘x’ (of Sj) -> Deadlock exists.
If site Sj finds a cycle involving ‘x’, it forwards the message to other sites.
The process continues till a deadlock is found.


Distributed Deadlock Detection

• Chandy-Misra-Haas Edge Chasing algorithm.

The blocked process sends a ‘probe’ message to the process holding the resource.

A ‘probe’ message contains:
– ID of the blocked process.
– ID of the process sending the message.
– ID of the process to which the message was sent.

When a probe is received by a blocked process, it forwards the probe to the processes holding the resources it has requested.

If a blocked process receives its own probe -> Deadlock exists.


Hierarchical Deadlock Detection

• Menasce-Muntz Algorithm

Sites (controllers) are organized in a tree structure.

Leaf controllers manage local WFGs.

Upper controllers handle deadlock detection.
Each parent node maintains a Global WFG, the union of the WFGs of its children, and detects deadlocks among its children.

Changes are propagated upwards in the tree.


Hierarchical Deadlock Detection

• Ho-Ramamoorthy’s Algorithm
Sites are grouped into clusters.
Periodically, 1 site is chosen as the central control site:
– The central control site chooses control sites for the other clusters.
The control site for each cluster collects the status graph there:
– Ho-Ramamoorthy’s 1-phase centralized DD algorithm is used.
All control sites forward their status reports to the Central Control site, which combines the WFGs and performs the cycle search.


Summary

Centralized Deadlock Detection Algorithms
– Large communication overhead.
– Coordinator is a performance bottleneck.
– Possibility of single point of failure.

Distributed Deadlock Detection Algorithms
– High complexity.
– Detection of phantom deadlocks possible.

Hierarchical Deadlock Detection Algorithms
– Most common.
– Efficient.


“Choose the least general technique - which is still general enough to solve the problem”. Edgar Knapp.

THANK YOU