Chapter 7, Deadlocks


Transcript of Chapter 7, Deadlocks

1

Chapter 7, Deadlocks

2

7.1 System Model

• In order to talk about deadlocks in a system, it’s helpful to have a model of the system

• A system consists of a set of resources
• The resources can be grouped into types
• Processes compete to have (unique) access to instances of the types

3

• Examples of resource types:
– Memory locations
– CPU cycles
– Files
– Object locks
– I/O devices (printers, drives, etc.)

• When classifying by type, each item has to be interchangeable

• Otherwise, it’s necessary to classify into (distinct) subtypes

4

• Each process can request (and hold) as many resources as needed to complete its work

• Each process may request one or more instances of one or more types of resource

• It makes no sense for a process to request more instances of resources than exist

• Such a request can only be denied, and the requesting process cannot accomplish whatever it was intended to do

5

• Request/Use sequence:
– Request the resource
– This request will either be granted or not
– If granted, use the resource
– After use, release the resource

• Note that “use” will involve a certain period of time when the process holds the resource
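• To make the request/use/release pattern concrete in Java terms, here is a minimal sketch (my own example, not the book's; the printerLock name is invented):

```java
import java.util.concurrent.locks.ReentrantLock;

public class RequestUseRelease {
    // Hypothetical lock standing in for one resource instance (e.g., a printer).
    private static final ReentrantLock printerLock = new ReentrantLock();

    public static void printJob(String document) {
        printerLock.lock();          // request; blocks until granted
        try {
            // use: the thread holds the resource for some period of time
            System.out.println("printing " + document);
        } finally {
            printerLock.unlock();    // release, even if the "use" step throws
        }
    }
}
```

• The finally block guarantees the release step happens even if the use step fails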

6

• Request and release are forms of system calls
• As usual, you may make explicit calls, or the compiler may generate them from your source code

• Examples:
– You can open and close files
– You can request and release devices
– You can allocate and de-allocate (free) memory locations

7

• The O/S manages access to shared system resources

• User level resource sharing may also be implemented in synchronized application code

• The general plan of action for synchronization described in the previous chapter applies to system level resource management

8

• Aspects of system resource management:
• The O/S keeps a list of resources and records which have been allocated to which process
• For processes that couldn’t acquire a resource, there is a separate waiting list for each resource
• Processes’ statuses implicitly (due to the presence of their PCB in a waiting list) or explicitly (due to a value recorded in their PCB) reflect whether the processes have acquired or are waiting for certain resources
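• A toy sketch of that bookkeeping (my own; the names and structure are invented, not the book's), assuming single-instance resources identified by name:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Which process holds each resource, and which processes are waiting for it.
public class ResourceTable {
    private final Map<String, Integer> heldBy = new HashMap<>();         // resource -> pid
    private final Map<String, Queue<Integer>> waiting = new HashMap<>(); // resource -> waiting pids

    // Returns true if the resource was granted, false if the process must wait.
    public synchronized boolean request(int pid, String resource) {
        if (!heldBy.containsKey(resource)) {
            heldBy.put(resource, pid);
            return true;
        }
        waiting.computeIfAbsent(resource, r -> new ArrayDeque<>()).add(pid);
        return false;   // the caller's "PCB" is now implicitly in a waiting list
    }

    // Release: hand the resource to the next waiter, if any.
    public synchronized void release(String resource) {
        Queue<Integer> q = waiting.get(resource);
        if (q == null || q.isEmpty()) {
            heldBy.remove(resource);
        } else {
            heldBy.put(resource, q.remove());
        }
    }
}
```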

9

Definition of deadlock

• Deadlock is a condition involving a set of processes (and a set of resources)

• Verbal definition of deadlock:
• Each process in the set of deadlocked processes is waiting for an event (namely, the release of a lock = the release of a resource) which can only be caused by another process in that set

10

• Keep in mind the idea that you can have resources and you can have locks on the resources

• Access to the resource is managed through access to or possession of the lock

• This was the basis of the car/car title analogy
• Hypothetically, access to resources could be granted directly, without some lock concept between the process and the resource

• In that case, deadlock could literally occur on the resources

11

• In practice, access to a resource is managed through a semaphore, a lock, or some other control mechanism

• As a consequence, deadlocking typically occurs on the stand-in, the lock or other mechanism, not on the resource itself

12

• Under correct synchronization, deadlock should not be possible on one resource

• Either one process has it or another does
• The process that doesn’t is simply waiting for it to be released by the other process, without any other conditions being attached

13

• Deadlock can occur with as few as two contending processes and two shared resources

• Deadlocks can occur on instances of one kind of resource type or instances of various kinds of resource type

• The number of processes and resources involved in a deadlock can be arbitrarily large

14

• As a student, the place you are most likely to concretely encounter deadlock is in Java thread programming

• Although Java doesn’t necessarily use the term “lock”, its mutual exclusion mechanism amounts to a process action that has the effect of a lock

• Mutual exclusion means that one thread locks another out of a critical section

15

• As soon as you have threads, you have the potential for shared resources—and a critical section itself can qualify as the shared resource

• As soon as you have shared resources, you have to do synchronization so that your code is thread safe

• As soon as you write code with synchronization, assuming >1 process and >1 resource, you have the potential for deadlock
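• A minimal sketch of how this plays out in Java thread code (my own example, not the book's): two threads take two monitor locks in opposite orders, so each can end up holding one lock while waiting for the other

```java
public class TwoLockDeadlock {
    private static final Object resourceA = new Object();
    private static final Object resourceB = new Object();

    public static void main(String[] args) {
        Thread t1 = new Thread(() -> {
            synchronized (resourceA) {               // hold A ...
                sleepBriefly();
                synchronized (resourceB) {           // ... and wait for B
                    System.out.println("t1 has A and B");
                }
            }
        });
        Thread t2 = new Thread(() -> {
            synchronized (resourceB) {               // hold B ...
                sleepBriefly();
                synchronized (resourceA) {           // ... and wait for A
                    System.out.println("t2 has B and A");
                }
            }
        });
        t1.start();
        t2.start();                                  // circular wait is now possible
    }

    private static void sleepBriefly() {
        try { Thread.sleep(100); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

• The sleep only makes the bad interleaving likely; deadlock is possible, not guaranteed, on any given run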

16

7.2 Deadlock Characterization

• Necessary conditions for deadlock:
• 1. Mutual exclusion (locking): There are common resources that can’t be shared without concurrency control

• 2. Hold and wait: Once processes have acquired resources (locks) they’re allowed to hold them while waiting for others

17

• 3. No pre-emption: Resources can’t be pre-empted (swiped from other processes).
– A process can only release its resources voluntarily
• 4. Circular wait: This condition is actually redundant.
– The previous three conditions imply this one, which is essentially a complete statement of the underlying problem:

18

• This is a summary of the deadlock conditions:
• If processes can’t acquire resources, they wait.
• If process x is waiting for a resource held by process y and y is waiting for a resource held by x, this is circular wait

• It turns out that it’s relatively easy to understand deadlocks by looking at diagrams of them

19

Resource allocation graphs

• Let processes, Pi, be represented as circles (labeled)
• Let resources, Ri, be represented as boxes with a dot for each instance of the type
• Let a request by a process for a resource be represented by an arrow from the process to the resource

• Let the granting of requests, if successful, be immediate.

• Such an arrow will only be shown in cases where the request could not be granted

20

• Let the granting of requests also be atomic
• Let the granting, or assignment, of a resource to a process be represented by an arrow from the resource to the process

• If there is more than one instance of the resource, the arrow should go from the specific dot representing that instance

• Illustrations follow

21

Pi requesting and being granted an instance of Rj

22

• A cycle in the resource allocation graph implies a possible deadlock (the exact conditions are refined below)

• A cycle in a directed graph is not simply a connected path through the graph

• It’s a path where all of the arrows point in the same direction

• A diagram of the simple, classical case follows

23

24

• The book’s next example includes multiple resource types and multiple processes

• It also includes multiple dots per box, representing multiple instances of a resource

• This principle holds: If there is no cycle in the graph, there is no deadlock

• The example shows waiting, but no deadlock
• A diagram follows

25

26

• Whether there is more than one instance of a resource type or not, no cycle means no deadlock

• However, with more than one instance of a resource type, even if there is a cycle in the graph, there may or may not be deadlock

• The next example illustrates a case with deadlock

27

28

• The next example illustrates a case where there is a cycle in the resource allocation graph, but there is no deadlock

• A verbal description of the situation is given next—but it can essentially be skipped

• The lack of a deadlock should be apparent from the diagram which follows

29

• There is no deadlock because, of the multiple instances of resource R1 which are held, one is held by P2, which is not in the cycle.

• Similarly, one instance of R2 is held by P4, which is not in the cycle

• If P2 gives up R1, R1 will immediately be assigned to P1, and the cycle in the graph will disappear.

• Likewise, if P4 gives up R2, R2 will immediately be assigned to P3, and the cycle in the graph will disappear

30

31

7.3 Methods for Handling Deadlocks

• There are three major approaches to deadlock handling which lead to subsections in the book

• 1. Use techniques so that a system never enters the deadlocked state
– A. Deadlock prevention
– B. Deadlock avoidance
• 2. Allow systems to deadlock, but support:
– A. Deadlock detection
– B. Deadlock recovery

32

• 3. Ignore the problem—in effect, implement processing without regard to deadlocks
– If problems occur (system performance slows, processing comes to a halt) deal with them on a special case basis
– The justification for this is that formal deadlock may occur rarely—and there are other reasons that systems go down
– Administrative tools have to exist to re-initiate processing in any case.
– Deadlock is just one of several different cases where they are necessary

33

• In extreme cases, rebooting the system may be the solution

• Consider these observations:
– How many times a year, on average, do you press CTRL+ALT+DEL on a Windows based system?
– Hypothesize that in a given environment, one formal deadlock a year would occur
– Under these circumstances, would it be worthwhile to implement a separate deadlock handling mechanism?

34

• In the interests of fairness, note this:
• In fact, in simple implementations of Unix, what has been described is the level of support implemented for deadlock handling

• It may not be elegant, but it’s easy to reboot, and suitable for simple systems

35

Deadlock handling in Java

• The book’s explanation may leave something to be desired

• The book’s example program is so confusing that I will not pursue it

• In short, although a fair amount of Java synchronization was covered in the last chapter, I will be limiting myself to the general discussion of deadlock handling in this chapter

36

• The situation can be summarized in this way:
• Java doesn’t contain any specific deadlock handling mechanisms
• If threaded code may be prone to deadlocking, then it’s up to the application programmer to devise the deadlock handling

37

• It is worth keeping in mind that the Java API contains these methods, which apply to threads, and have been deprecated:

• suspend(), resume(), and stop()
• Part of the reason for deprecating them is that they have characteristics which impinge on deadlock

38

• The suspend() method causes the currently running thread to be suspended, but the suspended thread will continue to hold all locks it has acquired

• The resume() method causes a thread to start again, but this call to resume can only be made in some other, running thread

• If the suspended thread holds locks required by the other thread which contains the call to resume the suspended thread, deadlock will result

39

• The stop() method isn’t directly deadlock prone

• As pointed out some time ago, it is prone to lead to inconsistent state

• Consider this typical sequence of events:
– Acquire a lock
– Access a shared data structure
– Release the lock

40

• When stop() is called, all locks held by the thread are immediately released

• In confused code, stop() could be called at a point where a shared resource has undergone incomplete modification

• In other words, the call to stop() will cause locks to be released before the point where they should be

• This can lead to inconsistent state
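• A small sketch (mine, not the book's) of why that matters: the two fields below must change together, and a stop() delivered between the assignments would release the monitor with the invariant broken

```java
public class Account {
    private long balance;
    private long auditTotal;   // invariant: auditTotal tracks every change to balance

    public synchronized void deposit(long amount) {
        balance += amount;
        // If another thread called the deprecated Thread.stop() on this thread
        // right here, the monitor would be released as the thread died,
        // leaving balance updated but auditTotal not: inconsistent state.
        auditTotal += amount;
    }
}
```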

41

• It may seem odd that a programmer would call stop() somewhere during the modification of a shared resource, but

• Hamlet: “...There are more things in heaven and earth, Horatio, / Than are dreamt of in your philosophy.” (Hamlet, Act 1, scene 5, 159–167)

42

• In short:
• In Java, it’s the programmer’s problem to write code that isn’t deadlock prone
• There are definitely things in the Java API to avoid if you are worried about deadlock
• The book’s example program is not the ideal vehicle for seeing how to solve this as a programmer
• I don’t have the time to dream up a better example
• …

43

7.4 Deadlock Prevention

• Recall the preconditions for deadlock:
• 1. Mutual exclusion (locking)
• 2. Hold and wait
• 3. No pre-emption
• 4. Circular wait (redundant)
• The basic idea behind deadlock prevention is to implement a protocol (write code) where at least one of the preconditions can’t hold or is disallowed

44

1. Disallowing mutual exclusion

• This is not an option
• It’s true that without mutual exclusion, deadlock is impossible, but this consists solely of wishing the problem away

• The whole point of the last chapter was the fact that in some cases mutual exclusion is necessary, and it is therefore necessary to be able to manage the deadlock that comes along with it

45

• In support of the need for mutual exclusion, the book finally gives a rock-bottom simple example of a shared resource managed by the O/S where mutual exclusion is necessary:

• Having given the printer to one process, it is out of the question to interrupt it and hand the resource over to another process in the middle of a print job

46

2. Disallowing hold and wait

• This is doable
• There are two basic approaches:
– A. Request (and acquire) all needed resources before proceeding to execute
– B. Only request needed resource(s) at a point when no others are held
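• A sketch of approach A in Java terms (my own illustration; the lock names are invented):

```java
import java.util.concurrent.locks.ReentrantLock;

public class AcquireAllUpFront {
    private static final ReentrantLock scannerLock = new ReentrantLock();
    private static final ReentrantLock printerLock = new ReentrantLock();

    // Approach A: request (and acquire) everything needed before doing any work.
    // For this to truly disallow hold-and-wait, every thread must do its whole
    // acquisition up front (here in one agreed order, so the grab itself is safe).
    public static void copyDocument() {
        scannerLock.lock();
        printerLock.lock();
        try {
            // ... use both resources for the whole job ...
        } finally {
            printerLock.unlock();
            scannerLock.unlock();
        }
    }
}
```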

47

• Both option A and option B are kinds of block acquisition.

• B is a finer grained version of it than A—a single process may request and release >1 block of resources over time

• Note that “wait” in the phrase “hold and wait” means wait for other needed resources to become available

48

Problems with disallowing hold and wait:

• Low resource utilization—because processes grab and hold everything they need, even when they’re not using it

• If B is impractical, A is forced, but A is the more drastic choice where more unused resources are held for longer times

• Starvation is possible if a process needs a large set of resources and has to be able to acquire them all at the same time

49

• In general, the underlying goal of a multi-tasking system is concurrency.

• Disallowing hold and wait reduces concurrency
• Processes that aren’t deadlocked may not be able to run because they require a resource another process is holding but not using

• Note that disallowing hold and wait is not practical in Java.

• Java doesn’t have a block request mechanism

50

3. Disallowing no pre-emption

• Implement a protocol which allows one process to take locks/resources away from other processes as needed

• There are two approaches, given below

51

• 1. Self-pre-emption: The requesting process gives up resources
– If process P holds resources and requests something that’s unavailable because it’s held (by process Q, for example), process P releases the resources it has already acquired
– Process P will have to start over from scratch
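• With java.util.concurrent locks, a self-pre-emption protocol is easy to sketch (this example is mine, not the book's): if the second lock isn't available, the thread backs off by releasing what it already holds and starting over

```java
import java.util.concurrent.locks.ReentrantLock;

public class SelfPreemption {
    // Hypothetical locks standing in for two resources.
    private static final ReentrantLock lockA = new ReentrantLock();
    private static final ReentrantLock lockB = new ReentrantLock();

    public static void doWork() throws InterruptedException {
        while (true) {
            lockA.lock();                       // acquire the first resource
            if (lockB.tryLock()) {              // request the second without blocking
                try {
                    // ... use both resources ...
                    return;
                } finally {
                    lockB.unlock();
                    lockA.unlock();
                }
            }
            lockA.unlock();                     // self-pre-empt: give up what we hold
            Thread.sleep(10);                   // back off (ideally randomized, to avoid livelock), then start over
        }
    }
}
```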

52

• 2. External pre-emption: The requesting process takes resources
– If process P requests something that’s unavailable because it’s held by process Q, and process Q in turn is also waiting for resources, then process Q is required to release its resources and process P takes what it needs

– In this case, process Q was pre-empted and will have to start over from scratch

– If process Q is active, not waiting, when P makes its request, then pre-emption will not occur and P will have to wait

53

• Note that pre-emption based protocols are related in nature to pre-emptive scheduling and context switching

• The CPU itself is a resource shared among processes
• Registers and other system components are examples of specific parts of this resource
• One process can interrupt, or pre-empt another, and take the shared resource
• Context switching can be made to work because the state of the pre-empted process can be saved and restored

54

• In the verbal description of the deadlock algorithms, the statement was made that pre-empted processes would have to start from scratch

• That didn’t mean that users would have to resubmit them

• It meant that the system would keep a record of the resources they had requested, and when they were scheduled again, the first order of business would be re-acquiring those resources

55

• Keep in mind that in some cases, pre-emption is simply not a practical option.

• If a process has already acquired a printer, it should not give it up until it’s finished with it

• Chaos results if the printed output consists of interleaved print jobs

56

4. Disallowing circular wait

• Although “circular wait” is in a sense redundant, it encapsulates the idea behind deadlock and suggests a solution
– 1. Number the resource types
– 2. Only let processes acquire items of types in increasing resource type order
– 3. A process has to request all items of a given type at the same time

57

• Why does this work?
• Examine the classic case of deadlock, and observe how it doesn’t meet the requirements listed above:

58

59

• The order of actions for P1:
– Acquire R1
– Request R2
• The order of actions for P2:
– Acquire R2
– Request R1

• P2 did not acquire/try to acquire resources in ascending order by type

60

• If all processes acquire in ascending order, waiting may result, but deadlock can’t result

• This is the verbal explanation:
– Suppose process 1 holds resources a, b, c, and requests resource x
– Suppose process 2 holds x
– Process 1 will have to wait for process 2

61

– However, if process 2 already holds x, because it acquires in ascending order, it cannot be waiting for resources a, b, or c

– It may already have acquired copies of those resources that it needs, or it may not need them at all, but it will not be going back to try and get them

– Therefore, it is not possible for process 1 and process 2 to be deadlocked
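• In Java thread code, the same rule usually appears as a lock-ordering convention (a sketch of mine, not the book's): give every lock a number and always acquire in ascending order

```java
public class OrderedLocking {
    // Hypothetical resource with a globally agreed ordering number.
    // Assumes every resource has a distinct order number.
    static class Resource {
        final int order;           // the "resource type number"
        Resource(int order) { this.order = order; }
    }

    // Always lock the lower-numbered resource first, so no circular wait can form.
    static void useBoth(Resource a, Resource b, Runnable work) {
        Resource first  = (a.order <= b.order) ? a : b;
        Resource second = (a.order <= b.order) ? b : a;
        synchronized (first) {
            synchronized (second) {
                work.run();
            }
        }
    }
}
```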

62

• A variation on the previous idea (which involves a serious cost)

• Suppose a process reaches an execution point where it becomes clear that a resource with a smaller number is now needed

• The process will have to release all resources numbered higher than that one and acquire them again—in ascending order

63

7.5 Deadlock Avoidance

• The previous discussion was of techniques for deadlock prevention

• Deadlock prevention reduces concurrency
• Deadlock avoidance is based on having more information about resource requests

64

• With sufficient knowledge requests may be ordered/granted in a way that will not lead to deadlock

• A simple starting point for the discussion: Require all processes to declare their resource needs up front—then work from there

65

Definitions needed to discuss deadlock avoidance in detail

• In order to formally define a deadlock avoidance algorithm, it’s necessary to define a safe state

• Very informally, a safe state is a situation where certain resources have been allocated to certain processes, and making yet another allocation will not cause deadlock

• A deadlock avoidance algorithm is one that moves only from one safe state to another

66

• In order to formally define a safe state, it’s necessary to define a safe sequence

• A sequence refers to a sequence of processes, which can be shown using this notation: <P1, P2, …, Pn>

• A safe sequence has this property:
• For all Pj, Pj’s resource requests can be satisfied by the currently free resources plus the resources held by all Pi with i < j

67

• Safe state:
• Informal definition: The system can allocate resources to processes in some order without deadlock

• Formal definition: there exists a safe sequence

68

• Notice how this is related to the idea of circular wait
• Under the definition given, Pj can be waiting, but only on Pi where i < j

• Pj can’t be waiting on some Pk where k > j

• More specifically, Pi can’t be waiting on Pj

• Waiting only goes in sequential order, ascending by subscript

• As a consequence, there can be no circular waiting

69

• According to these definitions:
• 1. Certainly, a safe state is not deadlocked
• 2. Incidentally, a deadlocked state is not safe
• 3. In particular, there are unsafe states that are not yet deadlocked, but could or will lead to deadlock

70

• This is a scenario for such an unsafe state:
• Pi is waiting for Pj
• This is unsafe, because it is a “forward” wait
• Pj hasn’t yet made a request on Pi
• When it does, it would be legal
• But in the presence of the pre-existing forward wait, it would cause deadlock

71

• The nub of deadlock avoidance is point 3
• Under deadlock prevention, you avoid going into a deadlocked state
• Under deadlock avoidance, you want to avoid going into an unsafe state

72

• Resource allocations can cause a system to move from a safe to an unsafe state

• These may be allocations of currently free resources

• Because processes pre-declare their needs, an unsafe state (leading to possible deadlock) can be foreseen based on other requests and allocations that will be made

73

• There are several algorithms for deadlock avoidance
• They all prevent a system from entering an unsafe state
• The algorithms are based on pre-declaration of needs, but not immediate acquisition of all resources up front

• Like deadlock prevention algorithms, they tend to have the effect of reducing concurrency, but the level of concurrency reduction is low since not all acquisitions happen up front

74

Resource allocation graph algorithm for deadlock avoidance

• Let the previous explanation of the resource allocation graph stand:
– Let a request by a process for a resource be represented by an arrow from the process to the resource

– When a request is made, it may be granted or not, but if it is, the granting is atomic

– Let the granting, or assignment, of a resource to a process be represented by an arrow from the resource to the process

75

• For deadlock avoidance, let a new kind of edge, a claim edge, be added to the notation
– A claim edge goes in the same direction as a request, but it’s dashed
– A claim edge represents a future request that will be made by a process

76

• The resource allocation graph algorithm says:
– Requests for resources can only be granted if they don’t lead to cycles in the graph
– When identifying cycles, the dashed claim edges are included

77

• What follows is a sequence of diagrams showing two processes progressing towards an unsafe state

• Initially, the diagram only shows claim edges.
• These are the pre-declared requests of the processes
• The diagrams show a sequence of actual requests and granting of them
• The last request can’t be granted because it would lead to an unsafe state

78

P1 acquires R1

79

P2 requests R1, P2 requests R2, and P2 acquires R2. This leads to the third state, which is unsafe. From that state, if P1 then requested R2 before anything was released, deadlock would result. The middle state is the last safe state. Therefore, P2’s request for R2, the transition to the third state, can’t be granted.

80

Reiteration of aspects of the resource allocation graph algorithm for deadlock avoidance

• This algorithm is based on pre-declaring all claims

• The pre-declaration requirement can be relaxed by accepting new claims if all of a process’s edges are still only claims

• The algorithm would require the implementation of graph cycle detection.

• The authors say this can be done in O(n²)
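• One way to picture an implementation (a sketch of mine; names invented, single-instance resources assumed): keep request, claim, and assignment edges in one directed graph and grant a request only if the new assignment edge would not close a cycle

```java
import java.util.*;

// Nodes are process/resource names; request, claim, and assignment edges
// all live in the same directed graph.
public class AvoidanceGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();

    public void addEdge(String from, String to) {
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    public void removeEdge(String from, String to) {
        Set<String> out = edges.get(from);
        if (out != null) out.remove(to);
    }

    // Convert process p's claim/request on free resource r into an assignment,
    // but only if the new assignment edge r -> p would not close a cycle.
    public boolean tryGrant(String p, String r) {
        removeEdge(p, r);                 // drop the request/claim edge being converted
        if (reachable(p, r)) {            // r still reachable from p: assignment would create a cycle
            addEdge(p, r);                // restore the pending request; p must wait
            return false;
        }
        addEdge(r, p);                    // safe: record the assignment
        return true;
    }

    private boolean reachable(String from, String target) {
        Deque<String> stack = new ArrayDeque<>(List.of(from));
        Set<String> seen = new HashSet<>();
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (node.equals(target)) return true;
            if (seen.add(node)) stack.addAll(edges.getOrDefault(node, Set.of()));
        }
        return false;
    }
}
```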

81

The banker’s algorithm

• The resource allocation graph algorithm doesn’t handle multiple instances of each resource type

• The banker’s algorithm does
• This algorithm is as exciting as its name would imply
• It basically consists of a bunch of bookkeeping
• It’s messy to do, but there is nothing cosmic about the idea
• I’m not covering it

82

7.6 Deadlock Detection

• If you don’t do deadlock prevention or deadlock avoidance, then you allow deadlocks

• At this point, if you choose to handle deadlocks, two capabilities are necessary:
– 1. Deadlock detection
– 2. Deadlock recovery

• (Remember that some systems may simply ignore deadlocks.)

83

Detecting deadlocks with single instances of resource types

• This can be done with a wait-for-graph
• This is like a resource allocation graph, but it’s not necessary to record the resources
• The key information is whether one process is waiting on another
• A cycle in the graph still indicates deadlock
• A simple illustration of a resource allocation graph and a comparable wait-for-graph follow

84

85

• Wait-for-graph (WFG) algorithm implementation

• 1. The system maintains a WFG, adding and removing edges with requests and releases

• 2. The system periodically searches the graph for cycles.
– This is O(n²), where n is the number of vertices
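• A small sketch of that bookkeeping (mine, not the book's): wait-for edges live in a map, and a depth-first search looks for a cycle

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy wait-for graph: an edge p -> q means process p is waiting for a
// resource currently held by process q.
public class WaitForGraph {
    private final Map<Integer, Set<Integer>> waitsFor = new HashMap<>();

    public synchronized void addEdge(int p, int q) {
        waitsFor.computeIfAbsent(p, k -> new HashSet<>()).add(q);
    }

    public synchronized void removeEdge(int p, int q) {
        Set<Integer> targets = waitsFor.get(p);
        if (targets != null) targets.remove(q);
    }

    // Periodic check: a cycle in the graph means the processes on it are deadlocked.
    public synchronized boolean hasCycle() {
        Set<Integer> done = new HashSet<>();
        Set<Integer> onPath = new HashSet<>();
        for (Integer p : waitsFor.keySet()) {
            if (dfs(p, done, onPath)) return true;
        }
        return false;
    }

    private boolean dfs(int p, Set<Integer> done, Set<Integer> onPath) {
        if (onPath.contains(p)) return true;   // back edge: cycle found
        if (done.contains(p)) return false;
        onPath.add(p);
        for (Integer q : waitsFor.getOrDefault(p, Set.of())) {
            if (dfs(q, done, onPath)) return true;
        }
        onPath.remove(p);
        done.add(p);
        return false;
    }
}
```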

86

Several instances of a resource type

• This is analogous to the banker’s algorithm
• Instead of a WFG, it’s necessary to maintain some N×M data structures and algorithms
• I am uninterested in the details

87

Deadlock detection algorithm usage

• When should deadlock detection be invoked?
• This depends on two questions:
– 1. How often is deadlock likely to occur?
– 2. How many processes are likely to be involved?

• The general answer is, the more likely you are to have deadlock, and the worse it’s likely to be, the more often you should check

88

• Checking for deadlock is a trade-off
• Checking isn’t computationally cheap
• Checking every time a request can’t be satisfied would be extreme
• If deadlock is a real problem on a given system, checking only every hour might be extreme in the other direction

89

• Deadlock will tend to affect system performance, like CPU utilization

• A system (or administrator) might adopt a rule of thumb like this:

• Trigger deadlock detection when CPU utilization falls below a certain threshold, like 40%.
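• As an aside on the Java side (my own sketch, not from the book): the JDK's management API does offer a monitor-level deadlock detector, ThreadMXBean.findDeadlockedThreads(), which a background task could invoke periodically; it is detection only, so recovery is still the programmer's problem, consistent with the earlier point

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeadlockMonitor {
    public static void start() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Periodic check; a real system might instead trigger this when CPU
        // utilization drops below some threshold, as the rule of thumb above suggests.
        scheduler.scheduleAtFixedRate(() -> {
            long[] ids = mx.findDeadlockedThreads();   // null if no deadlock
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    System.err.println("Deadlocked: " + info.getThreadName());
                }
            }
        }, 10, 10, TimeUnit.SECONDS);
    }
}
```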

90

7.7 Recovery from Deadlock

• Recovery from deadlock, when detected, can be manual, done by an administrator

• It can also be an automatic feature built into a deadlock handling system

91

• Overall, deadlock recovery falls into two possible categories:
– 1. Abort processes to break cycles
– 2. Pre-empt resources from processes to break cycles without aborting

92

Process termination (abortion)

• There are basically two approaches:
– 1. Abort all deadlocked processes.
– 2. Abort one process at a time among those that are deadlocked
• Approach 1 has the disadvantage that it wastes a lot of work that was already done

93

• Approach 2 also has disadvantages
– The underlying problem is that there may be >1 cycle, and a given process may be in more than one cycle

– If one process at a time is aborted, deadlock detection will have to be done after each abortion to see whether the system is deadlock free

94

• The overall problem with abortion is the potential for leaving resources in an inconsistent state

• If abortion is supported, then all changes made by a process should be individually logged

• Then if a partially finished process is aborted, an advanced system will roll back all of the changes that the process had made

• That is, abortion isn’t complete until there is no trace of the process’s existence left

95

• Aside from the general question of rollback, if selective abortion is done under approach 2, then there need to be criteria for picking a victim. For example:
– Process priority
– Time already spent computing, time remaining (% completion)
– What resources are held
– How many more resources are needed? (% completion as measured by resources)
– How many (other?) processes will need to be terminated?
– Whether the process is interactive or batch…

96

Resource pre-emption instead of full abortion

• This is an even finer scalpel.
• Three questions remain:
– Victim selection: How do you choose a victim for pre-emption (what cost function)?
– Rollback: In what way can you bring a partially finished process back to a safe state where it could be restarted and would run to completion correctly, short of aborting it altogether?

– Starvation: How do you make sure a single process isn’t repeatedly pre-empted?

97

• Consider the general principle illustrated by deadlock

• The problem, deadlock, arises due to concurrency
• The mindless “solution” to the problem is to eliminate concurrency
• Barring that, a solution at one extreme is to ignore deadlock, hoping that it is infrequent, and overcoming it by gross means, such as rebooting

98

• From there, possible solutions get more and more fine-grained, each leading to its own problems to solve
• You can do prevention, avoidance, or detection and recovery
• With detection and recovery you have to decide how fine-grained the recovery mechanism is, whether rollback can be implemented, whether it is possible to pre-empt at the level of individual resources rather than whole processes…

• This continues ad infinitum, until you’ve had enough and you decide that a certain level of solution is cost-effective for the system under consideration

99

The End