COP 4610: OPERATING SYSTEMS & CONCURRENT PROGRAMMING |
Dining Philosophers Problem |
We have separately discussed the Dining Philosophers problem, demonstrated that it can result in deadlock, and shown two ways to avoid deadlock in solutions to that problem. We will now look at some more examples, analyze more carefully what deadlock means, and look at a broader range of solutions.
The figure, which is of a traffic deadlock, makes the nature of deadlock very clear. None of the cars can pass unless at least one of them backs up or is removed.
Deadlock comes from contention between processes (or threads) for resources. So, to understand deadlock we need to understand resources.
One author of an operating sytems textbook has proposed that a resource be defined as "something a process can wait for".
A first step in understanding resources is to recognize that they can be divided into two kinds, according to whether they can be reused or not.
The most is known about reusable resources. Deadlock is easy to analyze because of there being a known finite number of resources.
You can think of reusable resources as collections, each of which contains one or more units. At a high level of abstraction, all the different modes of access can be modeled in terms of exclusive access to different numbers of units of a resource.
In the simplest case, a single-unit resource, the collection has just one unit. An example of a single-unit resource is a mutex. Only one thread may hold a mutex at a time. The only mode of access is exclusive.
We have previously seen the readers-writers synchronization problem. Reader-writer access to a resource can be modeled by pretending there are "tickets" for the resource, sufficiently many that every reader can have a ticket. These tickets are the logical units of the resource. For a reader to access the resource it needs only one ticket. For a writer to access the resource, it needs all the tickets. Assuming the tickets are accessed in exclusive mode, this models gives the correct behavior for reader-writer locks. That is, any number of concurrent readers can obtain tickets to access the resource, but cannot succeed in getting all the tickets so long as another reader or another writer is holding a ticket.
Unrestricted sharing can also be modeled, by pretending there is a very large number of tickets for the resource, and only a single ticket is required to access it. However, unrestricted sharing means no waiting, and so is not a mode we have to consider when we study deadlock.
The steps by which a process (or thread) gets a resource acquires and releases a resource involve an alternation of control between the process and the operating system (or resource manager). The first step is for the process to ask for (request) the resource. This is at the initiative of the process. The second step is for the operating system (or resource manager) to grant the resource. This is the only step that is under control of for the operating system, and the only aspects that the operating system can control is when (if ever) to grant the request. After the resource has been granted, the process is said to have acquired it. The third step is for the process to release the resource. This is agin under the control of the process.
This alternation can be viewed like a game, in which the players alternate moves. In the deadlock game the operating system loses if a deadlock develops.
Questions:
What are examples of reusable resources?
Processors, I/O channels, main and secondary memory, files, databases, and semaphores
Deadlocks involving consumable resources are harder to analyze than those involving only reusable resources. The main problem is that the conditions under which a reusable resource may be produced can depend in an arbitrarily complex way on the internal logic of processes. Given the operating system has to deal with an unknown and arbitrary collection of user processes, cannot see into the processes, and could not predict their actions even if it could see into them (remember the Halting Problem), there is no general way for the operating system to prevent, avoid, or even recognize the existence of a deadlock involving consumable resources. For this reason, dealing with consumable resource deadlocks is outside the scope of our present study of operating systems.
By the way, this difficulty with deadlock is arguably one of the disadvantages of message-based synchronization. By comparison, if we can solve a problem using simple mutual exclusion alone (that is, single-unit reusable resources) there are simple design techniques that guarantee there is no possibility of deadlock.
Questions:
What are examples of consumable resources?
Hardware interrupts, Unix signals, pthread_cond_signal calls, messages, and information in I/O buffers.
Process P1:
request(R1); | ... | request(R2); | ... | release(R2); | ... | release(R1); |
R1 critical section | ||||||
R2 critical section |
Process P2:
request(R2); | ... | request(R1); | ... | release(R1); | ... | release(R2); |
R1 critical section | ||||||
R2 critical section |
The pseudocode above shows two processes, each of which has nested critical sections for the same pair of resources. They request the resources in a different order, so there is a possibility of deadlock. Suppose P1 requests R1, and that is granted. Then P2 requests R2, and that is granted. Then P1 requests R2, and that cannot be granted because R2 is held by P2, so P1 blocks. Then P2 requests R1, and that cannot be granted because R1 is held by P1, so P2 blocks. The two processes are note deadlocked.
The figure shows the deadlock that is possible for the two pseudocode processes above, modeled by a (single-unit) resource allocation graph. This is the simplest of several graph models for resource allocation we will consider below. (The others involve multiunit and consumable resources.) There are two types of nodes: process nodes (show as squares) and resource nodes (show as circles). There are two types of edges: held-by edges (shown as solid arrows) go from a resource to the process that is holding it; wait-for edges (shown as dashed arrows) go from a process to the resource it is waiting for. Notice that there is a cycle in the graph. A cycle in the resource allocation graph is a necessary condition for deadlock (in every model). For single-unit resources (but not in the more general models) it is also a sufficient condition.
The figure shows a trajectory diagram for the same pair of processes P1 and P2 as the pseudocode and resource allocation graph shown above.
A trajectory diagram is useful for visualizing how a system can get into a deadlock. We can draw such a diagram if we have just two processes and the steps in the execution of each process progress can be viewed as a simple sequence of steps (i.e., we don't have to consider condition control flow or looping). The combined state of the two processes corresponds to a point (X, Y) in the plane, where X is the state of one process and Y is the state of the other process. In our example, the progress of one process P1 is plotted on the Y (vertical) axis and the progress of process P2 is plotted on the X (horizontal) axis.
The pink shaded regions correspond to states that are inside the critical region for resource R1. The darker pink region covers the states in which both processes would need to hold R1, which would violate mutual exclusion, and so that region is forbidden.
The grey shaded regions correspond to states that are inside the critical region for resource R2. The darker grey region covers the states in which both processes would need to hold R2, which would violate mutual exclusion, and that region is forbidden.
The white dots correspond to states whose entry is under the control of the OS, and black dots represent states whose entry is under the control of the application. The system start state is shown at the lower left corner. In the start state, neither P1 nor P2 has started execution, so neither is holding any resources.
The green lines show two possible system trajectories. The solid green line shows a trajectory that leads to deadlock. It is deadlock because there is no way the system can make progress without entering one of the forbidden regions. Here, progress in process P1 means movement upward and progress in process P2 means movement toward the right.
Notice that the deadlock state is in the corner of a concave pocket that is formed by the intersection of the two forbidden regions. This pocket is a trap. Once we get into it we cannot get out without backing up. Since preemption is not allowed, the system cannot force a process to back up.
The problem here can be viewed as a board game in which the processes are playing against the operating system. The goal of the operating system is to not get into a deadlock state. The goal of the processes is to try to force the operating system in to a deadlock. The moves of the processes are resource requests and resource releases. The moves of the operating system are resource grants. The only choice of the operating system is how long it postpones granting a request.
The dashed green line shows an alternate trajectory, which enables both processes to complete without deadlock. The critical difference between the two trajectories is at the point where P1 requests R1. If the system grants the request immediately (which it can, because P2 is not using R1 at the time), ultimate deadlock is unavoidable. If the system postpones granting P1's request for R1, P2 is given time to release R2, and there will be no deadlock. A good resource manager can force the system along the safe trajectory, since the decision when to allocate a resource is up to the resource manager.
P1 | P2 | |
---|---|---|
... | ... | |
receivefrom(P2, &M2); | receivefrom(P1, &M1); | |
... | ... | |
sendto(P2, M1); | sendto(P1, M2); |
Deadlock occurs if the receivefrom operation is blocking.
The pseudocode above shows how deadlock is possible with message passing. Process P1 and P2 each wait for a message from the other, and while they are waiting neither is able to send the message that the other is waiting for. This kind of deadlock is difficult to prevent unless one imposes some constraints (protocols) on the sending and receiving of messages.
What are the fundamental causes of deadlock?
How can we deal with deadlock?
Now that we have looked at examples of some specific cases of deadlock, it is time to look at deadlock from a general point of view.
In order for there to be a deadlock, all of the above conditions must be true. You can observe that they are true in the examples we considered. Richard Holt, in a PhD dissertation published in the 1970's, showed that these conditions must apply in any deadlock. You can verify that informally, by looking at each condition and convincing yourself that if the condition is not true, there is no deadlock.
Questions:
How can we deal with deadlock?
What is the difference between prevention and avoidance?
What are examples of strategies for prevention?
There are only three approaches to dealing with deadlock. We can allow it to happen. In that case we need to be able to detect it when it has happened, and then deal with it. Alternatively, we can try to make sure deadlock does not happen. There are two general approaches to that. One is called prevention and the other is called avoidance. Prevention means always restricting what we do in such a way that there is no possibility of deadlock. Avoidance means waiting to impose restrictions until we are close to a deadlock, and then "steering around" the deadlock.
Experience has shown that some people have difficulty distinguishing prevention form avoidance. The technical way to understand the difference is in terms of safe and unsafe states, which we will get to a bit later. A more intuitive way is by analogy to other situations in which a person needs to deal with a potential hazard.
Analogies:
Suppose there is an open hole in the middle of a street. One way to prevent people from falling into the hole is to cover it. Another is to put up a barricade around it. A third avoidance strategy would be to never use that street. A way to avoid falling into it is to keep your look where you are going, and steer around the hole.
Suppose a person does want to get a dangerous communicable disease, such as AIDS. A prevention strategy would be to not engage in any of the activities by which the disease can be transmitted. An avoidance strategy would be to engage in the risky activities, but try not to engage in them with anyone who is a carrier of the disease.
In general, hazard prevention is safer and simpler than avoidance, but it is more limiting. In designing software systems, it is a good idea to design in a way that prevents hazards, to the extent the system requirements allow you to do so.
With deadlock, the prevention approach is to impose some restriction on how resources are used, that assures there can never be a deadlock. The avoidance approach is to impose no restrictions on resource usage, up to a point where it there is a danger the system will deadlock, and then guide the system around the deadlock, by postponing some allocations.
All of the four conditions are necessary for deadlock to occur.
Hence, by preventing any one of them we prevent deadlock.
If we look at each of the four necessary conditions for deadlock, we can see how deadlock might be prevented by denying that condition. Whether one of these prevention strategies can apply to a given resource depends on the nature of the resource and how it is used. For example, ordered allocation works well for mutexes, but preemption of a mutex is not acceptable.
Questions:
What happens to a process if resources it is holding are preempted?
Is it safe for the process to just go on executing?
Short of killing the process that is preempted, how might this problem be addressed?
For a hold-and-wait cycle to exist, one process must request a pair of resources in a different order than another. In the diagram, P1 requested R1 (which it is holding) and then requested R2, but P2 requested R2 (which it is holding) and then requested R1.
In general, if we number the resource nodes and walk around a cycle in the resource allocation graph, there must come a point where we go from a higher-numbered resource to a lower numbered resource. It follows that if we never request a resource while holding a higher-numbered resource, there can be no cycle.
Process P1:
Process P2:
|
If all processes always request resources in a fixed order, there can be no hold-and-wait cycle.
This is usually not a burdensome restriction. It still allows us to have critical sections that involve more than one resource, and use simple lock and unlock operations on individual resources to implement them.
The trajectory diagram shows that there are no "concave" regions formed by the intersectios of the forbidden regions, so there are no accesssible deadlock states.
Process P1:
request(R1); | ... | release (R1); | request(R2); | ... | release(R2); |
R1 critical section | R2 critical section |
Process P2:
request(R2); | ... | release (R2); | request(R1); | ... | release(R1); |
R2 critical section | R1 critical section |
If processes always release one resource before requesting another, there can be no hold-and-wait
This is a tougher restriction. It rules out critical sections that involve more than one resource.
Process P1:
request(R1, R2); | ... | release(R1, R2); |
R1 critical section |   | |
R2 critical section |   |
Process P2:
request(R1, R2); | ... | release(R1, R2); |
R1 critical section |   | |
R2 critical section |   |
If processes always request all resources they will need at once, there can be no hold-and-wait
This allows critical sections that involve more than one resource, but requires the operating system (or resource manager) to provide a way to atomically request several resources at once. Some batch and real-time operating systems support this, but it is not supported, for example, by the POSIX thread mutex API.
Postpone starting a process if its demands might lead to deadlock,
i.e. while resources it may need are held by others
Postpone granting an incremental resource request to a process if granting the allocation might lead to deadlock
Deadlock avoidance requires foreknowledge of the worst-case resource requests a process may make.
The two logical points at which a deadlock avoidance decision can be made are the point of process admission and the point of an incremental resource request.
Maximum resource requirement information may not be available, except for certain kinds of resources and certain restricted types of processes. The availability of this kind of information was typical of traditional batch processing systems, and is true for some real-time operating systems.
The kinds of resources this method may be applied to are also limited. It has been used for allocating tape drives to jobs in a batch-oriented operating system. Another example of an application is to the allocation of signal processors to a main-CPU process in a real-time signal processing system.
An example of a deadlock avoidance algorithm.
Please try out FSU applet for Banker's Algorithm (by Shishir Tiwari).
The banker's algorithm postpones deadlock avoidance decisions until the point of an incremental resource request, which if granted would take the system into an unsafe state.
The following slides contain an example from Stallings' text, which we walk through in class.
It is helpful to also go through some other examples, to make certain you understand the algorithm. One way to do this is using the applet to which link is given above.
Process P1:
request(R1); | ... | request(R2); | ... | release(R2); | ... | release(R1); |
R1 critical section | ||||||
R2 critical section |
Process P2:
request(R2); | ... | request(R1); | ... | release(R1); | ... | release(R2); |
R1 critical section | ||||||
R2 critical section |
We will review the two-process two-resource deadlock example given in pseudocode above, to see how deadlock could be avoided.
This trajectory diagram is identical to the one given for the same example further above, except that one region of the diagram has been colored gold. That region contains the unsafe states of the system. Notice that the unsafe states are in a concave pocket, formed by the intersection of the two forbidden regions (the dark pink and dark grey regions). Once the system enters this pocket, it cannot get out without either backing out (violating nonpreemption) or entering a forbidden zone (violating mutual exclusion).
Notice that there is only one deadlocked state, in the upper right corner of the unsafe region. The other states in this region are unsafe, but not deadlocked.
The green trajectory in the diagram above corresponds to the following, which one many possible sequences of moves:
P2 requests R2
The OS grants the request of P2
P1 requests R1
The OS grants the request of P1
P2 requests R1
The OS cannot grant the request of P2, because R2 is locked.
P1 requests R2
The OS cannot grant the request of P1, because R1 is locked.
No further progress is possible, i.e. we have a deadlock.
The system made a fatal error when it granted P1's request for R1 while P2 was holding R2. This took the system into an unsafe state, from which the processes P1 and P2 were finally able to drive the system into deadlock.
Questions:
What is the difference between a safe state and a deadlock-free state?
Another way of looking at safety is in terms of a state transition diagram. The figure shows the states and transitions of the example system considered further above. The red state is the deadlock state. The shaded yellow states are unsafe states. The black and green states are safe states. The black transitions and states are under control of the processes, and the green transitions and states are under control of the operating system (or resource manager).
The property that makes the unsafe region usafe is that there are no transitions out of the unsafe region that are controlled by the operating system.
In this diagram, the system will eventually reach deadlock, once it enters an unsafe state, since all the transitions lead toward the deadlock state.
In this regard the example gives a false impression, that there can be no transitions out of an unsafe region, at all. That is because the example has no conditional branching in the code. In a more general example there might be transitions out of the unsafe region that are under control of the processes (but none under control of the operating system).
Deadlock avoidance algorithms keep track of the system state, and avoid making transitions that lead from safe states to unsafe states.
The green path on the diagram above shows how the system can steer itself around unsafe states, for the same sequence of requests from P1 and P2 as in the deadlock example considered previously. The difference is that the system does not grant every request immediately, just because the resource is available. Instead, the system blocks P1 at its request for R1 until after P2 releases R2.
P2 requests R2
The OS grants the request of P2
P1 requests R1
The OS postpones granting the request of P1, because it
would lead to an unsafe state.
P2 requests R1
The OS grants the request of P2
P2 releases R1
The OS grants the request of P1 for R1
P1 requests R2
The OS cannot grant the request of P1, because it
R2 is still locked by P2.
P2 releases R2
The OS grants the request of P1 for R2
P1 releases R2
P1 releases R1
The banker's algorithm represents the state of the system by a set of tables, which are shown above. This example comes from Stallings' text. In class, we walk through the example, step by step, explaining each step. Stallings gives the same explanation in his text, so it is not repeated here.
It would be a good idea here to jump ahead momentarily to look at example of graph reduction much further below. Some people may find it easier to grasp quickly than the example with tables.
|
|
Mark P4. Deadlock Detection with Tables (Step 2)
|