COP 5611 SPRING 2003

COP 5611: OPERATING SYSTEMS -Spring 2003- Ted Baker

Distributed Deadlock Detection

These notes were used in a substitute lecture, for Xiuwen Liu's class in Spring 2003.

The material is mainly from Chapter 7 (Distributed Deadlock Detection) in Advanced Concepts in OS.

You should already be familiar with deadlock handling in single-processor systems, from the undergraduate course that is a prerequisite to this one. If not, you should review it now, as background for distributed deadlock.

For reference, see COP 4610 Notes on Deadlock.

Beyond that, you should have read and be familiar with Chapter 3 of the text for this course, especially the sections on graph models for deadlock detection.

Distributed Deadlock Models

Based on WFG (not GRG)

Nodes are processes
Resources are located at a site, but may be held by processes at other sites
Edge (P,Q) indicates P is blocked and waiting for Q to release some resource
Deadlock exists iff there is a directed cycle or knot (depends on request model -- see Chapter 3)

The text fudges a bit by forgetting that in some models the existence of a cycle is necessary but not sufficient, and the existence of a knot is sufficient but not necessary, leaving a gap.
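
The difference between a cycle and a knot can be checked mechanically. The following is a minimal sketch, assuming the WFG is given as a dictionary from each process to the set of processes it waits for; the names `reachable`, `on_cycle`, and `in_knot` are illustrative and do not come from the text. A process is taken to be in a knot if every node reachable from it can reach it back.

```python
from typing import Dict, List, Set

WFG = Dict[str, Set[str]]  # process -> processes it is waiting for

def reachable(g: WFG, start: str) -> Set[str]:
    """Nodes reachable from start by following one or more edges."""
    seen: Set[str] = set()
    stack: List[str] = list(g.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(g.get(node, ()))
    return seen

def on_cycle(g: WFG, p: str) -> bool:
    """p lies on a directed cycle iff p can reach itself."""
    return p in reachable(g, p)

def in_knot(g: WFG, p: str) -> bool:
    """p is in a knot iff every node reachable from p can reach p."""
    r = reachable(g, p)
    return bool(r) and all(p in reachable(g, q) for q in r)

# P1 and P2 form a cycle, but P2 also waits for P3, which is not blocked.
cycle_only: WFG = {"P1": {"P2"}, "P2": {"P1", "P3"}, "P3": set()}
# Here there is no edge leading out of {P1, P2}.
knot: WFG = {"P1": {"P2"}, "P2": {"P1"}}
```

In `cycle_only`, P1 is on a cycle but not in a knot; under an OR request model P2 could still be unblocked by P3, so the cycle alone does not imply deadlock.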

Distributed Deadlock Handling Strategies

prevention -- too restrictive?
avoidance -- too complicated, slow?
detection -- the usual approach

Distributed Deadlock Detection Issues & Requirements

issues
- maintenance of WFG
- detection of cycles (or knots) in the WFG
requirements
- progress = no undetected deadlocks
- safety = no false deadlocks

What do we do when a deadlock is detected?

Categorization of Methods

centralized control
distributed control
hierarchical control

Simple Centralized Control

one control site maintains the WFG and checks for deadlock
other sites send request and release messages to control site

What are the advantages and disadvantages?

False Deadlock Example

An external observer can see deadlock where there is none.

4 sites:

R₁ stored at S₁
R₂ stored at S₂
T₁ runs at S₃
T₂ runs at S₄

Two transactions run concurrently:

T₁              T₂
lock R₁         lock R₁
unlock R₁       unlock R₁
lock R₂         lock R₂
unlock R₂       unlock R₂

False Deadlock Example: Event Trace Diagram

False Deadlock Snapshots

What if each site maintains its local view of the system state, and periodically sends this to the control site?
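
One way to see how a control site can be misled is to replay the same events in two different arrival orders. This is a minimal sketch, not the text's protocol: the message format `(kind, waiter, holder)` and the names `has_cycle` and `control_site` are assumptions made for illustration. In the real execution T₂ stops waiting for T₁ before T₁ starts waiting for T₂, so the two edges never exist at the same time.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]      # (waiter, holder)
Msg = Tuple[str, str, str]  # (kind, waiter, holder); kind is "wait" or "release"

def has_cycle(edges: Set[Edge]) -> bool:
    """Repeatedly drop edges into non-waiting processes; a cycle exists iff edges remain."""
    edges = set(edges)
    while True:
        waiters = {w for w, _ in edges}
        pruned = {(w, h) for w, h in edges if h in waiters}
        if pruned == edges:
            return bool(edges)
        edges = pruned

def control_site(arrivals: List[Msg]) -> List[bool]:
    """Apply messages in arrival order; record whether a cycle is seen after each one."""
    wfg: Set[Edge] = set()
    verdicts: List[bool] = []
    for kind, waiter, holder in arrivals:
        if kind == "wait":
            wfg.add((waiter, holder))
        else:
            wfg.discard((waiter, holder))
        verdicts.append(has_cycle(wfg))
    return verdicts

# Order in which the events really happened.
true_order: List[Msg] = [("wait", "T2", "T1"), ("release", "T2", "T1"), ("wait", "T1", "T2")]
# Same events, but the release message from S1 is delayed in the network.
delayed: List[Msg] = [("wait", "T2", "T1"), ("wait", "T1", "T2"), ("release", "T2", "T1")]

print(control_site(true_order))  # [False, False, False]
print(control_site(delayed))     # [False, True, False]
```

With the delayed release, the control site briefly holds both T₂→T₁ and T₁→T₂ and reports a deadlock that never existed.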

Ho-Ramamoorthy Algorithm: Two-Phase

each site maintains table with status of all local processes
control site periodically requests status from all sites, builds WFG, and checks for deadlock
if deadlock detected, control site repeats status requests, but throws out transactions that have changed
(trying to avoid false deadlock)

May still report false deadlocks.
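
The second phase can be pictured as intersecting two successive reports. The sketch below assumes each round of status reports has already been reduced to a set of wait-for edges, which is a simplification of the per-site status tables; the function names are illustrative only.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]  # (waiter, holder)

def has_cycle(edges: Set[Edge]) -> bool:
    """Repeatedly drop edges into non-waiting processes; a cycle exists iff edges remain."""
    edges = set(edges)
    while True:
        waiters = {w for w, _ in edges}
        pruned = {(w, h) for w, h in edges if h in waiters}
        if pruned == edges:
            return bool(edges)
        edges = pruned

def two_phase(first: Set[Edge], second: Set[Edge]) -> bool:
    """Report deadlock only if a cycle survives in the edges common to both reports."""
    if not has_cycle(first):
        return False
    stable = first & second  # throw out wait-for entries that changed between the reports
    return has_cycle(stable)

stale_first = {("T2", "T1"), ("T1", "T2")}  # first report mixes old and new information
fresh_second = {("T1", "T2")}               # by the second report, T2 no longer waits
print(two_phase(stale_first, fresh_second))  # False
print(two_phase(stale_first, stale_first))   # True
```

Note that an edge appearing in both reports shows only that it was observed twice, not that all edges of the cycle existed at one instant, which is one way to see why the filter is not a proof of deadlock.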

What lesson(s) does this teach?

Ho-Ramamoorthy Algorithm: One-Phase

each site maintains two tables:
- resource status (for all local resources)
- process status (for all local processes)
control site periodically requests copies of tables, builds WFG, and checks for deadlock
transactions are not used unless the process table info agrees with the resource table info
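
The agreement check between the two tables can be sketched as follows. The table layouts here are assumptions chosen for illustration: the resource table maps each resource to its holder and its set of waiters, and the process table maps each process to the resources it holds and the resources it has requested. An edge is used only when both tables tell the same story.

```python
from typing import Dict, Optional, Set, Tuple

Edge = Tuple[str, str]  # (waiting process, holding process)

def consistent_edges(
    resource_table: Dict[str, Tuple[Optional[str], Set[str]]],
    process_table: Dict[str, Tuple[Set[str], Set[str]]],
) -> Set[Edge]:
    """Keep a wait-for edge only if resource and process tables agree on it."""
    edges: Set[Edge] = set()
    for res, (holder, waiters) in resource_table.items():
        if holder is None:
            continue  # free resource: nobody to wait for
        held, _ = process_table.get(holder, (set(), set()))
        if res not in held:
            continue  # holder's own site does not confirm it holds res
        for w in waiters:
            _, wanted = process_table.get(w, (set(), set()))
            if res in wanted:  # waiter's own site confirms the request
                edges.add((w, holder))
    return edges

# S1's resource table is stale: it still shows T1 holding R1 with T2 waiting.
resource_table = {"R1": ("T1", {"T2"}), "R2": ("T2", {"T1"})}
# The process tables are current: T1 holds nothing and wants R2; T2 holds R2.
process_table = {"T1": (set(), {"R2"}), "T2": ({"R2"}, set())}
print(consistent_edges(resource_table, process_table))  # {('T1', 'T2')}
```

The stale R₁ entry is discarded because T₁'s process table does not confirm it, so no cycle is constructed.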

Notice the similarity in principle here to the Chandy-Lamport global state recording algorithm, i.e., we need to capture not just the states of the processes but also the states of the messages in transit.

Classification of Distributed Detection Algorithms

path-pushing
- path information transmitted, accumulated
edge-chasing
- "I'm waiting for you" probes are sent along edges
- single returned probe indicates a cycle
diffusion
- "Are you blocked?" probes are sent along all edges
- all queries returned indicates existence of a knot (a necessary and sufficient condition for deadlock in an expedient general resource graph with single-unit requests), i.e., there is no way out of the blockage
global state detection
- take and use snapshot of system state

Obermarck's Path-Pushing Algorithm

designed for distributed database systems
processes are called "transactions" T₁, T₂, ..., T_n
there is special virtual node Ex
transactions are totally ordered

How can we totally order the transactions?

Obermarck's Path-Pushing Algorithm

1. wait for info from previous iteration of Step 3
2. combine received info with local TWFG
   - detect all cycles
   - break local cycles
3. send each cycle including the node Ex to the external nodes it is waiting for
   - time-saver: only send path to other sites if first transaction is higher in lexical order than the last
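
The sending step and its time-saver can be sketched for a single site. This is an illustration under stated assumptions, not Obermarck's full algorithm: the local graph is a dictionary of wait-for sets, the string `"Ex"` stands for the virtual external node, transaction names compare in the intended total order as plain strings, and the helper names are hypothetical. Breaking of purely local cycles is not shown.

```python
from typing import Dict, List, Set

EX = "Ex"  # virtual node standing for "some transaction at another site"

def ex_paths(twfg: Dict[str, Set[str]]) -> List[List[str]]:
    """All simple paths Ex -> T_a -> ... -> T_b -> Ex, listed without the Ex endpoints."""
    found: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        for nxt in sorted(twfg.get(node, ())):
            if nxt == EX:
                if path:  # need at least one transaction on the path
                    found.append(path.copy())
            elif nxt not in path:
                path.append(nxt)
                walk(nxt, path)
                path.pop()

    walk(EX, [])
    return found

def paths_to_send(twfg: Dict[str, Set[str]]) -> List[List[str]]:
    """Time-saver: forward a path only if its first transaction is higher than its last."""
    return [p for p in ex_paths(twfg) if p[0] > p[-1]]

local_twfg = {
    "Ex": {"T1", "T3"},
    "T1": {"T4"}, "T4": {"Ex"},  # Ex -> T1 -> T4 -> Ex
    "T3": {"T2"}, "T2": {"Ex"},  # Ex -> T3 -> T2 -> Ex
}
print(ex_paths(local_twfg))       # [['T1', 'T4'], ['T3', 'T2']]
print(paths_to_send(local_twfg))  # [['T3', 'T2']]
```

Only the path starting at T₃ is forwarded, since T₃ is higher than T₂ while T₁ is lower than T₄.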

Note: TWFG = "transaction wait-for graph", just another name for a WFG, when applied to transactions.

Note: In the figures below, the rule above is reversed, i.e., we send the path if the first transaction is lower in lexical order than the last. This was an accident when I drew up the figures, but it does not matter. Either rule will work. The essential idea is simply to have one canonical representation of each path.

Note: The example above does not fully illustrate all the things that can go on in a more complicated graph. In particular, there are no nodes with multi-node AND-dependences.

Problem and Questions on Obermarck's Path-Pushing Algorithm

Detects false deadlocks, due to asynchronous snapshots at different sites. (lesson?)

Message complexity? Message size? Detection delay?

Exactly how are paths combined and checked?

Obermarck's Path-Pushing Algorithm: Performance

O(n(n-1)/2) messages
O(n) message size
O(n) delay to detect deadlock

The O(n(n-1)/2) message bound is a little bit misleading, since the algorithm is described as being run continually, in an iterative fashion. That is, the sites exchange paths continually, even when no deadlock is detected.

Chandy-Misra-Haas Edge-Chasing Algorithm

for AND request model
probe (i,j,k) is sent, for a detection initiated by P_i, by the site of P_j to the site of P_k
deadlock is detected when a probe returns to its initiator

Terminology

P_j is dependent on P_k if there is a sequence P_j, P_i1, P_i2, ..., P_im, P_k such that each process except P_k is blocked and each process except the first holds a resource for which the previous process is waiting
P_j is locally dependent on P_k if it is dependent and both processes are at the same site
array dependent_i(j) = true ⇔ P_i knows that P_j is dependent on P_i

Algorithm Initiation by P_i

if P_i is locally dependent on itself then declare a deadlock
else send probe (i, j, k) to home site of P_k for each j, k such that all of the following hold

P_i is locally dependent on P_j
P_j is waiting on P_k
P_j and P_k are on different sites

Algorithm on receipt of probe (i,j,k) by node of P_k

check the following conditions

P_k is blocked
dependent_k(i) = false
(P_k does not yet know that P_i is dependent on P_k)
P_k has not replied to all requests of P_j

if these are all true, do the following

set dependent_k(i) = true
(P_k now knows that P_i is dependent on P_k)
if k=i declare that P_i is deadlocked
else send probe (i,m,n) to the home site of P_n for every m and n such that the following all hold
- P_k is locally dependent on P_m
- P_m is waiting on P_n
- P_m and P_n are on different sites
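
The probe rules above can be exercised on a static snapshot. The sketch below is a single-threaded simulation, not a distributed implementation; it rests on several assumptions that are not in the notes. A process counts as locally dependent on itself through a zero-length chain, so that it can send a probe for its own remote wait. The condition "P_k has not replied to all requests of P_j" is taken to hold whenever the edge exists in the snapshot. Deadlock is also declared when P_k is locally dependent on the initiator, which generalizes the k=i test to cycles that re-enter the initiator's site through another process. All function names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

Waits = Dict[str, Set[str]]  # process -> processes it waits for (AND model)
Sites = Dict[str, str]       # process -> home site

def local_closure(p: str, waits_for: Waits, site: Sites) -> Set[str]:
    """p plus every process p depends on through a chain of same-site processes."""
    seen, stack = {p}, [p]
    while stack:
        q = stack.pop()
        for r in waits_for.get(q, ()):
            if site[r] == site[p] and r not in seen:
                seen.add(r)
                stack.append(r)
    return seen

def remote_probes(i: str, start: str, waits_for: Waits, site: Sites) -> List[Tuple[str, str, str]]:
    """Probes (i, j, k): start locally depends on j, j waits on k, k is at another site."""
    return [(i, j, k)
            for j in sorted(local_closure(start, waits_for, site))
            for k in sorted(waits_for.get(j, ()))
            if site[j] != site[k]]

def cmh_detect(i: str, waits_for: Waits, site: Sites) -> bool:
    # Initiation: a nontrivial local cycle through P_i is a deadlock.
    if any(i in local_closure(q, waits_for, site)
           for q in waits_for.get(i, ()) if site[q] == site[i]):
        return True
    dependent: Dict[str, Set[str]] = {p: set() for p in site}  # dependent[k] ~ dependent_k
    queue = deque(remote_probes(i, i, waits_for, site))
    while queue:
        init, _j, k = queue.popleft()
        if not waits_for.get(k) or init in dependent[k]:
            continue  # P_k is not blocked, or has already seen this initiator
        dependent[k].add(init)
        if init in local_closure(k, waits_for, site):  # covers k == init (assumption: see lead-in)
            return True
        queue.extend(remote_probes(init, k, waits_for, site))
    return False

site = {"P1": "A", "P2": "A", "P3": "B", "P4": "B"}
ring = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P4"}, "P4": {"P1"}}
chain = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P4"}, "P4": set()}
print(cmh_detect("P1", ring, site))   # True
print(cmh_detect("P1", chain, site))  # False
```

In `ring`, the probe (P1, P2, P3) goes to site B, which forwards (P1, P4, P1) back to site A, where it reaches its initiator.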

What is the purpose of the dependent relation?

The above example does not include any nontrivial AND-requests. What is it about this algorithm that allows it to handle AND-requests, but not OR-requests?

Chandy-Misra-Haas Complexity

What is the message complexity?
What is delay in detection?

Analysis

m(n-1)/2 messages for m processes at n sites
3-word message length
O(n) delay to detect deadlock

Other Edge-Chasing Algorithms

We will not have time to go into these in detail in class.

Mitchell-Merritt
- each node has two counters, called public and private labels
- propagates largest public label backwards in TWFG along cycle
Sinha-Natarajan
- uses probes (initiator, lowest_trans_traversed_so_far)
- Suffers from false deadlocks, and also misses deadlocks
- "Correction" by Choudhary et al. has same defects
- What is the lesson here?

Diffusion Based Algorithms: Chandy et al.

for OR request model
processes are active or blocked
whenever a process blocks it starts a diffusion (!!)
if deadlock is not detected, the process will eventually unblock and terminate the algorithm
message = query (i,j,k)
- i = initiator of check
- j = immediate sender
- k = immediate recipient
reply = reply (i,k,j)

On receipt of query(i,j,k) by k

if not blocked then discard the query
if blocked
- if this is an engaging query
  propagate query(i,k,m) to each m in the dependent set of k
- else
  - if not continuously blocked since engagement then discard the query
  - else send reply(i,k,j) to j

On receipt of reply(i,j,k) by k

if this is not the last reply
then just decrement the awaited reply count
if this is the last reply then
- if i=k report a deadlock
- else send reply(i,k,m)
  to the engaging process m
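
The query and reply rules can be simulated on a static snapshot with one message queue. This is a minimal sketch under stated assumptions: every process with a nonempty dependent set is blocked and stays blocked for the whole run (so "continuously blocked since engagement" always holds), only one initiator runs at a time, and messages are reduced to `(kind, sender, recipient)` because the initiator is fixed. The name `diffusion_detect` is illustrative.

```python
from collections import deque
from typing import Deque, Dict, Optional, Set, Tuple

def diffusion_detect(i: str, dependent_set: Dict[str, Set[str]]) -> bool:
    """True iff initiator i eventually receives replies to all of its queries."""
    def blocked(p: str) -> bool:
        return bool(dependent_set.get(p))

    if not blocked(i):
        return False
    engager: Dict[str, Optional[str]] = {i: None}  # who sent each process its engaging query
    awaiting: Dict[str, int] = {i: len(dependent_set[i])}
    msgs: Deque[Tuple[str, str, str]] = deque(("query", i, k) for k in dependent_set[i])
    while msgs:
        kind, j, k = msgs.popleft()  # message from j arriving at k
        if kind == "query":
            if not blocked(k):
                continue  # an active process discards the query
            if k not in engager:  # engaging query: first one k has seen
                engager[k] = j
                awaiting[k] = len(dependent_set[k])
                msgs.extend(("query", k, m) for m in dependent_set[k])
            else:  # non-engaging query: reply immediately
                msgs.append(("reply", k, j))
        else:
            awaiting[k] -= 1
            if awaiting[k] == 0:  # this was the last reply
                if k == i:
                    return True
                msgs.append(("reply", k, engager[k]))
    return False

knot = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": {"P2"}}
way_out = {"P1": {"P2", "P3"}, "P2": {"P1"}, "P3": set()}  # P3 is active
print(diffusion_detect("P1", knot))     # True
print(diffusion_detect("P1", way_out))  # False
```

In `way_out`, P₃ discards its query, so P₁ never collects its last reply and no deadlock is reported, which matches the OR model: P₁ can proceed as soon as P₃ grants its request.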

The black dashed arrows indicate the engaging process for each engaged process. This information is needed to route the reply when the number of other processes for which the engaged process is waiting goes to zero.

Observe that the engaging process arrows form a spanning tree of the subgraph corresponding to the set of processes for which the initiating process is waiting. If every process in this subgraph is blocked, we have a knot.

At this point, a knot has been detected.

Global State Detection Based Algorithms

take snapshot of distributed WFG
use graph reduction to check for deadlock

Details differ.

Recall: What is graph reduction?

Graph Reduction

General idea: simulate the result of execution, assuming all unblocked processes complete without requesting any more resources

while there is an unblocked process, remove the process and all (resource-holding) edges to it
there is deadlock if the remaining graph is non-null
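
For a plain WFG the reduction loop is short. The sketch below assumes an AND request model, so a process is unblocked exactly when it waits for nothing; the function name `reduce_wfg` is illustrative. More general resource graphs need a richer notion of "unblocked" than this.

```python
from typing import Dict, Set

def reduce_wfg(waits_for: Dict[str, Set[str]]) -> Set[str]:
    """Return the processes left after reduction; a nonempty result means deadlock."""
    g = {p: set(qs) for p, qs in waits_for.items()}
    for qs in waits_for.values():  # make every mentioned process a node
        for q in qs:
            g.setdefault(q, set())
    while True:
        unblocked = [p for p, qs in g.items() if not qs]
        if not unblocked:
            return set(g)
        for p in unblocked:  # remove the process ...
            del g[p]
        for qs in g.values():  # ... and all edges pointing to it
            qs.difference_update(unblocked)

snapshot = {"P1": {"P2"}, "P2": {"P3"}, "P3": {"P2"}, "P4": {"P5"}}
print(sorted(reduce_wfg(snapshot)))  # ['P1', 'P2', 'P3']
```

P₅ is unblocked, so it is removed first, which unblocks P₄. P₂ and P₃ wait for each other and P₁ waits for P₂, so those three can never be reduced.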

If you have never done one, it would be a good idea to work through an example of graph reduction. In particular, note that backtracking may be necessary with the more complex resource models such as consumable resource graphs.
