Lecture #6: Distributed Mutual Exclusion
These topics are from Chapter 6
(Distributed Mutual Exclusion) in Advanced Concepts in OS,
supplemented with other materials.
Topics for Today
- Lamport's mutual exclusion algorithm
- token-based distributed mutual exclusion algorithms
- Suzuki-Kasami broadcast algorithm
- Raymond's tree-based algorithm
Lamport's Mutual Exclusion Algorithm
- Assumes messages are delivered in FIFO order between each
pair of sites
- Is not based on tokens
- Is based on Lamport's clock synchronization scheme*
- Each request gets a timestamp
- Requests with lower timestamps take priority over requests
with higher timestamps; timestamp ties are broken by site id
(see the comparison sketch below)
- Each site maintains a queue of pairs (timestamp, site),
ordered by timestamp
* Are these the single-integer valued clocks, or the vector
clocks?
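With scalar (single-integer) clocks, two sites can issue requests
with equal timestamps, so priority compares the pair
(timestamp, site id) lexicographically. A minimal Python sketch of
this comparison (the function name is illustrative, not from the
text):

    def beats(req_a, req_b):
        """True if request req_a = (timestamp, site id) has priority
        over req_b; Python compares tuples element by element."""
        return req_a < req_b

    assert beats((3, 2), (5, 1))  # lower timestamp wins outright
    assert beats((4, 1), (4, 2))  # tie on timestamp: lower site id wins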
The Algorithm
- Request
- Si sends REQUEST(tsi, i) to all
sites in its request set Ri and
puts the request on request_queuei
- when Sj receives REQUEST(tsi, i) from Si
it returns a timestamped REPLY to Si and
places Si's request on request_queuej
- Si waits to start the CS until both
- [L1:] Si has received a message with timestamp > (tsi, i) from all other sites
- [L2:] Si's request is at the top of
request_queuei
- Release
- Si removes request from top of request_queuei
and sends time-stamped RELEASE message to all the sites
in its request set
- when Sj receives a RELEASE message from Si
it removes Si's request from request_queuej
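The steps above are pseudocode; the following is a minimal
single-site sketch in Python, assuming a user-supplied FIFO
transport send(j, msg). The class name, message format, and helper
names are illustrative, not from the text.

    import heapq

    class LamportMutex:
        """One site S_i of Lamport's mutual exclusion algorithm."""

        def __init__(self, i, all_sites, send):
            self.i = i
            self.others = [j for j in all_sites if j != i]
            self.send = send                 # assumed FIFO transport
            self.clock = 0                   # scalar Lamport clock
            self.queue = []                  # heap of (ts, site) request pairs
            self.last_ts = {j: 0 for j in self.others}  # latest ts seen per site
            self.my_request = None

        def _tick(self, ts=0):
            self.clock = max(self.clock, ts) + 1
            return self.clock

        def request_cs(self):
            self.my_request = (self._tick(), self.i)
            heapq.heappush(self.queue, self.my_request)
            for j in self.others:            # REQUEST goes to every other site
                self.send(j, {'type': 'REQUEST',
                              'ts': self.my_request[0], 'site': self.i})

        def on_message(self, msg):
            self._tick(msg['ts'])
            self.last_ts[msg['site']] = msg['ts']
            if msg['type'] == 'REQUEST':     # queue it, reply with a timestamp
                heapq.heappush(self.queue, (msg['ts'], msg['site']))
                self.send(msg['site'], {'type': 'REPLY',
                                        'ts': self._tick(), 'site': self.i})
            elif msg['type'] == 'RELEASE':   # drop the sender's request
                self.queue = [r for r in self.queue if r[1] != msg['site']]
                heapq.heapify(self.queue)

        def can_enter_cs(self):
            # L1: a later-timestamped message from every other site;
            # L2: our own request at the top of the local queue.
            return (self.my_request is not None
                    and all((ts, j) > self.my_request
                            for j, ts in self.last_ts.items())
                    and self.queue[0] == self.my_request)

        def release_cs(self):
            heapq.heappop(self.queue)        # our request is at the top
            self.my_request = None
            ts = self._tick()
            for j in self.others:
                self.send(j, {'type': 'RELEASE', 'ts': ts, 'site': self.i})

Note that any message (REQUEST, REPLY, or RELEASE) from Sj can
satisfy L1, which is why the REPLY handler is just the generic
timestamp update.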
Correctness
Suppose Si and Sj are executing the CS concurrently. Then L1 and
L2 hold at both sites at some common instant t: each site's own
request is at the top of its own queue, and each site has received
a message with a later timestamp from every other site.
WLOG suppose Si's request has the earlier timestamp.
(Remember the tie-breaking rule!)
Because Sj has received a message from Si timestamped later than
(tsi, i), and channels are FIFO, Si's REQUEST must already be on
request_queuej at instant t.
But then Sj's own request sits at the top of request_queuej ahead
of a request with a smaller timestamp, contradicting the fact that
the queue is ordered by timestamp.
Example
(Possibly step through additional examples on the blackboard.)
Performance
- 3(N-1) messages per CS invocation: (N-1) REQUEST,
(N-1) REPLY, and (N-1) RELEASE messages
- synchronization delay sd = T (one message delay)
What does this assume about transmission delay versus message
processing delay?
Token-Based Algorithms
- one token, shared among all sites
- a site can enter its CS iff it holds the token
- token-based algorithms differ mainly in how a requesting
site locates the token
- use sequence numbers instead of timestamps
  - used to distinguish a site's current request from its
    old, already-served requests
  - kept independently for each site
- the proof of mutual exclusion is trivial: only the token
holder can be in the CS
- proofs of the other properties (freedom from deadlock and
starvation) may be less so
Suzuki-Kasami Broadcast Algorithm
Each site Si keeps an array of integers
RNi[1..N],
where RNi[j] is the largest sequence number received so far from Sj.
- token has form (Q, LN)
- Q is queue of requests
- LN is vector of sequence numbers
- LN[i] is seq. number of Si's most recent request
- when Si wants to enter CS:
- if Si does not already have the token
then increment RNi[i] and broadcast REQUEST(i,RNi[i])
- when Sj receives REQUEST(i,n):
- set RNj[i] to max(RNj[i], n)
- if Sj has the token, is not in the CS, and RNj[i]=LN[i]+1,
then send the token to Si
- when Si leaves CS:
- set LN[i] to RNi[i]
- for every Sj
if Sj not in Q and RNi[j]=LN[j]+1 then append Sj to Q
- if Q is not empty
then delete the top site Sk from Q and send the token (Q, LN) to Sk
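A compact sketch of one Suzuki-Kasami site in Python, under the
same assumptions as the Lamport sketch (send and broadcast are
user-supplied transports; names are illustrative):

    from collections import deque

    class SuzukiKasami:
        """One site S_i; the single token is the pair (Q, LN)."""

        def __init__(self, i, n_sites, send, broadcast, has_token=False):
            self.i, self.n = i, n_sites
            self.RN = [0] * n_sites        # RN[j]: largest seq. no. seen from S_j
            self.send, self.broadcast = send, broadcast
            self.in_cs = False
            self.token = (deque(), [0] * n_sites) if has_token else None

        def request_cs(self):
            if self.token is None:         # broadcast only if we lack the token
                self.RN[self.i] += 1
                self.broadcast({'site': self.i, 'n': self.RN[self.i]})
            else:
                self.in_cs = True          # idle token already here: enter at once

        def on_request(self, msg):         # handler for REQUEST(i, n)
            j, n = msg['site'], msg['n']
            self.RN[j] = max(self.RN[j], n)
            if self.token is not None and not self.in_cs:
                Q, LN = self.token
                if self.RN[j] == LN[j] + 1:    # outstanding, not-yet-served request
                    self.token = None
                    self.send(j, {'Q': Q, 'LN': LN})

        def on_token(self, msg):
            self.token = (msg['Q'], msg['LN'])
            self.in_cs = True              # token arrives only for our own request

        def release_cs(self):
            self.in_cs = False
            Q, LN = self.token
            LN[self.i] = self.RN[self.i]   # our latest request is now served
            for j in range(self.n):        # append every newly outstanding request
                if j not in Q and self.RN[j] == LN[j] + 1:
                    Q.append(j)
            if Q:                          # pass the token to the site at the head
                k = Q.popleft()
                self.token = None
                self.send(k, {'Q': Q, 'LN': LN})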
Performance of Suzuki-Kasami Algorithm
- a request will be served after at most N-1 others
- 0 or N messages per CS invocation (0 when the requesting
site already holds the token)
- synchronization delay = 0 or T
Comparison of Lamport and Suzuki-Kasami Algorithms
The essential difference is in who keeps the queue. In one
case every site keeps its own local copy of the queue. In the
other case, the queue is passed around within the token.
What is gained by this scheme versus the centralized
mutual exclusion scheme?
Raymond's Tree-Based Algorithm
- sites are logically arranged as a directed tree
- edges represent the holder variable of each site
- which node is root changes over time
Si requests entry to CS
- if Si does not hold the token and Qi is empty
then send request to holderi
- add Si to Qi
Sj receives request from Si
- if Sj is holding the token and is not in the CS
  - send token to Si
  - set holderj to Si
- otherwise
  - place request in Qj
  - if Sj does not hold the token and has not already sent
    a request to holderj, then send request to holderj
Si receives token
- delete top entry Sj from Qi
- if j = i then enter own critical section
- if j ≠ i then
  { send token to Sj;
    set holderi to Sj }
- if Qi is (still) nonempty then send request to holderi
Si leaves a CS
- if Qi is nonempty then
{ delete top entry Sj from Qi;
send token to Sj;
set holderi to Sj }
- if Qi is (still) nonempty then send request to holderi
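A sketch of one site of Raymond's algorithm in Python, again
assuming a user-supplied send(j, msg) transport; names are
illustrative. holder points toward the current token holder, or is
the site's own id when it holds the token.

    from collections import deque

    class RaymondSite:
        """One site S_i of Raymond's tree-based algorithm."""

        def __init__(self, i, holder, send):
            self.i = i
            self.holder = holder           # parent edge toward the token
            self.send = send
            self.Q = deque()               # local FIFO queue of requesting sites
            self.in_cs = False

        def _ask_parent(self):
            # One request per batch: sent only when Q goes from empty to nonempty.
            self.send(self.holder, {'type': 'REQUEST', 'site': self.i})

        def request_cs(self):
            if self.holder != self.i and not self.Q:
                self._ask_parent()
            self.Q.append(self.i)
            self._grant_if_possible()

        def on_request(self, msg):
            j = msg['site']
            if self.holder == self.i and not self.in_cs and not self.Q:
                self.holder = j            # idle token holder: pass it straight down
                self.send(j, {'type': 'TOKEN'})
            else:
                if self.holder != self.i and not self.Q:
                    self._ask_parent()
                self.Q.append(j)

        def on_token(self):
            self.holder = self.i
            self._grant_if_possible()

        def _grant_if_possible(self):
            # Shared by token receipt and CS exit: serve the head of Q.
            if self.holder == self.i and self.Q and not self.in_cs:
                j = self.Q.popleft()
                if j == self.i:
                    self.in_cs = True      # our own turn: enter the CS
                else:
                    self.holder = j        # forward token down the tree
                    self.send(j, {'type': 'TOKEN'})
                    if self.Q:             # still pending requests: ask it back
                        self._ask_parent()

        def release_cs(self):
            self.in_cs = False
            self._grant_if_possible()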
Properties of Raymond's Algorithm
- free from deadlock
- free from starvation
due to connectedness & FIFO queue service
- "average" case
  - O(log N) message complexity
  - (T log N)/2 synchronization delay
On what assumption(s) does average-case analysis depend?
What are worst-case metrics?
What is degenerate case?
What trade-off does this point out?
Worst Case in Balanced Binary Tree
What is the worst-case number of messages if
the topology is a balanced binary tree?
How about other topologies?
Universal Bounds
(T = average message delay; E = average CS execution time)

metric            | bound    | reason
------------------|----------|------------------------------------
min synch delay   | T        | one message to grant permission
max throughput    | 1/(T+E)  | one message to pass token + one CS
min response time | 2T + E   | request round trip + CS execution
max avg resp time | N(T + E) | all others served first*
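A worked instance (numbers chosen only for illustration): with
T = 10 ms and E = 1 ms, no algorithm can achieve a synchronization
delay below 10 ms, a throughput above 1/(10 + 1 ms), about 91 CS
entries per second, or a response time below 2(10) + 1 = 21 ms for
a lone request; with N = 100 contending sites, the last request
served may wait about 100 × 11 ms = 1.1 s.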