priolocking.html

Priorities and Mutual Exclusion

These notes make use of mathematical symbols that may not be visible with the default settings on some versions of some browsers.

Symbol	Symbol Name	Symbol	Symbol Name	Symbol	Symbol Name
∼	sim	∑	sum	⌈	lceil
⌉	rceil	⌊	lfloor	⌋	rfloor
→	rarr	⇒	rArr	∞	infin
<	lt	≤	le	>	gt
≥	ge	≠	ne	δ	delta
Δ	Delta	∂	part	τ	tau
π	pi	φ	phi	∞	infin

If a blank space appears to the left of any of the names in the table above, or the symbol looks strange, try changing the font settings on your browser or using a different browser.

Warning: All of the discussion here is based on a single-processor environment, where the blocking task and the blocked task are competing for the same (single) processor.

Blocking, a.k.a. Priority Inversion

In preemptive priority-based scheduling
When a task is executing but it has lower priority than some other task that has a pending job and is not executing, that is called priority inversion
Priority inversions can occur for various reasons, most notably:
- a high-priority task blocks tries to lock a mutex that is held by a lower priority task
This is a violation of the assumptions under which scheduling analyses such as the RM and EDF utlization bounds and the fixed-priority response-time computation are based.
Two-part solution:
1. Modify analyses to include "blocking terms"
  See SEI notes on synchronization and priority inversion for formulae and examples.
2. Modify mutex locking protocols to bound duration of priority inversions:
  1. priority inheritance
  2. priority ceiling protocol
  3. stack resource protocol (SRP)

Priority Inversion Example

Tasks τ₁ and τ₃ share a mutex S1.
τ₂ does not use the mutex
τ₁ has highest priority, then τ₂, and then τ₃
τ₁ code: ... Lock(S1) ... Unlock(S1) ...
τ₃ code: ... Lock(S1) ... Unlock(S1) ...

In this example, task τ₁ si blocked by tasks τ₂ and τ₃.

Note that the problem here is not just priority inversion. It is inevitable that τ₃ block τ₁ until the critical section is complete, in order to preserve mutual exclusion. We expect critical sections to be short, so the duration of this blocking should be very short.

The problem is the the priority inversion caused by &tau₂;, which is potentially unbounded, since &tau₂; is not inside a critical section.

In general, we use the term B_i to denote the maximum length of time that task τ_i can be blocked. The figure shows that this term can be very large, unless we do something to prevent the kind of "pass through" blocking inflicted on τ₁ by τ₂ here.

Adding Blocking terms to Fixed-Task_Priority Schedulability Analysis

If B_k is the worst-case blocking time experienced by task τ_k, then τ_k is schedulable by preemptive RM in collection of independent periodic tasks τ₁,...,τ_n if

B_k/p_k +

k
∑
i=1

e_i/p_i ≤ m(2^1/m-1)

Adding the blocking term to the response time equation gives us the following:

t = e_k + B_k +

k-1

∑

i = 1

⌈

p_i

⌉ e_i

... and similary for the EDF analysis.

Priority Inheritance

Rule: Whenever task A is blocked by task B, task B inherits A's priority until the blockage is ended
Reasoning: A cannot proceed until B releases the lock, to expediting A requires expediting B.
Worst-case blocking time can be computed if we know:
- WCET of each critical section
- which critical sections can block which other critical sections
- must take into account nesting and chaining
Can be implemented simply, by itself: scheduler follows wait-for chain from highest-priority task
But complexity of implementation can become very high, if done in full generality and in combination with other features (see POSIX list below)
effect can be lost if mixed with other blocking operations, such as I/O, r-w locks, etc.
POSIX supports priority inheritance for mutexes, via the PTHREAD_PRIO_INHERIT mutex creation attribute:
- example:
```
pthread_mutexattr_t attr;
pthread_mutex_t m;
pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
pthread_mutex_init (&m, &attr);
```
- but mixes with other features that require heavyweight implementation:
  - transitive inheritance
  - time-outs on blocking operations (interruption by signal, pthread_mutex_timedlock(), pthread_cond_timedwait()
  - other priority changes to threads involved in an inheritance situation (sched_setscheduler(), pthread_setschedparam(), pthread_setconcurrency())
  - multi-processor scheduling
- Linux supports this
- The Real-time Specification for Java (RTSJ) supports this
```
public class PriorityInheritance extends MonitorControl;
```
```
setMonitorControl(java.lang.Object obj, MonitorControl policy)
```

Immediate Priority Ceiling Protocol (ICPP), aka Highest-Locker-Priority and SRP

Warning: Do not confuse this with the "Priority Ceiling Protocol" (PCP), which uses priority ceilings in a more complicated and subtle way.

Rule: Whenever a task is holding the lock on a mutex, it inherits the maximum locker priority of the mutex
Reasoning: since a higher priority task may want the lock, expediting higher priority tasks requires expediting the lock holder
Worst-case blocking time is the duration one critical section of a lower-priority task
This protocol works for both EDF and fixed-task-priority scheduling
It assumes the lock holder will not self-suspend while holding the lock
This can be implemented very simply, by itself: locker pushes old priority, sets new one; unlocker pops old priority
Complexity goes up if combined with other features (see POSIX list below)
POSIX supports priority locking for mutexes, via the PTHREAD_PRIO_PROTECT mutex creation attribute:
- example:
```
pthread_mutexattr_t attr;
pthread_mutex_t m;
pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_PROTECT);
pthread_mutex_init (&m, &attr);
```
- but mixes with other features that require heavyweight implementation:
  - see list for priority inheritance above, plus
  - dynamic piority ceilings (pthread_mutex_setprioceiling())
- Linux does not support this
- The Ada programming language supports this, and calls it "ceiling priority locking"
- The Real-time Specification for Java (RTSJ) calls this "priority ceiling emulation"
```
public class PriorityCeilingEmulation extends MonitorControl;
```
```
setMonitorControl(java.lang.Object obj, MonitorControl policy)
```

The above protocol is a simplification of a more general treatment, which I published under the name "Stack Resource Protocol". The SRP handles reader-writere and mult-unit resources as well as mutexes, and works for both EDF and fixed-task-priority scheduling. It also works for a very lightweight task model, in which threads share a single runtime stack.

An Extreme Case: Nonpreemption

If we don't know the highest potential locker's priority, we can use the highest system priority
In effect, the holder of the lock becomes non-preemptible
This does bound blocking time to one critical section
It is very easy to iplement, and not bad if critical sections lengths are kept small
Linux does this internally, for kernel spinlocks

Original Priority Ceiling Protocol (PCP)

Main difference from ICPP: Priority is not raised until some higher priority task blocks
Reasoning: raising priority of the lock-holder is itself a priority inversion, so we want to avoid it unless it becomes necessary
Complexity of implementation is higher than other protocols above
POSIX/Linux do not support this
Reference: http://www.sei.cmu.edu/pub/documents/88.reports/pdf/sr04.88.pdf
Detailed rules:
- when a task T tries to lock a mutex M, the request is denied (only) if
  - M is already locked by another task, or
  - the priority of T is not higher than all priority ceilings of all resources allocated to other tasks at the time (these other tasks block T)
- when a task blocks other tasks, it inherits the highest of their priorities

See SEI notes on priority inversion and synchronization for some good examples and figures.

T. P. Baker. ($Id$)