COT 5405 Advanced Algorithms
Chris Lacher
Notes 3: Sorting and Related Topics
InsertionSort ( array of numbers A , length n )
{
  for (i = 1; i < n; ++i)
  {
    // Loop Invariant: A[0..i) is sorted
    t = A[i];
    for (j = i; j > 0 && t < A[j - 1]; --j)
      A[j] = A[j - 1];
    A[j] = t;
  }
  return;
}
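The InsertionSort pseudocode can be turned into compilable C++ with very few changes. The following is a minimal sketch, assuming the data lives in a `std::vector`; the name `insertion_sort` and the template parameter `T` are illustrative and not part of the notes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of InsertionSort on a std::vector (container choice is an assumption).
template <typename T>
void insertion_sort(std::vector<T>& a)
{
  for (std::size_t i = 1; i < a.size(); ++i)
  {
    // Loop invariant: a[0..i) is sorted
    T t = a[i];
    std::size_t j = i;
    for (; j > 0 && t < a[j - 1]; --j)
      a[j] = a[j - 1];   // shift each larger element one slot to the right
    a[j] = t;            // place t in the gap that was opened
  }
}
```

Only `operator<` is required of `T`, which matches the single comparison used in the pseudocode.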
inline void Swap (key& x, key& y)
{
  key z = x;
  x = y;
  y = z;
}

SelectionSort ( array A[0..n) )
// pre:
// post: A[0..n) is sorted
{
  for (i = 0; i != n; ++i)
  {
    // Loop Invariant: A[0..i) is sorted
    k = i;
    for (j = i; j != n; ++j)
      if (A[j] < A[k]) k = j;
    // Loop Invariant: A[k] is the first smallest element of A[i..n)
    Swap (A[i], A[k]);
  }
  return;
}
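A compilable version of SelectionSort might look like the sketch below. It assumes a `std::vector` container and uses `std::swap` in place of the `Swap` helper; the name `selection_sort` is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of SelectionSort; std::swap stands in for the Swap helper.
template <typename T>
void selection_sort(std::vector<T>& a)
{
  const std::size_t n = a.size();
  for (std::size_t i = 0; i != n; ++i)
  {
    // Loop invariant: a[0..i) is sorted
    std::size_t k = i;
    for (std::size_t j = i; j != n; ++j)
      if (a[j] < a[k]) k = j;   // strict < keeps the first smallest element
    // Here a[k] is the first smallest element of a[i..n)
    std::swap(a[i], a[k]);
  }
}
```

The inner loop always scans the whole unsorted tail, so the number of comparisons does not depend on the input order.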
Merge(array A, index p, index q, index r)
// Pre:  A[p..q) and A[q..r) are sorted ranges
// Post: A[p..r) equals merge of A[p..q) and A[q..r)
{
  T B [r-p];                                 // temp space for merged copy of A
  fsu::g_set_merge(A+p, A+q, A+q, A+r, B);   // merge the two parts of A to B
  fsu::g_copy(B, B+(r-p), A+p);              // copy B back to A[p..r)
  return;
}
MergeSort (array A, index p, index r)
// Pre:  [p..r) is a subrange of A
// Post: A[p..r) is sorted
{
  if (r - p > 1)
  {
    q = (p+r)/2;        // integer arithmetic
    MergeSort(A,p,q);   // recursive call
    MergeSort(A,q,r);   // recursive call
    Merge(A,p,q,r);     // defined above
  }
  return;
}
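Merge and MergeSort can be sketched in standard C++ as follows. This version swaps in `std::merge` and `std::copy` for the `fsu::` generic algorithms, on the assumption that they play the same role here. The names `merge_ranges` and `merge_sort` are illustrative, and the midpoint is computed as `p + (r - p) / 2` rather than `(p+r)/2` to sidestep overflow for very large indices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pre:  a[p..q) and a[q..r) are sorted ranges
// Post: a[p..r) is the sorted merge of the two ranges
template <typename T>
void merge_ranges(std::vector<T>& a, std::size_t p, std::size_t q, std::size_t r)
{
  std::vector<T> b(r - p);   // temp space for the merged copy
  std::merge(a.begin() + p, a.begin() + q,
             a.begin() + q, a.begin() + r, b.begin());
  std::copy(b.begin(), b.end(), a.begin() + p);   // copy back to a[p..r)
}

// Pre:  [p..r) is a subrange of a
// Post: a[p..r) is sorted
template <typename T>
void merge_sort(std::vector<T>& a, std::size_t p, std::size_t r)
{
  if (r - p > 1)
  {
    std::size_t q = p + (r - p) / 2;   // midpoint, integer arithmetic
    merge_sort(a, p, q);
    merge_sort(a, q, r);
    merge_ranges(a, p, q, r);
  }
}
```

A typical call is `merge_sort(v, 0, v.size())`; passing a narrower range sorts only that part of the vector.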
HeapSort (array A, index p, index r)
// pre:  p and r are valid index values for A
// post: A[p .. r] is sorted
{
  for (i = p+1; i <= r; ++i)
    push_heap(A,p,i);
  for (i = r; i > p; --i)
    pop_heap(A,p,i);
}
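The two HeapSort loops can be tried out with the standard library heap algorithms. The sketch below assumes `std::push_heap` and `std::pop_heap` as stand-ins for the `push_heap` and `pop_heap` routines named in the pseudocode. Note that the standard versions take half-open iterator ranges, whereas the pseudocode is stated for the closed range A[p .. r], so the loop bounds are shifted by one. The name `heap_sort` is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of HeapSort using the standard max-heap algorithms.
template <typename T>
void heap_sort(std::vector<T>& a)
{
  const std::size_t n = a.size();
  // Build phase: after each step, a[0..i) is a max-heap.
  for (std::size_t i = 2; i <= n; ++i)
    std::push_heap(a.begin(), a.begin() + i);
  // Sort phase: move the max of a[0..i) to a[i-1]; a[0..i-1) stays a heap.
  for (std::size_t i = n; i > 1; --i)
    std::pop_heap(a.begin(), a.begin() + i);
}
```

After the second loop finishes, each maximum has been moved to the end of the shrinking heap, which leaves the vector in ascending order.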
QuickSort(array A, index p, index r)
// Pre:  A is an array of type T
//       A is defined for the range [p,r)
// Post: A[p,r) is sorted
{
  if (r - p > 1)
  {
    q = Partition(A,p,r);
    QuickSort(A,p,q);
    QuickSort(A,q+1,r);
  }
  return;
}

index Partition(array A, index p, index r)
{
  i = p;
  for (j = p; j < r-1; ++j)
  // loop invariants:
  //   1. if k is in [p..i) then A[k] <= A[r-1]
  //   2. if k is in [i..j) then A[k] > A[r-1]
  {
    if (A[j] <= A[r-1])     // if A[j] <= last
    {
      Swap(A[i],A[j]);      // swap to low range
      ++i;                  // expand the low range
    }
  }
  Swap(A[i],A[r-1]);        // swap last into pivot location
  return i;
}
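QuickSort and Partition translate almost line for line into C++. The sketch below assumes a `std::vector<int>` and uses `std::swap` for `Swap`. The function is called `partition_last` (an illustrative name) to avoid confusion with `std::partition`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Partitions a[p..r) around the last element a[r-1]; returns the pivot's final index.
std::size_t partition_last(std::vector<int>& a, std::size_t p, std::size_t r)
{
  std::size_t i = p;
  for (std::size_t j = p; j + 1 < r; ++j)
  {
    // Invariants: a[k] <= a[r-1] for k in [p..i), a[k] > a[r-1] for k in [i..j)
    if (a[j] <= a[r - 1])
    {
      std::swap(a[i], a[j]);   // move a[j] into the low range
      ++i;                     // expand the low range
    }
  }
  std::swap(a[i], a[r - 1]);   // put the pivot between the two ranges
  return i;
}

// Pre:  p <= r and [p..r) is a valid range of a
// Post: a[p..r) is sorted
void quick_sort(std::vector<int>& a, std::size_t p, std::size_t r)
{
  if (r - p > 1)
  {
    std::size_t q = partition_last(a, p, r);
    quick_sort(a, p, q);
    quick_sort(a, q + 1, r);
  }
}
```

For the input 2 8 7 1 3 5 6 4 the first partition uses pivot 4, places it at index 3, and leaves 2 1 3 to its left.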
Proof:
An element is used as a pivot at most one time, so there are at most n calls made to the partition routine.
All comparisons are made inside the partition. Therefore
Quicksort runtime <= O(n + x)
where x is the total number of comparisons made by the partition routine.
Consider the case of sorted input. The pivot is then always the largest element of its subrange, so partition is called on subranges of sizes n, n-1, ..., 2, and a call on a subrange of size k compares each of the other k-1 elements to the pivot. The total number of comparisons is $\sum_{k=2}^{n} (k-1) = n(n-1)/2$. Therefore
Worst case runtime $\ge \Omega(n^2)$
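The quadratic count on sorted input can be checked directly by instrumenting the partition loop. The following sketch assumes the last-element partition from the pseudocode and adds a counter passed by reference; the names `quick_sort_counting` and `comparisons_on_sorted` are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// QuickSort with last-element pivot; count accumulates key comparisons.
void quick_sort_counting(std::vector<int>& a, std::size_t p, std::size_t r,
                         std::size_t& count)
{
  if (r - p > 1)
  {
    std::size_t i = p;
    for (std::size_t j = p; j + 1 < r; ++j)
    {
      ++count;                              // one comparison against the pivot
      if (a[j] <= a[r - 1]) { std::swap(a[i], a[j]); ++i; }
    }
    std::swap(a[i], a[r - 1]);
    quick_sort_counting(a, p, i, count);
    quick_sort_counting(a, i + 1, r, count);
  }
}

// Number of comparisons made when the input 0, 1, ..., n-1 is already sorted.
std::size_t comparisons_on_sorted(std::size_t n)
{
  std::vector<int> a(n);
  std::iota(a.begin(), a.end(), 0);
  std::size_t count = 0;
  quick_sort_counting(a, 0, n, count);
  return count;
}
```

For n = 100 the count is 99 + 98 + ... + 1 = 4950, which is n(n-1)/2.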
Definitions. In order to estimate the average case runtime, define the following entities:
E[x] = expected value of x
$z_0, z_1, \ldots, z_{n-1}$ = the elements in sorted order
$[z_i, z_j] = \{z_i, z_{i+1}, \ldots, z_j\}$
$x_{i,j}$ = bool{$z_i$ is compared to $z_j$}
$e_{i,j}$ = Probability{$z_i$ is compared to $z_j$}
No pair is compared more than one time. Therefore
$x = \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} x_{i,j} = \sum\sum_{i,j} x_{i,j}$
where the double sum ranges over the upper triangle of indices defined by $0 \le i < j \le n-1$.
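Compute the expected value, using linearity of expectation and the fact that the expected value of a 0/1 variable is the probability that it equals 1:
$E[x] = E[\sum\sum_{i,j} x_{i,j}] = \sum\sum_{i,j} E[x_{i,j}] = \sum\sum_{i,j} e_{i,j}$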
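Because of the way partition divides the data using the pivot:
Once some $z_k$ with $i < k < j$ is chosen as a pivot, $z_i$ and $z_j$ are placed in different subranges and are never compared afterward.
If $z_i$ or $z_j$ is chosen as a pivot while all of $[z_i, z_j]$ is still in one subrange, it is compared to every other element of that subrange, including the other end of the interval.
It follows that:
$z_i$ is compared to $z_j$ iff the first element in $[z_i, z_j]$ to be chosen as a pivot is one of the two ends of the interval.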
Assuming pivot values are chosen at random, we have:
$e_{i,j}$ = P{$z_i$ or $z_j$ is first chosen from $[z_i, z_j]$}
 = P{$z_i$ is first chosen from $[z_i, z_j]$} + P{$z_j$ is first chosen from $[z_i, z_j]$}
 = $1/(j - i + 1) + 1/(j - i + 1)$
 = $2/(j - i + 1)$
Estimate the expected value as follows:
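$E[x] = \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} e_{i,j}$
 $= \sum_{i=0}^{n-2} \sum_{j=i+1}^{n-1} 2/(j - i + 1)$   [substitute $k = j - i$]
 $= \sum_{i=0}^{n-2} \sum_{k=1}^{n-1-i} 2/(k + 1)$
 $< \sum_{i=0}^{n-1} \sum_{k=0}^{n-1} 2/(k + 1)$   [the inner sum is $2H_n$, twice the n-th harmonic number]
 $= \sum_{i=0}^{n-1} O(\log n)$
 $= O(n \log n)$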
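The estimate can be sanity-checked numerically by evaluating the double sum exactly and comparing it with the looser bound used in the final steps. The sketch below does that in floating point; the function names `expected_comparisons` and `upper_bound_sum` are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Exact value of the sum over 0 <= i < j <= n-1 of 2/(j - i + 1).
double expected_comparisons(std::size_t n)
{
  double sum = 0.0;
  for (std::size_t i = 0; i + 1 < n; ++i)
    for (std::size_t j = i + 1; j < n; ++j)
      sum += 2.0 / static_cast<double>(j - i + 1);
  return sum;
}

// The looser bound: n copies of the sum over k = 0..n-1 of 2/(k + 1).
double upper_bound_sum(std::size_t n)
{
  double inner = 0.0;
  for (std::size_t k = 0; k < n; ++k)
    inner += 2.0 / static_cast<double>(k + 1);
  return static_cast<double>(n) * inner;
}
```

For n = 3 the pairs contribute 1 + 1 + 2/3 = 8/3, and for every n >= 1 the exact sum stays below the bound, which is itself at most 2n(1 + ln n).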
Putting all these observations together completes the argument. Note that the assumption that any element in the interval $[z_0, z_{n-1}]$ is equally likely to be chosen as a pivot is used in the computation of $e_{i,j}$.
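One common way to make the equally-likely-pivot assumption hold regardless of input order is to pick the pivot index uniformly at random and swap that element into the last position before partitioning. The following is a sketch of that idea, not something the notes specify. It assumes `std::mt19937` as the random source, and the name `randomized_quick_sort` is illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// QuickSort with a uniformly random pivot drawn from a[p..r).
void randomized_quick_sort(std::vector<int>& a, std::size_t p, std::size_t r,
                           std::mt19937& gen)
{
  if (r - p > 1)
  {
    // Choose an index uniformly from [p, r-1] and move it to the last slot.
    std::uniform_int_distribution<std::size_t> pick(p, r - 1);
    std::swap(a[pick(gen)], a[r - 1]);

    // Same last-element partition as in the pseudocode.
    std::size_t i = p;
    for (std::size_t j = p; j + 1 < r; ++j)
      if (a[j] <= a[r - 1]) { std::swap(a[i], a[j]); ++i; }
    std::swap(a[i], a[r - 1]);

    randomized_quick_sort(a, p, i, gen);
    randomized_quick_sort(a, i + 1, r, gen);
  }
}
```

With this change, already sorted input no longer forces the pivot to be the largest element at every step, so the expected-case analysis applies to every input order.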