COT 5410-01 Fall 2004
Algorithms
Chris Lacher
Notes 3: Sorting and Related Topics
Sorting Problem
- Input: Sequence of n numbers or keys (a_1, a_2, ..., a_n)
- Output: Permutation (a'_1, a'_2, ..., a'_n) such that a'_1 <= a'_2 <= ... <= a'_n
- Satellite data associated with keys may be considerable
- Implementation "details":
- Assume that (!(key1 < key2) and !(key2 < key1)) implies (key1 == key2), but satellite data may not be the same
- May use indirection to avoid many re-assignments of satellite data (see the sketch after this list)
- Data assumed in random access storage (e.g., array); sorting algorithm may require copy into such a structure
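The indirection remark above can be made concrete. Below is a minimal sketch of my own (not from the notes); std::sort is used only for brevity, and the point is that the Record objects never move, only an array of indices is permuted.

#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct Record {
    int key;
    std::string satellite;   // potentially large payload that we prefer not to copy
};

int main() {
    std::vector<Record> data = { {3, "c"}, {1, "a"}, {2, "b"} };

    // perm[i] names the record that belongs in position i of the sorted order.
    std::vector<std::size_t> perm(data.size());
    for (std::size_t i = 0; i < perm.size(); ++i) perm[i] = i;

    // Only the indices move; no Record is ever re-assigned.
    std::sort(perm.begin(), perm.end(),
              [&](std::size_t a, std::size_t b) { return data[a].key < data[b].key; });

    // data[perm[0]], data[perm[1]], ... is now the sorted sequence.
    return 0;
}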
Insertion Sort
Insert-Sort ( array of numbers A )
{
for (j = 2; j <= length(A); ++j)
{
// Loop Invariant: A[1..j-1] is sorted
key = A[j];
i = j - 1;
while (i > 0 and A[i] > key)
{
A[i+1] = A[i];
i = i-1;
}
A[i+1] = key;
}
return;
}
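For concreteness, a short trace on a made-up input (not from the notes): starting from A = [5, 2, 4, 6, 1, 3], successive passes produce [2, 5, 4, 6, 1, 3], [2, 4, 5, 6, 1, 3], [2, 4, 5, 6, 1, 3] (key 6 needs no shifts), [1, 2, 4, 5, 6, 3], and finally [1, 2, 3, 4, 5, 6]. After the pass with j = k, the prefix A[1..k] is sorted, which is exactly the loop invariant.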
Proof of Halting (done)
Proof of Correctness (done)
Runtime Analysis (done), Result: Worst Case = Average Case = Θ(n^2)
Runspace Analysis (done), Result: in-place
Simple Sort
inline Swap (key& x, key& y)
{
key z = x;
x = y;
y = z;
}
SimpleSort (array A[1..n])
// pre:
// post: A[1..n] is sorted
{
for (i = 1; i <= n; ++i)
{
k = i;
for (j = i + 1; j <= n; ++j)
{
if (A[j] < A[k])
k = j;
}
Swap(A[k], A[i]);
}
}
Proof of Halting
Proof of Correctness
Runtime Analysis
Runspace Analysis
Is this sort stable? If not, modify the algorithm so it is stable.
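A worked example for the stability question (mine, not from the notes): on the keyed records (2, a), (2, b), (1, c), the first pass selects the minimum key 1 at position 3 and performs Swap(A[3], A[1]), giving (1, c), (2, b), (2, a); the two records with equal key 2 have exchanged their relative order, so the swap-based version is not stable. One standard repair, sketched only: find the minimum position k as before, but instead of swapping, shift A[i..k-1] one slot to the right and place the minimum at position i, so records with equal keys keep their original order.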
Merge Sort
Merge(array A, index p, index q, index r)
// Pre: A[p..q] and A[q+1..r] are sorted ranges
// Post: A[p..r] equals merge of A[p..q] and A[q+1..r]
{
n1 = q - p + 1;
n2 = r - q;
new array L[1 .. n1 + 1]
new array R[1 .. n2 + 1]
for (i = 1; i <= n1; ++i)
L[i] = A[p + i - 1];
for (j = 1; j <= n2; ++j)
R[j] = A[q + j];
L[n1 + 1] = infinity;   // sentinels: neither copy can appear exhausted,
R[n2 + 1] = infinity;   // so the merge loop below needs no end-of-range tests
i = 1;
j = 1;
for (k = p; k <= r; ++k)
{
if (L[i] <= R[j])
{
A[k] = L[i];
++i;
}
else
{
A[k] = R[j];
++j;
}
} // end for
}
MergeSort (array A, index p, index r)
// Pre: p and r are in the range of A, p <= r
// Post: A[p..r] is sorted
{
if (p < r)
{
q = (p + r)/2;
MergeSort (A, p, q);
MergeSort (A, q + 1, r);
Merge (A, p, q, r);
}
}
Proof of Halting
Proof of Correctness
Runtime Analysis
Runspace Analysis
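A hedged summary of the standard results (the details are worked in class, not in these notes): MergeSort halts because each recursive call operates on a strictly shorter range; correctness follows by induction on the range length from the pre/post conditions of Merge; the runtime satisfies the recurrence T(n) = 2 T(n/2) + Θ(n), which solves to Θ(n log n) in every case; and Merge allocates Θ(n) extra space for L and R, so the sort is not in-place.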
HeapSort
- Binary tree model from array index structure:
  - Parent of A[k] is A[k/2]
  - Left child of A[k] is A[2k]
  - Right child of A[k] is A[2k+1]
- A max heap is an array in which every parent is greater than or equal to its children, using the tree model described above. (This is also called the partially ordered tree (POT) property; a small checker sketch follows.)
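The POT property is easy to state as code. The following checker is my own illustration (not part of the notes); it leaves slot 0 of the vector unused so the 1-based formulas above apply verbatim.

#include <cstddef>
#include <vector>

// POT check for a max heap stored in A[1..n], with parent(k) = k/2 as above.
// Slot 0 of the vector is deliberately unused.
bool is_max_heap(const std::vector<int>& A) {
    if (A.size() < 2) return true;          // nothing beyond the unused slot 0
    std::size_t n = A.size() - 1;           // heap occupies A[1..n]
    for (std::size_t k = 2; k <= n; ++k)    // every node except the root
        if (A[k / 2] < A[k])                // parent must be >= child
            return false;
    return true;
}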
The Push Heap Algorithm
- Add new data at next leaf
- Repair upward:
  - Repeat
    - locate parent
    - if POT not satisfied: swap
    - else: stop
  - Until POT holds
push_heap (array A, index p, index r)
// pre: A[p..r-1] is a max heap
// r is a valid index for A
// post: Elements of A[p..r] are permuted
// A[p..r] is a max heap
{
}
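The body above is left for class; the following is a minimal C++ sketch of my own showing the repair-upward ("sift up") idea. It is not the notes' version: it assumes a 0-based array whose heap occupies A[0..r], so the parent of index k is (k - 1)/2, the 0-based analogue of the 1-based rule parent(k) = k/2.

#include <cstddef>
#include <utility>
#include <vector>

// Assuming A[0..r-1] is already a max heap and A[r] is the newly added leaf,
// restore the max-heap (POT) property of A[0..r] by repairing upward.
void sift_up(std::vector<int>& A, std::size_t r) {
    while (r > 0) {
        std::size_t parent = (r - 1) / 2;   // 0-based parent index
        if (A[parent] < A[r]) {             // POT violated: swap and continue upward
            std::swap(A[parent], A[r]);
            r = parent;
        } else {
            break;                          // POT satisfied: stop
        }
    }
}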
The Pop Heap Algorithm
- Swap last leaf and root
- "Remove" last leaf
- Repair downward:
  - Repeat
    - identify children
    - find larger child
    - if POT not satisfied: swap
    - else: stop
  - Until POT holds
pop_heap (array A, index p, index r)
// pre: A[p..r] is a max heap
// post: A[p], A[r] have swapped values
// Elements of A[p..r] are permuted
// A[p..r-1] is a max heap
{
}
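Again, the body is left for class; here is a companion 0-based sketch of my own for the repair-downward ("sift down") step.

#include <cstddef>
#include <utility>
#include <vector>

// Assuming A[0..r] is a max heap: swap the root with the last leaf A[r],
// then repair A[0..r-1] downward so it is a max heap again.
void sift_down_pop(std::vector<int>& A, std::size_t r) {
    std::swap(A[0], A[r]);                       // swap last leaf and root
    std::size_t i = 0;                           // repair the heap A[0..r-1]
    for (;;) {
        std::size_t left = 2 * i + 1, right = 2 * i + 2, largest = i;
        if (left < r && A[largest] < A[left])    // find the larger child
            largest = left;
        if (right < r && A[largest] < A[right])
            largest = right;
        if (largest == i)                        // POT satisfied: stop
            break;
        std::swap(A[i], A[largest]);             // POT violated: swap and continue
        i = largest;
    }
}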
heap_sort (array A, index p, index r)
// pre: p and r are valid index values for A
// post: A[p .. r] is sorted
{
for (i = p+1; i <= r; ++i)
push_heap(A,p,i)
for (i = r; i > p; --i)
pop_heap(A,p,i)
}
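Combining the two sketches above (sift_up and sift_down_pop, both mine) reproduces the same two-phase structure as heap_sort: build the heap by successive pushes, then repeatedly pop the maximum to the end of the shrinking range.

#include <cstddef>
#include <vector>

// Hypothetical 0-based driver, assuming sift_up and sift_down_pop from the sketches above.
void heap_sort_sketch(std::vector<int>& A) {
    for (std::size_t i = 1; i < A.size(); ++i)
        sift_up(A, i);                // now A[0..i] is a max heap
    for (std::size_t i = A.size(); i-- > 1; )
        sift_down_pop(A, i);          // max of A[0..i] moves to position i
}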
Proof of Halting
Proof of Correctness
Runtime Analysis
Runspace Analysis
Recursive version?
QuickSort
QuickSort (array A, index p, index r)
{
if (p < r)
{
q = Partition(A, p, r)
QuickSort(A, p, q-1)
QuickSort(A, q+1, r)
}
}
Partition (array A, index p, index r)
{
x = A[r]
i = p - 1
for (j = p .. r-1)
{
if (A[j] <= x)
{
i = i+1
Swap(A[i], A[j])
}
}
Swap(A[i+1], A[r])
return i + 1
}
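A worked example of Partition (my own, not from the notes): with A = [2, 8, 7, 1, 3, 5, 6, 4], p = 1, r = 8, the pivot is x = 4. The loop over j leaves A = [2, 1, 3, 8, 7, 5, 6, 4] with i = 3, and the final Swap(A[4], A[8]) produces [2, 1, 3, 4, 7, 5, 6, 8] with return value 4: every element <= 4 precedes the pivot and every element > 4 follows it.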
Proof of Halting
Proof of Correctness
Worst Case Runtime O(n^2) (reached when every partition is maximally unbalanced, e.g., on input that is already sorted)
Average Case Runtime Θ(n log n)
- At most n calls to Partition during the entire run, because each call removes one element (its pivot) from all further recursive calls
- Each iteration of the loop in Partition makes exactly one comparison between elements; everything else in a call is constant work
- Therefore: run time is O(n + X), where X = number of comparisons made during the entire execution of QuickSort
- Let z_1, ..., z_n be the input keys in sorted order; any pair of elements is compared at most once, since elements are compared only to a pivot and a pivot appears in no later call
- Therefore: X = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} X_ij, where X_ij = I{ z_i is compared to z_j }
- E[X] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} Pr{ z_i is compared to z_j }
- z_i and z_j are compared if and only if one of them is the first element of {z_i, ..., z_j} chosen as a pivot; each of the j - i + 1 candidates is equally likely to be first, so Pr{ z_i is compared to z_j } = 2/(j - i + 1)
- E[X] = Σ_{i=1}^{n-1} Σ_{j=i+1}^{n} 2/(j - i + 1) <= 2 Σ_{i=1}^{n-1} Σ_{k=1}^{n} 1/k = O(n log n)
- Therefore: run time is O(n + n log n) = O(n log n)
Runspace Analysis
Theoretical Limits
- Theorem: Any comparison sort algorithm has worst case runtime Ω(n log n).
- Proof idea: model the algorithm as a binary decision tree whose internal nodes are comparisons and whose leaves are output permutations; let H be the height of this tree
- A path in the decision tree from root to leaf represents one particular sort instance (one sequence of comparison outcomes)
- Therefore: worst case runtime = Ω(H)
- The decision tree must have each possible permutation of the input as a reachable leaf
- Therefore: the decision tree has L >= n! leaves
- A binary tree of height H has at most 2^H leaves, so n! <= L <= 2^H
- Taking logs: log(n!) <= H
- Therefore: H >= log(n!) = Ω(n log n) (by Stirling's approximation; see the bound below)
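The last step can be justified without quoting Stirling's formula in full (my addition): the largest n/2 factors of n! are each at least n/2, so

  n! >= (n/2)^(n/2), and hence log(n!) >= (n/2) log(n/2) = Ω(n log n).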