Educational Objectives: After completing this assignment, the student should be able to accomplish the following:
=========================================================
rubric to be used in assessment
---------------------------------------------------------
Predicate Order Case
  build sort_spy.x                         [0..10]: xx
  sort_spy.x uint.1000 uint.1000.out       [0..10]: xx
Default Order Case [no credit for re-using alt version]
  build hsort.x                            [0..10]: xx
  hsort.x < uint.100 > uint.100.out        [0..10]: xx
log & report                               [0..5]:   x
requirements and SE                        [-25..5]: x
dated submissions deduction [2 pts each]:          ( x)
                                           --
total:                                     [0..50]: xx
=========================================================
Background Knowledge Required: Be sure that you have mastered the
material in these chapters before beginning the assignment:
Iterators,
Generic Algorithms,
Introduction to Trees, and
Binary Heaps.
Operational Objectives: Implement two new generic heap algorithms fsu::g_heap_repair and fsu::g_build_heap. Move the current implementation of g_heap_sort into the alt namespace and re-implement fsu::g_heap_sort using the new algorithms. Test both versions fsu::g_heap_sort and alt::g_heap_sort using sort_spy.cpp and tease out the differences in runtimes, providing details in log.txt.
Deliverables: Two files:
gheap.h   # fsu::g_heap_sort and alt::g_heap_sort, plus other algorithms
log.txt   # your project work log
The official development/testing/assessment environment is specified in the Course Organizer.
Create and work within a separate subdirectory cop4530/proj8.
Do your own work. Variations of this project have been used in previous courses. You are not permitted to seek help from former students or their work products. For this and all other projects, it is a violation of course ethics and the student honor code to use, or attempt to use, code from any source other than that explicitly distributed in the course code library, or to give or receive help on this project from anyone other than the course instruction staff. See Introduction/Work Rules.
Begin by copying the entire contents of the directory LIB/proj8 into your cop4530/proj8/ directory. Then copy the file LIB/tcpp/gheap.h into cop4530/proj8/. At this point you should see these files in your directory:
gheap.h hsort.cpp ranuint.cpp sort_spy.cpp deliverables.sh
Edit gheap.h to satisfy the requirements for that deliverable.
Test your algorithms thoroughly and put your notes and conclusions into your log.txt.
Submit the assignment using the script LIB/scripts/submit.sh.
Warning: Submit scripts do not work on the program and
linprog servers. Use shell.cs.fsu.edu to submit assignments. If you do
not receive the second confirmation with the contents of your assignment, there has
been a malfunction.
This section discusses the advanced heap algorithms you are implementing for the assignment. The basic heap algorithms push_heap, pop_heap, and the vanilla version of heap_sort are discussed in the lecture notes and are implemented in the file gheap.h that you have copied. This discussion assumes that the reader is familiar with those materials.
Begin with a review of the pop_heap algorithm and its implementing code (default order version):
template <class I>
void g_pop_heap (I beg, I end)
{
  if (end - beg < 2) return;
  size_t n = end - beg - 1;
  size_t i = 0, left, right;
  bool finished = 0;
  g_XC(beg[0],beg[n]);
  do
  {
    left = 2*i + 1; right = left + 1;  // left and right child nodes
    if (right < n)                     // both child nodes exist
    {
      if (beg[left] < beg[right])      // ==> follow right subtree
      {
        if (beg[i] < beg[right])
        {
          g_XC(beg[right], beg[i]);
          i = right;
        }
        else
        {
          finished = 1;
        }
      }
      else  // !(beg[left] < beg[right]) ==> follow left subtree
      {
        if (beg[i] < beg[left])
        {
          g_XC(beg[left], beg[i]);
          i = left;
        }
        else
        {
          finished = 1;
        }
      }
    }
    else if (left < n)  // only the left child node exists
    {
      if (beg[i] < beg[left])
      {
        g_XC(beg[left], beg[i]);
      }
      finished = 1;  // no grandchild nodes exist
    }
    else  // no child nodes exist
    {
      finished = 1;
    }
  }
  while (!finished);
}
This algorithm consists of a swap of the first and last elements of a presumed heap on the range [0..n], followed by a repair of the smaller heap on the range [0..n-1]. The repair works because the two children of the root are heaps, so the only place where the heap conditions might be violated is at the root. The repair portion of this code is everything after the initial call to g_XC.
We can create another function called "repair" from that repair code:
void repair (I beg, I end)
{
  if (end - beg < 2) return;
  size_t n = end - beg - 1;
  size_t i = 0, left, right;
  bool finished = 0;
  do
  {
    left = 2*i + 1; right = left + 1;  // left and right child nodes
    if (right < n)                     // both child nodes exist
    {
      if (beg[left] < beg[right])      // ==> follow right subtree
      {
        if (beg[i] < beg[right])
        {
          g_XC(beg[right], beg[i]);
          i = right;
        }
        else
        {
          finished = 1;
        }
      }
      else  // !(beg[left] < beg[right]) ==> follow left subtree
      {
        if (beg[i] < beg[left])
        {
          g_XC(beg[left], beg[i]);
          i = left;
        }
        else
        {
          finished = 1;
        }
      }
    }
    else if (left < n)  // only the left child node exists
    {
      if (beg[i] < beg[left])
      {
        g_XC(beg[left], beg[i]);
      }
      finished = 1;  // no grandchild nodes exist
    }
    else  // no child nodes exist
    {
      finished = 1;
    }
  }
  while (!finished);
}
and we can refactor this code into a more compact form as:
void repair (I beg, I end)
{
  if (end - beg < 2) return;
  size_t n,i,left,right,largest;
  n = end - beg;
  i = 0;
  bool finished = 0;
  do
  {
    left = 2*i + 1; right = left + 1;
    largest = ((left < n && beg[i] < beg[left]) ? left : i);
    if (right < n && beg[largest] < beg[right])
      largest = right;
    // test order property at i; if bad, swap and repeat
    if (largest != i)
    {
      fsu::g_XC(beg[i],beg[largest]);
      i = largest;
    }
    else finished = 1;
  }
  while (!finished);
}
Be sure to convince yourself that these two blocks of code implement "repair" in the same way. The one substantive difference is that the compact version takes n = end - beg, so that repair operates on the entire range [beg,end); the exclusion of the last element in the lifted code (n = end - beg - 1) was an artifact of pop_heap, where the old maximum has already been swapped to the end of the range. We can take "repair" one step further, to repair any node in the tree under the assumption that its child nodes are heaps. The place where repair is needed is passed in as a third iterator:
void repair (I beg, I loc, I end)
{
  if (end - beg < 2) return;
  size_t n,i,left,right,largest;
  n = end - beg;
  i = loc - beg;  // the only change from the two-iterator version
  bool finished = 0;
  do
  {
    left = 2*i + 1; right = left + 1;
    largest = ((left < n && beg[i] < beg[left]) ? left : i);
    if (right < n && beg[largest] < beg[right])
      largest = right;
    // test order property at i; if bad, swap and repeat
    if (largest != i)
    {
      fsu::g_XC(beg[i],beg[largest]);
      i = largest;
    }
    else finished = 1;
  }
  while (!finished);
}
This code defines our new generic algorithm g_heap_repair(I beg, I loc, I end). We can immediately refactor g_pop_heap as follows:
void g_pop_heap (I beg, I end)
{
  if (end - beg < 2) return;
  g_XC(*beg, *(end - 1));
  g_heap_repair(beg, beg, end - 1);
}
Moreover, we can use heap_repair as another way to create a heap from an arbitrary array (or other range):
void fsu::g_build_heap (I beg, I end)
{
  size_t size = end - beg;
  if (size < 2) return;
  for (size_t i = size/2; i > 0; --i)
  {
    g_heap_repair(beg, beg + (i - 1), end);
  }
}
The other way to build a heap from scratch is embodied in the first loop in the vanilla version of heap sort:
void alt::g_build_heap (I beg, I end)
{
  size_t size = end - beg;
  if (size < 2) return;
  for (size_t i = 1; i < size; ++i)
  {
    g_push_heap(beg, beg + (i + 1));
  }
}
The remarkable facts are:

1. Both algorithms build a heap from an arbitrary range.
2. The worst-case runtime of fsu::g_build_heap (the loop of calls to heap_repair) is Θ(n), whereas the worst-case runtime of alt::g_build_heap (the loop of calls to push_heap) is Θ(n log n).
These facts are all the more remarkable when you consider that the worst-case runtime of both push_heap and heap_repair is Θ(log n), and the loop performs Θ(n) calls in both cases. We will come back to explaining why these facts are true later. For now, just contemplate the subtlety, and realize that this provides an opportunity to improve the performance of heap_sort: substitute fsu::build_heap for the first loop of calls to push_heap. This change does not affect the asymptotic runtime of heap_sort, because the second loop still runs in Θ(n log n), but it certainly improves the algorithm.
Begin by moving the current two versions of heap_sort into the namespace alt, keeping only the prototypes in namespace fsu. You will also need to add the namespace fsu:: resolution to the calls to push_heap and pop_heap. The effect is that the old "fsu::g_heap_sort" is now "alt::g_heap_sort".
Add these algorithm prototypes to namespace fsu (including the requisite template statements):
g_build_heap  (I beg, I end, P& p);
g_build_heap  (I beg, I end);
g_heap_repair (I beg, I loc, I end, P& p);
g_heap_repair (I beg, I loc, I end);
so that the totality of prototypes in the file is:
namespace fsu
{
  g_push_heap   (I beg, I end, P& p);
  g_pop_heap    (I beg, I end, P& p);
  g_heap_sort   (I beg, I end, P& p);
  g_build_heap  (I beg, I end, P& p);
  g_heap_repair (I beg, I loc, I end, P& p);
  g_push_heap   (I beg, I end);
  g_pop_heap    (I beg, I end);
  g_heap_sort   (I beg, I end);
  g_build_heap  (I beg, I end);
  g_heap_repair (I beg, I loc, I end);
}
namespace alt
{
  g_heap_sort (I beg, I end, P& p);
  g_heap_sort (I beg, I end);
}
The implementations of the alt versions should already be in the file from the first step, where you moved them into namespace alt.
Implement all of the namespace fsu algorithms in the namespace fsu. The implementation for fsu::g_push_heap can be the same as it was in the old version. The new algorithms g_heap_repair and g_build_heap obviously require new implementations, using the ideas outlined above in this document. Finally, g_pop_heap and g_heap_sort require the improved implementations, again as outlined above.
Test your various implementations using the supplied sort_spy.cpp. Note that this version of sort_spy has the additional feature of checking the results of each sort and reporting the number of order errors in the result. Zero is good. Anything else tells you the "sort" algorithm is misnamed.
Once you are sure you have the implementations correct, begin to pay attention to the comp_count data for the two versions of heap_sort. (Keep observations in your log.) Try to tease out a distinction between the two.
In the discussion of heap algorithms we asserted that the build_heap algorithm has runtime O(n), which is a surprising result given the organization of the algorithm as a loop of n/2 calls to a function whose worst-case runtime is clearly Θ(log n).
To gain some intuition on this fact, notice that the algorithm can be described as follows: For each subtree, starting with the smallest and progressing to the largest, repair the structure to be a heap. This process starts out at the bottom of the tree - i.e., the leaves of the tree, which are by default already heaps. So until we reach a node with a child, there is nothing to repair (which is why we can start the loop at n/2 - the leaves are the nodes with height 0). We first go through all of the nodes with height 1, repairing as we go; then all the nodes with height 2, and so on, until we hit the node with largest height, the root, which has height log2 n = lg n. Notice that 1/2 of the nodes have height 0 with no repair needed. Also 1/2 of the remaining nodes have height 1, so the repair process requires at most one swap. As we get nearer the top of the tree, where the "tall" nodes are, there are very few of them to repair.
Let's say there are N(k) nodes with height k in the tree. Since heap_repair at one of these nodes requires at worst 2k comparisons, the total number of comparisons is no greater than 2 times the sum of k * N(k), the sum taken over all possible heights k:
comp_count <= 2 Σ_k k * N(k)
The number of nodes of height k can be calculated as no greater than the ceiling of n/2^(k+1). Substituting into the summation yields
comp_count <= 2 Σ_k k * ceil(n/2^(k+1)) <= Σ_k k * ceil(n/2^k) <= n Σ_k (k/2^k)
A fact from Discrete Math is:
Σ_k k * a^k = a/(1 - a)^2, provided |a| < 1.
(The sum extends to infinity. See Rosen, Discrete Math, xxx.) Taking a = 1/2, we then have
Σ_k (k/2^k) <= 2
Extending our sum from 0 to infinity and applying this fact, we have
comp_count <= n Σ_k (k/2^k) <= 2n
which verifies that
comp_count = O(n), and therefore
fsu::build_heap runs in O(n) time.
One final interesting note: Worst-case comp_count for the algorithm has been
established exactly!
In [Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of
Floyd's Heap-Construction Program", Fundamenta Informaticae (IOS Press) 120 (1):
75] Suchenek shows that
comp_count = 2n - 2s2(n) - e2(n)
in the worst case, where s2(n) is the number of 1's in the binary representation of n and e2(n) is the exponent of 2 in the prime decomposition of n.
The opposite conclusion holds for the basic or "alt" version, which builds a heap with a loop of calls to push_heap. In that algorithm, the calls to push_heap on the sub-range [0,k+1] may require lg(k) comparisons, so the entire algorithm may require
comp_count >= Σ_k lg k  (the sum taken over k from 1 to n-1)

and since Σ_k lg k = lg((n-1)!) = Θ(n log n), we have

comp_count = Ω(n log n), and therefore

alt::build_heap runs in Ω(n log n) time in the worst case.
The name "alt::build_heap" for this basic algorithm is particular to this course and is not standard; be prepared to supply some background explanation when discussing these algorithms with people outside the class.
By compiling hsort.cpp, you have a test of the default order version of your new heap_sort.
There is an executable you can use to check your comp_count data against what we think is correct: [LIB]/area51/sort_spy_all.x.
One optimization you want to use, after all code is debugged and tested, is lowering the cost of the control arithmetic used in the various algorithms. Here multiplying and dividing by 2 are used to calculate the child and parent indices. This integer arithmetic can be made much faster by using the observations:
left   = 2*i + 1;                   // uses integer arithmetic
left   = (i << 1) | (size_t)0x01;   // uses bitwise operations to get same result
parent = i/2;                       // integer arithmetic
parent = (i >> 1);                  // same result using bitwise operations
Because integer multiplication and division are implemented as multi-step algorithms, a single such operation may need quite a few clock cycles. The bitwise operations, on the other hand, have direct hardware support and may run in as little as one clock cycle.