Educational Objectives: On successful completion of this assignment, the student should be able to
Background Knowledge Required: Be sure that you have mastered the
material in these chapters before beginning the assignment:
Introduction to Sets,
Introduction to Maps,
Binary Search Trees, and
Balanced BSTs.
Operational Objectives: Create an implementation of the Ordered Associative Array API using left-leaning red-black trees. Illustrate the use of the API by refactoring your WordBench as a client of Ordered Associative Array API.
Deliverables:
oaa.h            # the ordered associative array class template
wordbench2.h     # defines wordbench refactored to use the OAA API
wordbench2.cpp   # implements wordbench2.h
log.txt          # your standard work log
Keep a text file log of your development and testing activities in log.txt.
Begin by copying all of the files from the assignment distribution directory, which will include:
hw4/main2.cpp     # driver program for wordbench2
hw4/foaa.cpp      # functionality test for OAA
hw4/rantable.cpp  # random table file generator
hw4/makefile      # makefile for project - builds wb2.x and foaa.x
hw4/hw4submit.sh  # submit script
Define and implement the class template OAA<K,D,P>, placing the code in the file oaa.h.
Use the default value P = fsu::LessThan<K> for the third template parameter, so that OAA<K,D> is also automatically defined.
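The effect of a defaulted third template parameter can be seen in a small sketch. Everything here is hypothetical and for illustration only: LessThan and GreaterThan are stand-ins written for this example (not the fsu library versions), and Toy is a toy class that merely shares the parameter list of OAA<K,D,P>.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the course library's fsu::LessThan<T>.
template <typename T>
struct LessThan
{
  bool operator()(const T& a, const T& b) const { return a < b; }
};

// Hypothetical reverse ordering, used to show a non-default third argument.
template <typename T>
struct GreaterThan
{
  bool operator()(const T& a, const T& b) const { return b < a; }
};

// Toy template with the same parameter list as OAA<K,D,P>.
template <typename K, typename D, class P = LessThan<K> >
class Toy
{
public:
  typedef P PredicateType;
  Toy() : pred_() {}
  explicit Toy(P p) : pred_(p) {}
  // True when a is ordered before b under the predicate.
  bool Before(const K& a, const K& b) const { return pred_(a, b); }
private:
  P pred_;
};

// Usage sketch: two-argument and three-argument instantiations.
inline void DefaultPredicateDemo()
{
  Toy<std::string, int> byDefault;                            // P defaults to LessThan<std::string>
  Toy<std::string, int, GreaterThan<std::string> > reversed;  // P supplied explicitly
  assert(byDefault.Before("apple", "banana"));
  assert(reversed.Before("banana", "apple"));
}
```

Because the default is written in terms of K, a client that names only the key and data types still gets a fully specified type.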
Thoroughly test your OAA<> with the distributed test client programs foaa.cpp and moaa.cpp. Be sure to log all test activity.
Define the application WordBench, refactored as a client of OAA<fsu::String, size_t>, in the header file wordbench2.h, and implement the refactored WordBench in the file wordbench2.cpp.
Test your refactored WordBench thoroughly to be certain that it is a true refactoring of the original. (Refactoring is defined to be re-coding without changing the program behavior.) Again, log all test activity.
Be sure to fully cite all references used for code and ideas, including URLs for web-based resources. These citations should be in two places: (1) the code file documentation and if appropriate detailed in relevant code locations; and (2) in your log.
Submit the assignment using the script hw4submit.sh.
Warning: Submit scripts do not work on the program and
linprog servers. Use shell.cs.fsu.edu to submit assignments. If you do
not receive the second confirmation with the contents of your assignment, there has
been a malfunction.
Implement the Ordered Associative Array API using the following class template:
template < typename K , typename D , class P = LessThan<K> >
class OAA
{
public:

  typedef K    KeyType;
  typedef D    DataType;
  typedef P    PredicateType;

           OAA  ();
  explicit OAA  (P p);
           OAA  (const OAA& a);
           ~OAA ();
  OAA& operator=(const OAA& a);

  DataType& operator [] (const KeyType& k)        { return Get(k); }

  void Put (const KeyType& k , const DataType& d) { Get(k) = d; }
  D&   Get (const KeyType& k);

  void Erase(const KeyType& k);
  void Clear();
  void Rehash();

  bool   Empty    () const { return root_ == 0; }
  size_t Size     () const { return RSize(root_); }     // counts alive nodes
  size_t NumNodes () const { return RNumNodes(root_); } // counts nodes
  int    Height   () const { return RHeight(root_); }

  template <class F>
  void Traverse(F f) const { RTraverse(root_,f); }

  void Display (std::ostream& os, int cw1, int cw2) const;

  void Dump (std::ostream& os) const;
  void Dump (std::ostream& os, int cw) const;
  void Dump (std::ostream& os, int cw, char fill) const;

private: // definitions and relationships

  enum Flags { ZERO = 0x00 , DEAD = 0x01, RED = 0x02 , DEFAULT = RED }; // DEFAULT = alive,red

  static const char* ColorMap (unsigned char flags)
  {
    flags &= 0x03; // last 2 bits only
    switch(flags)
    {
      case 0x00: return ANSI_COLOR_BOLD_BLUE;        // bits 00
      case 0x01: return ANSI_COLOR_BOLD_BLUE_SHADED; // bits 01
      case 0x02: return ANSI_COLOR_BOLD_RED;         // bits 10
      case 0x03: return ANSI_COLOR_BOLD_RED_SHADED;  // bits 11
      default: return "unknown color";               // unknown flags
    }
  }

  class Node
  {
    const KeyType   key_;
          DataType  data_;
    Node * lchild_, * rchild_;
    unsigned char flags_;
    Node (const KeyType& k, const DataType& d, Flags flags = DEFAULT)
      : key_(k), data_(d), lchild_(0), rchild_(0), flags_(flags)
    {}
    friend class OAA<K,D,P>;
    bool IsRed    () const { return 0 != (RED & flags_); }
    bool IsBlack  () const { return !IsRed(); }
    bool IsDead   () const { return 0 != (DEAD & flags_); }
    bool IsAlive  () const { return !IsDead(); }
    void SetRed   ()       { flags_ |= RED; }
    void SetBlack ()       { flags_ &= ~RED; }
    void SetDead  ()       { flags_ |= DEAD; }
    void SetAlive ()       { flags_ &= ~DEAD; }
  };

  class PrintNode
  {
  public:
    PrintNode (std::ostream& os, int cw1, int cw2) : os_(os), cw1_(cw1), cw2_(cw2) {}
    void operator() (const Node * n) const
    {
      if (n->IsAlive())
        os_ << std::setw(cw1_) << n->key_ << std::setw(cw2_) << n->data_ << '\n';
    }
  private:
    std::ostream& os_;
    int cw1_, cw2_;
  };

  class CopyNode
  {
  public:
    CopyNode (Node*& newroot, OAA<K,D>* oaa) : newroot_(newroot), this_(oaa) {}
    void operator() (const Node * n) const
    {
      if (n->IsAlive())
      {
        newroot_ = this_->RInsert(newroot_,n->key_, n->data_);
        newroot_->SetBlack();
      }
    }
  private:
    Node *&     newroot_;
    OAA<K,D> *  this_;
  };

private: // data
  Node *         root_;
  PredicateType  pred_;

private: // methods
  static Node * NewNode     (const K& k, const D& d, Flags flags);
  static void   RRelease    (Node* n); // deletes all descendants of n
  static Node * RClone      (const Node* n); // returns deep copy of n
  static size_t RSize       (Node * n);
  static size_t RNumNodes   (Node * n);
  static int    RHeight     (Node * n);

  // rotations
  static Node * RotateLeft  (Node * n);
  static Node * RotateRight (Node * n);

  template < class F >
  static void   RTraverse (Node * n, F f);

  // recursive left-leaning get
  Node * RGet(Node* nptr, const K& kval, Node*& location);

  // recursive left-leaning insert
  Node * RInsert(Node* nptr, const K& key, const D& data);

}; // class OAA<>
Note that the implementations of all OAA methods are discussed in the lecture notes in one form or another.
Many of the required implementations are already available in the file LIB/hw4/oaa.start.
It is worth pointing out what is NOT in these requirements that would be in a "full" OAA API:
The remaining "mutator" portion of the OAA API consists of Get, Put, Clear, Erase and Rehash -- arguably the minimal necessary for a useful general purpose container.
Note that the AA bracket operator is in the interface and is implemented in-line above with a single call to Get. Also note that Put is implemented with a single call to Get, which leaves Get as the principal functionality requiring implementation in order to have the AA bracket operator. The AA bracket operator, in turn, is required for the refactoring of WordBench.
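A minimal sketch of this layering follows. It is not the assigned implementation: MiniAA is a hypothetical class built on a plain unbalanced BST, with no colors, tombstones, or predicate object, and it assumes that Get inserts a value-initialized data element when the key is absent. The point is only that the bracket operator and Put need nothing beyond Get.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical miniature table: plain unbalanced BST, keys compared with operator<.
template <typename K, typename D>
class MiniAA
{
public:
  MiniAA() : root_(0), size_(0) {}
  ~MiniAA() { Release(root_); }

  // Returns a reference to the data stored with k,
  // first inserting (k, D()) when k is not yet in the tree.
  D& Get(const K& k)
  {
    Node** link = &root_;                 // pointer to the link being followed
    while (*link != 0)
    {
      if (k < (*link)->key_)      link = &(*link)->lchild_;
      else if ((*link)->key_ < k) link = &(*link)->rchild_;
      else return (*link)->data_;         // key found
    }
    *link = new Node(k);                  // key absent: attach a new leaf
    ++size_;
    return (*link)->data_;
  }

  D&   operator[](const K& k)       { return Get(k); }   // single call to Get
  void Put(const K& k, const D& d)  { Get(k) = d; }      // single call to Get
  std::size_t Size() const          { return size_; }

private:
  struct Node
  {
    const K key_;
    D       data_;
    Node*   lchild_;
    Node*   rchild_;
    explicit Node(const K& k) : key_(k), data_(), lchild_(0), rchild_(0) {}
  };

  static void Release(Node* n)            // deletes n and all its descendants
  {
    if (n == 0) return;
    Release(n->lchild_);
    Release(n->rchild_);
    delete n;
  }

  MiniAA(const MiniAA&);                  // copying disabled in this sketch
  MiniAA& operator=(const MiniAA&);

  Node*       root_;
  std::size_t size_;
};

// Usage sketch: all three access paths go through Get.
inline void MiniAADemo()
{
  MiniAA<std::string, int> aa;
  ++aa["the"];                  // absent key: Get inserts ("the", 0), ++ makes it 1
  ++aa["the"];                  // present key: Get finds it, ++ makes it 2
  aa.Put("cat", 5);
  assert(aa.Get("the") == 2);
  assert(aa["cat"] == 5);
  assert(aa.Size() == 2);
}
```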
The various const methods measure useful characteristics of the underlying BST and provide output useful in the development process as well as offering client programs insight into the AA structure.
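As an illustration of what these measurements mean, here is a sketch of recursive helpers in the spirit of RSize, RNumNodes, and RHeight. SketchNode is a hypothetical simplified node invented for this example, and the convention that an empty tree has height -1 is an assumption; check the lecture notes for the convention the course actually uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical simplified node; bit 0x01 of flags marks a tombstone (DEAD).
struct SketchNode
{
  int           key;
  SketchNode*   lchild;
  SketchNode*   rchild;
  unsigned char flags;
  bool IsAlive() const { return (flags & 0x01) == 0; }
};

// Counts every node, alive or dead.
inline std::size_t RNumNodes(const SketchNode* n)
{
  if (n == 0) return 0;
  return 1 + RNumNodes(n->lchild) + RNumNodes(n->rchild);
}

// Counts only alive nodes, which is what a client sees as the table size.
inline std::size_t RSize(const SketchNode* n)
{
  if (n == 0) return 0;
  return (n->IsAlive() ? 1 : 0) + RSize(n->lchild) + RSize(n->rchild);
}

// Height in edges; assumption: the empty tree has height -1.
inline int RHeight(const SketchNode* n)
{
  if (n == 0) return -1;
  int lh = RHeight(n->lchild);
  int rh = RHeight(n->rchild);
  return 1 + (lh > rh ? lh : rh);
}

// Usage sketch: three nodes, one of them a tombstone.
inline void MeasureDemo()
{
  SketchNode a = { 1, 0, 0, 0x00 };    // alive leaf
  SketchNode c = { 3, 0, 0, 0x01 };    // dead leaf (tombstone)
  SketchNode b = { 2, &a, &c, 0x00 };  // alive root
  assert(RNumNodes(&b) == 3);
  assert(RSize(&b) == 2);
  assert(RHeight(&b) == 1);
}
```

The gap between NumNodes and Size is the number of tombstones currently in the tree.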
The color system is outlined here just as in the lecture notes. The ColorMap is used by the Dump methods to color nodes at output. Color is manipulated by the four Node methods for detecting and changing node color. (This particular map colors red nodes red, black nodes blue, and shades the background of tombstones.)
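The flag arithmetic can be tried in isolation. The sketch below reuses the enum values and the eight Node flag methods from the class template, wrapped in a hypothetical FlagByte struct. StateName is an invented helper that returns plain descriptive strings in place of the ANSI color macros, which are not defined here.

```cpp
#include <cassert>
#include <string>

enum Flags { ZERO = 0x00, DEAD = 0x01, RED = 0x02, DEFAULT = RED };  // DEFAULT = alive, red

// Hypothetical wrapper around the single flag byte carried by each node.
struct FlagByte
{
  unsigned char flags_;
  explicit FlagByte(Flags f = DEFAULT) : flags_(f) {}
  bool IsRed    () const { return 0 != (RED & flags_); }
  bool IsBlack  () const { return !IsRed(); }
  bool IsDead   () const { return 0 != (DEAD & flags_); }
  bool IsAlive  () const { return !IsDead(); }
  void SetRed   ()       { flags_ |= RED; }
  void SetBlack ()       { flags_ &= ~RED; }
  void SetDead  ()       { flags_ |= DEAD; }
  void SetAlive ()       { flags_ &= ~DEAD; }
};

// Invented helper: names the four states selected by the low two bits.
inline const char* StateName(unsigned char flags)
{
  switch (flags & 0x03)
  {
    case 0x00: return "black, alive";  // bits 00
    case 0x01: return "black, dead";   // bits 01
    case 0x02: return "red, alive";    // bits 10
    default:   return "red, dead";     // bits 11
  }
}

// Usage sketch: color and liveness are independent bits.
inline void FlagDemo()
{
  FlagByte f;                                   // starts alive and red
  assert(std::string(StateName(f.flags_)) == "red, alive");
  f.SetDead();                                  // tombstone keeps its color
  assert(f.IsRed() && f.IsDead());
  f.SetBlack();                                 // recoloring keeps it dead
  assert(std::string(StateName(f.flags_)) == "black, dead");
  f.SetAlive();
  assert(f.flags_ == ZERO);
}
```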
There are four privately declared in-class types: Flags, Node, PrintNode, and CopyNode.
The various "private" statements are redundant, but they emphasize the various reasons for using that designation: (1) to have private in-class definitions, such as Node or typedef statements, and to record any friend relationships that might be needed; (2) private data in the form of variables; (3) private methods; and (4) things that are privatized to prevent their use.
The Erase and Rehash methods are not used by WordBench and may be implemented non-functionally. However, there is 20% extra credit available for correctly implementing these two methods.
Don't be confused by the Erase of class OAA<K,D> and the Erase of class WordBench.
Here is a working header file for the refactored WordBench:
/* wordbench2.h */

#include <xstring.h>
#include <list.h>
#include <oaa.h>

class WordBench
{
public:
  WordBench           ();
  virtual ~WordBench  ();
  bool   ReadText     (const fsu::String& infile);
  bool   WriteReport  (const fsu::String& outfile, unsigned short c1 = 15, unsigned short c2 = 15) const;
  void   ShowSummary  () const;
  void   Erase        ();

private:
  typedef fsu::String  KeyType;
  typedef size_t       DataType;

  size_t                           count_;
  fsu::OAA < KeyType , DataType >  frequency_;
  fsu::List < fsu::String >        infiles_;
  static void Cleanup (fsu::String& s);
} ;
The set "wordset_" from the original design is replaced with the ordered associative array "frequency_".
Note the private terminology is changed slightly. (Of course, the API is not changed.) The main storage OAA is called frequency_ which makes very readable code of this form:
...
Cleanup(str);
if (str.Length() != 0)
{
  ++frequency_[str];
  ++numwords;
} // end if
...
This snippet is the inner core of the processing loop implementing ReadText. The main loop implementing ReadText is now only 5 lines of code. (You should be certain you understand how this works, both for a new word and for another encounter of an existing word. Note that size_t is a built-in integer type with no constructor as such, but the default-constructor expression size_t() yields the initial value 0.)
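The two cases can be traced with a stand-in container. The sketch below uses std::map, whose bracket operator also inserts a value-initialized entry for an absent key, in place of OAA<fsu::String, size_t>. CountWords is a hypothetical function written for this example, and it omits the Cleanup step.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Reads whitespace-separated words from in, tallies them in frequency,
// and returns the total number of words read.
// std::map is a stand-in here for OAA<fsu::String, size_t>.
inline std::size_t CountWords(std::istream& in,
                              std::map<std::string, std::size_t>& frequency)
{
  std::size_t numwords = 0;
  std::string str;
  while (in >> str)
  {
    // The real ReadText would call Cleanup(str) here; this sketch skips it.
    if (str.length() != 0)
    {
      ++frequency[str];   // new word: entry created with value 0, then incremented to 1
      ++numwords;         // existing word: stored count incremented in place
    }
  }
  return numwords;
}

// Usage sketch: "the" is seen twice, every other word once.
inline void CountWordsDemo()
{
  std::istringstream in("the cat saw the dog");
  std::map<std::string, std::size_t> frequency;
  std::size_t total = CountWords(in, frequency);
  assert(total == 5);
  assert(frequency["the"] == 2);
  assert(frequency["cat"] == 1);
  assert(frequency.size() == 4);
}
```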
Another small change is that it is no longer possible to loop through the data to count the words, because we are not defining an Iterator class. We could work out a way to make this count using a traversal with a special function object that retrieves the specific frequencies, but it is simpler just to have a class variable count_ that maintains the total number of words read (and is reset to 0 by Erase()).
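For comparison, the traversal alternative mentioned above might look like the following sketch. It is illustrative only: FreqNode and SumData are hypothetical names, the node is a simplified public struct, and the traversal is a free function. In the class template as declared, the members of Node are private to OAA, so a client functor could not read n->data_ this way, which is part of why the count_ variable is the simpler route.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical simplified node holding a word and its frequency.
struct FreqNode
{
  std::string key_;
  std::size_t data_;
  FreqNode*   lchild_;
  FreqNode*   rchild_;
};

// Inorder traversal; the function object is passed by value, as in Traverse(F f).
template <class F>
void RTraverse(const FreqNode* n, F f)
{
  if (n == 0) return;
  RTraverse(n->lchild_, f);
  f(n);
  RTraverse(n->rchild_, f);
}

// Function object that adds each node's frequency to an external total.
// It holds a reference because copies of the functor must share one accumulator.
class SumData
{
public:
  explicit SumData(std::size_t& total) : total_(total) {}
  void operator()(const FreqNode* n) const { total_ += n->data_; }
private:
  std::size_t& total_;
};

// Usage sketch: frequencies 1 + 1 + 2 sum to 4 words.
inline void SumDataDemo()
{
  FreqNode left  = { "cat", 1, 0, 0 };
  FreqNode right = { "the", 2, 0, 0 };
  FreqNode root  = { "dog", 1, &left, &right };
  std::size_t total = 0;
  RTraverse(&root, SumData(total));
  assert(total == 4);
}
```

The reference member plays the same role as the std::ostream& held by PrintNode: the functor itself is copied at each recursive call, but every copy writes to the same destination.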
Be sure you can explain the difference between the Erase of class OAA<K,D> and the Erase of class WordBench.