Project 5: Thread-Based BST Iterators

Threaded iterators for BST-based OAA and Map.

Revision dated 06/19/19

Educational Objectives: After completing this assignment, the student should be able to accomplish the following:

Describe and explain in detail the concept of forward and bidirectional iterators on Set and Map container classes.
Implement Stack-based bidirectional "in-order", "pre-order", and "post-order" iterators for binary trees.
Implement Queue-based forward "level-order" iterators for binary trees.
Implement threaded BSTs and associated threaded iterators.
Explain the utility of iterators on Sets and Maps.
Give examples of Set and Map applications that are simple with iterators but difficult without iterators.
Explain what the runtime and other efficiency considerations are between RBLL Trees and BSTs.
Explain what the runtime and other efficiency considerations are between ADT-based iterators and thread-based iterators.

=======================================================================
Rubric used in assessment
-----------------------------------------------------------------------
builds
 student makefile                                          [0..5]:   x
 assess  makefile                                          [0..5]:   x
tests:
 fmap.x  [OAA operations Put, Get, Retrieve, Erase]        [0..8]:   x
 fmap.x  [Iterator operations, Includes]                   [0..8]:   x
 fmap.x  [standard traversals, Analysis]                   [0..8]:   x
 wb3.x   [english text only]                               [0..8]:   x
 mmap.x  [hammer test]                                     [0..8]:   x
other:
 log & testing report                                    [-50..0]: ( x)
 requirements and specs                                  [-50..0]: ( x)
 software engineering                                    [-50..0]: ( x)
 dated submission deduction                          [2 pts each]: ( x)
                                                                    --
total:                                                    [0..50]:  50
=======================================================================

Background Knowledge Required: Be sure that you have mastered the material in these chapters before beginning the assignment:
Iterators, Introduction to Sets, Introduction to Maps, Binary Search Trees, Balanced BSTs, and BST Iterators.

Operational Objectives: Implement class templates ConstThreadedMapIterator and ThreadedMapIterator and use these classes to complete the implementations of the class template Map_Threaded.

Deliverables:

map_bst_threaded.h   # Map_BST class template
mapiter_threaded.h   # ConstThreadedMapIterator and ThreadedMapIterator class templates
wordbench3.h         # refactors wordbench2 using fsu::Map_Threaded
wordbench3.cpp       #     "         "        "        "
wordify.cpp          # same file as used for previous project - submitted for convenience
makefile.wb3         # makefile builds fmap.x, mmap.x, and wb3.x
log.txt              # your project work log

Discussion

This project explores the addition of Iterator classes associated with the Ordered Associative Array (OAA) class that was the subject of the previous project. Recall that in the previous project we did not have iterators, but nevertheless created a serviceable Map-like container supporting the associative array API (Put, Get, Retrieve). We had to go to extraordinary lengths to obtain a useful traversal, and we were handicapped by the lack of an equality operator among OAA objects. We had no way to make sense of the fundamental Table operation Includes (although Retrieve is a useful work-around). Features we will easily obtain using iterators include the following:

Includes method, returning an Iterator object
Operators ==() and !=() defined among Map objects
The Standard Traversal is operational

We may also put aside the special methods for traversals and Display and their recursive implementations. The two tools are compared in the following table.

Associative Array API
&data = Get(key)          // returns &data stored at key; ensures key exists in table
Put(key,data)             // unimodal insert of (key,data) pair; alias for Insert
&data = operator[](key)   // AA bracket operator; alias for Get

Table API
Insert(key,data)          // unimodal insert of (key,data); alias for Put
iter = Includes(key)      // returns iter to key if found, End() otherwise (const and non-const)
Begin() and End()         // supporting bidirectional iterators (const and non-const versions)

The APIs have these in common:
bool Retrieve(key,&data) const               // if true, &data is a copy of data stored with key
Remove(key) or ( Erase(key) and Rehash() )   // Erase implements "lazy removal" (dead/alive flag), Rehash reconstructs the tree without dead nodes

Both APIs are equipped with the usual container boilerplate: Empty, Size, Clear, Constructors, Destructor, operator=

In practice, one has either OAA, nicknamed "Table Lite", the iterator-free associative array, or Map, nicknamed "Full Table" or "Dictionary", which includes both the Associative Array and Table APIs.
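
To make the two APIs concrete, here is a minimal sketch of a container exposing both. MiniMap and TotalCount are hypothetical names; the class is a thin wrapper over std::map used purely as a stand-in, not fsu::Map_BST or its implementation. Note that dereferencing a std::map iterator yields a pair with members first and second, not an Entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in offering both APIs; std::map does the real work.
template <typename K, typename D>
class MiniMap
{
public:
  typedef typename std::map<K,D>::iterator       Iterator;
  typedef typename std::map<K,D>::const_iterator ConstIterator;

  // Associative Array API
  D&   Get (const K& k)             { return table_[k]; }  // inserts (k, D()) if k is absent
  void Put (const K& k, const D& d) { Get(k) = d; }        // unimodal: overwrites existing data
  D&   operator[] (const K& k)      { return Get(k); }     // alias for Get

  // Table API
  void     Insert   (const K& k, const D& d) { Put(k, d); }          // alias for Put
  Iterator Includes (const K& k)             { return table_.find(k); } // End() if not found
  Iterator Begin ()                          { return table_.begin(); }
  Iterator End   ()                          { return table_.end(); }

  // Common to both APIs
  bool Retrieve (const K& k, D& d) const
  {
    ConstIterator i = table_.find(k);
    if (i == table_.end()) return false;
    d = i->second;                  // d receives a copy of the stored data
    return true;
  }
  std::size_t Size () const { return table_.size(); }

private:
  std::map<K,D> table_;
};

// Standard traversal: visit every entry from Begin() to End() and sum the data.
inline std::size_t TotalCount (MiniMap<std::string, std::size_t>& m)
{
  std::size_t total = 0;
  for (MiniMap<std::string, std::size_t>::Iterator i = m.Begin(); i != m.End(); ++i)
    total += i->second;
  return total;
}
```

A word counter in the style of wordbench could then be written as ++m[word] for each word read, followed by a standard traversal to report the results.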

One may wonder about code bloat - throwing unwanted operations into the executable code. A big advantage of using templates is that a function template is not translated to object code unless it is actually used in the program. For class templates, this means that member functions that are never called are not instantiated, so they contribute nothing to the executable. This makes it sensible and convenient to have both the Associative Array and Table APIs supported by the Map container.
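
The point about unused member functions can be seen in a small experiment. In the sketch below (Box, Opaque, and UseBox are hypothetical names chosen for illustration), the member LessThan could not be compiled for the type Opaque, yet the code builds because LessThan is never called for that type and therefore never instantiated.

```cpp
#include <cassert>

// A type with no operator< defined.
struct Opaque { int id; };

template <typename T>
class Box
{
public:
  explicit Box (const T& t) : t_(t) {}
  const T& Value () const { return t_; }

  // Requires operator< on T. For T = Opaque this body would be ill-formed,
  // but the body is instantiated only if some code actually calls LessThan.
  bool LessThan (const Box& other) const { return t_ < other.t_; }

private:
  T t_;
};

inline int UseBox ()
{
  Box<Opaque> b(Opaque{42});  // compiles: Box<Opaque>::LessThan is never used
  return b.Value().id;
}
```

Adding a call such as b.LessThan(b) inside UseBox would trigger instantiation of the member for Opaque and produce a compile error.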

Varieties of Iterators

With a sequential structure such as a list, there is one obvious way for an iterator to go through the elements, from front to back. In a Set or Map structure, the "correct" order is neither unique nor obvious. When using any of the BST implementations, any kind of traversal might be used to define iterators, and we have at least 4: Inorder, Preorder, Postorder, and Levelorder. We will use all of these defined as external or "ADT-based" iterator classes. The native iterator class will be thread-based and have the advantage over the stack-based Inorder iterator in that it requires only +O(1) memory and is very fast. Please be familiar with the chapter on tree iterators before diving deeper into this project.
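
To illustrate why a thread-based iterator needs no stack, here is a minimal sketch of in-order stepping in a right-threaded tree. The Node layout and the names Leftmost, Next, and InorderKeys are assumptions made for this illustration; the start files define the actual node structure, which also supports stepping backward.

```cpp
#include <cassert>
#include <vector>

// Hypothetical right-threaded node (illustration only).
struct Node
{
  int   key;
  Node* left;           // real left child, or nullptr
  Node* right;          // real right child, OR a thread to the in-order successor
  bool  rightIsThread;  // true when 'right' is a thread rather than a child
};

// Descend to the smallest key in the subtree rooted at n.
inline Node* Leftmost (Node* n)
{
  while (n != nullptr && n->left != nullptr) n = n->left;
  return n;
}

// In-order successor using O(1) extra memory: no stack, no parent pointers.
inline Node* Next (Node* n)
{
  if (n->rightIsThread) return n->right;  // follow the thread (nullptr past the last node)
  return Leftmost(n->right);              // otherwise: smallest key in the right subtree
}

// Collect all keys in increasing order by repeated calls to Next.
inline std::vector<int> InorderKeys (Node* root)
{
  std::vector<int> keys;
  for (Node* n = Leftmost(root); n != nullptr; n = Next(n))
    keys.push_back(n->key);
  return keys;
}
```

A stack-based Inorder iterator must instead hold the path from the root to the current node, which costs memory proportional to the height of the tree.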

Map implementations also present the issue of what it means to de-reference an iterator. Sometimes we want the key, sometimes the data, and sometimes both. The solution is to package the (key,data) pair in a single object called an entry. An Entry object is similar to a Pair, except: (1) the names of the data are key_ and data_ (rather than first_ and second_), and (2) the key_ is a constant, so that it can never be changed. Mimicking the terminology in the standard library, we name the internal type to be returned by a dereferenced iterator "ValueType".
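
A minimal sketch of such an entry type follows. It uses the member names key_ and data_ described above, but the constructor and the helper MakeSample are assumptions for illustration; the course library supplies the actual Entry class.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch of an entry type (illustration only).
template <typename K, typename D>
struct Entry
{
  const K key_;   // constant: the key cannot be changed after construction
  D       data_;  // mutable: the data may be updated in place

  Entry (const K& k, const D& d) : key_(k), data_(d) {}
};

inline Entry<std::string, int> MakeSample ()
{
  Entry<std::string, int> e("apple", 1);
  e.data_ += 2;        // allowed: data_ is modifiable
  // e.key_ = "pear";  // would not compile: key_ is const
  return e;
}
```

Keeping key_ constant is what lets a non-const iterator hand out modifiable entries safely: a client can change the data, but cannot change a key and thereby break the search tree order.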

The external/ADT iterator classes are supplied for this project.

Procedural Requirements

The official development/testing/assessment environment is specified in the Course Organizer.
Create and work within a separate subdirectory cop4530/proj5.

Begin by copying all files in the directory LIB/proj5 into your proj5 directory. At this point you should see these files (at least) in your directory:

deliverables.sh
map_bst_threaded.start      # start for map_bst_threaded.h
mapiter_threaded.start      # start for mapiter_threaded.h
main_wb3.cpp                # 3rd time's a charm for wordbench
fmap.cpp                    # reconfigurable functionality test harness for Map
mmap.cpp                    # reconfigurable hammer tester
rantable.cpp                # create test files to be loaded by fmap.x

Then copy these relevant executables from LIB/area51/:

fmap*_i.x
mmap*_i.x
wb3_i.x

Finally, you may want to copy the slave file for simpler access. (This file should not be in your project directory under its library name, nor should you attempt to modify it.)

cp ~cop4530p/LIB/tcpp/map_bst_threaded_tools.cpp ~/cop4530/proj5/map_bst_threaded_tools.info

Create the deliverables

map_bst_threaded.h
mapiter_threaded.h
wordbench3.h
wordbench3.cpp
makefile.wb3
log.txt

satisfying the requirements and specifications below.

Test thoroughly, using the area51 executables as benchmarks. (See Hints on testing.)
Submit the assignment using the command submit.sh.

Warning: Submit scripts do not work on the program and linprog servers. Use shell.cs.fsu.edu to submit assignments. If you do not receive the second confirmation with the contents of your assignment, there has been a malfunction.

Code Requirements and Specifications

Design of the Map and Map Iterator classes is captured in the start files and discussed in the lecture notes. Most of the function implementations are omitted, and of course need to be supplied.
You are required to TYPE the code into the implementing bodies - code should NOT be copy/pasted. Typing the code will help you understand both the design and the implementation details.
The implementation should follow the binary search tree pattern with iterative (not recursive) implementation of the Get operation.
Map_BST must implement all of the Associative Array and Table APIs.
Map_BST must exhibit all characteristics of unimodal associative containers.
Map_BST methods Get, Put, Retrieve, Erase, Insert, and Includes must have average-case runtime O(log n), where n is the size of the table.
In general, behavior should exactly match that of the benchmark programs in area51.
Note that the runtime constraints make it infeasible to call SetAllThreads as part of the implementation of any of the Map operations, except for those that copy the whole table - copy constructor and assignment operator.
The only difference between WordBench2 and WordBench3 is that WordBench2 uses OAA<String,size_t> and WordBench3 uses Map_Threaded<String,size_t>; in addition, WordBench3 adds a way to present the tree node height distribution to the user, for visual inspection of the search efficiency.
You will have to re-implement WriteReport using a standard traversal of the underlying Map object to replace the clunky Display method of OAA. This is similar to what you did for the original WordBench.
ShowAnalysis is implemented with this code in wordbench3.cpp:
```
void WordBench::ShowAnalysis () const
{
  fsu::Analysis(frequency_, std::cout, 15);
}
```
Analysis is a stand-alone function template supplied in map_bst_threaded_tools.cpp. The idea is to provide a user of wordbench with a way to see how efficient the map is for their particular word files.
Identical behavior. WordBench3 must use fsu::Map_BST and behave in a manner identical to the area51 version.
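
The requirement above that Get be iterative rather than recursive can be illustrated on an ordinary binary search tree. The sketch below is not the project implementation: PlainBST is a hypothetical class with no threads and no dead/alive flags, and it assumes only that the key type supports operator<. It shows the general pattern of walking down the tree with a loop and attaching a new node where the search falls off.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical unthreaded BST showing only an iterative Get (illustration only).
template <typename K, typename D>
class PlainBST
{
  struct Node
  {
    K key; D data; Node* left; Node* right;
    explicit Node (const K& k) : key(k), data(), left(nullptr), right(nullptr) {}
  };

public:
  PlainBST () : root_(nullptr), size_(0) {}
  ~PlainBST () { Release(root_); }
  PlainBST (const PlainBST&) = delete;             // copying omitted from this sketch
  PlainBST& operator= (const PlainBST&) = delete;

  // Returns a reference to the data stored at k, inserting (k, D()) if absent.
  D& Get (const K& k)
  {
    Node** link = &root_;                 // the pointer we may need to set
    while (*link != nullptr)
    {
      if      (k < (*link)->key) link = &(*link)->left;
      else if ((*link)->key < k) link = &(*link)->right;
      else    return (*link)->data;       // key found
    }
    *link = new Node(k);                  // search fell off the tree: attach here
    ++size_;
    return (*link)->data;
  }

  std::size_t Size () const { return size_; }

private:
  static void Release (Node* n)           // cleanup only; not part of the Get pattern
  {
    if (n == nullptr) return;
    Release(n->left);
    Release(n->right);
    delete n;
  }

  Node*       root_;
  std::size_t size_;
};
```

In the threaded Map, the same loop must also distinguish child pointers from threads and handle nodes marked dead, as laid out in the start files and lecture notes.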

Hints

The supplied file map_bst_threaded_tools.cpp is a slave file for map_bst_threaded.h. It contains the integrity checker to verify all of the BST order properties.
Testing is fun and important. The supplied "rantable.cpp" compiles to a random generator of <string,int> data written to a file. Such data files can be "Loaded" into the table with the fmap '<' option and T|K|C sub-options: "< T" reads the data as a table, "< K" loads a file of keys (associating each with the default data D()), and "< C" loads keys and counts the number of instances in the data field (much like wordbench).
A command file can be used to perform a sequence of commands after the data is loaded. This feature of fmap allows repeatable tests comparing the results between your executable and a benchmark.
Note also that your wb3 should perform in a manner identical to YOUR wb2, since they use the same wordify.cpp code. (This is also expected to conform with the area51 benchmarks.)
fmap has a number of interesting tests: 4 varieties of Dump (see caution bullet), a structural test with verbose and silent modes (again see caution), and all 4 options for traversal (F = Forward, R = Reverse, L = Levelorder, and ! = reciprocity test). And of course fmap makes the AA and Table APIs accessible.
Caution. Some of the choices available from fmap may be inadvisable for large tables. Any kind of Dump or Traverse with a table of 100,000 entries is going to occupy your screen for a while, as will the level 2 output from structural testing. Try these out on modest size tables. (We have built in a "stop" below level k = 7 in the cases where a complete binary tree is output, because level k has 2^(k-1) nodes.)
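
The arithmetic behind the level limit can be checked with a short calculation. The helper names below are hypothetical, and the sketch assumes the root is counted as level 1, which is the convention implied by the formula 2^(k-1).

```cpp
#include <cassert>
#include <cstddef>

// Nodes on level k of a complete binary tree, counting the root as level 1.
inline std::size_t NodesOnLevel (unsigned k)
{
  return std::size_t(1) << (k - 1);     // 2^(k-1)
}

// Total nodes on levels 1 through k of a complete binary tree.
inline std::size_t NodesThroughLevel (unsigned k)
{
  return (std::size_t(1) << k) - 1;     // 2^k - 1
}
```

Under this convention, level 7 alone holds 64 nodes and levels 1 through 7 hold 127 in total. Each further level doubles the width of the output.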