This chapter is devoted to binary search tree iterators and to moving around in BSTs. Dynamic binary trees are constructed in a manner very similar to linked lists, except that they use Nodes in place of Links. There are several strategies for adding iterators to dynamic binary trees, one of which mimics the approach we took for lists. All of these strategies are complicated by one fact: binary trees have a non-linear structure, whereas iterators must encounter the elements in a sequential (linear) order. For a binary search tree, we want the default iterator type to encounter the elements in sorted (increasing) order. For example, consider the following trees:
        40                 20                    50
       /  \               /  \                  /  \
     20    60           10    50              30    60
    /  \  /  \               /  \            /  \     \
  10  30 50  70            40    70        20    40    70
                          /     /         /
                        30    60        10
      tree1                tree2               tree3
These are each BSTs containing the data 10,20,30,40,50,60,70, and an Iterator-defined traversal should encounter the elements in this order. But, clearly, the trees have different structure and, therefore, the Iterator-defined traversal will navigate in the trees differently to produce the same external result. There was no such problem with Lists. This simple example probably makes you think that the subject of BST iterators can get complicated. If so, you are right.
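The point can be checked with a small sketch. It assumes a hypothetical minimal node type `DemoNode` and a plain unbalanced `Insert` (neither is the chapter's `Node<T>` or tree class); the three insertion orders used below are one way to obtain the shapes of tree1, tree2, and tree3.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical minimal node, for illustration only (not the chapter's Node<T>).
struct DemoNode {
    int value;
    std::unique_ptr<DemoNode> left, right;
    explicit DemoNode(int v) : value(v) {}
};

// Plain unbalanced BST insert: the insertion order fixes the tree shape.
void Insert(std::unique_ptr<DemoNode>& root, int v) {
    if (!root) { root = std::make_unique<DemoNode>(v); return; }
    if (v < root->value) Insert(root->left, v);
    else if (root->value < v) Insert(root->right, v);
}

// Recursive inorder traversal, collecting the values encountered.
void Inorder(const DemoNode* n, std::vector<int>& out) {
    if (n == nullptr) return;
    Inorder(n->left.get(), out);
    out.push_back(n->value);
    Inorder(n->right.get(), out);
}

// Height in levels (empty tree = 0), used only to show that shapes differ.
int Height(const DemoNode* n) {
    if (n == nullptr) return 0;
    int l = Height(n->left.get()), r = Height(n->right.get());
    return 1 + (l > r ? l : r);
}

struct Summary { std::vector<int> inorder; int rootValue; int height; };

Summary BuildAndSummarize(const std::vector<int>& insertionOrder) {
    assert(!insertionOrder.empty());
    std::unique_ptr<DemoNode> root;
    for (int v : insertionOrder) Insert(root, v);
    Summary s;
    Inorder(root.get(), s.inorder);
    s.rootValue = root->value;
    s.height = Height(root.get());
    return s;
}
```

Calling `BuildAndSummarize` with the orders {40,20,60,10,30,50,70}, {20,10,50,40,70,30,60}, and {50,30,60,20,40,70,10} yields different roots and heights but the same inorder sequence.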
We list the strategies for binary tree iterators here and consider the first three in the remainder of the chapter.
The fourth approach is actually similar to the third, and we will not discuss it further since Graphs are not covered until later.
While dynamic trees are similar in concept to linked lists, tree iterators are
significantly more complex than list iterators, because the structure to be
navigated is not geometrically linear, whereas iteration is a geometrically
linear process: Begin, Next, and End are basic iterator actions and essentially
define a linear progression from beginning to end.
We will define iterators for dynamically allocated
binary trees that implement the four traversal types (preorder, inorder,
postorder, and levelorder) discussed in the first trees chapter.
Iterators are client-serving devices. In the case of list iterators, they also serve the needs of implementation and other server side considerations because list iterators encounter list elements (client view) in the same order as they encounter list links (server view). Tree iterators, in contrast, encounter the stored elements in an order that is very different from the actual tree structure, often requiring execution of a loop to get from one element to the next in an iteration.
In order to navigate the actual structure of a tree, we introduce the binary tree Navigator. Navigators have the public interface of a bidirectional iterator, but the semantics behind the interface is that of tree navigation (up, down-left, down-right) as opposed to Iterator semantics (previous, next). Navigators also have test methods such as "has parent" and "has left child" that enable efficient navigation through a tree. All four binary tree Iterator types can be implemented in terms of Navigators.
Much is gained by an object-based approach to dynamic tree implementation, exactly as in the case with lists. By making Node<T> pointers accessible only inside the implementation code of a tree class (and tree Navigator), and giving clients access to necessary functionality through a tree public interface,
Because trees are inherently more complex than lists, these advantages of an object-based approach with information hiding are even more important for trees.
We will be able to define iterators for trees for each of the four types of traversals, again using an object-based approach rather than recursion. In circumstances where recursive techniques are clearly appropriate, we will be able to keep these in the protected areas thus avoiding client exposure to node pointers.
We will also define an iterator-like class called navigator that
facilitates motion along the tree structure itself: up, down-left, down-right.
Navigators will serve as the basis for implementing many of the algorithms we
need for binary trees, including all iterator operations and various search
methods for associative binary trees.
The following are symbolic representations for "atomic" components of binary tree traversal algorithms:
Recall that an "atomic" operation is one known to have constant runtime cost. In the cases here: Initialize(), ++n (move to the left child), n++ (move to the right child), and --n (alias n--) (move to parent) are each realized as either a single assignment of pointers or a small fixed number (3 or fewer) of stack or queue operations.
The implementations of these atomics vary depending on the iterator
type. Stating the iterator algorithms in terms of these atomics relieves us of
the need to re-state them for each iterator type - all that is needed is to plug
in the particular atomic implementations.
Begin this section on iterator algorithms by understanding that the slides present algorithms in "pseudo-code"! This looks a lot like C/C++ code, which should help make it readable, but it is definitely not ready to compile. Moreover, in any code there are boundary cases that relate to specific implementations, such as checking that pointers are not null before a dereference or ensuring that a stack is not empty before a call to Top(). These boundary cases are not mentioned in the pseudo-code, but must be handled in the actual iterator-type-specific implementations.
The following table gives a "verbal" description of each directive in the pseudo-code. Note that the verbal description is easy to understand, to a point, but there are details that seem ambiguous without consulting the pseudo-code for clarification.
Inorder::Initialize | |
English description | pseudo-code fragment |
Enter tree at root | n = bt.root; |
Left slide | while (n->HasLeftChild()) ++n; |
Skip dead nodes (if any) | while (n->IsDead()) Increment(); |
Inorder::Increment | |
English description | pseudo-code fragment |
Begin algorithm | { |
if possible, go to right child | if (n->HasRightChild()) { n++; |
and slide left to the bottom | while (n->HasLeftChild()) ++n; } |
otherwise go to parent one time, remembering whether node is a right child | else { bool wasRightChild; do { wasRightChild = n->IsRightChild(); --n; |
and keep going to parent as long as the node was a right child | } while (wasRightChild); } |
(end of algorithm) | } |
Note that the slide shows the algorithm in a kind of reversal of the table: the
pseudo-code is stated as if it were a C++ code block, with the verbal descriptions
as "comments" on the pseudo-code. We will give another example in this tabular form
and then revert to the alternative description method illustrated in the slide.
Preorder::Increment | |
English description | pseudo-code fragment |
if n has a left child, go there | if (n->HasLeftChild()) { ++n; } |
otherwise if n has a right child, go there | else if (n->HasRightChild()) { n++; } |
otherwise | else |
ascend until a right child is found and go there | while (n->IsValid()) { } |
remember where you came from | prev = n; --n; |
if you were a left child and you have a right child | if ((prev == n.lchild_) && n->HasRightChild()) |
go there and halt | n++; break; |
otherwise | } else { |
keep ascending, remembering where you came from | prev = n; --n; |
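As a concrete reading of the Preorder::Increment description, here is a minimal sketch that uses raw parent pointers in place of a Navigator. The node type `PNode` and the helper `PreorderOfTree1` are hypothetical names introduced only for this illustration; the ascent stops at the first ancestor that was left via its left child and that has a right child.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node with a parent pointer, for illustration only.
struct PNode {
    int value = 0;
    PNode* parent = nullptr;
    PNode* left = nullptr;
    PNode* right = nullptr;
};

// Returns the preorder successor of n, or nullptr when the traversal is done.
PNode* PreorderNext(PNode* n) {
    if (n == nullptr) return nullptr;
    if (n->left != nullptr) return n->left;    // go to left child if possible
    if (n->right != nullptr) return n->right;  // otherwise go to right child
    // Otherwise ascend, remembering where we came from.
    PNode* prev = n;
    n = n->parent;
    while (n != nullptr) {
        // We came up from a left child and an unvisited right child exists.
        if (prev == n->left && n->right != nullptr) return n->right;
        prev = n;
        n = n->parent;
    }
    return nullptr;  // ascended past the root
}

// Builds tree1 (40; 20,60; 10,30,50,70) and walks it in preorder.
std::vector<int> PreorderOfTree1() {
    const int vals[7] = {40, 20, 60, 10, 30, 50, 70};  // level order
    std::vector<PNode> pool(7);  // never resized, so addresses stay valid
    for (int i = 0; i < 7; ++i) {
        pool[i].value = vals[i];
        const int l = 2 * i + 1, r = 2 * i + 2;  // heap-style child indices
        if (l < 7) { pool[i].left = &pool[l];  pool[l].parent = &pool[i]; }
        if (r < 7) { pool[i].right = &pool[r]; pool[r].parent = &pool[i]; }
    }
    std::vector<int> out;
    for (PNode* n = &pool[0]; n != nullptr; n = PreorderNext(n))
        out.push_back(n->value);
    return out;
}
```

Each call does a bounded amount of work per edge followed, so a full traversal follows every edge at most twice (once down, once up).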
The main loop in Inorder::Increment is a deep descent to the bottom of the tree. The same can be said for the main loop in Postorder::Increment. These loops have names that are somewhat descriptive of the motion in the tree:
Inorder::Increment has the loop called "slide left to the bottom", which keeps moving down to the left child, stopping at the last one.
Postorder::Increment has the loop called "descend to bottom with left priority",
which moves to the left child whenever possible but then moves to the right
child as a second choice, until neither option is available.
Levelorder::Increment just executes one step in a "breadth-first search" of the
tree.
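The "descend to bottom with left priority" loop can be sketched as follows, again with raw parent pointers standing in for a Navigator. `PNode`, `Insert`, and `PostorderOf` are hypothetical helpers for this illustration, and the insert assumes distinct values.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical node with a parent pointer, for illustration only.
struct PNode {
    int value;
    PNode* parent = nullptr;
    PNode* left = nullptr;
    PNode* right = nullptr;
    explicit PNode(int v) : value(v) {}
};

// Iterative BST insert that maintains parent pointers. Nodes live in `pool`;
// std::deque keeps element addresses stable under emplace_back.
PNode* Insert(std::deque<PNode>& pool, PNode* root, int v) {
    pool.emplace_back(v);
    PNode* fresh = &pool.back();
    if (root == nullptr) return fresh;
    PNode* cur = root;
    for (;;) {
        PNode*& slot = (v < cur->value) ? cur->left : cur->right;
        if (slot == nullptr) { slot = fresh; fresh->parent = cur; return root; }
        cur = slot;
    }
}

// Descend to the bottom with left priority: left if possible, else right.
PNode* DescendLeftPriority(PNode* n) {
    while (n != nullptr && (n->left != nullptr || n->right != nullptr))
        n = (n->left != nullptr) ? n->left : n->right;
    return n;
}

// Postorder successor: after a left child comes the bottom of the right
// sibling's subtree (if there is one); otherwise the parent is next.
PNode* PostorderNext(PNode* n) {
    if (n == nullptr || n->parent == nullptr) return nullptr;
    PNode* p = n->parent;
    if (n == p->left && p->right != nullptr) return DescendLeftPriority(p->right);
    return p;
}

std::vector<int> PostorderOf(const std::vector<int>& insertionOrder) {
    std::deque<PNode> pool;
    PNode* root = nullptr;
    for (int v : insertionOrder) root = Insert(pool, root, v);
    std::vector<int> out;
    for (PNode* n = DescendLeftPriority(root); n != nullptr; n = PostorderNext(n))
        out.push_back(n->value);
    return out;
}
```

Initialization is just `DescendLeftPriority(root)`, which is why the same loop appears in both the initializer and the increment.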
There are some features of these algorithms that are independent of the iterator type:
Note that this looks a lot like code, but it is pseudo-code.
Pre-, In-, and Post-order iterators are bidirectional, but Levelorder is not: you can't "back up" the queue to a previous state, because that information is lost. The algorithms for reverse-Init and Decrement are not new, but they do have a surprising twist:
These slides show the protected details for such a Navigator-based implementation. The public interface we will leave for later discussion. The private portion of the class BinaryTree<T> contains the definition of Node, in the scope of BinaryTree<T>. Note that all members of Node are private, analogous to the treatment of Link in the scope of List<T>. There is one protected data item in BinaryTree<T>, a pointer to the root node of the tree. (Here is one place where the analogy between BinaryTree<T> and List<T> breaks down: The root of a tree is analogous to the first link in a list, but there is no analogy for the last link in a list: all of the leaves in a tree are "last", and there is no reason to favor one leaf over another. Thus in BinaryTree<T> we find no pointer analogous to lastLink.)
Thus all access into a dynamically allocated tree must be through the root
node.
A binary tree navigator is a device that facilitates motion around the tree using the actual tree structure. The interface of a navigator is syntactically very similar to that of a bidirectional iterator. However, the semantics of the navigator operations is very different. Navigators provide access and movement around a tree using the binary tree structure itself. If N is a navigator and B is a binary tree:
The internal structure of a tree is important to the implementer of trees and
tree iterators, but is meaningless to, and should be hidden from, client
programs. Navigators are used primarily to build trees and to support the
various iterator classes whose motion is more meaningful to client programs.
Parent pointers need to be maintained during operations that modify tree structure. For example, the following code fragments are from a recursive version of BST Insert, before and after modification for parent pointers:
Non-parented code:

  if (pred_(tval,nptr->value_)) // left subtree
  {
    nptr->lchild_ = RInsert(nptr->lchild_, tval);
  }
  else if (pred_(nptr->value_,tval)) // right subtree
  {
    nptr->rchild_ = RInsert(nptr->rchild_, tval);
  }
  else ...

Parented code:

  if (pred_(tval,nptr->value_)) // left subtree
  {
    nptr->lchild_ = RInsert(nptr->lchild_, tval);
    nptr->lchild_->parent_ = nptr;
  }
  else if (pred_(nptr->value_,tval)) // right subtree
  {
    nptr->rchild_ = RInsert(nptr->rchild_, tval);
    nptr->rchild_->parent_ = nptr;
  }
  else ...
Similar modifications would apply to the iterative version of BST Insert
discussed in previous chapters.
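To see the parented insert in a compilable setting, here is a minimal sketch. It assumes `int` data compared with `<` in place of `pred_`, silently ignores duplicates in the final branch, and adds a hypothetical checker `ParentsConsistent` that verifies the invariant "every child's parent_ points back at its parent".

```cpp
#include <cassert>

// Simplified stand-in for the chapter's Node (int data, all members public).
struct Node {
    int value_;
    Node* lchild_ = nullptr;
    Node* rchild_ = nullptr;
    Node* parent_ = nullptr;
    explicit Node(int v) : value_(v) {}
};

// Recursive insert that re-establishes the parent link on the way back up.
Node* RInsert(Node* nptr, int tval) {
    if (nptr == nullptr) return new Node(tval);  // parent_ is set by the caller
    if (tval < nptr->value_) {                   // left subtree
        nptr->lchild_ = RInsert(nptr->lchild_, tval);
        nptr->lchild_->parent_ = nptr;
    } else if (nptr->value_ < tval) {            // right subtree
        nptr->rchild_ = RInsert(nptr->rchild_, tval);
        nptr->rchild_->parent_ = nptr;
    }                                            // else: duplicate, ignored here
    return nptr;
}

// True when every child in the subtree points back at its actual parent.
bool ParentsConsistent(const Node* n) {
    if (n == nullptr) return true;
    if (n->lchild_ != nullptr && n->lchild_->parent_ != n) return false;
    if (n->rchild_ != nullptr && n->rchild_->parent_ != n) return false;
    return ParentsConsistent(n->lchild_) && ParentsConsistent(n->rchild_);
}

// Frees the whole subtree (postorder).
void Release(Node* n) {
    if (n == nullptr) return;
    Release(n->lchild_);
    Release(n->rchild_);
    delete n;
}
```

Note that the parent assignment is repeated at every level on the way back up, not only at the point of insertion; that redundancy is the small cost of keeping the recursive form simple.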
Rotations are required for the various forms of height-balanced trees (AVL, RBT, LLRBT). Because rotations modify tree structure, the rotation algorithms for the 2-D case must be modified to maintain the parent pointers in the 3-D case. The following shows before and after modification for RotateLeft:
Non-parented code:

  Node * RotateLeft(Node * n)
  {
    if (0 == n || 0 == n->rchild_) return n;
    Node * p = n->rchild_;
    n->rchild_ = p->lchild_;
    p->lchild_ = n;
    return p;
  }

Parented code:

  Node * RotateLeft(Node * n)
  {
    if (0 == n || 0 == n->rchild_) return n;
    Node* p = n->rchild_;
    n->rchild_ = p->lchild_;
    if (0 != p->lchild_)
    {
      p->lchild_->parent_ = n;
    }
    p->parent_ = n->parent_;
    if (0 != n->parent_)
    {
      if (n == n->parent_->lchild_)
      {
        n->parent_->lchild_ = p;
      }
      else
      {
        n->parent_->rchild_ = p;
      }
    }
    n->parent_ = p;
    p->lchild_ = n;
    return p;
  }
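The parented rotation can be exercised on a small hand-built tree. The sketch below repeats the parented RotateLeft on a simplified `Node` with `int` data; `AttachLeft`, `AttachRight`, `ParentsConsistent`, and `Inorder` are hypothetical helpers added only for the check.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for the chapter's Node (int data, all members public).
struct Node {
    int value_;
    Node* lchild_ = nullptr;
    Node* rchild_ = nullptr;
    Node* parent_ = nullptr;
    explicit Node(int v) : value_(v) {}
};

// Parented left rotation, as in the text above.
Node* RotateLeft(Node* n) {
    if (n == nullptr || n->rchild_ == nullptr) return n;
    Node* p = n->rchild_;
    n->rchild_ = p->lchild_;                 // p's left subtree moves under n
    if (p->lchild_ != nullptr) p->lchild_->parent_ = n;
    p->parent_ = n->parent_;                 // p takes n's place under n's parent
    if (n->parent_ != nullptr) {
        if (n == n->parent_->lchild_) n->parent_->lchild_ = p;
        else                          n->parent_->rchild_ = p;
    }
    n->parent_ = p;
    p->lchild_ = n;
    return p;
}

// Helpers for hand-building a tree with correct parent links.
void AttachLeft(Node* p, Node* c)  { p->lchild_ = c; if (c) c->parent_ = p; }
void AttachRight(Node* p, Node* c) { p->rchild_ = c; if (c) c->parent_ = p; }

bool ParentsConsistent(const Node* n) {
    if (n == nullptr) return true;
    if (n->lchild_ != nullptr && n->lchild_->parent_ != n) return false;
    if (n->rchild_ != nullptr && n->rchild_->parent_ != n) return false;
    return ParentsConsistent(n->lchild_) && ParentsConsistent(n->rchild_);
}

void Inorder(const Node* n, std::vector<int>& out) {
    if (n == nullptr) return;
    Inorder(n->lchild_, out);
    out.push_back(n->value_);
    Inorder(n->rchild_, out);
}
```

When the rotated node is the root of the whole tree it has no parent to update, so the caller must still store the returned pointer as the new root.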
Maintaining parent pointer structure has a small incremental runtime cost,
"paying in advance" for the ability to navigate the tree.
template < typename T , class P > // P = other optional parameters
class BinaryTreeNavigator
{
private:
  typename TreeType<T,P>::Node * currNode_;
  T& Retrieve() const; // mixed signature!
  friend class TreeType<T,P>;

public:
  // terminology support
  typedef T ValueType;
  typedef BinaryTreeNavigator <T,P> Navigator;

  // constructors
  BinaryTreeNavigator ();
  BinaryTreeNavigator (typename TreeType<T,P>::Node* ptr); // type converter
  BinaryTreeNavigator (const BinaryTreeNavigator& n); // copy ctor
  virtual ~BinaryTreeNavigator ();
  Navigator& operator = (const Navigator & n);

  // information/access
  bool Valid () const; // Navigator points to valid element
  bool HasParent () const; // node has valid parent
  bool HasLeftChild () const; // node has valid left child
  bool HasRightChild () const; // node has valid right child
  bool IsLeftChild () const; // node is the left child of its valid parent
  bool IsRightChild () const; // node is the right child of its valid parent
  bool IsRed () const; // node is red [TreeType = Red Black Tree]
  bool IsDead () const; // node is dead [TreeType = Binary Search Tree]

  // flags - TreeType = Red Black Tree, Height Balanced Tree, AVL Tree
  char GetFlags () const; // returns node flags
  void SetFlags (uint8_t flags); // sets node flags

  // neighbors get/set - return a navigator to the item
  Navigator GetParent () const;
  Navigator GetLeftChild () const;
  Navigator GetRightChild () const;

  // various operators
  bool operator == (const BinaryTreeNavigator& n2) const;
  bool operator != (const BinaryTreeNavigator& n2) const;
  T& operator * (); // Return reference to current Tval
  const T& operator * () const; // const version

  // navigation operators: these do NOT conform to standard iterator semantics
  Navigator& operator ++ (); // prefix: down-left (move to left child)
  Navigator& operator ++ (int); // postfix: down-right (move to right child)
  Navigator& operator -- (); // prefix: up (move to parent)
  Navigator& operator -- (int); // postfix: up (move to parent - same as prefix)

  // structural output
  void Dump (std::ostream& os) const;
} ; // class BinaryTreeNavigator
template < typename T , class P >
T& BinaryTreeNavigator<T,P>::Retrieve () const
// pre: Valid()
{
  if (!Valid())
  {
    std::cerr << "** BinaryTreeNavigator<T,P>::Retrieve() error: invalid dereference\n";
    exit (EXIT_FAILURE);
  }
  return currNode_->value_;
}

template < typename T , class P >
BinaryTreeNavigator<T,P>::BinaryTreeNavigator () : currNode_(0)
{}

template < typename T , class P >
BinaryTreeNavigator<T,P>::BinaryTreeNavigator (typename TreeType<T,P>::Node* ptr)
{
  currNode_ = ptr;
}

template < typename T , class P >
BinaryTreeNavigator<T,P>::BinaryTreeNavigator (const BinaryTreeNavigator<T,P>& n)
  : currNode_(n.currNode_)
{}

template < typename T , class P >
BinaryTreeNavigator<T,P>::~BinaryTreeNavigator ()
{}

template < typename T , class P >
char BinaryTreeNavigator<T,P>::GetFlags() const
{
  if (currNode_) return currNode_->flags_;
  return 0;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> BinaryTreeNavigator<T,P>::GetParent () const
{
  Navigator n; // default is null
  if (currNode_) n.currNode_ = currNode_->parent_;
  return n;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> BinaryTreeNavigator<T,P>::GetLeftChild () const
{
  Navigator n; // default is null
  if (currNode_) n.currNode_ = currNode_->lchild_;
  return n;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> BinaryTreeNavigator<T,P>::GetRightChild () const
{
  Navigator n; // default is null
  if (currNode_) n.currNode_ = currNode_->rchild_;
  return n;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::Valid() const
{
  return currNode_ != 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::operator == (const BinaryTreeNavigator<T,P>& n2) const
{
  return (currNode_ == n2.currNode_);
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::operator != (const BinaryTreeNavigator<T,P>& n2) const
{
  return (currNode_ != n2.currNode_);
}

template < typename T , class P >
BinaryTreeNavigator<T,P> & BinaryTreeNavigator<T,P>::operator = (const BinaryTreeNavigator<T,P>& n)
{
  currNode_ = n.currNode_;
  return *this;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> & BinaryTreeNavigator<T,P>::operator ++() // down-left
{
  if (currNode_ != 0) { currNode_ = currNode_ -> lchild_; }
  return *this;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> & BinaryTreeNavigator<T,P>::operator ++(int) // down-right
{
  if (currNode_ != 0) { currNode_ = currNode_ -> rchild_; }
  return *this;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> & BinaryTreeNavigator<T,P>::operator --() // up
{
  if (currNode_ != 0) { currNode_ = currNode_ -> parent_; }
  return *this;
}

template < typename T , class P >
BinaryTreeNavigator<T,P> & BinaryTreeNavigator<T,P>::operator --(int) // up
{
  if (currNode_ != 0) { currNode_ = currNode_ -> parent_; }
  return *this;
}

template < typename T , class P >
const T& BinaryTreeNavigator<T,P>::operator * () const
// pre: Valid()
{
  return Retrieve();
}

template < typename T , class P >
T& BinaryTreeNavigator<T,P>::operator * ()
// pre: Valid()
{
  return Retrieve();
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::HasParent () const
{
  if (currNode_ != 0 && currNode_ -> parent_ != 0) return 1;
  return 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::HasLeftChild () const
{
  if (currNode_ != 0 && currNode_ -> lchild_ != 0) return 1;
  return 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::HasRightChild () const
{
  if (currNode_ != 0 && currNode_ -> rchild_ != 0) return 1;
  return 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::IsLeftChild () const
{
  if (currNode_ != 0 && currNode_ -> parent_ != 0 && currNode_ == currNode_ -> parent_ -> lchild_) return 1;
  return 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::IsRightChild () const
{
  if (currNode_ != 0 && currNode_ -> parent_ != 0 && currNode_ == currNode_ -> parent_ -> rchild_) return 1;
  return 0;
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::IsRed () const
{
  if (currNode_ == 0) return 0; // null is not red
  return currNode_->IsRed();
}

template < typename T , class P >
bool BinaryTreeNavigator<T,P>::IsDead () const
{
  if (currNode_ == 0) return 0; // null is not dead
  return currNode_->IsDead();
}

template < typename T , class P >
void BinaryTreeNavigator<T,P>::Dump(std::ostream& os) const
{
  // specific to TreeType
}
The class BinaryTreeInorderIterator<N> should be a bidirectional iterator class that encounters the data of a tree in inorder order. The public interface is therefore essentially determined. We use a template parameter representing a TreeType::Navigator in order to have a clean separation between the Iterator and the specific TreeType.
template < class N >
class BinaryTreeInorderIterator
{
private:
  N nav_;
  void Increment(); // moves to next node
  void Decrement(); // moves to previous node

public:
  // terminology support
  typedef N Navigator;
  typedef typename N::ValueType ValueType;
  typedef BinaryTreeInorderIterator<N> Iterator;
  typedef ConstBinaryTreeInorderIterator<N> ConstIterator;

  // constructors
  BinaryTreeInorderIterator ();
  virtual ~BinaryTreeInorderIterator ();
  BinaryTreeInorderIterator (const Navigator& n); // type converter
  BinaryTreeInorderIterator (const BinaryTreeInorderIterator& i); // copy ctor

  // Initializers
  void Initialize (const Navigator& n);
  void rInitialize (const Navigator& n);

  // information/access
  bool Valid () const; // cursor is valid element
  Navigator GetNavigator();

  // various operators
  bool operator == (const BinaryTreeInorderIterator& i2) const;
  bool operator != (const BinaryTreeInorderIterator& i2) const;
  ValueType& operator * (); // Return reference to current Tval
  const ValueType& operator * () const; // const version
  BinaryTreeInorderIterator<N>& operator = (const BinaryTreeInorderIterator <N> & i);
  BinaryTreeInorderIterator<N>& operator ++ (); // prefix
  BinaryTreeInorderIterator<N> operator ++ (int); // postfix
  BinaryTreeInorderIterator<N>& operator -- (); // prefix
  BinaryTreeInorderIterator<N> operator -- (int); // postfix
} ;
The only data item is a private Navigator object. Our challenge is
to realize an inorder traversal using Initialize() to get started and
operator ++() to keep going. Implementation of all of the other
operations is straightforward based on these two or our past experience with
iterators.
An inorder iterator must be initialized to point to the first element encountered in an inorder traversal, i.e., to the data in the left-most node of the tree:
template < class N >
void BinaryTreeInorderIterator<N>::Initialize (const N& n)
{
  // start at n
  nav_ = n;
  // then slide left to leftmost child
  while (nav_.HasLeftChild())
    ++nav_;
}
(Reverse initialization is similar, reversing the roles of left and right.)
template < class N >
BinaryTreeInorderIterator<N>::BinaryTreeInorderIterator () : nav_()
{}

template < class N >
BinaryTreeInorderIterator<N>::~BinaryTreeInorderIterator ()
{}

template < class N >
BinaryTreeInorderIterator<N>::BinaryTreeInorderIterator (const BinaryTreeInorderIterator<N>& i)
  : nav_(i.nav_)
{}

template < class N >
BinaryTreeInorderIterator<N>::BinaryTreeInorderIterator (const N& nav) : nav_(nav)
{}
Incrementing an inorder iterator requires finding the next element encountered in an inorder iteration. This is an intricate process, essentially depth-first search, for which we have previously used a stack as a control mechanism. However, we can convert DFS to an iterative process by taking advantage of the special structure of a rooted binary tree.
If the current node has a right child, the next node is the left-most descendant of that right child. Otherwise, we must backtrack until we encounter an unvisited node. We can test for "visited" by asking whether the ascent is from a right child.
// protected increment/decrement
template < class N >
void BinaryTreeInorderIterator<N>::Increment()
{
  if (!Valid())
  {
    return;
  }
  // now we have a valid navigator
  if (nav_.HasRightChild())
  // slide down the left subtree of right child
  {
    nav_++;
    while (nav_.HasLeftChild())
      ++nav_;
  }
  else
  // back up to first ancestor not already visited
  // as long as we are parent's right child, then parent has been visited
  {
    bool navWasRightChild;
    do
    {
      navWasRightChild = nav_.IsRightChild();
      --nav_;
    }
    while (navWasRightChild);
  }
} // Increment()

template < class N >
void BinaryTreeInorderIterator<N>::Decrement()
{
  if (!Valid())
  {
    return;
  }
  // now we have a valid navigator
  if (nav_.HasLeftChild())
  // slide down the right subtree of left child
  {
    ++nav_;
    while (nav_.HasRightChild())
      nav_++;
  }
  else
  // back up to first ancestor not already visited
  // as long as we are parent's left child, then parent has been visited
  {
    bool navWasLeftChild;
    do
    {
      navWasLeftChild = nav_.IsLeftChild();
      --nav_;
    }
    while (navWasLeftChild);
  }
} // Decrement()

template < class N >
BinaryTreeInorderIterator<N> & BinaryTreeInorderIterator<N>::operator ++()
{
  do Iterator::Increment();
  while (nav_.IsDead()); // skips over dead nodes (if any)
  return *this;
}

template < class N >
BinaryTreeInorderIterator<N> & BinaryTreeInorderIterator<N>::operator --()
{
  do Iterator::Decrement();
  while (nav_.IsDead()); // skips dead nodes
  return *this;
}
The established pattern for implementing the postfix increment operator is used:
template < class N >
BinaryTreeInorderIterator<N> BinaryTreeInorderIterator<N>::operator ++(int)
{
  BinaryTreeInorderIterator<N> i = *this;
  operator ++();
  return i;
}

template < class N >
BinaryTreeInorderIterator<N> BinaryTreeInorderIterator<N>::operator --(int)
{
  BinaryTreeInorderIterator<N> i = *this;
  operator --();
  return i;
}
Note that decrementation is essentially the reverse of incrementation for Inorder. However, for the other kinds of iterators, decrement may be quite different from "reverse" increment.
All other inorder iterator operations are implemented using a straightforward
analogy to list iterator implementation.
The class BinaryTreePostorderIterator<N> should be a bidirectional iterator class that encounters the data of a tree in postorder order. The public interface is therefore determined and is depicted in the slide. As in the Inorder case, the only data item is a private Navigator. Our challenge is to realize a postorder traversal using Initialize() to get started and operator ++() to keep going. Implementation of all of the other operations is straightforward based on these two or our past experience with iterators.
We have seen in past discussions how the ADTs Stack and Queue may be used as algorithm control devices - notably, using Stack to implement depth-first search (DFS) and Queue to implement breadth-first search (BFS). Observing that the pre-, in-, and post-order traversals follow a DFS of the tree from the root, and similarly that level-order traversal follows a BFS of the tree, it is natural to use this approach to define iterators of these four types. These iterator classes maintain an internal data structure (stack or queue) that is a record of where the iterator has been as well as a guide for the future.
Because these ADT-based iterators contain an amount of data that is variable and determined at runtime, it is worth looking at their cost in space usage. Note that because we have eliminated one node pointer (the parent pointer) for each tree node, we have a space saving of n pointers, so it is justifiable to allow the iterators to be somewhat bulked up, but we need to ask "how bulky?"
A control stack for any of the three DFS-based iterators consists of pointers to nodes forming a path from the root to the current node. Because we typically use BSTs in which the tree height is O(log n) [n = size], this is a modest space usage compared to the savings, so it is justified to use these on the basis of space at least. A control queue for the BFS-based levelorder iterator is significantly larger, approaching n/2 as we get near the bottom of the tree. This makes the Levelorder iterators of more limited appeal, and they certainly should be used only in specialized applications. Nevertheless, they are useful on some occasions.
Iterator Type | Properties |
Inorder | Often becomes the official "Iterator" for ordered Set/Table |
 | Bidirectional Iterator |
 | Proper Type |
 | Copy requires copying control stack |
 | Stack is typically size O(log n) |
 | Algorithm for decrement (operator--) is left-handed version of algorithm for increment (operator++) |
Levelorder | Can't back up - information on where it has been is already popped from the control queue |
 | Copying can be expensive, since the control queue has size Ω(n) |
 | Useful - e.g., data written in level order rebuilds the identical tree structure using BST::Insert |
Pre- and Post-order | Efficiency same as for Inorder case |
 | Preorder::operator-- uses left-handed algorithm for Postorder::operator++, and vice versa |
Thus these ADT-based iterators are both practical and useful.
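The remark that data written in level order rebuilds the identical tree can be checked with a short sketch. It assumes a hypothetical `DemoNode` type and a plain unbalanced `Insert` (not the chapter's BST::Insert), and it uses `std::queue` as the control queue for the breadth-first traversal.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <vector>

// Hypothetical minimal node, for illustration only.
struct DemoNode {
    int value;
    std::unique_ptr<DemoNode> left, right;
    explicit DemoNode(int v) : value(v) {}
};

// Plain unbalanced BST insert (assumption: no rebalancing takes place).
void Insert(std::unique_ptr<DemoNode>& root, int v) {
    if (!root) { root = std::make_unique<DemoNode>(v); return; }
    if (v < root->value) Insert(root->left, v);
    else if (root->value < v) Insert(root->right, v);
}

std::unique_ptr<DemoNode> Build(const std::vector<int>& order) {
    std::unique_ptr<DemoNode> root;
    for (int v : order) Insert(root, v);
    return root;
}

// Breadth-first traversal controlled by a queue of node pointers.
std::vector<int> LevelOrder(const DemoNode* root) {
    std::vector<int> out;
    std::queue<const DemoNode*> que;
    if (root != nullptr) que.push(root);
    while (!que.empty()) {
        const DemoNode* n = que.front();
        que.pop();
        out.push_back(n->value);
        if (n->left)  que.push(n->left.get());
        if (n->right) que.push(n->right.get());
    }
    return out;
}

// True when the two trees have the same shape and the same values.
bool SameShape(const DemoNode* a, const DemoNode* b) {
    if (a == nullptr || b == nullptr) return a == b;
    return a->value == b->value
        && SameShape(a->left.get(), b->left.get())
        && SameShape(a->right.get(), b->right.get());
}
```

The rebuild works because level order lists every node after its parent, so each value finds its ancestors already in place when it is inserted. With a self-balancing insert the shapes need not match.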
The class BinaryTreeInorderIterator<N> should be an iterator class that encounters the data of a tree in inorder order. We could define this class as a bidirectional iterator in a manner identical except for name to BinaryTreePostorderIterator<N> and BinaryTreeInorderIterator<N>, and the implementations would be analogous as well. Only the two methods Initialize() and operator ++() would require significant changes to accomplish the different traversal.
However, we take a different approach for this iterator class, using a stack of Navigators as a private data member of the class and relying on the stack to accomplish depth-first search. Note that the class depicted in the slide is not a fully functional bidirectional iterator, because the decrement operators are rendered unusable by privatization. The reason for this is that it is difficult to "back up" a stack-based process.
In fact as illustrated the class is not even a forward iterator, because we have privatized the postfix version of increment. This was done simply for efficiency: because making copies of stack-based iterators is costly, and the postfix increment necessarily makes a copy of the iterator, we have chosen not to present this opportunity to the client. (Similar comments apply to the copy constructor and assignment operator.) The postfix increment operator, copy constructor, and assignment operator could all be made public and implemented to make this iterator class a legitimate forward iterator class.
The inorder iterator illustrated is sufficient for many client needs, however,
and the implementation techniques are interesting. Moreover, this version works
as is for the more restricted 2-D versions of binary tree with no parent pointer
in Node.
template < class C >
void InorderBTIterator<C>::Init(Node* n)
// only intended to be used with n = root_
{
  stk_.Clear(); // clear first, so an empty tree leaves the iterator invalid
  if (n == nullptr) return;
  stk_.Push(n);
  // slide left to the bottom, recording the path
  while (n != nullptr && n->HasLeftChild())
  {
    n = n->lchild_;
    stk_.Push(n);
  }
  // skip dead nodes (if any)
  while (Valid() && stk_.Top()->IsDead())
    Increment();
}

template < class C >
void InorderBTIterator<C>::Increment()
{
  if ( stk_.Empty() )
    return;
  Node * n;
  if ( stk_.Top()->HasRightChild() )
  {
    // go to right child and slide left to the bottom
    n = stk_.Top()->rchild_;
    stk_.Push(n);
    while ( n != nullptr && n->HasLeftChild() )
    {
      n = n->lchild_;
      stk_.Push(n);
    }
  }
  else
  {
    // pop until we arrive at a node from its left child (or the stack empties)
    do
    {
      n = stk_.Top();
      stk_.Pop();
    }
    while( !stk_.Empty() && stk_.Top()->HasRightChild() && n == stk_.Top()->rchild_ );
  }
}
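A self-contained variant of the same idea, using `std::stack` and a node type without parent pointers, may help in experimenting with the algorithm. `DemoNode`, `StackInorderCursor`, and `TraverseWithStack` are hypothetical names for this sketch, and dead-node handling is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stack>
#include <vector>

// Hypothetical node with no parent pointer, for illustration only.
struct DemoNode {
    int value;
    std::unique_ptr<DemoNode> left, right;
    explicit DemoNode(int v) : value(v) {}
};

void Insert(std::unique_ptr<DemoNode>& root, int v) {
    if (!root) { root = std::make_unique<DemoNode>(v); return; }
    if (v < root->value) Insert(root->left, v);
    else if (root->value < v) Insert(root->right, v);
}

// The stack always holds the path from the root to the current node.
class StackInorderCursor {
    std::stack<const DemoNode*> stk_;
    void PushLeftSpine(const DemoNode* n) {
        for (; n != nullptr; n = n->left.get()) stk_.push(n);
    }
public:
    explicit StackInorderCursor(const DemoNode* root) { PushLeftSpine(root); }
    bool Valid() const { return !stk_.empty(); }
    int Value() const { assert(Valid()); return stk_.top()->value; }
    std::size_t Depth() const { return stk_.size(); }
    void Increment() {
        if (stk_.empty()) return;
        if (stk_.top()->right) {
            // go to right child and slide left to the bottom
            PushLeftSpine(stk_.top()->right.get());
        } else {
            // pop while we are arriving at a parent from its right child
            const DemoNode* n;
            do { n = stk_.top(); stk_.pop(); }
            while (!stk_.empty() && stk_.top()->right.get() == n);
        }
    }
};

// Traverses a tree built from insertionOrder; reports the largest stack size.
std::vector<int> TraverseWithStack(const std::vector<int>& insertionOrder,
                                   std::size_t& maxDepth) {
    std::unique_ptr<DemoNode> root;
    for (int v : insertionOrder) Insert(root, v);
    std::vector<int> out;
    maxDepth = 0;
    for (StackInorderCursor c(root.get()); c.Valid(); c.Increment()) {
        if (c.Depth() > maxDepth) maxDepth = c.Depth();
        out.push_back(c.Value());
    }
    return out;
}
```

The reported maximum stack size equals the number of levels in the tree, which is the space cost discussed earlier for the DFS-based iterators.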
The class BinaryTreeLevelorderIterator<N> should be an iterator class that encounters the data of a tree in levelorder order. Unlike the preorder, inorder, and postorder iterators, however, it is difficult and computationally expensive to implement levelorder iterators using a simple Navigator as data. An approach similar in spirit to our stack-based implementation of the inorder iterator, implementing the breadth-first search algorithm, does the job.
The accepted best approach to this problem is to use a queue as a private data member of the class and rely on the queue to control a breadth-first search. Note that the class depicted in the slide is not a fully functional bidirectional iterator, because the decrement operators are rendered unusable by privatization. The reason for this is that it is difficult to "back up" a queue-based process.
In fact as illustrated the class is not even a forward iterator, because we have privatized the postfix version of increment. This was done simply because making copies of queue-based iterators is costly. The postfix increment operator, copy constructor, and assignment operator could all be made public and implemented to make this iterator class a legitimate forward iterator class.
The levelorder iterator illustrated is sufficient for many client needs, however,
and the implementation techniques are interesting.
template < class C >
void LevelorderBTIterator<C>::Init(Node* n)
{
  que_.Clear();
  if (n == nullptr) return;
  que_.Push(n);
  // skip dead nodes (if any)
  while (!que_.Empty() && que_.Front()->IsDead())
    Increment();
}

template < class C >
void LevelorderBTIterator<C>::Increment()
{
  if ( que_.Empty() )
    return;
  Node * n = que_.Front();
  que_.Pop();
  if (n->HasLeftChild()) que_.Push(n->lchild_);
  if (n->HasRightChild()) que_.Push(n->rchild_);
}
Lemma. In any binary tree with n nodes, there are n+1 unused child slots.
Proof. There are 2n child slots, two for each node. Each node except for the root is a child of some other node, thus using a child slot, so n - 1 slots are used. Therefore 2n - (n - 1) = n + 1 slots are not used.
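The count in the lemma is easy to confirm on small examples. The sketch below assumes a hypothetical `DemoNode` type and counts nodes and null child pointers in one recursive pass; for the 7-node example trees it should report 8 null slots.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical minimal node, for illustration only.
struct DemoNode {
    int value;
    std::unique_ptr<DemoNode> left, right;
    explicit DemoNode(int v) : value(v) {}
};

void Insert(std::unique_ptr<DemoNode>& root, int v) {
    if (!root) { root = std::make_unique<DemoNode>(v); return; }
    if (v < root->value) Insert(root->left, v);
    else if (root->value < v) Insert(root->right, v);
}

struct Counts { int nodes = 0; int nullSlots = 0; };

// Counts nodes and null child pointers in the subtree rooted at n.
// Precondition for the top-level call: n is not null.
void Count(const DemoNode* n, Counts& c) {
    ++c.nodes;
    if (n->left)  Count(n->left.get(), c);  else ++c.nullSlots;
    if (n->right) Count(n->right.get(), c); else ++c.nullSlots;
}

Counts CountFor(const std::vector<int>& insertionOrder) {
    assert(!insertionOrder.empty());
    std::unique_ptr<DemoNode> root;
    for (int v : insertionOrder) Insert(root, v);
    Counts c;
    Count(root.get(), c);
    return c;
}
```

The result does not depend on the shape of the tree, only on the number of nodes.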
It turns out that these unused slots, or null pointers when implemented, provide exactly enough storage, in exactly the right places, to assist in defining a bidirectional iterator for the tree without the use of parent pointers. The idea is as follows.
Suppose during an inorder traversal we find ourselves at a certain node n. If n has a right child, then the "next" node in order is the leftmost descendant of this right child, and we can descend there using part of the navigator-based iterator increment code:
  if (n has a right child)
  {
    n = n->rchild_;              // go to right child
    while (n has a left child)   // slide left to bottom
      n = n->lchild_;
  }
If however n does not have a right child, a navigator-based iterator would search upward for the next stopping point. Without parent pointers, this doesn't work. But, if we stored the place to go to in the right child pointer, we could get there in one step:
  else // n has no right child
  {
    n = n->rchild_;              // wormhole to next node
  }
Similarly, we can decrement using threads:
if (n has a left child)
{
  n = n->lchild_;               // go to left child
  while (n has a right child)   // slide right to bottom
    n = n->rchild_;
}
else
{
  n = n->lchild_;               // leap through space to previous place
}
This is a clever idea, but there are two issues to overcome: how to tell whether a child pointer holds a genuine child or a thread, and how to keep the threads correct as the tree is modified.
We answer the first question by adding to our flag system and the second by
modifying the code for the insert and rotation operations. There are numerous
other places in BST code where a naive test such as "n->rchild_ != nullptr"
needs to be replaced with "n->HasRightChild()". The latter is
arguably more readable code in any case.
In any of the height-balanced trees we already have a set of flags associated with each node. For example, in LLRB trees, we used two flags: one for "dead" and one for "red". There are six more flags we have at our disposal. We use two of these to denote when a left or right child pointer is an actual child node or a thread:
enum Flags { ZERO = 0x00 , DEAD = 0x01, RED = 0x02 , LEFT_THREAD = 0x04 , RIGHT_THREAD = 0x08 , THREADS = LEFT_THREAD | RIGHT_THREAD };

class Node
{
  T       value_;
  Node *  lchild_,
       *  rchild_;
  uint8_t flags_; // bit 3 = left threaded, bit 4 = right threaded
  Node (const T& tval, Flags flags = DEFAULT)
    : value_(tval), lchild_(0), rchild_(0), flags_(flags) // no-parent version
  {}
  friend class BST<T,P>;
  friend class ThreadedBTIterator < BST <T,P> >;
  ...
  bool HasLeftChild    () const { return (lchild_ != nullptr) && !(IsLeftThreaded()); }
  bool HasRightChild   () const { return (rchild_ != nullptr) && !(IsRightThreaded()); }
  bool IsLeftThreaded  () const { return 0 != (LEFT_THREAD & flags_); }
  bool IsRightThreaded () const { return 0 != (RIGHT_THREAD & flags_); }
  void SetLeftThread   (Node* n) { lchild_ = n; flags_ |= LEFT_THREAD; }
  void SetRightThread  (Node* n) { rchild_ = n; flags_ |= RIGHT_THREAD; }
  void SetLeftChild    (Node* n) { lchild_ = n; flags_ &= ~LEFT_THREAD; }
  void SetRightChild   (Node* n) { rchild_ = n; flags_ &= ~RIGHT_THREAD; }
  ...
};
Also we add Set/Get convenience functionality for the newly defined flags. Note that we are not adding to the space requirements for the node, except in the most basic case of a plain BST with no tombstone designations.
Exercise. Describe in precise English what each of the
Node methods accomplishes.
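The difference between the naive non-null test and HasRightChild() can be seen in a few lines. This is only a sketch under stated assumptions: FNode is a hypothetical cut-down node that keeps just the right-hand half of the flag machinery, with the same bit value for RIGHT_THREAD as in the enum above.

```cpp
#include <cassert>
#include <cstdint>

// Same bit value as RIGHT_THREAD in the chapter's Flags enum.
enum : std::uint8_t { F_RIGHT_THREAD = 0x08 };

// Hypothetical cut-down node: right slot and flags only.
struct FNode {
    int value;
    FNode* rchild_ = nullptr;
    std::uint8_t flags_ = 0;
    explicit FNode(int v) : value(v) {}

    bool IsRightThreaded() const { return (flags_ & F_RIGHT_THREAD) != 0; }
    bool HasRightChild() const { return rchild_ != nullptr && !IsRightThreaded(); }
    void SetRightThread(FNode* n) { rchild_ = n; flags_ |= F_RIGHT_THREAD; }
    void SetRightChild(FNode* n) {
        rchild_ = n;
        flags_ = static_cast<std::uint8_t>(flags_ & ~F_RIGHT_THREAD);
    }
};

// The same pointer value is first stored as a thread, then as a child.
bool ThreadFlagDemo() {
    FNode a(10), b(20);
    a.SetRightThread(&b);   // slot is non-null, yet a has no right child
    bool threaded_ok = (a.rchild_ != nullptr) && !a.HasRightChild();
    a.SetRightChild(&b);    // flag cleared: now b is a genuine child
    bool child_ok = a.HasRightChild() && !a.IsRightThreaded();
    return threaded_ok && child_ok;
}
```

While the thread flag is set, a test of the form rchild_ != nullptr reports a child that is not there, which is exactly the situation the flag-aware test avoids.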
Non-threaded code:template <typename T, class P> T& BST<T,P>::Get (const T& t) { if (root_ == nullptr) { root_ = NewNode(t); return root_->value_; } Node * p = nullptr; // trailing parent Node * n = root_; bool left; while (n != nullptr) { p = n; if (pred_(t,n->value_)) { n = n->lchild_; left = 1; } else if (pred_(n->value_,t)) { n = n->rchild_; left = 0; } else // found { n->SetAlive(); return n->value_; } } n = NewNode(t); (left ? p->lchild_ = n : p->rchild_ = n); return n->value_; } |
|
Non-threaded code:

Node * RotateLeft(Node * n)
{
  if (0 == n || 0 == n->rchild_) return n;
  Node * p = n->rchild_;
  n->rchild_ = p->lchild_;
  p->lchild_ = n;
  return p;
}

Threaded code:

Node * RotateLeft(Node * n)
{
  // Require(n->HasRightChild() && n->rchild_->IsRed());
  if (nullptr == n || nullptr == n->rchild_ || n->IsRightThreaded()) return n;
  if (!(n->rchild_->IsRed()))
  {
    std::cerr << " ** RotateLeft called with black right child\n";
    return n;
  }
  Node * p = n->rchild_;
  if (p->HasLeftChild())
    n->SetRightChild(p->lchild_);
  else
    n->SetRightThread(p);
  p->SetLeftChild(n);
  n->IsRed()? p->SetRed() : p->SetBlack();
  n->SetRed();
  return p;
}

Non-threaded code:

Node * RotateRight(Node * n)
{
  if (0 == n || 0 == n->lchild_) return n;
  Node * p = n->lchild_;
  n->lchild_ = p->rchild_;
  p->rchild_ = n;
  return p;
}

Threaded code:

Node * RotateRight(Node * n)
{
  // Require(n->HasLeftChild() && n->lchild_->IsRed());
  if (nullptr == n || nullptr == n->lchild_ || n->IsLeftThreaded()) return n;
  if (!n->lchild_->IsRed())
  {
    std::cerr << " ** RotateRight called with black left child\n";
    return n;
  }
  Node * p = n->lchild_;
  if (p->HasRightChild())
    n->SetLeftChild(p->rchild_);
  else
    n->SetLeftThread(p);
  p->SetRightChild(n);
  n->IsRed()? p->SetRed() : p->SetBlack();
  n->SetRed();
  return p;
}
BST class:public: Iterator Begin() const { Iterator i; i.Init(root_); return i; } Iterator End() const { Iterator i; return i; } |
Iterator class:private: // usable by friends but not clients void Init(Node* n) { node_ = n; while (node_ != nullptr && node_->HasLeftChild()) node_ = node_->lchild_; while (node_ != nullptr && node_->IsDead()) Increment(); } void Increment () { if (node_ == nullptr) return; if (node_->IsRightThreaded()) { node_ = node_->rchild_; return; } node_ = node_->rchild_; while (node_ != nullptr && node_->HasLeftChild()) node_ = node_->lchild_; } |
BST class:public: Iterator rBegin() const { Iterator i; i.rInit(root_); return i; } Iterator rEnd() const { Iterator i; return i; } |
Iterator class:private: void rInit(Node* n) { node_ = n; while (node_ != nullptr && node_->HasRightChild()) node_ = node_->rchild_; while (node_ != nullptr && node_->IsDead()) Decrement(); } void Decrement () { if (node_ == nullptr) return; if (node_->IsLeftThreaded()) { node_ = node_->lchild_; return; } node_ = node_->lchild_; while (node_ != nullptr && node_->HasRightChild()) node_ = node_->rchild_; } |
Iterator class:public: Iterator & operator ++() { do Increment(); while (node_ != nullptr && node_->IsDead()); return *this; } Iterator & operator++(int) { Iterator |
Iterator class:public: Iterator & operator --() { do Decrement(); while (node_ != nullptr && node_->IsDead()); return *this; } Iterator & operator--(int) { Iterator |
Algorithm:

Increment
{
  do nothing on null iterator
  if node_ is right threaded
  {
    go where right thread points
  }
  else
  {
    go to right child
    slide left as far as possible
  }
}

Code:

void Increment ()
{
  if (node_ == nullptr) return;
  if (node_->IsRightThreaded())
  {
    node_ = node_->rchild_;
    return;
  }
  node_ = node_->rchild_;
  while (node_ != nullptr && node_->HasLeftChild())
    node_ = node_->lchild_;
}
Algorithm:++iter { Repeat: advance iter to next node Until: node is null or alive return iter by reference } iter++ { make copy of iter advance iter to next element return copy by value } |
Code:Iterator & operator++() { do Increment(); while (node_ != nullptr && node_->IsDead()); return *this; } Iterator operator++(int) { Iterator |
template < class C >
class ThreadedBTIterator // a ConstIterator pattern
{
public:
  // terminology support (declared first so that Node is known below)
  typedef typename C::ValueType    ValueType;
  typedef typename C::Node         Node;
  typedef ThreadedBTIterator<C>    ConstIterator;
  typedef ThreadedBTIterator<C>    Iterator;

private:
  friend C;
  Node* node_;
  void Init      (Node*);  // left slide, skips dead nodes
  void rInit     (Node*);  // right slide, skips dead nodes
  void Increment ();       // moves to next inorder node, dead or alive
  void Decrement ();       // moves to previous inorder node, dead or alive

public:
  // operators
  bool                   operator == (const ThreadedBTIterator& i2) const;
  bool                   operator != (const ThreadedBTIterator& i2) const;
  const ValueType&       operator *  () const; // const version
  ThreadedBTIterator<C>& operator =  (const ThreadedBTIterator& i);
  ThreadedBTIterator<C>& operator ++ ();    // prefix
  ThreadedBTIterator<C>  operator ++ (int); // postfix
  ThreadedBTIterator<C>& operator -- ();    // prefix
  ThreadedBTIterator<C>  operator -- (int); // postfix

  // constructors
  ThreadedBTIterator ();
  virtual ~ThreadedBTIterator ();
  ThreadedBTIterator (Node* n);                    // type converter
  ThreadedBTIterator (const ThreadedBTIterator& i); // copy ctor

  // information/access
  bool Valid () const; // cursor is valid element
};
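To see the threads at work end to end, here is a self-contained sketch of threaded insertion together with forward and backward stepping. It is an illustration under assumptions, not the chapter's BST class: TNode, Insert, Next, and Prev are hypothetical names, the tree is unbalanced, and there are no tombstones. The key step is that a new leaf inherits one thread from its parent and points its other thread at the parent.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum : std::uint8_t { T_LEFT_THREAD = 0x04, T_RIGHT_THREAD = 0x08 };

struct TNode {
    int value_;
    TNode* lchild_ = nullptr;
    TNode* rchild_ = nullptr;
    std::uint8_t flags_ = 0;
    explicit TNode(int v) : value_(v) {}
    bool IsLeftThreaded() const { return (flags_ & T_LEFT_THREAD) != 0; }
    bool IsRightThreaded() const { return (flags_ & T_RIGHT_THREAD) != 0; }
    bool HasLeftChild() const { return lchild_ != nullptr && !IsLeftThreaded(); }
    bool HasRightChild() const { return rchild_ != nullptr && !IsRightThreaded(); }
    void SetLeftThread(TNode* n) { lchild_ = n; flags_ |= T_LEFT_THREAD; }
    void SetRightThread(TNode* n) { rchild_ = n; flags_ |= T_RIGHT_THREAD; }
    void SetLeftChild(TNode* n) { lchild_ = n; flags_ = static_cast<std::uint8_t>(flags_ & ~T_LEFT_THREAD); }
    void SetRightChild(TNode* n) { rchild_ = n; flags_ = static_cast<std::uint8_t>(flags_ & ~T_RIGHT_THREAD); }
};

// Inserts v (duplicates ignored) and returns the root.
TNode* Insert(TNode* root, int v) {
    if (root == nullptr) return new TNode(v);
    TNode* p = root;
    for (;;) {
        if (v < p->value_) {
            if (p->HasLeftChild()) { p = p->lchild_; continue; }
            TNode* n = new TNode(v);
            n->SetLeftThread(p->lchild_);   // inherit p's predecessor (may be null)
            n->SetRightThread(p);           // p is the successor of n
            p->SetLeftChild(n);
            return root;
        }
        if (p->value_ < v) {
            if (p->HasRightChild()) { p = p->rchild_; continue; }
            TNode* n = new TNode(v);
            n->SetRightThread(p->rchild_);  // inherit p's successor (may be null)
            n->SetLeftThread(p);            // p is the predecessor of n
            p->SetRightChild(n);
            return root;
        }
        return root;                        // value already present
    }
}

TNode* Leftmost(TNode* n)  { while (n != nullptr && n->HasLeftChild())  n = n->lchild_; return n; }
TNode* Rightmost(TNode* n) { while (n != nullptr && n->HasRightChild()) n = n->rchild_; return n; }

// Same logic as Increment and Decrement above, written as free functions.
TNode* Next(TNode* n) { return n->IsRightThreaded() ? n->rchild_ : Leftmost(n->rchild_); }
TNode* Prev(TNode* n) { return n->IsLeftThreaded()  ? n->lchild_ : Rightmost(n->lchild_); }

// Builds a seven-node tree, walks it forward or backward, then frees it.
std::vector<int> ThreadedWalk(bool reverse) {
    TNode* root = nullptr;
    for (int v : {40, 20, 60, 10, 30, 50, 70}) root = Insert(root, v);
    std::vector<int> out;
    if (reverse) for (TNode* n = Rightmost(root); n != nullptr; n = Prev(n)) out.push_back(n->value_);
    else         for (TNode* n = Leftmost(root);  n != nullptr; n = Next(n)) out.push_back(n->value_);
    std::vector<TNode*> nodes;
    for (TNode* n = Leftmost(root); n != nullptr; n = Next(n)) nodes.push_back(n);
    for (TNode* n : nodes) delete n;
    return out;
}
```

ThreadedWalk(false) yields the values in increasing order and ThreadedWalk(true) in decreasing order, with no stack, queue, or parent pointer involved. A null thread on the leftmost or rightmost node is what ends the walk.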
Among the four iterator types, most commonly the InorderIterator is used as the official "Iterator" type and the others have more specific names:
class TreeType
{
public:
  ...
  typedef T                                       ValueType;
  typedef TreeType::Navigator                     Navigator;
  typedef BinaryTreeInorderIterator<Navigator>    InorderIterator;
  typedef BinaryTreePreorderIterator<Navigator>   PreorderIterator;
  typedef BinaryTreePostorderIterator<Navigator>  PostorderIterator;
  typedef BinaryTreeLevelorderIterator<Navigator> LevelorderIterator;
  typedef InorderIterator                         Iterator;
  ...
Also, typically only the official Iterator type is supported by the plain Begin/End methods; the other types are reached through Begin/End methods with more specific names:
class TreeType
{
public:
  ...
  Iterator           Begin();
  Iterator           End();
  LevelorderIterator BeginLevelorder();
  LevelorderIterator EndLevelorder();
  InorderIterator    BeginInorder();
  InorderIterator    EndInorder();
  PreorderIterator   BeginPreorder();
  PreorderIterator   EndPreorder();
  ...
Thus the various kinds of traversals are invoked with slightly varying syntax, as follows:
// standard traversal
for (typename TreeType::Iterator i = x.Begin(); i != x.End(); ++i) { ... }

// standard traversal
for (Iterator i = x.Begin(); i != x.End(); ++i) { ... }

// inorder traversal
for (InorderIterator i = x.BeginInorder(); i != x.EndInorder(); ++i) { ... }

// preorder traversal
for (PreorderIterator i = x.BeginPreorder(); i != x.EndPreorder(); ++i) { ... }

// postorder traversal
for (PostorderIterator i = x.BeginPostorder(); i != x.EndPostorder(); ++i) { ... }

// levelorder traversal
for (LevelorderIterator i = x.BeginLevelorder(); i != x.EndLevelorder(); ++i) { ... }
Some of the operations for binary tree iterators have rather complicated implementations (by iterator standards, anyway) and in fact appear to have non-constant runtime complexity -- a bad thing for iterators. A remarkable fact is that, on average at least, these iterator operations have constant runtime complexity:
Definition. An Iterator-Defined Traversal of the data structure b is the following loop:
for (Iterator i = b.Begin(); i != b.End(); ++i) { /* whatever */ }
Recall that we need a notion of atomic computation in order to analyze runtime of an algorithm. For the Navigator-based iterators, that notion is a Navigator motion operation: ++n, n++, or --n. For ADT-based iterators, that notion is a basic Push or Pop operation in the ADT. And for Thread-based iterators, that notion is a change of the underlying node pointer: ptr = ptr->child:
Iterator Basis | Atomic computation = Edge Step |
Navigator | Navigator operators ++(), ++(int), --() |
ADT | Push, Pop |
Threads | node = node->lchild_, node = node->rchild_ [same as navigator ++() and ++(int)] |
Theorem 1. Using Navigator-Based Iterators, the Iterator-Defined traversal loop makes exactly 2*n edge steps, where n is the number of nodes in the tree.
Corollary. Navigator-based Iterator operations have amortized constant runtime.
To gain some intuition on its conclusion, it is useful to look at some special cases. We will return to the proof after the examples.
For example, consider a "left-linear" tree bt1 in which no element has a right child. (Such a tree is essentially a list with a lot of wasted null pointers.)
                *
               / \
              *
             / \
            *
           / \          The tree bt1
          *
         / \
        *
       / \
      *
     / \
Atomic Computations in bt1 | preorder | inorder | postorder | levelorder |
Initialize() | 1 | size | size | 1 |
operator ++() (up to last call) | 1 | 1 | 1 | 2 |
operator ++() (last call) | size | 1 | 1 | 1 |
Clearly these numbers verify the theorem in this case. (Note that the middle line in the table represents (size - 1) calls to operator ++.) Similarly, in a "right-linear" tree bt2, in which no element has a left child, iterator operations have the following runtime complexities:
 *
/ \
   *
  / \
     *
    / \                 The tree bt2
       *
      / \
         *
        / \
           *
          / \
Atomic Computations in bt2 | preorder | inorder | postorder | levelorder |
Initialize() | 1 | 1 | size | 1 |
operator ++() (up to last call) | 1 | 1 | 1 | 2 |
operator ++() (last call) | size | size | 1 | 1 |
Note that for both bt1 and bt2, and for all three DFS based traversals, the total number of atomics for a complete traversal is exactly 2 * size.
Consider the complete binary tree bt3:
              A
          /       \
      B               C
    /   \           /   \
  D       E       F       G
 / \     / \     / \     / \
H   I   J   K   L   M   N   O

The tree bt3
As in the general case, the number of atomic computations required for operator ++() is dependent on where in the tree the iterator happens to be pointing. Often only one atomic is required, but sometimes backtracking forces the use of more than one atomic. The maximum number that may be needed is the height of the tree. It is not immediately obvious how these added atomics affect the overall runtime of a traversal.
If we write down the vertices in inorder order and compute the atomic computations to traverse, we obtain:
inorder: (init) H  D  I  B  J  E  K  A  L  F  M  C  N  G  O (end)
atomics:      4  1  1  2  2  1  1  3  3  1  1  2  2  1  1  4    = 30 = 2 * size
Note that 30 is exactly twice the size of the tree. It turns out that this relation holds in general: the number of atomics required for a DFS based traversal of a binary tree is exactly twice the size of the tree.
As a final example, here are a tree and a table showing the number of edge moves for each step in an inorder traversal:
        A
      /   \
    B       C
     \     / \
      D   E   F
     / \
    G   H

inorder op     *i    edge moves
----------     --    ----------
i = Begin()    B     2
++i            G     2
++i            D     1
++i            H     1
++i            A     3
++i            E     2
++i            C     1
++i            F     1
++i            -     3
Translating to the horizontal format, we get:
inorder: (init) B  G  D  H  A  E  C  F (end)
atomics:      2  2  1  1  3  2  1  1  3    = 16 = 2 * size
Exercise 1. Perform the analysis above for the other three traversals using bt3.
Exercise 2. Show that the number n_H of atomics in an inorder traversal of a full complete binary tree of height H satisfies the recursion:
n_H = 2 n_{H-1} + 2
(Hint: The left and right children of the root define subtrees of height H - 1.)
Exercise 3. Use the result of Exercise 2 above to show that n_H = 2 x size.
To prove Theorem 1, we shall formalize the notion of atomic computation by defining an edge step to be a change from a vertex to either a child or a parent vertex, as in the table above. We also refer to an initial access of the root and the move to invalid state directly from a vertex as edge steps. (Note that an edge step corresponds exactly to a call to one of the navigator motion operations (n.Initialize(), ++n, n++, --n) for the navigator underlying the iterator.)
The following Hilfsatz is more than enough to prove the theorem in the cases that use DFS:
Proof of Theorem 1. Let eSize be the number of edges in the tree. During DFS, each edge in the tree is crossed twice -- once going down (when it is pushed onto the stack) and once returning up (when it is popped off the stack). This accounts for 2*eSize = 2*(size - 1) = 2*size - 2 edge steps. Adding in one edge step for "jumping on" and one for "jumping off" the tree, we have exactly 2*size edge steps during the run of the loop.
Each of the Navigator-based traversals implements DFS, with a jump-on and a jump-off. The traversals differ only in where they "stop" after each increment. Therefore the count applies to each of the three traversals.
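The count in Theorem 1 can be checked experimentally for the inorder case. The sketch below is an assumption-laden stand-in for the Navigator: PNode is a hypothetical node with a parent pointer, and InorderEdgeSteps tallies one step for every move to a child or parent, plus one for jumping on and one for jumping off. SampleTreeEdgeSteps builds the eight-node tree from the last example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node with a parent pointer, standing in for a Navigator.
struct PNode {
    PNode* parent = nullptr;
    PNode* lchild = nullptr;
    PNode* rchild = nullptr;
};

// Attaches children l and r (either may be null) to p.
void Link(PNode& p, PNode* l, PNode* r) {
    p.lchild = l; p.rchild = r;
    if (l != nullptr) l->parent = &p;
    if (r != nullptr) r->parent = &p;
}

// Runs a complete inorder traversal and returns the number of edge steps.
std::size_t InorderEdgeSteps(PNode* root) {
    if (root == nullptr) return 0;
    std::size_t steps = 1;                               // jump onto the root
    PNode* n = root;
    while (n->lchild != nullptr) { n = n->lchild; ++steps; }   // Begin(): slide left
    while (n != nullptr) {
        if (n->rchild != nullptr) {                      // successor is below
            n = n->rchild; ++steps;
            while (n->lchild != nullptr) { n = n->lchild; ++steps; }
        } else {                                         // successor is above
            while (n->parent != nullptr && n == n->parent->rchild) {
                n = n->parent; ++steps;                  // climb out of right subtrees
            }
            n = n->parent; ++steps;                      // up from a left child, or off the tree
        }
    }
    return steps;
}

// The eight-node tree with inorder B G D H A E C F.
std::size_t SampleTreeEdgeSteps() {
    PNode A, B, C, D, E, F, G, H;
    Link(A, &B, &C);
    Link(B, nullptr, &D);
    Link(C, &E, &F);
    Link(D, &G, &H);
    return InorderEdgeSteps(&A);
}
```

SampleTreeEdgeSteps() agrees with the hand count in the table, 16 = 2 * 8. Individual increments cost between 1 and 3 steps here, but the total is fixed by the number of edges.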
Theorem 2. Each Thread-Based Iterator operation has runtime bounded above by the runtime of the corresponding Navigator-Based Iterator operation.
Corollary. Thread-based Iterator operations have amortized constant runtime.
Theorem 3. Using ADT-Based Iterators, the Iterator-Defined traversal loop makes exactly 2*size Push/Pop operations.
Corollary. ADT-based Iterator operations have amortized constant runtime, with the exception of destructors and copy operations, which are required to traverse the data structure [stack or queue] owned by the iterator object.
Proof. During the traversal, each node is pushed onto the data structure one time. The traversal begins and ends with an empty control structure, so each node is also popped from the structure (one time). Thus the total number of push/pop operations is twice the number of nodes.
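The same kind of check works for Theorem 3. The following sketch assumes one common stack-based inorder scheme (push the left spine, pop to advance, then push the left spine of the right child); the chapter's ADT-based iterator may be organized differently. StackInorderCounter and SNode are hypothetical names, and the counter simply tallies every push and pop.

```cpp
#include <cassert>
#include <cstddef>
#include <stack>

struct SNode {
    SNode* lchild = nullptr;
    SNode* rchild = nullptr;
};

// Stack-based inorder cursor that counts its Push and Pop operations.
class StackInorderCounter {
public:
    explicit StackInorderCounter(SNode* root) { PushLeftSpine(root); }
    bool Valid() const { return !stk_.empty(); }
    void Increment() {
        SNode* n = stk_.top();
        stk_.pop();
        ++ops_;                       // one Pop
        PushLeftSpine(n->rchild);
    }
    std::size_t Ops() const { return ops_; }
private:
    void PushLeftSpine(SNode* n) {
        for (; n != nullptr; n = n->lchild) {
            stk_.push(n);
            ++ops_;                   // one Push
        }
    }
    std::stack<SNode*> stk_;
    std::size_t ops_ = 0;
};

// Full traversal of the eight-node example tree; returns total Push/Pop count.
std::size_t SampleTreeStackOps() {
    SNode A, B, C, D, E, F, G, H;
    A.lchild = &B; A.rchild = &C;
    B.rchild = &D;
    C.lchild = &E; C.rchild = &F;
    D.lchild = &G; D.rchild = &H;
    StackInorderCounter it(&A);
    while (it.Valid()) it.Increment();
    return it.Ops();
}
```

Every node is pushed exactly once and popped exactly once, so the eight-node tree gives 16 operations, matching the argument in the proof.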
Proof of Corollaries. The theorems show that a full traversal requires 2*size atomics, so runtime complexity of the loop is Θ(2n) = Θ(n). Since there are n executions of the loop body, the average cost of each call to operator++ is Θ(n) / n = constant.
Note that the proof of the theorem avoids the complexity of arguments like those of the exercises by making a clever global argument about the edge moves in a complete traversal.
Theorem 4. Navigator- and Thread-Based Iterators require +Θ(1) space. Stack-Based Iterators require +Θ(h) space, and Queue-Based Iterators require +Θ(w) space, where h is the tree height and w is the width.
The height of a tree is the length of the longest descending path.
The width of a tree is the number of nodes in the
largest layer of the tree.
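Both quantities in Theorem 4 are easy to compute. The sketch below uses a hypothetical bare node type HNode. It assumes that path length is measured in edges, so a single node has height 0, and it adopts the convention (an assumption, not stated above) that the empty tree has height -1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>

struct HNode {
    HNode* lchild = nullptr;
    HNode* rchild = nullptr;
};

// Height: number of edges on the longest descending path (-1 if empty).
int Height(const HNode* n) {
    if (n == nullptr) return -1;
    return 1 + std::max(Height(n->lchild), Height(n->rchild));
}

// Width: number of nodes in the largest layer, found layer by layer.
std::size_t Width(const HNode* root) {
    std::size_t best = 0;
    std::queue<const HNode*> que;
    if (root != nullptr) que.push(root);
    while (!que.empty()) {
        const std::size_t layer = que.size();   // the queue holds one full layer
        best = std::max(best, layer);
        for (std::size_t i = 0; i < layer; ++i) {
            const HNode* n = que.front();
            que.pop();
            if (n->lchild != nullptr) que.push(n->lchild);
            if (n->rchild != nullptr) que.push(n->rchild);
        }
    }
    return best;
}

// Height and width of the eight-node example tree (layers of 1, 2, 3, 2 nodes).
bool SampleHeightWidth() {
    HNode A, B, C, D, E, F, G, H;
    A.lchild = &B; A.rchild = &C;
    B.rchild = &D;
    C.lchild = &E; C.rchild = &F;
    D.lchild = &G; D.rchild = &H;
    return Height(&A) == 3 && Width(&A) == 3;
}
```

The Width loop is itself a levelorder traversal, and the largest queue size it observes at a layer boundary is the width, which is the intuition behind the Θ(w) space bound for queue-based iterators.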
Name/Basis | Pros | Cons |
Full / Navigator-Based | Full-featured Ordered Set API; several kinds of iterators co-exist | Higher memory use [n pointers "wasted"]; complex operator algorithms |
External / ADT-Based | Full-featured Ordered Set API; all kinds of iterators co-exist; only way to get Levelorder | Bulky iterators; Levelorder is forward-only; Preorder::Decrement = Postorder::Increment (and vice versa) |
Threaded / Thread-Based | Most memory efficient; simpler/faster Increment & Decrement; can still use ADT iterators for special purposes (set copy operations, data save order) | Tree copy is costlier than usual (Set API is Singleton in std library) |