Trees Overview

Trees, a special kind of graph, are of enormous importance in computer science. There is a rich history of using trees both as models and as actual data structures for specific kinds of applications. Trees of various types are used as indexes (for example, in secondary storage access and databases), as general-purpose data structures, and as models supporting the development of algorithms.

Our study of trees is divided into chapters. The first begins with a review of the theory of trees.

Trees Overview

General Trees - Theory and Terminology
Tree Traversals
Binary Trees

Associative Binary Trees

Totally Ordered Trees [TOT]
Binary Search Trees [BST]
The class fsu::BST<T,P>
Search, Insert, Remove
Recursive Methods
Navigators
Iterators
Random BSTs
Height Balanced Trees
Red-Black Trees

Binary Heaps

Partially Ordered Trees [POT]
Vector Model of Complete Binary Tree
Heap Algorithms
Heapsort
Priority Queue

Positional Binary Trees

Dynamic Memory Implementation of Binary Trees
Binary Tree Navigators
Binary Tree Iterators
Definition of class BinaryTree
Recursive Methods
Inserting Elements
Removing Elements
Binary Tree Construction from File

Theory and Terminology

The theory presented here is intended to serve two purposes: to review basic facts about trees, and to establish terminology necessary for precise communication about trees. For more details and more comprehensive coverage the reader is referred to the prerequisite mathematics courses and their texts.

Definition. A tree is a connected graph with no cycles.

Theorem. Any two vertices in a tree are connected by a unique irredundant path.

Proof. Let v₁ and v₂ be two vertices in a tree G. Because G is connected, there is at least one irredundant path P₁ in G from v₁ to v₂. If there were another such path P₂, then the path consisting of P₁ followed by the reverse of P₂ would be a path in G that contains a cycle, contradicting the definition of tree.

Definition. A rooted tree is a graph G satisfying the following three conditions:

G is connected;
G has no cycles;
Exactly one vertex of G is designated as the root.

The depth of a vertex v in a rooted tree is the length of the (unique) irredundant path from the root to v.

Note that a rooted tree G can be arranged so that the root is at the top, vertices of depth 1 are in a horizontal line below the root, vertices of depth 2 are in a horizontal line below the vertices of depth 1, and so on. The set of all vertices of depth k is called level k of the tree.

See the illustrations: Tree, Rooted Tree, and Rooted Tree Redrawn.

A descending path in a rooted tree is a path each of whose edges goes from a vertex to a deeper vertex. Note that the unique irredundant path from the root to any vertex is a descending path; the length of this path is equal to the depth of the vertex.

If there is a descending path from v₁ to v₂, v₁ is said to be an ancestor of v₂ and v₂ is a descendant of v₁.

Theorem. Suppose v is a vertex of depth k in a rooted tree G. Then:

Any vertex that is adjacent to v must have depth k - 1 or k + 1.
Vertices adjacent to v of depth k + 1 are called children of v.
If k > 0, there is exactly one vertex of depth k - 1 that is adjacent to v in the graph. This vertex is called the parent of v.

A vertex with no children is called a leaf. If a rooted tree is drawn in the default root-down manner, the leaves of the tree will be at the bottom. A vertex v determines a subtree with root v, defined to be the subgraph consisting of v and all descendants of v in the tree. The height of a tree is the maximum depth of its vertices. Note that the root of a tree is the only vertex of depth 0 and is the only vertex that has no parent.
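The definitions of depth, height, and leaf translate directly into code. The following is a minimal sketch, assuming a hypothetical representation in which vertices are numbered 0 through n - 1 and each vertex stores a list of its children. The names RootedTree, Depths, Height, and Leaves are ours, chosen for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: vertices are numbered 0..n-1 and
// children[v] lists the children of vertex v.
struct RootedTree
{
  std::vector<std::vector<std::size_t>> children;
  std::size_t root = 0;
};

// Depth of every vertex: the root has depth 0 and each child is
// one level deeper than its parent.
std::vector<std::size_t> Depths(const RootedTree& t)
{
  assert(t.root < t.children.size());
  std::vector<std::size_t> depth(t.children.size(), 0);
  std::vector<std::size_t> pending{t.root};  // vertices whose children still need a depth
  while (!pending.empty())
  {
    std::size_t v = pending.back();
    pending.pop_back();
    for (std::size_t c : t.children[v])
    {
      depth[c] = depth[v] + 1;
      pending.push_back(c);
    }
  }
  return depth;
}

// Height of the tree: the maximum depth over all vertices.
std::size_t Height(const RootedTree& t)
{
  std::vector<std::size_t> d = Depths(t);
  return *std::max_element(d.begin(), d.end());
}

// Leaves: the vertices with no children, in increasing vertex number.
std::vector<std::size_t> Leaves(const RootedTree& t)
{
  std::vector<std::size_t> result;
  for (std::size_t v = 0; v < t.children.size(); ++v)
    if (t.children[v].empty()) result.push_back(v);
  return result;
}
```

For a tree in which the root 0 has children 1 and 2, and vertex 1 has the single child 3, this sketch reports depths 0, 1, 1, 2, height 2, and leaves 2 and 3.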

See the illustrations: Descending Path, Root and Leaves, and Subtree.

Tree Traversals

A traversal of a graph is an algorithm or process for "visiting" all of the vertices in the graph in a specified order that is determined by the graph structure. Tree traversals are traversals that are defined in the special case that the graph is a rooted tree. The following general points pertain to tree traversals.

Tree traversals are typically based on either depth-first search (DFS) or breadth-first search (BFS).
The classic implementations of DFS-based tree traversals are recursive.
Each kind of tree traversal corresponds to an iterator type.
Iterators are implemented non-recursively.

Preorder Traversal

Preorder tree traversal is a DFS-based tree traversal in which a vertex is visited "on arrival". In other words, a preorder traversal of a rooted tree is obtained by performing DFS beginning at the root and visiting each vertex of the tree as soon as it is encountered in the search. There are at least as many choices for implementation of preorder traversal as there are implementations for DFS, including recursive, stack-based, and nested loop implementations. We will eventually investigate all three of these possibilities.
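The recursive form of preorder traversal can be sketched as follows. This is an illustration only, assuming a hypothetical child-list representation (children[v] holds the children of vertex v in left-to-right order). The names ChildLists and Preorder are ours.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: children[v] lists the children of
// vertex v in left-to-right order.
using ChildLists = std::vector<std::vector<std::size_t>>;

// Recursive DFS from v that records each vertex on arrival, before
// any of its children are searched.
void Preorder(const ChildLists& children, std::size_t v,
              std::vector<std::size_t>& order)
{
  assert(v < children.size());
  order.push_back(v);                  // visit on arrival
  for (std::size_t c : children[v])
    Preorder(children, c, order);      // then search each subtree in turn
}
```

Calling Preorder with the root as v appends every vertex of the tree to order, each parent ahead of all of its descendants.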

See an animation of preorder traversal by clicking here.

Postorder Traversal

Postorder traversal is another DFS-based tree traversal. While preorder traversal visits vertices "on arrival", postorder traversal visits vertices "on departure". In other words, a postorder traversal of a rooted tree is obtained by performing DFS beginning at the root and visiting each vertex of the tree as it is left behind in the search. The comments on implementations for preorder traversals apply equally well to postorder traversals.

A way to clarify the distinction between pre- and postorder traversals is with a stack-based implementation of DFS. As the DFS proceeds to search the entire tree, vertices are pushed onto the stack when first encountered and popped from the stack when departed. When all vertices have been encountered, the remaining stack elements are popped. Preorder traversal "visits" vertices when they are pushed onto the stack. Postorder traversal "visits" vertices when they are popped from the stack.
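The stack-based description above can be made concrete with a single search that records both orderings. The sketch below assumes a hypothetical child-list representation. The names ChildLists, Orders, and StackDFS are ours, and each stack entry carries an index of the next unexplored child, which is one of several reasonable ways to track progress.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical representation: children[v] lists the children of vertex v.
using ChildLists = std::vector<std::vector<std::size_t>>;

struct Orders
{
  std::vector<std::size_t> preorder;   // vertices in the order they are pushed
  std::vector<std::size_t> postorder;  // vertices in the order they are popped
};

Orders StackDFS(const ChildLists& children, std::size_t root)
{
  assert(root < children.size());
  Orders result;
  // Each stack entry pairs a vertex with the index of its next unexplored child.
  std::vector<std::pair<std::size_t, std::size_t>> stack;
  stack.push_back({root, 0});
  result.preorder.push_back(root);          // pushed: preorder visit
  while (!stack.empty())
  {
    std::size_t v = stack.back().first;
    std::size_t next = stack.back().second;
    if (next < children[v].size())
    {
      ++stack.back().second;                // remember progress at v
      std::size_t c = children[v][next];
      stack.push_back({c, 0});
      result.preorder.push_back(c);         // pushed: preorder visit
    }
    else
    {
      stack.pop_back();
      result.postorder.push_back(v);        // popped: postorder visit
    }
  }
  return result;
}
```

Both orderings come from the same sequence of pushes and pops; only the moment of the "visit" differs.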

See an illustration of postorder traversal by clicking here.

Levelorder Traversal

Levelorder tree traversal is a BFS-based tree traversal in which a vertex is visited "on departure". In other words, a levelorder traversal of a rooted tree is obtained by performing BFS beginning at the root and visiting each vertex of the tree as it is removed from the search queue. A levelorder traversal "visits" vertices in level order, that is, starting at the root (level 0), then the children of the root (level 1), and so on one level at a time until each vertex has been visited. The only practical implementation of BFS for trees, and hence the only practical implementation of levelorder traversal, is queue-based.
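A queue-based levelorder traversal can be sketched as follows, again assuming a hypothetical child-list representation. The names ChildLists and Levelorder are ours, and std::queue stands in for whatever queue class an implementation actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical representation: children[v] lists the children of
// vertex v in left-to-right order.
using ChildLists = std::vector<std::vector<std::size_t>>;

std::vector<std::size_t> Levelorder(const ChildLists& children, std::size_t root)
{
  assert(root < children.size());
  std::vector<std::size_t> order;
  std::queue<std::size_t> q;
  q.push(root);
  while (!q.empty())
  {
    std::size_t v = q.front();
    q.pop();
    order.push_back(v);               // visit as v leaves the queue
    for (std::size_t c : children[v])
      q.push(c);                      // children wait behind everything already queued
  }
  return order;
}
```

Because a queue is first-in, first-out, every vertex of level k leaves the queue before any vertex of level k + 1, which is exactly the level ordering.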

See an illustration of levelorder traversal by clicking here.

Traversal orderings

Each tree traversal determines an ordering of the vertices of the tree. In fact, it is this ordering that defines the traversal, independent of any particular implementation algorithm. The three traversal illustrations, reviewed here, include the final ordering of vertices.

Preorder Traversal: depth-first search (possibly stack based), visit on arrival
Postorder Traversal: depth-first search (possibly stack based), visit on departure
Levelorder Traversal: breadth-first search (queue based), visit on departure

Binary Trees

Definition. A binary tree is a rooted tree in which each vertex has 0, 1, or 2 children.

A binary tree is complete iff the only vertices with fewer than two children are in the bottom two layers (that is, the two deepest levels). Thus in a complete binary tree:

Vertices in the bottom layer have no children (these are leaves in any tree).
Vertices in the penultimate layer have 0, 1, or 2 children.
All other vertices have 2 children.
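The definition can be checked mechanically. The sketch below assumes a hypothetical pointer-based node type with no stored data. The names Node, Height, TwoChildrenThrough, and IsComplete are ours. It tests exactly the condition stated above: every vertex above the bottom two layers has two children.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical node type for illustration; null pointers mark missing children.
struct Node
{
  Node* left;
  Node* right;
};

// Height of the tree rooted at n. The empty tree is given height -1
// so that a single vertex has height 0.
int Height(const Node* n)
{
  if (n == nullptr) return -1;
  return 1 + std::max(Height(n->left), Height(n->right));
}

// True iff every vertex of depth <= lastFullDepth in the subtree rooted
// at n has two children, where n itself sits at the given depth.
bool TwoChildrenThrough(const Node* n, int depth, int lastFullDepth)
{
  if (n == nullptr || depth > lastFullDepth) return true;
  if (n->left == nullptr || n->right == nullptr) return false;
  return TwoChildrenThrough(n->left, depth + 1, lastFullDepth)
      && TwoChildrenThrough(n->right, depth + 1, lastFullDepth);
}

// Complete in the sense defined above: vertices with fewer than two
// children occur only in the bottom two layers (depths H - 1 and H).
bool IsComplete(const Node* root)
{
  assert(root != nullptr);
  int H = Height(root);
  return TwoChildrenThrough(root, 0, H - 2);
}
```

Under this definition a three-vertex chain is not complete, because its root sits two layers above the bottom and has only one child.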

Compare the illustrations of Binary Tree and Complete Binary Tree.

Theorem. Suppose a complete binary tree has n vertices and height H. Then:

2^H <= n < 2^(H+1)
H <= log_2 n < H + 1
H = floor(log_2 n)

Proof. There are several steps to the proof of the first bullet:

First note that the maximum number of vertices in layer k is 2^k. (Prove this by mathematical induction. Proved in your discrete math book/class.)
Now, if all of the layers 0 through k are full, then the number of vertices in those k + 1 layers is the sum
1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1. (Also prove this by induction.)
In a complete binary tree of height H, every vertex of depth less than H - 1 has two children, so layers 0 through H - 1 are full. Substituting k = H - 1, we see that there are 2^H - 1 vertices in these layers.
Layer H contains at least one vertex, so there are at least this many plus one, i.e., 2^H, vertices in the tree. That is, 2^H <= n.
If the last layer is also full, substituting k = H we see that the maximum number of vertices is 2^(H+1) - 1, which is smaller than 2^(H+1).
Therefore, n < 2^(H+1).

This completes the proof of the first bullet.

The second bullet is deduced from the first by taking the base 2 logarithm of the three quantities in the first bullet, and noting that the log function is increasing (i.e., preserves inequalities). The third bullet follows from the second because H is an integer: the only integer H satisfying H <= log_2 n < H + 1 is floor(log_2 n).
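The theorem is easy to spot-check numerically. The sketch below assumes one particular complete binary tree for each n: the one whose vertices are numbered 0 through n - 1 in level order, so that the parent of vertex i > 0 is vertex (i - 1) / 2. This layout and the names HeightOfCompleteTree and FloorLog2 are our assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Height of the complete binary tree on n >= 1 vertices numbered in
// level order (assumed layout: the parent of vertex i > 0 is (i - 1) / 2).
// The last vertex, n - 1, lies in the bottom layer, so its depth is the height.
std::size_t HeightOfCompleteTree(std::size_t n)
{
  assert(n >= 1);
  std::size_t height = 0;
  for (std::size_t i = n - 1; i > 0; i = (i - 1) / 2)
    ++height;                     // one edge closer to the root
  return height;
}

// floor(log_2 n) for n >= 1, computed with integer arithmetic only.
std::size_t FloorLog2(std::size_t n)
{
  assert(n >= 1);
  std::size_t result = 0;
  while (n > 1)
  {
    n /= 2;
    ++result;
  }
  return result;
}
```

For example, n = 7 gives height 2 and n = 8 gives height 3, in agreement with floor(log_2 7) = 2 and floor(log_2 8) = 3.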

Binary Tree Traversals

All of the tree traversals discussed so far apply to binary trees. There is another traversal that takes advantage of the binary tree property: inorder traversal. An inorder traversal of a binary tree visits the vertices in the order left subtree, parent vertex, right subtree. This is a DFS-based traversal with visitation occurring upon the first return to a vertex and with DFS set up to prefer the left subtree over the right. Visitation occurs immediately after the left subtree has been investigated.
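A recursive inorder traversal can be sketched as follows. The node type here is hypothetical and is not the class defined later in these notes. The names Node and Inorder are ours.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node type for illustration; null pointers mark missing children.
struct Node
{
  int value;
  Node* left;
  Node* right;
};

// Appends the values of the tree rooted at n to order, in inorder.
void Inorder(const Node* n, std::vector<int>& order)
{
  if (n == nullptr) return;
  Inorder(n->left, order);      // left subtree first
  order.push_back(n->value);    // visit between the two subtrees
  Inorder(n->right, order);     // right subtree last
}
```

Moving the push_back line ahead of the first recursive call gives preorder, and moving it after the second gives postorder.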

All four types of traversals are illustrated for binary trees:
Preorder Traversal
Inorder Traversal
Postorder Traversal
Levelorder Traversal