Contents

Algorithm Analysis
Binary Trees
Binary Search Trees
Non-recursive methods

Source code
Printable version
Discuss this article
in the forums

The Series

Introduction
Binary Trees

Binary Trees

As the name implies, a binary tree is simply a recursively-defined tree that can have a maximum of two children. The two children are commonly given the names left child and right child, and as I said in the last article, binary trees have an additional traversal called the in-depth traversal. Binary trees are used for many things, ranging from efficient searching techniques, data compression, and even arithmetic expression parsers (simple compilers). Note that some people refer to binary trees as ‘B-Trees’, which is incorrect! B-Trees are a special kind of multi-data per node tree used for advanced balancing algorithms, and Apple Macintoshes even use B-trees for their file systems. Be careful to not call a ‘Binary Tree’ a ‘B-Tree’! Before we discuss structure implementation strategies, there are a few properties that a binary tree can have which are important.

Properties of a Binary Tree

Since binary tree nodes can only have a maximum of two children, this fact introduces several properties that do not make sense on general trees. There are three important properties that a binary tree can have: fullness, balance, and leftness.

Fullness

The first property is called fullness. A binary tree is considered full if, and only if, every node in the tree has either two or zero children, and every node which has zero children is on the lowest level of the tree. Because there is no limit to how many child nodes a general tree may contain, fullness of a general tree just doesn’t make sense, but any tree which has a limit on the number of children (ternary, quadnary, etc) can have this property.

Balance

The second property I’d like to discuss is balance. A binary tree can be balanced, such that one branch of the tree is about the same size and depth as the other. In order to find out if a tree node is balanced, you need to find out the maximum height level of both children in each node, and if they differ by more than one level, it is considered unbalanced. The easiest way to do this is to use a recursive function (the depth() function of the binary tree class which I have provided does this) to find the maximum depth of the left child, find the maximum depth of the right child, then subtract the two. If the number is -1, 0, or 1, the node is balanced. If the difference is anything else, then it is unbalanced. Note that a balanced binary tree requires every node in the tree to have the balanced property.

In this example, we see that the only node which is unbalanced is node #3, because the depth of its left side is 0, and the depth of the right side is 2. Subtracting 2 from 0 yields -2, thus that node is unbalanced. Note that node #1, the root, is considered balanced, since the maximum height level of both of its’ children is 3. This raises an important point: You cannot check just the root node for balancing, because it may be balanced but its child nodes may not be. Therefore, every node in the tree must be balanced in order for the tree to be considered balanced. The example tree above can be easily balanced by giving node #3 a left child.

Leftness

The third property I would like to discuss is leftness. This term is of my own creation, as I have never seen this term used before in textbooks, but it fairly important when dealing with some binary trees, such as Heaps. A leftist tree is basically a full tree, up until the lowest level. At the lowest level, every node does not need to be full, but it needs to fill in the leftmost part of the tree. While this is somewhat difficult to explain, a diagram helps immensely.

So, every child on the lowest level is pushed to the left of the tree. If you were to add one more node, it would need to be a right child of node #5 if the leftness property is to be maintained. This property becomes important when dealing with array-represented binary trees and heaps.

Structure Implementations of a Binary Tree

Linked Binary Trees

Binary Trees can be linked, just like the general tree was, where each node has two pointers to its children. This is the most common structure of binary trees.

General structure definition for a linked binary tree node:

template <class dataType>
class BinaryTree
{
public:
    typedef BinaryTree<dataType> node;
// --------------------------------------------------------------
//  Name:           BinaryTree::m_data
//  Description:    data stored in current tree node
// --------------------------------------------------------------
    dataType m_data;

// --------------------------------------------------------------
//  Name:           BinaryTree::m_parent
//  Description:    pointer to the parent of the node.
// --------------------------------------------------------------
    node* m_parent;

// --------------------------------------------------------------
//  Name:           BinaryTree::m_left
//  Description:    pointer to the left child of the node.
// --------------------------------------------------------------
    node* m_left;

// --------------------------------------------------------------
//  Name:           BinaryTree::m_right
//  Description:    pointer to the right child of the node.
// --------------------------------------------------------------
    node* m_right;
};

Notice how I made all the data public. While this destroys any semblance of good Object Oriented Design (OOD), there is a reason behind this. First of all, this is a tutorial on trees, not on object oriented design. If I were to go all out with OOD, I’d be using all sorts of esoteric features, totally eclipsing the theory behind the trees. Second, the binary tree node class is so simple, its definition is not likely to change anytime in the near future.

Advantages: the maximum tree size is limited only by the total memory available on the system.

Disadvantages: every tree node requires two pointers to its children, and this implementation uses a third for its parent. If only simple data types are being stored within the tree (integers, floats, characters), every node takes up 3-12 times the amount of space that the data type takes up.

Arrayed Binary Trees

There is one major variation of the structure of a binary tree: an arrayed binary tree. I have not implemented it here as an abstract data type, because its uses are fairly limited in the real world. However, I will implement a modified version for demonstrating heaps in the next tutorial. Right now, you should just know that the tree is represented as one large array:

It is traditional to represent arrayed nodes as boxes, rather than circles. This makes it easy to distinguish between the major structural differences. Note how the indexing starts at 1, and not 0. This is because of the rules that must be followed when traversing an arrayed binary tree:

The left child of any index i is at i * 2

The right child of any index i is at (i * 2) + 1

The parent of any index i is at i / 2 (integer division)

So, having an index of 0 would make traversal impossible. Take special care to note this fact: if node #2 does not exist within the tree, indices 2, 4, 5, 8, 9, 10, and 11 are all unusable, and therefore wasted. Also note that adding a right child to node #15 would increase the size of the array to 31 indices, of which #16 through #30 would not be used.

Advantages: Relatively fast to traverse, easy to transmit through serial interfaces (internet, external disk storage), saves much more space than a linked binary tree…

Disadvantages: but only if the tree is full, balanced, or leftist. Adding nodes to the tree that go beyond the capacity requires the array to be resized. Tree always has a size of (2^d)-1, where d represents the maximum depth of any given node in the tree.

Methods of a Binary Tree

Here is a listing of all the methods in the binary tree class:

// ----------------------------------------------------------------
//  Name:           BinaryTree::BinaryTree
//  Description:    Default Constructor. Creates a tree node.
//  Arguments:      - data: data to initialize node with
//  Return Value:   None.
// ----------------------------------------------------------------
    BinaryTree( dataType p_data )

// ----------------------------------------------------------------
//  Name:           BinaryTree::~BinaryTree
//  Description:    Destructor. Deletes all child nodes.
//  Arguments:      None.
//  Return Value:   None.
// ----------------------------------------------------------------
    ~BinaryTree()

// ----------------------------------------------------------------
//  Name:           BinaryTree::setLeft
//  Description:    sets the left child pointer, rearranging parents
//                  as necessary. Existing left child is promoted
//                  to the root of its own tree, but beware, it is
//                  up to the client to remember where it is, or
//                  else there is a memory leak.
//  Arguments:      pointer to a binary tree node.
//  Return Value:   None.
// ----------------------------------------------------------------
    void setLeft( node* p_left )

// ----------------------------------------------------------------
//  Name:           BinaryTree::setRight
//  Description:    sets the right child pointer, rearranging parents
//                  as necessary. Existing right child is promoted
//                  to the root of its own tree, but beware, it is
//                  up to the client to remember where it is, or
//                  else there is a memory leak.
//  Arguments:      pointer to a binary tree node.
//  Return Value:   None.
// ----------------------------------------------------------------
    void setRight( node* p_right )

// ----------------------------------------------------------------
//  Name:           BinaryTree::preorder
//  Description:    processes the tree using a pre-order traversal
//  Arguments:      - process: A function which takes a reference
//                             to a dataType and performs an
//                             operation on it.
//  Return Value:   None.
// ----------------------------------------------------------------
    void preorder( void (*process)(dataType& item) )

// ----------------------------------------------------------------
//  Name:           BinaryTree::postorder
//  Description:    processes the tree using a post-order traversal
//  Arguments:      - process: A function which takes a reference
//                             to a dataType and performs an
//                             operation on it.
//  Return Value:   None.
// ----------------------------------------------------------------
    void postorder( void (*process)(dataType& item) )

// ----------------------------------------------------------------
//  Name:           BinaryTree::inorder
//  Description:    processes the tree using an in-order traversal
//  Arguments:      - process: A function which takes a reference
//                             to a dataType and performs an
//                             operation on it.
//  Return Value:   None.
// ----------------------------------------------------------------
    void inorder( void (*process)(dataType& item) )

// ----------------------------------------------------------------
//  Name:           BinaryTree::isLeft
//  Description:    determines if node is a left subtree. Note that
//                  a result of false does NOT mean that it is a
//                  right child; it may be a root instead.
//  Arguments:      None.
//  Return Value:   True or False.
// ----------------------------------------------------------------
    bool isLeft()

// ----------------------------------------------------------------
//  Name:           BinaryTree::isRight
//  Description:    determines if node is a right subtree. Note that
//                  a result of false does NOT mean that it is a
//                  left child; it may be a root instead.
//  Arguments:      None.
//  Return Value:   True or False.
// ----------------------------------------------------------------
    bool isRight()

// ----------------------------------------------------------------
//  Name:           BinaryTree::isRoot
//  Description:    determines if node is a root node.
//  Arguments:      None.
//  Return Value:   True or False.
// ----------------------------------------------------------------
    bool isRoot()

// ----------------------------------------------------------------
//  Name:           BinaryTree::count
//  Description:    recursively counts all children.
//  Arguments:      None.
//  Return Value:   the number of children.
// ----------------------------------------------------------------
    int count()

// ----------------------------------------------------------------
//  Name:           BinaryTree::depth
//  Description:    recursively determines the lowest relative
//                  depth of all its children.
//  Arguments:      None.
//  Return Value:   the lowest depth of the current node.
// ----------------------------------------------------------------
    int depth()

The setLeft/Right functions provide an easy way to set children in the tree. If a child already exists where you are inserting, it sets the existing child’s parent pointer to 0, making it a new tree. Then it puts the passed in node to whichever side you told it, and then sets the new nodes parent to the current node. All of this could be done manually, of course, but the method makes it simpler because that sequence of events happens quite frequently.

Next, I’ve added another traversal routine: inorder. I briefly touched on this in the last tutorial, so I’m going to explain it more in depth this time. An in-order traversal traverses the tree in this order:

process left child
process current data
process right child

So the traversal only makes sense on binary trees, because it processes the current data in any given node between its two children.

isLeft/Right/Root should all make sense, they are just simple status functions.

count and depth are the two recursive status routines. As such, they should be used sparingly, because they are recalculated on each call. Count is simple to understand, it just recursively tells each of its children to return its count, adds one, and returns the result.

Depth is a bit more complex, as it takes the maximum of the relative depth of its left child and the relative depth of the right child, adds one and returns the result. The tree balancing routines discussed later on uses this method.

int depth()
{
    int left = -1;                    // clear left
    int right = -1;                   // and right
    if( m_left != 0 )                // if the left child exists
        left = m_left->depth();      // update the left depth
    if( m_right != 0 )               // if the right child exists
        right = m_right->depth();    // update the right depth
    if( left > right )               // if left is larger than
        return left + 1;             // right, return left + 1
    return right + 1;                // else return right + 1
}

Next : Binary Search Trees