GameDev.netTrees Part 1

Trees Part 1
## IntroductionTrees are a very important portion of software programming theory. Whether you know it or not, they are used all around you, in many applications and games. This is why it was quite a surprise to me when I found out that most of the programmers that frequent the GameDev.net chat room don't know exactly what trees are, and what they are good for. I hope this article will remedy that. My original intent was to not provide any code for this tutorial, as I prefer it when people learn the theory rather than just copy the code (and probably use it for a computer science homework project), but I have come to realise that quite a few of you learn better by seeing an example of the topic rather than just an explanation. Therefore, I have included source code and demo programs to accompany this tutorial. This specific tutorial briefly covers recursion first, because it is a very important topic to understand before you get involved with trees. The next part describes how a tree is structured and how parts of a tree are labelled, and the last part shows how to use a General Tree. Basic Requirements for this series: Knowledge of basic data structures:Linked List Knowledge of basic C++ programming:inheritance Included Classes for this tutorial: LinkedList ## RecursionTrees are highly recursive structures. I'm hoping that you will be somewhat familiar with recursion as a theory, because it is a difficult subject to get correct for most beginning programmers. Here is a tiny introduction to recursion (you'll probably want to find a more in depth tutorial if you've never heard the term before). What is recursion? An old programmers joke has the definition of Recursion in the dictionary as this:
The basic idea behind a recursive function is one which calls itself repeatedly on many different levels. A recursive structure, such as a tree, has itself as its own children, and is thus considered recursive. Recursion helps software engineers solve problems by dividing up complex problems into smaller sub-problems, and is often a great boost in algorithm completion time, because recursion helps to "divide and conquer" the problem at hand, thus reducing the complexity. I will go into this further in the next tutorial dealing with binary tree’s. The basic recursive function will have two portions: the recursive part, and the terminating part. It is very important that recursive functions have a terminating part, because without it, the recursive function would never end. The above joke is an example of recursion without a terminating condition. If it were in a program, the program would continuously look up the definition of recursion forever. Of course, having a machine run into infinite recursion is just as deadly as an infinite loop, but the difference between the two is that infinite recursion will eventually cause the machine or thread to quit out. This is because each recursive call allocates more information on the stack every time it is called, and when the machine or thread runs out of stack space, it quits out. The stack is the most important part of recursion. It is an easy way to have the machine store current state information about a function, change to another object, and then return at a later time. Let's look at a simple recursion example: int getPower( int number, int power ) { if( power != 1 ) return number * getPower( number, power - 1 ); else return number; } Here is the non-recursive solution to the same problem: int getPower( int number, int power ) { int result = number; while( power > 1 ) { result *= number; power--; } return result; } While the recursive example looks ugly and stupid, and the iterative example is far more efficient, it demonstrates an important idea behind recursion. The terminating case is where power is 1. When power reaches 1, it returns a number instead of calling itself again. The recursive portion of the function decides to let the function handle the power one less than the current power, as long as the current power is not 1. So if the function were to calculate 2 to the So why use it? If analyzed, the recursive getPower function is slower and harder to understand than the non-recursive iterative solution, so it quite obviously offers no advantage. Recursion is not used to its full potential in this example simply because I wanted to show you how a problem can be broken down into a smaller problem by using recursion (Look for tutorials on The real reason we would use recursion is for a simple way of storing values on a 'path' for future use. Recursion store's the state of the algorithm at each level and allows the user to come back to that exact state later on. We will see this in use when using the tree traversal routines. ## Introduction to TreesTrees are used for so much in computer programming. They can be used for improving database search times (binary search trees, 2-3 trees, AVL trees, 2-3-4 trees, red-black trees), Game programming (minimax trees, decision trees, pathfinding trees), 3d graphics programming (BSP trees, quadtrees, octrees), Arithmetic Scripting languages (arithmetic precedence trees), Data compression (Huffman trees), and even file system's (B-trees, sparse indexed trees, trie trees). This article will go through the basic tree structures first (general trees), and then go into the specifics of some of the more popular trees, but not all of those listed above. A few of the trees above are quite complex and go beyond the scope of this article. ## Computer Representations of TreesImagine a linked list. The simplest implementation uses only one class, a The box represents the Linked List class, and the circles are the nodes. Trees are the same way in that they are made up of a collection of nodes. As with linked lists, a tree can be referenced by any one of it’s nodes, or a general tree container class. Whichever implementation you choose is all up to you and your needs. This tutorial does not assume one particular way over another, but all tree diagrams I will draw in this document do not show the container (box) class for reasons of clarity and brevity. It looks better without the box getting in the way. ## Parts of a TreeTrees have one discrete starting point. This is called the node, with the possible exception of a child node. Some tree implementations maintain pointers to their parents to ease traversal, but it is not explicitly required by a tree. Any node that does not have any children is called a parent node. Here is a pictorial representation of a theoretical tree:leaf Note that computer trees are typically drawn from the Root-down. Node 1 would be considered the of the root. Node 1 is their children . Note also that node 2 is a parentto 5 and 6, and 4 is a parent to 7. Thus nodes 2 and 4 act as both parent and children . Nodes 3, 5, 6 and 7 are parentsnodes. It is important to note that every node in the tree is also a tree by itself. We usually call these leaf . Take, for example, node number 2. Node 2 is the root of a sub-tree that only has three nodes. Node's 5, 6 and 7 are all sub-trees with no children.sub-trees## Hierarchy of a treeWhen you think of a linked list, it is typically a long horizontal string of nodes, each one connected to the next. This gives it a shape roughly analogous to a straight memory array. However, trees cannot be easily thought of in terms of a straight line, so a tree has what are called Here is an example of height levels: ## General TreesThe ) is a generic tree that has one root node, and every node in the tree can have an Linked Treesnumber of child nodes. One popular use of this kind of tree is in Family Tree programs. Here's an example family tree:unlimited Any parent node is allowed to have as many children as it wants. (Random fact: Johann Sebastian Bach had 19 children). The standard method of representing a general tree is to use a linked list (hence the name Each box is a linear container, and each node has a pointer to a single linear container. ## Using The General TreeSo, what would you use a general tree for? In game programming, many games use these types of trees for decision-making processes. For example, a computer AI might need to make a decision based on an event that happened. Here's an example of a computer gunman thinking about what to do when he hears a gunshot: This is just a simple tree for demonstration. A more complex AI decision tree would definitely have a lot more options. The great thing about using a tree for decision-making is that the options are cut down for every level of the tree you go down, greatly improving the speed at which the AI makes a decision. Other uses might include storing level progressions. In level based games, there is a strong push lately to get rid of the old school "linear" line of thinking, and instead of using just a rigid number of pre-defined levels, maybe the level structure of a game might progress based on a tree. The big problem with tree based level progressions, however, is that sometimes the tree can get much too large and complex. Imagine offering just two choices for the next level at the end of each level in a ten level game. That would require a total of 1023 levels to be created. That requires quite a bit of effort, so it's best to try to limit the number of level choices. There are thousands of other uses for trees, and chances are there is something you want to do that trees are the best solution for. ## The General Tree InterfaceHere is a listing of the basic public GeneralTree interface that I have created: Template <class dataType> class GeneralTree { // ---------------------------------------------------------------- // Name: GeneralTree::GeneralTree // Description: Default Constructor. Creates a tree node. // Arguments: - data: data to initialise node with // Return Value: None. // ---------------------------------------------------------------- GeneralTree( dataType& data ); // ---------------------------------------------------------------- // Name: GeneralTree::~GeneralTree // Description: Destructor. Deletes all child nodes. // Arguments: None. // Return Value: None. // ---------------------------------------------------------------- ~GeneralTree(); // ---------------------------------------------------------------- // Name: GeneralTree::addChild // Description: Adds a child node to the tree’s child list. // Arguments: - tree: sub-tree to add // Return Value: None. // ---------------------------------------------------------------- void addChild( GeneralTree<dataType>* tree ); // ---------------------------------------------------------------- // Name: GeneralTree::start // Description: resets the child iterator to point to the first // child in the current tree. // Arguments: None. // Return Value: None. // ---------------------------------------------------------------- void start(); // ---------------------------------------------------------------- // Name: GeneralTree::forth // Description: moves the child iterator forward to point to // the next child. // Arguments: None. // Return Value: None. // ---------------------------------------------------------------- void forth(); // ---------------------------------------------------------------- // Name: GeneralTree::isValid // Description: determines if the child iterator points to a // valid child. // Arguments: None. // Return Value: True if the iterator points to a valid child, // False otherwise. // ---------------------------------------------------------------- bool isValid(); // ---------------------------------------------------------------- // Name: GeneralTree::currentChild // Description: returns a pointer to the current child tree // that the child iterator points to. // Arguments: None. // Return Value: pointer to current child tree // ---------------------------------------------------------------- GeneralTree<dataType>* currentChild(); // ---------------------------------------------------------------- // Name: GeneralTree::removeChild // Description: removes the child that the child iterator // is pointing tom and moves child iterator to // next child. // Arguments: None. // Return Value: None. // ---------------------------------------------------------------- void removeChild(); // ---------------------------------------------------------------- // Name: GeneralTree::preorder // Description: processes the list using a pre-order traversal // Arguments: - process: A function which takes a reference // to a dataType and performs an // operation on it. // Return Value: None. // ---------------------------------------------------------------- void preorder( void (*process)(dataType& item) ); // ---------------------------------------------------------------- // Name: GeneralTree::postorder // Description: processes the list using a post-order traversal // Arguments: - process: A function which takes a reference // to a dataType and performs an // operation on it. // Return Value: None. // ---------------------------------------------------------------- void postorder( void (*process)(dataType& item) ); }; There is no default constructor, because a tree object MUST contain a piece of data. Therefore you are required to pass in a dataType to it on creation. Note that the destructor of a tree recursively delete’s all of it’s children, so when adding a sub-tree to a tree, make sure it is not referenced anywhere else in the program. The destructor, however, does NOT delete the data stored inside the tree node, so that must be managed by the client of the tree. Since the tree will be using a linked list to store it’s children, the tree maintains a pointer (known as the ), travel through each child individually (start()), retrieve the item at the current child iterator location (forth()), delete the current child subtree (currentChild()), and determine if the iterator is not pointing to a valid child (removeChild()).isValid()The Traversal methods will be discussed later on in this tutorial. ## Adding Nodes to the General TreeThe most common way to add nodes to a general tree is to first find the desired parent of the node you want to insert, then add the node to the parent's child list. The most common implementations insert the nodes one at a time, but since each node can be considered a tree on its own, other implementations build up an entire sub-tree before adding it to a larger tree. Example of adding nodes to a linked tree: void main() { GeneralTree<int>* tree = new GeneralTree<int>( 1 ); GeneralTree<int>* temp = 0; tree->addChild( new GeneralTree<int>( 2 ) ); tree->addChild( new GeneralTree<int>( 3 ) ); tree->addChild( new GeneralTree<int>( 4 ) ); tree->start(); temp = tree->child(); temp->addChild( new GeneralTree<int>( 5 ) ); temp->addChild( new GeneralTree<int>( 6 ) ); tree->forth(); tree->forth(); temp = tree->child(); temp->addChild( new GeneralTree<int>( 7 ) ); } This snippet creates the tree that was given as an example of a general tree earlier using construction.Bottom Up## Removing Nodes from the General TreeRemoving a node is entirely dependant on the use of the tree. The typical method to remove a node from a general tree is to find the node you want to remove, and just delete it from the current list it is in (this is what the void main() { GeneralTree<int>* tree = new GeneralTree<int>( 1 ); // create tree used in previous example here // ... tree->start(); tree->removeChild(); } This removes the entire subtree starting at node 2, which ends up removing both nodes 5 and 6 from the tree as well. Note that this is the standard accepted behavior of a remove routine, since children of a node are assumed to be related to the parent in an important manner, and maintaining them in the tree without the parent usually does not make sense. ## Processing Nodes in the General TreeOften, there arises a need to perform one operation on every node of a tree. Therefore, we need to use a method that traverses the entire tree and visits every node. There are generally three traversal methods for trees that guarantee that every node will be visited, however only two of them are used for general trees. They are called the . The third such traversal, Post-Order Traversal, is usually only used on binary trees. These traversal methods are recursive functions, and demonstrate how useful recursion can be.In-OrderThe Pre-Order Traversal works by processing the current node first, and then processing all the children, where template <class dataType> void GeneralTree<dataType>::preorder(void (*process)(dataType& item) ) { process( m_data ); start(); while( isValid() ) { currentChild->preorder( process ); forth(); } } So, printing the items of the family tree example above would produce the list: Now, since template <class dataType> void GeneralTree<dataType>::postorder(void (*process)(dataType& item) ) { start(); while( isValid() ) { currentChild->postorder( process ); forth(); } process( m_data ); } Printing the items of the family tree using a Post-order would produce: ## Why recursion was used in these traversalsAs I have said many times before, recursion is a powerful tool because it allows a large problem to be broken down into a much simpler problem. In the case of traversal, the beauty lies in being able to specify a small function which operates on only one node and allows the children to solve the problem for their children as well, thus allowing the root (or any parent node in the tree) to remain totally ignorant of the structure of its children. When you pre-order a tree, you process the root node, and then call preorder on each of it’s children. So when you preorder a child, it completes and pop’s execution back up to the root, allowing it to do the same thing to the next child, then the next, and so forth. While it is possible to represent the traversal problem with an iterative solution, that solution would require a custom made stack, which is esentially using recursion with a different name. It is likely that the user defined stack will not be as efficient as the system stack, so an iterative solution would end up being slower. The importance of recursion is somewhat downplayed in the traversal routines because each tree node maintains a pointer to the current child. Now, imagine a general tree implementation that used an array for its child list instead of a linked list. An array does not have an iterator object, and must use an index variable instead. In this case recursion helps because the index is stored on the stack at every level, and automatically restored when execution returns to each higher level. Recursion essentially helps the algorithm remember which path it traveled down. In the end, recursion works so well with tree’s because a tree is esentially a recursive structure. ## ConclusionThat's it for this tutorial. I had intended to include even more tree related stuff, all into one tutorial, but it ended up being too large. So next time, in part II, the tutorial will continue on with trees. Specifically, we'll be looking at binary trees, a specific subset of all trees, and the many various things that have been done with binary trees. I think I'll finish up at the third article, going into some of the more advanced tree structures. Until next time!
© 1999-2011 Gamedev.net. All rights reserved. |