Traversing a BST - GNU libavl 2.0.2

After we've been manipulating a binary search tree for a while, we will want to know what items are in it. The process of enumerating the items in a binary search tree is called traversal (see traversal). libavl provides the bst_t_* functions for a particular kind of traversal called inorder traversal (see inorder traversal), so-called because items are enumerated in sorted order.

In this section we'll implement three algorithms for traversal. Each of these algorithms is based on and in some way improves upon the previous algorithm. The final implementation is the one used in libavl, so we will implement all of the bst_t_* functions for it.

Before we start looking at particular algorithms, let's consider some criteria for evaluating traversal algorithms. The following are not the only criteria that could be used, but they are indeed important:¹

complexity: Is it difficult to describe or to correctly implement the algorithm? Complex algorithms also tend to take more code than simple ones.
efficiency: Does the algorithm make good use of time and memory? The ideal traversal algorithm would require time proportional to the number of nodes traversed and a constant amount of space. In this chapter we will meet this ideal time criterion and come close on the space criterion for the average case. In future chapters we will be able to do better even in the worst case.
convenience: Is it easy to integrate the traversal functions into other code? Callback functions are not as easy to use as other methods that can be used from for loops (see Improving Convenience).
reliability: Are there pathological cases where the algorithm breaks down? If so, is it possible to fix these problems using additional time or space?
generality: Does the algorithm only allow iteration in a single direction? Can we begin traversal at an arbitrary node, or just at the least or greatest node?
resilience: If the tree is modified during a traversal, is it possible to continue traversal, or does the modification invalidate the traverser?

The first algorithm we will consider uses recursion. This algorithm is worthwhile primarily for its simplicity. In C, such an algorithm cannot be made as efficient, convenient, or general as other algorithms without unacceptable compromises. It is possible to make it both reliable and resilient, but we won't bother because of its other drawbacks.

We arrive at our second algorithm through a literal transformation of the recursion in the first algorithm into iteration. The use of iteration lets us improve the algorithm's memory efficiency, and, on many machines, its time efficiency as well. The iterative algorithm also lets us improve the convenience of using the traverser. We could also add reliability and resilience to an implementation of this algorithm, but we'll save that for later. The only problem with this algorithm, in fact, lies in its generality: it works best for moving only in one direction and starting from the least or greatest node.

The importance of generality is what draws us to the third algorithm. This algorithm is based on ideas from the previous iterative algorithm along with some simple observations. This algorithm is no more complex than the previous one, but it is more general, allowing easily for iteration in either direction starting anywhere in the tree. This is the algorithm used in libavl, so we build an efficient, convenient, reliable, general, resilient implementation.

Footnotes

[1] Some of these terms are not generic BST vocabulary. Rather, they have been adopted for these particular uses in this text. You can consider the above to be our working definitions of these terms.