Logo Data Structures and Algorithms with Object-Oriented Design Patterns in C++
next up previous contents index

Successful Search

When a search is successful, exactly d+1 internal nodes are visited, where d is the depth in the tree of object of the search. E.g., if the object of the search is at the root which has depth zero, the search visits just one node--the root itself. Similarly, if the object of the search is at depth one, two nodes are visited, and so on. We shall assume that it is equally likely for the object of the search to appear in any node of the search tree. In that case, the average number of nodes visited during a successful search is tex2html_wrap_inline64558, where tex2html_wrap_inline64560 is the average of the depths of the nodes in a given tree. I.e., given a binary search tree with n>0 nodes,

displaymath64550

where tex2html_wrap_inline64564 is the depth of the tex2html_wrap_inline58387 node of the tree.

The quantity tex2html_wrap_inline64568 is called the internal path length  . The internal path length of a tree is simply the sum of the depths (levels) of all the internal nodes in the tree. Clearly, the average depth of an internal node is equal to the internal path length divided by n, the number of nodes in the tree.

Unfortunately, for any given number of nodes n, there are many different possible search trees. Furthermore, the internal path lengths of the various possibilities are not equal. Therefore, to compute the average depth of a node in a tree with n nodes, we must consider all possible trees with n nodes. In the absence of any contrary information, we shall assume that all trees having n nodes are equiprobable and then compute the average depth of a node in the average tree containing n nodes.

Let I(n) be the average internal path length of a tree containing n nodes. Consider first the case of n=1. Clearly, there is only one binary tree that contains one node--the tree of height zero. Therefore, I(1)=0.

Now consider an arbitrary tree, tex2html_wrap_inline64590, having tex2html_wrap_inline59533 internal nodes altogether, l of which are found in its left subtree, where tex2html_wrap_inline64596. Such a tree consists of a root, the left subtree with l internal nodes and and a right subtree with n-l-1 internal nodes. The average internal path length for such a tree is the sum of the average internal path length of the left subtree, I(l), plus that of the right subtree, I(n-l-1), plus n-1 because the nodes in the two subtrees are one level lower in tex2html_wrap_inline64590.

In order to determine the average internal path length for a tree with n nodes, we must compute the average of the internal path lengths of the trees tex2html_wrap_inline64590 averaged over all possible sizes, l, of the (left) subtree, tex2html_wrap_inline64596.

To do this we consider an ordered set of n distinct keys, tex2html_wrap_inline64620. If we select the tex2html_wrap_inline64622 key, tex2html_wrap_inline64624, to be the root of a binary search tree, then there are l keys, tex2html_wrap_inline64628, tex2html_wrap_inline64392, ..., tex2html_wrap_inline64632, in its left subtree and n-l-1 keys, tex2html_wrap_inline64636, tex2html_wrap_inline64638, ..., tex2html_wrap_inline64396 in its right subtree.

If we assume that it is equally likely for any of the n keys to be selected as the root, then all the subtree sizes in the range tex2html_wrap_inline64596 are equally likely. Therefore, the average internal path length for a tree with tex2html_wrap_inline59533 nodes is

eqnarray19093

Thus, in order to determine I(n) we need to solve the recurrence

  equation19103

To solve this recurrence we consider the case n>1 and then multiply Equation gif by n to get

  equation19113

Since this equation is valid for any n>1, by substituting n-1 for n we can also write

  equation19118

which is valid for n>2. Subtracting Equation gif from Equation gif gives

displaymath64551

which can be rewritten as

  equation19125

Thus, we have shown the solution to the recurrence in Equation gif is the same as the solution of the recurrence

  equation19130


next up previous contents index

Bruno Copyright © 1997 by Bruno R. Preiss, P.Eng. All rights reserved.