5.4.7 Aside: Recursive Insertion |

In previous sections we first looked at recursive approaches because
they were simpler and more elegant than iterative solutions. As it
happens, the reverse is true for insertion into an AVL tree. But just
for completeness, we will now design a recursive implementation of
*avl_probe*().

Our first task in such a design is to figure out what arguments and
return value the recursive core of the insertion function will have.
We'll begin by considering AVL insertion in the abstract. Our existing
function *avl_probe*() works by first moving down the tree, from the
root to a leaf, then back up the tree, from leaf to root, as necessary
to adjust balance factors or rebalance. In the existing iterative
version, down and up movement are implemented by pushing nodes onto and
popping them off from a stack. In a recursive version, moving down the
tree becomes a recursive call, and moving up the tree becomes a function
return.

While descending the tree, the important pieces of information are the
tree itself (to allow for comparisons to be made), the current node, and
the data item we're inserting. The latter two items need to be
modifiable by the function, the former because the tree rooted at the
node may need to be rearranged during a rebalance, and the latter
because of *avl_probe*()'s return value.

While ascending the tree, we'll still have access to all of this information, but, to allow for adjustment of balance factors and rebalancing, we also need to know whether the subtree visited in a nested call became taller. We can use the function's return value for this.

Finally, we know to stop moving down and start moving up when we find a null pointer in the tree, which is the place for the new node to be inserted. This suggests itself naturally as the test used to stop the recursion.

Here is an outline of a recursive insertion function directly corresponding to these considerations:

160. <Recursive insertion into AVL tree 160> =staticintprobe(structavl_table*tree,structavl_node**p,void***data)

{structavl_node*y; /* The current node; shorthand for *p. */assert(tree!=NULL&&p!=NULL&&data!=NULL);y= *p;if(y==NULL) {

<Found insertion point in recursive AVL insertion 161>

}else/*y!=NULL*/

{

<Move down then up in recursive AVL insertion 162>

} }

See also 163.

Parameter *p* is declared as a double pointer (**struct** **avl_node** **) and
*data* as a triple pointer (**void** ***). In both cases, this is because
C passes arguments by value, so that a function modifying one of its
arguments produces no change in the value seen in the caller. As a
result, to allow a function to modify a scalar, a pointer to it must be
passed as an argument; to modify a pointer, a double pointer must be
passed; to modify a double pointer, a triple pointer must be passed.
This can result in difficult-to-understand code, so it is often
advisable to copy the dereferenced argument into a local variable for
read-only use, as **p* is copied into *y* here.

When the insertion point is found, a new node is created and a pointer
to it stored into **p*. Because the insertion causes the subtree to
increase in height (from 0 to 1), a value of 1 is then returned:

161. <Found insertion point in recursive AVL insertion 161> =y= *p=tree->avl_alloc->libavl_malloc(tree->avl_alloc,sizeof*y);if(y==NULL)

{ *data=NULL;return0; }y->avl_data= **data; *data= &y->avl_data;y->avl_link[0] =y->avl_link[1] =NULL;y->avl_balance= 0;tree->avl_count++;tree->avl_generation++;return1;

This code is included in 160.

When we're not at the insertion point, we move down, then back up.
Whether to move down to the left or the right depends on the value of
the item to insert relative to the value in the current node *y*.
Moving down is the domain of the recursive call to *probe*(). If the
recursive call doesn't increase the height of a subtree of *y*, then
there's nothing further to do, so we return immediately. Otherwise, on
the way back up, it is necessary to at least adjust *y*'s balance
factor, and possibly to rebalance as well. If only adjustment of the
balance factor is necessary, it is done and the return value is based on
whether this subtree has changed height in the process. Rebalancing is
accomplished using the same code used in iterative insertion. A
rebalanced subtree has the same height as before insertion, so the value
returned is 0. The details are in the code itself:

162. <Move down then up in recursive AVL insertion 162> =structavl_node*w; /* New root of this subtree; replaces *p. */intcmp;cmp=tree->avl_compare(**data,y->avl_data,tree->avl_param);if(cmp< 0)

{if(probe(tree, &y->avl_link[0],data) == 0)return0;if(y->avl_balance== +1)

{y->avl_balance= 0;return0; }elseif(y->avl_balance== 0)

{y->avl_balance= -1;return1; }

else

{

<Rebalance AVL tree after insertion in left subtree 152>

} }elseif(cmp> 0)

{structavl_node*r; /* Right child ofy, for rebalancing. */if(probe(tree, &y->avl_link[1],data) == 0)return0;if(y->avl_balance== -1)

{y->avl_balance= 0;return0; }elseif(y->avl_balance== 0)

{y->avl_balance= +1;return1; }

else

{

<Rebalance AVL tree after insertion in right subtree 157>

} }else/*cmp== 0 */

{ *data= &y->avl_data;return0; } *p=w;return0;

This code is included in 160.

Finally, we need a wrapper function to start the recursion off correctly and deal with passing back the results:

163. <Recursive insertion into AVL tree 160> += /* Insertsitemintotreeand returns a pointer toitem's address. If a duplicate item is found in the tree, returns a pointer to the duplicate without insertingitem. ReturnsNULLin case of memory allocation failure. */void**avl_probe(structavl_table*tree,void*item)

{void**ret= &item;probe(tree, &tree->avl_root, &ret);returnret; }