Data Structures and Algorithms with Object-Oriented Design Patterns in C#
next up previous contents index

Linear Probing

The simplest collision resolution strategy in open addressing is called linear probing . In linear probing, the function c(i) is a linear function in i. That is, it is of the form

displaymath62171

Property 1 requires that c(0)=0. Therefore, tex2html_wrap_inline62181 must be zero.

In order for tex2html_wrap_inline62183 to satisfy Property 2, tex2html_wrap_inline62185 and M must be relatively prime. If we know the M will always be a prime number, then any tex2html_wrap_inline62185 will do. On the other hand, if we cannot be certain that M is prime, then tex2html_wrap_inline62185 must be one. Therefore, linear probing sequence that is usually used is

displaymath62172

for tex2html_wrap_inline62197.

Figure gif illustrates an example of a scatter table using open addressing together with linear probing. For example, consider the string "åtta". This string hashes to array position tex2html_wrap_inline62065. The corresponding linear probing sequence begins at position tex2html_wrap_inline62065 and goes on to positions tex2html_wrap_inline62067, tex2html_wrap_inline62069,.... In this case, the search for the string "åtta" succeeds after three probes.

   figure12584
Figure: Scatter table using open addressing and linear probing.

To insert an item x into the scatter table, an empty cell is found by following the same probe sequence that would be used in a search for item x. Thus, linear probing finds an empty cell by doing a linear search beginning from array position h(x).

An unfortunate characteristic of linear probing arises from the fact that as the table fills, clusters of consecutive cells form and the time required for a search increases with the size of the cluster. Furthermore, when we attempt to insert an item in the table at a position which is already occupied, that item is ultimately inserted at the end of the cluster--thereby increasing its length. This by itself is not inherently a bad thing. After all, when using the chained approach, every insertion increase the length of some chain by one. However, whenever an insertion is made between two clusters that are separated by one unoccupied position, the two clusters become one, thereby potentially increasing the cluster length by an amount much greater than one--a bad thing! This phenomenon is called primary clustering .


next up previous contents index

Bruno Copyright © 2001 by Bruno R. Preiss, P.Eng. All rights reserved.