Data Structures and Algorithms with Object-Oriented Design Patterns in C#
next up previous contents index

Average Running Time

 

The best case running time of insertion sorting is O(n) but the worst-case running time is tex2html_wrap_inline58403. Therefore, we might suspect that the average running time falls somewhere in between. In order to determine it, we must define more precisely what we mean by the average running time. A simple definition of average running time is to say that it is the running time needed to sort the average sequence. But what is the average sequence?

The usual way to determine the average running time of a sorting algorithm is to consider only sequences that contain no duplicates. Since every sorted sequence of length n is simply a permutation of an unsorted one, we can represent every such sequence by a permutation of the sequence tex2html_wrap_inline68884. When computing the average running time, we assume that every permutation is equally likely. Therefore, the average running time of a sorting algorithm is the running time averaged over all permutations of the sequence S.

Consider a permutation tex2html_wrap_inline68728 of the sequence S. An inversion  in P consists of two elements, say tex2html_wrap_inline57649 and tex2html_wrap_inline68896, such that tex2html_wrap_inline68898 but i<j. That is, an inversion in P is a pair of elements that are in the wrong order. For example, the permutation tex2html_wrap_inline68904 contains three inversions--(4,3), (4,2), and (3,2). The following theorem tells us how many inversions we can expect in the average sequence:

Theorem  The average number of inversions in a permutation of n distinct elements is n(n-1)/4.

extbfProof Let S be an arbitrary sequence of n distinct elements and let tex2html_wrap_inline68920 be the same sequence, but in reverse.

For example, if tex2html_wrap_inline68688, then tex2html_wrap_inline68924.

Consider any pair of distinct elements in S, say tex2html_wrap_inline68750 and tex2html_wrap_inline68752 where tex2html_wrap_inline68746. There are two distinct possibilities: Either tex2html_wrap_inline68932, in which case tex2html_wrap_inline68934 is an inversion in tex2html_wrap_inline68920; or tex2html_wrap_inline68938, in which case tex2html_wrap_inline68940 is an inversion in S. Therefore, every pair contributes exactly one inversion either to S or to tex2html_wrap_inline68920.

The total number of pairs in S is tex2html_wrap_inline68950. Since every such pair contributes an inversion either to S or to tex2html_wrap_inline68920, we expect on average that half of the inversions will appear in S. Therefore, the average number of inversions in a sequence of n distinct elements is n(n-1)/4.

What do inversions have to do with sorting? As a list is sorted, inversions are removed. In fact, since the inner loop of the insertion sort method swaps adjacent array elements, inversions are removed one at a time! Since a swap takes constant time, and since the average number of inversions is n(n-1)/4, the average running time for the insertion sort method is tex2html_wrap_inline58403.


next up previous contents index

Bruno Copyright © 2001 by Bruno R. Preiss, P.Eng. All rights reserved.