Data Structures and Algorithms with Object-Oriented Design Patterns in C#

Application: Typesetting Problem

 

Consider the problem of typesetting a paragraph of justified text. A paragraph can be viewed as a sequence of $n > 0$ words, $w_1, w_2, \ldots, w_n$. The objective is to determine how to break the sequence into individual lines of text of the appropriate size. Each word is separated from the next by some amount of space. By stretching or compressing the space between the words, the left and right ends of consecutive lines of text are made to line up. A paragraph looks best when the amount of stretching or compressing is minimized.

We can formulate the problem as follows: Assume that we are given the lengths of the words, $l_1, l_2, \ldots, l_n$, and that the desired length of a line is $D$. Let $W_{i,j}$ represent the sequence of words from $w_i$ to $w_j$ (inclusive). That is,

\[ W_{i,j} = \{ w_i, w_{i+1}, \ldots, w_j \} \]

for $1 \le i \le j \le n$.

Let $L_{i,j}$ be the sum of the lengths of the words in the sequence $W_{i,j}$. That is,

\[ L_{i,j} = \sum_{k=i}^{j} l_k. \]

The natural length of the sequence $W_{i,j}$ is the sum of the lengths of the words, $L_{i,j}$, plus the normal amount of space between those words. Let $s$ be the normal size of the space between two words. Then the natural length of $W_{i,j}$ is $L_{i,j} + (j-i)s$, since the $j-i+1$ words are separated by $j-i$ spaces. Note that $L_{i,j}$ can also be defined recursively as follows:

\[ L_{i,j} = \begin{cases} l_i & i = j, \\ L_{i,j-1} + l_j & i < j. \end{cases} \]
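The recursive definition lends itself to filling in a table of sums one row at a time. The following is a minimal sketch in Python, not the book's own program; the names `length_sums` and `natural_length` are illustrative, and indices are 0-based rather than 1-based as in the text.

```python
def length_sums(lengths):
    """Return a table L where L[i][j] is the total length of words i..j.

    Indices are 0-based here, whereas the text numbers words from 1.
    Entries with i > j are unused and left as 0.
    """
    n = len(lengths)
    L = [[0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = lengths[i]                    # base case: i = j
        for j in range(i + 1, n):
            L[i][j] = L[i][j - 1] + lengths[j]  # recursive case: i < j
    return L


def natural_length(L, i, j, s):
    """Natural length of words i..j: word lengths plus (j - i) normal spaces."""
    return L[i][j] + (j - i) * s
```

Each entry is obtained from its left neighbour with a single addition, so the whole table takes time proportional to the number of entries rather than re-summing the words for every pair.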

In general, when we typeset the sequence $W_{i,j}$ all on a single line, we need to stretch or compress the spaces between the words so that the length of the line is the desired length $D$. Therefore, the amount of stretching or compressing is given by the difference $D - (L_{i,j} + (j-i)s)$. However, if the sum of the lengths of the words, $L_{i,j}$, is longer than the desired line length $D$, it is not possible to typeset the sequence on a single line.
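As a small worked example, assume three words of lengths 3, 2, and 4, a normal space of $s = 1$, and a desired line length of $D = 12$. These values are chosen purely for illustration and do not come from the text.

```latex
\[
L_{1,3} = 3 + 2 + 4 = 9, \qquad
L_{1,3} + (3-1)s = 9 + 2 = 11, \qquad
D - \bigl(L_{1,3} + (3-1)s\bigr) = 12 - 11 = 1.
\]
```

The two spaces on the line must therefore be stretched by a total of one unit. With $D = 8$ instead, the words alone have length $9 > D$, so the three words cannot share a line.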

Let $P_{i,j}$ be the penalty associated with typesetting the sequence $W_{i,j}$ on a single line. Then,

\[ P_{i,j} = \begin{cases} \left| D - L_{i,j} - (j-i)s \right| & L_{i,j} \le D, \\ \infty & L_{i,j} > D. \end{cases} \]

This definition of the penalty is consistent with the stated objectives: the penalty increases as the difference between the natural length of the sequence and the desired length increases, and the infinite penalty disallows lines that are too long.
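The single-line penalty can be sketched directly from this description. The Python function below is an illustration under stated assumptions rather than the book's code: the name `penalty` and the 0-based indices are choices made here, and the finite case is taken to be the absolute difference between the desired length and the natural length.

```python
import math


def penalty(lengths, i, j, D, s):
    """Penalty for placing words i..j (0-based, inclusive) on one line.

    Assumes the finite penalty is |D - natural length|, and that a line
    is impossible exactly when the words alone are longer than D.
    """
    total = sum(lengths[i:j + 1])          # sum of the word lengths
    if total > D:
        return math.inf                    # the words cannot fit on the line
    return abs(D - total - (j - i) * s)    # stretch or compression required
```

For example, `penalty([3, 2, 4], 0, 2, 12, 1)` evaluates to 1, because the natural length 11 falls one unit short of the desired length 12.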

Finally, we define the quantity $c_{i,j}$ for $1 \le i \le j \le n$ as the minimum total penalty required to typeset the sequence $W_{i,j}$. In this case, the text may be all on one line or it may be split over more than one line. The quantity $c_{i,j}$ is given by

\[ c_{i,j} = \begin{cases} P_{i,j} & i = j, \\ \min\Bigl\{ P_{i,j},\ \min_{1 \le k \le j-i} \bigl( P_{i,i+k-1} + c_{i+k,j} \bigr) \Bigr\} & i < j. \end{cases} \]

We obtain this recurrence as follows: When $i = j$ there is only one word to typeset. The minimum total penalty in this case is just the penalty that results from putting the one word on a single line.

In the general case, there is more than one word in the sequence $W_{i,j}$. To determine the optimal way to typeset the paragraph, we consider the cost of putting the first $k$ words of the sequence on the first line of the paragraph, $P_{i,i+k-1}$, plus the minimum total cost of typesetting the rest of the paragraph, $c_{i+k,j}$. We also consider putting the entire sequence on a single line, at cost $P_{i,j}$. The value of $k$ that minimizes the total cost also specifies where the line break should occur.
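The recurrence can be evaluated with memoization so that each subproblem is solved only once. The sketch below is one possible Python rendering and not necessarily the implementation the book goes on to give; the function name `typeset`, the 0-based indices, and the choice to return the lines as index pairs are all assumptions made for illustration.

```python
import math
from functools import lru_cache


def typeset(lengths, D, s):
    """Return (minimum total penalty, lines) for the whole paragraph.

    Each line is a (first, last) pair of 0-based word indices, inclusive.
    Every line is penalized, including the last, as in the formulation above.
    """
    n = len(lengths)

    def P(i, j):
        # Penalty for words i..j on a single line (assumed absolute difference).
        total = sum(lengths[i:j + 1])
        if total > D:
            return math.inf
        return abs(D - total - (j - i) * s)

    @lru_cache(maxsize=None)
    def c(i, j):
        # Option 1: all of words i..j on one line.
        best, best_k = P(i, j), j - i + 1
        # Option 2: the first k words on the first line, the rest optimally.
        for k in range(1, j - i + 1):
            cost = P(i, i + k - 1) + c(i + k, j)[0]
            if cost < best:
                best, best_k = cost, k
        return best, best_k

    # Walk the recorded choices of k to recover the line breaks.
    lines, i = [], 0
    while i < n:
        k = c(i, n - 1)[1]
        lines.append((i, i + k - 1))
        i += k
    return c(0, n - 1)[0], lines
```

For instance, `typeset([3, 2, 4], 6, 1)` returns `(2, [(0, 1), (2, 2)])`: the first two words fill the first line exactly, and the third word alone leaves two units of slack. If some word is longer than $D$, the returned penalty is infinite and the reported lines are not meaningful.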





Copyright © 2001 by Bruno R. Preiss, P.Eng. All rights reserved.