Data Structures and Algorithms with Object-Oriented Design Patterns in C++

Application: Typesetting Problem


Consider the problem of typesetting a paragraph of justified text. A paragraph can be viewed as a sequence of $n > 0$ words, $\{w_1, w_2, \ldots, w_n\}$. The objective is to determine how to break the sequence into individual lines of text of the appropriate size. Each word is separated from the next by some amount of space. By stretching or compressing the space between the words, the left and right ends of consecutive lines of text are made to line up. A paragraph looks best when the amount of stretching or compressing is minimized.

We can formulate the problem as follows: Assume that we are given the lengths of the words, $\{l_1, l_2, \ldots, l_n\}$, and that the desired length of a line is $D$. Let $W_{i,j}$ represent the sequence of words from $w_i$ to $w_j$ (inclusive). I.e.,

$$W_{i,j} = \{w_i, w_{i+1}, \ldots, w_j\},$$

for $1 \le i \le j \le n$.

Let $L_{i,j}$ be the sum of the lengths of the words in the sequence $W_{i,j}$. I.e.,

$$L_{i,j} = \sum_{k=i}^{j} l_k.$$

The natural length of the sequence $W_{i,j}$ is the sum of the lengths of the words, $L_{i,j}$, plus the normal amount of space between those words. Let $s$ be the normal size of the space between two words. Since $j - i + 1$ words are separated by $j - i$ spaces, the natural length of $W_{i,j}$ is $L_{i,j} + (j - i)s$. Note that we can also define $L_{i,j}$ recursively as follows:

$$L_{i,j} = \begin{cases} l_i & i = j, \\ L_{i,j-1} + l_j & i < j. \end{cases}$$
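
The recursive definition of the word-length sums translates directly into a table fill. The following is a minimal sketch rather than code from the text: the function name `wordLengthSums`, the use of `std::vector`, and the unused row and column 0 (kept so that array subscripts match the 1-based notation) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the text): returns a table L in which
// L[i][j] is the sum of the word lengths l_i..l_j for 1 <= i <= j <= n.
// The input vector is 0-based, so word w_i has length l[i - 1].
std::vector<std::vector<unsigned>> wordLengthSums(const std::vector<unsigned>& l)
{
    assert(!l.empty());  // the text assumes n > 0
    const std::size_t n = l.size();
    std::vector<std::vector<unsigned>> L(n + 1, std::vector<unsigned>(n + 1, 0));
    for (std::size_t i = 1; i <= n; ++i) {
        L[i][i] = l[i - 1];                    // base case: i = j
        for (std::size_t j = i + 1; j <= n; ++j)
            L[i][j] = L[i][j - 1] + l[j - 1];  // recursive case: i < j
    }
    return L;
}
```

Each entry is obtained from its left neighbour with a single addition, so the whole table costs time proportional to the number of entries rather than re-summing every subsequence from scratch.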

In general, when we typeset the sequence $W_{i,j}$ all on a single line, we need to stretch or compress the spaces between the words so that the length of the line is the desired length $D$. Therefore, the amount of stretching or compressing is given by the difference $D - (L_{i,j} + (j - i)s)$. However, if the sum of the lengths of the words, $L_{i,j}$, is longer than the desired line length $D$, it is not possible to typeset the sequence on a single line.

Let $P_{i,j}$ be the penalty associated with typesetting the sequence $W_{i,j}$ on a single line. Then,

$$P_{i,j} = \begin{cases} \left| D - L_{i,j} - (j - i)s \right| & L_{i,j} \le D, \\ \infty & L_{i,j} > D. \end{cases}$$

This definition of penalty is consistent with the stated objectives: The penalty increases as the difference between the natural length of the sequence and the desired length increases, and the infinite penalty disallows lines that are too long.
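
The penalty for a single line can be sketched as a small function. This is an illustration under stated assumptions, not code from the text: the name `linePenalty`, the choice of `double` (so that an infinite penalty can be represented directly), and the parameter order are all hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper (not from the text): penalty for placing words
// w_i..w_j on one line, where sumLen is the sum of their lengths,
// D is the desired line length and s is the normal space between words.
double linePenalty(double sumLen, std::size_t i, std::size_t j, double D, double s)
{
    assert(i <= j);  // the text requires 1 <= i <= j <= n
    if (sumLen > D)  // the words alone do not fit: disallow this line
        return std::numeric_limits<double>::infinity();
    // Natural length: the words plus (j - i) normal-sized spaces.
    const double natural = sumLen + static_cast<double>(j - i) * s;
    return std::abs(D - natural);  // amount of stretching or compressing
}
```

For instance, three words of total length 9 with unit spaces have natural length 11, so on a line of desired length 12 the penalty is 1, and it is also 1 on a line of desired length 10, where the spaces must be compressed instead of stretched.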

Finally, we define the quantity $c_{i,j}$ for $1 \le i \le j \le n$ as the minimum total penalty required to typeset the sequence $W_{i,j}$. In this case, the text may be all on one line or it may be split over more than one line. The quantity $c_{i,j}$ is given by

$$c_{i,j} = \begin{cases} P_{i,j} & i = j, \\ \min\left\{ P_{i,j},\ \min\limits_{1 \le k \le j-i} \left( P_{i,i+k-1} + c_{i+k,j} \right) \right\} & i < j. \end{cases}$$

We obtain this recurrence as follows: When $i = j$ there is only one word in the paragraph. The minimum total penalty associated with typesetting the paragraph in this case is just the penalty which results from putting the one word on a single line.

In the general case, there is more than one word in the sequence $W_{i,j}$. In order to determine the optimal way in which to typeset the paragraph we consider the cost of putting the first $k$ words of the sequence on the first line of the paragraph, $P_{i,i+k-1}$, plus the minimum total cost associated with typesetting the rest of the paragraph, $c_{i+k,j}$. The value of $k$ which minimizes the total cost also specifies where the line break should occur.
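
The recurrence can be evaluated bottom-up. The sketch below is one possible arrangement, not the book's implementation: the names `Layout` and `typesetParagraph`, the use of `double`, and the 1-based tables are assumptions. Because the minimum total penalty for words $i$ through $j$ depends only on values with the same $j$, the sketch fixes $j = n$ and stores a single column of the table, which is enough to typeset the whole paragraph.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type (not from the text).
struct Layout {
    double penalty;                     // minimum total penalty for words 1..n
    std::vector<std::size_t> lineEnds;  // 1-based index of the last word on each line
};

// l: word lengths (0-based storage), D: desired line length, s: normal space.
Layout typesetParagraph(const std::vector<double>& l, double D, double s)
{
    assert(!l.empty());  // the text assumes n > 0
    const std::size_t n = l.size();
    const double inf = std::numeric_limits<double>::infinity();

    // P[i][j]: penalty for putting words i..j on one line (infinite if too long).
    std::vector<std::vector<double>> P(n + 2, std::vector<double>(n + 2, inf));
    for (std::size_t i = 1; i <= n; ++i) {
        double sum = 0.0;  // running sum of the lengths of words i..j
        for (std::size_t j = i; j <= n; ++j) {
            sum += l[j - 1];
            if (sum <= D)
                P[i][j] = std::abs(D - sum - static_cast<double>(j - i) * s);
        }
    }

    // c[i]: minimum total penalty for words i..n.
    // firstLine[i]: number of words on the first line of an optimal layout of i..n.
    std::vector<double> c(n + 2, 0.0);
    std::vector<std::size_t> firstLine(n + 2, 0);
    for (std::size_t i = n; i >= 1; --i) {
        c[i] = P[i][n];            // option 1: all remaining words on one line
        firstLine[i] = n - i + 1;
        for (std::size_t k = 1; k <= n - i; ++k) {  // option 2: first k words, then the rest
            const double cost = P[i][i + k - 1] + c[i + k];
            if (cost < c[i]) {
                c[i] = cost;
                firstLine[i] = k;
            }
        }
    }

    // Follow the recorded choices of k to recover the line breaks.
    Layout out{c[1], {}};
    for (std::size_t i = 1; i <= n; i += firstLine[i])
        out.lineEnds.push_back(i + firstLine[i] - 1);
    return out;
}
```

As a small check, word lengths 3, 2, 4, 3 with $D = 6$ and $s = 1$ give a minimum total penalty of 5, with lines ending after words 2, 3 and 4. Note that, following the definition in the text, the last line is penalized like any other.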


Copyright © 1997 by Bruno R. Preiss, P.Eng. All rights reserved.