Page breaking

When generating print output, you may find that some pages break in awkward places in your text flow. Because DocBook XSL is a batch processing system, you can't just visually adjust page breaks by adding blank lines in your file as you can with a word processor. Even if you were to insert empty paragraphs to add space, those empty lines might be out of place if you edit your content and repaginate.

The DocBook DTD does not contain any elements or attributes that control page breaking. Many people assume the beginpage element creates a page break. But it was created to record where a page break fell in a legacy document before it was converted to DocBook XML, and it does not generate a page break.

The DocBook XSL stylesheet tries hard to prevent bad page breaks in print output. It assigns keep-together properties to some output blocks, which prevents insertion of a page break within the block. For example, a table with this property will be pushed to the next page if the whole table doesn't fit at the bottom of a page. For other blocks the stylesheet adds a keep-with-next property to keep the block with the following block. This is useful for section titles so they don't appear at the bottom of a page with nothing after them.
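As a rough illustration of those two properties, the XSL-FO that reaches the formatter might look like the following fragment. It is a hand-written sketch, not actual stylesheet output: the real output carries many more properties, and the text content here is made up.

```xml
<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format"
         flow-name="xsl-region-body">
  <!-- A section title: kept with whatever block follows it -->
  <fo:block keep-with-next.within-column="always">Section title</fo:block>
  <fo:block>First paragraph of the section.</fo:block>

  <!-- A wrapper around a table: no page break allowed inside it -->
  <fo:block keep-together.within-column="always">
    <fo:block>Table title</fo:block>
    <fo:block>(table content)</fo:block>
  </fo:block>
</fo:flow>
```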


The current version of FOP (0.20.5) does not support these keep properties except in table rows.

Automatic page breaking is great when it works, but it doesn't always produce aesthetically pleasing pages. There are times when the author needs to assist the formatter in page breaking. Since page breaking applies only to print output, the stylesheet supports several dbfo processing instructions to let the author provide help in page breaking.

Keep-together processing instruction

The dbfo keep-together processing instruction can be used with tables, examples, figures, and equations (and their informal versions too). By default, each of those elements is automatically kept together, by means of the following attribute in the attribute-set in fo/param.xsl:

<xsl:attribute-set name="">
   <xsl:attribute name="keep-together.within-column">always</xsl:attribute>
</xsl:attribute-set>

For more information on attribute sets, see the section “Attribute sets”. The full name of this XSL-FO property is keep-together.within-column. The within-column part means the block will be kept together across column breaks in a multi-column page, as well as across page breaks. The value of always means to always keep the block together. If it were set to auto instead, then breaks would be permitted.

If you don't change the attribute set, then none of your tables, examples, figures, or equations will be broken across page boundaries. That's good, except when you don't want that behavior. Consider a long table that starts fairly high on the page. If the whole table doesn't fit on the page, then it breaks the page and leaves a lot of blank space behind. In such cases it would be better to start the table on the current page, and permit it to break and continue on the next page. But you don't want to change the attribute-set, because that would change it for all tables, including short ones that should be kept together.

So to permit a single table to break, add the dbfo keep-together processing instruction to your DocBook XML table element as follows:

<table>
  <title>My long table</title>
  <?dbfo keep-together="auto" ?>
</table>

When this processing instruction is a child of the DocBook table element, the stylesheet will add a keep-together.within-column="auto" property to the output table. That value will override the attribute set value of always and permit a page break within the table.

This processing instruction can also be used for figures, examples, and equations when they contain content that can be broken across pages (this does not include graphics). For example, if you put a long programlisting in an example, you could add the same PI to permit it to break across pages.
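For instance, an example that wraps a long listing could carry the PI as shown in this sketch. The title and listing content are placeholders.

```xml
<example>
  <title>A long program listing</title>
  <!-- Permit this one example to break across pages -->
  <?dbfo keep-together="auto" ?>
  <programlisting># Many lines of sample code
...
</programlisting>
</example>
```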

The dbfo keep-together PI can also be used in the other direction. If you turn off the keep globally in the attribute set, or turn it off for all your tables, you can turn it back on for a single table by setting the PI value to always instead of auto. See the section “ attribute-set” for more information on attribute sets for tables.
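A minimal sketch of that reverse setup follows. It assumes the attribute set that holds the keep is named formal.object.properties in your version of the stylesheets, and that the stock FO stylesheet is imported from the URL shown; check both against your installation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
  <!-- Assumed location of the stock FO stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Assumed attribute-set name; turns the keep off for all formal objects -->
  <xsl:attribute-set name="formal.object.properties">
    <xsl:attribute name="keep-together.within-column">auto</xsl:attribute>
  </xsl:attribute-set>
</xsl:stylesheet>
```

With the keep turned off globally, a short table that should stay in one piece would then carry the PI with the opposite value:

```xml
<table>
  <title>My short table</title>
  <!-- Restore the keep for this table only -->
  <?dbfo keep-together="always" ?>
  ...
</table>
```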

Soft page breaks

The one thing you don't want to do is insert a hard page break in your XML document. A hard page break always forces a page break at that point. While this may be useful for solving an immediate problem, the next time you edit your document and reformat you may find that your hard page break is positioned higher up on the page and breaks it inappropriately. Maintaining a document with hard page breaks is a pain. For that reason there is no processing instruction in DocBook XSL to insert a hard page break.

The stylesheet does provide a processing instruction for soft page breaks. A soft page break is a conditional page break. If the conditions on the page are not met, then the page does not break. The idea is borrowed from the troff typesetting system, which uses the term “need”. You put a processing instruction in your document that effectively says "I need at least 2 vertical inches left on the current page to fit the following material. If that much space is not available on the page, then break to the next page at this point. If there is enough space, don't break."

This kind of conditional page break is handy when the normal “keeps” used in the stylesheet are not sufficient, either for technical reasons or for aesthetic reasons. For example, you may want to make sure a short introductory paragraph that precedes a code listing has at least a few lines of code with it on the page. The para and the programlisting are separate elements that normally would not have a "keep". Here is an example.

<para>Some text in a paragraph</para>
<?dbfo-need height="2in" ?>
<para>The following code snippet illustrates 
the technique.</para>
<programlisting># Some sample code
...
</programlisting>

Here is what happens when this page is being typeset by the XSL-FO processor. If there is less than 2 inches of vertical space left at the point on the page where the second paragraph in the above example would start, then the rest of the page is left blank and the second paragraph is pushed to the next page. How does it work? The stylesheet outputs an empty fo:block-container with a 2 inch height, followed by an empty fo:block with a negative 2 inch space-before property. If there are 2 inches of space left on the page, then the formatter backspaces up to the start of the block container and starts the next text output without breaking the page. If there aren't 2 inches of space left, then the block-container will force a page break and the text will start at the top of the next page.
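The mechanism can be pictured with the following simplified XSL-FO fragment. It is a sketch of the idea only: the exact elements and attributes the stylesheet emits may differ between versions.

```xml
<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format"
         flow-name="xsl-region-body">
  <fo:block>Some text in a paragraph</fo:block>

  <!-- Empty container that needs 2in; it moves to the next page if it doesn't fit -->
  <fo:block-container height="2in">
    <fo:block/>
  </fo:block-container>

  <!-- Empty block with negative space that backs up over the container -->
  <fo:block space-before="-2in"/>

  <fo:block>The following code snippet illustrates the technique.</fo:block>
</fo:flow>
```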


The current version of FOP (0.20.5) does not support this soft page break mechanism.

Because the mechanism uses blocks, you can't put the processing instruction inline. It must be between elements that generate blocks of text, otherwise you may get invalid XSL-FO. Also note that the processing instruction name is dbfo-need, not dbfo like other DocBook PIs.

If you are managing breaks between items in a list, then you might have to put the processing instruction just inside the listitem element to get it to work. This is especially true for varlistentry.
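A sketch of that placement in a variablelist is shown below. The term, text, and 1 inch height are placeholders chosen for illustration.

```xml
<variablelist>
  <varlistentry>
    <term>--verbose</term>
    <listitem>
      <!-- PI placed just inside listitem, before its first block -->
      <?dbfo-need height="1in" ?>
      <para>Prints extra diagnostic output.</para>
    </listitem>
  </varlistentry>
</variablelist>
```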

This kind of page breaking is not perfect, because you need to estimate how much physical space is needed for the content you want to keep together. You would typically use it after the first printout so you can measure the vertical sizes of typeset elements. But since it isn't wrapping elements, it can create keeps of arbitrary size.

The dbfo-need PI also accepts a second optional pseudo attribute named space-before. This is useful for manually adjusting the spacing when the stylesheet can't quite restore the spacing to what it was without the PI. For example:

<?dbfo-need  height="0.5in"  space-before="3em" ?>

The space-before pseudo attribute also could be used to add extra vertical space wherever you need it. If you leave out the height pseudo attribute, then you will just get the extra spacing.
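As a small sketch, the spacing-only form placed between two paragraphs would look like this. The 2em value and the paragraph text are arbitrary.

```xml
<para>End of one topic.</para>
<!-- No height given, so only the extra vertical space is added -->
<?dbfo-need space-before="2em" ?>
<para>Start of the next topic.</para>
```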

Hard page breaks

Although the DocBook XSL stylesheets don't provide direct support for hard (unconditional) page breaks, you can implement your own as a customization. Hard page breaks are not recommended for the reasons described in the section “Soft page breaks”. But there may be times when they are useful. Although customizations are discussed later, this short one is included here to make it easier to find. To enable hard page breaks, you add the following template to your customization layer:

<xsl:template match="processing-instruction('hard-pagebreak')">
   <fo:block break-before='page'/>
</xsl:template>

Then you put the following processing instruction in your document where you want an unconditional page break:

<para>Some text in a paragraph</para>
<?hard-pagebreak?>
<para>The following code snippet illustrates 
the technique.</para>
<programlisting># Some sample code
...
</programlisting>

When the stylesheet processes this PI, it inserts an empty block with the break-before='page' property, which forces a page break. As with soft page breaks, this PI cannot appear inline; it must be placed between elements that generate blocks of text.
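For context, the template needs the fo namespace declared in the stylesheet that contains it. A minimal customization layer holding it might look like the following sketch. The import URL is an assumption about where the stock FO stylesheet lives; adjust it to your installation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:fo="http://www.w3.org/1999/XSL/Format"
                version="1.0">
  <!-- Assumed location of the stock FO stylesheet -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Emit an empty block that forces a new page wherever the PI appears -->
  <xsl:template match="processing-instruction('hard-pagebreak')">
    <fo:block break-before="page"/>
  </xsl:template>
</xsl:stylesheet>
```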