Chapter 18. Indexes

Table of Contents

Adding indexterms
Specialized indexes
Outputting an index
Cleaning up an FO index
Internationalized indexes
Formatting HTML indexes
Formatting print indexes

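The DocBook XSL stylesheets can automatically generate a back-of-the-book index while your document is being processed. You have to do two things:

  • Insert indexterm elements in your document to identify the text to be indexed.

  • Insert an index element at the point in your document where the index should appear.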

Creating a good index is an iterative process. The first time you generate the index after inserting indexterm elements, you will likely notice index entries that are similar, but not quite the same. You must decide if they should be exactly the same, and then find and edit the indexterm elements to merge them. You may also notice related entries that would work better using primary, secondary, and possibly tertiary elements to group them together. You will also want to create entries for alternative wording of your entries, and you have to think creatively to anticipate what words your readers might use to find something. Keep adding and editing entries, and keep building and reviewing your index until you are satisfied with the results.

Adding indexterms

By far the harder of these two tasks is inserting the indexterm elements in your document. You need to insert one for every entry that is to appear in your index. You place an indexterm at the location where the reference from the index is to land. The DocBook DTD permits indexterm elements to be included as content in a wide variety of elements. Certain ambiguous locations are not permitted, such as between two section elements. The DocBook DTD is the final reference. Validating your document against the DTD will confirm that you have placed your indexterms in permitted locations.
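
As a minimal sketch of placement, an indexterm can sit inline in a paragraph at the point the index entry should refer to. The paragraph wording here is invented for illustration; only the indexterm and primary elements come from the discussion above.

```xml
<!-- Hypothetical paragraph: the index entry for "start tags"
     will point to the location of the indexterm element. -->
<para>An element begins with a start tag<indexterm>
    <primary>start tags</primary></indexterm>, which names the
element type.</para>
```

Because the indexterm is inline content of the para, this placement should pass validation, unlike an indexterm placed between two section elements.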

You may have noticed that entries in an index can have more than one level. That permits grouping of subtopics under a keyword. For example:

Example 18.1. Index output

...
start characters, changing, 12
start tags
  beginning, 11
  case sensitivity, 579
  empty element, 22, 609
  errors, 58
    misspelling, 58
    out of context, 60
  minimization, 19
step element, 36
...

The indexterm element provides sufficient structure to permit such multi-level entries to be created. Inside, you assign the highest level part to the primary element, any second level part to a secondary element, and any third level part to a tertiary element. The following are examples of the above entries as they appeared in their indexterm elements scattered through the document.

Example 18.2. Indexterms

<indexterm>
    <primary>start characters, changing</primary></indexterm>  1
<indexterm>
    <primary>start tags</primary>
    <secondary>beginning</secondary></indexterm>  2
<indexterm>
    <primary>start tags</primary>
    <secondary>case sensitivity</secondary></indexterm>  3
<indexterm>
    <primary>start tags</primary>
    <secondary>empty element</secondary></indexterm>
<indexterm>
    <primary>start tags</primary>
    <secondary>errors</secondary></indexterm>
<indexterm>
    <primary>start tags</primary>
    <secondary>errors</secondary>
    <tertiary>misspelling</tertiary></indexterm>  4
<indexterm>
    <primary>start tags</primary>
    <secondary>errors</secondary>
    <tertiary>out of context</tertiary></indexterm>  5
<indexterm>
    <primary>start tags</primary>
    <secondary>minimization</secondary></indexterm>
<indexterm>
    <primary>step element</primary></indexterm>
1. Index term with only a primary level.

2. Second-level index term with secondary element. Note that there is no page reference on start tags itself in Example 18.1, “Index output”. It would have one only if there were another indexterm with just a primary element containing start tags.

3. For secondary values to sort together, their primary values must match exactly. The exception is when a sortas attribute is used on the primary element and it is an exact match.

4. Third-level index term with secondary and tertiary elements. Note that in this case errors has a page number, due to the previous indexterm, which stops at the secondary level.

5. For tertiary values to sort together, their primary values must match exactly and their secondary values must match exactly. The exception is when sortas attributes provide the exact match.

If you want entries to sort together, you must make sure the text matches exactly, at all the levels that need to match. When you are working through your document to add indexterms, it helps to keep a running list of words and phrases you are using. That way you can reuse a word or phrase in another indexterm and they will sort together. Generating a thorough and consistent index requires a lot of work and care, but your readers will thank you for it.
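
The sortas exception mentioned in the callouts can be sketched as follows. The variant spelling start-tags is a hypothetical example, not taken from the indexed document. Under the rule stated above, its sortas value should let the two primary values count as an exact match, so both secondary entries group under one heading. How the merged heading is displayed may depend on your stylesheet version, so check the generated index.

```xml
<indexterm>
    <primary>start tags</primary>
    <secondary>beginning</secondary></indexterm>

<!-- Hypothetical variant spelling elsewhere in the document;
     the sortas value supplies the exact match for grouping. -->
<indexterm>
    <primary sortas="start tags">start-tags</primary>
    <secondary>minimization</secondary></indexterm>
```

Editing the second primary to read start tags is usually the simpler fix. The sortas route is mainly useful when the displayed text has to differ from the text used for matching.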

There are several other index features you might want to use:

  • If you need an entry to sort to some other location in the index, then add a sortas attribute whose value is the text to be used for sorting. For example:

    <indexterm>
      <primary sortas="Fourthought">4Thought</primary></indexterm>
    

    This entry will sort with the F's, but will appear as 4Thought.

  • If you want to indicate that one destination is more significant than other destinations for the same entry, then add a significance="preferred" attribute to its indexterm element. When processed into the index, the page number for such entries will appear first and in bold. You can change the formatting by customizing the index.preferred.page.properties attribute-set.

  • If an index entry should logically cover a range of pages, you can indicate the start and end of the range with just two entries:

    Start of page range:
    <indexterm class="startofrange" id="makestuff">
      <primary>Makefiles</primary></indexterm>
    ...
    End of page range:
    <indexterm class="endofrange" startref="makestuff">
      <primary>Makefiles</primary></indexterm>
    

    The content of the entries must be the same. The first entry must have an id attribute that the second one points to with its startref attribute. That link establishes the pair of entries. The class attributes trigger the processing to generate the range. Not all FO processors support index ranges.
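
As an illustration of the significance attribute described in the list above, the following sketch marks one destination as the preferred one. The indexterm content reuses the Makefiles entry from the range example; which location deserves the preferred mark is up to you.

```xml
<!-- Placed at the main discussion of the topic -->
<indexterm significance="preferred">
    <primary>Makefiles</primary></indexterm>
```

To change how the preferred page number is formatted in FO output, a customization layer might override the attribute-set along these lines. The italic styling is a hypothetical choice, and the import of the stock FO stylesheet is left as a comment because its path depends on your installation.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- Import the stock DocBook FO stylesheet here. -->

  <!-- Assumed preference: italic instead of bold for preferred pages -->
  <xsl:attribute-set name="index.preferred.page.properties">
    <xsl:attribute name="font-weight">normal</xsl:attribute>
    <xsl:attribute name="font-style">italic</xsl:attribute>
  </xsl:attribute-set>

</xsl:stylesheet>
```

Attribute-sets with the same name are merged in XSLT, so font-weight is reset explicitly here rather than relying on the new definition to replace the old one.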