Chapter 15. Writing PNG Images

Contents:

15.1. A libpng-Based, PNG-Writing Demo Program
15.2. Gamma Correction
15.3. Text Chunks
15.4. writepng_version_info()
15.5. writepng_init()
15.6. Interlaced PNG: writepng_encode_image()
15.7. Noninterlaced PNG: writepng_encode_row()
15.8. writepng_cleanup()
15.9. Getting the Source Code

Writing PNG images is both simpler and more complex than reading them. Weighing in on the side of simplicity is the fact that there is no need for a lot of platform-specific code, particularly platform-specific graphical code--unless, of course, the application already is graphical. In general, there is also no need for a special progressive mode; writing a PNG file, or almost any image format, for that matter, is more or less progressive by nature, although some complexity creeps in when the image is interlaced.

Writing PNGs is more explicitly complex when it comes to dealing with ancillary information like text annotations, timestamps, and so forth. A simple PNG viewer can ignore all of that; its only concern is with displaying the pixels correctly and in a timely manner. But a PNG-writing application should be prepared to preserve any existing textual information and to give the user the option of adding new information--for example, a title, the author's name, and copyright information. One wants to avoid adding too much baggage to the image, but the user should also be given the option of adding a timestamp (e.g., the tIME chunk for time of last modification, or perhaps a tEXt chunk indicating the creation time).

When it comes to the actual image data, at a minimum, the application should be able to detect when there are no more than 256 colors or color-transparency pairs, including a possible background color, and write a palette-based image if that is the case. Ideally, it should also be able to write a grayscale image as grayscale instead of RGB, but unless there is already information available that indicates the pixels are gray, or the user indicates that the image is to be converted to grayscale, detecting such images can be both CPU- and memory-intensive.
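
The two checks just described can be sketched in a few lines of C. This is only an illustration, not part of wpng: the scan_result type and scan_rgba() function are hypothetical names, the input is assumed to be 8-bit RGBA, and a possible background color is not counted toward the 256-entry limit.

```c
#include <stddef.h>

/* Summary of a scan over 8-bit RGBA pixels (hypothetical helper type). */
typedef struct {
    int is_gray;       /* 1 if every pixel has R == G == B */
    int fits_palette;  /* 1 if there are at most 256 distinct RGBA values */
    int num_colors;    /* distinct values counted (never more than 256) */
} scan_result;

/* Scan npixels RGBA quadruplets and report whether the image could be
 * written as grayscale and/or as a palette-based PNG.  The linear search
 * through seen[] is slow but keeps the sketch short. */
scan_result scan_rgba(const unsigned char *rgba, size_t npixels)
{
    unsigned long seen[256];
    scan_result r = { 1, 1, 0 };
    size_t i;
    int j;

    for (i = 0;  i < npixels;  ++i) {
        const unsigned char *p = rgba + 4*i;
        unsigned long key = ((unsigned long)p[0] << 24) |
                            ((unsigned long)p[1] << 16) |
                            ((unsigned long)p[2] << 8) | p[3];

        if (p[0] != p[1] || p[1] != p[2])
            r.is_gray = 0;

        if (r.fits_palette) {
            for (j = 0;  j < r.num_colors;  ++j)
                if (seen[j] == key)
                    break;
            if (j == r.num_colors) {      /* new color-transparency pair */
                if (r.num_colors == 256)
                    r.fits_palette = 0;   /* this is the 257th value */
                else
                    seen[r.num_colors++] = key;
            }
        }
        if (!r.is_gray && !r.fits_palette)
            break;                        /* nothing left to learn */
    }
    return r;
}
```

A real encoder would probably replace the linear search with a hash table, since the cost of this version grows with both the pixel count and the number of colors found so far.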

It should go without saying that any such application should include gamma-correction information with the image whenever possible, and that it should be correct information; this may entail providing the user with a calibration screen. And image converters must be much more careful, since most images lacking explicit gamma information also lack any information from which one can infer the gamma value unambiguously; guessing incorrectly is worse than omitting the gamma info in the first place.

High-end, professional applications should also provide chromaticity information, if it is known, and mark any images created in the standard RGB color space with an appropriate sRGB chunk. They may also want to include a complete International Color Consortium embedded profile (iCCP chunk), but given the size of such profiles, this should always be an option given to the user, and generally it should not be the default option. See Chapter 10, "Gamma Correction and Precision Color", for a more detailed discussion of gamma correction and color spaces.

Applications such as image editors, which usually include the generation of web-friendly graphics as one of their features, should also provide the user with the option of converting truecolor images into colormapped ones. This is known as quantization, and it should include images with an alpha channel. As I described in Chapter 8, "PNG Basics", PNG's tRNS chunk effectively transforms a palette from RGB samples into RGBA; thus, any program that can quantize a 24-bit RGB image down to a 256-color palette-based image should also be capable of quantizing a 32-bit RGBA or 16-bit gray/alpha image down to a 256-entry PLTE/tRNS-based image. But because quantization is a lossy procedure, it should never be the default--unless, of course, the entire purpose of the application is the lossy conversion of truecolor images into colormapped ones.

Special-purpose applications that deal with sampled data from scientific or medical apparatus will often encounter odd bit depths or oddly calibrated data, at least compared with standard computer images. For example, medical tomographic (CT) images are usually stored as 16-bit integer samples, but the implied upper bound of 65,535 is misleading. Such images rarely use more than 10 to 12 bits of each sample; that is, their maximum intensity value is typically less than 4,096 and sometimes less than 1,024, though rarely less than 256. When stored as PNG images, their samples should be scaled up so that the maximum value is near 65,535. For example, an image whose raw data has a maximum value of 1,891 is using only 11 bits of each sample--i.e., the next power of two is 2,048, or 2^11. It should be scaled up either by a factor of 32 (2^5), which corresponds simply to shifting the bits five to the left, or more properly by a factor of 65,535/2,047, which happens to be very closely approximated by what the PNG spec calls ``left bit replication.'' These two approaches are more easily understood as C code:

    /* how to scale 11-bit data up to 16 bits */
#ifdef LEFT_BIT_REPLICATION
    new_sample = (old_sample << 5) | (old_sample >> 6);
#else
    new_sample = (old_sample << 5);    /* simple shift method */
#endif

Either way, the application should write an sBIT chunk into the file to indicate the number of significant bits in the original data; in this case, the sBIT value would be 11. It might also want to write a pCAL chunk indicating the calibration of the sample values relative to the physical quantity being measured. It is not intuitively obvious how one would allow the user to provide information for the pCAL chunk interactively, however; more likely, a programmer would hardcode things like the pCAL equation type directly into the application, given advance knowledge of the type of data being collected or manipulated.
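
For data whose significant depth is not known until runtime, the left-bit-replication idea can be written as a loop. The following is a sketch under my own assumptions: scale_to_16() is a hypothetical name, the sample is assumed to fit in the stated number of bits, and depths outside 1 through 16 simply yield zero.

```c
/* Scale a sample with `bits` significant bits (1..16) up to 16 bits by
 * left bit replication: shift the sample to the top of the 16-bit word,
 * then keep filling the vacated low bits with copies of its leading bits. */
unsigned scale_to_16(unsigned sample, int bits)
{
    unsigned result = 0;
    int shift;

    if (bits < 1 || bits > 16)
        return 0;                     /* unsupported depth */

    for (shift = 16 - bits;  shift > -bits;  shift -= bits) {
        if (shift >= 0)
            result |= sample << shift;
        else
            result |= sample >> -shift;
    }
    return result & 0xFFFFu;
}
```

With bits equal to 11 the loop runs twice, once shifting left by five and once shifting right by six, so the 11-bit maximum of 2,047 maps exactly to 65,535.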

15.1. A libpng-Based, PNG-Writing Demo Program

The demo program I present here is intentionally more limited than it should be if it were a ``real'' program, in order that the basic concepts of writing PNG images with libpng not be lost in the details. For simplicity's sake, I chose to write a basic command-line image-conversion program in ANSI C, with the PNG-specific ``back end'' code in one file (writepng.c) and the single, cross-platform ``front end'' in another file (wpng.c). As with the PNG-reading demo programs, this uses libpng, which is very complete, well-tested, and by far the most commonly used PNG library. This program also keeps all image-related variables in a single struct; as with the one described in Chapter 14, "Reading PNG Images Progressively", this approach would enable a multithreaded program to handle several images at the same time. Finally, wpng uses NetPBM (or PBMplus) binary files for input, since there are few image formats that are simpler to read (or write, for that matter).

But recall from Chapter 5, "Applications: Image Converters", that there is already an extremely capable NetPBM conversion program called pnmtopng, by Alexander Lehmann and Willem van Schaik. It supports practically all PNG chunks and all possible variants of image data, and its source code is freely available and reusable, subject to minimal restrictions. Rather than duplicate many of its functions, I chose to stick to a minimal subset and instead concentrate on a few features not currently supported[103] by the larger program: incremental (or progressive) conversion, automatic timestamping, interactive input of text fields, and support for a very unofficial NetPBM extension format: type P8 files, containing 32-bit RGBA data. Supported PNG output types include basic 8-bit-per-sample grayscale, RGB and RGBA images, either interlaced or not. The program will write a gamma chunk if the user supplies an explicit value, but not otherwise; it cannot know a priori in what color space the original NetPBM image was created. The background chunk is also supported if the user supplies a background color, but it is ignored if the input image has no alpha channel.

[103] The most recent release as of this writing is version 2.37.2.

Readers with more advanced needs should study pnmtopng, which can be found on the PNG home site: http://www.libpng.org/pub/png/apps/pnmtopng.html. It includes such features as rescaling low-bit-depth samples, reordering the palette so that opaque entries of the tRNS chunk may be omitted, and support for explicitly specifying a separate PGM file as the alpha channel. libpng and zlib can both be found in the same location.

15.2. Gamma Correction

Before diving into the PNG-specific code, there are a couple of items in the main program (front end) that are worth a quick look. The first has to do with our old friend, gamma correction (see Chapter 10, "Gamma Correction and Precision Color"). As I noted earlier, in general there is no way to know what the gamma value of the input file is, so the output PNG file's gamma cannot be set automatically. But we do know that if the input file looks OK when displayed on the user's display system--which is presumed to be the one in use when the conversion program is run--then the file gamma is roughly equal to the inverse of the display system's exponent. So wpng calculates a default value for the display-system exponent just as our two PNG-reading demo programs did; the difference is that its calculated value is purely advisory. Here is the code to calculate the default gamma value:

    double default_gamma = 0.0;
  
#if defined(NeXT)
    default_exponent = 1.0;   /* 2.2/next_gamma for 3rd-party utils */
#elif defined(sgi)
    default_exponent = 1.3;   /* default == 2.2 / 1.7 */
    /* there doesn't seem to be any documented function to get the
     * "gamma" value, so we do it the hard way */
    if (tmpfile = fopen("/etc/config/system.glGammaVal", "r")) {
        double sgi_gamma;
  
        fgets(fooline, 80, tmpfile);
        fclose(tmpfile);
        sgi_gamma = atof(fooline);
        if (sgi_gamma > 0.0)
            default_exponent = 2.2 / sgi_gamma;
    }
#elif defined(Macintosh)
    default_exponent = 1.5;   /* default == (1.8/2.61) * 2.2 */
    /*
    if (mac_gamma = some_mac_function_that_returns_gamma())
        default_exponent = (mac_gamma/2.61) * 2.2;
     */
#else
    default_exponent = 2.2;   /* assume std. CRT, no LUT:  most PCs */
#endif
  
    default_gamma = 1.0 / default_exponent;
  
    if ((p = getenv("SCREEN_GAMMA")) != NULL) {
        double exponent = atof(p);
  
        if (exponent > 0.0)
            default_gamma = 1.0 / exponent;
    }

The first section calculates a platform-dependent exponent for the display system, which is then inverted to give a default file-gamma value. But it is possible that the user has calibrated the display system more precisely and has defined the SCREEN_GAMMA environment variable as suggested by the libpng documentation. If so, this value is used instead.
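
Stripped of the platform-specific branches, the logic reduces to one small function. This is a sketch, not wpng's code: advisory_file_gamma() is a hypothetical name, the platform exponent is assumed to be positive, and I use strtod() rather than atof() so that unparsable text can be told apart from a genuine zero.

```c
#include <stdlib.h>

/* Return the advisory file gamma: the inverse of the display-system
 * exponent.  env_value is the text of SCREEN_GAMMA (or NULL if unset);
 * platform_exponent is the compiled-in fallback, such as 2.2. */
double advisory_file_gamma(const char *env_value, double platform_exponent)
{
    double exponent = platform_exponent;

    if (env_value != NULL) {
        char *end;
        double user = strtod(env_value, &end);

        if (end != env_value && user > 0.0)   /* parsed a positive number */
            exponent = user;
    }
    return 1.0 / exponent;
}
```

A caller would typically pass getenv("SCREEN_GAMMA") as the first argument, so an unset variable arrives as NULL and the fallback is used.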

Note that the Macintosh code is incomplete. The Macintosh macro, presumed to be defined already, most likely would need to be set on the basis of compiler-specific macros. For example, the following preprocessor code would work for Metrowerks CodeWarrior and the Macintosh Programmer's Workbench, although MPW is not terribly specific and might be defined on non-Macintosh systems, too:

#if !defined(Macintosh)
#  if defined(__MWERKS__) && defined(macintosh)
#    define Macintosh
#  elif defined(MPW)  /* && defined(MCH_MACINTOSH) */
#    define Macintosh
#  endif
#endif

In any case, the calculated file gamma is presented as part of wpng's usage screen but thereafter ignored.

15.3. Text Chunks

The other item worth looking at is the interactive text-entry code. Most windowing systems will have more elegant ways to read in text than I use here, but even they should ensure that the entered text conforms to the recommended format for PNG text chunks. PNG text is required to use the Latin-1 character set; strictly speaking, that does not restrict the use of control characters (character code 127 and any code below 32 decimal), but in practice only line feeds (code 10) are necessary. The use of carriage-return characters (code 13) is explicitly discouraged by the spec in favor of single line feeds; this has implications for DOS, OS/2, Windows, and Macintosh systems. Horizontal tabs (code 9) are discouraged as well since they don't display the same way on all systems, but there are legitimate uses for tabs in text. The section of the spec dealing with security considerations implicitly recommends against the use of the escape character (code 27), which is commonly used to introduce ANSI escape sequences. Since these can include potentially malicious macros, encoders should restrict the use of the escape character for the sake of overly simple-minded decoders. That leaves codes 9, 10, 32-126, and 160-255 as valid from a practical standpoint, with use of the first (tab) discouraged. Note that codes 128-159 are not valid Latin-1 characters, at least not in the printable sense. They are reserved for specialized control characters.
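
Since carriage returns are discouraged, an encoder that accepts text from DOS-, Windows-, or Macintosh-style sources may want to normalize line endings before storing the text. Here is a minimal sketch; normalize_newlines() is a hypothetical helper and is not part of wpng.

```c
#include <stddef.h>

/* Convert CR-LF pairs and lone CRs to single line feeds, in place.
 * Returns the new length of the NUL-terminated string. */
size_t normalize_newlines(char *s)
{
    char *src = s, *dst = s;

    while (*src) {
        if (*src == '\r') {
            *dst++ = '\n';
            ++src;
            if (*src == '\n')     /* CR-LF pair:  swallow the LF */
                ++src;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - s);
}
```

Because the output is never longer than the input, the conversion can safely reuse the caller's buffer.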

The specification also recommends that lines in each text block be no more than 79 characters long; I've chosen to restrict mine to 72 characters each, plus provide for one or two newline characters and a trailing NULL. The spec does not specifically address the issue of the final newline, but does require omitting the trailing NULL; logically, one might extend that to include trailing newlines, so I have.

Finally, I have arbitrarily allowed only six predetermined keywords: Title, Author, Description, Copyright (all officially registered), and E-mail and URL (unregistered). Description is limited to nine lines, mainly so that the little line-counter prompts for each line are single digits and therefore line up nicely; the others are limited to one line each. Thus the code for reading the Title keyword, once the text buffer (textbuf) has been allocated, looks like this:

    do {
        valid = TRUE;
        p = textbuf + TEXT_TITLE_OFFSET;
        fprintf(stderr, "  Title: ");
        fflush(stderr);
        if (FGETS(p, 74, keybd) && (len = strlen(p)) > 1) {
            if (p[len-1] == '\n')
                p[--len] = '\0';    /* remove trailing newline */
            wpng_info.title = p;
            wpng_info.have_text |= TEXT_TITLE;
  
            if ((result = wpng_isvalid_latin1((uch *)p, len)) >= 0) {
                fprintf(stderr, "    " PROGNAME " warning:  character"
                  " code %u is %sdiscouraged by the PNG\n"
                  "    specification [first occurrence was at"
                  " character position #%d]\n", (unsigned)p[result],
                  (p[result] == 27)? "strongly " : "", result+1);
                fflush(stderr);
#ifdef FORBID_LATIN1_CTRL
                wpng_info.have_text &= ~TEXT_TITLE;
                valid = FALSE;
#else
                if (p[result] == 27) {      /* escape character */
                    wpng_info.have_text &= ~TEXT_TITLE;
                    valid = FALSE;
                }
#endif
            }
        }
    } while (!valid);

Aside from some subtlety with the keybd stream that I won't cover here (it has to do with reading from the keyboard even if standard input is redirected), the only part of real interest is the test for nonrecommended Latin-1 characters, which is accomplished in the wpng_isvalid_latin1() function:

static int wpng_isvalid_latin1(uch *p, int len)
{
    int i, result = -1;
  
    for (i = 0;  i < len;  ++i) {
        if (p[i] == 10 || (p[i] > 31 && p[i] < 127) || p[i] >= 160)
            continue;
        if (result < 0 || (p[result] != 27 && p[i] == 27))
            result = i;
    }
  
    return result;
}

If the function finds a control character that is discouraged by the PNG specification, it returns the offset of the first one found. The only exception is if an escape character (code 27) is found later in the string; in that case, its offset is what gets returned. The main code then tests for a non-negative value and prints a warning message. What happens next depends on how the program has been compiled. By default, the presence of an escape character forces the user to re-enter the text, but all of the other discouraged characters are allowed. If the FORBID_LATIN1_CTRL macro is defined, however, the user must re-enter the text whenever any of the ``bad'' control characters is found. The default behavior results in output similar to the following:

Enter text info (no more than 72 characters per line);
to skip a field, hit the <Enter> key.
  Title: L'Arc de Triomphe
  Author: Greg Roelofs
  Description (up to 9 lines):
    [1] This line contains only normal characters.
    [2] This line contains a tab character here: ^I
    [3] 
    wpng warning:  character code 9 is discouraged by the PNG
    specification [first occurrence was at character position #85]
  Copyright: We attempt an escape character here: ^[
    wpng warning:  character code 27 is strongly discouraged by the PNG
    specification [first occurrence was at character position #38]
  Copyright: Copyright 1981, 1999 Greg Roelofs
  E-mail: [email protected]
  URL: http://www.libpng.org/pub/png/pngbook.html

Note that the Copyright keyword had to be entered twice since the first attempt included an escape character. The Description keyword also would have had to be reentered if the program had been compiled with FORBID_LATIN1_CTRL defined.

Returning to more mundane issues, wpng_info is the struct by which the front end communicates with the PNG-writing back end. It is of type mainprog_info, and it is defined as follows:

typedef struct _mainprog_info {
    double gamma;
    long width;
    long height;
    time_t modtime;
    FILE *infile;
    FILE *outfile;
    void *png_ptr;
    void *info_ptr;
    uch *image_data;
    uch **row_pointers;
    char *title;
    char *author;
    char *desc;
    char *copyright;
    char *email;
    char *url;
    int filter;
    int pnmtype;
    int sample_depth;
    int interlaced;
    int have_bg;
    int have_time;
    int have_text;
    jmp_buf jmpbuf;
    uch bg_red;
    uch bg_green;
    uch bg_blue;
} mainprog_info;

As in the previous programs, we use the abbreviated typedefs uch, ush, and ulg in place of the more unwieldy unsigned char, unsigned short, and unsigned long, respectively. The title element is simply a pointer into the text buffer, and the struct contains similar pointers for the other five keywords. have_text is more than a simple Boolean (TRUE/FALSE) value, however. Because the user may not want all six text chunks, the program must keep track of which ones were provided with valid data. Thus, have_text is a bit flag, and TEXT_TITLE sets the bit corresponding to the Title keyword--but only if the length of the entered string is greater than one.

The user indicates that a field should be skipped by hitting the Enter key, and the fgets() function includes the newline character in the string it returns; thus a string of length one contains nothing but the newline.
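
The bit-flag bookkeeping looks like this in isolation. The DEMO_TEXT_* values below are illustrative assumptions; the real macros are defined in writepng.h and may use different values. count_text_fields() is likewise a hypothetical helper.

```c
/* Illustrative flag values only; writepng.h defines the real ones. */
#define DEMO_TEXT_TITLE   0x01
#define DEMO_TEXT_AUTHOR  0x02
#define DEMO_TEXT_DESC    0x04

/* Count how many text keywords are flagged in a have_text-style bit set. */
int count_text_fields(unsigned have_text)
{
    int n = 0;

    while (have_text) {
        n += (int)(have_text & 1u);
        have_text >>= 1;
    }
    return n;
}
```

Setting a flag uses |=, clearing it uses &= with the complemented mask, and testing it uses &, which are the same three operations the Title code applies to have_text.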

15.4. writepng_version_info()

We'll turn now to the PNG-specific back-end code in writepng.c. As with any module that calls libpng functions, it begins by including the png.h header file, which in turn includes zlib.h. This particular program also includes writepng.h, which defines our mainprog_info struct, various text-related macros, and prototypes for the externally visible functions that we'll be discussing in detail. Indeed, the first of these functions is almost trivial:

#include "png.h"       /* libpng header; includes zlib.h */
#include "writepng.h"  /* typedefs, common macros, public prototypes */
  
void writepng_version_info()
{
  fprintf(stderr, "   Compiled with libpng %s; using libpng %s.\n",
    PNG_LIBPNG_VER_STRING, png_libpng_ver);
  fprintf(stderr, "   Compiled with zlib %s; using zlib %s.\n",
    ZLIB_VERSION, zlib_version);
}

writepng_version_info() simply indicates the versions of libpng and zlib with which the application was compiled, as well as the versions it happens to be using at runtime. Ideally the two pairs of version numbers will match--in the case of a statically linked executable, they always will--but if the program was dynamically linked, it is possible that the program loader has found either an older or a newer version of one or both libraries, in which case strange problems may arise later. Making this information easily available to the user, whether in a simple text-mode usage screen as I do here or via a windowed ``about box'' or even a fancy, automated, troubleshooting function, can be helpful in dealing with the bug reports that inevitably show up sooner or later.
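
An application could go one step further and compare the two strings itself. The sketch below is one possible policy, not something libpng or zlib prescribes: version_mismatch() is a hypothetical function, and treating the major and minor numbers as the compatibility boundary is purely my assumption.

```c
#include <stdio.h>

/* Compare two dotted version strings such as "1.0.3" and "1.0.5".
 * Returns 0 if the major and minor numbers agree, 1 if they differ,
 * and -1 if either string cannot be parsed. */
int version_mismatch(const char *compiled, const char *runtime)
{
    int cmaj, cmin, rmaj, rmin;

    if (sscanf(compiled, "%d.%d", &cmaj, &cmin) != 2 ||
        sscanf(runtime, "%d.%d", &rmaj, &rmin) != 2)
        return -1;

    return (cmaj != rmaj || cmin != rmin);
}
```

In writepng.c the two arguments would be the same pairs that writepng_version_info() prints, for example PNG_LIBPNG_VER_STRING and png_libpng_ver.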

15.5. writepng_init()

Back in the main program we conditionally fill in various elements of our mainprog_info struct based on the user's command-line options: interlaced, modtime, have_time, gamma, bg_red, bg_green, bg_blue, and have_bg. Note that have_bg is set only if the user provides a background color and the PNM image type is the experimental ``type 8'' binary RGBA file. Also, whereas pnmtopng currently requires the user to provide a text version of the current time for use in the tIME chunk, wpng automatically determines the current time if the -time option is given:

    if (user_specified_time_option) {
        wpng_info.modtime = time(NULL);
        wpng_info.have_time = TRUE;
    }

After finishing the command-line options, we next open the input file (in binary mode!), verify that it's in the proper format, and read its basic parameters: image height, width, and depth. We also generate an output filename based on the input name and verify both that the output file does not already exist and that it can be opened and written to (also in binary mode!). That provides enough information to fill in most of the rest of mainprog_info: infile, pnmtype, have_bg, width, height, sample_depth, and outfile.
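
The header check can be sketched as follows. This is not the parser used in wpng: parse_pnm_header() is a hypothetical function that works on a text buffer rather than a FILE pointer, it does not handle comment lines, and the restriction to a maximum sample value of 255 is my assumption based on the 8-bit-per-sample limit mentioned earlier.

```c
#include <stdio.h>

/* Parse a binary PNM header of the form
 *     P<n> <width> <height> <maxval>
 * where the fields are separated by whitespace (usually newlines).
 * Returns 0 on success, nonzero if the header is malformed or describes
 * an image type this sketch does not accept. */
int parse_pnm_header(const char *hdr, int *type, long *width, long *height,
                     int *maxval)
{
    if (sscanf(hdr, "P%d %ld %ld %d", type, width, height, maxval) != 4)
        return 1;               /* not a PNM header at all */
    if (*type != 5 && *type != 6 && *type != 8)
        return 2;               /* only gray, RGB, and RGBA are accepted */
    if (*width <= 0 || *height <= 0 || *maxval != 255)
        return 3;               /* 8-bit samples only */
    return 0;
}
```

Exactly one whitespace character follows the maxval field before the binary pixel data begins, which is one reason both files must be opened in binary mode.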

If any errors have occurred by this point, wpng prints the usage screen--including the libraries' version information--and exits. Otherwise it optionally prompts the user for PNG text information and then, finally, calls our PNG initialization routine, writepng_init(). It is declared as follows:

int writepng_init(mainprog_info *mainprog_ptr)

where mainprog_ptr just points at the mainprog_info struct we filled in in the main program. writepng_init() begins with some fairly standard libpng boilerplate:

    png_structp  png_ptr;
    png_infop  info_ptr;
  
    png_ptr = png_create_write_struct(PNG_LIBPNG_VER_STRING,
      mainprog_ptr, writepng_error_handler, NULL);
    if (!png_ptr)
        return 4;   /* out of memory */
  
    info_ptr = png_create_info_struct(png_ptr);
    if (!info_ptr) {
        png_destroy_write_struct(&png_ptr, NULL);
        return 4;
    }

This fragment allocates memory for the two internal structures that libpng currently requires and sets up a custom error handler. Note that while the structs have the same names and types as those used in our PNG-reading demo programs, libpng provides separate functions to create and destroy them. The first function, png_create_write_struct(), also checks that the compile-time and runtime versions of libpng are reasonably compatible. Of course, any change to the library may create unforeseen incompatibilities, so passing this test does not absolutely guarantee that everything will work. Failing it, on the other hand, is a pretty good indication that things will break.

The second and third arguments to png_create_write_struct() are the keys to installing a custom error handler. The second argument is a pointer to application data (mainprog_ptr, in this case) that will be supplied to the error handler; the third argument is the custom error-handling routine itself. I will explain why it is important to use a custom routine as soon as we take a look at the next section of code.

Once the structs have been allocated, it is necessary to set up the ``receiving end'' of the error-handling code for this particular function. Essentially every user function that calls a libpng routine will need code like this; it amounts to more standard boilerplate, and in general, the only difference between applications will be where the jmpbuf member is stored. In this program, as with the one in the previous chapter, we store jmpbuf in our own struct instead of relying on the one in the main PNG struct:

    if (setjmp(mainprog_ptr->jmpbuf)) {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        return 2;
    }

I discussed the semantics of setjmp() and longjmp() in Chapter 13, "Reading PNG Images"; effectively they amount to a really big goto statement. The problem is not so much with the precise storage location of jmpbuf, but rather that its type, jmp_buf, can be different sizes depending on whether certain system macros have been defined. When one uses the default libpng error handler, setjmp() is called from the application, but longjmp() is called from within libpng. Since it is not uncommon for the library to be compiled separately from the application--indeed, it may not even have been compiled on the same system--there is no guarantee that the jmp_buf sizes in libpng and the application will be consistent. If they are not, mayhem ensues. See the sidebar for a solution.

writepng_error_handler()

The solution is a ``custom'' error handler, though that's a slight misnomer in our case. Completely custom error handlers can certainly be installed, but libpng currently assumes that its error-handling routine will never return. This rather drastically limits the options for alternatives--basically, one can use longjmp() or exit(), which amounts to an even larger goto statement.[104] Here, as in Chapter 14, "Reading PNG Images Progressively", I have merely taken libpng's default error handler and modified it slightly to use mainprog_ptr instead of png_ptr:

static void writepng_error_handler(png_structp png_ptr,
                                   png_const_charp msg)
{
    mainprog_info  *mainprog_ptr;
  
    fprintf(stderr, "writepng libpng error: %s\n", msg);
    fflush(stderr);
  
    mainprog_ptr = png_get_error_ptr(png_ptr);
    if (mainprog_ptr == NULL) {
        fprintf(stderr, "writepng severe error:  "
                "jmpbuf not recoverable; terminating.\n");
        fflush(stderr);
        exit(99);
    }
  
    longjmp(mainprog_ptr->jmpbuf, 1);
}

Because we have to use a libpng function, however trivial, to retrieve our pointer, there is an extra block of code in our version that makes sure the pointer is not NULL. If it is, we are completely stuck, and our only real option is to exit. But assuming the pointer seems valid (it may have been overwritten with an invalid but non-NULL address, in which case we're going to ``exit'' whether we want to or not), we use our saved jmp_buf and longjmp() back to the part of our application that most recently invoked setjmp(). The key difference from using libpng's error handler is simply the location of the longjmp() call. Here we call both setjmp() and longjmp() within the same application--indeed, from within the same source file. They are therefore guaranteed to have consistent notions of how a jmp_buf is defined, so we have eliminated one more potential source of very-difficult-to-debug crashes.

[104] Ford's Model T was also renowned for its wide range of color options.
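
The whole pattern can be exercised without libpng at all. In the following sketch every name is hypothetical: demo_info stands in for mainprog_info, demo_write_row() stands in for a library routine that may fail, and demo_error_handler() plays the role of the custom handler. The return value of 2 merely echoes the convention used in writepng_init().

```c
#include <setjmp.h>

/* Stand-in for mainprog_info:  the application owns the jmp_buf. */
typedef struct {
    jmp_buf jmpbuf;
    int rows_written;
} demo_info;

/* Stand-in for the error callback:  it must not return, so it jumps
 * back to wherever the application last called setjmp(). */
static void demo_error_handler(demo_info *info)
{
    longjmp(info->jmpbuf, 1);
}

/* Stand-in for a library routine that fails on a chosen row. */
static void demo_write_row(demo_info *info, int row, int bad_row)
{
    if (row == bad_row)
        demo_error_handler(info);
    info->rows_written++;
}

/* Returns 0 on success or 2 if the "library" reported an error. */
int demo_encode(demo_info *info, int nrows, int bad_row)
{
    int row;    /* not read after the jump, so it need not be volatile */

    info->rows_written = 0;
    if (setjmp(info->jmpbuf))
        return 2;               /* arrived here via longjmp() */

    for (row = 0;  row < nrows;  ++row)
        demo_write_row(info, row, bad_row);

    return 0;
}
```

Both setjmp() and longjmp() are compiled in the same translation unit here, which is exactly the property that makes the approach safe.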

As long as we're on the subject of alternatives, libpng also supports user-defined input/output functions. But its default is to read from or write to PNG files, and since that is precisely what we want to do here, I chose to stick with the standard I/O-initialization call and pass the output file's pointer to libpng:

    png_init_io(png_ptr, mainprog_ptr->outfile);

Next we deal with compression. libpng has pretty good defaults, and many programs (possibly most) will not need to do anything here. But in our case we're converting from an uncompressed image format to PNG; for any given image, we're unlikely to do so more than once, and even if we convert many images, wpng is a command-line program and can easily be incorporated into a script for batch processing. Thus I chose to override libpng's default compression setting (zlib level 6--see Chapter 9, "Compression and Filtering") with the slower ``maximum'' setting (zlib level 9):

    png_set_compression_level(png_ptr, Z_BEST_COMPRESSION);

Note that a good PNG-writing program should let the user decide whether and how to override the default settings; options for very fast saves and/or for maximal compression might be reasonable, in addition to the default. In fact, pnmtopng provides options to do just that.

Tweaking Compression

Closely related to compression is filtering, one area in which it is almost always better to leave the decision up to libpng. Repeated tests have shown that filtering is almost never useful on palette-based images, but on everything else it is quite beneficial. Though libpng allows one to restrict its filter selection, this is rarely a good idea; dynamic filtering works best when the encoder can choose from the five defined filter types. But for programmers who want to play with the alternatives, here's an example:

/*
    >>> this is pseudo-code
    if (palette image, i.e., don't want filtering) {
        png_set_filter(png_ptr, PNG_FILTER_TYPE_BASE, 
          PNG_FILTER_NONE);
        png_set_compression_strategy(png_ptr, Z_DEFAULT_STRATEGY);
    } else {
        >>> leave default filter selection alone
        png_set_compression_strategy(png_ptr, Z_FILTERED);
    }
 */

The calls to png_set_compression_strategy() actually alter zlib's behavior to work better with the filtered output. Other zlib parameters can also be tweaked, at least in theory; these include the sliding window size, memory level, and compression method. For the last, only method 8 is currently defined, but zlib 2.0 is likely to introduce at least one or two new methods when it is eventually released. Of course, unless and until the PNG specification is revised accordingly, no new compression method can be used within a PNG file without invalidating it.

The window size is the only thing a normal PNG encoder should consider changing, and then only when the total size of the image data, plus one extra byte per row for the row filters, amounts to 16 kilobytes or less. In such a case, the encoder can use a smaller power-of-two window size without affecting compression, which allows decoders to reduce their memory usage. The following fragment shows how to modify these zlib parameters; the values shown are the defaults used by libpng (consult the libpng documentation, specifically ``Configuring zlib'' and ``Controlling row filtering''):

/*
    >>> second arg is power of two; 8 through 15 (256-32768) valid
    png_set_compression_window_bits(png_ptr, 15);
    png_set_compression_mem_level(png_ptr, 8);
    png_set_compression_method(png_ptr, 8);
 */
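
Choosing the reduced window size is a matter of finding the smallest power of two that covers the data. The function below is a sketch under several assumptions: window_bits_for_image() is a hypothetical name, the image is taken to be noninterlaced (so there is exactly one filter byte per row), and overflow of the byte count for enormous images is ignored.

```c
/* Pick the smallest zlib window, expressed as a power-of-two exponent
 * in the range 8..15, that still covers all of the data to be
 * compressed:  the pixel bytes plus one filter-type byte per row. */
int window_bits_for_image(unsigned long rowbytes, unsigned long height)
{
    unsigned long total = (rowbytes + 1) * height;
    int bits = 8;

    while (bits < 15 && (1UL << bits) < total)
        ++bits;

    return bits;
}
```

The result could then be handed to png_set_compression_window_bits(). A cautious encoder might round up by one step, since I have not verified here how much of the nominal window zlib can actually use for matches.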

The next step is to convert our notion of the image type into something libpng will understand. In this case, because we support only three basic image types--grayscale, RGB, or RGBA--we have a one-to-one correspondence between input and output types, so setting the PNG color type is easy. For more general programs, libpng provides several PNG_COLOR_MASK_* macros that can be combined to get the color type, with the exception that PNG_COLOR_MASK_PALETTE and PNG_COLOR_MASK_ALPHA are incompatible. We also set the appropriate PNG interlace type if the user so requested:

    int color_type, interlace_type;
  
    if (mainprog_ptr->pnmtype == 5)
        color_type = PNG_COLOR_TYPE_GRAY;
    else if (mainprog_ptr->pnmtype == 6)
        color_type = PNG_COLOR_TYPE_RGB;
    else if (mainprog_ptr->pnmtype == 8)
        color_type = PNG_COLOR_TYPE_RGB_ALPHA;
    else {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        return 11;
    }
  
    interlace_type = mainprog_ptr->interlaced? PNG_INTERLACE_ADAM7 :
                                               PNG_INTERLACE_NONE;
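For a more general program, combining the mask bits might look something like the sketch below. The DEMO_MASK_* constants are local stand-ins that mirror the values of libpng's PNG_COLOR_MASK_* macros, and demo_color_type() is a hypothetical helper, not a libpng function:

```c
/* Local stand-ins mirroring the values of libpng's PNG_COLOR_MASK_*
 * macros; a real program would include <png.h> and use those instead. */
#define DEMO_MASK_PALETTE 1
#define DEMO_MASK_COLOR   2
#define DEMO_MASK_ALPHA   4

/* Hypothetical helper: build a PNG color type from three yes/no
 * properties.  Returns -1 for the one invalid combination, a palette
 * together with an alpha channel. */
int demo_color_type(int has_color, int has_alpha, int has_palette)
{
    int type = 0;

    if (has_palette) {
        if (has_alpha)
            return -1;  /* palette transparency goes in tRNS instead */
        /* palette images always have the color bit set */
        return DEMO_MASK_COLOR | DEMO_MASK_PALETTE;
    }
    if (has_color)
        type |= DEMO_MASK_COLOR;
    if (has_alpha)
        type |= DEMO_MASK_ALPHA;

    return type;
}
```

The five valid results (0, 2, 3, 4, and 6) correspond to grayscale, RGB, palette, grayscale with alpha, and RGBA, respectively.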

At this point, we can set the basic image parameters. We have the option of using several functions, each of which sets a single parameter, but there is really no point in doing so. Instead we set all of them with a single call to png_set_IHDR():

    png_set_IHDR(png_ptr, info_ptr, mainprog_ptr->width,
      mainprog_ptr->height, mainprog_ptr->sample_depth,
      color_type, interlace_type,
      PNG_COMPRESSION_TYPE_DEFAULT, PNG_FILTER_TYPE_DEFAULT);

If we supported palette-based images, this is the point at which we would define the palette for libpng, via the png_set_PLTE() and possibly png_set_tRNS() functions. We can also set any optional parameters the user specified, starting with the gamma value, background color, and image modification time. In the case of the background color, we know that have_bg will be true only if the image has an alpha channel; in this program, that necessarily implies that it's an RGBA image, not grayscale with alpha or palette-based with transparency. Thus we only fill in the red, green, and blue elements of the png_color_16 struct:

    if (mainprog_ptr->gamma > 0.0)
        png_set_gAMA(png_ptr, info_ptr, mainprog_ptr->gamma);
  
    if (mainprog_ptr->have_bg) {
        png_color_16  background;
  
        background.red = mainprog_ptr->bg_red;
        background.green = mainprog_ptr->bg_green;
        background.blue = mainprog_ptr->bg_blue;
        png_set_bKGD(png_ptr, info_ptr, &background);
    }
  
    if (mainprog_ptr->have_time) {
        png_time  modtime;
  
        png_convert_from_time_t(&modtime, mainprog_ptr->modtime);
        png_set_tIME(png_ptr, info_ptr, &modtime);
    }
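To make the time conversion a little less mysterious, here is a minimal sketch of what a function like png_convert_from_time_t() is assumed to do: break the timestamp into UTC calendar fields. The demo_time struct is a hypothetical stand-in whose members mirror those of png_time, and demo_time_from_time_t() is not a libpng function:

```c
#include <time.h>

/* Hypothetical struct mirroring the members of libpng's png_time. */
typedef struct {
    unsigned short year;    /* full year, e.g., 1999 */
    unsigned char  month;   /* 1-12 */
    unsigned char  day;     /* 1-31 */
    unsigned char  hour;    /* 0-23 */
    unsigned char  minute;  /* 0-59 */
    unsigned char  second;  /* 0-60, allowing for a leap second */
} demo_time;

/* Break a time_t into UTC fields, since the tIME chunk stores universal
 * time rather than local time.  Returns 0 on success, -1 on failure. */
int demo_time_from_time_t(demo_time *out, time_t t)
{
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return -1;

    out->year   = (unsigned short)(utc->tm_year + 1900);
    out->month  = (unsigned char)(utc->tm_mon + 1);
    out->day    = (unsigned char)utc->tm_mday;
    out->hour   = (unsigned char)utc->tm_hour;
    out->minute = (unsigned char)utc->tm_min;
    out->second = (unsigned char)utc->tm_sec;

    return 0;
}
```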

It is also worth noting that libpng copies most of the data it needs into its own structs, so we can get away with using temporary variables like background and modtime without worrying about their values being corrupted before libpng is ready to write them to the file. The only exceptions are things involving pointers, in which case libpng copies the pointer itself but not the buffer to which it points. In fact, libpng's text-handling code is an excellent example of that:

    if (mainprog_ptr->have_text) {
        png_text  text[6];
        int  num_text = 0;
  
        if (mainprog_ptr->have_text & TEXT_TITLE) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "Title";
            text[num_text].text = mainprog_ptr->title;
            ++num_text;
        }
        if (mainprog_ptr->have_text & TEXT_AUTHOR) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "Author";
            text[num_text].text = mainprog_ptr->author;
            ++num_text;
        }
        if (mainprog_ptr->have_text & TEXT_DESC) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "Description";
            text[num_text].text = mainprog_ptr->desc;
            ++num_text;
        }
        if (mainprog_ptr->have_text & TEXT_COPY) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "Copyright";
            text[num_text].text = mainprog_ptr->copyright;
            ++num_text;
        }
        if (mainprog_ptr->have_text & TEXT_EMAIL) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "E-mail";
            text[num_text].text = mainprog_ptr->email;
            ++num_text;
        }
        if (mainprog_ptr->have_text & TEXT_URL) {
            text[num_text].compression = PNG_TEXT_COMPRESSION_NONE;
            text[num_text].key = "URL";
            text[num_text].text = mainprog_ptr->url;
            ++num_text;
        }
        png_set_text(png_ptr, info_ptr, text, num_text);
    }

Here I have declared a temporary array of six png_text structs, each of which consists of four elements: compression, key, text, and text_length. The first of these simply indicates whether the text chunk is to be compressed (zTXt) or not (tEXt). key and text are pointers to NUL-terminated strings containing the keyword and actual text, respectively. These pointers are what libpng copies, but the text buffers to which they point must remain valid until either png_write_info() or png_write_end() is called--we'll return to that point in a moment. The final member of the struct, text_length, is used internally by libpng; we need not set it ourselves, since libpng will do so regardless.

Anywhere from one to six of the structs are filled in, depending on whether the main program set the appropriate bit for each of the six supported keywords. Then png_set_text() is called, which triggers libpng to allocate its own text structs and copy our struct data into them. Alternatively, we could have used a single png_text struct, repeatedly filling it in and calling png_set_text() for each keyword; libpng merely chains the copied text structs together, so the net result would have been the same.
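The six nearly identical blocks could also be collapsed into a loop over a small table. The following is only a sketch of that idea: demo_text and demo_info are hypothetical stand-ins for png_text and mainprog_info, and the flag values are assumed here rather than taken from writepng.h. Note that only the pointers are stored, which is precisely why the original buffers must outlive the call:

```c
#include <stddef.h>

/* Assumed flag values; the real definitions live in writepng.h. */
#define TEXT_TITLE   0x01
#define TEXT_AUTHOR  0x02
#define TEXT_DESC    0x04
#define TEXT_COPY    0x08
#define TEXT_EMAIL   0x10
#define TEXT_URL     0x20

/* Hypothetical stand-in for the key and text members of png_text. */
typedef struct {
    const char *key;
    const char *text;
} demo_text;

/* Hypothetical stand-in for the text-related part of mainprog_info. */
typedef struct {
    int have_text;
    const char *title, *author, *desc, *copyright, *email, *url;
} demo_info;

/* Fill out[] with one entry per flagged keyword; returns the count.
 * Only pointers are copied, never the strings themselves. */
int demo_collect_text(const demo_info *info, demo_text out[6])
{
    static const int flags[6] = {
        TEXT_TITLE, TEXT_AUTHOR, TEXT_DESC, TEXT_COPY, TEXT_EMAIL, TEXT_URL
    };
    static const char *const keys[6] = {
        "Title", "Author", "Description", "Copyright", "E-mail", "URL"
    };
    const char *values[6];
    int i, num_text = 0;

    values[0] = info->title;
    values[1] = info->author;
    values[2] = info->desc;
    values[3] = info->copyright;
    values[4] = info->email;
    values[5] = info->url;

    for (i = 0; i < 6; ++i) {
        if (info->have_text & flags[i]) {
            out[num_text].key = keys[i];
            out[num_text].text = values[i];
            ++num_text;
        }
    }

    return num_text;
}
```

In a real program the loop body would fill in png_text structs (including the compression member) and the resulting array and count would be handed to png_set_text().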

Text Buffers, PNG Structs, and Core Dumps

The issue of libpng's allocation of its own text buffers is worth a closer look, because it indirectly led to a subtle but fatal bug in a popular PNG viewer. The program in question was John Bradley's XV, an elegant and powerful image viewer/converter for the X Window System. Version 3.10a, released late in 1994 and still the most recent release as of this writing, had no native PNG support. But because it was available in source-code form, it was one of the first applications to support the reading and writing of PNGs, thanks to a patch created by Alexander Lehmann in June 1995 and later modified by Andreas Dilger and the author of this book.

This patch was originally written to work with libpng 0.71 and zlib 0.93, beta versions so old they were arguably alpha-level software. At the time, major functionality was still being added to libpng, and the so-called modern ``convenience functions'' for modifying libpng parameters did not exist. As a result, the patch was designed to access the two PNG structs directly, and later updates to the patch did not completely eliminate this behavior. In particular, all versions of the patch through 1.2d, released in June 1996, allocated their own text structs and plugged them directly into one of the main PNG structs for libpng's use.

Now fast-forward to January 1998, when the final libpng betas were being released. By this time, libpng provided functions not only to allocate and destroy the PNG structs, but also to read from them and write to them. In particular, png_set_text() already existed in its present form; i.e., it allocated its own text structs and copied the user-supplied data into them. But one of the changes in libpng 0.97 involved plugging some small memory leaks by freeing these libpng-allocated text structs as part of png_destroy_write_struct(). Unfortunately, libpng had no way to track whether it had actually allocated the structs in the first place, and...well, one can see where this is going. First libpng freed the text structs, then the XV patch--which had allocated them--did so again. Boom: segmentation fault, core dump, an incomplete PNG file, and no more XV.

The moral of this little story is simple: 1995-era programs had no choice but to access libpng structs directly, because that was how libpng was originally written. But modern programs should never do so, not only because of this particular problem, but also for the several other reasons detailed in the previous two chapters. Let's say it again: Accessing libpng structures directly is just plain evil. Don't do it!

Ye have been warned.

The setting of the text chunks is our last piece of non-pixel-related PNG information, so our next step is to write all chunks up to the first IDAT:

    png_write_info(png_ptr, info_ptr);

Doing this flushes any time or text chunks to the output file, and the corresponding data in the PNG structs is marked so that it is not written to the file again later. I mentioned earlier that text buffers must remain valid until either png_write_info() or png_write_end() is called, which implies that either one can be used to write text chunks to the PNG file. This is indeed the case. Had we wished to put all of our text chunks (or the time chunk) at the end of the PNG file, we would have called png_write_info() first, followed by one or both of png_set_tIME() and png_set_text().

In the case of the latter function,[105] we could do both--that is, call it with one or more text structs before calling png_write_info() and then call it again with one or more new text structs (perhaps a lengthy legal disclaimer to be stored in a zTXt chunk) afterward. Text supplied via png_set_text() before png_write_info() will be written to the PNG file before the IDATs; text supplied after png_write_info() but before png_write_end() will be written after the IDATs. And any png_set_text() or png_set_tIME() calls after png_write_end() will be ignored.

[105] Recall from Chapter 11, "PNG Options and Extensions", that only one tIME chunk is allowed.

Having completed our pre-IDAT housekeeping, we can now turn to our image-data transformations. But unlike our PNG-reading demos, most programs that write PNGs will not require many transformations. In fact, we only call one, and technically there's no point even in that:

    png_set_packing(png_ptr);

This function packs low-bit-depth pixels into bytes. There are no low-bit-depth RGB and RGBA images; only grayscale and palette images support bit depths of 1, 2, or 4. But our main program neither counts colors to see whether a palette-based representation would be possible, nor checks for valid low-bit-depth grayscale values, and it always sets sample_depth to 8, so there is currently no possibility of libpng actually being able to pack any pixels. However, pnmtopng does both, and perhaps a subsequent revision of wpng will, too.
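For readers curious about what packing actually involves, here is a minimal sketch of the idea, not libpng's own code. The helper demo_pack_row() is hypothetical; it takes one sample per input byte and packs them with the leftmost pixel in the most significant bits, which is the order the PNG format requires:

```c
#include <stddef.h>

/* Hypothetical helper: pack one-sample-per-byte pixels of bit depth 1,
 * 2, or 4 into bytes, leftmost pixel in the high-order bits.  out must
 * have room for (npixels * bit_depth + 7) / 8 bytes.  Returns the number
 * of bytes written, or 0 if the bit depth is not 1, 2, or 4. */
size_t demo_pack_row(const unsigned char *in, size_t npixels,
                     int bit_depth, unsigned char *out)
{
    size_t i, nbytes;
    unsigned mask;

    if (bit_depth != 1 && bit_depth != 2 && bit_depth != 4)
        return 0;

    nbytes = (npixels * (size_t)bit_depth + 7) / 8;
    mask = (1u << bit_depth) - 1u;

    for (i = 0; i < nbytes; ++i)
        out[i] = 0;

    for (i = 0; i < npixels; ++i) {
        size_t bitpos = i * (size_t)bit_depth;   /* bit offset within row */
        int shift = 8 - bit_depth - (int)(bitpos % 8);

        out[bitpos / 8] |= (unsigned char)((in[i] & mask) << shift);
    }

    return nbytes;
}
```

Eight 1-bit pixels thus shrink to a single byte, and any unused low-order bits in the final byte of a row are left as zeros.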

The only remaining thing for our initialization function to do is to save copies of the two PNG-struct pointers for passing to libpng functions later:

    mainprog_ptr->png_ptr = png_ptr;
    mainprog_ptr->info_ptr = info_ptr;
  
    return 0;

Once again, we could have used global variables instead, but this program is intended to demonstrate how a multithreaded PNG encoder might be written.

15.6. Interlaced PNG: writepng_encode_image()

Back in the main program, the first thing we do after returning is to free the text buffer, since all of its data has already been written to the PNG file. Then we calculate the number of bytes per row of image data; since we accept only three basic file types, there are only three possibilities for this: either one, three, or four times the image width.
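The bytes-per-row calculation amounts to something like the following sketch. The helper name demo_rowbytes() is hypothetical; the type codes are the same ones used for pnmtype above, and 8-bit samples are assumed:

```c
#include <stddef.h>

/* Hypothetical helper: bytes per row for the three input types the
 * program accepts, assuming 8-bit samples.  Returns 0 for an unknown
 * type. */
size_t demo_rowbytes(int pnmtype, size_t width)
{
    switch (pnmtype) {
    case 5:  return width;      /* grayscale: one byte per pixel */
    case 6:  return 3 * width;  /* RGB: three bytes per pixel */
    case 8:  return 4 * width;  /* RGBA: four bytes per pixel */
    default: return 0;
    }
}
```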

What happens next depends on whether the user requested that the PNG image be interlaced. If so, there's really no good way to read and write the image progressively, so we simply allocate a buffer large enough for the whole thing and read it in. We also allocate and initialize a row_pointers array, where each element points at the beginning of a row of pixels, and then call writepng_encode_image():

int writepng_encode_image(mainprog_info *mainprog_ptr)
{
    png_structp png_ptr = (png_structp)mainprog_ptr->png_ptr;
    png_infop info_ptr = (png_infop)mainprog_ptr->info_ptr;
  
    if (setjmp(mainprog_ptr->jmpbuf)) {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        mainprog_ptr->png_ptr = NULL;
        mainprog_ptr->info_ptr = NULL;
        return 2;
    }
  
    png_write_image(png_ptr, mainprog_ptr->row_pointers);
  
    png_write_end(png_ptr, NULL);
  
    return 0;
}

One can see that the actual process of writing the image data is quite simple. We first restore our two struct pointers; we could simply use them as is, but that would require some ugly typecasts. Next we set up the usual PNG error-handling code, followed by the call that really matters: png_write_image(). This function writes all of the pixel data to the file, reading from the row_pointers array we just set up in the main program. Once that is complete, there is nothing left to do but to write out the end of the PNG file with png_write_end(). As discussed earlier, this will write any new text or time chunks, but not ones that have already been written; in our case, that means it does nothing but write the final IEND chunk. The second parameter to png_write_end() is ordinarily info_ptr, but since we have no extra chunks to write, passing a NULL value is a tiny optimization.
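The row_pointers array itself is set up in the main program, which is not shown here. A minimal sketch of that setup, assuming the whole image has already been read into one contiguous buffer, might look like this; demo_make_row_pointers() is a hypothetical helper, not a function from wpng:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an array of height pointers, each
 * pointing at the start of one row inside the contiguous image_data
 * buffer.  The caller frees the returned array with free().  Returns
 * NULL on bad arguments or allocation failure. */
unsigned char **demo_make_row_pointers(unsigned char *image_data,
                                       size_t rowbytes, size_t height)
{
    unsigned char **rows;
    size_t i;

    if (image_data == NULL || height == 0)
        return NULL;

    rows = (unsigned char **)malloc(height * sizeof *rows);
    if (rows == NULL)
        return NULL;

    for (i = 0; i < height; ++i)
        rows[i] = image_data + i * rowbytes;

    return rows;
}
```

Because the array merely points into image_data, freeing it does not free the pixels; the two allocations are released separately.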

15.7. Noninterlaced PNG: writepng_encode_row()

If the user did not request interlacing, we can read and write the image progressively, allowing very large images to be converted to PNG without incurring a huge memory overhead. In this case, we forgo the row_pointers array and simply allocate image_data large enough to hold one row. Then we start looping over all of the rows in the image (i.e., height rows), reading the pixel data into our buffer and passing it to writepng_encode_row():

int writepng_encode_row(mainprog_info *mainprog_ptr)
{
    png_structp png_ptr = (png_structp)mainprog_ptr->png_ptr;
    png_infop info_ptr = (png_infop)mainprog_ptr->info_ptr;
  
    if (setjmp(mainprog_ptr->jmpbuf)) {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        mainprog_ptr->png_ptr = NULL;
        mainprog_ptr->info_ptr = NULL;
        return 2;
    }
  
    png_write_row(png_ptr, mainprog_ptr->image_data);
  
    return 0;
}

Astute readers will perceive that this function is almost identical to the previous one for interlaced images; the differences are the lack of a png_write_end() call (for obvious reasons) and the call to png_write_row() instead of png_write_image(). image_data now acts as our single row pointer.
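The loop in the main program that drives this function is not shown, but its general shape can be sketched as follows. Everything here is hypothetical: demo_stream_rows() stands in for the main program's loop, the demo_row_fn callback stands in for writepng_encode_row(), and demo_count_row() is a trivial callback included only so the sketch can be exercised on its own:

```c
#include <stdio.h>

/* Hypothetical callback type standing in for writepng_encode_row(). */
typedef int (*demo_row_fn)(const unsigned char *row, size_t rowbytes,
                           void *ctx);

/* Read height rows of rowbytes bytes each from in, reusing the single
 * row buffer rowbuf, and hand each row to encode.  Returns 0 on success,
 * 1 on a short read, or the callback's nonzero return value. */
int demo_stream_rows(FILE *in, unsigned char *rowbuf, size_t rowbytes,
                     unsigned long height, demo_row_fn encode, void *ctx)
{
    unsigned long y;

    for (y = 0; y < height; ++y) {
        int rc;

        if (fread(rowbuf, 1, rowbytes, in) != rowbytes)
            return 1;           /* premature end of input */

        rc = encode(rowbuf, rowbytes, ctx);
        if (rc != 0)
            return rc;          /* encoder reported an error */
    }

    return 0;
}

/* Trivial demonstration callback: counts the rows it is given. */
int demo_count_row(const unsigned char *row, size_t rowbytes, void *ctx)
{
    (void)row;
    (void)rowbytes;
    ++*(unsigned long *)ctx;
    return 0;
}
```

Only one row is ever held in memory, which is the whole point of the noninterlaced path.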

Once the loop over rows completes, we call one last function to close out the PNG file:

int writepng_encode_finish(mainprog_info *mainprog_ptr)
{
    png_structp png_ptr = (png_structp)mainprog_ptr->png_ptr;
    png_infop info_ptr = (png_infop)mainprog_ptr->info_ptr;
  
    if (setjmp(mainprog_ptr->jmpbuf)) {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        mainprog_ptr->png_ptr = NULL;
        mainprog_ptr->info_ptr = NULL;
        return 2;
    }

    png_write_end(png_ptr, NULL);
  
    return 0;
}

Again, the function is exactly like what we've seen before except that it calls png_write_end(). Alternatively, it could have been combined with writepng_encode_row() had we included in our mainprog_info struct a flag indicating whether the given row was the last one in the image.

15.8. writepng_cleanup()

The last tasks for the main program are to clean up the PNG-specific allocations and the main-program-specific ones, which is accomplished via the writepng_cleanup() and wpng_cleanup() functions. The former is very similar to the analogous routine in Chapter 14, "Reading PNG Images Progressively", except that this one calls png_destroy_write_struct(), which has only two arguments:

void writepng_cleanup(mainprog_info *mainprog_ptr)
{
    png_structp png_ptr = (png_structp)mainprog_ptr->png_ptr;
    png_infop info_ptr = (png_infop)mainprog_ptr->info_ptr;
  
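    if (png_ptr && info_ptr) {
        png_destroy_write_struct(&png_ptr, &info_ptr);
        /* clear the saved copies so that a second call is harmless */
        mainprog_ptr->png_ptr = NULL;
        mainprog_ptr->info_ptr = NULL;
    }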
}

wpng_cleanup() closes both input and output files and frees the image_data and row_pointers arrays, assuming they were allocated. Since both cleanup functions are also called as a result of various error conditions, they check for valid pointers before freeing anything and set NULL pointers for anything they do free.
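Since wpng_cleanup() is not listed here, the following is only a sketch of the check-free-and-clear pattern just described. The demo_state struct and demo_cleanup() are hypothetical and contain just enough of the main program's state to show the idea:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical subset of the main program's state. */
typedef struct {
    FILE *infile;
    FILE *outfile;
    unsigned char *image_data;
    unsigned char **row_pointers;
} demo_state;

/* Release whatever is currently held and reset each member to NULL, so
 * the function can safely be called more than once, including from
 * error paths where only some of the members were ever set. */
void demo_cleanup(demo_state *s)
{
    if (s->outfile) {
        fclose(s->outfile);
        s->outfile = NULL;
    }
    if (s->infile) {
        fclose(s->infile);
        s->infile = NULL;
    }
    if (s->image_data) {
        free(s->image_data);
        s->image_data = NULL;
    }
    if (s->row_pointers) {
        free(s->row_pointers);
        s->row_pointers = NULL;
    }
}
```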

15.9. Getting the Source Code

All of the source files for the wpng demo program (wpng.c, writepng.c, writepng.h, and makefiles) are available on the Web, under a BSD-like Open Source license. The files will be available for download from the following URL for the foreseeable future:

http://www.libpng.org/pub/png/pngbook.html

Bug fixes, new features and ports, and other contributions may be integrated into the code, time permitting.


































