Internationalization

All user-visible strings in a Gnome application should be marked for translation. Translation is achieved using the GNU gettext facility. gettext is simply a message catalog; it stores key-value pairs, where the key is the string hard-coded into the program, and the value is a translated string (if appropriate) or simply the key (if there's no translation, or the key is already in the correct language).

As a programmer, it's not your responsibility to provide translations. However, you must make sure strings are marked for translation, so that gettext's scripts can extract a list of strings to be translated, and you must call a special function on each string when the catalog lookup should take place.

#include <libgnome/gnome-i18n.h>

_(string);

N_(string);

Figure 2. Translation Macros

Gnome makes this easy by defining the two macros shown in Figure 2. The macro _() both marks the string for translation and performs the message-catalog lookup. You should use it in any context where C permits a function call. The N_() macro is a no-op, but marks the string for translation. You can use it where C does not permit a function call, for example in static array initializers. If you mark a string for translation with N_(), you must eventually call _() on it to actually perform the lookup.
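
To make the division of labor concrete, here is a minimal sketch of how macros of this kind are commonly built on top of plain gettext(). It is an illustration only: the real definitions in libgnome/gnome-i18n.h may differ (for example, by naming the text domain explicitly), and menu_labels and menu_label() are hypothetical names invented for this sketch.

```c
#include <stddef.h>
#include <libintl.h>

/* Assumed definitions, for illustration; not copied from gnome-i18n.h. */
#define _(String)  gettext(String)   /* mark the string and look it up now */
#define N_(String) (String)          /* mark only; expands to the literal */

/* A static initializer cannot contain a function call, so use N_(). */
static const char *menu_labels[] = {
  N_("Open"),
  N_("Save"),
  N_("Quit")
};

/* Hypothetical helper: perform the deferred lookup at run time.
   Returns NULL if the index is out of range. */
const char *menu_label(size_t i)
{
  size_t n = sizeof(menu_labels) / sizeof(menu_labels[0]);

  if (i >= n)
    return NULL;
  return _(menu_labels[i]);
}
```

When no catalog is available, gettext() returns its argument unchanged, so menu_label() simply hands back the English label.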

Here's a simple example:


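#include <gnome.h>

/* N_() only marks these literals; a static initializer cannot call a function. */
static char* a[] = {
  N_("Translate Me"),
  N_("Me Too")
};

int main(int argc, char** argv)
{
  bindtextdomain(PACKAGE, GNOMELOCALEDIR);
  textdomain(PACKAGE);

  /* Mark and look up in a single step. */
  printf(_("Translated String\n"));
  /* Look up the strings marked earlier with N_(). */
  printf("%s\n", _(a[0]));
  printf("%s\n", _(a[1]));

  return 0;
}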

Notice that the string literals "Translate Me" and "Me Too" are marked so that gettext can find them and produce a list of strings to be translated. Translators will use this list to create the actual translations. Later, _() performs the translation lookup on each member of the array. Since a function call is allowed where the string literal "Translated String" appears, marking and lookup happen there in a single step.

At the beginning of your program, you have to call bindtextdomain() and textdomain() as shown in the above example. In the above code, PACKAGE is a string naming the package the program belongs to, typically defined in config.h (see the chapter called Creating Your Source Tree). You must arrange to define GNOMELOCALEDIR, typically in your Makefile.am ($(prefix)/share/locale, or $(datadir)/locale, is the standard value). Translation catalogs are installed under GNOMELOCALEDIR.
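
The startup calls can be gathered into one small function. The following is a sketch under stated assumptions, not a required pattern: init_translations() is a hypothetical name, the fallback values for PACKAGE and GNOMELOCALEDIR are placeholders so the fragment compiles on its own, and the setlocale() call assumes nothing else in the program has already selected the user's locale.

```c
#include <locale.h>
#include <libintl.h>

/* Normally supplied by config.h and the build system; the fallback
   values here are placeholders so the sketch compiles on its own. */
#ifndef PACKAGE
#define PACKAGE "myapp"
#endif
#ifndef GNOMELOCALEDIR
#define GNOMELOCALEDIR "/usr/share/locale"
#endif

/* Hypothetical helper: call once, early in main().
   Returns 0 on success, -1 if gettext could not record the settings. */
int init_translations(void)
{
  /* Adopt the user's locale from the environment; assumed necessary
     here because nothing else in this sketch sets it. */
  setlocale(LC_ALL, "");

  /* Tell gettext where catalogs for PACKAGE are installed. */
  if (bindtextdomain(PACKAGE, GNOMELOCALEDIR) == NULL)
    return -1;

  /* Make PACKAGE the default domain for later lookups. */
  if (textdomain(PACKAGE) == NULL)
    return -1;

  return 0;
}
```

Both bindtextdomain() and textdomain() return NULL only when they fail to store the new setting, so the checks above guard against that case rather than against a missing catalog.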

When marking strings for translation, you must make sure your strings are translatable. Avoid constructing a string at runtime via concatenation. For example, do not do this:


  gchar* message = g_strconcat(_("There is an error on device "), 
                               device, NULL);

The problem is that in some languages it may be correct to put the name of the device first (or in the middle). If you use g_snprintf() or g_strdup_printf() instead of concatenation, the translator can change the word order. Here's the right way to do it:


  gchar* message = g_strdup_printf(_("There is an error on device %s"),
                                   device);

Now the translator can move %s as needed.
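
With more than one substitution, moving a conversion is not enough; the translator also needs a way to say which argument goes where. A minimal sketch, assuming a printf() family that supports the POSIX numbered-argument syntax (%1$d, %2$s), which is not part of ISO C. The function name and the message text are hypothetical, and plain snprintf() is used here because it is an assumption that every formatting helper accepts numbered arguments.

```c
#include <stdio.h>

/* Hypothetical helper: format a two-argument message.  "fmt" stands
   in for the string a catalog lookup would return.  The arguments are
   always passed in the same order (count, then device); the format
   decides the order in which they appear in the output. */
int format_device_errors(char *buf, size_t size, const char *fmt,
                         int n_errors, const char *device)
{
  return snprintf(buf, size, fmt, n_errors, device);
}
```

The original string could be "%d errors on device %s", while a translation that needs the device first could be written as "%2$s: %1$d errors". The calling code is the same in both cases.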

Complicated syntax-on-the-fly should be avoided whenever possible. For example, translating this is a major problem:


  printf(_("There %s %d dog%s\n"), 
         n_dogs > 1 ? _("were") : _("was"),
         n_dogs, 
         n_dogs > 1 ? _("s") : "");

It is better to move the conditional out of the printf():


  if (n_dogs != 1)
    printf(_("There were %d dogs\n"), n_dogs);
  else
    printf(_("There was 1 dog\n"));

However, as the gettext manual points out, even this will not always work; some languages will distinguish more categories than "exactly one" and "more than one" (that is, they might have a word form for "exactly two" in addition to English's singular and plural forms). That manual suggests that a lookup table indexed by the number you plan to use might work in some cases:


static const char* ndogs_phrases[] = {
  N_("There were no dogs.\n"),
  N_("There was one dog.\n"),
  N_("There were two dogs.\n"),
  N_("There were three dogs.\n")
};

As you can see, this rapidly becomes unpleasant to deal with. Avoid it if you can. The gettext documentation has more examples, if you find yourself in a hairy situation.
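
Later releases of GNU gettext added a function, ngettext(), aimed at exactly this problem: the program supplies the singular and plural English forms plus the number, and the catalog decides which of its own forms applies. The sketch below assumes your libintl provides ngettext(); dogs_format() is a hypothetical name.

```c
#include <libintl.h>

/* Hypothetical helper: choose the message form for n_dogs.  The two
   English forms are the fallback when no catalog is loaded; a catalog
   may define more forms for languages that need them. */
const char *dogs_format(unsigned long n_dogs)
{
  return ngettext("There was %lu dog.\n",
                  "There were %lu dogs.\n",
                  n_dogs);
}
```

A caller would then write printf(dogs_format(n_dogs), n_dogs), passing the same number to both the lookup and the formatting step.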

Internationalization must also be considered when parsing or displaying certain kinds of data, including dates and decimal numbers. In general, the C library provides sufficient facilities to deal with this: use strftime(), strcoll(), and so on to handle these cases. A good C or POSIX book will explain them. The glib GDate facility handles dates using strftime() internally.
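
As a small illustration of the C library route, the sketch below formats a date with strftime() and the %x conversion, which stands for the current locale's preferred date representation. The name format_local_date() is hypothetical, and the result only reflects the user's preferences if the program has already selected the user's locale with setlocale().

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a date the way the current locale
   prefers.  Returns the number of characters written, or 0 if the
   buffer was too small. */
size_t format_local_date(char *buf, size_t size, const struct tm *when)
{
  return strftime(buf, size, "%x", when);
}
```

The point of the design is that the program never spells out a day/month/year order itself; that decision is left to the locale.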

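One common mistake to avoid: don't use locale-dependent functions when reading and writing files. For example, printf() and scanf() adjust their decimal number format for the locale (some locales write the decimal separator as a comma rather than a period), so you can't use this format in files. A file written under one locale may not be readable under another; for instance, users in much of Europe may be unable to read files created in the United States.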
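
One possible way around the problem is to switch the numeric locale to "C" for the duration of the file I/O and restore it afterwards. This is a sketch of that idea, not the only approach; format_double_for_file() is a hypothetical name, and the fixed-size buffer for the saved locale name is an assumption that locale names are short.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: format a double with '.' as the decimal
   separator regardless of the user's locale, for use in files.
   Temporarily switches LC_NUMERIC to "C".  setlocale() changes
   process-wide state, so this is not safe while other threads are
   formatting numbers. */
int format_double_for_file(char *buf, size_t size, double value)
{
  char saved[128];
  const char *current = setlocale(LC_NUMERIC, NULL);
  int n;

  /* Copy the name: the returned string may be overwritten by the
     next setlocale() call. */
  snprintf(saved, sizeof saved, "%s", current ? current : "C");

  setlocale(LC_NUMERIC, "C");
  n = snprintf(buf, size, "%.17g", value);
  setlocale(LC_NUMERIC, saved);

  return n;
}
```

The same bracketing applies when reading the number back with scanf() or strtod(), so that files written on one machine parse the same way on another.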