make: Automating Your Recipes

make: Automating Your Recipes
Prev	Chapter 15. Tools	Next

Program sources by themselves don't make an application. The way you put them together and package them for distribution matters, too. Unix provides a tool for semi-automating these processes; make(1). Make is covered in most introductory Unix books. For a really thorough reference, you can consult Managing Projects with Make [Oram-Talbot]. If you're using GNU make (the most advanced make, and the one normally shipped with open-source Unixes) the treatment in Programming with GNU Software [Loukides-Oram] may be better in some respects. Most Unixes that carry GNU make will also support GNU Emacs; if yours does you will probably find a complete make manual on-line through Emacs's info documentation system.

Ports of GNU make to DOS and Windows are available from the FSF.

Basic Theory of make

If you're developing in C or C++, an important part of the recipe for building your application will be the collection of compilation and linkage commands needed to get from your sources to working binaries. Entering these commands is a lot of tedious detail work, and most modern development environments include a way to put them in command files or databases that can automatically be re-executed to build your application.

Unix's make(1) program, the original of all these facilities, was designed specifically to help C programmers manage these recipes. It lets you write down the dependencies between files in a project in one or more ‘makefiles’. Each makefile consists of a series of productions; each one tells make that some given target file depends on some set of source files, and says what to do if any of the sources are newer than the target. You don't actually have to write down all dependencies, as the make program can deduce a lot of the obvious ones from filenames and extensions.

For example: You might put in a makefile that the binary myprog depends on three object files myprog.o, helper.o, and stuff.o. If you have source files myprog.c, helper.c, and stuff.c, make will know without being told that each .o file depends on the corresponding .c file, and supply its own standard recipe for building a .o file from a .c file.

	Make originated with a visit from Steve Johnson (author of yacc, etc.), storming into my office, cursing the Fates that had caused him to waste a morning debugging a correct program (bug had been fixed, file hadn't been compiled, *cc .o** was therefore unaffected). As I had spent a part of the previous evening coping with the same disaster on a project I was working on, the idea of a tool to solve it came up. It began with an elaborate idea of a dependency analyzer, boiled down to something much simpler, and turned into Make that weekend. Use of tools that were still wet was part of the culture. Makefiles were text files, not magically encoded binaries, because that was the Unix ethos: printable, debuggable, understandable stuff.
-- Stuart Feldman

When you run make in a project directory, the make program looks at all productions and timestamps and does the minimum amount of work necessary to make sure derived files are up to date.

You can read a good example of a moderately complex makefile in the sources for fetchmail. In the subsections below we'll refer to it again.

Very complex makefiles, especially when they call subsidiary makefiles, can become a source of complications rather than simplifying the build process. A now-classic warning is issued in Recursive Make Considered Harmful.^[136] The argument in this paper has become widely accepted since it was written in 1997, and has come near to reversing previous community practice.

No discussion of make(1) would be complete without an acknowledgement that it includes one of the worst design botches in the history of Unix. The use of tab characters as a required leader for command lines associated with a production means that the interpretation of a makefile can change drastically on the basis of invisible differences in whitespace.

	Why the tab in column 1? Yacc was new, Lex was brand new. I hadn't tried either, so I figured this would be a good excuse to learn. After getting myself snarled up with my first stab at Lex, I just did something simple with the pattern newline-tab. It worked, it stayed. And then a few weeks later I had a user population of about a dozen, most of them friends, and I didn't want to screw up my embedded base. The rest, sadly, is history.
-- Stuart Feldman

make in Non-C/C++ Development

make is not just useful for C/C++ recipes, however. Scripting languages like those we described in Chapter 14 may not require conventional compilation and link steps, but there are often other kinds of dependencies that make(1) can help you with.

Suppose, for example, that you actually generate part of your code from a specification file, using one of the techniques from Chapter 9. You can use make to tie the spec file and the generated source together. This will ensure that whenever you change the spec and remake, the generated code will automatically be rebuilt.

It's quite common to use makefile productions to express recipes for making documentation as well as code. You'll often see this approach used to automatically generate PostScript or other derived documentation from masters written in some markup language (like HTML or one of the Unix document-macro languages we'll survey in Chapter 18). In fact, this sort of use is so common that it's worth illustrating with a case study.

Case Study: make for Document-File Translation

In the fetchmail makefile, for example, you'll see three productions that relate files named FAQ, FEATURES, and NOTES to HTML sources fetchmail-FAQ.html, fetchmail-features.html, and design-notes.html.

The HTML files are meant to be accessible on the fetchmail Web page, but all the HTML markup makes them uncomfortable to look at unless you're using a browser. So the FAQ, FEATURES, and NOTES are flat-text files meant to be flipped through quickly with an editor or pager program by someone reading the fetchmail sources themselves (or, perhaps, distributed to FTP sites that don't support Web access).

The flat-text forms can be made from their HTML masters by using the common open-source program lynx(1). lynx is a Web browser for text-only displays; but when invoked with the -dump option it functions reasonably well as an HTML-to-ASCII formatter.

With the productions in place, the developer can edit the HTML masters without having to remember to manually rebuild the flat-text forms afterwards, secure in the knowledge that FAQ, FEATURES, and NOTES will be properly rebuilt whenever they are needed.

Utility Productions

Some of the most heavily used productions in typical makefiles don't express file dependencies at all. They're ways to bundle up little procedures that a developer wants to mechanize, like making a distribution package or removing all object files in order to do a build from scratch.

	Non-file productions were intentional and in there from day one. ‘Make all’ and ‘clean’ were my own conventions from earliest days. One of the older Unix jokes is “Make love” which results in “Don't know how to make love”.
-- Stuart Feldman

There is a well-developed set of conventions about what utility productions should be present and how they should be named. Following these will make your makefile much easier to understand and use.

all: Your all production should make every executable of your project. Usually the all production doesn't have an explicit rule; instead it refers to all of your project's top-level targets (and, not accidentally, documents what those are). Conventionally, this should be the first production in your makefile, so it will be the one executed when the developer types make with no argument.
test: Run the program's automated test suite, typically consisting of a set of unit tests^[137] to find regressions, bugs, or other deviations from expected behavior during the development process. The ‘test’ production can also be used by end-users of the software to ensure that their installation is functioning correctly.
clean: Remove all files (such as binary executables and object files) that are normally created when you make all. A make clean should reset the process of building the software to a good initial state.
dist: Make a source archive (usually with the tar(1) program) that can be shipped as a unit and used to rebuild the program on another machine. This target should do the equivalent of depending on all so that a make dist automatically rebuilds the whole project before making the distribution archive — this is a good way to avoid last-minute embarrassments, like not shipping derived files that are actually needed (like the flat-text README in fetchmail, which is actually generated from an HTML source).
distclean: Throw away everything but what you would include if you were bundling up the source with make dist. This may be the the same as make clean but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away local configuration files that aren't part of the normal make all build sequence (such as those generated by autoconf(1); we'll talk about autoconf(1) in Chapter 17).
realclean: Throw away everything you can rebuild using the makefile. This may be the same as make distclean, but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away files that are derived but (for whatever reason) shipped with the project sources anyway.
install: Install the project's executables and documentation in system directories so they will be accessible to general users (this typically requires root privileges). Initialize or update any databases or libraries that the executables require in order to function.
uninstall: Remove files installed in system directories by make install (this typically requires root privileges). This should completely and perfectly reverse a make install. The presence of an uninstall production implies a kind of humility that experienced Unix hands look for as a sign of thoughtful design; conversely, not having an uninstall production is at best careless, and (when, for example, an installation creates large database files) can be quite rude and thoughtless.

Working examples of all the standard targets are available for inspection in the fetchmail makefile. By studying all of them together you will see a pattern emerge, and (not incidentally) learn much about the fetchmail package's structure. One of the benefits of using these standard productions is that they form an implicit roadmap of their project.

But you need not limit yourself to these utility productions. Once you master make, you'll find yourself more and more often using the makefile machinery to automate little tasks that depend on your project file state. Your makefile is a convenient central place to put these; using it makes them readily available for inspection and avoids cluttering up your workspace with trivial little scripts.

The Imake tools will be available on any Unix that supports X, including Linux . There has been one heroic effort [DuBois] to make the mysteries of Imake comprehensible to non-X-programming mortals. These are worth learning if you are going to do X programming.

autoconf

autoconf was written by people who had seen and rejected the Imake approach. It generates per-project configure shellscripts that are like the old-fashioned custom script configurators. These configure scripts can generate makefiles (among other things).

Autoconf is focused on portability and does no built-in dependency derivation at all. Although it is probably as complex as Imake, it is much more flexible and easier to extend. Rather than relying on a per-system database of rules, it generates configure shell code that goes out and searches your system for things.

Each configure shellscript is built from a per-project template that you have to write, called configure.in. Once generated, though, the configure script will be self-contained and can configure your project on systems that don't carry autoconf(1) itself.

The autoconf approach to makefile generation is like imake's in that you start by writing a makefile template for your project. But autoconf's Makefile.in files are basically just makefiles with placeholders in them for simple text substitution; there's no second notation to learn. If you want dependency derivation, you must take explicit steps to call makedepend(1) or some similar tool — or use automake(1).

autoconf is documented by an on-line manual in the GNU info format. The source scripts of autoconf are available from the FSF archive site, but are also preinstalled on many Unix and Linux versions. You should be able to browse this manual through your Emacs's help system.

Despite its lack of direct support for dependency derivation, and despite its generally ad-hoc approach, in mid-2003 autoconf is clearly the most popular of the makefile generators, and has been for some years. It has eclipsed Imake and driven at least one major competitor (metaconfig) out of use.

A reference, GNU Autoconf, Automake and Libtool is available [Vaughan]. We'll have more to say about autoconf, from a slightly different angle, in Chapter 17.

automake

automake is an attempt to add Imake-like dependency derivation as a layer on top of autoconf(1). You write Makefile.am templates in a broadly Imake-like notation; automake(1) compiles them to Makefile.in files, which autoconf's configure scripts then operate on.

automake is still relatively new technology in mid-2003. It is used in several FSF projects but has not yet been widely adopted elsewhere. While its general approach looks promising, it is as yet rather brittle — it works when used in stereotyped ways but tends to break badly if you try to do anything unusual with it.

Complete on-line documentation is shipped with automake, which can be downloaded from the FSF archive site.

^[136] Available on the Web.

^[137]A unit test is test code attached to a module to verify correct performance. Use of the term ‘unit test’ suggests that the test is written concurrently with the code by the developer of the code, and implies a discipline in which module releases aren't considered complete until they have attached test code. The term and the concept originated in the “Extreme Programming” methodology popularized by Kent Beck, but has gained wide acceptance among Unix programmers since about 2001.

Prev	Up	Next
Special-Purpose Code Generators	Home	Version-Control Systems