make: Automating Your Recipes

Program sources by themselves don't make an application. The way you put them together and package them for distribution matters, too. Unix provides a tool for semi-automating these processes; make(1). Make is covered in most introductory Unix books. For a really thorough reference, you can consult Managing Projects with Make [Oram-Talbot]. If you're using GNU make (the most advanced make, and the one normally shipped with open-source Unixes) the treatment in Programming with GNU Software [Loukides-Oram] may be better in some respects. Most Unixes that carry GNU make will also support GNU Emacs; if yours does you will probably find a complete make manual on-line through Emacs's info documentation system.

Ports of GNU make to DOS and Windows are available from the FSF.

If you're developing in C or C++, an important part of the recipe for building your application will be the collection of compilation and linkage commands needed to get from your sources to working binaries. Entering these commands is a lot of tedious detail work, and most modern development environments include a way to put them in command files or databases that can automatically be re-executed to build your application.

Unix's make(1) program, the original of all these facilities, was designed specifically to help C programmers manage these recipes. It lets you write down the dependencies between files in a project in one or more ‘makefiles’. Each makefile consists of a series of productions; each one tells make that some given target file depends on some set of source files, and says what to do if any of the sources are newer than the target. You don't actually have to write down all dependencies, as the make program can deduce a lot of the obvious ones from filenames and extensions.

For example: You might put in a makefile that the binary myprog depends on three object files myprog.o, helper.o, and stuff.o. If you have source files myprog.c, helper.c, and stuff.c, make will know without being told that each .o file depends on the corresponding .c file, and supply its own standard recipe for building a .o file from a .c file.

 

Make originated with a visit from Steve Johnson (author of yacc, etc.), storming into my office, cursing the Fates that had caused him to waste a morning debugging a correct program (bug had been fixed, file hadn't been compiled, cc *.o was therefore unaffected). As I had spent a part of the previous evening coping with the same disaster on a project I was working on, the idea of a tool to solve it came up. It began with an elaborate idea of a dependency analyzer, boiled down to something much simpler, and turned into Make that weekend. Use of tools that were still wet was part of the culture. Makefiles were text files, not magically encoded binaries, because that was the Unix ethos: printable, debuggable, understandable stuff.

 
-- Stuart Feldman  

When you run make in a project directory, the make program looks at all productions and timestamps and does the minimum amount of work necessary to make sure derived files are up to date.

You can read a good example of a moderately complex makefile in the sources for fetchmail. In the subsections below we'll refer to it again.

Very complex makefiles, especially when they call subsidiary makefiles, can become a source of complications rather than simplifying the build process. A now-classic warning is issued in Recursive Make Considered Harmful.[136] The argument in this paper has become widely accepted since it was written in 1997, and has come near to reversing previous community practice.

No discussion of make(1) would be complete without an acknowledgement that it includes one of the worst design botches in the history of Unix. The use of tab characters as a required leader for command lines associated with a production means that the interpretation of a makefile can change drastically on the basis of invisible differences in whitespace.

 

Why the tab in column 1? Yacc was new, Lex was brand new. I hadn't tried either, so I figured this would be a good excuse to learn. After getting myself snarled up with my first stab at Lex, I just did something simple with the pattern newline-tab. It worked, it stayed. And then a few weeks later I had a user population of about a dozen, most of them friends, and I didn't want to screw up my embedded base. The rest, sadly, is history.

 
-- Stuart Feldman  

make is not just useful for C/C++ recipes, however. Scripting languages like those we described in Chapter 14 may not require conventional compilation and link steps, but there are often other kinds of dependencies that make(1) can help you with.

Suppose, for example, that you actually generate part of your code from a specification file, using one of the techniques from Chapter 9. You can use make to tie the spec file and the generated source together. This will ensure that whenever you change the spec and remake, the generated code will automatically be rebuilt.

It's quite common to use makefile productions to express recipes for making documentation as well as code. You'll often see this approach used to automatically generate PostScript or other derived documentation from masters written in some markup language (like HTML or one of the Unix document-macro languages we'll survey in Chapter 18). In fact, this sort of use is so common that it's worth illustrating with a case study.

Some of the most heavily used productions in typical makefiles don't express file dependencies at all. They're ways to bundle up little procedures that a developer wants to mechanize, like making a distribution package or removing all object files in order to do a build from scratch.

 

Non-file productions were intentional and in there from day one. ‘Make all’ and ‘clean’ were my own conventions from earliest days. One of the older Unix jokes is “Make love” which results in “Don't know how to make love”.

 
-- Stuart Feldman  

There is a well-developed set of conventions about what utility productions should be present and how they should be named. Following these will make your makefile much easier to understand and use.

all

Your all production should make every executable of your project. Usually the all production doesn't have an explicit rule; instead it refers to all of your project's top-level targets (and, not accidentally, documents what those are). Conventionally, this should be the first production in your makefile, so it will be the one executed when the developer types make with no argument.

test

Run the program's automated test suite, typically consisting of a set of unit tests[137] to find regressions, bugs, or other deviations from expected behavior during the development process. The ‘test’ production can also be used by end-users of the software to ensure that their installation is functioning correctly.

clean

Remove all files (such as binary executables and object files) that are normally created when you make all. A make clean should reset the process of building the software to a good initial state.

dist

Make a source archive (usually with the tar(1) program) that can be shipped as a unit and used to rebuild the program on another machine. This target should do the equivalent of depending on all so that a make dist automatically rebuilds the whole project before making the distribution archive — this is a good way to avoid last-minute embarrassments, like not shipping derived files that are actually needed (like the flat-text README in fetchmail, which is actually generated from an HTML source).

distclean

Throw away everything but what you would include if you were bundling up the source with make dist. This may be the the same as make clean but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away local configuration files that aren't part of the normal make all build sequence (such as those generated by autoconf(1); we'll talk about autoconf(1) in Chapter 17).

realclean

Throw away everything you can rebuild using the makefile. This may be the same as make distclean, but should be included as a production of its own anyway, to document what's going on. When it's different, it usually differs by throwing away files that are derived but (for whatever reason) shipped with the project sources anyway.

install

Install the project's executables and documentation in system directories so they will be accessible to general users (this typically requires root privileges). Initialize or update any databases or libraries that the executables require in order to function.

uninstall

Remove files installed in system directories by make install (this typically requires root privileges). This should completely and perfectly reverse a make install. The presence of an uninstall production implies a kind of humility that experienced Unix hands look for as a sign of thoughtful design; conversely, not having an uninstall production is at best careless, and (when, for example, an installation creates large database files) can be quite rude and thoughtless.

Working examples of all the standard targets are available for inspection in the fetchmail makefile. By studying all of them together you will see a pattern emerge, and (not incidentally) learn much about the fetchmail package's structure. One of the benefits of using these standard productions is that they form an implicit roadmap of their project.

But you need not limit yourself to these utility productions. Once you master make, you'll find yourself more and more often using the makefile machinery to automate little tasks that depend on your project file state. Your makefile is a convenient central place to put these; using it makes them readily available for inspection and avoids cluttering up your workspace with trivial little scripts.

One of the subtle advantages of Unix make over the dependency databases built into many IDEs is that makefiles are simple text files — files that can be generated by programs.

In the mid-1980s it was fairly common for large Unix program distributions to include elaborate custom shellscripts that would probe their environment and use the information they gathered to construct custom makefiles. These custom configurators reached absurd sizes. I wrote one once that was 3000 lines of shell, about twice as large as any single module in the program it was configuring — and this was not unusual.

The community eventually said “Enough!” and various people set out to write tools that would automate away part or all of the process of maintaining makefiles. These tools generally tried to address two issues:

One issue is portability. Makefile generators are commonly built to run on many different hardware platforms and Unix variants. They generally try to deduce things about the local system (including everything from machine word size up to which tools, languages, service libraries, and even document formatters it has available). They then try to use those deductions to write makefiles that exploit the local system's facilities and compensate for its quirks.

The other issue is dependency derivation. It's possible to deduce a great deal about the dependencies of a collection of C sources by analyzing the sources themselves (especially by looking at what include files they use and share). Many makefile generators do this in order to mechanically generate make dependencies.

Each different makefile generator tackles these objectives in a slightly different way. Probably a dozen or more generators have been attempted, but most proved inadequate or too difficult to drive or both, and only a few are still in live use. We'll survey the major ones here. All are available as open-source software on the Internet.

autoconf was written by people who had seen and rejected the Imake approach. It generates per-project configure shellscripts that are like the old-fashioned custom script configurators. These configure scripts can generate makefiles (among other things).

Autoconf is focused on portability and does no built-in dependency derivation at all. Although it is probably as complex as Imake, it is much more flexible and easier to extend. Rather than relying on a per-system database of rules, it generates configure shell code that goes out and searches your system for things.

Each configure shellscript is built from a per-project template that you have to write, called configure.in. Once generated, though, the configure script will be self-contained and can configure your project on systems that don't carry autoconf(1) itself.

The autoconf approach to makefile generation is like imake's in that you start by writing a makefile template for your project. But autoconf's Makefile.in files are basically just makefiles with placeholders in them for simple text substitution; there's no second notation to learn. If you want dependency derivation, you must take explicit steps to call makedepend(1) or some similar tool — or use automake(1).

autoconf is documented by an on-line manual in the GNU info format. The source scripts of autoconf are available from the FSF archive site, but are also preinstalled on many Unix and Linux versions. You should be able to browse this manual through your Emacs's help system.

Despite its lack of direct support for dependency derivation, and despite its generally ad-hoc approach, in mid-2003 autoconf is clearly the most popular of the makefile generators, and has been for some years. It has eclipsed Imake and driven at least one major competitor (metaconfig) out of use.

A reference, GNU Autoconf, Automake and Libtool is available [Vaughan]. We'll have more to say about autoconf, from a slightly different angle, in Chapter 17.



[137] A unit test is test code attached to a module to verify correct performance. Use of the term ‘unit test’ suggests that the test is written concurrently with the code by the developer of the code, and implies a discipline in which module releases aren't considered complete until they have attached test code. The term and the concept originated in the “Extreme Programming” methodology popularized by Kent Beck, but has gained wide acceptance among Unix programmers since about 2001.