Evolution of C

The central fact of the Unix programming experience has always been the stability of the C language and the handful of service interfaces that always travel with it (notably, the standard I/O library and friends). The fact that a language originated in 1973 has required as little change as this one has in thirty years of heavy use is truly remarkable, and without parallels anywhere else in computer science or engineering.

In Chapter 4, we argued that C has been successful because it acts as a layer of thin glue over computer hardware approximating the “standard architecture” of [BlaauwBrooks]. There is, of course, more to the story than that. To understand the rest of the story, we'll need to take a brief look at the history of C.

C began life in 1971 as a systems-programming language for the PDP-11 port of Unix, based on Ken Thompson's earlier B interpreter which had in turn been modeled on BCPL, the Basic Common Programming Language designed at Cambridge University in 1966-67.[142]

Dennis Ritchie's original C compiler (often called the “DMR” compiler after his initials) served the rapidly growing community around Unix versions 5, 6, and 7. Version 6 C spawned Whitesmiths C, a reimplementation that became the first commercial C compiler and the nucleus of IDRIS, the first Unix workalike. But most modern C implementations are patterned on Steven C. Johnson's “portable C compiler” (PCC) which debuted in Version 7 and replaced the DMR compiler entirely in both System V and the BSD 4.x releases.

In 1976, Version 6 C introduced the typedef, union, and unsigned int declarations. The approved syntax for variable initializations and some compound operators also changed.

The original description of C was Brian Kernighan and Dennis M. Ritchie's original The C Programming Language aka “the White Book” [Kernighan-Ritchie]. It was published in 1978, the same year the Whitemiths C compiler became available.

The White Book described enhanced Version 6 C, with one significant exception involving the handling of public storage. Ritchie's original intention had been to model C's rules on FORTRAN COMMON declarations, on the theory that any machine that could handle FORTRAN would be ready for C. In the common-block model, a public variable may be declared multiple times; identical declarations are merged by the linker. But two early C ports (to Honeywell and IBM 360 mainframes) happened to be to machines with very limited common storage or a primitive linker or both. Thus, the Version 6 C compiler was moved to the stricter definition-reference model (requiring at most one definition of any given public variable and the extern keyword tagging references to it) described in [Kernighan-Ritchie].

This decision was reversed in the C compiler that shipped with Version 7 after it developed that a great deal of existing source depended on the looser rules. Pressure for backward-compatibility would foil yet another attempt to switch (in 1983's System V Release 1) before the ANSI Draft Standard finally settled on definition-reference rules in 1988. Common-block public storage is still admitted as an acceptable variation by the standard.

V7 C introduced enum and treated struct and union values as first-class objects that could be assigned, passed as arguments, and returned from functions (rather than being passed around by address).

 

Another major change in V7 was that Unix data structure declarations were now documented on header files, and included. Previous Unixes had actually printed the data structures (e.g., for directories) in the manual, from which people would copy it into their code. Needless to say, this was a major portability problem.

 
-- Steve Johnson  

The System III C version of the PCC compiler (which also shipped with BSD 4.1c) changed the handling of struct declarations so that members with the same names in different structs would not clash. It also introduced void and unsigned char declarations. The scope of extern declarations local to a function was restricted to the function, and no longer included all code following it.

The ANSI C Draft Proposed Standard added const (for read-only storage) and volatile (for locations such as memory-mapped I/O registers that might be modified asynchronously from the thread of program control). The unsigned type modifier was generalized to apply to any type, and a symmetrical signed was added. Initialization syntax for auto array and structure initializers and union types was added. Most importantly, function prototypes were added.

The most important changes in early C were the switch to definition-reference and the introduction of function prototypes in the Draft Proposed ANSI C Standard. The language has been essentially stable since copies of the X3J11 committee's working papers on the Draft Proposed Standard signaled the committee's intentions to compiler implementers in 1985-1986.

A more detailed history of early C, written by its designer, can be found at [Ritchie93].

C standards development has been a conservative process with great care taken to preserve the spirit of the original C language, and an emphasis on ratifying experiments in existing compilers rather than inventing new features. The C9X charter[143] document is an excellent expression of this mission.

Work on the first official C standard began in 1983 under the auspices of the X3J11 ANSI committee. The major functional additions to the language were settled by the end of 1986, at which point it became common for programmers to distinguish between “K&R C” and “ANSI C”.

 

Many people don't realize how unusual the C standardization effort, especially the original ANSI C work, was in its insistence on standardizing only tested features. Most language standard committees spend much of their time inventing new features, often with little consideration of how they might be implemented. Indeed, the few ANSI C features that were invented from scratch — e.g., the notorious “trigraphs”—were the most disliked and least successful features of C89.

 
-- Henry Spencer  
 

Void pointers were invented as part of the standards effort, and have been a winner. But Henry's point is still well taken.

 
-- Steve Johnson  

While the core of ANSI C was settled early, arguments over the contents of the standard libraries dragged on for years. The formal standard was not issued until the end of 1989, well after most compilers had implemented the 1985 recommendations. The standard was originally known as ANSI X3.159, but was redesignated ISO/IEC 9899:1990 when the International Standards Organization (ISO) took over sponsorship in 1990. The language variant it describes is generally known as C89 or C90.

The first book on C and Unix portability practice, Portable C and Unix Systems Programming [Lapin], was published in 1987 (I wrote it under a corporate pseudonym forced on me by my employers at the time). The Second Edition of [Kernighan-Ritchie] came out in 1988.

A very minor revision of C89, known as Amendment 1, AM1, or C93, was floated in 1993. It added more support for wide characters and Unicode. This became ISO/IEC 9899-1:1994.

Revision of the C89 standard began in 1993. In 1999, ISO/IEC 9899 (generally known as C99) was adopted by ISO. It incorporated Amendment 1, and added a great many minor features. Perhaps the most significant one for most programmers is the C++-like ability to declare variables at any point in a block, rather than just at the beginning. Macros with a variable number of arguments were also added.

The C9X working group has a Web page, but no third standards effort is planned as of mid-2003. They are developing an addendum on C for embedded systems.

Standardization of C has been greatly aided by the fact that working and largely compatible implementations were running on a wide variety of systems before standards work was begun. This made it harder to argue about what features should be in the standard.



[142] The ‘C’ in C therefore stands for Common — or, perhaps, for ‘Christopher’. BCPL originally stood for “Bootstrap CPL”—a much simplified version of CPL, the very interesting but overly ambitious and never implemented Common Programming Language of Oxford and Cambridge, also known affectionately as “Christopher's Programming Language” after its prime advocate, computer-science pioneer Christopher Strachey.