Chapter 7. Multiprogramming

Separating Processes to Separate Function

Table of Contents

Separating Complexity Control from Performance Tuning
Taxonomy of Unix IPC Methods
Handing off Tasks to Specialist Programs
Pipes, Redirection, and Filters
Security Wrappers and Bernstein Chaining
Slave Processes
Peer-to-Peer Inter-Process Communication
Problems and Methods to Avoid
Obsolescent Unix IPC Methods
Remote Procedure Calls
Threads — Threat or Menace?
Process Partitioning at the Design Level

If we believe in data structures, we must believe in independent (hence simultaneous) processing. For why else would we collect items within a structure? Why do we tolerate languages that give us the one without the other?

-- Alan Perlis, Epigrams in Programming, in ACM SIGPLAN (Vol 17 #9, 1982)

The most characteristic program-modularization technique of Unix is splitting large programs into multiple cooperating processes. This has usually been called ‘multiprocessing’ in the Unix world, but in this book we revive the older term ‘multiprogramming’ to avoid confusion with multiprocessor hardware implementations.

Multiprogramming is a particularly murky area of design, one in which there are few guidelines to good practice. Many programmers with excellent judgment about how to break up code into subroutines nevertheless wind up writing whole applications as monster single-process monoliths that founder on their own internal complexity.

The Unix style of design applies the do-one-thing-well approach at the level of cooperating programs as well as cooperating routines within a program, emphasizing small programs connected by well-defined interprocess communication or by shared files. Accordingly, the Unix operating system encourages us to break our programs into simpler subprocesses, and to concentrate on the interfaces between these subprocesses. It does this in at least three fundamental ways:

Inexpensive process-spawning and easy process control are critical enablers for the Unix style of programming. On an operating system such as VAX VMS, where starting processes is expensive and slow and requires special privileges, one must build monster monoliths because one has no choice. Fortunately the trend in the Unix family has been toward lower fork(2) overhead rather than higher. Linux, in particular, is famously efficient this way, with a process-spawn faster than thread-spawning on many other operating systems.[65]
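To get a rough feel for what process-spawning costs on a given machine, one can time a series of fork/exit/wait cycles. The following is a minimal sketch in Python, not a rigorous benchmark; the function name `spawn_and_reap` is hypothetical, and the figures it reports will vary with hardware, kernel, and the size of the parent process.

```python
import os
import time


def spawn_and_reap(count):
    """Fork `count` children that exit at once.

    Returns (elapsed_seconds, exit_codes). Illustrative helper, Unix only.
    """
    exit_codes = []
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: leave immediately, skipping the parent's cleanup handlers.
            os._exit(0)
        # Parent: reap this child before spawning the next, so no zombies pile up.
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return time.perf_counter() - start, exit_codes


if __name__ == "__main__":
    elapsed, codes = spawn_and_reap(200)
    print(f"{len(codes)} fork/exit/wait cycles in {elapsed:.3f}s")
```

Because each child is reaped before the next is spawned, the elapsed time approximates the serial cost of a full spawn-and-exit cycle rather than of fork(2) alone.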

Historically, many Unix programmers have been encouraged to think in terms of multiple cooperating processes by experience with shell programming. The shell makes it relatively easy to set up groups of multiple processes connected by pipes, running in the background, in the foreground, or in a mix of the two.
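What the shell does for a pipeline such as `sort | uniq` can be spelled out by hand: start each program as its own process and connect the standard output of one to the standard input of the next. The sketch below does this from Python with the standard `subprocess` module. The helper name `sort_unique` is hypothetical, and the example assumes the usual `sort` and `uniq` utilities are on the PATH.

```python
import os
import subprocess


def sort_unique(text):
    """Run the equivalent of `sort | uniq` on `text` and return its output."""
    # Fixed locale so the sort order does not depend on the caller's environment.
    env = dict(os.environ, LC_ALL="C")
    sorter = subprocess.Popen(
        ["sort"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, env=env
    )
    uniq = subprocess.Popen(
        ["uniq"], stdin=sorter.stdout, stdout=subprocess.PIPE, env=env
    )
    # Drop the parent's copy of the connecting pipe; only uniq should hold it.
    sorter.stdout.close()
    # Feed the first stage, then close its input so it sees end-of-file.
    sorter.stdin.write(text.encode())
    sorter.stdin.close()
    output = uniq.communicate()[0]
    sorter.wait()
    return output.decode()


if __name__ == "__main__":
    print(sort_unique("pear\napple\npear\n"), end="")
```

Writing all the input before reading any output is safe here only because the input is small and `sort` must read everything before it emits anything; a general-purpose version would need to feed and drain the pipeline concurrently.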

In the remainder of this chapter, we'll look at the implications of cheap process-spawning and discuss how and when to apply pipes, sockets, and other interprocess communication (IPC) methods to partition your design into cooperating processes. (In the next chapter, we'll apply the same separation-of-functions philosophy to interface design.)

While the benefit of breaking programs up into cooperating processes is a reduction in global complexity, the cost is that we have to pay more attention to the design of the protocols that pass information and commands between processes. (In software systems of all kinds, bugs collect at interfaces.)

In Chapter 5 we looked at the lower level of this design problem — how to lay out application protocols that are transparent, flexible and extensible. But there is a second, higher level to the problem which we blithely ignored. That is the problem of designing state machines for each side of the communication.

It is not hard to apply good style to the syntax of application protocols, given models like SMTP or BEEP or XML-RPC. The real challenge is not protocol syntax but protocol logic—designing a protocol that is both sufficiently expressive and deadlock-free. Almost as importantly, the protocol has to be seen to be expressive and deadlock-free; human beings attempting to model the behavior of the communicating programs in their heads and verify its correctness must be able to do so.

In our discussion, therefore, we will focus on the kinds of protocol logic one naturally uses with each kind of interprocess communication.
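As a small illustration of protocol logic that is easy to verify in one's head, consider a strict lock-step exchange: the client sends one line, then waits for exactly one reply line before sending again. At any moment only one side is reading, so the two state machines cannot end up both waiting on each other. The sketch below is an assumption-laden toy, not a protocol from this book; the verbs `ECHO`, `QUIT`, and `BYE` and the function names are invented for the example.

```python
import os
import socket


def serve(conn):
    """Server side: read one request line, write one reply line, repeat."""
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        while True:
            line = reader.readline()
            if not line:
                return  # peer closed the connection without saying QUIT
            request = line.rstrip("\n")
            reply = "BYE" if request == "QUIT" else f"ECHO {request}"
            writer.write(reply + "\n")
            writer.flush()
            if request == "QUIT":
                return


def run_client(conn, requests):
    """Client side: send one line, then block until its reply arrives."""
    replies = []
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        for request in list(requests) + ["QUIT"]:
            writer.write(request + "\n")
            writer.flush()
            replies.append(reader.readline().rstrip("\n"))
    return replies


def demo(requests):
    """Fork a server process and talk to it over a socket pair."""
    client_sock, server_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        try:
            client_sock.close()
            serve(server_sock)
        finally:
            os._exit(0)  # never fall back into the parent's code path
    server_sock.close()
    replies = run_client(client_sock, requests)
    os.waitpid(pid, 0)
    return replies


if __name__ == "__main__":
    print(demo(["hello", "world"]))
```

The price of this simplicity is throughput: the client cannot have more than one request in flight. Protocols that allow pipelining or unsolicited messages are more expressive, but their state machines are correspondingly harder to check for deadlock.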

[65] See, for example, the results quoted in Improving Context Switching Performance of Idle Tasks under Linux [Appleton].