Common Lisp the Language, 2nd Edition



22.1.3. Macro Characters

If the reader encounters a macro character, then the function associated with that macro character is invoked and may produce an object to be returned. This function may read following characters in the stream in whatever syntax it likes (it may even call read recursively) and return the object represented by that syntax. Macro characters may or may not be recognized, of course, when read as part of other special syntaxes (such as for strings).

The reader is therefore organized into two parts: the basic dispatch loop, which also distinguishes symbols and numbers, and the collection of macro characters. Any character can be reprogrammed as a macro character; this is a means by which the reader can be extended. The macro characters normally defined are as follows:
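
As a minimal sketch of extending the reader this way, the following installs a macro-character function on a private copy of the standard readtable. The choice of `!`, and the names `read-negated`, `*bang-readtable*`, and `read-with-bang`, are hypothetical and appear nowhere in the standard; only `set-macro-character`, `copy-readtable`, and `read` are standard.

```lisp
;; Hypothetical macro-character function: read the next object and
;; wrap it in (NOT ...).  Such a function receives the stream and the
;; macro character that triggered it.
(defun read-negated (stream char)
  (declare (ignore char))
  ;; Call READ recursively; the final T marks this as a recursive call.
  (list 'not (read stream t nil t)))

;; Work on a copy of the standard readtable so that the global reader
;; syntax is left untouched.
(defparameter *bang-readtable* (copy-readtable nil))

;; Install ! as a terminating macro character in the copy.
(set-macro-character #\! #'read-negated nil *bang-readtable*)

(defun read-with-bang (string)
  "Read one object from STRING using the extended readtable."
  (let ((*readtable* *bang-readtable*))
    (values (read-from-string string))))

;; (read-with-bang "(and a !b)") returns the list (AND A (NOT B)).
```

Binding `*readtable*` only around the call keeps the extension local, which seems the safer design when experimenting.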

(
The left-parenthesis character initiates reading of a pair or list. The function read is called recursively to read successive objects until a right parenthesis is found to be next in the input stream. A list of the objects read is returned. Thus the input sequence

(a b c)

is read as a list of three objects (the symbols a, b, and c). The right parenthesis need not immediately follow the printed representation of the last object; whitespace characters and comments may precede it. This can be useful for putting one object on each line and making it easy to add new objects:

(defun traffic-light (color) 
  (case color 
    (green) 
    (red (stop)) 
    (amber (accelerate))     ;Insert more colors after this line 
    ))

It may be that no objects precede the right parenthesis, as in () or ( ); this reads as a list of zero objects (the empty list).
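
These cases can be checked with `read-from-string`, assuming the standard readtable is in effect. Each form in this short sketch returns true.

```lisp
;; Three objects between the parentheses yield a three-element list.
(equal (read-from-string "(a b c)") '(a b c))

;; No objects before the right parenthesis yields the empty list.
(null (read-from-string "()"))
(null (read-from-string "( )"))

;; Whitespace and a comment may precede the right parenthesis.
;; FORMAT is used only to embed a newline (~%) in the input text.
(equal (read-from-string (format nil "(a b ;add more items here~%  )"))
       '(a b))
```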

If a token that is just a dot, not preceded by an escape character, is read after some object, then exactly one more object must follow the dot, possibly followed by whitespace, followed by the right parenthesis:

(a b c . d)

This means that the cdr of the last pair in the list is not nil, but rather the object whose representation followed the dot. The above example might have been the result of evaluating

(cons 'a (cons 'b (cons 'c 'd))) => (a b c . d)

Similarly, we have

(cons 'znets 'wolq-zorbitan) => (znets . wolq-zorbitan)

It is permissible for the object following the dot to be a list:

(a b c d . (e f . (g)))

is the same as

(a b c d e f g)

but a list following a dot is a non-standard form that print will never produce.
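
A brief sketch of the dot notation at work, again assuming the standard readtable. The comparisons return true, and the last form shows the printer falling back to the standard notation.

```lisp
;; The object after the dot becomes the cdr of the last pair.
(equal (read-from-string "(a b c . d)")
       (cons 'a (cons 'b (cons 'c 'd))))

;; A list after the dot simply continues the enclosing list.
(equal (read-from-string "(a b c d . (e f . (g)))")
       '(a b c d e f g))

;; The printer never produces the dotted-list form for a proper list.
(prin1-to-string (read-from-string "(1 2 . (3 4))"))   ; returns "(1 2 3 4)"
```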

)
The right-parenthesis character is part of various constructs (such as the syntax for lists) using the left-parenthesis character and is invalid except when used in such a construct.  

'
The single-quote (accent acute) character provides an abbreviation to make it easier to put constants in programs. The form 'foo reads the same as (quote foo): a list of the symbol quote and foo.
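
The abbreviation can be observed by reading text rather than evaluating it. In this small sketch, which assumes the standard readtable, each comparison returns true.

```lisp
;; 'foo reads as a two-element list whose first element is QUOTE.
(equal (read-from-string "'foo") '(quote foo))
(eq (first (read-from-string "'foo")) 'quote)

;; The abbreviation nests like any other list structure.
(equal (read-from-string "'(a 'b)")
       '(quote (a (quote b))))
```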

;
Semicolon is used to write comments. The semicolon and all characters up to and including the next newline are ignored. Thus a comment can be put at the end of any line without affecting the reader. (A comment will terminate a token, but a newline would terminate the token anyway.)
There is no functional difference between using one semicolon and using more than one, but the conventions shown here are in common use.

;;;; COMMENT-EXAMPLE function. 
;;; This function is useless except to demonstrate comments. 
;;; (Actually, this example is much too cluttered with them.) 

(defun comment-example (x y)      ;X is anything; Y is an a-list. 
  (cond ((listp x) x)             ;If X is a list, use that. 
        ;; X is now not a list.  There are two other cases. 
        ((symbolp x) 
         ;; Look up a symbol in the a-list. 
         (cdr (assoc x y)))       ;Remember, (cdr nil) is nil. 
        ;; Do this when all else fails: 
        (t (cons x                ;Add x to a default list. 
                 '((lisp t)       ;LISP is okay. 
                   (fortran nil)  ;FORTRAN is not. 
                   (pl/i -500)    ;Note that you can put comments in 
                   (ada .001)     ; "data" as well as in "programs". 
                   ;; COBOL?? 
                   (teco -1.0e9))))))

In this example, comments may begin with one to four semicolons.
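
That the reader discards comments entirely can be confirmed with a short sketch. The helper `read-all` is hypothetical, written only for this illustration, and the standard readtable is assumed.

```lisp
(defun read-all (string)
  "Return a list of every object that READ finds in STRING."
  (with-input-from-string (in string)
    ;; The stream object itself serves as a unique end-of-file marker.
    (loop for object = (read in nil in)
          until (eq object in)
          collect object)))

;; Comments with one or two semicolons vanish alike; only the two
;; lists survive.  FORMAT is used only to embed newlines (~%).
(read-all (format nil "(lisp t) ;LISP is okay.~%;; COBOL??~%(pl/i -500)"))
;; returns ((LISP T) (PL/I -500))

;; A semicolon terminates the token that precedes it.
(read-from-string "foo;bar")   ; primary value is the symbol FOO
```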


Compatibility note: These conventions arose among users of MacLisp and have been found to be very useful. The conventions are conveniently exploited by certain software tools, such as the EMACS editor and the ATSIGN listing program developed at MIT.

The ATSIGN listing program, alas, is no longer in use, but EMACS is widely available, especially the GNU EMACS implementation, which is available from the Free Software Foundation, 675 Massachusetts Avenue, Cambridge, Massachusetts 02139. Remember, GNU's Not UNIX.


"
The double quote character begins the printed representation of a string. Successive characters are read from the input stream and accumulated until another double quote is encountered. An exception to this occurs if a single escape character is seen; the escape character is discarded, the next character is accumulated, and accumulation continues. When a matching double quote is seen, all the accumulated characters up to but not including the matching double quote are made into a simple string and returned.
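
The escape rule can be illustrated as follows. To avoid confusing the escapes of this example's own source text with those being demonstrated, the input is assembled character by character. The variable names are hypothetical, and the standard readtable (in which backslash is the single escape character) is assumed.

```lisp
;; The six characters  "a\"b"  exactly as they would appear in a file.
(defparameter *string-text*
  (coerce (list #\" #\a #\\ #\" #\b #\") 'string))

;; Reading them yields a string of three characters: a, double quote, b.
(defparameter *string-read* (read-from-string *string-text*))

(length *string-read*)            ; returns 3
(char *string-read* 1)            ; returns #\" (the escape was discarded)
(simple-string-p *string-read*)   ; returns true
```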

`
The backquote (accent grave) character makes it easier to write programs to construct complex data structures by using a template.  
Notice of correction. In the first edition, the backquote character <`> appearing at the left margin above was inadvertently omitted.
As an example, writing

`(cond ((numberp ,x) ,@y) (t (print ,x) ,@y))

is roughly equivalent to writing

(list 'cond  
      (cons (list 'numberp x) y)  
      (list* 't (list 'print x) y))
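
The rough equivalence can be checked for particular values. In this sketch the function name `cond-template-matches-p` and the sample arguments are hypothetical. The comparison uses `equal` because only the shape of the result is guaranteed, not the identity of its conses.

```lisp
(defun cond-template-matches-p (x y)
  "Compare the backquoted template with the explicit list construction."
  (equal `(cond ((numberp ,x) ,@y) (t (print ,x) ,@y))
         (list 'cond
               (cons (list 'numberp x) y)
               (list* 't (list 'print x) y))))

;; With X bound to the symbol N and Y to a list of two forms,
;; both constructions produce
;;   (COND ((NUMBERP N) (INCF COUNT) (REPORT))
;;         (T (PRINT N) (INCF COUNT) (REPORT)))
(cond-template-matches-p 'n '((incf count) (report)))   ; returns T
```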

The general idea is that the backquote is followed by a template, a picture of a data structure to be built. This template is copied, except that within the template commas can appear. Where a comma occurs, the form following the comma is to be evaluated to produce an object to be inserted at that point. Assume b has the value 3; then evaluating the form denoted by `(a b ,b ,(+ b 1) b) produces the result (a b 3 4 b).

If a comma is immediately followed by an at-sign (@), then the form following the at-sign is evaluated to produce a list of objects. These objects are then "spliced" into place in the template. For example, if x has the value (a b c), then

`(x ,x ,@x foo ,(cadr x) bar ,(cdr x) baz ,@(cdr x)) 
   => (x (a b c) a b c foo b bar (b c) baz b c)

The backquote syntax can be summarized formally as follows. For each of several situations in which backquote can be used, a possible interpretation of that situation as an equivalent form is given. Note that the form is equivalent only in the sense that when it is evaluated it will calculate the correct result. An implementation is quite free to interpret backquote in any way such that a backquoted form, when evaluated, will produce a result equal to that produced by the interpretation shown here.

`basic is the same as 'basic, that is, (quote basic), for any form basic that is not a list or a general vector.

`,form is the same as form, for any form, provided that the representation of form does not begin with @ or . (a similar caveat holds for all occurrences of a form after a comma).

`,@form is an error.

`(x1 x2 x3 ... xn . atom) may be interpreted to mean

(append [x1] [x2] [x3] ... [xn] (quote atom))

where the brackets indicate a transformation of an xj as follows: [form] is interpreted as (list `form), which contains a backquoted form that must then be further interpreted; [,form] is interpreted as (list form); and [,@form] is interpreted simply as form.

`(x1 x2 x3 ... xn) may be interpreted to mean the same as the backquoted form `(x1 x2 x3 ... xn . nil), thereby reducing it to the previous case.

`(x1 x2 x3 ... xn . ,form) may be interpreted to mean (append [x1] [x2] [x3] ... [xn] form), where the brackets indicate a transformation of an xj as described above.

`(x1 x2 x3 ... xn . ,@form) is an error.

`#(x1 x2 x3 ... xn) may be interpreted to mean (apply #'vector `(x1 x2 x3 ... xn)).

No other uses of comma are permitted; in particular, it may not appear within the #A or #S syntax.

Anywhere ",@" may be used, the syntax ",." may be used instead to indicate that it is permissible to destroy the list produced by the form following the ",."; this may permit more efficient code, using nconc instead of append, for example.
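
A small sketch comparing the two splicing forms. The function name `splice-both-ways` is hypothetical. Each spliced form is given a fresh copy of the list, so any destructive modification permitted by the comma-dot form cannot damage the caller's data.

```lisp
(defun splice-both-ways (items)
  "Build the same template with ,@ and with ,. and return both results."
  (list `(start ,@(copy-list items) end)
        ;; The comma-dot form is allowed to alter the fresh copy.
        `(start ,.(copy-list items) end)))

;; Both elements of the result are (START 1 2 3 END).
(splice-both-ways '(1 2 3))
```

The two forms always produce equal results. They differ only in whether the spliced list may be modified in the process.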

If the backquote syntax is nested, the innermost backquoted form should be expanded first. This means that if several commas occur in a row, the leftmost one belongs to the innermost backquote.
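
The ownership of commas under nesting can be observed by evaluating a doubly nested template twice. This is only a sketch: the variables `*v*`, `*w*`, `*once-evaluated*`, and `*twice-evaluated*` are hypothetical, and they are special variables so that `eval` can see their values.

```lisp
(defparameter *v* '*w*)
(defparameter *w* 42)

;; First evaluation.  In ,',*v* and ,,*v* the rightmost comma belongs
;; to the outer backquote, so *v* is evaluated now and yields the
;; symbol *W*.  The lone comma in ,*v* belongs to the inner backquote
;; and is left for later.
(defparameter *once-evaluated* ``(,*v* ,',*v* ,,*v*))

;; Change *v* so that the two evaluation times can be told apart.
(setf *v* 'changed)

;; Second evaluation.  The remaining commas are processed:
;;   ,*v*    gives the value of *v* at this time
;;   ,'*w*   gives the symbol captured by the first evaluation
;;   ,*w*    gives the value of the value captured earlier
(defparameter *twice-evaluated* (eval *once-evaluated*))

;; *twice-evaluated* is now (CHANGED *W* 42).
```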

Once again, it is emphasized that an implementation is free to interpret a backquoted form as any form that, when evaluated, will produce a result that is equal to the result implied by the above definition. In particular, no guarantees are made as to whether the constructed copy of the template will or will not share list structure with the template itself. As an example, the above definition implies that

`((,a b) ,c ,@d)

will be interpreted as if it were

(append (list (append (list a) (list 'b) 'nil)) (list c) d 'nil)

but it could also be legitimately interpreted to mean any of the following.

(append (list (append (list a) (list 'b))) (list c) d) 
(append (list (append (list a) '(b))) (list c) d) 
(append (list (cons a '(b))) (list c) d) 
(list* (cons a '(b)) c d) 
(list* (cons a (list 'b)) c d) 
(list* (cons a '(b)) c (copy-list d))

(There is no good reason why copy-list should be performed, but it is not prohibited.)
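
The interchangeability of these interpretations can be checked for sample values. The function names `template-interpretations` and `all-equal-p` are hypothetical helpers for this sketch, and only a few of the interpretations listed above are included.

```lisp
(defun template-interpretations (a c d)
  "Return the backquoted result followed by several explicit constructions."
  (list `((,a b) ,c ,@d)
        (append (list (append (list a) (list 'b) 'nil)) (list c) d 'nil)
        (append (list (cons a '(b))) (list c) d)
        (list* (cons a '(b)) c d)
        (list* (cons a '(b)) c (copy-list d))))

(defun all-equal-p (objects)
  "True if every element of OBJECTS is EQUAL to the first."
  (every (lambda (object) (equal object (first objects))) objects))

;; Every interpretation yields ((1 B) 2 3 4) for these arguments.
(all-equal-p (template-interpretations 1 2 '(3 4)))   ; returns T
```

The results are compared with `equal` rather than `eq`, since the interpretations may differ in how much list structure they share with `d` or with the template.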

Some users complain that backquote syntax is difficult to read, especially when it is nested. I agree that it can get complicated, but in some situations (such as writing macros that expand into definitions for other macros) such complexity is to be expected, and the alternative is much worse.

After I gained some experience in writing nested backquote forms, I found that I was not stopping to analyze the various patterns of nested backquotes and interleaved commas and quotes; instead, I was recognizing standard idioms wholesale, in the same manner that I recognize cadar as the primitive for "extract the lambda-list from the form ((lambda ...) ...)" without stopping to analyze it into "car of cdr of car." For example, ,x within a doubly-nested backquote form means "the value of x available during the second evaluation will appear here once the form has been twice evaluated," whereas ,',x means "the value of x available during the first evaluation will appear here once the form has been twice evaluated" and ,,x means "the value of the value of x will appear here."

See appendix C for a systematic set of examples of the use of nested backquotes.

,
The comma character is part of the backquote syntax and is invalid if used other than inside the body of a backquote construction as described above.  

#
This is a dispatching macro character. It reads an optional digit string and then one more character, and uses that character to select a function to run as a macro-character function.

The # character also happens to be a non-terminating macro character. This is completely independent of the fact that it is a dispatching macro character; it is a coincidence that the only standard dispatching macro character in Common Lisp is also the only standard non-terminating macro character.

See the next section for predefined # macro-character constructions.
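
Both properties of # can be queried from the standard readtable. In this sketch the helper name `non-terminating-p` is hypothetical, and the standard readtable (with its default upcasing of symbol names) is assumed.

```lisp
(defun non-terminating-p (char)
  "Second value of GET-MACRO-CHARACTER: true if CHAR is a non-terminating macro character."
  (nth-value 1 (get-macro-character char)))

(non-terminating-p #\#)   ; true
(non-terminating-p #\()   ; NIL: left parenthesis terminates a token

;; Because # is non-terminating, it can appear inside a token.
(symbol-name (read-from-string "a#b"))   ; returns "A#B"

;; At the start of a token, # dispatches on the character that
;; follows the optional digit string; here 16 is a radix argument.
(read-from-string "#16rFF")              ; primary value is 255

;; The functions it dispatches to are stored per sub-character.
(get-dispatch-macro-character #\# #\()   ; non-NIL in the standard readtable
```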


