Hygienic macros.


André van Tonder


This SRFI is currently in "draft" status. It will remain in draft status until 2005/08/14, or as amended. To provide input on this SRFI, please mail srfi-72@srfi.schemers.org. Previous messages are available in the archive of the mailing list.



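This SRFI describes a procedural macro proposal for Scheme. Its features, discussed in the sections that follow, include a procedural interface to compound syntax objects, syntax-case expressed as library syntax, improved hygiene, improved hygiene breaking, and source-object correlation.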


We start with a simple example:
   (define-syntax (swap! a b)
     (quasisyntax
       (let ((temp ,a)) 
         (set! ,a ,b) 
         (set! ,b temp))))
This macro builds a syntax object using the quasisyntax substitution form. Syntax provided as part of the input expression is inserted in the result using unquote or unquote-splicing. Macros written in this way are hygienic and referentially transparent.

This macro may also be written as

  (define-syntax swap!
    (lambda (form)
      (let ((a (cadr  form))
            (b (caddr form)))
        `(,(syntax let) ((,(syntax temp) ,a))
          (,(syntax set!) ,a ,b)
          (,(syntax set!) ,b ,(syntax temp))))))
showing that quasisyntax may itself be defined as library syntax in terms of a primitive syntax form [12].

The example illustrates that we can use the traditional and very useful abstractions car, cdr, ..., for handling compound syntax objects in a Lisp. Indeed, the core interface to compound syntax objects is procedural rather than via special forms.

Syntax-case is expressible as a macro in the current proposal and is specified as library syntax, so that we can write, for example:

  (define-syntax cond
    (lambda (x)
      (syntax-case x ()
        ((_ c1 c2 ...)
         (let f ((c1 c1) 
                 (cmore (syntax (c2 ...))))
           (if (null? cmore)
               (syntax-case c1 (else =>)
                 ((else e1 e2 ...) (syntax (begin e1 e2 ...)))
                 ((e0)             (syntax (let ((t e0)) (if t t))))
                 ((e0 => e1)       (syntax (let ((t e0)) (if t (e1 t)))))
                 ((e0 e1 e2 ...)   (syntax (if e0 (begin e1 e2 ...)))))
               (let ((rest (f (car cmore) (cdr cmore))))
                 (syntax-case c1 (=>)
                   ((e0)           (quasisyntax (let ((t e0)) (if t t ,rest))))
                   ((e0 => e1)     (quasisyntax (let ((t e0)) (if t (e1 t) ,rest))))
                   ((e0 e1 e2 ...) (quasisyntax (if e0 (begin e1 e2 ...) ,rest)))))))))))
As the example shows, we may combine implicit and explicit substitution in quasisyntax.

One is not limited to using syntax-case for matching. The core primitives described here may be used to implement more general pattern matchers [1].

Improved hygiene

Existing hygiene algorithms are well suited for syntax-rules macros but still suffer from potentially accidental variable captures in procedural macros.

In the proposal of this SRFI, the forms quasisyntax and syntax-case are specified so that the following macros will give the answer 1:

  (define-syntax (no-capture)       |  (define-syntax (no-capture)
    (define (helper value)          |    (define (helper value)
      (quasisyntax                  |      (with-syntax ((value value))
       (let ((temp 2)) ,value)))    |        (syntax (let ((temp 2)) value))))
    (quasisyntax                    |    (with-syntax ((nested (helper (syntax temp))))
     (let ((temp 1))                |      (syntax (let ((temp 1))  
       ,(helper (syntax temp)))))   |                nested))))
  (no-capture)    ==> 1             |  (no-capture)    ==> 1
whereas the same macro would give the answer 2 in existing systems due to a variable capture.

Note that if the helper procedure were written as a macro, one would expect to obtain the answer 1 due to automatic hygiene. In the current proposal, the macro writer is not penalized for using a procedure instead.

Next consider the following macro for a version of let with left to right evaluation. With the proposal of this SRFI, this definition is correct:

  (define-syntax let-in-order
    (lambda (form)
      (define (let-help form tems vars)
        (syntax-case form ()
          ((_ () e0 e1 ...)
           (with-syntax (((var ...) vars)
                         ((tem ...) tems))
             (syntax
               (let ((var tem) ...) e0 e1 ...))))
          ((_ ((var exp) binding ...) e0 e1 ...)
           (quasisyntax
             (let ((tem exp))
               ,(let-help (syntax (_ (binding ...) e0 e1 ...))
                          (cons (syntax tem) tems)
                          (cons (syntax var) vars)))))))
      (let-help form '() '())))
  (let-in-order ((x 1)
                 (y 2))
    (+ x y))                ==> 3
whereas existing systems will give the wrong answer 4 due to variable capture.

Again, if the helper were expressed as a macro, one would not expect variable capture. With the current proposal, this good behaviour is guaranteed also for procedural helpers.

To preserve hygiene across macro invocations, hygiene algorithms effectively rename identifiers introduced by a macro to prevent accidental capture. In existing systems only one renaming function is typically used during the entire dynamic extent of the macro invocation. This means that accidental captures may still take place in procedural macros unless the programmer keeps careful track of all identifiers introduced in different parts of the macro and all its helper procedures. This burden is reminiscent of the difficulties one encounters in languages with dynamic scoping of variables. It limits the modularity and constrains the maintainability of complex macros. Especially if code is generated recursively, as in the let-in-order macro, it may be quite hard, in these systems, to verify whether accidental captures occur.

The solution proposed here is based on a primitive with-fresh-renaming-scope that effectively causes a fresh renaming function to be used in the static region following this keyword. Identifiers introduced while expanding this region cannot capture or be captured by identifiers introduced outside this region. It is important to note that this is a lexical, not a dynamic, construct. For example, we have

  (let ((f (lambda () (syntax x))))
    (with-fresh-renaming-scope
      (bound-identifier=? (syntax x)
                          (f))))         ==> #f
where bound-identifier=? non-equivalence means that a binding of one identifier cannot capture references to the other.
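
The idea of a renaming function can be pictured with a toy model in plain Scheme. This is only a sketch for intuition, not the reference implementation: the names make-renamer, alias-counter, outer-scope and inner-scope are hypothetical, and aliases are modelled as ordinary symbols rather than identifiers. Each renamer maps a given symbol to one alias consistently, while two different renamers never produce the same alias.

```scheme
;; Toy model only; all names here are hypothetical illustrations.
(define alias-counter 0)

;; Returns a renaming function with its own private table.
(define (make-renamer)
  (let ((table '()))                    ; association list: symbol -> alias
    (lambda (sym)
      (cond ((assq sym table) => cdr)   ; same renamer, same symbol: reuse alias
            (else
             (set! alias-counter (+ alias-counter 1))
             (let ((alias (string->symbol
                           (string-append (symbol->string sym)
                                          "#"
                                          (number->string alias-counter)))))
               (set! table (cons (cons sym alias) table))
               alias))))))

(define outer-scope (make-renamer))   ; renamer for one lexical region
(define inner-scope (make-renamer))   ; fresh renamer for another region

(eq? (outer-scope 'temp) (outer-scope 'temp))   ; #t
(eq? (outer-scope 'temp) (inner-scope 'temp))   ; #f
```

In this picture, using a single renamer for a whole macro invocation corresponds to the behaviour of existing systems, while with-fresh-renaming-scope corresponds to switching to a new renamer for a lexical region.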

It should be clear from studying the examples of this section that the capture problems occur when splicing together syntax generated in lexically separate parts of the program text. We can avoid these capture problems by requiring all substitution forms to expand to an occurrence of with-fresh-renaming-scope.

In the current proposal, the substitution forms are quasisyntax, syntax-case and with-syntax, which are indeed defined in this way. With this specification, the above macros all behave correctly as they stand.

As a result, identifiers introduced in the region lexically enclosed by the quasisyntax, syntax-case or with-syntax keywords cannot capture or be captured by identifiers introduced elsewhere in the program text.

Improved hygiene breaking

Consider writing an unhygienic macro if-it that binds the value of its condition to the identifier it in its consequent and alternative:
  (if-it 1 it 42)      ==> 1
We would like to be able to freely compose unhygienic macros, something that is notoriously difficult to do in existing systems. For example, we would like to be able to write macros when-it, if-flag-it and my-or in terms of if-it as follows:
  (define-syntax (when-it condition consequent)
    (quasisyntax
      (if-it ,condition
             ,consequent
             (if #f #f))))

  (define-syntax (if-flag-it body else)
    (quasisyntax
      (if-it flag ,body ,else)))

  (define-syntax (my-or expr1 expr2)
    (quasisyntax
      (if-it ,expr1 it ,expr2)))
These macros should behave as follows:
  (when-it 42 it)                ==> 42
  (define flag 3)
  (if-flag-it it 'none)          ==> 3

  (my-or 2  it)                  ==> 2
  (my-or #f it)                  ==> #f
The macro system described here has a primitive datum->syntax similar to that provided in the Chez syntax-case specification. However, as far as the author is aware, it is impossible to satisfy these four conditions using datum->syntax without code-walking. Furthermore, we may wish to impose referential transparency requirements such as
  (let ((it 1)) (if-it 42 it #f))   ==> 1
  (let ((it 1)) (when-it 42 it))    ==> 1
  (let ((it 1)) (my-or 2 it))       ==> 2 
  (let ((it 1)) (my-or #f it))      ==> 1  
by analogy with the behaviour of
  (let ((else #f)) (my-cond (else 2)))    ==> void
To satisfy these requirements, we provide a new primitive, make-capturing-identifier, that introduces an identifier which, when bound, will capture all free-identifier=? identifiers in its scope. With this primitive, the following implementation of if-it satisfies all the above requirements:
  (define-syntax (if-it condition consequent alternative)
    (let ((it (make-capturing-identifier (syntax here) 'it)))
      (quasisyntax
        (let ((,it ,condition)) 
          (if ,it
              ,consequent
              ,alternative)))))
A similar idea has been proposed in [2].

The meaning of the new identifier is determined by the lexical binding of the first argument. This allows us fine control over what we mean by referential transparency. Compare for example the above with:

  (define-syntax if-it
    (lambda (form)
      (syntax-case form ()
        ((keyword condition consequent alternative)
         (with-syntax ((it (make-capturing-identifier keyword 'it)))
           (syntax
             (let ((it condition)) 
               (if it
                   consequent
                   alternative))))))))

  (let ((it 1)) (if-it 42 it #f))   ==> 42

The primitive datum->syntax is still the appropriate primitive for introducing identifiers that should be captured by bound-identifier=? identifiers in surrounding binding forms. Comparing with the description of make-capturing-identifier above, we see that the one introduces identifiers that are the subject of capture, while the other introduces identifiers that should be the object of capture.

Source-object correlation

Source correlation information is not tied to syntax objects, but instead recorded separately by the expander. In addition to tracking source location, this also allows intermediate expansion steps from the source to the object code to be recorded and made available to tools.


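The following primitive forms are provided:

  define-syntax  let-syntax  letrec-syntax  set-syntax!
  identifier?  bound-identifier=?  free-identifier=?  literal-identifier=?
  syntax  syntax-quote  with-fresh-renaming-scope
  make-capturing-identifier  datum->syntax  syntax->datum
  expand  syntax-debug  syntax-error

The following library forms are provided:

  quasisyntax  syntax-case  with-syntax  syntax-rules
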
Syntax objects:
A syntax object is a graph whose nodes are Scheme pairs or vectors and whose leaves are constants or identifiers. The following expressions evaluate to syntax objects:
  '(1 2 3)
  (cons (syntax x) (vector 1 2 3 (syntax y)))
  (syntax (let ((x 1)) x))
  (quasisyntax (let ((x 1)) ,(syntax x)))
Symbols may not appear in syntax objects:
  '(let ((x 1)) x)  ==> not a syntax object
syntax: (DEFINE-SYNTAX var exp)
        (DEFINE-SYNTAX (var . formals) exp1 exp ...)
Exp is expanded and then evaluated in the current top level environment, var is bound to a top level location, and the resulting value is stored in the location.

The second variant is equivalent to

     (define-syntax var
       (let ((transformer (lambda (dummy . formals) exp1 exp ...)))
         (lambda (form)
           (apply transformer form))))
Exp may, but does not have to, evaluate to a procedure, also called a transformer.
syntax: (LET[REC]-SYNTAX ((var exp) ...) exp* ...)
These primitives have the semantics described in R5RS:
  (let ((x 'outer))
    (let-syntax ((m (lambda (_) (syntax x))))
      (let ((x 'inner))
        (m))))                     ==>  outer

  (let-syntax ((when (lambda (form)
                       (quasisyntax
                         (if ,(cadr form)
                             (begin ,@(cddr form)))))))
    (let ((if #t))
      (when if (set! if 'now))
      if))                              ==> now
Macros defined in a lexically enclosing let[rec]-syntax are available for expanding further nested macros, as the following example shows:
  (let ((x 1))
    (let-syntax ((m (lambda (_) (syntax (syntax x)))))
      (let-syntax ((n (lambda (_) (m))))
        (n))))                     ==> 1
syntax: (SET-SYNTAX! var exp)
Set-syntax! is to define-syntax as set! is to define.
  (define-syntax (test) (syntax (syntax 'a)))
  (set-syntax! test (lambda (form) (test)))
  (test)                                   ==> a
procedure: (IDENTIFIER? obj) 
Returns #t if obj is an identifier, #f otherwise.
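
A few illustrative calls follow. They are a sketch that assumes, as the other examples in this document do, that syntax expressions may be evaluated directly; the expected values are inferred from the description of syntax objects above rather than taken from the reference implementation.

```scheme
(identifier? (syntax x))        ; #t: a single identifier
(identifier? (syntax (x y)))    ; #f: a compound syntax object (a list)
(identifier? 42)                ; #f: a constant
```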
procedure: (BOUND-IDENTIFIER=?   obj1 obj2)
           (FREE-IDENTIFIER=?    obj1 obj2)
           (LITERAL-IDENTIFIER=? obj1 obj2)
Identifiers are free-identifier=? if they refer to the same lexical or toplevel binding. For this purpose, all identifiers that are not lexically bound are considered implicitly bound at the toplevel.

Identifiers are literal-identifier=? if they are free-identifier=? or if they both refer to toplevel bindings and have the same symbolic name. This primitive should be used to reliably identify literals (such as else in cond) even if they occur in a different module from the macro definition.

Identifiers are bound-identifier=? if a binding of one would capture references to the other in the scope of the binding. Two identifiers with the same name are bound-identifier=? if they were present in the same toplevel expression in the original program text. Identifiers will also be bound-identifier=? if they were created by applying syntax to existing bound-identifier=? identifiers during the same macro-invocation or invocation of with-fresh-renaming-scope. In addition, datum->syntax may create identifiers that are bound-identifier=? to previously introduced identifiers.

These procedures return #f if either argument is not an identifier.

syntax: (SYNTAX datum)
Creates a new syntax object from datum, which must be a syntax object, as follows: Constants contained in datum are unaffected, while identifiers are effectively renamed to obtain fresh identifiers in the sense of bound-identifier=?. These fresh identifiers remain free-identifier=? to the original identifiers. This means that a fresh identifier will denote the same thing as the original identifier in datum unless the macro application places an occurrence of it in a binding position.

During the course of a single macro invocation, syntax ordinarily acts like a one-to-one mathematical function on identifiers: Two identifiers created by evaluating syntax expressions will be bound-identifier=? if and only if the syntax expressions were evaluated during the same macro invocation and the original identifiers in the templates were bound-identifier=?.

  (bound-identifier=? (syntax x) (syntax x))   ==> #t
This behaviour may be modified by the form with-fresh-renaming-scope, which in effect causes a fresh renaming function to be used for evaluating syntax expressions occurring in the static region following this keyword. Note that this is a lexical, not dynamic, construct. For example, one has
  (let ((f (lambda () (syntax x))))
    (with-fresh-renaming-scope
      (bound-identifier=? (syntax x)
                          (f))))         ==> #f

Identifiers that are bound-identifier=? are required to also be free-identifier=?, denoting the same binding. Any attempt to break this invariant should cause an error to be signaled.

  (cons (syntax x) 
        (let ((x 1)) (syntax x)))       ==> error


In the following example, (m) expands to (syntax x), where x denotes the outer binding. Although the expanded (syntax x) occurs in the syntactic environment of the middle binding, the fresh identifier resulting from evaluating it will denote the same thing as the identifier x in the template. It will therefore also refer to the outer binding.

  (let ((x 'outer))
    (let-syntax ((m (lambda (_) (syntax (syntax x)))))
      (let ((x 'middle))
        (let-syntax ((n (lambda (_) (m))))
          (let ((x 'inner))
            (n))))))               ==> outer
Note that syntax does not unify identifiers previously distinct in the sense of bound-identifier=? occurring in template even if they have the same symbolic name:
  (let ((x 1))
    (let-syntax ((m (lambda (_) (syntax (syntax x)))))
      (let ((x 2))
        (let-syntax ((n (lambda (_)
                          (quasisyntax
                            (let-syntax ((o (lambda (_)
                                              (,(syntax syntax)
                                                (,(syntax list)
                                                  ,(m)
                                                  ,(syntax x))))))
                              (o))))))
          (n)))))                      ==> (1 2)

syntax: (SYNTAX-QUOTE datum)
Returns the existing syntax object datum embedded in the program. Unlike syntax, no new syntax object is constructed. This primitive is useful for defining certain kinds of macro-generating macros that have to compose pieces of code preserving bound-identifier=? equivalence where not all the pieces are passed via the same chain of macro calls.

For example, in the following fragment, z is passed to the inner macro by two paths, one of them via x and then y, and the other via only x. Using syntax-quote, we can "tunnel" x to the inner macro as follows:

   (let-syntax ((m (lambda (form)
                     (syntax-case form ()
                       ((_ x)
                        (syntax
                          (let-syntax ((n (lambda (form*)
                                            (syntax-case form* ()
                                              ((_ y)
                                               (with-syntax ((x (syntax-quote x)))
                                                 (syntax (let ((y 1))
                                                           x))))))))
                            (n x))))))))
     (m z))     ==> 1
syntax: (WITH-FRESH-RENAMING-SCOPE exp1 exp ...)
Causes a fresh syntactic renaming function to be used in the static lexical region exp1 exp ... as described in the specification of syntax above.

The body exp1 exp ... is subject to the rules for a lambda body, and may contain internal definitions.

procedure: (MAKE-CAPTURING-IDENTIFIER context-identifier symbol)
This procedure returns a fresh identifier with symbolic name symbol, and with denotation that of symbol in the syntactic environment in which context-identifier was introduced. If the resulting identifier occurs in a binding, it will capture any identifiers in the scope of the binding that are free-identifier=? to it. The new identifier is not bound-identifier=? to any existing identifiers.
  (define-syntax (if-it condition consequent alternative)
    (let ((it (make-capturing-identifier (syntax here) 'it)))
      (quasisyntax
        (let ((,it ,condition)) 
          (if ,it
              ,consequent
              ,alternative)))))

  (if-it 42 it #f)                  ==> 42
  (let ((it 1)) (if-it 42 it #f))   ==> 1
The following examples illustrate how the behaviour of the capturing identifier is affected by the context-identifier argument. Compare in particular the first example below with the one above.
  (define-syntax if-it
    (lambda (form)
      (let ((it (make-capturing-identifier (car form) 'it)))
        (quasisyntax
          (let ((,it ,(cadr form))) 
            (if ,it
                ,(caddr form) 
                ,(cadddr form)))))))

  (let ((it 1)) (if-it 42 it #f))   ==> 42

  (let ((y 'outer))
    (let-syntax ((m (lambda (_) (make-capturing-identifier (syntax here) 'y))))
      (let ((y 'inner))
        (m))))                      ==> outer

  (let ((y 'outer))
    (let-syntax ((m (lambda (form) 
                      (let ((y (make-capturing-identifier (syntax here) 'y)))
                        (quasisyntax
                          (let ((,y 'inner)) ,@(cdr form)))))))
      (m y)))  
                                    ==> inner

  (let ((y 'outer))
    (let-syntax ((m (lambda (form) 
                      (let ((y (make-capturing-identifier (syntax here) 'y)))
                        (quasisyntax
                          (let ((,y 'inner)) ,@(cdr form)))))))
      (let ((y 'more))
        (m y))))  
                                    ==> more 
procedure: (DATUM->SYNTAX context-identifier obj) 
Transforms obj, which must be a graph with pairs or vectors as nodes and with symbols or constants as leaves, to a syntax object as follows: Constants in obj are unaffected, while symbols appearing in obj are converted to identifiers that behave under bound-identifier=? and free-identifier=? the same as an identifier with the same symbolic name would behave if it had occurred together with context-identifier in the same source toplevel expression or was produced during the same evaluation of the syntax expression producing context-identifier.

If context-identifier is a capturing identifier, the symbols in obj will also be converted to capturing identifiers.

  (let ((x 'outer))
    (let-syntax ((m (lambda (_) (syntax (syntax z)))))
      (let ((x 'middle))
        (let-syntax ((n (lambda (_) (datum->syntax (m) 'x))))
          (let ((x 'inner))
            (n))))))               ==> outer

  (let-syntax ((m (lambda (form)
                    (syntax-case form ()
                      ((_ y)
                       (let ((x (datum->syntax (syntax y) 'x)))
                         (quasisyntax
                           (let ((y 1)) ,x))))))))
    (m x))   ==> 1
procedure: (SYNTAX->DATUM syntax-object)
Transforms a syntax object to a new graph by replacing contained identifiers by their symbolic names.
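
As a small illustration (a sketch inferred from the description above, assuming as elsewhere in this document that syntax expressions may be evaluated directly), identifiers are replaced by symbols while constants and structure are preserved:

```scheme
(syntax->datum (syntax x))                   ; x
(syntax->datum (syntax (let ((x 1)) x)))     ; (let ((x 1)) x)
```

The result is an ordinary Scheme datum and, since it contains symbols, is no longer a syntax object in the sense defined above.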
procedure: (EXPAND syntax-object)
Expands the syntax object fully to obtain a core Scheme expression.
  (expand (syntax (let ((x 1)) x)))    ==> ((lambda (@x5872) @x5872) 1)
procedure: (SYNTAX-DEBUG syntax-object)
Converts its argument to a human-readable format.
  (syntax-debug (syntax (let ((x 1)) y)))   ==> (let ((x#top 1)) y#top)
procedure: (SYNTAX-ERROR obj ...)
Invokes a syntax error. The objects obj ... are displayed, available source-object correlation information is displayed or provided to debugging tools, and the expander is stopped.
library syntax: (QUASISYNTAX template)
The quasisyntax form may be implemented as a macro in terms of with-fresh-renaming-scope and syntax.

Constructs a new syntax object from the template, parts of which may be unquoted using unquote or unquote-splicing. Quasisyntax is to syntax as quasiquote is to quote.

For example, no variable capture occurs in the following macro:

  (define-syntax (no-capture)
    (define (helper value)
      (quasisyntax
       (let ((temp 2)) ,value)))
    (quasisyntax
     (let ((temp 1))
       ,(helper (syntax temp)))))
  (no-capture)   ==> 1
since, for example, the second quasisyntax expression expands to the equivalent of
     (with-fresh-renaming-scope
       `(,(syntax let) ((,(syntax temp) 1))
         ,(helper (syntax temp))))
To make nested unquote-splicing behave in a useful way, the R5RS-compatible extension described in appendix B of the paper [10] is required.


  (bound-identifier=? (quasisyntax x) (quasisyntax x))   ==> #f

  (define-syntax (macro-generate name id)
    (quasisyntax
      (define-syntax (,name)
        (quasisyntax
          (let ((,(syntax ,id) 4)) ,(syntax ,id))))))

  (macro-generate test z)
  (test)   ==> 4

  (define (generate-temporaries lst)
    (map (lambda (_) (quasisyntax temp))
         lst))

  (define-syntax (test-temporaries)
    (let ((temps (generate-temporaries '(1 2))))
      (quasisyntax ((lambda ,temps (list ,@temps)) 1 2))))

  (test-temporaries)   ==> (1 2)

library syntax: (SYNTAX-CASE exp (literal ...) clause ...)

                clause := (pattern output-expression)
                        | (pattern fender output-expression)
In the current proposal, the syntax-case form can be written as a macro in terms of the core primitives specified above.

Each pattern is identical to a syntax-rules pattern, and the optional fender may specify additional constraints on acceptance of the clause [6, 7]. Literals in the list (literal ...) are matched against identifiers in the input form using literal-identifier=?.

In the output-expression of each clause, the syntax keyword is effectively rebound to implement implicit substitution of variables bound in the patterns of lexically enclosing syntax-case forms, so that the template in (syntax template) is treated identically to a syntax-rules template.

Subtemplates of quasisyntax templates that do not contain completely unquoted expressions are treated in the same way as syntax templates, allowing implicit substitution also inside these quasisyntax subtemplates.

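The proposal adds the following requirement: a syntax-case form expands to an occurrence of with-fresh-renaming-scope, so that identifiers introduced in the region it lexically encloses cannot capture or be captured by identifiers introduced elsewhere in the program text.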

For example, no variable capture occurs in the following macro, where the with-syntax forms expand to occurrences of syntax-case:

  (define-syntax (no-capture) 
    (define (helper value)
      (with-syntax ((value value)) 
        (syntax (let ((temp 2)) value))))
    (with-syntax ((nested (helper (syntax temp))))
      (syntax (let ((temp 1))
                nested))))
  (no-capture)    ==> 1
library syntax: (WITH-SYNTAX ((pattern exp) ...) exp1 exp ...)
As in [6, 7], with-syntax expands to an instance of syntax-case:
  (define-syntax with-syntax
    (lambda (x)
      (syntax-case x ()
        ((_ ((p e0) ...) e1 e2 ...)
         (syntax (syntax-case (list e0 ...) ()
                   ((p ...) (begin e1 e2 ...)))))))) 
and inherits the modification for improved hygiene specified above for the latter.
library syntax: (SYNTAX-RULES (literal ...) (pattern template) ...)
As defined in R5RS.
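
For comparison with the procedural swap! at the start of this document, the same macro can be sketched with R5RS syntax-rules. The variable names temp and other in the usage lines are chosen only for illustration; the user's temp is deliberately given the same name as the macro's internal temporary to show that no capture occurs.

```scheme
(define-syntax swap!
  (syntax-rules ()
    ((_ a b)
     (let ((temp a))      ; temp is introduced by the macro and renamed hygienically
       (set! a b)
       (set! b temp)))))

(define temp 1)           ; user variable with the same name as the macro's temporary
(define other 2)
(swap! temp other)
(list temp other)         ; (2 1)
```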


The implementation uses the forms and procedures specified in R5RS. It does not require R5RS macros or any other existing macro system. In addition, it uses gensym with an optional string prefix argument, and an interaction-environment, no-argument variant of eval. It should run unmodified on systems that provide these additional procedures. Portability hooks are provided for Schemes that lack either of these primitives or provide them with a different interface.

The implementation has been successfully tested on Chez, Chicken, Gambit and MzScheme.

The implementation was strongly influenced by the explicit renaming system [8, 11].

We use an imperative hygiene algorithm that is eager, has linear complexity, and is very fast. This is achieved by having bound-identifier=? identifiers share a location, so that alpha substitutions can be done by a simple imperative update of an identifier and no additional work is required to propagate substitutions or environments to leaves. In addition, source-object correlation information is not stored in syntax objects, but tracked independently, which makes it possible to represent syntax objects as ordinary list or vector structure.

During the draft period, the reference implementation will be available here.


[1] André van Tonder - Simple macros and simple modules

[2] Oleg Kiselyov - Message on comp.lang.scheme

[3] Marcin 'Qrczak' Kowalczyk - Message on comp.lang.scheme

[4] Ben Rudiak-Gould - Message on comp.lang.scheme

[5] Matthew Flatt - Composable and Compilable Macros: You Want it When?

[6] R. Kent Dybvig - Chez Scheme user's guide

[7] Robert Hieb, R. Kent Dybvig and Carl Bruggeman - Syntactic Abstraction in Scheme

[8] William D. Clinger - Hygienic macros through explicit renaming

[9] Eugene E. Kohlbecker, Daniel P. Friedman, Matthias Felleisen and Bruce F. Duba - Hygienic macro expansion

[10] Alan Bawden - Quasiquotation in Lisp

[11] Richard Kelsey and Jonathan Rees - The Scheme 48 implementation

[12] Robert Hieb, R. Kent Dybvig - A compatible low-level macro facility, Revised(4) Report on the Algorithmic Language Scheme (appendix)


Copyright (C) André van Tonder (2005). All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.


Author: André van Tonder
Editor: Francisco Solsona