Title

Hygienic macros.

Author

André van Tonder

Status

This SRFI is currently in ``draft'' status. To see an explanation of each status that a SRFI can hold, see here. It will remain in draft status until 2005/08/14, or as amended. To provide input on this SRFI, please mailto:srfi-72@srfi.schemers.org. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.

Index

Abstract

This SRFI describes a procedural macro proposal for Scheme with the following features:

Introduction

We start with a simple example:
  (define-syntax (swap! a b)
     (quasisyntax
       (let ((temp ,a)) 
         (set! ,a ,b) 
         (set! ,b temp))))
A syntax object is here constructed using the quasisyntax form. Syntax provided as part of the input expression can be inserted in the result using unquote or unquote-splicing. Macros written in this way are hygienic and referentially transparent.

The following example shows that we can use the procedures car, cdr, null?, ..., on syntax objects. It also illustrates the use of the predicate literal-identifier=? for identifying literals in the input expression.

  (define-syntax (my-cond c . cs)
    (if (literal-identifier=? (car c) (syntax else))
        (quasisyntax (begin ,@(cdr c)))
        (if (null? cs)                      
            (quasisyntax (if ,(car c) (begin ,@(cdr c))))
            (quasisyntax (if ,(car c)
                             (begin ,@(cdr c))
                             (my-cond ,@cs))))))
In the current proposal, the syntax-case form is expressible as a macro in terms of a simpler set of primitives, and is specified as library syntax. The my-cond macro can then also be expressed as:
  (define-syntax my-cond
    (lambda (form)
      (syntax-case form (else)
        ((_ (else e1 ...) c1 ...) (syntax (begin e1 ...)))
        ((_ (e0 e1 ...))          (syntax (if e0 (begin e1 ...))))
        ((_ (e0 e1 ...) c1 ...)   (syntax (if e0
                                              (begin e1 ...)
                                              (my-cond c1 ...)))))))

Improved hygiene

In previous hygienic Scheme macro systems, accidental variable capture can take place in procedural macros, as the following example illustrates:
  (let-syntax ((main (lambda (form)
                       
                       (define (make-swap x y)
                         (quasisyntax 
                          (let ((t ,x))
                            (set! ,x ,y)
                            (set! ,y t))))
                       
                       (quasisyntax
                        (let ((s 1)
                              (t 2))
                          ,(make-swap (syntax s) (syntax t))
                          (list s t))))))
    (main))    
              ==> (1 2) with conventional hygiene algorithm
                  (2 1) with the proposal of this SRFI
This happens because the conventional hygiene algorithm regards all identifiers with the same name introduced during the entire duration of a macro invocation as identical.

This property makes it difficult to procedurally construct code fragments with fixed meaning for use in a different part of the program, since the meaning may always be accidentally corrupted at the use site. This violates the idea of referential transparency and may lead to fragility in large macros or where helper procedures are shared by macros.

The problem can show up in seemingly straightforward macros, as the following example illustrates. The let-in-order form is a version of let with guaranteed left-to-right evaluation of bindings. Again, the conventional hygiene algorithm leads to accidental capture of instances of t introduced recursively:

  (define-syntax let-in-order
    (lambda (form)
      (syntax-case form ()
        ((_ ((i e) ...) e0 e1 ...)         
         (let f ((ies (syntax ((i e) ...)))
                 (its (syntax ()))) 
           (syntax-case ies () 
             (()            (quasisyntax (let ,its e0 e1 ...)))
             (((i e) . ies) (quasisyntax 
                             (let ((t e))
                               ,(f (syntax ies)
                                   (quasisyntax ((i t) ,@its))))))))))))
  
  (let-in-order ((x 1)
                 (y 2))
    (+ x y))                ==> 4 with conventional hygiene algorithm
                                3 with the proposal of this SRFI  

These are exactly the kind of problems that hygiene was invented to solve, reappearing in a slightly different guise. To eliminate these problems, this SRFI proposes the following modified hygiene rule [2-4]:

A binding for an identifier can only capture a reference to another if both were present in the source or introduced during a single evaluation of a syntax or quasisyntax form, with the understanding that the evaluation of any nested, unquoted syntax or quasisyntax forms counts as part of the evaluation of an enclosing quasisyntax.

In the first example, this rule guarantees that the two instances of t are distinct, since they occur in separate quasisyntax forms. In the second example, the instances of t are also distinct since they are introduced during different invocations of the quasisyntax form. With this proposal, the above macros are correct as they stand.

Reflective tower

The macro system proposed here has an infinite reflective tower [11]. In contrast with comparable macro systems, the hygiene algorithm of this SRFI implements a refinement of lexical scoping that takes into account the level of use of an identifier in determining its meaning.

A reflective tower determines the (meta-) language that is used for different phases of execution. Consider the following example:

  (let ((x 1))
    (let-syntax ((m (lambda (form)
                      (let ((x 2))
                        (let-syntax ((n (lambda (form)
                                          (let ((x 3))
                                            (syntax (syntax x))))))
                          (n))))))
      (m)))  ==> 1
Ordinary Scheme code, executing at runtime, is written in the ground-level language. The right hand sides of the macro definition m is executed at expansion time and is written in a meta-language, whose environment is strictly disjoint from the ground-level language. The right hand side of m itself contains a let-syntax defining a nested macro n, the right hand side of which is written in a meta-meta-language, and so on.

The meaning of an identifier is determined by lexical scoping at the language level of its use. In the above example, the x in (syntax (syntax x)) is quoted data at the meta-meta-level and again at the meta-level, and therefore cannot possibly denote bindings at these levels. It is finally used at the ground level, and therefore denotes the binding of the outer let.

A fundamental principle of block-structured languages is that the meaning of a program should depend only on its textual representation. For example, we should obtain the same results whether we expand and evaluate a sequence of top-level forms one by one in an interpreter, or first expand and compile the whole sequence and then evaluate it. Schematically,

     (for-each (lambda (form) (eval (expand form))) sequence)

       == (for-each eval (map-in-order expand sequence))

Applying this invariance principle to the following conversation then shows the necessity for maintaining a separate environment for each level in the reflective tower:

  (define x 1)
  
  (begin-for-syntax (define x 2))
  
  (let-syntax ((m (lambda (form)
                    (quasisyntax (list x ,x)))))
    (m))  ==> (1 2)
Indeed, in an interpreter, the ground-level binding and the meta-binding for x have to be in memory at the same time. In the reference implementation, this is achieved via implicit renaming to obtain an effectively separate namespace for each reflective level.

The primitive begin-for-syntax is provided for specifying computations at arbitrary reflective level.

Escaping ellipses

We require the ellipsis in the template in (syntax ...) to be interpreted as an ordinary identifier, not an ellipsis literal. The following idiom can then be used to include ellipses in syntax-case-generated macros:
  (let-syntax ((m (lambda (form)
                    (syntax-case form ()
                      ((_ x ...)
                       (with-syntax ((::: (syntax ...)))
                         (syntax
                          (let-syntax ((n (lambda (form)
                                            (syntax-case form ()
                                              ((_ x ... :::)
                                               (syntax `(x ... :::)))))))
                            (n a b c d)))))))))
      (m u v))  
                ==> (a b c d)

Specification

The following primitive forms are provided:
         define-syntax
         let-syntax
         letrec-syntax
         begin-for-syntax
                        
         identifier?
         bound-identifier=?
         free-identifier=?
         literal-identifier=?
         
         syntax
         quasisyntax
         embedded-syntax

         datum->syntax-object       
         syntax-object->datum
         make-fluid-identifier

         syntactic-wind

         syntax-debug        
         syntax-error 
The following library forms are provided:
         syntax-case
         with-syntax
         syntax-rules
Syntax objects:

A syntax object is a graph whose nodes are Scheme pairs or vectors and whose leaves are constants or identifiers. The following expressions evaluate to syntax objects:

  '()
  1
  #f
  '(1 2 3)
  (cons (syntax x) (vector 1 2 3 (syntax y)))
  (syntax (let ((x 1)) x))
  (quasisyntax (let ((x 1)) ,(syntax x)))
  
Symbols may not appear in syntax objects:
  '(let ((x 1)) x)  ==> not a syntax object
  
Reflective tower:

The macro system proposed here has an infinite reflective tower, determining the (meta-) language that is used for different phases of execution. The hygiene algorithm of this SRFI implements a refinement of lexical scoping whereby the meaning of an identifier is determined by lexically visible bindings at the language level of its use.

In the following expression, the occurrence of (syntax x) on the right hand side of m is just quoted data at the meta-level and therefore not affected by the inner binding. The resulting identifier is used at the ground level in the expanded code, where it denotes the outer binding.

  (let ((x 1))
    (let-syntax ((m (lambda (form)
                      (let ((x 2))
                        (syntax x)))))
      (m)))  ==> 1
In the following example, there are two instances of x introduced by the macro n. The first is used at the meta-level, where it refers to the binding 2, whereas the second is used at the ground level, where it refers to the binding 1.
  (let ((x 1))
    (let-syntax ((m (lambda (form)
                      (let ((x 2))
                        (let-syntax ((n (lambda (form)
                                          (syntax
                                           (let ((y x))
                                             (quasisyntax (list x ,y)))))))
                          (n))))))
      (m))) ;==> (1 2)
The primitive begin-for-syntax, described below, is provided for specifying computations at arbitrary reflective level.
syntax: (define-syntax keyword exp)
        (define-syntax (keyword . formals) exp1 exp ...)
Exp is expanded and evaluated in the current top-level syntactic environment, and must evaluate to a procedure of type syntax-object -> syntax-object, also called a transformer. The top-level syntactic environment is extended by binding the identifier keyword to the resulting transformer.

The second variant is equivalent to

  (define-syntax keyword 
     (let ((transformer (lambda (dummy . formals) exp1 exp ...)))
       (lambda (form)
         (apply transformer form))))
syntax: (let-syntax    ((keyword exp) ...) exp* ...)
        (letrec-syntax ((keyword exp) ...) exp* ...)
We generalize R5RS, section (4.3.1), by allowing each expression exp to evaluate to an arbitrary transformer procedure.

We also impose the requirement that let[rec]-syntax behave as a splicing form rather than introducing a new local scope. For example:

 (let ((x 1))
    (let-syntax ((foo (syntax-rules ())))
      (define x 2))
    x)               ==> 2
syntax: (begin-for-syntax form ...)
Evaluates the forms form ... at macro-expansion time in sequence at one reflective level higher than the enclosing level. The forms are not evaluated again at runtime. Identifier bindings and references introduced at separate reflective levels refer to disjoint environments.
  (define x 1)
  (begin-for-syntax (define x 2))
  
  (let-syntax ((m (lambda (form)
                    (quasisyntax (list x ,x)))))
    (m))  ==> (1 2)
By nesting begin-for-syntax, we may introduce toplevel bindings and execute computations at arbitrary level in the reflective tower.
  (begin-for-syntax
    (define x 1)
    (begin-for-syntax
      (define x 2)))

  (let ((x 0))
    (let-syntax ((foo (lambda (form)
                        (let-syntax ((bar (lambda (form)
                                            (quasisyntax
                                             (quasisyntax 
                                              (list x ,x ,,x))))))
                          (bar)))))
      (foo))) ==> (0 1 2)
procedure: (identifier? obj)
Returns #t if obj is an identifier, #f otherwise. The identifier type is disjoint from other Scheme primitive types described in R5RS, section (3.2).
procedure: (bound-identifier=?   obj1 obj2)
           (free-identifier=?    obj1 obj2)
           (literal-identifier=? obj1 obj2)
Identifiers are free-identifier=? if they would refer to the same lexical or toplevel binding if inserted as free identifiers in the result of the macro expansion. For this purpose, all identifiers that are not lexically bound are considered implicitly bound at the toplevel.

Identifiers are literal-identifier=? if they are free-identifier=? or if they both refer to toplevel bindings and have the same symbolic name. This primitive should be used to reliably identify literals (such as else in cond) even if they occur in a different module from the macro definition.

Identifiers are bound-identifier=? if a binding of one would capture references to the other in the scope of the binding.

Two identifiers with the same name are bound-identifier=? if both were present in the same toplevel expression in the original program text. Two identifiers will also be bound-identifier=? if they were produced from existing bound-identifier=? identifiers during a single evaluation of the same syntax or quasisyntax form, with the understanding that the evaluation of any nested, unquoted syntax or quasisyntax forms counts as part of the evaluation of an enclosing quasisyntax. In addition, datum->syntax-object may create identifiers that are bound-identifier=? to previously introduced identifiers.

These procedures return #f if either argument is not an identifier.

  (free-identifier=?  (syntax x) (syntax x))      ==> #t
  (bound-identifier=? (syntax x) (syntax x))      ==> #f

  (let ((y (syntax (x . x))))
    (bound-identifier=? (car y) 
                        (cdr y)))                 ==> #t

  (quasisyntax ,(bound-identifier=? (syntax x) 
                                    (syntax x)))  ==> #t

  (let ((x 1))
    (let-syntax ((m (lambda (form)
                      (quasisyntax
                       (let ((x 2))
                         (let-syntax ((n (lambda (form)
                                           (free-identifier=? (cadr form) 
                                                              (syntax x)))))
                           (n ,(cadr form))))))))
      (m x)))  ==> #f

syntax: (syntax datum)
Creates a new syntax object from datum, which must be a syntax object embedded in the input form, as follows: Constants contained in datum are unaffected, while identifiers are replaced by fresh identifiers that are different from all previously existing identifiers in the sense of bound-identifier=?. Two of the resulting identifiers will be bound-identifier=? only if they replaced existing bound-identifier=? identifiers in datum during a single evaluation of the syntax form.

These fresh identifiers remain free-identifier=? to the original identifiers. This means that a fresh identifier will denote the same binding as the original identifier in datum unless macro expansion places an occurrence of it in a binding position.

The core syntax form decribed here has no notion of pattern variable insertion, but is effectively rebound in the scope of syntax-case clauses to implement that feature (see below).

Examples:

  (bound-identifier=? (syntax x) (syntax x))   ==> #f

  (let ((y (syntax (x . x))))
    (bound-identifier=? (car y) 
                        (cdr y)))              ==> #t


  (syntax-object->datum (syntax (x ...)))      ==> (x ...)   

  (define (generate-temporaries list)
    (map (lambda (ignore) (syntax temp))
         list))            
Note that syntax does not unify identifiers previously distinct in the sense of bound-identifier=? occurring in datum even if they have the same symbolic name:
  (let ((x 1))
    (let-syntax
        ((foo (lambda (form)
                (quasisyntax
                 (let-syntax
                     ((bar (lambda (_) 
                             (syntax (let ((x 2)) ,(cadr form))))))
                    (bar))))))
      (foo x)))                 ==> 1
syntax: (quasisyntax template)

Constructs a new syntax object from template, parts of which may be unquoted using unquote or unquote-splicing. If no unquoted subexpressions appear at the same nesting level as the outermost quasisyntax, the result of evaluating (quasisyntax template) is equivalent to the result of evaluating (syntax template). However, if unquoted expressions do appear, they are evaluated and inserted or spliced into the resulting structure according to the rules described for quasiquote in R5RS (4.2.6).

To make nested unquote-splicing behave in a useful way, the R5RS-compatible extension to quasiquote in appendix B of the paper [10] is required mutatis mutandis for quasisyntax.

Identifiers introduced when evaluating the quasisyntax form are different from all previously existing identifiers in the sense of bound-identifier=?. Two of the resulting identifiers will be bound-identifier=? only if they replaced existing bound-identifier=? identifiers in template during a single evaluation of the quasisyntax form, with the understanding that the evaluation of any nested, unquoted syntax or quasisyntax forms counts as part of the evaluation of the enclosing quasisyntax.

These fresh identifiers remain free-identifier=? to the original identifiers. This means that a fresh identifier will denote the same binding as the original identifier in datum unless macro expansion places an occurrence of it in a binding position.

The core quasisyntax form decribed here has no notion of pattern variable insertion, but is effectively rebound in the scope of syntax-case clauses to implement that feature (see below).

Examples:

  (bound-identifier=? (quasisyntax x)
                      (quasisyntax x))                   ==> #f

  (quasisyntax ,(bound-identifier=? (quasisyntax x) 
                                    (syntax x)))         ==> #t

  (let-syntax ((f (lambda (form) (syntax (syntax x)))))
    (quasisyntax ,(bound-identifier=? (f) (f))))         ==> #f

  (let-syntax ((m (lambda (_) 
                    (quasisyntax 
                     (let ((,(syntax x) 1)) ,(syntax x))))))
    (m))  ==> 1
In the above, notice that the rule for bound-identifier=? equivalence inside quasisyntax forms imply the equivalence:
  (quasisyntax (let ((,(syntax x) 1)) ,(syntax x))) 

      <-> (quasisyntax (let ((x 1)) x)) 
The traditional idiom for macro-generating macros containing nested quasisyntax forms works correctly:
  (let-syntax ((m (lambda (form)
                    (let ((x (cadr form)))
                      (quasisyntax 
                       (let-syntax ((n (lambda (_)
                                         (quasisyntax 
                                          (let ((,(syntax ,x) 4)) ,(syntax ,x))))))
                         (n)))))))
    (m z))  ==> 4
The rule for bound-identifier=? equivalence inside quasisyntax is unique in ensuring that the above macro means exactly the same as the corresponding syntax-case macro:
  (let-syntax ((m (lambda (form)
                    (syntax-case form ()
                      ((_ x) (syntax 
                              (let-syntax ((n (lambda (_)
                                                (syntax (let ((x 4)) x)))))
                                (n))))))))
    (m z))   ==> 4
syntax: (embedded-syntax datum)
Returns the existing syntax object datum in the input form. Unlike syntax, no new syntax object is constructed. This primitive is useful for defining certain kinds of macro-generating macros that have to compose pieces of code preserving bound-identifier=? equivalence where not all the pieces are passed via the whole chain of macro calls.

For example, in the following fragment, z is passed to the inner macro by two paths, one of them via form and then form*, and the other via a combination of form only. Using embedded-syntax, we can "tunnel" z unmodified to the inner macro as follows:

  (let ((z 2))
    (let-syntax ((m (lambda (form)
                      (quasisyntax
                       (let-syntax ((n (lambda (form*)
                                         (quasisyntax 
                                          (let ((,(cadr form*) 1))
                                            ,(embedded-syntax ,(cadr form)))))))
                         (n ,(cadr form)))))))
      (m z)))   ==> 1
Compare this with:
  (let ((z 2))
    (let-syntax ((m (lambda (form)
                      (quasisyntax
                       (let-syntax ((n (lambda (form*)
                                         (quasisyntax
                                          (let ((,(cadr form*) 1))
                                            ,(syntax ,(cadr form)))))))
                         (n ,(cadr form)))))))
      (m z)))   ==> 2

procedure: (datum->syntax-object template-identifier obj) 
Transforms obj, which must be a graph with pairs or vectors as nodes and with symbols or constants as leaves, to a syntax object as follows: Constants in obj are unaffected, while symbols appearing in obj are replaced by identifiers that behave under bound-identifier=?, free-identifier=? and literal-identifier=? the same as an identifier with the same symbolic name would behave if it had occurred together with template-identifier in the same source toplevel expression or was produced during the same evaluation of the syntax or quasisyntax expression producing template-identifier.

If template-identifier is a fluid identifier, the symbols in obj will also be converted to fluid identifiers.

  (let-syntax ((m (lambda (_)
                    (let ((x (syntax x)))
                      (let ((x* (datum->syntax-object x 'x)))
                        (quasisyntax
                         (let ((,x 1)) ,x*)))))))
    (m))        ==> 1


  (let ((x 1))
    (let-syntax ((m (lambda (form)
                      (quasisyntax
                       (let ((x 2))
                         (let-syntax ((n (lambda (form)
                                           (datum->syntax-object (cadr form) 'x))))
                           (n ,(cadr form))))))))
      (m z)))   ==> 1
Datum->syntax-object provides a hygienic mechanism for inserting bindings that intentionally capture existing references. Since composing such macros is a subtle affair, with various incorrect examples appearing in the literature, we present a worked-out example, courtesy of [2]:
  (define-syntax if-it
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2 e3)
         (with-syntax ((it (datum->syntax-object (syntax k) 'it)))
           (syntax (let ((it e1))
                     (if it e2 e3)))))))) 
  
  (define-syntax when-it
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2)
         (with-syntax ((it* (datum->syntax-object (syntax k) 'it)))
           (syntax (if-it e1
                          (let ((it* it)) e2)
                          (if #f #f))))))))
  
  (define-syntax my-or
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2)
         (syntax (if-it e1 it e2))))))

  (if-it 2 it 3)    ==> 2
  (when-it 42 it)   ==> 42
  (my-or 2 3)       ==> 2
  (my-or #f it)     ==> Error: undefined identifier: it
                                       
  
  (let ((it 1)) (if-it 42 it #f))    ==> 42
  (let ((it 1)) (when-it 42 it))     ==> 42
  (let ((it 1)) (my-or 42 it))       ==> 42
  (let ((it 1)) (my-or #f it))       ==> 1
  (let ((if-it 1)) (when-it 42 it))  ==> 42
Notice how my-or purposely does not expose it to the user. On the other hand, the definition of when-it explicitly reexports it to the use site, while preserving referential transparency in the last example.
procedure: (syntax-object->datum syntax-object)
Transforms a syntax object to a new graph with identifiers replaced by their symbolic names.

procedure: (make-fluid-identifier template-identifier symbol)
This procedure returns a fresh identifier that is free-identifier=? to (datum->syntax-object template-identifier symbol). The new identifier is not bound-identifier=? to any existing identifiers. If the identifier is inserted as a bound identifier in a binding form, the binding will capture any identifiers in the scope of the binding that are free-identifier=? to (datum->syntax-object template-identifier symbol).

This primitive is useful for implementing expand-time fluid binding forms. The following example illustrates how one may use it to implement fluid-let-syntax from [6, 7]:

  (define-syntax fluid-let-syntax
    (lambda (form)
      (syntax-case form ()
        ((k ((i e) ...) e1 e2 ...) 
         (with-syntax (((fi ...) 
                        (map (lambda (i)
                               (make-fluid-identifier i
                                                      (syntax-object->datum i)))
                             (syntax (i ...)))))
           (syntax 
            (let-syntax ((fi e) ...) e1 e2 ...)))))))
          
  
  (let ((f (lambda (x) (+ x 1))))
    (let-syntax ((g (syntax-rules ()
                      ((_ x) (f x)))))
      (let-syntax ((f (syntax-rules ()
                        ((_ x) x))))
        (g 1))))   ==> 2

  
  (let ((f (lambda (x) (+ x 1))))
    (let-syntax ((g (syntax-rules ()
                      ((_ x) (f x)))))
      (fluid-let-syntax ((f (syntax-rules ()
                              ((_ x) x))))
        (g 1))))   ==> 1
This primitive also provides an alternative mechanism for inserting bindings that intentionally capture existing references. Consider:
  (define-syntax if-it
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2 e3)
         (with-syntax ((it (make-fluid-identifier (syntax here) 'it)))
           (syntax (let ((it e1))
                     (if it e2 e3)))))))) 
  
  (define-syntax when-it
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2)
         (syntax (if-it e1 e2 (if #f #f)))))))
  
  (define-syntax my-or
    (lambda (x)
      (syntax-case x ()
        ((k e1 e2)
         (syntax (let ((thunk (lambda () e2)))
                   (if-it e1 it (thunk))))))))

  (if-it 2 it 3)     ==> 2
  (when-it 42 it)    ==> 42
  (my-or 2 3)        ==> 2
  (my-or #f it)      ==> undefined identifier: it
                                       
  
  (let ((it 1)) (if-it 42 it #f))     ==> 1
  (let ((it 1)) (when-it 42 it))      ==> 1
  (let ((it 1)) (my-or 42 it))        ==> 42
  (let ((it 1)) (my-or #f it))        ==> 1
  (let ((if-it 1)) (when-it 42 it))   ==> 42
Although when-it here is simpler than the above solution using datum->syntax-object, since captures do not have to be explicitly propagated, it now takes a little more work to prevent propagation of captures, as the my-or macro shows. Also, in two cases the answer is different, with explicit bindings here taking precedence over implicit bindings. This behaviour is the same as that of the MzScheme solution suggested in [13], but can be changed by modifying the first argument to make-fluid-identifier.
syntax: (syntactic-wind before form after)
When the expander encounters a syntactic-wind form, the expressions before and after are first evaluated at one reflective level higher than the level at which the syntactic-wind form occurred. They have to evaluate to procedures. The procedure before is called with no arguments, then the form form is expanded, and then the procedure after is called with no arguments. Finally, the expander replaces the syntactic-wind form by the expanded form.

This primitive is useful for managing expand-time environments across macro-invocations. An example of its use is in the reference syntax-case implementation.

  (begin-for-syntax (define env (list (syntax a))))

  (let-syntax ((foo (lambda (form)
                      (quasisyntax ',env))))
    (list
     (syntactic-wind (lambda () (set! env (cons (syntax b) env)))
                     (foo)
                     (lambda () (set! env (cdr env))))
     (foo)))

          ==> ((b a) (a))
procedure: (syntax-debug syntax-object)
Converts its argument to a human-readable format.
  (syntax-debug (syntax (let ((x 1)) y)))   ==> (let ((x#top 1)) y#top)
procedure: (syntax-error obj ...)
Invokes a syntax error. The objects obj ... are displayed, available source-object correlation information is displayed or provided to debugging tools, and the expander is stopped.

library syntax: (syntax-case exp (literal ...) clause ...)

                 clause := (pattern output-expression)
	                   (pattern fender output-expression)
The syntax-case form can be written as a macro in terms of the primitives specified above.

Identifiers in a pattern that are not bound-identifier=? to any of the identifiers (literal ...) are called pattern variables. These belong to the same namespace as ordinary variables and can shadow, or be shadowed by, bindings of the latter. Each pattern is identical to a syntax-rules pattern (R5RS 4.3.2), and is matched against the input expression exp, which must evaluate to a syntax object, according to the rules of R5RS, section (4.3.2), except that the first position in a pattern is not ignored. When an identifier in a pattern is bound-identifier=? to an identifier in (literal ...), it will be matched against identifiers in the input using literal-identifier=?. If a pattern is matched but a fender expression is present and evaluates to #f, evaluation proceeds to the next clause.

In the fender and output-expression of each clause, the (syntax template) and (quasisyntax template) forms are effectively rebound so that pattern variables in pattern, and visible pattern variables in nesting syntax-case forms, will be replaced in template by the subforms they matched in the input. For this purpose, the template in (syntax template) is treated identically to a syntax-rules template (R5RS 4.3.2). Subtemplates of quasisyntax templates that do not contain unquoted expressions are treated identically to syntax templates.

The rules for bound-identifier=? equivalence of fresh identifiers replacing identifiers in templates that do not refer to pattern variables remain as specified in the sections describing the primitives syntax and quasisyntax above.

An ellipsis in a template that is not preceded by an identifier is not interpreted as an ellipsis literal. This allows the following idiom for generating macros containing ellipses:

  (let-syntax ((m (lambda (form)
                    (syntax-case form ()
                      ((_ x ...)
                       (with-syntax ((::: (syntax ...)))
                         (syntax
                          (let-syntax ((n (lambda (form)
                                            (syntax-case form ()
                                              ((_ x ... :::)
                                               (syntax `(x ... :::)))))))
                            (n a b c d)))))))))
      (m u v))  
                ==> (a b c d)
library syntax: (with-syntax template)
As in [6, 7], with-syntax expands to an instance of syntax-case:
  (define-syntax with-syntax
    (lambda (x)
      (syntax-case x ()
        ((_ ((p e0) ...) e1 e2 ...)
         (syntax (syntax-case (list e0 ...) ()
                   ((p ...) (begin e1 e2 ...)))))))) 
library syntax: (syntax-rules template)
See R5RS, section (4.3.2). Definable in terms of syntax-case as [6. 7]:
(define-syntax syntax-rules
  (lambda (x)
    (syntax-case x ()
      ((_ (i ...) ((keyword . pattern) template) ...)
       (syntax (lambda (x)
                 (syntax-case x (i ...)
                   ((dummy . pattern) (syntax template))
                   ...))))))) 

Reader extensions

The reader extensions #'e for (syntax e) and #`e for (quasisyntax e) are recommended.

Implementation

The implementation uses the forms and procedures specified in R5RS. It does not require R5RS macros or any other existing macro system. In addition, it uses gensym with an optional string prefix argument, and an interaction-environment, no-argument variant of eval. Portability hooks are provided for Schemes that lack either of these.

It has been successfully run on at least Chez, Chicken, Gambit and MzScheme.

The implementation was strongly influenced by the explicit renaming system [8, 11].

We use a fast imperative hygiene algorithm that is eager and linear in expression size.

Source-object correlation

The specification requires compound syntax objects to be represented as ordinary Scheme lists or vectors. This means that we cannot store source location information for these in the syntax object itself.

Given this representation, a method to track source information was worked out by Dybvig and Hieb [2]: The expander simply maintains a record of the source information for each list and each (occurrence of each) identifier in some external data structure, e.g., a hash table. This would require an extra wrapper for each identifier occurrence to give it its own identity.

Since this requires reader support, which is not portable, it is not implemented in the reference implementation. Still, the reference implementation does record intermediate expansion steps all the way back to the original source expression and displays these whenever a syntax error is invoked.

During the draft period, the reference implementation will be available here.

Acknowledgements

Special thanks to Kent Dybvig, Matthew Flatt and Felix Winkelmann for helpful comments.

References

[1] André van Tonder - Portable macros and modules

    http://www.het.brown.edu/people/andre/macros/index.htm

[2] R. Kent Dybvig - Private communication.

[3] Marcin 'Qrczak' Kowalczyk - Message on comp.lang.scheme:

    http://groups-beta.google.com/group/comp.lang.scheme/msg/b7075e4ca751dbdb

[4] Ben Rudiak-Gould - Message on comp.lang.scheme:

    http://groups-beta.google.com/group/comp.lang.scheme/msg/180c7627853c288e

[5] Matthew Flatt - Composable and Compilable Macros You Want it When?

[6] R. Kent Dybvig - Schez Scheme user's guide:

    http://www.scheme.com/csug/

[7] Robert Hieb, R. Kent Dybvig and Carl Bruggeman
    - Syntactic Abstraction in Scheme.

    R. Kent Dybvig - Writing hygienic macros in syntax-case

    http://library.readscheme.org/page3.html

[8] William D. Clinger - Hygienic macros through explicit renaming.

    http://library.readscheme.org/page3.html

[9] Eugene E. Kohlbecker, Daniel P. Friedman, Matthias Felleisen and Bruce F. Duba
    - Hygienic macro expansion

    http://library.readscheme.org/page3.html

[10] Alan Bawden - Quasiquotation in Lisp 

     http://citeseer.ist.psu.edu/bawden99quasiquotation.html

[11] Richard Kelsey and Jonathan Rees - The Scheme 48 implementation

     http://s48.org/

[12] Robert Hieb, R. Kent Dybvig - A compatible low-level macro facility

     Revised(4) Report on the Algorithmic Language Scheme (appendix)
     
[13] Matthew Flatt 
     - Introducing an Identifier into the Lexical Context of a Macro Call
    
     http://list.cs.brown.edu/pipermail/plt-scheme/2004-October/006891.html

Copyright

Copyright (C) André van Tonder (2005). All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Author: André van Tonder
Editor: Francisco Solsona