mailto:srfi-72@srfi.schemers.org
.
See instructions
here to subscribe to the list. You can access previous messages via
the
archive of the mailing list.
Existing hygienic macro algorithms work well for high-level macros, but do not solve the unintended variable capture problem in the case of procedural macros. The proposal described here identifies and solves a class of capture problems that may occur in procedural macros.
A primitive is provided for writing hygiene-breaking macros that are composable and referentially transparent, correcting a known shortcoming of some existing macro systems.
The design provides a small number of core primitives. More complex features, including pattern matching and substitution facilities such as syntax-case and quasisyntax, are specified as library syntax.
Syntax-case is expressible as a macro in this proposal and specified as library syntax. The version specified here incorporates a proposal for improved hygiene in procedural macros.
The reference implementation documents an imperative hygiene algorithm that is eager, has linear complexity, and is very fast.
Source correlation information is not tied to syntax objects, but instead recorded separately by the expander, allowing not just source location but also intermediate expansion steps to be tracked.
The primitive interface to compound syntax objects is procedural rather than through special forms. In particular, the usual and very useful abstractions car, cdr, cons , ..., for manipulating syntactic objects in a Lisp are available to the macro writer.
The reference implementation is small and tractable, written in mostly portable R5RS Scheme without using R5RS macros. It should run out of the box on most existing Schemes.
(define-syntax (swap! a b) (quasisyntax (let ((temp ,a)) (set! ,a ,b) (set! ,b temp))))This macro builds a syntax object using the quasisyntax substitution form. Syntax provided as part of the input expression is inserted in the result using unquote or unquote-splicing. Macros written in this way are hygienic and referentially transparent.
This macro may also be written as
(define-syntax swap! (lambda (form) (let ((a (cadr form)) (b (caddr form))) `(,(syntax let) ((,(syntax temp) ,a)) (,(syntax set!) ,a ,b) (,(syntax set!) ,b ,(syntax temp))))))showing that quasisyntax may itself be defined as library syntax in terms of a primitive syntax form [12].
The example illustrates that we can use the traditional and very useful abstractions car, cdr, ..., for handling compound syntax objects in a Lisp. Indeed, the core interface to compound syntax objects is procedural rather than via special forms.
Syntax-case is expressible as a macro in the current proposal and specified as library syntax, so that we can write, for example.
(define-syntax cond (lambda (x) (syntax-case x () ((_ c1 c2 ...) (let f ((c1 c1) (cmore (syntax (c2 ...)))) (if (null? cmore) (syntax-case c1 (else =>) ((else e1 e2 ...) (syntax (begin e1 e2 ...))) ((e0) (syntax (let ((t e0)) (if t t)))) ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t))))) ((e0 e1 e2 ...) (syntax (if e0 (begin e1 e2 ...))))) (let ((rest (f (car cmore) (cdr cmore)))) (syntax-case c1 (=>) ((e0) (quasisyntax (let ((t e0)) (if t t ,rest)))) ((e0 => e1) (quasisyntax (let ((t e0)) (if t (e1 t) ,rest)))) ((e0 e1 e2 ...) (quasisyntax (if e0 (begin e1 e2 ...) ,rest)))))))))))As the example shows, we may combine implicit and explicit substitution in quasisyntax.
One is not limited to using syntax-case for matching. The core primitives described here may be used to implement more general pattern matchers [1].
In the proposal of this SRFI, the forms quasisyntax and syntax-case are specified so that the following macros will give the answer 1
(define-syntax (no-capture) | (define-syntax (no-capture) | (define (helper value) | (define (helper value) (quasisyntax | (with-syntax ((value value)) (let ((temp 2)) ,value))) | (syntax (let ((temp 2)) value)))) | (quasisyntax | (with-syntax ((nested (helper (syntax temp)))) (let ((temp 1)) | (syntax (let ((temp 1)) ,(helper (syntax temp))))) | nested)))) | (no-capture) ==> 1 | (no-capture) ==> 1whereas the same macro would give the answer 2 in existing systems due to a variable capture.
Note that if the helper procedure were written as a macro, one would expect to obtain the answer 1 due to automatic hygiene. In the current proposal, the macro writer is not penalized for using a procedure instead.
Next consider the following macros for a version of let with left to right evaluation. With the proposal of this SRFI, this definition is correct:
(define-syntax let-in-order (lambda (form) (define (let-help form tems vars) (syntax-case form () ((_ () e0 e1 ...) (with-syntax (((var ...) vars) ((tem ...) tems)) (syntax (let ((var tem) ...) e0 e1 ...)))) ((_ ((var exp) binding ...) e0 e1 ...) (quasisyntax (let ((tem exp)) ,(let-help (syntax (_ (binding ...) e0 e1 ...)) (cons (syntax tem) tems) (cons (syntax var) vars))))))) (let-help form '() '()))) (let-in-order ((x 1) (y 2)) (+ x y)) ==> 3whereas existing systems will give the wrong answer 4 due to variable capture.
Again, if the helper were expressed as a macro, one would not expect variable capture. With the current proposal, this good behaviour is guaranteed also for procedural helpers.
To preserve hygiene across macro invocations, hygiene algorithms effectively rename identifiers introduced by a macro to prevent accidental capture. In existing systems only one renaming function is typically used during the entire dynamic extent of the macro invocation. This means that accidental captures may still take place in procedural macros unless the programmer keeps careful track of all identifiers introduced in different parts of the macro and all its helper procedures. This burden is reminiscent of the difficulties one encounters in languages with dynamic scoping of variables. It limits the modularity and constrains the maintainability of complex macros. Especially if code is generated recursively, as in the let-in-order macro, it may be quite hard, in these systems, to verify whether accidental captures occur.
The solution proposed here is based on a primitive with-fresh-renaming-scope that effectively causes a fresh renaming function to be used in the static region following this keyword. Identifiers introduced while expanding this region cannot capture or be captured by identifiers introduced outside this region. It is important to note that this is a lexical, not a dynamic, construct. For example, we have
(let ((f (lambda () (syntax x)))) (with-fresh-renaming-scope (bound-identifier=? (syntax x) (f)))) ==> #fwhere bound-identifier=? non-equivalence means that a binding of one identifier cannot capture references to the other.
It should be clear from studying the examples of this section that the capture problems occur when splicing together syntax generated in a lexically separate parts of the program text. We can avoid these capture problems by requiring all substitution forms to expand to an occurrence of with-fresh-renaming-scope.
In the current proposal, the substitution forms are quasisyntax, syntax-case and with-syntax, which are indeed defined in this way. With this specification, the above macros all behave correctly as they stand.
As a result, identifiers introduced in the region lexically enclosed by the quasisyntax, syntax-case or with-syntax keywords cannot capture or be captured by identifiers introduced elswhere in the program text.
(if-it 1 it 42) ==> 1We would like to be able to freely compose unhygienic macros, something that is notoriously difficult to do in existing systems. For example, we would like to be able to write macros when-it, if-flag-it and my-or in terms of if-it as follows:
(define-syntax (when-it condition consequent) (quasisyntax (if-it ,condition ,consequent (if #f #f)))) (define-syntax (if-flag-it body else) (quasisyntax (if-it flag ,body ,else))) (define-syntax (my-or expr1 expr2) (quasisyntax (if-it ,expr1 it ,expr2)))These macros should behave as follows:
(when-it 42 it) ==> 42 (define flag 3) (if-flag-it it 'none) ==> 3 (my-or 2 it) ==> 2 (my-or #f it) ==> #fThe macro system described here has a primitive datum->syntax similar to that provided in the Chez syntax-case specification. However, as far as the author is aware, it is impossible to satisfy these four conditions using datum->syntax without code-walking. Furthermore, we may wish to impose referential transparency requirements such as
(let ((it 1)) (if-it 42 it #f)) ==> 1 (let ((it 1)) (when-it 42 it)) ==> 1 (let ((it 1)) (my-or 2 it)) ==> 2 (let ((it 1)) (my-or #f it)) ==> 1by analogy with the behaviour of
(let ((else #f)) (my-cond (else 2))) ==> voidTo satisfy these requirements, we provide a new primitive, make-capturing-identifier, that introduces an identifier which, when bound, will capture all free-identifier=? identifiers in its scope. With this primitive, the following implementation of if-it satisfies all the above requirements:
(define-syntax (if-it condition consequent alternative) (let ((it (make-capturing-identifier (syntax here) 'it))) (quasisyntax (let ((,it ,condition)) (if ,it ,consequent ,alternative)))))A similar idea has been proposed in [2].
The meaning of the new identifier is determined by the lexical binding of the first argument. This allows us fine control over what we mean by referential transparency. Compare for example the above with:
(define-syntax if-it (lambda (form) (syntax-case form () ((keyword condition consequent alternative) (with-syntax ((it (make-capturing-identifier keyword 'it))) (syntax (let ((it condition)) (if it consequent alternative)))))))) (let ((it 1)) (if-it 42 it #f)) ==> 42
The primitive datum->syntax is still the appropriate primitive for introducing identifiers that should be captured by bound-identifier=? identifiers in surrounding binding forms. Comparing with the description of make-capturing-identifier above, we see that the one introduces identifiers that are the subject of capture, while the other introduces identifiers that should be the object of capture.
define-syntax let-syntax letrec-syntax set-syntax! identifier? bound-identifier=? free-identifier=? literal-identifier=? syntax syntax-quote with-fresh-renaming-scope make-capturing-identifier datum->syntax syntax->datum expand syntax-debug syntax-errorThe following library forms are provided:
quasisyntax syntax-case with-syntax syntax-rules
Syntax objects:
'() 1 #f '(1 2 3) (cons (syntax x) (vector 1 2 3 (syntax y))) (syntax (let ((x 1)) x)) (quasisyntax (let ((x 1)) ,(syntax x)))Symbols may not appear in syntax objects:
'(let ((x 1)) x) ==> not a syntax object
syntax: (DEFINE-SYNTAX var exp) (DEFINE-SYNTAX (var . formals) exp1 exp ...)
The second variant is equivalent to
(DEFINE-SYNTAX var (let ((transformer (lambda (dummy . formals) exp1 exp ...))) (lambda (form) (apply transformer form))))Exp may, but does not have to, evaluate to a procedure, also called a transformer.
syntax: (LET[REC]-SYNTAX ((var exp) ...) exp* ...)
(let ((x 'outer)) (let-syntax ((m (lambda (_) (syntax x)))) (let ((x 'inner)) (m)))) ==> outer (let-syntax ((when (lambda (form) (quasisyntax (if ,(cadr form) (begin ,@(cddr form))))))) (let ((if #t)) (when if (set! if 'now)) if)) ==> nowMacros defined in a lexically enclosing let[rec]-syntax are available for expanding further nested macros, as the following example shows:
(let ((x 1)) (let-syntax ((m (lambda (_) (syntax (syntax x))))) (let-syntax ((n (lambda (_) (m)))) (n)))) ==> 1
syntax: (SET-SYNTAX! var exp)
(define-syntax (test) (syntax (syntax 'a))) (set-syntax! test (lambda (form) (test))) (test) ==> a
procedure: (IDENTIFIER? obj)
procedure: (BOUND-IDENTIFIER=? obj1 obj2) (FREE-IDENTIFIER=? obj1 obj2) (LITERAL-IDENTIFIER=? obj1 obj2)
Identifiers are literal-identifier=? if they are free-identifier=? or if they both refer to toplevel bindings and have the same symbolic name. This primitive should be used to reliably identify literals (such as else in cond) even if they occur in a different module from the macro definition.
Identifiers are bound-identifier=? if a binding of one would capture references to the other in the scope of the binding. Two identifiers with the same name are bound-identifier=? if they were present in the same toplevel expression in the original program text. Identifiers will also be bound-identifier=? if they were created by applying syntax to existing bound-identifier=? identifiers during the same macro-invocation or invocation of with-fresh-renaming-scope. In addition, datum->syntax may create identifiers that are bound-identifier=? to previously introduced identifiers.
These procedures return #f if either argument is not an identifier.
syntax: (SYNTAX datum)
During the course of a single macro invocation, syntax ordinarily acts like a one-to-one mathematical function on identifiers: Two identifiers created by evaluating syntax expressions will be bound-identifier=? if and only if the syntax expressions were evaluated during the same macro invocation and the original identifiers in the templates were bound-identifier=?.
(bound-identifier=? (syntax x) (syntax x)) ==> #tThis behaviour may be modified by the form with-fresh-renaming-scope, which in effect causes a fresh renaming function to be used for evaluating syntax expressions occurring in the static region following this keyword. Note that this is a lexical, not dynamic, construct. For example, one has
(let ((f (lambda () (syntax x)))) (with-fresh-renaming-scope (bound-identifier=? (syntax x) (f)))) ==> #f
Identifiers that are bound-identifier=? are required to also be free-identifier=?, denoting the same binding. Any attempt to break this invariant should cause an error to be signaled.
(cons (syntax x) (let ((x 1)) (syntax x))) ==> error
Examples:
In the following example, (m) expands to (syntax x), where x denotes the outer binding. Although the expanded (syntax x) occurs in the syntactic environment of the middle binding, the fresh identifier resulting from evaluating it will denote the same thing as the identifier x in the template. It will therefore also refer to the outer binding.
(let ((x 'outer)) (let-syntax ((m (lambda (_) (syntax (syntax x))))) (let ((x 'middle)) (let-syntax ((n (lambda (_) (m)))) (let ((x 'inner)) (n)))))) ==> outerNote that syntax does not unify identifiers previously distinct in the sense of bound-identifier=? occurring in template even if they have the same symbolic name:
(let ((x 1)) (let-syntax ((m (lambda (_) (syntax (syntax x))))) (let ((x 2)) (let-syntax ((n (lambda (_) (quasisyntax (let-syntax ((o (lambda (_) (,(syntax syntax) (,(syntax list) ,(m) ,(syntax x)))))) (o)))))) (n))))) ==> (1 2)
syntax: (SYNTAX-QUOTE datum)
For example, in the following fragment, z is passed to the inner macro by two paths, one of them via x and then y, and the other via only x. Using syntax-quote, we can "tunnel" x to the inner macro as follows:
(let-syntax ((m (lambda (form) (syntax-case form () ((_ x) (syntax (let-syntax ((n (lambda (form*) (syntax-case form* () ((_ y) (with-syntax ((x (syntax-quote x))) (syntax (let ((y 1)) x)))))))) (n x)))))))) (m z)) ==> 1
syntax: (WITH-FRESH-RENAMING-SCOPE exp1 exp ...)
The body exp1 exp ... is subject to the rules for a lambda body, and may contain internal definitions.
procedure: (MAKE-CAPTURING-IDENTIFIER context-identifier symbol)
(define-syntax (if-it condition consequent alternative) (let ((it (make-capturing-identifier (syntax here) 'it))) (quasisyntax (let ((,it ,condition)) (if ,it ,consequent ,alternative))))) (if-it 42 it #f) ==> 42 (let ((it 1)) (if-it 42 it #f)) ==> 1The following examples illustrate how the behaviour of the capturing identifier is affected by the context-identifier argument. Compare in particular the first example below with the one above.
(define-syntax if-it (lambda (form) (let ((it (make-capturing-identifier (car form) 'it))) (quasisyntax (let ((,it ,(cadr form))) (if ,it ,(caddr form) ,(cadddr form))))))) (let ((it 1)) (if-it 42 it #f)) ==> 42 (let ((y 'outer)) (let-syntax ((m (lambda (_) (make-capturing-identifier (syntax here) 'y)))) (let ((y 'inner)) (m)))) ==> outer (let ((y 'outer)) (let-syntax ((m (lambda (form) (let ((y (make-capturing-identifier (syntax here) 'y))) (quasisyntax (let ((,y 'inner)) ,@(cdr form))))))) (m y))) ==> inner (let ((y 'outer)) (let-syntax ((m (lambda (form) (let ((y (make-capturing-identifier (syntax here) 'y))) (quasisyntax (let ((,y 'inner)) ,@(cdr form))))))) (let ((y 'more)) (m y)))) ==> more
procedure: (DATUM->SYNTAX context-identifier obj)
If template-identifier is a capturing identifier, the symbols in obj will also be converted to capturing identifiers.
(let ((x 'outer)) (let-syntax ((m (lambda (_) (syntax (syntax z))))) (let ((x 'middle)) (let-syntax ((n (lambda (_) (datum->syntax (m) 'x)))) (let ((x 'inner)) (n)))))) ==> outer (let-syntax ((m (lambda (form) (syntax-case form () ((_ y) (let ((x (datum->syntax (syntax y) 'x))) (quasisyntax (let ((y 1)) ,x)))))))) (m x)) ==> 1
procedure: (SYNTAX->DATUM syntax-object)
procedure: (EXPAND syntax-object)
(expand (syntax (let ((x 1)) x))) ==> ((lambda (@x5872) @x5872) 1)
procedure: (SYNTAX-DEBUG syntax-object)
(syntax-debug (syntax (let ((x 1)) y))) ==> (let ((x#top 1)) y#top)
procedure: (SYNTAX-ERROR obj ...)
library syntax: (QUASISYNTAX template)
Constructs a new syntax object from the template, parts of which may be unquoted using unquote or unquote-splicing. Quasisyntax is to syntax as quasiquote is to quote.
For example, no variable capture occurs in the following macro:
(define-syntax (no-capture) (define (helper value) (quasisyntax (let ((temp 2)) ,value))) (quasisyntax (let ((temp 1)) ,(helper (syntax temp))))) (no-capture) ==> 1since, for example, the second quasisyntax expression expands to the equivalent of
(with-fresh-renaming-scope `(,(syntax let) ((,(syntax temp) 1)) ,(helper (syntax temp))))To make nested unquote-splicing behave in a useful way, the R5RS-compatible extension described in appendix B of the paper [10] is required.
Examples:
(bound-identifier=? (quasisyntax x) (quasisyntax x)) ==> #f (define-syntax (macro-generate name id) (quasisyntax (define-syntax (,name) (quasisyntax (let ((,(syntax ,id) 4)) ,(syntax ,id)))))) (macro-generate test z) (test) ==> 4 (define (generate-temporaries lst) (map (lambda (_) (quasisyntax temp)) lst)) (define-syntax (test-temporaries) (let ((temps (generate-temporaries '(1 2)))) (quasisyntax ((lambda ,temps (list ,@temps)) 1 2)))) (test-temporaries) ==> (1 2)
library syntax: (SYNTAX-CASE exp (literal ...) clause ...) clause := (pattern output-expression) (pattern fender output-expression)
Each pattern is identical to a syntax-rules pattern, and the optional fender may specify additional constraints on acceptance of the clause [6, 7]. Literals in the list (literal ...) are matched against identifiers in the input form using literal-identifier=?.
In the output-expression of each clause, the syntax keyword is effectively rebound to implement implicit substitution of variables bound in lexically enclosing syntax-case pattern's, so that the template in (syntax template) is treated identically to a syntax-rules template.
Subtemplates of quasisyntax templates that do not contain completely unquoted expressions are treated in the same way as syntax templates, allowing implicit substitution also inside these quasisyntax subtemplates.
The proposal adds the following requirement:
For example, no variable capture occurs in the following macro, where the with-syntax forms expand to occurrences of syntax-case:
(define-syntax (no-capture) (define (helper value) (with-syntax ((value value)) (syntax (let ((temp 2)) value)))) (with-syntax ((nested (helper (syntax temp)))) (syntax (let ((temp 1)) nested)))) (no-capture) ==> 1
library syntax: (WITH-SYNTAX template)
(define-syntax with-syntax (lambda (x) (syntax-case x () ((_ ((p e0) ...) e1 e2 ...) (syntax (syntax-case (list e0 ...) () ((p ...) (begin e1 e2 ...))))))))and inherits the modification for improved hygiene specified above for the latter.
library syntax: (SYNTAX-RULES template)
The implementation has been successfully tested on Chez, Chicken, Gambit and MzScheme.
The implementation was strongly influenced by the explicit renaming system [8, 11].
We use an imperative hygiene algorithm that is eager, has linear complexity, and is very fast. This is achieved by having bound-identifier=? identifiers share a location, so that alpha substitutions can be done by a simple imperative update of an identifier and no additional work is required to propagate substitutions or environments to leaves. In addition, source-object correlation information is not stored in syntax objects, but tracked independently, which makes it possible to represent syntax objects as ordinary list or vector structure.
During the draft period, the reference implementation will be available here.
[1] André van Tonder - Simple macros and simple modules http://www.het.brown.edu/people/andre/macros/index.htm [2] Oleg Kiselyov - Message on comp.lang.scheme: http://groups-beta.google.com/group/comp.lang.scheme/msg/9c7da84c5496d1f3 [3] Marcin 'Qrczak' Kowalczyk - Message on comp.lang.scheme: http://groups-beta.google.com/group/comp.lang.scheme/msg/b7075e4ca751dbdb [4] Ben Rudiak-Gould - Message on comp.lang.scheme: http://groups-beta.google.com/group/comp.lang.scheme/msg/180c7627853c288e [5] Matthew Flatt - Composable and Compilable Macros You Want it When? [6] R. Kent Dybvig - Schez Scheme user's guide: http://www.scheme.com/csug/ [7] Robert Hieb, R. Kent Dybvig and Carl Bruggeman - Syntactic Abstraction in Scheme. http://library.readscheme.org/page3.html [8] William D. Clinger - Hygienic macros through explicit renaming. http://library.readscheme.org/page3.html [9] Eugene E. Kohlbecker, Daniel P. Friedman, Matthias Felleisen and Bruce F. Duba - Hygienic macro expansion http://library.readscheme.org/page3.html [10] Alan Bawden - Quasiquotation in Lisp http://citeseer.ist.psu.edu/bawden99quasiquotation.html [11] Richard Kelsey and Jonathan Rees - The Scheme 48 implementation http://s48.org/ [12] Robert Hieb, R. Kent Dybvig - A compatible low-level macro facility Revised(4) Report on the Algorithmic Language Scheme (appendix)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.