#,
external form
This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-10@nospamsrfi.schemers.org
. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.
As an extensible external notation the proposed #,()
form
is intended to be specialized in other SRFIs, especially those that
introduce new types of Scheme values. One example is SRFI-4, "Homogeneous
numeric vector datatypes". The external representation for float
vectors was in "minor nonconformance to Standard Scheme." The present
SRFI avoids this conflict. This SRFI can also solve a problem of
representing a 2D matrix with a zero dimension, mentioned in a SRFI-4
discussion archive. Incidentally, SRFI-4 discussion was the
inspiration for the present proposal.
This SRFI discusses and emulates reader-macros and specific sharpsign forms of Common Lisp.
The present SRFI proposes an extensible external representation of
Scheme values, a notational convention for future SRFIs. This SRFI
adds #,(
as a new token and extends production rules of
the grammar for a Scheme reader. The #,()
form can be
used for example to denote values that do not have a convenient
printed representation, as well for conditional code compilation. It
is proposed that future SRFIs that contain new read syntax for values
use the #,()
notation with an appropriate tag symbol.
As a particular example and the reference implementation for the
#,()
convention, this SRFI describes an interpretation of
the #,()
external form as a read-time application.
External representations of booleans, numbers, lists, vectors, strings are codified in RnRS. Every Scheme reader has an in-born knowledge how to parse the corresponding strings and build Scheme values they represent. The set of these built-in constructors is however fixed, and not amenable. This SRFI proposes to lift this limitation, to allow printed representation and serialization for ports, structures, wills, semaphores and other datatypes present in many Scheme systems as well as datatypes that may be introduced in future SRFIs. It has to be stressed however that the proposed external form can denote not only structural, record and esoteric data but standard Scheme values as well, whose canonical representation is cumbersome, dependent on external circumstances or otherwise unsuitable.
The #,()
notation may appear in any character
stream intended to be processed by a read
procedure. For
example, a #,()
form can be used in a stream being read
solely for data, as a source of initial values and other parameters
for a specific algorithm. On the other hand, a character stream may
contain Scheme code to be read and evaluated. In such streams
#,()
forms can denote either literal values or code to be
conditionally compiled.
A future SRFI-X may introduce a new distinct type of Scheme values and intend the values of this type to be written and read. Any such SRFI has to extend the formal syntax of Scheme: R5RS Sections 7.1.1 and 7.1.2. The most common way of such an extension is
"#" <discriminator> <other-char>*where
<discriminator>
is one character other than
'(
', '\
', 'i
',
'e
', 'b
', 'o
',
'x
', 'd
' and possibly 'f
',
'!
' and 't
'. There does not appear to be
too much choice for the <discriminator>
, especially
if one wants to make it mnemonic.
SRFI-10 proposes another, systematic and expressive way of adding
new
external representations of scheme values, namely, via a
#,(<tag> <datum>*)
form. A particular SRFI-X
that introduces a new data type should simply pick an appropriate
<tag>
(a symbol) and decide upon
<datum>
-arguments. The SRFI-X no longer has to add
new productions to the external syntax, nor does it need to fight for
the remaining characters that may be used as the
<discriminator>
. Implementations of the SRFI-X
should support writing of values of the new datatype in the
#,()
form, and reading them back. The exact way of doing
that is however up to the SRFI-X or its implementations.
The #,()
notation is useful not only for new types of
values. Existing Scheme datatypes can benefit from it as well: for
example, cyclical lists
and other structures with circular dependencies, data structures
with large caches or memoization tables, etc. In another possible
scenario one may introduce the following external forms:
#,(pi) #,(epsilon) #,(Infinity) #,(NaN)which represent rather useful (inexact) numbers.
Yet another application area for the proposed notation is "variable
constants", e.g., #,(os-type)
. The Scheme reader would
replace every instance of #,(os-type)
with an appropriate
literal, e.g., "Solaris"
, "HP-UX"
. Since
this "binding" occurs very early, the corresponding string can be
analyzed in various macros, syntax and other special forms. Another
useful notation of the same kind is #,(srfi-features)
,
which can be replaced by the list of feature identifiers the current
implementation supports.
It is important to note the differences between #,(foo)
and seemingly similar constructions (macro-expand-foo)
and (force (delay (compute-foo))
. The former is processed
at read-time. Therefore it may meaningfully appear in input streams
being read
and processed solely as data. The other forms
may only be used in input streams meant to be passed to a compiler or
otherwise evaluated. Furthermore, an evaluator/macro-expander never
sees #,(foo)
forms. When a compiler/interpreter reads an
expression any #,(foo)
forms that may have occurred in
the input stream will already be replaced by Scheme values they
represent. This is similar to handling of abbreviations (R5RS 7.1.2),
e.g., 'bar
. An evaluator never comes across an apostrophe
as a token: all the evaluator gets is a list of two elements, the
first of them being quote
. If #,(foo)
happens to occur within a special form, that syntax will never see
#,(foo)
as it is -- it will see the corresponding
literal. In contrast, (force (delay (compute-foo))
or (macro-expand-foo)
used as arguments to a special form
will be passed to that form as expressions rather than the
values they evaluate/expand to.
The present SRFI extends the grammar for external representations -- R5RS Section 7.1.2 "External representations" -- as follows:
<datum> ---> <simple datum> | <compound datum> | <hash-comma-datum> <hash-comma-datum> ---> "#,(" <tag> <datum>* ")" <tag> ---> <symbol> | <hash-comma-datum>Furthermore, the production for
<token>
in R5RS
Sec. 7.1.1 is amended to read<token> ---> <identifier> | <boolean> | <number> | <character> | <string> | ( | ) | #( | ' | ` | , | ,@ | . | #,(
This SRFI does not specify how the Scheme reader interprets a
specific <hash-comma-datum>
and creates the
corresponding Scheme value. This is left to particular SRFIs. The
behavior of the reader when it fails to interpret a #,()
form is also unspecified.
The Scheme reports do not define behavior of a reader in case the input stream does not follow the grammar for external representations (R5RS 7.1.2). In particular the reader is not required to report an error. The set of all input streams that can be read without a mandatory error reporting is the same no matter if SRFI-10 is in effect or not. In this sense, SRFI-10 does not change the language, therefore, SRFI-10 is already "implemented" in any existing Scheme system.
#.
and #S()
reader-macros
of Common Lisp. It interprets the
<hash-comma-datum>
of the amended grammar for
external representations
"#,(" <tag> <datum>* ")"where a
tag
must be an (external representation of an)
identifier, and datum
etc. are external representations
of some values, which may be read-time applications as well. The
read procedure should look up a reader-constructor associated
with the tag
, read the arguments
datum
... and apply the constructor to the arguments. The
result of the application is taken to be the value that corresponds to
the #,()
external form. It is an error if a
reader-constructor associated with the tag
cannot be
located.
There are several possible ways to specify an association between symbolic tags and constructor procedures. For example, an implementation may:
define-reader-ctor SYMBOL PROCfor that purpose.
#,(define-reader-ctor 'f32 f32vector)
The present SRFI-10 reference implementation provides the
define-reader-ctor
function. The examples below assume
this particular choice.
The read-time application mechanism is rather close to Lisp
reader's macro functions. Unlike Common Lisp, however, a
reader-constructor is not allowed to read from the input stream on its
own. It may only build values from other values, which must have
already been read and internalized. A reader-constructor must always
return one value; it may not return "nothing". It may however throw an
exception or simply return an "inappropriate" value such as
#f
, which will be caught later. Unlike compile-time
function applications (that is, macro-expansions), a read-time
application has no "second pass". Since a #,()
form is
completely read when a reader-constructor is called, we can relax a
Common Lisp's prohibition on side-effects in the constructor. Unlike
CL reader macro function, a reader-constructor cannot be called
repeatedly during reading of a single, non-nested #,()
form. Besides, prohibition on side effects is very difficult to
enforce. Lifting it does not constitute encouragement of
side-effecting
reader-constructors however.
Common Lisp defines an external form #.obj
, which
instructs the Lisp reader to evaluate obj
right after the
reader parsed it. While #.
is a general-purpose
read-evaluator:
#.obj === (eval obj)
#,
is merely an application:
#,obj === (apply (lookup
(car obj)) (cdr obj))
and obj
must be an external representation of a
list. Read-time application is weaker than a generic read-time
evaluation, and therefore less expensive. Example 5 below illustrates
the difference.
Read-time applications are further restricted to only those
procedures that have been specifically declared for that purpose (via
define-reader-ctor
). The user thus has a fine-grained
control over which functions are being applied at read
time. Performing arbitrary computations at read time is a security
concern: for example, when an application server reads data from a
request pipe, it does not want unexpected functions with potentially
dangerous side-effects to be invoked. In Common Lisp one can avoid
this undesirable behavior -- in an all-or-nothing-fashion -- by
setting a distinguished global variable *read-eval*
to
nil
. This however will globally turn off evaluation of
all #.
forms, even those which are deemed safe and
appropriate. Registration of reader-constructors required in this
reference implementation enables therefore a fine-grained control over
read-time computations. Furthermore, the reference implementation
registers reader-constructors in a readtable. Different invocations of
the reader may execute or flag read-time applications differently
depending on a readtable in effect.
Common Lisp defines a #S(typename . data)
notation
that constructs values at read-time yet requires prior registration of
constructors. It happens only to be used for structures. This SRFI
generalizes this notation, enabling read-time applications to compute
arbitrary values, including S-expressions intended to be
compiled/interpreted.
Alternative printed representations for standard Scheme datatypes and other values
(define-reader-ctor 'list list) (with-input-from-string "#,(list 1 2 #f \"4 5\")" read) ==> (1 2 #f "4 5") (define-reader-ctor '+ +) (with-input-from-string "#,(+ 1 2)" read) ==> 3
(define-reader-ctor 'my-vector (lambda x (apply vector (cons 'my-vector-tag x)))) (with-input-from-string "#,(my-vector (my-vector 1 2))" read) ==> a vector whose second element is a list of a symbol my-vector, number 1, and number 2. (with-input-from-string "#,(my-vector #,(my-vector 1 2))" read) ==> a vector whose second element is a my-vector constructed from numbers 1 and 2. (with-input-from-string "#,(my-vector #,(my-vector #,(+ 9 -4)))" read) ==> '#(my-vector-tag #(my-vector-tag 5))
#,
notation
(define-reader-ctor 'f32 f32vector) (with-input-from-string "#,(f32 1.0 2.0 3.0)" read) ==> a uniform f32 vector with three elements
(define-reader-ctor 'file open-input-file) (with-input-from-string "#,(file \"/tmp/a\")" (lambda () (read-char (read)))will return the first character of the file "
/tmp/a
"
#.
and the proposed
#,
(with-input-from-string "#,(+ 1 (+ 2 3))" read) ==> error: can't add number 1 and a list '(+ 2 3) (with-input-from-string "#,(+ 1 #,(+ 2 3))" read) ==> 6In contrast, in Common Lisp
(with-input-from-string (is "#.(+ 1 (+ 2 3))") (read is)) ==> 6
foo.scm
" contains
(define (temp-proc) (let ((v '#,(f32 1.0 2.0 3.0))) (f32vector-ref v 1)))then
(define-reader-ctor 'f32 f32vector) (load "foo.scm") (pp temp-proc) ==> (lambda () (let ((v '#f32(1. 2. 3.))) (f32vector-ref v 1))) (temp-proc) ==> 2.0
The full implementation and the validation code are available at
read-apply.scmThe title comments in the validation code explain how to run it.
vread-apply.scm
Implementation of a cond-expand
form of SRFI-0 as a
read-time application and the validation code are available at
cond-expand.scm
vcond-expand.scm
The implementation is directly based on a grammar given in SRFI-0. We
assume that "feature identifiers" in effect are contained in a list
ALL-FEATURES
. This list should either be pre-defined in a
Scheme system, or otherwise defined prior to reading the code in
question. This implementation demonstrates how to customize a Gambit
Scheme interpreter before it started
evaluating user's code. The modification includes enabling SRFI-10,
SRFI-0, and building a list of features. The reader-ctor code
hopefully demonstrates that the changes to the reader required to
implement cond-expand
are minor and straightforward --
contrary to what SRFI-0 assumed.
Copyright (C) Oleg Kiselyov (1999). All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.