-- Title -- S-expression comments -- Author -- Taylor Campbell -- Abstract -- This SRFI proposes a simple extension to Scheme's lexical syntax that allows individual S-expressions to be made into comments, ignored by the reader. This contrasts with the standard Lisp semicolon comments, which make the reader ignore the remainder of the line, and the slightly less common block comments, as SRFI 30 defines: both of these mechanisms comment out sequences of text, not S-expressions. -- Rationale -- Line & block comments are useful for embedding textual commentary in programs, but they are practically useless and bothersome without very complicated editor support for removing selected S-expressions while retaining them in text, or subsequently removing the comments and re- introducing the S-expressions themselves. -- Informal specification -- A new octothorpe reader syntax character is defined, #\;, such that the reader ignores the next S-expression following it. An S-expression is defined for the purposes of this SRFI as a full expression that READ may return. For example, (+ 1 #;(* 2 3) 4) =reads=> (+ 1 4) =evals=> 5 (LIST 'X #;'Y 'Z) =reads=> (LIST (QUOTE X) (QUOTE Z)) =evals=> (X Z) (* 3 4 #;(+ 1 2)) =reads=> (* 3 4) =evals=> 12 (#;SQRT ABS -16) =reads=> (ABS -16) =evals=> 16 Some examples of seemingly nested S-expression comments may appear to be confusing at first, but they are straightforwardly explained. For instance, consider the text (LIST 'X #; #; 'Y 'Z). This reads as the list represented as (LIST (QUOTE X)). Note that both 'Y and 'Z seemed to 'disappear.' The reason is simply that, when the first #; reads ahead in the input stream for the next S-expression, the reader encounters another #;, which consumes & ignores the 'Y, and is then forced to proceed on to the 'Z to find a complete S-expression for the first #; to receive; this, too, is ignored, leaving only the closing parenthesis in the stream, which results in (LIST (QUOTE X)). That is a fairly special case of nested S-expression comments. Others are somewhat simpler for intuition to grasp immediately: (LIST 'A #;(LIST 'B #;'C 'D) 'E) =reads=> (LIST (QUOTE A) (QUOTE E)) =evals=> (A E) There are also some other somewhat peculiar examples, such as in dotted lists: '(A . #;B C) =reads=> (QUOTE (A . C)) =evals=> (A . C) '(A . B #;C) =reads=> (QUOTE (A . B)) =evals=> (A . B) These examples require little explanation: since #;B & #;C are both ignored as whitespace, the two strings must be considered equivalent to '(A . C) & '(A . B), respectively. Note, however, that any text that is invalid without S-expression comments will be invalid with them as well, and S-expression comments must be followed by complete S-expressions; for instance, the following are all errors: (#;A . B) (A . #;B) (A #;. B) (#; #; X Y . Z) (#; #; X . Z) -- Formal specification -- R5RS's formal syntax is modified as follows: - In section 7.1.1, the names , , , , , , & are given a prefix of 'raw'; that is, , , &c. Also, an option is added to the newly named : ---> ... | #; - In section 7.1.2, the rule ---> #; is added, and, for every X in {token, identifier, variable, boolean, character, string, number}, the following rule is added: ---> * Additionally, all of the tokens in section 7.1.2 through 7.1.5 that referred to , , &c., now refer to the new , , &c., as introduced in section 7.1.2, not those that were previously defined in section 7.1.1 & now have the names , , &c. All of the new or modified rules are presented here: 7.1.1: ---> | | | | | ( | ) | #( | ' | ` | , | ,@ | . | #; ---> * | ---> that isn't also a > ---> #t | #f ---> #\ | #\ ---> " * " ---> | | | 7.1.2: ---> #; ---> * ---> * ---> * ---> * ---> * ---> * ---> * Commented data are then ignored by the semantics. -- Design questions -- Why not use a COMMENT macro? A COMMENT macro works only in Scheme code where it is defined. It could not, for example, work in quoted S-expressions the same way that the #; syntax does, although it could serve a different purpose. Furthermore, though such a macro would seem trivial, it is impossible for macros to expand to fewer or more than one output expression: a COMMENT macro would need to expand to exactly one expression, which would be meaningless and useless in many contexts. For example, the above examples would not work with this definition of COMMENT: (define-syntax comment (syntax-rules () ((comment expression) 'comment))) (+ 1 (comment (* 2 3)) 4) error: COMMENT is not a number (list 'x (comment 'y) 'z) => (X COMMENT Z) Moreover, a COMMENT macro would also, like a #,(COMMENT ) SRFI 10 constructor (see below), require an added parenthesis at the end of the commented S-expression. Why not use a SRFI 10 #,(...) constructor? If there were a SRFI 10 constructor #,(COMMENT ), one would need to not only add the initial characters but also find the end of the S-expression and append a closing parenthesis. While this could be performed by an editor, it is a tedious & dreary task that serves no useful purpose other than to permit re-use of the octothorpe reader syntax character for semicolons. Why #; in specific? This syntax is already in wide use, and it is sufficiently close to the existing semicolon syntax for its meaning, a comment syntax, to be obvious, yet the octothorpe adds enough of a mark to distinguish it from the existing line comment syntax. -- Implementation -- Though there is no portable manner by which to extend Scheme's reader, the #; syntax is trivial to implement. The procedure associated with the #\; octothorpe reader syntax character could simply first call READ to consume the following S-expression and subsequently call an internal procedure that reads either the following S-expression or a delimiting token, such as a closing parenthesis. As an example, here is an implementation of this SRFI for Scheme48's simple reader. The DEFINE-SHARP-MACRO procedure defines an octothorpe reader syntax character. When the reader sees an octothorpe followed by a character, it applies the procedure associated with that character to the character and the port being read from. (The first argument is due to the way that Scheme48's reader reads numbers & identifiers.) SUB-READ is a procedure like READ, except that, if it meets a closing parenthesis or a dot, it returns the a token indicating that the outer call to READ should terminate the list or read the dotted tail of the list. (This permits, for instance, (FOO BAR #;BAZ) to work: it is equivalent to just (FOO BAR).) SUB-READ-CAREFULLY is like SUB-READ, but it signals an error if it read a token. (define-sharp-macro #\; (lambda (char port) (read-char port) ; The octothorpe reader leaves the semicolon ; in the input stream, so we consume it. (sub-read-carefully port) (sub-read port))) The reader, for completeness, is in s48-read.scm. That file uses some non-standard procedures, most of which are mentioned at the top of the file. Those that are not mentioned are (INPUT-PORT-OPTION list), which returns the car of LIST if it is non-empty or the current input port if it is; and (WARN message irritant ...), which signals a warning with the given message & irritants. -- Copyright -- Copyright (C) 2004 Taylor Campbell. All rights reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing this copyright notice or references to the Scheme Request for Implementation process or editors, except as needed for the purpose of developing SRFIs in which case the procedures for copyrights defined in the SRFI process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the authors or their successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE AUTHORS AND THE SRFI EDITORS DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ON ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.