This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-119@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.
This SRFI describes a simple syntax which makes Scheme easier to read for newcomers while keeping the simplicity, generality and elegance of s-expressions. Similar to SRFI 110, SRFI 49 and Python, it uses indentation to group expressions. Like SRFI 110, wisp is general and homoiconic.
Different from its predecessors, wisp only uses the absolute minimum of additional syntax-elements which are required for writing and exchanging arbitrary code-structures. As syntax elements it only uses a colon surrounded by whitespace, the period followed by whitespace as first code-character on the line and optional underscores followed by whitespace at the beginning of the line.
It resolves a limitation of SRFI 110 and SRFI 49, both of which force the programmer to use a single argument per line if the arguments to a procedure need to be continued after a procedure-call.
Wisp expressions can include arbitrary s-expressions and as such provide backwards compatibility.
wisp:

    define : factorial n
    __ if : zero? n
    ____ . 1
    ____ * n : factorial (- n 1)

    display : factorial 5
    newline

s-exp:

    (define (factorial n)
      (if (zero? n)
        1
        (* n (factorial (- n 1)))))

    (display (factorial 5))
    (newline)
A big strength of Scheme and other lisp-like languages is their minimalistic syntax. By using only the most common characters like the period, the comma, the quote and quasiquote, the hash, the semicolon and the parens for the syntax (.,"'`#;()), they are very close to natural language.⁽¹⁾ Along with the minimal list-structure of the code, this gives these languages a timeless elegance.
But as SRFI 110 explains very thoroughly (which we need not repeat here), the parentheses at the beginning of lines hurt readability and scare away newcomers. Additionally, using indentation to mark the structure of the code follows naturally from the observation that most programmers use indentation, with many programmers letting their editor indent code automatically to fit the structure. Indentation is an important way in which programmers understand code, and using it directly to define the structure avoids errors due to mismatches between indentation and actual meaning.
As a solution to this, SRFI 49 and SRFI 110 provide a way to write whitespace sensitive scheme, but both have their share of issues.
As noted in SRFI 110, there are a number of implementation-problems in SRFI 49, as well as specification shortcomings like choosing the name “group” for the construct which is necessary to represent double parentheses. In addition to the problems named in SRFI 110, SRFI 49 is not able to continue the arguments to a procedure on one line, if a prior argument was a procedure call. The following example shows the difference between wisp and SRFI 49 for a very simple code snippet:
wisp:

    * 5
      + 4 3
      . 2 1

SRFI 49:

    * 5
      + 4 3
      2
      1
Here wisp uses the leading period to mark a line as continuing the argument list.⁽²⁾
SRFI 110 improves a lot over SRFI 49. It resolves the group-naming and reduces the need to continue the argument-list by introducing 3 different grouping syntax forms ($, \\ and <* *>). These additional syntax-elements however hurt readability for newcomers (obviously the authors of SRFI 110 disagree with this assertion; their view is discussed in SRFI 110 in the section about wisp). The additional syntax elements lead to structures like the following (taken from examples from the readable project):
SRFI 110 / readable:

    myprocedure
      x: \\ original-x
      y: \\ calculate-y original-y

    a b $ c d e $ f g

    let <* x getx() \\ y gety() *>
    ! {{x * x} + {y * y}}
This is not only hard to read, but also makes it harder to work with the code, because the programmer has to learn these additional syntax elements and keep them in mind before being able to understand the code.
Like SRFI 49, SRFI 110 also cannot continue the argument-list without resorting to single-element lines, though it reduces this problem through the above grouping syntax forms and by advertising the use of neoteric expressions from SRFI 105.
    define : factorial n
    __ if : zero? n
    ____ . 1
    ____ * n : factorial {n - 1}

    display : factorial 5
    newline
Wisp draws on the strength of SRFI 110 but avoids its complexities. It was conceived and improved in the discussions within the readable-project which preceded SRFI 110, and there is a comparison between readable and wisp in SRFI 110.
Like SRFI 110, wisp is general and homoiconic and interacts nicely with SRFI 105 (neoteric expressions and curly infix). Like SRFI 110, the expressions are the same in the REPL and in code-files. Like SRFI 110, wisp has been used for implementing multiple smaller programs, though the biggest program in wisp is still its implementations (written in wisp and bootstrapped via a simpler wisp preprocessor).
But unlike SRFI 110, wisp only uses the minimum of additional syntax-elements which are necessary to support arbitrary code-structures with indentation-sensitive code which is intended to be shared over the internet. To realize these syntax-elements, it generalizes existing syntax and draws on the most common non-letter non-math characters in prose. This allows keeping the actual representation of the code elegant and inviting to newcomers.
Wisp expressions are not as sweet as readable, but they KISS.
Using the colon as syntax element keeps the code very close to written prose, but it can interfere with type definitions as for example used in Typed Racket.⁽³⁾ This can be mitigated in let- and lambda-forms by using the parenthesized form. When doing so, wisp avoids the double-paren for type-declarations and as such makes them easier to catch by eye. For procedure definitions (the only define call where type declarations are needed in typed-racket), a declare macro directly before the define should work well.
Using the period to continue the argument list is unusual compared to other languages and as such can lead to errors when trying to return a variable from a procedure and forgetting the period.
⁽¹⁾ The most common non-letter, non-math characters in prose are .,":'_#?!;, in the given order as derived from newspapers and other sources (for the ngram assembling scripts, see the evolve keyboard layout project).

⁽³⁾ Typed Racket uses forms like (: x Number) to declare types. These forms can still be used directly in parenthesized form, but in wisp-form the colon has to be replaced with \:. In most cases type-declarations are not needed in typed racket, since the type can be inferred. See When do you need type annotations?

The specification is separated into four parts: a general overview of the syntax, a more detailed description, justifications for each added syntax element and clarifications for technical details.
The basics of wisp syntax can be defined in 4 rules, each of which emerges directly from a requirement:
Indentation:
    display
      + 3 4 5
    newline

becomes

    (display
      (+ 3 4 5))
    (newline)

requirement: call procedure without parentheses.
The period:
    + 5
      * 4 3
      . 2 1
becomes
(+ 5 (* 4 3) 2 1)
This also works with just one argument after the period. To start a line without a procedure call, you have to prefix it with a period followed by whitespace.
requirement: continue the argument list of a procedure after an intermediate call to another procedure.
The colon:
    let
      : x 1
        y 2
        z 3
      body
becomes
(let ((x 1) (y 2) (z 3)) (body))
requirement: represent code with two adjacent blocks in double-parentheses.
The underscore (optional):
    let
    _ : x 1
    __  y 2
    __  z 3
    _ body
becomes
(let ((x 1) (y 2) (z 3)) (body))
requirement: share code in environments which do not preserve whitespace.
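The four rules can be sketched as a small line-based converter. The following is a minimal Python sketch for illustration only, not the reference implementation (which is written in wisp for GNU Guile). The names `wisp_to_sexp`, `_inline` and `_render` are hypothetical. The sketch assumes space-only indentation, no comments, and no parentheses or strings that span lines, and it leaves out the `\:` and `\_` escapes as well as prefix symbols.

```python
import re

# Keep double-quoted strings in one piece; split everything else on whitespace.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\S+')

def _inline(tokens):
    """Rule 3: a colon token opens a list that closes at the end of the line."""
    if ":" not in tokens:
        return list(tokens)
    i = tokens.index(":")
    return tokens[:i] + ["(" + " ".join(_inline(tokens[i + 1:])) + ")"]

def _render(part):
    """Turn a token (str) or a line node (dict) into parenthesized text."""
    if isinstance(part, str):
        return part
    return "(" + " ".join(_render(p) for p in part["parts"]) + ")"

def wisp_to_sexp(source):
    root = {"indent": -1, "parts": []}
    stack = [root]  # lines whose parenthesis is still open
    for raw in source.splitlines():
        if not raw.strip():
            continue
        # Rule 4: leading underscores followed by a space count as spaces.
        m = re.match(r"_+(?= )", raw)
        if m:
            raw = " " * m.end() + raw[m.end():]
        indent = len(raw) - len(raw.lstrip(" "))
        tokens = TOKEN.findall(raw)
        # Rule 1: a line closes every open line with equal or deeper indentation.
        while stack[-1]["indent"] >= indent:
            stack.pop()
        parent = stack[-1]
        if tokens[0] == ".":
            # Rule 2: a leading period splices the rest into the parent call.
            parent["parts"].extend(_inline(tokens[1:]))
            continue
        # A colon alone on a line opens a list without a head element.
        parts = [] if tokens == [":"] else _inline(tokens)
        node = {"indent": indent, "parts": parts}
        parent["parts"].append(node)
        stack.append(node)
    return "\n".join(_render(p) for p in root["parts"])

print(wisp_to_sexp("+ 5\n  * 4 3\n  . 2 1"))  # (+ 5 (* 4 3) 2 1)
```

Representing every line as a node with an indentation level keeps the closing logic to a single stack: a new line pops all nodes that are not less indented than itself and attaches to whatever remains on top.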
The syntax shown here is the minimal syntax required for the goal of wisp: indentation-based, general lisp with a simple preprocessor, and code which can be shared easily on the internet:
- . to continue the argument list
- : for double parens
- _ to survive HTML

A line without indentation is a procedure call, just as if it would start with a parenthesis.
display "Hello World!" ; (display "Hello World!")
A line which is more indented than the previous line is a child of that line: it opens a new parenthesis inside the parenthesis of the previous line.
    display                              ; (display
      string-append "Hello " "World!"    ;   (string-append "Hello " "World!"))
A line which is not more indented than the previous line(s) closes the parentheses of all previous lines which have higher or equal indentation. You should only reduce the indentation to indentation levels which were already used by parent lines, else the behaviour is undefined.

    display                              ; (display
      string-append "Hello " "World!"    ;   (string-append "Hello " "World!"))
    display "Hello Again!"               ; (display "Hello Again!")
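The closing rule can be sketched with a stack of indentation levels. This is a hedged Python illustration, not part of the reference implementation; the function name `closed_before_each_line` is hypothetical. It only considers plain lines (no leading period, colon or underscores) with space-only indentation, and it treats a dedent to a level that no open line uses as an error, which is one possible reading of "undefined".

```python
def closed_before_each_line(source):
    """Return (counts, still_open): parentheses closed before each line opens,
    and the number of parentheses still open after the last line."""
    open_levels = []  # indentation of every line whose parenthesis is open
    counts = []
    for line in source.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        closed = 0
        last_closed = None
        # Close all lines with higher or equal indentation.
        while open_levels and open_levels[-1] >= indent:
            last_closed = open_levels.pop()
            closed += 1
        # Dedenting is only well defined onto a level used by a parent line.
        if closed and last_closed != indent:
            raise ValueError("dedent to unused indentation level: %r" % line)
        counts.append(closed)
        open_levels.append(indent)
    return counts, len(open_levels)

example = 'display\n  string-append "Hello " "World!"\ndisplay "Hello Again!"'
print(closed_before_each_line(example))  # ([0, 0, 2], 1)
```

In the example, the third line closes both the `string-append` and the first `display` parenthesis, and its own parenthesis is closed at the end of the input.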
To add any of ' , ` #' #, #` or #@, to the first parenthesis on a line, just prefix the line with that symbol followed by at least one space. Implementations are free to add more prefix symbols.
' "Hello World!" ; '("Hello World!")
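As a rough illustration of the prefix rule, the following Python sketch moves a leading prefix symbol in front of the parenthesis that the line opens. The names `PREFIXES` and `line_with_prefix` are hypothetical, and the sketch handles a single unindented line only.

```python
# Prefix symbols listed in the text above.
PREFIXES = ("'", ",", "`", "#'", "#,", "#`", "#@,")

def line_with_prefix(line):
    """Parenthesize one line, keeping a leading prefix symbol outside."""
    first, _, rest = line.strip().partition(" ")
    rest = rest.strip()
    if first in PREFIXES and rest:
        # The prefix applies to the parenthesis opened by this line.
        return first + "(" + rest + ")"
    return "(" + line.strip() + ")"

print(line_with_prefix("' \"Hello World!\""))  # '("Hello World!")
```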
A line whose first non-whitespace characters are a dot followed by a space (". ") does not open a new parenthesis: it is treated as simple continuation of the first less indented previous line. In the first line this means that this line does not start with a parenthesis and does not end with a parenthesis, just as if you had directly written it in lisp without the leading ". ".
    string-append "Hello"        ; (string-append "Hello"
      string-append " " "World"  ;   (string-append " " "World")
      . "!"                      ;   "!")
A line which contains only whitespace and a colon (":") defines an indentation level at the indentation of the colon. It opens a parenthesis which gets closed by the next line which has less or equal indentation. If you need to use a colon by itself, you can escape it as "\:".
    let                       ; (let
      :                       ;   (
        msg "Hello World!"    ;     (msg "Hello World!"))
      display msg             ;   (display msg))
A colon surrounded by whitespace (" : ") starts a parenthesis which gets closed at the end of the line.
    define : hello who                    ; (define (hello who)
      display                             ;   (display
        string-append "Hello " who "!"    ;     (string-append "Hello " who "!")))
If the colon starts a line which also contains other non-whitespace characters, it starts a parenthesis which gets closed at the end of the line and defines an indentation level at the position of the colon.
If the colon is the last non-whitespace character on a line, it represents an empty pair of parentheses:
    let :                ; (let ()
      display "Hello"    ;   (display "Hello"))
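The inline colon rules for a single line can be sketched as follows. This is a minimal Python illustration under the assumption that the line has no children and no comment; `expand_line` and `_expand` are hypothetical names. The sketch also shows the `\:` escape, which yields a plain colon symbol instead of opening a parenthesis.

```python
import re

# Keep double-quoted strings in one piece; split everything else on whitespace.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\S+')

def _expand(tokens):
    if ":" in tokens:
        i = tokens.index(":")
        # Everything after the colon goes into a list closed at end of line.
        inner = " ".join(_expand(tokens[i + 1:]))
        return _expand(tokens[:i]) + ["(" + inner + ")"]
    # No structural colon left: turn the escaped form \: into a plain colon.
    return [":" if tok == "\\:" else tok for tok in tokens]

def expand_line(line):
    """Parenthesize one childless wisp line, expanding inline colons."""
    return "(" + " ".join(_expand(TOKEN.findall(line))) + ")"

print(expand_line("define : hello who"))  # (define (hello who))
print(expand_line("let :"))               # (let ())
```

A colon at the end of the line leaves nothing to put into the new list, so the empty pair of parentheses falls out of the same code path as the general case.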
You can replace any number of consecutive initial spaces by underscores, as long as at least one whitespace is left between the underscores and any following character. You can escape initial underscores by prefixing the first one with \ ("\___ a" → "(___ a)"), if you have to use them as procedure names.
    define : hello who                    ; (define (hello who)
    _ display                             ;   (display
    ___ string-append "Hello " who "!"    ;     (string-append "Hello " who "!")))
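Converting between the two forms is mechanical. The following Python sketch shows one way to protect indentation before posting code and to restore it afterwards; the function names are hypothetical, and the sketch assumes that no line starts with an underscore symbol that would need the `\` escape. Indentation of a single space cannot be protected this way, because one whitespace must remain after the underscores.

```python
import re

def spaces_to_underscores(source):
    """Protect indentation of two or more spaces before sharing the code."""
    def protect(line):
        indent = len(line) - len(line.lstrip(" "))
        if indent < 2:
            return line
        # Keep one space so the underscores stay separated from the code.
        return "_" * (indent - 1) + " " + line[indent:]
    return "\n".join(protect(line) for line in source.split("\n"))

def underscores_to_spaces(source):
    """Inverse step: turn leading underscores back into spaces."""
    def restore(line):
        return re.sub(r"^_+(?= )", lambda m: " " * len(m.group()), line)
    return "\n".join(restore(line) for line in source.split("\n"))

code = 'define : hello who\n  display\n    string-append "Hello " who "!"'
print(spaces_to_underscores(code))
```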
Linebreaks inside parentheses and strings are not considered linebreaks for parsing indentation. To use parentheses at the beginning of a line without getting double parens, prefix the line with a period.
    define : stringy s
      string-append s " reversed and capitalized:
      " ; linebreaks in strings do not affect wisp parsing
        . (string-capitalize ; same for linebreaks in parentheses
            (string-reverse s))
Effectively, code in parentheses and strings is interpreted directly as Scheme. This way you can simply copy a chunk of Scheme into wisp. The following is valid wisp:
    define foo (+ 1
      (* 2 3)) ; defines foo as 7
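One way to realize this is to join physical lines into logical lines before looking at indentation. The Python sketch below is an assumption about how such a pass could look, not a description of the reference implementation; `logical_lines` is a hypothetical name. It tracks double-quoted strings, `;` comments and parenthesis depth, and it ignores character literals such as `#\(`.

```python
def logical_lines(source):
    """Split source at linebreaks that are outside strings and parentheses."""
    lines, current = [], []
    depth = 0
    in_string = in_comment = escaped = False
    for ch in source:
        if ch == "\n":
            in_comment = False  # a comment always ends with its line
            escaped = False
            if depth == 0 and not in_string:
                lines.append("".join(current))
                current = []
                continue
        elif in_comment:
            pass
        elif in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == ";":
            in_comment = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        current.append(ch)
    if current:
        lines.append("".join(current))
    return lines

text = "define foo (+ 1\n  (* 2 3)) ; defines foo as 7\ndisplay foo"
print(len(logical_lines(text)))  # 2
```

Only the indentation of the first physical line of each logical line then takes part in wisp parsing; the rest is passed through as Scheme.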
The suffix used for wisp code files is .w.

To write a dot inside a form, as in (define (foo . args)), either avoid a linebreak before the dot as in define : foo . args or use a double dot to start the line: . . args. The first dot marks the line as continuation, the second enters the scheme code.

I do not like adding any unnecessary syntax element to lisp. So I want to show explicitly why the syntax elements are required.
See also http://draketo.de/light/english/wisp-lisp-indentation-preprocessor#sec-4
To represent general code trees, we have to be able to represent continuation of the arguments of a procedure with an intermediate call to another (or the same) procedure.
The dot at the beginning of the line as marker of the continuation of a variable list is a generalization of using the dot as identity procedure - which is an implementation detail in many lisps.

(. a) is just a

So for the single variable case, this would not even need additional parsing: wisp could just parse . a to (. a) and produce the correct result in most lisps. But forcing programmers to always use separate lines for each parameter would be very inconvenient, so the definition of the dot at the beginning of the line is extended to mean “take every element in this line as parameter to the parent procedure”.

(. a) → a is generalized to (. a b c) → a b c.
At its core, this dot-rule means that we mark variables in the code instead of procedure calls. We do so, because variables at the beginning of a line are much rarer in Scheme than in other programming languages.
For double parentheses and for some other cases we must have a way to mark indentation levels which do not contain code. Wisp uses the colon, because it is the most common non-alpha-numeric character in normal prose which is not already reserved as syntax by Scheme when it is surrounded by whitespace, and because it already gets used without surrounding whitespace for marking keyword arguments to procedures in Emacs Lisp and Common Lisp, so it does not add completely alien concepts.
The inline procedure call via inline " : " is a limited generalization of using the colon to mark an indentation level: If we add a syntax-element, we should use it as widely as possible to justify adding syntax overhead.
But if you need to use : as variable or procedure name, you can still do so by escaping it with a backslash (\:), so this does not forbid using the character.
For simple cases, the colon could be replaced by clever whitespace parsing, but there are complex cases which make this impossible. The minimal example is a theoretical doublelet which does not require a body. The example uses a double let without action as example for the colon-syntax, even though that does nothing, because that makes it impossible to use later indentation to mark an intermediate indentation-level. Another reason why I would not use later indentation to define whether something earlier is a single or double indent is that this would call for subtle and really hard to find errors.
(doublelet ((foo bar)) ((bla foo)))
The wisp version of this is
    doublelet
      :
        foo bar
      : ; <- this empty back step is the real issue
        bla foo
or shorter with inline colon (which you can use only if you don’t need further indentation-syntax inside the assignment).
    doublelet
      : foo bar
      : bla foo
The need to be able to represent arbitrary syntax trees which can contain expressions like this is the real reason why the colon exists. The inline and start-of-line use is only a generalization of that principle (we add a syntax-element, so we should see how far we can push it to reduce the effective cost of introducing the additional syntax).
There are two alternative ways to tackle this issue: deferred level-definition and fixed-width indentation.
Defining intermediate indentation-levels by later elements (deferred definition) would be a problem, because it would create code which is really hard to understand. An example is the following:
    define (flubb)
        nubb
        hubb
        subb
      gam
would become
(define (flubb) ((nubb)) ((hubb)) ((subb)) (gam))
while
    define (flubb)
        nubb
        hubb
        subb
would become
(define (flubb) (nubb) (hubb) (subb))
Knowledge of later parts of the code would be necessary to understand the parts a programmer is working on at the moment. This would call for subtle errors which would be hard to track down, because the effect of a change in code would not be localized at the point where the change is done but could propagate backwards.
Fixed indentation width (alternative option to inferring it from later lines) would make it really hard to write readable code. Stuff like this would not be possible:
    when
        equal? wrong
               isright? stuff
        fixstuff
In Python, the whitespace-hostile HTML already presents problems with sharing code, for example in email list archives and forums. But Python programmers can mostly infer the indentation by looking at the previous line: if that ends with a colon, the next line must be more indented (there is nothing to clearly mark reduced indentation, though). In wisp we do not have this support, so we need a way to survive in the hostile environment of today's web.
The underscore is commonly used to denote a space in URLs, where spaces are inconvenient, but it is rarely used in Scheme (where the dash ("-") is mostly used instead), so it seems like a natural choice.
You can still use underscores anywhere but at the beginning of the line, and even at the beginning of the line you simply need to escape it by prefixing the first underscore with a backslash ("\____").
The reference implementation realizes a specialized parser for Scheme. It uses GNU Guile and can also be used at the REPL.
The wisp code also contains a general wisp-preprocessor which can be used for any lisp-like language and can be used as an external program which gets called on reading. It does not actually have to understand the code itself.
To allow for easy re-implementation, the chapter after the implementation itself contains a test-suite with commonly used wisp constructs and parenthesized counterparts.
The wisp preprocessor implementation can be found in the wisp code repository. Both implementations are explicitly licensed to allow inclusion in a SRFI.
The reference implementation (also linked below) generates a syntax tree from wisp which can be executed. It is written in indentation-based wisp-syntax and converted with the preprocessor from the code repository (wisp-guile.w) to parenthesized scheme syntax.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.