201: Syntactic Extensions to the Core Scheme Bindings

by Panicz Maciej Godek

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-201@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This document describes a handful of syntactic extensions to the core bindings of the Scheme programming language. In particular, it proposes to extend the binding forms lambda, let, let* with pattern matching capabilities, to extend the forms let and or with the ability to handle multiple values, and to extend the form define with the ability of defining "curried" functions.

It also proposes a body-less variant of the lambda form, and specifies the behavior of body-less "curried" define forms to behave consistently with that extension.

Rationale

The Scheme programming language is rich in syntactic extensions. For example, various pattern matching libraries mentioned in SRFI 200 define special forms like match-lambda*, match-let or match-let*, which extend the lambda, let and let* forms with the capability of pattern matching.

However, those forms usually do not take into account the ability of Scheme procedures to return multiple values, which is a trait that wasn't supported conveniently in the core language and had to be accounted for in SRFI 8, SRFI 11, and SRFI 71 (and remains forgotten by the or form).

Various Scheme implementations provide the feature of "curried definitions", which allows one to generalize the define's extended syntax (define (function arguments ...) . body) to produce higher-order functions. However, they do not integrate well with the pattern-matching variant of the lambda form.

The purpose of this document is to propose a set of extensions to the core bindings that integrate well with both pattern matching and multiple values. These extensions remain mostly backwards-compatible with earlier usages of the core forms.

Specification

The pattern matching (destructuring) lambda form

This document assumes that the lambda form should allow one to destructure its arguments, i.e. for example

  (map (lambda (`(,x . ,y))
         (+ x y))
       '((1 . 2) (3 . 4) (5 . 6)))

should be equivalent to

  (map (lambda (x)
         (+ (car x) (cdr x)))
       '((1 . 2) (3 . 4) (5 . 6)))

or, more generally,

  (lambdamatch (<patterns> ...) . <body>)

is equivalent to

  (lambdacore args
    (match args
      (`(,<patterns> ...) . <body>)
      (_ (error "invalid args: "args", expected"'(<patterns> ...)))))

Moreover, in order to preserve backwards compatibility and avoid unnecessary performance penalties, if each <pattern> is an identifier, then lambdamatch desugars directly into lambdacore.

Empty bodies in the lambda form

The ability to pattern-match arguments provides a justification for lambda forms with empty bodies: it makes sense to interpret

  (lambdamatch (<patterns> ...))
as
  (lambdacore args
    (match args
      (`(,<patterns> ...) #t)
      (_ #f)))

The pattern-matching let form

The let form is often explained in terms of the lambda form, and therefore should be expected to inherit the ability to perform destructuring bind.

Additionally, Scheme provides a variant of the let form known as "named let", which is also expected to pattern-match.

Moreover, if there is only one expression whose value is being bound using the let form, it should also be able to handle multiple values (like SRFI 71, but with the ability to destructure the received values).

To give an example, the form could be used in the following way:

  (let lp ((`(,a ,b) `(,c) (values '(1 2) '(3))))
    (+ a b c))

Following SRFI 71, we treat the values keyword in the left position specially, so that the above form is equivalent to

  (let lp (((values `(,a ,b) `(,c)) (values '(1 2) '(3))))
    (+ a b c))

Note however, that if there is more than one expression whose value is being bound with the let form, and at least one of the bindings binds multiple values, then even though it is possible to implement this scenario, it is impossible to implement it efficiently, and because of this, such use should be discouraged.

The pattern-matching let* form that supports multiple values

The let* form in the Scheme language is typically explained in terms of nested let forms. Because of this, the bindings in the let* form are also expected to perform destructuring bind and to support multiple values (like SRFI 71, but with the ability to destructure additional values).

Here's an example use of the extended let* form:

  (let* ((`(,a ,b) `(,c) (values '(1 2) '(3)))
         ((d e `(,f . ,_) (values 4 5 '(6 7)))))
    (+ a b c d e f))

The define form that supports "curried definitions" with destructuring

The Scheme language provides a variant of the define form which is used like this:

  (define (function . args) . body)
which desugars to
  (define function (lambda args . body))
By analogy, some Scheme implementations provide a generalization of this ability, which allows one to define higher-order functions conveniently. This feature, sometimes incorrectly called "currying", works by desugaring
  (define ((head . tail) . args) . body)
into
  (define (head . tail) (lambda args . body))

This SRFI document requires that the lambda used in the desugared form supports destructuring.

Moreover, since lambda forms with empty bodies expand to predicates that say whether given arguments satisfy the specified patterns (rather than report an error), the behavior of body-less "curried" definitions is made consistent with the behavior of body-less lambda forms: all levels of arguments are tested against the specified patterns, and if some arguments do not comply with the requested patterns, the final procedure evaluates to #f; otherwise, i.e. if all arguments conform to their corresponding patterns, the final procedure evaluates to #t.

The or form that supports multiple values

The or binding provided by default in the Scheme language exposes asymmetrical behavior with regard to handling multiple values in its different branches.

In particular, even though (or #f (values 1 2)) yields the values 1 and 2, the expression (or (values 1 2) #f) only yields 1 on some implementations, and on other implementations is not supported.

Since the handling of multiple values in the or form may require heap allocation, multiple-value or is considered an optional part of this SRFI.

Specific implementation

The sample implementations are based on the abilities of some Scheme module systems to shadow existing bindings. They also rely on the presence of a SRFI-200-conformant pattern matching library, and the presence of the SRFI 1 every procedure available during macro expansion.

Pattern-matching lambda

The pattern-matching variant of the lambda form can be implemented by defining the following mlambda form and exporting it to shadow the core lambda form.

(define-syntax mlambda
  (lambda (stx)
    (syntax-case stx ()

      ((_ args)
       #'(lambda args*
	   (match args*
	     (args #t)
	     (_ #f))))

      ((_ (first-arg ... last-arg . rest-args) . body)
       (and (every identifier? #'(first-arg ... last-arg))
	    (or (identifier? #'rest-args) (null? #'rest-args)))
       #'(lambda (first-arg ... last-arg . rest-args) . body))

      ((_ arg body ...)
       (or (identifier? #'arg) (null? #'arg))
       #'(lambda arg body ...))

      ((_ args body ...)
       #'(lambda params
           (match params
             (args body ...)
	     (_ (error 'mlambda '(args body ...)))))))))

Higher-order define form

The "currying" variant of the define form can be implemented by defining the following cdefine form and exporting it to shadow the core define form (assuming the presence of the mlambda form defined in the previous section)

(define-syntax cdefine
  (syntax-rules ()
    ((_ (prototype . args))
     ;; the body-less variant: see below
     (cdefine-bodyless (prototype . args) #t))
    ((_ ((head . tail) . args) body ...)
     (cdefine (head . tail)
	      (mlambda args body ...)))
    ((_ (function . args) body ...)
     (define function
       (mlambda args body ...)))
    ((_ . rest)
     (define . rest))))

(define-syntax cdefine-bodyless
  (syntax-rules ()
   ((_ (function . pattern) result . args)
    (cdefine-bodyless function
		      (match arg
			(pattern result)
			(_ #f))
		      arg . args))

   ((_ name result args ... arg)
    (cdefine-bodyless name (lambda arg result) args ...))

   ((_ name value)
    (define name value))))

Pattern-matching variant of the let form that supports multiple values

The pattern-matching variant of the let form can be implemented by defining the following named-match-let-values form and exporting it to shadow the core let form.

(define-syntax match-let/error
  (syntax-rules ()
    ((_ ((structure expression) ...)
	body + ...)
     ((lambda args
	(match args
	  ((structure ...) body + ...)
	  (_ (error 'match-let/error
		    (current-source-location)
		    '((structure expression) ...)
		    expression ...))))
      expression ...))))

(define-syntax named-match-let-values
  (lambda (stx)
    (syntax-case stx (values)
      ((_ ((identifier expression) ...)
	  body + ...)
       (every identifier? #'(identifier ...))
       ;; optimization: plain "let" form
       #'(let ((identifier expression) ...)
	   body + ...))

      ((_ name ((identifier expression) ...)
	  body + ...)
       (and (identifier? #'name)
	    (every identifier? #'(identifier ...)))
       ;; optimization: regular named-let
       #'(let name ((identifier expression) ...)
	   body + ...))

      ((_ name (((values . structure) expression))
	  body + ...)
       (identifier? #'name)
       #'(letrec ((name (mlambda structure body + ...)))
	   (call-with-values (lambda () expression) name)))

      ((_ (((values . structure) expression)) body + ...)
       #'(call-with-values (lambda () expression)
	   (mlambda structure body + ...)))

      ((_ name ((structure expression) ...)
	  body + ...)
       (and (identifier? #'name)
	    (any (lambda (pattern)
		   (display pattern)(newline)
		   (and (pair? pattern)
			(eq? (car pattern) 'values)))
		 (syntax->datum #'(structure ...))))
       #'(syntax-error "let can only handle one binding \
in the presence of a multiple-value binding"))

      ((_ name ((structure expression) ...)
	  body + ...)
       (identifier? #'name)
       #'(letrec ((name (mlambda (structure ...) body + ...)))
	   (name expression ...)))

      ((_ ((structure expression) ...)
	  body + ...)
       (any (lambda (pattern)
	      (and (pair? pattern)
		   (eq? (car pattern) 'values)))
	    (syntax->datum #'(structure ...)))
       #'(syntax-error "let can only handle one binding \
in the presence of a multiple-value binding"))

      ((_ ((structure expression) ...)
	  body + ...)
       #'(match-let/error ((structure expression) ...)
			  body + ...))

      ((_ ((structure structures ... expression)) body + ...)
       #'(call-with-values (lambda () expression)
	   (mlambda (structure structures ... . _)
		    body + ...)))

      ((_ name ((structure structures ... expression))
		body + ...)
       (identifier? #'name)
       #'(letrec ((name (mlambda (structure structures ...)
			     body + ...)))
	   (call-with-values (lambda () expression) name)))
    )))

Pattern-matching variant of the let* form that supports multiple values

The pattern-matching variant of the let* form can be implemented by defining the following match-let*-values form and exporting it to shadow the core let* form. It assumes the presence of the definition of the named-match-let-values form from the previous section

(define-syntax match-let*-values
  (lambda (stx)
    (syntax-case stx ()
      ((_ ((identifier expression) ...) body + ...)
       (every identifier? #'(identifier ...))
       ;; optimization/base case: regular let*
       #'(let* ((identifier expression) ...)
	   body + ...))
      ((_ (binding . bindings) body + ...)
       #'(named-match-let-values (binding)
				 (match-let*-values
				  bindings
				  body + ...))))))

A variant of the or form that supports multiple values

The variant of the or form that supports multiple values can be implemented by defining the following or/values form and exporting it to shadow the core or form.

(define-syntax or/values
  (syntax-rules ()
    ((_)
     #false)

    ((or/values final)
     final)

    ((or/values first . rest)
     (call-with-values (lambda () first)
       (lambda result
	 (if (and (pair? result) (car result))
	     (apply values result)
	     (or/values . rest)))))))

Implementation

The sample implementation of this SRFI relies on the ability of module system to rename/shadow existing bindings.

The repository of this SRFI contains one implementation for Guile, and another for Racket. Each comes with a test suite written in a literate programming style, and either of those test suites - i.e. the one for Guile and the one for Racket - is an important supplement to this document.

We encourage the maintainers of Scheme implementations that do not support renaming/shadowing of core bindings, to extend their core bindings with the capabilities described in this document.

Acknowledgements

First of all, I'd like to thank Ścisław Dercz for being an early adopter of the (grand scheme) glossary, which has been the base for the set of extensions proposed in this document. I am particularly proud of the fact that Dercz chose (grand scheme) to write a set of his programming doodles that he called Useless Software.

I'm also very grateful to the developers of Guile, who, through their (ice-9 curried-definitions) module, taught me that overriding core Scheme forms is not only conceivable, but can also be very useful.

I consider this work to be a part of the decades-long evolution in functional programming style, dating back at least to Robin Milner and Rod Burstall, who pioneered pattern matching in programming languages.

This work was originally based on the pattern matcher written by Alex Shinn, based on the specification presented by Andrew K. Wright and Robert Cartwright. The Guile implementation of this SRFI still uses Shinn's match as its base.

The Racket version of this SRFI owes its existence to the kindness of the Racket community, and in particular to Jesse Alama, who invited me to speak at Racketfest that he organized in Berlin in 2019.

Last but not least, I'd like to thank all the people who persist in carrying the torch of Scheme development under the umbrella of the SRFI process, in particular to Arthur Gleckler for his patience, encouragement and support, Marc Nieper Wißkirchen for his attentive and insightful remarks, and John Cowan for shepherding the Scheme community.

Copyright

Copyright © Panicz Maciej Godek (2020).

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler