251: Mixing groups of definitions with expressions within bodies

by Sergei Egorov

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-251@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Scheme has traditionally required procedure bodies and the bodies of derived constructs such as let to contain definitions followed by commands/expressions. This SRFI proposes to allow mixing commands and groups of definitions in such bodies, so that each command/expression is in the scope of all local definition groups preceding it, but not in the scope of the local definition groups following it. This approach is backwards compatible with R7RS and upholds the intuitive rule that to find the definition of a lexical variable, one has to look up the source code tree.

Rationale

This SRFI competes with Daphne Preston-Kendal's SRFI-245, which also allows arbitrary mixing of expressions/commands with definitions within a single ⟨body⟩. The difference is in the scope of the internal definitions: while SRFI-245 proposes a single recursive scope for all definitions, this SRFI limits the scope of each group of immediately adjacent definitions to the corresponding initializers and the subsequent body forms; the preceding body forms are not included. To make sure that an extended ⟨body⟩ with more than one definition group behaves in a way consistent with the top level, it is an error for a command or an initialization expression in a definition group to contain mentions of identifiers defined by “downstream” definition groups in the same body (the identifier visibility constraint). This scope rule is in line with common expectations on identifier visibility: to find the definition of an identifier, one has to go up the source tree, looking at groups of adjacent definitions, bindings, or formals; the first definition found (or the top-level one if none is found) is the definition for the identifier in question. SRFI-245's rule of looking not only up the tree but also down, stepping over the commands until the whole body is inspected, goes against the usual expectations. The proposed approach to scopes within body-like constructs with mixed expressions and definitions is in line with decisions made by designers of modern functional languages such as ReasonML, Elixir, and others.

Note that the differences in scope behavior between the proposed body-level definitions and top-level definitions are made necessary by the fact that there are potentially many lexically scoped identifiers to refer to from bodies, while the top level has just one scope, the program/library global scope. The “look up the tree” rule for searching for variable definitions is not applicable at the top level, because there is no “up”; replicating the top-level behavior at a local level may lead to hard-to-follow code, with “up” and “down” variable definition scans competing for the programmer's attention (this is why bindings-first let-style forms are usually preferred to bindings-at-end where-style forms).

The rationale for allowing commands before and between the definitions within a single ⟨body⟩ is the same as in SRFI-245, so it is copied verbatim below (except for the last paragraph). Please note that all bodies satisfying the aforementioned identifier visibility constraint have the same meaning under this proposal and SRFI-245.

It often makes sense to run type and other basic error checks on input forms before any other code runs (including the right-hand sides of definitions):

(define (double-square x)
  (unless (number? x)
    (error "foo: not a number" x))
  (define y (square x))
  (* 2 y))

It is likewise sometimes useful to insert logging code at the beginning of a procedure, before any other code:

(define (dangerous-operation x)
  (log-warn "Beginning dangerous operation on value" x)
  (define prepared-x (prepare-for-dangerous-operation x))
  ...)

When writing test suites it is often beneficial to build up values to be tested and run the tests on them incrementally:

(test-group "basic arithmetic"
  (define one-plus-one (+ 1 1))
  (test 2 one-plus-one)
  (define two-plus-two (+ one-plus-one one-plus-one))
  (test 4 two-plus-two))

Specification

Scheme's syntax for ⟨body⟩ must be changed to allow a group of adjacent definitions at any top-level position but the last. The grammar rules are:

  ⟨body⟩ ⟶ ⟨command⟩ ⟨body⟩
    | ⟨definition⟩+ ⟨body⟩
    | ⟨expression⟩

Informally, each group of adjacent definitions starts a new mutually recursive scope that incorporates the rest of the body. There is an additional visibility constraint on mentions of identifiers defined by body-level definition groups: it is an error for a body-level command or an initialization expression of a definition in a body-level definition group to contain a free mention of an identifier defined in a subsequent (“downstream”) body-level definition group in the same body.

More formally, we can represent the semantics of a new ⟨body⟩ satisfying the above constraint in terms of an R7RS ⟨body⟩ via a translation function T[new body] ⟶ R7RS body:

    T[⟨command⟩ ⟨body⟩] ⟹ ⟨command⟩ T[⟨body⟩]
    T[⟨definition⟩+ ⟨body⟩] ⟹ ((lambda () ⟨definition⟩+ T[⟨body⟩]))
    T[⟨expression⟩] ⟹ ⟨expression⟩
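
As an illustration, the sketch below applies T by hand to a small body with two definition groups. The procedure mean-of-squares and its error messages are hypothetical and were chosen only for this example; the mixed body is shown in a comment because it is not valid R7RS, while the translated form below it should run in any R7RS system.

```scheme
(import (scheme base)) ; R7RS program header; omit in a REPL that preloads it

;; Mixed body as permitted by this SRFI (hypothetical example):
;;
;; (define (mean-of-squares xs)
;;   (unless (list? xs) (error "mean-of-squares: not a list" xs))
;;   (define n (length xs))              ; definition group 1
;;   (define squares (map square xs))    ; definition group 1
;;   (when (zero? n) (error "mean-of-squares: empty list"))
;;   (define total (apply + squares))    ; definition group 2
;;   (/ total n))

;; The same body after applying T: each definition group opens a
;; new (lambda () ...) whose body is the translated rest of the body.
(define (mean-of-squares xs)
  (unless (list? xs) (error "mean-of-squares: not a list" xs))
  ((lambda ()
     (define n (length xs))
     (define squares (map square xs))
     (when (zero? n) (error "mean-of-squares: empty list"))
     ((lambda ()
        (define total (apply + squares))
        (/ total n))))))
```

Here the leading command stays in place, and the second group can see n and squares from the first group because its lambda is nested inside the first one.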

The actual transformation, run as a part of the macro expansion process, must ensure that macro uses expanding into definitions are processed in the same way as forms they expand to. In order to support this SRFI, the “definition discovery” process, normally limited to initial body forms, should be repeated after each non-definition form, producing a new nested recursive scope for each group of adjacent definitions. The constraint on the mentions of defined identifiers may be enforced during macro processing.
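
The grouping step can be sketched as a procedure over quoted body forms. This is a minimal sketch and not the sample implementation: the names definition? and translate-body are assumptions made for this example, only forms headed by the symbol define are treated as definitions, macro uses are not expanded, and neither the grammar nor the visibility constraint is checked.

```scheme
(import (scheme base)) ; R7RS program header; omit in a REPL that preloads it

;; Hypothetical predicate: only literal (define ...) forms count.
;; A real expander must also recognize other definition forms and
;; macro uses that expand into definitions.
(define (definition? form)
  (and (pair? form) (eq? (car form) 'define)))

;; Translate a list of mixed body forms into a list of R7RS body forms.
(define (translate-body forms)
  (cond
    ((null? forms) '())
    ((definition? (car forms))
     ;; Collect the maximal run of adjacent definitions, then wrap
     ;; the run and the translated rest of the body in a thunk call.
     (let loop ((defs '()) (rest forms))
       (if (and (pair? rest) (definition? (car rest)))
           (loop (cons (car rest) defs) (cdr rest))
           (list `((lambda () ,@(reverse defs) ,@(translate-body rest)))))))
    (else
     ;; A command or the final expression is kept in place.
     (cons (car forms) (translate-body (cdr forms))))))
```

For instance, (translate-body '((display "a") (define y 1) (display y) (define z 2) (+ y z))) yields ((display "a") ((lambda () (define y 1) (display y) ((lambda () (define z 2) (+ y z)))))).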

Examples:

This is an error because (define (foo) x) contains a mention of x, which is defined in a downstream definition group:

(let ((x 0))
  (display "the result is")
  (define (foo) x)
  (display ": ")
  (define x 42)
  (display (foo)))
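
One way to see why this body is rejected is to compare two readings of it in plain R7RS. The sketch below is only an illustration under assumptions: the procedure names nested-scopes and single-scope are hypothetical, the display calls are dropped so that only the returned value matters, and T is applied naively even though the SRFI defines it only for bodies that satisfy the constraint.

```scheme
(import (scheme base)) ; R7RS program header; omit in a REPL that preloads it

;; Reading 1: one nested scope per definition group (naive use of T).
;; foo is defined before the inner x exists, so it sees the let-bound x.
(define (nested-scopes)
  (let ((x 0))
    ((lambda ()
       (define (foo) x)
       ((lambda ()
          (define x 42)
          (foo)))))))

;; Reading 2: a single recursive scope for all definitions in the body.
;; foo and x share one scope, so foo sees the internally defined x.
(define (single-scope)
  (let ((x 0))
    (define (foo) x)
    (define x 42)
    (foo)))
```

Under these assumptions (nested-scopes) returns 0 and (single-scope) returns 42. Declaring the body an error avoids having to choose between the two results.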

The example below prints “the result is: 42”:

(let ((x 0))
  (display "the result is")
  (define (foo) x)
  (define x 42)
  (display ": ")
  (display (foo)))

These two examples print “the result is: 0”:

(let ((x 0))
  (display "the result is")
  (define (foo) x)
  (display ": ")
  (define xx 42)
  (display (foo)))

(let ((x 0))
  (define-syntax define-thunk
    (syntax-rules ()
      ((_ i v) (define (i) v))))
  (display "the result is")
  (display ": ")
  (define xx 42)
  (define-thunk foo x)
  (display (foo)))
  

Implementation

The sample implementation is based on Al Petrofsky's EIOD (“Eval In One Define”) v1.17. Support for this proposal is implemented as a 5-line patch to the original code.

Source for the sample implementation.

Acknowledgements

Daphne Preston-Kendal's SRFI-245 served as an inspiration for this proposal.

© 202? Sergei Egorov

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler