SRFI 269: Portable Test Definitions

by Andrew Tropin, Ramin Honary

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-269@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This SRFI defines a portable API for test definitions that is decoupled from test execution and reporting. It provides three primitives: the universal is macro for assertions, test for grouping assertions into independently executable units, and suite for organizing tests into hierarchies. Tests and suites can carry user-provided metadata to adjust the behavior of a test runner, for example to select tests by tags or to enforce timeout values. The API is tiny, yet capable and flexible. By focusing on the definition and leaving execution semantics to test runners, this SRFI offers a common ground that can reduce fragmentation among testing libraries.

Unlike side-effect-driven testing frameworks (e.g. SRFI 64), this API produces first-class runtime entities, making it easy to filter and schedule them, wrap them in exception guards and continuation barriers, run them in arbitrary order, and re-run dynamically generated test subsets. In addition to the usual CLI test runners, it enables runtime-friendly test runners that integrate well with highly interactive development workflows inside REPLs and IDEs, significantly increasing control over test execution and shortening the feedback loop.

To bridge test definitions and test runners, this SRFI specifies a message-passing programming interface, along with recommendations on test loading and execution semantics for test runner implementers.

Issues

Rationale

Most Scheme libraries and applications benefit from tests, and most test suites benefit from portability. SRFI 64 (2005) was a valuable first step toward a common testing API, and its widespread adoption demonstrates the need. Yet the Scheme ecosystem remains fragmented: implementations maintain their own incompatible testing libraries (RackUnit, Chicken's test egg, Chibi's (chibi test), numerous ad hoc solutions), each with its own terminology and conventions. There is no shared vocabulary for what "assertion," "test," and "test suite" mean, making it difficult to write portable tests and nearly impossible to build portable testing tools.

SRFI 64's design couples three distinct concerns: defining tests, executing them, and reporting results. Writing a test-assert form is running it: the assertion fires as a side effect at load time. This has several consequences: tests cannot be defined in one place and run in another; the execution strategy cannot be changed without rewriting test code; and the test-begin/test-end model is fragile and provides no first-class grouping. SRFI 64's own rationale acknowledges this trade-off, noting that the API "may be a little less elegant and 'compositional' than using explicit test objects."

Because SRFI 64 tests are imperative side effects rather than first-class values, they cannot be filtered by tag, reordered, re-run as a subset, or inspected programmatically. The test runner relies on global mutable state, making tests non-reentrant and difficult to compose. These limitations are especially painful in interactive workflows at a REPL or inside an IDE. A programmer wants to pick one failing test, re-run it, examine the failure or step through it in a debugger, fix the code, and iterate. Such a tight feedback loop is impossible when tests are ephemeral side effects that vanish after execution.

This SRFI addresses these problems through three design decisions:

  1. Common vocabulary. The terms assertion, test, suite, entity, message, test runner, and metadata are defined precisely, giving the community a shared language for definitions, tooling, and communication.
  2. Separation of definition from execution and reporting. Three small primitives (is, test, and suite) construct first-class entities (runtime objects) and deliver them to a pluggable test runner via a message-passing protocol. Definition code is fully portable; execution and reporting strategy varies by environment.
  3. First-class test entities. Because tests are data, runners can filter by metadata, reorder, wrap each test in exception guards or continuation barriers, enforce timeouts, and re-run arbitrary subsets without changing a single line of test definition code. Suite thunks are composable procedures that can be stored, exported, and combined. The is macro captures both the unevaluated source form and a separate argument thunk, enabling rich failure diagnostics. The result is an API that is equally at home in CI pipelines and in live REPL sessions.

The API surface is deliberately minimal: three definition forms, one parameter, three predicates, and one deferred variant. Yet it covers assertions with rich diagnostics, named tests, nested suites, user-provided metadata, and deferred composable suite thunks. This SRFI is intended to supersede SRFI 64 for the purpose of test definition. By standardizing what a test is and leaving how it is run to test runners, this SRFI provides a stable foundation on which future standards for runners, reporters, and discovery mechanisms can be built.

Specification

Overview

The API specified by this SRFI is organized into two layers:

  1. Test definition primitives (normative). Three syntactic forms (is, test, and suite), the test-runner* parameter, a small set of predicates, and the deferred variant suite-thunk.
  2. Test runner interface (normative message protocol). A message-passing protocol that bridges test definitions and test runners. Each message type is defined alongside its corresponding definition primitive in the sections below.

The sample implementation section provides informative guidance and materials for test runner implementers on loading, scheduling, and reporting.

The definition primitives do not execute tests themselves. Instead, each form constructs a first-class entity, an association list (alist) that captures the test's body as a thunk, its source form, source location, description string, and optional metadata, and delivers it to the current test runner via a message. The test runner is a procedure stored in the test-runner* parameter; it receives messages as alists and is free to execute tests immediately, collect them for later, or take any other action.
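
As a rough illustration of the message flow described above, the following sketch shows one possible test runner that executes everything immediately. It is not the sample implementation: the names ref and immediate-runner are hypothetical, the message is built by hand rather than by the is macro, and only the keys named in this specification are assumed.

```scheme
(import (scheme base))

;; Hypothetical helper: look up KEY in an alist, signalling an error if absent.
(define (ref alist key)
  (let ((entry (assq key alist)))
    (if entry (cdr entry) (error "missing key" key))))

;; A runner is a procedure of one argument, the message.
;; This one runs every assertion, test, and suite body as soon as it arrives.
(define (immediate-runner message)
  (case (ref message 'type)
    ((runner/run-assert)
     ((ref (ref message 'assert) 'assert/body-thunk)))
    ((runner/load-test)
     ((ref (ref message 'test) 'test/body-thunk)))
    ((runner/load-suite)
     ((ref (ref message 'suite) 'suite/body-thunk)))
    (else (error "unknown message type" (ref message 'type)))))

;; A hand-built message of the kind that (is (+ 3 4)) would deliver.
(define message
  (list (cons 'type 'runner/run-assert)
        (cons 'assert
              (list (cons 'assert/body-thunk (lambda () (+ 3 4)))
                    (cons 'assert/body '(+ 3 4))
                    (cons 'assert/location #f)))))

(immediate-runner message)   ; evaluates the body thunk, returning 7
```

A runner intended for real use would typically also guard against exceptions and record results; this sketch leaves both out.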

This separation is the central design principle: code that defines tests is portable across all conforming implementations, while code that runs tests may vary to suit different environments, such as CI pipelines, interactive REPLs and IDEs, or specific testing and reporting tools.

Terminology

A common vocabulary is important for both communication and implementation. This section establishes that shared glossary.
Assertion
A single check produced by the is macro. An assertion captures an expression (the body), a thunk that evaluates it, and its source location. When the body is a predicate application (pred arg …), the assertion also captures a separate thunk that evaluates the arguments, enabling richer failure messages.
Test
A named, independently executable unit of testing produced by the test form. A test groups zero or more assertions together under a human-readable description string. Tests are the smallest unit that a test runner schedules and reports on.
Test suite
A named grouping of tests and nested test suites produced by the suite or suite-thunk form. Suites impose hierarchical structure and may carry metadata that influences the test runner's behavior for the entire group.
Entity
An association list that represents an assertion, test, or suite. Entities are the first-class runtime objects that flow from definition code to the test runner. Keys are symbols; their prefixes (assert/, test/, suite/) indicate which kind of entity they belong to.
Message
An association list sent to the test runner via the test-runner* parameter. Every message contains at least a type key whose value is a symbol identifying the kind of message (e.g. runner/run-assert, runner/load-test, runner/load-suite). The remaining keys carry the entities and any additional context.
Test runner
A procedure of one argument (a message), which can be installed as the value of the test-runner* parameter. The test runner receives every message produced by the definition primitives and decides how to handle it: for example, by executing an assertion immediately, by collecting a test for deferred execution, or by building a suite hierarchy. This SRFI specifies the messages a test runner must accept; it does not prescribe execution order, concurrency, or reporting strategy.
Test reporter
A procedure that consumes events from a test runner and produces human- or machine-readable output (terminal text, JUnit XML, TAP, etc.). Test reporters are outside the scope of this SRFI; they are mentioned here because the test runner interface is designed to make them easy to implement.
Metadata
An association list of user-provided key–value pairs attached to a test or suite. Metadata is opaque to the definition API; its interpretation is entirely up to the test runner. Typical uses include tagging tests (e.g. ((tags . (integration)))), marking them as slow, or specifying a timeout.
Source location
An entity that identifies a location in the source code. It can be represented as an association list with the keys filename, line, and column.
Suite path
An ordered list of suite entities representing the chain of enclosing suites from the outermost to the innermost. The suite path provides context for each test, enabling runners and reporters to reconstruct the full hierarchy.
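
To make the Metadata entry above more concrete, here is a minimal sketch of how a test runner might select tests by tag. The helper names test-tags, tagged?, and select-by-tag are hypothetical, and the tags key is only a convention borrowed from the examples in this document; the definition API itself treats metadata as opaque.

```scheme
(import (scheme base))

;; Return the list stored under the (conventional) tags key, or '().
(define (test-tags test-entity)
  (let* ((metadata (cdr (assq 'test/metadata test-entity)))
         (entry (assq 'tags metadata)))
    (if entry (cdr entry) '())))

(define (tagged? tag test-entity)
  (and (memq tag (test-tags test-entity)) #t))

;; Keep only the tests carrying TAG, preserving their order.
(define (select-by-tag tag tests)
  (let loop ((tests tests) (acc '()))
    (cond ((null? tests) (reverse acc))
          ((tagged? tag (car tests))
           (loop (cdr tests) (cons (car tests) acc)))
          (else (loop (cdr tests) acc)))))

;; Two hand-built test entities; real ones would come from the test form.
(define fast-test
  `((test/body-thunk . ,(lambda () #t))
    (test/description . "addition works")
    (test/metadata . ())
    (test/location . #f)))

(define slow-test
  `((test/body-thunk . ,(lambda () #t))
    (test/description . "database round-trip")
    (test/metadata . ((tags . (integration)) (timeout . 30)))
    (test/location . #f)))

(define selected (select-by-tag 'integration (list fast-test slow-test)))
;; selected contains only slow-test
```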

test-runner* parameter

(test-runner*)
Returns the current test runner procedure.
(test-runner* runner)
Sets the current test runner to runner.

test-runner* is a parameter object (as defined by make-parameter) that holds the current test runner procedure. All definition primitives—is, test, and suite—deliver their messages by calling (test-runner*) to obtain the runner and then applying it to a message.

If test-runner* is never set, invoking any definition primitive should produce a diagnostic indicating that no runner has been configured. Libraries that provide a test runner should set test-runner* upon loading so that end users need not configure it manually.

Because test-runner* is a parameter, it can be rebound with parameterize to install a different runner for a dynamic extent, which is useful for testing the test framework itself or for running tests with alternative reporters.
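
The following sketch illustrates rebinding with parameterize. So that it runs on its own, it defines a stand-in test-runner* parameter whose default value raises an error (one way of producing the diagnostic mentioned above); recording-runner and the hand-built message are hypothetical.

```scheme
(import (scheme base))

;; Stand-in for the parameter defined by this SRFI. The default runner
;; signals that nothing has been configured.
(define test-runner*
  (make-parameter
   (lambda (message) (error "no test runner configured"))))

;; A runner that only records the type of each message it receives.
(define received '())
(define (recording-runner message)
  (set! received (cons (cdr (assq 'type message)) received))
  #t)

;; Inside the dynamic extent, messages go to recording-runner.
(parameterize ((test-runner* recording-runner))
  ((test-runner*) '((type . runner/load-test))))

;; received is now (runner/load-test); outside the parameterize form,
;; (test-runner*) returns the default runner again.
```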

is — assertion macro

Syntax

(is expression)
General form. Asserts that expression evaluates to a true value.
(is (predicate argument …))
Predicate form. A special case of the general form where the body is a procedure application. In addition to the general-form behavior, this form captures the arguments separately for richer failure reporting.

Description

The is macro is the sole assertion primitive. Each invocation constructs an assertion entity (an alist) and delivers it to the current test runner by sending a runner/run-assert message. The test runner decides how to execute the assertion and what to do with the result.

The assertion entity always contains the following keys:

Key                Value
assert/body-thunk  A thunk that, when called, evaluates the original expression and returns its value.
assert/body        The source form of the expression as a datum (unevaluated), useful for reporting.
assert/location    The source location of the assertion.

When the body has the predicate-application shape (predicate argument …), the entity additionally contains:

Key                Value
assert/args-thunk  A thunk that, when called, evaluates the arguments and returns the values as a list. This enables a test runner to display the actual argument values in a failure report, separately from the predicate.

The message sent to the test runner has the form:

`((type        . runner/run-assert)
  (assert      . ,assertion-entity))

Return value

The return value of is is determined by the test runner. A conforming runner should return the value produced by the body thunk when the assertion succeeds. This makes it possible to reuse the value returned by is forms:

(is (= 7 (is (+ 3 4))))  ; inner is returns 7
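
As a sketch of how a runner might handle a runner/run-assert message, the procedure below evaluates the body thunk, prints a short report on failure (using assert/args-thunk when present), and returns the body value. The name run-assert and the report layout are assumptions, and the message is built by hand to mimic what (is (= 5 (+ 2 2))) would deliver.

```scheme
(import (scheme base) (scheme write))

(define (run-assert message)
  (let* ((assertion (cdr (assq 'assert message)))
         (body-thunk (cdr (assq 'assert/body-thunk assertion)))
         ;; Present only for the predicate form.
         (args-entry (assq 'assert/args-thunk assertion))
         (result (body-thunk)))
    (unless result
      (display "FAIL: ")
      (write (cdr (assq 'assert/body assertion)))
      (newline)
      (when args-entry
        ;; Calling the thunk evaluates the arguments a second time.
        (display "  arguments: ")
        (write ((cdr args-entry)))
        (newline)))
    result))

(define failing
  `((type . runner/run-assert)
    (assert . ((assert/body-thunk . ,(lambda () (= 5 (+ 2 2))))
               (assert/args-thunk . ,(lambda () (list 5 (+ 2 2))))
               (assert/body . (= 5 (+ 2 2)))
               (assert/location . #f)))))

(run-assert failing)
;; prints:
;; FAIL: (= 5 (+ 2 2))
;;   arguments: (5 4)
;; and returns #f
```

This sketch does not catch exceptions raised by the body; a runner that wants to report errors separately from failures would wrap the call to the body thunk in a guard.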

Constraints

Examples

;; Atomic value — truthy means pass
(is #t)
(is 42)

;; Variable
(let ((x "hello"))
  (is x))

;; Predicate form — enables rich failure messages
(is (= 4 (+ 2 2)))
(is (string=? "hello" (greet "world")))
(is (even? 14))
(is (lset= = '(1 2 3) '(3 2 1)))

test — test definition macro

Syntax

(test description body …)
Defines and immediately loads a test.
(test description 'metadata metadata-alist body …)
Same, with user-provided metadata attached to the test entity.

Description

The test macro defines a single, independently executable unit of testing. It constructs a test entity (an alist) that captures the test body, description, metadata, and source location, then immediately delivers it to the current test runner by sending a runner/load-test message.

A test should be self-contained: its body should not depend on side effects produced by surrounding expressions or by other tests. Because a test runner may execute tests in any order, at any point in time, or skip them entirely, relying on external state makes test results unpredictable.

description is a string that serves as a human-readable label for the test. The body forms typically contain zero or more is assertions, but may contain arbitrary Scheme expressions (e.g. local definitions, setup code).

The test entity contains the following keys:

Key               Value
test/body-thunk   A thunk that, when called, evaluates the body forms in order.
test/description  The description string.
test/metadata     The metadata-alist, or '() if none was provided.
test/location     A source location alist.

The message sent to the test runner has the form:

`((type . runner/load-test)
  (test . ,test-entity))
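
Because the test entity carries its body as a thunk, a runner is free to postpone execution. The sketch below shows one hypothetical way to do that: collecting-runner stores each loaded test and run-collected! executes them later in load order. Both names, and the hand-built message, are illustrative only.

```scheme
(import (scheme base))

;; Tests loaded so far, most recent first.
(define collected '())

(define (collecting-runner message)
  (when (eq? 'runner/load-test (cdr (assq 'type message)))
    (set! collected (cons (cdr (assq 'test message)) collected))))

;; Run every collected test in the order in which it was loaded.
(define (run-collected!)
  (for-each (lambda (test-entity)
              ((cdr (assq 'test/body-thunk test-entity))))
            (reverse collected)))

;; Records when the hand-built test body actually runs.
(define events '())

(collecting-runner
 `((type . runner/load-test)
   (test . ((test/body-thunk . ,(lambda () (set! events (cons 'ran events))))
            (test/description . "addition works")
            (test/metadata . ())
            (test/location . #f)))))

;; At this point events is still '(): the test is loaded but not executed.
(run-collected!)
;; Now events is (ran).
```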

Constraints

Return value

The return value of test is unspecified. Because test is a loading form that registers a test with the runner rather than executing it, no meaningful value is produced. Code must not rely on the return value.

Examples

;; Minimal test
(test "addition works"
  (is (= 4 (+ 2 2))))

;; Multiple assertions in one test
(test "string operations"
  (is (string=? "HELLO" (string-upcase "hello")))
  (is (= 5 (string-length "hello"))))

;; Test with metadata
(test "database round-trip"
  'metadata
  '((tags . (integration))
    (timeout . 30))
  (is (equal? sample-record (db-read (db-write sample-record)))))

;; Test with setup code
(test "list reversal"
  (define xs '(1 2 3))
  (is (equal? '(3 2 1) (reverse xs))))

suite — test suite definition macro

Syntax

(suite description body …)
Defines and immediately loads a test suite.
(suite description 'metadata metadata-alist body …)
Same, with user-provided metadata attached to the suite entity.

Description

The suite macro defines a grouping unit that organizes tests and nested suites into a hierarchy. It constructs a suite entity (an alist), then immediately delivers it to the current test runner by sending a runner/load-suite message. The test runner evaluates the suite body, during which any enclosed test and suite forms are loaded and possibly associated with the context of this suite.

description is a string that serves as a human-readable label for the suite. The body forms typically consist of test forms and nested suite forms. The body should not contain any other code, especially test setup code: tests should be self-contained, because they can be executed in arbitrary order and multiple times.

The suite entity contains the following keys:

Key                Value
suite/body-thunk   A thunk that, when called, evaluates the body forms in order. During evaluation, enclosed test and suite forms register themselves with the current test runner under this suite’s context.
suite/description  The description string.
suite/metadata     The metadata-alist, or '() if none was provided.
suite/location     A source location alist.

The message sent to the test runner has the form:

`((type  . runner/load-suite)
  (suite . ,suite-entity))
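
One way a runner could associate tests with their enclosing suites is to keep the suite path in a parameter while a suite body is being evaluated. The sketch below assumes that approach; suite-path, path-runner, suite-message, and test-message are hypothetical names, and the messages are built by hand in place of real suite and test forms.

```scheme
(import (scheme base))

;; The chain of enclosing suite entities, outermost first.
(define suite-path (make-parameter '()))

;; Each entry pairs a list of suite descriptions with a test description.
(define loaded '())

(define (path-runner message)
  (case (cdr (assq 'type message))
    ((runner/load-suite)
     (let ((suite-entity (cdr (assq 'suite message))))
       ;; Evaluate the body with this suite appended to the path.
       (parameterize ((suite-path (append (suite-path) (list suite-entity))))
         ((cdr (assq 'suite/body-thunk suite-entity))))))
    ((runner/load-test)
     (let ((test-entity (cdr (assq 'test message))))
       (set! loaded
             (cons (cons (map (lambda (s) (cdr (assq 'suite/description s)))
                              (suite-path))
                         (cdr (assq 'test/description test-entity)))
                   loaded))))
    (else #f)))

;; Hand-built messages standing in for what suite and test would send.
(define (suite-message description body-thunk)
  `((type . runner/load-suite)
    (suite . ((suite/body-thunk . ,body-thunk)
              (suite/description . ,description)
              (suite/metadata . ())
              (suite/location . #f)))))

(define (test-message description)
  `((type . runner/load-test)
    (test . ((test/body-thunk . ,(lambda () #t))
             (test/description . ,description)
             (test/metadata . ())
             (test/location . #f)))))

(path-runner
 (suite-message "strings"
   (lambda ()
     (path-runner
      (suite-message "case conversion"
        (lambda () (path-runner (test-message "upcase"))))))))

;; loaded is now ((("strings" "case conversion") . "upcase"))
```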

Constraints

Return value

The return value of suite is unspecified. Because suite is a loading form that registers a suite with the runner rather than executing its tests, no meaningful value is produced. Code must not rely on the return value.

Examples

;; Flat suite
(suite "arithmetic"
  (test "addition"
    (is (= 4 (+ 2 2))))
  (test "multiplication"
    (is (= 6 (* 2 3)))))

;; Nested suites
(suite "strings"
  (suite "case conversion"
    (test "upcase"
      (is (string=? "HELLO" (string-upcase "hello"))))
    (test "downcase"
      (is (string=? "hello" (string-downcase "HELLO")))))
  (suite "splitting"
    (test "split on comma"
      (is (equal? '("a" "b" "c")
                  (string-split "a,b,c" #\,))))))

;; Suite with metadata
(suite "integration tests" 'metadata '((tags . (integration))
                                       (slow? . #t))
  (test "end-to-end round trip"
    (is (equal? expected (round-trip input)))))

suite-thunk — deferred suite definition

Syntax

(suite-thunk description body …)
Returns a suite thunk that, when called, loads the suite.
(suite-thunk description 'metadata metadata-alist body …)
Same, with metadata.

Description

suite-thunk is the deferred counterpart of suite. It constructs the same suite entity but does not immediately send it to the test runner. Instead, it returns a thunk that, when invoked, sends the runner/load-suite message. The returned thunk should either carry identifying information or be recorded in a registry, so that the suite-thunk? predicate returns #t for it.

The relationship between the two forms is:

(suite desc body …)
≡
((suite-thunk desc body …))
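
The equivalence above can be sketched with syntax-rules. This is a simplified illustration, not the sample implementation: it omits the 'metadata variant, source locations, and the tagging needed by suite-thunk?, and it defines a stand-in test-runner* parameter so that it runs on its own. The tracing-runner procedure is hypothetical.

```scheme
(import (scheme base))

;; Stand-in for the parameter defined by this SRFI.
(define test-runner*
  (make-parameter (lambda (message) (error "no test runner configured"))))

;; Deferred form: build the suite entity, but send it only when called.
(define-syntax suite-thunk
  (syntax-rules ()
    ((_ description body ...)
     (lambda ()
       ((test-runner*)
        (list (cons 'type 'runner/load-suite)
              (cons 'suite
                    (list (cons 'suite/body-thunk
                                ;; The trailing expression allows an empty body.
                                (lambda () body ... (if #f #f)))
                          (cons 'suite/description description)
                          (cons 'suite/metadata '())
                          (cons 'suite/location #f)))))))))

;; Immediate form: create the thunk and invoke it right away.
(define-syntax suite
  (syntax-rules ()
    ((_ description body ...)
     ((suite-thunk description body ...)))))

;; A runner that records suite descriptions and evaluates suite bodies.
(define seen '())
(define (tracing-runner message)
  (let ((suite-entity (cdr (assq 'suite message))))
    (set! seen (cons (cdr (assq 'suite/description suite-entity)) seen))
    ((cdr (assq 'suite/body-thunk suite-entity)))))

(define inner (suite-thunk "inner"))   ; nothing is sent yet

(parameterize ((test-runner* tracing-runner))
  (suite "outer" (inner)))

;; seen is now ("inner" "outer"): the outer suite was loaded first,
;; and calling (inner) in its body loaded the inner suite.
```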

Although is and test forms can appear at the top level, test modules should wrap all tests in a suite-thunk (or define-suite) so that test runners can discover and load them as a unit. A bare top-level test form is loaded as soon as the module is evaluated, with no way for a runner to discover it independently or defer its execution.

Suite thunks are the primary building block for composable, reusable test suites. Because they are first-class procedures, they can be stored in variables, passed as arguments, and invoked inside other suites to include their tests:

(define my-unit-tests
  (suite-thunk "unit tests"
    (test "one" (is #t))
    (test "two" (is (= 2 (+ 1 1))))))

(define my-integration-tests
  (suite-thunk "integration tests"
    (test "round-trip" (is (equal? x (decode (encode x)))))))

;; Compose into a top-level public suite
(define-public all-tests
  (suite-thunk "all tests"
    (my-unit-tests)
    (my-integration-tests)))

define-suite — named suite definition

Syntax

(define-suite name body …)
Defines and exports a suite thunk bound to name.
(define-suite name 'metadata metadata-alist body …)
Same, with user-provided metadata attached to the underlying suite entity.

Description

define-suite is a convenience form that combines suite-thunk with a public definition. It is equivalent to:

(define-public name
  (suite-thunk (symbol->string 'name) body …))

Or, with metadata:

(define-public name
  (suite-thunk (symbol->string 'name) 'metadata metadata-alist body …))

The suite description string is derived from name by converting the symbol to a string. This form is intended for top-level suite definitions in test modules.

Examples

(define-suite arithmetic-tests
  (test "addition"
    (is (= 4 (+ 2 2))))
  (test "subtraction"
    (is (= 0 (- 2 2)))))

;; Now (arithmetic-tests) can be called from another suite
;; or from the REPL to load and run the tests.

Predicates

These predicates are not needed for defining tests and test suites, but they are useful for test runner implementers.
(test? obj) → boolean
Returns #t if obj is a test entity: an alist that contains at least the keys test/body-thunk and test/description.
(suite? obj) → boolean
Returns #t if obj is a suite entity: an alist that contains at least the keys suite/body-thunk and suite/description.
(suite-thunk? obj) → boolean
Returns #t if obj is a suite thunk produced by suite-thunk or define-suite. Implementations can mark suite thunks with a procedure property or by some other mechanism; ordinary lambdas do not satisfy this predicate.
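
A minimal sketch of test? and suite?, following the descriptions above, might look like this. The helpers alist? and has-keys? are hypothetical, and the sample implementation may perform different checks.

```scheme
(import (scheme base))

;; True if OBJ is a proper list whose elements are all pairs.
(define (alist? obj)
  (and (list? obj)
       (let loop ((rest obj))
         (or (null? rest)
             (and (pair? (car rest)) (loop (cdr rest)))))))

;; True if OBJ is an alist that contains every key in KEYS.
(define (has-keys? obj keys)
  (and (alist? obj)
       (let loop ((keys keys))
         (or (null? keys)
             (and (assq (car keys) obj) (loop (cdr keys)))))
       #t))

(define (test? obj)
  (has-keys? obj '(test/body-thunk test/description)))

(define (suite? obj)
  (has-keys? obj '(suite/body-thunk suite/description)))

(test? `((test/body-thunk . ,(lambda () #t))
         (test/description . "addition works")))   ; => #t
(test? 42)                                         ; => #f
(suite? '((suite/description . "arithmetic")))     ; => #f, no body thunk
```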

Implementation

A sample implementation is part of the suitbl testing library, which is provided as part of the guile-ares-rs project. The definitions API is contained in a single module:

The implementation depends on syntax-case, make-parameter, and association lists. Entities are represented as plain alists rather than records or opaque types, which keeps the representation transparent, inspectable, and maximally portable across Scheme implementations.

Design notes

The is macro uses two syntax-case clauses: one matches the predicate-application shape (pred arg …) and produces both a body thunk and a separate arguments thunk; the other matches an arbitrary expression and produces only a body thunk. Both clauses capture the unevaluated source form as a datum.
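
The two-clause structure can be approximated with syntax-rules, as in the sketch below. This is not the sample implementation: syntax-rules has no access to source locations, so assert/location is always #f here, and the first clause treats any parenthesized form as a predicate application, so the args thunk is not meaningful when the body is a special form or macro use. The stand-in test-runner* parameter and eager-runner are assumptions made so that the sketch runs on its own.

```scheme
(import (scheme base))

;; Stand-in for the parameter defined by this SRFI.
(define test-runner*
  (make-parameter (lambda (message) (error "no test runner configured"))))

(define-syntax is
  (syntax-rules ()
    ;; Predicate-application shape: also capture the arguments.
    ((_ (pred arg ...))
     ((test-runner*)
      (list (cons 'type 'runner/run-assert)
            (cons 'assert
                  (list (cons 'assert/body-thunk (lambda () (pred arg ...)))
                        (cons 'assert/args-thunk (lambda () (list arg ...)))
                        (cons 'assert/body '(pred arg ...))
                        (cons 'assert/location #f))))))
    ;; Any other expression: body thunk and source form only.
    ((_ expr)
     ((test-runner*)
      (list (cons 'type 'runner/run-assert)
            (cons 'assert
                  (list (cons 'assert/body-thunk (lambda () expr))
                        (cons 'assert/body 'expr)
                        (cons 'assert/location #f))))))))

;; A runner that evaluates the body and returns its value.
(define (eager-runner message)
  ((cdr (assq 'assert/body-thunk (cdr (assq 'assert message))))))

(define outcome
  (parameterize ((test-runner* eager-runner))
    (is (= 7 (is (+ 3 4))))))   ; the inner is returns 7

;; outcome is #t
```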

The immediate forms test and suite are defined in terms of their deferred counterparts: test expands to ((test-thunk …)) and suite expands to ((suite-thunk …)). The deferred form is the primitive; the immediate form simply invokes it. This factoring keeps the macro logic in one place and makes suite thunks the natural unit of composition.

The suite-thunk form returns a procedure that, when called, sends a runner/load-suite message to the current test runner. The returned procedure carries metadata that allows the suite-thunk? predicate to identify it (see portability notes below).

Portability considerations

A small number of Guile-specific features are used in the sample implementation. Each has a straightforward equivalent in other Scheme systems:

Source locations (syntax-source, %search-load-path)
The sample implementation calls Guile's syntax-source at macro-expansion time to obtain a filename/line/column alist, and resolves relative paths via %search-load-path. Other implementations can use their own source-location API (e.g. syntax-line/syntax-column in Racket, or the source-info facility of the host system). When no source-location API is available, the assert/location, test/location, and suite/location keys may be set to #f.
Procedure properties (set-procedure-properties!, procedure-property)
Suite thunks are tagged using Guile's procedure properties so that suite-thunk? can distinguish them from ordinary lambdas. Implementations without procedure properties can use alternative strategies, such as recording each suite thunk in a hash table that serves as a global registry, or any other mechanism that supports a reliable predicate.
define-public
The define-suite convenience macro expands to define-public, a Guile shorthand for defining and exporting a binding. Other implementations should expand to their own define-and-export form, or simply expand to define and leave exporting to the module declaration.
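
As an illustration of the registry strategy mentioned under procedure properties, the sketch below keeps suite thunks in a plain list and compares them by identity. The names suite-thunk-registry, register-suite-thunk!, and make-suite-thunk are hypothetical; a list is used only because it is available in every R7RS system.

```scheme
(import (scheme base))

;; Hypothetical global registry of known suite thunks.
(define suite-thunk-registry '())

(define (register-suite-thunk! thunk)
  (set! suite-thunk-registry (cons thunk suite-thunk-registry))
  thunk)

;; A procedure is a suite thunk only if it was registered.
(define (suite-thunk? obj)
  (and (procedure? obj)
       (memq obj suite-thunk-registry)
       #t))

;; Wrap a suite entity in a registered thunk that sends the load message
;; through SEND (for example, the current test runner).
(define (make-suite-thunk suite-entity send)
  (register-suite-thunk!
   (lambda ()
     (send `((type . runner/load-suite)
             (suite . ,suite-entity))))))

(define marked
  (make-suite-thunk '((suite/description . "unit tests"))
                    (lambda (message) message)))

(suite-thunk? marked)            ; => #t
(suite-thunk? (lambda () #f))    ; => #f
```

A registry of this kind keeps every suite thunk reachable for the lifetime of the program; implementations that offer weak hash tables may prefer them for that reason.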

Test suite

A test suite for the definitions API is provided:

Supplementary material

Acknowledgements

The design of this SRFI was influenced by clojure.test's use of first-class test entities and its is macro, and by JUnit's test primitives. SRFI 64 provided the foundation and demonstrated the need for a portable and runtime-friendly testing API; this SRFI builds on the lessons learned from its adoption.

Thanks to Andrew Kravchuk for showcasing the Common Lisp testing ecosystem and sharing the experience. Thanks to NLnet for funding the work on the suitbl testing library and this specification.

© 2025, 2026 Andrew Tropin, Ramin Honary.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler