A Scheme API for test suites


Per Bothner <per@bothner.com>


This SRFI is currently in ``draft'' status. To see an explanation of each status that a SRFI can hold, see here. It will remain in draft status until 2005/03/17, or as amended. To provide input on this SRFI, please send email to srfi-64@srfi.schemers.org. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.


This defines an API for writing test suites, to make it easy to portably test Scheme APIs, libraries, applications, and implementations. A test suite is a collection of test cases that execute in the context of a test-runner. This specification also supports writing new test-runners, to allow customization of reporting and processing the results of running test suites.


There are other testing frameworks written in Scheme, including SchemeUnit. However SchemeUnit is not portable. It is also a bit on the verbose side. It would be useful to have a bridge between this framework and SchemeUnit so SchemeUnit tests could run under this framework and vice versa. However, that is not part of this specification.

There exists at least one Scheme wrapper providing a Scheme interface to the standard JUnit API for Java. It would be useful to have a bridge so that tests written using this framework can run under a JUnit runner, and also that existing Scheme tests run under the current framework. However, that is not part of this specification.

We should have a testsuite for the testing framework. It should preferably be written using this specification, if that isn't too awkward. At the very least we need complete examples that exercise more of the API.

The implementation should be ported to other featureful Scheme implementations so that it can make use of more than the lowest-common-denominator R5RS functionality.

Need to define error-type for test-error.

Need to nail down definition of test specifier - specifically how a value gets coerced to a specifier procedure.

The implementation could be polished a bit more.


The Scheme community needs a standard for writing test suites. Every SRFI or other library should come with a test suite. Such a test suite must be portable, without requiring any non-standard features, such as modules. The test suite implementation or "runner" need not be portable, but it is desirable that it be possible to write a portable basic implementation.

This API makes use of implicit dynamic state, including an implicit test runner. This makes the API convenient and terse to use, but it may be a little less elegant and compositional than using explicit test objects, such as JUnit-style frameworks. It is not claimed to follow either object-oriented or functional design principles, but I hope it is useful and convenient to use and extend.

This proposal allows converting a Scheme source file to a test suite by just adding a few macros. You don't have to write the entire file in a new form, thus you don't have to re-indent it.

All names defined by the API start with the prefix test-. (Issue: Perhaps a colon prefix test: or testing: would be better.) All function-like forms are defined as syntax. They may be implemented as functions or macros or builtins. The reason for specifying them as syntax is to allow specific tests to be skipped without evaluating sub-expressions, or for implementations to add features such as printing line numbers or catching exceptions.


Let's start with a simple example. This is a complete self-contained test-suite.

;; Initialize and give a name to a simple testsuite.
(test-begin "vec-test")
(define v (make-vector 5 99))
;; Require that an expression evaluate to true.
(test-assert (vector? v))
;; Test that an expression is eqv? to some other expression.
(test-eqv (vector-ref v 2) 99)
(vector-set! v 2 7)
(test-eqv (vector-ref v 2) 7)
;; Finish the testsuite, and report results.
(test-end "vec-test")

This testsuite could be saved in its own source file. Nothing else is needed: we do not require any top-level forms, so it is easy to wrap an existing program or test into this form, without adding indentation. It is also easy to add new tests, without having to name individual tests (though naming them is optional).

Test cases are executed in the context of a test runner, which is an object that accumulates and reports test results. This specification defines how to create and use custom test runners, but implementations should also provide a default test runner. It is suggested (but not required) that loading the above file in a top-level environment will cause the tests to be executed using an implementation-specified default test runner, and test-end will cause a summary to be displayed in an implementation-specified manner.

Simple test-cases

Primitive test cases test that a given condition is true. They may have a name. The core test case form is test-assert:

(test-assert [test-name] expression)

This evaluates the expression. The test passes if the result is true; if the result is false, a test failure is reported. The test also fails if an exception is raised, assuming the implementation has a way to catch exceptions. How the failure is reported depends on the test runner environment. The test-name is a string that names the test case. It is used when reporting errors, and also when skipping tests, as described below. It is an error to invoke test-assert if there is no current test runner.

The following forms may be more convenient than using test-assert directly:

(test-eqv [test-name] test-expr expected)

This is equivalent to:

(test-assert [test-name] (eqv? test-expr expected))

Similarly test-equal and test-eq are shorthand for test-assert combined with equal? or eq?, respectively.

Here is a simple example:

(define (mean x y) (/ (+ x y) 2.0))
(test-eqv (mean 3 5) 4)

Tests for catching errors

We need a way to specify that evaluation should fail: tests that check that errors are detected.

(test-error [[test-name] error-type] test-expr)

Evaluating test-expr is expected to signal an error. The kind of error is indicated by error-type.

Issue: What is error-type? Perhaps a condition type or the associated predicate, in the SRFI-35 sense?

If the error-type is left out, or it is #t, it means "some kind of unspecified error should be signaled". For example:

(test-error #t (vector-ref #(1 2) 9))

An implementation that cannot catch exceptions should skip test-error forms.

Test groups and paths

A test group is a named sequence of forms containing test cases, expressions, and definitions. Entering a group sets the test group name; leaving a group restores the previous group name. These are dynamic (run-time) operations, and a group has no other effect or identity. Test groups are informal groupings: they are neither Scheme values nor syntactic forms.

A test group may contain nested inner test groups. The test group path is a list of the currently-active (entered) test group names, oldest (outermost) first.

(test-begin suite-name)

A test-begin enters a new test group. The suite-name becomes the current test group name, and is added to the end of the test group path. Portable test suites should use a string literal for suite-name; the effect of expressions or other kinds of literals is unspecified.

Rationale: In some ways using symbols would be preferable. However, we want human-readable names, and standard Scheme does not provide a way to include spaces or mixed-case text in literal symbols.

Additionally, if there is no currently executing test runner, one is installed in an implementation-defined manner.

(test-end [suite-name] [count])

A test-end leaves the current test group. An error is reported if the suite-name does not match the current test group name. If it does match an earlier name in the test group path, intervening groups are left.

The optional count must match the number of test-cases executed since the matching test-begin. (Nested test groups count as a single test case for this count.) This extra test may be useful to catch cases where a test doesn't get executed because of some unexpected error.

Additionally, if the matching test-begin installed a new test-runner, then the test-end will de-install it, after reporting the accumulated test results in an implementation-defined manner.
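As an illustrative sketch (assuming a default test runner is installed as described above), nested groups with count checks might look like this:

```scheme
;; Sketch of nested test groups. The 2 passed to the inner test-end
;; asserts that exactly two test cases ran in the "inner" group.
(test-begin "outer")
(test-begin "inner")
(test-assert (= 2 (+ 1 1)))
(test-eqv (* 2 3) 6)
(test-end "inner" 2)
;; The whole nested "inner" group counts as a single test case here.
(test-end "outer" 1)
```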

(test-group suite-name decl-or-expr ...)

Equivalent to:

(if (not (test-to-skip% suite-name))
    (dynamic-wind
      (lambda () (test-begin suite-name))
      (lambda () decl-or-expr ...)
      (lambda () (test-end suite-name))))

This is usually equivalent to executing the decl-or-exprs within the named test group. However, the entire group is skipped if it matched an active test-skip (see later). Also, the test-end is executed in case of an exception.
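For instance, the vector tests from the earlier example could be wrapped in a group (a sketch; the group name is hypothetical):

```scheme
;; Sketch: the group is skipped as a whole if an active
;; test-skip matches "vec-ops".
(test-group "vec-ops"
  (define v (make-vector 3 0))
  (test-assert (vector? v))
  (test-eqv (vector-length v) 3))
```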

Issue: In the case of an exception, should we actually catch it, and proceed following the test-group, or should we use a separate form for catching errors?

Handling set-up and cleanup

(test-group-with-cleanup suite-name
  decl-or-expr ...
  cleanup-form)

Executes each decl-or-expr in order, and then executes the cleanup-form. The cleanup-form should be executed even if an earlier decl-or-expr raises an exception (assuming the implementation can catch exceptions).

For example:

(test-group-with-cleanup "test-file"
  (define f (open-output-file "log"))
  (do-a-bunch-of-tests f)
  (close-output-port f))

Test specifiers

Sometimes we want to run only certain tests, or we know that certain tests are expected to fail. A test specifier is a one-argument function that takes a test-runner and returns a boolean. The specifier may be run before a test is performed, and the result may control whether the test is executed. For convenience, a specifier may also be a non-procedure value, which is coerced to a specifier procedure in a manner that remains to be decided.

(test-match-named name)
The resulting specifier matches if the current test name (as returned by test-runner-test-name) is equal? to name.

(test-match-nth n [count])
This evaluates to a stateful predicate: A counter keeps track of how many times it has been called. The predicate matches the n'th time it is called (where 1 is the first time), and the next (- count 1) times, where count defaults to 1.

(test-match-any specifier ...)
The resulting specifier matches if any specifier matches.

(test-match-all specifier ...)
The resulting specifier matches if each specifier matches.
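As a sketch, the match combinators can be composed. Here is a specifier that matches only the first two occurrences of a test named "bignum-test" (the name is hypothetical):

```scheme
;; Sketch: matches a test named "bignum-test", but only the
;; first two times such a match is attempted.
(define skip-early-bignum
  (test-match-all (test-match-named "bignum-test")
                  (test-match-nth 1 2)))
```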

Skipping selected tests

In some cases you may want to skip a test.

(test-skip specifier)

Evaluating test-skip adds the resulting specifier to the set of currently active skip-specifiers. Before each test (or begin-group) the set of active skip-specifiers is applied to the active test-runner. If any specifier matches, then the test is skipped.

For convenience, if the specifier is a string, it is syntactic sugar for (test-match-named specifier). For example:

(test-skip "test-b")
(test-assert "test-a" #t)   ;; executed
(test-assert "test-b" #t)   ;; skipped

Any skip specifiers introduced by a test-skip are removed by a following non-nested test-end.

(test-begin "group1")
(test-skip "test-a")
(test-assert "test-a" #t)   ;; skipped
(test-end "group1")         ;; Undoes the prior test-skip
(test-assert "test-a" #t)   ;; executed

Expected failures

Sometimes you know a test case will fail, but you don't have time to or can't fix it. Maybe a certain feature only works on certain platforms. However, you want the test-case to be there to remind you to fix it. You want to note that such tests are expected to fail.

(test-expect-fail specifier)

Matching tests (where matching is defined as in test-skip) are expected to fail. This only affects test reporting, not test execution.
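A minimal sketch (the test name and the failing expression are hypothetical):

```scheme
;; "exact-div" is known to fail on this hypothetical platform; marking
;; it as expected to fail lets reporting distinguish it from regressions.
(test-expect-fail "exact-div")
(test-assert "exact-div" (exact? (/ 1 3)))  ;; reported as xfail if it fails
```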


A test-runner is an object that runs a test-suite and manages its state. The test group path and the sets of skip and expected-fail specifiers are part of the test-runner. A test-runner will also typically accumulate statistics about executed tests.

(test-runner-current runner)
Get or set the current test-runner. If an implementation supports parameter objects (as in SRFI-39), then test-runner-current can be a parameter object. Alternatively, test-runner-current may be implemented as a macro or function that uses a fluid or thread-local variable, or a plain global variable.

(test-runner-simple)
Creates a new simple test-runner that prints errors and a summary on the standard output port.

(test-runner-null)
Creates a new test-runner that does nothing with the test results. This is mainly meant to be extended when writing a custom runner.

Implementations may provide other test-runners, perhaps a (test-runner-gui).

(test-runner-create)
Creates a new test-runner. Equivalent to ((test-runner-factory)).

(test-runner-factory factory)
Get or set the current test-runner factory. A factory is a zero-argument function that creates a new test-runner. The default value is test-runner-simple, but implementations may provide a way to override the default. As with test-runner-current, this may be a parameter object, or use a per-thread, fluid, or global variable.
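For example, a test script might install a factory so that every implicitly created runner is silent (a sketch):

```scheme
;; Sketch: make test-begin install do-nothing runners by default.
(test-runner-factory (lambda () (test-runner-null)))
```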

Running specific tests with a specified runner

(test-apply [runner] specifier ... procedure)

Calls procedure with no arguments using the specified runner as the current test-runner. If runner is omitted, then (test-runner-current) is used. (If there is no current runner, one is created as in test-begin.) If one or more specifiers are listed then only tests matching the specifiers are executed. A specifier has the same form as one used for test-skip. A test is executed if it matches any of the specifiers in the test-apply and does not match any active test-skip specifiers.
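A sketch, where run-all-tests is a hypothetical zero-argument procedure containing test cases:

```scheme
;; Sketch: execute only tests named "test-a" or "test-b",
;; using the current test-runner.
(test-apply (test-match-any (test-match-named "test-a")
                            (test-match-named "test-b"))
            run-all-tests)
```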

(test-with-runner runner decl-or-expr ...)

Executes each decl-or-expr in order in a context where the current test-runner is runner.
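A sketch of running a small suite under an explicitly chosen runner:

```scheme
;; Sketch: the suite runs with the given runner current,
;; so its results do not affect the surrounding runner.
(let ((runner (test-runner-null)))
  (test-with-runner runner
    (test-begin "quiet")
    (test-assert (odd? 3))
    (test-end "quiet")))
```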

Writing a new test-runner

This section can be ignored if you just want to write test-cases.

Test result

A test-result is an association list that contains various information about the result of a test. Some associations are standard; implementations can add more.

The result-kind association has one of the following symbols as its value:

pass - The test passed, as expected.
fail - The test failed (and was not expected to).
xfail - The test failed, and was expected to.
xpass - The test passed, but was expected to fail.
skip - The test was skipped.

If an implementation can pass the source location (filename and line number) to the test routines, it should use the associations source-file and source-line.

Examples needed. Also more standard associations.
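Pending fuller examples, here is a sketch of an on-test procedure that reads the standard associations (source-file and source-line may be absent):

```scheme
;; Sketch: report unexpected results, with source location if present.
(lambda (runner result)
  (let ((kind (assq 'result-kind result))
        (file (assq 'source-file result))
        (line (assq 'source-line result)))
    (if (and kind (memq (cdr kind) '(fail xpass)))
        (begin
          (if (and file line)
              (begin (display (cdr file)) (display ":")
                     (display (cdr line)) (display ": ")))
          (display "unexpected result: ")
          (display (cdr kind))
          (newline)))))
```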

Test-runner components

The following functions are for accessing the components of a test-runner. They would normally only be used to write a new test-runner or a match-predicate.

(test-runner-pass-count runner)
Returns the number of tests that passed, and were expected to pass.

(test-runner-fail-count runner)
Returns the number of tests that failed, but were expected to pass.

(test-runner-xpass-count runner)
Returns the number of tests that passed, but were expected to fail.

(test-runner-xfail-count runner)
Returns the number of tests that failed, and were expected to fail.

(test-runner-skip-count runner)
Returns the number of tests or test groups that were skipped.

(test-runner-test-name runner)
Returns the name of the current test or test group, as a string. During execution of test-begin this is the name of the test group; during the execution of an actual test, this is the name of the test-case. If no name was specified, the name is the empty string.

(test-runner-aux-value runner)
(test-runner-aux-value! runner value)
Get or set the aux-value field of a test-runner. This field is not used by this API or the test-runner-simple test-runner, but may be used by custom test-runners to store extra state.
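A sketch of how a custom runner might use this slot:

```scheme
;; Sketch: stash a log port in the aux-value slot so that the
;; on-test and on-final callbacks can share it.
(let ((runner (test-runner-null)))
  (test-runner-aux-value! runner (open-output-file "/tmp/test.log"))
  (display "started" (test-runner-aux-value runner)))
```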

(test-runner-on-test runner)
(test-runner-on-test! runner on-test)
Gets or sets the procedure that is run after each test to report the result. The procedure takes two parameters: a test-runner, and an association list giving information about the test. (Need more specifics on this!) Typically, this procedure will emit terse or no output if the test succeeded or was skipped, and more detailed output if the test failed. The initial value is test-on-test-simple, which writes to the standard output (fill this in later).

(test-runner-on-final runner)
(test-runner-on-final! runner on-final)
Gets or sets the procedure that is run at the very end to report the results. The procedure takes one parameter (a test-runner) and typically displays a summary (count) of the tests. The initial value is test-on-final-simple, which writes to the standard output port the number of tests of the various kinds.

(test-runner-reset runner)
Resets the state of the runner to its initial state.


This is an example of a simple custom test-runner. Loading this program before running a test-suite will install it as the default test runner.

(define (my-simple-runner filename)
  (let ((runner (test-runner-null))
	(port (open-output-file filename))
        (num-passed 0)
        (num-failed 0))
    (test-runner-on-test! runner
      (lambda (runner result)
        (case (cdr (assq 'result-kind result))
          ((pass xpass) (set! num-passed (+ num-passed 1)))
          ((fail xfail) (set! num-failed (+ num-failed 1)))
          (else #t))))
    (test-runner-on-final! runner
       (lambda (runner)
          (format port "Passing tests: ~d.~%Failing tests: ~d.~%"
                  num-passed num-failed)
	  (close-output-port port)))
    runner))

(test-runner-factory
 (lambda () (my-simple-runner "/tmp/my-test.log")))


The test implementation uses cond-expand (SRFI-0) to select different code depending on certain SRFI names (srfi-9, srfi-34, srfi-35, srfi-39), or implementations (kawa). It should otherwise be portable to any R5RS implementation. (It has been tested on Kawa, MzScheme, and Chez Scheme. So far only Kawa makes use of non-R5RS features; patches welcomed.)

The implementation is neither finished nor debugged, but I hope it is ready for people to experiment with.


Test suite

Of course we need a test suite for the testing framework itself. For that we need a meta-level test-runner. The test-with-runner form should be helpful.


Copyright (C) Per Bothner (2005)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.


Per Bothner
Editor: Francisco Solsona
Last modified: Thu Jan 27 19:17:02 PST 2005