Re: testing inexacts



Aubrey Jaffer wrote:
 | The test-name is a string that names the test case. It is used when
 | reporting errors, and also when skipping tests, as described below.

Must TEST-NAMEs be unique?

No.  After all, they're optional: a missing name is equivalent to "".

If not, then aren't calls to TEST-END ambiguous?

I don't believe so.  test-begin/test-end have to be properly
bracketed. The name in the test-end is mainly for readability and
to catch test-suite errors.  I also intend that if the test-end
name doesn't match the current name (from the previous test-begin),
but it matches an earlier one, then extra implicit test-end
calls would be added.  However, the implementation doesn't yet do
that.  This would primarily be for recovering from test-suite errors,
or exceptions that aren't caught.  I.e. like recovering from a
syntax error in that the test suite would fail, but we try to
fail a little more elegantly.
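
A minimal sketch of that recovery rule, assuming the open groups are
kept as a list of names with the innermost first.  The procedure name
and the list representation are hypothetical and not part of the draft:

```scheme
;; Hypothetical helper: GROUP-STACK holds the names given to the
;; currently open test-begin calls, innermost first.
;; Returns how many groups a (test-end NAME) should close:
;;   1   if NAME matches the innermost group,
;;   k>1 if NAME matches an outer group (k-1 implicit test-ends),
;;   #f  if NAME matches no open group (a genuine test-suite error).
(define (groups-to-close name group-stack)
  (let loop ((stack group-stack) (count 1))
    (cond ((null? stack) #f)
          ((equal? name (car stack)) count)
          (else (loop (cdr stack) (+ count 1))))))

;; "outer" is two levels up, so three groups would be closed.
(groups-to-close "outer" '("inner" "middle" "outer"))
```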

 |  *Rationale:* In some ways using symbols would be preferable.
 |  However, we want human-readable names, and standard Scheme does
 |  not provide a way to include spaces or mixed-case text in literal
 |  symbols.

Writing tests should be about the tests; and not about making
capitalization consistent.

The point is *reporting* the results of tests.  The report should
be human-readable, and allowing mixed case and spaces in test names
helps that.

Please allow symbols as well.

Using symbols allows us to match test names using eq?.
Using strings requires matching test names using equal?.
That is an acceptable price to pay for more readable test names.

And while we are at it, R5RS sections are hierarchically numbered.
Why not allow integers?

Since these are just names, and we're not doing any operations on
test names except displaying them and comparing them, there is no
particular value to allowing integers.  E.g. allowing the name 34
doesn't add much compared to using "34".

However, I have no strong opposition to allowing numbers - or
symbols.  The concern I have is with "test specifiers".  Those
"evaluate" to procedures, but it may be convenient to allow
short-hands.  The draft allows "test-name" as a short-hand for
(test-match-named "test-name").  I have considered allowing
integers, perhaps as a short-hand for test-match-nth.  That
wouldn't work if we allow integers as test-names.
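
One way to picture the short-hand, as a sketch only: match-named below
is a simplified stand-in for the draft's test-match-named, a specifier
is assumed to be a predicate applied to a test's name, and
coerce-specifier is a hypothetical helper:

```scheme
;; Simplified stand-in for test-match-named: returns a predicate
;; that is true for tests whose name equals NAME.
(define (match-named name)
  (lambda (test-name) (equal? name test-name)))

;; Hypothetical coercion from a short-hand to a specifier procedure.
;; Procedures pass through unchanged; strings become name matchers.
(define (coerce-specifier spec)
  (cond ((procedure? spec) spec)
        ((string? spec) (match-named spec))
        (else (error "unsupported test specifier" spec))))

;; Both forms end up as the same kind of procedure.
((coerce-specifier "complex number tests") "complex number tests")
```

An integer clause could be added to the cond for test-match-nth, but
only if integers cannot also be test names.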

Feedback on the "syntax" of test-specifiers would be welcome.
Though I guess I should post my ideas.  If you look at the
HTML source (section "Skipping selected tests") you see some
ideas I had before I settled on defining specifiers as
boolean functions; I'd like to combine the convenient
syntax for the commented-out specifiers with the simple
and general model of using procedures.

 | The following forms may be more convenient than using |test-assert|
 | directly:
 |
 | (test-eqv [test-name] test-expr expected)

The EXPECTED is usually shorter to write than the TEST-EXPR.  I
recommend swapping TEST-EXPR and EXPECTED.

I don't feel strongly about it.  However, the "flow" is that you
first evaluate test-expr, and then compare that to the expected
result, so having the latter last may be more natural.

Also, putting the optional argument last is what Scheme programmers
are accustomed to.

True, but I think the test-name should still come first.
I think having the name first, as we do for declarations,
and as we do in documents (like dictionaries) is more natural.
Visually scanning quickly for a test-name is also easier if
the test-name is first.

TEST-EQUAL is just as useful as TEST-EQV and should be provided.

It's in the draft, but just in passing:

  Similarly test-equal and test-eq are shorthand for
  test-assert combined with equal? or eq?, respectively.

For testing inexact calculations, a TEST-APPROXIMATE procedure which
accepts values within a small range of the expected number would be
very useful.

That sounds useful.  Should the error range be specified
absolutely or relatively?  The latter is presumably more
general - except for "approximately zero".  How about:

(test-approximate [test-name] test-expression expected [error])
where error defaults to (say) 0.01 and is relative to expected
I.e. (and (>= result (- expected (* expected error)))
          (<= result (+ expected (* expected error))))

(test-zero [test-name] test-expression [error])
where error is absolute and defaults to (say) 0.01
I.e. (and (>= result (- error)) (<= result error))
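
The two comparisons can be sketched as plain predicates.  The names
are hypothetical, and taking absolute values is an assumption added
here so that a negative expected value is handled as well:

```scheme
;; Relative tolerance: the allowed deviation scales with |expected|.
(define (approximately-equal? result expected error)
  (<= (abs (- result expected))
      (* (abs expected) error)))

;; Absolute tolerance, for results that should be close to zero.
(define (approximately-zero? result error)
  (<= (abs result) error))

(approximately-equal? 100.5 100 0.01)   ; within 1% of 100
(approximately-equal? 0.001 0 0.01)     ; relative to 0 allows no slack
(approximately-zero? 0.001 0.01)        ; so zero needs the absolute test
```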

For extra points make TEST-APPROXIMATE recursively
descend list and array structures, using its standard of approximate
numerical match.  The range (delta) should be a property of the test
runner.

Perhaps the default delta should be a test runner property,
but test-approximate/test-zero could override it?
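
A sketch of the recursive descent, assuming vectors stand in for
arrays and the tolerance is passed explicitly rather than read from
the test runner.  The procedure name is hypothetical:

```scheme
;; Compare RESULT with EXPECTED, using a relative tolerance for real
;; numbers and descending into pairs and vectors.  Anything else
;; (strings, symbols, the empty list, ...) must be equal?.
(define (approx-match? result expected error)
  (cond ((and (real? result) (real? expected))
         (<= (abs (- result expected)) (* (abs expected) error)))
        ((and (pair? result) (pair? expected))
         (and (approx-match? (car result) (car expected) error)
              (approx-match? (cdr result) (cdr expected) error)))
        ((and (vector? result) (vector? expected))
         (and (= (vector-length result) (vector-length expected))
              (approx-match? (vector->list result)
                             (vector->list expected)
                             error)))
        (else (equal? result expected))))

(approx-match? '(1.001 #(2.0 3.0) "x") '(1 #(2 3) "x") 0.01)
```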

Of course, having optional inexact tests in a testing file isn't
portable to implementations lacking inexacts.  R5RS requires those
implementations to signal an error when inexact number syntax is
encountered (macros don't help).  "r4rstest.scm" goes through the
hassle of replacing what would be literal inexact numbers with calls
to STRING->NUMBER.  I would really like a better way to do this.

Not all testing files are going to be portable.  The goal is that
the api be portable, so it is easy to write portable tests, but
presumably not more portable than what you're testing.  E.g. a
test for complex numbers is only going to work if the implementation
supports complex numbers.  What you want is for the complex tests
to be skipped (and the report summary say so) if complex is
unavailable.  If the tests depend on reader syntax one can
always put them in a separate file and load it. E.g.:

(if no-complex
 (test-skip "complex number tests"))
(test-group "complex number tests"
  (load "complex-number-tests.scm"))

If we allow test-specifiers to be integers interpreted relatively,
this could be simplified to:

(if no-complex
 (test-skip 1)) ;; skip following group
(test-group "complex number tests"
  (load "complex-number-tests.scm"))

 | Additionally, if the matching |test-begin| installed a new test-runner,
 | then the |test-end| will de-install it, after reporting the accumulated
 | test results in an implementation-defined manner.
 |
 | (test-group suite-name decl-or-expr ...)
 |
 | Equivalent to:
 |
 | (if (not (test-to-skip% suite-name))
 |   (dynamic-wind
 |     (lambda () (test-begin suite-name))
 |     (lambda () decl-or-expr ...)
 |     (lambda () (test-end suite-name))))

In a test system it is desirable to use the fewest possible features
of Scheme, so that problems in the implementation are less likely to
render the test system unusable.  In this light, is the nesting of
test-groups bringing benefits large enough to justify the use of
complicated constructs like DYNAMIC-WIND?

We only use dynamic-wind to "cleanup" - i.e. "unwind-protect".
In an implementation without dynamic-wind it would be acceptable
to replace it with a macro that just calls the 3 thunks in sequence.
In an implementation that doesn't have full dynamic-wind
but does have "cleanups" (e.g. Kawa, since it doesn't yet have
full continuations) it would be nice to register the final
thunk as a cleanup.
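
A sketch of the simplest fallback, written here as a procedure rather
than a macro.  The name simple-wind is hypothetical, and only a single
return value from the body is handled:

```scheme
;; Hypothetical replacement for dynamic-wind in an implementation
;; that lacks it: call the three thunks in sequence and return the
;; value of the middle one.  Unlike dynamic-wind, AFTER is not run
;; if THUNK exits non-locally.
(define (simple-wind before thunk after)
  (before)
  (let ((result (thunk)))
    (after)
    result))

(simple-wind (lambda () (display "begin "))
             (lambda () (display "body ") 42)
             (lambda () (display "end")))
```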

I can change the implementation to make it easier to tweak this
part.
--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/