Title

Comparators

Author

John Cowan

This SRFI is currently in ``draft'' status. To see an explanation of each status that a SRFI can hold, see here. To provide input on this SRFI, please mail to <srfi minus 114 at srfi dot schemers dot org>. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.

Received: 2013/06/13
Revised: 2013/06/23
Revised: 2013/07/03
Revised: 2013/12/04
Draft: 2013/06/15-2013/08/15

Abstract

This proposal is a rewrite of SRFI 67, Compare Procedures, extending it from procedures that represent a total order to procedure bundles that represent one or more of a total order, an equality predicate, and a hash function. By packaging these procedures together, along with a type test predicate, they can be treated as a single item for use in the implementation of data structures.

Issues

All issues closed.

Rationale

The four procedures above have complex dependencies on one another, and it is inconvenient to have to pass them all to other procedures that might or might not make use of all of them. For example, a set implementation naturally requires only an equality predicate, but if it is implemented using a hash table, an appropriate hash function is also required if the implementation does not provide one; alternatively, if it is implemented using a tree, a comparison procedure is required. By passing a comparator rather than a bare equality predicate, the set implementation can make use of whatever procedures are available and useful to it.

This SRFI could not have been written without the work of Sebastian Egner and Jens Axel Søgaard on SRFI 67; much of the credit for this SRFI is due to them, but none of the blame. In addition, many of the design decisions of this SRFI are copied from SRFI 67's design rationale.

Specification

The procedures in this SRFI are in the (srfi 114) library (or (srfi :114) on R6RS), but the sample implementation currently places them in the (comparators) library.

Definitions

A comparator is an object of a disjoint type. It is a bundle of procedures that are useful for comparing two objects either for equality or for ordering. There are four procedures in the bundle:

The type test predicate returns #t if its argument has the correct type to be passed as an argument to the other three procedures, and #f otherwise.
The equality predicate returns #t if the two objects are the same in the sense of the comparator, and #f otherwise. It is the programmer's responsibility to ensure that it is reflexive, symmetric, transitive, and can handle any arguments that satisfy the type test predicate.
The comparison procedure returns -1, 0, or 1 if the first object precedes the second, is equal to the second, or follows the second, respectively, in a total order defined by the comparator. It is the programmer's responsibility to ensure that it is reflexive, weakly antisymmetric, transitive, can handle any arguments that satisfy the type test predicate, and returns 0 iff the equality predicate returns #t. Comparison procedures are compatible with the compare procedures of SRFI 67; see SRFI 67 for the rationale for adopting this return convention.
The hash function takes one argument, and returns an exact non-negative integer. It is the programmer's responsibility to ensure that it can handle any argument that satisfies the type test predicate, and that it returns the same value on two objects if the equality predicate says they are the same (but not necessarily the converse).

It is also the programmer's responsibility to ensure that all four procedures provide the same result whenever they are applied to the same object(s) (in the sense of eqv?), unless the object(s) have been mutated since the last invocation. In particular, they must not depend in any way on memory addresses in implementations where the garbage collector can move objects in memory.

Limitations

The comparator objects defined in this SRFI are not applicable to circular structure, or (with the exception of comparators created by make-inexact-real-comparator) to NaNs or objects containing them. Attempts to pass any such objects to any procedure defined here, or to any procedure that is part of a comparator defined here, is an error.

Index

Predicates: comparator?, comparator-comparison-procedure?, comparator-hash-function?
Standard comparators: boolean-comparator, char-comparator, char-ci-comparator, string-comparator, string-ci-comparator, symbol-comparator, exact-integer-comparator, integer-comparator, rational-comparator, real-comparator, complex-comparator, number-comparator, pair-comparator, list-comparator, vector-comparator, bytevector-comparator
The default comparator: default-comparator
Comparator constructors: make-comparator, make-inexact-real-comparator, make-vector-comparator, make-bytevector-comparator, make-list-comparator, make-vectorwise-comparator, make-listwise-comparator, make-car-comparator, make-cdr-comparator, make-pair-comparator, make-improper-list-comparator, make-selecting-comparator, make-refining-comparator, make-reverse-comparator, make-debug-comparator
Wrapped equality predicates: eq-comparator, eqv-comparator, equal-comparator
Accessors: comparator-type-test-procedure, comparator-equality-predicate, comparator-comparison-procedure, comparator-hash-function
Primitive applicators: comparator-test-type, comparator-check-type, comparator-equal?, comparator-compare, comparator-hash
Comparison procedure constructors: make-comparison<, make-comparison>, make-comparison<=, make-comparison>=, make-comparison=/< make-comparison=/>
Comparison syntax: if3, if=?, if<?, if>?, if<=?, if>=?, if-not=?
Comparison predicates: =?, <?, >?, <=?, >=?
Comparison predicate constructors: make=, make< , make>, make<=, make>=
Interval (ternary) comparison predicates: in-open-interval?, in-closed-interval?, in-open-closed-interval?, in-closed-open-interval?
Min/max comparison procedures: comparator-min, comparator-max

Predicates

(comparator? obj)

Returns #t if obj is a comparator, and #f otherwise.

(comparator-comparison-procedure? comparator)

Returns #t if comparator has a supplied comparison procedure, and #f otherwise.

(comparator-hash-function? comparator)

Returns #t if comparator has a supplied hash function, and #f otherwise.

Standard comparators

The following comparators are analogous to the standard compare procedures of SRFI 67. They all provide appropriate hash functions as well.

boolean-comparator

Compares booleans using the total order #f < #t.

char-comparator

Compares characters using the total order implied by char<?. On R6RS and R7RS systems, this is Unicode codepoint order.

char-ci-comparator

Compares characters using the total order implied by char-ci<? On R6RS and R7RS systems, this is Unicode codepoint order after the characters have been folded to lower case.

string-comparator

Compares strings using the total order implied by string<?. Note that this order is implementation-dependent.

string-ci-comparator

Compares strings using the total order implied by string-ci<?. Note that this order is implementation-dependent.

symbol-comparator

Compares symbols using the total order implied by applying symbol->string to the symbols and comparing them using the total order implied by string<?. It is not a requirement that the hash function of symbol-comparator be consistent with the hash function of string-comparator, however.

exact-integer-comparator

integer-comparator

rational-comparator

real-comparator

complex-comparator

number-comparator

These comparators compare exact integers, integers, rational numbers, real numbers, complex numbers, and any numbers using the total order implied by <. They must be compatible with the R5RS numerical tower in the following sense: If S is a subtype of the numerical type T and the two objects are members of S , then the equality predicate and comparison procedures (but not necessarily the hash function) of S-comparator and T-comparator compute the same results on those objects.

Since non-real numbers cannot be compared with <, the following least-surprising ordering is defined: If the real parts are < or >, so are the numbers; otherwise, the numbers are ordered by their imaginary parts. This can still produce surprising results if one real part is exact and the other is inexact.

pair-comparator

This comparator compares pairs using default-comparator (see below) on their cars. If the cars are not equal, that value is returned. If they are equal, default-comparator is used on their cdrs and that value is returned.

list-comparator

This comparator compares lists lexicographically, as follows:

The empty list compares equal to itself.
The empty list compares less than any non-empty list.
Two non-empty lists are compared by comparing their cars. If the cars are not equal when compared using default-comparator (see below), then the result is the result of that comparison. Otherwise, the cdrs are compared using list-comparator.

vector-comparator

bytevector-comparator

These comparators compare vectors and bytevectors by comparing their lengths. A shorter argument is always less than a longer one. If the lengths are equal, then each element is compared in turn using default-comparator (see below) until a pair of unequal elements is found, in which case the result is the result of that comparison. If all elements are equal, the arguments are equal.

If the implementation does not support bytevectors, bytevector-comparator has a type testing procedure that always returns #f.

The default comparator

default-comparator

This is a comparator that accepts any two Scheme values (with the exceptions listed in the Limitations section) and orders them in some implementation-defined way, subject to the following conditions:

The following ordering between types must hold: the empty list precedes pairs, which precede booleans, which precede characters, which precede strings, which precede symbols, which precede numbers, which precede vectors, which precede bytevectors, which precede all other objects. This ordering is compatible with SRFI 67.
When applied to pairs, booleans, characters, strings, symbols, numbers, vectors, or bytevectors, its behavior must be the same as pair-comparator, boolean-comparator, character-comparator, string-comparator, symbol-comparator, number-comparator, vector-comparator, and bytevector-comparator respectively. The same should be true when applied to an object or objects of a type for which a standard comparator is defined elsewhere.
Given disjoint types a and b, one of three conditions must hold:
- All objects of type a compare less than all objects of type b.
- All objects of type a compare greater than all objects of type b.
- All objects of either type a or type b compare equal to each other. This is not permitted for any of the standard types mentioned above.

Comparator constructors

Most of the following comparator constructors are close analogues of the standard compare procedures of SRFI 67. They all provide appropriate hash functions as well. Note that comparator constructors are allowed to cache their results: they need not return a newly allocated object, since comparators are purely functional.

(make-comparator type-test equality compare hash)

Returns a comparator which bundles the type-test, equality, compare, and hash procedures provided. As a convenience, the following additional values are accepted:

If type-test is #t, a type-test procedure that accepts any arguments is provided.
If equality is #t, an equality predicate is provided that returns #t iff compare returns 0.
If compare or hash is #f, a procedure is provided that signals an error on application. The predicates comparator-comparison-procedure? and/or comparator-hash-function?, respectively, will return #f in these cases.

(make-inexact-real-comparator epsilon rounding nan-handling)

Returns a comparator that compares inexact real numbers as follows: if after rounding to the nearest epsilon they are the same, they compare equal; otherwise they compare as specified by <. The direction of rounding is specified by the rounding argument, which is either a procedure accepting two arguments (the number and epsilon, or else one of the symbols floor, ceiling, truncate, or round.

The argument nan-handling specifies how to compare NaN arguments to non-NaN arguments. If it is a procedure, the procedure is invoked if either argument is a NaN. If it is the symbol min, NaN values precede all other values; if it is the symbol max, they follow all other values, and if it is the symbol error, an error is signaled if a NaN value is compared. If both arguments are NaNs, however, they always compare as equal.

(make-list-comparator element-comparator)

(make-vector-comparator element-comparator)

(make-bytevector-comparator element-comparator)

These procedures return comparators which compare two lists, vectors, or bytevectors in the same way as list-comparator, vector-comparator, and bytevector-comparator respectively, but using element-comparator rather than default-comparator.

If the implementation does not support bytevectors, the result of invoking make-bytevector-comparator is a comparator whose type testing procedure always returns #f.

(make-listwise-comparator type-test element-comparator empty? head tail)

Returns a comparator which compares two objects that satisfy type-test as if they were lists, using the empty? procedure to determine if an object is empty, and the head and tail procedures to access particular elements.

(make-vectorwise-comparator type-test element-comparator length ref)

Returns a comparator which compares two objects that satisfy type-test as if they were vectors, using the length procedure to determine the length of the object, and the ref procedure to access a particular element.

(make-car-comparator comparator)

Returns a comparator that compares pairs on their cars alone using comparator.

(make-cdr-comparator comparator)

Returns a comparator that compares pairs on their cdrs alone using comparator.

(make-pair-comparator car-comparator cdr-comparator)

Returns a comparator that compares pairs first on their cars using car-comparator. If the cars are equal, it compares the cdrs using cdr-comparator.

(make-improper-list-comparator element-comparator)

Returns a comparator that compares arbitrary objects as follows: the empty list precedes all pairs, which precede all other objects. Pairs are compared as if with (make-pair-comparator element-comparator element-comparator). All other objects are compared using element-comparator.

(make-selecting-comparator comparator₁ comparator₂ ...)

Returns a comparator whose procedures make use of the comparators as follows:

The type test predicate passes its argument to the type test predicates of comparators in the sequence given. If any of them returns #t, so does the type test predicate; otherwise, it returns #f.

The arguments of the equality, compare, and hash functions are passed to the type test predicate of each comparator in sequence. The first comparator whose type test predicate is satisfied on all the arguments is used when comparing those arguments. All other comparators are ignored. If no type test predicate is satisfied, an error is signaled.

This procedure is analogous to the expression types select-compare and cond-compare from SRFI 67.

(make-refining-comparator comparator₁ comparator₂ ...)

Returns a comparator that makes use of the comparators in the same way as make-selecting-comparator, except that its procedures can look past the first comparator whose type test predicate is satisfied. If the comparison procedure of that comparator returns zero, then the next comparator whose type test predicate is satisfied is tried in place of it until one returns a non-zero value. If there are no more such comparators, then the comparison procedure returns zero. The equality predicate is defined in the same way. If no type test predicate is satisfied, an error is signaled.

The hash function of the result returns a value which depends, in an implementation-defined way, on the results of invoking the hash functions of the comparators whose type test predicates are satisfied on its argument. In particular, it may depend solely on the first or last such hash function. If no type test predicate is satisfied, an error is signaled.

This procedure is analogous to the expression type refine-compare from SRFI 67.

(make-reverse-comparator comparator)

Returns a comparator that behaves like comparator, except that the compare procedure returns 1, 0, and -1 instead of -1, 0, and 1 respectively. This allows ordering in reverse.

(make-debug-comparator comparator)

Returns a comparator that behaves exactly like comparator, except that whenever any of its procedures are invoked, it verifies all the programmer responsibilities (except stability), and an error is signaled if any of them are violated. Because it requires three arguments, transitivity is not tested on the first call to a debug comparator; it is tested on all future calls using an arbitrarily chosen argument from the previous invocation. Note that this may cause unexpected storage leaks.

Wrapped equality predicates

eq-comparator

eqv-comparator

equal-comparator

The equality predicates of these comparators are eq?, eqv?, and equal? respectively. When their comparison procedures are applied to non-equal objects, their behavior is implementation-defined. The type test predicates always return #t.

Accessors

(comparator-type-test-procedure comparator)

Returns the type test predicate of comparator.

(comparator-equality-predicate comparator)

Returns the equality predicate of comparator.

(comparator-comparison-procedure comparator)

Returns the comparison procedure of comparator.

(comparator-hash-function comparator)

Returns the hash function of comparator.

Primitive applicators

(comparator-test-type comparator obj)

Invokes the type test predicate of comparator on obj and returns what it returns.

(comparator-check-type comparator obj)

Invokes the type test predicate of comparator on obj and returns true if it returns true and signals an error otherwise.

(comparator-equal? comparator obj₁ obj₂)

Invokes the equality predicate of comparator on obj₁ and obj₂ and returns what it returns.

(comparator-compare comparator obj₁ obj₂)

Invokes the comparison procedure of comparator on obj₁ and obj₂ and returns what it returns.

(comparator-hash comparator obj)

Invokes the hash function of comparator on obj and returns what it returns.

Comparison procedure constructors

(make-comparison< lt-pred)

(make-comparison> gt-pred)

(make-comparison<= le-pred)

(make-comparison>= ge-pred)

(make-comparison=/< eq-pred lt-pred)

(make-comparison=/> eq-pred gt-pred)

These procedures return a comparison procedure, given a less-than predicate, a greater-than predicate, a less-than-or-equal-to predicate, a greater-than-or-equal-to predicate, or the combination of an equality predicate and either a less-than or a greater-than predicate.

They are the same as the corresponding SRFI 67 compare-by procedures. Note that they do not accept comparand arguments.

Comparison syntax

The following expression types allow the convenient use of comparison procedures. They come directly from SRFI 67.

(if3 <expr> <less> <equal> <greater>)

The expression <expr> is evaluated; it will typically, but not necessarily, be a call on a comparison procedure. If the result is -1, <less> is evaluated and its value(s) are returned; if the result is 0, <equal> is evaluated and its value(s) are returned; if the result is 1, <greater> is evaluated and its value(s) are returned. Otherwise an error is signaled.

(if=? <expr> <consequent> [ <alternate> ])

(if<? <expr> <consequent> [ <alternate> ])

(if>? <expr> <consequent> [ <alternate> ])

(if<=? <expr> <consequent> [ <alternate> ])

(if>=? <expr> <consequent> [ <alternate> ])

(if-not=? <expr> <consequent> [ <alternate> ])

The expression <expr> is evaluated; it will typically, but not necessarily, be a call on a comparison procedure. It is an error if its value is not -1, 0, or 1. If the value is consistent with the specified relation, <consequent> is evaluated and its value(s) are returned. Otherwise, if <alternate> is present, it is evaluated and its value(s) are returned; if it is absent, an unspecified value is returned.

Comparison predicates

(=? [comparator] object₁ object₂ object₃ ...)

(<? [comparator] object₁ object₂ object₃ ...)

(>? [comparator] object₁ object₂ object₃ ...)

(<=? [comparator] object₁ object₂ object₃ ...)

(>=? [comparator] object₁ object₂ object₃ ...)

These procedures are analogous to the number, character, and string comparison predicates of Scheme. They allow the convenient use of comparators in situations where the expression types are not usable. They are also analogous to the similarly named procedures SRFI 67, but handle arbitrary numbers of arguments, which in SRFI 67 requires the use of the variants whose names begin with chain.

These procedures apply the comparison procedure of comparator to the objects as follows. If the specified relation returns #t for all object_i and object_j where n is the number of objects and 1 <= i < j <= n, then the procedures return #t, but otherwise #f.

If the first argument is not a comparator, then default-comparator is used. Note that there is no comparator for comparators, so there is no ambiguity.

The order in which the values are compared is unspecified. Because the relations are transitive, it suffices to compare each object with its successor.

Comparison predicate constructors

(make=? comparator)

(make<? comparator)

(make>? comparator)

(make<=? comparator)

(make>=? comparator)

These procedures return predicates which, when applied to two or more arguments, return #t if comparing obj₁ and obj₂ using the equality or comparison procedures of comparator shows that the objects bear the specified relation to one another. Such predicates can be used in contexts that do not understand or expect comparators.

Interval (ternary) comparison predicates

These procedures return true or false depending on whether an object is contained in an open, closed, or half-open interval. All comparisons are done in the sense of comparator, which is default-comparator if omitted.

(in-open-interval? [ comparator ] obj₁ obj₂ obj₃)

Return #t if obj₁ is less than obj₂, which is less thanobj₃, and #f otherwise.

(in-closed-interval? [ comparator ] obj₁ obj₂ obj₃)

Returns #t if obj₁ is less than or equal to obj₂, which is less than or equal to obj₃, and #f otherwise.

(in-open-closed-interval? [ comparator ] obj₁ obj₂ obj₃)

Returns #t if obj₁ is less than obj₂, which is less than or equal to obj₃, and #f otherwise.

(in-closed-open-interval? [ comparator ] obj₁ obj₂ obj₃)

Returns #t if obj₁ is less than or equal to obj₂, which is less than obj₃, and #f otherwise.

Min/max comparison procedures

(comparator-min comparator object₁ object₂ ...)

(comparator-max comparator object₁ object₂ ...)

These procedures are analogous to min and max respectively. They apply the comparison procedure of comparator to the objects to find and return a minimal (or maximal) object. The order in which the values are compared is unspecified.

Note: The SRFI 67 procedures pairwise-not=? and kth-largest involve sorting their arguments, and are not provided by this proposal in order to avoid an otherwise unnecessary implementation dependency. They are easily provided by a sorting package that makes use of comparators.

Implementation

The sample implementation contains the following files:

basics.scm - the syntax, record type definition, and simple constructors and procedures
default.scm - a simple implementation of the default constructor, which should be improved by implementers to handle records and implementation-specific types
constructors.scm - most of the constructors
advanced.scm - the more complex constructors
r7rs-shim.scm - procedures for R7RS compatibility, including a trivial implementation of bytevectors on top of SRFI 4 u8vectors
complex-shim.scm - a trivial implementation of real-part and imag-part for Schemes that don't have complex numbers
comparators.sld - an R7RS library
comparators.scm - a Chicken library

A future release will include a test program using the Chicken test egg, which is available on Chibi as the (chibi test) library.

Copyright

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Editor: Mike Sperber