[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: compare function return values

On Tue, 10 May 2005, Panu A Kalliokoski wrote:

>On Sun, May 08, 2005 at 10:25:28PM +0200, Jens Axel Søgaard wrote:
>> The "numbers can be used in index computations" also refer to the fact
>> that one at the assembly level can dispatch to the proper branch by
>> doing a simple table lookup based upon the return value of a compare
>> function.
>This can't possibly lead to any performance gain, because in the general
>case, the compiler will have to bound-check the integer returned by the
>compare function, which leads to two checks - the same number of
>comparisons as caused by checking which symbol it is.

This isn't true, I don't think.  Depending on the level at
which this function is integrated into the scheme system,
the compiler can *know* that it will return one of the
three values - and know that therefore it doesn't need to
range check the table lookup.

>> The overall idea is to provide enough constructs such that the user
>> of this srfi never needs to know the underlying represention of the
>> three cases. Thus the user is ought not to use bad practices from other
>> languages.
>Yes.  So, my argument boils down to this:
>(1) there are no good reasons for using integers as the return values.
>(2) bad code should be discouraged. (eg. using compare values for
>indexing, addition etc.)
>(3) using symbols would lead to more descriptive user experience.
>-> ergo, the return values of compare functions should be symbols.

Hmmm.  I see your point and it is a valid point, but I have a
few nits to pick:

First, symbols, from an implementation POV, are a *lot* heavier
than small integer numbers.  There are table lookups to do, and
guarantees to make about symbols being eq? and hash functions to
call and store the result of, etc.  Depending on the implementation,
there's either a permanent allocation of heap, or garbage collection
complicated by weak pointers to do repeatedly.  There are pointers
to follow whenever referring to the value, and they may frequently
lead to a cache miss.  Most of this could be optimized away by the
mythical "sufficiently smart compiler", but still, it would be
unsurprising if operations using symbols in a real system took ten
or even fifty times as long in amortized CPU cost over the program
run, and in interpreted systems they almost certainly would.

Second, small integers are useful in indexing arrays, and indexing
arrays is a natural use of comparison results.  Indexing arrays with
symbols is possible (and indeed, this is the "natural" form for
hash tables, since symbols have precomputed hash values for exactly
such use in name-value lookups), but ranges from inefficient to
haphazard on very small tables, such as the prototypical 3-entry
table suggested by comparison results.  It would almost certainly
be a better implementation strategy to do three comparisons.

Third, the limitation of array indexes to exact nonnegative integers
only, and the requirement of starting at zero, is a problem that ought
to be addressed first.  If arrays could be indexed by an arbitrary
subrange of any scalar type including characters and booleans, for
example, the underlying code for (and speed of) array lookups would
be not *too* much different, and this would benefit the present problem
by giving natural expression to arrays indexed by a scalar comparison

Fourth, SRFI-10 suggests a method for extending syntax, albeit perhaps
a bit clumsy:  Values such as #,(cmp lt) #,(cmp eq) and #,(cmp gt)
could be the written form of a three-valued scalar type for comparison