
Re: the discussion so far



"John.Cowan" <jcowan@xxxxxxxxxxxxxxxxx> writes:

> Thomas Bushnell BSG scripsit:
>
>> I'm referring to all the associate Unicode-related standards as well.
>> Please don't standardize non-compliance with other standards.  If a
>> Scheme system wants to comply with the UCA, then it should be able to
>> do so without violating the Scheme standard.
>
> SRFI-75 in no way prevents that.  It simply says what string<? and
> its friends mean.  You can still provide string-uca-simple<? and
> string-uca-locale<? if you want.

I think you are missing the point.  I cannot fathom why, but I'll try
to explain it again because it may have slid by.

When you provide a function that does almost-the-right-thing, you are
encouraging programmers to use it.  The only case where you have
identified a use for this function (when implemented as a simple
radix comparator on codepoints) is when you have binary search trees
which you want to exchange between Scheme systems.

Yet, this function will not be used only for that purpose.  Instead,
it will be used just as the R5RS function is: a general purpose way of
sorting strings to alphabetize them for human-readable output.
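
To make the gap concrete, here is a small sketch.  It is in Python
rather than Scheme, purely for illustration, since Python's built-in
string comparison is also by codepoint.  The word list and the
rough_key helper are made up for this example, and rough_key is only a
crude stand-in for human-oriented ordering (it strips accents and
folds case); it is not the UCA.

```python
import unicodedata

# Hypothetical sample data; "\u00e9cole" is "ecole" with an acute accent
# on the first letter, written as an escape so the codepoint is explicit.
words = ["apple", "Zebra", "\u00e9cole", "zebra"]

# Plain sorting compares strings codepoint by codepoint, which is the
# behaviour being proposed for string<?.
by_codepoint = sorted(words)

def rough_key(s):
    """Crude approximation of dictionary order: drop accents, ignore case.

    This is an assumption-laden stand-in, not the Unicode Collation
    Algorithm.
    """
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

# sorted() is stable, so words with equal keys keep their input order.
by_rough_key = sorted(words, key=rough_key)

print(by_codepoint)   # uppercase first, accented word last
print(by_rough_key)   # roughly what a reader of an index would expect
```

The codepoint sort puts "Zebra" ahead of "apple" and pushes the
accented word past "zebra", which is not what anyone alphabetizing a
list for people would want.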

At the very least, call it "ascii-array<?" and make it obvious to the
programmer that it is limited when it is foolishly used.

Any programmer, you see, who wants to write code that Does The Right
Thing and uses this function, thinking, "oh, this will sort strings
usefully for human-readable output", will be wildly misled.  On their
Scheme system it will be fine, but then on a fancy enough Scheme
system, with full Unicode support, their code will break.

It would have been better to tell them "Scheme has no portable way to
sort strings for human-readable output" than to provide a function
which is almost right.

Alternatively, which I would prefer, you could say that your pretended
use of string<? is not so important, but that sorting strings for
human-readable output is.  For this, we would simply have the standard
get the hell out of the way of systems that want fancier processing,
and not specify a collation that we *know* will cause problems.

Thomas