Re: String comparison under Latin-1 and Unicode

>>>>> On Fri, 10 Mar 2000 14:43:05 -0500, "Sergei Egorov" <esl@xxxxxxxxxxxxxxx> said:

> I don't agree with this proposal: it seems to me that STRING<? and
> others are better left for trivial tasks like sorting strings of
> digits; they have simple definition based on CHAR<? that, in its
> turn, is based on internal encoding (ASCII or UNICODE). It is still
> very useful as ordering predicate with no language-dependent
> meaning; for example, if you want to implement string sets as sorted
> lists, it's much better to use fast ordering predicate, even if the
> induced ordering doesn't make any sense. From the other hand, some

A reasonable argument.

> I would suggest using new names for collation predicates, especially
> because collation is actually a complex process involving generation
> of "collation keys" which can be reused:

> (string->collation-key str language-specifier) => c-key
> (collation-key<? c-key1 c-key2) => bool
> (collation-key<=? c-key1 c-key2) => bool
> ...  and then you can define your own collation predicates:

I would much prefer:
	(collation->predicate language-specifier ordering) -> pred?
	(pred? string1 string2) -> bool

where LANGUAGE-SPECIFIER is as Ben Goetter <goetter@xxxxxxxxxxxxxxxx>
suggested and ORDERING is one of the strings "<", "<=", or "=".

This seems far more useful and efficient than converting any string
you want to compare to a collation-key!
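
A rough sketch of the interface, to make the idea concrete. This is
not a real collation implementation: LANGUAGE-SPECIFIER is accepted but
ignored, the "en" specifier and the name EN<? are made up for the
example, and case-insensitive comparison stands in for genuine
language-sensitive ordering. It also assumes an ERROR procedure
(SRFI 23 style), which most implementations provide.

```scheme
;; Sketch only: returns a two-argument string predicate for ORDERING.
;; A real version would pick collation rules from LANGUAGE-SPECIFIER;
;; here it is ignored and the STRING-CI procedures act as a stand-in.
(define (collation->predicate language-specifier ordering)
  (cond ((string=? ordering "<")  string-ci<?)
        ((string=? ordering "<=") string-ci<=?)
        ((string=? ordering "=")  string-ci=?)
        (else (error "collation->predicate: unknown ordering" ordering))))

;; Build the predicate once, then reuse it for many comparisons.
(define en<? (collation->predicate "en" "<"))

(en<? "apple" "Banana")   ; => #t
```

The predicate is created once and can then be passed straight to a
sort routine, with no per-string key conversion visible to the caller.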