
predicate->char-set considered harmful (was: New drafts of SRFIs 13 & 14 available)

What about predicate->char-set on large (Unicode or larger) character
sets?  I'd certainly not want to call a function 65536 times (or 2^32 times)
just to construct a char-set.  And a user may not know that a Scheme
implementation has two-byte or four-byte characters. (How many people know
that Gambit has two-byte characters by default?) I just don't see how it's really helpful
to have this function, and I think it should be eliminated.
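
To make the cost concrete, here is a minimal sketch of the obvious way to
build a set from a predicate. It is not the SRFI 14 reference code: the name
naive-predicate->char-set, the list-of-characters representation, and the
`limit` argument (standing in for the size of the character type) are all
assumptions made for illustration.

```scheme
;; Hypothetical sketch: a char-set is modelled as a plain list of characters,
;; and `limit` stands in for the number of code points in the character type.
(define (naive-predicate->char-set pred limit)
  (let loop ((i (- limit 1)) (acc '()))
    (if (< i 0)
        acc
        (loop (- i 1)
              (if (pred (integer->char i))
                  (cons (integer->char i) acc)
                  acc)))))

;; A predicate that counts how often it is called.
(define calls 0)
(define (counting-digit? c)
  (set! calls (+ calls 1))
  (and (char>=? c #\0) (char<=? c #\9)))

;; Build the set over a Latin-1-sized character type (256 code points).
(define digits (naive-predicate->char-set counting-digit? 256))
;; `calls` is now 256 although `digits` holds only 10 characters.
```

The number of predicate calls depends only on `limit`, never on how many
characters end up in the set, so with 65536 or 2^32 code points the same ten
digits would cost 65536 or 2^32 calls.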

I have similar, but less strongly pronounced, difficulties with char-set-invert.
It seems that it should be there for completeness, but an efficient implementation
of it conflicts with the suggestion:

> "Large" character types, such as Unicode, should use a sparse representation,
> taking care that the Latin-1 subset continues to be represented with a dense
> 32-byte bit set. 
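
The tension can be sketched as follows, under the assumption (mine, for
illustration only) that a sparse set simply lists the code points of its
members. The name naive-complement and the `universe-size` argument are
hypothetical and not part of the SRFI.

```scheme
;; Hypothetical model: a sparse char-set is a list of member code points.
;; The complement has to enumerate everything that is NOT in that list.
(define (naive-complement members universe-size)
  (let loop ((i (- universe-size 1)) (acc '()))
    (cond ((< i 0) acc)
          ((memv i members) (loop (- i 1) acc))
          (else (loop (- i 1) (cons i acc))))))

(define vowels (map char->integer (list #\a #\e #\i #\o #\u)))

;; Complement within a two-byte character type (65536 code points).
(define non-vowels (naive-complement vowels 65536))
;; (length vowels) is 5, while (length non-vowels) is 65531.
```

In this model a five-element set turns into one with 65531 elements, so the
result of the operation is no longer sparse in any useful sense.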

If it stays (and perhaps it must), then I object to the name char-set-invert;
I prefer char-set-complement.  (I don't think I've ever heard the word
"invert" used for this operation in any text on basic mathematics containing
set theory.)

Brad Lucier