predicate->char-set considered harmful (was: New drafts of SRFIs 13 & 14 available)

What about predicate->char-set on large (Unicode or larger) character
sets?  I'd certainly not want to call a function 65536 times (or 2^32 times)
just to construct a char-set.  And a user may not know that a Scheme
implementation has two-byte or four-byte characters. (How many people know
that Gambit has 2-byte chars by default?) I just don't see how it's really helpful
to have this function, and I think it should be eliminated.
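
To make the cost concrete, here is a rough sketch (not the SRFI's reference
code) of what an implementation of predicate->char-set more or less has to
do: probe the predicate once per code point.  The name char-code-limit below
is just a stand-in for whatever upper bound the implementation has (256,
65536, or 2^32); it is not a standard binding, and I ignore the fact that
some integers may not map to valid characters.

    (define (naive-predicate->char-set pred)
      ;; Probe every code point; on a 2-byte or 4-byte character type
      ;; this means 65536 or 2^32 calls to PRED for a single char-set.
      (let loop ((i 0)
                 (cs (char-set-copy char-set:empty)))
        (if (>= i char-code-limit)   ; stand-in for the implementation's bound
            cs
            (loop (+ i 1)
                  (let ((c (integer->char i)))
                    (if (pred c)
                        (char-set-adjoin! cs c)
                        cs))))))

Even if each call is cheap, constructing the char-set is linear in the size
of the character type rather than in the size of the result.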

I have similar, though less pronounced, difficulties with char-set-invert.
It seems that it should be there for completeness, but an efficient implementation
of it conflicts with the suggestion:

> "Large" character types, such as Unicode, should use a sparse representation,
> taking care that the Latin-1 subset continues to be represented with a dense
> 32-byte bit set. 

If it stays (and perhaps it must), then I object to the name char-set-invert;
I prefer char-set-complement.  (I don't think I've ever heard the word
"invert" used for this operation in any basic mathematics text that covers
set theory.)

Brad Lucier