Re: case mappings

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

On Thu, 14 Jul 2005, Thomas Bushnell BSG wrote:

> From my perspective, the problem with the current standard is
> that you *cannot* implement Unicode properly.  Far from
> requiring it, it is essentially prohibited.

> I fear that this SRFI is (ironically) in danger of doing the
> same damn thing all over again.

> If Scheme standardizers would just get out of the way, and
> allow Unicode-interested Schemes to implement Unicode
> correctly, I would be happy.  My concern is that I want to make
> sure that in the next go-round of the RnRS, it is possible to
> write a Unicode conformant Scheme system.

I think this is essentially my position too.  Requiring a specific
form of Unicode support (or requiring it at all) may be
premature; but at the very least, requirements that militate
*AGAINST* proper Unicode support must be removed.

Making symbols case-sensitive, and not requiring case operations
that work on single characters, is I believe enough to make it
possible for interested implementors to create fully
Unicode-compliant schemata.

The standard may also require case operations that work on
strings without breaking the ability of an implementor to be
fully compliant with both RnRS and Unicode.  But this is
optional, and can be relegated to a library function.

Many implementors may choose to also have case operations that
work (non-compliantly) on single characters, but IMO the standard
must not require them.
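
To illustrate why per-character case operations cannot be
compliant while string operations can, here is a minimal sketch in
Python (not Scheme, and not taken from any SRFI text).  The names
upcase_string and upcase_per_char are hypothetical; the second one
only simulates a char-upcase that must return exactly one
character.

```python
def upcase_string(text: str) -> str:
    """String-level upcase: free to change the length of the string."""
    return text.upper()


def upcase_per_char(text: str) -> str:
    """Simulate a one-character-in, one-character-out char-upcase.

    Hypothetical helper: when the full uppercase mapping of a
    character is longer than one character, a char-level operation
    has nothing valid to return, so the character is left unchanged.
    """
    result = []
    for ch in text:
        upper = ch.upper()
        result.append(upper if len(upper) == 1 else ch)
    return "".join(result)


word = "stra\u00dfe"            # "strasse" spelled with U+00DF (sharp s)
print(upcase_string(word))      # STRASSE (7 characters, one longer)
print(upcase_per_char(word))    # STRA + U+00DF + E (sharp s survives)
```

The string operation can follow the Unicode full case mapping
because it may return a longer string; the per-character one has
no correct answer for U+00DF.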

This is not, however, enough to make it possible for these
schemata to handle Unicode with any kind of uniform semantics.
For example, for a given external representation, I can imagine
different systems reporting any of six different lengths,
depending on whether they do normalization, which normalization
form they choose, and whether they map characters to codepoints
or to grapheme clusters (making normalization moot).
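
A rough sketch of that divergence, in Python and under stated
assumptions: length_reports is a hypothetical helper, the sample
string is made up for illustration, and the grapheme count is only
a crude approximation (it counts codepoints that are not combining
marks, rather than implementing real grapheme cluster
segmentation).

```python
import unicodedata


def length_reports(text: str) -> dict:
    """Return the 'length' of text under several plausible policies."""
    report = {"as written": len(text)}
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        report[form] = len(unicodedata.normalize(form, text))
    # Crude stand-in for grapheme clusters: count only codepoints
    # whose canonical combining class is zero.
    report["approx. graphemes"] = sum(
        1 for ch in text if unicodedata.combining(ch) == 0
    )
    return report


# Precomposed e-acute, then e + combining acute, then the "fi" ligature.
sample = "\u00e9" + "e\u0301" + "\ufb01"

for policy, length in length_reports(sample).items():
    print(f"{policy:>18}: {length}")
```

For this sample the reported lengths are 4 as written, 3 under
NFC, 5 under NFD, 4 under NFKC, 6 under NFKD, and 3 by the
approximate grapheme count: one external representation, several
defensible answers.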

A standard could iron out these issues; but in the absence of
experience and the reports of satisfied or unsatisfied users, it
may actually be better if the standard does not.