
case mappings



I agree with Bear that case-mappings are poorly defined on single
codepoints.

Michael Sperber wrote:
> I don't quite understand what you're saying: the locale-independent
> case mappings in UnicodeData.txt always map a single scalar value to a
> single scalar value.  Sure it doesn't always do what your locale
> thinks (as you point out), but this case mapping doesn't require
> "multi-codepoint characters."

This isn't just a "locale-awareness" problem.  True, the mappings in
UnicodeData.txt are, for simplicity, only the 1-1 mappings, but
SpecialCasing.txt includes a large number of mappings that aren't 1-1
regardless of locale.  The Unicode concept of locale-independent
case-mapping includes these special cases.  Without handling them,
R6RS would be using an incomplete case-mapping rule, one that is
not usable in the general case.  I don't think anyone wants 90%
compatibility thrown into the core language.
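
As a rough illustration of the non-1-1 cases (in Python rather than
Scheme, only because its built-in str.upper() and str.lower() are
documented to follow the Unicode full case-mapping algorithm, with no
locale argument), the sketch below shows single code points whose
mapped form is more than one code point.  The helper name
full_upper_length is hypothetical and exists only for this example.

```python
# Illustration only: Python's str.upper()/str.lower() take no locale
# argument and apply full (not just 1-1) case mappings.

def full_upper_length(ch: str) -> int:
    """Return how many code points the uppercase form of `ch` has."""
    return len(ch.upper())

sharp_s = "\u00df"      # LATIN SMALL LETTER SHARP S
fi_ligature = "\ufb01"  # LATIN SMALL LIGATURE FI
dotted_i = "\u0130"     # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Each input is a single code point; each result is two code points.
print(repr(sharp_s.upper()))       # 'SS'
print(repr(fi_ligature.upper()))   # 'FI'
print(len(dotted_i.lower()))       # 2 (i followed by U+0307)
```

A char-to-char procedure has no correct answer to return for any of
these inputs, which is the sense in which the single-codepoint
mapping is incomplete.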

The proper definition is complicated and slow, yet many computer
languages and protocols need only strict ASCII case mapping, so I
think it makes sense to define the core case-mapping procedures as
ASCII-specific.  Full linguistic case handling should be provided by
specialized library procedures that optionally accept a locale and
work only at the string level, since single-char case mappings are
ill-defined.
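
A minimal sketch of what "ASCII-specific" could mean in practice,
again in Python for illustration.  The names ascii_upcase,
ascii_downcase and ascii_ci_equal are hypothetical, not proposed
R6RS names; the assumption is simply that only a-z and A-Z are ever
mapped and every other code point passes through unchanged.

```python
# Sketch of ASCII-only case mapping: always 1-1, locale-free, and
# defined per character.  Non-ASCII code points are left untouched.

def ascii_upcase(s: str) -> str:
    """Map a-z to A-Z; leave every other code point as it is."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)

def ascii_downcase(s: str) -> str:
    """Map A-Z to a-z; leave every other code point as it is."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)

def ascii_ci_equal(a: str, b: str) -> bool:
    """Case-insensitive comparison in the ASCII sense only."""
    return ascii_downcase(a) == ascii_downcase(b)

print(ascii_upcase("content-type"))                    # CONTENT-TYPE
print(ascii_ci_equal("Content-Type", "content-type"))  # True
print(ascii_upcase("stra\u00dfe") == "STRA\u00dfE")    # True
```

This is the kind of comparison protocol header names and language
keywords typically need, and it never changes the length of a string.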

char-title-case? would then no longer be needed.

-- 
Alex