Re: the discussion so far

At Sat, 16 Jul 2005 15:05:06 +0200, Jorgen Schaefer wrote:
> In contrast, case folding is available for Unicode as a simple
> table which maps codepoints to the case-folded variant. There are
> two tables: The simple case folding maps a single codepoint to a
> single codepoint, while the full case folding table maps a single
> codepoint to one or more codepoints.

Thank you for this clarification (for repeating and expanding it,
actually; I had not yet worked through your earlier message).

So, the `char-ci' operations should use the "simple case folding" table
from CaseFolding.txt, and the `string-ci' operations should use the
"full case folding" table from the same file. After folding, the two
results are compared codepoint by codepoint.
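
To make that concrete, here is a rough Python sketch of the split as I
understand it. The two tables are a tiny hand-copied excerpt of
CaseFolding.txt (the ASCII "C" entries and two "F" entries), and the
function names are made up for illustration; none of this is proposed
API.

```python
# Illustrative excerpt of CaseFolding.txt, not the full data file.
# "C"/"S" entries map one codepoint to one codepoint (simple folding).
SIMPLE_FOLD = {cp: cp + 32 for cp in range(0x41, 0x5B)}  # A-Z -> a-z

# "F" entries map one codepoint to several codepoints (full folding).
FULL_FOLD = {
    0x00DF: (0x0073, 0x0073),  # LATIN SMALL LETTER SHARP S -> "ss"
    0xFB00: (0x0066, 0x0066),  # LATIN SMALL LIGATURE FF -> "ff"
}


def char_fold(c):
    """Simple case folding: always returns exactly one character."""
    return chr(SIMPLE_FOLD.get(ord(c), ord(c)))


def string_fold(s):
    """Full case folding: a character may expand to several."""
    out = []
    for c in s:
        cp = ord(c)
        if cp in FULL_FOLD:
            out.extend(chr(x) for x in FULL_FOLD[cp])
        else:
            out.append(char_fold(c))
    return "".join(out)


def char_ci_eq(a, b):
    """Hypothetical stand-in for a char-ci equality test."""
    return char_fold(a) == char_fold(b)


def string_ci_eq(a, b):
    """Hypothetical stand-in for a string-ci equality test."""
    return string_fold(a) == string_fold(b)
```

With these tables, string_ci_eq("STRASSE", "straße") holds because the
sharp s expands to "ss" under full folding, while char_fold("ß") leaves
the character alone because it has no simple folding entry.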

Meanwhile, `string-upcase' and `string-downcase' get the same kind of
improvement over their character-level counterparts: they apply the
one-to-many mappings in SpecialCasing.txt in addition to the one-to-one
mappings in UnicodeData.txt.
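
The same idea for upcasing, again only as a Python sketch under my own
assumptions: the tables hold a few hand-copied entries (the a-z simple
uppercase mappings from UnicodeData.txt and two unconditional entries
from SpecialCasing.txt), the function names are invented, and the
conditional, context-sensitive entries of SpecialCasing.txt are ignored
entirely.

```python
# Illustrative excerpt only. Simple uppercase mappings (one-to-one),
# as found in UnicodeData.txt.
SIMPLE_UPPER = {cp: cp - 32 for cp in range(0x61, 0x7B)}  # a-z -> A-Z

# Unconditional uppercase mappings from SpecialCasing.txt (one-to-many).
SPECIAL_UPPER = {
    0x00DF: (0x0053, 0x0053),  # LATIN SMALL LETTER SHARP S -> "SS"
    0xFB01: (0x0046, 0x0049),  # LATIN SMALL LIGATURE FI -> "FI"
}


def char_upcase(c):
    """Character level: one character in, one character out."""
    return chr(SIMPLE_UPPER.get(ord(c), ord(c)))


def string_upcase(s):
    """String level: consult the special mappings first."""
    out = []
    for c in s:
        cp = ord(c)
        if cp in SPECIAL_UPPER:
            out.extend(chr(x) for x in SPECIAL_UPPER[cp])
        else:
            out.append(char_upcase(c))
    return "".join(out)
```

Here char_upcase("ß") returns the sharp s unchanged, whereas
string_upcase("straße") yields "STRASSE", so the string can grow.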

Have I got that right?

> Since Unicode support requires such lookup tables for about
> anything - including downcasing -, using the case folding table is
> not much of an extra burden.

Yes, I agree.