Re: the "Unicode Background" section



Thomas Lord <lord@xxxxxxx> writes:

> The Unicode Background section of the new draft has
>
>   > It is thus appropriate to define Scheme characters as Unicode scalar
>   > values, which includes all code points except those designated as
>   > surrogates.
>
> That seems wrong-headed to me.   Characters should simply
> be codepoints, instead.

A second ago you were saying that we should not be arguing about how
high-level characters are.  I think characters should be graphemes.
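
For readers comparing the three candidate levels (code point, scalar
value, grapheme), here is a minimal Python sketch. It is an illustration
only, not anything from the draft; the function names is_surrogate and
is_scalar_value are made up for this example.

```python
import unicodedata

def is_surrogate(cp):
    # Surrogate code points occupy the range U+D800..U+DFFF.
    return 0xD800 <= cp <= 0xDFFF

def is_scalar_value(cp):
    # A Unicode scalar value is any code point that is not a surrogate.
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)

# One user-perceived character (a grapheme) can span several code points:
# "e" followed by U+0301 COMBINING ACUTE ACCENT is two code points.
decomposed = "e\u0301"
# NFC normalization composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)
```

Here decomposed and composed stand for the same grapheme but have
different lengths when counted in code points, which is the gap between
the codepoint view and the grapheme view.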

> If CHARs are codepoints, more basic Unicode algorithms translate
> into Scheme cleanly.   

Those algorithms all deal with encodings, and should therefore, it
seems to me, be in the interface between arrays-of-integers and
strings.  Strings are not arrays-of-integers!
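
A rough sketch of what such an interface could look like, written in
Python purely for illustration. The helper names string_to_utf8 and
utf8_to_string are hypothetical, and UTF-8 is just one possible choice
of encoding.

```python
def string_to_utf8(s):
    # Cross the boundary from a string to an array of integers
    # (UTF-8 code units, each in the range 0..255).
    return list(s.encode("utf-8"))

def utf8_to_string(units):
    # Cross back: interpret an array of integers as UTF-8 and
    # produce a string, raising an error on malformed input.
    return bytes(units).decode("utf-8")

# U+00E9 becomes two code units in UTF-8.
units = string_to_utf8("\u00e9")
round_trip = utf8_to_string(units)
```

In this arrangement the encoding details live entirely in the two
conversion functions, and code that works on strings never sees the
integers.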

> If CHARs are codepoints, they have simple algebraic properties
> in relation to integers.

Except characters are not integers.  Scheme is not C.
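
To show the distinction in running code, here is a small Python sketch;
Python is used only as a stand-in, and shift_char is a made-up name.
Scheme draws the same line with char->integer and integer->char: any
arithmetic has to go through an explicit conversion.

```python
def shift_char(ch, offset):
    # Convert to an integer explicitly, do the arithmetic there,
    # then convert the result back to a character.
    return chr(ord(ch) + offset)

def add_directly(ch, offset):
    # Mixing a character and an integer without converting is a
    # type error, not silent integer arithmetic as in C.
    try:
        return ch + offset
    except TypeError:
        return None

shifted = shift_char("a", 1)
direct = add_directly("a", 1)
```

Here shifted is "b", while direct is None because the unconverted
addition is rejected.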

Thomas