Re: the "Unicode Background" section

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

To: Thomas Lord <lord@xxxxxxx>
Subject: Re: the "Unicode Background" section
From: Thomas Bushnell BSG <tb@xxxxxxxxxx>
Date: Thu, 21 Jul 2005 16:10:27 -0700
Cc: srfi-75@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-75@xxxxxxxxxxxxxxxxx
In-reply-to: <1121985934.4501.46.camel@xxxxxxxxxxxxxx> (Thomas Lord's message of "Thu, 21 Jul 2005 15:45:34 -0700")
References: <1121985934.4501.46.camel@xxxxxxxxxxxxxx>
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

Thomas Lord <lord@xxxxxxx> writes:

> The Unicode Background section of the new draft has
>
>   > It is thus appropriate to define Scheme characters as Unicode scalar
>   > values, which includes all code points except those designated as
>   > surrogates.
>
> That seems wrong-headed to me.   Characters should simply
> be codepoints, instead.

A second ago you were saying that we should not be arguing about how
high-level characters are.  I think characters should be graphemes.

> If CHARs are codepoints, more basic Unicode algorithms translate
> into Scheme cleanly.   

Those algorithms all deal with encodings, and should therefore, it
seems to me, be in the interface between arrays-of-integers and
strings.  Strings are not arrays-of-integers!

> If CHARs are codepoints, they have simple algebraic properties
> in relation to integers.

Except characters are not integers.  Scheme is not C.

Thomas
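
For readers following the thread, the distinction under debate can be sketched in a few lines. This is only an illustration, not anything proposed in SRFI 75 or in the message above: the function names are hypothetical, and Python is used purely for convenience. It shows that a surrogate is a code point but not a scalar value, and that decoding belongs at the boundary between an array of integers (here, UTF-16 code units) and a sequence of characters.

```python
# Illustrative sketch only; all names here are hypothetical, not from SRFI 75.

MAX_CODE_POINT = 0x10FFFF
SURROGATE_FIRST = 0xD800   # first high surrogate
SURROGATE_LAST = 0xDFFF    # last low surrogate


def is_code_point(n: int) -> bool:
    """Any integer in the Unicode code space, surrogates included."""
    return 0 <= n <= MAX_CODE_POINT


def is_scalar_value(n: int) -> bool:
    """A code point that is not a surrogate."""
    return is_code_point(n) and not (SURROGATE_FIRST <= n <= SURROGATE_LAST)


def decode_utf16_units(units):
    """Convert a list of UTF-16 code units (plain integers) to scalar values.

    This is the "interface" step: integers go in, character values come out.
    Unpaired surrogates are rejected rather than passed through.
    """
    scalars = []
    i = 0
    while i < len(units):
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_high and has_low:
            # Combine a surrogate pair into one supplementary-plane value.
            scalars.append(0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif is_scalar_value(u):
            scalars.append(u)
            i += 1
        else:
            raise ValueError(f"unpaired surrogate at index {i}: {u:#06x}")
    return scalars
```

Under the "scalar value" definition, 0xD800 could never be a character on its own; under the "code point" definition it could. Neither definition captures graphemes: "e" followed by a combining acute accent (U+0065 U+0301) is two scalar values but a single user-perceived character.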

References:
- the "Unicode Background" section
  - From: Thomas Lord
