Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.

To: "Bradd W. Szonye" <bradd+srfi@xxxxxxxxxx>, srfi-52@xxxxxxxxxxxxxxxxx
Subject: Re: Encodings.
From: Ken Dickey <Ken.Dickey@xxxxxxxxxxxxxx>
Date: Fri, 13 Feb 2004 13:56:24 +0100
Delivered-to: srfi-52@xxxxxxxxxxxxxxxxx
In-reply-to: <20040213180324.GB16778@xxxxxxxxxxxxxxx>
Organization: BitWize Consulting
References: <200402102106.NAA13325@xxxxxxxxxxxxxxxxxxxxxxx> <200402130751.49528.Ken.Dickey@xxxxxxxxxxxxxx> <20040213180324.GB16778@xxxxxxxxxxxxxxx>
User-agent: KMail/1.5.4

On Friday 13 February 2004 07:03 pm, Bradd W. Szonye wrote:
> On Fri, Feb 13, 2004 at 07:51:49AM +0100, Ken Dickey wrote:
> > Let's say that there is a Scheme SRFI (or even, *GASP*, a standard)
> > which picks a single canonical Unicode form (say the most compact
> > one) and requires, where Unicode is used, that Scheme programs be
> > prepared in that format ....
>
> Such a program would not conform to the Unicode standard:

Who cares?  Scheme does not conform to ASCII or EBCDIC.  Why should Scheme 
conform to the Unicode Standard(s)?  Defining what is an acceptable Scheme 
program should be sufficient.

It is desirable that a Scheme with support for extended identifiers should not
be large or expensive to implement.  I have suggested a solution that achieves
this: allow implementations to specify and restrict what source text is
allowable for Scheme programs.  Scheme source could be in ASCII, ISO Latin-1,
or (pre-canonicalized) Unicode (perhaps UCS-2).
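
As a rough illustration of what "pre-canonicalized" source could mean in
practice, here is a minimal sketch in Python (not Scheme, and not part of any
SRFI).  It assumes the single required form is NFC, which is only one
possible choice, and the function name check_source_is_nfc is hypothetical.
The reader does no normalization itself; it merely refuses text that is not
already in the required form.

```python
import unicodedata

def check_source_is_nfc(source_text: str) -> None:
    """Hypothetical reader pre-check: accept only text already in NFC.

    Assumption: NFC is the single canonical form the implementation
    requires. The check rejects other text instead of converting it.
    """
    if unicodedata.normalize("NFC", source_text) != source_text:
        raise ValueError("source text is not in NFC; normalize it before loading")

# Precomposed e-acute (U+00E9) is already in NFC, so this is accepted.
check_source_is_nfc("(define caf\u00e9 1)")
```

The point of the design is that normalization happens once, in whatever tool
prepares the source, so the reader only needs the comparison above.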

>     C9. A process shall not assume that the interpretations of two
>         canonical-equivalent character sequences are distinct.
>
> This section goes on to concede that
>
>     Ideally, an implementation would always interpret two
>     canonical-equivalent character sequences identically. There are
>     practical circumstances under which implementations may reasonably
>     distinguish them.

Scheme does not IMPLEMENT Unicode.  Support for processing Unicode data is a 
good idea.   Does every Scheme implementation come with its own conforming 
Unicode source editor?  Isn't this asking a bit much!?!  8^)
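
For readers unfamiliar with the term, the following Python sketch shows what
two canonical-equivalent sequences look like and why a reader that compares
raw code points would treat them as distinct identifiers.  The helper name
same_identifier is hypothetical, and comparing via NFC is just one way to
test canonical equivalence.

```python
import unicodedata

composed = "caf\u00e9"     # e-acute as the single code point U+00E9
decomposed = "cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

def same_identifier(a: str, b: str) -> bool:
    """Hypothetical comparison that treats canonical equivalents as equal."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A raw code-point comparison sees two different strings ...
raw_equal = composed == decomposed
# ... while a normalizing comparison sees one identifier.
normalized_equal = same_identifier(composed, decomposed)
```

If source text is required to arrive in one form, the raw comparison is
enough and the normalizing comparison never has to run inside the reader.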

> In other words, recognizing canonically-equivalent characters *is* the
> responsibility of the reader, if it claims to implement the Unicode
> character set. 

I still fail to see why one would wish to make such a claim.  I have not yet 
seen a convincing case made for making Scheme "a conforming Unicode 
implementation".  Convince me!

$0.02,
-KenD

Follow-Ups:
- Re: Encodings.
  - From: Bradd W. Szonye

References:
- terminology
  - From: Tom Lord
- Re: Encodings.
  - From: Ken Dickey
- Re: Encodings.
  - From: Bradd W. Szonye
