
Re: character strings versus byte strings

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

To: Tom Lord <lord@xxxxxxx>
Subject: Re: character strings versus byte strings
From: bear <bear@xxxxxxxxx>
Date: Mon, 22 Dec 2003 17:29:02 -0800 (PST)
Cc: tb@xxxxxxxxxx, mflatt@xxxxxxxxxxx, srfi-50@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-50@xxxxxxxxxxxxxxxxx
In-reply-to: <200312222259.OAA07049@xxxxxxxxxxxxxxxxxxxxxxx>
References: <20031222141633.829B7828@xxxxxxxxxxxxxxxxx> <87vfo8k3ef.fsf@xxxxxxxxxxxxxxxxx> <200312222230.OAA06693@xxxxxxxxxxxxxxxxxxxxxxx> <87hdzsjxxg.fsf@xxxxxxxxxxxxxxxxx> <200312222259.OAA07049@xxxxxxxxxxxxxxxxxxxxxxx>

On Mon, 22 Dec 2003, Tom Lord wrote:

>    > Many many many computer systems could get away with
>    > ignoring the locale-dependency of case-mapping, but now they can
>    > no longer plead ignorance.  (Though the problems are hardly
>    > obscure; even German causes problems.)
>
>(I think that, being a culturally unbiased person, you mean that
>German causes one _unique_ problem regarding case mapping.)

This is absolutely the case.  From the perspective of
grapheme-characters, and ignoring ligatures as a pure typesetting
issue, eszett is the ONLY character in all of Unicode that upcases
into a different number of characters.  I'm using an ugly kluge:
either put off changing the length of any string until a
canonicalization operation, or return the upcase as a single
non-standard character (yet another character which doesn't exist
in Unicode).  But I'm sorely tempted to simply declare all use of
eszett, given its unique status in the history of human writing,
to be an error.
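
To make the length problem concrete, here is a rough Python sketch of
the two behaviours described above.  It is only an illustration under
stated assumptions, not the code in question: the function names and
the private-use placeholder U+E000 are hypothetical, and only eszett
is special-cased; every other character just goes through Python's
own str.upper().

```python
ESZETT = "\u00DF"  # LATIN SMALL LETTER SHARP S

# Hypothetical stand-in for a single "uppercase eszett" character.
# A private-use code point is an assumption made for this sketch only.
UPPER_ESZETT_PLACEHOLDER = "\uE000"


def upcase_deferred(s: str) -> str:
    """Upcase character by character, mapping eszett to the placeholder
    so that eszett itself never lengthens the string."""
    return "".join(
        UPPER_ESZETT_PLACEHOLDER if ch == ESZETT else ch.upper()
        for ch in s
    )


def canonicalize(s: str) -> str:
    """Expand each placeholder to "SS"; this is the step where the
    string length is finally allowed to change."""
    return s.replace(UPPER_ESZETT_PLACEHOLDER, "SS")


word = "Stra\u00DFe"                 # "Strasse" spelled with eszett
plain = word.upper()                 # full case mapping expands eszett
deferred = upcase_deferred(word)     # same length as the input
final = canonicalize(deferred)       # expanded only at canonicalization

print(len(word), len(plain), len(deferred), len(final))
```

The point of the split is that code indexing into the string between
the upcase and the canonicalization still sees the original offsets.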

				Bear

