Re: character strings versus byte strings

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

To: Tom Lord <lord@xxxxxxx>
Subject: Re: character strings versus byte strings
From: tb@xxxxxxxxxx (Thomas Bushnell, BSG)
Date: 22 Dec 2003 14:41:37 -0800
Cc: mflatt@xxxxxxxxxxx, srfi-50@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-50@xxxxxxxxxxxxxxxxx
In-reply-to: <200312222259.OAA07049@xxxxxxxxxxxxxxxxxxxxxxx>
References: <20031222141633.829B7828@xxxxxxxxxxxxxxxxx> <87vfo8k3ef.fsf@xxxxxxxxxxxxxxxxx> <200312222230.OAA06693@xxxxxxxxxxxxxxxxxxxxxxx> <87hdzsjxxg.fsf@xxxxxxxxxxxxxxxxx> <200312222259.OAA07049@xxxxxxxxxxxxxxxxxxxxxxx>
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3

Tom Lord <lord@xxxxxxx> writes:

>     > Many many many computer systems could get away with
>     > ignoring the locale-dependency of case-mapping, but now they can
>     > no longer plead ignorance.  (Though the problems are hardly
>     > obscure; even German causes problems.)
> 
> (I think that, being a culturally unbiased person, you mean that
> German causes one _unique_ problem regarding case mapping.)

The problem in German that I'm thinking of is the eszett problem, where
there is a lowercase letter (ß) whose uppercase is a two-letter combo
(SS).  (And downcasing SS requires morphological understanding of the
word as well, because not all SS pairs should be downcased as an
eszett, IIUC.)

That's a way in which German causes problems for easy case mapping.
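
As an illustration of the eszett point, here is a small sketch using
Python's built-in str.upper() and str.lower(), which apply Unicode's
locale-independent case mappings.  The variable names are only for
illustration, and the choice of Python is an assumption; the thread
itself is about Scheme.

```python
# Uppercasing a word that contains an eszett changes its length,
# so case mapping cannot be done character by character.
lower_word = "straße"
upper_word = lower_word.upper()    # the single ß becomes the pair SS
round_trip = upper_word.lower()    # SS comes back as ss, not ß

length_changed = len(upper_word) != len(lower_word)
lost_eszett = round_trip != lower_word

# Downcasing is ambiguous without knowing the word: "MASSE" may be
# "Masse" (mass) or "Maße" (measures).  A plain case mapping always
# produces "masse".
ambiguous = "MASSE".lower()

print(upper_word, round_trip, length_changed, lost_eszett, ambiguous)
```

The round trip does not restore the original spelling, which is why
downcasing would need knowledge of the word rather than a table
lookup.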

The situation with the two Turkish I's is different, and more
symmetrical, and it would be wrong to characterize that as "Turkish
causing a problem".  But I think my characterization of the situation
with German stands.  That is, dealing with Turkish is no harder than
dealing with English--it's just hard to deal with both at once.
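
A rough sketch of why the Turkish case is symmetrical: with a
Turkish-specific table, each of the four letters maps to exactly one
counterpart, just as in English.  The helpers tr_upper and tr_lower
and their tables are hypothetical, written here for illustration;
Python's str methods themselves take no locale argument.

```python
# Hypothetical Turkish tables: dotted i pairs with dotted İ (U+0130),
# and dotless ı (U+0131) pairs with dotless I.
TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})
TR_LOWER = str.maketrans({"I": "ı", "İ": "i"})

def tr_upper(s: str) -> str:
    """Uppercase with Turkish i handling, then the default mapping."""
    return s.translate(TR_UPPER).upper()

def tr_lower(s: str) -> str:
    """Lowercase with Turkish I handling, then the default mapping."""
    return s.translate(TR_LOWER).lower()

turkish_up = tr_upper("ılık")       # dotless ı goes to dotless I
turkish_down = tr_lower(turkish_up) # and comes back unchanged
default_down = "ILIK".lower()       # the default mapping gives dotted i

print(turkish_up, turkish_down, default_down)
```

Either table alone is a simple one-to-one mapping; the difficulty only
appears when one program has to pick between the two for the same
letter I.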

Dealing with German properly is hard all by itself.

Thomas
