
Re: character strings versus byte strings

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

To: bear <bear@xxxxxxxxx>
Subject: Re: character strings versus byte strings
From: tb@xxxxxxxxxx (Thomas Bushnell, BSG)
Date: 22 Dec 2003 19:29:18 -0800
Cc: Matthew Flatt <mflatt@xxxxxxxxxxx>, srfi-50@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-50@xxxxxxxxxxxxxxxxx
In-reply-to: <Pine.LNX.4.58.0312221637510.13108@xxxxxxxxxxxxxx>
References: <20031222141633.829B7828@xxxxxxxxxxxxxxxxx> <87vfo8k3ef.fsf@xxxxxxxxxxxxxxxxx> <Pine.LNX.4.58.0312221637510.13108@xxxxxxxxxxxxxx>
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3

bear <bear@xxxxxxxxx> writes:

> Each character is a unicode codepoint plus a non-defective sequence of
> unicode combining codepoints.  The unicode documentation refers to these
> entities as "graphemes."

I should revise what I said; there may well be a case for Scheme
characters being graphemes instead of codepoints.  I lean toward
codepoints, but I recognize that graphemes are a good contender.

My post was intended to argue against UTF-8; but moving further up the
abstraction ladder than codepoints may well be right.
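
To make the three levels concrete, here is a rough sketch in Python
(not Scheme, and not anything proposed in this SRFI).  It counts the
same text as UTF-8 bytes, as codepoints, and as base-plus-combining
sequences.  The helper name combining_sequences is hypothetical, and
treating "canonical combining class != 0" as the test for a combining
codepoint is a simplifying assumption; it only approximates what the
Unicode documentation calls a grapheme.

```python
import unicodedata


def combining_sequences(text):
    """Group each base codepoint with the combining codepoints after it.

    Assumption: a codepoint is "combining" when its canonical combining
    class is nonzero. Real grapheme segmentation has more rules.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            # Attach the combining mark to the preceding base character.
            clusters[-1] += ch
        else:
            # Start a new sequence with this codepoint as its base.
            clusters.append(ch)
    return clusters


# 'e' followed by U+0301 COMBINING ACUTE ACCENT, then a plain 'a'.
sample = "e\u0301a"

utf8_length = len(sample.encode("utf-8"))   # length in bytes
codepoint_length = len(sample)              # length in codepoints
sequences = combining_sequences(sample)     # base + combining groups

print(utf8_length, codepoint_length, len(sequences))
```

The three counts differ for this sample, which is the whole point:
what "string length" and "character at index k" mean depends on which
rung of the ladder a Scheme character sits on.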

Thomas
