Re: Surrogates and character representation

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

To: tree@xxxxxxxxxxxxx
Subject: Re: Surrogates and character representation
From: Alex Shinn <alexshinn@xxxxxxxxx>
Date: Thu, 28 Jul 2005 12:16:09 +0900
Cc: srfi-75@xxxxxxxxxxxxxxxxx
Reply-to: Alex Shinn <alexshinn@xxxxxxxxx>

On 7/28/05, Tom Emerson <tree@xxxxxxxxxxxxx> wrote:
> 
> I'm not missing his point, actually. The stand-off markup may be
> generated by someone else, say the data provider (in the case of data
> acquired from the LDC or ELDA) and hence I do not have any Scheme
> serialized data, rather character offsets into a UTF-8 scheme.

Does either of those actually supply UTF-32 files along with data
files holding codepoint offsets?  UTF-8 is by far the most common
storage format for Unicode, and required by most network protocols.

Regardless, this has nothing to do with strings.  This involves
seeking to a byte position in a file, and extracting (and optionally
converting to the internal encoding) a chunk of text.
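
To make that concrete, here is a rough sketch in Python (not Scheme, and
not anything from the SRFI). The names read_span and
byte_offset_of_codepoint are made up for illustration, and it assumes
the provider's offsets either are byte offsets already or are code point
offsets that can be translated by one linear scan of the UTF-8 data.

```python
import io

def byte_offset_of_codepoint(data, cp_index):
    """Translate a code point index into a byte offset within UTF-8 bytes.

    Hypothetical helper: walks the data once and counts lead bytes,
    i.e. every byte that is not a continuation byte (10xxxxxx).
    """
    seen = 0
    for i, b in enumerate(data):
        if (b & 0xC0) != 0x80:      # this byte starts a new code point
            if seen == cp_index:
                return i
            seen += 1
    if seen == cp_index:            # index one past the last code point
        return len(data)
    raise IndexError("code point index out of range")

def read_span(stream, byte_start, byte_end, encoding="utf-8"):
    """Seek to a byte position in a binary stream and decode a chunk."""
    stream.seek(byte_start)
    raw = stream.read(byte_end - byte_start)
    return raw.decode(encoding)     # conversion to the internal encoding

# Usage: code points 2..3 of the text, located by byte position.
data = "a\u00e9\U0001D11Ez".encode("utf-8")   # 1 + 2 + 4 + 1 bytes
start = byte_offset_of_codepoint(data, 2)
end = byte_offset_of_codepoint(data, 3)
chunk = read_span(io.BytesIO(data), start, end)
```

Nothing here depends on how the language represents strings
internally; the string only comes into existence at the decode step.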

-- 
Alex
