
Re: constant-time access to variable-width encodings

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

To: bear <bear@xxxxxxxxx>
Subject: Re: constant-time access to variable-width encodings
From: Per Bothner <per@xxxxxxxxxxx>
Date: Wed, 13 Jul 2005 17:39:22 -0700
Cc: srfi-75@xxxxxxxxxxxxxxxxx
Delivered-to: srfi-75@xxxxxxxxxxxxxxxxx
In-reply-to: <Pine.LNX.4.58.0507131704430.20391@xxxxxxxxxxxxxx>
References: <42D559A9.1080000@xxxxxxxxxxx> <20050713.101557.865676685.shiro@xxxxxxxx> <42D57B2F.1060608@xxxxxxxxxxx> <Pine.LNX.4.58.0507131704430.20391@xxxxxxxxxxxxxx>
User-agent: Mozilla Thunderbird 1.0.2-6 (X11/20050513)

bear wrote:

Aaaand, this is yet another problem that goes away if you embrace
glyph=character instead of codepoint=character.

Huh? A glyph depends on a specific font. No way can we define Scheme characters in terms of glyphs.

Do you mean a (canonicalized) composite (combining) sequence? One problem is that you can't practically map one of those to a fixed-length integer value, so we would have to give up char->integer and integer->char. Also, if equal characters are to be eq?, they would have to be interned, like strings. Both of these changes are possible, but they would be a rather radical (and unneeded) departure from current practice.
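
To illustrate the fixed-length point, here is a minimal Java sketch (Java only because it comes up later in this message; the class and method names are made up for the example). It counts the code points left after canonical composition. Some combining sequences collapse to one code point, but others have no precomposed form, so no single code point value can stand for them.

```java
import java.text.Normalizer;

class CombiningSequenceDemo {
    // Number of code points remaining after canonical composition (NFC).
    static int nfcCodePointCount(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        return nfc.codePointCount(0, nfc.length());
    }

    public static void main(String[] args) {
        // 'e' + COMBINING ACUTE ACCENT composes to the single code point U+00E9.
        System.out.println(nfcCodePointCount("e\u0301"));        // prints 1
        // 'q' + COMBINING DOT BELOW + COMBINING DOT ABOVE has no precomposed form.
        System.out.println(nfcCodePointCount("q\u0323\u0307"));  // prints 3
    }
}
```

A char->integer for such "characters" would need either an unbounded integer encoding or an interning table, which is the departure from current practice described above.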

With Unicode,
you *CANNOT* make assumptions about how strings are represented.
Two strings which are "equal" under unicode's required
equivalence predicates may be of different lengths and have not a
single codepoint in common, and the differences are purely
representation artifacts.

Nonetheless, Java defines String's equals method in terms of codepoint equality, and Java programmers manage to get useful work done.
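
A small sketch of the difference, assuming the standard java.text.Normalizer class (the demo class and helper names are hypothetical). The two strings below are canonically equivalent, differ in length, and share no code point, yet equals reports them as different. Anyone who wants equivalence has to normalize explicitly.

```java
import java.text.Normalizer;

class CanonicalEquivalenceDemo {
    // Plain String.equals: compares the stored sequences element by element.
    static boolean sameCodePoints(String a, String b) {
        return a.equals(b);
    }

    // Canonical equivalence: normalize both sides to NFC, then compare.
    static boolean canonicallyEquivalent(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        String precomposed = "\u00E9";  // U+00E9, e with acute as one code point
        String decomposed = "e\u0301";  // U+0065 followed by U+0301 combining acute

        System.out.println(sameCodePoints(precomposed, decomposed));        // prints false
        System.out.println(canonicallyEquivalent(precomposed, decomposed)); // prints true
    }
}
```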

--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/
