Re: Issues with Unicode



From: "Taylor R. Campbell" <campbell@xxxxxxxxxx>
Subject: Re: Issues with Unicode
Date: Wed, 26 Apr 2006 07:48:05 +0000

> I can't recall whether this ever came up here, but last year, when
> this SRFI was still fresh and under heavy discussion, I wrote up an
> alternative proposal for a Unicode-supporting -- although *not*
> Unicode-mandating -- string API, where strings are collections of
> grapheme clusters indexed by opaque cursors, not character indices,
> and whose binary encoding is separated into BLOB->STRING and
> STRING->BLOB[!] procedures and abstracted by text codec descriptors.
> The text of the document is here:
> 
>   <http://mumble.net/~campbell/proposals/alt-text.text>.
[...]
> There are, of course, still some problems with it.  I couldn't think
> of a good literal syntax, for instance.  However, I think the basic
> idea of the proposal is a considerable improvement over the current,
> historically motivated, mutable character vector model of strings.

I like it, although I see that some of the details need to be worked out.

>    Some of the fancier implementations might not go well with
>    preemptive multithreads; if mutation of string touches more
>    than one place of the string objects, it creates a hazard.
> 
> While I agree that strings ought to be immutable, as you recommended
> afterward, I don't think this is really a very good reason: I can't
> imagine why anyone would *want* to share a mutable string between
> threads badly enough for synchronization to be the default.

I mentioned this as one more point in favor of immutable
strings.  I can't imagine correct programs sharing mutable strings
between threads, either.  However, to make sure that buggy or
sloppy programs don't break the internals of your implementation,
you have to set up this kind of safety net.
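
To make the hazard concrete, here is a minimal sketch.  It is in
Python rather than Scheme, and it is not taken from any real
implementation: MutableText, set_text_steps, ImmutableText and
make_text are hypothetical names, and the "yield" only stands in for
a point where a preemptive scheduler might switch threads.  It assumes
a string object that keeps an encoded buffer plus a cached character
count, so that one mutation has to touch two places.

```python
from typing import NamedTuple


class MutableText:
    """Hypothetical string object: UTF-8 bytes plus a cached character count."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self.length = len(text)

    def set_text_steps(self, text):
        """Mutate in two separate writes, pausing between them.

        The yield marks a spot where another thread could be scheduled.
        """
        self.data = text.encode("utf-8")  # write 1: replace the buffer
        yield
        self.length = len(text)           # write 2: update the cached count

    def is_consistent(self):
        # The cached count must match the number of decoded characters.
        return len(self.data.decode("utf-8")) == self.length


class ImmutableText(NamedTuple):
    """Immutable variant: both fields are fixed at construction time."""
    data: bytes
    length: int


def make_text(text):
    # Build the complete value first; nobody can observe it half-updated.
    return ImmutableText(text.encode("utf-8"), len(text))


s = MutableText("abc")
steps = s.set_text_steps("naïve")
next(steps)                    # perform write 1, then "get preempted"
torn = s.is_consistent()       # what another thread could observe here
for _ in steps:                # resume and perform write 2
    pass
repaired = s.is_consistent()
```

Between the two writes the object holds the new buffer with the old
count, which is the state a sloppy program on another thread could
see.  With the immutable variant there is no such window, because a
new value is built in full before anyone holds a reference to it.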

I don't know how much the R6RS committee wants to change the string
API, but I really wish R6RS strings were immutable.

--shiro