Re: the "Unicode Background" section

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

On Thu, 21 Jul 2005, Thomas Lord wrote:

>You are concerned about sequences containing isolated (unpaired)
>surrogates and their implications for string algebra.  Your
>concerns are entirely reducible to a concern with UTF-16 --
>in all other encodings, there is no ambiguity.

I want to know something: what does a string containing an
unpaired surrogate mean?  What is represented by it?  How
can anything handle it sensibly in rendering or reading or

As far as I can tell, the only use for a string containing
an unpaired surrogate is an abuse, where you're using strings
to store some other kind of data.

So I don't regard it as being at all important, or even
appropriate, to allow unpaired surrogates in strings.
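
To illustrate the reading-and-writing problem, here is a minimal sketch in Python (not Scheme, and not part of the SRFI 75 draft). It assumes a language whose strings are sequences of code points and can therefore hold a lone surrogate. The helper names has_unpaired_surrogate and try_encode_utf8 are made up for this example. It shows that such a string can be built in memory but that a strict UTF-8 encoder refuses to write it out.

```python
def has_unpaired_surrogate(s: str) -> bool:
    """Return True if s contains any code point in U+D800..U+DFFF."""
    # A Python str is a sequence of code points, not UTF-16 code units,
    # so any surrogate code point found here is necessarily unpaired.
    return any(0xD800 <= ord(c) <= 0xDFFF for c in s)


def try_encode_utf8(s: str):
    """Return the UTF-8 bytes for s, or None if s cannot be encoded."""
    try:
        return s.encode("utf-8")
    except UnicodeEncodeError:
        # The strict UTF-8 codec rejects surrogate code points.
        return None


# A string holding a lone high surrogate, and one holding U+10000,
# the first character that UTF-16 represents with a surrogate pair.
lone = "abc\ud800def"
supplementary = "abc\U00010000def"

lone_bytes = try_encode_utf8(lone)                    # None
supplementary_bytes = try_encode_utf8(supplementary)  # valid UTF-8
```

The string with the lone surrogate exists only in memory. It has no strict UTF-8 form, so it cannot be exchanged with anything that expects well-formed Unicode text.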