Re: the "Unicode Background" section

On Thu, 21 Jul 2005, Thomas Lord wrote:

>You are concerned about sequences containing isolated (unpaired)
>surrogates and their implications for string algebra.  Your
>concerns are entirely reducible to a concern with UTF-16 --
>in all other encodings, there is no ambiguity.

I want to know something: what does a string containing an
unpaired surrogate mean?  What is represented by it?  How
can anything handle it sensibly in rendering or reading or

As far as I can tell, the only use for a string containing
an unpaired surrogate is an abuse, where you're using strings
to store some other kind of data.

So I don't regard it as being at all important, or even
appropriate, to allow unpaired surrogates in strings.
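To make the point concrete, here is a minimal sketch in Python 3, chosen only because its strings are sequences of code points and so can hold a lone surrogate. The helper names has_surrogate and can_encode_utf8 are made up for illustration. It shows that such a string can be built, but that the strict UTF-8 codec refuses to encode it, whereas a string holding a real supplementary character encodes without trouble.

```python
def has_surrogate(s: str) -> bool:
    """Return True if s contains a code point in the surrogate range U+D800..U+DFFF.

    In a sequence of code points (as opposed to UTF-16 code units), any such
    code point is necessarily unpaired.
    """
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def can_encode_utf8(s: str) -> bool:
    """Return True if s can be encoded with the strict UTF-8 codec."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        # The strict codec rejects surrogate code points.
        return False


# A string with an isolated high surrogate in the middle.
lone = "abc\ud800def"

# A string with a genuine supplementary character (U+10000) instead.
supplementary = "abc\U00010000def"

for label, s in (("lone surrogate", lone), ("supplementary", supplementary)):
    print(label, has_surrogate(s), can_encode_utf8(s))
```

A string type that rejected surrogates outright would amount to running a check like has_surrogate at construction time instead of deferring the failure to the encoder.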