Re: the "Unicode Background" section
On Thu, 21 Jul 2005, Thomas Lord wrote:
>You are concerned about sequences containing isolated (unpaired)
>surrogates and their implications for string algebra. Your
>concerns are entirely reducible to a concern with UTF-16 --
>in all other encodings, there is no ambiguity.
I want to know something: what does a string containing an
unpaired surrogate mean? What is represented by it? How
can anything handle it sensibly in rendering or reading or
writing?
As far as I can tell, the only use for a string containing
an unpaired surrogate is an abuse, where you're using strings
to store some other kind of data.
So I don't regard it as being at all important, or even
appropriate, to allow unpaired surrogates in strings.
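
To make the point concrete, here is a minimal Python 3 sketch. The helper
names is_well_formed and can_encode are invented for this example, and it
assumes CPython's default strict codecs, which refuse to encode surrogate
code points. It only illustrates that a string holding a lone surrogate has
no valid serialization in the standard encoding forms; it is not a proposal
for how any particular string API should behave.

```python
# A lone high surrogate: a code point in U+D800..U+DFFF with no partner.
lone = "\ud800"

# A supplementary character, written as a single code point for contrast.
astral = "\U0001F600"


def is_well_formed(s: str) -> bool:
    """Return True if s contains no surrogate code points at all."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def can_encode(s: str, encoding: str) -> bool:
    """Return True if s can be serialized with the given strict codec."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for name in ("utf-8", "utf-16", "utf-32"):
    print(name, can_encode(lone, name), can_encode(astral, name))
```

Under those assumptions the lone surrogate is rejected by every codec tried,
while the supplementary character goes through in all three.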
Bear