Re: the "Unicode Background" section
Thomas Lord scripsit:
> Permitting unpaired surrogates does not damage interoperability
> -- programs need only avoid trying to transmit them on channels
> where strictly well-formed UTF-* is called for.
In fact, it is not ill-formed to have an unpaired surrogate in *any*
UTF encoding; it's just semantically meaningless.
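
To make the "channel" point concrete, here is a minimal sketch of how one implementation treats a lone surrogate. It assumes Python 3 and its built-in codecs; the helper name `try_strict` is hypothetical, and this illustrates only that implementation's behavior, not what any standard or the R6RS draft requires. A string may hold the unpaired surrogate in memory; the default UTF-8 encoder refuses to emit it, while the `surrogatepass` error handler writes it out as a three-byte sequence.

```python
# A string holding one unpaired (high) surrogate, U+D800.
lone = "\ud800"


def try_strict(s):
    """Hypothetical helper: encode with the default (strict) UTF-8 codec.

    Returns the encoded bytes, or None if the codec rejects the string.
    """
    try:
        return s.encode("utf-8")
    except UnicodeEncodeError:
        return None


# Python's strict UTF-8 encoder refuses the lone surrogate.
strict_result = try_strict(lone)

# The "surrogatepass" error handler encodes it anyway as three bytes.
relaxed = lone.encode("utf-8", "surrogatepass")

# Decoding with the same handler recovers the original string.
round_trip = relaxed.decode("utf-8", "surrogatepass")

print(strict_result)       # None
print(relaxed)             # b'\xed\xa0\x80'
print(round_trip == lone)  # True
```

A program in this style can carry the surrogate internally and decide only at the output boundary whether a given channel may receive it.
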
> In my view, DISPLAY (in R6RS, not forever) should be undefined in that
> case (and in all cases where a string contains a non-8-bit-character) --
There are no such things as "8-bit characters" per se. There are a variety
of 8-bit encodings that allow up to 256 characters, but they are not the
same characters in all cases.
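
A small sketch of that point, assuming Python 3 and three of its built-in 8-bit codecs (the choice of byte and of codecs is mine, purely for illustration): one and the same byte value decodes to a different character under each encoding.

```python
# One byte, 0xA4, interpreted under three different 8-bit encodings.
raw = bytes([0xA4])

decoded = {
    name: raw.decode(name)
    for name in ("latin-1", "iso8859-15", "cp437")
}

for name, char in decoded.items():
    # Show the encoding, the code point it maps 0xA4 to, and the character.
    print(f"{name:12} U+{ord(char):04X} {char}")

# latin-1      U+00A4 ¤
# iso8859-15   U+20AC €
# cp437        U+00F1 ñ
```
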
--
John Cowan http://www.ccil.org/~cowan <jcowan@xxxxxxxxxxxxxxxxx>
"Any legal document draws most of its meaning from context. A telegram
that says 'SELL HUNDRED THOUSAND SHARES IBM SHORT' (only 190 bits in
5-bit Baudot code plus appropriate headers) is as good a legal document
as any, even sans digital signature." --me