Re: the "Unicode Background" section

Thomas Lord scripsit:

> Permitting unpaired surrogates does not damage interoperability
> -- programs need only avoid trying to transmit them on channels
> where strictly well-formed UTF-* is called for.  

In fact, it is not ill-formed to have an unpaired surrogate in *any*
UTF encoding; it's just semantically meaningless.

> In my view, DISPLAY (in R6RS, not forever) should be undefined in that
> case (and in all cases where a string contains a non-8-bit-character) --

There are no such things as "8-bit characters" per se.  There are a variety
of 8-bit encodings that allow up to 256 characters, but they are not the
same characters in all cases.
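
To make that concrete, here is a minimal Python sketch that decodes one and the same byte under three 8-bit encodings. The byte value 0xE9 and the three codecs (latin-1, koi8_r, cp437) are my own picks for illustration; nothing in the thread names them.

```python
# One byte, decoded under three different 8-bit encodings.
# The byte value and the codec names are illustrative assumptions.
raw = bytes([0xE9])

decoded = {name: raw.decode(name) for name in ("latin-1", "koi8_r", "cp437")}

# Print the code point each encoding assigns to the same byte.
for name, ch in decoded.items():
    print(f"{name:8} U+{ord(ch):04X} {ch}")
```

Each codec maps the byte to a different code point, so "the character 0xE9" means nothing until an encoding is named.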

John Cowan    http://www.ccil.org/~cowan   <jcowan@xxxxxxxxxxxxxxxxx>
    "Any legal document draws most of its meaning from context.  A telegram
    that says 'SELL HUNDRED THOUSAND SHARES IBM SHORT' (only 190 bits in
    5-bit Baudot code plus appropriate headers) is as good a legal document
    as any, even sans digital signature." --me