
Re: Unicode surrogates

This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.

bear writes:
> That doesn't matter, really.  The fact that it's in violation of
> the unicode standard does not make it cease to exist or solve the
> problem it creates.

True enough.

> To put it another way, Windows allows characters that are not part
> of Unicode to be used to name files.  If we restrict our character
> set for filenames to Unicode-only, we will not be able to open
> those files.  That problem is real.

The problem is real, but how often does it happen? The question is
whether the character representation for the language should be
dictated by the broken behavior of a particular operating system,
regardless of how ubiquitous that OS is.

To my mind an unpaired surrogate used in a file name is an
exception. As long as a method exists to specify the name explicitly,
this can be handled.
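
As a rough illustration of treating such a name as an exception, here
is a minimal Python sketch. The file name and both helper functions
are hypothetical, made up for this example; they are not part of any
SRFI or of the Windows API. It shows that a string holding a lone
surrogate is not valid Unicode text, yet its 16-bit code units can
still be spelled out explicitly.

```python
# Hypothetical file name containing a lone high surrogate (U+D800).
name = "bad\ud800name.txt"


def is_unicode_text(s):
    """Return True if s holds only Unicode scalar values (no surrogates)."""
    return not any(0xD800 <= ord(ch) <= 0xDFFF for ch in s)


def to_utf16_units(s):
    """Return the 16-bit code units of s, passing lone surrogates through."""
    raw = s.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def strict_utf8_fails(s):
    """Return True if a strict UTF-8 encoder rejects s."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False
```

Under these assumptions the common case stays pure Unicode, and only
the rare ill-formed name needs the explicit code-unit form.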

> Hmmm.... can we use read-byte and write-byte to read and write
> filenames?

I doubt it. I think the pathname type in PLT and Common Lisp may be
the way to go to handle these cases.

Tom Emerson                                          Basis Technology Corp.
Software Architect                                 http://www.basistech.com
 "You can't fake quality any more than you can fake a good meal." (W.S.B.)