
Re: Unicode surrogates



Tom Emerson scripsit:

> > For example, I can create a file called "\uD802.ss" in Windows.  How
> > would I be able to open this file in Scheme with the given proposal?
> 
> Well, U+D802 is invalid, since it must be paired.

It is indeed invalid Unicode.  Unfortunately, Win32 filenames are not Unicode
strings; they are vectors of almost-arbitrary 16-bit values (certain values
are prohibited).  Similarly, POSIX filenames are not strings either; they
are vectors of almost-arbitrary 8-bit values.
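
To make the distinction concrete, here is a rough sketch in Python (chosen
only for illustration; it is not part of the proposal under discussion).
The helper name is_valid_unicode is made up for this example.  It shows a
lone surrogate surviving as 16-bit code units, and an arbitrary byte
filename being carried through a string type via Python's "surrogateescape"
error handler, while neither result is valid Unicode.

```python
def is_valid_unicode(s: str) -> bool:
    """Hypothetical helper: True if s contains no unpaired surrogates,
    i.e. it can be encoded as strict UTF-8."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# The "\uD802.ss" case: a lone high surrogate.  It is representable as a
# sequence of 16-bit code units, but it is not a valid Unicode string.
win_name = "\ud802.ss"
win_units = win_name.encode("utf-16-le", "surrogatepass")

# An 8-bit filename that is not valid UTF-8.  "surrogateescape" maps each
# undecodable byte to a lone low surrogate so the name round-trips.
posix_name = b"\xff.ss"
as_str = posix_name.decode("utf-8", "surrogateescape")
round_trip = as_str.encode("utf-8", "surrogateescape")

print(is_valid_unicode(win_name))   # False
print(is_valid_unicode(as_str))     # False
print(round_trip == posix_name)     # True
```

Smuggling the odd values through the string type in this way is one
possible way of swallowing the deficiencies; it keeps the string interface
at the cost of admitting strings that are not well-formed Unicode.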

Vectors, though, are not a sensible interface to file systems; filenames are
thought of as strings, accessed as strings, and almost always do correspond
to strings.  The occasional deficiencies in this model just have to be
swallowed.

-- 
The man that wanders far                        cowan@xxxxxxxx
from the walking tree                           http://www.ap.org
        --first line of a non-existent poem by:         John Cowan