Re: Unicode surrogates



At Mon, 13 Mar 2006 08:55:49 -0500, John Cowan wrote:
> It is indeed invalid Unicode.  Unfortunately, Win32 filenames are not Unicode
> strings; they are vectors of almost-arbitrary 16-bit values (certain values
> are prohibited).  Similarly, Posix filenames are not strings either; they
> are vectors of almost-arbitrary 8-bit values.
> 
> Vectors, though, are not a sensible interface to file systems; filenames are
> thought of as strings, accessed as strings, and almost always do correspond
> to strings.   The occasional deficiencies in this model just have to be
> swallowed.

PLT Scheme uses a `path' type, distinct from the string type (but
nearly always convertible to and from it), to deal with this problem
(and to help support platform-independent path construction). I'm not
sure of the scope of this SRFI, but it may be worth having a look to
see how we dealt with this exact problem:

http://download.plt-scheme.org/doc/301/html/mzscheme/mzscheme-Z-H-11.html#node_sec_11.3
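
To make the idea concrete, here is a rough sketch of a separate path type, written in Python rather than Scheme. It is not PLT Scheme's actual API: the `Path` class and its method names are hypothetical, and it assumes a Posix-style file system where names are raw bytes and UTF-8 is the string encoding. The point is only that the raw bytes are kept as they are, and conversion to a string is a separate step that may be lossy.

```python
class Path:
    """Hypothetical path type: holds the raw bytes of a file name."""

    def __init__(self, raw: bytes):
        self.raw = raw

    @classmethod
    def from_string(cls, s: str) -> "Path":
        # Assumption: strings map to paths via UTF-8.
        return cls(s.encode("utf-8"))

    def is_string_like(self) -> bool:
        # True when the bytes decode cleanly, i.e. the usual case.
        try:
            self.raw.decode("utf-8")
            return True
        except UnicodeDecodeError:
            return False

    def to_string(self) -> str:
        # Lossy for odd names: undecodable bytes become U+FFFD.
        return self.raw.decode("utf-8", errors="replace")


ordinary = Path.from_string("notes.txt")
odd = Path(b"bad\xffname")  # not valid UTF-8, but a legal Posix name
```

Here `ordinary` round-trips through a string unchanged, while `odd` can still be stored and passed around as a path even though it has no exact string form.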

Robby