Re: Encoding Windows reserved characters
John Cowan <cowan@xxxxxxxx> writes:
> Derick Eddington scripsit:
>> The question is: what is the exact set of characters which should be
>> required to be encoded? I've heard different descriptions of what
>> Windows/DOS disallows. Does it differ between versions? What eras of
>> Microsoft OSs do we want to cater to?
> Microsoft's page and Cygwin's agree perfectly; the first certainly
> should know, and the second has had every reason to find out. I cannot
> believe that Microsoft, with its obsession with backward compat, would
> ever remove a character from the blacklist (which might break ancient apps
> that don't expect to see them) nor add one (which would make existing
> files unreachable). So I think the blacklist of #\", #\*, #\:, #\<,
> #\>, #\?, #\|, #\/, #\\, and #\x0; to #\x1F; is a solid one.
> The blacklist doubtless arose because COMMAND.COM (and its ancestors, the
> CP/M monitor and various DEC command executives) didn't have any escape
> convention, and so files with those characters couldn't be manipulated
> from the shell. Consequently, the kernel forbade them, and it still does.
> Note that this limitation is specific to Windows, the operating system,
> not any particular file system. In fact, the Microsoft page specifically
> says that there may be more characters which are forbidden by the file
> system. But I don't think either VFAT or NTFS applies any restrictions
> of its own -- indeed, the Posix subsystem (which bypasses the Windows
> executive and runs directly on the NT kernel) does not respect the
> blacklist, and can create files which Windows programs cannot process.
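
For concreteness, the character blacklist quoted above can be written
down as a small check. This is only a rough sketch in Python, not
anything SRFI 103 specifies; the names FORBIDDEN_CHARS and
forbidden_chars_in are made up for illustration, and it looks at a
single file name component, not a whole path.

```python
# Blacklist as quoted above: " * : < > ? | / \ plus U+0000 through U+001F.
FORBIDDEN_CHARS = frozenset('"*:<>?|/\\') | frozenset(chr(c) for c in range(0x20))


def forbidden_chars_in(component):
    """Return the blacklisted characters found in one file name
    component, each listed once, in order of first appearance."""
    found = []
    for ch in component:
        if ch in FORBIDDEN_CHARS and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    for name in ["foo*", "a:b?c", "plain.sls"]:
        print(repr(name), forbidden_chars_in(name))
```

An empty result means the component contains none of the listed
characters; it says nothing about any further restrictions a
particular file system might add.
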
Additionally, and more annoyingly IMO, Windows disallows several
perfectly innocent-looking names like "aux", "prn", "con" and "nul" (at
least), with any extension (see also  for a story including some
historical background). I wonder if SRFI 103 should mention this
horrendous stupidity. I actually ran into this, naming a library
"aux.sls", and a fellow Schemer on Windows was unable to check out the
git archive containing this file, getting obscure error messages.
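
A check for such names could look roughly like the following Python
sketch. The four names above are the ones I mentioned; COM1 to COM9 and
LPT1 to LPT9 are added on the assumption that Microsoft's file naming
documentation is the reference, and the list may still be incomplete.
RESERVED_STEMS and is_reserved_name are hypothetical names, and the
comparison is case-insensitive because Windows treats these names that
way.

```python
# Device names mentioned above, plus COM1-COM9 and LPT1-LPT9 (assumed
# from Microsoft's naming documentation; the list may be incomplete).
RESERVED_STEMS = {"aux", "prn", "con", "nul"}
RESERVED_STEMS |= {"com%d" % i for i in range(1, 10)}
RESERVED_STEMS |= {"lpt%d" % i for i in range(1, 10)}


def is_reserved_name(filename):
    """True if the part of filename before the first dot is a reserved
    device name, regardless of case or extension."""
    stem = filename.split(".", 1)[0]
    return stem.lower() in RESERVED_STEMS


if __name__ == "__main__":
    for name in ["aux.sls", "auxiliary.sls", "NUL.tar.gz", "console.txt"]:
        print(name, is_reserved_name(name))
```

With this, "aux.sls" is flagged while "auxiliary.sls" is not, since
only the exact stem matters.
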
>> Surely, some shells differ in what are nuisance characters? What shells
>> should be catered to for the nuisance characters to encode?
> I wouldn't worry about that. The fact that these characters are
> painful on Posix systems because of the shell is just lagniappe.
+1. Zsh handles completion of such filenames just fine, FWIW:
rotty@delenn:~/tmp% touch 'foo*'
rotty@delenn:~/tmp% ls f<TAB>
rotty@delenn:~/tmp% ls foo\*
Andreas Rottmann -- <http://rotty.yi.org/>