
Re: upcoming revision, need feedback



On Sun, 2010-01-10 at 18:15 +0200, Vitaly Magerya wrote:
> Derick Eddington wrote:
> > ---------------------------
> > Changes I'm undecided about
> > ---------------------------
> > 
> > 1) [...] instead of (foo) use (foo main)
> 
> Please don't. If I develop (foo) in a separate directory, I do not want
> it to suddenly become (foo main). This would be syntactic clutter I
> would not forgive as an end user.

I'll count you in favor of keeping the implicit file name support.

> > 2) [...] (a%/b c:*d) would map to the file name "a%25%%2F%b/c%3A%%2A%d.ext",
> > instead of "a%25%2Fb/c%3A%2Ad.ext" [...]
> 
> URI encoding gives a well known precedent, no need to break it.

I'll count you in favor of keeping the URI-style UTF-8 encoding design.
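
A minimal Python sketch of the URI-style mapping, reproducing the
"a%25%2Fb/c%3A%2Ad.ext" example quoted above.  The names
ASSUMED_ENCODED, encode_component, and library_file_name are
hypothetical, and the encoded set holds only the four characters that
appear in that example; the draft's actual set may well differ.

```python
# Assumption: only the characters seen in the quoted example are listed here.
# This is NOT the draft's normative set of encoded characters.
ASSUMED_ENCODED = set("%/:*")


def encode_component(component, encoded=ASSUMED_ENCODED):
    """Percent-encode the UTF-8 bytes of each character in `encoded`."""
    out = []
    for ch in component:
        if ch in encoded:
            # One %XX triplet per UTF-8 byte, upper-case hex as in the example.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
        else:
            # Everything else, including non-ASCII, is left as-is.
            out.append(ch)
    return "".join(out)


def library_file_name(components, ext="ext", sep="/"):
    """Join encoded components with the pathname component separator."""
    return sep.join(encode_component(c) for c in components) + "." + ext


# The library name (a%/b c:*d) from the quoted example:
print(library_file_name(["a%/b", "c:*d"]))  # a%25%2Fb/c%3A%2Ad.ext
```

Because "%" itself is in the set, the result contains no "%" that is
not the start of an escape, which is the property the URI precedent
relies on.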

> > -------------------------------------
> > Changes I assume aren't controversial
> > -------------------------------------
> > [...]
> > 6) Define the pathname component separator character to be #\/ on Unixes and #\\
> > on Windows.  Define the environment variable element separator character to be
> > #\: on Unixes and #\; on Windows.  The current draft is already tied to Unixes
> > and Windows.
> 
> You need the element separator to extract the path list, so it should be
> defined. Path separator on the other hand is something you don't care
> about beyond constructing sub-paths -- a process invisible to the user,
> so you don't need to define them beyond "recognizable by the OS".

I think the pathname component separators do need to be defined.  If a
library name includes one, it must be encoded; otherwise the resulting
pathname would be misinterpreted because the separator occurs
unencoded.  The specification of the set of encoded characters
therefore has to include the pathname component separators, and to be
listed there they have to be defined for the targeted platforms.  If
they are left undefined, the encoded set is not clearly, precisely,
and completely specified.
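
To illustrate the failure mode, here is a small Python sketch, under
the assumption that the separators are exactly those proposed in item
6.  SEPARATORS, naive_file_name, and split_search_path are
hypothetical names used only for this illustration.

```python
# Separator characters as proposed in item 6 of the quoted list.
SEPARATORS = {
    "unix": {"component": "/", "element": ":"},
    "windows": {"component": "\\", "element": ";"},
}


def naive_file_name(components, platform, ext="ext"):
    """Join library name components WITHOUT encoding (the failure case)."""
    sep = SEPARATORS[platform]["component"]
    return sep.join(components) + "." + ext


def split_search_path(value, platform):
    """Extract the list of directories from a search-path variable."""
    return value.split(SEPARATORS[platform]["element"])


# Two different library names map to the same pathname when "/" is
# left unencoded on a Unix:
print(naive_file_name(["a/b", "c"], "unix"))     # a/b/c.ext
print(naive_file_name(["a", "b", "c"], "unix"))  # a/b/c.ext

# The element separator is what turns the variable into a directory list:
print(split_search_path("/usr/lib/scheme:/home/me/lib", "unix"))
```

The collision in the first two calls is why each platform's component
separator has to be named explicitly: it cannot be placed in the
encoded set unless the specification says what it is.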

> > 7) Add #\; to the set of encoded characters, because a directory could be both
> > in the SCHEME_LIB_PATH sequence and correspond to a library name component.
> > Such a directory with a name including #\; is unusual but must be supported,
> > otherwise an unencoded #\; would be misinterpreted in SCHEME_LIB_PATH.
> 
> I heard that when you strive for fail-safety it's best to enumerate
> the allowed things, not the forbidden ones.

I don't think that justifies what you suggest below.

> How about "Encode everything
> except for [a-zA-Z0-9_.-]"? It's safe, short, simple and works for 99%
> of libraries without any encoding at all.

Other cultures' characters must be usable unencoded, especially since
the targeted file systems support them, and we don't want to
discourage other cultures' use of Scheme from growing beyond 1% of
libraries.  There's no good reason to encode any characters other than
those the SRFI itself interprets specially and those the targeted file
systems disallow.  I'm fairly sure the current set, with the inclusion
of #\;, is correct, and I plan to ask for help double-checking it
before it's set in stone.
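
The difference between the two policies can be sketched in Python as
follows.  The allow-list is the one quoted above; ASSUMED_SPECIAL is
only a guess at the kind of set meant by "interpreted specially or
disallowed", and all function names are hypothetical.

```python
import re

# Policy A (from the quoted suggestion): encode everything outside
# [a-zA-Z0-9_.-].
ALLOWED = re.compile(r"[a-zA-Z0-9_.-]")

# Policy B (the position in this reply): encode only characters that are
# interpreted specially or disallowed.  This set is an assumption made
# for illustration, not the draft's list.
ASSUMED_SPECIAL = set("%/\\:;*")


def percent_encode(ch):
    """Encode one character as %XX triplets of its UTF-8 bytes."""
    return "".join("%{:02X}".format(b) for b in ch.encode("utf-8"))


def encode_allowlist(component):
    """Policy A: keep only characters matching the allow-list."""
    return "".join(
        ch if ALLOWED.fullmatch(ch) else percent_encode(ch) for ch in component
    )


def encode_special_only(component):
    """Policy B: encode only the assumed special characters."""
    return "".join(
        percent_encode(ch) if ch in ASSUMED_SPECIAL else ch for ch in component
    )
```

Under these assumptions both policies leave "srfi-1" untouched and
turn "foo;bar" into "foo%3Bbar", but a component such as "λ-calc"
becomes "%CE%BB-calc" under the allow-list while staying readable
under the special-characters-only policy.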

-- 
: Derick