
Re: Character encoding

This page is part of the web mail archives of SRFI 103 from before July 7th, 2015. The new archives for SRFI 103 contain all messages, not just those from before July 7th, 2015.



On Fri, 2010-03-05 at 21:23 -0600, Eduardo Cavazos wrote:

> > But, if we accept that, should the encoding of characters be flushed
> > altogether (which might help further increase the rage against
> > Windows)?
> 
> I'm somewhat in favor of dropping the encoding of characters. But... 
> 
> SRFI 97 already picked the ':' symbol for use in "standard names" for
> the SRFIs. This makes for really messy looking directory contents 

(The author of SRFI 97 initially named them like (srfi 1 lists), but two
implementors were against that because R6RS allows only symbols.  I've
always wished exact integers were also allowed, and they could map to
file names in the obvious way, but I didn't speak up.)

> and
> put us in the current situation where the portable R6RS SRFIs don't run
> under Chez Scheme because it doesn't encode the non-portable characters.
> 
> However, SRFIs aren't holy.

I agree.

> The right thing is to have a better standard for SRFI names; names which
> are portable.

That the SRFI library names use a character in front of the numbers is
a separate issue from whether characters should be encoded.  Allowing
Windoze's flaws to determine library names is not acceptable.

> This would impact the implementations now, but let's get it right early
> rather than living with %3a*.sls for years.

(As the new draft says, the encoding is now %3A%.)

As the maintainer of that collection of SRFIs, I've been dealing with
those encoded colons a lot, and I'm okay with continuing to do so for
the benefits the encoding gives.  (I'd still like a leading character to
not be needed for the SRFI numbers.  My point is that I'm okay with
dealing with ugly and annoying-to-type encoded characters for the
benefits.)

> If the R6RS implementors speak up and say they'd be OK with supporting a
> better standard for SRFI names, e.g.:
> 
> 	(srfi s101 random-access-lists)
> 
> then I'll change my vote to being strongly in favor of dropping the
> character encoding.

The character encoding, and potentially a special mapping of the
Windows-disallowed file-name components, matter for much more than the
SRFI 97 colon.  Library names like (srfi 2 and-let*) and (acme thing?)
are problematic for Windoze without the encoding, and so are names like
(acme aux) without a special mapping.  I simply will never support
discouraging naming libraries like that.
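
To make the two problems concrete, here is a minimal sketch in Python
(not the draft's actual algorithm).  It assumes plain URI-style
percent-encoding with uppercase hex, a ".sls" extension, and "/" as the
component separator; the names encode_component, is_reserved, and
library_file_name are hypothetical.  Reserved device names such as
"aux" are only detected, since how they should be mapped is left to the
draft.

```python
# Characters Windows forbids in file names, plus "%" itself so that
# the encoding stays unambiguous.  (Assumed set, for illustration.)
WINDOWS_FORBIDDEN = set('<>:"/\\|?*%')

# Device names Windows reserves as file-name components.
WINDOWS_RESERVED = (
    {"con", "prn", "aux", "nul"}
    | {f"com{i}" for i in range(1, 10)}
    | {f"lpt{i}" for i in range(1, 10)}
)


def encode_component(symbol: str) -> str:
    """Percent-encode the characters Windows cannot store in a file name."""
    out = []
    for ch in symbol:
        if ch in WINDOWS_FORBIDDEN:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def is_reserved(symbol: str) -> bool:
    """True if the component collides with a Windows device name."""
    return symbol.lower() in WINDOWS_RESERVED


def library_file_name(library_name, extension=".sls"):
    """Map a library name (a list of component strings) to a relative path."""
    return "/".join(encode_component(s) for s in library_name) + extension


print(library_file_name(["srfi", ":1", "lists"]))
print(library_file_name(["acme", "thing?"]))
print(is_reserved("aux"))
```

Under these assumptions (srfi :1 lists) becomes srfi/%3A1/lists.sls and
(acme thing?) becomes acme/thing%3F.sls, while (acme aux) needs no
character encoding at all and is caught only by the reserved-name check.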

Without the encoding, library names like (acme this/that) should still
work, although with funny additional separation of file-name components,
which would also break reverse mapping.  The current draft is designed
to support reverse mapping, because it's great for programmatically
managing/analyzing collections of library files using only file names
(I've been loving using this ability to work with my own sizable
collections).
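
A rough Python sketch of why reverse mapping depends on the encoding
follows.  It assumes URI-style percent-encoding, a ".sls" extension, and
"/" separators; file_name_to_library is a hypothetical helper, not
something the draft defines.

```python
from urllib.parse import unquote


def file_name_to_library(path, extension=".sls"):
    """Recover library-name components from a relative file path."""
    if not path.endswith(extension):
        raise ValueError(f"not a library file: {path}")
    stem = path[: -len(extension)]
    # Each path component is one library-name component; decode any
    # percent-encoded characters back to the originals.
    return [unquote(part) for part in stem.split("/")]


# With the "/" encoded, the original name comes back intact.
print(file_name_to_library("acme/this%2Fthat.sls"))

# Without the encoding, the "/" is indistinguishable from a separator,
# so the name comes back with an extra component.
print(file_name_to_library("acme/this/that.sls"))
```

The first call yields the two components "acme" and "this/that"; the
second yields three, which is the broken reverse mapping described
above.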

If we don't have the special handling of the problematic characters and
file-name components, this is what will happen:  I, and I imagine
others, will not avoid making library names whose file names are
disallowed by Windoze, and I/we will be offended by library names which
have been corrupted.  This will mean Windows users will be struggling
with packages which cannot be unpacked because of disallowed file names,
and they won't be able to auto-import the libraries anyway because the
Scheme systems won't support any encoding/special-mapping which they
could rename the problematic files to use.

> Currently, two implementations haven't committed at all to the current
> SRFI naming convention: Chez and Ikarus.

I don't know what you mean.  They both support it.  Currently, the
pre-release Chez needs the colons to be ":" and Ikarus needs them to be
"%3a".  The SRFI naming convention works with both.

The only reason I'm okay with losing the character encoding and/or not
specially mapping the Windows-disallowed file names is that I, and I
hope others, will not avoid library-file names which Windoze can't
handle.  That will indeed greatly annoy Windows users, and I feel that's
just reflecting the nature of the OS they chose to use, and I want them
to reconsider that choice.

-- 
: Derick