
Re: Parsing Scheme [was Re: strings draft]

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

    > From: Per Bothner <per@xxxxxxxxxxx>

    > Ken Dickey wrote:

    > > It would be a *bad thing* if going from one locale to another changed a
    > > working Scheme program into a broken Scheme program.

    > > So, please be sure that the specification of character and string encoding and  
    > > of portable Scheme source code defines Scheme source as being locale independent
    > > (by construction).

    > Huh?  What do you mean?  How can a source file containing Scheme
    > source code possibly be locale independent?  What if you're on
    > a system whose native encoding is EBCDIC?  What if you use
    > non-ASCII characters in string literals or symbols?

That one's easy.

Scheme can be formally defined over a set of abstract characters, and
the standard (or each implementation) can say, for any given character
set, which concrete characters these correspond to.

The needed punctuation, space, line-terminator, the 10 decimal digits,
and the letters are pretty portable.  (My proposals for R6RS go
further to add tab and formfeed to the list of abstract characters.)

As for string literals and symbols: some implementations will permit
some that aren't portable.  Down the road, it might be worth adding
specifications of optional extensions to the character set -- for
example: if your implementation can represent certain characters
included in Unicode, then [such and such] a subset of those must be
valid in string literals, as identifier constituents, as the first
character of an identifier, etc.
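
One way such an optional extension could look, as a rough Python
sketch.  The portable part uses the R5RS special-initial and
special-subsequent characters; the extension rule (any Unicode letter
the implementation's character set can represent may start an
identifier; marks and decimal digits may follow) is purely a
hypothetical stand-in for "[such and such] a subset", and peculiar
identifiers such as + and ... are ignored.

```python
import unicodedata

PORTABLE_LETTERS = set("abcdefghijklmnopqrstuvwxyz")
PORTABLE_DIGITS = set("0123456789")
SPECIAL_INITIAL = set("!$%&*/:<=>?^_~")   # R5RS <special initial>
SPECIAL_SUBSEQUENT = set("+-.@")          # R5RS <special subsequent>


def representable(ch, charset):
    """True if the implementation's character set can represent ch."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def valid_initial(ch, charset):
    if ch in PORTABLE_LETTERS or ch in SPECIAL_INITIAL:
        return True
    # Hypothetical optional extension: representable Unicode letters.
    return unicodedata.category(ch).startswith("L") and representable(ch, charset)


def valid_subsequent(ch, charset):
    if valid_initial(ch, charset) or ch in PORTABLE_DIGITS or ch in SPECIAL_SUBSEQUENT:
        return True
    # Hypothetical optional extension: combining marks and decimal digits.
    return unicodedata.category(ch) in ("Mn", "Mc", "Nd") and representable(ch, charset)


def is_identifier(text, charset):
    """Check text against the portable rules plus the optional extension."""
    return (
        len(text) > 0
        and valid_initial(text[0], charset)
        and all(valid_subsequent(ch, charset) for ch in text[1:])
    )
```

Under these assumptions, is_identifier("list->vector", "ascii") holds
everywhere, while an identifier starting with a Greek lambda is accepted
only where the character set (say "utf-8") can represent it.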

Hey, maybe we need trigraphs :-)