
Re: Parsing Scheme [was Re: strings draft]





    > From: Ken Dickey <Ken.Dickey@xxxxxxxxxxxxxx>

    > By all means, clean up any silliness. [I have not seen such, but
    > I do not have time to read the volume of email generated by
    > this list, so I had best just shut up now].

Quick summary: nobody has proposed anything that contradicts your
goals except in such minor ways that I doubt you would object.  I
think that, by and large, we agree.  Um... it does show that you haven't
read the proposals and responses carefully.

One specific "flame", though: you _seem_ to say in your message body
that you think #\a..#\z are portable in identifiers but that #\A..#\Z
are not.  I'd be happy to take it up with you off-list but that is an
absurd reading, imo.  I think that the definition of "<letter>" in
ch. 7 is only reasonably read in the context of ch. 2: it mentions
only lower case letters because it is taken as read that case (for
a..z and A..Z) doesn't matter for <letter>, and thus it simplifies the
formal syntax to mention only lowercase there.   It's a mistake to
read the definition of <letter> as meaning that 

	(let ((x 3)) (display X))

is not a portable program, not least because to do so would invalidate
some of the code samples in chapter 6.


Basically, if authors want to tell me "We only mentioned case as a
guideline -- the standard doesn't rely on it.  At the same time, we
all write code assuming that implementations follow the guidelines for
ASCII," -- well, I won't believe them.


    > PS: I would expect that the included code which implements R5RS
    > READ (less number recognition, but I can send that file as well)
    > still should behave as expected under whatever new standard
    > emerges.

Yes, a quick read suggests that it would work as well as it currently
does and as portably as it currently does if my recommendations were
incorporated in R6RS.

Your code says:

    > (define (read-identifier port)
    >   ;; ASSERT: peek-char is start of identifier
    >   (string->caseified-symbol (read-identifier-string port)))

R5RS does not promise that, given a symbol name as input, this
procedure will return the same symbol that is named by that identifier.

Your definition of STRING->CASEIFIED-SYMBOL smashes the string to
lower case.   Implementations are not required to do that.

I cannot portably compare a return value from READ-IDENTIFIER to 

	'let

even on ASCII implementations and even on ASCII implementations that
follow the report's guidelines for ASCII.

That's why, in the thread with Thomas Bushnell, we agreed to add:

* (string->parsed-symbol s)

  S must be an IDENTIFIER? string.  Return the symbol denoted by that
  identifier if it were used in a quoted context in a Scheme expression.
  (Note how this differs from STRING->SYMBOL.)
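
A rough sketch of how such a procedure might be written, to make the
difference from STRING->SYMBOL concrete.  This is only an illustration
under stated assumptions, not a proposed reference implementation: it
assumes S has already been checked to be an identifier string, it
handles only the case question (nothing else about identifier syntax),
and it probes the implementation's behaviour with (eq? 'a 'A) and
(symbol->string 'a) instead of hard-coding lower case.

```scheme
;; Hypothetical sketch of STRING->PARSED-SYMBOL.
;; ASSUMES: S is a valid identifier string (not checked here).
(define (string->parsed-symbol s)
  (define (map-string f str)
    (list->string (map f (string->list str))))
  (cond
    ;; The reader does not fold case here (outside R5RS, but such
    ;; implementations exist): keep the string as written.
    ((not (eq? 'a 'A))
     (string->symbol s))
    ;; The reader folds case and the preferred standard case is upper.
    ((string=? (symbol->string 'a) "A")
     (string->symbol (map-string char-upcase s)))
    ;; The reader folds case and the preferred standard case is lower.
    (else
     (string->symbol (map-string char-downcase s)))))
```

With this, (eq? (string->parsed-symbol "let") 'let) should hold
whichever case the implementation prefers, which is exactly the
comparison that a hard-coded downcase followed by STRING->SYMBOL
cannot promise.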

-t