
Re: Parsing Scheme [was Re: strings draft]

    > From: Ken Dickey <Ken.Dickey@xxxxxxxxxxxxxx>

    > By all means, clean up any silliness. [I have not seen such, but
    > I do not have time to read the volume of email generated by
    > this list, so I had best just shut up now].

Quick summary: nobody has proposed anything that contradicts your
goals except in such minor ways that I doubt you would object.  I
think that by and large, we agree.  Um... it does show that you haven't
read the proposals and responses carefully.

One specific "flame", though: you _seem_ to say in your message body
that you think #\a..#\z are portable in identifiers but that #\A..#\Z
are not.  I'd be happy to take it up with you off-list but that is an
absurd reading, imo.  I think that the definition of "<letter>" in
ch. 7 is only reasonably read in the context of ch. 2: it mentions
only lower case letters because it is taken as read that case (for
a..z and A..Z) doesn't matter for <letter>, and thus it simplifies the
formal syntax to mention only lowercase there.   It's a mistake to
read the definition of <letter> as meaning that 

	(let ((x 3)) (display X))

is not a portable program, not least because to do so would invalidate
some of the code samples in chapter 6.

Basically, if authors want to tell me "We only mentioned case as a
guideline -- the standard doesn't rely on it.  At the same time, we
all write code assuming that implementations follow the guidelines for
ASCII," -- well, I won't believe them.

    > PS: I would expect that the included code which implements R5RS
    > READ (less number recognition, but I can send that file as well)
    > still should behave as expected under whatever new standard
    > emerges.

Yes, a quick read suggests that it would work as well as it currently
does and as portably as it currently does if my recommendations were
incorporated in R6RS.

Your code says:

    > (define (read-identifier port)
    >   ;; ASSERT: peek-char is start of identifier
    >   (string->caseified-symbol (read-identifier-string port)))

R5RS does not promise that, given a symbol name as input, this
procedure will return the same symbol that is named by that name.

Your definition of STRING->CASEIFIED-SYMBOL smashes the string to
lower case.   Implementations are not required to do that.

I can not portably compare a return value from READ-IDENTIFIER to 


even on ASCII implementations and even on ASCII implementations that
follow the report's guidelines for ASCII.
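
To make the concern concrete, here is a minimal sketch.  LOWER-CASE-SYMBOL
is a hypothetical stand-in for what STRING->CASEIFIED-SYMBOL is described
as doing above (it is not the quoted code), and the probe at the end is
only an illustration of why the result depends on the implementation.

```scheme
;; Hypothetical stand-in: force the name to lower case, then intern it.
(define (lower-case-symbol s)
  (string->symbol
   (list->string (map char-downcase (string->list s)))))

;; Probe: does the lower-cased symbol match what the reader produces
;; for the same identifier?  Expected to be #t where the reader prefers
;; lower case or preserves case, and #f where it folds to upper case.
(define lower-case-matches-reader?
  (eq? (lower-case-symbol "abc") 'abc))
```

R5RS leaves the reader's preferred case up to the implementation, so a
portable program cannot assume which value the probe has.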

That's why, in the thread with Thomas Bushnell, we agreed to add:

* (string->parsed-symbol s)

  S must be an IDENTIFIER? string.  Return the symbol denoted by that
  identifier if it were used in a quoted context in a Scheme expression.
  (Note how this differs from STRING->SYMBOL.)
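
One way such a procedure might be sketched in plain R5RS is to probe how
the reader treats case and then fold the string the same way.  This is
only a sketch under assumptions, not the definition agreed in the thread:
READER-CASE-FOLD is a hypothetical helper, the IDENTIFIER? check is
omitted, and it assumes CHAR-UPCASE and CHAR-DOWNCASE mirror whatever
folding the reader applies (which is only safe to expect for a..z and
A..Z).

```scheme
;; Hypothetical helper: return a char -> char procedure that mimics the
;; reader's treatment of case in identifiers.
(define (reader-case-fold)
  (cond ((not (eq? 'a 'A)) (lambda (c) c))               ; reader preserves case
        ((string=? (symbol->string 'a) "a") char-downcase) ; prefers lower case
        (else char-upcase)))                               ; prefers upper case

;; Sketch only.  ASSERT: s is a syntactically valid identifier
;; (not checked here).
(define (string->parsed-symbol s)
  (string->symbol
   (list->string (map (reader-case-fold) (string->list s)))))
```

Under those assumptions, (eq? (string->parsed-symbol "Foo") 'Foo) should
hold whichever case the implementation prefers, which is not true of
(string->symbol "Foo").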