
Re: new, simpler formal specification



> From: Taylor Campbell <campbell@xxxxxxxxxxxxxxxxxx>
>> Paul Schlie wrote:
>> In hopes it may be helpful, the following is a somewhat simpler version of
>> an earlier one: http://srfi.schemers.org/srfi-62/mail-archive/msg00045.html
>> which only modifies R5RS's definition of <comment> (effectively treated as
>> separator/white-space); which includes <block-comment>, and removes legally
>> quoting and/or commenting, a comment or <empty>:
>>
>> Only modifying one of R5RS's existing grammar specifications:
>> 
>>    <comment> -> <line-comment> | <block-comment> | <datum-comment>
>> 
>> And augmenting it with these additions:
>> 
>>   <line-comment> -> ; <all-chars-to-end-of-line>
>> 
>>   <block-comment> -> #| <all-char-except-#|-or-|#> |#
>> 
>>   <datum-comment> -> #; <datum>
>>
> One problem with this is that it creates mutual references between
> sections 7.1.1, which describes the lexical structure, and 7.1.2, which
> describes the external representation of S-expressions -- in terms of a
> stream of tokens.  While I find the 'stream of tokens' model very
> distasteful to describe Lisp syntax, it is nevertheless how the current
> framework for Scheme's syntax works, and breaking it is not a good idea
> -- especially since you seem opposed to such massive changes anyway, or
> at least that was the impression I got from your previous objections.

- I'm honestly confused. The above representation is fully consistent with
  R5RS and with your own referenced alternative specification of Scheme
  grammar. (Your proposal is consistent with neither, as it doesn't properly
  classify "#; <datum>" grammatically as a comment, which is both
  syntactically and semantically inconsistent with its needing to be parsed
  as <whitespace>.) I apparently need to be more explicit, given its
  effective specification:

  <whitespace> -> [ <whitespace-character> | <comment> ] <whitespace>

 (which is the grammatical vehicle used to denote that comments are ignored)

> (Your block comments are also inconsistent with SRFI 30, by the way,
> but that is not relevant to this SRFI.)

- How? Other than being simpler, it seems fully consistent with its text.

>> (i.e. #;#; '#; (#;) (') etc. are illegal sequences, vs. earlier versions.)
> 
> Actually, with the way you just presented it, immediately nested
> S-expression comments work exactly as in the current proposal.

- Almost. I intentionally simplified it, stating that it "removes legally
  quoting and/or commenting, a comment or <empty>", as allowing those seemed
  to confuse an otherwise very simple specification. (But note that your
  following semantic interpretation is not consistent with it, i.e. it is
  wrong.):

>                                                                 First,
> consider the text '#; A B'.  If parsed as a datum, the value will be
> just B, since the '#; A' is considered intertoken space.  (This follows
> straightforwardly since A is a datum, and so the text '#; A' satisfies
> your <datum-comment> rule.)  So the whole of '#; A B' is one datum (as
> defined in R5RS section 7.1.2).  If we then attempt to parse '#; #; A B
> C' as a datum, we see that there is some intertoken space first, namely
> #; followed by a single datum.  Since we determined that the text '#; A
> B' qualifies as one datum, '#; #; A B' must be one datum comment, and
> the only item remaining in the input stream is C.  Thus '#; #; A B C'
> reads as the symbol C.

- Candidly, I haven't a clue how you believe a recursive parser parses the
  above grammar, but it simply specifies that (given <ws> :: <whitespace>):

  "#; #; A B C" :: "<error> <ws> <A> <ws> <B> <ws> <C>"

  (The grammar only specifies "<#;> <datum>" as a legal <comment>, which is
   treated as <whitespace>. Since <#;> may not validly begin a <datum>,
   "#; #;" is not a valid sequence and is therefore a parse error.)
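
  As a minimal sketch of that reading (in Python rather than Scheme, purely
  for illustration), a recursive-descent reader for the grammar might look
  like the following. The names tokenize, read_datum, read_sequence, read_all
  and ParseError are hypothetical, and the tokenizer is an assumption that
  only handles "#;", parentheses, quote and simple atoms.

```python
import re

# Assumed toy tokenizer: '#;', parentheses, quote, and simple atoms only.
TOKEN = re.compile(r"#;|[()']|[^\s()';#]+")


class ParseError(Exception):
    pass


def tokenize(text):
    return TOKEN.findall(text)


def read_datum(tokens, pos):
    """Parse one <datum> at tokens[pos]; return (value, next_pos)."""
    if pos >= len(tokens):
        raise ParseError("unexpected end of input")
    tok = tokens[pos]
    if tok == "#;":
        # <#;> may not begin a <datum>, so '#; #;' and "'#;" are errors.
        raise ParseError("'#;' may not begin a <datum>")
    if tok == "'":
        value, pos = read_datum(tokens, pos + 1)
        return ["quote", value], pos
    if tok == "(":
        return read_sequence(tokens, pos + 1, closing=True)
    if tok == ")":
        raise ParseError("unexpected ')'")
    return tok, pos + 1


def read_sequence(tokens, pos, closing=False):
    """Parse data up to ')' (if closing) or end of input."""
    items = []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == ")":
            if not closing:
                raise ParseError("unexpected ')'")
            return items, pos + 1
        if tok == "#;":
            # <datum-comment> -> #; <datum>: parse one datum and discard it.
            _, pos = read_datum(tokens, pos + 1)
            continue
        value, pos = read_datum(tokens, pos)
        items.append(value)
    if closing:
        raise ParseError("missing ')'")
    return items, pos


def read_all(text):
    items, _ = read_sequence(tokenize(text), 0)
    return items
```

  Under these assumptions, read_all("#; A B C") yields the two data B and C,
  while "#; #; A B C", "'#; A B", "(#;)" and "(')" all raise ParseError.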

  If you want to give "#; #; A B C" a consistent meaning, here are your
  grammar specification options (in addition to the one above):

  1 - <datum-comment> -> #; [ <datum> | <datum-comment> ]

       Which specifies "#; #; A B C" :: "<ws> B <ws> C", as:

       "#; #; A" => <ws>{<datum-comment>{#; <datum-comment>{#; <datum>}}}

  2 - As I proposed two months ago, a grammar which further specifies the
      meanings of:

       [' | ` | , | ,@ ] [<datum-comment> | <empty>]
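
  As a hedged illustration of option 1 (again in Python, purely as a sketch),
  the only change from a reader for "#; <datum>" is that a <datum-comment>
  may itself be followed by another <datum-comment>. The names
  skip_datum_comment, read_datum, read_sequence, read_all and ParseError are
  hypothetical, and the toy tokenizer is an assumption that only handles
  "#;", parentheses, quote and simple atoms.

```python
import re

# Assumed toy tokenizer: '#;', parentheses, quote, and simple atoms only.
TOKEN = re.compile(r"#;|[()']|[^\s()';#]+")


class ParseError(Exception):
    pass


def skip_datum_comment(tokens, pos):
    """tokens[pos] is '#;'; return the position just past the comment."""
    pos += 1
    if pos < len(tokens) and tokens[pos] == "#;":
        # <datum-comment> -> #; <datum-comment>
        return skip_datum_comment(tokens, pos)
    # <datum-comment> -> #; <datum>
    _, pos = read_datum(tokens, pos)
    return pos


def read_datum(tokens, pos):
    """Parse one <datum> at tokens[pos]; return (value, next_pos)."""
    if pos >= len(tokens):
        raise ParseError("unexpected end of input")
    tok = tokens[pos]
    if tok == "#;":
        raise ParseError("'#;' may not begin a <datum>")
    if tok == "'":
        value, pos = read_datum(tokens, pos + 1)
        return ["quote", value], pos
    if tok == "(":
        return read_sequence(tokens, pos + 1, closing=True)
    if tok == ")":
        raise ParseError("unexpected ')'")
    return tok, pos + 1


def read_sequence(tokens, pos, closing=False):
    """Parse data up to ')' (if closing) or end of input."""
    items = []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == ")":
            if not closing:
                raise ParseError("unexpected ')'")
            return items, pos + 1
        if tok == "#;":
            pos = skip_datum_comment(tokens, pos)  # ignored as <whitespace>
            continue
        value, pos = read_datum(tokens, pos)
        items.append(value)
    if closing:
        raise ParseError("missing ')'")
    return items, pos


def read_all(text):
    items, _ = read_sequence(TOKEN.findall(text), 0)
    return items
```

  With this sketch, read_all("#; #; A B C") yields the two data B and C
  (only A is discarded), whereas "(#;)" still raises ParseError because
  <empty> is not accepted after "#;".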
        
  The reason you're having difficulty trying to cleanly specify:

   " The first datum within a commented datum is ignored, as is any datum
     immediately following the "#;" token in a delimiter prefix. "
   
   is that it's a lousy, inconsistent semantic behavior to try to specify,
   vs. one more consistent with the language and recursive-descent grammars:

  " The <datum-comment>, specified as a <#;> followed by a <datum>, is
    ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment>, is ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment> or <empty>, is ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment> or <empty>, or a quoted <datum-comment> or <empty>, is
    ignored as <whitespace> "