Re: new, simpler formal specification

This page is part of the web mail archives of SRFI 62 from before July 7th, 2015. The new archives for SRFI 62 contain all messages, not just those from before July 7th, 2015.



> From: Taylor Campbell <campbell@xxxxxxxxxxxxxxxxxx>
>> Paul Schlie wrote:
>> In hopes it may be helpful, the following is a somewhat simpler version of
>> an earlier one: http://srfi.schemers.org/srfi-62/mail-archive/msg00045.html
>> which only modifies R5RS's definition of <comment> (effectively treated as
>> separator/white-space); which includes <block-comment>, and removes legally
>> quoting and/or commenting, a comment or <empty>:
>>
>> Only modifying one R5RS's exiting grammar specification:
>> 
>>    <comment> -> <line-comment> | <block-comment> | <datum-comment>
>> 
>> And augmented it with these additions:
>> 
>>   <line-comment> -> ; <all-chars-to-end-of-line>
>> 
>>   <block-comment> -> #| <all-char-except-#|-or-|#> |#
>> 
>>   <datum-comment> -> #; <datum>
>>
> One problem with this is that it creates mutual references between
> sections 7.1.1, which describes the lexical structure, and 7.1.2, which
> describes the external representation of S-expressions -- in terms of a
> stream of tokens.  While I find the 'stream of tokens' model very
> distasteful to describe Lisp syntax, it is nevertheless how the current
> framework for Scheme's syntax works, and breaking it is not a good idea
> -- especially since you seem opposed to such massive changes anyway, or
> at least that was the impression I got from your previous objections.

- I'm honestly confused; the above representation is fully consistent with
  R5RS and with your own alternative referenced specification of Scheme
  grammar. (Your proposal is consistent with neither, as it doesn't properly
  classify "#; <datum>" grammatically as a comment, which is both
  syntactically and semantically inconsistent with its needing to be parsed
  as <whitespace>.) I apparently need to be more explicit, given its
  effective specification:

  <whitespace> -> [ <whitespace-character> | <comment> ] <whitespace>

 (which is the grammatical vehicle used to denote that comments are ignored)

> (Your block comments are also inconsistent with SRFI 30, by the way,
> but that is not relevant to this SRFI.)

- How? Other than being simpler, it seems fully consistent with its text?

>> (i.e. #;#; '#; (#;) (') etc. are illegal sequences, vs. earlier versions.)
> 
> Actually, with the way you just presented it, immediately nested
> S-expression comments work exactly as in the current proposal.

- Almost, as I intentionally simplified it, stating: "and removes legally
  quoting and/or commenting, a comment or <empty>", as it seemed to be
  confusing an otherwise very simple specification. But note that your
  following semantic interpretation is not consistent with it, i.e. it is
  wrong:

>                                                                 First,
> consider the text '#; A B'.  If parsed as a datum, the value will be
> just B, since the '#; A' is considered intertoken space.  (This follows
> straightforwardly since A is a datum, and so the text '#; A' satisfies
> your <datum-comment> rule.)  So the whole of '#; A B' is one datum (as
> defined in R5RS section 7.1.2).  If we then attempt to parse '#; #; A B
> C' as a datum, we see that there is some intertoken space first, namely
> #; followed by a single datum.  Since we determined that the text '#; A
> B' qualifies as one datum, '#; #; A B' must be one datum comment, and
> the only item remaining in the input stream is C.  Thus '#; #; A B C'
> reads as the symbol C.

- Candidly, I haven't a clue how you believe a recursive parser parses the
  above grammar, but it simply specifies that (given <ws> :: <whitespace>):

  "#; #; A B C" :: "<error> <ws> <A> <ws> <B> <ws> <C>"

  (The grammar only specifies "<#;> <datum>" as a legal parsing of a
   <comment> => <whitespace>; "#; #;" is therefore not a valid sequence,
   as <#;> may not validly begin a <datum>, and so it is a parse error.)
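
  To make that reading concrete, here is a minimal sketch of a recursive
  descent reader for the stricter rule. It is in Python rather than Scheme,
  covers only symbols and parenthesized lists, and assumes that only plain
  whitespace characters may sit between "#;" and the <datum> it comments
  out. The names ParseError, skip_ws, read_datum and read_all are
  hypothetical, chosen for illustration only.

```python
# Hypothetical toy reader: symbols and parenthesized lists only.
# Assumption: only plain whitespace characters may separate "#;" from
# the <datum> it comments out, and "#;" itself may not begin a <datum>.

class ParseError(Exception):
    pass


def skip_ws(text, i):
    """Skip whitespace characters and "#; <datum>" comments from index i."""
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text.startswith("#;", i):
            i += 2
            while i < len(text) and text[i].isspace():
                i += 1
            if text.startswith("#;", i):
                raise ParseError("'#;' may not begin a <datum>")
            _, i = read_datum(text, i)  # parse the datum, then discard it
        else:
            break
    return i


def read_datum(text, i):
    """Read one datum starting exactly at index i; return (datum, next_i)."""
    if i >= len(text):
        raise ParseError("unexpected end of input")
    if text[i] == "(":
        items = []
        i = skip_ws(text, i + 1)
        while i < len(text) and text[i] != ")":
            item, i = read_datum(text, i)
            items.append(item)
            i = skip_ws(text, i)
        if i >= len(text):
            raise ParseError("missing ')'")
        return items, i + 1
    if text[i] == ")":
        raise ParseError("unexpected ')'")
    j = i
    while j < len(text) and not text[j].isspace() and text[j] not in "()":
        j += 1
    return text[i:j], j


def read_all(text):
    """Read every datum in text, treating comments as whitespace."""
    data = []
    i = skip_ws(text, 0)
    while i < len(text):
        datum, i = read_datum(text, i)
        data.append(datum)
        i = skip_ws(text, i)
    return data
```

  With this sketch, read_all("#; A B C") returns ["B", "C"], while
  read_all("#; #; A B C") and read_all("(#;)") both raise ParseError.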

  If you want to give "#; #; A B C" a consistent meaning, here are your
  grammar specification options (in addition to the one above):

  1 - <datum-comment> -> #; [ <datum> | <datum-comment> ]

       Which specifies "#; #; A B C" :: "<ws> B <ws> C", as:

       "#; #; A" => <ws>{<datum-comment>{#; <datum-comment>{#; <datum>}}}

  2 - as I denoted 2 months ago, which also further specifies the meanings:

       [' | ` | , | ,@ ] [<datum-comment> | <empty>]
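
  Option 1 above can be sketched at the token level as follows. This is a
  simplified illustration in Python, not a full reader: it handles symbols
  only (no lists or quoting) and assumes tokens are separated by
  whitespace, so that "#;" always stands alone. The names
  skip_datum_comment and read_all are hypothetical.

```python
# Hypothetical token-level sketch of option 1 (symbols only, no lists).
# Assumption: tokens are separated by whitespace, so "#;" stands alone.

def skip_datum_comment(tokens, i):
    """tokens[i] is "#;"; return the index just past the whole comment."""
    i += 1
    if i >= len(tokens):
        raise SyntaxError("'#;' needs a <datum> or <datum-comment> after it")
    if tokens[i] == "#;":
        # <datum-comment> -> #; <datum-comment>
        return skip_datum_comment(tokens, i)
    # <datum-comment> -> #; <datum>
    return i + 1


def read_all(text):
    """Return the symbols that remain once datum comments are ignored."""
    tokens = text.split()
    data = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "#;":
            i = skip_datum_comment(tokens, i)
        else:
            data.append(tokens[i])
            i += 1
    return data
```

  Here read_all("#; #; A B C") returns ["B", "C"]: the nested comment
  consumes only A, so both B and C survive.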
        
  The reason you're having difficulty trying to cleanly specify:

   " The first datum within a commented datum is ignored, as is any datum
     immediately following the "#;" token in a delimiter prefix. "
   
   is that it's a lousy, inconsistent semantic behavior to try to specify,
   vs. one more consistent with the language and recursive descent grammars:

  " The <datum-comment>, specified as a <#;> followed by a <datum>, is
    ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment>, is ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment> or <empty>, is ignored as <whitespace> "

  -or-

  " The <datum-comment>, specified as a <#;> followed by a <datum> or another
    <datum-comment> or <empty>, or a quoted <datum-comment> or <empty>, is
    ignored as <whitespace> "