
Re: More comments, and the ANTLR code is too complex

This page is part of the web mail archives of SRFI 110 from before July 7th, 2015. The new archives for SRFI 110 contain all messages, not just those from before July 7th, 2015.



Thanks for the reply, Mark!  We'll consider your suggestions.
Improving adoption for this syntax is an important goal for us; we
want others to at least try using it before judging the (de)merits of
the syntax, and having it implemented in a major Scheme implementation
(guile) would help tremendously.

Please wait a while; we'll think about how best to show a
"simple and easy way to add SRFI-110 on top of SRFI-105/random Scheme
implementation".

On 6/13/13, Mark H Weaver <mhw@xxxxxxxxxx> wrote:
> Hi David,
>
> "David A. Wheeler" <dwheeler@xxxxxxxxxxxx> writes:
>> Below is a first shot at breaking up it_expr, currently 1 long rule, into
>> 2 rules.
>> This could obviously be repeated to make more rules, each one simpler.
>> Not saying it's done, but would it help to break the current longer rules
>> into more but smaller rules?
>
> No.  This doesn't help at all, because it doesn't reduce the total
> complexity of the specification.  My concern is the amount of mental
> effort required to understand the precise specification.
>
> Part of the problem is that your specification is actually an
> _implementation_, which is made more complex by efficiency concerns.
> For example, constraining yourself to an LL(1) grammar probably rules
> out a more elegant presentation.
>
> Another big problem is the amount of redundancy in this grammar.  For
> example, the pattern "scomment hspace*" is repeated in many places.
> Sometimes it's a prefix wrapped in (...)*, and other times it's iterated
> over by tail recursion.  The pattern "COLLECTING hspace* collecting_tail
> hspace*" is also repeated in several places.  These redundancies make
> more work for the reader, and make me wonder "are all these actually the
> same, or are there slight differences?"
>
> I suspect that the key to simplifying this grammar (apart from moving
> away from ANTLR for purposes of the specification) is to choose a
> different set of non-terminals.
>
> Please take a look at section 7.1 of the R5RS (or the R7RS draft).
> Understanding that grammar is almost effortless, and there's almost no
> redundancy.  Now take a look at the specifications of SRFI-10, SRFI-30,
> and SRFI-38.  All of them are expressed as a list of modifications to
> the R5RS grammar.  That's the kind of thing I'd like to see in the
> SRFI-110 specification.
>
> One more nit while I'm on this subject: In the BNF conventions section,
> you write "a sweet-expression reader MUST act as if it preprocessed its
> input as follows", but as far as I can tell it's not actually possible
> to implement this as a preprocessor.  This "preprocessing" must be
> interleaved with the parser, because several syntactic elements affect the
> preprocessing.  For example, the <* and *> markers manipulate the
> preprocessor's stack, and yet you need a full parser to recognize those
> markers.  Also, if I understand correctly, indentation is only processed
> outside of n-expressions.
>
> I also think that there needs to be a much simpler sample
> implementation: one which does not attempt to be fully featured
> (e.g. omit support for source location tracking), and which is not a
> fully self-contained reader, but is instead expressed in terms of
> existing procedures which are likely already present in an SRFI-105
> reader (or which could be easily created from existing code).
>
> In other words, you should help implementors understand how to add
> SRFI-110 to their existing readers with a minimal amount of code
> changes.  The resulting code needs to be as simple as reasonably
> possible.
>
> Here's one possible strategy: Assume the existence of an n-expression
> reader.  Now write a t-expression reader in terms of it, in the most
> elegant Scheme code possible.  It turns out this is not quite possible,
> but hopefully the problems can be patched up by assuming the existence
> of some other helpers, and/or by adding some functionality to the
> n-expression reader.
>
> After our last email exchange, I spent some time thinking about this,
> and identified a few additional things you might need:
>
> * In order to recognize the special markers, you'll need either (1) a
>   way to "unread" characters, or (2) a way for the n-expression reader
>   to tell you that e.g. the symbol '<*' was "by itself" for purposes of
>   SRFI-110.
>
> * You might need a helper to read special comments without consuming the
>   following datum.
>
> I'm sorry that I cannot be more complete in my analysis of what needs to
> be done, but my time (and motivation) is limited.  Reformulating this
> code will be a lot of work, but I suspect that adoption will be very low
> unless you can show implementors how to add SRFI-110 easily and with a
> small amount of code.
>
>      Regards,
>        Mark
>
>
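
As a rough illustration of option (1) in Mark's list above (a way to "unread" characters so that a marker such as '<*' can be recognized), here is a minimal sketch. It is written in Python rather than Scheme purely so that it is easy to run. The names PushbackReader and try_read_marker are hypothetical, and the rule used for "by itself" (the marker is followed by a delimiter or by end of input) is a simplifying assumption, not the SRFI-110 definition.

```python
import io

# Assumption: a simplified delimiter set, not the one SRFI-110 specifies.
DELIMITERS = set(" \t\r\n()[]{}")


class PushbackReader:
    """Character stream with unlimited pushback ("unread")."""

    def __init__(self, text):
        self._src = io.StringIO(text)
        self._pushed = []  # stack of unread characters, last pushed is read first

    def read(self):
        """Return the next character, or "" at end of input."""
        if self._pushed:
            return self._pushed.pop()
        return self._src.read(1)

    def unread(self, ch):
        """Push one character back; the end-of-input sentinel "" is ignored."""
        if ch:
            self._pushed.append(ch)


def try_read_marker(reader, marker):
    """Consume `marker` only if it stands by itself.

    Returns True and leaves the stream just after the marker when the marker
    is followed by a delimiter or end of input. Otherwise every character
    that was read is pushed back and False is returned.
    """
    consumed = []
    for expected in marker:
        ch = reader.read()
        consumed.append(ch)
        if ch != expected:
            break
    else:
        # Whole marker matched; look at the next character without keeping it.
        following = reader.read()
        reader.unread(following)
        if following == "" or following in DELIMITERS:
            return True
    # Not a standalone marker: restore the stream in reverse order.
    for ch in reversed(consumed):
        reader.unread(ch)
    return False
```

With this shape, a t-expression reader could call try_read_marker(reader, "<*") before handing control to the n-expression reader. When the call returns False the stream is unchanged, so the existing reader sees exactly the input it would have seen anyway. A Scheme port that only offers peek-char gives one character of lookahead, so a two-character marker would need either a small pushback buffer like this one or cooperation from the n-expression reader, which is option (2).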