
Re: sweet-expressions are not homoiconic

This page is part of the web mail archives of SRFI 110 from before July 7th, 2015. The new archives for SRFI 110 contain all messages, not just those from before July 7th, 2015.



John David Stone:
>         The rationale section of SRFI 110 acknowledges that previous
> attempts to deparenthesize LISP-like languages have regularly come to
> grief, but argues that this one will be different, because it will preserve
> the homoiconicity of the parenthesis notation:  In effect, the goal was to
> make grouping manifest in a way that reflects the actual syntactic
> structure of the code, just as the parenthesis notation does.
...
>         As the project evolved, however, it ran into the same sorts of
> difficulties as earlier attempts to deparenthesize LISP-like languages:
> ambiguous constructions,

There are no ambiguous constructions in the sense I understand the term.
We have a rigorous BNF grammar, and a large test suite too.

If there *is* an ambiguity, please let us know.  Where is it?


> awkwardness at the points of transition between
> the parenthesis notation

I disagree; I think it's *really* clear and simple. The rule: "Inside (...),
all indentation processing is disabled".  It is also what many other
languages (like Python) do.
It's not awkward at all.
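
As a small illustration of the Python comparison (the names `total` and `items` are made up for this sketch), Python suspends indentation processing inside brackets, so the irregular indentation here is legal and changes nothing:

```python
# Inside (...) or [...], Python ignores line breaks and indentation.
# The irregular indentation below is deliberate and has no effect.
total = (1 +
              2 +
    3)

items = [
            "a",
  "b",
]
```

Outside the brackets, the same stray indentation would be a syntax error.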

...
>         On the other hand, there is nothing about the \\ and $ symbols that
> represents the corresponding syntactic structures, and indeed \\ is used in
> two completely different ways, depending on context.

There is also nothing about 'x that represents the underlying syntactic
structure.  Readers simply have to learn that 'x represents (quote x).

The point is that these are entirely mechanistic transforms that are
easily learned and independent of the underlying semantics.
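
To make "mechanistic" concrete, here is a rough sketch of a toy reader that expands the traditional abbreviations into ordinary lists. It is not the SRFI 110 reader; `tokenize`, `read`, and `parse` are hypothetical names, nested Python lists stand in for Scheme lists, and only atoms, parentheses, and the four abbreviations are handled:

```python
import re

# Abbreviation prefix -> the symbol it expands to.
ABBREV = {"'": "quote", "`": "quasiquote",
          ",": "unquote", ",@": "unquote-splicing"}

def tokenize(text):
    # ",@" is tried before "," so it is matched as a single token.
    return re.findall(r",@|[()'`,]|[^\s()'`,]+", text)

def read(tokens):
    tok = tokens.pop(0)
    if tok in ABBREV:
        # 'x  =>  (quote x), whatever x happens to mean
        return [ABBREV[tok], read(tokens)]
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(read(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return tok  # a plain atom

def parse(text):
    return read(tokenize(text))
```

Under these assumptions, `parse("'x")` gives `['quote', 'x']`. The expansion never consults what the quoted datum means, which is the sense in which the transform is independent of the underlying semantics.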

> It even has two different names, GROUP and SPLIT...

The "\\" is notionally a single semantic:
"stop the line here and restart at the current indentation level".
However, I have (so far) found it easier to *explain* this
as separate constructs.

If people prefer, we could rename it as a single term, say LINE-RESTART.

We could have used 2 separate markers, though I doubt that'd make you happier.
We wanted to *minimize* the number of different syntactic markers, in part
because every new one increases backwards-compatibility risks.

And finally, SRFI-49 (its predecessor) had a group symbol too, named "group".
Every indentation-sensitive notation for Lisp
has to have SOME way of identifying lists of lists.

>         Sweet-expressions also undermine the (slight but reliable) iconic
> status of some of Scheme's existing marker symbols (quote, quasiquote, and
> unquote), making them context-dependent.... All depend for their iconicity on
> their immediate juxtaposition to the form to which they are attached.  This
> attachment is lost in sweet-expressions.

They still depend on juxtaposition.  The only difference is that, if the
abbreviation is followed by whitespace, then the *sweet-expression* that follows is used
where plausible, not the neoteric-expression.  This solves what would *otherwise*
be an ambiguity, namely: given ', should it apply to the neoteric- or sweet-
expression that follows?  Intervening whitespace means "use the sweet-expression".

This is the SRFI-49 solution also.

>         The counterargument that sweet-expressions are highly readable to
> people who have spent half an hour mastering the rules is not to the
> point.  Other proposals to deparenthesize LISP-like languages have also
> resulted in programs that were readable to the people who were trained to
> read them...  The
> claim, however, was that this time around deparenthesization would succeed,
> because sweet-expressions would be homoiconic.  But they aren't.

Proof by repeated assertion is not a valid rule of argument.
You keep saying the notation is not homoiconic, even though it is
per the definition we've been using.

Let's review the definition that we've been using:
  Homoiconic = "the underlying data structure is clear from the syntax".
Clearly you need to *learn* the syntax, but that takes less than an hour.
After that, given ANY use of the syntax, you can immediately state
its underlying data structure.  That's all "homoiconic" means.

For us, homoiconic does *NOT* mean "has no markers" or
"whitespace isn't syntactically meaningful".
Complaining that t-expressions require *some* effort to learn is not reasonable;
s-expressions are widely acknowledged as being homoiconic, but to use them
you must learn what (a b c) and 'x and `(a ,b) and (q . r) mean.
Remember that even (a b c) is an abbreviation, in this case, for (a . (b . (c . ()))).
The point of a homoiconic structure is that you can easily perceive the
underlying data structure, not that the visible form matches exactly that structure.
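
The (a b c) abbreviation can be spelled out with a short sketch. The helper `to_dotted` is a hypothetical name, and nested Python lists stand in for proper Scheme lists:

```python
def to_dotted(x):
    """Render a proper list in fully dotted pair notation."""
    if not isinstance(x, list):
        return str(x)  # atoms print as themselves
    result = "()"
    # Build from the tail: (a b c) => (a . (b . (c . ())))
    for item in reversed(x):
        result = f"({to_dotted(item)} . {result})"
    return result
```

With this, `to_dotted(["a", "b", "c"])` returns `"(a . (b . (c . ())))"`: the familiar list notation is already one step removed from the pairs it denotes.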

Also, we do *not* claim that sweet-expressions are just homoiconic; we
claim that they are general *AND* homoiconic. The combination of these
properties *IS* different from most past efforts.  Our definition:
  General = "the notation is independent from any underlying semantic".
Most past notations, e.g., M-expressions, Rlisp, IACL2, Logo, etc., don't have both
of those properties.

You don't like "$" (even though lots of Haskell users do),
you don't like "\\" (even though lists-of-lists need to be supported somehow),
and you don't like whitespace-after-abbreviations.
Okay, got that.  But a notation with them is still homoiconic,
just like a notation that allows 'x is homoiconic.


--- David A. Wheeler