[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Fundamental design flaws




    > From: Taylor Campbell <campbell@xxxxxxxxxxxx>

Slight out-of-sequence reply order here:

    > > 4) absense of reasonable support for [cursors]

    > Please stop saying 'iterator.'  [...] If you mean 'cursor,'
    > _say_ 'cursor,'

Cursor it is, then.   (I guess I picked up iterator from C++, back in
the day.)

    [...]

    >>   The -FOLD- procedures permit me to do this but only by one of a few
    >>   means all of which seem to be horribly inefficient compared to what
    >>   is achievable:  I can convert the collections to lists or some other
    >>   type;   I can use call/cc.

    > There, you've just stated how you can perfectly easily convert a fold
    > operation into a cursor.  Horribly inefficient?  There are plenty of
    > implementations of CALL/CC that are quite efficient.

It doesn't matter how fast simple uses of call/cc are.  Using call/cc
to emulate cursors has at least three huge drawbacks, two of which
pertain to efficiency, all three of which pertain to semantics:

1) The space consumption of the cursor is in general bound by the size
   of the (entire) continuation of the call to the enumerator.
   A call/cc cursor is going to gc-protect way, way, more than it
   needs to.   (Sure, fairly aggressive optimization can do better
   in some special cases but I think systems which do that will always
   be exceptions rather than the rule.  Optimization of this sort will
   only be possible when the compiler can do global flow analysis that
   includes the implementation of the collection type.)

   The interaction of this with other possible extensions like weak 
   references or first-class environments are scary to contemplate.


2) call/cc+enumerator cursors capture the dynamic scope of the call
   to the procedure passed to the enumerator, which in turn includes
   the dynamic scope of the all to the enumerator.

   If I want to pass one of these cursors to an entirely new dynamic
   scope before using it, then each step of the iteration involves 
   an unbounded number dynamic-wind guard-procedure invocations.

   That gives rise to a semantic problem:


3) call/cc cursors interact awkwardly with srfi-34 exceptions, which
   points to a larger problem

   Try to write a call/cc cursor implementation that handles
   exceptions correctly.   You can do it but, do you see those
   contortions you have to go through with WITH-EXCEPTION-HANDLER?

   That's a symptom of the dynamic scope issue -- that the cursor is
   doing its work in a different dynamic scope than the one in which 
   each iteration step is invoked.   A dynamically bound variable,
   like the one implicit in WITH-EXCEPTION-HANDLER can't be counted on
   to have the right value.

   You can hack around that for exceptions, sure -- but are you also
   going to hack around for the current input and output ports?   What 
   about all the dynamic variables you don't know about at the time 
   you write the cursor?   Is the *-ref procedure of a collection type
   simply not allowed to use any dynamically scoped bindings?

In short, call/cc-cursors and ordinary cursors are very different
things.  They aren't interchangeable alternatives.





    > >    Collection type constructors are a prerequisite for allowing people
    > >    to write portable implementations of collection types.   While a
    > >    portable implementation may or may not be less efficient than
    > >    a native one, many useful collection types would perform quite
    > >    reasonably using just about any reasonable implementation of
    > >    srfi-9, a srfi which I would hope that any non-trivial Scheme
    > >    implementation seeks to implement well.

    > I fail to see why we would include MAKE-* instead of MAKE-%.  As it
    > has been stated before, if you like saying MAKE-DICTIONARY instead of
    > some way to make a hash table that compares with EQ?, write:

Another vocabulary issue.  By "type constructor" I mean a form that
creates a new _type_.  As in "DEFINE-RECORD-TYPE is a type
constructor."

I'm not griping that MAKE-DICTIONARY isn't in there (I agree with your
reasons why it shouldn't be).

I'm griping that, given just R5RS + SRFI-44 I can't write my own value
constructor like MAKE-BTREE-DICTIONARY where:

	(dictionary? (make-btree-dictionary [...])) => #t

Suppose that I changed my mind about 44 or 44 changed or whatever and
its finalized and I decide I like it.  Well, then I'd like to (at
least be able to) make some libraries, portable to all R5RS + SRFI-44
systems, defining new collection types (please, _without_ having to
redefine many of the procedures from 44).

Scott mentioned concern that this would lead to unacceptably
inefficient collections.  I disagree, though.  Even a reference
implemenation that "fell back" (for new types of collection at least)
to srfi-9 records with some kind of vtable would be "fast enough" for
many purposes.

Meanwhile, if the spec of the type constructor doesn't explicitly rely
on 9, then implementations are free to optimize the heck out of it.




    > > 2) unusable procedures

    > >    44 defines procedures which client programs have no use for.
    > >    For example, all of the SET functions could be correctly
    > >    implemented as:

    > > 	(define (set-fn . ignored)
    > >           (error "there is no such thing as a set"))

    > What a useful implementation _that_ would be!

That's my point -- the specs are too weak.

Let's suppose that _no_ other detail of 44 changed accept this:  all
the stuff about sets were removed.   It would make exactly as much
sense and wouldn't have that flaw.   A later SRFI, one that provided
at least one set type (and hopefully more than one) could add that
stuff back in.



    > > 3) unfortunate class hierarchy

    > >   I recognize that there are, for example, operations which apply
    > >   generically to sequences and dictionaries.

    > >   It is a fine idea to provide an abstract interface to such
    > >   operations, much as Common Lisp does for sequences.

    > >   However, this srfi goes much further than that.  In particular, it
    > >   requires that, in these abstract interfaces, a list or a vector is
    > >   always a sequence and never a dictionary.

    > Where would a vector be used and exposed to be used as a dictionary?

    > (the 'exposed to be used' bit is important: yes, I'm aware that hash
    > tables use vectors _internally_; _you_, however, are arguing that we
    > should be using SRFI 9 a lot more; you would probably then advocate
    > using SRFI 9 to abstract away hash tables or whatever other type, and
    > then we wouldn't have this problem)

I'm arguing that the interfaces in a collections srfi should not
provide genericity in a way that precludes traditional lisp idioms.

Using plain old vectors as a hash table is one example.   Using
ordinary lists as  either sequences, sets, or associative tables is
another.  

I think it would be cleaner to have, for example, a set of dictionary
generics that are able to act on (normal) associative lists and a
set separate set of sequence generics that are able to act on (normal)
lists.    But 44 can't do that because it really wants to have
procedures that operate on all "collections" -- and that operate
differently depending on the type of the collection arguments,
differently in the case of sequences and dictionaries (but lists and
association lists are not disjoint types).

One way to fix that is to drop the collections procedures entirely.
Another way is to reduce the collections procedures to those which
operate identically on all collection types.   Another way is to add
extra parameters (or similar glue) to them to disambiguate whether a
given list is (intended as) a sequence or associative list.

    > >   Let me put this a little differently: supposing I really like the
    > >   abstract parts of the collections interface of 44 but I'm writing a
    > >   program that wants to use ordinary lists (plus some other
    > >   implementation-specific types) as _dictionaries_ not sequences.

    > Why would you do this, and not abstract their interface away?

Well, why should the procedures in 44 work directly on vectors, lists,
and strings rather than abstract _those_ away.    They don't abstract
them: they use standard procedures (like MAKE-VECTOR) as part of the
interface to collections.

It seems to me completely arbitrary (and unecessary) to say "because
we want to give you various generic procedures for collection-like
data structures, we're going to declare that a list is always an
EQV?-based flexible sequence -- never any other kind of collection".

I'm saying: either don't try to operate on the Scheme types at all, or
design 44 in such a way that the collections procedures work on (at
least):

	ordinary lists as sequences (with any equivalence predicate)
	ordinary lists as sets (with any equivalence predicate)
	ordinary lists as ordered sequences 
          (with any equivalence /ordering predicate)
	ordinary associative lists as dictionaries (with any equivalence predicate)

(and probably other things I'm forgetting.)

I wouldn't be too disappointed to see ordinary vectors as hash tables
but I admit that that's probably just my Emacs Lisp quirks showing.


    > >   Compared to, say, incrementing an index integer or taking the CDR of
    > >   a pair, those techniques for iteration over multiple collections are
    > >   horrible.

    > What prevents you from using those two methods of iteration?  For
    > list-like structures, use *-DELETE-LEFT, which returns the left-most
    > value and the rest of the collection (this is like SRFI 1's CAR+CDR).
    > For vector-like structures, use *-REF.  What's so difficult about
    > this?

I'm just given a sequence.  I don't know whether it is "list like" or
"vector like".  I thought that was a large part of the reason to have
an interface to sequences.

    > >              It is a serious weakness of 44 that it does not define
    > >   generic interfaces to iterators which would not require those
    > >   techniques.

    > There are compelling reasons to specify folding enumerators instead of
    > cursors, and there are easy implementations of cursors in terms of
    > folding enumerators.  It is therefore rather pointless to specify both
    > folding enumerators _and_ cursors.

I think that the sematic (dynamic scope foo) and performance (amount
of data likely to be gc protected) differences between cursors and
enumerator-based cursors provide a pretty good reason to have both.




    > > 6) buggy and incomplete implementation with unreasonable
    > >    dependencies. 

    > >   Bugs and missing things can be fixed, of course, but in its current
    > >   state, the srfi truly ought to be rejected by the editors on this
    > >   basis.

    > >   Tiny-Clos is, in my opinion, a quite unreasonable dependency.   At
    > >   the _very_ least it should be included in the reference
    > >   implementation so that the proposal is self contained.

    > I was going to do this when I first rewrote the implementation with
    > Tiny-CLOS, but then I realized it would be kind of silly, given how
    > many various modifications/reimplementations of Tiny-CLOS there
    > are.

    > I therefore left it at a list of notes about Tiny-CLOS, what
    > needs to be changed in the standard Tiny-CLOS distribution, et
    > cetera; this is plenty to easily modify the standard Tiny-CLOS
    > distribution to your liking, and to make implementors of
    > variants of Tiny-CLOS easily modify their implementations to
    > make it work with SRFI 44.

I understand but I think you made the wrong choice and maybe two wrong
choices.

Nothing really requires (or could reasonably require) that the
entirely library of SRFI's be "self contained" -- that it give you a 
set of specs that refer to nothing but earlier srfis and standard
scheme, and a set of reference implementations that you could just
load up and go.

Some srfis, (4, 9, and 34 come quickly to mind) are example of "low
level" functionality that we can't even expect the reference
implementation to be all that useful except in rather tightly
constrained situations.

And, if someone were to make, say, a GTk-bindings srfi:  I wouldn't
expect the reference implementation to include GTk.

But I think that such exceptions can and should be rare.  It isn't
something srfi can enforce but it is something that authors and
reviewers can try to promote as The Right Thing.

So, minimally, TinyCLOS ought to have been included.   (It's tiny,
right? :-)

But really, I think the total code-size of the reference
implementation would be smaller and the reference implementation would
be directly useful to more people if it were based on SRFI-9 rather
than Tiny-CLOS.   (See 35 for an example of a srfi that builds its own
little mini-object-system on top of 9.)

    > >                                                           Personally,
    > >   I would much prefer to see a reference implementation based on
    > >   srfi-9 rather than one that defines lots of complicated new
    > >   functionality well outside the scope of the srfi.

    > SRFI 9 doesn't define any sort of dispatching mechanism.  How would it
    > help here?

The "stuff" in 44, even if it were extended with type constructors,
doesn't require a very complicated dispatching mechanism.   Wouldn't
it be, perhaps, a page of code on top of 9?

    > I fail to see why 'dragging in' Tiny-CLOS is a bad idea.  I
    > chose Tiny-CLOS because it provides a simple mechanism for
    > dispatching on collection types and it works with existing
    > collection types.  SRFI 9 is for something entirely different:
    > opaque record types.

Hmm.   It's a bit agnostic about whether or not records are opaque.
It certainly doesn't provide any access to them other than the
predicate and field functions created by DEFINE-RECORD-TYPE.

On the other hand, it's very clear about creating new disjoint types.  

I think part of the problem here is that the reference implementation
of 44 sets out to dispatch on a few standard scheme types, plus
(44-style) alists -- and aside from the alists, 9 wouldn't help much
here.

But there's a few things:

1) As I said, I think you really want type constructors in 44 and 
   9 would give you a mechanism for that.

2) 44 is a little fuzzy about disjointness.   The reference
   implementation gets into trouble with that regarding alists
   but even if that were fixed, are we sure that (44-style) alists
   shouldn't be a new disjoint type?

3) I'm not sure I see how TinyCLOS is providing _that_ much help here.

   A reference implementation that says:

	(define (some-sequence-generic collection [...])
          (cond
            ((vector? collection)	[...])
            ((pair? collection)		[...])
            ((string? collection)	[...])
            [...]
            (#t				
              ((lookup-some-generic collection) [...]))))

   looks pretty clean and clear to me and, operationally, is pretty
   close to what TinyCLOS is actually doing.




    > But anyways, if you notice the organization of the reference
    > implementation, you'll see that it looks like it was deliberately
    > designed to fit several implementations.  A utilities.scm file exists
    > for utilities that aren't specific to one implementation, or don't
    > depend on the mechanism that that implementation uses.  Then there is
    > utilities_tiny-clos.scm, a file of utilities that depend on Tiny-CLOS.
    > Finally, there is srfi-44_tiny-clos.scm, a file that defines SRFI 44
    > in terms of Tiny-CLOS.  A srfi-44_yasos.scm might later be put in the
    > archive of the implementation; perhaps also a srfi-44_meroon.scm.  I
    > merely chose Tiny-CLOS because it has a portable implementation and
    > defines a simple dispatching mechanism; it was useful at the time I
    > chose it.  (YASOS was used previously, but it required lots of CONDs
    > in the generic operations, and I didn't like that; that's why I moved
    > to Tiny-CLOS)

It's an aweful lot of code to drag in for just the functionality in
44.

So I think we have took towards the future:  let's assume there's 6 or
12 later srfis giving us new handy collection types.

My fear here is that, since 44 doesn't have type constructors, they'll
have reference implementations that further depend on large parts of
the reference implementation of 44 that aren't backed by any srfi.

It seems like there's a longer term plan by the contributors to 44 to
give us a library of data structures -- very nice.

But since such a library can be built, quite easily I think, with
relatively little new code, using srfi-9, and since 9 is more likely
to be well supported by srfi-using implementations than any of the
object systems you mention, I think it would be the better choice.

There's a more subtle issue, too, which is that if you stick to a
simpler implementation, its less likely the sum total of the data
structure specs will quietly drift into (in essense) _requiring_ some
quirky aspect of what Tiny-CLOS does.

It's just a bummer to think that if I'm going to want to use these
data structure libraries I have to put up with Tiny-CLOS being "snuck
in the back door" when it sure looks completely unecessary at this
stage.


-t