[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[oleg@xxxxxxxxx: Minor quibbles on the latest draft]



[forwarded on behalf of Oleg]

Hello!

	I'm sorry I'm late to the party, which has almost finished. It
looks like a great party.

	The following comments apply to the latest draft
	http://sgmiller.org/srfi-44.html

"As standard Scheme lacks the ability to automatically dispatch on
different function behaviors based on the types of arguments, it
difficult to devise a portable implementation of this SRFI which
allows arbitrary future collections. It is expected that Scheme system
implementors will take advantage of generic function or
object-oriented features of their systems to implement this SRFI
efficiently."

I wonder why a module system was not mentioned at all? Quite often
people use a particular standard collection as an implementation
vehicle for their advanced data structure. A collection will be
encapsulated into the data structure. For example, a graph data
structure may use a dictionary to store adjacency information for a node
(or to store a cost value for a pair of nodes). In that case, most
of the implementation code is monomorphic with respect to the
collection type and no run-time dispatch is required. Given an
advanced module system, I can import a particular dictionary
implementation (e.g., associative lists) with renaming
	make-alist => make-mydict
	alist-get  => mydict-get
etc. In my own code I would use the names make-mydict, mydict-get,
etc. exclusively. If at a later time I decide to use a different
dictionary implementation, I would only need to change the module
declaration statement.
	
I have also taken a look at the Edison:
	http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hfl/hfl/edison/
in particular, Coll/Coll/Collection.hs

The latter gave a few suggestions.

> procedure: % value ... => % 

>      Constructs a % collection with the zero or more values provided
> as its initial contents. 

Should we add that if the collection is a set, it is unspecified which
element is kept in case of duplicates? Or should duplicates be
outlawed? [Edison takes the former approach].

> procedure: *-add bag value => *
> procedure: *-add![!] bag value => * 

>      Adds a single value to a bag. 

> procedure: *-add-all dest-bag source-bag => *
> procedure: *-add-all![!] dest-bag source-bag => * 

>      Adds all the values in the source bag to the destination bag. 

Edison says: insert [our add] keeps the new element in case of
duplicates, but insertSeq [our add-all] keeps an unspecified element.

Edison defines the procedure count, for the occurrence count of an
element in a collection. Sure, count may be expensive for some
collections. For sets, the occurrence count is defined too (but
required to return either 0 or 1). Edison also defines a procedure

>   instanceName   :: c -> String
>     -- the name of the module implementing c

which can be useful, for example, for generating error messages. The
procedure is useful regardless of a particular module or OO system.

For ordered collections, Edison defines procedures to view, insert or
delete the maximum and the minimum elements. I don't know how useful
this may be (many advanced collections do allow efficient
implementations of such operations -- especially for collections that
are used for priority queues or dequeues).

It goes without saying that many operations in Edison -- like union or
intersection of two collections -- are perhaps too high-order or too
laborous or too infrequent to warrant their specification in
SRFI-44. It's better to stick to the bare minimum and provide
extensions only for very frequent operations.


> procedure: *-ref sequence integer => boolean 

>      Returns the value stored in the sequence at the integer index
> provided. It is an error to reference an index outside the range of
> the sequence. 

The return type can't be boolean. Right?

BTW, often a function *-ref-with-default is surprisingly useful. The
function takes an extra argument -- the default value -- and returns
literally (in the sense of eq?) that argument in case the lookup
failed.

Edison's sequences,
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hfl/hfl/edison/Seq/Sequence.hs

are also interesting, in particular procedures lview, rview, inBounds,
adjust and zippers. However, one may also leave them out for SRFIs to
come.



> (eq? <input collection> (update-procedure <input collection>)) will
> always return a non-false value.

Perhaps this statement is too conservative? R5RS requires that eq? (as
well as eqv? and equal?) return a boolean value. Therefore, the only
non-false return value of eq? is #t. It might be a good idea to say
that explicitly. 


> The collection fold procedure then invokes the folding function on a
> value of the collection and the seeds, and receives from the folding
> function a proceed value followed by a like number of seed return
> values. If the termination value is non-false,

Perhaps you meant "the proceed value" rather than a termination
value. Otherwise, "termination value" is not mentioned anywhere else in
the document.

> In order to ensure a consistent ordering in the collection, the
> ordering function must return the same result for like inputs over
> time. 
In other words, an ordering function must be a pure function, a
mapping of its arguments. One might add that an implementation of a
collection may assume 
    (and (equal? x x1) (equal? y y1)) ==>
	(equal? (ordering-function x y) (ordering-function x1 y1))

Should the same requirement be applied to a equivalence function as
well?

	Cheers,
	Oleg

Attachment: pgp6FtTpDUqy1.pgp
Description: PGP signature