This page is part of the web mail archives of SRFI 44 from before July 7th, 2015. The new archives for SRFI 44 contain all messages, not just those from before July 7th, 2015.
[forwarded on behalf of Oleg] Hello! I'm sorry I'm late to the party, which has almost finished. It looks like a great party. The following comments apply to the latest draft http://sgmiller.org/srfi-44.html "As standard Scheme lacks the ability to automatically dispatch on different function behaviors based on the types of arguments, it difficult to devise a portable implementation of this SRFI which allows arbitrary future collections. It is expected that Scheme system implementors will take advantage of generic function or object-oriented features of their systems to implement this SRFI efficiently." I wonder why a module system was not mentioned at all? Quite often people use a particular standard collection as an implementation vehicle for their advanced data structure. A collection will be encapsulated into the data structure. For example, a graph data structure may use a dictionary to store adjacency information for a node (or to store a cost value for a pair of nodes). In that case, most of the implementation code is monomorphic with respect to the collection type and no run-time dispatch is required. Given an advanced module system, I can import a particular dictionary implementation (e.g., associative lists) with renaming make-alist => make-mydict alist-get => mydict-get etc. In my own code I would use the names make-mydict, mydict-get, etc. exclusively. If at a later time I decide to use a different dictionary implementation, I would only need to change the module declaration statement. I have also taken a look at the Edison: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hfl/hfl/edison/ in particular, Coll/Coll/Collection.hs The latter gave a few suggestions. > procedure: % value ... => % > Constructs a % collection with the zero or more values provided > as its initial contents. Should we add that if the collection is a set, it is unspecified which element is kept in case of duplicates? Or should duplicates be outlawed? [Edison takes the former approach]. > procedure: *-add bag value => * > procedure: *-add![!] bag value => * > Adds a single value to a bag. > procedure: *-add-all dest-bag source-bag => * > procedure: *-add-all![!] dest-bag source-bag => * > Adds all the values in the source bag to the destination bag. Edison says: insert [our add] keeps the new element in case of duplicates, but insertSeq [our add-all] keeps an unspecified element. Edison defines the procedure count, for the occurrence count of an element in a collection. Sure, count may be expensive for some collections. For sets, the occurrence count is defined too (but required to return either 0 or 1). Edison also defines a procedure > instanceName :: c -> String > -- the name of the module implementing c which can be useful, for example, for generating error messages. The procedure is useful regardless of a particular module or OO system. For ordered collections, Edison defines procedures to view, insert or delete the maximum and the minimum elements. I don't know how useful this may be (many advanced collections do allow efficient implementations of such operations -- especially for collections that are used for priority queues or dequeues). It goes without saying that many operations in Edison -- like union or intersection of two collections -- are perhaps too high-order or too laborous or too infrequent to warrant their specification in SRFI-44. It's better to stick to the bare minimum and provide extensions only for very frequent operations. > procedure: *-ref sequence integer => boolean > Returns the value stored in the sequence at the integer index > provided. It is an error to reference an index outside the range of > the sequence. The return type can't be boolean. Right? BTW, often a function *-ref-with-default is surprisingly useful. The function takes an extra argument -- the default value -- and returns literally (in the sense of eq?) that argument in case the lookup failed. Edison's sequences, http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hfl/hfl/edison/Seq/Sequence.hs are also interesting, in particular procedures lview, rview, inBounds, adjust and zippers. However, one may also leave them out for SRFIs to come. > (eq? <input collection> (update-procedure <input collection>)) will > always return a non-false value. Perhaps this statement is too conservative? R5RS requires that eq? (as well as eqv? and equal?) return a boolean value. Therefore, the only non-false return value of eq? is #t. It might be a good idea to say that explicitly. > The collection fold procedure then invokes the folding function on a > value of the collection and the seeds, and receives from the folding > function a proceed value followed by a like number of seed return > values. If the termination value is non-false, Perhaps you meant "the proceed value" rather than a termination value. Otherwise, "termination value" is not mentioned anywhere else in the document. > In order to ensure a consistent ordering in the collection, the > ordering function must return the same result for like inputs over > time. In other words, an ordering function must be a pure function, a mapping of its arguments. One might add that an implementation of a collection may assume (and (equal? x x1) (equal? y y1)) ==> (equal? (ordering-function x y) (ordering-function x1 y1)) Should the same requirement be applied to a equivalence function as well? Cheers, Oleg
Attachment:
pgp6FtTpDUqy1.pgp
Description: PGP signature