This page is part of the web mail archives of SRFI 44 from before July 7th, 2015. The new archives for SRFI 44 contain all messages, not just those from before July 7th, 2015.
Bradd wrote: >> That's why I suggested that you use dictionaries as "mapping" >> collections rather than "pure index" collections. For one thing, your >> sole example of a dictionary -- the Scheme alist -- is a mapping, not >> an index. scgmille@xxxxxxxxxxxxxxxxxx wrote: > This is clear, thank you. I believe its obvious we wish to support > mappings, as its the more general concept. Now that thats cleared up, > what issues remain in the SRFI which impede that goal? Going through the procedure list: COLLECTION-FOLD-LEFT/RIGHT For dictionaries, enumerates the range-domain pairs. In other words, the "single collection value" accepted by the fold-function should be a (cons range-value domain-value). COLLECTION-FOLD-KEYS-LEFT/RIGHT All of the collections defined so far can support this, because they all have some form of "indexing." This is true even for bags -- the *-contains? procedure is effectively a self-index. The fold-function must accept two arguments in addition to the seeds: (fold-function key element seed ...) For all collections, the key is the value used to find an element in the collection: Collection Key set, bag same as the element sequence position number dictionary range value Collection Sample element Fold-function called with set, bag "hello" (f "hello" "hello" seeds ...) sequence "hello" (f 0 "hello" seeds ...) dictionary ("hello" "goodbye") (f "hello" ("hello" "goodbye") seeds ...) *-COUNT, *-KEY-COUNT *->LIST, *-KEYS->LIST These should be semantically identical to collection-fold[-keys] with an appropriate fold-function. For example, *-count for dictionaries accepts a range-domain pair, and *-key-count is defined for all collections. For bags and sets, the two functions are effectively aliases for each other. For sequences, *-key-count will always return 1 or 0 (i.e., does that position exist in the sequence). Likewise, *->LIST returns a list of range-domain pairs (an alist!) for dictionaries. *-KEYS->LIST is actually not very useful for bags, sets, and dictionaries (you already get everything you need with *->LIST), but it's very useful for sequences. It may also be useful for future collection types. *-EQUIVALENCE-FUNCTION, *-KEY-EQUIVALENCE-FUNCTION *-ORDERING-FUNCTION, *-KEY-ORDERING-FUNCTION Again, it's possible to define the key relationship functions for all collections so far. For bags and sets, they're identical to the element relationship functions. For sequences, they are = and <. For dictionaries, the element relationship functions should compare range-domain pairs, and the key relationship functions should compare range values only. *-GET For dictionaries, gets an arbitrary range-domain pair. *= Much easier to define now: Create the union of all collection elements (annotated with their source collections) and partition it with elt=. The collections are equivalent if each partition contains an equal number of elements from each collection. Example: Let b be a bag of pairs that contains ((a 1) (b 2) (c 3)) d be a dictionary that contains ((a 1) (b 2) (b 3) (c 3)) elt= be (and (eqv? (car e1) (car e2)) (eqv? (car e1) (car e2))) The partition is ([b:(a 1) d:(a 1)] [b:(b 2) d:(b 2)] [d:(b 3)] [b:(c 3) d:(c 3)]) The first, second, and fourth elements of the partition contain equal numbers of elements from b and d, but the third element does not. The two collections are not equal. *-CONTAINS?, *-ADD If these accept elements as arguments, they're also well-defined for dictionaries. In fact, dictionaries would seem to support the entire bag interface, so long as you remember that all "values" are really range-domain pairs. Likewise, bags can support the entire dictionary interface, so long as you remember that the "keys" to a bag are identical to the elements of the bag. Therefore, I recommend making dictionary a subtype of bag and adding *-contains-key? to the bag interface. In most cases, the two "contains?" functions are equivalent to: (*-contains? cv elt) => (not (= 0 (*-count cv elt))) (*-contains-key? cv elt) => (not (= 0 (*-key-count cv elt))) You can also add *-contains-key? to the set interface, or make set a subtype of bag. Or make bag a subtype of set. Honestly, classification of collections is *very* difficult. Collections are "value" types, and those don't translate well to class-based OO systems. They work a bit better with a prototype-based OO system, and even better in a generic traits- or interface-based system. *-INSERT-LEFT/RIGHT These need a failure thunk (or a way to raise an exception). Insertion may not be possible, not even at the ends, e.g., if the container is inflexible or has an ordering requirement. Basically, *all* mutators need failure thunks (or some other exception-throwing convention), because generic mutators can always run into problems with violated requirements. *-REF pos, *-GET key These are really the same function. Both of them turn a range value into a domain value. Typically, collections use the *-REF convention for sequential data and the *-GET convention for dictionary-like data. OK solution: Make them aliases for the same basic function. Better solution: Make *-get the generic range->domain function. For dictionaries, it takes a range value and returns a domain value. For sequences, it's the same as *-ref. For bags, it takes an element value and returns that element. Define *-ref only for sequences and ordered collections. An ordered dictionary can then use *-get to look up keys and *-ref to look up the nth pair in the collection. Consider adding *-get-any, which returns all elements that match the key. Unlike *-get, which should return *domain values* for dictionaries (because that's how it works in existing dictionary-like collections), *-get-any should return *elements* (range-domain pairs) for dictionaries. *-DELETE, *-REMOVE These are tricky. Depending on the collection type, you may need a way to eliminate elements by position, by key, or by whole elements. For bags and sets, "key" and "value" are the same thing. For dictionaries, they're not. Usually, you'll want to delete mappings by key (the range value), but occasionally you'll want to delete by whole mappings (e.g., for doing dictionary differences with remove-all). I'm not sure how to resolve this one yet. I think DELETE should eliminate positions, and REMOVE should eliminate keys. Need a good name for "eliminate based on the whole element value" for dictionary differences. *-PUT This is probably OK as is, but you may want to consider generalizing it to work for all collections, since they all have keys and values. *-UPDATE I definitely recommend extending this to all collections, or at least to sequences. I think that covers all of the dictionary/index issues (plus a few other minor issues that occurred to me while I was writing it.) -- Bradd W. Szonye http://www.szonye.com/bradd