225: Dictionaries

by John Cowan (spec) and Arvydas Silanskas (implementation)

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-225@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

The procedures of this SRFI allow callers to manipulate an object that maps keys to values without the caller needing to know exactly what the type of the object is. However, what procedures can be called on the dictionary is partly determined by whether it is persistent or mutable. Such an object is called a dict in this SRFI.

Issues

None at present.

Rationale

Until recently there was only one universally available mechanism for managing key-value pairs: alists. Most Schemes also support hash tables, but until R6RS there was no standard interface to them, and many implementations do not provide that interface.

Now, however, the number of such mechanisms is growing. In addition to both R6RS and R7RS hash tables, there are R7RS persistent ordered and hashed mappings from SRFI 146, ordered mappings with fixnum keys from SRFI 224, and ordered bytevector key-value stores (often on a disk or a remote machine) from SRFI 167.

It’s inconvenient for users if SRFIs or other libraries have to insist on accepting only a specific type of dictionary. This SRFI exposes a number of accessors, mutators, and other procedures that can be called on any dictionary, provided that a dictionary type descriptor (DTD, with apologies to SGML and XML users) is available for it: either exported from this SRFI, or from other SRFIs or libraries, or created by the user. DTDs are of an unspecified type.

This in turn requires that the DTD provides a predicate that can recognize its dictionaries, plus several primitive generic procedures.

By using the procedures of this SRFI, a procedure can take a DTD and a dictionary as an argument and make flexible use of the dictionary without knowing its exact type. For the purposes of this SRFI, such a procedure is called a generic procedure.

However, dictionaries need to be constructed using type-specific constructors, as the required and optional arguments differ in each case. In addition, the dictionary type provided by the caller of a generic procedure doesn't necessarily have the right performance characteristics needed by the generic procedure itself. Consequently there are no make-dict, dict, dict-unfold, dict-copy or similar procedures in this SRFI.

Specification

We call a specific key-value combination an association. This is why an alist, or association list, is called that; it is a list of associations represented as pairs.

A dictionary or dict is a collection of associations which may be ordered or unordered. In principle an equality predicate is enough, given a key, to determine whether an association with that key exists in the dictionary. However, for efficiency most dictionaries require an ordering predicate or a hash function as well.

When a key argument is said to be the same as some key of the dictionary, it means that they are the same in the sense of the dictionary’s implicit or explicit equality predicate. It is assumed that no dictionary contains two keys that are the same in this sense. Two dictionaries are similar if they are of the same type and have the same equality predicate and the same ordering predicate and/or hash function.

Alists

Alists are supported as dictionaries, but are given special treatment. New values are added to the beginning of the alist and the new alist is returned. If an association has been updated, then both the new and the old association may be processed by the whole-dictionary procedures; by the same token, if an association is removed, any later association with the same key will be exposed. The examples in this SRFI use alists.

An alist (unlike a hashtable or mapping) does not know which equality predicate its users intend to use on it. Therefore, rather than exporting a single DTD for all alists, this SRFI provides a procedure make-alist-dtd that takes an equality predicate and returns a DTD specialized for manipulation of alists using that predicate. For convenience, DTDs for eqv? and equal? are exported.

Each of the following examples is assumed to be prefixed by the following definitions:

(define dict (list '(1 . 2) '(3 . 4) '(5 . 6)))
(define aed alist-eqv-dtd)
Consequently, previous examples don't affect later ones.

The dtd argument is not discussed in the individual procedure descriptions below, but it is an error if invoking dictionary? on dtd and dict would return #f. The dictionary? procedure itself is an exception to this.

Predicates

(dictionary? dtd obj)

Returns #t if obj answers #t to the type predicate stored in dtd and #f otherwise.

(dictionary? aed dict) => #t
(dictionary? aed 35) => #t

(dict-empty? dtd dict)

Returns #t if dict contains no associations and #f if it does contain associations.

(dict-empty? aed '()) => #t
(dict-empty? aed dict) => #f

(dict-contains? dtd dict key)

(dict-contains? aed dict 1) => #t
(dict-contains? aed dict 2) => #f

Returns #t if one of the keys of dict is the same as key, and #f otherwise.

(dict=? dtd = dict1 dict2)

Returns #t if the keys of dtd dict1 and dict2 are the same, and the corresponding values are the same in the sense of the = argument.

(define dicta '((5 . 6) (3 . 4) (1 . 2))
(define dictb '((1 . 2) (3 . 4))
(dict=? aed = dict dicta) => #t
(dict=? aed = dict dictb) => #f

(dict-mutable? dtd dict)

Returns #t if the dictionary type supports mutations and #f if it supports functional updates.

(dict-mutable? hash-table-dtd (make-hash-table)) => #t
(dict-mutable? aed dict) => #f

Lookup

(dict-ref dtd dict key [failure [success] ])

If key is the same as some key of dict, then invokes success on the corresponding value and returns its result. If key is not a key of dict, then invokes the thunk failure and returns its result. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-ref aed dict 1 (lambda () '()) list) =>
  (1) ; Success wraps value in a list
(dict-ref aed dict 2 (lambda () '()) list) =>
  ()  ; Failure returns empty list

(dict-ref/default dtd dict key default)

If key is the same as some key of dict, returns the corresponding value. If not, returns default.

(dict-ref/default aed dict 1 #f) => 1
(dict-ref/default aed dict 1 #f) => #f

Functional update and mutation

All these procedures exist in pairs, with and without a final !. The descriptions apply to the procedures without !; the procedures with ! mutate their dictionary argument and do not return a dictionary value. Only one set of procedures is supported by any dictionary: for example, SRFI 125 hash tables support only mutation, whereas SRFI 146 mappings support only functional update. The dict-mutable? procedure can be used to determine which set is usable.

(dict-set dtd dict obj)
(dict-set! dtd dict obj)

Returns a dictionary that contains all the associations of dict plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dict, the new value prevails.

; new values are prepended
(dict-set aed dict 7 8) =>
   ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-set aed dict 3 5) =>
   ((3 . 5) (1 . 2) (3 . 4) (5 . 6))

(dict-adjoin dtd dict objs)
(dict-adjoin! dtd dict objs)

Returns a dictionary that contains all the associations of dict plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dict, the old value prevails.

; new values are prepended to alists
(dict-adjoin aed dict 7 8) =>
  ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-adjoin aed dict 3 5) =>
  ((1 . 2) (3 . 4) (5 . 6))

(dict-delete dtd dict key)
(dict-delete! dtd dict key)

Returns a dictionary that contains all the associations of dict except those whose keys are the same as one of the keys.

; new values are prepended
(dict-delete aed dict 1 3) =>
  ((5. 6)) ; may share
(dict-delete aed dict 5) =>
  ((1 . 2) (3 . 4))

(dict-delete-all dtd dict keylist)
(dict-delete-all! dtd dict keylist)

Returns a dictionary with all the associations of dict except those whose keys are the same as some member of keylist.

(dict-delete-all aed dict '(1 3)) => ((5 . 6))

(dict-replace dtd dict key value)
(dict-replace! dtd dict key value)

Returns a dictionary that contains all the associations of dict except as follows: If key is the same as a key of dict, then the association for that key is omitted and replaced by the association defined by the pair key and value. If there is no such key in dict, then dictionary is returned unchanged.

(dict-replace aed dict 1 3) =>
  ((1 . 3) (1 . 2) (3 . 4) (5 . 6)) 

(dict-intern dtd dict key failure)
(dict-intern! dtd dict key failure)

If there is a key in dict that is the same as key, returns two values, dict and the value associated with key. Otherwise, returns two values, a dictionary that contains all the associations of dict and in addition a new association that maps key to the result of invoking failure, and the result of invoking failure.

(dict-intern aed dict 1 (lambda () #f)) => ; 2 values
  ((1 . 2) (3 . 4) (5 . 6))
  2
(dict-intern aed dict 2 (lambda () #f)) => ; 2 values
  ((2 . #f) (1 . 2) (3 . 4) (5 . 6))
  #f

(dict-update dtd dict key updater [failure [success] ])
(dict-update! dtd dict key updater [failure [success] ])

Retrieves the value of key as if by dict-ref, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-update aed dict 1 (lambda (x) (+ 1 x))) =>
  ((1 3) (1 2) (3 4) (5 6))
(dict-update aed dict 2 (lambda (x) (+ 1 x))) =>
  error

(dict-update/default dtd dict key updater default)
(dict-update/default! dtd dict key updater default)

Retrieves the value of key as if by dict-ref/default, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary.

(dict-update/default aed dict 1
  (lambda (x) (+ 1 x)) 0) =>
  ((1 3) (1 2) (3 4) (5 6))
(dict-update/default aed dict 2
  (lambda (x) (+ 1 x)) 0) =>
  ((1 0) (1 2) (3 4) (5 6))

(dict-pop dtd dict)
(dict-pop! dtd dict)

Chooses an association from dict and returns three values: a dictionary that contains all associations of dict except the chosen one, and the key and the value of the association chosen. If the dictionary is ordered, the first association is chosen; otherwise the chosen association is arbitrary.

If dictionary contains no associations, it is an error.

(dict-pop aed dict) => ; 3 values
  ((3 . 4) (5 . 6))
  1
  2

(dict-map dtd proc dict)
(dict-map! dtd proc dict)

Returns a dictionary similar to dict that maps each key of dict to the proc on the key and corresponding value of dict.

(dict-map (lambda (k v) (cons v k)) aed dict) =>
   ((2 . 1) (4 . 3) (6 . 5))

(dict-filter dtd pred dict)
(dict-filter! dtd pred dict)
(dict-remove dtd pred dict)
(dict-remove! dtd pred dict)

Returns a dictionary similar to dict that contains just the associations of dict that satisfy / do not satisfy pred when it is invoked on the key and value of the association.

(dict-filter (lambda (k v) (= k 1)) aed dict) =>
  ((1 . 2))
(dict-remove (lambda (k) (= k 1)) aed dict) =>
  ((3 . 4) (5 . 6))

(dict-alter dtd dict key failure success)
(dict-alter! dtd dict key failure success)

This procedure is a workhorse for dictionary lookup, insert, and delete. The dictionary dict is searched for an association whose key is the same as key. If one is not found, then the failure procedure is tail-called with two procedure arguments, insert and ignore.

If such an association is found, then the success procedure is tail-called with the matching key of dict, the associated value, and two procedure arguments, update and remove.

In either case, the values returned by failure or success are returned.

Here are four examples of dict-alter, one for each of the four procedures:

     ;; ignore
     (define-values
       (dict value)
       (dict-alter (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (ignore))
              (lambda args
                (error))))
     (dict->alist aed dict)) => ((a . b))

     ;; insert
     (define-values
       (dict value)
       (dict-alter (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (insert 'd))
              (lambda args
                (error))))
     (dict-ref aed dict 'a)) => b
     (dict-ref aed dict 'c)) => 'd

     ;; update
     (define-values
       (dict value)
       (dict-alter (alist->dict '((a . b))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (update 'a2 'b2))))
     (dict->alist aed dict) => ((a2 . b2)

     ;; delete
     (define-values
       (dict value)
       (dict-alter (alist->dict '((a . b) (c . d))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (delete))))
     (dict->alist aed dict)) => ((c . d))

The whole dictionary

(dict-size dtd dict)

Returns an exact integer representing the number of associations in dict.

(dict-size aed dict) => 3

(dict-count dtd pred dict)

Passes each association of dictionary as two arguments to pred and returns the number of times that pred returned true as an an exact integer.

(dict-count aed dict (lambda (k v) (even? k))) => 0

(dict-any dtd pred dict)

Passes each association of dict as two arguments to pred and returns the value of the first call to pred that returns true, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return false, dict-any returns false.

(define (both-even? k v) (and (even? k) (even? v)))
(dict-any aed both-even? '((2 . 4) (3 . 5))) => #t
(dict-any aed both-even? '((1 . 2) (3 . 4))) => #f

(dict-every dtd pred dict)

Passes each association of dict as two arguments to pred and returns #f after the first call to pred that returns false, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return true, dict-any returns the value of the last call, or #t if no calls are made.

(define (some-even? k v) (or (even? k) (even? v)))
(dict-every aed some-even? '((2 . 3) (3 . 4))) => #t
(dict-every aed some-even? '((1 . 3) (3 . 4))) => #f

(dict-keys dtd dict)

Returns a list of the keys of dict. If the dictionary type is inherently ordered, associations appear in the inherent order; otherwise in an arbitrary order. The order may change when new elements are added to dict.

(dict-keys aed dict) => (1 3 5)

(dict-values dtd dict)

Returns a list of the values of dict. The results returned by dict-keys and dict-values are not necessarily ordered consistently.

(dict-values aed dict) => (2 4 6)

(dict-entries dtd dict)

Returns two values, the results of calling dict-keys and the result of calling dict-values on dict.

(dict-entries aed dict) => ; 2 values
  (1 3 5)
  (2 4 6)

(dict-fold dtd proc knil dict)

Invokes proc on each association of dict with three arguments: the key of the association, the value of the association, and an accumulated result of the previous invocation. For the first invocation, knil is used as the third argument. Returns the result of the last invocation, or knil if there was no invocation. Note that there is no guarantee of a consistent result if the dictionary does not have an inherent order.

(dict-fold + 0 '((1 . 2) (3 . 4))) => 10

(dict-map->list dtd proc dict)

Returns a list of values that result from invoking proc on the keys and corresponding values of dict.

(dict-map->list (lambda (k v) v) dict) =>
  (2 4 6),
(dict-map->list - aed dict) =>
  (-1 -1 -1) ; subtract value from key

(dict->alist dtd dict)

Returns an alist whose keys and values are the keys and values of dict.

Add dict-map, dict-filter, dict-remove, and dict-alter.

(dict-comparator dtd dict)

Return a comparator representing the type predicate, equality predicate, ordering predicate, and hash function of dict. The last two may be #f if the dictionary does not make use of these functions.

If no comparator is relevant to the dictionary type, returns #f.

Iteration

(dict-for-each dtd proc dict)

Invokes proc on each key of dict and its corresponding value in that order. This procedure is used for doing operations on the whole dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. Returns an unspecified value.

(define (write-key key value) (write key))
(dict-for-each write-key aed dict) => unspecified
  ; writes "135" to current output

(dict-for-each< dtd proc dict key)
(dict-for-each<= dtd proc dict key)
(dict-for-each> dtd proc dict key)
(dict-for-each>= dtd proc dict key)

Invokes proc on each key of dict that is less than / less than or equal to / greater than / greater than or equal to key and its corresponding value. This procedure is used for doing operations on part of the dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. Returns an unspecified value.

(dict-for-each-in-open-interval dtd proc dict key1 key2)
(dict-for-each-in-closed-interval dtd proc dict key1 key2)
(dict-for-each-in-open-closed-interval dtd proc dict key1 key2)
(dict-for-each-in-closed-open-interval dtd proc dict key1 key2)

Invokes proc on each key of dict that is that is within the specified interval between key1 and key2, and its corresponding value. This procedure is used for doing operations on part of the dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. Returns an unspecified value.

Generator procedures

(make-dict-generator dtd dict)

Returns a generator that when invoked returns the associations of dict as pairs. When no associations are left, returns an end-of-file object. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. It is an error to mutate dict until the generator is exhausted.

(dict-set-accumulator dtd dict)

Returns a SRFI 158 accumulator procedure that when invoked on a pair adds the car and cdr of the pair as a key and value, returning the new value of the dictionary. If invoked on an end-of-file object, no action is taken and dict is returned. If a key to be added already exists in dictionary, the new value prevails.

(dict-adjoin-accumulator dtd dict)

The same as dict-set-accumulator, except that if a key to be added already exists in dictionary, the old value prevails.

Dictionary type descriptor procedures

(dtd? obj)

Returns #t if obj is a DTD, and #f otherwise.

(make-dtd arg)
(dtd (proc-id procname))     [syntax]

Returns a new DTD providing procedures that allow manipulation of dictionaries of that type. The args are alternately proc-ids and corresponding procs.

A proc-id argument is the value of a variable whose name is the same as a procedure (except those in this section and the Exceptions section) suffixed with -id, and a proc argument is the specific procedure implementing it for this type. Note that there is only one proc-id variable for each pair of mutation and functional-update procedures: the proc-id variable for dict-map-id and dict-map! is dict-map-id.

The following proc-id variables (and corresponding primitive procedures) need to be provided in order for the DTD to support the full set of dictionary procedures:

Note that it is not an error to omit any of these, but some dictionary procedures may be unavailable.

There are additional proc-id variables that may be provided with corresponding procedures in order to increase efficiency. For example, it is not necessary to provide a dict-ref procedure, because the default version is built on top of dict-alter or dict-alter!. But if the underlying dictionary provides its own -ref procedure, it may be more efficient to specify it to make-dtd using dict-ref-id. Here is the list of additional proc-id variables:

The dtd macro behaves like a wrapper around make-dtd with parentheses around each procid-procname pair. The macro may also verify that the proc-ids are valid, that there are no duplicates, etc.

(make-alist-dtd equal)

Returns a DTD for manipulating an alist using the equality predicate equal.

(make-alist-dtd =) => a DTD for alists using numeric equality

(dtd-ref dtd proc-id)

Returns the procedure designated by proc-id from dtd. This allows the ability to call a particular DTD procedure multiple times more efficiently.

Exceptions

(dictionary-error message irritant ... )

Returns a dictionary error with the given message (a string) and irritants (any objects). If a particular procedure in a DTD cannot be implemented, it instead should signal an appropriate dictionary error that can be reliably caught.

(dictionary-error? obj)

Returns #t if obj is a dictionary error and #f otherwise.

(dictionary-message dictionary-error)

Returns the message associated with dictionary-error.

(dictionary-irritants dictionary-error)

Returns a list of the irritants associated with dictionary-error.

Exported DTDs

The following DTDs are exported from this SRFI: srfi-69-dtd, hash-table-dtd, srfi-126-dtd, mapping-dtd, hash-mapping-dtd, alist-eqv-dtd, and alist-equal-dtd. The last two provide DTDs for alists using eqv? and equal? respectively.

Implementation

The sample implementation is found in the GitHub repository.

The following list of dependencies is designed to ease defining new dictionary types that may not have complete dictionary APIs:

dict-empty?
dict-size
dict=?
dict-ref
dict-keys
dict-size
dict-contains?
dict-ref
dict-ref
dict-mutable?
dict-alter or dict-alter!
dict-ref/default
dict-ref
dict-set
dict-alter
dict-adjoin
dict-alter
dict-delete
dict-delete-all
dict-delete-all
dict-alter
dict-replace
dict-alter
dict-intern
dict-alter
dict-update
dict-alter
dict-update/default
dict-update
dict-pop
dict-for-each
dict-delete-all
dict-empty?
dict-map
dict-keys
dict-ref
dict-replace
dict-filter
dict-keys
dict-ref
dict-delete-all
dict-remove
dict-filter
dict-count
dict-fold
dict-any
dict-for-each
dict-every
dict-for-each
dict-keys
dict-fold
dict-values
dict-fold
dict-entries
dict-fold
dict-fold
dict-for-each
dict-map->list
dict-fold
dict->alist
dict-map->list
dict-for-each<
dict-for-each
dict-for-each<=
dict-for-each
dict-for-each>
dict-for-each
dict-for-each>=
dict-for-each
dict-for-each-in-open-interval
dict-for-each
dict-for-each-in-closed-interval
dict-for-each
dict-for-each-in-open-closed-interval
dict-for-each
dict-for-each-in-closed-open-interval
dict-for-each
make-dict-generator
dict-entries
dict-set-accumulator
dict-set
dict-adjoin-accumulator
dict-set

Acknowledgements

Thanks to the participants on the mailing list.

© 2021 John Cowan, Arvydas Silanskas.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler