225: Dictionaries

by John Cowan (spec) and Arvydas Silanskas (implementation)

Status

This SRFI is currently in final status. To provide input on this SRFI, please send email to srfi-225@nospamsrfi.schemers.org. Previous messages are available in the mailing list archive.

Abstract

The procedures of this SRFI allow callers to manipulate an object that maps keys to values without the caller needing to know exactly what the type of the object is. Such an object is called a dictionary or dict in this SRFI.

Rationale

Until recently, there was only one universally available mechanism for managing key-value pairs: alists. Most Schemes also support hash tables, but until R6RS there was no standard interface to them, and many implementations do not provide that interface.

In addition, alists can have multiple entries with the same key, which makes them atypical instances of persistent dictionaries.

Now, however, the number of such mechanisms is growing. In addition to both R6RS and R7RS hash tables, there are R7RS persistent inherently ordered and hashed mappings from SRFI 146, inherently ordered mappings with fixnum keys from SRFI 224, and inherently ordered bytevector key-value stores (often on a disk or a remote machine) from SRFI 167.

It’s inconvenient for users if SRFIs or other libraries accept only a specific type of dictionary. This SRFI exposes a number of accessors, updaters, and other procedures that can be called on any dictionary, provided that a dictionary type object (DTO) is available for it: either exported from this SRFI, or from other SRFIs or libraries, or created by the user. DTOs are of an unspecified type.

Specification

By using the procedures of this SRFI, a procedure can take a DTO and a dictionary as arguments and make flexible use of the dictionary without knowing its exact type. For the purposes of this SRFI, such a procedure is called a generic procedure.

However, it is still necessary to distinguish between pure and impure dictionary types. A pure dictionary either does not support updates at all, or else updates are persistent so that a new dictionary is returned by an update that can share storage with the original dictionary but is distinct from it. Impure dictionaries, on the other hand, perform updates by mutation. SRFI 146 mappings are pure dictionaries; SRFI 125 hash tables are impure. Note that if an instance of an impure dictionary type like SRFI 126 is in fact immutable, it still counts as impure. The generic predicate dict-pure? can be used to distinguish the two types.
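As a rough illustration, a generic procedure that works with both pure and impure dictionaries can simply keep the dictionary returned by each update procedure instead of relying on mutation. The sketch below assumes the library can be imported as (srfi 225); dict-increment is a hypothetical name used only for this example.

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

;; Hypothetical generic procedure: add 1 to the count stored under
;; key, treating a missing key as 0.  The caller must keep the
;; returned dictionary, because a pure dictionary is not mutated.
(define (dict-increment dto dict key)
  (dict-update/default! dto dict key (lambda (n) (+ n 1)) 0))

(define counts '((a . 1)))
(define counts2 (dict-increment eqv-alist-dto counts 'a))

(dict-ref/default eqv-alist-dto counts2 'a #f) ; ⇒ 2
(dict-pure? eqv-alist-dto counts2)             ; ⇒ #t
```

The same dict-increment could be handed a hash table and its DTO; only the arguments change, not the procedure.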

In addition, dictionaries need to be constructed using type-specific constructors, as the performance characteristics differ in each case. Moreover, when a dictionary has persistent storage of some kind, ancillary information such as a file name or DBMS table name is generally required. Consequently, there are no make-dict, dict, dict-unfold, dict-copy, or similar procedures provided by this SRFI.

Each of the following examples is assumed to be prefixed by the following definitions:

(define dict '((1 . 2) (3 . 4) (5 . 6)))
(define dto eqv-alist-dto)

Consequently, previous examples don't affect later ones.

The dto argument is not discussed in the individual procedure descriptions below, but it is an error if invoking dictionary? on dto and dict would return #f. The dictionary? generic procedure itself is an exception to this.

Definitions

We call a specific key-value combination an association. (This is why an alist, or association list, is called that; it is a list of associations represented as pairs.)

A dictionary or dict is a collection of associations which may or may not be inherently ordered by their keys. In principle an equality predicate is enough, given a key, to determine whether an association with that key exists in the dictionary. However, for efficiency most dictionaries require an ordering predicate or a hash function as well.

When a key argument is said to be the same as some key of the dictionary, it means that they are the same in the sense of the dictionary’s implicit or explicit equality predicate. Two dictionaries are similar if they have the same DTO and have the same equality predicate and the same ordering predicate and/or hash function.

Alists

Alists are supported as dictionaries, but are given special treatment. Associations with new keys are added to the beginning of the alist and the new alist is returned. The examples in this SRFI use alists. Alists are treated as pure, but copying is done as necessary to guarantee that the update procedures of this SRFI never result in an alist with duplicate keys. However, an alist constructed by other means may have duplicate keys, in which case the first occurrence of the key is the relevant one.
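The following sketch illustrates these two points: an update returns a new alist and leaves the original alone, and a lookup on an alist with duplicate keys sees only the first occurrence. It assumes the library can be imported as (srfi 225).

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

(define original '((1 . 2) (3 . 4)))
(define updated (dict-set! eqv-alist-dto original 3 5))

original                                      ; ⇒ ((1 . 2) (3 . 4))
(dict-ref/default eqv-alist-dto updated 3 #f) ; ⇒ 5
(dict-size eqv-alist-dto updated)             ; ⇒ 2, no duplicate key

;; An alist built by other means may repeat a key; the first
;; occurrence is the one that lookups see.
(dict-ref/default eqv-alist-dto '((1 . first) (1 . second)) 1 #f)
;; ⇒ first
```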

An alist (unlike a hashtable or mapping) does not know which equality predicate its users intend to use on it. Therefore, rather than exporting a single DTO for all alists, this SRFI provides a procedure make-alist-dto that takes an equality predicate and returns a DTO specialized for manipulation of alists using that predicate. For convenience, DTOs for eqv? and equal? are exported.
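For instance, an alist with string keys needs a DTO built from string=?. This is a minimal sketch, assuming the library can be imported as (srfi 225); string-alist-dto and ages are hypothetical names.

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

;; string-alist-dto is a hypothetical name for this example.
(define string-alist-dto (make-alist-dto string=?))

(define ages '(("alice" . 30) ("bob" . 25)))

;; string-copy yields a fresh string, so the lookup succeeds only
;; because keys are compared with string=?.
(dict-ref/default string-alist-dto ages (string-copy "bob") #f) ; ⇒ 25
(dict-contains? string-alist-dto ages "carol")                   ; ⇒ #f
```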

Predicates

(dictionary? dto obj)

Returns #t if obj answers #t to the type predicate stored in dto and #f otherwise.

(dictionary? dto dict) ⇒ #t
(dictionary? dto 35) ⇒ #f

(dict-empty? dto dict)

Returns #t if dict contains no associations and #f if it does contain associations.

(dict-empty? dto '()) ⇒ #t
(dict-empty? dto dict) ⇒ #f

(dict-contains? dto dict key)

Returns #t if one of the keys of dict is the same as key, and #f otherwise.

(dict-contains? dto dict 1) ⇒ #t
(dict-contains? dto dict 2) ⇒ #f

(dict=? dto = dict1 dict2)

Returns #t if the keys of dict1 and dict2 are the same, and the corresponding values are the same in the sense of the = argument.

(dict=? dto = dict '((5 . 6) (3 . 4) (1 . 2))) ⇒ #t
(dict=? dto = dict '((1 . 2) (3 . 5))) ⇒ #f

(dict-pure? dto dict)

Returns #t if dto describes a pure dictionary. The dict argument is required for the sake of uniformity with other generic procedures, but it can have any value.

(dict-pure? dto dict) ⇒ #t

Accessors

(dict-ref dto dict key [failure [success] ])

If key is the same as some key of dict, then invokes success on the corresponding value and returns its result. If key is not a key of dict, then invokes the thunk failure and returns its result. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-ref dto dict 1 (lambda () '()) list) ⇒
  (2) ; Success wraps value in a list
(dict-ref dto dict 2 (lambda () '()) list) ⇒
  ()  ; Failure returns empty list

(dict-ref/default dto dict key default)

If key is the same as some key of dict, returns the corresponding value. If not, returns default.

(dict-ref/default dto dict 1 #f) ⇒ 2
(dict-ref/default dto dict 2 #f) ⇒ #f

(dict-comparator dto dict)

Returns a comparator representing the type predicate, equality predicate, ordering predicate, and hash function of dict. The last two may be #f if the dictionary does not make use of these functions.

If the comparator is unavailable or is irrelevant to the dictionary type, returns #f.

Update procedures

Note that the following procedures apply to both pure and impure dictionaries (see dict-pure?). Their names uniformly end in ! even though it depends on the dictionary whether any mutation is done.

Updates are not permitted while any generic procedure that takes a procedure argument is running.

(dict-set! dto dict obj ...)

Returns a dictionary that contains all the associations of dict plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dict, the new value prevails.

(dict-set! dto dict 7 8) ⇒
  ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-set! dto dict 3 5) ⇒
  ((3 . 5) (1 . 2) (5 . 6))

(dict-adjoin! dto dict obj ...)

Returns a dictionary that contains all the associations of dict plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dict, the old value prevails.

(dict-adjoin! dto dict 7 8) ⇒
  ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-adjoin! dto dict 3 5) ⇒
  ((1 . 2) (3 . 4) (5 . 6))

(dict-delete! dto dict key ...)

Returns a dictionary that contains all the associations of dict except those whose keys are the same as one of the keys.

(dict-delete! dto dict 1 3) ⇒
  ((5 . 6))
(dict-delete! dto dict 5) ⇒
  ((1 . 2) (3 . 4))

(dict-delete-all! dto dict keylist)

The same as dict-delete!, except that the keys to be deleted are in the list keylist.

(dict-delete-all! dto dict '(1 3)) ⇒ ((5 . 6))

(dict-replace! dto dict key value)

Returns a dictionary that contains all the associations of dict except as follows: If key is the same as a key of dict, then the association for that key is omitted and replaced by the association defined by the pair key and value. If there is no such key in dict, then dict is returned unchanged.

(dict-replace! dto dict 1 3) ⇒
  ((1 . 3) (3 . 4) (5 . 6))

(dict-intern! dto dict key failure)

If there is a key in dict that is the same as key, returns two values, dict and the value associated with key. Otherwise, returns two values, a dictionary that contains all the associations of dict and in addition a new association that maps key to the result of invoking failure, and the result of invoking failure.

(dict-intern! dto dict 1 (lambda () #f)) ⇒ ; 2 values
  ((1 . 2) (3 . 4) (5 . 6))
  2
(dict-intern! dto dict 2 (lambda () 0)) ⇒ ; 2 values
  ((2 . 0) (1 . 2) (3 . 4) (5 . 6))
  0

(dict-update! dto dict key updater [failure [success] ])

Retrieves the value of key as if by dict-ref, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set!, but may do so more efficiently. Returns the updated dictionary. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-update! dto dict 1 (lambda (x) (+ 1 x))) ⇒
  ((1 . 3) (3 . 4) (5 . 6))
(dict-update! dto dict 2 (lambda (x) (+ 1 x))) ⇒
  error

(dict-update/default! dto dict key updater default)

Retrieves the value of key as if by dict-ref/default, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set!, but may do so more efficiently. Returns the updated dictionary.

(dict-update/default! dto dict 1 (lambda (x) (+ 1 x)) 0) ⇒
  ((1 . 3) (3 . 4) (5 . 6))
(dict-update/default! dto dict 2 (lambda (x) (+ 1 x)) 0) ⇒
  ((2 . 1) (1 . 2) (3 . 4) (5 . 6))

(dict-pop! dto dict)

Chooses an association from dict and returns three values: a dictionary that contains all associations of dict except the chosen one, the key, and the value of the association chosen. If the dictionary is inherently ordered, the first association is chosen; otherwise, the chosen association is arbitrary.

If dict contains no associations, it is an error.

(dict-pop! dto dict) ⇒ ; 3 values
  ((3 . 4) (5 . 6))
  1
  2

(dict-find-update! dto dict key failure success)

This procedure is a workhorse for dictionary lookup, insert, and delete. The dictionary dict is searched for an association whose key is the same as key. If one is not found, then the failure procedure is tail-called with two procedure arguments, insert and ignore.

If such an association is found, then the success procedure is tail-called with the matching key of dict, the associated value, and two procedure arguments, update and delete.

In either case, the values returned by failure or success are returned.
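The text above does not spell out the arguments of insert, ignore, update, and delete. The following self-contained sketch models the protocol for plain alists under the assumption that they are called as (insert value), (ignore), (update new-key new-value), and (delete), each returning the resulting dictionary. It is not the sample implementation; alist-find-update, alist-count, and alist-remove-key are hypothetical names.

```scheme
(import (scheme base))

;; Hypothetical model of the protocol for alists compared with equal?.
;; Assumed continuation signatures: (insert value), (ignore),
;; (update new-key new-value), (delete).
(define (alist-find-update alist key failure success)
  (let ((entry (assoc key alist)))
    ;; Copy of alist with the found entry left out.
    (define (without-entry)
      (let loop ((rest alist))
        (cond ((null? rest) '())
              ((eq? (car rest) entry) (cdr rest))
              (else (cons (car rest) (loop (cdr rest)))))))
    (if entry
        (success (car entry) (cdr entry)
                 (lambda (new-key new-value)
                   (cons (cons new-key new-value) (without-entry)))
                 (lambda () (without-entry)))
        (failure (lambda (value) (cons (cons key value) alist))
                 (lambda () alist)))))

;; Count occurrences: insert 1 for a new key, otherwise add 1.
(define (alist-count alist key)
  (alist-find-update alist key
    (lambda (insert ignore) (insert 1))
    (lambda (k v update delete) (update k (+ v 1)))))

;; Delete a key if present, otherwise leave the alist alone.
(define (alist-remove-key alist key)
  (alist-find-update alist key
    (lambda (insert ignore) (ignore))
    (lambda (k v update delete) (delete))))

(alist-count '((a . 1)) 'a)             ; ⇒ ((a . 2))
(alist-count '((a . 1)) 'b)             ; ⇒ ((b . 1) (a . 1))
(alist-remove-key '((1 . 2) (3 . 4)) 1) ; ⇒ ((3 . 4))
(alist-remove-key '((1 . 2) (3 . 4)) 7) ; ⇒ ((1 . 2) (3 . 4))
```

Because every path ends in a tail call to one of the four continuations, lookup, insertion, replacement, and deletion all need only a single search of the dictionary.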

Mapping and filtering

(dict-map dto proc dict)

Returns a dictionary similar to dict that maps each key of dict to the result of applying proc to the key and corresponding value of dict.

(dict-map dto (lambda (k v) (- v)) dict) ⇒
  ((1 . -2) (3 . -4) (5 . -6))

(dict-filter dto pred dict)

(dict-remove dto pred dict)

Returns a dictionary similar to dict that contains just the associations of dict that satisfy / do not satisfy pred when it is invoked on the key and value of the association.

(dict-filter dto (lambda (k v) (= k 1)) dict) ⇒
  ((1 . 2))
(dict-remove dto (lambda (k v) (= k 1)) dict) ⇒
  ((3 . 4) (5 . 6))

The whole dictionary

(dict-size dto dict)

Returns an exact integer representing the number of associations in dict.

(dict-size dto dict) ⇒ 3

(dict-count dto pred dict)

Passes each association of dict as two arguments to pred and returns the number of times that pred returned true as an exact integer.

(dict-count dto (lambda (k v) (even? k)) dict) ⇒ 0

(dict-any dto pred dict)

Passes each association of dict as two arguments to pred and returns the value of the first call to pred that returns true, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in that order; otherwise, in an arbitrary order. If all calls return false, dict-any returns false.

(define (both-even? k v) (and (even? k) (even? v)))
(dict-any dto both-even? '((2 . 4) (3 . 5))) ⇒ #t
(dict-any dto both-even? '((1 . 2) (3 . 4))) ⇒ #f

(dict-every dto pred dict)

Passes each association of dict as two arguments to pred and returns #f after the first call to pred that returns false, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in that order; otherwise, in an arbitrary order. If all calls return true, dict-every returns the value of the last call, or #t if no calls are made.

(define (some-even? k v) (or (even? k) (even? v)))
(dict-every dto some-even? '((2 . 3) (3 . 4))) ⇒ #t
(dict-every dto some-even? '((1 . 3) (3 . 4))) ⇒ #f

(dict-keys dto dict)

Returns a list of the keys of dict. If the dictionary type is inherently ordered, the keys appear in that order; otherwise, in an arbitrary order. The order may change when new elements are added to dict.

(dict-keys dto dict) ⇒ (1 3 5)

(dict-values dto dict)

Returns a list of the values of dict. The results returned by dict-keys and dict-values are not necessarily ordered consistently.

(dict-values dto dict) ⇒ (2 4 6)

(dict-entries dto dict)

Returns two values: a list of the keys and a list of the corresponding values.

(dict-entries dto dict) ⇒ ; 2 values
  (1 3 5)
  (2 4 6)

(dict-fold dto proc knil dict)

Invokes proc on each association of dict with three arguments: the key of the association, the value of the association, and an accumulated result of the previous invocation. For the first invocation, knil is used as the third argument. Returns the result of the last invocation, or knil if there was no invocation. Note that there is no guarantee of a consistent result if the dictionary does not have an inherent order.

(dict-fold dto + 0 '((1 . 2) (3 . 4))) ⇒ 10

(dict-map->list dto proc dict)

Returns a list of values that result from invoking proc on the keys and corresponding values of dict.

(dict-map->list dto (lambda (k v) v) dict) ⇒
  (2 4 6)
(dict-map->list dto - dict) ⇒
  (-1 -1 -1) ; subtract value from key

(dict->alist dto dict)

Returns an alist whose keys and values are the keys and values of dict.

(dict->alist dto dict) ⇒
  ((1 . 2) (3 . 4) (5 . 6))

Iteration

(dict-for-each dto proc dict [ start [ end ] ] )

Invokes proc on each key of dict and its corresponding value in that order. This procedure is used for doing operations on the whole dictionary. If the dictionary type is inherently ordered, associations are processed in the order specified by the dictionary's comparator; otherwise, they are processed in an arbitrary order. The start and end arguments specify the inclusive lower bound and exclusive upper bound of the keys (in the sense of the dictionary's comparator). They can provide additional efficiency when iterating over part of the dictionary if the dictionary is ordered. The procedure returns an unspecified value.

(define (write-key key value) (write key))
(dict-for-each dto write-key dict) ⇒ unspecified
  ; writes "135" to current output

(dict->generator dto dict [ start [ end ] ] )

Returns a SRFI 158 generator that, when invoked, returns the associations of dict as pairs. If the dictionary type is inherently ordered, associations are generated in the order specified by the dictionary's comparator; otherwise, they are generated in an arbitrary order.

The start and end arguments specify the inclusive lower bound and exclusive upper bound of the keys to be processed (in the sense of the dictionary's comparator). They can provide additional efficiency when iterating over part of the dictionary if the dictionary is ordered.

It is an error to mutate dict until after the generator is exhausted. When all the associations have been processed, the generator returns an end-of-file object.
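A minimal sketch of draining such a generator follows, assuming the library can be imported as (srfi 225). It sums the values, so the result does not depend on the order in which associations are generated.

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

(define gen (dict->generator eqv-alist-dto '((1 . 2) (3 . 4) (5 . 6))))

;; Each call to gen yields one (key . value) pair until the
;; end-of-file object signals exhaustion.
(define total
  (let loop ((item (gen)) (sum 0))
    (if (eof-object? item)
        sum
        (loop (gen) (+ sum (cdr item))))))

total ; ⇒ 12
```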

(dict-set!-accumulator dto dict)

Returns a SRFI 158 accumulator procedure that, when invoked on a pair, adds the car and cdr of the pair as a key and value of dict as if by dict-set!, eventually returning the new value of dict. If invoked on an end-of-file object, no action is taken and dict is returned.

(dict-adjoin!-accumulator dto dict)

The same as dict-set!-accumulator, except using dict-adjoin!.
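The sketch below feeds pairs to an accumulator and then retrieves the resulting dictionary by passing an end-of-file object. It assumes the library can be imported as (srfi 225); acc and result are names chosen for the example.

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

(define acc (dict-set!-accumulator eqv-alist-dto '((1 . 2))))

(acc '(3 . 4))
(acc '(1 . 10)) ; same key: the new value prevails, as with dict-set!

;; Passing an end-of-file object returns the accumulated dictionary.
(define result (acc (eof-object)))

(dict-ref/default eqv-alist-dto result 1 #f) ; ⇒ 10
(dict-ref/default eqv-alist-dto result 3 #f) ; ⇒ 4
(dict-size eqv-alist-dto result)             ; ⇒ 2
```

With dict-adjoin!-accumulator in place of dict-set!-accumulator, the second call would leave the value for key 1 at 2, since the old value prevails.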

Dictionary type object procedures (non-generic)

(dto? obj)

Returns #t if obj is a DTO, and #f otherwise.

(make-dto arg ...)

Returns a new DTO providing procedures that allow manipulation of dictionaries of a new type. The args are alternately proc-ids and corresponding procs.

A proc-id argument is the value of a variable whose name is the same as a procedure suffixed with -id, and a proc argument is the specific procedure implementing it for this type. The following proc-id variables and associated procedures need to be provided in each call to make-dto in order for the DTO to support the full set of dictionary procedures:

Note that if any of these are not provided, an implementation-defined set of generic procedures will signal an error satisfying dictionary-error? if invoked.

There are additional proc-id variables that may be provided with corresponding procedures in order to increase efficiency. For example, it is not necessary to provide a dict-ref procedure, because the default version is built on top of dict-find-update!. But if the underlying dictionary provides its own -ref procedure, it may be more efficient to specify it to make-dto using dict-ref-id. Here is the list of additional proc-id variables:

(dto-ref dto proc-id)

Returns the procedure designated by proc-id from dto. This makes it possible to call a particular DTO procedure multiple times more efficiently.

(make-alist-dto equal)

Returns a DTO for manipulating an alist using the equality predicate equal.

Exception procedures (non-generic)

(dictionary-error message irritant ... )

Returns a dictionary error with the given message (a string) and irritants (any objects). If a particular procedure in a DTO cannot be implemented, it instead should signal an appropriate dictionary error that can be reliably caught.

(dictionary-error? obj)

Returns #t if obj is a dictionary error, and #f otherwise.

(dictionary-message dictionary-error)

Returns the message associated with dictionary-error.

(dictionary-irritants dictionary-error)

Returns a list of the irritants associated with dictionary-error.
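As a small sketch of how these procedures fit together, the example below signals a dictionary error and catches it with guard. It assumes the library can be imported as (srfi 225); try is a hypothetical helper, and the message and irritant are made up for illustration.

```scheme
(import (scheme base) (srfi 225)) ; library name is an assumption

;; Hypothetical helper: run thunk and, if a dictionary error is
;; signalled, return its message and irritants as a list.
(define (try thunk)
  (guard (e ((dictionary-error? e)
             (list (dictionary-message e) (dictionary-irritants e))))
    (thunk)))

(try (lambda ()
       (raise (dictionary-error "operation not supported" 'dict-map))))
;; ⇒ ("operation not supported" (dict-map))
```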

Exported DTOs

The following DTOs are exported from this SRFI: srfi-69-dto, hash-table-dto, srfi-126-dto, mapping-dto, and hash-mapping-dto, provided that the corresponding libraries are available. In addition, eqv-alist-dto and equal-alist-dto are unconditionally exported.

Implementation

The sample implementation is found in the GitHub repository.

The following list of dependencies is designed to ease defining new dictionary types that may not have complete dictionary APIs. Each procedure is followed by an indented list of the procedures on which its default implementation depends:

dict-empty?
    dict-size
dict=?
    dict-ref
    dict-keys
    dict-size
dict-contains?
    dict-ref
dict-ref
    dict-pure?
    dict-find-update!
dict-ref/default
    dict-ref
dict-set!
    dict-find-update!
dict-adjoin!
    dict-find-update!
dict-delete!
    dict-delete-all!
dict-delete-all!
    dict-find-update!
dict-replace!
    dict-find-update!
dict-intern!
    dict-find-update!
dict-update!
    dict-find-update!
dict-update/default!
    dict-update!
dict-pop!
    dict-for-each
    dict-delete-all!
    dict-empty?
dict-filter
    dict-keys
    dict-ref
    dict-delete-all!
dict-remove
    dict-filter
dict-count
    dict-fold
dict-any
    dict-for-each
dict-every
    dict-for-each
dict-keys
    dict-fold
dict-values
    dict-fold
dict-entries
    dict-fold
dict-fold
    dict-for-each
dict-map->list
    dict-fold
dict-for-each
    dict-map
dict->generator
    dict-for-each
dict-set!-accumulator
    dict-set!
dict-adjoin!-accumulator
    dict-set!

Acknowledgements

Thanks to the participants on the mailing list.

© 2021 John Cowan, Arvydas Silanskas.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler