225: Dictionaries

by John Cowan (spec) and Arvydas Silanskas (implementation)

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-225@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

The procedures of this SRFI allow callers to manipulate an object that maps keys to values without the caller needing to know exactly what the type of the object is. Such an object is called a dictionary in this SRFI.

Issues

None at present.

Rationale

Until recently there was only one universally available mechanism for managing key-value pairs: alists. Most Schemes also support hash tables, but until R6RS there was no standard interface to them, and many implementations do not provide that interface.

Now, however, the number of such mechanisms is growing. In addition to both R6RS and R7RS hash tables, there are persistent ordered and hashed mappings from SRFI 146 and ordered key-value stores (often on a disk or a remote machine) from SRFI 167.

It’s inconvenient for users if SRFIs or other libraries have to insist on accepting only a specific type of dictionary. This SRFI exposes a number of accessors, mutators, and other procedures that can be called on any dictionary, provided that a dictionary type descriptor (DTD, with apologies to SGML and XML users) is available for it: either exported from this SRFI, or from other SRFIs or libraries, or created by the user. DTDs are of an unspecified type.

This in turn requires that the DTD provides a predicate that can recognize its dictionaries, plus at least these primitive operations: create an empty dictionary from a comparator; determine a dictionary’s current size; reference, update, or insert an element of the dictionary depending on its current contents; map over all the elements with a function mapping each old value to a new one; filter the elements based on their keys or values; and process all the elements using a procedure invoked for its side effects.

By using the procedures of this SRFI, a procedure can take a DTD and a dictionary as an argument and make flexible use of the dictionary without knowing its exact type.

Note that dictionaries must still be constructed using type-specific constructors, as the required and optional arguments differ in each case.

Specification

We call a specific key-value combination an association. This is why an alist, or association list, is called that; it is a list of associations represented as pairs.

A dictionary is a collection of associations which may be ordered or unordered. In principle an equality predicate is enough, given a key, to find if an association with that key exists in the dictionary. However, for efficiency most dictionaries require an ordering predicate or a hash function as well.

When a key argument is said to be the same as some key of the dictionary, it means that they are the same in the sense of the dictionary’s implicit or explicit equality predicate. It is assumed that no dictionary contains two keys that are the same in this sense. Two dictionaries are similar if they are of the same type and have the same equality predicate and the same ordering predicate and/or hash function.

Lists as dictionaries

Lists are supported as dictionaries using the specification in this section. If two keys are the same (in the sense of the specified equality predicate), then all but the first are treated as if they did not exist.

If plist-dtd (see below) is used with a list, then the list is assumed to be a property list, alternating symbol keys with values. Mutation operations actually mutate the property list whenever possible. The equality predicate of this type of dictionary is eq?.

If a list is empty, or its car is a pair, then the list can be treated as an alist. New values are added to the beginning of an alist and the new alist is returned; deletion does not mutate the alist, but returns an alist that may or may not share storage with the original alist. If an association has been updated, then both the new and the old association may be processed by the whole-dictionary procedures. The examples in this SRFI use alists.

However, an alist (unlike a hashtable or mapping) does not know which equality predicate its users intend to use on it. Therefore, rather than exporting a single DTD for alists, this SRFI provides a procedure make-alist-dtd takes an equality predicate and returns a DTD specialized for manipulation of alists using that predicate.

In all other cases, lists cannot be treated as dictionaries unless a DTD exists.

Procedures

Each of the following examples is assumed to be prefixed by the following definitions:

(define dict (list '(1 . 2) '(3 . 4) '(5 . 6)))
(define aed alist-eqv-dtd)
Consequently, previous examples don't affect later ones.

The dtd argument is not discussed in the individual procedure descriptions below, but it is an error if invoking the dictionary? procedure that was used to make the DTD on dictionary returns #f.

Constructor

(make-dict dtd comparator obj ...)

Returns an empty dictionary of the type described by the DTD using comparator to specify to specify the dictionary's equality predicate and its ordering predicate and/or hash function.

The meaning of the other arguments depends on the type of dictionary being created. However, implementations which support the ability to specify the initial capacity of a hash table should interpret a non-negative exact integer as the specification of that capacity. In addition, if the symbols thread-safe, weak-keys, ephemeral-keys, weak-values, or ephemeral-values are present, implementations should create thread-safe hash tables, hash tables with weak keys or ephemeral keys, or hash tables with weak or ephemeral values respectively. Implementations are free to use ephemeral keys or values when weak keys or values respectively have been requested.

Predicates

(dictionary? dtd obj)

Returns #t if obj answers #t to some registered predicate, and #f otherwise.

(define dict (list '(1 . 2) '(3 . 4) '(5 . 6)))
(dictionary? aed dict) => #t

(dict-empty? )

Returns #t if dictionary contains no associations and #f if it does contain associations.

(dict-empty? '()) => #t
(dict-empty? aed dict) => #f

(dict-contains? dtd dictionary key)

(dict-contains? aed dict 1) => #t
(dict-contains? aed dict 2) => #f

Returns #t if one of the keys of dictionary is the same as key, and #f otherwise.

Lookup

(dict-ref dtd dictionary key [failure [success] ])

If key is the same as some key of dictionary, then invokes success on the corresponding value and returns its result. If key is not a key of dictionary, then invokes the thunk failure and returns its result. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-ref aed dict 1 (lambda () '()) list) => (1) ; Success wraps value in a list
(dict-ref aed dict 2 (lambda () '()) list) => ()  ; Failure returns empty list

(dict-ref/default dtd dictionary key default)

If key is the same as some key of dictionary then returns the corresponding value. If not, then returns default.

(dict-ref/default aed dict 1 #f) => 1
(dict-ref/default aed dict 1 #f) => #f

Mutation

All these procedures exist in pairs, with and without a final !. The ones without ! are functional: they never side-effect their arguments, but may copy them. The ones with ! are linear-update: they may return either a new dictionary object (which may or may not share storage with the dictionary argument), or the same dictionary object, mutated; in either case it is an error to access the dictionary later through any other reference to it, as that reference may have been invalidated.

(dict-set dtd dictionary obj)
(dict-set! dtd dictionary obj)

Returns a dictionary that contains all the associations of dictionary plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dictionary, the new value prevails.

; new values are prepended
(dict-set aed dict 7 8) => ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-set aed dict 3 5) => ((3 . 5) (1 . 2) (3 . 4) (5 . 6)

(dict-adjoin dtd dictionary objs)
(dict-adjoin! dtd dictionary objs)

Returns a dictionary that contains all the associations of dictionary plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dictionary, the old value prevails.

; new values are prepended to alists
(dict-adjoin aed dict 7 8) => ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-adjoin aed dict 3 5) => ((1 . 2) (3 . 4) (5 . 6)

(dict-delete dtd dictionary key)
(dict-delete! dtd dictionary key)

Returns a dictionary that contains all the associations of dictionary except those whose keys are the same as one of the keys.

; new values are prepended
(dict-delete aed dict 1 3) => ((5. 6)) ; may share
(dict-delete aed dict 5) => ((1 . 2) (3 . 4)

(dict-delete-all dtd dictionary keylist)
(dict-delete-all! dtd dictionary keylist)

Returns a dictionary with all the associations of dictionary except those whose keys are the same as some member of keylist.

(dict-delete-all aed dict '(1 3)) => ((5 . 6))

(dict-replace dtd dictionary key value)
(dict-replace! dtd dictionary key value)

Returns a dictionary that contains all the associations of dictionary except as follows: If key is the same as a key of dictionary, then the association for that key is omitted and replaced by the association defined by the pair key and value. If there is no such key in dictionary, then dictionary is returned unchanged.

(dict-replace aed dict 1 3) => ((1 . 3) (1 . 2) (3 . 4) (5 . 6)) 

(dict-intern dtd dictionary key failure)
(dict-intern! dtd dictionary key failure)

If there is a key in dictionary that is the same as key, returns two values, dictionary and the value associated with key. Otherwise, returns two values, a dictionary that contains all the associations of dictionary and in addition a new association that maps key to the result of invoking failure, and the result of invoking failure.

(dict-intern aed dict 1 (lambda () #f)) => ; 2 values
  ((1 . 2) (3 . 4) (5 . 6))
  3
(dict-intern aed dict 2 (lambda () #f)) => ; 2 values
  ((2 . #f) (1 . 2) (3 . 4) (5 . 6))
  #f

(dict-update dtd dictionary key updater [failure [success] ])

(dict-update! dtd dictionary key updater [failure [success] ])

Retrieves the value of key as if by dict-ref, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-update/default dtd dictionary key updater default)

(dict-update/default! dtd dictionary key updater default)

Retrieves the value of key as if by dict-ref/default, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary.

(dict-pop dtd dictionary)
(dict-pop! dtd dictionary)

Chooses an association from dictionary and returns three values: a dictionary that contains all associations of dictionary except the chosen one, and the key and the value of the chosen association. If the dictionary is ordered, the first association is chosen; otherwise the chosen association is arbitrary.

If dictionary contains no associations, it is an error.

(dict-pop aed dict) => # 3 values
  ((3 . 4) (5 . 6))
  1
  2

(dict-map dtd proc dictionary)
(dict-map! dtd proc dictionary)

Returns a dictionary similar to dictionary that maps each key of dictionary to the value that results from invoking proc on the corresponding key and value of dictionary.

(dict-map (lambda (k v) (cons v k)) aed dict) => ((2 . 1) (4 . 3) (6 . 5))
(dict-map! (lambda (k v) (cons v k)) aed dict) => ((2 . 1) (4 . 3) (6 . 5))

(dict-filter dtd pred dictionary)
(dict-filter! dtd pred dictionary)

Returns a dictionary similar to dictionary that contains just the associations of dictionary that satisfy pred when it is invoked on the key and value of the association.

(dict-filter (lambda (k v) (= k 1)) aed dict) => ((1 . 2))

(dict-remove dtd pred dictionary)
(dict-remove! dtd pred dictionary)

Returns a dictionary that contains all the associations of dictionary except those that satisfy pred when called on the key and value.

(dict-remove (lambda (k) (= k 1)) aed dict) => ((3 . 4) (5 . 6))

(dict-search dtd dictionary key failure success)
(dict-search! dtd dictionary key failure success)

This procedure is a workhorse for dictionary lookup, insert, and delete. The dictionary dictionary is searched for an association whose key is the same as key in the sense of dictionary. If one is not found, then the failure procedure is tail-called with two procedure arguments, insert and ignore, and is expected to tail-call one of them.

However, if such an association is found, then the success procedure is tail-called with the matching key of dictionary, the associated value, and two procedure arguments, update and remove, and is expected to tail-call one of them.

It is an error if the procedure arguments are invoked other than in tail position in the failure and success procedures. It is also an error if the failure and success procedures return to their implicit continuation without invoking one of their arguments.

The behaviors of the procedures are as follows (where obj is any Scheme object):

In all cases, obj is returned as a second value.

Here are four examples of dict-search, one for each of the four procedures:

     ;; ignore
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (ignore 'foo))
              (lambda args
                (error))))
     (dict->alist aed dict)) => ((a . b))
     value => 'foo

     ;; insert
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (insert 'd 'foo))
              (lambda args
                (error))))
     (dict-ref aed dict 'a)) => b
     (dict-ref aed dict 'c)) => 'd`
     value => foo

     ;; update
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (update 'a2 'b2 'foo))))
     (dict->alist aed dict) => ((a2 . b2)
     value => foo

     ;; delete
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b) (c . d))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (delete 'foo))))
     (dict->alist aed dict)) => ((c . d))
     value => foo

The whole dictionary

(dict-copy dtd dictionary)

Returns a dictionary that is similar to dictionary and contains the same associations. Mutating the result of dict-copy does not affect dictionary or vice versa.

(dict-size dtd dictionary)

Returns an exact integer representing the number of associations in dictionary.

(dict-size aed dict) => 3

(dict-for-each dtd proc dictionary)

Invokes proc on each key of dictionary and its corresponding value in that order. This procedure is used for doing operations on the whole dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. Returns an unspecified value.

(define (write-key key value) (write key))
(dict-for-each write-key aed dict) => unspecified
  ; writes "135" to current output

(dict-count dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns the number of times that pred returned true as an an exact integer.

(dict-count aed dict (lambda (k v) (even? k))) => 0

(dict-any dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns the value of the first call to pred that returns true, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return false, dict-any returns false.

(define (both-even? k v) (and (even? k) (even? v)))
(dict-any both-even? '((2 . 4) (3 . 5))) => #t
(dict-any both-even? '((1 . 2) (3 . 4))) => #f

(dict-every dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns #f after the first call to pred that returns false, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return true, dict-any returns the value of the last call, or #t if no calls are made.

(define (some-even? k v) (or (even? k) (even? v)))
(dict-every some-even? '((2 . 3) (3 . 4))) => #t
(dict-every some-even? '((1 . 3) (3 . 4))) => #f

(dict-keys dtd dictionary)

Returns a list of the keys of dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. The order may change when new elements are added to dictionary.

(dict-keys aed dict) => (1 3 5)

(dict-values dtd dictionary)

Returns a list of the values of dictionary. The results returned by dict-keys and dict-values are not necessarily ordered consistently.

(dict-values aed dict) => (2 4 6)

(dict-entries dtd dictionary)

Returns two values, the results of calling dict-keys and the result of calling dict-values on dictionary.

(dict-entries aed dict) => ; 2 values
  (1 3 5)
  (2 4 6)

(dict-fold dtd proc knil dictionary)

Invokes proc on each association of dictionary with three arguments: the key of the association, the value of the association, and an accumulated result of the previous invocation. For the first invocation, knil is used as the third argument. Returns the result of the last invocation, or knil if there was no invocation. Note that there is no guarantee of a consistent result if the dictionary does not have an inherent order.

(dict-fold + 0 '((1 . 2) (3 . 4))) => 10

(dict-map->list dtd proc dictionary)

Returns a list of values that result from invoking proc on the keys and corresponding values of dictionary.

(dict-map->list (lambda (k v) v) dict) =>gt; (2 4 6),
(dict-map->list - aed dict) => (-1 -1 -1) ; subtract value from key

(dict->alist dtd dictionary)

Returns an alist whose keys and values are the keys and values of dictionary.

; plist to alist
(dict->alist '(1 2 3 4 5 6)) => ((1 . 2) (3 . 4) (5 . 6))

Add dict-map, dict-filter, dict-remove, and dict-search.

(dict-comparator dtd dictionary)

Dictionary type descriptors

(dtd? obj)

Returns #t if obj is a DTD, and #f otherwise.

(make-dtd arg)

Returns a new dictionary type providing procedures that allow manipulation of dictionaries of that type. The args are alternately procnames and corresponding procs.

A procname argument is a symbol which is the same as one of the procedures defined in this SRFI (other than those in this section), and a proc argument is the specific procedure implementing it for this type. These procedures only need to handle a full-length argument list (except when defining dict-ref and dict-update!), as the other defaults have already been supplied by the framework.

Shrink this list if possible: Arguments for the seven procedures make-dict, dictionary?, dict-size, dict-search!, dict-map!, dict-filter!, and dict-for-each are required. The others are optional, but if provided can be more efficient than the versions automatically provided by the implementation of this SRFI.

The following example is from the file plist-impl.scm in the sample implementation; the procedures referred to are also in that file.

  (make-dtd
    'dictionary? plist?
    'dict-map! plist-map!
    'dict-filter! plist-filter!
    'dict-search! plist-search!
    'dict-size plist-size
    'dict-for-each plist-foreach
    'dict->alist plist->alist)) => a DTD for plists

(make-alist-dtd equal)

Returns a DTD for manipulating an alist using the equality predicate equal.

(make-alist-dtd =) => a DTD for alists using numeric equality

Exceptions

dictionary-error message irritant ... )

Returns a dictionary error with the given message (a string) and irritants (any objects). If a particular procedure in a DTD cannot be implemented, it instead should signal an appropriate dictionary exception that can be reliably caught.

dictionary-error? obj)

Returns #t if obj is a dictionary error and #f otherwise.

dictionary-message dictionary-error)

Returns the message associated with dictionary-error.

(dictionary-irritants dictionary-error)

Returns a list of the irritants associated with dictionary-error.

Variables

The following DTDs are also exported from this SRFI: srfi-69-dtd, hash-table-dtd, srfi-126-dtd, mapping-dtd, hash-mapping-dtd, plist-dtd, alist-eqv-dtd, and alist-equal-dtd. The last two provide DTDs for alists using eqv? and equal? respectively.

Implementation

The sample implementation is found in the GitHub repository.

Acknowledgements

Thanks to the participants on the mailing list.

© 2021 John Cowan, Arvydas Silanskas.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler