by John Cowan (shepherd, text), Will Clinger (text), Daphne Preston-Kendal (implementation)

This SRFI is currently in *draft* status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to `srfi-250@nospamsrfi.schemers.org`

. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

- Received: 2023-11-13
- 60-day deadline: 2024-01-14
- Draft #1 published: 2023-11-15
- Daphne Preston-Kendal's
personal
Git repo for the sample implementation for this SRFI for
reference while the SRFI is in
*draft*status

This SRFI defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications. A hash table is a data structure that:

- Is disjoint from all other types.
- Provides a mapping from objects known as
*keys*to corresponding objects known as*values*.- Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
- Values may be any Scheme objects.

- Provides an
*equality predicate*which defines when a proposed key is the same as an existing key. No table may contain more than one value for a given key. - Provides a
*hash function*which maps a candidate key into a non-negative exact integer. - Supports mutation as the primary means of setting the contents of a table.
- Provides key lookup and destructive update in (expected) amortized constant time, provided that a satisfactory hash function is available.
- Does not guarantee that whole-table operations work in the presence of concurrent mutation of the whole hash table. (Values may be safely mutated.)

Unlike the hash tables of SRFI 125, which is the direct ancestor of this specification, the hash tables described here are ordered by insertion: that is, associations inserted earlier in the history of the hash table appear earlier in the ordering. Advances in the implementations of hash tables, as provided by C++, Python, JavaScript, etc., make the provision of this new facility practical. As a result, the hash tables of this SRFI do not interoperate with the hash tables of SRFI 125, SRFI 126, or existing R6RS implementations.

- It would be useful for
`hash-table-map`

to pass both the key and the value to*proc*, but that would make the API incompatible with SRFI 125. Rather than changing it in this way, a variant of`hash-table-map`

could be added, under a different name, that passes both key and value. - There are no examples. SRFI 125, upon which this SRFI is based, didn't have examples.

Hash tables themselves don't really need defending: almost all dynamically typed languages, from awk to JavaScript to Lua to Perl to Python to Common Lisp, and including many Scheme implementations, provide them in some form as a fundamental data structure. Therefore, what needs to be defended is not the data structure but the procedures. This SRFI supports a great many convenience procedures on top of the basic hash table interfaces provided by SRFI 69 and R6RS. Modulo the question of association ordering, nothing in it adds power to what those interfaces provide, but it does add convenience in the form of pre-debugged routines to do various common things, and even some things not so commonly done but useful.

There is no support for thread safety or weakness.

This specification depends on SRFI 128 comparators, which package a type test, an equality predicate, and a hash function into a single bundle.

The relatively few hash table procedures in R6RS are all available
in this SRFI under somewhat different names.
This SRFI adopts SRFI 69's term `hash-table`

rather than R6RS's `hashtable`

, because of the universal
use of "hash table" rather than "hashtable" in other computer languages and
in technical prose generally. Besides, the English word
*hashtable* obviously means something that can be ... hashted.
It would be trivial to provide the R6RS names on top of this SRFI.

As usual, the Common Lisp names are completely different from the Scheme names. Common Lisp provides the following capabilities that are not in this SRFI:

- The constructor allows specifying the rehash size and rehash threshold of the new hash table. There are also accessors and mutators for these and for the current capacity (as opposed to size).
- There are hash tables based on
`equalp`

(which does not exist in Scheme). `With-hash-table-iterator`

is a hash table external iterator implemented as a local macro.`Sxhash`

is an implementation-specific hash function for the`equal`

predicate. It has the property that objects in different instantiations of the same Lisp implementation that are similar (a concept analogous to`equal`

but defined across all instantiations of a Common Lisp program) always return the same value from`sxhash`

; for example, the symbol`xyz`

will have the same`sxhash`

result in all instantiations.

The procedures in this SRFI are drawn primarily from SRFI 69 and R6RS. In addition, the following sources are acknowledged:

- The
`hash-table-mutable?`

procedure and the second argument of`hash-table-copy`

(which allows the creation of immutable hash tables) are from R6RS, renamed in the style of this SRFI. - The
`hash-table-intern!`

procedure is from Racket, renamed in the style of this SRFI. - The
`hash-table-find`

procedure is a modified version of`table-search`

in Gambit. - The procedures
`hash-table-unfold`

and`hash-table-count`

were suggested by SRFI 1. - The procedures
`hash-table=?`

and`hash-table-map`

were suggested by Haskell's Data.Map.Strict module. - The procedure
`hash-table-map->list`

is from Guile.

The procedures `hash-table-empty?`

,
`hash-table-empty-copy`

, `hash-table-pop!`

,
`hash-table-map!`

, `hash-table-intersection!`

,
`hash-table-difference!`

, and `hash-table-xor!`

were added for convenience and completeness.

The native hash tables of MIT, SISC, Bigloo, Scheme48, SLIB, RScheme, Scheme 7, Scheme 9, Rep, and FemtoLisp were also investigated, but no additional procedures were incorporated.

The slash in the names of some procedures can be pronounced "with".

The procedures in this SRFI are in the `(srfi 250)`

library
(or `(srfi :250)`

on R6RS).

All references to "executing in expected amortized constant time" presuppose that a satisfactory hash function is available. Arbitrary or impure hash functions can make a hash of any implementation.

Hash tables are allowed to cache the results of calling the equality predicate and hash function, so programs cannot rely on the hash function being called exactly once for every primitive hash table operation: it may be called zero, one, or more times.

It is an error if the procedure argument of `hash-table-find`

,
`hash-table-count`

, `hash-table-map`

,
`hash-table-for-each`

, `hash-table-map!`

,
`hash-table-map->list`

, `hash-table-fold`

or `hash-table-prune!`

mutates the hash table being walked.

It is an error to pass two hash tables that have different (in the
sense of `eq?`

) comparators to any of the procedures of
this SRFI.

Implementations are permitted to ignore user-specified hash
functions in certain circumstances. Specifically, if the
equality predicate, whether passed as part of a comparator
or explicitly, is more fine-grained (in the sense of R7RS-small
section 6.1) than `equal?`

, the implementation is
free — indeed, is encouraged — to ignore the user-specified
hash function and use something implementation-dependent.
This allows the use of addresses as hashes, in which case
the keys must be rehashed if they are moved by the garbage
collector. Such a hash function is unsafe to use outside
the context of implementation-provided hash tables. It can
of course be exposed by an implementation as an extension,
with suitable warnings against inappropriate uses.

It is an error to mutate a key during or after its insertion into a hash table in such a way that the hash function of the table will return a different result when applied to that key.

- Constructors:
`make-hash-table`

,`hash-table`

,`hash-table-unfold`

,`alist->hash-table`

- Predicates:
`hash-table?`

,`hash-table-contains?`

,`hash-table-empty?`

,`hash-table=?`

,`hash-table-mutable?`

- Accessors:
`hash-table-ref`

,`hash-table-ref/default`

,`hash-table-comparator`

- Mutators:
`hash-table-set!`

,`hash-table-delete!`

,`hash-table-intern!`

,`hash-table-update!`

,`hash-table-update!/default`

,`hash-table-pop!`

,`hash-table-clear!`

- The whole hash table:
`hash-table-size`

,`hash-table-keys`

,`hash-table-values`

,`hash-table-entries`

,`hash-table-key-vector`

,`hash-table-value-vector`

,`hash-table-entry-vectors`

,`hash-table-size`

,`hash-table-keys`

,`hash-table-values`

,`hash-table-entries`

,`hash-table-find`

,`hash-table-count`

- Mapping and folding:
`hash-table-map`

,`hash-table-for-each`

,`hash-table-map!`

,`hash-table-map->list`

,`hash-table-fold`

,`hash-table-prune!`

- Copying and conversion:
`hash-table-copy`

,`hash-table-empty-copy`

,`hash-table->alist`

- Hash tables as sets:
`hash-table-union!`

,`hash-table-intersection!`

,`hash-table-difference!`

,`hash-table-xor!`

Note that the argument `k`

is a positive integer
representing the initial capacity of the hashtable being created (that
is, the number of associations it can hold without having to grow).
If not present, the initial capacity is implementation-dependent.

`(make-hash-table `

*comparator* [ *k* ]`)`

Returns a newly allocated hash table whose equality predicate and
hash function are extracted from *comparator*.

As mentioned above, implementations are free to use an appropriate
implementation-dependent hash function instead of the
specified hash function, provided that the specified equality predicate
is a refinement of the `equal?`

predicate.
This applies whether the hash function and equality predicate are passed
as separate arguments or packaged up into a comparator.

The constraints on equality predicates and hash functions are given in SRFI 128.

(R6RS `make-eq-hashtable`

, `make-eqv-hashtable`

,
and `make-hashtable`

; Common Lisp `make-hash-table`

)

`(hash-table `

*comparator* [ *key value* ] ...`)`

Returns a newly allocated hash table, created as if by
`make-hash-table`

using *comparator*.
For each pair of arguments, an association is added to the
new hash table with *key* as its key and *value*
as its value.
This procedure returns an immutable hash table.
If the same key (in the sense of the equality predicate) is
specified more than once, it is an error.

`(hash-table-unfold `

*stop? mapper successor seed comparator* [ *k* ]`)`

Create a new hash table as if by `make-hash-table`

using
*comparator* and the *args*. If the result of applying
the predicate *stop?* to *seed* is true, return the hash
table. Otherwise, apply the procedure *mapper* to *seed*.
*Mapper* returns two values, which are inserted into the hash
table as the key and the value respectively. Then get a new seed by
applying the procedure *successor* to *seed*, and repeat
this algorithm.

`(alist->hash-table `

*alist comparator* [ *k* ]`)`

Returns a newly allocated hash-table as if by `make-hash-table`

using *comparator* and the *args*. It is then initialized
from the associations of *alist*. Associations earlier in the
list take precedence over those that come later.

`(hash-table? `

*obj*`)`

Returns `#t`

if *obj* is a hash table, and
`#f`

otherwise. (R6RS `hashtable?`

;
Common Lisp `hash-table-p`

)

`(hash-table-contains? `

*hash-table key*`)`

Returns `#t`

if there is any association to *key*
in *hash-table*, and `#f`

otherwise. Must execute
in expected amortized constant time.
(R6RS `hashtable-contains?`

)

`(hash-table-empty? `

*hash-table*`)`

Returns `#t`

if *hash-table* contains no associations,
and `#f`

otherwise.

`(hash-table=? `

*value-comparator hash-table _{1} hash-table_{2}*

`)`

Returns `#t`

if *hash-table _{1}* and

`#f`

otherwise.
`(hash-table-mutable? `

*hash-table*`)`

Returns `#t`

if the hash table is mutable.
(R6RS `hashtable-mutable?`

)

The following procedures, given a key, return the corresponding value.

`(hash-table-ref `

*hash-table key* [ *failure* [ *success* ] ]`)`

Extracts the value associated to *key* in *hash-table*,
invokes the procedure *success* on it, and returns its result;
if *success* is not provided, then the value itself is returned.
If *key* is not contained in *hash-table* and
*failure* is supplied, then *failure* is invoked
on no arguments and its result is returned. Otherwise, it is
an error. Must execute in expected amortized constant time,
not counting the time to call the procedures.

`(hash-table-ref/default `

*hash-table key default*`)`

Semantically equivalent to, but may be more efficient than, the following code:

`(hash-table-ref`

hash-table key`(lambda ()`

default`))`

(R6RS `hashtable-ref`

; Common Lisp `gethash`

)

`(hash-table-comparator `

*hash-table*`)`

Returns a comparator equivalent to the comparator with which
`hash-table`

was created. R6RS
`hashtable-equivalence-function`

and
`hashtable-hash-function`

.

The following procedures alter the associations in a hash table either unconditionally or conditionally on the presence or absence of a specified key. It is an error to add an association to a hash table whose key does not satisfy the type test predicate of the comparator used to create the hash table.

`(hash-table-set! `

*hash-table* *arg* ...`)`

Repeatedly mutates *hash-table*, creating new associations
in it by processing the arguments from left to right.
The *args* alternate between keys and values. Whenever
there is a previous association for a key, it is deleted. It is
an error if the type check procedure of the comparator of
*hash-table*, when invoked on a key, does not return
`#t`

. Likewise, it is an error if a key is not a
valid argument to the equality predicate of *hash-table*.
Returns an unspecified value. Must execute in expected amortized
constant time per key. R6RS `hashtable-set!`

and Common Lisp `(setf gethash)`

do not handle multiple
associations.

`(hash-table-delete! `

*hash-table key* ...`)`

Deletes any association to each *key* in *hash-table*
and returns the number of keys that had associations. Must execute
in expected amortized constant time per key. R6RS
`hashtable-delete!`

and Common Lisp `remhash`

do not handle multiple associations.

`(hash-table-intern! `

*hash-table key* *failure*`)`

Effectively invokes `hash-table-ref`

with the given
arguments and returns what it returns. If *key* was not
found in *hash-table*, its value is set to the result of
calling *failure*. Must execute in expected amortized constant time.

`(hash-table-update! `

*hash-table key updater* [ *failure* [ *success ] ] )
*

Semantically equivalent to, but may be more efficient than, the following code:

`(hash-table-set!`

hash-table key`(`

updater`(hash-table-ref`

hash-table key failure success`)))`

Must execute in expected amortized constant time. Returns an unspecified value.

`(hash-table-update!/default `

*hash-table key updater default*`)`

Semantically equivalent to, but may be more efficient than, the following code:

(R6RS

`(hash-table-set!`

hash-table key`(`

updater`(hash-table-ref/default`

hash-table key default`)))`

`hashtable-update!`

)

Must execute in expected amortized constant time. Returns an unspecified value.

`(hash-table-pop! `

*hash-table*`)`

Chooses the first association from *hash-table* and removes
it, returning the key and value as two values.

It is an error if *hash-table* is empty.

`(hash-table-clear! `

*hash-table*`)`

Delete all the associations from *hash-table*.
(R6RS `hashtable-clear!`

; Common Lisp `clrhash`

)

These procedures process the associations of the hash table in insertion order.

`(hash-table-size `

*hash-table*`)`

Returns the number of associations in *hash-table* as an
exact integer. Must execute in constant time.
(R6RS `hashtable-size`

; Common Lisp `hash-table-count`

.)

`(hash-table-keys `

*hash-table*`)`

`(hash-table-key-vector `

*hash-table*`)`

Returns a newly allocated list/vector of all the keys in *hash-table*.
R6RS `hashtable-keys`

returns a vector.

`(hash-table-values `

*hash-table*`)`

`(hash-table-value-vector `

*hash-table*`)`

R6RS `hashtable-values`

returns a vector.

Returns a newly allocated list/vector of all the keys in *hash-table*.

`(hash-table-entries `

*hash-table*`)`

`(hash-table-entry-vectors `

*hash-table*`)`

Returns two values, a newly allocated list/vector of all the keys in
*hash-table* and a newly allocated list/vector of all the values
in *hash-table* in the corresponding order. R6RS
`hash-table-entries`

returns vectors.

`(hash-table-find `

*proc hash-table failure*`)`

For each association of *hash-table*, invoke *proc*
on its key and value. If *proc* returns true, then
`hash-table-find`

returns what *proc* returns.
If all the calls to *proc* return `#f`

, return
the result of invoking the thunk *failure*.

`(hash-table-count `

*pred hash-table*`)`

For each association of *hash-table*, invoke *pred*
on its key and value. Return the number of calls to *pred*
which returned true.

These procedures process the associations of the hash table in insertion order.

`(hash-table-map `

*proc comparator hash-table*`)`

Returns a newly allocated hash table as if by
`(make-hash-table `

*comparator*`)`

.
Calls *proc* for every association in *hash-table*
with the value of the association. The key of the association
and the result of invoking *proc* are entered into the
new hash table. Note that this is *not* the result of
lifting mapping over the domain of hash tables, but it is
considered more useful.

If *comparator* recognizes multiple keys in the *hash-table*
as equivalent, any one of such associations is taken.

`(hash-table-for-each `

*proc hash-table*`)`

Calls *proc* for every association in *hash-table*
with two arguments: the key of the association and the value of
the association. The value returned by *proc* is discarded.
Returns an unspecified value.

`(hash-table-map! `

*proc hash-table*`)`

Calls *proc* for every association in *hash-table*
with two arguments: the key of the association and the value of
the association. The value returned by *proc* is used to
update the value of the association. Returns an unspecified value.

`(hash-table-map->list `

*proc hash-table*`)`

Calls *proc* for every association in *hash-table*
with two arguments: the key of the association and the value of
the association. The values returned by the invocations of
*proc* are accumulated into a list, which is returned.

`(hash-table-fold `

*proc seed hash-table*`)`

Calls *proc* for every association in *hash-table*
with three arguments: the key of the association, the value of
the association, and an accumulated value *val*.
*Val* is *seed* for the first invocation of
*procedure*, and for subsequent invocations of *proc*,
the returned value of the previous invocation. The value returned
by `hash-table-fold`

is the return value of the last
invocation of *proc*.

`(hash-table-prune! `

*proc hash-table*`)`

Calls *proc* for every association in *hash-table*
with two arguments, the key and the value of the association, and
removes all associations from *hash-table* for which
*proc* returns true. Returns an unspecified value.

`(hash-table-copy `

*hash-table* [ *mutable?* ]`)`

Returns a newly allocated hash table with the same properties
and associations as *hash-table*. If the second argument
is present and is true, the new hash table is mutable. Otherwise
it is immutable provided that the implementation supports immutable
hash tables. (R6RS `hashtable-copy`

)

`(hash-table-empty-copy `

*hash-table*`)`

Returns a newly allocated mutable hash table with the same properties
as *hash-table*, but with no associations.

`(hash-table->alist `

*hash-table*`)`

Returns an alist with the same associations as *hash-table*
in insertion order.

`(hash-table-union! `

*hash-table _{1} hash-table_{2}*

`)`

Adds the associations of *hash-table _{2}* to

`(hash-table-intersection! `

*hash-table _{1} hash-table_{2}*

`)`

Deletes the associations from *hash-table _{1}*
whose keys don't also appear in

`(hash-table-difference! `

*hash-table _{1} hash-table_{2}*

`)`

Deletes the associations of *hash-table _{1}* whose
keys are also present in

`(hash-table-xor! `

*hash-table _{1} hash-table_{2}*

`)`

Deletes the associations of *hash-table _{1}* whose
keys are also present in

The current sample implementation is unfinished, and is here. When it is finished, it will be moved into this SRFI's repo. It relies upon SRFI 128.

Some of the language of this SRFI is copied from SRFI 69 with thanks to its author, Panu Kalliokoski. However, he is not responsible for what I have done with it.

I also acknowledge the members of the SRFI 250, 125, 126, and 128 mailing lists, especially Takashi Kato, Alex Shinn, Shiro Kawai, and Per Bothner.

© 2023 John Cowan, Will Clinger, Daphne Preston-Kendal.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Editor: Arthur A. Gleckler