This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-113@nospamsrfi.schemers.org
. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.
set-map
and bag-map
.)set-map
.)Post-finalization note 1 (Added 2016-05-14.): Because SRFI 114 has been deprecated
by SRFI 128, it is recommended that implementers make use of
SRFI 128 rather than SRFI 114 comparators where comparators are
specified in this SRFI. Specifically, the procedures set
,
bag
, set-unfold
, bag-unfold
, set-map
, list->set
,
list->bag
, and alist->bag
, should accept SRFI 128 rather
than SRFI 114 comparators as arguments. By the same token, the
results of set-element-comparator
and bag-element-comparator
,
as well as the values of set-comparator
and bag-comparator
,
should be SRFI 128 comparators. The sample implementation has
been updated to depend on SRFI 128 rather than SRFI 114.
Post-finalization note 2: The order of arguments of set-map
and set-unfold
are not consistent with those of hash-table-map
and hash-table-unfold
in SRFI 125. Both SRFI 113 and SRFI 125 are going to be part of R7RS large (also known as the Red Edition), and the plan is to make them consistent there, following the order in SRFI 125. Furthermore, SRFI 146 (in draft status at the time of this writing) uses the order of SRFI 125. Since this problem was discovered after SRFI 113 was finalized, and it wasn't an error in the SRFI, it's too late to fix it here. However, John Cowan, the author, encourages implementers to consider adopting the order of SRFI 125. Thanks to Marc Nieper-Wißkirchen for reporting this mismatch, and for providing a version of SRFI 113 with the recommended argument order in the implementation of SRFI 146.
Post-finalization note 3 (Added 2021-06-03): At the request of the SRFI author, this paragraph has been marked deleted where it appears below:
The implementation registers set-comparator and bag-comparator with SRFI 114's default comparator, assuming the sample implementation of SRFI 114 is being used. Scheme implementers who provide their own implementations of SRFI 114 must change this part of the code.
John says:
The words from "assuming" onwards represent the SRFI 114 situation, where registration was not standardized but was made available by the sample implementation, so since PFN 1 (which replaced SRFI 114 with SRFI 128) they are now meaningless anyway.
Sets and bags (also known as multisets) are unordered collections that can contain any Scheme object. Sets enforce the constraint that no two elements can be the same in the sense of the set's associated equality predicate; bags do not.
Sets are a standard part of the libraries of many high-level programming languages, including Smalltalk, Java, and C++. Racket provides general sets similar to those of this proposal, though with fewer procedures, and there is a Chicken egg called sets
(unfortunately undocumented), which provides a minimal set of procedures. SRFI 1 also provides a list-based implementation of sets.
Bags are useful for counting anything from a fixed set of possibilities, e.g. the number of each type of error in a log file or the number of uses of each word in a lexicon drawn from a body of documents. Although other data structures can serve the same purpose, using bags clearly expresses the programmer's intent and allows for optimization.
Insofar as possible, the names in this SRFI are harmonized with the names used for ordered collections (lists, strings, vectors, and bytevectors) in Scheme. However, size
is used instead of length
to express the number of elements in a collection, because length
implies order.
It's possible to use the general sets of this SRFI to contain characters, but the use of SRFI 14 is recommended instead. The names and facilities in this SRFI are harmonized with SRFI 14, except that SRFI 14 does not contain analogues of the set-search!
, set>?
, set<=?
, set>=?
, set-remove
, or set-partition
procedures.
Sets and bags do not have a lexical syntax representation. It's possible to use SRFI 108 quasi-literal constructors to create them in code, but this SRFI does not standardize how that is done.
The interface to general sets and bags depends on SRFI 114 comparators, despite that SRFI having a higher number than this one for hysterical raisins. Comparators conveniently package the equality predicate of the set with the hash function or comparison procedure needed to implement the set efficiently.
Sets and bags are mutually disjoint, and disjoint from other types of Scheme objects.
It is an error for any procedure defined in this SRFI to be invoked on sets or bags with distinct comparators (in the sense of eq?
).
It is an error to mutate any object while it is contained in a set or bag.
It is an error to add an object to a set or bag which does not satisfy the type test predicate of the comparator.
It is an error to add or remove an object for a set or a bag while iterating over it.
The procedures of this SRFI, by default, are "pure functional" — they do not alter their parameters. However, this SRFI also defines "linear-update" procedures, all of whose names end in !
. They have hybrid pure-functional/side-effecting semantics: they are allowed, but not required, to side-effect one of their parameters in order to construct their result. An implementation may legally implement these procedures as pure, side-effect-free functions, or it may implement them using side effects, depending upon the details of what is the most efficient or simple to implement in terms of the underlying representation.
It is an error to rely upon these procedures working by side effect. For example, this is not guaranteed to work:
(let* ((set1 (set 'a 'b 'c)) ; set1 = {a,b,c}. (set2 (set-adjoin! set1 'd))) ; Add d to {a,b,c}. set1) ; Could be either {a,b,c} or {a,b,c,d}.
However, this is well-defined:
(let ((set1 (set 'a 'b 'c))) (set-adjoin! set1 'd)) ; Add d to {a,b,c}.
So clients of these procedures write in a functional style, but must additionally be sure that, when the procedure is called, there are no other live pointers to the potentially-modified set or bag (hence the term "linear update").
There are two benefits to this convention:
Implementations are free to provide the most efficient possible implementation, either functional or side-effecting.
Programmers may nonetheless continue to assume that sets are purely functional data structures: they may be reliably shared without needing to be copied, uniquified, and so forth.
In practice, these procedures are most useful for efficiently constructing sets and bags in a side-effecting manner, in some limited local context, before passing the set or bag outside the local construction scope to be used in a functional manner.
Scheme provides no assistance in checking the linearity of the potentially side-effected parameters passed to these functions — there's no linear type checker or run-time mechanism for detecting violations.
Note that if an implementation uses no side effects at all, it is allowed to return existing sets and bags rather than newly allocated ones, even where this SRFI explicitly says otherwise.
Implementations of this SRFI are allowed to place restrictions on the comparators that the procedures accept. In particular, an implementation may require comparators to provide a comparison procedure. Alternatively, an implementation may require comparators to provide a hash function, unless the equality predicate of the comparator is eq?
, eqv?
, equal?
, string=?
, or string-ci=?
. Implementations must not require the provision of both a comparison procedure and a hash function.
Constructors: set
, set-unfold
Predicates: set?
, set-contains?
, set-empty?
, set-disjoint?
Accessors: set-member
, set-element-comparator
Updaters: set-adjoin
, set-adjoin!
, set-replace
, set-replace!
, set-delete
, set-delete!
, set-delete-all
, set-delete-all!
, set-search!
The whole set: set-size
, set-find
, set-count
, set-any?
, set-every?
Mapping and folding: set-map
, set-for-each
, set-fold
, set-filter
, set-filter!
, set-remove
, set-remove!
, set-partition
, set-partition!
Copying and conversion: set-copy
, set->list
, list->set
, list->set!
Subsets: set=?
, set<?
, set>?
, set<=?
, set>=?
Set theory operations: set-union
, set-intersection
, set-difference
, set-xor
, set-union!
, set-intersection!
, set-difference!
, set-xor!
Additional bag procedures: bag-sum
, bag-sum!
, bag-product
, bag-product!
, bag-element-count
, bag-for-each-unique
, bag-fold-unique
, bag-increment!
, bag-decrement!
, bag->set
, set->bag
, set->bag!
Comparators: set-comparator
, bag-comparator
(set
comparator element ... )
Returns a newly allocated empty set. The comparator argument is a SRFI 114 comparator, which is used to control and distinguish the elements of the set. The elements are used to initialize the set.
(set-unfold
comparator stop? mapper successor seed)
Create a newly allocated set as if by set
using comparator. If the result of applying the predicate stop? to seed is true, return the set. Otherwise, apply the procedure mapper to seed. The value that mapper returns is added to the set. Then get a new seed by applying the procedure successor to seed, and repeat this algorithm.
(set?
obj)
Returns #t
if obj is a set, and #f
otherwise.
(set-contains?
set element)
Returns #t
if element is a member of set and #f
otherwise.
(set-empty?
set)
Returns #t
if set has no elements and #f
otherwise.
(set-disjoint?
set1 set2)
Returns #t
if set1 and set2 have no elements in common and #f
otherwise.
(set-member
set element default)
Returns the element of set that is equal, in the sense of set's equality predicate, to element. If element is not a member of set, default is returned.
(set-element-comparator
set)
Returns the comparator used to compare the elements of set.
(set-adjoin
set element ...)
The set-adjoin
procedure returns a newly allocated set that uses the same comparator as set and contains all the values of set, and in addition each element unless it is already equal (in the sense of the comparator) to one of the existing or newly added members. It is an error to add an element to set that does not return #t
when passed to the type test procedure of the comparator.
(set-adjoin!
set element ...)
The set-adjoin!
procedure is the same as set-adjoin
, except that it is permitted to mutate and return the set argument rather than allocating a new set.
(set-replace
set element)
The set-replace
procedure returns a newly allocated set that uses the same comparator as set and contains all the values of set except as follows: If element is equal (in the sense of set's comparator) to an existing member of set, then that member is omitted and replaced by element. If there is no such element in set, then set is returned unchanged.
(set-replace!
set element)
The set-replace!
procedure is the same as set-replace
, except that it is permitted to mutate and return the set argument rather than allocating a new set.
(set-delete
set element ...)
(set-delete!
set element ...)
(set-delete-all
set element-list)
(set-delete-all!
set element-list)
The set-delete
procedure returns a newly allocated set containing all the values of set except for any that are equal (in the sense of set's comparator) to one or more of the elements. Any element that is not equal to some member of the set is ignored.
The set-delete!
procedure is the same as set-delete
, except that it is permitted to mutate and return the set argument rather than allocating a new set.
The set-delete-all
and set-delete-all!
procedures are the same as set-delete
and set-delete!
, except that they accept a single argument which is a list of elements to be deleted.
(set-search!
set element failure success)
The set is searched for element. If it is not found, then the failure procedure is tail-called with two continuation arguments, insert and ignore, and is expected to tail-call one of them. If element is found, then the success procedure is tail-called with the matching element of set and two continuations, update and remove, and is expected to tail-call one of them.
The effects of the continuations are as follows (where obj is any Scheme object):
Invoking (
insert obj)
causes element to be inserted into set.
Invoking (
ignore obj)
causes set to remain unchanged.
Invoking (
update new-element obj)
causes new-element to be inserted into set in place of element.
Invoking (
remove obj)
causes the matching element of set to be removed from it.
In all cases, two values are returned: the possibly updated set and obj.
(set-size
set)
Returns the number of elements in set as an exact integer.
(set-find
predicate set failure)
Returns an arbitrarily chosen element of set that satisfies predicate, or the result of invoking failure with no arguments if there is none.
(set-count
predicate set)
Returns the number of elements of set that satisfy predicate as an exact integer.
(set-any?
predicate set)
Returns #t
if any element of set satisfies predicate, or #f
otherwise. Note that this differs from the SRFI 1 analogue because it does not return an element of the set.
(set-every?
predicate set)
Returns #t
if every element of set satisfies predicate, or #f
otherwise. Note that this differs from the SRFI 1 analogue because it does not return an element of the set.
(set-map
comparator proc set)
Applies proc to each element of set in arbitrary order and returns a newly allocated set, created as if by (set
comparator)
, which contains the results of the applications. For example:
(set-map string-ci-comparator symbol->string (set eq? 'foo 'bar 'baz)) => (set string-ci-comparator "foo" "bar" "baz")
Note that, when proc defines a mapping that is not 1:1, some of the mapped objects may be equivalent in the sense of comparator's equality predicate, and in this case duplicate elements are omitted as in the set constructor. For example:
(set-map integer-comparator (lambda (x) (quotient x 2)) (set integer-comparator 1 2 3 4 5)) => (set integer-comparator 0 1 2)
If the elements are the same in the sense of eqv?
, it is unpredictable which one will be preserved in the result.
(set-for-each
proc set)
Applies proc to set in arbitrary order, discarding the returned values. Returns an unspecified result.
(set-fold
proc nil set)
Invokes proc on each member of set in arbitrary order, passing the result of the previous invocation as a second argument. For the first invocation, nil is used as the second argument. Returns the result of the last invocation, or nil if there was no invocation.
(set-filter
predicate set)
Returns a newly allocated set with the same comparator as set, containing just the elements of set that satisfy predicate.
(set-filter!
predicate set)
A linear update procedure that returns a set containing just the elements of set that satisfy predicate.
(set-remove
predicate set)
Returns a newly allocated set with the same comparator as set, containing just the elements of set that do not satisfy predicate.
(set-remove!
predicate set)
A linear update procedure that returns a set containing just the elements of set that do not satisfy predicate.
(set-partition
predicate set)
Returns two values: a newly allocated set with the same comparator as set that contains just the elements of set that satisfy predicate, and another newly allocated set, also with the same comparator, that contains just the elements of set that do not satisfy predicate.
(set-partition!
predicate set)
A linear update procedure that returns two sets containing the elements of set that do and do not, respectively, not satisfy predicate.
(set-copy
set)
Returns a newly allocated set containing the elements of set, and using the same comparator.
(set->list
set)
Returns a newly allocated list containing the members of set in unspecified order.
(list->set
comparator list)
Returns a newly allocated set, created as if by set
using comparator, that contains the elements of list. Duplicate elements (in the sense of the equality predicate) are omitted.
(list->set!
set list)
Returns a set that contains the elements of both set and list. Duplicate elements (in the sense of the equality predicate) are omitted.
Note: None of these predicates produces a total order on
sets. In particular, set=?
,
set<?
, and set>?
do not obey
the trichotomy law.
(set=?
set1 set2 ...)
Returns #t
if each set contains the same elements.
(set<?
set1 set2 ...)
Returns #t
if each set other than the last is a proper subset of the following set, and #f
otherwise.
(set>?
set1 set2 ...)
Returns #t
if each set other than the last is a proper superset of the following set, and #f
otherwise.
(set<=?
set1 set2 ...)
Returns #t
if each set other than the last is a subset of the following set, and #f
otherwise.
(set>=?
set1 set2 ...)
Returns #t
if each set other than the last is a superset of the following set, and #f
otherwise.
(set-union
set1 set2 ...)
(set-intersection
set1 set2 ...)
(set-difference
set1 set2 ...)
(set-xor
set1 set2)
Return a newly allocated set that is the union, intersection, asymmetric difference, or symmetric difference of the sets. Asymmetric difference is extended to more than two sets by taking the difference between the first set and the union of the others. Symmetric difference is not extended beyond two sets. Elements in the result set are drawn from the first set in which they appear.
(set-union!
set1 set2 ...)
(set-intersection!
set1 set2 ...)
(set-difference!
set1 set2 ...)
(set-xor!
set1 set2)
Linear update procedures returning a set that is the union, intersection, asymmetric difference, or symmetric difference of the sets. Asymmetric difference is extended to more than two sets by taking the difference between the first set and the union of the others. Symmetric difference is not extended beyond two sets. Elements in the result set are drawn from the first set in which they appear.
Bags are like sets, but can contain the same object more than once. However, if two elements that are the same in the sense of the equality predicate, but not in the sense of eqv?
, are both included, it is not guaranteed that they will remain distinct when retrieved from the bag. It is an error for a single procedure to be invoked on bags with different comparators.
The procedures for creating and manipulating bags are the same as those for sets, except that set
is replaced by bag
in their names, and that adjoining an element to a bag is effective even if the bag already contains the element. If two elements in a bag are the same in the sense of the bag's comparator, the implementation may in fact store just one of them.
The bag-union
, bag-intersection
, bag-difference
, and bag-xor
procedures (and their linear update analogues) behave as follows when both bags contain elements that are equal in the sense of the bags' comparator:
For bag-union
, the number of equal elements in the result is the largest number of equal elements in any of the original bags.
For bag-intersection
, the number of equal elements in the result is the smallest number of equal elements in any of the original bags.
For bag-difference
, the number of equal elements in the result is the number of equal elements in the first bag, minus the number of elements in the other bags (but not less than zero).
For bag-xor
, the number of equal elements in the result is the absolute value of the difference between the number of equal elements in the first and second bags.
(bag-sum
set1 set2 ... )
(bag-sum!
bag1 bag2 ... )
The bag-sum
procedure returns a newly allocated bag containing all the unique elements in all the bags, such that the count of each unique element in the result is equal to the sum of the counts of that element in the arguments. It differs from bag-union
by treating identical elements as potentially distinct rather than attempting to match them up.
The bag-sum!
procedure is equivalent except that it is linear-update.
(bag-product
n bag)
(bag-product!
n bag)
bag-product
procedure returns a newly allocated bag containing all the unique elements in bag, where the count of each unique element in the bag is equal to the count of that element in bag
multiplied by n.
The bag-product!
procedure is equivalent except that it is linear-update.
(bag-unique-size
bag)
Returns the number of unique elements of bag.
(bag-element-count
bag element)
Returns an exact integer representing the number of times that element appears in bag.
(bag-for-each-unique
proc bag)
Applies proc to each unique element of bag in arbitrary order, passing the element and the number of times it occurs in bag, and discarding the returned values. Returns an unspecified result.
(bag-fold-unique
proc nil bag)
Invokes proc on each unique element of bag in arbitrary order, passing the number of occurrences as a second argument and the result of the previous invocation as a third argument. For the first invocation, nil is used as the third argument. Returns the result of the last invocation.
(bag-increment!
bag
element
count)
(bag-decrement!
bag
element
count)
Linear update procedures that return a bag with the same elements as bag, but with the element count of element in bag increased or decreased by the exact integer count (but not less than zero).
(bag->set
bag)
(set->bag
set)
(set->bag!
bag set)
The bag->set
procedure returns a newly allocated set containing the unique elements (in the sense of the equality predicate) of bag. The set->bag
procedure returns a newly allocated bag containing the elements of set. The set->bag!
procedure returns a bag containing the elements of both bag and set. In all cases, the comparator of the result is the same as the comparator of the argument or arguments.
(bag->alist
bag)
(alist->bag
comparator alist)
The bag->alist
procedure returns a newly allocated alist whose keys are the unique elements of bag and whose values are the number of occurrences of each element. The alist->bag
returning a newly allocated bag based on comparator, where the keys of alist specify the elements and the corresponding values of alist specify how many times they occur.
The following comparators are used to compare sets or bags, and allow sets of sets, bags of sets, etc.
set-comparator
bag-comparator
Note that these comparators do not provide comparison procedures, as there is no ordering between sets or bags. It is an error to compare sets or bags with different element comparators.
The implementation places the identifiers defined above into the sets
library.
Sets and bags are implemented as a thin veneer over hashtables.
The implementation registers (See post-finalization note #3.)set-comparator
and
bag-comparator
with SRFI 114's default comparator, assuming
the sample implementation of SRFI 114 is being used. Scheme implementers
who provide their own implementations of SRFI 114 must change this part
of the code.
The sample implementation contains the following files:
sets-impl.scm
– implementation of general sets and bagssets.sld
– an R7RS library named (sets)
sets.scm
– a Chicken librarysets-test.scm
– test suiteThe test suite will work with the Chicken test
egg, which is provided on Chibi as the (chibi test)
library.
Copyright (C) John Cowan 2013. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.