Title

Unifiable Boxes

Author

Marc Nieper Wißkirchen

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-161@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Unifiable boxes are, like the boxes of SRFI 111, objects with a single mutable state. A constructor, predicate, accessor, and mutator are provided.

In addition to this, an equality predicate and union operations (link, union, unify) are provided. Applying a union operation to two unifiable boxes makes the two boxes equal (in the sense of the equality predicate). As a consequence, their state will also become identical. In the case of link and union, it will be the state of one of the two unioned boxes. In the case of unify, the state is determined by a supplied unification procedure.

Unifiable boxes are also known under the names disjoint-set data structure, union–find data structure or merge–find set.

Rationale

The disjoint-set (or union-find) forest is a basic, well-known data structure that tracks partitions of a set. For example, it can be used to keep track of the connected components of an undirected graph. It can also be used in implementations of unification or in implementing Kruskal's algorithm to find the minimum spanning tree of a graph.

The abstract data type defines the three operations make-set, find, and union. Make-set creates a singleton set, union merges two sets, and find returns a canonical representative of a set.

There are implementations of the union-find data structure in Haskell, SML/NJ, C++ (Boost), etc. It is the aim of this SRFI to provide an implementation for Scheme together with a standardized interface.

This SRFI defines the interface to a union-find data type with a SRFI 111 boxes-like interface. An implementation of this interface consists of operations for creating new unifiable boxes, getting the contents of a unifiable box, checking for equality of unifiable boxes, and for joining two unifiable boxes.

Unifiable boxes are analogous to boxes as expressed in the following table:

OperationBoxesUnifiable boxes
Constructorboxubox
Predicatebox?ubox?
Accessorunboxubox-ref
Mutatorset-box!ubox-set!
Equalityeq?ubox=?
Unionn.a.ubox-unify!, ubox-union!, ubox-link!

Note: Unifiable boxes use the standard naming pattern for Scheme procedures. As SRFI 111 deviates (for historical reasons) from this naming pattern, the naming patterns for boxes and for unifiable boxes differ.

Using the procedures of this SRFI, the classical operations make-set, union, and find can be defined as follows:

  (define (make-set x) (ubox x))
  (define (union x y) (ubox-link! y x))
  (define (find x) (ubox-ref x))

There is no comparator in the sense of SRFI 128 defined for unifiable boxes as the equality predicate for unifiable boxes is not stable.

Specification

Unifiable boxes form a new type as if created by define-record-type. The effect of using record-type inspection or inheritance for the unifiable box type is unspecified.

Equality of unifiable boxes is an equivalence relation not finer than eqv?. The union operations ubox-unify!, ubox-union!, and ubox-link! coarsen this equivalence relation.

Each unifiable box has a single mutable state, its value. Equal unifiable boxes have the same state.

Each procedure below shall execute in (practically) constant time. (The theoretical bound is O(α(n)), where α is the inverse Ackermann function and n is the number of unifiable boxes that are pairwise not eq? to each other.)

It is an error if two threads access the same equivalence class of unifiable boxes at the same time. It is no error if two threads access distinct equivalence classes of unifiable boxes that remain distinct during the accesses at the same time.

Procedures

(ubox value)

Constructor. Returns a newly allocated unifiable box initialized to value. The new unifiable box is not equal to any previously constructed unifiable box.

(ubox? object)

Predicate. Returns #t if object is a unifiable box, and #f otherwise.

(ubox-ref ubox)

Accessor. Returns the current value of the unifiable box ubox.

(ubox-set! ubox value)

Mutator. Changes the unifiable box ubox to hold value. The return value is unspecified.

(ubox=? ubox1 ubox2)

Equality predicate. Returns #t if ubox1 and ubox2 are equal unifiable boxes, and #f otherwise.

(ubox-unify! proc ubox1 ubox2)

Union operation. Invokes proc on the values of ubox1 and ubox2, makes ubox1 and ubox2 equal, and updates their value to the result of the invocation of proc. The return value is unspecified.

(ubox-union! ubox1 ubox2)

Union operation. Makes the unifiable boxes ubox1 and ubox2 equal. The value of the unified box is the value of either ubox1 or ubox2 before the unification. The return value is unspecified.

(ubox-link! ubox1 ubox2)

Union operation. Makes the unifiable boxes ubox1 and ubox2 equal. The value of the unified box is the value of ubox2 before the unification. The return value is unspecified.

Implementation

The sample implementation comes in the form of the portable R7RS library (srfi 161). It uses path compression and union by rank.

Source for the sample implementation.

Acknowledgements

Disjoint-set forests were first described by Bernard A. Galler and Michael J. Fischer in 1964 in their paper An improved equivalence algorithm.

The interface to a disjoint-set data structure that is described in this SRFI has been modelled on the UREF signature of SML/NJ.

Olin Shivers has used the disjoint-set data structure, which is implemented by this SRFI, in Scheme code that generates random mazes. The code and the description of the algorithm is available on Olin Shiver's web site. Thanks to Arthur A. Gleckler for pointing this (practical) use of disjoint sets out to me.

Copyright

Copyright (C) Marc Nieper-Wißkirchen (2018).

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler