234: Topological Sorting

This SRFI has complex authorship. The original specification and corresponding implementation were written by Shiro Kawai for Gauche. This SRFI was written by John Cowan, but its design is based on the Gauche specification. The implementation was written principally by Arvydas Silanskas based on the Gauche implementation, and then modified further by John Cowan to match this SRFI. Arne Babenhauserheide took over authorship in 2023.

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-234@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Topological sorting is an algorithm that takes a graph consisting of nodes and other nodes that depend on them, forming a partial order, and returns a list representing a total ordering of the graph. If the graph is cyclic, the topological sort will fail. With the procedure topological-sort, failure results in returning #false and additional information is provided via multiple values. With topological-sort/exception, an error will be signalled on failure. This SRFI includes utilities to operate on simple edgelists.

Issues

None at present.

Rationale

Consider the following graph representing steps to be performed:

example graph described below

This graph means that (among other things) steps 5 and 7 must be performed before step 11, and step 11 must be performed before steps 2, 9, and 10. By topologically sorting this graph, we get one possible total ordering of the steps that satisfies the partial order.

The graph structure uses an efficient representation for graphs where a small number of dependencies lead to many dependents.

Specification

(topological-sort graph)
(topological-sort graph =)

Topologically sorts graph, which is a list of connections. Each connection is a list of the form (node dependent dependent ...), meaning that dependents are dependent on node. The optional argument = specifies the identity relation between nodes; it defaults to equal?.

Returns the list of topologically sorted nodes or #f if the graph cannot be sorted topologically.

If graph is circular, topological-sort may return values where the second value is an error message and the third value provides information about at least one cycle. If a second and third value are provided outside error conditions, they must both be #f.

It is an error if the same node (in the sense of =) appears at the head of more than one connection.

The graph shown above can be represented as ((5 11) (7 11 8) (3 8 10) (11 2 9 10) (8 9)). One possible result of applying `topological-sort` to this graph is the list (3 7 5 11 2 8 10 9), meaning that performing steps 3, 7, 5, ..., 10, 9 in that order will satisfy the partial ordering of the graph.

(topological-sort/exception graph)
(topological-sort/exception graph =)

Returns the sorted nodes from topologica-sort when there are no cycles, but if graph is circular, an error satisfying circular-graph? is signaled.

(circular-graph? exception)

Returns #t if the exception raised by topological-sort/exception signals a circular graph.

(circular-graph-message exception)

Returns the message of an exception matching circular-graph?.

(circular-graph-cycle exception)

Returns one circular structure from an exception matching circular-graph?. The graph may contain additional cycles.

(edgelist->graph edgelist)
(edgelist->graph edgelist =)

Convert an edgelist '((a b) (a c) (b e)) to a graph '((a b c) (b e)). The optional argument = specifies the identity relation between nodes; it defaults to equal?.

(edgelist/inverted->graph edgelist)
(edgelist/inverted->graph edgelist =)

Convert an edgelist '((b a) (c a) (e b)) to the inverted graph '((a b c) (b e)). The optional argument = specifies the identity relation between nodes; it defaults to equal?.

(graph->edgelist graph)

Convert a graph '((a b c) (b e)) to an edgelist '((a b) (a c) (b e))

(graph->edgelist/inverted graph)

Convert a graph '((a b c) (b e)) to an inverted edgelist '((b a) (c a) (e b))

(connected-components graph)

Calculate the connected components from a graph of in-neighbors. Returns a list where each item corresponds to one connected component and contains the nodes which form that component.

Implementation

The sample implementation is available in the SRFI 234 repository repository.

Acknowledgements

Credit is due to the members of the SRFI 234 mailing list.

The graph is in the public domain and is available at https://upload.wikimedia.org/wikipedia/commons/0/03/Directed_acyclic_graph_2.svg.

© 2022 John Cowan, Arvydas Silanskas, Shiro Kawai.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler