234: Topological Sorting

This SRFI has complex authorship. The original specification and corresponding implementation were written by Shiro Kawai for Gauche. This SRFI was written by John Cowan, but its design is based on the Gauche specification. The implementation was written principally by Arvydas Silanskas based on the Gauche implementation, and then modified further by John Cowan to match this SRFI. Arne Babenhauserheide took over authorship in 2023.

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-234@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Topological sorting is an algorithm that takes a graph consisting of nodes and other nodes that depend on them, forming a partial order, and returns a list representing a total ordering of the graph. If the graph is cyclic, the topological sort will fail. The procedure topological-sort returns three values. If sorting succeeds, the first value contains the result and the second and third are #false. If sorting fails, the result is #false and the second and third value may provide additional information about the error.

Rationale

Consider the following graph representing steps to be performed:

example graph described below

This graph means that (among other things) steps 5 and 7 must be performed before step 11, and step 11 must be performed before steps 2, 9, and 10. By topologically sorting this graph, we get one possible total ordering of the steps that satisfies the partial order.

The graph structure uses an efficient representation for graphs where a small number of dependencies lead to many dependents.

For very large graphs or nodes which cannot be compared, the graph can be represented by index values into a nodes vector. This keeps the memory footprint of the graph small.

Specification

(topological-sort graph)
(topological-sort graph =)
(topological-sort graph = nodes)

topological-sort returns only the first value from topological-sort/details.

(topological-sort/details graph)
(topological-sort/details graph =)
(topological-sort/details graph = nodes)

Topologically sorts graph, which is a list of connections. Each connection is a list of the form (node dependent dependent ...), meaning that dependents are dependent on node.

The optional argument = specifies the identity relation between nodes; it defaults to equal?. The optional argument nodes is a vector of node values.

topological-sort/details returns three values. The first value is a list of topologically sorted nodes or #false if the graph cannot be sorted topologically.

If graph is circular, topological-sort returns #false as first value. Additionally the second value may be an error message and the third value may provide information about at least one cycle. Outside error conditions, the second and third value must both be #false.

If = is provided, it is used to check equality between nodes. The default is equal?.

If nodes is provided, the entries in the graph must be index keys with which the actual nodes can be retrieved from nodes with vector-ref. These nodes can be used to process graphs of large size or whose nodes cannot be compared with =.

It is an error if the same node (in the sense of =) appears at the head of more than one connection.

The graph shown above can be represented as ((5 11) (7 11 8) (3 8 10) (11 2 9 10) (8 9)). One possible result of applying `topological-sort` to this graph is the list (3 7 5 11 2 8 10 9), meaning that performing steps 3, 7, 5, ..., 10, 9 in that order will satisfy the partial ordering of the graph.

(edgelist->graph edgelist)
(edgelist->graph edgelist asc)

Convert an edgelist '((a b) (a c) (b e)) to a graph '((a b c) (b e)). The optional argument asc specifies the procedure to retrieve a dependency element; it defaults to assoc. When using eq? for the graph this should be assq.

(edgelist/inverted->graph edgelist)
(edgelist/inverted->graph edgelist asc)

Convert an edgelist '((b a) (c a) (e b)) to the inverted graph '((a b c) (b e)). The optional argument asc specifies the procedure to retrieve a dependency element; it defaults to assoc. When using eq? for the graph this should be assq.

(graph->edgelist graph)

Convert a graph '((a b c) (b e)) to an edgelist '((a b) (a c) (b e))

(graph->edgelist/inverted graph)

Convert a graph '((a b c) (b e)) to an inverted edgelist '((b a) (c a) (e b))

(connected-components graph)

Calculate the connected components from a graph of in-neighbors. Returns a list where each item corresponds to one connected component and contains the nodes which form that component.

Implementation

The sample implementation is available in the SRFI 234 repository repository.

Acknowledgements

Credit is due to the members of the SRFI 234 mailing list.

The graph is in the public domain and is available at https://upload.wikimedia.org/wikipedia/commons/0/03/Directed_acyclic_graph_2.svg.

© 2024 John Cowan, Shiro Kawai, Arthur A. Gleckler, Arne Babenhauserheide.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler