- Draft: 2001/11/12-2002/01/11
- Revised: 2002/02/03
- Final: 2002/05/21

A core set of procedures for creating and manipulating heterogeneous multidimensional arrays is proposed. The design is consistent with the rest of Scheme and independent of other container data types. It provides easy sharing of parts of an array as other arrays without copying, encouraging a declarative style of programming.

The specification is based on an original contribution by Alan Bawden in 1993.

The proposed arrays encourage a natural declarative programming style. They allow sharing of most any rectangular part of an array through an affine index mapping, without copying. But imperative style is equally natural.

The design is consistent with the two indexed data structures of Scheme: vectors and strings. The design makes arrays a self-contained type. These statements are illustrated in the following paragraphs.

First, in the one-dimensional case, the arguments of the following relevant calls match exactly.

(vector-set! v k o) (string-set! s k c) (array-set! a k o)

Likewise, `make-array`

matches `make-vector`

and
`make-string`

. An analogue to `vector`

,
`string`

and `list`

is provided, alleviating the
lack of an external representation. Index bounds are specified as for
`substring`

, lower bound included and upper bound excluded.

Array shapes are specified as arrays. These can be made with a special
procedure `shape`

that does not have a shape argument. An
array does not retain a dependence to the shape array. For example,
mutation of a shape array is allowed.

Index mappings return multiple values as multiple values.

Array dimensions can begin at any index. In particular, the choice
between `0`

and `1`

is left to the user.
(Shapes and index objects are zero based, though.)

The ability to pack an index sequence in a vector is useful for implementing higher level operations. (The ability to pack it in a one-dimensional array lets one use, say, a row of a matrix as an index.)

It is not required that vectors not be arrays. It is not required that they be, either.

Arrays are heterogeneous data structures whose elements are indexed by integer sequences of fixed length. The length of a valid index sequence is the rank or the number of dimensions of an array. The shape of an array consists of bounds for each index.

The lower bound `b` and the upper bound `e` of a
dimension are exact integers with
`(<= `

. A valid
index along the dimension is an exact integer `b` `e`)`k` that
satisfies both
`(<= `

and
`b` `k`)`(< `

. The length of
the array along the dimension is the difference
`k` `e`)`(- `

. The
size of an array is the product of the lengths of its
dimensions.
`e` `b`)

A shape is specified as an even number of exact integers. These are alternately the lower and upper bounds for the dimensions of an array.

The following ten procedures should be implemented.

`(array? `

`obj`)

Returns `#t` if `obj` is an array, otherwise
returns `#f`.

Note: there is no reasonable way to implement this procedure accurately in R5RS; SRFI 9 (Defining Record Types) specifies a way, and many Scheme implementations provide something similar.

`(make-array `

`shape`)

`(make-array `

`shape` `obj`)

Returns a newly allocated array whose shape is given by
`shape`. If `obj` is provided, then each element is
initialized to it. Otherwise the initial contents of each element is
unspecified. The array does not retain a dependence to
`shape`.

`(shape `

`bound ...`)

Returns a shape. The sequence `bound ...` must consist of an
even number of exact integers that are pairwise not decreasing. Each
pair gives the lower and upper bound of a dimension. If the shape is
used to specify the dimensions of an array and `bound ...` is
the sequence `b0 e0 ... bk ek ...` of `n` pairs of
bounds, then a valid index to the array is any sequence `j0 ... jk
...` of `n` exact integers where each `jk`
satisfies `(<= `

and `bk` `jk`)`(< `

.
`jk` `ek`)

The shape of a `d`-dimensional array is a
`d` × `2` array where the element at
`k 0` contains the lower bound for an index along dimension
`k` and the element at `k 1` contains the
corresponding upper bound, where `k` satisfies
`(<= 0 `

and
`k`)`(< `

.
`k` `d`)

`(array `

`shape` `obj ...`)

Returns a new array whose shape is given by `shape` and the
initial contents of the elements are `obj ...` in row major
order. The array does not retain a dependence to `shape`.

`(array-rank `

`array`)

Returns the number of dimensions of `array`.

(array-rank (make-array (shape 1 2 3 4)))

Returns `2`.

`(array-start `

`array` `k`)

Returns the lower bound for the index along dimension `k`.

`(array-end `

`array` `k`)

Returns the upper bound for the index along dimension `k`.

`(array-ref `

`array` `k ...`)

`(array-ref `

`array` `index`)

Returns the contents of the element of `array` at index
`k ...`. The sequence `k ...` must be a valid index
to `array`. In the second form, `index` must be
either a vector or a 0-based 1-dimensional array containing `k
...`.

(array-ref (array (shape 0 2 0 3) 'uno 'dos 'tres 'cuatro 'cinco 'seis) 1 0)

Returns `cuatro`.

(let ((a (array (shape 4 7 1 2) 3 1 4))) (list (array-ref a 4 1) (array-ref a (vector 5 1)) (array-ref a (array (shape 0 2) 6 1))))

Returns `(3 1 4)`.

`(array-set! `

`array` `k ...` `obj`)

`(array-set! `

`array` `index` `obj`)

Stores `obj` in the element of `array` at index
`k ...`. Returns an unspecified value. The sequence `k
...` must be a valid index to `array`. In the second
form, `index` must be either a vector or a 0-based
1-dimensional array containing `k ...`.

(let ((a (make-array (shape 4 5 4 5 4 5)))) (array-set! a 4 4 4 'huuhkaja) (array-ref a 4 4 4))

Returns `huuhkaja`.

`(share-array `

`array` `shape` `proc`)

Returns a new array of shape `shape` that shares elements of
`array` through `proc`. The procedure
`proc` must implement an affine function that returns indices
of `array` when given indices of the array returned by
`share-array`

. The array does not retain a dependence to
`shape`.

(define i_4 (let* ((i (make-array (shape 0 4 0 4) 0)) (d (share-array i (shape 0 4) (lambda (k) (values k k))))) (do ((k 0 (+ k 1))) ((= k 4)) (array-set! d k 1)) i))

Note: the affinity requirement for `proc` means that each
value must be a sum of multiples of the arguments passed to
`proc`, plus a constant.

Implementation note: arrays have to maintain an internal index mapping
from indices `k1 ... kd` to a single index into a backing
vector; the composition of this mapping and `proc` can be
recognised as `(+ n0 (* n1 `

by setting each index in turn to `k1`) ... (* nd
`kd`))`1`

and others to `0`

, and all to `0`

for the
constant term; the composition can then be compiled away, together
with any complexity that the user introduced in their procedure.

This document does not specify any external representation for arrays.
This document does not specify when arrays are `equal?`

.
(Indeed, R5RS `equal?`

will do the wrong thing.)

The reference implementation comes with a number of files that
illustrate some ways to use the proposed system (and are very useful
in *testing* an implementation; that is their origin).

- A library arlib.scm that contains, among
several other things,
`tabulate-array`

for a more useful initialization of a new array, an`array-equal?`

, and a`transpose`

that can permute the dimensions of an array any which way. - A test suite test.scm for
*array*, and another test suite list.scm for*arlib*. - A rudimentary display procedure
`(play array)`

in play.scm, for playing around with the system.

A portable reference implementation is provided. It uses a minimal
error reporting mechanism that conforms to SRFI 23 (Error reporting
mechanism). Type disjointness requires support from the host
implementation, such as support for SRFI 9 (Defining Record
Types). All names not defined in this proposal are in the prefix
"`array:`

", which serves as a module system.

You can get source for the reference implementation as a single file and stop reading. But there are variations. This single file represents arrays as procedures (so the type predicate is very approximate); it represents index mapping as vectors of coefficients; map recognition is not optimised for any number of dimensions as that would be redundant in this representation.

The real source comes in too many files. A working installation consists of the following parts, each in its own file.

- a record type definition, either system specific for
type disjointness, or portable as procedures, in a file
*as-*.scm*, and - indexing operations to match the type, in a file
*ix-*.scm*, and - an affine recogniser of one of three types, optimised
up to some number of dimensions, in a file
*op-*.scm*, and - the main source file array.scm.

Affine recognisers are made by a program opt.scm
but one of each type is also available here, optimized for 0, 1, 2 and
3 dimensions. Choose one type: pick a recogniser with matching index
procedures; load `as-`, `ix-` and `op-` and
`array`.)

- In the mbda type representation, index
mappings are procedures that accept an optional argument. The
matching access procedures apply the
mapping to the arguments of
`array-ref`

and`array-set!`

. - In the tter type representation, index
mappings are pairs of procedures: one takes exactly the indices,
the other takes indices and an object. The matching access procedures apply the first
procedure to the argumets of
`array-ref`

and the second procedure to the arguments of`array-set!`

. - In ctor representation, index mappings are coefficient vectors. The access procedures compute the sum of products of coefficients and indexes in a loop on the list.

Record implementations are available for
generic Scheme (arrays are not disjoint from procedures), for SRFI 9 (Defining Record Types)
(*not tested*), and for PLT
Scheme (arrays belong to a struct type).

With the three files from above, the main source file should work in any Scheme implementation without need of modification.

Error checking in the implementation may be a tad expensive. The places where it occurs are cleanly separated from the surrounding code. (Sharing uses a check that is exponential in the number of dimensions. It is disabled above a threshold rank.)

The original concept comes from a message to the Usenet newsgroup
`comp.lang.scheme` by Alan Bawden in 1993. A variant of that
implementation by Richard Kelsey in the Scheme 48 system was also an
influence. Apart from the origins, the main design goal has been
consistency with the core types of Scheme.

Alan Bawden and Mark K. Gardner gave useful comments at an earlier attempt to make this specification public. (There was at least one other. Notes have gone missing.) SRFI feedback led to improved wording, hidden shapes, and two kinds of index objects.

The exact title of the proposal comes from a message titled "a process that might work" by William D. Clinger to the `rrrs-authors`
mailing list in 1998. That appears to be a part of the past of the SRFI process.

Copyright (C) Jussi Piitulainen (2001). All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Editor: Mike Sperber Last modified: Sun Jan 28 13:40:29 MET 2007