268: Multidimensional Array Literals

by Per Bothner (SRFI 163), Peter McGoron (design), John Cowan (editor and steward)

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-268@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This is a specification of a lexical syntax for multi-dimensional arrays. It is a modest alteration of SRFI 163, which is an extension of the Common Lisp array reader syntax to handle non-zero lower bounds and optional uniform element types (compatibly with SRFI 4 and SRFI 160). It can be used in conjunction with SRFI 25, SRFI 122, or SRFI 213. There are recommendations for output formatting and a suggested format-array procedure.

Rationale

It is desirable to have a lexical syntax for reading and writing multi-dimensional arrays, so they can not only appear as literal values in code, but also be read from and written to data files. Basing it on the Common Lisp syntax makes sense, and is what has been done by all known existing implementations. However, Common Lisp arrays do not support non-zero lower bounds, and Common Lisp's handling of specialized (uniform) arrays is very different from that of known Scheme implementations. Various Scheme extensions have been proposed. SRFI 58 is one proposal, but it does not handle non-zero lower bounds, and its type-specifier syntax is stylistically incompatible with the literals proposed for uniform vectors in SRFIs 4 and 160.

Specification

Reader syntax

An array literal consists of the following parts: a # immediately followed by an optional non-negative integer literal representing the number of dimensions, immediately followed by the letter a, immediately followed by an optional tag that specifies the type of the elements of the array. An implementation that supports the literal syntax of SRFI 4 or SRFI 160 should allow the TAG or @ from those specifications as a tag. If the tag is omitted, the array may contain elements of any type.

After the tag comes the lbounds, an optional list whose elements are integer literals giving the lower bounds of each dimension. This is followed by a datum containing array elements organized into dimensions using parentheses. The datum contains the elements in a nested-list format: a one-dimensional array uses a single list, a two-dimensional array uses a list of lists, and so on. The elements are in lexicographic order. A zero-dimensional array is followed by a datum of any type.

If all the lower bounds are 0, the lbounds may be omitted. In this case, the number of dimensions is required: it would be impossible to tell whether #a (1 2) represents a one-dimensional array with integer elements, or a zero-dimensional array whose sole element is a list. The lbounds and the number of dimensions are mutually exclusive.
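The role of the dimension count can be illustrated with a small sketch in portable R7RS Scheme. The procedure name datum-shape is hypothetical and not part of this SRFI; it assumes the number of dimensions is already known and recovers the extent of each dimension from the nested-list datum.

```scheme
(import (scheme base))

;; Hypothetical helper (not defined by this SRFI): return the extent of
;; each dimension of DATUM, given the number of dimensions RANK.
;; Signals an error if DATUM is ragged or not nested deeply enough.
(define (datum-shape rank datum)
  (cond ((= rank 0) '())                 ; zero dimensions: any datum
        ((not (list? datum)) (error "expected a list" datum))
        ((null? datum)
         ;; An empty dimension hides the remaining extents, so this
         ;; sketch reports them as 0.
         (make-list rank 0))
        (else
         (let ((inner (datum-shape (- rank 1) (car datum))))
           (for-each
            (lambda (sub)
              (unless (equal? (datum-shape (- rank 1) sub) inner)
                (error "ragged array datum" datum)))
            (cdr datum))
           (cons (length datum) inner)))))

(datum-shape 2 '((11 12 13) (21 22 23))) ; => (2 3)
(datum-shape 1 '(1 2))                   ; => (2)
(datum-shape 0 '(1 2))                   ; => ()
```

The last two calls show why the count cannot be guessed: the same datum (1 2) is a two-element array when read with one dimension and a single list-valued element when read with zero dimensions.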

Whitespace is permitted before either the lower bounds or the list of array elements. A single space before the datum is recommended style, even if the datum starts with a delimiter. (Compatibility note: Common Lisp does not require a space after #0a.)

For example, #2au32((10 11) (20 21)) is a 2x2 array of 32-bit unsigned integers. As another example, #a(0 0) ((11 12 13) (21 22 23)) is a 2-dimensional array (a matrix) whose indexes are rows 0 to 1 and columns 0 to 2. It could be pretty-printed (using the non-normative format-array helper function below):

#a═══════╗
║11│12│13║
╟──┼──┼──╢
║21│22│23║
╚══╧══╧══╝

Here are more examples:

A uniform u32 array with two dimensions and index ranges 2..3 and 3..4 inclusive:

#au32(2 3) ((1 2) (3 4))

Some zero-dimensional arrays:

#a() sym
#af32() 237.0

Some empty arrays:

#1a ()
#2a (())
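How the lower bounds and the extents of the datum combine into index ranges can be sketched as follows. The name index-ranges is hypothetical and used only for illustration; the extents are assumed to have been computed from the datum already.

```scheme
(import (scheme base))

;; Hypothetical helper (not defined by this SRFI): pair each lower
;; bound with its inclusive upper bound, given the lower bounds and
;; the extent of each dimension.
(define (index-ranges lbounds extents)
  (map (lambda (lo n) (list lo (+ lo n -1)))
       lbounds extents))

(index-ranges '(2 3) '(2 2)) ; => ((2 3) (3 4))
(index-ranges '(0 0) '(2 3)) ; => ((0 1) (0 2))
```

An empty dimension (extent 0) yields an upper bound one less than the lower bound, that is, an empty range.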

Possible extension: It might make source code more readable if it could contain array literals in the form produced by format-array. An implementation has not been attempted, but it seems doable: If the first non-whitespace character after the header is a box drawing character, then use those box-drawing characters to figure out each element cell. (Nested arrays make this a bit more complicated.) If a cell has multiple lines, convert it to a string, with a newline between each line. Then recursively read each element.

Output

When an array is printed with the write function, the result should be an array-literal. A single space should be printed before the datum.
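A minimal sketch of such a writer in portable R7RS Scheme follows. It is not tied to any array library: the array is assumed to be given as its parts (a tag string, a list of lower bounds, and the nested-list datum), and the names write-array-literal and array-literal->string are hypothetical. This sketch always emits the lbounds form, never the dimension-count form.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper (not defined by this SRFI): write an array
;; literal to PORT from its parts. TAG is a string such as "" or
;; "u32"; LBOUNDS is a list of integers; DATUM is the nested-list form
;; of the elements.
(define (write-array-literal tag lbounds datum port)
  (write-string "#a" port)
  (write-string tag port)
  (write lbounds port)
  (write-char #\space port)   ; the single space before the datum
  (write datum port))

;; Convenience wrapper that collects the output in a string.
(define (array-literal->string tag lbounds datum)
  (let ((out (open-output-string)))
    (write-array-literal tag lbounds datum out)
    (get-output-string out)))

(array-literal->string "u32" '(2 3) '((1 2) (3 4)))
;; => "#au32(2 3) ((1 2) (3 4))"
(array-literal->string "" '() 'sym)
;; => "#a() sym"
```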

Printing with display may format the array in the same way as write (except using display for each element), or in some more readable way, perhaps by using the format-array procedure.

The format-array utility procedure (non-normative)

(format-array value [port])

Produces a readable display of value, which is usually an array. Using Unicode "box drawing" characters is suggested but not required.

If port is an output port, the formatted output is written to that port. Otherwise, port must be a boolean (#t or #f). If port is #t, output goes to (current-output-port). If port is #f or omitted, the output is returned as a string. If port is #t or an output port, the result of the format-array procedure is unspecified. (This convention matches that of SRFI 48 format.)
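The port convention can be sketched separately from the formatting itself. In the following R7RS fragment, call-with-format-port is a hypothetical wrapper and render stands in for whatever procedure actually lays out the array; neither name is part of this SRFI.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical wrapper showing only the port convention. RENDER is a
;; stand-in for the real formatter: a procedure that takes VALUE and
;; an output port.
(define (call-with-format-port render value . maybe-port)
  (let ((port (if (pair? maybe-port) (car maybe-port) #f)))
    (cond ((output-port? port) (render value port))
          ((eq? port #t) (render value (current-output-port)))
          ((eq? port #f)
           ;; No port, or #f: collect the output and return a string.
           (let ((out (open-output-string)))
             (render value out)
             (get-output-string out)))
          (else
           (error "port must be an output port or a boolean" port)))))

(call-with-format-port write '(1 2 3))   ; => "(1 2 3)"
(call-with-format-port display "hi" #f)  ; => "hi"
```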

Zero-dimensional arrays are written without boxing; empty arrays are written using empty boxes.

The top line includes the tag and the number of dimensions or the lbounds, possibly in abbreviated form.


(define arr
  #a(1 1)
    ((#a(2 2) ((1 2) (3 4))
      9
      #a(2 2) ((3 4) (5 6)))
     (#1a (42 43)
      #1a (8 7 6)
      #2a ((90 91) (100 101)))))

(format-array arr) ⇨
#a(1 1)═══════════════════╗
║#2a═╗  │      9│#2a═╗    ║
║║1│2║  │       │║3│4║    ║
║╟─┼─╢  │       │╟─┼─╢    ║
║║3│4║  │       │║5│6║    ║
║╚═╧═╝  │       │╚═╧═╝    ║
╟───────┼───────┼─────────╢
║#1a═══╗│#1a═══╗│#2a═════╗║
║║42│43║│║8│7│6║│║ 90│ 91║║
║╚══╧══╝│╚═╧═╧═╝│╟───┼───╢║
║       │       │║100│101║║
║       │       │╚═══╧═══╝║
╚═══════╧═══════╧═════════╝
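For the simplest case, a two-dimensional array of atoms printed with the short #a header, the boxed layout can be sketched in portable R7RS Scheme as follows. This is an illustration under stated assumptions, not a reference implementation: the array is given as a non-empty list of equal-length, non-empty rows, nested arrays and lower bounds are not handled, the implementation is assumed to support Unicode strings, and all procedure names are hypothetical.

```scheme
(import (scheme base) (scheme write))

;; Render any object the way display would, as a string.
(define (->string x)
  (let ((out (open-output-string)))
    (display x out)
    (get-output-string out)))

;; Concatenate N copies of the string S (empty if N <= 0).
(define (repeat-string s n)
  (let loop ((n n) (acc ""))
    (if (<= n 0) acc (loop (- n 1) (string-append acc s)))))

;; Right-align S in a field of WIDTH characters.
(define (pad-left s width)
  (string-append (repeat-string " " (- width (string-length s))) s))

;; Join a list of strings with SEP between neighbours.
(define (join-strings strs sep)
  (if (null? strs)
      ""
      (let loop ((rest (cdr strs)) (acc (car strs)))
        (if (null? rest)
            acc
            (loop (cdr rest) (string-append acc sep (car rest)))))))

;; Hypothetical sketch: return the boxed form of ROWS as a list of lines.
(define (format-matrix-lines rows)
  (let* ((cells (map (lambda (row) (map ->string row)) rows))
         ;; Width of each column: the widest cell in that column.
         (widths (apply map
                        (lambda col (apply max (map string-length col)))
                        cells))
         (rule (lambda (left fill mid right)
                 (string-append
                  left
                  (join-strings
                   (map (lambda (w) (repeat-string fill w)) widths) mid)
                  right)))
         ;; Characters between the two vertical borders.
         (inner (+ (apply + widths) (- (length widths) 1)))
         (top (string-append "#a" (repeat-string "═" (- inner 1)) "╗"))
         (body (map (lambda (row)
                      (string-append
                       "║" (join-strings (map pad-left row widths) "│") "║"))
                    cells)))
    (append (list top)
            ;; Put a single-line rule before every row, then drop the first.
            (cdr (apply append
                        (map (lambda (line)
                               (list (rule "╟" "─" "┼" "╢") line))
                             body)))
            (list (rule "╚" "═" "╧" "╝")))))

;; Prints a five-line box with rows 11 12 13 and 21 22 23.
(for-each (lambda (line) (write-string line) (newline))
          (format-matrix-lines '((11 12 13) (21 22 23))))
```

Returning a list of lines rather than one string keeps the sketch easy to extend: a nested array could be formatted recursively and its lines placed inside a cell of the enclosing box.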

If the number of dimensions is more than 2, then each "layer" is printed separated by double lines.


#af32(2 3 4)
  (((1 2 3 4) (5 6 7 8))
   ((9 10 11 12) (13 14 15 16))
   ((17 18 19 20) (21 22 23 24)))

#af32(2 3 4)╗
║ 1│ 2│ 3│ 4║
╟──┼──┼──┼──╢
║ 5│ 6│ 7│ 8║
╠══╪══╪══╪══╣
║ 9│10│11│12║
╟──┼──┼──┼──╢
║13│14│15│16║
╠══╪══╪══╪══╣
║17│18│19│20║
╟──┼──┼──┼──╢
║21│22│23│24║
╚══╧══╧══╧══╝

Implementations may extend format-array so it can be called with more than 2 arguments, but this is not specified here. For example, implementations could allow a formatting procedure or combinator. An implementation that supports SRFI 48 format may support an optional element-format parameter, which would be interpreted as a format string used for each array element. A possible implementation is illustrated here:

(format-array arr "~4,2f") ⇨
#2a@1:3@1:4═══╤════════════════╤═══════════════╗
║#2a:2:2═══╗  │            9.00│#2a:2:2═══╗    ║
║║1.00│2.00║  │                │║3.00│4.00║    ║
║╟────┼────╢  │                │╟────┼────╢    ║
║║3.00│4.00║  │                │║5.00│6.00║    ║
║╚════╧════╝  │                │╚════╧════╝    ║
╟─────────────┼────────────────┼───────────────╢
║#1a:2═╤═════╗│#2a:1:3════╤═══╗│#2a:2:2═══════╗║
║║42.00│43.00║│║8.00│7.00│6.00║│║ 90.00│ 91.00║║
║╚═════╧═════╝│╚════╧════╧════╝│╟──────┼──────╢║
║             │                │║100.00│101.00║║
║             │                │╚══════╧══════╝║
╚═════════════╧════════════════╧═══════════════╝

Implementation

(none yet)

Acknowledgements

This specification is based on prior art, primarily SRFI 163, Kawa, Guile, and Common Lisp.

© Per Bothner 2019, John Cowan 2026.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler