Title

Homogeneous and Heterogeneous Arrays

Author

Aubrey Jaffer

Status

This SRFI is currently in "draft" status. To see an explanation of each status that a SRFI can hold, see here. It will remain in draft status until 2005/03/17, or as amended. To provide input on this SRFI, please mailto:srfi-63@srfi.schemers.org. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.

Received: 2005/01/17
Draft: 2005/01/17 - 2005/03/18
Revised: 2005/01/27

Abstract

This SRFI, which is to supersede SRFI-47, "Array",

synthesizes array concepts from Common-Lisp and Alan Bawden's "array.scm";
incorporates all the uniform vector types from SRFI-4 "Homogeneous numeric vector datatypes";
adds a boolean uniform array type;
adds 16.bit and 128.bit floating-point uniform-array types;
adds decimal floating-point uniform-array types; and
adds array types of (dual) floating-point complex numbers.

Multi-dimensional arrays subsume homogeneous vectors as the one-dimensional case, obviating the need for SRFI-4.

SRFI-58 gives a read/write invariant syntax for the homogeneous and heterogeneous arrays described here.

Issues

Character arrays can be supported based on strings, but they do not necessarily have access times comparable to other types of arrays.
The conversion rules for exact decimal flonums have yet to be determined.
array->vector and vector->array are not inverses for rank-0 arrays.

Rationale

Arrays

Computations have been organized into multidimensional arrays for over 200 years. Applications for multi-dimensional arrays continue to arise. Computer graphics and imaging, whether vector or raster based, use arrays. A general-purpose computer language without multidimensional arrays is an oxymoron.

Precision

R5RS provides an input syntax for inexact numbers which is capable of distinguishing between short, single, double, and long precisions. But R5RS provides no method for limiting the precision of calculations:

In particular, implementations that use flonum representations must follow these rules: A flonum result must be represented with at least as much precision as is used to express any of the inexact arguments to that operation.

And calculation with any exact number inputs blows the precision out to "the most precise flonum format available":

If, however, an exact number is operated upon so as to produce an inexact result (as by `sqrt'), and if the result is represented as a flonum, then the most precise flonum format available must be used; but if the result is represented in some other way then the representation must have at least as much precision as the most precise flonum format available.

Scheme is not much hampered by lack of low-precision inexact numbers for scalar calculations. The extra computation incurred by gratuitous precision is usually small compared with the overhead of type-dispatch and boxed data manipulation.

Homogeneous Arrays

But if calculations are vectorized, that overhead can become significant. Sophisticated program analysis may be able to deduce that aggregated number storage can be made uniformly of the most precise flonum format available. But even the most aggressive analysis of uncontrived programs will not be able to reduce the precision while yielding results equivalent to the most precise calculation, as R5RS requires.

Also significant is that the numerical data in most Scheme implementations has manifest type information encoded with it. Because number objects vary in size, vectors hold pointers to some numbers, requiring data fetches from memory locations unlikely to be in the same CPU cache-line.

Arrays composed of elements that all have the same size representation can eliminate these indirect accesses and the storage allocation associated with them. Homogeneous arrays of lower-precision flonums can reduce the storage they occupy by factors of 2 or 4, which can also speed execution because less memory bandwidth is needed to supply the CPU data cache.
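As a rough illustration of the storage arithmetic, the following sketch computes the element storage of a dense homogeneous array. The name array-bytes is hypothetical (it is not part of this SRFI), and the figure ignores array headers and alignment.

```scheme
;; Hypothetical helper, not part of this SRFI.
;; Element storage in bytes for a dense array with BITS bits per
;; element and the given dimensions (headers and padding ignored).
(define (array-bytes bits . dims)
  (* (/ bits 8) (apply * dims)))

(array-bytes 64 1000 1000)   ; => 8000000
(array-bytes 32 1000 1000)   ; => 4000000  (factor of 2 smaller)
(array-bytes 16 1000 1000)   ; => 2000000  (factor of 4 smaller)
```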

Common Lisp

Common-Lisp arrays are serviceable, and are the basis for arrays here. Common-Lisp's make-array does not translate well to Scheme because the array element type and the initial contents are passed using named arguments.

Prototype arrays specify both the homogeneous array type (or lack of one) and the initial value (or lack of one), allowing both purposes to be satisfied by one argument to make-array or other procedures which create arrays.

Some have objected that restricting type specification to arrays is a half-measure. In vectorized programs, specifying the precision of scalar calculations will produce negligible performance improvements. But the performance improvements of homogeneous arrays can accrue to both interpreted and compiled Scheme implementations. By avoiding the morass of general type specification, SRFI-63 can be more easily accommodated by more Scheme implementations.

Argument Order

Most of the procedures originate from Alan Bawden's "array.scm". SRFI-47's array-set! argument order is that of Bawden's package. SLIB adopted "array.scm" in 1993. This form of array-set! has also been part of the SCM Scheme implementation since 1993.
The array-set! argument order is different from the same-named procedure in SRFI-25. Type dispatch on the first argument to array-set! could support both SRFIs simultaneously.
The make-array arguments are different from the same-named procedure in SRFI-25. Type dispatch on the first argument to make-array could support both SRFIs simultaneously.

The SRFI-47 argument orders are motivated by the need to deal easily with the variable arity resulting from variable rank.

       (vector->array  vect  proto  bound1 ...)
          (make-array        proto  bound1 ...)
   (make-shared-array  array mapper bound1 ...)
          (array-set!  array obj    index1 ...)
    (array-in-bounds?  array        index1 ...)
           (array-ref  array        index1 ...)

The list->array argument order is somewhat dissonant:

         (list->array  rank  proto  list)
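Placing the indexes last lets them be collected into a Scheme rest argument, whatever the rank. A minimal sketch of that idea, using nested vectors rather than the arrays of this SRFI; nested-ref is a hypothetical name used only for illustration:

```scheme
;; Hypothetical helper, not part of this SRFI: index into nested vectors.
;; The trailing indexes arrive as a list, so one definition serves
;; every rank, including rank 0 (no indexes at all).
(define (nested-ref nested . indices)
  (let loop ((obj nested) (idx indices))
    (if (null? idx)
        obj
        (loop (vector-ref obj (car idx)) (cdr idx)))))

(nested-ref '#(#(1 2 3) #(4 5 6)) 1 2)   ; => 6
(nested-ref '#(10 20 30) 0)              ; => 10
(nested-ref 'lone)                       ; => lone
```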

Homogeneous Array Types

All implementations must support Scheme strings as rank 1 character arrays. This requirement mandates that Scheme strings be valid arguments to array procedures; their stored representations may be different from other character arrays.

Although an implementation is required to define all the prototype functions, it is not required to support all or even any of the homogeneous numeric arrays. It is assumed that no uniform numeric types have larger precision than the Scheme implementation supports as numbers.

prototype procedure	exactness	element type
`vector`		any
`A:floC128b`	inexact	128.bit binary flonum complex
`A:floC64b`	inexact	64.bit binary flonum complex
`A:floC32b`	inexact	32.bit binary flonum complex
`A:floC16b`	inexact	16.bit binary flonum complex
`A:floR128b`	inexact	128.bit binary flonum real
`A:floR64b`	inexact	64.bit binary flonum real
`A:floR32b`	inexact	32.bit binary flonum real
`A:floR16b`	inexact	16.bit binary flonum real

`A:floQ128d`	exact	128.bit decimal flonum rational
`A:floQ64d`	exact	64.bit decimal flonum rational
`A:floQ32d`	exact	32.bit decimal flonum rational

`A:fixZ64b`	exact	64.bit binary fixnum
`A:fixZ32b`	exact	32.bit binary fixnum
`A:fixZ16b`	exact	16.bit binary fixnum
`A:fixZ8b`	exact	8.bit binary fixnum
`A:fixN64b`	exact	64.bit nonnegative binary fixnum
`A:fixN32b`	exact	32.bit nonnegative binary fixnum
`A:fixN16b`	exact	16.bit nonnegative binary fixnum
`A:fixN8b`	exact	8.bit nonnegative binary fixnum
`A:bool`		boolean
`string`		char

Decimal flonums are used for financial calculations so that fractional errors do not accumulate. They should be exact numbers.

Conversions

All the elements of arrays of type A:fixN8b, A:fixN16b, A:fixN32b, A:fixN64b, A:fixZ8b, A:fixZ16b, A:fixZ32b, or A:fixZ64b are exact.
All the elements of arrays of type A:floR16b, A:floR32b, A:floR64b, A:floR128b, A:floC16b, A:floC32b, A:floC64b, and A:floC128b are inexact.
The value retrieved from an exact array element will equal (=) the value stored in that element.
Assigning a non-integer to array-type A:fixN8b, A:fixN16b, A:fixN32b, A:fixN64b, A:fixZ8b, A:fixZ16b, A:fixZ32b, or A:fixZ64b is an error.
Assigning a number larger than can be represented in array-type A:fixN8b, A:fixN16b, A:fixN32b, A:fixN64b, A:fixZ8b, A:fixZ16b, A:fixZ32b, or A:fixZ64b is an error.
Assigning a negative number to array-type A:fixN8b, A:fixN16b, A:fixN32b, or A:fixN64b is an error.
Assigning an inexact number to array-type A:fixN8b, A:fixN16b, A:fixN32b, A:fixN64b, A:fixZ8b, A:fixZ16b, A:fixZ32b, or A:fixZ64b is an error.
When assigning an exact number to an inexact array-type, the procedure may report a violation of an implementation restriction.
Assigning a non-real number (i.e., one for which real? returns #f) to an A:floR128b, A:floR64b, A:floR32b, or A:floR16b array is an error.
When an inexact number is assigned to an array whose type is lower precision, the number will be rounded to that lower precision if possible; otherwise it is an error.
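The rules for the exact fixnum array types can be sketched as range predicates. The names fits-fixN? and fits-fixZ? are hypothetical, and the two's-complement range assumed for the bipolar types is an assumption; this SRFI does not spell out the exact bounds.

```scheme
;; Hypothetical predicates, not part of this SRFI.
;; True when N may be stored in a nonnegative fixnum array of BITS bits.
(define (fits-fixN? bits n)
  (and (integer? n) (exact? n)
       (<= 0 n)
       (< n (expt 2 bits))))

;; True when N may be stored in a bipolar fixnum array of BITS bits.
;; ASSUMPTION: two's-complement range, -2^(bits-1) to 2^(bits-1) - 1.
(define (fits-fixZ? bits n)
  (and (integer? n) (exact? n)
       (<= (- (expt 2 (- bits 1))) n)
       (< n (expt 2 (- bits 1)))))

(fits-fixN? 8 255)    ; => #t
(fits-fixN? 8 256)    ; => #f  (too large)
(fits-fixN? 8 -1)     ; => #f  (negative)
(fits-fixZ? 8 -128)   ; => #t
(fits-fixZ? 8 128)    ; => #f  (too large)
(fits-fixZ? 16 2.0)   ; => #f  (inexact)
(fits-fixZ? 16 1/2)   ; => #f  (non-integer)
```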

Prototype Procedures

Implementations are required to define all of the prototype procedures. Uniform types of matching format and sizes which the platform supports will be used; the others will be represented as follows:

For inexact flonum complex arrays:

the next larger complex format is used;
if there is no larger format,
- then if the implementation supports complex floating-point numbers of unbounded precision,
  - then a heterogeneous array;
  - else the largest inexact flonum complex array.

For inexact flonum real arrays:

the next larger real format is used;
if there is no larger real format, then the next larger complex format is used.
If there is no larger complex format,
- then if the implementation supports floating-point real numbers of unbounded precision,
  - then a heterogeneous array;
  - else the largest inexact flonum real or complex array.

For exact decimal flonum arrays:

the next larger decimal flonum format array is used;
If there is no larger decimal flonum format, then a heterogeneous array is used.

For exact bipolar fixnum arrays:

the next larger bipolar fixnum format array is used;
If there is no larger bipolar fixnum format,
- then if the implementation supports exact integers of unbounded precision,
  - then a heterogeneous array;
  - else the largest bipolar fixnum array.

For exact nonnegative fixnum arrays:

the next larger nonnegative fixnum format array is used;
If there is no larger nonnegative fixnum format,
- then the next larger bipolar fixnum format is used.
- If there is no larger bipolar fixnum format,
  - then if the implementation supports exact integers of unbounded precision,
    - then a heterogeneous array;
    - else the largest nonnegative or bipolar fixnum array.

Under this arrangement, platforms which support uniform array types employ them, while less capable platforms use vectors; all work compatibly from the same source code.
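The fallback rule for exact bipolar fixnum arrays might be sketched as follows. The procedure choose-fixZ-width, its arguments, and the use of the symbol vector to stand for a heterogeneous array are all hypothetical. Falling back to a heterogeneous array when the platform has no uniform fixnum arrays at all is an assumption made for this sketch.

```scheme
;; Hypothetical sketch, not part of this SRFI.
;; WANTED    : requested width in bits
;; SUPPORTED : ascending list of widths the platform stores uniformly
;; BIGNUMS?  : #t if exact integers of unbounded precision are supported
;; Returns the chosen width, or the symbol vector for a heterogeneous array.
(define (choose-fixZ-width wanted supported bignums?)
  (let loop ((widths supported) (largest #f))
    (cond ((null? widths)
           ;; No matching or larger format exists.
           (if (or bignums? (not largest)) 'vector largest))
          ((>= (car widths) wanted) (car widths))
          (else (loop (cdr widths) (car widths))))))

(choose-fixZ-width 32 '(8 16 32 64) #t)   ; => 32  (matching format)
(choose-fixZ-width 16 '(8 32 64) #t)      ; => 32  (next larger format)
(choose-fixZ-width 64 '(8 16 32) #t)      ; => vector
(choose-fixZ-width 64 '(8 16 32) #f)      ; => 32  (largest available)
```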

Shared Arrays

To my knowledge, shared arrays were original to Alan Bawden in his "array.scm". Make-shared-array creates any view into an array whose coordinates can be mapped by exact integer affine functions. Shared arrays are quite useful. They can reverse indexes, make subarrays, and facilitate straightforward implementations of divide-and-conquer algorithms.

In Common-Lisp a displaced array can be created by calls to adjust-array. But displaced arrays are far less flexible than shared arrays, constrained to have the same rank as the original and allowing only index displacements (not reversals, skips, or shuffling).
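A mapper is an ordinary procedure from new-array coordinates to a list of old-array coordinates, so the reversals, skips, shuffles, and subarrays mentioned above can each be written as a small affine function. The names below are hypothetical; the sketch calls the mappers directly rather than passing them to make-shared-array.

```scheme
;; Hypothetical mappers, not part of this SRFI. Each takes coordinates
;; in the new array and returns a list of coordinates in the old array.
(define (reversal-mapper n)          ; reverse a rank-1 array of length n
  (lambda (i) (list (- n 1 i))))

(define (stride-mapper step)         ; keep every step-th element
  (lambda (i) (list (* step i))))

(define (transpose-mapper i j)       ; shuffle (swap) two indexes
  (list j i))

(define (window-mapper row col)      ; subarray whose origin is (row, col)
  (lambda (i j) (list (+ row i) (+ col j))))

((reversal-mapper 5) 0)      ; => (4)
((stride-mapper 2) 3)        ; => (6)
(transpose-mapper 1 2)       ; => (2 1)
((window-mapper 3 3) 0 1)    ; => (3 4)
```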

Limit Cases

The bounds for each index in both Alan Bawden's "array.scm" and SRFI-25 can be any consecutive run of integers. All indexes in SRFI-63 are zero-based for compatibility with R5RS.

Empty arrays having no elements can be of any positive rank. Empty arrays can be returned from make-shared-array.

Following Common-Lisp's lead, zero-rank arrays have a single element.

Except for character arrays, array access time is O(R)+V, where R is the rank of the array and V is the vector access time.

Character array access time is O(R)+S, where R is the rank of the array and S is the string access time.
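The O(R) term can be seen in a sketch of the index arithmetic: one multiply and one add per dimension, followed by a single vector (or string) access. The procedure row-major-offset is hypothetical, and dense row-major storage is an assumption of this sketch rather than a requirement of this SRFI.

```scheme
;; Hypothetical helper, not part of this SRFI.
;; ASSUMPTION: elements are stored densely in row-major order.
;; DIMS and INDICES are lists of equal length R; the loop runs R times.
(define (row-major-offset dims indices)
  (let loop ((d dims) (i indices) (offset 0))
    (if (null? d)
        offset
        (loop (cdr d) (cdr i) (+ (* offset (car d)) (car i))))))

(row-major-offset '(3 5) '(1 2))       ; => 7
(row-major-offset '(2 3 4) '(1 2 3))   ; => 23
(row-major-offset '() '())             ; => 0  (rank 0: the single element)
```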

Specification

Function: array? obj: Returns #t if obj is an array, and #f if not.

Note: Arrays are not disjoint from other Scheme types. Vectors and possibly strings also satisfy array?. A disjoint array predicate can be written:

(define (strict-array? obj)
  (and (array? obj) (not (string? obj)) (not (vector? obj))))

Function: equal? obj1 obj2

Returns #t if obj1 and obj2 have the same rank and dimensions and the corresponding elements of obj1 and obj2 are equal?.

equal? recursively compares the contents of pairs, vectors, strings, and arrays, applying eqv? on other objects such as numbers and symbols. A rule of thumb is that objects are generally equal? if they print the same. equal? may fail to terminate if its arguments are circular data structures.

(equal? 'a 'a)                             =>  #t
(equal? '(a) '(a))                         =>  #t
(equal? '(a (b) c)
        '(a (b) c))                        =>  #t
(equal? "abc" "abc")                       =>  #t
(equal? 2 2)                               =>  #t
(equal? (make-vector 5 'a)
        (make-vector 5 'a))                =>  #t
(equal? (make-array (A:fixN32b 4) 5 3)
        (make-array (A:fixN32b 4) 5 3))    =>  #t
(equal? (make-array '#(foo) 3 3)
        (make-array '#(foo) 3 3))          =>  #t
(equal? (lambda (x) x)
        (lambda (y) y))                    =>  unspecified

Function: array-rank obj: Returns the number of dimensions of obj. If obj is not an array, 0 is returned.

Function: array-dimensions array

Returns a list of dimensions.

(array-dimensions (make-array '#() 3 5))
   => (3 5)

Function: make-array prototype k1 ...

Creates and returns an array of type prototype with dimensions k1, ... and filled with elements from prototype. prototype must be an array, vector, or string. The implementation-dependent type of the returned array will be the same as the type of prototype; except if that would be a vector or string with rank not equal to one, in which case some variety of array will be returned.

If the prototype has no elements, then the initial contents of the returned array are unspecified. Otherwise, the returned array will be filled with the element at the origin of prototype.

Function: make-shared-array array mapper k1 ...

make-shared-array can be used to create shared subarrays of other arrays. The mapper is a function that translates coordinates in the new array into coordinates in the old array. A mapper must be affine (linear plus a constant offset, with exact integer coefficients), and its range must stay within the bounds of the old array, but it can be otherwise arbitrary. A simple example:

(define fred (make-array '#(#f) 8 8))
(define freds-diagonal
  (make-shared-array fred (lambda (i) (list i i)) 8))
(array-set! freds-diagonal 'foo 3)
(array-ref fred 3 3)
   => FOO
(define freds-center
  (make-shared-array fred (lambda (i j) (list (+ 3 i) (+ 3 j)))
                     2 2))
(array-ref freds-center 0 0)
   => FOO

Function: list->array rank proto list

list must be a rank-nested list consisting of all the elements, in row-major order, of the array to be created.

list->array returns an array of rank rank and type proto consisting of all the elements, in row-major order, of list. When rank is 0, list is the lone array element; not necessarily a list.

(list->array 2 '#() '((1 2) (3 4)))
                => #2A((1 2) (3 4))
(list->array 0 '#() 3)
                => #0A 3
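The rank argument is what tells list->array how deep the nesting goes, since the elements themselves may be lists. A minimal sketch of the row-major traversal, producing a flat list instead of an array; flatten-nested is a hypothetical name:

```scheme
;; Hypothetical helper, not part of this SRFI.
;; Returns the elements of a rank-nested list in row-major order.
;; At rank 0 the argument is itself the lone element.
(define (flatten-nested rank lst)
  (if (= rank 0)
      (list lst)
      (apply append
             (map (lambda (sub) (flatten-nested (- rank 1) sub))
                  lst))))

(flatten-nested 2 '((1 2) (3 4)))   ; => (1 2 3 4)
(flatten-nested 1 '((1 2) (3 4)))   ; => ((1 2) (3 4))  (elements are lists)
(flatten-nested 0 3)                ; => (3)
```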

Function: array->list array

Returns a rank-nested list consisting of all the elements, in row-major order, of array. In the case of a rank-0 array, array->list returns the single element.

(array->list #2A((ho ho ho) (ho oh oh)))
                => ((ho ho ho) (ho oh oh))
(array->list #0A ho)
                => ho

Function: vector->array vect proto dim1 ...

vect must be a vector of length equal to the product of exact nonnegative integers dim1, ....

vector->array returns an array of type proto consisting of all the elements, in row-major order, of vect. In the case of a rank-0 array, vect has a single element.

(vector->array '#(1 2 3 4) '#() 2 2)
                => #2A((1 2) (3 4))
(vector->array '#(3) '#())
                => #0A 3

Function: array->vector array

Returns a new vector consisting of all the elements of array in row-major order. In the case of a rank-0 array, array->vector returns the single element.

(array->vector #2A((1 2) (3 4)))
                => #(1 2 3 4)
(array->vector #0A ho)
                => ho

Function: array-in-bounds? array index1 ...: Returns #t if its arguments would be acceptable to array-ref.
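Since all indexes are zero-based, the bounds test amounts to checking the index count against the rank and each index against its dimension. A minimal sketch over a list of dimensions (such as array-dimensions returns); indices-in-bounds? is a hypothetical name:

```scheme
;; Hypothetical helper, not part of this SRFI.
;; DIMS is a list of dimensions; INDICES is a list of candidate indexes.
(define (indices-in-bounds? dims indices)
  (and (= (length dims) (length indices))
       (let loop ((d dims) (i indices))
         (or (null? d)
             (and (integer? (car i)) (exact? (car i))
                  (<= 0 (car i))
                  (< (car i) (car d))
                  (loop (cdr d) (cdr i)))))))

(indices-in-bounds? '(3 5) '(2 4))   ; => #t
(indices-in-bounds? '(3 5) '(3 0))   ; => #f  (first index too large)
(indices-in-bounds? '(3 5) '(1))     ; => #f  (wrong number of indexes)
(indices-in-bounds? '() '())         ; => #t  (rank 0)
```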

Function: array-ref array k1 ...: Returns the (k1, ...) element of array.

Procedure: array-set! array obj k1 ...: Stores obj in the (k1, ...) element of array. The value returned by array-set! is unspecified.

These functions return a prototypical uniform-array enclosing the optional argument (which must be of the correct type). If the uniform-array type is supported by the implementation, then it is returned; defaulting to the next larger precision type; resorting finally to vector.

Function: a:floc128b z
Function: a:floc128b: Returns an inexact 128.bit flonum complex uniform-array prototype.

Function: a:floc64b z
Function: a:floc64b: Returns an inexact 64.bit flonum complex uniform-array prototype.

Function: a:floc32b z
Function: a:floc32b: Returns an inexact 32.bit flonum complex uniform-array prototype.

Function: a:floc16b z
Function: a:floc16b: Returns an inexact 16.bit flonum complex uniform-array prototype.

Function: a:flor128b z
Function: a:flor128b: Returns an inexact 128.bit flonum real uniform-array prototype.

Function: a:flor64b z
Function: a:flor64b: Returns an inexact 64.bit flonum real uniform-array prototype.

Function: a:flor32b z
Function: a:flor32b: Returns an inexact 32.bit flonum real uniform-array prototype.

Function: a:flor16b z
Function: a:flor16b: Returns an inexact 16.bit flonum real uniform-array prototype.

Function: a:floq128d q
Function: a:floq128d: Returns an exact 128.bit decimal flonum rational uniform-array prototype.

Function: a:floq64d q
Function: a:floq64d: Returns an exact 64.bit decimal flonum rational uniform-array prototype.

Function: a:floq32d q
Function: a:floq32d: Returns an exact 32.bit decimal flonum rational uniform-array prototype.

Function: a:fixz64b n
Function: a:fixz64b: Returns an exact binary fixnum uniform-array prototype with at least 64 bits of precision.

Function: a:fixz32b n
Function: a:fixz32b: Returns an exact binary fixnum uniform-array prototype with at least 32 bits of precision.

Function: a:fixz16b n
Function: a:fixz16b: Returns an exact binary fixnum uniform-array prototype with at least 16 bits of precision.

Function: a:fixz8b n
Function: a:fixz8b: Returns an exact binary fixnum uniform-array prototype with at least 8 bits of precision.

Function: a:fixn64b k
Function: a:fixn64b: Returns an exact non-negative binary fixnum uniform-array prototype with at least 64 bits of precision.

Function: a:fixn32b k
Function: a:fixn32b: Returns an exact non-negative binary fixnum uniform-array prototype with at least 32 bits of precision.

Function: a:fixn16b k
Function: a:fixn16b: Returns an exact non-negative binary fixnum uniform-array prototype with at least 16 bits of precision.

Function: a:fixn8b k
Function: a:fixn8b: Returns an exact non-negative binary fixnum uniform-array prototype with at least 8 bits of precision.

Function: a:bool bool
Function: a:bool: Returns a boolean uniform-array prototype.

Implementation

slib/array.scm implements array procedures for R4RS or R5RS compliant Scheme implementations with records as implemented by slib/record.scm or SRFI-9.

"array.scm" redefines equal? to handle arrays.

Copyright

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Editor: David Van Horn

Last modified: Thu Jan 27 09:30:33 EST 2005