Title

Homogeneous numeric vector libraries

Author

John Cowan (based on SRFI 4 by Marc Feeley)
Shiro Kawai (contributed a major patch).

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-160@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Received: 2018-05-20
Draft #1 published: 2018-08-21
Draft #2 published: 2018-09-02
Draft #3 published: 2018-09-08
Draft #4 published: 2018-10-10
Draft #5 published: 2018-10-30
Draft #6 published: 2018-12-06
Draft #7 published: 2018-12-22
Draft #8 published: 2019-01-13
Draft #9 published: 2019-03-11
Draft #10 published: 2019-05-21
Draft #11 published: 2019-07-31
Finalized: 2019-08-27
Revised to fix errata:
- 2019-08-29 (Document behavior on multiple returns.)
- 2020-11-14 (Removed error in description of list->@vector.)
- 2022-08-21 (It was unspecified what vector-segment does with an inexact or non-positive argument.

Abstract

This SRFI describes a set of operations on SRFI 4 homogeneous vector types (plus a few additional types) that are closely analogous to the vector operations library, SRFI 133. An external representation is specified which may be supported by the read and write procedures and by the program parser so that programs can contain references to literal homogeneous vectors.

Rationale

Like lists, Scheme vectors are a heterogeneous datatype which impose no restriction on the type of the elements. This generality is not needed for applications where all the elements are of the same type. The use of Scheme vectors is not ideal for such applications because, in the absence of a compiler with a fancy static analysis, the representation will typically use some form of boxing of the elements which means low space efficiency and slower access to the elements. Moreover, homogeneous vectors are convenient for interfacing with low-level libraries (e.g., binary block I/O) and to interface with foreign languages which support homogeneous vectors. Finally, the use of homogeneous vectors allows certain errors to be caught earlier.

This SRFI specifies a set of homogeneous vector datatypes which cover the most practical cases, that is, where the type of the elements is numeric (exact integer or inexact real or complex) and the precision and representation is efficiently implemented on the hardware of most current computer architectures (8, 16, 32 and 64 bit integers, either signed or unsigned, and 32 and 64 bit floating point numbers).

This SRFI extends SRFI 4 by providing the additional c64vector and c128vector types, and by providing analogues for almost all of the heterogeneous vector procedures of SRFI 133. There are some additional procedures, most of which are closely analogous to the string procedures of SRFI 152.

Note that there are no conversions between homogeneous vectors and strings in this SRFI. In addition, there is no support for u1vectors (bitvectors) provided, not because they are not useful, but because they are different enough in both specification and implementation to be put into a future SRFI of their own.

Specification

There are eight datatypes of exact integer homogeneous vectors (which will be called integer vectors):

s8vector: signed exact integer in the range -2⁷ to 2⁷-1
u8vector: unsigned exact integer in the range 0 to 2⁸-1
s16vector: signed exact integer in the range -2¹⁵ to 2¹⁵-1
u16vector: unsigned exact integer in the range 0 to 2¹⁶-1
s32vector: signed exact integer in the range -2³¹ to 2³¹-1
u32vector: unsigned exact integer in the range 0 to 2³²-1
s64vector: signed exact integer in the range -2⁶³ to 2⁶³-1
u64vector: unsigned exact integer in the range 0 to 2⁶⁴-1

All are part of SRFI 4.

There are two datatypes of inexact real homogeneous vectors (which will be called float vectors):

f32vector: inexact real, typically 32 bits
f64vector: inexact real, typically 64 bits

These are also part of SRFI 4.

f64vectors must preserve at least as much precision and range as f32vectors. (See the implementation section for details.)

And there are two datatypes of inexact complex homogeneous vectors (which will be called complex vectors):

c64vector: inexact complex, typically 64 bits
c128vector: inexact complex, typically 128 bits

These are not part of SRFI 4.

c128vectors must preserve at least as much precision and range as c64vectors. (See the implementation section for details.)

A Scheme system that conforms to this SRFI does not have to support all of these homogeneous vector datatypes. However, a Scheme system must support float vectors if it supports Scheme inexact reals (of any precision). A Scheme system must support complex vectors if it supports Scheme inexact complex numbers (of any precision). Finally, a Scheme system must support a particular integer vector datatype if the system's exact integer datatype contains all the values that can be stored in such an integer vector. Thus a Scheme system with bignum support must implement all the integer vector datatypes, but a Scheme system might only support s8vectors, u8vectors, s16vectors and u16vectors if it only supports integers in the range -2²⁹ to 2²⁹-1 (which would be the case if they are represented as 32-bit machine integers with a 2-bit tag).

Scheme systems which conform to this SRFI and also conform to either R6RS or R7RS should use the same datatype for bytevectors and for u8vectors, and systems that also implement SRFI 74 (blobs) should use the same datatype for them as well. All other homogeneous vector types are disjoint from each other and from all other Scheme types.

Each element of a homogeneous vector must be valid. That is, for an integer vector, it must be an exact integer within the inclusive range specified above; for a float vector, it must be an inexact real number; and for a complex vector, it must be an inexact complex number. It is an error to try to use a constructor or mutator to set an element to an invalid value.

Notation

So as not to multiply the number of procedures described in this SRFI beyond necessity, a special notational convention is used. The description of the procedure make-@vector is really shorthand for the descriptions of the twelve procedures make-s8vector, make-u8vector, ... make-c128vector, all of which are exactly the same except that they construct different homogeneous vector types. Furthermore, except as otherwise noted, the semantics of each procedure are those of the corresponding SRFI 133 procedure, except that it is an error to attempt to insert an invalid value into a homogeneous vector. Consequently, only a brief description of each procedure is given, and SRFI 133 (or in some cases SRFI 152) should be consulted for the details. It is worth mentioning, however, that all the procedures that return one or more vectors (homogeneous or heterogeneous) invariably return newly allocated vectors specifically.

In the section containing specifications of procedures, the following notation is used to specify parameters and return values:

(f arg₁ arg₂ ...) -> something: A procedure f that takes the parameters arg₁ arg₂ ... and returns a value of the type something. If two values are returned, two types are specified. If something is unspecified, then f returns a single implementation-dependent value; this SRFI does not specify what it returns, and in order to write portable code, the return value should be ignored.
vec: Must be a heterogeneous vector, i.e. it must satisfy the predicate vector?.
@vec, @to, @from: Must be a homogeneous vector, i.e. it must satisfy the predicate @vector?. In @vector-copy! and reverse-@vector-copy!, @to is the destination and @from is the source.
i, j, start, at: Must be an exact nonnegative integer less than the length of the @vector. In @vector-copy! and reverse-@vector-copy!, at refers to the destination and start to the source.
end: Must be an exact nonnegative integer not less than start and not greater than the length of the vector. This indicates the index directly before which traversal will stop — processing will occur until the index of the vector is one less than end. It is the open right side of a range.
f: Must be a procedure taking one or more arguments, which returns (except as noted otherwise) exactly one value.
pred: Must be a procedure taking one or more arguments that returns one value, which is treated as a boolean.
=: Must be an equivalence procedure.
obj, seed, knil: Any Scheme object.
fill, value: Any number that is valid with respect to the @vec.
[something]: An optional argument; it needn't necessarily be applied. Something needn't necessarily be one thing; for example, this usage of it is perfectly valid:

[start [end]]

and is indeed used quite often.
something ...: Zero or more somethings are allowed to be arguments.
something₁ something₂ ...: At least one something must be arguments.

Packaging

For each @vector type, there is a corresponding library named (srfi 160 @), and if an implementation provides a given type, it must provide that library as well. In addition, the library (srfi 160 base) provides a few basic procedures for all @vector types. If a particular type is not provided by an implementation, then it is an error to call the corresponding procedures in this library. Note that there is no library named (srfi 160).

Procedures

The procedures shared with SRFI 4 are marked with [SRFI 4]. The procedures with the same semantics as SRFI 133 are marked with [SRFI 133] unless they are already marked with [SRFI 4]. The procedures analogous to SRFI 152 string procedures are marked with [SRFI 152].

Constructors

(make-@vector size [fill]) -> @vector [SRFI 4]

Returns a @vector whose length is size. If fill is provided, all the elements of the @vector are initialized to it.

(@vector value ...) -> @vector [SRFI 4]

Returns a @vector initialized with values.

(@vector-unfold f length seed) -> @vector [SRFI 133]

Creates a vector whose length is length and iterates across each index k between 0 and length - 1, applying f at each iteration to the current index and current state, in that order, to receive two values: the element to put in the kth slot of the new vector and a new state for the next iteration. On the first call to f, the state's value is seed.

(@vector-unfold-right f length seed) -> @vector [SRFI 133]

The same as @vector-unfold, but initializes the @vector from right to left.

(@vector-copy @vec [start [end]]) -> @vector [SRFI 133]

Makes a copy of the portion of @vec from start to end and returns it.

(@vector-reverse-copy @vec [start [end]]) -> @vector [SRFI 133]

The same as @vector-copy, but in reverse order.

(@vector-append @vec ...) -> @vector [SRFI 133]

Returns a @vector containing all the elements of the @vecs in order.

(@vector-concatenate list-of-@vectors) -> @vector [SRFI 133]

The same as @vector-append, but takes a list of @vectors rather than multiple arguments.

(@vector-append-subvectors [@vec start end] ...) -> @vector [SRFI 133]

Concatenates the result of applying @vector-copy to each triplet of @vec, start, end arguments, but may be implemented more efficiently.

Predicates

(@? obj) -> boolean

Returns #t if obj is a valid element of an @vector, and #f otherwise.

(@vector? obj) -> boolean [SRFI 4]

Returns #t if obj is a @vector, and #f otherwise.

(@vector-empty? @vec) -> boolean [SRFI 133]

Returns #t if @vec has a length of zero, and #f otherwise.

(@vector= @vec ...) -> boolean [SRFI 133]

Compares the @vecs for elementwise equality, using = to do the comparisons. Returns #f unless all @vectors are the same length.

Selectors

(@vector-ref @vec i) -> value [SRFI 4]

Returns the ith element of @vec.

(@vector-length @vec) -> exact nonnegative integer [SRFI 4]

Returns the length of @vec.

Iteration

(@vector-take @vec n) -> @vector [SRFI 152]

(@vector-take-right @vec n) -> @vector [SRFI 152]

Returns a @vector containing the first/last n elements of @vec.

(@vector-drop @vec n) -> @vector [SRFI 152]

(@vector-drop-right @vec n) -> @vector [SRFI 152]

Returns a @vector containing all except the first/last n elements of @vec.

(@vector-segment @vec n) -> list [SRFI 152]

Returns a list of @vectors, each of which contains n consecutive elements of @vec. The last @vector may be shorter than n. It is an error if n is not an exact positive integer.

(@vector-fold kons knil @vec @vec2 ...) -> object [SRFI 133]

(@vector-fold-right kons knil @vec @vec2 ...) -> object [SRFI 133]

When one @vector argument @vec is given, folds kons over the elements of @vec in increasing/decreasing order using knil as the initial value. The kons procedure is called with the state first and the element second, as in SRFIs 43 and 133 (heterogeneous vectors). This is the opposite order to that used in SRFI 1 (lists) and the various string SRFIs.

When multiple @vector arguments are given, kons is called with the current state value and each value from all the vectors; @vector-fold scans elements from left to right, while @vector-fold-right does from right to left. If the lengths of vectors differ, only the portion of each vector up to the length of the shortest vector is scanned.

(@vector-map f @vec @vec2 ...) -> @vector [SRFI 133]

(@vector-map! f @vec @vec2 ...) -> unspecified [SRFI 133]

(@vector-for-each f @vec @vec2 ...) -> unspecified [SRFI 133]

Iterate over the elements of @vec and apply f to each, returning respectively a @vector of the results, an undefined value with the results placed back in @vec, and an undefined value with no change to @vec.

If more than one vector is passed, f gets one element from each vector as arguments. If the lengths of the vectors differ, iteration stops at the end of the shortest vector. For @vector-map!, only @vec is modified even when multiple vectors are passed.

If @vector-map or @vector-map! returns more than once (i.e. because of a continuation captured by f), the values returned or stored by earlier returns may be mutated.

(@vector-count pred? @vec @vec2 ...) -> exact nonnegative integer [SRFI 133]

Call pred? on each element of @vec and return the number of calls that return true.

When multiple vectors are given, pred? must take the same number of arguments as the number of vectors, and corresponding elements from each vector are given for each iteration, which stops at the end of the shortest vector.

(@vector-cumulate f knil @vec) -> @vector [SRFI 133]

Like @vector-fold, but returns a @vector of partial results rather than just the final result.

Searching

(@vector-take-while pred? @vec) -> @vector [SRFI 152]

(@vector-take-while-right pred? @vec) -> @vector [SRFI 152]

Return the shortest prefix/suffix of @vec all of whose elements satisfy pred?.

(@vector-drop-while pred? @vec) -> @vector [SRFI 152]

(@vector-drop-while-right pred? @vec) -> @vector [SRFI 152]

Drops the longest initial prefix/suffix of @vec such that all its elements satisfy pred.

(@vector-index pred? @vec @vec2 ...) -> exact nonnegative integer or #f [SRFI 133]

(@vector-index-right pred? @vec @vec2 ...) -> exact nonnegative integer or #f [SRFI 133]

Return the index of the first/last element of @vec that satisfies pred?.

When multiple vectors are passed, pred? must take the same number of arguments as the number of vectors, and corresponding elements from each vector are passed for each iteration. If the lengths of vectors differ, @vector-index stops iteration at the end of the shortest one. Lengths of vectors must be the same for @vector-index-right.

(@vector-skip pred? @vec @vec2 ...) -> exact nonnegative integer or #f [SRFI 133]

(@vector-skip-right pred? @vec @vec2 ...) -> exact nonnegative integer or #f [SRFI 133]

Returns the index of the first/last element of @vec that does not satisfy pred?.

When multiple vectors are passed, pred? must take the same number of arguments as the number of vectors, and corresponding elements from each vector are passed for each iteration. If the lengths of vectors differ, @vector-skip stops iteration at the end of the shortest one. Lengths of vectors must be the same for @vector-skip-right.

(@vector-any pred? @vec @vec2 ...) -> value or boolean [SRFI 133]

Returns first non-false result of applying pred? on a element from the @vec, or #f if there is no such element. If @vec is empty, returns #t

(@vector-every pred? @vec @vec2 ...) -> value or boolean [SRFI 133]

If all elements from @vec satisfy pred?, return the last result of pred?. If not all do, return #f. If @vec is empty, return #t

When multiple vectors are passed, pred? must take the same number of arguments as the number of vectors, and corresponding elements from each vector is passed for each iteration. If the lengths of vectors differ, it stops at the end of the shortest one.

(@vector-partition pred? @vec) -> @vector and integer [SRFI 133]

Returns a @vector of the same type as @vec, but with all elements satisfying pred? in the leftmost part of the vector and the other elements in the remaining part. The order of elements is otherwise preserved. Returns two values, the new @vector and the number of elements satisfying pred?.

(@vector-filter pred? @vec) -> @vector [SRFI 152]

(@vector-remove pred? @vec) -> @vector [SRFI 152]

Return a @vector containing the elements of @vec that satisfy / do not satisfy pred?.

Mutators

(@vector-set! @vec i value) -> unspecified [SRFI 4]

Sets the ith element of @vec to value.

(@vector-swap! @vec i j) -> unspecified [SRFI 133]

Interchanges the ith and jth elements of @vec.

(@vector-fill! @vec fill [start [end]]) -> unspecified [SRFI 133]

Fills the portion of @vec from start to end with the value fill.

(@vector-reverse! @vec [start [end]]) -> unspecified [SRFI 133]

Reverses the portion of @vec from start to end.

(@vector-copy! @to at @from [start [end]]) -> unspecified [SRFI 133]

Copies the portion of @from from start to end onto @to, starting at index at.

(@vector-reverse-copy! @to at @from [start [end]]) -> unspecified [SRFI 133]

The same as @vector-copy!, but copies in reverse.

(@vector-unfold! f @vec start end seed) -> @vector [SRFI 133]

Like vector-unfold, but the elements are copied into the vector @vec starting at element start rather than into a newly allocated vector. Terminates when end - start elements have been generated.

(@vector-unfold-right! f @vec start end seed) -> @vector [SRFI 133]

The same as @vector-unfold!, but initializes the @vector from right to left.

Conversion

(@vector->list @vec [start [end]]) -> proper-list [SRFI 4 plus start and end]

(reverse-@vector->list @vec [start [end]]) -> proper-list [SRFI 133]

(list->@vector proper-list) -> @vector

(reverse-list->@vector proper-list) -> @vector [SRFI 133]

(@vector->vector @vec [start [end]]) -> vector

(vector->@vector vec [start [end]]) -> @vector

Returns a list, @vector, or heterogeneous vector with the same elements as the argument, in reverse order where specified.

Generators

(make-@vector-generator @vector)

Returns a SRFI 121 generator that generates all the values of @vector in order. Note that the generator is finite.

Comparators

@vector-comparator

Variable containing a SRFI 128 comparator whose components provide ordering and hashing of @vectors.

Output

(write-@vector @vec [ port ] ) -> unspecified

Prints to port (the current output port by default) a representation of @vec in the lexical syntax explained below.

Optional lexical syntax

Each homogeneous vector datatype has an external representation which may be supported by the read and write procedures and by the program parser. Conformance to this SRFI does not in itself require support for these external representations.

For each value of @ in { s8, u8, s16, u16, s32, u32, s64, u64, f32, f64, c64, c128 }, if the datatype @vector is supported, then the external representation of instances of the datatype @vector is #@( ...elements... ).

For example, #u8(0 #e1e2 #xff) is a u8vector of length 3 containing 0, 100 and 255; #f64(-1.5) is an f64vector of length 1 containing -1.5.

Note that the syntax for float vectors conflicts with R5RS, which parses #f32() as 3 objects: #f, 32 and (). For this reason, conformance to this SRFI implies this minor nonconformance to R5RS.

This external representation is also available in program source code. For example, (set! x '#u8(1 2 3)) will set x to the object #u8(1 2 3). Literal homogeneous vectors, like heterogeneous vectors, are self-evaluating; they do not need to be quoted. Homogeneous vectors can appear in quasiquotations but must not contain unquote or unquote-splicing forms (i.e. `(,x #u8(1 2)) is legal but `#u8(1 ,x 2) is not). This restriction is to accommodate the many Scheme systems that use the read procedure to parse programs.

Implementation

This implementation was developed on Chicken 5 and Chibi, and directly supports those two systems. There is support for Gauche as well. It should be easy to adapt to other implementations.

After downloading the source, it is necessary to run the atexpander.sh shell script in order to generate the individual files for the different types. This will take a skeleton file like at.sld and create the files u8.sld, s8.sld, ... c128.sld. The unexpander.sh script safely undoes the effects of atexpander.sh. The heavy lifting is done by sed.

Making this SRFI available on R6RS systems is straightforward, requiring only a replacement library file that includes the implementation files in the srfi/160/base directory and the srfi/160 directories. The file include.scm contains an R6RS (include) library that will be useful for systems that don't provide it.

The SRFI 160 base library

The library (srfi 160 base) is in the repository of this SRFI. It supports the eight procedures of SRFI 4, namely make-@vector, @vector, @vector?, @vector-length, @vector-ref, @vector-set!, @vector->list, and list->@vector, not only for the ten homogeneous vector types supported by SRFI 4, but also for the two homogeneous vector types beyond the scope of SRFI 4, namely c64vectors and c128vectors. In addition, the @? procedure, which is not in SRFI 4, is available for all types.

The implementation depends on SRFI 4. For systems that do not have a native SRFI 4 implementation, the version in the contrib/cowan directory of the SRFI 4 repository may be used; it depends only on a minimal implementation of bytevectors.

The tests are for the c64 and c128 procedures and the @? procedures only. The assumption is that tests for the underlying SRFI 4 procedures suffice for everything else.

The following files are provided:

srfi.160.base.scm - Chicken 5 (srfi 160 base) library.
srfi/160/base.sld - R7RS (srfi 160 base) library.
srfi/160/base/complex.scm - Complex number implementation on top of SRFI 4.
srfi/160/base/valid.scm - Validity (@?) predicates.
srfi/160/base/at-vector2list.scm - Reimplementation of SRFI 4's @vector->list procedures that accept start and end optional arguments.
srfi/160/base/r7rec.scm - Record-type definitions for R7RS or SRFI 9.
srfi/160/base/r6rec.scm - Record-type definitions for R6RS.
shared-base-tests.scm - Shared tests (no dependencies).
chibi-base-tests.scm - Chibi test script wrapper.
chicken-base-tests.scm - Chicken 5 test script wrapper.

The SRFI 160 libraries

The following files are provided:

srfi/160/at.sld - Skeleton for Chibi libraries. Can be adapted to any R7RS system.
srfi.160.at.scm - Skeleton for Chicken 5 libraries. Can be adapted to any R5RS system with simple byte vectors.
srfi/160/at-impl.scm - Skeleton for shared implementation of SRFI 160 procedures.
shared-tests.scm - Tests of the s16 library only (depends on Chicken or Chibi test library). The assumption is that if s16 works, everything works.
chibi-tests.scm - Chibi test script wrapper.
chicken-tests.scm - Chicken 5 test script wrapper.
gauche-tests.scm - Gauche test script with embedded testing library.

Acknowledgements

Thanks to all participants in the SRFI 160 mailing list over the unconscionably long time it took me to get this proposal to finalization. Special thanks to Shiro Kawai for bringing up many issues and contributing the code that extended many procedures from one @vector to many.

Copyright

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Editor: Arthur A. Gleckler