Title

Homogeneous numeric vector libraries

Author

John Cowan (based on SRFI 4 by Marc Feeley)

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-160@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This SRFI describes a set of operations on SRFI 4 homogeneous vector types (plus a few additional types) that are closely analogous to the vector operations library, SRFI 133. An external representation is specified which may be supported by the read and write procedures and by the program parser so that programs can contain references to literal homogeneous vectors.

Issues

Rationale

Like lists, Scheme vectors are a heterogeneous datatype which impose no restriction on the type of the elements. This generality is not needed for applications where all the elements are of the same type. The use of Scheme vectors is not ideal for such applications because, in the absence of a compiler with a fancy static analysis, the representation will typically use some form of boxing of the elements which means low space efficiency and slower access to the elements. Moreover, homogeneous vectors are convenient for interfacing with low-level libraries (e.g. binary block I/O) and to interface with foreign languages which support homogeneous vectors. Finally, the use of homogeneous vectors allows certain errors to be caught earlier.

This SRFI specifies a set of homogeneous vector datatypes which cover the most practical cases, that is where the type of the elements is numeric (exact integer or inexact real or complex) and the precision and representation is efficiently implemented on the hardware of most current computer architectures (8, 16, 32 and 64 bit integers, either signed or unsigned, and 32 and 64 bit floating point numbers).

This SRFI extends SRFI 4 by providing the additional c64vector and c128vector types, and by providing analogues for almost all of the heterogeneous vector procedures of SRFI 133. There are some additional procedures, most of which are closely analogous to the string procedures of SRFI 152.

Note that there are no conversions between homogeneous vectors and strings in this SRFI. In addition, there is no support for u1vectors (bitvectors) provided, not because they are not useful, but because they are different enough in both specification and implementation to be put into a future SRFI of their own.

Specification

There are eight datatypes of exact integer homogeneous vectors (which will be called integer vectors):
datatype type of elements
s8vector signed exact integer in the range -27 to 27-1
u8vector unsigned exact integer in the range 0 to 28-1
s16vectorsigned exact integer in the range -215 to 215-1
u16vectorunsigned exact integer in the range 0 to 216-1
s32vectorsigned exact integer in the range -231 to 231-1
u32vectorunsigned exact integer in the range 0 to 232-1
s64vectorsigned exact integer in the range -263 to 263-1
u64vectorunsigned exact integer in the range 0 to 264-1
All but the first are part of SRFI 4.

There are two datatypes of inexact real homogeneous vectors (which will be called float vectors):
datatype type of elements
f32vector inexact real
f64vector inexact real
These are part of SRFI 4.

The only required difference between the two float vector types is that f64vectors preserve at least as much precision and range as f32vectors (see the implementation section for details).

And there are two datatypes of inexact complex homogeneous vectors (which will be called complex vectors):
datatype type of elements
c64vector inexact complex
c128vector inexact complex
These are not part of SRFI 4.

The only required difference between the two complex vector types is that c128vectors preserve at least as much precision and range as c64vectors (see the implementation section for details).

A Scheme system that conforms to this SRFI does not have to support all of these homogeneous vector datatypes. However, a Scheme system must support float vectors if it supports Scheme inexact reals (of any precision). A Scheme system must support complex vectors if it supports Scheme inexact complex numbers (of any precision). Finally, a Scheme system must support a particular integer vector datatype if the system's exact integer datatype contains all the values that can be stored in such an integer vector. Thus a Scheme system with bignum support must implement all the integer vector datatypes, but a Scheme system might only support s8vectors, u8vectors, s16vectors and u16vectors if it only supports integers in the range -229 to 229-1 (which would be the case if they are represented as 32-bit machine integers with a 2-bit tag).

Scheme systems which conform to this SRFI and also conform to either R6RS or R7RS should use the same datatype for bytevectors and for u8vectors. All other homogeneous vector types are disjoint from each other and from all other Scheme types.

Each element of a homogeneous vector must be valid. That is, for an integer vector, it must be an exact integer within the inclusive range specified above; for a float vector, it must be an inexact real number; and for a complex vector, it must be an inexact complex number. It is an error to try to use a constructor or mutator to set an element to an invalid value.

It should be noted that all of the procedures that iterate across multiple vectors in parallel stop iterating and produce the final result when the end of the shortest vector is reached. The sole exception is @vector=, which immediately returns #f if the vectors' lengths are different.

Notation

So as not to multiply the number of procedures described in this SRFI beyond necessity, a special notational convention is used. The description of the procedure make-@vector is really shorthand for the descriptions of the thirteen procedures make-s8vector, make-u8vector, ... make-c128vector, all of which are exactly the same except that they construct different homogeneous vector types. Furthermore, except as otherwise noted, the semantics of each procedure are those of the corresponding SRFI 133 procedure, except that it is an error to attempt to insert an invalid value into a homogeneous vector. Consequently, only a brief description of each procedure is given, and SRFI 133 (or in some cases SRFI 152) should be consulted for the details.

In the section containing specifications of procedures, the following notation is used to specify parameters and return values:

(f arg1 arg2 ...) -> something
A procedure f that takes the parameters arg1 arg2 ... and returns a value of the type something. If two values are returned, two types are specified. If something is unspecified, then f returns a single implementation-dependent value; this SRFI does not specify what it returns, and in order to write portable code, the return value should be ignored.

vec
Must be a heterogeneous vector, i.e. it must satisfy the predicate vector?.

@vec, @to, @from
Must be a homogeneous vector, i.e. it must satisfy the predicate @vector?. In @vector-copy! and reverse-@vector-copy!, @to is the destination and @from is the source.

i, j, start, at
Must be an exact nonnegative integer less than the length of the @vector. In @vector-copy! and reverse-@vector-copy!, at refers to the destination and start to the source.

end
Must be an exact nonnegative integer not less than start. This indicates the index directly before which traversal will stop — processing will occur until the index of the vector is one less than end. It is the open right side of a range.

f
Must be a procedure taking one or more arguments, which returns (except as noted otherwise) exactly one value.

pred?
Must be a procedure taking one or more arguments that returns one value, which is treated as a boolean.

=
Must be an equivalence procedure.

obj, seed, knil
Any Scheme object.

fill, value
Any number that is valid with respect to the @vec.

[something]
An optional argument; it needn't necessarily be applied. Something needn't necessarily be one thing; for example, this usage of it is perfectly valid:

   [start [end]]

and is indeed used quite often.

something ...
Zero or more somethings are allowed to be arguments.

something1 something2 ...
At least one something must be arguments.

Procedures

The procedures shared with SRFI 4 are marked with [SRFI 4]. The procedures with the same semantics as SRFI 133 are marked with [SRFI 133] unless they are already marked with [SRFI 4]. The procedures analogous to SRFI 152 string procedures are marked with [SRFI 152].

Constructors

(make-@vector size [fill]) -> @vector [SRFI 4]

Returns a @vector whose length is size. If fill is provided, all the elements of the @vector are initialized to it.

(@vector value ...) -> @vector [SRFI 4]

Returns a @vector initialized with values.

(@vector-unfold f length seed ...) -> @vector [SRFI 133]

Create a @vector whose length is length and initialize it by calling f, f(f), f(f(f))) ... on the seed.

(@vector-unfold-right f length seed ...) -> @vector [SRFI 133]

The same as @vector-unfold, but initializes the @vector from right to left.

(@vector-copy @vec [start [end]]) -> @vector [SRFI 133]

Makes a copy of the portion of @vec from start to end and returns it.

(@vector-reverse-copy @vec [start [end]]) -> @vector [SRFI 133]

The same as @vector-copy, but in reverse order.

(@vector-append @vec ...) -> @vector [SRFI 133]

Returns a @vector containing all the elements of the @vecs in order.

(@vector-concatenate list-of-@vectors) -> @vector [SRFI 133]

The same as @vector-append, but takes a list of @vectors rather than multiple arguments.

(@vector-append-subvectors [@vec start end] ...) -> @vector [SRFI 133]

Concatenates the result of applying @vector-copy to each triplet of @vec, start, end arguments, but may be implemented more efficiently.

Predicates

(@? obj) -> boolean

Returns #t if obj is a valid element of an @vector, and #f otherwise.

(@vector? obj) -> boolean [SRFI 4]

Returns #t if obj is a @vector, and #f otherwise.

(@vector-empty? @vec) -> boolean [SRFI 133]

Returns #t if @vec has a length of zero, and #f otherwise.

(@vector= elt=? @vec ...) -> boolean [SRFI 133]

Compares the @vecs for elementwise equality, using elt=? to do the comparisons.

Selectors

(@vector-ref @vec i) -> value [SRFI 4]

Returns the ith element of @vec.

(@vector-length @vec) -> exact nonnegative integer [SRFI 4]

Returns the length of @vec

Iteration

(@vector-take @vec n) -> @vector] [SRFI 152]

(@vector-take-right @vec n) -> @vector [SRFI 152]

Returns a @vector containing the first/last n elements of @vec.

(@vector-drop @vec n) -> @vector [SRFI 152]

(@vector-drop-right @vec n) -> @vector [SRFI 152]

Returns a @vector containing all except the first/last n elements of @vec.

(@vector-segment @vec n) -> list [SRFI 152]

Returns a list of @vectors, each of which contains n consecutive elements of @vec. The last @vector may be shorter than n.

(@vector-fold kons knil @vec1 @vec2 ...) -> object [SRFI 133]

(@vector-fold-right kons knil @vec1 @vec2 ...) -> object [SRFI 133]

Folds kons over the elements of @vec in increasing/decreasing order using knil as the initial value.

(@vector-map f @vec1 @vec2 ...) -> @vector [SRFI 133]

(@vector-map! f @vec1 @vec2 ...) -> unspecified [SRFI 133]

(@vector-for-each f @vec1 @vec2 ...) -> unspecified [SRFI 133]

Iterate over the elements of @vec and apply f to each, returning respectively a @vector of the results, an undefined value with the results placed back in @vec, and an undefined value with no change to @vec.

(@vector-count pred? @vec1 @vec2 ...) -> exact nonnegative integer [SRFI 133]

Call pred? on the nth argument of each @vec and return the number of calls that return true.

(@vector-cumulate f knil @vec) -> @vector [SRFI 133]

Like @vector-fold, but returns an @vector of partial results rather than just the final result.

Searching

(@vector-take-while @vec pred? -> index [SRFI 152]

(@vector-take-while-right @vec pred? -> index [SRFI 152]

Return the shortest prefix/suffix of @vec all of whose elements satisfy pred?.

(@vector-drop-while @vec pred -> index [SRFI 152]

(@vector-drop-while-right @vec pred -> index [SRFI 152]

Drops the longest initial prefix/suffix of @vec such that all its elements satisfy pred.

(@vector-index pred? @vec.) -> exact nonnegative integer or #f [SRFI 133]

(@vector-index-right pred? @vec.) -> exact nonnegative integer or #f [SRFI 133]

Return the index of the first/last element of @vec that satisfies pred?.

(@vector-skip pred? @vec) -> exact nonnegative integer or #f [SRFI 133]

(@vector-skip-right pred? @vec.) -> exact nonnegative integer or #f [SRFI 133]

Returns the index of the first/last element of @vec that does not satisfy pred?.

(@vector-binary-search @vec value =) -> exact nonnegative integer or #f [SRFI 133]

Returns the index of any element of @vec that is the same as value in the sense of the = argument using a binary search. Returns #f if there is no such element.

(@vector-any pred? @vec1 @vec2 ...) -> value or #f [SRFI 133]

Returns the first set of elements from the @vecs which satisfy pred?, or #f if there is no such set.

(@vector-every pred? @vec1 @vec2 ...) -> value or #f [SRFI 133]

If all sets of elements from the @vecs satisfy pred?, return the last set. If none do, return #f.

(@vector-partition pred? @vec) -> @vector and integer [SRFI 133]

Returns a @vector of the same type as @vec, but with all elements satisfying pred? in the leftmost part of the vector and the other elements in the remaining part. The order of elements is otherwise preserved. Returns two values, the new @vector and the number of elements satisfying pred?.

(@vector-filter pred? @vec -> @vector [SRFI 152]

(@vector-remove pred? @vec -> @vector [SRFI 152]

Return a @vector containing the elements of @vec that satisfy / do not satisfy pred?.

Mutators

(@vector-set! @vec i value) -> unspecified [SRFI 4]

Sets the ith element of @vec to value.

(@vector-swap! @vec i j) -> unspecified [SRFI 133]

Interchanges the ith and jth elements of @vec.

(@vector-fill! @vec fill [start [end]]) -> unspecified [SRFI 133]

Fills the portion of @vec from start to end with the value fill.

(@vector-reverse! @vec [start [end]]) -> unspecified [SRFI 133]

Reverses the portion of @vec from start to end.

(@vector-copy! @to at @from [start [end]]) -> unspecified [SRFI 133]

Copies the portion of @from from start to end onto @to, starting at index @at.

(@vector-reverse-copy! @to at @from [start [end]]) -> unspecified [SRFI 133]

The same as @vector-copy!, but copies in reverse.

Conversion

(@vector->list @vec [start [end]]) -> proper-list [SRFI 4]

(reverse-@vector->list @vec [start [end]]) -> proper-list [SRFI 133]

(list->@vector proper-list) -> @vector [SRFI 4]

(reverse-list->@vector proper-list) -> @vector [SRFI 133]

(@vector->vector @vec) -> vector

(vector->@vector vec) -> @vector

Returns a list, @vector, or heterogeneous vector with the same elements as the argument, in reverse order where specified.

Generators

(make-@vector-generator @vector)

Returns a SRFI 121 generator that generates all the values of @vector in order. Note that the generator is finite.

Output

(write-@vector vec [ port ] ) -> unspecified

Prints to port (the current output port by default) a representation of @vec in the lexical syntax explained below.

Optional lexical syntax

Each homogeneous vector datatype has an external representation which may be supported by the read and write procedures and by the program parser. Conformance to this SRFI does not in itself require support for these external representations.

For each value of @ in { s8, u8, s16, u16, s32, u32, s64, u64, f32, f64, c64, c128 }, if the datatype @vector is supported, then the external representation of instances of the datatype @vector is #@( ...elements... ).

For example, #u8(0 #e1e2 #xff) is a u8vector of length 3 containing 0, 100 and 255; #f64(-1.5) is an f64vector of length 1 containing -1.5.

Note that the syntax for float vectors conflicts with R5RS, which parses #f32() as 3 objects: #f, 32 and (). For this reason, conformance to this SRFI implies this minor nonconformance to R5RS.

This external representation is also available in program source code. For example, (set! x '#u8(1 2 3)) will set x to the object #u8(1 2 3). Literal homogeneous vectors, like heterogeneous vectors, are self-evaluating; they do not need to be quoted. Homogeneous vectors can appear in quasiquotations but must not contain unquote or unquote-splicing forms (i.e. `(,x #u8(1 2)) is legal but `#u8(1 ,x 2) is not). This restriction is to accommodate the many Scheme systems that use the read procedure to parse programs.

Packaging

This SRFI is packaged as thirteen libraries named (srfi 160 @), or (srfi :160 @) in R6RS systems, on the assumption that most programs will use only a few homogeneous vector types.

Implementation

The SRFI 160 library

The sample implementation is (not yet) in the repository of this SRFI. It depends on the implementation of the (srfi x4) library described in the next section.

For systems that do not a native SRFI 4 implementation, the version in the contrib/cowan directory of the SRFI 4 repository may be used; it depends only on a minimal implementation of bytevectors.

After downloading the source, it is necessary to run the atexpander.sh shell script in order to generate the individual files for the different types. This will take at.sls, for example, and create the files u8.sls, s8.sls, ... c128.sls. The heavy lifting is done by sed.

The following files are provided:

The SRFI x4 library

The library (srfi x4) is in the repository of this SRFI. It supports everything that (srfi 4) supports, plus two additional homogeneous vector types beyond the scope of SRFI 4, namely c64vectors, which support inexact complex numbers as two IEEE 32-bit values, and c128vectors, which support inexact complex numbers as two IEEE 64-bit values. The same eight procedures are available for these types as for the others. In addition, the @? procedure described above is available for all types.

The implementation depends on SRFI 4. For systems that do not have a native SRFI 4 implementation, the version in the contrib/cowan directory of the SRFI 4 repository may be used; it depends only on a minimal implementation of bytevectors.

The following files are provided:

Copyright

Copyright © John Cowan 2018.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler