178: Bitvector library

by John Cowan (text), Wolfgang Corcoran-Mathe (implementation)

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-178@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

This SRFI describes a set of operations on homogeneous bitvectors. Operations analogous to those provided on the other homogeneous vector types described in SRFI 160 are provided, along with operations analogous to the bitwise operations of SRFI 151.

Rationale

Bitvectors were excluded from the final draft of SRFI 160 because they are the only type of homogeneous numeric vectors for which bitwise operations make sense. In addition, there are two ways to view them: as vectors of exact integers limited to the values 0 and 1, and as vectors of booleans. This SRFI is designed to allow bitvectors to be viewed in either way.

Specification

Bitvectors are disjoint from all other Scheme types with the possible exception of u1vectors, if the Scheme implementation supports them.

The procedures of this SRFI that accept single bits or lists of bits can be passed any of the values 0, 1, #f, #t. Procedures that return a bit or a list of bits come in two flavors: one ending in /int that returns an exact integer, and one ending in /bool that returns a boolean.

Mapping and folding procedures also come in two flavors: those ending in /int pass exact integers to their procedure arguments, whereas those ending in /bool pass booleans to theirs.

Procedures whose names end in ! are the same as the corresponding procedures without !, except that the first bitvector argument is mutated and an unspecified result is returned. Consequently, only the non-! version is documented below.

It is an error unless all bitvector arguments passed to procedures that accept more than one are of the same length (except as otherwise noted).

Notation

In the section containing specifications of procedures, the following notation is used to specify parameters and return values:

(f arg1 arg2 ...) -> something
A procedure f that takes the parameters arg1 arg2 ... and returns a value of the type something. If two values are returned, two types are specified. If something is unspecified, then f returns a single implementation-dependent value; this SRFI does not specify what it returns, and in order to write portable code, the return value should be ignored.

vec
A heterogeneous vector; that is, it must satisfy the predicate vector?.

bvec, to, from
A bitvector, i.e. it must satisfy the predicate bitvector?. In bitvector-copy! and reverse-bitvector-copy!, to is the destination and from is the source.

i, j, start, at
An exact nonnegative integer less than the length of the bitvector. In bitvector-copy! and reverse-bitvector-copy!, at refers to the destination and start to the source.

end
An exact nonnegative integer not less than start. This indicates the index directly before which traversal will stop — processing will occur until the index of the vector is one less than end. It is the open right side of a range.

f
A procedure taking one or more arguments, which returns (except as noted otherwise) exactly one value.

=
An equivalence procedure.

obj, seed, knil
Any Scheme object.

[something]
An optional argument; it needn't necessarily be applied. Something needn't necessarily be one thing; for example, this usage of it is perfectly valid:

   [start [end]]

and is indeed used quite often.

something ...
Zero or more somethings are allowed to be arguments.

something1 something2 ...
One or more somethings must be arguments.

All procedures that return bitvectors, vectors, or lists newly allocate their results, except those that end in !. However, a zero-length value need not be separately allocated.

Except as otherwise noted, the semantics of each procedure are those of the corresponding SRFI 133 or SRFI 151 procedure.

Procedures

Bit conversion

(bit->integer bit) -> exact integer

Returns 0 if bit is 0 or #f and 1 if bit is 1 or #t.

(bit->boolean bit) -> boolean

Returns #f if bit is 0 or #f and #t if bit is 1 or #t.

Constructors

(make-bitvector size [bit]) -> bitvector

Returns a bitvector whose length is size. If bit is provided, all the elements of the bitvector are initialized to it.

(bitvector value ...) -> bitvector

Returns a bitvector initialized with values.

(bitvector-unfold f length seed ...) -> bitvector

Creates a vector whose length is length and iterates across each index k between 0 and length, applying f at each iteration to the current index and current states, in that order, to receive n + 1 values: the bit to put in the kth slot of the new vector and new states for the next iteration. On the first call to f, the states' values are the seeds.

(bitvector-unfold-right f length seed) -> bitvector

The same as bitvector-unfold, but initializes the bitvector from right to left.

(bitvector-copy bvec [start [end]]) -> bitvector

Makes a copy of the portion of bvec from start to end and returns it.

(bitvector-reverse-copy bvec [start [end]]) -> bitvector

The same as bitvector-copy, but in reverse order.

(bitvector-append bvec ...) -> bitvector

Returns a bitvector containing all the elements of the bvecs in order.

(bitvector-concatenate list-of-bitvectors) -> bitvector

The same as bitvector-append, but takes a list of bitvectors rather than multiple arguments.

(bitvector-append-subbitvectors [bvec start end] ...) -> bitvector

Concatenates the result of applying bitvector-copy to each triplet of bvec, start, end arguments, but may be implemented more efficiently.

Predicates

(bitvector? obj) -> boolean

Returns #t if obj is a bitvector, and #f otherwise.

(bitvector-empty? bvec) -> boolean

Returns #t if bvec has a length of zero, and #f otherwise.

(bitvector=? bvec ...) -> boolean

Compares the bvecs for element-wise equality, using eqv? to do the comparisons, and returns #t or #f accordingly.

Selectors

(bitvector-ref/int bvec i) -> integer
(bitvector-ref/bool bvec i) -> boolean

Returns the ith element of bvec as an exact integer or boolean, respectively.

(bitvector-length bvec) -> exact nonnegative integer

Returns the length of bvec.

Iteration

(bitvector-take bvec n) -> bitvector
(bitvector-take-right bvec n) -> bitvector

Returns a bitvector containing the first/last n elements of bvec.

(bitvector-drop bvec n) -> bitvector
(bitvector-drop-right bvec n) -> bitvector

Returns a bitvector containing all except the first/last n elements of bvec.

(bitvector-segment bvec n) -> list

Returns a list of bitvectors, each of which contains n consecutive elements of bvec. The last bitvector may be shorter than n. It is an error if n is not an exact positive integer.

(bitvector-fold/int kons knil bvec1 bvec2 ...) -> object
(bitvector-fold/bool kons knil bvec1 bvec2 ...) -> object
(bitvector-fold-right/int kons knil bvec1 bvec2 ...) -> object
(bitvector-fold-right/bool kons knil bvec1 bvec2 ...) -> object

Folds kons over the elements of bvec in increasing/decreasing order using knil as the initial value. The kons procedure is called with the states first and the new element last, as in SRFIs 43, 133, and 160.

(bitvector-map/int f bvec1 bvec2 ...) -> bitvector
(bitvector-map/bool f bvec1 bvec2 ...) -> bitvector
(bitvector-map!/int f bvec1 bvec2 ...) -> unspecified
(bitvector-map!/bool f bvec1 bvec2 ...) -> unspecified
(bitvector-map->list/int f bvec1 bvec2 ...) -> list
(bitvector-map->list/bool f bvec1 bvec2 ...) -> list
(bitvector-for-each/int f bvec1 bvec2 ...) -> unspecified
(bitvector-for-each/bool f bvec1 bvec2 ...) -> unspecified

Iterate over the corresponding elements of the bvecs and apply f to each, returning respectively: a bitvector of the results, an undefined value with the results placed back in bvec1, a list of the results, and an undefined value with no change to bvec1.

Prefixes, suffixes, trimming, padding

(bitvector-prefix-length bvec1 bvec2) -> index
(bitvector-suffix-length bvec1 bvec2) -> index

Return the number of elements that are equal in the prefix/suffix of the two bvecs, which are allowed to be of different lengths.

(bitvector-prefix? bvec1 bvec2) -> boolean
(bitvector-suffix? bvec1 bvec2) -> boolean

Returns #t if bvec1 is a prefix/suffix of bvec2, and #f otherwise. The arguments are allowed to be of different lengths.

(bitvector-pad bit bvec length) -> bvec
(bitvector-pad-right bit bvec length) -> bvec

Returns a copy of bvec with leading/trailing elements equal to bit added (if necessary) so that the length of the result is length.

(bitvector-trim bit bvec) -> bvec
(bitvector-trim-right bit bvec) -> bvec
(bitvector-trim-both bit bvec) -> bvec

Returns a copy of bvec with leading/trailing/both elements equal to bit removed.

Mutators

(bitvector-set! bvec i bit) -> unspecified

Sets the ith element of bvec to bit.

(bitvector-swap! bvec i j) -> unspecified

Interchanges the ith and jth elements of bvec.

(bitvector-reverse! bvec [start [end]]) -> unspecified

Reverses the portion of bvec from start to end.

(bitvector-copy! to at from [start [end]]) -> unspecified

Copies the portion of from from start to end onto to, starting at index at.

(bitvector-reverse-copy! to at from [start [end]]) -> unspecified

The same as bitvector-copy!, but copies in reverse.

Conversion

(bitvector->list/int bvec [start [end]]) -> list of integers
(bitvector->list/bool bvec [start [end]]) -> list of booleans

(reverse-bitvector->list/int bvec [start [end]]) -> list of integers
(reverse-bitvector->list/bool bvec [start [end]]) -> list of booleans

(list->bitvector list) -> bitvector
(reverse-list->bitvector list) -> bitvector

(bitvector->vector/int bvec [start [end]]) -> vector of integers
(bitvector->vector/bool bvec [start [end]]) -> vector of booleans

(reverse-bitvector->vector/int bvec [start [end]]) -> vector of integers
(reverse-bitvector->vector/bool bvec [start [end]]) -> vector of booleans

(vector->bitvector vec [start [end]]) -> bitvector
(reverse-vector->bitvector vec [start [end]]) -> bitvector

Returns a list, bitvector, or heterogeneous vector with the same elements as the argument, in reverse order where specified.

(bitvector->string bvec) -> string

Returns a string beginning with "#*" and followed by the values of bvec represented as 0 and 1 characters. This is the Common Lisp representation of a bitvector.

(string->bitvector string) -> bitvector

Parses a string in the format generated by bitvector->string and returns the corresponding bitvector, or #f if the string is not in this format.

(bitvector->integer bitvector)

Returns a non-negative exact integer whose bits, starting with the least significant bit as bit 0, correspond to the values in bitvector. This ensures compatibility with the integers-as-bits operations of SRFI 151.

(integer->bitvector integer [ len ])

Returns a bitvector whose length is len whose values correspond to the bits of integer, a non-negative exact integer, starting with the least significant bit as bit 0. This ensures compatibility with the integers-as-bits operations of SRFI 151.

The default value of len is (integer-length integer). If the value of len is less than the default, the resulting bitvector cannot be converted back by bitvector->integer correctly.

Generators

(make-bitvector/int-generator bitvector)
(make-bitvector/bool-generator bitvector)

Returns a SRFI 158 generator that generates all the values of bitvector in order. Note that the generator is finite.

(make-bitvector-accumulator)

Returns a SRFI 158 accumulator that collects all the bits it is invoked on. When invoked on an end-of-file object, returns a bitvector containing all the bits in order.

Basic operations

(bitvector-not bvec)
(bitvector-not! bvec)

Returns the element-wise complement of bvec; that is, each value is changed to the opposite value.

The following ten procedures correspond to the useful set of non-trivial two-argument boolean functions. For each such function, the corresponding bitvector operator maps that function across two or more bitvectors in an element-wise fashion. The core idea of this group of functions is this element-wise "lifting" of the set of dyadic boolean functions to bitvector parameters.

(bitvector-and bvec1 bvec2 bvec ...)
(bitvector-and! bvec1 bvec2 bvec ...)
(bitvector-ior bvec1 bvec2 bvec ...)
(bitvector-ior! bvec1 bvec2 bvec ...)
(bitvector-xor bvec1 bvec2 bvec ...)
(bitvector-xor! bvec1 bvec2 bvec ...)
(bitvector-eqv bvec1 bvec2 bvec ...)
(bitvector-eqv! bvec1 bvec2 bvec ...)

These operations are associative.

The bitvector-eqv procedure produces the complement of the bitvector-xor procedure. When applied to three arguments, it does not produce a #t value everywhere that a, b and c all agree. That is, it does not produce

     (bitvector-ior (bitvector-and a b c)
                    (bitvector-and (bitvector-not a)
                                   (bitvector-not b)
                                   (bitvector-not c)))

Rather, it produces (bitvector-eqv a (bitvector-eqv b c)) or the equivalent (bitvector-eqv (bitvector-eqv a b) c).

(bitvector-nand bvec1 bvec2)
(bitvector-nand! bvec1 bvec2)
(bitvector-nor bvec1 bvec2)
(bitvector-nor! bvec1 bvec2)
(bitvector-andc1 bvec1 bvec2)
(bitvector-andc1! bvec1 bvec2)
(bitvector-andc2 bvec1 bvec2)
(bitvector-andc2! bvec1 bvec2)
(bitvector-orc1 bvec1 bvec2)
(bitvector-orc1! bvec1 bvec2)
(bitvector-orc2 bvec1 bvec2)
(bitvector-orc2! bvec1 bvec2)

These operations are not associative.

Quasi-integer operations

(bitvector-logical-shift bvec count bit)

Returns a bitvector equal in length to bvec containing the logical left shift (toward lower indices) when count>=0 or the right shift (toward upper indices) when count<0. Newly vacated elements are filled with bit.

(bitvector-count bit bvec)

Returns the number of bit values in bvec.

(bitvector-count-run bit bvec i)

Returns the number of consecutive bit values in bvec, starting at index i.

(bitvector-if if-bitvector then-bitvector else-bitvector)

Returns a bitvector that merges the bitvectors then-bitvector and else-bitvector, with the bitvector if-bitvector determining from which bitvector to take each value. That is, if the kth value of if-bitvector is #t (or 1, depending in how you look at it), then the kth bit of the result is the kth bit of then-bitvector, otherwise the kth bit of else-bitvector.

(bitvector-first-bit bit bvec)

Return the index of the first (smallest index) bit value in bvec. Return -1 if bvec contains no values equal to bit.

Bit field operations

These procedures operate on a contiguous field of bits (a "byte" in Common Lisp parlance) in a given bitvector. The start and end arguments, which are not optional, are non-negative exact integers specifying the field: it is the end − start bits running from bit start to bit end − 1.

(bitvector-field-any? bvec start end)

Returns #t if any of the field's bits are set in bvec, and #f otherwise.

(bitvector-field-every? bvec start end)

Returns #f if any of the field's bits are not set in bvec, and #t otherwise.

(bitvector-field-clear bvec start end)
(bitvector-field-clear! bvec start end)
(bitvector-field-set bvec start end)
(bitvector-field-set! bvec start end)

Returns a bitvector containing bvec with the field's bits set appropriately.

(bitvector-field-replace dest source start end)
(bitvector-field-replace! dest source start end)

Returns a bitvector containing dest with the field replaced by the first end-start bits in source.

(bitvector-field-replace-same dest source start end)
(bitvector-field-replace-same! dest source start end)

Returns a bitvector containing dest with its field replaced by the corresponding field in source.

(bitvector-field-rotate bvec count start end)

Returns bvec with the field cyclically permuted by count bits towards higher indices when count is negative, and toward lower indices otherwise.

(bitvector-field-flip bvec start end)
(bitvector-field-flip! bvec start end)

Returns bvec with the bits in the field flipped: that is, each value is replaced by the opposite value. There is no SRFI 151 equivalent.

Bitvector literals

The compact string representation used by bitvector->string and string->bitvector may be supported by the standard read and write procedures and by the program parser, so that programs can contain references to literal bitvectors. On input, it is an error if such a literal is not followed by a <delimiter> or the end of input.

Implementation

The sample implementation can be found in the repository of this SRFI and in this .tgz file. Bitvectors are implemented as wrapped SRFI 160 u8vectors; for simplicity and possibly for speed, they use a whole byte to represent each bit, as Java and C# do. At a later date, a more complex implementation that packs bits into bytes but still operates byte-wise as much as possible might be provided.

Copyright

© 2018 John Cowan, 2020 Wolfgang Corcoran-Mathe

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler