Title

Integer Bitwise-operation Library

Author

Olin Shivers

Status

This SRFI is currently in withdrawn status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-33@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Table of contents

Abstract

R5RS Scheme has no utilities for performing bitwise logical operations on integers or bitstrings, which is a problem for authors of portable code. This SRFI proposes a coherent and comprehensive set of these functions; it is accompanied by a reference implementation of the spec in terms of a set of seven core operators. The reference implementation is

The precise semantics of these operators is almost never an issue. A consistent, portable set of names and parameter conventions, however, is. Hence this SRFI.

Function index

  bitwise-not
  bitwise-and   bitwise-ior
  bitwise-xor   bitwise-eqv
  bitwise-nand  bitwise-nor
  bitwise-andc1 bitwise-andc2
  bitwise-orc1  bitwise-orc2

  arithmetic-shift bit-count integer-length

  bitwise-merge
  bit-set? any-bits-set? all-bits-set?
  first-set-bit

  extract-bit-field test-bit-field? clear-bit-field
  replace-bit-field copy-bit-field
  

Function specifications

In a Scheme system that has a module or package system, these functions should be contained in a module named "bitwise-lib". They may additionally be provided as part of the general suite of numerical operations.

In the following function specifications all parameters are exact integers; unless otherwise indicated, all return values are exact integers. It is an error to pass values of other types as arguments to these functions.

Bitstrings are represented by integers, using a two's-complement encoding of the bitstring. Thus every integer represents a semi-infinite bitstring, having either a finite number of zeroes (negative integers) or a finite number of ones (non-negative integers). The bits of a bitstring are numbered from the rightmost/least-significant bit: bit #0 is the rightmost or 2^0 bit, bit #1 is the next or 2^1 bit, and so forth.

bitwise-not i -> exact-integer

Bitwise logical negation, i.e., -(i+1).

  (bitwise-not 10) => -11
  (bitwise-not -37) => 36

N-ary operators, for n >= 0

ProcedureBit function
bitwise-and i ...And
bitwise-ior i ...Inclusive or
bitwise-xor i ...Exclusive or
bitwise-eqv i ...(eqv b1 b2) = (not (xor b1 b2))
bitwise-nand i ...(not (and b ...))
bitwise-nor i ...(not (or b ...))

Note that the and, ior, xor and eqv operations are associative.

Exactly two arguments

ProcedureBit function
bitwise-andc1 i j(and (not bi) bj)
bitwise-andc2 i j(and bi (not bj))
bitwise-orc1 i j(ior (not bi) bj)
bitwise-orc2 i j(ior bi (not bj))

Trivial, hence not provided

ProcedureDefinition
bitwise-const0 i j(lambda (i j) 0)
bitwise-const1 i j(lambda (i j) -1)
bitwise-arg1 i j(lambda (i j) i)
bitwise-arg2 i j(lambda (i j) j)
bitwise-not1 i j(lambda (i j) (bitwise-not i))
bitwise-not2 i j(lambda (i j) (bitwise-not j))
arithmetic-shift i count -> exact-integer

Arithmetic left shift when COUNT>0; right shift when COUNT<0.

  (arithmetic-shift 8 2) => 32
  (arithmetic-shift 4 0) => 4
  (arithmetic-shift 8 -1) => 4
  (arithmetic-shift -100000000000000000000000000000000 -100) => -79
bit-count i -> nonnegative-exact-integer

Population count of 1's (i >= 0) or 0's (i < 0).

  (bit-count 0) =>  0
  (bit-count -1) =>  0
  (bit-count 7) =>  3
  (bit-count  13) =>  3 ;Two's-complement binary: ...0001101
  (bit-count -13) =>  2 ;Two's-complement binary: ...1110011
  (bit-count  30) =>  4 ;Two's-complement binary: ...0011110
  (bit-count -30) =>  4 ;Two's-complement binary: ...1100010
  (bit-count (expt 2 100)) =>  1
  (bit-count (- (expt 2 100))) =>  100
  (bit-count (- (1+ (expt 2 100)))) =>  1
integer-length i -> nonnegative-exact-integer

The number of bits needed to represent I, i.e.

  (ceiling (/ (log (if (negative? integer)
  (- integer)
  (+ 1 integer)))
  (log 2)))
  

For i >= 0, this is the number of bits needed to represent I in an unsigned binary representation. For all i, (+ 1 (integer-length i)) is the number of bits needed to represent i in a signed, twos-complement representation.

  (integer-length  0) => 0
  (integer-length  1) => 1
  (integer-length -1) => 0
  (integer-length  7) => 3
  (integer-length -7) => 3
  (integer-length  8) => 4
  (integer-length -8) => 3
  
bitwise-merge mask i0 i1 -> exact-integer

Merge the bitstrings I0 and I1, with bitstring MASK determining from which string to take each bit. That is,

RESULT[k] := if MASK[k] = 0 then I0[k] else I1[k].

or

(bitwise-ior (bitwise-and (bitwise-not mask) i0)
  (bitwise-and mask i1))
bit-set? index i -> boolean

Is bit INDEX set in bitstring I? INDEX is a non-negative exact integer. The rightmost/least-significant bit in the bitstring is bit 0.

  (bit-set? 1 1) =>  false
  (bit-set? 0 1) =>  true
  (bit-set? 3 10) =>  true
  (bit-set? 1000000 -1) =>  true
  (bit-set? 2 6) =>  true
  (bit-set? 0 6) =>  false
  
any-bits-set? test-bits i -> boolean
all-bits-set? test-bits i -> boolean

Determines if any / all of the bits set in bitstring TEST-BITS are set in bitstring I. I.e., return

(not (zero? (bitwise-and TEST-BITS I)))

or

(= TEST-BITS (bitwise-and TEST-BITS I)))

respectively.

first-set-bit i -> exact-integer

Return the index of the first (smallest index) 1 bit in bitstring I. Return -1 if I contains no 1 bits (i.e., if I is zero).

  (first-set-bit 1) => 0
  (first-set-bit 2) => 1
  (first-set-bit 0) => -1
  (first-set-bit 40) => 3
  (first-set-bit -28) => 2
  (first-set-bit (expt  2 99)) => 99
  (first-set-bit (expt -2 99)) => 99
extract-bit-field size position i -> exact-integer
test-bit-field?   size position i -> boolean
clear-bit-field   size position i -> exact-integer
replace-bit-field size position new-field i -> exact-integer
copy-bit-field    size position from to     -> exact-integer

These functions operate on a contiguous field of bits (a "byte," in Common-Lisp parlance) in a given bitstring I. SIZE and POSITION are non-negative exact integers specifying the field: it is the SIZE bits running from bit POSITION to bit POSITION+SIZE-1.

Rationale

Summaries of related designs

Below are summaries of the related libraries currently found in Common Lisp, PLT Scheme, slib, Bigloo, Scheme 48, Kawa, and MIT Scheme. I was unable to find anything for Gambit.

Common Lisp

  lognot n
  Associative: log{ior,xor,and,eqv}
  Non-associative: log{nand,nor,andc1,andc2,orc1,orc2}

  (boole op i j)
  op one of boole-{clr,set,1,2,c1,c2,and,ior,xor,eqv,nand,nor,
  andc1,andc2,orc1,orc2}

  (logtest testbits n)  ; #t if any of the 1 bits in TESTBITS are set in N.
  (not (zerop (logand x y)))

  (logbitp index n) ; #t if bit # INDEX in N is set.
  (not (zero? (logand n (ash 1 index))))

  ash n count
  logcount n    ; pop-count
  integer-length n

  A CL byte is a contiguous field of bits in an int.
  (byte size position) -> byte-specifier
  (byte-size byte-spec) -> int
  (byte-position byte-spec) -> int

  (ldb bytespec n)      ; Extracted byte is shifted down to lsb position.
  (ldb-test bytespec n) #t if any bits in the byte are 1's.
  (mask-field bytespec n) Zero out all bits not in bytespec.
  (dpb newbyte bytespec n)      ; Replacement bits are low bits of newbyte
  (deposit-field from bytespec n) ; Replacement bits are (ldb from bytespec)
    

PLT Scheme

  bitwise-ior i1 ...
  bitwise-and i1 ...
  bitwise-xor i1 ...
  bitwise-not i
  arithmetic-shift i j
    

slib

Slib has a clone of a chunk of CL's design. It also has (bit-extract n start end) which is like ldb on bits [start,end) of n.

Bigloo

  bit-or i1 i2
  bit-xor i1 i2
  bit-and i1 i2
  bit-not i
  bit-lsh i1 i2
  bit-rsh i1 i2
    

Chez

General integers:

  integer-length i
  ash i count
    

Fixnums:

  fxlogand i j
  fxlogor i j
  fxlogxor i j
  fxlognot i
  fxsll i count         ("shift left logical")
  fxsrl i count         ("shift right logical")
  fxsla i count         ("shift left arithmetic")
  fxsra i count         ("shift right arithmetic")
    

Scheme 48

  bitwise-not i
  bitwise-and i1 ...
  bitwise-ior i1 ...
  bitwise-xor i1 ...
  arithmetic-shift i count
    

MIT Scheme

Fixnums:

  fix:not i
  fix:and i j
  fix:andc i j
  fix:or i j
  fix:xor i j
  fix:lsh i count
    

MIT Scheme also has a distinct datatype, the "bit vector." A bit vector is very different, in that it its elements are mutable -- it is a mutable vector of bits.

Constants are written with a sharp-star prefix, e.g. #*11111.

  (make-bit-string k init)
  (bit-string-allocate k)
  (bit-string-copy bs)
  (bit-string? object)
  (bit-string-length bs)
  (bit-string-ref bs k) -> boolean
  (bit-string-set! bs k)
  (bit-string-clear! bs k)
  (bit-substring-find-next-set-bit bs start end)
  (bit-string-append bs1 bs2)
  (bit-substring bs start end)
  (bit-string-zero? bs)
  (bit-string=? bs1 bs2)
  (bit-string-not bs)
  (bit-string-movec! target-bs source-bs) ; Destructive NOT operation

  (bit-string-and  bs1 bs2)     (bit-string-and!  target-bs1 bs2)
  (bit-string-andc bs1 bs2)     (bit-string-andc! target-bs1 bs2)
  (bit-string-or   bs1 bs2)     (bit-string-or!   target-bs1 bs2)
  (bit-string-xor  bs1 bs2)     (bit-string-xor!  target-bs1 bs2)

  (bit-string-fill! bs init)    ; init is a boolean
  (bit-string-move! target-bs bs)       ; Must be of the same length
  (bit-substring-move-right! source-bs start1 end1 target-bs start2)

  (unsigned-integer->bit-string length integer)
  (signed-integer->bit-string length integer)
  (bit-string->unsigned-integer bit-string)
  (bit-string->signed-integer bit-string)
    

Guile & Kawa

  logand i1 ...
  logior i1 ...
  logxor i1 ...
  lognot i
  logtest i j   (any-bit-set?)
  logbit? index i
  ash i count
  logcount i
  integer-length i
  bit-extract i start end
    

Reference implementation

There are 24 functions in the spec. 15 can be defined in under two lines of code; REPLACE-BIT-FIELD needs three lines; and BITWISE-EQV needs five. This is not an onerous implementation load; I provide the code below. As this is only 31 lines of code, it hardly seems reasonable to bother discussing copyright. To lay the issue to rest, I am the sole author, and I place it in the public domain.

That leaves 7 basic functions that must be primitively defined for each implementation: BITWISE-{NOT,AND,IOR,XOR}, ARITHMETIC-SHIFT, BIT-COUNT, and INTEGER-LENGTH. Slib has implementations of even these functions using R4RS arithmetic, so a simple-minded implementation again doesn't need to do much to support them -- however, slib's general implementations are terribly inefficient relative to native support and should not be used except in case of dire emergency. (It's quite clever code, nonetheless, to provide the semantics with such little support.)

A good implementation might choose to provide direct compiler/interpreter support for these derived functions, or might simply define them to be integrable -- i.e., inline-expanded.

The n-ary BITWISE-EQV function should also receive primitive compiler/interpreter support so that the expensive n-ary mechanism is not invoked in the standard cases -- that is, an application of BITWISE-EQV should be rewritten into an equivalent tree applying some two-argument primitive to the arguments, in the same manner that statically-known n-ary applications of associative operations such as + and * are handled efficiently:

  (bitwise-eqv)         => -1
  (bitwise-eqv i)       => i
  (bitwise-eqv i j)     => (%bitwise-eqv i j)
  (bitwise-eqv i j k)   => (%bitwise-eqv (%bitwise-eqv i j) k)
  (bitwise-eqv i j k l) => (%bitwise-eqv (%bitwise-eqv (%bitwise-eqv i j) k) l)

  ;;; The seven non-trivial boolean functions in terms
  ;;; of not, and, or & xor.

  (define (bitwise-nand  i j)  (bitwise-not (bitwise-and i j)))
  (define (bitwise-nor   i j)  (bitwise-not (bitwise-ior i j)))
  (define (bitwise-andc1 i j)  (bitwise-and (bitwise-not i) j))
  (define (bitwise-andc2 i j)  (bitwise-and i (bitwise-not j)))
  (define (bitwise-orc1  i j)  (bitwise-ior (bitwise-not i) j))
  (define (bitwise-orc2  i j)  (bitwise-ior i (bitwise-not j)))

  (define (bitwise-eqv . args)
  (let lp ((args args) (ans -1))
  (if (pair? args)
  (lp (cdr args) (bitwise-not (bitwise-xor ans (car args))))
  ans)))

  ;;; Helper function -- make a mask of SIZE 1-bits, e.g. (%MASK 3) = #b111.
  ;;; Suppose your Scheme's fixnums are N bits wide (counting the sign bit,
  ;;; not counting any tag bits). This version, due to Marc Feeley, will
  ;;; handle SIZE in the range [0,N-1] without overflowing to bignums.
  ;;; (For SIZE >= N, the correct bignum value is also produced.)

  (define (%mask size) (bitwise-not (arithmetic-shift -1 size)))

  ;;; This alternate, mathematically-equivalent expression
  ;;;     (- (arithmetic-shift 1 size) 1)
  ;;; is not as good -- it only handles SIZE in the range [0,N-2] without
  ;;; overflowing to bignums.
  ;;;
  ;;; Finally, note that even Feeley's expression can't build an N-bit mask
  ;;; without bignum help. This is fundamental, since the interpretation
  ;;; of fixed-size fixnum bit patterns as semi-infinite-bit-strings is that
  ;;; you replicate the high bit out to infinity. So you have to have a
  ;;; zero "stop bit" appearing after that highest one bit to turn off the
  ;;; replication of the ones.

  (define (bit-set? index n)
  (not (zero? (bitwise-and (arithmetic-shift 1 index) n))))

  (define (any-bits-set? test-bits n) (not (zero? (bitwise-and test-bits n))))

  (define (all-bits-set? test-bits n) (= test-bits (bitwise-and test-bits n)))

  (define (bitwise-merge mask n0 n1)
  (bitwise-ior (bitwise-and mask n1)
  (bitwise-and (bitwise-not mask) n0)))

  ;;; Bit-field ops

  (define (extract-bit-field size position n)
  (bitwise-and (%mask size) (arithmetic-shift n (- position))))

  (define (test-bit-field? size position n)
  (not (zero? (bitwise-and (arithmetic-shift n (- position)) (%mask size)))))

  ;; Integrating i-b-f reduces nicely.
  (define (clear-bit-field size position n)
  (replace-bit-field size position 0 n))

  ;;; Oops -- intermediate ARITHMETIC-SHIFT can fixnum-overflow on fixnum args.
  ;(define (replace-bit-field size position newfield n)
  ;  (copy-bit-field size position (arithmetic-shift newfield position) n))

  ;;; This three-line version won't fixnum-overflow on fixnum args.
  (define (replace-bit-field size position newfield n)
  (let ((m (%mask size)))
  (bitwise-ior (bitwise-and n (bitwise-not (arithmetic-shift m position)))
  (arithmetic-shift (bitwise-and newfield m) position))))

  (define (copy-bit-field size position from to)
  (bitwise-merge (arithmetic-shift (%mask size) position) to from))

  ;; Simple definition
  ;(define (first-set-bit i)
  ;  (and (not (zero? i))
  ;       (let lp ((j 0) (i start))
  ;         (if (bit-set? i 0) j
  ;             (lp (+ j 1) (arithmetic-shift i 1))))))

  ;;; Clever definition, assuming you have a fast BIT-COUNT.
  (define (first-set-bit i) (- (bit-count (bitwise-xor i (- i 1))) 1))
    

Topics to be resolved during discussion phase

I particularly solicit comments about the following topics.

SIZE/POSITION vs. FROM/TO field specs

Several functions in this library

  extract-bit-field size position i -> integer
  test-bit-field?   size position i -> boolean
  clear-bit-field   size position i -> integer
  replace-bit-field size position new-field i -> integer
  copy-bit-field    size position from to     -> integer
    

specify a contiguous "field" of bits in a bitstring. There are two conventions we might use to do so:

FROM/TO specs are conventionally and most usefully "half-open" specs, meaning "all i such that FROM <= i and i < TO" -- the FROM index is included and the TO index is excluded.

I have chosen to use SIZE/POSITION instead of FROM/TO for this library. Doing so eliminates any possibility of fencepost errors on the TO endpoint. It is also the convention chosen by Common Lisp.

It is not, however, a widely-used convention within Scheme. Most ranges in Scheme are specified with half-open intervals of the [from,to) form (e.g., (substring s from to)). One might argue that SIZE/POSITION is still the right thing for bit fields, as they are, in practice, frequently of fixed size, unlike element ranges in strings or vectors.

Order of arguments for non-bitstring parameters

The "bitwise boolean" functions such as BITWISE-AND only take bitstring parameters. But the following 10 functions are different in that they take other kinds of parameters (masks, indices, field sizes) that indicate the exact operation to perform on the bitstring parameter(s):

  arithmetic-shift i count -> integer
  bitwise-merge mask i0 i1 -> integer
  bit-set? index i -> boolean
  any-bits-set? test-bits i -> boolean
  all-bits-set? test-bits i -> boolean
  extract-bit-field size position i -> integer
  test-bit-field?   size position i -> boolean
  clear-bit-field   size position i -> integer
  replace-bit-field size position new-field i -> integer
  copy-bit-field    size position from to     -> integer
    

Note that in all of these functions, with the sole exception of ARITHMETIC-SHIFT, the bitstring parameter comes last. This is consistent with an "operation currying" convention, wherein the arguments that determine the operation come first, and the actual value upon which we operate comes last. MAP and FOLD, for example, work this way, too. (The "op currying" convention is actually useful in SML; in Scheme, its utility is almost entirely as a mnemonic convention to aid programmers in remembering argument order.)

ARITHMETIC-SHIFT is entrenched by long and consistent tradition in the indicated parameter order; it would be a mistake to alter this. Every implementation of Scheme I have checked that offers a bit-shift operation on integers (PLT Scheme, slib, Bigloo, Scheme 48, and MIT Scheme), as well as Common Lisp, uses the "i count" argument order.

As an alternative to the "op currying" order, we could use the "data-structure accessor" convention, wherein the data-structure being accessed (the bitstring) comes first, and the "selector" arguments come after. For example, this is the convention used for the functions VECTOR-REF and STRING-REF. One could make the argument that this convention could be reasonably applied to some of these operators, such as BIT-SET?

I recommend leaving things as they are, for maximal consistency with a simple rule. This also provides consistency with Common Lisp, whose bitwise functions uniformly use the ops-curry convention (see the "related designs" summary below).

The name "bit-set?"

BIT-SET? uses the term "bit set," but that sounds like "Is this a set of bits?" as well as the intended "is the bit set?" On the other hand, a set of bits is a not-very-useful notion (there are, after all, only four such sets), so I haven't pre-empted anything we'd ever really want for some other purpose, such as a term like "bit vector" or something...

References & Links

This document, in HTML: http://srfi.schemers.org/srfi-33/srfi-33.html [This link may not be valid while the SRFI is in draft form.]

This document, in simple text format: http://srfi.schemers.org/srfi-33/srfi-33.txt

Archive of SRFI-33 discussion-list email: https://srfi-email.schemers.org/srfi-33

SRFI web site:http://srfi.schemers.org/

  [CommonLisp]
  Common Lisp: the Language
  Guy L. Steele Jr. (editor).
  Digital Press, Maynard, Mass., second edition 1990.
  Available at http://www.elwood.com/alu/table/references.htm#cltl2

  The Common Lisp "HyperSpec," produced by Kent Pitman, is essentially
  the ANSI spec for Common Lisp:
  http://www.lispworks.com/documentation/HyperSpec/Front/index.htm

  [R5RS]
  Revised^5 Report on the Algorithmic Language Scheme,
  R. Kelsey, W. Clinger, J. Rees (editors).
  Higher-Order and Symbolic Computation, Vol. 11, No. 1, September, 1998.
  and ACM SIGPLAN Notices, Vol. 33, No. 9, October, 1998.

  Available at http://www.schemers.org/Documents/Standards/
    

Copyright

This document is copyright © Olin Shivers (1998, 1999). All Rights Reserved.

However, the program source found in section "Reference Implementation" is by Olin Shivers and explicitly placed in the public domain.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.