
endianness

This page is part of the web mail archives of SRFI 74 from before July 7th, 2015. The new archives for SRFI 74 contain all messages, not just those from before July 7th, 2015.




SRFI-74 reads:
> (endianness big) (syntax)
> (endianness little) (syntax)
> (endianness native) (syntax)
>         These return three distinct and unique objects representing an endianness.
>        The native endianness denotes the endianness of the underlying machine architecture.

I have four points concerning endianness:

1. Please explain *exactly* what you mean by "little endian". Unfortunately I have seen
contradictory definitions in the past (though most people agree on one), and I also keep
forgetting which one it is.

2. The external representation of an endianness option is undefined. Are you sure
you want that?

3. Why is the 'native' endianness not just either (endianness big) or (endianness little)
but *another* option value? How do I find out about the native endianness? Do I have
to write a program stuffing a blob with data in 'native' and reading it out in 'little', or can
I just ask (eq? (endianness native) (endianness little))?

4. While little/big endian is the most important distinction, the endianness issue is
more complicated: There are 4! = 24 ways to store 4 bytes in a 32-bit word and a good
deal of these permutations are actually found in silicon. Luckily, for 64-bit architectures
I haven't come across any 'strange' permutations yet, i.e. anything that is not just a
repetition of a 32-bit endianness.
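
The probe described in point 3 is short to write. Here is a minimal sketch in Python rather than Scheme; the helper name probe_native_endianness is made up for illustration, and it only distinguishes plain little and big endian. It packs a known 32-bit value in native byte order and compares the resulting bytes with the two explicit orders. "Little endian" here means that the least significant byte comes first in the blob.

```python
import struct

def probe_native_endianness():
    """Return 'little', 'big', or 'other' by inspecting a natively packed integer."""
    value = 0x0A0B0C0D
    # "=" selects the machine's native byte order (with standard sizes).
    native = struct.pack("=I", value)
    if native == struct.pack("<I", value):   # least significant byte first
        return "little"
    if native == struct.pack(">I", value):   # most significant byte first
        return "big"
    return "other"
```

A direct comparison in the style of (eq? (endianness native) (endianness little)) would make such a probe unnecessary, which is the point of the question.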

My proposal is to be more flexible by specifying the permutation explicitly:

(endianness k0 k1 k2 k3)
        where (k0, k1, k2, k3) is a permutation of {0..3}, specifies the order of the bytes in a 32-bit word.
        The blob of bytes (x[0] .. x[3]) represents the integer x[k3]*2^24+x[k2]*2^16+x[k1]*2^8+x[k0].

        The permutation (endianness 0 1 2 3) is often called "little endian,"
        (endianness 3 2 1 0) is often called "big endian" on 32-bit architectures,
        and (endianness 1 0 3 2) is often called "big endian" on 16-bit architectures.
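
To make the formula concrete, here is a small sketch in Python (not Scheme, and not part of SRFI 74). The function name decode32 is hypothetical. It evaluates x[k3]*2^24+x[k2]*2^16+x[k1]*2^8+x[k0] for a 4-byte blob.

```python
def decode32(k, x):
    """Interpret the 4 bytes x under the permutation k = (k0, k1, k2, k3)."""
    if sorted(k) != [0, 1, 2, 3]:
        raise ValueError("k must be a permutation of 0..3")
    if len(x) != 4:
        raise ValueError("x must contain exactly 4 bytes")
    # Byte x[k[j]] supplies bits 8*j .. 8*j+7 of the result.
    return sum(x[k[j]] << (8 * j) for j in range(4))

blob = bytes([0x0D, 0x0C, 0x0B, 0x0A])
little = decode32((0, 1, 2, 3), blob)   # same as int.from_bytes(blob, "little")
big = decode32((3, 2, 1, 0), blob)      # same as int.from_bytes(blob, "big")
swapped = decode32((1, 0, 3, 2), blob)  # bytes swapped within each 16-bit half
```

With this blob, the first permutation yields 0x0A0B0C0D, the second 0x0D0C0B0A, and the third 0x0B0A0D0C.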

Even more flexible, but also more complex, is the following idea: You specify a
permutation by permuting the first few integers, i.e. {0..n-1} for n in {2,4,8}, and then
repeat this permutation on any block of n bytes. This comes down to the following:

(endianness k[0] .. k[n-1])

        specifies that a blob of bytes (x[0] x[1] ..) is interpreted as a blob of n-byte
        words by the relation (implicitly padding x with 0):

                y[i] = Sum(x[i*n + k[j]]*256^j : j in {0..n-1}).

This definition allows specifying endianness for 8-, 16-, 32-, 64-bit, and any other
architectures, as long as they are based on bytes (= 8 bits). The three most frequent
cases can (and should) of course be treated specially.
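
The generalized relation can be sketched the same way. Again this is Python rather than Scheme, and decode_words is a made-up name. The blob is padded with zero bytes up to a multiple of n, as the definition implies, and each n-byte block is combined according to k.

```python
def decode_words(k, x):
    """Split the blob x into n-byte words, n = len(k), using the byte permutation k."""
    n = len(k)
    if sorted(k) != list(range(n)):
        raise ValueError("k must be a permutation of 0..n-1")
    # Implicit zero padding up to a whole number of n-byte words.
    padded = bytes(x) + bytes(-len(x) % n)
    return [
        sum(padded[i * n + k[j]] << (8 * j) for j in range(n))
        for i in range(len(padded) // n)
    ]

# 16-bit big-endian words; the odd trailing byte is padded with a zero byte.
words16 = decode_words((1, 0), bytes([0x12, 0x34, 0x56]))
# 32-bit little-endian words.
words32 = decode_words((0, 1, 2, 3), bytes([1, 0, 0, 0, 2, 0, 0, 0]))
```

Here words16 is [0x1234, 0x5600] and words32 is [1, 2]. Note that the position of the padding bytes within the last word depends on k, which a specification would probably want to state explicitly.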