
SRFI 56 Binary I/O

This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.


	In higher-level code I'll want to stick character encoding
information and endianness into the port objects, but you have
correctly identified the real primitives of I/O.


	are what we need, because at this moment in history there
are several competing "standard" ways to write characters (and
Unicode is multiplicitous, which means that reading/writing "a
character" can never have a fundamental, unambiguous meaning
in terms of binary I/O ever again).

	I would, in fact, advocate that any and all definitions
of read-char, write-char, etc, be defined in terms of these
operations rather than the other way 'round, so that they can
be redefined for different character environments by loading
different libraries.
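
	To make that layering concrete, here is a minimal sketch in
Python rather than Scheme.  The names make_read_char and byte_reader
are my own illustrations, not part of the SRFI; the only assumption
is a byte-level read primitive that returns an integer, or None at
end of input.

```python
import codecs
import io


def make_read_char(read_byte, encoding="utf-8"):
    """Build a read_char procedure on top of a byte-level primitive.

    read_byte is assumed to return an int in 0..255, or None at end
    of input.  Swapping `encoding` changes the character environment
    without touching the binary layer underneath.
    """
    decoder = codecs.getincrementaldecoder(encoding)()

    def read_char():
        while True:
            b = read_byte()
            if b is None:               # end of input
                return None
            ch = decoder.decode(bytes([b]))
            if ch:                      # a full character has been assembled
                return ch

    return read_char


def byte_reader(data):
    """Hypothetical byte-level primitive backed by an in-memory buffer."""
    stream = io.BytesIO(data)

    def read_byte():
        raw = stream.read(1)
        return raw[0] if raw else None

    return read_byte


# The same byte primitive serves two different character environments.
read_utf8 = make_read_char(byte_reader("a\u00e9".encode("utf-8")), "utf-8")
read_latin1 = make_read_char(byte_reader(b"\xe9"), "latin-1")
```

	Loading a different "library" here amounts to passing a
different encoding name; the binary layer does not change.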

	One issue: how much of a standard is BER?  12.5 percent
protocol lossage (one continuation bit in every 8) seems like a
lot to me.  I'd rather use 1 bit out of 16 than 1 bit out of 8 to
carry the "continuing" information.  Here is the number of
bits needed to encode various "actual" integer lengths:

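length      8-bit BER     16-bit modified BER
(bits)      (bits)        (bits)
32-36       40*              48 >
37-41       48               48 =
42-48       56*              64 >
49-55       64               64 =
56-59       72              *64 <
60-63       72*              80 >
64-69       80               80 =
70-74       88              *80 <
75-76       88*              96 >
77-83       96               96 =
84-90      104              *96 <
91-97      112              112 =

(* marks the smaller encoding; the trailing >, = or < compares
the 16-bit size with the 8-bit size.)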

	As you can see, for small values 8-bit BER is more
efficient, but the difference between 8-bit and 16-bit BER breaks
even right around 64 bits, and after we hit 77 bits in real
length, 8-bit BER is never more efficient again.
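
	The sizes can also be computed rather than tabulated by hand.
The sketch below assumes exactly one continuation bit per unit and
no separate sign bit, so each octet carries 7 payload bits and each
16-bit unit carries 15.  The function names are illustrative only.

```python
def encoded_bits(length, unit_bits):
    """Bits needed to store a `length`-bit integer in units of
    `unit_bits` bits, one bit of each unit being a continuation flag."""
    payload = unit_bits - 1
    units = max(1, -(-length // payload))   # ceiling division, at least one unit
    return units * unit_bits


def octet_wins(length):
    """True when the 8-bit format is strictly smaller than the 16-bit one."""
    return encoded_bits(length, 8) < encoded_bits(length, 16)


def compare(lo, hi):
    """Rows of (length, 8-bit size, 16-bit size) for lengths lo..hi inclusive."""
    return [(n, encoded_bits(n, 8), encoded_bits(n, 16))
            for n in range(lo, hi + 1)]


if __name__ == "__main__":
    for n, small, wide in compare(32, 97):
        mark = "<" if wide < small else ">" if wide > small else "="
        print(f"{n:3d}  {small:4d}  {wide:4d} {mark}")
```

	Under these assumptions the overall pattern matches the table
above, though individual range boundaries can shift by a bit or
so depending on details such as whether a sign bit is reserved.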

	Since hardware is increasingly supporting reads and
writes of 16 bits as faster than reads and writes of 8 bits,
and since numeric formats up to around 64 bits are often
supported by special purpose instructions, and since in the
ranges where we're forced to a BER type representation we'll
probably use fewer bits with a 16-bit BER, I think we should
prefer a 16-bit unit with a continuation bit rather than an
8-bit unit with a continuation bit.
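
	For concreteness, here is one way such a format could look.
This is a sketch under my own assumptions, not a specification:
non-negative integers only, most significant group first, big-endian
units, and the high bit of each unit set when more units follow.
Setting unit_bits to 8 gives the octet-based variant for comparison.

```python
def encode(value, unit_bits=16):
    """Encode a non-negative integer as a sequence of fixed-size units.

    Each unit carries unit_bits - 1 payload bits; its high bit is a
    continuation flag that is set on every unit except the last.
    """
    if value < 0:
        raise ValueError("only non-negative integers are handled here")
    payload = unit_bits - 1
    mask = (1 << payload) - 1
    groups = [value & mask]
    value >>= payload
    while value:
        groups.append(value & mask)
        value >>= payload
    groups.reverse()                        # most significant group first
    flag = 1 << payload
    out = bytearray()
    for i, group in enumerate(groups):
        more = i < len(groups) - 1
        unit = group | (flag if more else 0)
        out += unit.to_bytes(unit_bits // 8, "big")
    return bytes(out)


def decode(data, unit_bits=16):
    """Decode one integer; return (value, number of bytes consumed)."""
    payload = unit_bits - 1
    flag = 1 << payload
    size = unit_bits // 8
    value = 0
    for pos in range(0, len(data), size):
        unit = int.from_bytes(data[pos:pos + size], "big")
        value = (value << payload) | (unit & (flag - 1))
        if not unit & flag:                 # flag clear: this was the last unit
            return value, pos + size
    raise ValueError("input ended before the final unit")
```

	A full 64-bit value takes 10 bytes either way in this sketch
(five 16-bit units or ten octets), which is the break-even point
mentioned above.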

	But the difference is awfully small in importance.
If there are existing tools out there that support the 8-bit
BER format, I'd say go with it.