Re: SRFI-77 with more than one flonum representation
This is my second reply to John Cowan. In the first, I argue that my
need to use single-precision floating point numbers is based on a desire
for correctness, not efficiency. Here, I address efficiency.
John Cowan wrote:
I think the burden of persuasion now lies on you (or someone else in
your position) to show that:
1) there are still significant architectures in which different kinds
of floating-point numbers represent a significant tradeoff (as was
historically the case, single-float being faster but less precise and
with a smaller range), such that it does not make sense to privilege
one over the other; and that
2) this feature warrants support, even if halfhearted, from the Scheme
standard rather than being left as implementation-dependent.
I believe this will be a difficult burden to meet.
I present two examples. The first is a flight of fantasy, but an
interesting one. The second is real, and something that could be
1. Vectorizable code on an x86 or x86-64 with SSE and SSE2 can run twice
as fast in single precision as double precision. To a large degree this
is true regardless of the size of the vectors (i.e., it is true even for
small vectors that are not limited by memory bandwidth).
I know of no vectorizing Scheme compiler, so it is difficult to argue
that the Scheme standard should be bent to support such a hypothetical
compiler. On the other hand, I do not think that the Scheme standard
should be written in such a way that it is difficult to write an
efficient vectorizing compiler that behaves naturally and can handle
both single- and double-precision representations. So, sure, the
standard should not require implementations to have more than one
floating point representation, but it should not rule out that
possibility either (i.e., please keep s, d, and l exponents).
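The factor of two in example 1 follows from the register width: an SSE/SSE2 register is 128 bits wide, so one packed instruction handles four singles but only two doubles. A minimal sketch of that arithmetic in C, assuming 4-byte floats and 8-byte doubles; the function names are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* An SSE/SSE2 XMM register is 128 bits (16 bytes) wide. */
enum { XMM_BYTES = 16 };

/* Number of elements of the given size that fit in one register.
   Hypothetical helper: 4 for a 4-byte float, 2 for an 8-byte double. */
size_t lanes_per_register(size_t element_size) {
    assert(element_size > 0 && element_size <= XMM_BYTES);
    return XMM_BYTES / element_size;
}

/* Number of packed operations needed to process n elements,
   rounding up when n is not a multiple of the lane count. */
size_t packed_ops(size_t n, size_t element_size) {
    size_t lanes = lanes_per_register(element_size);
    return (n + lanes - 1) / lanes;
}
```

Under these assumptions, packed_ops(1024, sizeof(float)) is half of packed_ops(1024, sizeof(double)), and the ratio does not depend on memory bandwidth.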
2. An implementation on a 64-bit machine can probably represent
single-precision numbers as unboxed types but would probably have to box
double-precision numbers. This may well make single-precision arithmetic in
Scheme faster than double-precision arithmetic in Scheme, even if both
are equally fast at the hardware level. Of course, more sophisticated
compilers may be able to unbox both types in some circumstances.
I know of no implementation that has unboxed singles, but 64-bit
machines are here and now and this is an obvious optimization. To take
full advantage of this, we would need a version of SRFI 77 that is
specific to s-exponent numbers. Now, let us consider a standard that
mandated a version of SRFI 77 that worked with s-exponent numbers and
another that worked with e-exponent numbers. If the Scheme implementation
uses only one floating-point representation, these will be identical. If
the implementation has
unboxed single-precision numbers, the first version will be more
efficient than the second, albeit at some cost in precision.
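To make the unboxed representation in example 2 concrete, here is a rough sketch in C of how a 64-bit tagged word might carry a single-precision value directly. The layout (tag in the low 8 bits, float bits in the high 32) and all names are assumptions for illustration, not taken from any existing implementation. A double needs all 64 bits for itself, so under this simple scheme it has no room for a tag and would have to live in a heap-allocated box.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tagged machine word for a Scheme value. */
typedef uint64_t value_t;

/* Hypothetical tag layout: the low 8 bits identify the type. */
enum { TAG_MASK = 0xFF, TAG_SINGLE = 0x1F };

/* Store the 32 float bits in the high half of the word; no allocation. */
value_t make_single(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return ((value_t)bits << 32) | TAG_SINGLE;
}

/* Type test: compare the low bits against the single-float tag. */
int is_single(value_t v) {
    return (v & TAG_MASK) == TAG_SINGLE;
}

/* Recover the float from the high half of the word. */
float single_value(value_t v) {
    uint32_t bits;
    float f;
    assert(is_single(v));
    bits = (uint32_t)(v >> 32);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Addition touches only registers: unpack, add, repack. */
value_t single_add(value_t a, value_t b) {
    return make_single(single_value(a) + single_value(b));
}
```

The point of the sketch is that single_add never allocates, whereas the corresponding operation on boxed doubles would have to allocate a box for the result unless the compiler can prove the box is unnecessary.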
Is this second example at all convincing?
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Astronómico Nacional de México