Re: FP Hardware

This page is part of the web mail archives of SRFI 70 from before July 7th, 2015. The new archives for SRFI 70 contain all messages, not just those from before July 7th, 2015.

 | Date: Thu, 19 May 2005 07:45:15 -0700 (PDT)
 | From: Noel Welsh <noelwelsh@xxxxxxxxx>
 | Aubrey wrote:
 | > Reducing the precision of scalar operands nets no speed 
 | > increase from floating-point hardware.
 | This is not the case in general.  The action in floating point
 | calculations is currently in the vector units (SSE2, AltiVec) found
 | in modern processors.  Indeed the Pentium 4's scalar floating point
 | unit is dismal, often making the vector unit the preferred path for
 | even scalar floating point code!  Vector units generally have
 | fixed-size registers (e.g. 128 bits), meaning you can either achieve
 | a 4x speedup on single floats (32 bits) or 2x on doubles (64 bits).
 | What I've read of the Cell processor suggests it may only have
 | vectorised FP units.

The crucial word in my claim is *scalar*.  The big speedups for vector
calculations come from pipelined and parallel execution.  When the
vector units are data-starved, latency is what throttles speed, and
operand size has only a minor effect on that latency.

 | Hence allowing the user to specify precision seems like a good
 | move.

The homogeneous arrays of SRFI-63 do that.
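
As a rough illustration of choosing precision per array rather than per
scalar, here is a minimal sketch.  It uses Python's standard `array`
module as a stand-in and is not the SRFI-63 interface; the names
`REGISTER_BITS` and `lanes` are hypothetical, and the 128-bit width is
simply the example figure quoted above.

```python
from array import array

# Assumed register width, taken from the "128 bits" example quoted above.
REGISTER_BITS = 128


def lanes(arr):
    """Return how many elements of arr's type fit in one vector register."""
    return REGISTER_BITS // (arr.itemsize * 8)


# Homogeneous arrays: every element has the precision chosen at creation.
single = array('f', [0.1] * 8)   # single-precision storage
double = array('d', [0.1] * 8)   # double-precision storage

print(single.itemsize, lanes(single))
print(double.itemsize, lanes(double))

# The narrower type pays for its density with rounding on store.
print(single[0] == 0.1, double[0] == 0.1)
```

The element type is fixed when the array is created, so the precision
decision is made once for the whole array, which is where a vector unit
could exploit it.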