Re: arithmetic issues
Aubrey Jaffer wrote:
| > Flonums often are the most difficult feature to port to new
| > architectures.
| Why do you say that?
> From the experience of porting SCM to dozens of C compilers.
Okay, when you said "architecture" I thought you referred to CPU
instructions and data formats. Yes, compilers are a pain and there are
frequent bugs in the standard library.
| That is, I would mandate only unlimited size integers in the core.
| The rest of the tower should be moved to the library.
> Moving 1/3 or more of an implementation to a library isn't always
> practical. libm may not be dynamically loadable; in which case the
> executable must carry around the math libraries, even when they are
> not used. One can end up having two copies of many subroutines and
> some subsystems like garbage collection.
But if libm is not dynamically loadable, an implementation that wants to
offer flonums must carry it around anyway, regardless of whether flonums
are part of the core or the library.
I would distinguish "moving the rest of the tower to a library in the
*language* *definition*" and "moving the rest of the tower to a library
in an *implementation*".
That is, moving all but exact integers out of the core of the language
definition simplifies the language definition and keeps it grounded in
things that are generally agreed to be correct.
However, that does not prohibit implementors from including important
aspects of other numbers in the core of their implementation. For
example, the core of their implementation could have representation and
garbage collection for flonums, ratnums, and whatever else. You would
probably end up duplicating some arithmetic routines, but that's about it.
> [In SCM] The arithmetic subrs test first for INUMs, then bignums, then flonums.
> The type dispatch for bignums and flonums is very similar. It would
> be good to find what causes the difference.
This is my point. The branches for inums and bignums are probably
predicted as taken. Thus, when you use these generic operators on
flonums, you incur two mispredicted branches. flonum-specific operators
would save those.
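
To make the dispatch-order point concrete, here is a minimal sketch in C.
The tagged-value layout and the names (num_tag, value, generic_add,
flonum_add) are my own assumptions for illustration and are not SCM's actual
representation. The bignum payload is a stand-in, and overflow checks and
mixed-type arithmetic are omitted.

```c
#include <assert.h>

/* Hypothetical tagged value; SCM's real representation differs. */
typedef enum { TAG_INUM, TAG_BIGNUM, TAG_FLONUM } num_tag;

typedef struct {
    num_tag tag;
    union {
        long inum;      /* small integer */
        long long big;  /* stand-in for a real bignum payload */
        double flo;     /* flonum */
    } u;
} value;

value make_inum(long n)     { value v; v.tag = TAG_INUM;   v.u.inum = n; return v; }
value make_flonum(double d) { value v; v.tag = TAG_FLONUM; v.u.flo = d;  return v; }

/* Generic addition: tests for inums, then bignums, then flonums. */
value generic_add(value a, value b)
{
    if (a.tag == TAG_INUM && b.tag == TAG_INUM)
        return make_inum(a.u.inum + b.u.inum);   /* overflow check omitted */
    if (a.tag == TAG_BIGNUM && b.tag == TAG_BIGNUM) {
        value r;
        r.tag = TAG_BIGNUM;
        r.u.big = a.u.big + b.u.big;             /* placeholder arithmetic */
        return r;
    }
    /* Flonum operands get here only after failing the two tests above. */
    assert(a.tag == TAG_FLONUM && b.tag == TAG_FLONUM);
    return make_flonum(a.u.flo + b.u.flo);
}

/* Flonum-specific addition: no dispatch chain in release builds. */
value flonum_add(value a, value b)
{
    assert(a.tag == TAG_FLONUM && b.tag == TAG_FLONUM);
    return make_flonum(a.u.flo + b.u.flo);
}
```

Both functions return the same result for flonum operands; the only
difference is how many tag tests sit in front of the floating-point add.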
> I tested SCM and SCMLIT (fixnums only), both compiled with gcc -O3,
> computing 2000 digits of pi 4 digits at a time on a Pentium 4 3.00GHz.
> The benchmark uses only small integers.
> SCM took 5330.ms, while SCMLIT took 3330.ms, a substantial savings.
However, your results suggest to me that perhaps some of the branches
are not predicted as they should be. It might be worth using the
"__builtin_expect" feature of GCC to hint to the compiler that numbers
are expected to be inums.
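
As a rough sketch of what I mean (the low-bit tagging scheme and the names
obj, IS_INUM, MAKE_INUM, INUM_VAL, slow_add, and add_numbers are
hypothetical, not taken from SCM), the hint could be wrapped in macros so
that it compiles away on compilers without "__builtin_expect". The hint
only influences how the compiler lays out the two paths; whether it helps
in practice would have to be measured.

```c
#include <assert.h>
#include <stdint.h>

/* GCC and Clang provide __builtin_expect; fall back to a no-op elsewhere. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Hypothetical encoding: low bit 1 marks an inum, payload in the upper bits.
   INUM_VAL assumes an arithmetic right shift for negative values, which GCC
   provides but the C standard does not guarantee. */
typedef intptr_t obj;
#define IS_INUM(x)   (((x) & 1) != 0)
#define MAKE_INUM(n) ((obj)(((uintptr_t)(n) << 1) | 1))
#define INUM_VAL(x)  ((x) >> 1)

/* Placeholder slow path for bignums and flonums; not implemented here. */
obj slow_add(obj a, obj b)
{
    assert(!(IS_INUM(a) && IS_INUM(b)));
    return 0;   /* a real implementation would dispatch further */
}

/* Generic add with the inum case marked as the expected one. */
obj add_numbers(obj a, obj b)
{
    if (LIKELY(IS_INUM(a) && IS_INUM(b)))
        return MAKE_INUM(INUM_VAL(a) + INUM_VAL(b));  /* overflow check omitted */
    return slow_add(a, b);
}
```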
Of course, type-specific operators are "just" a performance hack to get
around a lack of type analysis in many implementations. Hats off to Stalin.
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Nacional Autónoma de México