[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: your implementation of L'Ecuyer's MRG32k3a generator

This page is part of the web mail archives of SRFI 27 from before July 7th, 2015. The new archives for SRFI 27 contain all messages, not just those from before July 7th, 2015.



> * Why do you use an f64vector for the seed? Is this more efficient than
> creating six different variables? (for example because flonums are
> allocated rather than immediate)

Yes.  Gambit-C boxes flonums except in basic blocks.  So accessing an element
of an f64vector takes as long as unboxing a flonum, but storing a flonum
into an existing f64vector is faster than boxing it, and eliminates the need for
allocating the storage for the boxed flonum.

> * Why does the fixnum/flonum implementation need flonums at all?
> Wouldn't it be more efficient to use 54-bit fixnums throughout? (Which
> should be there in a Scheme running on a DEC Alpha)

That may work fine; I wanted to have an implementation that worked well
both on sparcs and alphas, so the seed is of the same form on both.
Flonum arithmetic is often faster than fixnum arithmetic on machines, too,
especially for multiplication and division.  But L'Ecuyer has some timings
in his paper that shows that an integer implementation works well on the
alpha.

> * What does (declare (flonum)) actually mean here?

It's a Gambit-C extension that says that within that lexical scope, all
arithmetic operations will be on flonum values.  This is faster than using
the generic arithmetic operations.

> * Why have you chosen to write a nested let expression instead of a let*?
> Is this a special style recognized by the compiler to get better code?

I don't think so.  I just saw that several operations could be done in
parallel at each step, and wanted to write the code to allow this if
possible.

> * I have installed GAMBIT here at a HP system. The definition 
> flonum-m-bits-plus-1 in the 'header.scm' file shows that the actual
> mantissa width of FLONUMs are just 53 bits; which is exactly one bit short
> of what is needed for MRG32k3a theoretically. Do you have the same
> problem on your platform? If not, what platform (HW/SW) are you using?

Gambit-C uses the system's underlying floating-point arithmetic.  I believe
that L'Ecuyer's method requires the representation of all integer between
2^{53} and -2^{53} (exclusive) exactly, and this is possible with IEEE
double precision arithmetic.  I don't think there's a problem.

One thing to take into account is that this RNG returns values, before
normalization, between 1 and 4294967087, i.e., not the range
0..2^{32}-1=0..4294967295.  This may require some care in using these
RNGs.

Brad