Title

SRFI 144: Flonums

Author

John Cowan, Will Clinger

Status

This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-144@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Post-finalization note: It is recommended that implementors return +nan.0 (if the implementation supports that number) from flonum when the argument is a non-real number, rather than signaling an error. This behavior is permitted but not required by the current specification.

Abstract

This SRFI describes numeric procedures applicable to flonums, a subset of the inexact real numbers provided by a Scheme implementation. In most Schemes, the flonums and the inexact reals are the same. These procedures are semantically equivalent to the corresponding generic procedures, but allow more efficient implementations.

Rationale

Flonum arithmetic is already supported by many systems, mainly to remove type-dispatching overhead. Standardizing flonum arithmetic increases the portability of code that uses it. Standardizing the range or precision of flonums would make flonum operations inefficient on some systems, which would defeat their purpose. Therefore, this SRFI specifies some of the semantics of flonums, but makes the range and precision implementation-dependent. However, this SRFI, unlike C99, does assume that the floating-point radix is 2.

The source of most of the variables and procedures in this SRFI is the C99/Posix <math.h> library, which should be available directly or indirectly to Scheme implementers. (Note: the C90 version of <math.h> lacks asinh, acosh, atanh, erf, and tgamma.)

In addition, some procedures and variables are provided from the R6RS flonum library, the Chicken flonum routines, and the Chicken mathh egg. Lastly, a few procedures are flonum versions of R7RS-small numeric procedures.

The SRFI text is by John Cowan; the portable implementation is by Will Clinger.

Specification

It is required that all flonums have the same range and precision. That is, if 12.0f0 is a 32-bit inexact number and 12.0 is a 64-bit inexact number, they cannot both be flonums. In this situation, it is recommended that the 64-bit numbers be flonums.

When a C99 variable, procedure, macro, or operator is specified for a procedure in this SRFI, the semantics of the Scheme variable or procedure are the same as its C equivalent. The definitions given here of such procedures are informative only; for precise definitions, users and implementers must consult the Posix or C99 standards. This applies particularly to the behavior of these procedures on -0.0, +inf.0, -inf.0, and +nan.0. However, conformance to this SRFI does not require that these numbers exist or are flonums.

When a variable is bound to, or a procedure returns, a mathematical expression, it is understood that the value is the best flonum approximation to the mathematically correct value.

It is an error, except as otherwise noted, for an argument not to be a flonum. If the mathematically correct result is not a real number, the result is +nan.0 if the implementation supports that number, or an arbitrary flonum if not.

Flonum operations must be at least as accurate as their generic counterparts when applied to flonum arguments. In some cases, operations should be more accurate than their naive generic expansions because they have a smaller total roundoff error.

This SRFI uses x, y, z as parameter names for flonum arguments. Exact integer parameters are designated n.

Mathematical Constants

The following (mostly C99) constants are provided as Scheme variables.

fl-e

Bound to the mathematical constant e. (C99 M_E)

fl-1/e

Bound to 1/e. (reciprocal of C99 M_E)

fl-e-2

Bound to e^2.

fl-e-pi/4

Bound to e^(π/4).

fl-log2-e

Bound to log_2 e. (C99 M_LOG2E)

fl-log10-e

Bound to log_10 e. (C99 M_LOG10E)

fl-log-2

Bound to log_e 2. (C99 M_LN2)

fl-1/log-2

Bound to 1/log_e 2. (reciprocal of C99 M_LN2)

fl-log-3

Bound to log_e 3.

fl-log-pi

Bound to log_e π.

fl-log-10

Bound to log_e 10. (C99 M_LN10)

fl-1/log-10

Bound to 1/log_e 10. (reciprocal of C99 M_LN10)

fl-pi

Bound to the mathematical constant π. (C99 M_PI)

fl-1/pi

Bound to 1/π. (C99 M_1_PI)

fl-2pi

Bound to 2π.

fl-pi/2

Bound to π/2. (C99 M_PI_2)

fl-pi/4

Bound to π/4. (C99 M_PI_4)

fl-pi-squared

Bound to π^2.

fl-degree

Bound to π/180, the number of radians in a degree.

fl-2/pi

Bound to 2/π. (C99 M_2_PI)

fl-2/sqrt-pi

Bound to 2/√π. (C99 M_2_SQRTPI)

fl-sqrt-2

Bound to √2. (C99 M_SQRT2)

fl-sqrt-3

Bound to √3.

fl-sqrt-5

Bound to √5.

fl-sqrt-10

Bound to √10.

fl-1/sqrt-2

Bound to 1/√2. (C99 M_SQRT1_2)

fl-cbrt-2

Bound to ∛2.

fl-cbrt-3

Bound to ∛3.

fl-4thrt-2

Bound to ∜2.

fl-phi

Bound to the mathematical constant φ.

fl-log-phi

Bound to log(φ).

fl-1/log-phi

Bound to 1/log(φ).

fl-euler

Bound to the mathematical constant γ (Euler's constant).

fl-e-euler

Bound to e^γ.

fl-sin-1

Bound to sin 1.

fl-cos-1

Bound to cos 1.

fl-gamma-1/2

Bound to Γ(1/2).

fl-gamma-1/3

Bound to Γ(1/3).

fl-gamma-2/3

Bound to Γ(2/3).
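
As a small illustration of how these constants might be used, the following sketch converts between degrees and radians with fl-degree. The helper names degrees->radians and radians->degrees are hypothetical and not part of this SRFI; the sketch assumes an R7RS system that provides the (srfi 144) library.

```scheme
(import (scheme base) (scheme write) (srfi 144))

;; Hypothetical helpers (not defined by SRFI 144): fl-degree is π/180,
;; the number of radians in one degree.
(define (degrees->radians d) (fl* d fl-degree))
(define (radians->degrees r) (fl/ r fl-degree))

(display (degrees->radians 180.0)) (newline)          ; close to fl-pi
(display (flsin (degrees->radians 30.0))) (newline)   ; close to 0.5
(display (radians->degrees fl-pi/2)) (newline)        ; close to 90.0
```

Because each constant is only the best flonum approximation, results such as the ones above should be compared with a tolerance rather than with fl=?.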

Implementation Constants

fl-greatest

fl-least

Bound to the largest/smallest positive finite flonum. (e.g. C99 DBL_MAX and C11 DBL_TRUE_MIN)

fl-epsilon

Bound to the appropriate machine epsilon for the hardware representation of flonums. (C99 DBL_EPSILON in <float.h>)

fl-fast-fl+*

Bound to #t if (fl+* x y z) executes about as fast as, or faster than, (fl+ (fl* x y) z); bound to #f otherwise. (C99 FP_FAST_FMA)

So that the value of this variable can be determined at compile time, R7RS implementations and other implementations that provide a features function should provide the feature fl-fast-fl+* if this variable is true, and not if it is false or the value is unknown at compile time.
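
A minimal sketch of how a program might select code at expansion time using this feature identifier. The procedure name sum-of-products is hypothetical; the sketch assumes an R7RS system whose cond-expand recognizes the fl-fast-fl+* feature as described above.

```scheme
(import (scheme base) (srfi 144))

;; Hypothetical helper: computes a*b + c*d.
(cond-expand
  (fl-fast-fl+*
   ;; The implementation advertises a fast fused multiply-add,
   ;; so use it for the final multiply and add.
   (define (sum-of-products a b c d)
     (fl+* a b (fl* c d))))
  (else
   ;; Otherwise use two separate multiplications and an addition.
   (define (sum-of-products a b c d)
     (fl+ (fl* a b) (fl* c d)))))
```

The two variants may differ in the last bit of the result, since the fused form rounds once fewer than the unfused form.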

fl-integer-exponent-zero

Bound to whatever exact integer is returned by (flinteger-exponent 0.0). (C99 FP_ILOGB0)

fl-integer-exponent-nan

Bound to whatever exact integer is returned by (flinteger-exponent +nan.0). (C99 FP_ILOGBNAN)

Constructors

(flonum number)

If number is an inexact real number and there exists a flonum that is the same (in the sense of =) as number, returns that flonum. If number is a negative zero, an infinity, or a NaN, returns its flonum equivalent. If such a flonum does not exist, returns the nearest flonum, where "nearest" is implementation-dependent. If number is not a real number, it is an error. If number is exact, applies inexact or exact->inexact to number first.

(fladjacent x y)

Returns a flonum adjacent to x in the direction of y. Specifically: if x < y, returns the smallest flonum larger than x; if x > y, returns the largest flonum smaller than x; if x = y, returns x. (C99 nextafter)

(flcopysign x y)

Returns a flonum whose magnitude is the magnitude of x and whose sign is the sign of y. (C99 copysign)

(make-flonum x n)

Returns x × 2^n, where n is an integer with an implementation-dependent range. (C99 ldexp)
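
The constructors above might be exercised as in the following sketch, which assumes an R7RS system providing (srfi 144). The comments describe the values returned, not the exact printed form, which varies between implementations.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flonum 3)) (newline)               ; the exact integer 3 becomes the flonum 3.0
(display (flcopysign 2.5 -0.0)) (newline)    ; magnitude of 2.5, sign of -0.0: -2.5
(display (make-flonum 1.5 4)) (newline)      ; 1.5 × 2^4 = 24.0

;; fladjacent steps to the neighbouring flonum in the direction of y.
(define up   (fladjacent 1.0 2.0))           ; smallest flonum larger than 1.0
(define down (fladjacent 1.0 0.0))           ; largest flonum smaller than 1.0
(display (fl<? down 1.0 up)) (newline)       ; #t
(display (fl=? (fladjacent 1.0 1.0) 1.0)) (newline) ; #t, because x = y returns x
```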

Accessors

(flinteger-fraction x)

Returns two values, the integral part of x as a flonum and the fractional part of x as a flonum. (C99 modf)

(flexponent x)

Returns the exponent of x. (C99 logb)

(flinteger-exponent x)

Returns the same as flexponent truncated to an exact integer. If x is zero, returns fl-integer-exponent-zero; if x is a NaN, returns fl-integer-exponent-nan; if x is infinite, returns a large implementation-dependent exact integer. (C99 ilogb)

(flnormalized-fraction-exponent x)

Returns two values, a correctly signed fraction y whose absolute value is between 0.5 (inclusive) and 1.0 (exclusive), and an exact integer exponent n such that x = y × 2^n. (C99 frexp)

(flsign-bit x)

Returns 0 for positive flonums and 1 for negative flonums and -0.0. The value of (flsign-bit +nan.0) is implementation-dependent, reflecting the sign bit of the underlying representation of NaNs. (C99 signbit)
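
The following sketch shows how the multiple-value accessors might be consumed with let-values. It assumes an R7RS system providing (srfi 144); the inputs are chosen to be exactly representable in radix 2.

```scheme
(import (scheme base) (scheme write) (srfi 144))

;; Split 3.75 into its integral and fractional parts (both flonums).
(let-values (((int frac) (flinteger-fraction 3.75)))
  (display (list int frac)) (newline))               ; 3.0 and 0.75

;; Decompose 8.0 as y × 2^n with 0.5 <= |y| < 1.0.
(let-values (((y n) (flnormalized-fraction-exponent 8.0)))
  (display (list y n)) (newline)                     ; 0.5 and 4
  (display (fl=? (make-flonum y n) 8.0)) (newline))  ; #t: make-flonum reassembles x

(display (flexponent 8.0)) (newline)                 ; 3.0, a flonum
(display (flinteger-exponent 8.0)) (newline)         ; 3, an exact integer
(display (flsign-bit -0.0)) (newline)                ; 1
```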

Predicates

(flonum? obj)

Returns #t if obj is a flonum and #f otherwise.

(fl=? x y z ...)

(fl<? x y z ...)

(fl>? x y z ...)

(fl<=? x y z ...)

(fl>=? x y z ...)

These procedures return #t if their arguments are (respectively): equal, monotonically increasing, monotonically decreasing, monotonically nondecreasing, or monotonically nonincreasing; they return #f otherwise. These predicates must be transitive. (C99 ==, <, >, <=, >= operators respectively)

(flunordered? x y)

Returns #t if x and y are unordered according to IEEE rules. This means that at least one of them is a NaN.
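
A brief sketch of the comparison predicates, including their behavior on NaN. It assumes an R7RS system providing (srfi 144) and an implementation in which +nan.0 is a flonum; the variable name nan is only for illustration.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (fl<? 1.0 2.0 3.0)) (newline)        ; #t: monotonically increasing
(display (fl<=? 1.0 1.0 2.0)) (newline)       ; #t: monotonically nondecreasing
(display (fl<? 1.0 3.0 2.0)) (newline)        ; #f

;; Assumes +nan.0 is supported and is a flonum.
(define nan +nan.0)
(display (fl=? nan nan)) (newline)            ; #f: a NaN is not equal to itself
(display (flunordered? 1.0 nan)) (newline)    ; #t
(display (flunordered? 1.0 2.0)) (newline)    ; #f
```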

These numerical predicates test a flonum for a particular property, returning #t or #f.

(flinteger? x)

Tests whether x is an integral flonum.

(flzero? x)

Tests whether x is zero. Beware of roundoff errors.

(flpositive? x)

Tests whether x is positive.

(flnegative? x)

Tests whether x is negative. Note that (flnegative? -0.0) must return #f; otherwise it would lose the correspondence with (fl<? -0.0 0.0), which is #f according to IEEE 754.

(flodd? x)

Tests whether the flonum x is odd. It is an error if x is not an integer.

(fleven? x)

Tests whether the flonum x is even. It is an error if x is not an integer.

(flfinite? x)

Tests whether the flonum x is finite. (C99 isfinite)

(flinfinite? x)

Tests whether the flonum x is infinite. (C99 isinf)

(flnan? x)

Tests whether the flonum x is NaN. (C99 isnan)

(flnormalized? x)

Tests whether the flonum x is normalized. (C11 isnormal; in C99, use fpclassify(x) == FP_NORMAL)

(fldenormalized? x)

Tests whether the flonum x is denormalized. (C11 issubnormal; in C99, use fpclassify(x) == FP_SUBNORMAL)
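
The sketch below illustrates a few of these predicates, in particular the treatment of -0.0. It assumes an R7RS system providing (srfi 144); the last two lines additionally assume IEEE flonums with subnormal support, so that fl-least is a denormalized number.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flzero? -0.0)) (newline)            ; #t
(display (flnegative? -0.0)) (newline)        ; #f, consistent with (fl<? -0.0 0.0)
(display (flsign-bit -0.0)) (newline)         ; 1: the sign is still observable

(display (flinteger? 3.0)) (newline)          ; #t
(display (flodd? 3.0)) (newline)              ; #t
(display (fleven? 4.0)) (newline)             ; #t
(display (flfinite? fl-greatest)) (newline)   ; #t

;; Assumes IEEE flonums with subnormals.
(display (fldenormalized? fl-least)) (newline) ; #t
(display (flnormalized? fl-least)) (newline)   ; #f
```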

Arithmetic

(flmax x ...)

(flmin x ...)

Return the maximum/minimum argument. If there are no arguments, these procedures return -inf.0 or +inf.0 if the implementation provides these numbers, and (fl- fl-greatest) or fl-greatest otherwise. (C99 fmax fmin)

(fl+ x ...)

(fl* x ...)

Return the flonum sum or product of their flonum arguments. (C99 + * operators respectively)

(fl+* x y z)

Returns (x × y) + z as if to infinite precision and rounded only once. The boolean constant fl-fast-fl+* indicates whether this procedure executes about as fast as, or faster than, a multiply and an add of flonums. (C99 fma)
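
To make the "rounded only once" property concrete, the following sketch recovers the rounding error of a product. It assumes an R7RS system providing (srfi 144), IEEE double-precision flonums (53-bit significand), and an fl+* that is correctly rounded; the names a, p, and err are only for illustration.

```scheme
(import (scheme base) (scheme write) (srfi 144))

;; a = 1 + 2^-30, so the exact square is 1 + 2^-29 + 2^-60.
(define a (fl+ 1.0 (make-flonum 1.0 -30)))

;; The rounded product drops the 2^-60 term, which is below half an ulp.
(define p (fl* a a))

;; Separate multiply and subtract: both products round the same way.
(display (fl- (fl* a a) p)) (newline)                  ; 0.0

;; Fused: a*a - p is formed exactly and then rounded once.
(define err (fl+* a a (fl- p)))
(display (fl=? err (make-flonum 1.0 -60))) (newline)   ; #t under the stated assumptions
```

This pattern is the basis of error-free product transformations, which is one reason a single-rounding fl+* is useful even when it is not faster.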

(fl- x y ...)

(fl/ x y ...)

With two or more arguments, these procedures return the difference or quotient of their arguments, associating to the left. With one argument, however, they return the additive or multiplicative inverse of their argument. (C99 - / operators respectively)

(flabs x)

Returns the absolute value of x. (C99 fabs)

(flabsdiff x y)

Returns |x - y|.

(flposdiff x y)

Returns the difference of x and y if it is non-negative, or zero if the difference is negative. (C99 fdim)

(flsgn x)

Returns (flcopysign 1.0 x).

(flnumerator x)

(fldenominator x)

Return the numerator/denominator of x as a flonum; the result is computed as if x were represented as a fraction in lowest terms. The denominator is always positive. The numerator of an infinite flonum is itself. The denominator of an infinite or zero flonum is 1.0. The numerator and denominator of a NaN are NaNs.
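
A short sketch of flnumerator and fldenominator, assuming an R7RS system providing (srfi 144). The inputs are exactly representable in radix 2, so the fractions in lowest terms are unambiguous.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flnumerator 0.75)) (newline)     ; 3.0, since 0.75 = 3/4
(display (fldenominator 0.75)) (newline)   ; 4.0
(display (flnumerator -2.5)) (newline)     ; -5.0, since -2.5 = -5/2
(display (fldenominator -2.5)) (newline)   ; 2.0: the denominator is positive
(display (flnumerator 4.0)) (newline)      ; 4.0
(display (fldenominator 0.0)) (newline)    ; 1.0
```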

(flfloor x)

Returns the largest integral flonum not larger than x. (C99 floor)

(flceiling x)

Returns the smallest integral flonum not smaller than x. (C99 ceil)

(flround x)

Returns the closest integral flonum to x, rounding to even when x represents a number halfway between two integers. (Not the same as C99 round, which rounds away from zero)

(fltruncate x)

Returns the closest integral flonum to x whose absolute value is not larger than the absolute value of x. (C99 trunc)
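
The four rounding procedures can be compared side by side, as in this sketch, which assumes an R7RS system providing (srfi 144). Note the round-to-even behavior of flround on halfway cases.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flfloor -2.5)) (newline)      ; -3.0
(display (flceiling -2.5)) (newline)    ; -2.0
(display (fltruncate -2.5)) (newline)   ; -2.0

;; Halfway cases round to the even neighbour.
(display (flround 2.5)) (newline)       ; 2.0
(display (flround 3.5)) (newline)       ; 4.0
(display (flround -2.5)) (newline)      ; -2.0
```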

Exponents and logarithms

(flexp x)

Returns e^x. (C99 exp)

(flexp2 x)

Returns 2^x. (C99 exp2)

(flexp-1 x)

Returns e^x - 1, but is much more accurate than computing it with flexp for very small values of x. It is recommended for use in algorithms where accuracy is important. (C99 expm1)

(flsquare x)

Returns x^2.

(flsqrt x)

Returns √x. For -0.0, flsqrt should return -0.0. (C99 sqrt)

(flcbrt x)

Returns ∛x. (C99 cbrt)

(flhypot x y)

Returns the length of the hypotenuse of a right triangle whose sides are of length |x| and |y|. (C99 hypot)

(flexpt x y)

Returns x^y. If x is zero, then the result is zero. (C99 pow)

(fllog x)

Returns log_e x. (C99 log)

(fllog1+ x)

Returns log_e (x + 1), but is much more accurate than fllog for values of x near 0. It is recommended for use in algorithms where accuracy is important. (C99 log1p)
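
The accuracy remarks for flexp-1 and fllog1+ can be seen with a very small argument, as in this sketch. It assumes an R7RS system providing (srfi 144) and IEEE double-precision flonums, where 1.0 + 1e-20 rounds to 1.0.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(define x 1e-20)

;; Naive forms: the tiny contribution of x is lost when it meets 1.0.
(display (fl- (flexp x) 1.0)) (newline)    ; 0.0 on typical IEEE double systems
(display (fllog (fl+ 1.0 x))) (newline)    ; 0.0, because (fl+ 1.0 x) is 1.0

;; Dedicated procedures keep the result close to x itself.
(display (flexp-1 x)) (newline)            ; approximately 1e-20
(display (fllog1+ x)) (newline)            ; approximately 1e-20
```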

(fllog2 x)

Returns log_2 x. (C99 log2)

(fllog10 x)

Returns log_10 x. (C99 log10)

(make-fllog-base x)

Returns a procedure that calculates the base-x logarithm of its argument. If x is 1.0 or less than 1.0, it is an error.
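
A usage sketch for make-fllog-base, followed by one plausible way such a procedure could be defined. The names log3 and my-make-fllog-base are hypothetical, and the definition is an assumption for illustration, not the sample implementation's code; a real implementation may be more careful about accuracy.

```scheme
(import (scheme base) (scheme write) (srfi 144))

;; Usage: build a base-3 logarithm once, then apply it repeatedly.
(define log3 (make-fllog-base 3.0))
(display (log3 81.0)) (newline)            ; close to 4.0

;; Hypothetical definition via change of base: log_b x = log_e x / log_e b.
;; The logarithm of the base is computed once and captured in the closure.
(define (my-make-fllog-base base)
  (let ((log-of-base (fllog base)))
    (lambda (x) (fl/ (fllog x) log-of-base))))

(display ((my-make-fllog-base 3.0) 81.0)) (newline)   ; also close to 4.0
```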

Trigonometric functions

(flsin x)

Returns sin x. (C99 sin)

(flcos x)

Returns cos x. (C99 cos)

(fltan x)

Returns tan x. (C99 tan)

(flasin x)

Returns arcsin x. (C99 asin)

(flacos x)

Returns arccos x. (C99 acos)

(flatan [y] x)

Returns arctan x. (C99 atan)

With two arguments, returns arctan(y/x) in the range [-π,π], using the signs of x and y to choose the correct quadrant for the result. (C99 atan2)
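
The quadrant selection of the two-argument form can be seen in this sketch, which assumes an R7RS system providing (srfi 144). Note the argument order: y comes first, then x.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flatan 1.0)) (newline)          ; one argument: close to π/4

(display (flatan 1.0 1.0)) (newline)      ; y > 0, x > 0: close to π/4
(display (flatan 1.0 -1.0)) (newline)     ; y > 0, x < 0: close to 3π/4
(display (flatan -1.0 -1.0)) (newline)    ; y < 0, x < 0: close to -3π/4
(display (flatan -1.0 1.0)) (newline)     ; y < 0, x > 0: close to -π/4
```

A one-argument call on the quotient, such as (flatan (fl/ -1.0 -1.0)), cannot distinguish the third quadrant from the first, which is why the two-argument form takes y and x separately.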

(flsinh x)

Returns sinh x. (C99 sinh)

(flcosh x)

Returns cosh x. (C99 cosh)

(fltanh x)

Returns tanh x. (C99 tanh)

(flasinh x)

Returns arcsinh x. (C99 asinh)

(flacosh x)

Returns arccosh x. (C99 acosh)

(flatanh x)

Returns arctanh x. (C99 atanh)

Integer division

(flquotient x y)

Returns the quotient of x/y as an integral flonum, truncated towards zero.

(flremainder x y)

Returns the truncating remainder of x/y as an integral flonum.

(flremquo x y)

Returns two values, the rounded remainder of x/y and the low-order n bits (as a correctly signed exact integer) of the rounded quotient. The value of n is implementation-dependent but at least 3. This procedure can be used to reduce the argument of the inverse trigonometric functions, while preserving the correct quadrant or octant. (C99 remquo)
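
The difference between the truncating procedures and flremquo, which rounds the quotient to the nearest integer, is sketched below. The example assumes an R7RS system providing (srfi 144) and uses small exact inputs.

```scheme
(import (scheme base) (scheme write) (srfi 144))

;; Truncating division: the quotient is truncated towards zero.
(display (flquotient 7.0 2.0)) (newline)      ; 3.0
(display (flremainder 7.0 2.0)) (newline)     ; 1.0
(display (flquotient -7.0 2.0)) (newline)     ; -3.0
(display (flremainder -7.0 2.0)) (newline)    ; -1.0

;; Rounded division: 11/3 is about 3.67, which rounds to 4,
;; so the remainder is 11 - 4*3 = -1.
(let-values (((rem quo) (flremquo 11.0 3.0)))
  (display (list rem quo)) (newline))         ; -1.0 and 4
```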

Special functions

(flgamma x)

Returns Γ(x), the gamma function applied to x. This is equal to (x-1)! for integers. (C99 tgamma)

(flloggamma x)

Returns two values, log |Γ(x)| without internal overflow, and the sign of Γ(x) as 1.0 if it is positive and -1.0 if it is negative. (C99 lgamma)
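
A sketch of how the two values returned by flloggamma might be recombined, assuming an R7RS system providing (srfi 144). It uses the known value Γ(-1/2) = -2√π; the name gamma-from-log is hypothetical.

```scheme
(import (scheme base) (scheme write) (srfi 144))

(display (flgamma 5.0)) (newline)             ; close to 24.0, which is 4!

;; Hypothetical helper: rebuild Γ(x) as sign × e^(log |Γ(x)|).
(define (gamma-from-log x)
  (let-values (((log-abs sign) (flloggamma x)))
    (fl* sign (flexp log-abs))))

(display (gamma-from-log -0.5)) (newline)     ; close to -2√π, about -3.5449
(display (fl* -2.0 (flsqrt fl-pi))) (newline) ; reference value for comparison
```

For large x, where Γ(x) itself would overflow, only the logarithm from flloggamma remains usable, so the recombination above is meant for illustration rather than for general use.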

(flfirst-bessel n x)

Returns the nth order Bessel function of the first kind applied to x, J_n(x). (jn, which is an XSI Extension of C99)

(flsecond-bessel n x)

Returns the nth order Bessel function of the second kind applied to x, Y_n(x). (yn, which is an XSI Extension of C99)

(flerf x)

Returns the error function erf(x). (C99 erf)

(flerfc x)

Returns the complementary error function, 1 - erf(x). (C99 erfc)

Implementation

A sample implementation of this SRFI is in the repository.

The sample implementation should run without modification in every complete implementation of R7RS-small that uses IEEE-754 double or single precision arithmetic for inexact reals.

To show how a Foreign Function Interface (FFI) to C99 math libraries could be used to implement some procedures of SRFI 144, the sample implementation is configured to use Larceny's FFI when running on an x86 processor under Linux or MacOS X. Although srfi/144.ffi.scm uses Larceny's FFI to make many C functions available, the sample implementation uses only three of those C functions:

Those are the only C functions that provide a worthwhile advantage in either accuracy or speed over the completely portable definitions, as measured in Larceny.

The portable implementations of the Bessel functions are likely to be considerably less accurate than the C functions jn and yn when the flonum argument is large.

An implementation for Chibi Scheme is available, too.

Acknowledgements

This SRFI would not have been possible without Taylor Campbell, the R6RS editors, and the ISO C Working Group.

Copyright

Copyright (C) John Cowan (2016). All Rights Reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler