
Re: reading NaNs

This page is part of the web mail archives of SRFI 77 from before July 7th, 2015. The new archives for SRFI 77 contain all messages, not just those from before July 7th, 2015.



Aubrey Jaffer wrote:
 | Date: Tue, 25 Oct 2005 18:13:54 -0500
 | From: Alan Watson <a.watson@xxxxxxxxxxxxxxxx>
 |
 | Aubrey Jaffer wrote:
 | > Having a universal read/write representation for arbitrary bit
 | > patterns prevents including information like the procedure
 | > causing the NaN in its printed representation.
 |
 | I don't see this. Can you elaborate?

Suppose Scheme implementation X distinguishes NaNs by the procedure
producing them and makes that information part of the printed
representation for NaNs:

  (expt +inf.0 0)    ==>  #<not-a-number expt>
  (+ -inf.0 +inf.0)  ==>  #<not-a-number +>
  (/ (* 0 -inf.0) 3) ==>  #<not-a-number *>
  (+ 5. (/ 0.0 0.0)) ==>  #<not-a-number />

If R6RS has a universal read/write representation for arbitrary NaN
bit patterns, then R6RS must assign bit patterns for all possible
#<not-a-number {*}> syntaxes.

Agreed.

If some future IEEE-754 hardware
returns more than one NaN code, then its assignments are very unlikely
to match the R6RS codes.

Yes, but all you are saying is that a Scheme standard cannot mandate the values of the NaNs that are produced without creating a severe burden to some implementations.

Scheme has so far avoided arbitrary assignments of numbers to
procedures.

Yes. But what of it? Associating context (procedure names, source locations, etc.) with NaNs will be a severe burden to many implementations. For this reason, R6RS should not mandate it.

R6RS might provide an optional means to associate context with NaNs, provided it failed gracefully on implementations that do not support context. However, for the same reason, it should not specify the precise relation between context and bit pattern.

For example, suppose I allow any object to be associated with a NaN. I accomplish this by sticking the object in a vector and using the integer formed by the bit pattern of the NaN as an offset into the vector. The association between a bit pattern and a context is then not unique even within the same implementation!
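
A minimal sketch of that vector scheme, in Python rather than Scheme because its `struct` module exposes IEEE-754 bit patterns portably. The names `contexts`, `nan_with_context`, `nan_bits` and `context_of` are hypothetical, and the sketch assumes double-precision quiet NaNs whose payload survives being passed around unmodified:

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # exponent all ones, quiet bit set
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF    # the 51 significand bits below the quiet bit

contexts = []  # hypothetical per-process table of context objects


def nan_bits(x):
    """Return the 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def nan_with_context(obj):
    """Append obj to the table and return a NaN whose payload is its index."""
    contexts.append(obj)
    index = len(contexts) - 1
    bits = QUIET_NAN_BITS | (index & PAYLOAD_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def context_of(x):
    """Look up the context object stored for this NaN's payload."""
    return contexts[nan_bits(x) & PAYLOAD_MASK]


first = nan_with_context("expt")
second = nan_with_context("expt")
# Same context, different bit patterns: the mapping is not unique.
print(hex(nan_bits(first)), hex(nan_bits(second)))
```

Because the index depends only on the order in which contexts were registered, two runs of the same program, or two NaNs with the same context, can carry different bit patterns.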

I do think your suggestion of context is a good one, provided it is optional and does not interfere with bit patterns. It can be accommodated by:

(a) "write" writes NaNs with a bit pattern and a possible context (for example, #<NaN #x12345ef expt>). If the implementation does not support context or if the NaN is not associated with any context, "write" omits it (for example, #<NaN #x12345ef>).

(b) "display" writes NaNs without a bit pattern but with a possible context (for example, #<NaN expt>). If the implementation does not support context or if the NaN is not associated with any context, "display" omits it (for example, #<NaN>).

(c) If a bit pattern is present, "read" produces a NaN with the same bit pattern. This maintains read/write equivalence.

(d) If no bit pattern is present, but a context is present, and if the implementation supports contexts, then "read" produces a NaN with the same context. The resulting bit pattern may differ from the original, as the reading and writing implementations may encode the context differently.

(e) If neither a bit pattern nor a context is present, "read" produces a contextless NaN.

Basically, write/read would preserve bit patterns and display/read would attempt to preserve context. What do you think?
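
Rules (a) to (e) can be sketched roughly as follows, again in Python. This is only an illustration under stated assumptions: the hexadecimal field is taken to be the payload of a double-precision quiet NaN, the context is carried alongside the NaN as a separate string, and the names `write_nan`, `display_nan` and `read_nan` are hypothetical.

```python
import math
import re
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # double: exponent all ones, quiet bit set
PAYLOAD_MASK = 0x0007FFFFFFFFFFFF    # the 51 significand bits below the quiet bit

# Matches #<NaN>, #<NaN #x12345ef>, #<NaN expt> and #<NaN #x12345ef expt>.
NAN_SYNTAX = re.compile(r"#<NaN(?: #x([0-9a-fA-F]+))?(?: ([^\s>]+))?>")


def payload_of(x):
    """Return the payload bits of a double-precision NaN."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return bits & PAYLOAD_MASK


def nan_from_payload(payload):
    """Build a quiet NaN carrying the given payload."""
    return struct.unpack("<d", struct.pack("<Q", QUIET_NAN_BITS | payload))[0]


def write_nan(x, context=None):
    """Rule (a): always the bit pattern, plus the context if there is one."""
    text = "#<NaN #x%x" % payload_of(x)
    if context is not None:
        text += " " + context
    return text + ">"


def display_nan(context=None):
    """Rule (b): no bit pattern, only the context if there is one."""
    return "#<NaN>" if context is None else "#<NaN %s>" % context


def read_nan(text, supports_context=True):
    """Rules (c) to (e): return a (nan, context) pair."""
    match = NAN_SYNTAX.fullmatch(text)
    if match is None:
        raise ValueError("not a NaN syntax: %r" % text)
    hex_payload, context = match.groups()
    if not supports_context:
        context = None  # fail gracefully: drop the context
    payload = 0         # rules (d), (e): the reader picks the bits
    if hex_payload is not None:
        payload = int(hex_payload, 16)  # rule (c): keep the written bits
        if payload > PAYLOAD_MASK:
            raise ValueError("payload does not fit in a double NaN")
    return nan_from_payload(payload), context


x, ctx = read_nan("#<NaN #x12345ef expt>")
print(write_nan(x, ctx))   # round-trips the bit pattern and the context
print(display_nan(ctx))    # keeps only the context
```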

 | BTW, I neither favor nor oppose reading or writing of arbitrary bit
 | patterns.  I am in favor of being able to specify the precision of the
 | NaN, though.

The precision available in NaNs is a hardware attribute.  How can it
be settable?

I mean that when I read a NaN, I want to be able to specify if it is a single precision NaN, a double precision NaN, or whatever.
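
To illustrate what "precision" means here, the following Python sketch builds the quiet-NaN bit pattern for the two common IEEE-754 binary formats. The function name `quiet_nan_bits` and the `FORMATS` table are hypothetical; only the field widths come from IEEE 754.

```python
import math
import struct

# (exponent bits, significand bits) for the IEEE-754 binary32 and binary64 formats
FORMATS = {"single": (8, 23), "double": (11, 52)}


def quiet_nan_bits(precision, payload=0):
    """Return the bit pattern of a positive quiet NaN in the given format."""
    exp_bits, sig_bits = FORMATS[precision]
    if payload >> (sig_bits - 1):
        raise ValueError("payload does not fit below the quiet bit")
    exponent = ((1 << exp_bits) - 1) << sig_bits  # exponent field all ones
    quiet = 1 << (sig_bits - 1)                   # top significand bit
    return exponent | quiet | payload


print(hex(quiet_nan_bits("single")))
print(hex(quiet_nan_bits("double")))
```

A reader that accepted a precision specifier would choose between these two layouts; the payload capacity differs (22 bits versus 51 bits), which is one reason a bit pattern written at one precision need not be representable at another.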


 | >  | However, I still think we need a read syntax.  Suppose program
 | >  | A calculates a value and writes it to a file and program B
 | >  | reads the value from the file and uses it.  Is it not useful
 | >  | for program A to be able to communicate to program B that it
 | >  | got a NaN?
 | >
 | > [...]
 | >
 | > If program A writes out its state, it would be useful to see that
 | > NaNs were computed.  It gives operators a chance to capture the
 | > use case which provoked the error.  If the program state is very
 | > valuable, then it can be repaired manually.
 | >
 | > But if program B reads its initial state from the file, its
 | > reading of NaNs puts errors into its state which can propagate
 | > and corrupt it.
 |
 | Um, how about program B doing:
 |
 |    (let ((x (read)))
 |      (if (nan? x)
 |        (display "Help! Help! Program A is feeding me NaNs!")))
 |
 | Sure, if you don't check your input, you can be screwed, but what's
 | new there?

If program B reads lists, association lists, or vectors containing
numbers, then your approach must descend data structures looking for
NaNs -- all this effort for a case which shouldn't occur.

Yes, but if generating a NaN is an error, someone will have to do this -- either program A or program B.

For example, you talked about an HTTP server dumping its state and then re-reading it. If a NaN appears in the output, then the NaN was there in the original state. If NaNs are errors to your server, should they not be checked at the point where they might be produced, rather than producing an error later when they are re-read?
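
For what it is worth, the descent itself is short. A minimal Python sketch, assuming the state consists of nested lists, tuples and dictionaries (standing in for Scheme lists, vectors and association lists); the name `contains_nan` is hypothetical:

```python
import math


def contains_nan(obj):
    """Return True if a NaN occurs anywhere inside a nested structure."""
    if isinstance(obj, float):
        return math.isnan(obj)
    if isinstance(obj, dict):
        return any(contains_nan(k) or contains_nan(v) for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return any(contains_nan(item) for item in obj)
    return False  # integers, strings, etc. cannot be NaNs


state = [1, (2.0, {"gain": float("nan")})]
if contains_nan(state):
    print("Help! Help! Program A is feeding me NaNs!")
```

The same walk could just as well run in program A before it writes its state, which is the point at issue: the check has to live somewhere.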

Should this
input vetting also check for embedded end-of-file objects?

I guess this is why Kawa and Chicken interpret embedded EOF objects in the way they do -- you can read and write a list containing an EOF object without a problem.

One could use NaNs for all sorts of purposes.  I think they are most
useful when they flag impossible and unplanned-for numerical errors.

That's a valid opinion, but I do not share it. I use them quite usefully to flag possible and planned-for events for which there is no other good answer.

As such, it removes this input vetting burden if an implementation is
permitted to make NaNs unreadable.

Yes, and for certain applications this is great and for others it is a disaster.

Suppose the input to B must be non-negative. What happens if A produces some numbers that are negative? Well, you have to check for them in A or B. In this context, how are negative numbers different from NaNs?

If NaNs were used to signal bad pixels, then those NaNs would spread
to neighboring pixels with certain types of signal processing steps.
If you happened to employ Fourier transforms to do convolutions, then
your entire image would become NaNs.

Yes, and that's probably a good idea. The pixels are bad because they are ... bad. Using bad pixels directly in a calculation is not a good idea.

(We get rid of the bad pixels later by taking images at different offsets (so the bad pixels move on the sky), aligning the images, and averaging the stack ignoring NaNs. Or we interpolate. We would then, for example, take the FFT of the image without NaNs.)
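
The "averaging the stack ignoring NaNs" step can be sketched in a few lines of Python. This assumes the images are already aligned and are given as equally sized lists of rows; the name `nanmean_stack` is hypothetical and this is not the actual reduction pipeline.

```python
import math


def nanmean_stack(images):
    """Average aligned images pixel by pixel, skipping NaN (bad) pixels."""
    rows, cols = len(images[0]), len(images[0][0])
    result = []
    for r in range(rows):
        row = []
        for c in range(cols):
            good = [img[r][c] for img in images if not math.isnan(img[r][c])]
            # A pixel that is bad in every image stays NaN in the result.
            row.append(sum(good) / len(good) if good else math.nan)
        result.append(row)
    return result


nan = math.nan
stack = [
    [[1.0, nan]],
    [[3.0, 5.0]],
    [[nan, 7.0]],
]
print(nanmean_stack(stack))
```

Here the NaN is doing exactly the planned-for job: it marks which samples to leave out, and only a pixel with no good samples at all remains flagged.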

I've also used NaNs to mean "this region of the image should not be used when fitting a model to the data".

These are real uses of NaNs that would be compromised by prohibiting reading NaNs.

Regards,

Alan
--
Dr Alan Watson
Centro de Radioastronomía y Astrofísica
Universidad Nacional Autónoma de México