Title

A Library of Streams

Author

Philip L. Bewig

Status

This SRFI is currently in "final" status. To see an explanation of each status that a SRFI can hold, see here. To comment on this SRFI, please mail to srfi-40@srfi.schemers.org. See instructions here to subscribe to the list. You can access the discussion via the archive of the mailing list. You can access post-finalization messages via the archive of the mailing list.

  • Received: 2003/02/03
  • Draft: 2003/02/03-2003/04/03
  • Revised: 2003/08/02
  • Revised: 2003/12/23
  • Final: 2004/08/22

    Abstract

    Along with higher-order functions, one of the hallmarks of functional programming is lazy evaluation. A primary manifestation of lazy evaluation is lazy lists, generally called streams by Scheme programmers, where evaluation of a list element is delayed until its value is needed.

    The literature on lazy evaluation distinguishes two styles of laziness, called even and odd. Odd style streams are ubiquitous among Scheme programs and can be easily encoded with the Scheme primitives delay and force defined in R5RS. However, the even style delays evaluation in a manner closer to that of traditional lazy languages such as Haskell and avoids an "off by one" error that is symptomatic of the odd style.

    This SRFI defines the stream data type in the even style, along with some essential procedures and syntax that operate on streams, and it motivates our choice of the even style. A companion SRFI 41 Stream Library provides additional procedures and syntax that make for more convenient processing of streams and shows several examples of their use.

    Rationale

    Two of the defining characteristics of functional programming languages are higher-order functions, which provide a powerful tool to allow programmers to abstract data representations away from an underlying concrete implementation, and lazy evaluation, which allows programmers to modularize a program and recombine the pieces in useful ways. Scheme provides higher-order functions through its lambda keyword and lazy evaluation through its delay keyword. A primary manifestation of lazy evaluation is lazy lists, generally called streams by Scheme programmers, where evaluation of a list element is delayed until its value is needed. Streams can be used, among other things, to compute with infinities, conveniently process simulations, program with coroutines, and reduce the number of passes over data. This library defines a minimal set of functions and syntax for programming with streams.

    Scheme has a long tradition of computing with streams. The great computer science textbook Structure and Interpretation of Computer Programs uses streams extensively. The example given in R5RS makes use of streams to integrate systems of differential equations using the method of Runge-Kutta. MIT Scheme, the original implementation of Scheme, provides streams natively. Scheme and the Art of Programming discusses streams. Some Scheme-like languages also have traditions of using streams: Winston and Horn, in their classic Lisp textbook, discuss streams, and so does Larry Paulson in his text on ML. Streams are an important and useful data structure.

    Basically, a stream is much like a list, and can either be null or can consist of an object (the stream element) followed by another stream; the difference from a list is that elements aren't evaluated until they are accessed. All the streams mentioned above use the same underlying representation, with the null stream represented by '() and stream pairs constructed by (cons car (delay cdr)), which must be implemented as syntax. These streams are known as head-strict, because the head of the stream is always computed, whether or not it is needed.

    Streams are the central data type -- just as arrays are for most imperative languages and lists are for Lisp and Scheme -- for the "pure" functional languages Miranda and Haskell. But those streams are subtly different from the traditional Scheme streams of SICP et al. The difference is at the head of the stream, where Miranda and Haskell provide streams that are fully lazy, with even the head of the stream not computed until it is needed. We'll see in a moment the operational difference between the two types of streams.

    Philip Wadler, Walid Taha, and David MacQueen, in their paper "How to add laziness to a strict language without even being odd", describe how they added streams to the SML/NJ compiler. They discuss two kinds of streams: odd streams, as in SICP et al, and even streams, as in Haskell; the names odd and even refer to the parity of the number of constructors (delay, cons, nil) used to represent the stream. Here are the first two figures from their paper, rewritten in Scheme:

    ;;; FIGURE 1 -- ODD                 
    				    
    (define nil1 '())                   
    				    
    (define (nil1? strm)                
      (null? strm))                     
    				    
    (define-syntax cons1                
      (syntax-rules ()                  
        ((cons1 obj strm)               
          (cons obj (delay strm)))))    
    				    
    (define (car1 strm)                 
      (car strm))                       
    				    
    (define (cdr1 strm)                 
      (force (cdr strm)))               
    				    
    (define (map1 func strm)
      (if (nil1? strm)
        nil1
        (cons1
          (func (car1 strm))
          (map1 func (cdr1 strm)))))

    (define (countdown1 n)
      (cons1 n (countdown1 (- n 1))))

    (define (cutoff1 n strm)
      (cond
        ((zero? n) '())
        ((nil1? strm) '())
        (else
          (cons
            (car1 strm)
            (cutoff1 (- n 1)
                     (cdr1 strm))))))
    
    ;;; FIGURE 2 -- EVEN
    
    (define nil2 (delay '()))
    
    (define (nil2? strm)
      (null? (force strm)))
    
    (define-syntax cons2
      (syntax-rules ()
        ((cons2 obj strm)
         (delay (cons obj strm)))))
    
    (define (car2 strm)
      (car (force strm)))
    
    (define (cdr2 strm)
      (cdr (force strm)))
    
    (define (map2 func strm)
      (delay (force
        (if (nil2? strm)
          nil2
          (cons2
            (func (car2 strm))
            (map2 func (cdr2 strm)))))))
    
    (define (countdown2 n)
      (delay (force
        (cons2 n (countdown2 (- n 1))))))
    
    (define (cutoff2 n strm)
      (cond
        ((zero? n) '())
        ((nil2? strm) '())
        (else
          (cons
            (car2 strm)
            (cutoff2 (- n 1)
                     (cdr2 strm))))))
    

    It is easy to see the operational difference between the two kinds of streams, using an example adapted from the paper:

    > (define (12div n) (/ 12 n))       
    > (cutoff1 4                        
        (map1 12div (countdown1 4)))    
    error: divide by zero               
    
    > (define (12div n) (/ 12 n))
    > (cutoff2 4
        (map2 12div (countdown2 4)))
    (3 4 6 12)
    

    The problem of odd streams is that they do too much work, having an "off-by-one" error that causes them to evaluate the next element of a stream before it is needed. Mostly that's just a minor leak of space and time, but if evaluating the next element causes an error, such as dividing by zero, it's a silly, unnecessary bug.
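
    The extra work can be made visible with a small experiment. The sketch below condenses the odd and even definitions from Figures 1 and 2 and maps a tracing function over each kind of stream without reading any element. The names traced-double, calls, odd-calls and even-calls are hypothetical, introduced only for this illustration.

```scheme
;; Hypothetical helper: records every argument the mapped function receives.
(define calls '())
(define (traced-double n)
  (set! calls (cons n calls))
  (* 2 n))

;; Odd style, condensed from Figure 1.
(define-syntax cons1
  (syntax-rules ()
    ((cons1 obj strm) (cons obj (delay strm)))))
(define (cdr1 strm) (force (cdr strm)))
(define (map1 func strm)
  (if (null? strm)
      '()
      (cons1 (func (car strm)) (map1 func (cdr1 strm)))))
(define (countdown1 n) (cons1 n (countdown1 (- n 1))))

;; Even style, condensed from Figure 2.
(define-syntax cons2
  (syntax-rules ()
    ((cons2 obj strm) (delay (cons obj strm)))))
(define (car2 strm) (car (force strm)))
(define (cdr2 strm) (cdr (force strm)))
(define (map2 func strm)
  (delay (force
    (if (null? (force strm))
        (delay '())
        (cons2 (func (car2 strm)) (map2 func (cdr2 strm)))))))
(define (countdown2 n)
  (delay (force (cons2 n (countdown2 (- n 1))))))

;; Build each mapped stream but do not look at any element.
(define odd-result (map1 traced-double (countdown1 4)))
(define odd-calls calls)      ; (4): the head was computed on construction

(set! calls '())
(define even-result (map2 traced-double (countdown2 4)))
(define even-calls calls)     ; (): nothing has been computed yet

;; Only an explicit access triggers the computation in the even style.
(define even-head (car2 even-result))   ; 8
```

    The odd stream has already called the function once before anyone asked for an element, while the even stream calls it only when car2 is applied.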

    It is instructive to look at the coding differences between odd and even streams. We expect the two constructors nil and cons to be different, and they are; the odd nil and cons return a strict list, but the even nil and cons return promises. Nil?, car and cdr change to accommodate the underlying representation differences. Cutoff is identical in the two versions, because it doesn't return a stream.

    The subtle but critical difference is in map and countdown, the two functions that return streams. They are identical except for the (delay (force ...)) that wraps the return value in the even version. That looks odd, but is correct. It is tempting to just eliminate the (delay (force ...)), but that doesn't work, because, given a promise x, even though (delay (force x)) and x both evaluate to x when forced, their semantics are different, with x being evaluated and cached in one case but not the other. That evaluation is, of course, the same "off-by-one" error that caused the problem with odd streams. Note that (force (delay x)) is something different entirely, even though it looks much the same.
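
    The distinction can be observed directly with R5RS promises. In the sketch below, build-promise is a hypothetical stand-in for any stream-returning expression: evaluating it does some work, counted in the hypothetical variable evaluations, and yields a promise. The other names are likewise illustrative only.

```scheme
(define evaluations 0)

;; Stand-in for a stream-returning expression: evaluating it performs
;; some work (counted here) and returns a promise.
(define (build-promise)
  (set! evaluations (+ evaluations 1))
  (delay 'element))

;; Bare expression: the work happens as soon as the expression is evaluated.
(define bare (build-promise))
(define after-bare evaluations)                   ; 1

;; Wrapped in (delay (force ...)): nothing is evaluated yet.
(define wrapped (delay (force (build-promise))))
(define after-wrapped evaluations)                ; still 1

;; The work happens on demand, and the result is cached.
(define value (force wrapped))                    ; element
(define after-force evaluations)                  ; 2
(define value-again (force wrapped))
(define after-second-force evaluations)           ; still 2

;; (force (delay x)) is just x, evaluated immediately.
(define immediate (force (delay (build-promise))))
(define after-immediate evaluations)              ; 3
```

    Both bare and wrapped yield the same value when forced, but only the wrapped form postpones the work until the value is requested.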

    Unfortunately, that (delay (force ...)) is a major notational inconvenience, because it means that the representation of streams can't be hidden inside a few primitives but must infect each function that returns a stream, making streams harder to use, harder to explain, and more prone to error. Wadler et al solve the notational inconvenience in their SML/NJ implementation by adding special syntax -- the keyword lazy -- within the compiler. Since Scheme allows syntax to be added via a macro, it doesn't require any compiler modifications to provide streams. Shown below is a Scheme implementation of Figures 1 to 3 from the paper, with the (delay (force ...)) hidden within stream-define, which is the syntax used to create a function that returns a stream:

    ;;; FIGURE 1 -- ODD      
    			 
    (define nil1             
      '())                   
    			 
    (define (nil1? strm)     
      (null? strm))          
    			 
    (define-syntax cons1     
      (syntax-rules ()       
        ((cons1 obj strm)    
          (cons              
            obj              
              (delay         
                strm)))))    
    			 
    (define (car1 strm)      
      (car strm))            
    			 
    (define (cdr1 strm)
      (force (cdr strm)))

    (define (map1 func strm)
      (if (nil1? strm)
        nil1                 
        (cons1               
          (func              
            (car1 strm))     
          (map1              
            func             
            (cdr1            
              strm)))))      
    			 
    (define (countdown1 n)
      (cons1
        n                    
        (countdown1          
          (- n 1))))         
    			 
    (define (cutoff1 n strm) 
      (cond                  
        ((zero? n) '())      
        ((nil1? strm) '())   
        (else                
          (cons              
            (car1 strm)      
            (cutoff1         
              (- n 1)        
              (cdr1          
                strm))))))   
    
    ;;; FIGURE 2 -- EVEN     
    			 
    (define nil2             
      (delay '()))           
    			 
    (define (nil2? strm)     
      (null? (force strm)))
    			 
    (define-syntax cons2     
      (syntax-rules ()       
        ((cons2 obj strm)    
          (delay             
            (cons            
              obj            
              strm)))))      
    			 
    (define (car2 strm)      
      (car (force strm)))    
    			 
    (define (cdr2 strm)
      (cdr (force strm)))

    (define (map2 func strm)
      (delay (force
        (if (nil2? strm)
          nil2               
          (cons2             
            (func            
              (car2 strm))   
            (map2            
              func           
              (cdr2          
                strm)))))))  
    			 
    (define (countdown2 n)   
      (delay (force
        (cons2               
          n                  
          (countdown2        
            (- n 1))))))     
    			 
    (define (cutoff2 n strm) 
      (cond                  
        ((zero? n) '())      
        ((nil2? strm) '())   
        (else                
          (cons              
            (car2 strm)      
            (cutoff2         
              (- n 1)        
              (cdr2          
                strm))))))   
    
    ;;; FIGURE 3 -- EASY
    
    (define nil3
      (delay '()))
    
    (define (nil3? strm)
      (null? (force strm)))
    
    (define-syntax cons3
      (syntax-rules ()
        ((cons3 obj strm)
          (delay
            (cons
              obj
              strm)))))
    
    (define (car3 strm)
      (car (force strm)))
    
    (define (cdr3 strm)
      (cdr (force strm)))
    
    (define-syntax stream-define
     (syntax-rules ()
      ((stream-define (name args ...)
                      body0 body1 ...)
       (define (name args ...)
        (delay (force
         (begin body0 body1 ...)))))))
    
    (stream-define (map3 func strm)
      (if (nil3? strm)
        nil3
        (cons3
          (func
            (car3 strm))
          (map3
            func
            (cdr3
              strm)))))
    
    (stream-define (countdown3 n)
      (cons3
        n
        (countdown3
          (- n 1))))
    
    (define (cutoff3 n strm)
      (cond
        ((zero? n) '())
        ((nil3? strm) '())
        (else
          (cons
            (car3 strm)
            (cutoff3
              (- n 1)
              (cdr3
                strm))))))
    

    It is now easy to see the notational inconvenience of Figure 2, as the bodies of map1 and map3 are identical, as are countdown1 and countdown3. All of the inconvenience is hidden in the stream primitives, where it belongs, so functions that use the primitives won't be burdened. This means that users can just step up and use the library without any knowledge of how the primitives are implemented, and indeed the implementation of the primitives can change without affecting users of the primitives, which would not have been possible with the streams of Figure 2. With this implementation of streams, (cutoff3 4 (map3 12div (countdown3 4))) evaluates to (3 4 6 12), as it should.

    This library provides streams that are even, not odd. This decision overturns years of experience in the Scheme world, but follows the traditions of the "pure" functional languages such as Miranda and Haskell. The primary benefit is elimination of the "off-by-one" error that odd streams suffer. Of course, it is possible to use even streams to represent odd streams, as Wadler et al show in their Figure 4, so nothing is lost by choosing even streams as the default.

    Obviously, stream elements are evaluated when they are accessed, not when they are created; that's the definition of lazy. Additionally, stream elements must be evaluated only once, and the result cached in the event it is needed again; that's common practice in all languages that support streams. Following the rule of R5RS section 1.1 fourth paragraph, an implementation of streams is permitted to delete a stream element from the cache and reclaim the storage it occupies if it can prove that the stream element cannot possibly matter to any future computation.

    The fact that objects are permitted, but not required, to be reclaimed has a significant impact on streams. Consider for instance the following example, due to Joe Marshall. Stream-filter is a function that takes a predicate and a stream and returns a new stream containing only those elements of the original stream that pass the predicate; it can be simply defined as follows:

        (stream-define (stream-filter pred? strm)
          (cond ((stream-null? strm) strm)
                ((pred? (stream-car strm))
                  (stream-cons (stream-car strm)
                               (stream-filter pred? (stream-cdr strm))))
                (else (stream-filter pred? (stream-cdr strm)))))
    

    But this implementation of stream-filter has a problem:

        (define (times3 n)
          (stream-car
            (stream-cdr
              (stream-cdr
                (stream-cdr
                  (stream-filter
                    (lambda (x) (zero? (modulo x n)))
                    from0))))))
    

    Called as (times3 5), the function evaluates to 15, as desired. But called as (times3 1000000), it churns the disk, creating closures and caching each result as it counts slowly to 3,000,000; on most Scheme systems, this function will run out of memory long before it computes an answer. A space leak occurs when there is a gap between elements that pass the predicate, because the naive definition hangs on to the head of the gap. Unfortunately, this space leak can be very hard to fix, depending on the underlying Scheme implementation, and solutions that work in one Scheme implementation may not work in another. And, since R5RS itself doesn't specify any safe-for-space requirements, this SRFI can't make any specific requirements either. Thus, this SRFI encourages native implementations of the streams described in this SRFI to "do the right thing" with respect to space consumption, and implement streams that are as safe-for-space as the rest of the implementation. Of course, if the stream is bound in a scope outside the stream-filter expression, there is nothing to be done except cache the elements as they are filtered.

    Although stream-define has been discussed as the basic stream abstraction, in fact it is the (delay (force ...)) mechanism that is the basis for everything else. In the spirit of Scheme minimality, the specification below gives stream-delay as the syntax for converting an expression to a stream; stream-delay is similar to delay, but returns a stream instead of a promise. Given stream-delay, it is easy to create stream-lambda, which returns a stream-valued function, and then stream-define, which binds a stream-valued function to a name. However, stream-lambda and stream-define are both library syntax, not fundamental to the use of streams, and are thus excluded from this SRFI.
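
    One way the layering might look is sketched below. So that the sketch runs on plain R5RS promises, stream-delay, stream-cons, stream-car and stream-cdr are defined here as simple promise-based stand-ins in the manner of Figure 3; they are assumptions for illustration and not the stream type specified by this SRFI. The function from and the variable third are hypothetical names as well.

```scheme
;; Promise-based stand-ins (assumptions for this sketch only).
(define-syntax stream-delay
  (syntax-rules ()
    ((stream-delay expr) (delay (force expr)))))
(define-syntax stream-cons
  (syntax-rules ()
    ((stream-cons obj strm) (delay (cons obj strm)))))
(define (stream-car strm) (car (force strm)))
(define (stream-cdr strm) (cdr (force strm)))

;; stream-lambda: a lambda whose body is wrapped in stream-delay.
(define-syntax stream-lambda
  (syntax-rules ()
    ((stream-lambda formals body0 body1 ...)
     (lambda formals
       (stream-delay (let () body0 body1 ...))))))

;; stream-define: binds a name to a stream-lambda.
(define-syntax stream-define
  (syntax-rules ()
    ((stream-define (name . formals) body0 body1 ...)
     (define name (stream-lambda formals body0 body1 ...)))))

;; An infinite stream-valued function written without any visible
;; (delay (force ...)).
(stream-define (from n)
  (stream-cons n (from (+ n 1))))

(define third (stream-car (stream-cdr (stream-cdr (from 10)))))   ; 12
```

    The body of from reads like an ordinary recursive definition; the delaying is supplied entirely by the two macros.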

    Specification

    A stream-pair is a data structure consisting of two fields called the stream-car and stream-cdr. Stream-pairs are created by the procedure stream-cons, and the stream-car and stream-cdr fields are accessed by the procedures stream-car and stream-cdr. There also exists a special stream object called stream-null, which is a single stream object with no elements, distinguishable from all other stream objects and, indeed, from all other objects of any type. The stream-cdr of a stream-pair must be either another stream-pair or stream-null.

    Stream-null and stream-pair are used to represent streams. A stream can be defined recursively as either stream-null or a stream-pair whose stream-cdr is a stream. The objects in the stream-car fields of successive stream-pairs of a stream are the elements of the stream. For example, a two-element stream is a stream-pair whose stream-car is the first element and whose stream-cdr is a stream-pair whose stream-car is the second element and whose stream-cdr is stream-null. A chain of stream-pairs ending with stream-null is finite and has a length that is computed as the number of elements in the stream, which is the same as the number of stream-pairs in the stream. A chain of stream-pairs not ending with stream-null is infinite and has undefined length.

    The way in which a stream can be infinite is that no element of the stream is evaluated until it is accessed. Thus, any initial prefix of the stream can be enumerated in finite time and space, but still the stream remains infinite. Stream elements are evaluated only once; once evaluated, the value of a stream element is saved so that the element will not be re-evaluated if it is accessed a second time. Streams and stream elements are never mutated; all functions involving streams are purely applicative. Errors are not required to be signalled, as in R5RS section 1.3.2, although implementations are encouraged to detect and report errors.
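
    Both properties, finite enumeration of a prefix and evaluation at most once, can be illustrated with a short sketch. It uses promise-based stand-ins (lazy-cons, lazy-car, lazy-cdr) rather than the stream type specified here; these names, along with squares-from, prefix and evaluations, are hypothetical and exist only for the illustration.

```scheme
(define evaluations 0)

;; Promise-based stand-ins for stream pairs (assumptions for this sketch).
(define-syntax lazy-cons
  (syntax-rules ()
    ((lazy-cons obj strm) (delay (cons obj strm)))))
(define (lazy-car strm) (car (force strm)))
(define (lazy-cdr strm) (cdr (force strm)))

;; Infinite stream of squares; counts how many elements are computed.
(define (squares-from n)
  (delay (force
    (lazy-cons (begin (set! evaluations (+ evaluations 1))
                      (* n n))
               (squares-from (+ n 1))))))

;; Collects the first k elements of a stream into a list.
(define (prefix k strm)
  (if (zero? k)
      '()
      (cons (lazy-car strm)
            (prefix (- k 1) (lazy-cdr strm)))))

(define squares (squares-from 1))

(define first-three (prefix 3 squares))   ; (1 4 9)
(define after-first evaluations)          ; 3

;; Walking the same prefix again reuses the saved values.
(define again (prefix 3 squares))         ; (1 4 9)
(define after-second evaluations)         ; still 3
```

    Only the three requested elements are ever computed, and the second traversal finds them already saved.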

    stream-null (constant)
    Stream-null is the distinguished nil stream, a single Scheme object distinguishable from all other objects. If the last stream-pair in a stream contains stream-null in its cdr field, the stream is finite and has a computable length. However, there is no need for streams to terminate.
        stream-null                                 => (stream)
    
    (stream-cons object stream) (syntax)
    Stream-cons is the primitive constructor of streams, returning a stream with the given object in its car field and the given stream in its cdr field. The stream returned by stream-cons must be different (in the sense of eqv?) from every other Scheme object. The object may be of any type, and there is no requirement that successive elements of a stream be of the same type, although it is common for them to be. It is an error if the second argument of stream-cons is not a stream.
        (stream-cons 'a stream-null)                => (stream 'a)
        (stream-cons 'a (stream 'b 'c 'd))          => (stream 'a 'b 'c 'd)
        (stream-cons "a" (stream 'b 'c))            => (stream "a" 'b 'c)
        (stream-cons 'a 3)                          => error
        (stream-cons (stream 'a 'b) (stream 'c))    => (stream (stream 'a 'b) 'c)
    
    (stream? object) (function)
    Stream? returns #t if the object is a stream, and otherwise returns #f. A stream object may be either the null stream or a stream pair created by stream-cons.
        (stream? stream-null)                       => #t
        (stream? (stream-cons 'a stream-null))      => #t
        (stream? 3)                                 => #f
    
    (stream-null? object) (function)
    Stream-null? returns #t if the object is the distinguished nil stream, and otherwise returns #f.
        (stream-null? stream-null)                  => #t
        (stream-null? (stream-cons 'a stream-null)) => #f
        (stream-null? 3)                            => #f

    (stream-pair? object) (function)
    Stream-pair? returns #t if the object is a stream pair created by stream-cons, and otherwise returns #f.
        (stream-pair? stream-null)                  => #f
        (stream-pair? (stream-cons 'a stream-null)) => #t
        (stream-pair? 3)                            => #f
    
    (stream-car stream) (function)
    Stream-car returns the object in the stream-car field of a stream-pair. It is an error to attempt to evaluate the stream-car of stream-null.
        (stream-car (stream 'a 'b 'c))              => a
        (stream-car stream-null)                    => error
        (stream-car 3)                              => error
    
    (stream-cdr stream) (function)
    Stream-cdr returns the stream in the stream-cdr field of a stream-pair. It is an error to attempt to evaluate the stream-cdr of stream-null.
        (stream-cdr (stream 'a 'b 'c))              => (stream 'b 'c)
        (stream-cdr stream-null)                    => error
        (stream-cdr 3)                              => error
    
    
    (stream-delay expression) (syntax)
    Stream-delay is the essential mechanism for operating on streams, taking an expression and returning a delayed form of the expression that can be asked at some future point to evaluate the expression and return the resulting value. The action of stream-delay is analogous to the action of delay, but it is specific to the stream data type, returning a stream instead of a promise; no corresponding stream-force is required, because each of the stream functions performs the force implicitly.
        (define from0
          (let loop ((x 0))
            (stream-delay
              (stream-cons x (loop (+ x 1))))))
        from0                                       => (stream 0 1 2 3 4 5 6 ...)
    
    (stream object ...) (library function)
    Stream returns a newly allocated finite stream of its arguments, in order.
        (stream 'a (+ 3 4) 'c)                      => (stream 'a 7 'c)
        (stream)                                    => stream-null
    
    (stream-unfoldn generator seed n) (function)
    Stream-unfoldn returns n streams whose contents are produced by successive calls to generator, which takes the current seed as an argument and returns n + 1 values, the new seed followed by one result for each of the n streams:

    (generator seed) -> seed result0 ... result(n-1)

    where resultI indicates how to produce the next element of the Ith result stream:

    (value)  value is the next car of this result stream
    #f       no new information for this result stream
    ()       the end of this result stream has been reached

    Note that getting the next element in any particular result stream may require multiple calls to generator.
        (define (take5 s)
          (stream-unfoldn
            (lambda (x)
              (let ((n (car x)) (s (cdr x)))
                (if (zero? n)
                    (values 'dummy '())
                    (values
                      (cons (- n 1) (stream-cdr s))
                      (list (stream-car s))))))
            (cons 5 s)
            1))
        (take5 from0)                              => (stream 0 1 2 3 4)
    
    (stream-map function stream ...) (library function)
    Stream-map creates a newly allocated stream built by applying function elementwise to the elements of the streams. The function must take as many arguments as there are streams and return a single value (not multiple values). The stream returned by stream-map is finite if the given stream is finite, and infinite if the given stream is infinite. If more than one stream is given, stream-map terminates when any of them terminate, or is infinite if all the streams are infinite. The stream elements are evaluated in order.
        (stream-map (lambda (x) (+ x x)) from0)      => (stream 0 2 4 6 8 10 ...)
        (stream-map + (stream 1 2 3) (stream 4 5 6)) => (stream 5 7 9)
        (stream-map (lambda (x) (expt x x))
          (stream 1 2 3 4 5))                        => (stream 1 4 27 256 3125)
    
    (stream-for-each procedure stream ...) (library function)
    Stream-for-each applies procedure elementwise to the elements of the streams, calling the procedure for its side effects rather than for its values. The procedure must take as many arguments as there are streams. The value returned by stream-for-each is unspecified. The stream elements are visited in order.
        (stream-for-each display from0)             => no value, prints 01234 ...
    
    (stream-filter predicate? stream) (library function)
    Stream-filter applies predicate? to each element of stream and creates a newly allocated stream consisting of those elements of the given stream for which predicate? returns a non-#f value. Elements of the output stream are in the same order as they were in the input stream, and are tested by predicate? in order.
        (stream-filter odd? stream-null)            => stream-null
        (take5 (stream-filter odd? from0))          => (stream 1 3 5 7 9)
    

    Implementation

    A reference implementation of streams is shown below. It strongly prefers simplicity and clarity to efficiency, and though a reasonable attempt is made to be safe-for-space, no promises are made. The reference implementation relies on the record types of SRFI-9. Implementations may instead use whatever mechanism the native Scheme system uses to create new types. The stream-error function aborts by calling (car '()) and should be rewritten to call the native error handler. All identifiers defined in the reference implementation not appearing in the specification section of this document should not be considered part of the API.

    Copyright

    Copyright (C) 2003 by Philip L. Bewig of Saint Louis, Missouri, United States of America. All rights reserved.

    This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Scheme Request For Implementation process or editors, except as needed for the purpose of developing SRFIs in which case the procedures for copyrights defined in the SRFI process must be followed, or as required to translate it into languages other than English.

    The limited permissions granted above are perpetual and will not be revoked by the authors or their successors or assigns.

    This document and the information contained herein is provided on an "AS IS" basis and THE AUTHOR AND THE SRFI EDITORS DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


    Editor: Mike Sperber
    Last modified: Tue Aug 31 20:16:03 MST 2004