mailto:srfi-68@srfi.schemers.org. See instructions here to subscribe to the list. You can access previous messages via the archive of the mailing list.
This SRFI defines a comprehensive I/O subsystem for Scheme with three layers, where each layer is built on top of the one below it:
The layer architecture is similar to the upper three layers of the I/O subsystem in The Standard ML Basis Library.
In particular, the subsystem provides
The subsystem does not provide
However, all of these could be added on top of one or several of the layers specified in this SRFI.
The I/O subsystem in R5RS is too limited in many respects: It only provides for text I/O, it only allows reading at the character and the datum level, and some of the primitives are mis-designed or underspecified. As a result, almost every Scheme implementation has its own extensions to the I/O system, and rarely are two of these extensions compatible.
This SRFI is meant as a compelling replacement for the R5RS I/O subsystem. As such, it is a completely new design, and it is not based on the extensions a particular existing Scheme system provides. (In fact, it is probably, in its entirety, unlike what any existing Scheme system provides.) Moreover, it is meant to be a substrate for further extensions which can be built on top of the subsystem via the interface described here.
The design of this SRFI is driven by the requirements mentioned in the abstract on the one hand, and by the excellent design of the I/O subsystem in the Standard ML Basis Library on the other. The latter is also the reason why this SRFI is different from the extensions provided by any existing Scheme implementation, as none of them picked up on the Basis design, and because the Basis design seems superior to the extensions I have looked at. (Among those I have looked at are Scheme 48, scsh, Gambit-C, and PLT Scheme.)
Note, however, that this SRFI differs from the SML Basis in several important respects, among them the handling of textual I/O streams, the ability to define translated streams, and the absence of any functionality related to non-blocking I/O. The last of these is more properly in the domain of a thread/event system; Concurrent ML shows that the SML Basis plays well with such a system, and I expect the same to hold true here. The text encoding/translation functionality is different mainly because it builds on a different substrate for representing text (based on Unicode; see below) than Standard ML does.
Like the Standard ML Basis I/O subsystem, the I/O system specified in this SRFI is probably not suitable for maximal-throughput I/O, chiefly because it does not re-use buffers, and because the buffer objects are byte vectors, with no further constraints on alignment or GC behavior. However, I deemed the achievable performance as more than adequate for most applications---it seemed a small price to pay for the convenient programming model.
This SRFI is meant for Scheme implementations with the following prerequisites:
This SRFI assumes that the char datatype in Scheme corresponds to Unicode scalar values. This, in turn, means that strings are represented as vectors of scalar values. (Note that this is consistent with SRFI 14 (Character-set library).) It may be possible to make this SRFI work in an ASCII- or Latin-1-only system, but I have not made any special provisions to ensure this.
Some of the procedures described here accept a file name, filename, as an argument. Valid values for such a filename include strings naming a file using the native notation of the operating system the Scheme implementation happens to be running on.
It is expected that a future SRFI will extend this set of values by a more abstract representation: This is necessary, as the most common operating systems do not really use strings for representing filenames, but rather byte sequences. Moreover, the string notation is difficult to manipulate and not very portable.
For procedures that have no "natural" return value, this SRFI often uses the sentence
The return values are unspecified.
This means that the number of return values and the return values themselves are unspecified. However, the number of return values is such that it is accepted by a continuation created by begin. Specifically, on Scheme implementations where continuations created by begin accept an arbitrary number of arguments (this includes most implementations), it is suggested that the procedure return zero return values.
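As a small illustration of this convention, the following sketch shows a procedure that returns zero values being called for effect inside begin. It is written against R7RS-small, where the non-final expressions of begin accept any number of values; the names counter and note-progress! are hypothetical and appear nowhere in this SRFI.

```scheme
(import (scheme base))

;; Hypothetical procedure with no "natural" return value.
(define counter 0)
(define (note-progress!)
  (set! counter (+ counter 1))
  (values))                      ; return zero values

;; The non-final expressions of begin discard whatever is returned,
;; so a zero-value return is accepted here.
(define result
  (begin
    (note-progress!)
    (note-progress!)
    counter))
```

A caller that tried to bind the result of note-progress! to a variable would be relying on unspecified behavior, which is exactly what the sentence above is meant to discourage.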
The I/O subsystem consists of three layers. Each layer can function independently from those above it. Moreover, each layer can be used without referring directly to the ones below it. Therefore, a Scheme implementation with a module system should offer each layer as an independent module. Moreover, some data extensions are common to all three I/O layers---specifically, the common I/O condition types.
The I/O condition type hierarchy here is similar, but not identical, to the one described in SRFI 36 (I/O Conditions).
The following list depicts the I/O condition hierarchy; more detailed explanations of the condition types follow.
&error
  &i/o-error
    &i/o-read-error
    &i/o-write-error
    &i/o-closed-error
    &i/o-invalid-position-error
    &i/o-filename-error (has a filename field)
      &i/o-malformed-filename-error
      &i/o-file-protection-error
        &i/o-file-is-read-only-error
      &i/o-file-already-exists-error
      &i/o-no-such-file-error
In exceptional situations not described as "it is an error", the procedures described in the specification below will raise an &i/o-error
condition object. Except where explicitly specified, there is no guarantee that the raised condition object will contain all the information that would be applicable. It is recommended, however, that an implementation of this SRFI provide all information about an exceptional situation in the condition object that is available at the place where it is detected.
(define-condition-type &i/o-error &error i/o-error?)
This is a supertype for a set of more specific I/O errors.
(define-condition-type &i/o-read-error &i/o-error i/o-read-error?)
This condition type specifies a read error that occurred during an I/O operation.
(define-condition-type &i/o-write-error &i/o-error i/o-write-error?)
This condition type specifies a write error that occurred during an I/O operation.
(define-condition-type &i/o-invalid-position-error &i/o-error i/o-invalid-position-error? (position i/o-error-position))
This condition type specifies that an attempt to set the file position specified an invalid position.
(define-condition-type &i/o-closed-error &i/o-error i/o-closed-error?)
A condition of this type specifies that an operation tried to operate on a closed I/O object under the assumption that it is open.
(define-condition-type &i/o-filename-error &i/o-error i/o-filename-error? (filename i/o-error-filename))
This condition type specifies an I/O error that occurred during an operation on a named file. Condition objects belonging to this type must specify a file name in the filename
field.
(define-condition-type &i/o-malformed-filename-error &i/o-filename-error i/o-malformed-filename-error?)
(define-condition-type &i/o-operation-error &i/o-error i/o-operation-error? (operation i/o-error-operation))
This condition type specifies an I/O error that occurred during a specific operation. Condition objects belonging to this type must specify the procedure that was called to perform the operation in the operation
field.
(define-condition-type &i/o-operation-not-available-error &i/o-operation-error i/o-operation-not-available-error?)
(define-condition-type &i/o-file-protection-error &i/o-filename-error i/o-file-protection-error?)
A condition of this type specifies that an operation tried to operate on a named file with insufficient access rights.
(define-condition-type &i/o-file-is-read-only-error &i/o-file-protection-error i/o-file-is-read-only-error?)
A condition of this type specifies that an operation tried to operate on a named read-only file under the assumption that it is writeable.
(define-condition-type &i/o-file-already-exists-error &i/o-filename-error i/o-file-already-exists-error?)
A condition of this type specifies that an operation tried to operate on an existing named file under the assumption that it does not exist.
(define-condition-type &i/o-file-exists-not-error &i/o-filename-error i/o-file-exists-not-error?)
A condition of this type specifies that an operation tried to operate on a non-existent named file under the assumption that it exists.
The Primitive I/O layer is an abstraction of the low-level I/O system calls commonly available on file descriptors: Streams and ports from the upper layers of the I/O system always perform access through the abstractions provided by this layer. The objects representing I/O descriptors are called readers for input and writers for output. They are unbuffered and operate purely on binary data.
This layer only specifies a fairly small set of operations --- a subset of the Standard ML Basis PRIM_IO
signature. Specifically, all functionality related to non-blocking I/O or polling is missing here. This is intentional, as this functionality should be integrated with the threads system of the underlying implementation, and is thus outside the scope of this (already large) SRFI. Instead, it is expected that the set of operations available on primitive I/O readers and writers will be augmented by future specifications, as will be the available constructors for these objects.
The Primitive I/O layer introduces one condition type of its own.
(define-condition-type &i/o-reader/writer-error &i/o-error i/o-reader/writer-error? (reader/writer i/o-error-reader/writer))
This condition type allows specifying with what particular reader or writer an I/O error is associated. The reader/writer
field has a purely informational purpose. Conditions raised by Primitive I/O procedures may include an &i/o-reader/writer-error
condition, but are not required to do so.
A reader object typically stands for a file or device descriptor, but can also represent the output of some algorithm, such as in the case of string readers. The sequence of bytes represented is potentially unbounded, and is punctuated by end of file elements.
(reader?
obj)
Returns #t
if obj is a reader, otherwise returns #f
.
(make-simple-reader
id descriptor chunk-size read-bytes! available-bytes get-position set-position! end-position close)
Returns a reader object. Id is a string naming the reader, provided for informational purposes only. For a file, this will be a string representation of the file name. Descriptor is supposed to be the low-level object connected to the reader, such as the OS-level file descriptor or the source string in the case of a string reader.
Chunk-size must be a positive exact integer, and is the recommended efficient size of the read operations on this reader. This is typically the block size of the buffers of the operating system. Note that this is just a recommendation --- calls to the read-bytes! procedure (see below) will not necessarily use it. A value of 1 represents a recommendation to use unbuffered reads.
The remaining arguments are procedures --- get-position, set-position!, and end-position may be omitted, in which case the corresponding arguments must be #f
.
(read-bytes!
bytes start count)
Start and count must be non-negative exact integers. This reads up to count bytes from the reader and writes them into bytes, which must be a byte vector, starting at index start. Bytes must have at least start + count elements. This procedure returns the number of bytes read as an exact integer. It returns 0 if it encounters an end of file, or if count is 0. This procedure blocks until at least one byte has been read or it has encountered end of file.
(available-bytes
)
This returns an estimate of the total number of available bytes left in the stream. The return value is either an exact integer, or #f
if no such estimate is possible. There is no guarantee that this estimate will have any specific relationship to the true number of available bytes.
(get-position
)
This procedure, when present, returns the current position in the byte stream as an exact integer counting the number of bytes since the beginning of the stream.
(set-position!
pos)
This procedure, when present, moves to position pos (which must be a non-negative exact integer) in the stream.
(end-position
)
This procedure, when present, returns the position in the byte stream of the next end of file, without changing the current position.
(close
)
This procedure marks the reader as closed, performs any necessary cleanup, and releases the resources associated with the reader. Further operations on the reader may signal an error.
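The contracts above can be sketched for a reader backed by an in-memory byte vector. This is a minimal illustration in R7RS-small, not the reference implementation: it assumes R7RS bytevectors stand in for this SRFI's byte vectors, uses plain error where an implementation would presumably raise &i/o-invalid-position-error, and the name make-byte-vector-reader-procs is hypothetical.

```scheme
(import (scheme base))

;; Builds the six procedures in the order make-simple-reader expects:
;; read-bytes! available-bytes get-position set-position! end-position close
(define (make-byte-vector-reader-procs source)
  (let ((pos 0)
        (size (bytevector-length source)))
    (define (read-bytes! bytes start count)
      ;; Copy at most COUNT bytes; 0 signals end of file (or COUNT = 0).
      (let ((n (min count (- size pos))))
        (bytevector-copy! bytes start source pos (+ pos n))
        (set! pos (+ pos n))
        n))
    (define (available-bytes) (- size pos))
    (define (get-position) pos)
    (define (set-position! new-pos)
      ;; Assumption: plain error stands in for the SRFI condition type.
      (if (<= 0 new-pos size)
          (set! pos new-pos)
          (error "invalid position" new-pos)))
    (define (end-position) size)
    (define (close) (values))    ; nothing to release for in-memory data
    (list read-bytes! available-bytes get-position
          set-position! end-position close)))

(define procs (make-byte-vector-reader-procs (bytevector 1 2 3 4 5)))
(define read-bytes! (list-ref procs 0))
(define get-position (list-ref procs 2))
(define buffer (make-bytevector 4 0))
(define n1 (read-bytes! buffer 0 4))   ; 4 bytes read
(define n2 (read-bytes! buffer 0 4))   ; the remaining 1 byte
(define n3 (read-bytes! buffer 0 4))   ; 0: end of file
```

On a system that implements this SRFI, the six procedures could be passed to make-simple-reader together with an id, a descriptor, and a chunk size.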
(reader-id
reader)
This returns the value of the id field of the argument reader.
(reader-descriptor
reader)
This returns the value of the descriptor field of the argument reader.
(reader-chunk-size
reader)
This returns the value of the chunk-size field of the argument reader.
(reader-read-bytes!
reader bytes start count)
This calls the read-bytes! procedure of reader with the remaining arguments.
(reader-available-bytes
reader)
This calls the available-bytes procedure of reader.
(reader-has-get-position?
reader)
This returns #t
if reader has a get-position procedure, and #f
otherwise.
(reader-get-position
reader)
This calls the get-position procedure of reader, if present. It is an error to call this procedure if reader does not have a get-position procedure.
(reader-has-set-position!?
reader)
This returns #t
if reader has a set-position! procedure, and #f
otherwise.
(reader-set-position!
reader pos)
This calls the set-position! procedure of reader with the pos argument, if present. It is an error to call this procedure if reader does not have a set-position! procedure.
(reader-has-end-position?
reader)
This returns #t
if reader has an end-position procedure, and #f
otherwise.
(reader-end-position
reader)
This calls the end-position procedure of reader, if present. It is an error to call this procedure if reader does not have an end-position procedure.
(open-byte-vector-reader
bytes)
This returns a reader that uses bytes, a byte vector, as its contents. This reader has get-position, set-position!, and end-position operations.
(open-file-reader
filename)
This returns a reader connected to the file named by filename. This reader may or may not have get-position, set-position!, and end-position operations.
(standard-input-reader
)
This returns a reader connected to the standard input. The meaning of "standard input" is implementation-dependent.
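A typical use of the reader operations is a loop that reads chunk-size bytes at a time until read-bytes! returns 0. The sketch below is written in R7RS-small so that it runs without an implementation of this SRFI: read-all takes a read-bytes!-style procedure directly, and make-trickle-source is a hypothetical stand-in source that hands out at most two bytes per call, to show that short reads are handled.

```scheme
(import (scheme base))

;; Reads until end of file using a read-bytes!-style procedure and
;; returns everything as one byte vector.
(define (read-all read-bytes! chunk-size)
  (let ((buffer (make-bytevector chunk-size 0)))
    (let loop ((chunks '()))
      (let ((n (read-bytes! buffer 0 chunk-size)))
        (if (= n 0)
            ;; 0 means end of file, since chunk-size is positive.
            (apply bytevector-append (reverse chunks))
            ;; Copy, because the buffer is overwritten by the next read.
            (loop (cons (bytevector-copy buffer 0 n) chunks)))))))

;; Hypothetical stand-in source: at most two bytes per call.
(define (make-trickle-source source)
  (let ((pos 0))
    (lambda (bytes start count)
      (let ((n (min count 2 (- (bytevector-length source) pos))))
        (bytevector-copy! bytes start source pos (+ pos n))
        (set! pos (+ pos n))
        n))))

(define all-bytes
  (read-all (make-trickle-source (bytevector 1 2 3 4 5)) 4))
```

With an actual reader r, the same loop would presumably be called as (read-all (lambda (bytes start count) (reader-read-bytes! r bytes start count)) (reader-chunk-size r)).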
A writer object typically stands for a file or device descriptor, but can also represent the sink for the output of some algorithm, such as in the case of string writers.
(make-simple-writer
id descriptor chunk-size write-bytes! get-position set-position! end-position close)
Returns a writer object. Id is a string naming the writer, provided for informational purposes only. For a file, this will be a string representation of the file name. Descriptor is supposed to be the low-level object connected to the writer, such as the OS-level file descriptor.
Chunk-size must be a positive exact integer, and is the recommended efficient size of the write operations on this writer. This is typically the block size of the buffers of the operating system. Note that this is just a recommendation --- calls to the write-bytes! procedure (see below) will not necessarily use it. A value of 1 represents a recommendation to use unbuffered writes.
The remaining arguments are procedures --- get-position, set-position!, and end-position may be omitted, in which case the corresponding arguments must be #f
.
(write-bytes!
bytes start count)
Start and count must be non-negative exact integers. This writes up to count bytes in byte-vector bytes starting at index start. Before it does this, it will block until it can write at least one byte. It returns the number of bytes actually written as a positive exact integer.
(get-position
)
This procedure, when present, returns the current position in the byte stream as an exact integer counting the number of bytes since the beginning of the stream.
(set-position!
pos)
This procedure, when present, moves to position pos (which must be a non-negative exact integer) in the stream.
(end-position
)
This procedure, when present, returns the position in the byte stream of the next end of file, without changing the current position.
(close
)
This procedure marks the writer as closed, performs any necessary cleanup, and releases the resources associated with the writer. Further operations on the writer may signal an error.
(writer-id
writer)
This returns the value of the id field of the argument writer.
(writer-descriptor
writer)
This returns the value of the descriptor field of the argument writer.
(writer-chunk-size
writer)
This returns the value of the chunk-size field of the argument writer.
(writer-write-bytes!
writer bytes start count)
This calls the write-bytes! procedure of writer with the remaining arguments.
(writer-has-get-position?
writer)
This returns #t
if writer has a get-position procedure, and #f
otherwise.
(writer-get-position
writer)
This calls the get-position procedure of writer, if present. It is an error to call this procedure if writer does not have a get-position procedure.
(writer-has-set-position!?
writer)
This returns #t
if writer has a set-position! procedure, and #f
otherwise.
(writer-set-position!
writer pos)
This calls the set-position! procedure of writer with the pos argument, if present. It is an error to call this procedure if writer does not have a set-position! procedure.
(writer-has-end-position?
writer)
This returns #t
if writer has an end-position procedure, and #f
otherwise.
(writer-end-position
writer)
This calls the end-position procedure of writer, if present. It is an error to call this procedure if writer does not have an end-position procedure.
(open-byte-vector-writer
)
This returns a writer that can yield everything written to it as a byte vector. This writer has get-position, set-position!, and end-position operations.
(writer-byte-vector
writer)
The writer argument must be a byte-vector writer returned by open-byte-vector-writer
. This procedure returns a newly allocated byte vector containing the data written to writer in sequence. Doing this in no way invalidates the writer or changes its store.
(open-file-writer
filename)
This returns a writer connected to the file named by filename. The named file is created if it does not exist already, and truncated to zero length otherwise. This writer may or may not have get-position, set-position!, and end-position operations.
(open-file-writer/append
filename)
This returns a writer connected to the file named by filename. The named file is created if it does not exist already. If it does exist, this sets the current position to the end of the file. This writer may or may not have get-position, set-position!, and end-position operations.
(standard-output-writer
)
This returns a writer connected to the standard output. The meaning of "standard output" is implementation-dependent.
(standard-error-writer
)
This returns a writer connected to the standard error. The meaning of "standard error" is implementation-dependent.
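Because write-bytes! writes up to count bytes and reports how many it actually wrote, code that needs a complete write has to loop. The following R7RS-small sketch shows one way to do that. The names write-all, short-write!, and accepted are hypothetical; short-write! is a stand-in sink that accepts at most three bytes per call.

```scheme
(import (scheme base))

;; Keeps calling a write-bytes!-style procedure until all of BYTES
;; has been accepted.
(define (write-all write-bytes! bytes)
  (let ((size (bytevector-length bytes)))
    (let loop ((start 0))
      (when (< start size)
        ;; write-bytes! returns how many bytes it accepted (at least one).
        (loop (+ start (write-bytes! bytes start (- size start))))))))

;; Hypothetical stand-in sink: accepts at most three bytes per call
;; and records each accepted chunk, most recent first.
(define accepted '())
(define (short-write! bytes start count)
  (let ((n (min count 3)))
    (set! accepted
          (cons (bytevector-copy bytes start (+ start n)) accepted))
    n))

(write-all short-write! (bytevector 1 2 3 4 5 6 7))
(define written (apply bytevector-append (reverse accepted)))
```

With an actual writer w, the first argument would presumably be (lambda (bytes start count) (writer-write-bytes! w bytes start count)).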
The Stream I/O layer defines high-level I/O operations on two new datatypes: input streams and output streams. These operations include binary and textual I/O. Input streams are treated in lazy functional style: input from a stream s yields an object representing the input itself, and a new input stream s1. S will continue to represent exactly the same position within the input; to advance within the stream, the program needs to perform input from s1. Consequently, input streams allow arbitrary lookahead, which is especially convenient for all kinds of scanning.
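The lazy functional style can be modelled in a few lines. The sketch below is only a model written in R7RS-small, not the implementation this SRFI has in mind: an input stream is represented as a promise, and byte-vector->model-stream and model-input-byte are hypothetical names. It shows that reading from s a second time yields the same byte again, which is what makes arbitrary lookahead possible.

```scheme
(import (scheme base) (scheme lazy))

;; Model: an input stream is a promise for a pair
;; (byte-or-#f . next-stream), where #f stands for end of file.
(define (byte-vector->model-stream bytes)
  (let loop ((i 0))
    (delay
      (if (< i (bytevector-length bytes))
          (cons (bytevector-u8-ref bytes i) (loop (+ i 1)))
          (cons #f (loop i))))))      ; end of file, repeated forever

;; Counterpart of input-byte: returns a value and the follow-on stream.
(define (model-input-byte stream)
  (let ((cell (force stream)))
    (values (car cell) (cdr cell))))

(define s (byte-vector->model-stream (bytevector 10 20)))

(define observed
  (let*-values (((b1 s1) (model-input-byte s))
                ((b2 s2) (model-input-byte s1))
                ((again ignored) (model-input-byte s)))  ; re-read from s
    (list b1 b2 again)))              ; b1 and again are both 10
```

Because promises memoize their value, the underlying data is consumed only once even if a stream value is read from repeatedly.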
Output streams are more conventional, as the lazy functional style does not make sense with output.
Both input streams and output streams are either directly connected to underlying readers and writers, or are defined by translation on an underlying stream. This makes it possible to perform trivial transformations such as CR/LF translation, but also transparent recoding on the streams.
Textual I/O always uses UTF-8 as the underlying encoding. Other encodings can easily be supported by translating to or from UTF-8 using the translation framework. If a decoding error occurs, the implicit decoder will skip the byte starting the character encoding, yield a ? character, and attempt to continue decoding after that initial byte.
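The error-recovery rule for decoding can be made concrete with a small decoder. The following is a sketch in R7RS-small under stated assumptions: it decodes a complete byte vector rather than a stream, treats a sequence cut off at the end of the data as malformed (a stream decoder would presumably wait for more input instead), and assumes an implementation whose chars cover all Unicode scalar values. The name decode-utf8-lenient is hypothetical.

```scheme
(import (scheme base))

(define (decode-utf8-lenient bytes)
  (define size (bytevector-length bytes))
  ;; Decodes the LEN-byte sequence starting at I. Returns the char, or
  ;; #f if the sequence is truncated, has a bad continuation byte, is
  ;; overlong, or encodes a surrogate or out-of-range value.
  (define (decode-at i len offset minimum)
    (and (<= (+ i len) size)
         (let next ((k 1) (acc (- (bytevector-u8-ref bytes i) offset)))
           (if (= k len)
               (and (>= acc minimum)
                    (<= acc #x10FFFF)
                    (not (<= #xD800 acc #xDFFF))
                    (integer->char acc))
               (let ((b (bytevector-u8-ref bytes (+ i k))))
                 (and (<= #x80 b #xBF)
                      (next (+ k 1) (+ (* acc 64) (- b #x80)))))))))
  (let loop ((i 0) (chars '()))
    (if (= i size)
        (list->string (reverse chars))
        (let* ((b (bytevector-u8-ref bytes i))
               (len (cond ((< b #x80) 1)
                          ((<= #xC2 b #xDF) 2)
                          ((<= #xE0 b #xEF) 3)
                          ((<= #xF0 b #xF4) 4)
                          (else #f)))
               (ch (and len
                        (decode-at i len
                                   (vector-ref #(0 0 #xC0 #xE0 #xF0) len)
                                   (vector-ref #(0 0 #x80 #x800 #x10000) len)))))
          (if ch
              (loop (+ i len) (cons ch chars))
              ;; Malformed: yield ?, skip only the initial byte, and
              ;; resume decoding at the byte right after it.
              (loop (+ i 1) (cons #\? chars)))))))

(define decoded (decode-utf8-lenient (bytevector 104 255 105)))  ; "h?i"
```

Skipping only the initial byte means that a valid character directly following a bad lead byte is still decoded, as in the byte sequence 195 65, which this sketch turns into "?A".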
The Stream I/O layer adds an additional condition type:
(define-condition-type &i/o-stream-error &i/o-error i/o-stream-error? (stream i/o-error-stream))
The stream
field has a purely informational purpose. Conditions raised by Stream I/O procedures may include an &i/o-stream-error
condition, but are not required to do so.
Input streams come in two flavors: either directly based on a reader, or based on another input stream via translation. Input streams are in one of three states: active, truncated, or closed. When initially created, a stream is active. A program can retrieve the reader underlying an input stream; doing so automatically disconnects the stream from the reader and puts the stream into the truncated state. When explicitly closed, the reader underlying an open input stream is closed as well. The closed state implies the truncated state.
Reading from a truncated stream is not an error; after all the existing buffers have been exhausted, the stream behaves as if an infinite sequence of end of files followed.
(input-stream?
obj)
This returns #t
if the argument is an input stream, #f
otherwise.
(input-bytes
input-stream)
This returns two values: a byte vector and another input stream. If any data is available before the next end of file, this returns a freshly allocated byte vector of non-zero size containing that data. If an end of file has been reached, the byte vector is empty, and the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-byte
input-stream)
This returns two values: a value and another input stream. If a byte is available before the next end of file, this returns that byte as an exact integer. If an end of file has been reached, the value is #f
, and the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-bytes-n
input-stream n)
N must be an exact, non-negative integer, specifying the number of bytes to be read. This returns two values: a byte vector and another input stream. It tries to read n bytes. If n or more bytes are available before the next end of file, it returns a byte vector of size n. If fewer bytes are available before the next end of file, it returns a byte vector containing those bytes. If end of file has been reached, the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-bytes-all
input-stream)
This returns two values: a byte vector and another input stream. The byte vector contains all bytes until the next end of file. The input stream returned points just past the end of file. Note that this function may block indefinitely on streams connected to interactive readers.
(input-string
input-stream)
This returns two values: a string and another input stream. If any data is available before the next end of file, this returns a string of non-zero size containing the UTF-8 decoding of that data. If an end of file has been reached, the string is empty, and the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-char
input-stream)
This returns two values: a value and another input stream. If a char is available before the next end of file, this returns that char. If an end of file has been reached, the value is #f
, and the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-string-n
input-stream n)
N must be an exact, non-negative integer, specifying the number of chars to be read. This returns two values: a string and another input stream. It tries to read n chars. If n or more chars are available before the next end of file, it returns a string of size n. If fewer chars are available before the next end of file, it returns a string containing those chars. If end of file has been reached, the input stream returned points just past the end of file. This procedure will block until either data is available or end of file is reached.
(input-string-all
input-stream)
This returns two values: a string and another input stream. The string contains all text until the next end of file. The input stream returned points just past the end of file. Note that this function may block indefinitely on streams connected to interactive readers.
(input-line
input-stream)
This returns two values: a value and another input stream. If data is available before the next newline char, the value is a string that contains all text until the newline char. The input stream returned points just past the newline char. If end of file has been reached, the value is #f
.
(end-of-stream?
input-stream)
This returns #t if the stream has reached end of file, #f otherwise.
(input-stream-position
input-stream)
For reader-based input streams, this returns the reader position corresponding to the next byte read from the input stream. This procedure raises an &i/o-operation-not-available-error
condition if the stream does not support the operation. It is an error to apply this procedure to a truncated or closed stream, or to a translated stream.
(input-stream-reader+constructor
input-stream)
Input-stream must be an open input stream. This returns two values: a reader and a procedure of one argument. The reader is the underlying reader of the stream at the end of the chain of translations whose head is input-stream. The procedure consumes a reader as its argument and returns a fresh input stream with the same chain of translations as input-stream. This also disconnects the input stream from the reader and puts it into the truncated state; all other input streams based on the input stream at the end of the translation chain (directly or indirectly) are also put into the truncated state.
(close-input-stream
input-stream)
This closes the underlying reader if input-stream is still open, and marks the input stream as closed. Applying close-input-stream
to a closed stream has no effect. Closing an input stream also closes all input streams that are translations (directly or indirectly) of the input stream at the end of its own translation chain. The return values are unspecified.
(open-file-input-stream
filename)
This opens a reader connected to the file named by filename and returns an input stream connected to it.
(open-byte-vector-input-stream
bytes)
This opens a byte-vector reader connected to bytes and returns an input stream connected to it.
(open-string-input-stream
string)
This opens a byte-vector reader connected to the UTF-8 encoding of string and returns an input stream connected to it.
(open-reader-input-stream
reader)
This returns an input stream connected to the reader reader.
(call-with-input-stream
input-stream proc)
This calls proc with input-stream as an argument. If proc returns, then the stream is closed automatically and the values returned by proc are returned. If proc does not return, then the stream will not be closed automatically, unless it is possible to prove that the stream will never again be used for a read operation.
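The behavior on a normal return can be sketched generically. The following R7RS-small code is an illustration, not the definition of call-with-input-stream: the hypothetical helper call-with-closing takes the closing procedure as an argument so that the sketch runs without an implementation of this SRFI, and it closes the resource only when proc returns normally.

```scheme
(import (scheme base))

;; CLOSE stands in for close-input-stream; PROC may return any number
;; of values, all of which are passed on to the caller.
(define (call-with-closing resource close proc)
  (call-with-values
      (lambda () (proc resource))
    (lambda results
      (close resource)             ; reached only if proc returned
      (apply values results))))

;; Hypothetical stand-ins: a symbol as the "stream" and a closing
;; procedure that merely records what it was asked to close.
(define closed '())
(define answer
  (call-with-values
      (lambda ()
        (call-with-closing 'fake-stream
                           (lambda (r) (set! closed (cons r closed)))
                           (lambda (r) (values 1 2))))
    list))
```

If proc escapes through a continuation or raises a condition, close is never called in this sketch, matching the statement that the stream is not closed automatically in that case.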
(make-translated-input-stream
input-stream translate-proc)
This returns a translated input stream based on input-stream. Translate-proc must be a procedure that adheres to the following specification:
(translate-proc
input-stream wish)
Input-stream is the underlying input stream originally passed to make-translated-input-stream
. Wish is either #f
, #t
, or an exact, non-negative integer, giving a hint how much data is requested. #f
means a chunk of arbitrary size, suggesting that the user program called input-bytes
, #t
means as much as possible, suggesting that the user program called input-bytes-all
, and an integer specifies the requested number of bytes. Note that translate-proc can ignore wish. The procedure must return two values, a byte vector, and another input stream, analogous to the various input-...
procedures. An empty byte vector denotes an end of file. The returned input stream points just past the data returned.
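A translate-proc that transforms each byte independently might look like the sketch below. It is written in R7RS-small under the assumption that R7RS bytevectors stand in for byte vectors. The names make-mapping-translate-proc, ascii-upcase, and chunk-list-input-bytes are hypothetical, and the procedure used to read from the underlying stream is passed in explicitly so that the example runs without an implementation of this SRFI. The sketch ignores wish, which the specification permits.

```scheme
(import (scheme base))

;; Returns a translate-proc that reads a chunk from the underlying
;; stream via INPUT-BYTES and applies MAP-BYTE to every byte.
(define (make-mapping-translate-proc input-bytes map-byte)
  (lambda (input-stream wish)
    ;; WISH is only a hint; this sketch ignores it.
    (let-values (((bytes next) (input-bytes input-stream)))
      (let* ((n (bytevector-length bytes))
             (out (make-bytevector n 0)))
        (do ((i 0 (+ i 1)))
            ((= i n))
          (bytevector-u8-set! out i (map-byte (bytevector-u8-ref bytes i))))
        ;; An empty chunk stays empty, so end of file is passed through.
        (values out next)))))

;; Maps ASCII lower-case letters to upper case, leaves other bytes alone.
(define (ascii-upcase b)
  (if (<= 97 b 122) (- b 32) b))

;; Hypothetical stand-in for input-bytes: a "stream" is a list of chunks.
(define (chunk-list-input-bytes chunks)
  (if (null? chunks)
      (values (bytevector) '())
      (values (car chunks) (cdr chunks))))

(define translate
  (make-mapping-translate-proc chunk-list-input-bytes ascii-upcase))

(define translated
  (let*-values (((c1 rest1)
                 (translate (list (bytevector 97 98) (bytevector 99)) #f))
                ((c2 rest2) (translate rest1 #f))
                ((c3 rest3) (translate rest2 #f)))
    (list c1 c2 c3)))
```

On a system implementing this SRFI, the procedure returned by (make-mapping-translate-proc input-bytes ascii-upcase) could presumably be passed as translate-proc to make-translated-input-stream. A translation such as CR/LF conversion would additionally need to handle sequences that straddle chunk boundaries.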
Output streams, like input streams, come in two flavors: either directly based on a writer, or based on another output stream via translation.
An output stream has an associated buffer mode that defines when an output operation will flush the buffer associated with the output stream. The possible buffer modes are none
for no buffering, line
for flushing upon newlines, and block
for block-based buffering.
Output streams are in one of three states: active, terminated, or closed. When initially created, a stream is active. A program can retrieve the writer underlying an output stream; doing so automatically disconnects the stream from the writer and puts the stream into the terminated state. When explicitly closed, the writer underlying an output stream enters the closed state. The closed state implies the terminated state.
It is an error to perform an output operation on a terminated stream.
(output-stream?
obj)
This returns #t
if the argument is an output stream, #f
otherwise.
(output-bytes
output-stream bytes)
This writes the bytes in byte vector bytes to the stream. The return values are unspecified.
(output-byte
output-stream byte)
This writes the byte byte (which must be an exact integer in the range [0,255]) to the stream. The return values are unspecified.
(output-bytes-n
output-stream bytes start count)
Start and count must be non-negative exact integers. This writes the count bytes in byte vector bytes starting at index start to the output stream. It is an error if the byte vector actually has size less than start + count. The return values are unspecified.
(output-char
output-stream char)
This writes the UTF-8 encoding of the char char to the stream. The return values are unspecified.
(output-string
output-stream string)
This writes the UTF-8 encoding of the string string to the stream. The return values are unspecified.
(output-string-n
output-stream string start count)
This writes the UTF-8 encoding of the substring (substring string start (+ start count))
to the stream. The return values are unspecified.
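Note that start and count index chars, not bytes, so the number of bytes written can exceed count. The following R7RS-small sketch computes the bytes that such a call would presumably write. It assumes an implementation with full Unicode strings, and the variable names are illustrative only.

```scheme
(import (scheme base))

;; "héllo", built from char literals to avoid source-encoding issues.
(define text (string #\h #\xE9 #\l #\l #\o))
(define start 1)
(define count 2)

;; The two chars at indices 1 and 2 are é and l; é needs two bytes in
;; UTF-8, so three bytes result.
(define encoded
  (string->utf8 (substring text start (+ start count))))
```

Here encoded is a byte vector of length 3 (195, 169, 108) even though count is 2.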
(flush-output-stream
output-stream)
This flushes any output from the buffer of output-stream to the underlying writer. It is a no-op if output-stream is terminated. The return values are unspecified.
(output-stream-position
output-stream)
For writer-based output streams, this returns the writer position corresponding to the next byte written to the output stream. This procedure raises an &i/o-operation-not-available-error
condition if the stream does not support the operation. It is an error to apply this procedure to a terminated or closed stream, or to a translated stream.
(set-output-stream-position!
output-stream pos)
Pos must be a non-negative exact integer. This flushes the output stream and sets the current position of the underlying writer to pos. This procedure raises an &i/o-operation-not-available-error
condition if the stream does not support the operation. It is an error to apply this procedure to a terminated or closed stream, or to a translated stream. The return values are unspecified.
(output-stream-writer+constructor
output-stream)
Output-stream must be an open output stream. This returns two values: a writer and a procedure of one argument. The writer is the underlying writer of the stream at the end of the chain of translations whose head is output-stream. The procedure consumes a writer as its argument and returns a fresh output stream with the same chain of translations as output-stream, where each translation is in the same state as in the chain. This also disconnects the output stream from the writer and puts it into the terminated state; all other output streams based on the output stream at the end of the translation chain (directly or indirectly) are also put into the terminated state.
(close-output-stream
output-stream)
This closes the underlying writer if output-stream is still open, and marks the output stream as closed. Applying close-output-stream
to a closed stream has no effect. Closing an output stream also closes all output streams that are translations (directly or indirectly) of the output stream at the end of its own translation chain. The return values are unspecified.
(buffer-mode
name)
(syntax) Name must be one of the identifiers none, line, and block. This returns a buffer-mode object denoting the associated buffer mode. There is only one such object for each mode, so a program can compare them using eq?.
(buffer-mode?
obj)
This returns #t
if the argument is a buffer-mode object, #f
otherwise.
(output-stream-buffer-mode
output-stream)
This returns the buffer-mode object of output-stream.
(set-output-stream-buffer-mode!
output-stream buffer-mode)
If the current buffer mode of output-stream is something other than none
and buffer-mode is the none
buffer-mode object, this will first flush the output stream. Then, it sets the buffer-mode object associated with output-stream to buffer-mode. The return values are unspecified.
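The buffer-mode procedures can be combined as in the following sketch, which assumes an implementation of this SRFI is loaded. The helpers with-unbuffered-output and line-buffered? are hypothetical names introduced only for illustration.

```scheme
;; Assumes the procedures of this SRFI are available.

;; Run THUNK with OUTPUT-STREAM switched to unbuffered mode, then
;; restore the previous mode.  Switching to NONE flushes pending output.
;; Note: the old mode is not restored if THUNK escapes non-locally.
(define (with-unbuffered-output output-stream thunk)
  (let ((old-mode (output-stream-buffer-mode output-stream)))
    (set-output-stream-buffer-mode! output-stream (buffer-mode none))
    (call-with-values thunk
      (lambda results
        (set-output-stream-buffer-mode! output-stream old-mode)
        (apply values results)))))

;; Buffer-mode objects are unique per mode, so EQ? suffices.
(define (line-buffered? output-stream)
  (eq? (output-stream-buffer-mode output-stream)
       (buffer-mode line)))
```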
(open-file-output-stream
filename)
This opens a writer connected to the file named by filename via open-file-writer
and returns an output stream with unspecified buffering mode connected to it.
(open-file-output-stream/append
filename)
This opens a writer connected to the file named by filename via open-file-writer/append
and returns an output stream with unspecified buffering mode connected to it.
(call-with-byte-vector-output-stream
proc)
Proc is a procedure accepting one argument. This creates an unbuffered output stream connected to a byte-vector writer, and calls proc with that output stream as an argument. The call to call-with-byte-vector-output-stream
returns the byte vector associated with the stream when proc returns.
(call-with-string-output-stream
proc)
Proc is a procedure accepting one argument. This creates an unbuffered output stream connected to a byte-vector writer, and calls proc with that output stream as an argument. The call to call-with-string-output-stream
returns the UTF-8 decoding of the byte vector associated with the stream when proc returns.
(call-with-output-stream
output-stream proc)
This calls proc with output-stream as an argument. If proc returns, then the stream is closed automatically and the values returned by proc are returned. If proc does not return, then the stream will not be closed automatically, unless it is possible to prove that the stream will never again be used for a write operation.
(make-translated-output-stream
output-stream translate-proc state)
This returns a translated output stream based on output-stream. The translation can thread an arbitrary state from one output operation to the next; the initial state is given by state. Translate-proc must be a procedure that adheres to the following specification:
(translate-proc
output-stream state bytes start count)
This is expected to write the output data in bytes to output-stream, which is the output stream passed into make-translated-output-stream. State is the translation state associated with the output stream. Bytes is the data to be written: it is either a byte vector or a byte represented as an exact integer. If bytes is a byte vector, start is an exact integer representing the starting index of the data to be written within bytes. Count is the number of data bytes within bytes to be written.
The procedure must return a new state object, which will be passed to the next call to translate-proc. It is recommended that translate-proc not modify state itself, but rather generate a new state object if necessary. Otherwise, the constructor procedure returned by output-stream-writer+constructor may not operate correctly.
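The following sketch shows a translate-proc that follows this recommendation. It assumes an implementation of this SRFI is loaded. The translation upcases ASCII letters and threads a running byte count as its state, returning a fresh number on every call rather than mutating anything. The names upcase-byte, translate-upcase, and make-upcase-output-stream are hypothetical.

```scheme
;; Assumes the procedures of this SRFI are available.

;; Map ASCII a-z (97-122) to A-Z; leave every other byte alone.
(define (upcase-byte b)
  (if (and (>= b 97) (<= b 122))
      (- b 32)
      b))

;; STATE is the number of bytes translated so far.
(define (translate-upcase output-stream state bytes start count)
  (if (byte-vector? bytes)
      (let ((end (+ start count)))
        (let loop ((i start))
          (if (< i end)
              (begin
                (output-byte output-stream
                             (upcase-byte (byte-vector-ref bytes i)))
                (loop (+ i 1))))))
      (output-byte output-stream (upcase-byte bytes)))
  ;; Return a new state object instead of modifying the old one.
  (+ state (if (byte-vector? bytes) count 1)))

(define (make-upcase-output-stream output-stream)
  (make-translated-output-stream output-stream translate-upcase 0))
```

Because the state is an immutable number, a stream rebuilt by the constructor from output-stream-writer+constructor can resume from the same count without interfering with the original chain.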
The Imperative I/O layer provides buffered I/O using mutable, redirectable so-called ports. A port is essentially just a reference cell to a stream. The port layer is very similar, but not identical, to the R5RS I/O system.
The Imperative I/O layer introduces one condition type of its own.
(define-condition-type &i/o-port-error &i/o-error i/o-port-error? (port i/o-error-port))
This condition type allows specifying the particular port with which an I/O error is associated. The port field has a purely informational purpose. Conditions raised by Imperative I/O procedures may include an &i/o-port-error condition, but are not required to do so.
(input-port?
obj)
This returns #t
if the argument is an input port, #f
otherwise.
(input-port-stream
input-port)
This returns the input stream underlying input-port.
(set-input-port-stream!
input-port input-stream)
This sets the input stream underlying input-port to input-stream.
(read-bytes
)
(read-bytes
input-port)
This calls input-bytes
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-bytes
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-byte
)
(read-byte
input-port)
This calls input-byte
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-byte
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-bytes-n
n)
(read-bytes-n
n input-port)
This calls input-bytes-n
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-bytes-n
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-bytes-all
)
(read-bytes-all
input-port)
This calls input-bytes-all
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-bytes-all
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-string
)
(read-string
input-port)
This calls input-string
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-string
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-char
)
(read-char
input-port)
This calls input-char
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-char
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-string-n
n)
(read-string-n
n input-port)
This calls input-string-n
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-string-n
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(read-string-all
)
(read-string-all
input-port)
This calls input-string-all
on the underlying input stream, updates the underlying input stream to the second return value, and returns input-string-all
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(peek-byte
)
(peek-byte
input-port)
This calls input-byte
on the underlying input stream, but does not update the underlying input stream. It returns input-byte
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
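Since a port is essentially a reference cell holding a stream, the difference between reading and peeking can be sketched in terms of the stream layer as follows. This assumes an implementation of this SRFI is loaded; my-read-byte and my-peek-byte are hypothetical names, and an actual implementation may well be structured differently.

```scheme
;; Assumes the procedures of this SRFI are available.

;; Reading advances the port by storing the successor stream in it.
(define (my-read-byte port)
  (call-with-values
      (lambda () (input-byte (input-port-stream port)))
    (lambda (byte next-stream)
      (set-input-port-stream! port next-stream)
      byte)))

;; Peeking discards the successor stream, so the port does not move.
(define (my-peek-byte port)
  (call-with-values
      (lambda () (input-byte (input-port-stream port)))
    (lambda (byte next-stream)
      byte)))
```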
(peek-char
)
(peek-char
input-port)
This calls input-char
on the underlying input stream, but does not update the underlying input stream. It returns input-char
's first return value. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(eof?
)
(eof?
input-port)
This returns the result of calling end-of-stream?
on the input stream underlying input-port. Input-port may be omitted, in which case it defaults to the value returned by (current-input-port)
.
(input-port-position
input-port)
This calls input-stream-position
on the underlying input stream and returns the result.
(close-input-port
input-port)
This calls close-input-stream
on the stream underlying input-port.
(open-input-file
filename)
This returns an input port for the named file, associated with a stream created by open-file-input-stream
.
(open-input-byte-vector
bytes)
This returns an input port, associated with a byte-vector stream on the byte vector bytes created by open-byte-vector-input-stream
.
(open-input-string
string)
This returns an input port, associated with a byte-vector stream on the UTF-8 encoding of string string created by open-string-input-stream
.
(call-with-input-port
input-port proc)
This calls proc with input-port as an argument. If proc returns, then the port is closed automatically and the values returned by proc are returned. If proc does not return, then the port will not be closed automatically, unless it is possible to prove that the port will never again be used for a read operation.
(call-with-current-input-port
input-port proc)
This calls proc with no arguments. During the extent of the call to call-with-current-input-port, (current-input-port) returns input-port. When control leaves the extent of the call, the previous return value is restored. The call to call-with-current-input-port returns whatever proc returns.
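A brief usage sketch, assuming an implementation of this SRFI is loaded (first-char-of is a hypothetical name):

```scheme
;; Assumes the procedures of this SRFI are available.
;; READ-CHAR is called without a port argument, so it reads from
;; (current-input-port), which is the string port during the call.
(define (first-char-of string)
  (call-with-current-input-port (open-input-string string)
    (lambda ()
      (read-char))))
```

Once first-char-of returns, (current-input-port) again yields whatever port it returned before the call.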
(current-input-port
)
Returns the current default input port.
(output-port?
obj)
This returns #t
if the argument is an output port, #f
otherwise.
(output-port-stream
output-port)
This returns the output stream underlying output-port.
(set-output-port-stream!
output-port output-stream)
This sets the output stream underlying output-port to output-stream.
(display-bytes
bytes)
(display-bytes
bytes output-port)
This calls output-bytes
on the underlying output stream and bytes. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(display-byte
byte)
(display-byte
byte output-port)
This calls output-byte
on the underlying output stream and byte. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(display-bytes-n
bytes start count)
(display-bytes-n
bytes start count output-port)
This calls output-bytes-n
on the underlying output stream and bytes, start
and count
. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(display-string
string)
(display-string
string output-port)
This calls output-string
on the underlying output stream and string. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(display-char
char)
(display-char
char output-port)
This calls output-char
on the underlying output stream and char. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(display-string-n
string start count)
(display-string-n
string start count output-port)
This calls output-string-n
on the underlying output stream and string, start
and count
. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(newline
)
(newline
output-port)
This is equivalent to (display-char #\newline output-port)
. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(flush-output-port
)
(flush-output-port
output-port)
This calls flush-output-stream
on the underlying output stream. The return values are unspecified. Output-port may be omitted, in which case it defaults to the value returned by (current-output-port)
.
(output-port-position
output-port)
This calls output-stream-position
on the underlying output stream and returns the result.
(set-output-port-position!
output-port pos)
This calls set-output-stream-position!
on the underlying output stream with pos and returns whatever it returns.
(close-output-port
output-port)
This calls close-output-stream
on the stream underlying output-port.
(open-output-file
filename)
This returns an output port for the named file, associated with a stream created by open-file-output-stream
.
(open-output-file/append
filename)
This returns an output port for the named file, associated with a stream created by open-file-output-stream/append
.
(call-with-output-byte-vector
proc)
Proc is a procedure accepting one argument. This creates an output port on an unbuffered output stream connected to a byte-vector writer, and calls proc with that output port as an argument. The call to call-with-output-byte-vector
returns the byte vector associated with the stream when proc returns.
(call-with-output-string
proc)
Proc is a procedure accepting one argument. This creates an output port on an unbuffered output stream connected to a byte-vector writer, and calls proc with that port as an argument. The call to call-with-output-string
returns the UTF-8 decoding of the byte vector associated with the stream when proc returns.
(call-with-output-port
output-port proc)
This calls proc with output-port as an argument. If proc returns, then the port is closed automatically and the values returned by proc are returned. If proc does not return, then the port will not be closed automatically, unless it is possible to prove that the port will never again be used for a write operation.
(call-with-current-output-port
output-port proc)
This calls proc with no arguments. During the extent of the call to call-with-current-output-port, (current-output-port) returns output-port. When control leaves the extent of the call, the previous return value is restored. The call to call-with-current-output-port returns whatever proc returns.
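Combined with call-with-output-string, this allows capturing whatever a thunk writes to the default output port. The following is a sketch assuming an implementation of this SRFI is loaded; capture-output is a hypothetical name.

```scheme
;; Assumes the procedures of this SRFI are available.
;; Everything THUNK writes to (current-output-port) ends up in the
;; string returned by CALL-WITH-OUTPUT-STRING.
(define (capture-output thunk)
  (call-with-output-string
   (lambda (port)
     (call-with-current-output-port port thunk))))

;; Example use (not evaluated here):
;; (capture-output
;;  (lambda ()
;;    (display-string "hi")
;;    (newline)))
```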
(current-output-port
)
Returns the current default output port.
Many I/O system implementations allow associating an encoding with a port, allowing the direct use of several different encodings with ports. The problem with this approach is that the encoding/decoding defines a mapping from binary data to text or vice versa. Because of this asymmetry, such mappings do not compose. The result is usually complications and restrictions in the I/O API, such as the inability to mix text and binary data, or the inability to change encoding mid-stream.
This SRFI avoids this problem by specifying that textual I/O always uses UTF-8. If the target or source of an I/O stream is to use a different encoding, a translated stream needs to be used, for which this SRFI offers the required facilities. Text decoders and encoders are thus expressed as binary-to-binary mappings, and as such they compose.
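To make the composition argument concrete, here is a self-contained sketch in plain Scheme that models two recoders as functions from byte lists to byte lists. It deliberately does not use the stream API of this SRFI; latin-1->utf-8, crlf->lf, and compose2 are hypothetical names chosen for illustration.

```scheme
;; Recode ISO 8859-1 bytes as UTF-8 bytes.  Bytes below 128 are
;; unchanged; the others become a two-byte sequence.
(define (latin-1->utf-8 bytes)
  (let loop ((in bytes) (out '()))
    (cond ((null? in) (reverse out))
          ((< (car in) 128)
           (loop (cdr in) (cons (car in) out)))
          (else
           (loop (cdr in)
                 (cons (+ 128 (remainder (car in) 64)) ; continuation byte
                       (cons (+ 192 (quotient (car in) 64)) ; lead byte
                             out)))))))

;; Replace each CR LF pair (13 10) by a single LF (10).
(define (crlf->lf bytes)
  (cond ((null? bytes) '())
        ((and (= (car bytes) 13)
              (pair? (cdr bytes))
              (= (cadr bytes) 10))
         (cons 10 (crlf->lf (cddr bytes))))
        (else (cons (car bytes) (crlf->lf (cdr bytes))))))

;; Both mappings go from bytes to bytes, so they compose freely.
(define (compose2 f g)
  (lambda (x) (f (g x))))

(define decode (compose2 latin-1->utf-8 crlf->lf))

(decode '(233 13 10)) ; Latin-1 "é" CR LF => (195 169 10)
```

A text-to-bytes encoder and a bytes-to-text decoder could not be chained in this way, which is the asymmetry described above.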
It would easily be possible to construct a library of text recoders for use with this SRFI, possibly connected to some notion of locale. This would properly be the subject of another SRFI, however.
display-char
vs write-char
R5RS calls the procedure for writing a character to an output port write-char
. This is inconsistent with the behavior of write
, which outputs characters using the #\
notation. Instead, it is consistent with the behavior of display
, which is why the procedure is called display-char
here, and why the other port output procedures have names starting with display-
.
Historically, it seems that the original proposal for the I/O subsystem in RnRS indeed called the procedure display-char
. I do not know why it was renamed---probably for compatibility with Common Lisp, which also has write-char
.
char-ready?
This SRFI intentionally does not provide char-ready?
, which is part of R5RS. The original intention of the procedure seems to have been to interface with something like Unix select(2)
. With multi-byte encodings such as UTF-8, this is no longer sufficient: the procedure would really have to look at the actual input data in order to determine whether a complete character is actually present. This makes realistic implementations of char-ready?
inconsistent with the user's expectations. A procedure byte-ready?
would be more consistent. On the other hand, such a procedure is rarely useful in real-world programs, and complicates all layers of the I/O system, as readers would have to provide an additional member procedure to enable its implementation. Moreover, a select(2)
-like implementation is not possible on all platforms and all types of ports. Consequently, char-ready?
and byte-ready?
are not part of this SRFI.
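The extra work a faithful char-ready? would have to do can be sketched in plain Scheme, independent of this SRFI. Given the bytes that are available without blocking, modelled here as a list, the procedure has to inspect the lead byte to decide whether a whole character has arrived. The names utf-8-sequence-length and complete-char-available? are hypothetical, and the sketch does not validate continuation bytes.

```scheme
;; Number of bytes in the UTF-8 sequence introduced by LEAD,
;; or #f if LEAD cannot start a sequence.
(define (utf-8-sequence-length lead)
  (cond ((< lead 128) 1)
        ((< lead 192) #f) ; continuation byte
        ((< lead 224) 2)
        ((< lead 240) 3)
        ((< lead 248) 4)
        (else #f)))

;; BYTES is the list of bytes currently available without blocking.
(define (complete-char-available? bytes)
  (and (pair? bytes)
       (let ((n (utf-8-sequence-length (car bytes))))
         (and n (>= (length bytes) n)))))

(complete-char-available? '(195))     ; => #f, second byte still missing
(complete-char-available? '(195 169)) ; => #t
```

A select(2)-style check would report readiness in both cases, since at least one byte is available in each.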
display
This SRFI does not provide display
, which is part of R5RS. Display
is woefully underspecified, and mostly used to output strings. For this purpose, this SRFI offers display-string
.
In R5RS, the distinct type of end of file objects is primarily for the benefit of read
, where end of file must be denoted by an object that read
cannot normally return as a result of parsing the input. However, it does not seem necessary to drag in the complications of this separate object into a pure I/O SRFI, where #f
or the empty byte vector or the empty string is perfectly adequate to represent end of file.
It is debatable whether all input procedures should instead denote end of file by #f
. The present arrangement is the way it is primarily because it is that way in the SML Basis Library.
Here is a tarball containing a reference implementation of this SRFI. It only runs on a version of Scheme 48 that has not been released at the time of writing of this SRFI.
However, its actual dependencies on Scheme 48 idiosyncrasies are few. Chief among them are its use of the module system, which is easily replaced by another, and the implementation of Unicode. To implement primitive readers and writers on files, the code only relies on suitable library procedures to open the files, and read-byte
and write-byte
procedures to read or write single bytes from a (R5RS) port, as well as a force-output
procedure to flush a port.
The reference implementation has not been highly tuned, but I have spent a modest amount of time making the code deal with buffers in an economical manner. Because of this, the code is more complicated than it needs to be, but hopefully also more usable as a basis for implementing this SRFI in actual Scheme systems.
Many examples are adapted from The Standard ML Basis Library, edited by Emden R. Gansner and John H. Reppy. Cambridge University Press, 2004.
The code makes liberal use of SRFIs 1 (List Library), 11 (Syntax for receiving multiple values), and 26 (Notation for Specializing Parameters without Currying).
The tarball with the reference implementation contains these examples along with test cases for them.
This customized reader reads from a list of byte vectors. A null byte vector yields EOF. Procedures for defining streams based on such readers follow.
(define (open-byte-vectors-reader bs)
  (let* ((pos 0))
    (make-simple-reader
     "<byte vectors>"
     bs
     5 ; for debugging
     (lambda (byte-vector start count)
       (cond
        ((null? bs) 0)
        (else
         (let* ((b (car bs))
                (size (byte-vector-length b))
                (real-count (min count (- size pos))))
           (byte-vector-copy! byte-vector start
                              b pos
                              (+ pos real-count))
           (set! pos (+ pos real-count))
           (if (= pos size)
               (begin
                 (set! bs (cdr bs))
                 (set! pos 0)))
           real-count))))
     ;; rough approximation ...
     (lambda ()
       (if (null? bs)
           0
           (- (byte-vector-length (car bs)) pos)))
     #f #f #f ; semantics would be unclear
     (lambda ()
       (set! bs #f))))) ; for GC

(define (open-strings-reader strings)
  (open-byte-vectors-reader (map string->utf-8 strings)))

(define (open-byte-vectors-input-stream byte-vectors)
  (open-reader-input-stream
   (open-byte-vectors-reader byte-vectors)))

(define (open-strings-input-stream strings)
  (open-reader-input-stream
   (open-byte-vectors-reader (map string->utf-8 strings))))
Create a string via a string output port:
(define three-lines-string
  (call-with-output-string
   (lambda (port)
     (display-string "foo" port)
     (newline port)
     (display-string "bar" port)
     (newline port)
     (display-string "baz" port)
     (newline port))))
Note that, for input streams, the successive streams need to be threaded through the program:
(define (input-two-lines s)
  (let*-values (((line-1 s-2) (input-line s))
                ((line-2 _) (input-line s-2)))
    (values line-1 line-2)))
There may be life after end of file; hence, the following is not guaranteed to return true:
(define (at-end?/broken s)
  (let ((z (end-of-stream? s)))
    (let-values (((a s-2) (input-bytes s)))
      (let ((x (end-of-stream? s-2)))
        (equal? z x)))))
... but this is:
(define (at-end? s)
  (let ((z (end-of-stream? s)))
    (let-values (((a s-2) (input-bytes s)))
      (let ((x (end-of-stream? s)))
        (equal? z x)))))
Catch an I/O exception:
(define (open-it filename)
  (guard
   (condition
    ((i/o-error? condition)
     (if (message-condition? condition)
         (begin
           (display-string (condition-message condition)
                           (current-error-port))
           (newline (current-error-port))))
     #f))
   (open-file-input-stream filename)))
Read a file directly:
(define (get-contents filename)
  (call-with-input-port (open-input-file filename)
    read-bytes-all))
Read a file byte-by-byte:
(define (get-contents-2 filename)
  (call-with-input-port (open-input-file filename)
    (lambda (port)
      (let loop ((accum '()))
        (let ((thing (read-byte port)))
          (if (not thing)
              (list->byte-vector (reverse accum))
              (loop (cons thing accum))))))))

(define (list->byte-vector l)
  (let ((bytes (make-byte-vector (length l) 0)))
    (let loop ((i 0) (l l))
      (if (null? l)
          bytes
          (begin
            (byte-vector-set! bytes i (car l))
            (loop (+ 1 i) (cdr l)))))))
Read file chunk-by-chunk:
(define (get-contents-3 filename)
  (call-with-input-port (open-input-file filename)
    (lambda (port)
      (let loop ((accum '()))
        (let ((bytes (read-bytes port)))
          (if (zero? (byte-vector-length bytes))
              (concatenate-byte-vectors (reverse accum))
              (loop (cons bytes accum))))))))

(define (concatenate-byte-vectors list)
  (let* ((size (fold + 0 (map byte-vector-length list)))
         (result (make-byte-vector size 0)))
    (let loop ((index 0)
               (byte-vectors list))
      (if (null? byte-vectors)
          result
          (let* ((b (car byte-vectors))
                 (size (byte-vector-length b)))
            (byte-vector-copy! result index b 0 size)
            (loop (+ index size)
                  (cdr byte-vectors)))))))
Note that imperatively changing the stream of (current-input-port)
is not a good idea, as it may be shared among several threads:
(define (redirect-in g filename)
  (call-with-input-port (open-input-file filename)
    (cut call-with-current-input-port <> g)))
Read a file using Stream I/O:
(define (get-contents/stream filename)
  (call-with-input-stream (open-file-input-stream filename)
    (lambda (stream)
      (let-values (((bytes _) (input-bytes-all stream)))
        bytes))))
Read a file byte by byte:
(define (get-contents/stream-2 filename)
  (call-with-input-stream (open-file-input-stream filename)
    (lambda (stream)
      (let loop ((accum '()) (stream stream))
        (let-values (((byte stream) (input-byte stream)))
          (if (not byte)
              (list->byte-vector (reverse accum))
              (loop (cons byte accum) stream)))))))
Read a file chunk-by-chunk:
(define (get-contents/stream-3 filename)
  (call-with-input-stream (open-file-input-stream filename)
    (lambda (stream)
      (let loop ((accum '()) (stream stream))
        (let-values (((chunk stream) (input-bytes stream)))
          (if (zero? (byte-vector-length chunk))
              (concatenate-byte-vectors (reverse accum))
              (loop (cons chunk accum) stream)))))))
Drop a word at the beginning of a stream selectively:
(define (eat-thousand stream)
  (let-values (((text new-stream)
                (input-string-n stream (string-length "thousand"))))
    (if (string=? text "thousand")
        new-stream
        stream)))
Skip whitespace at the beginning of a stream:
(define (skip-whitespace stream)
  (let-values (((thing new-stream) (input-char stream)))
    (cond
     ((not thing) stream)
     ((char-whitespace? thing)
      (skip-whitespace new-stream))
     (else stream))))
Reading a line could be implemented by scanning forward, then reading a chunk from the original position:
(define (my-input-line stream)
  (let count ((n 0) (g stream))
    (let-values (((thing g*) (input-char g)))
      (cond
       ((not thing)
        (if (zero? n)
            (values #f g*)
            (input-string-n stream n)))
       ((char=? #\newline thing)
        (let*-values (((line _) (input-string-n stream n)))
          (values line g*)))
       (else
        (count (+ 1 n) g*))))))
Write some text to a file:
(define (hello myfile)
  (call-with-output-stream (open-file-output-stream myfile)
    (lambda (stream)
      (output-string stream "Hello, ")
      (output-string stream "world!")
      (output-char stream #\newline))))
Extract the reader from a stream, read a byte from it, and then reconstruct a stream from it:
(define (after-first filename)
  (let ((stream (open-file-input-stream filename)))
    (call-with-values
        (lambda () (input-stream-reader+constructor stream))
      (lambda (reader construct)
        (let ((b (make-byte-vector 1 0)))
          (reader-read-bytes! reader b 0 1)
          (call-with-input-stream (construct (open-reader-input-stream reader))
            (lambda (stream-2)
              (let-values (((contents _) (input-string-all stream-2)))
                contents))))))))
Extract the reader from a stream, set position, and then reconstruct a stream from it:
(define (after-n stream n)
  (call-with-values
      (lambda () (input-stream-reader+constructor stream))
    (lambda (reader construct)
      (reader-set-position! reader n)
      (call-with-input-stream (construct (open-reader-input-stream reader))
        (lambda (stream-2)
          (let-values (((contents _) (input-string-all stream-2)))
            contents))))))
Translate CR/LF to LF on input:
(define (translate-crlf-input original-input-stream wish)

  ;; state automaton

  (define (vanilla input-stream count)
    (call-with-values
        (lambda () (input-byte input-stream))
      (lambda (byte input-stream)
        (cond
         ((not byte) (finish count))
         ((= 13 byte) (cr input-stream count))
         (else (vanilla input-stream (+ 1 count)))))))

  (define (cr input-stream count)
    (call-with-values
        (lambda () (input-byte input-stream))
      (lambda (byte input-stream)
        (cond
         ((not byte) (finish (+ 1 count))) ; CR hasn't been counted yet
         ((= 10 byte)
          (call-with-values
              (lambda () (input-bytes-n original-input-stream (+ 1 count)))
            (lambda (bytes _)
              (byte-vector-set! bytes count 10)
              (values bytes input-stream))))
         ;; another CR: the previous CR is plain data, stay in this state
         ((= 13 byte) (cr input-stream (+ 1 count)))
         ;; neither LF nor CR: count the pending CR and this byte
         (else (vanilla input-stream (+ 2 count)))))))

  (define (finish count)
    (call-with-values
        (lambda () (input-bytes-n original-input-stream count))
      (lambda (bytes input-stream)
        (values bytes input-stream))))

  (vanilla original-input-stream 0))

(define (make-crlf-translated-input-stream input-stream)
  (make-translated-input-stream input-stream translate-crlf-input))
Translate LF to CR/LF on output:
(define (make-crlf-translated-output-stream output-stream)
  (make-translated-output-stream output-stream
                                 translate-crlf-output
                                 #f))

(define (unspecific)
  (if #t #t))

(define (translate-crlf-output output-stream state data start count)
  (cond
   ((byte-vector? data)
    (let ((end (+ start count)))
      (let loop ((index start))
        (cond
         ((byte-vector-index data 10 index end)
          => (lambda (lf-index)
               (output-bytes-n output-stream data index (- lf-index index))
               (output-byte output-stream 13)
               (output-byte output-stream 10)
               (loop (+ 1 lf-index))))
         (else
          (output-bytes-n output-stream data index (- end index)))))))
   ((= data 10)
    (output-byte output-stream 13)
    (output-byte output-stream 10))
   (else
    ;; a single byte other than LF is passed through unchanged
    (output-byte output-stream data)))
  (unspecific))

(define (byte-vector-index byte-vector byte start end)
  (let loop ((index start))
    (cond
     ((>= index end) #f)
     ((= byte (byte-vector-ref byte-vector index)) index)
     (else (loop (+ 1 index))))))
Algorithmic reader producing an infinite stream of blanks:
(define (make-infinite-blanks-reader)
  (make-simple-reader
   "<blanks, blanks, and more blanks>"
   #f
   4096
   (lambda (bytes start count)
     (let loop ((index 0))
       (if (>= index count)
           index
           (begin
             (byte-vector-set! bytes (+ start index) 32)
             (loop (+ 1 index))))))
   (lambda ()
     1000) ; some number
   #f #f #f
   unspecific))

(define (make-infinite-blanks-stream)
  (open-reader-input-stream (make-infinite-blanks-reader)))
Sebastian Egner provided valuable comments on a draft of this SRFI.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.