Different interface to Basic String Ports



The draft of SRFI 6 proposes 3 new procedures:

1) (open-input-string <string>)     returns a port to read from <string>
2) (open-output-string)             returns a port accumulating into a string
3) (get-output-string <output-string-port>)  returns accumulated string

I have used this interface in early versions of Gambit, but I now use
a different interface which I think is superior because it provides
higher efficiency and more functionality with fewer procedures.
"open-input-string" and "open-output-string" are defined as above, and
the procedures "close-input-port" and "close-output-port" are extended
to return a string when their argument is a string port.  Specifically:


  (close-output-port <output-string-port>) returns a string containing
  the characters accumulated on the <output-string-port>.

  example:

  (let ((o (open-output-string)))
    (write "cloud" o)
    (write (* 3 3) o)
    (close-output-port o))
   --> "\"cloud\"9"


  (close-input-port <input-string-port>) returns a string containing
  the characters that were not read from the string passed to
  "open-input-string".

  example:

  (let ((i (open-input-string "alice #(1 2)")))
    (let ((a (read i)))
      (list a (close-input-port i))))
   --> (alice " #(1 2)")


Because we know (close-output-port <output-string-port>) is the last
use of the <output-string-port>, we can return the **internal** string
that is being used to accumulate the characters rather than a copy of
it.  Note that for this to really be efficient the implementation
needs to support a "string-shrink!" operation that decreases the
length of the internal string in place.  To grow the internal string,
the usual technique of reallocating a string twice its size when it
overflows can be used.
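
The buffer strategy described above can be sketched in portable
Scheme.  This is only an illustration under stated assumptions, not
Gambit's actual code: the names "make-accumulator", "accumulator-add!"
and "accumulator-close!" are hypothetical, and because "string-shrink!"
is not a standard procedure, "substring" (which copies) stands in for
it at the final step.

```scheme
;; Hypothetical accumulator: a vector holding #(buffer count).
(define (make-accumulator)
  (vector (make-string 8) 0))

(define (accumulator-add! acc ch)
  (let ((buf (vector-ref acc 0))
        (n   (vector-ref acc 1)))
    (if (= n (string-length buf))
        ;; Buffer full: allocate one twice as large and copy over.
        (let ((bigger (make-string (* 2 (string-length buf)))))
          (do ((i 0 (+ i 1)))
              ((= i n))
            (string-set! bigger i (string-ref buf i)))
          (vector-set! acc 0 bigger)
          (set! buf bigger)))
    (string-set! buf n ch)
    (vector-set! acc 1 (+ n 1))))

(define (accumulator-close! acc)
  ;; Assumption: with a real string-shrink! the buffer itself could be
  ;; trimmed in place and returned; substring copies instead.
  (substring (vector-ref acc 0) 0 (vector-ref acc 1)))

(define (accumulate-string str)
  (let ((acc (make-accumulator)))
    (for-each (lambda (c) (accumulator-add! acc c))
              (string->list str))
    (accumulator-close! acc)))
```

Here (accumulate-string "hello, string ports") grows the buffer from
8 to 16 to 32 characters and returns "hello, string ports".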

The string returned by (close-input-port <input-string-port>) is
useful if some subsequent parsing of the remaining input is needed.
For example, by checking whether the string is "" you can verify that
the whole input string was read.  (This is not a very good example,
because with both the draft proposal and my proposal you can simply do
(eof-object? (read-char <port>)).)  Also, the program can tell how
much of the input string was read.  Although this functionality is
also available with Clinger's proposal, more work is usually required,
such as a loop that counts the characters remaining in the input
string or a loop that accumulates the remaining characters in a string
or output-string-port.
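
To make the comparison concrete, here is a sketch of the extra loop
needed to recover the unread characters using only the three
procedures of the draft.  The name "remaining-input" is hypothetical
and is not part of either proposal.

```scheme
;; Drain an input port into a string using only the draft interface.
(define (remaining-input port)
  (let ((out (open-output-string)))
    (let loop ((c (read-char port)))
      (if (eof-object? c)
          (get-output-string out)
          (begin
            (write-char c out)
            (loop (read-char port)))))))

;; The earlier example, rewritten against the draft interface:
(define (read-then-rest str)
  (let ((i (open-input-string str)))
    (let ((a (read i)))
      (list a (remaining-input i)))))
```

Calling (read-then-rest "alice #(1 2)") should produce the same list
as the "close-input-port" example above, but every remaining character
is read and written once more to build the string.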

Note that there is no need to call "close-output-port" and
"close-input-port" if the resulting string is not needed (the garbage
collector will simply reclaim all the storage when the port is no
longer reachable) so the cost of building the strings is only paid
when you need them.

What I like most in my proposal is the duality of the
"open-xxx-string" and "close-xxx-port" procedures.
"open-input-string" and "open-output-string" allow going from the
string representation to the port representation, and
"close-input-port" and "close-output-port" when passed a string port
allow going from the port representation to the string representation.

My proposal also extends well to what I call "input-output string
ports" (basically a FIFO queue where each character written to the
port can subsequently be read in the order written).  Among other
things input-output string ports can be useful for interthread
communication (one thread writes to the port and another reads from
it).  When an input-output string port is created an initial string of
characters is specified.  The procedures "close-output-port" and
"close-input-port" return the string of characters that were written
but not read.  With this it is easy to define "open-input-string" and
"open-output-string" as special cases:

(define (open-input-string str) (open-input-output-string str))
(define (open-output-string) (open-input-output-string ""))

Please consider revising your proposal to support this interface.

Marc