
Re: finalize or withdraw?

This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.



On 8/20/05, Michael Sperber <sperber@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> 
> You have seen that SRFI 68 addresses all of these issues, right?

Yes, SRFI-68 can easily handle all of these issues, simply using
READ/WRITE-U8 for READ/WRITE-BYTE.  This is because SRFI-68 makes no
distinction between binary and character ports.  If we're willing to
drop that distinction, then the whole problem disappears, but we
abandon implementations that can't directly mix binary and text
operations (notably all Java implementations, and C implementations
using wchar).

Another way of looking at it is that SRFI-68 is flexible enough to allow
us to create new port types with an explicit binary vs character
distinction.  If we were to do that, then what API should we use when we
inevitably need to serialize/deserialize Scheme string objects to/from
binary ports?  A SRFI-68 approach might be to combine READ-BLOB-N with a
utility procedure

  (BLOB->STRING blob [encoding])

probably implemented on top of blob-input-stream and transcoder.
This generalizes into the first category of solutions, using specific
procedures to read and write text to binary ports.
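
As a rough sketch of that first category, the following uses R7RS-small
bytevectors as a stand-in for SRFI-68 blobs.  The names BLOB->STRING and
READ-STRING-FROM-BINARY-PORT, the optional encoding symbols, and the use
of READ-BYTEVECTOR in place of READ-BLOB-N are all assumptions made for
illustration, not part of SRFI-68 or SRFI-56.

```scheme
(import (scheme base))

;; Hypothetical helper: decode a blob (here an R7RS bytevector) into a
;; string.  ENCODING is a symbol; this sketch handles only utf-8 (the
;; default) and latin-1.
(define (blob->string blob . maybe-encoding)
  (let ((encoding (if (null? maybe-encoding) 'utf-8 (car maybe-encoding))))
    (case encoding
      ((utf-8) (utf8->string blob))
      ((latin-1)
       ;; Latin-1 maps each octet to the code point of the same value.
       (let* ((n (bytevector-length blob))
              (s (make-string n)))
         (do ((i 0 (+ i 1)))
             ((= i n) s)
           (string-set! s i (integer->char (bytevector-u8-ref blob i))))))
      (else (error "blob->string: unsupported encoding" encoding)))))

;; Hypothetical wrapper: read N octets from a binary port, then decode
;; them.  READ-BYTEVECTOR stands in for READ-BLOB-N; end-of-file is not
;; handled in this sketch.
(define (read-string-from-binary-port n port . maybe-encoding)
  (apply blob->string (read-bytevector n port) maybe-encoding))
```

For example, (read-string-from-binary-port 5 (open-input-bytevector
(bytevector 104 101 108 108 111))) yields "hello".  The port itself stays
purely binary; the text decoding happens only in the utility procedure.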

A variation on this approach is simply to unify those procedures with
the standard character port procedures.  In other words, READ/WRITE-CHAR
would be guaranteed to work on any port, including binary-ports, and in
some implementations this would involve a check to see if the port is
binary and if so use a separate path from the native underlying port
operation.  This is why SISC is able to provide READ/WRITE-CHAR for its
binary ports and thus pass the integer part of the test suite - when a
character procedure is called on a binary port it just returns a single
octet value (as though the port were Latin-1), instead of signalling an
error as the underlying Java library would.  UTF-8 could just as easily
be used.  Using this strategy we would still not guarantee binary
operations on a character port - if you need to mix I/O, use a binary
port.  It's a little more work for the Java implementations, but a simpler API.
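
A minimal sketch of that dispatch, again written with R7RS-small names
(BINARY-PORT?, READ-U8) as an assumption rather than anything SRFI-56 or
SISC specifies.  READ-CHAR* is a hypothetical name standing in for a
READ-CHAR that is guaranteed to work on any port.

```scheme
(import (scheme base))

;; Hypothetical unified reader: accepts binary and character ports.
(define (read-char* port)
  (if (binary-port? port)
      ;; Binary port: take one octet and treat it as a Latin-1
      ;; character instead of signalling an error.
      (let ((octet (read-u8 port)))
        (if (eof-object? octet)
            octet
            (integer->char octet)))
      ;; Character port: defer to the native operation.
      (read-char port)))
```

With this, (read-char* (open-input-bytevector (bytevector 65))) and
(read-char* (open-input-string "A")) both return #\A.  Swapping the
binary branch for a UTF-8 decoder would change only that one path.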

-- 
Alex