This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.
At Wed, 15 Sep 2004 21:51:44 -0700, Per Bothner wrote:
>
> From the draft:
> > Some Schemes may wish to distinguish between binary and non-binary
> > ports as in Common-Lisp. As these can be layered on top of the
> > current ports this may better be relegated to a separate SRFI.
>
> Huh? This is backwards. The current ports are character ports.
> As such they are layered on top of byte ports. I.e. non-binary
> ports are layered on top of binary ports.

Hello,

I had in fact expected more opposition to this earlier and was
wondering when it would turn up :)

> You have to specify when the port is *opened* whether it is a
> binary or character.

That is one philosophy. Another is that given ports with which you can
perform both byte and character operations, you can implement such
B&D-style ports on top of them.

> The alternative is for read-byte/write-byte to peek into the
> implementation of a character port, and operate on the underlying
> byte port.

Ideally, as Bear mentioned earlier, I like to think of the byte-level
operations as the only primitives on top of which character-level
operations are defined, but that is an implementation detail.

> This is losing:
> (a) It complicates synchronizing (buffering) between the character
> stream and the byte stream.

"Complicated" should not prevent us from adding language features, and
I don't see this as any more complex than having additional primitive
port types.

> (b) Some implementations of character streams may buffer a chuck
> of bytes. If some bytes in the file cannot be mapped to characters
> in the current character encoding, an exception may be signalled.

Yes, and the SRFI goes so far as to provide a SRFI-36 condition for
such a case.

> (c) In some environments you cannot get at the underlying byte
> stream from a character stream. This includes Java. A Scheme
> implementation could do its own implementation of character streams
> such that you could get at the underlying byte stream, but then
> the read functions would only work on character streams created
> using Scheme run-time routines, which complicates both implementation
> and interoperability.

This is again the complexity argument. The above strategy is also a
backwards approach, as you said earlier, and could be made simpler and
more efficient by making the byte-level operations the only primitives.

> It makes no sense to mix character and binary I/O on the same port.
> Anyone who tries it is in a state of sin.

I work very often with binary file formats, including Scheme libraries
for handling ELF, TIFF, and the gettext .mo format among others. Every
one of these mixes binary and character data. Apparently almost
everyone who has ever designed a binary format is a sinner :)

> Kawa does treat binary ports as character ports with a special
> character encoding of "binary".

This is a feature others are likely to want, and you will undoubtedly
find support if you write a SRFI for it. It can be implemented in
portable Scheme on top of SRFI-56 by redefining the current port
primitives.

Given disjoint ports I can think of 3 options for working with binary
formats, almost all of which include character data:

  1) Toggle the port between binary mode and character mode. Clumsy
     and error-prone. Does not solve the problem that the character
     port will at times still be pointing to invalid characters.

  2) Open two ports to the same source, one in character mode and one
     in binary mode, and read from them separately. Same problems as
     above with the added difficulty of keeping the two in the same
     position when the binary data is closely interleaved with
     character data.

  3) Extract character data in binary ports as binary first then
     convert with utility procedures to character/string. In this
     case I would simply define as convenience forms read-char,
     read-line, etc. in terms of these utilities and the resulting API
     is indistinguishable from the current SRFI-56.

Potential encoding errors are inherent in all ports and have to be
dealt with whether or not you have disjoint port types. Any safety
measures can and will be circumvented, becoming merely an inconvenience
while providing a false sense of security.

--
Alex