Re: Why are byte ports "ports" as such?

This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.

To: Ben Goetter <goetter@mazama.net>
Subject: Re: Why are byte ports "ports" as such?
From: Marc Feeley <feeley@iro.umontreal.ca>
Date: Thu, 13 Apr 2006 17:41:41 -0400
Cc: srfi-91@srfi.schemers.org
Delivered-to: srfi-91@srfi.schemers.org
In-reply-to: <443E9048.8000804@mazama.net>
References: <443E9048.8000804@mazama.net>

On 13-Apr-06, at 1:54 PM, Ben Goetter wrote:

If you separate byte ports from character ports, and separate input ports from output ports (at least at the API level), you get an easily type-checked interface, e.g.
open-input-file string [encoding keywords] -> input-char-port
read-char input-char-port -> character
open-input-file-raw string -> input-byte-port
read-byte input-byte-port -> integer


Did you read this section of the SRFI?

Byte ports support character I/O operations because with each byte port is attached a character encoding specifying how characters are encoded with bytes. It is incorrect to believe however that all ports are byte ports. For example the ``string ports'' of SRFI 6 (Basic String Ports) have no reason to be aware of the character to byte encoding because they only deal with sequences of characters. So they need not be byte ports. For this reason this SRFI views byte ports as a subtype of character ports. Character ports support character I/O operations and byte ports support character I/O operations and byte I/O operations. All I/O operations which are valid on a character port are also valid on a byte port. [Although not specified in this SRFI a further generalization is ``object ports'' which are ports whose fundamental I/O unit is the Scheme object. Character ports are object ports because there is a standard encoding of (most) Scheme objects to characters.]

SRFI 91 allows character I/O and binary I/O on byte ports because files often use a format which mixes text and byte-encoded data. Viewing byte ports as a subtype of character ports is consistent with current practice (i.e. "text files" are just binary files which encode the characters with a sequence of bytes that depends on the character encoding).

For your bidi ports, perhaps
open-input-output-file string [encoding keywords] -> input-char-port output-char-port
with the two ports sharing common buffer structure in the implementation.

It is a pain to carry those two ports around in the code when the program needs to communicate bidirectionally with some other entity (another process, a user at a terminal, a socket, etc). Moreover, the separation of a conceptually bidirectional channel into distinct ports (input and output) destroys the conceptual link that they have. This hinders program understanding. For example, with bidirectional ports, (close-port port) will close both sides of the bidirectional port (i.e. the link between the input and output port is preserved). With two unidirectional ports you have to duplicate some operations (closing ports, changing port settings, ...).

Often one needs to open a file or a structure initially as a byte port, then decode subsequent sections of the sequence as characters of a particular encoding. For that, a procedure like
cook-input-encoding integer input-byte-port [encoding keywords] -> input-char-port
can return a port that promises to decode a certain number of octets from the backing byte port with your encoding. It doesn't handle variable-length structures well, though.

This is possible with SRFI 91. Just open the file (in buffered or non-buffered mode) and read your bytes, then read your characters. If you need to read the characters first, then open the file in non-buffered mode, read your characters, and then read your bytes (after switching back to buffered mode if you wish).
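
The byte-then-character order described above can be sketched as follows. This is a sketch under stated assumptions, not SRFI 91 code: it uses R7RS procedure names (open-input-bytevector, read-u8, read-bytevector, utf8->string), and the data layout (a two-byte big-endian length followed by that many bytes of UTF-8 text) is hypothetical. R7RS does not promise character operations on a binary port, so the text section is decoded explicitly here; on a SRFI 91 byte port the character operations would be applied to the same port instead.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical contents: 2-byte big-endian length, then UTF-8 text.
;; 104 and 105 are the bytes for "h" and "i".
(define data (bytevector 0 2 104 105))
(define port (open-input-bytevector data))

;; Read the binary header first, one byte at a time.
;; Separate definitions keep the two reads in a fixed order.
(define hi (read-u8 port))
(define lo (read-u8 port))
(define len (+ (* 256 hi) lo))

;; Then read the text section and decode it as UTF-8.
(define text (utf8->string (read-bytevector len port)))

(write text)
(newline)
```

The length prefix is what lets the reader know where the text section ends; without it, the point about variable-length structures raised above still applies.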

By the way I'm tempted to add string ports to this SRFI (compatible with SRFI 6 of course), and the analog ports for u8vectors, i.e. u8vector ports. String ports are character ports (but not byte ports) and u8vector ports are byte ports (and character ports). Something along these lines:


(open-input-string string-or-settings)
(open-output-string [string-or-settings])
(open-string [string-or-settings])

and

(open-input-u8vector u8vector-or-settings)
(open-output-u8vector [u8vector-or-settings])
(open-u8vector [u8vector-or-settings])

These would allow a more complete set of procedures for encoding and decoding strings into u8vectors. For example:


> (with-output-to-u8vector
    (list char-encoding: 'UTF-8)
    (lambda () (write-char (integer->char 1234))))
#u8(211 146)
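
The two bytes in this result can be checked by hand from the UTF-8 two-byte pattern 110xxxxx 10xxxxxx. The following is a minimal sketch, assuming an R7RS Scheme for bytevector and string->utf8 (these are not SRFI 91 procedures); the names lead, cont, by-hand and by-library are illustrative only.

```scheme
(import (scheme base) (scheme write))

(define code-point 1234) ; same scalar value as in the example above

;; Two-byte UTF-8 (valid for code points 128 through 2047):
;; the lead byte 110xxxxx carries the high 5 bits,
;; the continuation byte 10xxxxxx carries the low 6 bits.
(define lead (+ 192 (quotient code-point 64)))   ; 192 + 19 = 211
(define cont (+ 128 (remainder code-point 64)))  ; 128 + 18 = 146

(define by-hand (bytevector lead cont))
(define by-library (string->utf8 (string (integer->char code-point))))

(write by-hand)
(newline)
(write by-library)
(newline)
```

Both computations give the bytes 211 and 146, which agrees with the u8vector shown above.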

I'm currently holding back to keep the SRFI lean, but I may change my mind (or write a separate SRFI).

I like your read-substring and write-substring.


Great.

Marc

Follow-Ups:
- Re: Why are byte ports "ports" as such?
  - From: John Cowan <cowan@ccil.org>

References:
- Why are byte ports "ports" as such?
  - From: Ben Goetter <goetter@mazama.net>
