This page is part of the web mail archives of SRFI 79 from before July 7th, 2015. The new archives for SRFI 79 contain all messages, not just those from before July 7th, 2015.
Many thanks for your long comments!

Marcin 'Qrczak' Kowalczyk <qrczak@xxxxxxxxxx> writes:

> I don't like the separation into readers/writers, streams, and ports.
> Too many similar concepts are treated as completely disjoint types.

Then I haven't explained things very well ...

> As I understand it:
>
> - readers/writers deal with physical I/O of blocks of bytes
>
> - streams provide encoding and newline conversion, buffering,
>   and scanning the same input multiple times
>
> - ports provide raw I/O of bytes, can convert UTF-8 to characters,
>   and can convert between characters and lines, or characters and
>   Scheme external representations
>
> This feels like a single package. There should be some overall
> description of the whole design somewhere, so one doesn't have to
> dig into four separate SRFIs.

You do know it used to be a single package (SRFI 68), and the folks
over on its discussion list were pretty unanimously for splitting it
up? I'm afraid you're outvoted ...

But let me provide a different characterization:

- Readers/writers are meant for people who *provide* new data
  sources---it's easy to provide readers and writers, but they're not
  meant to be used by user programs.

- Ports and streams are convenient to use, but would be hard to
  provide directly. (Most Scheme systems that provide facilities for
  creating new "port types" either sacrifice performance or simplicity
  or offer something very similar to readers/writers.)

  Pick the one that suits your application. Ports are more what
  Schemers are used to. (Input) streams offer the advantage that it's
  much easier to write transcoder-like facilities.

In this light, they appear very different from one another.

> I don't quite understand the rationale for using UTF-8 as the
> intermediate format.

Could you be more specific as to what you don't understand? There's
an explanation in the end.
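
The provider/user split characterized above can be sketched roughly as follows. This is a Python illustration rather than Scheme, and `Reader`, `BytesReader`, and `InputPort` are hypothetical names, not the SRFI 79 API. The assumption is that a provider only implements a block-level `read`, while the port layer adds buffering and convenience operations on top of any such provider.

```python
from typing import Optional, Protocol


class Reader(Protocol):
    """Provider-side interface: return a block of at most `count` bytes."""

    def read(self, count: int) -> bytes: ...


class BytesReader:
    """A trivial provider that serves blocks from an in-memory byte string."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def read(self, count: int) -> bytes:
        block = self._data[self._pos:self._pos + count]
        self._pos += len(block)
        return block  # an empty block signals end of input


class InputPort:
    """User-side layer: buffering and convenience operations over any Reader."""

    def __init__(self, reader: Reader, block_size: int = 4096) -> None:
        self._reader = reader
        self._block_size = block_size
        self._buffer = b""

    def _fill(self) -> bool:
        # Pull one more block from the provider; False means end of input.
        block = self._reader.read(self._block_size)
        self._buffer += block
        return bool(block)

    def read_byte(self) -> Optional[int]:
        if not self._buffer and not self._fill():
            return None
        byte, self._buffer = self._buffer[0], self._buffer[1:]
        return byte

    def read_line(self) -> Optional[str]:
        """Return the next LF-terminated line decoded as UTF-8, or None at EOF."""
        while b"\n" not in self._buffer:
            if not self._fill():
                break
        if not self._buffer:
            return None
        line, _, rest = self._buffer.partition(b"\n")
        self._buffer = rest
        return line.decode("utf-8")
```

Under these assumptions, adding a new data source means writing only another small class with a `read` method, and `InputPort` works on it unchanged.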

> For mixing textual and binary I/O (if the encoding is not known to
> be UTF-8) one has to put and remove a converter dynamically on every
> switch, and it's incompatible with block-conversion of input (it
> must be converted one character at a time, unless we can find the
> boundary between text and binary data when looking at the raw stream
> before the conversion).

... or do the work in the one transcoder. Allowing mixing of textual
and binary I/O is always a matter of trade-offs between efficiency and
functionality. I can see how somebody else might come down in a
different place, but I haven't seen a solution that addresses your
problem. (There was some discussion over on the SRFI 68 list.) Maybe
you can provide details on how you'd solve this problem---I'd love to
improve upon this, but don't know how.

> EOL style doesn't include the possibility of accepting any of the
> three common conventions, which is used by Java and probably .NET.

The problem is that this is only the case for input, not output, so
you'd get three more EOL styles instead of one. This is probably
better handled by a tailored READ-LINE / INPUT-LINE procedure.

> Since on classic Macintosh Perl (and perhaps C too, haven't checked)
> exchanges the meanings of \n and \r (by actually changing their
> interpretation in the source instead of recoding), I guess it would be
> more useful to exchange them when recoding in the CR style, instead
> of by treating either as a newline on input and writing a newline for
> either on output.

Interesting, I didn't know this. Could you provide details or a web
link?

> StdIn and StdOut can be seekable, and it's sometimes useful (e.g. Unix
> "wc" makes use of this). The reference implementation doesn't allow that.

Sure---the underlying substrate doesn't allow it. But you might in
your implementation, and that shouldn't be hard.

> I don't understand input-string. How much does it read?

However much the implementation feels like.
To be honest, I'm not positive that it's that useful (unlike
INPUT-BLOB)---it's mainly there for symmetry with INPUT-BLOB.

> When reading from ports, it's not specified what happens when data are
> not valid UTF-8. Similarly for decoding from e.g. UTF-16 (unpaired
> surrogates), UTF-32 (too large values), or encoding to latin-1
> (characters above U+00FF).

> Here are various concrete stream types: [...]

Above, you criticize the fact that I have different types for
different levels of the system. But how is your hierarchy of "stream
types" different from that? (I'm not trying to criticize you---I'm
curious.)

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla