
Re: Names and primitives in SRFI 56

This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.

On Sun, 19 Sep 2004, Bradd W. Szonye wrote:

> Many languages simply follow the assumptions of the operating system
> for which they were originally designed.

That was considered virtuous design at the time; the objective was,
remember, to present the consumer with a standalone system where
all the programs worked together; one system could be fundamentally
different from others in all kinds of ways, but the programs on that
system NEVER had to interact with or exchange data with programs on
other systems.  So whatever the OS did, you followed absolutely in
an attempt to create a unified whole.

Now, computers are considered almost more as communications devices
than as computing devices.  Data interchange has become a more
fundamental consideration than agreement with the operating system,
and "portability" the MacGuffin of software design.

> For example, the original C language assumed that bytes and
> characters were synonymous, that there are no variable-length units,
> and that all I/O is stream-oriented. The ANSI C standard expanded
> the model to handle variable-length units, but it still assumes
> character/byte equivalence and stream sequencing. Newer languages in
> the C family have adopted the ANSI C assumptions, as do many
> higher-level languages that use C for the underlying implementation.

At this point, with "THE INTERNET" being the killer application
without which most people would not even bother to own computers, and
virtually all internet software needing stream-based I/O, I think
stream-based I/O can be considered universal.  I don't expect to find
record-based text ever again except on embedded devices with peculiar
runtime models, and I expect less and less of it even there.

I think it's reasonable to build stream I/O into a language; if
there's something different on a particular system, odds are it's
going to need a one-off I/O library and portability is out the window
anyway.  So there's nothing to lose by having code developed for that
system just use different I/O primitives, which you'll have to link
in, and avoid the stream-based I/O.

> In contrast, languages like COBOL make radically different
> assumptions. They assume record-oriented files to support the
> operating systems they were designed for.

Ghods, I remember that.  "Seeking" on text files with a granularity of
one line, and the fundamental reads and writes are one-line
operations.  80 characters per line, and carriage returns assumed by
default at the end of every line.  Works great, if everything else on
the system uses text in the same format.
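
For readers who never met that model, a minimal Python sketch of
record-oriented text access follows. It is an illustration under
stated assumptions, not any particular system's file format: the
80-byte record length comes from the description above, while the
space padding and the names RECORD_LEN, write_record and read_record
are hypothetical. No line terminator is stored; the end of line is
implied by the record boundary.

```python
import io

# Assumption: fixed 80-byte records, as in the 80-character lines described above.
RECORD_LEN = 80


def write_record(f, index, text):
    """Store one line as record number `index`, space-padded to RECORD_LEN bytes."""
    data = text.encode("ascii")
    if len(data) > RECORD_LEN:
        raise ValueError("line does not fit in one record")
    f.seek(index * RECORD_LEN)          # seek granularity is one whole line
    f.write(data.ljust(RECORD_LEN, b" "))


def read_record(f, index):
    """Read record number `index` and strip the padding."""
    f.seek(index * RECORD_LEN)
    data = f.read(RECORD_LEN)
    if len(data) < RECORD_LEN:
        raise IndexError("no such record")
    return data.rstrip(b" ").decode("ascii")


# An in-memory file stands in for a record-oriented dataset.
f = io.BytesIO()
for i, line in enumerate(["FIRST LINE", "SECOND LINE", "THIRD LINE"]):
    write_record(f, i, line)

print(read_record(f, 1))
```

Random access to line N is a single multiplication, which is the
appeal of the model; the cost is that every other program has to
agree on the record length and the padding.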

These days I would be shocked to find a system with such large
assumptions on anything more sophisticated than a PDA. I think we can
and should ignore it when spec'ing a programming language intended for
broad application.

Unfortunately, beyond stream-based I/O, text has no dominant model
right now.  Any system now built will have to handle at least ASCII,
Latin1, Big5, many different versions of ISO-foo mostly having the
same mappings for the first 128 characters and different mappings for
the last 128, UTF8, UTF16 (little or big-endian, with BOM), UTF16be
and UTF16le (without BOM), and UTF32 (in eight different varieties for
big, little, and two kinds of middle-endian, with and without
BOM). Different systems will expect: Null-termination, prefixed
character counts, prefixed codepoint counts, or prefixed byte, word,
or halfword counts - with different expectations of Null size and
count widths!
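
To make the scale of the problem concrete, the following Python
sketch encodes a single character under a few of the encodings named
above and then frames the result in several of the ways listed. It
is only an illustration: the helper names encodings_of,
null_terminated and length_prefixed are hypothetical, and the struct
format strings are just examples of count widths and byte orders a
peer might expect.

```python
import codecs
import struct


def encodings_of(text):
    """Return the byte sequences for `text` under several common encodings."""
    return {
        "latin-1": text.encode("latin-1"),
        "utf-8": text.encode("utf-8"),
        "utf-16-be": text.encode("utf-16-be"),
        "utf-16-le": text.encode("utf-16-le"),
        # BOM added explicitly so the result does not depend on the host byte order.
        "utf-16 (BOM, LE)": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
        "utf-32-be": text.encode("utf-32-be"),
        "utf-32-le": text.encode("utf-32-le"),
    }


def null_terminated(payload, null_size=1):
    """Frame `payload` with a terminator of `null_size` zero bytes."""
    return payload + b"\x00" * null_size


def length_prefixed(payload, fmt=">H"):
    """Frame `payload` with a byte count packed per struct format `fmt`."""
    return struct.pack(fmt, len(payload)) + payload


variants = encodings_of("\u00e9")  # LATIN SMALL LETTER E WITH ACUTE
for name, data in variants.items():
    print(f"{name:18} {data.hex()}")

utf8 = variants["utf-8"]
print(null_terminated(utf8).hex())          # one-byte terminator
print(length_prefixed(utf8, ">H").hex())    # 16-bit big-endian byte count
print(length_prefixed(utf8, "<I").hex())    # 32-bit little-endian byte count
```

One character already yields seven distinct byte sequences here,
before any framing is chosen, and a receiver that guesses either the
encoding or the framing wrong will misread the data.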

"Standards" in text exchange are so multiplicitous that they might as
well not exist!

Sorry: ranting about "text" really has no place in a SRFI on binary
I/O; I started by trying to explain why, and got engaged with it
because it's a pet peeve.