Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.



Just a nit, but a text file is a binary file that a program, when it thinks
it's opening a text file, interprets in some prescribed manner. That
interpretation may be as simple as one long string of characters, or as
complex as an indexed database containing revision histories, embedded
binary-encoded images, etc. The more sophisticated indexed record file
structures supported by some OSes are themselves not much more than an
intermediate, primitive indexed database built on top of plain old files
composed of disk sectors, often with knowledge of the storage system's
blocking architecture for efficiency. But the general rule remains: you get
out literally what you put in, as otherwise they wouldn't be very useful.
(Although there's a historical distinction between binary and text files,
it's an artifact with little remaining practical significance, given the
variety of text file formats in present use.)
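
As a minimal sketch of this point (in Python, chosen only for illustration;
the thread itself concerns Scheme ports), the same bytes can be read back
verbatim through a binary stream, or interpreted as characters by a text layer
wrapped around that binary stream. The names PAYLOAD, read_as_binary and
read_as_text are hypothetical and not taken from any proposal in this thread.

```python
import io

# Hypothetical payload: the UTF-8 encoding of "naive" with a diaeresis, plus a newline.
PAYLOAD = "na\u00efve\n".encode("utf-8")

def read_as_binary(data: bytes) -> bytes:
    """Read the bytes back unchanged through a binary stream."""
    return io.BytesIO(data).read()

def read_as_text(data: bytes, encoding: str) -> str:
    """Interpret the same bytes as characters via a text layer over a binary stream."""
    # newline="" turns off newline translation, so only the decoding step differs.
    text_port = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return text_port.read()

if __name__ == "__main__":
    print(read_as_binary(PAYLOAD))               # the raw bytes, as written
    print(repr(read_as_text(PAYLOAD, "utf-8")))  # one interpretation of those bytes
    print(repr(read_as_text(PAYLOAD, "latin-1")))  # a different interpretation
```

The bytes never change; only the interpretation applied on top of the binary
stream does, which is why the two text readings differ.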

There was a time, however (now mostly past), when communication systems
would steal the most or least significant bit of a byte to save bandwidth
or to support higher-level framing, as in older T1 lines. Even that is
mostly gone now, as many protocols carried over TCP need all bits to be
preserved.
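
A rough illustration of why a bit-stealing channel cannot carry arbitrary
binary data (a sketch only: strip_high_bit is a hypothetical helper that
models a channel clearing the most significant bit of every byte, and is not
a model of any specific T1 framing scheme):

```python
def strip_high_bit(data: bytes) -> bytes:
    """Model a 7-bit channel by clearing the most significant bit of each byte."""
    return bytes(b & 0x7F for b in data)

# 7-bit ASCII text survives such a channel unchanged...
ascii_text = b"hello"
# ...but arbitrary 8-bit data does not: 0x80 and 0xFF lose their top bit.
binary_data = bytes([0x00, 0x7F, 0x80, 0xFF])

if __name__ == "__main__":
    print(strip_high_bit(ascii_text) == ascii_text)
    print(strip_high_bit(binary_data) == binary_data)
```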

-paul-

> From: "Bradd W. Szonye" <bradd+srfi@xxxxxxxxxx>
> Date: Fri, 13 Feb 2004 13:53:54 -0800
> To: srfi-52@xxxxxxxxxxxxxxxxx
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Fri, 13 Feb 2004 22:54:03 +0100 (NFT)
> 
>> Paul Schlie wrote:
>>> But feel compelled to observe that once an object's internal representation
>>> is formatted/encoded to/from whatever external representation form is
>>> desired/required, it is then essentially in binary form; therefore binary
>>> I/O actually represents the root common port format of all I/O; where
>>> more abstract ports may be thought of as merely munging on the data prior to
>>> sending (or after receiving) it through a binary port; which although it may
>>> seem like a subtlety, if scheme were to view ports in this hierarchical way,
>>> it could form the basis of a very flexible data transformation and I/O
>>> architecture.
> 
> bear wrote:
>> Central idea: Right.  If the binary port is primitive, then the
>> various kinds of character ports can be provided as libraries.
>> 
>> I take issue with several of your "therefores" though; while I agree
>> with your conclusions, I don't think that the internal representation
>> of any kind of data is, or should be presumed to be, at all similar to
>> that which passes through a binary port.
> 
> That's roughly my feeling too. I agree with some of his basic
> conclusions, but I disagree with many of his reasons for them.
> 
> For example, I think it's splitting hairs to call it "binary I/O" when
> you're reading or writing in the machine's native text format. In some
> cases, it's downright misleading; for example, the native text format on
> a VMS system is record-based and cannot be represented as a binary
> stream.
> 
> Because of that, I think it's a mistake to claim that binary I/O is more
> primitive than text I/O. On some systems, the two are entirely
> orthogonal. For UNIX-like systems, you can implement text on top of
> binary, but it's not generally possible. Something to keep in mind when
> specifying port & string standards.
> -- 
> Bradd W. Szonye
> http://www.szonye.com/bradd
>