Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.



Just to close the issue out in my mind: I looked at the reader code, and it
seems to strip \r characters following \n characters, treating the sequence
as a single \n character, which lets it conveniently read generic DOS'ish
files. It writes \n characters to terminate lines, which, ironically, I recall
older Mac C compilers substitute with an LF. (So you've basically got a
universal reader for binary-port-based text authored using any of the three
typical newline conventions, which would seem to be adequate.)
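
For anyone following along, here is a minimal sketch of that kind of
newline-tolerant reading. It is written in Python and is not taken from the
reader code discussed here. The name normalize_newlines is hypothetical, and
the sketch assumes the conventional byte order for DOS files (CR then LF);
a lone CR (old Mac) and a lone LF (UNIX) each count as one newline too.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Map CR LF, lone CR, and lone LF line endings to a single LF.

    Illustrative only: assumes the input was read from a port opened
    in binary mode, so no platform translation has happened yet.
    """
    CR, LF = 0x0D, 0x0A
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b == CR:
            # Every CR ends a line, so emit one LF for it.
            out.append(LF)
            # If this CR starts a CR LF pair, skip the LF so the pair
            # is counted as one newline rather than two.
            if i + 1 < n and data[i + 1] == LF:
                i += 1
        else:
            # LF and all other bytes pass through unchanged.
            out.append(b)
        i += 1
    return bytes(out)
```

Fed the bytes of a file opened in binary mode, this yields the same
LF-terminated text whichever of the three conventions the file was written
with.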

> From: Paul Schlie <schlie@xxxxxxxxxxx>
> Date: Sun, 15 Feb 2004 22:20:53 -0500
> To: <srfi-52@xxxxxxxxxxxxxxxxx>
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Mon, 16 Feb 2004 04:21:04 +0100 (NFT)
> 
> Hi Robby,
> 
> - as I've personally been using OSX for the past few years, I have to admit
> I forget what peculiarities may have existed under OS9 previously, but as
> Macs have historically been the underdog, their text editors have had to
> become sensitive to the various other platforms' end-of-line encodings, and
> adapt to them locally (regardless of UNIX, DOS, or Mac initial encoding).
> 
> - do agree that all files should be opened in binary mode, but do suspect
> that it would be nice to adhere to local conventions, and be sensitive to
> foreign ones; although if one had to pick the most neutral one, would
> guess it to be UNIX, as you've chosen.
> 
> - with respect to utf8, although I wouldn't expect any problems with respect
> to the use of Scheme's defined character-set, would guess that most programs
> will continue to interpret non-ASCII encoded character bytes based on their
> native character-set by default, which presently isn't likely to be Unicode
> based (but that's only relevant to those who expect otherwise).
> 
> Thanks, -paul-
> 
>> From: Robby Findler <robby@xxxxxxxxxxxxxxx>
>> Date: Sun, 15 Feb 2004 19:03:57 -0600
>> To: Paul Schlie <schlie@xxxxxxxxxxx>
>> Cc: srfi-52@xxxxxxxxxxxxxxxxx
>> Subject: Re: Encodings.
>> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
>> Resent-Date: Mon, 16 Feb 2004 02:03:59 +0100 (NFT)
>> 
>> At Fri, 13 Feb 2004 12:15:54 -0500, Paul Schlie wrote:
>>> (although I suspect that it may be found necessary to
>>> base read-char, read-string, etc. on a flexibly defined "local" encoding
>>> definition, rather than assuming all text/data is encoded any particular
>>> way, on any particular platform.)
>> 
>> Our experience with the crlf issues on windows vs mac vs unix suggests
>> the opposite. The desire to be able to distribute a single set of
>> sources that runs on all those platforms means that we currently read
>> all files in binary (by default). Whether this translates to unicode
>> issues isn't entirely clear, but we're starting with a single default
>> encoding, rather than looking for local encodings.
>> 
>> Robby
>> 
>