
Re: Encodings.



Just to close the issue out in my mind: I looked at the reader code, and it
seems to strip \r characters following \n characters, treating the sequence
as a single \n character, which lets it conveniently read generic DOS'ish
files. It writes \n characters to terminate lines, though, which older Mac C
compilers substitute with an LF, ironically, as I recall. (So you've
basically got a universal reader for text on binary-based ports, authored
using any one of the three typical newline conventions, which would seem to
be adequate.)
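
For illustration only, here is a minimal sketch of that kind of universal
newline handling on raw bytes. It is written in Python rather than taken from
the reader code, and it assumes the conventional byte orders: LF (UNIX),
CR LF (DOS), and lone CR (classic Mac). The name normalize_newlines is
hypothetical.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Map CR LF pairs and lone CRs to a single LF, in one pass.

    Illustrative sketch only; not the reader code discussed above.
    """
    CR, LF = 0x0D, 0x0A
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b == CR:
            # A CR always ends a line, so emit one LF for it.
            out.append(LF)
            # If it was the first half of a CR LF pair, skip the LF
            # so the pair counts as one newline rather than two.
            if i + 1 < len(data) and data[i + 1] == LF:
                i += 1
        else:
            out.append(b)
        i += 1
    return bytes(out)
```

The input would come from a file opened in binary mode, so the platform's own
text-mode translation never gets a chance to interfere, and the output always
uses \n regardless of which convention the file was authored with.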

> From: Paul Schlie <schlie@xxxxxxxxxxx>
> Date: Sun, 15 Feb 2004 22:20:53 -0500
> To: <srfi-52@xxxxxxxxxxxxxxxxx>
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Mon, 16 Feb 2004 04:21:04 +0100 (NFT)
> 
> Hi Robby,
> 
> - as I've personally been using OSX for the past few years, I have to admit
> I forget what peculiarities may have existed under OS9 previously, but as
> Macs have historically been the underdog, their text editors have had to
> become sensitive to the end-of-line encodings of various other platforms,
> and adapt to them locally (regardless of an initial UNIX, DOS, or Mac
> encoding).
> 
> - do agree that all files should be opened in binary mode, but do suspect
> that it would be nice to adhere to local conventions, and be sensitive to
> foreign ones; although if one had to pick the most neutral one, I would
> guess it to be UNIX, as you've chosen.
> 
> - with respect to utf8, although I wouldn't expect any problems with respect
> to the use of Scheme's defined character-set, I would guess that most
> programs will continue to interpret non-ASCII encoded character bytes based
> on their native character-set by default, which presently isn't likely to
> be Unicode based (but that is only relevant to those who expect otherwise).
> 
> Thanks, -paul-
> 
>> From: Robby Findler <robby@xxxxxxxxxxxxxxx>
>> Date: Sun, 15 Feb 2004 19:03:57 -0600
>> To: Paul Schlie <schlie@xxxxxxxxxxx>
>> Cc: srfi-52@xxxxxxxxxxxxxxxxx
>> Subject: Re: Encodings.
>> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
>> Resent-Date: Mon, 16 Feb 2004 02:03:59 +0100 (NFT)
>> 
>> At Fri, 13 Feb 2004 12:15:54 -0500, Paul Schlie wrote:
>>> (although I suspect that it may be found necessary to
>>> base read-char, read-string, etc. on a flexibly defined "local" encoding
>>> definition, rather than assuming all text/data is encoded any particular
>>> way, on any particular platform.)
>> 
>> Our experience with the crlf issues on windows vs mac vs unix suggests
>> the opposite. The desire to be able to distribute a single set of
>> sources that runs on all those platforms means that we currently read
>> all files in binary (by default). Whether this translates to unicode
>> issues isn't entirely clear, but we're starting with a single default
>> encoding, rather than looking for local encodings.
>> 
>> Robby
>> 
>