Re: Encodings.

Yup, we're on the same basic page, with a few asides:

- still don't think it's a good idea to specify any particular encoding
  for Scheme's required character set.

- still don't think it's a good idea to allow potentially non-portable
  characters to be used in identifiers or comments.

- still suspect it would be nice to extend the concept to allow
  specifying hierarchical format/encoding port procedures.

- and lastly, I actually rather like your "titled:" argument syntax; it
  makes reading, and likely writing, most code much easier and less
  error-prone.

> From: Per Bothner <per@xxxxxxxxxxx>
> Date: Fri, 13 Feb 2004 10:34:47 -0800
> To: bear <bear@xxxxxxxxx>
> Cc: Paul Schlie <schlie@xxxxxxxxxxx>, srfi-52@xxxxxxxxxxxxxxxxx
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Fri, 13 Feb 2004 19:34:54 +0100 (NFT)
> bear wrote:
>> If there are multiple encodings/canonicalizations/etc in use
>> on a system, let schemes on those systems implement multiple kinds of
>> character ports.
> I don't think that's a good idea.
>> But it follows that there is NO WAY we should rely on I/O of
>> characters through character ports to read or write a particular
>> binary representation for "raw" data such as sound and image files.
> Does not follow.
> May I suggest the Kawa solution:
> You want to be able to specify character encoding when you open a
> file, even if people mostly use a system-dependent port.  However
> that encoding is specified is orthogonal to my suggestion:  It can
> be a parameter object or an optional parameter:
> (define p (open-input-port "foo.txt" encoding: "ebcdic"))
> Add a special encoding for binary files:
> (define b (open-input-port "foo.bin" encoding: "binary"))
> A "binary" encoding maps the byte n to (integer->char n),
> with no translations.
> Notice that a Windows or MacOS 8-bit system may do line-end
> munging for the default encoding, but is not allowed to do
> so for "binary".
>> The only reason programmers want to write characters that aren't in
>> the "normal" encoding/canonicalization/etc, is when they need really
>> close control of the exact format of I/O.  But when you need control
>> *that* close, you're not talking about a "character" port at all any
>> more; you're talking about binary I/O.  Rather than breaking the
>> abstraction barrier on character ports, you need a different kind of
>> port.  We need binary ports that support operations like (read-bytes)
>> and (write-bytes).
> Perhaps if you started from scratch, but my solution is more
> compatible with existing Scheme code.
> -- 
> --Per Bothner
> per@xxxxxxxxxxx   http://per.bothner.com/
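
For concreteness, here is a minimal sketch of what the "binary" mapping
quoted above amounts to: each byte n becomes (integer->char n), with no
line-end translation. It assumes an implementation providing the R7RS-small
procedures open-binary-output-file, open-binary-input-file, read-u8 and
write-u8, which postdate this thread. The helper names write-byte-list and
read-binary-chars are hypothetical, made up for illustration; this is not
Kawa's open-input-port API.

```scheme
(import (scheme base) (scheme file))

;; Hypothetical helper: write a list of byte values (0..255) to PATH
;; through a binary output port, one byte at a time.
(define (write-byte-list path bytes)
  (call-with-port (open-binary-output-file path)
    (lambda (out)
      (for-each (lambda (b) (write-u8 b out)) bytes))))

;; Hypothetical helper: read every byte of PATH through a binary input
;; port and map byte n to (integer->char n), with no translation of
;; line ends or anything else.  Returns a list of characters.
(define (read-binary-chars path)
  (call-with-port (open-binary-input-file path)
    (lambda (in)
      (let loop ((acc '()))
        (let ((b (read-u8 in)))
          (if (eof-object? b)
              (reverse acc)
              (loop (cons (integer->char b) acc))))))))

;; Usage: a CR LF pair written as bytes 13 and 10 comes back as two
;; separate characters, since nothing munges line ends.
(write-byte-list "demo.bin" '(65 13 10 66))
(map char->integer (read-binary-chars "demo.bin"))
```

The sample bytes stay within ASCII so that integer->char is defined even on
implementations whose character set is no larger than that.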