Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.



Yup, same basic page; with a few asides:

- still don't suspect it's a good idea to specify any particular encoding
  for scheme's required character-set.

- still don't suspect it's a good idea to allow potentially non-portable
  characters to be used in identifiers or comments.

- still suspect it would be nice to be able to extend the concept of being
  able to specify hierarchical format/encoding port procedures.

- and lastly, actually rather like your "titled:" argument syntax; it makes
  reading, and likely writing, most code much easier and less error-prone.

> From: Per Bothner <per@xxxxxxxxxxx>
> Date: Fri, 13 Feb 2004 10:34:47 -0800
> To: bear <bear@xxxxxxxxx>
> Cc: Paul Schlie <schlie@xxxxxxxxxxx>, srfi-52@xxxxxxxxxxxxxxxxx
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Fri, 13 Feb 2004 19:34:54 +0100 (NFT)
> 
> bear wrote:
> 
>> If there are multiple encodings/canonicalizations/etc in use
>> on a system, let schemes on those systems implement multiple kinds of
>> character ports.
> 
> I don't think that's a good idea.
> 
>> But it follows that there is NO WAY we should rely on I/O of
>> characters through character ports to read or write a particular
>> binary representation for "raw" data such as sound and image files.
> 
> Does not follow.
> 
> May I suggest the Kawa solution:
> 
> You want to be able to specify character encoding when you open a
> file, even if people mostly use a system-dependent port.  However
> that encoding is specified is orthogonal to my suggestion:  It can
> be a parameter object or an option parameter:
> 
> (define p (open-input-port "foo.txt" encoding: "ebcdic"))
> 
> Add a special encoding for binary files:
> 
> (define b (open-input-port "foo.bin" encoding: "binary"))
> 
> A "binary" encoding maps the byte n to (integer->char n),
> with no translations.
> 
> Notice that a Windows or MacOS 8-bit system may do line-end
> munging for the default encoding, but is not allowed to do
> so for "binary".
> 
>> The only reason programmers want to write characters that aren't in
>> the "normal" encoding/canonicalization/etc, is when they need really
>> close control of the exact format of I/O.  But when you need control
>> *that* close, you're not talking about a "character" port at all any
>> more; you're talking about binary I/O.  Rather than breaking the
>> abstraction barrier on character ports, you need a different kind of
>> port.  We need binary ports that support operations like (read-bytes)
>> and (write-bytes).
> 
> Perhaps if you started from scratch, but my solution is more
> compatible with existing Scheme code.
> -- 
> --Per Bothner
> per@xxxxxxxxxxx   http://per.bothner.com/
>
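
For reference, the "binary" mapping quoted above (byte n to (integer->char n), with no translation) can be sketched in a few lines. The helper names are hypothetical; they are not Kawa's API or anything SRFI 52 defines. The sketch assumes an implementation where integer->char accepts every value from 0 to 255 and char->integer inverts it.

```scheme
;; Hypothetical helpers, for illustration only.

;; Decode: each byte n (0..255) becomes the character (integer->char n).
(define (bytes->binary-string bytes)
  (list->string (map integer->char bytes)))

;; Encode: the inverse mapping; every character must have a code below 256.
(define (binary-string->bytes str)
  (map char->integer (string->list str)))

;; Sample data containing CR LF (13 10) and a high byte (255).
(define sample '(72 105 13 10 255))

;; Round trip: no line-end munging, so the bytes should come back unchanged.
(equal? sample (binary-string->bytes (bytes->binary-string sample)))
```

Because nothing is translated, a byte sequence containing line-end bytes or values above 127 round-trips through a string intact, which is the property that lets a character port carry raw data under this scheme.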