
Re: Encodings.



bear wrote:

If there are multiple encodings/canonicalizations/etc in use
on a system, let schemes on those systems implement multiple kinds of
character ports.

I don't think that's a good idea.

But it follows that there is NO WAY we should rely on I/O of
characters through character ports to read or write a particular
binary representation for "raw" data such as sound and image files.

Does not follow.

May I suggest the Kawa solution:

You want to be able to specify the character encoding when you open a
file, even if people mostly use a system-dependent port.  How
that encoding is specified is orthogonal to my suggestion:  it can
be a parameter object or an optional parameter:

(define p (open-input-port "foo.txt" encoding: "ebcdic"))

Add a special encoding for binary files:

(define b (open-input-port "foo.bin" encoding: "binary"))

A "binary" encoding maps the byte n to (integer->char n),
with no translations.

Notice that a Windows or MacOS 8-bit system may do line-end
munging for the default encoding, but is not allowed to do
so for "binary".
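
To make the mapping concrete, here is a minimal sketch of what a
"binary" encoding would do to the bytes it reads and writes.  It
assumes an R7RS-style Scheme with bytevectors; binary-decode and
binary-encode are illustrative names only, not part of Kawa or of any
standard, and a real port would do this incrementally rather than on a
whole buffer:

```scheme
(import (scheme base) (scheme write))

;; Hypothetical helper: decode bytes the way a "binary" encoding would,
;; mapping each byte n to (integer->char n) with no translation.
(define (binary-decode bv)
  (let loop ((i (- (bytevector-length bv) 1)) (chars '()))
    (if (< i 0)
        (list->string chars)
        (loop (- i 1)
              (cons (integer->char (bytevector-u8-ref bv i)) chars)))))

;; Hypothetical inverse: each character must have a code in 0..255,
;; otherwise it cannot be written as a single byte.
(define (binary-encode str)
  (let* ((len (string-length str))
         (bv (make-bytevector len 0)))
    (let loop ((i 0))
      (if (= i len)
          bv
          (let ((n (char->integer (string-ref str i))))
            (if (> n 255)
                (error "character does not fit in one byte" n)
                (begin
                  (bytevector-u8-set! bv i n)
                  (loop (+ i 1)))))))))

;; CR, LF, and a high byte: none of them may be munged.
(define sample (bytevector 13 10 65 255))
(display (equal? (binary-encode (binary-decode sample)) sample))
(newline)
```

The round trip should come back equal on any implementation whose
characters cover the codes 0 through 255.  In particular the CR and LF
bytes pass through untouched, which is exactly what the default
encoding on some systems does not guarantee.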

The only reason programmers want to write characters that aren't in
the "normal" encoding/canonicalization/etc, is when they need really
close control of the exact format of I/O.  But when you need control
*that* close, you're not talking about a "character" port at all any
more; you're talking about binary I/O.  Rather than breaking the
abstraction barrier on character ports, you need a different kind of
port.  We need binary ports that support operations like (read-bytes)
and (write-bytes).

Perhaps if you started from scratch, but my solution is more
compatible with existing Scheme code.
--
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/