
Re: Encodings.

This page is part of the web mail archives of SRFI 52 from before July 7th, 2015.

bear wrote:

> If there are multiple encodings/canonicalizations/etc in use
> on a system, let schemes on those systems implement multiple kinds of
> character ports.

I don't think that's a good idea.

> But it follows that there is NO WAY we should rely on I/O of
> characters through character ports to read or write a particular
> binary representation for "raw" data such as sound and image files.

Does not follow.

May I suggest the Kawa solution:

You want to be able to specify the character encoding when you open a
file, even if people mostly use a system-dependent default.  How
that encoding is specified is orthogonal to my suggestion: it can
be a parameter object or an option parameter:

(define p (open-input-port "foo.txt" encoding: "ebcdic"))

Add a special encoding for binary files:

(define b (open-input-port "foo.bin" encoding: "binary"))

A "binary" encoding maps the byte n to (integer->char b),
with no translations.
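The "binary" decoder is just an identity mapping.  A minimal sketch,
assuming a byte-level read primitive such as SRFI 56's read-byte
(binary-read-char is a hypothetical name, not part of this proposal):

```scheme
;; Sketch: decoding under the "binary" encoding.
;; Assumes a byte-level (read-byte port), e.g. from SRFI 56.
(define (binary-read-char port)
  (let ((b (read-byte port)))
    (if (eof-object? b)
        b
        (integer->char b))))  ; byte n -> (integer->char n), no translation
```

Writing is symmetric: (write-byte (char->integer c) port), so every
byte in 0..255 round-trips unchanged.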

Notice that a Windows or 8-bit MacOS system may do line-end
munging for the default encoding, but is not allowed to do
so for "binary".

> The only reason programmers want to write characters that aren't in
> the "normal" encoding/canonicalization/etc, is when they need really
> close control of the exact format of I/O.  But when you need control
> *that* close, you're not talking about a "character" port at all any
> more; you're talking about binary I/O.  Rather than breaking the
> abstraction barrier on character ports, you need a different kind of
> port.  We need binary ports that support operations like (read-bytes)
> and (write-bytes).

Perhaps if you started from scratch, but my solution is more
compatible with existing Scheme code.
	--Per Bothner
per@xxxxxxxxxxx   http://per.bothner.com/