
Re: Encodings.



Sorry, upon reflection, my hypothetical bytes->value etc. procedures
would likely be more useful if they were port/stream based:

(encode <format> <value> ...) -> <port> or #f ; where #f -> end of data.
(decode <format> <port>) -> <value> ... or #f ; where #f -> end of data.

If the format procedure is #f, "local" defaults are used, based on the
<value> types. This could provide a flexible mechanism for translating
internal data to arbitrary external encodings. Because ports are also
accepted as value arguments, it could provide the basis for reasonably
sophisticated hierarchical internal formatting as well. For the sake of
argument, assume the existence of a pipe procedure which accepts a source
and a destination port as arguments, where the destination port defaults
to standard-output if not specified:

(pipe (encode utf8 "a string" a-symbol (encode graph 2.3 4.2 5.4)))

-> writes utf8 encoded string equivalent to stdout.
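
To make the idea a little more concrete, here is a rough sketch in R7RS
Scheme. Nothing in it is part of any SRFI: the bodies of encode, pipe and
utf8 are only assumptions about how such procedures might behave, a format
is assumed to be a procedure mapping one value to a bytevector, and the
"local" default used for a #f format is assumed to be utf8. Unlike the
pipe described above, this one takes an explicit destination port, since
standard output is not necessarily a binary port.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical format procedure: one value -> bytevector of UTF-8 text.
(define (utf8 value)
  (string->utf8
   (cond ((string? value) value)
         ((symbol? value) (symbol->string value))
         ((number? value) (number->string value))
         (else (error "utf8: unsupported value" value)))))

;; Copy every byte from a binary input port to a binary output port.
(define (pipe source destination)
  (let loop ((b (read-u8 source)))
    (unless (eof-object? b)
      (write-u8 b destination)
      (loop (read-u8 source)))))

;; Sketch of (encode <format> <value> ...) -> <port>.
;; A value that is itself an input port (e.g. a nested encode result)
;; has its bytes spliced in unchanged; a #f format falls back to utf8.
(define (encode fmt . vals)
  (let ((out (open-output-bytevector))
        (fmt (or fmt utf8)))
    (for-each
     (lambda (v)
       (if (input-port? v)
           (pipe v out)
           (write-bytevector (fmt v) out)))
     vals)
    (open-input-bytevector (get-output-bytevector out))))

;; Usage: nest one encode inside another and collect the bytes.
(define sink (open-output-bytevector))
(pipe (encode utf8 "a string" 'a-symbol (encode #f "nested")) sink)
(write (utf8->string (get-output-bytevector sink)))
(newline)
```

Building the whole result in memory keeps the sketch short; a real
implementation would presumably produce the bytes lazily as the port is
read.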

;; decode returns the next value, or #f at end of data
(let ((get-real (lambda () (decode ieee-4-byte-float stdin))))
  (let loop ((real (get-real)) (reals '()))
    (if real
        (loop (get-real) (cons real reals))
        (reverse reals))))

-> (<real> <real> ...)

which would seem to be a generally useful thing to be able to do.
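
For what it is worth, the decode side can be sketched in R7RS Scheme
along the same lines. Again this is only an illustration under stated
assumptions: a format is assumed to be a procedure that reads one value
from a binary input port and returns #f at end of data, the floats are
assumed to be big-endian IEEE 754 singles, the "local" default for a #f
format is assumed to be one raw byte, and decode-all is a helper name made
up for this example.

```scheme
(import (scheme base) (scheme write))

;; Hypothetical format procedure: read one big-endian IEEE 754 single
;; from a binary input port; return #f at end of data.
(define (ieee-4-byte-float port)
  (let ((bv (read-bytevector 4 port)))
    (if (or (eof-object? bv) (< (bytevector-length bv) 4))
        #f
        (let* ((b0 (bytevector-u8-ref bv 0))
               (b1 (bytevector-u8-ref bv 1))
               (sign (if (< b0 128) 1.0 -1.0))
               ;; 8 exponent bits: low 7 bits of b0, high bit of b1
               (e (+ (* 2 (modulo b0 128)) (quotient b1 128)))
               ;; 23 fraction bits: low 7 bits of b1, then b2 and b3
               (frac (+ (* 65536 (modulo b1 128))
                        (* 256 (bytevector-u8-ref bv 2))
                        (bytevector-u8-ref bv 3))))
          (cond ((= e 0)                       ; zero or subnormal
                 (* sign (inexact frac) (expt 2.0 -149)))
                ((= e 255)                     ; infinity or NaN
                 (if (= frac 0) (* sign +inf.0) +nan.0))
                (else                          ; normal number
                 (* sign (inexact (+ 8388608 frac))
                    (expt 2.0 (- e 150)))))))))

;; Sketch of (decode <format> <port>) -> <value> or #f.
(define (decode fmt port)
  (if fmt
      (fmt port)
      (let ((b (read-u8 port)))      ; assumed default: one raw byte
        (if (eof-object? b) #f b))))

;; Collect values until decode reports end of data.
(define (decode-all fmt port)
  (let loop ((value (decode fmt port)) (acc '()))
    (if value
        (loop (decode fmt port) (cons value acc))
        (reverse acc))))

;; Usage: the bytes below are 1.0 and -2.5 as big-endian singles.
(define sample (bytevector #x3F #x80 0 0  #xC0 #x20 0 0))
(write (decode-all ieee-4-byte-float (open-input-bytevector sample)))
(newline)
```

One consequence of using #f for end of data is that a format can never
yield #f as a legitimate value, so a boolean format would need some other
end marker.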

thanks, -paul-


> From: Paul Schlie <schlie@xxxxxxxxxxx>
> Date: Fri, 13 Feb 2004 12:15:54 -0500
> To: <srfi-52@xxxxxxxxxxxxxxxxx>
> Subject: Re: Encodings.
> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
> Resent-Date: Fri, 13 Feb 2004 18:16:06 +0100 (NFT)
> 
> Just read his post. Works for me, and with a little luck and possibly a
> few tweaks if/as discovered through use to be nice/required, it could win
> broader acceptance. (although I suspect that it may be found necessary to
> base read-char, read-string, etc. on a flexibly defined "local" encoding
> definition, rather than assuming all text/data is encoded any particular
> way, on any particular platform.)
> 
> where with a: (bytes->value <bytes> <encoding>) -> <value>
> and matching: (value->bytes <value> <encoding>) -> <bytes>
> 
> then multiple <encoding> procedures may be defined as desired/required,
> which could default to "local" definitions if not explicitly specified.
> 
> thereby enabling any arbitrary sequence of encoded bytes read/written
> from/to a port to be converted to/from a generic internal scheme value
> type, using whatever external encoded representations may be required.
> 
> Thanks again, -paul-
> 
> 
>> From: Robby Findler <robby@xxxxxxxxxxxxxxx>
>> Date: Fri, 13 Feb 2004 09:01:45 -0600
>> To: srfi-52@xxxxxxxxxxxxxxxxx
>> Subject: Re: Encodings.
>> Resent-From: srfi-52@xxxxxxxxxxxxxxxxx
>> Resent-Date: Fri, 13 Feb 2004 16:01:57 +0100 (NFT)
>> 
>> At Fri, 13 Feb 2004 03:00:51 -0500, Paul Schlie wrote:
>>> Maybe I'm missing the boat, but from the best I can tell, all
>>> discussions seem to be leading to the erroneous presumption that it's
>>> adequate for scheme to restrict itself to exclusively processing data
>>> originating, and destined as Unicode encoded text, which would be
>>> most unfortunate.
>> 
>> I don't think that this is the case. PLT Scheme, for instance (you may
>> have seen Matthew's recent post on the plt-scheme mailing list), is
>> going to have byte ports. If you do read-char on a byte port, the bytes
>> coming out of the port will be interpreted as unicode (utf8 I believe,
>> unless you specify otherwise) but you can also extract the bytes from
>> the port directly.
>> 
>> Robby
>> 
>