
Re: finalize or withdraw?

This page is part of the web mail archives of SRFI 56 from before July 7th, 2015. The new archives for SRFI 56 contain all messages, not just those from before July 7th, 2015.



On 8/23/05, Michael Sperber <sperber@xxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> 
> Alex Shinn <alexshinn@xxxxxxxxx> writes:
> 
> > I wasn't actually suggesting Schemes supporting SRFI-68 voluntarily
> > restrict themselves in this manner, I meant rather to show what the
> > problem is along with a more concrete solution implemented using
> > SRFI-68, like using a hash-table to implement a vector.
> 
> Understood.  I meant to say that I didn't understand your solution.

Ah, OK, I kind of hand-waved it.

Assuming a strict binary/character port distinction, we could effectively
read size-delimited strings out of a binary port by having SRFI-56
require support for READ-BLOB-N on binary ports, and using something
like:

  (define (read-binary-string-n port n)
    (blob->string (read-blob-n port n)))

or alternatively just call it READ-STRING-N and have it dispatch on the
port type, handling both binary and character ports.  N would be defined
in terms of bytes.  You'd need a similar procedure to read NUL-terminated
strings, easily done by scanning a byte at a time until a zero byte
occurs and accumulating the results in a blob.
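
As a rough sketch of the zero-terminated case (not a proposed
definition): this assumes READ-U8 as in SRFI-68 and U8-LIST->BLOB as
in SRFI-74, and uses the BLOB->STRING procedure discussed in this
message.  The name READ-BINARY-STRING/NULL is made up for
illustration only.

```scheme
;; Sketch only.  Assumed: READ-U8 (SRFI-68), U8-LIST->BLOB (SRFI-74),
;; and BLOB->STRING as discussed in this message.
;; READ-BINARY-STRING/NULL is a hypothetical name.
(define (read-binary-string/null port)
  (let loop ((bytes '()))
    (let ((b (read-u8 port)))
      (if (or (eof-object? b) (zero? b))
          ;; Stop at the zero byte (or at end of input).  The
          ;; terminator is consumed but not included in the result.
          (blob->string (u8-list->blob (reverse bytes)))
          ;; Otherwise keep accumulating, most recent byte first.
          (loop (cons b bytes))))))
```

Treating end of input the same as a terminator is a choice made here
for brevity; signalling an error on a missing terminator would be
just as reasonable.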

BLOB->STRING would be implementation-dependent but could
be defined easily in SRFI-68 as

  (define (blob->string blob)
    (read-string-all (open-blob-input-port blob)))

For the time being SRFI-56 could either leave the character encoding
of binary ports unspecified, or else specify a default of UTF-8,
possibly allowing an optional encoding argument to BLOB->STRING.

Although we're considering using the same procedure name and
letting READ-STRING-N work on any port type, this is still much
simpler than allowing arbitrary character I/O on binary ports.
It avoids the state problems (one can assume the port is in the
"default" state before and after the extracted string), and it
doesn't allow single-character reading, which involves
complications such as knowing where character boundaries are
for all encodings and potentially skipping ahead over
state-changing bytes.

-- 
Alex