
Re: strings draft

    > From: Matthew Dempsky <jivera@xxxxxxxxx>
    > Tom Lord <lord@xxxxxxx> writes:

    > > ** String Conversions

    > >   ~ t_scm_error scm_extract_string8 (t_uchar * answer,
    > >                                      size_t * answer_len,
    > >                                      enum uni_encoding_scheme enc,
    > >                                      t_scm_arena instance,
    > >                                      t_scm_word * str)

    > >     Normally, convert `str' to the indicated encoding (which must be
    > >     one of `uni_utf8', `uni_iso8859_*', or `uni_ascii') storing the
    > >     result in the memory addressed by `answer' and the number of 
    > >     bytes stored in `*answer_len'.  Return 0.

    > >     On input, `*answer_len' should indicate the amount of storage 
    > >     available at the address `answer'.  If there is insufficient
    > >     memory available, `*answer_len' will be set to the number of bytes
    > >     needed and the value `scm_err_too_short' returned.

    > In the case that answer doesn't have enough memory allocated to it to
    > store the string, what happens to its contents?  I would propose that
    > the memory contents be undefined to allow implementations that don't
    > store strings in a simple vector to be able to write over the memory
    > as it goes and later realize it lacks the storage rather than
    > requiring an initial pass over the contents.

That's the intention.
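
A caller-side sketch of that contract, for illustration only.  The
names `demo_extract_string8', `demo_extract_alloc', `demo_ok' and
`demo_err_too_short' are hypothetical stand-ins, and the mock omits the
`enc', `instance' and `str' parameters of the draft signature (it
copies a plain C string instead).  The point is that the mock writes as
it goes, so after a too-short return the caller discards the buffer
rather than trying to reuse its contents:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the draft does not fix these values. */
typedef unsigned char t_uchar;
typedef int t_scm_error;
enum { demo_ok = 0, demo_err_too_short = 1 };

/* Mock extractor following the draft's contract.  It writes as it goes,
   so on demo_err_too_short the buffer holds only a partial prefix that
   the caller must treat as undefined. */
static t_scm_error
demo_extract_string8 (t_uchar * answer, size_t * answer_len, const char * src)
{
  size_t need = strlen (src);
  size_t room = *answer_len;
  size_t i;

  for (i = 0; i < need && i < room; ++i)
    answer[i] = (t_uchar) src[i];

  *answer_len = need;           /* bytes stored, or bytes needed on failure */
  return (need > room) ? demo_err_too_short : demo_ok;
}

/* Returns a malloc'd buffer of *out_len bytes (not NUL-terminated),
   or NULL on allocation failure or any other error. */
static t_uchar *
demo_extract_alloc (const char * src, size_t * out_len)
{
  size_t len = 4;               /* deliberately small first guess */
  t_uchar * buf = malloc (len);
  t_scm_error err;

  assert (out_len != NULL);
  if (!buf)
    return NULL;

  err = demo_extract_string8 (buf, &len, src);
  if (err == demo_err_too_short)
    {
      /* Old contents are undefined: do not preserve them, just retry
         with the size reported in len. */
      free (buf);
      buf = malloc (len);
      if (!buf)
        return NULL;
      err = demo_extract_string8 (buf, &len, src);
    }

  if (err != demo_ok)
    {
      free (buf);
      return NULL;
    }

  *out_len = len;
  return buf;
}
```

An "unrepresentable in this encoding" error, once it has a name, would
take the same `err != demo_ok' path: free the buffer and report failure.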

    > I think there should also be an error raised when the string can't be
    > expressed in the requested encoding (I'll leave it up to someone else
    > to name this error) and again answer's memory should be undefined.

    > (These recommendations apply to all three scm_extract_string*
    > functions.)

Correct.   SRFI-50 seems still up-in-the-air at the moment but if I
had my druthers, we'd adopt the Pika-style conventions and start
making lists of error code names.

    > Somewhat less of an issue (and more specific to the current Pika
    > implementation), but why name the t_scm_arena value `instance'?  A few
    > macros (SCM_PROTECT_FRAME and theoretically SCM_LSET) assume the
    > arena's name to be `arena'.

Oh, that's just me being goofy.  I prefer the name `arena' for random
reasons -- but in explaining the FFI on this list, `instance' seemed
more communicative (for some random reason).

(In a portable FFI, I think that, just for hygiene, the
SCM_PROTECT_FRAME and SCM_LSET analogs should accept an explicit
`instance' parameter.)
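
To make the hygiene point concrete, here is a rough sketch.  These are
not Pika's actual macro definitions: `demo_frame', `demo_instance' and
the `DEMO_*' macros are hypothetical, and a real frame would carry GC
roots rather than just a link.  It only contrasts a macro that silently
captures a variable named `arena' with one that is handed the instance
explicitly:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; not Pika's actual definitions. */
struct demo_frame { struct demo_frame * next; };
struct demo_instance { struct demo_frame * frames; };

/* Implicit style: only compiles where a variable named `arena' is in
   scope, because the macro body refers to it by name. */
#define DEMO_PROTECT_FRAME_IMPLICIT(frame) \
  ((frame).next = (arena)->frames, (arena)->frames = &(frame))

/* Explicit style: the caller passes the instance, so the name of the
   caller's variable no longer matters. */
#define DEMO_PROTECT_FRAME(instance, frame) \
  ((frame).next = (instance)->frames, (instance)->frames = &(frame))

/* Count the frames currently linked into an instance. */
static size_t
demo_frame_depth (struct demo_instance * instance)
{
  size_t n = 0;
  struct demo_frame * f;

  for (f = instance->frames; f; f = f->next)
    ++n;
  return n;
}

/* Protect two local frames and report how many frames were added. */
static size_t
demo_push_two (struct demo_instance * instance)
{
  struct demo_frame a, b;
  struct demo_frame * saved = instance->frames;
  size_t before = demo_frame_depth (instance);
  size_t after;

  DEMO_PROTECT_FRAME (instance, a);
  DEMO_PROTECT_FRAME (instance, b);
  after = demo_frame_depth (instance);

  instance->frames = saved;     /* unlink before a and b go out of scope */
  return after - before;
}
```

With the explicit form a function is free to call its parameter
`instance', `arena', or anything else.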