
Re: strings draft



Tom Lord <lord@xxxxxxx> writes:

> ** String Conversions
>
>   ~ t_scm_error scm_extract_string8 (t_uchar * answer,
>                                      size_t * answer_len,
>                                      enum uni_encoding_scheme enc,
>                                      t_scm_arena instance,
>                                      t_scm_word * str)
>
>     Normally, convert `str' to the indicated encoding (which must be
>     one of `uni_utf8', `uni_iso8859_*', or `uni_ascii') storing the
>     result in the memory addressed by `answer' and the number of 
>     bytes stored in `*answer_len'.  Return 0.
>
>     On input, `*answer_len' should indicate the amount of storage 
>     available at the address `answer'.  If there is insufficient
>     memory available, `*answer_len' will be set to the number of bytes
>     needed and the value `scm_err_too_short' returned.

When `answer' doesn't have enough memory to hold the string, what
happens to its contents?  I would propose that they be left undefined.
That lets implementations that don't store strings in a simple vector
write into the buffer as they go and only later discover that it is
too small, rather than requiring an initial pass over the string to
compute its length.
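
As a rough sketch of what I mean (every name below is made up for
illustration; none of it is from the draft or from Pika), here is a
string stored as a list of chunks being extracted in a single pass.
The function copies while there is room, keeps counting once there
isn't, and only reports the shortfall at the end, so a too-short
buffer ends up holding a partial copy that the caller must ignore:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; none of these names come from the draft. */
enum toy_error { TOY_OK = 0, TOY_ERR_TOO_SHORT = 1 };

/* A string stored as a linked list of chunks, not one flat vector. */
struct toy_chunk
{
  const unsigned char *bytes;
  size_t len;
  const struct toy_chunk *next;
};

/* Single pass: copy while space remains, keep counting after it runs out. */
static enum toy_error
toy_extract (unsigned char *answer, size_t *answer_len,
             const struct toy_chunk *str)
{
  size_t avail;
  size_t needed = 0;
  const struct toy_chunk *c;

  assert (answer_len != NULL);
  avail = *answer_len;

  for (c = str; c != NULL; c = c->next)
    {
      if (needed < avail)
        {
          size_t room = avail - needed;
          size_t n = c->len < room ? c->len : room;
          memcpy (answer + needed, c->bytes, n);  /* may be a partial write */
        }
      needed += c->len;
    }

  *answer_len = needed;
  return needed > avail ? TOY_ERR_TOO_SHORT : TOY_OK;
}

/* Caller side: treat the buffer as garbage on failure, then retry. */
static unsigned char *
toy_extract_alloc (size_t *len_out, const struct toy_chunk *str)
{
  size_t len = 4;                       /* deliberately small first guess */
  unsigned char *buf = malloc (len);

  if (buf == NULL)
    return NULL;
  if (toy_extract (buf, &len, str) == TOY_ERR_TOO_SHORT)
    {
      /* buf's contents are unspecified here; resize and extract again. */
      unsigned char *bigger = realloc (buf, len);
      if (bigger == NULL)
        {
          free (buf);
          return NULL;
        }
      buf = bigger;
      if (toy_extract (buf, &len, str) != TOY_OK)
        {
          free (buf);
          return NULL;
        }
    }
  *len_out = len;
  return buf;
}
```

The caller never inspects the buffer after a too-short return; it
only uses the reported length to retry.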

I think an error should also be returned when the string can't be
expressed in the requested encoding (I'll leave it up to someone else
to name this error), and again the contents of `answer' should be
undefined.
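
For illustration only, a toy ASCII conversion with such an error might
look like the following.  The error name is a placeholder (the draft
defines no such code), and the function and types are invented for the
example.  It bails out at the first code point that doesn't fit,
leaving whatever prefix it had already written in the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_error
{
  TOY_OK = 0,
  TOY_ERR_TOO_SHORT,
  TOY_ERR_UNENCODABLE   /* placeholder name; the draft defines no such code */
};

/* Convert `n' code points to ASCII in one pass. */
static enum toy_error
toy_extract_ascii (unsigned char *answer, size_t *answer_len,
                   const uint32_t *codepoints, size_t n)
{
  size_t avail;
  size_t i;

  assert (answer_len != NULL);
  avail = *answer_len;

  for (i = 0; i < n; ++i)
    {
      /* Not representable in ASCII: give up.  A prefix of `answer' may
         already have been overwritten, and *answer_len is left as is. */
      if (codepoints[i] > 0x7f)
        return TOY_ERR_UNENCODABLE;
      if (i < avail)
        answer[i] = (unsigned char) codepoints[i];
    }

  *answer_len = n;
  return n > avail ? TOY_ERR_TOO_SHORT : TOY_OK;
}
```

In this sketch the encoding error wins over the too-short error
whenever the offending character is reached; which one should take
precedence is a separate question that the interface would have to
settle.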

(These recommendations apply to all three scm_extract_string*
functions.)

Somewhat less of an issue (and more specific to the current Pika
implementation): why name the t_scm_arena parameter `instance'?  A few
macros (SCM_PROTECT_FRAME and theoretically SCM_LSET) assume the
arena is named `arena'.
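
To show the kind of dependency I mean, here is a toy macro that picks
up a variable literally named `arena' from the scope where it is
expanded.  This is not the real definition of SCM_PROTECT_FRAME; the
macro, types, and functions are all hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct toy_arena { int protected_frames; };
typedef struct toy_arena *t_toy_arena;

/* Hypothetical macro: it refers to a variable literally named `arena'
   in whatever scope it is expanded, instead of taking it as an argument. */
#define TOY_PROTECT_FRAME(frame)        \
  do                                    \
    {                                   \
      assert (arena != NULL);           \
      (void) (frame);                   \
      ++arena->protected_frames;        \
    }                                   \
  while (0)

/* Works directly: the parameter happens to be called `arena'. */
static int
toy_with_arena (t_toy_arena arena)
{
  int frame = 0;
  TOY_PROTECT_FRAME (frame);
  return arena->protected_frames;
}

/* With the parameter named `instance', an alias is needed before the
   macro can be used. */
static int
toy_with_instance (t_toy_arena instance)
{
  t_toy_arena arena = instance;
  int frame = 0;
  TOY_PROTECT_FRAME (frame);
  return arena->protected_frames;
}
```

Without the alias in the second function, the macro expansion would
not compile, since no `arena' is in scope.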

-jivera