This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.
On Fri, 30 Jan 2004, Tom Lord wrote:

> Bear wrote:
> > What's missing is an explicit declaration that it is unspecified
> > whether or not values written into the buffer pointed at by the
> > result of SCHEME_EXTRACT_STRING mutate the scheme string that
> > was originally referred to,
>
> Interesting conclusion.  I conclude that EXTRACT must allocate string
> data which the C code must explicitly free.  I arrived at this through
> a fairly systematic exploration of the design space (described below).

It's true that a multi-step procedure would be more stable, more general, and more easily portable to other scheme implementations: the C code asks for the string length, allocates the buffer, asks for the characters to be copied into the buffer, does whatever it does, and then disposes of the buffer when it no longer needs it. That was what I initially proposed (that only values, and not pointers, should cross the FFI), and that's what I'd still rather see.

However, the general response, as I understood it, was that while string-copying or string-translating costs for SCHEME_EXTRACT_STRING are inevitable for implementations that use odd string representations, most people felt it is not acceptable to impose a string-copying cost on scheme runtimes that *do* represent strings in some form comprehensible to C systems. So, basically, I thought that the "copy everything" approach that you and I were advocating had been eliminated from discussion.

*IF* the copy-everything approach is not on the table, and implementations that store strings internally in a C-comprehensible format are supposed to be spared the overhead of copying, then we need to warn developers that the pointer they get is unstable and might cease to be valid on any string mutation from the scheme side or on garbage collection, and that writing to the buffer is not guaranteed to mutate the scheme string.
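For illustration, the multi-step copy-out procedure described above might look roughly like the following C sketch. Everything named here is hypothetical and not part of SRFI 50 or any real Scheme runtime: `toy_string` stands in for a scheme string with a non-C internal representation, and `toy_string_length`, `toy_string_copy_out`, and `copy_scheme_string` are invented names for the steps. The narrowing of non-ASCII characters to `?` is also just an assumption made to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a scheme string whose internal form is not a
   C string: one code point per element, no terminating NUL. */
typedef struct {
    size_t length;
    unsigned int *chars;
} toy_string;

/* Step 1: the C side asks how many characters the string holds. */
size_t toy_string_length(const toy_string *s) {
    return s->length;
}

/* Step 3: the runtime copies (and here narrows) the characters into a
   caller-owned buffer of at least length + 1 bytes and NUL-terminates it.
   Returns 0 on success, -1 if the buffer is too small. */
int toy_string_copy_out(const toy_string *s, char *buf, size_t cap) {
    if (cap < s->length + 1) {
        return -1;
    }
    for (size_t i = 0; i < s->length; i++) {
        /* Assumption: anything outside ASCII is replaced by '?'. */
        buf[i] = (s->chars[i] < 128) ? (char)s->chars[i] : '?';
    }
    buf[s->length] = '\0';
    return 0;
}

/* The whole protocol as seen from C: ask for the length, allocate,
   copy out. The caller owns the result and must free() it. */
char *copy_scheme_string(const toy_string *s) {
    size_t n = toy_string_length(s);      /* step 1: ask for the length */
    char *buf = malloc(n + 1);            /* step 2: allocate the buffer */
    if (buf == NULL) {
        return NULL;
    }
    if (toy_string_copy_out(s, buf, n + 1) != 0) {  /* step 3: copy */
        free(buf);
        return NULL;
    }
    return buf;                           /* steps 4-5: use, then free() */
}
```

Because only values cross the boundary, writes to the returned buffer never reach the original string, and the buffer stays valid regardless of what the runtime does to the string afterwards. The price is one allocation and one copy per extraction.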
We need that warning because the write-through question cannot be settled just one way or the other. Explicitly supporting direct write-through mutation is in fact not even possible for implementations that must provide SCHEME_EXTRACT_STRING by copying or translating some other internal representation, or which may change internal representations (and locations) on mutation, or which have copying garbage collectors. Conversely, absolutely preventing direct write-through mutation is impossible for most implementations that store strings in a form that the C code *can* understand and which implement SCHEME_EXTRACT_STRING without copying the string buffer.

Essentially this seems to partition every easily-possible implementation into three classes: either it cannot guarantee to support write-through mutation (like mine), or it cannot guarantee to prevent it (like S48), or it can guarantee neither (like a scheme that uses byte strings but may relocate them on mutation or GC). Shiro's suggestion of forcing the C code to regard this buffer as const chars seems to be the best solution.

But if the copy-everything approach is still on the table, then I agree with you that it's definitely a more general, stable, and portable approach, and I would support it. However, I'd also agree with its detractors that it imposes a copying overhead on some implementations that could have provided visibility to their strings without copying them. I regard this as an entirely acceptable cost, but then I may be biased, as I couldn't possibly have avoided it anyway.

Bear
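As a rough illustration of the const-chars idea discussed above, here is a small C sketch of a no-copy, read-only view onto a string that the runtime may relocate on mutation. All names are hypothetical (`byte_string`, `byte_string_extract`, `byte_string_set`, `count_char`) and model no real implementation; the relocate-on-every-mutation behaviour is an assumption chosen to make the instability of the pointer visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical runtime string stored as NUL-terminated bytes, so it can
   be shown to C without copying. The runtime owns the storage. */
typedef struct {
    char *data;
} byte_string;

int byte_string_init(byte_string *s, const char *text) {
    size_t n = strlen(text) + 1;
    s->data = malloc(n);
    if (s->data == NULL) {
        return -1;
    }
    memcpy(s->data, text, n);
    return 0;
}

/* No-copy extraction. The const qualifier tells C code it may only read,
   so no promise about write-through is made either way. */
const char *byte_string_extract(const byte_string *s) {
    return s->data;
}

/* Mutation from the "scheme side". Assumption: this toy runtime moves the
   string to fresh storage on every mutation, so any pointer extracted
   earlier is invalid once this returns successfully. */
int byte_string_set(byte_string *s, size_t i, char c) {
    size_t n = strlen(s->data) + 1;
    if (i >= n - 1) {
        return -1;
    }
    char *moved = malloc(n);
    if (moved == NULL) {
        return -1;
    }
    memcpy(moved, s->data, n);
    moved[i] = c;
    free(s->data);
    s->data = moved;
    return 0;
}

void byte_string_free(byte_string *s) {
    free(s->data);
    s->data = NULL;
}

/* A C-side consumer that only reads through the const view. */
size_t count_char(const char *view, char c) {
    size_t n = 0;
    for (; *view != '\0'; view++) {
        if (*view == c) {
            n++;
        }
    }
    return n;
}
```

The discipline on the C side is then to treat the view as borrowed: read through it, never write through it, and call the extraction function again after anything that might mutate the string or trigger a collection, rather than reusing an old pointer.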