This page is part of the web mail archives of SRFI 75 from before July 7th, 2015. The new archives for SRFI 75 contain all messages, not just those from before July 7th, 2015.
Thanks to everyone who has contributed to this discussion. It has moved so quickly that I have little hope of responding to everything, but I've found many of the comments to be helpful.

The biggest piece of implicit feedback is that the SRFI does not really make the editors' goal clear. The goal is not to finally get strings "right", or even to be Unicode-compliant. The goal is simply to make Scheme programs more portable.

My impression of the editors: we're not going to standardize anything less than a specific set of characters. There is a consensus that the current weak standard causes too many portability problems, and that the solution is to pin down precisely the meaning of "character".

Meanwhile, implementations and libraries are certainly free --- encouraged, even --- to define other datatypes and other operations on "character" and "string". Those other datatypes and operations will be better than the standard (otherwise there would have been no point for the implementor), and as experience develops, something will likely replace whatever appears in R6RS.

For an R6RS definition of "character", I think the editors would like to include most things that people wish to write within an identifier or string constant. Among the well understood and widely implemented definitions of character, the only candidates seem to be UTF-16 code points and Unicode scalar values. As far as we can tell, best practice currently points to scalar values.

Keeping in mind that the goal is portability, the question with respect to `char-upcase', `string-ci=?', etc. is not whether they do the "right" thing with respect to Unicode or natural language, but whether they are needed to write portable programs, whether they are so common that we should give them names to avoid gratuitous incompatibility, whether they are sufficiently simple to implement that we should impose them as a requirement on all Scheme systems, and whether the set of standardized operations is reasonably consistent.
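
To make the two candidate definitions of "character" concrete, here is a minimal sketch, written in Python purely for illustration rather than in Scheme. It assumes the usual Unicode definitions: a scalar value is any code point from 0 to #x10FFFF outside the surrogate range #xD800 to #xDFFF, and UTF-16 encodes each scalar value above #xFFFF as two 16-bit units (what the message above calls UTF-16 code points). The function names are hypothetical and do not come from the SRFI.

```python
# Illustrative sketch only; the helper names below are hypothetical.

def is_scalar_value(n):
    """Return True if n is a Unicode scalar value.

    Assumes the standard definition: a code point in 0..0x10FFFF
    that is not a surrogate (0xD800..0xDFFF).
    """
    return 0 <= n <= 0x10FFFF and not (0xD800 <= n <= 0xDFFF)


def utf16_code_units(s):
    """Return the list of 16-bit UTF-16 units that encode the string s."""
    data = s.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range
    print(len(clef))                                # one scalar value
    print([hex(u) for u in utf16_code_units(clef)]) # two UTF-16 units
    print(is_scalar_value(0xD834))                  # a lone surrogate is not a scalar value
```

Under a scalar-value definition, the G clef is a single character; under a definition based on 16-bit UTF-16 units, the same text is two "characters", neither of which means anything on its own.
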

I am personally convinced (by this discussion and by past experience) that `string-ci=?' as defined in the SRFI is not what you really want under most circumstances. But it's often a good approximation. I think that Scheme needs at least an operation like `string-ci=?' for portable programs, something like it will exist in most implementations, it's simple to implement, and it's consistent with the rest of the proposal --- so it still seems right to me to put it in the SRFI, despite its many flaws. A similar line of reasoning applies to the other operations.

In contrast, a `string-ci=?' based on the Unicode collation algorithm, while certainly a better approximation, seems like too much of an implementation burden to be in the SRFI.

(Many posts on this list address exactly the issues of usefulness and complexity for various operations, and I find those posts particularly helpful.)

The above does not begin to cover many other points raised in the discussion, and even for what it says, there are plenty of arguments to the contrary already on the list. Hopefully, though, it helps clarify the goal of the current SRFI as discussion continues.

Thanks, again, to everyone,
Matthew