This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.
Tom Lord <lord@xxxxxxx> writes:

> CHAR-UPCASE and CHAR-DOWNCASE are mandatory and STRING-CI=? is defined
> in terms of CHAR-CI=?

If you're asking what should be in the next RnRS, then there is no sense in which CHAR-UPCASE is mandatory. The editors can choose to include it, or not.

I am speaking of what I would like the next RnRS to say, precisely because the current version is entirely unsuitable for correct character handling. There *is no* good implementation of R5RS if you want the Scheme character type to be based upon Unicode.

> In the latter case, CHAR-DOWNCASE behaves in a linguistically odd for
> Turkish speakers because it either converts #\I to #\i or #\I to #\I.

This is not "linguistically odd", it's incorrect. It is in fact incorrect in a way which violates the best Unicode practices. It is this which I spoke of a while back when I first entered the thread. If you are saying that it doesn't matter that the R5RS character type cannot be used with the best Unicode practices, then I disagree strongly.

> The character casemappings would still need to be defined to specify
> Scheme. Reifying that definition into Scheme in the form of those
> procedures is only natural.

Huh? Why on earth would it? We could specify Scheme and give *no* case-mapping functions, and instead only specify the output identifier matching function.

I am coming to believe that it should not be specified as string-ci=?, in fact, because a-with-accent-grave is not ci=? to a-without-accent, but a system might sensibly choose to treat them as equivalent for identifiers. There should be string-id=? (or some other name) which implements the Scheme identifier matching rules, which should be specified for the required character set, and left unspecified for all other characters. None of this requires or even implicitly uses a case mapping function.

> The standard would still need to specify CHAR-DOWNCASE.

Why? Is there some government bureau that will shut us down if the next RnRS eliminates it?

I don't mind STRING-DOWNCASE, of course, which should have a locale argument and be specified to permit the Correct Unicode Thing.

Thomas
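
The two proposals above (a string-level downcase that takes a locale, and an identifier matcher that is specified only for the required character set) can be sketched roughly as follows. This is a minimal illustration in Python, not Scheme, and not a proposed specification: the names `string_downcase`, `string_id_eq`, and `TURKIC_LOCALES` are hypothetical, "required character set" is assumed here to mean ASCII letters, and the Turkic handling covers only the precomposed letters I and İ rather than the full rules in Unicode's SpecialCasing data.

```python
# Hypothetical sketch; none of these names come from R5RS or any SRFI.

# Assumption: locales that use the dotted/dotless i distinction.
TURKIC_LOCALES = {"tr", "az"}


def string_downcase(s: str, locale: str = "") -> str:
    """Lowercase a whole string, honoring the Turkic i rule when requested.

    Simplified: only precomposed 'I' (U+0049) and 'İ' (U+0130) are handled.
    """
    if locale in TURKIC_LOCALES:
        # İ (dotted capital) -> i, and I (dotless capital) -> ı (U+0131).
        s = s.replace("\u0130", "i").replace("I", "\u0131")
    # Python's str.lower() is locale-independent and works on whole strings,
    # so a result may have a different length than its input.
    return s.lower()


def ascii_fold(s: str) -> str:
    """Fold only ASCII A-Z to a-z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def string_id_eq(a: str, b: str) -> bool:
    """Identifier matching defined only over the assumed required set (ASCII).

    Characters outside that set are compared exactly here; an implementation
    would be free to choose something else for them.
    """
    return ascii_fold(a) == ascii_fold(b)


if __name__ == "__main__":
    print(string_downcase("I"))          # locale-independent mapping
    print(string_downcase("I", "tr"))    # Turkish mapping to dotless i
    print(len("\u0130".lower()))         # one character in, two code points out
    print(string_id_eq("Foo", "fOO"))
```

Note that `string_id_eq` never calls a per-character case-mapping procedure outside ASCII, and that the default lowercase of İ is two code points long, which a character-to-character CHAR-DOWNCASE cannot express.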