This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.
> From: Ken Dickey <Ken.Dickey@xxxxxxxxxxxxxx>

> Excuse me if the obvious has already been addressed, but..

> It would be a *bad thing* if in going from one locale to another
> changed a working Scheme program into a broken Scheme program.

> So, please be sure that the specification of character and
> string encoding and of portable Scheme source code defines
> Scheme source as being locale independent (by construction).

Do you agree that this is a portable, standard Scheme program?:

    (define i 42)            [a]
    (display i)
    (newline)

What about this next one? As nearly as I can tell, the formal syntax
in chapter 7 says that this next program is _not_ portable, but the
language in chapters 2 and 6 suggests that that is an unintended
deficiency of chapter 7:

    (DEFINE I 42)            [b]
    (DISPLAY I)
    (NEWLINE)

and if that is legal, is this a portable, standard Scheme program with
equivalent behavior?

    (DEFINE I 42)            [c]
    (display i)
    (newline)

Strictly speaking, R5RS seems to say that [a] is portable, [b] is not,
and among implementations on which [b] and [c] both run, they are not
required to be identical in meaning.

The same strict reading implies that the following is _not_ a portable
Scheme program:

    "H2O"

and that this is permitted:

    (string-ci=? "define" "DEFINE") => #f

I tend to think that R5RS is deficient (relative to the authors'
intentions) in that regard.

These restrictions would make it a real mess (at best) to try to write
a portable Scheme program that could process Scheme source texts
containing identifiers which use any letters other than #\a..#\z.

For example, I would like this portable, standard program to produce
as output a one-line, portable, standard Scheme expression:

    (display (char-downcase (char-upcase #\i)))
    (newline)

however, the strictest reading of R5RS suggests that it is not
guaranteed to do so.
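The round-trip concern above can be checked for the whole range #\a..#\z with a small program. This is only a sketch: the helper names `case-round-trips?` and `all-round-trip?` are hypothetical and do not come from R5RS or from the message above, and the program assumes nothing beyond the standard R5RS character and list procedures. On an implementation whose case conversion is locale independent for these letters one would expect it to print `#t`; the point of the discussion is that the strictest reading of R5RS does not promise that.

```scheme
;; Hypothetical helper (not part of R5RS): does character C survive an
;; upcase followed by a downcase?
(define (case-round-trips? c)
  (char=? (char-downcase (char-upcase c)) c))

;; Hypothetical helper: true if every character in the list CHARS
;; round-trips through CHAR-UPCASE and CHAR-DOWNCASE.
(define (all-round-trip? chars)
  (or (null? chars)
      (and (case-round-trips? (car chars))
           (all-round-trip? (cdr chars)))))

;; Check the lowercase letters that the R5RS formal syntax lists
;; explicitly for identifiers.
(display (all-round-trip? (string->list "abcdefghijklmnopqrstuvwxyz")))
(newline)
```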
On the other hand, if [a], [b], and [c] are all portable, equivalent,
standard Scheme programs -- then in Turkish implementations,
CHAR-UPCASE, CHAR-DOWNCASE and friends must behave in a linguistically
odd manner.

I'm not so sure that that's terrible (and my proposals for R6RS
reflect that assessment): those procedures are doomed to behave in a
linguistically odd manner for a substantial number of reasons, in many
other contexts besides Turkish implementations. While they _may_
behave in linguistically ideal ways in _some_ contexts -- that cannot
be what they are for. (Even where they must behave oddly, they can
provide a good _approximation_ of something linguistically useful.)

Rather, I propose that the standard character procedures be explicitly
related to both the syntax of portable standard Scheme and the syntax
of particular implementations. For example, R6RS should require that:

    (char-downcase #\I) => #\i

and require that within a given implementation, if:

    (char-alphabetic? c) => #t

then

    (display c)
    (newline)

produces as output a one line expression that consists of a valid
identifier in that implementation.

-t