This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.
> Tom Lord wrote:
>> You have a choice.
>>
>> 1) Standard Scheme becomes case-sensitive. May as well drop the case
>>    mappings from the standard entirely, in this case.
>>
>> 2) Standard Scheme specifies a deterministic case mapping for the
>>    portable character set in which portable programs may be written.
>>
>> 3) Standard Scheme does not provide for portable Scheme source texts.
>>
>> I pick (2) ....

Alex Shinn wrote:
> As do I, I certainly was not advocating (3) .... I'm not arguing
> either way as to using a default (current-locale), I'm just pointing
> it out as a likely possibility ....

I think to really do a good job of text handling, a procedure must know the language and encoding for both the source text (parameter values) and the context (returned values). For example, the rules for embedding Arabic text (right to left) in a Latin document (left to right) are slightly different from the converse, IIRC.

This suggests an encoding and processing scheme where every text has an associated locale and every text-processing procedure has a locale context parameter. For convenience's sake, that information may be implicit or supplied via global parameters (e.g., CURRENT-LOCALE), although there are disadvantages to doing it that way (e.g., changing a global locale can cause subtle data corruption or information loss problems).

On a slightly different note, there's also the issue of program source vs program data. Some languages, like C, separate the two. In principle, that makes it easier to use different environments for compiler hosting, program hosting, and program data. In practice, I think it causes confusion more than it helps. Such an approach is even more dubious for a language like Scheme, where self-hosting or metacircularity is extremely common (i.e., the compiler uses the same reader both for interpreting programs and reading program data).
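To make the "every text has an associated locale, every procedure has a locale context parameter" idea above a bit more concrete, here is a rough sketch in Python (chosen only because its codec behavior is easy to check). The names `Locale`, `Text`, and `convert` are hypothetical, invented for this illustration; nothing here is proposed SRFI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Locale:
    """Hypothetical locale record: a language tag plus an encoding name."""
    language: str  # e.g. "fr-FR"
    encoding: str  # e.g. "latin-1"


@dataclass(frozen=True)
class Text:
    """Hypothetical text value: raw octets tagged with their own locale."""
    octets: bytes
    locale: Locale


def convert(text: Text, context: Locale) -> Text:
    """Re-encode `text` for the explicit `context` locale.

    The source encoding comes from the text itself, never from a global.
    The language tag travels with the text; only the encoding changes.
    Python's strict codecs raise UnicodeEncodeError if the context
    encoding cannot represent a character, so loss is never silent.
    """
    chars = text.octets.decode(text.locale.encoding)
    return Text(chars.encode(context.encoding),
                Locale(text.locale.language, context.encoding))


# Usage: Latin-1 French text handed to a UTF-8 context.
cafe = Text("caf\u00e9".encode("latin-1"), Locale("fr-FR", "latin-1"))
as_utf8 = convert(cafe, Locale("en-US", "utf-8"))
```

By contrast, a reader that trusts a global encoding setting can misread the same octets without any error: the UTF-8 octets for "é" decode under Latin-1 as the two characters "Ã©", which is the kind of subtle corruption mentioned above.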

Rather than taking cues from languages like C, it might be better to look at the prior art in languages where the boundary between "program" and "data" is less sharp. XML might be a good example. An XML reader may recognize many languages and encodings, but the reader always begins in a default, "standard" state that only recognizes a few. That default state includes a way to specify a different locale as a kind of "metadata." With this approach, you can write XML code/data in other locales; the file begins in the standard locale, but you can then "bootstrap" the reader into a different locale.

A Scheme reader could use the same technique. External representations are in the "default Scheme source locale" by default, but they can include metadata sexps to boot the reader into a different locale. (Implementations may also provide extensions to change the default locale.) This gives users a few options for making their source code and program data portable between systems (in order of decreasing portability):

1. Always use the standard Scheme locale. Any Scheme reader should be able to process your code/data, so long as the system supports a few basic assumptions (i.e., files are readable as octet streams).

2. Use your native language, and include the locale metadata at the start of the file (e.g., wrap the file with something like #,(LOCALE UTF-8 EN-US ( ... )))

3. Use your native language, and rely on local system conventions to change the default Scheme locale. For example, a Scheme interpreter on a Linux system might recognize

       LANG=en_US.UTF-8 scheme program ....

   as a valid way to start the interpreter with its reader in UTF-8 encoded US English mode. This method is tricky, because it makes it harder to specify different locales for program and source data.

The XML "locale metadata" approach isn't perfect, but it seems like a reasonable approach to provide locale flexibility in program code and data.
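
A minimal sketch of the bootstrapping step described above, again in Python for easy checking. It assumes a header shaped like the #,(LOCALE UTF-8 EN-US ( ... )) example, assumes ASCII stands in for the "default Scheme source locale", and assumes the declared encoding is ASCII-compatible so the header can be recognized before switching. `HEADER`, `DEFAULT_ENCODING`, and `read_source` are hypothetical names, not part of any Scheme system.

```python
import re

# Assumption: ASCII plays the role of the default Scheme source locale.
DEFAULT_ENCODING = "ascii"

# Assumption: the metadata sexp opens the file and looks like
#   #,(LOCALE <encoding> <language> ( ... ))
# It is matched on raw octets, i.e. while still in the default state.
HEADER = re.compile(
    rb"^\s*#,\(LOCALE\s+([A-Za-z0-9._-]+)\s+([A-Za-z0-9_-]+)\s")


def read_source(octets):
    """Return (encoding, language, text) for a source file's octets.

    With no locale header, the whole file is decoded in the default
    locale (option 1 in the list above). With a header, the reader
    switches to the declared encoding and decodes the file with it
    (option 2). Strict decoding raises UnicodeDecodeError on octets
    that the chosen encoding cannot interpret.
    """
    match = HEADER.match(octets)
    if match is None:
        return DEFAULT_ENCODING, None, octets.decode(DEFAULT_ENCODING)
    encoding = match.group(1).decode("ascii")
    language = match.group(2).decode("ascii")
    return encoding, language, octets.decode(encoding)


# Usage: a file that declares its own locale up front.
source = '#,(LOCALE UTF-8 EN-US ((display "na\u00efve")))'.encode("utf-8")
encoding, language, text = read_source(source)
```

Option 3 would amount to changing `DEFAULT_ENCODING` from outside the file, which is why it is the least portable of the three: the octets alone no longer say how to read them.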
Unfortunately, I haven't had much experience with the XML locale metadata myself; any comments from people who have actually used this facility?

-- 
Bradd W. Szonye
http://www.szonye.com/bradd