This page is part of the web mail archives of SRFI 52 from before July 7th, 2015. The new archives for SRFI 52 contain all messages, not just those from before July 7th, 2015.
> bear <bear@xxxxxxxxx> wrote:
>
>> On Thu, 12 Feb 2004, Ken Dickey wrote:
>>
>> I assume that it is useful to distinguish the two goals of extending
>> programming language identifiers and processing Unicode data.
>
> For temporary solutions and bandaids, yes.  But scheme is a lisp, and
> our code is data and our data is code.  Our identifier-naming rules,
> ultimately, *can* affect our program behavior, where with C and similar
> languages, it cannot.
>
> Every implementation that deals with Unicode at all seriously is going
> to have to create rules for distinguishing Unicode identifiers, and to
> the extent that they adopt *different* rules, there will be enduring
> and sometimes very subtle portability problems, and bugs where code
> works slightly differently on one system than it does on another

As Ken properly pointed out, and as should be abundantly clear to most by now, attempting to enable scheme to more conveniently process text encoded in an arbitrary character set is distinctly different from attempting to enable scheme to utilize arbitrary characters within its program identifier and comment definitions. While the first is arguably noble, the second would clearly be a mistake.

Since scheme's presently specified required character set (not encoding) is by design already a subset of the most broadly utilized character sets, programs (including identifier and comment definitions) are easily and unambiguously transcodable between any of these character sets, which is what makes scheme program code "portable".

Attempting to enable scheme programs to utilize, within their identifier and comment definitions, characters which are not themselves within a common subset of the most broadly utilized character sets will enable the specification of scheme programs which are not easily and unambiguously transcodable between arbitrary broadly utilized character sets, and which are therefore "not portable"; that doesn't seem too clever or noble.

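A minimal sketch of the transcoding argument, in Python and under stated assumptions: the list of character sets is an arbitrary sample rather than a survey of "the most broadly utilized" ones, and the helper names `transcoding_report` and `is_portable` are hypothetical. It only checks whether an identifier can be represented at all in each sampled character set.

```python
# Arbitrary sample of character sets (an assumption made for illustration).
CHARSETS = ["latin-1", "cp1252", "koi8_r", "iso8859_7", "utf-8"]


def transcoding_report(identifier, charsets=CHARSETS):
    """Map each charset name to the encoded bytes, or None if the
    identifier cannot be represented in that charset."""
    report = {}
    for name in charsets:
        try:
            report[name] = identifier.encode(name)
        except UnicodeEncodeError:
            report[name] = None
    return report


def is_portable(identifier, charsets=CHARSETS):
    """True if the identifier is representable in every sampled charset."""
    report = transcoding_report(identifier, charsets)
    return all(encoded is not None for encoded in report.values())


if __name__ == "__main__":
    for ident in ["string-length", "größe"]:
        print(ident, is_portable(ident), transcoding_report(ident))
```

In this sample, an identifier such as `string-length` can be written in every listed character set, while `größe` has no representation in KOI8-R or ISO 8859-7, so a program using it could not be transcoded into those character sets without some renaming convention.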
If this distinction is understood and taken to heart, most of the discussions revolving around ambiguities associated with the potential use of arbitrary Unicode characters within scheme program text disappear. That in turn enables discussions to focus on the potential extension of scheme to more conveniently support the expression of algorithms which process text composed of arbitrary character-set characters, beyond those which portable scheme programs may themselves be composed of.

(Actually, it seems that the specification of anything beyond the trivially enabled use of extended character sets is likely premature, given what appears to be limited practical experience with potential solutions within the community. Maybe a few straw-man solutions need to be at least somewhat "wrung out" through trial application code development first?)

-paul-

------ End of Forwarded Message