
Re: strings draft

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.

If Scheme's existing "character" type is expanded beyond its present
definition in an attempt to create a universal character type, then it is
likely desirable to provide some mechanism to guarantee that code
manipulating Scheme program or data text restricts its character set and
lexical rules to those of Scheme; otherwise it risks producing illegal or
non-portable code. (To me, that amounts to a Scheme character type separate
from an arbitrary text-processing character type.)

So in effect:

- Scheme programs are restricted to being expressed using a minimalist
  Scheme character set, which is universally portable (a subset of the others)

- but may themselves manipulate text using a larger character set type,
  which may produce text which may itself not be portable to arbitrary

So rather than confuse the issues, it seems easier to me to simply restrict
code intended to manipulate Scheme programs and/or data structures to
"scheme-character-set" characters (which basically also covers English and
most other programming languages). Code intended to manipulate more
sophisticated arbitrary text can use whatever character type it requires,
rather than making everything more complicated than it minimally needs to
be.
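
As a rough illustration of the restriction described above, here is a
minimal sketch in Python (not Scheme, purely for convenience). It assumes
the "scheme-character-set" can be approximated by printable ASCII plus
common whitespace; the exact set is not enumerated in this message, and the
names SCHEME_CHARSET, non_portable_chars and is_portable_program_text are
hypothetical.

```python
import string

# Assumption: the restricted "scheme-character-set" is approximated by
# ASCII letters, digits, punctuation, and common whitespace.
SCHEME_CHARSET = frozenset(
    string.ascii_letters + string.digits + string.punctuation + " \t\n\r"
)


def non_portable_chars(text):
    """Return (index, char) pairs for characters outside SCHEME_CHARSET."""
    return [(i, ch) for i, ch in enumerate(text) if ch not in SCHEME_CHARSET]


def is_portable_program_text(text):
    """True if every character of the program text is in SCHEME_CHARSET."""
    return not non_portable_chars(text)
```

Under this sketch, only text meant to be read back as Scheme program or
data would be passed through such a check; arbitrary text handled by a
wider character type would not be.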


> From: tb@xxxxxxxxxx (Thomas Bushnell, BSG)
>> Paul Schlie <schlie@xxxxxxxxxxx> writes:
>> There's a distinct advantage to keeping the character set in which the
>> language is specified in (and is capable of processing itself), distinct
>> from the character set it can utilize to process arbitrary language text,
>> as otherwise it becomes too easy to then rationalize utilizing characters
>> specified within the broader character set within program code, which would
>> then truly needlessly limit the code's portability, from both a machine as
>> well as human perspective. (As I don't believe it's productive to anyone to
>> attempt to interpret code utilizing symbols written/spelled in arbitrary
>> languages and corresponding character sets; but is clearly useful to enable
>> portable programs to be written to process such arbitrary text).
> Then these little program-representing thingies should not be called
> "characters".  I don't know what the right word is, but it should be
> miles away from "character".  If this is the interpretation you wish
> to offer of what is called a "character" in R5RS, then we have a
> problem: Scheme *has no* characters of any sort, though it does have a
> simulacrum which is good enough for implementing the Scheme language.
> But it seems obvious to me that this is *not* what was in the minds of
> the R5RS authors.  I think they conceived of the R5RS character as not
> merely a thing for writing Scheme programs, but as roughly the "same
> thing" as char in C or Pascal.
> Thomas