
Re: strings draft (musings)

This page is part of the web mail archives of SRFI 50 from before July 7th, 2015. The new archives for SRFI 50 contain all messages, not just those from before July 7th, 2015.



If one wanted to extend the distinction between Scheme text and generalized
text, Scheme's character set specification could be pushed even further to
define the lexical ordering of Scheme characters. That could potentially
allow dropping some of the presently required string case functions, which
would arguably no longer be necessary since Scheme is case insensitive in
theory (although I suspect a case-preserving restriction should be added).

This would potentially allow Scheme to define its own character set mapping,
more self-consistent with its own lexical conventions, which the
reader/writer could then easily map to arbitrary platform or standard
character sets dynamically as required; for example:

Hypothetical scheme-standard-character-set-encoding (lexical order encoded)

 upper   lower   scheme   (8-bit case-preserving encoding; shifting a code
 case    case    uncase    right 1 bit yields its case-insensitive value)
 -----   -----   ------
 00: 0   01: 0   00: 0
         ...              (digits 0-9)
 12: 9   13: 9   09: 9
 -----   -----   ------
 14: A   15: a   0a: a
         ...              (digits/letters a-f)
 1e: F   1f: f   0f: f
 -----   -----   ------
 20: G   21: g   10: g
         ...              (letters g-z)
         ...              (symbols `-?)
         ...              (white-spaces)
         ...              (control) or whatever lexical order is preferred

A raw binary 8/16/whatever-bit character type could then be defined or
subtyped to support arbitrary processing of raw data and/or other
arbitrarily defined character sets.

Although this may be a bit radical, it would seem to enable a simpler base
Scheme language definition and provide flexibility for future extension,
albeit at the expense of requiring some interim library procedures for
backward compatibility.
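
As a rough illustration only, the encoding in the table above might be
sketched as follows (in Python, purely for brevity). Everything named here
(LEXICAL_ORDER, encode, fold, decode, to_platform) is hypothetical and not
part of any Scheme standard or SRFI. The sketch covers only digits and
letters, and it assumes that caseless characters such as digits take the
"upper" (even) code; the table leaves both of those choices open.

```python
# Hypothetical lexical order: digits first, then letters. Symbols,
# white-space and control characters are omitted from this sketch.
LEXICAL_ORDER = "0123456789abcdefghijklmnopqrstuvwxyz"


def encode(ch):
    """Return the case-preserving code: (lexical index << 1) | lower-case bit."""
    index = LEXICAL_ORDER.index(ch.lower())
    # Assumption: only true lower-case letters set the low bit, so digits
    # land on the even ("upper") code.
    lower_bit = 1 if ch.islower() else 0
    return (index << 1) | lower_bit


def fold(code):
    """Shift right 1 bit to obtain the case-insensitive ("uncase") value."""
    return code >> 1


def decode(code):
    """Map a case-preserving code back to a character."""
    ch = LEXICAL_ORDER[code >> 1]
    return ch if code & 1 else ch.upper()


def to_platform(codes, encoding="ascii"):
    """Writer side: map a sequence of codes to bytes in a platform character set."""
    return "".join(decode(c) for c in codes).encode(encoding)


def caseless_equal(a, b):
    """Compare two strings by their folded codes, with no case tables involved."""
    return [fold(encode(c)) for c in a] == [fold(encode(c)) for c in b]


if __name__ == "__main__":
    print(hex(encode("A")), hex(encode("a")), hex(fold(encode("a"))))
    print(to_platform([encode(c) for c in "Car9"]))
    print(caseless_equal("CAR", "car"))
```

With this layout a case-insensitive comparison is a one-bit shift followed
by an integer comparison, and sorting by code yields the chosen lexical
order directly, which is the sense in which separate string case functions
might become unnecessary.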

(However, I don't take these thoughts too seriously, as I honestly don't
 believe there's a chance of their serious consideration; it seems to be
 both Scheme's curse and strength to evolve very slowly.)

-paul-

> From: Paul Schlie <schlie@xxxxxxxxxxx>
> Date: Fri, 23 Jan 2004 07:16:50 -0500
> To: "Thomas Bushnell, BSG" <tb@xxxxxxxxxx>
> Cc: <srfi-50@xxxxxxxxxxxxxxxxx>
> Subject: Re: strings draft
> Resent-From: srfi-50@xxxxxxxxxxxxxxxxx
> Resent-Date: Fri, 23 Jan 2004 13:17:03 +0100 (MET)
> 
> "features"?  It's basically ASCII, as such useful beyond Scheme because
> it's sufficient to process most text written in English, which is fine by
> me; but arguably insufficient otherwise.
> 
> There's a distinct advantage to keeping the character set in which the
> language is specified (and which it is capable of processing itself) distinct
> from the character set it can use to process arbitrary language text.
> Otherwise it becomes too easy to rationalize using characters from the
> broader character set within program code, which would needlessly limit
> the code's portability, from both a machine and a human perspective. (I
> don't believe it's productive for anyone to attempt to interpret code using
> symbols written/spelled in arbitrary languages and corresponding character
> sets; but it is clearly useful to enable portable programs to be written to
> process such arbitrary text.)
> 
> -paul-
> 
>> From: tb@xxxxxxxxxx (Thomas Bushnell, BSG)
>>> Paul Schlie <schlie@xxxxxxxxxxx> writes:
>>> Or one could more simply reinforce the notion scheme's character type is
>>> simply distinct from (although likely a subset of) the definition of a
>>> new character type targeted to support more generalized text processing
>>> than is minimally necessary to support the definition and processing of
>>> the scheme language itself (which is all that scheme's character type is
>>> specified/suited to be sufficient for).
>> 
>> The Scheme character type includes many features designed to make it
>> more "useful", which are completely unnecessary for the simple task of
>> parsing Scheme.  This creates the problem that people may *use* it for
>> tasks other than just parsing Scheme (as indeed they do), and thus
>> programs which use it for those tasks will be ill suited to richer
>> environments.
>> 
>> Thomas
>