Re: the discussion so far



On 7/20/05, Thomas Bushnell BSG <tb@xxxxxxxxxx> wrote:
> Alex Shinn <alexshinn@xxxxxxxxx> writes:
> 
> >   CHAR-*CASE, CHAR-CI=?
> >     - as in R5RS
> >     - folds ASCII *only* (please don't encourage bad code)
> 
> I'm ok with this, but with bear's amendment: put "ascii" in the name.

I had originally suggested a name with ASCII (and note this is not an
encoding-based name, as bear said, but a char-set-based name).

The primary argument in favor of keeping the names as-is is partial
backwards compatibility with R5RS.  Character-level case operations are
currently used in programs for one of two semantic reasons - either
ASCII-based parsing or linguistic case mapping.  In the former case,
keeping the current R5RS names means no changes are needed and the
program continues to function properly.  In the latter case, the code is
fundamentally broken and needs to be rewritten to use string-level
operations anyway.  Unfortunately, such code will continue to work for
English-speaking authors, so the rewrite is unlikely to take place.  Do
we favor backwards compatibility as much as
possible, or do we introduce deliberate incompatibility and force people
not to use broken concepts?
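
To make the two uses concrete, here is a minimal sketch.  It is in
Python only because its built-in string methods follow the Unicode case
mappings; the helper name ascii_upcase is hypothetical and is not a
proposed SRFI name.  It contrasts an ASCII-only character fold, which is
all a parser needs, with linguistic case mapping, which cannot in
general be done one character at a time.

```python
def ascii_upcase(c):
    """Hypothetical helper: upcase one character, ASCII letters only."""
    if "a" <= c <= "z":
        return chr(ord(c) - 32)
    return c


# ASCII-based parsing: one character in, one character out.
keyword = "".join(ascii_upcase(c) for c in "define")

# Linguistic case mapping: the German sharp s upcases to two characters,
# so no char -> char procedure can produce the right result.
word = "stra\u00dfe"                                # "straße"
per_char = "".join(ascii_upcase(c) for c in word)   # sharp s left untouched
whole_string = word.upper()                         # string-level mapping

print(keyword)
print(per_char)
print(whole_string)
print(len(word), len(whole_string))
```

The string-level result is one character longer than the input, which
is exactly what a character-level procedure cannot express.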

This decision is also affected by the overall naming convention of the
SRFI.  If we are to have separate ASCII-based procedures and
Unicode-aware procedures, in general are the R5RS procedures thought of
as ASCII or as Unicode?  This is subjective - people may want to keep
the R5RS names for the semantics they use most often, but this will be
different depending on the type of programming you do.

On another note, so far the conversation is neglecting the predicates
CHAR-*CASE?.  Since these are defined as Unicode properties of
individual characters, it does make sense to keep these as
character-level operations.
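
As a rough illustration of why the predicates are different, the sketch
below (again Python, with hypothetical function names) answers the
question from a single character's Unicode data alone.  It uses the
general category (Lu, Ll) as an approximation; the full Unicode
Uppercase and Lowercase properties also cover a few characters outside
those categories.

```python
import unicodedata


def char_upper_case(c):
    """Hypothetical stand-in for CHAR-UPPER-CASE?: looks at one character."""
    return unicodedata.category(c) == "Lu"


def char_lower_case(c):
    """Hypothetical stand-in for CHAR-LOWER-CASE?: looks at one character."""
    return unicodedata.category(c) == "Ll"


# Each answer depends only on the character itself, never on its neighbours.
for c in ["A", "a", "\u03a3", "\u00df", "\u01c5", "1"]:
    print(repr(c), char_upper_case(c), char_lower_case(c))
```

Note that a titlecase character such as U+01C5 is neither upper nor
lower case under this definition, which is still a well-defined
per-character answer.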

-- 
Alex