
Re: Issues with Unicode

This page is part of the web mail archives of SRFI 91 from before July 7th, 2015. The new archives for SRFI 91 contain all messages, not just those from before July 7th, 2015.

To: bear <bear@sonic.net>
Subject: Re: Issues with Unicode
From: John Cowan <cowan@ccil.org>
Date: Wed, 10 May 2006 15:31:04 -0400
Cc: srfi-91@srfi.schemers.org
Delivered-to: srfi-91@srfi.schemers.org
In-reply-to: <Pine.LNX.4.58.0605101037220.8865@bolt.sonic.net>
References: <y9lbqushdsw.fsf@informatik.uni-tuebingen.de> <803467D9-D753-4E8A-AAEA-2F03E97EDB42@iro.umontreal.ca> <1147187289.13545.134.camel@vmx.eros-os.org> <20060509153245.GA1177@ccil.org> <1147193562.13545.149.camel@vmx.eros-os.org> <Pine.LNX.4.58.0605091651340.2829@bolt.sonic.net> <20060510041346.GH5149@ccil.org> <Pine.LNX.4.58.0605101037220.8865@bolt.sonic.net>
User-agent: Mutt/1.3.28i

bear scripsit:

> Didn't want it both ways.  String-set!, with unchanged contract,
> can be implemented on top of purely functional methods for
> manipulating string bodies and an atomic single mutation for
> manipulating the string head.

Ah.  Okay.
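
A minimal sketch of that arrangement, in Python rather than Scheme and purely for illustration: the string body is an immutable tuple, an update builds a fresh body, and the only mutation is rebinding the one slot that points at the body. The class name `MutableString` and its methods are hypothetical, not taken from any SRFI, and nothing here claims more atomicity than a single attribute rebind gives.

```python
class MutableString:
    """Hypothetical string object: immutable body, one mutable head slot."""

    def __init__(self, text: str) -> None:
        # The body is an immutable tuple of characters.
        # Only the reference held in self._body is ever changed.
        self._body = tuple(text)

    def ref(self, k: int) -> str:
        """Analogue of string-ref."""
        if not 0 <= k < len(self._body):
            raise IndexError("string index out of range")
        return self._body[k]

    def set(self, k: int, ch: str) -> None:
        """Analogue of string-set!, with the usual contract."""
        if len(ch) != 1:
            raise ValueError("expected a single character")
        old = self._body
        if not 0 <= k < len(old):
            raise IndexError("string index out of range")
        # Purely functional step: build a new body; the old one is untouched.
        new = old[:k] + (ch,) + old[k + 1:]
        # The single mutation: rebind the head to the new body.
        self._body = new

    def __str__(self) -> str:
        return "".join(self._body)
```

Anything that captured the old body before the call still sees the old characters, while callers of `set` see the ordinary string-set! behaviour.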

> Hah?  Unicode already encompasses, I believe, every living
> language with a writing system.  If you mean that there are
> programmers who can't get meaningful identifiers using the
> character set defined as of Unicode 4.1.0, I want to know
> who those programmers are.

Perhaps "potential programmers" is the correct expression.  I currently
count about 15 scripts commonly used to write living languages in the
pipeline that are not yet in Unicode 4.1, and there are a number of
natural languages that use existing scripts but don't quite have all
their characters: e.g. the Myanmar script currently handles Burmese but
not the minority languages of Myanmar.  It may be that no speakers of
those languages are currently programmers, but this is not fundamental.

Certainly things have improved quite a bit since XML 1.0, which froze
identifiers at Unicode 2.0.

> Meanwhile, allowing identifier syntax to shift with every
> version of Unicode creates the potential for version
> incompatibilities.

I quite agree, which is why I propose a fixed though over-inclusive syntax
along the lines of the "alternative identifiers" documented by Unicode.
Alternative identifiers allow whatever is not explicitly forbidden, while
still providing plenty of symbol characters for read-syntax extensions.
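
As a rough sketch of what "whatever is not explicitly forbidden" could look like, here is a Python illustration built on two fixed exclusion sets. The whitespace set is hard-coded on the model of Unicode's Pattern_White_Space property; the delimiter set is an assumption chosen for the example, not the list Unicode or any Scheme report specifies. The function name `is_identifier` is likewise hypothetical.

```python
# Fixed whitespace set, hard-coded so that it does not depend on the
# Unicode version of the runtime (modelled on Pattern_White_Space).
WHITE_SPACE = frozenset(
    "\u0009\u000A\u000B\u000C\u000D\u0020\u0085\u200E\u200F\u2028\u2029"
)

# Assumption: an illustrative set of characters reserved for read syntax.
RESERVED = frozenset("()[]{}\"';`,#|\\")


def is_identifier(token: str) -> bool:
    """Accept any non-empty token with no forbidden character.

    No character database is consulted, so code points assigned by later
    Unicode versions are accepted without changing the rule.
    """
    if not token:
        return False
    return not any(ch in WHITE_SPACE or ch in RESERVED for ch in token)
```

Because the test is a deny-list over two frozen sets, the accepted language is the same under every Unicode version, at the cost of admitting tokens that a stricter, property-based definition would reject.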

-- 
Principles.  You can't say A is         John Cowan <cowan@ccil.org>
made of B or vice versa.  All mass      http://www.ccil.org/~cowan
is interaction.  --Richard Feynman

