Re: revised w/nocase text, considering titlecase and cased

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.

To: John Cowan <cowan@xxxxxxxxxxxxxxxx>

Subject: Re: revised w/nocase text, considering titlecase and cased

From: Alex Shinn <alexshinn@xxxxxxxxx>

Date: Sat, 10 May 2014 11:42:46 +0900

Cc: SRFI-115 discussion list <srfi-115@xxxxxxxxxxxxxxxxx>

On Sat, May 10, 2014 at 11:21 AM, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:

> Alex Shinn scripsit:
> > But if you decide they should _not_ have case mappings, then
> > you're treating them strictly as symbols, and giving them case
> > properties is inconsistent. It should be one or the other.
>
> They are something like symbols, but they are letter-like in other ways.
> Per contra, the circled Latin letters are considered symbols (and so
> have no case) but have case mappings just the same.

They are letter-like. Unicode decided they are not case-like -
they exist by themselves without any *-cased counterpart.
Therefore they should not have case properties.
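
To make the difference between case properties and case mappings concrete, here is a minimal Python sketch using the standard `unicodedata` module. The circled Latin letter is the example from the quoted text. MATHEMATICAL BOLD CAPITAL A is only an assumed stand-in for the characters under discussion, which this message does not name. The helper `case_report` is hypothetical and not part of SRFI 115, and the results depend on the Unicode database bundled with the Python build.

```python
import unicodedata

def case_report(ch: str) -> dict:
    """Summarize the general category and case mappings of one character.

    Hypothetical helper for illustration only.
    """
    lower, upper = ch.lower(), ch.upper()
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # e.g. "Lu", "Ll", "So"
        "lower": lower,
        "upper": upper,
        # True if either case mapping changes the character.
        "has_mapping": lower != ch or upper != ch,
    }

# Example from the quoted text: a symbol (So) that still has a case mapping.
circled = case_report("\u24b6")       # CIRCLED LATIN CAPITAL LETTER A

# Assumed stand-in: an uppercase letter (Lu) with no case mapping at all.
math_bold = case_report("\U0001d400")  # MATHEMATICAL BOLD CAPITAL A

for report in (circled, math_bold):
    print(report["name"], report["category"], report["has_mapping"])
```

Under these assumptions the two characters sit on opposite sides of the inconsistency: one has mappings without a letter category, the other has a cased letter category without mappings.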

You can also argue the other direction, that case is not just
about exact semantic identity, but perceived identity. When
I isearch in Emacs I expect case-insensitive matches (unless
specifically disabled) and would expect "x" to match either
case of x in formulas. Likewise in natural sorting you expect
the different cases to sort together. This is also true for
mathematical symbols even when there is no relation at all
between the two cases, such as in

http://en.wikipedia.org/wiki/Greek_letters_used_in_mathematics,_science,_and_engineering

Or put more simply, everyone _knows_ that X is the uppercase
of x, and expects it to behave that way in software.
Unicode is breaking expectations here.
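
A rough Python sketch of the search and sorting expectation described above. It assumes the formula characters are Mathematical Alphanumeric Symbols (the message does not say so), and the helpers `loose_key` and `matches` are hypothetical names. Applying NFKC normalization before case folding is one possible approximation of "perceived identity", not something the thread or SRFI 115 prescribes.

```python
import unicodedata

def loose_key(s: str) -> str:
    """Key for perceived-identity comparison: NFKC-normalize, then case-fold."""
    return unicodedata.normalize("NFKC", s).casefold()

def matches(needle: str, haystack: str) -> bool:
    """Substring search that ignores case and compatibility variants."""
    return loose_key(needle) in loose_key(haystack)

MATH_ITALIC_CAPITAL_X = "\U0001d44b"
MATH_ITALIC_SMALL_X = "\U0001d465"

# Case folding alone leaves the two mathematical letters unrelated,
# because neither has a case mapping.
folds_alike = MATH_ITALIC_CAPITAL_X.casefold() == MATH_ITALIC_SMALL_X.casefold()

# With the loose key, a plain "x" finds either one in a formula.
found_capital = matches("x", MATH_ITALIC_CAPITAL_X + " + 1")
found_small = matches("x", MATH_ITALIC_SMALL_X + " + 1")

# Sorting with the same key keeps the two cases of a letter together.
ordered = sorted(["b", "A", "a", "B"], key=loose_key)

print(folds_alike, found_capital, found_small, ordered)
```

The design choice here is to do the unifying in the application's comparison key rather than in the character properties, which is the kind of workaround the default properties push software toward.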

> In Unicode, things are always more complicated than you think.

Rhetoric. They are a committee making decisions, many of
which could go either way. I'm pointing out this was a bad
decision. Such things happen.

Alex