Re: revised w/nocase text, considering titlecase and cased

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.

To: John Cowan <cowan@xxxxxxxxxxxxxxxx>

Subject: Re: revised w/nocase text, considering titlecase and cased

From: Alex Shinn <alexshinn@xxxxxxxxx>

Date: Mon, 12 May 2014 12:46:25 +0900

Cc: SRFI-115 discussion list <srfi-115@xxxxxxxxxxxxxxxxx>

In-reply-to: <20140510225646.GQ17946@mercury.ccil.org>

References: <CAMMPzYMg4wp2R9PetSGy+aF7TUJPWevkei8yLtrkZSt3NG=3SQ@mail.gmail.com> <20140509215947.GT32663@mercury.ccil.org> <CAMMPzYNFV-q9510W3nEa1ukrXpP8HObRH6XGmdnMf8UbpfF3aQ@mail.gmail.com> <20140510004929.GV32663@mercury.ccil.org> <CAMMPzYPdquEPtxxfT7jJ=3c5eaTQkZVe6nN2Wn=yPCWJn21wBg@mail.gmail.com> <20140510225646.GQ17946@mercury.ccil.org>

On Sun, May 11, 2014 at 7:56 AM, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:

> Alex Shinn scripsit:
>
> > What I will do is specifically note that
> >
> > (w/nocase upper)
> > (w/nocase lower)
> > cased
> >
> > are all the same thing (where cased is characters with
> > the cased (L&) property),
>
> But they aren't the same thing; I already showed that.

Yes. I'm proposing _defining_ them to be the same thing.

Specifically, in the w/nocase text after the explanation of how char-sets are handled, I would include:

As a special case, the pre-defined named character sets upper and lower (and their aliases upper-case and lower-case) are defined to match all characters with the cased property (L&). Note also that all other pre-defined named character sets are equivalent to themselves under w/nocase.

Rationale: The differences between the case-insensitive lower and upper and the cased property are few and unlikely to match user intention. Moreover, unlike the algorithmically mapped upper and lower char-sets, the cased property is readily available in most Unicode implementations.
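To make the distinction concrete, here is a rough sketch in Python (not Scheme, and not part of the proposed text) comparing an algorithmic case-insensitive closure of upper and lower against L&. It assumes L& means the general categories Lu, Ll and Lt, and it only follows single-character case mappings; the helper names are hypothetical, and the results depend on the Unicode version behind the interpreter's unicodedata module.

```python
import unicodedata

# Assumption: "L&" is taken to be the general categories Lu, Ll and Lt.
CASED_CATEGORIES = ("Lu", "Ll", "Lt")


def is_cased(ch):
    """True if ch falls in L& as approximated by general category."""
    return unicodedata.category(ch) in CASED_CATEGORIES


def single_char_variants(ch):
    """ch plus its upper/lower/title mappings, dropping multi-character results."""
    return {v for v in (ch, ch.upper(), ch.lower(), ch.title()) if len(v) == 1}


def nocase_member(ch, category):
    """Algorithmic closure: ch matches if any case variant has the given category."""
    return any(unicodedata.category(v) == category
               for v in single_char_variants(ch))


if __name__ == "__main__":
    # 'a', a titlecase digraph (U+01C5), and small letter kra (U+0138).
    for ch in ("a", "\u01c5", "\u0138"):
        print(f"U+{ord(ch):04X}",
              unicodedata.category(ch),
              "nocase-upper:", nocase_member(ch, "Lu"),
              "nocase-lower:", nocase_member(ch, "Ll"),
              "cased:", is_cased(ch))
```

In this sketch a lowercase letter with no uppercase counterpart, such as U+0138, lands in the closure of lower but not in the closure of upper, while still being in L&. Cases of that kind are what the special case is meant to smooth over.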

And the only realistic alternative I can see is making this special case optional, so that either behavior is correct.

Alex