
Re: SRFI 115 editorial

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.



Thanks for the editorial fixes!

On Mon, Oct 21, 2013 at 6:59 AM, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:

> lower-case, upper-case, alphabetic: the Unicode properties are *not* to
> be identified with the General Categories Ll, Lu, and L&, because for
> one thing, Alphabetic includes Nl, and for another, there are various
> Other_{Uppercase,Lowercase,Alphabetic} characters that are also included.

Sorry, I'll fix these:

  Alphabetic: Lu+Ll+Lt+Lm+Lo+Nl+Other_Alphabetic
  Lowercase: Ll + Other_Lowercase
  Uppercase: Lu + Other_Uppercase
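
To illustrate why the General Categories alone are not enough, here is a
minimal Python sketch. It assumes only the standard `unicodedata` module,
which exposes General Categories but not the Other_* properties; the helper
name `is_letter_category` is made up for this example.

```python
import unicodedata

def is_letter_category(ch: str) -> bool:
    """True if ch's General Category is one of Lu, Ll, Lt, Lm, Lo."""
    return unicodedata.category(ch).startswith("L")

roman_eight = "\u2167"  # ROMAN NUMERAL EIGHT, General Category Nl
circled_a = "\u24d0"    # CIRCLED LATIN SMALL LETTER A, General Category So

# Print code point, General Category, and whether a letters-only test accepts it.
for ch in (roman_eight, circled_a, "a"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_letter_category(ch))
```

A letters-only test rejects U+2167 even though Alphabetic includes Nl, and it
rejects U+24D0 even though that character is listed under Other_Lowercase.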

I think it's handy to include the short property names
because many regexp libraries allow for matching these
with something like \p{Lu}.
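
As a rough sketch of what matching on a short property name amounts to, the
following Python emulates a \p{Lu}-style test using General Categories only.
The function names `in_category` and `find_all` are hypothetical, and treating
a one-letter name such as "L" as a category group is an assumption about how
such a library would behave.

```python
import unicodedata

def in_category(ch: str, name: str) -> bool:
    """True if ch's General Category equals name (e.g. 'Lu'),
    or belongs to the one-letter group name (e.g. 'L' or 'N')."""
    cat = unicodedata.category(ch)
    return cat == name or (len(name) == 1 and cat.startswith(name))

def find_all(text: str, name: str) -> list:
    """Collect the characters of text that fall in the named category."""
    return [ch for ch in text if in_category(ch, name)]

print(find_all("SRFI 115 by Alex", "Lu"))
print(find_all("SRFI 115 by Alex", "N"))
```

This covers only the General Category names; the derived Uppercase and
Lowercase properties would need the Other_* data as well.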

> bog, eog:  This should point to UAX 29 rather than TR 18, because 29
> contains the actual definition, which is more complex than what you
> give here (you aren't allowing for Hangul grapheme clusters).  To avoid
> confusion, I wouldn't include any definition at all.

I'll update the link and change to the precise definition.
I find the formal description convoluted, and think an
SRE specification would be much easier to read.
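
For a concrete sense of the Hangul case, this small Python sketch decomposes
one precomposed syllable into its conjoining jamo. It uses only the standard
`unicodedata` module and does not implement the UAX 29 rules; it only shows
that a single user-perceived character can span several code points, which a
grapheme boundary definition has to keep together.

```python
import unicodedata

syllable = "\ud55c"  # HANGUL SYLLABLE HAN, one precomposed code point

# Canonical decomposition splits the syllable into leading consonant,
# vowel, and trailing consonant jamo.
jamo = unicodedata.normalize("NFD", syllable)

print(len(syllable), len(jamo))
print([unicodedata.name(c) for c in jamo])

# Recomposing yields the original syllable again.
print(unicodedata.normalize("NFC", jamo) == syllable)
```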

I probably won't have time to update the draft until
this weekend.

-- 
Alex