Alex Shinn scripsit:
Looks good to me.
> As a special case, the pre-defined named character sets
> upper and lower (and their aliases upper-case and lower-case)
> are defined to match all characters with the cased property (L&).
> Note also that all other pre-defined named character sets are
> equivalent to themselves under w/nocase.
> Rationale: The differences between the case-insensitive
> lower and upper and the cased property are few and unlikely
> to match user intention. Moreover, unlike the algorithmically
> mapped upper and lower char-sets, the cased property is
> readily available in most Unicode implementations.
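To make the "readily available" point concrete, here is a minimal sketch in Python rather than Scheme. It assumes that "cased" means general category L&, i.e. the union of Lu, Ll, and Lt, as the quoted text suggests. The helper name is_cased is hypothetical and is not part of any SRFI or library API.

```python
import unicodedata

# L& (cased letter) is the union of the general categories Lu, Ll, and Lt.
CASED_CATEGORIES = {"Lu", "Ll", "Lt"}

def is_cased(ch: str) -> bool:
    """Hypothetical helper: True if ch falls in general category L&."""
    return unicodedata.category(ch) in CASED_CATEGORIES

# Under w/nocase, upper and lower would both behave like this predicate.
samples = ["A", "a", "\u01c5", "1", "\u0301"]
results = {s: is_cased(s) for s in samples}
```

U+01C5 (general category Lt) is the interesting sample: it is titlecase rather than upper or lower case, yet it falls in L& and so would be matched by the proposed definition.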
I think this language should also be added:
Note that placing a sequence consisting of a base character
and combining characters into a character string representing
a character set will not do what the user probably expects;
it will create a character set pattern containing the base
character and the combining character(s) as alternatives.
For the same reason, it is inadvisable to apply Unicode
normalization to such strings.
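
The effect can be sketched by analogy with Python's re module, whose bracketed character classes also treat each code point in the string as a separate alternative. This is only an illustration of the behavior described above, not SRFI 115 code, and the variable names are made up for the example.

```python
import re
import unicodedata

# 'e' followed by U+0301 COMBINING ACUTE ACCENT: two code points.
decomposed = "e\u0301"
char_class = re.compile("[" + decomposed + "]")

# Each code point becomes its own alternative in the set.
matches_plain_e = char_class.fullmatch("e") is not None
matches_bare_accent = char_class.fullmatch("\u0301") is not None
# The set matches exactly one code point, so the full sequence fails.
matches_sequence = char_class.fullmatch(decomposed) is not None

# NFC normalization silently changes the set: the two code points
# collapse into the single precomposed U+00E9, and 'e' drops out.
composed = unicodedata.normalize("NFC", decomposed)
nfc_class = re.compile("[" + composed + "]")
nfc_matches_plain_e = nfc_class.fullmatch("e") is not None
```

Before normalization the set matches a bare "e" or a bare combining accent but not the accented sequence; after NFC it contains only U+00E9 and no longer matches "e" at all.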