Re: TR29 word boundary use cases



On Thu, Dec 12, 2013 at 2:18 PM, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:
> Alex Shinn scripsit:
>
> > So because of the lack of implementation support
> > and the unintuitiveness of the algorithm, I'm dropping
> > the TR29 word boundary requirement.
>
> I'm fine with that.  But if you are returning to the PCRE algorithm,
> what is the definition of "word character"?

It is the \w specified in TR18. Perl complies with it,
so I think we should use it too.  We should also provide an
SRE name for just this char-set.  The name can be long,
say `word-constituent', since uses of `word' itself will be
more common.

Note this is slightly different from the definition of an
identifier character from TR31.
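
To make the distinction concrete, here is a rough sketch in Python
(not SRE) of the TR18 Annex C definition of \w: Alphabetic, Mark,
decimal digit, Connector_Punctuation and Join_Control.  The function
name is made up for illustration.  Python's unicodedata module does
not expose the Alphabetic property, so general categories L* and Nl
stand in for it here.  That is an approximation and will miss a few
Other_Alphabetic characters.

```python
import unicodedata

# Join_Control: ZERO WIDTH NON-JOINER and ZERO WIDTH JOINER.
JOIN_CONTROL = {"\u200c", "\u200d"}


def is_word_constituent(ch: str) -> bool:
    # Hypothetical helper: approximates the TR18 word-character
    # test for a single character.
    if ch in JOIN_CONTROL:
        return True
    cat = unicodedata.category(ch)
    # L* and Nl approximate \p{Alphabetic} (assumption, see above);
    # M* is \p{gc=Mark}; Nd is \p{digit};
    # Pc is \p{gc=Connector_Punctuation}.
    return cat[0] in "LM" or cat in ("Nd", "Nl", "Pc")


if __name__ == "__main__":
    for ch in ["a", "_", "\u0663", "-", "\u00b7"]:
        print(repr(ch), is_word_constituent(ch))
    # Python identifiers follow the TR31 XID properties, so this
    # shows one place where the two definitions disagree.
    print("a\u00b7".isidentifier())
```

U+00B7 MIDDLE DOT is one case where the two differ: TR31 allows it
as an identifier continuation character, but it is plain punctuation
and so not a word character under the definition above.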

-- 
Alex