
Re: TR29 word boundary use cases

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.

On Thu, Dec 12, 2013 at 2:18 PM, John Cowan <cowan@xxxxxxxxxxxxxxxx> wrote:
Alex Shinn scripsit:

> So because of the lack of implementation support
> and the unintuitiveness of the algorithm, I'm dropping
> the TR29 word boundary requirement.

I'm fine with that.  But if you are returning to the PCRE algorithm,
what is the definition of "word character"?

This is the \w specified in TR18, and Perl complies with it,
so I think we should use it.  We should also provide an
SRE name for just this char-set.  We can make it long and
say `word-constituent' since the `word' uses will be more

Note this is slightly different from the definition of an
identifier character from TR31.
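
For concreteness, here is a rough sketch of that char-set in Python.
As I read TR18 Annex C, \w is the union of Alphabetic, gc=Mark,
Decimal_Number, Connector_Punctuation and Join_Control.  Python's
unicodedata module does not expose the Alphabetic property, so the
sketch approximates it by general category (letters plus Nl), which
misses a few Other_Alphabetic symbols such as the circled letters.
The name is_word_constituent is made up for illustration and is not
part of any SRE implementation.

```python
import unicodedata

# Join_Control is exactly these two code points: ZWNJ and ZWJ.
JOIN_CONTROLS = {"\u200c", "\u200d"}


def is_word_constituent(ch: str) -> bool:
    """Approximate the TR18 \\w class for a single character.

    Assumption: Alphabetic is approximated by general categories
    L* and Nl; the real property also covers some So characters.
    """
    cat = unicodedata.category(ch)
    return (
        cat[0] in "LM"                  # letters (approx. Alphabetic) and gc=Mark
        or cat in ("Nl", "Nd", "Pc")    # letter numbers, decimal digits, connectors
        or ch in JOIN_CONTROLS          # gc=Cf, so they need an explicit check
    )


if __name__ == "__main__":
    for ch in ["a", "5", "_", "\u0301", "\u200d", "-", " "]:
        print(f"U+{ord(ch):04X}", is_word_constituent(ch))
```

A real implementation would want to build the set from the Unicode
data files (DerivedCoreProperties.txt for Alphabetic) rather than
lean on general categories.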