Re: Should SRFI-115 character sets match extended grapheme clusters?

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.



Alex Shinn scripsit:

> Normalization was in the early issues and dismissed because of lack
> of implementation support and unclear costs in new implementations.
> I think good recommended practice for now is to just normalize both
> inputs and patterns separately.

Okay, I can live with that.  But normalizing an SRE is not a matter of
normalizing the strings in the SRE: indeed, that will break it.  So at
the very least I think a normalize-sre procedure must be provided that
takes an SRE and does the nitty-gritty of selectively expanding charsets
into disjunctions of sequences.  That would not be incompatible
with PCRE, because its effect is global.
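
To illustrate why normalizing the strings inside an SRE breaks it, here is
a minimal sketch in Python (not Scheme, and not part of SRFI 115).  The names
naive_normalize_charset and normalize_charset are hypothetical, and NFD is
assumed as the normalization form.  Under NFD a charset member such as
U+00E9 becomes two code points, so it has to leave the charset and become a
sequence alternative.

```python
import unicodedata

def naive_normalize_charset(chars, form="NFD"):
    """Broken approach: normalize the members as one string, then
    treat every resulting code point as a charset member."""
    return set(unicodedata.normalize(form, "".join(sorted(chars))))

def normalize_charset(chars, form="NFD"):
    """Hypothetical charset step of a normalize-sre procedure.

    Returns (singles, sequences): members that stay one code point
    remain in a charset, and members that expand to several code
    points become string alternatives of a disjunction."""
    singles = set()
    sequences = []
    for ch in sorted(chars):
        norm = unicodedata.normalize(form, ch)
        if len(norm) == 1:
            singles.add(norm)
        else:
            sequences.append(norm)
    return singles, sequences

charset = {"a", "\u00e9"}                      # "a" and e-acute
naive = naive_normalize_charset(charset)        # also contains bare "e" and U+0301
singles, sequences = normalize_charset(charset) # charset part plus sequence part
```

The naive result matches a bare "e" or a lone combining acute accent, neither
of which the original charset matched.  The second result corresponds to an
SRE along the lines of (or ("a") "e\x301;"), with the exact spelling left to
the implementation.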

-- 
John Cowan          http://www.ccil.org/~cowan        cowan@xxxxxxxx
I Hope, Sir, that we are not mutually Un-friended by this Difference
which hath happened betwixt us.
     --Thomas Fuller, Appeal of Injured Innocence (1659)