new draft of SRFI 115

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.

A new draft, pending reflection on srfi.schemers.org, is
available at:

http://abrek.synthcode.com/srfi-115.html

Most issues are resolved. The one thing people
should pay attention to is how case folding works
in conjunction with subtractive char-set operations
(~, -, &).

I'll refine the text before the final draft, but what it
says is that in a w/nocase context, case mapping
happens at the leaf level. In a cset-sre you map
all literal characters to both their upper- and
lower-case equivalents, and similarly for all named
csets (e.g. upper and lower both map to alpha).

The alternative is to perform all cset operations
normally and only case map the final result. Usually
the two approaches give the same result; they differ
only when a leaf does not already contain all cases of
a character and is being used subtractively, as noted
above.

The rationale is:

1) It's what PCRE does. "b" does not match /[^AaB]/i,
because the [AaB] class first becomes [AaBb], _then_
we take the complement. If you were to take the
complement first, then you'd have [...bC-Zc-z...], and
case mapping this would match both "b" and "B".

2) It's needed when allowing nested w/ascii and w/noascii
within a cset-sre.

3) It allows mapping large named char sets statically,
avoiding expensive case mapping on large Unicode
sets.
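
To make point 1 concrete, here is a minimal sketch of the two
orderings using plain Python sets. It is only an illustration, not
the SRFI's reference implementation: the universe is restricted to
ASCII letters, and the helper names (fold, complement) are made up
for this example.

```python
import string

# Assumption for this sketch: the universe is ASCII letters only.
# Real char-sets range over all of Unicode.
UNIVERSE = frozenset(string.ascii_letters)

def fold(cset):
    """Close a set of characters under ASCII upper/lower case mapping."""
    return frozenset(c.lower() for c in cset) | frozenset(c.upper() for c in cset)

def complement(cset):
    """Complement of a char-set relative to UNIVERSE."""
    return UNIVERSE - cset

leaf = frozenset("AaB")

# Leaf-level mapping: [AaB] becomes [AaBb], then we complement.
leaf_level = complement(fold(leaf))

# Final-result mapping: complement first, then case map the result.
final_result = fold(complement(leaf))

print("b" in leaf_level)    # False: "b" is excluded, as in PCRE
print("b" in final_result)  # True: "b" survived the complement
print("B" in final_result)  # True: folding "b" brings "B" back in
```

With leaf-level mapping neither "b" nor "B" is in the resulting
set; with final-result mapping both are.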

Alex