
Re: english names for symbolic SREs



On 11/27/2013 7:37 AM, John Cowan wrote:
> Alex Shinn scripsit:
>
>> It was John who insisted that the names be added, and John
>> who came up with most of the new names, so I'm assuming
>> he genuinely wants them.
> I do, though I didn't come up with the idea and in fact was initially
> against having more than one way to do it, but you convinced me otherwise.
> I think the long names are more self-documenting, more Schemey, and
> will make SREs more accessible to people who find string REs an
> abomination of the outer darkness.

Hypothetically, let's say that this SRFI specifies a new regular 
expression syntax called NRE. It should be straightforward to transform 
SREs into NREs. The existing SRE implementations (IrRegex and SCSH) can 
provide a procedure sre->nre that people with existing SREs can use, so 
that their code is not gratuitously left behind.
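
To illustrate how small such a translator could be, here is a minimal 
sketch of sre->nre. It is only an illustration under stated assumptions: 
NRE is hypothetical, the long names in the table are placeholders and not 
taken from any specification, and only the short operators discussed in 
this message are covered. It renames a symbol only when it appears as the 
head of a form, so submatch names and counts pass through unchanged.

```scheme
;; Hypothetical mapping from short SRE operators to long NRE names.
;; The long names are placeholders chosen for illustration only.
(define short->long
  '((:  . seq)
    ($  . submatch)
    (=> . submatch-named)
    (/  . char-range)
    (~  . complement)
    (=  . exactly)
    (>= . at-least)
    (** . repeated)))

;; Walk an SRE (assumed to be a proper list, string, char, number, or
;; symbol) and rename operator symbols in head position only.
(define (sre->nre sre)
  (if (pair? sre)
      (let* ((head (car sre))
             (entry (and (symbol? head) (assq head short->long)))
             (new-head (if entry (cdr entry) (sre->nre head))))
        (cons new-head (map sre->nre (cdr sre))))
      sre))

;; (sre->nre '(: "a" (= 3 (/ "az"))))
;; => (seq "a" (exactly 3 (char-range "az")))
```

Operators missing from the table, such as * and &, are left as they are, 
so a real translator would need a complete table.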

The problem with providing both short names and long names is that when 
I write SREs I can just use the long names, but when I read other 
people's SREs, I potentially still need to know both.

SREs not only take the existing short names from PCREs, but also add more.

(: <sre1> <sre2> ...) means match <sre1> and <sre2> and ... Scheme 
already has an operator that means and.
($ <sre> ...) means numbered submatch. In PCREs, $ means match the end 
of the line.
(=> <name> <sre> ...) means named submatch. In Scheme, => appears in cond 
clauses, where it means call the procedure on the result of evaluating 
the test.
(/ <range-spec> ...) means ranges. In Scheme / already means divide.
(~ <cset-sre> ...) means complement of union. If you are a C or C++ 
programmer, then this makes sense.
(= <n> <sre> ...) means match <n> times. As a Scheme programmer, I read 
that as <n> equals <sre> equals ...
(>= <n> <sre> ...) means match <n> or more times. As a Scheme 
programmer, I read that as <n> greater than or equal to <sre> greater 
than or equal to ...
(** <n> <m> <sre> ...) means match <n> to <m> times. I remember it as 
the Fortran exponential operator, but it has been a long time since I 
programmed in Fortran.

This is embarrassing. Does this example from the specification really 
look like Scheme?

(regexp-matches '(* (& (/ "az") (~ ("aeiou")))) "xyzzy")

I feel like I have gone into an Apple store and tried to convince 
everyone that they should be running Microsoft Windows.