
Re: english names for symbolic SREs

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.

On 11/27/2013 7:37 AM, John Cowan wrote:
> Alex Shinn scripsit:
>> It was John who insisted that the names be added, and John
>> who came up with most of the new names, so I'm assuming
>> he genuinely wants them.
> I do, though I didn't come up with the idea and in fact was initially
> against having more than one way to do it, but you convinced me otherwise.
> I think the long names are more self-documenting, more Schemey, and
> will make SREs more accessible to people who find string REs an
> abomination of the outer darkness.

Hypothetically, let's say that this SRFI specifies a new regular
expression syntax called NRE. It should be straightforward to transform
SREs into NREs. The existing SRE implementations (IrRegex and SCSH) can
provide a procedure sre->nre, which people with existing SREs can use so
that their code is not gratuitously left behind.
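
A minimal sketch of what such an sre->nre could look like, assuming an NRE
is simply an SRE with every short operator name replaced by a long one. The
long names in the table are illustrative assumptions, not names this SRFI
has settled on, and the procedure itself is hypothetical:

```scheme
;; Hypothetical table mapping short SRE operators to long names.
;; The long names are assumptions chosen for illustration only.
(define short->long
  '((:  . seq)
    ($  . submatch)
    (=> . submatch-named)
    (/  . char-range)
    (~  . complement)
    (=  . exactly)
    (>= . at-least)
    (** . repeated)
    (*  . zero-or-more)
    (+  . one-or-more)
    (?  . optional)
    (&  . and)))

;; Walk the SRE and rename the operator of every form whose head is a
;; symbol found in the table. Anything else (strings, numbers, bare
;; symbols, and literal char-set forms such as ("aeiou")) is returned
;; unchanged.
(define (sre->nre sre)
  (if (and (pair? sre) (symbol? (car sre)))
      (let ((entry (assq (car sre) short->long)))
        (cons (if entry (cdr entry) (car sre))
              (map sre->nre (cdr sre))))
      sre))
```

Under these assumptions, (sre->nre '(* (& (/ "az") (~ ("aeiou"))))) returns
(zero-or-more (and (char-range "az") (complement ("aeiou")))).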

The problem with providing both short names and long names is that when
I write SREs I can just use the long names, but when I read other
people's SREs, I potentially still need to know both.

SREs not only take the existing short names from PCREs, but also add more.

(: <sre1> <sre2> ...) means match <sre1> and <sre2> and ... Scheme 
already has an operator that means and.
($ <sre> ...) means numbered submatch. In PCREs, $ means match the end 
of the line.
(=> <name> <sre> ...) means named submatch. In a Scheme cond clause, =>
means call the procedure on the result of evaluating the test.
(/ <range-spec> ...) means ranges. In Scheme / already means divide.
(~ <cset-sre> ...) means complement of union. If you are a C or C++ 
programmer, then this makes sense.
(= <n> <sre> ...) means match <n> times. As a Scheme programmer, I read 
that as <n> equals <sre> equals ...
(>= <n> <sre> ...) means match <n> or more times. As a Scheme
programmer, I read that as <n> greater than or equal to <sre> greater
than or equal to ...
(** <n> <m> <sre> ...) means match <n> to <m> times. I remember ** as
the Fortran exponentiation operator, but it has been a long time since I
programmed in Fortran.

This is embarrassing. Does this example from the specification really 
look like Scheme? (regexp-matches '(* (& (/ "az") (~ ("aeiou")))) "xyzzy")

I feel like I have gone into an Apple store and tried to convince 
everyone that they should be running Microsoft Windows.