Re: benefits of SRE syntax

This page is part of the web mail archives of SRFI 115 from before July 7th, 2015. The new archives for SRFI 115 contain all messages, not just those from before July 7th, 2015.



On Mon, Oct 21, 2013 at 2:15 AM, Per Bothner <per@xxxxxxxxxxx> wrote:
On 10/20/2013 07:21 AM, Alex Shinn wrote:
On Thu, Oct 17, 2013 at 4:14 AM, Per Bothner <per@xxxxxxxxxxx> wrote:
    I think structured regular expressions make sense when integrated
    with a general pattern-matching framework (by which I mean something
    like http://docs.racket-lang.org/reference/match.html). Also,
    sub-matches should produce variable bindings.

I think it would be strange to provide regexp
matching as part of a general matching framework
without providing access to the underlying regexp
library.

You're assuming there is an underlying regexp library
for handling string regexps.  But that assumes a separate
syntax and operators for string regexp matching.

Yes, indeed I was, because you pointed to the Racket
match library, for which that was the case.

I suggest taking a look at CDuce (or its precursor XDuce)
http://www.cduce.org/papers/cduce-design.ps.gz
http://www.cduce.org/manual_types_patterns.html

I think this is a much more elegant approach, and much more
in the "spirit of Scheme" (if you take away the static typing
aspect - which of course I like).

This also appears to have regexp patterns as a
distinct component of the syntax, in much the
same spirit as SREs.  They are polymorphic
because ML is polymorphic but, as John notes,
Scheme is not.  The overhead of making a
polymorphic regexp library in Scheme would likely
be too expensive (unless you're using a Scheme
with optional static typing such as Kawa).

Even so, efficient DFA construction on arbitrary
types is difficult, and the CDuce paper doesn't
seem to say anything about the matching algorithm
they use.  I have my doubts about their regexp
performance.
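
To make the difficulty concrete, here is a minimal sketch (in Python, for brevity) of regexp-style matching over sequences of arbitrary objects. Everything in it is hypothetical: the names `elem`, `seq_`, `alt`, `star` and `matches` are made up for illustration and are not part of SRFI 115, Racket's match, or CDuce. Because the "alphabet" is given by arbitrary predicates rather than an enumerable character set, there is no obvious transition table to build, so this sketch falls back on plain backtracking and is not a DFA.

```python
from typing import Any, Callable, Iterator, Sequence

# A matcher takes a sequence and a start index and yields every end index
# at which the pattern can finish matching from that start.
Matcher = Callable[[Sequence[Any], int], Iterator[int]]


def elem(pred: Callable[[Any], bool]) -> Matcher:
    """Match one element satisfying pred (the analogue of a char class)."""
    def m(seq: Sequence[Any], i: int) -> Iterator[int]:
        if i < len(seq) and pred(seq[i]):
            yield i + 1
    return m


def seq_(*parts: Matcher) -> Matcher:
    """Match each part in order, backtracking over earlier choices."""
    def m(seq: Sequence[Any], i: int) -> Iterator[int]:
        if not parts:
            yield i
            return
        rest = seq_(*parts[1:])
        for j in parts[0](seq, i):
            yield from rest(seq, j)
    return m


def alt(*choices: Matcher) -> Matcher:
    """Try each alternative in turn."""
    def m(seq: Sequence[Any], i: int) -> Iterator[int]:
        for choice in choices:
            yield from choice(seq, i)
    return m


def star(p: Matcher) -> Matcher:
    """Zero or more repetitions of p, longest candidates first."""
    def m(seq: Sequence[Any], i: int) -> Iterator[int]:
        for j in p(seq, i):
            if j > i:  # skip empty iterations so the loop terminates
                yield from m(seq, j)
        yield i
    return m


def matches(p: Matcher, seq: Sequence[Any]) -> bool:
    """True if p matches the whole sequence."""
    return any(j == len(seq) for j in p(seq, 0))


# Hypothetical element predicates standing in for character classes.
def is_str(x: Any) -> bool:
    return isinstance(x, str)


def is_int(x: Any) -> bool:
    return isinstance(x, int) and not isinstance(x, bool)


# "one string followed by zero or more integers", over a list or a tuple
pattern = seq_(elem(is_str), star(elem(is_int)))
```

With these definitions, `matches(pattern, ["x", 1, 2, 3])` succeeds on a list and `matches(pattern, ("x", 1, "y"))` fails on a tuple. The worst case is exponential, which is exactly the cost a DFA would avoid if one could be built for arbitrary element types.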

If "structured regular expressions" are to be part of the
language, we should think about how they apply to sequences
(lists and vectors) in general, not just strings.

I'm not sure what you mean by "structured," but
"structural regular expressions" are a separate concept:

http://doc.cat-v.org/bell_labs/structural_regexps/

-- 
Alex