204: Wright-Cartwright-Shinn Pattern Matcher

by Felix Thibault


This SRFI is currently in withdrawn status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-204@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Table of Contents


Pattern matching decomposes a compound data structure into parts and assigns those parts to variables. This SRFI describes a pattern-matching library already in use by several scheme implementations which can match many common compound data structures.



Scheme has come with its own pattern-matching language since R4RS, as part of syntax-rules [Shinn, Cowan, Gleckler pp. 23-24]. Many implementations add a pattern matcher that does not involve writing macros to make pattern matching available to all users. Before R4RS, there was only the case [Shinn, Cowan, Gleckler pp. 14-15] statement, which does match, but does not bind its key. Bigloo1, STklos2 and s73 all use extended case statements for pattern matching.

Currently, the pattern-matching facilities that have been implemented by SRFIs are procedures of variable arity (SRFI 16) and S-expressions (SRFI 200). The focus of SRFI 200 is both consensus and clarity. That is, it aims to provide a pattern syntax that is both widely shared and easily read without requiring knowledge of any notation that is not ordinary Scheme. SRFI 200 also aims to provide a portable, readable way of using the pattern matchers already available in many implementations.

This SRFI aims to specify a matching syntax beyond S-expressions and provide a sample implementation, documentation, and tests. It is based on a pattern matcher originated by Andrew K. Wright and Robert Cartwright [Wright and Cartwright, pp. 116-120], using procedures and defmacros. Using a technique developed by Oleg Kiselyov [Kiselyov, 2003] Alex Shinn converted the pattern matcher to pure syntax-rules. This pattern-matching library (WCS) is already in use (at least in part) in several Scheme implementations. Aside from S-expressions, it also matches vectors and records. It has the ability to match based on predicates, to match "fields" (procedure applications), getters, and setters, and has syntax to match patterns zero or more, one or more, or exactly n times, or to walk down a tree with a pattern.

WCS is already part of Chibi4, Guile,5 Cyclone6, Chicken7, LispKit8, Loko9, Mosh10, and Sagittarius11. There are other pattern-matching libraries that are part of Armpit Scheme12, Gerbil13, Bigloo14, STklos15, the bindings library in Chicken16, Kawa17, Gambit18, Gauche19, Picrin20, Racket21, Scheme 9 from Empty Space22, and Scheje23. These matchers vary from very similar to the WCS matcher to S-expression matchers, type-oriented matchers, and expert systems more like ELIZA [Norvig, Chapter 5]. They also range from a wrapper around a library in the implementation language, part of the core in the implementation language, part of the core in Scheme, a Scheme library, and an example program. In length, they range from about twenty lines to thousands of lines across several files. Links to documentation and implementation (as available) are in the repository.

Starting with the WCS matcher, a portable syntax can be developed and a broad basis for implementing pattern matchers laid down. Some features of the current matcher are specified as optional and alternate means to achieve the same ends are specified. The current implementation, along with tests, documentation, and implementation-dependent code are in the repository.



Patterns are written to look like the printed representation of the objects they match. The basic usage is (match expr (pat body ...) ...) where the result of expr is matched against each pattern in turn, and the corresponding body is evaluated for the first to succeed. Thus, a list of three elements matches a list of three elements.

(let ((ls (list 1 2 3)))
  (match ls
    ((1 2 3) #t)))
=> #t

If no patterns match, an error is signalled. The forms match, match-lambda, and match-lambda* all can match their expression(s) against multiple patterns, but match-let, match-let*, and match-letrec match each expression against one pattern, so non-matching conditions have to be caught outside the match. The following terms will be used in describing pattern matching:

data that is supposed to match against the first argument to the match form.
the set of expressions that is evaluated when a pattern matches.
a thunk that can be called in the body to signal that a match was not successful, and restart matching at the next pattern clause.
next, subsequent, etc.
patterns are matched in the order they are listed. Matching stops at the first match unless a call is made to failure.
pattern clause
a pattern and its body.
matching which involves repetition through a variable that is used more than once in a pattern.
an identifier in a pattern which may be used in the body. May not be _, quote, $, struct, object, =, and, or, not, ?, set!, get!, quasiquote, ..., ___, **1, =.., *.. unquote, unquote-splicing.
the auxiliary syntax $ struct object = and or not ? set! get! ... ___ **1 =.. *.. which are exported (if not already defined) via SRFI 206, if available.
any quoted or self-quoting datum
is an error/are errors
according to the FAQ this phrase, unlike "signals an error," doesn't define any behavior.

Syntax and Semantics

A pattern matcher has the following form:

(match <arg-expression> <clause> <clause> ... )
<clause> -> (<pattern> <body>) |
            (<pattern>  (=> <non-pattern identifier>) <body>)

Syntax: it is an error if <arg-expression> does not match a pattern in any of the clauses. It is an error for the <non-pattern identifier> to be a pattern variable.

A pattern can have one of the following forms:

<pattern> -> (<pattern> ...) |
            #(<pattern> ...) |
               <pattern identifier> |
               <pattern s-expression> |
               <tail pattern> |
               <ellipsis pattern> |
               <record pattern> |
               <getter or setter pattern> |
               <field pattern> |
               <predicate pattern> |
               <boolean pattern> |
               <tree pattern> |
               <quasi-quote pattern> |

Syntax: A <pattern identifier> is any identifier except and or not set! get! $ struct object _ ... ___ =.. *** *.. = ? quote quasi-quote unquote-splicing and can be used as a pattern variable. A pattern variable will match any value, and can be repeated.

A <pattern s-expression> can be a boolean, number, character, vector, a (quote symbol) or (quote <pattern s-expression> ...). A <pattern s-expression> matches an <arg-expression> that is exactly like it. (A symbol in a quoted s-expression does not need a separate quote).

A tail pattern has the form:

<tail pattern> -> (<pattern> <pattern> ... . <pattern>) |
                 #(<pattern> <pattern> ... . <pattern>)

An ellipsis pattern has the form:

<ellipsis pattern> -> (<pattern> <pattern> ... <ellipsis form> <pattern> ...) |
                     #(<pattern> <pattern> ... <ellipsis form> <pattern> ...)
<ellipsis form> -> ... |
                   ___ |
                   =.. <number> |
                   *.. <number 1> <number 2>

Syntax: It is an error for a list to contain more than one ellipsis form or tail pattern, but the patterns that list contains can contain additional ellipsis forms or tail patterns. It is an error for <number 2> to be less than <number 1>.

A record pattern has the form:

<record pattern> -> ($ <record type> <pattern> <pattern> ...) |
                    (struct <record type> <pattern> <pattern> ...) |
                    (object <record type> (<slot name> <pattern>) (<slot name> <pattern>) ...)
<record type> -> implementation dependent
<slot name> -> <identifier>

Syntax: It is an error for the number of patterns in the $ or struct forms to exceed the number of fields in the record. It is an error if the record does not have a field named <slot name>. If a record is exported from a library without its <record type>, this form cannot be used.

A getter or setter pattern has the form:

<getter or setter pattern> -> (get! <pattern identifier>) |
                              (set! <pattern identifier>)

A field pattern has the form:

<field pattern> -> (= <operator> <pattern>)

A predicate pattern has the form:

<predicate pattern> -> (? <predicate> <pattern> ...)

A boolean pattern has the form:

<boolean pattern> -> (and <pattern> ...) |
                     (or  <pattern> ...) |
                     (not <pattern> <pattern> ...)

Syntax: It is an error for not patterns to be empty. It is an error to refer to the same variable inside and outside a not pattern.

A tree pattern has the form:

<tree pattern> -> (<pattern> *** <pattern>)

Syntax: It is an error to combine *** with ..., ___, ., *.., or =.. .

A quasi-quote pattern has the form:

<quasi-quote pattern> -> `<quasi-quote datum> |
                         (quasi-quote <quasi-quote datum>)
<quasi-quote datum> -> <pattern s-expression> |
                       <identifier> |
                       (<quasi-quote contents> ...)
<quasi-quote contents> -> <quasi-quote datum> |
                          ,<pattern> |
                          (unquote <pattern>) |
                          ,@<pattern> |
                          (unquote-splicing <pattern>)

Syntax: identifiers in unquoted patterns are subject to the same naming constraints as other pattern variables. ,@/unquote-splicing is a type of ellipsis pattern so it is a syntax error if a list contains this pattern in combination with tail, or other ellipsis patterns (=.. and *.. have to be unquoted to be interpreted as ellipsis patterns).

A _ pattern is a wildcard that will match anything but does not bind.

Helper forms:

(match-lambda <clause> <clause> ...)
(match-lambda* <clause> <clause> ...)
(match-let (<match-let clause> ...) <body>)
(match-let <identifier> (<match-let clause> ...) <body>)
(match-let\* (<match-let clause> ...) <body>)
(match-letrec (<match-let clause> ...) <body>)
<match-let clause> -> (<pattern> <arg-expression>) |
                      <binding spec>

Semantics: The use of match takes an expression and compares it to a series of patterns, beginning with the leftmost. When the expression matches a pattern, it evaluates the corresponding body, with any bindings formed in the pattern. A failure pattern assigns a zero-argument function to a procedure so that if it is determined in the body that the match actually failed, that procedure can be called to restart matching at the next argument.

A pattern identifier (which is any identifier except and or not set! get! $ struct object _ ... ___ =.. *** *.. = ? quote quasi-quote unquote-splicing) will match and bind anything, and bind it to that pattern variable. The special identifier _ matches anything but does not bind. The forms (<pattern> ...) and #(<pattern> ...) match lists and vectors whose elements match the contained patterns (() and #() match the empty list and vector).

A pattern s-expression matches literal values.

A tail pattern matches the rest of a list or vector.

A pattern followed by ellipsis matches zero or more times. A pattern followed by =.. k matches k times. A pattern followed by *.. k j matches between k and j times (inclusive).

The $ and struct syntax of record patterns match record fields positionally. The object record pattern matches an identifier against the record field slot names. This pattern may not be available (because an implementation has not made the necessary record introspection forms available) or may not be available for a particular record (because the record type is not exported).

Getter and setter functions bind a zero- and one-argument function to their identifier. The getter function returns the value of its match when called, and the setter function will take a new value to set its match to.

A field pattern (= <operator> <pattern> ...) applies operator (a one argument function) to what it is matching, and then matches the return value against its pattern. A predicate pattern (? <predicate> <pattern>) is similar in that it applies its predicate to a value, but if the predicate is true, that value is then matched against all additional patterns.

A boolean pattern has one of the forms (and <pattern> ...), (or <pattern> ...), (not <pattern> <pattern> ...). All of the forms in an and pattern must match, and an empty and pattern is true. At least one of the forms in the or pattern must match, and an empty or is false. None of the forms in a not pattern should match and an empty not is a syntax error.

In a tree pattern (<left pattern> *** <right pattern>), the /<left pattern> represents a path to the subtree that matches the <right pattern>.

In a quasi-quote pattern, <pattern s-expressions> and <identifiers> match their literal values (as symbols in the case of <identifiers>). Unquoted <pattern>s match as usual, and unquote-spliced <pattern>s have ellipsis matching.

match-lambda produces a one-argument lambda expression that matches its argument. match-lambda* makes a lambda expression that accepts any number of arguments and matches against a list of those arguments.

match-let match-let* and match-letrec both allow binding a name to a value through either standard let/letrec syntax or by matching a pattern against a value.

Pattern Grammar

pat : patvar                       ;; anything, and binds patvar
    | _                            ;; anything
    | ()                           ;; the empty list
    | #t                           ;; #t
    | #f                           ;; #f
    | string                       ;; a string
    | number                       ;; a number
    | character                    ;; a character
    | 'sexp                        ;; an s-expression
    | 'symbol                      ;; a symbol (special case of s-expr)
    | (pat1 ... patN)              ;; list of n elements
    | (pat1 ... patN . patN+1)     ;; list of n or more
    | (pat1 ... patN patN+1 ooo)   ;; list of n or more, each element
                                   ;;   of remainder must match patN+1
    | (pat1 ... patN patN+1 ooo patN+2 ... patM)
                                   ;; list of m or more, every element between the
				   ;; nth and the (m or more - n)th must match
				   ;; patN+1.
    | #(pat1 ... patN)             ;; vector of n elements
    | #(pat1 ... patN patN+1 ooo)  ;; vector of n or more, each element
                                   ;;   of remainder must match patN+1
    | #(pat1 ... patN patN+1 ooo patN+2 ... patM)
                                   ;; vector of m or more, every element between the
				   ;; nth and the (m or more - n)th must match
				   ;; patN+1.
    | ($ record-type pat1 ... patN)      ;; a record (patK matches in slot
                                         ;;    order) optional
    | (struct struct-type pat1 ... patN) ;; ditto (*) optional
    | (object struct-type (slot1 pat1) ...) ;; a record (using slot names) (*)
                                         ;;    optional
    | (= proc pat)                 ;; apply proc, match the result to pat
    | (and pat ...)                ;; if all of pats match
    | (or pat ...)                 ;; if any of pats match
    | (not pat ...)                ;; if no pats match
    | (? predicate pat ...)        ;; if predicate true and all pats match
    | (set! patvar)                ;; anything, and binds setter
    | (get! patvar)                ;; anything, and binds getter
    | (pat1 *** pat2)              ;; tree subpattern (*)
    | `qp                          ;; a quasi-pattern

    (*) extended syntax not originally part of
        Wright-Cartwright pattern matcher.

patvar : an identifier except _, quote, $, struct, @, object, =, and, or,
         not, ?, set!, get!, quasiquote, ..., ___, **1, =.., *..
	 unquote, unquote-splicing.

ooo : ...                          ;; zero or more
    | ___                          ;; zero or more
    | **1                          ;; one or more
    | =.. k                        ;; exactly k where k is an integer. (*)
                                   ;;   Example: =.. 1, =.. 2 ...
    | *.. k j                      ;; between k and j, where k and j are (*)
                                   ;;   integers. Example: *.. 3 4, match 3
                                   ;;   or 4 of a pattern *.. 1 5 match from
				   ;;   1 to 5 of a pattern

    (*) extended syntax not originally part of
        Wright-Cartwright pattern matcher.
    Note:  a list can contain only one ellipsis pattern which includes the
             patterns above, tail patterns, and unquote-splicing patterns.

qp  : ()                           ;; the empty list
    | #t                           ;; #t
    | #f                           ;; #f
    | string                       ;; a string
    | number                       ;; a number
    | character                    ;; a character
    | identifier                   ;; a symbol
    | (qp_1 ... qp_n)              ;; list of n elements
    | (qp_1 ... qp_n . qp_{n+1})   ;; list of n or more
    | (qp_1 ... qp_n qp_n+1 ooo)   ;; list of n or more, each element
                                   ;;   of remainder must match qp_n+1
    | #(qp_1 ... qp_n)             ;; vector of n elements
    | #(qp_1 ... qp_n qp_n+1 ooo)  ;; vector of n or more, each element
                                   ;;   of remainder must match qp_n+1
    | ,pat                         ;; a pattern
    | ,@pat                        ;; a pattern

Tail Contexts

If a pattern matcher is in tail context, expressions marked as tail-bodies below will be in tail context.

(match <expression> <match-clause>+)

(match-lambda <match-clause>+)

(match-lambda* <match-clause>+)

(match-let ((<pattern> <expression>)) <tail-body>+)

(match-letrec ((<pattern> <expression>)) <tail-body>+)

(match-let <variable> ((<pattern> <expression>)) <tail-body>+)

(match-let* ((<pattern> <expression>)) <tail-body>+)


<match-clause> → (<pattern> <tail-body>+)

Side Effects

There are two types of patterns where the issue of side-effects is prominent:


The following is a list of examples of the erroneous conditions that cause an error to be signaled and the message given by the current implementation:

There are also three conditions that are errors and may return a value:

Using in Other Macros

Other macros that use the pattern matcher will be subject to the same constraints, in particular with respect to names. For example:

(define-syntax make-chunker
  (syntax-rules ()
    ((_ s ...)
     (lambda (l)
       (let lp ((l l))
	 (match l (() '())
		  ((s ... . rest) (cons (list s ...) (lp rest)))
		  (end (list end))))))))

((make-chunker a b c d) (iota 20))
=> ((0 1 2 3) (4 5 6 7) (8 9 10 11) (12 13 14 15) (16 17 18 19))

((make-chunker a b c _) (iota 20))
=> ERROR on line 16: invalid use of syntax as value: _

((make-chunker a b c ___) (iota 20))
=> ERROR on line 715 of file ./srfi-204/srfi-204.scm:
     no expansion for:
       (match-syntax-error "dotted tail not allowed after ellipsis" rest)

Pattern Examples

A pattern is an expression made up of pattern operators, literals, and identifiers, which are interpreted as pattern variables. A quasi-pattern is a type of pattern that is quasi-quoted so it is made up of literals, unquoted identifiers, and unquoted operator expressions. Both types of patterns can be used in the sub-forms below, with some caveats for quasi-patterns.

Literal Patterns

Literal patterns are patterns like the example above, where the content of the pattern is only quoted or self-quoting data. In this case, quasi-patterns and patterns are very similar (except symbols don't need quoting in quasi-patterns):

(let ((ls (list 'a "b" #f 2 '() #\c '#(1))))
  (list (match ls (('a "b" #f 2 () #\c #(1)) 'ok))
        (match ls (`(a "b" #f 2 () #\c #(1)) 'ok))))
=> (ok ok)

The above expression will not work for R6RS, because the vector literal [#(1)] in ls is not quoted. Note also that the empty list does not need to be quoted in either type of pattern. At the time of this writing, cyclic literals seem irregularly implemented, at least in the repl, but this is one area where quasi-patterns and other literal patterns differ, since cyclic literals are illegal in quasiquote (R7RS, Sect. 2.4) and will lead to some kind of error/crash when they occur in quasi-patterns.

Pattern Variables and Var

Identifiers in patterns that are not one of the operators (_, quote, $, struct, @, object, =, and, or, not, ?, set!, get!, quasiquote, ..., ___, **1, =.., *.. unquote, unquote-splicing) will match anything and their corresponding binding is available in the body. The special variable _ matches anything and does not bind the result. It can be used to get rid of unintersting values or in a block of match expressions, _ is often used like else if the result is not needed. Pattern variables have to be unquoted in quasi-patterns.

(match (list 1 2 3) ((a b c) b)) => 2
(match (list 1 2 3) ((_ b _) b)) => 2
(match (list 1 2 3) (`(a ,b c) b) (_ 'fail)) => fail
(match (list 1 2 3) (`(1 ,b ,_) b) (_ 'fail)) => 2

Nonlinear Patterns: Repetition

A nonlinear pattern has repeated pattern variables:

(match (list 'A 'B 'A) ((a b a) a) (_ 'fail)) => A
(match (list 'A 'B 'A) (`(,a b ,a) a) (_ 'fail)) => fail
(match (list 'A 'B 'A) (`(,a B ,a) a) (_ 'fail)) => A
(match (list 'A 'B 'A) (`(,a ,b ,a) a) (_ 'fail)) => A

After the first match, subsequent instances must match a value equal? to the first. Any nonlinear pattern clause that involves repetition can be converted into a linear pattern clause that uses failure to restart pattern matching at the next pattern:

(match (list 1 2 1) ((a b a) a) (_ 'fail))

converts to

(match (list 1 2 1) ((a b c) (=> fail)
                        (if (equal? a c) a (fail)))
		       (_ 'fail))

If an implementation chooses not to support repetition, only the second pattern is valid and the first (as well as those in the first part of this section) should cause a syntax error.

Ellipsis and Tail Patterns

The first type of ellipsis pattern is just the ellipsis ..., which is equivalent to the unquote-splicing in quasi-patterns. An ellipsis will match any number (zero or more) of a pattern (like a regex Kleene star):

(match (list 1 2) ((1 2 3 ...) #t)) => #t
(match (list 1 2) (`(1 2 ,@3) #t)) => #t
(match (list 1 2 3) ((1 2 3 ...) #t)) => #t
(match (list 1 2 3) (`(1 2 ,@3) #t)) => #t
(match (list 1 2 3 3 3) ((1 2 3 ...) #t)) => #t
(match (list 1 2 3 3 3) (`(1 2 ,@3) #t)) => #t

(match '((a time) (stitch saves) (in nine)) (((x y) ...) (list x y)))
=> ((a stitch in) (time saves nine))

(match '((a b) (c d) (e f)) (`(,@(x y)) (list x y))) => ((a c e) (b d f))

There can be only one ellipsis or tail pattern per list, but there can be multiple ellipsis or tail patterns in a pattern:

;;;[List-of [List-of Any]] -> [List-of [List-of Any]]
(define transpose
  (match-lambda (((a b ...) ...) (cons a (transpose b))) (_ '())))

(transpose '((1 2 3) (4 5 6))) => ((1 4) (2 5) (3 6))

Ellipsis patterns also do not have to be the last pattern in a list:

(define (palindrome? str)
  (let loop ((chars (filter char-alphabetic?
                            (string->list (string-foldcase str)))))
    (match chars
      (() #t)
      ((a) #t)
      ((a b ... a) (loop b))
      (_ #f))))

(palindrome? "Able was I, ere I saw Elba.") => #t
(palindrome? "Napoleon") => #f

and they can be used with _ to throw away unimportant data:

(define first-column
  (match-lambda (((a _ ...) ...) a)))

(first-column '((1 2 3) (4 5 6) (7 8 9))) => (1 4 7)

___ is provided for when it is inconvenient to use the ellipsis (as in a syntax-rules template).

It is an error to refer to the same variable both inside and outside of an ellipsis pattern as in (match x (((a ...) a) a)) because the semantics are not clear.

The second kind of ellipsis is **1, which matches one or more of a pattern (like regex +)

(match (list 1 2) ((a b c **1) c))
ERROR: match: "no matching pattern"

(match (list 1 2 3) ((a b c **1) c)) => (3)

This pattern (and the remaining ellipsis patterns) has no quasi-pattern equivalent, so if it occurs in a quasi-pattern that quasi-pattern must be unquoted:

(define first-column-of-some
  (match-lambda (`(,@(a _ **1)) a)))

(first-column-of-some '((1) (2)))
ERROR: match: "no matching pattern"

(first-column-of-some '((1 2) (3 4))) => (1 3)

The =.. k syntax matches exactly k of a pattern:

(match '((a b) (c d) (e f))
       (((x y) =.. 3) (list x y))
       (_ 'fail))
=> ((a c e) (b d f))

(match '((a b) (c d) (e f) (g h))
       (((x y) =.. 3) (list x y))
       (_ 'fail))
=> fail

The *.. k j syntax matches between k and j of a pattern:

(match '((a b) (c d) (e f))
        (((x y) *.. 2 4) (list x y))
	(_ 'fail))
=> ((a c e) (b d f))

(match '((a b) (c d) (e f) (g h))
       (((x y) *.. 2 4) (list x y))
       (_ 'fail))
=> ((a c e g) (b d f h))

(match '((a b) (c d) (e f) (g h) (i j))
       (((x y) *.. 2 4) (list x y))
       (_ 'fail))
=> fail

Tail patterns are similar to ellipsis patterns in that they can take up several values, but they are different in a few respects:
Tail patterns will match dotted pairs, and ellipsis patterns won't:

(define keys
  (match-lambda (((a _ ...) ...) a) (_ 'fail)))

(keys '((a 1) (b 2) (c 3))) => (a b c)
(keys '((a . 1) (b . 2) (c . 3))) => fail

(define keys
  (match-lambda (((a . _) ...) a) (_ 'fail)))

(keys '((a 1) (b 2) (c 3))) => (a b c)
(keys '((a . 1) (b . 2) (c . 3))) => (a b c)

Tail patterns also don't require any special treatment in quasi-patterns

(define handle-arithmetic-sexpr
 (match-lambda (`(+ . ,operands) (apply + (map eval-sexpr operands)))
               (`(- . ,operands) (apply - (map eval-sexpr operands)))
               (`(* . ,operands) (apply * (map eval-sexpr operands)))
               (`(/ . ,operands) (apply / (map eval-sexpr operands)))))

Tree Patterns

Note: tree patterns are considered experimental at this stage and so are an optional part of this SRFI The new operator *** can be used to search a tree for subpatterns. A pattern of the form (x *** y) represents the subpattern y located somewhere in a tree where the path from the current object to y can be seen as a list of the form (x ...). y can immediately match the current object in which case the path is the empty list. In a sense it's a 2-dimensional version of the ... pattern. As a common case the pattern (_ *** y) can be used to search for y anywhere in a tree, regardless of the path used.

(match '(+ (* (+ 7 2) (/ 5 4)) (sqrt (+ (sqr x) (sqr y))))
	  ((a *** 7) a))
=> (+ * +)

(match '(+ (* (+ 7 2) (/ 5 4)) (sqrt (+ (sqr x) (sqr y))))
	  ((_ *** `(sqrt . ,rest)) rest))
=> ((+ (sqr x) (sqr y)))

Other examples of tree patterns that handle mixed data types or retrieve all matches are given in the predicate and extension sections.

Boolean Patterns

The boolean operators and, or and not can be used to group and negate patterns analogously to their Scheme counterparts. The and operator ensures that all subpatterns match. This operator is often used with the idiom (and x pat) to bind x to the entire value that matches pat (c.f. "as-patterns" in ML or Haskell). Another common use is in conjunction with not patterns to match a general case with certain exceptions.

(match 1 ((and) #t)) => #t
(match 1 ((and x) x)) => 1
(match 1 ((and x 1) x)) => 1

An and pattern is different from its Scheme counter part in that matching, not truth, is what determines whether it succeeds:

(match #f ((and) #t) (_ #f)) => #t

Failure is one way to catch when a value has matched false

(match #f ((and x) (=> fail) (if x #t (fail))) (_ #f)) => #f

The or operator ensures that at least one subpattern matches. If the same identifier occurs in different subpatterns, it is matched independently. All identifiers from all subpatterns are bound if the or operator matches, but the binding is only defined for identifiers from the subpattern which matched.

(match 1 ((or) #t) (else #f)) => #f
(match 1 ((or x) x)) => 1
(match 1 ((or x 2) x)) => 1

If failure patterns are used as replacements for or repetition patterns, the or should be moved from the pattern to the body:

(define last-matches-one-of-first-three
  (match-lambda ((a a) #t)
                ((a b c ... (or a b)) #t)
                ((a b c d ... c) #t)
                (_ #f)))

(define last-matches-one-of-first-three
  (match-lambda ((a a) #t)
                ((a b c ... d)
                 (=> fail)
                 (if (or (equal? d a) (equal? d b))
                ((a b c d ... e) (equal? c e))
                (_ #f)))

Since all variables in an or pattern are bound but only one is defined, using or in combination with ellipsis patterns (as below) can lead to results with large numbers of undefined values:

(match (get-environment-variables)
  (((or ("PATH" . path)
        ("HOMEPROFILE" . home)
	("HOME" . home)
	("USER" . user)
	("USERNAME" . user)
	(_ . _)) ...)
   (list path home user)))
=> [three large lists of mostly undefined values]
One way to get the intended results is shown below:

(fold (match-lambda*
	((("PATH" . path) (p h u)) (list path h u))
	((("USERPROFILE" . home)  (p h u)) (list p home u))
	((("HOME" . home)  (p h u)) (list p home u))
	((("USER" . user)  (p h u))  (list p h user))
	((("USERNAME" . user)  (p h u))  (list p h user))
	(_ out) out)
      (list #f #f #f) ; (p h u) init
=> [List of 3 strings]

In other cases, it may make sense to use ((or pats) ...) patterns with clean-up:

(define (clean lst)
  (let ((undef (when #f #f)))
    (remove (lambda (item) (equal? item undef)) lst)))

(match (iota 7)
   (((or 2 6 rest) ...) (clean rest)))
=> (0 1 3 4 5)

The not operator succeeds if the given pattern doesn't match. None of the identifiers used are available in the body. This pattern also gives us a way to eliminate values (like false above) in a more streamlined way:

(match 1 ((and x (not #f)) x) (_ 'fail)) => 1

(match #f ((and x (not #f)) x) (_ 'fail)) => fail

(match 1 ((not 2) #t)) => #t

It is an error for a pattern to refer to the same variable inside and outside of a not pattern as in (match x ((a (not a)) a)).

Predicates and Fields

The operator ? can be used to provide a predicate. The usage is (? predicate pat ...) where predicate is a Scheme expression evaluating to a predicate called on the value to match, and any optional patterns after the predicate are then matched as in an and pattern.

(match 1 ((? odd? x) x)) => 1

Along with boolean patterns, the predicate pattern is enough to make the other half of the math expression evaluator started in the section on dotted pairs:

(define eval-sexpr
  (match-lambda ((? number? n) n)
                ((and pair ((or '+ '- '* '/) . rest))
		 (handle-arithmetic-sexpr pair))
		(_ (error "not implemented yet"))))

(eval-sexpr '(+ (* 3 4 5) (- 10 3))) => 67

Predicates also can be used to make a tree pattern that finds all matches:

(define extract-num-addends
    (((and (k *** `(+ . ,addends)) ('+ (? number? i) ...)) . rest)
     (cons addends (extract-num-addends rest)))
    (((and (k *** `(+ . ,addends)) inner) . rest)
     (append (extract-num-addends inner)
	     (extract-num-addends rest)))
    ((this . rest) (extract-num-addends rest))
    (() '())))

(extract-num-addends '((+ (* 1 (+ 2 3)) (+ 4 5)) (- (/ 6 (+ 7 8)) (+ 9 10))))
=> ((2 3) (4 5) (7 8) (9 10))

or do tree matching on heterogenous data. This example uses the following JSON, which is converted by several packages into the subsequent list:

{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}

;; in scheme
(define example-json
'((menu (id . "file")
 	 (value . "File")
	   (menuitem .
		    #(((value . "New") (onclick . " CreateNewDoc()"))
		      ((value . "Open") (onclick . "OpenDoc()"))
		      ((value . "Close") (onclick . "CloseDoc()"))))))))

;; vector-index is from (srfi 133)/(scheme red)

(define (get-close json)
  (define get-close-inner
		  ((key *** ('(value . "Close") . rest)) key)
		  ((? vector? v)
		   (let ((i (vector-index get-close-inner v)))
		     (if i
			 (cons i (get-close-inner (vector-ref v i)))
		  ((key *** (k . (? vector? v)))
		   (let ((r (get-close-inner v)))
		     (if r
			 (append key (cons k r))
		  (_ #f)))
  (get-close-inner (car json)))

(get-close example-json)
=> (menu popup menuitem 2)

A predicate may be called multiple times on matching and non-matching expressions and sub-expressions, in any order, so side effects should be avoided unless there is some overriding interest (debugging, logging, short-cutting, etc.). Predicates can also be written using references to preceding pattern variables, if repetition is supported:

(define fibby?
  (match-lambda ((a b (? (lambda (x) (= (+ a b) x)) c) . rest)
		  (fibby? (cons b (cons c rest))))
                 ((a b) #t)
                 ((a) #t)
                 (() #t)
		 (_ #f)))

;;; doesn't refer to pattern variables in predicate
(define fibby?
  (match-lambda ((a b c . rest)
		  (if (= (+ a b) c)
		      (fibby? (cons b (cons c rest)))
                 ((a b) #t)
                 ((a) #t)
                 (() #t)
		 (_ #f)))

(fibby? '(4 7 11 18 29 47)) => #t

The field operator = is used to extract an arbitrary field and match against it. It is useful for more complex or conditional destructuring that can't be more directly expressed in the pattern syntax. The usage is (= field pat), where field can be any expression, and should result in a procedure of one argument, which is applied to the value to match to generate a new value to match against pat. Thus the pattern (and (= car x) (= cdr y)) is equivalent to (x . y) except that it will result in an immediate error if the value isn't a pair. Like predicates, fields should avoid side effects when there is no overriding interest. In contrast to the predicate operator, the field operator succeeds when its value is false. Compare:

(match 1 ((and n (? even?)) n) (_ 'fail)) => fail

(match 1 ((and n (= even? r)) (list n r)) (_ 'fail)) => (1 #f)

(match '(a b c d) ((or (= (lambda (x) (memq 'f x)) r)
                       (= (lambda (x) (memq 'g x)) r)
                       (= (lambda (x) (memq 'b x)) r))
		   (_ 'fail))
=> #f

(match '(a b c d) ((or (= (lambda (x) (memq 'f x)) (and r (not #f)))
                       (= (lambda (x) (memq 'g x)) (and r (not #f)))
                       (= (lambda (x) (memq 'b x)) (and r (not #f))))
		   (_ 'fail))
=> (b c d)
(match '(1 . 2) ((= car x) x)) => 1
(match 4 ((= square x) x)) => 16

Getters and Setters

The set! and get! operators are used to bind an identifier to the setter and getter of a field, respectively. The setter is a procedure of one argument, which mutates the field to that argument. The getter is a procedure of no arguments which returns the current value of the field.

(let ((x (cons 1 2)))
  (match x
    ((1 . (set! s)) (s 3) x)))
=> (1 . 3)
(match '(1 . 2) ((1 . (get! g)) (g))) => 2

(define alist (list (cons 'a 1) (cons 'b 2) (cons 'c 3)))

(define get-c (match alist ((= (lambda (al) (assv 'c al))
                            (_ . (get! g)))
(get-c) => 3

(define set-c! (match alist ((= (lambda (al) (assv 'c al))
			    (_ . (set! s)))
(set-c! 7)

(get-c) => 7

alist => ((a . 1) (b . 2) (c . 7))

Record Patterns

Note: record introspection is not currently portable, so record operators are an interface whose implementation will not be portable. This is an optional part of the SRFI. See the implementation section for more information.

The record operator $ is used as a concise way to match records defined by SRFI 9 (or SRFI 99). The usage is ($ rtd field ...), where rtd should be the record type descriptor specified as the first argument to define-record-type, and each field is a subpattern matched against the fields of the record in order. Not all fields need be present. The operator struct can be used as a synonym for $.

(let ()
  (define-record-type employee
    (make-employee name title)
    (name get-name)
    (title get-title))
  (match (make-employee "Bob" "Doctor")
    (($ employee n t) (list t n))))
=> ("Doctor" "Bob")

For records with more fields it can be helpful to match them by name rather than position. For this you can use the object operator, originally a Gauche extension:

(let ()
  (define-record-type employee
    (make-employee name title)
    (name get-name)
    (title get-title))
  (match (make-employee "Bob" "Doctor")
    ((object employee (title t) (name n)) (list t n))))
=> ("Doctor" "Bob")

Matching on records when this feature is not implemented can be done as follows:

(match (make-employee "Bob" "Doctor")
       ((and (? employee?)
             (= get-title t)
	     (= get-name n))
	(list t n)))
=> ("Doctor" "Bob")

Emulating setter patterns like:

(let () (define-record-type <posn>
           (make-posn x y)
	   (x posn-x set-posn-x!)
           (y posn-y set-posn-y!))
	 (match (make-posn 3 4)
		((and p ($ <posn> (set! set-x)))
		(set-x 7)
		(match p (($ <posn> x y) (list x y))))))
=> (7 4)
is described in the next section.


For data types or matching not covered by equal?, SRFI 26 in combination with user-defined procedures, predicates, and fields can be useful.

For example, SRFI 111 (boxes) is opaque to record pattern matching, but can be handled as below:

(define (box-equal? a b)
  (if (and (box? a) (box? b))
      (box-equal? (unbox a) (unbox b))
      (equal? a b)))

(match (list (box 1) (box 1))
       ((a a) 'ok) (_ 'fail))
=> fail

(match (list (box 1) (box 1))
       ((a (? (cut box-equal? a <>))) 'ok)
       (_ 'fail))
=> ok

;case for no nonlinear predicates
(match (list (box 1) (box 1))
       ((a b) (if (box-equal? a b) 'ok 'fail)))
=> ok

(match (box 1) ((= unbox value) value))
=> 1

(define box-1 (box 1))

(define make-get-value
   ((and (? box?)
         (= (lambda (box) (cut unbox box)) get))

(define make-set-value
   ((and (? box?)
         (= (lambda (box) (cut set-box! box <>)) set))

(define get-value (make-get-value box-1))
(define set-value! (make-set-value box-1))

(get-value) => 1
(set-value! 18)
(get-value) => 18

Another approach is to convert the data type to a form that can be matched against, analagous to (match (proc data) (pat ...)). Below is an example using a one procedure to match imports, and another to read objects from a file to be matched:


(define extract-imports
      (`(import . ,imports) imports)
      (((and (key *** `(import . ,imports)) inner) . rest)
       (append (if (null? key)
		   (list imports)
		   (extract-imports inner)) (extract-imports rest)))
      ((this . rest) (extract-imports rest))
      (any '())))

(define (match-on-file filename matcher)
    (lambda (port)
      (let loop ((out '()))
	(if (eof-object? (peek-char port))
	    (loop (append out (matcher (read port)))))))))

;; filename relative to top of SRFI 204 repo
(match-on-file "test/data/forum-topics.scm" extract-imports)
=> (((srfi srfi-9))
       ((scheme case-lambda))
       ((srfi 69))
       ((srfi 204) (scheme red) (chibi json))
       ((srfi 69)))

A future SRFI will specify a method similar to this one using the pattern operators -> and => to set up views on data.


A Sample implementation is available on Github and in this .tgz file.

Any implementation of SRFI 204 should use SRFI 206 to export whatever of the auxiliary syntax needs to be defined (from $, ?, ***, ___, **1, =.., *.., get!, struct, object ) or the included auxiliary-syntax.scm file, if SRFI 206 is not available. SRFI 206 is used as follows:

(import (only (srfi 206 all) $ ? *** ___ **1 =.. etc.))
and auxiliary-syntax.scm is used as follows:

;; from Akku
(import (srfi private include))
  (include/resolve ("srfi" "204") "auxiliary-syntax.scm")
  (define-auxiliary-keywords $ ? *** ___ **1 =.. etc.)

;; in Guile
  (include-from-path "srfi/204/auxiliary-syntax.scm")
  (define-auxiliary-keywords $ ? *** ___ **1 =.. etc.)

To use the $, struct, and object syntax, an implementation needs to provide the forms is-a?, slot-ref, and slot-set!. These forms have the following syntax:

(is-a? rec rtd) => #t if type of rec matches rtd, #f otherwise

(slot-ref rtd rec slot) => If slot is an integer, value at position slot,
                    if slot is a symbol, value named slot

(slot-set! rtd rec slot new-value) If slot is an integer, set! the value at
                            position slot to new-value. If slot is a symbol,
                            set! the value named slot to new-value

Since these forms rely on the underlying structure of records in a particular implementation, their definition will vary from one Scheme to another. This is an example from Gauche, where records are implemented on top of native CLOS-like objects:

(import (only (gauche base) is-a? slot-definition-name class-slots)
  (scheme base)
  (rename (gauche base)
	  (slot-ref gb-slot-ref)
	  (slot-set! gb-slot-set!)))
    (define-syntax slot-ref
      (syntax-rules ()
        ((_ class inst n)
         (if (integer? n)
           (gb-slot-ref inst
                        (list-ref (map slot-definition-name
			               (class-slots class))
           (gb-slot-ref inst n)))))
    (define-syntax slot-set!
      (syntax-rules ()
        ((_ class inst n value)
         (if (integer? n)
             (gb-slot-set! inst
    	              (list-ref (map slot-definition-name
		                     (class-slots class))
             (gb-slot-set! inst n value))))))

this is an example from Larceny, which has SRFI 99:

(import (scheme base)
	(srfi 99 records))
  (define-syntax is-a?
    (syntax-rules ()
      ((_ rec rtd)
       ((rtd-predicate rtd) rec))))
  (define-syntax slot-ref
    (syntax-rules ()
      ((_ rtd rec n)
       (if (integer? n)
           ((rtd-accessor rtd (vector-ref (rtd-all-field-names rtd) n)) rec)
           ((rtd-accessor rtd n) rec)))))
  (define-syntax slot-set!
    (syntax-rules ()
      ((_ rtd rec n value)
       (if (integer? n)
           ((rtd-mutator rtd (vector-ref (rtd-all-field-names rtd) n)) rec value)
           ((rtd-mutator rtd n) rec value))))))

and this is an example from Loko and Chez, which use forms from (rnrs records procedural) and (rnrs records inspection):

(define-syntax is-a?
  (syntax-rules ()
    ((_ rec rtd)
     ((record-predicate (record-type-descriptor rtd)) rec))))
(define-syntax slot-ref
  (syntax-rules ()
    ((_ rtd rec n)
     (let ((rtd (record-type-descriptor rtd)))
       (if (integer? n)
	   ((record-accessor rtd n) rec)
	   ((record-accessor rtd (name->idx rtd n)) rec))))))
(define-syntax slot-set!
  (syntax-rules ()
    ((_ rtd rec n value)
     (let ((rtd (record-type-descriptor rtd)))
       (if (integer? n)
	   ((record-mutator rtd n) rec value)
	   ((record-mutator rtd (name->idx rtd n)) rec value))))))
(define-syntax name->idx
  (syntax-rules ()
	((_ rtd n)
	 (let* ((names (record-type-field-names rtd))
		(len (vector-length names)))
	   (let lp ((i 0))
	       ((> i len) (error "name not in record" n))
	       ((eq? n (vector-ref names i)) i)
	       (else (lp (+ i 1 )))))))))
match-letrec uses the technique described in [Kiselyov, 2013]. This form gets used the least by users, so it may be a useful exercise for implementors to look for internal defines and other items that can be translated into match-letrec to build up a testing library.


Alex Shinn, John Cowan, Arthur A. Gleckler, et al. 2013 Revised7 Report on the Algorithmic Language Scheme. Retrieved from https://small.r7rs.org/attachment/r7rs.pdf.

Bill Schottstaedt. 2020 s7 case.scm. Retrieved from https://ccrma.stanford.edu/software/snd/snd/s7.html#case.

Andrew K. Wright & Robert Cartwright. ACM Transactions on Programming Languages and Systems. Vol. 19, No. 1. January 1997. Pages 87-152. Retrieved on July 15, 2020 from https://www.iro.umontreal.ca/~feeley/cours/ift6232/doc/pres2/practical-soft-type-system-for-scheme.pdf.

Kiselyov, Oleg 2003 "How to write symbol? with syntax rules" comp.lang.scheme Retrieved from https://okmij.org/ftp/Scheme/macro-symbol-p.txt

Kiselyov, Oleg 2013 "How to Write Seemingly Unhygenic and Referentially Opaque Macros with Syntax-rules*" Retrieved from https://okmij.org/ftp/Scheme/Dirty-Macros.pdf

Alex Shinn. 2020. (chibi match). Retrieved from http://synthcode.com/scheme/chibi/lib/chibi/match.html.

Justin Ethier. 2019. Match Library. Retrieved from http://justinethier.github.io/cyclone/docs/api/cyclone/match.

Ludovic Cortès, Bake Timmons, Arun Isaac, & Paul Morris. 2019. Guile Reference Manual.. Sect. 7.8 "Pattern Matching." Retrieved from https://www.gnu.org/software/guile/manual/html_node/Pattern-Matching.html.

Matthias Zenger. 2019. match.scm (source file). Retrieved from https://github.com/objecthub/swift-lispkit/blob/master/Sources/LispKit/Resources/Libraries/lispkit/match.sld.

Göran Weinholt. 2019. match.sls (source file). Retrived from https://gitlab.com/weinholt/loko/-/blob/master/lib/match.sls.

Higepon Taro Minowa, et. al. 2010 Pattern Match. Retrieved from http://mosh.monaos.org/files/lib/match-ss.html#Pattern_Match.

Takashi Kato. 2019. Sagittarius Users' Reference. Sect. 8.1 "(match) -- Pattern Matching" Retrieved from http://ktakashi.github.io/sagittarius-online-ref/section81.html.

Hubert Montas. 2018. A Scheme Interpreter for ARM Microcontrollers. Sect. "Program Examples for Version 080 - Expert System" Retrieved from http://armpit.sourceforge.net/v080/aps_080_common_examples.html#Expert.

Manuel Serrano & Jean-Marie Geffroy. 2013. Bigloo/manual. Chapter 6. "Pattern Matching" Retrieved from http://www-sop.inria.fr/mimosa/fp/Bigloo/manual-chapter6.html.

R. Kent Dybig. 2010. "Using Match." Retrieved from Indiana University Bloomington via archive.org: https://web.archive.org/web/20180718090106/https://www.cs.indiana.edu/chezscheme/match/.

Alex Shinn. 2020. The Chicken Scheme Wiki. "matchable." Retrieved from http://wiki.call-cc.org/eggref/4/matchable.

Juergen Lorenz. 2020. The Chicken Scheme Wiki. "bindings." Retrieved from http://wiki.call-cc.org/eggref/4/bindings.

Marc Feely & Frédéric Hamel. 2020. match-support.scm (source file). Retrieved from https://github.com/gambit/gambit/blob/master/lib/termite/match-support.scm.

Shiro Kawai. Gauche Users Reference. Sect. 12.68 "util.match - Pattern matching." Retrieved from http://practical-scheme.net/gauche/man/gauche-refe/Pattern-matching.html#Pattern-matching.

Dimitris Vyzovitis et. al. Gerbil Reference. Sect. "Core Prelude - Prelude Macros - Pattern Matching." https://cons.io/reference/core-prelude.html#pattern-matching.

Per Bothner. 2020. The Kawa Scheme Language - Reference Documentation. Sect. 8.3 "Variables and Patterns." Retrieved from https://www.gnu.org/software/kawa/Variables-and-Patterns.html.

Marc Nieper-Wißkirchen. 2016. snow-fort.org. "(rapid match)" http://snow-fort.org/s/rapid-scheme.org/marc/rapid/match/0.1.5/index.html.

Yuichi Nishiwaki. 2016 50.destructuring-bind (source directory). Retrieved from https://github.com/picrin-scheme/picrin/tree/master/contrib/50.destructuring-bind.

Matthew Flatt & PLT. The Racket Reference. Chap. 9 "Pattern Matching" Retrieved from https://docs.racket-lang.org/reference/match.html?q=match#%28form._%28%28lib._racket%2Fmatch..rkt%29._match%29%29.

Nils M. Holm. matcher.scm (source code). Retrieved from http://t3x.org/s9fes/matcher.scm.html.

Rafik Naccache unifier.cljc (source code). Retrieved from https://github.com/turbopape/scheje/blob/master/src/scheje/unifier.cljc.

Erick Gallesio. STklos Reference Manual. Chapter 6 "Pattern Matching." Retrieved from https://stklos.net/Doc/html/stklos-ref-6.html.

Peter Norvig. 1992 Paradigms of Artificial intelligence. (1st Ed.). ELIZA,Dialog with a Machine, Chap. 5. Morgan Kaufmann Publishers, San Francisco, CA. Retrieved on July 15 2020 from https://github.com/norvig/paip-lisp/blob/master/docs/chapter5.md.


Andrew K. Wright, Robert Cartwright, Alex Shinn, and Panicz Maciej Godek.

Alex Shinn provided documentation of his implementation.

Shiro Kawai and Ludovic Cortès et. al. provided pattern grammar documentation.

The members of the SRFI 204 forum helped refine this specification, including John Cowan, Arthur A. Gleckler, Panicz Godek, Shiro Kawai, Lassi Kortela, Adam Nelson, Duy Nguyen, Marc Nieper-Wißkirchen, and Alex Shinn.

Copyright © Felix Thibault (2020).

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.


Editor: Arthur A. Gleckler