by Marc Nieper-Wißkirchen
This SRFI is currently in final status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-242@nospamsrfi.schemers.org
. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.
This SRFI defines a language to describe control-flow graphs (CFGs) suitable for formulating iterative and recursive algorithms. Using the notion of a CFG term, this language can be seamlessly embedded in the Scheme language. Complex CFG terms can be composed from simple CFG terms.
In the Scheme language, procedures (and thus jump labels, according to lambda-the-ultimate philosophy) are first-class entities. In a certain sense, this makes the control-flow graph of a Scheme program a dynamic and not static entity.
The CFG language described in this SRFI, on the other hand, describes a static control-flow graph. For the applications where this is sufficient, this has the advantage that the control-flow graph can be readily statically reasoned about, allowing one, in particular, to define variable scope in terms of it.
The CFG language itself is useful for describing iterative and recursive algorithms more clearly than it is possible in the Scheme language through tail calls and (multiple) return values. Its raison d'être, however, is that more specialized languages like that of a loop facility can be easily built on top of it.
The origin of the CFG language described in this SRFI is Olin
Shiver's paper The Anatomy of a Loop: A story of scope and
control. In his paper, he makes two important
points: The main iterative control construct in Scheme is a tail
call. While a tail call as a goto that passes arguments
is a pretty powerful construct, it is also as low-level as a
goto. His first point is that this implies that it is not the
right tool to write down iterative algorithms in a high-level
fashion. This fact has stimulated the search for loop
facilities allowing one to express iterative algorithms more
abstractly and more composably. His second point is that just as
the scoping of variables is well-defined in the Scheme
language, a well-defined model of (loop) variable scoping is
needed for loop facilities as well. For this, he formulates a
new scoping rule, namely that binders dominate references.
While not all surface syntax has been adopted from Olin Shiver's paper, the core of the language described here subsumes his conception. The addition of a facility to handle not only iterative but also recursive algorithms is a new invention in this SRFI. This addresses a third perceived shortcoming of the Scheme language. Recursive algorithms in Scheme are based on the fact that Scheme procedures return values, possibly multiple ones. However, as soon as more than one value needs to be returned and each recursion step only needs to modify one of them, a position-only identification of return values becomes unclear and leads to repetition of code. Instead, the CFG language in this SRFI gives names to intermediate results and allows parallel processing of them.
Thanks to Scheme's expressive macro system, the CFG language can be seamlessly implemented in the standard Scheme language, fully respecting Scheme semantics.
The author plans to submit a SRFI describing an extensible loop facility built on the CFG language defined in this SRFI in the future.
The following expression evaluates to an (iterative) procedure that takes a list of integers and returns two values, the number of even and the number of odd values in the list.
(lambda (n*) (cfg (labels [(f (execute (lambda (next done) (if (null? n*) (done) (next (car n*) (cdr n*)))) [(n n*) (execute (lambda (even odd) (if (odd? n) (odd (+ o 1)) (even (+ e 1)))) [(e) (call f)] [(o) (call f)])] [() (finally (e o) (values e o) (halt))]))] (execute (lambda (start) (start n* 0 0)) [(n* e o) (call f)])) (values e o)))
The following expression evaluates to a (recursive) procedure that takes a list of integers and returns two values, the sublist of even values and the sublist of odd values.
(lambda (n*) (cfg (labels [(f (execute (lambda (next done) (if (null? n*) (done) (next (car n*) (cdr n*)))) [(n n*) (execute (lambda (even odd) (if (odd? n) (odd) (even))) [() (finally (e*) (cons n e*) (call f))] [() (finally (o*) (cons n o*) (call f))])] [() (finally (e* o*) (values '() '()) (halt))]))] (execute (lambda (start) (start n*)) [(n*) (call f)])) (values e* o*)))
cfg
formA cfg
expression lets control flow along the edges
of a control-flow graph (CFG) defined by the cfg
expression, potentially binding so-called loop variables, until
a halt
CFG block is reached. Then, control flows
backwards along the edges previously taken, potentially binding
so-called return variables. Finally a Scheme expression is
evaluated in an environment extended by these bindings, and its
values are returned.
The general form of a cfg
expression is
(cfg cfg
term result expression)
where the cfg term
describes
the CFG and result
expression
is an arbitrary Scheme expression.
The simplest CFG is just one consisting of a halt
CFG block, whose syntax simply is:
(halt)
For example, we have
(cfg (halt) (+ 1 2))3
as the control flow along this CFG, which has no edges, immediately stops, leading on to the evaluation of the return expression.
A finally
CFG block is used to bind return
variables. The general syntax of a finally
block
is:
(finally formals expression cfg
term)
When control enters the CFG fragment it describes, control is
passed to the CFG fragment described by
the cfg term
first. When
control finally flows backwards, the
(Scheme) expression
is
evaluated to yield values that are bound as return variables to
the formals
. For example,
we have:
(cfg
(finally (x . y) (values 1 2 3)
(halt))
(list x y)(1 (2 3))
A more complicated example is the following:
(let ([x 1])
(cfg
(finally (y) (+ x 2)
(finally (x) (+ x 1)
(halt)))
(list x y)))(2 4)
Here, control flows first from top to bottom and then
backwards. The finally
CFG blocks bind the return
variables during the backward flow. The expression (+ x
1)
sees no binding of a return variable, only the outer
binding of x
. On the other hand, the
expression (+ x 2)
sees the binding
of x
as a return variable by the
inner finally
CFG block.
While finally
CFG blocks are used to perform
actions during the backward control flow, execute
CFG blocks perform actions during the initial forward flow
through the CFG. The syntax of an execute
CFG block is given by
(execute procedure expression [formals cfg term] …)
where procedure
expression
is a Scheme expression that must
evaluate to a procedure accepting as many arguments as there
are cfg term
s in
the execute
CFG term. When the forward control
flow through the CFG reaches an execute
CFG block,
this procedure is called and should tail-call one of its
arguments. Control flow then proceeds with the CFG block
described by the corresponding cfg
term
:
(let ([x 1] [y 2])
(cfg
(finally (y) (+ x 3)
(execute
(lambda (e)
(set! x y)
(e))
[() (halt)]))
(list x y)))(2 5)
In this example, the forward control flow passes through
the finally
CFG block before it reaches
the execute
block. The procedure expression is
evaluated in the environment where x
is bound
to 1
and y
is bound to 2
.
The tail call to e
lets the control flow continue
with the halt
CFG block, from where it flows
backward. The backward control flow passes through
the execute
CFG block before the (+ x
3)
is evaluated and the return variable y
is
bound to the result 6
.
Execute
CFG terms are used to create CFGs with
branches. For this, more than one cfg
term
has to be provided. Consider the following example:
(let ([x 1] [y 2])
(cfg
(execute
(lambda (e1 e2)
(if (odd? y) (e1) (e2)))
[() (halt)]
[() (finally (x) 3 (halt))])
(+ x 10)11
Here, the control flow exiting the execute
CFG
block can proceed along two possible edges, e1
and e2
, depending on the parity of y
in
this example. It stands out that the value of the example
expression is not 13
, as one might have expected,
but 11
. In other words, the binding
of x
that is visible in (+ x 10)
is
the let
binding of x
and not the
binding as a return variable introduced in
the finally
CFG block. To understand this, we have
to introduce the rule by with which the scope of return
variables is determined. A CFG block (and, likewise,
the result expression
) are
in the scope of a return variable if this CFG block
post-dominates definitions of this return variable, that is if
all possible control flow paths starting at the CFG block and
ending at a halt
CFG block pass through a
definition of the return variable, which happens along the entry
edge of a finally
CFG block. In the example above,
this is not the case. There is a (statically possible)
control-flow path from the entry block to the
first halt
CFG block that does not pass through
a finally
CFG block defining the return
variable x
. Compare with the following two
examples:
(let ([x 1] [y 2])
(cfg
(execute
(lambda (e1 e2)
(if (odd? y) (e1) (e2)))
[() (finally (x) #f (halt))]
[() (finally (x) 3 (halt))])
(+ x 10)13
(let ([x 1] [y 2])
(cfg
(finally (x) 3
(execute
(lambda (e1 e2)
(if (odd? y) (e1) (e2)))
[() (halt))]
[() (halt))]))
(+ x 10)13
So far, we haven't talked about
the formals
that appear in
a execute
CFG term. Using these, we can bind a
different kind of variables, namely so-called loop variables.
One can think of loop variables as being defined during forward
control flow and return variables as being defined during
backward flow. The value to which a loop variable is bound
when control flow leaves along an edge of
an execute
CFG Term is passed as an argument to
the tail-called procedure that corresponds to the edge:
(let ([x 1])
(cfg
(execute
(lambda (e)
(e (+ x 2)))
[(x) (finally (y) (+ x 3) (halt))])
y)6
In (+ x 2)
, the variable x
is let
-bound to 1
, and in (+ x
3)
the binding as a loop variable to (+ 1 2)
is visible.
While the scoping rule of return variables is based on the
post-dominance relation, the scoping rule for loop variables is
based on the dominance relation: A CFG block is in the scope of
a loop variable if definitions of this loop variable dominate
the CFG block, that is if all possible control flow paths
starting at the entry of the whole CFG and ending at the CFG
block pass through a definition of the loop variable, which
happens along an edge exiting an execute
CFG block.
In the following example, the definitions of
the loop variable x
do not dominate the expression (+ x 3)
:
(let ([x 1] [y 2])
(cfg
(execute
(lambda (e1 e2)
(if (odd? y) (e1 5) (e2)))
[(x) (finally (y) (+ x 2) (halt))]
[() (finally (y) (+ x 3) (halt))])
y))4
The expressions in the finally
CFG terms are
evaluated with loop and return variables in scope. If a loop
and a return variable have the same name and both are in scope,
the return variable shadows the loop variable. In the following
example, x
is bound to 1
in (+ x
1)
, bound to 2
in (+ x 3)
,
bound to 5
in (+ x 2)
and bound
to 7
in (+ x 4)
:
(let ([x 1])
(cfg
(execute
(lambda (e)
(e (+ x 1)))
[(x) (finally (x) (+ x 2)
(finally (x) (+ x 3)
(halt)))])
(+ x 4)11
Another CFG term that can be used to define loop variables is
bind
. Its syntax is:
(bind
([formals expression]
...) cfg term)
When the forward control flow reaches a bind
CFG
term, the expression
s are
evaluated in the same environment in which
the procedure expression
of an execute
CFG term would be evaluated. Then,
all formals are bound in parallel, and the loop variable bindings
become visible in the cfg
term
, with which the forward control flow
proceeds. The bind
CFG term is similar to
the let-values
Scheme expression, which binds
multiple Scheme variables in parallel.
(let ([x 1] [y 2])
(cfg
(bind ([(x) y] [(y) x])
(finally (x y) (values x y) (halt)))
(list x y)))(2 1)
In order to construct more interesting examples, we need to be
able to introduce loops in control-flow graphs. This is
possible with the use of labels
CFG terms, which have
the following syntax:
(labels
([label cfg
term] …) body cfg
term)
where each label
is an
identifier. A labels
CFG term binds
the labels
to
the cfg term
s, and this
binding is visible within the cfg
term
s and the body cfg
term
, much like the semantics of
the letrec
expression for Scheme code. When
(forward) control flow enters a labels
CFG
fragment, the control flow is passed to the CFG block described
by the body cfg term
.
Control can flow to the CFG block described by
a cfg term
using call
CFG terms with the corresponding label.
The syntax of a call
CFG term is:
(call label)
Labels occupy a different namespace than variables (of any kind) or keywords. Labels are lexically scoped.
(let ([x 1])
(cfg
(labels ([x (finally (x) x (halt))])
(call x))
x))1
In the following example (from Olin Shiver's paper),
the finally
CFG block is dominated by definitions of
the loop variable y
. When the control flows through
the label la
, the loop variable y
is
bound before the call to la
; otherwise, when the
control flows through the label lb
, the loop
variable y
is bound before the call
to lj
:
(let ([x 1]
[f (lambda (y) (set! x y))]
[g (lambda (y) (set! x (- y 1)))])
(cfg
(labels ([lj (finally y (+ y 10) (halt))]
[la (execute (lambda (e1)
(f y) (e1))
[() (call lj)])]
[lb (execute (lambda (e1)
(g x) (e1 (* x x)))
[(y) (call lj)])])
(execute (lambda (e1 e2)
(if (odd? x)
(e1 (+ x 1))
(e2)))
[(y) (call la)]
[() (call lb)]))
(list x y)))(2 12)
Label*
is another binding construct for CFG
labels; it is related to labels
as let*
is related to letrec
. In particular, the binding of
a label by label*
is not visible within the CFG term
bound by the label. The syntax of label*
does
not differ from the labels
syntax:
(label*
([label cfg
term] …) body cfg
term)
The semantics of label*
can be explained through
the following rule: Replace all occurrences
of (call label)
in body cfg term
by cfg term
when the
binding of the
corresponding label
is not
(lexically) shadowed by another label binding.
(cfg
(label* ([l (finally (x) 42 (halt))]
[l (call l)])
(call l))
x)42
The final CFG term that needs to be discussed is permute
. Its general syntax is:
(permute
([label cfg
term] …) body cfg
term)
When the forward control flow enters a CFG block described by
a permute
CFG term, the CFG blocks described by
the cfg terms
together
with their labels
are
dynamically permuted, and control then passes to the the first CFG block.
Within the lexical scope of this CFG term,
the corresponding label
is
bound so that when it is called, control proceeds to the
then second cfg term
and so on
until the then final cfg term
is reached whose accompanying label is bound so that when called,
control finally proceeds to the body cfg
term
term.
(cfg
(permute ([p (bind ([(x) 10])
(call p))]
[p (bind ([(y) 20])
(call p))])
(finally (z) (list x y) (halt)))
z)(10 20)
So far, the permute
CFG term just looks like an
inside-out label*
form up to the permutation that
is involved. The permute
CFG term's purpose comes
from some non-determinism it introduces. In the above example,
the sequencing of the two bind
CFG terms is not
determined, i.e. whether the forward control flow first passes
through the lexically first bind
CFG term and then
through the lexically second, or vice versa. In the above
example, it does not matter. In general, the non-determinism
shows up through the binding of loop and return variables. The
definition of x
by the first bind
CFG
term does not dominate the second bind
CFG term
(more precisely, the CFG block it describes), because a possible
control flow (coming from a different sequencing) coming from
the entry passes the second CFG block before the first. This is
demonstrated in the following two examples:
(let ([x 1])
(cfg
(permute ([p (bind ([(x) 2])
(call p))]
[p (bind ([(y) x])
(call p))])
(finally (z) (list x y) (halt)))
z))(2 1)
(let ([x 1])
(cfg
(permute ([p (finally ([y] x)
(call p))]
[p (finally ([x] 2)
(call p))])
(halt))
(list x y)))(2 1)
The permutation property of the permute
CFG terms
also propagates to the bodies of label*
and labels
forms and through calls to labels bound
by label*
:
(let ([x 1] [y 10])
(cfg
(permute ([p (bind ([(x) 2])
(call p))])
(labels ()
(label* ([p (permute ([p (bind ([(z) (list x y)])
(call p))])
(finally (x y z) (values x y z) (halt)))])
(permute ([p (bind ([(y) x])
(call p))])
(call p)))))
(list x y z)))(2 1 (1 10))
The final concept we need to explain are macros for
the cfg
form. Like define-syntax
is
used to define Scheme macros, we use
the define-cfg-syntax
definition here, which is an ordinary Scheme definition and of the following form:
(define-cfg-syntax cfg keyword transformer expression)
(There is also a starred
version define-cfg-syntax*
with the same syntax,
which does not change a previous binding of the
identifier cfg keyword
,
but only adds CFG keyword semantics.)
When the expander processes a cfg
form, it expands
CFG macro uses, which look like Scheme macro uses except
that a cfg keyword
is used
instead of a
(Scheme) keyword
. CFG
macros are expanded by the usual Scheme macro
expansion algorithm. An example is worth a thousand words:
(define-cfg-syntax loop
(lambda (stx)
(syntax-case stx ()
[(_ n-expr lp-lbl loop-cfg-term body-cfg-term)
(identifier? #'lp-lbl)
#'(bind ([(n) n-expr])
(labels ([lp-lbl
(execute
(lambda (loop done)
(if (zero? n)
(done)
(loop (- n 1))))
[(n) loop-cfg-term]
[() body-cfg-term])])
(call lp-lbl)))])))
(cfg
(bind ([(n) 0])
(loop 10 next
(bind ([(n) (+ n 2)])
(call next))
(finally (n) n (halt))))
n)20
The example above also demonstrates the usual hygiene
provisions of (Scheme) macros; the variable n
introduced in the macro is effectively renamed so that it
doesn't shadow the variable n
in the macro use.
This hygiene is also the reason why the label next
has to be given as an argument to the macro. One can actually
get rid of this as the following example shows:
(define-cfg-label next)
(define-cfg-syntax loop
(lambda (stx)
(syntax-case stx ()
[(_ n-expr loop-cfg-term body-cfg-term)
#'(bind ([(n) n-expr])
(labels ([next
(execute
(lambda (loop done)
(if (zero? n)
(done)
(loop (- n 1))))
[(n) loop-cfg-term]
[() body-cfg-term])])
(call next)))])))
(cfg
(bind ([(n) 0])
(loop 10
(bind ([(n) (+ n 2)])
(call next))
(finally (n) n (halt))))
n)20
Here, we used the define-cfg-label
definition, whose syntax is simply:
(define-cfg-label label)
(The define-cfg-label
syntax also comes in a
starred version define-cfg-label*
.) This definition
binds label
to a fresh label
and label
is replaced by
this label in labels
, label*
,
and call
CFG terms
where label
is in scope.
The identifiers defined in this section are exported by
the (srfi :242 cfg)
and the (srfi :242)
libraries in case of an R6RS system and by
the (srfi 242)
library in case of an R7RS
system.
Remark: This section is a formal account of the syntax
and semantics of the cfg
form. It should not be
misunderstood as a gentle introduction, which was given in the
previous section.
The following definitions are standard, but are included here for the sake of completeness.
A control-flow graph (CFG) is a (finite) set of CFG blocks together with a predecessor relation such that there is exactly one CFG block with no predecessor. This CFG block is the entry block of the cfg. A CFG block that is not the predecessor of any other block is an exit block. A CFG block is a successor of another one if the latter is a predecessor of the former. (In what follows, we usually describe CFGs by defining the successors of each block, which amounts to the same thing: the predecessor relation is basically the inverse one.)
The dominance relation of a CFG is the largest relation on the set of its CFG blocks such that the following holds: The set of strict dominators of the entry block is empty. The set of strict dominators of any other block is a subset of the set of dominators of each predecessor of the block. Here, a strict dominator of a block is a dominator that is not the block itself.
Note: From the definition it follows that each block dominates (but not strictly dominates) itself.
Remark: Because of the condition on the entry block, the dominance relation is not trivial and not the maximal relation on the set of CFG blocks.
The post-dominance relation of a CFG is the largest relation on the set of its CFG blocks such that the following holds: The set of strict post-dominators of an exit block is empty. The set of strict post-dominators of any other block is a subset of the set of post-dominators of each successor of the block. Here, a strict post-dominator of a block is a post-dominator that is not the block itself.
In order to describe the computation model of the CFG language, which is not the same computation model underlying Scheme (but can be expressed in the latter as the portable implementation shows), a couple of primitive notions have to be introduced, that is the grammatical context in which they are used. Their semantic meaning only follows from the context in which they are used below.
Loop variables are defined in CFG blocks. A loop variable's scope is the set of CFG blocks that are dominated by a CFG block in which the loop variable is defined.
Return variables are also defined in CFG blocks. A return variable's scope is the set of CFG blocks that strictly post-dominate a CFG block in which the return variable is introduced.
The formal description of the various CFG terms uses the notion of CFG locations, which is similarly to a location in the Scheme semantics, except that it can hold a CFG block instead of a Scheme value.
Moreover, a pending set in the sense of the formal semantics below is a set of pairs, each consisting of a CFG location and a CFG block.
The following action can be done with pending sets: when a pending set is materialized with a CFG block as its tail, the pairs of CFG locations and CFG blocks of the pending set are ordered non-deterministically. In the CFG location of each pair, the CFG block of the following pair is then stored. In the CFG location of the last pair, the tail CFG block is stored. For each CFG block in the list of pairs, its set of successors is the one-element set containing the CFG block of the following pair. For the CFG block in the last pair, its set of successors is the one-element set consisting of the tail CFG block. The CFG block in the first pair is then returned as the result of the materialization.
A CFG state is a (finite) set of bindings from identifiers to values.
A CFG term is a syntactic construct in the CFG language that can be evaluated in an environment and within a pending set. The result of the evaluation is a CFG. A block in this CFG can be called with a CFG state. The result of this call is again a CFG state. Evaluation of CFG terms happens during expand-time, the calling of CFG terms during runtime.
Each entry is of one of two categories, “syntax” or “cfg syntax”. The first category is as in R6RS and R7RS, while an entry of the second category describes a syntactic construct in the CFG language.
(cfg cfg term result expression)
Syntax: Cfg
term
is an arbitrary CFG
term. Result expression
is an arbitrary Scheme expression.
Semantics: A cfg
expression is
evaluated by first evaluating
the cfg term
within
an empty pending list, resulting in a CFG. Its entry block
is then called with an empty CFG state, yielding a resulting
CFG state. The environment of the
cfg
expression is then extended by binding the
return variables in whose scope
the result expression
is to locations holding the values to which they are bound
in the resulting CFG state. Finally,
the result expression
is evaluated in this extended environment and its values
returned as the results of the cfg
expression.
The simplest cfg
expression just halts and returns
some value(s). (The halt
and the other cfg terms are
described in the next subsection.) For example:
(cfg (halt) 'done)done
Using a finally
cfg term, return variables can be
bound that can be used in the result
expression
:
(let ([x 1])
(cfg (finally (x y) (values (+ x 1) (+ x 2))
(halt))
(list x y)))(2 3)
Loop variables are bound through execute
terms:
(let ([x 1])
(cfg (execute (lambda (e)
(e (+ x 1)))
[(x) (finally (res) x (halt))])
res))2
Execute
CFG terms can have more than one successor.
During evaluation, one control path is chosen:
(let ([x 1])
(cfg (execute (lambda (e1 e2)
(if (even? x) (e1) (e2 'odd)))
[() (finally (res) 'even (halt))]
[(a) (finally (res) a (halt))])
res))odd
Due to loop variable scoping, the second finally
cannot simply be hoisted because it would no longer be dominated
by the introduction of the loop
variable a
:
(let ([a 'outer]
[x 1])
(cfg (finally (res) a
(execute (lambda (e1 e2)
(if (even? x) (e1) (e2 'odd)))
[() (finally (res) 'even (halt))]
[(a) (halt)]))
res))outer
And due to return-variable scoping, the
first finally
cannot simply be left out because the
entry block would no longer post-dominate an introduction of
the res
return variable:
(let ([res 'outer]
[x 1])
(cfg (execute (lambda (e1 e2)
(if (even? x) (e1) (e2 'odd)))
[() (halt)]
[(a) (finally (res) a (halt))])
res))outer
Loop variables are called loop variables because the main
reason for the CFG language is that it allows writing loops.
This can be done with labels
CFG terms. Here, we
also use the bind
CFG term to bind loop
variables easily:
(cfg (labels ([f (execute
(lambda (e1 e2)
(if (> x 6)
(e1)
(e2 (+ x 1) (* a x))))
[() (finally (res) a (halt))]
[(x a) (call f)])])
(bind ([(x) 1] [(a) 1]) (call f)))
res)720
Finally, permute
CFG terms can be used to create
sequences of CFG blocks so that loop variables and return
variables introduced in these blocks do not have the other
blocks of this sequence in scope (comparable to scoping rules of
the let
expression of Scheme):
(let ([x 'outer] [y 'outer])
(cfg (label* ([c (permute ([p (finally (y) 'inner
(bind ([(a) x]) (call p)))])
(finally (a) a (halt)))])
(permute [(p (finally (b) y
(bind ([(x) 'inner]) (call p))))]
(call c)))
(list a b)))(outer outer)
(Label*
CFG terms do not allow loops
as labels
CFG terms, but permute
sequences carry forward across calls to labels introduced
by label*
.)
The following entries describe the primitive CFG terms.
A label
is an identifier.
Labels do not occupy the same namespace as keywords and all kind
of variables. That is, within the same scope, an identifier can
be bound as a label and as a variable or keyword, and local
bindings of either kind do not shadow other bindings of the
other kind. There are two (disjoint) types of
labels, static labels
and dynamic labels
.
Static labels are bound to CFG terms, and dynamic labels are
bound to CFG locations.
(execute procedure
expression
[formals cfg term]
…)
When an execute
CFG term is evaluated in
an environment and within a pending set,
the cfg term
s
are evaluated to CFG blocks in no particular order in the
environment, each within an empty pending set. The pending
set in which the execute
CFG term is being
evaluated is then materialized with a new CFG block
as its tail and the materialization is returned.
When the CFG block is later called with a CFG
state, the environment of the surrounding cfg
expression is extended by binding each loop variable in
whose scope the execute
CFG term occurs to a fresh
location holding the value to which it is bound in the CFG
state. The procedure
expression
is then evaluated in this extended
environment to yield a procedure value. This procedure is
then tail-called with as many procedure arguments as there
are cfg term
s. The
procedure should call one of the arguments exactly once and
the call should be a tail call. In the continuation of this
tail call, the CFG state is then extended by binding the
corresponding formals
to the return values of the call, and the CFG block
resulting from the evaluation of the
corresponding cfg term
is then tail-called with the extended CFG state and the
resulting CFG state is returned.
The CFG block has the CFG blocks resulting from evaluation
of the cfg term
s as
successors.
The variables of each formals
are introduced
as loop variables in the corresponding CFG block.
(halt)
When a (halt)
CFG term is evaluated, the
pending set is materialized with a new CFG block as
its tail, and the materialization is returned.
When the CFG block is called with a CFG state, it simply returns an empty CFG state.
The CFG block has no successors.
(labels
([dynamic label cfg term]
…) body cfg term)
When a labels
CFG term is evaluated in an
environment within a pending set, the environment is
extended by binding
each label
to a CFG
location initially holding an invalid CFG block.
The cfg term
s and
the body cfg term
are then evaluated in an unspecified order in this extended
environment. In this environment,
the cfg term
s are
evaluated within an empty pending set and
the body cfg term
is evaluated within the pending set in which
the labels
CFG term is being evaluated.
For each label
, the
CFG block resulting from evaluating the
corresponding cfg term
is then stored in the corresponding CFG location. Finally,
the CFG block resulting from evaluation of
the body cfg term
is returned.
When the CFG block returned by the labels
CFG
term is later called with a CFG state, the CFG block resulting
from evaluating
the body cfg term
is
tail-called with the CFG state and the resulting CFG state is
returned.
(label* ([static
label cfg term]) body cfg term)
When a label*
CFG term is evaluated in an
environment within a pending set, the environment is
extended by binding
the label
to the
(unevaluated) cfg
term
.
The body cfg term
is then evaluated in this environment within the pending
set. The result of this evaluation is returned.
When the CFG block returned by the labels
CFG
term is later called with a CFG state, the CFG block resulting
from evaluating
the body cfg term
is
called with the CFG state, and the resulting CFG state is
returned.
It is a syntax or undefined violation if
evaluating cfg term
in the non-extended environment within the empty pending set
would raise an exception of type &syntax
or &undefined
, respectively.
Note: The label*
syntax will be extended below.
(call static label)
When the call
CFG term is evaluated in an
environment and within a pending set, the CFG term to which
the static label
is
bound is evaluated in the same environment and within the
same pending set, and the result of this evaluation is returned.
It is an undefined violation if
the static label
is
not bound in the environment in which the call
CFG term is evaluated.
(call dynamic label)
When the call
CFG term is evaluated in an
environment within a pending set, the location to which
the dynamic label
is
bound is looked up and remembered. The pending set is then
materialized with a new CFG block as its tail, and
the materialization is returned.
When the CFG block returned by the call
CFG
term is called with a CFG state, the CFG block stored in the
remembered location is tail-called with the CFG state, and the
resulting CFG is returned.
The CFG block returned by the call
CFG term
has the CFG block eventually stored in the remembered location
as its successor.
It is an undefined violation if
the dynamic label
is not bound
in the environment in which the call
CFG term
is evaluated.
(finally cfg formals expression cfg term)
When a finally
CFG term is evaluated
within a pending set, the pending set is materialized with a
new CFG block as its tail, and the
materialization is returned.
When the CFG block is later called with a CFG
state, the environment of the surrounding cfg
expression is extended by binding each loop variable in
whose scope the CFG block occurs to a fresh location holding
the value to which it is bound in the CFG state. Then, the
CFG block resulting from evaluation of
the cfg term
is called. The extended
environment is then further extended by binding each return
variable in whose scope the CFG block returned by
the finally
CFG term occurs to a fresh
location holding the value to which it is bound in the
returned CFG state. Then,
the expression
is
evaluated in the doubly extended environment, the CFG
state received from call the cfg
term
is extended by binding
the formals
to the
values receiving from evaluating
the expression
, and
the extended CFG state is then returned.
The CFG block returned by
the finally
CFG term has the CFG block
resulting from the evaluation of
the cfg term
as
its successor.
(permute dynamic
label cfg term body cfg term)
When a permute
CFG term is evaluated in an
environment within a pending set, the environment is
extended by binding the dynamic
label
to a fresh CFG location initially
holding an invalid CFG block.
The cfg term
is
then evaluated in this extended environment within an empty
pending set to a CFG block. The pending set within which
the permute
CFG term is being evaluated is
then extended by adjoining a pair consisting of the fresh
location and the resulting CFG block. Finally,
the body cfg term
is evaluated in the original environment but within the
extended pending list, and the resulting CFG block is
returned.
Note: The permute
syntax will be extended below.
If a cfg
expression is in tail context,
the result expression
is in
tail context as well.
Through permute
CFG terms, non-trivial pending
sets are generated. Through the non-determinism of
materializing pending sets, the CFGs resulting from evaluating
CFG terms are non-deterministic as well. For such
non-deterministic CFGs (which can be viewed as a set of CFG
blocks together with a set of possible predecessor relations on
these CFG blocks), the scope of loop or return variable is
defined as the intersection of the scopes of the loop or return
variable over all possible materializations.
The following entries describe CFG terms that can be converted into primitive CFG terms.
(label* ([static
label1 cfg term1] …) body cfg term)
Effectively equivalent to body cfg term
when there is no static label
. Otherwise, effectively equivalent to
(label* ([static
label1 cfg term1])
(label* ([static
label2 cfg term2] …)
body cfg term))
.
Note: This cfg syntax extends the primitive label*
syntax to more than one static label
.
(permute ([dynamic
label1 cfg term1] …) body cfg term)
Effectively equivalent to body cfg term
when there is no dynamic label
. Otherwise, effectively equivalent to
(permute ([dynamic
label1 cfg term1])
(permute ([dynamic
label2 cfg term2] …)
body cfg term))
.
Note: This cfg syntax extends the
primitive permute
syntax to more than
one dynamic label
.
(bind
([formals expression]
...) cfg term)
When an bind
CFG term is evaluated in an
environment and within a pending set,
the cfg term
is
evaluated to CFG block in the environment within an empty
pending set. The pending set in which
the execute
CFG term is being evaluated is
then materialized with a new CFG block as its tail,
and the materialization is returned.
When the CFG block is later called with a CFG
state, the environment of the surrounding cfg
expression is extended by binding each loop variable in whose
scope the bind
CFG term occurs to a fresh location
holding the value to which it is bound in the CFG state.
The expression
s are then
evaluated in an unspecified order in this extended environment
to yield values for
each expression
. The
CFG state is then extended by binding
each formals
as loop
variables to the return values of the corresponding
evaluation, and the CFG block resulting from the evaluation
of cfg term
is then
tail-called with the extended CFG state, and the resulting CFG state
is returned.
It is a syntax violation if the loop variables in
all formals
are not
pairwise different.
The Scheme syntax described in this section
are definitions
that may
appear anywhere other definitions may appear.
Keyword bindings established by these definitions are visible
throughout the body in which they appear, except where shadowed
by other bindings, and nowhere else, just like variable bindings
established by define
. All bindings established by
a set of definitions are visible within the definitions
themselves.
(cfg keyword datum …)
(cfg keyword datum … . datum)
cfg keyword
At the start of the evaluation of a cfg
expression, these CFG macro uses are expanded by
the syntax expander into core CFG terms just as Scheme macro
uses are expanded into core forms. In particular,
a CFG transformer is like a Scheme transformer.
(define-cfg-syntax cfg keyword transformer expression)
Binds cfg keyword
to
the value transformer
expression
, which must evaluate, at
macro-expansion time, to a CFG transformer.
(define-cfg-syntax* cfg keyword transformer expression)
As define-cfg-syntax
except that cfg
keyword
must be bound. Define-cfg-syntax*
does not change the meaning of keyword
outside cfg term
s.
(define-cfg-label identifier)
Binds the identifier
to a fresh label.
Whenever
the identifier
appears
as a static label
or dynamic label
, it is replaced by the label.
(define-cfg-label* identifier)
As define-cfg-label
except that identifier
must be bound. Define-cfg-label*
does not change the meaning of identifier
outside cfg term
s.
The following example defines a CFG keyword that can be used like bind
to unconditionally bind loop variables:
(define-cfg-syntax simple-bind
(lambda (stx)
(syntax-case stx ()
[(_ ([id init] ...) cfg)
(for-all identifier? #'(id ...))
#'(execute (lambda (e)
(e init ...))
[(id ...) cfg])])))
(cfg (simple-bind ([x 1] [y 2])
(finally (res) (+ x y) (halt)))
res)3
A probably more useful CFG macro is the following one:
(define-cfg-syntax return
(lambda (stx)
(syntax-case stx ()
[(_ return-var ...)
(for-all identifier? #'(return-var ...))
#'(finally (return-var ...) (values return-var ...) (halt))])))
(cfg (simple-bind ([x 1]) (return x))
x)1
The label definition feature is demonstrated in the following example:
(define-syntax permuting
(lambda (stx)
(syntax-case stx ()
[(_ cfg-term ... result-expr)
#'(cfg (permute ([p cfg-term] ...)
(finally (res) result-expr (halt)))
res)])))
(permuting (simple-bind ([x 99]) (call p)) x)99
The sample implementation is a portable R6RS implementation written from scratch using SRFI 213 The sample implementation in the git repo is configured for Chez Scheme.
Git repo for the sample implementation.
This SRFI would not exist if there hadn't been Olin Shiver's paper The Anatomy of a Loop. In fact, almost all of the mental effort needed for this SRFI was already provided by him. This does not imply that he does or does not endorse this SRFI.
It was Jens Axel Søgaard who reminded me of Olin Shiver's paper, which I had once read but then forgotten about.
During the draft period, Wolfgang Corcoran-Mathe read the specification thoroughly and found a number of small bugs and inconsistencies, which could then be fixed.
© 2022 Marc Nieper-Wißkirchen.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.