Next: , Previous: , Up: compiler letrec   [Index]


17.10.3 General notes on the context of letrec optimisations

On let and let* bindings

Let’s consider the following program:

(import (rnrs))
(let ((A B))
  #t)

it will fail with “unbound identifier B”; we are not concerned with unbound identifiers in this compiler pass. So let’s move on to the following program:

(import (rnrs))
(let ((A 123))
  (let ((A A))
    #t))

no errors here: the identifier A in reference position is captured by the outer let binding for A. Now this program:

(import (rnrs))
(let* ((A 123)
       (B A))
  #t)

everything is all right; now this program:

(import (rnrs))
(let* ((A 123)
       (A A))
  #t)

again no error: the identifier A in reference position is captured by the first let* binding for A; let* allows us to create bindings with the same name.

letrec bindings

Let’s move to the letrec syntax. This program is legal:

(import (rnrs))
(letrec ((A (lambda () A)))
  #t)

because letrec defines recursive bindings, so we are allowed to reference A in the right–hand side of the binding for A itself, as long as we put such reference in the body of a lambda.

This program is also legal:

(import (rnrs))
(letrec ((A (lambda () B))
         (B (lambda () A)))
  #t)

because the cross references to A and B are in the body of lambda syntaxes.

This program is illegal:

(import (rnrs))
(letrec ((A (list A)))
  #t)

because the identifier A in reference position is not in the body of a lambda syntax: to evaluate the right–hand side of the binding we need the value of the binding itself. Notice that A in reference position is not an unbound identifier: it is captured by the A in binding position; it is just “illegal” and we must detect this situation, according to R6RS.

This program is illegal:

(import (rnrs))
(letrec ((A 123)
         (B (list A)))
  #t)

because the identifier A in reference position is not in the body of a lambda syntax: letrec does not impose an order to the evaluation of the init expressions, so to evaluate the right–hand side of the binding we need the value of the binding itself.

On letrec* bindings

Let’s move to the letrec* syntax; it is similar, but not equal, to letrec. This program is legal:

(import (rnrs))
(letrec* ((A (lambda () A)))
  #t)

because letrec* defines recursive bindings, so we are allowed to reference A in the right–hand side of the binding for A itself, as long as we put such reference in the body of a lambda.

This program is also legal:

(import (rnrs))
(letrec* ((A (lambda () B))
          (B (lambda () A)))
  #t)

because the cross references to A and B are in the body of lambda syntaxes.

This program is illegal:

(import (rnrs))
(letrec* ((A (list A)))
  #t)

because the identifier A in reference position is not in the body of a lambda syntax: to evaluate the right–hand side of the binding we need the value of the binding itself. Again, notice that A in reference position is not an unbound identifier: it is captured by the A in binding position; it is just “illegal” and we must detect this situation, according to R6RS.

This program is legal:

(import (rnrs))
(letrec* ((A 123)
          (B (list A)))
  #t)

because letrec* imposes a left–to–right order to the evaluation of the init expressions.

On illegal bindings

R6RS mandates that illegal references to bindings established by letrec and letrec* are detected at run–time and cause an assertion violation to be raised. Vicare detects them at compile–time, so some fully R6RS-compliant code will not work under Vicare.

The following code is illegal under both R6RS and Vicare:

(import (rnrs))
(letrec ((x y)
         (y x))
  'should-not-get-here)

The following program will run under a R6RS-compliant implementation:

(import (rnrs))
(letrec ((x (if (eq? (cons 1 2)
                     (cons 1 2))
                x
              1)))
  x)

because the form x in reference position in the right–hand side of the binding is never evaluated; under Vicare this code will rather raise a syntax violation at compile–time.

Bindings on the Scheme stack

In the recbind forms the right–hand side expressions ?rhs have no imposed order of evaluation; the following two forms, in which the order of the bindings is reversed, must be completely equivalent:

(recbind ((?lhs1 ?rhs1)
          (?lhs2 ?rhs2))
  ?body)

(recbind ((?lhs2 ?rhs2)
          (?lhs1 ?rhs1))
  ?body)

In the rec*bind forms the right-hand side expressions ?rhs must be evaluated in the same order in which they appear.

In a lower–level compiler pass: instances of struct bind will be converted to code that evaluates the ?rhs expressions and store their return value into appropriately allocated Scheme stack machine words; a machine word for every ?lhs will be allocated, each ?lhs will represent an actual “local variable”.

If we think about the Scheme stack, it is clear why bind structures cannot represent recursive bindings; given the core language expression:

(letrec ((f (lambda () f)))
  ?body)

the internal representation must be:

(bind ((f_0 '#!void))
  (seq
    (assign f_0 (lambda () f_0))
    ?body))

so that: first bind reserves a Scheme stack machine word for the local variable, initialised to #<void>; then a closure object is created, referencing the local variable f_0; finally a reference to the closure object is stored in f_0. To create the closure object, we must first know the address of the allocated machine word.

When processing recbind and rec*bind structures: we must make sure that a machine word is allocated on the Scheme stack before the address of such word is needed by the ?rhs expressions. We do this by transforming recbind and rec*bind instances into a composition of bind and assign instances.


Next: , Previous: , Up: compiler letrec   [Index]