Libraries for Vicare Scheme: srfi eager-comp spec generators

2.22.4.3 Generators

This section defines the syntax generator. Each generator defines a sequence of bindings through which one or more variables are run. The scope of the variables begins after the closing parenthesis of the generator expression and extends to the end of the comprehension it is part of.

The variables defined by the generators are specified using the syntax:

<vars> --> <variable1> [ (index <variable2>) ]

where variable1 runs through the values in the sequence defined by the generator, and the optional variable2 is an exact integer–valued index variable counting the values (starting from 0). The names of the variables must be distinct. The following example illustrates the index variable:

(list-ec (: x (index i) "abc")
   (list x i))
=> ((#\a 0) (#\b 1) (#\c 2))

Unless defined otherwise, all generators make sure that the expressions provided for their syntactic arguments are evaluated exactly once, before enumeration begins. Moreover, it may be assumed that the generators do not copy the code provided for their arguments, because that could lead to exponential growth in code size. Finally, it is possible to assign a value to the variables defined by a generator, but the effect of this assignment is unspecified.

The syntax generator consists of the following alternatives.

Generator: : vars arg1 arg*

First the expressions arg1 arg* are evaluated into a[1], a[2], …, a[n] and then a global dispatch procedure is used to dispatch on the number and types of the arguments and run the resulting generator. Initially (after loading the SRFI), the following cases are recognized:

:list           if      for all i in {1..n}: (list? a[i])

:string         if      for all i in {1..n}: (string? a[i])

:vector         if      for all i in {1..n}: (vector? a[i])

:range          if      n in {1..3} and for all i in {1..n}:
                        (integer? a[i]) and (exact? a[i])

:real-range     if      n in {1..3} and for all i in {1..n}:
                        (real? a[i])

:char-range     if      n = 2 and for all i in {1, 2}:
                        (char? a[i])

:port           if      n in {1, 2} and (input-port? a[1])
                        and (procedure? a[2])

The current dispatcher can be retrieved as (:-dispatch-ref), a new dispatcher d can be installed by (:-dispatch-set! d) yielding an unspecified result, and a copy of the initial dispatcher can be obtained as (make-initial-:-dispatch). Please refer to the section below for recommendation how to add cases to the dispatcher.

Generator: :list vars arg1 arg*

Generator: :string vars arg1 arg*

Generator: :vector vars arg1 arg*

Run through one or more lists, strings, or vectors. First all expressions in arg1 arg* are evaluated and then all elements of the resulting values are enumerated from left to right. One can think of it as first appending all arguments and then enumerating the combined object. As a clarifying example, consider:

(list-ec (:string c (index i) "a" "b")
  (cons c i))
=> ((#\a . 0)
    (#\b . 1))

Generator: :integers vars: Runs through the sequence 0, 1, 2, … of non–negative integers. This is most useful in combination with :parallel, :while, and :until or with a non–local exit in the body of the comprehension.

Generator: :range vars stop

Generator: :range vars start stop

Generator: :range vars start stop step

Runs through a range of exact rational numbers.

The form (:range vars stop) evaluates the expression stop, which must result in an exact integer n, and runs through the finite sequence 0, 1, 2, …, n-1. If n is zero or negative the sequence is empty.

The form (:range vars start stop) evaluates the expressions start and stop, which must result in exact integers a and b, and runs through the finite sequence a, a+1, a+2, …, b-1. If b is less or equal a then the sequence is empty.

The form (:range vars start stop step) first evaluates the expressions start, stop, and step, which must result in exact integers a, b, and s such that s is unequal to zero. Then the sequence a, a + s, a + 2 s, …, a + (n-1) s is enumerated where n = ceil((b-a)/s). In other words, the sequence starts at a, increments by s, and stops when the next value would reach or cross b. If n is zero or negative the sequence is empty.

Generator: :real-range vars stop

Generator: :real-range vars start stop

Generator: :real-range vars start stop step

Runs through a range of real numbers using an explicit index variable. This form of range enumeration avoids accumulation of rounding errors and is the one to use if any of the numbers defining the range is inexact, not an integer, or a bignum of large magnitude.

Providing default value 0 for start and 1 for step, the generator first evaluates start, stop, and step, which must result in reals a, b, and s such that n = (b-a)/s is also representable as a real. Then the sequence 0, 1, 2, … is enumerated while the current value i is less than n, and the variable in vars is bound to the value a + i s. If any of the values a, b, or s is non–exact then all values in the sequence are non–exact.

Generator: :char-range vars min max: Run through a range of characters. First min and max are evaluated, which must result in two characters a and b. Then the sequence of characters a, a+1, a+2, …, b is enumerated in the order defined by char<=? in the sense of R5RS Section 6.3.4. If b is smaller than a then the sequence is empty. (Note that b is included in the sequence.)

Generator: :port vars port
Generator: :port vars port read-proc: Read from the port until the eof–object is read. Providing the default read for read-proc, the generator first evaluates port and read-proc, which must result in an input port p and a procedure r. Then the variable is run through the sequence obtained by (r p) while the result does not satisfy eof-object?.

Generator: :dispatched vars dispatch arg1 arg*

Runs the variables through a sequence defined by dispatch and arg1 arg*. The purpose of :dispatched is implementing dispatched generators, in particular the predefined dispatching generator :.

The working of :dispatched is as follows. First dispatch and arg1 arg* are evaluated, resulting in a procedure d (the “dispatcher”) and the values a[1], a[2], …, a[n]. Then:

(d (list a[1] a[2] ... a[n]))

is evaluated, resulting in a value g. If g is not a procedure then the dispatcher did not recognize the argument list and an error is raised. Otherwise the “generator procedure” g is used to run vars through a sequence of values.

The sequence defined by g is obtained by repeated evaluation of (g empty) until the result is empty. In other words, g indicates the end of the sequence by returning its only argument, for which the caller has provided an object distinct from anything g can produce. (Generator procedures are state based, they are no such noble things as streams in the sense of SRFI-41.)

The definition of dispatchers is greatly simplified by the macro :generator-proc that constructs a generator procedure from a typed generator. Let (g var arg1 arg ...) be an instance of the generator syntax, for example an application–specific typed generator, with a single variable var and no index variable. Then:

(:generator-proc (g arg1 arg ...))
=> g

where the generator procedure g runs through the list:

(list-ec (g var arg1 arg ...) var)

In order to define a new dispatching generator (say :my) first a dispatching procedure (say :my-dispatch) is defined. The dispatcher will be called with a single (!) argument containing the list of all values to dispatch on. To enable informative error messages, the dispatcher should return a descriptive object (e.g. a symbol for the module name) when it is called with the empty list. Otherwise (if there is at least one value to dispatch on), the dispatcher must either return a generator procedure or #f (which means: no interest). As an example, the following skeleton code defines a dispatcher similar to the initial dispatcher of ::

(define (:my-dispatch args)
  (case (length args)
    [(0) 'SRFI-NN]
    [(1) (let ([a1 (car args)])
           (cond
            [(list? a1)
             (:generator-proc (:list a1))]
            [(string? a1)
             (:generator-proc (:string a1))]
            ...more unary cases...
            [else
             #f]))]
    [(2) (let ([a1 (car args)]
               [a2 (cadr args)])
           (cond
            [(and (list? a1) (list? a2))
             (:generator-proc (:list a1 a2))]
            ...more binary cases...
            [else
             #f]))]
    ...more arity cases...
    [else
     (cond
      [(every?-ec (:list a args) (list? a))
       (:generator-proc (:list (apply append args)))]
      ...more large variable arity cases...
      [else
       #f])]))

Once the dispatcher has been defined, the following macro implements the new dispatching generator:

(define-syntax :my
  (syntax-rules (index)
    [(:my cc var (index i) arg1 arg ...)
     (:dispatched cc var (index i) :my-dispatch arg1 arg ...)]
    [(:my cc var arg1 arg ...)
     (:dispatched cc var :my-dispatch arg1 arg ...)]))

This method of extension yields complete control of the dispatching process. Other modules can only add cases to :my if they have access to :my-dispatch.

Extending the predefined dispatched generator. An alternative to adding a new dispatched generator is to extend the predefined generator :. Technically, extending : means installing a new global dispatching procedure using :-dispatch-set! as described above. In most cases, however, the already installed dispatcher should be extended by new cases. The following procedure is a utility for doing so:

(dispatch-union d1 d2)
=> d

where the new dispatcher d recognizes the union of the cases recognized by the dispatchers d1 and d2. The new dispatcher always tries both component dispatchers and raises an error in case of conflict. The identification returned by (d) is the concatenation of the component identifications (d1) and (d2), enclosed in lists if necessary. For illustration, consider the following code:

(define (example-dispatch args)
  (cond
   [(null? args)
    'example]
   [(and (= (length args) 1) (symbol? (car args)) )
    (:generator-proc (:string (symbol->string (car args))))]
   [else
    #f]))

(:-dispatch-set! (dispatch-union (:-dispatch-ref) example-dispatch))

After evaluation of this code, the following example will work:

(list-ec (: c 'abc)
  c)
=> (#\a #\b #\c)

Adding cases to : is particularly useful for frequent cases of interactive input. Be warned, however, that the advantage of global extension also carries the danger of conflicts, unexpected side–effects, and slow dispatching.

Generator: :do (lb*) ne1? (ls*)

Generator: :do (let (ob*) oc*) (lb*) ne1? (let (ib*) ic*) ne2? (ls*)

Defines a generator in terms of a named–let, optionally decorated with inner and outer lets. This generator is for defining other generators. (In fact, the reference implementation transforms any other generator into an instance of fully decorated :do.)

The generator is a compromise between expressive power (more flexible loops) and fixed structure (necessary for merging and modifying generators).

In the fully decorated form, the syntactic variables ob (outer binding), oc (outer command), lb (loop binding), ne1? (not-end1?), ib (inner binding), ic (inner command), ne2? (not-end2?), and ls (loop step) define the following loop skeleton:

(let (ob*)
  oc*
  (let loop (lb*)
    (if ne1?
        (let (ib*)
          ic*
          payload
          (if ne2?
              (loop ls*) )))))

where oc* and ic* are syntactically equivalent to command*, i.e. they do not begin with a definition. The latter requirement allows the code generator to produce more efficient code for special cases by removing empty let–expressions altogether.

Generator: :let vars expression

Run through the sequence consisting of the value of expression, only. This is the same as:

(:list vars (list expression))

If an index variable is specified, its value is 0. The :let generator can be used to introduce an intermediate variable depending on outer generators.

Generator: :parallel generator*: Run several generators in parallel. This means that the next binding in the sequence is obtained by advancing each generator in generator* by one step. The parallel generator terminates when any of its component generators terminate. The generators share a common scope for the variables they introduce. This implies that the names of the variables introduced by the various generators must be distinct.

Generator: :while generator expression

Run generator while expression evaluates to non–#f. The guarding expression is included in the scope of the variables introduced by the generator.

Note the distinction between the filter if and the modified generator expressed by :while.

Generator: :until generator expression

Run generator until after expression has evaluated to non–#f. The guarding expression is included in the scope of the variables introduced by the generator.

Note the distinction between :while, stopping at a certain condition, and :until, stopping after a certain condition has occurred. The latter implies that the binding that has triggered termination has been processed by the comprehension.

Application specific typed generator

An important aspect of this SRFI is a modular mechanism to define new typed generators. To define a new typed generator a hygienic referentially transparent macro of the same name is defined to transform the generator pattern into an instance of the :do-generator. The extension is fully modular, meaning that no other macro has to be modified to add the new generator. This is achieved by defining the new macro in Continuation Passing Style.

Technically, this works as follows. Assume the generator syntax:

(:mygen var arg)

is to be implemented, for example running the variable var through the list (reverse arg). The following definition implements :mygen in terms of :list using the additional syntactic variable cc (read current continuation):

(define-syntax :mygen
  (syntax-rules ()
    [(:mygen cc var arg)
     (:list cc var (reverse arg))]))

After this definition, any comprehension will accept the :mygen generator and produce the proper code for it. This works as follows. When a comprehension sees something of the form (g arg ...) in the position of a qualifier then it will transform the entire comprehension into:

(g (continue ...) arg ...)

This effectively “transfers control” to the macro g, for example :mygen. The macro g has full control of the transformation, but eventually it should transform the expression into:

(:do (continue ...)  etc ...)

In the :mygen example this is done by the :list-macro. The macro :do finally transforms into:

(continue ... (:do etc ...))

As continue has been chosen by the macro implementing the comprehension, it can regain control and proceed with other qualifiers.

In order to ensure consistency of new generators with the ones defined in this SRFI, a few conventions are in order. Firstly, the generator patterns begin with one or more variables followed by arguments defining the sequence. Secondly, each generator except :do can handle an optional index variable. This is most easily implemented using :parallel together with :integers. In case the payload generator needs an index anyhow (e.g. :vector) it is more efficient to add an index variable if none is given and to implement the indexed case. Finally, make sure that no syntactic variable of the generator pattern ever gets duplicated in the code (to avoid exponential code size in nested application), and introduce sufficient intermediate variables to make sure expressions are evaluated at the correct time.