Next: , Previous: packrat intro, Up: packrat


76.2 Calculator example

The following example implements a simple arithmetics calculator. First let's see a symbolic expressions lexer; the parser expects a lexer closure returning records of type <lexical-token> having terminal category symbols in the set:

     NUMBER OPEN-PAREN CLOSE-PAREN + - * /

the lexer first converts an S–expression into a list of pairs:

     (sexp->stream '(1 + 2 + 3))
     ⇒ ((NUMBER . 1)
         (+)
         (NUMBER . 2)
         (+)
         (NUMBER . 3))
     
     (sexp->stream '((1 + 2) * 3))
     ⇒ ((OPEN-PAREN)
         (NUMBER . 1)
         (+)
         (NUMBER . 2)
         (CLOSE-PAREN)
         (*)
         (NUMBER . 3))

notice that ‘(NUMBER . 3)’ is a token representation with terminal category ‘NUMBER’ and semantic value ‘3’; also, ‘(+)’ is a token representation with terminal category ‘+’ and semantic value null.

     (define (sexp->stream sexp)
       (define (lexer sexp)
         (let loop ((sexp   (if (pair? sexp)
                                sexp
                              (list sexp)))
                    (result '()))
           (if (null? sexp)
               result
             (let ((atom (car sexp)))
               (loop (cdr sexp)
                     (cond ((number? atom)
                            `((NUMBER . ,atom) . ,result))
                           ((memq atom '(+ - * /))
                            `((,atom) . ,result))
                           ((pair? atom)
                            (append '((CLOSE-PAREN))
                                    (lexer atom)
                                    '((OPEN-PAREN))
                                    result))
                           (else
                            (error #f "invalid token" atom))))))))
       (reverse (lexer sexp)))

Then let's see the lexer closure; it takes an S–expression and returns a thunk which, when evaluated, returns records of type <lexical-token>. The source location field is set to #f, because for an S–expression there is no such information available.

     (define (make-lexer-closure sexp)
       (let ((stream (sexp->stream sexp)))
         (lambda ()
           (if (null? stream)
               (make-<lexical-token>/end-of-input #f)
             (let ((token (car stream)))
               (set! stream (cdr stream))
               (make-<lexical-token> (car token) #f
                                     (cdr token) 0))))))

Finally the parser definition; it defines the following non–terminal symbols:

expr
It is the start symbol and represents a general arithmetic expression.
mul-expr
It represents a multiplication, a division or a subexpression. It is subordinate to ‘expr’ to allow multiplication and division to have precedence over addition and subtraction. Also, it allows parenthesised subexpressions to be evaluated first.
simple
It represents subexpressions. It includes: simple numbers; negated expressions; expressions preceeded by a standalone plus sign; parenthesised subexpressions.

The make-grammar-combinator macro returns a combinator function which can be used any number of times to parse S–expressions.

     (define calc
       (make-grammar-combinator expr
     
         (expr     ((a <- mul-expr '+ b <- expr)
                    (+ a b))
                   ((a <- mul-expr '- b <- expr)
                    (- a b))
                   ((a <- mul-expr)
                    a))
     
         (mul-expr ((a <- simple '* b <- simple)
                    (* a b))
                   ((a <- simple '/ b <- simple)
                    (/ a b))
                   ((a <- simple)
                    a))
     
         (simple   ((a <- 'NUMBER)
                    a)
                   (('+ a <- expr)
                    a)
                   (('- a <- expr)
                    (- a))
                   (('OPEN-PAREN a <- expr 'CLOSE-PAREN)
                    a))))

typical usage of the interface is represented by the following function:

     (define (doit sexp)
       (let* ((lexer   (make-lexer-closure sexp))
              (result  (calc (initialise-state lexer))))
         (if (<success>? result)
             (<result>-value result)
           (<error>-message result))))

it works like this:

     (doit 1)
     ⇒ 1
     
     (doit '(+ 1))
     ⇒ 1
     
     (doit '(- 1))
     ⇒ -1
     
     (doit '(- - + + - + 1))
     ⇒ -1
     
     (doit '(1 * 2 - 3))
     ⇒ -1
     
     (doit '((1 + 2) * 3))
     ⇒ 9
     
     (doit '(+ *))
     ⇒ "expected token with category OPEN-PAREN"

there are ways to return better error messages.