The following bindings are exported by the library (vicare parser-logic).
Define an abstract parser: a set of rules for parsing the input characters, expressed as calls to a set of operator functions. The result of the expansion is a syntax definition which can be used to instantiate a concrete parser by combining the parser logic with the input device logic.
The input arguments are:
?definer must be an identifier. It is bound to the generated syntax definition; such syntax is used as follows:
(?definer ?device-logic (?operator-name …))
where: ?device-logic is the identifier bound to the device logic syntax; the ?operator-name are identifiers among the public operator function names.
?ch must be an identifier. When a character is successfully extracted from the input device, it is bound to this identifier and made available to the operator clauses.
?next must be an identifier. The device logic rule :generate-end-of-input-or-char-tests must bind it to a syntax; such syntax must expand to a tail–call to an operator processing the next input character. ?next is used as follows in the operator clauses:
(?next ?operator-name ?operator-arg …)
and it should expand to something like:
(?operator-name ?device-arg … ?operator-arg …)
where: ?device-arg are the arguments representing the input device state; ?operator-arg are the arguments representing the parser state as specified in the ?operator-spec.
?fail must be an identifier. The device logic rule :generate-end-of-input-or-char-tests must bind it to a syntax; such syntax is used to handle parsing errors detected by the operator clauses. ?fail is simply used as (?fail).
Each ?operator-spec must have the form:
(?operator-name (?operator-arg …) ?operator-clause …)
where:
?operator-name must be an identifier. It is bound to a generated operator function.
There is no difference in the way public and private operators are specified; the public operator names are listed in the concrete parser definition. An operator can be public in one concrete parser and private in another.
The ?operator-arg must be identifiers; they are bound to the formal arguments associated with the parser state.
The ?operator-clause are symbolic expressions specifying the input accepted by the operator.
Each ?operator-clause must have one of the formats:
((?char0 ?char …) ?body0 ?body …)
Each ?char must be an expression evaluating to a Scheme character object. The ?body forms are evaluated if the input character bound to ?ch is equal, according to char=?, to one among the ?char characters.
((?func ?expr …) => ?ret ?body0 ?body …)
?func must be an expression evaluating to a function; the ?expr must be expressions; ?ret must be an identifier. The ?body forms are evaluated if the form:
(?func ?ch ?expr …)
evaluates to a true value; such true value is bound to ?ret prior to evaluating the ?body.
((:end-of-input) ?body0 ?body …)
The ?body forms are evaluated if no more characters are available from the input device. This clause is to be used by operators accepting the end–of–input state as valid; if such a clause is not present, the end–of–input causes an error, which the device logic is used to handle.
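As an illustration of the three clause formats, the following is a minimal sketch of an abstract parser for unsigned decimal integers. It assumes that the defining syntax described above is exported under the name define-parser-logic, and it instantiates the parser with the string->token-or-false device logic documented at the end of this section. The names char->digit, define-unsigned-parser, parse-unsigned and the operator names are hypothetical, chosen only for this example.

```scheme
(import (vicare) (vicare parser-logic))

;; Hypothetical helper: return the numeric value of CH in the given
;; radix (up to 10), or #f if CH is not a digit.  It is called by the
;; "=>" clauses as (char->digit ch 10).
(define (char->digit ch radix)
  (let ((n (- (char->integer ch) (char->integer #\0))))
    (and (<= 0 n) (< n radix) n)))

;; Abstract parser.  Assumption: the defining syntax is named
;; DEFINE-PARSER-LOGIC.
(define-parser-logic define-unsigned-parser ch next fail
  ;; Start state: an optional plus sign or the first digit.
  (%parse-start ()
    ((#\+)
     (next %parse-first-digit))
    ((char->digit 10) => digit
     (next %parse-digits digit)))
  ;; After a sign at least one digit is required.
  (%parse-first-digit ()
    ((char->digit 10) => digit
     (next %parse-digits digit)))
  ;; Accumulate digits; the end-of-input is valid here.
  (%parse-digits (accum)
    ((:end-of-input)
     accum)
    ((char->digit 10) => digit
     (next %parse-digits (+ digit (* 10 accum))))))

;; Concrete parser: only %PARSE-START is public.
(define-unsigned-parser string->token-or-false (%parse-start))

;; The device arguments of STRING->TOKEN-OR-FALSE (string, length,
;; index) come first, followed by the operator arguments (none here).
(define (parse-unsigned str)
  (%parse-start str (string-length str) 0))
```

Under these assumptions (parse-unsigned "123") should return 123, while (parse-unsigned "12a") and (parse-unsigned "") should return #f, because the :invalid-input-char and :unexpected-end-of-input rules of string->token-or-false both expand to #f.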
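The identifiers :introduce-device-arguments, :generate-end-of-input-or-char-tests, :unexpected-end-of-input, :generate-delimiter-test and :invalid-input-char are used to specify device logic syntax rules; they must be used in a syntax definition like: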
(define-syntax device-logic
  (syntax-rules (:introduce-device-arguments
                 :generate-end-of-input-or-char-tests
                 :unexpected-end-of-input
                 :generate-delimiter-test
                 :invalid-input-char)
    ((_ :introduce-device-arguments ---) ---)
    ((_ :generate-end-of-input-or-char-tests ---) ---)
    ((_ :unexpected-end-of-input ---) ---)
    ((_ :generate-delimiter-test ---) ---)
    ((_ :invalid-input-char ---) ---)))
the rules have the following syntax:
:introduce-device-arguments
The input form is:
(_ :introduce-device-arguments ?kont . ?rest)
this rule introduces a list of identifiers used as device–specific arguments; they will be the first arguments for each parser operator function. The output form must be:
(?kont (?device-arg …) . ?rest)
where the ?device-arg are identifiers.
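The rule follows a continuation-passing convention: the device logic does not return the identifiers, it hands them to the syntax ?kont. The following self-contained sketch shows the mechanism in isolation; demo-device-logic and make-operator are hypothetical names, and make-operator only stands in for whatever syntax the parser logic actually passes as ?kont.

```scheme
(import (rnrs))

;; Hypothetical device logic implementing only the
;; :introduce-device-arguments rule.
(define-syntax demo-device-logic
  (syntax-rules (:introduce-device-arguments)
    ((_ :introduce-device-arguments ?kont . ?rest)
     (?kont (input.string input.length input.index) . ?rest))))

;; Hypothetical continuation: build a function whose first formals
;; are the device arguments, followed by one operator argument.
(define-syntax make-operator
  (syntax-rules ()
    ((_ (?device-arg ...) ?operator-arg)
     (lambda (?device-arg ... ?operator-arg)
       (list ?device-arg ... ?operator-arg)))))

;; Expands to:
;;   (make-operator (input.string input.length input.index) accumulator)
(define demo-operator
  (demo-device-logic :introduce-device-arguments make-operator accumulator))

(demo-operator "abc" 3 0 '())   ; => ("abc" 3 0 ())
```

Because the identifiers are introduced by the device logic itself, the operator clauses never mention them; they reach the device state only through the ?next and ?fail syntaxes.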
:generate-end-of-input-or-char-tests
The input form is:
(_ :generate-end-of-input-or-char-tests ?ch ?next ?fail (?device-arg …) ?end-of-input-kont ?parse-input-char-kont)
this rule is used to generate the input device tests for an operator function. The expanded code must first test for the end–of–input state and then proceed to evaluate code for the input character; in pseudocode the output form should be:
(if (end-of-input? ?device-arg ...)
    ?end-of-input-kont
  (let ((?ch (get-next-char ?device-arg ...)))
    ?parse-input-char-kont))
?ch is an identifier. The input character must be bound to it before evaluating ?parse-input-char-kont.
?next is an identifier. This rule must bind it to a syntax used to tail–call another operator using ?device-arg as first arguments; for example:
(define-syntax ?next
  (syntax-rules ()
    ((_ ?operator-name ?operator-arg ...)
     (?operator-name ?device-arg ... ?operator-arg ...))))
?fail is an identifier. This rule must bind it to a syntax used to signal an error detected by an operator clause; for example:
(define-syntax ?fail
  (syntax-rules ()
    ((_)
     (error #f "invalid input character" ?device-arg ...))))
The ?device-arg are the identifiers introduced by :introduce-device-arguments.
?end-of-input-kont is a form to be evaluated whenever the end–of–input is detected.
?parse-input-char-kont is a form to be evaluated whenever a character is extracted from the input device.
:unexpected-end-of-input
The input form is:
(_ :unexpected-end-of-input (?device-arg …))
whenever the end–of–input is found by an operator that does not accept it as valid, this rule is used to decide what to do.
The ?device-arg are the identifiers introduced by :introduce-device-arguments.
The output form can return a value or raise an exception; the returned value becomes the return value of the call to the parser.
:generate-delimiter-test
The input form is:
(_ :generate-delimiter-test ?ch ?ch-is-delimiter-kont ?ch-is-not-delimiter-kont)
this rule is used for input devices for which the lexeme string is embedded into a sequence of other characters, so there exists a set of characters that delimit the end–of–lexeme. The parser delegates to the device the responsibility of knowing which characters are delimiters, if any.
?ch is an identifier bound to the input character. ?ch-is-delimiter-kont is a form to be evaluated whenever ?ch is a delimiter character. ?ch-is-not-delimiter-kont is a form to be evaluated whenever ?ch is not a delimiter character.
For parsers accepting a full Scheme string as lexeme there are no delimiters: the end–of–lexeme is the end–of–input; such parsers should just use ?ch-is-not-delimiter-kont as output form.
For parsers having delimiter characters, recognised for example by a function like:
(define (delimiter? ch) (or (char=? ch #\space) (char=? ch #\linefeed)))
the output form should be something like:
(if (delimiter? ?ch) ?ch-is-delimiter-kont ?ch-is-not-delimiter-kont)
:invalid-input-char
The input form is:
(_ :invalid-input-char (?device-arg …) ?ch)
whenever an input character is not accepted by an operator function this rule is used to decide what to do.
The ?device-arg are the identifiers introduced by :introduce-device-arguments; ?ch is an identifier bound to the invalid input character.
The output form can return a value or raise an exception; the returned value becomes the return value of the call to the parser.
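To show how the five rules fit together for a device other than a string, here is a hedged sketch of a device logic reading characters from an R6RS textual input port. The name port->token-or-false, the single device argument input.port and the small parser used to exercise it are all hypothetical, and the defining syntax is assumed to be exported as define-parser-logic. Note that the ellipsis belonging to the inner ?next definition is escaped as (... ...), since it appears inside an outer syntax-rules template.

```scheme
(import (vicare) (vicare parser-logic))

;; Hypothetical device logic: the device state is a single textual
;; input port; errors are reported by returning #f.
(define-syntax port->token-or-false
  (syntax-rules (:introduce-device-arguments
                 :generate-end-of-input-or-char-tests
                 :unexpected-end-of-input
                 :generate-delimiter-test
                 :invalid-input-char)
    ((_ :introduce-device-arguments ?kont . ?rest)
     (?kont (input.port) . ?rest))
    ((_ :invalid-input-char (?input.port) ?ch)
     #f)
    ((_ :unexpected-end-of-input (?input.port))
     #f)
    ;; No delimiters: the lexeme ends where the port ends.
    ((_ :generate-delimiter-test ?ch ?ch-is-delimiter-kont ?ch-is-not-delimiter-kont)
     ?ch-is-not-delimiter-kont)
    ((_ :generate-end-of-input-or-char-tests ?ch ?next ?fail (?input.port)
        ?end-of-input-kont ?parse-input-char-kont)
     (let-syntax
         ((?fail (syntax-rules ()
                   ((_) #f)))
          (?next (syntax-rules ()
                   ((_ ?operator-name ?operator-arg (... ...))
                    (?operator-name ?input.port ?operator-arg (... ...))))))
       ;; Test for end-of-input first, then consume one character.
       (if (eof-object? (lookahead-char ?input.port))
           ?end-of-input-kont
         (let ((?ch (get-char ?input.port)))
           ?parse-input-char-kont))))))

;; Hypothetical abstract parser accepting sequences of #\a and #\b.
(define-parser-logic define-ab-parser ch next fail
  (%parse-ab (accumulator)
    ((:end-of-input)
     (reverse accumulator))
    ((#\a #\b)
     (next %parse-ab (cons ch accumulator)))))

(define-ab-parser port->token-or-false (%parse-ab))

(define (parse-ab-from-port port)
  ;; Device argument first, then the operator argument.
  (%parse-ab port '()))
```

Under these assumptions, (parse-ab-from-port (open-string-input-port "abba")) should return the list of the four characters, and an input such as "abc" should yield #f. This sketch consumes each character before it is validated, which is acceptable here only because no delimiters are involved.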
The syntax string->token-or-false defines the device logic to parse a lexeme from a full Scheme string object, as in string->number. It is implemented as follows:
(define-syntax string->token-or-false
  (syntax-rules (:introduce-device-arguments
                 :generate-end-of-input-or-char-tests
                 :unexpected-end-of-input
                 :generate-delimiter-test
                 :invalid-input-char)
    ((_ :introduce-device-arguments ?kont . ?rest)
     (?kont (input.string input.length input.index) . ?rest))
    ((_ :invalid-input-char (?input.string ?input.length ?input.index) ?ch)
     #f)
    ((_ :unexpected-end-of-input (?input.string ?input.length ?input.index))
     #f)
    ((_ :generate-delimiter-test ?ch ?ch-is-delimiter-kont ?ch-is-not-delimiter-kont)
     ?ch-is-not-delimiter-kont)
    ((_ :generate-end-of-input-or-char-tests ?ch ?next ?fail
        (?input.string ?input.length ?input.index)
        ?end-of-input-kont ?parse-input-char-kont)
     (let-syntax
         ((?fail (syntax-rules ()
                   ((_) #f)))
          (?next (syntax-rules ()
                   ((_ ?operator-name ?operator-arg (... ...))
                    (?operator-name ?input.string ?input.length (fx+ 1 ?input.index)
                                    ?operator-arg (... ...))))))
       (if (fx=? ?input.index ?input.length)
           ?end-of-input-kont
         (let ((?ch (string-ref ?input.string ?input.index)))
           ?parse-input-char-kont))))))
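Since the error rules may raise an exception instead of returning a value, a variant of the definition above can signal errors rather than return #f. The following is only a sketch: string->token-or-error is a hypothetical name, not a binding of the library, the error messages are invented for the example, and the defining syntax is assumed to be exported as define-parser-logic.

```scheme
(import (vicare) (vicare parser-logic))

;; Hypothetical variant of STRING->TOKEN-OR-FALSE raising exceptions.
(define-syntax string->token-or-error
  (syntax-rules (:introduce-device-arguments
                 :generate-end-of-input-or-char-tests
                 :unexpected-end-of-input
                 :generate-delimiter-test
                 :invalid-input-char)
    ((_ :introduce-device-arguments ?kont . ?rest)
     (?kont (input.string input.length input.index) . ?rest))
    ((_ :invalid-input-char (?input.string ?input.length ?input.index) ?ch)
     (error 'string->token-or-error "invalid input character" ?ch ?input.index))
    ((_ :unexpected-end-of-input (?input.string ?input.length ?input.index))
     (error 'string->token-or-error "unexpected end of input" ?input.string))
    ((_ :generate-delimiter-test ?ch ?ch-is-delimiter-kont ?ch-is-not-delimiter-kont)
     ?ch-is-not-delimiter-kont)
    ((_ :generate-end-of-input-or-char-tests ?ch ?next ?fail
        (?input.string ?input.length ?input.index)
        ?end-of-input-kont ?parse-input-char-kont)
     (let-syntax
         ((?fail (syntax-rules ()
                   ((_) (error 'string->token-or-error "parse failure" ?input.index))))
          (?next (syntax-rules ()
                   ((_ ?operator-name ?operator-arg (... ...))
                    (?operator-name ?input.string ?input.length (fx+ 1 ?input.index)
                                    ?operator-arg (... ...))))))
       (if (fx=? ?input.index ?input.length)
           ?end-of-input-kont
         (let ((?ch (string-ref ?input.string ?input.index)))
           ?parse-input-char-kont))))))

;; Hypothetical abstract parser used to exercise the device logic.
(define-parser-logic define-ab-parser ch next fail
  (%parse-ab (accumulator)
    ((:end-of-input)
     (reverse accumulator))
    ((#\a #\b)
     (next %parse-ab (cons ch accumulator)))))

(define-ab-parser string->token-or-error (%parse-ab))

(define (parse-ab str)
  (%parse-ab str (string-length str) 0 '()))
```

The same abstract parser could be instantiated a second time with string->token-or-false; only the reaction to invalid input would change, which is the point of keeping the parser logic and the device logic separate.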