

53.6.1 Evaluation of the actions

The action of a rule is evaluated when the corresponding pattern is matched. The result of its evaluation is the result that the lexical analyser returns to its caller.

We can think of an action like this: it is a form which is placed in the body of a lambda function, which in turn is invoked when a token matching the regular expression is found. So the following specification:

decint          [0-9]+

%%

{decint}        (string->number yytext)

will cause the following code to be put in the generated lexer tables:

(lambda (yytext)
  (string->number yytext))
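
To make the correspondence concrete, the procedure can be modelled at a Scheme REPL. This is only a sketch: the name decint-action is hypothetical, and in practice the lexer, not user code, applies the procedure to the lexeme.

```scheme
;; Hypothetical name for the procedure built from the {decint} action.
(define decint-action
  (lambda (yytext)
    (string->number yytext)))

;; The lexer would call it with the matched lexeme; here we do so by hand.
(decint-action "123")   ; => 123
```

The value returned by the procedure is what the caller of the lexer would receive as the token.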

The arguments in the formals of the lambda are local bindings we can use in our actions. A few local bindings are accessible to the action when it is evaluated: yycontinue, yygetc, yyungetc, yytext, yyline, yycolumn and yyoffset.

Binding: yycontinue

Contains the lexical analysis function itself. Use (yycontinue) to ask for the next token. Typically, the action associated with a pattern that matches white space is a call to yycontinue; it has the effect of skipping the white space.
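
The white space idiom described above might look like the following specification fragment. It is a sketch under assumptions: the macro name blanks is hypothetical, the class is assumed to accept a literal space and \n, and Scheme-style comments are assumed to be allowed wherever white space is.

```scheme
;; Macro section: one or more spaces or newlines (class syntax assumed).
blanks          [ \n]+
decint          [0-9]+

%%

;; White space yields no token: ask the lexer for the next one instead.
{blanks}        (yycontinue)
{decint}        (string->number yytext)
```

Because the action for {blanks} returns whatever (yycontinue) returns, the caller never sees a token for the white space itself.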

Binding: yygetc
Binding: yyungetc

Contain functions to get and unget characters from the input of the analyser. They take no arguments. yygetc returns a character, or the ‘(eof-object)’ value if the end-of-input is reached.

They should be used to read characters instead of accessing the input port directly, because the analyser may already have read more characters from the port in order to perform a look-ahead.

If we get more characters than we unget, the characters that were not ungotten are skipped by the lexer function at its next invocation. If we want to perform a look-ahead without losing characters, we must unget all the characters we have gotten.

It is incorrect to try to unget more characters than have been gotten since the last token was recognised. If such an attempt is made, yyungetc silently refuses.
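
As an illustration of the get and unget rules, here is a sketch of a rule that peeks at one character after matching "<". The token symbols less and less-equal are hypothetical, and the treatment of the end of input is an assumption rather than documented behaviour.

```scheme
%%

;; After "<" has matched, peek at one more character.
"<"     (let ((c (yygetc)))
          (cond ((eof-object? c)
                 ;; End of input: assumed that no character was consumed,
                 ;; so there is nothing to unget.
                 'less)
                ((char=? c #\=)
                 ;; Keep the "=": since it is not ungotten, it is skipped
                 ;; at the next invocation, so "<=" forms a single token.
                 'less-equal)
                (else
                 ;; Not ours: give the character back so that the next
                 ;; token starts with it.
                 (yyungetc)
                 'less)))
```

Each branch ungets no more than it has gotten, which keeps the action within the rule stated above.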

Binding: yytext

Bound to a string containing the lexeme. This string is guaranteed not to be mutated. The string is created only if the action seems to need it. The action is considered to need the lexeme when ‘yytext’ appears somewhere in the text of the action.
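
The following fragment sketches the difference between an action that needs the lexeme and one that does not. The patterns and the number symbol are hypothetical examples, and Scheme-style comments are assumed to be allowed in the specification.

```scheme
%%

;; yytext appears in the action, so the lexeme string is built.
[a-z]+          (string->symbol yytext)

;; yytext does not appear, so no string needs to be created.
[0-9]+          'number
```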

Binding: yyline
Binding: yycolumn
Binding: yyoffset

Indicate the position in the input at the beginning of the lexeme. yyline is the number of the line; the first line is numbered 1. yycolumn is the number of the column; the first column is numbered 1.

Note that characters such as the tab produce output of variable width when they are printed. So it is more accurate to say that yycolumn is the index of the first character of the lexeme, counted from the beginning of the line.

yyoffset indicates the distance from the beginning of the input; the first lexeme has offset 0.

Depending on the options given to the function lex when generating the tables, some of these three bindings may not exist.
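
An action might attach the position to the token it returns, as in the sketch below. It assumes the tables were generated with options that make all three bindings available; the list-shaped token and the identifier symbol are hypothetical.

```scheme
%%

;; Return the token together with the position where its lexeme starts.
[a-z]+          (list 'identifier yytext yyline yycolumn yyoffset)
```

Under the numbering described above, a lexeme at the very start of the input would be reported with line 1, column 1 and offset 0.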

A default action is provided for a rule whose action is omitted.

It is clearer (and normally more useful) to specify the action associated with each rule explicitly.

