

53.5.4 Atomic regular expressions

The following constructs are regular expressions:

c

Ordinary character. It is a regular expression that matches the character c itself. c must not be one of the following characters:

. \ { " [ | ? + * ( ) ^ $ ;

or any whitespace character. If c is the ‘#’ character, note that it may begin a hex character specification (explained below); remember that SILex gives precedence to the longest match.

.

Wild card. It matches any character except the newline character.

\n
\integer
\c

Backslash. The backslash serves two purposes: protecting a character from its special meaning, and generating non-printable characters.

The expression \n matches the newline character.

The expression \integer matches the character that has number integer (in the sense of char->integer). integer must be a valid character number on the underlying Scheme implementation. Notice that ‘\9’ represents the horizontal tabulation ‘#\tab’, ‘\10’ the newline character ‘#\newline’, and ‘\13’ the carriage return character ‘#\return’.
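The mapping from \integer to a character can be sketched in a few lines of Python. This is only an illustration, not part of SILex: the helper name decode_integer_escape is hypothetical, and the sketch assumes that the character numbers of the underlying Scheme implementation coincide with Unicode code points, as Python's chr does.

```python
import re

# Hypothetical helper: decodes the text of a \integer escape.
# Assumption: Scheme character numbers equal Unicode code points.
INTEGER_ESCAPE = re.compile(r"\\([0-9]+)")

def decode_integer_escape(escape: str) -> str:
    """Return the character denoted by an escape of the form \\integer."""
    m = INTEGER_ESCAPE.fullmatch(escape)
    if m is None:
        raise ValueError(f"not a \\integer escape: {escape!r}")
    return chr(int(m.group(1)))  # decimal character number -> character

for text in ("\\9", "\\10", "\\13"):
    print(text, "->", repr(decode_integer_escape(text)))
```

With these assumptions ‘\9’, ‘\10’ and ‘\13’ decode to the tab, newline and carriage return characters, in agreement with the list above.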

The expression \c matches the character c, provided c is neither ‘n’, nor ‘-’, nor a digit.

#xHEX
#XHEX

Hexadecimal characters. The expressions #xHEX and #XHEX match the character that has hex number HEX (in the sense of string->number). Because SILex lexers match the longest input sequence, HEX extends up to the first character that is not a hexadecimal digit; the digits may be uppercase or lowercase.
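The consequence of the longest match rule for hex specifications can be sketched in Python. This is an illustration under stated assumptions, not SILex code: the name read_hex_spec is hypothetical, and character numbers are assumed to be Unicode code points.

```python
import re

# '#x' or '#X' followed by the longest possible run of hexadecimal digits.
HEX_SPEC = re.compile(r"#[xX]([0-9A-Fa-f]+)")

def read_hex_spec(text: str):
    """Split text into (character, remaining text) for a leading #xHEX."""
    m = HEX_SPEC.match(text)
    if m is None:
        raise ValueError(f"no hex character specification in {text!r}")
    # Assumption: character numbers are Unicode code points.
    return chr(int(m.group(1), 16)), text[m.end():]

# 'z' is not a hexadecimal digit, so HEX is '41' and 'z' is left over.
print(read_hex_spec("#x41z"))
# 'b' is a hexadecimal digit, so HEX is '41b' and nothing is left over.
print(read_hex_spec("#x41b"))
```

The second call shows the pitfall: a hex specification followed directly by one of the letters ‘a’ to ‘f’ absorbs that letter into HEX.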

{name}

Macro reference. This expression matches the same lexemes as those matched by the regular expression named name. We can imagine that the reference is replaced by the text of the named expression. However, it works as if parentheses had been added to protect the substituting expression.
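The effect of the implicit parentheses can be sketched with Python's re module standing in for SILex. Everything here is an assumption made for illustration: the helper names, the macro name ‘m’, and the use of Python regular expression syntax in place of SILex syntax.

```python
import re

MACRO_REF = re.compile(r"\{(\w+)\}")

def expand_protected(pattern: str, macros: dict) -> str:
    """Replace each {name} with the named expression wrapped in a group."""
    return MACRO_REF.sub(lambda m: "(?:" + macros[m.group(1)] + ")", pattern)

def expand_textually(pattern: str, macros: dict) -> str:
    """Replace each {name} with the bare text of the named expression."""
    return MACRO_REF.sub(lambda m: macros[m.group(1)], pattern)

macros = {"m": "a|b"}                          # hypothetical macro definition
protected = expand_protected("{m}c", macros)   # '(?:a|b)c'
textual = expand_textually("{m}c", macros)     # 'a|bc'

print(protected, bool(re.fullmatch(protected, "ac")))
print(textual, bool(re.fullmatch(textual, "ac")))
```

Only the protected expansion matches ‘ac’; with plain textual replacement the alternation would swallow the following ‘c’, which is the behavior the added parentheses prevent.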

"some text"

String. A string matches a lexeme identical to its contents. The format of the string is the same as that defined by R6RS, including the quoted line wrapping.

[list of characters]
[]list of characters]
[-list of characters]
[^list of characters]

Character class. The expression matches one of the enumerated characters. For example, the expression ‘[abc]’ matches one of ‘a’, ‘b’, or ‘c’.

We can list a range of characters by writing the first character, a ‘-’, and the last character. For example, ‘[A-Za-z]’ matches one letter.

The special characters in a class are ‘]’, which closes the class, ‘-’, which denotes a range of characters, and ‘\’, which keeps its usual meaning.

There is an exception with the first character in a class. If the first character is ‘]’ or ‘-’, it loses its special meaning. If the first character is ‘^’, the expression matches one character if it is not enumerated in list of characters.
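The rules for character classes can be summarized by a small Python sketch. It is a simplified model, not the SILex implementation: the function names are hypothetical, backslash escapes are not handled, and a ‘-’ that cannot form a range is taken literally, which is an assumption of the sketch.

```python
def parse_class(expr: str):
    """Parse '[...]' into (negated, set of enumerated characters)."""
    if len(expr) < 2 or expr[0] != "[" or expr[-1] != "]":
        raise ValueError(f"not a character class: {expr!r}")
    body = expr[1:-1]
    negated, chars, i = False, set(), 0
    # Exception for the first character of the class.
    if body[:1] == "^":
        negated, i = True, 1           # complemented class
    elif body[:1] in ("]", "-"):
        chars.add(body[0])             # ']' or '-' loses its special meaning
        i = 1
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            # Range: first character, '-', last character.
            lo, hi = ord(body[i]), ord(body[i + 2])
            chars.update(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.add(body[i])         # ordinary enumerated character
            i += 1
    return negated, chars

def class_matches(expr: str, ch: str) -> bool:
    """Tell whether the single character ch is matched by the class."""
    negated, chars = parse_class(expr)
    return (ch in chars) != negated

print(class_matches("[A-Za-z]", "q"))   # True
print(class_matches("[]abc]", "]"))     # True
print(class_matches("[^0-9]", "7"))     # False
```

In this model ‘[]abc]’ enumerates ‘]’, ‘a’, ‘b’ and ‘c’, ‘[-+]’ enumerates ‘-’ and ‘+’, and ‘[^0-9]’ matches any one character that is not a decimal digit.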

