Compiled regular expressions can be configured, at construction time, with a number of options collected in a cre2_options_t object.
Notice that, by default, when attempting to compile an invalid regular expression pattern, RE2 prints an error message to stderr; usually we want to avoid this logging by disabling the associated option:
    cre2_options_t * opt;

    opt = cre2_opt_new();
    cre2_opt_set_log_errors(opt, 0);
Type of opaque pointers to options objects. Any instance of this type can be used to configure any number of regular expression objects.
Enumeration type for constants selecting encoding. It contains the following values:
CRE2_UNKNOWN CRE2_UTF8 CRE2_Latin1
The value CRE2_UNKNOWN should never be used: it exists only in case there is a mismatch between the definitions of RE2 and CRE2.
Allocate and return a new options object. If memory allocation fails, the return value is a NULL pointer.
Finalise an options object, releasing all the associated resources. Compiled regular expressions configured with this object are not affected by its destruction.
All the following functions are getters and setters for regular expression options. The flag argument to a setter must be false to disable the option and true to enable it; unless otherwise specified, the int return value of a getter is true if the option is enabled and false if it is disabled.
By default, the regular expression pattern and input text are interpreted as UTF-8. CRE2_Latin1 encoding causes them to be interpreted as Latin-1.
The getter returns CRE2_UNKNOWN if the encoding value returned by RE2 is unknown.
Restrict regexps to POSIX egrep syntax. Default is disabled.
Search for longest match, not first match. Default is disabled.
Log syntax and execution errors to stderr. Default is enabled.
Interpret the pattern string as literal, not as regular expression. Default is disabled.
Setting this option is equivalent to quoting all the special characters defining a regular expression pattern:
    cre2_regexp_t *   rex;
    cre2_options_t *  opt;
    const char *      pattern = "(ciao) (hello)";
    const char *      text    = pattern;
    int               len     = strlen(pattern);

    opt = cre2_opt_new();
    cre2_opt_set_literal(opt, 1);
    rex = cre2_new(pattern, len, opt);
    {
      /* successful match */
      cre2_match(rex, text, len, 0, len, CRE2_UNANCHORED, NULL, 0);
    }
    cre2_delete(rex);
    cre2_opt_delete(opt);
Never match a newline character, even if it is in the regular expression pattern; default is disabled. Turning on this option allows us to attempt a partial match, against the beginning of a multiline text, without using subpatterns to exclude the newline in the regexp pattern.
The dot matches everything, including the newline character; default is disabled.
Parse all the parentheses as non-capturing; default is disabled.
Match is case-sensitive; the regular expression pattern can override this setting with (?i) unless configured in POSIX syntax mode. Default is enabled.
The max memory option controls how much memory can be used to hold the compiled form of the regular expression and its cached DFA graphs. These functions set and get that amount of memory. See the documentation of RE2 for details.
The following options are only consulted when POSIX syntax is enabled; when POSIX syntax is disabled, these features are always enabled and cannot be turned off.
Allow Perl’s \d, \s, \w, \D, \S, \W. Default is disabled.
Allow Perl’s \b, \B (word boundary and not). Default is disabled.
The patterns ^ and $ only match at the beginning and end of the text. Default is disabled.
This document describes version 0.4.0-devel.2 of CRE2.