Basic pattern matching goes as follows (with error checking omitted):
    cre2_regexp_t *   rex;
    cre2_options_t *  opt;
    const char *      pattern = "(ciao) (hello)";

    opt = cre2_opt_new();
    cre2_opt_set_posix_syntax(opt, 1);
    rex = cre2_new(pattern, strlen(pattern), opt);
    {
      const char *   text     = "ciao hello";
      int            text_len = strlen(text);
      int            nmatch   = 3;
      cre2_string_t  match[nmatch];

      cre2_match(rex, text, text_len, 0, text_len, CRE2_UNANCHORED,
                 match, nmatch);

      /* prints: full match: ciao hello */
      printf("full match: ");
      fwrite(match[0].data, match[0].length, 1, stdout);
      printf("\n");

      /* prints: first group: ciao */
      printf("first group: ");
      fwrite(match[1].data, match[1].length, 1, stdout);
      printf("\n");

      /* prints: second group: hello */
      printf("second group: ");
      fwrite(match[2].data, match[2].length, 1, stdout);
      printf("\n");
    }
    cre2_delete(rex);
    cre2_opt_delete(opt);
Enumeration type for the anchor point of matching operations. It contains the following constants:
CRE2_UNANCHORED
CRE2_ANCHOR_START
CRE2_ANCHOR_BOTH
Match a substring of the text referenced by text and holding text_len bytes against the regular expression object rex. Return true if the text matched, false otherwise.
The zero-based indices start_pos (inclusive) and end_pos (exclusive) select the substring of text to be examined. anchor selects the anchor point for the matching operation.
Data about the matching groups is stored in the array match, which must have at least nmatch entries; the referenced substrings are portions of the text buffer. If we only need to know whether the text matches (ignoring the matching portions of text), we can pass NULL as the match argument and 0 as the nmatch argument.
The first element of match (index 0) references the full portion of the substring of text matching the pattern; the second element of match (index 1) references the portion of text matching the first parenthetical subexpression, the third element of match (index 2) references the portion of text matching the second parenthetical subexpression; and so on.
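Because the matched substrings are portions of the text buffer, they are not terminated by a zero byte and cannot be handed directly to functions expecting a C string. The following is a minimal sketch of copying one of them into a terminated buffer. The type substring_t is a stand-in that mirrors only the data and length fields used in the example at the top of this section, and substring_copy() is a hypothetical helper, not part of CRE2; real code would include the library's header and use cre2_string_t instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for cre2_string_t (assumption: only the two fields used in
   the basic example are mirrored here). */
typedef struct {
  const char * data;    /* points into the text buffer, not terminated */
  int          length;  /* number of bytes in the substring */
} substring_t;

/* Hypothetical helper: copy SUB into BUF as a zero-terminated string.
   The copy is truncated if BUF is too small.  Return the number of
   bytes copied, excluding the terminator. */
static size_t
substring_copy (substring_t sub, char * buf, size_t buf_size)
{
  size_t  n;

  assert(buf != NULL && buf_size > 0);
  n = (sub.length > 0)? (size_t)sub.length : 0;
  if (n >= buf_size)
    n = buf_size - 1;
  if (n > 0)
    memcpy(buf, sub.data, n);
  buf[n] = '\0';
  return n;
}
```

With the text "ciao hello", a substring whose data points 5 bytes into the text and whose length is 5 is copied as "hello".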
Like cre2_match(), but the pattern is specified as the string pattern holding pattern_len bytes. The whole text is examined and the match is not anchored.
If the text matches the pattern: the return value is 1. If the text does not match the pattern: the return value is 0. If the pattern is invalid: the return value is 2.
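Since the value 2 is also true in a C conditional, testing the return value with a plain "if" would treat an invalid pattern as a match. The following sketch shows one way to tell the three outcomes apart. The enumeration names and both helper functions are illustrative only and are not defined by CRE2; only the numeric values come from the description above.

```c
/* Illustrative names for the three documented return values
   (assumption: CRE2 itself does not define these symbols). */
enum {
  EASY_NO_MATCH    = 0,
  EASY_MATCH       = 1,
  EASY_BAD_PATTERN = 2
};

/* Hypothetical helper: true only when the text matched. */
static int
easy_match_succeeded (int rv)
{
  return (rv == EASY_MATCH);
}

/* Hypothetical helper: describe a return value for logging. */
static const char *
easy_match_describe (int rv)
{
  switch (rv) {
  case EASY_MATCH:        return "match";
  case EASY_NO_MATCH:     return "no match";
  case EASY_BAD_PATTERN:  return "invalid pattern";
  default:                return "unexpected return value";
  }
}
```

Comparing explicitly against 1 keeps the invalid-pattern case from being mistaken for success.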
Structure type used to represent a substring of the text to be matched as starting and ending indices. It has the following fields:
long start
Inclusive start byte index.
long past
Exclusive end byte index.
Given an array of strings with nmatch elements being the result of matching text against a regular expression: fill the array of ranges with the index intervals in the text buffer representing the same results.
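The conversion described above amounts to pointer arithmetic relative to the start of the text buffer. The following is a sketch of that arithmetic, not the library's implementation. The types substring_t and range_t are stand-ins mirroring the fields described in this section, and substrings_to_ranges() is a hypothetical name. Storing -1 in both fields for a substring whose data pointer is NULL is a choice made by this sketch; the behaviour of the library in that case is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the library types (assumption: only the fields
   described in this section are mirrored). */
typedef struct { const char * data; int length; } substring_t;
typedef struct { long start; long past; } range_t;

/* Hypothetical sketch: fill RANGES with the byte intervals of TEXT
   covered by the first NMATCH entries of STRINGS. */
static void
substrings_to_ranges (const char * text, range_t * ranges,
                      const substring_t * strings, int nmatch)
{
  int  i;

  assert(text != NULL && ranges != NULL && strings != NULL);
  for (i = 0; i < nmatch; ++i) {
    if (strings[i].data == NULL) {
      /* Choice of this sketch: mark a missing substring with -1. */
      ranges[i].start = -1;
      ranges[i].past  = -1;
    } else {
      /* Inclusive start index, then exclusive end index. */
      ranges[i].start = (long)(strings[i].data - text);
      ranges[i].past  = ranges[i].start + strings[i].length;
    }
  }
}
```

For the text "ciao hello" and the three substrings of the basic example, the resulting intervals are 0 to 10 for the full match, 0 to 4 for the first group and 5 to 10 for the second group.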
This document describes version 0.4.0-devel.2 of CRE2.