Next: pregexp syntax clusters, Previous: pregexp syntax chars, Up: pregexp syntax [Index]
The quantifiers *, +, and ? match, respectively, zero or more, one or
more, and zero or one instances of the preceding subpattern.
(pregexp-match-positions "c[ad]*r" "cadaddadddr") ⇒ ((0 . 11))
(pregexp-match-positions "c[ad]*r" "cr") ⇒ ((0 . 2))
(pregexp-match-positions "c[ad]+r" "cadaddadddr") ⇒ ((0 . 11))
(pregexp-match-positions "c[ad]+r" "cr") ⇒ #f
(pregexp-match-positions "c[ad]?r" "cadaddadddr") ⇒ #f
(pregexp-match-positions "c[ad]?r" "cr") ⇒ ((0 . 2))
(pregexp-match-positions "c[ad]?r" "car") ⇒ ((0 . 3))
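For readers who want to cross-check these results outside Scheme, the following is a minimal sketch in Python. It assumes that Python's standard re module interprets *, +, and ? the same way pregexp does for these inputs; it is an analogue, not pregexp itself. The helper name span is hypothetical and exists only for this illustration.

```python
import re

def span(pattern, text):
    """Return (start, end) of the first match, or None when nothing matches.

    Hypothetical helper: a rough stand-in for pregexp-match-positions,
    restricted to the overall match.
    """
    m = re.search(pattern, text)
    return m.span() if m else None

# * allows zero or more of [ad] between c and r
print(span(r"c[ad]*r", "cadaddadddr"))  # (0, 11)
print(span(r"c[ad]*r", "cr"))           # (0, 2)

# + requires at least one of [ad]
print(span(r"c[ad]+r", "cadaddadddr"))  # (0, 11)
print(span(r"c[ad]+r", "cr"))           # None

# ? allows at most one of [ad]
print(span(r"c[ad]?r", "cadaddadddr"))  # None
print(span(r"c[ad]?r", "cr"))           # (0, 2)
print(span(r"c[ad]?r", "car"))          # (0, 3)
```

Python reports a failed match as None where pregexp returns #f.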
We can use braces to specify much finer-tuned quantification than is
possible with *, +, and ?.
The quantifier {m} matches exactly m instances of the preceding
subpattern. m must be a nonnegative integer.
The quantifier {m,n} matches at least m and at most n instances. m and
n are nonnegative integers with m <= n. We may omit either or both
numbers, in which case m defaults to 0 and n to infinity.
It is evident that + and ? are abbreviations for {1,} and {0,1}
respectively. * abbreviates {,}, which is the same as {0,}.
(pregexp-match "[aeiou]{3}" "vacuous") ⇒ ("uou")
(pregexp-match "[aeiou]{3}" "evolve") ⇒ #f
(pregexp-match "[aeiou]{2,3}" "evolve") ⇒ #f
(pregexp-match "[aeiou]{2,3}" "zeugma") ⇒ ("eu")
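The claim that +, ?, and * abbreviate brace forms can be checked mechanically. The sketch below does so in Python, assuming that the standard re module gives {1,}, {0,1}, and {0,} the same meaning as pregexp does; the fully empty form {,} is left out because its handling is not assumed here. The helper name first is hypothetical.

```python
import re

def first(pattern, text):
    """Return the text of the first match, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# Each short quantifier paired with its brace spelling.
pairs = [
    (r"c[ad]+r", r"c[ad]{1,}r"),   # one or more
    (r"c[ad]?r", r"c[ad]{0,1}r"),  # zero or one
    (r"c[ad]*r", r"c[ad]{0,}r"),   # zero or more
]
samples = ["cr", "car", "cadaddadddr"]

# Both spellings should agree on every sample string.
for short, braced in pairs:
    for s in samples:
        assert first(short, s) == first(braced, s)

# The brace examples from the text.
print(first(r"[aeiou]{3}", "vacuous"))    # uou
print(first(r"[aeiou]{3}", "evolve"))     # None
print(first(r"[aeiou]{2,3}", "evolve"))   # None
print(first(r"[aeiou]{2,3}", "zeugma"))   # eu
```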
The quantifiers described above are greedy, i.e., they match the maximal number of instances that would still lead to an overall match for the full pattern.
(pregexp-match "<.*>" "<tag1> <tag2> <tag3>") ⇒ ("<tag1> <tag2> <tag3>")
To make these quantifiers non-greedy, append a ? to them. Non-greedy
quantifiers match the minimal number of instances needed to ensure an
overall match.
(pregexp-match "<.*?>" "<tag1> <tag2> <tag3>") ⇒ ("<tag1>")
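The practical difference shows up when collecting every match in a string. The following sketch uses Python's standard re module as a stand-in, on the assumption that its greedy and non-greedy quantifiers behave like pregexp's for this input; re.findall is Python's API, not part of pregexp.

```python
import re

text = "<tag1> <tag2> <tag3>"

# Greedy: .* runs to the last '>' in the string, so there is one long match.
greedy = re.findall(r"<.*>", text)

# Non-greedy: .*? stops at the first '>' it can, so each tag matches separately.
lazy = re.findall(r"<.*?>", text)

print(greedy)  # ['<tag1> <tag2> <tag3>']
print(lazy)    # ['<tag1>', '<tag2>', '<tag3>']
```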
The non-greedy quantifiers are respectively: *?, +?, ??, {m}?, and
{m,n}?. Note the two uses of the metacharacter ?: as a quantifier in
its own right, and as the suffix that makes a quantifier non-greedy.
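To see each non-greedy form side by side with its greedy counterpart, here is a small sketch in Python. It assumes that the standard re module's +?, ??, {m}?, and {m,n}? behave like pregexp's on these inputs; the helper name first is hypothetical.

```python
import re

def first(pattern, text):
    """Return the text of the first match, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# +? takes as few b's as possible (but at least one).
print(first(r"ab+", "abbb"))    # abbb
print(first(r"ab+?", "abbb"))   # ab

# ?? prefers zero instances: the first ? is the quantifier,
# the second ? makes it non-greedy.
print(first(r"ab?", "abb"))     # ab
print(first(r"ab??", "abb"))    # a

# {m,n}? stops at the minimum count m when the rest of the pattern allows it.
print(first(r"[aeiou]{2,3}", "vacuous"))    # uou
print(first(r"[aeiou]{2,3}?", "vacuous"))   # uo

# {m}? has only one possible count, so it matches the same text as {m}.
print(first(r"[aeiou]{3}?", "vacuous"))     # uou
```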