The quantifiers *, +, and ? match, respectively,
zero or more, one or more, and zero or one instances of the preceding
subpattern.
(pregexp-match-positions "c[ad]*r" "cadaddadddr")
⇒ ((0 . 11))
(pregexp-match-positions "c[ad]*r" "cr")
⇒ ((0 . 2))
(pregexp-match-positions "c[ad]+r" "cadaddadddr")
⇒ ((0 . 11))
(pregexp-match-positions "c[ad]+r" "cr")
⇒ #f
(pregexp-match-positions "c[ad]?r" "cadaddadddr")
⇒ #f
(pregexp-match-positions "c[ad]?r" "cr")
⇒ ((0 . 2))
(pregexp-match-positions "c[ad]?r" "car")
⇒ ((0 . 3))
We can use braces to specify much finer-tuned quantification than is
possible with *, +, and ?.
The quantifier {m} matches exactly m instances of the
preceding subpattern. m must be a nonnegative integer.
The quantifier {m,n} matches at least m and at most
n instances. m and n are nonnegative integers with
m <= n. We may omit either or both numbers, in which case
m defaults to 0 and n to infinity.
It is evident that + and ? are abbreviations for
{1,} and {0,1} respectively. * abbreviates
{,}, which is the same as {0,}.
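As a quick check of these equivalences, the following sketch compares each abbreviation with its brace form on the same input. It assumes the pregexp procedures are already loaded, as in the other examples here; same-match? is a hypothetical helper introduced only for this illustration and is not part of pregexp.

```scheme
;; Assumption: pregexp-match-positions is already available in the session.
;; same-match? is a hypothetical helper, not a pregexp procedure.
(define (same-match? pat1 pat2 str)
  ;; True when both patterns report the same match positions
  ;; (or both fail with #f) on STR.
  (equal? (pregexp-match-positions pat1 str)
          (pregexp-match-positions pat2 str)))

(same-match? "c[ad]+r" "c[ad]{1,}r"  "cadaddadddr")  ; ⇒ #t
(same-match? "c[ad]+r" "c[ad]{1,}r"  "cr")           ; ⇒ #t (both fail)
(same-match? "c[ad]?r" "c[ad]{0,1}r" "car")          ; ⇒ #t
(same-match? "c[ad]*r" "c[ad]{0,}r"  "cr")           ; ⇒ #t
```

Agreement on a handful of inputs does not prove the equivalence, but it shows the brace forms behaving like their abbreviations on both matching and failing cases.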
(pregexp-match "[aeiou]{3}" "vacuous")
⇒ ("uou")
(pregexp-match "[aeiou]{3}" "evolve")
⇒ #f
(pregexp-match "[aeiou]{2,3}" "evolve")
⇒ #f
(pregexp-match "[aeiou]{2,3}" "zeugma")
⇒ ("eu")
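The examples above give both bounds explicitly or use an exact count. As a further sketch, again assuming the pregexp procedures are already loaded, omitting n leaves the upper bound open. The input word is chosen only for illustration.

```scheme
;; "queueing" contains a single run of five vowels: u e u e i.
(pregexp-match "[aeiou]{2,}" "queueing")   ; ⇒ ("ueuei")  at least 2, no upper bound
(pregexp-match "[aeiou]{5,}" "queueing")   ; ⇒ ("ueuei")  the run is just long enough
(pregexp-match "[aeiou]{6,}" "queueing")   ; ⇒ #f         no run of 6 vowels exists
(pregexp-match "[aeiou]{2,4}" "queueing")  ; ⇒ ("ueue")   capped at 4 instances
```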
The quantifiers described above are greedy, i.e., they match the maximal number of instances that would still lead to an overall match for the full pattern.
(pregexp-match "<.*>" "<tag1> <tag2> <tag3>")
⇒ ("<tag1> <tag2> <tag3>")
To make these quantifiers non-greedy, append a ? to them.
Non-greedy quantifiers match the minimal number of instances needed to
ensure an overall match.
(pregexp-match "<.*?>" "<tag1> <tag2> <tag3>")
⇒ ("<tag1>")
The non-greedy quantifiers are respectively: *?, +?,
??, {m}?, {m,n}?. Note the two uses of the
metacharacter ?: as a quantifier in its own right, and as the suffix
that makes another quantifier non-greedy.
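To illustrate the remaining non-greedy forms, here is a short sketch that reuses inputs from the earlier examples. It assumes the pregexp procedures are already loaded; the expected results follow from the minimal-match rule described above.

```scheme
;; Greedy {2,3} takes three vowels; non-greedy {2,3}? stops at two.
(pregexp-match "[aeiou]{2,3}" "vacuous")    ; ⇒ ("uou")
(pregexp-match "[aeiou]{2,3}?" "vacuous")   ; ⇒ ("uo")

;; With an exact count there is nothing to minimize,
;; so {3}? matches the same text as {3}.
(pregexp-match "[aeiou]{3}?" "vacuous")     ; ⇒ ("uou")

;; +? needs at least one character between the angle brackets.
(pregexp-match "<.+?>" "<tag1> <tag2> <tag3>")  ; ⇒ ("<tag1>")

;; ?? tries zero instances first, but takes one when the
;; overall match requires it.
(pregexp-match "c[ad]??r" "car")            ; ⇒ ("car")
```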