

51.3.3 Quantifiers

The quantifiers *, +, and ? match, respectively, zero or more, one or more, and zero or one instances of the preceding subpattern.

(pregexp-match-positions "c[ad]*r" "cadaddadddr")
⇒ ((0 . 11))
(pregexp-match-positions "c[ad]*r" "cr")
⇒ ((0 . 2))

(pregexp-match-positions "c[ad]+r" "cadaddadddr")
⇒ ((0 . 11))
(pregexp-match-positions "c[ad]+r" "cr")
⇒ #f

(pregexp-match-positions "c[ad]?r" "cadaddadddr")
⇒ #f
(pregexp-match-positions "c[ad]?r" "cr")
⇒ ((0 . 2))
(pregexp-match-positions "c[ad]?r" "car")
⇒ ((0 . 3))

Numeric quantifiers

We can use braces to specify quantification more precisely than is possible with *, +, and ?.

The quantifier {m} matches exactly m instances of the preceding subpattern. m must be a nonnegative integer.

The quantifier {m,n} matches at least m and at most n instances. m and n are nonnegative integers with m <= n. We may omit either or both numbers, in which case m defaults to 0 and n to infinity.

It is evident that + and ? are abbreviations for {1,} and {0,1} respectively. * abbreviates {,}, which is the same as {0,}.

(pregexp-match "[aeiou]{3}" "vacuous")
⇒ ("uou")

(pregexp-match "[aeiou]{3}" "evolve")
⇒ #f

(pregexp-match "[aeiou]{2,3}" "evolve")
⇒ #f

(pregexp-match "[aeiou]{2,3}" "zeugma")
⇒ ("eu")
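
The abbreviations noted above can be checked against the first examples of this section. The following is a minimal sketch, assuming the pregexp procedures are already available as in the examples above; the results in the comments are the ones implied by the stated equivalences.

```scheme
;; {1,} behaves like +: at least one a or d is required.
(pregexp-match-positions "c[ad]{1,}r" "cadaddadddr")   ; ⇒ ((0 . 11))
(pregexp-match-positions "c[ad]{1,}r" "cr")            ; ⇒ #f

;; {0,1} behaves like ?: at most one a or d is allowed.
(pregexp-match-positions "c[ad]{0,1}r" "car")          ; ⇒ ((0 . 3))
(pregexp-match-positions "c[ad]{0,1}r" "cadaddadddr")  ; ⇒ #f

;; {0,} behaves like *: any number of a's and d's, including none.
(pregexp-match-positions "c[ad]{0,}r" "cr")            ; ⇒ ((0 . 2))
```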

Non-greedy quantifiers

The quantifiers described above are greedy, i.e., they match the maximal number of instances that would still lead to an overall match for the full pattern.

(pregexp-match "<.*>" "<tag1> <tag2> <tag3>")
⇒ ("<tag1> <tag2> <tag3>")
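
Greedy does not mean that the quantified subpattern swallows the rest of the string: it takes as much as it can while still letting the remainder of the pattern match. As an illustrative sketch (the subject string here is made up for the example), the trailing text after the last > is left out of the match so that the closing > in the pattern can succeed:

```scheme
;; Conceptually, .* gives back the trailing " end"
;; so that the final > in the pattern can match.
(pregexp-match "<.*>" "<tag1> <tag2> end")            ; ⇒ ("<tag1> <tag2>")
(pregexp-match-positions "<.*>" "<tag1> <tag2> end")  ; ⇒ ((0 . 13))
```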

To make these quantifiers non-greedy, append a ? to them. Non-greedy quantifiers match the minimal number of instances needed to ensure an overall match.

(pregexp-match "<.*?>" "<tag1> <tag2> <tag3>")
⇒ ("<tag1>")

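The non-greedy quantifiers are, respectively: *?, +?, ??, {m}?, {m,n}?. Note that the metacharacter ? therefore has two uses: on its own it is a quantifier meaning zero or one instance, and appended to another quantifier it makes that quantifier non-greedy.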
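
The other non-greedy forms can be sketched in the same way as the *? example. This assumes the same pregexp procedures and reuses strings from earlier in this section; the results in the comments are those implied by the minimal-match rule described above.

```scheme
;; +? requires at least one character, then stops at the first >.
(pregexp-match "<.+?>" "<tag1> <tag2> <tag3>")   ; ⇒ ("<tag1>")

;; Greedy {2,3} takes three vowels; non-greedy {2,3}? settles for two.
(pregexp-match "[aeiou]{2,3}" "vacuous")         ; ⇒ ("uou")
(pregexp-match "[aeiou]{2,3}?" "vacuous")        ; ⇒ ("uo")

;; ?? prefers zero instances, but still takes one
;; when the overall match needs it.
(pregexp-match "c[ad]??r" "car")                 ; ⇒ ("car")
```

Since {m} admits only one count, {m}? matches the same text as {m}.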

