50.9.4 Character sets

Perhaps more common than matching specific strings is matching any of a set of characters. We could simulate a character set by applying the or alternation pattern to a list of single-character strings, but this is too clumsy for everyday use, so SRE syntax provides a number of shortcuts.

A single character matches that character literally, forming a trivial character class. More conveniently, a list whose single element is a string denotes the character set composed of every character in that string.

(irregex-match '(* #\-) "---")
⇒ #<match>

(irregex-match '(* #\-) "-_-")
⇒ #f

(irregex-match '(* ("aeiou")) "oui")
⇒ #<match>

(irregex-match '(* ("aeiou")) "ouais")
⇒ #f
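
As a small sketch of the shortcut described above, the following compares the alternation form with the one-string list form. It assumes the irregex bindings are already available, as in the examples above; the import form depends on the Scheme implementation and is not shown.

```scheme
;; Assumption: the irregex bindings are already imported; the import
;; form is implementation-specific and deliberately omitted here.

;; Alternation over single-character strings: works, but verbose.
(irregex-match '(* (or "a" "e" "i" "o" "u")) "oui")    ; ⇒ #<match>
(irregex-match '(* (or "a" "e" "i" "o" "u")) "ouais")  ; ⇒ #f

;; The same character set written as a list holding one string.
(irregex-match '(* ("aeiou")) "oui")                   ; ⇒ #<match>
(irregex-match '(* ("aeiou")) "ouais")                 ; ⇒ #f
```

Both patterns accept and reject the same strings here; the second is simply shorter to write and read.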

Ranges are introduced with the / operator. The strings and characters given to / are flattened into a single sequence of characters, which is then taken in pairs; each pair gives the inclusive start and end points of a character range.

(irregex-match '(* (/ "AZ09")) "R2D2")
⇒ #<match>

(irregex-match '(* (/ "AZ09")) "C-3PO")
⇒ #f
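
To illustrate the flattening rule, the sketch below writes the same two ranges in three ways: as one string, as two strings, and as individual characters. It assumes the irregex bindings are already available and that the three spellings are treated identically, which is what the description above implies.

```scheme
;; Assumption: the irregex bindings are already imported; the import
;; form is implementation-specific and deliberately omitted here.

;; One string: "AZ09" is read as the pairs A-Z and 0-9.
(irregex-match '(* (/ "AZ09")) "R2D2")             ; ⇒ #<match>

;; Two strings: flattened to the same four characters.
(irregex-match '(* (/ "AZ" "09")) "R2D2")          ; ⇒ #<match>

;; Individual characters: again the pairs A-Z and 0-9.
(irregex-match '(* (/ #\A #\Z #\0 #\9)) "R2D2")    ; ⇒ #<match>

;; The hyphen lies outside both ranges, so the whole-string match fails.
(irregex-match '(* (/ #\A #\Z #\0 #\9)) "C-3PO")   ; ⇒ #f
```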

In addition, a number of set algebra operations are provided. or, of course, keeps its usual meaning, but when all of its options are character sets it can be thought of as the set union operator. This is further extended by the & set intersection, - set difference, and ~ set complement operators.

(irregex-match '(* (& (/ "az") (~ ("aeiou")))) "xyzzy")
⇒ #<match>

(irregex-match '(* (& (/ "az") (~ ("aeiou")))) "vowels")
⇒ #f

(irregex-match '(* (- (/ "az") ("aeiou"))) "xyzzy")
⇒ #<match>

(irregex-match '(* (- (/ "az") ("aeiou"))) "vowels")
⇒ #f
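
The examples above cover intersection and difference; the sketch below adds union through or and a bare complement. It assumes the irregex bindings are already available and that matching is case-sensitive, which is the behaviour when no case-folding operator is used.

```scheme
;; Assumption: the irregex bindings are already imported; the import
;; form is implementation-specific and deliberately omitted here.

;; Union: lowercase letters or digits.
(irregex-match '(* (or (/ "az") (/ "09"))) "r2d2")   ; ⇒ #<match>

;; Uppercase letters are in neither set, so the match fails.
(irregex-match '(* (or (/ "az") (/ "09"))) "R2D2")   ; ⇒ #f

;; Complement on its own: any character that is not a lowercase vowel.
(irregex-match '(* (~ ("aeiou"))) "xyz-")            ; ⇒ #<match>
(irregex-match '(* (~ ("aeiou"))) "xyz-a")           ; ⇒ #f
```

Unlike the intersection with (/ "az") shown earlier, the bare complement also accepts punctuation such as the hyphen, because it is taken relative to all characters rather than to the lowercase letters.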