Libraries for Vicare Scheme: srfi regexps rationale

2.39.3 Rationale

Regular expressions, which come from a long history of formal language theory, are today the lingua franca of simple string matching. A regular expression is an expression describing a regular language, the simplest level in the Chomsky hierarchy. Regular expressions have the useful property that they can be matched in time linear in the length of the input, whereas general parsers for the next level in the hierarchy, the context-free languages, require cubic time. This, combined with their conciseness, has made them a popular choice for searching in editors, tools and search interfaces. Other tools may be better suited to specific purposes, but it is assumed that any modern language will provide regular expression support.

SREs were first introduced in SCSH as an alternative, based on S–expressions, to the more common string-based notation. This format offers many advantages: it is easier to read and write (notably with structured editors), easier to compose (with no escaping issues), and faster and simpler to compile. An efficient reference implementation of this SRFI can be written in under 1000 lines of code, whereas in IrRegex the full PCRE parser alone requires over 500 lines.
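The composability and escaping points can be illustrated with a minimal sketch. It assumes that the library is importable as `(srfi :115)` and that the bindings `regexp` and `regexp-matches?` behave as specified by the SRFI; the names `octet` and `dotted-quad` are hypothetical and used only for illustration.

```scheme
;; Assumption: SRFI 115 is available as (srfi :115); adjust the import
;; to match your installation.
(import (rnrs) (srfi :115))

;; An SRE is ordinary Scheme data, so a sub-pattern can be bound to a
;; name and spliced into a larger pattern with quasiquote.
(define octet '(** 1 3 digit))          ; one to three digits

;; The "." below is a literal string, so it needs no backslash escaping.
(define dotted-quad
  (regexp `(: ,octet "." ,octet "." ,octet "." ,octet)))

;; A string-based pattern for roughly the same language would read:
;;   "[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}"

(display (regexp-matches? dotted-quad "192.168.0.1"))
(newline)
(display (regexp-matches? dotted-quad "192.168.0"))
(newline)
```

Because `regexp-matches?` must match the whole string, the first call should succeed and the second should fail. In the string-based form the repeated octet has to be copied out by hand, and each dot needs a backslash that must itself be escaped inside a Scheme string literal.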