Next: , Previous: , Up: Top   [Index]


29 Knuth–Morris–Pratt searching

The Knuth–Morris–Pratt (KMP) sequence–search algorithm is a method of rapidly scanning a sequence of values for the occurrence of some fixed pattern. It has the advantage of never requiring backtracking; hence, it is useful for searching sequences that do not support backtracking or random–access, such as input ports.

For an explanation of the algorithm:

http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm

The following routines package up the initialisation and searching phases of the algorithm for general use in the library (vicare containers knuth-morris-pratt). They also support searching through sequences that arrive in buffered chunks, in that intermediate search state can be carried across applications of the search loop from the end of one buffer application to the next.

A second critical property of KMP search is that it requires the allocation of auxiliary memory proportional to the length of the pattern, but constant in the size of the value type. Alternate searching algorithms frequently require the construction of a table with an entry for every possible value, which can be prohibitively expensive, for example, in a 16-bit or 32-bit character representation.

The algorithm is especially suited to search patterns of characters in text, so many examples are given for strings. However the library supports searching any values in any sequence.


Next: , Previous: , Up: Top   [Index]