Next: kmp partial, Previous: kmp vector, Up: kmp [Index]
This function encapsulates the work performed by one step of the KMP search. Return the new index in the pattern; that is, how much of the pattern we have matched, including the given value from text.
Searching for the pattern "hello"
in the text "ciao hello
salut"
looks like this:
(let* ((text "ciao hello salut") ;; 01234567890123456 ;; 0 1 (text-past (string-length text)) (pattern "ciao") (pattern-start 0) (pattern-past (string-length pattern)) (rv (%kmp-make-restart-vector char=? string-ref pattern pattern-start pattern-past))) (let loop ((ti 0) (pi pattern-start)) (or (and (= pi pattern-past) ti) ; found (and (not (= ti text-past)) ; not found (loop (+ 1 ti) (%kmp-step char=? string-ref rv (string-ref text ti) pi pattern pattern-start)))))) ⇒ 10
if the pattern was not found the return value is #f
; if the
pattern was found the return value is the index in the text of the
character past the last matched one.
Abstracting the search from a string to a generic source of characters, we can write the following function, whose return value is the same as the loop above:
(define (return-match-past end-of-text? get-next-char pattern pattern-start pattern-past) (let ((rv (%kmp-make-restart-vector char=? string-ref pattern pattern-start pattern-past))) (let loop ((ti 0) (pi pattern-start)) (or (and (= pi pattern-past) ti) ; found (and (not (end-of-text?)) ; not found (loop (+ 1 ti) (%kmp-step char=? string-ref rv (get-next-char) pi pattern pattern-start)))))))
a usage example with a string looks like this:
(let* ((text "ciao hello salut") ;; 01234567890123456 ;; 0 1 (pattern "hello") (ti 0) (end-of-text? (lambda () (= ti (string-length text)))) (get-next-char (lambda () (begin0 (string-ref text ti) (set! ti (+ 1 ti)))))) (return-match-past end-of-text? get-next-char pattern 0 (string-length pattern))) ⇒ 10
a usage example with an input port looks like this:
(let* ((text "ciao hello salut") ;; 01234567890123456 ;; 0 1 (pattern "salut") (port (open-string-input-port text)) (end-of-text? (lambda () (eof-object? (peek-char port)))) (get-next-char (lambda () (read-char port)))) (return-match-past end-of-text? get-next-char pattern 0 (string-length pattern))) ⇒ 16
Next: kmp partial, Previous: kmp vector, Up: kmp [Index]