Knuth-Morris-Pratt (KMP) is a linear time string matching algorithm. The algorithm avoids unnecessary comparison and computation of the transition function by using prefix (Π) function . Before starting the actual comparison between the characters of pattern and text, it computes the Π (pi) values for each character of the pattern using prefix function algorithm. The code is based on the algorithm described in CLRS book, but it checks the multiple occurrences of the pattern along with mismatch.
def KMP_Matcher(t,p):
n = len (t)
m = len (p)
pi = calculate_PI(p)
k = 0
for i in xrange(n):
while (k > 0 and p[k] != t[i]):
k = pi[k]
if (p[k] == t[i]):
k = k+1
if (k == m):
print "pattern",p + " found at index ", i-m+1
k = 0
else:
if (i == n-1):
print "Pattern", p + " not found"
pat = "monkey"
str = "five little monkey jumping on the bed. one monkey fell down and now there are just four monkey"
KMP_Matcher(str,pat)
pat = "donkey"
str = "five little monkey jumping on the bed. one monkey fell down and now there are just four monkey"
KMP_Matcher(str,pat)
The output of the code is as follows…
pattern monkey found at index 12
pattern monkey found at index 43
pattern monkey found at index 88
Pattern donkey not found