In this tutorial, we will learn about Searching a pattern In a String Using the famous KMP algorithm.
Let's take an example, we take a string S="ABABBAB" and we have to find the Pattern P="ABBA" in this string
So, how to proceed this:-
1. Naive Approach - Try comparing each letter with the main string if you find a match move the index forward and compare,
if you find the pattern print the index - length of the pattern.
2. KMP Search Algorithm - (We will discuss more here) Optimized way to proceed.
3. Robin-Karp Algorithm - Slides the pattern to match with the main string.
KMP Algorithm or also known as Knuth-Morris-Pratt Algorithm is an optimized way to search pattern in a string by using the concept of
Longest Prefix Suffix. The Naive approach takes extra time to complete this process as it counts the already compared characters of the string again
while in KMP we work on this rule that all the compared characters will not form the required pattern and we solve it in O(n).
So here comes the main and most tricky part of explaining the working of the algorithm. So let me give you a line by line explanation of this Algorithm:-
1.Let us see what is an LPS(Longest Prefix Suffix) of an array.
Ex. Let's take the string S="ABCDABC"
Prefix can be (A,AB,ABC,ABCD...) Suffix will be (C,BC,ABC,DABC..)
So we can clearly see ABC is the Longest Prefix and a Suffix in the String.
2.So basically we have to create an lps array that will help us find the index of the character we have to match.
3 Let's take another example here and see how KMP Algorithm Works.
Before that let us understand how the Pi table works. In this algorithm, we find the Pi table of the pattern we have to match
Let's take a string p="ababd" and the start of the string let's take that index as 1, So we represent the occurrence of the character which is repeating so the pi table will give us [0,0,1,2,0].
After we generate this we are good to go for the Main Algorithm which follows these steps:-
a)Take 2 variables i and j.
b)Where i points to t and i points to j points to p.
c)Compare t[i] with p[j+1].
d)If a match is found then i++ and j++.
e) If Mismatch then Move the location of J as per pi table (Let's say j==2 then move to the index 2 of the string and see the value)
f) If j==0 Move i++ and start comparing again.
g)If the match is complete print out the index.