Coders Packet

Pattern Searching in a String using KMP Algorithm || C++

By srijan bharadwaj

In this tutorial, we will learn about Searching a pattern In a String Using the famous KMP algorithm.

What do we mean by Pattern Searching?

Let's take an example, we take a string S="ABABBAB" and we have to find the Pattern P="ABBA" in this string 

So, how to proceed this:-

1. Naive Approach - Try comparing each letter with the main string if you find a match move the index forward and compare,

if you find the pattern print the index - length of the pattern. 

2. KMP Search Algorithm -  (We will discuss more here) Optimized way to proceed.

3. Robin-Karp Algorithm - Slides the pattern to match with the main string.

What is KMP Algorithm?

KMP Algorithm or also known as Knuth-Morris-Pratt Algorithm is an optimized way to search pattern in a string by using the concept of 

Longest Prefix Suffix. The Naive approach takes extra time to complete this process as it counts the already compared characters of the string again

while in KMP we work on this rule that all the compared characters will not form the required pattern and we solve it in O(n).

How does this Algorithm work?

So here comes the main and most tricky part of explaining the working of the algorithm. So let me give you a line by line explanation of this Algorithm:-

1.Let us see what is an LPS(Longest Prefix Suffix) of an array.

Ex. Let's take the string S="ABCDABC"

Prefix can be (A,AB,ABC,ABCD...) Suffix will be (C,BC,ABC,DABC..)

So we can clearly see ABC is the Longest Prefix and a Suffix in the String.

2.So basically we have to create an lps array that will help us find the index of the character we have to match.

3 Let's take another example here and see how KMP Algorithm Works.

Before that let us understand how the Pi table works. In this algorithm, we find the Pi table of the pattern we have to match

Let's take a string p="ababd" and the start of the string let's take that index as 1, So we represent the occurrence of the character which is repeating so the pi table will give us [0,0,1,2,0]. 

After we generate this we are good to go for the Main Algorithm which follows these steps:-

a)Take 2 variables i and j.

b)Where i points to t[1] and i points to j points to p[0].

c)Compare t[i] with p[j+1].

d)If a match is found then i++ and j++.

e) If Mismatch then Move the location of J as per pi table (Let's say j==2 then move to the index 2 of the string and see the value)

f) If j==0 Move i++ and start comparing again.

g)If the match is complete print out the index.

 

Download project

Reviews Report

Submitted by srijan bharadwaj (srijan77)

Download packets of source code on Coders Packet