Data Structuresand Algorithms Good OldJava AdvancedInterview Topics Cloud andDatabases Web Designand Development Must-knowTools Good ToKnow

## Pattern Matching (Knuth Morris Pratt algorithm)

Problem Statement: Find all occurrences of pattern[] of length 'm' in txt[] of length 'n'.

If we do a strcmp at every index of txt, then it would be O(mn).
This can be reduced to O(m+n) by using KMP algorithm.

Solution: KMP algorithm makes use of the match done till now and does not begin searching all over again in case a mismatch is found while searching for a match.

Let lps be defined as Longest Prefix Suffix which indicates how many characters from the beginning match the ending characters of a string.

Example:

If pattern = [A,b,c,A,b,c,1,2], then
lps array is [0,0,0,1,2,3,0,0]

Now, suppose we begin matching pat in txt after constructing the above array.
We continue matching characters in pattern and txt till a mismatch occurs.
Suppose mismatch happens at txt[i] and pattern[j] characters.

At this point, we check the lps at index j.
If its non-zero, then it means that the 0 to lps[j] characters from pattern
are still matching in txt from txt[i-lps[j]] to txt[i] because by
definition, it means that pattern to pattern[lps[j]] characters are
matching from pat[j-lps[j]] to pattern[j]

 Java Code: ``` public class SearchAlgoKMP { public static void main (String args[]) { String pattern = "AbcAbc12"; String txt = "AbcAbcAbc12"; kmp(pattern, txt); } static void computeLPS(String pattern, int[] lps) { int currMaxLPS = 0; lps = 0; for (int i=1; i < pattern.length();) { if(pattern.charAt(i) == pattern.charAt(currMaxLPS)) { currMaxLPS++; lps[i] = currMaxLPS; i++; } else { if( currMaxLPS != 0 ) { // Note that lps[currMaxLPS-1] contains the largest // prefix of pattern that matches the suffix ending // at pattern[currMaxLPS-1] // So we try to match from that prefix (which has to be // smaller than current prefix) // Observe that i is not incrementing here. currMaxLPS = lps[currMaxLPS-1]; } else { lps[i] = 0; i++; } } } } /** * KMP is just a re-application of the LPS concept shown above. * Its algorithm is very similar to the above one. */ static void kmp(String pattern, String txt) { int lps[] = new int[pattern.length()]; computeLPS (pattern, lps); int txtPos=0, patternPos=0; while (txtPos < txt.length()) { if(pattern.charAt(patternPos) == txt.charAt(txtPos)) { patternPos++; txtPos++; } if (patternPos == pattern.length()) { System.out.println ("Pattern is found at index " + (txtPos-patternPos)); patternPos = lps[patternPos-1]; } else if(pattern.charAt(patternPos) != txt.charAt(txtPos)) { // mismatch occurs after j matches if(patternPos != 0) patternPos = lps[patternPos-1]; else txtPos++; } } } } ``` Sample execution: Example 1: ```A b c A b c A (String) 0 0 0 1 2 3 1 (LPS corresponding to it) ``` Example 2: ``` Pattern = AbcAbc12 Txt = AbcAbcAbc12 Creation of LPS i=1) currMaxLPS=0, lps=0, Ab i=2) currMaxLPS=0, lps=0, Abc i=3) currMaxLPS=1, lps=1, AbcA i=4) currMaxLPS=2, lps=2, AbcAb i=5) currMaxLPS=3, lps=3, AbcAbc i=6) currMaxLPS=0, lps=0, AbcAbc1 i=6) currMaxLPS=0, lps=0, AbcAbc1 i=7) currMaxLPS=0, lps=0, AbcAbc12 Final lps = [0,0,0,1,2,3,0,0] Searching using lps array txtPos=0, patternPos=0, doneTxt=A, patternMatched=A txtPos=1, patternPos=1, doneTxt=Ab, patternMatched=Ab txtPos=2, patternPos=2, doneTxt=Abc, patternMatched=Abc txtPos=3, patternPos=3, doneTxt=AbcA, patternMatched=AbcA txtPos=4, patternPos=4, doneTxt=AbcAb, patternMatched=AbcAb txtPos=5, patternPos=5, doneTxt=AbcAbc, patternMatched=AbcAbc txtPos=6, patternPos=3, doneTxt=AbcAbcA, patternMatched=AbcA txtPos=7, patternPos=4, doneTxt=AbcAbcAb, patternMatched=AbcAb txtPos=8, patternPos=5, doneTxt=AbcAbcAbc, patternMatched=AbcAbc txtPos=9, patternPos=6, doneTxt=AbcAbcAbc1, patternMatched=AbcAbc1 txtPos=10, patternPos=7, doneTxt=AbcAbcAbc12, patternMatched=AbcAbc12 Pattern is found at index 3 ``` Like us on Facebook to remain in touch
with the latest in technology and tutorials!

Got a thought to share or found a
bug in the code?
We'd love to hear from you:

 Name: Email: (Your email is not shared with anybody) Comment: