CONTROL + F or COMMAND + F
How often do you use above keyboard shortcut? In fact, for most of us, searching a string or substring in a pile of strings/document is involuntarily action with the above key combination. This post will deal with the subject of the substring search. We will quickly define the problem statement and move to our algorithm explanation. We will finally move to the implementation.
We will assume text (where we have to search) is an array of n characters. Let’s denote the text by T[1..n]. String to be searched is called pattern and is represented by an array P[1..m] where m ≤ n.
We need to find indexes in given text T where the given pattern P occurs.
Following picture will explain it better:
Brute Force Approach
The Brute Force algorithm compares the pattern to the text taking one character at a time.
For example, let’s suppose we need to find “abba” in “abcabaabcabac”.
We will start with first “a” in “abcabaabcabac”.
Pattern has “a”; Text has “a”. Now try to match second character of pattern to second character of the text.
Pattern has “b”; Text has “b”. Very well, now let us try to match third character of pattern with third character of the text.
Pattern has “b”; Text has “c”. Ah, failed.
Let’s try to see the same thing with graphics.
Now, We will start with second character “b” in “abcabaabcabac”.
Pattern has “a”; Text has “b”. This is a no go at the very first character of the.
Now, We will start with third character “c” in “abcabaabcabac”.
Pattern has “a”; Text has “c”. This is a no go at the very first character of the pattern.
Similarly, we will go on matching until our characters in text are exhausted or we find a match.
The output for this is as follow:
========String matcher Brute Force Approach=========== Pattern found at 3 Pattern found at 6 Time taken for matching 243434
In worst case, for any text of length n and pattern m, we are going to do m comparisons for each character of text, that will mean O(n*m)
We will be discussing a faster Rabin-Karp Algorithm for substring search in next post.
References and useful link: