A beginner guide to Brute Force Algorithm for substring search

Introduction

CONTROL + F or COMMAND + F

How often do you use above keyboard shortcut? In fact, for most of us, searching a string or substring in a pile of strings/document is involuntarily action with the above key combination. This post will deal with the subject of the substring search. We will quickly define the problem statement and move to our algorithm explanation. We will finally move to the implementation.

Problem Statement

We will assume text (where we have to search) is an array of  n characters. Let’s denote the text by T[1..n]. String to be searched is called pattern and is represented by an array P[1..m] where m ≤ n.

We need to find indexes in given text T where the given pattern P occurs.

Following picture will explain it better:

Screen Shot 2019-02-09 at 4.57.51 PM
Figure 1: Here, the blue coloured characters in the text matched with the pattern. The program should return 3, i.e. the index of in text.

 

Brute Force Approach

Introduction

The Brute Force algorithm compares the pattern to the text  taking one character at a time.

For example, let’s suppose we need to find  “abba” in “abcabaabcabac”.

We will start with first “a” in “abcabaabcabac”.

Pattern has “a”; Text has “a”. Now try to match second character of pattern to second character of the text.

Pattern has “b”; Text has “b”. Very well, now let us try to match third character of pattern with third character of the text.

Pattern has “b”; Text has “c”. Ah, failed.

Let’s try to see the same thing with graphics.

brute.png

brute (1).png

brute (2).png

Now, We will start with second character “b” in “abcabaabcabac”.

Pattern has “a”; Text has “b”. This is a no go at the very first character of the.

brute (3).png

Now, We will start with third character “c” in “abcabaabcabac”.

Pattern has “a”; Text has “c”. This is a no go at the very first character of the pattern.

brute (4).png

Similarly, we will go on matching until our characters in text are exhausted or we find a match.

Implementation

import java.util.ArrayList;
import java.util.List;
public class StringMatchBruteForce {
public static void main (String args []){
System.out.println("========String matcher Brute Force Approach===========");
String text = "abcabbabbaabac";
String pattern = "abba";
long initialTime = System.nanoTime();
List<Integer> matchedIndexes = bruteForceStringMatcher(text, pattern);
long finalTime = System.nanoTime();
for(Integer matchedIndex : matchedIndexes){
System.out.println("Pattern found at "+matchedIndex);
}
if(matchedIndexes.size() == 0)
System.out.println("Pattern not found");
System.out.println("Time taken for matching "+(finalTimeinitialTime));
}
public static List<Integer> bruteForceStringMatcher(String text, String pattern){
char[] textArray = text.toCharArray();
char[] patternArray = pattern.toCharArray();
int textLength = textArray.length;
int patternLength = patternArray.length;
List<Integer> matchedIndexes = new ArrayList<>();
int textIndex = 0;
for(textIndex = 0; textIndex < textLength; textIndex++){
int textIndexLocal = textIndex;
Boolean match = true;
int matchedIndex = textIndex;
for(int patternIndex = 0; patternIndex < patternLength; patternIndex++){
if(textArray[textIndexLocal] != patternArray[patternIndex]) {
match = false;
break;
}
textIndexLocal = textIndexLocal + 1;
}
if(match)
matchedIndexes.add(matchedIndex);
}
return matchedIndexes;
}
}

The output for this is as follow:

========String matcher Brute Force Approach===========
Pattern found at 3
Pattern found at 6
Time taken for matching 243434

Time Complexity

In worst case, for any text of length n and pattern m, we are going to do m comparisons for each character of text, that will mean O(n*m)

We will be discussing a faster Rabin-Karp Algorithm for substring search in next post.


References and useful link:

Corman

Rabin Karp

KMP

Rabin Fingerprint

PPT for Rabin Karp

If you liked this article and would like one such blog to land in your inbox every week, consider subscribing to our newsletter: https://skillcaptain.substack.com

Leave a Reply

Up ↑

%d bloggers like this: