Longest Common Substring Problem

Longest Common Substring Problem

Introduction

In this blog, we will discuss how to determine the longest common string. A substring is a contiguous sequence of characters in a string. The longest common substring of two strings is a substring which is common to both the string and the longest.

Sample Input:

x = “CodingforFun”,  y = “CodingisGreat”

Sample Output:

6

Explanation:

The Longest Common Substring is “Coding” and is of length 6.

Problem Statement

You are given two strings and you need to determine the longest common substring among both.

Naive Approach

The naive approach is to generate all the substrings of the strings separately and declare a hashmap of type {String, Boolean} and initialize the substrings of the first given string. Now iterate over the substrings of the second string and check whether it exists in the hashmap or not. If it exists then check if its length is greater than the already initialized variable “answer” then update the
value of the “answer” variable.

Dynamic programming approach

The another approach is to determine the longest common substring in O(M * N) time.

Algorithm

  • Initialize a dp table of size M * N with 0 value.
  • Now run a nested loop using index1 and index2 and iterate over all the cells and check whether the character at the index1 in string1 is equal to the character stored at the index2 in string2.
  • If it is so then update the dp[index1][index2] as dp[index1 - 1][index2 - 1] + 1.
  • Also, maintain an answer variable that keeps the record of the longest common substring so far.
  • Return answer.

Below is the implementation of this approach in C++:

Examples

Example 1:

Source Code:

#include <bits/stdc++.h>

using namespace std;

// Function to determine the
// length of the longest common substring   
int solve(string &string1, string &string2, int M, int N)
{
    // Create a dp table to store
    // lengths of longest
    // common suffixes of substrings. 
   
    int dp[M + 1][N + 1];
   
    // To store the length of the longest substring
    int answer = 0;
   
    // Iterate and fill the dp table
    for (int index1 = 0; index1 <= M; index1++)
    {
        for (int index2 = 0; index2 <= N; index2++)
        {
            // The first row and first column
            // entries have no logical meaning,
            // they are used only for simplicity
            // of program
            if (index1 == 0 || index2 == 0)
                dp[index1][index2] = 0;

            else if (string1[index1 - 1] == string2[index2 - 1]) {
                dp[index1][index2] = dp[index1 - 1][index2 - 1] + 1;
                answer = max(answer, dp[index1][index2]);
            }
            else
                dp[index1][index2] = 0;
        }
    }
   
    // Return the answer
    return answer;
}
 
int main() {
   
    // Initialize strings
    string string1 = "boardinfinitywebsite";
    string string2 = "boardinfinity";
   
    cout << "Length of the longest common substring is ";
   
    int N = string1.length();
    int M = string2.length();
   
    // Call solve() function
    int answer = solve(string1, string2, N, M);
   
    cout << answer;
   
    return 0;
}

Output:

As you can see “boardinfinity” is the longest common substring which has length equal to 13.

write your code here: Coding Playground

Example 2:

Source Code:

#include <bits/stdc++.h>

using namespace std;

// Function to determine the
// length of the longest common substring   
int solve(string &string1, string &string2, int M, int N)
{
    // Create a dp table to store
    // lengths of longest
    // common suffixes of substrings. 
   
    int dp[M + 1][N + 1];
   
    // To store the length of the longest substring
    int answer = 0;
   
    // Iterate and fill the dp table
    for (int index1 = 0; index1 <= M; index1++)
    {
        for (int index2 = 0; index2 <= N; index2++)
        {
            // The first row and first column
            // entries have no logical meaning,
            // they are used only for simplicity
            // of program
            if (index1 == 0 || index2 == 0)
                dp[index1][index2] = 0;

            else if (string1[index1 - 1] == string2[index2 - 1]) {
                dp[index1][index2] = dp[index1 - 1][index2 - 1] + 1;
                answer = max(answer, dp[index1][index2]);
            }
            else
                dp[index1][index2] = 0;
        }
    }
   
    // Return the answer
    return answer;
}
 
int main() {
   
    // Initialize strings
    string string1 = "boardinfinitywebsite";
    string string2 = "websiteinfinityboard";
   
    cout << "Length of the longest common substring is ";
   
    int N = string1.length();
    int M = string2.length();
   
    // Call solve() function
    int answer = solve(string1, string2, N, M);
   
    cout << answer;
   
    return 0;
}

Output:

write your code here: Coding Playground

Conclusion

In this blog, we discussed how to determine the longest common substring. We discussed the algorithm and its implementation in C++. We believe that this blog has been helpful for you.