DAY 19 - Longest Common Substring

Nayan Pahuja - Jun 22 '23 - - Dev Community

Problem: Longest Common Substring

Given two strings. The task is to find the length of the longest common substring.
LINK

Examples:

Input: S1 = "ABCDGH", S2 = "ACDGHR", n = 6, m = 6
Output: 4
Explanation: The longest common substring
is "CDGH" which has length 4.


Input: S1 = "ABC", S2 "ACB", n = 3, m = 3
Output: 1
Explanation: The longest common substrings
are "A", "B", "C" all having length 1.

Intuition:

This is a little bit different from our previous question.
To solve this question we will require the understanding of how we solved our previous question, so if you haven't read that, Please do now. It's a short 3-4 min read.

Now the question is different because

  • It requires us to find Longest Common Substring not subsequence. A substring unlike a subsequence must be continuous in order.

Approach + Space Optimization

Let's revisit our tabulation approach for longest common subsequence.
We initialized the base cases.
Declared a dp vector of size(n+1 , m+1) and initialized it with 0.

And Our Logic was like this:

  • If(S[ind1] == t[ind2]) In this case this common element will represent a single character common subsequence, so we can say that we have found one character and we can shrink both the strings by 1 to find the longest common subsequence in the remaining pair of strings.
  • If its not equal we go into two options.

If you carefully observe the dp array at every dp[ind1][ind2] it stores the value of common characters found in order.

We will do a little change since we want continuous characters, if the character is matching we add it to the previous value + 1 else we will initialize it as 0.

We will also keep a max count for max substring until yet.

Carefully read the code for better understanding.

//{ Driver Code Starts
#include<bits/stdc++.h>
using namespace std;

// } Driver Code Ends
class Solution{
    public:

    int longestCommonSubstr (string S1, string S2, int n, int m)
    {
        int maxi = INT_MIN;
        int ind1 = S1.size();
        int ind2 = S2.size();
        vector<vector<int>> dp(ind1+1, vector<int>(ind2+1,0));

        for(int i = 1; i <=ind1; i++){
            for(int d = 1; d <=ind2; d++){
                if(S1[i-1] == S2[d-1])
                dp[i][d] = 1 + dp[i-1][d-1];
                else
                dp[i][d] = 0;
                maxi = max(maxi, dp[i][d]);
                }
            }
        return maxi;
        }

};

//{ Driver Code Starts.

int main()
{
    int t; cin >> t;
    while (t--)
    {
        int n, m; cin >> n >> m;
        string s1, s2;
        cin >> s1 >> s2;
        Solution ob;

        cout << ob.longestCommonSubstr (s1, s2, n, m) << endl;
    }
}
// } Driver Code Ends
Enter fullscreen mode Exit fullscreen mode

Complexity Analysis:
Time Complexity: O(N*M)
Reason: There are two nested loops
Space Complexity: O(N*M)

We can further optimize it's space using a single 1-D array as we only require the previous row's value at that point of time.

Code:

  int longestCommonSubstr (string S1, string S2, int n, int m)
    {
int n = s1.size();
    int m = s2.size();

    vector<int> prev(m+1,0), cur(m+1,0);

    int ans = 0;

    for(int i=1;i<=n;i++){
        for(int j=1;j<=m;j++){
            if(s1[i-1]==s2[j-1]){
                int val = 1 + prev[j-1];
                cur[j] = val;
                ans = max(ans,val);
            }
            else
                cur[j] = 0;
        }
        prev=cur;
    }

    return ans;
}
Enter fullscreen mode Exit fullscreen mode

Complexity Analysis:
Time Complexity: O(N*M)
Reason: There are two nested loops
Space Complexity: O(M)

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .