If
a string has length n, then it will have 2^{n}
substrings.

So,
if 2 strings are of length m and n, then comparison of all their
substrings will be O(2^{m+n})

Recursive formulation

In
the above, we match characters from the end.

In
each recursive step, we reduce the length of one of the strings when
characters do not match. If characters match, then we reduce lengths
of both the strings.

Overlapping
subproblems:
In the above case, several cases are calculated repeatedly which
indicates a candidate for DP.

Example,
if end of both the strings do not match, then 2 recursive calls in
1st step are:

n-1
character string and m character string.

n
character string and m-1 character string.

In
the recursions of each of the above, a call is generated for n-1 char
string and m-1 string.

If
we had stored the result of these computations, then it would have
been much more efficient.

Dynamic Programming Solution

We
will need a matrix of size M+1 x N+1 for a lookup table.

If
one of the strings has 0 length, then for all lengths of other
string, max matching subsequence has 0 length.

If
both the strings are of 1 length, then lookup[1][1] will be either 0
or 1

If
both the strings are of length 2, then entries possible are:

lookup[1][1]
= 1 or 0

lookup[1][2]
= 1 or 0

lookup[2][1]
= 1 or 0

lookup[2][2]
= 2, 1, or 0

Note:
The above needs a space complexity of O(mn) but our algorithm only
needs values from a previous row. So we can optimize it by keeping
just 2 rows at a time. When one lookup row is completely formed, we
overwrite the previous row with the new row.

Find subsequence also alongwith the length

This can be done by making a second pass through both the strings as follows:

String findMaxSubsequence (String A, String B)
{
String S = "";
int i=0, j=0;
while (i < A.length && j < B.length)
{
if (A[i]==B[j])
{
S += A[i];
i++; j++;
}
else if (L[i+1,j] >= L[i,j+1]) i++;
else j++;
}
}

Maximum common substring

Substring is contiguous while subsequence is not.
A string of length n has 2^{n} subsequences and n(n+1)/2 substrings.

One good part of substring match problem is that once a non-matching character is found in the
strings, we can completely forget about the strings matched till those characters.

int maxCommonSubstring (char[] X, char[] Y, int m, int n)
{
if (m < 0 || n < 0)
return 0;
int maxA = 0;
if (X[m] == Y[n])
{
// If one char has matched, check how far the match goes
while (m >= 0 && n >= 0 && X[m] == Y[n])
{
maxA++;
m--;
n--;
}
}
int maxB = maxCommonSubstring (X, Y, m-1, n);
int maxC = maxCommonSubstring (X, Y, m, n-1);
return max (maxA, maxB, maxC);
}

This solution will compare each substring with each other substring.
Total comparisons for strings of lengths m and n would be:
m(m+1)/2 * n(n+1)/2

Right hand side proves that for m=3, n=3, total recursive calls are: 3*4/2 * 3*4/2 = 36

It is easy to see that lots of subproblems are being calculated again and again.
And the order can be improved by using dynamic programming here.

Sample Execution for worst case

abc, def
ab, def
a , def
"", def
a , de
"", de
a , d
"", d
a, ""
ab, de
a , de
"", de
a , d
"", d
a, ""
ab, d
a , d
ab, ""
abc, de
abc, d
abc, ""
ab, d
a , d
"", d
a, ""
ab, ""
ab, de
a , de
"", de
a , d
ab, d
a , d
"", d
a, ""
ab, ""

Got a thought to share or found a bug in the code? We'd love to hear from you: