The Longest Common Subsequence (LCS) problem is a dynamic programming problem in computer science. It aims to find the longest subsequence present in both input sequences, where a subsequence is a sequence that appears in the same relative order but not necessarily consecutively.
We’ve already discussed the Dynamic Programming Solution for finding the Longest Common Subsequence in this post – Longest Common Subsequence Solution
In this post, we will focus on how to find the actual longest common subsequence from the dynamic programming table of the LCS solution using C++. We will start by briefly explaining the LCS problem and the dynamic programming approach to solve it, and then we will discuss how to backtrack through the DP table to extract the LCS.
Problem Statement
Given two sequences, the LCS problem is to find the longest subsequence that is common to both sequences. For example, given the sequences:
X = "ABCBDAB"
Y = "BDCAB"
The LCS of X and Y is "BCAB"
Dynamic Programming Solution
The dynamic programming solution to the LCS problem involves constructing a 2D table (often called the DP table) where the entry dp[i][j]
represents the length of the LCS of the sequences X[0..i-1]
and Y[0..j-1]
.
Optimal Substructure
Let X = {X1, X2, X3,..Xm} and Y = {Y1, Y2, Y3,….Yn} be the two sequences and,
Let LCS[i][j] be the LCS of given sequences:
1. If X[i-1] == Y[j-1], then LCS[i][j] = LCS[i-1][j-1] + 1
(adding 1 to the length of the longest subsequence till i-2th & j-2th string
characters, if X[i-1] == Y[j-1])
2. Else LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])
Code Implementation
vector<vector<int>> lcs_table(const string& X, const string& Y) {
int m = X.length();
int n = Y.length();
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X[i - 1] == Y[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
return dp;
}
Finding the LCS from the DP Table
Once the DP table is built, the next step is to backtrack from dp[m][n]
to dp[0][0]
to construct the LCS string.
Pseudocode
1. Initialize an empty string ‘lcs’ to store the Longest Common Subsequence.
2. While both i > 0 and j > 0, repeat the following steps:
If X[i-1] == Y[j-1]:
lcs = X[i - 1] + lcs; // Add character to LCS
i--;
j--;
Else:
If dp[i-1][j] > dp[i][j-1]:
i--; // Move Up
Else:
j--; // Move Left
Code Implementation
//
// main.cpp
// Finding Longest Common Subsequence
//
// Created on 14/06/24.
//
#include <iostream>
#include <vector>
#include <string>
using namespace std;
void printLCSTable(const int m, const int n, vector<vector<int>> dp) {
for (int i = 0; i <= m; i++) {
for (int j = 0; j <= n; j++) {
cout << dp[i][j] << " ";
}
cout << endl;
}
}
vector<vector<int>> LCSTable(const string& X, const string& Y) {
int m = (int) X.length();
int n = (int) Y.length();
vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));
for (int i = 1; i <= m; i++) {
for (int j = 1; j <= n; j++) {
if (X[i - 1] == Y[j - 1]) {
dp[i][j] = dp[i - 1][j - 1] + 1;
} else {
dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
}
}
}
return dp;
}
string findLCS(const string& X, const string& Y, const vector<vector<int>>& dp) {
int i = (int) X.length();
int j = (int) Y.length();
string lcs;
while (i > 0 && j > 0) {
if (X[i - 1] == Y[j - 1]) {
lcs = X[i - 1] + lcs; // Add this character to the LCS
i--;
j--;
} else if (dp[i - 1][j] > dp[i][j - 1]) {
i--; // Move Up
} else {
j--; // Move Left
}
}
return lcs;
}
int main() {
string X = "ATDBM";
string Y = "BATCB";
vector<vector<int>> dp = LCSTable(X, Y);
string lcs = findLCS(X, Y, dp);
cout << "The Longest Common Subsequence is: " << lcs << endl << endl;
cout << "LCS Table" << endl;
printLCSTable((int) X.size(), (int) Y.size(), dp);
return 0;
}
Output
The Longest Common Subsequence is: ATB
LCS Table
0 0 0 0 0 0
0 0 1 1 1 1
0 0 1 2 2 2
0 0 1 2 2 2
0 1 1 2 2 3
0 1 1 2 2 3
Time complexity: O(m * n), where m
is the length of string X
and n
is the length of string Y
; for traversing both string X and string Y in nested loops.
Auxiliary Space: O(m * n), for creating the LCS Dynamic Programming table.
Here’s a working code with step-by-step code run: Finding LCS
This approach ensures that we can efficiently solve the LCS problem and retrieve the longest subsequence common to both input sequences.