Finding the Actual Longest Common Subsequence from Dynamic Programming Table

The Longest Common Subsequence (LCS) problem is a dynamic programming problem in computer science. It aims to find the longest subsequence present in both input sequences, where a subsequence is a sequence that appears in the same relative order but not necessarily consecutively.

We’ve already discussed the Dynamic Programming Solution for finding the Longest Common Subsequence in this post – Longest Common Subsequence Solution

In this post, we will focus on how to find the actual longest common subsequence from the dynamic programming table of the LCS solution using C++. We will start by briefly explaining the LCS problem and the dynamic programming approach to solve it, and then we will discuss how to backtrack through the DP table to extract the LCS.

Problem Statement

Given two sequences, the LCS problem is to find the longest subsequence that is common to both sequences. For example, given the sequences:

X = "ABCBDAB"
Y = "BDCAB"

The LCS of X and Y is "BCAB"
Dynamic Programming Solution

The dynamic programming solution to the LCS problem involves constructing a 2D table (often called the DP table) where the entry dp[i][j] represents the length of the LCS of the sequences X[0..i-1] and Y[0..j-1].

Optimal Substructure

Let X = {X1, X2, X3,..Xm} and Y = {Y1, Y2, Y3,….Yn} be the two sequences and, 

Let LCS[i][j] be the LCS of given sequences:
1. If X[i-1] == Y[j-1], then LCS[i][j] = LCS[i-1][j-1] + 1
(adding 1 to the length of the longest subsequence till i-2th & j-2th string
characters, if X[i-1] == Y[j-1])
2. Else LCS[i][j] = max(LCS[i-1][j], LCS[i][j-1])

Code Implementation

vector<vector<int>> lcs_table(const string& X, const string& Y) {

    int m = X.length();
    int n = Y.length();
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i - 1] == Y[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }

    return dp;
}
Finding the LCS from the DP Table

Once the DP table is built, the next step is to backtrack from dp[m][n] to dp[0][0] to construct the LCS string.

Pseudocode

1. Initialize an empty string ‘lcs’ to store the Longest Common Subsequence.

2. While both i > 0 and j > 0, repeat the following steps:
If X[i-1] == Y[j-1]:
lcs = X[i - 1] + lcs; // Add character to LCS
i--;
j--;
Else:
If dp[i-1][j] > dp[i][j-1]:
i--; // Move Up
Else:
j--; // Move Left

Code Implementation

//
//  main.cpp
//  Finding Longest Common Subsequence
//
//  Created on 14/06/24.
//

#include <iostream>
#include <vector>
#include <string>
using namespace std;

void printLCSTable(const int m, const int n, vector<vector<int>> dp) {

    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            cout << dp[i][j] << " ";
        }
        
        cout << endl;
    }
}

vector<vector<int>> LCSTable(const string& X, const string& Y) {
    
    int m = (int) X.length();
    int n = (int) Y.length();
    
    vector<vector<int>> dp(m + 1, vector<int>(n + 1, 0));

    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i - 1] == Y[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1] + 1;
            } else {
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
    }
    
    return dp;
}

string findLCS(const string& X, const string& Y, const vector<vector<int>>& dp) {
    
    int i = (int) X.length();
    int j = (int) Y.length();
    string lcs;

    while (i > 0 && j > 0) {
        
        if (X[i - 1] == Y[j - 1]) {
            lcs = X[i - 1] + lcs; // Add this character to the LCS
            i--;
            j--;
        } else if (dp[i - 1][j] > dp[i][j - 1]) {
            i--;      // Move Up
        } else {
            j--;      // Move Left
        }
    }
    
    return lcs;
}

int main() {
    
    string X = "ATDBM";
    string Y = "BATCB";
    
    vector<vector<int>> dp = LCSTable(X, Y);
        
    string lcs = findLCS(X, Y, dp);
    
    cout << "The Longest Common Subsequence is: " << lcs << endl << endl;
    
    cout << "LCS Table" << endl;
    printLCSTable((int) X.size(), (int) Y.size(), dp);
    
    return 0;
}

Output

The Longest Common Subsequence is: ATB

LCS Table
0 0 0 0 0 0
0 0 1 1 1 1
0 0 1 2 2 2
0 0 1 2 2 2
0 1 1 2 2 3
0 1 1 2 2 3

Time complexity: O(m * n), where m is the length of string X and n is the length of string Y; for traversing both string X and string Y in nested loops.
Auxiliary Space: O(m * n), for creating the LCS Dynamic Programming table.

Here’s a working code with step-by-step code run: Finding LCS

This approach ensures that we can efficiently solve the LCS problem and retrieve the longest subsequence common to both input sequences.

Leave a Reply

Your email address will not be published. Required fields are marked *