-
Notifications
You must be signed in to change notification settings - Fork 0
Example: Longest Common Subsequence
-
Given a sequence X={
$x_1$ ,...,$x_m$ } we say that Z={$z_1$ ,...,$z_k$ } is a subsequence of X if there is a strictly increasing sequence {$i_1$ ,...,$i_k$ } of indices of X such that$X_{i_j}$ =$Z_j$ for each j=1...k. -
Given two sequences X and Y={
$y_1$ ,...,$y_n$ }, Z is a Common Subsequence if it is a subsequence of both. -
In the Longest Common Subsequence problem we are given X and Y and are tasked to find the Longest Common Subsequence.
-
To formulate the problem, we observe that if Z is the LCS of X and Y then:
- If
$x_m$ ==$y_n$ then$z_k$ ==$x_m$ ==$y_n$ and$Z_{k-1}$ is the LCS of$X_{m-1}$ and$Y_{n-1}$ .
- If
The proof is divided in two parts.
-
If
$z_k$ was different than$x_m$ then we could append$x_m$ to Z obtaining a LCS longer than Z, but that is a contradiction because Z is the LCS. -
If
$Z_{k-1}$ was not the LCS of$X_{m}$ and$Y_{n-1}$ , then we could find a common subsequence longer than k-1, but the length of the LCS Z was k so that is not possible.- If
$x_{m}$ !=$y_{n}$ and$x_{m}$ !=$z_{k}$ , then Z is LCS of$X_{m-1}$ and$Y_{n}$ - Symmetrically, If
$x_{m}$ !=$y_{n}$ and$y_{n}$ !=$z_{k}$ , then Z is LCS of$X_{m}$ and$Y_{n-1}$
- If
To prove this, we can observe that if Z doesn't contain
-
From these three points we see that we must examine one of two problems:
- If of
$x_{m}$ !=$y_{n}$ we must find the LCS of$X_{m-1}$ and$Y_{n-1}$ , appending$x_{m}$ to that yields the LCS of X and Y - If
$x_{m}$ !=$y_{n}$ we must pick the longest LCS in$X_{m}$ $Y_{n-1}$ and$X_{m-1}$ $Y_{n}$
- If of
-
This formulation shows that we can get the final solution by using solutions of smaller problems.
-
Let c[i,j] be the length of the LCS of
$X_{i}$ and$Y_{j}$ :- if i == 0 or j == 0 then one of the two sequences is empty thus the len of the LCS is 0
- if i,j>0 and
$x_{i}$ ==$y_{j}$ then c[i, j] = c[i-1,j-1] + 1 - if i,j>0 and
$x_{i}$ !=$y_{j}$ then c[i, j] = Max(c[i-1,j], c[i,j-1]
Template:
// X = 1..m
// Y = 1..n
LCS(X,Y)
m = X.len
n = Y.len
c[0..m, 0..n]
b[1..m, 1..n] // we don't really need this
for i = 1 to m
for j = 1 to n
if X[i] == Y[j]
c[i,j] = 1 + c[i-1, j-1]
// b[i, j] "up-left"
else if c[i-1, j] >= c[i, j-1]
c[i, j] = c[i-1,j]
// b[i, j] "up"
else
c[i, j] = c[i,j-1]
// b[i, j] "left"
PrintLCS_1(b, X, i, j)
if b[i,j] == up-left
PrintLCS_1(b, X, i-1, j-1)
print X[i]
else if b[i,j] == "up"
PrintLCS_1(b, X, i-1, j)
else
PrintLCS_1(b, X, i, j-1)
PrintLCS_2(c, X, i, j)
if i == 0 || j == 0 return
if c[i, j] >= c[i-1,j] && c[i, j] >= c[i,j-1]
// up left
PrintLCS_2(c, X, i-1, j-1)
print X[i]
else if c[i-1,j] >= c[i,j-1]
// up
PrintLCS_2(c, X, i-1, j)
else
// left
PrintLCS_2(c, X, i, j-1)