Question

In: Computer Science

Dynamic Programming Question. Make the table for finding a LCS (Longest Common Subsequence) of the strings...

Dynamic Programming Question. Make the table for finding a LCS (Longest Common Subsequence) of the strings SLWOVNNDK and ALWGQVNBKB. You need to Show the traceback for the LCS.

Solutions

Expert Solution

here string LWVNK is the LCS.

------------------------

The longest subsequence common to R = (GAC), and C = (AGCAT) will be found. Because the LCS function uses a "zeroth" element, it is convenient to define zero prefixes that are empty for these sequences: R0 = Ø; and C0 = Ø. All the prefixes are placed in a table with C in the first row (making it a column header) and R in the first column (making it a row header).

LCS Strings
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø
A Ø
C Ø

This table is used to store the LCS sequence for each step of the calculation. The second column and second row have been filled in with Ø, because when an empty sequence is compared with a non-empty sequence, the longest common subsequence is always an empty sequence.

LCS(R1, C1) is determined by comparing the first elements in each sequence. G and A are not the same, so this LCS gets (using the "second property") the longest of the two sequences, LCS(R1, C0) and LCS(R0, C1). According to the table, both of these are empty, so LCS(R1, C1) is also empty, as shown in the table below. The arrows indicate that the sequence comes from both the cell above, LCS(R0, C1) and the cell on the left, LCS(R1, C0).

LCS(R1, C2) is determined by comparing G and G. They match, so G is appended to the upper left sequence, LCS(R0, C1), which is (Ø), giving (ØG), which is (G).

For LCS(R1, C3), G and C do not match. The sequence above is empty; the one to the left contains one element, G. Selecting the longest of these, LCS(R1, C3) is (G). The arrow points to the left, since that is the longest of the two sequences.

LCS(R1, C4), likewise, is (G).

LCS(R1, C5), likewise, is (G).

"G" Row Completed
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø
C Ø

For LCS(R2, C1), A is compared with A. The two elements match, so A is appended to Ø, giving (A).

For LCS(R2, C2), A and G do not match, so the longest of LCS(R1, C2), which is (G), and LCS(R2, C1), which is (A), is used. In this case, they each contain one element, so this LCS is given two subsequences: (A) and (G).

For LCS(R2, C3), A does not match C. LCS(R2, C2) contains sequences (A) and (G); LCS(R1, C3) is (G), which is already contained in LCS(R2, C2). The result is that LCS(R2, C3) also contains the two subsequences, (A) and (G).

For LCS(R2, C4), A matches A, which is appended to the upper left cell, giving (GA).

For LCS(R2, C5), A does not match T. Comparing the two sequences, (GA) and (G), the longest is (GA), so LCS(R2, C5) is (GA).

"G" & "A" Rows Completed
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø {\displaystyle {\overset {\nwarrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(GA) {\displaystyle {\overset {\ }{\leftarrow }}}(GA)
C Ø

For LCS(R3, C1), C and A do not match, so LCS(R3, C1) gets the longest of the two sequences, (A).

For LCS(R3, C2), C and G do not match. Both LCS(R3, C1) and LCS(R2, C2) have one element. The result is that LCS(R3, C2) contains the two subsequences, (A) and (G).

For LCS(R3, C3), C and C match, so C is appended to LCS(R2, C2), which contains the two subsequences, (A) and (G), giving (AC) and (GC).

For LCS(R3, C4), C and A do not match. Combining LCS(R3, C3), which contains (AC) and (GC), and LCS(R2, C4), which contains (GA), gives a total of three sequences: (AC), (GC), and (GA).

Finally, for LCS(R3, C5), C and T do not match. The result is that LCS(R3, C5) also contains the three sequences, (AC), (GC), and (GA).

Completed LCS Table
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø {\displaystyle {\overset {\nwarrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(GA) {\displaystyle {\overset {\ }{\leftarrow }}}(GA)
C Ø {\displaystyle {\overset {\ \uparrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(AC) & (GC) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(AC) & (GC) & (GA) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(AC) & (GC) & (GA)

The final result is that the last cell contains all the longest subsequences common to (AGCAT) and (GAC); these are (AC), (GC), and (GA). The table also shows the longest common subsequences for every possible pair of prefixes. For example, for (AGC) and (GA), the longest common subsequence are (A) and (G).

Traceback approach[edit]

Calculating the LCS of a row of the LCS table requires only the solutions to the current row and the previous row. Still, for long sequences, these sequences can get numerous and long, requiring a lot of storage space. Storage space can be saved by saving not the actual subsequences, but the length of the subsequence and the direction of the arrows, as in the table below.

Storing length, rather than sequences
Ø A G C A T
Ø 0 0 0 0 0 0
G 0 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1
A 0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ }{\leftarrow }}}2
C 0 {\displaystyle {\overset {\ \uparrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2

The actual subsequences are deduced in a "traceback" procedure that follows the arrows backwards, starting from the last cell in the table. When the length decreases, the sequences must have had a common element. Several paths are possible when two arrows are shown in a cell. Below is the table for such an analysis, with numbers colored in cells where the length is about to decrease. The bold numbers trace out the sequence, (GA).[6]

Traceback example
Ø A G C A T
Ø 0 0 0 0 0 0
G 0 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1
A 0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ }{\leftarrow }}}2
C 0 {\displaystyle {\overset {\ \uparrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2

Related Solutions

Write the pseudo code for longest common subsequence algorithm in Java. Alg lCS( X , n,...
Write the pseudo code for longest common subsequence algorithm in Java. Alg lCS( X , n, Y, m) Input: String X of length n, String Y of length m Output: return the length of the longest common subsequence between X and Y
Prove Longest common subsequence algorithm class finds the optimal solution
Prove Longest common subsequence algorithm class finds the optimal solution
Problem 2. Purpose: practice algorithm design using dynamic programming. A subsequence is palindromic if it is...
Problem 2. Purpose: practice algorithm design using dynamic programming. A subsequence is palindromic if it is the same whether read left to right or right to left. For instance, the sequence A,C,G,T,G,T,C,A,A,A,A,T,C,G has many palindromic subsequences, including A,C,G,C,A and A,A,A,A (on the other hand, the subsequence A,C,T is not palindromic). Assume you are given a sequence x[1...n] of characters. Denote L(i,j) the length of the longest palindrome in the substring x[i,...,j]. The goal of the Maximum Palindromic Subsequence Problem (MPSP)...
Scheme Programming - Racket R5RS Longest Non-Decreasing Subsequence You will write two Scheme functions that compute...
Scheme Programming - Racket R5RS Longest Non-Decreasing Subsequence You will write two Scheme functions that compute a longest non-decreasing subsequence from a list of numbers. For example, if you type > (lis '(1 2 3 2 4 1 2)) you might get (1 2 3 4) Note that there may be more than one longest non-decreasing subsequence. In the above example, your program might also find (1 2 2 4) or (1 2 2 2). You should concentrate first on...
Given two strings X and Y of length n and m respectively, design a dynamic programming...
Given two strings X and Y of length n and m respectively, design a dynamic programming based algorithm to finnd a super-string Z of minimum length such that both X and Y are subsequence of Z. Note that characters in X and Y do not need to appear consecutively in Z as long as they appear in the same order in Z as in X or Y . Design an other algorithm for solving the same problem but with three...
Find the edit distance between HORSE and NORTH Please give a dynamic-programming table for computing this...
Find the edit distance between HORSE and NORTH Please give a dynamic-programming table for computing this distance
The following two questions can be solved by dynamic programming. For each question, please describe optimal...
The following two questions can be solved by dynamic programming. For each question, please describe optimal substructure and express recurrence relation. Give pseudo-code and analyze time and space complexity of your algorithm. 1. Longest palindrome subsequence A palindrome is a nonempty string over some alphabet that reads the same forward and backward. Examples of palindromes are all strings of length 1, “civic”, “racecar”, and “aibohphobia” (fear of palindromes). Design a dynamic programming algorithm to find the longest palindrome that is...
PRACTICAL 10 C PROGRAMMING. Question 1 - Reading into a dynamic array. Write a program called...
PRACTICAL 10 C PROGRAMMING. Question 1 - Reading into a dynamic array. Write a program called temperatures01 that reads a (non-empty) sequence maximum daily temperatures. Your program should first ask for the number of temperatures to read and dynamically allocate an array just big enough to hold the number of temperatures you read. You should then read in the elements of the array using a loop. You then should print out the elements of the array in reverse order (from...
IMPLEMENT IN JAVA PLEASE I NEED THIS DONE ASAP Question 1 (25pt) Dynamic Programming – Coin...
IMPLEMENT IN JAVA PLEASE I NEED THIS DONE ASAP Question 1 (25pt) Dynamic Programming – Coin Change Design algorithm for the coin change problem using dynamic programming approach. Given coin denominations by value: c1 < c2 < … < cn, and change amount of x, find out how to pay amount x to customer using fewest number of coins. Your algorithm needs to work for ANY coin denomination c1 < c2 < … < cn, and ANY change amount of...
1. Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL),...
1. Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic lymphocytic leukemia (CLL), Acute myeloid leukemia (AML), and Chronic myeloid leukemia (CML) according to incidence, physiologic alterations, clinical manifestations, management, and prognosis. 2. Formulate a Nursing Care Plan for a patient with Acute Leukemia. 3. Enumerate the nursing implications for the administration of blood components.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT