Question

In: Computer Science

Dynamic Programming Question. Make the table for finding a LCS (Longest Common Subsequence) of the strings...

Dynamic Programming Question. Make the table for finding a LCS (Longest Common Subsequence) of the strings SLWOVNNDK and ALWGQVNBKB. You need to Show the traceback for the LCS.

Solutions

Expert Solution

here string LWVNK is the LCS.

------------------------

The longest subsequence common to R = (GAC), and C = (AGCAT) will be found. Because the LCS function uses a "zeroth" element, it is convenient to define zero prefixes that are empty for these sequences: R0 = Ø; and C0 = Ø. All the prefixes are placed in a table with C in the first row (making it a column header) and R in the first column (making it a row header).

LCS Strings
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø
A Ø
C Ø

This table is used to store the LCS sequence for each step of the calculation. The second column and second row have been filled in with Ø, because when an empty sequence is compared with a non-empty sequence, the longest common subsequence is always an empty sequence.

LCS(R1, C1) is determined by comparing the first elements in each sequence. G and A are not the same, so this LCS gets (using the "second property") the longest of the two sequences, LCS(R1, C0) and LCS(R0, C1). According to the table, both of these are empty, so LCS(R1, C1) is also empty, as shown in the table below. The arrows indicate that the sequence comes from both the cell above, LCS(R0, C1) and the cell on the left, LCS(R1, C0).

LCS(R1, C2) is determined by comparing G and G. They match, so G is appended to the upper left sequence, LCS(R0, C1), which is (Ø), giving (ØG), which is (G).

For LCS(R1, C3), G and C do not match. The sequence above is empty; the one to the left contains one element, G. Selecting the longest of these, LCS(R1, C3) is (G). The arrow points to the left, since that is the longest of the two sequences.

LCS(R1, C4), likewise, is (G).

LCS(R1, C5), likewise, is (G).

"G" Row Completed
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø
C Ø

For LCS(R2, C1), A is compared with A. The two elements match, so A is appended to Ø, giving (A).

For LCS(R2, C2), A and G do not match, so the longest of LCS(R1, C2), which is (G), and LCS(R2, C1), which is (A), is used. In this case, they each contain one element, so this LCS is given two subsequences: (A) and (G).

For LCS(R2, C3), A does not match C. LCS(R2, C2) contains sequences (A) and (G); LCS(R1, C3) is (G), which is already contained in LCS(R2, C2). The result is that LCS(R2, C3) also contains the two subsequences, (A) and (G).

For LCS(R2, C4), A matches A, which is appended to the upper left cell, giving (GA).

For LCS(R2, C5), A does not match T. Comparing the two sequences, (GA) and (G), the longest is (GA), so LCS(R2, C5) is (GA).

"G" & "A" Rows Completed
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø {\displaystyle {\overset {\nwarrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(GA) {\displaystyle {\overset {\ }{\leftarrow }}}(GA)
C Ø

For LCS(R3, C1), C and A do not match, so LCS(R3, C1) gets the longest of the two sequences, (A).

For LCS(R3, C2), C and G do not match. Both LCS(R3, C1) and LCS(R2, C2) have one element. The result is that LCS(R3, C2) contains the two subsequences, (A) and (G).

For LCS(R3, C3), C and C match, so C is appended to LCS(R2, C2), which contains the two subsequences, (A) and (G), giving (AC) and (GC).

For LCS(R3, C4), C and A do not match. Combining LCS(R3, C3), which contains (AC) and (GC), and LCS(R2, C4), which contains (GA), gives a total of three sequences: (AC), (GC), and (GA).

Finally, for LCS(R3, C5), C and T do not match. The result is that LCS(R3, C5) also contains the three sequences, (AC), (GC), and (GA).

Completed LCS Table
Ø A G C A T
Ø Ø Ø Ø Ø Ø Ø
G Ø {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}Ø {\displaystyle {\overset {\nwarrow }{\ }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G) {\displaystyle {\overset {\ }{\leftarrow }}}(G)
A Ø {\displaystyle {\overset {\nwarrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(GA) {\displaystyle {\overset {\ }{\leftarrow }}}(GA)
C Ø {\displaystyle {\overset {\ \uparrow }{\ }}}(A) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(A) & (G) {\displaystyle {\overset {\nwarrow }{\ }}}(AC) & (GC) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(AC) & (GC) & (GA) {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}(AC) & (GC) & (GA)

The final result is that the last cell contains all the longest subsequences common to (AGCAT) and (GAC); these are (AC), (GC), and (GA). The table also shows the longest common subsequences for every possible pair of prefixes. For example, for (AGC) and (GA), the longest common subsequence are (A) and (G).

Traceback approach[edit]

Calculating the LCS of a row of the LCS table requires only the solutions to the current row and the previous row. Still, for long sequences, these sequences can get numerous and long, requiring a lot of storage space. Storage space can be saved by saving not the actual subsequences, but the length of the subsequence and the direction of the arrows, as in the table below.

Storing length, rather than sequences
Ø A G C A T
Ø 0 0 0 0 0 0
G 0 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1
A 0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ }{\leftarrow }}}2
C 0 {\displaystyle {\overset {\ \uparrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2

The actual subsequences are deduced in a "traceback" procedure that follows the arrows backwards, starting from the last cell in the table. When the length decreases, the sequences must have had a common element. Several paths are possible when two arrows are shown in a cell. Below is the table for such an analysis, with numbers colored in cells where the length is about to decrease. The bold numbers trace out the sequence, (GA).[6]

Traceback example
Ø A G C A T
Ø 0 0 0 0 0 0
G 0 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1 {\displaystyle {\overset {\ }{\leftarrow }}}1
A 0 {\displaystyle {\overset {\nwarrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ }{\leftarrow }}}2
C 0 {\displaystyle {\overset {\ \uparrow }{\ }}}1 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}1 {\displaystyle {\overset {\nwarrow }{\ }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2 {\displaystyle {\overset {\ \ \uparrow }{\leftarrow }}}2

Related Solutions

Prove Longest common subsequence algorithm class finds the optimal solution
Prove Longest common subsequence algorithm class finds the optimal solution
Problem 2. Purpose: practice algorithm design using dynamic programming. A subsequence is palindromic if it is...
Problem 2. Purpose: practice algorithm design using dynamic programming. A subsequence is palindromic if it is the same whether read left to right or right to left. For instance, the sequence A,C,G,T,G,T,C,A,A,A,A,T,C,G has many palindromic subsequences, including A,C,G,C,A and A,A,A,A (on the other hand, the subsequence A,C,T is not palindromic). Assume you are given a sequence x[1...n] of characters. Denote L(i,j) the length of the longest palindrome in the substring x[i,...,j]. The goal of the Maximum Palindromic Subsequence Problem (MPSP)...
Scheme Programming - Racket R5RS Longest Non-Decreasing Subsequence You will write two Scheme functions that compute...
Scheme Programming - Racket R5RS Longest Non-Decreasing Subsequence You will write two Scheme functions that compute a longest non-decreasing subsequence from a list of numbers. For example, if you type > (lis '(1 2 3 2 4 1 2)) you might get (1 2 3 4) Note that there may be more than one longest non-decreasing subsequence. In the above example, your program might also find (1 2 2 4) or (1 2 2 2). You should concentrate first on...
Find the edit distance between HORSE and NORTH Please give a dynamic-programming table for computing this...
Find the edit distance between HORSE and NORTH Please give a dynamic-programming table for computing this distance
The following two questions can be solved by dynamic programming. For each question, please describe optimal...
The following two questions can be solved by dynamic programming. For each question, please describe optimal substructure and express recurrence relation. Give pseudo-code and analyze time and space complexity of your algorithm. 1. Longest palindrome subsequence A palindrome is a nonempty string over some alphabet that reads the same forward and backward. Examples of palindromes are all strings of length 1, “civic”, “racecar”, and “aibohphobia” (fear of palindromes). Design a dynamic programming algorithm to find the longest palindrome that is...
1. Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL),...
1. Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic lymphocytic leukemia (CLL), Acute myeloid leukemia (AML), and Chronic myeloid leukemia (CML) according to incidence, physiologic alterations, clinical manifestations, management, and prognosis. 2. Formulate a Nursing Care Plan for a patient with Acute Leukemia. 3. Enumerate the nursing implications for the administration of blood components.
Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic...
Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic lymphocytic leukemia (CLL), Acute myeloid leukemia (AML), and Chronic myeloid leukemia (CML) according to incidence, physiologic alterations, clinical manifestations, management, and prognosis. • Formulate a Nursing Care Plan for a patient with Acute Leukemia. • Enumerate the nursing implications for the administration of blood components
Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic...
Make a table to differentiate the 4 common types of leukemia: Acute lymphocytic leukemia (ALL), Chronic lymphocytic leukemia (CLL), Acute myeloid leukemia (AML), and Chronic myeloid leukemia (CML) according to incidence, physiologic alterations, clinical manifestations, management, and prognosis. Formulate a Nursing Care Plan for a patient with Acute Leukemia. Enumerate the nursing implications for the administration of blood components.
Refer the previous question: The following table gives the common disorders, problems and complaints associated with...
Refer the previous question: The following table gives the common disorders, problems and complaints associated with each body system and its components relevant to the nursing care you might provide for your clients in the Australian health care system. Complete the following table with regard to its definition, pathophysiology, signs and impact of specific health procedures(in 10-20 words each). DISEASES AFFECTING THE CARDIOVASCULAR SYSTEM Angina pectoris Angina pectoris 11.1) Definition 11.2) Briefly outline the pathophysiology 11.3) List four specific signs...
Refer the previous question: The following table gives the common disorders, problems and complaints associated with...
Refer the previous question: The following table gives the common disorders, problems and complaints associated with each body system and its components relevant to the nursing care you might provide for your clients in the Australian health care system. Complete the following table with regard to its definition, pathophysiology, signs and impact of specific health procedures(in 10-20 words each). DISEASES AFFECTING THE RESPIRATORY SYSTEM Pneumonia Pneumonia 14.1) Definition 14.2) Briefly outline the pathophysiology 14.3) List four signs 14.4) Nurses should...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT