Question

In: Computer Science

This program focuses on programming with Java Collections classes. You will implement a module that finds...

This program focuses on programming with Java Collections classes. You will implement a module that finds a simplified Levenshtein distance between two words represented by strings.

Your program will need support files LevenDistanceFinder.Java, dictionary.txt, while you will be implementing your code in the LevenDistanceFinder.java file.

INSTRUCTIONS

The Levenshtein distance, named for it's creator Vladimir Levenshtein, is a measure of the distance between two words. The edit distance between two strings is the minimum number of operations that are needed to transform one sting into the other. For a full implementation of the distance we would need to account for three different types of operations, removing a letter, adding a letter, or changing a letter.

For THIS program however, we are going to focus solely on the ability to change one letter to another. So, we will be thinking of changing one letter at a time, while maintaining the fact that the word is a still a valid word at all times.

So, for an example, the edit distance from "dog" to "cat" is 3. One possible path for this is

"dog" to "dot" to "cot" to cat"

Note, that there might be other paths, but they all involve at least 3 changes. Another longer example is from "witch" to "coven".

witch->watch->match->march->marcs->mares->mores->moves->coves->coven

You will be implementing a secondary class called LevenshteinFinder that will be used by the main program to solve the problem.

You are given a client program LevenDistanceFinder.java that does the file processing and user interaction. LevenDistanceFinder.java reads a dictionary text file as input and passes its entire contents to you as a list of strings. You are to write a class called LevenshteinFinder that will manage the actual work of finding the distance and the path.

The class will probably have the following necessary private fields:

  • Starting and Ending Strings

  • A map of words to a set of neighbors.

  • An integer distance that starts at -1.

  • A List of strings that is the path itself.

PUBLIC LEVENSHTEINFINDER(STRING, STRING, SET<STRING> )

Your constructor is passed a string which is the starting word of the path. A string that is the ending word of the path, and a collection of words as a set. It should use these values to initialize the class. The set of words will initially be all words from the dictionary.

You should throw an IllegalArgumentException if length of the two words is not the same.

You will want to store the starting and stopping words for later use, but you should not need to save the words list. First you will want to pull out only the words that are of the correct size. Then from this smaller list, what you will want to do is to create a map of words to their "neighbor" words. Start this by going through every word and add it and an empty set, to a map.

You will then go through the set a second time and check every word to every other word. If they are "neighbors" you will add each word to the set that goes with the other word. So if stringA and stringB are neigbors, then you add stringA to stringB's set, and vice versa.

Once the neighbor map is created, you will run the findDistacnce method and the findPath method to save those values for future delivery.

Note that this is all done in the constructor, so that all the work is done when it is “newed”/created.

PRIVATE INT DIFFERENTLETTERS(STRING A, STRING B)

This method should take two strings and find the number of letters that are different from each other and return that number.

PUBLIC INT GETDISTANCE()

This method is called by the main program to get the distance between the two words. Note that you found this in a different method. So this should just return a saved value. If it is longer than 2 lines, you are doing it wrong.

PUBLIC STRING GETPATH()

This method returns a string that is the path from the first word to the second word, assuming it exists. If the path distance is negative one, then return back an error message, if the path distance is zero or higher, then take the pathArray and convert it to a string with arrows in between.

Example: love->lave->late->hate

PRIVATE INT FINDDISTANCE(STRING A, STRING B)

This method finds the distance between two strings. (Note, only the distance, not the path itself). Start by creating two empty sets. Add the staring word to the second set. Set a counter to zero. Then while the sets are different sizes and the 2nd set does not contain the ending word, add everything from set 2 into set one, clear the 2nd set, and then take every word from set1 AND the neighbor of every word from set1 into set 2. Increment the loop counter.

When the loop finishes, if the 2nd set contains the final word return the counter because it was found, if the 2nd set doesn't contain the word, return -1 because there is no path.

PRIVATE VOID FINDPATH(STRING A, STRING B)

This method will find the path between two words. It should probably be the last method you work on. In fact I would create it, and have it do nothing until you are ready for it.

When running this method should only be run after findDistance() so that the distance has already been found and stored in a private int value.

When you are ready for it, this method should do the following.
Initialize the class path List to a new empty List.
Check the distance, if it is negative, store an error message in the path List and then exit the method. If the distance is zero or positive add the first element to the list.

Now in a loop that starts at the distance minus 1, and goes down until 1, look at the set of neighbors of the word in the last box of the list. Find one that has a distance to the ending word that matches the current loop counter. There may be multiples, but there has to be at least one. Find this word, and add it to the list.

Now repeat the loop until the for loop quits. Then add the ending word to the list. You are done.
Here is an example for the path from love -> hate.

The distance from love to hate is 3, so that is bigger than -1. So add love to the list. The list is now size one. Now start your loop from distance -1 (2) to 1. So i is currently 2. Find any word in the neighbor of love, that has a distance to hate of size 2. Use your findDistance method here!. One of those words is "lave". So add that to the array.

Next round, i is one. The list is now [love, lave]. So find a neighbor of "lave" that has a distance to "hate" of one. One possible word is "late". So add "late" to the list which now looks like [love, lave, late]

That should finish the loop, add "hate" to the list. The final list looks like [love, lave, late, hate]. Store that list in the class field. And you are done.

Your program should exactly reproduce the format and general behavior demonstrated on the previous pages.

You may assume that the list of words passed to your constructor will be a nonempty list of nonempty strings composed entirely of lowercase letters.

Use the TreeMaps, TreeSets and ArrayLists implementations for all of your ADT's.

HINTS

If you have not had to deal with exceptions before the proper way to throw and exception is the following: throw new IlleagalStateException() or IllegalArgumentEXception() as appropriate. This will end your method and return back to the primary program.

Solutions

Expert Solution

Given two strings str1 and str2 and below operations that can performed on str1. Find minimum number of edits (operations) required to convert ‘str1’ into ‘str2’.

1. Insert

2. Remove

3. Replace

All of the above operations are of equal cost.

What are the subproblems in this case?
The idea is process all characters one by one staring from either from left or right sides of both strings.
Let us traverse from right corner, there are two possibilities for every pair of character being traversed.

m: Length of str1 (first string) n: Length of str2 (second string)

1. If last characters of two strings are same, nothing much to do. Ignore last characters and get count for remaining strings. So we recur for lengths m-1 and n-1.

2. Else (If last characters are not same), we consider all operations on ‘str1’, consider all three operations on last character of first string, recursively compute minimum cost for all three operations and take minimum of three values.

1. Insert: Recur for m and n-1

2. Remove: Recur for m-1 and n

3. Replace: Recur for m-1 and n-1

Below is the java code:

// A Naive recursive Java program to find minimum number

// operations to convert str1 to str2

class EDIST {

    static int min(int x, int y, int z)

    {

        if (x <= y && x <= z)

            return x;

        if (y <= x && y <= z)

            return y;

        else

            return z;

    }

    static int editDist(String str1, String str2, int m, int n)

    {

        // If first string is empty, the only option is to

        // insert all characters of second string into first

        if (m == 0)

            return n;

        // If second string is empty, the only option is to

        // remove all characters of first string

        if (n == 0)

            return m;

        // If last characters of two strings are same, nothing

        // much to do. Ignore last characters and get count for

        // remaining strings.

        if (str1.charAt(m - 1) == str2.charAt(n - 1))

            return editDist(str1, str2, m - 1, n - 1);

        // If last characters are not same, consider all three

        // operations on last character of first string, recursively

        // compute minimum cost for all three operations and take

        // minimum of three values.

        return 1 + min(editDist(str1, str2, m, n - 1), // Insert

                       editDist(str1, str2, m - 1, n), // Remove

                       editDist(str1, str2, m - 1, n - 1) // Replace

                       );

    }

    public static void main(String args[])

    {

        String str1 = "sunday";

        String str2 = "saturday";

        System.out.println(editDist(str1, str2, str1.length(), str2.length()));

    }

}

Output:

3

The time complexity of above solution is exponential. In worst case, we may end up doing O(3m) operations. The worst case happens when none of characters of two strings match. Below is a recursive call diagram for worst case.

We can see that many subproblems are solved, again and again, for example, eD(2, 2) is called three times. Since same suproblems are called again, this problem has Overlapping Subprolems property. So Edit Distance problem has both properties (see this and this) of a dynamic programming problem. Like other typical Dynamic Programming(DP) problems, recomputations of same subproblems can be avoided by constructing a temporary array that stores results of subproblems.

Code in java:

// A Dynamic Programming based Java program to find minimum

// number operations to convert str1 to str2

class EDIST {

    static int min(int x, int y, int z)

    {

        if (x <= y && x <= z)

            return x;

        if (y <= x && y <= z)

            return y;

        else

            return z;

    }

    static int editDistDP(String str1, String str2, int m, int n)

    {

        // Create a table to store results of subproblems

        int dp[][] = new int[m + 1][n + 1];

        // Fill d[][] in bottom up manner

        for (int i = 0; i <= m; i++) {

            for (int j = 0; j <= n; j++) {

                // If first string is empty, only option is to

                // insert all characters of second string

                if (i == 0)

                    dp[i][j] = j; // Min. operations = j

                // If second string is empty, only option is to

                // remove all characters of second string

                else if (j == 0)

                    dp[i][j] = i; // Min. operations = i

                // If last characters are same, ignore last char

                // and recur for remaining string

                else if (str1.charAt(i - 1) == str2.charAt(j - 1))

                    dp[i][j] = dp[i - 1][j - 1];

                // If the last character is different, consider all

                // possibilities and find the minimum

                else

                    dp[i][j] = 1 + min(dp[i][j - 1], // Insert

                                       dp[i - 1][j], // Remove

                                       dp[i - 1][j - 1]); // Replace

            }

        }

        return dp[m][n];

    }

    public static void main(String args[])

    {

        String str1 = "sunday";

        String str2 = "saturday";

        System.out.println(editDistDP(str1, str2, str1.length(), str2.length()));

    }

}

 3

Time Complexity: O(m x n)
Auxiliary Space: O(m x n)

Space Complex Solution: In the above-given method we require O(m x n) space. This will not be suitable if the length of strings is greater than 2000 as it can only create 2D array of 2000 x 2000. To fill a row in DP array we require only one row the upper row. For example, if we are filling the i = 10 rows in DP array we require only values of 9th row. So we simply create a DP array of 2 x str1 length. This approach reduces the space complexity. Here is the C++ implementation of the above-mentioned problem.

// A Space efficient Dynamic Programming

// based C++ program to find minimum

// number operations to convert str1 to str2

#include <bits/stdc++.h>

using namespace std;

void EditDistDP(string str1, string str2)

{

    int len1 = str1.length();

    int len2 = str2.length();

    // Create a DP array to memoize result

    // of previous computations

    int DP[2][len1 + 1];

    // To fill the DP array with 0

    memset(DP, 0, sizeof DP);

    // Base condition when second string

    // is empty then we remove all characters

    for (int i = 0; i <= len1; i++)

        DP[0][i] = i;

    // Start filling the DP

    // This loop run for every

    // character in second string

    for (int i = 1; i <= len2; i++) {

        // This loop compares the char from

        // second string with first string

        // characters

        for (int j = 0; j <= len1; j++) {

            // if first string is empty then

            // we have to perform add character

            // operation to get second string

            if (j == 0)

                DP[i % 2][j] = i;

            // if character from both string

            // is same then we do not perform any

            // operation . here i % 2 is for bound

            // the row number.

            else if (str1[j - 1] == str2[i - 1]) {

                DP[i % 2][j] = DP[(i - 1) % 2][j - 1];

            }

            // if character from both string is

            // not same then we take the minimum

            // from three specified operation

            else {

                DP[i % 2][j] = 1 + min(DP[(i - 1) % 2][j],

                                       min(DP[i % 2][j - 1],

                                           DP[(i - 1) % 2][j - 1]));

            }

        }

    }

    // after complete fill the DP array

    // if the len2 is even then we end

    // up in the 0th row else we end up

    // in the 1th row so we take len2 % 2

    // to get row

    cout << DP[len2 % 2][len1] << endl;

}

// Driver program

int main()

{

    string str1 = "food";

    string str2 = "money";

    EditDistDP(str1, str2);

    return 0;

}

Output:

 4

Time Complexity: O(m x n)
Auxiliary Space: O( m )


Related Solutions

java programming Concepts ArrayList - Collections Sorting Enhanced For Loop Collections Class Auto-boxing Programming Assignment 1....
java programming Concepts ArrayList - Collections Sorting Enhanced For Loop Collections Class Auto-boxing Programming Assignment 1. Describe auto-boxing, including why it is useful. (Google for this one) Write a few lines of code that auto-box an int into an Integer, and un-box an Integer to an int. 2. Declare an ArrayList of Strings. Add 5 names to the collection. "Bob" "Susan" ... Output the Strings onto the console using the enhanced for loop. 3. Sort the list using the method...
implement a JavaFX program to demonstrate skills and knowledge using the following: 1.General Java programming skills...
implement a JavaFX program to demonstrate skills and knowledge using the following: 1.General Java programming skills (e.g. conditionals, branching and loops) as well as object-oriented concepts) 2. Writing JavaFX applications incl. using fxml 3. • GUI Layouts (organizing UI controls) I just need some samples and explanations.
Is List type is an interface in the Java collections framework? Classes Vector, ArrayList, and LinkedList...
Is List type is an interface in the Java collections framework? Classes Vector, ArrayList, and LinkedList are the same data structure but implement data storage in different ways. Classes, that implement Map interface in Java collections framework are used for storing what type of data? Declare and instantiate a list of elements of type String. Name this list myArray. what type of data structure is Stack? LinkedList data structure in Java collections is implemented as doubly linked lists. In PriorityQueue...
(Code in Java please) You need to implement the GarageDoor, GarageDoorUpCommand and GarageDoorDownCommand classes (similar to...
(Code in Java please) You need to implement the GarageDoor, GarageDoorUpCommand and GarageDoorDownCommand classes (similar to Light, LightOnCommand and LightOffCommand), AND (edited) Implement the CeilingFan class (with state, see p. 220), AND CeilingFanHighCommand (p. 221), CeilingFanMediumCommand, CeilingFanLowCommand, CeilingFanOffCommand (WITH Undo), AND TEST your CeilingFan classes (see pp. 222 and 223), AND Finally, implement and Test the MacroCommand class on pp. 224-227 EXAMPLE output for homework assignment…………… ======= The COMMAND PATTERN ============= guest house garage door is UP guest house garage...
Kernal Modul Programing Quenstion : Implement the following 4 conditions in kernel module programming. 1. Implement...
Kernal Modul Programing Quenstion : Implement the following 4 conditions in kernel module programming. 1. Implement a timer module using the timer function provided by the file LINUX/timer.h 2. In module_init, the timer is initialized using setup_timer and called mod_timer to start the timer. 3. Call back function my_timer_callback when timer expires. 4. When you remove a module, delete the timer.
In this programming assignment, you will implement a SimpleWebGet program for non- interactive download of files...
In this programming assignment, you will implement a SimpleWebGet program for non- interactive download of files from the Internet. This program is very similar to wget utility in Unix/Linux environment.The synopsis of SimpleWebGet is: java SimpleWebGet URL. The URL could be either a valid link on the Internet, e.g., www.asu.edu/index.html, or gaia.cs.umass.edu/wireshark-labs/alice.txt or an invalid link, e.g., www.asu.edu/inde.html. ww.asu.edu/inde.html. The output of SimpleWebGet for valid links should be the same as wget utility in Linux, except the progress line highlighted...
Please do this in java program. In this assignment you are required to implement the Producer...
Please do this in java program. In this assignment you are required to implement the Producer Consumer Problem . Assume that there is only one Producer and there is only one Consumer. 1. The problem you will be solving is the bounded-buffer producer-consumer problem. You are required to implement this assignment in Java This buffer can hold a fixed number of items. This buffer needs to be a first-in first-out (FIFO) buffer. You should implement this as a Circular Buffer...
you will create a program with Java to implement a simplified version of RSA cryptosystems. To...
you will create a program with Java to implement a simplified version of RSA cryptosystems. To complete this project, you may follow the steps listed below (demonstrated in Java code) to guide yourself through the difficulties. Step I Key-gen: distinguish a prime number (20 pts) The generation of RSA's public/private keys depends on finding two large prime numbers, thus our program should be able to tell if a given number is a prime number or not. For simplicity, we define...
java programming You will be given two interfaces and two abstract classes, FileTextReader, FileTextWriter, AbstractFileMonitor, and...
java programming You will be given two interfaces and two abstract classes, FileTextReader, FileTextWriter, AbstractFileMonitor, and AbstractDictionary. Your job is to create two classes the first class should be named FileManager, the second class should be named Dictionary. The FileManager will implement the interfaces FileTextReader and FileTextWriter and extend the clas AbstractFileMonitor. Your class signature would look something like the following: public class FileManager extends AbstractFileMonitor implements FileTextReader, FileTextWriter{... The constructor signature of the FileManager should look like the following:...
java programming You will be given two interfaces and two abstract classes, FileTextReader, FileTextWriter, AbstractFileMonitor, and...
java programming You will be given two interfaces and two abstract classes, FileTextReader, FileTextWriter, AbstractFileMonitor, and AbstractDictionary. Your job is to create two classes the first class should be named FileManager, the second class should be named Dictionary. The FileManager will implement the interfaces FileTextReader and FileTextWriter and extend the clas AbstractFileMonitor. Your class signature would look something like the following: public class FileManager extends AbstractFileMonitor implements FileTextReader, FileTextWriter{... The constructor signature of the FileManager should look like the following:...
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT