In: Computer Science
3. Write a Java program that discover all anagrams of all wordslisted in the input file, “dict.txt”. An anagram of a work is a rearrangement of its letters into a new legal word. Your program should do the following:
a. Read in the given “dict.txt” file and sort it in each word’s canonical form. The canonical form of a word contains the same letters as the original word, but in sorted order
b. Instead of putting the “dict.txt” in the code, ask the user to enter the file name
c. Retrieve a word’s canonical form and write a Comparator to compare words by using their canonical forms.
Here is the completed code for this problem. Comments are included, go through it, learn how things work and let me know if you have any doubts or if you need anything to change. If you are satisfied with the solution, please rate the answer. If not, PLEASE let me know before you rate, I’ll help you fix whatever issues. Thanks
//FindAnagrams.java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
public class FindAnagrams {
// helper method to get a word in sorted order
static String getCanonicalForm(String word) {
// converting word to lower case and then to a char array
char chars[] = word.toLowerCase().toCharArray();
// sorting chars array
Arrays.sort(chars);
// creating and returning a new word contains chars in sorted order
word = new String(chars);
return word;
}
public static void main(String[] args) throws FileNotFoundException {
// creating a Comparator to compare two words by their canonical form
// please note that this comparator is never used anywhere as there is
// no need for that in our program, because we follow a different,
// simple approach. I'm writing this only because the question ask us
// to.
Comparator<String> comparator = new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
// converting s1 and s2 to cannonical forms
s1 = getCanonicalForm(s1);
s2 = getCanonicalForm(s2);
// comparing s1 and s2, returning comparison result
return s1.compareTo(s2);
}
};
// example usage: comparator.compare("post", "stop") will return 0 since
// post and stop are anagrams
// creating a map to store anagrams. here, key is the word in canonical
// form, values are a set of unique words with that canonical form (when
// sorted), or in other words, values are anagrams found with same
// canonical form.
TreeMap<String, Set<String>> anagrams = new TreeMap<String, Set<String>>();
// using a scanner, asking and reading a file name
Scanner sc = new Scanner(System.in);
System.out.print("Enter the input file name: ");
String filename = sc.nextLine();
// reinitializing scanner to read from file. if file is not present,
// this will throw exception
sc = new Scanner(new File(filename));
String word;
// looping through each word on the file
while (sc.hasNext()) {
word = sc.next();
// getting cannonical form of word
String sorted = getCanonicalForm(word);
// checking if this sorted word is on anagrams map as a key
if (!anagrams.containsKey(sorted)) {
// nope! we add to anagrams map with key=sorted and value=empty
// set. here set is used to preserve only unique values. TreeSet
// will keep the words in alphabetical order.
anagrams.put(sorted, new TreeSet<String>());
}
// now adding original word to the set associated with key=sorted on
// anagrams map
anagrams.get(sorted).add(word);
}
// closing file
sc.close();
// now looping through the map and displaying each canonical form and
// the corresponding anagrams found.
// note: if the input file is huge, then there will be a lot of lines of
// output, in which case I suggest you to write the results to an output
// file instead of console.
System.out.printf("%-20s %s\n", "CANONICAL FORM", "ANAGRAMS");
for (String str : anagrams.keySet()) {
// displaying cannonical form and the set of anagrams associated
// with it
System.out.printf("%-20s %s\n", str, anagrams.get(str));
}
}
}
/*PARTIAL OUTPUT (for some dict.txt)*/
Enter the input file name: dict.txt
CANONICAL FORM ANAGRAMS
a [a]
aa [aa]
aaa [aaa]
aaaablm [alabama]
aaaacdgmrs [madagascar]
aaabbrr [barbara]
aaabcehillpt [alphabetical]
aaabdesst [databases]
aaabdest [database]
aaabdmorss [ambassador]
aaabeijnrz [azerbaijan]
...
abelst [stable, tables]
abeltt [battle, tablet]
abest [beast, beats]
abt [bat, tab, tba]
...
acdeilm [claimed, decimal, medical]
...
aefrs [fares, fears, safer]
...
deit [diet, edit, tide, tied]