In: Computer Science
Problem 2: Two words are said to be anagrams if one can be formed by permuting the letters of the other. For example: "pots", and "stop" are anagrams. An anagram chain is a list of words that are all anagrams to each other. The shortest anagram chain has length two. We are interested in calculating the length of the longest anagram chain in a given list of words. For example, the following nine words: rates, pots, tops, along, aster, stop, stare, tears, and long has two anagram chains where the longest includes the four words: rates, aster, stare, and tears.
Input
Your program will be tested on a number of test cases. The first line of the input file ("anagram.in") contains an integer D representing the number of test cases in the input file.
Each test case contains one or more words, but no more than 20000 words, with no duplicates. Each word appears on a separate line. All words are in small letters, and in no particular order. No word will be longer than 10 characters. Each test case ends with a string made of one or more '-' characters.
Output
For each test case, write, on a separate line, the length (number of words) of the longest anagram chain found in the given list of words. The output should be printed to the screen.
Sample Input Sample Output:
2
rates 4
pots
tops 2
along
aster
stop
stare
tears
long
------
north
fresher
refresh
thorn
bye
--
Anagrams can be detected on the basis of the fact that they have the same characters the same number of times. Only the order is different. What is we made that order irrelevant? What if we shuffle all letters of a word in a fixed way? For example what if for each word we arrange the letters on the basis of alphabetical order. Then words which are anagrams would return in the same string.
Example - (cats, dogs, scat)
If we sort these three strings individually we get (acst, dgos, acst). Hence anagram words result in same strings.
Also, now if we sort this list of strings we would get (acst, acst, dogs) i.e. we would get all anagram words together which makes it easier for us to count.
This is the logic behind the algo we will apply.
The algo is as follows (after which i've implemented a C++ code)
Another method can be to store words in a data structure like SET / MAP (where a value can be inputted only once) and a count of words i.e. if a word is not present in set then put it in set and make its count 1. If a word is already present in set, increase it's count. Then just find the maximum count and that is our example.
Ex - (cats, dogs, scat, tacs, sogd)
we read cats, sort it to acst and push in set because not present before and its count is 1 now.
we read dogs, sort it to dgos and push it to set because not present before and its count is 1 now.
we read scat, sort it to acst and do not push in set because already present, only count increased to 2.
we read tacs, sort it to acst and do not push in set because already present, only count increased to 3.
we read sogd, sort it to dgos and do not push in set because already present, only count increased to 2.
maximum count is 3 hence answer is 3.
this method saves times and space both because we dont need to store all words, only one occurence of a word and we dont need to sort the whole list because we only have one occurence of a word.
Code for method 1 :
CODE FOR METHOD 2 :
first 3 steps and step 5-6-7-8-9 are same as previous method.
Hope this helped.