In: Computer Science
The purpose of this problem is to gain familiarity with the dictionary ADT. A homophone is one of two or more words that are pronounced alike but are different in meaning or spelling; for example, the words “two”, “too”, and “to”.
Write a Java program that uses one of the implementations of the dictionary interface (UALDictionary or OALDictionary on Moodle) to find the word that has the most homophones.
Do not use data structures that we have not covered so far in the course. The file “cmudict.0.7a.txt” in the Dictionaries folder on Moodle contains a pronunciation dictionary downloaded from
http://www.speech.cs.cmu.edu/cgi-bin/cmudict
The page also contains a detailed description of the pronunciation dictionary. The file consists of lines of the form
ABUNDANT AH0 B AH1 N D AH0 N T
The first string is the word, which is followed by one or more phonemes (or phones) that describe the pronunciation of the word. There are 39 phonemes occurring in North American English that are used in the dictionary. The collection of 39 symbols is known as the Arpabet, for the Advanced Research Projects Agency (ARPA), which developed it in the 1970’s in connection with research on speech understanding.
The Dictionaries directory on Moodle contains several files you can use. UALDictionary.java is a class that implements the dictionary interface using an unordered array list. The array list stores (key, value) pairs as described in the class KVpair.java. Pronunciation.java is a class that stores and manages access to a (word, phonemes) pair.
There is also a program Homophones.java that shows how you can read in the cmu dictionary (skipping comments and so on). You can modify that program so that when you read in a pronunciation entry you store it in an appropriate dictionary. Call your program MostHomophones.
The output is a first line containing a single integer n, which is the largest number of homophones. The n homophones follow on the next n lines, one word per line.
UAL Dictionary
import java.util.ArrayList;
import java.util.Collections;
class UALDictionary<Key extends Comparable, E> implements Dictionary<Key, E> {
private static final int defaultSize = 10;
private ArrayList<KVpair<Key, E>> list;
// Constructors
UALDictionary() {
this(defaultSize);
}
UALDictionary(int sz) {
list = new ArrayList<KVpair<Key, E>>(sz);
}
public void clear() {
list.clear();
}
/** Insert an element: append to list */
public void insert(Key k, E e) {
KVpair<Key, E> temp = new KVpair<Key, E>(k, e);
list.add(temp);
}
/** Remove element with key k, return element */
public E remove(Key k) {
E temp = find(k);
if (temp != null)
list.remove(new KVpair<Key, E>(k, temp));
return temp;
}
/** Remove any element */
public E removeAny() {
return list.remove(list.size()-1).value();
}
/**
* Find k using sequential search
*
* @return Record with key value k
*/
public E find(Key k) {
for (KVpair<Key, E> t : list)
if (k.compareTo(t.key())==0)
return t.value();
return null; // "k" does not appear in dictionary
}
public Iterable<E> findAll(Key k) {
ArrayList<E> al = new ArrayList<E>();
for (KVpair<Key, E> t : list)
if (k.compareTo(t.key()) == 0)
al.add(t.value());
return al;
}
public int size(){
return list.size();
}
/** Returns an iterable collection of the dictionary. */
public Iterable<E> values() {
ArrayList<E> elements = new ArrayList<E>(list.size());
for(KVpair<Key, E> t : list)
elements.add(t.value());
return elements;
}
}
Homophones.java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;
public class Homophone {
public static void main(String[] args) {
UALDictionary<String, Pronunciation> PDict = new UALDictionary<String, Pronunciation>();
File file = new File("cmudict.0.7a.txt");
final int len = 5; // we start with words of length 5 characters
long start = System.currentTimeMillis();
try {
Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
if (line.substring(0, 3).equals(";;;"))
continue; // skip comment lines
Pronunciation p = new Pronunciation(line);
if ((p.getWord().length() < len - 1)
|| (p.getWord().length() > len))
continue;
if ((p.getWord().length() == len - 1)
|| (p.getWord().length() == len))
PDict.insert(p.getWord(), p);
}
scanner.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
long middle = System.currentTimeMillis();
System.out.println("Loaded dictionary.");
for (Pronunciation p : PDict.values()) {
String w = p.getWord();
if (w.length() == len) {
/* Fill in code to determine if this word
is pronounced the same if we remove the
first letter or if we remove the second.
*/
}
}
long end = System.currentTimeMillis();
System.out.println("Run times: load dictionary= " + (middle - start)
+ " process= " + (end - middle) + " total= " + (end - start));
}
}
Please provide comments if possible, I will Thumbs Up for the help!
:: Solution ::
class Soundex{
private static int getConsonantCode( char ch ){
String codeList[] = { "BFPV", "CGJKQSXZ","DT","L","MN","R" };
int code = 0;
for( int i = 0 ; i < codeList.length ; i++ ){
if( codeList[i].indexOf(ch) >= 0 ) {
code = i+1;
}
}
return code;
}
private static boolean isVowel( char ch ){
return (new String("AEIOUaeiou")).indexOf(ch) >= 0 ;
}
public static String getSoundexCode( String str ){
str=str.toUpperCase();
String soundexCode = "" + str.charAt(0), temp="";
int length = str.length();
char curr, prev, next;{ }
String dropList = "AEIOUYHW";
for( int i=1 ; i< length ; i++ ){
curr = str.charAt(i);
prev = str.charAt( i-1 );
if( ( curr=='H' || curr == 'W') && i != length-1 ){
if( temp.length() >= 2) temp=temp.substring(1);
next=str.charAt( i+1 );
if( isVowel(curr) && getConsonantCode( prev ) == getConsonantCode( next ) ){
temp += prev+prev;
i=i+1;
}else if( getConsonantCode( prev ) == getConsonantCode( next ) ){
temp += prev;
i=i+1;
}
}else if( getConsonantCode( curr ) != getConsonantCode( prev ) ){
if( dropList.indexOf( curr ) == -1 ){
temp += curr;
}
}
}
temp = ( temp + "0000" ).substring( 0, 3 );
for( int i = 0; i<=2 ; i++ ){
soundexCode += getConsonantCode( temp.charAt( i ) );
}
return soundexCode;
}
public static void main( String []args ){
System.out.println( getSoundexCode( "Ashcraft" )+" "+"A261" );
System.out.println( getSoundexCode( "Ashcroft" )+" "+"A261" );
System.out.println( getSoundexCode( "Gauss" )+" "+"G200" );
System.out.println( getSoundexCode( "Ghosh" )+" "+"G200" );
System.out.println( getSoundexCode( "Hilbert" )+" "+"H416" );
System.out.println( getSoundexCode( "Heilbronn" )+" "+"H416" );
System.out.println( getSoundexCode( "Lee" )+" "+"L000" );
System.out.println( getSoundexCode( "Lloyd" )+" "+"L300" );
System.out.println( getSoundexCode( "Moses" )+" "+"M220" );
System.out.println( getSoundexCode( "Pfister" )+" "+"P236" );
System.out.println( getSoundexCode( "Robert" )+" "+"R163" );
System.out.println( getSoundexCode( "Rupert" )+" "+"R163" );
System.out.println( getSoundexCode( "Rubin" )+" "+"R150" );
System.out.println( getSoundexCode( "Tymczak" )+" "+"T522" );
System.out.println( getSoundexCode( "Soundex" )+" "+"S532" );
System.out.println( getSoundexCode( "Example" )+" "+"E251" );
}
}
Please rate my answer. Thank you...