Question

A file concordance tracks the unique words in a file and their frequencies. Write a program that displays a concordance for a file. The program should output the unique words and their frequencies in alphabetical order. Variations are to track sequences of two words and their frequencies, or n words and their frequencies.

Solutions

Expert Solution

library.py

import os

def getLines(fileName):
    """ getLines validates the given fileName.
        Returns the full text of a valid, non-empty file, else an empty string. """
    lines = ""
    if fileName and os.path.exists(fileName):
        if os.path.isfile(fileName):
            # The with-block closes the file even if read() raises
            with open(fileName, 'r') as file:
                lines = file.read()
            if len(lines) == 0:
                print("<" + fileName + "> is an empty file!", end="\n\n")
        else:
            print("<" + fileName + "> is not a file!", end="\n\n")
    else:
        # str() keeps the message safe when fileName is None
        print("<" + str(fileName) + "> doesn't exist, try again!", end="\n\n")
    return lines

concordance.py

from library import getLines

# List of English punctuation symbols
# Reference: took as many punctuation symbols as possible from https://en.wikipedia.org/wiki/Punctuation_of_English
# NOTE: The apostrophe is excluded from the list, because a word with an apostrophe
#       and one without it (e.g. "it's" vs. "its") are always distinct words.
punctuations = ["[", "]", "(", ")", "{", "}", "<", ">",
         ":", ";", ",", "`", "\"", "-", ".",
         "|", "\\", "?", "/", "!", "_", "@",
         "#", "$", "%", "^", "&", "*", "+", "~", "="]

def stripPunctuation(data):
    """ Strip Punctuations from the given string. """
    for punctuation in punctuations:
        data = data.replace(punctuation, " ")
    return data

def display(wordsDictionary):
    """ Display sorted dictionary of words and their frequencies. """
    noOfWords = 0
    print("-" * 42)
    print("| %20s | %15s |" % ("WORDS".center(20), "FREQUENCY".center(15)))
    print("-" * 42)
    for word in sorted(wordsDictionary):
        noOfWords += 1
        print("| %-20s | %15s |" % (word, str(wordsDictionary.get(word)).center(15)))
        # Halt every 20 words (configurable)
        if noOfWords % 20 == 0:
            print("\n" * 2)
            input("PRESS ENTER TO CONTINUE ... ")
            print("\n" * 5)
            print("-" * 42)
            print("| %20s | %15s |" % ("WORDS".center(20), "FREQUENCY".center(15)))
            print("-" * 42)
    print("-" * 42)
    print("\n" * 2)

def prepareDictionary(words):
    """ Prepare dictionary of words and count their occurrences. """
    wordsDictionary = {}
    for word in words:
        # Count case-insensitively by keying on the lowercase version;
        # get() returns 0 on the first occurrence
        key = word.lower()
        wordsDictionary[key] = wordsDictionary.get(key, 0) + 1
    return wordsDictionary

def main():
    """ Main method """
    print("\n" * 10)
    print("Given a file name, the program will find unique words and their occurrences!", end="\n\n")
    input("Press ENTER to start execution ... \n")

    # To store all the words and their frequencies
    wordsDictionary = {}
    lines = ""
    # Get valid input file
    while (len(lines) == 0):
        fileName = input("Enter the file name (RELATIVE ONLY and NOT ABSOLUTE): ")
        print("\n\n" * 1)
        lines = getLines(fileName)
    # Get all words by removing all punctuation
    words = stripPunctuation(lines).split()
    # Prepare the words dictionary
    wordsDictionary = prepareDictionary(words)
    # Display words dictionary
    display(wordsDictionary)

# Starting point: run main() only when this file is executed as a script
if __name__ == "__main__":
    main()
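
The question also mentions variations that track sequences of two words, or n words, and their frequencies; the solution above counts single words only. A minimal sketch of that variation follows. The function name ngram_concordance and the use of collections.Counter are assumptions made for illustration and are not part of the original solution. It expects a list of words with punctuation already removed, such as the result of stripPunctuation(lines).split().

```python
from collections import Counter

def ngram_concordance(words, n=2):
    """Count every run of n consecutive words, ignoring case.

    Hypothetical helper: `words` is a list of tokens with punctuation
    already stripped. Returns a Counter keyed by the space-joined phrase.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    lowered = [word.lower() for word in words]
    # Zipping n staggered slices yields each window of n adjacent words;
    # zip stops at the shortest slice, so short inputs give no windows.
    windows = zip(*(lowered[i:] for i in range(n)))
    return Counter(" ".join(window) for window in windows)

if __name__ == "__main__":
    sample = "the cat saw the cat and the dog".split()
    # Print two-word sequences in alphabetical order with their counts
    for phrase, count in sorted(ngram_concordance(sample, 2).items()):
        print("%-20s %d" % (phrase, count))
```

With n=1 this reduces to the single-word count, so the returned Counter could be passed to display() in place of the dictionary built by prepareDictionary, although phrases longer than 20 characters would overflow the table column.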
