Question

Please write in Python (topic: word frequencies).

Method/Function: List<Token> tokenize(TextFilePath)
Write a method/function that reads in a text file and returns a list of the tokens in that file. For the purposes of this project, a token is a sequence of alphanumeric characters, independent of capitalization (so Apple, apple, aPpLe are the same token). You are allowed to use regular expressions if you wish to (and you can use some regexp engine, no need to write it from scratch), but you are not allowed to import a tokenizer (e.g. from NLTK), since you are being asked to write a tokenizer.

Method:  Map<Token,Count> computeWordFrequencies(List<Token>)
Write another method/function that counts the number of occurrences of each token in the token list. Remember that you should write this assignment yourself from scratch so you are not allowed to import a counter when the assignment asks you to write that method.

Method: void print(Frequencies<Token, Count>)
Finally, write a method that prints the word frequency counts to the screen. The printout should be ordered by decreasing frequency (so the highest-frequency words come first).

Print the output in this format:

<token> -> <freq>

Please include some notes explaining the code. Thanks!

Solutions

Expert Solution

# 1) Reads each line character by character, using the
# commas as separators to pick out the individual tokens.

# Create a list to store the numbers that are read in
numbers = list()

# Open the input text file for reading; the with statement
# closes the file automatically when the block ends
with open('numbers.txt', 'r') as dataFile:
    # Loop through each line of the input data file
    for eachLine in dataFile:
        # Set up a temporary variable
        tmpStr = ''
        # Loop through each character in the line
        for char in eachLine:
            # Check whether the char is a digit
            if char.isdigit():
                # If it is a digit, add it to tmpStr
                tmpStr += char
            # If a comma is found and tmpStr has a value,
            # append it to the numbers list and reset tmpStr
            elif char == ',' and tmpStr != '':
                numbers.append(int(tmpStr))
                tmpStr = ''
        # If tmpStr still contains a number at the end of
        # the line, add it to the numbers list
        if tmpStr.isdigit():
            numbers.append(int(tmpStr))

# Print the number list
print(numbers)

# 2) Uses the string method split to break each line from
# the file into a list of substrings
numbers = list()

with open('C:\\PythonCourse\\unit3\\numbers.txt', 'r') as dataFile:
    for eachLine in dataFile:
        # Simplify the script by using a Python built-in
        # method to separate the tokens
        substrs = eachLine.split(',')
        # Iterate through the substrings and check that they
        # are numbers before adding them to the numbers list.
        # strip() removes the trailing newline and any spaces,
        # which would otherwise make isdigit() return False.
        for strVar in substrs:
            strVar = strVar.strip()
            if strVar.isdigit():
                numbers.append(int(strVar))

print(numbers)
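
The question itself asks for a tokenizer, a frequency counter, and a sorted printout of words. A minimal sketch of those three functions, under stated assumptions, might look like the following. The names compute_word_frequencies and print_frequencies are hypothetical stand-ins for computeWordFrequencies and print (renamed so that Python's built-in print is not shadowed). Tokens are assumed to be runs of ASCII letters and digits, the file is assumed to be UTF-8, and ties in frequency are broken alphabetically, which the question does not specify.

```python
import re


def tokenize(text_file_path):
    """Return a list of lowercase alphanumeric tokens read from a text file."""
    tokens = []
    # Assumption: the file is UTF-8 text; change the encoding if it is not.
    with open(text_file_path, 'r', encoding='utf-8') as text_file:
        for line in text_file:
            # Lowercasing first makes Apple, apple and aPpLe the same token.
            # [a-z0-9]+ matches runs of ASCII letters and digits, so
            # punctuation and whitespace act as separators.
            tokens.extend(re.findall(r'[a-z0-9]+', line.lower()))
    return tokens


def compute_word_frequencies(tokens):
    """Count each token with a plain dict (no imported counter)."""
    frequencies = {}
    for token in tokens:
        # get() returns 0 the first time a token is seen.
        frequencies[token] = frequencies.get(token, 0) + 1
    return frequencies


def print_frequencies(frequencies):
    """Print '<token> -> <freq>' lines, highest frequency first."""
    # Sort by descending count; equal counts are ordered alphabetically
    # (an assumption, since the question leaves tie order open).
    ordered = sorted(frequencies.items(), key=lambda pair: (-pair[1], pair[0]))
    for token, count in ordered:
        print(token + ' -> ' + str(count))
```

Notes on the sketch: the three steps chain together as print_frequencies(compute_word_frequencies(tokenize(path))), where path is whatever text file you want to analyze. Reading line by line keeps memory use low for large files, and the dictionary gives one lookup per token, so counting is linear in the number of tokens.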

