Question

Using Python

  • The script is to open a given file. The user is to be asked what the name of the file is. The script will then open the file for processing and when done, close that file.
  • The script will produce an output file based on the name of the input file. The output file will have the same name as the input file except that it will begin with "Analysis-". This file will be opened and closed by the script.
  • The script is to process the file and calculate the following information and output the information to the file in the order presented here.
  • The script is to count the number of lines, the number of words, and the number of characters in the file and write this information to the output file with appropriate labels so the reader knows what the numbers are. This information is to be echoed on the screen to the user.
    • You may find it easier to determine the number of words if you remove the punctuation, digits, and other non-letter characters other than spaces before trying to count the words. Those items are not considered to be part of a word. Keep that in mind when referencing words in the following instructions.
    • Count spaces, digits, punctuation, and other non-letter characters as characters, though.
  • The script is to produce a list of all unique words in the file and the number of times each word appears in the file. This list with frequency counts is to be put in the output file in alphabetical order, one word/frequency pair to a line. The format should be word (frequency count). Be sure there is a space between the word and the opening parenthesis. You will count words that appear only once. Due to the possible length of this list, you are not to echo this list to the screen, only place it in the output file.
  • The script is to produce a list of two-word pairs found in the file that appear more than once. If a two-word pair appears only once, it is not to be put into the output file. The format of the line in the output file should be the two-word pair followed by the frequency count in parentheses, as in the previous item involving unique words. This list is written after the single-word list. There is to be a heading for the list to let the user know that the information is changing, with a blank line before the heading. This information is to be echoed on the screen to the user.
  • The last bit of information the script is to place into the output file is the total number of words, the average length of a word, the number of unique words, the average number of letters in the unique words, and the number of word pairs that have frequencies of 2 or more. Properly label each item of information output in this section, place a blank line before the section, and give the section a heading. This information is to be echoed on the screen to the user.
    • It is fully conceivable that the average number of letters in a word (length of a word) for the overall document is different from the average number of letters in a word for the unique word list. This is because a word such as "the" might appear multiple times in the file. In the first calculation, each instance of "the" is counted. In the second calculation, the word "the" is counted only once on the list.
  • The script is to use solid programming practices such as comments, self-documenting variable names (also known as meaningful variable names), and neat, easy-to-read code.
  • You are to place a comment block at the very top of the script containing your name, the semester, the due date of the exam, and the instructor's name each on separate lines.
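
The note above about the two averages can be checked on a small sample. The following is only a sketch under stated assumptions: the sample sentence is made up for illustration, words are taken to be runs of ASCII letters, and the comparison is case-sensitive.

```python
import re
from collections import Counter

# Hypothetical sample text, not taken from the assignment's test data file.
sample = "the quick fox and the lazy dog, and the elephant!"

# Replace every run of non-letter characters with a space, then split into words.
words = re.sub(r"[^A-Za-z]+", " ", sample).split()
counts = Counter(words)  # one key per unique word

# Average over every occurrence: "the" contributes three times.
avg_all = sum(len(w) for w in words) / len(words)

# Average over the unique word list: "the" contributes once.
avg_unique = sum(len(w) for w in counts) / len(counts)

print(len(words), avg_all)               # 10 words, average 3.8
print(len(counts), round(avg_unique, 2))  # 7 unique words, average 4.14
```

Here the short words "the" and "and" repeat, so they pull the overall average below the unique-word average.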

The logic is built to examine and process incoming data for specific items of information. This may need to be done in a specific order with multiple processing steps.

You are to run your script on this test data file. Take screenshots of your interactions with the user for your submission document. Then place your analysis file, your Python code file, and your submission document into a single zip file.

Some advice: since you have the test data file, you can do these calculations by hand and check them against your analysis file to see if your program is working correctly.
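
One way to follow this advice is to start from a sample small enough to count by hand. The sketch below is only illustrative: the helper name basic_counts and the two sample lines are made up, and it assumes that newline characters are not counted as characters and that words are runs of ASCII letters.

```python
import re

def basic_counts(lines):
    """Return (line count, word count, character count) for a list of lines."""
    line_count = len(lines)
    # Assumption: the line terminator itself is not counted as a character.
    char_count = sum(len(line.rstrip("\n")) for line in lines)
    # Strip everything that is not a letter before counting words.
    word_count = sum(
        len(re.sub(r"[^A-Za-z]+", " ", line).split()) for line in lines
    )
    return line_count, word_count, char_count

# Counted by hand: 2 lines, 4 words, 13 + 12 = 25 characters.
sample_lines = ["Hello, world!\n", "Hello again.\n"]
result = basic_counts(sample_lines)
print(result)  # (2, 4, 25)
```

If the numbers from the script disagree with the hand count, the small sample makes it easy to see which rule (punctuation, spaces, or line endings) is being handled differently.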

Solutions

Expert Solution

ANSWER :--

CODE :--

# Name:        <your name>
# Semester:    <semester>
# Due date:    <exam due date>
# Instructor:  <instructor's name>

import os
import re
from collections import Counter


def read_in_file(fname):
    """Return the file's lines as a list, or None if the file cannot be opened."""
    try:
        with open(fname, 'r') as f:  # the with block closes the file when done
            return f.readlines()
    except OSError:
        return None


def main():
    while True:
        fname = input("Enter the name of the file ==> ")
        data = read_in_file(fname)
        if data is None:
            print("Could not find the file specified. " + fname + " not found")
        else:
            break

    line_no = len(data)  # no. of lines = length of the list
    char_no = 0
    words = []       # every word in the file, with punctuation and digits removed
    word_pairs = []  # every pair of consecutive words on the same line
    for line in data:
        # spaces, digits and punctuation count as characters; the newline does not
        char_no += len(line.rstrip('\n'))
        # replace every run of non-letter characters with a space, then split into words
        line_words = re.sub(r'[^A-Za-z]+', ' ', line).split()
        words.extend(line_words)
        for i in range(len(line_words) - 1):
            word_pairs.append(line_words[i] + " " + line_words[i + 1])

    words_no = len(words)
    word_counts = Counter(words)       # maps each unique word to its frequency
    pair_counts = Counter(word_pairs)  # maps each two word pair to its frequency
    unique_no = len(word_counts)
    letter_no = sum(len(w) for w in words)               # letters over every occurrence
    unique_letter_no = sum(len(w) for w in word_counts)  # letters over unique words only
    # guard against dividing by zero when the file contains no words
    avg_len = letter_no / words_no if words_no else 0
    avg_unique_len = unique_letter_no / unique_no if unique_no else 0
    repeated_pairs = sorted(p for p in pair_counts if pair_counts[p] > 1)

    counts = ["No. of lines : " + str(line_no),
              "No. of words : " + str(words_no),
              "No. of chars : " + str(char_no)]
    stats = ["Total no. of words : " + str(words_no),
             "Average length of a word : " + str(avg_len),
             "Total no. of unique words : " + str(unique_no),
             "Average length of unique words : " + str(avg_unique_len),
             "No. of repeated two word pairs : " + str(len(repeated_pairs))]

    # name the output after the input file, keeping any directory part in place
    folder, base = os.path.split(fname)
    with open(os.path.join(folder, 'Analysis-' + base), 'w') as f:
        for text in counts:
            f.write(text + "\n")
            print(text)

        # the unique word list goes to the file only; it is not echoed to the screen
        f.write("\nUnique words and their frequencies:-\n")
        for word in sorted(word_counts, key=str.lower):
            f.write(word + " (" + str(word_counts[word]) + ")\n")

        f.write("\nRepeated two word pairs and their frequencies:-\n")
        print("\nRepeated two word pairs and their frequencies:-")
        for pair in repeated_pairs:
            text = pair + " (" + str(pair_counts[pair]) + ")"
            f.write(text + "\n")
            print(text)

        f.write("\nWord statistics:-\n")
        print("\nWord statistics:-")
        for text in stats:
            f.write(text + "\n")
            print(text)


if __name__ == "__main__":
    main()
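
The pair-counting step can also be written more compactly with zip. The following is a minimal sketch, not the posted solution's method: the helper name repeated_pairs and the sample sentence are made up, and it assumes the words have already been stripped of punctuation and digits. Unlike the script above, it pairs words across the whole list rather than line by line.

```python
from collections import Counter

def repeated_pairs(words):
    """Return {"word1 word2": count} for adjacent pairs seen more than once."""
    # zip(words, words[1:]) yields each word together with its successor.
    pair_counts = Counter(zip(words, words[1:]))
    return {
        " ".join(pair): n
        for pair, n in sorted(pair_counts.items())
        if n > 1
    }

# Hypothetical sample, already free of punctuation.
sample_words = "the cat sat on the mat and the cat sat down".split()
pairs = repeated_pairs(sample_words)
for pair, n in pairs.items():
    print(pair + " (" + str(n) + ")")
# cat sat (2)
# the cat (2)
```

Keeping each pair as a tuple until the final join avoids any ambiguity about where one word ends and the next begins.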
