Question

In: Computer Science

Python is a great tool for automating a number of tasks, including scanning and summarizing the...

Python is a great tool for automating a number of tasks, including scanning and summarizing the files in a file system.

Write a Python class, FileAnalyzer that given a directory name, searches that directory for Python files (i.e. files ending with .py). For each .py file, open each file and calculate a summary of the file including:

  • the file name
  • the total number of lines in the file
  • the total number of characters in the file
  • the number of Python functions (lines that begin with ‘def ’, including methods inside class definitions)
  • the number of Python classes (lines that begin with ‘class ’)

Generate a summary report with the directory name, followed by a tabular output that looks similar to the following:

The class FileAnalyzer should have at least these attributes and these methods listed below:

  • Attributes:
    • self.files_summary: dict, a Python dictionary that store the summarized data for the given file path. The keys of dictionary will be the filename of a python file, the value of each key will be a dict as well. The value dictionary will have 4 key-value pairs that represent 4 data points we collect for each file. The structure of self.files_summarywill look like below:
{
    'filename_0.py': {
        'class': # number of classes in the file
        'function': # number of functions in the file
        'line': # number of lines in the file
        'char': # number of characters in the file
    }, ...
}
  • Methods:
    • self.analyze_files(self): a method that populate the summarized data into self.files_summary. Note that you MUST NOT pass any argument to this method, the given directory path is better stored as an attribute - there will be no naming requirement for this variable but I personally use self.directory.
    • self.pretty_print(self): a method that print out the pretty table from the data stored in the self.files_summary. Note that you MUST NOT pass any argument, other than self, to this method.

Note: Your should execute the self.analyze_files in the self.__init__ but you MUST NOTexecute the self.pretty_print in the self.__init__, you will get 10 points deducted if you do so.

Keep in mind that you may see a file, but you may not be able to read it. In that case, raise a FileNotFound exception if the specified file can’t be opened for reading

Your program should use exceptions to protect the program against unexpected events.

The PrettyTable module (Links to an external site.) provides an easy way to create tables for output. You should use PrettyTable to format your table. See the lecture notes for an example.

You’ll need to think about how to test this program with unittest. You do not need to automatically verify the output from the table but you should validate the various numeric values, e.g. number of classes, functions, lines, and characters. Your automated test should validate the counts for each file in the test directory.

Hints:

  1. Python’s os module has several helpful functions, e.g. os.listdir(directory) lists all of the files in the specified directory. Note that the file names returned by os.listdir do NOTinclude the directory name.
  2. os.chdir(directory) will change the current directory to the specified directory when your program runs.
  3. Be sure to handle the cases where the user enters an invalid directory name
  4. Test your code against the sample files in Canvas
  5. Be sure to use unittest to test your program and follow the PEP-8 coding standards.

Solutions

Expert Solution

Pre-requisites:

1) Python 3.x is used to execute the code

2) Python modules os, re and prettytable are installed in the python version been used.

CODE:

OUTPUT:

# ./fileanalyze.py

CODE to copy:

import os

import re

from prettytable import PrettyTable

class FileAnalyzer:

        def __init__(self):

                # declare the global dictionary to store the values

                self.files_summary = {}

                # directory to scan files for

                self.directory = "/home/vsaurabh/analyzer"

                # call the method to start scanning the files

                self.analyze_files()

        def analyze_files(self):

                # walk through the given directory

                for root, dir, files in os.walk(self.directory):

                        # check for each file

                        for file in files:

                                # check if the given file ends wiht .py extension

                                if file.endswith(".py"):

                                        # get the absolute path for the file

                                        myfile = root + "/" + file

                                      # declare required variables

                                        characters = 0

                                        lines = 0

                                        classes = 0

                                        definition = 0

                                        try:

                                                # read the file for requried data

                                                with open(myfile, 'r') as frb:

                                                        for line in frb:

                                                                # increment the no of line count

                                                                lines = lines + 1

                                                                # increment the no of characters count

                                                                characters = characters + len(line)

                                                                # check if given line is a class directive

                                                                if line.startswith("class"):

                                                                        # if yes, increment the counter

                                                                        classes = classes + 1

                                                                # check if given line is a function directive

                                                                if re.match('\s.*def|def', line):

                                                                        # if yes, increment the counter

                                                                        definition = definition + 1

                                                # if the file scanned is not found as a key in dictionary

                                                # add the filename as a key into the dictionary                                               if file not in self.files_summary:

                                                        # declare a nested dictionary for the respective key

                                                        self.files_summary[file] = {}

                                                # add values for the other paramters required to be stored

                                                self.files_summary[file]['class'] = classes

                                                self.files_summary[file]['function'] = definition

                                                self.files_summary[file]['line'] = lines

                                                self.files_summary[file]['char'] = characters

                                        except (FileNotFoundError):

                                                # Error is a filename is not found

                                                print("Wrong file or file path: %s" %myfile)

                                        except PermissionError:

                                                # Raise in case the file is not accessible

                                                print("Cannot open file : %s" %myfile)

                # call the function to print the dictionary

                self.pretty_print()

        def pretty_print(self):

                # declare a pretty table object

                x = PrettyTable()

                # add headers to the table

              x.field_names = ["Filename", "classes", "definition", "lines", "characters"]

                # for each key, value pair in dictionary add a row in the table

                for key,value in self.files_summary.items():

                        data = []

                        data.extend([key,value["class"],value["function"],value["line"],value["char"]])

                        x.add_row(data)

                # print the final table

                print(x)

if __name__ == "__main__":

        # instantiate the class to trigger _init_

        obj = FileAnalyzer()


Related Solutions

A great flood appears in a number of texts, including the Bible, the Sumerian Deluge, and...
A great flood appears in a number of texts, including the Bible, the Sumerian Deluge, and the Epic of Gilgamesh, to name three. Write at least one paragraph, of at least five sentences, setting out the similarities of the three flood stories. Write at least one paragraph, of at least five sentences, making a general comparison of the three flood stories. If you quote or paraphrase, use an informal citation (author's name and page number, in parentheses)
Tuition reimbursement is a great tool not only for the employees but also for the organization...
Tuition reimbursement is a great tool not only for the employees but also for the organization that provides it. Most organizations that have this benefit require the employees to obtain a grade of C or higher to earn the benefit. If this is the only requirement for the reimbursement, many organizations could be taken advantage of just for a degree. The company that I work for provides tuition reimbursement, but they also have further requirements that must be met. First,...
identify, and then describe a healthcare process in detail including the tasks of the process, individuals...
identify, and then describe a healthcare process in detail including the tasks of the process, individuals involved, the goal of the process itself, the interactions between the individuals involved, and their interaction with health Information Technology to accomplish those tasks. Identify social, behavioral, and technical influences on the process. Give the opinion on how you think the tasks of the process are affected by these influences. Discuss any unintended consequences or workarounds that may be present in this process.
Consider the following table summarizing the speed limit of a certain road and the number of...
Consider the following table summarizing the speed limit of a certain road and the number of accidents occurring on that road in January. Posted Speed Limit 52 50 43 36 21 22. Reported Number of Accidents 27 26 23 18 18 11. 1) Find the slope of the regression line predicting the number of accidents from the posted speed limit.Round to 3 decimal places. 2) Find the intercept of the regression line predicting the number of accidents from the posted...
(Python) Write a program which accomplishes the following tasks: set a variable to the result of...
(Python) Write a program which accomplishes the following tasks: set a variable to the result of mathematical expression including +, -, * and / and of both Integer and Float values (or variables) set a variable to the result of a combination of string values (or variables) set a variable to the result of a combination of string, Integer and Float values (you may need to use the type casting functions) Using the following variables: a = 1.3 b =...
Value at Risk (VaR) is an attempt to provide a single number for senor management summarizing...
Value at Risk (VaR) is an attempt to provide a single number for senor management summarizing the total risk in a portfolio of financial assets. It has become widely used by corporate treasurers and fund managers as well as by financial institutions. A company currently has $5 million invested in commodity X and $3 million invested in commodity Y. The daily sigma of commodity X is 1 percent, the daily sigma of commodity Y is 1.5 percent, and the coefficient...
#1) Consider the following table summarizing the speed limit of a certain road and the number...
#1) Consider the following table summarizing the speed limit of a certain road and the number of accidents occurring on that road in January Posted Speed Limit 51 48 43 35 22 20 Reported Number of Accidents 25 29 25 17 19 14 a) Find the slope of the regression line predicting the number of accidents from the posted speed limit. Round to 3 decimal places. b) Find the intercept of the regression line predicting the number of accidents from...
PYTHON PLEASE-------- For the first module, write the pseudocode to process these tasks: (Note: lines beginning...
PYTHON PLEASE-------- For the first module, write the pseudocode to process these tasks: (Note: lines beginning with # are comments with tips for you) From the random module import randint to roll each die randomly # in pseudocode, import a random function # the name is helpful for the next M5-2 assignment Define a class called Dice In Python, the syntax has a colon after it: class Dice(): In pseudocode, you can specify it generally or be specific Under the...
PYTHON. Tasks: Create and test the openFileReadRobust() function to open file (for reading) whose name is...
PYTHON. Tasks: Create and test the openFileReadRobust() function to open file (for reading) whose name is provided from the standard input (keyboard) – the name is requested until it is correct Create and test the openFileWriteRobust () function to open file (for writing) whose name is provided from the standard input (keyboard) – the name is requested until is correct Extend the program to count number of characters, words and lines in a file as discussed in the classby including...
I think that SQL language is a great tool for database creation and manipulation. The basics...
I think that SQL language is a great tool for database creation and manipulation. The basics of SQL are easy to learn and understand. The coding is simple from the creation, changing, or even deleting data from the database. I overall really like learning SQL I can easily visualize what I am doing with the code. What could be added to this post?
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT