In: Computer Science
In Python, create a program with 2 classes that do the following.
HashCreate class, this class will accept a directory and hash each file in the directory and store the results in a dictionary. The dictionary will contain the hash value and file name.
HashVerify, the second class will accept the dictionary as input and save that in an instance attribute. This class must also contain a method for lookups that require a file path as input. The lookup method will hash the file and lookup in the dictionary to see if the hash exists. If the hash exists, the lookup method will return true and a file name associated with it.
Hi,
As it's not mentioned as to which hashing algorithm needs to be used, I have used md5 for hashing.
I am posting here the screen shot of the source code followed by the actual source code and test run output which confirms that the code is working fine.
Source code:
import hashlib, os
from os import listdir, getcwd
from os.path import isfile, join, normpath
#dictionary to store the hash value and file name
HashDict = {}
class HashCreate:
def __init__(self, directory=""):
if not directory:
#checking if the directory is empty
print("directory
name is empty!")
return
if not os.path.exists
(directory): #checking if the directory exists
print("Can't
find directory: " + directory)
return
current_path =
normpath(directory) #normalising the path
#obtaining list of files in the
directory
filenames = [f for f in
listdir(current_path) if isfile(join(current_path, f))]
global HashDict
#creating the hash of each file and
updating hash and file name in global hash dictionary
for filename in filenames:
filepath =
join(current_path, filename)
HashDict[self.md5(filepath)] = filename
# method to create md5 hash
def md5(self, fname):
hash_md5 = hashlib.md5()
with open(fname, "rb") as f:
for chunk in
iter(lambda: f.read(2 ** 20), b""):
hash_md5.update(chunk)
return hash_md5.hexdigest()
class HashVerify:
def __init__(self, hashDict = {}):
#taking hashdict and storing in
instance attribute of HashVerify
self.HashDict = hashDict
#method to lookup the hash of the file specified by
the file path and return as mentioned in the question
def lookup(self, filepath = ""):
if not filepath:
#checking if filepath is empty
print("file path
is empty!")
return False,
""
if not
os.path.exists(filepath): #checking if filepath
exists
print("Can't
find directory: " + filepath)
return False,
""
fileHash =
self.md5(filepath) #calculating md5 hash of the
file
if fileHash in
self.HashDict: #checking if the hash exists in the hash
dict
return True,
self.HashDict[fileHash] #returning True and file name
as hash exists
else:
return False,
"" #returning False and empty file name as hash doesn't
exist
def md5(self, fname):
hash_md5 = hashlib.md5()
with open(fname, "rb") as f:
for chunk in
iter(lambda: f.read(2 ** 20), b""):
hash_md5.update(chunk)
return hash_md5.hexdigest()
#Testing the two classes
#Please replace the directory name with a directory in your local
computer
H1 =
HashCreate("/Users/skotak455/Documents/TestPrograms/single_LL")
print("HashDict: ")
print(HashDict)
H2 = HashVerify(HashDict)
#Please replace the filepath with a file in your local
computer
verified, file =
H2.lookup("/Users/skotak455/Documents/TestPrograms/single_LL/a.out")
if verified:
print("File is verified, file name: " + file)
else:
print("Can't lookup the file path")
Test run output:
Thank you!!