The Pool class of the multiprocessing Python module runs tasks in parallel worker processes. Define a run function that performs one task. Create a Pool object with a specific number of worker processes; the example below passes the number of tasks you have, and the number of CPUs your system has is another common choice. Start the tasks within the pool by calling its map instance method, passing the run function and the list of tasks as arguments.
Multisync.py
#!/usr/bin/env python3
from multiprocessing import Pool

def run(task):
    # Do something with task here
    print("Handling {}".format(task))

if __name__ == "__main__":
    tasks = ['task1', 'task2', 'task3']
    # Create a pool of specific number of CPUs
    p = Pool(len(tasks))
    # Start each task within the pool
    p.map(run, tasks)
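As a small variation on Multisync.py, the sketch below lets the pool size default to the CPU count and collects return values from map. The helper name run_all and the returned status strings are assumptions made for illustration; they are not part of the original exercise.

```python
#!/usr/bin/env python3
import os
from multiprocessing import Pool


def run(task):
    # Hypothetical stand-in for real work: return a status string
    return "Handled {}".format(task)


def run_all(tasks, workers=None):
    # workers=None lets Pool fall back to os.cpu_count()
    with Pool(workers) as pool:
        # map() blocks until every task is done and keeps the input order
        return pool.map(run, tasks)


if __name__ == "__main__":
    print("CPUs reported:", os.cpu_count())
    for line in run_all(["task1", "task2", "task3"]):
        print(line)
```

Using the pool as a context manager shuts the workers down once map has returned, so no explicit close or join is needed here.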
In the hierarchy of subfolders under /data/prod, the data comes from different projects (e.g., beta, gamma, kappa) that are independent of each other. You have to use multiprocessing and subprocess module methods to sync the data from /data/prod to the /data/prod_backup folder.
Try applying multiprocessing, which takes advantage of idle CPU cores for parallel processing.
Hint: os.walk() generates the file names in a directory tree by walking the tree either top-down or bottom-up. This is used to traverse the file system in Python.
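To make the hint concrete, here is a minimal sketch of how os.walk() could be used to find the project folders and to compare top-down with bottom-up order. The function names project_dirs and walk_order are hypothetical helpers, not part of the exercise.

```python
import os


def project_dirs(top):
    # The first tuple yielded by a top-down walk describes `top` itself;
    # its second element lists the immediate subdirectories.
    for _root, dirs, _files in os.walk(top):
        return sorted(dirs)
    return []  # nothing was yielded: top is missing or not a directory


def walk_order(top, topdown=True):
    # Collect the visited directories (relative to top) in walk order
    return [os.path.relpath(root, top)
            for root, _dirs, _files in os.walk(top, topdown=topdown)]


if __name__ == "__main__":
    top = "/data/prod"  # path taken from the task description
    print("project folders:", project_dirs(top))
    print("bottom-up order:", walk_order(top, topdown=False))
```

With topdown=True the starting directory is visited first; with topdown=False it is visited last, after all of its subdirectories.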
Dailysync.py
#!/usr/bin/env python
import subprocess

src = "/data/prod/"
dest = "/data/prod_backup/"
subprocess.call(["rsync", "-arq", src, dest])
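Dailysync.py ignores the value returned by subprocess.call, which is the exit status of rsync. A hedged sketch of checking that status follows; rsync_command and sync are hypothetical helper names, and the code assumes rsync is installed and on PATH.

```python
import subprocess


def rsync_command(src, dest):
    # Same flags as Dailysync.py: archive, recursive, quiet
    return ["rsync", "-arq", src, dest]


def sync(src, dest):
    # subprocess.call returns the exit status; 0 means rsync succeeded.
    # Assumption: rsync is installed and on PATH.
    return subprocess.call(rsync_command(src, dest)) == 0


if __name__ == "__main__":
    ok = sync("/data/prod/", "/data/prod_backup/")
    print("sync succeeded" if ok else "sync failed")
```

Note that rsync treats a trailing slash on the source as "copy the contents of this directory", while a source without the trailing slash copies the directory itself into the destination.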
Ans:
'''
Dailysync.py above can be extended with multiprocessing and the subprocess module to do the syncing task (backing up the data), as follows.
'''
#!/usr/bin/env python3
from multiprocessing import Pool
import subprocess
import os

SRC = "/data/prod/"
DEST = "/data/prod_backup/"

def run(task):
    # Sync one top-level project folder into DEST/<task>
    subprocess.call(["rsync", "-arq", os.path.join(SRC, task), DEST])
    print("Synced {}".format(task))

def show_tree(top):
    # Print the directory structure under top
    for root, dirs, files in os.walk(top):
        print("root dir path:", root)
        print("directories inside:", dirs)
        print("files inside:", files)

if __name__ == "__main__":
    print("Before multiprocess synchronisation: ")
    show_tree("/data")
    cores = os.cpu_count()
    print("cores that can be utilised :", cores)
    # One task per project folder: the first tuple from os.walk() lists them
    tasks = next(os.walk(SRC))[1]
    # Create the destination once so the workers do not race to create it
    os.makedirs(DEST, exist_ok=True)
    # Create a pool of one worker per CPU and start each task within it
    with Pool(cores) as p:
        p.map(run, tasks)
    print("After multiprocess synchronisation: ")
    show_tree("/data")
# After running the script, a 'prod_backup' folder was created and all files were synced inside it.