Please, I seek assistance with Python programming.

import os
import numpy as np

def generate_assignment_data(expected_grade_file_path, std_dev, output_file_path):
    """
    Retrieve list of students and their expected grade from file,
    generate a sampled test grade for each student drawn from a Gaussian
    distribution defined by the student expected grade as mean, and the
    given standard deviation.

    If the sample is higher than 100, re-sample.
    If the sample is lower than 0 or 5 standard deviations below mean, re-sample.

    Write the list of student grades to the given output file using ID and
    grade with TAB separation.

    :param expected_grade_file_path: This is our file of student IDs and expected grades
    :param std_dev: Standard deviation used when sampling grades
    :param output_file_path: Where to write the sample grades
    :return: number of student grades generated and tuple of mean, median and standard deviation of grades
    """
The required code, a sample input file, and a sample output file are given below. We read the input file with Python's csv.DictReader, which reads each row as a dictionary, and then use NumPy's random.normal function to draw each sample from the Gaussian distribution:
CODE:
import csv
import os
import numpy as np

def generate_assignment_data(expected_grade_file_path, std_dev, output_file_path):
    # Open the file holding the expected grades
    with open(expected_grade_file_path, newline='') as csv_in:
        # Open the output file in write mode
        with open(output_file_path, 'w', newline='') as csv_out:
            # Field names for the output file (expected_grade is kept for reference)
            fieldnames = ['id', 'expected_grade', 'sampled_grade']
            # Writer object for the tab-separated output file
            writer = csv.DictWriter(csv_out, fieldnames=fieldnames, delimiter='\t')
            writer.writeheader()
            # Read the data from the input file, one dictionary per row
            data = csv.DictReader(csv_in, delimiter=',')
            # List of generated sample grades
            generated_grades = []
            for row in data:
                mean = float(row['expected_grade'])
                row['sampled_grade'] = -1
                # Re-sample while the grade is below 0, above 100,
                # or more than 5 standard deviations below the mean
                while (row['sampled_grade'] < 0
                       or row['sampled_grade'] > 100
                       or row['sampled_grade'] < mean - 5 * std_dev):
                    row['sampled_grade'] = np.random.normal(mean, std_dev)
                generated_grades.append(row['sampled_grade'])
                print(row)
                writer.writerow(row)
    # Convert the list of generated grades to a NumPy array
    generated_grades = np.array(generated_grades)
    # Return the count and a tuple of (mean, median, standard deviation)
    return generated_grades.size, (generated_grades.mean(),
                                   np.median(generated_grades),
                                   generated_grades.std())

# Testing
print(generate_assignment_data('students.csv', 5, 'students_out.csv'))
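The re-sampling rule from the docstring can also be pulled out into a small helper, which makes it easier to check on its own. The following is only a minimal sketch: the helper name `sample_grade` and the use of a seeded `np.random.default_rng` generator are assumptions for illustration and are not part of the answer above.

```python
import numpy as np

def sample_grade(expected, std_dev, rng):
    """Hypothetical helper: draw from N(expected, std_dev) until the
    sample lies within the bounds given in the assignment docstring."""
    # Lower bound: 0, or 5 standard deviations below the mean, whichever is higher
    lower = max(0.0, expected - 5 * std_dev)
    while True:
        grade = rng.normal(expected, std_dev)
        # Accept only samples in [lower, 100]; otherwise re-sample
        if lower <= grade <= 100.0:
            return grade

# Seeded generator (an assumption, used here so that runs are repeatable)
rng = np.random.default_rng(0)
grades = [sample_grade(95, 5, rng) for _ in range(1000)]
print(min(grades), max(grades))
```

With an expected grade of 95 and a standard deviation of 5, many raw draws exceed 100, so this case exercises the upper bound; every accepted grade should fall between 70 and 100.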
Sample Input file used for testing:
id,expected_grade
s1,70
s2,80
s3,65
s4,75
Output file data (tab-separated; the sampled grades are random, so they differ on each run):
id expected_grade sampled_grade
s1 70 72.89787027604346
s2 80 86.3827664017767
s3 65 75.18435378753523
s4 75 77.46518085069853
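To check an output file like the one above, the tab-separated rows can be read back and the statistics recomputed. This is a rough sketch only: it assumes the three-column layout shown above and embeds the sample rows in a string (the name `sample_output` is hypothetical) so that it runs without any file on disk.

```python
import csv
import io
import numpy as np

# The sample output rows from above, embedded as a tab-separated string
sample_output = (
    "id\texpected_grade\tsampled_grade\n"
    "s1\t70\t72.89787027604346\n"
    "s2\t80\t86.3827664017767\n"
    "s3\t65\t75.18435378753523\n"
    "s4\t75\t77.46518085069853\n"
)

# Read the rows back as dictionaries, splitting on TAB
reader = csv.DictReader(io.StringIO(sample_output), delimiter='\t')
grades = np.array([float(row['sampled_grade']) for row in reader])

# Same summary values that generate_assignment_data returns
summary = (grades.size, (grades.mean(), np.median(grades), grades.std()))
print(summary)
```

For a real run, `io.StringIO(sample_output)` would be replaced by an open file handle on the output path.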