Please, I need help with this Python programming task.
import os
import numpy as np


def generate_assignment_data(expected_grade_file_path,
                             std_dev, output_file_path):
    """
    Retrieve the list of students and their expected grades from file,
    generate a sampled test grade for each student
    drawn from a Gaussian distribution defined by the
    student's expected grade as mean, and the given
    standard deviation.
    If the sample is higher than 100, re-sample.
    If the sample is lower than 0 or 5 standard deviations below the mean,
    re-sample.
    Write the list of student grades to the given
    output file using ID and grade with TAB separation.
    :param expected_grade_file_path: File of student IDs and expected grades
    :param std_dev: Standard deviation used when sampling grades
    :param output_file_path: Where to write the sampled grades
    :return: number of student grades generated and
             tuple of mean, median and standard deviation of grades
    """
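The required code, a sample input file, and the sample output file are given below. The input file is read with Python's csv.DictReader, which yields each row as a dictionary. NumPy's np.random.normal function then draws a sample from the Gaussian distribution defined by the student's expected grade (mean) and the given standard deviation, and the draw is repeated until the sample falls inside the allowed range: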
CODE:
import csv
import os
import numpy as np


def generate_assignment_data(expected_grade_file_path, std_dev, output_file_path):
    # Open the expected-grades file for reading and the output file for writing
    with open(expected_grade_file_path, newline='') as csv_in, \
            open(output_file_path, 'w', newline='') as csv_out:
        fieldnames = ['id', 'expected_grade', 'sampled_grade']  # columns of the output file
        writer = csv.DictWriter(csv_out, fieldnames=fieldnames, delimiter='\t')  # TAB-separated writer
        writer.writeheader()
        data = csv.DictReader(csv_in, delimiter=',')  # each row is read as a dictionary
        generated_grades = []  # sampled grades, kept for the summary statistics
        for row in data:
            mean = float(row['expected_grade'])
            # Reject samples below 0 or more than 5 standard deviations below the mean
            lower_bound = max(0, mean - 5 * std_dev)
            # Keep sampling until the grade lies in [lower_bound, 100]
            while True:
                sampled_grade = np.random.normal(mean, std_dev)
                if lower_bound <= sampled_grade <= 100:
                    break
            row['sampled_grade'] = sampled_grade
            generated_grades.append(sampled_grade)
            print(row)
            writer.writerow(row)
    generated_grades = np.array(generated_grades)  # convert the list to a NumPy array
    # Return the count and a tuple of (mean, median, standard deviation)
    return generated_grades.size, (generated_grades.mean(), np.median(generated_grades), generated_grades.std())


# Testing
print(generate_assignment_data('students.csv', 5, 'students_out.csv'))
Sample Input file used for testing:
id,expected_grade
s1,70
s2,80
s3,65
s4,75
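To illustrate how csv.DictReader treats this input, here is a minimal sketch that parses the same four rows from an in-memory string instead of students.csv. The variable names sample_input and rows are only for illustration. Note that every value arrives as a string, which is why the expected grade has to be converted with float() before sampling.

```python
import csv
import io

# Same content as the sample input file, held in memory for illustration.
sample_input = "id,expected_grade\ns1,70\ns2,80\ns3,65\ns4,75\n"

# DictReader uses the first line as the field names.
rows = list(csv.DictReader(io.StringIO(sample_input)))

for row in rows:
    # Values are strings until converted explicitly.
    print(row["id"], float(row["expected_grade"]))
```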
Output file data:
id expected_grade sampled_grade
s1 70 72.89787027604346
s2 80 86.3827664017767
s3 65 75.18435378753523
s4 75 77.46518085069853
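The re-sampling rule from the task description can also be isolated in a small helper, which makes it easier to check on its own. The following is only a sketch: the name sample_grade is hypothetical, it uses NumPy's Generator API (np.random.default_rng) rather than np.random.normal, and the seed is arbitrary. It assumes std_dev is positive and that the expected grade is at most 100; otherwise an acceptable sample could be extremely rare and the loop could run for a very long time.

```python
import numpy as np


def sample_grade(mean, std_dev, rng):
    """Draw from N(mean, std_dev) until the sample satisfies the task's bounds.

    Hypothetical helper: a sample is accepted only if it is at most 100,
    at least 0, and no more than 5 standard deviations below the mean.
    """
    lower_bound = max(0.0, mean - 5 * std_dev)
    while True:
        grade = rng.normal(mean, std_dev)
        if lower_bound <= grade <= 100:
            return grade


rng = np.random.default_rng(0)  # arbitrary seed, for repeatable runs
expected = [70, 80, 65, 75]
grades = [sample_grade(m, 5, rng) for m in expected]
print(grades)
```

Passing the generator in as an argument keeps the helper free of global state, so a fixed seed reproduces the same grades on every run.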