In: Computer Science
PYTHON - You are given a data.csv file in the /root/customers/ directory containing information about your customers. It has the following columns: ID,NAME,CITY,COUNTRY,CPERSON,EMPLCNT,CONTRCNT,CONTRCOST where ID: Unique id of the customer NAME: Official customer company name CITY: Location city name COUNTRY: Location country name CPERSON: Email of the customer company contact person EMPLCNT: Customer company employees number CONTRCNT: Number of contracts signed with the customer CONTRCOST: Total amount of money paid by customer (float in format dollars.cents) Read and analyze the data.csv file, and output the answers to these questions: How many total customers are in this data set? How many customers are in each city? How many customers are in each country? Which country has the largest number of customers' contracts signed in it? How many contracts does it have? How many unique cities have at least one customer in them? The answers should be formatted as: Total customers: Customers by city: : : ... Customers by country: : : ... Country with most customers' contracts: USA ( contracts) Unique cities with at least one customer: The answers for Customers by city and Customers by country must be sorted by CITY and COUNTRY respectively, in ascending order. If there are several cities that are tied for having the most customers' contracts, print the lexicographically bigger one.
We are using the built-in csv module to read in the data.csv file. The program code, as per the given requirements, is as follows:
import csv
def analyze_data():
filename = "/root/customers/data.csv"
rows = []
city_cust_dict = {} # Dictionary with key as the city, and value as number of customers in that city
country_cust_dict = {} # Dictionary with key as the country, and value as number of customers in that country
with open(filename, 'r') as file:
reader = csv.reader(file)
fields = next(reader)
for row in reader:
rows.append(row)
city = row[2]
country = row[3]
# Add city, customer to city_cust_dict dictionary
if city in city_cust_dict:
city_cust_dict[city] += 1
else:
city_cust_dict[city] = 1
# Add country, customer to country_cust_dict dictionary
if country in country_cust_dict:
country_cust_dict[country] += 1
else:
country_cust_dict[country] = 1
# *** PRINTING REQUIRED DATA ***
# How many total customers are in this data set?
print("Total customers: " + str(len(rows)))
# How many customers are in each city?
sorted(city_cust_dict.items(), key=lambda x: x[1])
print("Customers by city:")
for k, v in city_cust_dict.items():
print(k + " : " + str(v))
# How many customers are in each country?
sorted(country_cust_dict.items(), key=lambda x: x[1])
print("Customers by country:")
for k, v in country_cust_dict.items():
print(k + " : " + str(v))
# Country with most customers' contracts?
country_keys = list(country_cust_dict.keys())
max_country = country_keys[0]
max_val = country_cust_dict[max_country]
for i in range(len(country_keys)):
k = country_keys[i]
v = country_cust_dict[k]
if v > max_val:
max_val = v
max_country = k
elif v == max_val:
if k > max_country:
k = max_country
max_val = country_cust_dict[max_country]
print("Country with most customers' contracts: " + max_country + " (" + str(max_val) + ")")
# How many unique cities have at least one customer in them?
print("Unique cities with at least one customer:")
for x in city_cust_dict.keys():
print(x)
if __name__ == '__main__':
analyze_data()
Hope that helps!