Question

The file cereal.scsv contains the nutritional information for over 70 cereals. However, it is in a rarely used format: semicolon-separated values. The format was chosen because a few cereals have commas in their names (e.g. Fruit & Fibre Dates, Walnuts, and Oats). There is a separate lesson on converting this data to a proper CSV format.

Also note that the first line identifies the column names; the second line identifies the data types for each column.
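To illustrate that layout, here is a minimal sketch that pairs each column name with its declared type. The sample lines and type names are made up for illustration and are not taken from cereal.scsv, and the helper name read_schema is hypothetical.

```python
def read_schema(lines):
    """Pair each column name (line 1) with its declared type (line 2)."""
    names = lines[0].rstrip("\n").split(";")
    types = lines[1].rstrip("\n").split(";")
    return dict(zip(names, types))

# Made-up sample in the same layout (not the real file contents)
sample = [
    "name;calories;fiber\n",
    "String;Int;Float\n",
    "Example Flakes, Nuts, and Oats;120;2.5\n",
]

schema = read_schema(sample)
print(schema)
```

Because the second line holds type names rather than values, any code that totals a column has to start at the third line.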

Part 1: reading and finding

Create

Create a function named find_total_for that has two parameters (in this order):

  • filename_in which is the path to the file to read
  • column_name which is the name of the column of data we are interested in

Read the data, parse the data

find_total_for will read in the file and sum up the values identified by column_name.

Notes and Hints:

  • only columns that have numeric values will be requested
  • open and read the .scsv file (i.e. filename_in)
  • parse out each line into its separate components
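The parsing step in the hints above can be sketched as follows. Splitting on ";" keeps the commas inside a cereal name intact. The cereal name comes from the question, but the two numeric values and the three-column layout are assumptions made for illustration.

```python
# One data line in the assumed layout: name;calories;fiber (values made up)
line = "Fruit & Fibre Dates, Walnuts, and Oats;120;5\n"

# Drop the trailing newline, then split on the semicolon separator
fields = line.rstrip("\n").split(";")

name, calories, fiber = fields
print(name)                         # the commas in the name survive the split
print(int(calories) + int(fiber))   # fields are strings until converted
```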

Return

The function returns the sum of the values in the column named column_name.

(In Python)

Solutions

Expert Solution

The following function uses basic file handling and the string split method to read the file and sum the requested column.


def find_total_for(filename_in, column_name):
    # Read every line, dropping the trailing newline from each;
    # the with-statement closes the file automatically
    with open(filename_in) as f:
        rows = [line.rstrip("\n") for line in f]

    # The first line holds the column names
    columns = rows[0].split(";")

    # Index of the requested column (raises ValueError if it is missing)
    req_column = columns.index(column_name)

    total = 0

    # The second line holds the data types, so the data starts on the third line
    for row in rows[2:]:
        if not row.strip():
            continue  # skip blank lines
        fields = row.split(";")
        # float() accepts both integer and decimal values
        total += float(fields[req_column])

    return total
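
As an alternative to manual splitting, the same task can be sketched with Python's standard csv module by setting delimiter=";". This is one possible approach under stated assumptions, not the required solution: the function name find_total_with_csv and the small sample file are hypothetical, and values are converted with float() on the assumption that a numeric column may hold decimals.

```python
import csv
import os
import tempfile

def find_total_with_csv(filename_in, column_name):
    """Sum one numeric column of a semicolon-separated file."""
    with open(filename_in, newline="") as f:
        reader = csv.reader(f, delimiter=";")
        header = next(reader)   # line 1: column names
        next(reader)            # line 2: data types, skipped
        idx = header.index(column_name)
        # "if row" skips any blank lines
        return sum(float(row[idx]) for row in reader if row)

# Hypothetical sample file in the same layout (not the real data)
sample = (
    "name;calories;fiber\n"
    "String;Int;Float\n"
    "A, B, and C;100;2.5\n"
    "Plain Oats;70;10\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".scsv", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

total_calories = find_total_with_csv(path, "calories")
total_fiber = find_total_with_csv(path, "fiber")
os.remove(path)
print(total_calories, total_fiber)  # 170.0 12.5
```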

