Question

The purpose of this is to plot data using Matplotlib.

Description

Complete the Jupyter notebook named main.ipynb, which reads the file diamonds.csv into a Pandas DataFrame. Information about the file can be found here:

-------

diamonds R Documentation

Prices of over 50,000 round cut diamonds

Description

A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are as follows:

Usage

diamonds

Format

A data frame with 53940 rows and 10 variables:

price

price in US dollars (\$326–\$18,823)

carat

weight of the diamond (0.2–5.01)

cut

quality of the cut (Fair, Good, Very Good, Premium, Ideal)

color

diamond colour, from D (best) to J (worst)

clarity

a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))

x

length in mm (0–10.74)

y

width in mm (0–58.9)

z

depth in mm (0–31.8)

depth

total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)

table

width of top of diamond relative to widest point (43–95)

------
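The depth formula in the documentation above can be checked by hand. The sketch below is only an illustration and not part of the assignment: depth_percentage is a hypothetical helper name, the measurements are made-up values, and the factor of 100 is an assumption based on the documented 43–79 range (the bare ratio would lie between 0 and 1).

```python
def depth_percentage(x, y, z):
    """Total depth percentage: z divided by the mean of x and y, times 100."""
    # The factor of 100 is an assumption: the documented range (43-79)
    # suggests the column is stored as a percentage, not a ratio.
    return 100 * 2 * z / (x + y)

# Made-up measurements in mm, for illustration only.
print(depth_percentage(4.0, 4.0, 2.4))
```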

There are two figures that you need to create:

Figure 1:

  • normalized histogram with 30 bins using the Fair cut diamond prices
  • a line plot of the normal distribution using the mean and standard deviation of the Fair cut diamond prices
  • appropriate labels on both the x and y axes
  • appropriate title
  • appropriate legend

Figure 2:

  • horizontal bar chart of the mean prices of the diamond cuts
  • ten evenly spaced tick marks on the x axis from 0 to the maximum mean price
  • appropriate labels on the x and y axes
  • appropriate title
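For Figure 1, "normalized" is taken here to mean that the histogram is scaled to unit area, so that it shares a vertical scale with the normal density curve. A minimal sketch of that scaling, using NumPy's np.histogram with density=True on synthetic stand-in prices (the random data and the variable names are assumptions, not values from diamonds.csv):

```python
import numpy as np

# Synthetic stand-in for the Fair cut prices; not real data.
rng = np.random.default_rng(0)
prices = rng.normal(loc=4000.0, scale=1500.0, size=1000)

# density=True rescales the counts so that the bar areas sum to 1.
heights, edges = np.histogram(prices, bins=30, density=True)
area = np.sum(heights * np.diff(edges))

print(len(heights), len(edges))  # 30 bins are bounded by 31 edges
print(np.isclose(area, 1.0))
```

Matplotlib's Axes.hist accepts the same bins and density arguments, so ax.hist(prices, bins=30, density=True) would draw the corresponding bars.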

The contents of main.ipynb are:

Setup

The following code imports the required libraries and loads a dataset containing information about diamonds into a Pandas DataFrame. Information about the dataset can be found here.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

df = pd.read_csv('diamonds.csv')
df.pop('Unnamed: 0');
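The normal_distribution helper evaluates the normal probability density for mean mu and standard deviation sigma. As a rough sanity check (a sketch with arbitrary example parameters, not part of the notebook), the curve can be evaluated on an evenly spaced grid and its area approximated with a Riemann sum, which should come out close to 1:

```python
import numpy as np

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

# Arbitrary example parameters, chosen only for illustration.
mu, sigma = 4000.0, 1500.0

# Grid spanning six standard deviations on either side of the mean.
x = np.linspace(mu - 6*sigma, mu + 6*sigma, 2001)
pdf = normal_distribution(x, mu, sigma)

# Riemann-sum approximation of the area under the curve.
dx = x[1] - x[0]
area = np.sum(pdf) * dx
print(round(area, 4))
```

Because the function works element-wise on NumPy arrays, the same grid-based call can supply the y values for the line plot in Figure 1.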

Bar chart of average price per cut

Make a plot that meets the following criteria:

  • horizontal bar chart of the mean prices of the diamond cuts
  • ten evenly spaced tick marks on the x axis from 0 to the maximum mean price
  • appropriate labels on the x and y axes
  • appropriate title
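One way to get the numbers behind this chart is sketched below, assuming a DataFrame with cut and price columns as described in the dataset documentation. The tiny table is made-up stand-in data, and mean_prices and ticks are illustrative names.

```python
import numpy as np
import pandas as pd

# Made-up stand-in rows; the real notebook would use df from diamonds.csv.
df = pd.DataFrame({
    "cut": ["Fair", "Fair", "Good", "Ideal", "Ideal"],
    "price": [4000, 5000, 3900, 3000, 4000],
})

# Mean price per cut.
mean_prices = df.groupby("cut")["price"].mean()

# Ten evenly spaced tick positions from 0 to the largest mean price.
ticks = np.linspace(0, mean_prices.max(), 10)

print(mean_prices)
print(ticks)
```

With a Matplotlib Axes object ax, ax.barh(mean_prices.index, mean_prices.values) would draw the horizontal bars and ax.set_xticks(ticks) would place the ticks.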

diamonds.csv is available at:

https://forge.scilab.org/index.php/p/rdataset/source/tree/master/csv/ggplot2/diamonds.csv

Solutions

Expert Solution

1. The required source code is given below:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

def plotFirst(df):
    # Prices of the Fair cut diamonds only
    prices = df.loc[df['cut'] == 'Fair', 'price'].to_numpy()
    # Parameters of the normal distribution
    mean = np.mean(prices)
    stdv = np.std(prices)
    # Evenly spaced grid over the price range for the normal curve
    x = np.linspace(prices.min(), prices.max(), 200)
    # Plotting the graph
    fig, ax = plt.subplots(figsize=(10, 6))
    # density=True normalizes the histogram so that its area is 1
    ax.hist(prices, bins=30, density=True, label='Fair cut prices')
    ax.plot(x, normal_distribution(x, mean, stdv), label='Normal distribution')
    ax.set_xlabel('Price (US dollars)')
    ax.set_ylabel('Probability density')
    ax.set_title('Price distribution of Fair cut diamonds')
    ax.legend()
    plt.show()

def plotSecond(df):
    # Mean price of each cut
    means = df.groupby('cut')['price'].mean()
    # Plotting the graph as a horizontal bar chart
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.barh(means.index, means.values)
    # Ten evenly spaced ticks from 0 to the maximum mean price
    ax.set_xticks(np.linspace(0, means.max(), 10))
    ax.set_xlabel('Mean price (US dollars)')
    ax.set_ylabel('Cut')
    ax.set_title('Mean price by diamond cut')
    plt.show()

if __name__ == "__main__":
    df = pd.read_csv('diamonds.csv')
    df.pop('Unnamed: 0')
    plotFirst(df)
    plotSecond(df)

2. Screenshots of the output are below:

