Question

The purpose of this exercise is to plot data using Matplotlib.

Description

Complete the Jupyter notebook named main.ipynb that reads the file diamonds.csv into a Pandas DataFrame. Information about the file can be found here:

-------

diamonds

R Documentation

Prices of over 50,000 round cut diamonds

Description

A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are as follows:

Usage

diamonds

Format

A data frame with 53940 rows and 10 variables:

price

price in US dollars (\$326–\$18,823)

carat

weight of the diamond (0.2–5.01)

cut

quality of the cut (Fair, Good, Very Good, Premium, Ideal)

color

diamond colour, from D (best) to J (worst)

clarity

a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))

x

length in mm (0–10.74)

y

width in mm (0–58.9)

z

depth in mm (0–31.8)

depth

total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43–79)

table

width of top of diamond relative to widest point (43–95)
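
As a quick check of the depth formula listed above, the following minimal sketch computes the total depth percentage from x, y and z. The function name depth_percentage and the sample measurements are hypothetical and are not taken from diamonds.csv. The factor of 100 is an assumption that the listed range (43–79) is expressed in percent.

```python
def depth_percentage(x, y, z):
    """Total depth percentage: z divided by the mean of x and y, times 100."""
    return 100 * 2 * z / (x + y)

# Hypothetical measurements in mm (not a row from diamonds.csv)
example_depth = depth_percentage(4.0, 4.0, 2.5)
print(example_depth)
```

Rows where x + y is 0 (the listed ranges for x and y start at 0) would raise a division error and need separate handling.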

------

There are two figures that you need to create:

Figure 1:

normalized histogram with 30 bins using the Fair cut diamond prices
a line plot of the normal distribution using the mean and standard deviation of the Fair cut diamond prices
appropriate labels on both the x and y axes
appropriate title
appropriate legend

Figure 2:

horizontal bar chart of the mean prices of the diamond cuts
ten evenly spaced tick marks on the x axis from 0 to the maximum mean price
appropriate labels on the x and y axes
appropriate title

The contents of main.ipynb are:

Setup

The following code imports the required libraries and loads a dataset containing information about diamonds into a Pandas DataFrame. Information about the dataset can be found here.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
%matplotlib inline

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

df = pd.read_csv('diamonds.csv')
df.pop('Unnamed: 0');
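
The normal_distribution helper above is the standard normal probability density function, so the area under it should be close to 1 and its peak should sit at x = mu. A minimal sketch of that sanity check follows. The values of mu and sigma are hypothetical, and the area is approximated with a simple Riemann sum over plus or minus eight standard deviations.

```python
import numpy as np

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

# Hypothetical parameters, chosen only for illustration
mu, sigma = 0.0, 1.0

# Evaluate the density on a fine, evenly spaced grid
x = np.linspace(mu - 8*sigma, mu + 8*sigma, 10001)
pdf = normal_distribution(x, mu, sigma)

# Riemann-sum approximation of the area under the curve
dx = x[1] - x[0]
area = np.sum(pdf) * dx

# The density is largest at x = mu
peak = normal_distribution(mu, mu, sigma)
print(area, peak)
```

Because the function accepts NumPy arrays, it can be evaluated on an evenly spaced grid of prices to draw a smooth curve, rather than being called once per data point.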

Bar chart of average price per cut

Make a plot that meets the following criteria:

horizontal bar chart of the mean prices of the diamond cuts
ten evenly spaced tick marks on the x axis from 0 to the maximum mean price
appropriate labels on the x and y axes
appropriate title
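
One way to satisfy the tick-mark criterion is np.linspace, which returns a fixed number of evenly spaced values including both endpoints. The sketch below is illustrative only: the mean prices are hypothetical round numbers, not values computed from diamonds.csv, and the labels are just one reasonable choice.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical mean prices per cut, for illustration only
cuts = ["Fair", "Good", "Very Good", "Premium", "Ideal"]
mean_prices = np.array([1000.0, 2000.0, 3000.0, 4500.0, 2500.0])

# Ten evenly spaced tick positions from 0 to the largest mean price
ticks = np.linspace(0, mean_prices.max(), 10)

fig, ax = plt.subplots()
ax.barh(cuts, mean_prices)          # horizontal bars, one per cut
ax.set_xticks(ticks)
ax.set_xlabel("Mean price (US dollars)")
ax.set_ylabel("Cut")
ax.set_title("Mean price by cut (hypothetical values)")
```

With the real data, the hypothetical arrays would be replaced by the per-cut means computed from the DataFrame.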

diamonds.csv is available at:

https://forge.scilab.org/index.php/p/rdataset/source/tree/master/csv/ggplot2/diamonds.csv

Expert Solution

1. The required source code is given below:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def normal_distribution(x, mu, sigma):
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-(x-mu)**2/(2*sigma**2))

def plotFirst(df):
    # Prices of the Fair cut diamonds only
    prices = df.loc[df['cut'] == 'Fair', 'price']
    mean = np.mean(prices)
    stdv = np.std(prices)
    # Evenly spaced prices over the observed range, used to draw the normal curve
    x = np.linspace(prices.min(), prices.max(), 200)
    fig, ax = plt.subplots(figsize=(10, 6))
    # density=True normalizes the histogram so that its total area is 1
    ax.hist(prices, bins=30, density=True, alpha=0.6, label='Fair cut prices')
    ax.plot(x, normal_distribution(x, mean, stdv), label='Normal distribution')
    ax.set_xlabel('Price (US dollars)')
    ax.set_ylabel('Probability density')
    ax.set_title('Price distribution of Fair cut diamonds')
    ax.legend()
    plt.show()

def plotSecond(df):
    # Mean price of each cut
    mean_prices = df.groupby('cut')['price'].mean()
    fig, ax = plt.subplots(figsize=(10, 6))
    # Horizontal bar chart, one bar per cut
    ax.barh(mean_prices.index, mean_prices.values)
    # Ten evenly spaced tick marks from 0 to the maximum mean price
    ax.set_xticks(np.linspace(0, mean_prices.max(), 10))
    ax.set_xlabel('Mean price (US dollars)')
    ax.set_ylabel('Cut')
    ax.set_title('Mean price of diamonds by cut')
    plt.show()

if __name__ == "__main__":
    df = pd.read_csv('diamonds.csv')
    df.pop('Unnamed: 0')
    plotFirst(df)
    plotSecond(df)
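
The task asks for a normalized histogram with 30 bins. A minimal sketch of what that normalization means is shown below using np.histogram with density=True, which scales the bar heights so that the total bar area equals 1. The samples are synthetic, drawn from a hypothetical normal distribution, and stand in for the Fair cut prices.

```python
import numpy as np

# Synthetic samples standing in for the prices (hypothetical parameters)
rng = np.random.default_rng(0)
samples = rng.normal(loc=100.0, scale=15.0, size=1000)

# density=True returns bar heights scaled so the histogram integrates to 1
heights, edges = np.histogram(samples, bins=30, density=True)

# Total area = sum of (height * bin width) over all 30 bins
area = np.sum(heights * np.diff(edges))
print(area)
```

A histogram normalized this way is on the same scale as a probability density, which is what allows the normal curve to be overlaid on the same axes.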

venereology answered 2 years ago
