"Using Python, code to find the smallest and largest planet mass (if known), and plot these as two curves against the year of discovery"
I want to use data from an Excel sheet in which 'disc_year' and 'pl_mass' are columns, and write Python code that finds the maximum and minimum mass for each year, then plots these as two curves. Each year has multiple planets and therefore multiple mass values. Below is an example of what the data might look like.
Example data: Mass - 5.5, 0.2, 6, 56, 209, 0.3, 44, 124, 403, 98, 304, 202, 11.7, 5.4, 17.8, 21.9, 603.3, 108.2, 19, 0.3
Year - 2019, 2019, 1994, 2002, 2001, 2016, 1994, 2019, 2002, 2001, 2016, 2015, 2015, 1999, 2002, 1999, 1999, 2015, 2001, 2016
import pandas as pd
# creating a dataframe
df = pd.DataFrame([(5.5, 2019),
                   (0.2, 2019),
                   (6, 1994),
                   (56, 2002),
                   (209, 2001),
                   (0.3, 2016),
                   (44, 1994),
                   (124, 2019),
                   (403, 2002),
                   (98, 2001),
                   (304, 2016),
                   (202, 2015),
                   (11.7, 2015),
                   (5.4, 1999),
                   (17.8, 2002),
                   (21.9, 1999),
                   (603.3, 1999),
                   (108.2, 2015),
                   (19, 2001),
                   (0.3, 2016)],
                  columns=('pl_mass', 'disc_year'))
print(df)
grouped_df = df.groupby("disc_year") #group by disc_year
maximums = grouped_df.max() #finding maximum value of pl_mass
maximums = maximums.reset_index() #resetting index and creating a dataframe for maximum pl_mass with year for plotting
print(maximums)
minimums = grouped_df.min() #finding minimum value of pl_mass, reusing the grouping created above
minimums = minimums.reset_index() #resetting index and creating a dataframe for minimum pl_mass with year for plotting
print(minimums)
import matplotlib.pyplot as plt
# line 1 points
x1 = maximums["disc_year"]
y1 = maximums["pl_mass"]
# plotting the line 1 points
plt.plot(x1, y1, label = "Maximum")
# line 2 points
x2 = minimums["disc_year"]
y2 = minimums["pl_mass"]
# plotting the line 2 points
plt.plot(x2, y2, label = "Minimum")
# naming the x axis
plt.xlabel('Discovery Year')
# naming the y axis
plt.ylabel('Planet Mass (pl_mass)')
# giving a title to my graph
plt.title('Maximum and Minimum Planet Mass by Discovery Year')
# show a legend on the plot
plt.legend()
# function to show the plot
plt.show()
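The question asks for masses only where they are known, so here is a minimal sketch of the same grouping step that drops rows with a missing pl_mass and computes both extremes in one pass. The function name yearly_mass_range and the output column names min_mass and max_mass are made up for illustration; only 'disc_year' and 'pl_mass' come from the question.

```python
import pandas as pd

def yearly_mass_range(df):
    """Return one row per discovery year with the smallest and largest known mass."""
    # Keep only planets whose mass is known (drops NaN / None entries).
    known = df.dropna(subset=["pl_mass"])
    # Named aggregation: compute min and max of pl_mass for each year in one groupby.
    return (
        known.groupby("disc_year")["pl_mass"]
        .agg(min_mass="min", max_mass="max")
        .reset_index()
    )

# Small illustrative sample; the None stands for a planet with unknown mass.
sample = pd.DataFrame({
    "pl_mass": [5.5, 0.2, None, 6, 44],
    "disc_year": [2019, 2019, 2019, 1994, 1994],
})
summary = yearly_mass_range(sample)
print(summary)
```

The resulting summary["disc_year"], summary["min_mass"] and summary["max_mass"] columns can be passed to plt.plot in the same way as the maximums and minimums data frames above.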
To use data directly from an Excel sheet, you can try the following piece of code:
import pandas as pd
df = pd.read_excel("name_of_file.xlsx")
Or create the data frame directly, as shown in the code above.
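Real spreadsheets often contain extra columns and blank or non-numeric mass cells, so a loading step might look like the sketch below. The helper names clean_planets and load_planets are hypothetical, the file is assumed to have 'disc_year' and 'pl_mass' header cells, and reading .xlsx files with pd.read_excel typically requires the openpyxl package to be installed.

```python
import pandas as pd

def clean_planets(raw):
    """Keep the two needed columns and drop rows without a usable year or mass."""
    df = raw[["disc_year", "pl_mass"]].copy()
    # Turn blanks or text such as "unknown" into NaN instead of raising an error.
    df["pl_mass"] = pd.to_numeric(df["pl_mass"], errors="coerce")
    df["disc_year"] = pd.to_numeric(df["disc_year"], errors="coerce")
    df = df.dropna(subset=["disc_year", "pl_mass"])
    df["disc_year"] = df["disc_year"].astype(int)
    return df.reset_index(drop=True)

def load_planets(path):
    """Hypothetical helper: read only the two columns of interest from an Excel file."""
    raw = pd.read_excel(path, usecols=["disc_year", "pl_mass"])
    return clean_planets(raw)

# In-memory stand-in for a spreadsheet, with one unknown mass and an extra column.
raw = pd.DataFrame({
    "pl_name": ["a", "b", "c"],
    "disc_year": [2019, 2019, 1994],
    "pl_mass": [5.5, "unknown", 6],
})
cleaned = clean_planets(raw)
print(cleaned)
```

With a real file, df = load_planets("name_of_file.xlsx") would give a data frame that can go straight into the groupby step.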