I'm working on a scatter-plot program in Python using Pandas, Matplotlib, NumPy, etc. I'm pulling data from a CSV file that has no column names, just numbers. So far, all I have done is read the .csv file. How do I pull data from three columns, each containing about 1500 rows of numbers, and make a scatter plot with two of them on the x-axis and the third on the y-axis?
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
file_path = "./inp.csv"
#read_csv() function helps in reading the csv file
#your file has no column names; header=None will take care of that
#(the first row is then read as data, not as a header)
#usecols attribute helps in reading specific column indexes; for your problem
#3 columns only. x, y1, y2 are the new names of the three columns
#names attribute sets names for the columns that are read, which helps in plotting
df = pd.read_csv(file_path, header=None,
usecols=[0,1,2],names=["x","y1","y2"])
#once the data frame is created,
#assign the scatter plot of the data frame to a variable
#x = "x" means use the x column of the data frame on the x axis
#y = "y2" means use the y2 column of the data frame on the y axis
#color distinguishes the points, and label says what the points of that color represent
ax = df.plot(kind="scatter", x="x", y="y2", color="b", label="x vs. y2")
#next, plot another scatter plot, but now use the y1 column on the x axis
#and the y2 column on the y axis
#the ax attribute is the axes to draw on;
#set it to the previous plot (x vs. y2) so both sets of points share one figure
df.plot(kind="scatter", x="y1", y="y2", color="r", label="y1 vs. y2", ax=ax)
#print the columns and data frame
#you may not need these two prints because you will have a lot of data, so
#if you don't want to print, comment out the two print statements below
print(df.columns)
print(df)
#set labels
ax.set_xlabel("horizontal label")
ax.set_ylabel("vertical label")
#finally show the plot.
plt.show()
#if you have any doubts, please comment and like the answer.
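If you would rather not rename the columns, the same idea can be sketched with Matplotlib's own `ax.scatter`, referring to the columns by position. This is only an illustration under some assumptions: the helper name `scatter_two_vs_third`, the in-memory sample rows, and the output file name `scatter.png` are made up here and are not part of the answer above. The `Agg` backend is chosen so that the figure is written to a file instead of opening a window.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: render to a file; drop this line and call plt.show() for a window
import matplotlib.pyplot as plt
import pandas as pd


def scatter_two_vs_third(csv_source, x_cols=(0, 1), y_col=2):
    """Plot two CSV columns (given by position) against a third on the same axes."""
    # header=None: the file has no header row, so columns are labelled 0, 1, 2, ...
    df = pd.read_csv(csv_source, header=None)
    fig, ax = plt.subplots()
    # One scatter series per x column, each with its own color and legend entry
    for col, color in zip(x_cols, ("b", "r")):
        ax.scatter(df[col], df[y_col], color=color, s=10,
                   label=f"column {col} vs. column {y_col}")
    ax.set_xlabel("horizontal label")
    ax.set_ylabel("vertical label")
    ax.legend()
    return fig, ax


# Made-up sample rows standing in for the real 1500-row file
sample = io.StringIO("1,10,100\n2,20,200\n3,30,300\n")
fig, ax = scatter_two_vs_third(sample)
fig.savefig("scatter.png")
```

To use it on the real data, pass the file path instead of the sample, for example `scatter_two_vs_third("./inp.csv")`.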