Plot this data as a bar graph in Python:
import pandas
data = pandas.read_csv(r'data/tv_shows.txt', low_memory=False)
print(data)
print(data.columns)
TV Shows : Rating
0 ---------------------
1 A Discovery of Witches : 100%
2 Barry : 100%
3 Unforgotten : 100%
4 Veep : 98%
5 Killing Eve : 97%
6 Billions : 96%
7 Les Misérables : 96%
8 Supergirl : 89%
9 Call the Midwife : 80%
10 Game of Thrones : 77%
11 Now Apocalypse : 77%
12 The Red Line : 69%
13 Lucifer : No Score Yet
14 Chernobyl : 95%
15 Dead to Me : 85%
16 Better Things : 100%
17 Brooklyn Nine-Nine : 100%
18 Tuca & Bertie : 100%
19 State of the Union : 100%
20 The Twilight Zone : 75%
21 Happy! : 100%
Index(['TV Shows : Rating'], dtype='object')
display(data)
I made a few modifications to the text file:
1. I changed the normal spaces to tab characters in each line (the code expects a tab between the serial number and the show name).
2. I changed 'Les Misérables' to 'Les Miserables'.
TXT FILE:
TV Shows : Rating
0 ---------------------
1 A Discovery of Witches : 100%
2 Barry : 100%
3 Unforgotten : 100%
4 Veep : 98%
5 Killing Eve : 97%
6 Billions : 96%
7 Les Miserables : 96%
8 Supergirl : 89%
9 Call the Midwife : 80%
10 Game of Thrones : 77%
11 Now Apocalypse : 77%
12 The Red Line : 69%
13 Lucifer : No Score Yet
14 Chernobyl : 95%
15 Dead to Me : 85%
16 Better Things : 100%
17 Brooklyn Nine-Nine : 100%
18 Tuca & Bertie : 100%
19 State of the Union : 100%
20 The Twilight Zone : 75%
21 Happy! : 100%
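The two manual edits could also be scripted. This is only a minimal sketch under stated assumptions: it assumes the raw file holds one `name : rating` pair per line without serial numbers (the numbers in the question's output may simply be the pandas row index) and that it is UTF-8 encoded. The function names and the source file name `tv_shows_raw.txt` are hypothetical.

```python
def to_tab_layout(raw_lines):
    """Number every line after the header and join number and text with a tab."""
    lines = [ln.rstrip("\n") for ln in raw_lines if ln.strip()]
    header, body = lines[0], lines[1:]
    return [header] + [f"{i}\t{text}" for i, text in enumerate(body)]


def convert_file(src="tv_shows_raw.txt", dst="asd.txt"):
    # encoding="utf-8" is an assumption about the file; with the right encoding,
    # an accented title such as 'Les Misérables' can be kept as it is.
    with open(src, encoding="utf-8") as fin:
        converted = to_tab_layout(fin)
    with open(dst, "w", encoding="utf-8") as fout:
        fout.write("\n".join(converted) + "\n")
```

If the file is written this way, the code that reads it back would need the same `encoding` argument in its `open` call.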
CODE:
import matplotlib.pyplot as plt
import csv
import pandas as pd

tvshows = []  # names of the TV shows
rating = []   # ratings as integers
Label = []    # axis labels taken from the header line

with open('asd.txt', mode='r') as csv_file:
    data = csv.reader(csv_file, delimiter='\t')  # split each line on the tab character
    line_count = 0
    for row in data:
        line_count += 1
        if line_count == 1:  # the first line holds the X and Y labels
            la = row[0].split(':')
            Label.append(la[0].strip())
            Label.append(la[1].strip())
        if len(row) == 2 and line_count > 2:  # skip the header and the '0 -------' line
            # row[0] is the serial number (ignored); row[1] is 'name : rating'
            arr = row[1].split(':')
            tvshows.append(arr[0].strip())  # name of the TV show
            sarr = arr[1].strip().strip('%')  # drop the % so the value converts to int
            if sarr == 'No Score Yet':
                sarr = '0'  # a show with no score is plotted as 0
            rating.append(int(sarr))

# now plotting
# the dict supplies one column of bar heights; index supplies the bar labels
df = pd.DataFrame({'%Rating': rating}, index=tvshows)
ax = df.plot.bar()  # bar graph; this creates its own figure, so plt.figure() is not needed
plt.xlabel(Label[0])
plt.ylabel(Label[1])
plt.show()
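If editing the text file is not an option, the same plot can probably be produced from the unmodified data. The following is a rough sketch and not the method used above: it assumes each line has the form `name : rating` with no serial numbers and no tabs. `SAMPLE`, `parse_ratings` and `plot_ratings` are hypothetical names used only for illustration.

```python
import io

# Hypothetical sample in the assumed raw layout (no serial numbers, no tabs).
SAMPLE = """TV Shows : Rating
---------------------
Veep : 98%
Lucifer : No Score Yet
Happy! : 100%
"""


def parse_ratings(lines):
    """Return (x_label, y_label, names, scores); a missing score becomes None."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    x_label, _, y_label = rows[0].partition(" : ")
    names, scores = [], []
    for row in rows[1:]:
        name, sep, score = row.rpartition(" : ")
        if not sep:  # e.g. the dashed separator line
            continue
        names.append(name)
        scores.append(int(score[:-1]) if score.endswith("%") else None)
    return x_label, y_label, names, scores


def plot_ratings(x_label, y_label, names, scores):
    # Imported here so parse_ratings can be used without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Shows without a score are drawn with height 0, as in the code above.
    ax.bar(names, [s if s is not None else 0 for s in scores])
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    ax.tick_params(axis="x", labelrotation=90)
    fig.tight_layout()
    return ax
```

For a quick check, `plot_ratings(*parse_ratings(io.StringIO(SAMPLE)))` followed by `plt.show()` should draw three bars. Keeping the missing score as `None` in the parsed data leaves the choice of how to display it (zero bar, gap, or annotation) to the plotting step.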
