Question


#########################PANDAS LANGUAGE##################

#########################MATPLOT LIB#########################

# read movie.csv into a DataFrame called 'movie'
# describe the dataframe
# rename the column Runtime (Minutes) to Runtime_Minutes, and Revenue (Millions) to Revenue_Millions
# show if any column has null values
# count the total number of null values in the dataframe
# print those rows which have null values
# fill null values:
# if a column is numerical, then fill with the mean (if there is no numerical missing value in
# the dataframe, then don't code this)
# if a column is categorical, then fill with the most frequent value (if there is no categorical missing value in
# the dataframe, then don't code this)
# plot a histogram of the column named year in the movie dataframe, which shows how many movies were released in a year.
# print the movie details with title 'Grumpier Old Men'.
# show those movies which were released after 1995-01-01
# sort the movie DataFrame in descending order based on release_date
# for each year, display the total number of movies with a specific genre, for example Action=1000, Adventure=400
# plot a histogram of the total counts calculated above
# filter the movies with a specific genre,
# like show only those movies which are in the Action genre
# for each Director, display all the movies with detail.
# count the movies and plot a bar chart of the top 10 directors' movies.
# for each Actor, display all the movies with detail.
# count the movies and visualize the top 10 actors' movies in a plot

data file

Rank Title Genre Description Director Actors Year Runtime (Minutes) Rating Votes Revenue (Millions) Metascore
1 Guardians of the Galaxy Action,Adventure,Sci-Fi A group of intergalactic criminals are forced to work together to stop a fanatical warrior from taking control of the universe. James Gunn Chris Pratt, Vin Diesel, Bradley Cooper, Zoe Saldana 2014 121 8.1 757074 333.13 76
2 Prometheus Adventure,Mystery,Sci-Fi Following clues to the origin of mankind, a team finds a structure on a distant moon, but they soon realize they are not alone. Ridley Scott Noomi Rapace, Logan Marshall-Green, Michael Fassbender, Charlize Theron 2012 124 7 485820 126.46 65
3 Split Horror,Thriller Three girls are kidnapped by a man with a diagnosed 23 distinct personalities. They must try to escape before the apparent emergence of a frightful new 24th. M. Night Shyamalan James McAvoy, Anya Taylor-Joy, Haley Lu Richardson, Jessica Sula 2016 117 7.3 157606 138.12 62
4 Sing Animation,Comedy,Family In a city of humanoid animals, a hustling theater impresario's attempt to save his theater with a singing competition becomes grander than he anticipates even as its finalists' find that their lives will never be the same. Christophe Lourdelet Matthew McConaughey,Reese Witherspoon, Seth MacFarlane, Scarlett Johansson 2016 108 4.2 60545 270.32 59
5 Suicide Squad Action,Adventure,Fantasy A secret government agency recruits some of the most dangerous incarcerated super-villains to form a defensive task force. Their first mission: save the world from the apocalypse. David Ayer Will Smith, Jared Leto, Margot Robbie, Viola Davis 2015 123 3.2 393727 325.02 40
6 The Great Wall Action,Adventure,Fantasy European mercenaries searching for black powder become embroiled in the defense of the Great Wall of China against a horde of monstrous creatures. Yimou Zhang Matt Damon, Tian Jing, Willem Dafoe, Andy Lau 2014 103 6.1 56036 45.13 42
7 La La Land Comedy,Drama,Music A jazz pianist falls for an aspiring actress in Los Angeles. Damien Chazelle Ryan Gosling, Emma Stone, Rosemarie DeWitt, J.K. Simmons 2013 128 5.3 258682 151.06 93
8 Mindhorn Comedy A has-been actor best known for playing the title character in the 1980s detective series "Mindhorn" must work with the police when a serial killer says that he will only speak with Detective Mindhorn, whom he believes to be a real person. Sean Foley Essie Davis, Andrea Riseborough, Julian Barratt,Kenneth Branagh 2010 89 6.4 2490 71
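
The last sample row has one value fewer than the header, which is the kind of gap the null-value tasks are looking for. Below is a minimal sketch of how pandas reports such a gap. It uses two abridged rows written as CSV text; the comma separator and the choice of Revenue (Millions) as the empty field are assumptions, since the flattened listing above does not show which trailing field is missing.

```python
import io
import pandas as pd

# Two rows abridged from the sample above, written as CSV text.
# Assumptions: comma-separated fields, and the empty field in the
# Mindhorn row is Revenue (Millions).
csv_text = """Rank,Title,Year,Revenue (Millions),Metascore
1,Guardians of the Galaxy,2014,333.13,76
8,Mindhorn,2010,,71
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Columns that contain at least one null value.
null_columns = sample.columns[sample.isna().any()].tolist()

# Total number of null cells in the whole frame.
total_nulls = int(sample.isna().sum().sum())

# Rows that contain at least one null value.
rows_with_nulls = sample[sample.isna().any(axis=1)]

print(null_columns)
print(total_nulls)
print(rows_with_nulls)
```

An empty CSV field is read as NaN, so the same three expressions work unchanged on the full file.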

Solutions

Expert Solution

# import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# read movie.csv into a DataFrame called 'movie'
movie = pd.read_csv('movie.csv')

# describe the dataframe
movie.describe()

# rename the column Runtime (Minutes) with Runtime_Minutes, and Revenue (Millions) with Revenue_Millions 
movie.rename(columns = {'Runtime (Minutes)':'Runtime_Minutes','Revenue (Millions)':'Revenue_Millions'}, inplace = True)

# show if any column has null values
movie.columns[movie.isna().any()].tolist()

# count the total number of null values in the dataframe
movie.isna().sum().sum()

# print those rows which have null values
movie[movie.isna().any(axis=1)]

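# Fill null values: if a column is numerical, fill it with the column mean.
# Here only 'Revenue_Millions' and 'Metascore' have null values, and both are numerical,
# so no categorical fill is needed.
# numeric_only=True keeps mean() from failing on the text columns in pandas 2.0+.
movie.fillna(movie.mean(numeric_only=True), inplace=True)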
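
The task also asks for categorical columns to be filled with their most frequent value when they have gaps. A minimal sketch of filling by column type follows. The `sample` frame and the `fill_missing` helper are hypothetical illustrations, not part of the original answer or of pandas.

```python
import pandas as pd

# Hypothetical toy frame (not the real movie.csv) with one numeric gap
# and one categorical gap.
sample = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Director": ["Ridley Scott", None, "Ridley Scott", "James Gunn"],
    "Revenue_Millions": [100.0, None, 50.0, 30.0],
})

def fill_missing(df):
    """Return a copy: numeric nulls get the column mean, other nulls the column mode."""
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].mean())
        else:
            modes = out[col].mode(dropna=True)
            if not modes.empty:
                # mode() can return several values on a tie; take the first.
                out[col] = out[col].fillna(modes.iloc[0])
    return out

filled = fill_missing(sample)
print(filled)
```

Working on a copy leaves the original frame untouched, which makes it easy to compare the data before and after the fill.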

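# plot a histogram of the 'Year' column in the movie dataframe, showing how many movies were released each year.
fig, ax = plt.subplots(figsize=(8, 6))
# one bin per year, so each bar is the number of movies released in that year
years = movie['Year']
years.hist(ax=ax, bins=range(years.min(), years.max() + 2))
plt.show()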

# print the movie details for the title 'Grumpier Old Men'.
movie[movie['Title'] == 'Grumpier Old Men']
# An empty result means there is no record with this title in the data.

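# show those movies which were released after 1995-01-01
# (only a 'Year' column is available, so keeping 1995 onward is an approximation)
movie[movie['Year'] >= 1995]

# sort the movie DataFrame in descending order by release date
# (there is no release_date column, so 'Year' is used)
movie.sort_values('Year', ascending=False)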

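# for each year, display the total number of movies in each genre, for example Action=1000, Adventure=400
# 'Genre' holds comma-separated values, so split it into one row per (movie, genre) pair first
genres = movie.assign(Genre=movie['Genre'].str.split(',')).explode('Genre', ignore_index=True)
genres['Genre'] = genres['Genre'].str.strip()
count = genres.groupby(['Year', 'Genre']).size().unstack(fill_value=0)
print(count)

# plot the counts calculated above (one group of bars per year, one bar per genre)
fig, ax = plt.subplots(figsize=(10, 7))
count.plot.bar(ax=ax)
plt.show()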

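# filter the movies by a specific genre, e.g. show only the movies tagged with the Action genre
# 'Genre' is a comma-separated string, so test whether it contains the genre instead of equality
movie[movie['Genre'].str.contains('Action', na=False)]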
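
Because Genre stores several comma-separated genres per movie, both the per-year totals and the genre filter depend on treating each genre as its own token. A small sketch of that idea, using six rows taken from the sample data in the question (only the Title, Genre and Year fields are kept):

```python
import pandas as pd

# Title, Genre and Year copied from the first six sample rows.
sample = pd.DataFrame({
    "Title": ["Guardians of the Galaxy", "Prometheus", "Split",
              "Sing", "Suicide Squad", "The Great Wall"],
    "Genre": ["Action,Adventure,Sci-Fi", "Adventure,Mystery,Sci-Fi",
              "Horror,Thriller", "Animation,Comedy,Family",
              "Action,Adventure,Fantasy", "Action,Adventure,Fantasy"],
    "Year": [2014, 2012, 2016, 2016, 2015, 2014],
})

# One row per (movie, genre) pair.
genres = sample.assign(Genre=sample["Genre"].str.split(",")).explode(
    "Genre", ignore_index=True
)
genres["Genre"] = genres["Genre"].str.strip()

# Rows are years, columns are genres, cells are movie counts.
per_year = pd.crosstab(genres["Year"], genres["Genre"])
print(per_year)

# Movies tagged Action, matched on whole genre tokens.
is_action = sample["Genre"].str.split(",").apply(
    lambda parts: "Action" in [p.strip() for p in parts]
)
action_movies = sample[is_action]
print(action_movies["Title"].tolist())
```

Matching on whole tokens avoids accidental hits if one genre name ever appears inside another.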

# for each Director, display all the movies with detail.
for director, df in movie.groupby('Director'):
    print('---------------------------------' + director + '---------------------------------------')
    print(df)  # the full group, not just the first five rows that head() would show


# count the movies per director and plot a bar chart of the top 10 directors.
top_10_directors = movie['Director'].value_counts().head(10)
print(top_10_directors)

fig, ax = plt.subplots(figsize=(8, 6))
top_10_directors.plot.bar(ax=ax)
plt.show()

# for each Actor, display all the movies with detail.
# 'Actors' is a comma-separated string; strip the spaces left after each comma
list_flat = set(movie['Actors'].dropna().str.split(',').explode().str.strip())
for actor in sorted(list_flat):
    # regex=False treats the dots in names such as 'J.K. Simmons' literally
    df = movie[movie['Actors'].str.contains(actor, regex=False, na=False)]
    print('---------------------------------' + actor + '---------------------------------------')
    print(df)



# count the movies per actor and visualize the top 10 actors in a plot
# split the comma-separated 'Actors' strings into one name per row, then count exact names
actor_counts = movie['Actors'].str.split(',').explode().str.strip().value_counts()
fig, ax = plt.subplots(figsize=(8, 6))
actor_counts.head(10).plot.bar(ax=ax)
plt.show()
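
One caveat when counting actors: a substring test can count a name that merely appears inside a longer name. The sketch below compares substring matching with exact-name counting on a hypothetical toy frame; the titles and actor names are invented for illustration only.

```python
import pandas as pd

# Hypothetical toy data: "Ann Lee" is a prefix of "Ann Leeson".
toy = pd.DataFrame({
    "Title": ["Film 1", "Film 2", "Film 3"],
    "Actors": ["Ann Lee, Bo Chan", "Ann Leeson,Bo Chan", "Ann Lee, Cy Diaz"],
})

# Substring matching also hits "Ann Leeson", so it overcounts "Ann Lee".
substring_count = int(toy["Actors"].str.contains("Ann Lee", regex=False).sum())

# Exact matching: split on commas, one name per row, strip spaces, count.
actors = toy["Actors"].str.split(",").explode().str.strip()
actor_counts = actors.value_counts()

print(substring_count)
print(actor_counts)
```

Splitting and stripping first also handles the inconsistent spacing after commas that appears in the sample data.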

Each task above is answered with an explanatory comment.
