Question

This task exercises your ability to use Python to represent data and to use flow control and functions to reorganize the data. You need to submit the ipynb file to Moodle.

A data scientist has collected YouTube information and saved the video info in multiple CSV files. Each CSV file has the following columns:

· video_id
· trending_date
· title
· channel_title
· category_id
· publish_time
· tags
· views
· likes
· dislikes
· comment_count
· thumbnail_link
· comments_disabled
· ratings_disabled
· video_error_or_removed
· description

You are asked to write Python code to process CSV data files. You can only use the collections, numpy, and csv modules. The task requires you to solve the 6 problems listed below. Use the template ipynb file to complete your code. For each question, your code should output an answer CSV file so that the results can be viewed with MS Excel. Use one cell for each question. After running the 6 cells one after another, your solution should have produced 6 answer CSV files. The answer files must be named “question1.csv”, “question2.csv”, … , “question6.csv”. Document your code, and add comments to explain key steps.

2 visible test cases are provided to you. Each has an input.csv file and the corresponding answer files.

Your code will be tested with multiple hidden test cases (HTCs). For each HTC video file, your code should generate the corresponding answers in 6 answer CSV files. The HTCs are similar to the given test cases. You can assume that HTC video files have all the above-mentioned columns. For each column, HTCs have the same value type and ranges. For example, HTC video files have the column “likes” and “dislikes”, with all values being non-negative integers. You don’t have to consider missing values or non-standard values (e.g. “117394” is never saved as “117,394”). The cells in your solution will be executed one after another.
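Because the numeric columns are guaranteed to hold plain non-negative integers, they can be passed straight to `int()`. The following is a minimal sketch of that step, not part of the assignment template: the sample rows, the `load_rows` helper, and the `NUMERIC_COLUMNS` tuple are hypothetical, and `io.StringIO` is used only so the example runs without a file on disk.

```python
import csv
import io

# Hypothetical two-row sample with a subset of the columns, for illustration only.
SAMPLE = (
    "video_id,category_id,views,likes,dislikes,comment_count\n"
    "abc123,10,117394,2500,40,310\n"
    "def456,24,980,12,0,3\n"
)

NUMERIC_COLUMNS = ("views", "likes", "dislikes", "comment_count")

def load_rows(handle):
    """Read CSV rows as dicts and convert the numeric columns to int."""
    rows = []
    for row in csv.DictReader(handle):
        for name in NUMERIC_COLUMNS:
            row[name] = int(row[name])  # safe: values are plain integers
        rows.append(row)
    return rows

rows = load_rows(io.StringIO(SAMPLE))
total_views = rows[0]["views"] + rows[1]["views"]
```

With a real input file, the same helper could be called as `load_rows(open(path, newline=""))`, where `path` is whatever the video CSV is named.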

Question 6:

Show all categories with at least 10 videos. For each category, show the category name, the number of videos, the video_id of the video with the highest views (on a tie, use the one that appears first in the CSV), the average comment count, the number of videos with comments disabled, and a list of the unique channels that published videos in the category (in ascending alphabetical order, separated by '|'). Save the results in descending order of video count.
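The tie-breaking and ordering rules are the parts of this question that are easiest to get wrong, so here is a small sketch of them on made-up data. The `videos` triples and channel names are hypothetical. The question does not say how categories with equal video counts should be ordered, so keeping first-seen order here is an assumption. Note also that `sorted()` on strings is case-sensitive, which may or may not match the intended "alphabetical" order.

```python
# Hypothetical (video_id, category, views) triples, for illustration only.
videos = [
    ("v1", "Music", 500),
    ("v2", "Music", 900),
    ("v3", "Music", 900),   # ties with v2 but appears later in the file
    ("v4", "Sports", 100),
    ("v5", "News", 700),
]

best = {}    # category -> (video_id, views) of the current leader
counts = {}  # category -> number of videos seen so far
for video_id, category, views in videos:
    counts[category] = counts.get(category, 0) + 1
    # Strict '>' means a later video with equal views never replaces the leader.
    if category not in best or views > best[category][1]:
        best[category] = (video_id, views)

# sorted() is stable, even with reverse=True, so categories with equal
# counts keep the order in which they first appeared.
ordered = sorted(counts, key=lambda c: counts[c], reverse=True)

# Unique channels, ascending, joined with '|'.
channels = "|".join(sorted({"Vevo", "BBC", "Vevo", "ESPN"}))
```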

Convert the category id to the actual category name. Since different countries use different category id encodings, your code should convert the category id to the category name dynamically. That means your code must read the categories directly from the category_id.txt file. Do not hard-code the categories in your code.
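The layout of category_id.txt is not shown in the question, so the sketch below rests on an assumption: each line holds an id, then whitespace, then the category name (which may itself contain spaces). The `load_category_names` helper and the sample text are hypothetical; if the real file uses a different separator, only the `split` call would need to change.

```python
import io

def load_category_names(handle):
    """Build an id -> name dict; assumes lines look like '<id> <name>'."""
    names = {}
    for line in handle:
        # Split on the first run of whitespace only, so names such as
        # 'Film & Animation' stay in one piece.
        parts = line.strip().split(None, 1)
        if len(parts) == 2:  # skip blank or malformed lines
            names[parts[0]] = parts[1]
    return names

# Hypothetical file contents, for illustration only.
sample = io.StringIO("1 Film & Animation\n10 Music\n\n17 Sports\n")
names = load_category_names(sample)

# Ids are kept as strings so they compare equal to the category_id column
# as read by the csv module.
unknown = names.get("99", "99")  # fall back to the raw id if it is not listed
```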

The output file should have the following headers:

category_name

video_count

most_popular_video

average_comment

disable_comment_count

channels
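A short sketch of writing these headers followed by the data rows with `csv.writer`. The `write_answer` helper and the sample row are hypothetical, and the example writes to an in-memory buffer so it runs anywhere.

```python
import csv
import io

HEADERS = ["category_name", "video_count", "most_popular_video",
           "average_comment", "disable_comment_count", "channels"]

def write_answer(handle, rows):
    """Write the header line, then one CSV line per category row."""
    writer = csv.writer(handle)
    writer.writerow(HEADERS)
    writer.writerows(rows)  # each row is a list in the same order as HEADERS

buffer = io.StringIO()
# Hypothetical result row, for illustration only.
write_answer(buffer, [["Music", 12, "v2", 12.5, 1, "BBC|Vevo"]])
lines = buffer.getvalue().splitlines()
```

For the real answer file, the handle would come from `open("question6.csv", "w", newline="")`; passing `newline=""` is what the csv module documentation recommends to avoid blank lines between rows on Windows.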

I just need help with the last question, question 6.

Solutions

Expert Solution

import csv

# Column positions in the video CSV (see the column list in the question).
VIDEO_ID, CHANNEL, CATEGORY = 0, 3, 4
VIEWS, COMMENTS, COMMENTS_DISABLED = 7, 10, 12

# Read the video file; 'file.csv' is a placeholder for the actual input path.
with open('file.csv', 'r', newline='') as f:
    data = list(csv.reader(f))

# Map category id -> category name, read from category_id.txt at run time.
# Assumption: each line holds an id, whitespace, then the category name.
category_names = {}
with open('category_id.txt', 'r') as f:
    for line in f:
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            category_names[parts[0]] = parts[1]

# Collect running totals per category, in order of first appearance.
stats = {}
for row in data[1:]:  # skip the header row
    if not row:
        continue
    s = stats.setdefault(row[CATEGORY], {
        'count': 0, 'best_id': '', 'best_views': -1,
        'comments': 0, 'disabled': 0, 'channels': set()})
    s['count'] += 1
    views = int(row[VIEWS])
    if views > s['best_views']:  # strict '>' keeps the first video on a tie
        s['best_views'] = views
        s['best_id'] = row[VIDEO_ID]
    s['comments'] += int(row[COMMENTS])
    # Assumption: comments_disabled is stored as the text True/False.
    if row[COMMENTS_DISABLED].strip().lower() == 'true':
        s['disabled'] += 1
    s['channels'].add(row[CHANNEL])

# Build one output row per category with at least 10 videos.
rows = []
for cid, s in stats.items():
    if s['count'] >= 10:
        rows.append([
            category_names.get(cid, cid),       # fall back to the id if unlisted
            s['count'],
            s['best_id'],
            s['comments'] / s['count'],         # average comment count
            s['disabled'],
            '|'.join(sorted(s['channels']))])   # unique channels, ascending

# Descending order of video count (stable, so ties keep first-seen order).
rows.sort(key=lambda r: r[1], reverse=True)

for r in rows:
    print("category name is", r[0])
    print("number of videos is", r[1])
    print("video with highest views is", r[2])
    print("average comment count is", r[3])
    print("videos with comments disabled:", r[4])
    print("unique channels are", r[5])

fields = ['category_name', 'video_count', 'most_popular_video',
          'average_comment', 'disable_comment_count', 'channels']

# name of csv file
filename = "question6.csv"

# writing to csv file
with open(filename, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(fields)    # writing the fields
    csvwriter.writerows(rows)     # writing the data rows
