Question

In: Computer Science

1. Find an example of data analytics project that has published its data publicly and state...

1. Find an example of data analytics project that has published its data publicly and state the source.
2. Explain about the project and why it is interesting to you?
3. Identify and explain the objective of the data analytics.
4. Identify and explain the data types.  
5. Explain the data mining techniques involved
6. State the insights discovered in the project.

Solutions

Expert Solution

Answer 1) IMDB movie database Analysis of 100 years. IMDB has the data on its official website. We can download the data set from IMDB.com or it can be downloaded from Github.com.In this dataset you will try to find some interesting insights into a few movies released between 1916 and 2016, using Python. .Write Python code to explore the data, gain insights into the movies, actors, directors, and collections.

Answer2) The project has the movie dataset of 100 years from 1916 to 2016 with all its attributes such as ratings, budgets, directors, collections, top 250 movies of all time. It was interesting because it gave us insights into the movie database like never before. It also helped us to understand that movies may have a large number of Facebook likes but not good ratings, Ratings countrywide, Actorwise ratings, and director wise ratings.

Answer 3) The objective of the data analysis was to get insights from the data by :

  • Read the data in a data frame.
  • Cleaning the data by inspecting null values.
  • Dropping unnecessary columns.
  • Dropping unnecessary rows with a high percentage of null values.
  • Imputing values wherever necessary.
  • Checking the number of retained rows.
  • Changing the budget to millions and checking highest-grossing movies.
  • Dropping duplicate values.
  • Finding the Top 250 Movies of all time.
  • Finding the top 10 best directors and Most popular genres.
  • Finding critic favorite and adience favorite characters.

Answers 4) Since we have built the project in python3.8, we will discuss the python built-in datatypes:

  • Python has the following data types built-in by default, in these categories:

    Text Type: str
    Numeric Types: int, float, complex
    Sequence Types: list, tuple, range
    Mapping Type: dict
    Set Types: set, frozenset
    Boolean Type: bool
    Binary Types: bytes, bytearray, memoryview

Answer 5) Data mining techniques used here are OUTLIER DETECTION/ ANOMALY DETECTION.

This refers to the observation for data items in a dataset that do not match an expected pattern or an expected behavior. Anomalies are also known as outliers, novelties, noise, deviations and exceptions. Often they provide critical and actionable information. An anomaly is an item that deviates considerably from the common average within a dataset or a combination of data. These types of items are statistically aloof as compared to the rest of the data and hence, it indicates that something out of the ordinary has happened and requires additional attention.This technique can be used in a variety of domains, such as intrusion detection, system health monitoring, fraud detection, fault detection, event detection in sensor networks, and detecting eco-system disturbances. Analysts often remove the anomalous data from the dataset top discover results with an increased accuracy.

Answer6)

  • During the data analysis, some of the language columns had NaN values, so it was better to impute them as English.
  • We also noticed that after removing and imputing the values we still had around 77 percent of data to analyze.
  • No surprises that Damien Chazelle (director of Whiplash and La La Land) is in top 10 directors list.
  • Leonardo Di Caprio and Brad Pitt are two most liked actors.
  • When finding the mean of critic reviews and audience reviews here we found Leonardo Di caprio Aced Both lists.
  • actor_1_name
    Leonardo DiCaprio    330.190476
    Brad Pitt            245.000000
    Meryl Streep         181.454545
    Name: num_critic_for_reviews, dtype: float64
    actor_1_name
    Leonardo DiCaprio    914.476190
    Brad Pitt            742.352941
    Meryl Streep         297.181818
    Name: num_user_for_reviews, dtype: float64

Related Solutions

Project Management Analytics Exercise This is an example of a construction project. The construction for a...
Project Management Analytics Exercise This is an example of a construction project. The construction for a large corporate building must be started by February 19th and completed by the end of the year. The legal contract for the project includes a penalty of $10,000 for every week of delay beyond December 31st. This implies 315 days or 45 weeks for the expected project duration assuming a seven-day work schedule and no holidays in between! The details of the project activities...
Project Management Analytics Exercise This is an example of a construction project. The construction for a...
Project Management Analytics Exercise This is an example of a construction project. The construction for a large corporate building must be started by February 19th and completed by the end of the year. The legal contract for the project includes a penalty of $10,000 for every week of delay beyond December 31st. This implies 315 days or 45 weeks for the expected project duration assuming a seven-day work schedule and no holidays in between! The details of the project activities...
Key roles for a successful analytics project. Select a data analytics of your choice and discuss...
Key roles for a successful analytics project. Select a data analytics of your choice and discuss how the following roles add value to this initiative: Business User, Project Sponsor, Project Manager, Business Intelligence Analyst, Database Administrator, and Data Engineer. Again, please note that you will discuss specific activities that each of these roles may perform on a data analytics project that you select. Need 500 discussion
Need 600 words discussion Key roles for a successful analytics project. Select a data analytics of...
Need 600 words discussion Key roles for a successful analytics project. Select a data analytics of your choice and discuss how the following roles add value to this initiative: Business User, Project Sponsor, Project Manager, Business Intelligence Analyst, Database Administrator, and Data Engineer. Again, please note that you will discuss specific activities that each of these roles may perform on a data analytics project that you select.
1. What is business/data analytics? 2. What are the three broad classifications of data analytics? 3....
1. What is business/data analytics? 2. What are the three broad classifications of data analytics? 3. How do businesses use data analytics? 4. Find a picture and/or diagram of the information pyramid and post it. Describe it in your own words. 5. What are three reasons why we study business analytics?
1. Find an example of data that have been misrepresented. This data could be in the...
1. Find an example of data that have been misrepresented. This data could be in the form of a graph or a statistical report in any media available to you. Explain what caused the misinformation. 2. What is wrong with the following graph? How would you change the graph to correctly display the information?
1 a) Imagine a real-world example (application, situation, or business) that can benefit from Data Analytics....
1 a) Imagine a real-world example (application, situation, or business) that can benefit from Data Analytics. Discuss what sorts of data can be available in your example (not necessarily only one dataset), making sure some of the data to be structured, and some to be unstructured. You need to be specific about the data details, such as: where it comes from, how it can be collected, how it looks like (give few specific samples), and most importantly: how the data...
(a) Find one example of a major managerial decision published in any of the recent newspapers...
(a) Find one example of a major managerial decision published in any of the recent newspapers or periodicals (e.g. Wall Street Journal, Business Week, etc.). Describe the decision and any other information, such as what led to the decision, and what happened as a result of the decision. What did you learn about the topic of decisionmaking from this example? (b) Recently, an airline company had more customer complaints, such as late flights and mishandled baggage, than any other major...
Find a publicly traded company that has treasury stock on its balance sheet. Provide a link...
Find a publicly traded company that has treasury stock on its balance sheet. Provide a link to the balance sheet in your post and explain the details of the treasury stock transactions based upon the amounts and disclosures found in the financial statements. Why do you think the company acquired the treasury stock? Do not choose a company that has already been reported on by one of your classmates. Participate in follow-up discussion by critiquing the posts provided by your...
Explain the concept of an interest rate swap. Find an example of a publicly traded firm...
Explain the concept of an interest rate swap. Find an example of a publicly traded firm that used an interest rate swap. Provide detailed analytics on how the swap was used and how it benefited both firms involved. You should include amounts, time frames, and actual rates.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT