Use the following table displaying 20 of the top (era-adjusted) grossing movies of all time, along with their Metacritic score (a weighted average of critics' ratings), the amount of money they grossed (weighted for the year of release, in millions of dollars), and an indicator of whether or not the movie is a sequel, to answer the following questions. You will need to use StatCrunch for this question.
Title | Metacritic | Adj_Gross | Sequel |
Gone with the Wind | 97 | 1895.422 | No |
Star Wars | 90 | 1668.98 | No |
The Sound of Music | 63 | 1335.086 | No |
E.T. the Extra-Terrestrial | 91 | 1329.175 | No |
Titanic | 75 | 1270.102 | No |
Jaws | 87 | 1200.098 | No |
Doctor Zhivago | 69 | 1163.15 | No |
The Exorcist | 81 | 1036.315 | No |
Snow White and the Seven Dwarfs | 95 | 1021.33 | No |
Star Wars: The Force Awakens | 80 | 1013.038 | Yes |
One Hundred and One Dalmatians | 83 | 936.2251 | No |
The Empire Strikes Back | 82 | 919.2448 | Yes |
Ben-Hur | 90 | 918.6995 | No |
Avatar | 83 | 911.791 | No |
Avengers: Endgame | 78 | 892.6696 | Yes |
Return of the Jedi | 58 | 881.3366 | Yes |
Jurassic Park | 68 | 858.8936 | No |
Star Wars: Episode I - The Phantom Menace | 51 | 846.2244 | Yes |
The Lion King | 88 | 835.3018 | No |
The Sting | 83 | 835.269 | No |
a) Find the correlation between the Metacritic and Adj_Gross variables. Describe the strength and direction of the correlation.
b) Does having a higher Metacritic score have more of an impact on gross revenue for original movies (non-sequels) or sequels? Put another way, does an increase of 1 in the Metacritic variable add more value to sequels or non-sequels? Justify your answer. You should be finding two different regression equations to answer this question.
c) Let’s say a new movie came out that received a Metacritic score of 89. Using your regression equations, how much money would it be expected to make if it is a sequel? How much money would it be expected to make if it is not a sequel?
Here, the given data are as tabulated above.
a) Find the correlation between the Metacritic and Adj_Gross variables. Describe the strength and direction of the correlation.
We can visualize the relationship between the variables Metacritic and Adj_Gross using a scatterplot of Adj_Gross against Metacritic.
To find the correlation between the variables Metacritic (x) and Adj_Gross (y), we use the following formula:
$$r = \frac{\mathrm{Cov}(x, y)}{\sqrt{\mathrm{Var}(x)\,\mathrm{Var}(y)}} \qquad \text{(equation A)}$$
i.e., Correlation = (Covariance between Metacritic and Adj_Gross) / (Variance of Metacritic × Variance of Adj_Gross)^(1/2)
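Now, the covariance between Metacritic and Adj_Gross is given by the formula:
$$\mathrm{Cov}(x, y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$$
where, x_i = i-th data point of the variable x
y_i = i-th data point of the variable y
x̄ = mean of variable x
ȳ = mean of variable y
n = total number of observations
The normalizing factor (1/n, or 1/(n−1) for the sample versions) appears in both the numerator and the denominator of equation A and cancels, so the values below are reported as plain sums of products and squares of deviations, without dividing by n.
Here, Σ(x_i − x̄)(y_i − ȳ) = 24891.3476 (1)
Also, the variances of Metacritic and Adj_Gross are given by the formula:
$$\mathrm{Var}(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad \mathrm{Var}(y) = \frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2$$
Here, Σ(x_i − x̄)² = 2884.8000 (2)
and, Σ(y_i − ȳ)² = 1603556.0400 (3)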
Putting (1), (2), (3) in equation A:
We obtain, Correlation = 0.3659
A correlation of 0.3659 indicates a weak-to-moderate positive linear association between Metacritic and Adj_Gross. The positive sign means that movies with higher Metacritic scores tend to have a higher Adj_Gross, although the relationship is far from exact.
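The calculation in part (a) can be checked with a few lines of plain Python. This is only a minimal sketch, not the original notebook: the function name `pearson_r` and the variable names are illustrative, and the two lists are simply the Metacritic and Adj_Gross columns copied from the table in row order.

```python
# Metacritic scores (x) and era-adjusted gross in $ millions (y),
# copied from the table in row order.
metacritic = [97, 90, 63, 91, 75, 87, 69, 81, 95, 80,
              83, 82, 90, 83, 78, 58, 68, 51, 88, 83]
adj_gross = [1895.422, 1668.98, 1335.086, 1329.175, 1270.102,
             1200.098, 1163.15, 1036.315, 1021.33, 1013.038,
             936.2251, 919.2448, 918.6995, 911.791, 892.6696,
             881.3366, 858.8936, 846.2244, 835.3018, 835.269]


def pearson_r(x, y):
    """Return (S_xy, S_xx, S_yy, r) for two equal-length sequences.

    S_xy, S_xx, S_yy are sums of products/squares of deviations from the
    means. The 1/n factor is omitted because it cancels in the ratio.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    r = s_xy / (s_xx * s_yy) ** 0.5
    return s_xy, s_xx, s_yy, r


s_xy, s_xx, s_yy, r = pearson_r(metacritic, adj_gross)
print(f"S_xy = {s_xy:.4f}, S_xx = {s_xx:.4f}, S_yy = {s_yy:.4f}, r = {r:.4f}")
```

The three sums correspond to the quantities labelled (1), (2) and (3), and `r` is the correlation from equation A. Depending on rounding, the last digit may differ slightly from the value quoted above.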
b) Does having a higher Metacritic score have more of an impact on gross revenue for original movies (non-sequels) or sequels? Put another way, does an increase of 1 in the Metacritic variable add more value to sequels or non-sequels? Justify your answer. You should be finding two different regression equations to answer this question.
Here, we perform the analysis in a Python notebook (Anaconda):
First, we divide the data into two groups, original movies (non-sequels) and sequels, and then find the correlation between the variables Metacritic and Adj_Gross in each group using equation (A). We also fit a simple linear regression of Adj_Gross on Metacritic in each group:
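GROUP-1 (non-sequels)
We obtain, Correlation (non-sequel group) = 0.2412
Fitted regression equation: Adj_Gross ≈ 527.92 + 7.4795 × Metacritic
Interpretation: an increase of 1 in the Metacritic variable adds about 7.4795 (million dollars) to the expected Adj_Gross of a non-sequel.
GROUP-2 (sequels)
We obtain, Correlation (sequel group) = 0.7120
Fitted regression equation: Adj_Gross ≈ 690.76 + 3.1482 × Metacritic
Interpretation: an increase of 1 in the Metacritic variable adds about 3.1482 (million dollars) to the expected Adj_Gross of a sequel.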
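So, comparing the two groups, we can state the following:
(1)- The linear relationship between Metacritic and Adj_Gross is considerably stronger in the sequel group (correlation 0.7120) than in the non-sequel group (correlation 0.2412). The correlation measures how closely the points follow a straight line, not how much one point of Metacritic is worth.
(2)- The slope answers the question: an increase of 1 in the Metacritic variable adds about 7.48 million dollars for non-sequels but only about 3.15 million dollars for sequels, so a higher score adds more value to non-sequels. With only 5 sequels in the data, the sequel estimate should be treated with caution.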
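The group-wise comparison can be reproduced along the following lines. This is a hedged sketch in plain Python and not the notebook used above: `fit_group` is an illustrative helper that applies the usual least-squares formulas (slope = S_xy / S_xx, intercept = ȳ − slope × x̄), and the two lists are the table rows split by the Sequel column.

```python
# (Metacritic, Adj_Gross in $ millions) pairs copied from the table,
# split by the Sequel column.
non_sequels = [
    (97, 1895.422), (90, 1668.98), (63, 1335.086), (91, 1329.175),
    (75, 1270.102), (87, 1200.098), (69, 1163.15), (81, 1036.315),
    (95, 1021.33), (83, 936.2251), (90, 918.6995), (83, 911.791),
    (68, 858.8936), (88, 835.3018), (83, 835.269),
]
sequels = [
    (80, 1013.038), (82, 919.2448), (78, 892.6696),
    (58, 881.3366), (51, 846.2244),
]


def fit_group(points):
    """Return (slope, intercept, r) of the least-squares line y = a + b*x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    s_yy = sum((y - y_bar) ** 2 for _, y in points)
    slope = s_xy / s_xx                  # change in Adj_Gross per Metacritic point
    intercept = y_bar - slope * x_bar    # the line passes through the point of means
    r = s_xy / (s_xx * s_yy) ** 0.5      # Pearson correlation within the group
    return slope, intercept, r


slope_non, intercept_non, r_non = fit_group(non_sequels)
slope_seq, intercept_seq, r_seq = fit_group(sequels)

print(f"non-sequels: slope = {slope_non:.4f}, r = {r_non:.4f}")
print(f"sequels:     slope = {slope_seq:.4f}, r = {r_seq:.4f}")
```

Comparing `slope_non` with `slope_seq` addresses the "value added per point" question, while `r_non` and `r_seq` describe only how tightly each group follows its own line.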
c) Let’s say a new movie came out that received a Metacritic score of 89. Using your regression equations, how much money would it be expected to make if it is a sequel? How much money would it be expected to make if it is not a sequel?
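FINDINGS on model
Substituting Metacritic = 89 into each fitted regression equation (Adj_Gross in millions of dollars):
sequel: 690.76 + 3.1482 × 89 ≈ 970.95
non-sequel: 527.92 + 7.4795 × 89 ≈ 1193.60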
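As a cross-check on part (c), the two lines can be refit directly from the table and evaluated at a Metacritic score of 89. The sketch below is an assumption-laden illustration rather than the original notebook: `fit_line` and `predict` are hypothetical helper names, and the fit is ordinary least squares on each group separately.

```python
# (Metacritic, Adj_Gross in $ millions) pairs from the table, by group.
non_sequels = [
    (97, 1895.422), (90, 1668.98), (63, 1335.086), (91, 1329.175),
    (75, 1270.102), (87, 1200.098), (69, 1163.15), (81, 1036.315),
    (95, 1021.33), (83, 936.2251), (90, 918.6995), (83, 911.791),
    (68, 858.8936), (88, 835.3018), (83, 835.269),
]
sequels = [
    (80, 1013.038), (82, 919.2448), (78, 892.6696),
    (58, 881.3366), (51, 846.2244),
]


def fit_line(points):
    """Ordinary least-squares (slope, intercept) for (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def predict(points, score):
    """Predicted Adj_Gross ($ millions) at the given Metacritic score."""
    slope, intercept = fit_line(points)
    return intercept + slope * score


new_score = 89  # Metacritic score of the hypothetical new movie
pred_sequel = predict(sequels, new_score)
pred_non_sequel = predict(non_sequels, new_score)

print(f"sequel:     {pred_sequel:.2f} million dollars")
print(f"non-sequel: {pred_non_sequel:.2f} million dollars")
```

Refitting from the table this way gives roughly 971 million dollars for a sequel and roughly 1194 million dollars for a non-sequel. Note that 89 lies above the highest sequel score in the data (82), so the sequel figure is an extrapolation from only five points.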