Movie | Twitter Activity | Receipts |
The Devil Inside | 219509 | 14763 |
The Dictator | 6405 | 5796 |
Paranormal Activity 3 | 165128 | 15829 |
The Hunger Games | 579288 | 36871 |
Bridesmaids | 6564 | 8995 |
Red Tails | 11104 | 7477 |
Act of Valor | 9152 | 8054 |
B. Write a null and an alternative hypothesis statement.
C. What is the R-squared value? Interpret its meaning.
D. What is the correlation coefficient? Interpret its meaning.
E. Predict the receipts for a movie that has a Twitter activity of 100,000, using the linear regression formula generated after you run the regression analysis in Excel (just like we did in the lecture notes).
F. At the 95% confidence level (α = 0.05), is there significant evidence of a relationship between Twitter activity and receipts?
G. Do you accept or reject the null hypothesis?
H. Based upon your answers above, provide a paragraph (3-4 sentences) of conclusion that would be appropriate to explain the results to a senior executive who isn't as familiar with statistics as you are.
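Output using Excel (these figures correspond to a Twitter activity of 57928 for The Hunger Games; the table above lists 579288):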
SUMMARY OUTPUT
Regression Statistics
Multiple R | 0.279643 |
R Square | 0.0782 |
Adjusted R Square | -0.10616 |
Standard Error | 11337.88 |
Observations | 7 |
ANOVA
Source | df | SS | MS | F | Significance F |
Regression | 1 | 54526187.0403 | 54526187.0403 | 0.424172 | 0.543606 |
Residual | 5 | 642737346.3883 | 128547469.2777 |
Total | 6 | 697263533.4286 |
Term | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | 11648.22 | 5573.582 | 2.089898 | 0.090912 | -2679.133 | 25975.57 |
X | 0.034148 | 0.052432 | 0.651285 | 0.543606 | -0.100633 | 0.16893 |
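The Excel figures can be cross-checked with a short least-squares computation. The following is a minimal sketch in plain Python, not the procedure used in the original answer. It assumes The Hunger Games was entered with a Twitter activity of 57928, because that value reproduces the reported output (the data table lists 579288). The function name `fit_line` is a hypothetical helper introduced here for illustration.

```python
import math

# Twitter activity (x) and receipts (y), in the table's row order.
# ASSUMPTION: The Hunger Games is entered as 57928, the value that
# reproduces the Excel output; the data table itself lists 579288.
twitter = [219509, 6405, 165128, 57928, 6564, 11104, 9152]
receipts = [14763, 5796, 15829, 36871, 8995, 7477, 8054]


def fit_line(x, y):
    """Return (intercept, slope, r) for the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


intercept, slope, r = fit_line(twitter, receipts)
print(f"intercept = {intercept:.3f}")
print(f"slope     = {slope:.6f}")
print(f"r         = {r:.4f}")
print(f"R squared = {r * r:.4f}")
```

With these inputs the intercept, slope, and R Square agree with the Coefficients and Regression Statistics rows of the output.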
a) Independent variable (X) = Twitter Activity
Dependent variable (y) = Receipts
b) Null and alternative hypothesis:
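H₀: ρ = 0 (there is no linear relationship between Twitter activity and receipts)
Hₐ: ρ ≠ 0 (there is a linear relationship between Twitter activity and receipts)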
c) Coefficient of determination, r² = 0.0782
About 7.82% of the variation in receipts (y) is explained by the least-squares regression on Twitter activity (x).
d) Correlation coefficient, r = 0.2796
It indicates a weak positive linear relationship between x and y.
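Both values in parts c) and d) can be recovered from the ANOVA table: R² is the regression sum of squares divided by the total sum of squares, and r is its square root with the sign of the slope. A small sketch, using only numbers copied from the Excel output (the variable names are illustrative):

```python
import math

# Values copied from the ANOVA and Coefficients sections of the Excel output.
ss_regression = 54526187.0403
ss_total = 697263533.4286
slope = 0.034148

# Share of the variation in receipts explained by the fitted line.
r_squared = ss_regression / ss_total

# In simple linear regression, r takes the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)

print(f"R squared = {r_squared:.4f}")
print(f"r         = {r:.4f}")
```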
e) Regression equation:
ŷ = 11648.217 + 0.0341484x
Predicted value of y at x = 100000:
ŷ = 11648.217 + 0.0341484 × 100000 ≈ 15063.06
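The prediction is sensitive to how many decimals of the slope are kept, because the slope is multiplied by 100000. The sketch below uses the coefficients as printed in the Excel output; `predict_receipts` is a hypothetical helper, not something defined in the original answer.

```python
# Coefficients as reported in the Excel output (Intercept and X rows).
INTERCEPT = 11648.22
SLOPE = 0.034148


def predict_receipts(twitter_activity, slope=SLOPE, intercept=INTERCEPT):
    """Predicted receipts from the fitted line y_hat = intercept + slope * x."""
    return intercept + slope * twitter_activity


predicted = predict_receipts(100_000)
# Same prediction with the slope rounded to four decimals, for comparison.
predicted_rounded = predict_receipts(100_000, slope=0.0341)

print(f"slope 0.034148: {predicted:.2f}")
print(f"slope 0.0341:   {predicted_rounded:.2f}")
```

Rounding the slope to 0.0341 lowers the prediction by about 5, so the unrounded coefficient is the one to use.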
f) Correlation, r = 0.2796
Test statistic :
t = r*√(n-2)/√(1-r²) = 0.2796 *√(7 - 2)/√(1 - 0.2796²) = 0.6513
df = n-2 = 5
p-value = T.DIST.2T(ABS(0.6513), 5) = 0.5436
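The test statistic and p-value can also be reproduced outside Excel. This is a sketch in plain Python under one stated assumption: instead of `T.DIST.2T`, it uses the closed-form expression for Student's t distribution that holds only for exactly 5 degrees of freedom, which is the case here. Both function names are hypothetical.

```python
import math


def correlation_t_test(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation r."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, df


def two_tailed_p_df5(t):
    """Two-tailed p-value for Student's t with exactly 5 degrees of freedom.

    Closed form for odd df, specialised to df = 5; not valid for other df.
    """
    theta = math.atan(abs(t) / math.sqrt(5))
    c = math.cos(theta)
    # P(|T| < t) for df = 5.
    central = (2 / math.pi) * (theta + math.sin(theta) * (c + (2 / 3) * c ** 3))
    return 1 - central


t_stat, df = correlation_t_test(0.279643, 7)
p_value = two_tailed_p_df5(t_stat)

print(f"t = {t_stat:.4f}, df = {df}, p-value = {p_value:.4f}")
```

The same t statistic and p-value appear in the X row of the Excel coefficients table, since in simple linear regression the slope test and the correlation test are equivalent.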
g) Conclusion:
Since the p-value (0.5436) > α (0.05), we fail to reject the null hypothesis.
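h) We fail to reject the null hypothesis, so the data do not provide significant evidence of a linear relationship between Twitter activity and receipts. In this sample of seven movies, Twitter activity explains only about 7.8% of the variation in receipts, so it should not be relied on to predict receipts.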