Dataset #2 – Star Wars Film Data
Description: Weekly domestic box office revenues for the 8 Star Wars films
Research ‘Question’: Find a ‘best’ linear model to predict Star Wars revenue per day using the number of theaters, number of weeks since release, film number, and release year.
theaters | weeknum | film | year | revperday |
3672 | 1 | IV | 1977 | 18498679.7 |
3672 | 2 | IV | 1977 | 9505314.86 |
3672 | 3 | IV | 1977 | 4127697.71 |
3672 | 4 | IV | 1977 | 2632591 |
3422 | 5 | IV | 1977 | 1950438.14 |
3311 | 6 | IV | 1977 | 2521766.29 |
3186 | 7 | IV | 1977 | 2831227.86 |
2681 | 8 | IV | 1977 | 1023363.71 |
2170 | 9 | IV | 1977 | 652710.714 |
1851 | 10 | IV | 1977 | 566439 |
1202 | 11 | IV | 1977 | 250623.714 |
907 | 12 | IV | 1977 | 179533.714 |
505 | 13 | IV | 1977 | 102494.857 |
311 | 14 | IV | 1977 | 74403.1429 |
206 | 15 | IV | 1977 | 44651.5714 |
215 | 16 | IV | 1977 | 46953.5714 |
228 | 17 | IV | 1977 | 54924.2857 |
172 | 18 | IV | 1977 | 29591.1429 |
291 | 19 | IV | 1977 | 76476.1429 |
270 | 20 | IV | 1977 | 59581 |
160 | 21 | IV | 1977 | 41030.1429 |
111 | 22 | IV | 1977 | 28579.4286 |
57 | 23 | IV | 1977 | 22707.5714 |
43 | 24 | IV | 1977 | 17242.4286 |
40 | 25 | IV | 1977 | 11668.7143 |
30 | 26 | IV | 1977 | 9229 |
3682 | 1 | V | 1980 | 15161652.6 |
3682 | 2 | V | 1980 | 8844278.29 |
3682 | 3 | V | 1980 | 5120454.57 |
3387 | 4 | V | 1980 | 1772898.57 |
3025 | 5 | V | 1980 | 1165040.57 |
2505 | 6 | V | 1980 | 1340427.71 |
2505 | 7 | V | 1980 | 1944470 |
2015 | 8 | V | 1980 | 799467 |
1550 | 9 | V | 1980 | 421755.857 |
1077 | 10 | V | 1980 | 303789.143 |
783 | 11 | V | 1980 | 142854.857 |
502 | 12 | V | 1980 | 85785.1429 |
352 | 13 | V | 1980 | 52545.1429 |
441 | 14 | V | 1980 | 70452.4286 |
388 | 15 | V | 1980 | 45788.2857 |
388 | 16 | V | 1980 | 41332.7143 |
360 | 17 | V | 1980 | 39414.5714 |
205 | 18 | V | 1980 | 24388.8571 |
151 | 19 | V | 1980 | 17734.5714 |
95 | 20 | V | 1980 | 14462.7143 |
80 | 21 | V | 1980 | 12256.4286 |
72 | 22 | V | 1980 | 4412 |
15 | 23 | V | 1980 | 786.285714 |
7 | 24 | V | 1980 | 455.285714 |
3855 | 1 | VI | 1983 | 17580664.1 |
3855 | 2 | VI | 1983 | 7119019.71 |
3805 | 3 | VI | 1983 | 3913192.71 |
3004 | 4 | VI | 1983 | 2412629 |
2725 | 5 | VI | 1983 | 1652119.43 |
2002 | 6 | VI | 1983 | 977608.429 |
1460 | 7 | VI | 1983 | 643752.429 |
1008 | 8 | VI | 1983 | 404027.429 |
605 | 9 | VI | 1983 | 240410.429 |
409 | 10 | VI | 1983 | 169831.286 |
310 | 11 | VI | 1983 | 107789.429 |
248 | 12 | VI | 1983 | 80801.4286 |
391 | 13 | VI | 1983 | 95609.8571 |
391 | 14 | VI | 1983 | 90454.4286 |
321 | 15 | VI | 1983 | 38485 |
228 | 16 | VI | 1983 | 29893 |
246 | 17 | VI | 1983 | 25054 |
164 | 18 | VI | 1983 | 11661.4286 |
119 | 19 | VI | 1983 | 9036 |
74 | 20 | VI | 1983 | 8862.57143 |
55 | 21 | VI | 1983 | 7250 |
55 | 22 | VI | 1983 | 5731.71429 |
3858 | 1 | I | 1999 | 20897581.3 |
3858 | 2 | I | 1999 | 9015073 |
3858 | 3 | I | 1999 | 3487897.43 |
3325 | 4 | I | 1999 | 1834563.57 |
2750 | 5 | I | 1999 | 1438515.14 |
2424 | 6 | I | 1999 | 1818900.29 |
2316 | 7 | I | 1999 | 1315771.29 |
1555 | 8 | I | 1999 | 510037.571 |
1003 | 9 | I | 1999 | 345916.714 |
560 | 10 | I | 1999 | 159016.429 |
340 | 11 | I | 1999 | 96117.5714 |
245 | 12 | I | 1999 | 69097 |
160 | 13 | I | 1999 | 49419.4286 |
441 | 14 | I | 1999 | 136217 |
422 | 15 | I | 1999 | 93123.1429 |
331 | 16 | I | 1999 | 57197.7143 |
231 | 17 | I | 1999 | 39329.1429 |
191 | 18 | I | 1999 | 29226.5714 |
140 | 19 | I | 1999 | 22458.7143 |
89 | 20 | I | 1999 | 14974.7143 |
4285 | 1 | II | 2002 | 19483946.1 |
4285 | 2 | II | 2002 | 7050087.71 |
4005 | 3 | II | 2002 | 3828435.43 |
3125 | 4 | II | 2002 | 2158583 |
2585 | 5 | II | 2002 | 1212925.71 |
1955 | 6 | II | 2002 | 817540.571 |
1322 | 7 | II | 2002 | 488799.571 |
1017 | 8 | II | 2002 | 417103.143 |
775 | 9 | II | 2002 | 193287.571 |
589 | 10 | II | 2002 | 143490.429 |
320 | 11 | II | 2002 | 59758.8571 |
241 | 12 | II | 2002 | 41315.4286 |
408 | 13 | II | 2002 | 74103.8571 |
377 | 14 | II | 2002 | 54086.4286 |
283 | 15 | II | 2002 | 38864.1429 |
225 | 16 | II | 2002 | 27574.1429 |
159 | 17 | II | 2002 | 18940 |
105 | 18 | II | 2002 | 14270.4286 |
90 | 19 | II | 2002 | 9984.85714 |
56 | 20 | II | 2002 | 8214.28571 |
52 | 21 | II | 2002 | 4788.28571 |
38 | 22 | II | 2002 | 2020.85714 |
4325 | 1 | III | 2005 | 21314847.9 |
4455 | 2 | III | 2005 | 6561318.43 |
4393 | 3 | III | 2005 | 3879632 |
3455 | 4 | III | 2005 | 1973952.71 |
2771 | 5 | III | 2005 | 1146060.29 |
1936 | 6 | III | 2005 | 718753.857 |
1508 | 7 | III | 2005 | 474352.286 |
1091 | 8 | III | 2005 | 403442.857 |
744 | 9 | III | 2005 | 173298.571 |
415 | 10 | III | 2005 | 78098.7143 |
301 | 11 | III | 2005 | 51525.8571 |
190 | 12 | III | 2005 | 33442.8571 |
505 | 13 | III | 2005 | 84180.1429 |
356 | 14 | III | 2005 | 51179.8571 |
245 | 15 | III | 2005 | 33814.8571 |
201 | 16 | III | 2005 | 21102 |
135 | 17 | III | 2005 | 17775.7143 |
95 | 18 | III | 2005 | 11938.8571 |
44 | 19 | III | 2005 | 7837.85714 |
44 | 20 | III | 2005 | 6345.28571 |
36 | 21 | III | 2005 | 3118.28571 |
23 | 22 | III | 2005 | 1052.42857 |
4125 | 1 | VII | 2015 | 24281289.7 |
4125 | 2 | VII | 2015 | 8218801.86 |
4125 | 3 | VII | 2015 | 3098252 |
3577 | 4 | VII | 2015 | 1644693.14 |
1840 | 5 | VII | 2015 | 1302432.86 |
1732 | 6 | VII | 2015 | 1294747 |
1732 | 7 | VII | 2015 | 918122.286 |
1507 | 8 | VII | 2015 | 442270.857 |
941 | 9 | VII | 2015 | 291175.571 |
725 | 10 | VII | 2015 | 168580.857 |
465 | 11 | VII | 2015 | 109324.714 |
365 | 12 | VII | 2015 | 71774.2857 |
409 | 13 | VII | 2015 | 93213.2857 |
321 | 14 | VII | 2015 | 77634.8571 |
303 | 15 | VII | 2015 | 45363.7143 |
208 | 16 | VII | 2015 | 30144.8571 |
122 | 17 | VII | 2015 | 20494.5714 |
94 | 18 | VII | 2015 | 14027.7143 |
85 | 19 | VII | 2015 | 12463.4286 |
66 | 20 | VII | 2015 | 8202.42857 |
4375 | 1 | VIII | 2017 | 32302438.4 |
4375 | 2 | VIII | 2017 | 10059634.3 |
4145 | 3 | VIII | 2017 | 4872357.86 |
3175 | 4 | VIII | 2017 | 2777846.71 |
2414 | 5 | VIII | 2017 | 1630078.29 |
1738 | 6 | VIII | 2017 | 963457.571 |
1328 | 7 | VIII | 2017 | 558613 |
1092 | 8 | VIII | 2017 | 564588.286 |
810 | 9 | VIII | 2017 | 196717.429 |
601 | 10 | VIII | 2017 | 136677.857 |
320 | 11 | VIII | 2017 | 76497 |
252 | 12 | VIII | 2017 | 53219.8571 |
407 | 13 | VIII | 2017 | 86566.5714 |
330 | 14 | VIII | 2017 | 57112.1429 |
240 | 15 | VIII | 2017 | 35131 |
163 | 16 | VIII | 2017 | 22387.2857 |
225 | 17 | VIII | 2017 | 21222.2857 |
85 | 18 | VIII | 2017 | 10420.1429 |
78 | 19 | VIII | 2017 | 5208.14286 |
Hello
I'm going to use Excel to find the "best" linear model by fitting a regression line (trendline) to the scatterplot of revenue per day against each variable in turn.
(a) Dependent Variable : Revenue per day, Independent Variable : No. of theatres
y = 2227.2x - 1000000
R² = 0.4543
(b) Dependent Variable : Revenue per day, Independent Variable : No. of weeks since release
y = -408649x + 7000000
R² = 0.3118
(c) Dependent Variable : Revenue per day, Independent Variable : Film number
y = 84910x + 2000000
R² = 0.0016
(d) Dependent Variable : Revenue per day, Independent Variable : Release Year
y = 18222x - 30000000
R² = 0.0032
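For anyone who wants to reproduce this kind of trendline outside Excel, here is a minimal Python sketch of a one-variable least-squares fit with its R². The helper name fit_line is my own, and the data is only the first six rows of the table (film IV, weeks 1-6), so its output will not match the full-data equations above.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r2). fit_line is a hypothetical helper
    written for this illustration, not part of any library.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2


# Excerpt only: first six rows of the table (film IV, weeks 1-6)
theaters = [3672, 3672, 3672, 3672, 3422, 3311]
weeknum = [1, 2, 3, 4, 5, 6]
revperday = [18498679.7, 9505314.86, 4127697.71, 2632591, 1950438.14, 2521766.29]

for name, x in [("theaters", theaters), ("weeknum", weeknum)]:
    slope, intercept, r2 = fit_line(x, revperday)
    print(f"{name}: y = {slope:.1f}x + {intercept:.1f}, R^2 = {r2:.4f}")
```

To reproduce the full-data fits, pass the complete theaters, weeknum, film (as a number) and year columns in place of the excerpt.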
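R² measures how closely the data follow the fitted regression line: it is the proportion of the variation in the dependent variable that the line explains, ranging from 0 (no fit) to 1 (perfect fit).
Because No. of theatres has the highest R² of the four variables (0.4543), it gives the best fit among these single-variable models.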
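The comparison step can be written out as a short Python sketch. It only ranks the four R² values reported in (a) to (d); the dictionary keys reuse the column names from the table and the variable names are my own.

```python
# R^2 values reported in (a)-(d) for the four single-variable trendlines
r_squared = {
    "theaters": 0.4543,
    "weeknum": 0.3118,
    "film": 0.0016,
    "year": 0.0032,
}

# Rank the predictors from highest to lowest R^2
ranked = sorted(r_squared.items(), key=lambda kv: kv[1], reverse=True)
best_name, best_r2 = ranked[0]

for name, r2 in ranked:
    print(f"{name:9s} R^2 = {r2:.4f}")
print(f"Best single predictor: {best_name} (R^2 = {best_r2:.4f})")
```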
I hope this solves your query.