Data from fifty of the top box-office-grossing movies are provided in the file. Quantitative variables for each movie's release year, inflation-adjusted and non-inflation-adjusted box office receipts in the United States, budget, and world box office receipts are also included. Use the data visualization methods presented in Chapter 3 to explore these data and discover relationships between the variables. Include the following in your report:
3. Create a frequency distribution, percent frequency distribution, and histogram for the inflation-adjusted U.S. box office receipts. Use bin sizes of $100 million. Interpret the results. Do there appear to be any outliers?
4. Create a PivotTable for these data. Use the PivotTable to generate a crosstabulation for movie genre and rating. Determine which combinations of genre and rating are most represented in the top 50 movie data. Now filter the data to consider only movies after 1980. What combinations are most represented after 1980? What does this indicate about how the preferences of moviegoers have changed over time?
Title | Year Released | U.S. Box Office Receipts (Inflation Adjusted Millions $) | Rating | Genre | Budget (Non-Inflation Adjusted Millions $) | World Box Office Receipts (Non-Inflation Adjusted Millions $) | U.S. Box Office Receipts (Non-Inflation Adjusted Millions $) |
Gone With the Wind | 1939 | $1,650 | G | Drama | $3 | $391 | $199 |
Star Wars | 1977 | $1,426 | PG | SciFi/Fantasy | $11 | $798 | $461 |
The Sound of Music | 1965 | $1,145 | G | Musical | -- | $163 | $163 |
E.T. | 1982 | $1,132 | PG | SciFi/Fantasy | -- | $757 | $435 |
Titanic | 1997 | $1,096 | PG-13 | Drama | $200 | $2,185 | $659 |
The Ten Commandments | 1956 | $1,053 | G | Drama | $14 | $80 | $80 |
Jaws | 1975 | $1,029 | PG | Action | $12 | $471 | $260 |
Doctor Zhivago | 1965 | $973 | PG-13 | Drama | $11 | $112 | $112 |
The Jungle Book | 1967 | $871 | G | Animated | -- | $206 | $142 |
Snow White and the Seven Dwarfs | 1937 | $854 | G | Animated | $1 | $185 | $185 |
Ben-Hur | 1959 | $844 | G | Drama | $15 | $70 | $70 |
One Hundred and One Dalmatians | 1961 | $825 | G | Animated | -- | $215 | $153 |
The Exorcist | 1973 | $809 | R | Horror | $12 | $358 | $205 |
Avatar | 2009 | $804 | PG-13 | Action | $230 | $2,778 | $761 |
The Empire Strikes Back | 1980 | $772 | PG | SciFi/Fantasy | $18 | $534 | $290 |
Return of the Jedi | 1983 | $741 | PG | SciFi/Fantasy | $33 | $573 | $309 |
The Lion King | 1994 | $722 | G | Animated | $45 | $952 | $423 |
Star Wars: The Phantom Menace | 1999 | $720 | PG | SciFi/Fantasy | $110 | $1,027 | $475 |
The Sting | 1973 | $715 | PG | Drama | $6 | $160 | $160 |
Mary Poppins | 1964 | $686 | G | Musical | -- | $102 | $102 |
Raiders of the Lost Ark | 1981 | $683 | PG | Action | $20 | $384 | $248 |
Jurassic Park | 1993 | $676 | PG-13 | Action | $63 | $920 | $357 |
The Graduate | 1967 | $671 | PG | Drama | -- | $104 | $104 |
Fantasia | 1940 | $651 | G | Animated | $2 | $76 | $76 |
Marvel's The Avengers | 2012 | $623 | PG-13 | SciFi/Fantasy | $220 | $1,512 | $623 |
The Godfather | 1972 | $623 | R | Drama | $6 | $135 | $135 |
Forrest Gump | 1994 | $618 | PG-13 | Drama | $55 | $680 | $330 |
Close Encounters of the Third Kind | 1977 | $612 | PG | Drama | -- | $300 | $128 |
The Dark Knight | 2008 | $589 | PG-13 | SciFi/Fantasy | $150 | $1,002 | $533 |
Sleeping Beauty | 1959 | $570 | G | Animated | $6 | $52 | $52 |
Grease | 1978 | $562 | PG | Comedy | $6 | $380 | $182 |
Shrek 2 | 2004 | $557 | PG | Animated | $75 | $912 | $437 |
Butch Cassidy and the Sundance Kid | 1969 | $553 | PG | Western | -- | $102 | $102 |
Spider-Man | 2002 | $552 | PG-13 | SciFi/Fantasy | $139 | $807 | $404 |
Bambi | 1942 | $552 | G | Animated | -- | $268 | $103 |
Independence Day | 1996 | $549 | PG-13 | SciFi/Fantasy | $75 | $813 | $306 |
Love Story | 1970 | $549 | PG | Drama | -- | $106 | $106 |
Beverly Hills Cop | 1984 | $548 | R | Comedy | -- | $316 | $235 |
Pinocchio | 1940 | $529 | G | Animated | $3 | $84 | $84 |
Home Alone | 1990 | $528 | PG | Comedy | -- | $534 | $286 |
Cleopatra | 1963 | $527 | UR | Drama | $44 | $58 | $58 |
Airport | 1970 | $518 | G | Drama | $10 | $101 | $101 |
American Graffiti | 1973 | $515 | PG | Drama | $1 | $115 | $115 |
Ghostbusters | 1984 | $515 | PG | SciFi/Fantasy | $30 | $292 | $239 |
The Robe | 1953 | $513 | UR | Drama | $5 | $36 | $36 |
Pirates of the Caribbean: Dead Man's Chest | 2006 | $513 | PG-13 | Action | $225 | $1,066 | $423 |
Around the World in 80 Days | 1956 | $507 | G | Action | $6 | $42 | $42 |
LOTR: The Return of the King | 2003 | $496 | PG-13 | SciFi/Fantasy | $94 | $1,129 | $377 |
Blazing Saddles | 1974 | $496 | R | Comedy | -- | $120 | $120 |
Batman | 1989 | $494 | PG-13 | SciFi/Fantasy | -- | $413 | $251 |
Note:
Thank you for the question. Because the visualization methods of Chapter 3 are not reproduced here, we have answered only the parts specifically asked in the question.
3.
The minimum value of inflation-adjusted U.S. box office receipts, given in the column “U.S. Box Office Receipts (Inflation Adjusted Millions $)”, is $494 million, and the maximum value is $1,650 million.
For convenience in constructing the frequency distribution and the percent frequency distribution, the lower boundary of the lowest class is taken to be 450, and the upper boundary of the highest class is taken to be 1,650.
Since each bin is $100 million wide, the classes are 450-550, 550-650, …, 1,550-1,650 (12 classes in all). So that no value is counted twice, a value that coincides with a class boundary is counted in the class for which it is the upper limit. For example, a box office receipt value of $550 million would be counted in the class 450-550, not in the class 550-650.
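The boundary rule above can be sketched in a few lines of Python. This is only an illustration, not part of the Excel workflow; the names `class_of`, `LOWEST`, `HIGHEST` and `WIDTH` are hypothetical.

```python
import math

LOWEST = 450    # lower boundary of the lowest class ($ millions)
HIGHEST = 1650  # upper boundary of the highest class ($ millions)
WIDTH = 100     # bin size ($ millions)

def class_of(receipts):
    """Return (lower, upper) for the class containing `receipts`.

    A value equal to a class boundary is assigned to the class
    in which it is the upper limit, as described in the text.
    """
    if not LOWEST < receipts <= HIGHEST:
        raise ValueError("value lies outside the classes 450-550, ..., 1,550-1,650")
    index = math.ceil((receipts - LOWEST) / WIDTH) - 1
    lower = LOWEST + index * WIDTH
    return lower, lower + WIDTH

print(class_of(550))   # boundary value: goes to 450-550
print(class_of(551))   # just above the boundary: goes to 550-650
```

Using the ceiling rather than the floor is what makes each class closed on the right, so a boundary value lands in the lower of the two candidate classes.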
Frequency:
Count the number of movies in each $100 million class to find the frequency of that class. For instance, 15 movies have inflation-adjusted U.S. box office receipts above $450 million and up to $550 million, so the frequency of the first class, 450-550, is 15. The frequencies of the other classes are found in the same way.
Percent Frequency:
There are 50 movies in total, so the total frequency is 50. To find the percent frequency of a class, divide its frequency by the total frequency and multiply the ratio by 100. The percent frequency of the first class is (15/50) × 100 = 30%. The percent frequencies of the other classes are found in the same way.
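As a cross-check on the hand count, the frequency and percent frequency of every class can be tallied with a short Python sketch. The receipts are copied from the table in the question; the variable and function names are hypothetical, and the class boundaries follow the upper-limit rule described earlier.

```python
import math
from collections import Counter

# Inflation-adjusted U.S. receipts ($ millions), in the order of the question's table.
receipts = [
    1650, 1426, 1145, 1132, 1096, 1053, 1029, 973, 871, 854,
    844, 825, 809, 804, 772, 741, 722, 720, 715, 686,
    683, 676, 671, 651, 623, 623, 618, 612, 589, 570,
    562, 557, 553, 552, 552, 549, 549, 548, 529, 528,
    527, 518, 515, 515, 513, 513, 507, 496, 496, 494,
]

def class_lower(x, lowest=450, width=100):
    """Lower boundary of the class containing x (boundary values go to the lower class)."""
    return lowest + (math.ceil((x - lowest) / width) - 1) * width

freq = Counter(class_lower(x) for x in receipts)
total = len(receipts)

# One row per class: (lower, upper, frequency, percent frequency).
table = []
for lower in range(450, 1650, 100):
    f = freq[lower]
    table.append((lower, lower + 100, f, 100 * f / total))

for lower, upper, f, pct in table:
    label = f"{lower:,}-{upper:,}"
    print(f"{label:<12}{f:>4}{pct:>7.0f}%")
```

The first row reproduces the figures quoted in the text (frequency 15, percent frequency 30%), and the frequencies add up to the 50 movies.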
Histogram:
We have drawn the percent frequency histogram by plotting the percent frequencies (along the vertical axis) against the corresponding classes (along the horizontal axis). Each class is represented by a bar whose width equals the class width and whose height equals the percent frequency of the class.
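The following are the frequency and percent frequency distributions, along with the percent frequency histogram:
Class | Frequency | Percent Frequency |
450-550 | 15 | 30% |
550-650 | 11 | 22% |
650-750 | 9 | 18% |
750-850 | 5 | 10% |
850-950 | 2 | 4% |
950-1,050 | 2 | 4% |
1,050-1,150 | 4 | 8% |
1,150-1,250 | 0 | 0% |
1,250-1,350 | 0 | 0% |
1,350-1,450 | 1 | 2% |
1,450-1,550 | 0 | 0% |
1,550-1,650 | 1 | 2% |
Total | 50 | 100% |
[Figure: percent frequency histogram of inflation-adjusted U.S. box office receipts, by $100 million class]
Most of the movies (35 of 50, or 70%) have receipts between $450 million and $750 million, and the distribution is skewed to the right. The classes 1,350-1,450 and 1,550-1,650 each contain a single movie (Star Wars at $1,426 million and Gone With the Wind at $1,650 million) and are separated from the rest by empty classes, so these two movies appear to be outliers.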
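For readers without Excel at hand, a rough text rendering of the histogram can be produced with the Python sketch below. It is a stand-in for a plotted chart, not the chart itself. The receipts are copied from the table in the question, the names are hypothetical, and each `#` stands for one movie, that is, 2 percentage points.

```python
import math

# Inflation-adjusted U.S. receipts ($ millions), in the order of the question's table.
receipts = [
    1650, 1426, 1145, 1132, 1096, 1053, 1029, 973, 871, 854,
    844, 825, 809, 804, 772, 741, 722, 720, 715, 686,
    683, 676, 671, 651, 623, 623, 618, 612, 589, 570,
    562, 557, 553, 552, 552, 549, 549, 548, 529, 528,
    527, 518, 515, 515, 513, 513, 507, 496, 496, 494,
]

LOWEST, WIDTH, N_CLASSES = 450, 100, 12

# Tally each value into its class; boundary values go to the lower class.
counts = [0] * N_CLASSES
for x in receipts:
    counts[math.ceil((x - LOWEST) / WIDTH) - 1] += 1

# Build one text bar per class, labelled with its percent frequency.
lines = []
for i, c in enumerate(counts):
    lower = LOWEST + i * WIDTH
    label = f"{lower:,}-{lower + WIDTH:,}"
    pct = 100 * c / len(receipts)
    lines.append(f"{label:<12}| {'#' * c} {pct:.0f}%")

print("\n".join(lines))
```

In the printed bars, the long run of marks at the low end and the two isolated single-movie bars at the high end make the right skew and the candidate outliers easy to see.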
4.
To create the PivotTable, open the dataset in an Excel sheet and select the entire dataset.
Go to Insert > Tables > PivotTable.
Click OK in the dialog box that opens. A new sheet opens with the pivot table.
Under Choose fields to add to report, drag Rating to Column Labels, Genre to Row Labels, and Title to Values, in the section Drag fields between areas below.
Click the drop-down arrow next to Title under Values and choose Value Field Settings. Under Summarize value field by, choose Count, then click OK.
The pivot table is obtained. We have renamed the heading Column Labels to Rating and the heading Row Labels to Genre.
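The pivot table, showing the count of movies for each combination of genre and rating, is given below:
Genre | G | PG | PG-13 | R | UR | Grand Total |
Action | 1 | 2 | 3 | 0 | 0 | 6 |
Animated | 8 | 1 | 0 | 0 | 0 | 9 |
Comedy | 0 | 2 | 0 | 2 | 0 | 4 |
Drama | 4 | 5 | 3 | 1 | 2 | 15 |
Horror | 0 | 0 | 0 | 1 | 0 | 1 |
Musical | 2 | 0 | 0 | 0 | 0 | 2 |
SciFi/Fantasy | 0 | 6 | 6 | 0 | 0 | 12 |
Western | 0 | 1 | 0 | 0 | 0 | 1 |
Grand Total | 15 | 17 | 12 | 4 | 2 | 50 |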
It can be observed that the maximum frequency in a cell is 8 (excluding the Grand Total cells, of course), corresponding to the Animated genre and the G Rating. Hence, G-rated Animated movies are the most represented in the top 50 movie data.
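The same crosstabulation can be cross-checked outside Excel. The Python sketch below is one possible way to do it, using only the standard library; the year, rating and genre of each movie are copied from the table in the question, and all names (`DATA`, `movies`, `crosstab`) are hypothetical.

```python
from collections import Counter

# "year rating genre" for the 50 movies, in the order of the question's table.
DATA = """
1939 G Drama; 1977 PG SciFi/Fantasy; 1965 G Musical; 1982 PG SciFi/Fantasy;
1997 PG-13 Drama; 1956 G Drama; 1975 PG Action; 1965 PG-13 Drama;
1967 G Animated; 1937 G Animated; 1959 G Drama; 1961 G Animated;
1973 R Horror; 2009 PG-13 Action; 1980 PG SciFi/Fantasy; 1983 PG SciFi/Fantasy;
1994 G Animated; 1999 PG SciFi/Fantasy; 1973 PG Drama; 1964 G Musical;
1981 PG Action; 1993 PG-13 Action; 1967 PG Drama; 1940 G Animated;
2012 PG-13 SciFi/Fantasy; 1972 R Drama; 1994 PG-13 Drama; 1977 PG Drama;
2008 PG-13 SciFi/Fantasy; 1959 G Animated; 1978 PG Comedy; 2004 PG Animated;
1969 PG Western; 2002 PG-13 SciFi/Fantasy; 1942 G Animated; 1996 PG-13 SciFi/Fantasy;
1970 PG Drama; 1984 R Comedy; 1940 G Animated; 1990 PG Comedy;
1963 UR Drama; 1970 G Drama; 1973 PG Drama; 1984 PG SciFi/Fantasy;
1953 UR Drama; 2006 PG-13 Action; 1956 G Action; 2003 PG-13 SciFi/Fantasy;
1974 R Comedy; 1989 PG-13 SciFi/Fantasy
"""
movies = [(int(y), r, g) for y, r, g in (item.split() for item in DATA.split(";"))]

# Count movies for every (genre, rating) combination.
crosstab = Counter((genre, rating) for _, rating, genre in movies)

ratings = ["G", "PG", "PG-13", "R", "UR"]
genres = sorted({genre for _, _, genre in movies})

# Print the crosstabulation with row totals.
print(f"{'Genre':<14}" + "".join(f"{r:>6}" for r in ratings) + f"{'Total':>7}")
for g in genres:
    row = [crosstab[(g, r)] for r in ratings]
    print(f"{g:<14}" + "".join(f"{n:>6}" for n in row) + f"{sum(row):>7}")

# The most represented combination of genre and rating.
(top_genre, top_rating), top_count = crosstab.most_common(1)[0]
print(top_genre, top_rating, top_count)
```

The largest cell agrees with the pivot table: G-rated Animated movies, with a count of 8.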
To filter the data so that only movies after 1980 are considered, go to Choose fields to add to report and drag Year Released to Report Filter, in the section Drag fields between areas below.
Click the drop-down arrow next to Year Released in the table and tick Select Multiple Items. Select only the years from 1981 through 2012 in the list. Click OK.
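The filtered pivot table, covering the 20 movies released after 1980, is given below:
Genre | G | PG | PG-13 | R | Grand Total |
Action | 0 | 1 | 3 | 0 | 4 |
Animated | 1 | 1 | 0 | 0 | 2 |
Comedy | 0 | 1 | 0 | 1 | 2 |
Drama | 0 | 0 | 2 | 0 | 2 |
SciFi/Fantasy | 0 | 4 | 6 | 0 | 10 |
Grand Total | 1 | 7 | 11 | 1 | 20 |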
It can be observed that the maximum frequency in a cell in the filtered table is 6, corresponding to the SciFi/Fantasy genre and the PG-13 Rating. Hence, PG-13 rated SciFi/Fantasy movies are the most represented after 1980.
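The effect of the year filter can also be checked with a short Python sketch that splits the movies at 1980 and finds the most represented combination in each group. The data are copied from the table in the question; the helper `crosstab` and the other names are hypothetical, and the comparison with the movies through 1980 is an addition for illustration.

```python
from collections import Counter

# "year rating genre" for the 50 movies, in the order of the question's table.
DATA = """
1939 G Drama; 1977 PG SciFi/Fantasy; 1965 G Musical; 1982 PG SciFi/Fantasy;
1997 PG-13 Drama; 1956 G Drama; 1975 PG Action; 1965 PG-13 Drama;
1967 G Animated; 1937 G Animated; 1959 G Drama; 1961 G Animated;
1973 R Horror; 2009 PG-13 Action; 1980 PG SciFi/Fantasy; 1983 PG SciFi/Fantasy;
1994 G Animated; 1999 PG SciFi/Fantasy; 1973 PG Drama; 1964 G Musical;
1981 PG Action; 1993 PG-13 Action; 1967 PG Drama; 1940 G Animated;
2012 PG-13 SciFi/Fantasy; 1972 R Drama; 1994 PG-13 Drama; 1977 PG Drama;
2008 PG-13 SciFi/Fantasy; 1959 G Animated; 1978 PG Comedy; 2004 PG Animated;
1969 PG Western; 2002 PG-13 SciFi/Fantasy; 1942 G Animated; 1996 PG-13 SciFi/Fantasy;
1970 PG Drama; 1984 R Comedy; 1940 G Animated; 1990 PG Comedy;
1963 UR Drama; 1970 G Drama; 1973 PG Drama; 1984 PG SciFi/Fantasy;
1953 UR Drama; 2006 PG-13 Action; 1956 G Action; 2003 PG-13 SciFi/Fantasy;
1974 R Comedy; 1989 PG-13 SciFi/Fantasy
"""
movies = [(int(y), r, g) for y, r, g in (item.split() for item in DATA.split(";"))]

def crosstab(rows):
    """Count movies for every (genre, rating) combination in `rows`."""
    return Counter((genre, rating) for _, rating, genre in rows)

# "After 1980" means release years 1981 and later, matching the Excel filter.
after_1980 = [m for m in movies if m[0] > 1980]
through_1980 = [m for m in movies if m[0] <= 1980]

top_after = crosstab(after_1980).most_common(1)[0]
top_before = crosstab(through_1980).most_common(1)[0]

print(len(after_1980), top_after)
print(len(through_1980), top_before)
```

Under these assumptions, the leading combination moves from G-rated Animated among the movies through 1980 to PG-13 SciFi/Fantasy among the movies after 1980.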
It appears that the preferences of moviegoers have shifted toward PG-13 rated SciFi/Fantasy movies in more recent years, specifically 1981 to 2012, the last 32 years covered by the data.