Question

For application case 4.6 – Data Mining Goes to Hollywood, describe the research study, the methodology, the results and the conclusion.

Data Mining Goes to Hollywood: Predicting Financial Success of Movies

Predicting box-office receipts (i.e., financial success) of a particular motion picture is an interesting and challenging problem. According to some domain experts, the movie industry is the “land of hunches and wild guesses” due to the difficulty associated with forecasting product demand, making the movie business in Hollywood a risky endeavor. In support of such observations, Jack Valenti (the longtime president and CEO of the Motion Picture Association of America) once mentioned that “…no one can tell you how a movie is going to do in the marketplace…not until the film opens in darkened theatre and sparks fly up between the screen and the audience.” Entertainment industry trade journals and magazines have been full of examples, statements, and experiences that support such a claim. Like many other researchers who have attempted to shed light on this challenging real-world problem, Ramesh Sharda and Dursun Delen have been exploring the use of data mining to predict the financial performance of a motion picture at the box office before it even enters production (while the movie is nothing more than a conceptual idea). In their highly publicized prediction models, they convert the forecasting (or regression) problem into a classification problem; that is, rather than forecasting the point estimate of box-office receipts, they classify a movie based on its box-office receipts in one of nine categories, ranging from “flop” to “blockbuster,” making the problem a multinomial classification problem. Table 5.4 illustrates the definition of the nine classes in terms of the range of box-office receipts.
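The conversion from a point forecast to an ordinal class can be sketched as a simple binning step. This is a minimal illustration only: the cut points below are placeholder values in millions of dollars (the actual ranges are defined in Table 5.4, which is not reproduced here), and the names `HYPOTHETICAL_UPPER_BOUNDS` and `receipts_to_class` are invented for this example.

```python
from bisect import bisect_right

# Placeholder cut points in millions of dollars. These are assumptions for
# illustration, NOT the ranges from Table 5.4. Eight cut points yield nine classes.
HYPOTHETICAL_UPPER_BOUNDS = [1, 10, 20, 40, 65, 100, 150, 200]


def receipts_to_class(receipts_millions, upper_bounds=HYPOTHETICAL_UPPER_BOUNDS):
    """Map box-office receipts to an ordinal class from 1 ("flop") to 9 ("blockbuster").

    A value equal to a cut point is assigned to the higher class.
    """
    if receipts_millions < 0:
        raise ValueError("receipts cannot be negative")
    # bisect_right counts how many cut points are <= the receipts value.
    return bisect_right(upper_bounds, receipts_millions) + 1


flop_class = receipts_to_class(0.5)
mid_class = receipts_to_class(50)
blockbuster_class = receipts_to_class(250)
```

With nine ordered classes as the target, any multiclass classifier can be trained in place of a regression model, which is the reframing the case describes.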

Data

Data was collected from a variety of movie-related databases (e.g., ShowBiz, IMDb, IMSDb, AllMovie, etc.) and consolidated into a single data set. The data set for the most recently developed models contained 2,632 movies released between 1998 and 2006. A summary of the independent variables along with their specifications is provided in Table 5.5. For more descriptive details and justification for inclusion of these independent variables, the reader is referred to Sharda and Delen (2007). Business Intelligence Spring 2017

Methodology

Using a variety of data mining methods, including neural networks, decision trees, support vector machines, and three types of ensembles, Sharda and Delen developed the prediction models. The data from 1998 to 2005 were used as training data to build the prediction models, and the data from 2006 were used as the test data to assess and compare the models’ prediction accuracy. Figure 5.15 shows a screenshot of IBM SPSS Modeler (formerly the Clementine data mining tool) depicting the process map employed for the prediction problem. The upper-left side of the process map shows the model development process, and the lower-right corner of the process map shows the model assessment (i.e., testing or scoring) process (more details on the IBM SPSS Modeler tool and its usage can be found on the book’s Web site).
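The year-based split described above (1998 to 2005 for training, 2006 for testing) can be sketched as follows. This is only an illustration of the splitting idea, not the SPSS Modeler process map; the record layout (`title`, `year`, `success_class`) and the sample movies are hypothetical.

```python
def split_by_year(movies, first_train_year=1998, last_train_year=2005, test_year=2006):
    """Split movie records into a training set and a held-out test set by release year."""
    train = [m for m in movies if first_train_year <= m["year"] <= last_train_year]
    test = [m for m in movies if m["year"] == test_year]
    return train, test


# Hypothetical records; the field names are assumptions made for this sketch.
movies = [
    {"title": "Movie A", "year": 1998, "success_class": 3},
    {"title": "Movie B", "year": 2003, "success_class": 7},
    {"title": "Movie C", "year": 2005, "success_class": 1},
    {"title": "Movie D", "year": 2006, "success_class": 5},
    {"title": "Movie E", "year": 2006, "success_class": 9},
]

train_set, test_set = split_by_year(movies)
```

Splitting by time rather than at random keeps the test movies strictly later than the training movies, which mirrors how such a model would be used on films that have not yet been released.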

Results

Table 5.6 provides the prediction results of all three data mining methods as well as the results of the three different ensembles. The first performance measure is the percent correct classification rate, which is called bingo. Also reported in the table is the 1-Away correct classification rate (i.e., within one category). The results indicate that SVM performed the best among the individual prediction models, followed by ANN; the worst of the three was the CART decision tree algorithm. In general, the ensemble models performed better than the individual prediction models, and among the ensembles the fusion algorithm performed the best. What is probably more important to decision makers, and what stands out in the results table, is the significantly lower standard deviation obtained from the ensembles compared to the individual models.
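The two performance measures can be sketched as below, together with a very simple way of combining several models. Two assumptions are made here: the 1-Away rate is taken to include exact hits, and the combiner is a plain plurality vote used only as a stand-in, since the case does not describe how the fusion algorithm works. The class labels in the example are made up and are not the results from Table 5.6.

```python
from collections import Counter


def bingo_rate(actual, predicted):
    """Fraction of movies whose predicted class exactly matches the actual class."""
    hits = sum(1 for a, p in zip(actual, predicted) if a == p)
    return hits / len(actual)


def one_away_rate(actual, predicted):
    """Fraction of predictions within one class of the actual class (exact hits included)."""
    near = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= 1)
    return near / len(actual)


def plurality_vote(*model_predictions):
    """Combine per-model class predictions; ties go to the lowest class (an arbitrary rule)."""
    combined = []
    for votes in zip(*model_predictions):
        counts = Counter(votes)
        top = max(counts.values())
        combined.append(min(c for c, n in counts.items() if n == top))
    return combined


# Made-up class labels (1 to 9) for six movies; not data from the study.
actual = [1, 4, 9, 5, 7, 3]
svm = [1, 5, 9, 5, 6, 1]
ann = [2, 4, 8, 5, 7, 1]
cart = [1, 4, 7, 3, 5, 2]

svm_bingo = bingo_rate(actual, svm)
svm_one_away = one_away_rate(actual, svm)
voted = plurality_vote(svm, ann, cart)
```

The 1-Away rate is always at least as high as the bingo rate under this definition, which is why it is useful as a more forgiving measure when the classes are ordered.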

Conclusion

The researchers claim that these prediction results are better than any reported in the published literature for this problem domain. Beyond the attractive accuracy of their prediction results of the box-office receipts, these models could also be used to further analyze (and potentially optimize) the decision variables in order to maximize the financial return. Specifically, the parameters used for modeling could be altered using the already trained prediction models in order to better understand the impact of different parameters on the end results. During this process, which is commonly referred to as sensitivity analysis, the decision maker of a given entertainment firm could find out, with a fairly high accuracy level, how much value a specific actor (or a specific release date, or the addition of more technical effects, etc.) brings to the financial success of a film, making the underlying system an invaluable decision aid.
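Sensitivity analysis of this kind can be sketched as varying one input at a time while holding the others fixed and recording what the model predicts. In the sketch below, `toy_score` is a hypothetical stand-in for a trained model with made-up weights, and the feature names (`star_power`, `special_effects`, `sequel`) are assumptions chosen for illustration, not the variables from Table 5.5.

```python
def toy_score(movie):
    """Hypothetical stand-in for a trained model: returns a made-up success score."""
    score = 1.0
    score += {"low": 0.0, "medium": 1.0, "high": 2.0}[movie["star_power"]]
    score += {"low": 0.0, "medium": 0.5, "high": 1.5}[movie["special_effects"]]
    if movie["sequel"]:
        score += 1.0
    return score


def sensitivity(model, baseline, feature, candidate_values):
    """Vary one feature across candidate values, keeping all other inputs at baseline."""
    results = {}
    for value in candidate_values:
        scenario = dict(baseline)  # copy so the baseline itself is never modified
        scenario[feature] = value
        results[value] = model(scenario)
    return results


baseline = {"star_power": "medium", "special_effects": "low", "sequel": False}
star_effect = sensitivity(toy_score, baseline, "star_power", ["low", "medium", "high"])
```

Comparing the entries of `star_effect` shows how much the predicted outcome moves when only that one decision variable changes, which is the kind of question the case says a decision maker could ask of the trained models.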

Solutions

Expert Solution

Research Study:-

This research study uses a number of software tools and data mining techniques to build models that predict the financial success (box-office receipts) of Hollywood movies while they are nothing more than ideas (before they enter production). Predicting the box-office receipts of a particular motion picture is an interesting and challenging problem. The difficulty of forecasting product demand makes the movie business in Hollywood a risky endeavor.

Methodology:-

Data were collected from a variety of movie-related databases and consolidated into a single data set. A variety of data mining methods, including neural networks, decision trees, support vector machines, and three types of ensembles, were used to develop the prediction models. Data from 1998 to 2005 were used as training data, and data from 2006 were used as test data.

Results:-

Among the individual models, SVM performed best, followed by ANN and then the CART decision tree. The ensemble models performed better than the individual prediction models, and the fusion algorithm performed best among the ensembles. The ensembles also had a significantly lower standard deviation than the individual models.

Conclusion:-

Sensitivity analysis with the trained models helps show the effect of changing one variable at a time (for example, a specific actor) on the financial outcome of a film.

The model can be improved by adding relevant variables and performing sensitivity analysis on them.

