Data Mining - Milestone One
Page length requirements: One Paragraph
Question: Based on the scenario below: What is the purpose of the analytic method/approach/strategy you are using? What type of information does it yield?
The scenario: Bubba Gump Shrimp Company is a successful retailer of regional food, both in its restaurants and through other retail channels. Bubba Gump began as a small, privately owned restaurant. Thanks to unexpected exposure from a blockbuster movie, Bubba Gump grew rapidly from its humble beginnings and now operates several restaurants, sells branded merchandise through an online retail site, and wholesales its branded merchandise to other retail outlets. Bubba Gump's growth was initially very rapid in response to a strong demand and high name recognition that followed from its movie exposure. After its first few years of rapid growth, sales increased at slower rates and finally leveled off. Sales have declined in each of the last two years. Bubba Gump Shrimp Company has collected a large amount of data about its business, including restaurant point-of-sale (POS) data, web channel sales performance, customer information through restaurant loyalty programs, and customer and sales transaction data through its website and retail partners. Bubba Gump's leadership has decided to commission an analysis of the company's vast data assets to better understand its customers and look for ways to create new revenue growth. You have been assigned to plan, conduct, and report on this data mining initiative for Bubba Gump Shrimp Company. The company data that is available to you includes Bubba Gump's restaurant point-of-sale (cash register, credit card) data, its customer database (collected from its restaurant loyalty program and online sales channel), its web store sales transaction data, and customer and sales data from third-party retailers. All of Bubba Gump's data has recently been integrated in a data warehouse. 
That enterprise data warehouse was built specifically to support data mining initiatives like the one you have been assigned to conduct, by consolidating data from multiple operations and channels in one place and integrating the data across sources for a complete view of the customer experience. For the first time, Bubba Gump analysts can link sales transactions to specific customers at specific restaurants, for example. It also means that you can link customer transactions across channels; that is, for any given customer, you can link their restaurant purchases, their online purchases, and (in some cases) their purchases from third-party retail partners. You have been selected to develop and execute the data mining analysis plan for Bubba Gump's customer analysis project. Your project will be the first major data mining project conducted against the new Bubba Gump data warehouse. Because Bubba Gump's data was not previously integrated in a single data warehouse, company leadership has never been able to analyze its customers across their complete experience. In other words, customer restaurant purchases, online purchases, and third-party retailer purchases could not be analyzed together previously; each channel had to be analyzed separately. As a first step, a sample of 500 customers has been selected from the analytics data warehouse and given a survey in exchange for purchase credits at one of Bubba Gump's sales channels. The survey sample was selected from the universe of customers who have made purchases from at least one Bubba Gump outlet (restaurant, web store, etc.). Responses to various customer satisfaction questions were recorded, and historical purchase information has been extracted from the data warehouse for each customer in the sample.
Bubba Gump Shrimp
Earlier Phase of Analysis:
In the earlier phase, Bubba Gump Shrimp analyzed each of its business lines separately: offline restaurants, online services, and third-party retailers. As a result, customers were not linked across the three channels. The “same customer” may well have bought through all three, but Bubba Gump had no way of knowing it, because the channels were not connected under the previous methodology.
So the company changed its methodology and consolidated all of its data in a data warehouse.
Current Phase of Analysis:
In the current phase, data from all three channels is stored in a well-defined schema, that is, in a standardized form. The benefits of this form are:
1. We can easily identify the ‘same customers’ who use our services through different channels. For example, suppose a customer named “xyz” makes one transaction in each channel: ‘offline restaurants’, ‘online services’ and ‘third party retailers’. Using the customer details recorded in each channel, the data warehouse lets us map all of these transactions to one unique ‘CustomerID’. Under the earlier methodology we would have seen three separate transactions without knowing whether they belonged to the same customer. Once this mapping is done, we understand our customers better and can prioritize our customer base accordingly.
2. Because Bubba Gump's sales have declined in each of the last 2 years, we can analyze the sales data with a technique called Time Series Analysis, which, as the name suggests, analyzes data with respect to time. It lets us look for the factors behind the decline by examining the current trend and any cyclical patterns in the data. For example, a product that Bubba Gump has offered for the last 5 years may have gone out of trend over the last 2 years. There is little reason to keep such a product; instead, the company can use market research to find a similar product that is currently in trend.
3. The ‘data mining technique’ that can be used to segment the customer base into “High Priority”, “Medium Priority” and “Low Priority” customers is known as RFM analysis, which stands for Recency, Frequency, Monetary analysis.
(i) The RFM (Recency, Frequency and Monetary) model is widely applied in many practical areas, particularly in direct marketing. By adopting the RFM model, decision makers can effectively identify valuable customers and then develop an effective marketing strategy.
(ii) Integrating RFM analysis with data mining techniques provides useful information about current and new customers. Clustering based on RFM attributes provides more behavioral knowledge of customers’ actual marketing levels than other cluster analyses. Classification rules discovered from customer demographic variables and RFM variables provide useful knowledge for managers to predict future customer behavior, such as how recently the customer will probably purchase, how often the customer will purchase, and what the value of his/her purchases will be.
4. Using the sample of 500 customers drawn from the data warehouse, Bubba Gump can run the RFM analysis described above to identify its high priority and low priority customers, and then act on each group accordingly, for example by offering discount coupons to maintain a good relationship with its customers.
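The customer mapping in point 1 and the RFM segmentation in points 3 and 4 can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the transactions, the customer IDs, the field layout, the 1-to-3 rank scoring, and the cut-offs for the three priority labels are all made up for the example and are not taken from Bubba Gump's data warehouse.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions already linked to one CustomerID across channels:
# (customer_id, channel, purchase_date, amount)
TRANSACTIONS = [
    ("C001", "restaurant", date(2023, 11, 20), 85.0),
    ("C001", "web", date(2023, 12, 5), 40.0),
    ("C001", "retail_partner", date(2023, 12, 18), 25.0),
    ("C002", "restaurant", date(2023, 3, 2), 30.0),
    ("C003", "web", date(2023, 9, 14), 60.0),
    ("C003", "web", date(2023, 10, 1), 55.0),
]


def rfm_table(transactions, as_of):
    """Return recency (days), frequency (count) and monetary (total) per customer."""
    grouped = defaultdict(list)
    for customer_id, _channel, day, amount in transactions:
        grouped[customer_id].append((day, amount))
    table = {}
    for customer_id, rows in grouped.items():
        last_purchase = max(day for day, _ in rows)
        table[customer_id] = {
            "recency": (as_of - last_purchase).days,
            "frequency": len(rows),
            "monetary": sum(amount for _, amount in rows),
        }
    return table


def tercile_scores(table, field, higher_is_better=True):
    """Score customers 1 (worst) to 3 (best) by their rank on one field."""
    ordered = sorted(table, key=lambda c: table[c][field],
                     reverse=not higher_is_better)  # worst customer first
    n = len(ordered)
    return {c: 1 + (i * 3) // n for i, c in enumerate(ordered)}


def prioritize(table):
    """Sum the three scores and map the total to a priority label.

    The cut-offs (8 and 5) are illustrative assumptions, not a standard.
    """
    r = tercile_scores(table, "recency", higher_is_better=False)
    f = tercile_scores(table, "frequency")
    m = tercile_scores(table, "monetary")
    labels = {}
    for customer_id in table:
        total = r[customer_id] + f[customer_id] + m[customer_id]
        if total >= 8:
            labels[customer_id] = "High Priority"
        elif total >= 5:
            labels[customer_id] = "Medium Priority"
        else:
            labels[customer_id] = "Low Priority"
    return labels


if __name__ == "__main__":
    table = rfm_table(TRANSACTIONS, as_of=date(2024, 1, 1))
    for customer_id, label in sorted(prioritize(table).items()):
        print(customer_id, table[customer_id], label)
```

With this made-up data, customer C001, who bought through all three channels, comes out as High Priority, C003 as Medium Priority, and C002 as Low Priority. On the real 500-customer sample, the same idea would apply once each customer's purchase history has been extracted from the warehouse.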
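The time series idea in point 2 can also be illustrated with a small sketch. The annual sales figures below are invented to mimic the pattern in the scenario (rapid growth, leveling off, then two years of decline), and the helper names are hypothetical; a real analysis would work on sales extracted from the data warehouse, probably at a finer grain such as monthly sales per product.

```python
# Hypothetical annual sales in millions of dollars (not real Bubba Gump data):
# fast early growth, a plateau, then two consecutive years of decline.
ANNUAL_SALES = [10.0, 14.0, 17.0, 18.0, 18.0, 17.0, 15.5]


def pct_change(series):
    """Year-over-year change in percent, rounded to one decimal place."""
    return [round((cur - prev) / prev * 100, 1)
            for prev, cur in zip(series, series[1:])]


def moving_average(series, window):
    """Simple moving average that smooths short-term noise to expose the trend."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def trailing_declines(series):
    """Count how many of the most recent periods fell below the period before."""
    count = 0
    for prev, cur in zip(reversed(series[:-1]), reversed(series[1:])):
        if cur < prev:
            count += 1
        else:
            break
    return count


if __name__ == "__main__":
    print("Year-over-year change (%):", pct_change(ANNUAL_SALES))
    print("3-year moving average:", moving_average(ANNUAL_SALES, 3))
    print("Consecutive recent declines:", trailing_declines(ANNUAL_SALES))
```

Running the same functions on sales broken down by product or by channel would show which items or channels are driving the decline, which is the kind of out-of-trend product the answer describes.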