The proliferation of textual data in business is overwhelming. Unstructured textual data is constantly generated through call center logs, emails, documents on the web, blogs, tweets, videos, customer reviews, and so on. While the amount of textual data is increasing rapidly, businesses still find it challenging to summarize, understand, and make sense of such data with statistical methods in order to make better business decisions.
The basic premise for using text data in predictive models is that the terms contained in the text can represent the customer’s experiences (good or bad), which are presumably consistent with the customer’s decision to stay with the business or churn in the near future. Hence the potential of mining text data in such applications should not be underestimated. Text data is first transformed into a set of numerical components, called Singular Value Decomposition (SVD) units, which collectively represent the text documents. These units are then used as additional inputs, alongside the existing structured input attributes, to help improve the predictive power of the existing models.
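The transformation described above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular tool: it assumes NumPy is available, uses raw whitespace-tokenized term counts rather than a weighted scheme, and the comments, the structured attributes (tenure and complaint count), and the helper names term_document_matrix and svd_units are all hypothetical.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw term-count matrix (documents x terms) from whitespace tokens."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    index = {term: j for j, term in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for tok in d.lower().split():
            counts[i, index[tok]] += 1
    return counts, vocab

def svd_units(counts, k):
    """Project each document onto the top-k singular directions of the matrix."""
    u, s, _ = np.linalg.svd(counts, full_matrices=False)
    return u[:, :k] * s[:k]  # one row of k numeric "SVD units" per document

# Hypothetical customer comments (illustrative only).
comments = [
    "billing error again very frustrated",
    "great support quick resolution",
    "frustrated with billing and slow support",
    "quick friendly service great experience",
]

# Hypothetical structured attributes: tenure in months, number of complaints.
structured = np.array([[12.0, 3.0], [40.0, 0.0], [8.0, 5.0], [36.0, 1.0]])

counts, vocab = term_document_matrix(comments)
text_features = svd_units(counts, k=2)

# Append the SVD units to the structured inputs of the predictive model.
model_inputs = np.hstack([structured, text_features])
print(model_inputs.shape)
```

The combined matrix model_inputs could then be passed to whatever churn model is already in use. In practice the term counts are usually weighted and many more components are retained; the value k=2 here is chosen only to keep the example small.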
Sentiment Analysis
An interesting and important goal of analyzing unstructured data such as customer complaints, issues, opinions, or comments is to understand how customers perceive an entity. An entity can be a company’s brand image, a product, a service, a person, a group, or an organization. Are consumers’ perceptions good, bad, or neutral? Which attributes (features) of the product or service do they feel good or bad about? What do customers think of the various attributes of a company’s product, such as quality, price, durability, safety, and ease of use? Typically, if a customer feels good about an entity, the perception is classified as a positive sentiment. If the perception of the entity is bad, it is considered a negative sentiment. A third kind of perception, in which the customer has neither a good nor a bad opinion, implies a neutral sentiment. Social media sites such as Twitter and Facebook contain enormous volumes of customer opinions and comments on virtually all major organizations, events, and products. This creates an unprecedented opportunity to mine text data in real time and to analyze fluctuations in sentiment trends over a period of time.
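The three-way classification described above can be illustrated with a very simple lexicon-based sketch. The word lists and the function name classify_sentiment are hypothetical and chosen only for illustration; practical sentiment analysis generally relies on much larger lexicons or trained models, and on handling of negation and context that is omitted here.

```python
# Hypothetical, deliberately tiny sentiment lexicons (illustrative only).
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}

def classify_sentiment(text):
    """Label a comment positive, negative, or neutral by counting lexicon hits."""
    tokens = [tok.strip(".,!?;:").lower() for tok in text.split()]
    score = sum(tok in POSITIVE for tok in tokens) - sum(tok in NEGATIVE for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

# Hypothetical customer comments.
for comment in [
    "Great product, I love it!",
    "Terrible service and a broken screen.",
    "The package arrived on Tuesday.",
]:
    print(classify_sentiment(comment), "|", comment)
```

Applying such a function to comments grouped by day or week would give a rough view of how the share of positive, negative, and neutral sentiment changes over time.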