Explain using an example why computing lift ratio is better in assessing soundness of an association rule than just using support for the products associated in the rule or confidence in the rule. What is the practical significance of lift ratio less than 1?
Association rules capture relationships among items in large data sets. They describe attribute-value conditions that frequently occur together in a given data set. A typical example of association rule mining is Market Basket Analysis.
Data is collected using bar-code scanners in grocery shops. Such market basket databases consist of a large number of transaction records. Each record lists all items bought by a customer in a single purchase transaction. Managers would be interested to know if certain groups of items are consistently purchased together. They could use this data for adjusting store layouts (placing items optimally with respect to each other), for cross-selling, for promotions, for catalog design, and to identify customer segments based on buying patterns.
For example, if a grocery shop database has 50,000 point-of-sale transactions, of which 2,000 include both items A and B, and 800 of these also include item C, the association rule "If A and B are purchased, then C is purchased on the same trip" has a support of 800 transactions (alternatively 1.6% = 800/50,000) and a confidence of 40% (= 800/2,000). One way to think of support is that it is the probability that a randomly selected transaction from the database will contain all items in the antecedent and the consequent, whereas the confidence is the conditional probability that a randomly selected transaction will include all the items in the consequent, given that the transaction includes all the items in the antecedent.
Lift is one more parameter of interest in association analysis. Lift is the ratio of Confidence to Expected Confidence. In the above example, Expected Confidence means "the confidence we would see if buying A and B did not change the probability of buying C." It is the number of transactions that include the consequent divided by the total number of transactions. Suppose C appears in 5,000 transactions in total. Then Expected Confidence is 5,000/50,000 = 10%. For the grocery shop example, Lift = Confidence/Expected Confidence = 40%/10% = 4. In other words, a customer who buys A and B is four times as likely to buy C as a randomly chosen customer. Hence, Lift tells us how much the probability of the "then" part (the consequent) increases given the "if" part (the antecedent).
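The arithmetic above can be checked with a short Python sketch. The function name `rule_metrics` and its parameter names are made up for illustration; the counts are the ones from the grocery shop example.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return support, confidence, expected confidence and lift for a rule.

    n_total      : total number of transactions
    n_antecedent : transactions containing the "if" part (here A and B)
    n_consequent : transactions containing the "then" part (here C)
    n_both       : transactions containing both parts (A, B and C)
    """
    support = n_both / n_total                   # P(antecedent and consequent)
    confidence = n_both / n_antecedent           # P(consequent | antecedent)
    expected_confidence = n_consequent / n_total  # P(consequent), the baseline
    lift = confidence / expected_confidence
    return support, confidence, expected_confidence, lift


# Counts taken from the grocery shop example in the text.
support, confidence, expected_confidence, lift = rule_metrics(
    n_total=50_000, n_antecedent=2_000, n_consequent=5_000, n_both=800
)

print(f"support             = {support:.3f}")
print(f"confidence          = {confidence:.2f}")
print(f"expected confidence = {expected_confidence:.2f}")
print(f"lift                = {lift:.1f}")
```

Note that lift needs one extra piece of information that support and confidence do not use: how often the consequent is bought overall.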
The larger the lift ratio, the more significant the association.
Lift is a measure of the performance of a targeting model (association rule) at predicting or classifying cases as having an enhanced response (with respect to the population as a whole), measured against a random choice targeting model.
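If the lift is greater than 1, the antecedent and the consequent occur together more often than they would if they were independent, and the size of the lift tells us how strongly they depend on one another. Such rules are potentially useful for predicting the consequent in future data sets. A lift of exactly 1 means the two are independent, so the rule tells us nothing beyond the base rate of the consequent. A lift less than 1 means that buying the antecedent makes the consequent less likely than it is for customers in general (for example, when the items are substitutes), so in practice the rule should not be used for cross-selling or for placing the items together, even if its support and confidence look high. This is why lift is a better test of soundness: support and confidence can be high simply because the consequent is popular on its own, whereas lift compares the confidence against that baseline.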
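To see with numbers how a rule can look good on support and confidence and still be unsound, here is a small Python sketch. The items (butter, bread) and all counts are hypothetical, chosen only to illustrate the point; they are not from the grocery shop example above.

```python
# Hypothetical counts, made up for illustration only.
n_total = 10_000   # all transactions
n_butter = 2_000   # transactions containing butter (antecedent)
n_bread = 8_000    # transactions containing bread (consequent)
n_both = 1_500     # transactions containing both butter and bread

# Rule: "If butter is purchased, then bread is purchased."
support = n_both / n_total               # 15% of all baskets
confidence = n_both / n_butter           # 75% of butter baskets have bread
expected_confidence = n_bread / n_total  # 80% of ALL baskets have bread
lift = confidence / expected_confidence

print(f"support             = {support:.2f}")
print(f"confidence          = {confidence:.2f}")
print(f"expected confidence = {expected_confidence:.2f}")
print(f"lift                = {lift:.4f}")

# A lift below 1 flags the rule as misleading despite its high confidence.
rule_is_misleading = confidence > 0.5 and lift < 1
print("misleading rule:", rule_is_misleading)
```

With these made-up counts the rule has 15% support and 75% confidence, which looks strong. But bread is in 80% of all baskets anyway, so the lift is about 0.94: butter buyers are slightly less likely than the average customer to buy bread. Only lift reveals this.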
Thanks