Consider the following table showing multiple transactions. Find all frequent itemsets using Apriori, then list all the strong association rules knowing that min_sup count = 2, and min_conf = 60%.
| TID | Items |
| T1 | A, B, D, E |
| T2 | A, B, C |
| T3 | C, E |
| T4 | B, C |
| T5 | A |
| T6 | A, B, C |
Without handwriting, please.
The Apriori algorithm finds the itemsets that occur frequently in a set of transactions and then derives association rules from them. Each rule is classed as strongly or weakly associated depending on its confidence.
Given:
min_supcount=2
min_conf=60%
Solution:
Step 1: Create a table containing the support count (supcount) of each single item, that is, the number of transactions in which it appears.
| Item set | supcount |
| A | 4 |
| B | 4 |
| C | 4 |
| D | 1 |
| E | 2 |
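The counts in this table can be checked with a few lines of Python. This is only a minimal sketch: the variable names `transactions` and `item_counts` are illustrative, and the data is copied from the question's table.

```python
from collections import Counter

# Transactions T1..T6 from the question, in order.
transactions = [
    {"A", "B", "D", "E"},  # T1
    {"A", "B", "C"},       # T2
    {"C", "E"},            # T3
    {"B", "C"},            # T4
    {"A"},                 # T5
    {"A", "B", "C"},       # T6
]

# Support count of an item = number of transactions that contain it.
item_counts = Counter(item for t in transactions for item in t)

for item in sorted(item_counts):
    print(item, item_counts[item])
```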
Step 2: Since min_supcount = 2, item D (supcount 1) is eliminated.
The remaining items are A, B, C and E. We now form every possible pair from these items and count the number of transactions that contain each pair.
| Item set | supcount |
| A,B | 3 |
| A,C | 2 |
| A,E | 1 |
| B,C | 3 |
| B,E | 1 |
| C,E | 1 |
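The pair counts can be reproduced as follows. This is a sketch under the assumption that the candidate pairs are simply all 2-combinations of the surviving items; the names `frequent_items`, `pair_counts` and `frequent_pairs` are illustrative.

```python
from itertools import combinations

# Transactions T1..T6 from the question, in order.
transactions = [
    {"A", "B", "D", "E"},
    {"A", "B", "C"},
    {"C", "E"},
    {"B", "C"},
    {"A"},
    {"A", "B", "C"},
]
min_supcount = 2
frequent_items = ["A", "B", "C", "E"]  # items left after D is eliminated

# Count the transactions that contain both items of each candidate pair.
pair_counts = {
    pair: sum(1 for t in transactions if set(pair) <= t)
    for pair in combinations(frequent_items, 2)
}

# Keep only the pairs that reach the minimum support count.
frequent_pairs = [p for p, count in pair_counts.items() if count >= min_supcount]
print(frequent_pairs)
```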
Step 3: The pairs (A,E), (B,E) and (C,E) are eliminated because their supcount is less than the min_supcount of 2.
Now we form triplets from the remaining itemsets (A,B), (A,C) and (B,C).
The only possible triplet is (A,B,C), because only three items are left. It appears in T2 and T6, so its supcount is 2, which meets the minimum support count of 2. The frequent itemsets are therefore A, B, C, E, (A,B), (A,C), (B,C) and (A,B,C).
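The triplet step can be sketched in the same way. The snippet below keeps a candidate triplet only if all of its 2-item subsets are frequent (the Apriori pruning idea) and then counts its support; the variable names are illustrative.

```python
from itertools import combinations

# Transactions T1..T6 from the question, in order.
transactions = [
    {"A", "B", "D", "E"},
    {"A", "B", "C"},
    {"C", "E"},
    {"B", "C"},
    {"A"},
    {"A", "B", "C"},
]
# Frequent pairs that survived the previous step.
frequent_pairs = {frozenset("AB"), frozenset("AC"), frozenset("BC")}

# Items that still occur in at least one frequent pair.
items = sorted(set().union(*frequent_pairs))

# A triplet is a candidate only if every pair inside it is frequent.
candidates = [
    frozenset(c)
    for c in combinations(items, 3)
    if all(frozenset(pair) in frequent_pairs for pair in combinations(c, 2))
]

# Support count of each candidate triplet.
triplet_counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
print(triplet_counts)
```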
Now we will work out the association rules from the frequent itemset (A,B,C) using min_conf, which is 60%.
Confidence(A->B) = supcount(A U B) / supcount(A)
So the rule generation will be:
Itemset (A,B,C)
A^B -> C (this means that transactions containing A and B also tend to contain C; it shows the association between the three items)
Example of how confidence is calculated:
Confidence(A->B) = supcount(A U B) / supcount(A). For the rule A^B -> C, substitute A^B for A and C for B:
Confidence(A^B -> C) = supcount((A^B) U C) / supcount(A^B)
= 2 / 3
= 0.667, or about 66.7%
Here the supcount of (A,B,C) is 2 and the supcount of the pair (A,B) is 3.
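The confidence formula translates directly into code. In this sketch the helper names `supcount` and `confidence` are illustrative, chosen to match the notation used in this answer.

```python
# Transactions T1..T6 from the question, in order.
transactions = [
    {"A", "B", "D", "E"},
    {"A", "B", "C"},
    {"C", "E"},
    {"B", "C"},
    {"A"},
    {"A", "B", "C"},
]

def supcount(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def confidence(antecedent, consequent):
    """supcount(antecedent U consequent) / supcount(antecedent)."""
    return supcount(set(antecedent) | set(consequent)) / supcount(antecedent)

# Confidence of the rule A^B -> C: 2 / 3.
print(round(confidence({"A", "B"}, {"C"}), 3))
```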
| Rules | Support count | Confidence |
| A^B->C | 2 | supcount((A^B) U C) / supcount(A^B) = 2/3 = 66.7% |
| B^C->A | 2 | supcount((B^C) U A) / supcount(B^C) = 2/3 = 66.7% |
| C^A->B | 2 | supcount((C^A) U B) / supcount(C^A) = 2/2 = 100% |
| C->A^B | 2 | supcount(C U (A^B)) / supcount(C) = 2/4 = 50% |
| A->B^C | 2 | supcount(A U (B^C)) / supcount(A) = 2/4 = 50% |
| B->C^A | 2 | supcount(B U (C^A)) / supcount(B) = 2/4 = 50% |
The first three rules are strong association rules because their confidence is above the min_conf of 60%, and the last three are weak association rules because their confidence is below 60%.
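The strong association rules from the itemset (A,B,C) are:
| A^B->C |
| B^C->A |
| C^A->B |
The frequent pairs also give strong rules: A->B, B->A, B->C and C->B each have confidence 3/4 = 75%. The rules A->C and C->A have confidence 2/4 = 50% and are weak.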
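As a cross-check, the rules can be enumerated programmatically. The sketch below assumes the frequent itemsets found above and tries every split of each one into a non-empty antecedent and consequent, so it covers the rules that come from the frequent pairs as well as those from (A,B,C). The helper names `supcount` and `rules_from` are illustrative.

```python
from itertools import combinations

# Transactions T1..T6 from the question, in order.
transactions = [
    {"A", "B", "D", "E"},
    {"A", "B", "C"},
    {"C", "E"},
    {"B", "C"},
    {"A"},
    {"A", "B", "C"},
]
min_conf = 0.6
# Frequent itemsets with at least two items, taken from the steps above.
frequent_itemsets = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B", "C")]

def supcount(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rules_from(itemset):
    """Yield (antecedent, consequent, confidence) for every split of itemset."""
    for size in range(1, len(itemset)):
        for antecedent in combinations(itemset, size):
            consequent = tuple(i for i in itemset if i not in antecedent)
            yield antecedent, consequent, supcount(itemset) / supcount(antecedent)

# Keep only the rules whose confidence reaches min_conf.
strong_rules = [
    (a, c, conf)
    for itemset in frequent_itemsets
    for a, c, conf in rules_from(itemset)
    if conf >= min_conf
]

for a, c, conf in strong_rules:
    print("^".join(a), "->", "^".join(c), f"{conf:.1%}")
```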