In: Computer Science
Web Mining Assignments
Exercise 6.1.1 : Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also numbered 1 to 100. Item i is in basket b if and only if i divides b with no remainder. Thus, item 1 is in all the baskets, item 2 is in all fifty of the even-numbered baskets, and so on. Basket 12 consists of items {1, 2, 3, 4, 6, 12}, since these are all the integers that divide 12. Answer the following questions:
(a) If the support threshold is 5, which items are
frequent?
! (b) If the support threshold is 5, which pairs of items are
frequent?
! (c) What is the sum of the sizes of all the baskets?
PS: I know the answer of a) which are 1, 2, 3, 4 ... 18, 19, 20. But how to get the answers of b) and c) please show the steps and the explanation. Thank you.
Since you have requested answer for b and c, lets first analyze question then we will step by step derive answer.
Items : 1, 2, 3, ... 100
100 baskets such that item i divides b with remainder 0, since 1 divides every number item 1 will be present in every basket, 2 in even number basket and so on. this can also be understood hint given in question for basket 12.
therefore,
baskets:
[ {1}, {1, 2}, {1, 3}, {1, 2, 4}, {1, 5}, {1, 2, 3, 6}, {1, 7} ....
{1, 2, 4, 5, 100, 10, 50} ]
a) explanation: for support threshold 5, items from 1 to 20 as 1 occurs 100 times, 2 occurs 50 times ... 20 occurs 5 times.
b) Let us pair two items, and since frequency also asked, for first item 1 and second item 2 {1, 2} will occur 50 times. because [100/ 1*2 = 50], {1, 3} pair occurs 33 times [100/ 1*3] = 33
Step 1: Two pairs with first item 1 :
[1, 2] occurs 50 times
[1, 3] => 33 times
and so on till 20 since threshold 5,
[1,20] => 5 times
using a) from second item 2 to 20, these are 19 pair.
Step 2: considering first item 2,
(2, 3) => 100/ 6 , 16 times,
(2, 4) => 100 / 2*4 , pair occurs 25 times,
(2, 5) => 10
(2, 6) => 16
(2, 7) => 7
(2, 8) => 12
(2, 9) => 5
(2, 10) => 10
(2, 12) => 8
(2, 14) => 7
(2, 16)=> 6
(2, 18) => 5
(2, 20) => 5
Step 3: Similarly taking first item 3, pair will form,
(3, 4) => 100/ 3*4, 8 such items in basket with pair 3,4,
similarly,
(3, 5): 6
(3, 6): 16
(3, 9): 11
(3, 12): 8
(3, 15): 6
(3, 18): 5
for easily obtaining pair, second item we will consider multiple or less than multiple of first item for matching threshold.
Step 4: taking 4 first pair, lets check pair for threshold 5,
(4, 5): 5
(4, 6): 8
(4, 8): 12
(4, 10): 5
(4, 12): 8
(4, 16): 6
(4, 20): 5
Step 5: for first item 5, only 3 pairs will form as multiples from 5, second item will only contain, 10, 15, 20,
(5, 10): 10
(5, 15): 6
(5, 20): 5
Step 6: first item 6
(6, 9): 5
(6, 12): 8
(6, 18): 5
Step 7: for 7, 8, 9, 10, only one pair exist only, consider second element till 20, for matching threshold 5.
for first item 7: (7,14)
8: pair (8,16)
9: pair (9, 18)
10: pair (10, 20)
c) Given question we are asked to calculate sum of length of all baskets,
basket1 {1} ,length = 1
basket2 {1,2} length = 2
basket3 {1, 3} length = 2
basket4 {1, 2, 4} length = 3
basket5 {1, 5} length = 2
we look for pattern, for index of basket length is number of factors of basket index.
therefore sum of length of factors for basket index from 1 to 100 will be 482.