Author | Message |
---|---|
abansal
Posts: 9
|
Posted 00:20 Jul 21, 2009 |
1. Consider the following three transactions: T1: {a1,a2,...,a99,a100}
T2: {a1,a2,...,a74,a75} T3: {a1,a2,...,a49,a50} Let minimum support count min_sup = 2. (a) How many frequent itemsets are there? Sol: 275 -1 Explaination : All the subsets of T3 have support count of 3 > min_sup All the subsets of T2 have support count of 2 = min_sup Since all the items in T1 do not have min_sup, so we can easily see that all subsets of T2 will become frequent itemsets. Hence solution will be power set of T2 which is 275 -1.
(b) List all the closed itemset(s). Sol: T2: {a1,a2,...,a74,a75}
Explaination : An itemset X is closed if there exists no proper superset of X that has the same support count.
That means all subsets of T3 will have support count of 3 (all of them are subsets of other 2 transactions too), giving a closed set of T3: {a1,a2,...,a49,a50}.
Similarly all subsets of T2 that are not subsets of T3 will have support count of 2 (all of them are subsets of T1), giving a closed set of T2: {a1,a2,...,a74,a75}.
(c) List all the maximal itemset(s). Sol: T2: {a1,a2,...,a74,a75}
Explaination :
An itemset X is a maximal frequent itemset if X is frequent and there exists no proper superset of X that is also frequent.
Looking at T2 and T3, that were only two closed frequent sets. We can easily tell that T3 is subset of T2, leaving T2 as maximal frequent itemset
|
abansal
Posts: 9
|
Posted 00:24 Jul 21, 2009 |
In part (a) Also see why only power set of T2 was solution. T1: We know all items don't have min_support so all subsets donot have support. T3: All subsets do have support > min_support which is 3 in this case. T2: All subsets have the support = min_ support and it also contains all subsets of T3 so if we take all subsets of T2, it contains all frequent itemsets. Hence to calculate number we calculate its powerset |
xieguahu
Posts: 50
|
Posted 19:14 Jul 21, 2009 |
(a) sol : 275-1
The explanation abansal provided is quite confusing to me.
"Since all the items in T1 do not have min_sup". I do not think this is correct. An simple counter example would be a1 is an item in T1, but the support_count of a1 will be 3, which is larger than 2.
My thought is that since the min_sup is 2, so the frequent itemset should not contain any item between a76 and a100. In another word, as long as the itemset only contain items between a1 and a75, the support_count of that itemset will be >= 2 and it will become a frequent itemset.
So the result will be 275 - 1. (powerset of 75 except the empty set).
(b) List all the closed itemset(s). {a1,a2,...,a49,a50} support_count = 2.
{a1,a2,...,a74,a75} support_count = 3.
I agree with abansal's explanation.
c) List all the maximal itemset(s). {a1,a2,...,a74,a75} I agree with abansal's explanation. But I want to add some further explanation. A maximal frequent itemset is also a closed frequent itemsets, but not vice versa. So we only need to check the two itemset we got at (b) and obviously only {a1,a2,...,a74,a75} is the maximal frequent itemset. Last edited by xieguahu at
19:22 Jul 21, 2009.
|
abansal
Posts: 9
|
Posted 20:03 Jul 21, 2009 |
Please check my second post for more explaination. Even if it isn't able to explain well, I'll make another attempt at that Last edited by abansal at
20:04 Jul 21, 2009.
|