reset password
Author Message
abansal
Posts: 9
Posted 00:20 Jul 21, 2009 |

 

1. Consider the following three transactions:

T1: {a1,a2,...,a99,a100}
T2: {a1,a2,...,a74,a75}
T3: {a1,a2,...,a49,a50}

Let minimum support count min_sup = 2.

(a) How many frequent itemsets are there?

Sol:  275 -1

Explaination :

All the subsets of T3 have support count of 3 > min_sup

All the subsets of T2 have support count of 2 = min_sup

Since all the items in T1 do not have min_sup, so we can easily see that all subsets of T2 will become frequent itemsets. Hence solution will be power set of T2

which is 275 -1.

 

(b) List all the closed itemset(s).

Sol:  T2: {a1,a2,...,a74,a75}
        T3: {a1,a2,...,a49,a50}

 

Explaination :

An itemset X is closed if there exists no proper superset of X that has the same support count.

That means all subsets of T3 will have support count of 3 (all of them are subsets of other 2 transactions too), giving a closed set of  T3: {a1,a2,...,a49,a50}.
Similarly all subsets of T2 that are not subsets of T3 will have support count of 2 (all of them are subsets of T1), giving a closed set of  T2: {a1,a2,...,a74,a75}.

 

(c) List all the maximal itemset(s).

Sol:   T2: {a1,a2,...,a74,a75}

 

Explaination :

 

An itemset X is a maximal frequent itemset if X is frequent and there exists no proper superset of X that is also frequent.

Looking at T2 and T3, that were only two closed frequent sets. We can easily tell that T3 is subset of T2, leaving T2 as maximal frequent itemset

 

abansal
Posts: 9
Posted 00:24 Jul 21, 2009 |

In part (a) Also see why only power set of T2 was solution.

T1: We know all items don't have min_support so all subsets donot have support.

T3: All subsets do have support > min_support which is 3 in this case.

T2: All subsets have the support = min_ support and it also contains all subsets of T3 so if we take all subsets of T2, it contains all frequent itemsets. Hence to calculate number we calculate its powerset

xieguahu
Posts: 50
Posted 19:14 Jul 21, 2009 |

(a)

sol : 275-1

 

The explanation abansal provided is quite confusing to me.

 

"Since all the items in T1 do not have min_sup".  I do not think this is correct.  An simple counter example would be a1 is an item in T1, but the support_count of a1 will be 3, which is larger than 2.

 

My thought is that since the min_sup is 2, so the frequent itemset should not contain any item between a76 and a100.  In another word, as long as the itemset only contain items between a1 and a75, the support_count of that itemset will be >= 2 and it will become a frequent itemset.

 

So the result will be 275 - 1.  (powerset of 75 except the empty set).

 

(b) List all the closed itemset(s).

  {a1,a2,...,a49,a50} support_count = 2.

{a1,a2,...,a74,a75} support_count = 3.

 

 

I agree with abansal's explanation.

 

c) List all the maximal itemset(s).

{a1,a2,...,a74,a75}

I agree with abansal's explanation.

But I want to add some further explanation.  A maximal frequent itemset is also a closed frequent itemsets, but not vice versa.  So we only need to check the two itemset we got at (b) and obviously only {a1,a2,...,a74,a75} is the maximal frequent itemset. 

Last edited by xieguahu at 19:22 Jul 21, 2009.
abansal
Posts: 9
Posted 20:03 Jul 21, 2009 |

Please check my second post for more explaination. Even if it isn't able to explain well, I'll make another attempt at that

Last edited by abansal at 20:04 Jul 21, 2009.