Author | Message |
---|---|
p0941
Posts: 95
|
Posted 23:00 Aug 24, 2009 |
It looks strange, but I hope this is correct.
(a) From Algorithm 1 in the paper:
From S1: Path <P3,P1,P2,P5>, <P3,P1,P2,P7>
From S2: Path <P5>
From S3: Path null
From S4: Path <P5>
From S5: Path <P5><P3>
From S6: Path null
From S7: Path <P2,P7>
From S8: Path null
(b) From Algorithm 2 in the paper: C1: <P5>, <P3>,<P2> |
xieguahu
Posts: 50
|
Posted 15:21 Aug 26, 2009 |
This is what I got:
(a)
S1: <p1,p2,p5,p3> <p1,p2,p7,p3>
S2: <p1,p4,p5,p1,p6,p7>
S3: <p1,p6,p1,p4>
S4: <p5,p1,p4> <p5,p1,p6,p7>
S5: <p5,p3>
S6: <p1,p2,p7,p3>
S7: <p2,p7>
S8: <p1,p2,p4,p1,p2,p6,p7,p3>
So the sessions are:
<p1,p2,p5,p3> <p1,p2,p7,p3> <p1,p4,p5,p1,p6,p7> <p1,p6,p1,p4> <p5,p1,p4> <p1,p2,p4,p1,p2,p6,p7,p3>
(b)
<p1,p2> <p1,p4>
Last edited by xieguahu at
15:28 Aug 26, 2009.
|
HelloWorld
Posts: 88
|
Posted 16:41 Aug 29, 2009 |
This algorithm is written in a very confusing way.
Where is the EndFor? I tried using a ruler to find it, and I still couldn't. Last edited by HelloWorld at
16:41 Aug 29, 2009.
|
alomo
Posts: 70
|
Posted 00:57 Aug 30, 2009 |
There are three steps in this algorithm. Step I starts on line 5: TPageSet := {}. The for-loop then runs through line 13 inclusive, and TPageSet is updated inside it. The for-loop must be closed at that point, and then Step II begins (on line 15) and CandSession is updated. Just by looking at the code you can see that this for-loop cannot run any further, because a few lines later another for-loop (also over Pagei) begins. Step III-a starts if the condition on line 16 is satisfied. Otherwise, Step III-b starts (line 19). Last edited by alomo at
01:14 Aug 30, 2009.
|
HelloWorld
Posts: 88
|
Posted 10:27 Aug 30, 2009 |
The problem is that there's no EndFor for the for-loop that starts on line 6. Are you sure, though?
|
alomo
Posts: 70
|
Posted 11:05 Aug 30, 2009 |
I understand that there's no EndFor line. Here is the code. CandSession is the scope of that loop, and CandSession changes on line 15. Therefore, the loop must end before that line. On the other hand, the last use of Pagei by this loop is on line 13. That means the EndFor line must fall between lines 13 and 15. I added it on line 14 (comments don't need to be numbered).
6: ForEach Pagei in CandSession
There is another possible mistake on line 9:
9: If ( Link[Pagei, Pagej] = true) ...
Here, Link[Pagei, Pagej] means that there is a link directed from i to j. However, it would make sense if we switched i and j.
Last edited by alomo at
14:04 Aug 30, 2009.
|
HelloWorld
Posts: 88
|
Posted 13:30 Aug 30, 2009 |
Yeah, I agree with you, but I still don't see why the absence of a link from j to i would make page i a start page. It's basically saying that subpages cannot link back to the main page, which doesn't make sense. Also, what does this mean?
Last edited by HelloWorld at
13:42 Aug 30, 2009.
|
alomo
Posts: 70
|
Posted 14:03 Aug 30, 2009 |
Actually, it does not say that subpages cannot link back to the main page (back to the page we just came from). It just chooses not to use such links. As I understand it, we are interested in forward links that move away from Pagei, along the session. So it does not matter to us whether there is a way to return to the main page along the same path. I just don't see another explanation for that.
TSessionSet stands for the Temporary Session Set. In Step III we are appending TPageSet to TSessionSet, but because at this moment (Step III-a) TSessionSet is an empty set, we can simply do Union. Last edited by alomo at
14:44 Aug 30, 2009.
|
HelloWorld
Posts: 88
|
Posted 14:50 Aug 30, 2009 |
I know that it's a union and that it's the Temporary Session Set. My question is rather: what is it being unioned with? I thought it was being unioned with the session that contains that Pagei, but I'm not 100% sure. |
xieguahu
Posts: 50
|
Posted 14:53 Aug 30, 2009 |
|
alomo
Posts: 70
|
Posted 11:29 Aug 31, 2009 |
I am getting quite different results: (a) S1: <p1,p2,p5,p3>, <p1,p2,p7,p3> So the sessions are:
(b) I've got the same results as xieguahu: <p1,p2>, <p1,p4>. But I have some doubts about that result. Do we apply Apriori to the original set or to the set we got in (a)? Last edited by alomo at
13:05 Aug 31, 2009.
|
nshatok
Posts: 19
|
Posted 12:40 Aug 31, 2009 |
This is what I am getting:
a) Sessions: {P1,P2,P5,P3} {P1,P2,P7,P3} {P1,P4,P5,P1,P6,P7} {P1,P6,P1,P4} {P5,P1,P6,P7} {P4,P1,P6,P7} {P1,P2,P4,P1,P2,P6,P7,P3}
b) Frequent patterns: P1,P2 P1,P6 P1,P6,P7 |
nshatok
Posts: 19
|
Posted 12:50 Aug 31, 2009 |
Sorry, this is what I got:
a) Sessions: {P1,P2,P5,P3} {P1,P2,P7,P3} {P1,P4,P5,P1,P6,P7} {P1,P6,P1,P4} {P4,P1,P6,P7} {P1,P2,P4,P1,P2,P6,P7,P3}
b) Frequent patterns: P1,P2
As far as I understand the approach described on p. 163 of the paper, I think P4 should start a new session. |
alomo
Posts: 70
|
Posted 13:46 Aug 31, 2009 |
I think we are getting such different results because we read the algorithm code in different ways. It was my mistake to add the EndFor between the IFs. The more appropriate way is the following: 6: ForEach Pagei in CandSession Last edited by alomo at
13:48 Aug 31, 2009.
|
cysun
Posts: 2935
|
Posted 16:59 Aug 31, 2009 |
No one is completely correct so far. From S2 you should get <P1,P4,P5,P1,P6,P7> and <P1,P4,P1,P6,P7>. |
xieguahu
Posts: 50
|
Posted 17:43 Aug 31, 2009 |
|
cysun
Posts: 2935
|
Posted 17:59 Aug 31, 2009 |
Why is <p1,p4,p1,p6,p7> a subsequence of <p1,p4,p5,p1,p6,p7>? I haven't noticed a clear definition of sequence containment in the paper, so I assume that to be a subsequence, all elements must be adjacent. |
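The adjacency assumption stated above can be checked mechanically. The following is a minimal Python sketch, not code from the paper: the function names `is_contiguous_subsequence` and `is_gapped_subsequence` are made up for illustration, and the gapped variant is included only for contrast, as an assumption about what a looser definition might look like.

```python
from typing import Sequence


def is_contiguous_subsequence(short: Sequence[str], long: Sequence[str]) -> bool:
    """True if `short` appears in `long` as one block of adjacent elements."""
    n = len(short)
    return any(list(long[i:i + n]) == list(short)
               for i in range(len(long) - n + 1))


def is_gapped_subsequence(short: Sequence[str], long: Sequence[str]) -> bool:
    """True if `short` appears in `long` in order, with gaps allowed."""
    remaining = iter(long)
    # `page in remaining` consumes the iterator up to the first match,
    # so later pages can only match further to the right.
    return all(page in remaining for page in short)


short = ["p1", "p4", "p1", "p6", "p7"]
long = ["p1", "p4", "p5", "p1", "p6", "p7"]

print(is_contiguous_subsequence(short, long))  # False: p5 breaks adjacency
print(is_gapped_subsequence(short, long))      # True: skipping p5 is allowed
```

Under the adjacency reading, the two sequences from S2 are distinct sessions and neither absorbs the other.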
xieguahu
Posts: 50
|
Posted 18:06 Aug 31, 2009 |
Last edited by xieguahu at
18:06 Aug 31, 2009.
|
cysun
Posts: 2935
|
Posted 18:09 Aug 31, 2009 |
That definition does not apply to this paper. |
cysun
Posts: 2935
|
Posted 18:20 Aug 31, 2009 |
But you are right that from S2 there's only one session <p1,p4,p5,p1,p6,p7>. I tried the algorithm again and realized that I made a mistake. Sorry about that. |
cysun
Posts: 2935
|
Posted 18:33 Aug 31, 2009 |
How did you get <p5,p1,p4> for S4? There's no p1 preceding p4 in S4. The sessions for S1, S2, and S3 are correct. I'm continuing to verify the rest. |
xieguahu
Posts: 50
|
Posted 18:43 Aug 31, 2009 |
|
cysun
Posts: 2935
|
Posted 18:43 Aug 31, 2009 |
<p5,p1,p4> for S4 is wrong but the rest is fine; however, you should not drop the short sessions - the maximality rule only applies to sessions constructed from the same original session. |
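The scoping of the maximality rule described above can be sketched as follows. This is only an illustration under stated assumptions: containment is taken to mean adjacent elements (the assumption made earlier in this thread), the helper names `contains` and `maximal_per_original` are invented, and the group labelled "X" is hypothetical data added to show a session being dropped. The S1 and S7 sessions are the ones posted in this thread.

```python
from typing import Dict, List


def contains(long: List[str], short: List[str]) -> bool:
    """True if `short` occurs in `long` as a block of adjacent pages (assumed definition)."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def maximal_per_original(
    reconstructed: Dict[str, List[List[str]]]
) -> Dict[str, List[List[str]]]:
    """Drop a session only if a different session from the SAME original contains it."""
    result = {}
    for original, sessions in reconstructed.items():
        result[original] = [
            s for s in sessions
            if not any(s != t and contains(t, s) for t in sessions)
        ]
    return result


reconstructed = {
    "S1": [["p1", "p2", "p5", "p3"], ["p1", "p2", "p7", "p3"]],
    "S7": [["p2", "p7"]],
    # Hypothetical group, not from the assignment: the first session is
    # contained in the second, so it is dropped.
    "X": [["p1", "p2"], ["p1", "p2", "p5"]],
}

for original, sessions in maximal_per_original(reconstructed).items():
    print(original, sessions)
```

Note that <p2,p7> from S7 survives even though <p1,p2,p7,p3> from S1 contains it, because the comparison never crosses original sessions.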
cysun
Posts: 2935
|
Posted 18:46 Aug 31, 2009 |
It should be just <p5,p1,p6,p7>. p4 is not linked from p5, and according to the algorithm, it's not included in the new sessions. |
alomo
Posts: 70
|
Posted 22:12 Sep 03, 2009 |
I think xieguahu is right. The algorithm says that when Link(Pn, Pn+1) = false, we have to check the pages in the already constructed part of this session that were visited before Pn (going backward from Pn-1 to P1) to find a link to Pn+1. And if no link is found, the page Pn+1 becomes the first page of a new session. Therefore, we will have the two sessions that xieguahu found: <P5,P1,P6,P7> and <P4,P1,P6,P7>. Last edited by alomo at
22:14 Sep 03, 2009.
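The backward search described in this post can be sketched in Python. This is a rough illustration rather than the paper's Algorithm 1: the function name `find_linking_page` is invented, the link set is a made-up example (the real link structure comes from the assignment), and the sketch covers only the search itself, not what the algorithm does with the session once a linking page is found.

```python
from typing import Optional, Sequence, Set, Tuple

Link = Tuple[str, str]  # (source page, target page)


def find_linking_page(
    session: Sequence[str], next_page: str, links: Set[Link]
) -> Optional[str]:
    """Scan the session from its last page back to its first page.

    Returns the first page found that links to `next_page`, or None,
    in which case `next_page` would start a new session.
    """
    for page in reversed(session):
        if (page, next_page) in links:
            return page
    return None


# Hypothetical link structure, for illustration only.
links = {("P1", "P2"), ("P1", "P6"), ("P2", "P5"), ("P6", "P7")}
session = ["P1", "P2", "P5"]

print(find_linking_page(session, "P6", links))  # P1: neither P5 nor P2 links to P6
print(find_linking_page(session, "P4", links))  # None: P4 would start a new session
```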
|
cysun
Posts: 2935
|
Posted 22:50 Sep 03, 2009 |
Yes, you are right. |
alomo
Posts: 70
|
Posted 14:04 Sep 04, 2009 |
I think my last post needs some corrections. Because we may have more than one reconstructed session, we have to try to append the Pn+1 page to each of these reconstructed sessions. Adding the Pn+1 page to only one reconstructed session would not be enough (would not be correct).
For example, assume the following original session:
S3: <P1, P2, P4, P6, P7, P3>
Just before P3 is added, we would already have three reconstructed sessions: <P1, P2, P4>, <P1, P2, P7>, and <P1, P2, P6, P7>. Now, adding P3:
<P1, P2, P4> + <P3> -> <P1, P2, P4>, <P3> (here a new session is started because none of the pages P1, P2, or P4 links to P3);
<P1, P2, P7> + <P3> -> <P1, P2, P7, P3>;
<P1, P2, P6, P7> + <P3> -> <P1, P2, P6, P7, P3>.
Last edited by alomo at
15:56 Sep 04, 2009.
|
cysun
Posts: 2935
|
Posted 14:38 Sep 04, 2009 |
This is incorrect - <P3> doesn't become a new session. You can read lines 20-30 of Algorithm 1 to see why. |
alomo
Posts: 70
|
Posted 16:44 Sep 04, 2009 |
I just don't see how this case is different from the one mentioned by xieguahu. As I see it, according to the Maximality Rule, this short session <P3> will be absorbed by one of the longer reconstructed sessions that already contain page P3.
Last edited by alomo at
16:44 Sep 04, 2009.
|