Author Message
p0941
Posts: 95
Posted 23:00 Aug 24, 2009 |

It looks strange, but I hope this is correct.

(a)

From Algorithm 1 in the paper:

From S1: Path <P3,P1,P2,P5>, <P3,P1,P2,P7>

From S2: Path <P5>

From S3: Path  null

From S4: Path <P5>

From S5: Path <P5><P3>

From S6: Path  null

From S7: Path  <P2,P7>

From S8: Path  null

 

(b)

From Algorithm 2 in the paper:

C1: <P5>, <P3>,<P2>

xieguahu
Posts: 50
Posted 15:21 Aug 26, 2009 |

This is what I got:

(a)

S1: <p1,p2,p5,p3> <p1,p2,p7,p3>

S2: <p1,p4,p5,p1,p6,p7>

S3: <p1,p6,p1,p4>

S4: <p5,p1,p4> <p5,p1,p6,p7>

S5: <p5,p3>

S6: <p1,p2,p7,p3>

S7: <p2,p7>

S8: <p1,p2,p4,p1,p2,p6,p7,p3>

So the sessions are:

<p1,p2,p5,p3>

<p1,p2,p7,p3>

<p1,p4,p5,p1,p6,p7>

<p1,p6,p1,p4>

<p5,p1,p4>

<p1,p2,p4,p1,p2,p6,p7,p3>

 

 

(b)

<p1,p2>

<p1,p4>

Last edited by xieguahu at 15:28 Aug 26, 2009.
HelloWorld
Posts: 88
Posted 16:41 Aug 29, 2009 |

This algorithm is written in a very confusing way.

6: ForEach Page(i) in CandSession

Where is the EndFor? I tried using a ruler to find it, and I couldn't.

Last edited by HelloWorld at 16:41 Aug 29, 2009.
alomo
Posts: 70
Posted 00:57 Aug 30, 2009 |
HelloWorld wrote:

This algorithm is written in a very confusing way.

6: ForEach Page(i) in CandSession

Where is the EndFor? I tried using a ruler to find it, and I couldn't.

There are three steps in this algorithm.

Step I starts on line 5: TPageSet := {}

Then the for-loop runs through line 13, inclusive.

In this for-loop, TPageSet is updated.

The for-loop must be closed there; then Step II begins on line 15, where CandSession is updated.

Just from looking at the code you can see that this for-loop cannot run any further, because a few lines later another for-loop (also over Pagei) begins.

Step III-a starts if the condition on line 16 is satisfied.

Otherwise, Step III-b starts (line 19).

Last edited by alomo at 01:14 Aug 30, 2009.
HelloWorld
Posts: 88
Posted 10:27 Aug 30, 2009 |

 

Then the for-loop runs through line 13, inclusive.

The problem is that there's no EndFor for the for-loop that starts on line 6. Are you sure, though?

 

alomo
Posts: 70
Posted 11:05 Aug 30, 2009 |
HelloWorld wrote:

 

Then the for-loop runs through line 13, inclusive.

The problem is that there's no EndFor for the for-loop that starts on line 6. Are you sure, though?

 

I understand there's no EndFor line.

Here is the code. CandSession is the scope of that loop, and CandSession changes on line 15, so the loop must end before that line. On the other hand, the loop's last use of Pagei is on line 13. That means the EndFor must fall between lines 13 and 15. I added it as line 14 (comments don't need to be numbered).

 6: ForEach Pagei in CandSession
 7:   StartPageFlag := TRUE
 8:   ForEach Pagej in CandSession with j > i
 9:     If ( Link[Pagei, Pagej] = true ) and ( TimeDiff(Pagej, Pagei) ≤ σ ) Then
10:       StartPageFlag := FALSE
11:   End For
12:   If StartPageFlag = TRUE Then
13:     TPageSet := TPageSet ∪ {Pagei}
14: End For
    // Remove the selected pages from the current seq.
15: CandSession := CandSession \ TPageSet

There is another possible mistake in line 9:

9: If ( Link[Pagei, Pagej] = true) ...

Here, Link[Pagei, Pagej] means that there is a link directed from i to j. However, it would make more sense if we switched i and j.
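
To make the two readings of line 9 concrete, here is a minimal Python sketch of the loop as transcribed above, with a flag that swaps the link direction as suggested. The input shapes (a list of (page, timestamp) pairs and a set of (source, target) link pairs), the function name, and the toy data are all assumptions for illustration; none of them come from the paper.

```python
def start_pages(cand_session, links, sigma, swap_direction=False):
    """Return pages that keep StartPageFlag = TRUE in the loop of lines 6-14.

    cand_session: list of (page, timestamp) in visit order (assumed shape).
    links: set of (source, target) pairs, i.e. Link[source, target] = true.
    swap_direction: False tests Link[Pagei, Pagej] as in the listing;
                    True tests Link[Pagej, Pagei] (the suggested switch).
    """
    t_page_set = []
    for i, (page_i, time_i) in enumerate(cand_session):
        start_page_flag = True
        # The inner loop only looks at pages with j > i, as in line 8.
        for page_j, time_j in cand_session[i + 1:]:
            edge = (page_j, page_i) if swap_direction else (page_i, page_j)
            if edge in links and time_j - time_i <= sigma:
                start_page_flag = False
        if start_page_flag:
            t_page_set.append(page_i)
    return t_page_set


# Toy data (hypothetical): A links to B, and B links to C.
session = [("A", 0), ("B", 5), ("C", 10)]
links = {("A", "B"), ("B", "C")}

as_posted = start_pages(session, links, sigma=30)
switched = start_pages(session, links, sigma=30, swap_direction=True)
print(as_posted)
print(switched)
```

On this toy session the as-posted reading keeps only C, the last page visited, while the switched reading keeps every page, so it may be worth tracing both by hand on the assignment data before settling on one.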

 



Last edited by alomo at 14:04 Aug 30, 2009.
HelloWorld
Posts: 88
Posted 13:30 Aug 30, 2009 |

Here, Link[Pagei, Pagej] means that there is a link directed from i to j. However, it would make more sense if we switched i and j.

Yeah, I agree with you, but I still don't see why the absence of a link from j to i would make page i a start page. It's basically saying that subpages cannot link back to the main page, which doesn't make sense.

Also, what does this mean?

TSessionSet := TSessionSet ∪ {[Pagei]}

Last edited by HelloWorld at 13:42 Aug 30, 2009.
alomo
Posts: 70
Posted 14:03 Aug 30, 2009 |
HelloWorld wrote:

Here, Link[Pagei, Pagej] means that there is a link directed from i to j. However, it would make more sense if we switched i and j.

Yeah, I agree with you, but I still don't see why the absence of a link from j to i would make page i a start page. It's basically saying that subpages cannot link back to the main page, which doesn't make sense.

Actually, it does not say that subpages cannot link back to the main page (the page we just came from). It just chooses not to use such links. As I understand it, we are interested in forward links that move away from Pagei along the session, so it does not matter whether there is a way to return to the main page along the same path. I don't see another explanation for that.

TSessionSet := TSessionSet ∪ {[Pagei]}

TSessionSet stands for the Temporary Session Set. In Step III we append TPageSet to TSessionSet, but because TSessionSet is empty at this moment (Step III-a), we can simply take the union.

Last edited by alomo at 14:44 Aug 30, 2009.
HelloWorld
Posts: 88
Posted 14:50 Aug 30, 2009 |

TSessionSet stands for the Temporary Session Set. In Step III we append TPageSet to TSessionSet, but because TSessionSet is empty at this moment (Step III-a), we can simply take the union.

I know that it's a union and that it's the Temporary Session Set. My question is rather: what is it being unioned with? I thought it was being unioned with the session that contains that Pagei, but I'm not 100% sure.

xieguahu
Posts: 50
Posted 14:53 Aug 30, 2009 |
We don't union it with any session. We union it with an empty set, and this only occurs during the first step.
HelloWorld wrote:

TSessionSet stands for the Temporary Session Set. In Step III we append TPageSet to TSessionSet, but because TSessionSet is empty at this moment (Step III-a), we can simply take the union.

I know that it's a union and that it's the Temporary Session Set. My question is rather: what is it being unioned with? I thought it was being unioned with the session that contains that Pagei, but I'm not 100% sure.

 

alomo
Posts: 70
Posted 11:29 Aug 31, 2009 |

I am getting quite different results:

(a)

S1: <p1,p2,p5,p3>, <p1,p2,p7,p3>
S2: <p1,p4,p5>, <p1,p6,p7>
S3: <p1,p4>, <p1,p6>
S4: <p5,p1,p4>, <p5,p1,p6,p7>
S5: <p5,p3>
S6: <p1,p2,p7,p3>
S7: <p2,p7>
S8: <p1,p2>, <p1,p4>, <p1,p6,p7,p3>

So the sessions are:
<p1,p2,p5,p3>
<p1,p2,p7,p3>
<p1,p4,p5>
<p5,p1,p4>
<p5,p1,p6,p7>
<p1,p6,p7,p3>

 

(b)

I got the same results as xieguahu: <p1,p2>, <p1,p4>

But I have some doubts about that result. Do we apply Apriori to the original set or to the set we got in (a)?
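
One way to sanity-check (b) against either input set is to count how many sessions contain each candidate pattern. The sketch below is only a support counter, not Algorithm 2 from the paper; it assumes that a pattern must appear as a run of adjacent pages, and it uses the six sessions listed in (a) above. The helper names are hypothetical, and the minimum support from the assignment is not reproduced here.

```python
def contains(session, pattern):
    """True if pattern occurs in session as a run of adjacent pages."""
    n, m = len(session), len(pattern)
    return any(session[k:k + m] == pattern for k in range(n - m + 1))


def support(sessions, pattern):
    """Number of sessions that contain the pattern at least once."""
    return sum(1 for s in sessions if contains(s, pattern))


# The six sessions listed in (a) above.
sessions = [
    ["p1", "p2", "p5", "p3"],
    ["p1", "p2", "p7", "p3"],
    ["p1", "p4", "p5"],
    ["p5", "p1", "p4"],
    ["p5", "p1", "p6", "p7"],
    ["p1", "p6", "p7", "p3"],
]

for pattern in (["p1", "p2"], ["p1", "p4"], ["p2", "p5"]):
    print(pattern, support(sessions, pattern))
```

Swapping in the original sessions for `sessions` gives the counts for the other interpretation, so the two answers to the question can be compared directly.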

Last edited by alomo at 13:05 Aug 31, 2009.
nshatok
Posts: 19
Posted 12:40 Aug 31, 2009 |

This is what I am getting:

Sessions:

a)

{P1, P2, P5,P3}

{P1,P2,P7,P3}

{P1, P4,P5,P1,P6,P7}

{P1,P6,P1,P4}

{P5,P1,P6,P7}

{P4,P1,P6,P7}

{P1,P2,P4,P1,P2,P6,P7,P3}

b) frequent patterns:

P1,P2

P1,P6

P1,P6,P7

nshatok
Posts: 19
Posted 12:50 Aug 31, 2009 |

Sorry, this is what I got:

Sessions:

a)

{P1, P2, P5,P3}

{P1,P2,P7,P3}

{P1, P4,P5,P1,P6,P7}

{P1,P6,P1,P4}

{P4,P1,P6,P7}

{P1,P2,P4,P1,P2,P6,P7,P3}

b) frequent patterns:

P1,P2

 

As far as I understand the approach described on p. 163 of the paper, I think P4 should start a new session.

alomo
Posts: 70
Posted 13:46 Aug 31, 2009 |

I think we are getting such different results because we read the algorithm's code in different ways.

It was my mistake to add the EndFor between the Ifs. The more appropriate reading is the following:

 6: ForEach Pagei in CandSession
 7:   StartPageFlag := TRUE
 8:   ForEach Pagej in CandSession with j > i
 9:     If ( Link[Pagei, Pagej] = true ) and ( TimeDiff(Pagej, Pagei) ≤ σ ) Then
10:       StartPageFlag := FALSE
11:     If ( StartPageFlag = true ) Then
12:       TPageSet := TPageSet ∪ {Pagei}
13:   End For
14: End For
    // Remove the selected pages from the current seq.
15: CandSession := CandSession \ TPageSet
Last edited by alomo at 13:48 Aug 31, 2009.
cysun
Posts: 2935
Posted 16:59 Aug 31, 2009 |

No one is completely correct so far. From S2 you should get <P1,P4,P5,P1,P6,P7> and <P1,P4,P1,P6,P7>.

xieguahu
Posts: 50
Posted 17:43 Aug 31, 2009 |
But <p1,p4,p1,p6,p7> is a subsequence of <P1,P4,P5,P1,P6,P7>, and Maximality Rule will be violated.
cysun wrote:

No one is completely correct so far. From S2 you should get <P1,P4,P5,P1,P6,P7> and <P1,P4,P1,P6,P7>.

 

cysun
Posts: 2935
Posted 17:59 Aug 31, 2009 |
xieguahu wrote:
But <p1,p4,p1,p6,p7> is a subsequence of <P1,P4,P5,P1,P6,P7>, and Maximality Rule will be violated.
cysun wrote:

No one is completely correct so far. From S2 you should get <P1,P4,P5,P1,P6,P7> and <P1,P4,P1,P6,P7>.

 

Why is <p1,p4,p1,p6,p7> a subsequence of <p1,p4,p5,p1,p6,p7>? I haven't noticed a clear definition of sequence containment in the paper, so I assume that, to be a subsequence, all elements must be adjacent.

xieguahu
Posts: 50
Posted 18:06 Aug 31, 2009 |
Because in your slides (more.ppt), page 5:
A=<a1a2a3…an>
B=<b1b2b3…bm>
A is a subsequence of B if there exist 1<=j1<j2<…<jn<=m such that a1 'belongs to' bj1, a2 'belongs to' bj2, …, an 'belongs to' bjn.
And it does not say that all elements should be adjacent.
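
The two notions of containment being compared can be checked mechanically. This is a minimal sketch, assuming each element of a sequence is a single page, so that 'belongs to' reduces to equality; both function names are hypothetical.

```python
def is_subsequence(a, b):
    """Slide-style definition: a's elements appear in b in order, gaps allowed."""
    remaining = iter(b)
    # `x in remaining` advances the iterator, so order is respected.
    return all(x in remaining for x in a)


def is_contiguous_subsequence(a, b):
    """Stricter reading: a's elements must be adjacent in b."""
    n, m = len(a), len(b)
    return any(b[k:k + n] == a for k in range(m - n + 1))


short = ["p1", "p4", "p1", "p6", "p7"]
full = ["p1", "p4", "p5", "p1", "p6", "p7"]

print(is_subsequence(short, full))
print(is_contiguous_subsequence(short, full))
```

Under the slide definition the shorter sequence is contained in the longer one; under the adjacency reading it is not, because p5 sits between p4 and the second p1.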
cysun wrote:
Why is <p1,p4,p1,p6,p7> a subsequence of <p1,p4,p5,p1,p6,p7>? I haven't noticed a clear definition of sequence containment in the paper, so I assume that, to be a subsequence, all elements must be adjacent.

 

Last edited by xieguahu at 18:06 Aug 31, 2009.
cysun
Posts: 2935
Posted 18:09 Aug 31, 2009 |
xieguahu wrote:
Because in your slides (more.ppt), page 5:
A=<a1a2a3…an>
B=<b1b2b3…bm>
A is a subsequence of B if there exist 1<=j1<j2<…<jn<=m such that a1 'belongs to' bj1, a2 'belongs to' bj2, …, an 'belongs to' bjn.
And it does not say that all elements should be adjacent.

That definition does not apply to this paper.

cysun
Posts: 2935
Posted 18:20 Aug 31, 2009 |
cysun wrote:
That definition does not apply to this paper.

But you are right that from S2 there's only one session <p1,p4,p5,p1,p6,p7>. I tried the algorithm again and realized that I made a mistake. Sorry about that.

cysun
Posts: 2935
Posted 18:33 Aug 31, 2009 |
xieguahu wrote:
...

S4: <p5,p1,p4> <p5,p1,p6,p7>

...

How did you get <p5,p1,p4> for S4? There's no p1 preceding p4 in S4. The sessions for S1, S2, and S3 are correct. I'm continuing to verify the rest.

xieguahu
Posts: 50
Posted 18:43 Aug 31, 2009 |
I think it is not correct, should it be <p5,p1,p6,p7><p4,p1,p6,p7>?
cysun wrote:
xieguahu wrote:
...

S4: <p5,p1,p4> <p5,p1,p6,p7>

...

How did you get <p5,p1,p4> for S4? There's no p1 preceding p4 in S4. The sessions for S1, S2, and S3 are correct. I'm continuing to verify the rest.

 

cysun
Posts: 2935
Posted 18:43 Aug 31, 2009 |
xieguahu wrote:

This is what I got:

(a)

S1: <p1,p2,p5,p3> <p1,p2,p7,p3>

S2: <p1,p4,p5,p1,p6,p7>

S3: <p1,p6,p1,p4>

S4: <p5,p1,p4> <p5,p1,p6,p7>

S5: <p5,p3>

S6: <p1,p2,p7,p3>

S7: <p2,p7>

S8: <p1,p2,p4,p1,p2,p6,p7,p3>

So the sessions are:

 

<p1,p2,p5,p3>

<p1,p2,p7,p3>

<p1,p4,p5,p1,p6,p7>

<p1,p6,p1,p4>

<p5,p1,p4>

<p1,p2,p4,p1,p2,p6,p7,p3>


<p5,p1,p4> for S4 is wrong, but the rest is fine. However, you should not drop the short sessions: the maximality rule only applies to sessions constructed from the same original session.
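
A minimal sketch of applying the maximality rule per original session rather than across the whole result. It assumes containment means "appears as a run of adjacent pages", as assumed earlier in this thread; the function names and the grouping of candidates by original session are hypothetical.

```python
def contains(longer, shorter):
    """True if shorter occurs in longer as a run of adjacent pages."""
    n, m = len(longer), len(shorter)
    return any(longer[k:k + m] == shorter for k in range(n - m + 1))


def maximal_sessions(candidates):
    """Drop candidates contained in another candidate from the SAME original."""
    unique = []
    for s in candidates:
        if s not in unique:
            unique.append(s)
    return [s for s in unique
            if not any(s != t and contains(t, s) for t in unique)]


def reconstruct_all(per_original):
    """Apply the rule within each original session, then concatenate."""
    result = []
    for candidates in per_original:
        result.extend(maximal_sessions(candidates))
    return result


# Candidates grouped by the original session they came from (S1 and S5 above).
per_original = [
    [["p1", "p2", "p5", "p3"], ["p1", "p2", "p7", "p3"]],  # from S1
    [["p5", "p3"]],                                         # from S5
]
final = reconstruct_all(per_original)
print(final)
```

Here <p5,p3> from S5 survives even though it also occurs inside <p1,p2,p5,p3> from S1, because the two come from different original sessions.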

cysun
Posts: 2935
Posted 18:46 Aug 31, 2009 |
xieguahu wrote:
I think it is not correct, should it be <p5,p1,p6,p7><p4,p1,p6,p7>?

It should be just <p5,p1,p6,p7>. p4 is not linked from p5, and according to the algorithm, it's not included in the new sessions.

alomo
Posts: 70
Posted 22:12 Sep 03, 2009 |
cysun wrote:
xieguahu wrote:
I think it is not correct, should it be <p5,p1,p6,p7><p4,p1,p6,p7>?

It should be just <p5,p1,p6,p7>. p4 is not linked from p5, and according to the algorithm, it's not included in the new sessions.

I think xieguahu is right. The algorithm says that when Link(Pn, Pn+1) = false, we have to check the pages in the already constructed part of this session that were visited before Pn (going backward from Pn-1 to P1) to find a link to Pn+1. And if no link is found, page Pn+1 becomes the first page of a new session. Therefore, we will have the two sessions that xieguahu found: <P5,P1,P6,P7> and <P4,P1,P6,P7>.

Last edited by alomo at 22:14 Sep 03, 2009.
cysun
Posts: 2935
Posted 22:50 Sep 03, 2009 |
alomo wrote:
cysun wrote:
xieguahu wrote:
I think it is not correct, should it be <p5,p1,p6,p7><p4,p1,p6,p7>?

It should be just <p5,p1,p6,p7>. p4 is not linked from p5, and according to the algorithm, it's not included in the new sessions.

I think xieguahu is right. The algorithm says that when Link(Pn, Pn+1) = false, we have to check the pages in the already constructed part of this session that were visited before Pn (going backward from Pn-1 to P1) to find a link to Pn+1. And if no link is found, page Pn+1 becomes the first page of a new session. Therefore, we will have the two sessions that xieguahu found: <P5,P1,P6,P7> and <P4,P1,P6,P7>.

Yes, you are right.

alomo
Posts: 70
Posted 14:04 Sep 04, 2009 |
I think my last post needs some corrections. Because we may have more than one reconstructed session, we have to try to append page Pn+1 to each of these reconstructed sessions. Adding page Pn+1 to only one reconstructed session would not be correct.

For example, assume the following original session:

S3: <P1, P2, P4, P6, P7, P3>

At that moment we would already have three reconstructed sessions: <P1, P2, P4>, <P1, P2, P7>, and <P1, P2, P6, P7>.

Now, adding P3:

<P1, P2, P4> + <P3> -> <P1, P2, P4>, <P3> (here a new session starts because none of the pages P1, P2, or P4 links to P3);

<P1, P2, P7> + <P3> -> <P1, P2, P7, P3>;

<P1, P2, P6, P7> + <P3> -> <P1, P2, P6, P7, P3>.

Last edited by alomo at 15:56 Sep 04, 2009.
cysun
Posts: 2935
Posted 14:38 Sep 04, 2009 |
alomo wrote:

...

<P1, P2, P4> + <P3> -> <P1, P2, P4>, <P3> (here a new session starts because none of the pages P1, P2, or P4 links to P3);

...

This is incorrect: <P3> doesn't become a new session. You can read lines 20-30 of Algorithm 1 to see why.

alomo
Posts: 70
Posted 16:44 Sep 04, 2009 |
cysun wrote:
alomo wrote:

...

<P1, P2, P4> + <P3> -> <P1, P2, P4>, <P3> (here a new session starts because none of the pages P1, P2, or P4 links to P3);

...

This is incorrect: <P3> doesn't become a new session. You can read lines 20-30 of Algorithm 1 to see why.

I just don't see how this case is different from the one mentioned by xieguahu. As I see it, according to the Maximality Rule, this short session <P3> will be absorbed by one of the longer reconstructed sessions that already contain page P3.

 

Last edited by alomo at 16:44 Sep 04, 2009.