reset password
Author Message
intregrisist
Posts: 41
Posted 00:54 Dec 03, 2010 |

Hi, when we split a leaf on a B-Tree order 3 (for example), where should the split happen?

Lets say we have the following B-Tree:

[1|3|5]

now we want to insert 4, is the result:

   [3| | ]

[1| | ] [4|5| ]

OR

   [4| | ]

[1|3| ] [5| | ]

According to what I found online, it says the the median is the value that moves up, but which would be considered the median. Also, wen we are choosing the median are we looking at:

[1|3|4|5] OR [1|3|5]

In a B-Tree Order 2 I just take the one in the middle after inserting the new value:

[1|3]

Add 4:

   [3| ]

[1| ] [4| ]

But what happens if we were inserting a 2 instead of the 4.  I hope I was clear about my question.  This isn't explained clearly, I might just be over thinking this.

cysun
Posts: 2935
Posted 08:24 Dec 03, 2010 |

The split as described in the lecture always moves up the smallest value in the right node, so the result should be

  [4| | ]

[1|3| ] [5| | ]

Mathematically the median of an even-numbered list (e.g. 1,3,4,5) is the average of the two middle values (i.e. 3.5), which may not always be a valid value (e.g. the median of 'a' and 'b' is not a character), so you need to either round up or round down. In our implementation we always round up.

Last edited by cysun at 08:26 Dec 03, 2010.
intregrisist
Posts: 41
Posted 19:51 Dec 06, 2010 |

I am confused, are we rounding to the nearest whole number not the nearest value (when obtaining the median)?

For example, lets say we have the following B-Tree (Order 3):

|4|5|26|

We want to insert 30, is the correct outcome:

|4|5|26|30|: (5+26) / 2 = 15.5 => 16

         |16|__|__|   // median

|4|5|__| -> |26|30|__|

OR

         |26|__|__|   // 26, because if you round up to the nearest value it would be 26.

|4|5|__| -> |26|30|__|

 

Also, I was told that all the values should be in the lowest level of the B-Tree.  In your response, 4 becomes an index and does not show in the lowest level even though it was one of the original values:

  [4| | ]

[1|3| ] [5| | ]

I thought the result should be:

  [4| | ]

[1|3| ] [4|5| ]

cysun
Posts: 2935
Posted 20:17 Dec 06, 2010 |

I shouldn't have said "round up" or "round down" as it implies integer values but in practice an integer value may not be a valid key. The correct way to put it is that we can move up either the largest value in the left node or the smallest value in the right node, and we always pick the smallest value in the right node. So in your first example we move up 26 instead 16.

You are also right in the 2nd part where the correct result should be

  [4| | ]

[1|3| ] [4|5| ]

I didn't notice "split a leaf" in the original question. Sorry about that.

ptran6
Posts: 25
Posted 20:57 Dec 06, 2010 |

Dr. Sun, it seems to me that the approach is to split the leaf as "evenly" as possible into two nodes, and move up the smallest value in the right node. For example:

[1|2|3|4] is split into [1|2] and [3|4], and 3 is moved up. Splitting [1|2|3|4] into [1] and [2|3|4] would be wrong.

[1|2|3] is split into [1] and [2|3], and 2 is moved up. Splitting [1|2|3] into [1|2] and [3] would be wrong.

Is this correct?

Last edited by ptran6 at 21:00 Dec 06, 2010.
intregrisist
Posts: 41
Posted 21:38 Dec 06, 2010 |

What if you have the following (made up) B-Tree:

                                                      [10|__|__]

           [5|__|__]                                                                        [20|30|50]               

[1|__|__] -> [6|__|__]                          [11|12|13] -> [21|23|25]  -> [31|35|38] -> [52|54|56]

 

We want to insert 12:

                                                                                     [10|30|__]

           [5|__|__]                                                           [16|20|__]                                                                               [30|50|__]               

[1|__|__] -> [6|__|__]                          [12|14|__] -> [16|18|__] -> [21|23|25]                              [__|__|__] -> [31|35|38] -> [52|54|56]

 

Is this the correct outcome?  Notice how [30|50|__]  in level 2 will never get a number less than 30.  I ran into the same issue in Q6.  BTW, this is my last questions for B-Trees.  I just want to make sure I understand every situation since this came up in the Sample Final.

cysun
Posts: 2935
Posted 21:52 Dec 06, 2010 |

One of the B-tree properties is that all nodes except root must be at least half full (see lecture notes), so for a B-tree of order 3, a split of [1] and [2|3|4] is definitely wrong.

For a B-tree of order 2, both [1]/[2|3] split and [1|2]/[3] split are technically correct. Which split to choose is implementation-specific. In our case we choose the first way to split because we move up the smallest of the right node.

cysun
Posts: 2935
Posted 22:01 Dec 06, 2010 |
intregrisist wrote:

What if you have the following (made up) B-Tree:

                                                      [10|__|__]

           [5|__|__]                                                                        [20|30|50]               

[1|__|__] -> [6|__|__]                          [11|12|13] -> [21|23|25]  -> [31|35|38] -> [52|54|56]

 

We want to insert 12:

  ...

In your example, 12 is already in the tree. Handling duplicate keys in B-tree requires additional mechanism (e.g. overflow leaf blocks). You don't need to worry about duplicate keys in the exam.

intregrisist
Posts: 41
Posted 22:12 Dec 06, 2010 |

Sorry, I didn't realize there was a 12.  What I wanted to point out is the situation with the children of [30|50|__] :

EX: (instead of 12 its 15) same B-Tree as my last post:

After we insert 15, we get:

                                                                                      [10|30|__]

           [5|__|__]                                                            [13|20|__]                                                                            [30|50|__]               

[1|__|__] -> [6|__|__]                          [11|12|__] -> [13|15|__] -> [21|23|25]                           [__|__|__] -> [31|35|38] -> [52|54|56]

 

Is this normal or correct?

cysun
Posts: 2935
Posted 07:50 Dec 07, 2010 |
intregrisist wrote:

Sorry, I didn't realize there was a 12.  What I wanted to point out is the situation with the children of [30|50|__] :

EX: (instead of 12 its 15) same B-Tree as my last post:

After we insert 15, we get:

                                                                                      [10|30|__]

           [5|__|__]                                                            [13|20|__]                                                                            [30|50|__]               

[1|__|__] -> [6|__|__]                          [11|12|__] -> [13|15|__] -> [21|23|25]                           [__|__|__] -> [31|35|38] -> [52|54|56]

 

Is this normal or correct?

It's incorrect.

  • The 3rd node in the 2nd level should be [50] instead of [30|50].
  • There shouldn't be an empty node at the leaf level (3rd from the right).
  • Although unrelated to this insert, the two leaf nodes on the left (i.e. [1] and [6]) are incorrect too because you shouldn't have two sibling nodes each having only one key.
cdecesa
Posts: 8
Posted 15:46 Dec 07, 2010 |

Dr. Sun,

I am still having some trouble with the B-Tree in question 6 of the sample final. I have tried several times but am struggling to find the correct solution. Can you please walk us through the steps for inserting 115 into this B-Tree (assuming we have already performed the insert operations for 180,139,176)?

HuanXue
Posts: 34
Posted 15:50 Dec 07, 2010 |

Why there's 5,10,30,50 in the non-leaf nodes in intregrisist's question?

All the keys should appear in the leafs, no?

HuanXue
Posts: 34
Posted 15:55 Dec 07, 2010 |

I think it's like this after insert the 115:

[100|120|  ]

[110|  |  ]                                                                       [150|179|  ]

[100|101|  ]     [110|115|  ]                                        [120|130|139]

 

I ignored the not related part.

cysun
Posts: 2935
Posted 15:57 Dec 07, 2010 |
cdecesa wrote:

Dr. Sun,

I am still having some trouble with the B-Tree in question 6 of the sample final. I have tried several times but am struggling to find the correct solution. Can you please walk us through the steps for inserting 115 into this B-Tree (assuming we have already performed the insert operations for 180,139,176)?

I don't give out answers to the sample exams. Feel free to discuss among yourselves though.

intregrisist
Posts: 41
Posted 15:59 Dec 07, 2010 |
HuanXue wrote:

Why there's 5,10,30,50 in the non-leaf nodes in intregrisist's question?

All the keys should appear in the leafs, no?

You are right.  When I made up the B-Tree I wasn't concerned on the leafs on the left hand side.  What I was going for is the empty leaf that I get when inserting 15.  A similar thing happens when inserting 115 in the sample final.

intregrisist
Posts: 41
Posted 16:07 Dec 07, 2010 |
HuanXue wrote:

I think it's like this after insert the 115:

[100|120|  ]

[110|  |  ]                                                                       [150|179|  ]

[100|101|  ]     [110|115|  ]                                        [120|130|139]

 

I ignored the not related part.

According to Sun: "...we always pick the smallest value in the right node."  In this case you used the largest of the left node. Also, do we not keep the 120 in lvl 2?

HuanXue
Posts: 34
Posted 16:12 Dec 07, 2010 |

110 is the largest of the left node.

I think we don't need 120 because [120|130|139] is the smallest child of [150|179|  ].

HuanXue
Posts: 34
Posted 16:13 Dec 07, 2010 |

I'm sorry. I mean "110 is the smallest value in the right node".

intregrisist
Posts: 41
Posted 16:15 Dec 07, 2010 |

Shouldn't you have used the smallest value in the right node for the split in lvl 2?  110 is the smallest in the left node, not the right node

Last edited by intregrisist at 16:17 Dec 07, 2010.
HuanXue
Posts: 34
Posted 16:21 Dec 07, 2010 |

I think no.

In the root case, that should be the smallest value of the leafs in the part which this key points to.

If you don't understand my poor English... you can see the lecture ppt "index.ppt", page 14.  After the insert, 160 should be the key of the right part.

intregrisist
Posts: 41
Posted 16:24 Dec 07, 2010 |

Also, after reading some more.  This isn't a B-Tree.  After doing some research what we are implementing here is a B+ Tree.  A B-Tree does not need to have all the values in its leaf, a B+ Tree does.  I wonder if this is true, else we may be following the wrong algorithm.  Sun told me we needed to have all the values in the leaf which is a property of a B+ Tree not a B-Tree

Last edited by intregrisist at 16:29 Dec 07, 2010.
intregrisist
Posts: 41
Posted 16:29 Dec 07, 2010 |
HuanXue wrote:

I think no.

In the root case, that should be the smallest value of the leafs in the part which this key points to.

If you don't understand my poor English... you can see the lecture ppt "index.ppt", page 14.  After the insert, 160 should be the key of the right part.

It does get the smallest value in the right node BUT it doesn't keep that value.  This would solve the issue I was having.

HuanXue
Posts: 34
Posted 16:29 Dec 07, 2010 |

Well... I still think I'm following the B-tree insert rules which we studied on the class.

intregrisist
Posts: 41
Posted 16:41 Dec 07, 2010 |
HuanXue wrote:

Well... I still think I'm following the B-tree insert rules which we studied on the class.

Ah, I see.  He does mention its a B+ Tree in slide 7.  That's why the algorithm is going off a B+ Tree properties even though it says B-Tree later on.

intregrisist
Posts: 41
Posted 16:53 Dec 07, 2010 |

Either way, I think the answer should be (after inserting 180,139,176,115):

                            [100|150|__]

                            [110|120|__]                                                                  [179|__|__]

[100|101|__] -> [110|115|__] -> [120|130|139]                   [150|156|176] -> [179|180|__]

 

I hope Sun can confirm this.