Author Message
dtang9
Posts: 52
Posted 17:06 Nov 13, 2019 |

For part d, it says bootstrap_size = 0.8*(Size of the original dataset). What is the size of the original dataset? Is it the size of the df from part a?

I am also unsure of how to perform voting.

Last edited by dtang9 at 17:31 Nov 13, 2019.
mpourhoma
Posts: 39
Posted 20:54 Nov 13, 2019 |

No, it should be 0.8 * (size of the original “Training” dataset, built in part (b)). This is mentioned in step 1.
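
To make the size calculation concrete, here is a minimal sketch of drawing one bootstrap sample from the training set. It assumes the sample is drawn with replacement (the usual meaning of "bootstrap"); the function name `bootstrap_indices`, the `seed` argument, and the training-set size of 1000 are hypothetical and not part of the assignment.

```python
import numpy as np

def bootstrap_indices(n_train, fraction=0.8, seed=0):
    """Return row indices for one bootstrap sample of the training set.

    n_train is the number of rows in the TRAINING set (part b),
    not the full dataframe from part a.
    """
    rng = np.random.default_rng(seed)
    bootstrap_size = int(fraction * n_train)  # 0.8 * size of training set
    # Assumption: sampling WITH replacement, so indices may repeat.
    return rng.integers(0, n_train, size=bootstrap_size)

# Hypothetical training set with 1000 rows.
idx = bootstrap_indices(1000)
print(len(idx))
```

The returned indices can then be used to select rows of the training features and labels before fitting each classifier.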

As for voting, there are many different ways to do it. I wanted you to think about it, be creative, and do it in your own way. But here is one approach:
In EACH round, after you train one of the 19 classifiers, run prediction on the testing set right away and save the results as one column of a matrix. After the loop finishes, you should have a matrix with 19 columns (one per classifier) and as many rows as there are testing samples (one row per sample). Then, for each row, check whether it contains more ones or more zeros, and take the majority as the final prediction for that testing sample. This is only one way to perform voting! Maybe you can design a more efficient way!
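
The approach described above could be sketched roughly as follows. This only covers the voting step: it assumes the labels are 0/1 and that the predictions are already collected in a matrix with one row per testing sample and one column per classifier. The function name `majority_vote` and the toy matrix are made up for illustration.

```python
import numpy as np

def majority_vote(predictions):
    """Row-wise majority vote over 0/1 predictions.

    predictions: array of shape (n_test_samples, n_classifiers).
    Returns an array of shape (n_test_samples,) with the winning label.
    """
    predictions = np.asarray(predictions)
    n_classifiers = predictions.shape[1]
    ones_per_row = predictions.sum(axis=1)  # number of "1" votes per sample
    # With an odd number of classifiers (e.g. 19) there can be no ties.
    return (ones_per_row > n_classifiers / 2).astype(int)

# Toy matrix: 3 testing samples, 19 classifiers.
preds = np.zeros((3, 19), dtype=int)
preds[0, :10] = 1   # 10 ones vs 9 zeros
preds[1, :9] = 1    # 9 ones vs 10 zeros
preds[2, :] = 1     # all ones

final = majority_vote(preds)
print(final)  # [1 0 1]
```

In the actual loop, column `i` of the matrix would be filled with the predictions of classifier `i` on the testing set before calling the vote.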