Author | Message |
---|---|
ljuster2
Posts: 19
|
Posted 19:37 Jan 28, 2015 |
I am trying to find the indices that have null values for Age in my test data, and I am using pandas to read in the data. In the test.csv file, the ages are not whole integers like they are in the train.csv file. Ex.: test.csv has ages like '23.5' or '67.0', while train.csv has '23' or '67'.
When I run the following line:
test_data.isnull(test_data.Age).astype(int)
I get the error:
TypeError: isnull() takes exactly 1 argument (2 given)
I do NOT get this error when I run the same line with my training data. I couldn't figure out why. Any thoughts?
Thanks
|
ljuster2
Posts: 19
|
Posted 08:31 Jan 29, 2015 |
Found my own silly error. I imported pandas as 'pd', so isnull has to be called on pd, not on the DataFrame:
test_data['AgeIsNull'] = pd.isnull(test_data.Age).astype(int)
NOT
test_data['AgeIsNull'] = test_data.isnull(test_data.Age).astype(int) |
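For anyone who hits the same TypeError, here is a minimal sketch of the difference between the two calls. The small DataFrame is a made-up stand-in for test.csv (only the Age column matters here), and the names null_idx and age_is_null_method are just for illustration. The module-level pd.isnull takes the object to check as its argument, while the isnull method on a DataFrame or column takes no argument at all, which is why passing test_data.Age to it fails.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for test.csv; the values are invented for the example.
test_data = pd.DataFrame({"Age": [23.5, np.nan, 67.0, np.nan]})

# Module-level function: the object to check is passed as the argument.
test_data["AgeIsNull"] = pd.isnull(test_data.Age).astype(int)

# Method form: called on the column itself, with no argument.
age_is_null_method = test_data.Age.isnull().astype(int)

# The failing pattern: the DataFrame method accepts no extra argument.
try:
    test_data.isnull(test_data.Age)
    raised_type_error = False
except TypeError:
    raised_type_error = True

# Row indices where Age is missing, which was the original goal.
null_idx = test_data.index[test_data.Age.isnull()].tolist()
print(null_idx)
```

Both working forms give the same 0/1 column, so which one to use is mostly a matter of taste.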
ljuster2
Posts: 19
|
Posted 12:56 Jan 30, 2015 |
Question on the extra credit: How are we supposed to use the test set on the model if the test set does not include information on whether the test examples survived or not? |
msargent
Posts: 519
|
Posted 13:20 Jan 30, 2015 |
Good question. Try this: divide the training set up into 2 sections: train on one, test on the other. |
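One possible way to do that split, as a rough sketch only: the DataFrame below is randomly generated and merely stands in for train.csv, the 80/20 ratio is an arbitrary choice, and the "model" is just a majority-class baseline so the example stays self-contained. The names fit_part and holdout_part are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for train.csv; values are random, not real data.
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "Age": rng.uniform(1, 80, size=100).round(1),
    "Survived": rng.integers(0, 2, size=100),
})

# Shuffle the rows, then keep 80% for fitting and hold out 20% for testing.
shuffled = train.sample(frac=1.0, random_state=0)
cut = int(len(shuffled) * 0.8)
fit_part = shuffled.iloc[:cut]
holdout_part = shuffled.iloc[cut:]

# Placeholder "model": always predict the most common label in fit_part.
majority = int(fit_part["Survived"].mode().iloc[0])

# Score on the held-out rows, whose true labels are known.
accuracy = (holdout_part["Survived"] == majority).mean()
print(len(fit_part), len(holdout_part), accuracy)
```

Because the held-out rows come from the training file, their Survived values are available for scoring, which is exactly what the real test set lacks.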