Author | Message |
---|---|
mnguyen
Posts: 9
|
Posted 01:07 Feb 08, 2016 |
I finished implementing everything, and according to the Python plot it looks like it converges properly. But I don't know if it's "numerically correct". What values is everyone getting for your initial cost function before running gradient_descent? Mine is 866.58, which seems off and doesn't look like it's predicting anything, even though the graph is converging. Last edited by mnguyen at
01:09 Feb 08, 2016.
|
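[Editor's note] A minimal sketch of the "initial cost" check mnguyen describes, assuming the usual mean-squared-error cost J(θ) = 1/(2m)·Σ(θ₀ + θ₁x − y)² with θ initialized to zero; the data values below are made up for illustration:

```python
def cost(theta0, theta1, xs, ys):
    """Mean-squared-error cost over the whole data set."""
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)

# With theta initialized to zero, the cost is sum(y^2) / (2m), so a large
# initial value like 866.58 is expected whenever the y values are large.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
print(cost(0.0, 0.0, xs, ys))  # large before any training
print(cost(0.0, 2.0, xs, ys))  # 0.0 on this perfectly linear data
```

So a big starting cost by itself says nothing about correctness; what matters is that it decreases toward a floor as gradient descent runs.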
rkmx52
Posts: 23
|
Posted 08:48 Feb 08, 2016 |
I don't think there is a numerically correct value, seeing as everyone will have different train/test split sizes for their training sets. |
msargent
Posts: 519
|
Posted 09:46 Feb 08, 2016 |
The training sets should be the same size if you are all using the same mileage data. |
msargent
Posts: 519
|
Posted 09:48 Feb 08, 2016 |
If it's converging, then you did it right. It might be that there isn't a strong relationship between the features and the y value. You can try doing a few predictions and seeing how close they are. Next week there will be an exercise comparing your implementations with scikit-learn's values. |
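[Editor's note] The sanity check msargent suggests can be sketched without waiting for the scikit-learn exercise: scikit-learn's `LinearRegression` solves ordinary least squares, so for one feature the closed-form slope and intercept are the values gradient descent should converge toward. A pure-Python sketch, with made-up data and hyperparameters:

```python
def ols_fit(xs, ys):
    """Closed-form least-squares fit: returns (intercept, slope)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def gd_fit(xs, ys, lr=0.05, iters=5000):
    """Batch gradient descent on the same squared-error cost."""
    t0 = t1 = 0.0
    m = len(xs)
    for _ in range(iters):
        errs = [t0 + t1 * x - y for x, y in zip(xs, ys)]
        t0 -= lr * sum(errs) / m
        t1 -= lr * sum(e * x for e, x in zip(errs, xs)) / m
    return t0, t1

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]
print(ols_fit(xs, ys))
print(gd_fit(xs, ys))  # should land close to the closed-form answer
```

If the two pairs of numbers agree to a few decimal places, the implementation is almost certainly right, regardless of what the initial cost was.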
rkmx52
Posts: 23
|
Posted 10:04 Feb 08, 2016 |
Are you saying that we shouldn't split the data into training/test sets? I thought that was encouraged. FYI, I am referring to this, from sklearn.cross_validation: `X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)` |
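[Editor's note] For readers unfamiliar with the call rkmx52 quotes: `train_test_split` shuffles the data with a fixed seed and slices off `test_size` of it. A pure-Python sketch of that idea (not scikit-learn's actual implementation):

```python
import random

def split(X, y, test_size=0.33, random_state=42):
    """Seeded shuffle-and-split, mimicking the shape of train_test_split."""
    idx = list(range(len(X)))
    random.Random(random_state).shuffle(idx)  # fixed seed -> reproducible split
    n_test = int(round(len(X) * test_size))
    test, train = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train], [X[i] for i in test],
            [y[i] for i in train], [y[i] for i in test])

X = list(range(10))
y = [2 * v for v in X]
X_train, X_test, y_train, y_test = split(X, y)
print(len(X_train), len(X_test))  # 7 3
```

The `random_state=42` is what makes everyone's split identical on the same data, which is rkmx52's point about comparable set sizes.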
skim144
Posts: 63
|
Posted 10:57 Feb 08, 2016 |
I think rkmx52 is right; we should split the data because we want to avoid overfitting. As long as we're all using test_size=0.33, the resulting training set size (not the contents) should be equal.
|
msargent
Posts: 519
|
Posted 12:57 Feb 08, 2016 |
We split the data into training and test sets when we want to report how accurate our model is. We aren't trying to do that in this assignment: I just asked you to code the algorithm and plot a learning curve. Use the whole data set for training; that way we can compare results. Last edited by msargent at
12:58 Feb 08, 2016.
|
skim144
Posts: 63
|
Posted 14:13 Feb 08, 2016 |
Got it.
|