I have created a slightly different version of your program with the
following changes:
I have changed it to take a single "Training File" as input, which it
automatically splits into the "tr_data", "cv_data" and "gt_data"
files. This might be easier for people who have not watched Andrew
Ng's video and so do not know what the three files mean (and do).
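
For readers unfamiliar with the three files, here is a minimal sketch of how a single training file could be split. It is not the code in this pull request. It assumes the usual anomaly-detection setup, where "tr_data" holds the features used for fitting, "cv_data" holds the features of a held-out cross-validation set, and "gt_data" holds the ground-truth labels for that held-out set. The function name `split_training_file`, the `label` column name, and the 30% hold-out fraction are all hypothetical.

```python
import pandas as pd


def split_training_file(df, label_col="label", cv_fraction=0.3, seed=0):
    """Split one labelled table into tr_data, cv_data and gt_data.

    Assumption: `label_col` marks anomalies (1) and normal rows (0).
    """
    # Shuffle the rows reproducibly before splitting.
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_cv = int(round(len(shuffled) * cv_fraction))

    cv_part = shuffled.iloc[:n_cv]
    tr_part = shuffled.iloc[n_cv:]

    tr_data = tr_part.drop(columns=[label_col])  # features used for fitting
    cv_data = cv_part.drop(columns=[label_col])  # held-out features
    gt_data = cv_part[label_col]                 # labels for the held-out rows
    return tr_data, cv_data, gt_data


# Small in-memory example standing in for the "Training File".
example = pd.DataFrame({
    "x1": [float(i) for i in range(10)],
    "x2": [float(i) * 2 for i in range(10)],
    "label": [0] * 9 + [1],
})
tr_data, cv_data, gt_data = split_training_file(example)
```

With a file on disk, the same function could be called on the result of `pd.read_csv(path)`, where `path` is whatever training file the user supplies.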
I have added another function called "select_num_cols" that
automatically selects the numeric columns from the data set above. This
lets most data scientists work with a smaller feature set than the one
they started with. It will also work well with your Gaussian
Distribution program.
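
As an illustration of the idea only, numeric-column selection can be sketched with pandas as below. The body shown for `select_num_cols` is an assumption about how such a function might be written, not the implementation in this pull request.

```python
import pandas as pd


def select_num_cols(df):
    """Return only the numeric columns of `df` (hypothetical body)."""
    # select_dtypes(include="number") keeps integer and float columns
    # and drops text columns.
    return df.select_dtypes(include="number")


mixed = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [0.5, 1.5, 2.5],
    "name": ["x", "y", "z"],
})
numeric_only = select_num_cols(mixed)
```

Dropping the non-numeric columns first means every remaining column can have a mean and variance estimated for it, which is what a Gaussian model needs.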
Since this version works with more than 2 variables, I have avoided
plotting the variables.
I hope these changes are acceptable. If not, feel free to create your
own version.