Uploaded both CSVs and notebook - Team All Stars #6
Team Members:
To pre-process the data, we first removed every row with a NaN value in the lifestyle, diastolic BP, or systolic BP attribute.
After that, we interpolated the missing values in the numerical features and filled the missing values in the categorical features with the mode. Next, we removed the outliers from the data and used one-hot encoding to convert the categorical features into numerical ones.
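The pre-processing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `preprocess`, its parameters, and the z-score outlier rule (threshold 3) are assumptions, since the description does not say how outliers were detected.

```python
import pandas as pd


def preprocess(df, required, categorical, z_thresh=3.0):
    """Hypothetical helper mirroring the described pre-processing steps."""
    # 1. Drop rows with NaN in the columns that must be present.
    df = df.dropna(subset=required).copy()

    # 2. Linearly interpolate gaps in the numeric columns.
    numeric = df.select_dtypes(include="number").columns
    df[numeric] = df[numeric].interpolate(limit_direction="both")

    # 3. Fill gaps in the categorical columns with the most frequent value.
    for col in categorical:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # 4. Remove outliers with a z-score rule (an assumption, see above).
    #    Constant columns give 0/0 = NaN, which is treated as "not an outlier".
    z = (df[numeric] - df[numeric].mean()) / df[numeric].std(ddof=0)
    df = df[(z.fillna(0.0).abs() <= z_thresh).all(axis=1)]

    # 5. One-hot encode the categorical columns.
    return pd.get_dummies(df, columns=categorical)
```

It could be called as `preprocess(df, required=["lifestyle", "diastolic_bp", "systolic_bp"], categorical=["gender", "lifestyle"])`, where the column names are placeholders for whatever the CSV actually uses.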
Then, using a multivariate linear regression model, we computed the pseudoinverse (the "dagger") of the feature matrix and used the resulting regression to predict the two targets, namely the diastolic and systolic blood pressure.
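As a rough sketch of this regression step, assuming "the dagger" refers to the Moore-Penrose pseudoinverse and that the model is ordinary least squares with two output columns, the weights are W = X† Y. The function names `fit_pinv` and `predict` and the added bias column are illustrative choices, not taken from the notebook.

```python
import numpy as np


def _with_bias(X):
    # Prepend a column of ones so the model can learn an intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])


def fit_pinv(X, Y):
    """Least-squares weights W = pinv([1, X]) @ Y.

    Y may have several columns (e.g. diastolic and systolic BP);
    each column of W then holds the weights for one target.
    """
    return np.linalg.pinv(_with_bias(X)) @ Y


def predict(W, X):
    """Predict all targets at once for the rows of X."""
    return _with_bias(X) @ W
```

Because both targets share the same feature matrix, one pseudoinverse serves both: the two columns of `Y` are fitted in a single matrix product.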
For the classification task, we pre-processed the data by interpolating missing values, removing outliers, and one-hot encoding gender into numerical columns.
The data was then split into training and test sets, and we trained a multivariate Bayes classifier on the training set.
After that, we evaluated the model on the held-out test split and computed its accuracy, which was 76%.
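The train-and-evaluate step can be sketched as below. This assumes "multivariate Bayes classifier" means one full-covariance multivariate Gaussian per class combined with class priors; the class name `GaussianBayes`, the helper `accuracy`, and the small ridge term added to each covariance are assumptions made for illustration, not the notebook's code.

```python
import numpy as np


class GaussianBayes:
    """Bayes classifier with one multivariate Gaussian per class (assumed form)."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            # Small ridge keeps the covariance invertible (an assumption).
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            log_det = np.linalg.slogdet(cov)[1]
            log_prior = np.log(len(Xc) / len(X))
            self.params_.append((mean, np.linalg.inv(cov), log_det, log_prior))
        return self

    def predict(self, X):
        scores = []
        for mean, cov_inv, log_det, log_prior in self.params_:
            d = X - mean
            # Squared Mahalanobis distance of every row to the class mean.
            maha = np.einsum("ij,jk,ik->i", d, cov_inv, d)
            # Log posterior up to a constant shared by all classes.
            scores.append(log_prior - 0.5 * (log_det + maha))
        return self.classes_[np.argmax(np.stack(scores, axis=1), axis=1)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))
```

With a split such as `X_train, y_train, X_test, y_test`, the evaluation would be `accuracy(y_test, GaussianBayes().fit(X_train, y_train).predict(X_test))`.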
Finally, we applied the trained model to the provided test file.