- HTML
- CSS
- JAVASCRIPT
- PYTHON
- DJANGO
- MACHINE LEARNING
- Gathering the Data: Data preparation is the first step for any machine learning problem. We will be using a dataset from Kaggle for this problem. This dataset consists of two CSV files, one for training and one for testing.
- Cleaning the Data: Cleaning is the most important step in a machine learning project. The quality of the data determines the quality of the model, so the data must always be cleaned before it is fed to the model for training.
- Model Building: Once the data has been gathered and cleaned, it can be used to train a machine learning model. We will use the cleaned data to train a Support Vector Classifier, a Naive Bayes Classifier, and a Random Forest Classifier.
- Inference: After training the three models, we predict the disease for the input symptoms by combining the predictions of all three models. Combining them tends to make the overall prediction more robust than relying on any single classifier.
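The README does not say how the three predictions are combined. A minimal sketch of one common choice, a majority vote, is shown here. The function name `combine_predictions`, the dictionary keys, and the disease labels are hypothetical placeholders, not names taken from this project.

```python
from collections import Counter


def combine_predictions(predictions):
    """Return the label predicted by the most models.

    If every model disagrees, the vote is tied and the label from the
    earliest model in the list is returned (Counter keeps first-seen order).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical outputs of the three trained classifiers for one set of
# input symptoms; in the project these would come from each model's predict().
model_outputs = {
    "svm": "Fungal infection",
    "naive_bayes": "Fungal infection",
    "random_forest": "Allergy",
}

final_prediction = combine_predictions(list(model_outputs.values()))
print(final_prediction)
```

With three models, a majority vote only changes the result when at least two classifiers agree, so the order of the list matters only in the all-disagree case.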
|_ dataset/
    |_ training_data.csv
    |_ test_data.csv
|_ saved_model/
    |_ [ pre-trained models ]
|_ main.py [ code for loading Kaggle dataset, training & saving the model ]
|_ notebook/
    |_ dataset/
        |_ raw_data.xlsx [ Columbia dataset for notebook ]
    |_ Disease-Prediction-from-Symptoms-checkpoint.ipynb [ IPython Notebook for loading Columbia dataset, training model and inference ]
Please make sure to install all dependencies before running the demo, using the following:
pip install -r requirements.txt
- Download the zip file.
- Open any Python file and look for the path 'C:\Users\Admin\Desktop\HealthDesease\
- Unzip the zip file and copy the healthprediction folder to the clipboard.
- Create directories that match the path pattern above.
- Paste the healthprediction folder into the last directory, named projects.
Python, machine learning libraries, DB Browser, and Anaconda, PyCharm, or VS Code
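numpy, pandas, seaborn (subprocess is part of Python's standard library and pip is the installer itself, so neither needs a separate install)
Install these libraries from the terminal.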
Look for the file heath.py and run it.
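Before running the project, it can help to confirm that the third-party libraries listed above are importable. The following is a small sketch, not part of the project itself; the helper name `missing_packages` and the `REQUIRED` list are assumptions made for illustration.

```python
import importlib.util

# Third-party libraries named in this README (assumed to be the full list).
REQUIRED = ["numpy", "pandas", "seaborn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Any names reported as missing can then be installed with pip from the terminal.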