This repository contains our final project for CS 207: a Naive Bayes Classifier, a KNN Classifier, and a Simple Linear Regressor written in C and applied to several data science problems.
For this project, we wrote a KNN Classifier, a Naive Bayes Classifier, and a Simple Linear Regressor from scratch in C. We used Python to preprocess and clean several datasets, then ran classification and regression problems on them.
- Writing our own classifiers in C.
- Working with datasets of various sizes.
- Understanding data science techniques.
- Using Python to clean datasets.
- Using Python libraries to tackle more challenging data science problems.
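To illustrate what writing a classifier from scratch in C can involve, here is a minimal KNN sketch. It is not the project's actual code: the names `knn_predict`, `NUM_FEATURES`, `NUM_CLASSES`, and `MAX_K`, the flat row-major data layout, and the integer class labels are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes chosen for illustration only. */
#define NUM_FEATURES 4   /* numeric attributes per row */
#define NUM_CLASSES  3   /* labels are integers 0..NUM_CLASSES-1 */
#define MAX_K        32  /* upper bound on k so buffers can live on the stack */

/* Squared Euclidean distance between two feature vectors. */
static double squared_distance(const double *a, const double *b)
{
    double sum = 0.0;
    for (size_t i = 0; i < NUM_FEATURES; i++) {
        double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

/*
 * Predict the label of `query` by majority vote among its k nearest
 * training rows. `train` holds n rows stored back to back (row-major),
 * and `labels[i]` is the class of row i. Ties go to the lowest class index.
 */
int knn_predict(const double *train, const int *labels, size_t n,
                const double *query, size_t k)
{
    assert(k > 0 && k <= n && k <= MAX_K);

    double best_dist[MAX_K];  /* k smallest distances so far, ascending */
    int best_label[MAX_K];    /* labels matching best_dist */
    size_t found = 0;

    for (size_t i = 0; i < n; i++) {
        double d = squared_distance(train + i * NUM_FEATURES, query);

        /* Find where d belongs in the sorted list of neighbours. */
        size_t pos = found;
        while (pos > 0 && best_dist[pos - 1] > d)
            pos--;
        if (pos >= k)
            continue;  /* farther than all k current neighbours */

        /* Shift worse neighbours down, dropping the last one if full. */
        size_t last = (found < k) ? found : k - 1;
        for (size_t j = last; j > pos; j--) {
            best_dist[j] = best_dist[j - 1];
            best_label[j] = best_label[j - 1];
        }
        best_dist[pos] = d;
        best_label[pos] = labels[i];
        if (found < k)
            found++;
    }

    /* Count votes per class and return the most frequent one. */
    int votes[NUM_CLASSES] = {0};
    for (size_t j = 0; j < found; j++) {
        assert(best_label[j] >= 0 && best_label[j] < NUM_CLASSES);
        votes[best_label[j]]++;
    }
    int winner = 0;
    for (int c = 1; c < NUM_CLASSES; c++)
        if (votes[c] > votes[winner])
            winner = c;
    return winner;
}
```

A call such as `knn_predict(train, labels, n, query, 3)` would classify one row using its three nearest neighbours. Squared distances are used because they preserve the neighbour ordering and avoid a square root.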
IRIS Dataset
This dataset has 150 rows and 4 columns.
Problem: Predict the class of the flower based on available attributes.
Pima Indian Diabetes Dataset
Problem: Predict whether the person has type 2 diabetes.
Loan Prediction Dataset
This dataset has 615 rows and 13 columns.
Problem: Predict whether a loan will be approved.
Turkiye Student Evaluation Dataset
This dataset has 5820 rows and 33 columns.
Problem: Predict the final grade based on the answers to all other questions.
Black Friday Dataset
This dataset has 550,069 rows and 12 columns.
Problem: Predict purchase amount.
Trip History Dataset
This dataset has roughly 220,000 (2.2 lakh) rows.
Problem: Identify the User Type.
Digit Identifier Dataset
This dataset has pixel values for around 50,000 images, each 28 × 28 pixels.
Problem: Identify digits from pixel values.
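For the numeric prediction problems above, such as predicting a purchase amount, a simple linear regressor fits a line y = slope * x + intercept to a single input feature by ordinary least squares. The sketch below shows one way this could look in C. It is an illustration under stated assumptions, not the project's implementation: the `LinearModel` struct and the function names `fit_simple_regression` and `predict_simple_regression` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model type: a line y = slope * x + intercept. */
typedef struct {
    double slope;
    double intercept;
} LinearModel;

/*
 * Fit a line to n (x, y) pairs by ordinary least squares:
 *   slope     = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
 *   intercept = mean_y - slope * mean_x
 */
LinearModel fit_simple_regression(const double *x, const double *y, size_t n)
{
    assert(n >= 2);

    /* First pass: means of x and y. */
    double mean_x = 0.0, mean_y = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= (double)n;
    mean_y /= (double)n;

    /* Second pass: covariance and variance terms (unnormalised). */
    double sxy = 0.0, sxx = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dx = x[i] - mean_x;
        sxy += dx * (y[i] - mean_y);
        sxx += dx * dx;
    }
    assert(sxx > 0.0);  /* x must not be constant, or the slope is undefined */

    LinearModel model;
    model.slope = sxy / sxx;
    model.intercept = mean_y - model.slope * mean_x;
    return model;
}

/* Evaluate the fitted line at a new input value. */
double predict_simple_regression(const LinearModel *model, double x)
{
    return model->slope * x + model->intercept;
}
```

Computing the means first and then summing centred products takes two passes over the data, but it tends to be numerically steadier than the single-pass formula built from raw sums of x, y, xy, and x squared.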