In this project, I walk through solving a complete machine learning problem on a real-world dataset, then deploy the resulting model as a web application with Flask.
This is a supervised regression task: given a set of data with targets (in this case the price), the goal is to train a model that predicts the target from the features.
- Supervised problem: we are given both the features and the target
- Regression problem: the target is a continuous variable
The data consists of scraped used car listings: 100,000 listings, separated into files by car manufacturer (13 CSV files in total).
The cleaned data set contains information on price, transmission, mileage, fuel type, road tax, miles per gallon (mpg), and engine size (in liters).
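
Because the listings are split across one CSV file per manufacturer, a first step is usually to merge them into a single table while remembering which file each row came from. The following is a minimal sketch using only the standard library. The function name `load_listings`, the `manufacturer` key, the file names `audi.csv` and `bmw.csv`, and the column headers are all assumptions made for illustration, and the sample rows are made up.

```python
import csv
import tempfile
from pathlib import Path


def load_listings(data_dir):
    """Read every CSV in data_dir into one list of row dicts.

    The file stem (e.g. "audi" from "audi.csv") is stored under a new
    "manufacturer" key so the source file is not lost after merging.
    """
    rows = []
    for csv_path in sorted(Path(data_dir).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["manufacturer"] = csv_path.stem
                rows.append(row)
    return rows


# Tiny demo: two made-up files stand in for the real per-manufacturer files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "audi.csv").write_text(
        "price,mileage\n12500,15735\n", encoding="utf-8"
    )
    Path(tmp, "bmw.csv").write_text(
        "price,mileage\n11200,67068\n16000,14827\n", encoding="utf-8"
    )
    listings = load_listings(tmp)

print(len(listings), sorted({row["manufacturer"] for row in listings}))
```

Values read this way are strings, so numeric columns such as price and mileage still need to be converted during data cleaning.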
- Data cleaning and formatting
- Exploratory data analysis
- Data visualization
- Feature engineering and selection
- Establish a baseline and compare several machine learning models on a performance metric
- Perform hyperparameter tuning on the best model to optimize it for the problem
- Evaluate the best model on the testing set
- Deploy the model as a web application with Flask
- Video tutorial on YouTube (Link: https://youtu.be/OrtbcW8dS4k)
- Submit the Kaggle notebook (Link: https://www.kaggle.com/code/mohamedsalama1429/100-000-uk-cars-ml-price-prediction-96-score)
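
The "establish a baseline and compare models" step in the list above can be illustrated with a small sketch. It is not the notebook's actual code: the data is synthetic, mean absolute error (MAE) is an assumed choice of metric, and a one-feature least-squares line stands in for the real candidate models. The idea is simply that a model is only useful if it beats a naive guess on held-out data.

```python
import random
import statistics


def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic (mileage, price) pairs: illustrative only, not the real data.
random.seed(0)
mileages = [random.uniform(1_000, 100_000) for _ in range(200)]
prices = [25_000 - 0.15 * m + random.gauss(0, 1_000) for m in mileages]

# Hold out the last 25% of the rows as a test set.
split = int(0.75 * len(prices))
train_x, test_x = mileages[:split], mileages[split:]
train_y, test_y = prices[:split], prices[split:]

# Baseline: always predict the median training price.
baseline_guess = statistics.median(train_y)
baseline_mae = mean_absolute_error(test_y, [baseline_guess] * len(test_y))

# Candidate model: a least-squares line fitted on the training set only.
mean_x, mean_y = statistics.fmean(train_x), statistics.fmean(train_y)
slope = sum(
    (x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)
) / sum((x - mean_x) ** 2 for x in train_x)
intercept = mean_y - slope * mean_x
model_mae = mean_absolute_error(
    test_y, [intercept + slope * x for x in test_x]
)

print(f"baseline MAE: {baseline_mae:.0f}, linear model MAE: {model_mae:.0f}")
```

The same pattern carries over to any library model: fit on the training split, score on the test split with the chosen metric, and compare against the baseline number.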