California Housing Prediction

Tools used:

Pandas
NumPy
Seaborn
Matplotlib
Scikit-Learn

This classic project walks through the end-to-end workflow of a machine learning project, following the Hands-On Machine Learning book by Aurélien Géron. We perform data analysis and feature engineering, build data pipelines, and fit several models to select the best ones. From there, we use ensemble methods and evaluate the results to improve accuracy.

The dataset contains housing and demographic information on California districts. The features are a mixture of categorical and numeric values.

The columns are:

longitude
latitude
housing_median_age
total_rooms
total_bedrooms
population
households
median_income
median_house_value
ocean_proximity
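
As a rough illustration of how these columns might feed a data pipeline, the sketch below scales the numeric columns and one-hot encodes `ocean_proximity` with scikit-learn. It is a minimal sketch under assumptions, not the project's actual pipeline: the helper name `build_preprocessor`, the median imputation step, and the tiny sample rows (including the category labels) are made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC_COLUMNS = [
    "longitude", "latitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
]
CATEGORICAL_COLUMNS = ["ocean_proximity"]
TARGET_COLUMN = "median_house_value"


def build_preprocessor() -> ColumnTransformer:
    """Hypothetical helper: impute and scale numerics, one-hot encode categories."""
    numeric_pipeline = Pipeline([
        # Assumption: fill any gaps in numeric columns with the column median.
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric_pipeline, NUMERIC_COLUMNS),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL_COLUMNS),
    ])


# Made-up rows, only to show the shapes involved (not real data).
sample = pd.DataFrame({
    "longitude": [-122.2, -118.3, -117.1, -121.9],
    "latitude": [37.9, 34.1, 32.7, 38.5],
    "housing_median_age": [41.0, 30.0, 15.0, 22.0],
    "total_rooms": [880.0, 2100.0, 1500.0, 3200.0],
    "total_bedrooms": [129.0, np.nan, 300.0, 610.0],
    "population": [322.0, 1200.0, 800.0, 1700.0],
    "households": [126.0, 400.0, 280.0, 590.0],
    "median_income": [8.3, 3.5, 4.1, 2.9],
    "median_house_value": [452600.0, 210000.0, 180000.0, 95000.0],
    "ocean_proximity": ["NEAR BAY", "<1H OCEAN", "NEAR OCEAN", "INLAND"],
})

features = sample.drop(columns=[TARGET_COLUMN])
prepared = build_preprocessor().fit_transform(features)

# 8 scaled numeric columns plus one column per distinct category in the sample.
print(prepared.shape)
```

Wrapping the steps in a `ColumnTransformer` keeps the numeric and categorical handling in one object, so the same transformation can be fitted on training data and reused on new data.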

We experiment with Linear Regression, Random Forest, Support Vector Regression (kernel='linear'), and a linear model trained with Stochastic Gradient Descent. We fine-tune the models using grid search and randomized search to find the best combination of hyperparameters and features, and achieve 95% accuracy using ensemble methods.
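
The model comparison and tuning described above could be sketched roughly as follows. This is an illustrative example under assumptions rather than the project's code: it uses synthetic data from `make_regression` in place of the housing dataset, and the candidate settings, the parameter grid, and the RMSE scoring choice are placeholders picked for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the prepared housing features (assumption).
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=42)

candidates = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=30, random_state=42),
    # SVR and SGD are sensitive to feature scale, so scale inside the pipeline.
    "svr_linear": make_pipeline(StandardScaler(), SVR(kernel="linear")),
    "sgd_linear": make_pipeline(StandardScaler(), SGDRegressor(random_state=42)),
}

# Compare candidates with 3-fold cross-validated RMSE (lower is better).
rmse_by_model = {}
for name, model in candidates.items():
    scores = cross_val_score(
        model, X, y, scoring="neg_root_mean_squared_error", cv=3
    )
    rmse_by_model[name] = -scores.mean()
    print(f"{name}: RMSE = {rmse_by_model[name]:.2f}")

# Tune one candidate with an exhaustive grid search over a small, made-up grid.
param_grid = {"n_estimators": [10, 30], "max_features": [2, 4, 8]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=3,
)
search.fit(X, y)
print("best parameters:", search.best_params_)
```

`RandomizedSearchCV` follows the same pattern but takes `param_distributions` and an `n_iter` budget, which tends to be more practical when the search space is large.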
