This project walks through the process of using regression to predict historical housing prices in California.

lauxpaux/Cali_HousingPrediction

California Housing Prediction

Tools used:

  • Pandas
  • NumPy
  • Seaborn
  • Matplotlib
  • scikit-learn

This classic project walks through an end-to-end machine learning workflow, following the book Hands-On Machine Learning by Aurélien Géron. We perform data analysis and feature engineering, build data pipelines, and fit several models to select the best ones. From there, we use ensemble methods and run evaluations to improve accuracy.

The dataset contains housing and demographic information on California districts. The features are a mixture of categorical and numeric values.
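As a quick illustration of that mix, the sketch below separates numeric from categorical columns with pandas. The three-row frame is a made-up sample that uses column names from this dataset, not the real data, and the variable names are only for illustration.

```python
import pandas as pd

# Hypothetical three-row sample; the values are invented for illustration.
housing = pd.DataFrame({
    "median_income": [8.3, 7.2, 5.6],
    "total_rooms": [880.0, 7099.0, 1467.0],
    "ocean_proximity": ["NEAR BAY", "INLAND", "NEAR BAY"],
})

# Numeric columns can be imputed and scaled; the rest need encoding.
numeric_cols = housing.select_dtypes(include="number").columns.tolist()
categorical_cols = housing.select_dtypes(exclude="number").columns.tolist()

print("numeric:", numeric_cols)
print("categorical:", categorical_cols)
```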


The columns are:

  • longitude
  • latitude
  • housing_median_age
  • total_rooms
  • total_bedrooms
  • population
  • households
  • median_income
  • median_house_value
  • ocean_proximity
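One way to prepare these columns for modeling is a scikit-learn `ColumnTransformer`. The sketch below is a minimal example under stated assumptions, not necessarily the pipeline used in this repository: it assumes median imputation and standard scaling for the numeric columns, one-hot encoding for `ocean_proximity`, and `median_house_value` held out as the target. The helper name `build_preprocessor` and the synthetic rows are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Feature columns from the list above; median_house_value is the target
# and is therefore left out of the preprocessing step.
NUMERIC_COLS = [
    "longitude", "latitude", "housing_median_age", "total_rooms",
    "total_bedrooms", "population", "households", "median_income",
]
CATEGORICAL_COLS = ["ocean_proximity"]


def build_preprocessor():
    """Assumed preprocessing: impute + scale numerics, one-hot the category."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill missing values
        ("scale", StandardScaler()),                   # zero mean, unit variance
    ])
    return ColumnTransformer(
        [
            ("num", numeric, NUMERIC_COLS),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL_COLS),
        ],
        sparse_threshold=0,  # always return a dense array
    )


# Synthetic rows for demonstration only; not taken from the real dataset.
rng = np.random.default_rng(0)
n = 6
sample = pd.DataFrame({
    "longitude": rng.uniform(-124, -114, n),
    "latitude": rng.uniform(32, 42, n),
    "housing_median_age": rng.integers(1, 52, n).astype(float),
    "total_rooms": rng.integers(100, 5000, n).astype(float),
    "total_bedrooms": [np.nan, 200.0, 350.0, 120.0, np.nan, 410.0],
    "population": rng.integers(100, 3000, n).astype(float),
    "households": rng.integers(50, 1000, n).astype(float),
    "median_income": rng.uniform(0.5, 15, n),
    "ocean_proximity": ["INLAND", "NEAR BAY", "INLAND",
                        "NEAR OCEAN", "INLAND", "NEAR BAY"],
})

prepared = build_preprocessor().fit_transform(sample)
print(prepared.shape)  # 8 numeric columns plus one column per category seen
```

Keeping the imputer and scaler inside the pipeline means their statistics are learned only from the data passed to `fit`, which avoids leaking information from a test set.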


We experiment with Linear Regression, Random Forest, Support Vector Regression (kernel='linear'), and linear Stochastic Gradient Descent. We fine-tune the models with grid search and randomized search to find the best combination of features, and achieve 95% accuracy using ensemble methods.
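The comparison and tuning steps can be sketched as follows. This is a minimal example on synthetic data, not the repository's actual code: the data, the RMSE scoring choice, the fold counts, and the parameter grid are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic regression problem standing in for the prepared housing data.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# The four model families named above; SVR and SGD get a scaler because
# both are sensitive to feature scale.
models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=30, random_state=42),
    "svr_linear": make_pipeline(StandardScaler(), SVR(kernel="linear")),
    "sgd": make_pipeline(StandardScaler(), SGDRegressor(random_state=42)),
}

# Compare models by cross-validated RMSE (lower is better).
rmse = {}
for name, model in models.items():
    scores = cross_val_score(
        model, X, y, cv=5, scoring="neg_root_mean_squared_error"
    )
    rmse[name] = -scores.mean()
    print(f"{name}: RMSE = {rmse[name]:.3f}")

# Tune the random forest over a small, hypothetical grid.
param_grid = {"n_estimators": [10, 30], "max_features": [2, 4]}
search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)
print("best params:", search.best_params_)
```

`RandomizedSearchCV` follows the same pattern, taking `param_distributions` and `n_iter` in place of `param_grid`, which is useful when the grid is too large to search exhaustively.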
