Time Series Forecasting with SARIMAX and XGBoost: Chennai House Price Prediction and Forecasting

Project Overview:

1. Data Preprocessing

The raw dataset is first cleaned and preprocessed. This includes:

Filling missing values with an imputation method suited to each column.

Correcting misspellings in categorical data (e.g., fixing incorrect values in the SALE_COND column).

Dropping irrelevant columns and handling duplicates.

Filling missing dates and dealing with extra records using custom logic.

Removing outliers and handling the remaining missing data using forward fill.
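The cleaning steps above can be sketched with pandas. This is a minimal illustration, not the notebook's actual code: the column names DATE_SALE and SALES_PRICE, the spelling corrections in SALE_COND_FIXES, the 1.5 x IQR outlier rule, and the monthly aggregation are all assumptions made for the example. Only SALE_COND is named in this README.

```python
import pandas as pd

# Hypothetical corrections; the real mapping depends on the values found in the data.
SALE_COND_FIXES = {
    "Ab Normal": "AbNormal",
    "Partiall": "Partial",
    "PartiaLl": "Partial",
    "Adj Land": "AdjLand",
}


def clean_sales(df, date_col="DATE_SALE", price_col="SALES_PRICE"):
    """Fix category spellings, drop duplicate rows, and filter price outliers."""
    out = df.copy()
    out["SALE_COND"] = out["SALE_COND"].str.strip().replace(SALE_COND_FIXES)
    out = out.drop_duplicates()
    out[date_col] = pd.to_datetime(out[date_col])

    # Assumed outlier rule: keep prices within 1.5 * IQR of the quartiles.
    q1, q3 = out[price_col].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out[price_col].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range | out[price_col].isna()]


def monthly_price_series(df, date_col="DATE_SALE", price_col="SALES_PRICE"):
    """Average price per month; months without sales are forward filled."""
    monthly = df.set_index(date_col)[price_col].resample("MS").mean()
    return monthly.ffill()
```

Resampling to month start ("MS") creates a row for every month between the first and last sale, so gaps in the calendar show up as missing values that the forward fill can then close.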

2. Feature Engineering

Key features are extracted to enhance model performance:

Encoding categorical features like SALE_COND, AREA, etc.

Aggregating area-based rankings.

Deriving additional features relevant to the time series and predictive models.
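One possible reading of the encoding and area-ranking steps is sketched below. The SALES_PRICE column, the AREA_RANK output column, and the choice to rank areas by mean price are assumptions for illustration; the notebook may rank or encode differently.

```python
import pandas as pd


def add_area_rank(df, area_col="AREA", price_col="SALES_PRICE"):
    """Add AREA_RANK: areas ordered by mean price, 1 = lowest mean price."""
    area_mean = df.groupby(area_col)[price_col].mean()
    rank = area_mean.rank(method="dense").astype(int)
    out = df.copy()
    out["AREA_RANK"] = out[area_col].map(rank)
    return out


def encode_categoricals(df, columns=("SALE_COND",)):
    """One-hot encode the given categorical columns as 0/1 integers."""
    return pd.get_dummies(df, columns=list(columns), dtype=int)
```

Because the rank is derived from the target price, it would normally be computed on the training portion only and then mapped onto later data, to avoid leaking future prices into the features.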

3. Time Series Forecasting (SARIMAX)

The SARIMAX (Seasonal ARIMA with eXogenous variables) model is used for time series forecasting. Key steps:

Hyperparameter tuning of the non-seasonal orders (p, d, q) and the seasonal orders (P, D, Q, S).

Residual analysis and error metrics for evaluating the model's performance.

Forecasting future trends based on historical data.

4. XGBoost Model for Prediction

The XGBoost model is trained on temporal and non-temporal features:

Cross-validation using TimeSeriesSplit.

Using the SARIMAX forecast as one of the input features for prediction.

Calculation of performance metrics such as R² and Mean Absolute Error (MAE).
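The cross-validation and metric steps above can be sketched as follows. The helper time_series_cv and its model_factory argument are hypothetical names introduced for this example. The estimator is passed in as a factory so that any regressor with the scikit-learn fit/predict interface can be plugged in, including xgboost.XGBRegressor; the hyperparameters shown in the comment are placeholders, not the project's tuned values.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import TimeSeriesSplit


def time_series_cv(model_factory, X, y, n_splits=5):
    """Evaluate a regressor on expanding-window splits; return MAE and R² per fold.

    X and y must be NumPy arrays with rows sorted chronologically. One column
    of X could hold the SARIMAX forecast for the corresponding period.
    """
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        model = model_factory()  # fresh, unfitted model for every fold
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        scores.append({
            "mae": mean_absolute_error(y[test_idx], pred),
            "r2": r2_score(y[test_idx], pred),
        })
    return scores


# Example factory (requires the xgboost package; values are placeholders):
# from xgboost import XGBRegressor
# model_factory = lambda: XGBRegressor(n_estimators=300, learning_rate=0.05)
```

TimeSeriesSplit always trains on earlier rows and tests on later ones, which is why it is preferred here over shuffled k-fold splits that would let the model train on future sales.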

Installation and Setup:

Clone the repository:

git clone

Install dependencies:

pip install -r requirements.txt

Place the raw dataset in the data/ folder as raw_data.csv.

Run the Jupyter notebooks sequentially to understand each step of the pipeline:

Start with 1_data_preprocessing.ipynb for cleaning the raw dataset.

Proceed with 2_feature_engineering.ipynb to extract features.

Continue to 3_time_series_forecasting.ipynb to train the SARIMAX model.

Finish with 4_xgboost_model_training.ipynb to train and evaluate the XGBoost model.

Future Enhancements

Web Application: A user interface (UI) can be built for real-time forecasting and predictions using the trained models.

Azure Integration: Deployment of the models using Azure services to enable scalable and accessible forecasting.

Ramkmr/chennai-house-price-prediction-and-forecasting