Afnan-214/DataMining_CustomerBehavior

Group project: Store data mining to detect customer behavior

Dataset description

The dataset contains data about 150 stores of the FOODmart supermarket chain, one of the leading grocery chains in Australia.
It describes each store in terms of sales, gross profit, location, holidays, home delivery, parking space, number of staff, managers' information, and the cost of the basket of food items.

Target

  1. Analyse the nature of the stores in this supermarket chain.
  2. Help the company make informed decisions about changes to its stores.
  3. Identify customer preferences by analysing the features that improve store sales.

Project phases

1. Pre-Processing

Data transformations, feature selection, and outlier handling.
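The outlier-handling step can be illustrated with a small sketch. The README does not say which rule the project used, so the interquartile-range (IQR) fences below are an assumption, and the sales figures are made up for illustration.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip_outliers(values, k=1.5):
    """Clip each value to the IQR fences instead of dropping the row."""
    lower, upper = iqr_bounds(values, k)
    return [min(max(v, lower), upper) for v in values]


# Hypothetical sales figures in $m; 40.0 is an obvious outlier.
sales = [7.2, 8.1, 9.4, 10.0, 10.5, 11.3, 12.0, 40.0]
cleaned = clip_outliers(sales)
print(cleaned)
```

Clipping keeps all 150 stores in the data, which matters for a dataset this small; dropping the rows is the other common choice.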

2. Exploratory analysis

Data visualization and summary statistics of the variables.

3. Machine learning Algorithms

  • Regression

    Implement Multiple Linear Regression
    Implement Simple Linear Regression
    Plot the fitted line and errors of each model.
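As a rough illustration of the simple linear regression step, the sketch below fits a least-squares line and computes the errors (residuals) that would be plotted. It is a library-free sketch, not the project's code; the staff and sales values are hypothetical.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Observed value minus fitted value for every data point."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical data: number of staff vs. sales in $m.
staff = [10, 20, 30, 40, 50]
sales = [6.0, 8.5, 10.0, 12.5, 14.0]

slope, intercept = fit_simple_linear(staff, sales)
errors = residuals(staff, sales, slope, intercept)
print(f"sales = {intercept:.2f} + {slope:.2f} * staff")
```

The fitted line is drawn from `intercept + slope * x`, and `errors` gives the vertical distance of each store from that line.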

  • Clustering (My Implementation)

    Apply the elbow method, visualized with the yellowbrick library, to find the elbow point for the value of k in k-means.
    Use PCA to reduce dimensionality.
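The idea behind the elbow method can be sketched without any plotting library: run k-means for several values of k and record the inertia (within-cluster sum of squared distances). The k where the curve stops dropping sharply is the elbow. The code below is a minimal pure-Python sketch with made-up 2-D points and a deterministic farthest-point initialization, chosen here only to make the example reproducible; it is not the project's implementation.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def init_centers(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers


def kmeans_inertia(points, k, iterations=50):
    """Run Lloyd's algorithm and return the final inertia."""
    centers = init_centers(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


# Hypothetical, already-scaled store features forming three groups.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
    (9.0, 1.0), (9.1, 1.2), (8.9, 0.9),
]

inertias = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k, value in inertias.items():
    print(k, round(value, 3))
```

On this toy data the inertia falls steeply up to k = 3 and only marginally afterwards, which is the pattern an elbow plot makes visible.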

  • Neural Network

    Multi-Layer Perceptron Neural Network
    Make predictions on new data to obtain the predicted sales.
    Inspect the importance scores to gauge the effect each feature has on sales.

  • Association Rule

    Detect relations between features, such as: Sales $m and Number of Staff, Sales $m and Car Spaces, Female managers and Advertising.
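To show how such a relation is usually quantified, the sketch below computes support, confidence, and lift for one rule. It assumes the numeric features were first binned into labels such as `staff=high`; those labels and the eight example stores are hypothetical, and the README does not state which mining algorithm or thresholds the project used.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    sup_a = support(transactions, antecedent)
    sup_c = support(transactions, consequent)
    sup_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = sup_ac / sup_a
    lift = confidence / sup_c
    return sup_ac, confidence, lift


# Hypothetical stores, each described by binned feature labels.
stores = [
    {"staff=high", "sales=high"},
    {"staff=high", "sales=high"},
    {"staff=high", "sales=high"},
    {"staff=high", "sales=low"},
    {"staff=low", "sales=high"},
    {"staff=low", "sales=low"},
    {"staff=low", "sales=low"},
    {"staff=low", "sales=low"},
]

sup, conf, lift = rule_metrics(stores, {"staff=high"}, {"sales=high"})
print(f"support={sup:.3f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means the two labels occur together more often than they would if they were independent.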

4. Insights

1- From Regression:

  1. Advertising expenses have the strongest positive impact on sales compared with "Wages $m", "Competitors", and "Basket:2014".
  2. The number of effective staff has a high positive impact on sales: each additional unit increases sales by about $886,800.
  3. The available car spaces also have a high positive impact on sales: each additional unit increases sales by about $646,300.

2- From clustering:

  1. Cluster 0 has the highest gross profit. Its features, such as offering home delivery services, can point to ways of improving the other clusters.
  2. Stores assigned to cluster 1 have high feature values, and the company can apply their strategies to improve the other stores.
  3. Cluster 2 stores have the lowest feature values and need more attention from the company's managers. The company can improve some of their features and replace their managers with more experienced ones.

3- From Neural Network:

  1. The neural network performed reasonably well, with relatively low MSE, RMSE, MAE, and MAPE and a high explained variance.
  2. The model can be used to predict the sales of a store with specific characteristics and help the supermarket chain make informed decisions about opening a new store.
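For reference, the error measures named above can be computed as in the following sketch. The true and predicted sales values are invented for illustration and are not the project's results; explained variance is taken here as 1 - Var(y - y_pred) / Var(y).

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, MAPE (in percent) and explained variance."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_err = sum(errors) / n
    var_err = sum((e - mean_err) ** 2 for e in errors) / n
    mean_true = sum(y_true) / n
    var_true = sum((t - mean_true) ** 2 for t in y_true) / n

    return {
        "mse": mse,
        "rmse": sqrt(mse),
        "mae": mae,
        "mape": mape,
        "explained_variance": 1 - var_err / var_true,
    }


# Hypothetical sales in $m: actual vs. predicted.
y_true = [10.0, 12.0, 8.0, 15.0]
y_pred = [9.0, 13.0, 8.0, 14.0]

metrics = regression_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```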

4- From Association Rule:

  1. Sales and the number of effective staff are strongly related and influence each other.
  2. Sales and the number of available parking spaces are strongly related.
  3. Stores with female managers are more likely to have low advertising expenses.
  4. Stores with a high number of competing supermarkets are expected to have low profit.

Customers prefer stores with:
  1. Large parking spaces
  2. A high number of full-time staff
  3. Low prices for main basket items
  4. Home delivery
  5. Sunday opening
