Our primary goal in this project is to gain a comprehensive understanding of the dynamics that drive Supermarket Sales. We're diving deep into product performances, sales trends, and customer behaviors by using MySQL Workbench. By understanding these dynamics, we aim to empower decision-makers with actionable intelligence. The dataset was obtained from the Kaggle Supermarket Sales
By using Exploratory Data Analysis (EDA), the major goal of this project is to analyze the data through statistical measures and visualizations. This helps us turn raw data into practical knowledge.
The data was obtained from the Kaggle Supermarket Sales This dataset was recorded in 3 different branches for 3 months of data. The data consists of 1000 rows and 17 columns.
Column | Desription | Data Type |
---|---|---|
Invoice_ID | Invoice Identification Number | VARCHAR(30) |
Branch | Branch of supermarket (Branch A, B, and C) | VARCHAR(5) |
City | Location of supermarket | VARCHAR(5) |
Customer_type | Type of customers (With members card or not) | VARCHAR(30) |
Gender | Gender of the customer | VARCHAR(10) |
Product_line | General item categorization groups | VARCHAR(100) |
Unit_price | Price of each product in dollars | DECIMAL(10,2) |
Quantity | Number of products purchased | INT |
Tax | 5% Tax fee | FLOAT(6,4) |
Total | Total price including tax | DECIMAL(10,2) |
Date | Date of purchase | DATE |
Time | Time of purchase | TIMESTAMP |
Payment | Payment used for purchase | TEXT |
cogs | Cost of goods sold | DECIMAL(10,2) |
gross_margin_percentage | Gross margin percentage | FLOAT(11,9) |
gross_income | Gross income | DECIMAL(10,2) |
Rating | Customer rating of their shopping experience | FLOAT(2,1) |
- Data Wrangling: Data inspection is conducted to identify and address NULL values. Various data replacement methods are employed to substitute or fill in the missing or NULL values.
- Exploratory Data Analysis (EDA): Exploratory data analysis is conducted to answer the listed questions and the goal of this project.
- Conclusion
- What is the most popular payment method used by customers?
- Which branch is the most profitable?
- Which product category generates the highest revenue?
- What are the spending patterns of females and males, and in which categories do they spend the most money?
- How many products are purchased by customers?
- Which product category should be the focus of the supermarket?
- In which city branch should expansion be considered, and which products should be emphasized?
For the rest of the code check the Query_Sales
-- Create table
CREATE TABLE IF NOT EXISTS sales(
Invoice_ID VARCHAR(30) NOT NULL PRIMARY KEY,
Branch VARCHAR(5) NOT NULL,
City VARCHAR(30) NOT NULL,
Customer_type VARCHAR(30) NOT NULL,
Gender VARCHAR(30) NOT NULL,
Product_line VARCHAR(100) NOT NULL,
Unit_price DECIMAL(10,2) NOT NULL,
Quantity INT NOT NULL,
Tax FLOAT(6,4) NOT NULL,
Total DECIMAL(12, 4) NOT NULL,
Purchase_Date DATETIME NOT NULL,
Purchase_Time TIME NOT NULL,
Payment VARCHAR(15) NOT NULL,
cogs DECIMAL(10,2) NOT NULL,
gross_margin_percentage FLOAT(11,9),
gross_income DECIMAL(12, 4),
Rating FLOAT(2, 1)
);
The favored payment method is E-wallet, with a notable preference for cash transactions as well. The dataset consists of three cities/branches where Naypyidaw's Branch C emerges as the most financially rewarding. On average, 'Health and Beauty' generates the highest gross revenue. Women spend the most on 'Fashion Accessories,' while for men, it is 'Health and Beauty.' Women also spend more on 'Sports and Travel.' Most customers purchase around 10 products. Despite the high ratings for 'Fashion Accessories' and 'Food and Beverages,' the quantity purchased is low. Therefore, the supply of these products needs to be increased. 'Fashion Accessories' and 'Food and Beverages' are the best-selling products in Naypyidaw, and these products should be focused on along with 'Electronic Accessories'.