Project Title: Retail Sales Analysis
Database: p1_retail_db
This project demonstrates my ability to analyze retail sales data using SQL. It involves creating a database, cleaning data, conducting exploratory data analysis, and answering business questions to extract actionable insights. This project emphasizes practical SQL applications for retail business analysis.
- Database Setup: Create and populate the retail sales database.
- Data Cleaning: Address missing or inconsistent data.
- Exploratory Data Analysis: Gain insights through structured queries.
- Business Insights: Use SQL to answer key business questions and derive insights into sales and customer behavior.
- Database Creation: Created a database named
p1_retail_db. - Table Structure: Designed a
retail_salestable with attributes for transactions, sales details, customer demographics, and product information.
CREATE DATABASE p1_retail_db;
CREATE TABLE retail_sales
(
transactions_id INT PRIMARY KEY,
sale_date DATE,
sale_time TIME,
customer_id INT,
gender VARCHAR(10),
age INT,
category VARCHAR(35),
quantity INT,
price_per_unit FLOAT,
cogs FLOAT,
total_sale FLOAT
);Performed initial exploration and data cleaning:
-
Count Records:
SELECT COUNT(*) FROM retail_sales;
-
Unique Customers:
SELECT COUNT(DISTINCT customer_id) FROM retail_sales;
-
Product Categories:
SELECT DISTINCT category FROM retail_sales;
-
Handle Missing Data:
DELETE FROM retail_sales WHERE sale_date IS NULL OR sale_time IS NULL OR customer_id IS NULL OR gender IS NULL OR age IS NULL OR category IS NULL OR quantity IS NULL OR price_per_unit IS NULL OR cogs IS NULL;
Below are the SQL queries addressing specific business questions:
-
Sales on Specific Date
Retrieve all sales made on2022-11-05.SELECT * FROM retail_sales WHERE sale_date = '2022-11-05';
-
High-Quantity Sales in Clothing Category
Identify transactions where the category isClothingand the quantity sold exceeds 4 in November 2022.SELECT * FROM retail_sales WHERE category = 'Clothing' AND EXTRACT(YEAR FROM sale_date) = 2022 AND EXTRACT(MONTH FROM sale_date) = 11 AND quantity > 4;
-
Category-Wise Total Sales
Calculate total sales and order count for each category.SELECT category, SUM(total_sale) AS net_sales, COUNT(*) AS total_orders FROM retail_sales GROUP BY category;
-
Customer Insights for Beauty Products
Find the average age of customers purchasing items from theBeautycategory.SELECT ROUND(AVG(age), 2) AS avg_age FROM retail_sales WHERE category = 'Beauty';
-
High-Value Transactions
Retrieve transactions with total sales above 1000.SELECT * FROM retail_sales WHERE total_sale > 1000;
-
Gender and Category Analysis
Total transactions by gender and category.SELECT category, gender, COUNT(*) AS total_transactions FROM retail_sales GROUP BY category, gender ORDER BY category, gender;
-
Top-Selling Month per Year
Determine the best month in terms of average sales for each year.SELECT year, month, avg_sale FROM ( SELECT EXTRACT(YEAR FROM sale_date) AS year, EXTRACT(MONTH FROM sale_date) AS month, AVG(total_sale) AS avg_sale, RANK() OVER (PARTITION BY EXTRACT(YEAR FROM sale_date) ORDER BY AVG(total_sale) DESC) AS rank FROM retail_sales GROUP BY year, month ) ranked_data WHERE rank = 1;
-
Top 5 Customers by Total Sales
Identify the top 5 customers based on total sales.SELECT customer_id, SUM(total_sale) AS total_sales FROM retail_sales GROUP BY customer_id ORDER BY total_sales DESC LIMIT 5;
-
Unique Customer Count per Category
Find the number of unique customers purchasing from each category.SELECT category, COUNT(DISTINCT customer_id) AS unique_customers FROM retail_sales GROUP BY category;
-
Order Shifts Analysis
Analyze the number of orders placed in Morning, Afternoon, and Evening shifts.WITH hourly_sales AS ( SELECT *, CASE WHEN EXTRACT(HOUR FROM sale_time) < 12 THEN 'Morning' WHEN EXTRACT(HOUR FROM sale_time) BETWEEN 12 AND 17 THEN 'Afternoon' ELSE 'Evening' END AS shift FROM retail_sales ) SELECT shift, COUNT(*) AS total_orders FROM hourly_sales GROUP BY shift;
- Sales Trends: Identified peak sales months, high-value transactions, and top-performing categories.
- Customer Behavior: Analyzed customer demographics, spending habits, and purchasing trends.
- Operational Insights: Highlighted sales performance across shifts, aiding operational planning.
This project highlights my application of SQL in real-world retail analysis, showcasing my ability to extract actionable insights from data. It helped me build a solid foundation for advanced analytics and business decision-making.
- Clone Repository: Clone this project from GitHub.
- Set Up Database: Use the
database_setup.sqlscript to create and populate the database. - Run Queries: Execute the provided SQL scripts for analysis and insights.
- Customize: Modify queries as needed to explore additional questions.