This repository contains a structured collection of essential SQL commands and practice queries focused on real-world data engineering scenarios. Each folder covers a core SQL concept, providing hands-on examples to strengthen your query-building skills and understanding of relational databases.
The project is organized into topic-specific directories, including:
- ALTER Queries β Modifying existing table structures
- Aggregation Functions β SUM, COUNT, AVG, etc., with practical sub-queries and CTE examples
- Analytical and Window Functions β Advanced SQL for ranking, running totals, and partitions
- Common Table Expressions (CTEs) β Simplifying complex sub-queries for better readability
- SQL Clauses & JOINS β Mastering WHERE, GROUP BY, INNER JOIN, LEFT JOIN, and more
- Database Constraints (PostgreSQL) β Enforcing data integrity using PRIMARY KEY, UNIQUE, NOT NULL
- DDL Commands β CREATE, ALTER, DROP for managing schema objects
- Data Exploration with SELECT β Retrieving and filtering data effectively
- ER Modelling β Entity-Relationship Diagrams to design and visualize database schemas
- Primary & Foreign Keys β Understanding relational links between tables
- Sub-queries β Writing nested queries for advanced use cases
- Views β Creating virtual tables for reusable queries
- Window Functions β Performing calculations across result sets
- Instacart Market Basket Analysis Project β Applying all learned concepts on real-world data
Each directory contains .sql files with descriptive filenames, showcasing practical examples, common patterns, and edge cases.
SQL-for-data-engineering/
βββ Aggregation-Funs/
βββ Analytical-Functions/
βββ CTE/
βββ Clause/
βββ DB-Constraints-PostgreSQL/
βββ DDL-Commands/
βββ Data-Exploration-with-SELECT-Command/
βββ E-Commerce-Final-Project-(SQL+Python)/
βββ ER-Modelling/
βββ JOINS/
βββ Primary-Foreign-KEYS/
βββ Sub_quries/
βββ Views/
βββ Window-Function/
βββ README.md
- Clone the Repository:
git clone https://github.com/maimran786/SQL-for-data-engineering.git cd SQL-for-data-engineering - Open SQL Files:
- Use any SQL editor or IDE (such as DBeaver, pgAdmin, or VSCode with SQL extensions).
- Set Up a Database:
- Recommended engine: PostgreSQL
- Load sample datasets or create your own to run and test the queries.
This project demonstrates the application of learned SQL concepts on a real-world dataset from Kaggle. It showcases advanced querying, temporary tables, aggregations, joins, and group-based insights.
- Join multiple tables to get product-level insights with department and aisle information
- Analyze total purchases, reorders, and cart frequency by product
- Explore department-level trends including weekday vs weekend behavior
- Identify top 10 most popular aisles with product distribution
- Combine all outputs into a final aggregated result
-
SQL β Used for all major analysis and transformations
-
Python (Jupyter Notebook) β Used for extended analysis and visualization
-
SQL file:
Ecommerce-Data-Analysis.sql -
Python notebook:
Instacart-Market-Basket-Analysis.ipynb(also available in my Python-for-Data-Engineering repo) -
This case study reflects a complete data engineering workflow: from SQL-based data transformation to Python-based insight generation.
-
SQL environment (PostgreSQL recommended)
-
Optional: Sample database (e.g., Northwind, AdventureWorks, or your own schema)
Imran
GitHub Profile