Skip to content

UofT-DSI | SQL - Assignment 2#2

Open
akorade wants to merge 8 commits intomainfrom
assignment-two
Open

UofT-DSI | SQL - Assignment 2#2
akorade wants to merge 8 commits intomainfrom
assignment-two

Conversation

@akorade
Copy link
Copy Markdown
Owner

@akorade akorade commented Nov 14, 2025

What changes are you trying to make? (e.g. Adding or removing code, refactoring existing code, adding reports)

I added a comprehensive Entity-Relationship (ER) diagram for a full bookstore database system, including customers, employees, orders, books, suppliers, inventory transactions, and employee shifts. I also added two SQL schema designs for the CUSTOMER_ADDRESS table, demonstrating both Type 1 (overwrite) and Type 2 (historical) Slowly Changing Dimensions. These changes improve the documentation, clarify the data model, and provide alternative architectures for handling customer addresses.

What did you learn from the changes you have made?

Data Modeling & Design (Section 1)

  • Designed logical models (ERDs) for a bookstore with proper entity relationships
  • Understood Type 1 vs Type 2 slowly changing dimensions:
    • Type 1: Overwrites old data (single current record per customer)
    • Type 2: Retains history (multiple records with validity dates and flags)

Advanced SQL Techniques (Sections 2 & 3)

String Manipulation:

  • Used INSTR(), SUBSTR(), and TRIM() together to dynamically extract product descriptions after hyphens
  • Applied REGEXP for pattern matching with numbers

Window Functions:

  • DENSE_RANK() and ROW_NUMBER() for numbering customer visits
  • COUNT() OVER (PARTITION BY ...) to calculate purchase frequency per customer-product combination

Set Operations:

  • Built CTEs with RANK() window functions to identify best/worst performing days
  • Combined results using UNION to display highest and lowest sales dates

Data Manipulation:

  • CROSS JOIN to create all possible vendor-product-customer combinations
  • INSERT, UPDATE, DELETE operations with proper WHERE clauses
  • ALTER TABLE to add columns and update values using subqueries with COALESCE()

Key Takeaway: Learned to combine multiple SQL functions (window functions, CTEs, joins) to solve complex analytical problems without hardcoding values.

Was there another approach you were thinking about making? If so, what approach(es) were you thinking of?

I considered adding a separate CUSTOMER_ADDRESS_HISTORY table, but the Type 2 model with validity ranges was more normalized and matched common data-warehouse practices.

Were there any challenges? If so, what issue(s) did you face? How did you overcome it?

One challenge was deciding how to handle inventory updates without risking inconsistencies. Initially, I considered storing stock_on_hand directly in the BOOK table, but that approach does not track history and is error-prone. I resolved this by introducing an INVENTORY_TRANSACTION table to log each stock movement.

How were these changes tested?

I validated the ER diagram by checking entity relationships for cardinality correctness and ensuring no circular or unnormalized dependencies.

A reference to a related issue in your repository (if applicable)

Checklist

  • I can confirm that my changes are working as intended

Added Mermaid source (.mmd) and PNG image files for the Bookstore ER diagram as part of Cohort 8 Assignment 2. The diagram models entities such as Customer, Employee, Order, Book, Supplier, and related relationships.
Renamed and updated the ER diagram to include EMPLOYEE, SHIFT, and EMPLOYEE_SHIFT entities, along with their relationships. Removed the old diagram files and added updated versions reflecting the new structure.
Provided SQL schema examples for customer address storage using type 1 (overwrite) and type 2 (history) slowly changing dimension approaches. This clarifies the architectural options for tracking customer addresses.
…pulation

Implemented queries for window functions including visit numbering, most recent visit filtering, and counting product purchases per customer. Added string manipulation queries to extract product descriptions and filter product sizes containing numbers.
@akorade akorade marked this pull request as draft November 14, 2025 01:32
@akorade akorade marked this pull request as ready for review November 19, 2025 01:29
Copy link
Copy Markdown

@monzchan monzchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

only the customer’s most recent visit. */


CREATE TEMP TABLE temp_customer_visits AS
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good attempt. The question requires reversing numbering based on the distinct visit list from above. Here used the raw customer_purchases may cause duplicate rows and incorrect visit numbering. consider this:
WITH visits AS (
SELECT
customer_id,
market_date,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY market_date DESC
) AS visit_number
FROM (
SELECT DISTINCT customer_id, market_date
FROM customer_purchases
) x
)
SELECT customer_id, market_date
FROM visits
WHERE visit_number = 1;


SELECT
product_name,
TRIM(SUBSTR(product_name, NULLIF(INSTR(product_name, '-'), 0) + 1)) AS description
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the product name has no hyphen, student formula incorrectly extracts the entire product_name instead of returning NULL, producing wrong descriptions.
CASE WHEN INSTR(product_name, '-') > 0 THEN
TRIM(SUBSTR(product_name, INSTR(product_name, '-') + 1))
ELSE NULL END

SELECT
v.vendor_name,
p.product_name,
customer_count * 5 * vi.original_price AS total_revenue
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SUM(5 * vi.original_price) AS total_revenue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants