Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file modified 01_slides/slides_03.pdf
Binary file not shown.
Binary file modified 01_slides/slides_04.pdf
Binary file not shown.
Binary file modified 01_slides/slides_05.pdf
Binary file not shown.
Binary file modified 01_slides/slides_06.pdf
Binary file not shown.
15 changes: 15 additions & 0 deletions 02_assignments/design_a_logical_model.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,17 +5,30 @@ Create a logical model for a small bookstore. 📚

At the minimum it should have employee, order, sales, customer, and book entities (tables). Determine sensible column and table design based on what you know about these concepts. Keep it simple, but work out sensible relationships to keep tables reasonably sized. Include a date table. There are several tools online you can use, I'd recommend [_Draw.io_](https://www.drawio.com/) or [_LucidChart_](https://www.lucidchart.com/pages/).

<img width="627" alt="Assignment_q1" src="https://github.com/AdithyaLab/sql/assets/167127254/d7f51165-0d71-4057-a41f-d0bcec0ac16c">

```
I am assuming that employees have no direct relationship with sales or orders for this database. Employees purely manage store operations.
```


## Question 2
We want to create employee shifts, splitting up the day into morning and evening. Add this to the ERD.

<img width="653" alt="Assignment_q2" src="https://github.com/AdithyaLab/sql/assets/167127254/3b13caf7-b023-40c9-a802-66832ccf2a1d">


## Question 3
The store wants to keep customer addresses. Propose two architectures for the CUSTOMER_ADDRESS table, one that will retain changes, and another that will overwrite. Which is type 1, which is type 2?

_Hint, search type 1 vs type 2 slowly changing dimensions._

<img width="519" alt="Assignment_q3" src="https://github.com/AdithyaLab/sql/assets/167127254/ce81c910-946f-4c8d-837c-724624692cf3">

Bonus: Are there privacy implications to this, why or why not?
```
Your answer...
For the type 1 architecture, the old address is overwritten by the new address and the old data is deleted. This means there are fewer privacy concerns as only one address is available at any one time. For the type 2 architecture, old addresses are stored, providing a more extensive history of a customer's address history, which could raise privacy concerns.
```

## Question 4
Expand All @@ -24,6 +37,8 @@ Review the AdventureWorks Schema [here](https://i.stack.imgur.com/LMu4W.gif)
Highlight at least two differences between it and your ERD. Would you change anything in yours?
```
Your answer...
1) The Schema centers one "mega" table for each "coloured" function such as Employee, Person, SalesOrderHeader, Product and Vendor. This simplifies the overall architecture for a large database. If i was to expand on the data collected by the book store, I would follow this "hub and spoke" model
2) The Schema uses modified dates, utilizing the type 2 slowly changing dimension architecture. This means that the old data will be retained in the database. I would consider this architecture for my ERD if there was a regulatory requirement to do so, but for the purposes of reducing privacy risks, I will keep it the same.
```

# Criteria
Expand Down
35 changes: 35 additions & 0 deletions 03_homework/homework_4.sql
Original file line number Diff line number Diff line change
Expand Up @@ -38,3 +38,38 @@ only the customer’s most recent visit. */

/* 3. Using a COUNT() window function, include a value along with each row of the
customer_purchases table that indicates how many different times that customer has purchased that product_id. */




-- String manipulations
/* 1. Some product names in the product table have descriptions like "Jar" or "Organic".
These are separated from the product name with a hyphen.
Create a column using SUBSTR (and a couple of other commands) that captures these, but is otherwise NULL.
Remove any trailing or leading whitespaces. Don't just use a case statement for each product!

| product_name | description |
|----------------------------|-------------|
| Habanero Peppers - Organic | Organic |

Hint: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column. */



/* 2. Filter the query to show any product_size value that contain a number with REGEXP. */



-- UNION
/* 1. Using a UNION, write a query that displays the market dates with the highest and lowest total sales.

HINT: There are a possibly a few ways to do this query, but if you're struggling, try the following:
1) Create a CTE/Temp Table to find sales values grouped dates;
2) Create another CTE/Temp table with a rank windowed function on the previous query to create
"best day" and "worst day";
3) Query the second temp table twice, once for the best day, once for the worst day,
with a UNION binding them. */




30 changes: 0 additions & 30 deletions 03_homework/homework_5.sql
Original file line number Diff line number Diff line change
@@ -1,33 +1,3 @@
-- String manipulations
/* 1. Some product names in the product table have descriptions like "Jar" or "Organic".
These are separated from the product name with a hyphen.
Create a column using SUBSTR (and a couple of other commands) that captures these, but is otherwise NULL.
Remove any trailing or leading whitespaces. Don't just use a case statement for each product!

| product_name | description |
|----------------------------|-------------|
| Habanero Peppers - Organic | Organic |

Hint: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column. */



/* 2. Filter the query to show any product_size value that contain a number with REGEXP. */



-- UNION
/* 1. Using a UNION, write a query that displays the market dates with the highest and lowest total sales.

HINT: There are a possibly a few ways to do this query, but if you're struggling, try the following:
1) Create a CTE/Temp Table to find sales values grouped dates;
2) Create another CTE/Temp table with a rank windowed function on the previous query to create
"best day" and "worst day";
3) Query the second temp table twice, once for the best day, once for the worst day,
with a UNION binding them. */



-- Cross Join
/*1. Suppose every vendor in the `vendor_inventory` table had 5 of each of their products to sell to **every**
customer on record. How much money would each vendor make per product?
Expand Down
147 changes: 147 additions & 0 deletions 04_data/external/new_customer_purchases.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
product_id,vendor_id,market_date,customer_id,quantity,cost_to_customer_per_qty,transaction_time
21,3,2024-05-24,11,1,4.99,17:32:00
21,3,2024-05-24,25,2,4.99,18:23:00
21,3,2024-05-24,24,2,4.99,18:41:00
21,3,2024-05-24,6,2,4.99,18:18:00
21,3,2024-05-24,8,1,4.99,17:34:00
21,3,2024-05-24,2,1,4.99,12:20:00
21,3,2024-05-24,18,1,4.99,9:33:00
21,3,2024-05-24,24,3,4.99,13:05:00
21,3,2024-05-24,18,3,4.99,12:26:00
21,3,2024-05-24,4,2,4.99,12:46:00
21,3,2024-05-24,17,3,4.99,18:40:00
21,3,2024-05-24,13,2,4.99,18:06:00
21,3,2024-05-24,23,3,4.99,18:35:00
21,3,2024-05-24,3,3,4.99,18:56:00
21,3,2024-05-24,8,3,4.99,12:03:00
21,3,2024-05-24,1,2,4.99,11:43:00
21,3,2024-05-24,23,3,4.99,17:43:00
21,3,2024-05-24,13,3,4.99,18:22:00
21,3,2024-05-24,23,2,4.99,18:52:00
21,3,2024-05-24,11,2,4.99,17:07:00
21,3,2024-05-24,1,1,4.99,12:56:00
21,3,2024-05-24,3,1,4.99,11:02:00
21,3,2024-05-24,2,2,4.99,11:38:00
21,3,2024-05-24,23,1,4.99,18:56:00
21,3,2024-05-24,24,1,4.99,18:59:00
21,3,2024-05-24,2,1,4.99,10:42:00
21,3,2024-05-24,15,3,4.99,17:30:00
21,3,2024-05-24,5,2,4.99,9:08:00
21,3,2024-05-24,24,2,4.99,13:58:00
21,3,2024-05-24,24,1,4.99,9:37:00
21,3,2024-05-24,26,1,4.99,17:27:00
21,3,2024-05-24,19,1,4.99,11:40:00
21,3,2024-05-24,2,2,4.99,13:59:00
21,3,2024-05-24,25,1,4.99,12:21:00
21,3,2024-05-24,12,2,4.99,10:10:00
21,3,2024-05-24,12,1,4.99,10:39:00
21,3,2024-05-24,3,3,4.99,11:59:00
21,3,2024-05-24,25,1,4.99,18:18:00
21,3,2024-05-24,21,3,4.99,17:05:00
21,3,2024-05-24,11,1,4.99,17:39:00
21,3,2024-05-24,10,3,4.99,18:01:00
21,3,2024-05-24,18,1,4.99,17:19:00
21,3,2024-05-24,11,2,4.99,17:10:00
22,3,2024-05-24,22,2.5,6.49,18:36:00
22,3,2024-05-24,18,0.81,6.49,18:53:00
22,3,2024-05-24,10,2.35,6.49,17:44:00
22,3,2024-05-24,13,3.37,6.49,13:28:00
22,3,2024-05-24,10,0.82,6.49,18:18:00
22,3,2024-05-24,3,2.87,6.49,8:35:00
22,3,2024-05-24,1,0.27,6.49,8:47:00
22,3,2024-05-24,1,2.87,6.49,11:00:00
22,3,2024-05-24,17,2.38,6.49,18:56:00
22,3,2024-05-24,5,0.31,6.49,18:28:00
22,3,2024-05-24,7,3.45,6.49,10:00:00
22,3,2024-05-24,17,0.07,6.49,9:57:00
22,3,2024-05-24,20,0.57,6.49,9:12:00
22,3,2024-05-24,2,4.05,6.49,18:33:00
22,3,2024-05-24,4,0.76,6.49,16:30:00
22,3,2024-05-24,10,3.77,6.49,17:57:00
22,3,2024-05-24,19,4.11,6.49,13:33:00
22,3,2024-05-24,17,2.84,6.49,8:35:00
22,3,2024-05-24,18,0,6.49,8:49:00
22,3,2024-05-24,2,1.42,6.49,9:09:00
10,1,2024-05-24,9,3,4.50,18:37:00
10,1,2024-05-24,5,5,4.50,17:37:00
10,1,2024-05-24,3,6,4.50,17:52:00
10,1,2024-05-24,16,3,4.50,17:21:00
10,1,2024-05-24,26,2,4.50,13:49:00
10,1,2024-05-24,22,1,4.50,13:42:00
10,1,2024-05-24,13,5,4.50,18:41:00
10,1,2024-05-24,20,4,4.50,17:14:00
10,1,2024-05-24,10,4,4.50,12:17:00
10,1,2024-05-24,25,2,4.50,11:11:00
10,1,2024-05-24,18,2,4.50,11:27:00
10,1,2024-05-24,2,4,4.50,16:47:00
10,1,2024-05-24,12,5,4.50,18:52:00
10,1,2024-05-24,15,1,4.50,17:28:00
10,1,2024-05-24,16,4,4.50,9:23:00
10,1,2024-05-24,7,3,4.50,8:54:00
10,1,2024-05-24,13,5,4.50,18:54:00
10,1,2024-05-24,24,6,4.50,17:21:00
10,1,2024-05-24,8,1,4.50,18:16:00
10,1,2024-05-24,7,2,4.50,18:27:00
10,1,2024-05-24,13,6,4.50,17:54:00
10,1,2024-05-24,7,3,4.50,17:56:00
10,1,2024-05-24,26,2,4.50,11:01:00
10,1,2024-05-24,22,6,4.50,12:51:00
10,1,2024-05-24,16,2,4.50,10:21:00
10,1,2024-05-24,17,3,4.50,12:37:00
10,1,2024-05-24,5,4,4.50,18:53:00
10,1,2024-05-24,16,2,4.50,18:39:00
10,1,2024-05-24,20,1,4.50,17:34:00
10,1,2024-05-24,1,5,4.50,17:36:00
10,1,2024-05-24,10,6,4.50,17:53:00
10,1,2024-05-24,22,2,4.50,13:03:00
10,1,2024-05-24,7,1,4.50,10:09:00
10,1,2024-05-24,11,1,4.50,13:44:00
10,1,2024-05-24,1,6,4.50,11:40:00
10,1,2024-05-24,23,3,4.50,18:51:00
10,1,2024-05-24,24,1,4.50,18:15:00
10,1,2024-05-24,26,4,4.50,18:43:00
10,1,2024-05-24,9,1,4.50,13:30:00
10,1,2024-05-24,11,3,4.50,13:54:00
10,1,2024-05-24,23,2,4.50,18:45:00
10,1,2024-05-24,2,1,4.50,18:49:00
11,1,2024-05-24,22,0.65,7.29,10:33:00
11,1,2024-05-24,1,0.88,7.29,10:20:00
11,1,2024-05-24,16,3.08,7.29,10:24:00
11,1,2024-05-24,21,2.61,7.29,12:03:00
11,1,2024-05-24,16,1.52,7.29,11:43:00
11,1,2024-05-24,8,0.06,7.29,17:43:00
11,1,2024-05-24,19,0.23,7.29,18:22:00
11,1,2024-05-24,17,3.61,7.29,18:52:00
20,5,2024-05-24,2,2,10,12:56:00
20,5,2024-05-24,26,2,10,11:02:00
20,5,2024-05-24,20,2,10,11:38:00
20,5,2024-05-24,4,4,10,18:56:00
20,5,2024-05-24,14,4,10,18:59:00
20,5,2024-05-24,8,4,10,10:42:00
20,5,2024-05-24,8,6,10,9:33:00
20,5,2024-05-24,6,6,10,13:28:00
20,5,2024-05-24,11,6,10,13:33:00
20,5,2024-05-24,18,6,10,13:33:00
20,5,2024-05-24,19,6,10,11:02:00
20,5,2024-05-24,6,6,10,18:56:00
20,5,2024-05-24,2,4,10,18:36:00
20,5,2024-05-24,12,4,10,18:15:00
20,5,2024-05-24,9,4,10,17:35:00
20,5,2024-05-24,11,4,10,16:33:00
20,5,2024-05-24,24,4,10,18:16:00
20,5,2024-05-24,17,6,10,8:54:00
20,5,2024-05-24,15,2,10,9:52:00
20,5,2024-05-24,13,6,10,10:26:00
20,5,2024-05-24,23,2,10,12:37:00
20,5,2024-05-24,17,6,10,13:13:00
20,5,2024-05-24,14,6,10,11:29:00
20,5,2024-05-24,20,6,10,18:21:00
20,5,2024-05-24,8,2,10,16:58:00
20,5,2024-05-24,14,4,10,17:08:00
20,5,2024-05-24,17,2,10,16:41:00
19,5,2024-05-24,7,2,0.99,11:41:00
19,5,2024-05-24,12,1,0.99,10:07:00
19,5,2024-05-24,16,3,0.99,10:39:00
19,5,2024-05-24,9,2,0.99,18:35:00
19,5,2024-05-24,10,1,0.99,18:44:00
19,5,2024-05-24,19,4,0.99,17:50:00
Loading