M-Howorucha · M-Howorucha · Mar 31, 2026 · Mar 31, 2026 · Apr 7, 2026
diff --git a/02_activities/assignments/DC_Cohort/Assignment1.md b/02_activities/assignments/DC_Cohort/Assignment1.md
@@ -209,5 +209,5 @@ Consider, for example, concepts of fariness, inequality, social structures, marg
 
 
 ```
-Your thoughts...
+My day-to-day work involves data systems, primarily within the field of experimentally derived atmospheric observations. Although these datasets are intended to be entirely unbiased with respect to fairness, inequality, and social structures, there are still avenues for such concepts to affect the data and the downstream processes they inform. For example, global observations of atmospheric pollutants are often concentrated in regions of the world with high economic and political influence, such as North America and Europe. These also tend to be the regions least affected by the detrimental health impacts of prolonged air quality hazards, due in part to reductions in pollutants and to air quality warnings, both of which are informed by these observations. Conversely, developing nations, particularly in the Global South, are disproportionately affected by air quality risks while simultaneously having fewer and less reliable air quality warnings, due to the limited observational datasets available. Similarly, and closer to home, the Canadian High Arctic is a region of the world where the inhabitants (primarily Indigenous peoples) are simultaneously affected by extreme weather events and less reliable weather forecasts. This is almost entirely due to the concentration of Canada's population in the south, which drives economic and technological progress there and leaves the higher-latitude regions to lag behind. Although this is simply a result of the distribution of resources, it nonetheless results in a marginalized group in Canada being negatively affected. There are ongoing efforts in the Canadian atmospheric research community to address this issue, but it will likely persist for many years to come and become more significant under future climate change scenarios.
 ```
diff --git a/02_activities/assignments/DC_Cohort/Assignment2.md b/02_activities/assignments/DC_Cohort/Assignment2.md
@@ -56,7 +56,8 @@ The store wants to keep customer addresses. Propose two architectures for the CU
 **HINT:** search type 1 vs type 2 slowly changing dimensions. 
 
 ```
-Your answer...
+One possible architecture would be to maintain a table of all customers, adding a new row for each new customer and overwriting the existing row with their current information (including postal code) after each purchase. A second option would be to also have customers input their postal code after every purchase, but to add a new row to the table for each purchase. In this way, the most recent row for a given customer should be used to access their current information, as older rows may contain outdated information.
+The first option would be classified as Type 1, since there is a single row for each customer and the information is overwritten every time they make a purchase, so no history is retained. The second option is Type 2, since a new row is added for each purchase, so earlier values are retained alongside potentially new information over time.
 ```
 
 ***
@@ -191,5 +192,5 @@ Consider, for example, concepts of labour, bias, LLM proliferation, moderating c
 
 
 ```
-Your thoughts...
+The concept that modern AI models can trace their origins back to underpaid, perhaps unethical work decades ago is both shocking and unsurprising. Over the course of any technological development, the current "state-of-the-art" systems owe their origins to the work of many in the past. This is simply the natural progression of technological development, and neural networks, LLMs, etc. are no different from any other invention. The concerning aspect here is how the developers go about improving their systems and who is affected by their decisions. The fact that there are thousands of people being paid pennies to click images of dogs and cats so that some of the richest companies in the world can improve their bottom line is horrifying. As the author states, this is just another example of large companies exploiting workers for profit, just like in the fast fashion industry. Moving beyond the immediate impact on the workers in this situation, this system raises additional concerns about the biases that the workers themselves impart to the training data, and therefore to the resulting models and their output. The classic example is Grok, which, when trained on unmoderated Twitter posts, produced output that was riddled with racist, sexist, and otherwise horrendous answers. When training data is either not moderated or is retrieved from a single source, it opens up the potential for the model to contain significant biases. In the case the author cites, human-defined tags on images can be open to biases depending on the cultural or historical viewpoints of the classifiers. For instance, individuals from regions with low cultural diversity may classify images of people dissimilar to themselves predominantly based on their appearance, despite perhaps more significant classifications being present. To them, however, the most significant feature of an individual may be their race. In the end, LLMs affect people at all stages, from training to output, and the users of these systems must be aware of the impact and potential biases in their use.
 ```
diff --git a/02_activities/assignments/DC_Cohort/assignment1.sql b/02_activities/assignments/DC_Cohort/assignment1.sql
@@ -7,8 +7,7 @@
 /* 1. Write a query that returns everything in the customer table. */
 --QUERY 1
 
-
-
+SELECT * FROM customer;
 
 --END QUERY
 
@@ -17,8 +16,10 @@
 sorted by customer_last_name, then customer_first_ name. */
 --QUERY 2
 
+SELECT * FROM customer
 
-
+ORDER BY customer_last_name ASC, customer_first_name ASC
+LIMIT 10;
 
 --END QUERY
 
@@ -28,9 +29,10 @@ sorted by customer_last_name, then customer_first_ name. */
 Limit to 25 rows of output. */
 --QUERY 3
 
-
-
-
+SELECT * FROM customer_purchases
+WHERE product_id = 4
+OR product_id = 9
+LIMIT 25;
 --END QUERY
 
 
@@ -43,8 +45,9 @@ Limit to 25 rows of output.
 */
 --QUERY 4
 
-
-
+SELECT *, (quantity*cost_to_customer_per_qty) AS price FROM customer_purchases
+WHERE customer_id BETWEEN 8 AND 10
+LIMIT 25;
 
 --END QUERY
 
@@ -56,8 +59,14 @@ columns and add a column called prod_qty_type_condensed that displays the word
 if the product_qty_type is “unit,” and otherwise displays the word “bulk.” */
 --QUERY 5
 
+SELECT product_id, product_name, --product_qty_type,
+CASE 
+	WHEN product_qty_type = 'unit' 
+	THEN 'unit'
+	ELSE 'bulk'
+END AS prod_qty_type_condensed
 
-
+FROM product;
 
 --END QUERY
 
@@ -67,7 +76,19 @@ add a column to the previous query called pepper_flag that outputs a 1 if the pr
 contains the word “pepper” (regardless of capitalization), and otherwise outputs 0. */
 --QUERY 6
 
+SELECT product_id, product_name --product_qty_type,
+,CASE 
+	WHEN product_qty_type = 'unit' 
+	THEN 'unit'
+	ELSE 'bulk'
+END AS prod_qty_type_condensed
 
+,CASE 
+	WHEN LOWER(product_name) LIKE '%pepper%' THEN 1
+	ELSE 0
+END AS pepper_flag
+
+FROM product;
 
 
 --END QUERY
@@ -79,9 +100,17 @@ vendor_id field they both have in common, and sorts the result by market_date, t
 Limit to 24 rows of output. */
 --QUERY 7
 
+SELECT * FROM vendor v
 
+INNER JOIN vendor_booth_assignments vba 
+    ON v.vendor_id = vba.vendor_id
 
+ORDER BY 
+vba.market_date ASC, 
+v.vendor_name ASC
 
+LIMIT 24;
+
 --END QUERY
 
 
@@ -93,8 +122,16 @@ Limit to 24 rows of output. */
 at the farmer’s market by counting the vendor booth assignments per vendor_id. */
 --QUERY 8
 
-
-
+SELECT 
+    vendor_id, 
+    COUNT(*) AS vendors_count
+FROM 
+    vendor_booth_assignments
+GROUP BY 
+    vendor_id
+ORDER BY 
+    vendors_count DESC;
+
 
 --END QUERY
 
@@ -106,8 +143,23 @@ of customers for them to give stickers to, sorted by last name, then first name.
 HINT: This query requires you to join two tables, use an aggregate function, and use the HAVING keyword. */
 --QUERY 9
 
+SELECT 
+customer.customer_id,
+customer.customer_first_name,
+customer.customer_last_name,
+SUM(customer_purchases.quantity * customer_purchases.cost_to_customer_per_qty) AS total_spend
 
+FROM customer_purchases
 
+INNER JOIN 
+	customer
+	ON customer_purchases.customer_id = customer.customer_id
+
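+GROUP BY customer.customer_id, customer.customer_first_name, customer.customer_last_name
+HAVING SUM(customer_purchases.quantity * customer_purchases.cost_to_customer_per_qty) > 2000
+ORDER BY
+    customer.customer_last_name,
+    customer.customer_first_name;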
 
 --END QUERY
 
@@ -125,7 +177,14 @@ VALUES(col1,col2,col3,col4,col5)
 */
 --QUERY 10
 
+-- If a table named temp.new_vendor already exists, drop it; otherwise do nothing
+DROP TABLE IF EXISTS temp.new_vendor;
+CREATE TABLE temp.new_vendor AS
+SELECT *
+FROM vendor;
 
+INSERT INTO temp.new_vendor
+VALUES(10,'Thomass Superfood Store', 'Fresh Focused', 'Thomas', 'Rosenthal');
 
 
 --END QUERY