Problem: 1757. Recyclable and Low Fat Products
Intuition:
This problem is straightforward: we only have to return the product_id of the rows where
both the low_fats and recyclable columns are set to 'Y'
Approach:
Simply extract the rows that meet the condition using a WHERE clause
Solution:
SELECT product_id FROM Products
WHERE low_fats='Y' AND recyclable='Y';
Problem: 2356. Number of Unique Subjects Taught by Each Teacher
Intuition:
This problem needs us to compute the number of "unique" subjects taught by each teacher.
Seeing the word "unique", the DISTINCT keyword immediately comes to mind. Also,
because we compute a value for each teacher, we certainly have to use GROUP BY.
Approach:
Firstly, since we have to compute a value for each teacher, we use `GROUP BY teacher_id`.
Next, we can pick out the unique subjects with `DISTINCT subject_id`, and then count
them with `COUNT(DISTINCT subject_id)`
Solution:
SELECT teacher_id, COUNT(distinct subject_id) as cnt FROM Teacher
GROUP BY teacher_id;
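The COUNT(DISTINCT ...) + GROUP BY pattern above can be sketched end to end. This is a minimal, runnable illustration using Python's sqlite3 standing in for MySQL; the table rows are invented for the demo (the real table lives on LeetCode).

```python
import sqlite3

# Build a tiny in-memory Teacher table with a duplicated
# (teacher, subject) pair to show that DISTINCT counts it once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Teacher (teacher_id INT, subject_id INT, dept_id INT)")
conn.executemany(
    "INSERT INTO Teacher VALUES (?, ?, ?)",
    [(1, 2, 3), (1, 2, 4), (1, 3, 3), (2, 1, 1)],  # teacher 1 teaches subjects {2, 3}
)
rows = conn.execute(
    "SELECT teacher_id, COUNT(DISTINCT subject_id) AS cnt "
    "FROM Teacher GROUP BY teacher_id ORDER BY teacher_id"
).fetchall()
print(rows)  # → [(1, 2), (2, 1)]
```

Teacher 1 appears in three rows but only two distinct subjects, so the duplicate (1, 2) pair is counted once.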
Problem: 1741. Find Total Time Spent by Each Employee
Intuition:
This problem needs us to compute the total time each employee spent in the office on
each day. When it comes to "each", we immediately think of the GROUP BY statement,
and for the total, we can use SUM.
Approach:
Firstly, since we have to compute based on employee and date, we use `GROUP BY emp_id, event_day`.
Next, we can directly use the aggregate function SUM as `SUM(out_time-in_time)`
Solution:
SELECT event_day as day, emp_id, SUM(out_time-in_time) as total_time FROM Employees
GROUP BY emp_id, event_day;
Problem: 1693. Daily Leads and Partners
Intuition:
This problem needs us to compute the unique leads and partners for each date and make
name. Naturally, we come up with the DISTINCT, GROUP BY, and COUNT statements.
Approach:
Firstly, as we have to compute based on date and make name, we use `GROUP BY date_id, make_name`.
Next, we can pick the distinct values of leads and partners using `DISTINCT lead_id` and
`DISTINCT partner_id`. Eventually, we make use of the aggregate function COUNT as `COUNT(DISTINCT lead_id)`
and `COUNT(DISTINCT partner_id)` to compute the results.
Solution:
SELECT date_id, make_name, COUNT(distinct lead_id) as unique_leads, COUNT(distinct partner_id) as unique_partners FROM DailySales
GROUP BY date_id, make_name
Problem: 1393. Capital Gain/Loss
Intuition:
This problem needs us to compute the total gain/loss for each stock, so we have to
separate the buy operations from the sell operations. Also, because we have to compute
each stock respectively, we have to use GROUP BY.
Approach:
One important thing to notice is that when computing total gain/loss, we don't have to care
about the order of trading in this circumstance. So the strategy becomes: total sell price -
total buy price. Firstly, we gather the total sell price for each stock using
SELECT stock_name, SUM(price) as sell_total FROM Stocks
WHERE operation='Sell'
GROUP BY stock_name, operation
This would serve as a subquery. Next we do the same thing for total buy price
SELECT stock_name, sum(price) as buy_total FROM Stocks
WHERE operation='Buy'
GROUP BY stock_name, operation
Once we have these two queries, we can join them and compute sell_total-buy_total
Solution:
SELECT b.stock_name, s.sell_total-SUM(b.price) as capital_gain_loss FROM Stocks as b
INNER JOIN (
SELECT stock_name, SUM(price) as sell_total FROM Stocks
WHERE operation='Sell'
GROUP BY stock_name
) AS s ON b.stock_name = s.stock_name
WHERE b.operation='Buy'
GROUP BY b.stock_name, s.sell_total;
Modified Solution:
SELECT stock_name,
SUM(
CASE WHEN operation = 'Buy' THEN
-price
ELSE
price
END
) AS capital_gain_loss
FROM Stocks
GROUP BY stock_name;
Modified Solution Explained:
In the modified solution, instead of separating the buy and sell operations with a subquery,
it makes use of a CASE WHEN statement. This gives the same result as the subquery version,
but it avoids building intermediate tables, which takes extra time, and it makes the code
more succinct.
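The signed-sum trick in the modified solution can be demonstrated directly. Below is a minimal sketch using Python's sqlite3 in place of MySQL, with invented stock rows: Buy prices enter the SUM negated, Sell prices positive, so one pass per stock yields the net gain/loss.

```python
import sqlite3

# Toy Stocks table; each stock has matched Buy/Sell rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Stocks (stock_name TEXT, operation TEXT, price INT)")
conn.executemany("INSERT INTO Stocks VALUES (?, ?, ?)", [
    ("Leetcode", "Buy", 1000), ("Leetcode", "Sell", 9000),
    ("Corona", "Buy", 10), ("Corona", "Sell", 1010),
])
rows = conn.execute("""
    SELECT stock_name,
           SUM(CASE WHEN operation = 'Buy' THEN -price ELSE price END) AS capital_gain_loss
    FROM Stocks
    GROUP BY stock_name
    ORDER BY stock_name
""").fetchall()
print(rows)  # → [('Corona', 1000), ('Leetcode', 8000)]
```

No self-join or subquery is needed; the CASE expression folds both operations into a single aggregate.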
Problem: 1795. Rearrange Products Table
Intuition:
When I first read this question, I found that it's all about transposing. Languages
like SAS, Python, and R all have good functions to handle transposition directly. However,
there is no such function in MySQL, so we have to carefully create each column ourselves.
Approach:
Firstly, we notice that there are only 3 stores, so we can use `SELECT 'store1'` to create the
`store` column. Here, we can see that running the statement `SELECT p.product_id, "store1"
as store, p.store1 as price FROM Products as p` actually returns the result that we want for one
store. So our next task is stacking the three results up, which we can do with `UNION ALL`.
Eventually, we use a `WHERE` clause to filter out rows whose price is NULL
Solution:
SELECT * FROM (
SELECT p.product_id, "store1" as store, p.store1 as price FROM Products as p
UNION ALL
SELECT p.product_id, "store2" as store, p.store2 as price FROM Products as p
UNION ALL
SELECT p.product_id, "store3" as store, p.store3 as price FROM Products as p
) as r
WHERE r.price IS NOT NULL;
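The UNION ALL unpivot can be sketched as follows, using Python's sqlite3 standing in for MySQL with invented prices; a missing price is a NULL, and the outer filter drops those rows.

```python
import sqlite3

# Wide Products table: one column per store; product 1 is not sold in store2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (product_id INT, store1 INT, store2 INT, store3 INT)")
conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?)",
                 [(0, 95, 100, 105), (1, 70, None, 80)])
rows = conn.execute("""
    SELECT * FROM (
        SELECT product_id, 'store1' AS store, store1 AS price FROM Products
        UNION ALL
        SELECT product_id, 'store2', store2 FROM Products
        UNION ALL
        SELECT product_id, 'store3', store3 FROM Products
    ) WHERE price IS NOT NULL
    ORDER BY product_id, store
""").fetchall()
print(rows)
```

Each branch of the UNION ALL contributes one (product, store, price) slice, and the NULL row for (1, store2) disappears in the final output.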
Problem: 1683. Invalid Tweets
Intuition:
It's straightforward
Approach:
Consider using the `LENGTH` function
Solution:
SELECT tweet_id FROM Tweets
WHERE LENGTH(content) > 15;
Problem: 1587. Bank Account Summary II
Intuition:
It contains two tables, so we surely have to use join operation. And it wants to compute
the balance for each person, so the combined use of `GROUP BY` and `SUM` is necessary.
Approach:
First, we have to consider what kind of join to use. For this problem, both an inner
join and a left join work (with the Transactions table as the left table). However, in
real-world scenarios, a left join is preferable because there might be transactions that
don't match any account name; using a left join ensures that no transaction records are
lost. So firstly, we perform the left join
SELECT * FROM Transactions as t
LEFT JOIN Users as u
ON t.account = u.account
Next, we have to compute the total balance for each person, so the query becomes
SELECT u.name, SUM(t.amount) as balance FROM Transactions as t
LEFT JOIN Users as u
ON t.account = u.account
GROUP BY u.name
Eventually, we can filter the account that has balance>10000 using subquery
Solution:
SELECT r.name, r.balance FROM (
SELECT u.name, SUM(t.amount) as balance FROM Transactions as t
LEFT JOIN Users as u
ON t.account = u.account
GROUP BY u.name
) as r
WHERE balance > 10000;
Problem: 627. Swap Salary
Intuition:
This is a simple problem, though it requires the `UPDATE` statement. Also,
for conditional relabelling, we consider using a `CASE WHEN` expression
Solution:
UPDATE Salary
SET sex = (
CASE sex
WHEN 'f' THEN 'm'
WHEN 'm' THEN 'f'
END);
Problem: 1378. Replace Employee ID With The Unique Identifier
Intuition:
It's a straightforward left join problem
Solution:
SELECT eu.unique_id, e.name FROM Employees as e
LEFT JOIN EmployeeUNI as eu
ON e.id = eu.id;
Problem: 1068. Product Sales Analysis I
Intuition:
It's a straightforward join operation problem. All we have to think about is what
kind of join operation to implement. Since the problem requires us to show the
information for each sale_id, we can perform a left join and treat the Sales table
as the left table.
Solution:
SELECT p.product_name, s.year, s.price FROM Sales as s
LEFT JOIN Product as p
ON s.product_id = p.product_id;
Problem: 1179. Reformat Department Table
Intuition:
This is again a problem related to transposing. For transposing problems, we first
consider the `CASE WHEN` statement. A tip for this problem: to surface the "single"
value per cell when `GROUP BY` is used, we can technically use `SUM`, because
`GROUP BY` requires an aggregate function around non-grouped columns
Approach:
Firstly, we have to create new column for each month, so we use
CASE month WHEN 'blahblahblah' THEN revenue ELSE null END
Next, since the problem requires us to calculate result for each month, we
have to use
GROUP BY id
Here comes the importance of aggregate function
SUM(CASE month WHEN 'blahblahblah' THEN revenue ELSE null END) AS 'blahblahblah_revenue'
Solution:
SELECT id,
SUM(CASE month WHEN 'Jan' THEN revenue ELSE NULL END) AS Jan_Revenue,
SUM(CASE month WHEN 'Feb' THEN revenue ELSE NULL END) AS Feb_Revenue,
SUM(CASE month WHEN 'Mar' THEN revenue ELSE NULL END) AS Mar_Revenue,
SUM(CASE month WHEN 'Apr' THEN revenue ELSE NULL END) AS Apr_Revenue,
SUM(CASE month WHEN 'May' THEN revenue ELSE NULL END) AS May_Revenue,
SUM(CASE month WHEN 'Jun' THEN revenue ELSE NULL END) AS Jun_Revenue,
SUM(CASE month WHEN 'Jul' THEN revenue ELSE NULL END) AS Jul_Revenue,
SUM(CASE month WHEN 'Aug' THEN revenue ELSE NULL END) AS Aug_Revenue,
SUM(CASE month WHEN 'Sep' THEN revenue ELSE NULL END) AS Sep_Revenue,
SUM(CASE month WHEN 'Oct' THEN revenue ELSE NULL END) AS Oct_Revenue,
SUM(CASE month WHEN 'Nov' THEN revenue ELSE NULL END) AS Nov_Revenue,
SUM(CASE month WHEN 'Dec' THEN revenue ELSE NULL END) AS Dec_Revenue
FROM Department
GROUP BY id;
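The conditional-aggregation pivot above can be sketched in miniature. This is a runnable demo trimmed to three months, using Python's sqlite3 in place of MySQL with invented revenue rows: SUM over a CASE that is NULL for all other months leaves exactly one value per cell, and months with no row come out NULL.

```python
import sqlite3

# Long-format Department table: one row per (id, month) with revenue.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Department (id INT, revenue INT, month TEXT)")
conn.executemany("INSERT INTO Department VALUES (?, ?, ?)", [
    (1, 8000, 'Jan'), (2, 9000, 'Jan'), (3, 10000, 'Feb'),
    (1, 7000, 'Feb'), (1, 6000, 'Mar'),
])
rows = conn.execute("""
    SELECT id,
           SUM(CASE month WHEN 'Jan' THEN revenue END) AS Jan_Revenue,
           SUM(CASE month WHEN 'Feb' THEN revenue END) AS Feb_Revenue,
           SUM(CASE month WHEN 'Mar' THEN revenue END) AS Mar_Revenue
    FROM Department GROUP BY id ORDER BY id
""").fetchall()
print(rows)
```

Department 2 only has a January row, so its Feb and Mar cells are NULL (Python `None`), matching the ELSE NULL behavior of the full twelve-month query.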
Problem: 1890. The Latest Login in 2020
Intuition:
This problem asks us to filter out the latest login for each id in 2020.
So the first solution that comes to mind may be the combination of the `WHERE`,
`ORDER BY`, `GROUP BY`, and `LIMIT` statements. However, it turns
out that MySQL doesn't support this kind of operation. After thorough search,
I've found that after MySQL8, there is a new window function called `ROW_NUMBER`
that supports this kind of operation.
Approach:
Firstly, we filter out the date that is not in the year of 2020
SELECT * FROM Logins
WHERE time_stamp >= '2020-01-01 00:00:00' and time_stamp <= '2020-12-31 23:59:59'
Next, we use the statement `ROW_NUMBER` to order the time_stamp based on each id
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time_stamp DESC) AS row_num
FROM Logins
WHERE time_stamp >= '2020-01-01 00:00:00' and time_stamp <= '2020-12-31 23:59:59'
This way, we can use this subquery result to filter out the latest login date for each
id.
Solution:
SELECT r.user_id, r.time_stamp AS last_stamp FROM(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY time_stamp DESC) AS row_num
FROM Logins
WHERE time_stamp >= '2020-01-01 00:00:00' and time_stamp <= '2020-12-31 23:59:59'
) as r
WHERE r.row_num = 1;
Modified Solution:
SELECT user_id, MAX(time_stamp) AS last_stamp FROM Logins
WHERE YEAR(time_stamp) = '2020'
GROUP BY user_id;
Modified Solution Explained:
This solution basically has the same concept as mine, but it makes use of other statements
to keep the code cleaner. First, it uses `YEAR` to filter the data to 2020. Second, it
makes use of the `MAX` aggregate function in place of `WHERE r.row_num = 1`
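The MAX-per-group idea from the modified solution can be sketched as below. The demo uses Python's sqlite3 in place of MySQL, and since SQLite has no `YEAR()` function, `strftime('%Y', ...)` stands in for it; the timestamps are invented.

```python
import sqlite3

# Logins stored as ISO-8601 text; user 6 has logins across three years.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logins (user_id INT, time_stamp TEXT)")
conn.executemany("INSERT INTO Logins VALUES (?, ?)", [
    (6, "2020-06-30 15:06:07"),
    (6, "2021-04-21 14:06:06"),
    (6, "2019-03-07 00:18:15"),
    (8, "2020-12-30 00:46:50"),
])
rows = conn.execute("""
    SELECT user_id, MAX(time_stamp) AS last_stamp FROM Logins
    WHERE strftime('%Y', time_stamp) = '2020'
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()
print(rows)
```

The 2019 and 2021 rows are filtered before the aggregate, so MAX picks the latest login within 2020 only.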
Problem: 1484. Group Sold Products By The Date
Intuition:
This is a `GROUP BY` problem, but the tricky part is that it has to concatenate strings.
After thorough research, it turns out that this is a simple problem that makes use of
the `GROUP_CONCAT` function.
Solution:
SELECT sell_date,
COUNT(DISTINCT product) AS num_sold,
GROUP_CONCAT(DISTINCT product order by product) AS products FROM Activities
GROUP BY sell_date;
Problem: 175. Combine Two Tables
Intuition:
This is a simple join problem. However, the part worth noticing is that if a person doesn't
have address information, he/she should still appear in the result with NULL in the address
columns. So the INNER JOIN operation doesn't work, as it would filter out people who don't
have address information. As a result, we should use the LEFT JOIN operation
Solution:
SELECT p.firstname, p.lastname, a.city, a.state FROM Person as p
LEFT JOIN Address as a
ON p.personId = a.personId;
Problem: 1148. Article Views I
Intuition:
The problem requires us to list all the authors who have viewed their own articles at least
once, so what we have to do is list the rows where author_id equals viewer_id. The following
steps are trivial
Solution:
SELECT DISTINCT author_id AS id FROM Views
WHERE author_id=viewer_id
ORDER BY author_id ASC;
Problem: 577. Employee Bonus
Intuition:
This is a simple LEFT JOIN problem
Solution:
SELECT e.name, b.bonus FROM Employee as e
LEFT JOIN Bonus as b
ON e.empId=b.empId
WHERE b.bonus<1000 or b.bonus is NULL;
Problem: 511. Game Play Analysis I
Intuition:
A simple problem requiring the combined use of the `GROUP BY` and `MIN` statements
Solution:
SELECT player_id, MIN(event_date) AS first_login FROM Activity
GROUP BY player_id;
Problem: 620. Not Boring Movies
Intuition:
A simple problem. Just notice how to use % (modulo) to filter out the odd-numbered ids
Solution:
SELECT * FROM cinema
WHERE id % 2 = 1 and description != 'boring'
ORDER BY rating DESC;
Problem: 608. Tree Node
Intuition:
This problem asks us to label the type of each node of the tree. As there are three
types of nodes, we have to come up with conditions to put them in different categories.
A root has no parent, so its p_id must be NULL. Next, a node is an inner node if its
id appears among the parent ids. The remaining nodes are leaves. Since we have to check
these conditions for each row, we reach for a `CASE WHEN` statement
Approach:
First, we can find root node by
p_id IS NULL
Next, we can find inner node by
id in (SELECT DISTINCT p_id from tree)
Eventually, we can use `CASE WHEN` to combine the results
Solution:
SELECT id,
CASE
WHEN p_id IS NULL THEN 'Root'
WHEN id in (SELECT DISTINCT p_id from tree) THEN 'Inner'
ELSE 'Leaf'
END AS type
FROM Tree;
Problem: 610. Triangle Judgement
Intuition:
Simple, just have to know that three sides could form a triangle if and only if
the sum of any two sides is greater than the third side
Solution:
SELECT *,
CASE
WHEN x+y>z and x+z>y and y+z>x THEN 'Yes'
ELSE 'No'
END AS 'triangle'
FROM Triangle;
Modified Solution:
SELECT *,
IF (x+y>z and x+z>y and y+z>x, 'Yes', 'No') AS 'triangle'
FROM Triangle;
Modified Solution Explained:
This solution makes use of the `IF` statement. It improves the code's readability
and performance at the same time.
Problem: 1965. Employees With Missing Information
Intuition:
The first idea when I saw this problem was a FULL OUTER JOIN, because we must not lose
any records from either table. However, MySQL does not support FULL OUTER JOIN directly,
and even if it did, we would still have to check whether the left or the right side is
missing an employee_id. So instead, we can filter out the employees with missing
information from each side first and then combine the two results.
Approach:
This concept is similar to a FULL OUTER JOIN. To find the missing information from
the left side, we can run
SELECT e.employee_id FROM Employees as e
LEFT JOIN Salaries as s
ON e.employee_id=s.employee_id
WHERE e.name is NULL or s.salary is NULL
Next, we can find it from the right
SELECT s.employee_id FROM Salaries as s
LEFT JOIN Employees as e
ON e.employee_id=s.employee_id
WHERE e.name is NULL or s.salary is NULL
Then, we can use `UNION ALL` statement to concatenate the result and use subquery for reordering
Solution:
SELECT * FROM (
SELECT e.employee_id FROM Employees as e
LEFT JOIN Salaries as s
ON e.employee_id=s.employee_id
WHERE e.name is NULL or s.salary is NULL
UNION ALL
SELECT s.employee_id FROM Salaries as s
LEFT JOIN Employees as e
ON e.employee_id=s.employee_id
WHERE e.name is NULL or s.salary is NULL
) AS r
ORDER BY employee_id ASC;
Problem: 182. Duplicate Emails
Intuition:
The structure of the problem is straightforward: find the duplicate values. So the first
thing that comes to mind is the `ROW_NUMBER() OVER` statement, which numbers rows based
on specific columns
Approach:
First, we use the `ROW_NUMBER() OVER` statement to generate an id for each email
SELECT email,
ROW_NUMBER() OVER (PARTITION BY email) as row_num
FROM Person
Note that we must not add `GROUP BY email`: it would collapse each email into a single
row, making row_num equal to 1 for every email.
We can then filter the rows that have row_num=2, indicating that the email occurred more
than once
Solution:
SELECT email FROM (
SELECT email,
ROW_NUMBER() OVER (PARTITION BY email) as row_num
FROM Person
) as r
WHERE r.row_num=2;
Modified Solution:
SELECT email FROM Person
GROUP BY email
HAVING COUNT(email)>1;
Modified Solution Explained:
Traditionally, we would use `WHERE` to filter rows on a condition. However, `WHERE`
cannot be used with aggregate functions, while `HAVING` can. This solution makes use
of the `HAVING` statement, which drastically improves readability and performance.
Problem: 1327. List the Products Ordered in a Period
Intuition:
It's actually a compound question that we need to tackle step by step. We can break it
into the following parts:
1. Find orders that occurred in 2020-02
2. Find the products that have at least 100 ordered units
3. Join the table to get the name of the product
Approach:
Step1: Find orders that occurred in 2020-02
SELECT * FROM Orders
WHERE YEAR(order_date)='2020' and MONTH(order_date)='2';
Step2: Find the products that have at least 100 ordered units
SELECT product_id, SUM(unit) AS unit FROM Orders
GROUP BY product_id
HAVING SUM(unit)>=100
Step3: Join the table to get the name of the product
Solution:
SELECT p.product_name, o.unit FROM Products as p
INNER JOIN (
SELECT product_id, SUM(unit) as unit FROM Orders
WHERE YEAR(order_date)='2020' and MONTH(order_date)='2'
GROUP BY product_id
HAVING SUM(unit)>=100
) AS o
ON p.product_id=o.product_id;
Problem: 1050. Actors and Directors Who Cooperated At Least Three Times
Intuition:
Because we have to find unique pairs of (actor_id, director_id), we can build each pair
as a string. Once we have this string, we can use `GROUP BY` and `HAVING` to
count the occurrences
Approach:
We use `CONCAT` to concatenate two columns, and use additional comma to serve as the
separator
SELECT CONCAT(actor_id, ',', director_id) AS ad
FROM ActorDirector
Next, we can count the occurrences
SELECT CONCAT(actor_id, ',', director_id) AS ad
FROM ActorDirector
GROUP BY CONCAT(actor_id, ',', director_id)
HAVING COUNT(CONCAT(actor_id, ',', director_id)) >=3
Now, we have the result, but we must split it into two columns instead of one. We can
use `SUBSTRING_INDEX` statement here. So it will act like
SELECT SUBSTRING_INDEX(r.ad, ',', 1) AS actor_id, SUBSTRING_INDEX(r.ad, ',', -1) AS director_id
FROM r
Last step, we have to convert the values back to integers; we can use the `CAST` statement here
SELECT CAST(actor_id AS UNSIGNED), CAST(director_id AS UNSIGNED)
FROM r
Solution:
SELECT CAST(SUBSTRING_INDEX(r.ad, ',', 1) AS UNSIGNED) AS actor_id, CAST(SUBSTRING_INDEX(r.ad, ',', -1) AS UNSIGNED) AS director_id FROM (
SELECT CONCAT(actor_id, ',', director_id) AS ad
FROM ActorDirector
GROUP BY CONCAT(actor_id, ',', director_id)
HAVING COUNT(CONCAT(actor_id, ',', director_id)) >=3
) AS r
Modified Solution:
SELECT actor_id, director_id FROM ActorDirector
GROUP BY actor_id, director_id
HAVING COUNT(timestamp)>=3;
Modified Solution Explained:
This solution utilizes the timestamp column as a way to count the occurrences, which is
clever. However, if the table only had the actor_id and director_id columns, this exact
query wouldn't work (counting with `COUNT(*)` would be the safer choice)
Problem: 584. Find Customer Referee
Intuition:
Trivial
Solution:
SELECT name FROM Customer
WHERE referee_id != '2' OR referee_id is NULL;
Problem: 626. Exchange Seats
Intuition:
This problem is a little bit complicated. At first, I wanted to tackle it using
`CASE WHEN`, but that turned out to be difficult. So I came up with an idea that
separates the odd and even rows. It works like this
1. Filter out the odd rows and give them row numbers
2. Filter out the even rows and give them row numbers
3. Combine two tables and order by id and row numbers
By doing this, we can then swap the two consecutive rows
Approach:
Step1
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) AS rown
FROM Seat
WHERE id % 2 = 1
Step2
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) AS rown
FROM Seat
WHERE id % 2 = 0
Step3
SELECT TABLE1
UNION ALL
SELECT TABLE2
ORDER BY rown ASC, id DESC
Step4: This step creates a new id column for the table. Note that we must
explicitly declare the `ORDER BY` inside the `ROW_NUMBER()` statement:
when it is not declared explicitly, the result can differ depending on the
configuration of the SQL engine.
SELECT ROW_NUMBER() OVER (PARTITION BY (SELECT 1) ORDER BY r.rown ASC, r.id DESC) AS id,
r.student
FROM RESULT_TABLE AS r
Solution:
SELECT ROW_NUMBER() OVER (PARTITION BY (SELECT 1) ORDER BY r.rown ASC, r.id DESC) AS id,
r.student
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) AS rown
FROM Seat
WHERE id % 2 = 1
UNION ALL
SELECT *,
ROW_NUMBER() OVER (ORDER BY id) AS rown
FROM Seat
WHERE id % 2 = 0
ORDER BY rown ASC, id DESC
) AS r
Modified Solution:
SELECT
CASE
WHEN id = (SELECT MAX(id) FROM Seat) AND id % 2 = 1 THEN id
WHEN id % 2 = 1 THEN id + 1
WHEN id % 2 = 0 THEN id - 1
END AS id,
student
FROM Seat
ORDER BY id;
Modified Solution Explained:
This solution makes use of the `CASE WHEN` statement, which improves readability,
and its logic is simpler. However, the original solution turned out to be more
efficient.
Problem: 181. Employees Earning More Than Their Managers
Intuition:
It's a simple join problem that requires you to join the same table
Solution:
SELECT e1.name AS employee FROM Employee as e1
INNER JOIN Employee as e2
ON e1.managerId = e2.id
WHERE e1.salary > e2.salary;
Problem: 183. Customers Who Never Order
Intuition:
Trivial
Solution:
SELECT c.name AS customers FROM Customers AS c
LEFT OUTER JOIN Orders AS o
ON c.id = o.customerId
WHERE o.id is NULL;
Problem: 1729. Find Followers Count
Intuition:
Trivial
Solution:
SELECT user_id, COUNT(follower_id) AS followers_count FROM Followers
GROUP BY user_id
ORDER BY user_id ASC;
Problem: 1581. Customer Who Visited but Did Not Make Any Transactions
Intuition:
A simple problem requiring a JOIN operation and some basic syntax
Solution:
SELECT v.customer_id, COUNT(*) AS count_no_trans FROM Visits AS v
LEFT OUTER JOIN Transactions AS t
ON v.visit_id=t.visit_id
WHERE t.transaction_id is NULL
GROUP BY v.customer_id;
Problem: 595. Big Countries
Intuition:
Trivial
Solution:
SELECT name, population, area FROM World
WHERE area>=3000000 or population>=25000000;
Problem: 1661. Average Time of Process per Machine
Intuition:
This problem asks us to calculate the average processing time for each machine.
Each process has a 'start' row and an 'end' row, so the first task is to compute
the difference between these paired rows, which we do together with
`GROUP BY machine_id, process_id`. Then, with that per-process duration in hand,
we compute the average using `GROUP BY machine_id`, and we are done
Approach:
Firstly, we use `CASE WHEN` to create a new column. This new column negates the
'start' timestamps. By doing this, we can then add the start and end timestamps
to compute the difference
CASE activity_type
WHEN 'start' THEN ROUND(-timestamp, 3)
WHEN 'end' THEN ROUND(timestamp, 3)
END
Next, we can compute the total process time based on each process of each machine
SELECT machine_id,
SUM(
CASE activity_type
WHEN 'start' THEN ROUND(-timestamp, 3)
WHEN 'end' THEN ROUND(timestamp, 3)
END
) AS nn
FROM Activity
GROUP BY machine_id, process_id
Now we can compute the average process time for each machine
SELECT r.machine_id, ROUND(AVG(r.nn), 3) AS processing_time FROM (
SUBTABLE
) AS r
GROUP BY r.machine_id;
Solution:
SELECT r.machine_id, ROUND(AVG(r.nn), 3) AS processing_time FROM (
SELECT machine_id,
SUM(
CASE activity_type
WHEN 'start' THEN ROUND(-timestamp, 3)
WHEN 'end' THEN ROUND(timestamp, 3)
END
) AS nn
FROM Activity
GROUP BY machine_id, process_id
) AS r
GROUP BY r.machine_id;
Modified Solution:
SELECT a1.machine_id, ROUND(AVG(a2.timestamp-a1.timestamp), 3) AS processing_time FROM Activity AS a1
INNER JOIN Activity AS a2
ON a1.machine_id=a2.machine_id AND a1.process_id=a2.process_id AND a1.activity_type='start' AND a2.activity_type='end'
GROUP BY a1.machine_id;
Modified Solution Explained:
This solution makes use of INNER JOIN to separate start and end timestamp. So the first step
is
SELECT * FROM Activity AS a1
INNER JOIN Activity AS a2
ON a1.machine_id=a2.machine_id AND a1.process_id=a2.process_id;
By doing this, the number of rows for each (machine_id, process_id) pair grows from 2
to 4, representing all start/end combinations. Next, we keep only the rows we want to
compute the difference on; this is where
a1.activity_type='start' AND a2.activity_type='end'
comes in. The remaining query is then trivial.
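The self-join pairing of start and end rows can be sketched as below, using Python's sqlite3 in place of MySQL with one invented machine and two processes.

```python
import sqlite3

# Each (machine, process) has exactly one 'start' and one 'end' row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Activity (machine_id INT, process_id INT, activity_type TEXT, timestamp REAL)"
)
conn.executemany("INSERT INTO Activity VALUES (?, ?, ?, ?)", [
    (0, 0, "start", 0.712), (0, 0, "end", 1.520),
    (0, 1, "start", 3.140), (0, 1, "end", 4.120),
])
rows = conn.execute("""
    SELECT a1.machine_id,
           ROUND(AVG(a2.timestamp - a1.timestamp), 3) AS processing_time
    FROM Activity AS a1
    JOIN Activity AS a2
      ON a1.machine_id = a2.machine_id AND a1.process_id = a2.process_id
     AND a1.activity_type = 'start' AND a2.activity_type = 'end'
    GROUP BY a1.machine_id
""").fetchall()
print(rows)
```

The join condition keeps exactly one (start, end) row per process, so AVG sees one duration per process: here (1.520 - 0.712 + 4.120 - 3.140) / 2 ≈ 0.894.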
Problem: 1204. Last Person to Fit in the Bus
Intuition:
This is a classical problem. The most important part is how to implement the
cumulative-sum-like operation. There are two methods: a correlated query and variables
Correlated Query:
SELECT q1.person_name
FROM Queue AS q1
WHERE (SELECT SUM(q2.weight) FROM Queue as q2 WHERE q2.turn <= q1.turn) <= 1000
ORDER BY q1.turn DESC LIMIT 1;
Correlated Query Explained:
First, with the query
SELECT *,
(
SELECT SUM(q2.weight) FROM Queue as q2
WHERE q2.turn <= q1.turn
) AS cumulative_weight
FROM Queue AS q1;
we can find the cumulative weight of the original column. It works because of the
condition `WHERE q2.turn <= q1.turn`. Next, we can filter the rows whose cumulative
weight is no greater than 1000
SELECT q1.person_name
FROM Queue AS q1
WHERE (SELECT SUM(q2.weight) FROM Queue as q2 WHERE q2.turn <= q1.turn) <= 1000;
In the end, we use reverse order to find the last person
ORDER BY q1.turn DESC LIMIT 1;
Variables:
This method makes use of MySQL variables that store the running cumulative weight; the
query is as follows
SELECT person_name, turn,
@running_total := @running_total + q.weight AS cum
FROM Queue AS q
INNER JOIN (SELECT @running_total := 0) AS r
ORDER BY q.turn ASC
The `INNER JOIN` part is just used to initialize the variable every time the query runs.
After acquiring the cumulative weight column, the following steps are trivial.
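Beyond the correlated query and variables, a third route to the running total is the SUM() OVER window function (available in MySQL 8+ and SQLite 3.25+). Below is a minimal sketch using Python's sqlite3, with an invented queue whose cumulative weights are 250, 600, 1000, 1200, ...; the last rider at or under 1000 is the answer.

```python
import sqlite3

# Queue rows in arbitrary order; turn defines the boarding sequence.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Queue (person_id INT, person_name TEXT, weight INT, turn INT)"
)
conn.executemany("INSERT INTO Queue VALUES (?, ?, ?, ?)", [
    (5, "Alice", 250, 1), (4, "Bob", 175, 5), (3, "Alex", 350, 2),
    (6, "John Cena", 400, 3), (1, "Winston", 500, 6), (2, "Marie", 200, 4),
])
row = conn.execute("""
    SELECT person_name FROM (
        SELECT person_name, turn,
               SUM(weight) OVER (ORDER BY turn) AS cum  -- running total by turn
        FROM Queue
    ) WHERE cum <= 1000
    ORDER BY turn DESC LIMIT 1
""").fetchone()
print(row)  # → ('John Cena',)
```

The window frame defaults to "all rows up to the current one", which is exactly the cumulative sum the correlated subquery computes, but in a single scan.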
Problem: 607. Sales Person
Intuition:
Though the problem seems complicated, we can break it into parts and tackle it step by step
1. Find the orders that have an interaction with the RED company
2. Find which sales_id have an interaction with these orders
3. Drop the sales_id found in step 2; the remaining ones are the answer
Approach:
Step1: Find the orders that have interaction with RED company
SELECT * FROM Orders
WHERE com_id IN (
SELECT com_Id FROM Company
WHERE name = "RED"
)
Step2: Find which sales_id have interaction with these orders
SELECT s.name FROM SalesPerson AS s
LEFT OUTER JOIN (SUBTABLE) AS oc
ON s.sales_id=oc.sales_id
Step3: Drop sales_id in step2, the remaining is the answer
WHERE oc.order_id IS NULL;
Solution:
SELECT s.name FROM SalesPerson AS s
LEFT OUTER JOIN (
SELECT * FROM Orders
WHERE com_id IN (
SELECT com_Id FROM Company
WHERE name = "RED"
)) AS oc
ON s.sales_id=oc.sales_id
WHERE oc.order_id IS NULL;
Problem: 586. Customer Placing the Largest Number of Orders
Intuition:
When facing this problem, we immediately think of `GROUP BY` combined with `COUNT`.
Once that first task is finished, we use `ORDER BY` together with `LIMIT` to find
the customer_number that has the largest number of orders
Solution:
SELECT r.customer_number FROM(
SELECT customer_number,
COUNT(*) AS order_num
FROM orders
GROUP BY customer_number
) AS r
ORDER BY r.order_num DESC LIMIT 1;
Problem: 2988. Manager of the Largest Department
Intuition:
We can solve this problem step by step with the following
1. Find out how many employees are there in the largest department
2. Filter out the department that has the most employees
3. Find out their managers
Approach:
1: Find out how many employees are there in the largest department
SELECT COUNT(*) AS num FROM Employees
GROUP BY dep_id
ORDER BY COUNT(*) DESC LIMIT 1;
2: Filter out the department that has the most employees
SELECT dep_id FROM Employees
GROUP BY dep_id
HAVING COUNT(*) = (SUBQUERY)
3: Find out their managers
Here, I make use of INNER JOIN to filter out the rows we need. This query does two
things: it finds the employees who belong to the largest department and whose
position is manager
SELECT e.emp_name AS manager_name, e.dep_id FROM Employees AS e
INNER JOIN (SUBTABLE) AS d
WHERE e.dep_id=d.dep_id AND e.position='Manager'
ORDER BY e.dep_id ASC;
Solution:
SELECT e.emp_name AS manager_name, e.dep_id FROM Employees AS e
INNER JOIN (
SELECT dep_id FROM Employees
GROUP BY dep_id
HAVING COUNT(*) = (
SELECT COUNT(*) AS num FROM Employees
GROUP BY dep_id
ORDER BY COUNT(*) DESC LIMIT 1
)) AS d
WHERE e.dep_id=d.dep_id AND e.position='Manager'
ORDER BY e.dep_id ASC;
Modified Solution:
WITH largest(dep_id, rnk) AS (
SELECT dep_id,
RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
FROM Employees
GROUP BY dep_id
)
SELECT emp_name AS manager_name, dep_id FROM Employees
WHERE position='manager' AND dep_id IN (
SELECT dep_id FROM largest
WHERE rnk='1'
)
ORDER BY dep_id ASC;
Modified Solution Explained:
This solution provides an alternative approach using a Common Table Expression (CTE),
which offers better readability
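The CTE + RANK() pattern from the modified solution runs as-is on any engine with window functions. This is a minimal sketch using Python's sqlite3 (which also supports CTEs and RANK) with an invented org chart where department 1 is the largest.

```python
import sqlite3

# Department 1 has three employees (largest); department 2 has one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees (emp_id INT, emp_name TEXT, dep_id INT, position TEXT)"
)
conn.executemany("INSERT INTO Employees VALUES (?, ?, ?, ?)", [
    (1, "Dave", 1, "Manager"), (2, "Eve", 1, "Engineer"),
    (3, "Adam", 1, "Engineer"), (4, "Sara", 2, "Manager"),
])
rows = conn.execute("""
    WITH largest AS (
        SELECT dep_id, RANK() OVER (ORDER BY COUNT(*) DESC) AS rnk
        FROM Employees GROUP BY dep_id
    )
    SELECT emp_name AS manager_name, dep_id FROM Employees
    WHERE position = 'Manager'
      AND dep_id IN (SELECT dep_id FROM largest WHERE rnk = 1)
    ORDER BY dep_id
""").fetchall()
print(rows)  # → [('Dave', 1)]
```

RANK (rather than ROW_NUMBER) matters here: if two departments tie for the largest head count, both get rank 1 and both managers are returned.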
Problem: 1077. Project Employees III