Lecture 13 - Advanced SQL
Announcements
Homework 3 to be posted later today or by
tomorrow the latest, due next week on thursday
Homework 4 to be released tomorrow
Lecture Exercise 12 to be posted at 2pm today
Today’s lecture
Review of basic SQL
Advanced SQL
displaylimit: Value None will be treated as 0 (no limit)
Connecting to 'baking'
SELECT … FROM WHERE GROUP BY HAVING …
UNION/EXCEPT/INTERSECT
SELECT … FROM WHERE GROUP BY HAVING …
Running query in 'baking'
9 rows affected.
| baker |
episodeid |
result |
| Manon |
1 |
star baker |
| Rahul |
2 |
star baker |
| Rahul |
3 |
star baker |
| Dan |
4 |
star baker |
| Kim-Joy |
5 |
star baker |
| Briony |
6 |
star baker |
| Kim-Joy |
7 |
star baker |
| Ruby |
8 |
star baker |
| Ruby |
9 |
star baker |
Running query in 'baking'
6 rows affected.
| baker |
numepisodes |
| Kim-Joy |
2 |
| Rahul |
2 |
| Ruby |
2 |
| Briony |
1 |
| Dan |
1 |
| Manon |
1 |
Running query in 'baking'
15 rows affected.
| baker |
baker_1 |
episodeid |
| Antony |
None |
None |
| Briony |
Briony |
6 |
| Dan |
Dan |
4 |
| Imelda |
None |
None |
| Jon |
None |
None |
| Karen |
None |
None |
| Kim-Joy |
Kim-Joy |
5 |
| Kim-Joy |
Kim-Joy |
7 |
| Luke |
None |
None |
| Manon |
Manon |
1 |
| Rahul |
Rahul |
3 |
| Rahul |
Rahul |
2 |
| Ruby |
Ruby |
8 |
| Ruby |
Ruby |
9 |
| Terry |
None |
None |
Running query in 'baking'
12 rows affected.
| baker |
numepisodes |
| Antony |
0 |
| Briony |
1 |
| Dan |
1 |
| Imelda |
0 |
| Jon |
0 |
| Karen |
0 |
| Kim-Joy |
2 |
| Luke |
0 |
| Manon |
1 |
| Rahul |
2 |
| Ruby |
2 |
| Terry |
0 |
Running query in 'baking'
15 rows affected.
| baker |
ep1 |
result1 |
ep2 |
result2 |
| Antony |
None |
None |
3 |
eliminated |
| Briony |
6 |
star baker |
9 |
eliminated |
| Dan |
4 |
star baker |
6 |
eliminated |
| Imelda |
None |
None |
1 |
eliminated |
| Jon |
None |
None |
7 |
eliminated |
| Karen |
None |
None |
5 |
eliminated |
| Kim-Joy |
5 |
star baker |
None |
None |
| Kim-Joy |
7 |
star baker |
None |
None |
| Luke |
None |
None |
2 |
eliminated |
| Manon |
1 |
star baker |
8 |
eliminated |
| Rahul |
3 |
star baker |
None |
None |
| Rahul |
2 |
star baker |
None |
None |
| Ruby |
8 |
star baker |
None |
None |
| Ruby |
9 |
star baker |
None |
None |
| Terry |
None |
None |
5 |
eliminated |
Running query in 'baking'
12 rows affected.
| baker |
numwins |
numeliminated |
| Antony |
0 |
1 |
| Briony |
1 |
1 |
| Dan |
1 |
1 |
| Imelda |
0 |
1 |
| Jon |
0 |
1 |
| Karen |
0 |
1 |
| Kim-Joy |
2 |
0 |
| Luke |
0 |
1 |
| Manon |
1 |
1 |
| Rahul |
2 |
0 |
| Ruby |
2 |
0 |
| Terry |
0 |
1 |
Running query in 'baking'
3 rows affected.
Scalar queries
Running query in 'baking'
1 rows affected.
Running query in 'baking'
1 rows affected.
Running query in 'baking'
1 rows affected.
| avg |
min |
max |
| 36.4166666666666667 |
26 |
60 |
Running query in 'baking'
3 rows affected.
| baker |
fullname |
age |
occupation |
hometown |
| Jon |
Jon Jenkins |
47 |
Blood courier |
Newport |
| Karen |
Karen Wright |
60 |
In-store sampling assistant |
Wakefield |
| Terry |
Terry Hartill |
56 |
Retired air steward |
West Midlands |
Running query in 'baking'
12 rows affected.
| baker |
age |
avgage |
greatest |
| Antony |
30 |
36.42 |
0 |
| Briony |
33 |
36.42 |
0 |
| Dan |
36 |
36.42 |
0 |
| Imelda |
33 |
36.42 |
0 |
| Jon |
47 |
36.42 |
10.58 |
| Karen |
60 |
36.42 |
23.58 |
| Kim-Joy |
27 |
36.42 |
0 |
| Luke |
30 |
36.42 |
0 |
| Manon |
26 |
36.42 |
0 |
| Rahul |
30 |
36.42 |
0 |
| Ruby |
29 |
36.42 |
0 |
| Terry |
56 |
36.42 |
19.58 |
Queries that return a single value, but a set of tuples
value IN (SET – results of a query)
value NOT IN (SET – results of a query)
value >= ALL (SET – results of a query)
value < ALL (SET – results of a query)
value >= ANY (SET – results of a query)
value = ANY (SET – results of a query)
value <> ANY (SET – results of a query)
EXISTS (SET – results of a query)
– true if there are tuples returned by the query
NOT EXISTS (SET – results of a query)
– true if there are no tuples returned by the inner query
Running query in 'baking'
6 rows affected.
| baker |
fullname |
age |
occupation |
hometown |
| Briony |
Briony Williams |
33 |
Full-time parent |
Bristol |
| Dan |
Dan Beasley-Harling |
36 |
Full-time parent |
London |
| Imelda |
Imelda McCarron |
33 |
Countryside recreation officer |
County Tyrone |
| Jon |
Jon Jenkins |
47 |
Blood courier |
Newport |
| Karen |
Karen Wright |
60 |
In-store sampling assistant |
Wakefield |
| Terry |
Terry Hartill |
56 |
Retired air steward |
West Midlands |
Running query in 'baking'
4 rows affected.
Running query in 'baking'
3 rows affected.
| baker |
age |
hometown |
| Jon |
47 |
Newport |
| Karen |
60 |
Wakefield |
| Terry |
56 |
West Midlands |
Running query in 'baking'
3 rows affected.
| baker |
age |
hometown |
| Jon |
47 |
Newport |
| Karen |
60 |
Wakefield |
| Terry |
56 |
West Midlands |
Running query in 'baking'
3 rows affected.
| baker |
fullname |
age |
occupation |
hometown |
| Kim-Joy |
Kim-Joy Hewlett |
27 |
Mental health specialist |
Leeds |
| Rahul |
Rahul Mandal |
30 |
Research scientist |
Rotherham |
| Ruby |
Ruby Bhogal |
29 |
Project manager |
London |
Running query in 'baking'
6 rows affected.
| baker |
fullname |
age |
occupation |
hometown |
| Antony |
Antony Amourdoux |
30 |
Banker |
London |
| Imelda |
Imelda McCarron |
33 |
Countryside recreation officer |
County Tyrone |
| Jon |
Jon Jenkins |
47 |
Blood courier |
Newport |
| Karen |
Karen Wright |
60 |
In-store sampling assistant |
Wakefield |
| Luke |
Luke Thompson |
30 |
Civil servant/house and techno DJ |
Sheffield |
| Terry |
Terry Hartill |
56 |
Retired air steward |
West Midlands |
Running query in 'baking'
3 rows affected.
3 rows affected.
3 rows affected.
| baker |
?column? |
| Rahul |
3 |
| Ruby |
3 |
| Kim-Joy |
3 |
Running query in 'baking'
5 rows affected.
| QUERY PLAN |
| Seq Scan on bakers (cost=13.25..24.63 rows=55 width=70) |
| Filter: (NOT (ANY ((baker)::text = ((hashed SubPlan 1).col1)::text))) |
| SubPlan 1 |
| -> Seq Scan on results (cost=0.00..13.25 rows=1 width=218) |
| Filter: ((result)::text = 'eliminated'::text) |
Running query in 'baking'
63 rows affected.
| baker |
episodeid |
| Kim-Joy |
2 |
| Manon |
7 |
| Terry |
5 |
| Jon |
1 |
| Ruby |
5 |
| Briony |
3 |
| Karen |
2 |
| Kim-Joy |
10 |
| Briony |
7 |
| Rahul |
8 |
| Antony |
3 |
| Briony |
9 |
| Manon |
3 |
| Briony |
4 |
| Jon |
7 |
| Antony |
1 |
| Manon |
1 |
| Dan |
1 |
| Jon |
3 |
| Manon |
4 |
| Dan |
4 |
| Luke |
1 |
| Manon |
6 |
| Dan |
6 |
| Rahul |
1 |
| Briony |
6 |
| Terry |
2 |
| Kim-Joy |
5 |
| Rahul |
9 |
| Ruby |
10 |
| Karen |
5 |
| Jon |
6 |
| Ruby |
2 |
| Rahul |
3 |
| Manon |
8 |
| Kim-Joy |
6 |
| Ruby |
1 |
| Jon |
5 |
| Terry |
1 |
| Ruby |
4 |
| Ruby |
9 |
| Manon |
5 |
| Dan |
5 |
| Rahul |
10 |
| Ruby |
7 |
| Rahul |
2 |
| Ruby |
3 |
| Briony |
5 |
| Terry |
3 |
| Kim-Joy |
8 |
| Kim-Joy |
7 |
| Antony |
2 |
| Manon |
2 |
| Briony |
2 |
| Karen |
3 |
| Ruby |
8 |
| Luke |
2 |
| Kim-Joy |
3 |
| Karen |
1 |
| Kim-Joy |
1 |
| Imelda |
1 |
| Kim-Joy |
4 |
| Karen |
4 |
Running query in 'baking'
63 rows affected.
| baker |
episodeid |
| Antony |
1 |
| Dan |
1 |
| Imelda |
1 |
| Jon |
1 |
| Karen |
1 |
| Kim-Joy |
1 |
| Luke |
1 |
| Manon |
1 |
| Rahul |
1 |
| Ruby |
1 |
| Terry |
1 |
| Antony |
2 |
| Briony |
2 |
| Karen |
2 |
| Kim-Joy |
2 |
| Luke |
2 |
| Manon |
2 |
| Rahul |
2 |
| Ruby |
2 |
| Terry |
2 |
| Antony |
3 |
| Briony |
3 |
| Jon |
3 |
| Karen |
3 |
| Kim-Joy |
3 |
| Manon |
3 |
| Rahul |
3 |
| Ruby |
3 |
| Terry |
3 |
| Briony |
4 |
| Dan |
4 |
| Karen |
4 |
| Kim-Joy |
4 |
| Manon |
4 |
| Ruby |
4 |
| Briony |
5 |
| Dan |
5 |
| Jon |
5 |
| Karen |
5 |
| Kim-Joy |
5 |
| Manon |
5 |
| Ruby |
5 |
| Terry |
5 |
| Briony |
6 |
| Dan |
6 |
| Jon |
6 |
| Kim-Joy |
6 |
| Manon |
6 |
| Briony |
7 |
| Jon |
7 |
| Kim-Joy |
7 |
| Manon |
7 |
| Ruby |
7 |
| Kim-Joy |
8 |
| Manon |
8 |
| Rahul |
8 |
| Ruby |
8 |
| Briony |
9 |
| Rahul |
9 |
| Ruby |
9 |
| Kim-Joy |
10 |
| Rahul |
10 |
| Ruby |
10 |
Running query in 'baking'
3 rows affected.
Running query in 'baking'
3 rows affected.
Running query in 'baking'
7 rows affected.
| QUERY PLAN |
| Nested Loop Anti Join (cost=0.00..9659.32 rows=55 width=38) |
| Join Filter: (NOT (ANY ((e.id = (hashed SubPlan 2).col1) AND ((b.baker)::text = (hashed SubPlan 2).col2)))) |
| -> Seq Scan on bakers b (cost=0.00..11.10 rows=110 width=38) |
| -> Materialize (cost=0.00..11.20 rows=80 width=4) |
| -> Seq Scan on episodes e (cost=0.00..10.80 rows=80 width=4) |
| SubPlan 2 |
| -> Seq Scan on technicals t (cost=0.00..1.75 rows=75 width=36) |
Running query in 'baking'
3 rows affected.
Running query in 'baking'
7 rows affected.
| QUERY PLAN |
| HashAggregate (cost=13.13..13.29 rows=1 width=5) |
| Group Key: technicals.baker |
| Filter: (count(*) = (InitPlan 1).col1) |
| InitPlan 1 |
| -> Aggregate (cost=11.00..11.01 rows=1 width=8) |
| -> Seq Scan on episodes (cost=0.00..10.80 rows=80 width=0) |
| -> Seq Scan on technicals (cost=0.00..1.75 rows=75 width=5) |
Final notes:
Learn to write queries in many different ways
Using subqueries don’t often save cost, but may be
necessary for some queries.
Using subqueries may not significant increase query cost as long as the subquery is uncorrelated, i.e. runs independent of the outer query. In this case, the query can be executed once and results can be saved/cached.
Correlated subqueries may significantly increase cost, depending on the complexity.
You can see the cost of a query in Postgres using the explain keyword.
If you are using IN or EXISTS with a subquery, often that query can be rewritten without a subquery involving a join (though not always, especially if you are counting things!).
Don’t forget, count and left join are very powerful constructs and can help you solve many queries.