Lecture 10: ER wrap up and SQL#

Announcements#

  • No new hw or lecture exercise today, lecture exercises will resume on monday

  • Grading to be completed sometime next week, we will grade hw#2 after that

  • Now is the time start installing a db on your machine if you want to get the experience

Topics for today#

  • Hierarchies in ER Diagrams (students, faculty, staff or car,truck,motorcycle)

  • Introduction to SQL

    • DBMS install

    • standards

    • bag standards

    • control flow

    • case insensitive, special characters, comments

    • keywords!

    • SELECT, distinct, attributes/functions

    • WHERE like, null values

    • functions (strongly typed): string, numerical, math, utilities

    • date math

x x x x

SQL#

Standard Query Language - Industry Standard

We will use V17 in this class

SQL Implements bag semantics, i.e. multiple copies of the same tuple is allowed in a table unless there is a primary key.

SQL is case insensitive

SELECT … FROM … WHERE … ;
–semicolon marks the end of a query

load_ext sql

sql

%sql --section baking
Connecting to 'baking'
%%sql

SELECT 
   baker
    , baker
    , upper(baker) as upname
    , age
    , 5 as season
    ,baker ||' ' ||  occupation as bakerwithoc
    , baker || cast(age as varchar) || age::varchar
FROM 
   bakers;
Running query in 'baking'
12 rows affected.
baker baker_1 upname age season bakerwithoc ?column?
Antony Antony ANTONY 30 5 Antony Banker Antony3030
Briony Briony BRIONY 33 5 Briony Full-time parent Briony3333
Dan Dan DAN 36 5 Dan Full-time parent Dan3636
Imelda Imelda IMELDA 33 5 Imelda Countryside recreation officer Imelda3333
Jon Jon JON 47 5 Jon Blood courier Jon4747
Karen Karen KAREN 60 5 Karen In-store sampling assistant Karen6060
Kim-Joy Kim-Joy KIM-JOY 27 5 Kim-Joy Mental health specialist Kim-Joy2727
Luke Luke LUKE 30 5 Luke Civil servant/house and techno DJ Luke3030
Manon Manon MANON 26 5 Manon Software project manager Manon2626
Rahul Rahul RAHUL 30 5 Rahul Research scientist Rahul3030
Truncated to displaylimit of 10.
%%sql

SELECT DISTINCT  -- remove the copies in the result
   hometown
FROM
   bakers
WHERE
   age < 35 ;
Running query in 'baking'
6 rows affected.
hometown
London
Leeds
Rotherham
County Tyrone
Sheffield
Bristol
%%sql
SELECT 
    * 
FROM 
    bakers 
WHERE 
    hometown = 'London';
Running query in 'baking'
4 rows affected.
baker fullname age occupation hometown
Antony Antony Amourdoux 30 Banker London
Dan Dan Beasley-Harling 36 Full-time parent London
Manon Manon Lagrève 26 Software project manager London
Ruby Ruby Bhogal 29 Project manager London
%%sql

SELECT
   *
FROM
   bakers
WHERE
   lower(occupation) LIKE '%full%'
;
Running query in 'baking'
2 rows affected.
baker fullname age occupation hometown
Briony Briony Williams 33 Full-time parent Bristol
Dan Dan Beasley-Harling 36 Full-time parent London
%%sql

SELECT
   baker, episodeid
FROM
   signatures
WHERE
   lower(make) like '%ginger%'
;
Running query in 'baking'
12 rows affected.
baker episodeid
Luke 1
Terry 1
Briony 5
Dan 5
Jon 5
Karen 5
Kim-Joy 5
Manon 5
Rahul 5
Ruby 5
Truncated to displaylimit of 10.
%%sql
select 
   * 
from 
   episodes
where 
   firstaired > '2018-09-18'::date;
Running query in 'baking'
6 rows affected.
id title firstaired viewers7day signature technical showstopper
5 Spice 2018-09-25 8.67 Ginger Cake 12 Ma'amoul Spiced Biscuit Chandelier
6 Pastry 2018-10-02 9.3 12 Samosas 6 Puits D'amour Shaped Banquet Pie
7 Vegan 2018-10-09 9.54 8 Savoury Vegan Tartlets Vegan Tropical Fruit Pavlova Vegan Celebratory Cake
8 Danish 2018-10-16 9.69 2 Smørrebrød 14 Æbleskiver Kagemand/Kagekone
9 Pâtisserie (Semi-final) 2018-10-23 9.5 24 Madeleines Torta Setteveli Parisian Pâtisserie Window
10 Final 2018-10-30 10.34 12 Iced Doughnuts 6 Campfire Pita Breads Landscape Dessert
%%sql
select 
   id
    , title
    , firstaired
    , '2018-10-30'::date-firstaired as numdays
from 
   episodes    
;
Running query in 'baking'
10 rows affected.
id title firstaired numdays
1 Biscuits 2018-08-28 63
2 Cakes 2018-09-04 56
3 Bread 2018-09-11 49
4 Desserts 2018-09-18 42
5 Spice 2018-09-25 35
6 Pastry 2018-10-02 28
7 Vegan 2018-10-09 21
8 Danish 2018-10-16 14
9 Pâtisserie (Semi-final) 2018-10-23 7
10 Final 2018-10-30 0
Truncated to displaylimit of 10.

Date operations:

date - date = days

date + time = timestamp

%%sql
select '2025-10-02'::date ;
        
select '13:32'::time;
Running query in 'baking'
1 rows affected.
1 rows affected.
time
13:32:00
%%sql
select '2025-10-02'::date + '13:32'::time;
Running query in 'baking'
1 rows affected.
?column?
2025-10-02 13:32:00
%%sql
select ('13:50'::time - '13:32'::time);
                
Running query in 'baking'
1 rows affected.
?column?
0:18:00
%%sql
select now();
Running query in 'baking'
1 rows affected.
now
2025-10-06 22:54:52.396769-04:00
%%sql
select random();
Running query in 'baking'
1 rows affected.
random
0.45209653736957556

Null values#

Means there is no value for an attribute

  • there is currently no value

  • there is a value but it is unknown

  • it is not known if there is a value or not

%%sql

create table tmp(id1 int, id2 int);
insert into tmp values(1,2);
insert into tmp values(3,NULL);
insert into tmp values(4,NULL);
insert into tmp values(NULL,5);
insert into tmp values(NULL,NULL);
Running query in 'baking'
RuntimeError: (psycopg2.errors.DuplicateTable) relation "tmp" already exists

[SQL: create table tmp(id1 int, id2 int);]
(Background on this error at: https://sqlalche.me/e/20/f405)
If you need help solving this issue, send us a message: https://ploomber.io/community
%%sql
select * from tmp;
Running query in 'baking'
5 rows affected.
id1 id2
1 2
3 None
4 None
None 5
None None
%%sql
select * from tmp
where id2 > 2;
Running query in 'baking'
1 rows affected.
id1 id2
1 2

5 = NULL - evaluates to UNKNOWN NULL<5 - evaluates to UNKNOWN

I only return the tuples that evaluate to True

NOT (UNKNOWN) = UNKNOWN

TRUE AND UNKNOWN = UNKNOWN
FALSE AND UNKNOWN = FALSE
TRUE OR UNKNOWN = TRUE
FALSE OR UNKNOWN = UNKNOWN

%%sql 

select * 
from tmp
where
    id2>2
    or id2 is NULL;
Running query in 'baking'
4 rows affected.
id1 id2
3 None
4 None
None 5
None None