Lecture 3 ------------ - Office hours are now posted - Hw#1 to be posted soon - Lecture Exercise 2 after class, 48 hours to complete Relational Algebra A set of algebra operations that take as input a set of tuples and produce as output a new set of tuples (a new relation) --------- Operators: 1. Selection: SELECT_C(R) = {tuples t in R that satisfy the Boolean condition C} Note that selection does not change the data model (schema) but the tuples in R. For selection to be valid, C must be a condition over the attributes in R. 2. Projection: PROJECT_{A1,..,An} (R) = {tuples in t containing only values for the attributes A1,..,An} Projection is only valid is A1,..,An are attributes in R. Projection does not change the tuples in the relation, but their data model. Since relations are sets of tuples, then there is an implicit removal of duplicate tuples. Set operations Two relations R,S are set compatible, if they have the same exact attributes. Set operations are only defined if two relations are set compatible. 3. Set Union R UNION S = {set of all tuples in S OR in R if S and R are set compatible} 4. Set Difference R - S = {set of all tuples in R that are not in S, if R and S are set compatible} 5. Set Intersection R INTERSECT S = {set of all tuples in S AND in R if S and R are set compatible} 6. Rename R1(hid1, hn1, n1, u1) = MarvelHeroes 7. Cartesian Product: R x S = {tuples t such that there exists a tuple r in R and a tuple s in S and t is the combination of values for r and s} For this to be valid, R and S should have no attributes in common and the data model for R x S has all the attributes in R and all the attributes in S. 8. Theta JOIN R join_{C} S = Select_{C} ( R x S ) where R and S have no attributes in common and C is a join condition that is a boolean condition containing conditions that compare attributes from R to attributes from S 9. Natural Join R join S = Project_{A1,...An} (Select_{C} (R' x S')) Take R and S, rename all common attributes (get R',S'), then the condition C is the equality of all common attributes, and keep only one copy of the same attribute ------- MarvelHeroes(hid, hname, name, universe) Key: hid DCHeroes(hid, hname, name, universe) Key: hid Movies(mid, title, year) Key: mid HeroInMovie(hid, mid, role) Key: hid, mid TVShows(sid, title, fromyear, toyear) Key: sid HeroInShow(hid, sid, fromyear, toyear) Key: hid, sid Q1. ID of all marvel heroes that were not in a movie (id of all marvel heroes) - (all heroes that were in a movie) A1 = (project_{hid}(MarvelHeroes)) - (project_{hid} HeroInMovie) Q2. Find all years in which a hero movie was released or a new hero TV show started. R1 = project_{year} (Movies) R2(year) = project_{fromyear} (TVShows) R3 = R1 union R2 Q3. Find the name of all heroes that were in a movie in 2020. AllHeroes = MarvelHeroes union DCHeroes -- (hid, hname, name, universe) Moviesin2020(mid1) = project_{mid} (select_{year =2020} (Movies)) R1 = Moviesin2020 x HeroInMovie -- (mid1, hid, mid, role) R2(hid1) = project_{hid} (select_{mid1=mid} (R1)) -- Id of All heroes that were in a movie in 2020 R3 = R2 x AllHeroes -- (hid1, hid, hname, name, universe) R4 = project_{name} (select_{hid1=hid} (R3)) Q3'. Find the name of all heroes that were in a movie in 2020. AllHeroes = MarvelHeroes union DCHeroes -- (hid, hname, name, universe) Moviesin2020(mid1) = project_{mid} (select_{year =2020} (Movies)) R1(mid1, hid1, mid, role) = Moviesin2020 join_{mid1=mid} HeroInMovie R2 = R1 join_{hid1=hid} AllHeroes R3 = project_{name} (R2) Q4. Find the name and universe of all heroes that were in a TV show that started in 2020 and were not in a movie in 2020 AllHeroes = MarvelHeroes union DCHeroes -- (hid, hname, name, universe) M1(mid1, title1, year1) = select_{year=2020} (Movies) M2 = project_{hid} (M1 join_{mid1=mid} HeroInMovie) ### id of all heroes that were in a movie in 2020 S1(sid1) = project_{sid} (select_{fromyear =2020} (TVShows)) S2 = project_{hid} (S1 join_{sid1=sid} HeroInShow) ### id of all heroes that were in a show that started in 2020 S3(hid1) = S2 - M2 ### id of all heroes that were in a show that started in 2020 and not in a movie in that year S4 = project_{name, universe} (AllHeroes join_{hid1=hid} S3) ======================== AllHeroes = MarvelHeroes union DCHeroes -- (hid, hname, name, universe) T1(sid2, title2, fromyear2, toyear2) = TVShows HS1(hid1, sid1, fromyear1, toyear1) = HeroInShow M1(mid2, title2, year2) = Movies HM1(hid1, mid1, role1) = HeroInMovie A = select_{year2=2020} (AllHeroes join_{hid=hid1} HM1 join_{mid1=mid2} M1) B = select_{fromyear2=2020} (AllHeroes join_{hid=hid1} HS1 join_{sid1=sid2} T1) C = project_{hid,name, universe} (B) - project_{hid, name, universe} (A) D = project_{name, universe} (C) ======================== Q4. Find the name and universe of all heroes that were in a TV show that started in 2020 from beginning to end and were not in a movie in 2020 AllHeroes = MarvelHeroes union DCHeroes -- (hid, hname, name, universe) A = select_{year=2020} (AllHeroes join HeroInMovie join Movies) B = select_{fromyear=2020} (AllHeroes join HeroInShows join TVShows) C = project_{hid, name, universe} (B) - project_{hid, name, universe} (A) D = project_{name, universe} (C) Q5. Find the name of all Marvel heroes who joined a TV show 5 years after it started and stayed until the end of the show T1(sid2, title2, fromyear2, toyear2) = TVShows HS1(hid1, sid1, fromyear1, toyear1) = HeroInShow R1 = project_{hid1} (T1 join_{fromyear1=fromyear2+5 and toyear1=toyear2} HS1) R2 = project_{name} (R1 join_{hid1=hid} MarvelHeroes) Q6. Find the name of all pair of Marvel Heroes that were from a different universe but have the same hero name, and were in the same movie. H1(hid1,hname1,name1,universe1) = MarvelHeroes H2(hid2,hname2,name2,universe2) = MarvelHeroes R = project_{hid1,hid2} (H1 join_{hname1=hname2 and universe1<>universe2} H2) R2(hid1, hid2, mid2) = project{hid1, hid2, mid} (R join_{hid1=hid} HeroInMovie) ### hid1, hid2,hid, mid, role ### mid is a movie hid1 is in R3 = R2 join_{hid2=hid and mid=mid2} HeroInMovie