Example relations not in BCNF -------------------------------- Students(RIN, Name, Email, Major) RIN->Name Email ---- violates BCNF Not in BCNF ---- violates BCNF MusicGroups(GroupName, ArtistName, DoB, YearFounded) GroupName -> YearFounded ---- violates BCNF ArtistName -> DoB ---- violates BCNF Key: GroupName ArtistName Not in BCNF Courses(CRN,Semester,Year,Section,CourseCode,CourseName) CRN-> Semester Year Section CourseCode CourseCode -> CourseName ---- violates BCNF CourseCode Semester Year Section -> CRN Key: CRN or CourseCode Semester Year Section Not In BCNF ------------------------ Decomposition: A decomposition of R into R1,R2,...,Rn is valid if R1,..,Rn make up all the attributes in R and is given by: R1 = project_{attributes of R1} (R) R2 = project_{attributes of R2} (R) ... Rn = project_{attributes of Rn} (R) A good decomposition is: 1. lossless: required property [all decompositions should be lossless!] 2. dependency preserving [desired property!] Lossless -------- A decomposition of R into R1,R2,..Rn is lossless iff we are guaranteed for every possible instance of R that R = R1*R2*...*Rn Students(RIN, Name, Email, Major) RIN->Name Email ---- violates BCNF RIN Name Email Major a1 b1 c1 d1 a1 b1 c1 d2 a2 b1 c2 d1 S1(RIN, Name, Email) RIN Name Email a1 b1 c1 a1 b1 c1 a2 b1 c2 S2(Name, Major) Name Major b1 d1 b1 d2 Result = S1 * S2 RIN Name Email Major a1 b1 c1 d1 a1 b1 c1 d2 a2 b1 c2 d1 a2 b1 c2 d2 -- extra tuple, not in the original student relation This decomposition is not lossless! Chase Algorithm For Checking Lossless Decompositions ---------------------------------------------------- Given a relation R a set fds F and its decomposition R1, ...,Rn, First construct a canonical relation that has a tuple for each relation in the decomposition: --- for each tuple for Ri, if the attribute is in Ri, we represent it as an unsubscripted variable (known value!) otherwise, we are going to have a subscript (a distinct number for each tuple!) Given a canonical relation, apply all possible fd in F, X->Y such that if you have two tuples with the same X values, then make the Y values the same: -- if one tuple has no subscript for an attribute in Y, then change all the variables to no subscript, otherwise make all of them the same (same subscript). Continue until no more changes are possible. When finished, if there is a tuple with no subscripts, then this relation is lossless. If there is no such tuple, then this resulting canonical relation is a counter example of why this is a lossy decomposition. (no subscript = known value, has subscript = unknown value/variable!) Ex: Students(RIN, Name, Email, Major) RIN->Name Email S1(RIN, Name, Email) S2(Name, Major) RIN Name Email Major r n e m1 -- for s1 r2 n e2 m -- for s2 No two tuples have the same rin, so this is the final form. Lossy decomposition! S1: (r,n,e), (r2,n,e2) S2: (n,m1), (n,m) S1*S2: RIN Name Email Major r n e m1 r2 n e2 m r n e m r2 n e2 m1 Ex: Students(RIN, Name, Email, Major) RIN->Name Email S1(RIN, Name, Email) S2(RIN, Major) RIN Name Email Major r n e m1 r n2 e2 m Apply RIN -> name email RIN Name Email Major r n e m1 r n e m Last row has no subscript, so it is lossless! ------------------------ R(A,B,C,D,E,F,G) AB->C CD->EF F->A R1(A,B,C,D,E) R2(C,D,F) R3(A,B,D,G) R4(A,F) A B C D E F G a b c d e f1 g1 a2 b2 c d e2 f g2 a b c3 d e3 f3 g a b4 c4 d4 e4 f g4 Apply: AB->C A B C D E F G a b c d e f1 g1 a2 b2 c d e2 f g2 a b c d e3 f3 g a b4 c4 d4 e4 f g4 Apply: CD->EF A B C D E F G a b c d e f g1 a2 b2 c d e f g2 a b c d e f g a b4 c4 d4 e4 f g4 Third row has no subscript, this is a lossless decomposition!