FUNCTIONAL DEPENDENCY: A functional dependency (fd) for a relation R(A,B,C,D) is an expression of the form X -> Y where X is a set of attributes of R and Y is a set of attributes of R such that: whenever two tuples in R have the same values for all attributes in X, then they must have the same values for all attributes in Y. KEY: Given a relation R and a set F of fds, X is a key if 1. X+ is all attributes in R (in other words: X->X+ is in F+), 2. no subset of X implies all attributes in R (minimality). SUPERKEY (superset of key): Given a relation R and a set F of fds, X is a superkey if X+ is all attributes in R (in other words: X->X+ is in F+). PRIME ATTRIBUTE: Given a relation R and a set F of fds, a prime attribute is an attribute in a key. BASIS: A set of functional dependencies forms a basis, if there is only one attribute on the right hand side of each functional dependency. BOYCE-CODD NORMAL FORM (BCNF): Given a relation R and a set of functional dependencies F. R is in BCNF iff for all functional dependencies in F of the form X->Y one of the following is true: 1. X is a superkey of R, or 2. X->Y is trivial. (in other words: If for all functional dependencies in F that are NOT trivial, the left hand side is a superkey, then R is in BCNF. ) THIRD NORMAL FORM (3NF): Given a relation R and a set of functional dependencies F that forms a basis (only one attribute on the right hand side). R is in 3NF iff for all functional dependencies in F of the form X->Y one of the following is true: 1. X is a superkey of R, or 2. X->Y is trivial, or 3. Y is a prime attribute. If a relation is in BCNF, then it is also in 3NF. --------------- R1(A,B,C,D) AB->C ok (AB is a superkey) AB->D ok (AB is a superkey) C->A not ok (C is not a superkey), NOT IN BCNF A->A ok (trivial) ABD->C ok (ABD is a superkey) Key: AB, BC BC+ = {B,C,A,D} Superkeys: AB, BC, ABC, ABD, ABCD, BCD Prime attributes: A,B,C NOT IN BCNF 3NF? AB->C ok (AB is a superkey) AB->D ok (AB is a superkey) C->A ok (C is not a superkey, but A is a prime attribute) A->A ok (trivial) ABD->C ok (ABD is a superkey) BUT, it is in 3NF --------------- R2(A,B,C,D) AB->C C->D Key: AB AB+ = {A,B,C,D} Prime attributes: A,B NOT in BCNF due to C->D NOT in 3NF --------------- R3(A,B,C,D) A->B A->C A->D Key: A A+ = {A,B,C,D} Prime attribute: A IN BCNF, 3NF! --------------- R4(A,B,C,D,E) A->B A->C BD->E Key: AD Prime attributes: A,D Not in BCNF Not in 3NF: A->B, not trivial, A is not a superkey and B is not a prime attribute ----------------- R5(A,B,C,D,E) AB->C AB->D x not in 3NF C->A Key: ABE,BCE Prime attributes: A,B,C,E Not in BCNF, not in 3NF ------------ MusicGroups(GroupName, ArtistName, YearFormed, YearBorn, YearJoined) GroupName -> YearFormed ArtistName -> YearBorn GroupName ArtistName -> YearJoined Key: ArtistName GroupName Not in BCNF, because of the first two functional dependencies Not in 3NF Lecture 5 - Examples -------------------- R(A,B,C,D) AB->C B->D A B C D a1 b1 c1 d1 a1 b2 c3 d2 a2 b3 c2 d1 AB together uniquely determines a tuple in R, then AB is superkey. Let's check minimality: A is not a key, B is not a key. AB is minimal, it is a key! R(A,B,C,D) F = {AB->C, B->D} Check if AB->ABCD is in F+ AB+ = {A,B,C,D} AB->ABCD B+ = {B,D} not key A+ = {A} not key Key: AB Superkey: AB, ABC, ABD, ABCD Prime attributes: A, B Closure of a set of attributes Superkeys: set of attributes X that imply all attributes, such that X+ is all attributes in the relation Keys: Superkey but also minimal, no subset of it is a key. Prime attributes: Any attribute in a key ---------------- BCNF: A relation R given a set of F is in BCNF iff for all fds X->Y in F, either one of the following is true: 1. X is a superkey, or 2. X->Y is trivial R1(A,B,C,D,E) F = {A->BC, A->DE} Key: A A->BCDE A is a superkey, so R1 is in BCNF. R2(A,B,C,D,E,F) F = {A->BCDE} Key: AF A->BCDE No, A is not a superkey Not in BCNF Students(RIN, Name, Email, Major) RIN->Name, Email Not in BCNF R3(A,B,C,D,E,F) F = {A->BCD D->EF} Key: A A->BCD ok D->EF D is not a superkey Not in BCNF MusicGroups(GroupName, ArtistName, DoB, YearFounded) GroupName -> YearFounded no ArtistName -> DoB no Key: GroupName ArtistName Not in BCNF R4(A,B,C,D,E,F,G) F={AC->BD, ACD->AEFG, BC->B, EFG->AC} Key: AC, EFG AC->BD ok, superkey on lhs ACD->AEFG ok, superkey on lhs BC->B trivial EFG->AC ok, superkey on lhs --------------- BCNF: Good! It enforces the notion of modularity, each relation should be a single module ------------------- 3NF: A relation R given a set of F is in 3NF iff for all fds X->Y in F, either one of the following is true: 1. X is a superkey, or 2. X->Y is trivial, or 3. all attributes on the right hand side are prime attributes. If a relation is in BCNF, then it is in 3NF. R5(A,B,C,D,E,F,G) F={AC->BD, ACD->AEFG, BC->B, EFG->A} Key: AC, CEFG Prime attributes: A,C,E,F,G AC->BD Ok, superkey ACD->AEFG Ok, superkey BC->B Ok, trivial EFG->A EFG is not a superkey, but A is a prime attribute! Not in BCNF because of EFG->A, but it is in 3NF Address(Street, State, City, Zip) Street, State, City -> Zip Zip -> State, City Key: Street, State, City or Zip, Street Not in BCNF, in 3NF MINIMAL BASIS EQUIVALENCY: Two functional dependencies F1 and F2 over relation R are equivalent if F1+ = F2+ MINIMAL BASIS: A set of functional dependencies F is minimal if we cannot remove any f.d.s or any attributes from an f.d. without changing its meaning (closure). Algorithm for converting a set F to a minimal basis ---------------------------------------------------- Step 1: Convert F to a basis by using splitting rule AB->CD C->AC D->AF AC->G AB->BFH ----- AB->C AB->D C->A C->C D->A D->F AC->G AB->B AB->F AB->H Step 2: Remove all trivial functional dependencies AB->C AB->D C->A C->C -- remove D->A D->F AC->G AB->B -- remove AB->F AB->H ---- AB->C AB->D C->A D->A D->F AC->G AB->F AB->H Step 3: Suppose X->Y is in F, create F' by removing X->Y. If X+ is the same in F and F', then X->Y can be removed. F= AB->C AB->D ** removed in F' C->A D->A D->F AC->G AB->F AB->H F'= AB->C C->A D->A D->F AC->G AB->F AB->H in F , AB+={A,B,C,D,F,G,H} in F', AB+={A,B,C,G,H} We cannot remove AB->D, not equal -------------- F= AB->C AB->D C->A D->A D->F AC->G AB->F ** removed in F' AB->H F'= AB->C AB->D C->A D->A D->F AC->G AB->H in F : AB+={A,B,C,D,F,G,H} in F': AB+={A,B,C,D,F,G,H} Same! I can remove AB->F ----- AB->C AB->D C->A D->A D->F AC->G AB->H Step 4: If XZ->Y in F, then replace it with X->Y to get F'. If X+ in F and F' is the same, then F' can become F. F = AB->C AB->D C->A D->A D->F AC->G AB->H F' = AB->C A->D ** changed fd C->A D->A D->F AC->G AB->H in F , A+={A} in F', A+={A,D,F} Not the same, cannot remove B ---------- F = AB->C AB->D C->A D->A D->F AC->G AB->H F' = AB->C B->D ** changed fd C->A D->A D->F AC->G AB->H in F , B+={B} in F', B+={B,D,F} Not the same, cannot remove A --------- F = AB->C AB->D C->A D->A D->F AC->G AB->H F' = AB->C AB->D C->A D->A D->F C->G ** changed fd AB->H in F , C+={C,A,G} in F', C+={C,A,G} Same, I can remove A from AC->G. ------------ final result: minimal basis AB->C AB->D C->A D->A D->F C->G AB->H combining rule: minimal cover AB->CDH C->AG D->AF Example relations not in BCNF -------------------------------- Students(RIN, Name, Email, Major) RIN->Name Email ---- violates BCNF Not in BCNF ---- violates BCNF MusicGroups(GroupName, ArtistName, DoB, YearFounded) GroupName -> YearFounded ---- violates BCNF ArtistName -> DoB ---- violates BCNF Key: GroupName ArtistName Not in BCNF Courses(CRN,Semester,Year,Section,CourseCode,CourseName) CRN-> Semester Year Section CourseCode CourseCode -> CourseName ---- violates BCNF CourseCode Semester Year Section -> CRN Key: CRN or CourseCode Semester Year Section Not In BCNF