Lecture 6
-------------
Minimal vs. Minimal Basis
---------------------------
{AD->B, AD->C} --> minimal basis
{AD->BC} --> minimal but not basis
3NF Decomposition
---------------------
Guaranteed to be lossless and dependency preserving!
Given a relation R and a set F of fds that is minimal, the 3NF
decomposition is computed as follows:
1. For each fd X->Y, create a new relation with attributes X union Y.
2. Remove any relation Rx if there is another relation Ry that has all
the attribute in Rx.
3. If there is NO relation that contains all the attributes of one of
the keys of R, then add an extra relation for one of the keys.
Example: R(A,B,C,D,E,F)
F= {AB->C, C->B, BD->E, DF->A}
Keys: BDF, CDF
R1(A,B,C) {AB->C, C->B}
R2(B,C) {C->B} -- remove!
R3(B,D,E) {BD->E}
R4(A,D,F) {DF->A}
R5(B,D,F)
Final set:
R1(A,B,C) {AB->C, C->B}: Key: AB, AC
R3(B,D,E) {BD->E} Key: BD
R4(A,D,F) {DF->A} Key: DF
R5(B,D,F) Key: B,D,F
Lossless, let's prove using Chase algorithm:
F= {AB->C, C->B, BD->E, DF->A}
A B C D E F
a b c d1 e1 f1
a2 b c2 d e f2
a b3 c3 d e3 f
a4 b c4 d e4 f
BD->E
A B C D E F
a b c d1 e1 f1
a2 b c2 d e f2
a b3 c3 d e3 f
a4 b c4 d e f
DF->A
A B C D E F
a b c d1 e1 f1
a2 b c2 d e f2
a b3 c3 d e3 f
a b c4 d e f
AB->C
A B C D E F
a b c d1 e1 f1
a2 b c2 d e f2
a b3 c3 d e3 f
a b c d e f <- no subscripts! so this is lossless
-----------
R(A,B,C,D,E) F= {AB->CD, D->E}, Key: AB
3NF Decomposition
(A,B,C,D) {AB->CD}
(D,E) {D->E}
Suppose F was in basis form
R(A,B,C,D,E) F= {AB->C, AB->D, D->E}, Key: AB
3NF Decomposition
(A,B,C) AB->C
(A,B,D) AB->D
(D,E) D->E
----------------
BCNF Decomposition
--------------------
All BCNF decompositions are lossless, but not necessarily dependency
preserving.
Given a relation R and a set F of fds that is minimal, the BCNF
decomposition is computed as follows:
- Suppose X->Y is an fd in F that violates BCNF. Then, compute X+.
Create two relations:
1. R1(X+) X->X+
2. R2 has all attributes in R, except for X+-X (i.e. keep X, left hand side attributes)
For each Ri, compute functional dependency projections, check if they
are in BCNF. If any of them is not in BCNF, apply BCNF decomposition
recursively.
Example:
R(A,B,C,D,E,F), and F = {AB->C, C->DE, E->F}
Key:AB (C->DE and E->F violate BCNF)
We will take C->DE out!
C+ = {C,D,E,F} C->CDEF
R1(C,D,E,F) F1={C->DE, E->F} Key:C not in BCNF because E->F violates it
R2(A,B,C) F2={AB->C} Key: AB, in BCNF
Decompose:
R1(C,D,E,F) F1={C->DE, E->F} Key:C not in BCNF because E->F violates it
Take E->F out, E+ = {E,F}
R11(E,F) {E->F} in BCNF
R12(C,D,E) {C->DE} in BCNF
Final result:
(EF)
(CDE)
(ABC)
----------------
S(Showname, Startyear, Creator, StreamSite, URL, ShowSeasons, StreamSeasons)
Showname -> StartYear, Creator, ShowSeasons
StreamSite -> URL
URL -> StreamSite
Showname StreamSite -> StreamSeasons ok BCNF
Key: ShowName, StreamSite
Take out StreamSite->URL
S1(StreamSite, URL) {StreamSite->URL, URL->StreamSite} in BCNF
S2(Showname, Startyear, Creator, StreamSite, ShowSeasons, StreamSeasons)
Showname -> StartYear, Creator, ShowSeasons
Showname StreamSite -> StreamSeasons ok BCNF
Key: ShowName, StreamSite
Take out: Showname -> StartYear, Creator, ShowSeasons
S21(ShowName, StartYear, Creator, ShowSeasons)
Showname -> StartYear, Creator, ShowSeasons
S22(Showname, StreamSite, StreamSeasons)
Showname StreamSite -> StreamSeasons
Final result:
(ShowName, StartYear, Creator, ShowSeasons)
{Showname -> StartYear, Creator, ShowSeasons }
(Showname, StreamSite, StreamSeasons)
{Showname StreamSite -> StreamSeasons}
(StreamSite, URL)
{StreamSite->URL, URL->StreamSite}
------------------------------
4NF
-----------
GameInfo(gameid, name, category, type)
Games can have multiple categories and types
gameid -> name
Key: gameid, category, type
Games(gameid, name) gameid->name, key: gameid in BCNF
GameInfo(gameid, category, type) key: all attributes, in BCNF
GameInfo is in BCNF, but it contains two multivalued attributes
category and type. But category and type are not coupled with each
other. Hence, category and type are independeny of each other.
A multivalued dependency of the form
X =>> Y
and suppose Z is the remainder of attributes in R.
Says that whenever we have two tuples: (x1,y1,z1), and (x1,y2,z2)
then we must also have tuples: (x1,y1,z2) and (x1,y2,z1)
(Hence, Y and Z are independent of each other)
If X=>Y and X and Y together is not all the attributes in R, then R is
not in 4NF.
(Alternatively, If X=>Y and X,Y is all the attributes in R, then R is in 4NF)
If a relation is not in 4NF, use a method similar to BCNF
decomposition to decompose, but now using multivalued dependencies.
------------
Example:
GameInfo(gameid, category, type)
gameid =>> category,, since gameid and category together is not all
the attributes, this relation is not in 4NF
Take gameid=>> category out (no closure)
GameInfo1(gameid, category) gameid =>> category
Key: gameid, category (4NF)
GameInfo2(gameid, type) gameid =>> type
Key: gameid, type (4NF)
--------------
BCNF decomposition:
R(A,B,C,D,E,F) F={A->BC,C->D} Key:AEF
Take out C->D
R1(C,D) {C->D} in BCNF
R2(A,B,C,E,F) {A->BC} key: AEF
Take out A->BC
R21(A,B,C) {A->BC} Key: A
R22(A,E,F) {} Key: AEF
Final result:
(C,D) {C->D} in BCNF
(A,B,C) {A->BC} Key: A
(A,E,F) {} Key: AEF
===========
(assuming we are not storing which actor appears in which location, just
that movies have actors and locations)
Movies(movieid, actor, location)
movieid=>>actor
Not in 4NF
instead store:
Movies1(movieid, actor)
Movies2(movieid, location)