Lecture 5:

Also known as: Everything you wanted to know about functional
dependencies but were afraid to ask

Announcements:
-----------

- Hw 1 is due this thursday at midnight.
- Office hours have been updated, please check course website.

- Lecture 5 exercise after class

--------------------------------

FUNCTIONAL DEPENDENCY: Given a relation R, a functional dependency
(fd) is an expression of the form: X -> Y where X and Y are sets of
attributes from R.

If X -> Y, then whenever you have two tuples in R with the same values
for attributes in X, then they must have the same values for
attributes in Y.

--------------------------

INFERENCE RULES FOR FUNCTIONAL DEPENDENCIES: Given a relation R and a
set of F of fds, the following are inference rules:

1. Trivial: If Y is a subset or equal to X, then X->Y is implied by F.
2. Combining rule: If X->Y and X->Z are in F, then  X->YZ is implied by F. 
3. Decomposition: If X->YZ is in F, then X->Y and X->Z are both implied by F.
4. Transitivity: If X->Y, Y->Z are in F, then X->Z is implied by F.
5. Augmentation: If X->Y is in F, then XA->YA is implied by F.

------------------------------------------------------

THE CLOSURE OF A SET OF FUNCTIONAL DEPENDENCIES (F+): Given a relation
R and a set of F of fds, the closure of F given by F+ is the set of
all functional dependencies implied by F.

F+ = F union All trivial FDs union all other fds implied by F
----------------------------------------

THE CLOSURE OF A SET OF ATTRIBUTES (X+):Given a relation R and a set
of F of fds, the closure of a set X of attributes, given by X+ is the
set of all atttributes Y that are implied by X.

To compute of X+:

 - Initialize X+ = {X}
 - Repeat until X+  does not change:
        Find Y->Z such that Y is a subseteq of X+, then add Z to X+

Then, we have that X-> X+ is implied by F (or it is in F+)


F = {AB->CD, D->E, EF->G}
ABC->ABC

ABC+ = {A,B,C}
ABC+ = {A,B,C,D,E}

ABC->ABCDE is implied by F (or is in F+)
-------------------------------------

SUPERKEY: Given a relation R and a set of F of fds, X is a superkey if
X+ is the set of all attributes in R. 

-------------------------------------

KEY: A Key is a minimal set of attributes X that imply all the attributes in R.

Every key is a superkey by definition.


Key: A set of attributes X such that no two tuples can have same values for X.

R(A,B,C,D,E) F={AB->CDE}  Key: AB

a1 b1 c1 d1 e1
a1 b2 c1 d1 e1


AB->CDE
AB ->AB
AB->ABCDE
AB+ = {A,B,C,D,E}
-------------------------------------

PRIME ATTRIBUTE: If an attribute is in a key (any key), it is called a
prime attribute.

-------------------------------------

BOYCE CODD NORMAL FORM (BCNF): Given a relation R and a set of F of
fds. R is in BCNF iff and only one of the following is true for every
functional dependency X->Y in F:

a) either X->Y is trivial  (Y is a subseteq of X),
b) or X is a superkey of R.

------------------------------------

THIRD NORMAL FORM (3NF): Given a relation R and a set of F of fds.  R
is in 3NF iff and only one of the following is true for every
functional dependency X->Y in F:

a) either X->Y is trivial  (Y is a subseteq of X),
b) or X is a superkey of R,
c) all of Y are prime attributes.


R5(A,B,C,D)  F = {AB->CD, BC->B, D->A}

Key: AB, BD

In 3NF? Yes.

AB->CD  , AB is a superkey (condition b), OK  
BC->B   , trivial (condition a), ok
D->A    , D is not a superkey, but A is a prime attribute (condition c). OK.
------------------------------------------

F1 = {AB->CD, BC->B, D->A}

F2 = {AB->CD, D->A}

-------------------------------------

Two fd sets F1 and F2 are equivalent (F1 ≡ F2) have the same
meaning), iff F1+ = F2+.

To check if F1 ≡ F2 you can check:

a) everything in F1 is implied by F2, and
b) everything in F2 is implied by F1.

F1 = {AB->CD, D->A}
F2 = {AB->BCD, D->A, BD->C}

a) everything in F1 is implied by F2, and
True!

b) everything in F2 is implied by F1.

AB->BCD , using F1: AB+={A,B,C,D}, so it is implied by F1.
D->A,  in F1 so it is implied by F1.
BD->C, using F1, BD+={B,D,A,C} since C is in the closure, this is implied by F1.

Hence: F1 ≡ F2.

--------------

MINIMAL COVER: A set F of fds is a minimal cover if nothing can be
removed from it without changing F+.

---------------

BASIS: A set F is in a basis form if every fd has a single attribute
on the right hand side.

-----------------
Algorithm to compute the minimal basis:
--------------------------------------

Given F:

Step 1) Convert it to basis form

Step 2) Remove all trivial functional dependencies

Step 3) Given F, remove X->Y from F to get F'. If X+ is the same in F
and F', then I can remove X->Y from F.

Step 4) Given F, we can replace XZ->Y in F with X->Y to get F'. If X+
is the same in F and F', then we can change XZ->Y to X->Y.

Optional Step 5) Use the combining rule to combine fds with the same
left hand side.

-----------------------------
Given:
F = {AB->BCD, D->AE, DEF->DG, ABCF->G}

Step 1:

F = {AB->B, AB->C, AB->D, D->A, D->E, DEF->D, DEF->G, ABCF->G}

Step 2: Remove: AB->B, DEF->D

F = {AB->C, AB->D, D->A, D->E, DEF->G, ABCF->G}

Step 3:

F = {AB->C, AB->D, D->A, D->E, DEF->G, ABCF->G}

AB+ = {A,B,C,D,E}

F' = {AB->D, D->A, D->E, DEF->G, ABCF->G}

AB+ = {A,B,D,E}
Hence, cannot remove AB->C

-----
F = {AB->C, AB->D, D->A, D->E, DEF->G, ABCF->G}

DEF+ = {D,E,F,G,A}

F' = {AB->C, AB->D, D->A, D->E, ABCF->G}

DEF+ = {D,E,F,A}
Cannot remove DEF->G
-------

F = {AB->C, AB->D, D->A, D->E, DEF->G, ABCF->G}

ABCF+ = {A,B,C,F,G,D,E}

F' = {AB->C, AB->D, D->A, D->E, DEF->G}

ABCF+ = {A,B,C,F,D,E,G}
Can remove ABCF->G


Step 4:

F = {AB->C, AB->D, D->A, D->E, DEF->G}

A+ = {A}

F' = {A->C, AB->D, D->A, D->E, DEF->G}

A+ = {A,C}
I cannot remove B in AB->C
------

F = {AB->C, AB->D, D->A, D->E, DEF->G}

DF+ = {D,F,A,E,G}

F' = {AB->C, AB->D, D->A, D->E, DF->G}

DF+ = {D,F,G,A,E}
Same, so F' becomes F. (I can remove E from DEF->G)

Result:

F = {AB->C, AB->D, D->A, D->E, DF->G}

Step 5)

F = {AB->CD, D->AE, DF->G}

-----------

PROJECTION OF FUNCTIONAL DEPENDENCIES

Given a relation R and a set F of fds, if R2 = PROJECT_{A1...AN} (R)
the projection of F into F2 is the set of all functional dependencies
in F+ that only contain attributes in R2 (simplify by finding the
minimal basis).

To compute: find X+ for every subset of attributes in R2, add X->X+ to
the set F2, then find the minimal basis.

------------

R1(A,B,C,D,E,F)

F1 = {AB->CD, AC->EF}

R11(A,B,C,D) F11={AB->CD}
R12(A,C,E,F)  F12={AC->EF}

----------------

R2(A,B,C,D,E,F,G,H)

F2 = {AB->C, C->D, BE->F, F->G, F->H}

R21(A,B,D,E,G)    F21={AB->D, BE->G}

R22(B,C,D,F,H)  F22={C->D, F->H}

-------------------------

DECOMPOSITION: A decomposition of R into R1,...,Rn is given by:

If A1,...,An are the attributes in R, then: R1 = PROJECT_{A1,...,An} (R)

VALID: A decomposition is valid if R1,...,Rn contains all the attributes in R.

LOSSLESS DECOMPOSITION: If it is guaranteed that

R1 join R2 join .... join Rn = R, then the decomposition is lossless.
(or R1 * R2 * .... * Rn = R0

DEPENDENCY PRESERVING:  If F1, ..., Fn are the projection of F onto R1, ..,Rn
then

F1 union ... union Fn is equivalent to F, then the decomposition is
dependency preserving.

------------------------------


TVShows(Showname, StartYear, EndYear, Actor, Role, DateOfBirth,
JoinedCast, LeftCast)

Showname -> StartYear EndYear
Actor -> DateOfBirth
ShowName Actor Role -> JoinedCast LeftCast


T1(Showname, StartYEar, Endyear)

Farscape  1999 2003
Bab 5     1993 1998

T2(Actor, DateOfBirth)
Ben Browder  1962
Lani Tupu    1955
....

T3(ShowName, Actor, Role, JoinedCast, LeftCast)

Bab 5  Lani Tupu Pilot          1999 2003
Bab 5  Lani Tupu Captain Crais  1999 2002
......


T1 * T2 * T3


------------------

CHASE Algorithm to check if a decomposition is lossless

GIven a set F of fds and a decomposition R1, ..., Rn of a relation R.
Compute a canonical relation R  as follows:

-> A tuple for each Ri, and each attribute in Ri will have no
subscripts (known value) and the other attributes will have a unique
subscript (unknown value)

-> Use the functional dependencies to recover unknowns
   If: X->Y and two tuples have the same X values then make Y values the same

   If one Y value is known, then make the other one known, otherwise make them
   the same value.

-> Repeat until no row changes or you find a row with no subscripts.

If there is a tuple with no subscripts, then the relation is LOSSLESS.

If there is no tuples with no subscripts, then the relation is not
lossless AND the resulting relation is a counter example!

--------------

R(A,B,C,D,E,F,G,H)

F ={AB->C, C->D, BE->F, F->G, F->H}

R1(A,B,C,D)
R2(A,B,E,F)
R3(F,G,H)


 A  B  C  D  E  F  G  H
 -----------------------
 a  b  c  d  e1 f1 g1 h1
 a  b  c2 d2 e  f  g2 h2
 a3 b3 c3 d3 e3 f  g  h

AB->C

 A  B  C  D  E  F  G  H
 -----------------------
 a  b  c  d  e1 f1 g1 h1
 a  b  c  d2 e  f  g2 h2
 a3 b3 c3 d3 e3 f  g  h

C->D

 A  B  C  D  E  F  G  H
 -----------------------
 a  b  c  d  e1 f1 g1 h1
 a  b  c  d  e  f  g2 h2
 a3 b3 c3 d3 e3 f  g  h


F ={AB->C, C->D, BE->F, F->G, F->H}

F->G

 A  B  C  D  E  F  G  H
 -----------------------
 a  b  c  d  e1 f1 g1 h1
 a  b  c  d  e  f  g  h2
 a3 b3 c3 d3 e3 f  g  h

F->H

 A  B  C  D  E  F  G  H
 -----------------------
 a  b  c  d  e1 f1 g1 h1
 a  b  c  d  e  f  g  h   <--- no subscript, lossless.
 a3 b3 c3 d3 e3 f  g  h