Normal Forms through BCNF

CPSC 356 Database

Ellen Walker

Hiram College

(Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)

Unnormalized Form

• A table that includes one or more repeating groups.

StudentName    StudentClasses     RoomNo      Professor
Minnie Mouse   English 101        BC          H. Higgins
               Psych 240          Gerst 120   Lucy Van Pelt
               Accounting 110     Hins 209    Scrooge McD
Goofy          Basketweaving 1    KC          Daffy Duck
               English 101        BC          H. Higgins

First Normal Form (1NF)

• Each row/column intersection must hold a single value
– No composite attributes
– No multivalued attributes

• Put the table into 1NF by repeating student names for each repeating group.

Example in 1NF

StudentName   StudentClass    RoomNo     Professor
Minnie Mouse  English 101     BC         H. Higgins
Minnie Mouse  Psych 240       Gerst 120  Lucy Van Pelt
Minnie Mouse  Accounting 110  Hins 209   Scrooge McD
Goofy         Basketweaving   KC         Daffy Duck
Goofy         English 101     BC         H. Higgins

Example: FDs and Candidate Key

• Functional Dependencies
– StudentClass -> RoomNo, Prof
– Prof -> RoomNo

• Candidate Key
– Since StudentName cannot be determined from anything else, it must be part of the key. StudentClass gives the rest.
– StudentName, StudentClass -> all attributes
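The candidate-key claim can be checked with an attribute closure. The following is a minimal sketch, assuming the two dependencies listed above are the only ones that hold; the helper name `closure` and the constants are illustrative, not part of the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole left side is known, the right side follows.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ALL_ATTRS = {"StudentName", "StudentClass", "RoomNo", "Prof"}
FDS = [
    ({"StudentClass"}, {"RoomNo", "Prof"}),
    ({"Prof"}, {"RoomNo"}),
]

# The pair determines every attribute, so it is a key.
print(closure({"StudentName", "StudentClass"}, FDS) == ALL_ATTRS)
# Neither attribute alone does, so the key is minimal.
print(sorted(closure({"StudentClass"}, FDS)))
print(sorted(closure({"StudentName"}, FDS)))
```

Because neither single attribute reaches all four attributes on its own, the pair is a candidate key rather than just a superkey.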

Second Normal Form (2NF)

• Every non-primary-key attribute is fully functionally dependent on the primary key

• Full Functional Dependency
– Attribute B depends on A, but not on any proper subset of A

• In other words, the primary key determines every other attribute, but no proper subset of the primary key determines any other attribute!

Example is not 2NF

• Primary key is StudentName, StudentClass
– StudentName must be included because it can’t be derived from anything else
– StudentClass distinguishes tuples of the same student

• RoomNo and Prof are not Fully Functionally Dependent on primary key
– StudentClass -> RoomNo, Prof

Going to 2NF

• Split the relation into (at least) 2 relations
– One relation has the subset of the primary key and all attributes that depend on it
– The other relation has the rest of the attributes and an appropriate foreign key

• In our example:
– CourseInfo (StudentClass, RoomNo, Prof)
– Student(StudentName, StudentClass)
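The split can be illustrated by projecting the 1NF rows onto the two schemas and joining them back. This is only a sketch on the sample data from the 1NF example; the variable names are made up for illustration.

```python
# Rows of the 1NF table: (StudentName, StudentClass, RoomNo, Prof)
rows = [
    ("Minnie Mouse", "English 101", "BC", "H. Higgins"),
    ("Minnie Mouse", "Psych 240", "Gerst 120", "Lucy Van Pelt"),
    ("Minnie Mouse", "Accounting 110", "Hins 209", "Scrooge McD"),
    ("Goofy", "Basketweaving", "KC", "Daffy Duck"),
    ("Goofy", "English 101", "BC", "H. Higgins"),
]

# Project onto the two 2NF schemas; sets drop the duplicated course facts.
course_info = {(cls, room, prof) for _, cls, room, prof in rows}
student = {(name, cls) for name, cls, _, _ in rows}

# Natural join on StudentClass (the foreign key left in Student).
rejoined = {
    (name, cls, room, prof)
    for name, cls in student
    for course, room, prof in course_info
    if cls == course
}

print(len(rows), len(course_info))  # English 101 is now stored once
print(rejoined == set(rows))        # the join recovers the original rows
```

The room and professor for English 101 appear once in CourseInfo instead of once per enrolled student, which is the redundancy 2NF removes.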

Notation for functional dependency

• First row is the relational schema
• Additional row for each dependency
– “down arrow” for left side
– “up arrow” for right side
– Example goes here

• For 2NF, no dependency has down arrow for only part of the primary key.

• Non-key dependencies don’t matter, e.g. Prof->Room

Example 13.8 (p. 392)

Third Normal Form (3NF)

• Schemas in 3NF have no transitive dependencies of non-key attributes

• Transitive dependencies cause potential duplication in the relations

• A transitive dependency is when
– A, B, and C are attributes in the relation
– A->B and B->C
– C is not an attribute of the relation’s key

Recognizing Transitive Dependencies

• If any attribute has both a down-arrow and an up-arrow (staffNo in example 13.8), then there is a transitive dependency.

• If the attribute is part of the relation’s key, the transitive dependency does not violate 3NF

• Any dependency between 2 non-keys will be transitive! (key->non-key1; non-key1-> non-key2)

• Therefore, assuming that every Prof has a favorite Room, we have
– StudentClass->Prof, Prof->RoomNo

From 2NF to 3NF

• Recognize transitive dependencies
• Move the attributes involved in a transitive dependency into their own relation, leaving only a “foreign key” behind.

• Example:
– CourseInfo (StudentClass, Prof)
– Student(StudentName, StudentClass)
– FavoriteRoom(Prof, RoomNo)

Summary of Normal Forms

Every attribute depends on
• the key,
• the whole key,
• and nothing but the key!

Condition 1 describes the definition of key

Condition 2 describes 2NF

Condition 3 describes 3NF (or BCNF)

Boyce-Codd Normal Form (BCNF)

• A relation is in BCNF if and only if every determinant is a candidate key

• A determinant is a set of attributes on which some other attribute is fully functionally dependent.

• Schemas that are 3NF but not BCNF are rare. They require
– Two or more composite candidate keys
– Candidate keys share at least one attribute

BCNF vs. 3NF

• BCNF if every FD X->Y satisfies one of the following conditions:
– The FD is trivial (Y is subset of X)
– X is a superkey

• 3NF if BCNF or the following is true:
– Every attribute in Y but not X belongs to a candidate key

BCNF Can Lose Dependencies

• When putting a relation into BCNF, it is possible that a functional dependency will not be preserved, because the related attributes will be split into separate relations.

• Tradeoff:
– A 3NF decomposition can always preserve all dependencies
– BCNF removes all redundancy caused by functional dependencies

Example: Find 3NF, Is it BCNF?

Student, Course, Semester -> Prof

Prof, Semester -> Course (Prof teaches 1 course / sem)

Course, Semester, Time -> Prof

Prof, Semester, Time -> Room, Course

Prof, Semester, Course, Time -> Room (redundant!)

Up / Down Arrow Form (Not 2NF)

Student  Course  Prof.  Sem.  Room  Time
   v       v       ^     v
           ^       v     v
           v       ^     v           v
           ^       v     v     ^     v

Find Candidate Keys

• Student, Semester, and Time have no up-arrow (cannot be determined by other attributes), so must be part of a candidate key

• Candidate keys:
– Student, Course, Semester, Time
– Student, Prof, Semester, Time
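The same search can be sketched in code: attributes that never appear on a right side go into every key, and the remaining attributes are tried in increasing subset size. This is an illustrative brute-force sketch for the six-attribute example, not an efficient general algorithm; the function names are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ATTRS = {"Student", "Course", "Prof", "Semester", "Room", "Time"}
FDS = [
    ({"Student", "Course", "Semester"}, {"Prof"}),
    ({"Prof", "Semester"}, {"Course"}),
    ({"Course", "Semester", "Time"}, {"Prof"}),
    ({"Prof", "Semester", "Time"}, {"Room", "Course"}),
    ({"Prof", "Semester", "Course", "Time"}, {"Room"}),
]


def candidate_keys(attrs, fds):
    # Attributes with no "up arrow" must be part of every candidate key.
    determined = set().union(*(rhs for _, rhs in fds))
    core = attrs - determined
    optional = sorted(attrs - core)
    keys = []
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            candidate = core | set(extra)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


for key in candidate_keys(ATTRS, FDS):
    print(sorted(key))
```

Trying subsets from smallest to largest means any set that contains an already found key can be skipped, so only minimal keys are reported.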

Remove Partial Dependencies

• Course depends only on Prof & Semester
– R1: {Prof, Semester, Course}

• Room depends only on Prof, Semester, and Time
– R2: {Prof, Semester, Room, Time}

• Original with Room and Course removed
– R3: {Student, Prof, Semester, Time}

2NF Partition (Also 3NF)

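R1:
Course  Prof.  Sem
  ^       v     v

R2:
Prof.  Sem  Room  Time
  v     v    ^     v

R3:
Student  Prof.  Sem  Time

Unused dependencies (each spans more than one relation), shown against the original columns:
Student  Course  Prof.  Sem.  Room  Time
   v       v       ^     v
           v       ^     v           v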

Dependency Analysis

• No transitive dependencies
– This is good news. We are in 3NF

• Some dependencies are “broken”
– They do not connect attributes of a single relation
– This is “non-dependency-preserving”

• If we join back the relations we created will we get the same information?
– Yes, in this case
– No, in general

Algorithm to Get 3NF

• Start with 1NF
• Remove all violations of 2NF by decomposition
• Remove all violations of 3NF by further decomposition
• It’s possible you will break dependencies

Dependency Preserving 3NF

• First, massage the dependencies into a standard form (minimal cover):
– Every right side is a single attribute
– No attributes are redundant
– No dependencies are redundant

• Next, create a relation for each of the revised dependencies (guaranteed 3NF because only one dependency per relation)

• Finally, create one more relation for the primary key, if it’s not already included in one of the others.

Another Algorithm to get 3NF

• Find minimal cover of FDs
– Every FD has one attribute on right
– No FD can be derived from other FDs in the set

• Combine FDs with same attributes on the left
• Create a relation for each remaining FD
• If no relation contains a candidate key of the original relation, construct one more relation with just that key

• This set is guaranteed to be 3NF and equivalent to the original relation

Finding Minimal Cover

• Split up the dependencies with multiple right sides
– X->Y,Z becomes X->Y, X->Z

• Check for redundant attributes on left side
– Compute closure of each set that leaves out one attribute. If it includes the right side, remove the extra attribute.

• Check for redundant dependencies
– For each dependency, compute the closure of the set on the left side against all the other dependencies except the one you’re testing. If you find the attribute on the right side in the closure, you can leave that dependency out.
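The three steps can be sketched as follows, using the Student/Course/Prof dependencies from the running example as input. This is one straightforward way to code the procedure under the assumption that attributes and dependencies are tried in a fixed (sorted or listed) order; a different order can yield a different but equally valid minimal cover. The function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split right sides so each FD determines one attribute.
    cover = [(frozenset(lhs), frozenset({attr}))
             for lhs, rhs in fds for attr in sorted(rhs)]

    # Step 2: drop left-side attributes the right side does not need.
    for i in range(len(cover)):
        lhs, rhs = cover[i]
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= closure(smaller, cover):
                lhs = smaller
                cover[i] = (lhs, rhs)

    # Reduction can create exact duplicates; keep the first of each.
    cover = list(dict.fromkeys(cover))

    # Step 3: drop FDs that follow from the remaining ones.
    for fd in list(cover):
        others = [other for other in cover if other != fd]
        if fd[1] <= closure(fd[0], others):
            cover = others
    return cover


FDS = [
    ({"Student", "Course", "Semester"}, {"Prof"}),
    ({"Prof", "Semester"}, {"Course"}),
    ({"Course", "Semester", "Time"}, {"Prof"}),
    ({"Prof", "Semester", "Time"}, {"Room", "Course"}),
    ({"Prof", "Semester", "Course", "Time"}, {"Room"}),
]

for lhs, rhs in minimal_cover(FDS):
    print(", ".join(sorted(lhs)), "->", ", ".join(sorted(rhs)))
```

On this input the sketch keeps four dependencies: the first three as given, plus Prof, Semester, Time -> Room.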

Example: Single Right Sides

T1. Student, Course, Semester -> Prof

T2. Prof, Semester -> Course

T3. Course, Semester, Time -> Prof

T4. Prof, Semester, Time -> Room, Course

T5. Prof, Semester, Course, Time -> Room

T4 is split:

T4a. Prof, Semester, Time -> Room

T4b. Prof, Semester, Time -> Course

Example: Finding Redundant Attributes

• Consider T1: Student, Course, Sem -> Prof
– {Student, Course}+ = {Student, Course}
– {Student, Sem}+ = {Student, Sem}
– {Course, Sem}+ = {Course, Sem}

• Since Prof cannot be derived without all 3 attributes, T1 has no redundant attributes

• T2 and T3 and T4a are similar (no redundant attributes)

• T4b is T2 with Time added on the left side, so Time is redundant in T4b. With Time dropped, T4b duplicates T2 and can be removed from the set.

Example: Redundant Attributes in T5

• T5: Prof, Semester, Course, Time -> Room
• {Prof, Semester, Time}+ = {Prof, Semester, Time, Course, Room}
– Prof, Semester, Time -> Room by T4a
– Therefore, Course is redundant in T5
– Removing Course from T5 makes it the same as T4a

Example After Redundant Attribute Removal

T1. Student, Course, Semester -> Prof

T2. Prof, Semester -> Course

T3. Course, Semester, Time -> Prof

T4a. Prof, Semester, Time -> Room

Example: Remove Redundant Dependencies

T1. Student, Course, Semester -> Prof
// Not redundant (see next slide)

T2. Prof, Semester -> Course
// Course cannot be derived any other way

T3. Course, Semester, Time -> Prof
// Not redundant (see next slide)

T4a. Prof, Semester, Time -> Room
// Room cannot be derived any other way

Show T1 is not Redundant

• Remove T1 from the set of dependencies
• Compute {Student, Course, Semester}+ using only T2, T3, and T4a
– No left sides are satisfied by this combination, so the closure is simply {Student, Course, Semester}

• Because Prof is not in the closure, T1 is not redundant

(Similar reasoning for T3)

Creating the Relations

• Each relation has all (and only) the attributes mentioned in one dependency:

R1 = { Student, Course, Semester, Prof }
R2 = { Prof, Semester, Course }
R3 = { Course, Semester, Time, Prof }
R4 = { Prof, Semester, Time, Room }

Since none of these contains a key for the whole relation, we add

R5 = { Student, Course, Semester, Time }
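The construction of R1 through R5 can be sketched as below. The sketch assumes the minimal cover (T1, T2, T3, T4a) and one candidate key are already known; `synthesize_3nf` is a hypothetical helper name, and the "contains a key" test is done by checking whether a relation's attributes determine every attribute.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ATTRS = {"Student", "Course", "Prof", "Semester", "Room", "Time"}
MIN_COVER = [
    ({"Student", "Course", "Semester"}, {"Prof"}),   # T1
    ({"Prof", "Semester"}, {"Course"}),              # T2
    ({"Course", "Semester", "Time"}, {"Prof"}),      # T3
    ({"Prof", "Semester", "Time"}, {"Room"}),        # T4a
]
KEY = {"Student", "Course", "Semester", "Time"}


def synthesize_3nf(attrs, cover, key):
    # Combine FDs that share a left side, then build one relation per group.
    grouped = {}
    for lhs, rhs in cover:
        grouped.setdefault(frozenset(lhs), set()).update(rhs)
    relations = [set(lhs) | rhs for lhs, rhs in grouped.items()]
    # Add a key relation only if no relation already determines everything.
    if not any(closure(rel, cover) == attrs for rel in relations):
        relations.append(set(key))
    return relations


for number, relation in enumerate(synthesize_3nf(ATTRS, MIN_COVER, KEY), 1):
    print(f"R{number} =", sorted(relation))
```

In this example no two dependencies share a left side, so the grouping step changes nothing and each dependency yields its own relation.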

Evaluating the Result

• Every functional dependency in the minimal cover is contained in at least one relation in the schema
– No dependencies are lost (this is dependency preserving)

• At most one non-key attribute per relation, so a transitive dependency would have to lead back to a key attribute (as in R3)

• Therefore, our result is in 3NF and dependency preserving.

Result is Not BCNF

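• The result is not BCNF because of the extra dependency in R3:
– R3 = { Course, Semester, Time, Prof }
– Prof, Semester -> Course
– The determinant {Prof, Semester} is not a superkey of R3. That is acceptable for 3NF, because Course belongs to a candidate key of R3, but not for BCNF.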
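The difference between the two tests can be sketched in code for R3. The sketch assumes the only dependencies that hold inside R3 are T3 and T2, and it finds R3's candidate keys by brute force; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


def check_normal_forms(relation, fds):
    """Return (is_bcnf, is_3nf) for a relation and the FDs that hold in it."""
    prime = set().union(*candidate_keys(relation, fds))
    is_bcnf = is_3nf = True
    for lhs, rhs in fds:
        if rhs <= lhs or closure(lhs, fds) >= relation:
            continue  # trivial FD or superkey determinant: fine for both
        is_bcnf = False
        # 3NF still holds if every attribute in Y but not X is in some key.
        if not (rhs - lhs) <= prime:
            is_3nf = False
    return is_bcnf, is_3nf


R3 = {"Course", "Semester", "Time", "Prof"}
R3_FDS = [
    ({"Course", "Semester", "Time"}, {"Prof"}),
    ({"Prof", "Semester"}, {"Course"}),
]

print(candidate_keys(R3, R3_FDS))
print(check_normal_forms(R3, R3_FDS))
```

R3 has two overlapping composite candidate keys, {Course, Semester, Time} and {Prof, Semester, Time}, which is the situation in which a 3NF relation can fail BCNF.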