Relational Database Design Management Databaseptw/teaching/DBM/normal-forms.pdf · Normalisation...
Database Management
Peter Wood

• Relational Database Design
  • Update Anomalies
  • Data Redundancy
• Normal Forms
  • FD Inference
  • Boyce-Codd Normal Form
  • Third Normal Form
• Normalisation Algorithms
  • Lossless Join
  • BCNF Algorithm
  • BCNF Examples
  • Dependency Preservation
  • 3NF Algorithm

Relational Database Design
There are two interconnected problems which are caused by bad database design:
• Redundancy problems
• Update anomalies
Good database design is based on using certain normal forms for relation schemas.
Example 1
Let F1 = {E → D, D → M, M → D}, where E stands for ENAME, D stands for DNAME and M stands for MNAME.
A relation r1 over EMP1 (whose schema is {ENAME, DNAME, MNAME}):

ENAME    DNAME      MNAME
Mark     Computing  Peter
Angela   Computing  Peter
Graham   Computing  Peter
Paul     Math       Donald
George   Math       Donald

E is the only key for EMP1 w.r.t. F1.
Problems with EMP1 and F1
1. We cannot represent a department and manager without any employees (i.e., we cannot insert a tuple with a null ENAME because of entity integrity); such a problem is called an insertion anomaly.
2. For the same reason as (1), we cannot delete all the employees in a department and keep just the department information; such a problem is called a deletion anomaly.
More problems with EMP1 and F1
3. E.g. in the first tuple, modifying “Peter” to “Philip” or “Computing” to “Math” does not violate any FD resulting from a key, but D → M would be violated (D is not a key for EMP1 w.r.t. F1); such a problem is called a modification anomaly.
• In (3) it is not sufficient to check that r1 satisfies the FDs resulting from the keys of EMP1 w.r.t. F1.
• Ideally, we would like all the FDs of a relation schema to be inferred from key dependencies, i.e. FDs of the form K → schema(R), where K is a key for R w.r.t. F.
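The modification anomaly in (3) can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the helper name satisfies and the dictionary encoding of r1 are assumptions made for illustration. It changes “Peter” to “Philip” in the first tuple and tests the key dependency E → DM and the FD D → M separately.

```python
def satisfies(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True

# The relation r1 over EMP1 from Example 1.
r1 = [
    {"ENAME": "Mark", "DNAME": "Computing", "MNAME": "Peter"},
    {"ENAME": "Angela", "DNAME": "Computing", "MNAME": "Peter"},
    {"ENAME": "Graham", "DNAME": "Computing", "MNAME": "Peter"},
    {"ENAME": "Paul", "DNAME": "Math", "MNAME": "Donald"},
    {"ENAME": "George", "DNAME": "Math", "MNAME": "Donald"},
]

# Modify the first tuple: "Peter" becomes "Philip".
modified = [dict(r1[0], MNAME="Philip")] + r1[1:]

key_fd_ok = satisfies(modified, ["ENAME"], ["DNAME", "MNAME"])  # E -> DM
d_to_m_ok = satisfies(modified, ["DNAME"], ["MNAME"])           # D -> M
print(key_fd_ok, d_to_m_ok)  # True False
```

The key dependency still holds after the update, so checking key dependencies alone would not detect that D → M has been broken.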
Final problem with EMP1 and F1
4. There is redundancy in r1, i.e. for every employee in a given department MNAME is repeated.
• “Peter” appears three times for “Computing” and “Donald” twice for “Math”.
Example 2
Let F2 = {E → S}, where E stands for ENAME, S stands for SAL and C stands for CNAME.
A relation r2 over EMP2 (whose schema is {ENAME, CNAME, SAL}):

ENAME    CNAME   SAL
Jack     Jill    25
Jack     Jake    25
Jack     John    25
Donald   Dan     30
Donald   David   30

EC is the only key for EMP2 w.r.t. F2.
Problems with EMP2 and F2
1. Insertion anomaly: we cannot insert an employee without any children.
2. Deletion anomaly: if there is a mistake and “Donald” does not have any children, we cannot record this fact by deleting the two tuples for “Donald” without also losing the record of his salary.
3. Modification anomaly: if we modify the salary of “Jack” in the first tuple to be 27 instead of 25, no FD resulting from a key will be violated, but E → S would be violated.
4. Redundancy: the salary of each employee is repeated for every child.
Formalising Redundancy Problems
Let R be a relation schema and F be a set of FDs over R.
Definition. R has a redundancy problem if
(1) there exists a relation r over R that satisfies F, and
(2) there exists an FD X → A in F and two distinct tuples in r that have equal XA values.
• It can be shown that redundancy problems give rise to update anomalies and vice versa.
• Verify that the schemas of Examples 1 and 2 have redundancy problems.
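Condition (2) of the definition can be searched for directly on a given relation. The sketch below is an illustration only, assuming tuples are stored positionally under a header; the function name redundancy_witnesses is hypothetical. It assumes the caller has already checked condition (1), i.e. that the relation satisfies F, and it uses r2 and F2 from Example 2.

```python
from itertools import combinations

def redundancy_witnesses(header, rows, fds):
    """Yield (lhs, attr, t1, t2): distinct tuples t1, t2 with equal values on lhs and attr."""
    col = {name: i for i, name in enumerate(header)}
    distinct = sorted(set(rows))  # a relation is a set of tuples
    for lhs, attr in fds:
        idx = [col[a] for a in lhs] + [col[attr]]
        for t1, t2 in combinations(distinct, 2):
            if all(t1[i] == t2[i] for i in idx):
                yield lhs, attr, t1, t2

header = ("ENAME", "CNAME", "SAL")
r2 = [
    ("Jack", "Jill", 25),
    ("Jack", "Jake", 25),
    ("Jack", "John", 25),
    ("Donald", "Dan", 30),
    ("Donald", "David", 30),
]
f2 = [(("ENAME",), "SAL")]  # E -> S

witnesses = list(redundancy_witnesses(header, r2, f2))
for lhs, attr, t1, t2 in witnesses:
    print(lhs, "->", attr, ":", t1, t2)
```

Any single witness pair is enough to show that the schema has a redundancy problem; here every pair of tuples for the same employee is one.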
Problem for you to work on
Consider the following relation over schema Films:

Title              Year   Genre       StarName
Star Wars          1977   SciFi       Carrie Fisher
Star Wars          1977   SciFi       Harrison Ford
Raiders . . .      1981   Action      Harrison Ford
Raiders . . .      1981   Adventure   Harrison Ford
When Harry . . .   1989   Comedy      Carrie Fisher

Assume that the only FD that holds on Films is Title → Year.
What is the only key for Films?
Give an example of
1. an insertion anomaly
2. a deletion anomaly
3. a modification anomaly
4. a redundancy problem
for Films.
Normal Forms
• We assume that we are given a (1NF) relation schema R and a set F of functional dependencies (FDs) over R.
• We define two normal forms for relation schemas:
  • Boyce-Codd Normal Form (BCNF)
  • Third Normal Form (3NF)
• BCNF guarantees that the relation schema has no redundancy problems.
• BCNF is stronger than 3NF: if R is in BCNF, then R is in 3NF.
• 3NF, however, does sometimes have some advantages (see later).
Rules of inference for FDs
Given a set F of FDs, other FDs can be derived from those in F.
For example, if F contains E → D and D → M, then E → M can be derived from F (transitivity).
An FD X → Y is trivial if Y ⊆ X; otherwise it is nontrivial.
There are 3 rules of inference for FDs, known as Armstrong’s Axioms:
1. Reflexivity. If Y ⊆ X, then X → Y (trivial FDs).
2. Augmentation. If X → Y, then XA → YA for any attribute A not in X or Y.
3. Transitivity. If X → Y and Y → Z, then X → Z.
Closure of a set of FDs
If X → Y can be derived from a set of FDs F, we write this as F ⊢ X → Y.
Given F, the closure of F, denoted by F+, is the set of all FDs that can be derived (or proven) from F. That is,
F+ = {X → Y | F ⊢ X → Y}.
The closure of a set of attributes, CLOSURE(X, F), effectively uses Armstrong’s Axioms to find all attributes determined by X.
From CLOSURE(X, F) one can find all FDs in F+ that have X on the left-hand side.
For example, if CLOSURE(HR, F) = HRCT, then F+ contains HR → C, HR → T, . . .
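One common way to compute CLOSURE(X, F) is a fixpoint loop: start from X and keep adding the right-hand side of any FD whose left-hand side is already covered. The Python sketch below is an illustration under that assumption, not the slides' own definition; FDs are encoded as (lhs, rhs) pairs of sets, and the example uses F1 = {E → D, D → M, M → D} from the EMP1 example.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left-hand side is already determined, so is the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F1 = [({"E"}, {"D"}), ({"D"}, {"M"}), ({"M"}, {"D"})]
print(sorted(closure({"E"}, F1)))  # ['D', 'E', 'M']
print(sorted(closure({"D"}, F1)))  # ['D', 'M']
```

Since the closure of E is the whole schema {E, D, M} while the closures of D and M are not, E is a key for EMP1, in agreement with Example 1.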
Boyce-Codd Normal Form
Assume we are given a set F of FDs, along with its closure F+.
Definition. R is in Boyce-Codd Normal Form (BCNF) w.r.t. F if for every non-trivial FD X → Y in F+, X is a superkey for R w.r.t. F.
Example 1
Let schema(R1) = {STUDENT, POSITION, SUBJECT}; S stands for STUDENT, J stands for SUBJECT and P stands for POSITION.
Let F1 = {SJ → P, PJ → S}.
• Is R1 in BCNF w.r.t. F1?
Example 2
Let schema(R2) = {STREET, CITY, POSTCODE}; S stands for STREET, C stands for CITY and P stands for POSTCODE.
Let F2 = {SC → P, P → C}.
• Is R2 in BCNF w.r.t. F2?
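The BCNF definition can be tested by brute force on small schemas: for every proper subset X of the schema, compute its attribute closure and report any attribute it determines when X is not a superkey. The following Python sketch is one possible way to do this and is exponential in the number of attributes; the function names closure and bcnf_violations are assumptions for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(schema, fds):
    """List (X, A) such that X -> A is nontrivial, holds in F+, and X is not a superkey."""
    schema = frozenset(schema)
    found = []
    for size in range(1, len(schema)):
        for xs in combinations(sorted(schema), size):
            x = set(xs)
            determined = closure(x, fds) & schema
            if not schema <= determined:  # X is not a superkey
                found += [(xs, a) for a in sorted(determined - x)]
    return found

F1 = [({"S", "J"}, {"P"}), ({"P", "J"}, {"S"})]  # Example 1
F2 = [({"S", "C"}, {"P"}), ({"P"}, {"C"})]       # Example 2
print(bcnf_violations("SJP", F1))  # []
print(bcnf_violations("SCP", F2))  # [(('P',), 'C')]
```

An empty list means the schema is in BCNF. For Example 2 the only violation found is P → C, because P on its own does not determine S.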
Third Normal Form
Definition. An attribute A in schema(R) is said to be prime w.r.t. F if A is a member of one of the keys of R w.r.t. F.
Definition. R is in Third Normal Form (3NF) w.r.t. F if for every non-trivial FD X → A in F+ either X is a superkey for R w.r.t. F or A is prime.
So 3NF is weaker than BCNF: the second alternative (A is prime) admits FDs that BCNF forbids.
Examples
• Example 1, with R1 = {S, J, P} and F1 = {SJ → P, PJ → S}, is in BCNF.
• Therefore R1 is in 3NF.
• What about Example 2, with R2 = {S, C, P} and F2 = {SC → P, P → C}, which was not in BCNF?
• What are the keys and prime attributes of R2 w.r.t. F2?
• Is R2 in 3NF w.r.t. F2?
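Questions of this kind can be explored with a small brute-force program: enumerate attribute subsets by increasing size to find the (minimal) keys, take their union as the prime attributes, and then apply the 3NF definition. The Python sketch below is an illustration only; the names candidate_keys and is_3nf are hypothetical and the search is exponential, so it is suitable only for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure covers the schema."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for xs in combinations(sorted(schema), size):
            x = frozenset(xs)
            if any(k <= x for k in keys):
                continue  # x contains a smaller key, so it is not minimal
            if schema <= closure(x, fds):
                keys.append(x)
    return keys

def is_3nf(schema, fds):
    """Check the 3NF definition: X is a superkey or A is prime, for each nontrivial X -> A."""
    schema = frozenset(schema)
    prime = frozenset().union(*candidate_keys(schema, fds))
    for size in range(1, len(schema)):
        for xs in combinations(sorted(schema), size):
            x = set(xs)
            determined = closure(x, fds) & schema
            if schema <= determined:
                continue  # X is a superkey
            if any(a not in prime for a in determined - x):
                return False
    return True

F2 = [({"S", "C"}, {"P"}), ({"P"}, {"C"})]
print([sorted(k) for k in candidate_keys("SCP", F2)])
print(is_3nf("SCP", F2))
```

Running it on R2 and F2 prints the keys and the 3NF verdict, which can be compared with an answer worked out by hand.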
BCNF/3NF Example 3
Let R3 be a relation schema, with schema(R3) = {ENAME, DNAME, MNAME}; E stands for ENAME, D stands for DNAME and M stands for MNAME.
Let F3 = {E → D, D → M}.
• Is R3 in BCNF w.r.t. F3?
• What are the keys and prime attributes of R3 w.r.t. F3?
• Is R3 in 3NF w.r.t. F3?
BCNF/3NF Example 4
Let R4 be a relation schema, with schema(R4) = {ENAME, CNAME, SAL}; E stands for ENAME, C stands for CNAME and S stands for SAL.
Let F4 = {E → S}.
• Is R4 in BCNF w.r.t. F4?
• What are the keys and prime attributes of R4 w.r.t. F4?
• Is R4 in 3NF w.r.t. F4?
Problem for you to work on
Consider relation schema R where schema(R) = ABCD.
Let the set F of FDs which hold on R be {AB → C, C → D, D → A}.
1. What are all the keys of R? (done earlier)
2. Which FDs violate BCNF?
3. Which FDs violate 3NF?
BCNF Normalisation Algorithm
In practice, we are given:
• an entity-relationship diagram (ERD) and
• a set of functional dependencies (FDs) F.
To produce a database design, we
1. Convert the ERD into a database schema S.
2. If any of the relation schemas in S are not in BCNF with respect to F, we decompose them.
Decomposing a relation schema
Given a relation schema R which is not in BCNF, we decompose it by
• replacing it by two (smaller) relation schemas R1 and R2
• such that R = R1 ∪ R2.
For example, given R = {E, C, S} (employee, child, salary) and F = {E → S}, we might decompose R into
• R1 = {E, C} and
• R2 = {E, S}
How do we decide which attributes go into which decomposed relation schemas?
Lossless join
What if we choose a different decomposition for our example?
schema(R) = {ENAME, CNAME, SAL} and single FD: ENAME → SAL
(Modified) relation r over R is given by

ENAME    CNAME   SAL
Jack     Diane   25
Jack     John    25
Donald   Diane   30
Donald   David   30
If we decompose R into {ENAME, CNAME} and {CNAME, SAL} as follows:

ENAME    CNAME
Jack     Diane
Jack     John
Donald   Diane
Donald   David

CNAME   SAL
Diane   25
John    25
Diane   30
David   30

and then perform the natural join, we get

ENAME    CNAME   SAL
Jack     Diane   25
Jack     Diane   30
Jack     John    25
Donald   Diane   25
Donald   Diane   30
Donald   David   30

⇒ with two tuples that were not in the original relation: (Jack, Diane, 30) and (Donald, Diane, 25).
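The join above can be reproduced with plain Python sets. This is a small illustrative sketch, with tuples written positionally as (ENAME, CNAME, SAL); the variable names are assumptions. It projects r onto the two schemas, joins on CNAME, and lists the tuples that were not in r. For comparison it also rejoins the projections onto {ENAME, CNAME} and {ENAME, SAL}.

```python
# The (modified) relation r over {ENAME, CNAME, SAL}.
r = {
    ("Jack", "Diane", 25),
    ("Jack", "John", 25),
    ("Donald", "Diane", 30),
    ("Donald", "David", 30),
}

ec = {(e, c) for e, c, s in r}  # projection onto {ENAME, CNAME}
cs = {(c, s) for e, c, s in r}  # projection onto {CNAME, SAL}
es = {(e, s) for e, c, s in r}  # projection onto {ENAME, SAL}

# Natural join of ec and cs on the common attribute CNAME.
joined = {(e, c, s) for e, c in ec for c2, s in cs if c == c2}
spurious = joined - r
print(sorted(spurious))  # [('Donald', 'Diane', 25), ('Jack', 'Diane', 30)]

# Natural join of ec and es on the common attribute ENAME.
rejoined = {(e, c, s) for e, c in ec for e2, s in es if e == e2}
print(rejoined == r)  # True
```

Joining on CNAME creates the extra tuples because “Diane” is a child of both employees; joining on ENAME gives back exactly r.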
• A decomposition such as that into {ENAME, CNAME} and {CNAME, SAL} is called lossy.
  • We started knowing Jack’s salary was 25.
  • After decomposing, if we query Jack’s salary we get both 25 and 30.
  • The decomposition does not faithfully represent the original information we had.
• A decomposition which does faithfully represent the original information is called lossless.
• Losslessness is guaranteed if we ensure that the common attributes between a pair of decomposed relation schemas are a key for one of them.
• The BCNF algorithm ensures lossless decompositions.
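The condition on common attributes can be checked with an attribute closure: if the closure of the common attributes covers one of the two schemas, those attributes determine every attribute of that schema. The Python sketch below illustrates this for the two decompositions of {ENAME, CNAME, SAL} discussed so far; the function name is_lossless_pair is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_pair(r1, r2, fds):
    """True if the attributes shared by r1 and r2 determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined

F = [({"ENAME"}, {"SAL"})]
lossy = is_lossless_pair({"ENAME", "CNAME"}, {"CNAME", "SAL"}, F)
lossless = is_lossless_pair({"ENAME", "CNAME"}, {"ENAME", "SAL"}, F)
print(lossy, lossless)  # False True
```

In the first decomposition the common attribute CNAME determines nothing else, whereas in the second the common attribute ENAME is a key for {ENAME, SAL}.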
Decomposition Condition used by Algorithm
The FD ENAME → SAL violates BCNF in schema R.
• ENAME is the left-hand side (LHS) of the violating FD
• SAL is the right-hand side (RHS) of the violating FD
Split {ENAME, CNAME, SAL} into two relation schemas:
1. R1 = EMPLOYEE, containing all the attributes in the violating FD, i.e., schema(EMPLOYEE) = {ENAME, SAL}, and F1 = {ENAME → SAL}.
2. R2 = DEPENDENT, containing all attributes in R except those on the RHS of the violating FD, i.e., schema(DEPENDENT) = {ENAME, CNAME}, and F2 = ∅ (excluding trivial FDs).
Algorithm DECOMPOSE(R, F)
let the output database schema Out be empty;
if IS-BCNF(R, F) then
    add R to Out;
else
    let X → A in F+ be nontrivial (i.e. A is not in X), with X and A in schema(R), such that X is not a superkey for R with respect to F;
    let R1 have schema(R1) = X ∪ {A};
    merge DECOMPOSE(R1, F) and Out;
    let R2 have schema(R2) = schema(R) − {A};
    merge DECOMPOSE(R2, F) and Out;
end if
return Out;
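The pseudocode can be turned into a short recursive Python function. What follows is a sketch under stated assumptions rather than a definitive implementation: the violating FD is found by brute force over subsets of the schema, using attribute closures so that FDs in F+ are covered; ties are broken alphabetically; and the names find_violation and decompose are hypothetical. It is exponential in the schema size and meant only for small examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_violation(schema, fds):
    """Return (X, A) with X -> A nontrivial in F+ and X not a superkey, or None."""
    for size in range(1, len(schema)):
        for xs in combinations(sorted(schema), size):
            x = frozenset(xs)
            determined = closure(x, fds) & schema
            if not schema <= determined and determined - x:
                return x, min(determined - x)
    return None

def decompose(schema, fds):
    """Return a set of schemas (frozensets), each in BCNF w.r.t. fds."""
    schema = frozenset(schema)
    violation = find_violation(schema, fds)
    if violation is None:  # corresponds to IS-BCNF(R, F)
        return {schema}
    x, a = violation
    # R1 = X ∪ {A} and R2 = schema(R) − {A}, as in the pseudocode.
    return decompose(x | {a}, fds) | decompose(schema - {a}, fds)

fds_emp = [({"ENAME"}, {"SAL"})]
result = decompose({"ENAME", "CNAME", "SAL"}, fds_emp)
print(sorted(sorted(s) for s in result))  # [['CNAME', 'ENAME'], ['ENAME', 'SAL']]
```

For the employee example this yields {ENAME, SAL} and {ENAME, CNAME}. Because several violating FDs may exist, a different choice of violation can lead to a different final schema.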
Algorithm properties
The natural join can be applied to all of the relations in DECOMPOSE(R, F) to recover precisely the information stored in any relation over schema(R); this is known as the lossless join property.
Note that, in general, we need to consider F+, the closure of F, to check whether there are any FDs which violate BCNF.
But we can start trying to find violations in F, and only consider F+ once we find no violations in F.
Another Example of BCNF Decomposition
Let STUD be a relation schema, with schema(STUD) = {SNUM, POSTCODE, CITY, COUNTRY}, with FDs {SNUM → POSTCODE, POSTCODE → CITY, CITY → COUNTRY}.
• CITY → COUNTRY violates BCNF in STUD, so decompose STUD into
  CC, with schema(CC) = {CITY, COUNTRY}, and
  STUD1, with schema(STUD1) = {SNUM, POSTCODE, CITY}.
• CC is in BCNF while POSTCODE → CITY violates BCNF in STUD1, so decompose STUD1 into
  PC, with schema(PC) = {POSTCODE, CITY}, and
  SINFO, with schema(SINFO) = {SNUM, POSTCODE}.
• All the relation schemas in the database schema {CC, PC, SINFO} are now in BCNF.
A Third Example
• Consider a modified relation schema EMP, with attributes ENAME, CNAME (child name), DNAME (department name) and MNAME (manager name).
• The set of FDs is F = {E → D, D → M, M → D}, where E stands for ENAME, D stands for DNAME and M stands for MNAME (and C stands for CNAME).
• All three FDs violate BCNF since EC is the only key.
• We can choose any one of them as the basis for the first decomposition step.
• We will consider all three decompositions in turn.
Third Example: Decomposition 1
• If we first decompose using D → M, we get two schemas with attributes {D, M} and {E, C, D}.
• FDs D → M and M → D are applicable to {D, M}, but both D and M are keys.
• FD E → D is applicable to {E, C, D} and E is not a superkey.
• So we decompose {E, C, D} into {E, D} and {E, C}.
• E is a key for {E, D} and EC is the key for {E, C}.
• So the final database schema comprises {D, M}, {E, D} and {E, C}.
Third Example: Decomposition 2
• If we first decompose using E → D, we get two schemas with attributes {E, D} and {E, C, M}.
• E → D is applicable to {E, D}, but E is a key.
• What FDs are applicable to {E, C, M}?
• None of E → D, D → M or M → D apply because D is not in {E, C, M}.
• We have to consider all FDs in F+.
• Recall that E → M follows from E → D and D → M.
• E → M violates BCNF in {E, C, M} because E is not a key.
• So we decompose {E, C, M} into {E, M} and {E, C}.
• So the final database schema comprises {E, D}, {E, M} and {E, C}.
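Finding the FDs of F+ that apply to a sub-schema such as {E, C, M} amounts to projecting F onto that schema. One way to do it, sketched below in Python as an illustration, is to take the closure of every proper subset X of the schema and keep the attributes of the schema that X determines beyond itself. The function name project is hypothetical and the search is exponential in the schema size.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def project(fds, schema):
    """Nontrivial FDs X -> A of F+ with X a proper subset of schema and A in schema."""
    schema = frozenset(schema)
    projected = []
    for size in range(1, len(schema)):
        for xs in combinations(sorted(schema), size):
            x = set(xs)
            for a in sorted((closure(x, fds) & schema) - x):
                projected.append(("".join(xs), a))
    return projected

F = [({"E"}, {"D"}), ({"D"}, {"M"}), ({"M"}, {"D"})]
print(project(F, "ECM"))  # [('E', 'M'), ('CE', 'M')]
```

The projection contains E → M (and its augmentation CE → M), even though no FD of F itself mentions only attributes of {E, C, M}.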
Third Example: Decomposition 3
• If we first decompose using M → D, we get two schemas with attributes {M, D} and {E, C, M}.
• FDs D → M and M → D are applicable to {M, D}, but both D and M are keys.
• Once again we have {E, C, M}, so it is decomposed as before into {E, M} and {E, C}.
• So the final database schema comprises {M, D}, {E, M} and {E, C}.
An example for you to try
Let R be a relation schema, with schema(R) = {C, T, H, R, S, G}.
- C stands for a course,
- T stands for a teacher,
- H stands for hour,
- R stands for room,
- S stands for student and
- G stands for grade.
An example set of FDs F over R:
1. C → T,
2. HR → C,
3. HT → R,
4. CS → G and
5. HS → R.
Decompose R into BCNF.
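A useful first step is to find the keys of R, since an FD violates BCNF only when its left-hand side is not a superkey. The sketch below is one possible way to do this by exhaustive search; it is not from the slides, the names `closure` and `candidate_keys` are hypothetical, and it leaves the decomposition itself to you.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure covers `schema`."""
    schema = set(schema)
    keys = []
    # Smaller subsets are tried first, so any superset of a found key is skipped.
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue
            if closure(x, fds) >= schema:
                keys.append(x)
    return keys


F = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
keys = candidate_keys("CTHRSG", F)
print(keys)
```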
Dependency Preservation
Recall the example F3 = {SC → P, P → C}.
S stands for Street, C stands for City and P stands for Postcode.
{S, C, P} is not in BCNF.
Decompose {S, C, P} into {P, C} and {P, S}:

P    C
p1   c
p2   c

P    S
p1   s
p2   s

The only FD that can be tested in the decomposition is P → C.
When we join the two relations, we see that SC → P is violated.
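The violation can be reproduced on the two small relations above. This is an illustrative sketch only: relations are modelled as lists of dictionaries, and the helper `satisfies` is a hypothetical name for a simple FD check.

```python
def satisfies(rows, lhs, rhs):
    """True if the FD lhs -> rhs holds in `rows` (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two rows agreeing on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


pc = [{"P": "p1", "C": "c"}, {"P": "p2", "C": "c"}]
ps = [{"P": "p1", "S": "s"}, {"P": "p2", "S": "s"}]

# Natural join of the two relations on their shared attribute P.
joined = [{**r1, **r2} for r1 in pc for r2 in ps if r1["P"] == r2["P"]]

p_to_c_ok = satisfies(pc, "P", "C")        # testable without a join
sc_to_p_ok = satisfies(joined, "SC", "P")  # only testable after the join

print(p_to_c_ok, sc_to_p_ok)
```

Each relation looks fine on its own; only the joined result reveals that street s in city c has two postcodes.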
Dependency Preservation
A decomposition is dependency preserving if the FDs which hold on the original relation schema can be tested on the decomposed schemas, without using joins.
We cannot always find a BCNF decomposition that is dependency preserving.
To test that no FDs are violated, we may need to join relations (expensive).
We can always find a 3NF dependency-preserving decomposition.
Dependency Preservation
For a starting set of attributes and FDs, some BCNF decompositions may be dependency preserving and some not.
Consider the example with attributes {E, C, D, M} and FDs F = {E → D, D → M, M → D}.
We had three possible decompositions:
1. {D, M}, {E, D} and {E, C}.
2. {E, D}, {E, M} and {E, C}.
3. {M, D}, {E, M} and {E, C}.
Which of them is dependency-preserving?
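One way to check your answer is the standard test that, for each FD X → Y in F, repeatedly closes X using only attributes visible inside a single decomposed schema. The slides do not give this algorithm; the sketch below is an assumed implementation, with hypothetical names `closure` and `preserves` and single-letter attributes.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(decomposition, fds):
    """True if every FD in `fds` follows from the FDs projected onto the schemas."""
    for lhs, rhs in fds:
        z = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in decomposition:
                s = set(schema)
                # Attributes derivable from z while staying inside this schema.
                gained = closure(z & s, fds) & s
                if not gained <= z:
                    z |= gained
                    changed = True
        if not set(rhs) <= z:
            return False
    return True


F = [("E", "D"), ("D", "M"), ("M", "D")]
decompositions = {
    1: ["DM", "ED", "EC"],
    2: ["ED", "EM", "EC"],
    3: ["MD", "EM", "EC"],
}
results = {n: preserves(d, F) for n, d in decompositions.items()}
print(results)
```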
3NF Algorithm
Given a relation schema R and a set of FDs F, the following steps produce a 3NF decomposition of R that satisfies the lossless join condition and is dependency preserving:
1. Remove all redundancies from F (we haven’t covered this).
2. For each FD X → A in F, use X ∪ {A} as the schema of one of the relations in the decomposition.
3. If none of the schemas from Step 2 includes a superkey for R, add another relation schema that is a key for R.
4. Delete any of the schemas from Step 2 that is contained in another.
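Steps 2 to 4 can be sketched in a few lines. The code below is an illustration under assumptions, not the lecturer's implementation: it skips Step 1 and assumes F already has no redundancies, attributes are single letters, and the names `closure`, `find_key` and `synthesise_3nf` are hypothetical.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(schema, fds):
    """Shrink the full schema to one key by dropping attributes that are not needed."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= set(schema):
            key.discard(a)
    return key


def synthesise_3nf(schema, fds):
    """Steps 2-4 of the 3NF algorithm; assumes `fds` has no redundancies (Step 1)."""
    schema = set(schema)
    # Step 2: one schema X ∪ {A} per FD X -> A (duplicates kept only once).
    parts = []
    for lhs, rhs in fds:
        part = frozenset(lhs) | frozenset(rhs)
        if part not in parts:
            parts.append(part)
    # Step 3: add a key for R if no schema so far includes a superkey.
    if not any(closure(p, fds) >= schema for p in parts):
        parts.append(frozenset(find_key(schema, fds)))
    # Step 4: delete schemas strictly contained in another schema.
    return [p for p in parts if not any(p < q for q in parts)]


emp = synthesise_3nf("ECDM", [("E", "D"), ("D", "M"), ("M", "D")])
addr = synthesise_3nf("SCP", [("SC", "P"), ("P", "C")])
print([sorted(p) for p in emp])
print([sorted(p) for p in addr])
```

On the {E, C, D, M} example, Step 3 adds the key {E, C} because neither {E, D} nor {D, M} includes a superkey. On {S, C, P}, Step 4 drops {P, C} because it is contained in {S, C, P}, so the relation is left undecomposed.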