Normalization
Introduction
Badly structured tables that contain redundant
data may suffer from update anomalies:
• Insertions
• Deletions
• Modifications
Bad structure may occur due to:
• Errors in the original ER diagram.
• Errors in the process of translating ER models into
tables.
Database Tables and Normalization
• Normalization is a technique to support the design of databases based on the relational model.
• Normalization helps reduce data redundancies and helps eliminate data anomalies.
• Normalization works through a series of stages called normal forms:
– First normal form (1NF)
– Second normal form (2NF)
– Third normal form (3NF)
• The highest level of normalization is not always desirable.
© Pearson Education Limited, 2004 4
Data redundancy and update anomalies
A major aim of relational database design is to group columns into tables to minimize data redundancy and reduce the file storage space required by the base tables.
Staff(staffNo, name, position, salary, branchNo)
Primary key staffNo
Foreign key branchNo references Branch(branchNo)
Branch(branchNo, branchAddress, telNo)
Primary key branchNo
StaffBranch(staffNo, name, position, salary, branchNo, branchAddress, telNo)
Primary key staffNo
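As a rough sketch, the separated design (Staff and Branch) and the combined design (StaffBranch) could be declared with Python's built-in sqlite3 module. The column types and the in-memory database are assumptions made for illustration; the slides give only column names and keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Separated design: branch details are stored once per branch.
conn.execute("""
    CREATE TABLE Branch (
        branchNo      TEXT PRIMARY KEY NOT NULL,
        branchAddress TEXT,
        telNo         TEXT
    )""")
conn.execute("""
    CREATE TABLE Staff (
        staffNo  TEXT PRIMARY KEY NOT NULL,
        name     TEXT,
        position TEXT,
        salary   REAL,
        branchNo TEXT REFERENCES Branch(branchNo)
    )""")

# Combined design: branch details are repeated on every staff row.
conn.execute("""
    CREATE TABLE StaffBranch (
        staffNo TEXT PRIMARY KEY NOT NULL,
        name TEXT, position TEXT, salary REAL,
        branchNo TEXT, branchAddress TEXT, telNo TEXT
    )""")

tables = sorted(row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```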
What is the problem?
• StaffBranch table has redundant data; the details of
a branch (branchAddress and telNo) are repeated
for every member of staff located at that branch.
• In contrast, in Branch table the branch information
appears only once for each branch and only the
branch number (branchNo) is repeated in the Staff
table, to represent where each member of staff is
located.
• Tables that have redundant data may suffer from
update anomalies (insertion, deletion, or
modification anomalies).
Insertion anomalies
• How do we insert the details of a new member of staff
at branch B002 into the StaffBranch table?
• Is there any problem with the tables separated?
Insertion anomalies (1)
To insert the details of a new member of staff at branch B002
into the StaffBranch table, we must enter the correct details of
branch B002 so that the branch details are consistent with the values
for branch B002 in other records of the StaffBranch table.
There is no problem with the tables separated, because there is no need to enter
the branch details; the foreign key alone is enough.
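The first insertion anomaly can be reproduced in a small sketch using Python's sqlite3 module. The staff numbers, names, the misspelled address, and the trimmed column set are hypothetical values chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Combined design: every new staff row must repeat the branch address.
conn.execute("""CREATE TABLE StaffBranch (
    staffNo TEXT PRIMARY KEY NOT NULL, name TEXT,
    branchNo TEXT, branchAddress TEXT)""")
conn.execute("INSERT INTO StaffBranch VALUES ('S001', 'First Person', 'B002', 'City center')")
# A slightly different spelling is accepted, so B002 now has two addresses.
conn.execute("INSERT INTO StaffBranch VALUES ('S002', 'Second Person', 'B002', 'City centre')")
addresses_for_b002 = conn.execute(
    "SELECT COUNT(DISTINCT branchAddress) FROM StaffBranch WHERE branchNo = 'B002'"
).fetchone()[0]

# Separated design: the new staff row carries only the foreign key.
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL, branchAddress TEXT)")
conn.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY NOT NULL, name TEXT,
    branchNo TEXT REFERENCES Branch(branchNo))""")
conn.execute("INSERT INTO Branch VALUES ('B002', 'City center')")
conn.execute("INSERT INTO Staff VALUES ('S002', 'Second Person', 'B002')")
staff_address = conn.execute("""
    SELECT b.branchAddress FROM Staff s JOIN Branch b ON b.branchNo = s.branchNo
    WHERE s.staffNo = 'S002'""").fetchone()[0]
```

In the combined table nothing stops the two rows from disagreeing about the address of B002, whereas in the separated design the address is looked up from the single Branch row.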
Insertion anomalies (2)
How do we insert the details of a new branch that
currently has no members of staff into the StaffBranch table?
Is there any problem with the tables separated?
Insertion anomalies (2)
To insert the details of a new branch that currently has no
members of staff into the StaffBranch table, it is necessary to enter
null into the staff-related columns, such as staffNo (the
primary key). This violates entity integrity and is not allowed.
There is no problem with the tables separated. We just enter the new branch
in the Branch table.
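The second insertion anomaly can be sketched in the same way. Branch B009 and its address are hypothetical, and NOT NULL is written explicitly on the key because SQLite would otherwise tolerate a null in a non-integer primary-key column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE StaffBranch (
    staffNo TEXT PRIMARY KEY NOT NULL, name TEXT,
    branchNo TEXT, branchAddress TEXT)""")

# A branch without staff has no staffNo to supply, so the row is rejected.
try:
    conn.execute("INSERT INTO StaffBranch VALUES (NULL, NULL, 'B009', 'New branch address')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # entity integrity: the primary key cannot be null

# Separated design: the branch is recorded on its own.
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL, branchAddress TEXT)")
conn.execute("INSERT INTO Branch VALUES ('B009', 'New branch address')")
branch_rows = conn.execute("SELECT COUNT(*) FROM Branch").fetchone()[0]
```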
Deletion anomalies
What happens if we delete a record from the StaffBranch table
that represents the last member of staff located at a branch?
Is there any problem with the tables separated? Why?
Deletion anomalies
If we delete a record from the StaffBranch table that
represents the last member of staff located at a branch,
the details about the branch are also lost from the database.
There is no problem with the tables separated, because branch
records are stored separately.
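A minimal sketch of the deletion anomaly, again with sqlite3 and hypothetical rows (one member of staff, S001, at branch B002):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Combined design: deleting the last staff row removes the branch details too.
conn.execute("""CREATE TABLE StaffBranch (
    staffNo TEXT PRIMARY KEY NOT NULL, branchNo TEXT, branchAddress TEXT)""")
conn.execute("INSERT INTO StaffBranch VALUES ('S001', 'B002', 'City center')")
conn.execute("DELETE FROM StaffBranch WHERE staffNo = 'S001'")
combined_left = conn.execute(
    "SELECT COUNT(*) FROM StaffBranch WHERE branchNo = 'B002'").fetchone()[0]

# Separated design: the Branch row survives the staff deletion.
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL, branchAddress TEXT)")
conn.execute("""CREATE TABLE Staff (
    staffNo TEXT PRIMARY KEY NOT NULL,
    branchNo TEXT REFERENCES Branch(branchNo))""")
conn.execute("INSERT INTO Branch VALUES ('B002', 'City center')")
conn.execute("INSERT INTO Staff VALUES ('S001', 'B002')")
conn.execute("DELETE FROM Staff WHERE staffNo = 'S001'")
separated_left = conn.execute(
    "SELECT COUNT(*) FROM Branch WHERE branchNo = 'B002'").fetchone()[0]
```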
Modification anomalies
What if we change the value of one of the
columns of a particular branch in the
StaffBranch table (e.g. the telephone number)?
Modification anomalies
We must update the records of all staff located
at that branch; otherwise the branch details become inconsistent.
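The cost of the modification can be made concrete by counting the rows an update touches. This is only a sketch: the three staff rows and both telephone numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Combined design: the telephone number is stored once per member of staff.
conn.execute("""CREATE TABLE StaffBranch (
    staffNo TEXT PRIMARY KEY NOT NULL, branchNo TEXT, telNo TEXT)""")
conn.executemany("INSERT INTO StaffBranch VALUES (?, 'B002', '555-0100')",
                 [("S001",), ("S002",), ("S003",)])
combined_updates = conn.execute(
    "UPDATE StaffBranch SET telNo = '555-0199' WHERE branchNo = 'B002'").rowcount

# Separated design: the telephone number is stored once per branch.
conn.execute("CREATE TABLE Branch (branchNo TEXT PRIMARY KEY NOT NULL, telNo TEXT)")
conn.execute("INSERT INTO Branch VALUES ('B002', '555-0100')")
separated_updates = conn.execute(
    "UPDATE Branch SET telNo = '555-0199' WHERE branchNo = 'B002'").rowcount
```

Here `combined_updates` counts one changed row per member of staff at B002, while `separated_updates` counts the single Branch row.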
First normal form (1NF)
Definition
A table in which the intersection of every column
and record contains only one value.
Only 1NF is critical in creating appropriate tables
for relational databases. All subsequent normal
forms are optional.
However, to avoid update anomalies, proceed to
3NF.
Problem: The telNos column does not comply with 1NF, because there are multiple values at the intersection of the telNos column with every record.
How do we solve the problem?
Solution: Create a separate table, BranchTelephone, to hold the telephone numbers of branches, and remove the telNos column from the Branch table.
NOTE: The primary key of the new BranchTelephone table is the new telNo column.
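The move to 1NF described above can be sketched in plain Python. The sample branches, the telephone numbers, and the assumption that telNos holds a comma-separated string are all hypothetical; the slides do not show how the multiple values are stored.

```python
# Branch before 1NF: telNos holds several values in one cell.
branch = [
    {"branchNo": "B001", "telNos": "555-0101, 555-0102"},
    {"branchNo": "B002", "telNos": "555-0201"},
]

# Branch after 1NF: the multi-valued column is removed.
branch_1nf = [{"branchNo": b["branchNo"]} for b in branch]

# BranchTelephone: one row per telephone number, keyed by telNo.
branch_telephone = [
    {"telNo": tel.strip(), "branchNo": b["branchNo"]}
    for b in branch
    for tel in b["telNos"].split(",")
]

# telNo can serve as the primary key only if no number repeats.
tel_nos = [row["telNo"] for row in branch_telephone]
tel_no_is_unique = len(tel_nos) == len(set(tel_nos))
```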
Functional dependency
• The particular relationships that we show between the columns of a table are more formally referred to as functional dependencies.
• Functional dependency describes the relationship between columns in a table.
Functional dependency
• Functional dependency in a table indicates how columns relate to one another.
• Column B is functionally dependent on column A (A → B) if, knowing the value of A, we find only one value of B in all records that have this value of A.
• We say that B is worked out from A.
• However, for a given value of B there may be several values of A.
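The definition of A → B can be tested against sample rows with a short helper. The function name `holds` and the sample rows are hypothetical, and such a check can only show that the data is consistent with a dependency, since a functional dependency is really a statement about what the columns mean.

```python
def holds(rows, a, b):
    """Return True if each value of column a appears with only one value of column b."""
    seen = {}
    for row in rows:
        # Remember the first b seen for this a; any later mismatch breaks a -> b.
        if seen.setdefault(row[a], row[b]) != row[b]:
            return False
    return True


staff = [
    {"staffNo": "S001", "branchNo": "B001"},
    {"staffNo": "S002", "branchNo": "B002"},
    {"staffNo": "S003", "branchNo": "B002"},
]

staff_to_branch = holds(staff, "staffNo", "branchNo")  # staffNo -> branchNo
branch_to_staff = holds(staff, "branchNo", "staffNo")  # branchNo -> staffNo
```

With these rows, branchNo is worked out from staffNo, but B002 appears with two different staff numbers, so the dependency does not hold in the other direction.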
Problem: The TempStaffAllocation table is not in 2NF. Why?
[Figure: TempStaffAllocation table, with its primary-key columns, non-primary-key columns, and functional dependencies marked]
Second normal form (2NF)
A table in 2NF is one that is in 1NF and in which each non-primary-key column can be worked out from the values in all the columns that make up the primary key (the primary-key columns).
This means every non-primary-key column is fully functionally dependent on the primary key.
"Fully" means dependent on A but not on any proper subset of A.
Second normal form (2NF)
NB: 2NF applies only to tables with composite primary keys (primary keys composed of two or more columns).
NB: A 1NF table with a single-column primary key is automatically in at least 2NF.
Functional dependency
branchAddress can be worked out from branchNo (part of the primary key). Every time B002 appears in the branchNo column, the same address “City center ……..” appears in branchAddress. The reverse is true. (partial dependency)
name and position can be worked out from staffNo (part of the primary key).
Every time S455 appears in the staffNo column, the name “Ellen Layman” and the position “assistant” appear in the name and position columns (partial dependency).
Functional dependency
hoursPerWeek can be worked out only from both staffNo and branchNo (the whole primary key).
• As a partial dependency exists on the primary key, the table is not in 2NF.
• 2NF is achieved by removing the partial dependency. How?
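One common way to remove a partial dependency is to move each partially dependent column into a new table together with a copy of the part of the primary key that determines it. The sketch below applies this to TempStaffAllocation in plain Python. The `project` helper, the new table names TempStaff and Branch, the shortened addresses, the second member of staff, branch B004, and all hours are assumptions made for illustration.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows, preserving order."""
    result = []
    for row in rows:
        item = {c: row[c] for c in columns}
        if item not in result:
            result.append(item)
    return result


temp_staff_allocation = [
    {"staffNo": "S455", "branchNo": "B002", "name": "Ellen Layman",
     "position": "assistant", "branchAddress": "City center", "hoursPerWeek": 16},
    {"staffNo": "S455", "branchNo": "B004", "name": "Ellen Layman",
     "position": "assistant", "branchAddress": "Another address", "hoursPerWeek": 9},
    {"staffNo": "S100", "branchNo": "B002", "name": "Other Person",
     "position": "assistant", "branchAddress": "City center", "hoursPerWeek": 14},
]

# Columns that depend only on staffNo move out with a copy of staffNo.
temp_staff = project(temp_staff_allocation, ["staffNo", "name", "position"])
# Columns that depend only on branchNo move out with a copy of branchNo.
branch = project(temp_staff_allocation, ["branchNo", "branchAddress"])
# What remains depends on the whole key (staffNo, branchNo).
allocation = project(temp_staff_allocation, ["staffNo", "branchNo", "hoursPerWeek"])
```

After the split, the name and position of S455 and the address of B002 are each stored once, and the allocation table keeps only hoursPerWeek alongside the composite key.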
Third normal form (3NF)
Definition
A table that is in 1NF and 2NF and in which all non-primary-key columns can be worked out from only the primary-key column(s) and no other columns.
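As an illustration of this definition, the StaffBranch table from earlier has a single-column primary key (staffNo), yet branchAddress can also be worked out from branchNo, which is not a primary-key column. The sketch below checks this on hypothetical sample rows; the `determines` helper and all values are assumptions, and a data check of this kind only suggests a dependency rather than proving it.

```python
def determines(rows, a, b):
    """True if every value of column a maps to exactly one value of column b."""
    mapping = {}
    for row in rows:
        mapping.setdefault(row[a], set()).add(row[b])
    return all(len(values) == 1 for values in mapping.values())


staff_branch = [
    {"staffNo": "S001", "branchNo": "B001", "branchAddress": "First address"},
    {"staffNo": "S002", "branchNo": "B002", "branchAddress": "City center"},
    {"staffNo": "S003", "branchNo": "B002", "branchAddress": "City center"},
]

# The primary key works out every column, as any key must.
key_determines_address = determines(staff_branch, "staffNo", "branchAddress")
# A non-primary-key column also works out branchAddress, which 3NF forbids.
non_key_determines_address = determines(staff_branch, "branchNo", "branchAddress")
```

Because branchAddress depends on branchNo as well as on the key, StaffBranch is not in 3NF. In the separate Staff and Branch tables, each table's non-key columns are listed as depending only on that table's own primary key.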