October 17, 2017 Sam Siewert
CS317 File and Database Systems
Lecture 8 – Introduction to Normalization
http://dilbert.com/strips/comic/2010-08-24/
Reminders
Exam #1 Questions?
Working on Grading Ex #3 - Return Next Week
Grading Breakdown here - http://mercury.pr.erau.edu/~siewerts/cs317/policies/Grading-Breakdown.pdf
Assignment #4, Wednesday, Normalization
Assignment #5, Logical and Physical DB Design
Assignment #6, DBMS Project of Your Interest
Normalization
Concern is Duplication of Data in DBMS
– Wastes Space
– Insert Hazard (Update Multiple Tables?)
– Delete Hazard (Delete from Multiple Tables?)
– Modification Hazard (Modify in Multiple Tables?)
– Foreign Keys are Exception (Expected Redundancy for Relational Model)
Minimal Attributes (Columns in Relations [Tables])
Attributes in Table with Close Logical Relationship
– Functionally Dependent Attributes in Same Relation
– Models of Functional Dependency
Minimal Redundancy [Foreign Keys Only]
4
How Normalization Supports Database Design (Ref. Connolly-Begg)
5
Data Redundancy and Update Anomalies
FK Duplication [ok]
Redundant Attribute Data
6
Example Functional Dependency that holds for all Time
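Consider the values shown in the staffNo and sName attributes of the Staff relation (previous slide).
Based on sample data, the following functional dependencies appear to hold.
staffNo → sName
sName → staffNo
Only staffNo → sName holds for all time: staff numbers are unique, but two staff members can share a name.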
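A minimal sketch of why sample data can mislead. The helper fd_holds and the rows are illustrative assumptions (only SL21 / John White and the staff numbers appear in the lecture); the second John White hired as SL100 is the case the lecture uses later when discussing partial dependencies.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; any later mismatch breaks the FD.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Illustrative sample rows (names other than John White are assumptions).
staff = [
    {"staffNo": "SL21", "sName": "John White"},
    {"staffNo": "SG37", "sName": "Ann Beech"},
    {"staffNo": "SG14", "sName": "David Ford"},
]

# On the sample, both dependencies appear to hold.
sample_ok = (fd_holds(staff, ["staffNo"], ["sName"]),
             fd_holds(staff, ["sName"], ["staffNo"]))

# A second John White is hired as SL100.
staff_later = staff + [{"staffNo": "SL100", "sName": "John White"}]
later_ok = (fd_holds(staff_later, ["staffNo"], ["sName"]),
            fd_holds(staff_later, ["sName"], ["staffNo"]))

print(sample_ok, later_ok)
```

A check like this can only refute a dependency on the data at hand. Whether a dependency holds for all time is a statement about the meaning of the attributes, not about any one sample.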
7
Data Redundancy and Update Anomalies
StaffBranch relation has redundant data; the details of a branch are repeated for every member of staff.
In contrast, the branch information appears only once for each branch in the Branch relation and only the branch number (branchNo) is repeated in the Staff relation, to represent where each member of staff is located.
8
Duplicate Data and Update Anomalies
Relations that contain redundant information may suffer from update anomalies.
3 update anomalies
– Row Insertion
  Enter SL99 assigned to B003
  "Fat finger" bAddress
  SG37, SG14, SG5 share B003 with SL99
  Which one is right?
– Deletion
  Delete SA9 (fired)
  What is bAddress of B007?
  Do we still have B007?
– Modification
  Correct Bad Street # for Deer Rd.
  Which row - SL21 or SL41 row?
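The modification and deletion cases can be sketched on a tiny denormalized table. The rows below are hypothetical: the staff and branch numbers and "Deer Rd" come from the slide, while the street numbers and the B007 address are assumptions made up for illustration.

```python
# Hypothetical StaffBranch rows: the branch address is repeated per staff member.
staff_branch = [
    {"staffNo": "SL21", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SL41", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SA9",  "branchNo": "B007", "bAddress": "16 Argyll St"},
]

def addresses_by_branch(rows):
    """Collect the set of distinct addresses recorded for each branch."""
    out = {}
    for r in rows:
        out.setdefault(r["branchNo"], set()).add(r["bAddress"])
    return out

# Modification anomaly: the street number is corrected in the SL21 row only.
staff_branch[0]["bAddress"] = "32 Deer Rd"
conflicts = {b: a for b, a in addresses_by_branch(staff_branch).items()
             if len(a) > 1}

# Deletion anomaly: removing SA9 also removes the only record of B007.
remaining = [r for r in staff_branch if r["staffNo"] != "SA9"]
branches_left = {r["branchNo"] for r in remaining}

print(conflicts)
print(branches_left)
```

With a separate Branch relation the address would be stored once per branch, so neither problem could arise.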
9
Lossless-join and Dependency Preservation Properties
Two important properties of decomposition.
– Lossless-join property enables us to find any instance of the original relation from corresponding instances in the smaller relations. I can create UNF table as a view if I want to!
– Dependency preservation property enables us to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations. E.g. domain constraints, referential integrity (all staff must have one branch assignment), staffNo must be unique, etc.
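A minimal sketch of the lossless-join idea, assuming a small hypothetical StaffBranch instance (the addresses are made up). The helpers project and natural_join are illustrative stand-ins for the relational operators, not part of any library.

```python
# Hypothetical StaffBranch instance.
staff_branch = [
    {"staffNo": "SL21", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SL41", "branchNo": "B005", "bAddress": "22 Deer Rd"},
    {"staffNo": "SG37", "branchNo": "B003", "bAddress": "163 Main St"},
]

def project(rows, attrs):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

def natural_join(left, right):
    """Join two non-empty lists of dicts on their shared attribute names."""
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]

def as_set(rows):
    """Order-independent view of a relation, for comparison."""
    return {tuple(sorted(r.items())) for r in rows}

# Decompose into the two smaller relations, then join them back.
staff = project(staff_branch, ["staffNo", "branchNo"])
branch = project(staff_branch, ["branchNo", "bAddress"])
rejoined = natural_join(staff, branch)

lossless = as_set(rejoined) == as_set(staff_branch)
print(len(branch), lossless)
```

The join reproduces the original rows exactly, with no spurious tuples, which is what allows the wide table to be offered as a view over the smaller relations.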
10
Functional Dependencies
Important concept associated with normalization.
Functional dependency describes relationship between attributes.
For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B), if each value of A in R is associated with exactly one value of B in R.
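Stated a little more formally (a standard formulation, not taken from the slide; the symbols r, t_1 and t_2 are notation introduced here): for every legal instance r of R, any two tuples that agree on A must also agree on B.

```latex
A \rightarrow B
\quad\Longleftrightarrow\quad
\forall\, t_1, t_2 \in r:\;
t_1[A] = t_2[A] \;\Rightarrow\; t_1[B] = t_2[B]
```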
11
Characteristics of Functional Dependencies
Property of the meaning or semantics of the attributes in a relation.
Diagrammatic representation.
The determinant of a functional dependency refers to the attribute or group of attributes on the left-hand side of the arrow.
12
An Example Functional Dependency
13
Characteristics of Functional Dependencies
Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A.
E.g. Branch assignment does not depend on your salary or position, just who you are.
14
Functional Dependencies
Determinants should have the minimal number of attributes necessary to maintain the functional dependency with the attribute(s) on the right-hand side. E.g. A staff member is assigned to one and only one branch. A branch has many staff members assigned to it.
This requirement is called full functional dependency.
15
Full vs. Partial Functional Dependency
Staff relation: staffNo, sName → branchNo
Each value of (staffNo, sName) is associated with a single value of branchNo. However, branchNo is also functionally dependent on a subset of (staffNo, sName), namely staffNo. The example above is therefore a partial dependency (the name is irrelevant).
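One way to test a determinant for minimality on sample data is to try every proper subset of it. This is a sketch under assumptions: fd_holds and partial_determinants are hypothetical helpers, and the rows are illustrative (SL21 → B005 and SL100 → B099 follow the lecture's John White example; the third row is made up).

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def partial_determinants(rows, lhs, rhs):
    """Proper, non-empty subsets of lhs that already determine rhs."""
    return [list(sub)
            for k in range(1, len(lhs))
            for sub in combinations(lhs, k)
            if fd_holds(rows, sub, rhs)]

staff = [
    {"staffNo": "SL21",  "sName": "John White", "branchNo": "B005"},
    {"staffNo": "SL100", "sName": "John White", "branchNo": "B099"},
    {"staffNo": "SL41",  "sName": "Julie Lee",  "branchNo": "B005"},  # assumed row
]

subsets = partial_determinants(staff, ["staffNo", "sName"], ["branchNo"])
print(subsets)
```

A non-empty result means the dependency is partial on this data: here staffNo alone determines branchNo, while sName alone does not, because the two John Whites work at different branches.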
16
Better Staff Branch Relations
[Figure: dependency diagram. Labels: "assigned to only one" (twice), "Full functional", "Partial functional"]
E.g. Two employees named John White: SL21 and a new John White as SL100
SL21 → B005
SL100 → B099
17
Transitive Dependencies
Important to recognize a transitive dependency because its existence in a relation can potentially cause update anomalies.
Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
18
Example Transitive Dependency
Consider functional dependencies in the StaffBranch relation (see Slide 17).
staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress
• bAddress is transitively dependent on staffNo via branchNo, because staffNo → branchNo and branchNo → bAddress.
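The transitive step can be made explicit by computing an attribute closure. This is a sketch of the standard closure procedure, with a hypothetical function name closure. The direct dependency staffNo → bAddress is deliberately left out of the input, to show that it follows from the other two.

```python
def closure(attrs, fds):
    """Attributes reachable from attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its whole left side is known and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    (("staffNo",), ("sName", "position", "salary", "branchNo")),
    (("branchNo",), ("bAddress",)),
]

staff_closure = closure({"staffNo"}, fds)
branch_closure = closure({"branchNo"}, fds)
print(sorted(staff_closure))
print(sorted(branch_closure))
```

bAddress ends up in the closure of staffNo only by way of branchNo, and staffNo is not in the closure of branchNo, so the side condition in the definition (A is not functionally dependent on B) is met.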
19
Better Staff Branch Relations
[Figure: dependency diagram. Labels: "assigned to only one", "has only one address"]
Branch address of staffNo is transitive
20
The Process of Normalization
21
The Process of Normalization
22
Unnormalized Form (UNF)
A table that contains one or more repeating groups.
Worst case
May also have Full/Partial Functional Dependencies
May also have Transitive Functional Dependencies
E.g. Most Excel Spreadsheets!!!
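A repeating group, and the flattening that removes it, can be sketched as follows. The nested layout is a hypothetical illustration of a spreadsheet-style row holding a list of staff per branch. The branch assignments follow the staff and branch numbers used earlier in the lecture.

```python
# Unnormalized: each branch row carries a repeating group of staff numbers.
unf = [
    {"branchNo": "B005", "staff": ["SL21", "SL41"]},
    {"branchNo": "B003", "staff": ["SG37", "SG14", "SG5"]},
]

# Flattened: one atomic value per attribute, one row per staff member.
flat = [{"branchNo": b["branchNo"], "staffNo": s}
        for b in unf
        for s in b["staff"]]

for row in flat:
    print(row)
```

Flattening only removes the repeating group. The result may still contain partial or transitive dependencies that later normalization steps have to deal with.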
Case in Point
Omission of data
Would an RDBMS have caught it?
Perhaps, if the data for the plot had been queried from a well-formed schema?
Spreadsheets tend to use "ranges" rather than predicates
Reinhart, Rogoff... and Herndon: The student who caught out the profs (BBC News story)
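The difference between a range and a predicate can be sketched with a few rows. All names and numbers below are hypothetical, invented for illustration only. They are not the data from the study in the news story.

```python
# Hypothetical rows; not the data from the actual study.
rows = [
    {"country": "A", "debt_ratio": 95,  "growth": 2.0},
    {"country": "B", "debt_ratio": 110, "growth": -1.0},
    {"country": "C", "debt_ratio": 92,  "growth": 3.0},
    {"country": "D", "debt_ratio": 120, "growth": 4.0},
    {"country": "E", "debt_ratio": 40,  "growth": 5.0},
]

# Range style: a fixed block of positions, like averaging three cells.
# Row D satisfies the intended condition but sits outside the range.
by_range = [r["growth"] for r in rows[0:3]]
range_avg = sum(by_range) / len(by_range)

# Predicate style: select by condition, like a WHERE clause.
by_pred = [r["growth"] for r in rows if r["debt_ratio"] >= 90]
pred_avg = sum(by_pred) / len(by_pred)

print(len(by_range), len(by_pred), pred_avg)
```

The predicate picks up every qualifying row wherever it sits, whereas the range silently depends on the rows being laid out as the author assumed.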