CSE544 Data Modeling, Conceptual Design, XML
-
Upload
fennella-peppard -
Category
Documents
-
view
73 -
download
0
description
Transcript of CSE544 Data Modeling, Conceptual Design, XML
![Page 1: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/1.jpg)
1
CSE544Data Modeling,
Conceptual Design, XML
Wednesday, April 5, 2006
![Page 2: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/2.jpg)
2
Outline
• ER diagrams (Chapter 2)
• Conceptual Design (Chapter 19)
• XML (Web)
![Page 3: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/3.jpg)
3
Database Design
DataModeling
DataModeling RefinementRefinement SQL
Tables
SQLTables
E/R diagrams Relations
FilesFiles
![Page 4: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/4.jpg)
4
Relational Schema Design
PersonbuysProduct
name
price name ssn
Conceptual Model:
Relational Model:plus FD’s
Normalization:Eliminates anomalies
![Page 5: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/5.jpg)
5
Entity / Relationship Diagrams
Entity sets Product
address
buys
Attributes
Relationships
![Page 6: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/6.jpg)
6
address name ssn
Person
buys
makes
employs
CompanyProduct
name category
stockprice
name
price
![Page 7: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/7.jpg)
7
Keys in E/R Diagrams
• Every entity set must have a key
Product
name category
price
![Page 8: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/8.jpg)
8
Multiplicity of E/R Relations
• one-one:
• many-one
• many-many
123
abcd
123
abcd
123
abcd
![Page 9: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/9.jpg)
9
address name ssn
Person
buys
makes
employs
CompanyProduct
name category
stockprice
name
price
What doesthis say ?
![Page 10: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/10.jpg)
10
Multi-way Relationships
Purchase
Product
Person
Store
![Page 11: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/11.jpg)
11
Q: what does the arrow mean ?
Rental
VideoStore
Person
Movie
Invoice
Arrows in Multiway Relationships
A: if I know the store, person, invoice, I know the movie too
![Page 12: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/12.jpg)
12
Q: what do these arrow mean ?
Rental
VideoStore
Person
Movie
Invoice
Arrows in Multiway Relationships
A: store, person, invoice determines movie and store, invoice, movie determines person
![Page 13: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/13.jpg)
13
Q: how do I say: “invoice determines store” ?
A: no good way; best approximation:
Incomplete (why ?)
Rental
VideoStore
Person
Movie
Invoice
Arrows in Multiway Relationships
![Page 14: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/14.jpg)
14
Roles in Relationships
Purchase
What if we need an entity set twice in one relationship?
Product
Person
Store
salesperson buyer
![Page 15: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/15.jpg)
15
Attributes on Relationships
Purchase
Product
Person
Store
date
![Page 16: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/16.jpg)
16
Converting Multi-way Relationships to Binary
Purchase
Person
Store
Product
StoreOf
ProductOf
BuyerOf
date
Need arrows here !Which direction ?
![Page 17: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/17.jpg)
17
From E/R Diagramsto Relational Schema
• Entity set relation
• Relationship relation
![Page 18: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/18.jpg)
18
Entity Set to Relation
Product
name category
price
Product(name, category, price)
name category price
gizmo gadgets $19.99
![Page 19: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/19.jpg)
19
Relationships to Relations
makes CompanyProduct
name category
Stock price
name
Makes(product-name, product-category, company-name, year)
Start Year
price
(watch out for attribute name conflicts)
![Page 20: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/20.jpg)
20
Relationships to Relations
makes CompanyProduct
name category
Stock price
name
No need for Makes. Modify Product:
Product(name, category, price, startYear, companyName)
Start Year
price
![Page 21: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/21.jpg)
21
Multi-way Relationships to Relations
Purchase
Product
Person
Storename price
ssn name
name address
Purchase(prodName,stName,ssn)
![Page 22: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/22.jpg)
22
Design Principles
PurchaseProduct Person
What’s wrong?
President PersonCountry
Moral: be faithful!
![Page 23: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/23.jpg)
23
Design Principles:What’s Wrong?
Purchase
Product
Store
date
personNamepersonAddr
Moral: pick the right kind of entities.
![Page 24: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/24.jpg)
24
Design Principles:What’s Wrong?
Purchase
Product
Person
Store
dateDates
Moral: don’t complicate life more than it already is.
![Page 25: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/25.jpg)
25
Product
name category
price
isa isa
Educational ProductSoftware Product
Age Groupplatforms
Subclasses
![Page 26: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/26.jpg)
26
Subclasses to Relations
Product
name category
price
isa isa
Educational ProductSoftware Product
Age Groupplatforms
Name Price Category
Gizmo 99 gadget
Camera 49 photo
Toy 39 gadget
Name platforms
Gizmo unix
Name Age Group
Gizmo todler
Toy retired
Product
Sw.Product
Ed.Product
Notice: subclass = subsetAlternative: disjoint classes (Java, C++)
![Page 27: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/27.jpg)
27
Modeling UnionTypes With Subclasses
FurniturePiece
Person Company
Each piece of furniture is owned either by a person, or by a company
![Page 28: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/28.jpg)
28
Modeling Union Types with Subclasses
Solution 1. Acceptable, imperfect (What’s wrong ?)
FurniturePiecePerson Company
ownedByPerson ownedByPerson
![Page 29: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/29.jpg)
29
Modeling Union Types with Subclasses
Solution 2: better
isa
FurniturePiece
Person CompanyownedBy
Owner
isa
![Page 30: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/30.jpg)
30
Referential Integrity Constraints
CompanyProduct makes
CompanyProduct makes
Each product made by at most one company.Some products made by no company
Each product made by exactly one company.
![Page 31: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/31.jpg)
31
Other Constraints
CompanyProduct makes<100
What does this mean ?
![Page 32: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/32.jpg)
32
Weak Entity SetsEntity sets are weak when their key comes from otherclasses to which they are related.
UniversityTeam affiliation
numbersport name
University(name)Team(universityName, number, sport)
![Page 33: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/33.jpg)
33
Weak Entity Sets
E1E3
B3
A3 A1
R1
R2 E2
A2
R3 E4A4
What are the keys ?
![Page 34: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/34.jpg)
34
Schema Refinement
• For the relational model• Relation: R(A1, A2, …, Am)
– Schema: relation name, attribute names– Instance: a mathematical m-ary relation
• Database: R1, R2, …, Rn
– Schema– Instance
• Schema refinement = normalization
![Page 35: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/35.jpg)
35
First Normal Form (1NF)
• A database schema is in First Normal Form if all tables are flat
Name GPA Courses
Alice 3.8
Bob 3.7
Carol 3.9
Math
DB
OS
DB
OS
Math
OS
Student Name GPA
Alice 3.8
Bob 3.7
Carol 3.9
Student
Course
Math
DB
OS
Student Course
Alice Math
Carol Math
Alice DB
Bob DB
Alice OS
Carol OS
Takes Course
May needto add keys
![Page 36: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/36.jpg)
36
More Normal Forms
• Based on Functional Dependencies – 2nd Normal Form (obsolete)– 3rd Normal Form– Boyce Codd Normal Form (BCNF)
• Based on Multivalued Dependencies – 4th Normal Form
• Based on Join Dependencies– 5th Normal Form
Discussnext
![Page 37: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/37.jpg)
37
Data Anomalies
Anomalies:• Redundancy = repeat data• Update anomalies = Fred moves to “Bellevue”• Deletion anomalies = Joe deletes his phone number:
what is his city ?
Recall set attributes (persons with several phones):
SSN Name, CitySSN Name, City
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
but not SSN PhoneNumber
![Page 38: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/38.jpg)
38
Relation DecompositionBreak the relation into two:
Name SSN City
Fred 123-45-6789 Seattle
Joe 987-65-4321 Westfield
SSN PhoneNumber
123-45-6789 206-555-1234
123-45-6789 206-555-6543
987-65-4321 908-555-2121Anomalies have gone:• No more repeated data• Easy to move Fred to “Bellevue” (how ?)• Easy to delete all Joe’s phone number (how ?)
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
![Page 39: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/39.jpg)
39
Decompositions in General
R1 = projection of R on A1, ..., An, B1, ..., Bm
R2 = projection of R on A1, ..., An, C1, ..., Cp
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
R1(A1, ..., An, B1, ..., Bm)R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)R2(A1, ..., An, C1, ..., Cp)
![Page 40: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/40.jpg)
40
Problems With Decomposition
• Can we get the data back correctly ?– Lossless decomposition– Discuss next
• Can we recover the FD’s on the ‘big’ table from the FD’s on the small tables ?– Dependency-preserving decomposition– Figure out yourself, or read 19.5.2
![Page 41: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/41.jpg)
41
Lossless Decomposition
• Sometimes it is correct:Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
Gizmo 19.99 Camera
Name Price
Gizmo 19.99
OneClick 24.99
Gizmo 19.99
Name Category
Gizmo Gadget
OneClick Camera
Gizmo Camera
![Page 42: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/42.jpg)
42
Lossy Decomposition
• Sometimes it is not:
Name Price Category
Gizmo 19.99 Gadget
OneClick 24.99 Camera
Gizmo 19.99 Camera
Name Category
Gizmo Gadget
OneClick Camera
Gizmo Camera
Price Category
19.99 Gadget
24.99 Camera
19.99 Camera
What’swrong ??
![Page 43: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/43.jpg)
43
Decompositions in General
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
Theorem If A1, ..., An B1, ..., Bm
Then the decomposition is lossless
R1(A1, ..., An, B1, ..., Bm)R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp)R2(A1, ..., An, C1, ..., Cp)
Example: name price, hence the first decomposition is lossless
Note: don’t need necessarily A1, ..., An C1, ..., Cp
![Page 44: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/44.jpg)
44
Functional Dependencies
• A form of constraint– hence, part of the schema
• Finding them is part of the database design
![Page 45: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/45.jpg)
45
Functional DependenciesFunctional Dependency:
A1, A2, …, An B1, B2, …, BmA1, A2, …, An B1, B2, …, Bm
Meaning:
If two tuples agree on the attributes
then they must also agree on the attributes
A1, A2, …, AnA1, A2, …, An
B1, B2, …, BmB1, B2, …, Bm
![Page 46: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/46.jpg)
46
Functional Dependencies
Definition: A1, ..., An B1, ..., Bm holds in R if:
t, t’ R, (t.A1=t’.A1 ... t.An=t’.An t.B1=t’.B1 ... t.Bm=t’.Bm )
A1 ... An B1 ... Bm
if t, t’ agree here then t, t’ agree here
t
t’
R
![Page 47: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/47.jpg)
47
Examples
• EmpID Name, Phone, Position
• Position Phone
• but Phone Position
EmpID Name Phone PositionE0045 Smith 1234 ClerkE1847 John 9876 SalesrepE1111 Smith 9876 SalesrepE9999 Mary 1234 Lawyer
![Page 48: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/48.jpg)
48
Example
Product(name, category, color, department, price)
name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
Consider these FDs:
What do they say ?
![Page 49: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/49.jpg)
49
ExampleFD’s are constraints:• On some instances they hold• On others they don’t
name category color department price
Gizmo Gadget Green Toys 49
Tweaker Gadget Green Toys 99
Does this instance satisfy all the FDs ?
name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
![Page 50: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/50.jpg)
50
Example
name category color department price
Gizmo Gadget Green Toys 49
Tweaker Gadget Black Toys 99
Gizmo Stationary Green Office-supp. 59
What about this one ?
name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
![Page 51: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/51.jpg)
51
Inference
If some FDs are satisfied, thenothers are satisfied too
If all these FDs are true:name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
Then this FD also holds: name, category pricename, category price
Why ?? We say that the new FD is implied
![Page 52: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/52.jpg)
52
Armstrong’s Axioms
Is equivalent to
Splitting rule and Combing rule
A1 ... An B1 ... Bm
A1, A2, …, An B1, B2, …, BmA1, A2, …, An B1, B2, …, Bm
A1, A2, …, An B1
A1, A2, …, An B2
. . . . .A1, A2, …, An Bm
A1, A2, …, An B1
A1, A2, …, An B2
. . . . .A1, A2, …, An Bm
![Page 53: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/53.jpg)
53
Armstrong’s Axioms
Trivial Rule
Why ?
A1 … Am
where i = 1, 2, ..., n
A1, A2, …, An AiA1, A2, …, An Ai
![Page 54: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/54.jpg)
54
Armstrong’s Axioms
Transitive Closure Rule
Why ?
If A1, A2, …, An B1, B2, …, BmA1, A2, …, An B1, B2, …, Bm
and B1, B2, …, Bm C1, C2, …, CpB1, B2, …, Bm C1, C2, …, Cp
then A1, A2, …, An C1, C2, …, CpA1, A2, …, An C1, C2, …, Cp
A1 … An B1 … Bm C1 ... Cp
![Page 55: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/55.jpg)
55
Example (continued)
From: 1. name color2. category department3. color, category price
1. name color2. category department3. color, category price
Inferred FDWhich Ruledid we apply ?
4. name, category name
5. name, category color
6. name, category category
7. name, category color, category
8. name, category price
name, category pricename, category priceTo:
![Page 56: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/56.jpg)
56
Example (continued)
Answers:
Inferred FDWhich Ruledid we apply ?
4. name, category name Trivial rule
5. name, category color Transitivity on 4, 1
6. name, category category Trivial rule
7. name, category color, category Split/combine on 5, 6
8. name, category price Transitivity on 3, 7
1. name color2. category department3. color, category price
1. name color2. category department3. color, category price
![Page 57: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/57.jpg)
57
Closure of a Set of FDs
Definition. Given a set F of functional dependencies, the closure, F+, denotes all FDs implied by F
Theorem. Armstrong axioms are sound and complete for computing F+
What do sound and complete mean ?
![Page 58: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/58.jpg)
58
Variation
If
then
Augmentation follows from trivial rules and transitivityHow ?
A1, A2, …, An BA1, A2, …, An B
A1, A2, …, An , C1, C2, …, Cp BA1, A2, …, An , C1, C2, …, Cp B
Augmentation
![Page 59: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/59.jpg)
59
Problem: Compute F+
Given F compute its closure F+.
How to proceed ?
• Apply Armstrong’s Axioms repeatedly
• Better: use the Closure Algorithm for a set of attributes (next)
![Page 60: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/60.jpg)
60
Closure of a set of AttributesGiven a set of attributes A1, …, An
The closure, {A1, …, An}+ , is the set of attributes Bs.t. A1, …, An B
Given a set of attributes A1, …, An
The closure, {A1, …, An}+ , is the set of attributes Bs.t. A1, …, An B
name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
Example:
Closures: name+ = {name, color} {name, category}+ = {name, category, color, department, price} color+ = {color}
![Page 61: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/61.jpg)
61
Closure Algorithm (for Attributes)
Start with X={A1, …, An}.
Repeat until X doesn’t change do:
if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X.
Start with X={A1, …, An}.
Repeat until X doesn’t change do:
if B1, …, Bn C is a FD and B1, …, Bn are all in X then add C to X.
{name, category}+ = {name, category, color, department, price}
name colorcategory departmentcolor, category price
name colorcategory departmentcolor, category price
Example:
![Page 62: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/62.jpg)
62
Example
Compute {A,B}+ X = {A, B, }
Compute {A, F}+ X = {A, F, }
R(A,B,C,D,E,F) A, B CA, D EB DA, F B
A, B CA, D EB DA, F B
In class:
![Page 63: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/63.jpg)
63
Closure Algorithm (for FDs)
A, B CA, D BB D
A, B CA, D BB D
Example:
Step 1: Compute X+, for every X:
A+ = A, B+ = BD, C+ = C, D+ = DAB+ = ABCD, AC+ = AC, AD+ = ABCDABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)BCD+ = BCD, ABCD+ = ABCD
A+ = A, B+ = BD, C+ = C, D+ = DAB+ = ABCD, AC+ = AC, AD+ = ABCDABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?)BCD+ = BCD, ABCD+ = ABCD
Step 2: Enumerate all X, output X X+ - X
AB CD, ADBC, ABC D, ABD C, ACD BAB CD, ADBC, ABC D, ABD C, ACD B
![Page 64: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/64.jpg)
64
Keys
• A superkey is a set of attributes A1, ..., An s.t.
A1, ..., An B for all attributes B
• A key is a minimal superkey
![Page 65: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/65.jpg)
65
Computing Keys
• Compute X+ for all sets X
• If X+ = all attributes, then X is a superkey
• Consider only the minimal superkeys
Note: there can be exponentially many keys !
• Example: R(A,B,C), ABC, BCAKeys: AB and BC
![Page 66: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/66.jpg)
66
Examples of Keys
• Product(name, price, category, color)name, category pricecategory color
Key: {name, category} Superkeys: supersets
• Enrollment(student, address, course, room, time)student addressroom, time coursestudent, course room, time
Keys are: [in class]
![Page 67: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/67.jpg)
67
FD’s for E/R DiagramsSay: “the CreditCard determines the Person”
Purchase
Product
Person
Store
CreditCard
name
card-nossn
sname
Purchase(name , sname, ssn, card-no)
Incomplete(what does
it say ?)
card-no ssn
![Page 68: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/68.jpg)
68
Data Anomalies
When a database is poorly designed we get anomalies:
Redundancy: data is repeated
Updated anomalies: need to change in several places
Delete anomalies: may lose data when we don’t want
Schema refinement means removing the data anomalies.
![Page 69: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/69.jpg)
69
Normal Forms
• Decomposition into Boyce Codd Normal Form (BCNF)– Losselss
• Decomposition into 3rd Normal Form– Losless– Dependency preserving
![Page 70: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/70.jpg)
70
Boyce-Codd Normal Form
A simple condition for removing anomalies from relations:
Equivalently: for any set of attributes X, either X+ = X or X+ = all attributes
A relation R is in BCNF if:
If A1, ..., An B is a non-trivial dependency
in R , then {A1, ..., An} is a superkey for R
A relation R is in BCNF if:
If A1, ..., An B is a non-trivial dependency
in R , then {A1, ..., An} is a superkey for R
![Page 71: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/71.jpg)
71
BCNF Decomposition Algorithm
A’s restB’s
R1
Is there a 2-attribute relation that isnot in BCNF ?
Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [rest]) continue with both R1 and R2
Until no more violations
Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [rest]) continue with both R1 and R2
Until no more violations
R2
Heuristics: choose B1, …, Bn
“as large as possible”
Note: need tocompute the FDson R1, R2 (how ?)
![Page 72: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/72.jpg)
72
BCNF Example
FD: SSN Name, CityKey: {SSN, PhoneNumber}Is it in BCNF?
Name SSN PhoneNumber City
Fred 123-45-6789 206-555-1234 Seattle
Fred 123-45-6789 206-555-6543 Seattle
Joe 987-65-4321 908-555-2121 Westfield
Joe 987-65-4321 908-555-1234 Westfield
Another way: SSN+ ={SSN, Name, City} but no PhoneNumber
![Page 73: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/73.jpg)
73
BCNF Example
Name SSN City
Fred 123-45-6789 Seattle
Joe 987-65-4321 Westfield
SSN PhoneNumber
123-45-6789 206-555-1234
123-45-6789 206-555-6543
987-65-4321 908-555-2121
987-65-4321 908-555-1234
SSN Name, City
Let’s check anomalies:• Redundancy ?• Update ?• Delete ?
![Page 74: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/74.jpg)
74
Example• R(A,B,C,D) A B, B C
• Key: AD • Violations of BCNF:
A B, A C, ABC, B C
• Pick A BC first: split into R1(A,B,C) R2(A,D)• In R1 : B C; split into R11(A,B), R12(B,C)
• Final answer: R11(A,B), R12(B,C), R2(A,D)
![Page 75: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/75.jpg)
75
Example (cont’d)• R(A,B,C,D) A B, B C
• Order matters !• Pick A C first: R1(A,C), R2(A,B,D)• In R2: A B; decompose into R21(A,B), R22(A,D)
• Final answer: R1(A,C), R21(A,B), R22(A,D)
• Which one is better ?
![Page 76: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/76.jpg)
76
BCNF and Dependencies
FD’s: Unit Company; Company, Product UnitSo, there is a BCNF violation, and we decompose.
Unit Company
No FDs
In BCNF we loose the FD: Company, Product Unit
Unit Company Product
Unit Company
Unit Product
![Page 77: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/77.jpg)
77
Solution: 3rd Normal Form (3NF)
A simple condition for removing anomalies from relations:
A relation R is in 3rd normal form if :
Whenever there is a nontrivial dependency A1, A2, ..., An Bfor R , then {A1, A2, ..., An } a super-key for R, or B is part of a key.
A relation R is in 3rd normal form if :
Whenever there is a nontrivial dependency A1, A2, ..., An Bfor R , then {A1, A2, ..., An } a super-key for R, or B is part of a key.
Please read in the book !!!
![Page 78: CSE544 Data Modeling, Conceptual Design, XML](https://reader036.fdocuments.us/reader036/viewer/2022062407/56812cdd550346895d91a3f3/html5/thumbnails/78.jpg)
78
3NF Discussion
• 3NF decomposition v.s. BCNF decomposition:– Use same decomposition steps, for a while
– 3NF may stop decomposing, while BCNF continues
• Tradeoffs– BCNF = no anomalies, but may lose some FDs
– 3NF = keeps all FDs, but may have some anomalies