Database Design E R 2009


Transcript of Database Design E R 2009

Page 1: Database Design E R 2009

1

Database Design

Page 2: Database Design E R 2009

2

What is a Database?

A collection of data that is organised in a predictable structured way

Any organised collection of data in one place can be considered a database

Examples: filing cabinet, library, floppy disk

Page 3: Database Design E R 2009

3

What is Data?

Data is the heart of the DBMS. There are two kinds:

The collection of information that is stored in the database.

Metadata: information about the database, also known as a data dictionary.

An example of metadata is shown in Appendix A.

Page 4: Database Design E R 2009

4

Relational Data Model

A relational database is perceived as a collection of tables.

Each table consists of a series of rows & columns.

Tables (or relations) are related to each other by sharing a common characteristic. (EG a customer or product table)

A table yields complete physical data independence.

Page 5: Database Design E R 2009

5

Features of the relational data model

Logical and physical levels are separated.

Simple to understand. Easy to use.

Powerful nonprocedural (what, not how) language to access data.

Uniform access to all data.

Rigorous database design principles.

Access paths by matching data values, not by following fixed links.

Page 6: Database Design E R 2009

6

Terminology

Relation, Null Value, Tuple, Attribute, Domain, Relation Schema, Integrity Constraint, Domain Constraint, Key Constraint, Key, Candidate Key, Simple Key, Composite Key, Primary Key

Relational Database, Relational Database Schema, Referential Integrity Constraint, Foreign Key, Network Diagram, Update Operations, Join, Projection, Lossless join

Page 7: Database Design E R 2009

7

Terminology: Relation

A relation is a 2-dimensional table of values with these properties:

No duplicate rows
Rows can be in any order
Columns are uniquely named by Attributes
Each cell contains only one value

The special value NULL implies that there is no corresponding value for that cell. This may mean the value does not apply or that it is unavailable. Entire rows of NULLs are not allowed.

Employee  Job        Manager
Jack      Secretary  Jill
Jill      Executive  Bozo
Bozo      Director   (NULL)
Lulu      Clerk      Jill
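The properties of a relation listed on this slide can be checked mechanically. The sketch below is only an illustration: the function name `is_relation` and the use of Python tuples with `None` standing in for NULL are assumptions made here, not part of the slides.

```python
# EMPLOYEE rows from the slide; None stands in for NULL (an assumption of this sketch).
COLUMNS = ("Employee", "Job", "Manager")
ROWS = [
    ("Jack", "Secretary", "Jill"),
    ("Jill", "Executive", "Bozo"),
    ("Bozo", "Director", None),
    ("Lulu", "Clerk", "Jill"),
]


def is_relation(columns, rows):
    """Check the properties of a relation that can be tested on sample data."""
    unique_names = len(set(columns)) == len(columns)      # columns uniquely named
    no_duplicate_rows = len(set(rows)) == len(rows)       # no duplicate rows
    one_value_per_column = all(len(r) == len(columns) for r in rows)
    no_all_null_rows = all(any(v is not None for v in r) for r in rows)
    return (unique_names and no_duplicate_rows
            and one_value_per_column and no_all_null_rows)


print(is_relation(COLUMNS, ROWS))               # the slide's table
print(is_relation(COLUMNS, ROWS + [ROWS[0]]))   # same table with a repeated row
```

Row order plays no part in the check, which mirrors the property that rows can be in any order.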

Page 8: Database Design E R 2009

8

Terminology: Tuple and Attribute

Tuple: commonly referred to as a row in a relation. E.g.:

Employee  Job    Manager
Jack      Clerk  Jill

Attribute: a name given to a column in a relation. Each column must have a unique attribute name. Attributes are often referred to as fields.

Page 9: Database Design E R 2009

9

Terminology: Domain

A domain is a pool of atomic values from which the cells of a given column take their values. Each attribute has a domain. Attributes may share domains.

Job Name: Typist, Manager, Clerk, ...

Person Name: Tom, Mary, Bozo, Kali, ...

An attribute value (a value in a column labelled by the attribute) must be from the corresponding domain or may be NULL ( ).

Attribute  Domain
Employee   Person Name
Job        Job Name
Manager    Person Name

Manager uses the same domain as Employee.

Page 10: Database Design E R 2009

10

Terminology: Relation Schema

A Relation Schema is a named set of attributes. It refers only to the structure of a relation. It is derived from the traditional set notation displayed below:

EMPLOYEE = { Employee, Job, Manager }

This is usually written in a modified form for database purposes:

EMPLOYEE( Employee, Job, Manager )

referring to the table

EMPLOYEE
Employee  Job  Manager

Page 11: Database Design E R 2009

11

Terminology: Integrity Constraint and Domain Constraint

An Integrity Constraint is a condition that prescribes what values are allowable in a relation. This permits restricting the type of value that can be placed in a particular cell, e.g. only digits for telephone numbers.

The Domain Constraint is a condition on the allowable values for an attribute.

e.g. Salary < $60,000

This restricts the salary to be under a set value.

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   (NULL)   50,000
Lulu      Clerk      Jill     30,000
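As a rough illustration, the domain constraint Salary < $60,000 can be written as a check over the rows of EMPLOYEE. The function name `salary_violations` and the dictionary layout are assumptions of this sketch, and `None` again stands in for NULL.

```python
SALARY_LIMIT = 60_000  # from the slide: Salary < $60,000

EMPLOYEE = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Jill", "Salary": 25_000},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo", "Salary": 40_000},
    {"Employee": "Bozo", "Job": "Director", "Manager": None, "Salary": 50_000},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill", "Salary": 30_000},
]


def salary_violations(rows, limit=SALARY_LIMIT):
    """Return the employees whose salary breaks the domain constraint."""
    return [r["Employee"] for r in rows
            if r["Salary"] is not None and r["Salary"] >= limit]


# Every salary on the slide is under the limit, so nothing is reported.
print(salary_violations(EMPLOYEE))
```

A real DBMS would normally reject such a row on insert rather than report it afterwards; the after-the-fact check is used here only because it is easy to read.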

Page 12: Database Design E R 2009

12

Dealing with many keys

We will be referring to the following keys Primary key Foreign key Simple key Composite key Concatenated key Candidate key Universal key

A key is a device that helps define relationships. Its role is based on the concept of functional dependency which we deal with extensively.

Page 13: Database Design E R 2009

13

Terminology: Key Constraint

A key constraint is a condition that no value of an attribute or set of attributes be repeated in a relation.

e.g. Employee (the attribute) has only unique values in EMPLOYEE (the relation).

The following relation violates this constraint:

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Jack      Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   (NULL)   50,000
Lulu      Clerk      Jill     30,000

Jack appears twice in the Employee column. This violates the Key Constraint.
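The violation can also be found by counting how often each key value occurs. This is a minimal sketch using the table above; `key_violations` is a name invented for the illustration.

```python
from collections import Counter

# The violating EMPLOYEE table from the slide (None stands in for NULL).
EMPLOYEE = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Bozo", "Salary": 25_000},
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Jill", "Salary": 25_000},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo", "Salary": 40_000},
    {"Employee": "Bozo", "Job": "Director", "Manager": None, "Salary": 50_000},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill", "Salary": 30_000},
]


def key_violations(rows, key):
    """Return the key values that occur in more than one row."""
    counts = Counter(tuple(row[a] for a in key) for row in rows)
    return [value for value, n in counts.items() if n > 1]


print(key_violations(EMPLOYEE, ("Employee",)))  # the repeated value is ('Jack',)
```

Passing the key as a tuple of attribute names lets the same function handle a set of attributes as well as a single one.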

Page 14: Database Design E R 2009

14

Terminology: Key Constraint

An attribute (or set of attributes) to which a key constraint applies is called a key (or candidate key). Every relation schema must have a key.

If a key constraint applies to a set of attributes, it is called a composite or concatenated key. Otherwise it is a simple key.

Simple Key: Employee

Composite Key: Job + Manager is another possible key, because the combination of Job and Manager is also unique.

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000

Page 15: Database Design E R 2009

15

Terminology: Key Constraint

A key cannot have a NULL ( ) value.

For example, if we change the table so that the employee Bozo does not have a manager, then Job+Manager cannot be a key.

Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   (NULL)   50,000
Lulu      Clerk      Jill     30,000
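The effect of the NULL can be shown with a small test that a set of attributes could serve as a key. This is a sketch under stated assumptions: `could_be_key` is a made-up name, `None` stands in for NULL, and passing the test on sample data only shows that the data does not rule the key out; whether it really is a key depends on the enterprise rules.

```python
def could_be_key(rows, attrs):
    """True if attrs has no NULLs and no repeated value in the sample rows."""
    values = [tuple(row[a] for a in attrs) for row in rows]
    has_null = any(v is None for value in values for v in value)
    return not has_null and len(set(values)) == len(values)


def employee_table(bozo_manager):
    """Build the slide's EMPLOYEE table with a chosen manager for Bozo."""
    data = [
        ("Jack", "Secretary", "Bozo", 25_000),
        ("Kim", "Secretary", "Jill", 25_000),
        ("Jill", "Executive", "Bozo", 40_000),
        ("Bozo", "Director", bozo_manager, 50_000),
        ("Lulu", "Clerk", "Jill", 30_000),
    ]
    names = ("Employee", "Job", "Manager", "Salary")
    return [dict(zip(names, row)) for row in data]


print(could_be_key(employee_table("Bozo"), ("Job", "Manager")))  # no NULL present
print(could_be_key(employee_table(None), ("Job", "Manager")))    # Bozo has no manager
print(could_be_key(employee_table(None), ("Employee",)))
```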

Page 16: Database Design E R 2009

16

Terminology: Primary Key

A primary key is a special preassigned key that can always be used to uniquely identify tuples. We have to choose a Primary Key for every Relation. We must consider all of the Candidate Keys and choose between them.

The fact that Employee is the primary key for EMPLOYEE is usually written as:

EMPLOYEE( Employee, Job, Manager, Salary )

Here we have chosen the simple key Employee over the concatenated option of both Job and Manager.

Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   Bozo     50,000
Lulu      Clerk      Jill     30,000

Page 17: Database Design E R 2009

17

A database is more than multiple tables: you must be able to “relate” them.

Cus-Code  Cus-Name  Area-Code  Phone     Agent-Code
10010     Ramus     615        844-2573  502
10011     Dunne     713        894-1238  501
10012     Smith     615        894-2205  502
10013     Olowaski  615        894-2180  502
10014     Orlando   615        222-1672  501
10015     O’Brian   713        442-3381  503
10016     Brown     615        297-1226  502
10017     Williams  615        290-2556  503
10018     Farris    713        382-7185  501
10019     Smith     615        297-3809  503

Agent-Code  Agent-Name  Agent-AreaCode  Agent-Phone
501         Alby        713             226-1249
502         Hahn        615             882-1244
503         Okon        615             123-5589

The link is through the Agent-Code
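To make the link concrete, the following sketch pairs each customer with the name of their agent by matching Agent-Code values, in the spirit of "access paths by matching data values". The data comes from the two tables on this slide; the variable and function names are assumptions of the example.

```python
# Agent table keyed by Agent-Code: (Agent-Name, Agent-AreaCode, Agent-Phone)
AGENTS = {
    501: ("Alby", "713", "226-1249"),
    502: ("Hahn", "615", "882-1244"),
    503: ("Okon", "615", "123-5589"),
}

# Customer table: (Cus-Code, Cus-Name, Area-Code, Phone, Agent-Code)
CUSTOMERS = [
    (10010, "Ramus", "615", "844-2573", 502),
    (10011, "Dunne", "713", "894-1238", 501),
    (10012, "Smith", "615", "894-2205", 502),
    (10013, "Olowaski", "615", "894-2180", 502),
    (10014, "Orlando", "615", "222-1672", 501),
    (10015, "O'Brian", "713", "442-3381", 503),
    (10016, "Brown", "615", "297-1226", 502),
    (10017, "Williams", "615", "290-2556", 503),
    (10018, "Farris", "713", "382-7185", 501),
    (10019, "Smith", "615", "297-3809", 503),
]


def customers_with_agents(customers, agents):
    """Join the two tables on Agent-Code and return (Cus-Code, Cus-Name, Agent-Name)."""
    return [(code, name, agents[agent_code][0])
            for code, name, _area, _phone, agent_code in customers]


for row in customers_with_agents(CUSTOMERS, AGENTS):
    print(row)
```

The agent's name and phone number are stored once, in the agent table, and reached from any customer row through the shared Agent-Code value.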

Page 18: Database Design E R 2009

18

Terminology: Relational Database

A Relational Database is just a set of Relations. For example:

EMPLOYEE
Employee  Job        Manager  Salary
Jack      Secretary  Bozo     25,000
Kim       Secretary  Jill     25,000
Jill      Executive  Bozo     40,000
Bozo      Director   (NULL)   50,000
Lulu      Clerk      Jill     30,000

JOB
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000

Which Attribute do you think relates these two tables together?

Page 19: Database Design E R 2009

19

Terminology: Relational Database Schema

A Relational Database Schema is a set of Relation Schemas, together with a set of Integrity Constraints.

For example, the Relations that you have been looking at with the headings

EMPLOYEE
Employee  Job  Manager  Salary

JOB
Job  Salary

are usually written as

EMPLOYEE(Employee, Job, Manager)
JOB(Job, Salary)

Notice how the Primary Keys (Employee and Job) are underlined.

Page 20: Database Design E R 2009

20

Terminology: Referential Integrity Constraint

This constraint says that all the values in one column should also appear in another column. Look at the tables below. Every entry in the Job column of the EMPLOYEE table must appear in the Job column of the JOB table. The Job column of EMPLOYEE is the Foreign Key (FK); the Job column of JOB is the Primary Key (PK) it refers to.

EMPLOYEE
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Jill      Executive  Bozo
Bozo      Director   (NULL)
Lulu      Clerk      Jill

JOB
Job        Salary
Secretary  25,000
Executive  40,000
Director   50,000
Clerk      30,000
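A referential integrity check can be sketched as a search for foreign key values that have no matching primary key value. The function name `fk_violations` is an assumption of this example, NULL is represented by `None` and is treated as not requiring a match, and the extra employee Tom is hypothetical.

```python
EMPLOYEE = [
    {"Employee": "Jack", "Job": "Secretary", "Manager": "Bozo"},
    {"Employee": "Kim", "Job": "Secretary", "Manager": "Jill"},
    {"Employee": "Jill", "Job": "Executive", "Manager": "Bozo"},
    {"Employee": "Bozo", "Job": "Director", "Manager": None},
    {"Employee": "Lulu", "Job": "Clerk", "Manager": "Jill"},
]

JOB = [
    {"Job": "Secretary", "Salary": 25_000},
    {"Job": "Executive", "Salary": 40_000},
    {"Job": "Director", "Salary": 50_000},
    {"Job": "Clerk", "Salary": 30_000},
]


def fk_violations(child_rows, fk, parent_rows, pk):
    """Return the child rows whose FK value does not appear in the parent's PK column."""
    parent_values = {row[pk] for row in parent_rows}
    return [row for row in child_rows
            if row[fk] is not None and row[fk] not in parent_values]


# EMPLOYEE(Job) refers to JOB(Job): every job on the slide has a matching row.
print(fk_violations(EMPLOYEE, "Job", JOB, "Job"))

# A hypothetical extra employee whose job is missing from JOB breaks the constraint.
tom = {"Employee": "Tom", "Job": "Typist", "Manager": "Jill"}
print(fk_violations(EMPLOYEE + [tom], "Job", JOB, "Job"))
```

The same function also covers a foreign key that points back into its own table, for example `fk_violations(EMPLOYEE, "Manager", EMPLOYEE, "Employee")`.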

Page 21: Database Design E R 2009

21

Referential Integrity Constraint

Why does the following relational database violate the referential integrity constraints?

In other words, why can’t Employee(Job) be a Foreign Key to Job(Job), or Employee(Manager) be a Foreign Key to Employee(Employee)?

EMPLOYEE
Employee  Job        Manager
Jack      Secretary  Bozo
Kim       Secretary  Jill
Bozo      Director   (NULL)
Lulu      Clerk      Jill

JOB
Job       Salary
Director  50,000
Clerk     30,000

Page 22: Database Design E R 2009

22

Why Use Relational Databases

Their major advantage is that they minimise the need to store the same data in a number of places.

Storing the same data in a number of places is referred to as data redundancy.

Page 23: Database Design E R 2009

23

Example of Data Redundancy (1)

Page 24: Database Design E R 2009

24

Example of Data Redundancy (2)

The names and addresses of all students are being maintained in three places

If Owen Money moves house, his address needs to be updated in three separate places

Consider what might happen if he forgot to let library administration know

Page 25: Database Design E R 2009

25

Example of Data Redundancy (3)

Page 26: Database Design E R 2009

26

Example of Data Redundancy (4)

Data redundancy results in:

wastage of storage space by recording duplicate information
difficulty in updating information
inaccurate, out-of-date data being maintained

Page 27: Database Design E R 2009

27

Other Advantages of Relational Databases

Flexibility: relationships (links) are not implicitly defined by the data

Data structures are easily modified

Data can be added, deleted, modified or queried easily

Page 28: Database Design E R 2009

28

Summary of Some Common Relational Terms

Entity - an object (person, place or thing) that we wish to store data about
Relationship - an association between two entities
Relation - a table of data
Tuple - a row of data in a table
Attribute - a column of data in a table
Primary Key - an attribute (or group of attributes) that uniquely identifies individual records in a table
Foreign Key - an attribute appearing within a table that is a primary key in another table

Page 29: Database Design E R 2009

29

Network Diagrams

Page 30: Database Design E R 2009

30

Terminology: Network Diagram

EMPLOYEE(Employee, Job, Manager)
JOB(Job, Salary)

A relational database schema with referential integrity constraints can also be represented by a network diagram. A Referential Integrity Constraint is notated as an arrow labelled by the foreign key. You must always write the label of the Foreign Key on the arrow. Sometimes the same attribute has different titles in different tables.

Network Diagram:

EMPLOYEE --Job--> JOB
EMPLOYEE --Manager--> EMPLOYEE

Referential Integrity constraints can easily be represented by arrows FK → PK. The arrow points from the Foreign Key to the matching Primary Key.

Notice here, the label is Manager and not Employee.
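A network diagram is essentially a list of labelled arrows, so it can be derived from a schema that records each table's primary key and foreign keys. The dictionary layout and the function name `network_diagram` below are assumptions made for this sketch.

```python
# Each table lists its primary key and its foreign keys as {FK attribute: referenced table}.
SCHEMA = {
    "EMPLOYEE": {
        "attributes": ("Employee", "Job", "Manager"),
        "primary_key": "Employee",
        "foreign_keys": {"Job": "JOB", "Manager": "EMPLOYEE"},
    },
    "JOB": {
        "attributes": ("Job", "Salary"),
        "primary_key": "Job",
        "foreign_keys": {},
    },
}


def network_diagram(schema):
    """Return one (from_table, fk_label, to_table) arrow per foreign key."""
    arrows = []
    for table, info in schema.items():
        for fk, target in info["foreign_keys"].items():
            arrows.append((table, fk, target))
    return arrows


for source, label, target in network_diagram(SCHEMA):
    # The arrow is labelled with the FK attribute and points at the table holding the PK.
    print(f"{source} --{label}--> {target}")
```

The arrow for Manager starts and ends at EMPLOYEE, and its label is the foreign key name Manager rather than the primary key name Employee.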

Page 31: Database Design E R 2009

31

Personnel Database: Consider the following Tables

PROJECT
NAME                  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
New billing system    23760     Yates    1000         10000
Common stock issue    28765     Baker    3000         4000
Resolve bad debts     26713     Kanter   2000         1500
New office lease      26511     Yates    5000         5000
Revise documentation  34054     Kanter   100          3000
Entertain new client  87108     Yates    5000         2000
New TV commercial     85005     Baker    10000        8000

ASSIGNMENT
E_NUMBER  P_NUMBER
1001      26713
1002      26713
1003      23760
1003      26511
1004      26511
1004      28765
1005      23760

SKILL
AREA
Stock Market
Taxation
Investments
Management

EMPLOYEE
NAME    E_NUMBER  DEPARTMENT
Kanter  1111      Finance
Yates   1112      Accounting
Adams   1001      Finance
Baker   1002      Finance
Clarke  1003      Accounting
Dexter  1004      Finance
Early   1005      Accounting

TITLE
E_NUMBER  CURRENT_TITLE
1001      Senior consultant
1002      Senior consultant
1003      Senior consultant
1004      Junior consultant
1005      Junior consultant

PRIOR_JOB
E_NUMBER  PRIOR_TITLE
1001      Junior consultant
1001      Research analyst
1002      Junior consultant
1002      Research analyst
1003      Junior consultant
1004      Summer intern

EXPERTISE
E_NUMBER  SKILL
1001      Stock market
1001      Investments
1002      Stock market
1003      Stock market
1003      Investments
1004      Taxation
1005      Management

Page 32: Database Design E R 2009

32

Personnel Database Schema

ASSIGNMENT (E_NUMBER, P_NUMBER)
PRIOR_JOB (E_NUMBER, PRIOR_TITLE)
EXPERTISE (E_NUMBER, SKILL)
TITLE (E_NUMBER, CURRENT_TITLE)
EMPLOYEE (NAME, E_NUMBER, DEPARTMENT)
SKILL (AREA)
PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)

Not FK, we will look at this later

What are the connecting Foreign Keys to Primary Keys?

Page 33: Database Design E R 2009

33

Personnel Database Network Diagram

[Diagram: one rectangle for each of TITLE, PROJECT, PRIOR_JOB, EMPLOYEE, SKILL, EXPERTISE and ASSIGNMENT]

Once you have produced your Schema and identified the Primary and Foreign Keys you can create the Network Diagram. The Network Diagram shows each of the tables with their links. Each table (relation) is represented by a rectangle. The rectangles are then connected by arrows that show the FKs pointing to the PKs. The arrowhead points towards the PK, and the label written on the arrow is the name of the FK attribute in the table that contains the FK.

Page 34: Database Design E R 2009

34

Personnel Database Network Diagram

[Diagram: the rectangles TITLE, PROJECT, PRIOR_JOB, EMPLOYEE, SKILL, EXPERTISE and ASSIGNMENT, connected by arrows labelled skill, e_number (four arrows), p_number and manager]

Page 35: Database Design E R 2009

35

Summary: Questions

What is a Relational Database?

What actually is a relation?

What are Constraints?

What is a Schema?

What is a Network Diagram and why is it used?

Page 36: Database Design E R 2009

36

Summary: Answers

A relational database is based on the relational data model. It is one or more Relations (Tables) that are related to each other.

A relation is a table composed of rows (tuples) and columns, satisfying 5 properties:
• No duplicate rows
• Rows can be in any order
• Columns are uniquely named by Attributes
• Each cell contains only one value
• No null rows

Constraints are central to the correct modelling of business information. Here we have seen them limit the set-up of your tables, for example the Referential Integrity Constraint.

The Network Diagram is used to navigate complex database structures. It is a compact way to show the relationships between Relations (Tables).

Page 37: Database Design E R 2009

37

Activities

Consider the following relational database schemas.

Suppliers(suppId, name, street, city, state)
Part(partId, partName, weight, length, composition)
Products(prodId, prodName, department)
Supplies(partId, suppId)
Uses(partId, prodId)

Make reasonable assumptions about the meaning of attributes and relations, identify the primary and foreign keys and draw a network diagram showing the relations and foreign keys.

Page 38: Database Design E R 2009

38

Answer

[Network diagram with the rectangles Supplies, Supplier, Part, Uses and Product]

Page 39: Database Design E R 2009

39

Show the foreign keys on the network diagrams

Orders
Ordnum  ordDate  custNumb
12489   2/9/91   124

Customer
custNumb  custName  Address    Balance  credLim  sksnumb
124       Adams     48 oak st  418.68   500      3

SalesRep
Slsnumber  Name  address  totCom  commRate
3          Mary  12 Way   2150    .05

Part
Part  Desc  onHand  IT  wehsNumb  unitPrice
AX12  Iron  1.4     HW  3         17.95

Page 40: Database Design E R 2009

40

OrLine

ordNum  Part  ordNum  quotePrice

Page 41: Database Design E R 2009

41

Answer

[Network diagram with the rectangles Orders, OrLine, Part, Customer and SalesRep; arrow labels SlsNumber, CustNumb, orLine and Part]

Page 42: Database Design E R 2009

42

Obtain tutorial 1 from your tutor

Page 43: Database Design E R 2009

43

Functional Dependence FDD

Page 44: Database Design E R 2009

44

Functional Dependency Diagrams

Data Analysis

In this Unit we look at the following:

Data Element, Attribute, Functional Dependency (FD), Redundant FD, Pseudotransitive FD, Intersecting Attribute

Page 45: Database Design E R 2009

45

Functional Dependency Diagrams

A FUNCTIONAL DEPENDENCY DIAGRAM is a way of representing the structure of information needed to support a business or organization.

It can easily be converted into a design for a relational database to support the operations of the business.

Page 46: Database Design E R 2009

46

Functional Dependency Diagrams

There are a number of methods for us to develop our database design from here. We could use the method of developing a large table with all attributes and breaking it down into smaller tables using what we refer to as Normalization by Decomposition (we look at this in detail later), or we could use Functional Dependency Diagrams to create a pictorial model of our database.

Page 47: Database Design E R 2009

47

Data Analysis and Database Design Using Functional Dependency Diagrams

1. The steps of Data Analysis in FDD are
1.1 Look for Data Elements
1.2 Look for Functional Dependencies
1.3 Represent Functional Dependencies in a diagram
1.4 Eliminate Redundant Functional Dependencies

2. Data Design, after we have our final version of the FDD
2.1 Apply the Synthesis Algorithm

Page 48: Database Design E R 2009

48

Starting points for drawing functional dependency diagrams

To start the process of constructing our FDD we do the following:

We must understand the data
We examine forms, reports, data entry and output screens etc.
We examine sample data
We consider enterprise (business) rules
We examine narrative descriptions and conduct interviews
We apply our experience and practice, and that of others

Page 49: Database Design E R 2009

49

Enterprise Rules

What are Enterprise Rules?

An enterprise rule (in the context of data analysis) is a statement made by the enterprise (organisation, company, officer in charge etc.) which constrains data in some way.

Functional dependencies are the most important type of constraint on data and are often expressed in the form of enterprise rules.

e.g.
No two employees may have the same employee number.
An order is made by only one customer.
An employee can belong to only one department at a time.

Page 50: Database Design E R 2009

50

Drawing FDDs - Data Elements

We often refer to Data Elements during the FDD process.

A data element is an elementary piece of recorded information.

Every data element has a unique name.

A data element is either a

Label, e.g. PersonName, Address, BuildingCode, or
Measurement, e.g. Height, Age, Date

A data element must take values that can be written down.

Page 51: Database Design E R 2009

51

Functional Dependency Diagrams

[Diagram: two routes from "Given the Problem" to "Now we have the Database Design".

Using the Method of Decomposition: Sample Data, Universal Relation, Tables 0NF, Eliminate Repeating Groups, 1NF, Eliminate Part Key Dependencies, 2NF Relation, Eliminate Non Key Dependencies, 3NF Relation.

OR, here is the same process using the FDD approach: Attribute & Functional Dependencies, Functional Dependency Diagram, Method of Synthesis, 3NF Relation.]

Page 52: Database Design E R 2009

52

Data Element Examples

Here are some examples:

PersonName has values Jeff, Jill, Gio, Enid
Address has values 1 John St, 25 Rocky Road
Height has values 171cm, 195cm
Age has values 21, 52, 93, 2
Date has values 20th May 1947, 2nd March 1997
JobName has values Manager, Secretary, Clerk

Manager might not be a data element, but ManagerName could be. It could be a value of another data element, e.g. JobName.

Page 53: Database Design E R 2009

53

Drawing FDDs Data Elements

Start drawing the Functional Dependency Diagram by representing the Data Elements. A Data Element is represented by its name placed in a box:

[ Data Element ]

Every data element must have a unique name in the functional dependency diagram.

A data element cannot be composed of other data elements, i.e. it cannot be broken down into smaller components.

A Data Element is also known as an ATTRIBUTE, because it generally describes a property of some thing which we will later call an ENTITY.

Page 54: Database Design E R 2009

54

Drawing FDDs – Using Elements

A Functional Dependency is a relationship between Attributes.

It is shown as an arrow, e.g. A → B

It means that for every value of A, there is only one value for B. It reads “A determines B”.

A is called the determinant attribute.

B is called the dependent attribute.
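The definition "for every value of A, there is only one value for B" can be tested against sample data. The sketch below is illustrative only: the function name `determines` and the three sample rows are made up for the example, and passing the test on sample data does not prove that the dependency holds in general.

```python
def determines(rows, a, b):
    """True if no value of attribute a is paired with two different values of b."""
    seen = {}
    for row in rows:
        if row[a] in seen and seen[row[a]] != row[b]:
            return False  # the same A value maps to two different B values
        seen[row[a]] = row[b]
    return True


# Hypothetical sample data: each A value has one B value, but B value "x" occurs with two A values.
SAMPLE = [
    {"A": 1, "B": "x"},
    {"A": 2, "B": "y"},
    {"A": 3, "B": "x"},
]

print(determines(SAMPLE, "A", "B"))  # A determines B in this sample
print(determines(SAMPLE, "B", "A"))  # B does not determine A ("x" goes with 1 and 3)
```

The dependency is directional: an arrow from A to B says nothing about an arrow from B to A.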

Page 55: Database Design E R 2009

55

Data Element Examples

Here are some examples of finding the Data Elements on a typical form:

"Surname . . . . . . . . . ." on a form gives rise to the element Surname

"CREDIT CARD: Bankcard, Mastercard, Visa, Other" on a form gives rise to the element CreditCardType

Page 56: Database Design E R 2009

56

Functional Dependency Examples

Students and their family names

“Each student (identified by student number) has only one family name”

Students FamilyName

1 Smith

2 Jones

3 Smith

4 Andrews

Considering the rule stated above we should be able to draw an FDD for this. What are the elements of interest?

Page 57: Database Design E R 2009

57

FDDs Answer

Student#  FamilyName
1         Smith
2         Jones
3         Smith
4         Andrews

Data elements of interest are Student# and FamilyName.

Student# → FamilyName

Student# determines FamilyName (or FamilyName depends on Student#).

Each student has exactly one family name, but the name could be the name of many students.

So FamilyName does not determine Student#, e.g. “Smith” is the name of students 1 and 3.

Page 58: Database Design E R 2009

58

FDDs Examples

Employees and the departments they work for.

Department Name: Accounting
Employee Number: 11, 2, 31

Department Name: Sales
Employee Number: 45, 27

In this example the forms represent some interesting data of the business. We see that employees with the ID numbers 11, 2 and 31 all work in the Accounting Dept and that employees with the ID numbers 45 and 27 work in the Sales Dept.

Enterprise Rule: “Each employee works in only one department”

Do you think that you could draw an FDD to represent this? Have a go and then check your answers.

Page 59: Database Design E R 2009

59

FDD Answers

Employees and the departments they work for.

Department Name: Accounting
Employee Number: 11, 2, 31

Department Name: Sales
Employee Number: 45, 27

Data elements of interest are Employee# and DeptName.

Employee# → DeptName

So we could make the following table:

Employee#  DeptName
11         Acc
2          Acc
45         Sales
31         Acc
27         Sales

Page 60: Database Design E R 2009

60

FDDs Examples

The quantity of parts held in a warehouse and their suppliers

Part#  SupplierName            QOH
1      Wang Electronics        23
2      Cumberland Enterprises  80
3      Wang Electronics        4
4      Roscoe Pty. Ltd         58

“Parts are uniquely identified by part numbers”
“Suppliers are uniquely identified by Supplier Names”
“A part is supplied by only one supplier”
“A part is held in only one quantity”

Part# determines SupplierName, and Part# determines QOH:

Part# → SupplierName
Part# → QOH

Should QOH be a determinant? No, common sense tells us that is not a reliable choice. We could have had repeating values.

Page 61: Database Design E R 2009

61

FDDs Examples

Students and the subjects they are enrolled in.

“Each student is given a unique student number”
“A subject is uniquely identified by its name”
“A student may choose several subjects”

Data elements of interest are Student# and SubjectName.

Student#  SubjectName
1         History
1         Geography
1         Mathematics
2         English
3         Mathematics
3         English
4         French
4         Geography

There is no functional dependency here. Student# does not determine SubjectName, nor does SubjectName determine Student#.

Page 62: Database Design E R 2009

62

FDDs Examples

Results obtained by each student for each subject.

“Each student is given a unique student number”

“A subject is uniquely identified by its name”

“A student may choose several subjects”

“A student is allocated a result for each subject”

“Each student has only one name.”

Data elements are

Student#, StudentName, SubjectName and Grade

Page 63: Database Design E R 2009

63

FDDs Examples

Results obtained by each student for each subject.

Student#  StudentName  SubjectName  Grade
1         Smith        History      A
1         Smith        Geography    B
1         Smith        Mathematics  A
2         Jones        History      C
2         Jones        English      C
3         Smith        English      A
3         Smith        Mathematics  A
4         Andrews      English      D
4         Andrews      French       C
4         Andrews      Geography    C

Try to construct an FDD for this table, considering the given Business Rules and the Data Elements.

Page 64: Database Design E R 2009

64

FDDs Examples

Results obtained by each student for each subject.

We can see that there is one and only one student name for each student number, even though there might be more than one student with the same name. So:

Student# → StudentName

But the grade for any student cannot be determined by the subject name or the student number by itself. A student can have many grades depending on the subject. How can we cater for this?

Page 65: Database Design E R 2009

65

FDDs Answer

Results obtained by each student for each subject.

We need to combine the two elements to say that there is one and only one grade for a student doing a particular subject. This combination is called the Composite Determinant. Here then is the complete diagram:

(Student#, SubjectName) → Grade
Student# → StudentName
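The need for a composite determinant can be confirmed on the sample results table. In this sketch the function name `holds` is an assumption; the rows are the ones shown on the earlier slide. As always, sample data can only refute a dependency, not prove it.

```python
NAMES = ("Student#", "StudentName", "SubjectName", "Grade")
RESULTS = [dict(zip(NAMES, row)) for row in [
    (1, "Smith", "History", "A"),
    (1, "Smith", "Geography", "B"),
    (1, "Smith", "Mathematics", "A"),
    (2, "Jones", "History", "C"),
    (2, "Jones", "English", "C"),
    (3, "Smith", "English", "A"),
    (3, "Smith", "Mathematics", "A"),
    (4, "Andrews", "English", "D"),
    (4, "Andrews", "French", "C"),
    (4, "Andrews", "Geography", "C"),
]]


def holds(rows, determinant, dependent):
    """True if each combination of determinant values has only one dependent value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


print(holds(RESULTS, ("Student#",), "Grade"))                # student 1 has grades A and B
print(holds(RESULTS, ("SubjectName",), "Grade"))             # History has grades A and C
print(holds(RESULTS, ("Student#", "SubjectName"), "Grade"))  # the composite determinant
print(holds(RESULTS, ("Student#",), "StudentName"))
```

Neither attribute determines Grade on its own, while the pair does, which matches the composite determinant in the diagram.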

Page 66: Database Design E R 2009

66

FDDs Examples

Customer Orders

Order  Part#  CustomerName  Address
454    12     David Smith   1 John St, Hawthorn
454    23     David Smith   1 John St, Hawthorn
455    32     Emily Jones   45 Grattan St, Parkville
455    49     Emily Jones   45 Grattan St, Parkville
455    54     Emily Jones   45 Grattan St, Parkville
456    12     Mary Ho       44 Park St, Hawthorn
456    54     Mary Ho       44 Park St, Hawthorn

Validating functional dependencies

Using simple data and populating the table, check there is only one value of the dependent.

Page 67: Database Design E R 2009

67

FDDs Examples

“An order is uniquely identified by its number”
“Customers are uniquely identified by their names”
“A customer has only one address”
“An order belongs to only one customer”
“A part may be ordered only once on each order”

Order  Parts Ordered  CustomerName  Address
454    23, 12         David Smith   1 John St, Hawthorn
455    54, 49, 32     Emily Jones   45 Grattan St, Parkville
456    54, 12         Mary Ho       44 Park St, Hawthorn

Order → CustomerName → Address

Part#

Page 68: Database Design E R 2009

68

FDDs Examples

Employees and their tax file numbers

“Each employee has a unique employee number”
“Each employee has a unique tax file number”

Employee#  TaxFile#
1          1024-5321
2          3456-3294
3          8246-7106
4          8861-6750
5          1234-4765

Employee# determines TaxFile#:

Employee# → TaxFile#

TaxFile# determines Employee#:

TaxFile# → Employee#

Alternative keys

Page 69: Database Design E R 2009

69

Obtain Tutorial 2 from your tutor.

Page 70: Database Design E R 2009

70

Functional Dependency Diagrams

Database Design

Let’s look at the process of converting the FDD into a schema. We have a 12 step process to do so, which has an iterative component (a loop). The 12 steps are outlined in the next series of slides.

Page 71: Database Design E R 2009

71

Functional Dependency Diagram Preparation

1. Represent each data element as a box.
2. Represent each functional dependency by an arrow.
3. Eliminate augmented dependencies.
4. Eliminate transitive dependencies.
5. Eliminate pseudo-transitive dependencies.

By this stage, intersecting attributes should have been eliminated.
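Steps 3 to 5 all remove arrows that can be inferred from the remaining ones. One common way to test this, which the slides do not spell out and which is swapped in here as an assumption, is the attribute closure: an arrow X → Y is redundant if Y can still be reached from X using only the other arrows. The function names and the extra arrow Order → Address are hypothetical, added for illustration.

```python
def closure(attrs, fds):
    """All attributes reachable from attrs by repeatedly following the arrows in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and rhs not in result:
                result.add(rhs)
                changed = True
    return result


def remove_redundant(fds):
    """Drop every arrow that is implied by the arrows kept so far."""
    kept = list(fds)
    for fd in fds:
        others = [f for f in kept if f != fd]
        lhs, rhs = fd
        if rhs in closure(lhs, others):
            kept = others  # fd adds nothing, so leave it out
    return kept


# Arrows are (determinant tuple, dependent). The last one is a transitive
# dependency added for illustration: Order -> CustomerName -> Address.
FDS = [
    (("Order",), "CustomerName"),
    (("CustomerName",), "Address"),
    (("Order",), "Address"),
]

print(remove_redundant(FDS))
```

The same test also catches pseudo-transitive arrows, such as (A, C) → D when A → B and (B, C) → D are present, and augmented arrows, such as (A, C) → B when A → B is present.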

Page 72: Database Design E R 2009

72

Deriving 3NF Schema: Synthesis Algorithm

6. Pick any (unmarked) arrow in the diagram.

7. Follow it back to its source, and write down the name of the source:

S

8. Follow all arrows from the source data item, and write down the names of their destinations:

S, A, B, C

S is now the key of a 3NF relation (S, A, B, C).

[Diagram: S with arrows to A, B and C]

Page 73: Database Design E R 2009

73

Synthesis Algorithm: Deriving 3NF Schema

[Diagram: S with arrows to A, B and C; U1, U2 and U3]

9. Mark all the arrows just processed.

10. If there are any unmarked arrows in the diagram, go back to step 6.

11. Finally, determine the Universal Key. Any attribute which is not determined by any other attribute (ie. has no arrow going into it) is part of the Universal Key.

12. If the universal key is not already contained in any of the above relations, make it into a relation. The universal key is the key of the new relation.
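Steps 6 to 12 can be sketched in a few lines. This is one possible reading of the algorithm, not a definitive implementation: the function name `synthesise`, the representation of an arrow as a (determinant tuple, dependent) pair, and the output layout are assumptions, and the input is assumed to have had its redundant arrows removed already (steps 3 to 5).

```python
def synthesise(attributes, fds):
    """Derive relations from a cleaned-up FDD, following steps 6 to 12."""
    relations = []
    marked = set()
    for i, (source, _) in enumerate(fds):        # step 6: pick an unmarked arrow
        if i in marked:
            continue                             # step 10: skip arrows already processed
        destinations = []
        for j, (lhs, rhs) in enumerate(fds):     # steps 7-8: all arrows leaving the source
            if lhs == source:
                destinations.append(rhs)
                marked.add(j)                    # step 9: mark the arrows just processed
        relations.append({"key": source,
                          "attributes": source + tuple(destinations)})

    # Step 11: the universal key is every attribute with no arrow going into it.
    determined = {rhs for _, rhs in fds}
    universal_key = tuple(a for a in attributes if a not in determined)

    # Step 12: add the universal key as a relation unless one already contains it.
    if not any(set(universal_key) <= set(r["attributes"]) for r in relations):
        relations.append({"key": universal_key, "attributes": universal_key})
    return relations


# The student results FDD from the earlier slides.
ATTRIBUTES = ("Student#", "StudentName", "SubjectName", "Grade")
FDS = [
    (("Student#",), "StudentName"),
    (("Student#", "SubjectName"), "Grade"),
]

for relation in synthesise(ATTRIBUTES, FDS):
    print(relation)
```

For this input the universal key (Student#, SubjectName) is already contained in the relation holding Grade, so step 12 adds nothing. With no arrows at all, as in the students and subjects example, the universal key becomes the only relation.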

Page 74: Database Design E R 2009

74

A Fully Worked Example

We will now work from a given set of forms to produce an FDD then use the 12 steps to produce the Schema. The forms that follow show the time spent by a particular employee on a particular project. They contain details of the employee along with details of the project. In addition they also state the hours that the employee has spent on any one project to date. This is important to the FDD. Notice also that the employee can have many previous titles and have a number of skills. This also has to be dealt with in the FDD and then later after we have used the synthesis technique to create the Schema. Have a good look at the forms on the next 2 slides and try to develop the FDD yourself.

Page 75: Database Design E R 2009

75

Personnel Database Forms 1

EMPLOYEE
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Adams  1001      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst   Investments

PROJECTS
NAME               TIME_SPENT  P_NUMBER  MANAGER  ACTUAL_COST  EXPECTED_COST
Resolve bad debts  35          26713     Kanter   2000         1500

We say that this table is in “zero normal form” (0NF). This is because the cells have multiple values, e.g. Prior titles and Skills. The next slide shows forms that demonstrate that an employee can work on many projects.

Page 76: Database Design E R 2009

76

Personnel Database Forms 2

EMPLOYEE
NAME   E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Baker  1002      Finance     9th Floor  Senior consultant  Junior consultant  Stock market
                                                           Research analyst

PROJECTS
NAME           TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
Res bad debts  18          26713     Kanter       2000         1500

EMPLOYEE
NAME    E_NUMBER  DEPARTMENT  LOCATION   CURRENT TITLE      PRIOR_TITLES       SKILLS
Clarke  1003      Accounting  8th Floor  Senior consultant  Junior consultant  Stock market
                                                                               Investments

PROJECTS
NAME                TIME_SPENT  P_NUMBER  MANAGER_NUM  ACTUAL_COST  EXPECTED_COST
New billing system  26          23760     Yates        1000         10000
New office lease    10          26511     Yates        5000         5000

Page 77: Database Design E R 2009

77

Personnel Database FD Diagram

From the forms given we can produce the following FDD.

[Figure: functional dependency diagram over the attributes E_NUMBER, EMPLOYEE_NAME, CURRENT_TITLE, PRIOR_TITLE, SKILL, DEPARTMENT_NAME, LOCATION, P_NUMBER, PROJECT_NAME, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST and TIME_SPENT.]

Page 78: Database Design E R 2009

78

Personnel Database FD Diagram - Synthesis

Let us just consider the section of the FDD that looks at the project number as the determinant.

[Figure: FDD fragment containing P_NUMBER, PROJECT_NAME, MANAGER_NUM, ACTUAL_COST and EXPECTED_COST.]

By using the synthesis method we can choose an arrow, trace it back to the source, and gather together all of the attributes that the source points to. Try this and see if you can create the schema for this table.

Page 79: Database Design E R 2009

79

Personnel Database FD Diagram - Synthesis

[Figure: FDD fragment DEPARTMENT_NAME → LOCATION.]

Again, if we choose another arrow that has not been chosen before and follow it back to the determinant, we find that DEPARTMENT_NAME is a determinant. Gathering all of the attributes that it points to, we only have the LOCATION attribute. Hence this is a simple table consisting of DEPARTMENT_NAME as the primary key and LOCATION as the only other attribute.

So the table DEPT(DEPARTMENT_NAME, LOCATION) is created.

Page 80: Database Design E R 2009

80

Personnel Database FD Diagram - Synthesis

[Figure: FDD fragment with E_NUMBER as the determinant of EMPLOYEE_NAME, CURRENT_TITLE and DEPARTMENT_NAME.]

Likewise for the section of the FDD based around E_NUMBER, creating the following table for the employee's details, with E_NUMBER as the primary key.

EMPLOYEE (EMPLOYEE_NAME, E_NUMBER, DEPARTMENT_NAME, CURRENT_TITLE)

Page 81: Database Design E R 2009

81

Personnel Database FD Diagram - Synthesis

[Figure: FDD fragment with E_NUMBER and P_NUMBER boxed together, pointing to TIME_SPENT.]

Here we have a slightly more complicated one. The time spent on the project is dependent on both the project number and the employee number, as it is the time spent by a particular employee on a particular project. This is demonstrated by the boxing of both of these attributes together, pointing to TIME_SPENT.

Try to create the Assignment table for this part of the FDD. When you think you have it, have a look at ours and see if you are right.

Page 82: Database Design E R 2009

82

Personnel Database FD Diagram - Synthesis

[Figure: FDD fragment with E_NUMBER and P_NUMBER boxed together, pointing to TIME_SPENT.]

The main difference here is that when choosing the arrow to follow back to the determinant we find that we have two. This is OK; we just have to make sure that in the table both of them form the primary key. We have a composite primary key consisting of P_NUMBER and E_NUMBER. When we then gather up all of the attributes that they point to together we get TIME_SPENT. Hence the table is written as

ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)

Note the composite primary key (E_NUMBER, P_NUMBER).

Page 83: Database Design E R 2009

83

Personnel Database FD Diagram - Universal Key

Now, the last part of the synthesis is often forgotten. We must collect up all of the attributes that do not have arrows pointing into them and place them in one table called the Universal Key. Every attribute collected then becomes part of the composite primary key. In this case we have the attributes E_NUMBER, P_NUMBER, PRIOR_TITLE and SKILL. Notice how SKILL is there, as it sits by itself. Nothing is its determinant.

UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)
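The synthesis steps described so far can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the list `FDS` is our reading of the arrows in the Personnel FDD, and the function names `synthesise` and `universal_key` are hypothetical.

```python
# Functional dependencies read off the Personnel FDD as (determinant, dependent).
# Determinants are tuples so that composite determinants can be expressed.
# This list is an assumption based on the attributes shown on the slides.
FDS = [
    (("P_NUMBER",), "PROJECT_NAME"),
    (("P_NUMBER",), "MANAGER_NUM"),
    (("P_NUMBER",), "ACTUAL_COST"),
    (("P_NUMBER",), "EXPECTED_COST"),
    (("DEPARTMENT_NAME",), "LOCATION"),
    (("E_NUMBER",), "EMPLOYEE_NAME"),
    (("E_NUMBER",), "CURRENT_TITLE"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
]

ATTRIBUTES = [
    "E_NUMBER", "EMPLOYEE_NAME", "CURRENT_TITLE", "DEPARTMENT_NAME",
    "LOCATION", "P_NUMBER", "PROJECT_NAME", "MANAGER_NUM",
    "ACTUAL_COST", "EXPECTED_COST", "TIME_SPENT", "PRIOR_TITLE", "SKILL",
]


def synthesise(fds):
    """Trace each arrow back to its determinant and gather what it points to.

    Returns one entry per determinant: the determinant is the primary key
    and the gathered dependents are the other attributes of the table.
    """
    tables = {}
    for determinant, dependent in fds:
        tables.setdefault(determinant, []).append(dependent)
    return tables


def universal_key(attributes, fds):
    """Collect the attributes that have no arrow pointing into them."""
    dependents = {dependent for _, dependent in fds}
    return [a for a in attributes if a not in dependents]


tables = synthesise(FDS)
uk = universal_key(ATTRIBUTES, FDS)

for determinant, dependents in tables.items():
    print("key", determinant, "attributes", dependents)
print("UK", uk)
```

Each key printed corresponds to one table produced by synthesis, and the final line lists the attributes that make up the Universal Key.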

Page 84: Database Design E R 2009

84

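Foreign Keys

In the Synthesis Algorithm, a foreign key will arise from any attribute that is:

A. both a determinant and part of another determinant, OR

B. both a determinant and a dependent.

[Figure: FDD fragment over E_NUMBER, P_NUMBER, TIME_SPENT, DEPARTMENT_NAME and LOCATION, with the two cases labelled A and B.]

ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)

EMPLOYEE (E_NUMBER, DEPARTMENT_NAME)

DEPT (DEPARTMENT_NAME, LOCATION)

Case A: E_NUMBER is a determinant in EMPLOYEE and part of the composite determinant of ASSIGNMENT, so E_NUMBER in ASSIGNMENT is a foreign key.

Case B: DEPARTMENT_NAME is a dependent in EMPLOYEE and the determinant of DEPT, so DEPARTMENT_NAME in EMPLOYEE is a foreign key.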
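The two rules can be checked mechanically. The sketch below is an illustration only: the function name `foreign_key_attributes` is hypothetical, and for simplicity it tests only single-attribute determinants.

```python
# The three dependencies behind ASSIGNMENT, EMPLOYEE and DEPT on this slide.
FDS = [
    (("E_NUMBER", "P_NUMBER"), "TIME_SPENT"),
    (("E_NUMBER",), "DEPARTMENT_NAME"),
    (("DEPARTMENT_NAME",), "LOCATION"),
]


def foreign_key_attributes(fds):
    """Return {attribute: set of rules ("A", "B") that make it a foreign key}."""
    determinants = {det for det, _ in fds}
    dependents = {dep for _, dep in fds}
    result = {}
    for det in sorted(determinants):
        if len(det) != 1:
            continue  # simplification: only single-attribute determinants are tested
        attr = det[0]
        rules = set()
        # Rule A: a determinant that is also part of another determinant.
        if any(attr in other for other in determinants if other != det):
            rules.add("A")
        # Rule B: a determinant that is also a dependent.
        if attr in dependents:
            rules.add("B")
        if rules:
            result[attr] = rules
    return result


fk = foreign_key_attributes(FDS)
print(fk)
```

With these three dependencies, E_NUMBER is reported under rule A and DEPARTMENT_NAME under rule B.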

Page 85: Database Design E R 2009

85

ISA = Is A

[Figure: MANAGER_NUM ISA E_NUMBER; in the network diagram EMPLOYEE and PROJECT are linked by MANAGER_NUM.]

Every MANAGER_NUM value is an E_NUMBER value. This gives rise to a new foreign key.

In the case of the manager we say that the manager number is contained within the employee number.
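One way to see what the ISA constraint means in practice is to declare it as a foreign key and let the database reject a manager number that is not an employee number. The sketch below uses Python's built-in `sqlite3` module. The employee number 2001 for Kanter is hypothetical (the forms record managers by name only), and `try_insert_project` is an illustrative helper, not something from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (
    E_NUMBER INTEGER PRIMARY KEY,
    NAME     TEXT NOT NULL
);
CREATE TABLE PROJECT (
    P_NUMBER    INTEGER PRIMARY KEY,
    NAME        TEXT NOT NULL,
    -- MANAGER_NUM ISA E_NUMBER: every manager number must be an employee number
    MANAGER_NUM INTEGER NOT NULL REFERENCES EMPLOYEE (E_NUMBER)
);
""")

# Hypothetical employee number for the manager named on the forms.
conn.execute("INSERT INTO EMPLOYEE VALUES (2001, 'Kanter')")


def try_insert_project(p_number, name, manager_num):
    """Insert a project; return False if the foreign key is violated."""
    try:
        conn.execute(
            "INSERT INTO PROJECT VALUES (?, ?, ?)", (p_number, name, manager_num)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_project(26713, "Resolve bad debts", 2001)
rejected = not try_insert_project(23760, "New billing system", 9999)
print(accepted, rejected)
```

The first insert succeeds because 2001 exists in EMPLOYEE; the second is refused because 9999 does not.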

Page 86: Database Design E R 2009

86

Personnel Database Schema Generated by Synthesis

ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)

EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)

PROJECT (NAME, P_NUMBER, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST)

DEPT (DEPARTMENT, LOCATION)

UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL)

The foreign key MANAGER_NUM in PROJECT is a result of MANAGER_NUM ISA E_NUMBER.

Page 87: Database Design E R 2009

87

Personnel Database Network Diagram Generated by Synthesis

[Figure: network diagram of the tables ASSIGNMENT, EMPLOYEE, PROJECT, UK and DEPT, with links labelled E_NUMBER + P_NUMBER, E_NUMBER, P_NUMBER, DEPARTMENT_NAME and MANAGER_NUM.]

Page 88: Database Design E R 2009

88

A Fully Worked Example

We now have to take care of the multi-valued areas such as Skills and Prior Titles. Our FDD synthesis takes care of everything up to that point: it converts the FDD to what we call “Third Normal Form”. We know that an individual can have many Skills and many Prior Titles. They can also work on many Projects. Knowing the employee number will not tell us one and only one value of the Skills that they have. We show this on the extended FDD with a double arrow notation.

[Figure: E_NUMBER →→ SKILL]

The notation for such a relationship is shown here, where E_NUMBER is a determinant for many values of SKILL. Consequently the representation shown on the next slide can be constructed, giving rise to the splitting of the UK to form three more relations.

Page 89: Database Design E R 2009

89

Personnel Database Multivalued Dependency - Decomposition

[Figure: E_NUMBER with multivalued dependencies (MVDs) to P_NUMBER, PRIOR_TITLE and SKILL.]

Multivalued Dependency: Employees are associated with Projects, Titles and Skills independently. There is no direct relationship between Projects, Titles and Skills.

Hence we have the three new relations ASSIGN, PRIOR_JOB and EXPERTISE:

ASSIGN (E_NUMBER, P_NUMBER)

PRIOR_JOB (E_NUMBER, PRIOR_TITLE)

EXPERTISE (E_NUMBER, SKILL)
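Because projects, prior titles and skills are independent of one another, nothing is lost by splitting the UK: joining the three smaller relations on E_NUMBER gives back exactly the original rows. The sketch below illustrates this with the values from the Baker and Clarke forms. The variable names are ours, and the UK is built under the assumption that every combination of an employee's projects, titles and skills is present.

```python
from itertools import product

# Independent facts per employee, read from the Baker (1002) and Clarke (1003) forms.
projects = {1002: {26713}, 1003: {23760, 26511}}
prior_titles = {1002: {"Junior consultant", "Research analyst"},
                1003: {"Junior consultant"}}
skills = {1002: {"Stock market"}, 1003: {"Stock market", "Investments"}}

# UK (E_NUMBER, P_NUMBER, PRIOR_TITLE, SKILL): every combination per employee.
uk = {
    (e, p, t, s)
    for e in projects
    for p, t, s in product(projects[e], prior_titles[e], skills[e])
}

# Decompose on the multivalued dependencies (projection onto each pair).
assign = {(e, p) for e, p, _, _ in uk}
prior_job = {(e, t) for e, _, t, _ in uk}
expertise = {(e, s) for e, _, _, s in uk}

# Natural join of the three relations on E_NUMBER.
rejoined = {
    (e, p, t, s)
    for e, p in assign
    for e2, t in prior_job if e2 == e
    for e3, s in expertise if e3 == e
}

print(len(uk), len(assign), len(prior_job), len(expertise))
print(rejoined == uk)
```

The final comparison confirms that the decomposition is a lossless join for this population.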

Page 90: Database Design E R 2009

90

Personnel Database FD Diagram with MVDs and Inclusion

[Figure: the full FDD over E_NUMBER, EMPLOYEE_NAME, CURRENT_TITLE, PRIOR_TITLE, SKILL, DEPARTMENT_NAME, LOCATION, P_NUMBER, PROJECT_NAME, MANAGER_NUM, ACTUAL_COST, EXPECTED_COST and TIME_SPENT, annotated with two MVD links and an ISA link.]

Page 91: Database Design E R 2009

91

Final Personnel Database Schema

ASSIGNMENT (E_NUMBER, P_NUMBER, TIME_SPENT)

PRIOR_JOB (E_NUMBER, PRIOR_TITLE)

EXPERTISE (E_NUMBER, SKILL)

EMPLOYEE (NAME, E_NUMBER, DEPARTMENT, CURRENT_TITLE)

PROJECT (NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)

DEPT (DEPARTMENT, LOCATION)

PRIOR_JOB and EXPERTISE are decomposed from UK.
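To check that the final schema still holds everything on the original forms, it can be loaded into a database and queried. The sketch below uses Python's built-in `sqlite3` module with Clarke's form as the population. The column types and key declarations are assumptions, and because the forms record the manager by name, MANAGER is stored here as text and the ISA foreign key is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce REFERENCES
conn.executescript("""
CREATE TABLE DEPT (
    DEPARTMENT TEXT PRIMARY KEY,
    LOCATION   TEXT
);
CREATE TABLE EMPLOYEE (
    E_NUMBER      INTEGER PRIMARY KEY,
    NAME          TEXT,
    DEPARTMENT    TEXT REFERENCES DEPT (DEPARTMENT),
    CURRENT_TITLE TEXT
);
CREATE TABLE PROJECT (
    P_NUMBER      INTEGER PRIMARY KEY,
    NAME          TEXT,
    MANAGER       TEXT,
    ACTUAL_COST   INTEGER,
    EXPECTED_COST INTEGER
);
CREATE TABLE ASSIGNMENT (
    E_NUMBER   INTEGER REFERENCES EMPLOYEE (E_NUMBER),
    P_NUMBER   INTEGER REFERENCES PROJECT (P_NUMBER),
    TIME_SPENT INTEGER,
    PRIMARY KEY (E_NUMBER, P_NUMBER)
);
CREATE TABLE PRIOR_JOB (
    E_NUMBER    INTEGER REFERENCES EMPLOYEE (E_NUMBER),
    PRIOR_TITLE TEXT,
    PRIMARY KEY (E_NUMBER, PRIOR_TITLE)
);
CREATE TABLE EXPERTISE (
    E_NUMBER INTEGER REFERENCES EMPLOYEE (E_NUMBER),
    SKILL    TEXT,
    PRIMARY KEY (E_NUMBER, SKILL)
);

INSERT INTO DEPT VALUES ('Accounting', '8th Floor');
INSERT INTO EMPLOYEE VALUES (1003, 'Clarke', 'Accounting', 'Senior consultant');
INSERT INTO PROJECT VALUES (23760, 'New billing system', 'Yates', 1000, 10000);
INSERT INTO PROJECT VALUES (26511, 'New office lease', 'Yates', 5000, 5000);
INSERT INTO ASSIGNMENT VALUES (1003, 23760, 26);
INSERT INTO ASSIGNMENT VALUES (1003, 26511, 10);
INSERT INTO PRIOR_JOB VALUES (1003, 'Junior consultant');
INSERT INTO EXPERTISE VALUES (1003, 'Stock market');
INSERT INTO EXPERTISE VALUES (1003, 'Investments');
""")

# Rebuild the PROJECTS part of Clarke's form from the normalised tables.
clarke_projects = conn.execute("""
    SELECT P.NAME, A.TIME_SPENT, P.P_NUMBER, P.MANAGER, P.ACTUAL_COST, P.EXPECTED_COST
    FROM ASSIGNMENT AS A
    JOIN PROJECT AS P ON P.P_NUMBER = A.P_NUMBER
    WHERE A.E_NUMBER = 1003
    ORDER BY P.P_NUMBER
""").fetchall()

clarke_skills = [row[0] for row in conn.execute(
    "SELECT SKILL FROM EXPERTISE WHERE E_NUMBER = 1003 ORDER BY SKILL")]

print(clarke_projects)
print(clarke_skills)
```

The join recovers the two project rows from Clarke's form, and the multi-valued Skills column is recovered as separate rows of EXPERTISE.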

Page 92: Database Design E R 2009

92

Final Personnel Database Network Diagram

[Figure: network diagram of the tables ASSIGNMENT, EMPLOYEE, PROJECT, PRIOR_JOB, EXPERTISE and DEPT, with links labelled E_NUMBER, P_NUMBER, DEPARTMENT_NAME and MANAGER_NUM.]

Page 93: Database Design E R 2009

93

Personnel Database FD Diagram - Synthesis

[Figure: FDD fragment with P_NUMBER as the determinant of PROJECT_NAME, MANAGER, ACTUAL_COST and EXPECTED_COST.]

Choosing any of the arrows and following it back leads you to the project number (P_NUMBER). This is then the primary key. If you then gather all of the attributes that P_NUMBER points to and place them in the brackets, you get the table PROJECT with P_NUMBER as the primary key.

PROJECT (PROJECT_NAME, P_NUMBER, MANAGER, ACTUAL_COST, EXPECTED_COST)

Page 94: Database Design E R 2009

94

Role Splitting In Functional Dependency Diagrams

In a Functional Dependency Diagram any group of attributes can be related in only one way. For example, a pair of attributes can be related by an FD or not. Sometimes data can be related in more than one way.

For example, a department can have an employee as its head or as a member.

The member relationship is represented in the FDD: E_NUMBER → DEPARTMENT_NAME

But the head relationship is represented in the FDD: DEPARTMENT_NAME → E_NUMBER

Page 95: Database Design E R 2009

95

Role Splitting In Functional Dependency Diagrams

We can choose to split the E_NUMBER attribute into E_NUMBER and HOD.

[Figure: FDD relating E_NUMBER, DEPARTMENT_NAME and HOD, with an ISA link; synthesis yields a network diagram of EMPLOYEE and DEPT linked by DEPARTMENT_NAME and HOD.]

But the foreign key constraint that a Head of Department is an Employee is lost on the FDD.

Page 96: Database Design E R 2009

96

Role Splitting In FDDs

Alternatively, we can choose to split the DEPARTMENT_NAME attribute into EMPLOYING_DEPT and HEADED_DEPT.

[Figure: FDD relating E_NUMBER, EMPLOYING_DEPT and HEADED_DEPT, with an ISA link; synthesis yields a network diagram of EMPLOYEE and DEPT linked by EMPLOYING_DEPT and E_NUMBER.]

But the foreign key constraint that an Employing Department must be a Headed Department is again lost on the FDD.

Page 97: Database Design E R 2009

97

Role Splitting Example

Consider this example. We have the Employee with many Skills and Prior Titles, as before, but we also have equipment that belongs to a particular employee, such as a computer and a fax. An employee can have many different pieces of equipment. It is worthwhile recognizing them on the diagram and then decomposing them into smaller relations as part of the schema.

Page 98: Database Design E R 2009

98

[Figure: FDD over E_NUMBER, EMPLOYEE_NAME, CURRENT_TITLE, PRIOR_TITLE, SKILL, DEPARTMENT_NAME, LOCATION, HOD, SERIAL# and DESCRIPTION, annotated with MVDs, an ISA link and the UK.]

Suppose each item of equipment (identified by SERIAL#) belongs to an employee.

• MVDs not necessarily embodied in the UK.
• Better to decompose on MVDs first.
• MVDs partition attributes into independent sets.

Page 99: Database Design E R 2009

99

Obtain Tutorial 3 from your tutor.

Page 100: Database Design E R 2009

100

ENTITY RELATIONSHIP ANALYSIS

In this area of the course we concentrate on another modelling technique called Entity Relationship Modelling (ERM or ER).

The first stage of this process will look at the following:

• ER Data Model and Notation
• Strong Entities
• Discovering Entities, Attributes
• Identifying Entities
• Discovering Relationships

Page 101: Database Design E R 2009

101

Critique of FD Analysis

We originally concentrated on the modelling technique called Functional Dependency Diagrams. They have limitations as follows:

Disadvantages of FDD

• Does not represent real world objects, but only data;
• Cannot represent MVDs or specialization;
• Cannot represent multiple relationships without artificial splitting of attributes;
• Entities fragmented during analysis.

Page 102: Database Design E R 2009

102

Conceptual Data Analysis

By using the ER technique we have the following advantages:

• Data Analysis from the User's Point of View
• Models the Real World
• Independent of Technology
• Able to be validated in user terms

Page 103: Database Design E R 2009

103

Entity Relationship Data Model Features

The real value of using this type of modelling is that it considers the design in the context of the environment that it comes from. We have Entities that have their own identifying attributes: real things and real people. They can be observed in the environment. ERM has the following features:

• Populations of Real World objects represented by Entities
• Objects have Natural Identity
• Entities have Attributes which have values
• Entities related by Relationships
• Constraints
• Subtypes

Page 104: Database Design E R 2009

104

Occurrences versus Entities

[Figure: two objects, Jack Ackov (56) and Jill Hill (28), labelled as Entity Occurrences (also called Entity Instances or Objects).]

Let’s consider these two instances. Here we have both Jack and Jill, labelled 56 and 28 respectively. By themselves they exist as people in their environment. In this case we consider them to be two customers. If we wish to model them and all of the possible customers that we have, we need to create an Entity Class for all possibilities.

Page 105: Database Design E R 2009

105

Occurrences versus Entities

[Figure: the occurrences Jack Ackov (56) and Jill Hill (28) generalised to the entity CUSTOMER with attributes Customer# and CustName.]

Entity Occurrences (also Entity Instances, Objects): these are the tuples of the table below.

Entity Classes (also Entity Types, Entity Sets): this will convert to the schema below, with Customer# being the primary key.

CUSTOMER(Customer#, CustName)

Customer#  CustName
56         Jack Ackov
28         Jill Hill
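For readers who think in programming terms, the distinction between an entity class and its occurrences resembles the distinction between a class and its objects. The following sketch is only an analogy; the Python names `Customer`, `customer_no` and `cust_name` are ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Entity class CUSTOMER(Customer#, CustName)."""
    customer_no: int  # Customer#, the primary key
    cust_name: str    # CustName


# Entity occurrences (instances, objects): the tuples of the CUSTOMER table.
customers = [Customer(56, "Jack Ackov"), Customer(28, "Jill Hill")]

# Index by primary key; a duplicate Customer# would collapse two entries into one.
by_key = {c.customer_no: c for c in customers}
keys_are_unique = len(by_key) == len(customers)

print(by_key[28].cust_name, keys_are_unique)
```

The class plays the role of the schema, and each object plays the role of one row.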

Page 106: Database Design E R 2009

106

[Figure: Jack Ackov (56) and Jill Hill (28) linked to the items Bike, Cup of Tea and Pussy Cat (Stock# 23, 156 and 234, each with a price), with order quantities 3, 12, 4 and 1 on the links.]

Here we have Jack and Jill placing orders for particular items of stock. They appear to order different amounts of each. For instance, Jack orders 3 bikes. Each item being ordered also has a Stock#, Price and Description. These are individual instances of the process, so we need to be able to represent any possibility of this in our model. See how we do this on the next page.

Page 107: Database Design E R 2009

107

[Figure: the same occurrences generalised to the entities CUSTOMER (Customer#, CustName) and ITEM (Stock#, Desc, Price), joined by the relationship ORDERS with the attribute Quantity.]

Page 108: Database Design E R 2009

108

Occurrences to Entities to Schemas

CUSTOMER(Customer#, CustName)

Customer#  CustName
56         Jack Ackov
28         Jill Hill

ITEM(Stock#, Price, Desc)

Stock#  Price  Desc
23      50     Bike
156     1      Cup of Tea
234     25     Pussy Cat

ORDERS(Customer#, Stock#, Quantity)

Customer#  Stock#  Quantity
56         23      3
56         156     12
28         156     4
28         234     1
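The three schemas can be tried out directly. The sketch below loads the occurrences into an in-memory SQLite database through Python's `sqlite3` module and joins them back together. Several details are assumptions rather than part of the slide: the `#` in the column names is replaced by `No`, `Desc` is spelled out as `Description` because DESC is an SQL keyword, the prices follow our reading of the slide, and the primary key of ORDERS is taken to be (Customer#, Stock#).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUSTOMER (
    CustomerNo INTEGER PRIMARY KEY,
    CustName   TEXT
);
CREATE TABLE ITEM (
    StockNo     INTEGER PRIMARY KEY,
    Price       INTEGER,
    Description TEXT
);
CREATE TABLE ORDERS (
    CustomerNo INTEGER REFERENCES CUSTOMER (CustomerNo),
    StockNo    INTEGER REFERENCES ITEM (StockNo),
    Quantity   INTEGER,
    PRIMARY KEY (CustomerNo, StockNo)
);
""")

conn.executemany("INSERT INTO CUSTOMER VALUES (?, ?)",
                 [(56, "Jack Ackov"), (28, "Jill Hill")])
conn.executemany("INSERT INTO ITEM VALUES (?, ?, ?)",
                 [(23, 50, "Bike"), (156, 1, "Cup of Tea"), (234, 25, "Pussy Cat")])
conn.executemany("INSERT INTO ORDERS VALUES (?, ?, ?)",
                 [(56, 23, 3), (56, 156, 12), (28, 156, 4), (28, 234, 1)])

# Rebuild the original picture: who ordered how many of what.
order_lines = conn.execute("""
    SELECT C.CustName, I.Description, O.Quantity
    FROM ORDERS AS O
    JOIN CUSTOMER AS C ON C.CustomerNo = O.CustomerNo
    JOIN ITEM AS I ON I.StockNo = O.StockNo
    ORDER BY C.CustomerNo, I.StockNo
""").fetchall()

for line in order_lines:
    print(line)
```

Each row of the result corresponds to one link between a customer occurrence and an item occurrence in the earlier diagram.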

Page 109: Database Design E R 2009

109

ENTITIES

Entities are classes of objects about which we wish to store information. Examples are:

• People: Employees, Customers, Students, ...
• Places: Offices, Cities, Routes, Warehouses, ...
• Things: Equipment, Products, Vehicles, Parts, ...
• Organizations: Suppliers, Teams, Agencies, Depts, ...
• Concepts: Projects, Orders, Complaints, Accounts, ...
• Events: Meetings, Appointments.

[Figure: diagram symbols for STRONG and WEAK entities.]

Page 110: Database Design E R 2009

110

STRONG ENTITIES

An entity is Existence Independent if an instance can exist in isolation. For example, CUSTOMER is existence independent of ORDER, but ORDER is existence dependent on CUSTOMER. The ORDER is by a particular customer for one or many particular items.

An entity is identified if each instance can be uniquely distinguished by its attributes (or relationships). For example, CUSTOMER is identified by Customer#, PERSON is identified by Name+Address+DoB, ORDER is identified by Customer#+Date+Time.

Page 111: Database Design E R 2009

111

STRONG ENTITIES

An entity is STRONG if it can be identified by its (own) immediate attributes. Otherwise it is weak. For example, CUSTOMER and PERSON are strong entities, but ORDER is weak because it requires an attribute of another entity to identify it. ORDER would be strong if it had an Order#.

Existence independent entities are always strong.

Page 112: Database Design E R 2009

112

The Method: How to Develop the ERM

Step1. Search for Strong Entities and Attributes.
Step2. Attach attributes and identify strong entities.
Step3. Search for relationships.
Step4. Determine constraints.
Step5. Attach remaining attributes to entities and relationships.
Step6. Expand multivalued attributes, and relationship attributes. Represent attributed relationships and/or multivalued attributes in a Functional Dependency Diagram.
Step7. Identify weak entities.
Step8. Iterate steps 4,5,6,7,8 until no further expansion is possible.
Step9. Look for generalization and specialization; Analyze Cycles; Convert domain-sharing attributes to entities.

Page 113: Database Design E R 2009

113

The Method

[Figure: flow diagram of the method. Narrative & Forms feed step 1 (search for strong entities and attributes), which yields Entities and Attributes; step 2 identifies strong entities; step 3 searches for relationships; steps 4 & 5 determine constraints and attach attributes, producing the Entity-Relationship Diagram; step 6 expands attributed relationships and/or multivalued attributes into weak entities; step 6’ represents attributed relationships and/or multivalued attributes as functional dependencies (Functional Dependency Diagrams); step 7 identifies weak entities.]

Page 114: Database Design E R 2009

114

Step1: Search for Strong Entities and Attributes

1 Entities
• relevant nouns
• many instances
• have properties (attributes or relationships)
• identifiable by properties

2 Strong Entities
• independent existence
• identifiable by own single-valued attributes

3 Attributes
• printable names, measurements
• domain of values
• no properties
• dependent existence

Page 115: Database Design E R 2009

115

A worked example finding strong Entities

Narrative

A customer is identified by a customer#. A customer has a name and an address. A customer may order quantities of many items. An item may be ordered by many customers. An item is identified by a stock#. An item has a description and a price. A stock item may have many colours. Any item ordered by a customer on the same day is part of the same order.

Here we have a scenario. Try first to identify all of the strong entities, followed by all of the attributes. Can you also identify a weak entity? Are there any attributes that you have missed?

Page 116: Database Design E R 2009

116

Worked Example Continued

Let us mark the nouns in the narrative. These lead us to what we will consider to be the strong entities. If we then mark the items that we think would be the attributes, we can see if any of the identified entities are strong. You will notice that the item has a description, price, colour and stock#, and a customer has a customer number, name and address. These are Existence Independent Entities, and hence they must be strong.

[Figure: the narrative from the previous slide with the nouns and the candidate attributes marked.]

Page 117: Database Design E R 2009

117

Worked Example Continued

Conceptual Schema

[Figure: the entities CUSTOMER, ITEM and ORDER with the attributes Customer#, Customer Name, Address, Stock#, Description, Price, Colour, Quantity and Date.]

We have our entities and the attributes displayed before us. Customer and Item are strong entities as they are Existence Independent. What about Order?

Order cannot be identified completely by any of its own attributes. It is dependent on the attributes of the other two entities to be identified. An order is made up of a customer ordering an item. We need the customer# and the stock# to identify the order.

Page 118: Database Design E R 2009

118

Step2. Identify Strong Entities.

Conceptual Schema

[Figure: CUSTOMER with Customer#, CustName and Address attached; ITEM with Stock#, Desc, Price and Colour attached; Date and Qty not yet attached.]

Both Customer and Item have what we call a Natural Identity.

We now attach the attributes that belong to each of the strong entities. Notice that there are some left that belong to neither Customer nor Item. We will look at this later.

Page 119: Database Design E R 2009

119

Another Example of the Difference Between Weak and Strong Entities

Here is another example of a common occurrence that demonstrates the difference between a strong entity and a weak entity.

A strong entity is identified by its own attributes. Bidders make purchases of goods at the auction.

BIDDER and GOOD have independent existence, hence are strong, but PURCHASE requires attributes of BIDDER and GOOD. The Purchase is then identified by the Bidder's name and the Good's description. These are two attributes that belong to the Bidder and the Good respectively.

Page 120: Database Design E R 2009

120

Additional Rules for Entities

For an Entity to exist we have the following additional rules:

• There must be more than one instance of an entity. Example: "The company provides superannuation for its workers." Here there is only one instance of COMPANY, so it is not a valid entity. We do not model anything that only has one instance.
• Each instance of an entity must be potentially distinguishable by its properties. Example: "Members send five dollars to the association." A dollar does not normally have distinguishing attributes.

Page 121: Database Design E R 2009

121

Step3. Search for Relationships.

We can now identify Relationships, which have the following properties:

• Have associated entities
• Are relevant: must be worth recording
• Can be "structural" verbs in the narrative: persistent, rather than transient, relationships
• Can be "abstract" nouns in the narrative: nonmaterial connections, e.g. Enrolment
• Can be verbalizable in the narrative: e.g. Student EnrolledIn Unit
• Have 2 (binary) or more associated entities (3 is ternary, up to n-ary for n associated entities)

Page 122: Database Design E R 2009

122

Relationships:

A relationship must be relevant. It should indicate a structural, persistent (extending over time) association between entities. Example: "Students enrol in units selected from the handbook."

A relationship should not usually indicate a procedural event (one that occurs momentarily, then is forgotten). Example: "Students read about units selected from the handbook."

Page 123: Database Design E R 2009

123

Relationships and the Worked Example.

Conceptual Schema

[Figure: CUSTOMER (Customer#, CustName, Address) and ITEM (Stock#, Desc, Price, Colour) joined by the relationship ORDERS, which carries Date and Qty.]

We can now deal with the order. The order is a relationship between the Customer and the Item. It is for a set Quantity on a given Date.

Page 124: Database Design E R 2009

124

Second Worked Example: The Agent

Analyze the data kept by the agent. Identify the entities, attributes and the relationships. To start with, look at the nouns.

Customers may order products stocked by various suppliers through the agent. The agent maintains a catalogue of what products are available from suppliers. The price of a product may depend on the supplier. Some products come in a variety of colours independently of supplier. Suppliers ship directly to customers and notify the agent only of the date and total. Customers then pay each supplier through the agent. The agent keeps records of all orders and payments, but is not interested in maintaining detailed invoice lines.

Page 125: Database Design E R 2009

125

Second Worked Example: The Agent

[Figure: the narrative from the previous slide with its nouns highlighted.]

We have Customers, Products, Suppliers and an Agent. How many Agents are there? This is the data for the Agent. There is only one instance. Hence we do not model it.

Page 126: Database Design E R 2009

126

The Agent: Additional Information

[Form 1: order]
Customer#: 28                    Date: Oct 3, 1996
Customer Name: Jill Hill
28 Fullview Lane, Glenvale

Stock#  Description  Qty
156     Cup of Tea   4
234     Pussy Cat    2

[Form 2: catalogue]
Manufacturer: Hill Creat Industries
Address: 23 Highhill Blvd, Sumpend

Stock#  Description  Price
156     Cup of Tea   1
234     Pussy Cat    25

[Form 3: shipping docket]
Manufacturer: Hill Creat Industries
Address: 23 Highhill Blvd, Sumpend
Customer#: 28                    Shipment Date: Oct 9, 1996
Customer Name: Jill Hill
Total: 54

These forms can tell us more information about the way the business runs.

Page 127: Database Design E R 2009

127

The Agent: Additional Information

• Notice that the forms also tell us the following additional facts:

• A Customer has a Cust#, Name and Address. The Supplier has a Name and Address and the stock has a Stock#, Description and Price.

• An order is made on a Date and is for the one Customer for many items. It also has the number of each item ordered.

• The shipping docket has the Date of shipping, both the Customer's and the Supplier's details, along with the total price of the goods delivered.

• Try to represent this yourself in a diagram with the strong entities and the relationships between them.

Page 128: Database Design E R 2009

128

The Agent: Strong Entities

Each of the entities below is strong. They have a Natural Identity and are Existence Independent. They are completely identifiable by their attributes.

[Figure: the strong entities CUSTOMER, SUPPLIER and PRODUCT with the attributes Cust#, Name, Address, Tradename, Address, Stock# and {Colour}.]

Page 129: Database Design E R 2009

129

The Agent: Relationships

The Customer orders a Quantity of a particular product. All products are supplied from a Supplier at a price.

[Figure: ER diagram of CUSTOMER, SUPPLIER and PRODUCT with the attributes Cust#, Name, Address, Tradename, Address, Stock# and {Colour}, adding the relationships ORDERS (with Qty) and AVAILABLE FROM (with Price).]

Page 130: Database Design E R 2009

130

The Agent: Final Solution

The Product is shipped from the Supplier to the Customer on a Date with a total cost for the goods, and the Customer pays the Supplier on a Date an amount (which could be the amount for a number of shipments).

[Figure: ER diagram of CUSTOMER, SUPPLIER and PRODUCT (Barcode, {Colour}) with the relationships ORDERS (Qty), AVAILABLE FROM (Price), RECEIVED FROM (Date, Total) and PAID (Paydate, Amount); the other attributes shown are Tradename, Name and Address.]

Page 131: Database Design E R 2009

131

Entity Relationship Analysis 2

We will now concentrate on the following areas of good ERM:

• Cardinality and Participation Constraints
• Expanding to Weak Entities
• Identifying Weak Entities
• Derived Attributes and Relationships
• Ternary Relationships

Page 132: Database Design E R 2009

132

These are Steps 4, 5 & 6 from the Original Diagram

[Figure: detail of the method flow diagram. Strong entities and Relationships feed steps 4 & 5 (determine constraints and attach attributes), producing the Entity-Relationship Diagram; step 6 expands attributed relationships, domain sharing and multivalued attributes into Weak Entities; unattached attributes and unidentified weak entities feed step 7 (identify weak entities), which yields identified weak entities.]

Page 133: Database Design E R 2009

133

Step4. Determine constraints: Cardinality (how many participate)

[Figure: CUSTOMER and ITEM joined by the relationship ORDERS.]

To complete this we “fix a single instance at one end and ask how many (one or many) are involved at the other end”. Look at the relationship where the Customer Orders an Item. Consider a single Customer. Can they order many Items at the one time? Yes, we have seen this. So we position a crow's foot (<) at the point where the line touches the entity Item. We then ask if an Item can be ordered by many Customers. Yes. So again we place a crow's foot at the Customer's end.

From left to right: a Customer can order many Items.

From right to left: an Item can be ordered by many Customers.

Page 134: Database Design E R 2009

134

Step4. Determine constraints: Cardinality.

[Figure: CUSTOMER and CITY joined by the relationship LIVES IN.]

Again, to complete this task we “fix a single instance at one end and ask how many (one or many) are involved at the other end”. All of the Customers live in a City. A Customer can only live in one City (unless they are politicians). In this case we must place a single straight line (|) at the intersection of the relationship line and the entity City. However, a City can have many Customers. We show this by placing a crow's foot (>) at the end near the Customer.

Page 135: Database Design E R 2009

135

Step4. The Resulting ER with the Cardinality Constraints in Place

[Figure: CUSTOMER joined to ITEM by ORDERS and to CITY by LIVES IN, with cardinality marks; ITEM has the multi-valued attribute {Colour}.]

• Many CUSTOMERs can ORDER an ITEM.
• Many ITEMs can be ORDERed by a CUSTOMER.
• Many CUSTOMERs can LIVE IN a CITY.
• A CUSTOMER can LIVE IN only one CITY.
• An ITEM can have many Colours.

Page 136: Database Design E R 2009

136

Step4. Determine constraints: Participation.

[Figure: CUSTOMER and ITEM joined by the relationship ORDERS.]

Again, we “fix a single instance at one end and ask if any must (might or must) be involved at the other end”. We ask, “Does the Customer have to order an Item?” Well, some would say that if they do not, they are not Customers! But we know that we must be able to recognise our Customers even though at present they do not have an order with us. So, in this case they do not have to place an order. This is then not mandatory, and we show it by placing the O beside the cardinality constraint. An Item does not have to be on an order either, so it also gets the O notation.

Page 137: Database Design E R 2009

137

Step4. Determine constraints: Participation.

[Figure: CUSTOMER and CITY joined by the relationship LIVES IN.]

This is also the case for the Customer living in the City. Does the Customer have to live in a City? In this case, yes, as we class all areas as being within a City. Hence we place the “|” symbol beside the cardinality constraint next to the entity City. The next one is difficult. Does a City have to have a Customer living in it? You might think no here, but are you prepared to record all of the cities in the world just to make sure? Common sense tells us that we have to make this mandatory, so we only keep a record of the cities where our Customers live.

Page 138: Database Design E R 2009

138

Step4. The Resulting ER with the Participation Constraints in Place

[Figure: CUSTOMER joined to ITEM by ORDERS and to CITY by LIVES IN, with participation marks.]

• An ITEM might be ordered by a CUSTOMER.
• A CUSTOMER might order an ITEM.
• A CITY must have a CUSTOMER LIVing IN it.
• A CUSTOMER must LIVE IN a CITY.

Page 139: Database Design E R 2009

139

Step4. Determine constraints: Validation by Population.

[Figure: CUSTOMER (Cust#) joined to ITEM (Stock#, {Colour}) by ORDERS and to CITY (CityName) by LIVES IN.]

An important method of evaluating the proposed model is to populate it with instances that demonstrate that the constraints that you have identified will work.

Page 140: Database Design E R 2009

140

Step4. Tables Created to Validate

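[Figure: CUSTOMER (Cust#) joined to ITEM (Stock#, {Colour}) by ORDERS and to CITY (CityName) by LIVES IN, with a sample population for each relationship.]

ORDERS

Cust#  Stock#
12     77
23     77
12     88
13     99

LIVES IN

CityName  Cust#
Ayr       12
Ayr       23
Tully     13

{Colour}

Colour  Stock#
Pink    77
Blue    77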
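The validation can also be done mechanically. The sketch below takes the sample population above and checks it against the cardinality and participation constraints from Step 4. It is an illustration only: the helper `max_partners` and the variable names are ours.

```python
# Sample population from the validation tables.
orders = [(12, 77), (23, 77), (12, 88), (13, 99)]     # (Cust#, Stock#)
lives_in = [("Ayr", 12), ("Ayr", 23), ("Tully", 13)]  # (CityName, Cust#)


def max_partners(pairs, side):
    """Largest number of distinct partners that any one value on `side` has.

    `side` is 0 or 1 and selects which column of the pair is held fixed,
    i.e. "fix a single instance at one end and count the other end".
    """
    partners = {}
    for pair in pairs:
        partners.setdefault(pair[side], set()).add(pair[1 - side])
    return max(len(p) for p in partners.values())


# Cardinality of ORDERS: many-to-many is allowed, and the sample exercises it.
items_per_customer = max_partners(orders, 0)
customers_per_item = max_partners(orders, 1)

# Cardinality of LIVES IN: many customers per city, but one city per customer.
customers_per_city = max_partners(lives_in, 0)
cities_per_customer = max_partners(lives_in, 1)

# Participation: every customer we know of must live in a city.
known_customers = {c for c, _ in orders} | {c for _, c in lives_in}
housed_customers = {c for _, c in lives_in}
every_customer_housed = known_customers <= housed_customers

print(items_per_customer, customers_per_item)
print(customers_per_city, cities_per_customer, every_customer_housed)
```

A value of 1 for cities per customer confirms the single line (|) at the City end, while values above 1 elsewhere confirm that the crow's feet are really needed. The sample contains no customer without an order, so it does not exercise the optional (O) participation on ORDERS.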

Page 141: Database Design E R 2009

141

Step5. Attach remaining attributes to entities and relationships.

In the previous lectures we looked at a worked problem with a Customer ordering an Item. Here we were able to identify entities from the narrative. Next we also listed the attributes, which helped us identify the strong entities. We noticed that there were some attributes, Qty and Date, left over that could not be attached to any of the strong entities. They, in fact, belong to the relationship that is associated with the two entities.

[Figure: CUSTOMER (Customer#, CustName, Address) and ITEM (Stock#, Desc, Price, Colour) joined by ORDERS, with Date and Qty still to be attached.]

Page 142: Database Design E R 2009

142

Step5. Attach remaining attributes to entities and relationships.

The quantity attribute cannot be attached to the Customer, as the Customer will order different quantities of various items at any time. Nor can it be attached to the Item. It must therefore be attached to the relationship between them, being the order. This is also the situation for the Date that the order was placed.

Page 143: Database Design E R 2009

143

Step5. Attach remaining attributes to entities and relationships.

Conceptual Schema

[Figure: CUSTOMER (Customer#, CustName, Address) and ITEM (Stock#, Desc, Price, {Colour}) joined by ORDERS, which now carries Date and Qty.]

Page 144: Database Design E R 2009

144

Step6. Expand multi-valued attributes, domain sharing attributes and binary relationship attributes.

Once we have identified the Strong Entities, Relationships and attached all Attributes to either the Strong Entities or Relationships, we are required to expand the diagram as much as possible to permit us to complete the process. This requires us to move in 2 directions. We must first look at all of the binary relationships to see what the cardinality constraints are between them. If they are “many-to-many” they must be carefully considered and expanded where appropriate.

We then must look at what we call Multi-valued Attributes and Domain Sharing Attributes. The process is shown on the following diagram.

Page 145: Database Design E R 2009

145

Step6

[Figure: the Entity-Relationship Diagram feeds two expansions. Many-to-many relationships with attributes are expanded into associative entities; multi-valued attributes and domain sharing attributes are expanded into characteristic entities. Both kinds of result are dependent entities.]

Page 146: Database Design E R 2009

146

Step6

Conceptual Schema

[ER diagram: CUSTOMER (Customer#, CustName, Address) ORDERS ITEM (Stock#, Desc, Price, {Colour}), with Date and Qty on the ORDERS relationship.]

In the worked example we have a Many-to-Many relationship with 2 attributes, Date and Qty. When a Many-to-Many relationship has attached attributes we are required to create an Associative Entity that bridges the 2 Entities.

Page 147: Database Design E R 2009

147

Step6

[ER diagram: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price). ORDER is labelled as an Associative Entity.]

Between Customer and Item we create the Weak (Associative) Entity called Order. We have to redo the constraints. A customer can place many orders or none. An order can come from only one customer, and must be from a customer. An order is for many items and must be for at least one item, and an item can be on many orders but does not have to appear on an order. These have all been placed in the diagram shown below in their correct position.

Page 148: Database Design E R 2009

148

Step6

[ER diagram: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour). ORDER is labelled as an Associative Entity and COLOUR as a Characteristic Entity.]

We have also noticed that an item can come in many colours. This is a multi-valued attribute. We show this in our extended diagram as a relationship between the Item and a new entity, Colour, whose only attribute is the colour. In this case we are also saying that the colour of the item is optional (i.e. natural if requested) and that the only colours to be recorded are those that are used.

Page 149: Database Design E R 2009

149

Step6. Expand domain sharing attributes.

Managers supervise Workers. All employees are residents of a City. Employees who live in different cities from their managers get a special allowance.

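[ER diagram, before expansion: MANAGER SUPERVISES WORKER, with a City attribute on MANAGER, a City attribute on WORKER, and the attribute Allowance.]

[ER diagram, after expansion: MANAGER SUPERVISES WORKER, with the attribute Allowance; MANAGER and WORKER are each linked by an OF relationship to CITY (CityName). CITY is labelled as a Characteristic Entity.]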
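Once City is an entity in its own right, the allowance rule becomes a simple comparison across the two OF relationships. The following is only a sketch of that idea; the names and sample data (`lives_in`, `supervised_by`, the people and cities) are made up for illustration and are not part of the lecture.

```python
# Hypothetical sample data.
# OF: the city each employee (manager or worker) is a resident of.
lives_in = {"Mia": "Perth", "Tom": "Perth", "Zoe": "Sydney"}

# SUPERVISES, stored as worker -> manager.
supervised_by = {"Tom": "Mia", "Zoe": "Mia"}


def gets_allowance(worker):
    """A worker who lives in a different city from their manager
    qualifies for the special allowance."""
    manager = supervised_by[worker]
    return lives_in[worker] != lives_in[manager]


eligible = [w for w in supervised_by if gets_allowance(w)]
```

Because both managers and workers point at the same CITY entity, the comparison is between values from one shared domain, which is the point of expanding a domain sharing attribute.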

Page 150: Database Design E R 2009

150

Step7. Identify weak entities. Clarify the notion of instance.

Weak entities are often ambiguous and difficult to agree on.

Attributes may be part of a key for a weak entity, but at least one (one-must) relationship for identification is required. So when we convert this into a table it will require one of the PKs from the strong entities as part of its own composite PK.

Validation, not design. The purpose of identification is not to allocate a primary key, but to validate the concept. We have to be able to justify the concept of the relationship in the real world.

Never invent keys. I know that it is tempting but you must reflect the business as it is.

Page 151: Database Design E R 2009

151

Step7. Identify weak entities.

Conceptual Schema

[ER diagram: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour).]

An ORDER is uniquely identified by the CUSTOMER and the Date.

Page 152: Database Design E R 2009

152

Step7. Identify weak entities.

Conceptual Schema

[ER diagram: CUSTOMER (Customer#, CustName, Address) MAKES ORDER (Date, Qty) FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour).]

Here we still have the relationship between Order and Item that is many to many with attributes. We must expand this.

Page 153: Database Design E R 2009

153

Step8. Iterate until no further expansion is possible.

Conceptual Schema

[ER diagram: ORDER (Date) MADE BY CUSTOMER (Customer#, CustName, Address); ORDER HAS ORDERLINE (Qty); ORDERLINE FOR ITEM (Stock#, Desc, Price); ITEM HAS COLOUR (Colour).]

An ORDERLINE is identified by an ITEM on an ORDER.

An intersection entity is one that is identified only by its relationships.

We introduce the weak entity Orderline, each instance of which is for one item on one order. It is fully dependent on the attributes of Order and Item to be identified.
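To see how the identification rules play out, here is a minimal sketch of the expanded model as tables, using Python's built-in sqlite3 module. The table and column names (`customer_no`, `order_date`, `stock_no` and so on) and the sample rows are assumptions made for illustration; the lecture does not prescribe them. ORDER takes its key from CUSTOMER plus the Date, and ORDERLINE takes its key from ORDER plus ITEM.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customer (
    customer_no INTEGER PRIMARY KEY,
    cust_name   TEXT,
    address     TEXT
);
CREATE TABLE item (
    stock_no INTEGER PRIMARY KEY,
    descr    TEXT,
    price    REAL
);
-- Characteristic entity for the multi-valued attribute Colour.
CREATE TABLE colour (
    stock_no INTEGER NOT NULL REFERENCES item(stock_no),
    colour   TEXT NOT NULL,
    PRIMARY KEY (stock_no, colour)
);
-- Weak entity: identified by the customer and the date.
CREATE TABLE "order" (
    customer_no INTEGER NOT NULL REFERENCES customer(customer_no),
    order_date  TEXT NOT NULL,
    PRIMARY KEY (customer_no, order_date)
);
-- Intersection entity: identified only by its relationships.
CREATE TABLE orderline (
    customer_no INTEGER NOT NULL,
    order_date  TEXT NOT NULL,
    stock_no    INTEGER NOT NULL REFERENCES item(stock_no),
    qty         INTEGER NOT NULL,
    PRIMARY KEY (customer_no, order_date, stock_no),
    FOREIGN KEY (customer_no, order_date)
        REFERENCES "order"(customer_no, order_date)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Smith', '1 High St')")
conn.execute("INSERT INTO item VALUES (9870, '5cm nut', 0.25)")
conn.execute("""INSERT INTO "order" VALUES (1, '2009-03-01')""")
conn.execute("INSERT INTO orderline VALUES (1, '2009-03-01', 9870, 12)")

# An orderline for an order that does not exist should be refused.
try:
    conn.execute("INSERT INTO orderline VALUES (1, '2009-03-02', 9870, 5)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

No key has been invented here: every primary key is built from attributes and relationships that already appear in the model.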

Page 154: Database Design E R 2009

154

Step8. Iterate until no further expansion is possible.

Ultimately every attribute must be single valued and attached to an entity.

Different development paths are possible. Your model could be different than mine depending on your research and your interpretation of the business.

Retract intersection entities. Even though we have just shown you how to expand them, in practice you retract them or ignore them, because they are fully dependent on the attributes of the surrounding entities. The conversion from the ERM to the Schema will take care of everything.

Page 155: Database Design E R 2009

155

WARNING: Forms are not Entities

Remember that:

Forms contain attributes from many different entities.

Forms are part of an already existing Information System and are not necessarily part of the new system that is looking at the entities.

Forms are requirements documents, so can be analysed according to the Method.

Forms are often not identifiable and contain information about many weak entities.

The problem is that when people see a form they want to produce a table for it. This is not always appropriate. Many forms that you see in the workplace are reports, derived from different pieces of information. Producing them is part of the functionality of a good database management system.

Page 156: Database Design E R 2009

156

Derived Attributes

Attributes can sometimes be derived from other attributes by calculation.

Each product has a wholesale price and a retail price. There is always a 20% markup.

[ER diagram: PRODUCT (Barcode, Wholesale Price, Retail Price *), where * marks the derived attribute.]
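A derived attribute can be calculated on demand rather than stored. The sketch below assumes that the 20% markup is applied to the wholesale price, so that retail = wholesale × 1.20; the class name, field names and barcode value are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal

MARKUP = Decimal("0.20")  # the 20% markup stated above


@dataclass(frozen=True)
class Product:
    barcode: str
    wholesale_price: Decimal

    @property
    def retail_price(self) -> Decimal:
        # Derived: calculated from the wholesale price, never stored.
        return (self.wholesale_price * (1 + MARKUP)).quantize(Decimal("0.01"))


widget = Product(barcode="9300000000017", wholesale_price=Decimal("10.00"))
```

Here `widget.retail_price` evaluates to `Decimal("12.00")`. Because only the wholesale price is stored, the two prices cannot drift out of step.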

Page 157: Database Design E R 2009

157

Derived Relationships

• Relationships can sometimes be logically derived from other relationships. Consider this situation:

A student is enrolled in a unit and each unit belongs to a course.

[ER diagram: STUDENT STUDIES UNIT; UNIT OFFERED IN COURSE.]

Page 158: Database Design E R 2009

158

Derived Relationships

• Now in addition place this in the picture.

A student enrolled in a unit must be enrolled in the course offering the unit.

• Retain derived relationships that bear constraints. Although such a relationship repeats information that could be derived, it must be kept because it carries a constraint.

[ER diagram: STUDENT STUDIES UNIT; UNIT OFFERED IN COURSE; STUDENT ENROLLED IN * COURSE.]
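The constraint carried by ENROLLED IN can be checked mechanically. This is a sketch only: it assumes each student is enrolled in a single course, and the student, unit and course names are made up.

```python
# Hypothetical sample data.
# OFFERED IN: the course that offers each unit.
offered_in = {"DB101": "IT", "ACC100": "Business"}
# ENROLLED IN: the course each student is enrolled in.
enrolled_in = {"s1": "IT", "s2": "Business"}
# STUDIES: (student, unit) pairs.
studies = [("s1", "DB101"), ("s2", "ACC100"), ("s2", "DB101")]


def constraint_violations(studies, offered_in, enrolled_in):
    """Return the (student, unit) pairs where the student is not
    enrolled in the course offering the unit."""
    return [
        (student, unit)
        for student, unit in studies
        if enrolled_in.get(student) != offered_in[unit]
    ]


violations = constraint_violations(studies, offered_in, enrolled_in)
```

With this data the only violation is student s2 studying DB101, a unit offered in a course they are not enrolled in. Without ENROLLED IN there would be nothing to check STUDIES against.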

Page 159: Database Design E R 2009

159

TERNARY RELATIONSHIPS

In some situations the relationships that hold together entities are quite complex. In most cases they are binary and a simple bi-polar positioning will work. It is when we have to hold three or more entities together that things can get quite complicated.

Let us look at a situation that requires a Ternary relationship to be used.

An Employee may be assigned to many projects. An employee may have many skills, but an employee may use only one skill on a particular project. A project may require several skills and several employees.

Page 160: Database Design E R 2009

160

TERNARY RELATIONSHIPS: Example

[ER diagram: EMPLOYEE QUALIFIED IN SKILL; EMPLOYEE WORKS ON PROJECT; PROJECT REQUIRES SKILL.]

Three binary relationships cannot represent the fact that a particular employee uses a particular skill on a particular project.

Page 161: Database Design E R 2009

161

TERNARY RELATIONSHIPS: Cardinality Constraints

[ER diagram: EMPLOYEE, PROJECT and SKILL joined by one ternary relationship.]

Many employees may use a skill on a project.

An employee may use only one skill on a project.

An employee may use a skill on many projects.
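A ternary relationship can be pictured as a set of (employee, skill, project) triples. The sketch below enforces the one restrictive constraint, that an employee uses only one skill on a project; the function name and the sample names are hypothetical.

```python
# Each fact is one (employee, skill, project) triple. Sample data is made up.
uses = {
    ("Ann", "Welding", "Bridge"),
    ("Ann", "Welding", "Tower"),   # one employee, one skill, many projects
    ("Bob", "Welding", "Bridge"),  # many employees use a skill on a project
}


def add_assignment(uses, employee, skill, project):
    """Return a new set with the triple added, refusing a second
    skill for the same employee on the same project."""
    for e, s, p in uses:
        if e == employee and p == project and s != skill:
            raise ValueError(f"{employee} already uses {s} on {project}")
    return uses | {(employee, skill, project)}


uses = add_assignment(uses, "Bob", "Design", "Tower")  # allowed

try:
    add_assignment(uses, "Ann", "Design", "Bridge")    # second skill on Bridge
    rejected = False
except ValueError:
    rejected = True
```

Three separate sets of pairs (employee, skill), (employee, project) and (project, skill) could not tell us which skill Ann uses on which project; the triple can.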

Page 162: Database Design E R 2009

162

TERNARY RELATIONSHIPS: Rule for Existence

For a ternary to be valid all associated binaries must be many-to-many.

Page 163: Database Design E R 2009

163

The Agent Revisited

Do you remember this problem that we had previously?

Customers may order products stocked by various suppliers through the agent. The agent maintains a catalogue of what products are available from suppliers. The price of a product may depend on the supplier. Some products come in a variety of colours independently of supplier. Suppliers ship directly to customers and notify the agent only of the date and total. Customers then pay each supplier through the agent. The agent keeps records of all orders and payments, but is not interested in maintaining detailed invoice lines.

We modelled it as demonstrated on the next slide

Page 164: Database Design E R 2009

164

Example: The Agent, original simple solution

[ER diagram: CUSTOMER (Name, Address) ORDERS PRODUCT (Barcode, {Colour}), with Qty; CUSTOMER RECEIVED FROM SUPPLIER (Tradename), with Date and Total; CUSTOMER PAID SUPPLIER, with Paydate and Amount; PRODUCT AVAILABLE FROM SUPPLIER, with Price.]

Now we need to expand it.

Page 165: Database Design E R 2009

165

Example: The Agent

Expanded ER Diagram

[ER diagram: CUSTOMER (Name, Address) MAKES ORDER (Date) FOR PRODUCT (Barcode, {Colour}), with Qty; COLOUR.]

Let us first look at the relationship between the customer and the product. It is a many-to-many relationship with an attached attribute (Qty), so it must be expanded. We do this by creating the weak entity Order, which is identified by the date and the customer's name. We do not bother introducing the Orderline weak entity, as it is only identifiable by the attributes of the surrounding entities.

Page 166: Database Design E R 2009

166

Example: The Agent

[ER diagram: SUPPLIER (Tradename), SHIPMENT (Date, Total), PAYMENT (Paydate, Amount) and CUSTOMER (Name, Address), linked by the relationships RECEIVES, FROM, TO and PAID.]

Consider the relationship between the customer and the supplier. Here we have 2 many-to-many relationships that have to be expanded. They create the weak entities Payment and Shipment, with the attached attributes. They also bring new constraints that show us their identifying attributes.

Page 167: Database Design E R 2009

167

Example: The Agent

[ER diagram: SUPPLIER (Tradename) HAS HOLDING (Price) IN PRODUCT (Barcode, {Colour}); COLOUR.]

Finally we have to consider the relationship between the supplier and the product. Here we again have a many-to-many relationship that requires expanding, creating the weak entity Holding, identified by attributes from both the product and the supplier, with its own attribute Price. This is because different suppliers supply the goods at different prices.

Page 168: Database Design E R 2009

168

Example: The Agent

The final solution

In the end we have to combine all of these sections to create the final ERM diagram for this problem.

Page 169: Database Design E R 2009

169

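Example: The Agent

The final solution

[ER diagram: CUSTOMER (Name, Address) MAKES ORDER (Date) FOR PRODUCT (Barcode, {Colour}), with Qty; COLOUR; SUPPLIER (Tradename) HAS HOLDING (Price) IN PRODUCT; SHIPMENT (Date, Total) and PAYMENT (Paydate, Amount) link SUPPLIER and CUSTOMER through the relationships RECEIVES, FROM, TO and PAID.]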

Page 170: Database Design E R 2009

170

Page 171: Database Design E R 2009

171

What is Normalisation

A process that ensures that each attribute is attached to the correct entity

A process of grouping data elements into tables representing entities and their relationships

An integral part of a design method that produces flexible and reliable databases

Page 172: Database Design E R 2009

172

Why Normalise Data?

Minimises data redundancy

The only “redundancy” is the foreign key linking data

This isn’t redundancy as the link has to be defined in some fashion

Most stable form to change

Most robust structure

Most adaptable and flexible structure

Page 173: Database Design E R 2009

173

Normal Forms

Introduced by E.F. Codd, there were originally three normal forms (the abbreviation is NF). These are sufficient for nearly all DB’s.

In addition there are Boyce-Codd, 4th, 5th and domain-key normal forms. These are rarely required and will not be covered in this course.

Page 174: Database Design E R 2009

174

Primary Keys

An attribute (or group of attributes) that uniquely identifies a particular record in a relation

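EMPLOYEE(Employee#, Name, Salary, Department#)

ORDER_ITEM(Item#, Order#, Quantity)

STUDENT(Stud No, Name(subcode, stitle, result))

Primary key is underlined on the slide: Employee# in EMPLOYEE, Item# with Order# in ORDER_ITEM, and Stud No in STUDENT.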

Page 175: Database Design E R 2009

175

Foreign Keys

An attribute in one relation (table) that is the primary key in another relation

EMPLOYEE(Employee#, Name, Salary, Department#)

DEPARTMENT(Department#, Dname, Budget)

TOUR(Tourcode, Tname)

BOOKING(Booking#, Seats, Tourcode, Depdate)

Foreign keys: Department# in EMPLOYEE and Tourcode in BOOKING.
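A short sketch of what a foreign key buys us, using Python's built-in sqlite3 module. The column names are adapted from EMPLOYEE and DEPARTMENT above and the sample rows are made up. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")
db.executescript("""
CREATE TABLE department (
    department_no INTEGER PRIMARY KEY,
    dname         TEXT,
    budget        REAL
);
CREATE TABLE employee (
    employee_no   INTEGER PRIMARY KEY,
    name          TEXT,
    salary        REAL,
    -- Foreign key: must match a primary key value in department.
    department_no INTEGER REFERENCES department(department_no)
);
""")

db.execute("INSERT INTO department VALUES (10, 'Sales', 50000)")
db.execute("INSERT INTO employee VALUES (1, 'Lee', 42000, 10)")  # dept 10 exists

try:
    db.execute("INSERT INTO employee VALUES (2, 'Kim', 40000, 99)")  # no dept 99
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The second employee is refused because department 99 does not exist, which is referential integrity at work.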

Page 176: Database Design E R 2009

176

First Normal Form (1)

Consider the problem posed by the entity

STUDENT(Stud No, Surname(Subcode, Subname, Result))

The inner group (Subcode, Subname, Result) is a “Repeating Group”.

Entering data we might obtain a different number of subjects for each student.

How long should the record be?

Page 177: Database Design E R 2009

177

First Normal Form (2)

To convert the entity to 1st normal form (1NF) remove any repeating groups of data items from the unnormalised data

EACH RECORD MUST HAVE THE SAME LENGTH

STUDENT(Stud No, Surname(Subcode, Subname, Result))

The group (Subcode, Subname, Result) repeats a varying number of times, depending upon how many subjects the student is enrolled in

Page 178: Database Design E R 2009

178

Converting to 1NF

1. Remove the repeating group and make a new relation/entity

2. The ‘new’ relation now gets a concatenated primary key, which is made of the primary key of the original relation and the “primary” key of the repeating group

3. Give the new relation a descriptive name

Page 179: Database Design E R 2009

179

1. Remove the Repeating Group

To convert our relation

STUDENT(Stud No,Surname(Subcode,Subname,Result))

Remove the repeating group and state it as a separate relation

STUDENT(Stud No, Surname)

(Subcode, Subname, Result)

Page 180: Database Design E R 2009

180

2. A Concatenated Primary Key

STUDENT(Stud No, Surname)

(Subcode, Subname, Result)

Give the new (unnamed) relation a primary key consisting of the primary key of the original relation and the key of the repeating group

STUDENT(Stud No, Surname)

(Stud No, Subcode, Subname, Result)

Page 181: Database Design E R 2009

181

3. Name the New Relation

STUDENT(Stud No, Surname)

(Stud No, Subcode, Subname, Result)

Give the new relation a descriptive name

STUDENT(Stud No, Surname)

SUBJECT(Stud No,Subcode, Subname, Result)

Page 182: Database Design E R 2009

182

In First Normal Form (1)

So in first normal form (1NF) the original relation:

STUDENT(Stud No,Surname(Subcode,Subname,Result))

has become a pair of relations

STUDENT(Stud No, Surname)

SUBJECT(Stud No, Subcode, Subname, Result)
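The three steps can be sketched in a few lines of Python. The nested records and their values are made up for illustration; only the relation and attribute names come from the slides.

```python
# Unnormalised STUDENT records: each carries a repeating group of subjects.
unnormalised = [
    {"stud_no": 1, "surname": "Jones",
     "subjects": [("DB1", "Databases", "P"), ("PR1", "Programming", "C")]},
    {"stud_no": 2, "surname": "Nguyen",
     "subjects": [("DB1", "Databases", "D")]},
]


def to_1nf(records):
    """Split the repeating group out into its own relation."""
    student = []  # STUDENT(Stud No, Surname)
    subject = []  # SUBJECT(Stud No, Subcode, Subname, Result)
    for rec in records:
        student.append((rec["stud_no"], rec["surname"]))
        for subcode, subname, result in rec["subjects"]:
            # Concatenated key: key of the original relation
            # plus the key of the repeating group.
            subject.append((rec["stud_no"], subcode, subname, result))
    return student, subject


student, subject = to_1nf(unnormalised)
```

Every row in `student` now has two values and every row in `subject` has four, so each record has the same length however many subjects a student takes.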

Page 183: Database Design E R 2009

183

In First Normal Form (2)

Page 184: Database Design E R 2009

184

First Normal Forms - Examples

Change to first normal form

EMPLOYEE(Employee#, EmpName, Salary(Proj#, Projname))

ORDER(Order#, Orderdate(Part#, NumberOrdered))

Answers

EMPLOYEE(Employee#, EmpName, Salary)

PROJECT(Employee#, Proj#, Projname)

ORDER(Order#, Orderdate)

PART(Order#, Part#, NumberOrdered)

Page 185: Database Design E R 2009

185

Second Normal Form

Consider the following relation and the problems presented when creating, deleting, or updating a record

Page 186: Database Design E R 2009

186

Problems

Creation

A new item “9999 - washer” cannot be added to the DB until it has been ordered

Also, there could be a different description for the same item in a different row, eg “9870 - 5cm nut”

Deletion

If order 40 is the only order for nails, deleting it will lose the item# and desc from the DB

Update

If the item description for item 9870 is amended to “octagonal nut” then it must be changed in many places

Page 187: Database Design E R 2009

187

Second Normal Form

Must firstly be in 1NF

A non-key attribute cannot be identified by part of a composite key:

Order-item(Order#, Item#, Desc, Qty)

The quantity ordered is functionally dependent on the whole of the primary key, the Order# and the Item#.

The description of the item, however, doesn’t depend on the whole key. It only depends on Item#.

Page 188: Database Design E R 2009

188

Converting to 2NF

To convert a relation to second normal form

1. Write down all the possible “combinations” of the attributes forming the primary key.

2. Place each of the other attributes with the appropriate combination

3. Remove any relations that consist of a single attribute primary key alone

4. Give each remaining relation a descriptive name

Page 189: Database Design E R 2009

189

1. Possible Primary Keys

To change our relation Order-item(Order#,Item#,Desc,Qty) into 2NF

Write down all possible “combinations” of the attributes forming the primary key:

(Order#

(Item#

(Order#, Item#

Page 190: Database Design E R 2009

190

2. Matching the Other Attributes

Order-item(Order#, Item#, Desc, Qty)

Match each of the other attributes with the primary key that it depends upon

(Order#)

(Item#, Desc)

(Order#, Item#, Qty)

Page 191: Database Design E R 2009

191

3. Remove Trivial Relations

Remove any relations that consist of a single attribute primary key alone. Of

(Order#)

(Item#, Desc)

(Order#, Item#, Qty)

only (Order#) is removed.

4. Name Relations

Give the remaining relations meaningful names:

ITEM(Item#, Desc)

ORDER-ITEM(Order#, Item#, Qty)

Page 192: Database Design E R 2009

192

In Second Normal Form(1)

Order-item(Order#, Item#, Desc, Qty)

in second normal form has become

ITEM(Item#, Desc)

ORDER-ITEM(Order#, Item#, Qty)
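The same decomposition can be sketched in Python. The sample rows are made up (including item number 1111 for nails); only the relation and attribute names come from the slides.

```python
# Order-item(Order#, Item#, Desc, Qty) in 1NF.
order_item_1nf = [
    (40, 1111, "nails", 500),
    (41, 9870, "5cm nut", 100),
    (42, 9870, "5cm nut", 50),
]


def to_2nf(rows):
    """Move Desc, which depends on Item# alone, into its own relation."""
    item = {}        # ITEM(Item#, Desc)
    order_item = []  # ORDER-ITEM(Order#, Item#, Qty)
    for order_no, item_no, desc, qty in rows:
        # A second, different description for the same item would
        # contradict the dependency Item# -> Desc.
        if item.setdefault(item_no, desc) != desc:
            raise ValueError(f"conflicting descriptions for item {item_no}")
        order_item.append((order_no, item_no, qty))
    return item, order_item


item, order_item = to_2nf(order_item_1nf)
```

The description “5cm nut” is now held once in `item` instead of once per order line.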

Page 193: Database Design E R 2009

193

In Second Normal Form(2)

This solves the problems highlighted earlier

ADD new item at any time

DELETE last order for item, but item remains in DB

To UPDATE, desc only needs to be altered in one place

Page 194: Database Design E R 2009

194

Second Normal Form - Summary

Must first be in 1NF

Each attribute in a relation must be functionally dependent on the whole of the primary key

ie Every attribute needs the full primary key and not just parts of it

Page 195: Database Design E R 2009

195

Second Normal Form - Examples(1)

Convert to second normal form

Q1. TRAINING(Emp#, EmpName, Dept#, Course, DateCompleted)

Q2. ORDER-ITEM(Order#, Item#, Date-ord, Qty, Unit-price)

A1. EMPLOYEE(Emp#, EmpName, Dept#)

TRAINING(Emp#, Course, DateCompleted)

A2. ORDER(Order#, Date-ord)

ITEM(Item#, Unit-price)

ORDER-ITEM(Order#, Item#, Qty)

Page 196: Database Design E R 2009

196

Second Normal Form - Examples(2)

Convert to 2NF

EMPLOYEE(Emp#, Dept#, Ename, Salary)

Answer

This is already in 2NF. Any relation in 1NF that has a single attribute as the primary key must be in 2NF, as no attribute can be dependent on only a portion of the key.

Page 197: Database Design E R 2009

197

Third Normal Form

Consider the following relation ( which is in 2NF) and the problems when CREATING, DELETING or UPDATING a record:

Page 198: Database Design E R 2009

198

Problems

CREATION

Cannot add a new course until a student is enrolled

There is nothing in the design that stops a course having various names in different records, eg “A112 - DOT(C)”

DELETION

Data for a course is lost when the last student enrolled in the course is deleted

UPDATE

If the course name changes, it must be altered in many places

Page 199: Database Design E R 2009

199

Third Normal Form

Must be in 2NF

A non-key attribute cannot be dependent on another non-key attribute. (This is known as transitive dependency)

STUDENT(Student#, Sname, CourseCode, CourseName)

Sname & CourseCode are both dependent on Student#.

CourseName, however, is dependent on CourseCode. But CourseCode is a non-key attribute.

Page 200: Database Design E R 2009

200

Converting To 3NF

1. Remove all attributes that are dependent on non-key attribute(s) into a new relation

2. Make the non-key attribute(s) that the removed attribute(s) are dependent on the primary key of the new relation.

3. Give the new relation a descriptive name.

Page 201: Database Design E R 2009

201

1. Remove the Attributes

To convert our relation

STUDENT(Student#, Sname, CourseCode, CourseName)

into 3NF:

Remove all the attributes that are dependent upon a non-key attribute.

STUDENT(Student#, Sname, CourseCode)

(CourseName)

Page 202: Database Design E R 2009

202

2. Add the Primary Key

Make the non-key attribute that the removed attributes were dependent on the primary key in the new relation.

STUDENT(Student#, Sname, CourseCode)

(CourseCode, CourseName)

3. Name the New Relation

Give the new relation a meaningful name

STUDENT(Student#, Sname, CourseCode)

COURSE(CourseCode, CourseName)
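A quick way to convince yourself that nothing is lost by this split is to project the original rows onto the two new relations and join them back. The sketch below does that with Python sets; the sample rows are made up.

```python
# STUDENT(Student#, Sname, CourseCode, CourseName) in 2NF.
student_2nf = [
    (1, "Jones", "A112", "DOT(C)"),
    (2, "Nguyen", "A112", "DOT(C)"),
    (3, "Patel", "B200", "Accounting"),
]

# Project onto the two 3NF relations. Using sets drops the
# repeated (CourseCode, CourseName) rows.
student = {(sno, sname, code) for sno, sname, code, _ in student_2nf}
course = {(code, cname) for _, _, code, cname in student_2nf}

# Joining on CourseCode rebuilds exactly the original rows.
rejoined = {
    (sno, sname, code, cname)
    for sno, sname, code in student
    for ccode, cname in course
    if code == ccode
}
```

The course name for A112 is stored once in `course`, and `rejoined` equals the original set of rows, so the decomposition is a lossless join.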

Page 203: Database Design E R 2009

203

Third Normal Form

This solves the problems highlighted earlier

ADD new course at any time

DELETE last student in course and the course will still remain

To UPDATE, course name only needs to be altered in one place

Page 204: Database Design E R 2009

204

Third Normal Form - Summary

Must be in 2NF (and hence in 1NF)

Each attribute in a relation must be dependent on the primary key only and not any other non-key attributes

Page 205: Database Design E R 2009

205

Third Normal Forms - Examples

Convert to 3NF

Q1. EMPLOYEE(Emp#, EmpName, Dept#, DeptName)

Q2. CUSTOMER(Customer#, Cname, Caddress, SalesRep, Sname)

Answers

A1. EMPLOYEE(Emp#, EmpName, Dept#)

DEPARTMENT(Dept#, DeptName)

A2. CUSTOMER(Customer#, Cname, Caddress, SalesRep)

SALESREP(SalesRep, Sname)

Page 206: Database Design E R 2009

206

Incorrect Decompositions

Groupings of attributes that do not follow the rules of normalisation we have looked at result in:

less flexible databases

databases that lose data, or require major alterations in their data, when creation, deletion or updating of records takes place

Page 207: Database Design E R 2009

207

The Normal Forms - Summary

First Normal Form (1NF)

The relation contains no repeating groups

Second Normal Form (2NF)

The relation is in 1NF and each non-key attribute in the relation is functionally dependent on the whole key

Third Normal Form (3NF)

The relation is in 2NF and each non-key attribute in the relation is functionally dependent on nothing but the key.

Page 208: Database Design E R 2009

208

A Simple Test for 3NF

Each attribute should depend on the key, the whole key and nothing but the key (so help me Codd).

(The original ideas behind relational DB’s were proposed by Dr E.F. Codd.)
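The test can be tried out against sample rows. The helper below, a hypothetical function called `determines`, checks whether one group of attributes determines another in the data it is given. Keep in mind that sample data can only refute a dependency; whether it truly holds is a question about the business. The rows are made up.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal values of `lhs` always go
    with equal values of `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# Made-up rows for Order-item(Order#, Item#, Desc, Qty).
rows = [
    {"order": 40, "item": 1111, "desc": "nails", "qty": 500},
    {"order": 41, "item": 9870, "desc": "5cm nut", "qty": 100},
    {"order": 42, "item": 9870, "desc": "5cm nut", "qty": 50},
    {"order": 42, "item": 1111, "desc": "nails", "qty": 20},
]

whole_key_gives_qty = determines(rows, ["order", "item"], ["qty"])
part_key_gives_desc = determines(rows, ["item"], ["desc"])
part_key_gives_qty = determines(rows, ["item"], ["qty"])
```

Here Qty needs the whole key, while Desc is already fixed by Item# alone, which is exactly the partial dependency that second normal form removes.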

Page 209: Database Design E R 2009

209

Data Dictionaries

A data dictionary is a structured analysis tool that records every data name and defines, precisely, what is meant by that name.

Sometimes it is referred to as metadata (ie data about data)

All the objects (data flows, data stores, processes, data elements etc) identified during analysis should be defined in the dictionary

It may also, optionally, include physical information about the method of data storage, etc

Page 210: Database Design E R 2009

210

Aliases

Sometimes a data element in a system may be referred to (known) by more than one name

the accounts department calls it customer_payment

the sales department knows it as customer_owing

To avoid problems occurring because of multiple names for one item, a data dictionary should list any aliases (other names) by which the data is known

Page 211: Database Design E R 2009

211

Sample Entry 1

Data Name: Student_ID

Description: Unique identifier of students

Data Type: Text(7)

Values: Text field of 7 digits with the first two digits signifying the current year.

Aliases: Student_Number, Student#

Where used: Administration, Student_Records

Page 212: Database Design E R 2009

212

Scope of Course

The course will not be focussing on aliases or “Where used”

The following slides show examples of listings expected in this course

Page 213: Database Design E R 2009

213

Sample Entry 2

Data Name: Skill_Level

Description: A code representing the level of skill of an employee

Data Type: Text(2)

Values: Represents the number of years experience of employee

1 = 1 year experience

2 = 2 years experience, etc

Page 214: Database Design E R 2009

214

Sample Entry 3

Data Name: Budget_Amount

Description: Amount set aside for each budget item

Data Type: Currency

Values: All amounts multiples of $100
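The “Values” line of a data dictionary entry is precise enough to be turned into a check. The sketch below does this for Student_ID and Budget_Amount. The class and function names are hypothetical, and two assumptions are made: that “the first two digits signifying the current year” means the last two digits of the year, and that the current year is 2009.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class DictionaryEntry:
    data_name: str
    description: str
    data_type: str
    is_valid: Callable[[object], bool]  # encodes the "Values" rule


def student_id_ok(value, current_year=2009):
    # Text(7): seven digits, the first two signifying the current year.
    # Assumption: the year is written as its last two digits.
    return (isinstance(value, str)
            and re.fullmatch(r"\d{7}", value) is not None
            and value[:2] == f"{current_year % 100:02d}")


def budget_amount_ok(value):
    # All amounts are multiples of $100 (whole dollars assumed).
    return isinstance(value, int) and value % 100 == 0


entries = [
    DictionaryEntry("Student_ID", "Unique identifier of students",
                    "Text(7)", student_id_ok),
    DictionaryEntry("Budget_Amount", "Amount set aside for each budget item",
                    "Currency", budget_amount_ok),
]
```

For example, `entries[0].is_valid("0912345")` accepts a seven-digit ID starting with 09, while `entries[1].is_valid(2550)` rejects an amount that is not a multiple of $100.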