Data Modeling

Page 1: Data Modeling

Data Modeling and Database Design Volume One

Student Guide

ORACLE Enabling the Information Age ™

Page 2: Data Modeling

June 1992

M00475

ORACLE

Data Modeling and Database Design Student Guide • Volume One

Page 3: Data Modeling

Data Modelling and Database Design

Contributors: Ann Horton

Howard Benbrook

Dean Dameron

Art Hetherington

Jeff Jacobs

Steve Strickland

Kathy Andronica

Pete Cassidy

Claudia Herzog

Bill Hopkins

Cliff Longman

Tom Traver

Publishing: Scott Knudtson, Rich Marinaccio

Copyright © Oracle Corporation, 1992

All rights reserved. Printed in the U.S.A.

This software/documentation contains proprietary information of Oracle Corporation; it is provided under a license

agreement containing restrictions on use and disclosure and is also protected by copyright law. Reverse engineering of the

software is prohibited. If this software/documentation is delivered to a U.S. Government Agency of the Department of

Defense, then it is delivered with Restricted Rights and the following legend is applicable:

Restricted Rights Legend

Use, duplication or disclosure by the Government is subject to restrictions for commercial computer software and shall be deemed to be Restricted Rights software under Federal law and as set forth in subparagraph (c) (1) (ii) of DFARS 252.227-7013, Rights in Technical Data and Computer Software (October 1988).

Use, duplication, or disclosure is subject to restrictions stated in your contract with Oracle Corporation.

If this software/documentation is delivered to a U.S. Government Agency not within the Department of Defense, then it is delivered with "Restricted Rights," as defined in FAR 52.227-14, Rights in Data-General, including Alternate III (June 1987).

The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing to Oracle Corporation, 500 Oracle Parkway, Redwood Shores, CA 94065-9815. Oracle Corporation does not warrant that this document is error free.

ORACLE, SQL*Plus, SQL*Connect, SQL*Net, SQL*DBA, SQL*Report, SQL*ReportWriter, SQL*Forms, SQL*Menu, SQL*Loader, Easy*SQL, Pro*C, Pro*COBOL, Pro*Ada, Pro*Fortran, Pro*PL/I, Pro*Pascal, SQL*Calc, SQL*QMX, Oracle Financials, and CASE*Dictionary are registered trademarks. Oracle General Ledger, Oracle Assets, Oracle Payables, Oracle Purchasing, Oracle*Mail, SQL*TextRetrieval, PL/SQL, Oracle Graphics, Hyper*SQL, Oracle Card, CASE*Designer, and CASE*Generator are trademarks of Oracle Corporation.

Lotus and 1-2-3 are trademarks of Lotus Development Corporation. Macintosh and HyperCard are registered trademarks and HyperTalk is a trademark of Apple Computer, Inc. dBase is a trademark of Ashton-Tate Corporation. IBM, MVS, DB2, SQL/DS, and IBM PC are trademarks of International Business Machines Corporation. Microsoft and MS-DOS are registered trademarks and Windows is a trademark of Microsoft Corporation. Paintbrush is a trademark of Zsoft Corporation.

Page 4: Data Modeling

CONTENTS

CONTENTS .............................................................................................................................. 4

1 INTRODUCTION.....................................................................................9

COURSE OBJECTIVES ....................................................................................................... 10

ORACLE OVERVIEW......................................................................................................... 11

ORACLE'S CASE APPROACH .......................................................................................... 13

CASE*METHOD DEVELOPMENT CYCLE.................................................................... 14

2 OVERVIEW OF DATABASE DEVELOPMENT..................................15

SECTION OBJECTIVES...................................................................................................... 16

DATABASE DEVELOPMENT PROCESS ........................................................................ 17

BUSINESS INFORMATION REQUIREMENTS .............................................................. 18

CONCEPTUAL DATA MODELLING OVERVIEW ....................................................... 19

DATABASE DESIGN OVERVIEW.................................................................................... 20

DATABASE BUILD OVERVIEW....................................................................................... 21

DATABASE AND APPLICATION DEVELOPMENT..................................................... 22

3 BASIC CONCEPTUAL DATA MODELLING......................................23

SECTION OBJECTIVES...................................................................................................... 24

CONCEPTUAL DATA MODELLING ............................................................................... 25

ENTITIES ............................................................................................................................... 29

IDENTIFY AND MODEL ENTITIES................................................................................. 33

EXERCISE 3-1 ....................................................................................................................... 36

RELATIONSHIPS ................................................................................................................. 37

EXERCISE 3-2 ....................................................................................................................... 41

EXERCISE 3-3 ....................................................................................................................... 42

Page 5: Data Modeling

EXERCISE 3-4 ....................................................................................................................... 43

RELATIONSHIP TYPES ..................................................................................................... 44

USING A RELATIONSHIP MATRIX................................................................................ 48

ANALYZE AND MODEL RELATIONSHIPS................................................................... 50

DETERMINE A RELATIONSHIP'S EXISTENCE.......................................................... 51

NAME THE RELATIONSHIP............................................................................................. 53

DETERMINE RELATIONSHIP'S OPTIONALITY......................................................... 55

DETERMINE RELATIONSHIP'S DEGREE .................................................................... 56

VALIDATE THE RELATIONSHIP.................................................................................... 57

EXERCISE 3-5 ....................................................................................................................... 58

EXERCISE 3-6 ....................................................................................................................... 60

LAY OUT THE E-R DIAGRAM ......................................................................................... 62

ATTRIBUTES ........................................................................................................................ 64

DISTINGUISH ATTRIBUTES AND ENTITIES ............................................................... 69

ATTRIBUTE OPTIONALITY............................................................................................. 71

IDENTIFY ATTRIBUTES.................................................................................................... 73

EXERCISE 3-7 ....................................................................................................................... 75

ASSIGN UNIQUE IDENTIFIERS....................................................................................... 77

EXERCISE 3-8 ....................................................................................................................... 83

EXERCISE 3-9 ....................................................................................................................... 84

EXERCISE 3-10 ..................................................................................................................... 86

REVIEW: BASIC CONCEPTUAL DATA MODELLING............................................... 88

4 ADVANCED CONCEPTUAL DATA MODELLING............................92

SECTION OBJECTIVES...................................................................................................... 93

NORMALIZE THE DATA MODEL................................................................................... 94

EXERCISE 4-1 ....................................................................................................................... 98

Page 6: Data Modeling

RESOLVE M:M RELATIONSHIPS ................................................................................... 99

EXERCISE 4-2 ..................................................................................................................... 107

EXERCISE 4-3 ..................................................................................................................... 108

MODEL HIERARCHICAL DATA ................................................................................... 109

MODEL RECURSIVE RELATIONSHIPS ...................................................................... 112

EXERCISE 4-4 ..................................................................................................................... 117

MODEL ROLES WITH RELATIONSHIPS .................................................................... 118

MODEL SUBTYPES ........................................................................................................... 120

MODEL EXCLUSIVE RELATIONSHIPS ...................................................................... 124

EXERCISE 4-5 ..................................................................................................................... 126

MODEL DATA OVER TIME............................................................................................ 128

EXERCISE 4-6 ..................................................................................................................... 132

MODEL COMPLEX RELATIONSHIPS ......................................................................... 133

EXERCISE 4-7 ..................................................................................................................... 135

EXERCISE 4-8 ..................................................................................................................... 136

EXERCISE 4-9 ..................................................................................................................... 138

5 RELATIONAL DATABASE CONCEPTS...........................................139

SECTION OBJECTIVES.................................................................................................... 140

RELATIONAL DATABASE OVERVIEW ...................................................................... 141

PRIMARY KEYS................................................................................................................. 143

FOREIGN KEYS ................................................................................................................. 147

DATA INTEGRITY............................................................................................................. 149

6 INITIAL DATABASE DESIGN ...........................................................151

SECTION OBJECTIVES.................................................................................................... 152

DATABASE DESIGN.......................................................................................................... 153

INITIAL DATABASE DESIGN OVERVIEW ................................................................. 155

Page 7: Data Modeling

MAP SIMPLE ENTITIES................................................................................................... 158

MAP ATTRIBUTES TO COLUMNS................................................................................ 159

MAP UID'S TO PRIMARY KEYS .................................................................................... 161

MAP RELATIONSHIPS TO FOREIGN KEYS .............................................................. 163

REVIEW: MAPPING SIMPLE E-R MODELS TO TABLES........................................ 169

EXERCISE 6-1 ..................................................................................................................... 170

EXERCISE 6-2 ..................................................................................................................... 172

EXERCISE 6-3 ..................................................................................................................... 174

EXERCISE 6-4 ..................................................................................................................... 176

MAP COMPLEX E-R MODELS TO TABLES ............................................................... 179

EXERCISE 6-5 ..................................................................................................................... 183

CHOOSE SUBTYPE OPTIONS ........................................................................................ 187

EXERCISE 6-6 ..................................................................................................................... 196

REVIEW: INITIAL DATABASE DESIGN ...................................................................... 200

7 TABLE NORMALIZATION................................................................201

SECTION OBJECTIVES.................................................................................................... 202

NORMALIZE TABLES ...................................................................................................... 203

RECOGNIZE UNNORMALIZED DATA ........................................................................ 204

CONVERT TO FIRST NORMAL FORM........................................................................ 205

CONVERT TO SECOND NORMAL FORM................................................................... 206

CONVERT TO THIRD NORMAL FORM....................................................................... 208

EXERCISE 7-1 ..................................................................................................................... 210

NORMALIZE DURING DATA MODELLING............................................................... 214

8 FURTHER DATABASE DESIGN........................................................217

SECTION OBJECTIVES.................................................................................................... 218

FURTHER DATABASE DESIGN ..................................................................................... 219

Page 8: Data Modeling

SPECIFY REFERENTIAL INTEGRITY......................................................................... 220

DESIGN INDEXES.............................................................................................................. 222

ESTABLISH VIEWS ........................................................................................................... 227

DENORMALIZE THE DATABASE DESIGN................................................................. 230

PLAN PHYSICAL STORAGE USAGE............................................................................ 237

SUMMARY: DATABASE DESIGN .................................................................................. 238

SUMMARY: DATABASE DEVELOPMENT.................................................................. 239

DATABASE BUILD OVERVIEW..................................................................................... 240

Page 9: Data Modeling

1

INTRODUCTION

Page 10: Data Modeling

COURSE OBJECTIVES

At the end of this course, you will be able to:

1 Analyze user information requirements and develop an entity-relationship model to express those requirements.

2 Develop a relational database design from an entity-relationship model.

Page 11: Data Modeling

ORACLE OVERVIEW

Page 12: Data Modeling

ORACLE Overview - cont'd

* Data Modelling and Database Design are techniques for analyzing

information requirements and designing relational databases.

Page 13: Data Modeling

ORACLE'S CASE APPROACH

Oracle's CASE (Computer-Aided Systems Engineering) approach provides a full suite of CASE methods, techniques, and tools.

[Figure: Business Requirements to Operational System]

Page 14: Data Modeling

CASE*METHOD DEVELOPMENT CYCLE

Data modeling and database design support the first three stages of the CASE*Method Development Cycle.

Page 15: Data Modeling

2

OVERVIEW OF

DATABASE DEVELOPMENT

Page 16: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

1 Understand the phases of the Database Development Process.

2 Explain what Conceptual Data Modelling and Database Design involve.

3 Understand the parallel phases of the Application Development Process.

Page 17: Data Modeling

DATABASE DEVELOPMENT PROCESS

Database development is a top-down, systematic approach that transforms

business information requirements into an operational database.

The Database Development Process is a vertical slice of the CASE*Method

Development Cycle.

Page 18: Data Modeling

BUSINESS INFORMATION REQUIREMENTS

Top-down database development begins with the information requirements

of the business.

Example

Here is a set of information requirements:

"I manage the Human Resources Department for a large company. We need to keep information

about each of our company's employees. We need to track each employee's first name, last

name, job or position, hire date, and salary. For any employees on commission, we also need to

track their potential commission. Each employee is assigned a unique employee number.

Our company is divided into departments. Each employee is assigned to a department, for example, accounting, sales, or development. We need to know the department responsible for each employee and the department's location. Each department has a unique number; for example, accounting is 10 and sales is 30.

Some of the employees are managers. We need to know each employee's manager, and the

employees each manager manages."

Quick Notes

• The scope of a set of information requirements may vary from the needs of a department to

the needs of a total company.

• Information requirements are tightly coupled with business function requirements. For

example, the Human Resources Department's business function requirements include Manage

employee information.

Page 19: Data Modeling

CONCEPTUAL DATA MODELLING OVERVIEW

In Conceptual Data Modelling, define and model the things of significance

about which the business needs to know or hold information, and the

relationships between them.

Example

The following entity-relationship model represents the information requirements of the Human

Resources Department.

An Entity-Relationship Data Model should accurately model the

organization's information needs and support the functions of the business.

Page 20: Data Modeling

DATABASE DESIGN OVERVIEW

In Database Design, map the information requirements reflected in an

Entity-Relationship Model into a relational database design.

Example

A design for the Human Resources database is shown in the following table instance charts.

Table Name: EMPLOYEE

Column Name    EMPNO  FNAME  LNAME  JOB       HIREDATE   SAL   COMM   MGR   DEPTNO
Key Type       PK                                                     FK1   FK2
Nulls/Unique   NN, U  NN     NN               NN                            NN
Sample Data    7369   MARY   SMITH  CLERK     17-DEC-80  80           7902  20
               7902   HENR   FORD   ANALYST   03-DEC-81  30           7566  50
               7521   SUE    WARD   SALESMAN  22-FEB-81  1251  6000   7698  30
               7698   BOB    BLAKE  MANAGER   01-MAY-81  2850  10000  7839  30
               7839   BOB    KING   PRESIDEN  17-NOV-81  50    5000         10

Table Name: DEPARTMENT

Column Name    DEPTNO  DNAME        LOC
Key Type       PK
Nulls/Unique   NN, U   NN           NN
Sample Data    10      ACCOUNTING   NEW YORK
               20      RESEARCH     DALLAS
               30      SALES        CHICAGO
               40      OPERATIONS   BOSTON
               50      DEVELOPMENT  ATLANTA

The Table Instance Chart for each relational table identifies the table's columns, primary key, and any foreign keys, and shows sample data.

Page 21: Data Modeling

DATABASE BUILD OVERVIEW

In Database Build, create physical relational database tables to implement

the database design.

Example

The following Structured Query Language (SQL) statements will create the DEPARTMENT and

EMPLOYEE tables.

CREATE TABLE DEPARTMENT
  (DEPTNO   NUMBER(2)   NOT NULL PRIMARY KEY,
   DNAME    CHAR(20)    NOT NULL,
   LOC      CHAR(15)    NOT NULL);

CREATE TABLE EMPLOYEE
  (EMPNO    NUMBER(5)   NOT NULL PRIMARY KEY,
   FNAME    CHAR(15)    NOT NULL,
   LNAME    CHAR(15)    NOT NULL,
   JOB      CHAR(9),
   HIREDATE DATE        NOT NULL,
   SAL      NUMBER(7,2),
   COMM     NUMBER(7,2),
   -- MGR uses the same data type as EMPNO, the column it references
   MGR      NUMBER(5)   REFERENCES EMPLOYEE (EMPNO),
   -- every employee must belong to an existing department
   DEPTNO   NUMBER(2)   NOT NULL REFERENCES DEPARTMENT (DEPTNO));

The Structured Query Language (SQL) is used to create and manipulate

relational databases.
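
The constraints declared in the CREATE TABLE statements can be exercised with a short script. The following is a minimal sketch and not part of the course material: it assumes Python's built-in sqlite3 module as a stand-in for an ORACLE database, so the column types are simplified. The rows are illustrative only; the employee ANN LEE and department number 99 are made-up values.

```python
import sqlite3

# Assumption: SQLite stands in for the ORACLE server, so the column types
# are simplified (INTEGER/TEXT/REAL instead of NUMBER/CHAR/DATE).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("""
    CREATE TABLE DEPARTMENT (
        DEPTNO INTEGER NOT NULL PRIMARY KEY,
        DNAME  TEXT    NOT NULL,
        LOC    TEXT    NOT NULL
    )""")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EMPNO    INTEGER NOT NULL PRIMARY KEY,
        FNAME    TEXT    NOT NULL,
        LNAME    TEXT    NOT NULL,
        JOB      TEXT,
        HIREDATE TEXT    NOT NULL,
        SAL      REAL,
        COMM     REAL,
        MGR      INTEGER REFERENCES EMPLOYEE (EMPNO),
        DEPTNO   INTEGER NOT NULL REFERENCES DEPARTMENT (DEPTNO)
    )""")

# Illustrative rows: one department and one employee without a manager.
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'ACCOUNTING', 'NEW YORK')")
conn.execute(
    "INSERT INTO EMPLOYEE VALUES "
    "(7839, 'BOB', 'KING', 'PRESIDENT', '17-NOV-81', 5000, NULL, NULL, 10)"
)


def try_insert(row):
    """Return True if the EMPLOYEE row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical employee assigned to department 99, which does not exist:
# the foreign key on DEPTNO rejects the row.
orphan_accepted = try_insert((7900, "ANN", "LEE", "CLERK", "01-JAN-82", 950, None, 7839, 99))

# The same employee assigned to the existing department 10 is accepted.
valid_accepted = try_insert((7900, "ANN", "LEE", "CLERK", "01-JAN-82", 950, None, 7839, 10))
```

The foreign key on DEPTNO is what keeps an employee from being assigned to a department that is not in the DEPARTMENT table; the NOT NULL on the same column makes the assignment mandatory.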

Page 22: Data Modeling

DATABASE AND APPLICATION DEVELOPMENT

The Database Development Process is tightly coupled with the Application

Development Process.

Page 23: Data Modeling

3

BASIC

CONCEPTUAL DATA MODELLING

Page 24: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

1. Identify and model entities.

2. Analyze and model the relationships between entities.

3. Analyze and model attributes.

4. Identify unique identifiers for each entity.

5. Develop a basic entity-relationship model from a statement of information requirements and

user interviews.

Page 25: Data Modeling

CONCEPTUAL DATA MODELLING

Conceptual Data Modelling is the first step of the top-down Database

Development Process, and is performed during the Strategy and Analysis

stages of the System Development Cycle.

Page 26: Data Modeling

Conceptual Data Modelling - cont'd

The goal of Conceptual Data Modeling is to develop an entity-relationship

model that represents the information requirements of the business.

Example

The following entity-relationship model represents the information requirements of the Human

Resources Department.

Entity-Relationship Model Components

• Entities - the things of significance about which information needs to be held.

• Relationships - how the things of significance are related.

• Attributes - the specific information that needs to be held.

Page 27: Data Modeling

Conceptual Data Modelling - cont'd

An entity-relationship model is an effective means for collecting and

documenting an organization's information requirements.

Robust Syntax

• An E-R Model documents an organization's information requirements in a clear, precise

format.

User Communication

• Users can easily understand the pictorial form of an E-R Model.

Ease of Development

• An E-R Model can be easily developed and refined.

Definition of Scope

• An E-R Model provides a clear picture of the scope of an organization's information

requirements.

Integration of Multiple Applications

• An E-R Model provides an effective framework for integrating multiple applications,

development projects, and/or purchased application packages.

Quick Notes

• Be sure to fully establish an organization's information requirements during the conceptual data modelling stage. Changes to requirements during later stages of the development life cycle can be extremely expensive.

• Use views or subsets of an E-R Model as a communication aid.

Page 28: Data Modeling

Conceptual Data Modelling - cont'd

Conceptual Data Modelling is independent of the hardware or software to

be used for implementation. An E-R Model can be mapped to a

hierarchical, network, or relational database.

Page 29: Data Modeling

ENTITIES

An entity is a thing of significance about which information needs to be

known or held.

Alternate Entity Definitions

• An object of interest to the business.

• An entity is a class or category of thing.

• An entity is a named thing.

Examples

The following might be things of significance about which a business needs to hold information:

EMPLOYEE

DEPARTMENT

PROJECT

Attributes describe entities and are the specific pieces of information that need to be known.

Examples

Possible attributes for the entity EMPLOYEE are:

badge number, name, date of birth, and salary

Possible attributes for the entity DEPARTMENT are:

name, number, and location

Quick Note

• An entity must have attributes that need to be known from the business's viewpoint or it is not

an entity within the scope of the business's requirements.

Page 30: Data Modeling

Entities - cont'd

Entity Diagramming Conventions

• Soft box with any dimensions

• Singular, unique entity name

• Entity name in upper case

• Optional synonym name (in parentheses)

• Attribute names in all lower case

Examples

Quick Notes

• A synonym is an alternate name for an entity.

• Synonyms are useful when two groups of users have different names for the same thing of

significance.

Page 31: Data Modeling

Entities - cont'd

Each entity must have multiple occurrences or instances.

Examples

The entity EMPLOYEE has one occurrence for each employee in the business:

Jim Brown, Mary Jones, Juan Gomez, and Jill Judge are all occurrences of the entity

EMPLOYEE.

The entity DEPARTMENT has one occurrence for each department in the company:

The Finance Department, the Sales Department, and the Development Department are all

instances of the entity DEPARTMENT.

Each entity instance has specific values for the entity's attributes.

Example

The entity EMPLOYEE has attributes of name, badge number, date of birth, and salary.

The instance Jim Brown has the following values: name Jim Brown, badge number 1322, date

of birth 15-MAR-50, and salary $55,000.

Quick Notes

• Instances are sometimes mistaken for entities.

• An entity is a class or category of thing - e.g. EMPLOYEE.

• An instance is a specific thing - e.g. the employee Jim Brown.
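
The distinction between an entity and its instances can be sketched in code. This is an illustrative analogy only, assuming a Python dataclass stands in for the entity; the class and field names are chosen here for the example and are not part of CASE*Method.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    """The entity EMPLOYEE: a class or category of thing."""
    name: str
    badge_number: int
    date_of_birth: str
    salary: int


# An instance: one specific employee with a value for each attribute.
jim = Employee(name="Jim Brown", badge_number=1322,
               date_of_birth="15-MAR-50", salary=55000)

# The attributes belong to the entity; the values belong to the instance.
attribute_names = [f.name for f in fields(Employee)]
```

Here `Employee` plays the role of the entity and `jim` is one occurrence of it. Mistaking `jim` for an entity is the error the Quick Notes warn about.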

Page 32: Data Modeling

Entities - cont'd

Each instance must be uniquely identifiable from other instances of the same entity. An attribute or set of attributes that uniquely identifies an entity is called a Unique Identifier (UID).

Example

Each employee has a unique badge number. Badge number is a candidate for the entity EMPLOYEE's UID.

Look for attributes that uniquely identify an entity.

Example

What attributes might uniquely identify the following entities?

Quick Notes

• If an entity cannot be uniquely identified, it may not be an entity.

• Attributes that uniquely identify an entity and are part of the entity's UID are tagged with #*.
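
A candidate UID can be checked against sample instances. The sketch below is a hedged illustration rather than a CASE*Method technique: the function name `is_unique_identifier` is hypothetical, and apart from Jim Brown's badge number 1322 the sample values are made up.

```python
def is_unique_identifier(instances, attributes):
    """Return True if no two instances share the same values for `attributes`."""
    seen = set()
    for instance in instances:
        key = tuple(instance[a] for a in attributes)
        if key in seen:
            return False  # two instances cannot be told apart by these attributes
        seen.add(key)
    return True


# Hypothetical sample instances of the entity EMPLOYEE.
employees = [
    {"badge_number": 1322, "name": "Jim Brown"},
    {"badge_number": 1407, "name": "Mary Jones"},
    {"badge_number": 1512, "name": "Jim Brown"},  # a second employee with the same name
]

badge_is_candidate = is_unique_identifier(employees, ["badge_number"])
name_is_candidate = is_unique_identifier(employees, ["name"])
```

A failed check rules an attribute out, as it does for name here. A passed check only shows that the sample data is consistent with the candidate; whether badge number is truly unique is a business rule that must be confirmed with the users.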

Page 33: Data Modeling

IDENTIFY AND MODEL ENTITIES

Follow the steps below to identify and model entities from a set of interview

notes.

• Examine the nouns. Are they things of significance?

• Name each entity.

• Is there information of interest about the entity that the business needs to hold?

• Is each instance of the entity uniquely identifiable? Which attribute or attributes could serve as

its UID?

• Write a description of it. "An EMPLOYEE has significance as a paid worker at the company.

For example, John Brown and Mary Smith are EMPLOYEES."

• Diagram each entity and a few of its attributes.

Quick Note

• Do not disqualify a candidate entity too soon. Additional attributes of interest to the business

may be uncovered later.

Page 34: Data Modeling

Identify and Model Entities - cont'd

Example

Identify and model the entities in the following set of information requirements.

"I'm the manager of a training company that provides instructor-led courses in management

techniques. We teach many courses, each of which has a code, a name, and a fee. Introduction to UNIX and C Programming are two of our more popular courses. Courses vary in length from one day to four days. An instructor can teach several courses. Paul Rogers and Maria Gonzales

are two of our best teachers. We track each instructor's name and phone number. Each course is

taught by only one instructor. We create a course and then line up an instructor. The students

can take several courses over time, and many of them do this. Jamie Brown from AT&T took

every course we offer! We track each student's name and phone number. Some of our students and instructors do not give us their phone numbers."

Page 35: Data Modeling

Identify and Model Entities - cont'd

Solution

The following entities model the Training Company's information requirements.

Entity Descriptions

• A COURSE has significance as a training event offered by the Training Company. For

example, Introduction to UNIX and C Programming.

• A STUDENT has significance as a participant in one or more COURSES. For example, Jamie

Brown.

• An INSTRUCTOR has significance as a teacher of one or more COURSES. For example, Paul

Rogers and Maria Gonzales.

Page 36: Data Modeling

EXERCISE 3-1

Identify and model entities.

1. Identify and model the entities in the following set of information requirements. Write a brief

description of each entity. Show at least two attributes for each entity.

"I'm the owner of a small video store. We have over 3,000 videotapes that we need to keep track

of.

Each of our videotapes has a tape number. For each movie, we need to know its title and

category (e.g. comedy, suspense, drama, action, war, or sci-fi). Yes, we do have multiple copies

of many of our movies. We give each movie a specific id, and then track which movie a tape

contains. A tape may be either Beta or VHS format. We always have at least one tape for each

movie we track, and each tape is always a copy of a single, specific movie. Our tapes are very long and we don't have any movies that require multiple tapes.

We are frequently asked for movies starring specific actors. John Wayne and Katharine Hepburn are always popular. So we'd like to keep track of the star actors appearing in each

movie. Not all of our movies have star actors. Customers like to know each actor's "real" birth

name and date of birth. We track only actors who appear in the movies in our inventory.

We have lots of customers. We only rent videos to people who have joined our "video club." To

belong to our club, they must have good credit. For each club member, we’d like to keep his/her

first and last name, current phone number, and current address. And, of course, each club member has a membership number.

Then we need to keep track of what videotapes each customer currently has checked out. A

customer may check out multiple videotapes at any given time. We just track current rentals.

We don't keep track of any rental histories."

Page 37: Data Modeling

RELATIONSHIPS

A relationship is a two-directional, significant association between two

entities, or between an entity and itself.

Relationship Syntax

Example

The relationship between INSTRUCTOR and COURSE is:

Each COURSE may be taught by one and only one INSTRUCTOR.

Each INSTRUCTOR may be assigned to one or more COURSES.

Each direction of a relationship has:

• a name - e.g., taught by or assigned to.

• an optionality - either must be or may be.

• a degree - either one and only one or one or more.

Quick Notes

• Cardinality is a synonym for the term degree.

• A degree of 0 (no related instance) is expressed by the optionality may be.
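
The relationship syntax is regular enough to generate the sentences mechanically. The following is a minimal sketch under stated assumptions: the class `RelationshipEnd` and its field names are hypothetical, chosen only to mirror the three parts each direction of a relationship has (name, optionality, degree).

```python
from dataclasses import dataclass


@dataclass
class RelationshipEnd:
    """One direction of a relationship between two entities."""
    source: str      # entity the sentence starts from
    mandatory: bool  # optionality: True -> "must be", False -> "may be"
    name: str        # relationship name, e.g. "taught by"
    many: bool       # degree: True -> "one or more", False -> "one and only one"
    target: str      # entity at the other end (written in the plural when many is True)

    def sentence(self) -> str:
        optionality = "must be" if self.mandatory else "may be"
        degree = "one or more" if self.many else "one and only one"
        return f"Each {self.source} {optionality} {self.name} {degree} {self.target}."


# The two directions of the relationship between COURSE and INSTRUCTOR.
course_to_instructor = RelationshipEnd("COURSE", False, "taught by", False, "INSTRUCTOR")
instructor_to_course = RelationshipEnd("INSTRUCTOR", False, "assigned to", True, "COURSES")

sentences = [course_to_instructor.sentence(), instructor_to_course.sentence()]
```

Each relationship needs two `RelationshipEnd` values, one per direction, which matches the rule that a relationship is always read both ways.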

Page 38: Data Modeling

Relationships - cont'd

Diagramming Conventions

• A line between two entities

• Lower case relationship names

• Optionality

• Degree

Page 39: Data Modeling

Relationships - cont'd

First read a relationship in one direction, and then read the relationship in

the other direction.

Example

Read the relationship between EMPLOYEE and DEPARTMENT.

Read this relationship first from left to right, and then from right to left.

Relationship from Left to Right (partial diagram)

Each EMPLOYEE must be assigned to one and only one DEPARTMENT.

Relationship from Right to Left (partial diagram)

Each DEPARTMENT may be responsible for one or more EMPLOYEES.


Relationships - cont'd

Example

Read the relationship between STUDENT and COURSE.

Each STUDENT may be enrolled in one or more COURSES.

Each COURSE may be taken by one or more STUDENTS.

Example

Read the relationship between PAYCHECK and EMPLOYEE.

Each PAYCHECK must be for one and only one EMPLOYEE.

Each EMPLOYEE may be the receiver of one or more PAYCHECKs.


EXERCISE 3-2

Read relationships.

1. Write the relationship sentences for this E-R diagram.


EXERCISE 3-3

Draw an Entity-Relationship Diagram.

1. Draw an Entity-Relationship diagram to represent the following:

a. Each EMPLOYEE must be assigned to one and only one DEPARTMENT.

b. Each DEPARTMENT may be responsible for one or more EMPLOYEES.

c. Each EMPLOYEE may be assigned to one or more ACTIVITIES.

d. Each ACTIVITY may be performed by one or more EMPLOYEES.


EXERCISE 3-4

Optional Exercise

Draw an Entity-Relationship Diagram.

1. Draw an Entity-Relationship diagram to represent the following:

a. Each ORACLE DATABASE must be made up of one or more TABLESPACEs.

b. Each TABLESPACE must be part of one and only one ORACLE DATABASE.

c. Each TABLESPACE must be made up of one or more FILEs.

d. Each FILE may be part of one and only one TABLESPACE.

e. Each TABLESPACE may be divided into one or more SEGMENTS.

f. Each SEGMENT must be included in one and only one TABLESPACE.

g. Each SEGMENT must be inclusive of one or more EXTENTS.

h. Each EXTENT must be included in one and only one SEGMENT.

i. Each EXTENT must be composed of one or more BLOCKs.

j. Each BLOCK must be part of one and only one EXTENT.

k. Each FILE must be resident on one and only one DISK. *

l. Each DISK may be the host for one or more FILEs.

* Some operating systems may allow a file to span disks.


RELATIONSHIP TYPES

There are three types of relationships.

Relationship Types

• Many to One Relationships

• Many to Many Relationships

• One to One Relationships

All relationships should represent the information requirements and rules of the business.


Relationship Types - cont'd

A Many to One Relationship (M to 1 or M:1) has a degree of one or more in

one direction and a degree of one and only one in the other direction.

Example

There is a M:1 relationship between CUSTOMER and SALES REPRESENTATIVE.

Each CUSTOMER must be visited by one and only one SALES REPRESENTATIVE.

Each SALES REPRESENTATIVE may be assigned to visit one or more CUSTOMERS.

Quick Notes

• M:1 relationships are very common.

• M:1 relationships that are mandatory in both directions are rare.


Relationship Types - cont'd

A Many to Many Relationship (M to M or M:M) has a degree of one or

more in both directions.

Examples

There is a M:M relationship between STUDENT and COURSE.

Each STUDENT may be enrolled in one or more COURSES.

Each COURSE may be taken by one or more STUDENTS.

There is a M:M relationship between EMPLOYEE and JOB.

Each EMPLOYEE may be assigned to one or more JOBs.

Each JOB may be carried out by one or more EMPLOYEES.

Quick Notes

• Many to Many Relationships are very common.

• Many to Many relationships are usually optional in both directions, although a Many to Many

Relationship may be optional in just one direction.


Relationship Types - cont'd

A One to One Relationship (1 to 1 or 1:1) has a degree of one and only one

in both directions.

Example

There is a 1:1 relationship between MICROCOMPUTER and MOTHERBOARD.

Each MICROCOMPUTER must be the host for one and only one MOTHERBOARD.

Each MOTHERBOARD may be incorporated into one and only one MICROCOMPUTER.

Quick Notes

• 1:1 Relationships are rare.

• A 1:1 Relationship that is mandatory in both directions is very rare.

• Entities that seem to have a 1:1 relationship may really be the same entity.
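
The three relationship types can be told apart from the degree of each direction alone. The following Python sketch shows one way to express that rule; the function name relationship_type and the labels "one" and "many" are assumptions made for this example, not notation from the course.

```python
def relationship_type(degree_a_to_b: str, degree_b_to_a: str) -> str:
    """Classify a relationship from the degree of each direction.

    "one" stands for "one and only one", "many" for "one or more".
    """
    degrees = {degree_a_to_b, degree_b_to_a}
    if not degrees <= {"one", "many"}:
        raise ValueError("each degree must be 'one' or 'many'")
    if degrees == {"many"}:
        return "M:M"
    if degrees == {"one"}:
        return "1:1"
    return "M:1"


# CUSTOMER is visited by one and only one SALES REPRESENTATIVE;
# SALES REPRESENTATIVE is assigned to visit one or more CUSTOMERS.
print(relationship_type("one", "many"))
```

Optionality plays no part in the classification, which is why the function takes only the two degrees.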


USING A RELATIONSHIP MATRIX

Use a relationship matrix as an aid for the initial collection of information

about the relationships between a set of entities.

Relationship Matrix Conventions

• A relationship matrix shows if and how each row entity on the left-hand side of the matrix is

related to each column entity shown across the top of the matrix.

• All the entities are listed along both the left-hand side of the matrix and the top of the matrix.

• If a row entity is related to a column entity, then the name of that relationship is shown in the

intersection box.

• If a row entity is not related to a column entity, then a long dash is shown in the intersection

box.

• Each relationship above the diagonal line is the inverse or mirror image of a relationship

below the line.

• Recursive relationships (between an entity and itself) are represented by the boxes on the

diagonal.

Example

The following relationship matrix shows a set of relationships between four entities.

CUSTOMER is related to ORDER and the name of the relationship is the originator of. ORDER is related to CUSTOMER and the name of the relationship is originated by.
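
A relationship matrix can also be held as a nested dictionary, which makes the mirror-image convention easy to check. The sketch below is illustrative only: the function name missing_inverses is an assumption, and only the CUSTOMER and ORDER cells named in the text are filled in. An absent cell plays the role of the long dash.

```python
# Row entity -> column entity -> relationship name.
matrix = {
    "CUSTOMER": {"ORDER": "the originator of"},
    "ORDER": {"CUSTOMER": "originated by"},
}


def missing_inverses(matrix):
    """Return (row, column) pairs whose mirror-image cell is empty.

    Cells on the diagonal (recursive relationships) are skipped because
    they have no mirror image.
    """
    gaps = []
    for row, cells in matrix.items():
        for column in cells:
            if row != column and row not in matrix.get(column, {}):
                gaps.append((row, column))
    return gaps


print(missing_inverses(matrix))
```

An empty result means every relationship above the diagonal has its inverse below it.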


Using a Relationship Matrix - cont'd

Map the contents of a relationship matrix to an E-R diagram.

Example

Map the following relationship matrix to an E-R diagram.

Draw a softbox for each entity and add the entity's attributes. Draw a relationship line for each

relationship, write in the relationship's name, and add each relationship's optionality and degree.


ANALYZE AND MODEL RELATIONSHIPS

Follow a series of five steps to analyze and model relationships.

Steps

• Determine the existence of a relationship.

• Name each direction of the relationship.

• Determine the optionality of each direction of the relationship.

• Determine the degree of each direction of the relationship.

• Read the relationship aloud to validate it.


DETERMINE A RELATIONSHIP'S EXISTENCE

Determine the existence of a relationship.

Examine each pair of entities to determine if a

relationship exists.

Ask About a Relationship's Existence

• Does a significant relationship exist between ENTITY A and

ENTITY B?

Example

Consider the entities DEPARTMENT and EMPLOYEE.

Is there a significant relationship between DEPARTMENT and EMPLOYEE?

Yes, there is a significant relationship between DEPARTMENT and EMPLOYEE.

Example

Consider the entities DEPARTMENT and ACTIVITY.

Is there a significant relationship between DEPARTMENT and ACTIVITY?

No, there is not a significant relationship between DEPARTMENT and ACTIVITY.


Determine a Relationship's Existence - cont'd

Use a relationship matrix to systematically examine

each pair of entities.

Example

Log the relationships among ACTIVITY, DEPARTMENT, and EMPLOYEE

on a relationship matrix. The check marks indicate that a

relationship exists.


NAME THE RELATIONSHIP

Name each direction of a relationship.

Ask a Relationship's Name

• How is an ENTITY A related to an ENTITY B?

An ENTITY A is [relationship name] an ENTITY B.

• How is an ENTITY B related to an ENTITY A?

An ENTITY B is [relationship name] an ENTITY A.

Example

Consider the relationship between DEPARTMENT and EMPLOYEE.

How is a DEPARTMENT related to an EMPLOYEE?

Each DEPARTMENT is responsible for an EMPLOYEE.

How is an EMPLOYEE related to a DEPARTMENT?

Each EMPLOYEE is assigned to a DEPARTMENT.

Optionally, log the relationship names in a relationship grid.

Example

Log the relationship names for the relationship between DEPARTMENT and EMPLOYEE.


Name the Relationship - cont'd

Use a list of relationship name pairs to assist in

naming relationships.

Useful Relationship Name Pairs

• based on / the basis for

• bought from / the supplier of

• description of / for

• operated by / the operator for

• represented by / the representation of

• responsible for / the responsibility of

Quick Note

• Do not use "related to" or "associated with" as relationship names.

For further information on the subject see:

CASE*Method Entity Relationship Modelling, 5456-V1.0, page C-10


DETERMINE RELATIONSHIP'S OPTIONALITY

Determine the optionality of each direction of the

relationship.

Ask About a Relationship's Optionality

• Must ENTITY A be [relationship name] ENTITY B?

• Must ENTITY B be [relationship name] ENTITY A?

Example

Consider the relationship between DEPARTMENT and EMPLOYEE.

Must an EMPLOYEE be assigned to a DEPARTMENT? Always?

Is there any situation in which an EMPLOYEE will not be assigned to a DEPARTMENT?

No, an EMPLOYEE must always be assigned to a DEPARTMENT.

Must a DEPARTMENT be responsible for an EMPLOYEE?

No, a DEPARTMENT does not have to be responsible for an EMPLOYEE.

Draw the relationship lines, with the relationship names.

Example


DETERMINE RELATIONSHIP'S DEGREE

Determine the degree of the relationship in both

directions.

Ask About a Relationship's Degree

• May ENTITY A be [relationship name] more than one ENTITY B?

• May ENTITY B be [relationship name] more than one ENTITY A?

Example

Consider the relationship between DEPARTMENT and EMPLOYEE.

May an EMPLOYEE be assigned to more than one DEPARTMENT?

No, an EMPLOYEE must be assigned to only one DEPARTMENT.

May a DEPARTMENT be responsible for more than one EMPLOYEE?

Yes, a DEPARTMENT may be responsible for one or more EMPLOYEES.

Add the relationship degrees to the E-R Diagram.

Example


VALIDATE THE RELATIONSHIP

Re-examine the E-R Model and validate the

relationship.

Read the Relationship Aloud

• Relationships must be readable and make business sense.

Example

Read the relationship represented by the following diagram.

Each EMPLOYEE must be assigned to one and only one DEPARTMENT.

Each DEPARTMENT may be responsible for one or more EMPLOYEES.


EXERCISE 3-5

Analyze and model relationships.

1. Analyze and model the relationships in the following set of information requirements. Use a

relationship matrix to track the existence of relationships between the entities.

"I'm the manager of a training company that provides instructor-led courses in management

techniques. We teach many courses, each of which has a code, a name, and a fee. Introduction

to UNIX and C Programming are two of our more popular courses. Courses vary in length from

one day to four days. An instructor can teach several courses. Paul Rogers and Maria Gonzales

are two of our best teachers. We track each instructor's name and phone number. Each course is

taught by only one instructor. We create a course and then line up an instructor. The students

can take several courses over time, and many of them do this. Jamie Brown from AT&T took

every course we offer! We track each student's name and phone number. Some of our students

and instructors do not give us their phone numbers."


Exercise 3-5 - cont'd

The following entities were previously modelled.


EXERCISE 3-6

Analyze and model relationships.

1. Analyze and model the relationships in the following set of information requirements from

Exercise 3-1. Use a relationship matrix to track the existence of relationships between the

entities.

"I'm the owner of a small video store. We have over 3,000 videotapes that we need to keep track

of.

Each of our videotapes has a tape number. For each movie, we need to know its title and

category (e.g. comedy, suspense, drama, action, war, or sci-fi). Yes, we do have multiple copies

of many of our movies. We give each movie a specific id, and then track which movie a tape

contains. A tape may be either Beta or VHS format. We always have at least one tape for each

movie we track, and each tape is always a copy of a single, specific movie. Our tapes are very

long, and we don't have any movies that require multiple tapes.

We are frequently asked for movies starring specific actors. John Wayne and Katharine

Hepburn are always popular. So we'd like to keep track of the star actors appearing in each

movie. Not all of our movies have star actors. Customers like to know each actor's "real" birth

name and date of birth. We track only actors who appear in the movies in our inventory.

We have lots of customers. We only rent videos to people who have joined our "video club." To

belong to our club, they must have good credit. For each club member, we'd like to keep their

first and last name, current phone number, and current address. And, of course, each club

member has a membership number.

Then we need to keep track of what videotapes each customer currently has checked out. A

customer may check out multiple videotapes at any given time. We just track current rentals.

We don't keep track of any rental histories."


Exercise 3-6 - cont'd

The following entities were modelled earlier in Exercise 3-1.


LAY OUT THE E-R DIAGRAM

Make an E-R Diagram easy to read and applicable to the people who need

to work with it.

Neat and Tidy

• Line entity boxes up.

• Draw relationship lines straight and either horizontal or vertical.

• When relationship lines must cross, use an angle of 30° to 60°, which is easier to follow.

• Use plenty of white space to avoid the look of congestion.

• Avoid the use of many closely parallel lines, which are difficult to follow.

Unambiguous Text

• Make all text unambiguous.

• Avoid abbreviations and jargon.

• Add adjectives to improve understanding.

• Align text horizontally.

• Put relationship names at the ends of the line and on opposite sides of the line.

Memorable Shapes

• Make the E-R Diagram memorable. People remember shapes and patterns.

• Do not draw an E-R Diagram on a grid.

• Stretch or shrink entity boxes to help the layout of the diagram.


Lay Out the E-R Diagram - cont'd

Draw crowsfeet pointing up or to the left.

Layout Rules

• Try to position a crowsfoot on the left end or the top end of the relationship line.

• Position higher volume, more volatile entities toward the top and left of the diagram.

• Position lower volume, less volatile entities toward the bottom and right of the diagram.

Quick Note

• Until an M:M relationship is resolved, at least one end of the relationship will point down or

to the right.

For further information on the subject see:

CASE*Method Entity Relationship Modelling, 5456-V1.0, pp. 3-16 and 3-17.


ATTRIBUTES

Attributes are information about an entity that needs to be known or held.

Attributes describe an entity by qualifying, identifying, classifying,

quantifying or expressing the state of the entity.

Example

What are some attributes of the entity EMPLOYEE?

• Badge number or payroll number identifies an EMPLOYEE.

• First name and last name qualify an EMPLOYEE.

• Payroll category (e.g. weekly or salary) classifies an EMPLOYEE.

• Age quantifies an EMPLOYEE.

• Employment status (e.g. active, on leave, terminated) expresses the state of an EMPLOYEE.

Attributes represent a type of description or detail, not an instance.

Example

77506 and 763111 are values of the attribute badge number.

John is a value of the attribute first name of EMPLOYEE.

Quick Notes

• Attribute names should be clear to the user, not codified for the developer.

• The entity's name is always a qualifier of the attribute name - e.g., code of COURSE.

Therefore, an attribute's name should not include its entity's name.

• Attribute names should be specific - e.g., is it quantity, quantity returned, or quantity

purchased?

• Always clarify a date attribute with a descriptor or verb phrase, e.g. date of contact, date

ordered.

• An attribute should only be assigned to a single entity.


Attributes - cont'd

Diagramming Conventions

• Attribute names are singular and shown in lower case.

• List attribute names in their entity's soft box.

Example


Attributes - cont'd

Always break attributes down into their lowest meaningful components.

Examples

The name of a PERSON can be broken down into last name and first name.

The number of an ITEM consists of type, vendor, and item number.

Break down aggregate attributes and embedded code fields into simple

attributes.

Quick Notes

• Attributes containing dates, times, social security numbers, and zip codes are generally not

decomposed further.

• An attribute of address is frequently left as an aggregate and then decomposed during Design.

Alternatively, it can be decomposed into multiple attributes: apartment/suite, street address,

city, state, and zip code.

• The level of attribute decomposition will depend upon the business requirements.


Attributes - cont'd

Verify that each attribute has a single value for each entity instance. A

multi-valued attribute or repeating group is not a valid attribute.

Example

Are the attributes of CLIENT single-valued?

No, a CLIENT may be contacted multiple times, and the business needs to keep all dates of contact. The entity CONTACT is missing.

Quick Note

• A repeated attribute indicates a missing entity.
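
The CLIENT example can be sketched in Python to show where the repeating dates go once CONTACT becomes an entity of its own. This is an illustration under assumptions: the class and field names (Client, Contact, name, date_of_contact) and the sample values are invented for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Contact:
    """The missing entity: one instance per contact with a client."""
    date_of_contact: date


@dataclass
class Client:
    name: str  # hypothetical single-valued attribute
    # The repeating group is modelled as a relationship to CONTACT,
    # not as a multi-valued attribute of CLIENT.
    contacts: List[Contact] = field(default_factory=list)


client = Client("Acme")  # sample data, invented for the sketch
client.contacts.append(Contact(date(1992, 3, 2)))
client.contacts.append(Contact(date(1992, 5, 17)))
print(len(client.contacts))
```

Each CLIENT attribute now holds a single value, and every date of contact is kept as an attribute of its own CONTACT instance.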


Attributes - cont'd

Verify that an attribute is not derived or calculated from the existing values

of other attributes.

Common Derived Data

• Counts (e.g. the number of salesmen in a region)

• Totals (e.g. the total number of each salesman's monthly sales)

• Max/Min/Average (e.g. statistics on the sales of a group of salesmen)

• Other calculations (e.g. a salesman's commission calculated at 10% of sales)

Do not include derived attributes in an E-R Model.

Quick Notes

• Derived attributes are redundant.

• Redundant data can lead to inconsistent data values. The derived data must be revised

whenever the attributes upon which it is based are revised.

• Address the option of storing derived data during Database Design.
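
The point about derived data can be illustrated with a short Python sketch in which the commission and the regional count are computed on demand and never stored. The names Salesman, COMMISSION_RATE, and salesmen_in_region, and the sample figures, are assumptions made for the example; the 10% rate comes from the example in the list above.

```python
from dataclasses import dataclass
from decimal import Decimal

COMMISSION_RATE = Decimal("0.10")  # the 10% of sales used in the example


@dataclass
class Salesman:
    name: str
    region: str
    sales: Decimal

    @property
    def commission(self) -> Decimal:
        # Derived from sales each time it is read, so it cannot go stale.
        return self.sales * COMMISSION_RATE


def salesmen_in_region(salesmen, region):
    """A count is derived data too: compute it from the instances."""
    return sum(1 for s in salesmen if s.region == region)


staff = [
    Salesman("Jones", "West", Decimal("2000")),
    Salesman("Smith", "West", Decimal("500")),
    Salesman("Brown", "East", Decimal("1200")),
]
print(staff[0].commission, salesmen_in_region(staff, "West"))
```

Because nothing derived is stored, revising sales needs no second update to keep the commission consistent.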


DISTINGUISH ATTRIBUTES AND ENTITIES

If an attribute has attributes of its own, then it is really an entity.

Example

Determine if all of the attributes of VEHICLE are really attributes.

Initially the user identified color scheme as an attribute of VEHICLE. Later, the user defined the requirement to track the paint color, paint type, and trim color for each color scheme. Color scheme then had attributes of its own, and became an entity with a relationship to VEHICLE.

Example

Determine if all the attributes of EMPLOYEE are attributes.

Number of dependents is an attribute of EMPLOYEE, but if it is necessary to keep each dependent's name and age, then DEPENDENT becomes an entity. Number of dependents can now be derived.

Quick Notes

• Entities have attributes.

• Attributes have no attributes of their own.


Distinguish Attributes and Entities - cont'd

All entities are nouns, but not all nouns are entities.

Entity Characteristics | Attribute Characteristics

Anything about which information must be held | Qualifies an entity

Possesses one or more attributes | Does not possess attribute(s) of its own

If an entity has no attributes, it may be only an attribute | If an attribute has an attribute, then it is an entity or has no significance

May have multiple occurrences associated with another entity via a relationship | Has a single value for each entity occurrence (no repeating groups)

Quick Notes

• Do not disqualify a candidate entity too quickly. Attributes for that entity may appear later.

• Instances of entities and attributes are also nouns.


ATTRIBUTE OPTIONALITY

Identify each attribute's optionality using an attribute tag.

Mandatory Attributes

• A value must be known for each entity occurrence.

• Tagged with *.

Optional Attributes

• A value may be known for each entity occurrence.

• Tagged with o.

Example

Identify the attributes for the PERSON entity. Determine their optionality.

The title and weight attributes are optional. The remaining attributes are mandatory.


Attribute Optionality - cont'd

Use sample attribute instance data to validate attribute optionality.

Example

Are the mandatory and optional attribute tags for the PERSON entity correct? Use an Entity Instance

Chart to validate that the mandatory and optional attribute tags for the PERSON entity are correct.

Entity Name: PERSON

Attribute Name   code   name       title       sex   weight

Tags             *      *          o           *     o

Sample Data      110    Jones      President   F     -

                 301    Smith      Treasurer   M     210

                 134    Gonzales   -           F     110

                 340    Johnson    Secretary   M     -

                 589    Brown      -           M     195

Quick Note

• An Entity Instance Chart is useful for logging sample attribute data.
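
The check that an Entity Instance Chart supports can be sketched in Python. This is a minimal illustration: the function name optionality_violations is an assumption, the sample rows are taken from the PERSON chart, and None stands for a value shown as missing (Brown's title is treated as missing).

```python
# Attribute tags from the chart: "*" is mandatory, "o" is optional.
TAGS = {"code": "*", "name": "*", "title": "o", "sex": "*", "weight": "o"}

SAMPLE = [
    {"code": 110, "name": "Jones", "title": "President", "sex": "F", "weight": None},
    {"code": 301, "name": "Smith", "title": "Treasurer", "sex": "M", "weight": 210},
    {"code": 134, "name": "Gonzales", "title": None, "sex": "F", "weight": 110},
    {"code": 340, "name": "Johnson", "title": "Secretary", "sex": "M", "weight": None},
    {"code": 589, "name": "Brown", "title": None, "sex": "M", "weight": 195},
]


def optionality_violations(tags, instances):
    """Return (code, attribute) pairs where a mandatory attribute has no value."""
    return [
        (instance.get("code"), attribute)
        for instance in instances
        for attribute, tag in tags.items()
        if tag == "*" and instance.get(attribute) is None
    ]


print(optionality_violations(TAGS, SAMPLE))
```

An empty result is consistent with the tags; any pair returned points at either a wrong tag or incomplete sample data.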


IDENTIFY ATTRIBUTES

Identify attributes by examining interview notes and by asking the user

questions.

Attributes may appear in interview notes as:

• Descriptive words and phrases.

• Nouns.

• Prepositional phrases (e.g. Salary amount for each employee).

• Possessive nouns and pronouns (e.g. Employee's name).

Questions to Ask the User

• What information do you need to know or hold about entity x?

• What information would you like displayed or printed about entity x?


Identify Attributes - cont'd

Examine documentation on existing manual procedures or automated systems to discover additional

attributes and omissions.

Paper Forms Computer Reports Computer Files

Headings

Fields

Record layouts

Prompts

Headings

File Dumps

Sort Orders

Questions to Ask the User

• Is this attribute really needed?

Quick Notes

• Beware of obsolete requirements left over from previous systems.

• Beware of derived data.

For further information on the subject see:

CASE*Method Entity Relationship Modelling, 5456-V1.0, pp. 5-6 and 5-7.


EXERCISE 3-7

Develop an E-R Diagram.

1. Develop an E-R Diagram for the following situation. Be sure to tag each attribute with its

optionality.

"Our regional Oracle User's Group has grown to include over 200 members. We're an all

volunteer organization, and our records are a mess. We need an information system to help us

keep track of all our affairs.

We definitely need to automate our membership records. For each member, we need to keep the

member's name, title, mailing address, office phone number, type of membership (individual or

corporate), and whether or not the member is current on dues. We collect dues on a yearly basis,

and everyone's dues are due in January.

We also like to know which company a member works for, but keeping this information current

is a real chore because our members are always changing companies. We only try to track a

single current employer for each member. Our members come from many different companies

including Coors, EG&G, and Storage Tech. A few of our members are unemployed. For each

company, we keep the company name, address, and type of business. We have a standard set of

type of business codes. We only keep the main company address for each company.

We hold various events during the year, and we'd like to track information about each event.

Some of our annual events include the September Meeting, the November Meeting, the annual

Training Day in January, and our April Meeting. We also hold special events each year. For

example, we held a special CASE day last May, and Richard Barker from ORACLE U.K. came

and spoke. We hold our events at several different locations around town including AT&T,

Redrocks Community College, and D.U. We'd like to track each event's date, an optional

description of the event, number of attendees, where it was held, how much money we spent on

it, and any comments on the event. We treat all comments as if they came from an anonymous

submitter. A set of comments is just a free form text statement of any length. We number each

set of comments, and we frequently get multiple sets of comments for an event.

We also track which members attended which events. Some of our members are really active,

and others attend very infrequently or just enjoy receiving our newsletter.

(continued)


Exercise 3-7 - cont'd

"We also need to track what type of computer platforms our members are using. We have a

unique, three-digit system identification tag for each type of platform. For example, 001 is for

IBM/MVS; 002 is for IBM/VM; 003 is for VAX/VMS; 020 is for OS/2; 030 is for PC/DOS;

050 is for Sun Unix; and 080 is for other Unix platforms.

We also like to track which application areas each member is interested in. For example,

accounting, human resources, oil and gas, pharmaceuticals, and health systems. The

applications should be portable, so we don't need to know which platforms they run on."


ASSIGN UNIQUE IDENTIFIERS

A Unique Identifier (UID) is any combination of attributes and/or

relationships that serves to uniquely identify an occurrence of an entity.

Each entity occurrence must be uniquely identifiable.

Example

In a business, each occurrence of DEPARTMENT is uniquely identified by its department number.

The UID for the entity DEPARTMENT is the attribute number.

Example

For a small theatre, each ticket is uniquely identified by its date of performance and its seat number.

The UID for the entity THEATRE TICKET is the combination of the two attributes date of performance and seat number.

An entity must have a UID, or it is not an entity.

Quick Notes

• All components of a UID must be mandatory *.

• Tag each UID attribute with #*.


Assign Unique Identifiers - cont'd

An entity can be uniquely identified through a relationship.

Example

In the banking industry, each bank is assigned a unique bank number. Within a bank, each account has

a unique account number. What is the UID of the entity ACCOUNT?

ACCOUNT is uniquely identified by its attribute number and the specific BANK the account is related to.

Use a UID bar to indicate that a relationship is part of the entity's UID.

Example

The UID bar indicates that the relationship with BANK is part of the UID of ACCOUNT.

Quick Note

• A relationship included in a UID must be mandatory and one and only one in the direction that

participates in the UID.


Assign Unique Identifiers - cont'd

An entity may be uniquely identified through multiple relationships.

Example

A business needs to track the work assignments of its employees. Employees are given work

assignments to projects. An employee may be given multiple assignments to a single project, each

with a different date of assignment.

What is the UID of the entity WORK ASSIGNMENT?

A WORK ASSIGNMENT is uniquely identified by the EMPLOYEE the WORK ASSIGNMENT is for, the PROJECT the WORK ASSIGNMENT is to, and the date assigned.

Quick Note

• Both relationships are mandatory and one and only one in the direction included in the UID.


Assign Unique Identifiers - cont'd

An entity may have more than one UID.

Example

What uniquely identifies an EMPLOYEE?

Candidate UIDs include:

1. badge number

2. payroll number

3. first name/last name

Are they all unique? The first name/last name combination is probably not unique.

Select one candidate UID to be the primary UID, and the others to be

secondary UIDs.

Quick Notes

• Either tag Secondary UIDs as (#), or do not tag them.

• CASE*Dictionary can document multiple secondary UIDs.


Assign Unique Identifiers - cont'd

Consider creating unique, artificial attributes to help identify each entity.

Example

What uniquely identifies a CUSTOMER entity?

Possibly the CUSTOMER's first and last name could be a UID. However, there could be two CUSTOMERS with the same name.

Create an artificial attribute called CUSTOMER code which will be unique for each instance of

CUSTOMER.

Quick Notes

• Artificial attributes are used often for UIDs.

• Define an artificial code when the business does not have a natural attribute which uniquely

identifies an entity.
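
An artificial code can be sketched in Python as a counter that hands out the next unused number. This is only one possible scheme: the class name Customer, its field names, and the use of a simple sequence are assumptions for the example, and the course does not prescribe how codes are generated.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical source of artificial codes: 1, 2, 3, ...
_next_code = itertools.count(1)


@dataclass
class Customer:
    first_name: str
    last_name: str
    # The artificial attribute is assigned on creation and carries no business meaning.
    code: int = field(default_factory=lambda: next(_next_code))


first = Customer("John", "Smith")
second = Customer("John", "Smith")
print(first.code != second.code)
```

The two customers share a name yet remain distinguishable, which is the job the artificial attribute is created to do.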


Assign Unique Identifiers - cont'd

Search for attributes and relationships to identify each entity.

Evaluate the Attributes

• What mandatory attributes identify the entity? Seek out additional attributes that help identify

the entity. Consider creating artificial attributes for identification.

• Does an attribute uniquely identify the entity?

• What combinations of attributes uniquely identify the entity?

Consider the Relationships

• Which of the relationships help identify the entity?

• Are there missing relationships that help identify the entity?

• Does the relationship help uniquely identify the entity?

• Is the relationship mandatory and one and only one in the direction from the entity?

Validate the UID

• Examine sample data. Does the selected combination of attributes and relationships uniquely

identify each instance of an entity?

• Are all the attributes and relationships that are included in the UID mandatory?
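
The validation questions above can be run against sample data with a few lines of Python. The sketch below is illustrative: the function name uid_problems and the ticket values are assumptions, and the THEATRE TICKET attributes follow the earlier example. A relationship that is part of a UID can be represented the same way, as a key holding the related entity's identifier.

```python
from collections import Counter


def uid_problems(instances, uid):
    """Check a candidate UID against sample instances.

    Returns (missing, duplicates): instances lacking a value for some UID
    component, and UID values shared by more than one instance.
    """
    missing = [i for i in instances if any(i.get(part) is None for part in uid)]
    counts = Counter(tuple(i.get(part) for part in uid) for i in instances)
    duplicates = [value for value, n in counts.items() if n > 1]
    return missing, duplicates


# Sample data invented for the sketch.
tickets = [
    {"date of performance": "1992-06-01", "seat number": "A1"},
    {"date of performance": "1992-06-01", "seat number": "A2"},
    {"date of performance": "1992-06-02", "seat number": "A1"},
]

print(uid_problems(tickets, ["seat number"]))
print(uid_problems(tickets, ["date of performance", "seat number"]))
```

Seat number alone repeats across performances, while the combination with date of performance passes both checks on this sample. Passing on sample data supports a UID but does not prove it; the business rules still have to guarantee uniqueness.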


EXERCISE 3-8

Identify UIDs.

1. For the Training Company situation and E-R model from Exercise 3-5, supply attribute tags

for each attribute, and identify a UID for each entity. Add these attribute tags and UIDs to the

E-R model.

"I'm the manager of a training company that provides instructor-led courses in management

techniques. We teach many courses, each of which has a code, a name, and a fee. Introduction

to UNIX and C Programming are two of our more popular courses. Courses vary in length from

one day to four days. An instructor can teach several courses. Paul Rogers and Maria Gonzales

are two of our best teachers. We track each instructor's name and phone number. Each course is

taught by only one instructor. We create a course and then line up an instructor. The students

can take several courses over time, and many of them do this. Jamie Brown from AT&T took

every course we offer! We track each student's name and phone number. Some of our students

and instructors do not give us their phone numbers."

E-R Model from Exercise 3-5


EXERCISE 3-9

Identify UIDs.

1. For the Video Store situation and E-R Model from Exercise 3-6, identify a UID for each entity

and add these UIDs to the E-R model. Also, supply attribute tags for each attribute.

"I'm the owner of a small video store. We have over 3,000 video tapes that we need to keep

track of.

Each of our video tapes has a tape number. For each movie, we need to know its title and

category (e.g. comedy, suspense, drama, action, war, or sci-fi). Yes, we do have multiple copies

of many of our movies. We give each movie a specific id, and then track which movie a tape

contains. A tape may be either Beta or VHS format. We always have at least one tape for each

movie we track, and each tape is always a copy of a single, specific movie. Our tapes are very

long, and we don't have any movies that require multiple tapes.

We are frequently asked for movies starring specific actors. John Wayne and Katharine

Hepburn are always popular. So we'd like to keep track of the star actors appearing in each

movie. Not all of our movies have star actors. Customers like to know each actor's "real" birth

name and date of birth. We track only actors who appear in the movies in our inventory.

We have lots of customers. We only rent videos to people who have joined our "video club." To

belong to our club, they must have good credit. For each club member, we’d like to keep his or

her first and last name, current phone number, and current address. And, of course, each club

member has a membership number.

Then we need to keep track of what video tapes each customer currently has checked out. A

customer may check out multiple video tapes at any given time. We just track current rentals.

We don't keep track of any rental histories."

Page 85: Data Modeling


Exercise 3-9 - cont'd

E-R Model from Exercise 3-6

Page 86: Data Modeling


EXERCISE 3-10

Identify UIDs.

1. For the Oracle User's Group situation and E-R Model from Exercise 3-7, identify a UID for

each entity and add these UIDs to the E-R Model.

"Our regional Oracle User's Group has grown to include over 200 members. We're an all-

volunteer organization, and our records are a mess. We need an information system to help us

keep track of all our affairs.

We definitely need to automate our membership records. For each member, we need to keep the

member's name, title, mailing address, office phone number, type of membership (individual or

corporate), and whether or not the member is current on dues. We collect dues on a yearly basis

and everyone's dues are due in January.

We also like to know which company a member works for, but keeping this information current

is a real chore because our members are always changing companies. We only try to track a

single current employer for each member. Our members come from many different companies

including Coors, EG&G, and Storage Tech. A few of our members are unemployed. For each

company, we keep the company name, address, and type of business. We have a standard set of

type of business codes. We only keep the main company address for each company.

We hold various events during the year, and we'd like to track information about each event.

Some of our annual events include the September Meeting, the November Meeting, the annual

Training Day in January, and our April Meeting. We also hold special events each year. For

example, we held a special CASE day last May, and Richard Barker from ORACLE U.K. came

and spoke. We hold our events at several different locations around town including AT&T,

Redrocks Community College, and D.U. We'd like to track each event's date, an optional

description of the event, number of attendees, where it was held, how much money we spent on

it, and any comments on the event. We treat all comments as if they came from an anonymous

submitter. A set of comments is just a free form text statement of any length. We number each

set of comments, and we frequently get multiple sets of comments for an event.

We also track which members attended which events. Some of our members are really active,

and others attend very infrequently or just enjoy receiving our newsletter.

(continued)

Page 87: Data Modeling


Exercise 3-10 - cont'd

We also need to track what type of computer platforms our members are using. We have a

unique, three-digit system identification tag for each type of platform. For example, 001 is for

IBM/MVS; 002 is for IBM/VM; 003 is for VAX/VMS; 020 is for OS/2; 030 is for PC/DOS;

050 is for Sun Unix; and 080 is for other Unix platforms.

"We also like to track which application areas each member is interested in. For example,

accounting, human resources, oil and gas, pharmaceuticals, and health systems. The

applications should be portable, so we don't need to know which platforms they run on."

E-R Model from Exercise 3-7

Page 88: Data Modeling


REVIEW: BASIC CONCEPTUAL DATA MODELLING

An entity is a thing of significance about which information needs to be

known or held.

Diagramming Conventions

• Soft box

• Singular, unique name

• Name in upper case

• Optional synonym name (in parentheses)

• Any dimensions

Identify and Model Entities

1. Examine the nouns. Are they things of significance?

2. Name each entity.

3. Is there information of interest about the entity that the business needs to hold?

4. Is each instance of the entity uniquely identifiable? Which attribute or attributes could serve as

its UID?

5. Write a description of it. "An EMPLOYEE has significance as a paid worker at the company.

For example, John Brown and Mary Smith are EMPLOYEES."

6. Diagram each entity and a few of its attributes.

Page 89: Data Modeling


Review: Basic Conceptual Data Modelling - cont'd

A relationship is a two-directional, significant association between two entities, or between an entity

and itself.

Relationship Syntax

Diagramming Conventions

Crows always fly east or south!

Analyze and Model the Relationships Between Entities

1. Determine the existence of a relationship.

2. Name each direction of the relationship.

3. Determine the optionality of each direction of the relationship.

4. Determine the degree of each direction of the relationship.

5. Model the relationship.

Page 90: Data Modeling


Review: Basic Conceptual Data Modelling - cont'd

Attributes are information about an entity that needs to be known or held.

Diagramming Conventions

• Attribute names are singular, lower case, and do not include the entity's name.

• Attribute tags: * for mandatory and o for optional.

Analyze and Model Attributes

1. Identify a candidate attribute.

2. Associate the attribute with an entity.

3. Name the attribute.

4. Determine the optionality of the attribute.

5. Validate that the attribute is really an attribute and not an entity.

6. Break down aggregate attributes.

7. Verify that an attribute is single valued.

8. Verify that an attribute is not derived.

Page 91: Data Modeling


Review: Basic Conceptual Data Modelling - cont'd

Each entity must be uniquely identifiable. A Unique Identifier (UID) is any

combination of attributes and/or relationships that serve to uniquely

identify an occurrence of an entity.

Diagramming Conventions

• # indicates an attribute is part of an entity's UID.

• The UID bar indicates a relationship is part of the UID.

Identify UIDs for Each Entity

1. Seek out candidate attributes that help identify an entity.

2. Determine the entity's dependence upon other related entities.

3. Define the UID for the entity.

Page 92: Data Modeling


4

ADVANCED

CONCEPTUAL DATA MODELLING

Page 93: Data Modeling


SECTION OBJECTIVES

At the end of this section, you will be able to:

1. Validate that an attribute is properly placed based upon its dependence on its entity's UID.

2. Resolve many-to-many relationships with intersection entities.

3. Identify and model advanced data constructs including recursive relationships, subtypes, and

exclusive relationships.

Page 94: Data Modeling


NORMALIZE THE DATA MODEL

Normalization is a relational database concept, but its principles apply to

Conceptual Data Modelling.

Validate each attribute's placement using the rules of normalization.

Normal Form Rule              Description

First Normal Form (1NF)       All attributes must be single-valued.

Second Normal Form (2NF)      An attribute must be dependent upon its entity's entire unique identifier.

Third Normal Form (3NF)       No non-UID attribute can be dependent on another non-UID attribute.

A normalized entity-relationship data model automatically translates into a

normalized relational database design.

Quick Notes

• Third normal form is the generally accepted goal for a database design that eliminates

redundancy.

• Higher normal forms are not widely used.

Page 95: Data Modeling


Normalize the Data Model - cont'd

First Normal Form Rule: All attributes must be single-valued.

Validation Check:

• Validate that each attribute has a single value for each occurrence of the entity. No attribute

should have repeating values.

Example

Does the entity CLIENT comply with 1NF? If not, how could it be converted to 1NF?

The attribute date contacted has multiple values; therefore, the entity CLIENT is not in 1NF. Create an additional entity CONTACT with a M:1 relationship to CLIENT.

If an attribute has multiple values, create an additional entity and relate it

to the original entity with a M:1 relationship.
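
The 1NF fix above can be sketched as two tables using Python's built-in sqlite3 module. This is only one possible translation, not part of the guide: the table names, column names (client_id, name, date_contacted), and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# CLIENT keeps only single-valued attributes; the repeating
# attribute date contacted moves to CONTACT (many CONTACTs per CLIENT).
conn.executescript("""
CREATE TABLE client (
    client_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE contact (
    client_id      INTEGER NOT NULL REFERENCES client(client_id),
    date_contacted TEXT    NOT NULL,
    PRIMARY KEY (client_id, date_contacted)
);
""")

conn.execute("INSERT INTO client VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [(1, "1992-01-15"), (1, "1992-03-02")],
)

# One row per contact date instead of a multi-valued attribute on CLIENT.
contact_dates = [
    row[0]
    for row in conn.execute(
        "SELECT date_contacted FROM contact WHERE client_id = 1 "
        "ORDER BY date_contacted"
    )
]
```

Each new contact becomes a new CONTACT row, so CLIENT itself never needs a repeating value.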

Page 96: Data Modeling


Normalize the Data Model - cont'd

Second Normal Form Rule: An attribute must be dependent upon its entity's

entire unique identifier.

Validation Check:

• Validate that each attribute is dependent upon its entity's entire unique identifier. Each specific

instance of the UID must determine a single instance of each attribute.

• Validate that an attribute is not dependent upon only part of its entity's UID.

Example

Validate the placement of the COURSE entity's attributes.

Each instance of a course code determines a specific value for name, duration, and fee. The attributes are properly placed.

Example

Validate the placement of the attributes for the ACCOUNT and BANK entities.

Each instance of a BANK and account number determines specific values of balance and date opened for each account. The attribute bank location is misplaced. It is dependent on BANK, but not on account number. It should not be an attribute of ACCOUNT.

If an attribute is not dependent on its entity's entire UID, it is misplaced

and must be moved.
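
The 2NF validation check can be tried out on sample data. The sketch below is an illustration only: the function names, the sample ACCOUNT rows, and the bank names are all assumptions. Note that sample data can only expose a suspicious dependency; the business rules decide whether it is real.

```python
from itertools import combinations

def determines(rows, determinant, attribute):
    """True if each combination of determinant values maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True

def partial_dependencies(rows, uid, attribute):
    """Proper, non-empty subsets of the UID that already determine the attribute."""
    return [
        subset
        for size in range(1, len(uid))
        for subset in combinations(uid, size)
        if determines(rows, subset, attribute)
    ]

# Hypothetical ACCOUNT occurrences; the UID is BANK plus account number.
accounts = [
    {"bank": "First National", "account_number": 1, "balance": 500, "bank_location": "Denver"},
    {"bank": "First National", "account_number": 2, "balance": 75, "bank_location": "Denver"},
    {"bank": "Western Savings", "account_number": 1, "balance": 120, "bank_location": "Boulder"},
]
uid = ("bank", "account_number")

balance_partial = partial_dependencies(accounts, uid, "balance")         # []
location_partial = partial_dependencies(accounts, uid, "bank_location")  # [("bank",)]
```

In this sample, balance needs the whole UID, while bank location is already fixed by BANK alone, which matches the conclusion that it belongs on BANK.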

Page 97: Data Modeling


Normalize the Data Model - cont'd

Third Normal Form Rule: No non-UID attribute can be dependent on

another non-UID attribute.

Validation Checks:

• Validate that each non-UID attribute is not dependent upon another non-UID attribute.

• Move any non-UID attribute that is dependent upon another non-UID attribute.

Example

Are any of the non-UID attributes for this entity dependent upon another non-UID attribute?

The attributes customer name and state are dependent upon the customer id. Create another entity called CUSTOMER with a UID of customer id, and place the attributes accordingly.

Quick Note

• If an attribute is dependent upon a non-UID attribute, move both the dependent attribute and

the attribute it is dependent upon to a new, related entity.
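
The 3NF move can be sketched in a few lines of Python. The entity name ORDER, the attribute names order_id and date_ordered, and the sample values are assumptions; only customer id, customer name, and state come from the example above.

```python
# Hypothetical occurrences of an entity that still carries customer details.
orders = [
    {"order_id": 100, "date_ordered": "1992-06-01", "customer_id": 7, "customer_name": "Brown", "state": "CO"},
    {"order_id": 101, "date_ordered": "1992-06-03", "customer_id": 7, "customer_name": "Brown", "state": "CO"},
    {"order_id": 102, "date_ordered": "1992-06-04", "customer_id": 9, "customer_name": "Smith", "state": "UT"},
]

# New CUSTOMER entity: UID customer id, with the attributes that depend on it.
customers = {}
for row in orders:
    customers[row["customer_id"]] = {
        "customer_name": row["customer_name"],
        "state": row["state"],
    }

# The original entity keeps only what depends on its own UID. The
# customer_id left behind stands in for the relationship to CUSTOMER.
normalized_orders = [
    {
        "order_id": row["order_id"],
        "date_ordered": row["date_ordered"],
        "customer_id": row["customer_id"],
    }
    for row in orders
]
```

After the split, each customer's name and state are held once, no matter how many related occurrences exist.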

Page 98: Data Modeling


EXERCISE 4-1

Normalize an E-R Model

1. For the following E-R Model, evaluate each entity against the rules of normalization, identify

any misplaced attributes, and explain what rule of normalization each misplaced attribute

violates.

2. Optionally, re-draw the E-R diagrams in third normal form.

Page 99: Data Modeling


RESOLVE M:M RELATIONSHIPS

Attributes may seem to be associated with a M:M Relationship. Resolve

that M:M relationship by adding an intersection entity with those

attributes.

Example

Consider the M:M relationship between PRODUCT and VENDOR. What is the current price of a

specific PRODUCT from a specific VENDOR?

current price seems to be an attribute of the relationship between PRODUCT and VENDOR.

Attributes only describe entities. If attributes describe a relationship, the

relationship must be resolved.

Page 100: Data Modeling


Resolve M:M Relationships - cont'd

Replace or resolve a M:M Relationship with a new Intersection Entity and

two M:1 relationships.

Example

The M:M relationship between PRODUCT and VENDOR can be resolved by adding the intersection

entity CATALOG ITEM. Current price is really an attribute of the entity CATALOG ITEM.

Once the entity CATALOG ITEM was defined, the requirement for additional attributes of CATALOG ITEM surfaced: package quantity and unit of measure are also attributes of CATALOG ITEM. The UID for CATALOG ITEM is composed of its two relationships.

Quick Notes

• An Intersection Entity is frequently identified by its two originating relationships - note the

two UID bars.

• The relationships from the intersection entity are always mandatory.

• Intersection entities frequently represent real-world business entities.

• Intersection entities usually contain consumables like quantity used and dates. They tend to be

high volume and volatile entities.
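
One way to picture the resolved model is as three tables, again using Python's sqlite3 module. This is a sketch under assumptions: the column names (product_id, vendor_code, name) and the sample rows are invented, and the composite primary key stands in for the two UID bars.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (product_id  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE vendor  (vendor_code TEXT    PRIMARY KEY, name TEXT NOT NULL);

-- Intersection entity: both relationships are mandatory (NOT NULL)
-- and together they form the UID (composite primary key).
CREATE TABLE catalog_item (
    product_id       INTEGER NOT NULL REFERENCES product(product_id),
    vendor_code      TEXT    NOT NULL REFERENCES vendor(vendor_code),
    current_price    REAL    NOT NULL,
    package_quantity INTEGER,
    unit_of_measure  TEXT,
    PRIMARY KEY (product_id, vendor_code)
);
""")

conn.execute("INSERT INTO product VALUES (1, 'Widget')")
conn.executemany("INSERT INTO vendor VALUES (?, ?)", [("V1", "Acme"), ("V2", "Apex")])
conn.executemany(
    "INSERT INTO catalog_item VALUES (?, ?, ?, ?, ?)",
    [(1, "V1", 9.95, 12, "box"), (1, "V2", 10.50, 10, "box")],
)

def current_price(product_id, vendor_code):
    """Current price of a specific PRODUCT from a specific VENDOR, or None."""
    row = conn.execute(
        "SELECT current_price FROM catalog_item "
        "WHERE product_id = ? AND vendor_code = ?",
        (product_id, vendor_code),
    ).fetchone()
    return row[0] if row else None
```

The question "What is the current price of a specific PRODUCT from a specific VENDOR?" now has exactly one place to look.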

Page 101: Data Modeling


Resolve M:M Relationships - cont'd

Position Intersection Entities to allow the crowsfeet to point up or to the

left.

M:M Relationship Layout

Intersection Entity Layout

Quick Notes

• A Reference Entity is an entity that has no mandatory relationship ends connected to it.

• When M:M relationships are resolved, the layout of the entire diagram may need to be

shuffled.

Page 102: Data Modeling


Resolve M:M Relationships - cont'd

The UID of an intersection entity is frequently composed of its relationships

to the two originating entities.

Example

Resolve the following M:M relationship to accommodate these additional requirements:

"Track the date each student enrolled in a course, the date the student completed the course, and

the student's grade."

Solution

Add the intersection entity ENROLLMENT and two M:1 relationships.

ENROLLMENT has attributes of date enrolled, date completed, and grade. The UID of ENROLLMENT is made up of its relationships to STUDENT and COURSE.

Quick Note

• This model only tracks the last date the student enrolled in a specific course. If multiple

enrollments need to be kept, include the attribute date enrolled as part of the UID.

Page 103: Data Modeling


Resolve M:M Relationships - cont'd

An intersection entity's relationships to the two originating entities may not

be adequate to uniquely define each occurrence of the intersection entity.

Example

Resolve the following M:M relationship to accommodate these additional requirements:

"Track the date each employee is assigned to a project, and the duration of that assignment."

Add an intersection entity called WORK ASSIGNMENT with attributes date assigned and duration.

WORK ASSIGNMENT is partially identified by its relationships to EMPLOYEE and PROJECT, but

those two relationships are not enough to uniquely identify a WORK ASSIGNMENT. An employee

may have multiple assignments to a project, with different assignment dates. Therefore, the UID of

WORK ASSIGNMENT must include the related EMPLOYEE, the related PROJECT, and the

attribute date assigned.
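
The effect of adding date assigned to the UID can be sketched with a composite key in sqlite3. The column names and sample values are assumptions, and the EMPLOYEE and PROJECT tables are left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_assignment (
    employee_id   INTEGER NOT NULL,  -- relationship to EMPLOYEE
    project_id    INTEGER NOT NULL,  -- relationship to PROJECT
    date_assigned TEXT    NOT NULL,  -- attribute that completes the UID
    duration_days INTEGER,
    PRIMARY KEY (employee_id, project_id, date_assigned)
);
""")

# The same employee is assigned to the same project twice, on different dates.
conn.execute("INSERT INTO work_assignment VALUES (1, 10, '1992-01-06', 30)")
conn.execute("INSERT INTO work_assignment VALUES (1, 10, '1992-04-01', 14)")

# A second occurrence with an identical UID is rejected.
try:
    conn.execute("INSERT INTO work_assignment VALUES (1, 10, '1992-04-01', 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

assignment_count = conn.execute("SELECT COUNT(*) FROM work_assignment").fetchone()[0]
```

Had the key been only the two relationships, the second assignment could not have been recorded at all.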

Page 104: Data Modeling


Resolve M:M Relationships - cont'd

Once an intersection entity is identified, search for additional attributes

which describe the intersection entity.

Example

What information needs to be known about the relationship between PRODUCT and VENDOR? "We

need to track the current price of a specific PRODUCT from a specific VENDOR."

Resolve the following M:M relationship to accommodate this additional requirement.

Add the intersection entity VENDOR ITEM with an attribute of current price.

What other information needs to be known about a VENDOR ITEM?

"We also need to know the package quantity and unit of measure of each VENDOR ITEM."

Page 105: Data Modeling


Resolve M:M Relationships - cont'd

Search for attributes which identify, or help to identify an intersection

entity.

Example

How do you identify each VENDOR ITEM? Do you use the combination of the related VENDOR

code and the PRODUCT id?

"No, we have a catalog of all orderable VENDOR ITEMs, and each VENDOR ITEM has a unique

catalog number."

According to the rules of the business, each VENDOR ITEM has a unique catalog number. So the

attribute catalog number should be the UID of VENDOR ITEM.

Page 106: Data Modeling


Resolve M:M Relationships - cont'd

Resolve all M:M relationships by the end of the Analysis phase. This forced

resolution may result in an Intersection Entity with no attributes.

Example

In the Video Store situation, the following M:M relationship was defined.

At the end of the Analysis phase, the user has not identified any attributes that are associated with the

M:M relationship. Resolve the M:M relationship with an Intersection Entity with no attributes.

Quick Notes

• An Intersection Entity with no attributes is just a two-way cross-reference list between occurrences of the entities.

• An Intersection Entity with no attributes is the exception to the rule that an entity must have attributes to be an entity.

• The UID for an empty Intersection Entity is always composed of the relationships of the two entities from which it originated.
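
An intersection entity with no attributes behaves like a set of pairs. The sketch below assumes the Video Store M:M relationship is the one between MOVIE and ACTOR; the movie titles and the helper function names are illustrative assumptions.

```python
# Each pair is one occurrence of the attribute-less intersection entity.
# Its UID is the pair itself, so a set is enough to hold it.
appearances = {
    ("Stagecoach", "John Wayne"),
    ("Rio Bravo", "John Wayne"),
    ("Rio Bravo", "Dean Martin"),
}

def actors_in(movie):
    """Read the cross-reference list from the MOVIE side."""
    return sorted(actor for m, actor in appearances if m == movie)

def movies_with(actor):
    """Read the same list from the ACTOR side."""
    return sorted(m for m, a in appearances if a == actor)
```

The same set answers questions in both directions, which is all a two-way cross-reference list needs to do.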

Page 107: Data Modeling


EXERCISE 4-2

Resolve a M:M relationship.

1. In the E-R Model for the Oracle User's Group from Exercise 3-10, a M:M relationship was

initially modelled between the MEMBER entity and the APPLICATION AREA entity.

Resolve that M:M relationship based upon the following additional requirements.

Additional Requirements

"We would also like to keep a brief description of each member's interest in each specific

application area. For example, one member might already have a large accounting application

system that they developed in house. Another member might be interested in an application area

without describing that interest."

Page 108: Data Modeling


EXERCISE 4-3

Resolve a M:M relationship.

1. Resolve the following M:M Relationship between CUSTOMER and PRODUCT. Add the

attributes date ordered, quantity ordered, and price.

Page 109: Data Modeling


MODEL HIERARCHICAL DATA

Represent hierarchical data as a set of many to one relationships.

Example

Model a company's hierarchical organization structure as a set of M:1 relationships.

Quick Note

• Oracle's E-R Diagram layout rule "Crows fly east or south" causes hierarchies to be drawn

upside-down or sideways!

Page 110: Data Modeling


Model Hierarchical Data - cont'd

The UIDs for a set of hierarchical entities may be propagated through

multiple relationships.

Example

What are the UIDs of the entities FLOOR, SUITE, and ROOM?

The UID of ROOM is the room id and the SUITE it is located within. The UID of SUITE is the suite number and the FLOOR it is located on. The UID of FLOOR is the floor number and the BUILDING it is contained in.
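
The propagation of UIDs down the hierarchy can be sketched with nested tuples in Python. The class names, the building_id attribute (the guide does not state BUILDING's UID), and the sample values are assumptions.

```python
from typing import NamedTuple

class FloorUID(NamedTuple):
    building_id: str   # assumed UID attribute of the related BUILDING
    floor_number: int

class SuiteUID(NamedTuple):
    floor: FloorUID    # the FLOOR the suite is located on
    suite_number: int

class RoomUID(NamedTuple):
    suite: SuiteUID    # the SUITE the room is located within
    room_id: str

def flatten(uid):
    """Expand a nested UID into the flat tuple of values it carries."""
    flat = []
    for part in uid:
        if isinstance(part, tuple):
            flat.extend(flatten(part))
        else:
            flat.append(part)
    return tuple(flat)

room = RoomUID(SuiteUID(FloorUID("HQ", 3), 310), "A")
room_key = flatten(room)  # ("HQ", 3, 310, "A")
```

The flattened key shows the cost of propagation: identifying one ROOM takes a value from every level above it.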

Page 111: Data Modeling


Model Hierarchical Data - cont'd

Consider creating artificial attributes to help identify entities in a

hierarchical relationship.

Example

In a typical organization structure, what could uniquely identify instances of the entities DIVISION,

DEPARTMENT, and TEAM?

Each TEAM could be identified based upon its DEPARTMENT, DIVISION, and COMPANY. Or

each entity could have a unique, independent, artificial identification code.

Quick Notes

• Unique, independent, artificial identification codes tend to be shorter in length.

• If the hierarchical structure changes often, use independent artificial identifiers.

Page 112: Data Modeling


MODEL RECURSIVE RELATIONSHIPS

A Recursive Relationship is a relationship between an entity and itself.

Example

Read the recursive relationship in the following E-R Diagram.

Each EMPLOYEE may be managed by one and only one EMPLOYEE. Each EMPLOYEE may be the manager of one or more EMPLOYEES.

Quick Notes

• The E-R diagramming convention that shows a recursive relationship is known as a pig's ear.

• The loop can appear on any side of the entity's box, but remember that crows always fly east

or south.
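
The recursive EMPLOYEE relationship can be sketched as a mapping from each employee to that employee's manager. The employee ids and function names are assumptions made for this illustration.

```python
# employee id -> manager's employee id. None means the employee is not
# managed by anyone, which is why the relationship must be optional.
managed_by = {1: None, 2: 1, 3: 1, 4: 2}

def direct_reports(manager_id):
    """Each EMPLOYEE may be the manager of one or more EMPLOYEES."""
    return sorted(emp for emp, mgr in managed_by.items() if mgr == manager_id)

def management_chain(employee_id):
    """Follow 'managed by' upward until an employee with no manager is reached."""
    chain = []
    manager = managed_by[employee_id]
    while manager is not None:
        chain.append(manager)
        manager = managed_by[manager]
    return chain
```

For example, management_chain(4) walks from employee 4 to employee 2 and then to employee 1, where the chain ends.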

Page 113: Data Modeling


Model Recursive Relationships - cont'd

Consider representing a hierarchy as a recursive relationship.

Example

A business hierarchy can be drawn as a recursive relationship.

Quick Notes

• The single recursive entity must include all of the attributes of each individual entity. Ideally, the entities at each level of the hierarchy would have the same attributes.

• A recursive organization model can readily accommodate the addition or subtraction of organization layers.

• A recursive organization model cannot handle a mandatory relationship. If each ORGANIZATION ELEMENT must be within another ORGANIZATION ELEMENT, the organization hierarchy would have to be infinite.

• A recursive relationship must be optional in both directions.

Page 114: Data Modeling


Model Recursive Relationships - cont'd

Bill of Materials data can be modelled with multiple entities for each

category of "part" and a set of relationships between each of those entities.

Example

An automobile manufacturing organization needs to track elementary parts, subassemblies,

assemblies, and products. The following E-R diagram models this data by considering each of these

part categories as an entity.

Page 115: Data Modeling


Model Recursive Relationships - cont'd

Model Bill of Materials data as a many to many recursive relationship.

Example

For the automobile manufacturing organization, consider all elementary parts, subassemblies,

assemblies, and products as instances of an entity called COMPONENT. Then the previous complex

E-R Model can be remodelled as a simple recursive relationship.

Each COMPONENT may be a part of one or more COMPONENTS. Each COMPONENT may be made up of one or more COMPONENTS.

Page 116: Data Modeling


Model Recursive Relationships - cont'd

Resolve a recursive M:M relationship with an intersection entity and two

M:1 relationships to different instances of the original entity.

Example

Consider the recursive model of a Bill of Materials structure. This model will track information about

which components are part of a fan. But if a washer is part of a fan, will it also track how many

washers are parts of a fan?

The attribute quantity seems to be associated with the recursive relationship.

Resolve this M:M recursive relationship by adding the intersection entity ASSEMBLY RULE and two

M:1 relationships back to the COMPONENT entity. ASSEMBLY RULE will have an attribute of

quantity.

The two M:1 relationships from an instance of ASSEMBLY RULE will be associated with different instances of the COMPONENT entity. For example, the ASSEMBLY RULE instance for washers to fan will have a M:1 relationship to the COMPONENT instance for washer and a second M:1 relationship to the COMPONENT instance for fan. The ASSEMBLY RULE entity will record the quantity of washers that are part of a single fan.
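
The ASSEMBLY RULE idea can be sketched as a list of (assembly, part, quantity) rows. The motor and blade components, the quantities, and the function name are assumptions, and the sketch assumes no component is ever part of itself.

```python
# Each tuple is one ASSEMBLY RULE: two COMPONENT references plus quantity.
assembly_rules = [
    ("fan", "motor", 1),
    ("fan", "blade", 3),
    ("fan", "washer", 4),
    ("motor", "washer", 2),
]

def total_quantity(assembly, part):
    """Count how many units of `part` go into one `assembly`, at any depth."""
    total = 0
    for parent, child, quantity in assembly_rules:
        if parent != assembly:
            continue
        if child == part:
            total += quantity
        # Also count the part inside each sub-component.
        total += quantity * total_quantity(child, part)
    return total
```

Here one fan uses four washers directly and two more inside its motor, so the rules give a total of six.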

Page 117: Data Modeling


EXERCISE 4-4

Model hierarchical and recursive relationships.

1. Develop two E-R diagrams to represent the following situation. Develop one as a hierarchical

structure, and one as a recursive structure.

"Our company sells products throughout the United States. So we've divided the U.S. into four

major sales regions: the Northern, Eastern, Southern, and Western Regions. Each sales region

has a unique region code. Each sales region is then divided into sales districts. For example, the

Western Region is divided into the Rocky Mountain, Northwest, Pacific Coast, and Pacific

Districts. Each district has a unique district code.

Each district is made up of sales territories. The Rocky Mountain District is composed of three

territories: Wyoming-Montana, Colorado, and Utah-New Mexico. The Northwest District is

made up of two territories: the Washington and Oregon-Idaho territories. The Pacific Coast

District is composed of two territories: the California and Nevada territories. The Pacific

District includes the Hawaii territory and the Alaska territory. Each territory has a unique

territory code.

Then each sales territory is broken down into sales areas. For example, Colorado is made up of

two sales areas: the Front Range and the Western Slope sales areas. Each sales area has a unique

sales area code.

Each salesperson is responsible for one or more sales areas, and has a specific sales quota. We

also have sales managers who are responsible for one or more sales districts, and sales directors

who are responsible for one or more sales regions. Each sales manager is responsible for the

territories within his districts. We don't overlap our employees' responsibilities - a sales area is

always the responsibility of a single salesperson, and our managers' and directors'

responsibilities don't overlap. Sometimes our salespersons, managers, and directors will be on

leave or special assignments and will not have sales turf responsibilities. We identify all our

sales personnel by their employee ids."

Page 118: Data Modeling


MODEL ROLES WITH RELATIONSHIPS

Beware of entities that represent roles.

Example

In the E-R Model for the Training Company, we defined an INSTRUCTOR entity and a STUDENT

entity. This model works fine if an INSTRUCTOR is never a STUDENT, and a STUDENT is never

an INSTRUCTOR. But what if an INSTRUCTOR is also a STUDENT?

Entities that represent roles may share overlapping instances.

Page 119: Data Modeling


Model Roles with Relationships - cont'd

Use relationships to model roles. Relationships allow a single entity instance

to assume multiple roles.

Example

For the Training Company, define a PERSON entity, which may take on the roles of instructor and/or

student.
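
Modelling roles as relationships can be sketched with one collection of people and two collections of relationships. The person ids, the course codes, and the enrollment of an instructor in a course are assumptions made to show the overlap.

```python
# One PERSON entity; nobody is stored twice.
people = {1: "Paul Rogers", 2: "Maria Gonzales", 3: "Jamie Brown"}

# Relationship "taught by": course code -> person id (one instructor per course).
taught_by = {"UNIX": 1, "C": 2}

# Relationship "enrolled in": (person id, course code) pairs.
enrollments = {(3, "UNIX"), (3, "C"), (1, "C")}

def roles(person_id):
    """Derive a person's roles from the relationships they take part in."""
    found = set()
    if person_id in taught_by.values():
        found.add("instructor")
    if any(pid == person_id for pid, _ in enrollments):
        found.add("student")
    return found
```

Person 1 teaches one course and takes another, so the same instance carries both roles without being duplicated.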

Page 120: Data Modeling


MODEL SUBTYPES

Use subtypes to model exclusive entity types which have common attributes

and common relationships.

Example

"A business has defined two types of employees: exempt and non-exempt. For all employees, track

each employee's badge number, first name, last name, and assigned department. For the exempt

employees, also track employee salary. For the non-exempt employees, track the employee's hourly

rate, overtime rate, and membership in a union."

Create an EMPLOYEE supertype with two subtypes. Each EMPLOYEE is either an EXEMPT

EMPLOYEE or a NON-EXEMPT EMPLOYEE.

Quick Note

• Beware of instances that could be both subtypes - the subtype/supertype construct is incorrect

in those instances.

Page 121: Data Modeling


Model Subtypes - cont'd

A supertype is an entity that has subtypes. A supertype may be split into

two or more mutually exclusive subtypes.

Example

An EMPLOYEE is either an EXEMPT EMPLOYEE or a NON-EXEMPT EMPLOYEE, but not both.

A supertype may have attributes and relationships shared by its subtypes.

Example

All EMPLOYEES must have the attributes badge number, first name, and last name. All EMPLOYEES

must be assigned to one and only one DEPARTMENT.

Each subtype may have its own attributes and relationships.

Example

The EXEMPT EMPLOYEE subtype has an attribute of salary.

The NON-EXEMPT EMPLOYEE subtype has attributes of hourly rate and overtime rate, and a

relationship with the entity UNION.

Quick Note

• A subtype with no attributes or relationships of its own may be a synonym for the supertype

entity and not a subtype.
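
Class inheritance gives a rough analogy for supertypes and subtypes. The sketch below uses Python dataclasses; the class and field names follow the example, while the sample values and the union name are assumptions. DEPARTMENT and UNION are shown as plain strings although the model treats them as related entities.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Employee:
    # Attributes and relationships shared by all subtypes.
    badge_number: int
    first_name: str
    last_name: str
    department: str

@dataclass
class ExemptEmployee(Employee):
    salary: float

@dataclass
class NonExemptEmployee(Employee):
    hourly_rate: float
    overtime_rate: float
    union: Optional[str] = None  # optional relationship with UNION

exempt = ExemptEmployee(1, "Mary", "Smith", "Finance", 52000.0)
hourly = NonExemptEmployee(2, "John", "Brown", "Shipping", 12.5, 18.75, "Local 12")
```

Each object belongs to exactly one subtype class, which mirrors the rule that subtypes are mutually exclusive.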

Page 122: Data Modeling


Model Subtypes - cont'd

All instances of the supertype entity must belong to one and only one of the

subtype entities. Subtypes must form a complete set with no overlaps.

Example

In general, a job is either a MANUAL JOB or a CLERICAL JOB, but there might be a few exceptions.

Supertype Reading Rules

"Each supertype entity must be either a subtype1 or a subtype2"

Example

"Each JOB must be either a MANUAL JOB, a CLERICAL JOB, or OTHER JOB."

Subtype Reading Rules

"...subtype, which is a type of supertype,..."

Example

"...CLERICAL JOB, which is a type of JOB,..."

Always use the subtype OTHER when unsure about the set's completeness.

Page 123: Data Modeling


Model Subtypes - cont'd

Subtypes can be further subtyped. Normally two or three levels of nesting

are adequate.

Example

Define further subtypes for the subtype entity AIRPLANE.

AIRPLANE is a subtype of AIRCRAFT and a supertype of POWERED AIRPLANE and GLIDER.

JET PLANE inherits the attributes and relationships of POWERED AIRPLANE, AIRPLANE, and AIRCRAFT.

Page 124: Data Modeling


MODEL EXCLUSIVE RELATIONSHIPS

Model two or more mutually exclusive relationships from the same entity

using an arc.

Example

A BANK ACCOUNT either must be owned by an INDIVIDUAL or must be owned by a COMPANY.

Use an arc to model this relationship.

Exclusive Relationship Reading Rules

"Each entityA either relationship1 entity1 or relationship2 entity2."

Example

Each BANK ACCOUNT either must be owned by one and only one INDIVIDUAL or must be owned

by one and only one COMPANY.

Arc Modelling Conventions

• The relationships in an arc frequently have the same relationship name.

• The relationships in an arc must be either all mandatory, or all optional.

• An arc belongs to a single entity, and must only include relationships originating from that entity.

• An entity may have multiple arcs, but a specific relationship can only participate in a single arc.
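
A mandatory arc can be approximated in a table with a CHECK constraint that requires exactly one of the two relationships to be present. This is a sketch only: the column names are assumptions, and the INDIVIDUAL and COMPANY tables are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bank_account (
    account_number INTEGER PRIMARY KEY,
    individual_id  INTEGER,  -- owned by an INDIVIDUAL ...
    company_id     INTEGER,  -- ... or owned by a COMPANY
    -- The arc: exactly one of the two owners must be filled in.
    CHECK ((individual_id IS NULL) <> (company_id IS NULL))
);
""")

def try_insert(account_number, individual_id, company_id):
    """Return True if the row satisfies the arc, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO bank_account VALUES (?, ?, ?)",
            (account_number, individual_id, company_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

owned_by_individual = try_insert(1, 7, None)
owned_by_company = try_insert(2, None, 3)
owned_by_both = try_insert(3, 7, 3)
owned_by_nobody = try_insert(4, None, None)
```

Only the first two rows are accepted. An optional arc would relax the check so that both columns may be empty, but never both filled.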

Page 125: Data Modeling


Model Exclusive Relationships - cont'd

Choose between two conventions for drawing arcs.

Drawing Convention 1 - An Arc with Optional Dots

A dot on the arc is used to signify that a relationship belongs to the arc.

Drawing Convention 2 - An Arc without Dots

Any relationship crossed by the arc belongs to the arc. A break in the arc indicates a relationship

that is not included in the arc.

Page 126: Data Modeling


EXERCISE 4-5

Develop an E-R Model.

1. Develop an E-R Model for the following information requirements.

"The Right-Way Rental Truck Company rents small moving trucks and trailers for local and

one-way usage. We have 347 rental offices across the western United States. Our rental stock

includes a total of 5,780 vehicles including various types of trucks and trailers. We need to

implement a system to track our rental agreements and our vehicle assignments. Each rental

office rents vehicles that they have in stock to customers ready to take possession of the vehicle.

We don't take reservations, or speculate on when the customer will return rented vehicles. The

central office oversees the vehicle distribution, and directs transfers of vehicles from one rental

office to another.

Each rental office has an office name like "Littleton Right-Way." Each office also has a unique

three-digit office number. We also keep each office's address. Each office is a home office for

some of our vehicles, and each vehicle is based out of a single home office.

Each vehicle has a vehicle id, state of registration, and a license plate registration number. We

have five different types of vehicles: 36' trucks, 24' trucks, 10' trucks, 8' covered trailers, and 6'

open trailers. Yes, we do have a vehicle type code. For all our vehicles, we need to track the last

maintenance date, and expiration date of its registration. For our trucks, we need to know the

current odometer reading, the gas tank capacity, and whether or not it has a working radio. For

long moves, customers really prefer a radio. We log the current mileage just before we rent a

truck, and then again when it is returned.

Most of our rental agreements are for individual customers, but a rental agreement can either be

for an individual or for a company. We do rent a small percentage of our trucks to companies.

We assign each company an identifying company number and track the company's name and

address. No, we don't need to worry about any additional information about a company. Our

corporate sales group handles all that information separately.

(Continued)

Page 127: Data Modeling


Exercise 4-5 - cont'd

"For each individual customer, we record the customer's name, home phone, address, and

driver's license state, number, and expiration date. We like to keep track of all our customers. If

a customer damaged a vehicle, abandoned it, or didn't fully pay the bill, then we tag the

customer as a poor risk, and won't rent to that customer again.

We only allow a single individual or company for a given rental agreement, and we write a

separate rental agreement for each vehicle. Yes, we do have customers rent two or more

vehicles at the same time. Each rental agreement is identified by the originating rental office

number and a rental agreement number. We also need to track the rental date, the anticipated

duration of the rental, the originating rental office, the drop-off rental office, the amount of the

deposit paid, the quoted daily rental rate, and the quoted rate per mile. Of course for the trailers,

there isn't a mileage charge. No, we don't need to automate the financial side of our business,

just our rental agreement tracking and vehicle assignment functions."

Page 128: Data Modeling

MODEL DATA OVER TIME

Add additional entities and relationships to the E-R model to accommodate

historical data.

Ask the User:

• Is an audit trail required?

• Can attribute values change over time?

• Can relationships change over time?

• Do you need to query older data?

• Do you need to keep previous versions?

Quick Note

• Validate any requirements for storing historical data with the user. Storing unnecessary

historical data can be costly.

Page 129: Data Modeling

Model Data Over Time - cont'd

Create an additional entity to track an attribute's values over time.

Example

A consulting firm needs to keep information about its contracts. Each contract has a unique contract

id, and they need to keep a description of the contract and the contract's status (e.g. open, closed, or

suspended). Initially the following CONTRACT entity was modelled.

The above CONTRACT entity supports a single current status value for CONTRACT. The consulting firm

wants to track the dates each contract was opened, was closed, and was suspended. To model status

values over time, add a STATUS entity.

The UID of the STATUS entity is the related CONTRACT and the effective date.

Quick Note

• Use a single entity to record the values over time of multiple attributes associated with an

entity (such as CONTRACT).
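As a rough sketch of how the STATUS entity might look once it becomes a table, the Python below uses the standard-library sqlite3 module as a stand-in for the Oracle RDBMS. The table names, column names, and sample rows are assumptions made for illustration; only the UID (the related CONTRACT plus the effective date) comes from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when asked
conn.executescript("""
CREATE TABLE contract (
    contract_id INTEGER NOT NULL PRIMARY KEY,
    description TEXT    NOT NULL
);
-- One row per status change. UID = related CONTRACT + effective date.
CREATE TABLE contract_status (
    contract_id    INTEGER NOT NULL REFERENCES contract (contract_id),
    effective_date TEXT    NOT NULL,  -- ISO-8601 text sorts chronologically
    status         TEXT    NOT NULL
                   CHECK (status IN ('open', 'closed', 'suspended')),
    PRIMARY KEY (contract_id, effective_date)
);
""")

# Hypothetical sample data, not taken from the course material.
conn.execute("INSERT INTO contract VALUES (1, 'Billing system review')")
conn.executemany(
    "INSERT INTO contract_status VALUES (1, ?, ?)",
    [("1992-01-15", "open"), ("1992-03-01", "suspended"),
     ("1992-04-10", "open"), ("1992-06-30", "closed")],
)


def status_on(contract_id, as_of):
    """Return the status in effect on the given ISO date, or None."""
    row = conn.execute(
        "SELECT status FROM contract_status"
        " WHERE contract_id = ? AND effective_date <= ?"
        " ORDER BY effective_date DESC LIMIT 1",
        (contract_id, as_of),
    ).fetchone()
    return row[0] if row else None


print(status_on(1, "1992-03-15"))  # status in effect during the suspension
print(status_on(1, "9999-12-31"))  # most recent status
```

Because every change is a new row, the current status is simply the row with the latest effective date, and older values remain available for queries.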

Page 130: Data Modeling

Model Data Over Time - cont'd

Add a new entity to accommodate a relationship that may change over

time.

Example

An apartment owner wants to track the tenants in each of his apartments. (The owner only writes

rental contracts with a single person, not multiple people.) The following E-R Model will only track

the current renter of an APARTMENT.

Add the entity RENTAL HISTORY ENTRY to capture the values of the rental relationship over time.

Page 131: Data Modeling

Model Data Over Time - cont'd

An intersection entity is frequently used to track information about a

relationship that changes over time.

Example

A professional society wants to track the companies that its members have been employed by over

time and the term of each employment (e.g. from date and to date). There is an M:M relationship

between each member and each company.

Add an intersection entity, EMPLOYMENT HISTORY ENTRY, to track each employee's

employments over time and the dates of those employments.

By including the attribute from date in the UID of EMPLOYMENT HISTORY ENTRY, this model

will track any multiple terms of employment at a single company by a single employee.

Page 132: Data Modeling

EXERCISE 4-6

Model data over time.

1. Modify the Video Store E-R Model to accommodate the following additional requirements.

"You know, we really need to keep a history of all our rentals. Each time a customer rents a

tape, we would like to keep the rental date/time and the return date/time. All our tapes are due

back the next day, so we don't need to keep a due date.

Keeping this rental history will allow us to analyze the pattern of our rentals. We will be able to

determine how many tapes each customer rents and how many times a customer has returned a

tape late. We will also know how many times a particular tape has been used, and will then

know when to retire each tape. We will also be able to analyze our customers' movie

preferences."

Page 133: Data Modeling

MODEL COMPLEX RELATIONSHIPS

Beware of a ring of M:M relationships.

Example

Develop an E-R model for employment history. For each person, track the position held, the company

worked for, and the dates the position was held. A person may hold a specific position within the same

company multiple times during their career. Initially the following E-R Model was defined.

The dates of the position seem to be an attribute of a relationship. So resolve each of the M:M

relationships.

Which intersection entity are the dates of the position attributes of? All of them? None of them?

Page 134: Data Modeling

Model Complex Relationships - cont'd

Model a relationship between three or more entities as an Intersection

Entity with mandatory relationships with those entities.

Example

A person's employment history is really a 3-way relationship between the PERSON, COMPANY, and

POSITION entities. Use a single intersection entity called EMPLOYMENT HISTORY to model this

relationship.

A complex relationship is a relationship between three or more entities.

Quick Notes

• An intersection entity for a complex relationship always has mandatory relationships back to

the entities to which it relates.

• For an intersection entity representing a complex relationship, follow the rules of basic E-R

Modelling to name the entity, and to analyze and model its relationships, its attributes, and its

UID.

• Consider its mandatory relationships as candidates for inclusion in its UID.
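The three-way EMPLOYMENT HISTORY relationship could be sketched as a single table along the following lines. This is an illustration under stated assumptions, using Python's sqlite3 module: the table and column names (including job_position for the POSITION entity) and the sample rows are hypothetical, and including from_date in the key is one possible UID choice that lets a person hold the same position at the same company more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person       (person_id   INTEGER NOT NULL PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE company      (company_id  INTEGER NOT NULL PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE job_position (position_id INTEGER NOT NULL PRIMARY KEY, title TEXT NOT NULL);

-- Intersection entity: every relationship back to its parents is mandatory.
CREATE TABLE employment_history (
    person_id   INTEGER NOT NULL REFERENCES person (person_id),
    company_id  INTEGER NOT NULL REFERENCES company (company_id),
    position_id INTEGER NOT NULL REFERENCES job_position (position_id),
    from_date   TEXT    NOT NULL,
    to_date     TEXT,             -- NULL while the position is still held
    PRIMARY KEY (person_id, company_id, position_id, from_date)
);
INSERT INTO person       VALUES (1, 'A. Example');
INSERT INTO company      VALUES (10, 'Example Corp');
INSERT INTO job_position VALUES (100, 'Analyst');
""")


def try_insert(row):
    """Insert one employment_history row; return False if the DBMS rejects it."""
    try:
        conn.execute("INSERT INTO employment_history VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


first_term = try_insert((1, 10, 100, "1988-01-01", "1989-06-30"))
second_term = try_insert((1, 10, 100, "1991-02-01", None))   # same job, later term
no_company = try_insert((1, None, 100, "1990-01-01", None))  # breaks a mandatory FK
print(first_term, second_term, no_company)
```

The dates live on the one intersection entity, which answers the question of where they belong when three separate M:M resolutions give no clear home for them.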

Page 135: Data Modeling

EXERCISE 4-7

Model a complex relationship.

1. In the E-R Model for the Oracle User's Group from Exercise 3-10, an M:M relationship was

initially modelled between the MEMBER entity and the COMPUTER PLATFORM entity.

Revise that relationship based upon the following revised requirements.

Revised Requirements

"No, we really don't need to know what computer platform each member is using. Instead, what

we really need to know is which Oracle products (RDBMS, Pro*C, SQL*Forms,

SQL*TextRetrieval, CASE, Financials, etc.) each member is using on which computer

platforms. No, we don't need to keep the specific version of each product, just the general

product name."

Page 136: Data Modeling

EXERCISE 4-8

Optional Exercise

Develop a complex E-R Model.

1. Develop an E-R Model for the following business.

"I am the senior partner in a large, diversified law firm. My firm, Bailey and Associates, handles

a wide variety of cases including traffic violations, domestic disputes, civil suits, and homicide

cases.

We have retained a database administrator to organize and track various data because the firm

grew faster than we had imagined and now there are "cases lying all over the place."

Our firm is made up of departments such as litigation, homicide, etcetera, and each case is

assigned to a particular department for administrative purposes. Attorneys are also assigned to a

particular department, but this is only for billing/payroll purposes since an attorney can work on

cases in other departments.

We need a list of events for a given case (essentially a history of the case) that includes a log of

events and the date the event became effective. Cases have to be identifiable by a unique

number which appears on a list with every event date and event description. Events have special

codes like O for Open, T for Trial, L for Lost, and there must always be an event status for

every case.

We want to keep track of important information associated with a case including the department

to which it is assigned and a brief description (such as Jones vs. Jones). After a case has been

closed, it may be reopened at some future date. We assign reopened cases new case numbers,

but we need to tie the new case number to the previous case number.

(continued)

Page 137: Data Modeling

Exercise 4-8 - cont'd

Attorneys can be party to multiple cases the same way a number of people can be party to

multiple cases. For example, Jones may be a judge on one case and an eyewitness on another.

We are only interested in keeping track of parties and the roles that they play in the context of a

particular case. Parties should be identified by their name and date of birth, and some kind of

unique numbering system. The kinds of people that may be involved in cases include judges

(JG), eyewitnesses (EW), defendants (DE) and of course attorneys (AT). For example, we have

a murder case, and we're working for the defendant. One attorney is assigned to the case, and

there is, of course, a judge presiding over the case. There is also an eyewitness. Thus, there are

four people who are parties to this case, and we'd like to know about all four. In this context, we

are not tracking the attorney in terms of billing, but simply as party to a case.

To elaborate on the varying roles that people can play, assume that a given party can serve in

different roles in different cases, but a party can only serve in one role on a given case."

Page 138: Data Modeling

EXERCISE 4-9

Optional Exercise

Develop a complex E-R Model.

1. Develop an E-R Model for the following business.

"I'm Phil Sales with Shipmore Cruises. We've decided that our manual system of booking

passengers onto our ships won't hold up when we get our new ship, so I guess that's why you're

here. Yes, we'll have two ships, no not boats, boats can fit onto ships, and we'll probably expand

to 5 or 6 by 1995. Each one has the name "Goodsea," "Goodwind," and the new one,

"Goodsky," and each one has a specific passenger capacity and registry. Registry is the country

that it is registered with. No, we don't need to worry about tonnage or draft or anything else

about the ship.

Each year we put out a brochure with the information on each cruise that we offer. Every cruise

has a name, length in number of days - huh? Oh, three, seven, eleven and fourteen day cruises.

Each cruise also has a specific ship assigned to it, some people want to go on only the newer

ships. Yes, I guess we would need the age of each ship. So, for each cruise we also have

different ports that we stop at. A three day cruise will have only one stop, always on the second

day of the cruise, a seven day cruise will stop at three ports, and so on. We vary ports depending

on where the cruise originates. What? The ports of Los Angeles, CA, and Miami, FL, as well as

Anchorage, AK. See, the LA cruises go down to Mexico ports like Cabo San Lucas and Mexico

City; the Miami cruises go to the Bahamas and the Virgin Islands; and the Anchorage cruises

make stops all through Alaska. Depending upon the length of each cruise, each cruise will make

port calls on different days out.

Passengers who sail with us will pick a given cruise, which has a certain length and number of

ports, and which cruise they pick will tell us which cabins are available. Once they choose from

what is available, we can then price them. That depends on the number of people in the cabin

and the "class" of the cabin. Huh? Whenever we book a cabin under the manual system we

remove the cabin from the availability board, unless it's not full and that passenger wants to

share with someone else. If the cabin can hold four people, and they are travelling alone, then

it's gonna cost 'em more. Once passengers are booked, and we get a deposit from them, then we

can pay the travel agent who made the reservation their commission."

Page 139: Data Modeling

5

RELATIONAL

DATABASE CONCEPTS

Page 140: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

• Understand what a relational database is.

• Define what primary keys and foreign keys are.

• Understand the concept of data integrity.

Page 141: Data Modeling

RELATIONAL DATABASE OVERVIEW

A relational database is a database that is perceived by the user as a

collection of relations or two-dimensional tables.

Example

The relational table below contains employee data.

Quick Notes

• Relational database tables are simple but disciplined.

• A relational database must possess data integrity, i.e., its data must be accurate and consistent.

Page 142: Data Modeling

Relational Database Overview - cont'd

Relational databases are manipulated a set at a time rather than a record at

a time.

Example

To select all employees who work in Department 10, use the following SQL statement.

SELECT emp_no, lname, fname, dept_no
FROM employee
WHERE dept_no = 10;

EMP_NO LNAME FNAME DEPT_NO
------ ----- ----- -------
   100 SMITH JOHN       10
   210 BROWN JIM        10

The Structured Query Language (SQL) is used to manipulate relational

databases.

Quick Notes

• The American National Standards Institute (ANSI) has established SQL as the standard

language for operating upon relational databases.

• A relational database can support a full set of relational operations. Relational operations

manipulate sets of data values. Tables can be operated on to create other tables. Relational

operations can be nested.
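To make the set-at-a-time idea concrete, here is a small sketch that runs the Department 10 query and then nests one operation inside another. It uses Python's sqlite3 module in place of SQL*Plus, which is an assumption of this example; the third employee row is hypothetical and was added only so that the WHERE clause has something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER NOT NULL PRIMARY KEY,"
    " lname TEXT NOT NULL, fname TEXT NOT NULL, dept_no INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(100, "SMITH", "JOHN", 10),
     (210, "BROWN", "JIM", 10),
     (310, "ADAMS", "ANN", 20)],  # hypothetical extra row in another department
)

# One statement returns the whole set of matching rows; no record-by-record loop.
dept_10 = conn.execute(
    "SELECT emp_no, lname, fname, dept_no FROM employee"
    " WHERE dept_no = 10 ORDER BY emp_no"
).fetchall()

# Nesting: the result of one operation is itself a table that can be queried.
nested = conn.execute(
    "SELECT lname FROM"
    " (SELECT emp_no, lname FROM employee WHERE dept_no = 10)"
    " WHERE emp_no > 150"
).fetchall()

print(dept_10)
print(nested)
```

The inner SELECT produces a table of Department 10 employees, and the outer SELECT treats that result like any other table.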

Page 143: Data Modeling

PRIMARY KEYS

A Primary Key (PK) is a column or set of columns that uniquely identifies

each row in a table. Each table must have a primary key, and a primary

key must be unique.

Example

The primary key for the EMPLOYEE table consists of the EMP_NO column. Each row in the table is

uniquely identified by its EMP_NO value.

Quick Notes

• No duplicates are allowed in a Primary Key. The primary key must be unique.

• Primary keys generally cannot be changed.

• An entity's UID will map to a Primary Key in its corresponding table.

Page 144: Data Modeling

Primary Keys - cont'd

A Primary Key consisting of multiple columns is called a Composite

Primary Key or a Compound Primary Key.

Example

The composite primary key for the ACCOUNT table consists of the combination of the BANK_NO

and ACCOUNT_NO columns. Each row in the table is uniquely identified by its BANK_NO and

ACCOUNT_NO values.

Quick Note

• The columns of a composite primary key must be unique in combination. The individual

columns can have duplicates, but in combination, no duplicates are allowed.

Page 145: Data Modeling

Primary Keys - cont'd

No part of a primary key may be NULL.

Example

EMP_NO is the primary key of the EMPLOYEE table. Therefore EMP_NO must be defined as NOT

NULL.

Example

How does the ACCOUNT table violate the rules of Primary Keys?

Two of the rows contain NULL values in part of the composite PK. Both BANK_NO and ACCOUNT_NO must be defined as NOT NULL.
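The composite-key rules for the ACCOUNT table (unique in combination, no NULL in any part) can be tried out with a short sketch. It uses Python's sqlite3 module rather than Oracle, and the balance column and all data values are hypothetical. Note that SQLite does not imply NOT NULL for these primary key columns, so the sketch declares it explicitly, as the text recommends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " bank_no    INTEGER NOT NULL,"
    " account_no INTEGER NOT NULL,"
    " balance    REAL,"
    " PRIMARY KEY (bank_no, account_no))"
)


def try_insert(bank_no, account_no, balance):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?, ?)",
                     (bank_no, account_no, balance))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(104, 75760, 12000.50),  # accepted
    try_insert(104, 77956, 100.10),    # same bank, different account: accepted
    try_insert(105, 75760, 55.00),     # same account number, other bank: accepted
    try_insert(104, 75760, 1.00),      # duplicate combination: rejected
    try_insert(None, 77956, 1.00),     # NULL in part of the PK: rejected
]
print(results)
```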

Page 146: Data Modeling

Primary Keys - cont'd

A table can have more than one column or combination of columns that can

serve as the table's primary key. Each of these is called a Candidate Key.

Example

What are the candidate keys for the EMPLOYEE table?

EMP_NO and PAYROLL_ID are candidate keys.

Select one candidate key to be the Primary Key for the table. The other

candidates become Alternate Keys (or Unique Keys).

Example

Quick Notes

• All Candidate Keys must be Unique and NOT NULL.

• Secondary UIDs map to Alternate Keys.

• Person names are not normally candidate keys because their uniqueness cannot be guaranteed.

For example, in the EMPLOYEE Table, the combination LNAME/ FNAME would probably

not be a candidate key.
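One way to picture the difference between the chosen Primary Key and an Alternate Key is the sketch below, again using Python's sqlite3 module as a stand-in DBMS. EMP_NO and PAYROLL_ID come from the example above; the payroll values and the choice of EMP_NO as the primary key are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no     INTEGER NOT NULL PRIMARY KEY,"  # candidate key chosen as the PK
    " payroll_id INTEGER NOT NULL UNIQUE,"       # other candidate: alternate key
    " lname      TEXT    NOT NULL,"
    " fname      TEXT    NOT NULL)"
)


def try_insert(row):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((100, 9001, "SMITH", "JOHN")),
    try_insert((210, 9002, "SMITH", "JOHN")),  # same name is fine: names are not keys
    try_insert((310, 9001, "BROWN", "JIM")),   # duplicate PAYROLL_ID: rejected
    try_insert((100, 9003, "BROWN", "JIM")),   # duplicate EMP_NO: rejected
]
print(results)
```

Both candidate keys carry the same Unique and NOT NULL rules; the only difference is which one the design nominates as the primary key.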

Page 147: Data Modeling

FOREIGN KEYS

A Foreign Key (FK) is a column or combination of columns in one table

that refers to a primary key in the same or another table.

Example

DEPT_NO is a FK in the EMPLOYEE Table, and refers to values in the DEPT_NO column of the

DEPARTMENT Table.

Quick Notes

• Foreign keys are used to join tables.

• Foreign keys are based on data values and are purely logical.

Page 148: Data Modeling

Foreign Keys - cont'd

A foreign key must match an existing primary key value (or else be NULL).

Example

The FK DEPT_NO in the EMPLOYEE table refers to values of the PK DEPT_NO in the DEPART-

MENT table.

If a Foreign Key is part of a Primary Key, that FK cannot be NULL.

Example

In the ACCOUNT table, the FK BANK_NO must be NOT NULL because it is part of the PK.
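The rule that a foreign key must match an existing primary key value or else be NULL can be sketched as follows. The example uses Python's sqlite3 module, where foreign key checking has to be switched on per connection; the department names and employee rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE department (
    dept_no INTEGER NOT NULL PRIMARY KEY,
    name    TEXT    NOT NULL
);
CREATE TABLE employee (
    emp_no  INTEGER NOT NULL PRIMARY KEY,
    lname   TEXT    NOT NULL,
    dept_no INTEGER REFERENCES department (dept_no)  -- optional FK, may be NULL
);
INSERT INTO department VALUES (10, 'FINANCE'), (20, 'SALES');
""")


def try_insert(row):
    """Return True if the employee row is accepted, False if it is rejected."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert((100, "SMITH", 10)),    # matches an existing DEPT_NO: accepted
    try_insert((210, "BROWN", None)),  # NULL FK: accepted
    try_insert((310, "JONES", 99)),    # no department 99: rejected
]

# The FK is also the column used to join the two tables.
joined = conn.execute(
    "SELECT e.lname, d.name FROM employee e"
    " JOIN department d ON d.dept_no = e.dept_no ORDER BY e.emp_no"
).fetchall()
print(results, joined)
```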

Page 149: Data Modeling

DATA INTEGRITY

Data Integrity refers to the accuracy and consistency of the data.

Data Integrity Constraints

Data integrity constraints define the relationally correct state for a database.

Data integrity constraints ensure that users perform only operations which leave the database in a

correct, consistent state.

Constraint Type          Explanation
Entity Integrity         No part of a primary key can be NULL.
Referential Integrity    A foreign key must match an existing primary key value (or else be NULL).
Column Integrity         A column must contain only values consistent with the defined data format of the column.
User-Defined Integrity   The data stored in a database must comply with the rules of the business.

All data integrity constraints should be enforced by the DBMS or the

application software.

Quick Note

• Data is inconsistent if multiple copies of an entry exist, and not all copies have been updated.

An inconsistent database can supply incorrect or contradictory information to its users.

Page 150: Data Modeling

Data Integrity - cont'd

The rules of a business can also determine the correct state for a database.

Such business rules are called User-Defined Data Integrity Constraints.

Example

A business has the following user-defined data integrity constraints.

An exempt employee is not paid for the first 5 hours of overtime worked.

An employee in the Finance Department cannot have a title of "Programmer".

A Salesman's commission cannot exceed 50% of salary.

Quick Notes

• User-defined data integrity constraints can be set by management policy or be required by

government laws.

• Frequently these business rules are completely arbitrary, or at least seem to be arbitrary.

• User-defined data integrity constraints may involve multiple columns and tables.
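Two of the business rules above can be sketched as declarative constraints. This is only one possible approach, shown with Python's sqlite3 module: a CHECK constraint for the single-table commission rule and a trigger for the rule that spans the EMPLOYEE and DEPARTMENT tables. All table names, column names, and rows are hypothetical, and another DBMS or application code could enforce the same rules differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (
    dept_no INTEGER NOT NULL PRIMARY KEY,
    name    TEXT    NOT NULL
);
CREATE TABLE employee (
    emp_no  INTEGER NOT NULL PRIMARY KEY,
    title   TEXT    NOT NULL,
    sal     REAL    NOT NULL,
    comm    REAL,
    dept_no INTEGER NOT NULL REFERENCES department (dept_no),
    -- Single-table rule: a Salesman's commission cannot exceed 50% of salary.
    CHECK (title <> 'Salesman' OR comm IS NULL OR comm <= 0.5 * sal)
);
-- Multi-table rule: nobody in the Finance Department may be a Programmer.
-- This sketch guards INSERT only; UPDATE would need a matching trigger.
CREATE TRIGGER finance_has_no_programmers
BEFORE INSERT ON employee
BEGIN
    SELECT CASE
        WHEN NEW.title = 'Programmer'
         AND EXISTS (SELECT 1 FROM department
                     WHERE dept_no = NEW.dept_no AND name = 'Finance')
        THEN RAISE(ABORT, 'Finance employees cannot be Programmers')
    END;
END;
INSERT INTO department VALUES (10, 'Finance'), (20, 'Sales');
""")


def try_insert(row):
    """Return True if the row satisfies the business rules, else False."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.DatabaseError:
        return False


results = [
    try_insert((1, "Salesman", 1000, 400, 20)),     # commission within 50%
    try_insert((2, "Salesman", 1000, 700, 20)),     # commission too high
    try_insert((3, "Programmer", 3000, None, 20)),  # Programmer outside Finance
    try_insert((4, "Programmer", 3000, None, 10)),  # Programmer in Finance
]
print(results)
```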

Page 151: Data Modeling

6

INITIAL DATABASE DESIGN

Page 152: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

1. Explain how Database Design fits into the Database Development Process.

2. Translate an entity-relationship data model into a relational database design.

3. Document a database design using Table Instance Charts.

Page 153: Data Modeling

DATABASE DESIGN

Database Design is performed during the Design Stage of the System

Development Cycle and is performed concurrently with Application Design.

Page 154: Data Modeling

Database Design - cont'd

Database Design is performed in two distinct activities.

Database Design Activities

1. Map the E-R Model to relational tables to produce an initial design.

2. Refine the initial design to produce a complete database design.

Database Design Deliverable

The Database Design Stage produces design specifications for a relational database including

definitions for relational tables, indexes, views, and storage space.

Page 155: Data Modeling

INITIAL DATABASE DESIGN OVERVIEW

Document each relational table on a Table Instance Chart.

Table Instance Chart

Table Name: EMPLOYEE

Column Name: EMPNO FNAME LNAME JOB HIREDATE SAL COMM MGR DEPTNO
Key Type: PK FK1 FK2
Nulls/Unique: NN, U NN NN NN NN
Sample Data:
7369 MARY SMITH CLERK 17-DEC-80 800 7902 20
7902 HENRY FORD ANALYST 03-DEC-81 3000 7566 50
7521 SUE WARD SALESMAN 22-FEB-81 1250 6000 7698 30
7698 BOB BLAKE MANAGER 01-MAY-81 2850 10000 7839 30
7839 BOB KING PRESIDENT 17-NOV-81 5000 5000 10

Quick Notes

• The valid Key Types are PK for a Primary Key column, and FK for a Foreign Key column.

• Use suffixes to distinguish between multiple FK columns in a single table, for example, FK1

and FK2. Label multiple column keys with the same suffix.

• Use NN for a column that must be defined NOT NULL.

• Use U for a column that must be unique.

• If multiple columns must be unique in combination, label them with a suffix, for example U1.

• Label a single column PK as NN, U.

• Label a multiple column PK as NN, U1 or possibly as NN, U1 and U.

Page 156: Data Modeling

Initial Database Design Overview - cont'd

This familiar Training Company E-R Model will be used to illustrate the

activities of Initial Database Design.

Training Company E-R Model

Page 157: Data Modeling

Initial Database Design Overview - cont'd

Follow a set of steps to map an E-R Model to a set of relational tables

producing an initial database design.

Steps in Initial Database Design

1. Map the simple entities to tables.

2. Map attributes to columns and document sample data.

3. Map unique identifiers to primary keys.

4. Map relationships to foreign keys.

5. Choose arc options.

6. Choose subtype options.

Page 158: Data Modeling

MAP SIMPLE ENTITIES

Map each simple entity to a table. Create a Table Instance

Chart for the new table. Record only the name of the table.

Example

Create a Table Instance Chart for the INSTRUCTOR entity. Name the table

INSTRUCTOR.

Table Name: INSTRUCTOR

Column Name
Key Type
Nulls/Unique
Sample Data

Quick Notes

• The table name should be easy to trace back to the entity name. The plural of the entity name

is sometimes used because the table will contain a set of rows.

• A simple entity is not a subtype or supertype. In Step 6, the designer must decide how to map

a supertype/subtype construct to tables.

Page 159: Data Modeling

MAP ATTRIBUTES TO COLUMNS

Map each attribute to a column in its entity's table. Map

mandatory attributes to NOT NULL (NN) columns.

Example

Map the attributes of the entity INSTRUCTOR to columns in the INSTRUCTOR

table. Since id, first name, and last name are mandatory attributes,

designate their columns as NOT NULL.

Table Name: INSTRUCTOR

Column Name    INST_ID  FNAME  LNAME  PHONENO
Key Type
Nulls/Unique   NN       NN     NN
Sample Data

For each attribute, select a short but meaningful column name.

Quick Notes

• Column names should be easily traced to the E-R model.

• Avoid the use of SQL reserved words as column names - for example, NUMBER.

• Use consistent abbreviations to avoid programmer and user confusion. For example, will

Number be abbreviated as NO or NUM? Is it DEPTNO or DEPTNUM?

• Short column names will reduce the time required for SQL command parsing.

Page 160: Data Modeling

Map Attributes to Columns - cont'd

Document sample rows of data in each table's Table

Instance Chart.

Example

Document sample data for the columns of the INSTRUCTOR table.

Table Name: INSTRUCTOR

Column Name    INST_ID  FNAME  LNAME      PHONENO
Key Type
Nulls/Unique   NN       NN     NN
Sample Data    10       NANCY  HALL       798-2251
               81       MARIA  GONZALES   756-4891
               73       PETE   CASSIDY    301-2291
               95       KATHY  ANDRONICA  483-9221
               301      ERIC   CAMPLIN    535-3166

Sources for Sample Data

• User interview notes

• Entity Instance Charts

• Current computer systems

• Other analysis stage documentation

• Additional conversations with the user

Page 161: Data Modeling

MAP UID'S TO PRIMARY KEYS

Map any attribute(s) that are part of the entity's UID to

PK column(s). Label the columns PK.

Example

The attribute id is the UID of the entity INSTRUCTOR, so make the

corresponding column INST_ID the PK of the INSTRUCTOR table.

Table Name: INSTRUCTOR

Column Name    INST_ID  FNAME  LNAME      PHONENO
Key Type       PK
Nulls/Unique   NN, U    NN     NN
Sample Data    10       NANCY  HALL       798-2251
               81       MARIA  GONZALES   756-4891
               73       PETE   CASSIDY    301-2291
               95       KATHY  ANDRONICA  483-9221
               301      ERIC   CAMPLIN    535-3166

A key type of PK indicates a primary key column.

Quick Notes

• All columns labeled PK must also be labeled NN and U.

• Map a UID that includes multiple attributes to a composite PK. Label those columns NN

and U1.

Page 162: Data Modeling

Map UID's to Primary Keys - cont'd

If an entity's UID includes a relationship, add foreign key

columns to the table and mark them as part of the primary

key.

Example

The UID of the ENROLLMENT entity is composed of its relationship to

COURSE and its relationship to STUDENT. Add two FK columns to the

ENROLLMENT table for the PK of the COURSE table and the PK of the

STUDENT table.

Quick Notes

• Choose a unique name for each FK column, and label the column(s) PK, NN, and FK.

• If multiple FK columns exist in a table, use suffixes to distinguish between them, for example,

FK1 and FK2. Label multiple column keys with the same suffix.

• Composite PK's must be unique in combination and should be labeled U1.

• Add sample data for the FK columns.

Page 163: Data Modeling

MAP RELATIONSHIPS TO FOREIGN KEYS

For M:1 relationships, take the PK at the one end and put

it in the table at the many end.

Example

Take the PK INST_ID at the one end, and put it in the table COURSE at the

many end.

Table Name: COURSE

Column Name: COURSE_CODE NAME FEE DUR INST_ID
Key Type: PK FK
Nulls/Unique: NN, U NN
Sample Data:
344 SQL*FORMS 1000 5 81
974 SQL*RW 400 2 73
401 DB DESIGN 400 2 95
717 DBA 900 3 73
659 SQL*PLUS 400 2 301

Go with the many!

Quick Notes

• Choose a unique name for the FK column, and label the column(s) FK.

• For "must be" (mandatory) relationships, label the column NN.

• Supply sample data.
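The "go with the many" rule can be sketched as table definitions, with the PK of INSTRUCTOR carried into COURSE as a foreign key. The sketch uses Python's sqlite3 module and the sample data from the Table Instance Charts; the column data types and the decision to leave the non-key COURSE columns nullable are assumptions, since the charts do not settle them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE instructor (
    inst_id INTEGER NOT NULL PRIMARY KEY,
    fname   TEXT    NOT NULL,
    lname   TEXT    NOT NULL,
    phoneno TEXT
);
CREATE TABLE course (
    course_code INTEGER NOT NULL PRIMARY KEY,
    name        TEXT,
    fee         INTEGER,
    dur         INTEGER,
    -- PK of the "one" end, carried into the table at the "many" end.
    inst_id     INTEGER REFERENCES instructor (inst_id)
);
""")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?, ?)", [
    (10, "NANCY", "HALL", "798-2251"), (81, "MARIA", "GONZALES", "756-4891"),
    (73, "PETE", "CASSIDY", "301-2291"), (95, "KATHY", "ANDRONICA", "483-9221"),
    (301, "ERIC", "CAMPLIN", "535-3166"),
])
conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?, ?)", [
    (344, "SQL*FORMS", 1000, 5, 81), (974, "SQL*RW", 400, 2, 73),
    (401, "DB DESIGN", 400, 2, 95), (717, "DBA", 900, 3, 73),
    (659, "SQL*PLUS", 400, 2, 301),
])


def courses_taught_by(inst_id):
    """Follow the FK from the many end back to one instructor."""
    rows = conn.execute(
        "SELECT name FROM course WHERE inst_id = ? ORDER BY name", (inst_id,)
    ).fetchall()
    return [name for (name,) in rows]


print(courses_taught_by(73))
print(courses_taught_by(10))
```

Placing the FK at the many end lets one instructor appear on any number of COURSE rows without repeating columns in the INSTRUCTOR table.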

Page 164: Data Modeling

Map Relationships to Foreign Keys - cont'd

If the table's PK includes a foreign key, the FK columns to

support the relationship may have been added in Step 3.

Example

The PK for the ENROLLMENT table included both the foreign key

COURSE_CODE and the foreign key ST_ID. Therefore, these two columns

already exist, and do not need to be added to support the relationships.

Page 165: Data Modeling

Map Relationships to Foreign Keys - cont'd

For a mandatory 1:1 relationship, place the unique FK in

the table at the mandatory end and use the NOT NULL

constraint to enforce the mandatory condition.

Example

Since the relationship from PERSONAL COMPUTER is mandatory, place the

FK for the relationship in the PERSONAL_COMPUTER table and label it NOT

NULL. MB_ID is the FK column added. The FK is labeled U to enforce the 1:1

relationship.

Table Name: PERSONAL_COMPUTER

Column Name    INV_NUM  CASE_TYPE  POWER_SUPPLY  MB_ID
Key Type       PK                                FK
Nulls/Unique   NN, U    NN         NN            NN, U
Sample Data    1045     BABY AT    150           4579
               0437     BABY AT    200           8731
               1458     TOWER      220           4773
               1223     TOWER      220           9978
               1088     MINITOWER  200           4517

Table Name: MOTHERBOARD

Column Name    MB_ID  PROC_CHIP  PROC_SPEED  COPROC_CHIP
Key Type       PK
Nulls/Unique   NN, U  NN         NN          NN
Sample Data    9978   486        33          N
               4517   386        40          Y
               4773   486        25          N
               4579   386SX      25          N
               8731   386        33          Y
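The mandatory 1:1 mapping shown in the two charts might be declared as in the sketch below, which uses Python's sqlite3 module. The column data types are assumptions (INV_NUM is stored as text so that the leading zero in 0437 survives), and only two PERSONAL_COMPUTER rows are preloaded so that the remaining inserts can show the constraints at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE motherboard (
    mb_id       INTEGER NOT NULL PRIMARY KEY,
    proc_chip   TEXT    NOT NULL,
    proc_speed  INTEGER NOT NULL,
    coproc_chip TEXT    NOT NULL
);
CREATE TABLE personal_computer (
    inv_num      TEXT    NOT NULL PRIMARY KEY,
    case_type    TEXT    NOT NULL,
    power_supply INTEGER NOT NULL,
    -- NOT NULL enforces the mandatory end; UNIQUE enforces 1:1.
    mb_id        INTEGER NOT NULL UNIQUE REFERENCES motherboard (mb_id)
);
INSERT INTO motherboard VALUES
    (9978, '486', 33, 'N'), (4517, '386', 40, 'Y'), (4773, '486', 25, 'N'),
    (4579, '386SX', 25, 'N'), (8731, '386', 33, 'Y');
INSERT INTO personal_computer VALUES
    ('1045', 'BABY AT', 150, 4579), ('0437', 'BABY AT', 200, 8731);
""")


def try_insert(row):
    """Return True if the PC row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO personal_computer VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(("1458", "TOWER", 220, 4773)),      # unused motherboard: accepted
    try_insert(("1223", "TOWER", 220, 4773)),      # motherboard already in a PC
    try_insert(("1088", "MINITOWER", 200, None)),  # a PC must have a motherboard
]
print(results)
```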

Page 166: Data Modeling

Map Relationships to Foreign Keys - cont'd

If a 1:1 relationship is optional in both directions, place the

FK in the table at either end of the relationship.

Example

For the optional 1:1 relationship between BERTH and SHIP, the FK column

could be placed in either the BERTH or the SHIP table. Here the B_NUM column is

added to the SHIP table and labeled Unique to enforce the 1:1 relationship.

Page 167: Data Modeling

Map Relationships to Foreign Keys - cont'd

For a 1:M recursive relationship, add a FK column to the single table. This FK

column will refer to values of the PK column.

Example

For this 1:M recursive relationship, add an FK column to the EMPLOYEE table

for each employee's manager. Name the column MGR_ID to reflect the

relationship.

Table Name: EMPLOYEE

Column Name    EMP_ID  FNAME   LNAME   MGR_ID
Key Type       PK                      FK
Nulls/Unique   NN, U   NN      NN
Sample Data    7450    MARY    SMITH   -
               5579    LESLIE  STERNE  7450
               6714    JANET   GENTRY  5579
               9451    BILL    ABLE    7450
               3040    JUAN    GOMEZ   9451

Quick Notes

• The FK column refers to a row in the same table.

• Name the FK column to reflect the relationship.

• A recursive FK will never be NOT NULL.
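A recursive foreign key can be explored with the sample data from the EMPLOYEE chart. The sketch below uses Python's sqlite3 module; the helper function names are hypothetical, and the recursive common table expression is a feature of SQLite and other modern engines rather than something the course material describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER NOT NULL PRIMARY KEY,"
    " fname  TEXT NOT NULL,"
    " lname  TEXT NOT NULL,"
    " mgr_id INTEGER REFERENCES employee (emp_id))"  # NULL at the top of the tree
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (7450, "MARY", "SMITH", None),
    (5579, "LESLIE", "STERNE", 7450),
    (6714, "JANET", "GENTRY", 5579),
    (9451, "BILL", "ABLE", 7450),
    (3040, "JUAN", "GOMEZ", 9451),
])


def direct_reports(mgr_id):
    """Employees whose MGR_ID points at the given manager."""
    rows = conn.execute(
        "SELECT emp_id FROM employee WHERE mgr_id = ? ORDER BY emp_id", (mgr_id,)
    ).fetchall()
    return [emp_id for (emp_id,) in rows]


def chain_of_command(emp_id):
    """Walk the recursive FK upward from one employee to the top."""
    rows = conn.execute(
        "WITH RECURSIVE chain(emp_id, mgr_id, depth) AS ("
        "  SELECT emp_id, mgr_id, 0 FROM employee WHERE emp_id = ?"
        "  UNION ALL"
        "  SELECT e.emp_id, e.mgr_id, chain.depth + 1"
        "  FROM employee e JOIN chain ON e.emp_id = chain.mgr_id"
        ") SELECT emp_id FROM chain ORDER BY depth",
        (emp_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(direct_reports(7450))
print(chain_of_command(3040))
```

The walk stops at MARY SMITH because her MGR_ID is NULL, which is why the recursive FK column has to allow NULLs: without it, no first row could ever be inserted.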

Page 168: Data Modeling

Map Relationships to Foreign Keys - cont'd

For a 1:1 recursive relationship, add a unique FK to the

table. This FK column will refer to values of the PK

column.

Example

For this 1:1 recursive relationship, add a unique column to the PERSON table.

Table Name: PERSON

Column Name    PERS_ID  FNAME  LNAME    SPOUSE_ID
Key Type       PK                       FK
Nulls/Unique   NN, U1   NN     NN       U1
Sample Data    7450     MARY   SMITH    -
               5579     SUSAN  JONES    9451
               6714     JANET  GENTRY   3040
               9451     BILL   JONES    5579
               3040     JERRY  JOHNSON  6714

Quick Notes

• The combination of the PK and FK columns must always be unique in order to ensure the 1:1

relationship.

• A recursive FK will never be NOT NULL.

• The additional constraint that a PERSON cannot be married to him/herself would have to be

implemented separately by the application programs or stored procedures.
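As a sketch of the 1:1 recursive mapping, the example below makes SPOUSE_ID unique on its own, following the "unique FK" wording above. It also shows a CHECK constraint as one possible way to stop a person from being married to him/herself in a DBMS that supports such constraints; this is an assumption of the sketch, and application code or stored procedures remain an alternative. It uses Python's sqlite3 module, and the helper name set_spouse is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE person ("
    " pers_id   INTEGER NOT NULL PRIMARY KEY,"
    " fname     TEXT NOT NULL,"
    " lname     TEXT NOT NULL,"
    " spouse_id INTEGER UNIQUE REFERENCES person (pers_id),"
    " CHECK (spouse_id IS NULL OR spouse_id <> pers_id))"
)
# Insert everyone unmarried first so each spouse_id has a row to refer to.
conn.executemany("INSERT INTO person VALUES (?, ?, ?, NULL)", [
    (7450, "MARY", "SMITH"), (5579, "SUSAN", "JONES"), (6714, "JANET", "GENTRY"),
    (9451, "BILL", "JONES"), (3040, "JERRY", "JOHNSON"),
])


def set_spouse(pers_id, spouse_id):
    """Point one person at a spouse; return False if a constraint rejects it."""
    try:
        conn.execute("UPDATE person SET spouse_id = ? WHERE pers_id = ?",
                     (spouse_id, pers_id))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    set_spouse(5579, 9451), set_spouse(9451, 5579),  # Susan and Bill
    set_spouse(6714, 3040), set_spouse(3040, 6714),  # Janet and Jerry
    set_spouse(7450, 7450),  # married to herself: rejected by the CHECK
    set_spouse(7450, 9451),  # 9451 is already someone's spouse: rejected
]
print(results)
```

The rows are inserted with a NULL SPOUSE_ID and updated afterwards because each spouse refers to the other, so neither row could be inserted first with its FK already filled in.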

Page 169: Data Modeling

REVIEW: MAPPING SIMPLE E-R MODELS TO TABLES

Map a simple Entity-Relationship model to an initial database design using

the following steps:

Steps

1. Map simple entities to tables.

2. Map attributes to columns and document sample data.

3. Map UID's to Primary Keys.

4. Map relationships to Foreign Keys.

5. Document each table design on a Table Instance Chart.

Page 170: Data Modeling

EXERCISE 6-1

Create an initial database design.

1. Follow the first four steps of Initial Database Design to map this E-R Model to a set of initial

table designs. Document your table designs on the supplied set of Table Instance Charts.

Create sample data as required.

Page 171: Data Modeling

Exercise 6-1 - cont'd

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Page 172: Data Modeling

EXERCISE 6-2

Create an initial database design.

1. Follow the first four steps of Initial Database Design to map this E-R Model to a set of initial

table designs. Document your table designs on the supplied set of Table Instance Charts.

Create sample data as required.

Page 173: Data Modeling

Exercise 6-2 - cont'd

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Page 174: Data Modeling

EXERCISE 6-3

Create an initial database design.

1. Follow the first four steps of Initial Database Design to map this E-R Model to a set of initial

table designs. Document your table designs on the supplied set of Table Instance Charts.

Create sample data as required.

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Page 175: Data Modeling

Exercise 6-3 - cont'd

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Page 176: Data Modeling

EXERCISE 6-4

Optional Exercise

Create an initial database design.

1. Follow the first four steps of Initial Database Design to map this E-R Model to a set of initial

table designs. Document your table designs on the supplied Table Instance Charts. Use the

interview notes on the following page to select sample data for the Table Instance Charts.

Page 177: Data Modeling

Exercise 6-4 - cont'd

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Table Name:

Column Name
Key Type
Nulls/Unique
Sample Data

Page 178: Data Modeling

Exercise 6-4 - cont'd

2. Use the following interview notes to select sample data for the Table Instance Charts.

"Our company sells products throughout the United States. So we've divided the U.S. into four

major sales regions: the Northern, Eastern, Southern, and Western Regions. Each sales region

has a unique region code. Each sales region is then divided into sales districts. For example, the

Western Region is divided into the Rocky Mountain, Northwest, Pacific Coast, and Pacific

Districts. Each district has a unique district code.

Each district is made up of sales territories. The Rocky Mountain District is composed of three

territories: Wyoming-Montana, Colorado, and Utah-New Mexico. The Northwest District is

made up of two territories: the Washington and Oregon-Idaho territories. The Pacific Coast

District is composed of two territories: the California and Nevada territories. The Pacific

District includes the Hawaii territory and the Alaska territory. Each territory has a unique

territory code.

Then each sales territory is broken down into sales areas. For example, Colorado is made up of

two sales areas: the Front Range and the Western Slope sales areas. Each sales area has a unique

sales area code.

Each salesperson is responsible for one or more sales areas, and has a specific sales quota. We

also have sales managers who are responsible for one or more sales districts, and sales directors

who are responsible for one or more sales regions. Each sales manager is responsible for the

territories within his districts. We don't overlap our employees' responsibilities - a sales area is

always the responsibility of a single salesperson, and our managers' and directors'

responsibilities don't overlap. Sometimes our salespersons, managers, and directors will be on

leave or special assignments and will not have sales turf responsibilities. We identify all our

sales personnel by their employee ids."

Page 179: Data Modeling

MAP COMPLEX E-R MODELS TO TABLES

Use the following additional steps to map a complex Entity-Relationship

Model to an initial database design.

Additional Steps

5 Choose Arc Options

6 Choose Subtype Options

Page 180: Data Modeling

CHOOSE ARC OPTIONS

Arcs represent a kind of multiple alternative foreign key.

Choose between two alternative designs for mapping arcs

to foreign keys.

Alternative Designs

• Explicit Arc Design

• Generic Arc Design

Example

This E-R Model will map to four tables. The OFFICE SUITE entity has an arc across the many ends

of three relationships, and corresponding FK columns must be added to the OFFICE_SUITE table.

Use either an Explicit Arc Design or a Generic Arc Design to add these multiple alternative foreign

keys.

Quick Notes

• Also use an Explicit Arc Design or a Generic Arc Design to implement multiple foreign keys

when an arc spans a set of 1:1 relationships.

• Arcs can only span relationship ends that are either all mandatory or all optional.

Page 181: Data Modeling

Choose Arc Options - cont'd

The Explicit Arc Design creates a foreign key column for

each relationship included in the arc.

Example

The following E-R Model contains four simple entities, and will be mapped to

four separate tables. The arc spans the many end of three relationships.

Therefore, FKs must be added to the OFFICE_SUITE table. Using an Explicit

Arc Design, create an FK column for each relationship.

Table Name: OFFICE_SUITE

Column Name    BLDG_ID   SUITE_NUM   INDIV_ID   PARTNER_CODE   COMPANY_NUMBER
Key Type       PK        PK          FK1        FK2            FK3
Nulls/Unique   NN, U1    NN, U1
Sample Data    1024      101         30045
               512       210                    A4431
               977       144         54532
               3041      510                                   10844
               2371      430                                   54101

Quick Notes

• The Explicit Arc Design will support multiple foreign keys with different formats. For

example, INDIV_ID, PARTNER_CODE, and COMPANY_NUMBER could all have a different

column format.

• Application software must enforce relationship exclusivity between the foreign keys.
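The exclusivity rule in the note above can also be stated declaratively. The following is a minimal sketch, assuming a database release that enforces CHECK constraints. The data types are guesses based on the sample data, the constraint name OS_RENTER_ARC is hypothetical, and the REFERENCES clauses are omitted because the three referenced table designs are not shown here.

```sql
-- Explicit Arc Design: one FK column per relationship in the arc.
CREATE TABLE OFFICE_SUITE
  (BLDG_ID        NUMBER(5) NOT NULL,
   SUITE_NUM      NUMBER(4) NOT NULL,
   INDIV_ID       NUMBER(5),
   PARTNER_CODE   CHAR(5),
   COMPANY_NUMBER NUMBER(5),
   PRIMARY KEY (BLDG_ID, SUITE_NUM),
   -- Mandatory arc: exactly one of the three FK columns holds a value.
   CONSTRAINT OS_RENTER_ARC CHECK
     (   (INDIV_ID IS NOT NULL AND PARTNER_CODE IS NULL
                               AND COMPANY_NUMBER IS NULL)
      OR (INDIV_ID IS NULL     AND PARTNER_CODE IS NOT NULL
                               AND COMPANY_NUMBER IS NULL)
      OR (INDIV_ID IS NULL     AND PARTNER_CODE IS NULL
                               AND COMPANY_NUMBER IS NOT NULL)));
```

With this constraint, a row that supplies exactly one of the three FK values is accepted; a row that supplies none, or more than one, is rejected. If the relationships under the arc were optional, a fourth alternative with all three columns NULL would be added to the condition.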

Page 182: Data Modeling

Choose Arc Options - cont'd

The Generic Arc Design creates a single foreign key

column and one relationship flag column for the arc. Since

the relationships are exclusive, only one FK value will exist

for each row in the table.

Example

Again, create four separate tables for this E-R Model - one for each entity. Since

the arc spans the many end of the relationships, add the foreign key to the OFFICE_SUITE

table. Using the Generic Arc Design, create a single foreign key column, and

add a type column to indicate which of the three tables is referenced by the FK

column in each row. For example, I for INDIVIDUAL, P for PARTNERSHIP,

and C for COMPANY.

Table Name: OFFICE_SUITE

Column Name    BLDG_ID   SUITE_NUM   RENTER_ID   RENTER_TYPE
Key Type       PK        PK          FK
Nulls/Unique   NN, U1    NN, U1      NN          NN
Sample Data    1024      101         30045       I
               512       210         A4431       P
               977       144         54532       I
               3041      510         10844       C
               2371      430         54101       C

Quick Notes

• If the relationships under the arc are mandatory, make both added columns NOT NULL.

• The foreign keys must share the same format for all referenced tables.
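A Generic Arc Design for the same table might be declared as sketched below. The data types, the constraint name, and the one-letter type codes are illustrative assumptions, as are the PARTNERSHIP table and its PARTNER_CODE key used in the query. Because a foreign key constraint can name only one referenced table, RENTER_ID cannot be declared with a REFERENCES clause; RENTER_TYPE tells the application which table to join.

```sql
-- Generic Arc Design: one shared FK column plus a relationship flag.
CREATE TABLE OFFICE_SUITE
  (BLDG_ID     NUMBER(5) NOT NULL,
   SUITE_NUM   NUMBER(4) NOT NULL,
   RENTER_ID   CHAR(5)   NOT NULL,
   RENTER_TYPE CHAR(1)   NOT NULL
     CONSTRAINT OS_RENTER_TYPE_CK CHECK (RENTER_TYPE IN ('I', 'P', 'C')),
   PRIMARY KEY (BLDG_ID, SUITE_NUM));

-- Suites rented by partnerships: the flag selects the table to join.
SELECT S.BLDG_ID, S.SUITE_NUM, P.PARTNER_CODE
FROM   OFFICE_SUITE S, PARTNERSHIP P
WHERE  S.RENTER_TYPE = 'P'
AND    S.RENTER_ID   = P.PARTNER_CODE;
```

The single RENTER_ID column is the reason all referenced keys must share one format: here every key would have to fit in CHAR(5).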

Page 183: Data Modeling

EXERCISE 6-5

Map arc structures to tables.

1 Using an Explicit Arc Design, develop a table design for this Entity-Relationship Model.

Document your design on the provided Table Instance Charts.

Table Name: STUDENT

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 184: Data Modeling

Exercise 6-5 - cont'd

Table Name: COUNTY

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name: OTHER STATE

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name: FOREIGN COUNTRY

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 185: Data Modeling

Exercise 6-5 - cont'd

2 Using a Generic Arc design, develop a table design for this Entity-Relationship Model.

Document your design on the provided Table Instance Charts.

Table Name: STUDENT

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 186: Data Modeling

Exercise 6-5 - cont'd

Table Name: COUNTY

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name: OTHER STATE

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name: FOREIGN COUNTRY

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 187: Data Modeling

CHOOSE SUBTYPE OPTIONS

Choose from three options for mapping subtypes to tables.

Subtype Table Mapping Options

• Single Table Design

• Separate Tables Design

• Arc Implementation (see Appendix E, p. E-4)

Example

In the following supertype/subtype construct, the EMPLOYEE, EXEMPT

EMPLOYEE, and NON-EXEMPT EMPLOYEE entities may be mapped to one,

two, or three tables, depending upon the subtype table mapping option selected.

Page 188: Data Modeling

Choose Subtype Options - cont'd

Option 1 - Single Table Subtype Design

Map the subtypes onto a single table for the supertype. The

single table will contain instances of all subtypes.

Create

• a single table for the supertype.

• a TYPE column to identify which subtype each row belongs to.

• a column for each of the supertype's attributes.

• a column for each of the subtypes' attributes.

• FK columns for each of the supertype's relationships.

• FK columns for each of the subtypes' relationships.

Use a single table design when the subtypes have few subtype-specific

attributes and relationships.

Page 189: Data Modeling

Choose Subtype Options - cont'd

Option 1 - Single Table Subtype Design

Example

Map the EMPLOYEE supertype and its subtypes onto a single EMPLOYEE

table.

Table Name: EMPLOYEE

Column Name    BADGE_NUM  FNAME    LNAME     EMP_TYPE  EE_SALARY  NE_HOURLY_RATE  NE_OVERTIME_RATE  NE_UNION_NUM  DEPT_CODE
Key Type       PK                                                                                   FK1           FK2
Nulls/Unique   NN, U      NN       NN        NN                                                                   NN
Sample Data    4579       JAMES    JOYCE     E         29000                                                      40
               6631       KAREN    DIDONATO  E         25000                                                      35
               1190       MICHAEL  WEINTER   E         42700                                                      40
               370        MARIA    PENA      E         44050                                                      30
               800        TERRY    SMITH     E         38450                                                      35
               7147       JOE      SMITH     NE                   8.50            12.75             201           35
               6794       JULIA    WALKER    NE                   6.75            11.50             150           30
               941        HARRY    KAPLIN    NE                   12.00           18.00             201           45
               1020       JOSE     GOMEZ     NE                   9.50            16.15             201           30
               3500       CLYDE    JONES     NE                   10.50           15.75             180           45

Page 190: Data Modeling

Choose Subtype Options - cont'd

Option 1 - Single Table Subtype Design

The columns of the EMPLOYEE table are derived from the attributes and

relationships of the supertype and all its subtypes.

Entity Type   Columns for Attributes                        FK Columns for Relationships
Supertype     BADGE_NUM, FNAME, LNAME                       DEPT_CODE
Subtype       EE_SALARY, NE_HOURLY_RATE, NE_OVERTIME_RATE   NE_UNION_NUM

Quick Note

• The single table subtype design requires that a new type column be created to identify each

row's subtype. The EMP_TYPE column was added to the EMPLOYEE table for this purpose.

Page 191: Data Modeling

Choose Subtype Options - cont'd

Option 1 - Single Table Subtype Design

Use the Single Table Subtype Design when there are few

subtype-specific attributes and relationships.

Design Advantages

• Access to the supertype is straightforward.

• The subtypes can be accessed and modified using views.

Design Disadvantages

• Subtype NOT NULL requirements cannot be enforced at the database level.

• Application logic will have to cater to different sets of attributes, depending on TYPE.
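As a sketch of the single table design, the statements below create the EMPLOYEE table and one view per subtype. Column names follow the Table Instance Chart; the data types, the view names, and the EMP_TYPE values 'E' and 'NE' are assumptions made for illustration.

```sql
-- Single table for the supertype and both subtypes.
CREATE TABLE EMPLOYEE
  (BADGE_NUM        NUMBER(5)   NOT NULL PRIMARY KEY,
   FNAME            CHAR(15)    NOT NULL,
   LNAME            CHAR(15)    NOT NULL,
   EMP_TYPE         VARCHAR2(2) NOT NULL,  -- 'E' or 'NE' (assumed codes)
   EE_SALARY        NUMBER(7),             -- exempt employees only
   NE_HOURLY_RATE   NUMBER(5,2),           -- non-exempt employees only
   NE_OVERTIME_RATE NUMBER(5,2),           -- non-exempt employees only
   NE_UNION_NUM     NUMBER(3),             -- non-exempt employees only
   DEPT_CODE        NUMBER(2)   NOT NULL);

-- One view per subtype hides the columns that do not apply to it.
CREATE VIEW EXEMPT_EMPLOYEE AS
  SELECT BADGE_NUM, FNAME, LNAME, EE_SALARY SALARY, DEPT_CODE
  FROM   EMPLOYEE
  WHERE  EMP_TYPE = 'E';

CREATE VIEW NON_EXEMPT_EMPLOYEE AS
  SELECT BADGE_NUM, FNAME, LNAME,
         NE_HOURLY_RATE   HOURLY_RATE,
         NE_OVERTIME_RATE OT_RATE,
         NE_UNION_NUM     UNION_NUM,
         DEPT_CODE
  FROM   EMPLOYEE
  WHERE  EMP_TYPE = 'NE';
```

The subtype-specific columns must stay nullable in the base table, because each one is empty for every row of the other subtype. That is why their NOT NULL requirements cannot simply be declared on the columns.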

Page 192: Data Modeling

Choose Subtype Options - cont'd

Option 2 - Separate Tables Subtype Design

Map the subtypes onto separate tables - one for each

subtype. Each table will contain only instances of that

subtype.

Create

• a table for each subtype.

• a column for each attribute of a subtype in that subtype's table.

• a column for each attribute of the supertype in each of the subtypes' tables.

• an FK column for each relationship to a subtype in that subtype's table.

• an FK column for each relationship to the supertype in each of the subtypes' tables.

Page 193: Data Modeling

Choose Subtype Options - cont'd

Option 2 - Separate Tables Subtype Design

Example

Map the EMPLOYEE supertype onto two tables - one for each subtype. First

create a separate table for the EXEMPT EMPLOYEE subtype.

Table Name: EXEMPT_EMPLOYEE

Column Name    BADGE_NUM  FNAME    LNAME     SALARY  DEPT_CODE
Key Type       PK                                    FK
Nulls/Unique   NN, U      NN       NN        NN      NN
Sample Data    4579       JAMES    JOYCE     29000   40
               6631       KAREN    DIDONATO  25000   35
               1190       MICHAEL  WEINER    42700   40
               370        MARIA    PENA      44050   30
               800        TERRY    SMITH     38450   35

Page 194: Data Modeling

Choose Subtype Options - cont'd

Option 2 - Separate Tables Subtype Design

Example - cont'd

Then create a separate table for the NON-EXEMPT EMPLOYEE subtype.

Table Name: NON_EXEMPT_EMPLOYEE

Column Name    BADGE_NUM  FNAME    LNAME     HOURLY_RATE  OT_RATE  UNION_NUM  DEPT_CODE
Key Type       PK                                                  FK1        FK2
Nulls/Unique   NN, U      NN       NN        NN           NN       NN         NN
Sample Data    7147       JOE      SMITH     8.50         12.75    201        35
               6794       JULIA    WALKER    6.75         11.50    150        30
               941        HARRY    KAPLIN    12.00        18.00    201        45
               1020       JOSE     GOMEZ     9.50         16.15    201        30
               3500       CLYDE    JONES     10.50        15.75    180        45

Page 195: Data Modeling

Choose Subtype Options - cont'd

Option 2 - Separate Tables Subtype Design

Use a Separate Tables Subtype Design when there are

many subtype-specific attributes or relationships.

Design Advantages

• The subtype's attribute optionality is enforced at the database level.

• Application logic does not require checks for subtypes.

Design Disadvantages

• Access to the supertype requires the UNION operator or a view with the UNION operator.

• Views that join the two tables are display only.

• Application program code must be specific to the individual subtype tables.

• Maintenance of UID's across subtypes is difficult to implement.
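The first and last disadvantages can be illustrated with a short sketch. It assumes the EXEMPT_EMPLOYEE and NON_EXEMPT_EMPLOYEE tables shown on the previous pages; the view name EMPLOYEE and the EMP_TYPE literals are hypothetical.

```sql
-- Access to the supertype: a view that unions the subtype tables,
-- keeping only the columns the two tables have in common.
CREATE VIEW EMPLOYEE AS
  SELECT BADGE_NUM, FNAME, LNAME, DEPT_CODE, 'E' EMP_TYPE
  FROM   EXEMPT_EMPLOYEE
  UNION
  SELECT BADGE_NUM, FNAME, LNAME, DEPT_CODE, 'NE' EMP_TYPE
  FROM   NON_EXEMPT_EMPLOYEE;

-- UID maintenance: each table's primary key guards only its own rows,
-- so a badge number used in both tables must be searched for.
-- This query should return no rows.
SELECT E.BADGE_NUM
FROM   EXEMPT_EMPLOYEE E, NON_EXEMPT_EMPLOYEE N
WHERE  E.BADGE_NUM = N.BADGE_NUM;
```

A view built with UNION can be queried but not used for INSERT, UPDATE, or DELETE, so changes have to be addressed to the individual subtype tables.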

Page 196: Data Modeling

EXERCISE 6-6

Map subtypes to tables.

1 Using a Single Table Subtype Design, develop a table design for this Entity-Relationship

Model. Document your design on the supplied Table Instance Charts. Sample data is not

required.

Table Name: PRODUCT Table Name: ORDER

Page 197: Data Modeling

Exercise 6-6 - cont'd

Table Name: ORDER_LINE

Column

Name

Key Type

Nulls/

Unique

Sample

Data

Page 198: Data Modeling

Exercise 6-6 - cont'd

2 Using a Separate Tables Subtype Design, develop a table design for this Entity-Relationship

Model. Document your design on the supplied Table Instance Charts. Sample data is not

required.

Table Name: PRODUCT Table Name: ORDER

Page 199: Data Modeling

Exercise 6-6 - cont'd

Table Name: PRODUCT_ORDER_LINE

Column

Name

Key Type

Nulls/

Unique

Sample

Data

Table Name: SERVICE_ORDER_LINE

Column

Name

Key Type

Nulls/

Unique

Sample

Data

Page 200: Data Modeling

REVIEW: INITIAL DATABASE DESIGN

Map an Entity-Relationship Model to an initial database design using the

following interrelated steps.

Steps for Mapping Entity-Relationship Models

1 Map simple entities to tables.

2 Map attributes to columns and document sample data.

3 Map UID's to Primary Keys.

4 Map relationships to Foreign Keys.

5 Choose arc options.

6 Choose subtype options.

Document an initial database design on Table Instance Charts.

Page 201: Data Modeling

7

TABLE NORMALIZATION

Page 202: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

1 Define normalization and explain its benefits.

2 Place tables in Third Normal Form.

3 Explain how conceptual data modelling rules ensure normalized tables.

Page 203: Data Modeling

NORMALIZE TABLES

Categorize tables according to their degree of normalization.

Normal Form Rule            Description

First Normal Form (1NF)     The data must be expressed as a set of unordered, two-dimensional tables. The table cannot contain repeating groups.

Second Normal Form (2NF)    The table must be in 1NF. Every non-key column must be dependent on all parts of the primary key.

Third Normal Form (3NF)     The table must be in 2NF. No non-key column may be functionally dependent on another non-key column.

"Each non-primary key value MUST be dependent on the key, the whole

key, and nothing but the key."

Why normalize tables?

• Normalization minimizes data redundancy. Unnormalized data is redundant.

• Data redundancy causes integrity problems. Update and delete transactions may not be

consistently applied to all copies of the data, causing inconsistencies in the data.

• Normalization helps identify missing entities, relationships, and tables.

Quick Notes

• Third normal form is the generally accepted goal for a database design that eliminates

redundancy.

• Higher normal forms are not widely used.

Page 204: Data Modeling

RECOGNIZE UNNORMALIZED DATA

Unnormalized data does not comply with any of the rules of normalization.

Example

Consider the following set of data. Three variable length records are shown - one for each

ORDER_ID. Why is this data unnormalized?

ORDER ID  DATE  CUSTOMER ID  CUSTOMER NAME  STATE  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      6/23  101          Volleyrite     IL     3786      net           3         35.00
                                                   4011      racket        6         65.00
                                                   9132      3-pack        8         4.75
2302      6/25  107          Herman's       WI     5794      6-pack        4         5.00
2303      6/26  110          We-R-Sports    MI     4011      racket        2         65.00
                                                   3141      cover         2         10.00

It contains a repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE. First Normal Form prohibits repeating groups.

Page 205: Data Modeling

CONVERT TO FIRST NORMAL FORM

Remove any repeating groups.

Steps

1 Remove the repeating group from the base table.

2 Create a new table with the PK of the base table and the repeating group.

Example

Convert the following set of unnormalized data to First Normal Form.

ORDER ID  DATE  CUSTOMER ID  CUSTOMER NAME  STATE  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
2301      6/23  101          Volleyrite     IL     3786      net           3         35.00
                                                   4011      racket        6         65.00
                                                   9132      3-pack        8         4.75
2302      6/25  107          Herman's       WI     5794      6-pack        4         5.00
2303      6/26  110          We-R-Sports    MI     4011      racket        2         65.00
                                                   3141      cover         2         10.00

Remove the repeating group of ITEM NUM, ITEM DESCRIPTION, QUANTITY, and PRICE. The PK of the remaining table is ORDER ID. Create a new ORDER_ITEM table with ORDER ID and the repeating group.

ORDER

ORDER ID  DATE  CUSTOMER ID  CUSTOMER NAME  STATE
PK
2301      6/23  101          Volleyrite     IL
2302      6/25  107          Herman's       WI
2303      6/26  110          We-R-Sports    MI

ORDER ITEM

ORDER ID  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
PK,FK     PK
2301      3786      net           3         35.00
2301      4011      racket        6         65.00
2301      9132      3-pack        8         4.75
2302      5794      6-pack        4         5.00
2303      4011      racket        2         65.00
2303      3141      cover         2         10.00
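The two First Normal Form tables might be declared as sketched below. This is an illustration only: the data types and NOT NULL choices are assumptions, column names are written with underscores, the table is called ORDERS because ORDER is a reserved word in SQL, and the DATE column is renamed ORDER_DATE for the same reason.

```sql
-- Base table: one row per order, repeating group removed.
CREATE TABLE ORDERS
  (ORDER_ID      NUMBER(4) NOT NULL PRIMARY KEY,
   ORDER_DATE    DATE      NOT NULL,
   CUSTOMER_ID   NUMBER(3) NOT NULL,
   CUSTOMER_NAME CHAR(20)  NOT NULL,
   STATE         CHAR(2)   NOT NULL);

-- New table: the PK of the base table plus the repeating group.
-- One row per item on an order, so the PK is composite.
CREATE TABLE ORDER_ITEM
  (ORDER_ID     NUMBER(4)   NOT NULL REFERENCES ORDERS (ORDER_ID),
   ITEM_NUM     NUMBER(4)   NOT NULL,
   ITEM_DESCRIP CHAR(20)    NOT NULL,
   QUANTITY     NUMBER(3)   NOT NULL,
   PRICE        NUMBER(6,2) NOT NULL,
   PRIMARY KEY (ORDER_ID, ITEM_NUM));
```

Every row in both tables now has the same fixed set of single-valued columns, which is what First Normal Form requires.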

Page 206: Data Modeling

CONVERT TO SECOND NORMAL FORM

Remove any non-key columns that are not dependent upon the table's

entire primary key.

Steps

1 Determine which non-key columns are not dependent upon the table's entire primary key.

2 Remove those columns from the base table.

3 Create a second table with those columns and the column(s) from the PK that they are

dependent upon.

Example

Put the following table in 2NF.

ORDER

ORDER ID  DATE  CUSTOMER ID  CUSTOMER NAME  STATE
PK
2301      6/23  101          Volleyrite     IL
2302      6/25  107          Herman's       WI
2303      6/26  110          We-R-Sports    MI

The ORDER table is already in 2NF. Any value of ORDER_ID uniquely determines a single value of each column. Therefore, all columns are dependent on the PK ORDER_ID.

Quick Notes

• If any non-key column is not dependent upon the entire primary key, the table is not in 2NF.

• Any table with a single column primary key is automatically in 2NF.

Page 207: Data Modeling

Convert to Second Normal Form - cont'd

Remove any non-key columns that are not dependent upon the table's

entire primary key.

Example

Put the following table in 2NF.

ORDER ITEM

ORDER ID  ITEM NUM  ITEM DESCRIP  QUANTITY  PRICE
PK,FK     PK
2301      3786      net           3         35.00
2301      4011      racket        6         65.00
2301      9132      3-pack        8         4.75
2302      5794      6-pack        4         5.00
2303      4011      racket        2         65.00
2303      3141      cover         2         10.00

The ORDER_ITEM table is not in 2NF since PRICE and DESCRIPTION are dependent upon ITEM NUM, but not dependent upon ORDER ID. To convert the table to 2NF, remove any partially dependent columns. Create an ITEM table with those columns and the column from the PK that they are dependent upon.

ORDER ITEM

ORDER ID  ITEM NUM  QUANTITY
PK,FK     PK,FK
2301      3786      3
2301      4011      6
2301      9132      8
2302      5794      4
2303      4011      2
2303      3141      2

ITEM

ITEM NUM  DESCRIPTION  PRICE
PK
3786      net          35.00
4011      racket       65.00
9132      3-pack       4.75
5794      6-pack       5.00
3141      cover        10.00
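When existing data is available, a partial dependency can be checked before the table is split. The sketch below assumes a First Normal Form table named ORDER_ITEM with columns ORDER_ID, ITEM_NUM, ITEM_DESCRIP, QUANTITY, and PRICE; the data types of the new ITEM table are guesses.

```sql
-- Does any item appear with more than one description or price?
-- Rows returned here would contradict the assumption that ITEM_NUM
-- alone determines both columns.
SELECT   ITEM_NUM
FROM     ORDER_ITEM
GROUP BY ITEM_NUM
HAVING   COUNT(DISTINCT ITEM_DESCRIP) > 1
      OR COUNT(DISTINCT PRICE) > 1;

-- New table for the partially dependent columns and the part of the
-- PK that they depend upon.
CREATE TABLE ITEM
  (ITEM_NUM    NUMBER(4)   NOT NULL PRIMARY KEY,
   DESCRIPTION CHAR(20)    NOT NULL,
   PRICE       NUMBER(6,2) NOT NULL);

-- One row per item: item 4011 appears on two orders but is stored once.
INSERT INTO ITEM (ITEM_NUM, DESCRIPTION, PRICE)
  SELECT DISTINCT ITEM_NUM, ITEM_DESCRIP, PRICE
  FROM   ORDER_ITEM;
```

An empty result from the first query is only consistent with the dependency; it does not prove it. Whether ITEM NUM determines PRICE is a business rule that must be confirmed with the users.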

Page 208: Data Modeling

CONVERT TO THIRD NORMAL FORM

Remove any columns that are dependent upon another non-key column.

Steps

1 Determine which columns are dependent upon another non-key column.

2 Remove those columns from the base table.

3 Create a second table with those columns and the non-key column that they are dependent

upon.

Example

Put the ORDER table in Third Normal Form.

ORDER

ORDER ID  DATE  CUSTOMER ID  CUSTOMER NAME  STATE
PK
2301      6/23  101          Volleyrite     IL
2302      6/25  107          Herman's       WI
2303      6/26  110          We-R-Sports    MI

CUSTOMER NAME and STATE are dependent upon CUSTOMER ID. CUSTOMER ID is not the PK. Therefore, the ORDER table is not in 3NF. Move the dependent non-key columns, together with the non-key column they depend upon, into a new CUSTOMER table.

ORDER

ORDER ID  DATE  CUSTOMER ID
PK              FK
2301      6/23  101
2302      6/25  107
2303      6/26  110

CUSTOMER

CUSTOMER ID  CUSTOMER NAME  STATE
PK
101          Volleyrite     IL
107          Herman's       WI
110          We-R-Sports    MI

Quick Note

• A table is in Third Normal Form if no non-key column is functionally dependent upon another

non-key column.
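The Third Normal Form result for the ORDER example might be declared as follows. This is a sketch: the data types are assumptions, ORDERS and ORDER_DATE are used because ORDER and DATE are reserved words, and the UPDATE is a hypothetical change of address.

```sql
-- The columns that depended upon CUSTOMER_ID move to their own table.
CREATE TABLE CUSTOMER
  (CUSTOMER_ID   NUMBER(3) NOT NULL PRIMARY KEY,
   CUSTOMER_NAME CHAR(20)  NOT NULL,
   STATE         CHAR(2)   NOT NULL);

-- CUSTOMER_ID stays behind as a foreign key.
CREATE TABLE ORDERS
  (ORDER_ID    NUMBER(4) NOT NULL PRIMARY KEY,
   ORDER_DATE  DATE      NOT NULL,
   CUSTOMER_ID NUMBER(3) NOT NULL REFERENCES CUSTOMER (CUSTOMER_ID));

-- A customer's state is now stored once, so this changes one row
-- no matter how many orders the customer has placed.
UPDATE CUSTOMER
SET    STATE = 'WI'
WHERE  CUSTOMER_ID = 101;
```

In the unnormalized design the same change would have to be applied to every order row for the customer, which is the integrity risk that normalization removes.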

Page 209: Data Modeling

Convert to Third Normal Form - cont'd

No non-key column can be functionally dependent upon another non-key

column.

Example

Consider the ORDER_ITEM table. Is it in 3NF? Why or why not?

ORDER ITEM

ORDER ID  ITEM NUM  QUANTITY
PK,FK     PK,FK
2301      3786      3
2301      4011      6
2301      9132      8
2302      5794      4
2303      4011      2
2303      3141      2

All non-key attributes are dependent on the key, the whole key, and nothing but the key. The

ORDER_ITEM table is in 3NF.

Example

Consider the ITEM table. Is it in 3NF? Why or why not?

ITEM

ITEM NUM  DESCRIPTION  PRICE
PK
3786      net          35.00
4011      racket       65.00
9132      3-pack       4.75
5794      6-pack       5.00
3141      cover        10.00

All non-key attributes are dependent on the key, the whole key, and nothing but the key. The ITEM

table is in 3NF.

Page 210: Data Modeling

EXERCISE 7-1

Normalize a set of data.

1. Put the following data into First, Second, and Third Normal Form on the supplied Table

Instance Charts. Three variable length records are shown-one for each EMP_NUM.

EMPLOYEE

First Normal Form

Table Name: Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name: Column Name

Key Type

Nulls/ Unique

Sample Data

Page 211: Data Modeling

Exercise 7-1 - cont'd

Second Normal Form

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 212: Data Modeling

Exercise 7-1 - cont'd

Third Normal Form

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 213: Data Modeling

Exercise 7-1 - cont'd

Third Normal Form - cont'd

Table Name:

Column Name

Key Type

Nulls/ Unique

Sample Data

Page 214: Data Modeling

NORMALIZE DURING DATA MODELLING

Ensure a 3NF table design by following the rules of data modelling.

First Normal Form Rule

• A table must contain no repeating groups.

Corresponding Data Modelling Rule

• All attributes must be single-valued.

Example

Is the entity CLIENT in 1NF? If not, how could it be converted to 1NF?

The attribute date contacted has multiple values; therefore, the entity CLIENT is not in 1NF. Create an

additional entity CONTACT with a M:1 relationship to CLIENT.

Create an additional entity and a 1:M relationship to ensure 1NF.

Page 215: Data Modeling

Normalize During Data Modelling - cont'd

Validate each attribute's dependence upon its entity's entire UID.

Second Normal Form Rule

• Every non-key column must be dependent upon all parts of the primary key.

Corresponding Data Modelling Rule

• An attribute must be dependent upon its entity's entire unique identifier.

Example

Are all of the attributes in the following E-R diagram dependent upon their entity's UID?

The attribute bank location is not dependent upon the UID of ACCOUNT. It is dependent upon the UID of BANK. Move the attribute and place it where it depends upon the UID of its entity.

Page 216: Data Modeling

Normalize During Data Modelling - cont'd

Verify attribute placement to ensure a normalized table design.

Third Normal Form Rule

• No non-key column can be functionally dependent upon another non-key column.

Corresponding Data Modelling Rule

• No non-UID attribute can be dependent upon another non-UID attribute.

Example

Are any of the non-UID attributes for this entity dependent upon another non-UID attribute?

The attributes customer name and state are dependent upon the customer id. Create another entity called CUSTOMER with a UID of customer id, and place the attributes accordingly.

Page 217: Data Modeling

8

FURTHER DATABASE DESIGN

Page 218: Data Modeling

SECTION OBJECTIVES

At the end of this section, you will be able to:

1. Specify referential integrity constraints.

2. Design indexes.

3. Understand database views.

4. Evaluate table denormalization.

5. Work with your DBA to plan physical storage usage.

Page 219: Data Modeling

FURTHER DATABASE DESIGN

Review the default table design against the application module's

requirements, and refine and extend the initial design to produce a

complete database design.

Activities

• Define referential integrity constraints.

• Design indexes.

• Establish views.

• Denormalize the database design.

• Plan physical storage usage.

Page 220: Data Modeling

SPECIFY REFERENTIAL INTEGRITY

A foreign key column value must match an existing primary key column

value (or else be NULL). Use referential integrity constraints to specify how

referential integrity is to be maintained.

Delete Constraint

• What happens if a row containing a referenced primary key is deleted?

Update Constraint

• What happens if a referenced primary key is updated? *

* Only an issue if the PK is updateable in the first place.

Page 221: Data Modeling

Specify Referential Integrity - cont'd

Specify a Delete Constraint to define what should happen if a row

containing a referenced primary key is deleted.

Options: CASCADE, RESTRICTED, or NULLIFY (only if NULLs are

allowed)

Example

Consider the EMPLOYEE and DEPARTMENT tables. What should happen if a DEPT_NO for which

employees work is deleted from the DEPARTMENT table?

Table Name: EMPLOYEE Table Name: DEPARTMENT

Option Explanation of Constraint

CASCADE The deletion should cascade to the matching employees. The matching EMPLOYEE rows should also be deleted.

RESTRICTED The deletion should be restricted to only DEPARTMENTS without employees.

NULLIFY The foreign key should be nullified (valid only for FK's allowing NULLs) when the referenced PK is deleted.
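In SQL, the delete rule is declared on the foreign key. The sketch below assumes a database release that enforces foreign key constraints and accepts the ON DELETE clause; the column definitions are simplified assumptions.

```sql
CREATE TABLE DEPARTMENT
  (DEPT_NO NUMBER(2) NOT NULL PRIMARY KEY,
   DNAME   CHAR(20)  NOT NULL);

-- RESTRICTED: with no ON DELETE clause, deleting a DEPARTMENT row that
-- is still referenced by an EMPLOYEE row is rejected.
CREATE TABLE EMPLOYEE
  (EMP_NO  NUMBER(5) NOT NULL PRIMARY KEY,
   LNAME   CHAR(15)  NOT NULL,
   DEPT_NO NUMBER(2) REFERENCES DEPARTMENT (DEPT_NO));

-- CASCADE alternative for the DEPT_NO column:
--   DEPT_NO NUMBER(2) REFERENCES DEPARTMENT (DEPT_NO) ON DELETE CASCADE
--
-- NULLIFY alternative (DEPT_NO must allow NULLs):
--   DEPT_NO NUMBER(2) REFERENCES DEPARTMENT (DEPT_NO) ON DELETE SET NULL
```

Only one of the three rules can apply to a given foreign key, so the choice should be recorded against each relationship during design.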

Page 222: Data Modeling

Specify Referential Integrity - cont'd

Specify an Update Constraint to define what should happen when a

referenced primary key is updated. (The Update Rule is only meaningful if

the PK is updateable.)

Options: CASCADE, RESTRICTED, or NULLIFY (only if NULLs are

allowed)

Example

What should happen if a DEPT_NO for which employees work is changed to another

DEPT_NO?

Table Name: EMPLOYEE Table Name: DEPARTMENT

Option Explanation of Constraint

CASCADE The update should cascade to the matching employees. The matching

EMPLOYEE rows should also be updated to reflect the new PK value.

RESTRICTED The update should be restricted to only DEPARTMENTS without employees.

NULLIFY The foreign key should be nullified (valid only for FK's allowing NULLs) when the referenced PK is updated to a new PK value.

Page 223: Data Modeling

DESIGN INDEXES

An index is associated with a single physical table and contains the values of

one or more columns from that table.

Database Design - Table Instance Chart

Table Name: COURSE

Physical Representations

COURSE Table

I_COURSES_PRIME Index

(Unique)

I_COURSES_2 Index

(Not Unique)

Page 224: Data Modeling

Design Indexes - cont'd

Use indexes to significantly improve data access time.

Indexes

• Provide quick access to rows of data and avoid full table scans

• Facilitate table joins

• Ensure uniqueness of a value if defined as unique

• Are used automatically when referenced in the WHERE clause of a SQL statement if the

column is not modified
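The last point, that the indexed column must not be modified, can be sketched with a few queries. The EMPLOYEE table with EMPNO, LNAME, and SAL columns is borrowed from the Database Build example at the end of this guide; the index name I_EMP_SAL is hypothetical.

```sql
CREATE INDEX I_EMP_SAL ON EMPLOYEE (SAL);

-- The indexed column appears unmodified, so the index can be considered.
SELECT EMPNO, LNAME
FROM   EMPLOYEE
WHERE  SAL > 3000;

-- The column is modified by an expression, so an ordinary index on SAL
-- is typically not used for this predicate.
SELECT EMPNO, LNAME
FROM   EMPLOYEE
WHERE  SAL * 12 > 36000;

-- The same condition rewritten with the column left unmodified.
SELECT EMPNO, LNAME
FROM   EMPLOYEE
WHERE  SAL > 36000 / 12;
```

Moving the arithmetic to the constant side of the comparison keeps the column bare, which gives the RDBMS the chance to use the index.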

For further information on the subject see:

SQL Language Reference Manual, 778-V6.0

ORACLE RDBMS Database Administrator's Guide Version 6.0, 3601-V6.0

Page 225: Data Modeling

Design Indexes - cont'd

A concatenated index is an index created on a group of columns in a single

table. Map a composite key to a concatenated index.

Example

The ENROLLMENT table has a composite PK of COURSE_CODE and ST_ID. Create a concatenated

index called I_ENROLL_PRIME on both columns.

Table Design

Table Name: ENROLLMENT

Column Name    ENROLL_DATE  DATE_COMPLETED  GRADE  COURSE_CODE  ST_ID
Key Type                                           PK,FK1       PK,FK2
Nulls/Unique   NN                                  NN,U1        NN,U1
Sample Data    20-JUL-91    19-AUG-91       -      344          47592
               05-SEP-91    -               -      401          15402
               14-JUN-91    28-JUL-91       A      717          51394
               08-MAY-91    28-JUL-91       B      717          94572
               05-MAY-91    21-MAY-91       A      401          51394

Physical Tables

ENROLLMENT Table

ROW ID  ENROLL_DATE  DATE_COMPLETED  GRADE  COURSE_CODE  ST_ID
5011    20-JUL-91    19-AUG-91       -      344          47592
5012    05-SEP-91    -               -      401          15402
5015    14-JUN-91    28-JUL-91       A      717          51394
5013    08-MAY-91    28-JUL-91       B      717          94572
5014    05-MAY-91    21-MAY-91       A      401          51394

I_ENROLL_PRIME Index (Unique)

COURSE_CODE  ST_ID  ROW ID
344          47592  5011
401          15402  5012
401          51394  5014
717          51394  5015
717          94572  5013
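The index in this example could be created with a statement like the one below. The index and column names come from the example; the queries are illustrative.

```sql
-- Concatenated unique index implementing the composite primary key.
CREATE UNIQUE INDEX I_ENROLL_PRIME
  ON ENROLLMENT (COURSE_CODE, ST_ID);

-- Both index columns supplied: the index locates a single row.
SELECT GRADE
FROM   ENROLLMENT
WHERE  COURSE_CODE = 717
AND    ST_ID = 51394;

-- Leading column only: the index can still narrow the search.
SELECT ST_ID, GRADE
FROM   ENROLLMENT
WHERE  COURSE_CODE = 717;
```

Because the index entries are ordered by COURSE_CODE first, a query that supplies only ST_ID may not be able to use this index. If students are often looked up across all courses, a separate index on ST_ID is worth considering.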

Page 226: Data Modeling

Design Indexes - cont'd

Use indexes to implement keys and to support application access

requirements.

Build Indexes for

• Primary keys (unique indexes)

• Foreign keys (generally non-unique indexes)

Consider indexing

• Alternate keys (unique indexes)

• Any critical non-key columns used in WHERE clauses

• Any search keys

Indexes add storage and update overhead.

Quick Notes

• A unique index references a column or set of columns that has unique values in the table.

• A non-unique index references a column or set of columns that are not unique in a table.

• Be aware that under certain conditions, indexes are not used by the RDBMS.

For further information on the subject see:

ORACLE RDBMS Database Administrator's Guide Version 6.0, 3601-V6.0

ORACLE RDBMS Performance Tuning Manual 5317-V6.0

Page 227: Data Modeling

ESTABLISH VIEWS

Establish database views to meet application access requirements.

Views can be used for:

• restricting access.

• providing referential integrity.

• presenting tables to users in any form.

• pre-packaging complex queries.

• producing rapid prototypes.

• pre-joined base tables in SQL*Forms.

• checking data input.

A view can be thought of as a predefined window onto the database.

Quick Notes

• A view has no data of its own and merely relays information from underlying tables.

• A view is defined by a SELECT statement that is named and stored in the ORACLE Data

Dictionary.

• A view is queried as if it were a table.

Establish Views - cont'd

A View can restrict what the user, designer, or tool sees.

Examples

A View of the EMP table could be used to restrict users from seeing the employees' salaries.
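A view of this kind might look like the sketch below. The EMP column names are assumptions borrowed from the EMPLOYEE table in the Database Build example, and the view name EMP_PUBLIC and the user CLERK are hypothetical.

```sql
-- The view lists every column except the salary and commission columns.
CREATE VIEW EMP_PUBLIC AS
  SELECT EMPNO, FNAME, LNAME, JOB, HIREDATE
  FROM   EMP;

-- Users are granted access to the view, not to the base table.
GRANT SELECT ON EMP_PUBLIC TO CLERK;
```

The restriction is effective only if the users hold no privileges on the EMP table itself.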

Page 228: Data Modeling

A view can be used to present normalized data in a denormalized form.

Example

Following the rules of normalization, the ORDER and CUSTOMER tables are separate.

A view defined across both tables could be used to pre-join the tables so the user would only see a

single table.
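A pre-joined view over the two tables could be defined as sketched here. The table name ORDERS, the column names, and the view name ORDER_CUSTOMER are assumptions; ORDERS and ORDER_DATE are used because ORDER and DATE are reserved words.

```sql
-- Presents the normalized ORDERS and CUSTOMER tables as one table.
CREATE VIEW ORDER_CUSTOMER AS
  SELECT O.ORDER_ID, O.ORDER_DATE,
         C.CUSTOMER_ID, C.CUSTOMER_NAME, C.STATE
  FROM   ORDERS O, CUSTOMER C
  WHERE  O.CUSTOMER_ID = C.CUSTOMER_ID;

-- The user queries the view as if it were a single table.
SELECT ORDER_ID, CUSTOMER_NAME
FROM   ORDER_CUSTOMER
WHERE  STATE = 'IL';
```

The data remains stored in Third Normal Form; only its presentation is denormalized.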

Page 229: Data Modeling

Establish Views - cont'd

Use views with caution. Access through a view is slower because it requires

an extra access to the data dictionary, and may cause query optimization to

be slower.

View Limitations

• For a view based upon a single table, the SQL INSERT, UPDATE, and DELETE commands

have no limitations.

• For multi-table views with virtual columns, INSERT, UPDATE, and DELETE are restricted.

• When accessing tables through a view, it is possible to add rows not visible through the view

unless the WITH CHECK OPTION is specified.
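The last limitation can be shown with a short sketch. It assumes an EMPLOYEE table that includes a DEPTNO column; the view name and the inserted values are hypothetical.

```sql
-- The view shows only the employees of department 30.
CREATE VIEW DEPT30_EMP AS
  SELECT EMPNO, FNAME, LNAME, HIREDATE, DEPTNO
  FROM   EMPLOYEE
  WHERE  DEPTNO = 30
  WITH CHECK OPTION;

-- Rejected: the new row would belong to department 40 and so would
-- not be visible through the view.
INSERT INTO DEPT30_EMP (EMPNO, FNAME, LNAME, HIREDATE, DEPTNO)
VALUES (7999, 'ANN', 'LEE', SYSDATE, 40);
```

Without WITH CHECK OPTION the INSERT would succeed, and the row would then be invisible to the very view that created it.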

Page 230: Data Modeling

DENORMALIZE THE DATABASE DESIGN

Always start with tables in Third Normal Form.

Beware of Denormalization!

• Be extremely reluctant to denormalize the default table design.

• Denormalization can cause data inconsistency problems.

Denormalization may be a solution for transactions with performance

requirements such as:

• high throughput.

• high frequency.

• quick response time.

Consider all other options prior to denormalization, especially adding or

changing the index structure.

Page 231: Data Modeling

Denormalize the Database Design - cont'd

Combining tables is the most common form of denormalization.

Example

Consider the ACCOUNT and BANK tables.

If high-volume account queries always access the bank name, a combined table might be worth the data redundancy. The ACCOUNT table and the BANK table are combined on BANK_NUM.

Page 232: Data Modeling

Denormalize the Database Design - cont'd

Individual codes tables may be combined into a reference table for

validating and decoding coded values for an entire application system.

Example

The following separate codes tables are required for an application system. They are used to provide

the SQL*Forms list of values feature and to validate table values for INSERT or UPDATE.

Combine all the tables into a single table with an additional column, CODE_TYPE, that defines which set of values the code belongs to. Create a view for each CODE_TYPE.
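A combined codes table and one of its views might be declared as sketched below. CHAR_CODE and CODE_TYPE are the names used in this example; the CODE and DESCRIPTION columns, the data types, the code type 'STATE', and the view name are assumptions.

```sql
-- One reference table for all code sets in the application system.
CREATE TABLE CHAR_CODE
  (CODE_TYPE   VARCHAR2(10) NOT NULL,   -- which set of values
   CODE        VARCHAR2(4)  NOT NULL,   -- the coded value
   DESCRIPTION VARCHAR2(40) NOT NULL,   -- the decoded meaning
   PRIMARY KEY (CODE_TYPE, CODE));

-- One view per CODE_TYPE stands in for the original codes table.
CREATE VIEW STATE_CODE AS
  SELECT CODE, DESCRIPTION
  FROM   CHAR_CODE
  WHERE  CODE_TYPE = 'STATE';

-- Validation or list of values for a single code set.
SELECT CODE, DESCRIPTION
FROM   STATE_CODE
ORDER BY CODE;
```

The same code may appear in several code sets, so CODE_TYPE has to be part of the primary key.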

Page 233: Data Modeling

Denormalize the Database Design - cont'd

Establish a companion CODE_TYPE table for validating code description

lengths.

Example

The CHAR_CODE table on the previous page includes four different types of codes. Each of these

code types has a different valid length for its code description. Set up a CODE_TYPE table for

validating the length of the descriptions.

The table contains two columns, CODE_TYPE and LENGTH. LENGTH is the maximum description length for each CODE_TYPE.

Page 234: Data Modeling

Denormalize the Database Design - cont'd

A vector is a one-dimensional array with a fixed number of values - a

repeating group of definite size. Represent vector data as either a set of

rows or a set of columns.

Column-Wise Table Design (3NF)

Row-Wise Table Design

Page 235: Data Modeling

Denormalize the Database Design - cont'd

Choose the table design for vector data based upon the functional access

requirements.

Advantages of a Column-Wise Design

• SQL group functions act on columns, e.g., SUM, AVG.

• Changes in the vector length can be easily accommodated.

Advantages of a Row-Wise Design

• On the input form, all data values can appear on a single line.

• All values can be inserted with a single INSERT statement.

• The storage space requirement is lower.

• Output reports showing all values horizontally are easy to produce.
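The two designs can be compared on a hypothetical vector of four quarterly sales amounts per year. All table and column names below are invented for illustration.

```sql
-- Column-wise: one row per vector element; the values run down a column.
CREATE TABLE QTR_SALES_COL
  (SALES_YEAR NUMBER(4)   NOT NULL,
   QUARTER    NUMBER(1)   NOT NULL,
   AMOUNT     NUMBER(9,2) NOT NULL,
   PRIMARY KEY (SALES_YEAR, QUARTER));

-- Group functions act directly on the AMOUNT column.
SELECT   SALES_YEAR, SUM(AMOUNT), AVG(AMOUNT)
FROM     QTR_SALES_COL
GROUP BY SALES_YEAR;

-- Row-wise: one row per vector; the values run across the row.
CREATE TABLE QTR_SALES_ROW
  (SALES_YEAR NUMBER(4)   NOT NULL PRIMARY KEY,
   Q1_AMOUNT  NUMBER(9,2) NOT NULL,
   Q2_AMOUNT  NUMBER(9,2) NOT NULL,
   Q3_AMOUNT  NUMBER(9,2) NOT NULL,
   Q4_AMOUNT  NUMBER(9,2) NOT NULL);

-- The total must be written out column by column.
SELECT SALES_YEAR,
       Q1_AMOUNT + Q2_AMOUNT + Q3_AMOUNT + Q4_AMOUNT
FROM   QTR_SALES_ROW;
```

Changing the vector from quarters to months only changes the data in the column-wise table, but requires eight new columns and rewritten queries in the row-wise table.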

Page 236: Data Modeling

Denormalize the Database Design - cont'd

Reconsider storing derived data in light of the functional access

requirements and the capabilities of the software development tools.

Example

A regional sales manager has 200 salespersons working for him. He frequently queries the total sales quota and sales-to-date for his region. The sales quotas are established quarterly. Sales data is updated weekly. Maintaining sales quota data by region would be desirable, and maintaining sales-to-date might also be desirable.
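The trade-off in this example can be sketched with Python's sqlite3 module as a stand-in for ORACLE. The tables SALESPERSON and REGION_TOTAL, the region name WEST, and all amounts are hypothetical. The sketch stores the regional totals as derived data and refreshes them after each update of the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALESPERSON (
    ID            INTEGER PRIMARY KEY,
    REGION        TEXT    NOT NULL,
    QUOTA         INTEGER NOT NULL,
    SALES_TO_DATE INTEGER NOT NULL
);
-- Derived data: regional totals stored redundantly for fast queries.
CREATE TABLE REGION_TOTAL (
    REGION              TEXT PRIMARY KEY,
    TOTAL_QUOTA         INTEGER NOT NULL,
    TOTAL_SALES_TO_DATE INTEGER NOT NULL
);
""")

# 200 salespersons in one hypothetical region, each with the same figures.
conn.executemany(
    "INSERT INTO SALESPERSON VALUES (?, 'WEST', 1000, 250)",
    [(i,) for i in range(1, 201)],
)


def computed_totals(region):
    """Derive the totals on demand from the 200 detail rows."""
    return conn.execute(
        "SELECT SUM(QUOTA), SUM(SALES_TO_DATE) FROM SALESPERSON WHERE REGION = ?",
        (region,),
    ).fetchone()


def refresh_region_totals():
    """Rebuild the stored totals; must run after every quota or sales update."""
    conn.execute("DELETE FROM REGION_TOTAL")
    conn.execute(
        "INSERT INTO REGION_TOTAL "
        "SELECT REGION, SUM(QUOTA), SUM(SALES_TO_DATE) "
        "FROM SALESPERSON GROUP BY REGION"
    )


def stored_totals(region):
    """Read the precomputed totals from a single row."""
    return conn.execute(
        "SELECT TOTAL_QUOTA, TOTAL_SALES_TO_DATE FROM REGION_TOTAL WHERE REGION = ?",
        (region,),
    ).fetchone()


refresh_region_totals()
# Weekly sales update: the stored totals are stale until the next refresh.
conn.execute("UPDATE SALESPERSON SET SALES_TO_DATE = SALES_TO_DATE + 10")
stale = stored_totals("WEST")
refresh_region_totals()
fresh = stored_totals("WEST")
print(stale, fresh)
```

Quotas change only quarterly, so the stored quota total rarely needs refreshing. Sales-to-date changes weekly, so its stored total costs more to keep current, which is why the text is less certain about maintaining it.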

Page 237: Data Modeling

PLAN PHYSICAL STORAGE USAGE

Work with the Database Administrator to plan the physical placement of the database tables and indexes.

Considerations

• For each table and index, estimate the amount of disk space required.

• Decide on the placement of tables and indexes on logically separate tablespaces and physically separate disks.

• Define storage allocation parameters based upon the expected patterns of data update and growth.

For further information on the subject see:

ORACLE RDBMS Database Administrator's Guide Version 6.0, 3601-V6.0

Page 238: Data Modeling

SUMMARY: DATABASE DESIGN

Database Design is the process of mapping the information requirements reflected in an Entity-Relationship Model into a relational database.

Activity 1: Initial Database Design

• Map the simple entities to tables.

• Map attributes to columns and document sample data.

• Map unique identifiers to primary keys.

• Map relationships to foreign keys.

• Choose arc options.

• Choose subtype options.

Activity 2: Further Database Design

• Define referential integrity constraints.

• Design indexes.

• Establish views.

• Denormalize the database design.

• Add system support tables.

• Plan physical storage usage.

Page 239: Data Modeling

SUMMARY: DATABASE DEVELOPMENT

This course has covered the first two steps of the top-down database development process. The last step is Database Build.

Page 240: Data Modeling

DATABASE BUILD OVERVIEW

In Database Build, create physical relational database tables to implement the database design.

Example

The following Structured Query Language (SQL) statements will create the DEPARTMENT and EMPLOYEE tables.

SQL> CREATE TABLE DEPARTMENT
       (DEPTNO    NUMBER(2)    NOT NULL PRIMARY KEY,
        DNAME     CHAR(20)     NOT NULL,
        LOC       CHAR(15)     NOT NULL);

SQL> CREATE TABLE EMPLOYEE
       (EMPNO     NUMBER(5)    NOT NULL PRIMARY KEY,
        FNAME     CHAR(15)     NOT NULL,
        LNAME     CHAR(15)     NOT NULL,
        JOB       CHAR(9),
        HIREDATE  DATE         NOT NULL,
        SAL       NUMBER(7,2),
        COMM      NUMBER(7,2),
        MGR       NUMBER(5)    REFERENCES EMPLOYEE (EMPNO),
        DEPTNO    NUMBER(2)    NOT NULL REFERENCES DEPARTMENT (DEPTNO));
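As a rough way to try out these definitions without an ORACLE installation, the sketch below runs equivalent statements through Python's sqlite3 module. SQLite is only a stand-in: it accepts the NUMBER and CHAR type names but does not enforce their lengths, and it checks foreign keys only when the PRAGMA shown is set. MGR is typed NUMBER(5) here so that it matches the EMPNO column it references. The sample rows and the helper `try_insert_employee` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces REFERENCES clauses only when this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT
   (DEPTNO    NUMBER(2)    NOT NULL PRIMARY KEY,
    DNAME     CHAR(20)     NOT NULL,
    LOC       CHAR(15)     NOT NULL);

CREATE TABLE EMPLOYEE
   (EMPNO     NUMBER(5)    NOT NULL PRIMARY KEY,
    FNAME     CHAR(15)     NOT NULL,
    LNAME     CHAR(15)     NOT NULL,
    JOB       CHAR(9),
    HIREDATE  DATE         NOT NULL,
    SAL       NUMBER(7,2),
    COMM      NUMBER(7,2),
    MGR       NUMBER(5)    REFERENCES EMPLOYEE (EMPNO),
    DEPTNO    NUMBER(2)    NOT NULL REFERENCES DEPARTMENT (DEPTNO));
""")

# Hypothetical sample department.
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'SALES', 'CHICAGO')")


def try_insert_employee(empno, deptno, mgr=None):
    """Insert a sample employee; return False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE (EMPNO, FNAME, LNAME, HIREDATE, MGR, DEPTNO) "
            "VALUES (?, 'ANN', 'LEE', '1992-06-01', ?, ?)",
            (empno, mgr, deptno),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert_employee(1, 10)            # department 10 exists
rejected = not try_insert_employee(2, 99)        # no department 99
bad_manager = not try_insert_employee(3, 10, 7)  # no employee 7 to manage
print(accepted, rejected, bad_manager)
```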

For further information on the subject attend:

Introduction to ORACLE for Developers