Physical Database Design DeSiaMorePowered by DeSiaMore 1.



Page 1:

Physical Database Design

DeSiaMore Powered by DeSiaMore 1

Page 2:

Lecture Objectives

Overview of the physical database design process.
Describing volume and usage analysis.
Exploring the design of fields.
Designing physical records and denormalization.

Page 3:

What is Physical Database Design?

Physical database design involves taking the results from the logical design process and fine-tuning them against the usage, performance, and storage requirements of specific applications.

Logical database design is about implementation independence.

Physical database design is about implementation dependence.

Page 4:

Introduction

The purpose of physical database design is to translate the logical description of data into the technical specifications for storing and retrieving data.

The goal is to create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability.

Page 5:

Inputs to Physical Design

Normalized relations
Attribute definitions
Estimations of data processing volume
Descriptions of where and when data are entered, retrieved, deleted, and updated
Response time expectations/requirements
Requirements for data security, backup, recovery, retention, and integrity
Characteristics of the DBMS to be used

Page 6:

What is Physical Database Design

The following activities are part of physical database design:

Volume and Usage Analysis
Integrity Analysis
Control Security Analysis
Data Distribution Analysis

Page 7:

Volume Analysis

It is the first step to be taken to move from logical to physical design.

It aims at establishing estimates of the possible number of instances per entity.

This is useful because it estimates how many instances are most likely to be stored in the system on average.

Page 8:

Volume Analysis

The table below summarises sizing estimates for the student database.

Page 9:

Volume Analysis

Data volumes reflect number of records in tables.

Access frequencies reflect the number of table record accesses per unit of time.

Note which attributes are used in table accesses (to aid the design of table indexes).
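As a rough illustration of how volume and access-frequency estimates might be tabulated, the sketch below uses invented figures for a hypothetical student database. The table names, record counts, record sizes, and access rates are all assumptions made for the example, not values from the lecture.

```python
from dataclasses import dataclass


@dataclass
class TableEstimate:
    name: str
    rows: int               # estimated number of instances (records)
    row_bytes: int          # assumed average record size in bytes
    accesses_per_hour: int  # estimated record accesses per hour

    def storage_bytes(self) -> int:
        # Raw data volume only; indexes and DBMS overhead are ignored.
        return self.rows * self.row_bytes


# Hypothetical figures, chosen only for illustration.
estimates = [
    TableEstimate("Student", rows=5_000, row_bytes=200, accesses_per_hour=400),
    TableEstimate("Course", rows=300, row_bytes=150, accesses_per_hour=250),
    TableEstimate("Enrolment", rows=40_000, row_bytes=40, accesses_per_hour=900),
]

total_bytes = sum(t.storage_bytes() for t in estimates)
busiest = max(estimates, key=lambda t: t.accesses_per_hour)

for t in estimates:
    print(f"{t.name:<10} {t.rows:>7} rows  {t.storage_bytes():>9} bytes  "
          f"{t.accesses_per_hour:>4} accesses/hour")
print("Total raw data volume:", total_bytes, "bytes")
print("Most frequently accessed table:", busiest.name)
```

Under these assumed figures, the most frequently accessed table is the first candidate to examine when choosing indexes.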

Page 10:

Usage Analysis

Usage analysis requires that we identify the major transactions required for a database system.

The transactions considered here consist of a series of insertions, updates, retrievals, deletions, or a mixture of all four.

Page 11:

A sample of Transactions

Below are simple transactions common to a college database:

Register new students
Add new courses
Assign a lecturer to a course
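The three transactions listed above could be sketched as follows, using Python's built-in sqlite3 module. The table and column names are assumptions made for this example and are not the lecture's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student  (student_id  INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE lecturer (lecturer_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course   (course_id   INTEGER PRIMARY KEY, title TEXT NOT NULL,
                           lecturer_id INTEGER REFERENCES lecturer(lecturer_id));
""")


def register_student(name):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        return conn.execute(
            "INSERT INTO student (name) VALUES (?)", (name,)
        ).lastrowid


def add_course(title):
    with conn:
        return conn.execute(
            "INSERT INTO course (title) VALUES (?)", (title,)
        ).lastrowid


def add_lecturer(name):
    with conn:
        return conn.execute(
            "INSERT INTO lecturer (name) VALUES (?)", (name,)
        ).lastrowid


def assign_lecturer(course_id, lecturer_id):
    with conn:
        conn.execute(
            "UPDATE course SET lecturer_id = ? WHERE course_id = ?",
            (lecturer_id, course_id),
        )


student_id = register_student("A. Student")
course_id = add_course("Database Development")
lecturer_id = add_lecturer("B. Lecturer")
assign_lecturer(course_id, lecturer_id)

assigned = conn.execute(
    "SELECT lecturer_id FROM course WHERE course_id = ?", (course_id,)
).fetchone()[0]
print(student_id, course_id, assigned)
```

Usage analysis would then record how often each of these transactions runs and which tables and attributes it touches.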

Page 12:

Group Exercise

Given a particular supermarket database design, identify the various transactions that can be performed on it. The logical design of the database consists of product, customer, and supplier tables.

Page 13:

Physical Design Decisions

Specify the data type for each attribute from the logical data model
Specify physical records by grouping attributes from the logical data model
Specify the file organization technique to use for physical storage of data records
Specify indexes to optimize data retrieval
Specify query optimization strategies

Page 14:

Designing Fields

Field: smallest unit of data in database

Field design:
  Choosing data type
  Coding, compression, encryption
  Controlling data integrity

Page 15:

Choosing Data Types

CHAR: fixed-length character
VARCHAR2: variable-length character (memo)
LONG: large variable-length character data (in Oracle, up to 2 GB)
NUMBER: positive/negative number
INTEGER: positive/negative whole number
DATE: actual date
BLOB: binary large object (good for graphics, sound clips, etc.)

Page 16:

Designing Fields

Choosing the field data type:
  Select from available types such as text, memo, number, date/time, currency, etc.
  Seek to:
    Minimize storage space (e.g., Integer vs. Floating Point)
    Represent all possible values (e.g., Floating Point vs. Integer)
    Improve data integrity (e.g., Yes/No; more on the next slide)
    Support all data manipulations (e.g., Date/Time)

Page 17:

Designing Fields

Controlling data integrity:
  Default value: e.g., value “FL” for the State field
  Range control: e.g., value “<=100” for the Test_Score field
  Null value control: e.g., prohibit leaving the Date_of_Birth field blank
  Referential integrity: e.g., restrict valid values for the Part_No field in the Order table to the contents of this field in the Part table
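The four integrity controls above map onto standard SQL column constraints. The sketch below shows one possible way to declare them, run through Python's built-in sqlite3 module. The table layouts are assumptions made for the example: the State, Test_Score, and Date_of_Birth fields are placed together in a hypothetical student table, and the Order table is named orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE part (part_no INTEGER PRIMARY KEY);

    CREATE TABLE student (
        student_id    INTEGER PRIMARY KEY,
        state         TEXT NOT NULL DEFAULT 'FL',          -- default value
        test_score    INTEGER CHECK (test_score <= 100),   -- range control
        date_of_birth TEXT NOT NULL                        -- null value control
    );

    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        part_no  INTEGER NOT NULL REFERENCES part(part_no) -- referential integrity
    );

    INSERT INTO part (part_no) VALUES (1);
""")


def rejected(sql, params=()):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# Default value: state is filled in when it is omitted.
with conn:
    conn.execute(
        "INSERT INTO student (test_score, date_of_birth) VALUES (?, ?)",
        (88, "2001-05-17"),
    )
default_state = conn.execute("SELECT state FROM student").fetchone()[0]

score_rejected = rejected(
    "INSERT INTO student (test_score, date_of_birth) VALUES (?, ?)",
    (101, "2001-05-17"),
)
null_rejected = rejected(
    "INSERT INTO student (test_score, date_of_birth) VALUES (?, ?)",
    (90, None),
)
bad_part_rejected = rejected("INSERT INTO orders (part_no) VALUES (?)", (99,))
good_part_rejected = rejected("INSERT INTO orders (part_no) VALUES (?)", (1,))

print(default_state, score_rejected, null_rejected,
      bad_part_rejected, good_part_rejected)
```

Declaring the rules in the schema means the DBMS enforces them for every application that writes to the table, rather than relying on each program to check them.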

Page 18:

Designing Fields

Fixed-Length Fields:
  Make it easy to locate a specific record in a file and/or a specific field in that record.
  Each field has its maximum length specified, and unused space in any given field is padded with spaces (text) or leading zeros (numeric).

Variable-Length Fields:
  When the need arises for a variable-length field (e.g., a memo field), this field can be stored separately from the rest of the record, with a pointer used to locate it when needed.
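To see why fixed-length fields make records easy to locate, the sketch below packs records into a byte string and reads record n back by computing its offset directly. The record layout (a 4-byte integer, a 20-character name, and a 2-character state code) and the sample data are assumptions made for the example.

```python
import struct

# Hypothetical layout: 4-byte integer id, 20-byte name, 2-byte state code.
# "<" selects little-endian with no alignment padding, so the size is 26 bytes.
RECORD = struct.Struct("<i20s2s")


def pack_record(rec_id, name, state):
    # Text fields are padded with spaces up to their maximum length.
    return RECORD.pack(
        rec_id,
        name.ljust(20).encode("ascii"),
        state.ljust(2).encode("ascii"),
    )


def read_record(data, n):
    # Record n starts at a byte offset that can be computed without scanning.
    offset = n * RECORD.size
    rec_id, name, state = RECORD.unpack_from(data, offset)
    return rec_id, name.decode("ascii").rstrip(), state.decode("ascii").rstrip()


file_image = b"".join([
    pack_record(1, "Asha", "FL"),
    pack_record(2, "Brian", "NY"),
    pack_record(3, "Chen", "CA"),
])

print(RECORD.size)
print(read_record(file_image, 2))
```

With variable-length fields this offset arithmetic no longer works, which is one reason such fields are often stored apart from the main record and reached through a pointer.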

Page 19:

Physical Records

Physical Record: “A group of fields stored in adjacent memory locations and retrieved together as a unit.”

Page: “The amount of data read or written in one secondary memory (disk) input or output operation.”

Blocking Factor: “The number of physical records per page.”
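A small worked example may help relate these three definitions. The sketch below assumes 4,096-byte pages, 200-byte records, and records that never span two pages. These sizes are illustrative and are not taken from the lecture.

```python
import math


def blocking_factor(page_bytes: int, record_bytes: int) -> int:
    # Whole records per page; records are assumed not to span pages.
    return page_bytes // record_bytes


def pages_needed(num_records: int, page_bytes: int, record_bytes: int) -> int:
    # Pages required to hold the whole file at that blocking factor.
    return math.ceil(num_records / blocking_factor(page_bytes, record_bytes))


bf = blocking_factor(4096, 200)          # 20 records per page
unused = 4096 - bf * 200                 # 96 bytes left over on each page
pages = pages_needed(10_000, 4096, 200)  # 500 pages for 10,000 records

print(bf, unused, pages)
```

A higher blocking factor means fewer pages, and therefore fewer disk reads, for the same number of records.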

Page 20:

Database Access Model

The goal in structuring physical records is to minimize performance bottlenecks resulting from disk accesses (accessing data from disk is slow compared to main memory).

Page 21:

Optimization Decisions

Denormalization
Partitioning
Selection of File Organization
Creation of Indexes

Page 22:

Denormalisation

The main problem with a fully normalised database is that it has many tables.

To perform useful queries such tables have to be reconstituted via expensive join operations.

Updates frequently have to be performed across more than one table.

Page 23:

Denormalisation

One obvious way of improving retrieval or update performance is to go back from a fully normalized database and introduce some controlled redundancy.

Page 24:

Definition

“The process of transforming normalized relations into unnormalized physical record specifications [for the purpose of improving overall database performance].” or

Page 25:

Definition

Denormalization is a technique to move from higher to lower normal forms of database modeling in order to speed up database access.

You may apply denormalization in the process of deriving a physical data model from a logical one.

Page 26:

Example

Four examples of strict violations of normalization are shown in the schema below:

ORDER (Order No, Customer No, Customer Name, Customer Address, Order Date)

ORDER LINE (Order No, Line No, Customer No, Customer Name, Customer Address, Product Code, Unit Count, Unit Price, Total Price, Required By Date)

Page 27:

Example

From the schema above:

It can be assumed that Customer Name and Customer Address have been copied from a Customer table with primary key Customer No.

Customer No has been copied from the Order table to the Order Line table.

Page 28:

Example

It can be assumed that Unit Price has been copied from a Product table with primary key Product Code.

Total Price can be calculated by multiplying Unit Price by Unit Count.

Page 29:

Example……Benefits

Changes such as this are intended to offer performance benefits for some transactions.

For example, a query on the Order Line table that also requires the Customer No does not have to also access the Order table.

Page 30:

Example……Benefits

However, there is a downside: each such additional column must be carefully controlled.

It should not be possible for users to update it directly.

It must be updated automatically by the application (e.g., via a DBMS trigger).
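One way a trigger might keep a redundant column consistent is sketched below for the derived Total Price column, using SQLite through Python's built-in sqlite3 module. The table is a cut-down, assumed version of ORDER LINE, and the trigger names are invented for the example. Trigger syntax differs between DBMS products, so this is an illustration rather than a portable recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (
        order_no    INTEGER,
        line_no     INTEGER,
        unit_count  INTEGER NOT NULL,
        unit_price  REAL NOT NULL,
        total_price REAL,          -- redundant: unit_price * unit_count
        PRIMARY KEY (order_no, line_no)
    );

    -- Fill in the derived column whenever a line is inserted.
    CREATE TRIGGER order_line_total_ins
    AFTER INSERT ON order_line
    BEGIN
        UPDATE order_line
           SET total_price = NEW.unit_price * NEW.unit_count
         WHERE order_no = NEW.order_no AND line_no = NEW.line_no;
    END;

    -- Recompute it whenever either source column changes.
    CREATE TRIGGER order_line_total_upd
    AFTER UPDATE OF unit_count, unit_price ON order_line
    BEGIN
        UPDATE order_line
           SET total_price = NEW.unit_price * NEW.unit_count
         WHERE order_no = NEW.order_no AND line_no = NEW.line_no;
    END;
""")

with conn:
    conn.execute(
        "INSERT INTO order_line (order_no, line_no, unit_count, unit_price) "
        "VALUES (1, 1, 4, 2.5)"
    )
total_after_insert = conn.execute(
    "SELECT total_price FROM order_line"
).fetchone()[0]

with conn:
    conn.execute(
        "UPDATE order_line SET unit_count = 6 WHERE order_no = 1 AND line_no = 1"
    )
total_after_update = conn.execute(
    "SELECT total_price FROM order_line"
).fetchone()[0]

print(total_after_insert, total_after_update)
```

This sketch keeps the column up to date but does not by itself stop a user from overwriting total_price directly. That would need additional permissions or a further trigger.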

Page 31:

Partitioning

Horizontal Partitioning: Distributing the rows of a table into two or more separate files
  e.g., Customer table is partitioned into four separate files, one for each geographical region

Vertical Partitioning: Distributing the columns of a table into two or more separate files
  e.g., Employee table is partitioned into a public file (name, office, extension, etc.) and a private file (salary, health history, etc.)
  Note: the primary key is repeated in each file
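The two kinds of partitioning can be sketched on in-memory rows as follows. The sample customers, employees, and column names are invented for the example. A real DBMS would store each partition as a separate file or table.

```python
from collections import defaultdict


def horizontal_partition(rows, key):
    # Split the rows into groups according to the value of one column.
    parts = defaultdict(list)
    for row in rows:
        parts[row[key]].append(row)
    return dict(parts)


def vertical_partition(rows, primary_key, column_groups):
    # Split the columns into groups; every partition repeats the primary key
    # so that the original rows can be rebuilt by joining on it.
    return {
        part_name: [
            {primary_key: row[primary_key], **{col: row[col] for col in cols}}
            for row in rows
        ]
        for part_name, cols in column_groups.items()
    }


customers = [
    {"customer_no": 1, "name": "Ali", "region": "North"},
    {"customer_no": 2, "name": "Bea", "region": "South"},
    {"customer_no": 3, "name": "Cem", "region": "North"},
]
employees = [
    {"emp_no": 1, "name": "Dee", "office": "B12", "salary": 52000},
    {"emp_no": 2, "name": "Eli", "office": "C03", "salary": 61000},
]

by_region = horizontal_partition(customers, "region")
emp_parts = vertical_partition(
    employees, "emp_no",
    {"public": ["name", "office"], "private": ["salary"]},
)

print(sorted(by_region))
print(emp_parts["public"][0])
print(emp_parts["private"][0])
```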

Page 32:

Partitioning

Advantages of Partitioning:
  Records used together are grouped together
  Each partition can be optimized for performance
  Security and recovery
  Partitions stored on different disks: less contention
  Parallel processing capability

Disadvantages of Partitioning:
  Slower retrievals when a query spans partitions
  Complexity for application programmers
  Anomalies and extra storage space requirements due to duplication of data across partitions

Page 33:

Physical Files

Physical File: A file as stored on disk

Constructs to link two pieces of data:
  Sequential storage
  Pointers

File Organization: How the files are arranged on the disk.

Access Method: How the data can be retrieved based on the file organization
  Relative: data accessed as an offset from the most recently referenced point in secondary memory
  Direct: data accessed as a result of a calculation to generate the beginning address of a record

Page 34:

File Organizations

“A technique for physically arranging the records of a file on secondary storage devices.”

Goals in selecting (trade-offs exist, of course):
  Fast data retrieval
  High throughput for input and maintenance
  Efficient use of storage space
  Protection from failures or data loss
  Minimal need for reorganization
  Accommodation for growth
  Security from unauthorized use

Page 35:

File Organizations

Sequential
Indexed
  Indexed Sequential
  Indexed Nonsequential

Page 36:

Sequential File Organization

Records of the file are stored in sequence by the primary key field values

Page 37:

Sequential Retrieval

Consider a file of 10,000 records, each occupying 1 page.
Queries that require processing all records will require 10,000 accesses (e.g., find all items of type 'E').
Many disk accesses are wasted if few records meet the condition.
However, sequential retrieval is very effective if most or all records will be accessed (e.g., payroll).

Page 38:

Indexed File Organization

The index concept is like the index in a book.

Indexed-sequential file organization: The records are stored sequentially by primary key values, and there is an index built on the primary key field (and possibly indexes built on other fields as well).

Page 39:

Indexing

An index is a table or file that is used to determine the location of rows in another file that satisfy some condition.

Page 40:

Querying with an Index

Read the index into memory.
Search the index to find records meeting the condition.
Access only those records containing required data.
Disk accesses are substantially reduced when the query involves few records.
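The difference between scanning the file and going through an index can be sketched by counting simulated page accesses for the 10,000-record file from the sequential retrieval example. The data below is made up: it assumes one record per page and that only 20 items are of type 'E'. The cost of reading the index itself is ignored.

```python
from collections import defaultdict

# 10,000 records, one per page; type 'E' is rare in this invented data.
records = [
    {"item_no": i, "type": "E" if i % 500 == 0 else "A"}
    for i in range(10_000)
]


def sequential_scan(records, item_type):
    accesses = 0
    matches = []
    for rec in records:  # every page has to be read
        accesses += 1
        if rec["type"] == item_type:
            matches.append(rec)
    return matches, accesses


def build_index(records, field):
    # Map each field value to the page numbers of the records that hold it.
    index = defaultdict(list)
    for page_no, rec in enumerate(records):
        index[rec[field]].append(page_no)
    return index


def indexed_lookup(records, index, value):
    pages = index.get(value, [])
    # Only the pages named by the index are read.
    return [records[p] for p in pages], len(pages)


type_index = build_index(records, "type")
scan_matches, scan_accesses = sequential_scan(records, "E")
idx_matches, idx_accesses = indexed_lookup(records, type_index, "E")

print(scan_accesses, idx_accesses, scan_matches == idx_matches)
```

Both approaches return the same records, but in this data the index reduces the page accesses from 10,000 to 20.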

Page 41:

Maintaining an Index

Adding a record requires at least two disk accesses:
  Update the file
  Update the index

Trade-off:
  Faster queries
  Slower maintenance (additions, deletions, and updates of records)
  Thus, more static databases benefit more overall
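The maintenance cost can be made concrete with a toy model that counts one simulated write for the file and one for each index whenever a record is added. The class, field names, and sample record are invented for the example. Real DBMS costs depend on the index structure and on buffering.

```python
class IndexedFile:
    """Toy model of a data file with one in-memory index per indexed field."""

    def __init__(self, indexed_fields):
        self.records = []
        self.indexes = {field: {} for field in indexed_fields}
        self.writes = 0  # counts simulated write operations

    def add(self, record):
        # 1. Update the file.
        self.records.append(record)
        self.writes += 1
        position = len(self.records) - 1
        # 2. Update every index built on the file.
        for field, index in self.indexes.items():
            index.setdefault(record[field], []).append(position)
            self.writes += 1


one_index = IndexedFile(["student_id"])
three_indexes = IndexedFile(["student_id", "surname", "programme"])

row = {"student_id": 1, "surname": "Example", "programme": "BSc"}
one_index.add(row)
three_indexes.add(row)

print(one_index.writes, three_indexes.writes)
```

Each extra index adds one more write per insertion in this model, which is why heavily updated tables are usually given fewer indexes.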

Page 42:

Rules of Thumb for Using Indexes

1. Indexes are most useful on larger tables
2. Index the primary key of each table (may be automatic, as in Access)
3. Indexes are useful on search fields (WHERE)
4. Indexes are also useful on fields used for sorting (ORDER BY) and categorizing (GROUP BY)
5. It is most useful to index a field when there are many different values for that field

Page 43:

Rules of Thumb for Using Indexes

6. Find out the limits placed on indexing by your DBMS (Access allows 32 indexes per table, and no index may contain more than 10 fields)

7. Depending on the DBMS, null values may not be referenced from an index (thus, rows with a null value in the field that is indexed may not be found by a search using the index)
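As a sketch of rules 3 and 5, the example below creates an index on a search field with many distinct values and asks SQLite, through Python's built-in sqlite3 module, how it plans to run a query on that field. The table, column names, index name, and data are assumptions made for the example. The exact wording of the query plan varies between SQLite versions, so the code only checks that the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, surname TEXT, programme TEXT)"
)
conn.executemany(
    "INSERT INTO student (surname, programme) VALUES (?, ?)",
    [(f"Surname{i}", f"P{i % 5}") for i in range(1000)],
)

# Index a search field that has many different values (rules 3 and 5).
conn.execute("CREATE INDEX idx_student_surname ON student (surname)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM student WHERE surname = ?",
    ("Surname42",),
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[3] for row in plan)
uses_index = "USING INDEX idx_student_surname" in plan_text

print(plan_text)
print("Index used:", uses_index)
```

Checking the plan in this way is a quick means of confirming that an index is actually being used before relying on it.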

Page 44:

Group Exercise

Consider a college database consisting of three tables, Student, Lecture, and Course. Denormalize your tables so that you increase the performance of the following query:

List all students who take the Database Development course taught by Bajuna.

Page 45:

Next Topic

Client/Server and Middleware
