8/8/2019 Week08 - Physical Design
Database I
Methodology
Physical Design
Physical Database Design
Throughout conceptual design, logical design, and normalization, the primary objectives have been storage efficiency and the consistency of the database
In physical database design, however, the focus shifts from storage efficiency to efficiency of execution
Physical Database Design (Cont.)
Physical DB design involves:
Transforming the logical DB design into technical specifications for storing and retrieving data
It does not include actually implementing the design; however, tool-specific decisions are involved
Physical design requires the following inputs:
Normalized relations
Definitions of each attribute (the purpose or objective of the attribute)
Physical Database Design (Cont.)
Descriptions of data usage (how and by whom the data will be used)
Requirements for response time, data security, backup, etc.
The tool to be used
Decisions made during this process are:
Choosing data types
Deciding file organizations
Selecting structures
Preparing strategies for efficient access
De-normalization
De-normalization is a technique for moving from higher to lower normal forms in the database model in order to speed up database access
The de-normalization process is applied when deriving a physical data model from a logical design
In logical design, we group attributes that are logically related through the same primary key
In physical database design, fields are grouped according to how they are stored physically and accessed by the DBMS
De-normalization (Cont.)
We should be aware that each new RDBMS release usually brings enhanced performance and improved access options that may reduce the need for de-normalization
A fully normalized database schema can fail to provide adequate system response time because of excessive table join operations
De-normalization (Cont.)
De-normalization Situation 1:
Merge two entity types that have a one-to-one relationship into one
If one of the entity types is optional, merging can waste storage; however, if the two are accessed together very frequently, merging them may still be a wise decision
Two relations in a one-to-one relationship can therefore be merged for better performance
De-normalization (Cont.)
De-normalization Situation 2:
A many-to-many binary relationship is mapped to three relations
Queries that need data from the two participating relations must join three relations
A join is an expensive operation from the execution point of view
Consider the many-to-many relationship between EMP and PROJ, implemented through WORK
EMP (empId, eName, pjId, Sal)
PROJ (pjId, pjName)
WORK (empId, pjId, dtHired, Sal)
De-normalization (Cont.)
We can de-normalize these relations by merging the WORK relation into the PROJ relation
The merged relation violates 2NF (pjName depends on pjId alone, which is only part of the key), so the anomalies associated with 2NF violations will be present
However, only one join operation, between two tables, is now needed, which improves efficiency
EMP (empId, eName, pjId, Sal)
PROJ (pjId, pjName, empId, dtHired, Sal)
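The trade-off in Situation 2 can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: EMP is reduced to (empId, eName), the merged table is named PROJ_DN, and the sample rows are invented. It shows the same answer coming from a three-table join on the normalized design and from a two-table join on the de-normalized one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Normalized design: EMP, PROJ and the bridge relation WORK.
CREATE TABLE EMP  (empId INTEGER PRIMARY KEY, eName TEXT);
CREATE TABLE PROJ (pjId INTEGER PRIMARY KEY, pjName TEXT);
CREATE TABLE WORK (empId INTEGER, pjId INTEGER, dtHired TEXT, sal INTEGER,
                   PRIMARY KEY (empId, pjId));
INSERT INTO EMP  VALUES (1, 'Ali'), (2, 'Sara');
INSERT INTO PROJ VALUES (10, 'Payroll'), (20, 'Inventory');
INSERT INTO WORK VALUES (1, 10, '2019-01-05', 500),
                        (1, 20, '2019-03-01', 300),
                        (2, 10, '2019-02-10', 450);

-- De-normalized design (hypothetical name PROJ_DN): WORK merged into PROJ.
-- pjName is now repeated once per employee on the project.
CREATE TABLE PROJ_DN (pjId INTEGER, pjName TEXT, empId INTEGER,
                      dtHired TEXT, sal INTEGER,
                      PRIMARY KEY (pjId, empId));
INSERT INTO PROJ_DN
    SELECT p.pjId, p.pjName, w.empId, w.dtHired, w.sal
    FROM PROJ p JOIN WORK w ON w.pjId = p.pjId;
""")

# Normalized: two joins across three tables.
three_way = cur.execute("""
    SELECT e.eName, p.pjName, w.sal
    FROM EMP e
    JOIN WORK w ON w.empId = e.empId
    JOIN PROJ p ON p.pjId = w.pjId
    ORDER BY e.eName, p.pjName
""").fetchall()

# De-normalized: a single join across two tables.
two_way = cur.execute("""
    SELECT e.eName, d.pjName, d.sal
    FROM EMP e
    JOIN PROJ_DN d ON d.empId = e.empId
    ORDER BY e.eName, d.pjName
""").fetchall()

print(three_way == two_way)
```

The price is visible in PROJ_DN itself: the project name 'Payroll' is stored twice, which is the redundancy behind the 2NF anomalies.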
De-normalization (Cont.)
De-normalization Situation 3:
In a 1:M situation where the entity type (ET) on the one side does not participate in any other relationship, the many-side ET is extended with the reference data itself rather than a foreign key
In this case the reference table should be merged into the main table
Consider the STUDENT and HOBBY relations
One student can have one hobby, and one hobby can be adopted by many students
Here HOBBY can be merged into the STUDENT relation
This introduces data redundancy, but no join between the two relations is needed, which gives better performance
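A minimal sketch of Situation 3, again using sqlite3. The column names (hobbyId, hobbyName), the merged table name STUDENT_DN, and the sample rows are assumptions made for illustration; the slides give only the relation names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: STUDENT refers to HOBBY through a foreign key.
CREATE TABLE HOBBY   (hobbyId INTEGER PRIMARY KEY, hobbyName TEXT);
CREATE TABLE STUDENT (stId INTEGER PRIMARY KEY, sName TEXT,
                      hobbyId INTEGER REFERENCES HOBBY(hobbyId));
INSERT INTO HOBBY   VALUES (1, 'Chess'), (2, 'Cricket');
INSERT INTO STUDENT VALUES (101, 'Ali', 1), (102, 'Sara', 1), (103, 'Omar', 2);

-- De-normalized (hypothetical name STUDENT_DN): the hobby name itself
-- replaces the foreign key, so 'Chess' is stored once per student.
CREATE TABLE STUDENT_DN (stId INTEGER PRIMARY KEY, sName TEXT, hobbyName TEXT);
INSERT INTO STUDENT_DN
    SELECT s.stId, s.sName, h.hobbyName
    FROM STUDENT s JOIN HOBBY h ON h.hobbyId = s.hobbyId;
""")

# Normalized query needs a join.
with_join = conn.execute("""
    SELECT s.sName, h.hobbyName
    FROM STUDENT s JOIN HOBBY h ON h.hobbyId = s.hobbyId
    ORDER BY s.stId
""").fetchall()

# De-normalized query reads a single table.
without_join = conn.execute(
    "SELECT sName, hobbyName FROM STUDENT_DN ORDER BY stId"
).fetchall()

print(with_join == without_join)
```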
Partitioning
Partitioning splits a single relation into two or more parts
The aims of data partitioning in a database are to:
Reduce workload (e.g. data access, communication costs, search space)
Balance workload
Speed up the rate of useful work (e.g. keeping frequently accessed objects in main memory)
There are two types of partitioning:
Horizontal Partitioning
Vertical Partitioning
Partitioning (Cont.)
Horizontal Partitioning
The table is split on the basis of rows, which means a larger table is split into smaller tables
The advantage is that accessing records in a smaller table takes much less time than in a larger one
Range Partitioning
In this type of partitioning, a range is imposed on a particular attribute
For example, students whose IDs are from 1 to 1000 are in partition 1, and so on
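The student ID example can be written as a small routing function. This is a sketch that assumes every partition covers the same number of IDs (1000, as in the example) and that IDs start at 1; the function name is hypothetical.

```python
def range_partition(student_id: int, width: int = 1000) -> int:
    """Return the 1-based partition number for a student ID.

    IDs 1..width go to partition 1, width+1..2*width to partition 2, etc.
    """
    if student_id < 1:
        raise ValueError("student IDs are assumed to start at 1")
    return (student_id - 1) // width + 1


for sid in (1, 1000, 1001, 2500):
    print(sid, "->", range_partition(sid))
```

A query for a single ID then only has to search the one partition the function returns, rather than the whole table.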
Partitioning (Cont.)
Hash Partitioning
A hashing algorithm, known to the DBMS, is applied to determine the partition for each row
Hash partitioning greatly reduces the chance of unbalanced partitions
List Partitioning
In this type of partitioning, the values are specified for every partition
So there is a specified list of values for each partition
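Both schemes can be sketched in a few lines. The assumptions here are significant: SHA-256 merely stands in for whatever hash function a real DBMS uses internally, and the partition names and city lists are invented for the example.

```python
import hashlib


def hash_partition(key, n_partitions: int) -> int:
    """Map a key to a partition number in 0..n_partitions-1.

    SHA-256 is only a stand-in for the DBMS's own hash function.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_partitions


# List partitioning: every partition has an explicit list of values.
# These partition names and cities are hypothetical.
CITY_LISTS = {
    "north": {"Lahore", "Islamabad"},
    "south": {"Karachi", "Hyderabad"},
}


def list_partition(city: str) -> str:
    """Return the name of the partition whose list contains the city."""
    for name, values in CITY_LISTS.items():
        if city in values:
            return name
    raise KeyError(f"no partition lists the value {city!r}")


# Count how 10,000 consecutive keys spread over 4 hash partitions.
counts = [0, 0, 0, 0]
for key in range(10_000):
    counts[hash_partition(key, 4)] += 1
print(counts)
print(list_partition("Karachi"))
```

Because the hash scatters even consecutive keys, the four counts come out close to each other, which is the balancing property the slide describes. With list partitioning, balance depends entirely on how the lists are chosen.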
Partitioning (Cont.)
Vertical Partitioning
Vertical partitioning is done on the basis of attributes
The same table is split into different physical records depending on the nature of accesses
The primary key is repeated in all vertical partitions of a table so that the original table can be reconstructed
Consider the Student relation
STD (stId, sName, sAdr, sPhone, cgpa, prName,
school, mtMrks, mtSubs, clgName,
intMarks, intSubs, dClg, bMarks, bSubs)
Partitioning (Cont.)
We can partition this relation vertically as follows
STD (stId, sName, sAdr, sPhone, cgpa, prName)
STDACD (stId, school, mtMrks, mtSubs,
clgName, intMarks, intSubs,
dClg, bMarks, bSubs)
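The role of the repeated primary key can be checked with a small sqlite3 sketch. To keep it short, only a few of the attributes are used, the unsplit table is given the hypothetical name STD_FULL, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unsplit relation (hypothetical name STD_FULL), reduced to five columns.
CREATE TABLE STD_FULL (stId INTEGER PRIMARY KEY, sName TEXT, cgpa REAL,
                       school TEXT, mtMrks INTEGER);
INSERT INTO STD_FULL VALUES (1, 'Ali', 3.4, 'City School', 820),
                            (2, 'Sara', 3.9, 'Model School', 905);

-- Vertical partitions: stId is repeated in both.
CREATE TABLE STD    AS SELECT stId, sName, cgpa    FROM STD_FULL;
CREATE TABLE STDACD AS SELECT stId, school, mtMrks FROM STD_FULL;
""")

original = conn.execute("SELECT * FROM STD_FULL ORDER BY stId").fetchall()

# Joining the partitions on the shared key rebuilds the original rows.
rebuilt = conn.execute("""
    SELECT s.stId, s.sName, s.cgpa, a.school, a.mtMrks
    FROM STD s JOIN STDACD a ON a.stId = s.stId
    ORDER BY s.stId
""").fetchall()

print(rebuilt == original)
```

Queries that need only personal data read STD alone; the join is paid only when the academic history is also required.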
Data Storage Concepts
Physical Storage Media
Storage media are classified according to the following characteristics:
Speed of access
Cost per unit of data
Reliability
RAID
Redundant Array of Inexpensive Disks
Many disks that appear as a single disk to the OS but offer better performance and better reliability
RAID disk drives are used frequently on servers
In RAID, data are distributed over the drives to allow parallel operations
Data Storage Concepts (Cont.)
Fundamental to RAID is "striping", a method of concatenating multiple drives into one logical storage unit
Striping involves partitioning each drive's storage space into stripes, which may be as small as one sector (512 bytes) or as large as several megabytes
The type of application environment, I/O intensive or data intensive, determines whether large or small stripes should be used
Data Storage Concepts (Cont.)
RAID-0
Simple striping
The virtual single disk is divided into strips of k sectors each
Since no redundant information is stored, performance is very good, but the failure of any disk in the array results in data loss
Data Storage Concepts (Cont.)
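[Figure: RAID-0 striping across four disks. Disk 1 holds strips 1, 5, 9; disk 2 holds strips 2, 6, 10; disk 3 holds strips 3, 7, 11; disk 4 holds strips 4, 8, 12]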
Note: This example is a basic virtual drive where
each element depicted as a disk is a physical disk
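The round-robin placement of twelve numbered strips on four disks can be reproduced with simple modular arithmetic. This is a sketch of plain striping only; the function name and the 1-based numbering are choices made for the example.

```python
from typing import Tuple


def raid0_location(strip_no: int, n_disks: int) -> Tuple[int, int]:
    """Map a 1-based logical strip number to (disk, row).

    disk is 1-based; row is the 0-based position of the strip on that disk.
    """
    index = strip_no - 1
    return index % n_disks + 1, index // n_disks


# Rebuild the layout for 12 strips on 4 disks.
layout = {disk: [] for disk in range(1, 5)}
for strip in range(1, 13):
    disk, _row = raid0_location(strip, 4)
    layout[disk].append(strip)

for disk, strips in layout.items():
    print("disk", disk, strips)
```

Consecutive strips always land on different disks, so a large sequential read can be served by all four drives in parallel.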
Data Storage Concepts (Cont.)
RAID-1
RAID Level 1 provides redundancy by writing all data to two or more drives
A level 1 array tends to be faster on reads and slower on writes than a single drive, but if one drive fails, no data is lost
This level is commonly referred to as mirroring
Data Storage Concepts (Cont.)
[Figure: RAID-1 mirroring. Each of the three disks shown holds an identical copy of blocks 1, 2, 3]
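Mirroring can be modelled as a toy in-memory array in which every write goes to all working drives and a read is served by any surviving one. The class and method names are hypothetical, and real controllers also handle rebuilds and read balancing, which this sketch ignores.

```python
class MirroredArray:
    """Toy RAID-1 model: each drive is a dict of block number -> data."""

    def __init__(self, n_drives: int = 2):
        if n_drives < 2:
            raise ValueError("mirroring needs at least two drives")
        self.drives = [dict() for _ in range(n_drives)]
        self.failed = set()

    def write(self, block_no: int, data: bytes) -> None:
        # Every write is repeated on each working drive (slower than one drive).
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                drive[block_no] = data

    def fail(self, drive_no: int) -> None:
        # Simulate a drive failure: its contents are gone.
        self.failed.add(drive_no)
        self.drives[drive_no].clear()

    def read(self, block_no: int) -> bytes:
        # Any surviving drive holds a full copy.
        for i, drive in enumerate(self.drives):
            if i not in self.failed:
                return drive[block_no]
        raise IOError("all drives have failed")


array = MirroredArray(n_drives=3)
for block in (1, 2, 3):
    array.write(block, f"block-{block}".encode())

array.fail(0)
print(array.read(2))
```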
Data Storage Concepts (Cont.)
RAID-2, 3
For reliability, check codes are used (an error-correcting code in RAID-2, a simple parity check in RAID-3)
The check bits are stored on separate disks
RAID-4
RAID Level 4 stripes data at the block level across several drives, with parity stored on one drive
The performance of a level 4 array is very good for reads (the same as level 0)
Writes, however, require that the parity data be updated each time
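A parity block of this kind is commonly computed as the bitwise XOR of the data blocks in a stripe. The sketch below assumes that scheme, with three invented 4-byte data blocks and a hypothetical helper xor_blocks. It shows both recovery of a lost block and why every write also touches the parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bitwise XOR of equal-length byte blocks, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
data = [b"DATA", b"BASE", b"RAID"]
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails: XOR of the survivors and the
# parity block restores the missing block.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered)

# A small write replaces data[1]; the parity must be updated as well:
# new parity = old parity XOR old data XOR new data.
new_block = b"DISK"
new_parity = xor_blocks([parity, data[1], new_block])
print(new_parity == xor_blocks([data[0], new_block, data[2]]))
```

In RAID-4 that parity update always goes to the same drive, which is why concurrent writes queue up behind it.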
Data Storage Concepts (Cont.)
RAID-5
RAID Level 5 is similar to level 4, but distributes parity among the drives
This can speed up small writes in multiprocessing systems, since no single parity disk becomes a bottleneck
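One simple way to distribute parity is to rotate the parity block to a different drive on each stripe. Real controllers use several rotation layouts; the formula below is just one illustrative choice, and both function names are hypothetical.

```python
from collections import Counter


def raid4_parity_disk(stripe_no: int, n_disks: int) -> int:
    """RAID-4: parity always sits on the last disk (0-based index)."""
    return n_disks - 1


def raid5_parity_disk(stripe_no: int, n_disks: int) -> int:
    """RAID-5, one possible rotation: parity moves one disk per stripe."""
    return (n_disks - 1 - stripe_no) % n_disks


stripes = range(8)
raid4 = [raid4_parity_disk(s, 4) for s in stripes]
raid5 = [raid5_parity_disk(s, 4) for s in stripes]

print("RAID-4 parity disks:", raid4)
print("RAID-5 parity disks:", raid5)
print("RAID-5 parity writes per disk:", dict(Counter(raid5)))
```

With RAID-4 all eight parity updates hit disk 3, while the rotation spreads them evenly, two per disk.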
RAID-0 is the fastest and most efficient array type but offers no fault tolerance
RAID-1 is the array of choice for performance-critical, fault-tolerant environments
RAID-2 is seldom used today since ECC is embedded in almost all modern disk drives
Data Storage Concepts (Cont.)
RAID-3 can be used in data-intensive or single-user environments that access long sequential records, to speed up data transfer. However, RAID-3 does not allow multiple I/O operations to be overlapped
RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations
RAID-5 is the best choice in multi-user environments that are not write-performance sensitive. However, at least three, and more typically five, drives are required for RAID-5 arrays