Physical Database Design
Chapter 5
G. Green


Agenda

• Purpose
• Activities
  • Fields
  • Records
  • Files

Purpose

• Determine physical specifications for data
• Goal: Processing Efficiency
  • Performance
  • Integrity
  • Security
  • Recoverability

Physical Design Activities

• Choose DBMS
• Detailed definitions for:
  • fields (data dictionary)
  • records (physical record structure, quantity)
  • files (access methods, space requirements)
• Physical File Creation
• Query Optimization


Choosing a DBMS

NOTE: This diagram is for effect ONLY—it is incomplete (e.g., no MDDB, no ODDB) AND contains some inaccuracies


Choosing a DBMS, cont…

• Compatibility with existing hardware, software, network, operating system
• DBMS features meet requirements*
  • Needed functionality
  • Structure of data
  • Nature of workload
• Product reliability
• Vendor support
• IT personnel expertise
• Pricing, licensing


Designing Fields

• Choose data types and lengths
  • Represent all possible values
  • Ensure data integrity
  • Support data manipulations
  • Minimize storage space
• Data integrity controls:
  • defaults
  • ranges
  • nulls
  • referential integrity
• Document above in data dictionary
  • see Table 1-1
  • see Table 4-1
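As a rough illustration of the four integrity controls listed above (defaults, ranges, nulls, referential integrity), the sketch below declares them on a table using Python's built-in sqlite3 module. The table and column names (product_type, product, price, and so on) are hypothetical and do not come from the slides, and SQLite is used only because it ships with Python, not because the course uses it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database whose columns carry integrity controls."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE product_type (
            type_code CHAR(1) PRIMARY KEY
        );
        CREATE TABLE product (
            prod_no   INTEGER PRIMARY KEY,
            name      VARCHAR(40) NOT NULL,                       -- null control
            color     VARCHAR(10) DEFAULT 'N/A',                  -- default value
            price     DECIMAL(7,2)
                      CHECK (price BETWEEN 0 AND 9999.99),        -- range control
            type_code CHAR(1) NOT NULL
                      REFERENCES product_type(type_code)          -- referential integrity
        );
        INSERT INTO product_type VALUES ('E');
    """)
    return conn


def try_insert(conn, prod_no, name, price, type_code):
    """Attempt an insert; return 'ok' or the reason the DBMS rejected it."""
    try:
        conn.execute(
            "INSERT INTO product (prod_no, name, price, type_code) "
            "VALUES (?, ?, ?, ?)",
            (prod_no, name, price, type_code),
        )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


conn = build_demo_db()
print(try_insert(conn, 1, "Compass", 25.00, "E"))  # valid row, color gets the default
print(try_insert(conn, 2, "Map", -5, "E"))         # price outside the allowed range
print(try_insert(conn, 3, None, 5, "E"))           # name may not be null
print(try_insert(conn, 4, "Hat", 5, "Z"))          # type 'Z' has no parent row
```

Declaring the rules in the table definition means the DBMS rejects bad rows itself, so no application program can bypass them.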

Designing Fields, cont…

• Text
  • Char vs. Varchar/NVarchar
• Numbers
  • Int vs. Decimal (or Numeric)
• Dates
  • Date vs. Time vs. Datetime
• Others (will not use in class)


Designing Records, cont...

• Re-design options include:
  • Denormalization
    • What?
    • When?
    • Why?
    • Why not?
  • Partitioning
    • Horizontal (row)
    • Vertical (column)
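To make the two partitioning options concrete, here is a minimal sketch over an in-memory list of rows. The function names and the sample rows are made up for illustration; a real DBMS would do this through its own partitioning features rather than application code.

```python
def horizontal_partition(rows, key):
    """Row-wise split: group whole rows by the value of one column."""
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts


def vertical_partition(rows, pk, column_groups):
    """Column-wise split: one table per column group, each keeping the primary key."""
    return [
        [{col: row[col] for col in (pk, *cols)} for row in rows]
        for cols in column_groups
    ]


# Hypothetical sample rows.
rows = [
    {"prod_no": 1, "name": "Compass", "type": "N", "color": None},
    {"prod_no": 2, "name": "Safari chair", "type": "F", "color": "Khaki"},
    {"prod_no": 3, "name": "Map measure", "type": "N", "color": None},
]

by_type = horizontal_partition(rows, "type")
names, attrs = vertical_partition(rows, "prod_no", [("name",), ("type", "color")])

print(sorted(by_type))  # partition keys
print(names[0])         # first row of the "name" partition
print(attrs[1])         # second row of the "type, color" partition
```

Each vertical partition repeats the primary key so the original row can be rebuilt with a join; horizontal partitions keep rows whole and only decide where each row lives.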

Usage Analysis Example

[Figure not reproduced]

Denormalization Example

[Figure callouts: "Extra table access required"; "Duplicate descriptions possible"]
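The trade-off named in the callouts for this example (extra table access versus duplicated descriptions) can be sketched with Python's built-in sqlite3 module. All table names, column names, and descriptions below are invented for illustration and are not taken from the slide's figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: each type description is stored once, in its own table.
    CREATE TABLE product_type (type_code TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE product (prod_no INTEGER PRIMARY KEY, name TEXT, type_code TEXT);
    INSERT INTO product_type VALUES ('C', 'Clothing'), ('F', 'Furniture');
    INSERT INTO product VALUES (6, 'Hat', 'C'), (8, 'Boots', 'C'), (10, 'Chair', 'F');

    -- Denormalized: the description is copied onto every product row.
    CREATE TABLE product_denorm AS
        SELECT p.prod_no, p.name, p.type_code, t.description
        FROM product p JOIN product_type t ON p.type_code = t.type_code;
""")

# Normalized read: needs the extra table access (a join).
normalized = conn.execute(
    "SELECT p.name, t.description FROM product p "
    "JOIN product_type t ON p.type_code = t.type_code ORDER BY p.prod_no"
).fetchall()

# Denormalized read: one table only, but 'Clothing' is now stored twice.
denormalized = conn.execute(
    "SELECT name, description FROM product_denorm ORDER BY prod_no"
).fetchall()

# Number of redundant description copies in the denormalized table.
duplicates = conn.execute(
    "SELECT COUNT(*) - COUNT(DISTINCT description) FROM product_denorm"
).fetchone()[0]

print(normalized == denormalized)  # same answer either way
print(duplicates)                  # redundant copies to keep in sync
```

Both designs return the same answer; the denormalized table avoids the join at the cost of redundant copies that must be kept consistent on every update.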


Designing Files

• Efficient access to data
  • File Organizations
    • sequential
    • indexed
    • hashed
• Efficient storage of data
  • How much storage?
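The three file organizations named above differ mainly in how a record is located. The sketch below mimics each one with in-memory Python structures; it is an analogy under simplifying assumptions (integer keys, a modulo hash, four buckets), not how any particular DBMS lays out files, and all names are hypothetical.

```python
from bisect import bisect_left

# Ten records stored in primary-key order: (prod_no, payload).
records = [(prod_no, f"product-{prod_no}") for prod_no in range(1, 11)]


def sequential_lookup(records, key):
    """Sequential: scan in stored order; return (record, records examined)."""
    for examined, rec in enumerate(records, start=1):
        if rec[0] == key:
            return rec, examined
    return None, len(records)


def build_index(records):
    """Indexed: a separate sorted list of (key, position) pairs."""
    return sorted((rec[0], pos) for pos, rec in enumerate(records))


def indexed_lookup(records, index, key):
    """Binary-search the index, then fetch the record at the stored position."""
    i = bisect_left(index, (key, -1))
    if i < len(index) and index[i][0] == key:
        return records[index[i][1]]
    return None


def build_hash_file(records, n_buckets=4):
    """Hashed: a hash of the key (here, key modulo bucket count) picks the bucket."""
    buckets = [[] for _ in range(n_buckets)]
    for rec in records:
        buckets[rec[0] % n_buckets].append(rec)
    return buckets


def hashed_lookup(buckets, key):
    """Go straight to the key's bucket and search only that bucket."""
    for rec in buckets[key % len(buckets)]:
        if rec[0] == key:
            return rec
    return None


index = build_index(records)
buckets = build_hash_file(records)
print(sequential_lookup(records, 7))
print(indexed_lookup(records, index, 7))
print(hashed_lookup(buckets, 7))
```

The sequential scan examines records until it reaches the key, the index narrows the search with a sorted structure kept apart from the data, and the hash goes directly to one bucket.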

Sequential File Organization

• Records physically ordered, often by PK
• Examples
• Advantages
• Disadvantages

Indexed File Organization

• Data Records physically ordered
• Index Records give physical location of each data record
  • indexes are separate files
• Advantages
• Disadvantages
• Example

Sequential File Organization

No Index

PRODUCT Table

PROD_NO (PK)  NAME                     TYPE  COLOR
1             Pocket knife - Nile      E     Brown
2             Pocket knife - Nile      E     Brown
3             Compass                  N
4             Geo positioning system   N
5             Map measure              N
6             Hat - polar explorer     C     Red
7             Hat - polar explorer     C     White
8             Boots - snake proof      C     Green
9             Boots - snake proof      C     Black
10            Safari chair             F     Khaki

• Page size = 1 KB
• 10,000 records
• Data record size = 0.5 KB
  • 1 KB / 0.5 KB = 2 data records/page
• 20% = type E
  • 10,000 * 0.2 = 2,000 type E records

“Find all 2,000 products that are type E”

• How many total “reads”? 10,000 / 2 = 5,000 (with no index, every data page must be read)

Indexed File Organization, cont...

With Index

PRODUCT Table

PROD_NO (PK)  NAME                     TYPE  COLOR
1             Pocket knife - Nile      E     Brown
2             Pocket knife - Nile      E     Brown
3             Compass                  N
4             Geo positioning system   N
5             Map measure              N
6             Hat - polar explorer     C     Red
7             Hat - polar explorer     C     White
8             Boots - snake proof      C     Green
9             Boots - snake proof      C     Black
10            Safari chair             F     Khaki

PROD_TYPE Index

TYPE entries, in index order: C C C C E E F N N N (each paired with an ADDR pointing to a PRODUCT row)

• Page size = 1 KB
• 10,000 records
• Data record size = 0.5 KB
  • 1 KB / 0.5 KB = 2 data records/page
• 20% = type E
  • 10,000 * 0.2 = 2,000 type E records
• Index record size = 5 bytes
  • 1 byte for TYPE
  • 4 bytes for ADDR
• 1,000 bytes / 5 bytes = 200 index records/page (taking 1 KB as 1,000 bytes)

“Find all 2,000 products that are type E”

• How many index “reads”? 10,000 / 200 = 50
• How many data “reads”? 2,000 (one read per matching record)
• How many total “reads”? 2,000 + 50 = 2,050
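The read counts on the "No Index" and "With Index" slides can be reproduced with a few lines of arithmetic. The sketch below follows the slides' own model (every index page is read, then one data read per matching record, with 1 KB taken as 1,000 bytes for the index); the function and variable names are hypothetical.

```python
import math


def full_scan_reads(n_records, records_per_page):
    """No index: every data page is read once."""
    return math.ceil(n_records / records_per_page)


def indexed_reads(n_records, index_entries_per_page, n_matches):
    """Slides' model: read all index pages, then one data page per match."""
    index_reads = math.ceil(n_records / index_entries_per_page)
    data_reads = n_matches
    return index_reads, data_reads, index_reads + data_reads


n_records = 10_000
records_per_page = 2              # 1 KB page / 0.5 KB data record
index_entries_per_page = 200      # 1,000 bytes / 5-byte index record
n_matches = n_records * 20 // 100 # 20% of products are type E

no_index = full_scan_reads(n_records, records_per_page)
with_index = indexed_reads(n_records, index_entries_per_page, n_matches)

print(no_index)    # total reads without the index
print(with_index)  # (index reads, data reads, total reads)
```

Under these assumptions the index cuts the work from 5,000 reads to 2,050. The saving shrinks as the share of matching records grows, because each match still costs one data read.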

Indexed File Organization, cont...

• Types of Indexes
  • Primary Key Index: only ONE
  • Secondary (Key) Indexes
    • How to choose?
  • Clustered Index
• How to structure indexes?
  • B-Tree
  • Bitmap
  • …

B-Tree Index Organization

[Figure: B-tree with Root, Branch, and Leaf levels, and rowid pointers]

See http://www.ovaistariq.net/733/understanding-btree-indexes-and-how-they-impact-performance/ for more information
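To show how a lookup moves through the root, branch, and leaf levels of such an index, here is a lookup-only sketch over a small hand-built tree. It is not a full B-tree (there is no insertion, splitting, or rebalancing), and the Node class, the key values, and the "rowid-N" strings are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for leaf nodes
    rowids: list = field(default_factory=list)    # filled only in leaf nodes


def leaf(keys):
    """Build a leaf whose entries pair each key with a made-up rowid."""
    return Node(keys=keys, rowids=[f"rowid-{k}" for k in keys])


def search(node, key):
    """Descend from the root to a leaf; return (rowid or None, nodes visited)."""
    visited = 1
    while node.children:
        # Separator keys: child i holds keys below keys[i]; the last child holds the rest.
        node = node.children[bisect_right(node.keys, key)]
        visited += 1
    if key in node.keys:
        return node.rowids[node.keys.index(key)], visited
    return None, visited


# Three levels: one root, two branches, four leaves holding keys 1..12.
branch_low = Node(keys=[4], children=[leaf([1, 2, 3]), leaf([4, 5, 6])])
branch_high = Node(keys=[10], children=[leaf([7, 8, 9]), leaf([10, 11, 12])])
root = Node(keys=[7], children=[branch_low, branch_high])

print(search(root, 8))   # found after visiting root, branch, leaf
print(search(root, 13))  # absent key still costs three node visits
```

Every lookup touches one node per level, so the cost depends on the height of the tree rather than on the number of rows.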

Summary

• Purpose
• Activities
  • Choose DBMS
  • Fields
    • Data dictionary
    • Data types
  • Records
    • Usage
    • Denormalization
  • Files
    • Organizations
    • Indexing