Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

31
Database System Concepts, 5th Ed. Chapter 11: Storage and File Chapter 11: Storage and File Structure Structure

Transcript of Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

Page 1: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

Database System Concepts, 5th Ed.

Chapter 11: Storage and File StructureChapter 11: Storage and File Structure

Page 2: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.2Database System Concepts - 5th Edition, Oct 23, 2005.

Chapter 11: Storage and File StructureChapter 11: Storage and File Structure

Overview of Physical Storage Media

Storage Access

File Organization

Organization of Records in Files

Data-Dictionary Storage

Page 3: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.3Database System Concepts - 5th Edition, Oct 23, 2005.

Overview of Physical Storage Media Overview of Physical Storage Media 物理存储物理存储介质介质

Magnetic Disks 磁盘 RAID 磁盘阵列 Tertiary Storage 第三存储 Storage Access 存取 ** File Organization 文件结构 ** Organization of Records in Files 记录结构 ** Data-Dictionary Storage 数据字典 ** Storage Structures for Object-Oriented Databases

Page 4: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.4Database System Concepts - 5th Edition, Oct 23, 2005.

Classification of Physical Storage Classification of Physical Storage MediaMedia

Speed with which data can be accessed

Cost per unit of data

Reliability

data loss on power failure or system crash

physical failure of the storage device

Can differentiate storage into:

volatile storage: loses contents when power is switched off 易逝,挥发 RAM

non-volatile storage:

Contents persist even when power is switched off.

Includes secondary and tertiary storage, as well as batter-backed up main-memory. 磁盘,光盘,磁带,闪存

Page 5: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.5Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage MediaPhysical Storage Media

Cache – fastest and most costly form of storage; volatile; managed by the computer system hardware. 快, 易逝, 硬件

•高速高价•设备

•低速低价•设备

•多种 Cache :•CPU- 与内存•内存与磁盘•OS 与 DBMS

Page 6: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.6Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage MediaPhysical Storage Media

Main memory: 主存储 fast access (10s to 100s of nanoseconds; 1 nanosecond = 10–9

seconds) 10 纳秒 generally too small (or too expensive) to store the entire

database 不能存整个 DB

capacities of up to a few Gigabytes widely used currently

Capacities have gone up and per-byte costs have decreased steadily and rapidly (roughly factor of 2 every 2 to 3 years)

Volatile — contents of main memory are usually lost if a power failure or system crash occurs. 易逝,

Page 7: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.7Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage Media (Cont.)Physical Storage Media (Cont.)

Flash memory 闪存 Data survives power failure 非易逝 Data can be written at a location only once, but location can be

erased and written to again 可重写 有限次读写 目前质量: 1-10 万次,够用了

Reads are roughly as fast as main memory 读快 写稍微慢 贵, 用于嵌入式设备,数字相机 MP3

also known as EEPROM (Electrically Erasable Programmable Read-Only Memory)

Page 8: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.8Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage Media (Cont.)Physical Storage Media (Cont.)

Magnetic-disk 磁盘 Primary , long-term , entire database.

slower than memory direct-access – Hard disks vs floppy disks 100 GB currently

Survives power failures and system crashes disk failure can destroy data, but is very rare

Page 9: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.9Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage Media (Cont.)Physical Storage Media (Cont.)

Optical storage 光存储 non-volatile, data is read optically from a spinning

disk using a laser 非易逝 CD-ROM (640 MB) and DVD (4.7 to 17 GB) most

popular forms

Write-one, read-many (WORM) optical disks used for archival storage (CD-R and DVD-R)

Multiple write versions also available (CD-RW, DVD-RW, and DVD-RAM)

slower than with magnetic disk

Juke-box systems, 多个可移动盘

Page 10: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.10Database System Concepts - 5th Edition, Oct 23, 2005.

Physical Storage Media (Cont.)Physical Storage Media (Cont.)

Tape storage 磁带 non-volatile, used primarily for backup (to recover from disk

failure), and for archival data 非易逝 sequential-access – much slower than disk

40 to 300 GB tapes available 大容量 cheaper than disk, but drives are expensive

Tape jukeboxes available for storing massive amounts of data 海量磁带库 hundreds of terabytes (1 terabyte = 109 bytes) to even a

petabyte (1 petabyte = 1012 bytes)

Page 11: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.11Database System Concepts - 5th Edition, Oct 23, 2005.

Storage HierarchyStorage Hierarchy

•各层•存储速度有数量级的差别•至少 10倍慢介质以廉价和容量而顽固地不肯退出历史舞台

Page 12: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.12Database System Concepts - 5th Edition, Oct 23, 2005.

Storage Hierarchy (Cont.) Storage Hierarchy (Cont.) 存储分级存储分级

primary storage: 主存 ,快 ,易逝 secondary storage: next level in hierarchy, non-volatile, moderately

fast access time

also called on-line storage

E.g. flash memory, magnetic disks 闪存,磁盘 tertiary storage: lowest level in hierarchy, non-volatile, slow access

time 第 3 存储 also called off-line storage

E.g. magnetic tape, optical storage 磁带光盘

Page 13: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.13Database System Concepts - 5th Edition, Oct 23, 2005.

Magnetic Hard Disk MechanismMagnetic Hard Disk Mechanism

NOTE: Diagram is schematic, and simplifies the structure of actual disk drives

Page 14: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.14Database System Concepts - 5th Edition, Oct 23, 2005.

11.5 Storage Access11.5 Storage Access (存储访问)(存储访问)

A database file is partitioned into fixed-length storage units called blocks. Blocks are units of both storage allocation and data transfer.

Database system seeks to minimize the number of block transfers between the disk and memory. We can reduce the number of disk accesses by keeping as many blocks as possible in main memory.

Buffer – portion of main memory available to store copies of disk blocks.

Buffer manager – subsystem responsible for allocating buffer space in main memory.

Page 15: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.15Database System Concepts - 5th Edition, Oct 23, 2005.

11.5.1 Buffer Manager11.5.1 Buffer Manager

Programs call on the buffer manager when they need a block from disk.

1. If the block is already in the buffer, buffer manager returns the address of the block in main memory

2. If the block is not in the buffer, the buffer manager

1. Allocates space in the buffer for the block

1. Replacing (throwing out) some other block, if required, to make space for the new block.

2. Replaced block written back to disk only if it was modified since the most recent time that it was written to/fetched from the disk.

2. Reads the block from the disk to the buffer, and returns the address of the block in main memory to requester.

Page 16: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.16Database System Concepts - 5th Edition, Oct 23, 2005.

11.5.2 Buffer-Replacement Policies11.5.2 Buffer-Replacement Policies Most operating systems replace the block least recently used (LRU

strategy)

Idea behind LRU – use past pattern of block references as a predictor of future references

Queries have well-defined access patterns (such as sequential scans), and a database system can use the information in a user’s query to predict future references

LRU can be a bad strategy for certain access patterns involving repeated scans of data

For example: when computing the join of 2 relations r and s by a nested loops for each tuple tr of r do for each tuple ts of s do if the tuples tr and ts match …

Mixed strategy with hints on replacement strategy providedby the query optimizer is preferable

Page 17: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.17Database System Concepts - 5th Edition, Oct 23, 2005.

Buffer-Replacement Policies (Cont.)Buffer-Replacement Policies (Cont.)

Pinned block (被钉住的块) – memory block that is not allowed to be written back to disk.

Most recently used (MRU) strategy – system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block.

Buffer manager can use statistical information regarding the probability that a request will reference a particular relation

Buffer managers also support forced output (强制写出) of blocks for the purpose of recovery

Page 18: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.18Database System Concepts - 5th Edition, Oct 23, 2005.

11.6 File Organization11.6 File Organization

The database is stored as a collection of files. Each file is a sequence of records. A record is a sequence of fields.

One approach:

assume record size is fixed

each file has records of one particular type only

different files are used for different relations

This case is easiest to implement; will consider variable length records later.

Page 19: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.19Database System Concepts - 5th Edition, Oct 23, 2005.

11.6.1 Fixed-Length Records11.6.1 Fixed-Length Records

Simple approach:

Store record i starting from byte n (i – 1), where n is the size of each record.

Record access is simple but records may cross blocks

Modification: do not allow records to cross block boundaries

Deletion of record i: alternatives:

move records i + 1, . . ., n to i, . . . , n – 1

move record n to i

do not move records, but link all free records on afree list

Page 20: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.20Database System Concepts - 5th Edition, Oct 23, 2005.

Free ListsFree Lists (空闲列表) (空闲列表)

Store the address of the first deleted record in the file header.

Use this first record to store the address of the second deleted record, and so on

Can think of these stored addresses as pointers since they “point” to the location of a record.

More space efficient representation: reuse space for normal attributes of free records to store pointers. (No pointers stored in in-use records.)

Page 21: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.21Database System Concepts - 5th Edition, Oct 23, 2005.

11.6.2 Variable-Length Records11.6.2 Variable-Length Records

Variable-length records arise in database systems in several ways:

Storage of multiple record types in a file.

Record types that allow variable lengths for one or more fields.

Record types that allow repeating fields (used in some older data models).

Page 22: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.22Database System Concepts - 5th Edition, Oct 23, 2005.

Variable-Length Records: Slotted Page Variable-Length Records: Slotted Page StructureStructure

变长记录 槽口 存页面结构变长记录 槽口 存页面结构

Slotted page header contains: 文件头存 :

number of record entries 记录个数 end of free space in the block 回收列表结束 location and size of each record 每个记录起止 与 光盘目录的组织 有点相似

Page 23: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.23Database System Concepts - 5th Edition, Oct 23, 2005.

记录在页面内移动,头文件中信息相应修改(可多次写的光盘是采用 另写目录)

外界的指针应该指向 文件头中的固定位置 而不应该指向具体的(可能会变的)记录位置

Page 24: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.24Database System Concepts - 5th Edition, Oct 23, 2005.

Variable-Length Records (Cont.)Variable-Length Records (Cont.)

定长 的实现 Fixed-length representation:

reserved space 保留空间 Pointers 指针

保留空间 Reserved space – 暂时不用的空着

Page 25: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.25Database System Concepts - 5th Edition, Oct 23, 2005.

Organization of Records in FilesOrganization of Records in Files记录组织 :堆,顺序,散列 ,聚簇记录组织 :堆,顺序,散列 ,聚簇

Heap – a record can be placed anywhere in the file where there is space 无序,堆,插入策略:无缝堆后,有缝插针,

Sequential – store records in sequential order, based on the value of the search key of each record ,有序

Hashing – 影院分单双号 Hash(X)= X mod 2, 入座时搜索空间减半 宝光寺:数罗汉 .

Hash(Person , sex)=2*(RedomSeed+Age+1)+Sex

clustering file organization 不同型记录可放一文件内,把相关记录(父子兄弟等,经常同时检索的)放在同一个磁盘块, 减少磁盘 IO, 快

注意 内存比磁盘快 10—100 倍,减少读块 次数 是速度关键

Page 26: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.26Database System Concepts - 5th Edition, Oct 23, 2005.

Sequential File OrganizationSequential File Organization

适用于 处理 序列性,历史性 数据的场合 按某个或几个属性 排序 search-key

Page 27: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.27Database System Concepts - 5th Edition, Oct 23, 2005.

Sequential File Organization (Cont.)Sequential File Organization (Cont.)

删除 – use pointer chains

插入 – 有缝插针 ,无缝 链溢 overflow block ,更新指针 与时俱进 重排序 reorganize time to time to restore

sequential order

无缝链溢 (溢出块)

Page 28: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.28Database System Concepts - 5th Edition, Oct 23, 2005.

Multitable Clustering File OrganizationMultitable Clustering File Organization

Store several relations in one file using a multitable clustering file organization

Page 29: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.29Database System Concepts - 5th Edition, Oct 23, 2005.

Clustering File Organization Clustering File Organization 聚簇文件聚簇文件

不同型但相关的数据放在一起 (旧式大家庭: 几代同堂 ) E.g., clustering organization of customer and

depositor:

作这两各关系连接时,快

results in variable size records

Page 30: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.30Database System Concepts - 5th Edition, Oct 23, 2005.

Data Dictionary Storage Data Dictionary Storage 数据字典存储数据字典存储

Information about relations names , 属性名,类型,长度 , views , integrity constraints

用户账号,口令 关于 DB 本身的统计数据 物理组织 Physical file organization information

物理存储 (sequential/hash/…) Physical location of relation , DISK ,Block

Information about indices 索引 (Chapter 12)

Data dictionary 或 system catalog

stores metadata: data about data, such as :

Page 31: Database System Concepts, 5th Ed. Chapter 11: Storage and File Structure.

©Silberschatz, Korth and Sudarshan11.31Database System Concepts - 5th Edition, Oct 23, 2005.

Chapter 11: Storage and File Structure Chapter 11: Storage and File Structure 小结小结 Overview of Physical Storage Media 物理存储介质 Magnetic Disks 磁盘 RAID 磁盘阵列 Tertiary Storage 第三存储 Storage Access 存取 File Organization 文件结构 Organization of Records in Files 记录结构 Data-Dictionary Storage 数据字典