Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing...

25
1 340151 Big Data & Cloud Computing (P. Baumann) Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital information lasts forever - or five years, whichever comes first." -- Jeff Rothenberg, RAND Corp., 1997

Transcript of Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing...

Page 1: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

1340151 Big Data & Cloud Computing (P. Baumann)

Storing Data:

Disks and Files

Garcia Molina, Ullman, Widom

Ramakrishnan/Gehrke Ch. 9

"Digital information lasts forever - or five years, whichever comes first."

-- Jeff Rothenberg, RAND Corp., 1997

Page 2: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

2340151 Big Data & Cloud Computing (P. Baumann)

Why Not Everything in Main Memory?

Costs too much

• [Rama/Gehrke] $1000 will buy you either 128MB of RAM or 7.5GB of disk

• Today: 80 EUR will buy you either 4 GB of RAM or 1 TB of disk

• …but today we have multi-Terabyte databases!

Main memory is volatile

• want data to be saved between runs (obviously!)

Typical storage hierarchy:

• Main memory (RAM) for currently used data

• Disk for main database (secondary storage)

• Tapes for archiving older versions of data (tertiary storage)

Page 3: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

3340151 Big Data & Cloud Computing (P. Baumann)

Storage Capacity

Absolute times as of 2003, but ratios still ~ same

Page 4: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

4340151 Big Data & Cloud Computing (P. Baumann)

Storage Cost

Again, absolute values as of 2003, but ratios still ~ same

Page 5: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

5340151 Big Data & Cloud Computing (P. Baumann)

Storage Hierarchies

Magneto-optical media

Optical media

Magnetic tapes

RAID systems

Magnetic disks

Main memory

Storage capacityStorage capacity

Larger

CheaperSlower

Primary

memory

Secondary

memory

Tertiary

memory

Page 7: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

7340151 Big Data & Cloud Computing (P. Baumann)

Nearline (Tertiary) Storage

Usually tape

• Reel, today: cartridge

• Capacity 10 GB ~6 TB per tape

Tape robots

• HSM =

Hierarchical storage management

• multi-Petabytes

Page 8: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

8340151 Big Data & Cloud Computing (P. Baumann)

Caching & Virtual Memory

Cache: Fast memory, holding frequently used parts of a slower, larger

memory

• small (L1) cache holds a few kilobytes of the memory "most recently used" by the

processor

• Most operating systems keep most recently used "pages" of memory in main memory,

put the rest on disk

Virtual memory

• programs don't know whether accessing main memory or a page on secondary

memory page (most operating systems)

Database systems usually take explicit control over 2ndary memory access

Page 9: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

9340151 Big Data & Cloud Computing (P. Baumann)

Where Databases Reside

Hard Disk is secondary storage device of choice

• Many flavors:

Disk: Floppy (hard, soft); Winchester; Ram disks; Optical, CD−ROM; Arrays

Main advantage over tapes: random access vs. sequential

Data stored and retrieved in units called disk blocks or pages

Unlike RAM, time to retrieve a disk page varies

depending upon location on disk

• relative placement of pages on disk

has major impact on DBMS performance!

Page 10: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

10340151 Big Data & Cloud Computing (P. Baumann)

The Miracle Called "Hard Disk"

Disk head contains magnet, hovering over spinning platter

flight height: 10-20 nm

(x 5,000 gives one hair!)

Page 11: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

11340151 Big Data & Cloud Computing (P. Baumann)

Components of a Disk

platters spin

arm assembly moves in or out

to position head on desired track

Tracks under heads = a cylinder

(imaginary!)

Sector size = N * block size

(fixed)

...typical numbers?

Page 12: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

12340151 Big Data & Cloud Computing (P. Baumann)

Typical Numbers

Diameter: 1 inch ...15 inches

Cylinders: 40 (floppy) ... 20,000

Surfaces: 1 (old CDs) ... 2 (floppies) ... 30

Sector Size: 512 B ... 50 kB

Capacity: 360 kB (old floppy) ... 4 TB

Page 13: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

13340151 Big Data & Cloud Computing (P. Baumann)

Disk Access Time

I want block X block X in memory

?

Page 14: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

14340151 Big Data & Cloud Computing (P. Baumann)

Disk Access Time

Time = Seek Time +

Rotational Delay +

Transfer Time +

Other

Page 15: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

15340151 Big Data & Cloud Computing (P. Baumann)

Time = Seek Time +Rotational Delay +Transfer Time +Other

Seek Time

Page 16: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

16340151 Big Data & Cloud Computing (P. Baumann)

Average Random Seek Time

Typical S: 10 ms ...40 ms

= millions of times RAM access !

Time = Seek Time +Rotational Delay +Transfer Time +Other

Page 17: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

17340151 Big Data & Cloud Computing (P. Baumann)

Average Rotational Delay

R = 1/2 revolution

typical R = 4.16 ms (7,200 RPM)

Time = Seek Time +Rotational Delay +Transfer Time +Other

Page 18: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

18340151 Big Data & Cloud Computing (P. Baumann)

Transfer Rate

Transfer rate: t

• typical t: 10 ... 50 MB/second

transfer time T:

block size

T = ---------------

t

Ex: block size 32 kB, t = 32 MB/second

transfer time = …?

Time = Seek Time +Rotational Delay +Transfer Time +Other

Page 19: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

19340151 Big Data & Cloud Computing (P. Baumann)

CPU time to issue I/O

Contention for controller

Contention for bus, memory

Typical Value:

Other Delays

Time = Seek Time +Rotational Delay +Transfer Time +Other

0

(relative to other values)

Page 20: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

20340151 Big Data & Cloud Computing (P. Baumann)

Sequential Read?

So far: Random Block Access

What about: Reading next block?

Disks optimized towards "consecutive" reading!

• Blocks within track

• Tracks within cylinder

• Next cylinder

Page 21: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

21340151 Big Data & Cloud Computing (P. Baumann)

"Next Block" Costs

`Next’ block concept:

• blocks on same track, followed by

• blocks on same cylinder, followed by

• blocks on adjacent cylinder

If we don’t need to change cylinder:

Block Size

Time to get = ---------------- + Negligible

block t

• + switch track (ie, read next arm)

• + once in a while, next cylinder

Page 22: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

22340151 Big Data & Cloud Computing (P. Baumann)

Random vs Sequential Read

Rule of Thumb:

• Random I/O: Expensive

• Sequential I/O: Less expensive

Ex: 1 KB Block:

• Random I/O: ~ 20 ms

• Sequential I/O: ~ 1 ms

relative difference is smaller for larger blocks

Whenever possible arrange file blocks sequentially on disk (by `next’)

to minimize seek and rotational delay

• For sequential scan, pre-fetching several pages at a time is a big win! “burst read”

Page 23: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

23340151 Big Data & Cloud Computing (P. Baumann)

...Writing?

Cost for Writing cost for Reading

... unless we want to verify!

• Then, need to add

Block size

---------------- + (full) rotation

t

Page 24: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

24340151 Big Data & Cloud Computing (P. Baumann)

...To Modify a Block?

(a) Read Block

(b) Modify in Memory

(c) Write Block

[ (d) Verify ]

Page 25: Storing Data: Disks and Files€¦ · 340151 Big Data & Cloud Computing (P. Baumann) 1 Storing Data: Disks and Files Garcia Molina, Ullman, Widom Ramakrishnan/Gehrke Ch. 9 "Digital

25340151 Big Data & Cloud Computing (P. Baumann)

Wrap-Up

Capacities grow, data hunger grows larger

• Moore's Law vs Greg's Law vs disk growth

Databases heavily i/o bound

• Disk space management largely determines performance

Disk access time =

Seek Time + Rotational Delay + Transfer Time + Other