Storage Systems CSE 598d, Spring 2007
Storage Systems
CSE 598d, Spring 2007
Lecture 3: Disk drive trends, modeling (contd.)
Feb 1, 2007
• Topics
  – Disk drive modeling
  – SCSI vs ATA
  – Rules of thumb in data engineering
Disk Drive Modeling
• Difficult because disk behavior is
  – Non-linear
  – State dependent
• Not easy to model analytically
• Pitfalls (common simplifying assumptions)
  – Seek time linear w.r.t. distance
  – Uniform distribution for rotational latency
  – Constant transfer times
  – Ignoring bus contention
Comparing four different models
• (i) Constant fixed time for each I/O
• (ii) Simple model in which
  – Seek time is linear with distance
  – No head settle/switch costs
  – Uniform rotational delay
  – Fixed controller costs
  – Linear transfer costs
• (iii) Better seek and positioning model (d = seek distance in cylinders)
  – (a) Seek time = 3.45 + 0.597*sqrt(d) ms for d < 616 cylinders
  – (b) Seek time = 10.8 + 0.012*d ms for d >= 616 cylinders
  – 2.5 ms for head/track switch
  – Keeps track of rotational position
• (iv) All of (iii) + cache model + read-ahead + bus speed + controller overheads
• Chosen metric for comparison: relative demerit
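The piecewise seek curve of model (iii) can be written down directly. The sketch below uses the constants from the slide; the function name and the treatment of a zero-distance seek (returning 0 ms) are assumptions made for illustration, not something the slide specifies.

```python
import math

# Constants taken from model (iii) on the slide.
CROSSOVER_CYLINDERS = 616   # boundary between the short-seek and long-seek regimes
HEAD_SWITCH_MS = 2.5        # head/track switch cost quoted on the slide


def seek_time_ms(distance: int) -> float:
    """Seek time in ms for a seek of `distance` cylinders under model (iii)."""
    if distance < 0:
        raise ValueError("seek distance must be non-negative")
    if distance == 0:
        return 0.0  # assumption: no arm movement costs nothing
    if distance < CROSSOVER_CYLINDERS:
        # Short seeks: dominated by acceleration, so time grows with sqrt(d).
        return 3.45 + 0.597 * math.sqrt(distance)
    # Long seeks: the arm coasts at full speed, so time grows linearly.
    return 10.8 + 0.012 * distance


if __name__ == "__main__":
    for d in (1, 100, 615, 616, 1500):
        print(f"{d:5d} cylinders -> {seek_time_ms(d):6.2f} ms")
```

Contrast this with the "seek time linear with distance" assumption of model (ii): a linear fit cannot follow both the square-root and the linear regime.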
Results
[Figure panels: models (i), (ii), (iii), (iv)]
Note: this disk does not have a cache.
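The slides do not define the relative demerit metric. It is commonly described as the root-mean-square horizontal distance between the model's and the real drive's I/O-time distribution curves, divided by the mean real I/O time. The sketch below follows that description under the assumption that both samples have the same size, so that the k-th smallest values can be paired; the function name is hypothetical.

```python
import math


def relative_demerit(model_times, real_times):
    """RMS horizontal distance between two I/O-time distributions,
    expressed as a fraction of the mean real I/O time."""
    if len(model_times) != len(real_times) or not real_times:
        raise ValueError("need two non-empty samples of equal size")
    model_sorted = sorted(model_times)
    real_sorted = sorted(real_times)
    # Pairing the k-th smallest values compares the two cumulative
    # distributions at the same fraction, i.e. their horizontal distance.
    squared = [(m - r) ** 2 for m, r in zip(model_sorted, real_sorted)]
    demerit = math.sqrt(sum(squared) / len(squared))
    mean_real = sum(real_sorted) / len(real_sorted)
    return demerit / mean_real
```

A model whose distribution matches the real one exactly scores 0; larger values mean a worse model.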
Disk with a cache
[Figure panels: models (iii), (iv)]
Conclusions
• The following aspects are important
  – Disk cache/buffer (112)
  – Data transfer model (20)
    • Overlaps with bus transfer, seek-time, head-switching
  – Rotational position, data layout (2)
• While these are not (that important)
How do we get the drive parameters for such modeling?
• Manuals/data sheets
  – Not everything is published
  – Actual behavior can still vary from the documented values
• Interrogative extraction
  – The SCSI interface is extensive, but a given drive may not support all of it
  – Several more parameters may be needed
• Empirical/experimental extraction
  – This is hard
Complications of empirical extraction
• Overlapping controller overheads, bus transfers, mechanical delays, etc.
• Contention for shared resources
• Cache segmentation
• Prefetching
• Non-uniformity in performance (e.g. seeks)
• Large, seemingly non-deterministic delays (e.g. thermal recalibration)
• Fluctuations in timing
Parameters needed for modeling
• Data layout
• Seek, rotational latency and transfer costs
• Bus, controller and host processing costs
• Caching and prefetching parameters
Data Layout Parameters
• Where does a block actually reside on disk?
• The mapping may need to be re-acquired after each format, since reallocated defects may be converted to slipped defects for better efficiency
• The SCSI SEND DIAGNOSTIC / RECEIVE DIAGNOSTIC RESULTS commands can be used to query the actual location of a block
  – Doing this for each block would be very time-consuming
Storage Systems
CSE 598d, Spring 2007
Lecture 4: Disk drive trends, modeling (contd.)
Feb 8, 2007
Empirical Extraction
• Send pairs of commands to the disk and iteratively measure the Minimum Time Between Request Completions, MTBRC(a,b), for requests a and b.
• The rotational distance between the two requests of a pair is varied until the time between completions reaches a minimum.
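The search over rotational distance can be sketched as a simple sweep. In the sketch below, `measure_gap_ms` is a hypothetical callback standing in for "issue the request pair at this sector offset and time the gap between completions"; the synthetic drive used to exercise it (6 ms rotation, 100 sectors per track, 1 ms of overhead between requests) is made up for illustration.

```python
from statistics import mean
from typing import Callable, Tuple


def find_mtbrc(measure_gap_ms: Callable[[int], float],
               sectors_per_track: int,
               repeats: int = 5) -> Tuple[int, float]:
    """Sweep the rotational offset between the two requests and return
    (best_offset, smallest_mean_gap_ms)."""
    best_offset, best_gap = 0, float("inf")
    for offset in range(sectors_per_track):
        # Average several trials per offset to damp timing fluctuations.
        gap = mean(measure_gap_ms(offset) for _ in range(repeats))
        if gap < best_gap:
            best_offset, best_gap = offset, gap
    return best_offset, best_gap


def synthetic_gap_ms(offset: int) -> float:
    """Made-up drive: if the second sector arrives under the head before the
    drive is ready, the request waits one extra revolution."""
    rotation_ms, sectors, overhead_ms = 6.0, 100, 1.0
    sector_ms = rotation_ms / sectors
    arrival = offset * sector_ms
    if arrival < overhead_ms:
        arrival += rotation_ms  # missed the sector, wait a full rotation
    return arrival + sector_ms  # plus the time to transfer one sector


if __name__ == "__main__":
    print(find_mtbrc(synthetic_gap_ms, sectors_per_track=100))
```

On a real drive the callback would have to cope with the complications listed earlier (caching, prefetching, recalibration), which is why several trials per offset are averaged.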
Extracting Head Switch Time
• MTBRC1 = MTBRC(1-sector write, 1-sector read on the same track)
         = Host1 + Cmd + Media + Bus + Comp
• MTBRC2 = MTBRC(1-sector write, 1-sector read on a different track of the same cylinder)
         = Host2 + Cmd + HdSw + Media + Bus + Comp
• HdSw = (MTBRC2 - Host2) - (MTBRC1 - Host1)
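Because Cmd, Media, Bus and Comp appear in both measurements, they cancel in the subtraction and only the head switch remains. A minimal sketch of that arithmetic follows; the function and parameter names are chosen for illustration and the sample values are made up.

```python
def head_switch_ms(mtbrc1_ms: float, host1_ms: float,
                   mtbrc2_ms: float, host2_ms: float) -> float:
    """HdSw = (MTBRC2 - Host2) - (MTBRC1 - Host1).

    mtbrc1_ms: write/read pair on the same track
    mtbrc2_ms: write/read pair on a different track of the same cylinder
    host*_ms:  host processing delay measured for each pair
    """
    return (mtbrc2_ms - host2_ms) - (mtbrc1_ms - host1_ms)


if __name__ == "__main__":
    # Made-up measurements, in milliseconds.
    print(head_switch_ms(mtbrc1_ms=7.0, host1_ms=0.5,
                         mtbrc2_ms=8.0, host2_ms=0.75))
```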
Extracting Seek Times
(i) For each seek distance, select 5 evenly spaced starting points. From each of these points, perform 10 inward and 10 outward seeks of this distance, and average the measured times.
(ii) Measure MTBRC(1-sector write, 1-sector read on same track) and MTBRC(1-sector write, 1-sector read on next cylinder). The difference between these is the mechanical time for a 1-cylinder seek.
(iii) Subtract (ii) from the 1-cylinder value of (i). The difference represents the non-mechanical overhead of a seek.
(iv) Subtract (iii) from each of the values obtained in (i) to get the mechanical seek time for every distance.
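The post-processing in the steps above is plain arithmetic once the measurements exist. The sketch below assumes the raw seek timings are already collected in a dictionary keyed by seek distance; the function name and input layout are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def mechanical_seek_curve(samples_ms: Dict[int, List[float]],
                          mtbrc_same_track_ms: float,
                          mtbrc_next_cylinder_ms: float) -> Dict[int, float]:
    """Turn raw seek measurements into mechanical seek times per distance."""
    if 1 not in samples_ms:
        raise ValueError("need measurements for a 1-cylinder seek")
    # (i) average all measurements taken for each seek distance
    averaged = {d: mean(times) for d, times in samples_ms.items()}
    # (ii) mechanical time of a 1-cylinder seek from the two MTBRC values
    one_cylinder_mechanical = mtbrc_next_cylinder_ms - mtbrc_same_track_ms
    # (iii) whatever is left of the measured 1-cylinder seek is overhead
    overhead = averaged[1] - one_cylinder_mechanical
    # (iv) remove that overhead from every distance
    return {d: t - overhead for d, t in averaged.items()}


if __name__ == "__main__":
    raw = {1: [3.0, 3.0], 100: [7.0, 9.0]}  # made-up timings in ms
    print(mechanical_seek_curve(raw, 6.0, 7.5))
```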
Typical Seek time profile
Extracting Rotation Speed
• Perform a series of 1-sector writes to the same location and calculate the mean time between completions.
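Repeated writes to the same sector each have to wait for that sector to come around again, so the mean gap between completions approximates the rotation period. The sketch below converts completion timestamps into RPM under the assumption that each write waits exactly one revolution; the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def estimate_rpm(completion_times_ms: Sequence[float]) -> float:
    """Estimate spindle speed from completion timestamps (in ms) of
    back-to-back 1-sector writes to the same location."""
    gaps = [later - earlier
            for earlier, later in zip(completion_times_ms,
                                      completion_times_ms[1:])]
    if not gaps:
        raise ValueError("need at least two completion timestamps")
    rotation_ms = mean(gaps)       # assumed: one revolution per write
    return 60_000.0 / rotation_ms  # 60,000 ms per minute


if __name__ == "__main__":
    print(estimate_rpm([0.0, 6.0, 12.0, 18.0]))
```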
Extracting Cache Segments, Size …
• Hypothesize that the number of segments is N.
• Perform 1-sector reads of the first logical blocks of the first N-1 cylinders.
• Perform a 1-sector read of the first logical block of the last data cylinder.
• Perform a 1-sector read of the first logical block of the first cylinder. If that is a hit (judged by response time), then the number of segments is N or greater.
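The probe above can be tried against a toy drive. The sketch below is a simulation, not code for a real device: `SimulatedDrive`, its LRU segment replacement, and the hit/miss latencies are all assumptions made so that the probing logic can run.

```python
from collections import OrderedDict

HIT_MS, MISS_MS = 0.5, 10.0  # made-up response times


class SimulatedDrive:
    """Toy drive with one cache segment per cylinder read, LRU replacement."""

    def __init__(self, num_segments: int, num_cylinders: int = 1000):
        self.num_segments = num_segments
        self.num_cylinders = num_cylinders
        self._segments = OrderedDict()

    def read_first_block(self, cylinder: int) -> float:
        if cylinder in self._segments:
            self._segments.move_to_end(cylinder)  # refresh LRU position
            return HIT_MS
        if len(self._segments) >= self.num_segments:
            self._segments.popitem(last=False)    # evict least recently used
        self._segments[cylinder] = True
        return MISS_MS


def has_at_least(drive: SimulatedDrive, n: int,
                 hit_threshold_ms: float = 2.0) -> bool:
    """Run the probe for hypothesis N = n (n >= 2)."""
    for cylinder in range(n - 1):                    # first N-1 cylinders
        drive.read_first_block(cylinder)
    drive.read_first_block(drive.num_cylinders - 1)  # last data cylinder
    # Re-read the first cylinder: a fast response means it is still cached.
    return drive.read_first_block(0) < hit_threshold_ms


def count_segments(drive: SimulatedDrive, max_segments: int = 64) -> int:
    n = 2
    while n <= max_segments and has_at_least(drive, n):
        n += 1
    return n - 1


if __name__ == "__main__":
    print(count_segments(SimulatedDrive(num_segments=4)))
```

With a replacement policy other than LRU the same probe may give a different answer, which is one reason the replacement algorithm itself appears among the parameters to extract.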
Extraction techniques for
• Segment size
• Do prefetched data replace requested data in the current segment?
• Are all requested data always thrown away?
• Does prefetching stop on track/cylinder boundaries?
• Is the prefetching size proportional to request size?
• Does it implement read-on-arrival? Write-on-arrival?
• Is cache space allocated on a track or sector basis?
• Can READs hit on data placed in the cache by WRITEs?
• What is the segment replacement algorithm?
The physical I/O path from CPU to the disk
[Figure: I/O path. Components shown: CPUs and RAMs on the system bus; bridge chip; host I/O bus (PCI, InfiniBand); graphics card, Ethernet NIC, SCSI HBA, FC HBA, iSCSI HBA; I/O buses: SCSI, Fibre Channel, IP LAN]
• System bus
  – Rapid data transfer between CPU and memory
• Host I/O bus
  – Common: PCI; emerging: InfiniBand
• Device drivers are responsible for control of, and communication with, peripheral devices of all types
  – Part of the device driver for a storage device is almost always realized in firmware that runs on special processors (ASICs)
    • ASICs are either integrated into the main circuit board (e.g. on-board SCSI controllers) or connected to the main board via add-on cards (PCI cards)
  – Storage devices are connected to the server via the host bus adapter (HBA)
  – The communication connection between the HBA and the peripheral device is called the I/O bus
• Similar I/O paths/techniques are used within a disk subsystem
I/O bus technologies
• SCSI
• ATA/IDE, Serial ATA (SATA)
• SCSI over IP (iSCSI)
• Fibre Channel
• USB
• … many more
SCSI basics
• Small Computer System Interface
• First version released in 1986
  – Many versions since
• The dominant technology for UNIX and PC servers
  – Assignment: find out what your laptop/desktop uses
• A communication protocol as well as a bus
• Parallel bus for data, plus additional lines for control of communication
SCSI basics (more)
• A daisy chain can connect up to 16 devices together
• The SCSI protocol defines
  – How devices reserve the bus
  – In what format data is transferred
  – Initial versions: message, then ACK, then next message
  – Latest versions: asynchronous issuance, multiple messages in transit together, increased data rate
SCSI vs ATA: Motivating Factors
• Cost (market demands)
• Form factor
• Configuration in groups
• Reliability
• Access patterns
Leading to differences in …
• Mechanics
• Materials
• Electronics
• Firmware
• Performance (RPM and seeks)
• Reliability
• Power consumption
• …
Differences in Mechanics
• ES (enterprise storage, i.e. SCSI-class) head/disc assembly
  – Sustains higher disturbance
  – Higher rigidity
  – More mass
  – Higher-bandwidth servos
  – Avoids through holes
  – Filter for particles, desiccant for humidity, carbon absorbent for organic materials
  – Better air flow hardware
  – O-ring seals for spindle
  – Higher quality sealing
Mechanics (contd.)
• Actuator
  – Larger magnets for faster seeks
  – Lower-resistance actuator coils (thicker and fewer windings)
  – The latch (which holds the actuator when powered off) can affect seek performance. ES drives compensate for this with a bi-stable latch.
• Spindle
  – Higher RPM => windage and vibrations
  – PS (personal storage, i.e. ATA-class) drives use a cantilever design to hold the motor (captured only at the base), while ES drives capture the motor at both ends.
Differences in Electronics
• Drive electronics must accept and process commands from the host, and perform head positioning, servo processing, data transfers, cache management, etc.
  – PS drives may not have a separate servo processor (to handle repeatable or non-repeatable runout)
  – ES ASIC gate count is 2X the PS gate count
  – ES firmware code size is 2X the PS firmware code size (to handle more concurrency)
  – ES cache space is 10X the PS cache space
Differences in Magnetics
• More or less similar, since there is no reason why the latest advancements cannot be used in both.
• The main differences are in the electronics needed to maintain an adequate signal-to-noise ratio (SNR) at the higher RPM of ES drives.
Differences in Performance
• Capacity
  – Areal density is similar since they use the same magnetics
  – Differences are due to the number of platters and their size
• Size of platters
  – Power is nearly cubic in platter size
  – To sustain higher RPM, ES drives use smaller platters (2.5” and lower), which also helps seeking
• Number of platters
  – The trend is towards de-populated drives, since more drives can be used to meet the capacity demands in ES environments
Performance (contd.)
• Data rates
  – Though higher RPM favors ES, PS benefits from larger platter size and more frequent introductions of newer models.
Performance (contd.)
• Seeks
  – Mechanical improvements and smaller platters favor ES.
  – ES also allows larger queue depths of outstanding requests, which benefit from smarter scheduling.
Rotational Vibration
• The environment and nearby drives can excite a drive enough to throw the actuator off-track.
• Note that this causes performance loss.
• Need to understand how much vibration (in rad/s²) is present and design for it.
• Some recent drives even have a vibration sensor for compensation in servo processing.
Reliability
• Specified in terms of power-on hours (8 hrs/day for PS, 24 hrs/day for ES).
• Depends on
  – Duty cycle (40% for ES vs. 75% for PS due to shorter seeks)
  – Temperature
  – Particles inside
  – Head crashes
Serial ATA (SATA)
• Serial implementation of ATA
• Higher data rates than parallel ATA
  – 133-150 MB/s, compared to 320 MB/s for SCSI
• Easier to configure, cheaper, less reliable (?)