
Local Filesystems (part 2)

CPS210, Spring 2006

Papers

Towards Higher Disk Head Utilization: Extracting Free Bandwidth From Busy Disk Drives (Christopher Lumb)

Automatic I/O Hint Generation through Speculative Execution (Fay Chang, Duke ’94)

Servicing a foreground request (slide animation): the disk is about to read the Blue sector; after reading Blue, the Red request is scheduled next.

Seek to Red’s track (SEEK)
Wait for the Red sector to reach the head: rotational latency (ROTATE)
Read the Red sector

Freeblock scheduling

Observation: the head passes lots of sectors during rotational delays. Maybe we can read those sectors for “free”?

Use the free bandwidth for background tasks: don’t wait for long idle periods; interleave background traffic with high-priority requests.

Rotational latency gap utilization

Slide animation: after the Blue read, instead of waiting out the rotational gap:

Seek to a third track (SEEK)
Free transfer: read desired background sectors on that track (FREE TRANSFER)
Seek to Red’s track (SEEK)
Read the Red sector as it rotates under the head

Many possible schedules

What information is needed? Detailed information about the disk:

t(access) = t(seek) + t(rotate) + t(read)

Current head position
Disk geometry: cylinders, tracks/cylinder, sectors/track
Disk physics: seek time, min/max seek, spindle speed (a state machine plus the cost of its transitions)
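A minimal sketch of this access-time model in Python. All geometry and timing constants, and the shape of the seek curve, are illustrative assumptions, not values from the paper:

```python
# Toy model of t(access) = t(seek) + t(rotate) + t(read).
# Constants are illustrative, not from the paper.

RPM = 10_000
SECTORS_PER_TRACK = 400
MS_PER_REV = 60_000 / RPM  # 6 ms per revolution at 10,000 RPM

def seek_ms(from_cyl, to_cyl):
    """Illustrative seek curve: a fixed settle time plus a distance term."""
    distance = abs(from_cyl - to_cyl)
    return 0.0 if distance == 0 else 1.0 + 0.01 * distance

def rotate_ms(head_angle, target_angle, elapsed_ms):
    """Delay until the target sector reaches the head, accounting for
    how far the platter has already turned during elapsed_ms."""
    pos = (head_angle + 360.0 * elapsed_ms / MS_PER_REV) % 360.0
    return ((target_angle - pos) % 360.0) / 360.0 * MS_PER_REV

def access_ms(head_cyl, head_angle, req_cyl, req_angle, sectors=1):
    """t(access) = t(seek) + t(rotate) + t(read)."""
    t_seek = seek_ms(head_cyl, req_cyl)
    t_rotate = rotate_ms(head_angle, req_angle, t_seek)
    t_read = sectors / SECTORS_PER_TRACK * MS_PER_REV
    return t_seek + t_rotate + t_read
```

Note that the rotational term depends on the seek time already spent, which is exactly why a scheduler needs the full state machine rather than independent per-component costs.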

Freeblock scheduler

Architecture (from the slide’s diagram): the Request Queue feeds the Foreground Scheduler and the Freeblock Scheduler, both of which consult the Disk State Machine and Physics Engine.

Freeblock scheduling requirements

Which background apps is this best for? Low priority, large sets of desired blocks, no ordering constraints, small memory working sets.

Example apps: scanners, layout optimization, prefetching

Feasibility of freeblock scheduling

Impact of disk characteristics: a “random” workload
10,000 requests, 4 KB each, uniform start locations, two reads per write
~1/3 of the time is spent in rotational latency

Feasibility of freeblock scheduling

Impact of workload characteristics

More locality means more opportunity; smaller requests also increase opportunity

Feasibility of freeblock scheduling

Impact of foreground scheduling algorithm

Four algorithms: First-Come-First-Served (FCFS), Circular-LOOK, Shortest-Seek-Time-First (SSTF), Shortest-Positioning-Time-First (SPTF)

Random workload, at least 20 outstanding requests

Feasibility of freeblock scheduling

SPTF uses the same information as the freeblock scheduler, so it siphons off the free bandwidth.

Instead we want a foreground schedule with low positioning delay but high rotational latency.

Use SPTF-SWn%

SPTF-SWn%

Compute A via SPTF
Of the remaining requests, find those within n% of A’s positioning time
From that set, compute B, the request with the smallest seek
Schedule B

Note: SPTF = SPTF-SW0% and SSTF = SPTF-SW∞%
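The SPTF-SWn% selection step can be sketched as follows. The positioning_ms and seek_ms arguments stand in for the disk model’s positioning and seek estimates; they are assumed helpers, not part of the paper’s interface:

```python
def sptf_sw(requests, head, n_pct, positioning_ms, seek_ms):
    """SPTF-SWn%: among requests whose positioning time is within n%
    of the SPTF winner's, pick the one with the smallest seek."""
    # A is the plain SPTF choice: minimum total positioning time.
    a = min(requests, key=lambda r: positioning_ms(head, r))
    bound = positioning_ms(head, a) * (1 + n_pct / 100.0)
    # The "swap window": requests nearly as good as A.
    window = [r for r in requests if positioning_ms(head, r) <= bound]
    # Prefer short seeks within the window, leaving rotational slack.
    return min(window, key=lambda r: seek_ms(head, r))
```

With n = 0 the window contains only A (plain SPTF); as n grows toward infinity every request qualifies and the choice degenerates to smallest seek (SSTF), matching the two identities on the slide.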

SPTF-SWn% results

What is the right balance between foreground performance and free-BW?

Scheduling algorithm

At A, the foreground scheduler picks B
r(A,B) = rotational latency from A to B
For each track t between A and B:
  s(t,A,B) = seek time from A to t plus from t to B
  How many requested blocks pass under the head in r(A,B) - s(t,A,B)?
Skip tracks where the number of requests is less than the best found
Skip tracks where the free bandwidth is less than the best found
Start with the source/destination cylinders, searching in ascending s(t,A,B) order
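The search above can be sketched as follows. The helpers are assumptions: seek_via(t) stands in for s(t,A,B), and blocks_on(t, window) counts the background blocks readable on track t within a time window; the paper’s pruning by free bandwidth is simplified to an early exit on the seek-ordered scan:

```python
def pick_free_track(tracks, r_ab, seek_via, blocks_on):
    """Freeblock search: among candidate tracks t, find the one that
    serves the most background blocks inside the rotational gap r(A,B)."""
    best_track, best_blocks = None, 0
    # Search in ascending seek-time order so the remaining window
    # only shrinks, letting us stop early.
    for t in sorted(tracks, key=seek_via):
        window = r_ab - seek_via(t)  # time left after detouring via t
        if window <= 0:
            break  # no rotational slack remains for any later track
        n = blocks_on(t, window)
        if n > best_blocks:
            best_track, best_blocks = t, n
    return best_track, best_blocks
```

Because the detour must finish within r(A,B), a chosen track never delays the foreground request B; the transfer really is free.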

Scheduling algorithm

Best case? One seek: from A to B

Worst case? Lots of small requests evenly distributed
Most searches take 0-2.5 ms (on a 550 MHz Pentium II); average disk access time is ~10 ms

How do we deal with disk latency?

Writes? “Write-behind” buffering: return to the app before the data is on disk
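A toy write-behind buffer sketch. The batching threshold and the write_to_disk callback are illustrative assumptions; a real implementation would also flush on a timer and on sync:

```python
class WriteBehindBuffer:
    """Acknowledge writes immediately; push them to disk later in batches."""

    def __init__(self, write_to_disk, flush_at=4):
        self.write_to_disk = write_to_disk  # callback that does the real I/O
        self.flush_at = flush_at
        self.pending = []

    def write(self, block, data):
        # The app's write "completes" here, long before the disk sees it.
        self.pending.append((block, data))
        if len(self.pending) >= self.flush_at:
            self.flush()

    def flush(self):
        for block, data in self.pending:
            self.write_to_disk(block, data)
        self.pending.clear()
```

The trade-off is the usual one: latency is hidden from the application, but buffered data is lost on a crash until flush runs.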

Reads? Caching and prefetching

Prefetching Basics

Assume most reads are sequential: on a request for the block at B, also read B+1 (the next read depends only on the last position)

But some reads aren’t sequential:
Not all that useful for reads smaller than a block
The next read might depend on the last block’s content
A wrong guess can hurt performance
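A minimal one-block-ahead prefetcher under these assumptions; the cache and the read_block backing function are hypothetical stand-ins:

```python
class SequentialPrefetcher:
    """On a read of block b, also fetch b+1 into the cache,
    betting that the access pattern is sequential."""

    def __init__(self, read_block):
        self.read_block = read_block  # fetches one block from "disk"
        self.cache = {}

    def read(self, b):
        if b not in self.cache:
            self.cache[b] = self.read_block(b)
        # Prefetch the next block; this is wasted work if the guess is wrong.
        if b + 1 not in self.cache:
            self.cache[b + 1] = self.read_block(b + 1)
        return self.cache[b]
```

This illustrates the slide’s limitation directly: the guess uses only the last position, so content-dependent access patterns defeat it, which is the gap SpecHint targets.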

SpecHint idea

Generate hints via speculative execution

Want more synchronous reads to be cache hits: use speculation to populate the cache ahead of time

This adds CPU load. Why is that OK?
Processes are stalled during I/O for tens of ms
CPU performance increases exponentially, while disk performance gains are flatter

Assumptions

What assumptions does SpecHint make?
Read data does affect control flow
The damage from stray speculation can be limited
There is lots of idle CPU time

Are these reasonable? Does this make sense for one-disk systems?

Design goals

Correctness: leave the results of the transformed app unchanged
Free: worst case less than 10% slower
Effective: increase in cache hits

Generating hints

Want to issue as many hints as possible, and to issue them early
“On track” vs. “off track”: is a hint correct?

We might start speculation only after each I/O stall, but that limits speculation to ~10 ms
Instead, always run speculation and restart it when it goes off track

Detecting stray speculation

Maintain a hint log; the original process checks the hint log before each read:
If the next entry is null, speculation is off track
If the next entry is wrong, speculation is off track
If the next entry matches, speculation is on track

If the speculative process is off track, restart it: recopy the registers, cancel outstanding hints, clear copy-on-write state
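The on-track check can be sketched as follows. The log layout (a simple queue of block numbers) and the restart_speculation hook are hypothetical simplifications of SpecHint’s mechanism:

```python
def check_hint_log(hint_log, actual_read, restart_speculation):
    """Run by the original process before each real read: compare the
    actual block against the speculative process's next logged hint.
    A null or wrong entry means speculation has strayed."""
    next_hint = hint_log.pop(0) if hint_log else None
    if next_hint is None or next_hint != actual_read:
        restart_speculation()  # recopy registers, cancel hints, clear COW
        return False           # off track
    return True                # on track
```

Checking against the log keeps the fast path cheap: the original process never waits on the speculative one, it only consumes whatever hints have been logged so far.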

Other ways to detect stray speculation?