Caching Queues in Memory Buffers
Rajeev Motwani (Stanford University), Dilys Thomas (Stanford University)


Transcript of Caching Queues in Memory Buffers

Page 1: Caching Queues in Memory Buffers

Rajeev Motwani (Stanford University)
Dilys Thomas (Stanford University)

Page 2: Problem

►Memory: fast, expensive, and small

►Disk: large (infinite) and inexpensive, but slow

►Maintaining queues: motivated by data streams, distributed transaction processing, and networks

►Queues are to be maintained in memory, but may be spilled onto disk.

Page 3: Model

►Queue updates and depletion

* single/multiple queues

►Cost Model:

* unit cost per read/write

* extended cost model: c0 + c1 * numtuples

(seek time = 5-10 ms, transfer rates = 10-160 MB/s)
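For scale, plugging the slide's figures into the extended cost model (assuming a 5 ms seek, a 100 MB/s transfer rate, and 100-byte tuples; the tuple size is an illustrative assumption, the other two lie in the ranges above):

\[
c_0 = 5\ \text{ms}, \qquad
c_1 = \frac{100\ \text{bytes}}{100\ \text{MB/s}} = 1\ \mu\text{s per tuple}, \qquad
\frac{c_0}{c_1} = 5000\ \text{tuples} \approx 500\ \text{KB}.
\]

So the fixed seek cost dominates unless each read or write moves on the order of hundreds of kilobytes, which is the chunk size T = c0/c1 used later by GREEDY-CHUNK.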

Page 4: Model (contd.)

►Online algorithms for different cost models

►Competitive analysis

►Acyclicity

Page 5: Algorithm HALF

(Diagram: the queue is split, oldest to newest, into HEAD (in memory), SPILLED (on disk), and TAIL (in memory). Memory size = M.)

Invariants:

* HEAD < SPILLED < TAIL (in arrival order)

* SPILLED empty => TAIL empty

* SPILLED nonempty => |HEAD|, |TAIL| ≤ M/2

Page 6: (Diagram: HEAD | SPILLED | TAIL, oldest to newest)

Page 7: Algorithm HALF (contd.)

Initially all tuples are in HEAD.

First write: the M/2 newest tuples move from HEAD to SPILLED.

Then tuples enter TAIL while SPILLED is nonempty.

* WRITE-OUT: when |TAIL| > M/2, write(M/2) TAIL → SPILLED

* READ-IN: when HEAD is empty and SPILLED is nonempty, read(M/2) SPILLED → HEAD

* TRANSFER: after a READ-IN, if SPILLED is empty, move (rename) TAIL → HEAD  // to maintain the invariant SPILLED empty => TAIL empty
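Putting the invariants and the three operations together, here is a minimal executable sketch (the disk is simulated by a deque). The class name, the per-operation counters, and the choice of which M/2 tuples of TAIL are flushed are assumptions of this sketch, not details taken from the paper.

```python
from collections import deque

class HalfQueue:
    """Minimal simulation of Algorithm HALF for one FIFO queue with memory budget M.

    Segments, oldest to newest: HEAD (memory) < SPILLED (disk, simulated) < TAIL (memory).
    reads/writes count disk operations, each moving M/2 tuples.
    """

    def __init__(self, M: int):
        assert M >= 2 and M % 2 == 0
        self.M = M
        self.head = deque()     # oldest tuples, in memory
        self.tail = deque()     # newest tuples, in memory (nonempty only if SPILLED is)
        self.spilled = deque()  # middle tuples, on disk (simulated)
        self.reads = self.writes = 0

    def enqueue(self, x):
        if not self.spilled:
            if len(self.head) == self.M:
                # First write: the M/2 NEWEST tuples of HEAD go to disk.
                self._write_out(self.head, newest=True)
                self.tail.append(x)          # SPILLED is nonempty now
            else:
                self.head.append(x)
        else:
            self.tail.append(x)
            if len(self.tail) > self.M // 2:
                # WRITE-OUT: flush M/2 tuples of TAIL to the end of SPILLED
                # (the oldest ones, so the on-disk order stays contiguous --
                #  an assumption of this sketch).
                self._write_out(self.tail, newest=False)

    def dequeue(self):
        if not self.head and self.spilled:
            # READ-IN: bring the oldest M/2 spilled tuples back into HEAD.
            for _ in range(min(self.M // 2, len(self.spilled))):
                self.head.append(self.spilled.popleft())
            self.reads += 1
            if not self.spilled:
                # TRANSFER: disk drained, so rename TAIL -> HEAD (no I/O).
                self.head.extend(self.tail)
                self.tail.clear()
        return self.head.popleft() if self.head else None

    def _write_out(self, segment, newest):
        k = self.M // 2
        moved = [segment.pop() for _ in range(k)] if newest \
                else [segment.popleft() for _ in range(k)]
        if newest:
            moved.reverse()                  # keep oldest-first order on disk
        self.spilled.extend(moved)
        self.writes += 1
```

Driving, say, HalfQueue(M=8) with any interleaving of enqueues and dequeues keeps roughly M tuples in memory and counts the M/2-sized disk operations that the analysis on the next slide reasons about.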

Page 8: Analysis

HALF is acyclic (its M/2-windows are disjoint).

Alternate M-windows are disjoint. At least one tuple from each M-window has to be written to disk by any algorithm, including the offline one, and these have to be distinct writes. Hence HALF is 2-competitive with respect to writes. The analysis for reads is similar.

Lower bound of 2 by a more complicated argument: see the paper.

(Figure: the tuple sequence divided into consecutive M/2-windows w1, ..., w6; consecutive pairs form the M-windows used in the argument above.)
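Schematically, the write-count argument can be summarized as follows (a loose restatement; the exact definition of the windows and the charging of the offline algorithm's writes to them are in the paper):

\[
\text{writes}(\text{HALF}) = k \quad\text{(one write-out per disjoint } M/2\text{-window)}, \qquad
\text{writes}(\text{OPT}) \ \ge\ \lfloor k/2 \rfloor \quad\text{(at least one distinct write per disjoint } M\text{-window)},
\]
\[
\text{so}\qquad
\frac{\text{writes}(\text{HALF})}{\text{writes}(\text{OPT})} \ \le\ \frac{k}{\lfloor k/2 \rfloor} \ \le\ 2 + O(1/k).
\]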

Page 9: Multiple (n) Queues

►Queue additions are adversarial, as in the previous setting.

►Queue depletions:

* Round-Robin

* Adversarial

Static allocation of the buffer among the n queues cannot be competitive.

Page 10: Multiple Queues: BufferedHead

►Dynamic memory allocation

►Write out the newest M/2n tuples of the largest queue in memory when there is no space in memory for incoming tuples

►Read-ins in chunks of M/2n (see the sketch below)

►Analysis: see the paper
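A minimal executable sketch of BufferedHead along these lines, reusing the per-queue HEAD/SPILLED/TAIL split of HALF. The class name, the counters, and what happens when the largest queue's TAIL holds fewer than M/2n tuples are assumptions of this sketch, not details from the paper.

```python
from collections import deque

class BufferedHead:
    """Minimal simulation of BufferedHead: n FIFO queues sharing M tuples of memory.

    Each queue i is split, oldest to newest, into head[i] (memory),
    spilled[i] (disk, simulated), tail[i] (memory).  Spills and read-ins
    move chunks of M/(2n) tuples, as on the slide above.
    """

    def __init__(self, M: int, n: int):
        self.M, self.n = M, n
        self.chunk = max(1, M // (2 * n))
        self.head = [deque() for _ in range(n)]
        self.tail = [deque() for _ in range(n)]
        self.spilled = [deque() for _ in range(n)]
        self.reads = self.writes = 0

    def _in_memory(self) -> int:
        return sum(len(h) + len(t) for h, t in zip(self.head, self.tail))

    def enqueue(self, i: int, x) -> None:
        if self._in_memory() >= self.M:
            # No space: spill the newest chunk of the largest in-memory queue.
            j = max(range(self.n),
                    key=lambda q: len(self.head[q]) + len(self.tail[q]))
            self._spill_newest(j)
        # New tuples join queue i's TAIL if part of it is on disk, else its HEAD.
        (self.tail[i] if self.spilled[i] else self.head[i]).append(x)

    def dequeue(self, i: int):
        if not self.head[i] and self.spilled[i]:
            # Read-in: the oldest chunk of queue i comes back from disk.
            for _ in range(min(self.chunk, len(self.spilled[i]))):
                self.head[i].append(self.spilled[i].popleft())
            self.reads += 1
            if not self.spilled[i]:                 # disk part drained:
                self.head[i].extend(self.tail[i])   # rename TAIL -> HEAD (no I/O)
                self.tail[i].clear()
        return self.head[i].popleft() if self.head[i] else None

    def _spill_newest(self, j: int) -> None:
        # Newest tuples of queue j live at the right end of TAIL, then of HEAD.
        from_tail = [self.tail[j].pop()
                     for _ in range(min(self.chunk, len(self.tail[j])))]
        from_head = [self.head[j].pop()
                     for _ in range(min(self.chunk - len(from_tail),
                                        len(self.head[j])))]
        # HEAD tuples are older than anything already spilled: prepend them.
        for x in from_head:
            self.spilled[j].appendleft(x)
        from_tail.reverse()                         # oldest-first on disk
        self.spilled[j].extend(from_tail)
        self.writes += 1
```

The prepend-to-disk step for tuples taken from a HEAD segment is purely a simulation convenience to keep each queue's on-disk portion in FIFO order.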

Page 11: Multiple Queues: BufferedHead

* BufferedHead is acyclic

* Round-Robin depletion: BufferedHead is 2n-competitive; √n lower bound for acyclic algorithms

* Adversarial depletion: no o(M)-competitive algorithm

* However, if given M/2 more memory than the adversary, BufferedHead is 2n-competitive

Page 12: Extended Cost Model: GreedyChunk

Cost model: c0 + c1 * t

Let the block size T = c0/c1

All read-ins and write-outs are in chunks of size T

T = 100 KB to a few MB

Simple algorithm: GREEDY-CHUNK (sketched below)

* Write out the newest T tuples when there is no space in memory

* Read in the T oldest tuples if the oldest tuple is on disk
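A minimal executable sketch of GREEDY-CHUNK, assuming M > 2T (the regime in which the next slide selects it). The class name, the cost counter, and the TRANSFER-style rename when the disk drains are assumptions of this sketch rather than details from the paper.

```python
from collections import deque

class GreedyChunkQueue:
    """Minimal simulation of GREEDY-CHUNK for one FIFO queue.

    Extended cost model: a disk operation moving t tuples costs c0 + c1 * t.
    All read-ins and write-outs move chunks of T = c0/c1 tuples.
    Assumes M > 2T, the regime in which the next slide selects GREEDY-CHUNK.
    """

    def __init__(self, M: int, c0: float, c1: float):
        self.M = M
        self.c0, self.c1 = c0, c1
        self.T = max(1, int(c0 / c1))   # break-even chunk size, in tuples
        assert M > 2 * self.T
        self.head = deque()     # in-memory tuples older than SPILLED
        self.tail = deque()     # in-memory tuples newer than SPILLED
        self.spilled = deque()  # simulated disk
        self.cost = 0.0

    def _charge(self, t: int) -> None:
        self.cost += self.c0 + self.c1 * t

    def enqueue(self, x) -> None:
        if len(self.head) + len(self.tail) >= self.M:
            # Write out the newest T tuples in one chunk.  When memory is full
            # the chosen segment holds at least T tuples, since M > 2T.
            src = self.tail if self.spilled else self.head
            moved = [src.pop() for _ in range(self.T)]
            moved.reverse()                   # oldest-first on disk
            self.spilled.extend(moved)
            self._charge(self.T)
        (self.tail if self.spilled else self.head).append(x)

    def dequeue(self):
        if not self.head and self.spilled:
            # The oldest tuple is on disk: read in the T oldest in one chunk.
            k = min(self.T, len(self.spilled))
            for _ in range(k):
                self.head.append(self.spilled.popleft())
            self._charge(k)
            if not self.spilled:
                # Disk drained: fold TAIL back into HEAD (a rename, no I/O) --
                # a TRANSFER step borrowed from HALF, assumed here for ordering.
                self.head.extend(self.tail)
                self.tail.clear()
        return self.head.popleft() if self.head else None
```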

Page 13: Extended Cost Model: GreedyChunk

If M > 2T, use Algorithm GREEDY-CHUNK; else use Algorithm HALF (a one-line chooser is sketched after this slide).

The algorithm is 4-competitive and acyclic.

Analysis: see the paper.

Easy extension to multiple queues.
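The selection rule above in one line (a hypothetical helper name, not from the paper):

```python
def choose_spill_algorithm(M: int, c0: float, c1: float) -> str:
    """Under the extended cost model c0 + c1 * t, pick GREEDY-CHUNK when memory
    holds more than two chunks of T = c0/c1 tuples, and HALF otherwise."""
    T = c0 / c1  # break-even chunk size, in tuples
    return "GREEDY-CHUNK" if M > 2 * T else "HALF"
```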

Page 14: Summary of competitive ratios (UB = upper bound, LB = lower bound)

                     Single Queue    Multiple Queues,      Multiple Queues,
                                     Round-Robin           Adversarial
                     UB     LB       UB      LB            UB     LB       UB (with M/2 extra memory)
Unit cost model      2      2        2n      √n            M      Ω(M)     2n
Linear cost model    4      -        -       -             -      -        -

(The √n lower bound is for acyclic algorithms; the adversarial lower bound is "no o(M)-competitive algorithm", as on Page 11.)

Page 15: Practical Significance

►Gigascope, AT&T's network monitoring tool (SIGMOD 2003): drastic performance decrease when the disk is used

►Data stream systems: spilling is a good alternative to approximation; no spilling algorithms were previously studied

Page 16: Related Work

►IBM MQSeries: spilling to disk

►Related work on network router design: using SRAM and DRAM memory hierarchies on the chip

Open Problems

►Acyclicity: remove it for multiple queues.

►Close the gap between the upper and the lower bound.