Scheduling in Staged-DB Systems
![Page 1: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/1.jpg)
Staged-DB IC-65 Advances in Data Management Systems 1
Scheduling in Staged-DB Systems
Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva
![Page 2: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/2.jpg)
Organization
• What is Staged-DB?
• Scheduling in Staged-DB
• Our Contribution
– Scheduling in Execution Phase
– System Modeling
• System Design Details
• Performance Study
• Future Work
![Page 3: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/3.jpg)
Motivation
Response time: time needed to produce the first page as output
Big advantage for the overlapping case ('1')
![Page 4: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/4.jpg)
Query Lifetime in DBMS
[Figure: Query → PARSER → Query tree → OPTIMIZER → Query plan → EXECUTION → Answer. Other labels: catalogs and statistics, operators, Data]
EXECUTION (Disk-IO): 90% OF TIME
![Page 5: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/5.jpg)
DB Paradigm So Far..
• Query → Query Execution Plan (Tree of Operators)
• Multiple Queries
– Each query handled by a DIFFERENT THREAD
• No cross communication/sharing across threads
• Sharing Opportunity is missed
[Figure: DBMS thread pool, no coordination. Caption: One Query, Multiple Operators]
![Page 6: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/6.jpg)
Staged-DB Paradigm
• DB is remodeled as various stages
• Stage
– “Common execution logic” grouped into a stage
– Each operator in QEP can be seen as a stage
• Query passed through all the needed stages to get an output
• Common Data needs Detected by the Stage
[Figure: DBMS thread pool compared with StagedDB. Caption: One Operator, Multiple Queries]
![Page 7: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/7.jpg)
Staged Database Systems
• The DB is divided into stages; the Execution stage is divided into microEngines
• Each stage has a queue; each microEngine also has a request queue
[Figure: Conventional DBMS (queries) compared with StagedDB (queries, Stage 1, Stage 2, Stage 3). Caption: High concurrency locality across requests]
![Page 8: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/8.jpg)
Scheduling In Staged-DB
• Scheduling at different levels
– Stages (Parser, Optimizer, Execution)
– Across MicroEngines (Execution Engine has SCAN, JOIN etc. micro-engines)
– Within MicroEngine
• We consider only scheduling “across microEngines”
• Scheduling Policies:
– Round-Robin
– Heavy Load First
– Light Load First
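The three policies listed above can be sketched as functions that choose which microEngine runs next. This is a minimal illustration, not the authors' implementation: it assumes that "load" means the number of packets waiting in an engine's input queue, and the function names and the `queue_lengths` dictionary are hypothetical.

```python
def make_round_robin():
    """Return a picker that cycles through engines in name order,
    skipping engines whose input queue is empty."""
    state = {"last": -1}

    def pick(queue_lengths):
        names = sorted(queue_lengths)
        n = len(names)
        for step in range(1, n + 1):
            idx = (state["last"] + step) % n
            if queue_lengths[names[idx]] > 0:
                state["last"] = idx
                return names[idx]
        return None  # no engine has work

    return pick


def heavy_load_first(queue_lengths):
    """Pick the engine with the most queued packets (assumed load measure)."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return max(busy, key=busy.get) if busy else None


def light_load_first(queue_lengths):
    """Pick the non-idle engine with the fewest queued packets."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return min(busy, key=busy.get) if busy else None
```

Each picker takes a mapping from engine name to queue length, so a global scheduler could swap policies without touching the engines themselves.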
![Page 9: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/9.jpg)
Detailed System Design
• Based on Discrete Event Simulation technique
• All the computation, data needs, dependencies are modeled using events
• System components
– Global System Queue
– Dispatcher
– Operator (or) Engine
– Global Scheduler
– Main Memory
– Overlap Detector
![Page 10: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/10.jpg)
Global System Queue
[Figure: Global System Queue. Labels: QueryArrival, Dispatcher, Scheduler, Disk-Fetch, EngineInsert, EngineExec-Begin, EngineExec-End, Memory]
Event fields:
• eventId
• componentId
• functionId
• firingTime
• packet
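A discrete event simulation of this kind is usually driven by a priority queue ordered by firing time. The sketch below shows one way the event fields above could be used. The class and method names (`GlobalSystemQueue`, `schedule`, `run`) and the handler lookup by `(component_id, function_id)` are assumptions made for illustration, not details taken from the slides.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class Event:
    # Ordering uses firing_time first, then event_id as a tie-breaker.
    firing_time: float
    event_id: int
    component_id: str = field(compare=False)
    function_id: str = field(compare=False)
    packet: Any = field(compare=False, default=None)


class GlobalSystemQueue:
    """Hypothetical event queue: events fire in order of firing_time."""

    def __init__(self):
        self._heap = []
        self._ids = itertools.count()
        self.now = 0.0

    def schedule(self, firing_time, component_id, function_id, packet=None):
        event = Event(firing_time, next(self._ids), component_id, function_id, packet)
        heapq.heappush(self._heap, event)
        return event

    def run(self, handlers):
        """Pop events in time order; handlers may schedule further events."""
        trace = []
        while self._heap:
            event = heapq.heappop(self._heap)
            self.now = event.firing_time
            trace.append((event.firing_time, event.component_id, event.function_id))
            handler = handlers.get((event.component_id, event.function_id))
            if handler is not None:
                handler(self, event)
        return trace
```

A handler receives the queue and the event, so a component such as the dispatcher can react to a query arrival by scheduling a follow-up event at a later simulated time.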
![Page 11: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/11.jpg)
Engine
Engine events:
• EngineInsert
• EngineExecution Begin
• EngineExecution End
Input Packet Queue
Packet format:
• queryId list
• queryPlans
• pageId
• contextInfo
Steps shown in the figure:
• Request packet from parent node/dispatcher
• Call Overlap detector
• Insert packet
• Pick packet from Q
• Send packet to child OR execute and produce output
• Insert event into event queue for the scheduler
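The packet format and the insert path can be sketched as follows. The overlap rule used here, merging two packets when they request the same `page_id`, is an assumption chosen for illustration; the slides do not spell out the detector's rule. The class names `Packet` and `Engine` are likewise hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    query_ids: list                 # queryId list: queries sharing this packet
    query_plans: dict               # queryPlans: query id -> plan
    page_id: int                    # pageId: page the packet refers to
    context_info: dict = field(default_factory=dict)  # contextInfo


class Engine:
    def __init__(self, name):
        self.name = name
        self.input_queue = deque()  # Input Packet Queue

    def insert(self, packet):
        """EngineInsert: run overlap detection, then enqueue if no overlap."""
        for queued in self.input_queue:
            if queued.page_id == packet.page_id:  # assumed overlap rule
                for qid in packet.query_ids:
                    if qid not in queued.query_ids:
                        queued.query_ids.append(qid)
                queued.query_plans.update(packet.query_plans)
                return queued
        self.input_queue.append(packet)
        return packet

    def pick(self):
        """Pick the first packet in the queue, or None if the queue is empty."""
        return self.input_queue.popleft() if self.input_queue else None
```

With this rule, two queries that need the same page end up sharing one packet, so the engine does the work for that page once.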
![Page 12: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/12.jpg)
Engines
• Join
• Sort
• Aggregation
• Scan
• Wait and Scan
• Index Scan
![Page 13: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/13.jpg)
Overlap detection
• With memory
• With input queue
• Two types
– Linear
– Spike
![Page 14: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/14.jpg)
Memory Manager
• Pinning and unpinning
• Put()
• pageExists()
• consumePage()
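The slides name the memory manager's operations but not their semantics, so the following is only one plausible reading. It assumes that `put` pins a page once per expected consumer, that `consume_page` unpins it once, and that a page is released when its pin count reaches zero. The Python method names mirror `Put()`, `pageExists()` and `consumePage()`; the `consumers` parameter and the `peak_pages` counter are hypothetical.

```python
class MemoryManager:
    """Sketch of a page store with pin counts (assumed semantics)."""

    def __init__(self):
        self._pins = {}       # page_id -> outstanding pin count
        self.peak_pages = 0   # largest number of pages held at once

    def put(self, page_id, consumers=1):
        """Load a page and pin it once per expected consumer."""
        if consumers < 1:
            raise ValueError("consumers must be at least 1")
        self._pins[page_id] = self._pins.get(page_id, 0) + consumers
        self.peak_pages = max(self.peak_pages, len(self._pins))

    def page_exists(self, page_id):
        """Return True if the page is currently held in memory."""
        return page_id in self._pins

    def consume_page(self, page_id):
        """Unpin once; return True if this released the page from memory."""
        if page_id not in self._pins:
            raise KeyError(f"page {page_id} is not in memory")
        self._pins[page_id] -= 1
        if self._pins[page_id] == 0:
            del self._pins[page_id]
            return True
        return False
```

Under these assumptions a page shared by two overlapping queries stays in memory until both have consumed it.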
![Page 15: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/15.jpg)
Performance study
• 5 queries
• 5 runs
• Uniform arrival rate
![Page 16: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/16.jpg)
Effect of Overlapping
Response time: time needed to produce the first page as output
Big advantage for the overlapping case ('1')
![Page 17: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/17.jpg)
Effect of Overlapping
Memory consumption: max # of pages consumed in memory during the lifetime of the query
Higher memory consumption with Overlapping!
![Page 18: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/18.jpg)
Effect of Overlapping
Throughput: # of queries completed in a unit of time
Clear advantage with Overlap detection!
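The response time and throughput definitions used in these slides can be written as small helper functions. This sketch assumes that response time is measured from the query's arrival and that throughput is counted over a half-open time window; the function and parameter names are hypothetical.

```python
def response_time(arrival_time, first_page_time):
    """Time from query arrival (assumed start point) to its first output page."""
    return first_page_time - arrival_time


def throughput(completion_times, window_start, window_end):
    """Queries completed per unit of time within [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window_end must be greater than window_start")
    completed = sum(1 for t in completion_times if window_start <= t < window_end)
    return completed / (window_end - window_start)
```

Both functions operate on simulated timestamps, so they can be applied to a trace after a run finishes.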
![Page 19: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/19.jpg)
Comparing scheduling policies
Mean response time
Round Robin seems to perform a little better
![Page 20: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/20.jpg)
Comparing scheduling policies
Memory consumption
No differences!
![Page 21: Scheduling in Staged- DB Systems](https://reader035.fdocuments.us/reader035/viewer/2022062308/56812c58550346895d90e00a/html5/thumbnails/21.jpg)
Future Work
• A few more interesting global scheduling policies are possible.
• The system does not yet have a local scheduling policy for choosing which packet in the input packet queue to process next. At the moment it picks the first packet in the queue.
• On the implementation side, experiments should be run with more engines and with benchmark-style input queries.