Backlog Estimation and Management for Real-Time Data Services

Kyoung-Don Kang, Jisu Oh, and Yan Zhou. Department of Computer Science, State University of New York at Binghamton. 20th Euromicro Conference on Real-Time Systems (ECRTS ’08), July 2~4, 2008.


Backlog Estimation and Management for Real-Time Data Services

Kyoung-Don Kang, Jisu Oh, and Yan Zhou
Department of Computer Science
State University of New York at Binghamton

20th Euromicro Conference on Real-Time Systems (ECRTS ’08), July 2~4, 2008


Real-time data service is challenging

• Real-time database (RTDB) applications
  - E-commerce, traffic monitoring
• Requirements
  - Processing user requests in a timely manner
  - Maintaining the freshness of temporal data
• Challenges
  - Dynamic database workloads due to data/resource contention
  - Difficult to precisely analyze the amount of data the database will need to process
  - Conflicts between transaction timeliness and data freshness

Feedback Control

• A promising approach to managing the performance of RTDBs in the presence of dynamic workloads
• Required procedure
  - System modeling (SYSID), using a linear relationship
  - Controller design (e.g., P, PI, PID)
  - Controller tuning (Root Locus tool in Matlab) to support the performance requirements and the closed-loop stability requirements
  - Actuator design

Shortcomings of existing work

• System modeling problem
  - An inaccurate metric for measuring the data service workload: queue length vs. service delay
  - Coarse-grained approach: when the amount of data accessed by each request varies, the queue length cannot correctly measure the workload
• Mostly based on simulations due to a lack of real-time database testbeds
  - Simulations have limitations in modeling real system behaviors


Our Approach

Goal: Supporting the target service delay bound and enhancing the data service throughput

Key technologies

1) A database backlog estimation mechanism

2) Modeling of the system dynamics based on the relation between the estimated backlog and the service delay

3) Systematic backlog adaptation via fine-grained, feedback-based admission control built on the backlog model

4) Workload smoothing using a hint-based scheme


Implemented and evaluated in a real database system

• Chronos
  - A soft real-time database testbed built on top of Berkeley DB
  - Processes thousands of clients’ data service requests for stock quotes and stock trading
  - Periodically updates 3,000 stock prices for data freshness management, which is critical for real-time data services


RTDB Architecture

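[Figure: RTDB architecture. Components: user requests, admission control and traffic smoothing (AC & TS), ready queue, dispatch, user request service threads, stock update threads, stock price update server, database, commit, backlog estimator with metadata, performance monitor, and feedback controller. Signals: the measured service delay s(k) and the target service delay feed the feedback controller, which outputs the backlog adjustment δd(k); the target backlog is updated as dt(k) = dt(k-1) + δd(k).]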


Backlog

• The amount of data for the database to process
• For backlog estimation
  - Use the database schema and the semantics of queries and transactions
  - Maintain metadata
    - System statistics (number of rows per table, row sizes, etc.)
    - Materialized portfolio information (number of stock items each client owns)
• About 1.2% CPU utilization overhead


Backlog estimation: example (1) “view-stock” query

• A user query about stock prices for a set of companies
  - e.g., “view-stock, IBM, DELL, Microsoft”
• Chronos accesses two tables to answer this query
  - STOCKS table to read the information of the set of companies
  - QUOTES table to get the associated stock prices
• Backlog for this query is nc ∙ {r(STOCKS) + r(QUOTES)}, where
  - nc is the number of companies in the query
  - r(x) is the average size of a row in table x


Backlog estimation: example (2) “view-portfolio” transaction

• A client transaction about the stock information in his/her portfolio
  - e.g., “view-portfolio, client-id 150”
• Chronos accesses two tables to process this transaction
  - PORTFOLIOS table to look up the company stock IDs of the stocks owned by the client
  - QUOTES table to get the associated stock prices
• Backlog for this transaction is |portfolio(id)| ∙ {r(PORTFOLIOS) + r(QUOTES)}, where
  - |portfolio(id)| is the number of stock items in the portfolio owned by the client whose ID is id
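The two backlog formulas (view-stock and view-portfolio) can be sketched in Python as follows. This is only an illustration: the row sizes, the portfolio contents, and the names AVG_ROW_SIZE, PORTFOLIO_SIZE, view_stock_backlog, and view_portfolio_backlog are hypothetical and are not taken from Chronos.

```python
# Hypothetical metadata (assumed values, not measured in Chronos).
AVG_ROW_SIZE = {"STOCKS": 120, "QUOTES": 40, "PORTFOLIOS": 24}  # r(x), in bytes
PORTFOLIO_SIZE = {150: 8}  # |portfolio(id)|: stock items owned by each client


def view_stock_backlog(companies):
    """Backlog of a view-stock query: nc * (r(STOCKS) + r(QUOTES))."""
    nc = len(companies)
    return nc * (AVG_ROW_SIZE["STOCKS"] + AVG_ROW_SIZE["QUOTES"])


def view_portfolio_backlog(client_id):
    """Backlog of a view-portfolio transaction:
    |portfolio(id)| * (r(PORTFOLIOS) + r(QUOTES))."""
    items = PORTFOLIO_SIZE[client_id]
    return items * (AVG_ROW_SIZE["PORTFOLIOS"] + AVG_ROW_SIZE["QUOTES"])


print(view_stock_backlog(["IBM", "DELL", "Microsoft"]))  # 3 * (120 + 40) = 480
print(view_portfolio_backlog(150))                       # 8 * (24 + 40) = 512
```

Both estimates need only the metadata, so they can be computed before the request touches the database.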


The database backlog at time t

• d(t) : the database backlog at time t

• ni : the amount of data to be processed by a request Ti in the ready queue

• q(t) : the number of the requests in the ready queue at time t

\[ d(t) = \sum_{i=1}^{q(t)} n_i \]
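As a rough Python illustration of this definition, the sketch below sums the per-request data amounts n_i over the ready queue. The queue contents are hypothetical values chosen to show that two queues with the same length q(t) can carry very different backlogs d(t).

```python
def database_backlog(ready_queue):
    """d(t): sum of n_i over the q(t) requests currently in the ready queue."""
    return sum(ready_queue)


# Hypothetical n_i values for two ready queues of equal length.
small_requests = [160, 160, 160]
mixed_requests = [160, 5120, 64]

print(len(small_requests), database_backlog(small_requests))  # 3 480
print(len(mixed_requests), database_backlog(mixed_requests))  # 3 5344
```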


The relation between database backlog d(k) and service delay s(k)

• A fine-grained model
  - It can closely capture the system dynamics when the transaction size varies
• Derive a model using the ARX (AutoRegressive with eXternal input) model
• Apply the Recursive Least Squares (RLS) method

[Figure: RTDB model relating the database backlog d(k) to the service delay s(k)]


System Identification (SYSID)

• System settings
  - No feedback control, admission control, or traffic smoothing
  - Number of clients: 1200
  - Inter-request time: [0.5s, 2.5s]
  - Experiment duration: 3600 seconds
  - Every 10 seconds, the recursive least squares estimator predicts the response time
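A minimal sketch of recursive least squares for a second-order ARX model, written in plain Python. The slides do not give the estimator's details, so the forgetting factor, the initial covariance p0, the function name rls_arx2, and the synthetic data (including the coefficients in TRUE) are all assumptions made for illustration only.

```python
import random


def rls_arx2(s, d, forgetting=1.0, p0=1e6):
    """Fit s(k) = a1*s(k-1) + a2*s(k-2) + b1*d(k-1) + b2*d(k-2) by RLS.

    Returns the estimated [a1, a2, b1, b2].
    """
    n = 4
    theta = [0.0] * n
    # Covariance matrix, initialized to a large multiple of the identity.
    P = [[p0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(2, len(s)):
        phi = [s[k - 1], s[k - 2], d[k - 1], d[k - 2]]  # regressor vector
        P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
        gain = [v / denom for v in P_phi]
        err = s[k] - sum(theta[i] * phi[i] for i in range(n))  # prediction error
        theta = [theta[i] + gain[i] * err for i in range(n)]
        # P is symmetric, so phi^T P equals (P phi)^T.
        P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return theta


# Synthetic, noise-free data from hypothetical coefficients (not measured values).
random.seed(0)
TRUE = [0.2, 0.1, 0.02, 0.01]
d = [random.uniform(0.0, 100.0) for _ in range(360)]  # 3600 s / 10 s samples
s = [0.0, 0.0]
for k in range(2, len(d)):
    s.append(TRUE[0] * s[k - 1] + TRUE[1] * s[k - 2]
             + TRUE[2] * d[k - 1] + TRUE[3] * d[k - 2])

estimate = rls_arx2(s, d)
print([round(v, 4) for v in estimate])
```

On noise-free data the estimate should approach the generating coefficients; with real measurements the fit quality would have to be checked separately.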


System Identification (cont.)

• Identified model:
  \[ s(k) = 0.003\,s(k-1) + 0.104\,s(k-2) + 0.015\,d(k-1) + 0.006\,d(k-2) \]
• Use the square roots of the performance data to reduce the impact of burstiness
• Model evaluation metric: a model is acceptable if R^2 > 0.8, where
  \[ R^2 = 1 - \frac{\mathrm{var}(\text{service delay prediction error})}{\mathrm{var}(\text{actual service delay})} \]
• Choose the second-order model
  - R^2 = 0.86 (1st << 2nd ≈ 3rd ≈ 4th order)
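The R^2 metric can be computed as in the sketch below, using the standard definition R^2 = 1 - var(prediction error) / var(actual value). The delay samples are hypothetical and only exercise the formula; the function name r_squared is likewise an assumption.

```python
from statistics import pvariance


def r_squared(actual, predicted):
    """R^2 = 1 - var(prediction error) / var(actual)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return 1.0 - pvariance(errors) / pvariance(actual)


# Hypothetical service delays in seconds (not measured data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r2 = r_squared(actual, predicted)
print(round(r2, 2), r2 > 0.8)  # 0.98 True: acceptable under the R^2 > 0.8 rule
```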


Target Performance

Based on a fine-grained RTDB model using the database backlog

• Target service delay bound (St): 2s
  - E*Trade (www.etrade.com)
• Service delay overshoot (Sv): 2.5s
• Settling time (Tv): 10s


Design of Feedback Controller and Backlog Adaptation

[Figure: the closed loop for the data service delay]

• Design a PI controller using the Root Locus method in Matlab
  - Controller input: e(k) = St – s(k)
  - Controller output δd(k): the backlog adjustment
  - Closed-loop poles: -0.308, 0.52 ± 0.106i
  - KP = 2.77, KI = 5.28
• New target backlog: dt(k) = dt(k-1) + δd(k)
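A minimal sketch of the control step in Python. The gains, the target delay, and the closed-loop poles come from the slides; the discrete PI form used here (proportional term plus a running error sum), the class name PIController, the initial target backlog, and the measured delays are assumptions for illustration. The slides do not state the units of δd(k), so the numbers only show the mechanics.

```python
KP, KI = 2.77, 5.28   # controller gains from the slides
S_TARGET = 2.0        # St: target service delay bound, in seconds
CLOSED_LOOP_POLES = [complex(-0.308, 0.0),
                     complex(0.52, 0.106),
                     complex(0.52, -0.106)]


class PIController:
    """Assumed discrete PI form: delta_d(k) = KP*e(k) + KI*sum of e(j), j <= k."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.error_sum = 0.0

    def backlog_adjustment(self, measured_delay):
        error = S_TARGET - measured_delay  # e(k) = St - s(k)
        self.error_sum += error
        return self.kp * error + self.ki * self.error_sum


def is_stable(poles):
    """A discrete-time loop is stable if every pole lies inside the unit circle."""
    return all(abs(p) < 1.0 for p in poles)


controller = PIController(KP, KI)
target_backlog = 1000.0              # hypothetical initial dt(0)
for delay in [2.5, 2.2, 1.9]:        # hypothetical measured s(k), in seconds
    # dt(k) = dt(k-1) + delta_d(k)
    target_backlog += controller.backlog_adjustment(delay)

print(is_stable(CLOSED_LOOP_POLES))  # True
print(round(target_backlog, 3))
```

Delays above St produce negative errors, so the target backlog shrinks and fewer requests are admitted in the next sampling period.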


Admission Control

• nnew: the amount of data to be processed for a newly arriving user request Tnew
• ni: the amount of data to be processed by a request Ti in the ready queue
• q(t): the number of requests in the ready queue at time t
• Current backlog:
  \[ d(t) = \sum_{i=1}^{q(t)} n_i \]
• A transaction Tnew arriving at time t during the (k+1)th sampling period is admitted if d(t) + nnew ≤ dt(k), where dt(k) is the desired backlog; otherwise it is rejected
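The admission test d(t) + nnew ≤ dt(k) can be sketched in Python as follows. The function name admit, the list-based ready queue, and all numeric values are hypothetical.

```python
def admit(ready_queue, n_new, target_backlog):
    """Admit Tnew only if d(t) + n_new <= dt(k).

    ready_queue holds the data amount n_i of each queued request.
    """
    current_backlog = sum(ready_queue)  # d(t)
    if current_backlog + n_new <= target_backlog:
        ready_queue.append(n_new)
        return True
    return False


queue = [160, 512, 320]          # hypothetical n_i values, so d(t) = 992
print(admit(queue, 480, 1500))   # True:  992 + 480 = 1472 <= 1500
print(admit(queue, 480, 1500))   # False: 1472 + 480 = 1952 > 1500
```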


Load Smoothing

• Purpose
  - Reduce the burstiness of the workload
• Key idea
  - Use only under overload
  - Each client voluntarily delays the submission of its request by an additional period of time
  - This increases the chances of a client's request being admitted


Load Smoothing (cont.)

• Mechanism
  - Chronos provides the rejection rate p(k) as the current server status
  - With probability p(k), a client delays its request submission by an extra time in a predefined range [ta, tb], in addition to the arbitrary inter-request time in [t1, t2]
• Benefit
  - Enhances the stability of the system
  - Distributed traffic smoothing reduces the overhead
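A client-side sketch of this mechanism in Python. The slides do not say how the delays are drawn, so uniform sampling within each range is an assumption, as is the function name next_submission_delay.

```python
import random


def next_submission_delay(p_k, t1, t2, ta, tb, rng=random):
    """Inter-request time in [t1, t2], plus an extra delay in [ta, tb]
    with probability p(k). Uniform sampling is an assumption."""
    delay = rng.uniform(t1, t2)
    if rng.random() < p_k:          # server hint: current rejection rate p(k)
        delay += rng.uniform(ta, tb)
    return delay


random.seed(1)
print(next_submission_delay(0.0, 0.1, 0.5, 0.1, 0.3))  # no extra delay when p(k) = 0
print(next_submission_delay(0.5, 0.1, 0.5, 0.1, 0.3))  # extra delay half of the time
```

Because each client applies the delay itself, the server only has to publish p(k).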


Load Smoothing: example

• When the inter-request time is in the range [t1, t2] = [0.1s, 0.5s], a client submits
  \[ \frac{1}{0.5\,(t_1 + t_2)} \approx 3.3 \text{ requests/s} \]
• If p(k) > 0 (e.g., 0.5) and the predefined extra delay is in the range [ta, tb] = [0.1s, 0.3s], a client submits
  \[ \frac{1}{0.5\,\bigl((t_1 + t_2) + p(k)\,(t_a + t_b)\bigr)} = 2.5 \text{ requests/s} \]
• The total arrival rate is expected to be reduced by
  \[ \frac{3.3 - 2.5}{3.3} \times 100\% \approx 25\% \]
  (exactly 25% with the unrounded rate 1/0.3)
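The arithmetic of this example can be checked with a short Python sketch. The function name expected_rate is hypothetical; the formula is the expected per-client rate 1 / (0.5 ((t1 + t2) + p(k) (ta + tb))), which reduces to 1 / (0.5 (t1 + t2)) when p(k) = 0.

```python
def expected_rate(t1, t2, p_k=0.0, ta=0.0, tb=0.0):
    """Expected requests/s per client for mean inter-request time
    0.5 * ((t1 + t2) + p(k) * (ta + tb))."""
    return 1.0 / (0.5 * ((t1 + t2) + p_k * (ta + tb)))


base = expected_rate(0.1, 0.5)                               # no smoothing
smoothed = expected_rate(0.1, 0.5, p_k=0.5, ta=0.1, tb=0.3)  # with smoothing
reduction = (base - smoothed) / base

# Approximately 3.33 requests/s, 2.5 requests/s, and a 25% reduction.
print(round(base, 2), round(smoothed, 2), round(100 * reduction, 1))
```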


Experimental Environments

• Chronos server: Dell laptop, 1.66 GHz dual-core CPU, 1 GB of RAM, Linux 2.6.15 kernel
• Clients: Dell desktop, 3 GHz CPU, 4 GB of RAM, Linux 2.6.15 kernel
• Stock price update server: Dell desktop, 3 GHz CPU, 2 GB of RAM, Linux 2.6.15 kernel
• Network: 1 Gbps Ethernet switch


Workload Settings

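• 1200 clients
  - 60% queries (view-stock)
  - 40% transactions (view-portfolio, purchase, sale)
• Chronos server: 3000 stock prices
• To generate a bursty workload, the inter-request time changes from [3.5s~4.0s] to [0.1s~0.5s] during the 15-minute run
  - This increases the workload by 7~40 times (3.5/0.5 = 7, 4.0/0.1 = 40)

[Figure: inter-request time over the experiment, with time axis ticks at 0, 5, 10, and 15 minutes]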


Tested Approaches

• Open: pure Berkeley DB
• AC: ad hoc admission control
• FC-Q: feedback control using the queue length vs. delay model
• FC-D: feedback control using the database backlog vs. delay model
• FC-TS: feedback control of the database backlog plus traffic smoothing


Performance Metrics

• Performance metrics
  - Data service delay
  - Data service throughput: the average amount of data processed by the committed transactions and queries
    - Total
    - Timely (data processed by the timely transactions completed within St)
• Each experimental run is 15 minutes long
  - Results are the average of 10 runs with 90% confidence intervals


Average Service Delay

• FC-D: 1.52 ± 0.07s
  - By adapting the database backlog systematically
• FC-TS: 1.24 ± 0.03s
  - By further reducing the workload burstiness
• Baselines
  - Fail to support the 2s target service delay


Transient Service Delay

• FC-D and FC-TS
  - Outperform the baselines
  - Overshoots decay in less than three sampling periods, satisfying Tv < 10s


Data Service Throughput

• FC-D
  - Processes more than 18,680,000 data
  - Around 86% of the data were processed within the desired delay bound
• FC-TS
  - Processes around 35% more data than FC-D by further smoothing the incoming workload
• Baselines
  - Show similar throughput

[Figure: the average amount of data processed by the committed transactions, normalized to the corresponding number for FC-D]


Conclusions

To enhance RTDB performance without degrading data freshness:

• Predict the database backlog, i.e., the amount of data for the database to process, using the metadata extracted from the database schema and transaction semantics
• Adjust the database backlog via fine-grained closed-loop admission control based on the backlog model to support the desired service delay
• Reduce the burstiness of incoming data service requests via hint-based load smoothing

Questions?