
Processing Rate Allocation for Proportional Slowdown Differentiation on Internet Servers

Xiaobo Zhou
Department of Computer Science
University of Colorado, Colorado Springs, 80933
[email protected]

Jianbin Wei and Cheng-Zhong Xu
Department of Electrical & Computer Engg.
Wayne State University, Detroit, 48202
{jbwei, czxu}@ece.eng.wayne.edu

Abstract

A proportional differentiation model states that the quality of service of different classes of Internet traffic should be kept proportional to their pre-specified differentiation parameters, independent of the class loads. The model has been applied to proportional queueing delay differentiation (PDD) in both the network core and network edges. On the server side, however, an important and interesting performance metric is slowdown, the ratio of a request's queueing delay to its service time. Slowdown is important because it is desirable that a request's delay be proportional to its processing requirement.

In this paper, we investigate the problem of processing rate allocation for proportional slowdown differentiation (PSD) on Internet servers. Existing algorithms for PDD provisioning on the network side are not applicable to PSD provisioning on the server side because slowdown depends not only on a job's queueing delay but also on its service time, which varies significantly with the requested services. We first derive a closed-form expression of the expected slowdown in an $M/G_P/1$ FCFS queue, which is an $M/G/1$ FCFS queue with a typical heavy-tailed service time distribution (the Bounded Pareto distribution). PSD provisioning is realized by deploying a task server to handle each request class in an FCFS way. We then develop a processing rate allocation strategy for the task servers in support of PSD provisioning. Simulation results show that the proposed rate allocation strategy can provide predictable and controllable PSD services on the servers.

1 Introduction

Internet applications and clients have very diverse service expectations, which demands the provisioning of different levels of quality of service (QoS) on the Internet. This service differentiation provisioning problem was first addressed in the network core. The Differentiated Services (DiffServ) architecture, which aims to provide different QoS levels among multiple classes of aggregated traffic flows, has been an active research topic since it was formulated by the IETF in 1998 [5]. Basically, there are two types of DiffServ scheme. One is absolute DiffServ, in which each class receives an absolute share of resource usage. The other is relative DiffServ, in which a class with a higher desired QoS level (referred to as a higher priority class) receives better (or at least no worse) service than a lower priority class. Although absolute DiffServ is desirable for Internet services such as audio/video streaming applications that have hard time constraints, relative DiffServ is sufficient and more attractive for soft real-time Web applications. For a relative DiffServ scheme to be effective, it must satisfy two basic properties: predictability and controllability. Predictability requires that higher priority classes receive better or no worse service quality than lower priority classes, independent of the system load conditions. Controllability requires that the scheduler contain a number of controllable parameters that can be adjusted to control the quality spacings between classes.

The proportional differentiation model [7] states that certain class performance metrics should be proportional to the differentiation parameters the network operator chooses, independent of the class loads. The model has been accepted as an important relative DiffServ model and has been applied to proportional delay differentiation (PDD) in packet scheduling [8]. In this model, the network traffic is categorized into $N$ classes of service. Each class is assigned a queueing delay differentiation parameter. The packet scheduler of a router gives different priorities to different classes. The objective is to keep the ratio of the average delay of a higher priority class to that of a lower priority class equal to the pre-specified value.

Since the PDD model was formulated in [8], many algorithms have been designed to achieve PDD provisioning in network routers. They can be classified into two categories: rate-based [8, 18] and time-dependent priority based [9, 11, 17, 19, 23]. The end-to-end time performance of Internet services is due not only to the packet transmission delay in the network core, but also to the processing and queueing delay on the servers. Therefore, servers are a major force in DiffServ provisioning. Those algorithms can be tailored for PDD provisioning on servers [15]. On the server side, however, an important and interesting performance metric is slowdown, the ratio of a request's queueing delay to its service time. Both queueing delay and response time are major performance metrics, but they are not suitable for comparing requests that have very different resource demands. Clients are likely to anticipate short delays for "small" requests and are willing to tolerate long delays for "large" requests. Thus, it is desirable that a request's delay be proportional to its processing requirement. A high slowdown can also indicate that the system is heavily loaded [25].

0-7695-2132-0/04/$17.00 (C) 2004 IEEE

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS’04)

Slowdown is used as a fundamental performance metric of responsiveness on Internet servers [13, 21, 24, 25]. However, little work exists on slowdown differentiation. Existing time-dependent priority based PDD packet scheduling algorithms cannot be tailored to achieve proportional slowdown differentiation (PSD) because they adjust the priority of a backlogged request class according to the experienced queueing delay of its head-of-line request, its backlogged requests, or both. Queueing delay aside, slowdown also depends on the service time of a request, which is costly, if not impossible, to predict a priori for priority request scheduling. Rate-based PDD packet scheduling algorithms can be applied to server-side proportional delay differentiation, but not to PSD services, because PSD provisioning needs to consider queueing delay together with service time.

In this paper, we investigate the problem of PSD provisioning on Internet servers. The problem is important because the proportional model is a widely accepted DiffServ model and slowdown is a key QoS metric on the server side. It is challenging because, to meet the needs of differentiation predictability and controllability, a closed-form expression of the expected slowdown is required for the formulation of resource allocation and scheduling. We consider a popular heavy-tailed distribution, Bounded Pareto, for the service time distribution. In this paper, an $M/G/1$ queue with a Bounded Pareto service time distribution is referred to as an $M/G_P/1$ queue. We give the expected slowdown of an $M/G_P/1$ FCFS queue in a closed analytic form. We then propose a processing rate allocation strategy for PSD provisioning on servers, based on the assumption that the processing rate of a server can be proportionally allocated to a number of task servers. Each task server $i$ ($1 \le i \le N$) represents a processing unit that handles requests from the same class in an FCFS way. Recently, there has been renewed interest in proportional-share resource scheduling among multiple queues in both operating systems and networks; see GPS [14], PGPS [20], and Lottery Scheduling [22] for examples. They provide a basis for the processing rate allocation in our work. Note that a task server is an abstract concept in the sense that it can be a child process in a multi-process server or a thread in a multi-threaded server [6].

The structure of the paper is as follows. Section 2 gives a slowdown model for $M/G_P/1$ FCFS queues. Section 3 presents the processing rate allocation strategy for PSD provisioning. Section 4 focuses on simulation issues and performance evaluation. In Section 5, we review other related processing rate allocation and scheduling disciplines in the DiffServ area. Section 6 concludes the paper.

2 Slowdown Modeling in an $M/G_P/1$ Queue

2.1 Preliminaries of Slowdown

Recent Internet workload measurements indicate that, for many Web applications, a heavy-tailed distribution is a more accurate model for the service time distribution than the exponential distribution [3, 13]. In general, a heavy-tailed distribution is one for which $\Pr\{X > x\} \sim x^{-\alpha}$ with $0 < \alpha < 2$, where $X$ denotes the service time.

The Pareto distribution is a typical heavy-tailed distribution, with probability density function

$$f(x) = \alpha k^{\alpha} x^{-\alpha-1}, \quad \alpha, k > 0, \; x \ge k, \tag{1}$$

and cumulative distribution function $F(x) = \Pr\{X \le x\} = 1 - (k/x)^{\alpha}$.

In practice, there is some upper bound on the maximum size of a job. Following [13], throughout this paper we model job sizes as being generated i.i.d. from a distribution that is heavy-tailed but has an upper bound. This Bounded Pareto distribution is characterized by three parameters: $\alpha$, the shape parameter; $k$, the shortest possible job; and $p$, the upper bound of jobs. In the distribution, $k \le x \le p$. It follows that

$$\int_k^p f(x)\,dx = \int_k^p \alpha k^{\alpha} x^{-\alpha-1}\,dx = 1 - (k/p)^{\alpha}.$$

Thus, as in [13], the probability density function for the Bounded Pareto is defined as:

$$f(x) = \frac{\alpha k^{\alpha}}{1 - (k/p)^{\alpha}}\, x^{-\alpha-1}, \quad \alpha, k, p > 0, \; k \le x \le p.$$

Since $\alpha$, $k$, and $p$ are parameters of the Bounded Pareto distribution, for simplicity of derivation we define the function $K(\alpha, k, p) = \frac{\alpha k^{\alpha}}{1 - (k/p)^{\alpha}}$ in the following. The probability density function $f(x)$ is rewritten as:

$$f(x) = K(\alpha, k, p)\, x^{-\alpha-1}, \quad \alpha, k, p > 0, \; k \le x \le p. \tag{2}$$
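As a minimal sketch of the definitions above, the following Python fragment evaluates $K(\alpha, k, p)$ and the density (2), and draws Bounded Pareto samples by inverting the cumulative distribution function $F(x) = (1 - (k/x)^{\alpha})/(1 - (k/p)^{\alpha})$, which follows from integrating (2). The function names and the inverse-transform sampler are illustrative assumptions; the paper only states that its generator was built by modifying the GNU Scientific Library.

```python
import random


def K(alpha, k, p):
    """K(alpha, k, p) = alpha * k**alpha / (1 - (k/p)**alpha)."""
    return alpha * k ** alpha / (1.0 - (k / p) ** alpha)


def bp_pdf(x, alpha, k, p):
    """Bounded Pareto density from (2); zero outside [k, p]."""
    if x < k or x > p:
        return 0.0
    return K(alpha, k, p) * x ** (-alpha - 1.0)


def bp_cdf(x, alpha, k, p):
    """CDF obtained by integrating (2) from k to x."""
    if x <= k:
        return 0.0
    if x >= p:
        return 1.0
    return (1.0 - (k / x) ** alpha) / (1.0 - (k / p) ** alpha)


def bp_sample(rng, alpha, k, p):
    """Draw one sample by inverse-transform sampling (an assumed method)."""
    u = rng.random()  # uniform on [0, 1)
    return k / (1.0 - u * (1.0 - (k / p) ** alpha)) ** (1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(1)
    # alpha = 1.5, k = 0.1, p = 100 are the settings reported in Section 4.1.
    print([round(bp_sample(rng, 1.5, 0.1, 100.0), 3) for _ in range(5)])
```

Passing an explicit `random.Random` instance keeps runs reproducible, which is convenient when averaging over repeated simulation runs.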


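From (2), we have:

$$E[X] = \int_k^p f(x)\,x\,dx = \begin{cases} \dfrac{K(\alpha, k, p)}{K(\alpha-1, k, p)} & \text{if } \alpha \neq 1, \\[2ex] (\ln p - \ln k)\,K(\alpha, k, p) & \text{if } \alpha = 1, \end{cases} \tag{3}$$

$$E[X^2] = \int_k^p f(x)\,x^2\,dx = \frac{K(\alpha, k, p)}{K(\alpha-2, k, p)}, \tag{4}$$

$$E[X^{-1}] = \int_k^p f(x)\,x^{-1}\,dx = \frac{K(\alpha, k, p)}{K(\alpha+1, k, p)}. \tag{5}$$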
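The closed forms (3) to (5) are straightforward to evaluate numerically. The sketch below is one hedged way to do so in Python. The function names are illustrative, and the `alpha == 2` branch of the second moment is not stated in the text; it is added here because the same integral then evaluates to $(\ln p - \ln k)\,K(\alpha, k, p)$.

```python
import math


def K(alpha, k, p):
    """K(alpha, k, p) = alpha * k**alpha / (1 - (k/p)**alpha)."""
    return alpha * k ** alpha / (1.0 - (k / p) ** alpha)


def mean(alpha, k, p):
    """E[X] from (3), including the alpha == 1 case."""
    if alpha == 1.0:
        return (math.log(p) - math.log(k)) * K(alpha, k, p)
    return K(alpha, k, p) / K(alpha - 1.0, k, p)


def second_moment(alpha, k, p):
    """E[X^2] from (4); the alpha == 2 branch is an addition to the text."""
    if alpha == 2.0:
        return (math.log(p) - math.log(k)) * K(alpha, k, p)
    return K(alpha, k, p) / K(alpha - 2.0, k, p)


def inverse_moment(alpha, k, p):
    """E[X^-1] from (5)."""
    return K(alpha, k, p) / K(alpha + 1.0, k, p)


if __name__ == "__main__":
    # alpha = 1.5, k = 0.1, p = 100 are the settings reported in Section 4.1.
    alpha, k, p = 1.5, 0.1, 100.0
    print("E[X]    =", mean(alpha, k, p))
    print("E[X^2]  =", second_moment(alpha, k, p))
    print("E[X^-1] =", inverse_moment(alpha, k, p))
```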

According to the Pollaczek-Khinchin formula [14], we have

Lemma 1. Consider an $M/G_P/1$ FCFS queue on a server, where the arrival process has rate $\lambda$ and $X$ denotes the Bounded Pareto service time. Let $W$ be a job's queueing delay and $S = W/X$ be a job's slowdown. Then,

$$E[S] = E[W]\,E[X^{-1}] = \frac{\lambda\,E[X^2]\,E[X^{-1}]}{2\,(1 - \lambda E[X])}. \tag{6}$$

Note that the slowdown formula follows from the fact that $W$ and $X$ are independent in an FCFS queue.
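Formula (6) can be sketched as a small Python helper that takes the arrival rate and the three service time moments as inputs. The function name and the stability check are illustrative assumptions rather than part of the paper.

```python
def expected_slowdown(lam, ex, ex2, ex_inv):
    """E[S] from (6): lam * E[X^2] * E[X^-1] / (2 * (1 - lam * E[X])).

    lam: arrival rate; ex, ex2, ex_inv: E[X], E[X^2], E[X^-1].
    """
    rho = lam * ex  # offered load of the queue
    if not 0.0 <= rho < 1.0:
        raise ValueError("the queue is stable only when lam * E[X] < 1")
    expected_wait = lam * ex2 / (2.0 * (1.0 - rho))  # Pollaczek-Khinchin E[W]
    return expected_wait * ex_inv  # W and X are independent under FCFS


if __name__ == "__main__":
    # Constant service time of 1 at 50% load: E[X] = E[X^2] = E[X^-1] = 1.
    print(expected_slowdown(0.5, 1.0, 1.0, 1.0))  # 0.5
```

Keeping the moments as arguments means the same helper applies to any service time distribution for which they are known, not only the Bounded Pareto.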

2.2 Slowdown on Internet Servers

Based on proportional-share resource scheduling mechanisms such as GPS [14], PGPS [20], and Lottery Scheduling [22], we assume that the processing rate of an Internet server can be proportionally allocated to a number of task servers. Each task server $i$ ($1 \le i \le N$) represents a processing unit that handles requests in a service class in an FCFS way. Let $c_i$ denote the normalized processing rate of the task server $i$. Then,

$$\sum_{i=1}^{N} c_i = 1, \quad c_i > 0 \text{ for } 1 \le i \le N. \tag{7}$$

Lemma 2. Consider an $M/G_P/1$ FCFS queue on a task server $i$ with processing capacity $c_i$, and let $X_i$ denote the Bounded Pareto service time on the task server. Then,

$$E[X_i] = \frac{1}{c_i}\,E[X], \tag{8}$$

$$E[X_i^2] = \frac{1}{c_i^2}\,E[X^2], \tag{9}$$

$$E[X_i^{-1}] = c_i\,E[X^{-1}]. \tag{10}$$

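Proof. On the task server $i$, the lower bound and upper bound of the Bounded Pareto distribution are $k/c_i$ and $p/c_i$, respectively. According to (1), we have

$$\int_{k/c_i}^{p/c_i} f(x)\,dx = \int_{k/c_i}^{p/c_i} \alpha k^{\alpha} x^{-\alpha-1}\,dx = c_i^{\alpha}\left(1 - (k/p)^{\alpha}\right).$$

Thus, on the task server $i$, we define the probability density function for the Bounded Pareto as:

$$f(x) = \frac{1}{c_i^{\alpha}\left(1 - (k/p)^{\alpha}\right)}\,\alpha k^{\alpha} x^{-\alpha-1}, \quad \alpha, k, p > 0, \; \frac{k}{c_i} \le x \le \frac{p}{c_i}.$$

Recalling the definition $K(\alpha, k, p) = \frac{\alpha k^{\alpha}}{1 - (k/p)^{\alpha}}$, the probability density function for the Bounded Pareto on the task server $i$ is rewritten as:

$$f(x) = \frac{1}{c_i^{\alpha}}\,K(\alpha, k, p)\,x^{-\alpha-1}.$$

We have:

$$E[X_i] = \int_{k/c_i}^{p/c_i} f(x)\,x\,dx = \begin{cases} \dfrac{1}{c_i}\,\dfrac{K(\alpha, k, p)}{K(\alpha-1, k, p)} & \text{if } \alpha \neq 1, \\[2ex] \dfrac{1}{c_i}\,(\ln p - \ln k)\,K(\alpha, k, p) & \text{if } \alpha = 1, \end{cases} \tag{11}$$

$$E[X_i^2] = \int_{k/c_i}^{p/c_i} f(x)\,x^2\,dx = \frac{1}{c_i^2}\,\frac{K(\alpha, k, p)}{K(\alpha-2, k, p)}, \tag{12}$$

$$E[X_i^{-1}] = \int_{k/c_i}^{p/c_i} f(x)\,x^{-1}\,dx = c_i\,\frac{K(\alpha, k, p)}{K(\alpha+1, k, p)}. \tag{13}$$

From (3) and (11), we have $E[X_i] = \frac{1}{c_i} E[X]$. From (4) and (12), we have $E[X_i^2] = \frac{1}{c_i^2} E[X^2]$. From (5) and (13), we have $E[X_i^{-1}] = c_i E[X^{-1}]$. $\square$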

According to Lemmas 1 and 2, we have

Theorem 1. Consider an $M/G_P/1$ FCFS queue on a task server $i$ with processing capacity $c_i$, where $\lambda_i$ denotes the arrival rate and $X_i$ denotes the Bounded Pareto service time on the task server. Let $W_i$ be a job's queueing delay and $S_i$ be a job's slowdown on the task server. Then,

$$E[S_i] = \frac{\lambda_i\,E[X^2]\,E[X^{-1}]}{2\,(c_i - \lambda_i E[X])}. \tag{14}$$
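For readers who want the intermediate step, one way to obtain (14) is to apply (6) on task server $i$ and then substitute the scaled moments (8) to (10). The derivation below is a reconstruction under that assumption, not text from the paper:

```latex
\begin{align*}
E[S_i] &= \frac{\lambda_i\,E[X_i^2]\,E[X_i^{-1}]}{2\,(1 - \lambda_i E[X_i])} \\
       &= \frac{\lambda_i \cdot \frac{1}{c_i^2}\,E[X^2] \cdot c_i\,E[X^{-1}]}
               {2\,\bigl(1 - \frac{\lambda_i}{c_i}\,E[X]\bigr)} \\
       &= \frac{\lambda_i\,E[X^2]\,E[X^{-1}]}{2\,(c_i - \lambda_i E[X])}.
\end{align*}
```

The last line multiplies the numerator and the denominator by $c_i$, which is why the stability condition on the task server reads $\lambda_i E[X] < c_i$.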

Note that the $M/G_P/1$ queue reduces to an $M/D/1$ queue when all requests' service times are equal to a constant $d$. This kind of queueing model is of interest in session-based E-commerce workload models. A session is a sequence of requests of different types made by a single customer during a single visit to a site. Requests at some states, such as home entry or register, take approximately the same service time and can be modeled as an $M/D/1$ queue. In the $M/D/1$ FCFS system, $E[X] = d$, $E[X^2] = d^2$, and $E[X^{-1}] = 1/d$, so (14) reduces to

$$E[S_i] = \frac{\lambda_i d}{2\,(c_i - \lambda_i d)}. \tag{15}$$


3 Processing Rate Allocation for Proportional Slowdown Differentiation

A proportional differentiation model ensures that the quality spacing between class $i$ and class $j$ is proportional to certain pre-specified differentiation parameters $\delta_i$ and $\delta_j$ [7]; that is,

$$\frac{q_i}{q_j} = \frac{\delta_i}{\delta_j}, \quad 1 \le i, j \le N,$$

where $q_i$ and $q_j$ are the QoS factors of class $i$ and class $j$, respectively. It is thus up to applications and clients to select appropriate QoS levels, in terms of differentiation parameters, that best meet their requirements, cost, and constraints.

The PSD model aims to control the ratios of the average class slowdowns based on the differentiation parameters $\{\delta_i, i = 1, \ldots, N\}$. Specifically, the PSD model requires that the ratio of average slowdown between class $i$ and class $j$ be fixed to the ratio of the corresponding differentiation parameters:

$$\frac{E[S_i]}{E[S_j]} = \frac{\delta_i}{\delta_j}, \quad 1 \le i, j \le N. \tag{16}$$

The differentiation predictability property requires that higher classes receive better service, i.e., lower slowdowns. Without loss of generality, we assume that class 1 is the "highest class" and set $0 < \delta_1 < \delta_2 < \cdots < \delta_N$.

For feasible rate allocation, we must ensure that the system utilization $\sum_{i=1}^{N} \lambda_i E[X] < 1$. That is, the total processing requirement of the $N$ classes of traffic is less than the server processing capacity.

According to Theorem 1, the set of equations (16), in combination with (7), leads to

$$c_i = \lambda_i E[X] + \frac{\lambda_i/\delta_i}{\sum_{j=1}^{N} \lambda_j/\delta_j}\left(1 - \sum_{j=1}^{N} \lambda_j E[X]\right). \tag{17}$$

From this equation, we can observe that the remaining capacity of the server is fairly allocated to the classes according to their arrival rates scaled by their differentiation parameters.
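The allocation rule (17) can be sketched in a few lines of Python. The function name, argument order, and error handling are illustrative assumptions; like (14), the sketch assumes that all classes share one service time distribution with mean $E[X]$.

```python
def allocate_rates(lams, deltas, ex):
    """Normalized processing rates c_i following (17).

    lams: per-class arrival rates; deltas: differentiation parameters;
    ex: mean service time E[X] at full server speed.
    """
    if len(lams) != len(deltas):
        raise ValueError("one differentiation parameter is needed per class")
    load = sum(lam * ex for lam in lams)
    if load >= 1.0:
        raise ValueError("infeasible: total load must stay below 1")
    scaled = [lam / delta for lam, delta in zip(lams, deltas)]
    total_scaled = sum(scaled)
    if total_scaled <= 0.0:
        raise ValueError("at least one class needs a positive arrival rate")
    spare = 1.0 - load  # capacity left after serving every class's own load
    return [lam * ex + (s / total_scaled) * spare
            for lam, s in zip(lams, scaled)]


if __name__ == "__main__":
    # Two equally loaded classes with differentiation parameters (1, 2).
    rates = allocate_rates([0.5, 0.5], [1.0, 2.0], 0.8)
    print(rates, sum(rates))
```

Each class first receives exactly the capacity its own load requires, and only the spare capacity is split by the weights $\lambda_i/\delta_i$; plugging the result into (14) gives slowdown ratios equal to the ratios of the differentiation parameters.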

It follows that the expected slowdown of class $i$, $E[S_i]$, is calculated as:

$$E[S_i] = \frac{\delta_i\,E[X^2]\,E[X^{-1}] \sum_{j=1}^{N} \lambda_j/\delta_j}{2\left(1 - \sum_{j=1}^{N} \lambda_j E[X]\right)}. \tag{18}$$

From (18), we have the following three basic properties regarding the predictability and controllability of the proportional slowdown differentiation given by the allocation strategy:

1. Slowdown of a request class increases with its request arrival rate.

2. With the increase of the differentiation parameter of a request class, its slowdown increases but all other request classes have lower slowdowns.

3. Increasing the workload (request arrival rate) of a higher request class causes a larger increase in slowdown of a request class than increasing the workload of a lower request class.

Figure 1. The simulation model's structure. [Diagram: request generators 1 to N feed waiting queues 1 to N on the server, which contains a load estimator, a rate allocator, and task servers 1 to N.]
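The three properties listed under (18) can be checked numerically. The following Python sketch evaluates (18) for a set of classes and then perturbs one input at a time; the function name and the numeric values are illustrative assumptions chosen only to keep the arithmetic simple.

```python
def psd_slowdowns(lams, deltas, ex, ex2, ex_inv):
    """Expected per-class slowdowns under the allocation strategy, from (18)."""
    load = sum(lam * ex for lam in lams)
    if load >= 1.0:
        raise ValueError("infeasible: total load must stay below 1")
    scaled_sum = sum(lam / delta for lam, delta in zip(lams, deltas))
    common = ex2 * ex_inv * scaled_sum / (2.0 * (1.0 - load))
    return [delta * common for delta in deltas]


if __name__ == "__main__":
    moments = (1.0, 1.0, 1.0)  # hypothetical E[X], E[X^2], E[X^-1]
    base = psd_slowdowns([0.3, 0.3], [1.0, 2.0], *moments)
    more_class1 = psd_slowdowns([0.4, 0.3], [1.0, 2.0], *moments)
    more_class2 = psd_slowdowns([0.3, 0.4], [1.0, 2.0], *moments)
    larger_delta2 = psd_slowdowns([0.3, 0.3], [1.0, 4.0], *moments)
    print("property 1:", more_class1[0] > base[0])
    print("property 2:", larger_delta2[1] > base[1], larger_delta2[0] < base[0])
    print("property 3:", more_class1[0] > more_class2[0])
```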

4 Performance Evaluation

To evaluate the performance of the proposed processing rate allocation strategy for PSD provisioning under an $M/G_P/1$ traffic model, we conducted simulations with requests generated using the GNU Scientific Library [12]. In this section, after introducing the simulation model, we first illustrate the effectiveness of the rate-allocation PSD strategy by comparing the achieved slowdowns with those calculated from the PSD model. We then show the differentiation predictability and controllability of the proposed rate-allocation strategy. Finally, we discuss the influence of the shape parameter $\alpha$ and the upper bound $p$ of the Bounded Pareto distribution on PSD provisioning.

4.1 Simulation Model

We built a simulation model that consisted of a number of request generators, waiting queues, a load estimator, a processing rate allocator, and a number of task servers. Figure 1 outlines the basic structure of the simulation model.

The request generators produced requests with the appropriate inter-arrival and size distributions. In the simulation model, we generated the Bounded Pareto service time distribution by modifying the GNU Scientific Library. In the $M/G_P/1$ traffic model, the request sizes of each class followed a Bounded Pareto distribution. Note that the number of classes in PSD provisioning on the server is usually rather limited; it varied from 2 to 3 in many similar experiments for PDD provisioning in packet scheduling [8, 9, 11, 17]. Each request was sent to the server and stored in a waiting queue according to its class type. Requests from the same class were processed by a task server in an FCFS manner.
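The per-class FCFS processing described above can be sketched as follows. This is a minimal illustration of how one task server with a fixed normalized rate turns arrival times and request sizes into slowdowns; the function name and its interface are assumptions and do not reproduce the authors' simulator.

```python
def fcfs_slowdowns(arrival_times, sizes, rate):
    """Slowdown of each request on one FCFS task server.

    arrival_times: non-decreasing arrival instants of one class.
    sizes: request sizes, i.e. service times at full server speed.
    rate: normalized processing rate of the task server (0 < rate <= 1).
    """
    if rate <= 0.0:
        raise ValueError("the task server needs a positive processing rate")
    slowdowns = []
    server_free_at = 0.0
    for arrival, size in zip(arrival_times, sizes):
        service_time = size / rate            # service time on this task server
        start = max(arrival, server_free_at)  # wait until earlier requests finish
        slowdowns.append((start - arrival) / service_time)
        server_free_at = start + service_time
    return slowdowns


if __name__ == "__main__":
    print(fcfs_slowdowns([0.0, 1.0, 2.0], [2.0, 1.0, 1.0], 1.0))  # [0.0, 1.0, 1.0]
```

Slowdown is computed here as queueing delay divided by the service time on the task server, matching the definition used throughout the paper.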


Figure 2. Simulated and expected slowdowns of two classes ($\delta_1 : \delta_2 = 1 : 2$). [Plot: slowdown (logarithmic axis, 1 to 100) versus system load (0 to 100%); series: Class 1 and Class 2, simulated and expected.]

The load estimator measured the arrival rate and the incurred load for every class. In the simulation, the load was estimated every thousand time units. A time unit was set equal to the processing time of an average-size request. We estimated a class's load based on its history; that is, the load for the next thousand time units was taken to be the average load over the past five thousand time units. The rate allocator performed the proposed rate-allocation strategy according to (17) for every class. The processing rate was likewise reallocated every thousand time units.

Simulation parameters were set as follows. The shape parameter ($\alpha$) of the Bounded Pareto distribution was set equal to 1.5, as suggested in [9]. As indicated in [13], its lower bound and upper bound were set equal to 0.1 and 100, respectively. We also ran experiments with larger upper bound settings to evaluate their influence. We assumed that all classes had the same load. In the experiments, the simulator first warmed up for 10,000 time units. The slowdown of a class was then measured every thousand time units. After 60,000 time units, we calculated the slowdowns of the classes. Each reported result is an average of 100 runs.
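The history-based load estimate described above amounts to a moving average over the most recent measurement windows. The class below is a minimal sketch of that idea; the class name, its methods, and the choice to return 0.0 before any measurement exists are assumptions made for illustration.

```python
from collections import deque


class WindowedLoadEstimator:
    """Predict the next window's load as the mean of the last few windows."""

    def __init__(self, history=5):
        # With one-thousand-time-unit windows, history=5 covers the past
        # five thousand time units mentioned in the text.
        self._loads = deque(maxlen=history)

    def record(self, window_load):
        """Store the load measured over the window that just ended."""
        self._loads.append(window_load)

    def predict(self):
        """Estimated load for the next window (0.0 before any measurement)."""
        if not self._loads:
            return 0.0
        return sum(self._loads) / len(self._loads)


if __name__ == "__main__":
    estimator = WindowedLoadEstimator()
    for load in [0.40, 0.55, 0.35, 0.50, 0.45, 0.60]:
        estimator.record(load)
    print(estimator.predict())  # mean of the five most recent windows
```

One estimator instance per class would feed its prediction into the rate allocator at every reallocation point.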

4.2 Effectiveness of the Rate-allocation Strategy

In this section, we show the effectiveness of the proposed rate-allocation strategy by comparing the simulated slowdowns with the expected values calculated by (18) under various load conditions. Figure 2 shows the results for two classes whose differentiation parameters ($\delta_1, \delta_2$) are (1, 2). To make the results comparable, a logarithmic y-axis is used. The figure shows very small differences between the simulated and expected slowdowns under various load conditions. It also shows that the achieved system slowdowns, which are the weighted slowdowns of the two classes, are very close to the expected ones. We then change the differentiation parameters of the two classes ($\delta_1, \delta_2$) to (1, 4). The results are shown in Figure 3. We also show the results of the experiment with three classes in Figure 4, where the differentiation parameters ($\delta_1, \delta_2, \delta_3$) are (1, 2, 3). From all these figures, we can observe that the proposed rate-allocation strategy is effective in achieving the expected slowdowns under various load conditions. This provides the basis for the following discussion of the properties of the PSD rate-allocation strategy.

Figure 3. Simulated and expected slowdowns of two classes ($\delta_1 : \delta_2 = 1 : 4$). [Plot: slowdown (logarithmic axis, 1 to 100) versus system load (0 to 100%); series: Class 1 and Class 2, simulated and expected.]

Figure 4. Simulated and expected slowdowns of three classes ($\delta_1 : \delta_2 : \delta_3 = 1 : 2 : 3$). [Plot: slowdown (logarithmic axis, 1 to 100) versus system load (0 to 100%); series: Classes 1, 2, and 3, simulated and expected.]


Figure 5. Percentiles of simulated slowdown ratios for two classes. [Plot: slowdown ratio (0 to 22) versus system load (0 to 100%); series: Class 2/Class 1 with $\delta_2/\delta_1$ = 2, 4, and 8; one off-scale value is labeled 27.07.]

4.3 Differentiation Predictability of the Rate-allocation Strategy

In this section, we present experimental results to demonstrate the differentiation predictability of the proposed rate-allocation strategy. Recall that the predictability property means that the slowdown of a higher class is proportionally smaller than that of a lower class under various system load conditions and at various timescales.

We first show the results at long timescales. Figures 5 and 6 show the results of different parameter settings for two and three classes, respectively. In both figures, the upper line is the 95th percentile, the bar is the 50th percentile, and the lower line is the 5th percentile. When a percentile is too large to plot, we give its value in the figure directly. Both figures show that, under various system load conditions, the proposed rate-allocation strategy can keep the achieved ratios of the average slowdown close to the corresponding pre-specified differentiation parameter ratios. This means that a higher class has a proportionally smaller average slowdown than a lower class at long timescales, although some variance exists. From these results, we find that the proposed rate-allocation strategy can achieve the objective of providing PSD services to different classes with different differentiation parameters under various system load conditions at long timescales.

From the long-timescale results shown in the figures, we can also observe that the achieved slowdown ratios are not distributed equally around the 50th percentile. For example, Figure 5 shows that when the pre-specified differentiation ratio of the two classes ($\delta_2/\delta_1$) is 4 and the system load is 10%, the 95th percentile of the experienced slowdown ratio is about 12 while the 5th percentile is 1.2. We believe this behavior is caused by the heavy-tail property of the Bounded Pareto distribution. More experiments will be conducted in our future work to further investigate this behavior.

Figure 6. Percentiles of simulated slowdown ratios for three classes. [Plot: slowdown ratio (0 to 10) versus system load (0 to 100%); series: Class 2/Class 1 ($\delta_2/\delta_1 = 2$) and Class 3/Class 1 ($\delta_3/\delta_1 = 3$).]

From Figure 5, we unexpectedly find that when the pre-specified differentiation ratio is small (i.e., $\delta_2/\delta_1 = 2$), the experienced slowdown of a higher class (class 1) can be larger than that of a lower class (class 2). Such behavior is clearly observable when the system load is 10%. That is, the ratio of the experienced slowdowns of class 2 to class 1 is lower than 1 at the 5th percentile, while the pre-specified differentiation ratio is 2.

To demonstrate the differentiation predictability of the proposed rate-allocation strategy at short timescales, we show the slowdowns of individual requests in Figures 7 and 8. From Figure 7, we can observe that the slowdown difference between class 1 and class 2 is small in moderate-load conditions. Some requests from class 1 have large slowdowns, as do some from class 2, although the pre-specified slowdown ratio of class 2 to class 1 is 2. Figure 8 shows the results in heavy-load conditions. We can observe that requests from class 1 experienced larger slowdowns than those from class 2. This behavior contradicts their pre-specified differentiation ratios. Close analysis shows that the slowdown ratio of class 2 to class 1 is 0.33 instead of 2 during this period. More experiments have been carried out to verify this. We find that sometimes the behavior of individual requests is consistent with their slowdown parameters, and sometimes not. These results suggest that the proposed rate-allocation strategy can provide only weak predictability at short timescales. The strategy only determines the processing rate allocated to a class periodically. In other words, it acts according to the macro-behavior (class load) of a class rather than its micro-behavior, such as the experienced slowdowns of individual requests. Improving the performance of the rate-allocation strategy in achieving differentiation predictability at short timescales will be part of our future work.

Figure 7. Slowdown of individual requests when the system load is 50%. [Plot: slowdown (0 to 150) versus time (60000 to 61000 time units); series: Class 1 and Class 2, simulated.]

4.4 Differentiation Controllability of the Rate-allocation Algorithm

In this section, we show the differentiation controllabil-ity due to the proposed rate-allocation strategy. Figure 9shows the achieved slowdown ratios of two classes withdifferent differentiation parameter settings. It can be seenthat when the pre-specified differentiation parameter ratiosare small (i.e., ���=2 and 4), the rate-allocation strategycan accurately achieve the corresponding slowdown ratiosat various load conditions. As the pre-specified differentia-tion parameter ratio increases (���=8), the difference be-tween the achieved slowdown ratios and pre-specified onesbecome large at various load conditions. Such behavior, webelieve, is caused by the load estimation error.

From (17), we can see that the processing rate allocated to a class is determined by its class load, its differentiation parameter, and the system load. Therefore, it is important for the rate-allocation strategy to estimate a class's load accurately. Meanwhile, it is necessary to perform this estimation in short timescales, such as the one thousand time units used in our experiments, so as to preserve the adaptiveness and short-timescale predictability of the PSD provisioning. Such estimation, however, is difficult because of the burstiness of the Bounded Pareto distribution and the variance of the Poisson distribution. According to (17), the influence of the estimation error on the achieved slowdown ratio grows as the differentiation parameter increases.
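The difficulty of short-window load estimation can be illustrated with a small simulation. The following is a minimal sketch, not the estimator used in the experiments: it assumes Poisson arrivals, Bounded Pareto job sizes drawn by inverse-transform sampling, and a naive estimator that divides the work arriving in each window by the window length. The function names and the parameter values (arrival rate 1.72, shape 1.5, bounds 0.1 and 100) are hypothetical choices for illustration; only the window length of one thousand time units comes from the text. It does not evaluate (17).

```python
import random


def sample_bounded_pareto(rng, alpha, k, p):
    """Draw one job size from a Bounded Pareto distribution on [k, p].

    Uses inverse-transform sampling on the CDF
    F(x) = (1 - (k/x)^alpha) / (1 - (k/p)^alpha).
    """
    u = rng.random()
    return k / (1.0 - u * (1.0 - (k / p) ** alpha)) ** (1.0 / alpha)


def windowed_load_estimates(rng, arrival_rate, alpha, k, p, window, n_windows):
    """Estimate the offered load (work per time unit) in consecutive windows.

    Arrivals are Poisson, so each window can be simulated independently
    (the exponential inter-arrival time is memoryless).
    """
    estimates = []
    for _ in range(n_windows):
        t = rng.expovariate(arrival_rate)
        work = 0.0
        while t < window:
            work += sample_bounded_pareto(rng, alpha, k, p)
            t += rng.expovariate(arrival_rate)
        estimates.append(work / window)
    return estimates


if __name__ == "__main__":
    # Illustrative parameters (assumptions, not taken from the paper).
    est = windowed_load_estimates(random.Random(2), 1.72, 1.5, 0.1, 100.0,
                                  window=1000.0, n_windows=50)
    mean = sum(est) / len(est)
    print("mean estimated load:", round(mean, 3))
    print("smallest / largest window estimate:",
          round(min(est), 3), round(max(est), 3))
```

Running the sketch shows that the per-window estimates scatter around the long-run load even though the underlying arrival process does not change, which is the kind of error the rate allocator has to tolerate.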

Figure 10 depicts the simulated slowdown ratios for a system with three classes. In comparison with Figure 9, we can see that the variance of these ratios is larger than that in a system with two classes. This behavior is also caused by the estimation error. When the load of one class is estimated inaccurately, it affects not only the processing rate allocated to this class, but also the rates allocated to the other classes. Therefore, the situation may become worse as the number of classes to be differentiated increases.

Figure 8. Slowdown of individual requests when the system load is 90%. [x-axis: Time (time unit), 60000 to 61000; y-axis: Slowdown; series: Class 1 (simulated), Class 2 (simulated)]

Figure 9. Simulated slowdown ratios of two classes. [x-axis: System load (%); y-axis: Slowdown ratio; series: Class 2/Class 1 with $\delta_2/\delta_1 = 2$, 4, and 8]

Although the simulated slowdown ratios show some variance under various system conditions, we can observe from both figures that the rate-allocation strategy can achieve the pre-specified differentiation ratios. These results illustrate that the strategy can control the slowdown ratios between classes adaptively.


Figure 10. Simulated slowdown ratios of three classes. [x-axis: System load (%); y-axis: Slowdown ratio; series: Class 2/Class 1 ($\delta_2/\delta_1 = 2$), Class 3/Class 1 ($\delta_3/\delta_1 = 3$)]

Figure 11. Influence of the shape parameter of the Bounded Pareto distribution. [x-axis: Shape parameter of the Bounded Pareto distribution, 1 to 2; y-axis: Slowdown (log scale); series: Class 1 and Class 2, simulated and expected]

4.5 Influence of the Shape Parameter and Upper Bound

In this part, we examine the influence of the shape parameter $\alpha$ and the upper bound $p$ of the Bounded Pareto distribution on the performance of the proposed rate-allocation strategy. The shape parameter determines the correlation of traffic requests. A large shape parameter implies that the requests are independent of each other in size. The upper bound reflects the heavy-tail property of the Bounded Pareto distribution.

Figure 11 shows the experienced slowdowns of two classes with a fixed differentiation parameter ratio $\delta_2/\delta_1$ under various shape parameter settings ($1.0 \le \alpha \le 2.0$). The first observation is that the shape parameter has little influence on the differentiation predictability of the proposed rate-allocation strategy. Both classes receive the differentiated slowdowns as expected. The differences between the experienced slowdowns and the expected ones do not depend on the shape parameter. This behavior is explained by the fact that the rate-allocation strategy makes no assumption about the shape parameter. The second important observation is that the slowdown of a class decreases as the shape parameter increases. Intuitively, for given lower and upper bounds, the smaller the shape parameter $\alpha$, the burstier the generated traffic [16]. A request may then experience a larger queueing delay than it would under "smooth" traffic. Formally, by (4) and (5), we know that when the shape parameter decreases, the second moment $E[X^2]$ increases, $E[X^{-1}]$ decreases, and the expected slowdown increases.

Figure 12. Influence of the upper bound of the Bounded Pareto distribution. [x-axis: Upper bound of the Bounded Pareto distribution, 100 to 10000 (log scale); y-axis: Slowdown (log scale); series: Class 1 and Class 2, simulated and expected]

The upper bound of the Bounded Pareto distribution ($p$) also affects the experienced slowdown of the $M/G_P/1$ traffic. We show the results in Figure 12. The ratio of the differentiation parameters of the two classes ($\delta_2/\delta_1$) is 2. First, it can be seen that the upper bound has little influence on the differentiation predictability of the rate-allocation strategy. The differences between the experienced slowdowns and the expected ones do not depend on the upper bound. Second, the higher the upper bound, the larger the expected slowdown of the classes. Note that the lower bound of the Bounded Pareto distribution ($k$) remains the same. Intuitively, as the upper bound increases, the Bounded Pareto distribution becomes more heavy-tailed and the slowdown increases. By (4) and (5), as the upper bound increases, the second moment of the traffic $E[X^2]$ increases and $E[X^{-1}]$ remains almost unchanged. In summary, in the $M/G_P/1$ traffic model, the slowdown decreases as the shape parameter increases and increases as the upper bound increases.
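Both trends can be reproduced numerically from the moments of the Bounded Pareto distribution. The sketch below is an illustration under stated assumptions rather than a restatement of (4) and (5): it assumes the standard Bounded Pareto density $f(x) = \alpha k^{\alpha} x^{-\alpha-1} / (1 - (k/p)^{\alpha})$ on $[k, p]$, a single M/G/1 FCFS queue whose expected waiting time follows the Pollaczek-Khinchine formula, and the fact that under FCFS a request's waiting time is independent of its own service time, so the expected slowdown is $E[W] \cdot E[X^{-1}]$. The function names, the fixed load of 0.5, and the bounds are hypothetical.

```python
import math


def bp_moment(j, alpha, k, p):
    """Return E[X^j] for a Bounded Pareto distribution with shape alpha on [k, p]."""
    norm = alpha * k ** alpha / (1.0 - (k / p) ** alpha)
    if math.isclose(j, alpha):
        # The integrand reduces to 1/x when j equals alpha.
        return norm * math.log(p / k)
    return norm * (p ** (j - alpha) - k ** (j - alpha)) / (j - alpha)


def expected_slowdown(load, alpha, k, p):
    """Expected slowdown of an M/G/1 FCFS queue at the given utilization (load < 1)."""
    ex = bp_moment(1, alpha, k, p)
    ex2 = bp_moment(2, alpha, k, p)
    ex_inv = bp_moment(-1, alpha, k, p)
    arrival_rate = load / ex
    # Pollaczek-Khinchine mean waiting time.
    expected_wait = arrival_rate * ex2 / (2.0 * (1.0 - load))
    return expected_wait * ex_inv


if __name__ == "__main__":
    # Illustrative settings (assumptions): load 0.5, lower bound 0.1.
    for alpha in (1.1, 1.5, 1.9):
        print("alpha =", alpha, "slowdown =", expected_slowdown(0.5, alpha, 0.1, 100.0))
    for p in (100.0, 1000.0, 10000.0):
        print("p =", p, "slowdown =", expected_slowdown(0.5, 1.5, 0.1, p))
```

With these settings the computed slowdown falls as the shape parameter grows and rises as the upper bound grows, which matches the direction of the curves described for Figures 11 and 12.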


5 Related Work

The DiffServ provisioning problem was first addressed in the network core. The proportional differentiation model [7] has been accepted as an important DiffServ model and has been applied in the PDD model in packet scheduling [8]. Many algorithms have been designed to achieve PDD provisioning in network routers. They can be classified into two categories: rate-based algorithms, such as BPR [8] and JoBS [18], and time-dependent priority-based algorithms, such as WTP, PAD, and HPD [9], adaptive WTP [11, 17], MDP [19], and VirtualLength [23]. Servers play an important role in end-to-end DiffServ provisioning. These algorithms can be tailored to request scheduling for PDD provisioning on the server side [15]. However, they are not applicable to PSD provisioning on the server side because slowdown depends not only on a job's queueing delay but also on its service time, which varies significantly depending on the requested services.

On the server side, a primary focus of DiffServ provisioning has been priority-based request scheduling for response time differentiation [2, 4, 6, 10]. For example, in [2], the authors addressed strict priority scheduling strategies for controlling CPU utilization in Web content hosting servers. QoS was introduced by assigning priorities to requests for different contents. Requests of lower priority classes were executed only if no requests existed in any higher priority class. The results showed that service differentiation can be achieved, but the quality spacings among different classes cannot be guaranteed by this kind of strict priority scheduling. Therefore, priority-based scheduling schemes of this kind cannot achieve PSD provisioning.

Admission control is often used in combination with priority-based scheduling for DiffServ provisioning. For example, in [1], the authors used classical feedback control theory to achieve overload protection, performance guarantees, and service differentiation in Web servers. The strategy was based on real-time scheduling theory, which states that response time can be guaranteed if server utilization is maintained below a pre-computed bound. Thus, control-theoretic approaches, in combination with content adaptation strategies, were formulated to keep server utilization at or below the bound. In [15], the authors proposed admission control algorithms in combination with time-dependent priority scheduling for proportional queueing-delay differentiation on a Web server. Therefore, admission control by itself is not sufficient for PDD provisioning and is not applicable to PSD provisioning.

Stretch factor, a variant of slowdown, was adopted in [25] as the performance metric for DiffServ provisioning in a cluster of Internet servers. The authors proposed a demand-driven DiffServ strategy by adopting an $M/M/1$ queueing model to guide node-based resource allocation optimization. They implicitly applied a processor sharing scheduling strategy for the modeling of stretch factor. However, in a single queue, a realistic scheduling strategy is FCFS. We note that for an $M/M/1$ FCFS queue with an unbounded exponential service time distribution, there is no valid stretch factor or slowdown because $E[X^{-1}]$ does not exist. For an $M/M/1$ FCFS queue with a bounded exponential service time distribution, there is no closed form expression for the stretch factor or slowdown because $E[X^{-1}]$ only has a definite value when the lower bound and the upper bound of the service time are given. Recent Internet workload measurements indicate that for many Web applications the exponential distribution is a poor model for the service time distribution and that a heavy-tailed distribution is more accurate [3, 13]. In this paper, we investigate the problem of processing rate allocation for PSD provisioning under a popular heavy-tailed traffic pattern (Bounded Pareto).
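The two statements about the exponential case can be checked directly. The following is a short sketch, assuming a service time $X$ with rate $\mu$ and, in the bounded case, truncation to $[a, b]$ with $0 < a < b$; the symbols $\mu$, $a$, $b$, and the exponential integral $E_1$ are introduced here for illustration and do not appear in the original text.

```latex
% Unbounded exponential service time: the integral diverges at the origin,
% because e^{-\mu x} \ge e^{-\mu} on (0, 1].
E[X^{-1}] = \int_0^{\infty} \frac{1}{x}\, \mu e^{-\mu x}\, dx
          \;\ge\; \mu e^{-\mu} \int_0^{1} \frac{dx}{x} = \infty .

% Exponential service time truncated to [a, b]: the value is finite but is
% expressed through the exponential integral
% E_1(z) = \int_z^{\infty} t^{-1} e^{-t}\, dt, not through elementary functions.
E[X^{-1}] = \frac{\mu}{e^{-\mu a} - e^{-\mu b}} \int_a^{b} \frac{e^{-\mu x}}{x}\, dx
          = \frac{\mu \left( E_1(\mu a) - E_1(\mu b) \right)}{e^{-\mu a} - e^{-\mu b}} .
```

The first line shows why the slowdown is undefined without a lower bound on the service time; the second shows why, even with bounds, the result can only be evaluated numerically once $a$ and $b$ are fixed.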

In [13], Harchol-Balter used slowdown as a primary performance metric in evaluating task assignment strategies in a distributed server system, where the workload was heavy-tailed (Bounded Pareto) and the job size was unknown to the scheduler. The primary objective was to minimize the mean slowdown of all job classes in the distributed system. In this paper, we give a closed analytic form of the expected slowdown in an $M/G_P/1$ FCFS queue for processing rate allocation on servers for PSD provisioning among different job classes. Our work is complementary to the previous work.

6 Conclusion

Slowdown is an important performance metric on Internet servers because it takes into account both the delay and the service time of a request. Although proportional delay differentiation has been studied extensively in the literature, little work has addressed models for PSD provisioning on the server side. In this paper, we have investigated the problem of processing rate allocation for PSD provisioning on Internet servers. We have derived a closed form expression of the expected slowdown in $M/G/1$ FCFS queues with a Bounded Pareto service time distribution, referred to as $M/G_P/1$ queues. We have deployed a task server for handling each request class in an FCFS way and presented a strategy of processing rate allocation for the task servers in support of PSD provisioning. We have built a simulation model for the processing rate allocation. Simulation results have shown that the allocation strategy can achieve the objective of providing predictable, controllable, and fair proportional slowdown differentiation on the servers. Our future work will be on improving the performance of the rate-allocation strategy in providing short-timescale differentiation predictability.


Acknowledgement

This research was supported in part by NSF grants CCR-9988266 and ACI-0203592.

References

[1] T. F. Abdelzaher, K. G. Shin, and N. Bhatti. Performance guarantees for Web server end-systems: a control-theoretical approach. IEEE Trans. on Parallel and Distributed Systems, 13(1):80–96, 2002.

[2] J. Almeida, M. Dabu, A. Manikutty, and P. Cao. Providing differentiated levels of services in Web content hosting. In Proc. ACM SIGMETRICS Workshop on Internet Server Performance, pages 91–102, 1998.

[3] M. Arlitt, D. Krishnamurthy, and J. Rolia. Characterizing the scalability of a large Web-based shopping system. ACM Trans. on Internet Technology, 1(1):44–69, 2001.

[4] N. Bhatti and R. Friedrich. Web server support for tiered services. IEEE Network, 13(5):64–71, 1999.

[5] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss. An architecture for differentiated services. IETF RFC 2475, 1998.

[6] X. Chen and P. Mohapatra. Performance evaluation of service differentiating Internet servers. IEEE Trans. on Computers, 51(11):1368–1375, 2002.

[7] C. Dovrolis and P. Ramanathan. A case for relative differentiated services and the proportional differentiation model. IEEE Network, 13(5):26–34, 1999.

[8] C. Dovrolis, D. Stiliadis, and P. Ramanathan. Proportional differentiated services: Delay differentiation and packet scheduling. In Proc. ACM SIGCOMM, 1999.

[9] C. Dovrolis, D. Stiliadis, and P. Ramanathan. Proportional differentiated services: Delay differentiation and packet scheduling. IEEE/ACM Trans. on Networking, 10(1):12–26, 2002.

[10] L. Eggert and J. Heidemann. Application-level differentiated services for Web servers. World Wide Web Journal, 3(2):133–142, 1999.

[11] L. Essafi, G. Bolch, and A. Andres. An adaptive waiting time priority scheduler for the proportional differentiation model. In Proc. of the High Performance Computing Symposium, April 2001.

[12] Free Software Foundation. GSL – GNU Scientific Library. Available: http://www.gnu.org/software/gsl/.

[13] M. Harchol-Balter. Task assignment with unknown duration. Journal of the ACM, 49(2):260–288, 2002.

[14] L. Kleinrock. Queueing Systems, Volume II. John Wiley and Sons, 1976.

[15] S. C. M. Lee, J. C. S. Lui, and D. K. Y. Yau. Admission control and dynamic adaptation for a proportional-delay DiffServ-enabled Web server. In Proc. ACM SIGMETRICS, 2002.

[16] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Trans. on Networking, 2(1):1–15, February 1994.

[17] M. K. H. Leung, J. C. S. Lui, and D. K. Y. Yau. Adaptive proportional delay differentiated services: Characterization and performance evaluation. IEEE/ACM Trans. on Networking, 9(6):801–817, 2001.

[18] J. Liebeherr and N. Christin. JoBS: Joint buffer management and scheduling for differentiated services. In Proc. of the Int'l Workshop on Quality of Service (IWQoS), pages 404–418, June 2001.

[19] T. Nandagopal, N. Venkitaraman, R. Sivakumar, and V. Bharghavan. Delay differentiation and adaptation in core stateless networks. In Proc. IEEE INFOCOM, pages 421–430, April 2000.

[20] A. K. Parekh and R. G. Gallager. A generalized processor sharing approach to flow control in integrated services networks: the single-node case. IEEE/ACM Trans. on Networking, 1(3):344–357, 1993.

[21] A. Riska, E. Smirni, and G. Ciardo. ADAPTLOAD: effective balancing in clustered Web servers under transient load conditions. In Proc. IEEE Int'l Conf. on Distributed Computing Systems (ICDCS), 2002.

[22] C. A. Waldspurger and W. E. Weihl. Lottery scheduling: flexible proportional-share resource management. In Proc. of the 1st USENIX Symposium on Operating System Design and Implementation, 1994.

[23] J. Wei, Q. Li, and C.-Z. Xu. VirtualLength: A new packet scheduling algorithm for proportional delay differentiation. In Proc. of IEEE Int'l Conf. on Computer Communications and Network (ICCCN), 2003.

[24] X. Zhou, J. Wei, and C.-Z. Xu. Modeling and analysis of 2D service differentiation on E-commerce servers. In Proc. of the 24th IEEE Int'l Conference on Distributed Computing Systems (ICDCS), March 2004.

[25] H. Zhu, H. Tang, and T. Yang. Demand-driven service differentiation for cluster-based network servers. In Proc. IEEE INFOCOM, pages 679–688, 2001.
