Mor Harchol-Balter Carnegie Mellon University School of Computer Science


“size” = service requirement

load < 1

jobs SRPT

jobs PS

Q: Which minimizes mean response time?
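A minimal simulation sketch for exploring this question on a single server. Everything specific here is an assumption for illustration, not something taken from the slides: Poisson arrivals, exponential job sizes with mean 1, load 0.8, and the function names `make_jobs`, `mean_response_srpt`, and `mean_response_ps`.

```python
import heapq
import random


def make_jobs(n, load, seed=0):
    """Return n (arrival_time, size) pairs: Poisson arrivals, exponential
    sizes with mean 1 (assumed), so the arrival rate equals the load."""
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(load)
        jobs.append((t, rng.expovariate(1.0)))
    return jobs


def mean_response_srpt(jobs):
    """Always serve the job with the shortest remaining processing time."""
    heap, now, total, i, done, n = [], 0.0, 0.0, 0, 0, len(jobs)
    while done < n:
        next_arrival = jobs[i][0] if i < n else float("inf")
        if not heap:  # server idle: jump to the next arrival
            now = next_arrival
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
            continue
        rem, arr = heap[0]
        if now + rem <= next_arrival:  # job completes before next arrival
            heapq.heappop(heap)
            now += rem
            total += now - arr
            done += 1
        else:  # serve until the arrival, then admit the new job
            heapq.heapreplace(heap, (rem - (next_arrival - now), arr))
            now = next_arrival
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
    return total / n


def mean_response_ps(jobs):
    """Processor sharing: k jobs in system each get rate 1/k.
    v is the service received so far by any job that stays in the system."""
    heap, now, v, total, i, done, n = [], 0.0, 0.0, 0.0, 0, 0, len(jobs)
    while done < n:
        next_arrival = jobs[i][0] if i < n else float("inf")
        if not heap:
            now = next_arrival
            heapq.heappush(heap, (v + jobs[i][1], jobs[i][0]))
            i += 1
            continue
        k = len(heap)
        v_finish, arr = heap[0]
        t_finish = now + (v_finish - v) * k
        if t_finish <= next_arrival:
            heapq.heappop(heap)
            now, v = t_finish, v_finish
            total += now - arr
            done += 1
        else:
            v += (next_arrival - now) / k
            now = next_arrival
            heapq.heappush(heap, (v + jobs[i][1], jobs[i][0]))
            i += 1
    return total / n


if __name__ == "__main__":
    jobs = make_jobs(20000, load=0.8, seed=1)
    print("SRPT mean response time:", mean_response_srpt(jobs))
    print("PS   mean response time:", mean_response_ps(jobs))
```

Both policies are run on the same job sequence, so the comparison is not affected by sampling differences between runs.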

“size” = service requirement

jobs SRPT

load < 1

jobs PS

Q: Which best represents scheduling in web servers?

How about using SRPT instead of PS in web servers?

Linux O.S.

WEB SERVER (Apache)

client 1

client 2

client 3

“Get File 1”

“Get File 2”

“Get File 3”

Internet

Many servers receive mostly static web requests.

“GET FILE”

For static web requests, know file size

Approx. know service requirement of request.

Immediate Objections

1) Can’t assume known job size

2) But the big jobs will starve ...

Outline of Talk

[BH – Sigmetrics 01] “Analysis of SRPT: Investigating Unfairness”

[HSW – Performance 02] “Asymptotic Convergence of Scheduling Policies…”

[WH – Sigmetrics 03*] “Classifying Scheduling Policies wrt Unfairness …”

THEORY

IMPLEMENT

www.cs.cmu.edu/~harchol/

[HSBA – TOCS 03] “Size-based Scheduling to Improve Web Performance”

[SH – ITC 03*] “Web servers under overload: How scheduling can help”

[MSAH – ICDE 03] “Priority Mechanisms for OLTP and Web Applications”

(M/G/1)

Schroeder

Wierman

THEORY SRPT has a long history ...

1966 Schrage & Miller derive M/G/1/SRPT response time:
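For reference, the standard textbook form of the M/G/1/SRPT mean response time for a job of size $x$ is sketched below. The notation is an assumption, since the slide's own formula did not survive: $\lambda$ is the arrival rate, $f$ and $F$ are the job size density and distribution function, and $\rho(x)$ is the load made up of jobs of size at most $x$.

```latex
\[
E[T(x)]^{SRPT}
  = \frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 \bigl(1 - F(x)\bigr)}
         {2\,\bigl(1 - \rho(x)\bigr)^2}
  + \int_0^x \frac{dt}{1 - \rho(t)},
\qquad
\rho(x) = \lambda \int_0^x t f(t)\,dt .
\]
```

The first term is the expected waiting time (until the job first receives service) and the second is the expected residence time (from first service to completion).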

1968 Schrage proves optimality

1979 Pechinkin & Solovyev & Yashkov generalize

1990 Schassberger derives distribution on queue length

BUT WHAT DOES IT ALL MEAN?

THEORY SRPT has a long history (cont.)

1990 – 97 7-year long study at Univ. of Aachen under Schreiber

SRPT WINS BIG ON MEAN!

1998, 1999 Slowdown for SRPT under adversary: Rajmohan, Gehrke, Muthukrishnan, Rajaraman, Shaheen, Bender, Chakrabarti, etc.

SRPT STARVES BIG JOBS!

Various O.S. books (Silberschatz, Stallings, Tanenbaum) warn about starvation of big jobs ...

Kleinrock’s Conservation Law: “Preferential treatment given to one class of customers is afforded at the expense of other customers.”

Unfairness Question

Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$)

Question: Which queue does biggest job prefer?
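One way to probe this question numerically is to evaluate the standard M/G/1 mean response time formulas for SRPT and PS at the largest job size. The sketch below assumes load 0.9 and a Bounded Pareto with shape 1.1 and maximum $10^{10}$, as on the slide. The lower bound `K_MIN = 332` is an assumption (the slide does not give it), chosen so the mean job size is roughly 3000; the function names are hypothetical.

```python
import math

ALPHA = 1.1      # Bounded Pareto shape (from the slide)
P_MAX = 1e10     # largest job size (from the slide)
K_MIN = 332.0    # smallest job size: ASSUMED, not given on the slide
RHO = 0.9        # load (from the slide)

# Density on [K_MIN, P_MAX]: f(t) = ALPHA * C * t**(-ALPHA - 1)
C = K_MIN**ALPHA / (1.0 - (K_MIN / P_MAX) ** ALPHA)


def partial_moment(j, x):
    """Closed form of the integral of t**j * f(t) from K_MIN to x."""
    x = min(max(x, K_MIN), P_MAX)
    e = j - ALPHA
    return ALPHA * C * (x**e - K_MIN**e) / e


def cdf(x):
    x = min(max(x, K_MIN), P_MAX)
    return C * (K_MIN**-ALPHA - x**-ALPHA)


LAMBDA = RHO / partial_moment(1, P_MAX)  # arrival rate giving load RHO


def rho_upto(x):
    """Load contributed by jobs of size at most x."""
    return LAMBDA * partial_moment(1, x)


def srpt_response(x, steps=20000):
    """E[T(x)] under M/G/1/SRPT: waiting time plus residence time."""
    waiting = LAMBDA * (partial_moment(2, x) + x * x * (1.0 - cdf(x)))
    waiting /= 2.0 * (1.0 - rho_upto(x)) ** 2
    if x <= K_MIN:
        return waiting + x  # no smaller jobs exist, so rho(t) = 0 on [0, x]
    # Residence time: integral of dt / (1 - rho(t)); trapezoid on a log grid.
    residence = K_MIN
    ratio = (x / K_MIN) ** (1.0 / steps)
    t_prev, g_prev = K_MIN, 1.0 / (1.0 - rho_upto(K_MIN))
    for _ in range(steps):
        t = min(t_prev * ratio, x)
        g = 1.0 / (1.0 - rho_upto(t))
        residence += 0.5 * (g + g_prev) * (t - t_prev)
        t_prev, g_prev = t, g
    return waiting + residence


def ps_response(x):
    """E[T(x)] under M/G/1/PS."""
    return x / (1.0 - RHO)


if __name__ == "__main__":
    for x in (K_MIN, 1e4, 1e7, P_MAX):
        print(f"x={x:.3g}  SRPT={srpt_response(x):.4g}  PS={ps_response(x):.4g}")
```

Under these assumptions the largest job comes out slightly ahead under SRPT in this sketch; a different lower bound or load can change the margin.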

Results on Unfairness

Let $\rho = 0.9$. Let G: Bounded Pareto($\alpha = 1.1$, max $= 10^{10}$)

I SRPT

Unfairness – General Distribution

All-can-win-theorem:

For all distributions, if $\rho \le \tfrac{1}{2}$,

$E[T(x)]^{SRPT} \le E[T(x)]^{PS}$ for all $x$.


Proof idea:

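$$\underbrace{\frac{\lambda \int_0^x t^2 f(t)\,dt + \lambda x^2 (1 - F(x))}{2\,(1 - \rho(x))^2}}_{\text{Waiting time (SRPT)}} \;+\; \underbrace{\int_0^x \frac{dt}{1 - \rho(t)}}_{\text{Residence (SRPT)}} \;\le\; \underbrace{\frac{x}{1 - \rho}}_{\text{Total (PS)}}$$

where $\rho(x) = \lambda \int_0^x t f(t)\,dt$.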

Classification of Scheduling Policies

ALWAYS FAIR: For all loads, for all service distributions,

$\forall x,\ E[T(x)]^{P} \le E[T(x)]^{PS}$

ALWAYS UNFAIR: For all loads, for all service distributions,

$\exists x,\ E[T(x)]^{P} > E[T(x)]^{PS}$

SOMETIMES UNFAIR: For some loads:

$\forall x,\ E[T(x)]^{P} \le E[T(x)]^{PS}$

For other loads:

$\exists x,\ E[T(x)]^{P} > E[T(x)]^{PS}$

Classification of Scheduling Policies

Always Fair

Always Unfair

Sometimes Unfair

Age-Based Policies

Preemptive Size-based Policies

Remaining Size-based Policies

Non-preemptive

PS PLCFS

SRPT

Lots of open problems…

What does SRPT mean within a Web server?

• Many devices: Where to do the scheduling?

• No longer one job at a time.

IMPLEMENT From theory to practice:

IMPLEMENT Server’s Performance Bottleneck

Linux O.S.

WEB SERVER (Apache)

client 1

client 2

client 3

“Get File 1”

“Get File 2”

“Get File 3”

Rest of Internet

ISP

Site buys limited fraction of ISP’s bandwidth

We model bottleneck by limiting bandwidth on server’s uplink.

Network/O.S. insides of traditional Web server

Sockets take turns draining: FAIR = PS.

Web Server

Socket 1

Socket 3

Socket 2

Network Card

Client 1

Client 3

Client 2

BOTTLENECK

IMPLEMENT

Network/O.S. insides of our improved Web server

Socket corresponding to file with smallest remaining data gets to feed first.
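A toy sketch contrasting the two draining disciplines, assuming all sockets are ready at time zero and the bottleneck link sends one unit of data per time step. The socket names, file sizes, and the functions `drain_fair` and `drain_srpt` are hypothetical; this is not the kernel mechanism itself.

```python
import heapq
from collections import deque


def drain_fair(files, quantum=1):
    """Round-robin draining: each socket sends `quantum` units per turn.
    Returns {socket: completion time} in units of data sent on the link."""
    queue = deque(files.items())
    clock, finish = 0, {}
    while queue:
        sock, remaining = queue.popleft()
        sent = min(quantum, remaining)
        clock += sent
        remaining -= sent
        if remaining == 0:
            finish[sock] = clock
        else:
            queue.append((sock, remaining))
    return finish


def drain_srpt(files):
    """The socket whose file has the least remaining data feeds first.
    With no new arrivals, that socket keeps the link until it finishes."""
    heap = [(size, sock) for sock, size in files.items()]
    heapq.heapify(heap)
    clock, finish = 0, {}
    while heap:
        remaining, sock = heapq.heappop(heap)
        clock += remaining
        finish[sock] = clock
    return finish


if __name__ == "__main__":
    files = {"socket1": 1, "socket2": 5, "socket3": 10}  # hypothetical sizes
    print("FAIR:", drain_fair(files))
    print("SRPT:", drain_srpt(files))
```

In this toy case the largest file completes at the same time under both disciplines, while the mid-sized file completes earlier under SRPT.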

Web Server

Socket 1

Socket 3

Socket 2

Network Card

Client 1

Client 3

Client 2

priority queues.

BOTTLENECK

IMPLEMENT

Experimental Setup

Implementation of SRPT-based scheduling:

1) Modifications to Linux O.S.: 6 priority levels

2) Modifications to Apache Web server

3) Priority algorithm design
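A rough sketch of how remaining file size could be mapped onto 6 priority levels, with the level re-evaluated as data is sent. The cutoff values, the convention that level 0 is the highest priority, and the function names are all assumptions for illustration; the slides do not specify the actual priority algorithm.

```python
import bisect

# HYPOTHETICAL cutoffs in bytes: 5 cutoffs partition sizes into 6 levels.
CUTOFFS = [1_000, 5_000, 20_000, 100_000, 500_000]


def priority_level(remaining_bytes, cutoffs=CUTOFFS):
    """Return 0 (assumed highest priority) to 5 (lowest) for a request
    with the given number of bytes still to send."""
    return bisect.bisect_right(cutoffs, remaining_bytes)


def priorities_while_sending(file_size, chunk):
    """Priority level seen before each chunk is sent; it can only improve
    (move toward 0) as the remaining size shrinks."""
    levels = []
    remaining = file_size
    while remaining > 0:
        levels.append(priority_level(remaining))
        remaining -= min(chunk, remaining)
    return levels


if __name__ == "__main__":
    print(priority_level(41))                       # smallest file in the trace
    print(priority_level(2_000_000))                # largest file in the trace
    print(priorities_while_sending(30_000, 10_000))
```

Using a small fixed number of levels approximates SRPT while keeping the per-socket bookkeeping to a single integer.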

Linux O.S.

APACHE WEB SERVER

switch

Experimental Setup

APACHE WEB SERVER

Linux O.S.

switch

Trace-based workload:

Number of requests made: 1,000,000

Size of file requested: 41 B – 2 MB

Distribution of file sizes requested has heavy-tailed (HT) property.
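The experiments replay a real trace. When no trace is at hand, a synthetic stand-in with a similar range can be sketched by sampling file sizes from a Bounded Pareto on [41 B, 2 MB] via inverse-transform sampling. The shape parameter 1.1 and the function name `sample_file_sizes` are assumptions; the slide states only the size range and that the distribution is heavy-tailed.

```python
import random

MIN_BYTES = 41.0          # smallest requested file (from the slide)
MAX_BYTES = 2_000_000.0   # largest requested file (from the slide)
SHAPE = 1.1               # ASSUMED tail index, not given for the trace


def sample_file_sizes(n, seed=0):
    """Draw n file sizes from a Bounded Pareto by inverting its CDF:
    F(x) = (1 - (k/x)**a) / (1 - (k/p)**a) on [k, p]."""
    rng = random.Random(seed)
    tail = (MIN_BYTES / MAX_BYTES) ** SHAPE
    sizes = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1)
        x = MIN_BYTES / (1.0 - u * (1.0 - tail)) ** (1.0 / SHAPE)
        sizes.append(min(max(x, MIN_BYTES), MAX_BYTES))  # guard rounding
    return sizes


if __name__ == "__main__":
    sizes = sorted(sample_file_sizes(100_000, seed=1))
    total = sum(sizes)
    top_1_percent = sum(sizes[-1000:])
    print("median size (bytes):", sizes[len(sizes) // 2])
    print("mean size (bytes):", total / len(sizes))
    print("fraction of bytes in largest 1% of files:", top_1_percent / total)
```

The last line of output illustrates the heavy-tailed property: a small fraction of the largest files accounts for a disproportionate share of the bytes.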

Apache

WAN EMU

Geographically-dispersed clients

10Mbps uplink

100Mbps uplink

Trace-based

Open system

Partly-open

Load < 1

Transient overload

+ Other effects: initial RTO; user abort/reload; persistent connections, etc.

Preliminary Comments

• Job throughput, byte throughput, and bandwidth utilization were same under SRPT and FAIR scheduling.

• Same set of requests complete.

• No additional CPU overhead under SRPT scheduling. Network was bottleneck in all experiments.


Results: Mean Response Time (LAN)

Percentile of Request Size

Load = 0.8

Mean Response Time vs. Size Percentile (LAN)

Transient Overload

Transient Overload - Baseline

Mean response time

SRPT FAIR

Transient overload: Response time as function of job size

small jobs win big!

big jobs aren’t hurt!

FACTORS

Baseline Case

WAN propagation delays: RTT = 0 – 150 ms

WAN loss: Loss = 0 – 15%

WAN loss + delay: Loss = 0 – 15%, RTT = 0 – 150 ms

Persistent Connections: 0 – 10 requests/conn.

Initial RTO value: RTO = 0.5 sec – 3 sec

SYN Cookies: ON/OFF

User Abort/Reload: Abort after 3 – 15 sec, with 2, 4, 6, 8 retries

Packet Length: Packet length = 536 – 1500 Bytes

Realistic Scenario: RTT = 100 ms; Loss = 5%; 5 requests/conn.; RTO = 3 sec; pkt len = 1500 B; user aborts after 7 sec and retries up to 3 times

Transient Overload - Realistic

Mean response time

FAIR SRPT

Conclusion so far …

SRPT scheduling is a promising solution for reducing mean response time seen by clients, particularly when the load at the server bottleneck is high, or under transient overload conditions.

SRPT results in negligible or zero unfairness to large requests.

SRPT is easy to implement and efficient. No CPU overhead. No drop in throughput.

Results corroborated via implementation and analysis.
