FINISHING FLOWS QUICKLY WITH PREEMPTIVE SCHEDULING SPEAKER: YU, YE The slides are based on the paper...

FINISHING FLOWS QUICKLY WITH PREEMPTIVE SCHEDULING

SPEAKER: YU, YE

The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation.

OUTLINE

• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion

DATACENTER NETWORKS

LATENCY

1 ms latency = 1% reduction in sales.

100 ms latency = 0.2% fewer searches.

2.2 seconds faster per page = 60 M more downloads per year.

LOW LATENCY

• Low-latency datacenters?
• Finish flows earlier
  • The LAST of the flows == final result
• Meet flow deadlines
  • User-facing applications, latency goal

PARTITION AGGREGATE MODEL

• The associated component deadlines are shown in parentheses.

TODAY’S TRANSPORT PROTOCOLS

• TCP / RCP / ICTCP / DCTCP:
  • Fair sharing
  • Divide link bandwidth equally.
  • Fail to reduce flow completion time.

WHAT IS TCP

• TCP slow start
• TCP fast recovery
• Additive increase
• Multiplicative decrease

[Figure: TCP slow start between Host A and Host B over time; the sender transmits one segment, then two segments, then four segments in successive RTTs.]
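The window behaviour named on this slide can be sketched in a few lines. This is a deliberately simplified model, not real TCP: the function name `cwnd_trace`, the per-RTT granularity, and the rule "halve the window on loss" are assumptions made for illustration.

```python
def cwnd_trace(rtts, ssthresh, loss_at=()):
    """Return the congestion window (in segments) at the start of each RTT.

    Simplified model (an assumption, not a full TCP implementation):
    - slow start: the window doubles every RTT until it reaches ssthresh
    - additive increase: afterwards it grows by one segment per RTT
    - multiplicative decrease: on a loss, the window is halved
    """
    cwnd = 1
    trace = []
    for rtt in range(rtts):
        trace.append(cwnd)
        if rtt in loss_at:
            ssthresh = max(cwnd // 2, 1)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)
        else:
            cwnd += 1
    return trace


# One, two, four segments as in the figure, then linear growth, then a loss.
print(cwnd_trace(8, ssthresh=8, loss_at={5}))  # [1, 2, 4, 8, 9, 10, 5, 6]
```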

WHAT IS RCP

• Rate Control Protocol
• RCP is an adaptive algorithm that emulates processor sharing: a router divides outgoing link bandwidth equally.
• The rate is picked by the routers based on queue size and aggregate traffic.
• The router assigns a single rate to all flows.
• Requires no per-flow state or per-packet calculation.

FAIRNESS DAMAGES COMPLETION TIME

• Flows Fa, Fb, Fc arrive at the same time, with size = 1, 2, 3 and deadline = 1, 4, 6.
• Fair share: mean flow completion (FC) time = (3+5+6)/3 = 4.67
• D3 for order B, A, C: mean FC time = (2+4+6)/3 = 4
• Shortest Job First / Earliest Deadline First: mean FC time = (1+3+6)/3 = 3.33
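The fair-share and SJF numbers on this slide can be reproduced with a small calculation. The sketch below assumes a single link of capacity 1 and that all three flows start at time 0; the function names are made up for this example, and the D3 case is not modelled here.

```python
from fractions import Fraction


def fair_share_completion(sizes, capacity=1):
    """Completion times when all flows start at t=0 and split the link equally."""
    remaining = {i: Fraction(s) for i, s in enumerate(sizes)}
    finish = {}
    now = Fraction(0)
    while remaining:
        rate = Fraction(capacity) / len(remaining)  # equal share per active flow
        nxt = min(remaining, key=remaining.get)     # least remaining data finishes next
        dt = remaining[nxt] / rate
        now += dt
        for i in remaining:
            remaining[i] -= rate * dt
        finish[nxt] = now
        del remaining[nxt]
    return [finish[i] for i in range(len(sizes))]


def serial_completion(sizes, order, capacity=1):
    """Completion times when flows use the whole link one after another."""
    finish = [None] * len(sizes)
    now = Fraction(0)
    for i in order:
        now += Fraction(sizes[i]) / capacity
        finish[i] = now
    return finish


def mean(values):
    return sum(values) / len(values)


sizes = [1, 2, 3]  # flows A, B, C from the slide
fair = fair_share_completion(sizes)
sjf = serial_completion(sizes, order=sorted(range(len(sizes)), key=lambda i: sizes[i]))
print(round(float(mean(fair)), 2))  # 4.67, from completion times 3, 5, 6
print(round(float(mean(sjf)), 2))   # 3.33, from completion times 1, 3, 6
```

Under fair sharing every flow is slowed down by the others, so even the shortest flow finishes at t = 3 instead of t = 1.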

D3 DEPENDS ON FLOW ORDER

• D3 satisfies as many flows as possible in the order of their arrival.
• Request rate = flow size / time until deadline.

Requests are satisfied in arrival order.
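A rough sketch of why arrival order matters: each flow requests size / time-until-deadline, and bandwidth is handed out first come, first served. This models only a single allocation round on one link of capacity 1; the function `d3_allocate` and its return format are assumptions for illustration, and the real D3 protocol has more machinery than this.

```python
def d3_allocate(flows, capacity=1.0):
    """Greedy allocation in arrival order.

    flows: list of (name, size, time_until_deadline), earliest arrival first.
    Returns {name: (requested_rate, granted_rate)}.
    """
    granted = {}
    left = capacity
    for name, size, time_until_deadline in flows:
        request = size / time_until_deadline
        rate = min(request, left)  # later arrivals only get what is left
        granted[name] = (request, rate)
        left -= rate
    return granted


A, B, C = ("A", 1, 1), ("B", 2, 4), ("C", 3, 6)  # (name, size, deadline) from the slides
print(d3_allocate([A, B, C]))  # A arrives first and gets its full request of 1.0
print(d3_allocate([B, A, C]))  # B arrives first; A needs 1.0 but only 0.5 is left
```

With order B, A, C the most urgent flow (A) cannot obtain the rate it needs, even though serving A first would have let both A and B meet their deadlines.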

THE PDQ SOLUTION

• Preemptive Distributed Quick
• Schedule by flow criticality.
• Criticality: relative priority of flows.
• Scheduling discipline.

Preemptive: relating to the purchase of goods or shares by one person or party before the opportunity is offered to others.

PDQ’S SCHEDULING DISCIPLINES

• EDF: earliest deadline first
  • Optimal for flow deadlines.
• SJF: shortest job first
  • Optimal for mean flow finish time.
• EDF+SJF:
  • Give preference to deadline flows.
• Policy based:
  • Manually allocate the priority of flows.
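One way to picture the EDF+SJF discipline is as a sort key: flows with deadlines come first (earliest deadline first), and the remaining flows are ordered by shortest expected transmission time. The `Flow` class, its field names, and the tie-break on flow ID are assumptions made for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: int
    expected_time: float        # expected transmission time
    deadline: float = math.inf  # math.inf means "no deadline"


def criticality_key(flow):
    """Smaller key = more critical: EDF first, then SJF, then flow ID."""
    return (flow.deadline, flow.expected_time, flow.flow_id)


flows = [
    Flow(1, expected_time=5.0),
    Flow(2, expected_time=1.0),
    Flow(3, expected_time=3.0, deadline=20.0),
    Flow(4, expected_time=2.0, deadline=10.0),
]
ranked = sorted(flows, key=criticality_key)
print([f.flow_id for f in ranked])  # [4, 3, 2, 1]
```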

CHALLENGES

• Decentralizing the scheduling discipline
  • More mice than elephants.
• Switching between flows seamlessly
  • Hard to fully utilize bandwidth.
• Prioritizing flows using FIFO tail-drop queues
  • FIFO queue length is limited.

OUTLINE

• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion

PDQ PROTOCOL - OVERVIEW

PDQ PROTOCOL-PDQ SENDER-1

• SYN / TERM packets for initialization and termination.
• Resend after timeout.
• The sender maintains info for in-flight packets:
  • Current sending rate (Rs)
  • ID of the switch that paused it (Ps)
  • Deadline (Ds)
  • Expected flow transmission time (Ts)
  • Inter-probing time (Is)
  • Measured RTT (RTTs)

PDQ PROTOCOL-SENDER-2

• The sender sends packets at rate Rs.
• If Rs = 0, it periodically sends a probe packet (a scheduling header without data), like a heartbeat.
• When an ACK arrives, update Rs (ACK info: accept/pause).

PDQ PROTOCOL-SENDER-EARLY TERMINATION

• The sender terminates a flow when it cannot meet its deadline, that is, whenever:
  • The deadline has passed.
  • Current time + remaining flow transmission time > deadline.
  • The flow is paused, and current time + RTT > deadline.
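The three early-termination conditions translate directly into a predicate. This is a minimal sketch; the function name, the parameter names, and the use of `None` for "no deadline" are assumptions, and all times are taken to be in the same unit.

```python
def should_terminate(now, deadline, remaining_tx_time, paused, rtt):
    """Return True if the sender should give up on a deadline flow."""
    if deadline is None:                    # assumption: None marks a flow without deadline
        return False
    if now > deadline:                      # the deadline has passed
        return True
    if now + remaining_tx_time > deadline:  # cannot finish in time even if sending now
        return True
    if paused and now + rtt > deadline:     # paused, and not even one RTT is left
        return True
    return False


print(should_terminate(now=3.0, deadline=4.0, remaining_tx_time=2.0, paused=False, rtt=0.1))  # True
print(should_terminate(now=1.0, deadline=4.0, remaining_tx_time=2.0, paused=True, rtt=0.1))   # False
```

Terminating such flows early frees bandwidth for flows that can still meet their deadlines.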

PDQ PROTOCOL-SWITCH

• Let the most critical flow complete as soon as possible.
• Critical flows preempt others to achieve the highest possible sending rate.
• 1) Maintain state about each flow.
• 2) Compute rate feedback:
  • a) A flow controller decides which flows to send.
  • b) A rate controller determines the rate.

PDQ PROTOCOL-SWITCH-STATE

• Maintains flow state on each link: <Rate, P, Deadline, Expected Time, RTT>
• Pi: flow i is paused by switch Pi.
• Stores only the 2K most critical flows, where K is the number of currently sending flows.

PDQ PROTOCOL-FLOW CONTROL

• Whenever a switch receives an ACK or data packet, it ACCEPTs or PAUSEs the flow.
• Pause: inform the other switches that flow f is paused.
  • A switch that receives ACK-Pause for flow i removes i from its own state.
• Accept: calculate the available bandwidth.
  • Other switches that receive ACK-Accept for flow i update their state for i.

ALGORITHM: RECEIVE DATA/ACK OF FLOW F

• 1) If f is paused by another switch, remove it from my list.
• 2) If f is not in my list:
  • Try to add f to my list; if that is not possible, pause f.
• 3) If W = min(availableBW, Rf) > 0:
  • Accept f.
  • Otherwise, pause f.
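The three steps of this algorithm can be sketched as a toy switch model. Several details are assumptions made for illustration and are not taken from the slides: the class and method names, the eviction rule when the list is full, the definition of available bandwidth as "link capacity minus the rates of more critical flows in the list", and the requirement that criticality keys are unique (smaller key = more critical).

```python
class SwitchSketch:
    """Toy model of the per-packet accept/pause decision at one switch."""

    def __init__(self, switch_id, capacity, max_flows):
        self.switch_id = switch_id
        self.capacity = capacity    # link bandwidth
        self.max_flows = max_flows  # how many flows the list may hold
        self.flows = {}             # flow_id -> {"key": criticality, "rate": granted}

    def available_bw(self, flow_id):
        # Assumption: bandwidth left after all more critical flows in the list.
        key = self.flows[flow_id]["key"]
        used = sum(s["rate"] for s in self.flows.values() if s["key"] < key)
        return max(self.capacity - used, 0.0)

    def on_packet(self, flow_id, key, requested_rate, paused_by=None):
        # 1) Paused by another switch: forget the flow.
        if paused_by is not None and paused_by != self.switch_id:
            self.flows.pop(flow_id, None)
            return ("paused_elsewhere", 0.0)
        # 2) Unknown flow: try to store it, evicting the least critical one if full.
        if flow_id not in self.flows:
            if len(self.flows) >= self.max_flows:
                worst = max(self.flows, key=lambda f: self.flows[f]["key"])
                if self.flows[worst]["key"] <= key:
                    return ("pause", 0.0)
                del self.flows[worst]
            self.flows[flow_id] = {"key": key, "rate": 0.0}
        # 3) Accept with rate W if any bandwidth is available, otherwise pause.
        w = min(self.available_bw(flow_id), requested_rate)
        self.flows[flow_id]["rate"] = w
        return ("accept", w) if w > 0 else ("pause", 0.0)


sw = SwitchSketch("S1", capacity=1.0, max_flows=2)
print(sw.on_packet("A", key=(1.0, "A"), requested_rate=1.0))  # ('accept', 1.0)
print(sw.on_packet("B", key=(4.0, "B"), requested_rate=1.0))  # ('pause', 0.0)
print(sw.on_packet("D", key=(0.5, "D"), requested_rate=1.0))  # ('accept', 1.0)
print(sw.on_packet("A", key=(1.0, "A"), requested_rate=1.0))  # ('pause', 0.0)
```

The last two calls show preemption in this model: the more critical flow D takes the link, and A is paused the next time one of its packets passes the switch.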

FLOW CONTROL-3 OPTIMIZATIONS

• Dampening
  • After a switch has accepted a flow, it cannot accept other new flows for a short period of time.
• Early starting
• Suppressed probing

EARLY START

SEAMLESS SCHEDULE

SUPPRESSED PROBING

• A sender may send probe packets too often.
• Flow info I_f: tells the sender of flow f to send a probe every I_f RTTs.
• I_f is maintained by the switches, calculated from the average finish time of all flows and the rank of f.

PDQ PROTOCOL-RATE CONTROL

• Controls the total sending rate of its accepted flows.
• Maintains a variable C to compute the range of the rate.
• Reserves bandwidth for early-started flows.
• C = Full_BW - Queue_size / (K * RTT)
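The formula for C can be read as "link bandwidth minus the rate needed to drain the current queue within K RTTs". The sketch below evaluates it with made-up numbers; the function name, the units (bits and seconds), the example values, and the clamp to zero are assumptions, and K is simply the constant from the slide's formula.

```python
def rate_budget(full_bw_bps, queue_size_bits, k, rtt_s):
    """C = Full_BW - Queue_size / (K * RTT), clamped at zero (the clamp is an assumption)."""
    drain_rate = queue_size_bits / (k * rtt_s)  # rate that empties the queue in K RTTs
    return max(0.0, full_bw_bps - drain_rate)


# Hypothetical numbers: 1 Gbps link, 100 kbit queued, K = 2, RTT = 100 microseconds.
c = rate_budget(full_bw_bps=1e9, queue_size_bits=100_000, k=2, rtt_s=100e-6)
print(f"{c / 1e6:.0f} Mbps")  # about 500 Mbps left for accepted flows
```

When the queue is empty, C equals the full link bandwidth; as the queue builds up, C shrinks so that the accepted flows back off and the queue can drain.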

OUTLINE

• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion

EVALUATION SETTING: TRAFFIC

• Deadline-constrained flows:
  • Time sensitive: ~20 ms
  • Short messages: 2 KB~200 KB
  • Goal: application throughput = percentage of flows that meet their deadlines
• Deadline-unconstrained flows:
  • 100~1000 KB
  • Goal: average flow completion time

EVALUATION SETTING: TOPOLOGY

QUERY AGGREGATION

• All senders initiate flows at the same time to the same receiver.

Optimal: one scheduler controls all transmissions with no delay. To maximize application throughput, it sorts flows by EDF and then uses dynamic programming.

• The Deadline-unconstrained case

SEAMLESS FLOW SWITCHING

Five flows (~1 MB) arrive at the same time.

An elephant flow and 50 short flows starting at 10 ms.

IMPACT OF NETWORK SCALE

OUTLINE

• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion

FAIRNESS?

OTHER CONCERNS

• Does it require rewriting applications?
  • A paused PDQ flow looks like a slow TCP connection.
  • The transport connection stays open.
• Deployment?
  • Hosts: between the IP and transport layers
  • Switches: modify hardware/software, O(k)

Thank you!

Q&A