FINISHING FLOWS QUICKLY WITH PREEMPTIVE SCHEDULING SPEAKER: YU, YE The slides are based on the paper...
FINISHING FLOWS QUICKLY WITH PREEMPTIVE SCHEDULING
SPEAKER: YU, YE
The slides are based on the paper by Chi-Yao Hong et al., SIGCOMM 2012. Some figures originate from their presentation.
OUTLINE
• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion
DATACENTER NETWORKS
LATENCY
• 1 ms of latency = 1% reduction in sales
• 100 ms of latency = 0.2% fewer searches
• 2.2 seconds faster per page = 60 million more downloads per year
LOW LATENCY
• Low-latency datacenters?
• Finish flows earlier
  • The last flow to finish determines the final result
• Meet flow deadlines
  • User-facing applications have a latency goal
PARTITION AGGREGATE MODEL
• [Figure: partition-aggregate tree.] The associated component deadlines are shown in parentheses.
TODAY’S TRANSPORT PROTOCOLS
• TCP / RCP / ICTCP / DCTCP:
  • Fair sharing
  • Divide link bandwidth equally.
  • Fail to reduce flow completion time.
WHAT IS TCP
• TCP slow start
• TCP fast recovery
• Additive increase
• Multiplicative decrease
[Figure: TCP slow start between Host A and Host B over time; one segment, then two segments, then four segments are sent in successive RTTs.]
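The window growth named on this slide can be sketched in a few lines. This is a minimal illustration, not a TCP implementation: the function names, the window unit (segments), and the floor of one segment are assumptions made for the example.

```python
def slow_start(initial_cwnd=1, rtts=3):
    """Return the congestion window (in segments) for each RTT of slow start.

    The window doubles every RTT, which matches the one, two, four
    segments pattern in the figure.
    """
    cwnd = initial_cwnd
    windows = []
    for _ in range(rtts):
        windows.append(cwnd)
        cwnd *= 2
    return windows


def aimd_step(cwnd, loss):
    """One additive-increase / multiplicative-decrease step per RTT.

    On loss the window is halved (never below one segment, an assumption
    of this sketch); otherwise it grows by one segment.
    """
    if loss:
        return max(cwnd / 2, 1)
    return cwnd + 1
```

Calling `slow_start(rtts=3)` yields the three window sizes shown in the figure.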
WHAT IS RCP
• Rate Control Protocol
• RCP is an adaptive algorithm that emulates processor sharing: a router divides the outgoing link bandwidth equally among flows.
• The rate is picked by the routers based on queue size and aggregate traffic.
• A router assigns a single rate to all flows.
• Requires no per-flow state or per-packet calculation.
FAIRNESS DAMAGES COMPLETION TIME
• Flows Fa, Fb, Fc arrive at the same time, with sizes 1, 2, 3 and deadlines 1, 4, 6.
• Fair sharing: mean flow completion time = (3+5+6)/3 = 4.67
• D3 with arrival order B, A, C: mean flow completion time = (2+4+6)/3 = 4
• Shortest Job First / Earliest Deadline First: mean flow completion time = (1+3+6)/3 = 3.33
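The fair-sharing and SJF numbers on this slide can be reproduced with a short simulation. This is a sketch under stated assumptions: a single link of capacity 1, all flows arriving at time 0, and hypothetical function names chosen for the example. The D3 case is not modeled here.

```python
def fair_share_completion(sizes, capacity=1.0):
    """Completion time of each flow when the link is shared equally."""
    remaining = dict(enumerate(sizes))
    finish = [0.0] * len(sizes)
    now = 0.0
    while remaining:
        n = len(remaining)
        smallest = min(remaining.values())
        # Each active flow gets capacity / n, so the smallest flow
        # needs smallest * n / capacity more time to finish.
        now += smallest * n / capacity
        for i in list(remaining):
            remaining[i] -= smallest
            if remaining[i] <= 1e-12:
                finish[i] = now
                del remaining[i]
    return finish


def sjf_completion(sizes, capacity=1.0):
    """Completion time of each flow when flows run one at a time, shortest first."""
    finish = [0.0] * len(sizes)
    now = 0.0
    for i in sorted(range(len(sizes)), key=lambda i: sizes[i]):
        now += sizes[i] / capacity
        finish[i] = now
    return finish


def mean(values):
    return sum(values) / len(values)


sizes = [1, 2, 3]  # Fa, Fb, Fc from the slide
print(fair_share_completion(sizes), round(mean(fair_share_completion(sizes)), 2))
print(sjf_completion(sizes), round(mean(sjf_completion(sizes)), 2))
```

With the slide's sizes, fair sharing finishes the flows at times 3, 5, 6 and SJF at times 1, 3, 6, which gives the means 4.67 and 3.33 quoted above.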
D3 DEPENDS ON FLOW ORDER
• D3 satisfies as many flows as possible in the order of their arrival.
• Request rate = flow size / time until deadline.
• Requests are satisfied in arrival order.
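The order dependence can be seen with a simplified, single-instant version of this first-come-first-served allocation. It is only a sketch: the function name, the unit-capacity link, and the greedy "grant what is left" rule are assumptions for illustration, and the real D3 protocol has additional mechanisms not shown on the slide.

```python
def d3_allocate(flows, capacity=1.0):
    """Grant rates in arrival order.

    flows: list of (name, size, time_until_deadline) tuples, in arrival order.
    Each flow requests size / time_until_deadline and receives as much of
    that request as the remaining capacity allows.
    """
    remaining = capacity
    grants = {}
    for name, size, time_until_deadline in flows:
        request = size / time_until_deadline
        grant = min(request, remaining)
        grants[name] = grant
        remaining -= grant
    return grants


# Flows from the earlier example: sizes 1, 2, 3 and deadlines 1, 4, 6.
arrival_bac = [("B", 2, 4), ("A", 1, 1), ("C", 3, 6)]
arrival_abc = [("A", 1, 1), ("B", 2, 4), ("C", 3, 6)]
print(d3_allocate(arrival_bac))
print(d3_allocate(arrival_abc))
```

When B arrives first it reserves half the link, so A receives only half of the rate it needs to meet its deadline; when A arrives first its request is granted in full.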
THE PDQ SOLUTION
• Preemptive Distributed Quick
• Schedule by flow criticality.
• Criticality: relative priority of flows.
• Scheduling discipline.
Preemptive: relating to the purchase of goods or shares by one person or party before the opportunity is offered to others.
PDQ’S SCHEDULING DISCIPLINES
• EDF: earliest deadline first
  • Optimal for flow deadlines.
• SJF: shortest job first
  • Optimal for mean flow finish time.
• EDF+SJF:
  • Give preference to deadline flows.
• Policy based:
  • Manually allocate flow priority.
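One way to picture the EDF+SJF discipline is as a sort key. The sketch below is an illustration only: the `Flow` class, its field names, and the choice to break deadline ties by expected transmission time are assumptions, not details given on the slide.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Flow:
    name: str
    expected_tx_time: float
    deadline: Optional[float] = None  # None means the flow has no deadline


def criticality_key(flow):
    """Smaller key = more critical.

    Deadline flows sort before flows without a deadline (EDF); ties and
    deadline-free flows are ordered by expected transmission time (SJF).
    """
    deadline = flow.deadline if flow.deadline is not None else math.inf
    return (deadline, flow.expected_tx_time)


flows = [
    Flow("bulk", 0.5),
    Flow("late", 3.0, deadline=6.0),
    Flow("urgent", 2.0, deadline=4.0),
    Flow("short", 1.0, deadline=4.0),
]
order = [f.name for f in sorted(flows, key=criticality_key)]
print(order)
```

A policy-based discipline could replace `criticality_key` with a lookup into a manually assigned priority table.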
CHALLENGES
• Decentralizing the scheduling discipline
  • More mice than elephants.
• Switching between flows seamlessly
  • Hard to fully utilize bandwidth.
• Prioritizing flows using FIFO tail-drop queues
  • FIFO queue length is limited.
OUTLINE
• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion
PDQ PROTOCOL - OVERVIEW
PDQ PROTOCOL-PDQ SENDER-1
• SYN / TERM packets for initialization and termination.
• Resend after timeout.
• Sender maintains info for in-flight packets:
  • Current sending rate (Rs)
  • ID of the switch that paused it (Ps)
  • Deadline (Ds)
  • Expected flow transmission time (Ts)
  • Inter-probing time (Is)
  • Measured RTT (RTTs)
PDQ PROTOCOL-SENDER-2
• Sender sends packets at rate Rs.
• If Rs = 0, send a probe packet periodically (a scheduling header without data).
• When an ACK arrives, update Rs (ACK info: accept/pause).
PDQ PROTOCOL-SENDER-EARLY-TERMINATION
• Sender TERMINATES a flow when it cannot meet its deadline, that is, whenever:
  • The deadline is past.
  • Current time + remaining flow transmission time > deadline.
  • The flow is paused, and current time + RTT > deadline.
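The three early-termination conditions translate directly into a predicate. This is a minimal sketch: the function name and parameter names are hypothetical, a paused flow is represented by a sending rate of zero, and flows without a deadline are assumed never to terminate early.

```python
def should_terminate(now, deadline, remaining_tx_time, rate, rtt):
    """Return True if the sender should give up on the flow.

    now, deadline, remaining_tx_time and rtt share one time unit.
    rate == 0 is taken to mean the flow is currently paused.
    """
    if deadline is None:
        # Assumption: flows without a deadline are never terminated early.
        return False
    if now > deadline:
        return True  # the deadline is past
    if now + remaining_tx_time > deadline:
        return True  # cannot finish in time even at the current estimate
    if rate == 0 and now + rtt > deadline:
        return True  # paused, and less than one RTT is left
    return False
```

For example, a paused flow with 5 ms left until its deadline and a 10 ms RTT would be terminated, because even an immediate accept could not reach the sender in time.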
PDQ PROTOCOL-SWITCH
• Let the most critical flow complete as soon as possible.
• Critical flows preempt others to achieve the highest possible sending rate.
• 1) Maintain state about each flow.
• 2) Compute rate feedback:
  • a) Flow controller decides which flows to send.
  • b) Rate controller determines the rate.
PDQ PROTOCOL-SWITCH-STATE
• Maintains flow state on each link:
  • <Rate, P, Deadline, Expected time, RTT>
  • Pi: the switch that paused flow i.
• Stores only the 2K most critical flows, where K is the number of currently sending flows.
PDQ PROTOCOL-FLOW CONTROL
• Whenever a switch receives an ACK or data packet, it ACCEPTS or PAUSES the flow.
• Pause: inform the other switches that flow f is paused.
  • A switch that receives an ACK-pause for flow i removes i from its own state.
• Accept: calculate the available bandwidth.
  • Other switches that receive an ACK-accept for flow i update their state for i.
ALGORITHM: RECEIVE DATA/ACK OF FLOW F
• 1) If f is paused by another switch, remove it from my list.
• 2) If f is not in my list:
  • Try to add f to my list; if that is not possible, pause f.
• 3) If w = min(availableBW, Rf) > 0:
  • Accept f.
  • Otherwise, pause f.
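The three steps above can be sketched as a small class. This is an illustrative reading of the slide, not the paper's implementation. The names `FlowHeader`, `Switch`, `max_flows` and `on_packet` are hypothetical; criticality is modeled as a number where smaller means more critical; available bandwidth is assumed to be the link capacity minus the rates already granted to more critical flows; and a full list is assumed to evict its least critical entry when a more critical flow shows up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowHeader:
    flow_id: str
    rate: float                      # Rf carried in the scheduling header
    criticality: float               # smaller = more critical (assumption)
    paused_by: Optional[str] = None  # ID of the switch that paused the flow


class Switch:
    def __init__(self, switch_id, capacity, max_flows):
        self.switch_id = switch_id
        self.capacity = capacity
        self.max_flows = max_flows   # hypothetical bound on stored flows
        self.flows = {}              # flow_id -> (criticality, granted rate)

    def available_bw(self, hdr):
        # Assumption: only more critical flows take bandwidth away from hdr.
        used = sum(rate for fid, (crit, rate) in self.flows.items()
                   if fid != hdr.flow_id and crit < hdr.criticality)
        return max(self.capacity - used, 0.0)

    def on_packet(self, hdr):
        # 1) Paused by another switch: forget the flow and pass the header on.
        if hdr.paused_by is not None and hdr.paused_by != self.switch_id:
            self.flows.pop(hdr.flow_id, None)
            return hdr
        # 2) Unknown flow: try to add it, pause it if there is no room.
        if hdr.flow_id not in self.flows:
            if len(self.flows) >= self.max_flows:
                least = max(self.flows, key=lambda fid: self.flows[fid][0])
                if self.flows[least][0] <= hdr.criticality:
                    return self._pause(hdr)
                del self.flows[least]
            self.flows[hdr.flow_id] = (hdr.criticality, 0.0)
        # 3) Accept with rate w if any bandwidth is available, else pause.
        w = min(self.available_bw(hdr), hdr.rate)
        if w > 0:
            self.flows[hdr.flow_id] = (hdr.criticality, w)
            hdr.rate = w
            hdr.paused_by = None
            return hdr
        return self._pause(hdr)

    def _pause(self, hdr):
        if hdr.flow_id in self.flows:
            self.flows[hdr.flow_id] = (hdr.criticality, 0.0)
        hdr.rate = 0.0
        hdr.paused_by = self.switch_id
        return hdr
```

In this sketch a less critical flow keeps its old rate until its next packet passes through the switch, at which point step 3 recomputes its share and it is reduced or paused.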
FLOW-CONTROL: 3 OPTIMIZATIONS
• Dampening
  • After a switch accepts a flow, it cannot accept other new flows for a short period of time.
• Early starting
• Suppressed probing
EARLY START
[Figure: seamless schedule.]
SUPPRESSED PROBING
• A sender may send probe packets too often.
• Flow info If: tells the sender of flow f to send a probe every If × RTT.
• If is maintained by the switches, computed from the average finish time of all flows and the rank of f.
PDQ PROTOCOL-RATE CONTROL
• Controls the total sending rate of its accepted flows.
• Maintains a variable C to compute the range of the rate.
• Reserves bandwidth for early-started flows.
• C = Full_BW - Queue_size / (K × RTT)
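The formula for C can be written as a one-line function. This is only a sketch of the expression as it appears on the slide: the function name and parameter names are hypothetical, `k` is passed in as a plain parameter, the units are assumed to be consistent (for example bits, bits per second and seconds), and clamping the result at zero is an added assumption.

```python
def rate_budget(full_bw, queue_size, k, rtt):
    """C = Full_BW - Queue_size / (K * RTT), clamped at zero.

    A larger standing queue lowers the rate the switch hands out, so the
    queue can drain over roughly K round-trip times.
    """
    return max(0.0, full_bw - queue_size / (k * rtt))
```

With an empty queue the budget equals the full link bandwidth; as the queue grows, the budget shrinks until it reaches zero.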
OUTLINE
• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion
EVALUATION SETTING: TRAFFIC
• Deadline-constrained flows:
  • Time sensitive: ~20 ms
  • Short messages: 2 KB to 200 KB
  • Goal: application throughput = percentage of flows that meet their deadlines
• Deadline-unconstrained flows:
  • 100 KB to 1000 KB
  • Goal: average flow completion time
EVALUATION SETTING: TOPOLOGY
QUERY AGGREGATION:
• All senders start transmitting at the same time to the same receiver.
• Optimal: one scheduler controls all transmissions with no delay. To maximize application throughput, it sorts flows by EDF and then uses dynamic programming.
• The Deadline-unconstrained case
SEAMLESS FLOW SWITCHING
• Five flows (~1 MB each) arrive at the same time.
• An elephant flow and 50 short flows starting at 10 ms.
IMPACT OF NETWORK SCALE
OUTLINE
• Motivation
• PDQ solution to flow scheduling
• Evaluation
• Discussion
FAIRNESS?
OTHER CONCERNS
• Does it require rewriting applications?
  • A paused PDQ flow appears like a slow TCP connection.
  • The transport connection stays open.
• Deployment?
  • Hosts: between the IP and transport layers
  • Switch: modify hardware/software, O(k)
Thank you!
Q&A