
Congestion control in data centers

Sruta Keerti Kasula


Data center structure

SRU = Server Request Unit


Incast

[Figure: Worker 1, Worker 2, Worker 3 and Worker 4 send to one Aggregator; a TCP timeout occurs. RTOmin = 300 ms]

Caused by Partition/Aggregate.
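To get a feel for how much a single timeout costs, the sketch below compares the goodput of one request with and without one RTOmin stall. Only RTOmin = 300 ms comes from the slide; the link speed, the SRU size, the function names and the assumption that the aggregator's link is the only bottleneck are illustrative.

```python
def request_time_ms(num_workers, sru_bytes, link_bps, timeouts=0, rto_min_ms=300.0):
    """Time for the aggregator to receive one SRU from every worker.

    Assumes the aggregator's link is the only bottleneck and that each
    timeout stalls the whole request for one RTOmin.
    """
    transfer_ms = num_workers * sru_bytes * 8 / link_bps * 1000.0
    return transfer_ms + timeouts * rto_min_ms


def goodput_mbps(num_workers, sru_bytes, link_bps, timeouts=0, rto_min_ms=300.0):
    """Useful data delivered per second of request time, in Mbit/s."""
    total_bits = num_workers * sru_bytes * 8
    elapsed_s = request_time_ms(num_workers, sru_bytes, link_bps,
                                timeouts, rto_min_ms) / 1000.0
    return total_bits / elapsed_s / 1e6


# Hypothetical setup: 4 workers, 256 KB per SRU, 1 Gbit/s link.
clean = goodput_mbps(4, 256 * 1024, 1e9)
stalled = goodput_mbps(4, 256 * 1024, 1e9, timeouts=1)
print(f"no timeout: {clean:.0f} Mbit/s, one timeout: {stalled:.0f} Mbit/s")
```

Under these assumptions the transfer itself takes a few milliseconds, so one 300 ms stall dominates the request time and goodput falls by more than an order of magnitude.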


Why develop XCP?

TCP is 25 years old and was designed for networks with a low bandwidth-delay product.

The higher the bandwidth-delay product, the less efficient TCP’s congestion control algorithms, such as AIMD, become.

High delay makes TCP react more slowly to packet drops, and lost bandwidth is reacquired more slowly in such networks.
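A rough calculation shows the effect. The sketch below computes the bandwidth-delay product and the time AIMD needs to regain a full window after one halving. The function names, the 1500-byte segment size and the two example paths are assumptions, and the model (one segment of additive increase per RTT) is deliberately simple.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8


def aimd_recovery_s(bandwidth_bps, rtt_s, mss_bytes=1500):
    """Seconds for AIMD to climb from half the full window back to the full window.

    Assumes additive increase of one segment per RTT after a single
    multiplicative decrease (window halved).
    """
    full_window_segments = bdp_bytes(bandwidth_bps, rtt_s) / mss_bytes
    rtts_needed = full_window_segments / 2
    return rtts_needed * rtt_s


for name, bw, rtt in [("10 Mbit/s, 10 ms", 10e6, 0.010),
                      ("10 Gbit/s, 100 ms", 10e9, 0.100)]:
    print(f"{name}: BDP = {bdp_bytes(bw, rtt):,.0f} B, "
          f"recovery = {aimd_recovery_s(bw, rtt):,.2f} s")
```

On the small path recovery takes a fraction of a second; on the large path it takes over an hour, which is the inefficiency XCP sets out to remove.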


XCP protocol

XCP introduces a 20-byte header between IP and TCP. It carries the sender’s desired bandwidth and the bandwidth the routers allow.

XCP works as intended only if all routers and hosts in the network support it.

The TCP/XCP/IP stack, from top to bottom: Application, TCP, XCP, IP, Link.
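As an illustration of how such a 20-byte header could be laid out, here is a minimal sketch using Python's `struct` module. The slides only give the total size and the four labels RTT, Current Throughput, Delta Throughput and Feedback; the names of the first-row fields (version, format, protocol, length, unused), the field widths and the signedness are assumptions, and `XcpHeader` is a hypothetical class.

```python
import struct
from dataclasses import dataclass

# Assumed layout (20 bytes, network byte order):
#   1 byte  version (high 4 bits) + format (low 4 bits)
#   1 byte  protocol, 1 byte length, 1 byte unused
#   4 bytes each: RTT, current throughput, delta throughput, feedback
_XCP_LAYOUT = struct.Struct("!BBBBIIii")


@dataclass
class XcpHeader:
    version: int = 1
    format: int = 1
    protocol: int = 6            # 6 is the IP protocol number of TCP
    length: int = 20
    unused: int = 0
    rtt_ms: int = 0
    throughput: int = 0
    delta_throughput: int = 0    # assumed signed
    feedback: int = 0            # assumed signed

    def pack(self) -> bytes:
        """Serialize the header into its 20-byte wire form."""
        return _XCP_LAYOUT.pack(
            (self.version << 4) | self.format,
            self.protocol, self.length, self.unused,
            self.rtt_ms, self.throughput,
            self.delta_throughput, self.feedback,
        )

    @classmethod
    def unpack(cls, data: bytes) -> "XcpHeader":
        """Parse 20 bytes back into a header object."""
        vf, proto, length, unused, rtt, x, delta, fb = _XCP_LAYOUT.unpack(data)
        return cls(vf >> 4, vf & 0x0F, proto, length, unused, rtt, x, delta, fb)


header = XcpHeader(rtt_ms=100, delta_throughput=10_000_000)
wire = header.pack()
print(len(wire), XcpHeader.unpack(wire) == header)
```

Because the header sits between IP and TCP, a router that understands XCP can read and rewrite these fields without touching the TCP segment that follows.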


XCP’s solution

XCP lets the routers in the network adjust the sending rate continuously by rewriting the XCP header of the packets exchanged between sender and receiver.

[Figure: path through two routers]

XCP offers a theoretically grounded, yet effective, approach to congestion control.


An XCP network (simplified)

Network RTT: 100 ms
Router’s capacity: 200,000 B/s (200,000 B/s available)
Sender’s capacity: 10,000,000 B/s (10,000,000 B/s available)
Sender’s current throughput: 0 B/s (or 0 B/ms)

XCP header as seen at each node:

Node      First header row  RTT  Current Throughput  Delta Throughput  Feedback
Sender    1 1 6 20 0        100  0                   10,000,000        0
Router    1 1 6 20 0        100  0                   200,000           0
Receiver  1 2 6 20 0        100  0                   0                 200,000


An XCP network (simplified)

Network RTT: 100 ms
Router’s capacity: 200,000 B/s (0 B/s available)
Sender’s capacity: 10,000,000 B/s (9,800,000 B/s available)
Sender’s current throughput: 200,000 B/s (or 200 B/ms)

XCP header as seen at each node:

Node      First header row  RTT  Current Throughput  Delta Throughput  Feedback
Sender    1 1 6 20 0        100  200                 9,800,000         0
Router    1 1 6 20 0        100  200                 0                 0
Receiver  1 2 6 20 0        100  0                   0                 0
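The two snapshots above can be reproduced with a few lines of code. This is a minimal sketch of the simplified single-flow exchange shown on the slides, not of a real XCP router, which computes feedback from aggregate traffic and queue state. The function name `xcp_round_trip` is hypothetical.

```python
def xcp_round_trip(sender_available, router_available):
    """One simplified XCP round trip for a single flow.

    Returns the delta throughput written by the sender, the value left
    after the router, and the feedback echoed by the receiver (all in B/s).
    """
    requested = sender_available                 # sender asks for all it could use
    allowed = min(requested, router_available)   # router trims the request
    feedback = allowed                           # receiver copies it into Feedback
    return requested, allowed, feedback


sender_capacity = 10_000_000   # B/s
router_capacity = 200_000      # B/s
throughput = 0                 # B/s

for step in (1, 2):
    requested, allowed, feedback = xcp_round_trip(
        sender_capacity - throughput, router_capacity - throughput)
    print(f"round {step}: sender asks {requested}, router allows {allowed}, "
          f"feedback {feedback}")
    throughput += feedback     # sender applies the feedback
```

In the first round the router cuts the request from 10,000,000 to 200,000 B/s; in the second round the router has nothing left to give, so the feedback is 0 and the sender stays at 200,000 B/s.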


DCTCP

[Figure: Sender 1 and Sender 2 send to one Receiver; label: ECN Mark (1 bit)]

ECN = Explicit Congestion Notification


Data Center TCP Algorithm

Switch side:

◦ Mark packets when Queue Length > K.

[Figure: queue with threshold K; Mark above K, Don’t Mark below K]

DCTCP satisfies the requirements for data center packet transport:

◦ Handles bursts well
◦ Keeps queuing delays low
◦ Achieves high throughput

Features:

◦ Very simple change to TCP and a single switch parameter K.
◦ Based on ECN mechanisms already available in commodity switches.
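The marking rule and the sender's reaction can be sketched as follows. The switch-side rule is the one stated on the slide. The sender-side update (a running estimate alpha of the marked fraction, and a window cut proportional to alpha) is not on the slides; it follows the published DCTCP algorithm as commonly described. The value of K, the queue samples and the function names are illustrative assumptions.

```python
def mark_packets(queue_lengths, k):
    """Switch side: mark a packet when the queue length on arrival exceeds K."""
    return [q > k for q in queue_lengths]


def dctcp_sender_update(cwnd, alpha, marks, g=1 / 16):
    """Sender side for one window of ACKs (not on the slides).

    alpha is a running estimate of the fraction of marked packets;
    the window is cut in proportion to alpha instead of always halving.
    """
    fraction = sum(marks) / len(marks)
    alpha = (1 - g) * alpha + g * fraction
    if any(marks):
        cwnd = cwnd * (1 - alpha / 2)
    return cwnd, alpha


K = 20  # hypothetical threshold, in packets
marks = mark_packets([5, 12, 21, 30, 18, 25], K)
cwnd, alpha = dctcp_sender_update(cwnd=100.0, alpha=0.0, marks=marks)
print(marks, round(cwnd, 2), round(alpha, 4))
```

Because the cut scales with the fraction of marked packets, mild congestion produces a small reduction, which is how the scheme can keep queues short without giving up throughput.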


ICTCP Observations

The available bandwidth at the receiver is used as a quota for increasing the receive window of all incoming connections.

Congestion control is performed for each flow independently in time slots of one RTT, which is also the control latency of the feedback loop.

A receive-window-based scheme is used: the ratio (expected throughput - measured throughput) / expected throughput determines how the receive window is adjusted.
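A minimal sketch of such a receive-window adjustment is shown below. The ratio is the one described above; the thresholds `low` and `high`, the one-MSS step size, the definition of expected throughput as max(measured, rwnd / RTT) and the function names are assumptions made for illustration, and the sketch leaves out any smoothing over several RTTs.

```python
def throughput_ratio(expected_bps, measured_bps):
    """(expected - measured) / expected, the ratio used to adjust the window."""
    return (expected_bps - measured_bps) / expected_bps


def adjust_rwnd(rwnd, mss, rtt_s, measured_bps, quota_bps, low=0.1, high=0.5):
    """Return the new receive window in bytes for one connection.

    `low` and `high` are illustrative thresholds, not values from the slides.
    """
    expected_bps = max(measured_bps, rwnd * 8 / rtt_s)
    ratio = throughput_ratio(expected_bps, measured_bps)
    extra_bps = mss * 8 / rtt_s          # bandwidth cost of one more MSS per RTT
    if ratio <= low and quota_bps >= extra_bps:
        return rwnd + mss                # window is the limit and quota remains
    if ratio >= high:
        return max(mss, rwnd - mss)      # window far above what the flow uses
    return rwnd


# Hypothetical flow: 4 MSS window, 1 ms RTT, filling its window, spare quota available.
print(adjust_rwnd(rwnd=4 * 1460, mss=1460, rtt_s=0.001,
                  measured_bps=4 * 1460 * 8 / 0.001, quota_bps=100e6))
```

A flow that fills its window gets one more MSS only while the receiver still has quota; a flow that uses far less than its window allows has the window reduced, which limits how much data can arrive at the receiver at once.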


ICTCP

ICTCP is a window-based congestion control algorithm that runs on the TCP receiver side.

Experimental results showed that ICTCP avoids congestion effectively, achieving almost zero timeouts for TCP incast, and that it provides high performance and fairness among competing flows.

