1 EE384Y: Packet Switch Architectures Part II Sizing Router Buffers (Recent work by Guido...

Post on 26-Mar-2015


1

High Performance Switching and Routing
Telecom Center Workshop: Sept 4, 1997.

EE384Y: Packet Switch Architectures
Part II

Sizing Router Buffers

(Recent work by Guido Appenzeller)

Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University

nickm@stanford.edu
http://www.stanford.edu/~nickm

2

How much Buffer does a Router need?

Universally applied rule-of-thumb: a router needs a buffer of size

$B = 2T \times C$

• 2T is the round-trip propagation time (or just 250ms)
• C is the capacity of the outgoing link

Background
• Mandated in backbone and edge routers.
• Appears in RFPs and IETF architectural guidelines.
• Has major consequences for router design.
• Comes from dynamics of TCP congestion control.
• Villamizar and Song: “High Performance TCP in ANSNET”, CCR, 1994.
• Based on 2 to 16 TCP flows at speeds of up to 40 Mb/s.

3

Example

10Gb/s linecard or router
• Requires 300Mbytes of buffering.
• Read and write new packet every 32ns.

Memory technologies
• SRAM: requires 80 devices, 1kW, $2000.
• DRAM: requires 4 devices, but too slow.

Problem gets harder at 40Gb/s
• Hence RLDRAM, FCRAM, etc.
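The figures in this example can be reproduced with a few lines of arithmetic. The sketch below applies B = 2T × C with 2T = 250ms and C = 10Gb/s. The 40-byte minimum packet size is an assumption (it is not stated on the slide); it is chosen because it yields the 32ns per-packet time quoted above. The function names are illustrative only.

```python
def rule_of_thumb_buffer_bits(rtt_seconds: float, capacity_bps: float) -> float:
    """Rule-of-thumb buffer B = 2T x C, in bits (rtt_seconds plays the role of 2T)."""
    return rtt_seconds * capacity_bps


def packet_time_ns(packet_bytes: int, capacity_bps: float) -> float:
    """Time to transmit one packet at line rate, in nanoseconds."""
    return packet_bytes * 8 / capacity_bps * 1e9


capacity = 10e9   # 10 Gb/s link
rtt = 0.250       # 2T = 250 ms

buffer_bits = rule_of_thumb_buffer_bits(rtt, capacity)
buffer_mbytes = buffer_bits / 8 / 1e6
print(f"buffer: {buffer_bits / 1e9:.1f} Gbit = {buffer_mbytes:.1f} Mbytes")

# Assumption: 40-byte minimum-size packets (not given on the slide).
print(f"40-byte packet time: {packet_time_ns(40, capacity):.0f} ns")
```

The result is 2.5 Gbit, or about 312 Mbytes, which the slide rounds to 300Mbytes.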

4

TCP

TCP adapts to congestion
• Sender sends packets, receiver sends ACKs
• Sending rate is controlled by window W
• At any time, only W unacknowledged packets may be outstanding

W is adjusted for each packet (in congestion avoidance, CA, mode):
• If ACK received: W = W + 1/W (W = W + 1 for each W packets)
• If packet is lost: W = W/2 (W halved in case of loss)

The sending rate of TCP is:

$R = \frac{W}{RTT}$
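The window rules above can be sketched in a few lines. This is a minimal illustration of additive increase and multiplicative decrease, not a TCP implementation: the function names, the starting window of 10 packets and the 100 ms RTT are assumptions chosen for the example.

```python
def on_ack(window: float) -> float:
    """Additive increase: each ACK grows the window by 1/W."""
    return window + 1.0 / window


def on_loss(window: float) -> float:
    """Multiplicative decrease: a loss halves the window."""
    return window / 2.0


def sending_rate(window: float, rtt_seconds: float) -> float:
    """Sending rate R = W / RTT, in packets per second."""
    return window / rtt_seconds


w = 10.0                   # assumed starting window, in packets
for _ in range(10):        # one window's worth of ACKs
    w = on_ack(w)
grown = w                  # just under 11: roughly +1 packet per round trip
halved = on_loss(grown)

print(f"after 10 ACKs: W = {grown:.2f}")
print(f"after a loss:  W = {halved:.2f}")
print(f"rate at RTT = 100 ms: {sending_rate(halved, 0.1):.1f} packets/s")
```

Because each ACK adds 1/W of the current (growing) window, ten ACKs add slightly less than one full packet, which is why the slide describes the rule as W = W + 1 for each W packets.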

5

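Single TCP Flow
Router with large enough buffers for full link utilization

[Figure: Source sends to Dest through a router with buffer B; link rates C’ > C and C. Plots over time t: window size (sawtooth between $W_{max}/2$ and $W_{max}$); buffer size and RTT.]

• For every W ACKs received, send W+1 packets
• $R = \frac{W}{RTT}$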

6

Over-buffered Link

7

Under-buffered Link

8

Buffer = Rule-of-thumb

Interval magnified on next slide

9

Microscopic TCP Behavior
When sender pauses, buffer drains

[Figure labels: one RTT; Drop]

10

Origin of rule-of-thumb

Before and after reducing window size, the sending rate of the TCP sender is the same:

$R_{old} = R_{new}$

Inserting the rate equation we get

$\frac{W_{old}}{RTT_{old}} = \frac{W_{new}}{RTT_{new}}$

The RTT is part propagation delay 2T and part queueing delay B/C. We know that after reducing the window, the queueing delay is zero, and that the new window is half the old one:

$\frac{W_{old}}{2T + B/C} = \frac{W_{old}/2}{2T}$

$B = 2T \times C$
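For readers who want the intermediate algebra, the last step can be spelled out as follows. It uses only the assumptions already stated on this slide: the window is halved, and the queue is empty right after the reduction, so the new RTT is 2T.

```latex
\begin{align*}
\frac{W_{old}}{2T + B/C} &= \frac{W_{old}/2}{2T} \\
2T \cdot W_{old} &= \frac{W_{old}}{2}\left(2T + \frac{B}{C}\right) \\
4T &= 2T + \frac{B}{C} \\
\frac{B}{C} &= 2T \\
B &= 2T \times C
\end{align*}
```

In words: the buffer must hold exactly one round-trip propagation time's worth of data at line rate, so that it drains just as the halved window catches up.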

11

Rule-of-thumb

• Rule-of-thumb makes sense for one flow
• Typical backbone link has > 20,000 flows
• Does the rule-of-thumb still hold?

Answer:
• If flows are perfectly synchronized, then Yes.
• If flows are desynchronized, then No.

12

Buffer size is height of sawtooth

[Figure: buffer occupancy over time t, a sawtooth between 0 and B.]

13

If flows are synchronized

• Aggregate window has same dynamics
• Therefore buffer occupancy has same dynamics
• Rule-of-thumb still holds.

[Figure: window size over time t, a sawtooth between $W_{max}/2$ and $W_{max}$.]

14

Two TCP Flows
Two TCP flows can synchronize

15

If flows are not synchronized

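• Aggregate window has less variation
• Therefore buffer occupancy has less variation
• The more flows, the smaller the variation
• Rule-of-thumb does not hold.

[Figure: window size over time t; individual sawtooth between $W_{max}/2$ and $W_{max}$, aggregate window varying between $Min(W)$ and $Max(W)$.]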

16

If flows are not synchronized

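[Figure: window size over time t (sawtooth between $W_{max}/2$ and $W_{max}$), alongside the probability distribution of the buffer occupancy between 0 and B. Labels: Probability Distribution; Buffer Size.]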

17

Quantitative Model

Model the congestion window of a flow as a random variable: $W_i(t)$ is modeled as

$P[W_i = x] = f_{W_i}(x)$, where $E[W_i] = \mu_W$ and $var[W_i] = \sigma_W^2$

For many de-synchronized flows
• We assume congestion windows are independent
• All congestion windows have the same probability distribution

Now the central limit theorem gives us the queue length distribution:

$\sum_{i=1}^{n} W_i(t) \rightarrow n\mu_W + \sqrt{n}\,\sigma_W\,N(0,1)$
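The central-limit argument can be checked numerically. The sketch below sums n independent windows and measures how the spread of the aggregate, relative to its mean, shrinks as n grows. Two things here are assumptions rather than part of the slides: each window is drawn uniformly from [W_max/2, W_max] (a simple stand-in for a sawtooth sampled at a random time), and W_max = 30 packets. The function names are illustrative.

```python
import math
import random
import statistics


def aggregate_window_samples(n_flows, w_max=30.0, n_samples=2000, seed=1):
    """Sample the sum of n independent windows, each uniform on [W_max/2, W_max]."""
    rng = random.Random(seed)
    return [
        sum(rng.uniform(w_max / 2, w_max) for _ in range(n_flows))
        for _ in range(n_samples)
    ]


def relative_spread(samples):
    """Standard deviation of the aggregate window divided by its mean."""
    return statistics.pstdev(samples) / statistics.fmean(samples)


spread_100 = relative_spread(aggregate_window_samples(100))
spread_400 = relative_spread(aggregate_window_samples(400))

print(f"relative spread, n=100: {spread_100:.4f}")
print(f"relative spread, n=400: {spread_400:.4f}")
print(f"ratio: {spread_100 / spread_400:.2f} "
      f"(CLT predicts sqrt(400/100) = {math.sqrt(400 / 100):.0f})")
```

Quadrupling the number of flows should roughly halve the relative variation of the aggregate window, which is the intuition behind dividing the buffer by the square root of n.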

18

Required buffer size

[Figure: required buffer size; model $\frac{2T \times C}{\sqrt{n}}$ compared with simulation.]

19

Required buffer size

[Figure: required buffer size for link utilizations of 98.0%, 99.5% and 99.9%, compared with $\frac{2T \times C}{\sqrt{n}}$.]

20

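Small buffers help short flows
Average flow completion times of 14-packet flows that share a congested bottleneck link with long-lived flows.

[Figure: average flow completion time with buffer $\frac{2T \times C}{\sqrt{n}}$ versus buffer $2T \times C$.]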

21

Experiments with backbone router
GSR 12000, OC3 Line Card

Router buffer is given as a multiple of $\frac{2T \times C}{\sqrt{n}}$, in packets, and as RAM size. Link utilization is given for the model, the simulation (Sim) and the experiment (Exp).

TCP Flows   Router Buffer               Link Utilization
            Multiple   Pkts   RAM       Model    Sim      Exp
100         0.5 x      64     1Mb       96.9%    94.7%    94.9%
100         1 x        129    2Mb       99.9%    99.3%    98.1%
100         2 x        258    4Mb       100%     99.9%    99.8%
100         3 x        387    8Mb       100%     99.8%    99.7%
400         0.5 x      32     512kb     99.7%    99.2%    99.5%
400         1 x        64     1Mb       100%     99.8%    100%
400         2 x        128    2Mb       100%     100%     100%
400         3 x        192    4Mb       100%     100%     99.9%

Thanks: Experiments conducted by Paul Barford and Joel Sommers, U of Wisconsin

22

What about Short Flows?

• So far we assumed long flows in congestion avoidance mode.
• What if traffic is mainly short flows in slow-start?

Answer: Behavior is different, but
• In mixes of flows, long flows drive buffer requirements
• Required buffer for short flows is independent of line speed and RTT (same for 1Mbit/s or 40 Gbit/s)

23

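A single, short-lived TCP flow
Flow length 62 packets, RTT ~140 ms

[Figure: packets sent over time, from SYN to FIN/ACK received: bursts of 2, 4, 8, 16 and 32 packets, one RTT apart. The total duration is the Flow Completion Time (FCT).]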
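The burst sizes in the figure follow from slow-start doubling. As a rough sketch, assuming an initial window of 2 packets that doubles every RTT and no losses (both assumptions taken from the burst sizes shown, not from a stated TCP configuration), a flow can be split into per-RTT bursts like this. The function name is hypothetical.

```python
def slow_start_bursts(flow_length: int, initial_window: int = 2) -> list:
    """Split a flow into per-RTT bursts, doubling the burst size every RTT."""
    bursts = []
    remaining = flow_length
    window = initial_window
    while remaining > 0:
        burst = min(window, remaining)   # the last burst may be shorter
        bursts.append(burst)
        remaining -= burst
        window *= 2
    return bursts


print(slow_start_bursts(62))   # -> [2, 4, 8, 16, 32]
print(slow_start_bursts(14))   # -> [2, 4, 8]
```

The 62-packet flow needs five bursts and the 14-packet flows mentioned earlier need three, so the number of round trips grows only logarithmically with flow length.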

24

Modelling TCP
Flows vs. independent bursts

Inter-burst arrival time is greater than buffer size. Therefore, we assume bursts are independent.

Poisson arrivals of flows
• Arrivals of length $L_{flow}$ (the flow length in packets)
• $S_i = L_{flow}$, $\lambda_{flow} = \frac{\rho C}{L_{flow}}$

Poisson arrivals of bursts
• Four different Poisson arrival processes of lengths 2, 4, ...
• $S_i \in \{2, 4, 8, 16\}$, $\lambda_{burst} = \frac{\rho C}{E[S_i]} = \lambda_{flow}\,\frac{L_{flow}}{E[S_i]}$

25

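The M/G/1 Model
TCP traffic is modelled as an M/G/1 arrival process:
• Poisson arrivals of jobs $S_i \in \{2, 4, 8, 16, ...\}$ with an arrival rate of $\lambda_{burst}$
• $\rho = \lambda_{burst} E[S_i]$ is the load (with time measured in packet transmission times, i.e. C = 1)

Average queue length in jobs is:

$E[N_Q] = \frac{\lambda_{burst}^2 E[S^2]}{2(1 - \rho)} = \frac{\rho^2 E[S^2]}{2(1 - \rho)\,E[S]^2}$

This gives us an average queue length in packets of

$E[Q] = E[N_Q]\,E[S] = \frac{\rho^2 E[S^2]}{2(1 - \rho)\,E[S]}$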
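To get a feel for the magnitudes, the average queue length formula can be evaluated for a concrete burst mix. The sketch below assumes all burst sizes are equally likely, as they would be if every flow were 62 packets long and contributed one burst each of 2, 4, 8, 16 and 32 packets, and it uses a load of 0.8. Both choices are assumptions made for illustration, and the function name is hypothetical.

```python
def mg1_average_queue_packets(burst_sizes, load):
    """E[Q] = rho^2 E[S^2] / (2 (1 - rho) E[S]) for equally likely burst sizes."""
    if not 0 <= load < 1:
        raise ValueError("load must satisfy 0 <= load < 1")
    mean_s = sum(burst_sizes) / len(burst_sizes)
    mean_s2 = sum(s * s for s in burst_sizes) / len(burst_sizes)
    return load ** 2 * mean_s2 / (2 * (1 - load) * mean_s)


bursts = [2, 4, 8, 16, 32]        # slow-start bursts of a 62-packet flow
queue = mg1_average_queue_packets(bursts, load=0.8)
print(f"E[Q] = {queue:.1f} packets")
```

Note that neither the link capacity nor the RTT appears in the expression: only the load and the burst size distribution matter, consistent with the earlier claim that the buffer needed for short flows is independent of line speed and RTT.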

Let's see if this works in practice...

26

Average Queue Length

• Capacity: C = 40Mbit/s
• Load: ρ = 0.8
• For length 50 packets: $L_{flow}$ = 400Mbit, average 100 flows/second, completion time 400ms

27

Queue Distribution

To determine the required buffer, we need the queue distribution, or at least the tail end of the queue distribution.

[Figure: queue distribution P(Q = x) versus Q. The tail beyond the buffer size B corresponds to packet loss, with probability $P_{Drop}$.]

● For M/G/1 queues there is no general solution for the queue distribution.

● We did two things (details are in the paper):

– Use M/G/1 processor sharing model (bad)
– Use Frank Kelly's effective bandwidth (good)

28

In Summary

Buffer size is dictated by long TCP flows.

10Gb/s linecard with 200,000 x 56kb/s flows

Rule-of-thumb: Buffer = 2.5Gbits
• Requires external, slow DRAM

Becomes: Buffer = 6Mbits
• Can use on-chip, fast SRAM
• Completion time halved for short-flows

40Gb/s linecard with 40,000 x 1Mb/s flows

Rule-of-thumb: Buffer = 10Gbits
Becomes: Buffer = 50Mbits
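The summary figures can be reproduced from the two formulas in this talk, B = 2T × C for the rule-of-thumb and B = 2T × C / √n for n desynchronized long flows. The sketch below assumes 2T = 250ms, the default value quoted with the rule-of-thumb; the function name is illustrative.

```python
import math


def small_buffer_bits(rtt_seconds, capacity_bps, n_flows):
    """Buffer for n desynchronized long flows: B = 2T x C / sqrt(n), in bits."""
    return rtt_seconds * capacity_bps / math.sqrt(n_flows)


rtt = 0.250   # assumed 2T = 250 ms

for capacity, flows in [(10e9, 200_000), (40e9, 40_000)]:
    rule = rtt * capacity
    small = small_buffer_bits(rtt, capacity, flows)
    print(f"{capacity / 1e9:.0f} Gb/s, {flows} flows: "
          f"rule-of-thumb {rule / 1e9:.1f} Gbit, "
          f"small buffer {small / 1e6:.1f} Mbit")
```

The 10Gb/s case gives about 5.6 Mbit, which the slide rounds to 6Mbits, and the 40Gb/s case gives 50 Mbit.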