Open Issues in Buffer Sizing
Transcript of Open Issues in Buffer Sizing
Amogh Dhamdhere, Constantine Dovrolis
College of Computing, Georgia Tech
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Buffer sizing for bounded loss rate (Infocom '05)
Motivation
- Router buffers are crucial elements of packet networks
  - Absorb rate variations of incoming traffic
  - Prevent packet losses during traffic bursts
- Increasing the router buffer size:
  - Can increase link utilization (especially with TCP traffic)
  - Can decrease the packet loss rate
  - Can also increase queuing delays
Common operational practices
- A major router vendor recommends 500 ms of buffering
  - Implication: buffer size increases proportionally to link capacity
  - Why 500 ms?
- Bandwidth-Delay Product (BDP) rule: buffer size B = link capacity C × typical RTT T (B = C×T)
  - What does "typical RTT" mean? Measurement studies have shown that RTTs vary from 1 ms to 10 s!
- How do different types of flows (TCP elephants vs. mice) affect the buffer requirement?
- Poor performance is often due to buffer size:
  - Under-buffered switches: high loss rate and poor utilization
  - Over-buffered DSL modems: excessive queuing delay for interactive apps
Previous work
- Approaches based on queuing theory (e.g., M/M/1/B)
  - Assume a certain input traffic model, service model, and buffer size
  - Loss probability for the M/M/1/B system: p = (1 − ρ)ρ^B / (1 − ρ^(B+1))
  - But TCP is not open-loop: TCP flows react to congestion
  - And there is no universally accepted Internet traffic model
- Morris' Flow Proportional Queuing (Infocom '00)
  - Proposed a buffer size proportional to the number of active TCP flows (B = 6N)
  - Did not specify which flows to count in N
  - Objective: limit the loss rate, since a high loss rate causes unfairness and poor application performance
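As a quick illustration of the queuing-theoretic approach (my own sketch, not from the talk; function name is mine), the M/M/1/B loss probability p = (1 − ρ)ρ^B / (1 − ρ^(B+1)) can be computed directly:

```python
def mm1b_loss(rho: float, B: int) -> float:
    """Loss (blocking) probability of an M/M/1/B queue with offered load rho:
    p = (1 - rho) * rho**B / (1 - rho**(B + 1))."""
    if rho == 1.0:
        return 1.0 / (B + 1)  # limiting case as rho -> 1
    return (1.0 - rho) * rho**B / (1.0 - rho**(B + 1))

# Open-loop loss falls geometrically with buffer size for a fixed load:
for B in (10, 50, 100):
    print(B, mm1b_loss(0.9, B))
```

Note how quickly the open-loop loss rate falls with B; the talk's point is precisely that closed-loop TCP traffic does not behave this way.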
TCP window dynamics for long flows
- TCP-aware buffer sizing must take TCP dynamics into account
- Saw-tooth behavior:
  - The window increases until a packet loss occurs
  - A single loss results in a cwnd reduction by a factor of two
- Square-root TCP model: TCP throughput can be approximated by R = 0.87 / (T √p), where T is the RTT and p is the loss rate
  - Valid when the loss rate p is small (less than 2-5%)
  - The average window size (0.87/√p packets) is independent of RTT
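The square-root model is easy to sanity-check numerically; a minimal sketch (helper names are mine), returning throughput in packets per second:

```python
from math import sqrt

def tcp_throughput(rtt_s: float, loss_rate: float) -> float:
    """Square-root model: R ~= 0.87 / (RTT * sqrt(p)), in packets/s.
    Only meaningful for small p (below roughly 2-5%)."""
    return 0.87 / (rtt_s * sqrt(loss_rate))

def avg_window(loss_rate: float) -> float:
    """Average window in packets, 0.87 / sqrt(p): independent of RTT."""
    return 0.87 / sqrt(loss_rate)

print(tcp_throughput(0.1, 0.01))  # 100 ms RTT, 1% loss
```

Doubling the RTT halves the throughput, while the average window depends only on the loss rate, matching the slide's observation.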
Origin of the BDP rule
- Consider a single flow with RTT T whose window follows TCP's saw-tooth behavior
- Maximum window size = CT + B; at this point a packet loss occurs
- Window size after the packet loss = (CT + B)/2
- Key step: even when the window size is at its minimum, the link should be fully utilized:
  (CT + B)/2 ≥ CT, which means B ≥ CT
- This is known as the bandwidth-delay product rule
- The same result holds for N homogeneous TCP connections
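The derivation above can be checked with a tiny fluid-model sketch (my own construction, not from the talk): the window grows by one packet per round from (CT+B)/2 to CT+B, and the link is fully busy in a round whenever the window is at least CT packets.

```python
def sawtooth_utilization(C: float, T: float, B: float) -> float:
    """Average utilization over one saw-tooth cycle of a single TCP flow.

    Fluid approximation: the window W (packets) grows by 1 per round from
    (C*T + B)/2 to C*T + B; per-round delivery is min(W, C*T) packets.
    """
    bdp = C * T
    w = (bdp + B) / 2.0
    sent = rounds = 0.0
    while w <= bdp + B:
        sent += min(w, bdp)
        rounds += 1
        w += 1.0
    return sent / (bdp * rounds)

print(sawtooth_utilization(100, 1.0, 0))    # under-buffered: ~0.75
print(sawtooth_utilization(100, 1.0, 100))  # B = BDP: full utilization
```

With B = 0 the link idles during the bottom of each saw-tooth (about 75% utilization in this model); with B = CT utilization reaches 1.0, as the BDP rule predicts.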
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Buffer sizing for bounded loss rate (BSCL)
Stanford Model - Appenzeller et al.
- Objective: find the minimum buffer size that achieves full utilization of the target link
- Assumption: most traffic is from TCP flows
  - If N is large, flows are independent and unsynchronized
  - The aggregate window size distribution tends to normal
  - The queue size distribution also tends to normal
  - Flows are in congestion avoidance (linear increase of the window between successive packet drops)
- The buffer for full utilization is given by B = CT / √N
  - N: the number of "long" flows at the link
  - CT: the bandwidth-delay product
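A minimal sketch of the Stanford rule B = CT/√N (helper name is mine):

```python
from math import ceil, sqrt

def stanford_buffer(bdp_pkts: float, n_long_flows: int) -> int:
    """Stanford rule: B = CT / sqrt(N), rounded up to whole packets."""
    return ceil(bdp_pkts / sqrt(n_long_flows))

print(stanford_buffer(250, 200))  # the talk's simulation setting
```

With CT = 250 packets and N = 200 this gives the 18-packet buffer quoted later in the simulation section; with CT = 200 and N = 400 it gives 10 packets, the example used in the loss-rate discussion.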
Stanford Model (cont')
- If the link has only short flows, the buffer size depends only on the offered load and the average flow size
  - Flow size determines the size of bursts during slow start
- For a mix of short and long flows, the buffer size is determined by the number of long flows
  - Small flows do not have a significant impact on buffer sizing
- The resulting buffer can achieve full utilization of the target link
- The loss rate at the target link is not taken into account
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Buffer sizing for bounded loss rate (BSCL)
What are the objectives?
- Network-layer vs. application-layer objectives
  - Network's perspective: utilization, loss rate, queuing delay
  - User's perspective: per-flow throughput, fairness, etc.
- Stanford model: focuses on utilization and queuing delay
  - Can lead to a high loss rate (> 10% in some cases)
- BSCL: both utilization and loss rate
  - Can lead to large queuing delay
- A buffer sizing scheme that bounds queuing delay
  - Can lead to a high loss rate and low utilization
- A single buffer size cannot meet all objectives; which problem should we try to solve?
Saturable/congestible links
- A link is saturable when the offered load is sufficient to fully utilize it, given a large enough buffer
  - A link may not be saturable at all times
  - Some links may never be saturable: advertised-window limitation, other bottlenecks, size-limited flows
- Small buffers are sufficient for non-saturable links; they are only needed to absorb short-term traffic bursts
- The Stanford model is applicable when N is large
  - Backbone links are usually not saturable due to over-provisioning
  - Edge links are more likely to be saturable, but N may not be large for such links
Which flows to count?
- N: the number of "long" flows at the link
  - "Long" flows show TCP's saw-tooth behavior
  - "Short" flows do not exit slow start
- Does size matter? Size alone does not indicate slow-start or congestion-avoidance behavior
  - If there is no congestion, even large flows do not exit slow start
  - If the link is highly congested, small flows can enter congestion avoidance
- Should the following flows be included in N?
  - Flows limited by congestion at other links
  - Flows limited by the sender/receiver socket buffer size
- N varies with time. Which value should we use? Min? Max? Time average?
Which traffic model to use?
- The traffic model has major implications for buffer sizing
- Early work considered traffic as an exogenous process
  - Not realistic: the offered load due to TCP flows depends on network conditions
- The Stanford model considers mostly persistent connections
  - No ambiguity about the number of "long" flows (N); N is time-invariant
- In practice, TCP connections have finite size and duration, and N varies with time
- Open-loop vs. closed-loop flow arrivals
Traffic model (cont')
- Open-loop TCP traffic:
  - Flows arrive randomly at average rate λ, with average size S
  - Offered load λS, link capacity C
  - The offered load is independent of system state (delay, loss)
  - The system is unstable if λS > C
- Closed-loop TCP traffic:
  - Each user starts a new transfer only after the completion of the previous transfer
  - Random think time between consecutive transfers
  - The offered load depends on system state
  - The system can never be unstable
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Buffer sizing for bounded loss rate (BSCL)
Why worry about loss rate?
- The Stanford model gives a very small buffer if N is large
  - E.g., CT = 200 packets, N = 400 flows: B = 10 packets
- What is the loss rate with such a small buffer size?
- What about per-flow throughput and transfer latency? Compare with BDP-based buffer sizing
- Distinguish between large and small flows
  - Small flows of k segments that do not see losses are limited only by RTT: R(k) ≈ k / (T log₂(k))
  - Large flows depend on both losses and RTT: R = 0.87 / (T √p)
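Under the two throughput expressions above, transfer latency k/R can be sketched as follows (a rough illustration with my own helper names; the slow-start term ignores initial-window details):

```python
from math import log2, sqrt

def latency_small(k: int, rtt: float) -> float:
    """RTT-limited flow of k segments: roughly log2(k) slow-start rounds."""
    return rtt * log2(k)

def latency_large(k: int, rtt: float, p: float) -> float:
    """Loss-limited flow: k segments delivered at rate 0.87/(rtt*sqrt(p))."""
    return k * rtt * sqrt(p) / 0.87
```

The contrast matters for the tradeoff discussed next: small flows care mostly about RTT (hence queuing delay), while large flows care about both the loss rate and the RTT.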
Simulation setup
- ns-2 simulations to study the effect of buffer size on loss rate for different traffic models
- Heterogeneous RTTs (20 ms to 530 ms)
- TCP NewReno with the SACK option
- BDP = 250 packets (1500 B each)
- Model-1: persistent flows + mice
  - 200 "infinite" connections, active for the whole simulation duration
  - Mice flows: 5% of capacity, sizes between 3 and 25 packets, exponential inter-arrivals
Simulation setup (cont')
- Flow size distribution for finite-size flows:
  - Sum of 3 exponential distributions: small files (avg. 15 packets), medium files (avg. 50 packets), and large files (avg. 200 packets)
  - 70% of total bytes come from the largest 30% of flows
- Model-2: closed-loop traffic
  - 675 source agents
  - Think time exponentially distributed with average 5 s
  - Time average of 200 flows in congestion avoidance
- Model-3: open-loop traffic
  - Exponentially distributed flow inter-arrival times
  - Offered load is 95% of link capacity
  - Time average of 200 flows in congestion avoidance
Simulation results - loss rate
- CT = 250 packets, N = 200 for all traffic types
- The Stanford model gives a buffer of 18 packets
- High loss rate with the Stanford buffer:
  - Greater than 10% for open-loop traffic
  - 7-8% for persistent and closed-loop traffic
- Increasing the buffer to the BDP, or a small multiple of the BDP, can significantly decrease the loss rate
Per-flow throughput
- Transfer latency = flow size / flow throughput
- Flow throughput depends on both loss rate and queuing delay
  - Loss rate decreases with buffer size (good)
  - Queuing delay increases with buffer size (bad)
- Major tradeoff: should we have a low loss rate or a low queuing delay?
- The answer depends on various factors:
  - Which flows are considered: long or short?
  - Which traffic model is considered?
Persistent connections and mice
- Application-layer throughput for B = 18 (Stanford buffer) and a larger buffer B = 500
- Two flow categories: large (> 100 KB) and small (< 100 KB)
- The majority of large flows get better throughput with the large buffer
  - Large difference in loss rates
  - Smaller variability of per-flow throughput with the larger buffer
- The majority of short flows get better throughput with the small buffer
  - Lower RTT and a smaller difference in loss rates
Closed-loop traffic
- Per-flow throughput for large flows is slightly better with the larger buffer
  - No significant difference in per-flow loss rate
- The majority of small flows see better throughput with the smaller buffer, similar to the persistent case
- Reason: the loss rate decreases slowly with buffer size
Open-loop traffic
- Both large and small flows get much better throughput with the large buffer
- Significantly smaller per-flow loss rate with the larger buffer
- Reason: the loss rate decreases very quickly with buffer size
Outline
- Motivation and previous work
- The Stanford model for buffer sizing
- Important issues in buffer sizing
- Simulation results for the Stanford model
- Buffer sizing for bounded loss rate (BSCL)
Our buffer sizing objectives
- Full utilization: the average utilization of the target link should be at least ρ̂ × 100% when the offered load is sufficiently high
- Bounded loss rate: the loss rate p should not exceed p̂, typically 1-2% for a saturated link
- Minimum queuing delay and buffer requirement, given the previous two objectives:
  - Large queuing delay causes higher transfer latencies and jitter
  - A large buffer size increases router cost and power consumption
- So, we aim to determine the minimum buffer size that meets the given utilization and loss rate constraints
Why limit the loss rate?
- End-user perceived performance is very poor when the loss rate exceeds 5-10%
  - Particularly true for short and interactive flows
- A high loss rate is also detrimental for large TCP flows
  - High variability in per-flow throughput
  - Some "unlucky" flows suffer repeated losses and timeouts
- We aim to bound the packet loss rate to p̂ = 1-2%
Traffic classes
- Locally Bottlenecked Persistent (LBP) TCP flows
  - Large TCP flows limited by losses at the target link
  - Their loss rate p is equal to the loss rate at the target link
- Remotely Bottlenecked Persistent (RBP) TCP flows
  - Large TCP flows limited by losses at other links
  - Their loss rate is greater than the loss rate at the target link
- Window-Limited Persistent TCP flows
  - Large TCP flows limited by the advertised window, instead of the congestion window
- Short TCP flows and non-TCP traffic
Scope of our model
- Key assumption: LBP flows account for most of the traffic at the target link (80-90%)
  - Reason: we ignore the buffer requirement of non-LBP traffic
- Scope of our model: congested links that mostly carry large TCP flows bottlenecked at the target link
Minimum buffer requirement for full utilization: homogeneous flows
- Consider a single LBP flow with RTT T whose window follows TCP's saw-tooth behavior
- Maximum window size = CT + B; at this point a packet loss occurs
- Window size after the packet loss = (CT + B)/2
- Key step: even when the window size is at its minimum, the link should be fully utilized:
  (CT + B)/2 ≥ CT, which means B ≥ CT
- This is known as the bandwidth-delay product rule
- The same result holds for N homogeneous TCP connections
Minimum buffer requirement for full utilization: heterogeneous flows
- N_b heterogeneous LBP flows with RTTs {T_i}
- Initially, assume global loss synchronization: all flows decrease their windows simultaneously in response to a single congestion event
- We derive that:
  B ≥ C × [ (1/N_b) Σ_{i=1..N_b} (1/T_i) ]⁻¹
- As a bandwidth-delay product: B ≥ C T_e, where the "effective RTT" T_e is the harmonic mean of the RTTs:
  1/T_e = (1/N_b) Σ_{i=1..N_b} (1/T_i)
- Practical implication: a few connections with very large RTTs cannot significantly increase the buffer requirement, as long as most flows have small RTTs
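The harmonic-mean effective RTT is easy to compute; this sketch (names mine) also shows the practical implication numerically:

```python
def effective_rtt(rtts):
    """Harmonic mean of the RTTs: T_e = N_b / sum(1/T_i)."""
    return len(rtts) / sum(1.0 / t for t in rtts)

# Nine 20 ms flows plus one 1 s flow: T_e stays close to 20 ms,
# so the single large-RTT flow barely moves the C*T_e buffer requirement.
print(effective_rtt([0.020] * 9 + [1.0]))
```

Because the harmonic mean is dominated by the smallest terms, one very-large-RTT flow shifts T_e only slightly, which is exactly the slide's point.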
Minimum buffer requirement for full utilization (cont')
- More realistic model: partial loss synchronization
- Loss burst length L(N_b): the number of packets lost by N_b flows during a single congestion event
- Assumption: the loss burst length increases almost linearly with N_b, i.e., L(N_b) = α N_b
  - α: synchronization factor (around 0.5-0.6 in our simulations)
- Minimum buffer size requirement:
  B ≥ [ q(N_b) C T − 2 M N_b (1 − q(N_b)) ] / [ 2 − q(N_b) ]
  - q(N_b): fraction of flows that see losses in a congestion event
  - M: average segment size
- Partial loss synchronization reduces the buffer requirement
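Assuming the slide's partial-synchronization buffer reads B = [q·C·T − 2M·N_b·(1 − q)] / (2 − q) (my reconstruction of the garbled formula; treat the exact form as approximate), a sketch:

```python
def buffer_partial_sync(C: float, T: float, N_b: int,
                        q: float, M: float = 1.0) -> float:
    """Minimum buffer (packets) under partial loss synchronization,
    assuming B = [q*C*T - 2*M*N_b*(1 - q)] / (2 - q).  With q = 1
    (global synchronization) this reduces to the BDP rule B = C*T."""
    return max(0.0, (q * C * T - 2.0 * M * N_b * (1.0 - q)) / (2.0 - q))
```

The sanity check is the q = 1 limit, which recovers B = CT; for q < 1 the requirement shrinks, matching the slide's conclusion that partial synchronization reduces the buffer requirement.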
Validation (ns-2 simulations)
- Heterogeneous flows (RTTs vary between 20 ms and 530 ms)
- The partial synchronization model is accurate
- The global synchronization (deterministic) model overestimates the buffer requirement by a factor of 3-5
Relation between loss rate and N
- N_b homogeneous LBP flows at the target link; link capacity C, flow RTT T
- If the flows saturate the target link, each flow's throughput is C/N_b = 0.87 / (T √p)
- Hence the loss rate is proportional to the square of N_b:
  p = (0.87 N_b / (C T))²
- To keep the loss rate below p̂, we must limit the number of flows:
  N_b ≤ √p̂ × C T / 0.87
- But this would require admission control (not deployed)
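These two relations can be sketched directly (helper names mine); note that the square-root model behind them only holds for small p:

```python
from math import sqrt

def loss_rate(C: float, T: float, n_b: int) -> float:
    """p = (0.87 * N_b / (C*T))**2 when N_b flows share a saturated link."""
    return (0.87 * n_b / (C * T)) ** 2

def max_flows(C: float, T: float, p_hat: float) -> int:
    """Largest N_b keeping p <= p_hat: N_b <= sqrt(p_hat) * C*T / 0.87."""
    return int(sqrt(p_hat) * C * T / 0.87)

# e.g. CT = 2500 packets with a 1% loss target:
n = max_flows(25000, 0.1, 0.01)
print(n, loss_rate(25000, 0.1, n))
```

The two functions round-trip: the flow count returned by `max_flows` yields a loss rate just under the target, and one more flow pushes it over.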
Flow Proportional Queuing (FPQ)
- First proposed by Morris (Infocom '00)
- Bounds the loss rate by increasing the RTT proportionally to the number of flows
- From p = (0.87 N_b / (C T))², solving for T gives:
  T = K_p N_b / C, where K_p = 0.87 / √p̂ and T = T_p + T_q (T_p: the RTT's propagation delay, T_q: the queuing delay)
- Set T_q = B/C and solve for B:
  B = K_p N_b − C T_p
- The window of each flow should be K_p packets, consisting of:
  - Packets in the target link buffer (B term)
  - Packets "on the wire" (C T_p term)
- Practically, K_p = 6 packets for a 2% loss rate, and K_p = 9 packets for a 1% loss rate
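A sketch of the FPQ sizing (helper names are mine); the K_p values below reproduce the slide's 6 and 9 packets:

```python
from math import sqrt

def fpq_window(p_hat: float) -> float:
    """Per-flow window target K_p = 0.87 / sqrt(p_hat), in packets."""
    return 0.87 / sqrt(p_hat)

def fpq_buffer(n_b: int, C: float, T_p: float, p_hat: float) -> float:
    """Morris-style FPQ buffer: B = K_p * N_b - C * T_p (packets)."""
    return max(0.0, fpq_window(p_hat) * n_b - C * T_p)

print(round(fpq_window(0.02)), round(fpq_window(0.01)))  # -> 6 9
```

Unlike the full-utilization requirement, this buffer grows linearly with the number of flows, which sets up the crossover discussed on the next slide.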
Buffer size requirement for both full utilization and bounded loss rate
- We previously derived separate results for full utilization and for a bounded loss rate
- To meet both goals, provide enough buffering to satisfy the more stringent of the two requirements
- The buffer requirement:
  - Decreases with N_b (full utilization objective)
  - Increases with N_b (loss rate objective)
- With crossover point N̂_b, the combined requirement is:
  B̂(N_b) = [ q(N_b) C T_e − 2 M N_b (1 − q(N_b)) ] / [ 2 − q(N_b) ]   if N_b ≤ N̂_b
  B̂(N_b) = K_p N_b − C T_p   if N_b > N̂_b
- This result is referred to as the BSCL formula
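Taking "the more stringent of the two requirements" literally, the BSCL sizing can be sketched as the maximum of the two earlier expressions (my own composition, with my reconstruction of both formulas; the slide's exact T_e/T_p usage may differ):

```python
from math import sqrt

def bscl_buffer(n_b, C, T_e, T_p, q, M, p_hat):
    """BSCL sketch: max(full-utilization requirement, bounded-loss
    requirement).  Sizes in packets, times in seconds."""
    b_util = max(0.0, (q * C * T_e - 2.0 * M * n_b * (1.0 - q)) / (2.0 - q))
    b_loss = max(0.0, (0.87 / sqrt(p_hat)) * n_b - C * T_p)
    return max(b_util, b_loss)

# Few flows: the utilization term dominates; many flows: the loss term does.
print(bscl_buffer(10, 1000, 0.25, 0.25, 1.0, 1.0, 0.01))
print(bscl_buffer(400, 1000, 0.25, 0.25, 1.0, 1.0, 0.01))
```

Taking the maximum automatically realizes the piecewise formula: below the crossover N̂_b the decreasing utilization term wins, above it the linearly growing loss term wins.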
Model validation
- Heterogeneous flows
- Utilization constraint ρ̂ = 98% and loss rate constraint p̂ = 1%
Parameter estimation
1. Number of LBP flows:
   - For LBP flows, all rate reductions occur due to packet losses at the target link
   - For RBP flows, some rate reductions are due to losses elsewhere
2. Effective RTT:
   - Jiang et al. (2002) give simple algorithms to measure TCP RTTs from packet traces
3. Loss burst lengths or loss synchronization factor:
   - Measure loss burst lengths from a packet loss trace, or use the approximation L(N_b) = α N_b
Results: Bound loss rate to 1%
Per-flow throughput with BSCL
- BSCL can achieve the network-layer objectives of full utilization and bounded loss rate
  - But it can lead to large queuing delay due to the larger buffer
- How does this affect application throughput?
  - BSCL loss rate target set to 1%; the BSCL buffer size is 1550 packets
  - Compare with a buffer of 500 packets
  - BSCL is able to bound the loss rate to the 1% target for all traffic models
Persistent connections and mice
- The BSCL buffer gives better throughput for large flows
  - It also reduces the variability of per-flow throughputs
  - The loss rate decrease favors large flows in spite of the larger queuing delay
- All smaller flows get worse throughput with the BSCL buffer
  - The increase in queuing delay harms small flows
Closed-loop traffic
- Similar to the persistent traffic case
- The BSCL buffer improves throughput for large flows
  - It also reduces the variability of per-flow throughputs
  - The loss rate decrease favors large flows in spite of the larger queuing delay
- All smaller flows get worse throughput with the BSCL buffer
  - The increase in queuing delay harms small flows
Open-loop traffic
- No significant difference between B = 500 and B = 1550
- Reason: the loss rate for open-loop traffic decreases quickly with buffer size
  - The loss rate for B = 500 is already less than 1%
  - A further increase in buffer size reduces the loss rate to ≈ 0
  - The large buffer does not increase queuing delays significantly
Summary
- We derived a buffer sizing formula (BSCL) for congested links that mostly carry TCP traffic
- Objectives:
  - Full utilization
  - Bounded loss rate
  - Minimum queuing delay, given the previous two objectives
- The BSCL formula is applicable to links where more than 80-90% of the traffic comes from large, locally bottlenecked TCP flows
- BSCL accounts for the effects of heterogeneous RTTs and partial loss synchronization
- We validated BSCL through simulations