Congestion Control and Active Queue Management
winter 2008 congestion control and AQM
1
Congestion Control and Active Queue Management
• Review of TCP Congestion Control
  – A simple TCP throughput formula
• RED and Active Queue Management
  – How RED works
  – Fluid model of TCP and RED interaction (optional material)
  – Other AQM mechanisms
• XCP: congestion control for large bandwidth-delay product
  – Router-based mechanism
  – Decoupling congestion control from fairness
• DCCP: Datagram Congestion Control Protocol
  – Congestion control for non-TCP flows
  – TCP-Friendly Rate Control (TFRC)
Readings: do the required readings, and the optional ones if interested
2
TCP Congestion Control Behavior
• congestion control:
  – decrease sending rate when loss detected, increase when no loss
• routers:
  – discard or mark packets when congestion occurs
• interaction between end systems (TCP) and routers?
  – want to understand (quantify) this interaction
[Figure: TCP runs at end-hosts; congested router drops packets]
3
Generic TCP CC Behavior: Additive Increase
• window algorithm (window W)
  – up to W packets in network
  – return of ACK allows sender to send another packet
  – cumulative ACKs
• increase window by one per RTT
  W ← W + 1/W per ACK ⇒ W ← W + 1 per RTT
• seeks available network bandwidth
• ignoring the “slow start” phase, during which the window is increased by one per ACK
  W ← W + 1 per ACK ⇒ W ← 2W per RTT
4
[Figure: sender/receiver packet timeline for a window of W packets]
5
Generic TCP CC Behavior: Multiplicative Decrease
• window algorithm (window W)
• increase window by one per RTT: W ← W + 1/W per ACK
• loss is taken as an indication of congestion
• decrease window by half on detection of loss (triple duplicate ACKs): W ← W/2
6
[Figure: sender/receiver packet timeline with a loss detected by triple duplicate ACKs (TD)]
7
Generic TCP CC Behavior: After Time-Out (TO)
• window algorithm (window W)
• increase window by one per RTT: W ← W + 1/W per ACK
• halve window on detection of loss: W ← W/2
• timeouts due to lack of ACKs: window reduced to one, W ← 1
8
[Figure: sender/receiver packet timeline with a loss detected by time-out (TO)]
9
Generic TCP Behavior: Summary
• window algorithm (window W)
• increase window by one per RTT (or by one over the window per ACK): W ← W + 1/W
• halve window on detection of loss: W ← W/2
• timeouts due to lack of ACKs: W ← 1
• successive timeout intervals grow exponentially longer, up to six times
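The summary rules can be sketched in a few lines of Python. This is a minimal illustration and not a TCP implementation: the function names and the event labels ("ack", "triple_dup", "timeout") are assumptions made for the sketch, and slow start and the timeout back-off are left out.

```python
def aimd_step(window, event):
    """Return the congestion window (in packets) after one event.

    event is one of the hypothetical labels "ack", "triple_dup", "timeout".
    """
    if event == "ack":
        # Additive increase: 1/W per ACK, roughly +1 per RTT.
        return window + 1.0 / window
    if event == "triple_dup":
        # Multiplicative decrease; floor of one packet is an assumption.
        return max(window / 2.0, 1.0)
    if event == "timeout":
        # Window collapses to one packet after a time-out.
        return 1.0
    raise ValueError("unknown event: %r" % (event,))


def run_one_rtt(window):
    """Apply one ACK for each whole packet in the current window."""
    for _ in range(int(window)):
        window = aimd_step(window, "ack")
    return window
```

Calling `run_one_rtt(10.0)` returns a value just under 11, which is the "one per RTT" growth stated above; the shortfall comes from W growing slightly while the ACKs arrive.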
10
Understanding TCP Behavior
• can simulate (ns-2)
  + faithful to operation of TCP
  - expensive, time consuming
• deterministic approximations
  + quick
  - ignore some TCP details, steady state only
• fluid models
  + capture transient behavior
  - ignore some TCP details
11
TCP Throughput/Loss Relationship
Idealized model:
• W is maximum supportable window size (then loss occurs)
• TCP window starts at W/2, grows to W, then halves, then grows to W, then halves…
• one window worth of packets each RTT
• to find: throughput as function of loss, RTT
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, with loss occurring at W]
12
TCP Throughput/Loss Relationship
[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, with one “period” marked]
# packets sent per “period” = ?
13
TCP Throughput/Loss Relationship
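[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, with one “period” marked]
# packets sent per “period” =
$$\frac{W}{2} + \left(\frac{W}{2}+1\right) + \dots + W = \sum_{n=0}^{W/2}\left(\frac{W}{2}+n\right)$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \sum_{n=0}^{W/2} n$$
$$= \left(\frac{W}{2}+1\right)\frac{W}{2} + \frac{(W/2)(W/2+1)}{2}$$
$$= \frac{3}{8}W^2 + \frac{3}{4}W \approx \frac{3}{8}W^2$$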
14
TCP Throughput/Loss Relationship
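[Figure: TCP window size vs. time (rtt); sawtooth between W/2 and W, with one “period” marked]
# packets sent per “period” $\approx \frac{3}{8}W^2$
1 packet lost per “period” implies:
$$p_{loss} \approx \frac{8}{3W^2} \quad\text{or:}\quad W = \sqrt{\frac{8}{3\,p_{loss}}}$$
$$B = \text{avg. throughput} = \frac{3}{4}W \ \frac{\text{packets}}{\text{rtt}}$$
$$B = \text{avg. throughput} = \frac{1.22}{\sqrt{p_{loss}}} \ \frac{\text{packets}}{\text{rtt}}$$
The throughput formula for B can be extended to model timeouts and slow start [PFTK’98] (see slide 59 for details)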
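As a numeric check of the derivation, the sketch below sums the packets in one sawtooth period and evaluates the resulting throughput. The function names are chosen here for illustration, and W is assumed to be an even integer so that W/2 is a whole number of packets.

```python
import math


def packets_per_period(w):
    """Exact sum W/2 + (W/2 + 1) + ... + W for an even integer W."""
    return sum(range(w // 2, w + 1))


def simple_tcp_throughput(p_loss):
    """Average throughput in packets per RTT: (3/4) * W with W = sqrt(8 / (3 p))."""
    w = math.sqrt(8.0 / (3.0 * p_loss))
    return 0.75 * w
```

For W = 8 the exact sum is 30, matching (3/8)W² + (3/4)W = 24 + 6. For a 1% loss rate the throughput comes out near 12.2 packets per RTT, which is 1.22 divided by the square root of 0.01.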
15
Drawbacks of FIFO with Tail-drop
• Sometimes too late a signal to end system about network congestion
  – in particular, when RTT is large
• Buffer lock-out by misbehaving flows
• Synchronizing effect for multiple TCP flows
• Burst or multiple consecutive packet drops
  – Bad for TCP fast recovery
16
FIFO Router with Two TCP Sessions
17
Active Queue Management
• Dropping/marking packets depends on average queue length: p = p(x)
• Advantages:
  – signal end systems earlier
  – absorb bursts better
  – avoids synchronization
• Examples:
  – RED
  – REM
  – …
[Figure: marking probability p vs. average queue length x; axis marks t_min, t_max, 2t_max on x and p_max, 1 on p]
18
RED: Parameters
• min_th – minimum threshold
• max_th – maximum threshold
• avg_len – average queue length
  – avg_len = (1-w)*avg_len + w*sample_len
[Figure: discard probability (0 to 1) vs. average queue length, with min_th and max_th marked]
19
RED: Packet Dropping
• If (avg_len < min_th) enqueue packet
• If (avg_len >= max_th) drop packet
• If (avg_len >= min_th and avg_len < max_th) drop packet with probability P, enqueue it otherwise
[Figure: discard probability (P) vs. average queue length, with min_th and max_th marked]
20
RED: Packet Dropping (cont’d)
• P = max_P*(avg_len – min_th)/(max_th – min_th)
• Improvement to spread the drops: P’ = P/(1 – count*P), where
  – count – how many packets were consecutively enqueued since last drop
[Figure: discard probability vs. average queue length; P rises from 0 at min_th to max_P at max_th, with the point (avg_len, P) marked]
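The RED rules from the last three slides fit into one small class. This is a sketch under assumptions, not a reference implementation: the class and method names are invented here, the packet is dropped with probability P’ between the thresholds, and `count` is simply the number of packets enqueued since the last drop, as the slide defines it.

```python
import random


class RedQueue:
    """Minimal RED drop decision driven by the averaged queue length."""

    def __init__(self, min_th, max_th, max_p, w, rng=None):
        self.min_th = min_th
        self.max_th = max_th
        self.max_p = max_p
        self.w = w                  # averaging weight
        self.avg_len = 0.0
        self.count = 0              # packets enqueued since the last drop
        self.rng = rng or random.Random()

    def update_average(self, sample_len):
        # avg_len = (1-w)*avg_len + w*sample_len
        self.avg_len = (1 - self.w) * self.avg_len + self.w * sample_len
        return self.avg_len

    def drop_probability(self):
        if self.avg_len < self.min_th:
            return 0.0
        if self.avg_len >= self.max_th:
            return 1.0
        p = self.max_p * (self.avg_len - self.min_th) / (self.max_th - self.min_th)
        denom = 1.0 - self.count * p
        # P' = P / (1 - count*P); force a drop once the denominator vanishes.
        return 1.0 if denom <= 0.0 else min(1.0, p / denom)

    def should_drop(self, sample_len):
        """Update the average with the current queue length and decide."""
        self.update_average(sample_len)
        if self.rng.random() < self.drop_probability():
            self.count = 0
            return True
        self.count += 1
        return False
```

With min_th = 5, max_th = 15 and max_P = 0.1, an average of 10 gives P = 0.05; after four consecutive enqueues the adjusted value is 0.05/(1 - 0.2) = 0.0625, which is how the count term spreads drops out more evenly.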
21
RED Router with Two TCP Sessions
22
Dynamic (Transient) Analysis of TCP Fluids
Optional materials (slides 22-33)
• model TCP traffic as fluid
• describe behavior of flows and queues using Ordinary Differential Equations (ODEs)
• solve resulting ODEs numerically
23
Loss Model
[Figure: Sender → AQM router (packet drop/mark) → Receiver, with round trip delay τ; throughput B(t) and drop/mark probability p(t) at the router]
Loss rate as seen by sender: $\lambda(t) = B(t-\tau)\,p(t-\tau)$
24
A Single Congested Router
[Figure: TCP flow i passing through an AQM router with capacity C and discard probability p]
• focus on single bottlenecked router
  – capacity C (packets/sec)
  – queue length q(t)
  – discard prob. p(t)
• N TCP flows through router
  – window sizes W_i(t)
  – round trip time R_i(t) = A_i + q(t)/C
  – throughputs B_i(t) = W_i(t)/R_i(t)
25
Adding RED to the Model
RED: marking/dropping based on average queue length x(t)
[Figure: marking probability p vs. average queue length x; axis marks t_min, t_max, 2t_max on x and p_max, 1 on p]
[Figure: q(t) and x(t) plotted against t]
x(t): smoothed, time-averaged q(t)
26
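System of Differential Equations
Window size:
$$\frac{dW_k}{dt} = \frac{1 - p(x(t))}{R_k(t)} - \frac{W_k(t)}{2}\cdot\frac{W_k(t)}{R_k(t)}\,p(x(t))$$
(first term: additive increase; factor W_k(t)/2: multiplicative decrease; factor W_k(t) p(x(t))/R_k(t): loss arrival rate)
Queue length:
$$\frac{dq}{dt} = -\mathbf{1}\{q > 0\}\,C + \sum_k \frac{W_k(t)}{R_k(t)}$$
(first term: outgoing traffic; second term: incoming traffic)
Timeouts and slow start ignored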
27
System of Differential Equations (cont.)
Average smoothed queue length:
$$\frac{dx}{dt} = \frac{\ln(1-\alpha)}{\delta}\,x(t) - \frac{\ln(1-\alpha)}{\delta}\,q(t)$$
where α = averaging parameter of RED (the weight w in the avg_len update), δ = sampling interval ~ 1/C
Loss probability:
$$\frac{dp}{dt} = \frac{dp}{dx}\cdot\frac{dx}{dt}$$
where dp/dx is obtained from the marking profile
28
N+2 coupled equations
N flows
W_i(t) = window size of flow i
R_i(t) = RTT of flow i
p(t) = drop probability
q(t) = queue length
$$\frac{dW_i}{dt} = f_1(p, R_i, W_i), \quad i = 1, \dots, N$$
$$\frac{dq}{dt} = f_2(W_i) \qquad \frac{dp}{dt} = f_3(q)$$
Equations solved numerically using MATLAB
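The slides solve the system in MATLAB; the same idea can be sketched in Python with a forward-Euler step. Everything specific here is an assumption made for the illustration: N identical flows (so one window equation stands for all of them), the default parameter values, a marking profile that jumps to 1 at t_max, and evaluating p = p(x(t)) directly from the profile instead of integrating dp/dt.

```python
import math


def red_profile(x, t_min, t_max, p_max):
    """Marking probability as a function of the averaged queue length x."""
    if x < t_min:
        return 0.0
    if x >= t_max:
        return 1.0
    return p_max * (x - t_min) / (t_max - t_min)


def simulate_fluid(steps, dt=0.001, n_flows=50, capacity=1000.0,
                   prop_delay=0.1, t_min=20.0, t_max=80.0, p_max=0.1,
                   alpha=0.001):
    """Forward-Euler integration; returns a list of (time, W, q, x) tuples."""
    delta = 1.0 / capacity                 # sampling interval ~ 1/C
    k = math.log(1.0 - alpha) / delta      # negative smoothing constant
    w, q, x = 1.0, 0.0, 0.0
    trace = []
    for step in range(steps):
        r = prop_delay + q / capacity      # R(t) = A + q(t)/C
        p = red_profile(x, t_min, t_max, p_max)
        dw = (1.0 - p) / r - (w / 2.0) * (w / r) * p
        arrivals = n_flows * w / r
        # An empty queue can only grow; otherwise it drains at rate C.
        dq = arrivals - capacity if q > 0.0 else max(arrivals - capacity, 0.0)
        dx = k * x - k * q
        w = max(w + dw * dt, 1.0)          # window floor of one packet (assumption)
        q = max(q + dq * dt, 0.0)
        x = max(x + dx * dt, 0.0)
        trace.append(((step + 1) * dt, w, q, x))
    return trace
```

A call such as `simulate_fluid(steps=20000)` gives 20 simulated seconds; plotting q and x from the trace shows the averaged queue lagging the instantaneous one, which is the feedback delay discussed later under issues with RED.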
29
Steady State Behavior
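• let t → ∞: $dW_k/dt = 0$, $p(t) \to p$, $W_k(t) \to W_k$, $R_k(t) \to R_k$
• this yields
$$0 = \frac{1-p}{R_k} - \frac{W_k^2}{2R_k}\,p \quad\text{or}\quad W_k = \sqrt{\frac{2(1-p)}{p}}$$
• the throughput is
$$B_k = \frac{W_k}{R_k} = \frac{\sqrt{2(1-p)}}{R_k\sqrt{p}} \approx \frac{\sqrt{2}}{R_k\sqrt{p}} \quad\text{for small } p$$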
30
A Queue is not a Network
Network: set of AQM routers V; sequence V_i of routers for session i
Link bandwidth constraints
Queue equations
Loss/marking probability (cumulative probability along the path):
$$p_i(t) = 1 - \prod_{v \in V_i}\bigl(1 - p_v(q_v(t))\bigr)$$
Round trip time (aggregate delay):
$$R_i(t) = A_i + \sum_{v \in V_i} q_v(t)$$
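The two path-level quantities are easy to compute once the per-router values are known. The sketch below uses invented function names, and it assumes each q_v(t) is already expressed as a queueing delay in seconds so that it can be added to the propagation delay A_i.

```python
def path_loss_probability(per_router_probs):
    """p_i = 1 - product over the path of (1 - p_v)."""
    survive = 1.0
    for p in per_router_probs:
        survive *= (1.0 - p)
    return 1.0 - survive


def path_rtt(prop_delay, queueing_delays):
    """R_i = A_i + sum of the queueing delays along the path."""
    return prop_delay + sum(queueing_delays)
```

For two routers marking at 10% and 20%, a packet survives both with probability 0.9 × 0.8 = 0.72, so the session sees a 28% marking probability, which is less than the 30% a plain sum would suggest.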
31
How well does it work?
• OC-12 – OC-48 links
• RED with target delay 5 msec
• 2600 TCP flows
  – decrease to 1300 at 30 sec.
  – increase to 2600 at 90 sec.
[Figure: flows crossing OC-48 and OC-12 links; timeline with 2600 flows until t=30, 1300 flows until t=90, then 2600 flows]
32
Good queue length match
[Figure: instantaneous delay vs. time (sec); curves: simulation, fluid model]
33
[Figure: window size vs. time (sec); curves: simulation, fluid model]
matches average window size
[Figure: average window size vs. time (sec); curves: simulation, fluid model]
34
Issues with RED
• Parameter sensitivity
  – how to set min_th, max_th, and max_p
  – Goal: maintain avg. queue size below midpoint between min_th and max_th
    • max_th needs to be significantly smaller than max. queue size to absorb transient peaks
    • max_p determines drop rate
  – In reality, hard to set these parameters
• RED uses avg. queue length, which may introduce large feedback delay and lead to instability
35
Other AQM Mechanisms
• Adaptive RED (ARED)
• BLUE
• Virtual Queue
• Random Exponential Marking (REM)
• Proportional Integral Controller
• Adaptive Virtual Queue
  – Improved AQMs are designed based on control theory to provide faster response to congestion and more stable systems
36
Explicit Congestion Notification (ECN)
• Standard TCP:
  – Losses needed to detect congestion
  – Wasteful and unnecessary
• ECN (RFC 2481):
  – Routers mark packets instead of dropping them
  – Receiver returns marks to sender in ACK packets
  – Sender adjusts its window accordingly
• Two bits in IP header:
  – ECT: ECN-capable transport (set to 1)
  – CE: congestion experienced (set to 1)
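A router's choice between marking and dropping can be sketched with two bit masks. The mask values are an assumption of this sketch: they place ECT and CE in the two low-order bits of the IPv4 TOS octet, which is how RFC 2481 is read here, and the function name is made up for illustration.

```python
ECT_BIT = 0x02  # ECN-capable transport (assumed bit position)
CE_BIT = 0x01   # congestion experienced (assumed bit position)


def signal_congestion(tos_byte):
    """Return (new_tos_byte, dropped) when the AQM decides to signal congestion."""
    if tos_byte & ECT_BIT:
        # The transport understands ECN: mark the packet instead of dropping it.
        return tos_byte | CE_BIT, False
    # Non-ECN packet: the only available signal is a drop.
    return tos_byte, True
```

A packet with only ECT set leaves the router with both bits set and is not dropped, while a packet with neither bit set is dropped exactly as it would be without ECN.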
37
TCP congestion control performs poorly as bandwidth or delay increases
[Figure, two panels: avg. TCP utilization vs. bottleneck bandwidth (Mb/s), with 50 flows in both directions, buffer = BW x delay, RTT = 80 ms; avg. TCP utilization vs. round trip delay (sec), with 50 flows in both directions, buffer = BW x delay, BW = 155 Mb/s]
Shown analytically in [Low01] and via simulations
Because TCP lacks fast response
• Spare bandwidth is available ⇒ TCP increases by 1 pkt/RTT even if spare bandwidth is huge
• When a TCP starts, it increases exponentially ⇒ too many drops ⇒ flows ramp up by 1 pkt/RTT, taking forever to grab the large bandwidth
38
XCP: eXplicit congestion Control Protocol
Solution: Decouple Congestion Control from Fairness
• Congestion control: high utilization; small queues; few drops
• Fairness: bandwidth allocation policy
39
Why Decoupling?
Solution: Decouple Congestion Control from Fairness
Coupled because a single mechanism controls both
Example: In TCP, Additive-Increase Multiplicative-Decrease (AIMD) controls both
How does decoupling solve the problem?
1. To control congestion: use MIMD, which shows fast response
2. To control fairness: use AIMD, which converges to fairness
40
Characteristics of XCP Solution
1. Improved Congestion Control (in high bandwidth-delay & conventional environments):
• Small queues
• Almost no drops
2. Improved Fairness
3. Scalable (no per-flow state)
4. Flexible bandwidth allocation: max-min fairness, proportional fairness, differential bandwidth allocation, …
41
XCP: An eXplicit Control Protocol
1. Congestion Controller
2. Fairness Controller
42
How does XCP Work?
Congestion header carried in each packet: Round Trip Time, Congestion Window, Feedback
[Figure: congestion header with Feedback = + 0.1 packet]
43
How does XCP Work?
[Figure: congestion header (Round Trip Time, Congestion Window, Feedback); Feedback = + 0.1 packet is replaced by Feedback = - 0.3 packet]
44
How does XCP Work?
Congestion Window = Congestion Window + Feedback
Routers compute feedback without any per-flow state
XCP uses ECN and a “Core Stateless” mechanism (i.e., state carried in packet header)
45
How Does an XCP Router Compute the Feedback?
Congestion Controller (MIMD)
• Goal: matches input traffic to link capacity and drains the queue
• Looks at aggregate traffic and queue
• Algorithm: aggregate traffic changes by φ, where φ ~ spare bandwidth and φ ~ - queue size
  So, φ = α d_avg Spare - β Queue
Fairness Controller (AIMD)
• Goal: divides φ between flows to converge to fairness
• Looks at a flow’s state in the congestion header
• Algorithm:
  – If φ > 0: divide φ equally between flows
  – If φ < 0: divide φ between flows proportionally to their current rates
46
Getting the devil out of the details …
Congestion Controller: φ = α d_avg Spare - β Queue
Theorem: System converges to optimal utilization (i.e., stable) for any link bandwidth, delay, number of sources if:
$$0 < \alpha < \frac{\pi}{4\sqrt{2}} \quad\text{and}\quad \beta = \alpha^2\sqrt{2}$$
(Proof based on Nyquist Criterion)
No Parameter Tuning
Fairness Controller: Algorithm:
  – If φ > 0: divide φ equally between flows
  – If φ < 0: divide φ between flows proportionally to their current rates
Need to estimate number of flows N:
$$N = \sum_{\text{pkts in } T} \frac{1}{T \times (Cwnd_{pkt} / RTT_{pkt})}$$
RTT_pkt: Round Trip Time in header
Cwnd_pkt: Congestion Window in header
T: counting interval
No Per-Flow State
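Both router computations on this slide are short enough to sketch directly. The function names are invented for the illustration, φ denotes the aggregate feedback, and each packet is assumed to be represented by the pair (Cwnd_pkt, RTT_pkt) read from its congestion header.

```python
import math


def beta_for(alpha):
    """Return beta = alpha^2 * sqrt(2), after checking the stability bound on alpha."""
    if not 0.0 < alpha < math.pi / (4.0 * math.sqrt(2.0)):
        raise ValueError("alpha outside (0, pi / (4 sqrt 2))")
    return alpha * alpha * math.sqrt(2.0)


def aggregate_feedback(alpha, beta, avg_rtt, spare_bandwidth, queue):
    """phi = alpha * d_avg * Spare - beta * Queue."""
    return alpha * avg_rtt * spare_bandwidth - beta * queue


def estimate_num_flows(packets, interval):
    """N = sum over packets seen in T of 1 / (T * (Cwnd_pkt / RTT_pkt))."""
    return sum(rtt / (interval * cwnd) for cwnd, rtt in packets)
```

The flow estimate works without per-flow state because a flow with window 10 and RTT 0.1 s sends 100 packets per second, and each of those packets contributes 1/100 to the sum over a one-second interval, so the flow is counted once.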
47
Congestion Control without Reliability
• So far: congestion control for reliable (TCP-like) data streams
  – Use AIMD window/rate adjustment
• What about “long-lived” UDP flows?
  – VoIP, video streaming flows
  – prefer timeliness over reliability
  – AIMD: rate adjustment too abrupt
• DCCP: Datagram Congestion Control Protocol (IETF proposed standard)
  – Large increase in long-lived UDP flows on Internet
  – Reduce impact of long-lived UDP flows on TCP flows and among themselves, avoid congestion collapse
  – Can’t leave it to application developers
    • May lead to “buggy” implementations, even if they are willing to implement congestion control
48
DCCP Design Goals
• Minimalism:
  – Minimal functionality, in line with the e2e argument
  – Few core protocol features, rich in implementation
    • Leave out other features that can be implemented successfully by apps or an intermediate library
    • Minimize header size
• Robustness
  – in particular to attacks such as DoS
  – robust and transparent to middleboxes
• Framework for modern congestion control
  – Allow use of a variety of congestion control algorithms
• Self-sufficiency
  – perform congestion control without application intervention
• Support timing-reliability trade-offs
49
DCCP Overview
• Fundamental Design Choices:
  – In-band signaling
    • Alternative: a separate signaling channel
  – Bi-directional communication
    • Alternative: one-way data flow
  – Per-packet sequence number space
    • including all non-data control packets (even ack pkts!)
    • Alternative: TCP-like byte stream, per-data pkt
• Core Features
  – Connection management
  – Synchronization between the two end-points
  – “Negotiation” of congestion control mechanisms
  – Others: mobility/multi-homed end points, partial checksum
50
DCCP Header Format
(a) generic header: starts every DCCP packet
(b) additional header info depending on packet type, possibly followed by options; payload starts at data offset
[Figure: header layout, 32 bits wide (bit offsets 0, 8, 16, 24)
 (a) source port | destination port; data offset | CCVal | CsCov | checksum; res | type | 1 | reserved | sequence number; sequence number (low bits) [48 or 24 bits]
 (b) reserved | acknowledgement number; acknowledgement number (low bits) [48 or 24 bits]
 options (0 to 1008 bytes)]
51
DCCP Packet Types and States
• Packet Types
  – DCCP-Sync, DCCP-SyncAck: for explicit re-synchronization after burst of losses
• Protocol State Machine:
  – No “half-closed” state (cf. TCP)
52
DCCP: Two “Half Connections”
• Half-connection (HC):
  – data flowing in one direction plus the corresponding ACKs
    • E.g., one HC: A->B data plus B->A acks
    • A: HC-sender; B: HC-receiver
• Each connection: two conceptually separate HCs
  – some may have only one HC (i.e., data flows one way)
• State (seq. no., ack. no., …) maintained for each HC
  – Each HC may use a different congestion control mechanism
53
Connection Set-up/Teardown: Example 1
• Client close: client enters “timed wait” state
54
Connection Set-up/Teardown: Example 2
• Server close:
  – client enters “timed wait” state
  – server never enters “timed wait” state!
55
How to Maintain State and Sync without Reliability
• HC-sender:
  – maintain state about what has been sent, window of expected ack no.’s, etc.
• HC-receiver:
  – maintain state about which pkts have been received, window of expected seq. no.’s, etc.
  – ack no. in ACK/DATA-ACK: acknowledges only a previously received packet (i.e., not a cumulative ack!)
    • packets can be data or “control” packets
    • use “ACK Vector” option for selective ack’s
  – But lost packets may never be re-transmitted
    • “state explosion” at receiver!
    • Sending “ack of ack’s” to clean up state at HC-receiver
• Sender and receiver may lose “sync” after burst of packet losses!
  – Use Sync and SyncAck to get into sync again
  – Care must be taken to deal with “half-open” connections
56
Congestion Control Mechanisms
• Represented by a Congestion Control Identifier (CCID)
• CCID 2: TCP-like congestion control mechanism
  – sawtooth rate adjustment
  – quickly gets available bandwidth
• CCID 3: TCP-Friendly Rate Control (TFRC) mechanism
  – responds more gradually to congestion
  – more suitable for audio/video flows
• Feature Negotiation
  – Provides a generic mechanism for negotiating shared parameters
    • select CCID for each HC
    • negotiate CCID-specific parameters
CCID 0: reserved; CCID 1: unspecified sender-based congestion control
57
A Few Words about Security and Mobility
• Want to prevent data injection, DoS attacks
  – Use initiation cookies (or “nonces”)
  – Identification and Challenge options
  – In general, obey the “TCP robustness principle”
    • “be conservative in what you send, and liberal in what you accept” (modulo security)
    • weigh trade-off between being “strict with validity checks” vs. exploitation of such checks for “denial-of-service” attacks
• End-Point Mobility (and Multi-Homing)
  – Basic mechanism: when an end-point moves, it sends “DCCP-move” from the new address
    • contains old address and port
    • mandatory security mechanism (identification option) to prevent hijacking
58
TFRC: General Idea
• Use a model of TCP's throughput as a function of the loss rate and RTT directly in a congestion control algorithm
  – If transmission rate is higher than that given by the model, reduce the transmission rate to the model's rate.
  – Otherwise increase the transmission rate
• Equation-based: TCP-friendly
  – TCP throughput approximate formula (from slide 14): $B = 1.22/\sqrt{p_{loss}}$ packets per rtt
  – Improved formula taking into account effect of “time out” (slide 59)
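The general idea can be sketched as a rate update that compares the current sending rate with the model's rate. The function names and the additive `step` are assumptions of this sketch: the slide only says "increase", so the size of the increase and the cap at the model's rate are choices made here, and the model used is the simple formula from slide 14 converted to packets per second.

```python
import math


def model_rate(p_loss, rtt):
    """Simple TCP model: 1.22 / sqrt(p) packets per RTT, as packets per second."""
    return 1.22 / (rtt * math.sqrt(p_loss))


def next_rate(current_rate, p_loss, rtt, step):
    """Move the sending rate (packets/sec) toward the model's rate."""
    target = model_rate(p_loss, rtt)
    if current_rate > target:
        # Sending faster than a TCP flow would: fall back to the model's rate.
        return target
    # Otherwise increase, without overshooting the model's rate (assumption).
    return min(current_rate + step, target)
```

With 1% loss and a 100 ms RTT the model allows 122 packets per second, so a flow at 200 is cut straight to 122, while a flow at 100 with a step of 10 moves to 110.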
59
An Improved "Steady State" Model
A pretty good improved model of TCP Reno, including timeouts, from Padhye et al, Sigcomm 1998:
Would be better to have a model of TCP SACK, but the differences aren’t critical.
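For reference, the approximation commonly quoted from Padhye et al. can be written as a short function. The parameter names are chosen here: `rtt` is the round trip time, `t0` the initial timeout, and `b` the number of packets acknowledged per ACK; the cap imposed by the receiver's maximum window is left out of this sketch.

```python
import math


def pftk_throughput(p, rtt, t0, b=1):
    """Approximate TCP Reno throughput in packets per second.

    B ~ 1 / (rtt*sqrt(2bp/3) + t0*min(1, 3*sqrt(3bp/8))*p*(1 + 32p^2))
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("loss rate p must be in (0, 1]")
    triple_dup_term = rtt * math.sqrt(2.0 * b * p / 3.0)
    timeout_term = (t0 * min(1.0, 3.0 * math.sqrt(3.0 * b * p / 8.0))
                    * p * (1.0 + 32.0 * p * p))
    return 1.0 / (triple_dup_term + timeout_term)
```

Setting `t0` to zero removes the timeout term and recovers the simple model, since 1/sqrt(2/3) is about 1.22; any positive timeout lowers the predicted throughput, and the gap widens as the loss rate grows.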
60
TFRC Details
• The devil's in the details
  – How to measure the loss rate?
  – How to respond to persistent congestion?
  – How to use RTT and prevent oscillatory behavior?
• Not as simple as first thought
61
TFRC Performance (Simulation)
62
TFRC Performance (Experimental)