TCP Congestion Control and Common AQM Schemes: Quick Revision
Shivkumar Kalyanaraman
Rensselaer Polytechnic Institute
http://www.ecse.rpi.edu/Homepages/shivkuma
Based in part upon slides of Prof. Raj Jain (OSU), Srini Seshan (CMU), J. Kurose (U Mass), I. Stoica (UCB)
Overview
TCP Congestion Control Model and Mechanisms
TCP Versions: Tahoe, Reno, NewReno, SACK, Vegas, etc.
AQM schemes: common goals, RED, …
TCP Congestion Control
Maintains three variables:
cwnd – congestion window
rcv_win – receiver advertised window
ssthresh – threshold size (used to update cwnd); a rough estimate of the knee point
For sending, use: win = min(rcv_win, cwnd)
Packet Conservation: Self-clocking
[Figure: self-clocking pipe model: Sender and Receiver; packet spacings Pb, Pr; ack spacings As, Ar, Ab]
Implications of ack-clocking:
More batching of acks => bursty traffic
Less batching leads to a large fraction of Internet traffic being just acks (overhead)
TCP: Slow Start
Whenever starting traffic on a new connection, or whenever increasing traffic after congestion was experienced:
Set cwnd = 1
Each time a segment is acknowledged, increment cwnd by one (cwnd++)
Does slow start increase slowly? Not really: the growth of cwnd is exponential! The window reaches W in RTT * log2(W).
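The doubling-per-RTT behavior can be sketched in a few lines. This is a minimal illustration (not a real TCP stack), assuming every segment is acknowledged and no losses occur:

```python
# Sketch: slow-start growth per RTT. Each ACKed segment adds 1 to cwnd,
# so with cwnd segments ACKed per round, cwnd doubles every RTT.
def rtts_to_reach(target_w):
    """Rounds (RTTs) for cwnd to grow from 1 to target_w under slow start."""
    cwnd, rounds = 1, 0
    while cwnd < target_w:
        cwnd *= 2          # cwnd doubles each round trip
        rounds += 1
    return rounds

print(rtts_to_reach(64))   # 6 rounds, i.e. RTT * log2(64)
```

This matches the slide's claim: reaching window W takes RTT * log2(W).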
Slow Start Example
The congestion window size grows very rapidly; TCP slows down the increase of cwnd once cwnd >= ssthresh.
cwnd = 1: send segment 1
ACK for segment 1; cwnd = 2: send segments 2, 3
ACK for segments 2 + 3; cwnd = 4: send segments 4, 5, 6, 7
ACK for segments 4+5+6+7; cwnd = 8
Slow Start Sequence Plot
[Figure: sequence number vs. time; the window doubles every round]
Congestion Avoidance
Goal: maintain the operating point to the left of the cliff. How?
Additive increase: starting from the rough estimate (ssthresh), slowly increase cwnd to probe for additional available bandwidth.
Multiplicative decrease: cut the congestion window size aggressively if a loss is detected.
If cwnd > ssthresh, then each time a segment is acknowledged, increment cwnd by 1/cwnd (cwnd += 1/cwnd).
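The per-ACK increment of 1/cwnd adds up to roughly one segment per round trip, since about cwnd ACKs arrive per RTT. A quick sketch (illustrative only) confirms this:

```python
# Sketch: one congestion-avoidance round. With ~cwnd ACKs per RTT,
# each adding 1/cwnd, the window grows by roughly 1 segment per RTT.
def ca_round(cwnd):
    acks = int(cwnd)          # roughly cwnd segments ACKed in one RTT
    for _ in range(acks):
        cwnd += 1.0 / cwnd    # per-ACK additive increase
    return cwnd

w = ca_round(8.0)             # grows to roughly 9 (slightly less,
                              # since cwnd rises during the round)
```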
Additive Increase/Multiplicative Decrease (AIMD) Policy
Assumption: the decrease policy must (at minimum) reverse the load increase over and above the efficiency line.
Implication: the decrease factor should be set conservatively to account for congestion-detection lags, etc.
[Figure: phase plot of User 1's allocation x1 vs. User 2's allocation x2, with the efficiency line and the fairness line; the trajectory passes through points x0, x1, x2]
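The convergence to fairness shown in the phase plot can be demonstrated numerically. This is a hedged sketch with hypothetical parameters (capacity, alpha, beta are illustrative, not from the slide): two AIMD flows, whatever their starting rates, drift toward equal shares because multiplicative decrease shrinks the gap between them while additive increase preserves it:

```python
# Sketch: two AIMD flows sharing a link of capacity C (arbitrary units).
# Both flows detect congestion whenever the total load exceeds capacity.
def aimd(x1, x2, capacity=100.0, alpha=1.0, beta=0.5, steps=2000):
    for _ in range(steps):
        if x1 + x2 > capacity:            # congestion: multiplicative decrease
            x1, x2 = x1 * beta, x2 * beta  # gap |x1 - x2| shrinks by beta
        else:                              # additive increase
            x1, x2 = x1 + alpha, x2 + alpha  # gap is preserved
    return x1, x2

x1, x2 = aimd(5.0, 80.0)   # very unequal start; allocations end up nearly equal
```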
Congestion Avoidance Sequence Plot
Time
Sequence No Window growsby 1 every round
Slow Start/Congestion Avoidance Eg.
Assume that ssthresh = 8.
cwnd per round trip: 1, 2, 4, 8 (slow start), then 9, 10, … (congestion avoidance once cwnd >= ssthresh)
[Figure: cwnd (in segments) vs. round-trip times t = 0 to 6, with the ssthresh level marked]
Putting Everything Together: TCP Pseudo-code
Initially:
    cwnd = 1;
    ssthresh = infinite;
New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;
Timeout (loss detection):
    /* Multiplicative decrease */
    ssthresh = win/2;
    cwnd = 1;
Always:
    while (next < unack + win)
        transmit next packet;
    where win = min(cwnd, flow_win);
[Figure: sequence-number line showing unack, next, and the window win]
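The pseudo-code above translates almost directly into runnable form. A minimal sketch (illustrative state machine, not a real TCP implementation):

```python
# Runnable sketch of the slide's pseudo-code.
class TcpCC:
    def __init__(self):
        self.cwnd = 1.0
        self.ssthresh = float("inf")

    def on_ack(self):
        if self.cwnd < self.ssthresh:      # slow start
            self.cwnd += 1
        else:                              # congestion avoidance
            self.cwnd += 1.0 / self.cwnd

    def on_timeout(self):                  # multiplicative decrease
        self.ssthresh = max(self.cwnd / 2, 1.0)
        self.cwnd = 1.0

    def send_window(self, flow_win):
        return min(self.cwnd, flow_win)    # win = min(cwnd, flow_win)

cc = TcpCC()
for _ in range(8):
    cc.on_ack()        # slow start: cwnd grows 1 -> 9
cc.on_timeout()        # ssthresh = 4.5, cwnd back to 1
```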
The big picture
Time
cwnd
Timeout
Slow Start
CongestionAvoidance
Packet Loss Detection: Timeout Avoidance
Wait for the Retransmission Time Out (RTO). What's the problem with this?
RTO is a performance killer: in the BSD TCP implementation, RTO is usually more than 1 second, because the granularity of the RTT estimate is 500 ms and the retransmission timeout is at least twice the RTT.
Solution: don't wait for the RTO to expire. Use fast retransmit/fast recovery for loss detection, and fall back to the RTO only if these mechanisms fail.
TCP versions: Tahoe, Reno, NewReno, SACK
TCP Congestion Control Summary
Sliding window limited by the receiver window
Dynamic windows: slow start (exponential rise), congestion avoidance (additive rise), multiplicative decrease
Ack clocking
Adaptive timeout: needs mean RTT & deviation
Timer backoff and Karn's algorithm during retransmission
Go-back-N or selective retransmission
Cumulative and selective acknowledgements
Timeout avoidance: fast retransmit
Queuing Disciplines
Each router must implement some queuing discipline. Queuing allocates bandwidth and buffer space:
Bandwidth: which packet to serve next (scheduling)
Buffer space: which packet to drop next (buffer management)
Queuing also affects latency.
[Figure: traffic sources and traffic classes A, B, C feeding a queue; scheduling decides service order, buffer management decides drops]
Typical Internet Queuing
FIFO + drop-tail: the simplest choice, used widely in the Internet
FIFO (first-in-first-out): implies a single class of traffic
Drop-tail: arriving packets get dropped when the queue is full, regardless of flow or importance
Important distinction:
FIFO: scheduling discipline
Drop-tail: drop (buffer management) policy
FIFO + Drop-tail Problems
FIFO issues: in a FIFO discipline, the service seen by a flow is convoluted with the arrivals of packets from all other flows!
No isolation between flows: full burden on e2e control
No policing: send more packets, get more service
Drop-tail issues:
Routers are forced to have large queues to maintain high utilization
Larger buffers => larger steady-state queues/delays
Synchronization: end hosts react to the same events because packets tend to be lost in bursts
Lock-out: a side effect of burstiness and synchronization is that a few flows can monopolize queue space
Queue Management Ideas
Synchronization, lock-out:
Random drop: drop a randomly chosen packet
Drop front: drop the packet at the head of the queue
High steady-state queuing vs. burstiness:
Early drop: drop packets before the queue is full
Do not drop packets "too early", because the queue may reflect only burstiness and not true overload
Misbehaving vs. fragile flows:
Drop packets in proportion to each flow's queue occupancy
Try to protect fragile flows from packet loss (e.g., color them or classify them on the fly)
Drop packets vs. mark packets:
Dropping packets interacts with reliability mechanisms
Marking packets: need to trust end-systems to respond!
Packet Drop Dimensions
Aggregation: per-connection state vs. single class, with class-based queuing in between
Drop position: head, tail, or random location
When to drop: early drop vs. overflow drop
Random Early Detection (RED)
[Figure: queue with min and max thresholds on the average queue length; drop probability P(drop) vs. avg queue length: 0 below minth, rising linearly to maxP at maxth, then jumping to 1.0]
Random Early Detection (RED)
Maintain a running average of the queue length (low-pass filtering).
If avg Q < minth: do nothing (low queuing; send packets through)
If avg Q > maxth: drop the packet (protection from misbehaving sources)
Else: mark (or drop) the packet with a probability proportional to the queue length, with a bias to protect against synchronization:
Pb = maxp * (avg - minth) / (maxth - minth)
Further, bias Pb by the history of unmarked packets:
Pa = Pb / (1 - count * Pb)
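The two formulas combine into a single per-packet decision. A minimal sketch using the slide's formulas (the threshold values are illustrative defaults, not prescribed by RED):

```python
# Sketch of RED's per-packet mark/drop probability.
def red_prob(avg, count, minth=10.0, maxth=40.0, maxp=0.1):
    """avg: average queue length; count: packets since the last mark."""
    if avg < minth:
        return 0.0                               # do nothing
    if avg > maxth:
        return 1.0                               # drop: misbehaving sources
    pb = maxp * (avg - minth) / (maxth - minth)  # linear in avg queue
    return pb / (1 - count * pb)                 # bias by unmarked history

print(red_prob(25.0, 0))   # pb = 0.1 * 15/30 = 0.05
```

The count-based bias spreads marks more evenly over time, which is what breaks synchronization among flows.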
RED Issues
Issues:
Breaks synchronization well
Extremely sensitive to parameter settings
Wild queue oscillations upon load changes
Fails to prevent buffer overflow as the number of sources increases
Does not help fragile flows (e.g., small-window flows or retransmitted packets)
Does not adequately isolate cooperative flows from non-cooperative flows
Isolation:
Fair queuing achieves isolation using per-flow state
RED penalty box: monitor the history of packet drops, identify flows that use a disproportionate share of bandwidth
REM: Athuraliya & Low 2000
Main ideas:
Decouple the congestion measure from the performance measure
"Price" is adjusted to match rate and clear the buffer
Marking probability is exponential in the 'price'
[Figure: link marking probability (0 to 1) vs. link congestion measure (0 to 20); REM's curve depends on price, RED's on the average queue]
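The "exponential in price" marking rule has a simple closed form: with a constant phi > 1, the marking probability is 1 - phi^(-price). A hedged sketch (the value phi = 1.15 below is illustrative, chosen to match the comparison on the next slide, not mandated by REM):

```python
# Sketch of REM's marking probability: exponential in the link "price",
# decoupled from the instantaneous queue length.
def rem_mark_prob(price, phi=1.15):
    return 1.0 - phi ** (-price)

print(rem_mark_prob(0.0))   # zero price => no marking
```

One useful property: when a packet crosses several links, the end-to-end marking probability depends only on the sum of the per-link prices, because the phi^(-price) factors multiply.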
Comparison of AQM Performance
DropTail: queue = 94%
RED: min_th = 10 pkts, max_th = 40 pkts, max_p = 0.1
REM: queue = 1.5 pkts, utilization = 92%, parameters = 0.05, = 0.4, = 1.15
What is TCP Throughput?
[Figure: sawtooth of window vs. time t, oscillating between 2w/3 and 4w/3; the mean is w = (4w/3 + 2w/3)/2; area under one cycle = 2w^2/3]
Each cycle delivers 2w^2/3 packets.
Assume each cycle delivers 1/p packets, i.e., 1/p = 2w^2/3: the connection delivers 1/p packets followed by a drop, so the loss probability = p/(1+p) ~ p if p is small.
Hence w = sqrt(3/(2p)).
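The algebra can be checked numerically: solving 2w^2/3 = 1/p for w gives w = sqrt(3/(2p)), and plugging that w back in should recover 1/p packets per cycle.

```python
# Numerical check of the derivation: if each cycle delivers 2*w^2/3
# packets and one in 1/p packets is lost, then w = sqrt(3/(2p)).
import math

def equilibrium_window(p):
    return math.sqrt(3.0 / (2.0 * p))

w = equilibrium_window(0.01)        # p = 1% loss
print(2 * w * w / 3, 1 / 0.01)      # both give 100 packets per cycle
```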
The 'Square-Root p' Law
Equilibrium window size: w_s = a / sqrt(p)
Equilibrium rate: x_s = a / (D_s * sqrt(p)), where D_s is the round-trip delay of flow s
Empirically, the constant a ~ 1; verified extensively through simulations and on the Internet.
References:
T. J. Ott, J. H. B. Kemperman and M. Mathis (1996)
M. Mathis, J. Semke, J. Mahdavi, T. Ott (1997)
T. V. Lakshman and U. Madhow (1997)
J. Padhye, V. Firoiu, D. Towsley, J. Kurose (1998)