
Congestion Control, Chapter 6

Outline

Resource Allocation Issues

Queuing Disciplines

FCFS (FIFO queues)

Priority Queuing

Fair Queuing (for flows)

TCP Congestion Control

Detection – Resolution approach (AIMD and Slow Start)

Alternatives: Fast Retransmit / Fast Recovery

Congestion Avoidance

router-centric: DECbit and RED Gateways

host-centric: TCP Vegas

QoS

Congestion Control

ISSUES:

• How to fairly allocate resources (link bandwidths and switch buffers) among users.

• Two sides of the same coin:
– Resource allocation so as to avoid congestion (difficult with any precision)
– Congestion control if (and when) it occurs

• Resource allocation and congestion control involve both:
– hosts at the edges of the network (transport protocols)
– routers inside the network (queuing disciplines)

• Underlying service model can be
– best-effort (assumed here: end hosts are given no opportunity to make QoS demands)
– multiple qualities of service, QoS (later)

[Figure: Congestion in a packet-switched network. Source1 and Source2, attached via a 10-Mbps Ethernet and a 100-Mbps FDDI, reach the Destination through a Router and a 1.5-Mbps T1 link.]

Framework

• Connectionless flows assumed. What are they? Even though datagrams from a source to a destination are switched independently, they typically flow through the same path.
– Routers maintain soft state info
  • Somewhere between the hard state info of a VC switch (bandwidth, cell-loss ratio, etc.) and the no state info of a pure connectionless network.
  • Correct operation does not depend on soft state info but is improved by it.
– Implicitly defined: router watches for what appears to be a flow (used in TCP Congestion Control).
– Explicitly defined: source sends a flow-setup message (flow about to start) across the network. (A step down from a VC, since an explicit flow has no reliable, ordered delivery.)

• Taxonomy of Resource Allocation/Congestion Control mechanisms
– Router-centric (address the problem inside the network: decide what to forward or drop, inform hosts) versus Host-centric (address the problem from outside the network)
– Reservation-based (hosts request capacity when a flow is established) versus Feedback-based, which is either Explicit (e.g., a congested router sends a “slow down” message) or Implicit (e.g., a host adjusts its rate based on, e.g., cell-loss rate)
– Window-based (telling the sender the remaining buffer space, as in flow control) versus Rate-based (telling the sender the rate at which data can be absorbed)

[Figure: Multiple flows passing through a set of routers (Source1, Source2, and Source3 reach Destination1 and Destination2 via three routers).]

Evaluation Criteria (of resource allocation effectiveness & fairness)

• Effective Resource Allocation (utilization issue, network-wide point of view)
– measured by Power = ratio of throughput to delay.

• Fair Resource Allocation (to individual senders)
– Can assume Fair means Equal shares

– E.g., Raj Jain proposed a metric for the case when Fair means Equal and all paths are of equal length:

Jain’s Fairness Index: given flow throughputs (units/sec) x_1, x_2, …, x_n,

$$f(x_1, x_2, \ldots, x_n) = \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n \sum_{i=1}^{n} x_i^2}$$

If all n flows have a throughput of 1 unit/sec, f = n^2 / (n · n) = 1.

However, if k flows have throughput 1 and n-k have throughput 0, f = k^2 / (n · k) = k/n (less fair).
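The two cases above can be checked numerically. The following is a minimal sketch of the index as defined; the function name `jain_fairness` is chosen here for illustration and does not come from the slides.

```python
def jain_fairness(throughputs):
    """Jain's fairness index: (sum of x_i)^2 / (n * sum of x_i^2)."""
    n = len(throughputs)
    total = sum(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if n == 0 or sum_of_squares == 0:
        # The index is undefined when no flow carries any traffic.
        raise ValueError("need at least one flow with nonzero throughput")
    return total * total / (n * sum_of_squares)


print(jain_fairness([1, 1, 1, 1]))  # all n flows equal -> 1.0
print(jain_fairness([1, 1, 0, 0]))  # k = 2 of n = 4 flows active -> 0.5
```

The second call reproduces the k/n result: two active flows out of four give an index of 0.5.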

[Figure: Power (throughput/delay) as a function of load. Power peaks at the optimal load; beyond it lies thrashing or congestion collapse.]

Queuing Discipline

(Each router specifies a queuing discipline regardless of the resource allocation mechanism. The algorithm can be thought of as allocating both bandwidth (which packets get transmitted) and buffer space (which packets get dropped).)

• First-In-First-Out or FIFO (AKA FCFS)
– Packets transmitted in arrival order.
– No discrimination between traffic sources.
– Usually used with a “tail drop” policy.
– FIFO and tail drop are often treated as a bundle.
– Widely used in the Internet.
– Variations include priority queuing.

• Fair Queuing (FQ) for Flows
– explicitly segregates traffic based on flows (separate queue per flow)
– Weighted Fair Queuing allows a weight to be assigned to each flow.

[Figure: Round-robin service of four per-flow queues (Flow 1 through Flow 4).]

Fair Queuing - FQ Algorithm

For simplicity, suppose the clock ticks each time a bit is transmitted (bit = tick).

Let P_i = length of packet i
S_i = time when transmission of packet i starts
F_i = time when transmission of packet i finishes

F_i = S_i + P_i

For a single flow, when does a router start transmitting packet i?
– If packet i arrives before the router has finished with this flow’s packet i-1: right after the last bit of packet i-1 (at F_{i-1}).
– If there are no current packets for this flow: when packet i arrives (at time A_i).

Thus: S_i = MAX(F_{i-1}, A_i) and F_i = MAX(F_{i-1}, A_i) + P_i

For multiple flows (not perfect: can’t preempt the current packet):
– Calculate F_i for each packet that arrives on each flow (treat as timestamps).
– The packet with the lowest timestamp is transmitted next.
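The timestamp rule can be sketched in a few lines. This is a simplified illustration, not a router implementation: it assumes every packet is already queued when the router chooses, so it ignores the non-preemption caveat, and the names `finish_time` and `fq_order` are hypothetical.

```python
def finish_time(prev_finish, arrival, length):
    """F_i = MAX(F_{i-1}, A_i) + P_i for one flow."""
    return max(prev_finish, arrival) + length


def fq_order(packets):
    """Order packets by fair-queuing finish timestamp.

    packets: list of (flow_id, arrival_time, length_in_bits), in arrival order.
    Returns a list of (flow_id, finish_timestamp), lowest timestamp first.
    """
    last_finish = {}  # per-flow F_{i-1}; a flow with no history starts at 0
    stamped = []
    for seq, (flow, arrival, length) in enumerate(packets):
        f = finish_time(last_finish.get(flow, 0), arrival, length)
        last_finish[flow] = f
        stamped.append((f, seq, flow))  # seq breaks ties by arrival order
    return [(flow, f) for f, seq, flow in sorted(stamped)]


# Flow 1 queues packets of 5 and 3 bits, flow 2 queues one packet of 10 bits.
print(fq_order([(1, 0, 5), (1, 0, 3), (2, 0, 10)]))
```

With these inputs the timestamps are F = 5 and F = 8 for flow 1 and F = 10 for flow 2, so both of flow 1's packets go out before flow 2's longer packet.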

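[Figure: Fair queuing examples. (a) Shortest packet first: flow 1 has packets with F = 5 and F = 8 queued, flow 2 has a packet with F = 10. (b) Flow 2’s packet with F = 10 is already being transmitted when flow 1’s packet with F = 2 arrives; the longer packet already in progress is completed first.]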

TCP Congestion Control

• Idea
– assumes a best-effort network (FIFO or FQ routers); each source determines network capacity for itself
– uses implicit feedback (host adjusts its rate based on its own knowledge)
– ACKs pace transmission (self-clocking), i.e., only allow n outstanding unACKed packets

• Challenge
– determining the available capacity in the first place
– adjusting to changes in the available capacity

• AIMD and Slow Start were the original solutions for TCP

Additive Increase/Multiplicative Decrease (AIMD)

Objective: adjust to changes in the available capacity

• New state variable per connection: CongestionWindow
– set by the source to limit the number of packets in transit
– (Recall: flow control’s AdvertisedWindow = amount of data the destination can still buffer)

MaxWin = MIN( CongestionWindow, AdvertisedWindow )
EffWin = MaxWin - ( LastByteSent - LastByteAcked )

where ( LastByteSent - LastByteAcked ) is the amount of outstanding (unacknowledged) data.

• Idea:
– increase CongestionWindow when congestion goes down
– decrease CongestionWindow when congestion goes up

• Question: how does the source determine whether or not the network is congested?

• Answer: a packet timeout occurs (i.e., an ACK is late)
– Assumes a timeout signals that a packet was dropped due to congestion (packet loss is seldom due to transmission error)
– lost packet implies congestion
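The MaxWin and EffWin formulas translate directly into code. This is a small sketch with all quantities in bytes; the function name and the clamp to zero are assumptions added here, since the slides do not say what happens when more data is outstanding than the window allows.

```python
def effective_window(congestion_window, advertised_window,
                     last_byte_sent, last_byte_acked):
    """EffWin = MIN(CongestionWindow, AdvertisedWindow) - outstanding bytes."""
    max_win = min(congestion_window, advertised_window)
    outstanding = last_byte_sent - last_byte_acked
    # Assumption: never report a negative window; the sender simply waits.
    return max(0, max_win - outstanding)


# Congestion-limited sender: 8000-byte window, 3000 bytes still unacknowledged.
print(effective_window(8000, 16000, 5000, 2000))  # 5000
```

Whichever of the two windows is smaller governs, so a sender can be limited either by the network (CongestionWindow) or by the receiver (AdvertisedWindow).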

AIMD (cont)

Algorithm:
• Each time the source successfully sends a CongestionWindow’s worth of packets, increase CongestionWindow by 1 packet (additive increase).
• Divide CongestionWindow by 2 on each timeout (multiplicative decrease), but never below one packet, i.e., one maximum segment size (MSS, measured in bytes).

• In practice, however, TCP increments a little for each ACK, using:

Increment = MSS * (MSS/CongestionWindow)
CongestionWindow += Increment

[Figure: Source/Destination timeline.]

[Figure: Trace of CongestionWindow (KB) versus time (seconds, 1.0 to 10.0), showing sawtooth behavior with AIMD.]

AIMD works well when the source is operating close to the available capacity of the network, but it takes too long to ramp up from scratch. SLOW START (ironically named) is intended to solve that using multiplicative increase.
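The per-ACK increment and the halving on timeout can be sketched as two small functions. The MSS value of 1000 bytes and the function names are assumptions for illustration only.

```python
MSS = 1000  # bytes; hypothetical segment size chosen for the example


def on_ack(cwnd):
    """Additive increase, spread over the ACKs of one window."""
    return cwnd + MSS * (MSS / cwnd)


def on_timeout(cwnd):
    """Multiplicative decrease, never below one segment."""
    return max(cwnd / 2, MSS)


cwnd = 4 * MSS
for _ in range(4):  # one window's worth of ACKs
    cwnd = on_ack(cwnd)
print(cwnd)              # close to, but slightly under, 5 * MSS
print(on_timeout(cwnd))  # roughly half of that
```

Because the window grows slightly with each ACK, the later increments are smaller, so a full window of ACKs adds a little less than one MSS.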

Slow Start (2nd mechanism provided by TCP)

• Start with CongestionWindow (CW) = 1 packet
– a slow start compared to starting with CongestionWindow = AdvertisedWindow
• Double CongestionWindow each RTT (multiplicative increase) until it reaches CongestionThreshold (CT), then increment by 1 packet per RTT.
• Used when first starting a connection and when a connection goes dead waiting for a timeout (another “start over” situation).
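The growth pattern is easy to see per RTT. The following sketch counts the window in packets and assumes the window is capped at CT when a doubling would overshoot it, a detail the slides do not specify; `slow_start_trace` is a hypothetical name.

```python
def slow_start_trace(threshold, rtts):
    """Return CongestionWindow (in packets) at the start and after each RTT."""
    cw = 1
    trace = [cw]
    for _ in range(rtts):
        if cw < threshold:
            # Multiplicative increase; assumption: do not overshoot CT.
            cw = min(cw * 2, threshold)
        else:
            # Additive increase once CT has been reached.
            cw += 1
        trace.append(cw)
    return trace


print(slow_start_trace(threshold=16, rtts=7))  # [1, 2, 4, 8, 16, 17, 18, 19]
```

Reaching 16 packets takes four RTTs here, whereas additive increase alone would need fifteen.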

Slow Start Trace:

[Figure: CongestionWindow (KB) versus time (seconds, 1.0 to 9.0). Markings along the top show timeouts, hash marks at the times when each packet is transmitted, and the times when retransmitted packets were first transmitted. A Source/Destination timeline accompanies the trace.]

Annotations on the trace, in time order:
– Multiplicative increase.
– No increase: no ACKs arriving, due to lost packets.
– Timeout: CT ← CW/2 = 17; CW ← 0.
– Multiplicative increase until CT, then additive increase.
– No increase: no ACKs arriving.
– Timeout: CT ← CW/2 = 11; CW ← 0.
– Multiplicative increase until CT, then additive increase.

Fast Retransmit

Problem: coarse-grain TCP timeouts lead to idle periods.
Fast retransmit: use duplicate ACKs to trigger retransmission.
Idea: every time a packet arrives, the receiver sends an ACK. Thus, when a packet arrives out of order (and TCP can’t ACK it because earlier packets have not yet arrived), TCP resends the last legitimate cumulative ACK (called a duplicate ACK). When the sender sees 3 duplicate ACKs, it retransmits the missing packet (the one following the last packet acknowledged).
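The sender-side rule can be sketched as a duplicate-ACK counter. This illustration numbers packets rather than bytes and assumes ACK n means "everything through packet n has arrived"; the function name is hypothetical.

```python
def fast_retransmit_triggers(acks, dup_threshold=3):
    """Return the packet numbers a sender would fast-retransmit.

    acks: cumulative ACK numbers in the order the sender receives them.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # Third duplicate: the packet after the last ACKed one is missing.
                retransmits.append(ack + 1)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# Packet 3 is lost; packets 4, 5, and 6 each cause a duplicate ACK 2.
print(fast_retransmit_triggers([1, 2, 2, 2, 2, 6]))  # [3]
```

The retransmission fires on the third duplicate, well before a coarse-grained timeout would expire.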

[Figure: Fast retransmit example between Sender and Receiver. Packets 1 through 6 are sent and packet 3 is lost; the receiver returns ACK 1 and ACK 2, then a duplicate ACK 2 for each later packet that arrives; the sender retransmits packet 3 and the receiver responds with ACK 6.]

[Figure: Trace of CongestionWindow with fast retransmit. CongestionWindow (KB) versus time (seconds, 1.0 to 7.0), with markings for the timeout, hash marks at the times when each packet is transmitted, and the times when retransmitted packets were first transmitted. Fast retransmit eliminates many of the flat areas where no packets were transmitted.]

Fast Recovery: upon congestion, rather than drop back to 0 and use Slow Start, just cut the window in half and resume additive increase.

Congestion Avoidance

• TCP’s strategy is to control congestion once it happens (repeatedly increase load to find the point at which congestion occurs, and then back off)

• Alternative strategy

– predict when congestion is about to happen

– reduce rate before packets start being discarded

– call this congestion avoidance, instead of congestion control

• Two possibilities

– router-centric: DECbit and RED Gateways

– host-centric: TCP Vegas

DECbit

• Add a congestion bit to the packet header.
• Router
– monitors average queue length over the last busy-idle cycle plus the current busy cycle
– sets the congestion bit if average queue length > 1
• End Host
– Destination echoes the bit back to the source
– Source records how many packets resulted in a set bit
– If less than 50% of the last CongestionWindow’s worth had the bit set
  • increase CongestionWindow by 1 packet
– If 50% or more of the last window’s worth had the bit set
  • decrease CongestionWindow to 7/8 (0.875) of its value.
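The source's decision rule can be sketched as follows. The function name and the representation of the echoed bits as a list of booleans are assumptions made for the example.

```python
def decbit_adjust(cwnd, echoed_bits):
    """Adjust CongestionWindow (in packets) after one window's worth of ACKs.

    echoed_bits: one boolean per packet in the last window, True if the
    congestion bit came back set.
    """
    if not echoed_bits:
        return cwnd  # nothing observed, leave the window alone
    fraction_set = sum(echoed_bits) / len(echoed_bits)
    if fraction_set < 0.5:
        return cwnd + 1    # additive increase
    return cwnd * 0.875    # multiplicative decrease to 7/8


print(decbit_adjust(8, [True] * 2 + [False] * 6))  # 9
print(decbit_adjust(8, [True] * 4 + [False] * 4))  # 7.0
```

Exactly half the bits set already counts as congestion, matching the "50% or more" rule.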

Random Early Detection (RED)

• Notification is implicit
– Router just drops the packet when congested (the TCP source will time out)
• Early random drop
– rather than wait for the queue to become completely full, drop each arriving packet with some drop probability whenever the queue length exceeds some drop level

[Figure: DECbit averaging interval. Queue length versus time; the averaging interval spans the previous cycle and the current cycle up to the current time.]

RED Details

Compute average queue length:
AvgLen = (1 - Weight) * AvgLen + Weight * SampleLen
where 0 < Weight < 1 (usually 0.002) and SampleLen = queue length each time a packet arrives.

Two queue length thresholds:
if AvgLen ≤ MinThreshold then enqueue the packet
if MinThreshold < AvgLen < MaxThreshold then calculate probability P and drop the arriving packet with probability P
if MaxThreshold ≤ AvgLen then drop the arriving packet

Computing probability P:
TempP = MaxP * (AvgLen - MinThreshold) / (MaxThreshold - MinThreshold)
P = TempP / (1 - count * TempP)
where count = number of newly arriving packets that have been queued (not dropped) while AvgLen has been between the two thresholds.
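The RED formulas can be written as two pure functions. This is a sketch under stated assumptions: the function names are hypothetical, and the handling of a non-positive denominator (treat it as a certain drop) is an added safeguard that the formulas themselves do not spell out.

```python
def update_avg(avg_len, sample_len, weight=0.002):
    """AvgLen = (1 - Weight) * AvgLen + Weight * SampleLen."""
    return (1 - weight) * avg_len + weight * sample_len


def drop_probability(avg_len, min_th, max_th, max_p, count):
    """Probability of dropping the arriving packet under RED."""
    if avg_len <= min_th:
        return 0.0  # always enqueue
    if avg_len >= max_th:
        return 1.0  # always drop
    temp_p = max_p * (avg_len - min_th) / (max_th - min_th)
    denom = 1 - count * temp_p
    if denom <= 0:
        return 1.0  # assumption: so many packets queued that a drop is due
    return min(1.0, temp_p / denom)


# Halfway between the thresholds with MaxP = 0.02: TempP is 0.01.
print(drop_probability(15, 10, 20, 0.02, count=0))
print(drop_probability(15, 10, 20, 0.02, count=50))
```

The second call shows the effect of count: the longer the router goes without dropping, the higher P becomes, which spreads drops more evenly over time.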

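[Figure: RED thresholds on a queue. MinThreshold and MaxThreshold are marked along the queue, with AvgLen the weighted running average queue length.]

[Figure: Drop probability curve. P(drop) versus AvgLen: zero up to MinThresh, rising to MaxP at MaxThresh, then 1.0.]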

TCP Vegas (host-centric congestion avoidance)

Idea: source watches for some sign that a router’s queue is building up (e.g., RTT grows; sending rate flattens).

ExpectedRate = CW / BaseRTT
– BaseRTT = minimum of all measured RTTs, typically the RTT of the 1st packet.

ActualRate
– Source calculates the current sending rate as the number of bytes transmitted during the RTT of a distinguished packet, divided by that RTT.

Diff = ExpectedRate – ActualRate

if Diff < α, increase CW linearly (roughly corresponds to too little data in the network)
else if Diff > β, decrease CW linearly (roughly corresponds to too much data in the network)
else leave CW unchanged (when α < Diff < β)
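The Vegas decision can be sketched as a single comparison step. The function name, the choice of one packet per adjustment as the "linear" step, and the requirement that α and β be expressed in the same rate units as Diff are assumptions made for this example.

```python
def vegas_adjust(cw, base_rtt, actual_rate, alpha, beta):
    """Return the new congestion window (in packets).

    cw: current window in packets; base_rtt in seconds;
    actual_rate, alpha, beta in packets per second.
    """
    expected_rate = cw / base_rtt
    diff = expected_rate - actual_rate
    if diff < alpha:
        return cw + 1  # too little extra data in the network
    if diff > beta:
        return cw - 1  # too much extra data in the network
    return cw          # alpha <= diff <= beta: leave unchanged


# Expected rate is 10 / 0.1 = 100 packets/sec in all three calls.
print(vegas_adjust(10, 0.1, actual_rate=95, alpha=10, beta=30))  # 11
print(vegas_adjust(10, 0.1, actual_rate=80, alpha=10, beta=30))  # 10
print(vegas_adjust(10, 0.1, actual_rate=60, alpha=10, beta=30))  # 9
```

Unlike AIMD, the window shrinks before any packet is lost, because a growing gap between expected and actual rate already signals queue build-up.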

TCP Vegas (trace of congestion avoidance mechanism)

Parameters: α = 1 packet, β = 3 packets

[Figure: Congestion window trace for TCP Vegas. Top panel: congestion window (KB, 10 to 70) versus time (seconds, 0.5 to 8.0). Bottom panel: sending rate (KBps, 40 to 240) versus time (seconds), showing actual throughput and expected throughput. The shaded area is the region between α and β units away from the expected throughput; the goal is to keep the actual throughput in this region. Note that the actual throughput is dragged along by the shaded region.]

QoS: Real-time Applications

• Require “deliver on time” assurances

– must come from inside the network (hosts cannot make such guarantees alone)

• Example application (audio)

– sample voice once every 125 µs

– each sample has a playback time

– packets experience variable delay in network

– add constant factor to playback time: playback point

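• Playback Buffer

[Figure: Audio application path: Microphone, Sampler and A/D converter, network, Buffer and D/A converter, Speaker.]

[Figure: Playback buffer. Sequence number versus time, with curves for packet generation, packet arrival, and playback; the gaps between the curves are the network delay and the time spent in the buffer.]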
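The playback point idea reduces to one comparison per packet. In this sketch, times are in milliseconds and the function name and sample delays are made up for illustration.

```python
def late_packets(delays_ms, playback_offset_ms):
    """Return indices of packets that arrive after their playback time.

    Packet i is generated at time t_i, arrives at t_i + delays_ms[i], and is
    played back at t_i + playback_offset_ms, so it is late exactly when its
    network delay exceeds the offset.
    """
    return [i for i, delay in enumerate(delays_ms) if delay > playback_offset_ms]


# With a 100 ms playback point only the packet delayed by 120 ms is late.
print(late_packets([30, 80, 45, 120], playback_offset_ms=100))  # [3]
```

A larger offset loses fewer packets but adds latency for every sample, which is the trade-off behind choosing the playback point.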

Integrated Services

• Refers to the body of work by the IETF working group on Integrated Services, 1995-97.
• Integrated Services allocates resources to individual flows
– whereas Differentiated Services allocates resources by “classes of traffic”
• Integrated Services service classes
– E.g., Guaranteed service (packets are never late: guaranteed maximum delay)
• Flowspecs (set of info we provide to the network to specify needs)
– Tspec
  • describes the flow’s Traffic characteristics (e.g., average bandwidth, token bucket parameters)
– Rspec
  • describes the services Requested from the network
  • E.g., guarantees such as a delay target

RSVP: Resource reSerVation Protocol

• While connection-oriented networks have setup protocols, best-effort connectionless networks don’t; they need some sort of reservation protocol in order to offer QoS.
– Internet resource reservation corresponds to signaling in ATM
– Proposed Internet standard is called RSVP
• Receiver-oriented
• 2 messages: PATH and RESV
• Source transmits PATH messages every 30 seconds to announce the flow.
• Destination responds with a RESV message, which requests the reservation back along the same path.

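[Figure: RSVP example. Sender 1 and Sender 2 send PATH messages through routers (R) toward Receiver A and Receiver B; the receivers send RESV messages back toward the senders, and the RESV messages are merged where the paths join.]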

RSVP versus ATM (Q.2931)

• RSVP
– receiver generates reservation
– soft state info used in routers (it is refreshed or timed out)
– separate from route establishment
– QoS can change dynamically

• ATM
– sender generates connection request
– hard state info (requires explicit delete at teardown)
– concurrent with route establishment
– QoS is static for the life of the connection

Differentiated Services (also IETF)

• Problem with Integrated Services: scalability
• Idea of Differentiated Services: support 2 classes of packets
– DS adds a new Premium Service class to the best-effort traffic class
• Which packets are premium?
• Use a premium bit in the header.