Packet Switches with Output and Shared Buffer
Transcript of Packet Switches with Output and Shared Buffer
Packet Switches with Output Buffers and Shared Buffer
• Packet switches with output buffers or shared buffer
• Delay Guarantees
• Fairness
• Fair Queueing
• Deficit Round Robin
• Random Early Detection
• Weighted Fair Early Packet Discard
Quality of Service: Requirements
How stringent the quality-of-service requirements are.
Buffering
Smoothing the output stream by buffering packets.
Quality of Service
• Integrated Services: Bandwidth is negotiated and the traffic is policed or shaped accordingly
• Differentiated Services: Traffic is served according to its priority: expedited forwarding (EF), assured forwarding (AF), best-effort forwarding (BE)
The Leaky Bucket Algorithm
(a) A leaky bucket with water. (b) A leaky bucket with packets.
The Leaky Bucket Algorithm
(a) Input to a leaky bucket. (b) Output from a leaky bucket. Output from a token bucket with capacities of (c) 250 KB, (d) 500 KB, (e) 750 KB. (f) Output from a 500 KB token bucket feeding a 10-MB/sec leaky bucket.
The Token Bucket Algorithm
(a) Before. (b) After.
Admission Control
An example of flow specification
Packet Switches with Output Buffers
Packet Switches with Shared Buffer
Delay Guarantees
• All flows must police their traffic: send no more than a specified amount of data within one policing interval
• E.g. a 10 Mbps flow should send at most 10 Kb within 1 ms
• If the output is not overloaded, the data is guaranteed to pass the switch within one policing interval
Fairness
• When some output is overloaded, its bandwidth should be fairly shared among the competing flows.
• What is fair? A widely adopted definition is max-min fairness.
• The simplest definition (for me) of fair service is bit-by-bit round-robin (BR).
Fairness Definitions
1. Max-min fairness:
   1) No user receives more than it requests
   2) No other allocation scheme has a higher minimum allocation (received service divided by weight w)
   3) Condition (2) recursively holds when the minimal user is removed
2. Generalized Processor Sharing (GPS): if Si(t1,t2) is the amount of traffic of flow i served in (t1,t2) and flow i is backlogged during (t1,t2), then it holds that

   Si(t1,t2) / Sj(t1,t2) ≥ wi / wj
Examples
• Link bandwidth is 10Mbps; Flow rates: 10Mbps, 30Mbps; Flow weights: 1,1; Fair shares: 5Mbps, 5Mbps
• Link bandwidth is 10Mbps; Flow rates: 10Mbps, 30Mbps; Flow weights: 4,1; Fair shares: 8Mbps, 2Mbps
• Link bandwidth is 10Mbps; Flow capacities: 4Mbps, 30Mbps; Flow weights: 3,1; Fair shares: 4Mbps, 6Mbps
• Exercise: Link bandwidth 100Mbps; Flow rates: 5,10,20,50,50,100; Flow weights: 1,4,4,2,7,2; Fair shares ?
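The weighted max-min fair shares quoted above can be computed by water-filling: repeatedly give every unsatisfied flow its weighted share of the leftover bandwidth, and freeze flows whose demand is already met. This is a minimal sketch, not code from the slides; the function name max_min_fair and the tolerance constant are my own:

```python
def max_min_fair(capacity, demands, weights):
    """Weighted max-min fair allocation by water-filling."""
    alloc = [0.0] * len(demands)
    active = set(range(len(demands)))
    remaining = float(capacity)
    while active and remaining > 1e-12:
        unit = remaining / sum(weights[i] for i in active)
        # flows whose demand is below their weighted share are satisfied
        satisfied = [i for i in active if demands[i] <= weights[i] * unit + 1e-12]
        if not satisfied:
            # nobody is satisfied: split the remainder in proportion to weights
            for i in active:
                alloc[i] = weights[i] * unit
            break
        for i in satisfied:
            alloc[i] = demands[i]
            remaining -= demands[i]
            active.remove(i)
    return alloc
```

Running it on the first slide example, max_min_fair(10, [10, 30], [1, 1]) gives the fair shares 5 Mbps and 5 Mbps, matching the slide.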
Fairness Measure
• It is obviously impossible to implement bit-by-bit round-robin
• Other, practical algorithms will not be perfectly fair; there is a trade-off between the protocol complexity and its level of fairness
• The fairness measure is defined as:

  FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj |

  where flows i and j are backlogged during (t1,t2); FM should be as low as possible
Fair Queueing (FQ)
• It is an emulation of bit-by-bit round-robin, proposed by Demers, Keshav and Shenker
• Introduce the virtual time V(t), which is the number of service rounds until time t, and is calculated (with the link capacity normalized to 1) as:

  dV/dt = 1 / Nac(t)

  where Nac(t) is the number of active flows
• Denote by Si^k the virtual time when packet k of flow i starts service, and by Fi^k the virtual time when this packet departs the switch. Its length is Li^k, and it arrives to the switch at time ti^k. It holds that

  Si^k = max( Fi^(k−1), V(ti^k) ),  Fi^k = Si^k + Li^k

• Packets are transmitted in increasing order of their virtual departure times
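The tagging rule above can be sketched in a few lines. This is a simplified illustration, not the full algorithm of Demers et al.: in particular, the virtual time V(t) is crudely approximated by the finish tag of the packet in service, and the class name FQScheduler is invented for the sketch:

```python
import heapq

class FQScheduler:
    """Sketch of FQ finish-time tagging (unweighted, capacity normalized)."""
    def __init__(self):
        self.last_finish = {}   # F_i^{k-1} per flow
        self.vtime = 0.0        # approximation of virtual time V(t)
        self.queue = []         # heap of (finish_tag, seq, flow, length)
        self.seq = 0

    def enqueue(self, flow, length):
        start = max(self.last_finish.get(flow, 0.0), self.vtime)  # S_i^k
        finish = start + length                                   # F_i^k = S_i^k + L_i^k
        self.last_finish[flow] = finish
        heapq.heappush(self.queue, (finish, self.seq, flow, length))
        self.seq += 1

    def dequeue(self):
        finish, _, flow, length = heapq.heappop(self.queue)
        self.vtime = finish   # crude virtual-time update (simplification)
        return flow, length
```

For example, if flow A enqueues two 1000-bit packets and flow B one 300-bit packet at the same time, B's small packet obtains the smallest finish tag and departs first, which is exactly the protection FQ offers to light flows.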
Examples of FQ Performance
• The performance of different end-to-end flow-control mechanisms passing through switches employing FQ has been examined by Demers et al.
• The generic flow-control algorithm uses a sliding window like TCP, and a timeout mechanism where congestion recovery starts after 2 RTT (RTT is an exponentially averaged round-trip time)
• The flow-control algorithm proposed by Jacobson and Karels is the TCP Tahoe version. It comprises: slow start, an adaptive window threshold, and careful estimation of RTT
• In the selective DECbit algorithm, switches send congestion messages to the sources that use more than their fair shares
Examples of FQ Performance
• Telnet sources send 40 B every 5 s, FTP packets are 1 KB, maximum window size is 5
• [Figure: FTP sources F1–F6 and Telnet sources T7, T8 feed a bottleneck switch B; 800 kbps input links, 56 kbps output link, buffer of 15 packets]

  Policy    F1    F2    F3    F4    F5    F6    T7    T8
  G/FIFO    18    1154  1159  3     1149  15    31    3
  G/FQ      178   838   591   600   615   621   96    98
  JK/FIFO   582   583   585   585   583   582   3     0
  JK/FQ     574   579   546   594   599   601   87    96
  DEC       582   582   582   582   582   582   99    90
  Sl DEC    582   582   582   582   582   582   105   97

  (F1–F6: FTP throughput; T7, T8: Telnet throughput)
Examples of FQ Performance
• Telnet source sends 40 B every 5 s, FTP packets are 1 KB, maximum window size is 5; the ill-behaved source sends at twice the line bit-rate
• [Figure: FTP source F1, Telnet source T2 and ill-behaved source I3 feed a bottleneck switch B; 800 kbps input links, 56 kbps output link, buffer of 20 packets]

  Policy    F1 (FTP)   T2 (Telnet)   I3 (ill-behaved)
  G/FIFO    3          11            3497
  G/FQ      3491       95            5
  JK/FIFO   0          0             3500
  JK/FQ     3489       110           6
  DEC       166        0             3334
  Sl DEC    3493       95            3
Examples of FQ Performance
• FTP packets are 1 KB, maximum window size is 5
• [Figure: multihop topology with FTP sources F1–F4 traversing switches S1–S4; 56 kbps links, buffers of 20 packets]

  Policy    F1     F2     F3     F4
  G/FIFO    2500   2500   2500   1000
  G/FQ      1750   1750   1750   1750
  JK/FIFO   2500   2500   2500   1000
  JK/FQ     1750   1750   1750   1750
  DEC       2395   2406   2377   783
  Sl DEC    1750   1750   1750   1750
Packet Generalized Processor Sharing (PGPS)
• Parekh and Gallager generalized FQ by introducing weights, and simplified it a little
• Virtual time is updated whenever there is an event in the system (an arrival or a departure). If t(j−1) and tj are consecutive event times and Bj is the set of flows backlogged in (t(j−1), tj), then for 0 ≤ τ ≤ tj − t(j−1):

  V(t(j−1) + τ) = V(t(j−1)) + τ / Σ_{i∈Bj} wi

• Virtual start and departure times of packet k of flow i are calculated as

  Si^k = max( Fi^(k−1), V(ti^k) ),  Fi^k = Si^k + Li^k / wi
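The weighted tag computation above can be sketched directly, assuming the virtual time V at each packet arrival is already known. The function name pgps_tags is invented for this illustration:

```python
def pgps_tags(packets, weights):
    """Sketch of PGPS start/finish tag computation.
    packets: list of (flow, length, V_at_arrival) in arrival order.
    Returns (flow, finish_tag) pairs; transmission follows increasing tags."""
    last_finish = {}
    tags = []
    for flow, length, v in packets:
        start = max(last_finish.get(flow, 0.0), v)   # S_i^k = max(F_i^{k-1}, V(t_i^k))
        finish = start + length / weights[flow]      # F_i^k = S_i^k + L_i^k / w_i
        last_finish[flow] = finish
        tags.append((flow, finish))
    return tags
```

With two simultaneous 1000-bit packets and weights 2 and 1, the heavier-weighted flow gets the smaller finish tag (500 vs 1000) and is transmitted first, illustrating how the weights bias the service.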
Properties of PGPS
• Theorem: For PGPS it holds that

  FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj | ≤ Lmax

  where Lmax is the maximum packet length.
• The complexity of the algorithm is O(N), because that many packets may arrive within one packet transmission time.
Deficit Round Robin
• Proposed by Shreedhar and Varghese at Washington University in St. Louis
• In DRR, flow i is assigned a quantum Qi proportional to its weight wi, and a counter ci, whose value is initially set to 0. The number of bits ti transmitted from queue i in one round-robin round must satisfy ti ≤ ci + Qi, and the counter is then set to the new value ci = ci + Qi − ti. If the queue gets emptied, ci = 0.
• The complexity of this algorithm is O(1), because only a couple of operations have to be performed per packet transmission time, provided the algorithm serves a non-empty queue whenever it visits one.
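The counter update described above can be sketched as follows; the function name drr_schedule and the fixed round count are assumptions of this sketch, not part of the original algorithm description:

```python
from collections import deque

def drr_schedule(queues, quanta, rounds):
    """Sketch of Deficit Round Robin.
    queues: list of deques of packet lengths; quanta: Q_i per flow.
    Returns the order of (flow, length) transmissions over `rounds` rounds."""
    deficit = [0] * len(queues)          # counters c_i, initially 0
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                deficit[i] = 0           # emptied queue: reset the counter
                continue
            deficit[i] += quanta[i]      # add the quantum Q_i
            # transmit head packets while they fit within c_i + Q_i
            while q and q[0] <= deficit[i]:
                pkt = q.popleft()
                deficit[i] -= pkt
                sent.append((i, pkt))
    return sent
```

Note the per-visit work is O(1): one quantum addition plus one comparison per transmitted packet, which is exactly why the quantum must be at least the maximum packet length (otherwise a visit could transmit nothing).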
Properties of DRR
• Theorem: For DRR it holds that

  FM = max | Si(t1,t2)/wi − Sj(t1,t2)/wj | ≤ 3 Lmax

  where Lmax is the maximum packet length.
• Proof: The counter satisfies ci < Lmax, because ci remains positive only if the head packet is longer than ci. It holds that Si(t1,t2) = m·Qi + ci(0) − ci(m), where m is the number of round-robin rounds and (t1,t2) is the busy interval, and therefore |Si(t1,t2) − m·Qi| < Lmax.
Properties of DRR
• Proof (cont.): Si(t1,t2)/wi ≤ (m−1)·Q + Q + Lmax/wi, and Sj(t1,t2)/wj ≥ m′·Q − Lmax/wj, where m′ is the number of round-robin rounds for flow j. Because m′ ≥ m−1, FM ≤ Q + Lmax/wi + Lmax/wj, which equals 3 Lmax for the smallest allowed quantum Q = Lmax, since wi, wj ≥ 1 and Q ≥ Lmax is required for the protocol to have complexity O(1). Namely, if Q < Lmax it may happen that a queue is not served when the round-robin pointer points to it, and the complexity of the algorithm becomes larger than O(1): each queue visit incurs a comparison, and up to N queues may be visited per packet transmission.
Properties of DRR
• The maximum delay in BR is N·Lmax/B. In DRR, an incoming packet might have to wait ΣiQi/B, and its maximum delay is N·Qmax/B. So the ratio of the DRR delay to the ideal delay is Qmax/Lmax = Qmax/Qmin = wmax/wmin, and it may be significant if the fairness granularity has to be very fine.
• Shreedhar and Varghese propose to serve delay-sensitive traffic with reservations, and to police it.
Packet Discard
• The first schemes discard packets arriving to a full buffer, or to a buffer in which the number of queued packets exceeds some specified threshold.
• They are biased against bursty traffic, because the probability that a packet of a burst is discarded increases with the burst length.
• TCP sources whose packets are discarded slow down their rates and underutilize the network. If all sources get synchronized, the network throughput oscillates and the efficiency becomes low.
Random Early Detection (RED)
• Floyd and Jacobson introduced two thresholds for the queue length in the random early detection (RED) algorithm.
• When the queue length exceeds the low threshold but is below the high threshold, packets are dropped with a probability that increases with the queue length. The probability is calculated so that the dropped packets are roughly equally spaced.
• When the queue length exceeds the higher threshold, all incoming packets are dropped.
• The queue length is calculated as an exponentially weighted moving average, so it depends on the instantaneous queue length and on past values of the queue length.
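A simplified sketch of the RED drop decision and the average-queue update; the parameter names and the default weight 0.002 follow common descriptions of RED, but this is an illustration, not the authors' exact pseudocode:

```python
import random

def red_drop(avg_queue, min_th, max_th, max_p, count):
    """Simplified RED drop decision.
    avg_queue: EWMA of the queue length; count: packets since last drop.
    Returns True if the arriving packet should be dropped."""
    if avg_queue < min_th:
        return False                      # below low threshold: never drop
    if avg_queue >= max_th:
        return True                       # above high threshold: always drop
    # between thresholds: probability grows linearly with the average queue
    p_b = max_p * (avg_queue - min_th) / (max_th - min_th)
    # spreading factor: the more packets since the last drop, the likelier
    p_a = p_b / max(1e-9, 1.0 - count * p_b) if count * p_b < 1 else 1.0
    return random.random() < p_a

def update_avg(avg_queue, q, weight=0.002):
    """EWMA update of the average queue length."""
    return (1 - weight) * avg_queue + weight * q
```

The 1/(1 − count·p_b) correction is what makes the drops roughly equally spaced between the thresholds, and the small EWMA weight is what lets short-term bursts pass without triggering drops.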
Motivation for RED
• Global synchronization is avoided by making a softer decision on packet dropping, i.e. by using two thresholds, and by evenly spacing the packet drops between the thresholds.
• Calculating the queue length as an exponentially weighted moving average allows short-term bursts, because they do not trigger packet drops.
• The authors also argue that fair queueing is not required, because the flows sending more traffic will lose more packets. But it was shown in subsequent papers that the fairness is not satisfactory, because the flows are not isolated.
Severe Criticism of RED
• Bonald, May and Bolot severely criticize RED. They analyzed RED and Tail Drop.
• Removing the bias against bursty traffic means higher drop probabilities for UDP traffic, because TCP traffic dominates.
• The average number of consecutive dropped packets is higher for RED, and so, they claim, is the possibility of synchronization.
• They show that the jitter introduced by RED is higher.
Weighted Fair Early Packet Discard (WFEPD)
• Racz, Fodor and Turanyi proposed the WFEPD protocol to ensure fair throughput to different flows
• The average rate of flow i is calculated as a moving average

  ri^av = (1 − q)·ri^av + q·ci/τ

  where ci is the number of bytes that arrived in the last interval of length τ
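The moving-average rate update above can be written directly; the gain value q = 0.1 used as a default here is an assumption, not a value from the slides:

```python
def update_rate(r_avg, bytes_arrived, tau, q=0.1):
    """EWMA flow-rate estimate: r_av = (1 - q) * r_av + q * c_i / tau.
    q is the smoothing gain (assumed default), tau the measurement interval."""
    return (1 - q) * r_avg + q * (bytes_arrived / tau)
```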
Weighted Fair Early Packet Discard (WFEPD)
• Violating, non-violating and pending sources are determined based on their rates
• Flows are ordered so that

  r1^av/w1 ≥ r2^av/w2 ≥ … ≥ rN^av/wN

• If the first k−1 flows are violating, and E = Σ_{i=1..N} ri^av − R is the rate in excess of the output rate R, then the aggregate bandwidth of the violating flows is

  Rv = R − Σ_{i=k..N} ri^av = Σ_{i=1..k−1} ri^av − E
Weighted Fair Early Packet Discard (WFEPD)
• If kmin is the minimal k for which the inequality

  rk^av ≤ wk · ( Σ_{j=1..k−1} rj^av − E ) / Σ_{j=1..k−1} wj

  is satisfied, then all flows before kmin are violating, and they get the scheduled rates:

  ri^sch = wi · ( Σ_{j=1..kmin−1} rj^av − E ) / Σ_{j=1..kmin−1} wj
Weighted Fair Early Packet Discard (WFEPD)
• If pmin is the largest p for which the inequality holds:

  rp^av ≥ thmin · wp · ( Σ_{j=1..p−1} rj^av − E ) / Σ_{j=1..p−1} wj

• If pmax is the minimal integer that satisfies:

  rp^av ≤ thmax · rp^sch

• Here 0 < thmin < 1 and thmax > 1. Flows from pmin to pmax are pending, and their packets are dropped with a probability which linearly increases with the flow rate
Examples of WFEPD Performance
• WFEPD is fair to TCP flows and assigns bandwidth according to the weights, unlike FIFO queueing with the early packet discard (EPD) protocol
• It isolates misbehaving UDP flows that overload the output port, giving them almost equal shares as the TCP flows with equal weights, while FIFO queueing leaves the TCP flows only the remaining bandwidth
• It gives equal shares to TCP flows with different round-trip times (RTT) and equal weights, while FIFO queueing gives three times more bandwidth to the flows with a three times shorter RTT
References
• A. Demers, S. Keshav, and S. Shenker, “Analysis and simulation of a fair queueing algorithm,” Internetworking: Research and Experience, vol. 1, 1990.
• A. Parekh and R. Gallager, “A generalized processor sharing approach to flow control in integrated services networks: The single-node case,” IEEE/ACM Transactions on Networking, vol. 1, no. 3, June 1993.
• M. Shreedhar and G. Varghese, “Efficient fair queueing using deficit round robin,” IEEE/ACM Transactions on Networking, vol. 4, no. 3, 1996.
• J. Bennett and H. Zhang, “Hierarchical packet fair queueing algorithms,” IEEE/ACM Transactions on Networking, vol. 5, no. 5, October 1997.
• S. Floyd and V. Jacobson, “Random early detection gateways for congestion avoidance,” IEEE/ACM Transactions on Networking, vol. 1, no. 4, August 1993, pp. 397-413.
• T. Bonald, M. May, and J.-C. Bolot, “Analytic evaluation of RED performance,” INFOCOM 2000, March 2000, pp. 1415-1424.
• A. Racz, G. Fodor, and Z. Turanyi, “Weighted fair early packet discard at an ATM switch output port,” INFOCOM 1999, pp. 1160-1168.