Distributed Virtual-Time Scheduling in Rings (DVSR)
Chun-Hung Chen
2004.04.30
National Taipei University of Technology
Outline
• RPR Recall
• Problems in RPR
• Ring Ingress Aggregated with Spatial Reuse Fairness (RIAS)
• Distributed Virtual-Time Scheduling in Rings (DVSR)
• Simulation Results
• Conclusions
RPR Recall
• RPR stands for Resilient Packet Ring, which is in IEEE 802.17 draft state
• Dual-ring structure with a destination-strip mechanism
• Traffic is classified into three classes:
  – Class A (A0 or A1)
  – Class B (CIR or EIR)
  – Class C
• When congested, a station computes its approximate fair rate by
  – dividing the available bandwidth among all upstream stations that are currently sending frames through this station, or
  – using its own current add rate
• Two operation modes
  – Conservative mode: a congested station waits one FRTT (fairness round-trip time) before sending a new fair rate, if it is still congested
  – Aggressive mode: a congested station sends a new fair rate every 100 μs while it is still congested
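The two ways of forming the approximate fair rate listed above can be sketched as follows. This is a minimal illustration only: the function names and the example numbers are assumptions, not part of the IEEE 802.17 draft.

```python
def estimate_by_division(available_bw: float, active_upstream_stations: int) -> float:
    """Split the available bandwidth evenly among the upstream stations
    that are currently sending frames through this station."""
    if active_upstream_stations <= 0:
        raise ValueError("need at least one active upstream station")
    return available_bw / active_upstream_stations


def estimate_by_add_rate(add_rate: float) -> float:
    """Advertise the station's own current add rate as the fair rate."""
    return add_rate


# Hypothetical numbers: a 622 Mbps link shared by 4 active stations.
print(estimate_by_division(622.0, 4))
print(estimate_by_add_rate(1.0))
```

Note that the second estimate depends only on the congested station's own traffic, which matters when the inputs are unbalanced.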
Problems in RPR
• Single rate controller
  – Per-destination rate controller is optional
• Permanent oscillation with unbalanced constant-rate traffic inputs
  – Unbalanced traffic triggers severe and permanent oscillations
  – The computed add_rate or Capacity/Active_Stations does not reflect the true situation
• Throughput loss
  – Utilization degrades due to oscillation
  – AM & CM (aggressive and conservative modes)
• Convergence
  – Slow convergence time
• Transit traffic has priority over ingress “station” traffic
• Each node measures my_rate of its ingress traffic
• If a node is congested:
  – it sends my_rate upstream
  – upstream nodes throttle to my_rate
[Figure: ring nodes labeled with my_rate and allow_rate; the congested node throttles traffic and propagates the rate upstream, and the upstream nodes throttle traffic]
• After approximately 500 iterations, the rates converge to the fair rates in this case
The Problem with Darwin
• my_rate is NOT the ring-wide fair rate
• Example of permanent oscillation and throughput degradation in Darwin:
[Figure: six-node ring; flow (1,3) with input rate 622 Mbps and flow (2,3) with input rate 1 Mbps]
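A toy model can illustrate why advertising my_rate leads to oscillation in this example. The sketch below is not the RPR algorithm: the linear ramp, the `ramp_step` value, and the function name are assumptions made only to show the mechanism (throttle to 1 Mbps, ramp up, congest again).

```python
def simulate_toy_oscillation(capacity=622.0, downstream_add_rate=1.0,
                             upstream_demand=622.0, ramp_step=100.0, steps=40):
    """Toy model: node 2 advertises its own add rate (my_rate) whenever
    the link toward node 3 is congested; node 1 ramps up otherwise."""
    upstream_rate = upstream_demand
    history = []
    for _ in range(steps):
        history.append(upstream_rate)
        congested = upstream_rate + downstream_add_rate > capacity
        if congested:
            # Node 1 is throttled to the advertised my_rate of node 2.
            upstream_rate = downstream_add_rate
        else:
            # No congestion message: node 1 ramps back toward its demand
            # (linear ramp is an assumption of this toy model).
            upstream_rate = min(upstream_demand, upstream_rate + ramp_step)
    return history


rates = simulate_toy_oscillation()
print(min(rates), max(rates), sum(rates) / len(rates))
```

In this toy model the rate of flow (1,3) never settles, and its average stays well below the 621 Mbps it could use while flow (2,3) sends 1 Mbps.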
Modeling RPR Oscillations (Analytical and Simulation Results)
[Figure, two panels: Conservative Mode and Aggressive Mode; each shows nodes 1, 2, 3 with flow (1,3) at input rate 622 Mbps and flow (2,3) at input rate 1 Mbps]
• Model accurately matches simulation
RIAS
• Ring Ingress Aggregated with Spatial Reuse Fairness
  – Define the level of traffic granularity for fairness determination at a link as an ingress-aggregated (IA) flow
  – Ensure maximal spatial reuse subject to the first constraint
• Steps of RIAS
  – Allocate bandwidth on each link locally fair according to an ingress-aggregated granularity (IA traffic)
  – Refine the bandwidth allocation for each IA flow according to its egress point and bottlenecks
  – Reclaim unused bandwidth fairly by iterating
• Highly similar to max-min flow control
Comparison
• Proportional fair allocation
  – Penalizes flows farther away from the destination
  – Important for TCP in the Internet (rate decreases with RTT)
• Fairness with ingress-egress flow granularity
  – Incorrectly rewards nodes for spreading traffic out to many destinations versus sending it all to the hub node
Illustration of RIAS Fair (1/3)
• Parking Lot
  – 4 flows each receive rate ¼
[Figure: parking lot topology; each of the four flows is labeled 1/4]
Illustration of RIAS Fair (2/3)
• Parallel Parking Lot
  – Each flow receives rate ¼ on the downstream link
  – The left 1-hop flow fully reclaims the excess bandwidth (RIAS)
[Figure: parallel parking lot topology; the flows sharing the downstream link are labeled 1/4 and the 1-hop flow is labeled 3/4]
Illustration of RIAS Fair (3/3)
• Upstream Parallel Parking Lot
[Figure: upstream parallel parking lot topology; rate labels 3/4, 1/2, 1/4, 1/2]
• Key points:
  – Flow granularity for fairness
  – Spatial reuse
Proportional Fair
• “Proportional fairness”
  – Penalizes flows farther away from the hub
  – Important for TCP in the Internet (rate decreases with RTT)
  – TCP/GigE approximates this in the parking lot
[Figure: parking lot topology; flow rate labels .12, .16, .24, .48]
• Variants of all of these have been discussed and proposed in the RPR standard meetings
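The four rate labels on this slide (.12, .16, .24, .48) are consistent with shares that are inversely proportional to hop count in a four-flow parking lot. That reading is an assumption, not something stated on the slide; the sketch below only checks the arithmetic, and `hop_weighted_shares` is a hypothetical helper name.

```python
from fractions import Fraction


def hop_weighted_shares(hop_counts):
    """Give each flow a weight of 1/hops and normalize the weights to sum to 1."""
    weights = [Fraction(1, hops) for hops in hop_counts]
    total = sum(weights)
    return [weight / total for weight in weights]


# Flows crossing 4, 3, 2 and 1 links on the way to the hub.
shares = hop_weighted_shares([4, 3, 2, 1])
print([float(share) for share in shares])
```

Under RIAS the same four flows would each receive 1/4, independent of their distance from the hub.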
Ingress-Egress Flow Granularity
• Fairness with ingress-egress flow granularity
  – Incorrectly rewards nodes for spreading traffic out to many destinations vs. sending it all to the hub node
  – The wrong flow granularity counts 6 flows and gives each rate 1/6 (RIAS-fair: all green flows together get ¼ vs. ½)
DVSR
• Nodes construct a proxy of virtual time at the ingress-aggregated flow granularity
  – Using per-ingress byte counts
• The proxy is a lower bound on virtual time, temporally aggregated over time and spatially aggregated over traffic flows sharing the same ingress point (IA flows)
Distributed Fair Bandwidth Allocation
• Remote Fair Queuing
  – Control of upstream rate controllers via use of ingress-aggregated virtual time as a congestion message received from downstream nodes
  – Conceptually an ideal GPS processor
• Delayed and Temporally Aggregated Control Information
  – Proxy of virtual time
• Multinode RIAS Fairness
  – Three steps to approximate RIAS
Remote Fair Queuing: Single Resource Illustration
[Figure, two panels: (a) GPS server; (b) approximation with rate controllers, a MUX, and a delayed feedback path carrying v(t)]
• Control of upstream rate controllers via downstream virtual time progression
• True fair queueing is replaced with rate controllers + multiplexer
• Note: no packets are queued in the mux when the feedback delay Δ = 0
Example
• Link capacity = 1 pkt/sec
• T = 10 pkt transmission times
• b = 0.8 (fraction of time busy)
• Feedback delay Δ > 0
• The controller value at time t is set from the rates in [t-T-Δ, t-Δ]
[Figure, three panels: (a) traffic arrival for flow 1 (packet size vs. t); (b) traffic arrival for flow 2 (packet size vs. t); (c) virtual time v(t), with limiter value = 0.8]
Step I: Local Fairness
• Label nodes 1, …, N and links 1, …, N-1
• $r_{ij}$ is the traffic demand between nodes $i$ and $j$ at a particular time instant
• $r_i^n$ is the ingress-aggregated traffic from ingress node $i$ at link $n$: $r_i^n = \sum_{j>n} r_{ij}$
• The locally fair allocation on link $n$ is $R_i^n = \text{max\_min}_i(C, r_1^n, r_2^n, \ldots, r_i^n, \ldots, r_n^n)$
Footnote on max_min
• What is max_min_i( )?
  – The “textbook” definition of (locally) fair
  – Would be achieved by fair queueing if fair queueing were performed on ingress aggregates
  – The exact computation can be written down [BerGal92, p. 527]
  – Maximizes the network use allocated to the sessions with the minimum allocation
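One common way to compute a max-min fair split of a single link is water-filling: serve the smallest demands first and divide what is left equally among the rest. The sketch below is a generic version of that idea under the assumption of one link and fixed demands; it is not copied from [BerGal92], and the name `max_min` simply mirrors the slide notation.

```python
def max_min(capacity, demands):
    """Return the max-min fair allocation of `capacity` among `demands`."""
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    # Visit the demands from smallest to largest.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for position, i in enumerate(order):
        # Equal share of what is left among the flows not yet served.
        equal_share = remaining / (len(order) - position)
        allocation[i] = min(demands[i], equal_share)
        remaining -= allocation[i]
    return allocation


# Parking lot: four ingress aggregates, each demanding the full link.
print(max_min(1.0, [1.0, 1.0, 1.0, 1.0]))
# One small demand leaves more for the others.
print(max_min(1.0, [0.1, 1.0, 1.0]))
```

A flow whose demand is below the equal share keeps its demand, and the capacity it leaves unused is shared by the remaining flows.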
Step II: Ingress Fairly Sub-allocates Per-link Bandwidths
• $R_{ij}^n = \text{max\_min}_j(R_i^n, r_{i,n+1}, r_{i,n+2}, \ldots, r_{i,j}, \ldots, r_{i,N})$
• Ingress $i$ has bandwidth $R_i^n$ on link $n$ and divides it fairly among its flows traversing $n$
• End-to-end rate is the bottleneck rate: $r_{i,j} = \min_n R_{ij}^n$, $n = i, i+1, \ldots, j-1$
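A single pass of Steps I and II can be sketched on a chain of nodes as follows. This is a hedged illustration: the function names, the dictionary layout, and the use of a water-filling `max_min` helper are assumptions, and the reclaim iteration of Step III is deliberately left out.

```python
from fractions import Fraction


def max_min(capacity, demands):
    """Water-filling: max-min fair split of `capacity` among `demands`."""
    alloc = [Fraction(0)] * len(demands)
    remaining = Fraction(capacity)
    order = sorted(range(len(demands)), key=lambda k: demands[k])
    for pos, k in enumerate(order):
        alloc[k] = min(Fraction(demands[k]), remaining / (len(order) - pos))
        remaining -= alloc[k]
    return alloc


def rias_single_pass(num_nodes, capacity, demand):
    """One pass of Steps I and II; `demand` maps (i, j) -> rate with i < j."""
    per_link = {}  # (i, j, n) -> R_ij^n
    for n in range(1, num_nodes):  # link n joins node n to node n + 1
        crossing = [(i, j) for (i, j) in demand if i <= n < j]
        ingresses = sorted({i for i, _ in crossing})
        # Step I: aggregate demand per ingress, then share the link max-min.
        agg = [sum(demand[f] for f in crossing if f[0] == i) for i in ingresses]
        link_share = max_min(capacity, agg)
        # Step II: each ingress splits its share among its own flows on link n.
        for i, share in zip(ingresses, link_share):
            flows = [f for f in crossing if f[0] == i]
            for f, r in zip(flows, max_min(share, [demand[f] for f in flows])):
                per_link[(f[0], f[1], n)] = r
    # End-to-end rate: minimum over the links i, ..., j-1 that the flow crosses.
    return {(i, j): min(per_link[(i, j, n)] for n in range(i, j))
            for (i, j) in demand}


# Parallel parking lot with unit capacity and unit demands.
flows = {(1, 5): 1, (2, 5): 1, (3, 5): 1, (4, 5): 1, (1, 2): 1}
for flow, rate in sorted(rias_single_pass(5, 1, flows).items()):
    print(flow, rate)
```

After this single pass the four flows ending at node 5 hold 1/4 each, while flow (1,2) holds only 1/2; the capacity left over on link 1 is what Step III reclaims.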
Step III: Iterate
• There may be further bandwidth available for spatial reuse
  – Due to multiple congestion points
• Iterate the process such that all excess capacity is fairly reclaimed
• Set the new capacity to all unallocated capacity: $C_n = C_n - \sum_{ij} R_{ij}^n$
• Go to Step I
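As a worked instance of the reclaim step, consider the parallel parking lot with unit link capacity, flows (1,5), (2,5), (3,5), (4,5) and (1,2), and every flow demanding at least the full link. The arithmetic below assumes that the capacity subtracted on each link is the end-to-end rate of the flows crossing it; on link 1, ingress 1 first splits the link equally between its two flows, and flow (1,5) is then limited to 1/4 by link 4.

```latex
% First pass (Steps I and II):
r_{1,2} = \tfrac{1}{2}, \qquad r_{1,5} = r_{2,5} = r_{3,5} = r_{4,5} = \tfrac{1}{4}
% Unallocated capacity after the first pass:
C_1 = 1 - \tfrac{1}{2} - \tfrac{1}{4} = \tfrac{1}{4}, \qquad
C_4 = 1 - 4 \cdot \tfrac{1}{4} = 0
% Flow (1,5) is held at 1/4 by link 4, so only flow (1,2) can use
% the leftover capacity on link 1:
r_{1,2} = \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{3}{4}
```

The result matches the 3/4 and 1/4 labels of the parallel parking lot illustration.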
DVSR Protocol
• Scheduling of station versus transit packets
  – FIFO queue
  – Class A is not taken into consideration
• Feedback signal computation
• Feedback signal transmission
  – The control message is N bytes when there are N stations
  – Each station i writes its value at byte i
• Rate limit computation
  – Each station sub-allocates its per-link fair rates to the flows with different egress nodes
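The control message layout described above (N bytes for N stations, station i writing at byte i) can be sketched as below. The one-byte-per-station layout comes from the slide; the rate quantization in `quantize_rate` and both function names are hypothetical, since the slide does not say how a value is encoded.

```python
def quantize_rate(rate: float, capacity: float) -> int:
    """Hypothetical encoding: map a rate in [0, capacity] onto 0..255."""
    return max(0, min(255, round(255 * rate / capacity)))


def write_station_value(message: bytearray, station: int, value: int) -> None:
    """Station `station` (1-based) writes its one-byte value into its own slot."""
    if not 1 <= station <= len(message):
        raise ValueError("station index outside the ring")
    if not 0 <= value <= 255:
        raise ValueError("value must fit in one byte")
    message[station - 1] = value


# A ring of 4 stations: station 3 advertises 155.5 Mbps on a 622 Mbps link.
control_message = bytearray(4)
write_station_value(control_message, 3, quantize_rate(155.5, 622.0))
print(list(control_message))
```

Because every station has its own slot, one message can carry a separate value for each link, which is a richer signal than a single advertised rate.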
DVSR Protocol
• Scheduling
  – FIFO (or SP)
• Computation of feedback signal
  – Byte count for each ingress node: a lower bound of virtual time
  – Order the byte counts such that $l_1 \le l_2 \le \ldots \le l_k$
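One way to turn the ordered byte counts into a fair-share value is a water-filling pass over them. The sketch below is written in that spirit and is not the exact DVSR pseudocode: the handling of an unsaturated link (largest count plus unused capacity) and the name `fair_rate_from_counts` are assumptions.

```python
def fair_rate_from_counts(byte_counts, capacity_bytes):
    """Estimate a per-ingress fair share (bytes per measurement window)
    from the byte counts measured for each ingress node."""
    counts = sorted(byte_counts)  # l_1 <= l_2 <= ... <= l_k
    if not counts:
        return capacity_bytes
    used = sum(counts)
    if used < capacity_bytes:
        # Link not saturated (assumption): the largest sender could still
        # grow by the unused capacity.
        return counts[-1] + (capacity_bytes - used)
    remaining = capacity_bytes
    for position, count in enumerate(counts):
        share = remaining / (len(counts) - position)
        if count >= share:
            # Every remaining ingress is at or above the equal share.
            return share
        remaining -= count
    raise AssertionError("unreachable when used >= capacity_bytes")


# Saturated link: counts 10, 50, 60 against a capacity of 100 bytes per window.
print(fair_rate_from_counts([60, 10, 50], 100))
# Unsaturated link: 70 bytes of capacity are unused.
print(fair_rate_from_counts([10, 20], 100))
```

In the saturated example the small ingress keeps its 10 bytes and the two large ones are each limited to the returned share, which together fill the link.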
Analysis of DVSR
• Fairness Bound
  – Lemma 1: A node-backlogged flow in DVSR can be under-throttled by at most $(1-\frac{1}{N})CT$
  – Lemma 2: A node-backlogged flow in DVSR can be over-throttled by at most $(1-\frac{1}{N})CT$
  – Lemma 3: The service difference during any interval for two flows $i$ and $j$ with infinite demand is bounded by $2(C-\frac{1}{N}C)T$ under DVSR
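One way to read Lemma 3 is as the sum of the two one-sided bounds: in the worst case one flow is under-throttled and the other over-throttled by the maximum amount. The numbers in the second part are hypothetical (N, C and T are not given on the slide) and only show the order of magnitude.

```latex
% Sum of the bounds of Lemma 1 and Lemma 2:
\left(1 - \tfrac{1}{N}\right) C T + \left(1 - \tfrac{1}{N}\right) C T
  = 2\left(C - \tfrac{1}{N} C\right) T
% Hypothetical values: N = 10, C = 622 Mb/s, T = 0.1 ms
\left(1 - \tfrac{1}{10}\right) \cdot 622 \times 10^{6} \cdot 10^{-4}
  = 55\,980 \text{ bits}, \qquad
2 \cdot 55\,980 = 111\,960 \text{ bits}
```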
Simulation Results
• Fairness and Spatial Reuse
  – Fairness in the Parking Lot
  – Performance Isolation for TCP Traffic
  – RIAS versus Proportional Fairness for TCP Traffic
  – Spatial Reuse in the Parallel Parking Lot
• Convergence Time Comparison
Fairness in the Parking Lot
• Four constant-rate UDP flows sending at 622 Mbps
• DVSR provides RIAS fair shares
• GigE does not
[Figure: ten-node ring with flows (1,5), (2,5), (3,5), (4,5)]
[Figure: bar chart of normalized throughput per flow for DVSR and GigE. Bar labels (Mbps): DVSR 155.5 for each flow; GigE 78, 78, 155, 311]
Spatial Reuse in the Parallel Parking Lot
• DVSR is within 1% of RIAS fair rates
• GigE favors downstream flows and cannot achieve spatial reuse
• Darwin achieves spatial reuse only if using the “multi-choke” option
• CBR UDP flows sending at the link capacity
[Figure: ten-node ring, links 1 to 4, with flows (1,5), (2,5), (3,5), (4,5), (1,2)]
[Figure: bar chart of normalized throughput per flow for DVSR and GigE. Bar labels (Mbps): DVSR 157.5, 154, 155, 155.5, 464.5; GigE 52, 104, 155.5, 310.5, 310.5]
Upstream Parallel Parking Lot (Results in Unbalanced Traffic Even with Balanced Inputs)
• Darwin oscillation range is 0.25 to 0.75 and throughput loss is 14%
• Many other scenarios can result in traffic imbalances and throughput losses
• DVSR within 0.1% of RIAS
[Figure: nodes 1 to 6 with flows (1,3), (2,6), (3,6), (4,6), (5,6); plot of Darwin behavior]
RIAS vs. Proportional Fairness for TCP Traffic
• Each flow = 1 TCP micro-flow (ftp/TCP Reno)
• Rate within 1% of RIAS fair rates for 1 TCP micro-flow
• GigE tends to provide “proportional fair” rates
[Figure: ten-node ring with flows (1,5), (2,5), (3,5), (4,5)]
[Figure: bar chart of normalized throughput per flow for DVSR and GigE. Bar labels (Mbps): DVSR 150.5, 152.5, 156, 163; GigE 120.5, 138.5, 161.5, 201.5]
Convergence Time in the Parking Lot
• CBR UDP flows with rate 0.4 (248.8 Mbps)
• Flows (1,5), (2,5), (3,5), (4,5) begin transmission at times 0.0, 0.1, 0.2, and 0.3 seconds respectively
• Convergence time is 0.2 msec for DVSR and 50 msec for Darwin
• Richer feedback signal allows faster convergence
[Figure, two panels: DVSR and Gandalf]
Inter-Node Performance Isolation of TCP/UDP Traffic
• Flow (1,5) consists of TCP micro-flows
• The others are CBR UDP flows with rate 0.3
• With more TCP micro-flows, DVSR is able to achieve RIAS fairness
• Darwin performance is unknown (MAC simulator incompatible with TCP)
[Figure: ten-node ring with flows (1,5), (2,5), (3,5), (4,5)]
Conclusions
• Link capacity is not considered in RPR
• Do my_rate and forward_rate in RPR fit the bandwidth allocation?
• DVSR approximates RIAS more quickly than RPR
• RPR may perform better if its feedback mechanism is modified
Reference
• V. Gambiroza, P. Yuan, B. Balzano, Y. Liu, S. Sheafor, “Design, Analysis, and Implementation of DVSR: A Fair High-Performance Protocol for Packet Rings”, IEEE/ACM Transactions on Networking, Feb. 2004
• F. Davik, M. Yilmaz, S. Gjessing, N. Uzun, “IEEE 802.17 Resilient Packet Ring Tutorial”, IEEE Communications Magazine, Mar. 2004
• http://www.ece.rice.edu/networks/RPR/