Evaluation Of Current Switching Architectures
Submitters: Erez Rokah, Erez Goldshide
Supervisor: Yossi Kanizo
Motivation
A switch is a computer networking device that connects network segments.
Switches form the Internet infrastructure.
It is critical for switches to be fast, with high throughput, reliable, modular, and cost and power effective.
We will evaluate switches with the following measures in mind: throughput and delay (speed).
Performance Measures
Throughput
The average rate of data successfully delivered by the switch. Thus, we would like to maximize the throughput.
Delay
The average time interval between a packet's arrival at the switch and its exit (total average time spent in the system). Thus, we would like to minimize the delay.
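A minimal sketch of how the two measures could be computed from per-cell records. The `CellRecord` type and both function names are hypothetical and not taken from the project code; throughput is taken here as the fraction of offered cells that were delivered.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class CellRecord:
    """Hypothetical per-cell log entry (not from the project code)."""
    arrival_cycle: int
    departure_cycle: Optional[int]  # None means the cell was dropped


def throughput(records: Sequence[CellRecord]) -> float:
    """Fraction of offered cells that were successfully delivered."""
    if not records:
        return 0.0
    delivered = sum(1 for r in records if r.departure_cycle is not None)
    return delivered / len(records)


def average_delay(records: Sequence[CellRecord]) -> float:
    """Mean number of cycles a delivered cell spent in the switch."""
    delays = [r.departure_cycle - r.arrival_cycle
              for r in records if r.departure_cycle is not None]
    return sum(delays) / len(delays) if delays else 0.0


records = [CellRecord(0, 2), CellRecord(1, 1), CellRecord(3, None), CellRecord(4, 7)]
print(throughput(records), average_delay(records))
```

Dropped cells lower the throughput but are excluded from the delay average, since they never exit the switch.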
Switching Architectures
In order to support multiple inputs and outputs, a switch must contain buffers.
The switching architecture can be categorized by the location of the buffers:
Output Queued Switch – buffers at the outputs
Input Queued (with VOQ) Switch – buffers at the inputs
Buffered Crossbar Switch – similar to Input Queued, but with a small buffer at each cross point
Cross Point Queued Switch – only cross points have buffers
IQVOQ and XBar Switches
[Figure: an IQ (Input Queued) switch and a CICQ (Combined Input and Crosspoint Queued, i.e. Buffered Crossbar) switch, each shown as linecards connected to a switch fabric.]
Taken from: Yossi Kanizo, David Hay and Isaac Keslassy, "The Crosspoint-Queued Switch,"
Cross Point Queued Switch
[Figure: Cross Point Queued switch, showing the switch core.]
Taken from: Yossi Kanizo, David Hay and Isaac Keslassy, "The Crosspoint-Queued Switch,"
Project Goals
Study the current high-speed switch architectures
Evaluate the current switching architectures
Simulate the basic switching architectures mentioned above
Compare the simulation results with theory and mathematical analysis
Object Oriented Design (using C#)
Modular Design
Simulation Flow
Input Generator generates cells for the current cycle
Input cells are inserted into the switch queues
The switch scheduler decides which queues are processed and outputs the corresponding cells
Logger logs the current cycle results
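The per-cycle flow can be sketched as a simple loop. This is an illustrative Python sketch, not the project's C# code: `generate_cells` and `run_simulation` are hypothetical names, and a toy output-queued switch with unbounded FIFO queues stands in for the real switch and scheduler classes.

```python
import random
from collections import deque


def generate_cells(num_inputs, num_outputs, load, rng):
    """Generator step: each input gets a cell with probability `load`,
    addressed to a uniformly chosen output."""
    return [(i, rng.randrange(num_outputs))
            for i in range(num_inputs) if rng.random() < load]


def run_simulation(num_inputs, num_outputs, load, cycles, seed=0):
    rng = random.Random(seed)
    output_queues = [deque() for _ in range(num_outputs)]
    log = []
    for cycle in range(cycles):
        # 1. Input generator produces the cells of the current cycle.
        cells = generate_cells(num_inputs, num_outputs, load, rng)
        # 2. Cells are inserted into the switch queues.
        for src, dst in cells:
            output_queues[dst].append((src, cycle))
        # 3. Scheduler step (FIFO here): each non-empty queue sends one cell.
        departed = [q.popleft() for q in output_queues if q]
        # 4. Logger records the results of the current cycle.
        log.append({"cycle": cycle, "arrived": len(cells), "departed": len(departed)})
    return log


log = run_simulation(num_inputs=4, num_outputs=4, load=0.5, cycles=1000)
print(sum(e["departed"] for e in log) / sum(e["arrived"] for e in log))
```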
Design: CrossPoints, InputGenerator and General
Design: Schedulers
Design: Switches
The Switch Class
Abstract class for defining the switch object
Contains the number of inputs, number of outputs, cross points matrix and a scheduler object to handle the switch queues
Deriving classes (switches) must implement a method for handling incoming cells (inserting them into queues) and a method for clearing the switch.
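A rough Python sketch of this structure (the project itself is written in C#). The method names `handle_incoming_cells` and `clear`, the `capacity` parameter and the `dropped` counter are assumptions made for illustration; the source only states that such methods exist.

```python
from abc import ABC, abstractmethod
from collections import deque


class Switch(ABC):
    """Abstract switch: sizes, a cross points matrix and a scheduler."""

    def __init__(self, num_inputs, num_outputs, scheduler=None):
        self.num_inputs = num_inputs
        self.num_outputs = num_outputs
        self.scheduler = scheduler
        # One queue per (input, output) cross point.
        self.cross_points = [[deque() for _ in range(num_outputs)]
                             for _ in range(num_inputs)]

    @abstractmethod
    def handle_incoming_cells(self, cells):
        """Insert the (input, output) cells of one cycle into the queues."""

    @abstractmethod
    def clear(self):
        """Reset the switch to an empty state."""


class CrossPointQueuedSwitch(Switch):
    """Buffers only at the cross points; a full buffer drops the cell."""

    def __init__(self, num_inputs, num_outputs, capacity, scheduler=None):
        super().__init__(num_inputs, num_outputs, scheduler)
        self.capacity = capacity
        self.dropped = 0

    def handle_incoming_cells(self, cells):
        for src, dst in cells:
            queue = self.cross_points[src][dst]
            if len(queue) < self.capacity:
                queue.append((src, dst))
            else:
                self.dropped += 1

    def clear(self):
        for row in self.cross_points:
            for queue in row:
                queue.clear()
        self.dropped = 0
```

Keeping the abstract base free of any queueing policy is what lets each architecture decide where its buffers live.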
Implemented Switches
OutputQueuedSwitch – implements a switch with queues at the outputs.
InputQueuedVOQSwitch – implements a switch with queues at the inputs.
BufferedCrossBarSwitch – implements a switch based on the Buffered Crossbar architecture.
CrossPointQueuedSwitch – implements a switch based on the Cross Point Queued Switch architecture.
The Scheduler Class
Abstract class for defining the scheduler object
Allows modular implementation of various scheduling algorithms
Deriving classes (various schedulers) must implement the ‘get_match_from_queues’ method, according to the desired scheduling algorithm (e.g. Round Robin, First Come First Served)
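As an illustration, an abstract scheduler with one Round Robin implementation might look as follows in Python. Only the method name ‘get_match_from_queues’ comes from the slides; its signature (a list of queue lengths in, a queue index out) and the `RoundRobinScheduler` class are assumptions.

```python
from abc import ABC, abstractmethod


class Scheduler(ABC):
    @abstractmethod
    def get_match_from_queues(self, queue_lengths):
        """Return the index of the queue to serve, or None if all are empty."""


class RoundRobinScheduler(Scheduler):
    """Serves the first non-empty queue at or after a rotating pointer."""

    def __init__(self, num_queues):
        self.num_queues = num_queues
        self.pointer = 0

    def get_match_from_queues(self, queue_lengths):
        for offset in range(self.num_queues):
            idx = (self.pointer + offset) % self.num_queues
            if queue_lengths[idx] > 0:
                # Next search starts just after the queue served now.
                self.pointer = (idx + 1) % self.num_queues
                return idx
        return None


rr = RoundRobinScheduler(3)
print([rr.get_match_from_queues([1, 0, 2]) for _ in range(3)])
```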
Implemented Schedulers
OQFCFSScheduler which outputs cells from the output queues using the FIFO algorithm.
XBarRRInRROutScheduler which selects input queues and cross point buffers using the Round Robin (RR) algorithm.
IQMaximumScheduler which selects input queues using the Maximum Matching algorithm.
IQPIMMaximalScheduler which selects input queues using the Parallel Iterative Matching (PIM) algorithm.
CQLQFScheduler which selects cross point queues using the Longest Queue First (LQF) algorithm. Allows both exhaustive and non-exhaustive LQF.
CQRandomScheduler which selects cross point queues using the Random algorithm. Allows both exhaustive and non-exhaustive Random.
CQRRScheduler which selects cross point queues using the RR algorithm. Allows both exhaustive and non-exhaustive RR.
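To illustrate the difference between the exhaustive and non-exhaustive variants, here is a small Python sketch for the cross point queues of a single output. It assumes that "exhaustive" means the selected queue keeps being served until it is empty; the names `lqf_select` and `ExhaustiveLQF` and the lowest-index tie-break are hypothetical.

```python
def lqf_select(queue_lengths):
    """Non-exhaustive LQF: pick the longest non-empty queue every cycle.
    Ties go to the lowest index (an arbitrary choice for this sketch)."""
    best = None
    for idx, length in enumerate(queue_lengths):
        if length > 0 and (best is None or length > queue_lengths[best]):
            best = idx
    return best


class ExhaustiveLQF:
    """Exhaustive LQF: stay on the chosen queue until it drains,
    then choose the longest queue again."""

    def __init__(self):
        self.current = None

    def select(self, queue_lengths):
        if self.current is None or queue_lengths[self.current] == 0:
            self.current = lqf_select(queue_lengths)
        return self.current


scheduler = ExhaustiveLQF()
print(scheduler.select([1, 3, 2]))  # picks queue 1, the longest
print(scheduler.select([5, 2, 2]))  # stays on queue 1 although queue 0 is longer
```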
The InputGenerator Class
Abstract class for defining the Input Generator object
Deriving classes must implement the ‘get_next_packets’ method, according to the input we want to simulate (e.g. Bernoulli independent and identically distributed, trace driven)
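A possible shape for this class, sketched in Python with one Bernoulli IID implementation. The return type (a list of (input, output) pairs per cycle) and the interpretation of the traffic matrix (entry [i][j] is the per-cycle probability that input i receives a cell for output j, with each row summing to at most 1) are assumptions.

```python
import random
from abc import ABC, abstractmethod


class InputGenerator(ABC):
    @abstractmethod
    def get_next_packets(self):
        """Return the (input, output) cells arriving in the next cycle."""


class BernoulliInputGenerator(InputGenerator):
    def __init__(self, traffic_matrix, seed=None):
        self.traffic_matrix = traffic_matrix
        self.rng = random.Random(seed)

    def get_next_packets(self):
        packets = []
        for i, row in enumerate(self.traffic_matrix):
            # One uniform draw per input: at most one cell per input per cycle.
            u = self.rng.random()
            cumulative = 0.0
            for j, p in enumerate(row):
                cumulative += p
                if u < cumulative:
                    packets.append((i, j))
                    break
        return packets


# Uniform traffic on a 2x2 switch with load 0.5 per input.
gen = BernoulliInputGenerator([[0.25, 0.25], [0.25, 0.25]], seed=1)
print(gen.get_next_packets())
```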
Implemented Input Models
BernoulliInputGenerator which generates input cells according to a traffic matrix (as described in the background section).
TraceDrivenInputGenerator which generates input cells according to the data in a trace file.
OnOffInputGenerator which generates bursty traffic, following a commonly used on-off model.
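One common way to model on-off traffic is a two-state Markov chain per input: in the ON state a cell arrives every cycle, in the OFF state none does. The sketch below assumes that model, and also assumes that all cells of a burst share one destination; the slides do not specify either detail, and `OnOffSource` and its parameters are hypothetical.

```python
import random


class OnOffSource:
    """Two-state (ON/OFF) bursty source for a single input."""

    def __init__(self, num_outputs, p_on_to_off, p_off_to_on, seed=None):
        self.num_outputs = num_outputs
        self.p_on_to_off = p_on_to_off
        self.p_off_to_on = p_off_to_on
        self.rng = random.Random(seed)
        self.on = False
        self.destination = None

    def next_cell(self):
        """Return the destination of this cycle's cell, or None if idle."""
        if self.on:
            if self.rng.random() < self.p_on_to_off:
                self.on = False  # burst ends
        elif self.rng.random() < self.p_off_to_on:
            self.on = True  # new burst starts with a fresh destination
            self.destination = self.rng.randrange(self.num_outputs)
        return self.destination if self.on else None


source = OnOffSource(num_outputs=32, p_on_to_off=0.1, p_off_to_on=0.1, seed=7)
print([source.next_cell() for _ in range(20)])
```

Smaller transition probabilities give longer bursts and longer idle periods.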
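Simulation Result with Analytical Analysis
OQ switch with 30 inputs, 10 outputs, 1-sized output queues and FIFO scheduling. Bernoulli IID input model with p=0.5.
The Bernoulli Trials:
$$\text{NumberOfBernoulliTrials} = \text{NumberOfInputs} = 30, \qquad p = \frac{0.5}{\text{NumberOfOutputs}} = \frac{1}{20}$$
$$\text{AverageArrivalRateForEachOutput} = 30 \cdot \frac{1}{20} = 1.5$$
$$\mathrm{Bin}\left(30, \tfrac{1}{20}\right) \approx \mathrm{Pois}(1.5), \qquad P(i \text{ arrivals}) = \frac{e^{-1.5}\, 1.5^{i}}{i!}$$
$$P(\text{NoService}) = P(\text{NoArrivals}) = \frac{e^{-1.5}\, 1.5^{0}}{0!} = e^{-1.5}$$
$$P(\text{Service}) = 1 - P(\text{NoService}) = 1 - e^{-1.5}$$
$$\text{Throughput} = \text{SuccessfulCellsPercentage} = \frac{1 - e^{-1.5}}{\text{AverageArrivalRateForEachOutput}} = \frac{1 - e^{-1.5}}{1.5} \approx 0.52$$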
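The analytical value for this 30x10 OQ switch can be checked numerically. The Python sketch below (the function name is hypothetical) assumes uniform destinations, so that each input sends a cell to a given output with probability 0.5/10, and compares the exact binomial expression with its Poisson approximation.

```python
import math


def oq_single_buffer_throughput(num_inputs, num_outputs, load):
    """Throughput of an OQ switch with 1-sized output queues under
    uniform Bernoulli IID traffic: P(output is served) / arrival rate."""
    p = load / num_outputs              # per-input, per-output arrival probability
    rate = num_inputs * p               # average arrival rate for each output
    exact = (1 - (1 - p) ** num_inputs) / rate   # binomial arrivals
    poisson = (1 - math.exp(-rate)) / rate       # Poisson approximation
    return exact, poisson


exact, poisson = oq_single_buffer_throughput(30, 10, 0.5)
print(round(exact, 4), round(poisson, 4))
```

Both values round to about 0.52, so the Poisson approximation is adequate for this configuration.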
Simulation Results
32x32 CQ switch under uniform traffic (p=1)
[Figure: cq-32x32-fig-2 (iterations average). Throughput (0.91 to 1) versus buffer capacity (1 to 10) for Longest-Queue-First, Random, Round-Robin and Exhaustive Round-Robin.]
Simulation Results
32x32 CQ switch under trace-based traffic
[Figure: cq-32x32-fig-7 (iterations average). Throughput (0.85 to 1) versus buffer capacity (4, 16, 64, 256) for Longest-Queue-First, Random, Round-Robin and Exhaustive-Round-Robin.]
Simulation Results
32x1 CQ switch under on-off traffic
[Figure: cq-32x1-on-off (iterations average). Throughput (0.65 to 1) versus buffer capacity (1 to 10) for Longest-Queue-First, Random, Round-Robin and Exhaustive-Round-Robin.]
Simulation Results
Throughput (average load) of a 32x32 IQVOQ switch under uniform traffic (p=1), using the PIM matching algorithm with 10 iterations (infinite-sized queues). The resulting throughput was 0.9959.
Throughput (average load) of a 32x32 XBar switch under uniform traffic (p=1, infinite-sized input queues). The resulting throughput was 0.9978.
Throughput (average load) of a 32x32 XBar switch under trace-based traffic (infinite-sized input queues). The resulting throughput was 0.982.
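For reference, the PIM (Parallel Iterative Matching) scheduler used for the IQVOQ result works in request, grant and accept rounds. The following Python sketch shows the general technique, not the project's implementation; `pim_match` and its arguments are hypothetical names.

```python
import random


def pim_match(voq_lengths, iterations, rng):
    """Return a dict input -> output forming a matching.
    voq_lengths[i][j] is the number of cells queued at input i for output j."""
    num_inputs = len(voq_lengths)
    num_outputs = len(voq_lengths[0])
    match = {}
    matched_outputs = set()
    for _ in range(iterations):
        grants = {}  # input -> outputs that granted it in this round
        for j in range(num_outputs):
            if j in matched_outputs:
                continue
            # Request: every unmatched input with a cell for j requests it.
            requesters = [i for i in range(num_inputs)
                          if i not in match and voq_lengths[i][j] > 0]
            # Grant: the output picks one requester uniformly at random.
            if requesters:
                grants.setdefault(rng.choice(requesters), []).append(j)
        if not grants:
            break  # no further pair can be added: the matching is maximal
        # Accept: each granted input picks one grant uniformly at random.
        for i, offered in grants.items():
            j = rng.choice(offered)
            match[i] = j
            matched_outputs.add(j)
    return match


voq = [[1, 1, 0], [1, 0, 0], [0, 0, 2]]
print(pim_match(voq, iterations=10, rng=random.Random(0)))
```

Every round with at least one grant adds at least one pair, so a handful of iterations usually reaches a maximal (not necessarily maximum) matching.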
Conclusions
All switches, provided that the queues’ capacity is high enough, can reach near 100% throughput.
Exhaustive scheduling algorithms are best when using small queues with bursty incoming traffic.
The “Longest Queue First” algorithm gives the overall best results for a Crosspoint Queued Switch.
Future Development
The goal of this project was to provide modular code for implementing and running simulations of various switching architectures.
The modularity of the code allows future development of:
More switching architectures.
More scheduling algorithms.
More input models.
It is also possible to develop a GUI for running simulations and displaying their results, and even the real-time flow of cells.
Literature
Yossi Kanizo, David Hay and Isaac Keslassy, “The Crosspoint-Queued Switch”, Technical Report TR08-04, Comnet, Technion, Israel (article and slides).
046993 – High Speed Networks course slides, Electrical Engineering Department, Technion, Israel.
Google.
“Solver for the Maximum Weight Matching Problem”. Available: http://elib.zib.de/pub/Packages/mathprog/matching/weighted/
Thank you for your attention
© Erez & Erez