iSLIP Switch Scheduler Ali Mohammad Zareh Bidoki April 2002
Table of Contents
  The place of buffers in crossbar switches
  Example of fabrics
  PIM
  iSLIP (in Cisco 12000, 5 Gb/s router, and Tiny Tera, 0.5 Tb/s)
  RRM
  WFA
  PP_VOQ
  Multicasting
  A 2.5 Tb/s router
The place of Buffer in Crossbar
  Output buffer
  Shared buffer
  Input buffer
Interconnects: Two basic techniques
  Input Queueing: usually a non-blocking switch fabric (e.g. crossbar)
  Output Queueing: usually a fast bus
Interconnects: Input Queueing with Crossbar
[Figure: crossbar with data in, data out, and an arbiter that sets the configuration]
Memory b/w = 2R
Input Queueing: Head of Line Blocking
[Figure: delay versus load, with load-axis marks at 58.6% and 100%]
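The 58.6% mark can be checked with a small simulation. The sketch below is illustrative only and assumes a saturated switch (every input always has a head-of-line cell), uniformly random destinations, and a random choice among contending inputs; the function name and parameters are made up for this example. Under those assumptions the known large-N limit is 2 - sqrt(2), about 58.6%, and a 2x2 switch gives 75%.

```python
import random

def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Estimate the saturation throughput of a FIFO input-queued switch.

    Every input always has a head-of-line (HOL) cell whose destination is
    drawn uniformly at random. Only HOL cells can be served.
    """
    rng = random.Random(seed)
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]  # HOL destination per input
    served = 0
    for _ in range(n_slots):
        # Group the inputs by the output their HOL cell wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves one contender chosen at random.
        for out, inputs in contenders.items():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # the next cell moves to the head
            served += 1
        # Losers keep their HOL cell and block the cells queued behind it.
    return served / (n_ports * n_slots)

print(hol_saturation_throughput(2, 20000))
print(hol_saturation_throughput(32, 5000))
```

The second value should sit a little above 58.6%, since the limit is only reached as the port count grows.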
Head of Line Blocking
Virtual Output Queuing
[Figure: input port 1 holds one queue per output port (port 1 queue, port 2 queue, ..., port n queue); a queue scheduler feeds the crossbar switch fabric]
Input Queueing: Virtual Output Queues
[Figure: delay versus load, with the load-axis mark at 100%]
Which is better?
  Virtual output queue (input queue).
  Ideal output queue.
Input Queueing: Virtual output queues
Arbiter: Complex!
VOQ, arbiter, input memory management
Problem Definition (bipartite)
Maximum or Maximal matching
[Figure: request graph, a maximum matching, and a maximal matching]
Maximum or Maximal matching
Maximum matching
  Maximizes instantaneous throughput
  Can cause starvation
  Time complexity is very high in hardware (O(n^3))
Maximal matching
  No connection can be added to the current match without altering existing connections
  More practical (e.g. WFA, PIM, iSLIP, DRR, RRM)
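The difference between the two can be seen on a tiny request graph. The sketch below is illustrative only: the function names and the request list are made up for this example, and the greedy scan is just one simple way to obtain a maximal matching, not one of the schedulers named above.

```python
def greedy_maximal_matching(requests):
    """Build a maximal matching by scanning the request edges once.

    requests: list of (input, output) pairs. An edge is added whenever both
    endpoints are still free, so no further edge can be added at the end.
    """
    used_inputs, used_outputs, match = set(), set(), []
    for inp, out in requests:
        if inp not in used_inputs and out not in used_outputs:
            match.append((inp, out))
            used_inputs.add(inp)
            used_outputs.add(out)
    return match

def maximum_matching_size(requests):
    """Size of a maximum matching, found with augmenting paths."""
    adj = {}
    for inp, out in requests:
        adj.setdefault(inp, []).append(out)
    owner = {}  # output -> input currently matched to it

    def augment(inp, seen):
        for out in adj.get(inp, []):
            if out in seen:
                continue
            seen.add(out)
            # Take a free output, or re-route the input that holds this one.
            if out not in owner or augment(owner[out], seen):
                owner[out] = inp
                return True
        return False

    return sum(augment(inp, set()) for inp in adj)

requests = [(0, 1), (0, 0), (1, 1)]
print(greedy_maximal_matching(requests))  # [(0, 1)]
print(maximum_matching_size(requests))    # 2
```

Here the greedy result is maximal (neither remaining request can be added) but has only one connection, while the maximum matching has two.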
Matching Algorithms
We will discuss three different matching algorithms:
1. PIM - Parallel Iterative Matching
2. RRM – Round-Robin Matching
3. iSLIP – Iterative Serial-Line IP (based on PIM and RRM)
Each algorithm is evaluated by four parameters:
1. Latency (throughput).
2. Starvation free.
3. Fast.
4. Implementation.
PIM - Parallel Iterative Matching
The basic matching algorithm. Each iteration of the algorithm follows these three steps:
1. Request - Each unmatched input sends a request to every output for which it has a queued cell.
2. Grant - If an unmatched output receives any requests, it grants one by selecting a request uniformly at random among all the requests it received.
3. Accept - If an input receives any grants, it accepts one by selecting an output uniformly at random among those that granted to this input.
When no new matching can be found, the algorithm stops.
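The request, grant, and accept steps of PIM can be sketched as follows. This is a minimal illustration, not a hardware design: the function name, the representation of requests as one set of outputs per input, and the optional iteration limit are assumptions made for this example.

```python
import random

def pim_match(requests, n_ports, max_iterations=None, rng=None):
    """One cell time of PIM on an n_ports x n_ports switch.

    requests[i] is the set of outputs for which input i has a queued cell.
    Returns a dict mapping each matched input to its output.
    """
    rng = rng or random.Random()
    match = {}                 # input -> output
    matched_outputs = set()
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        iteration += 1
        # Step 1, Request: unmatched inputs request the outputs they have
        # cells for. Requests to already matched outputs are dropped here,
        # which is equivalent to those outputs not granting.
        received = {}
        for inp in range(n_ports):
            if inp in match:
                continue
            for out in requests[inp]:
                if out not in matched_outputs:
                    received.setdefault(out, []).append(inp)
        if not received:
            break              # no new match can be found
        # Step 2, Grant: each requested output picks one request at random.
        grants = {}
        for out, inputs in received.items():
            grants.setdefault(rng.choice(inputs), []).append(out)
        # Step 3, Accept: each granted input picks one grant at random.
        for inp, outs in grants.items():
            out = rng.choice(outs)
            match[inp] = out
            matched_outputs.add(out)
    return match

requests = [{0, 1}, {0}, {1, 2}, set()]   # hypothetical 4x4 request pattern
print(pim_match(requests, 4, rng=random.Random(1)))
```

Run without an iteration limit, the loop ends only when no unmatched input can request an unmatched output, so the result is a maximal matching.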
PIM
  On average, each iteration eliminates at least ¾ of the remaining unresolved connections
  Converges in O(log N) iterations
  No input queue is starved of service
  No memory or state is used: at the beginning of each cell time, the match starts over, independently of the matches that were made in previous cell times
  PIM does not perform well for a single iteration: it limits the throughput to approximately 63%, only slightly higher than for a FIFO switch. This is because the probability that an input will remain ungranted is $((N-1)/N)^N$; hence, as N increases, the throughput tends to $1 - 1/e$, about 63%
  Hard to implement in hardware
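The 63% figure follows directly from the expression for the ungranted probability. The short calculation below is a sketch that assumes every output grants one of the N inputs uniformly at random and independently; the function name is made up for this example.

```python
import math

def single_iteration_grant_probability(n_ports):
    """Probability that a given input receives at least one grant when each
    of n_ports outputs grants one of n_ports inputs uniformly at random."""
    return 1.0 - (1.0 - 1.0 / n_ports) ** n_ports

for n in (2, 4, 16, 64, 1024):
    print(n, round(single_iteration_grant_probability(n), 4))
print("limit", round(1.0 - 1.0 / math.e, 4))
```

The value starts at 0.75 for N = 2 and falls toward 1 - 1/e as N grows.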
RRM – Round-Robin Matching
1. Request - Each unmatched input sends a request to every output for which it has a queued cell.
2. Grant - If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input.
[Figure: 4x4 example showing the grant pointers g1, g2, g4 and the accept pointers a1, a3, a4]
RRM – Round-Robin Matching
1. Request - Each unmatched input sends a request to every output for which it has a queued cell.
2. Grant - If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input.
3. Accept - If an input receives a grant, it accepts the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The pointer a_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the accepted output.
[Figure: 4x4 example showing the accept pointers a1, a3, a4 and the grant pointers g1, g2, g4]
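One cell time of RRM can be sketched as follows. This is an illustration under assumptions: ports are numbered from 0, requests are given as one set of outputs per input, and the function names and the request pattern are made up for this example rather than taken from the slides' figure.

```python
def next_in_round_robin(candidates, pointer, n_ports):
    """Return the first candidate at or after `pointer`, wrapping modulo n_ports."""
    for offset in range(n_ports):
        port = (pointer + offset) % n_ports
        if port in candidates:
            return port
    return None

def rrm_cell_time(requests, grant_ptr, accept_ptr):
    """One single-iteration RRM cell time; the pointer lists are updated in place.

    requests[i] is the set of outputs requested by input i.
    Returns a dict mapping each matched input to its output.
    """
    n_ports = len(requests)
    # Step 2, Grant: each output picks the requesting input next in its schedule.
    grants = {}  # input -> set of outputs that granted to it
    for out in range(n_ports):
        requesters = {i for i in range(n_ports) if out in requests[i]}
        inp = next_in_round_robin(requesters, grant_ptr[out], n_ports)
        if inp is not None:
            grants.setdefault(inp, set()).add(out)
            grant_ptr[out] = (inp + 1) % n_ports   # moved even if not accepted
    # Step 3, Accept: each input picks the granting output next in its schedule.
    match = {}
    for inp, outs in grants.items():
        out = next_in_round_robin(outs, accept_ptr[inp], n_ports)
        match[inp] = out
        accept_ptr[inp] = (out + 1) % n_ports
    return match

grant_ptr, accept_ptr = [0, 0, 0, 0], [0, 0, 0, 0]
requests = [{0, 1}, {0, 2, 3}, {1, 3}, {3}]  # hypothetical request pattern
print(rrm_cell_time(requests, grant_ptr, accept_ptr))  # {0: 0, 1: 2}
print(grant_ptr, accept_ptr)  # [1, 1, 2, 2] [1, 3, 0, 0]
```

Outputs 1 and 3 move their grant pointers even though their grants were not accepted, which is the behaviour that iSLIP later changes.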
RRM – Round-Robin Matching
RRM is not starvation free. In the following example, we assume there are always cells waiting to be transferred. The destination is always the same.
[Figure: 3x3 example, first cycle, showing the grant pointers g1, g2, g3 and the accept pointers a1, a2, a3]
RRM – Round-Robin Matching
[Figure: the same 3x3 starvation example, second cycle, showing the grant pointers g1, g2, g3 and the accept pointers a1, a2, a3]
At this point the sequence of events repeats itself. Outputs 1 and 3 will always grant input 1. Output 2 will also grant input 1 at the first iteration of each cycle, but input 1 will select output 1 indefinitely, leaving output 2 to grant either input 2 or input 3. Thus the cell from input 1 to output 2 will never be served. The iSLIP algorithm was developed to solve this starvation.
RRM
RRM overcomes two problems of PIM:
  Complexity: the round-robin arbiters are much simpler and can perform faster than random arbiters.
  Unfairness: the rotating priority aids the algorithm in assigning bandwidth equally and more fairly among requesting connections.
Its throughput is about 63%.
In a 2x2 switch with the RRM algorithm under heavy load, synchronization of the output arbiters leads to a throughput of just 50%.
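The 50% figure for the 2x2 switch can be reproduced with a few lines. The sketch below assumes the heaviest possible load (every input always has cells for every output), a single iteration per cell time, and all pointers starting at port 0; the function name is made up for this example.

```python
def rrm_saturated_throughput(n_ports, n_slots):
    """Single-iteration RRM when every input always has cells for every output."""
    grant_ptr = [0] * n_ports
    accept_ptr = [0] * n_ports
    transferred = 0
    for _ in range(n_slots):
        # Grant: with all inputs requesting, each output grants the input
        # at its pointer.
        grants = {}
        for out in range(n_ports):
            inp = grant_ptr[out]
            grants.setdefault(inp, []).append(out)
            grant_ptr[out] = (inp + 1) % n_ports      # RRM: updated unconditionally
        # Accept: each input takes the granting output closest to its pointer.
        for inp, outs in grants.items():
            out = min(outs, key=lambda o: (o - accept_ptr[inp]) % n_ports)
            accept_ptr[inp] = (out + 1) % n_ports
            transferred += 1
    return transferred / (n_ports * n_slots)

print(rrm_saturated_throughput(2, 1000))  # 0.5
```

Both grant pointers move together in every cell time, so the two outputs always grant the same input and only one cell is transferred per cell time.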
Performance
Synchronization
iSLIP – Iterative Serial-Line IP
2. Grant - If an output receives any requests, it chooses the one that appears next in a fixed, round-robin schedule starting from the highest priority element. The output notifies each input whether or not its request was granted. The pointer g_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input if and only if the grant is accepted in Step 3 of the first iteration.
[Figure: 4x4 example showing the grant pointers g1, g2, g4 and the accept pointers a1, a3, a4]
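The effect of moving the grant pointer only when the grant is accepted can be seen in a short simulation. The sketch below assumes a fully loaded switch (every input always requests every output), a single iteration per cell time, and all pointers starting at port 0; the function name is made up for this example.

```python
def islip_saturated_matches(n_ports, n_slots):
    """Single-iteration iSLIP with every input requesting every output.

    Returns the number of cells transferred in each cell time.
    """
    grant_ptr = [0] * n_ports
    accept_ptr = [0] * n_ports
    per_slot = []
    for _ in range(n_slots):
        # Grant: each output offers a grant to the input at its pointer,
        # but does not move the pointer yet.
        grants = {}
        for out in range(n_ports):
            grants.setdefault(grant_ptr[out], []).append(out)
        # Accept: each input takes the granting output closest to its pointer.
        for inp, outs in grants.items():
            out = min(outs, key=lambda o: (o - accept_ptr[inp]) % n_ports)
            accept_ptr[inp] = (out + 1) % n_ports
            grant_ptr[out] = (inp + 1) % n_ports   # only the accepted grant moves g
        per_slot.append(len(grants))
    return per_slot

print(islip_saturated_matches(2, 4))  # [1, 2, 2, 2]
print(islip_saturated_matches(4, 6))  # [1, 2, 3, 4, 4, 4]
```

Outputs whose grants were refused keep their pointers, so the grant pointers drift apart and, after a few cell times, every output grants a different input.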
iSLIP properties
Property 1. Lowest priority is given to the most recently made connection. If input i successfully connects to output j, both a_i and g_j are updated, and the connection from input i to output j becomes the lowest priority connection in the next cell time.
Property 2. No connection is starved. This is because an input will continue to request an output until it is successful. The output will serve at most N-1 other inputs first, waiting at most N cell times to be accepted by each input. Therefore, a requesting input is always served in less than N^2 cell times.
Property 3. Under heavy load, all queues with a common output have the same throughput. This is a consequence of Property 2: the output pointer moves to each requesting input in a fixed order, thus providing each with the same throughput.
iSLIP properties
  Simple to implement in hardware
  Starvation free
  Its throughput is about 100% for uniform traffic
  It is fair
  As the load increases, the number of synchronized arbiters decreases (see Figure), leading to a large-sized match.
  Under uniform 100% offered load, the iSLIP arbiters adapt to a time-division multiplexing scheme.
  It converges in O(1)
Bursty Arrivals
Burstiness Reduction
Results indicate that iSLIP reduces the average burst length, and will tend to be more burst-reducing as the offered load increases.
This is because the probability of switching between multiple connections increases as the utilization increases.
As the load increases, the contention increases and bursts are interleaved at the output. In fact, if the offered load exceeds approximately 70%, the average burst length drops to exactly one cell.
Burstiness Reduction
Multiple Iteration
The pointer g_i to the highest priority element of the round-robin schedule is incremented (modulo N) to one location beyond the granted input if and only if the grant is accepted in Step 3 of the first iteration.
Note that pointers g_i and a_i are only updated for matches found in the first iteration.
It converges in O(log N) iterations on average.
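A cell time with several iterations can be sketched as follows. This is a minimal illustration, assuming ports numbered from 0 and requests given as one set of outputs per input; the function names and the request pattern are made up for this example.

```python
def first_from(candidates, pointer, n_ports):
    """First member of `candidates` at or after `pointer`, wrapping modulo n_ports."""
    return min(candidates, key=lambda p: (p - pointer) % n_ports)

def islip_cell_time(requests, grant_ptr, accept_ptr, iterations):
    """One cell time of iSLIP with several iterations; pointers updated in place."""
    n_ports = len(requests)
    match = {}              # input -> output
    taken = set()           # outputs already matched
    for it in range(iterations):
        # Request and Grant among the still-unmatched inputs and outputs.
        grants = {}
        for out in range(n_ports):
            if out in taken:
                continue
            requesters = [i for i in range(n_ports)
                          if i not in match and out in requests[i]]
            if requesters:
                inp = first_from(requesters, grant_ptr[out], n_ports)
                grants.setdefault(inp, []).append(out)
        if not grants:
            break           # no new match can be found
        # Accept; pointers move only for matches made in the first iteration.
        for inp, outs in grants.items():
            out = first_from(outs, accept_ptr[inp], n_ports)
            match[inp] = out
            taken.add(out)
            if it == 0:
                grant_ptr[out] = (inp + 1) % n_ports
                accept_ptr[inp] = (out + 1) % n_ports
    return match

grant_ptr, accept_ptr = [0, 0, 0, 0], [0, 0, 0, 0]
requests = [{0, 1}, {0, 2, 3}, {1, 3}, {3}]  # hypothetical request pattern
print(islip_cell_time(requests, grant_ptr, accept_ptr, iterations=4))
print(grant_ptr, accept_ptr)  # [1, 0, 2, 0] [1, 3, 0, 0]
```

In this example the first iteration matches two inputs and the later iterations add two more, while only the two first-iteration matches move any pointers.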
Multiple Iteration
All with 4 iterations
Implementation
Implementation (2N arbiters)
Implementation (N arbiters)
Each arbiter is used for both input and output arbitration. In this case, each arbiter contains two registers to hold pointers g_i and a_i.
Implementation
Priority in iSLIP
Why is iSLIP good for high speed?
  Input buffers are separated
  Separate scheduler for each input and output
  Each works independently
Multicasting
  Fanout splitting: higher throughput, but not as simple
  Non-fanout splitting: easy, but low throughput
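The two multicast policies can be contrasted on a single cell. The sketch below follows the usual definitions (with fanout splitting a cell may be delivered to part of its destination set and the rest waits; without it, the cell is sent only when all its destinations are free). The function name and arguments are made up for this example and do not describe any particular scheduler.

```python
def serve_multicast(fanout, free_outputs, fanout_splitting):
    """Serve one head-of-line multicast cell in one cell time.

    fanout: set of outputs the cell must still reach.
    free_outputs: set of outputs not yet claimed in this cell time.
    Returns (outputs served now, residue still waiting).
    """
    available = fanout & free_outputs
    if fanout_splitting:
        served = available                                   # deliver to whatever is free
    else:
        served = fanout if available == fanout else set()    # all or nothing
    return served, fanout - served

print(serve_multicast({0, 2, 3}, {0, 1, 2}, True))
print(serve_multicast({0, 2, 3}, {0, 1, 2}, False))
```

With splitting, outputs 0 and 2 are used immediately and only output 3 remains; without it, the cell waits and both free outputs go unused by this cell, which is where the throughput difference comes from.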
Multicasting (ESLIP: combining unicast and multicast, used in the Cisco 12000)
IP packet in iSLIP switch (2N^2 queues)
LCS Ingress Flow Control (2.5Tb/s)
[Figure: a linecard and a switch port (switch scheduler and switch fabric) exchange LCS messages: 1: Req, 2: Grant/credit with sequence number, 3: Data]
LCS Over Optical Fiber: 10Gb/s Linecards
[Figure: a 10Gb/s linecard and a 10Gb/s switch port (switch scheduler and switch fabric) linked by LCS over 12 multimode fibers; other labels: 2.5Gb/s LVDS, GENET, Quad Serdes]
2.56Tb/s IP router
[Figure: linecards, Port #1 to Port #256, connected by LCS links (1000ft/300m) to the 2.56Tb/s switch core; each port has a port processor and optics running the LCS protocol]
Switch core architecture
[Figure: Port #1 to Port #256 attached to the crossbar and the scheduler; signals: Request, Grant/Credit, Cell Data]