Multistage Networks with Static Routing
Overview
» Nonblocking Networks
» Estimating Blocking Probability
» Multicast Networks
7-2 - Jon Turner - 8/6/2004
Multistage Networks with Static Routing
All cells in a session follow a specific pre-assigned path.
» table at input specifies full path (plus outgoing VCI, for ATM)
» makes sense only when sessions are configured in advance
– difficult to use effectively for datagrams
Requires path hunt and resource reservation.
» creates potential for blocking
Eliminates need for resequencing and distributed scheduling.
General Design Issues
Path hunting
» how to rapidly find a path with sufficient capacity
» for bursty traffic, may require complex calculations
Blocking
» session requests block if no path is available with sufficient bandwidth
» can configure systems to be nonblocking
– at the cost of higher speedup
» alternatively, can accept some small blocking probability
Queueing performance is determined by
» switch element buffering method
» flow control method
» speedup
» distribution of traffic in network
» traffic variability
Unicast Routing in Extended Delta Networks
To find paths in a static routing network, the control processor must track the bandwidth used on links and search for paths with unused capacity.
The following code finds paths in $D^*_{n,d,h}$ and returns the path as a list of link numbers; b(i,j) denotes the normalized bandwidth in use on link (i,j) and ⎣x⎦d denotes the largest multiple of d that is ≤ x.

list Route(i, j1, j2, w)
    if b(i,j1)+w > 1 or b(h+k-i,j2)+w > 1 then
        return [ nil ];
    fi;
    if i = h then
        Let y_{k-1} . . . y_0 be the d-ary representation of j2.
        j := ⎣σ(h,j1)⎦d + y_{(k-h)-1};
        L := [ ];
        for r from h+1 to k−1 do
            L := L & [ j ];
            if b(r,j)+w > 1 then return [ nil ]; fi;
            j := ⎣σ(r,j)⎦d + y_{k-(r+1)};
        rof;
        return [ j1 ] & L & [ j2 ];
    fi;
    for r from 0 to d−1 do
        L := Route(i+1, ⎣σ(i,j1)⎦d + r, σ^{-1}(h+k-i, ⎣j2⎦d + r), w);
        if L ≠ [ nil ] then return [ j1 ] & L & [ j2 ]; fi;
    rof;
    return [ nil ];
end;

Note: as written, the code does not update b().
The running time for the program is $O(d^h (k-h))$.
With bursty data sources, it can become necessary to consider the impact on queueing, making the path-finding process more complex.
Blocking in Static Routing Networks
A session is a triple (x,y,ω) where x is an input, y is an output and ω, called the weight, represents the fraction of the external link bandwidth that the session requires.
Weights can be restricted so that 0≤b≤ω≤B≤1.
A route is a path in the network connecting x to y.
A set of sessions is compatible if the sum of the weights at each input and output is at most 1; a set of routes is compatible if the sum of the weights on each link is at most the speedup S.
A state of a network is a set of compatible sessions and corresponding compatible routes.
A network is strictly nonblocking for given b, B, β if in all states, a new compatible session can be added, without modifying any existing routes.
A network is rearrangeably nonblocking if any set of compatible sessions can be routed.
» implies that a compatible session can be routed after modifying existing routes.
The special case of b=B=S=1 is called the circuit switching case.
The Clos Network
Let d, r, n be integers where d divides n evenly. The three stage Clos network is defined by
$$C^3_{n,d,r} = X_{d,r} \otimes X_{n/d,n/d} \otimes X_{r,d}$$
[Figure: three stage Clos network, labeled with n, d and r]
Theorem. For b=B=S=1, $C^3_{n,d,r}$ is strictly nonblocking for unicast sessions if and only if r ≥ 2d–1.
Proof. Suppose $C^3_{n,d,r}$ is in state S and we have a new reservation (x,y,1). Let u be the stage 1 switch containing input x and notice that the number of outputs of u that are busy is at most d–1.
This means that at most d–1 middle stage switches are inaccessible from x and similarly at most d–1 middle stage switches are inaccessible from y.
Since r ≥ 2d–1 there must be at least one middle stage switch that is accessible from both x and y.
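The path hunt for a three stage Clos network can be sketched as a first-fit search over middle stage switches. This is a minimal illustration, not the author's code: the class name `Clos3`, the first-fit order and the `up`/`down` load tables are assumptions, and the caller is assumed to offer only compatible sessions.

```python
class Clos3:
    """Three stage Clos network C3(n, d, r) with per-link load tracking (illustrative)."""

    def __init__(self, n, d, r, speedup=1.0):
        assert n % d == 0, "d must divide n"
        self.d, self.r, self.S = d, r, speedup
        groups = n // d
        # up[u][m]: load on the link from first stage switch u to middle switch m
        # down[m][v]: load on the link from middle switch m to third stage switch v
        self.up = [[0.0] * r for _ in range(groups)]
        self.down = [[0.0] * groups for _ in range(r)]

    def route(self, x, y, w):
        """Reserve weight w from input x to output y; return the middle switch or None."""
        u, v = x // self.d, y // self.d
        for m in range(self.r):
            if self.up[u][m] + w <= self.S and self.down[m][v] + w <= self.S:
                self.up[u][m] += w
                self.down[m][v] += w
                return m
        return None  # no middle switch is accessible from both x and y
```

With d=2 and r=2d−1=3 in the circuit switching case, a second session from the same first stage switch still finds a free middle switch, while with r=1 it blocks.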
Complexity of Clos Network
The crosspoint count for a three stage Clos network is $2dr(n/d)+r(n/d)^2$.
» if we substitute r=2d, making the network nonblocking, this becomes $2n(2d+n/d)$
» the crosspoint count is minimized by taking $d=(n/2)^{1/2}$
» this gives a crosspoint count of $4\cdot 2^{1/2}\, n^{3/2} \approx 5.7\,n^{3/2}$.
Another nonblocking network can be obtained by replacing the middle stage switches by Clos networks.
$$C^5_{n,d,r} = X_{d,r} \otimes X_{d,r} \otimes X_{n/d^2,n/d^2} \otimes X_{r,d} \otimes X_{r,d}$$
» $C^5$ is also nonblocking when r ≥ 2d−1. If r=2d and $d=(2n/3)^{1/3}$, the crosspoint count is $\approx 15.7\,n^{4/3}$.
» this generalizes to $C^{2k-1}$, which is nonblocking when r ≥ 2d−1. If r=2d and $d=\left(2^{k-1}(k-1)n/(4(2^{k-1}-1))\right)^{1/k}$, the crosspoint count is
$$\frac{4k(2^{k-1}-1)}{k-1}\left(\frac{2^{k-1}(k-1)}{4(2^{k-1}-1)}\right)^{1/k} n^{1+1/k}$$
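The crosspoint counts above can be checked numerically with a short recursive sketch. The function name `clos_crosspoints` and the `levels` parameter (levels=1 for three stages, levels=2 for five stages) are illustrative choices, not part of the original slides.

```python
import math

def clos_crosspoints(n, d, r, levels=1):
    """Crosspoints in the (2*levels+1)-stage Clos network; levels=0 is an n x n crossbar."""
    if levels == 0:
        return n * n
    assert n % d == 0, "d must divide n"
    outer = 2 * (n // d) * d * r      # first and last stages: n/d switches of size d x r each
    return outer + r * clos_crosspoints(n // d, d, r, levels - 1)

# n=512 with d=(n/2)^(1/2)=16 and r=2d meets the 4*sqrt(2)*n^(3/2) count exactly.
three_stage = clos_crosspoints(512, 16, 32)
# n=96 with d=(2n/3)^(1/3)=4 and r=2d gives the five stage count.
five_stage = clos_crosspoints(96, 4, 8, levels=2)
```

For these sizes the three stage count equals 4·2^(1/2)·n^(3/2) and the five stage count is within rounding of 15.7·n^(4/3).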
General Nonblocking Condition
Theorem. For unicast sessions, $C^3_{n,d,r}$ is nonblocking if
$$r > \begin{cases} 2\left\lceil \dfrac{d-B}{S-B} \right\rceil - 2 & \\[2ex] 2\left\lfloor \dfrac{d-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$
Proof. The network is clearly nonblocking if S>d. So, assume S≤d. Consider a session from input x to output y at rate ω.
» total load from routes that might block the new reservation is ≤ d−ω.
» a middle stage switch is inaccessible from x if the load on its link is > S–ω
» the number of inaccessible middle stage switches is at most ⎡((d–ω)/(S–ω))–1⎤ = ⎡(d–ω)/(S–ω)⎤–1 (if b > S–ω, it is at most ⎣(d–ω)/b⎦).
» since there are r middle stage switches, can connect x to y if r > 2⎡(d–ω)/(S–ω)⎤–2 (if b > S–ω, it's sufficient to have r > 2⎣(d–ω)/b⎦)
» this always holds if r > 2⎡(d–B)/(S–B)⎤–2 and S≤d (if b > S–ω, r > 2⎣(d–b)/b⎦).
[Figure: path from input x to output y; a link is unusable for the new session when its load is > S−ω]
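As a worked check of the condition, the sketch below returns the smallest r that satisfies it. The function name `min_middle_switches` is hypothetical, exact rational arithmetic is used to keep the ceilings and floors stable, and the S>d shortcut follows the first line of the proof.

```python
from fractions import Fraction
from math import ceil, floor

def min_middle_switches(d, S, B, b=0):
    """Smallest r meeting the sufficient condition for unicast sessions (illustrative)."""
    d, S, B, b = (Fraction(v) for v in (d, S, B, b))
    if S > d:
        return 1                      # any single middle switch can carry all d inputs
    if b > S - B:
        bound = 2 * floor((d - b) / b)
    elif S == B:
        raise ValueError("condition cannot be met when S = B and b <= S - B")
    else:
        bound = 2 * ceil((d - B) / (S - B)) - 2
    return bound + 1                  # r must strictly exceed the bound
```

In the circuit switching case b=B=S=1 with d=16 this gives r=31=2d−1, matching the classical Clos result.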
Cost of Nonblocking Clos Network
Can reduce blocking by increasing either r or S.
Increasing S reduces fragmentation.
Fragmentation is negligible if session rates are small.
[Figure: (speedup)·(r/d) versus number of middle stage switches (r), for session rate = link rate, (link rate)/2 and (link rate)/16; n=256, d=16]
7-10 - Jon Turner - 8/6/2004
Nonblocking Condition for Benes Network
Theorem. For unicast sessions with b=0, the Benes network $B_{n,d}$ is nonblocking when
$$S \ge (1-2/d)\,B + (2/d)\,(1+(d-1)(k-1))$$
Proof sketch. For unicast sessions, when the path from an input x is blocked at stage i, $d^{k-i-1}$ middle stage switches become inaccessible.
» an adversary can minimize access to the middle stage by blocking paths early, preventing access to at most
$$M = \frac{d^{k-2}(d-\omega)}{S-\omega} + \frac{d^{k-3}(d^2-d)}{S-\omega} + \frac{d^{k-4}(d^3-d^2)}{S-\omega} + \cdots + \frac{d^{k-1}-d^{k-2}}{S-\omega} = \frac{d^{k-2}}{S-\omega}\,(1+(d-1)(k-1)) - d^{k-2}\,\frac{\omega}{S-\omega}$$
middle stage switches
» the network is nonblocking when M < n/2d, which is true when S ≥ (1–2/d)B + (2/d)(1+(d–1)(k–1))
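The sum in the proof sketch and its closed form can be compared numerically. The sketch below assumes n = d^k; the function names are illustrative.

```python
def inaccessible_sum(d, k, S, w):
    """M as the term-by-term sum from the proof sketch."""
    first = d ** (k - 2) * (d - w) / (S - w)
    rest = sum(d ** (k - 1 - i) * (d ** i - d ** (i - 1)) / (S - w)
               for i in range(2, k))
    return first + rest

def inaccessible_closed(d, k, S, w):
    """M in closed form."""
    return d ** (k - 2) * (1 + (d - 1) * (k - 1) - w) / (S - w)

def benes_unicast_speedup(d, k, B):
    """Right-hand side of the nonblocking condition for unicast sessions with b=0."""
    return (1 - 2 / d) * B + (2 / d) * (1 + (d - 1) * (k - 1))
```

For d=8, k=4 (n=4096) and B=1/16, the required speedup evaluates to 5.546875, and any S above it keeps M below n/2d = 256.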
Nonblocking Condition for Benes Network
Large switch elements yield a feasible speedup requirement.
[Figure: nonblocking speedup versus number of ports (n, from 16 to 4096), for d = 2, 4, 8 and 16, with VC rate = link rate and VC rate = (link rate)/16]
Blocking Probability
Blocking may be acceptable if it does not happen too often.
For unicast and b=B=β=1, there are simple approximations for estimating the probability of blocking.
» parallel rule: compute the blocking probability for a parallel combination of networks by multiplying the blocking probabilities of the parallel subnetworks: $p = p_1 p_2 \cdots p_n$
» series rule: compute the blocking probability for a series combination of subnetworks by multiplying the probabilities of not blocking to obtain the complement of the blocking probability for the combination: $(1-p) = (1-p_1)(1-p_2)\cdots(1-p_n)$
» these rules assume that the blocking probabilities in the subnetworks are independent, which is not strictly true, but is often fairly accurate, and in any case is guaranteed to produce an upper bound on the true blocking probability.
This blocking probability estimation method is called Lee’s method.
Applications of Lee’s Method
To apply Lee’s method to the Clos network $C^3_{n,d,r}$, we must account for the fact that the links connecting to the middle stages have a lower probability of being busy than the inputs or outputs.
» Letting p denote the probability that an input or output is busy, the blocking probability estimate becomes $[1-(1-pd/r)^2]^r$.
» Notice that this gives a non-zero blocking probability even when r ≥ 2d−1, when we know the network is nonblocking; this emphasizes the approximate nature of Lee’s method.
For the delta network $D_{n,d}$, Lee’s method gives a blocking probability estimate of $1-(1-p)^{k-1}$, where p is the probability that any input or output is busy and also the probability that any given internal link is busy; $k=\log_d n$ is the number of stages.
We can apply Lee’s method to the extended delta network $D^*_{n,d,h}$ in a recursive fashion, by letting P(n,d,h) be the blocking probability estimate. Then
$$P(n,d,0) = 1-(1-p)^{k-1}$$
$$P(n,d,h) = \left[\,1-(1-p)^2\,\bigl(1-P(n/d,d,h-1)\bigr)\right]^d$$
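Lee’s series and parallel rules compose directly into these estimates. The sketch below is illustrative only: the helper names are assumptions, and the extended delta network is parameterized by k (with n = d^k) rather than n.

```python
def series(probs):
    """Series rule: the chain is free only if every subnetwork in it is free."""
    free = 1.0
    for p in probs:
        free *= 1.0 - p
    return 1.0 - free

def parallel(probs):
    """Parallel rule: blocked only if every alternative is blocked."""
    blocked = 1.0
    for p in probs:
        blocked *= p
    return blocked

def lee_clos(p, d, r):
    link_busy = p * d / r                 # middle stage links carry less load
    return parallel([series([link_busy, link_busy])] * r)

def lee_extended_delta(p, d, k, h):
    """P(n,d,h) for n = d**k, following the recursion for the extended delta network."""
    if h == 0:
        return 1.0 - (1.0 - p) ** (k - 1)
    inner = lee_extended_delta(p, d, k - 1, h - 1)
    return parallel([series([p, inner, p])] * d)
```

Taking d=2, k=4 and h=3 corresponds to a 16-port Benes network; with p=0.5 the recursion gives about .899.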
More Accurate Blocking Probability Estimate
Lee’s method assigns “busy probabilities” to links independently of one another and consequently assigns non-zero probability to busy-idle configurations that do not correspond to real network states.
Consider the probability of blocking between an input x and an output y of the delta network $D_{n,2}$.
» Let z be the other input to the first stage switch that x is connected to.
» The link connecting x to the recursive subnetwork ($D_{n/2,2}$) is busy if z is busy and connected to that link rather than the other. This occurs with probability p/2.
» On the other hand, if the connecting link is not busy, we block only if blocking occurs in the subnetwork. Thus, if we let f(n,d) be the (approximate) blocking probability for $D_{n,d}$, we find
$$f(n,2) = p/2 + (1-p/2)\,f(n/2,2) = 1-(1-p/2)^{k-1}$$
» This is still approximate, since we are assuming (incorrectly) that the probability that z is busy does not depend on the status of y. However, it gives a better approximation than Lee’s method.
By similar reasoning, if we let
$$\alpha = \sum_{i=1}^{d-1} (i/d)\binom{d-1}{i}\, p^i (1-p)^{d-1-i} = (1-1/d)\,p$$
we find $f(n,d) = \alpha + (1-\alpha)\,f(n/d,d) = 1-(1-(1-1/d)p)^{k-1}$.
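The closed form for α and the two delta network estimates can be compared with a few lines of code. Function names are illustrative; k is the number of stages.

```python
from math import comb

def alpha(p, d):
    """Probability that the link toward the recursive subnetwork is already busy."""
    return sum((i / d) * comb(d - 1, i) * p ** i * (1 - p) ** (d - 1 - i)
               for i in range(1, d))

def delta_blocking_improved(p, d, k):
    return 1 - (1 - alpha(p, d)) ** (k - 1)

def delta_blocking_lee(p, k):
    return 1 - (1 - p) ** (k - 1)
```

The sum collapses to (1−1/d)p because it is the mean of a binomial(d−1, p) variable divided by d, so the improved estimate is always below Lee’s.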
We can make a similar improved estimate for the Benes network.
Let g(n,d) be the approximate blocking probability for $B_{n,d}$ and consider the case of d=2.
» Let x and y be an idle input-output pair, let a be the other input to the switch that x is connected to and let b be the other output of the switch that y is connected to.
» If a and b are both idle, x and y block with probability $g(n/2,2)^2$.
» If one of a and b is busy, x and y block with probability g(n/2,2).
» If both are busy and they are connected to the same recursive subnetwork, then x and y block with probability g(n/2,2).
» If both are busy and they are connected to different recursive subnetworks, then x and y block with probability 1.
Thus for n>2
$$g(n,2) = (1-p)^2\, g(n/2,2)^2 + \bigl(2p(1-p)+p^2/2\bigr)\, g(n/2,2) + p^2/2$$
Applying this method to $B_{16,2}$ with p=0.5 yields a blocking probability estimate of .265, while Lee’s method gives .899. For larger d, the differences get smaller.
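The two figures quoted for the 16-port binary Benes network can be reproduced with the short sketch below. It assumes g(2,2)=0 (a single switch never blocks) as the base case, and it assumes that Lee’s method treats each of the two paths as a link, a subnetwork and a link in series.

```python
def benes2_improved(p, n):
    """g(n,2) from the recursion for d=2, with g(2,2) = 0 assumed."""
    if n <= 2:
        return 0.0
    g = benes2_improved(p, n // 2)
    return (1 - p) ** 2 * g ** 2 + (2 * p * (1 - p) + p ** 2 / 2) * g + p ** 2 / 2

def benes2_lee(p, n):
    """Lee's estimate: two parallel paths, each link-subnetwork-link in series."""
    if n <= 2:
        return 0.0
    inner = benes2_lee(p, n // 2)
    return (1 - (1 - p) ** 2 * (1 - inner)) ** 2
```

With p=0.5 and n=16 these evaluate to roughly .265 and .899 respectively.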
The estimate for d=2 can be generalized to arbitrary d as follows
$$g(n,d) = \sum_{i=0}^{d-1}\sum_{j=0}^{d-1} \binom{d-1}{i} p^i (1-p)^{d-1-i} \binom{d-1}{j} p^j (1-p)^{d-1-j}\; X^{d}_{n,d}(i,j)$$
where
$$X^{a}_{n,d}(i,j) = \begin{cases} 1 & \text{if } i=a \text{ or } j=a \\[1ex] g(n/d,d)^{\,a-i-j} & \text{if } i=0 \text{ or } j=0 \\[1ex] \dfrac{i}{a}\dfrac{j}{a}\, X^{a-1}_{n,d}(i-1,j-1) + \left(1-\dfrac{i}{a}\right)\dfrac{j}{a}\, X^{a-1}_{n,d}(i,j-1) & \\[1ex] \quad + \dfrac{i}{a}\left(1-\dfrac{j}{a}\right) X^{a-1}_{n,d}(i-1,j) + \left(1-\dfrac{i}{a}\right)\left(1-\dfrac{j}{a}\right) X^{a-1}_{n,d}(i,j)\, g(n/d,d) & \text{otherwise} \end{cases}$$
Here, $X^{a}_{n,d}(i,j)$ is the probability of blocking from some input x to some output y in $B_{n,d}$ when the first stage switch connected to x has exactly i of its first a output links busy and the last stage switch connected to y has exactly j of its first a inputs busy.
These methods can be applied to other networks as well.
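The generalized estimate can be sketched in code under one reading of the recursion, which should be treated as an assumption: condition on whether the a-th link on each side is busy (probabilities i/a and j/a), return 1 when all a links on either side are busy, and return g(n/d,d)^(a−i−j) when either side has no busy links. The base case g(d,d)=0 is also assumed. For d=2 this reading reduces to the recursion given earlier for the binary case.

```python
from functools import lru_cache
from math import comb

def benes_improved(p, n, d):
    """g(n,d) under the assumed recursion; g(d,d) = 0 for a single switch."""
    if n <= d:
        return 0.0
    g_sub = benes_improved(p, n // d, d)

    @lru_cache(maxsize=None)
    def X(a, i, j):
        if i == a or j == a:
            return 1.0
        if i == 0 or j == 0:
            return g_sub ** (a - i - j)
        pi, pj = i / a, j / a     # chance that the a-th link on each side is busy
        return (pi * pj * X(a - 1, i - 1, j - 1)
                + (1 - pi) * pj * X(a - 1, i, j - 1)
                + pi * (1 - pj) * X(a - 1, i - 1, j)
                + (1 - pi) * (1 - pj) * g_sub * X(a - 1, i, j))

    def weight(i):
        # probability that exactly i of the other d-1 ports on the switch are busy
        return comb(d - 1, i) * p ** i * (1 - p) ** (d - 1 - i)

    return sum(weight(i) * weight(j) * X(d, i, j)
               for i in range(d) for j in range(d))
```

As a consistency check, `benes_improved(0.5, 16, 2)` agrees with the .265 quoted for the d=2 case.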
Estimating Blocking Probability for General Rates
Classical methods for determining blocking probability apply only to the circuit-switching case and are difficult to extend.
» distribution of session rates is rarely known with precision
» effect of blocking on distribution of rates in network is difficult to compute
» for special case of b=B, can approximate using Lee’s method
Compromise approach is to make worst-case assumptions about the distribution of session rates in the network, but assume blocked links are randomly distributed.
For $C^3_{n,d,r}$
» the number of blocked links leaving a first stage switch or entering a third stage switch is at most t = ⎡(βd−B)/(1−B)⎤ − 1 (when b=0)
» if blocked links are randomly distributed, then the estimated probability of blocking is
$$\frac{\bigl[t(t-1)\cdots(2t-r+1)\bigr]\,\bigl[t(t-1)\cdots(r-t+1)\bigr]}{r(r-1)\cdots(r-t+1)}$$
» for d=16, r=24, β=1 and B=1/16, t=16 and the blocking probability estimate is .017
» Similar analysis can be done for the Benes network.
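The worked numbers for d=16, r=24 can be reproduced as follows. The sketch rewrites the product formula as a ratio of binomial coefficients, C(t, r−t)/C(r, t), which is algebraically the same quantity (both equal t!·t!/((2t−r)!·r!)); the function names are illustrative.

```python
from fractions import Fraction
from math import ceil, comb

def blocked_links(d, B, beta=1):
    """Worst-case number of blocked links t for b = 0, using exact arithmetic."""
    d, B, beta = Fraction(d), Fraction(B), Fraction(beta)
    return ceil((beta * d - B) / (1 - B)) - 1

def blocking_estimate(t, r):
    """Probability that two random sets of t blocked links cover all r middle switches."""
    if t >= r:
        return 1.0
    if 2 * t < r:
        return 0.0            # too few blocked links to cover every middle switch
    return comb(t, r - t) / comb(r, t)
```

Calling `blocked_links(16, Fraction(1, 16))` gives t=16, and `blocking_estimate(16, 24)` gives about .017.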
Multicast Static Routing Networks
Routing tables in each SE specify branching.
Multicast Index (MI) supplied by input port.
MI used to look up bit vector specifying branching.
» may use direct lookup, CAM or hash
» for IP, may do multicast address lookup at SEs instead of using MI
» for ATM, need additional MI to VCI lookup at output.
[Figure: example multicast route through the network, with branching bit vectors 0101, 0010, 1000, 1001, 1010, 0110 and numeric labels 3, 5, 4, 1, 8, 2]
Nonblocking Multicast Networks
A session is a triple (x,Y,ω) where x is an input, Y is a set of outputs and ω is the weight.
A route is a subtree of the network connecting x to all of Y.
A set of sessions is compatible if the sum of the weights at each input and output is at most 1; a set of routes is compatible if the sum of the weights on each link is at most S.
A state of a network is a set of compatible sessions and corresponding compatible routes.
A network is strictly nonblocking for given b, B, β if in all states, a new compatible session or compatible extension can be routed, without modifying any existing routes.
A network is reroutably nonblocking if for all states reachable by a given routing algorithm, any compatible session can be added, without modifying any existing routes.
» implies that an existing route can be extended, by rerouting
A network is rearrangeably nonblocking if any set of compatible sessions can be routed.
Multicast Nonblocking Condition for Clos
Theorem. For multicast connections, $C^3_{n,d,r}$ is nonblocking if
$$r > \begin{cases} \left\lceil F\,\dfrac{d-B}{S-B} \right\rceil + \left\lceil \dfrac{d-B}{S-B} \right\rceil - 2 & \\[2ex] \left\lfloor F\,\dfrac{d-b}{b} \right\rfloor + \left\lfloor \dfrac{d-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$
where F = min{r,n/d}.
Proof. The network is clearly nonblocking if S>d. So, assume S≤d. Multicast routes can branch to F middle stage switches in the worst case.
» total load that can block an input x’s access to the middle stage is ≤ F(d−ω)
» the number of middle stage switches inaccessible from x is at most ⎡(F(d–ω)/(S–ω)) −1⎤ (if b > S–ω, it is at most ⎣F(d–ω)/b⎦)
» can always add a new route if r > ⎡F(d−ω)/(S−ω)⎤ + ⎡(d–ω)/(S–ω)⎤ – 2 (or r > ⎣F(βd–ω)/b⎦ + ⎣(βd–ω)/b⎦ if b > 1−ω).
» this always holds when r > ⎡F(d–B)/(S–B)⎤ + ⎡(d–B)/(S–B)⎤ – 2 if S≤d (or r > ⎣F(βd–b)/b⎦ + ⎣(βd–b)/b⎦ if b > S–ω).
Cost of Nonblocking Multicast Clos
Most cost-effective configuration (by far) is r=1.
» but not practical in highest capacity systems
Session rate has minimal impact unless speedup is small.
Worst-case analysis is overly pessimistic for multicast.
[Figure: (speed advantage)·(r/d) versus number of middle stage switches (r), for d = 8, 16 and 32; n=256, session rate = link rate]
Extension to Asymmetric Clos Networks
We can generalize $C^3$ to networks with more outputs than inputs.
$$M(n_1,d_1,n_2,d_2,r) = X_{d_1,r} \otimes X_{n_1/d_1,\,n_2/d_2} \otimes X_{r,d_2}$$
Theorem. For multicast sessions, $M(n_1,d_1,n_2,d_2,r)$ is nonblocking if
$$r > \begin{cases} \left\lceil F\,\dfrac{d_1-B}{S-B} \right\rceil + \left\lceil \dfrac{d_2-B}{S-B} \right\rceil - 2 & \\[2ex] \left\lfloor F\,\dfrac{d_1-b}{b} \right\rfloor + \left\lfloor \dfrac{d_2-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$
where $F = \min\{r, n_2/d_2\}$.
The proof is very similar to the previous case.
Making Clos Network Reroutably Nonblocking
If we limit the amount that a route can branch in the first stage of $M(n_1,d_1,n_2,d_2,r)$ to f<F, we obtain a reroutably nonblocking network that can be significantly less expensive than the fully nonblocking version.
» routing algorithm excludes all states with routes that branch by more than f in first stage
Theorem. $M(n_1,d_1,n_2,d_2,r)$ is reroutably nonblocking if
$$r > \left\lceil f\,\frac{d_1-B}{S-B} - 1 \right\rceil + \left\lceil \frac{d_2-B}{S-B} - 1 \right\rceil (n_2/d_2)^{1/f}$$
if b = 0 and the first stage fanout is limited to f.
The key to proving this is a lemma concerning the set covering problem: given a set $A=\{a_1,\ldots,a_t\}$ and a collection $S=\{S_1,\ldots,S_p\}$ where each $S_i$ is a subset of A, find the smallest possible number of sets $S_i$ whose union equals A.
The greedy algorithm for set covering selects sets one at a time, always picking the next set that covers the most previously uncovered elements.
Set Covering Lemma
Lemma. Let $A=\{a_1,\ldots,a_t\}$ and $S=\{S_1,\ldots,S_p\}$ be an instance of set covering in which for some q (1≤q<p), every $a_i$ appears in at least p−q of the sets $S_j$. If $p > q\,t^{1/f}$ then the greedy algorithm finds a solution with no more than f sets.
Proof. Let h be the number of sets in the solution found by the greedy algorithm and assume the sets are numbered so that for 1≤i≤h, $S_i$ is the set chosen by the greedy algorithm in step i. Define $U_i = S_1 \cup \cdots \cup S_i$, let $D_i = S_i - U_{i-1}$ and let $s_i=|S_i|$, $u_i=|U_i|$ and $d_i=|D_i|$. Then
$$p\,s_1 \ge \sum_{i=1}^{p} s_i \ge (p-q)\,t \quad\text{so}\quad u_1 \ge (1-q/p)\,t = (1-x_0)\,t$$
where $x_j=(q-j)/(p-j)$. Next, note that
$$(p-1)\,d_2 \ge \sum_{i=2}^{p} |S_i - U_1| \ge (p-q)(t-u_1) \quad\text{so}\quad d_2 \ge (1-x_1)(t-u_1)$$
$$u_2 = u_1 + d_2 \ge (1-x_1)\,t + x_1 u_1 \ge (1-x_1)\,t + x_1(1-x_0)\,t = (1-x_1x_0)\,t$$
In a similar fashion, we find that for i≤h, $u_i \ge (1-x_{i-1}\cdots x_1x_0)\,t$. In particular
$$u_f \ge (1-x_{f-1}\cdots x_1x_0)\,t \ge (1-x_0^{\,f})\,t > (1-(1/t))\,t = t-1$$
So, $U_f$ has more than t–1 elements and since A has only t, $U_f$ must also have t.
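The greedy rule used in the lemma can be sketched in a few lines. The function name and the tie-breaking rule (lowest index wins) are illustrative choices; the lemma does not depend on how ties are broken.

```python
def greedy_set_cover(universe, sets):
    """Return indices of the sets picked by the greedy rule, or None if no cover exists."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set covering the most still-uncovered elements
        best = max(range(len(sets)), key=lambda j: len(sets[j] & uncovered))
        if not sets[best] & uncovered:
            return None
        chosen.append(best)
        uncovered -= sets[best]
    return chosen

# Instance with t=4, p=3, q=1: every element appears in at least p-q=2 sets,
# and p > q * t**(1/2), so the lemma promises a cover with at most f=2 sets.
A = {0, 1, 2, 3}
S = [{0, 1, 2}, {0, 1, 3}, {2, 3}]
cover = greedy_set_cover(A, S)
```

Here the greedy rule picks the first set and then the second, giving a cover of size 2 as the lemma predicts.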
Proof of Nonblocking Theorem for M
To prove the theorem, we show that one can set up a route of weight B from an input x to some arbitrary set of outputs.
Let p be the number of second stage switches reachable from x and note that $p \ge r - \lceil f(d_1-B)/(S-B) - 1\rceil > \lceil (d_2-B)/(S-B) - 1\rceil\,(n_2/d_2)^{1/f}$, using the bound on r in the statement of the theorem.
Note that each third stage switch can be prevented from reaching at most $\lceil (d_2-B)/(S-B) - 1\rceil$ second stage switches.
To apply the lemma, let A be the set of switches in the third stage (so $t=n_2/d_2$). We define $S_j$ to be the set of third stage switches that can be reached by the j-th second stage switch that is reachable from x.
Observe that each third stage switch appears in at least $p - \lceil (d_2-B)/(S-B) - 1\rceil$ of the $S_j$, so let $q = \lceil (d_2-B)/(S-B) - 1\rceil$.
Now, we see that $q\,t^{1/f} = \lceil (d_2-B)/(S-B) - 1\rceil\,(n_2/d_2)^{1/f}$ and since p is larger than this, the lemma tells us that we can reach all the third stage switches through just f of the second stage switches.
Multicast Nonblocking Conditions for Benes
Theorem. For multicast sessions with b=0, $B_{n,d}$ is nonblocking when
$$S \ge (1/d)\,[\,n + (1+(d-1)(k-1)) - B\,]$$
Theorem. For multicast sessions with fanout ≤ $d^i$ and b=0, $B_{n,d}$ is nonblocking when
$$S \ge (1/d)\,[\,d^i\,(1+(d-1)(k-i)) + (1+(d-1)(k-1)) - B\,]$$
The proofs of both theorems are similar to the proof for the unicast nonblocking condition (page 7-10).
For typical system configurations, the speedup needed for nonblocking operation is impractically large.
Cascaded Benes Networks
Limit fanout to f in each Benes network.
[Figure: cascade of Benes networks $B_{n,d}$, $B_{n,d}$, . . . , $B_{n,d}$]
If f ≤ d, and there are $\log_f n$ subnetworks, no blocking when
$$S \ge \frac{1+f}{d}\,(1+(d-1)(k-1)) + B$$
When f=4, cost is about $(5/4)\log_2 n$ times the cost of a nonblocking unicast network (so 10× for n=256).
Binary Tree Multicast
Binary branching allowed in each pass.
Extend connection by inserting additional binary branch point.
Retract connection by removing binary branch point.
Requires connection re-routing, but time required is comparable to unicast.
With Benes network, no blocking if
$$S \ge \frac{3(1+\delta)}{d}\,(1+(d-1)(k-1)) + B$$
where δ is the multicast traffic fraction.
Only multicast method for which network complexity and time per operation are roughly comparable with unicast.
Reroutably Nonblocking Versions
By cascading two Benes networks, we obtain a reroutably nonblocking network if
$$S \ge (1-1/d)\,B + (2/d)\,(1+(d-1)(k-1))$$
and branching is allowed only in the second network. New connections are routed through the most lightly loaded part of the second network.
Can accomplish the same thing with two passes through one network if
$$S \ge B + (1+\delta)\,(2/d)\,(1+(d-1)(k-1))$$
where δ is the fraction of traffic belonging to multicast connections
» for n=4096, d=8, δ=.1, link speed of 2.4 Gb/s and max session rate of 150 Mb/s, need 1/β ≥ 6.12; for unicast, the speedup needed is 5.55.
» requires mechanism to avoid disruption of data stream when re-switching
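The numerical example can be reproduced by evaluating the two-pass condition and the unicast Benes condition directly. This is a small sketch with illustrative function names; B is the maximum session rate divided by the link speed and k = log_d n.

```python
def stage_term(d, k):
    return (2 / d) * (1 + (d - 1) * (k - 1))

def unicast_speedup(d, k, B):
    return (1 - 2 / d) * B + stage_term(d, k)

def two_pass_speedup(d, k, B, delta):
    return B + (1 + delta) * stage_term(d, k)

B = 150 / 2400                                  # 150 Mb/s sessions on 2.4 Gb/s links
unicast = unicast_speedup(8, 4, B)              # n = 4096 = 8**4
two_pass = two_pass_speedup(8, 4, B, 0.1)
```

The two values come out near 5.55 and 6.11, in line with the figures quoted above.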
Blocking Probability for Multicast
Worst-case analysis is overly pessimistic for multicast.
» based on maximum branching in “left half” of network
» in practice, expect most branching to be in right half
Determination of blocking probability is difficult.
» no “natural” assumptions for fanout or bandwidth distribution
» routing algorithms can have strong influence on results
For networks operated with fanout restriction, can estimate
» for circuit switched case, $C^3_{n,d,r}$ with first stage fanout restriction f and r=(f+1)d–1, blocking probability is $< ((d-1)/r)^f \approx (1/(f+1))^f$ if busy links are randomly distributed.
» for f=3, result is ≈.016, for f=4, it’s .0016, for f=5, it’s ≈.00013.
» can extend to general VC rates by assuming the worst-case number of blocked links but random distribution of blocked links
More reliable estimates can be obtained by simulation.
» results strongly dependent on traffic assumptions (fanout and session rate) and on routing algorithm used
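The fanout-restricted estimate is easy to tabulate. The sketch below evaluates both the bound ((d−1)/r)^f with r=(f+1)d−1 and its approximation (1/(f+1))^f; the function name and the choice d=16 are illustrative.

```python
def fanout_blocking(d, f):
    """Return (bound, approximation) for first stage fanout limit f."""
    r = (f + 1) * d - 1
    return ((d - 1) / r) ** f, (1 / (f + 1)) ** f

for f in (3, 4, 5):
    bound, approx = fanout_blocking(16, f)
    print(f, round(bound, 5), round(approx, 5))
```

Since (d−1)/r is slightly smaller than 1/(f+1), the bound always sits just below the approximation.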