
Multistage Networks with Static Routing

Overview
Nonblocking Networks
Estimating Blocking Probability
Multicast Networks

7-2 - Jon Turner - 8/6/2004

Multistage Networks with Static Routing

All cells in a session follow a specific pre-assigned path.
» table at input specifies full path (plus outgoing VCI, for ATM)
» makes sense only when sessions are configured in advance
– difficult to use effectively for datagrams

Requires path hunt and resource reservation.
» creates potential for blocking

Eliminates need for resequencing and distributed scheduling.


General Design Issues

Path hunting
» how to rapidly find a path with sufficient capacity
» for bursty traffic, may require complex calculations

Blocking
» session requests block if no path is available with sufficient bandwidth
» can configure systems to be nonblocking
– at the cost of higher speedup
» alternatively, can accept some small blocking probability

Queueing performance determined by
» switch element buffering method
» flow control method
» speedup
» distribution of traffic in network
» traffic variability


Unicast Routing in Extended Delta Networks

To find paths in a static routing network, the control processor must track the bandwidth used on links and search for paths with unused capacity.

The following code finds paths in D*_{n,d,h} and returns the path as a list of link numbers; b(i,j) denotes the normalized bandwidth in use on link (i,j) and ⎣x⎦_d denotes the largest multiple of d that is ≤ x.

    list Route(i,j1,j2,w)
        if b(i,j1)+w > 1 or b(h+k-i,j2)+w > 1 then
            return [ nil ];
        fi;
        if i = h then
            Let y_{k-1} . . . y_0 be the d-ary representation of j2.
            j := ⎣σ(h,j1)⎦_d + y_{(k-h)-1};
            L := [ ];
            for r from h+1 to k−1 do
                L := L & [ j ];
                if b(r,j)+w > 1 then return [ nil ]; fi;
                j := ⎣σ(r,j)⎦_d + y_{k-(r+1)};
            rof;
            return [ j1 ] & L & [ j2 ];
        fi;
        for r from 0 to d−1 do
            L := Route(i+1, ⎣σ(i,j1)⎦_d + r, σ^{-1}(h+k-i, ⎣j2⎦_d + r), w);
            if L ≠ [ nil ] then return [ j1 ] & L & [ j2 ]; fi;
        rof;
        return [ nil ];
    end;

Note: as written, code does not update b().

The running time for the program is O(d^h (k−h)).
With bursty data sources, it can become necessary to consider the impact on queueing, making the path-finding process more complex.
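Since the routine does not update b(), a caller has to record the reservation once a path is returned. The following is a minimal Python sketch of that bookkeeping step. It assumes (this is not stated in the slides) that the position of a link number in the returned list equals the stage index of that link, and it stores b as a dictionary keyed by (stage, link). The name commit_route is hypothetical.

```python
def commit_route(b, path, w):
    """Reserve weight w on every link of a path.

    b    : dict mapping (stage, link) -> normalized bandwidth in use
    path : list of link numbers; position in the list is ASSUMED to be
           the stage index of the link
    w    : normalized session weight
    """
    # First pass: verify that no link would exceed capacity 1.
    for stage, link in enumerate(path):
        if b.get((stage, link), 0.0) + w > 1.0:
            raise ValueError(f"link ({stage},{link}) lacks capacity")
    # Second pass: commit the reservation only after all checks pass.
    for stage, link in enumerate(path):
        b[(stage, link)] = b.get((stage, link), 0.0) + w


b = {}
commit_route(b, [3, 5, 1], 0.25)
```

Checking all links before changing any of them keeps b consistent if the reservation has to be refused.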


Blocking in Static Routing Networks

A session is a triple (x,y,ω) where x is an input, y is an output and ω is called the weight; it represents the fraction of the external link bandwidth that the session requires.
Weights can be restricted so that 0 ≤ b ≤ ω ≤ B ≤ 1.
A route is a path in the network connecting x to y.
A set of sessions is compatible if the sum of the weights at each input and each output is at most 1; a set of routes is compatible if the sum of the weights on each link is at most the speedup S.
A state of a network is a set of compatible sessions and corresponding compatible routes.
A network is strictly nonblocking for given b, B, S if in all states, a new compatible session can be added without modifying any existing routes.
A network is rearrangeably nonblocking if any set of compatible sessions can be routed.
» implies that a compatible session can be routed after modifying existing routes
The special case of b=B=S=1 is called the circuit switching case.
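The compatibility condition for sessions can be checked directly from the definition. The sketch below is illustrative only: sessions are represented as (x, y, weight) tuples, and the function name and the small floating-point tolerance are assumptions.

```python
from collections import defaultdict


def sessions_compatible(sessions, limit=1.0, eps=1e-12):
    """Return True if the weights at every input and every output sum to at most limit.

    sessions : iterable of (x, y, weight) triples
    """
    in_load = defaultdict(float)
    out_load = defaultdict(float)
    for x, y, w in sessions:
        in_load[x] += w
        out_load[y] += w
    loads = list(in_load.values()) + list(out_load.values())
    return all(load <= limit + eps for load in loads)


print(sessions_compatible([(0, 1, 0.5), (0, 2, 0.5)]))   # input 0 carries exactly 1
print(sessions_compatible([(0, 1, 0.6), (2, 1, 0.5)]))   # output 1 carries 1.1
```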


The Clos Network

Let d, r, n be integers where d divides n evenly. The three-stage Clos network is defined by

$$C^3_{n,d,r} = X_{d,r} \otimes X_{n/d,\,n/d} \otimes X_{r,d}$$

Theorem. For b=B=S=1, C^3_{n,d,r} is strictly nonblocking for unicast sessions if and only if r ≥ 2d–1.

Proof. Suppose C^3_{n,d,r} is in state S and we have a new reservation (x,y,1). Let u be the stage 1 switch containing input x and notice that the number of outputs of u that are busy is at most d–1.
This means that at most d–1 middle stage switches are inaccessible from x and similarly at most d–1 middle stage switches are inaccessible from y.
Since r ≥ 2d–1 there must be at least one middle stage switch that is accessible from both x and y.
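The counting argument in the proof corresponds to a very simple path hunt in the circuit switching case: scan the middle stage switches for one that is busy on neither side. The following Python sketch illustrates this; the function name and the set-based representation of busy switches are assumptions.

```python
def find_middle_switch(busy_from_input, busy_to_output, r):
    """Return a middle stage switch usable for a new circuit, or None.

    busy_from_input : set of middle switches already used by the input's
                      first stage switch
    busy_to_output  : set of middle switches already used by the output's
                      third stage switch
    r               : number of middle stage switches
    """
    for m in range(r):
        if m not in busy_from_input and m not in busy_to_output:
            return m
    return None


# Worst case for d = 4: d-1 switches busy on each side, all distinct.
d = 4
print(find_middle_switch({0, 1, 2}, {3, 4, 5}, 2 * d - 1))  # one switch remains
print(find_middle_switch({0, 1, 2}, {3, 4, 5}, 2 * d - 2))  # none remains
```

With r = 2d−1 the worst case still leaves one switch free, while r = 2d−2 can block.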


Complexity of Clos Network

The crosspoint count for a three-stage Clos network is 2dr(n/d) + r(n/d)^2.
» if we substitute r=2d, making the network nonblocking, this becomes 2n(2d+n/d)
» the crosspoint count is minimized by taking d=(n/2)^{1/2}
» this gives a crosspoint count of 4⋅2^{1/2} n^{3/2} ≈ 5.7 n^{3/2}.

Another nonblocking network can be obtained by replacing the middle stage switches by Clos networks.

$$C^5_{n,d,r} = X_{d,r} \otimes X_{d,r} \otimes X_{n/d^2,\,n/d^2} \otimes X_{r,d} \otimes X_{r,d}$$

» C^5 is also nonblocking when r ≥ 2d−1. If r=2d and d=(2n/3)^{1/3}, the crosspoint count is ≈ 15.7 n^{4/3}.
» this generalizes to C^{2k−1}, which is nonblocking when r ≥ 2d−1. If r=2d and d=(2^{k−1}(k−1)n / (4(2^{k−1}−1)))^{1/k}, the crosspoint count is

$$4\,(2^{k-1}-1)\,\frac{k}{k-1}\left(\frac{2^{k-1}(k-1)}{4\,(2^{k-1}-1)}\right)^{1/k} n^{1+1/k}$$
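The crosspoint figures quoted above can be checked numerically. The sketch below counts crosspoints recursively for C^{2k−1} and evaluates the count at the stated optimal d; it treats n and d as real numbers for estimation purposes, and the function names are assumptions.

```python
def clos_crosspoints(n, d, r, k=2):
    """Crosspoints in C^{2k-1}_{n,d,r}; k=2 is the three-stage network.

    n and d may be non-integers, since this is only a cost estimate.
    """
    if k == 1:
        return n * n  # a single n x n crossbar
    outer = 2 * d * r * (n / d)          # first and last stages
    return outer + r * clos_crosspoints(n / d, d, r, k - 1)


def optimal_d(n, k):
    """Switch size minimizing the crosspoint count when r = 2d."""
    return (2 ** (k - 1) * (k - 1) * n / (4 * (2 ** (k - 1) - 1))) ** (1.0 / k)


n = 1024.0
for k in (2, 3):
    d = optimal_d(n, k)
    coeff = clos_crosspoints(n, d, 2 * d, k) / n ** (1 + 1.0 / k)
    print(k, round(coeff, 2))  # coefficient of n^(1+1/k)
```

The printed coefficients should be close to 5.7 for k=2 and 15.7 for k=3, in line with the slide.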


General Nonblocking Condition

Theorem. For unicast sessions, C^3_{n,d,r} is nonblocking if

$$r > \begin{cases} 2\left\lceil \dfrac{d-B}{S-B} \right\rceil - 2 \\[2ex] 2\left\lfloor \dfrac{d-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$

Proof. The network is clearly nonblocking if S>d. So, assume S≤d. Consider a session from input x to output y at rate ω.
» total load from routes that might block the new reservation is ≤ d−ω.
» a middle stage switch is inaccessible from x if the load on the link to it is > S–ω
» the number of inaccessible middle stage switches is at most ⎡((d–ω)/(S–ω))–1⎤ = ⎡(d–ω)/(S–ω)⎤ –1 (if b > S–ω, it is at most ⎣(d–ω)/b⎦).
» since there are r middle stage switches, can connect x to y if r > 2⎡(d–ω)/(S–ω)⎤ –2 (if b > S–ω, it’s sufficient to have r > 2⎣(d–ω)/b⎦)
» this always holds if r > 2⎡(d–B)/(S–B)⎤ –2 and S≤d (if b > S–ω, r > 2⎣(d–b)/b⎦).
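The bound in the theorem can be evaluated mechanically. The sketch below returns the smallest r that satisfies it, using exact rational arithmetic to avoid rounding trouble in the ceiling and floor. It reads the two cases of the theorem as alternatives selected by the test b > S−B, and it assumes S ≤ d (and S > B when the first case applies); the function name is hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def min_middle_switches(d, S, b, B):
    """Smallest r satisfying the unicast nonblocking bound for C^3_{n,d,r}.

    Pass integers, Fractions or strings such as "1/16" for exact results.
    Assumes S <= d; in the first case below it also assumes S > B.
    """
    d, S, b, B = (Fraction(v) for v in (d, S, b, B))
    if b > S - B:
        bound = 2 * floor((d - b) / b)
    else:
        bound = 2 * ceil((d - B) / (S - B)) - 2
    return bound + 1  # the theorem requires r strictly greater than the bound


print(min_middle_switches(16, 1, 1, 1))  # circuit switching case: 2d - 1
print(min_middle_switches(16, 2, 0, 1))  # speedup 2, arbitrary small sessions
```

For b=B=S=1 this reduces to r ≥ 2d−1, matching the circuit switching theorem.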


Cost of Nonblocking Clos Network

Can reduce blocking by increasing either r or S.
Increasing S reduces fragmentation.
Fragmentation is negligible if session rates are small.

[Figure: (speedup)(r/d) versus number of middle stage switches r, for n=256, d=16; curves for session rate = link rate, (link rate)/2 and (link rate)/16]


7-10 - Jon Turner - 8/6/2004

Nonblocking Condition for Benes Network

Theorem. For unicast sessions with b=0, the Benes network is nonblocking when

S ≥ (1–2/d)B + (2/d)(1+(d–1)(k–1))

Proof sketch. For unicast sessions, when a path from an input x is blocked at stage i, d^{k−i−1} middle stage switches become inaccessible.
» an adversary can minimize access to the middle stage by blocking paths early, preventing access to at most M middle stage switches, where

$$M = \frac{d^{k-2}(d-\omega)}{S-\omega} + \frac{d^{k-3}(d^2-d)}{S-\omega} + \frac{d^{k-4}(d^3-d^2)}{S-\omega} + \cdots + \frac{d^{k-1}-d^{k-2}}{S-\omega} = \frac{d^{k-2}}{S-\omega}\bigl(1+(d-1)(k-1)\bigr) - \frac{d^{k-2}\,\omega}{S-\omega}$$

» the network is nonblocking when M < n/(2d), which is true when S ≥ (1–2/d)B + (2/d)(1+(d–1)(k–1))
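The speedup bound is easy to tabulate for concrete configurations. The following Python sketch evaluates it; the helper that computes k = log_d n and the function names are assumptions.

```python
def log_d(n, d):
    """Return k such that d**k == n; raise if n is not a power of d."""
    k = 0
    while n > 1:
        if n % d:
            raise ValueError("n must be a power of d")
        n //= d
        k += 1
    return k


def benes_unicast_speedup(n, d, B):
    """Speedup sufficient for nonblocking unicast operation (b = 0)."""
    k = log_d(n, d)
    return (1 - 2 / d) * B + (2 / d) * (1 + (d - 1) * (k - 1))


# n = 4096 ports, d = 8, maximum session rate 1/16 of the link rate
print(benes_unicast_speedup(4096, 8, 1 / 16))
```

For n=4096, d=8 and B=1/16 this gives about 5.55.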


Nonblocking Condition for Benes Network

Large switch elements yield feasible speedup requirement.

[Figure: nonblocking speedup versus number of ports n (16 to 4096), for d = 2, 4, 8, 16; curves for VC rate = link rate and VC rate = (link rate)/16]


Blocking Probability

Blocking may be acceptable if it does not happen too often. For unicast and b=B=S=1 (the circuit switching case), there are simple approximations for estimating the probability of blocking.
» parallel rule: compute the blocking probability for a parallel combination of networks by multiplying the blocking probabilities of the parallel subnetworks.
» series rule: compute the blocking probability for a series combination of subnetworks by multiplying the probabilities of not blocking to obtain the complement of the blocking probability for the combination.
» these rules assume that the blocking probabilities in the subnetworks are independent, which is not strictly true, but is often fairly accurate, and in any case is guaranteed to produce an upper bound on the true blocking probability.

For subnetworks with blocking probabilities p_1, p_2, . . . , p_n:

Series: (1−p) = (1−p_1)(1−p_2) . . . (1−p_n)
Parallel: p = p_1 p_2 . . . p_n

This blocking probability estimation method is called Lee’s method.
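The two rules translate directly into code. This is a minimal sketch with hypothetical function names; it requires Python 3.8 or later for math.prod.

```python
from math import prod


def series(ps):
    """Blocking probability of subnetworks in series (independence assumed)."""
    return 1 - prod(1 - p for p in ps)


def parallel(ps):
    """Blocking probability of subnetworks in parallel (independence assumed)."""
    return prod(ps)


print(series([0.5, 0.5]))    # blocked if either link in the chain is busy
print(parallel([0.5, 0.5]))  # blocked only if both alternatives are busy
```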


Applications of Lee’s Method

To apply Lee’s method to the Clos network C^3_{n,d,r}, we must account for the fact that the links connecting to the middle stage have a lower probability of being busy than the inputs or outputs.
» Letting p denote the probability that an input or output is busy, the blocking probability estimate becomes [1−(1−pd/r)^2]^r.
» Notice that this gives a non-zero blocking probability even when r ≥ 2d−1, when we know the network is nonblocking; this emphasizes the approximate nature of Lee’s method.

For the delta network D_{n,d}, Lee’s method gives a blocking probability estimate of 1−(1−p)^{k−1}, where p is the probability that any input or output is busy and also the probability that any given internal link is busy; k=log_d n is the number of stages.

We can apply Lee’s method to the extended delta network D*_{n,d,h} in a recursive fashion, by letting P(n,d,h) be the blocking probability estimate. Then

$$P(n,d,0) = 1-(1-p)^{k-1}$$

$$P(n,d,h) = \Bigl[\,1-(1-p)^2\bigl(1-P(n/d,\,d,\,h-1)\bigr)\Bigr]^{d}$$
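The Lee estimates above can be computed with a few lines of Python. The sketch below implements the Clos estimate and the recursion for the extended delta network. Function names are assumptions, and the example treats the Benes network B_{16,2} as the extended delta network with h = k−1 = 3, which is an assumption consistent with the figure of .899 quoted later in these notes.

```python
def log_d(n, d):
    """Return k such that d**k == n (n assumed to be a power of d)."""
    k = 0
    while n > 1:
        n //= d
        k += 1
    return k


def lee_clos(p, d, r):
    """Lee estimate for C^3_{n,d,r}: middle stage links are busy w.p. p*d/r."""
    q = p * d / r
    return (1 - (1 - q) ** 2) ** r


def lee_extended_delta(n, d, h, p):
    """Lee estimate P(n,d,h) for the extended delta network."""
    if h == 0:
        return 1 - (1 - p) ** (log_d(n, d) - 1)
    sub = lee_extended_delta(n // d, d, h - 1, p)
    # each of the d parallel branches is two links in series with a subnetwork
    return (1 - (1 - p) ** 2 * (1 - sub)) ** d


print(round(lee_extended_delta(16, 2, 3, 0.5), 3))
print(lee_clos(0.5, 16, 31))  # non-zero although r = 2d - 1 is nonblocking
```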


More Accurate Blocking Probability Estimate

Lee’s method assigns “busy probabilities” to links independently of one another and consequently assigns non-zero probability to busy-idle configurations that do not correspond to real network states.

Consider the probability of blocking between an input x and an output y of the delta network D_{n,2}.
» Let z be the other input to the first stage switch that x is connected to.
» The link connecting x to the recursive subnetwork (D_{n/2,2}) is busy if z is busy and connected to that link rather than the other. This occurs with probability p/2.
» On the other hand, if the connecting link is not busy, we block only if blocking occurs in the subnetwork. Thus, if we let f(n,d) be the (approximate) blocking probability for D_{n,d}, we find

f(n,2) = p/2 + (1−p/2) f(n/2,2) = 1−(1−p/2)^{k−1}

» This is still approximate, since we are assuming (incorrectly) that the probability that z is busy does not depend on the status of y. However, it gives a better approximation than Lee’s method.

By similar reasoning, if we let

$$\alpha = \sum_{i=1}^{d-1} \frac{i}{d}\binom{d-1}{i} p^{i}(1-p)^{d-1-i} = \left(1-\frac{1}{d}\right)p$$

we find f(n,d) = α + (1–α) f(n/d,d) = 1–(1–(1–1/d)p)^{k−1}.
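As a sanity check, the sum defining α and the recursion for f(n,d) can be compared with their closed forms. This is an illustrative sketch with hypothetical names; it takes f = 0 for a single switch (n = d).

```python
from math import comb


def alpha(d, p):
    """Probability that the link from x toward the subnetwork is busy."""
    return sum((i / d) * comb(d - 1, i) * p ** i * (1 - p) ** (d - 1 - i)
               for i in range(1, d))


def f(n, d, p):
    """Improved blocking estimate for the delta network D_{n,d}."""
    if n <= d:
        return 0.0  # a single switch never blocks
    a = alpha(d, p)
    return a + (1 - a) * f(n // d, d, p)


p, d, n = 0.3, 4, 64                      # k = 3 stages
print(alpha(d, p), (1 - 1 / d) * p)       # sum versus closed form
print(f(n, d, p), 1 - (1 - (1 - 1 / d) * p) ** 2)
```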


We can make a similar improved estimate for the Benes network.

Let g(n,d) be the approximate blocking probability for B_{n,d} and consider the case of d=2.
» Let x and y be an idle input-output pair, let a be the other input to the switch that x is connected to, and let b be the other output of the switch that y is connected to.
» If a and b are both idle, x and y block with probability g(n/2,2)^2.
» If exactly one of a and b is busy, x and y block with probability g(n/2,2).
» If both are busy and they are connected to the same recursive subnetwork, then x and y block with probability g(n/2,2).
» If both are busy and they are connected to different recursive subnetworks, then x and y block with probability 1.

Thus for n>2

g(n,2) = (1−p)^2 g(n/2,2)^2 + (2p(1−p) + p^2/2) g(n/2,2) + p^2/2

Applying this method to B_{16,2} with p=0.5 yields a blocking probability estimate of .265, while Lee’s method gives .899. For larger d, the differences get smaller.
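The recursion for d=2 is short enough to evaluate directly. The sketch below assumes g(2,2)=0 (a single 2×2 switch never blocks), which is not stated explicitly in the slides but reproduces the quoted value; the function name is hypothetical.

```python
def benes2_blocking(n, p):
    """Improved blocking estimate g(n,2) for the Benes network B_{n,2}."""
    if n <= 2:
        return 0.0  # assumed base case: a single switch does not block
    g = benes2_blocking(n // 2, p)
    return (1 - p) ** 2 * g ** 2 + (2 * p * (1 - p) + p ** 2 / 2) * g + p ** 2 / 2


print(round(benes2_blocking(16, 0.5), 3))
```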


The estimate for d=2 can be generalized to arbitrary d as follows

$$g(n,d) = \sum_{i=0}^{d-1}\sum_{j=0}^{d-1} \binom{d-1}{i}\binom{d-1}{j}\, p^{\,i+j}\,(1-p)^{(d-1-i)+(d-1-j)}\, X^{d}_{n,d}(i,j)$$

where

$$X^{a}_{n,d}(i,j) = \begin{cases} 1 & \text{if } i=a \text{ or } j=a \\[1ex] g(n/d,d)^{\,a-(i+j)} & \text{if } i=0 \text{ or } j=0 \\[1ex] \dfrac{i}{a}\,\dfrac{j}{a}\,X^{a-1}_{n,d}(i-1,j-1) + \dfrac{i}{a}\left(1-\dfrac{j}{a}\right)X^{a-1}_{n,d}(i-1,j) + \left(1-\dfrac{i}{a}\right)\dfrac{j}{a}\,X^{a-1}_{n,d}(i,j-1) + \left(1-\dfrac{i}{a}\right)\left(1-\dfrac{j}{a}\right)X^{a-1}_{n,d}(i,j)\,g(n/d,d) & \text{otherwise} \end{cases}$$

Here, X^a_{n,d}(i,j) is the probability of blocking from some input x to some output y in B_{n,d} when the first stage switch connected to x has exactly i of its first a output links busy and the last stage switch connected to y has exactly j of its first a inputs busy.

These methods can be applied to other networks as well.
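One way to gain confidence in a general-d recursion of this kind is to implement it and confirm that it reduces to the d=2 formula. The Python sketch below does this under explicit assumptions: the number of busy links on each side is taken to be binomial over the d−1 other ports, the recursion on a conditions on whether the a-th link is busy on each side (probabilities i/a and j/a), and a single switch (n=d) is taken to have blocking probability 0. All names are hypothetical.

```python
from functools import lru_cache
from math import comb


def benes_blocking(n, d, p):
    """Improved blocking estimate g(n,d) for the Benes network B_{n,d}."""
    if n <= d:
        return 0.0  # assumed base case: a single switch does not block
    g_sub = benes_blocking(n // d, d, p)

    @lru_cache(maxsize=None)
    def X(a, i, j):
        # i of x's first a output links and j of y's first a input links are busy
        if i == a or j == a:
            return 1.0
        if i == 0 or j == 0:
            return g_sub ** (a - (i + j))
        fi, fj = i / a, j / a  # chance that the a-th link is busy on each side
        return (fi * fj * X(a - 1, i - 1, j - 1)
                + fi * (1 - fj) * X(a - 1, i - 1, j)
                + (1 - fi) * fj * X(a - 1, i, j - 1)
                + (1 - fi) * (1 - fj) * X(a - 1, i, j) * g_sub)

    total = 0.0
    for i in range(d):
        for j in range(d):
            weight = (comb(d - 1, i) * comb(d - 1, j)
                      * p ** (i + j) * (1 - p) ** (2 * (d - 1) - i - j))
            total += weight * X(d, i, j)
    return total


print(round(benes_blocking(16, 2, 0.5), 3))  # should agree with the d = 2 formula
print(benes_blocking(256, 4, 0.5))
```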


Estimating Blocking Probability for General Rates

Classical methods for determining blocking probability apply only to the circuit-switching case and are difficult to extend.
» distribution of session rates is rarely known with precision
» effect of blocking on distribution of rates in network is difficult to compute
» for special case of b=B, can approximate using Lee’s method

A compromise approach is to make worst-case assumptions about the distribution of session rates in the network, but assume blocked links are randomly distributed.

For C^3_{n,d,r}
» the number of blocked links leaving a first stage switch or entering a third stage switch is at most t = ⎡(βd−B)/(1−B)⎤ −1 (when b=0)
» if blocked links are randomly distributed, then the estimated probability of blocking is

$$\frac{t\,(t-1)\cdots(2t-r+1)}{r\,(r-1)\cdots(t+1)}$$

» for d=16, r=24, β=1 and B=1/16, t=16 and the blocking probability estimate is .017
» Similar analysis can be done for the Benes network.
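The worked example can be reproduced with a short calculation. The sketch below computes t and then the probability that all r−t unblocked middle switches on the input side fall among the t blocked switches on the output side, written as a ratio of binomial coefficients (equal to the product form when r ≤ 2t). Function names are hypothetical, and exact rational arithmetic is used for the ceiling.

```python
from fractions import Fraction
from math import ceil, comb


def blocked_links_bound(d, B, beta=1):
    """Worst-case number t of blocked links at a first or third stage switch (b = 0)."""
    d, B, beta = Fraction(d), Fraction(B), Fraction(beta)
    return ceil((beta * d - B) / (1 - B)) - 1


def blocking_estimate(r, t):
    """Blocking probability if t of r links are blocked at random on each side."""
    if t >= r:
        return 1.0   # every middle switch is blocked from the input side
    free = r - t     # middle switches still reachable from the input
    if free > t:
        return 0.0   # too many reachable switches for the output side to block
    return comb(t, free) / comb(r, free)


t = blocked_links_bound(16, Fraction(1, 16))
print(t, round(blocking_estimate(24, t), 4))
```

For d=16, r=24, β=1 and B=1/16 this gives t=16 and an estimate of about .0175.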


Multicast Static Routing Networks

Routing tables in each SE specify branching.
Multicast Index (MI) supplied by input port.
MI used to look up bit vector specifying branching.
» may use direct lookup, CAM or hash
» for IP, may do multicast address lookup at SEs instead of using MI
For ATM, need additional MI to VCI lookup at output.


Nonblocking Multicast Networks

A session is a triple (x,Y,ω) where x is an input, Y is a set of outputs and ω is the weight.
A route is a subtree of the network connecting x to all of Y.
A set of sessions is compatible if the sum of the weights at each input and each output is at most 1; a set of routes is compatible if the sum of the weights on each link is at most S.
A state of a network is a set of compatible sessions and corresponding compatible routes.
A network is strictly nonblocking for given b, B, S if in all states, a new compatible session or compatible extension can be routed without modifying any existing routes.
A network is reroutably nonblocking if for all states reachable by a given routing algorithm, any compatible session can be added without modifying any existing routes.
» implies that an existing route can be extended by rerouting
A network is rearrangeably nonblocking if any set of compatible sessions can be routed.


Multicast Nonblocking Condition for Clos

Theorem. For multicast connections, C^3_{n,d,r} is nonblocking if

$$r > \begin{cases} \left\lceil \dfrac{F(d-B)}{S-B} \right\rceil + \left\lceil \dfrac{d-B}{S-B} \right\rceil - 2 \\[2ex] \left\lfloor \dfrac{F(d-b)}{b} \right\rfloor + \left\lfloor \dfrac{d-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$

where F = min{r, n/d}.

Proof. The network is clearly nonblocking if S>d. So, assume S≤d. Multicast routes can branch to F middle stage switches in the worst case.
» total load that can block an input x’s access to the middle stage is ≤ F(d−ω)
» the number of middle stage switches inaccessible from x is at most ⎡(F(d–ω)/(S–ω)) −1⎤ (if b > S–ω, it is at most ⎣F(d–ω)/b⎦)
» can always add a new route if r > ⎡F(d−ω)/(S−ω)⎤ + ⎡(d–ω)/(S–ω)⎤ –2 (or r > ⎣F(d–ω)/b⎦ + ⎣(d–ω)/b⎦ if b > S–ω).
» this always holds when r > ⎡F(d–B)/(S–B)⎤ + ⎡(d–B)/(S–B)⎤ –2 if S≤d (or r > ⎣F(d–b)/b⎦ + ⎣(d–b)/b⎦ if b > S–ω).
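Because F = min{r, n/d} depends on r, the smallest r satisfying the multicast condition is most easily found by a linear search. The Python sketch below does this, reading the two cases of the theorem as alternatives selected by the test b > S−B and assuming S ≤ d (and S > B in the first case). Function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor


def multicast_bound(r, n, d, S, b, B):
    """Right-hand side of the multicast nonblocking condition for a given r."""
    F = min(r, n // d)
    if b > S - B:
        return floor(F * (d - b) / b) + floor((d - b) / b)
    return ceil(F * (d - B) / (S - B)) + ceil((d - B) / (S - B)) - 2


def min_r_multicast(n, d, S, b, B):
    """Smallest r with r > bound(r); terminates because F stops growing at n/d."""
    S, b, B = Fraction(S), Fraction(b), Fraction(B)
    r = 1
    while r <= multicast_bound(r, n, d, S, b, B):
        r += 1
    return r


print(min_r_multicast(256, 16, 1, 1, 1))  # circuit switching, no speedup
print(min_r_multicast(256, 16, 4, 0, 1))  # speedup 4, arbitrary session rates
```

The first case illustrates how expensive worst-case multicast nonblocking operation is without speedup.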


Cost of Nonblocking Multicast Clos

Most cost-effective configuration (by far) is r=1.
» but not practical in highest capacity systems
Session rate has minimal impact unless speedup is small.
Worst-case analysis is overly pessimistic for multicast.

[Figure: (speed advantage)(r/d) versus number of middle stage switches r, for n=256, session rate = link rate; curves for d = 8, 16, 32]


Extension to Asymmetric Clos Networks

We can generalize C^3 to networks with more outputs than inputs.

$$M(n_1,d_1,n_2,d_2,r) = X_{d_1,r} \otimes X_{n_1/d_1,\,n_2/d_2} \otimes X_{r,d_2}$$

Theorem. For multicast sessions, M(n1,d1,n2,d2,r) is nonblocking if

$$r > \begin{cases} \left\lceil \dfrac{F(d_1-B)}{S-B} \right\rceil + \left\lceil \dfrac{d_2-B}{S-B} \right\rceil - 2 \\[2ex] \left\lfloor \dfrac{F(d_1-b)}{b} \right\rfloor + \left\lfloor \dfrac{d_2-b}{b} \right\rfloor & \text{if } b > S-B \end{cases}$$

where F = min{r, n2/d2}.

The proof is very similar to the previous case.


Making Clos Network Reroutably Nonblocking

If we limit the amount that a route can branch in the first stage of M(n1,d1,n2,d2,r) to f<F, we obtain a reroutably nonblocking network that can be significantly less expensive than the fully nonblocking version.
» routing algorithm excludes all states with routes that branch by more than f in the first stage

Theorem. M(n1,d1,n2,d2,r) is reroutably nonblocking if

r > ⎡f(d1–B)/(S–B) –1⎤ + ⎡(d2–B)/(S–B)–1⎤ (n2/d2)^{1/f}

if b=0 and the first stage fanout is limited to f.

The key to proving this is a lemma concerning the set covering problem: given a set A={a_1,...,a_t} and a collection S={S_1,...,S_p} where each S_i is a subset of A, find the smallest possible number of sets S_i whose union equals A.

The greedy algorithm for set covering selects sets one at a time, always picking the next set that covers the most previously uncovered elements.
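The greedy algorithm just described can be sketched in a few lines of Python. The function name, the use of Python sets, and the tie-breaking rule (lowest index wins) are assumptions made for illustration.

```python
def greedy_set_cover(universe, sets):
    """Greedy set covering.

    universe : iterable of elements to cover
    sets     : list of Python sets, each a subset of the universe
    Returns the indices of the chosen sets in the order chosen,
    or None if the sets do not cover the universe.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # pick the set covering the most previously uncovered elements
        best = max(range(len(sets)), key=lambda i: len(uncovered & sets[i]))
        gain = uncovered & sets[best]
        if not gain:
            return None  # nothing left can cover the remaining elements
        chosen.append(best)
        uncovered -= gain
    return chosen


print(greedy_set_cover(range(1, 6), [{1, 2, 3}, {3, 4}, {4, 5}, {1}]))
```

In the routing application, the elements are third stage switches and each set holds the third stage switches reachable through one second stage switch.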


Set Covering Lemma

Lemma. Let A={a_1,...,a_t} and S={S_1,...,S_p} be an instance of set covering in which for some q (1≤q<p), every a_i appears in at least p−q of the sets S_j. If p > q t^{1/f} then the greedy algorithm finds a solution with no more than f sets.

Proof. Let h be the number of sets in the solution found by the greedy algorithm and assume the sets are numbered so that for 1≤i≤h, S_i is the set chosen by the greedy algorithm in step i. Define U_i = S_1 ∪ . . . ∪ S_i, let D_i = S_i – U_{i−1} and let s_i=|S_i|, u_i=|U_i| and d_i=|D_i|. Then

$$p\,s_1 \ge \sum_{i=1}^{p} s_i \ge (p-q)\,t \quad\text{so}\quad u_1 \ge (1-q/p)\,t = (1-x_0)\,t$$

where x_j=(q–j)/(p–j). Next, note that

$$(p-1)\,d_2 \ge \sum_{i=2}^{p} |S_i - U_1| \ge (p-q)(t-u_1) \quad\text{so}\quad d_2 \ge (1-x_1)(t-u_1)$$

$$u_2 = u_1 + d_2 \ge (1-x_1)\,t + x_1 u_1 \ge (1-x_1)\,t + x_1(1-x_0)\,t = (1-x_1 x_0)\,t$$

In a similar fashion, we find that for i≤h, u_i ≥ (1–x_{i−1} . . . x_1 x_0)t. In particular

$$u_f \ge (1 - x_{f-1}\cdots x_1 x_0)\,t \ge (1-x_0^{\,f})\,t > (1-(1/t))\,t = t-1$$

So, U_f has more than t–1 elements and since A has only t, U_f must also have t.


Proof of Nonblocking Theorem for M

To prove the theorem, we show that one can set up a route of weight B from an input x to some arbitrary set of outputs.

Let p be the number of second stage switches reachable from x and note that p ≥ r – ⎡f(d1–B)/(S–B)–1⎤ > ⎡(d2–B)/(S–B)–1⎤ (n2/d2)^{1/f}, using the bound on r in the statement of the theorem.

Note that each third stage switch can be prevented from reaching at most ⎡(d2–B)/(S–B)–1⎤ second stage switches.

To complete the proof, apply the lemma by letting A be the set of switches in the third stage (so t=n2/d2). We define S_j to be the set of third stage switches that can be reached by the j-th second stage switch that is reachable from x.

Observe that each third stage switch appears in at least p – ⎡(d2–B)/(S–B)–1⎤ of the S_j, so let q=⎡(d2–B)/(S–B)–1⎤.

Now, we see that q t^{1/f} = ⎡(d2–B)/(S–B)–1⎤ (n2/d2)^{1/f} and since p is larger than this, the lemma tells us that we can reach all the third stage switches through just f of the second stage switches.


Multicast Nonblocking Conditions for Benes

Theorem. For multicast sessions with b=0, B_{n,d} is nonblocking when

S ≥ (1/d)[n + (1+(d–1)(k–1)) – B]

Theorem. For multicast sessions with fanout ≤ d^i and b=0, B_{n,d} is nonblocking when

S ≥ (1/d)[d^i (1+(d–1)(k–i)) + (1+(d–1)(k–1)) – B]

The proofs of both theorems are similar to the proof for the unicast nonblocking condition (page 7-10).

For typical system configurations, the speedup needed for nonblocking operation is impractically large.
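To see how large the required speedup becomes, the two bounds can be evaluated for a sample configuration. This is a sketch with hypothetical function names; it treats the unrestricted case as fanout limit d^k = n, which makes the second formula coincide with the first.

```python
def log_d(n, d):
    """Return k such that d**k == n (n assumed to be a power of d)."""
    k = 0
    while n > 1:
        n //= d
        k += 1
    return k


def benes_multicast_speedup(n, d, B, i=None):
    """Speedup sufficient for nonblocking multicast with fanout <= d**i (b = 0).

    i=None means unrestricted fanout, i.e. i = k.
    """
    k = log_d(n, d)
    if i is None:
        i = k
    return (d ** i * (1 + (d - 1) * (k - i)) + (1 + (d - 1) * (k - 1)) - B) / d


print(benes_multicast_speedup(4096, 8, 1 / 16))       # unrestricted fanout
print(benes_multicast_speedup(4096, 8, 1 / 16, i=1))  # fanout at most d
```

For n=4096 and d=8, the unrestricted bound exceeds 500, compared with about 5.55 for unicast.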


Cascaded Benes Networks

Limit fanout to f in each Benes network.

[Figure: cascade of Benes networks B_{n,d}, B_{n,d}, . . . , B_{n,d}]

If f ≤ d, and there are log_f n subnetworks, no blocking when

$$S \ge \frac{f+1}{d}\bigl(1+(d-1)(k-1)\bigr) + B$$

When f=4, cost is about (5/4)log_2 n times the cost of a nonblocking unicast network (so 10× for n=256)


Binary Tree Multicast

Binary branching allowed in each pass.
Extend connection by inserting additional binary branch point.
Retract connection by removing binary branch point.
Requires connection re-routing, but time required is comparable to unicast.
With Benes network, no blocking if

$$S \ge \frac{3(1+\delta)}{d}\bigl(1+(d-1)(k-1)\bigr) + B$$

where δ is the multicast traffic fraction.
Only multicast method for which network complexity and time per operation are roughly comparable with unicast.


Reroutably Nonblocking Versions

By cascading two Benes networks, we obtain a reroutably nonblocking network if

S ≥ (1–1/d)B + (2/d)(1+(d–1)(k–1))

and branching is allowed only in the second network. New connections are routed through the most lightly loaded part of the second network.

Can accomplish the same thing with two passes through one network if

S ≥ B + (1+δ)(2/d)(1+(d–1)(k–1))

where δ is the fraction of traffic belonging to multicast connections
» for n=4096, d=8, δ=.1, link speed of 2.4 Gb/s and max session rate of 150 Mb/s, need S ≥ 6.12; for unicast, the speedup needed is 5.55.
» requires mechanism to avoid disruption of data stream when re-switching
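The numerical example can be reproduced from the two-pass formula. In the sketch below, B is taken as the ratio of the maximum session rate to the link speed (150 Mb/s over 2.4 Gb/s, i.e. 1/16), and the function names are hypothetical.

```python
def log_d(n, d):
    """Return k such that d**k == n (n assumed to be a power of d)."""
    k = 0
    while n > 1:
        n //= d
        k += 1
    return k


def two_pass_speedup(n, d, B, delta):
    """Speedup for reroutably nonblocking multicast with two passes."""
    k = log_d(n, d)
    return B + (1 + delta) * (2 / d) * (1 + (d - 1) * (k - 1))


B = 150e6 / 2.4e9   # maximum session rate as a fraction of the link rate
print(two_pass_speedup(4096, 8, B, 0.1))  # multicast fraction 0.1
print(two_pass_speedup(4096, 8, B, 0.0))  # delta = 0, no multicast traffic
```

The first value comes out at about 6.11, in line with the figure quoted in the example.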


Blocking Probability for Multicast

Worst-case analysis is overly pessimistic for multicast.
» based on maximum branching in “left half” of network
» in practice, expect most branching to be in right half

Determination of blocking probability is difficult.
» no “natural” assumptions for fanout or bandwidth distribution
» routing algorithms can have strong influence on results

For networks operated with fanout restriction, can estimate
» for circuit switched case, C^3_{n,d,r} with first stage fanout restriction f and r=(f+1)d–1, blocking probability is < ((d–1)/r)^f ≈ (1/(f+1))^f if busy links are randomly distributed.
» for f=3, result is ≈.016, for f=4, it’s ≈.0016, for f=5, it’s ≈.00012.
» can extend to general VC rates by assuming the worst-case number of blocked links but random distribution of blocked links

More reliable estimates can be obtained by simulation.
» results strongly dependent on traffic assumptions (fanout and session rate) and on routing algorithm used
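The fanout-restricted estimate can be tabulated to compare the bound ((d−1)/r)^f with its approximation (1/(f+1))^f. This is a small illustrative sketch; the function names and the choice d=16 are assumptions.

```python
def fanout_blocking_bound(d, f):
    """Bound ((d-1)/r)^f with r = (f+1)d - 1, circuit switched case."""
    r = (f + 1) * d - 1
    return ((d - 1) / r) ** f


def fanout_blocking_approx(f):
    """Approximation (1/(f+1))^f, independent of d."""
    return (1 / (f + 1)) ** f


for f in (3, 4, 5):
    print(f, fanout_blocking_bound(16, f), fanout_blocking_approx(f))
```

For every f, the bound is slightly below the approximation, since (d−1)/r < 1/(f+1).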
