Parallel nested dissection for path algebra computations


Volume 5, Number 4 OPERATIONS RESEARCH LETTERS October 1986

PARALLEL NESTED DISSECTION FOR PATH ALGEBRA COMPUTATIONS

Victor PAN *

Computer Science Department, SUNYA, Albany, NY 12222, USA

John REIF **

Aiken Computation Laboratory, Harvard University, Cambridge, MA 02138, USA

Received February 1986
Revised July 1986

This paper extends the authors' parallel nested dissection algorithm of [13] originally devised for solving sparse linear systems. We present a class of new applications of the nested dissection method, this time to path algebra computations (in both cases of single source and all pair paths), where the path algebra problem is defined by a symmetric matrix $A$ whose associated graph $G$ with $n$ vertices is planar. We substantially improve the known algorithms for path algebra problems of that general class; this has further applications to maximum flow and minimum cut problems in an undirected planar network and to the feasibility testing of a multicommodity flow in a planar network.

graph computations * path algebras * parallel algorithms * network flow

1. Introduction

In this paper we substantially improve the known parallel algorithms for several problems of practical interest which can be reduced to path algebra computations. Gondran and Minoux [4, pp. 41-42, 75-81] list the applications of path algebras to the problems of: vehicle routing, investment and stock control, dynamic programming with discrete states and discrete time, network optimization, artificial intelligence and pattern recognition, labyrinths and mathematical games, encoding and decoding of information; compare also Lawler [8], Tarjan [17,18]. [4, pp. 84-102] thoroughly investigates general algorithms for such problems based on matrix operations in dioids, see next sections. We propose a substantial improvement of these general algorithms in the important case where the input matrix $A$ is associated with an undirected planar graph or, more generally, with a graph from the class of graphs having small separator families; see Definition 2 of Section 4 and compare Lipton, Rose and Tarjan [10], Pan and Reif [13,15].

Our improvement relies on our extension of the nested dissection parallel algorithm of [13] to path algebra problems (originally the algorithm was applied in [13] to linear systems of equations to extend the sequential algorithm of [10] for the same problem; then, in [15], the algorithm was extended to the least-squares and linear programming computations). Our new extension is somewhat surprising because the divisions and subtractions of the original algorithm of [13] are not generally allowed in the dioid; furthermore, for that reason we can extend (to dioids) neither the special recursive factorization of the input matrix $A$ from [13] nor the factorization of $A$ from [10], but we do extend the special recursive factorization of the inverse matrix $A^{-1}$ of [13] to the similar factorization of the quasi-inverse $A^*$. The quasi-inverse $A^*$ can be a dense matrix, so (unlike [4] and like [13] and [15]) we avoid explicitly computing $A^*$ and exploit its recursive factorization when we need to solve the single source path problems.

As in [4], we define the algorithms over dioids (or semirings); respectively, we estimate the computational cost in terms of dioid operations. We assume a customary machine model of parallel computation, where on every parallel step each processor performs at most one operation of the considered class; in our case this means at most one operation of the dioid. In the major specific applications to the classes of the problems of existence, optimization and counting (see the next section), an operation over a dioid is an addition, a multiplication or a comparison of two numbers; the numbers involved in the computation by our algorithms require about the same precision (number of binary digits) as the input values.

* Supported by NSF Grant DCR-8507573.
** Supported by Office of Naval Research Contract N00014-80-C-[illegible] and by NSF Grant DCR-[illegible].

Table 1 shows our substantial improvement of the known algorithms in both sequential and parallel settings. (In case of parallel algorithms we assume that $G$ is given with its $O(\sqrt{n})$-separator family.) Further applications lead, in particular, to computing a maxflow and a mincut in an undirected planar network using $O(\log^4 n)$ parallel steps, $n^{1.5}/\log n$ processors or alternatively $O(\log^3 n)$ steps, $n^2/\log n$ processors, versus the known bounds, $O(\log^2 n)$ and $n^4$, of Johnson and Venkatesan [6].

The estimates of that table hold in case of planar graphs and of all graphs having an $O(\sqrt{n})$-separator family (see definition in Section 4); that family is assumed given or readily computable. In some cases (grid graphs) the separator family is defined immediately. For a planar graph such a family can be computed in $O(n)$ sequential time using the algorithm of Lipton and Tarjan [9]. $O(n)$ is a minor contribution to the total sequential computational complexity of the problem, for which we have the bounds $O(n^2)$ (all pairs) and $O(n^{1.5})$ (single source); see Table 1. Thus our algorithms substantially improve the known sequential time estimates for the most general path algebra computations for planar graphs. They also decrease the bounds on the parallel complexity of such computations provided that the separator family is available. In many cases several computations must be performed for the same graph $G$, having variable edge weights; then one may precompute an $O(\sqrt{n})$-separator family of $G$, provided that there exists such a family. (In case of a planar graph that preconditioning can be reduced to the breadth first search (Miller [11]).) Whenever an $O(\sqrt{n})$-separator family of $G$ has been precomputed, we may apply parallel algorithms to solve general path algebra problems using polylogarithmic time and having processor bounds less than the sequential time bound; see Table 1.

In particular all the bounds of the table can be applied to the shortest path computation in an undirected planar graph (network) $G$. This does not improve the known sequential time bounds for that specific problem, $O(n\sqrt{\log n})$ (single source) and $O(n^2)$ (all pairs) (Frederickson [3]). However, our parallel complexity bounds, $O(\log^3 n)$ parallel steps and $n^{1.5}/\log n$ processors (single source shortest paths) and $n^2/\log n$ processors (all pair shortest paths), substantially improve the known estimates (so far the polylog time parallel algorithms for both problems of all pair shortest paths and single source shortest paths in planar graphs require $n^3$ processors, as in the general paths computation; compare Table 1). Furthermore, our parallel algorithms for shortest paths in planar graphs, combined with the results of Klein and Reif [7], lead to a new parallel algorithm for the evaluation of a maximum flow and a minimum cut in $G$ using $O(\log^4 n)$ steps and $n^{1.5}/\log n$ processors or alternatively $O(\log^3 n)$ steps and $n^2/\log n$ processors, to compare with the previous bounds of $O(\log^2 n)$ steps, $n^4$ processors (Johnson and Venkatesan [6]). Further applications can be obtained, in particular to feasibility testing of a multicommodity flow in a planar network; see [14].

Table 1

                 Previous algorithms [4]                        New algorithms
                 Sequential time  Parallel time  Processors     Sequential time  Parallel time  Processors
Single source    O(n^2)           O(log^2 n)     n^3            O(n^1.5)         O(log^3 n)     n^1.5/log n
All pair         O(n^3)           O(log^2 n)     n^3            O(n^2)           O(log^3 n)     n^2/log n

In the next section we define dioids and state some path algebra problems; in Section 3 we estimate the computational cost of solving those problems in case of general graphs. (In both Sections 2 and 3 we follow [4].) In Section 4 we present our improvement of the known algorithms for those problems for the graphs having small separator families and specify parallel algorithms for the shortest path problems in an undirected planar network, with the applications to computing maxflow and mincut in Section 5.

2. Path algebra problems

We start with the special case of the shortest path problem in a graph $G$ with $n$ vertices defined by an $n \times n$ matrix $A = [a_{ij}]$ of non-negative arc lengths, where $a_{ij} = \infty$ if there is no arc between the vertices $i$ and $j$ in $G$. ($A$ is a symmetric matrix if $G$ is an undirected graph.) We seek the vector $x = [x(i)]$ of distances $x(i)$ (the lengths of the shortest paths) from vertex 1 to all vertices $i$ in $G$. This is the single source shortest path problem (s.s.s.p.p.). The distances satisfy the system $x(1) = 0$, $x(i) = \min_j (x(j) + a_{ji})$, $i = 2, \ldots, n$, or equivalently,

$$x(1) = \min\bigl(\min_j (x(j) + a_{j1}),\ 0\bigr),$$
$$x(i) = \min\bigl(\min_j (x(j) + a_{ji}),\ \infty\bigr), \quad i = 2, \ldots, n.$$

We substitute $\oplus$ for min and $\otimes$ for + and rewrite that system as follows:

$$x(1) = \sum_j x(j) \otimes a_{j1} \oplus 0,$$
$$x(i) = \sum_j x(j) \otimes a_{ji} \oplus \infty, \quad i = 2, \ldots, n,$$

or, in matrix notation, denoting $i^{(1)} = [0, \infty, \ldots, \infty]$,

$$x = x \otimes A \oplus i^{(1)}. \qquad (1)$$

Similarly, seeking the matrix $X = [x(i,j)]$ of distances between all pairs of vertices in $G$ (this is the all pair shortest path problem, a.p.s.p.p.) and denoting $I = [\delta_{ij}]$, $\delta_{ii} = 0$, $\delta_{ij} = \infty$ if $i \ne j$, we arrive at the following matrix equation:

$$X = X \otimes A \oplus I. \qquad (2)$$

Restricting (2) to the $h$th row we arrive at the s.s.s.p.p. of computing the distances from the vertex $h$ to all vertices in $G$ (for $h = 1$ we arrive back at (1)), so an a.p.s.p.p. can be reduced to $n$ s.s.s.p.p.'s.

The known algorithms of linear algebra can be extended to solve the systems (1) and (2); this turns most of them into known algorithms for the s.s.s.p.p. and/or the a.p.s.p.p. Here are two examples.

Algorithm 1. Set $x^{(0)} = i^{(1)}$; compute $x^{(k+1)} = x^{(k)} \otimes A \oplus i^{(1)}$, $k = 0, 1, \ldots$, until $x^{(k+1)} = x^{(k)}$; then output the vector $x = x^{(k)}$ satisfying (1).

Algorithm 1 extends Jacobi's method of linear algebra and amounts to the algorithm of Bellman [1] for the s.s.s.p.p.
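The iteration of Algorithm 1 can be sketched in a few lines over the dioid with $\oplus = \min$ and $\otimes = +$. This is only an illustration under stated assumptions: the function name `single_source_distances`, the 0-based vertex numbering and the 4-vertex network are hypothetical and do not come from the paper.

```python
import math

INF = math.inf  # the dioid zero for (min, +); e is 0


def single_source_distances(A, source=0):
    """Iterate x <- x (x) A (+) i^(source) over (min, +) until x stops changing."""
    n = len(A)
    unit = [INF] * n
    unit[source] = 0.0  # coordinate vector: e at the source, zero elsewhere
    x = unit[:]
    for _ in range(n):  # with non-negative lengths, n - 1 sweeps are enough
        new_x = [
            min(unit[i], min(x[j] + A[j][i] for j in range(n)))
            for i in range(n)
        ]
        if new_x == x:  # stopping test of Algorithm 1
            break
        x = new_x
    return x


# Hypothetical undirected network: symmetric A, INF marks a missing arc.
A = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, 6.0],
    [4.0, 2.0, INF, 3.0],
    [INF, 6.0, 3.0, INF],
]
print(single_source_distances(A))
```

Each sweep is one vector-matrix product in the dioid followed by the addition of the coordinate vector, so the sketch performs the same operations that the cost estimates of Section 3 count.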

Algorithm 2.
(a) Set $A^{[0]} = A$.
(b) For $k = 0, 1, \ldots, n-1$, compute $a_{ij}^{[k+1]} = a_{ij}^{[k]} \oplus a_{i,k+1}^{[k]} \otimes a_{k+1,j}^{[k]}$, $i, j = 1, \ldots, n$.
(c) Output $X = A^{[n]} \oplus I$. (The matrix $X$ satisfies (2).)

Algorithm 2 extends Jordan's algorithm of linear algebra and amounts to the algorithm of Floyd [2] for the a.p.s.p.p.; compare also Algorithm 4 (in Section 4).
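A minimal sketch of Algorithm 2 over the same $(\min, +)$ dioid follows. The function name `all_pairs_distances` and the sample matrix are assumptions made for illustration; pivots are numbered from 0 rather than from 1.

```python
import math

INF = math.inf


def all_pairs_distances(A):
    """Steps (a)-(c) of Algorithm 2 over (min, +): n pivoting steps, then add I."""
    n = len(A)
    D = [row[:] for row in A]  # A^[0] = A
    for k in range(n):  # pivot on vertex k
        D = [
            [min(D[i][j], D[i][k] + D[k][j]) for j in range(n)]
            for i in range(n)
        ]
    # X = A^[n] (+) I: the identity has e = 0 on the diagonal, INF elsewhere.
    return [
        [min(D[i][j], 0.0 if i == j else INF) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical undirected network: symmetric A, INF marks a missing arc.
A = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, 6.0],
    [4.0, 2.0, INF, 3.0],
    [INF, 6.0, 3.0, INF],
]
X = all_pairs_distances(A)
for row in X:
    print(row)
```

Row $h$ of the output is the solution of the s.s.s.p.p. with source $h$, which mirrors the reduction of the a.p.s.p.p. to $n$ single source problems.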

Other problems listed above can be also reduced to the linear systems (1) or (2) or to similar computations. We need to recall the general concept already implicitly used in our reduction of the s.s.s.p.p. to (1) and of the a.p.s.p.p. to (2).

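Definition 1. A dioid (sometimes called a semiring) is a set $S$ with two operations, $\oplus$ and $\otimes$, such that for any triplet of elements $a, b, c \in S$ and for two special elements $e$ (unity) and $\varepsilon$ (zero) of $S$, the following equations hold:

$$a \oplus b = b \oplus a \in S, \quad (a \oplus b) \oplus c = a \oplus (b \oplus c),$$
$$a \oplus \varepsilon = a, \quad a \otimes b \in S, \quad (a \otimes b) \otimes c = a \otimes (b \otimes c),$$
$$a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c),$$
$$(b \oplus c) \otimes a = (b \otimes a) \oplus (c \otimes a).$$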

In the above reduction of the s.s.s.p.p. to (1) and of the a.p.s.p.p. to (2), we used the dioid where $S = R \cup \{\infty\}$, $R$ being the set of real numbers, $\oplus = \min$, $\otimes = +$, $e = 0$, $\varepsilon = \infty$. To extend (1) and (2) to arbitrary dioids we define that

$$i^{(1)} = [e, \varepsilon, \ldots, \varepsilon], \quad I = [\delta_{ij}], \quad \delta_{ii} = e, \quad \delta_{ij} = \varepsilon \ \text{if } i \ne j. \qquad (3)$$

Let us list some classes of path problems, which can be reduced to solving the systems (1) and (2) or to similar matrix operations in appropriate dioids:

(i) existence (problems of connectivity),
(ii) enumeration (elementary paths, multicriteria problems, generation of regular languages),
(iii) optimization (paths of maximum capacity, paths with minimum number of arcs, shortest paths, longest paths, paths of maximum reliability, reliability of a network),
(iv) counting (counting of paths, Markov chains),
(v) optimization and post-optimization (problems of $k$th path, $\eta$-optimal paths).

Specifically, the class (i) includes the problems of

(a) the existence of paths having $k$ (or at most $k$) arcs (for a given $k$) between vertices $i$ and $j$ in a given (di)graph $G$,
(b) computing the transitive closure of $G$,
(c) testing $G$ for being strongly connected and for having circuits.

An appropriate dioid for problems of class (i) is the Boolean algebra, $S = \{0, 1\}$, $\oplus = \max$, $\otimes = \min$, $\varepsilon = 0$, $e = 1$, and $A$ is the incidence matrix $A = [a_{ij}]$ of $G$, $a_{ij} = 1$ if and only if $\{i, j\}$ is an arc of $G$.
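To make the role of the dioid concrete, the sketch below packages $\oplus$, $\otimes$, $\varepsilon$ and $e$ into one small structure and reuses the same matrix code for the Boolean algebra and for $(\min, +)$. The names `Dioid`, `mat_mul`, `mat_add`, `identity` and `closure`, and both sample digraphs, are hypothetical; summing the powers $I, A, \ldots, A^{n-1}$ is used here only because it is the most direct way to collect all paths with at most $n - 1$ arcs.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Dioid:
    add: Callable[[Any, Any], Any]  # the operation (+)
    mul: Callable[[Any, Any], Any]  # the operation (x)
    zero: Any                       # epsilon
    one: Any                        # e


BOOLEAN = Dioid(add=max, mul=min, zero=0, one=1)
MIN_PLUS = Dioid(add=min, mul=lambda a, b: a + b, zero=float("inf"), one=0.0)


def identity(d, n):
    return [[d.one if i == j else d.zero for j in range(n)] for i in range(n)]


def mat_add(d, X, Y):
    return [[d.add(a, b) for a, b in zip(r, s)] for r, s in zip(X, Y)]


def mat_mul(d, X, Y):
    out = []
    for i in range(len(X)):
        row = []
        for j in range(len(Y[0])):
            acc = d.zero
            for k in range(len(Y)):
                acc = d.add(acc, d.mul(X[i][k], Y[k][j]))
            row.append(acc)
        out.append(row)
    return out


def closure(d, A):
    """Accumulate I (+) A (+) ... (+) A^(n-1) term by term."""
    n = len(A)
    total = identity(d, n)
    power = identity(d, n)
    for _ in range(n - 1):
        power = mat_mul(d, power, A)
        total = mat_add(d, total, power)
    return total


# Hypothetical digraph with arcs 0 -> 1 and 1 -> 2.
arcs = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(closure(BOOLEAN, arcs))  # reachability (transitive closure with loops)

INF = float("inf")
lengths = [[INF, 2.0, 9.0], [INF, INF, 3.0], [INF, INF, INF]]
print(closure(MIN_PLUS, lengths))  # shortest distances
```

Only the four fields of `Dioid` change between the two calls, which is the point of stating the algorithms over an abstract dioid.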

The subclass of shortest path problems in (iii) includes s.s.s.p.p., a.p.s.p.p. (also in the versions where the shortest paths are required to have $k$ or at most $k$ arcs) and testing a graph for having circuits of negative length.

Class (iv) includes counting the numbers of

(a) distinct paths having $k$ (or at most $k$) arcs between $i$ and $j$ in $G$,
(b) all the distinct paths between $i$ and $j$ in $G$,
(c) all the circuits in $G$ having a given number of arcs.

For that class we choose the dioid where $S$ is the set of integers, $\oplus = +$, $\otimes = \times$ (that is, $\oplus$ and $\otimes$ are the conventional addition and multiplication, respectively), $\varepsilon = 0$, $e = 1$; $A = [a_{ij}]$, $a_{ij} = 1$ if and only if $\{i, j\}$ is an arc of $G$; see further in [4, pp. 91, 94-102].

3. Computational complexity of path problems for general graphs

The solution of most of the path problems listed in the previous section can be reduced to the evaluation (over the dioid) of the entries of the matrix $A^{(k)}$ (the all pair path problems) or of the vector $bA^{(k)}$ (the single source path problems) for some positive $k$, usually for $k = n - 1$. Here

$$A^{(q+1)} = A^{(q)} \oplus A^{q+1}, \quad q = 0, 1, \ldots,$$

$A^{(0)} = I$ (see (3)), $A$ is an $n \times n$ input matrix, $b = i^{(h)}$ is a fixed coordinate vector of dimension $n$. Here and hereafter we assume that all computations, in particular computing matrix sums, products and powers, are performed over the dioid associated with a given path problem. For every incidence matrix $A$ of the path problems listed in the previous section there exists the quasi-inverse matrix

$$A^* = \lim_{q \to \infty} A^{(q)}, \qquad (4)$$

except for the shortest path and multicriteria problems where there exist circuits of negative lengths in $G$ and for counting problems where there exists a circuit in $G$. In both latter cases the existence of such circuits is detected via computing $A^k$ or $(I \oplus A)^k$ over the dioids. Hereafter we will consider the most important case where there exists $A^*$, the quasi-inverse of $A$, and moreover where

$$A^* = A^{(n-1)}, \quad A^{(q)} = A^{(n-1)} \ \text{for } q \ge n. \qquad (5)$$

(Our estimates of this section for the cost of the evaluation of $A^*$ and $bA^*$ under (5) can be immediately extended to the case of the evaluation of $A^q$, $A^{(q)}$ and $bA^{(q)}$ for $q \le n - 1$.) The eqs. (5) arise naturally where $A^*$ is the incidence matrix of the transitive closure of the graph of $A$ (this is the case for the existence and connectivity problems). (5) implies that

$$A^* = I \oplus A \oplus A^2 \oplus \cdots \oplus A^{n-1} = (I \oplus A)(I \oplus A^2)(I \oplus A^4) \cdots (I \oplus A^{2^k}), \quad k = \lceil \log_2 n \rceil - 1.$$

Thus $k + 1$ matrix additions and $2k$ matrix multiplications suffice, which means $(4nk - k + 1)n^2$ operations in the dioid. (The known fast matrix multiplication algorithms, see Pan [12], cannot be generally applied over dioids.) For many dioids the operation $\oplus$ is idempotent, that is, $a \oplus a = a$ for all $a \in S$. In that case

$$A^* = \sum_{r=0}^{n-1} A^r = \sum_{r=0}^{2^k} A^r = (I \oplus A)^{2^k}$$

(where $\sum$ denotes a sum in the dioid, $k = \lceil \log_2 n \rceil$), so $A^*$ can be computed (via repeated squaring of $I \oplus A$) using only $k - 1$ matrix multiplications and a single addition of the two matrices $A$ and $I$, that is, using a total of $n^2(k - 1)(2n - 1) + n$ operations. It is easy to parallelize these two known algorithms, which yields a rather efficient parallel scheme, with $O((k - 1) \log n)$ steps, $\lceil 2n^3/\log n \rceil$ processors, for the evaluation of $A^*$ where $A$ is a dense matrix. If $A$ is sparse, then the above ways are relatively less effective, for the sparsity of $A$ is not generally preserved during the computation.
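For an idempotent dioid the repeated squaring of $I \oplus A$ can be sketched as follows over $(\min, +)$. The helper names and the sample matrix are hypothetical. The loop below simply squares until the exponent reaches $n - 1$; this stopping rule is a choice made for the sketch and is not meant to reproduce the exact multiplication count stated above.

```python
import math

INF = math.inf


def min_plus_product(X, Y):
    n = len(X)
    return [
        [min(X[i][k] + Y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def quasi_inverse_by_squaring(A):
    """Return ((I (+) A)^(2^t), t) with 2^t >= n - 1, over (min, +)."""
    n = len(A)
    # I (+) A: take the minimum with e = 0 on the diagonal only.
    M = [
        [min(A[i][j], 0.0) if i == j else A[i][j] for j in range(n)]
        for i in range(n)
    ]
    exponent, squarings = 1, 0
    while exponent < n - 1:
        M = min_plus_product(M, M)
        exponent *= 2
        squarings += 1
    return M, squarings


# Hypothetical undirected network: symmetric A, INF marks a missing arc.
A = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, 6.0],
    [4.0, 2.0, INF, 3.0],
    [INF, 6.0, 3.0, INF],
]
star, squarings = quasi_inverse_by_squaring(A)
print(squarings)
for row in star:
    print(row)
```

The intermediate matrices fill in quickly even though `A` is sparse, which illustrates the remark that sparsity is not preserved by this approach.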

Computing $bA^*$ (the single source path problems) can be reduced to $n$ successive postmultiplications of the vectors $b\sum_{r=0}^{k} A^r$ by the matrix $A$ for $k = 0, 1, \ldots, n - 1$ (and to $n - 1$ vector additions in the case where the operation $\oplus$ is not idempotent in the dioid), that is, to a total of $(2D(A) - n)(n - 1)$ operations in the dioid (or of $2D(A)(n - 1)$ operations in the case of non-idempotent $\oplus$), provided that the operations in the dioid are not counted if at least one of the operands is zero, $\varepsilon$. Here $D(A)$ denotes the number of non-zero entries of $A$. This way we exploit sparsity of $A$ to some extent; note, however, that the above estimates translate into $O(n \log n)$ parallel steps (substantially more than $O(\log^2 n)$ achievable even in the dense case) and $D(A)$ processors and that $n$ times more operations and processors are needed to extend this to computing $A^*$.
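The way sparsity enters the single source computation can be illustrated in the counting dioid (ordinary $+$ and $\times$ on integers), where $\oplus$ is not idempotent and the vector additions are needed. The sketch stores only the non-zero entries of $A$; the function name, the dictionary layout and the acyclic sample digraph are assumptions for illustration (the digraph is acyclic so that $A^*$ exists).

```python
def single_source_path_counts(arcs, n, source=0):
    """Return (b A*, multiplications used) over the counting dioid (+, x).

    arcs maps (i, j) to the non-zero entry a_ij; b is the coordinate
    vector of the source vertex.
    """
    term = [0] * n
    term[source] = 1      # b A^0 = b
    total = term[:]       # running sum b (I + A + ... + A^k)
    mults = 0
    for _ in range(n - 1):
        nxt = [0] * n
        for (i, j), a in arcs.items():
            if term[i] != 0:  # operations with a zero operand are skipped
                nxt[j] += term[i] * a
                mults += 1
        term = nxt            # now term = b A^(k+1)
        total = [t + u for t, u in zip(total, term)]  # vector addition
    return total, mults


# Hypothetical acyclic digraph on 4 vertices with D(A) = 5 non-zero entries.
arcs = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (1, 3): 1, (2, 3): 1}
counts, mults = single_source_path_counts(arcs, 4)
print(counts, mults)
```

Each sweep touches every stored arc at most once, so the number of multiplications stays below $D(A)(n - 1)$, in line with the operation count quoted above.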

4. Solving path problems for graphs with small separators

Definition 2. Compare [10] and [13]. A graph $G = (V, E)$ has an $s(n)$-separator family if either $|V| \le n_0$ (for a constant $n_0$) or deleting some separator set $S$ of vertices, such that $|S| \le s(|V|)$, partitions $G$ into two disconnected subgraphs with the vertex sets $V_1$ and $V_2$, such that $|V_i| \le \alpha |V|$, $i = 1, 2$, $\alpha$ is a constant, $\alpha < 1$, and furthermore each of the two subgraphs of $G$ defined by the vertex sets $S \cup V_i$, $i = 1, 2$, also has an $s(n)$-separator family.

The grid graphs on a $d$-dimensional hypercube have $(n^{1-1/d})$-separator families, which are readily available; in particular, square grids have $\sqrt{n}$-separator families. An undirected planar graph has a $\sqrt{8n}$-separator family, which can be computed in $O(n)$ time [9].
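For a square grid the separator is easy to write down: the middle column. The sketch below builds it for one level of the recursion and exposes the pieces needed to check the conditions of Definition 2. The function names and the choice of the middle column are illustrative assumptions, and only the top level of the separator family is produced.

```python
def grid_separator(rows, cols):
    """Split a rows x cols grid graph by its middle column.

    Vertices are (row, column) pairs; two vertices are adjacent when they
    differ by one in exactly one coordinate.
    """
    mid = cols // 2
    S = [(r, mid) for r in range(rows)]
    V1 = [(r, c) for r in range(rows) for c in range(mid)]
    V2 = [(r, c) for r in range(rows) for c in range(mid + 1, cols)]
    return S, V1, V2


def adjacent(u, v):
    return abs(u[0] - v[0]) + abs(u[1] - v[1]) == 1


S, V1, V2 = grid_separator(8, 8)
print(len(S), len(V1), len(V2))
```

Applying the same split to the grids induced on $S \cup V_1$ and $S \cup V_2$, alternating between columns and rows, would give the remaining levels of the family.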

In this section we consider a path problem for an undirected graph $G = (V, E)$ given together with its $s(n)$-separator family, $s(n) = n^{\sigma}$, $\sigma < 1$. In that case we may further reduce the computational cost of computing $A^*$ and $bA^*$ using the nested dissection algorithms of [10] for sequential computation and of [13] for parallel computation.

The nested dissection algorithms of [10] and [13] require to invert some auxiliary matrices. The dioid elements and matrices in dioids may have no inverses, but we compute quasi-inverses of matrices over dioids applying the following generalized Jordan elimination algorithm, which requires only the operations $\oplus$ and $\otimes$ [4, p. 110].

Algorithm 3 (evaluation of $A^*$). Set $A^{[0]} = A$, $B^{[0]} = I$ and recursively compute $A^{[k]} = M^{[k]}A^{[k-1]}$, $B^{[k]} = M^{[k]}B^{[k-1]}$ for $k = 1, 2, \ldots, n$. Here $M^{[k]}$ is obtained from the matrix $I$ of (3) by replacing the $(k, k)$ entry of $I$ by $(a_{kk}^{[k-1]})^*$ and by replacing other entries of the $k$th column of $I$ by the entries of the vector $[a_{ik}^{[k-1]} \otimes (a_{kk}^{[k-1]})^*]$, where $i \ne k$, $i = 1, \ldots, n$. Output $B^{[n]}$.

Algorithm 3 extends the Jordan elimination scheme for linear systems, avoiding subtractions and replacing the reciprocals $1/(1 - a_{kk}^{[k-1]})$ by $(a_{kk}^{[k-1]})^*$.

Lemma 1. Let $B^{[n]}$ be computed by Algorithm 3. Then $A^* = B^{[n]}$.

Lemma 1 is proven in Pan and Reif [14] using in particular the next simple lemma (which can be immediately verified; see (4)).

Lemma 2. $A^*$ satisfies (2).

$A^{[n]} = B^{[n]}A$ (obvious), therefore $A^* = A^{[n]} \oplus I$, so we may dispense with the evaluation of $B^{[k]}$ and simplify Algorithm 3 as follows.


Algorithm 4 (evaluation of $A^*$).
(a) Set $A^{[0]} = A$.
(b) For $k$ from 1 to $n$,

$$a_{kk}^{[k]} = (a_{kk}^{[k-1]})^*,$$
$$a_{ij}^{[k]} = a_{ij}^{[k-1]} \oplus a_{ik}^{[k-1]} \otimes a_{kk}^{[k]} \otimes a_{kj}^{[k-1]}$$

for all $i, j$ except for $i = j = k$.
(c) Output $A^* = A^{[n]} \oplus I$.

Algorithm 4 turns into Algorithm 2 of Section 2 for the dioids where $a_{kk}^{[k]} = e$; this includes the dioid associated with the shortest path problems.
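The elimination can be sketched generically by passing the dioid operations and a scalar quasi-inverse as functions. This is a hedged illustration, not the authors' code: the name `jordan_quasi_inverse` and both `star` functions are assumptions, and the sketch applies one update formula to every entry, including $(k, k)$, which corresponds to the product $M^{[k]}A^{[k-1]}$ of Algorithm 3 followed by $A^* = A^{[n]} \oplus I$.

```python
import math
import operator

INF = math.inf


def jordan_quasi_inverse(A, add, mul, star, zero, one):
    """Generalized Jordan elimination over a dioid; returns A^[n] (+) I."""
    n = len(A)
    M = [row[:] for row in A]
    for k in range(n):
        pivot = star(M[k][k])  # (a_kk^[k-1])*
        # Every entry on the right-hand side is taken from step k - 1.
        M = [
            [add(M[i][j], mul(mul(M[i][k], pivot), M[k][j])) for j in range(n)]
            for i in range(n)
        ]
    return [
        [add(M[i][j], one if i == j else zero) for j in range(n)]
        for i in range(n)
    ]


def star_min_plus(a):
    # Assumes a >= 0 (no negative circuits), so a* = min(0, a, 2a, ...) = 0.
    return 0.0


def star_counting(a):
    # In the counting dioid a* = 1 + a + a^2 + ... exists here only for a = 0.
    if a != 0:
        raise ValueError("circuit found: quasi-inverse does not exist")
    return 1


# Hypothetical undirected network for (min, +).
A = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, 6.0],
    [4.0, 2.0, INF, 3.0],
    [INF, 6.0, 3.0, INF],
]
dist = jordan_quasi_inverse(A, min, operator.add, star_min_plus, INF, 0.0)

# Hypothetical acyclic digraph for the counting dioid.
C = [[0, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1], [0, 0, 0, 0]]
paths = jordan_quasi_inverse(C, operator.add, operator.mul, star_counting, 0, 1)
print(dist[0], paths[0])
```

With `star_min_plus` every pivot equals $e$, so the update reduces to the Floyd step of Algorithm 2, as the text observes.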

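To compute $A^* = P^T A_0^* P$ in parallel, we use recursive factorization for the matrix $A_0^* = (PAP^T)^*$, extending the recursive factorization from Section 4 of [13] for the inverse $A_0^{-1}$. Here $P$ is the permutation matrix obtained at the ordering stage of the nested dissection algorithm of [13] applied to the input matrix $A$. Here and hereafter $W^*$ and $W^T$ denote the quasi-inverse and the transpose of a matrix $W$, respectively; $O$ denotes the null matrix, filled with the zeros $\varepsilon$; $I$ denotes the identity matrices (3) of appropriate sizes. Here is our recursive factorization, where $h = 0, 1, \ldots, d$:

$$A_h = \begin{bmatrix} X_h & Y_h^T \\ Y_h & Z_h \end{bmatrix}, \quad A_{h+1} = Z_h \oplus Y_h X_h^* Y_h^T, \qquad (6)$$

$$A_h^* = \begin{bmatrix} I & X_h^* Y_h^T \\ O & I \end{bmatrix} \begin{bmatrix} X_h^* & O \\ O & A_{h+1}^* \end{bmatrix} \begin{bmatrix} I & O \\ Y_h X_h^* & I \end{bmatrix}. \qquad (7)$$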

Let us verify that (7) indeed defines $A_h^*$. Expand the right side of (7), deleting $h$ and replacing $h + 1$ by 1 in the subscripts, to simplify the notation,

$$W = \begin{bmatrix} X^* \oplus X^* Y^T A_1^* Y X^* & X^* Y^T A_1^* \\ A_1^* Y X^* & A_1^* \end{bmatrix}. \qquad (8)$$

Lemma 3 ([14], see Remark 2 below). Let $A = \begin{bmatrix} X & Y^T \\ Y & Z \end{bmatrix}$ and $W$ be defined by (8). Let there exist the quasi-inverses $A_1^*$, $X^*$. Then $WA \oplus I = W$.

Lemma 3 implies that $W = A^*$ under (8) (provided that there exist quasi-inverses $A_1^*$, $X^*$). This substantiates the validity of the recursive factorization (6), (7). (Note that computing $a^*$ for $a \in S$ may require more than one operation in a dioid, unless (5) holds; on the other hand, we may compute $a^*$ as $(e - a)^{-1}$ in the dioids that have inverse operations to $\oplus$ and $\otimes$.)

Remark 1. The proofs of Lemmas 1 and 3 and consequently of the validity of Algorithms 3, 4 and of the recursive factorization (6), (7) do not require to assume (5). It is sufficient to use the definition (4) of quasi-inverse and to assume the existence of all the quasi-inverses included in Algorithms 3, 4 and in the recursive factorization (6), (7).

Remark 2. The matrix equation (7) can be reduced to Algorithm 3. Indeed rewrite (7) as follows:

$$A_h^* = \begin{bmatrix} I & X_h^* Y_h^T A_{h+1}^* \\ O & A_{h+1}^* \end{bmatrix} \begin{bmatrix} X_h^* & O \\ Y_h X_h^* & I \end{bmatrix}.$$

Let $n = 2$ and let $A_h$ replace $A$ in Algorithm 3. Then Algorithm 3 computes just the latter factorization applied to the $2 \times 2$ block matrix $A_h$.

Remark 3. In [4] the fact that the solution $X = B^{[n]}$ of (2) equals $A^*$ (see Lemma 1) is stated under the additional assumption that the preorder relation in the dioid ($a \le b$ if and only if there exists $c$ such that $a \oplus c = b$) is the order relation ($a \le b$ and $a \ge b$ together imply that $a = b$). Under that assumption, [4] suggests (with the proof omitted) that $B^{[n]}$ is the minimum solution. This implies that $B^{[n]} = A^*$, for $A^*$ is easily proven to be the minimum solution to (2); in Lemma 1 we prove that $B^{[n]} = A^*$ even where the order relation in the dioid is not assumed.

Remark 4. (6), (7) generalize the recursive factorization from [13] based on the following factorization of $A_0 = PAP^T$ (which itself, however, does not seem to be extendable to the case of dioids):

$$A_h = \begin{bmatrix} X_h & Y_h^T \\ Y_h & Z_h \end{bmatrix}, \quad Z_h = A_{h+1} + Y_h X_h^{-1} Y_h^T,$$

$$A_h = \begin{bmatrix} I & O \\ Y_h X_h^{-1} & I \end{bmatrix} \begin{bmatrix} X_h & O \\ O & A_{h+1} \end{bmatrix} \begin{bmatrix} I & X_h^{-1} Y_h^T \\ O & I \end{bmatrix}.$$
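One level of the block formula (8) can be checked numerically over $(\min, +)$. The sketch below forms $A_1 = Z \oplus Y X^* Y^T$ (the dioid counterpart of the Schur complement appearing in Remark 4), assembles $W$ from the four blocks of (8), and compares it with a directly computed quasi-inverse. All function names and the $4 \times 4$ example are hypothetical, and the scalar quasi-inverses are taken to be $e = 0$, which assumes non-negative entries.

```python
import math

INF = math.inf


def mp_add(X, Y):
    return [[min(a, b) for a, b in zip(r, s)] for r, s in zip(X, Y)]


def mp_mul(X, Y):
    return [
        [min(X[i][k] + Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def transpose(X):
    return [list(col) for col in zip(*X)]


def mp_star(A):
    """Quasi-inverse over (min, +) for non-negative entries (Floyd-type steps)."""
    n = len(A)
    D = [[min(A[i][j], 0.0) if i == j else A[i][j] for j in range(n)] for i in range(n)]
    for k in range(n):
        D = [[min(D[i][j], D[i][k] + D[k][j]) for j in range(n)] for i in range(n)]
    return D


def block_star(X, Y, Z):
    """Assemble W of (8) from the blocks of A = [[X, Y^T], [Y, Z]]."""
    Xs, Yt = mp_star(X), transpose(Y)
    A1 = mp_add(Z, mp_mul(mp_mul(Y, Xs), Yt))                    # Z (+) Y X* Y^T
    A1s = mp_star(A1)
    upper_right = mp_mul(mp_mul(Xs, Yt), A1s)                    # X* Y^T A_1*
    lower_left = mp_mul(mp_mul(A1s, Y), Xs)                      # A_1* Y X*
    upper_left = mp_add(Xs, mp_mul(upper_right, mp_mul(Y, Xs)))  # X* (+) X* Y^T A_1* Y X*
    top = [r + s for r, s in zip(upper_left, upper_right)]
    bottom = [r + s for r, s in zip(lower_left, A1s)]
    return top + bottom


# Hypothetical symmetric 4 x 4 matrix split into 2 x 2 blocks.
A = [
    [INF, 1.0, 4.0, INF],
    [1.0, INF, 2.0, 6.0],
    [4.0, 2.0, INF, 3.0],
    [INF, 6.0, 3.0, INF],
]
X = [row[:2] for row in A[:2]]
Y = [row[:2] for row in A[2:]]
Z = [row[2:] for row in A[2:]]
W = block_star(X, Y, Z)
print(W == mp_star(A))
```

In the nested dissection setting the block $X$ is itself block diagonal, so the same step is applied recursively to $A_1$ instead of calling a dense routine for it.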

Next we estimate the costs of computing $bA^*$ and $A^*$. We proceed similarly to [13], noting that for some auxiliary $s(\alpha^h n) \times s(\alpha^h n)$ block-matrices $B$ (where $\alpha < 1$, $h = k, k - 1, \ldots, 0$, $k = O(\log n)$) we need to compute $B^*v$, $v$ being a fixed vector. It is easy to extend the assumed property that $A^{(q+1)} = A^{(q)}$ for $q \ge n - 1$ to the equations $B^{(q+1)} = B^{(q)}$ for $q \ge s(\alpha^h n) - 1$. (Indeed, $A$ and $B$ are associated with the path problems of the same kind, having only different sizes, $n$ and $s(\alpha^h n)$, respectively.) For the evaluation of $B^*$ given $B$, we apply the cited earlier algorithms for the dense matrix case, using $s^2(4sk(s) - k(s) + 1)$ operations in the dioid or $O(k(s) \log s)$ steps, $\lceil 2s^3/\log s \rceil$ processors, where $k(s) = \lceil \log_2 s \rceil - 1$, $s = s(\alpha^h n)$. Thus we arrive at the favorable complexity bounds of $O(\log n \log^2 s(n))$ parallel steps and $|E| + \lceil s^3(n)/\log s(n) \rceil$ processors for computing the recursive factorization (6), (7), and of $O(\log n \log s(n))$ parallel steps and $|E| + s^2(n)$ processors for computing $bA^*$ for every $b$ where the recursive factorization is available. Here $|E|$ denotes the number of edges of the graph associated with the matrix $A$, $|E| = O(n)$ for planar graphs. Therefore $O(\log n \log^2 s(n))$ steps suffice for both single source path problems (where $b$ is fixed and only the row $bA^*$, but not the whole matrix $A^*$, must be computed; so $|E| + \lceil s^3(n)/\log s(n) \rceil$ processors suffice) and all pair path problem (where we compute $A^*$, say, by evaluating $bA^*$ for all the $n$ coordinate vectors $b$, so we use $n\lceil (|E| + s^2(n))/\log s(n) \rceil$ processors). Multiplying the bounds for the numbers of steps and processors together, we obtain sequential time bounds, which can be slightly reduced further, to $|E| + s^3(n)$ for the single source paths and to $(|E| + s^2(n))n$ for the all pair paths, if we extend the sequential nested dissection algorithm of [10] (rather than the parallel algorithm of [13]) to path algebra computations. If $s(n) = O(\sqrt{n})$ and $|E| = O(n)$, as is the case for planar graphs, we arrive at the estimates shown in the table in our summary. In particular this includes the shortest path computations with some further applications (see the introduction).
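As a worked check of the planar case, the substitution $s(n) = \sqrt{n}$, $|E| = O(n)$ can be carried out explicitly. Taking $s(n)$ to be exactly $\sqrt{n}$ (constant factors dropped) is an assumption made only to keep the arithmetic short.

```latex
\begin{align*}
\text{parallel steps:}\quad
  & \log n \cdot \log^2 s(n) = \log n \cdot \bigl(\tfrac{1}{2}\log n\bigr)^2
    = \tfrac{1}{4}\log^3 n = O(\log^3 n), \\
\text{processors (single source):}\quad
  & |E| + \frac{s^3(n)}{\log s(n)} = O(n) + \frac{2\,n^{1.5}}{\log n}
    = O\!\left(\frac{n^{1.5}}{\log n}\right), \\
\text{processors (all pair):}\quad
  & n \cdot \frac{|E| + s^2(n)}{\log s(n)} = n \cdot \frac{O(n)}{\tfrac{1}{2}\log n}
    = O\!\left(\frac{n^{2}}{\log n}\right), \\
\text{sequential time (single source):}\quad
  & |E| + s^3(n) = O(n) + n^{1.5} = O(n^{1.5}), \\
\text{sequential time (all pair):}\quad
  & \bigl(|E| + s^2(n)\bigr)\, n = O(n) \cdot n = O(n^{2}).
\end{align*}
```

These are the entries of the "New algorithms" columns of Table 1.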

5. Improvement of parallel evaluation of a minimum cut and a maximum flow in an undirected planar network

The best sequential algorithms for computing a minimum cut and a maximum flow in an undirected planar network $N = (G, c)$, ($G = (V, E)$ a graph, $c$ a set of the edge capacities), run in $O(n \log n)$ time and exploit the reduction to the shortest path computations; see [3], Hassin and Johnson [5], Reif [16]. Specifically, [16] presented an $O(n \log^2 n)$ time algorithm for computing a mincut, [5] extended that algorithm to computing a maximum flow and [3] improved the time bound to $O(n \log n)$. The previous best parallel polylog time algorithms [6] for those problems (via a rather straightforward parallelization of the sequential scheme) require an order of $n^4$ processors. Combining our results for the s.s.s.p.p. in planar graphs with the results of [7] we arrive at the bounds of $O(\log^4 n)$ steps and $n^{1.5}/\log n$ processors or alternatively $O(\log^3 n)$ steps and $n^2/\log n$ processors.

[6] reduces computing a mincut and the value $v_{\max}$ of a maxflow to the following stages (see [5,6,7,16] for further details):

(1) compute a plane embedding $H$ of $N$, for the estimated cost of $O(\log^2 n)$ steps, $n^4$ processors,
(2) find the plane dual network $D(N)$; step, processor bounds are $O(\log n)$, $n^3$,
(3) compute the $\mu$-path, that is, the shortest path in the dual between the two faces $F_s$ and $F_t$ that adjoin the source $s$ and the sink $t$ of the primal network; step, processor bounds are $O(\log^2 n)$, $n^3$,
(4) compute the consistent clockwise orderings for the faces on the $\mu$-path; step, processor count is $O(\log n)$, $n^2$,
(5) compute the $F$-minimum cut-cycles in $D(N)$ for every dual vertex $F$ on the $\mu$-path (that is, compute the cut-cycles of the minimum length in $D(N)$ passing through the vertex $F$); the latter stage can be reduced to solving the a.p.s.p.p. in the dual network $D(N)$; the cost is $O(\log^2 n)$ steps, $n^3$ processors,
(6) finally compute the minimum value of the $F$-minimum cut-cycles over all the dual vertices $F$ on the $\mu$-path; this gives $v_{\max}$ and a mincut; the cost is $O(\log n)$ steps, $n$ processors.

The recent algorithm of [7] performs the computation at the substage (1) using $O(\log^3 n)$ steps, $n$ processors; the computation also includes substages (2) and (4) performed using $O(\log n)$ steps, $n$ processors. Applying our parallel algorithm for the s.s.s.p.p. at stage (3) and our parallel algorithm for the a.p.s.p.p. at stage (5), we perform the computations at those stages in $O(\log^3 n)$ steps using $n^{1.5}/\log n$ processors at stage (3) and $n^2/\log n$ at stage (5). Summarizing, we need $O(\log^3 n)$ steps, $n^2/\log n$ processors for computing $v_{\max}$ and a mincut. Alternatively we may use $O(\log^4 n)$ steps and $O(n^{1.5}/\log n)$ processors at stage (5), which dominates the total complexity. To arrive at those bounds, we apply the algorithm of [16], which performs stage (5) by successively solving s.s.s.p.p.'s in the dual networks derived from $D(N)$. This is performed in at most $\lceil \log_2 n \rceil$ substages; on substage $r$ up to $2^r$ s.s.s.p.p.'s are solved in the derived networks having the total number of edges at most $2|E| + 2^r$, $r = 0, 1, \ldots$. Here $E$ is the edge set of the original planar network, $|E| = O(n)$, $2^r < 2^{1 + \log_2 n} = 2n$, so the total number of edges on each substage is $O(n)$. Therefore our algorithm for the s.s.s.p.p. enables us to perform each substage using $O(\log^3 n)$ steps, $n^{1.5}/\log n$ processors, so $O(\log^4 n)$ steps, $n^{1.5}/\log n$ processors suffice in all substages of stage (5) and consequently suffice for the entire computation of $v_{\max}$ and of a mincut.

When a mincut (passing through a vertex $F$ on the $\mu$-path) and the value $v_{\max}$ are known, we may immediately reduce computing a maxflow to an s.s.s.p.p. in the dual network, following [5]. (Specifically, this is the s.s.s.p.p. of computing the shortest distances between $F$ and all other vertices in the dual network $D(N)$.) Thus at that final stage we only need $O(\log^3 n)$ steps, $n^{1.5}/\log n$ processors.

Acknowledgement

The authors thank Sally Goodall for typing this paper.

References

[1] R. Bellman, "On a routing problem", Quart. Appl. Math. 16, 87-90 (1958).

[2] R.W. Floyd, "Algorithm 97, shortest path", Comm. ACM 5, 345 (1962).

[3] G.N. Frederickson, "Fast algorithms for shortest paths in planar graphs, with applications", CSD TR 486, Department of Computer Science, Purdue University, West Lafayette, IN, 1984.

[4] M. Gondran and M. Minoux, Graphs and Algorithms, Wiley-Interscience, New York, 1984.

[5] R. Hassin and D.B. Johnson, "An O(n log^2 n) algorithm for maximum flow in undirected planar networks", SIAM J. on Computing, forthcoming.

[6] D.B. Johnson and S.M. Venkatesan, "Parallel algorithms for minimum cuts and maximum flows in planar networks", Proc. 23rd Ann. IEEE Symp. FOCS, 244-254 (1982).

[7] P. Klein and J. Reif, "An efficient parallel algorithm for planarity", Technical Report, Center for Research in Computer Technology, Aiken Computation Laboratory, Harvard University, Cambridge, MA and Proc. 27th Ann. IEEE Symp. FOCS, Toronto, Oct. 1986.

[8] E.L. Lawler, Combinatorial Optimization: Networks and Matroids, Holt, Rinehart and Winston, New York, 1976.

[9] R.J. Lipton and R.E. Tarjan, "A separator theorem for planar graphs", SIAM J. Applied Math. 36 (2), 177-189 (1979).

[10] R.J. Lipton, D. Rose and R.E. Tarjan, "Generalized nested dissection", SIAM J. Numer. Analysis 16 (2), 346-358 (1979).

[11] G. Miller, "Finding simple cycle separator for 2-connected planar graph", J. Computer & System Science, forthcoming.

[12] V. Pan, "How to multiply matrices faster", Lecture Notes in Computer Science, Vol. 179, Springer, Berlin, 1984.

[13] V. Pan and J. Reif, "Fast and efficient solution of linear systems", Technical Report TR-02-85, Center for Research in Computer Technology, Aiken Computation Laboratory, Harvard University, Cambridge, MA, 1985. Extended abstract in Proc. 17th Ann. ACM STOC, Providence, RI, 143-152.

[14] V. Pan and J. Reif, "Extension of the parallel nested dissection algorithms to path algebra computations", Technical Report TR 85-9, Computer Science Department, SUNYA, Albany, NY, 1985.

[15] V. Pan and J. Reif, "Efficient sparse linear programming", Technical Report TR-11-85, Center for Research in Computer Technology, Aiken Computation Laboratory, Harvard University, Cambridge, MA, forthcoming in this Journal.

[16] J. Reif, "Minimum s-t cut of a planar undirected network in O(n log^2(n)) time", SIAM J. Comput. 12 (1), 71-81 (1983).

[17] R.E. Tarjan, "A unified approach to path problems", J. ACM 28 (3), 577-593 (1981).

[18] R.E. Tarjan, "Fast algorithms for solving path problems", J. ACM 28 (3), 594-614 (1981).