
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 62, NO. 3, MARCH 2014 1023

Bounding the Advantage of Multicast Network Coding in General Network Models

Xunrui Yin, Yan Wang, Student Member, IEEE, Zongpeng Li, Senior Member, IEEE, Xin Wang, Member, IEEE, Jin Zhao, Member, IEEE, Xiangyang Xue, Member, IEEE

Abstract—Network coding encourages information flow mixing in a network. It helps increase the throughput and reduce the cost of data transmission, especially for one-to-many multicast applications. An interesting problem is to understand and quantify the coding advantage and cost advantage, i.e., the potential benefits of network coding, as compared to routing, in terms of increasing throughput and reducing transmission cost, respectively. Two classic network models were considered in previous studies: directed networks and undirected networks. This work further focuses on two types of parameterized networks, including bidirected networks and hyper-networks, generalizing the directed and the undirected network models, respectively. We prove upper- and lower-bounds on multicast coding advantage and cost advantage in these models.

Index Terms—Network coding, multicast throughput, fractional solution, packing number.

I. INTRODUCTION

NETWORK coding [1] encourages information flows to be encoded within a data network, besides merely being forwarded and replicated. Such a departure from the classic store-and-forward principle has proven effective in increasing the network capacity. Higher end-to-end throughput, particularly for multicast data transmission, is witnessed in a number of network scenarios [1], [2], [3]. Multicast represents an increasingly important class of applications on the Internet, encompassing traditional and emerging one-to-many data dissemination applications, such as software patch distribution, live media streaming and video conferencing.

A fundamental problem in network coding is to quantify the benefits of network coding over routing, known as the coding advantage, measured as the ratio of the achievable throughput with network coding over that with routing. Without network coding, a multicast routing solution is based on a multicast tree, or packing a set of multicast trees [3], [4].

In the directed network setting where each link has a pre-defined direction, there exists a combination network pattern where the coding advantage is unbounded as the network size grows [2]. However, in the undirected network setting, where capacity at each link can be shared flexibly between the two directions, a contrasting result was proved: the coding advantage is upper-bounded by a constant of 2 [4], [5], [6].

Manuscript received April 28, 2013; revised October 25, 2013. The editor coordinating the review of this paper and approving it for publication was P. Popovski.

X. Yin and Z. Li are with the Department of Computer Science, University of Calgary (e-mail: {xunyin, zongpeng}@ucalgary.ca).

Y. Wang, X. Wang, J. Zhao, and X. Xue are with the School of Computer Science, Fudan University (e-mail: {11110240029, xinw, jzhao, xyxue}@fudan.edu.cn). Y. Wang is also with the School of Software, East China Jiao Tong University.

This paper was presented in part at IEEE INFOCOM 2012, when X. Yin was with Fudan University. This work is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Digital Object Identifier 10.1109/TCOMM.2014.011614.130316

Fig. 1. An undirected network (left) and a bidirected network (right). Link capacities are labeled beside the links.

Directed and undirected graphs are classic subjects of study in theoretical computer science. While simple and easy to apply, they do not faithfully depict the wireline or wireless network topologies in practice. For example, large coding advantages in the directed setting are observed in contrived, extremely asymmetric topologies that favor network coding over tree packing, with links existing in one direction only between neighboring nodes. This is apparently different from the picture of the Internet, where pair-wise router interconnections are mostly bidirectional, i.e., if a router A can transmit to a neighbor router B, so can B transmit to A [7].

This work studies the coding advantage in two types of parameterized networks with richer modeling power. The first is the bidirected network model, parameterized with α, the highest ratio of opposite link capacities between neighboring nodes. When α = 1, we have a (completely) balanced network, fairly close to the Internet core/backbone [7], [8]. When α = ∞, we arrive at the directed network model. Results obtained for the bidirected model directly carry over to special cases including these two extreme ones. Note that bidirectional transmissions are also supported in an undirected network. The key difference between undirected links and bidirected links is that opposite links between two neighbor nodes share the total available capacity in a flexible way in the former, and in a pre-defined way in the latter. For example, in Fig. 1, the flow rate f(u, v) = f(v, u) = 1.5 is feasible in the undirected network but not in the bidirected network, since f(u, v) + f(v, u) does not exceed the capacity of the undirected link c({u, v}) = 3, but f(v, u) is larger than the capacity of the directed link c(v, u) = 1. Another motivation for studying bidirected networks is that, interestingly, the model further provides a tool for analyzing the coding advantage in hyper-networks.
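The contrast between the two link types in the Fig. 1 example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the forward capacity c(u, v) = 2 is an assumed value, since the text states only c({u, v}) = 3 and c(v, u) = 1.

```python
def feasible_undirected(f_uv, f_vu, c_total):
    """Undirected link: both directions draw from one shared capacity."""
    return f_uv >= 0 and f_vu >= 0 and f_uv + f_vu <= c_total


def feasible_bidirected(f_uv, f_vu, c_uv, c_vu):
    """Bidirected link: each direction has its own pre-defined capacity."""
    return 0 <= f_uv <= c_uv and 0 <= f_vu <= c_vu


# Flow rates from the example: f(u, v) = f(v, u) = 1.5.
# c_uv = 2 is an assumption; the outcome does not depend on it,
# because f(v, u) = 1.5 already exceeds c(v, u) = 1.
print(feasible_undirected(1.5, 1.5, c_total=3))
print(feasible_bidirected(1.5, 1.5, c_uv=2, c_vu=1))
```

The first check passes because only the sum of the two flows is constrained; the second fails on the reverse direction alone.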

In the hyper-network model, a hyper-link connects two or more nodes; a transmission from one node reaches all. A hyper-network H is parameterized with β, the maximum number of nodes a hyper-link may connect. An undirected network is a special case with β = 2. An Ethernet bus or a multicast switch can be viewed as a hyper-link, in that one transmission reaches all nodes simultaneously. Furthermore, the topology of a wireless network is usually described by a directed hypergraph [9]. We note that, in the study of minimum-energy multicasting where interference is irrelevant, the wireless network can be modeled as a directed hyper-network.

0090-6778/14$31.00 © 2014 IEEE

Our main problem of study is: how large can the coding advantage be, in bidirected networks and hyper-networks? We explore how the coding advantage is related to the link symmetry of a network, and how an (approximately) balanced network behaves differently from a directed or undirected network. For hyper-networks, we aim to prove the first upper-bound on the coding advantage.

Combining the technique of link splitting in Eulerian graphs with Edmonds’ Theorem on spanning tree packing, we prove that, for a node-balanced multicast network where each node has symmetrical transmit and receive capacities, the coding advantage is always 1. This implies that the coding advantage is 1 for link-balanced networks, improving the existing upper-bound of 4 [10]. We show that in either link-balanced or node-balanced networks, the max multicast throughput is always feasible without any information processing (replication or encoding) at relay nodes. If one assumes Internet routers have symmetrical inbound and outbound capacities available, then we can conclude that in-network processing (IP-multicast or network coding) is unnecessary, and end system multicast suffices. The tight upper-bound of 1 further implies that multicast tree packing, NP-hard in general, becomes polynomial-time computable in balanced networks.

An α-balanced bidirected network relaxes link symmetry from 1 to α ≥ 1, capturing a larger class of networks in practice. We prove that the coding advantage here is at most α, improving the known upper-bound of 2(α + 1) [10]. We further prove a first lower-bound Ω(√α) on the maximum coding advantage in α-balanced networks. These two bounds for α-balanced networks unify previous bounds for balanced networks and directed networks that can be arbitrarily imbalanced, revealing a connection between the asymmetry of a network and its coding advantage.

For hyper-networks, Li et al. [10] proposed the following open problem: does a constant upper-bound for multicast coding advantage exist in hyper-networks? We provide a negative answer, by proving a lower-bound of Ω(log(β)) on the maximum coding advantage, through generalizing the 3-layer combination network [11] into the hyper-network paradigm. We further prove an upper-bound of β, through a network transformation technique relating hyper-networks to balanced networks. The proof reveals an interesting connection between the two seemingly distinct network models. The result implies the well-known upper-bound of 2 for multicast coding advantage in undirected networks [4], [5], [6].

Throughput advantage and cost advantage of network coding follow a loose primal-dual relation [5], [11]. The latter is more general, and depends not only on link capacities but also on link costs. For a bidirected network with both balanced link capacities and balanced link costs, we prove that the cost advantage is at most 2. When the ratio of cost between a pair of nodes is between 1 and α′, we prove a generalized upper-bound of 2α′, unifying the unbounded cost advantage in directed networks and the upper-bound of 2 in undirected networks. For hyper-networks, we prove that the cost advantage is at most β, which directly implies the upper-bound of 2 in undirected networks [5].

In the rest of the paper, we review previous research in Sec. II, and define models and notations in Sec. III. Coding advantage is analyzed for bidirected networks in Sec. IV, and for hyper-networks in Sec. V. Cost advantage is studied in Sec. VI. We extend the results in bidirected networks to bidirected hyper-networks in Sec. VII. In Sec. VIII and Sec. IX, we discuss real-world relevance and conclude the paper.

II. PREVIOUS RESEARCH

Li et al. studied the coding advantage in undirected networks [4], [10]. Applying graph theory techniques, they first proved that the coding advantage is upper-bounded by a constant of 2. The same bound was later proved using linear programming techniques [5] and different graph theory techniques [6]. This work was partly inspired by the work of Li et al. In particular, applying graph splitting for handling relay nodes still plays a role in our analysis, although in a directed rather than undirected fashion. Li et al. further proposed as an open question whether a constant upper-bound exists for multicast in hyper-networks [10]. We resolve this open problem, and generalize the upper-bound of 2 for undirected networks to an upper-bound of β for hyper-networks. In peer-to-peer and overlay networks, the coding advantage was studied by Chiu et al. [12], and by Shao and Li [13]. Recently, Xu et al. studied the benefit of network coding for group communications, in undirected networks with fairness constraints [14].

The benefit of network coding in wireless networks was first studied by Liu et al. [15], and then extended by Keshavarz-Haddad and Riedi [16]. They consider wireless interference and geometric properties in modeling a wireless network, and prove that both coding and cost advantages are upper-bounded by constant factors. Karande et al. [17] consider the case of multiple-multicast in wireless ad hoc networks and prove order bounds on the throughput improvement. Yang et al. studied the coding advantage of physical layer network coding in two-way relaying systems [18].

For multicast throughput, we adopt the standard definition of symmetric throughput in the literature, i.e., all receivers are required to receive and recover information flows transmitted from the source at an equal rate. It is also possible to relax the symmetry requirement, by allowing different receivers to receive at different rates. Under such a model, coding advantage is studied by Chekuri et al. [19], for average throughput across all receivers.

III. NETWORK MODELS AND NOTATIONS

Undirected Networks. An undirected network is a capacitated graph G(V, E, c), with node set V, edge set E, and link capacity vector $c \in \mathbb{Z}_+^E$. Here $\mathbb{Z}_+$ is the set of positive integers. An undirected link uv with total capacity c(uv) can support a flow $f(\overrightarrow{uv})$ from u to v and a flow $f(\overrightarrow{vu})$ from v to u simultaneously, as long as $f(\overrightarrow{uv}) + f(\overrightarrow{vu}) \le c(uv)$.

Fig. 2. Communication links in (a) bidirected networks and (b) hyper-networks. In bidirected networks, we use a double-headed arrow to represent a pair of opposite links for simplicity.

Directed Networks. A directed network is a directed graph D(V, A, c), with node set V, link set A, and link capacity vector $c \in \mathbb{Z}_+^A$. A directed link $\overrightarrow{uv}$ can transmit a flow from u to v only, subject to $f(\overrightarrow{uv}) \le c(\overrightarrow{uv})$.

Bidirected Networks. In a bidirected network, node connection is always bidirectional (Fig. 2 (a)): $\forall u, v \in V$, $\overrightarrow{uv} \in A$ implies $\overrightarrow{vu} \in A$. Given a bidirected network B(V, A, c), we define the link capacity balance parameter as

$$\alpha \triangleq \max_{\overrightarrow{uv} \in A} \frac{c(\overrightarrow{uv})}{c(\overrightarrow{vu})},$$

and refer to B(V, A, c) as an α-balanced network. A special case is α = 1, corresponding to a (completely) balanced network.
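As a small illustration of the balance parameter, the following Python sketch computes α from a capacity table. The dictionary layout (ordered node pairs mapped to integer capacities), the function name, and the sample capacities are assumptions made for this example only.

```python
from fractions import Fraction


def balance_parameter(cap):
    """Return alpha = max over links of c(u->v) / c(v->u).

    `cap` maps an ordered pair (u, v) to a positive integer capacity.
    A ValueError is raised when a link lacks its opposite link, since
    the network would then not be bidirected.
    """
    alpha = Fraction(1)
    for (u, v), c_uv in cap.items():
        if (v, u) not in cap:
            raise ValueError(f"link {u}->{v} has no opposite link")
        alpha = max(alpha, Fraction(c_uv, cap[(v, u)]))
    return alpha


# Hypothetical capacities: one imbalanced pair (2 vs 1), one balanced pair.
cap = {("u", "v"): 2, ("v", "u"): 1, ("v", "w"): 3, ("w", "v"): 3}
print(balance_parameter(cap))
```

Exact fractions are used so that ratios such as 3/2 are not rounded; a completely balanced network yields α = 1.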

As a classic technique in graph theory, a link with integral capacity can be replaced with multiple unit-capacity links between the same pair of nodes. Such a transformation results in a directed multi-graph, and does not change the network connectivity between any pair of nodes.
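A minimal sketch of this transformation, assuming the network is stored as a dictionary from ordered node pairs to integer capacities (a representation chosen here for illustration, not taken from the paper):

```python
from collections import Counter


def to_unit_multigraph(cap):
    """Expand each link (u, v) of integer capacity k into k parallel unit links."""
    arcs = []
    for (u, v), k in cap.items():
        arcs.extend([(u, v)] * k)
    return arcs


# Hypothetical balanced network with capacities 2 and 1.
cap = {("s", "a"): 2, ("a", "s"): 2, ("a", "t"): 1, ("t", "a"): 1}
arcs = to_unit_multigraph(cap)

# The multiplicity of every arc in the multigraph equals the original capacity.
print(Counter(arcs) == Counter(cap))
```

Because the number of parallel arcs equals the capacity, every cut has the same value before and after the expansion.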

Hyper-Networks. A generalization of the undirected network model is the hyper-network model, where each hyper-link connects two or more nodes (Fig. 2 (b)). Specifically, we model a hyper-network with a hypergraph H(V, E), where E is the set of hyper-edges. Each hyper-edge e ∈ E ‘covers’ a set of two or more nodes. The size of a hyper-edge is the number of nodes it connects. We denote the maximum hyper-edge size as β. An undirected network can be viewed as a special type of hyper-network with β = 2.

When one node transmits an information flow through a hyper-link, the flow reaches all the other nodes adjacent to the hyper-link simultaneously. Let f(e, v), e ∈ E, v ∈ e denote the flow transmitted from v through hyper-link e. The total flow on a hyper-link at a given time is upper-bounded by the hyper-link capacity. For example, if e covers three nodes u, v, w, we have f(e, u) + f(e, v) + f(e, w) ≤ c(e).
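The hyper-link capacity constraint can be checked with a short Python sketch. The function name and the sample rates are hypothetical; the check itself is the inequality stated above, extended to any number of covered nodes.

```python
def hyperlink_feasible(flows, capacity):
    """Check the capacity constraint of one hyper-link e.

    `flows` maps each node v covered by e to f(e, v), the rate that v
    transmits on e. All transmitters share the single capacity c(e).
    """
    return all(f >= 0 for f in flows.values()) and sum(flows.values()) <= capacity


# A hyper-link covering u, v, w with an assumed capacity c(e) = 3.
print(hyperlink_feasible({"u": 1, "v": 1, "w": 1}, capacity=3))
print(hyperlink_feasible({"u": 2, "v": 1, "w": 1}, capacity=3))
```

With β = 2 this reduces to the undirected-link constraint on the two opposite flows.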

IV. CODING ADVANTAGE IN BIDIRECTED NETWORKS

A. Optimal Multicast in Bidirected Networks

Multicast models the dissemination of information from a common source to a set of receivers within the same network. Given a (bi)directed network D(V, A, c), let s ∈ V be the multicast source, and T ⊂ V \ {s} be the set of multicast receivers. Note that unicast (one-to-one) and broadcast (one-to-all) can be viewed as special cases of multicast, with |T| = 1 and |T| = |V| − 1, respectively. A multicast rate (throughput) r is achieved if each receiver can receive and recover information flows from the source at rate r.

Fig. 3. Directed splitting operation at z.

Let $\lambda_D(u, v)$ be the max-flow rate from u to v, which equals the maximum number of link-disjoint paths from u to v in the unit-capacitated multigraph model. By the max-flow min-cut theorem in network flows, $\lambda_D(u, v)$ equals the capacity of the min cut between u and v. The celebrated result [1] on multicast rate feasibility extends the max-flow min-cut theorem from one-to-one unicast to one-to-many multicast: with network coding, a multicast rate r is feasible in a directed network iff it is feasible as a unicast rate to each receiver t ∈ T independently. Therefore, the maximum multicast rate with network coding is $R_{nc}(D, s, T) = \min_{t \in T} \lambda_D(s, t)$. Parameters (D, s, T) are dropped when clear from context.
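The formula for the maximum coded multicast rate can be evaluated directly with any max-flow routine. The sketch below uses a plain Edmonds-Karp implementation, which is a standard choice made here for illustration and is not prescribed by the paper. The function names, the dictionary representation, and the node labels of the unit-capacity butterfly network are assumptions.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a directed graph given as {(u, v): capacity}."""
    residual = defaultdict(int)
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        residual[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # reverse arc, used only in the residual graph
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Walk back from t to s, find the bottleneck, then augment.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck


def multicast_capacity(cap, s, receivers):
    """R_nc = min over receivers t of the max-flow from s to t."""
    return min(max_flow(cap, s, t) for t in receivers)


# Directed butterfly network with unit capacities (node names are illustrative).
butterfly = {
    ("s", "a"): 1, ("s", "b"): 1, ("a", "c"): 1, ("b", "c"): 1, ("c", "d"): 1,
    ("a", "t1"): 1, ("b", "t2"): 1, ("d", "t1"): 1, ("d", "t2"): 1,
}
print(multicast_capacity(butterfly, "s", ["t1", "t2"]))
```

Each receiver of the butterfly has two link-disjoint paths from the source, so the minimum over receivers is 2, the coded multicast rate of this network.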

Without network coding, i.e., with routing and replication, a multicast solution can be decomposed into a set of multicast trees, since each information flow propagates along a tree rooted at the multicast source, covering all multicast receivers. In this context, achieving the maximum multicast throughput is equivalent to the combinatorial problem of Steiner tree packing [4]. Let $\mathcal{T}$ denote the set of all possible multicast trees. For each tree $\tau \in \mathcal{T}$, let $r_\tau$ be the information flow rate to transmit along τ. A tree packing solution is feasible if for each link $\overrightarrow{uv} \in A$, the total flow rate of trees containing $\overrightarrow{uv}$ is bounded by the link capacity $c(\overrightarrow{uv})$. The maximum throughput achieved by routing, denoted by $R_{tree}(D, s, T)$, is the maximum aggregated flow rate of a feasible multicast tree packing, which can be formulated as a linear program:

$$\begin{aligned}
\text{maximize} \quad & \sum_{\tau \in \mathcal{T}} r_\tau \\
\text{subject to} \quad & \sum_{\tau :\, \overrightarrow{uv} \in \tau} r_\tau \le c(\overrightarrow{uv}), \quad \forall\, \overrightarrow{uv} \in A \\
& r_\tau \ge 0, \quad \forall\, \tau \in \mathcal{T}
\end{aligned}$$

By definition, $R_{tree} \le R_{nc}$ for any (D, s, T), since a multicast tree packing solution is a special case of a network coding solution. The coding advantage is $R_{nc}/R_{tree}$, the ratio of maximum throughput with network coding over that without coding, which is at least 1.

B. Upper-bound for Completely Balanced Networks

We introduce two lemmas from graph theory for studying the coding advantage in completely balanced networks. The first is about the splitting operation illustrated in Fig. 3. In particular, a directed splitting at a node z refers to the replacement of two links $\overrightarrow{uz}$, $\overrightarrow{zv}$ with a new link $\overrightarrow{uv}$.
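The splitting operation itself is a simple multigraph edit. The Python sketch below stores the multigraph as a `Counter` of arcs, which is a representation chosen for this example. Dropping the self-loop that would arise when u = v is an assumption of the sketch, not something the text specifies, and the function name is hypothetical.

```python
from collections import Counter


def split_off(arcs, u, z, v):
    """Replace one copy of u->z and one copy of z->v with a new arc u->v.

    `arcs` is a Counter mapping (tail, head) to multiplicity.
    A new Counter is returned; the input is left unchanged.
    """
    if arcs[(u, z)] < 1 or arcs[(z, v)] < 1:
        raise ValueError("both u->z and z->v must be present")
    result = Counter(arcs)
    result[(u, z)] -= 1
    result[(z, v)] -= 1
    if u != v:  # assumption: a self-loop u->u is simply discarded
        result[(u, v)] += 1
    return +result  # unary plus drops arcs whose multiplicity fell to zero


# Relay z with one arc to and from each of u and v (in-degree = out-degree = 2).
arcs = Counter({("u", "z"): 1, ("z", "v"): 1, ("v", "z"): 1, ("z", "u"): 1})
after = split_off(arcs, "u", "z", "v")
print(sorted(after.items()))
```

Each split removes one incoming and one outgoing arc at z, so z keeps equal in- and out-degree, and repeating the operation eventually isolates z.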

Lemma 1. [20] Let D = (V + z, A) be a directed multigraph with symmetrical nodal in- and out-degrees. For each link $\overrightarrow{uz} \in A$, there exists a link $\overrightarrow{zv} \in A$, such that after splitting off $\overrightarrow{uz}$, $\overrightarrow{zv}$, the link connectivity, i.e., the number of directed link-disjoint paths, between every pair of nodes in V is unchanged.

Lemma 2. [21] In a directed multigraph D = (V, A), the maximum number of arc-disjoint spanning trees rooted at s ∈ V is $\min_{t \in V - s} \lambda_D(s, t)$.


Theorem 1. For multicasting in a completely balanced network B, $R_{tree} = R_{nc}$. Furthermore, the optimal multicast throughput can be achieved with integral link flow rates, and without coding or replication at relay nodes.

Proof: We consider the equivalent directed multigraph, where a link with capacity larger than 1 is replaced with parallel links each with unit capacity. We prove that there exist $R_{nc}$ link-disjoint multicast trees, each of which may contribute a multicast throughput of 1.

A completely balanced network satisfies the node-symmetric condition of Lemma 1. After each splitting operation, the node-symmetric property remains true, although link symmetry may no longer hold. Consequently, we can completely isolate a node z by repeatedly applying splitting operations at z, while preserving the max flow rate between any two other nodes.

We apply complete splitting to relay nodes in V − s − T sequentially, and denote the resulting digraph as D. According to Lemma 1, $R_{nc}(B) = \min_{t \in T} \lambda_B(s, t) = \min_{t \in T} \lambda_D(s, t) = R_{nc}(D)$, i.e., the optimal throughput achieved by network coding is not affected. We do not explicitly distinguish these two notations in the rest of the proof.

By Lemma 2, network coding cannot improve broadcast throughput. D has no relay nodes and is a broadcast network, hence there must exist $R_{nc}$ broadcast trees in D.

The splitting operations at the relay nodes indicate how to forward the received information. No replication or encoding is required at the nodes being split off. We can achieve the optimal throughput $R_{nc}$ by applying reverse splitting operations on broadcast trees in D, and using the resulting multicast trees for transmission. As a result, we only need to replicate information at the source and receiver nodes.

Polynomial Time Algorithm for Steiner Tree Packing. Steiner Tree Packing in general graphs is a well-known NP-hard problem. Interestingly, for completely balanced networks, a polynomial time tree packing algorithm can be extracted from the proof of Theorem 1:

Step 1: Split off all relay nodes to obtain a directed graph D;

Step 2: Apply the polynomial time algorithm of Wu et al. [22] to compute a spanning tree packing in D;

Step 3: Translate spanning trees in D to Steiner trees in B, by reversing the split operations.

For step 1, brute-force search for splittable link pairs takes $O(|T||E||V|^3)$ time. Step 2 can be done in time $O(R_{nc}^3 |T||E|)$ [22]. Step 3 can be done in time $O(|E|)$ with the split operations in step 1 stored in memory. Thus the overall complexity is $O(|T||E||V|^3 + R_{nc}^3 |T||E|)$.

Wu et al. [23] designed heuristic algorithms for Steiner tree packing in a set of six-ISP topologies that are all link-balanced. Our algorithm outperforms their heuristic algorithms by guaranteeing optimal solutions.

Discussions. The “link-balanced” condition of the theorem can be relaxed to “node-balanced”, which guarantees the existence of the desired directed splitting operation. We emphasize that the optimal throughput can be achieved by packing multicast trees with integer flow rates, while in general, fractional packing outperforms integral packing [5]. For example, in the classic butterfly network (directed) [1], [24], fractional packing can achieve a multicast rate of 1.5, while integral packing can only achieve a rate of 1.

While access networks at the edge of the Internet may not be balanced, the core of the Internet is rather close to a balanced network [25]. Then by Theorem 1, in such balanced networks, optimal multicast is feasible without IP-multicast or network coding requirements at routers in the middle of the network. Furthermore, the fact that fractional flows are unnecessary for achieving the maximum multicast throughput helps reduce the overhead in traffic splitting and management for achieving optimal multicast.

C. Upper-bound for α-Balanced Networks

Networks in practice may exhibit approximate rather than absolute symmetry in opposite link capacities, and can be characterized using the α-balanced network model. The coding advantage in α-balanced networks was previously analyzed by Li et al. [10] and was proved to be upper-bounded by 2(α + 1). We apply Theorem 1 to improve this upper-bound to α.

Theorem 2. The coding advantage for multicast in an α-balanced network is upper-bounded by α.

Proof: Let B = (V, A, c) be an α-balanced network, and let $B_1 = (V, A, c_1)$ denote the completely balanced graph induced from B by truncating the larger link capacity to the smaller one, between each pair of neighboring nodes, i.e., $c_1(\overrightarrow{uv}) = \min\{c(\overrightarrow{uv}), c(\overrightarrow{vu})\}$. Since no link capacity increases, we have $R_{tree}(B) \ge R_{tree}(B_1)$.

Let $\alpha B_1 = (V, A, \alpha c_1)$ be the digraph induced from $B_1$ by scaling each link’s capacity by α. Each link’s capacity in $\alpha B_1$ is no less than in B, hence $R_{nc}(\alpha B_1) \ge R_{nc}(B)$, and $R_{tree}(B_1) = R_{nc}(B_1) = \frac{1}{\alpha} R_{nc}(\alpha B_1)$, where the first equality is due to Theorem 1, since $B_1$ is a completely balanced network, and the second equality holds because the maximum throughput scales linearly with the edge capacities. We conclude that $R_{tree}(B) \ge \frac{1}{\alpha} R_{nc}(B)$.
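The two capacity transformations used in the proof can be sketched in Python as follows. The function names, the dictionary representation, and the sample capacities are assumptions for illustration. The sketch checks only the two structural facts the proof relies on: the truncated network is completely balanced, and the scaled network dominates the original on every link.

```python
from fractions import Fraction


def truncate_to_balanced(cap):
    """B_1: set both directions of each node pair to the smaller capacity."""
    return {(u, v): min(c, cap[(v, u)]) for (u, v), c in cap.items()}


def scale(cap, factor):
    """alpha * B_1: multiply every link capacity by the same factor."""
    return {link: factor * c for link, c in cap.items()}


# Hypothetical alpha-balanced network: capacities 6 vs 2 and 3 vs 3.
cap = {("u", "v"): 6, ("v", "u"): 2, ("v", "w"): 3, ("w", "v"): 3}
alpha = max(Fraction(c, cap[(v, u)]) for (u, v), c in cap.items())

b1 = truncate_to_balanced(cap)
alpha_b1 = scale(b1, alpha)

# B_1 is completely balanced, and alpha * B_1 has at least the capacity of B on each link.
print(all(b1[(u, v)] == b1[(v, u)] for (u, v) in b1))
print(all(alpha_b1[link] >= cap[link] for link in cap))
```

Both checks hold for any α-balanced input, because the larger capacity of each pair is at most α times the smaller one.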

D. Lower-bound for α-Balanced Networks

Can the improved upper-bound of α in Theorem 2 be further tightened? Apparently, the bound is tight for the case of α = 1. For larger values of α, we show that there exist α-balanced networks with coding advantage of Ω(√α). The upper- and lower-bounds we prove are therefore approximately tight against each other, within a factor of O(√α).

Our approach to show the lower-bound of the maximum coding advantage is to construct bidirected multicast networks with large coding advantages. We manipulate a class of directed networks with known coding advantage into the bidirected domain. The directed networks chosen are ZK(p, N) networks introduced by Chekuri et al. [19] in their investigation of the coding advantage for average multicast throughput. The ZK(p, N) networks turn out to be more powerful than directed combination networks, in terms of providing higher coding advantage with fewer terminal nodes.

A ZK(p, N) network consists of a source s, N receivers $t_1, t_2, \cdots, t_N$, and three layers of relay nodes. The first layer, layer A, has $\binom{N}{p-1}$ nodes, which can be indexed as $A_U$, with $U \subset \{1, 2, \cdots, N\}$ and |U| = p − 1. The next two layers B and C each has $\binom{N}{p}$ nodes, which can be indexed as $B_W$ and $C_W$ respectively, with $W \subset \{1, 2, \cdots, N\}$ and |W| = p. These nodes are connected in the following way: for any nodes $A_U$, $B_W$, $C_W$, there is a link from s to $A_U$, a link from $A_U$ to $B_W$ if U ⊂ W, a link from $B_W$ to $C_W$, and a link from $C_W$ to $t_i$ if i ∈ W. So far each link is unidirectional. To construct the α-balanced network $ZK_\alpha(p, N)$ from ZK(p, N), we add an opposite link of unit capacity for each link in ZK(p, N), and set all the original links’ capacity to α. Fig. 4 illustrates the $ZK_\alpha(p = 2, N = 4)$ network.

Fig. 4. The α-balanced network $ZK_\alpha(p = 2, N = 4)$. Each downward link has capacity α, each upward link has capacity 1.
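The construction can be written out as a short Python sketch. The node labels (`"s"`, `("A", U)`, `("B", W)`, `("C", W)`, `("t", i)`), the function name, and the value α = 3 used in the example are all assumptions for illustration; only the connection rules come from the description above.

```python
from itertools import combinations
from math import comb


def build_zk_alpha(p, N, alpha):
    """Return {(tail, head): capacity} for ZK_alpha(p, N).

    U and W are represented as sorted tuples of receiver indices 1..N.
    """
    forward = []
    for U in combinations(range(1, N + 1), p - 1):
        forward.append(("s", ("A", U)))               # s -> A_U
    for W in combinations(range(1, N + 1), p):
        for U in combinations(W, p - 1):              # every U subset of W, |U| = p - 1
            forward.append((("A", U), ("B", W)))      # A_U -> B_W
        forward.append((("B", W), ("C", W)))          # B_W -> C_W
        for i in W:
            forward.append((("C", W), ("t", i)))      # C_W -> t_i
    cap = {}
    for (u, v) in forward:
        cap[(u, v)] = alpha   # original (downward) link
        cap[(v, u)] = 1       # added opposite (upward) link
    return cap


cap = build_zk_alpha(p=2, N=4, alpha=3)
downward = sum(1 for c in cap.values() if c == 3)
# Downward link count: C(N, p-1) from s, plus (2p + 1) * C(N, p) for the other layers.
print(downward == comb(4, 1) + (2 * 2 + 1) * comb(4, 2))
```

For p = 2 and N = 4 this gives the network of Fig. 4, with 4 nodes in layer A and 6 nodes in each of layers B and C.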

Theorem 3. The coding advantage in $ZK_\alpha(p, N)$ is of order Ω(√α).

Proof: For each rooted Steiner tree τ, let $a_\tau$ denote the number of layer A nodes in τ, $c_\tau$ denote the number of layer C nodes with their predecessor in layer B, and $d_\tau$ denote the number of layer C nodes with their predecessor being a receiver. For a fractional tree packing $\mathcal{T} = \{\tau_1, \tau_2, \cdots, \tau_{|\mathcal{T}|}\}$ with tree flow rates $r_1, r_2, \cdots, r_{|\mathcal{T}|}$, consider the aggregated link capacity constraints at nodes counted in $a_\tau$, $c_\tau$, and $d_\tau$ respectively:

$$\sum_{\tau \in \mathcal{T}} r_\tau a_\tau \le \alpha \binom{N}{p-1} + p \binom{N}{p} \tag{1}$$

$$\sum_{\tau \in \mathcal{T}} r_\tau c_\tau \le \alpha \binom{N}{p} \tag{2}$$

$$\sum_{\tau \in \mathcal{T}} r_\tau d_\tau \le p \binom{N}{p} \tag{3}$$

The maximum number of receivers directly covered by nodes counted in $c_\tau$ is $a_\tau(p-1) + c_\tau$. This is because the nodes with the same grandparent in layer A have p − 1 receivers in common. Grouping them according to their grandparent in layer A, each group covers p − 1 common receivers, with each member covering at most one additional receiver. Each node counted in $d_\tau$ can cover at most p − 1 receivers. Therefore, to cover all the receivers, $a_\tau(p-1) + c_\tau + d_\tau(p-1) \ge N$. Multiplying both sides by $r_\tau$ and summing over $\tau \in T$,

$$N \sum_{\tau \in T} r_\tau \le \sum_{\tau \in T} r_\tau \left[ a_\tau(p-1) + c_\tau + d_\tau(p-1) \right] \qquad (4)$$

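Combining the above inequalities (1)–(4), we have that the maximum rate for fractional routing satisfies

$$R_{tree} \le \frac{1}{N}\left[ \alpha(p-1)\binom{N}{p-1} + \alpha\binom{N}{p} + 2p(p-1)\binom{N}{p} \right]$$

Furthermore, $R_{nc} = \alpha\binom{N-1}{p-1}$. Therefore,

$$\frac{R_{tree}}{R_{nc}} \le \frac{p-1}{N-p+1} + \frac{1}{p} + \frac{2(p-1)}{\alpha}$$

For large α such that α and $\sqrt{\alpha}$ are both integers, let $p - 1 = \sqrt{\alpha}$ and $N = \alpha$; then $\frac{R_{tree}}{R_{nc}} < \frac{1}{\sqrt{\alpha}-1} + \frac{3}{\sqrt{\alpha}}$, which is of order $O(1/\sqrt{\alpha})$, so the coding advantage is of order $\Omega(\sqrt{\alpha})$.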
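As a sanity check on the algebra in the proof of Theorem 3, the following sketch evaluates the bound on $R_{tree}/R_{nc}$ with exact rational arithmetic and compares it with the closed form. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import comb, isqrt

def ratio_from_inequalities(p, n, alpha):
    """Bound on R_tree / R_nc obtained by combining inequalities (1)-(4)."""
    r_tree_bound = Fraction(
        alpha * (p - 1) * comb(n, p - 1)
        + alpha * comb(n, p)
        + 2 * p * (p - 1) * comb(n, p),
        n,
    )
    r_nc = alpha * comb(n - 1, p - 1)
    return r_tree_bound / r_nc

def ratio_closed_form(p, n, alpha):
    """(p-1)/(N-p+1) + 1/p + 2(p-1)/alpha, as stated in the proof."""
    return Fraction(p - 1, n - p + 1) + Fraction(1, p) + Fraction(2 * (p - 1), alpha)

for alpha in (16, 100, 400):
    root = isqrt(alpha)
    p, n = root + 1, alpha                     # p - 1 = sqrt(alpha), N = alpha
    bound = ratio_from_inequalities(p, n, alpha)
    assert bound == ratio_closed_form(p, n, alpha)
    assert bound < Fraction(1, root - 1) + Fraction(3, root)
    print(alpha, float(bound))
```

Note that for small α the bound on $R_{tree}/R_{nc}$ can exceed 1 and is therefore vacuous; the statement is asymptotic in α.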

V. CODING ADVANTAGE IN HYPER-NETWORKS

Wireless networks represent a promising paradigm for application of network coding. Due to the broadcast nature of a wireless transmission, an encoded packet can be delivered to multiple neighbors in a single transmission, helping more than one of them in a bandwidth efficient manner. Correspondingly, a hyper-network model for wireless networks was looked at by Li et al. [10], but bounding the coding advantage there is left open. In this section, we prove the first upper-bound for the coding advantage in hyper-networks. Hyper-networks capture the broadcast nature but not interference of wireless transmissions, hence results in this section do not directly extend to real-world wireless networks.

A. Optimal Multicast in Hyper-networks

Let $H = (V, \mathcal{E})$ be an (undirected) hypergraph. In this section, we use e to denote a hyper-edge in $\mathcal{E}$. A path connecting s and t is a sequence of hyper-edges $(e_1, e_2, \cdots, e_n)$, such that $s \in e_1$, $t \in e_n$, $e_i \cap e_{i+1} \ne \emptyset$ for $i = 1, 2, \cdots, n-1$, and $e_i \ne e_j$, $\forall i \ne j$. The edge connectivity $\lambda_H(s,t)$ is defined as the maximum number of hyper-edge disjoint paths between s and t. Assuming each hyper-edge has unit capacity, the maximum throughput from s to t is upper bounded by $\lambda_H(s,t)$, i.e., $R_{nc}(H,s,T) \le \min_{t \in T} \lambda_H(s,t)$.

An orientation of a hypergraph is obtained by assigning a direction to each hyper-edge e, via identifying a tail for e, indicating which node is sending messages through this hyper-link.

For $s \in V$, $T \subset V \setminus \{s\}$, define an s-T hyper-tree as a set of hyper-edges τ that has an orientation containing at least one directed path from s to t, $\forall t \in T$. For routing in the hyper-network model, the trace of a message from s to all receivers forms an s-T hyper-tree. Therefore, the maximum throughput achieved by routing is the maximum packing of s-T hyper-trees.

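Let $\mathcal{T}$ denote the set of all possible s-T hyper-trees. Similar to the bidirected network case, $R_{tree}(H,s,T)$ is the optimal value of the following linear program:

$$\begin{array}{ll}
\text{maximize} & \sum_{\tau \in \mathcal{T}} r_\tau \\
\text{subject to} & \sum_{\tau : e \in \tau} r_\tau \le c(e), \quad \forall e \in \mathcal{E} \\
& r_\tau \ge 0, \quad \forall \tau \in \mathcal{T}
\end{array}$$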
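To make the constraints of this linear program concrete, the sketch below checks the feasibility of a given hyper-tree packing and returns its total rate (the objective value). The toy instance is an assumption made for illustration: three unit-capacity hyper-edges that each contain the source, where any two hyper-edges together are assumed to reach every receiver. The function name `packing_rate` is hypothetical.

```python
from fractions import Fraction

def packing_rate(packing, capacity):
    """Return the total rate of a hyper-tree packing, or raise if infeasible.

    packing:  dict mapping a hyper-tree (frozenset of hyper-edge names) to r_tau.
    capacity: dict mapping each hyper-edge name to c(e).
    The hyper-trees are assumed to be valid s-T hyper-trees; only the
    capacity and non-negativity constraints are checked here.
    """
    load = {e: Fraction(0) for e in capacity}
    for tree, rate in packing.items():
        if rate < 0:
            raise ValueError("rates must be non-negative")
        for e in tree:
            load[e] += rate
    for e, used in load.items():
        if used > capacity[e]:
            raise ValueError(f"capacity of {e} exceeded")
    return sum(packing.values(), Fraction(0))

capacity = {"e1": 1, "e2": 1, "e3": 1}
half = Fraction(1, 2)
packing = {
    frozenset({"e1", "e2"}): half,
    frozenset({"e1", "e3"}): half,
    frozenset({"e2", "e3"}): half,
}
rate = packing_rate(packing, capacity)
print("total routing rate of this packing:", rate)
```

Any feasible packing gives a lower bound on $R_{tree}$; the optimum of the linear program is the largest such value.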


Fig. 5. A hyper-network (left) and its corresponding completely balanced bipartite network (right).

B. Upper-bound for Hyper-networks

Theorem 4. In an undirected hypergraph $H = (V, \mathcal{E})$, the coding advantage $R_{nc}/R_{tree}$ is upper bounded by the maximum hyper-edge size β.

Proof: As the maximum throughput $R_{nc}$ is upper bounded by $\lambda_H(s,T) = \min_{t \in T} \lambda_H(s,t)$, it is sufficient to prove that $R_{tree} \ge \frac{1}{\beta}\lambda_H(s,T)$.

As illustrated in Fig. 5, given the hypergraph $H = (V, \mathcal{E})$, we construct a completely balanced bipartite digraph $B(V \cup V_{\mathcal{E}}, A)$, where V is the original node set in H, and $V_{\mathcal{E}} = \{v_e \mid \forall e \in \mathcal{E}\}$. Connect $v \in V$ and $v_e \in V_{\mathcal{E}}$ if v is an end node of e. In other words, $A = \{\overrightarrow{v v_e}, \overrightarrow{v_e v} \mid \forall v \in e\}$. For any pair of nodes $u, v \in V$, $\lambda_B(u,v) \ge \lambda_H(u,v)$, because for any n hyper-edge disjoint s-t paths $\mathcal{P}_1, \mathcal{P}_2, \cdots, \mathcal{P}_n$ in H, we can find n edge disjoint s-t paths $P_1, P_2, \cdots, P_n$ in B by the following method: let $\mathcal{P}_i = (e_1, e_2, \cdots, e_n)$; we choose a sequence of intermediate nodes $v_i \in e_i \cap e_{i+1}$, so the path can be rewritten as $s = v_0, e_1, v_1, e_2, \cdots, v_{n-1}, e_n, v_n = t$. Then we construct $P_i$ as $(v_0, v_{e_1}, v_1, \cdots, v_{e_n}, v_n)$. Because $e_i$ appears only once in $\mathcal{P}_1, \mathcal{P}_2, \cdots, \mathcal{P}_n$, the paths $P_1, P_2, \cdots, P_n$ are edge disjoint.

By Theorem 1, in the completely balanced network B, there is a rooted Steiner tree packing $\tau_1, \tau_2, \cdots, \tau_m$ with weights $r_1, r_2, \cdots, r_m$, such that $\sum_{j=1}^{m} r_j = R_{tree}(B) = R_{nc}(B) = \lambda_B(s,T) \ge \lambda_H(s,T)$. We complete the proof by constructing m hyper multicast trees $\tau'_1, \tau'_2, \cdots, \tau'_m$ in H with weights $r_1/\beta, r_2/\beta, \cdots, r_m/\beta$. Let $\tau'_j = \{e \in \mathcal{E} \mid v_e \in \tau_j\}$. The orientation and the s-t directed paths in $\tau'_j$ can be determined by referring to the s-t paths in $\tau_j$, so $\tau'_j$ is an s-T hyper-tree. For any hyper-edge $e \in \mathcal{E}$,

$$\sum_{j : e \in \tau'_j} \frac{r_j}{\beta}
\overset{1}{=} \sum_{j : v_e \in \tau_j} \frac{r_j}{\beta}
\overset{2}{=} \sum_{j : \exists! v \in V,\ \overrightarrow{v v_e} \in \tau_j} \frac{r_j}{\beta}
\overset{3}{=} \sum_{v \in N(v_e)} \ \sum_{j : \overrightarrow{v v_e} \in \tau_j} \frac{r_j}{\beta}
\overset{4}{\le} \sum_{v \in N(v_e)} \frac{1}{\beta}
\overset{5}{\le} 1$$

where $N(v_e)$ is the neighbor set of $v_e$ in B. In the above derivation, $\overset{1}{=}$ is due to the construction of the hyper-tree $\tau'_j$, and $\overset{2}{=}$ holds because there is a unique parent v for the intermediate node $v_e$, as $\tau_j$ is a tree rooted at s. In other words, $v_e \in \tau_j$ iff there is a unique link $\overrightarrow{v v_e} \in \tau_j$. The right side of $\overset{3}{=}$ rewrites the sum by enumerating the neighbors of $v_e$. $\overset{4}{\le}$ holds because $\tau_1, \tau_2, \cdots, \tau_m$ with weights $r_1, r_2, \cdots, r_m$ form a feasible tree packing. $\overset{5}{\le}$ holds because each hyper-edge covers at most β nodes, and the neighbors of $v_e$ in B are the same nodes covered by e in H.

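Therefore, the hyper-trees $\tau'_1, \tau'_2, \cdots, \tau'_m$ with weights $r_1/\beta, r_2/\beta, \cdots, r_m/\beta$ constitute a feasible hyper-tree packing in H. Recalling that $\sum_{j=1}^{m} r_j/\beta = \frac{1}{\beta}\lambda_B \ge \frac{1}{\beta}\lambda_H$, we have $R_{tree}(H) \ge \frac{1}{\beta}\lambda_H(s,T) \ge \frac{1}{\beta}R_{nc}(H)$.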
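The bipartite construction used in the proof of Theorem 4 can be sketched as follows. The function names, the `("edge", e)` label for the node $v_e$, and the membership of the two hyper-edges in the example are assumptions made for illustration; the exact hyper-edges of Fig. 5 are not recoverable from the text.

```python
def hypergraph_to_bipartite(hyperedges):
    """Map {edge_name: set_of_nodes} to the arc set A of the bipartite digraph B.

    Each hyper-edge e becomes a new node ("edge", e), joined to every node v
    in e by two opposite unit-capacity arcs (v, v_e) and (v_e, v).
    """
    arcs = set()
    for name, nodes in hyperedges.items():
        ve = ("edge", name)
        for v in nodes:
            arcs.add((v, ve))
            arcs.add((ve, v))
    return arcs

def lift_path(nodes_on_path, edges_on_path):
    """Turn s = v0, e1, v1, ..., en, vn = t in H into the node sequence of a path in B."""
    path = [nodes_on_path[0]]
    for e, v in zip(edges_on_path, nodes_on_path[1:]):
        path += [("edge", e), v]
    return path

# Hypothetical hypergraph on nodes u, x, w, y with two hyper-edges of size 3.
H = {"e1": {"u", "x", "w"}, "e2": {"x", "w", "y"}}
B = hypergraph_to_bipartite(H)
P = lift_path(["u", "x", "y"], ["e1", "e2"])
assert all((a, b) in B for a, b in zip(P, P[1:]))   # P is a valid path in B
```

Since each hyper-edge e corresponds to exactly one node $v_e$, hyper-edge disjoint paths in H are lifted to edge disjoint paths in B, which is the property the proof relies on.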

C. Lower-bound for Hyper-networks

It is postulated that the upper-bound of 2 for coding advantage in undirected networks is not tight [10], and the largest value known is 8/7. Closing the gap between 2 and 8/7 has remained an important open problem. For hyper-networks, we prove that the maximum coding advantage is of order Ω(log β), through transforming combination networks [11] into hyper-networks.

Theorem 5. The maximum coding advantage among hyper-networks of maximum hyper-edge size β is of order Ω(log β).

Proof: We construct a hypergraph $H_C(n,k)$ from the undirected combination network C(n,k), by viewing the star topology centered at each relay node as a hyper-edge. $H_C(n,k)$ contains one source node, $\binom{n}{k}$ receiver nodes, and n hyper-edges each of unit capacity. Label the receivers with the chosen set $M \subset \{1, 2, \cdots, n\}$, $|M| = k$; hyper-edge $e_i$ ($i = 1, 2, \cdots, n$) connects the source and all the receivers M with $i \in M$. Each receiver is adjacent to k hyper-edges, and each hyper-edge has size $\beta = \binom{n-1}{k-1} + 1$.

Let $\mathcal{T}$ denote the set of all possible s-T hyper-trees, and $\{r_\tau \mid \tau \in \mathcal{T}\}$ be the optimal hyper-tree packing. Any hyper-tree $\tau \in \mathcal{T}$, which connects s to all the receivers, has at least n − k hyper-edges, because if there are k hyper-edges not contained in τ, the receiver connected only to them is uncovered. Summing all the hyper-edge capacity constraints $\sum_{\tau : e_i \in \tau} r_\tau \le 1$, $i = 1, 2, \cdots, n$, we have $\sum_{\tau \in \mathcal{T}} r_\tau |\tau| \le n$, where $|\tau|$ is the number of hyper-edges in τ and is at least n − k. Therefore, $R_{tree} = \sum_{\tau \in \mathcal{T}} r_\tau \le \frac{n}{n-k}$. We also claim $R_{nc} = k$, since each receiver is directly connected to the source by k hyper-edges. Let n = 2k; then $R_{nc}/R_{tree} \ge k/2$. Since $\beta = \binom{n-1}{k-1} + 1 < 2^{2k}$, we have $R_{nc}/R_{tree} > \frac{1}{4}\log\beta$.
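The quantities in the proof of Theorem 5 are easy to tabulate. The sketch below assumes the logarithm is taken to base 2, which is consistent with the step $\beta < 2^{2k}$; the function name `hc_bounds` is hypothetical.

```python
from math import comb, log2

def hc_bounds(k):
    """Quantities from the proof of Theorem 5 for H_C(n, k) with n = 2k."""
    n = 2 * k
    beta = comb(n - 1, k - 1) + 1          # hyper-edge size: its receivers plus the source
    r_nc = k                               # each receiver touches k unit hyper-edges
    r_tree_upper = n / (n - k)             # from sum_tau r_tau * |tau| <= n, |tau| >= n - k
    advantage_lower = r_nc / r_tree_upper  # lower bound on R_nc / R_tree
    return beta, advantage_lower, log2(beta) / 4

for k in (2, 4, 8, 16):
    beta, advantage, quarter_log = hc_bounds(k)
    assert beta < 2 ** (2 * k)
    assert advantage > quarter_log
    print(k, beta, advantage, round(quarter_log, 3))
```

The hyper-edge size β grows exponentially in k while the advantage bound grows linearly in k, which is why the resulting lower bound is logarithmic in β.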

VI. COST ADVANTAGE

Besides throughput improvements, bandwidth and cost saving is another important benefit of network coding. With the help of network coding, less overall bandwidth consumption is required to achieve a target multicast rate. More generally, we assume a link cost vector $w \in \mathbb{Q}_+^{A}$, indicating the cost to transmit a unit flow through a link. Here $\mathbb{Q}_+$ is the set of positive rational numbers. For a multicast solution with an underlying flow of rate $f(\overrightarrow{uv})$ at each link $\overrightarrow{uv}$, the total cost is $\sum_{\overrightarrow{uv} \in A} w(\overrightarrow{uv}) f(\overrightarrow{uv})$.

A. Upper-bound for Bidirected Networks

We start with uncapacitated networks that have sufficient bandwidth supply at each link, such that link capacities are not a limiting factor of optimal multicast cost. Given an uncapacitated bidirected graph D(V,A) and a link cost vector w, let $C_{tree}(D,w)$ and $C_{nc}(D,w)$ be the min cost to achieve a unit multicast rate with routing and network coding respectively. The cost advantage is defined as $C_{tree}(D,w)/C_{nc}(D,w)$.

With link capacity limits removed, an undirected network and a bidirected network differ only in that link weights in opposite directions are always equal in the former but not the latter. Let α′ denote the maximum ratio of link weights of the two directions, i.e., $\alpha' \triangleq \max_{\overrightarrow{uv} \in A} \frac{w(\overrightarrow{uv})}{w(\overrightarrow{vu})}$. The weight vector w is called α′-balanced.

When α′ = 1, we are essentially considering the maximum cost advantage in an undirected network, which was proved to be equal to the maximum coding advantage of undirected networks [5], [11], and therefore is upper bounded by 2:

Theorem 6. In an uncapacitated bidirected network with symmetric link weights, the cost advantage is at most 2.

For the general case α′ ≥ 1, we have the following result that echoes the bound on the coding advantage.

Theorem 7. In an uncapacitated bidirected network (B,w), the cost advantage is at most 2α′.

Proof: Let $\hat{w}$ denote the weight vector derived from w by truncating the larger weight of each pair of opposite links to the smaller one, so that $\hat{w}$ is symmetric. As w is α′-balanced, $\alpha'\hat{w} \ge w \ge \hat{w}$. Because the min cost is monotonic in the weight vector, we have $C_{nc}(B,w) \ge C_{nc}(B,\hat{w})$ and $C_{tree}(B,\alpha'\hat{w}) \ge C_{tree}(B,w)$. As $\hat{w}$ is symmetric, $C_{nc}(B,\hat{w}) \ge \frac{1}{2}C_{tree}(B,\hat{w})$ according to Theorem 6. Combining these inequalities with the fact that $\frac{1}{2}C_{tree}(B,\hat{w}) = \frac{1}{2\alpha'}C_{tree}(B,\alpha'\hat{w})$, we conclude that $C_{nc}(B,w) \ge \frac{1}{2\alpha'}C_{tree}(B,w)$.

If the network is not bidirected, the cost advantage is unbounded, as can be demonstrated in directed combination networks C(n,k) with the following setting of link weights: set the link weight to 1 for each link from the source to an intermediate node, and set the link weight to ε → 0 for other links. As each multicast tree contains at least n − k + 1 links from the source to the intermediate nodes, $C_{tree} \ge n - k + 1$. But with network coding, we can achieve a unit multicast rate with each link carrying a flow of rate 1/k, which means $C_{nc} \le n/k$. Thus letting n = 2k → ∞, we can see the cost advantage is unbounded. This can be viewed as a special case of α′ = ∞ (link costs are arbitrarily unbalanced).
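The truncation step in the proof of Theorem 7 can be illustrated with a small sketch. The weights below are hypothetical, and the names `balance_ratio` and `truncate` are introduced only for this example; the symbol for the truncated vector is written `w_hat` here.

```python
from fractions import Fraction

def balance_ratio(w):
    """alpha' = max over links (u, v) of w(u, v) / w(v, u)."""
    return max(Fraction(w[(u, v)]) / w[(v, u)] for (u, v) in w)

def truncate(w):
    """Symmetric weights: both directions get the smaller of the two weights."""
    return {(u, v): min(w[(u, v)], w[(v, u)]) for (u, v) in w}

# Hypothetical bidirected link weights.
w = {("a", "b"): 3, ("b", "a"): 1, ("b", "c"): 2, ("c", "b"): 4}
alpha_p = balance_ratio(w)
w_hat = truncate(w)

# The sandwich alpha' * w_hat >= w >= w_hat used in the proof holds link by link.
assert all(w_hat[l] <= w[l] <= alpha_p * w_hat[l] for l in w)
print("alpha' =", alpha_p)
```

The sandwich inequality is what allows the symmetric-weight bound of Theorem 6 to be transferred to w at the price of a factor α′.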

It is interesting to note that unlike the duality between the cost advantage and coding advantage in undirected networks, we have coding advantage of 1 for completely balanced networks, but maximum cost advantage lower bounded by 9/8 [11] with completely balanced link weights.

B. Lower-bound for Bidirected Networks

Similar to the case of coding advantage, we now derive a lower bound of order $\Omega(\sqrt{\alpha'})$ for the maximum cost advantage of bidirected networks with α′-balanced link weights.

Theorem 8. A lower-bound of order $\Omega(\sqrt{\alpha'})$ exists for the max cost advantage in an α′-weight-balanced bidirected network.

Proof: For a large integer α′, we construct a network with cost advantage of order $\Omega(\sqrt{\alpha'})$ by assigning an α′-balanced link weight vector w to the bidirected network ZK(p,N), where N = p(p − 1) = α′. Assign weights to the downward links: 1) w(e) = N/p for a link from the source to layer A; 2) w(e) = 1/p for a link from layer A to layer B; 3) w(e) = 1 for a link from layer B to layer C; 4) w(e) = 1/p for a link from layer C to the receivers. The weight of each upward link is N times the weight of the corresponding downward link. The network is therefore α′ = N balanced.

A network coding solution with unit flow rate on each downward link achieves multicast rate $\binom{N-1}{p-1}$ with cost $\binom{N}{p-1}\frac{N}{p} + p\binom{N}{p}\cdot\frac{1}{p} + \binom{N}{p} + p\binom{N}{p}\cdot\frac{1}{p}$. Therefore, the min cost to achieve a unit multicast rate satisfies $C_{nc} \le \frac{N}{N-p+1}\cdot\frac{N}{p} + 3\cdot\frac{N}{p}$.

For any multicast tree τ, let $a_\tau$ denote the number of layer A nodes in τ, $c_\tau$ the number of layer C nodes with predecessor in layer B, and $d_\tau$ the number of layer C nodes with a receiver predecessor. Considering the total cost of links entering these nodes, the cost of τ is no less than $a_\tau\frac{N}{p} + c_\tau + d_\tau\frac{N}{p} = a_\tau(p-1) + c_\tau + d_\tau(p-1) \ge N$, where the last inequality is due to (4). Hence $C_{tree} \ge N$, and for sufficiently large N, $\frac{C_{nc}}{C_{tree}} \le \frac{N}{N-p+1}\cdot\frac{1}{p} + \frac{3}{p} \le \frac{5}{p} \approx \frac{5}{\sqrt{\alpha'}}$.
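The cost accounting in the proof of Theorem 8 can be verified numerically. The following sketch recomputes the cost of the network coding solution layer by layer, using exact fractions, and checks the final ratio against 5/p. The function name `cost_ratio_bound` is hypothetical.

```python
from fractions import Fraction
from math import comb

def cost_ratio_bound(p):
    """Upper bound on C_nc / C_tree for ZK(p, N) with N = p(p-1), as in Theorem 8."""
    n = p * (p - 1)
    total_cost = (
        comb(n, p - 1) * Fraction(n, p)      # source -> layer A, weight N/p each
        + p * comb(n, p) * Fraction(1, p)    # layer A -> layer B, weight 1/p each
        + comb(n, p)                         # layer B -> layer C, weight 1 each
        + p * comb(n, p) * Fraction(1, p)    # layer C -> receivers, weight 1/p each
    )
    # Cost per unit of multicast rate (the solution achieves rate C(N-1, p-1)).
    c_nc_upper = total_cost / comb(n - 1, p - 1)
    assert c_nc_upper == Fraction(n, n - p + 1) * Fraction(n, p) + 3 * Fraction(n, p)
    return c_nc_upper / n                    # divide by the bound C_tree >= N

for p in (3, 5, 10, 30):
    assert cost_ratio_bound(p) <= Fraction(5, p)
    print(p, float(cost_ratio_bound(p)))
```

Because N = p(p − 1), the ratio decays like 1/p, i.e. like $1/\sqrt{\alpha'}$, which gives the claimed order of the cost advantage.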

C. Cost Advantage in Hyper-networks

For a hyper-network H, assign each hyper-link a weight w(e) indicating the cost to transmit a unit flow there; then the cost of a multicast solution is defined in the same way as in the bidirected model. Inspired by Agarwal and Charikar's work [5] on the relation between coding advantage and cost advantage in undirected networks, we prove that the maximum cost advantage in hyper-networks equals the maximum coding advantage, suggesting that these two quantities can be studied under a unified framework.

Theorem 9. Given a hyper-network $H = (V, \mathcal{E})$, the maximum coding advantage of any link capacity function c equals the maximum cost advantage of any link weight function w:

$$\max_{c} \frac{R_{nc}(H,c)}{R_{tree}(H,c)} = \max_{w} \frac{C_{tree}(H,w)}{C_{nc}(H,w)}$$

Proof: We carry out the proof in two steps. First, we prove that $\max_{c} \frac{R_{nc}(H,c)}{R_{tree}(H,c)} \le \max_{w} \frac{C_{tree}(H,w)}{C_{nc}(H,w)}$ by finding a weight function w for any given capacity c, such that the cost advantage with w is no less than the coding advantage with c. Without loss of generality, we may assume $R_{nc}(H,c) = 1$ and consider the dual of the optimal hyper-tree packing problem:

$$\begin{array}{ll}
\text{minimize} & \sum_{e \in \mathcal{E}} c(e)\,y(e) \\
\text{subject to} & \sum_{e \in \tau} y(e) \ge 1, \quad \forall \tau \in \mathcal{T} \\
& y(e) \ge 0, \quad \forall e \in \mathcal{E}
\end{array}$$

In fact, the optimal solution y of this dual problem gives the desired weight function. As $R_{nc}(H,c) = 1$, there is a network coding solution achieving unit multicast rate with cost no more than $\sum_{e \in \mathcal{E}} c(e)\,y(e)$, which equals $R_{tree}(H,c)$ according to the strong duality property. Therefore, $C_{nc}(H,y) \le R_{tree}(H,c)$. On the other hand, the constraints of the dual problem promise that each hyper-tree has cost at least 1, i.e., $C_{tree}(H,y) \ge 1$.
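For completeness, the two bounds just obtained combine into the claimed inequality as follows; this display only restates the step under the normalization $R_{nc}(H,c) = 1$ and introduces no new assumption.

```latex
\frac{C_{tree}(H,y)}{C_{nc}(H,y)}
  \;\ge\; \frac{1}{R_{tree}(H,c)}
  \;=\; \frac{R_{nc}(H,c)}{R_{tree}(H,c)},
\qquad\text{hence}\qquad
\max_{w}\frac{C_{tree}(H,w)}{C_{nc}(H,w)}
  \;\ge\; \max_{c}\frac{R_{nc}(H,c)}{R_{tree}(H,c)} .
```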

Second, we prove that $\max_{c} \frac{R_{nc}(H,c)}{R_{tree}(H,c)} \ge \max_{w} \frac{C_{tree}(H,w)}{C_{nc}(H,w)}$. Without loss of generality, assume $C_{tree}(H,w) = 1$. Consider a network coding solution achieving the min cost $C_{nc}(H,w)$. Let c(e) be the sum of information flows through hyper-link e in this solution. Obviously, $R_{nc}(H,c) \ge 1$. As the cost of each hyper-tree is at least 1, the optimal tree packing number $R_{tree}(H,c)$ is no more than the total cost $\sum_{e \in \mathcal{E}} c(e)\,w(e) = C_{nc}(H,w)$, which completes the proof.

Combining the above theorem with our previous results on the coding advantage in Section V, we can derive:


Fig. 6. A capacitated bidirected network, where both link capacity and link weight are balanced. Each link has unit capacity. Four pairs of links have labeled weights; other links have negligible weights. Target multicast rate is 2. A network coding solution using all the downward links has cost 6. A brute-force search shows that the min cost of a routing solution is 7, shown with colored links. The cost advantage is at least 7/6.

Corollary 1. In undirected hyper-networks, the cost advantage is upper bounded by the maximum hyper-link size β, and lower bounded by $\frac{1}{4}\log\beta$.

D. Cost Advantage in Capacitated Bidirected Networks

So far in this section, we have assumed that links are over-provisioned in bandwidth. We now remove this assumption, and consider cost advantage in capacitated networks, which further depends on the required multicast rate. Given a (bi)directed graph D(V,A), a link capacity vector c, a weight vector w, and a required multicast rate r which can be achieved by both routing and network coding, we define the cost advantage of a capacitated network as $C'_{tree}(D,c,w,r)/C'_{nc}(D,c,w,r)$, where $C'_{tree}(D,c,w,r)$ and $C'_{nc}(D,c,w,r)$ are the min cost of a routing solution and a network coding solution for achieving multicast rate r, respectively. The uncapacitated scenario can be viewed as a special case of the capacitated scenario, with $r \le c(\overrightarrow{uv})$, $\forall \overrightarrow{uv} \in A$.

For any network where the coding advantage is larger than 1, we can add a very high cost link from the source to each receiver, and set the required multicast rate accordingly, so that the routing solution has to resort to the high cost links, while a network coding solution does not. The cost advantage then becomes arbitrarily large in such capacitated networks.

Theorem 10. In a completely balanced network (B, c, w), the cost advantage is upper bounded by 1 + α′ for any achievable rate r.

Proof: Let $f(\overrightarrow{uv})$ be the min cost flow for the network coding solution, so that $C'_{nc} = \sum_{\overrightarrow{uv} \in A} w(\overrightarrow{uv}) f(\overrightarrow{uv})$. Since the network is completely balanced, there is a completely balanced subgraph (B′, c′) derived from f with capacity $c'(\overrightarrow{uv}) = c'(\overrightarrow{vu}) = \max\{f(\overrightarrow{uv}), f(\overrightarrow{vu})\}$. By Theorem 1, there is a routing solution in (B′, c′). Thus

$$C'_{tree} \le \sum_{\overrightarrow{uv} \in A} w(\overrightarrow{uv})\,c'(\overrightarrow{uv})
\le \sum_{\overrightarrow{uv} \in A} w(\overrightarrow{uv})\left(f(\overrightarrow{uv}) + f(\overrightarrow{vu})\right)
\le \sum_{\overrightarrow{uv} \in A} w(\overrightarrow{uv}) f(\overrightarrow{uv}) + \sum_{\overrightarrow{vu} \in A} \alpha' w(\overrightarrow{vu}) f(\overrightarrow{vu})
\le (1 + \alpha')\,C'_{nc}.$$

For an example where the cost advantage is strictly larger than 1, consider the completely balanced network in Fig. 6, where each link has unit capacity, the target multicast throughput is 2, and the weight vector is symmetric (labeled beside the links). By enumerating all possible routing solutions, we find that the min cost with routing is 7, while there is a network coding solution of cost 6. The cost advantage therefore is no less than 7/6 in this example.
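The chain of inequalities in the proof of Theorem 10 can be illustrated numerically. In the sketch below, the weights `w` and the flow `f` are hypothetical values chosen for illustration (f is not claimed to be an actual min-cost coding flow), and `routing_cost_bound` is a name introduced only here.

```python
def routing_cost_bound(f, w):
    """Cost of reserving c'(uv) = c'(vu) = max(f(uv), f(vu)) on every link."""
    c_prime = {(u, v): max(f[(u, v)], f[(v, u)]) for (u, v) in f}
    return sum(w[link] * c_prime[link] for link in f)

# Hypothetical bidirected link weights and per-link flow rates.
w = {("s", "a"): 1, ("a", "s"): 2, ("a", "t"): 3, ("t", "a"): 3}
f = {("s", "a"): 2, ("a", "s"): 0, ("a", "t"): 2, ("t", "a"): 1}

alpha_p = max(w[(u, v)] / w[(v, u)] for (u, v) in w)   # weight balance alpha'
c_nc = sum(w[link] * f[link] for link in f)            # cost of the flow f
bound = routing_cost_bound(f, w)
assert bound <= (1 + alpha_p) * c_nc
print(bound, c_nc, alpha_p)
```

The reserved capacities c′ are symmetric by construction, which is what makes Theorem 1 applicable to the subgraph (B′, c′).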

Combining all the results on completely balanced networks in this paper, we may conclude that the benefits of network coding in this scenario are marginal, not only in terms of maximum throughput, but also in terms of computational complexity. We can find a routing solution in polynomial time, which achieves the optimal multicast rate with cost at most twice the optimal cost with network coding (taking α′ = 1 in Theorem 10).

VII. EXTENSION TO BIDIRECTED HYPER-NETWORKS

So far, we have considered undirected hyper-networks which generalize the undirected network in terms of link size. For directed hyper-networks, since a directed network can be regarded as a directed hyper-network with link size β = 2, existing results show that coding advantage can be arbitrarily large even in directed hyper-networks with small link size [2].

“Bidirectional transmission” also exists in broadcast links, i.e., if node A can broadcast messages to node B, node B is able to broadcast to A as well. This motivates us to study the “bidirected hyper-networks”. In particular, let (e, v), v ∈ e denote a directed hyper-link that transmits messages from v to the set of nodes e \ {v}. A directed hyper-network H(V,A) is bidirected if ∃(e, u) ∈ A : v ∈ e implies ∃(e′, v) ∈ A : u ∈ e′.

Coding Advantage. Given a bidirected hyper-network H(V,A) with link capacity c, we define the link capacity balance parameter α as

$$\alpha \triangleq \max_{u,v \in V,\ \exists (e,v) \in A,\ u \in e} \frac{\sum_{(e',v) \in A,\ u \in e'} c(e',v)}{\sum_{(e'',u) \in A,\ v \in e''} c(e'',u)}$$

Note that there may be different hyper-links that connect u to v. Thus we take the sum capacity of these links as the direct transmission capacity from u to v in the definition. We refer to H(V,A,c) as an α-balanced bidirected hyper-network.

Theorem 11. The coding advantage in α-balanced bidirected hyper-networks is upper bounded by α(β − 1), where β is the maximum link size.

Proof: Given the hyper-network H(V,A) with link capacity c, we construct an α-balanced bidirected network B(V,A′,c′) by replacing each hyper-link (e, v) ∈ A with |e| − 1 directed links (v, u), ∀u ∈ e \ {v}, whose capacities are set to c(e, v). By the definition of α-balanced bidirected hyper-network, the constructed network B is α-balanced.

For any s, t ∈ V, the maximum (s, t)-flow in B is no less than the corresponding max-flow in H because any (s, t)-flow in H is also an (s, t)-flow in B. Thus, $R_{nc}(B) \ge R_{nc}(H)$. By Theorem 2, $R_{tree}(B) \ge \frac{1}{\alpha}R_{nc}(B)$. To complete the proof, it is sufficient to show that $R_{tree}(H) \ge \frac{1}{\beta-1}R_{tree}(B)$.

Let $\tau_1, \tau_2, \cdots, \tau_m$ with weights $r_1, r_2, \cdots, r_m$ be a feasible tree packing in B. We obtain a hyper-tree $\tau'_i$ from $\tau_i$ by replacing each link $(u,v) \in \tau_i$ with the hyper-link (e, u) from which (u, v) is introduced. As $\sum_{j : (u,v) \in \tau_j} r_j \le c(e,u)$ and (e, u) is replaced with |e| − 1 links in the construction of B, the hyper-tree packing $\tau'_1, \tau'_2, \cdots, \tau'_m$ with weights $r_1/(\beta-1), r_2/(\beta-1), \cdots, r_m/(\beta-1)$ is a feasible tree packing in H. Therefore, $R_{tree}(H) \ge \frac{1}{\beta-1}R_{tree}(B)$.
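The reduction from a bidirected hyper-network to a bidirected network used in the proof of Theorem 11 can be sketched as follows. The example hyper-network is hypothetical, and merging parallel links by adding their capacities is a simplification assumed here (it matches the sum capacities used in the definition of α); the function names are not from the paper.

```python
from collections import defaultdict

def expand_hyperlinks(hyperlinks):
    """Replace each directed hyper-link (e, v) by |e| - 1 links (v, u), u in e other than v.

    hyperlinks: list of (nodes, sender, capacity). Parallel links arising from
    different hyper-links are merged by adding their capacities.
    """
    cap = defaultdict(int)
    for nodes, sender, c in hyperlinks:
        for u in nodes:
            if u != sender:
                cap[(sender, u)] += c
    return dict(cap)

def balance(cap):
    """Largest ratio between the capacities of two opposite links."""
    return max(cap[(u, v)] / cap[(v, u)] for (u, v) in cap)

# Hypothetical bidirected hyper-network with maximum link size beta = 3.
hyperlinks = [
    ({"a", "b", "c"}, "a", 2),
    ({"a", "b"}, "b", 1),
    ({"a", "c"}, "c", 1),
    ({"b", "c"}, "b", 1),
    ({"b", "c"}, "c", 1),
]
cap = expand_hyperlinks(hyperlinks)
alpha = balance(cap)
beta = max(len(nodes) for nodes, _, _ in hyperlinks)
print("alpha =", alpha, "beta =", beta, "bound alpha*(beta-1) =", alpha * (beta - 1))
```

Each hyper-link of size |e| contributes |e| − 1 ≤ β − 1 directed links, which is the source of the factor β − 1 in the bound.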


A. Cost Advantage

Let w(e, u) denote the weight of a directed hyper-link (e, u). As multiple hyper-links may connect u to v, we use the minimum link weight to define the link weight balance parameter α′:

$$\alpha' \triangleq \max_{u,v \in V,\ \exists (e,v) \in A,\ u \in e} \frac{\min_{(e',v) \in A,\ u \in e'} w(e',v)}{\min_{(e',u) \in A,\ v \in e'} w(e',u)}$$

Theorem 12. In a bidirected hyper-network with α′-balanced link weights, the cost advantage is upper bounded by 2α′(β − 1), where β is the maximum link size.

Proof: Let H(V,A) denote the bidirected hyper-network with α′-balanced link weights w. Construct a bidirected network B(V,A′) with A′ = {(u, v) | ∃(e, u) ∈ A, v ∈ e}, and let the link weight be $w'(u,v) = \min_{(e,u) \in A,\ v \in e} w(e,u)$. By the definition of α′, B is α′-balanced.

By Theorem 7, we have $C_{tree}(B) \le 2\alpha' C_{nc}(B)$. Each multicast tree in B can be directly converted into a hyper-tree in H with the same cost by replacing each directed link (u, v) with the hyper-link (e, u), v ∈ e, that has the minimum weight w′(u, v). Thus, $C_{tree}(H) \le C_{tree}(B)$. To complete the proof, it is sufficient to show that $C_{nc}(B) \le (\beta - 1)C_{nc}(H)$.

For a network coding solution in H, let f(e, u) denote the flow rate on each hyper-link (e, u). We can use |e| − 1 flows f′(u, v) = f(e, u), v ∈ e \ {u}, to mimic the broadcast transmission, yielding a network coding solution in B with cost at most $(\beta - 1)C_{nc}(H)$.
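The last step of the proof of Theorem 12, mimicking each broadcast by |e| − 1 unicast flows, can be checked on a small instance. The hyper-links, weights and flow rates below are hypothetical values for illustration only, and the function names are introduced just for this sketch.

```python
def min_link_weights(hyperlinks):
    """w'(u, v) = smallest weight among hyper-links sent by u that contain v."""
    w_prime = {}
    for nodes, sender, weight in hyperlinks:
        for v in nodes:
            if v != sender:
                key = (sender, v)
                w_prime[key] = min(weight, w_prime.get(key, weight))
    return w_prime

def mimic_cost(hyperlinks, flow, w_prime):
    """Cost in B when hyper-link i carrying flow[i] is replaced by |e| - 1 unicast flows."""
    return sum(
        flow[i] * w_prime[(sender, v)]
        for i, (nodes, sender, _) in enumerate(hyperlinks)
        for v in nodes
        if v != sender
    )

# Hypothetical hyper-links (nodes, sender, weight) and flow rates f(e, u).
hyperlinks = [
    ({"a", "b", "c"}, "a", 2),
    ({"a", "b"}, "a", 1),
    ({"a", "b"}, "b", 1),
    ({"a", "c"}, "c", 3),
]
flow = [1, 0, 1, 1]

w_prime = min_link_weights(hyperlinks)
cost_h = sum(f * weight for f, (_, _, weight) in zip(flow, hyperlinks))
cost_b = mimic_cost(hyperlinks, flow, w_prime)
beta = max(len(nodes) for nodes, _, _ in hyperlinks)
assert cost_b <= (beta - 1) * cost_h
print(cost_h, cost_b)
```

Using the minimum weight w′(u, v) for every unicast copy can only lower the cost in B, so the factor β − 1 is an upper bound rather than an exact blow-up.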

VIII. CONNECTIONS TO REAL-WORLD NETWORKS

A. Bidirected Networks

Most of today’s wireline networks operate in full-duplex mode. The two opposite transmissions have independent link capacities. Consequently, these networks can be modeled as bidirected networks. The Internet can be roughly divided into access networks and backbone networks. Access networks connect subscribers to their ISPs. As most consumers have a larger download demand than upload demand, the access links are often imbalanced. There are many types of wireline access technologies. For example, α is usually between 3 and 12 for ADSL and VDSL links [26]. For an access network with optical fiber or cable modem, the capacities of up-stream and down-stream links depend on the ISP’s sales plans. According to the empirical study on residential broadband plans over 29 countries [27], α in broadband plans from 25 countries ranges from 2 to 21, with a median value of 8.53. A recent measurement on the broadband performance in the US also reveals that α lies between 2 and 16 [28]. Ethernet connections are commonly found in companies and schools. For an Ethernet with twisted-pair or fiber optic links in conjunction with full-duplex switches, α = 1. We note that if the Ethernet is connected by coaxial cables or hubs that work in half-duplex mode, it should be modeled as an undirected hyper-network.

Backbone networks inter-connect local access networks. Measurement studies show that the core of the Internet has completely balanced link capacities. For example, the six ISP topologies in the Rocketfuel project are all completely balanced [23]. Internet structure studies often apply a single value for specifying link capacities in both directions [7]. If the background traffic is considered, the available bandwidths for a multicast session may be imbalanced. Fraleigh et al. [25] report that the closer to the Internet backbone, the more symmetric the communication traffic is. For the links they examined (OC-48), the opposite traffic ratio is always between 1:1 and 5:1. John and Tafvelin [8] tap a backbone link (OC-192), and report that no significant difference between opposite link flow volumes can be observed. Fig. 7 summarizes the implication of the coding advantage upper-bounds in real-world networks.

Fig. 7. Upper-bounds of coding advantage in real-world networks. (Horizontal axis: α; vertical axis: coding advantage. Curves: our upper-bound and the previous upper-bound [10]. Marked regions: backbone networks and access networks.)

B. Hyper-Networks

Wireless networks have the salient feature of supporting broadcast transmissions. Both directed and undirected hyper-network models capture such broadcast nature but not wireless interference, and hence cannot be used to study the coding advantage in wireless networks. However, the directed hyper-network model can be used to study the cost advantage that refers to the minimum cost (e.g. energy) to multicast a unit-length message, where interference is irrelevant. Wu et al. proved that the minimum energy multicast formulation is equivalent to a cost minimization with linear pricing [29], and there are a number of works using directed hypergraphs to represent the network topology [9]. We note that many wireless networks are bidirected, which can be modeled by the bidirected hyper-network. The maximum size of a broadcast link varies in a wide range. For ad hoc wireless networks, the broadcast range is often restricted, resulting in a small β. For example, β ≤ 14 in the testbeds of wireless network coding application [30]. For cellular networks, a base station may serve thousands of terminals.

IX. CONCLUSION

We have focused on the benefits of network coding in two types of parameterized networks throughout this work, including bidirected networks and hyper-networks. Compared to simple directed and undirected network models, these networks are more powerful and flexible for characterizing real-world networks. We proved a number of upper-bounds on the potential benefits of network coding, in terms of improving multicast throughput and saving multicast cost.

REFERENCES

[1] R. Ahlswede, N. Cai, S.-Y. Li, and R. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, Jul. 2000.

[2] S. Jaggi, P. Sanders, P. Chou, M. Effros, S. Egner, K. Jain, and L. Tolhuizen, “Polynomial time algorithms for multicast network code construction,” IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 1973–1982, Jun. 2005.

[3] Z. Li, B. Li, and L. Lau, “On achieving maximum multicast throughput in undirected networks,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2467–2485, Jun. 2006.

[4] Z. Li and B. Li, “Network coding in undirected networks,” in Proc. 2004 CISS.

[5] A. Agarwal and M. Charikar, “On the advantage of network coding for improving network throughput,” in Proc. 2004 ITW.

[6] C. Fragouli and E. Soljanin, “Network coding fundamentals,” in Foundations and Trends in Networking, 2007.

[7] A. M. Townsend, “Network cities and the global structure of the Internet,” American Behavioral Scientist, vol. 44, no. 10, pp. 1697–1716, June 2001.

[8] W. John and S. Tafvelin, “Differences between in- and outbound Internet backbone traffic,” in 2007 TERENA Networking Conference.

[9] D. S. Lun, N. Ratnakar, M. Medard, R. Koetter, D. R. Karger, T. Ho, E. Ahmed, and F. Zhao, “Minimum-cost multicast over coded packet networks,” IEEE/ACM Trans. Networking, vol. 14, pp. 2608–2623, Jun. 2006.

[10] Z. Li, B. Li, and L. C. Lau, “A constant bound on throughput improvement of multicast network coding in undirected networks,” IEEE Trans. Inf. Theory, vol. 55, no. 3, pp. 1016–1026, Mar. 2009.

[11] S. Maheshwar, Z. Li, and B. Li, “Bounding the coding advantage of combination network coding in undirected networks,” IEEE Trans. Inf. Theory, vol. 58, no. 2, pp. 570–584, 2012.

[12] D. M. Chiu, R. Yeung, J. Huang, and B. Fan, “Can network coding help in p2p networks?” in Proc. 2006 International Symposium on Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks.

[13] Z. Shao and S. Li, “To code or not to code: rate optimality of network coding versus routing in peer-to-peer networks,” IEEE Trans. Commun., vol. 59, no. 4, pp. 948–954, 2011.

[14] Y. Xu, I. Butun, R. Sankar, N. Sapankevych, and J. Crain, “Comparison of routing and network coding in undirected network group communications,” in Proc. 2012 IEEE Southeastcon.

[15] J. Liu, D. Goeckel, and D. Towsley, “Bounds on the throughput gain of network coding in unicast and multicast wireless networks,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 582–592, Jun. 2009.

[16] A. Keshavarz-Haddad and R. Riedi, “Bounds on the benefit of network coding for wireless multicast and unicast,” IEEE Trans. Mobile Computing, vol. PP, no. 99, pp. 1–13, 2012.

[17] S. Karande, Z. Wang, H. Sadjadpour, and J. Garcia-Luna-Aceves, “Multicast throughput order of network coding in wireless ad-hoc networks,” IEEE Trans. Commun., vol. 59, no. 2, pp. 497–506, 2011.

[18] F. Yang, M. Huang, S. Zhang, and W. Zhou, “Bound analysis of physical layer network coding in interference-limited two-way relaying system,” in Proc. 2012 VTC Spring, pp. 1–5.

[19] C. Chekuri, C. Fragouli, and E. Soljanin, “On average throughput and alphabet size in network coding,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2410–2424, Jun. 2006.

[20] J. Bang-Jensen and G. Z. Gutin, Digraphs: Theory, Algorithms and Applications, 2nd ed. Springer, 2008.

[21] J. Edmonds, “Edge-disjoint branchings,” in Combinatorial Algorithms. Academic Press, 1973, pp. 91–96.

[22] Y. Wu, K. Jain, and S.-Y. Kung, “A unification of network coding and tree-packing (routing) theorems,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2398–2409, Jun. 2006.

[23] Y. Wu, P. Chou, and K. Jain, “A comparison of network coding and tree packing,” in Proc. 2004 ISIT.

[24] S.-Y. Li, R. Yeung, and N. Cai, “Linear network coding,” IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, Feb. 2003.

[25] C. Fraleigh, S. Moon, B. Lyles, C. Cotton, M. Khan, D. Moll, R. Rockell, T. Seely, and S. Diot, “Packet-level traffic measurements from the Sprint IP backbone,” IEEE Network, vol. 17, no. 6, pp. 6–16, Nov.-Dec. 2003.

[26] “Very high speed digital subscriber line.” Available: http://en.wikipedia.org/wiki/Digital_subscriber_line

[27] S. J. Wallsten and J. Riso, “Residential and business broadband prices part 1: An empirical analysis of metering and other price determinants,” 2010. Available: http://works.bepress.com/scott_wallsten/59

[28] I. Canadi, P. Barford, and J. Sommers, “Revisiting broadband performance,” in Proc. 2012 ACM Internet Measurement Conference.

[29] Y. Wu, P. Chou, and S.-Y. Kung, “Minimum-energy multicast in mobile ad hoc networks using network coding,” IEEE Trans. Commun., vol. 53, no. 11, pp. 1906–1918, 2005.

[30] S. Sengupta, S. Rayanchu, and S. Banerjee, “Network coding-aware routing in wireless networks,” IEEE/ACM Trans. Networking, vol. 18, no. 4, pp. 1158–1170, 2010.

Xunrui Yin received his B.S. and Ph.D. degrees in computer science from Fudan University in 2007 and 2013, respectively. He is currently a postdoctoral fellow at the University of Calgary, Canada. His research interests include network coding, network topology design and coding theory.

Yan Wang received her Master’s degree in computer science from Nanchang University in 2006. She is currently working towards her Ph.D. at Fudan University, China. She has been an Assistant Professor at East China Jiaotong University since 2006. She is a recipient of the 2013 Google China Anita Borg Memorial Scholarship. Her research interests include distributed storage systems, network coding and data center networks.

Zongpeng Li received his B.E. degree in Computer Science and Technology from Tsinghua University (Beijing) in 1999, his M.S. degree in Computer Science from University of Toronto in 2001, and his Ph.D. degree in Electrical and Computer Engineering from University of Toronto in 2005. Since 2005, he has been with the Department of Computer Science at the University of Calgary. In 2011-2012, Zongpeng was a visitor at the Institute of Network Coding in CUHK. His research interests are in computer networks and network coding.

Xin Wang received his BS Degree in Information Theory and MS Degree in Communication and Electronic Systems from Xidian University, China, in 1994 and 1997, respectively. He received his PhD Degree in Computer Science from Shizuoka University, Japan, in 2002. He is currently a professor at Fudan University, Shanghai, China. His research interests include quality of network service, next-generation network architecture, mobile Internet and network coding.

Jin Zhao received the B.Eng. degree in computer communications from Nanjing University of Posts and Telecommunications, China, in 2001, and the Ph.D. degree in computer science from Nanjing University, China, in 2006. He has been with Fudan University since 2006, where he is currently an Associate Professor. He was a visiting scholar at the University of Goettingen, Germany, for three months in 2010. His research interests include future Internet and network coding theory. He is a member of IEEE and ACM.

Xiangyang Xue received the B.S., M.S., and Ph.D. degrees in communication engineering from Xidian University, Xian, China, in 1989, 1992, and 1995, respectively. He joined the Department of Computer Science, Fudan University, Shanghai, China, in 1995. Since 2000, he has been a Full Professor. His current research interests include multimedia analysis, retrieval and filtering. He has authored more than 100 research papers in these fields.