On the design of network control and management plane



Computer Networks 55 (2011) 2079–2091

Contents lists available at ScienceDirect

Computer Networks

journal homepage: www.elsevier.com/locate/comnet


Hammad Iqbal a,*, Taieb Znati a,b

a School of Information Sciences, University of Pittsburgh, Pittsburgh, PA 15260, United States
b Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260, United States


Article history:
Received 17 December 2009
Received in revised form 16 December 2010
Accepted 29 January 2011
Available online 25 February 2011
Responsible Editor: Neuman de Souza

Keywords:
Network management
Control
Network architecture

1389-1286/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.comnet.2011.01.018

* Corresponding author. Tel.: +1 412 624 8490; fax: +1 412 624 8854.
E-mail addresses: [email protected] (H. Iqbal), [email protected] (T. Znati).

We provide a design of a control and management plane for data networks using the abstraction of the 4D architecture, utilizing and extending 4D's concept of a logically centralized Decision plane that is responsible for managing network-wide resources. In this paper, a scalable protocol and a dynamically adaptable algorithm for assigning Data plane devices to a physically distributed Decision plane are investigated, which enable a network to operate with minimal configuration and human intervention while providing optimal convergence and robustness against failures. Our work is especially relevant in the context of ISPs and large geographically dispersed enterprise networks, where robust and scalable network-wide control of a large number of heterogeneous devices is required. We also provide an extensive evaluation of our design using real-world and artificially generated ISP topologies, along with an experimental evaluation using the ns-2 simulator.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Present-day data networks are controlled by a variety of distributed routing algorithms (e.g. OSPF, IS-IS, BGP, etc.), each working independently to achieve some network-wide objective, while operating collectively on diverse physical network devices. This has created a situation where each network function (e.g. inter-domain and intra-domain routing) maintains a distinct state across many different physical devices and is governed by its own set of configuration rules and protocol logic, making it extremely difficult to control their interactions. Consequently, the management of typical data networks requires extensive manual configuration of individual protocol parameters, leaving the networks fragile [1–3] and insecure [4].

Effective control and management is especially a challenge for large and geographically dispersed networks, such as first and second tier ISPs, where it is important to efficiently manage the network resources across a large number of heterogeneous network devices, while meeting


strict constraints on network availability and reliability. The control of such networks has additional challenges, as the robustness, scalability and responsiveness of the control functionality are impacted by scale and geographical dispersion.

Incremental solutions to improve network management, including the use of better management tools, have been ineffective as they try to match the pace of changes in various device operations and technical advances. Additionally, newer services and objectives beyond best-effort routing place new demands on the network control algorithms that are difficult to meet in the presence of intricate inter-dependencies and distributed state and logic. This often necessitates the error-prone and laborious process of indirectly inducing desired behavior in dynamic protocol operation through static configurations, e.g. traffic engineering [5].

To tackle the challenge of management complexity, an alternative approach to incremental solutions involves centralization of control state and logic inside a logically centralized Decision plane that is responsible for collecting, computing, and maintaining the state required by the network devices to operate. This approach is the basic tenet of the 4D architecture [2], which advocates a new


layering design of the IP networks which separates the task of packet forwarding, a data layer function, from the task of network control, an operation and management function. This separation of data and control layers is in contrast with the current practice, where the data forwarding mechanism and control logic are intertwined inside monolithic network devices, such as network routers or switches. This approach to network control necessitates the centralization of control state and logic inside a logically centralized Decision plane that is responsible for collecting, computing, and maintaining the state required by the network devices to operate.

The design of an efficient and robust Decision plane requires careful consideration of both efficiency and robustness in its design. A physically centralized Decision plane design was investigated in [6,7], where the entire autonomous system was controlled by a single Decision Element (DE) and replication of DEs was used to ensure Decision plane robustness to DE failures. An alternative design approach was identified in [8], where the logically centralized Decision plane was distributed over physically independent DEs. In this design, each DE controls a subset of the entire network and works collaboratively with other DEs to achieve overall network control. This approach to logically centralized Decision plane design tries to balance the extreme design positions of total distribution of network control, as seen in the case of current IP networks, and total physical centralization. However, it is also important to ensure that the reliability of the physically distributed control approach matches or exceeds the reliability offered by today's distributed architecture.

In this paper, we focus on the design of a logically centralized Decision plane, using the clean-slate network paradigm as the basis for developing an efficient, robust, and reliable network control architecture. We argue that the Decision plane design should be based on meeting the following objectives:

• Scalability: The Decision plane must be scalable to network size in terms of the number of routers.
• Robustness: The design should be dynamically adaptable to failures at both Decision and Data planes.
• Optimal convergence: The total response time of the Decision plane to any network event must be minimized, and the protocol operating at the Decision plane should be able to converge quickly enough to operate on the time-scale of events happening at the Data plane, e.g. router/link failures.

Achieving these objectives requires the development of a Decision Plane Protocol (DPP) that maintains a network-wide state across the set of physically distributed DEs, and presents a uniform interface to the network switches or routers.¹ Furthermore, the DEs and their assigned routers must respond swiftly to events such as failures and traffic surges. This requires that the delay between the DEs and their assigned routers be minimized. This paper addresses

¹ We use "router" as a generic label for routers or switches in the Data plane, while "DE" is used to represent Decision Elements in a Decision plane.

these design requirements and presents a Decision plane where a set of DEs, each governing a subset of routers, collaboratively maintains a network-wide state to support network-wide routing decisions.

Our work is especially relevant in the context of ISPs and similar large and geographically dispersed networks, where network-wide control is highly desired but any Decision plane design must meet stringent challenges of scalability and robustness, which are explicit design objectives of our scheme. In our design, an individual member of the Decision plane only governs a subset of the total number of routers in the Data plane, and Points of Presence (POPs) in ISP topologies are naturally amenable to such grouping.

The main contribution of our work is the design of a scalable, logically centralized and physically distributed Decision plane. The first building block in our design is the formulation of an optimization problem focused on efficient assignment of routers to DEs. The solution of this problem leads to an algorithm that minimizes network delay between the DEs and their assigned routers while balancing the load at the DEs. This algorithm is then used in the proposed protocol that is responsible for the operation of the logically centralized Decision plane. Our paper is organized as follows: Section 2 presents the related work in network control and management. We describe the network model used in our paper in Section 3. Trade-offs in the design of the assignment algorithm are considered in Section 4. In Section 5, a formulation of the router assignment problem is presented along with a novel adaptive algorithm for its solution. Section 6 describes our proposed protocol for logically centralized Decision plane operation. Section 7 provides an analysis of the algorithm's performance on real-world and artificially generated topologies, and Section 8 describes the simulation results of our Decision plane implementation. Section 9 concludes our paper.

2. Related work

Centralization of network control has been explored in a variety of networks and different contexts. In data networks, centralized route computation was employed in legacy specialized networks such as IBM SNA [9] and TYMNET [10]. IBM SNA employed dedicated network controllers to compute the routes in a session-based host-terminal network. TYMNET used a single Network Supervisor to compute the routes for a virtual circuit based network. TYMNET's use of a centralized Network Supervisor is analogous to using a single DE in a physically centralized architecture. In TYMNET's case, the scalability of the network was constrained by the resource bottleneck at the Network Supervisor, limiting the maximum network size to 500–600 nodes. While recognizing the technological advances in computation power and bandwidth availability since TYMNET's demise, we note that the scalability concerns about physically centralized designs are still valid. In our design, the physically distributed Decision plane operates asynchronously over the AS network and is specifically designed for better scalability.

In IP network design, several recent studies have embraced centralization of network logic as a way of


[Figure: three stacked planes, top to bottom: Decision; Dissemination & Discovery; Data.]

Fig. 1. Layered design of 4D architecture.


overcoming management complexity or providing new services that are presently difficult to implement. Centralization of control logic has been explored within the context of traffic management, as seen in MPLS [11], and BGP route management, as exemplified by RCP. Routing Control Platform (RCP) [1,12] was proposed as a logically centralized point for computing Border Gateway Protocol (BGP) routes and improving the scalability of large networks. RCP uses centralized servers for route computation and utilizes iBGP paths between BGP-speaking routers and servers. However, RCP is limited to BGP route computation and does not extend to Interior Gateway Protocol (IGP) routes. In contrast, the Decision plane design provided in our work is not limited to either inter- or intra-domain route computation.

Our work is primarily based on the design of the 4D architecture [2], the architectural vision of which we embrace and extend in this paper. We also use the same layering semantics as used in the 4D architecture, where the Decision plane is responsible for network-wide management. A brief discussion of the salient features of the 4D architecture is provided in Section 3. Yan et al. [7] provide a physically centralized design of the 4D Decision plane, where network-wide state and control logic is centralized in a single Decision Element (DE). In contrast, our work explores the design space between total centralization and total distribution, and provides a Decision plane design that allows multiple, physically distributed, DEs to cooperatively govern the (logic-and-state-less) network devices.

Several efforts in open router design [13–15] have also advocated migration of control functions away from routers to reduce their complexity. The SoftRouter architecture [13] advocated separation of the control function from the data forwarding task of the routers, and provided protocols for binding between routers and the servers implementing the distributed network algorithms. Recently, CONMan [16] utilized the concept of a management plane and centralization in the design and operation of "network managers" that are used to manage the protocols running on individual routers. Similarly, Morpheus [17] provides an open routing platform for inter-AS routing. However, this line of research is constrained by having to use the Internet's distributed control algorithms, even in the case of centralized computation, e.g. in the "control elements" in SoftRouter's case. In contrast, our work uses 4D's approach of network-wide decision making and presents a robust and scalable design for the Decision plane that goes beyond the centralized implementation of current distributed algorithms.

² Discovery and Dissemination planes are shown merged together here for the sake of simplicity.

3. Network model

We utilize the abstraction of the 4D architecture [2], which decomposes a data network into four separate planes, viz. the Data, Discovery, Dissemination, and Decision planes. The Data plane comprises the routers, switches, and other network level devices. The main distinction between 4D's Data plane and the conventional network architecture is the lack of any control state or logic in 4D's Data plane. Instead of using distributed protocols and requiring

pre-configured state, the Data plane devices in the 4D architecture are governed by a centralized Decision plane operating at the autonomous system level. The Decision plane is therefore responsible for collecting and maintaining information about the state of network devices and utilizing this centralized view for computing the mechanisms (such as routing tables) required by the Data plane devices. As an example, the basic routing functionality can be implemented by collecting the network topology, through the use of the Discovery & Dissemination planes, and using it to generate the routing tables at the Decision plane. These routing tables would be sent to the routers, using the paths established by the Dissemination plane, where they would be used for making packet forwarding decisions. Similarly, a Decision plane is envisioned to control access (by configuring Access Control Lists), flow level authentication, traffic engineering, and other similar functions that can benefit from network-wide views, centralized decision-making, and direct control over network devices. In this paper, our focus is on the Decision plane design and we assume the existence of a basic set of services provided by the Discovery & Dissemination planes, such as provided by Tesseract [7].

Fig. 1 shows the layered architecture adopted in this paper.² This layering provides a separation of the data forwarding mechanisms, such as packet forwarding and filtering, from the state and logic required to manage the network. This separation is aimed at eliminating the need for implementing complex distributed control at the device level of routers and switches, and moving the control functionality to the logically centralized Decision plane.

The new decomposition of network functionality by this architecture eliminates the current management complexity of configuring myriad distributed algorithms and protocols at the device level. Instead of device level configurations, a network administrator would define AS-wide policies that will be translated by the logically centralized Decision plane into the mechanisms and instructions needed by the network devices. The use of centralized algorithms at the Decision plane, for tasks such as network-wide route computation, allows another opportunity for reducing complexity and misconfigurations, as centralized algorithms require less configuration and often allow simpler implementations.
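As a concrete illustration of the kind of centralized computation delegated to the Decision plane, the sketch below (our own illustration; the function name and topology are hypothetical, not from the paper) derives per-router forwarding tables from a network-wide topology view using unweighted shortest paths:

```python
from collections import deque

def compute_forwarding_tables(adjacency):
    """Compute next-hop forwarding tables for every router from a
    network-wide topology view, using unweighted shortest paths (BFS).
    `adjacency` maps each router to the set of its neighbors."""
    tables = {}
    for src in adjacency:
        # BFS from src; record the first hop used to reach each node.
        first_hop = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in first_hop:
                    # A direct neighbor is its own first hop; farther
                    # nodes inherit the first hop of their BFS parent.
                    first_hop[v] = v if u == src else first_hop[u]
                    queue.append(v)
        tables[src] = {dst: hop for dst, hop in first_hop.items() if hop}
    return tables

# Hypothetical four-router topology: r1-r2-r4 and r1-r3-r4.
topology = {
    "r1": {"r2", "r3"},
    "r2": {"r1", "r4"},
    "r3": {"r1", "r4"},
    "r4": {"r2", "r3"},
}
tables = compute_forwarding_tables(topology)
```

In the 4D model these tables would then be pushed to the routers over the Dissemination plane, rather than being computed by the routers themselves.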


However, physical centralization of network control logic is undesirable because of potential problems with scalability and fault-tolerance. Logical centralization of network control is an alternative, explored earlier in [8], which proposed using a set of Decision Elements (DEs) that can collaborate to perform the function of network-wide control, adding a level of distribution in the Decision plane. We model the Decision plane in this paper utilizing the same abstraction of logical centralization; the high-level design is exemplified in Fig. 2 for an ISP topology spanning the continental US with several POPs. The figure illustrates a logically centralized Decision plane, comprising three DEs, that is controlling a large ISP network. The ISP network is modeled as having three POPs, each of which contains different types of routers, as is commonly seen in ISP networks with backbone, edge, DSL, and other types of routers. In the course of normal operation, as depicted, each DE is seen as controlling a different POP. The figure also illustrates the few basic assumptions taken in our network model.

(1) The entire network topology is under a single administrative control.

(2) The Decision plane is fully connected, i.e. there is a path from each DE to all other DEs that is not dependent on the operation of the Data plane.

(3) Positioning of DEs corresponds to the natural geographical clustering of routers in the Data plane, e.g. within an ISP POP.

We believe these assumptions are easy to meet in any reasonably large network where control and management is presently an issue. The first assumption is necessary for consistent network-wide management and deserves no further explanation. The use of dedicated out-of-band control paths in the second assumption is in contrast with the in-band paths used in current IP networks, where data and routing information packets share the same channels. Although it is possible to use the same scheme in a logically centralized Decision plane design, we have purposely avoided the potential complexity and network fragility

[Figure: an ISP Data plane with POP1, POP2, and POP3, controlled by DEs in the Decision plane; logical and physical links shown.]

Fig. 2. Overview of the Decision plane design.

introduced by piggybacking control information over data paths. Our use of out-of-band paths is analogous to the SS7 signaling used in PSTN networks [18] and can be similarly implemented. Use of separate time-slots or wavelength channels for control messages is one way this separation could be accomplished. Finally, our third assumption positions DEs in accordance with the clustering of routers in the underlying Data plane, using the techniques discussed in [8]. This ensures that the latency of Decision plane response, and convergence delay in case of failures, is kept close to minimum.

In our design, each DE is only responsible for computing routing tables for the routers under its direct control, i.e. a subset of the total number of routers in a network. We refer to this (sub)set of routers as a logical area, and it marks the extent of a DE's direct control over the network. Moreover, DEs exchange reachability information about their logical areas and utilize this information in establishing routing paths between different logical areas. In the case of unweighted shortest-paths routing, which we employ for route computation, a path between routers in two different logical areas must travel the inter-area links between them. This results in optimal routes only under the condition that a similar routing process on the complete topology would have selected the same path. A similar argument also applies to the intra-area routes. It is easy to see that this condition is fulfilled in topologies where distances between routers inside geographical clusters are less than the distances between the clusters. We believe network size and geographical distances between sub-entities in enterprise and ISP networks naturally allow the fulfillment of this condition.
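To make this division of labor concrete, the following sketch (our own, with hypothetical data structures; the paper defers the actual protocol to Section 6) shows how one DE could build a router's table from its logical area's subgraph plus reachability summaries received from peer DEs:

```python
from collections import deque

def intra_area_next_hops(area_adj, src):
    """First-hop table via BFS, restricted to one logical area's subgraph."""
    first_hop, queue = {src: None}, deque([src])
    while queue:
        u = queue.popleft()
        for v in area_adj[u]:
            if v not in first_hop:
                first_hop[v] = v if u == src else first_hop[u]
                queue.append(v)
    return first_hop

def de_routing_table(area_adj, src, remote_reachability, border_links):
    """Routing table for `src`: intra-area shortest paths, plus routes to
    routers advertised by peer DEs, steered toward a border router.
    remote_reachability: {peer_de: set of routers it governs}
    border_links: {peer_de: this area's border router toward that peer}"""
    hops = intra_area_next_hops(area_adj, src)
    table = {dst: hop for dst, hop in hops.items() if hop is not None}
    for de, routers in remote_reachability.items():
        border = border_links[de]
        # A border router forwards straight onto its inter-area link,
        # modeled here as the pair ("link", peer DE).
        exit_hop = hops[border] if border != src else ("link", de)
        for dst in routers:
            table[dst] = exit_hop
    return table

area = {"a": {"b"}, "b": {"a"}}   # one DE's logical area
remote = {"e2": {"c", "d"}}       # routers governed by peer DE e2
borders = {"e2": "b"}             # b holds the inter-area link toward e2
```

Under the geographical-clustering condition above, routing toward the border router and letting the peer DE take over yields the same paths a network-wide shortest-path computation would choose.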

The logically centralized structure of the Decision plane strikes a balance between the extremes of distributed operation of individual routers, as seen in the current data networks, and total centralization, with its inherent scalability and robustness issues. More subtly, it also has the potential to allow easier deployment and transition from a distributed model of operation; instead of a "forklift" change of the entire networking infrastructure, only a subset of the AS network could be transitioned at a time.



[Figure: DEs e1 and e2 with routers r1–r4.]

Fig. 3. Effect of contiguity constraint on a sample topology where multi-hop router assignments are indicated by dashed lines. The (infeasible) assignment of router r4 to DE e2 would have resulted in minimal delay and optimal load-balancing.


4. Trade-offs in decision plane design

Robustness of the Decision plane is dependent on the mechanisms employed to ensure its continued functioning in case of failures. While the Decision plane routing logic deals with failures happening at the Data plane, the mitigation of failures at the Decision plane is dependent on its own design. An approach to this problem was presented in [7], where the Decision plane was designed to be physically centralized and multiple hot-standby DEs were used to increase its robustness in case the current "master" DE fails.

In contrast, a DE in a logically centralized Decision plane is not required to control the entire AS; only a subset of the total number of routers is under the control of a single DE. Any DE failure would therefore orphan the routers under its control. This calls for a scheme that reassigns orphaned routers to the functional DEs so that network control is reinstated.

This assignment of routers takes place both at network bootstrap and as a result of DE failures. It involves a trade-off in minimizing routing convergence delay, response time, and load balancing at the Decision layer. The routing convergence delay — the transient time period between a DE failure and the orphaned routers' reception of new routing assignments — represents loss of management control over the orphaned devices, and must be minimized. Similarly, in normal operation the response time of the Decision plane also needs to be minimized. In both cases, the aggregate router-DE delay provides a desirable metric for the minimization objective. Additionally, large variation in DE workloads can result in slower Decision plane response in parts of the network and increased potential for DE failures, suggesting a need for load balancing at the Decision plane. Therefore, the optimality of router assignment will be based on minimizing the aggregate router-DE delay while limiting the variance in DE workloads.

The assignment mechanism is also constrained in a unique way, as any router assignment must adhere to the underlying physical Data plane topology. Specifically, since a DE only controls the routers in its own logical area, the assignment mechanism must avoid any assignment that involves the usage of inter-logical-area paths between routers belonging to the same logical area. This condition is necessary to ensure that routers in a logical area can be governed locally without requiring AS-wide network knowledge. Therefore, there must be a physical path between routers that are assigned to the same DE that does not involve any links or routers not totally contained within the same logical area. We refer to this condition as the contiguity constraint, and Fig. 3 illustrates a simple example where the assignment that is optimal in terms of delay and load balancing objectives does not satisfy the contiguity requirement.
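The contiguity constraint lends itself to a simple feasibility check; the sketch below (illustrative only, with a made-up line topology in the spirit of Fig. 3) verifies that a candidate router set for one DE is connected using only its own members:

```python
from collections import deque

def is_contiguous(assigned, adjacency):
    """Check the contiguity constraint: the routers assigned to one DE
    must be mutually reachable using only routers in the same set."""
    if not assigned:
        return True
    assigned = set(assigned)
    start = next(iter(assigned))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            # Traverse only links whose both endpoints are in the set.
            if v in assigned and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == assigned

# Hypothetical line topology: r1 - r2 - r3 - r4.
adj = {"r1": {"r2"}, "r2": {"r1", "r3"}, "r3": {"r2", "r4"}, "r4": {"r3"}}
assert is_contiguous({"r1", "r2"}, adj)
assert not is_contiguous({"r1", "r4"}, adj)  # any r1-r4 path needs r2, r3
```

An assignment like {r1, r4}, even if it minimized delay and balanced load, would fail this check and is therefore infeasible.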

Trade-offs also exist between the complexity of a recovery scheme and the desired level of robustness. For example, we can generalize a simple scheme of using backups as proposed in [7,13], where each router is statically configured with a primary and an ordered list of standby DEs. Failure of the primary DE automatically results in the assignment of its orphaned routers to their highest-ranked functional DEs. However, it is easy to show that this scheme can lead to uneven DE workloads in case of multiple DE failures, potentially causing severe performance degradation. Moreover, the underlying network topology, on which any static assignment is based, may change due to the dynamics of network operation, such as link and device failures — potentially making a static order of assignment infeasible or highly inefficient. Consequently, a static assignment will have to be updated to ensure its applicability and validity with the dynamically changing network topology. These shortcomings of static ordering schemes suggest that it is desirable to have an adaptive mechanism that can assign routers to feasible DEs while (1) balancing the DE workload and (2) minimizing the physical constraint on Decision plane response time, i.e. the propagation delays between routers and DEs. In the following section we describe our design of such an adaptive router assignment mechanism.
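The imbalance produced by static backup ordering is easy to demonstrate; the toy sketch below (our own illustration with hypothetical preference lists, not the actual scheme of [7,13]) reassigns orphans after DE failures and counts the resulting loads:

```python
def reassign(preferences, failed):
    """Static backup reassignment: each router is taken by the
    highest-ranked DE on its preference list that is still functional.
    Returns the resulting load (router count) per surviving DE."""
    loads = {}
    for router, prefs in preferences.items():
        de = next(d for d in prefs if d not in failed)
        loads[de] = loads.get(de, 0) + 1
    return loads

# Six routers, three DEs; e1 happens to be the universal second choice.
prefs = {
    "r1": ["e1", "e2", "e3"], "r2": ["e1", "e3", "e2"],
    "r3": ["e2", "e1", "e3"], "r4": ["e2", "e1", "e3"],
    "r5": ["e3", "e1", "e2"], "r6": ["e3", "e1", "e2"],
}
assert reassign(prefs, failed=set()) == {"e1": 2, "e2": 2, "e3": 2}
# After two DE failures, every router piles onto e1, regardless of capacity:
assert reassign(prefs, failed={"e2", "e3"}) == {"e1": 6}
```

An adaptive mechanism would instead recompute the assignment against the current topology, delays, and DE capacities, which is what the formulation of the next section does.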

5. Adaptive assignment of data plane devices

Let $R = \{r_1, r_2, \ldots, r_m\}$ be the collection of routers in an AS, assumed to be homogeneous in terms of their demands on Decision plane resources, and $E = \{e_1, e_2, \ldots, e_n\}$ be the set of $n$ functional DEs in the network. For any $r_i$, $N(r_i)$ denotes the physical open neighborhood of $r_i$, i.e. the set of routers physically adjacent to $r_i$. We define $A(e_j)$ to be the set of routers assigned to $e_j$, and $A$ as the adjacency matrix of router assignments for all DEs in $E$, which is the output of the assignment problem. Let $x(r_i, e_j)$ be a binary indicator variable with $x(r_i, e_j) = 1$ if and only if $e_j$ governs $r_i$. Let $d(r_i, e_j)$ be the minimum delay between router $r_i$ and a DE $e_j$, and $D = [d(r_i, e_j)]_{m \times n}$ be the matrix of all such delays. Let $L_j = \sum_{r_i \in R} x(r_i, e_j)$ be the load on DE $e_j$ and $Q_j$ be the capacity, i.e. the maximum number of routers, that $e_j$ is able to govern.

We assume that information about the network topology, specifically router adjacencies and delay, would be available to the Decision plane as part of the service offered by the Discovery and Dissemination planes of the 4D architecture. Use of source routes [7,13] is one method by which such information can be collected, and Section 6.2 discusses the protocol primitives that can be used for inter-layer communication. However, the design specifics of the Discovery and Dissemination planes are beyond the scope of this work.


5.1. ILP formulation

From the discussion of the previous section, the objective of the assignment problem is to assign routers in $R$ to DEs in $E$ in such a way that the aggregate delay between routers and their assigned DEs is minimized, while ensuring that the DE workload is balanced. Formally, we define our objective function as

$$\sum_{e_j \in E} \sum_{r_i \in R} d(r_i, e_j)\, x(r_i, e_j)$$

We introduce a constraint to balance the DE work-loads using the average load, $L_{avg}$, and a load balancing parameter $\Delta \ge 1$.

$$L_{avg} = \frac{m}{\sum_{e_j} Q_j}, \qquad 0 < L_{avg} \le 1$$

The optimization problem can be formulated as the following ILP:

$$\text{Minimize} \quad \sum_{e_j \in E} \sum_{r_i \in R} d(r_i, e_j)\, x(r_i, e_j) \tag{1}$$

subject to

$$\sum_{e_j \in E} x(r_i, e_j) = 1 \quad \forall r_i \in R, \tag{2}$$

$$\sum_{r_i \in R} x(r_i, e_j) - Q_j \le 0 \quad \forall e_j \in E, \tag{3}$$

$$L_j \le \lceil \Delta\, L_{avg}\, Q_j \rceil \quad \forall e_j \in E, \tag{4}$$

$$\sum_{r_k \in N(r_i)} x(r_k, e_j) \ge x(r_i, e_j) \quad |A(e_j)| \ge 1,\ \forall r_i \in R, \tag{5}$$

$$x(r_i, e_j) \in \{0, 1\} \quad \forall r_i \in R,\ e_j \in E. \tag{6}$$

The objective function minimizes the aggregate delay between routers and their assigned DEs. Constraint (2) ensures that each router in $R$ is assigned, constraint (3) ensures that the DE workload capacities are not violated, and constraint (5) imposes the contiguity requirement.

The load balancing constraint (4) is weighted by a parameter, $\Delta$, which controls the maximum deviation of a DE's normalized workload from the average normalized workload for all DEs. Setting $\Delta = 1$ forces the workloads of all DEs to be exactly equal to the average normalized workload; in other words, each DE will have the same fractional utilization of its capacity as all others. In the case of homogeneous DE capacities this translates to an equal workload for all DEs. On the other hand, $\Delta > 1$ allows the normalized workload of at least one DE to exceed the average by $(\Delta - 1) \times 100$ percent.

The value of $\Delta$ also dictates the trade-off between the objectives of minimum aggregate delay and load balancing, as it changes the feasible set of solutions. A large value of $\Delta$ optimizes a solution for the objective of minimizing aggregate delay, while a tighter constraint favors load balancing. The addition of a hard constraint for load balancing comes at the cost of reduced feasibility, where optimal solutions could be infeasible because of a choice of $\Delta$ which is too low. This situation is likely to arise in tightly constrained problems, especially in the event of reduced capacity as a result of DE failures. However, the dependence of constraint (4) on the average normalized workload ensures that the formulation dynamically adapts to failures, as a DE failure lowers the total available capacity, thereby increasing the right hand side of the constraint. This results in higher workload shares for the remaining functional DEs to accommodate the orphaned routers. If the total capacity of the remaining DEs is less than the workload offered by the Data plane, no feasible solution will exist for the problem.
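The interplay between $\Delta$, $L_{avg}$, and the per-DE cap of constraint (4) can be illustrated numerically; the figures below are hypothetical and only meant to show how a DE failure raises the right hand side for the survivors:

```python
import math

def de_cap(m, Q, delta):
    # Right-hand side of constraint (4): ceil(delta * L_avg * Q_j) per DE,
    # where L_avg = m / sum(Q) is the average normalized workload.
    l_avg = m / sum(Q)
    return [math.ceil(delta * l_avg * q) for q in Q]

m = 12  # routers (hypothetical)
print(de_cap(m, [6, 6, 6], 1.0))  # all three DEs functional -> [4, 4, 4]
print(de_cap(m, [6, 6], 1.0))     # one DE fails: survivors' caps rise -> [6, 6]
```

With all three DEs up, the strict $\Delta = 1$ setting caps each DE at 4 routers; losing one DE raises $L_{avg}$, so the same constraint now admits 6 routers per surviving DE.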

Our approach is different from the traditional load balancing method of minimizing the maximum load, and provides better control to a network operator while ensuring robust and efficient operation of the Decision plane. The sub-problem with only the minimum delay objective and constraints (2), (3) and (6) is commonly referred to as the Terminal Assignment Problem [19].

5.2. Two-phase router assignment algorithm

We construct a two-phase exact algorithm to solve the optimization problem. The first phase of the algorithm constructs an ordering of routers, $S$, where $S$ is the sorted order of minimum delay assignments for each router, and greedily assigns routers in the order of $S$ to their closest (min-delay) feasible DEs, if such assignments are possible. To meet the contiguity constraint (5), a router $r_i$'s assignment is made to the closest DE $e_j$ if $d(r_i, e_j)$ is strictly less than the delay between $r_i$ and any other DE and $e_j$ has slack capacity. On the other hand, if there are other DEs at the same delay from $r_i$ as $e_j$, $r_i$ is assigned to a feasible DE that has an existing assignment in $N(r_i)$. Otherwise, $r_i$ is kept unassigned.

The goal of the first phase of the algorithm is to make all feasible lowest-cost assignments that can be made without changing any previously made assignments. This phase constructs an optimal solution for the assigned routers. Any routers that remain unassigned after the first phase are assigned by the second phase using a branch exchange algorithm that iteratively accommodates previously unassigned routers, while maintaining feasibility of the solution. Our solution is $O(mn^2)$ in the worst case, and finds an optimal solution to the assignment problem if it exists.

5.2.1. Greedy phase

We utilize a greedy heuristic to assign routers to DEs while maintaining the feasibility of the solution. Since, by definition, a greedy approach does not make any changes to its local decisions, the order in which decisions are taken becomes important. Our approach considers routers in the order of lowest assignment costs for each router. Assignments are made only with a feasible min-delay DE, where feasibility is determined by the constraints given in Section 5.1. Fig. 4 describes the definitions and operation of this phase.
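A minimal sketch of the greedy phase, under simplifying assumptions (homogeneous routers, capacity feasibility only; the load-balancing check of constraint (4) and the full bookkeeping of Fig. 4 are omitted), might look as follows:

```python
def greedy_phase(d, Q, neighbors):
    # Simplified greedy phase: consider routers in increasing order of their
    # minimum delay to any DE; assign a router to its strictly closest DE with
    # slack capacity, or, on a delay tie, to a tied DE already governing one of
    # its neighbors (contiguity). Routers left as None await the exchange phase.
    m, n = len(d), len(Q)
    assign = [None] * m
    load = [0] * n
    for i in sorted(range(m), key=lambda i: min(d[i])):
        best = min(d[i])
        cands = [j for j in range(n) if d[i][j] == best and load[j] < Q[j]]
        if len(cands) == 1:
            assign[i] = cands[0]  # unique feasible min-delay DE
        else:
            for j in cands:  # tie: prefer a DE with an assigned neighbor
                if any(assign[k] == j for k in neighbors[i]):
                    assign[i] = j
                    break
        if assign[i] is not None:
            load[assign[i]] += 1
    return assign

# Toy instance (hypothetical): 4 routers in a chain, 2 DEs of capacity 2.
d = [[1, 5], [2, 4], [3, 3], [6, 2]]       # d[i][j]: delay router i -> DE j
neighbors = [{1}, {0, 2}, {1, 3}, {2}]     # physical adjacencies
print(greedy_phase(d, [2, 2], neighbors))  # -> [0, 0, 1, 1]
```

In this toy run, router 2 is equidistant from both DEs, but DE 0 is already full, so the single remaining feasible min-delay candidate (DE 1) takes it.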

Lemma 1. Let $x(r_{s_i}, e^{s_i}_k)$ be an assignment made in the greedy phase. By construction, $d(r_{s_i}, e^{s_i}_k) \le d(r_{s_i}, e^{s_i}_j)\ \forall e^{s_i}_j \in E^{s_i}$, i.e. $e^{s_i}_k$ must be the minimum cost feasible assignment for $r_{s_i}$.

The algorithm explicitly checks a potential assignment against the capacity (3) and load balancing (4) constraints, while implicitly meeting the contiguity constraint (5) according to the following Lemma:


Fig. 4. Greedy phase algorithm.


Lemma 2 (Greedy phase meets constraint (5)). Since router assignments are done strictly in the order of min-delay, it suffices to show that routers assigned in this order will meet the contiguity constraint. We prove this Lemma by induction on the assignment of a router $r_{s_i}$: if $A(e^{s_i}_1) = \emptyset$, the Lemma trivially holds, as $r_{s_i}$ must be directly connected with $e^{s_i}_1$ by Lemma 1. For the case of $A(e^{s_i}_1) \neq \emptyset$, we assume that the Lemma holds for $i-1$ assignments and that $r_{s_i}$ is the $i$th assignment that violates the Lemma, implying $r_a \notin A(e^{s_i}_k)\ \forall r_a \in N(r_{s_i})$. Conditioning on $r_a$, we observe that there must be a path from $r_{s_i}$ to $e^{s_i}_k$ which passes through $r_a$. Hence, $d(r_{s_i}, e^{s_i}_k) = d(r_{s_i}, r_a) + d(r_a, e^{s_i}_k)$, which implies $d(r_a, e^{s_i}_k) < d(r_{s_i}, e^{s_i}_k)$. Therefore, $r_a$ must have been picked by the algorithm before $r_{s_i}$, and since $e^{s_i}_k$ is a feasible choice for $r_{s_i}$, it must have been a feasible choice for $r_a$. This implies $r_a$ is assigned to an arbitrary DE $e^a_1$ where $e^a_1 \neq e^{s_i}_k$ and $d(r_a, e^a_1) < d(r_a, e^{s_i}_k)$. By substitution, it can be seen that this results in $d(r_{s_i}, e^a_1) < d(r_{s_i}, e^{s_i}_k)$, thus violating Lemma 1. Therefore, the $i$th assignment must be valid.

5.2.2. Exchange phase

The greedy phase makes all the feasible min-cost router assignments that can be made without changing any existing assignment. Consequently, assignment of an unassigned router after the greedy phase's completion may involve a trade-off between a sub-optimal assignment to available DEs or a reassignment/exchange of already assigned routers to allow a lower cost assignment. Therefore, in order to ensure optimality of the solution, the assignment mechanism must be able to find the lowest-cost set of exchanges that allows the assignment of an unassigned router. This mechanism is provided by the exchange phase, which utilizes a branch-exchange algorithm, similar in design to the method described in [19], to construct an auxiliary graph of the network and uses a shortest path algorithm for computing lowest-cost assignments.

In simple terms, the auxiliary graph represents the feasible combinations of router assignment exchanges between DEs, weighted by the cost of such exchanges. The min-cost path through the graph represents the min-delay assignment for a previously unassigned router. Therefore, edges of the graph represent possible feasible exchanges (and new assignments) between DEs which, themselves, are represented by the graph's vertices. Similar to the greedy phase, feasibility of any exchange or new assignment depends on conformance to the constraints presented in Section 5.1. The auxiliary graph is constructed according to the following rules:

- There are two special vertices $S$ and $F$ that represent the source and destination vertices for the shortest path computation. The shortest path from $S$ to $F$, at each iteration of the exchange phase, provides the lowest cost assignment of one unassigned router.
- There are additional vertices, $Y = Y_1, Y_2, \ldots, Y_k$, each corresponding to a DE without any slack capacity.
- There is an edge $(S, Y_k)$ corresponding to the potential assignment of an unassigned router $r_i$ to $Y_k$, where $\exists r_a \in A(Y_k),\ r_a \in N(r_i)$, with an edge weight $d(r_i, Y_k)$.
- There is an edge $(Y_k, Y_l)$ corresponding to a router $r_i$ at the border of $Y_k$ and $Y_l$'s logical areas, such that $x(r_i, Y_k) = 1$, $\exists r_a \in A(Y_l),\ r_a \in N(r_i)$, and the weight $d(r_i, Y_l) - d(r_i, Y_k)$ is positive.
- There is an edge $(Y_k, F)$ corresponding to a router $r_i$'s feasible re-assignment from $Y_k$ to a DE $e_j$ with slack capacity. The weight of this edge is $d(r_i, e_j) - d(r_i, Y_k)$. If there are no DEs with slack capacity, an auxiliary graph cannot be created and the algorithm will terminate.
- There is an edge $(S, F)$ with weight $d(r_i, e_j)$, corresponding to the direct assignment of an unassigned router $r_i$ to a feasible DE $e_j$ with slack capacity.

Lemma 3 (The auxiliary graph has no negative cycles). There cannot be any negative cycles involving the $S$ and $F$ vertices, and so it only remains to be shown that the vertices in $Y$ do not have any negative cycles between them. We observe that only edges with positive weights are allowed between vertices in $Y$, and since a negative cycle implies edges with negative weights, the Lemma is proven by construction.

Dijkstra's shortest path algorithm is used to compute the shortest path on the directed auxiliary graph from vertex $S$, which represents an unassigned router, to $F$, which represents DEs with slack capacity. Lemma 3 establishes that Dijkstra's algorithm, which can only be used in graphs with no negative cycles, is applicable to the auxiliary graph. This shortest path represents the minimum cost set of exchanges that are needed to assign a previously unassigned router. The auxiliary graph is updated after the assignment and the process repeated until all routers have been assigned, or no DE remains with slack capacity.
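The shortest path computation at the heart of the exchange phase can be sketched with a textbook Dijkstra implementation; the auxiliary graph below is a hypothetical three-vertex instance echoing the cost structure of Fig. 5 (an exchange through a full DE costing 4 versus a direct assignment costing 5), and we assume non-negative edge weights for simplicity:

```python
import heapq

def dijkstra(graph, src, dst):
    # Textbook Dijkstra on a directed graph {u: [(v, weight), ...]};
    # assumes non-negative edge weights.
    dist, prev = {src: 0}, {}
    pq = [(0, src)]
    while pq:
        du, u = heapq.heappop(pq)
        if u == dst:
            break
        if du > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            if du + w < dist.get(v, float("inf")):
                dist[v] = du + w
                prev[v] = u
                heapq.heappush(pq, (du + w, v))
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return dist[dst], path[::-1]

# Hypothetical auxiliary graph: S -> e1 -> F (an exchange through a full DE,
# total cost 1 + 3 = 4) beats the direct S -> F assignment (cost 5).
aux = {"S": [("e1", 1), ("F", 5)], "e1": [("F", 3)]}
print(dijkstra(aux, "S", "F"))  # -> (4, ['S', 'e1', 'F'])
```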



Fig. 5 shows the operation of the exchange phase for a simple network example. The network topology and assignments after the completion of the greedy phase are shown in Fig. 5(a), where $r_1$ is unassigned as it cannot be assigned to its closest DE, $e_1$, without violating the strict ($\Delta = 1$) constraint on load balancing. Fig. 5(b) shows the construction of the auxiliary graph using the auxiliary graph construction rules on the network topology. In this figure, the shortest path from the source node, $S$, to the destination node, $F$, represents the min-cost assignment for the unassigned router, $r_1$. The min-cost assignment, $((S, e_1), (e_1, F))$, assigns $r_1$ to $e_1$ while reassigning $r_3$ from $e_1$ to $e_2$, at a total cost of 4. The other possible path from $S$ to $F$ represents the direct assignment of $r_1$ to $e_2$, at a higher total cost of 5. Since only $r_1$ was unassigned in Fig. 5(a), the algorithm terminates after its assignment.

5.2.3. Analysis

The exchange phase of the algorithm tries to iteratively assign all of the routers that were unassigned at the end of the greedy phase. In networks where $\sum_{e_j} Q_j \ge m$, i.e. when enough capacity exists at the Decision plane to handle all the routers in the network, there will always be at least one DE with slack capacity to allow the formation of the auxiliary graph. More specifically, vertex $F$, which serves as the destination vertex for shortest path computation on the auxiliary graph, can only be reachable from the source vertex, $S$, if a DE with slack capacity exists in the network. In cases where the construction of the auxiliary graph fails, i.e. when $\sum_{e_j} Q_j < m$, the algorithm will terminate with the set of already computed router assignments as output. This set will contain the maximum number of router assignments that can be made without violating the constraints on DE capacities. Otherwise, the algorithm will terminate after assigning all $m$ routers in the network.

The greedy phase of the algorithm is $O(m)$. The exchange phase's complexity is dependent on the shortest path computation, with a worst case complexity of $O(n^2)$. The exchange phase calls Dijkstra's algorithm for each unassigned router, resulting in an overall worst case complexity of $O(mn^2)$. In practice, the greedy phase assigns most of the routers, and the few unassigned routers in tightly-constrained DE failure scenarios each require one iteration of the exchange phase. This results in an average-case complexity of $\Theta(m + kn^2)$, where $k \ll m$. Also, since the number of routers in a network is expected to be much higher than the number of DEs, i.e. $m \gg n$, the complexity of the scheme is dominated by the complexity of the greedy phase, resulting in very fast run-times, e.g. less than 3.5 s on average in a network with $(m, n, \Delta) = (1500, 10, 1.0)$ as described in Section 7.

Fig. 5. Operation of the exchange phase on a network example where $\Delta = 1$ and edges are annotated with delay values. The min-cost assignment is along $(S, e_1), (e_1, F)$.

6. DPP protocol for decision plane operation

In this section we discuss the design of an experimental protocol for the operation of the logically centralized Decision plane using the router assignment algorithm. A discussion of the main functional requirements of the DPP protocol is presented, followed by a description of the protocol structure and states; finally, we discuss how the protocol interacts with other layers of the 4D architecture.

6.1. Functional requirements

The protocol operating at the Decision layer is responsible for the management of DEs in providing a uniform network-wide Decision plane. To effectively meet the design goals specified in Section 1, the design needs to conform to the following basic functional requirements:

- Robustness to multiple failures in the Decision and Data planes must be ensured. This implies a design that incorporates redundant control logic and storage of network state.
- Any pre-configuration of protocol parameters should be minimized, and the protocol must be able to operate without constant human intervention.
- The protocol must be easily extensible and evolvable to include additional functionalities.
- To improve scalability of the Decision plane, the protocol must distinguish between events which have network-wide significance vs. events whose impact is limited within a local DE's control boundaries. For example, failure of a redundant link totally contained within a logical area may not have AS-wide significance, while failure of a backbone link connecting two different logical areas might require re-computation of routing matrices at multiple DEs to redirect traffic away from the affected link.
- The protocol must be able to deal with synchronization issues expected in the control of a large geographically-dispersed AS.

These requirements are not meant to be exhaustive butto serve as a guideline for the protocol design.

6.2. Protocol design

The functional requirements of the previous section provide a basis for the design of the DPP protocol, where we incorporate the following salient design features:


Table 1. APIs used for inter-layer communication.

Construct     | Function
get_topo()    | Request network topology discovery from the Discovery plane
send_RT()     | Send a new RT to the specified router using the Dissemination plane
push_event()  | Notify other DEs of a local event through the Dissemination plane


6.2.1. Leader election

The router assignment algorithm is computed only by the DE which has been chosen to act as leader. We utilize a simple leader election protocol based on unique pre-configured DE identifiers. The leader election protocol is used at network bootstrap, after the setup of control paths between DEs, and upon the leader's failure. This mechanism fulfills the design requirements in several ways. Firstly, it does not require any pre-configuration on the part of the network operator beyond the DE identifiers. Secondly, it avoids the potential assignment conflicts that could arise due to asynchronous computation of assignments by DEs. Finally, it allows a robust design, as failure of any particular machine does not jeopardize the network operation.

6.2.2. Network state and logic

The network state, consisting of the topology information of the Data plane and routes advertised by DEs, is replicated across the Decision plane. The route advertisements, in the form of DE-DE messages, provide reachability information about a DE's logical area. Frequent collection of topology information from the lower layers of the architecture is avoided as it is a costly process in terms of overhead and delay. This is because the abstraction of logical area boundaries does not extend to any lower layers, and a request from the Decision plane for collecting topology information encompasses the entire network topology. Therefore, we limit topology discovery to the cases of network bootstrap and new DE addition only. In other cases, e.g. when a DE is restarted after a failure, topology discovery is not required as it had been done previously, and the persistent network state can be acquired from the current leader along with router assignments.

We categorize failure events at the Data plane into (1) local-area events, which do not require a re-computation of router assignments, and (2) non-local-area events, which require assignment re-computation for their resolution. Since assignment re-computation is a relatively costly network event, this distinction reduces the protocol convergence time by eliminating the need to re-compute as a result of each failure event. Local-area events result in a routing table update within the logical area where they occur, and updates from the logical area DE to other DEs in the AS. This update indicates the change in network topology and any prefixes that are unreachable as a result of router failure. On the contrary, non-local-area events require a router assignment re-computation, and routing table updates in more than one logical area. We describe the types of Data plane failure events and their relation to the above-mentioned categories as follows.

- Non-partitioning failure: a router or link failure that does not result in the partitioning of the logical area where it occurred. This type of event is a local-area event. A redundant backbone router failure is an example of this failure type.
- Disconnection failure: this type of failure results in a set of routers being unreachable from any DE in the network. The set of unreachable routers may not have failed, but the loss of paths to all DEs renders them disconnected from the network. This is another case of a local-area event, as re-computation of router assignments cannot restore the connectivity to the disconnected set. A link failure in a router chain leading to an ISP's customer is an example of this failure type.
- Partitioning failure: this type of failure partitions a logical area into two or more partitions, resulting in the loss of control paths to the logical area DE for at least one partition, which otherwise remains reachable from another functional and feasible DE. A router assignment re-computation restores the network control over the partition by assigning the partitioned routers to a feasible DE. As a result, this type of failure is a non-local-area event.

The avoidance of router assignment re-computation in the case of local-area events reduces the protocol convergence time for a significant fraction of failure events. Practical network topologies often incorporate redundancy of paths, making partitioning failures less likely. Indeed, our analysis in Section 8 on real-world topologies found the majority of single router failures to be local-area events.
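The event taxonomy above can be sketched as a reachability check over the surviving topology; the function and the toy line topology below are our own illustration (names hypothetical), not the paper's implementation:

```python
from collections import deque

def classify_failure(adj, alive, de_nodes, area_de):
    # Hedged sketch of the Section 6.2.2 taxonomy. adj: router adjacencies;
    # alive: surviving routers; de_nodes: DE -> attachment router;
    # area_de: router -> DE of its logical area. A surviving router is
    # 'disconnected' if no DE can reach it (local-area event), 'partitioned'
    # if it lost only its own area's DE (non-local-area event), else 'ok'.
    def reach(start):
        if start not in alive:
            return set()
        seen, q = {start}, deque([start])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    q.append(v)
        return seen

    reach_of = {de: reach(node) for de, node in de_nodes.items()}
    any_de = set().union(*reach_of.values())
    labels = {}
    for r in alive:
        if r not in any_de:
            labels[r] = "disconnected"
        elif r not in reach_of[area_de[r]]:
            labels[r] = "partitioned"
        else:
            labels[r] = "ok"
    return labels

# Toy chain a-b-c-d: DE0 attached at a governs {a, b, c}; DE1 at d governs {d}.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
alive = {"a", "c", "d"}  # router b has failed
print(classify_failure(adj, alive, {"DE0": "a", "DE1": "d"},
                       {"a": "DE0", "b": "DE0", "c": "DE0", "d": "DE1"}))
# c lost the path to its own DE but stays reachable via DE1: a partitioning
# failure, so a router assignment re-computation is needed
```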

6.2.3. Interaction with other 4D layers

DPP is designed to require only a small set of APIs from the underlying layers of the 4D architecture, as listed in Table 1. This mechanism is selected with the aim of improving extensibility of the architecture, allowing this basic set of APIs to be re-used in any additional control features beyond shortest-paths routing. The implementation of these APIs in the lower architectural layers is not explored in this work.

6.3. Protocol states

A DE is transitioned through several states from initialization to full operation, and undergoes further state changes in response to network events. Fig. 6(a) illustrates the state machine of the DPP protocol, where we utilize the following states to describe its operation:

Init, or the initialization state, follows immediately after boot-up. Secure channels for the exchange of control messages are immediately established with each of a DE's neighbors in the fully-connected Decision plane. If there are no previously initialized neighbors, all DEs are transitioned through the leader election protocol. Otherwise, the current leader checks a newly booted DE's identifier to find out if it was previously initialized.

Elect state is used when there is no leader DE in the network, which will be the case at network bootstrap, or in case of the leader's failure. Each DE in the network is pre-configured with a unique integer identifier. The DEs exchange their identifiers to elect the one associated with the lowest identifier as leader.

Fig. 6. State transition diagram for the Decision Plane Protocol.
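The lowest-identifier election rule can be sketched in a few lines (a simplified illustration; the actual DPP exchange of identifiers over secure channels is not modeled here):

```python
def elect_leader(identifiers, functional):
    # Among the functional DEs, the one with the lowest
    # pre-configured identifier becomes leader.
    alive = [i for i in identifiers if i in functional]
    return min(alive) if alive else None

ids = {3, 7, 12, 25}
print(elect_leader(ids, functional=ids))          # bootstrap: DE 3 leads
print(elect_leader(ids, functional={7, 12, 25}))  # DE 3 fails: DE 7 takes over
```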

Topology Discovery. In this state, network topology information is requested from the Discovery layer using the get_topo() construct. The topology is in the form of a weighted graph where vertices indicate routers and edges specify physical adjacencies, which are weighted by the propagation delay of the links. The topology information is exchanged between DEs to ensure full replication of network state across the Decision plane.

Router Assignment. The leader DE transitions into this state in the event of a DE failure, a failure of inter-area links, or the addition of a new router.

Routing Table Computation is done by each DE for the routers in its logical area whenever it receives a new assignment from the leader DE, in case of intra-area events, and when it receives new reachability information from another DE. The completion of routing table computation is immediately followed by an update of each router's routing table using the send_RT() construct to the Dissemination plane, and an update of reachability information to other DEs if the new computation results in changes to the routes available to their logical areas.

Topology Update is a result of an event in a DE's logical area. It requires sending a topology update to other DEs in the Decision plane in order to synchronize the network state. A push_event() construct allows the Dissemination plane to signal such events to the Decision plane.

Full. A DE in this state indicates a fully initialized Decision plane. This state is maintained in normal operation.
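The states above can be summarized as a transition table; the event names below are our own shorthand for the transitions sketched in Fig. 6, not labels from the paper:

```python
# Hedged sketch of the DPP state machine; Fig. 6 is the authoritative source.
TRANSITIONS = {
    ("Init", "no_leader"): "Elect",
    ("Init", "leader_known"): "TopologyDiscovery",
    ("Elect", "leader_chosen"): "TopologyDiscovery",
    ("TopologyDiscovery", "topo_received"): "RouterAssignment",
    ("RouterAssignment", "assignment_done"): "RTComputation",
    ("RTComputation", "tables_pushed"): "Full",
    ("Full", "local_area_event"): "TopologyUpdate",
    ("Full", "de_failure"): "RouterAssignment",
    ("TopologyUpdate", "state_synced"): "Full",
}

def step(state, event):
    return TRANSITIONS.get((state, event), state)  # unknown events: no change

# Walk a DE from boot-up to full operation.
state = "Init"
for event in ["no_leader", "leader_chosen", "topo_received",
              "assignment_done", "tables_pushed"]:
    state = step(state, event)
print(state)  # -> Full
```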

7. Numerical evaluation

In this section we provide the results of our evaluation of the assignment algorithm on real-world topologies and a variety of artificially generated topologies.

The first set consists of the ISP backbone topologies collected by the Rocketfuel project [20]. The second set are artificial two-tiered hierarchical topologies generated by BRITE [21] using the GLP model [22]. The GLP model along with BRITE has been reported to generate ISP-like topologies [23], which we use to model a large-sized ISP topology consisting of 1500 routers and 15 DEs. Our evaluation was focused on the determination of the following characteristics:

(1) Reassignment of non-orphaned routers: the accommodation of routers orphaned as a result of a DE failure may necessitate re-assignment of non-orphaned routers from other DEs to balance the load among the surviving DEs. A large percentage of such reassignments could have an adverse effect on Decision plane performance, and it is desirable to reduce such router churn. We measure this as the percentage of non-orphaned routers undergoing re-assignment out of the total number of routers in the network.

(2) Computation time: each failure in the Decision plane triggers the re-computation of the router assignments. We measured the time taken for each run of the assignment algorithm on a 64-bit 3.6 GHz machine.

In each topology, we determine the best positioning of a set of DEs based on the discussion in [8]. Results were obtained by removing all combinations of "failed" DEs from the original set. The maximum number of DEs ($n_{max}$) was limited to 15 in the BRITE set and 10 in the Rocketfuel set. The minimum number of DEs, $n_{min}$, was constant at 5 in both sets, which was found to be sufficient in attaining near-optimal convergence delays [8]. The capacities of individual DEs were assumed to be a non-limiting factor and, in the case of the BRITE set, our experiments were repeated for different degree distributions ($d$) of logical areas.

Fig. 7(a) shows non-orphaned router reassignment for the case of Rocketfuel backbone topologies, where we present results by bounding the maximum percentage of router reassignments in a network and presenting the minimum value of $\Delta$ that is needed to ensure that the reassignment rate remains below the bound. We observe that even in this very limiting case of backbone topologies, the rate of reassignment falls off rapidly with an increase in $\Delta$, and relatively small values of $\Delta$ are sufficient in achieving tight bounds on router reassignment. In the case of BRITE topologies, we observe even better performance, as full topological information is available. Fig. 7(b) shows results for the case of BRITE topologies, where we report the observed minimum values of $\Delta$ required in bounding maximum reassignments to 5% for different logical area degrees.

Fig. 7. Trade-off between load balancing and percentage of non-orphaned router re-assignment. Plots show the minimum value of $\Delta$ needed to limit the re-assignments below a given percentage. Top: (a) Rocketfuel backbone topologies; Bottom: (b) BRITE topologies of m = 1500 with max. 5% re-assignments.

The computation time required to run each iteration of the algorithm is plotted in Fig. 8 for both sets of topologies, with a worst-case DE capacity constraint of $\Delta = 1.0$. The plot shows that even in the case of very large network topologies and worst-case constraints on load balancing, the router assignment algorithm converges to a solution within very reasonable times.

Fig. 8. Box plot of the computation time for router re-assignment with $\Delta = 1$. The box shows the first and third quartile along with the median. Whiskers show the min. and max. values, while the outliers are plotted as "+". Top: (a) Rocketfuel backbone topologies; Bottom: (b) BRITE topologies with m = 1500.

Table 2. Bootstrap convergence delays for Rocketfuel topologies.

Topology (routers:links) | Max. network delay (ms) | Bootstrap delay (ms)
104:151                  | 28                      | 95.13
87:161                   | 35                      | 126.35
161:328                  | 47                      | 175.12
79:147                   | 72                      | 235.3
317:972                  | 86                      | 306.4
138:372                  | 97                      | 383.2

8. Simulation results

We analyzed the convergence performance of the DPP protocol with simulations on the Rocketfuel topologies used in the previous section, using the ns-2 simulator (http://isi.edu/nsnam/ns/), where we created new modules to implement the functionality of the Decision plane. We collected results on the convergence delay in cases of network bootstrap, DE failures, and router failures.

The convergence delays, in the cases of DE failures, are computed by randomly forcing the failure of a DE and measuring the time until all routers in the network receive re-computed routing tables. This convergence delay includes: (1) the delay at the Decision plane between the time a failure actually occurs and when it is detected by the functional DEs, (2) the computation time of the router assignment algorithm, (3) the reception of new assignments by the DEs, (4) new routing table computation, and (5) the reception of new routing tables at each router. The Decision plane failures are detected by a DE keep-alive timer which expires when no keep-alive message is received by a neighboring DE within a time period equal to the maximum delay between DEs. We utilized the results obtained in the previous section for routing assignment computation time, while RT computation time was kept constant at 1 ms. Simulations were repeated for the range of DE failure combinations with $n_{max} = 10$, $n_{min} = 3$. Table 2 shows convergence and maximum network delays in the case of network bootstrap. A box plot of the convergence delays in cases of DE failures is shown in Fig. 9.

Convergence in the case of router failures is simulated by randomly stopping a router instance in ns-2 and measuring the time for protocol convergence. After the detection of a failure at the logical area DE, a hold-down timer of 30 ms is used to detect correlated failures. Upon the expiry of the hold-down timer, the DE checks the type of the event and requests a re-assignment from the leader in the case of non-local-area events.



Fig. 9. Box plot of protocol convergence delay after DE failures for Rocketfuel topologies with $n_{max} = 10$ and $\Delta = 1$. The box shows the first and third quartile along with the median. Whiskers show the min. and max. values, while the outliers are plotted as "+".

Fig. 10. Box plot of protocol convergence delay after router failures for Rocketfuel topologies with $n_{max} = 10$ and $\Delta = 1$. The box shows the first and third quartile along with the median. Whiskers show the min. and max. values, while the outliers are plotted as "+".


Shortest path route computation at the DEs was handled by the Floyd-Warshall algorithm. The router assignments in the simulated topologies were computed using $\Delta = 1$ and $n_{max} = 10$.

Fig. 10 shows the box plot of protocol convergence delays after router failures for 100 random single router failures in each Rocketfuel topology. In our simulations, we found that no more than 8% of the router failures were non-local-area events. Since only non-local-area events require assignment re-computation, the vast majority of router failure events did not require router assignment re-computation. In the failure cases where assignment re-computation was required, the convergence delays were similar to the DE failure results of Fig. 9. The results show that the DPP protocol achieves sub-second convergence delays even in the largest of the simulated topologies for both router and DE failure cases. This convergence delay performance compares favorably with reported studies of optimized IGP convergence in conventional distributed routing protocols [24].

9. Conclusion

We presented the design of a clean-slate control andmanagement plane to simplify the management complex-

ity in large enterprise and ISP networks. Within the archi-tectural framework of 4D architecture, the proposedDecision plane is designed to meet its stated goals ofachieving high scalability and robustness, while minimiz-ing the response time to any event.

Our work included a novel method of adaptive assign-ment of routers to the logically centralized DecisionElements (DE) and a protocol that allows distributed oper-ation of the DEs while ensuring replication and synchroni-zation of network state across the entire Decision plane.The evaluation of our protocol and algorithm throughdifferent mechanisms and models supports the adherenceof our design to its goals, and the feasibility and benefit ofusing a logically centralized Decision plane.

The main avenues of related future research include the interoperable design of the lower 4D layers, the extension of management functions, e.g. to include provision for system maintenance, and the detailed specification and analysis of DPP.

Acknowledgments

We are grateful to the anonymous reviewers for their helpful suggestions for improving the quality of this paper, and to the NSF for providing support (Award Numbers 0519728 and 0426886).

References

[1] N. Feamster, H. Balakrishnan, J. Rexford, A. Shaikh, J. van der Merwe, The case for separating routing from routers, in: Proceedings of FDNA'04, ACM SIGCOMM Workshop on Future Directions in Network Architecture, ACM Press, 2004.

[2] A. Greenberg, G. Hjalmtysson, D.A. Maltz, A. Myers, J. Rexford, G. Xie, H. Yan, J. Zhan, H. Zhang, A clean slate 4D approach to network control and management, ACM SIGCOMM Comput. Commun. Rev. 35 (2005) 41–54.

[3] D.A. Maltz, G. Xie, J. Zhan, H. Zhang, Routing design in operational networks: a look from the inside, in: Proceedings of ACM SIGCOMM'04, ACM Press, 2004, pp. 27–40.

[4] A. Wool, A quantitative study of firewall configuration errors, IEEE Comput. 37 (2004) 62–67.

[5] B. Fortz, J. Rexford, M. Thorup, Traffic engineering with traditional IP routing protocols, IEEE Commun. Mag. 40 (2002) 118–124.

[6] A. Greenberg, G. Hjalmtysson, D. Maltz, A. Myers, J. Rexford, G. Xie, H. Yan, J. Zhan, H. Zhang, Refactoring network control and management: a case for the 4D architecture, Technical Report CMU-CS-05-117, Carnegie Mellon University, 2005.

[7] H. Yan, D. Maltz, T. Ng, H. Gogineni, H. Zhang, Z. Cai, Tesseract: a 4D network control plane, in: Proceedings of NSDI'07, Symposium on Networked Systems Design and Implementation, USENIX, 2007.

[8] H. Iqbal, T. Znati, Distributed control plane for 4D architecture, in: Proceedings of GLOBECOM'07, IEEE Global Telecommunications Conference, IEEE, 2007, pp. 1901–1905.

[9] D. Lynch, J. Gray, E. Rabinovitch (Eds.), SNA and TCP/IP Enterprise Networking, Prentice Hall, 1997.

[10] L. Tymes, Routing and flow control in TYMNET, IEEE Trans. Commun. 29 (1981) 392–398.

[11] E. Rosen, A. Viswanathan, R. Callon, Multiprotocol label switching architecture, Internet Request for Comments, vol. 3031, 2001.

[12] M. Caesar, D. Caldwell, N. Feamster, J. Rexford, A. Shaikh, J. van der Merwe, Design and implementation of a routing control platform, in: Proceedings of NSDI'05, Symposium on Networked Systems Design and Implementation, USENIX, 2005.

[13] T.V. Lakshman, T. Nandagopal, R. Ramjee, K. Sabnani, T. Woo, The SoftRouter architecture, in: Proceedings of ACM HotNets-III Workshop, ACM, 2004.

[14] L. Yang, R. Dantu, T. Anderson, R. Gopal, RFC 3746: Forwarding and Control Element Separation Framework, 2004.

[15] R. Morris, E. Kohler, J. Jannotti, M.F. Kaashoek, The Click modular router, ACM SIGOPS Oper. Syst. Rev. 33 (1999) 217–231.

[16] H. Ballani, P. Francis, CONMan: a step towards network manageability, ACM SIGCOMM Comput. Commun. Rev. 37 (2007).

[17] Y. Wang, I. Avramopoulos, J. Rexford, Morpheus: making routing programmable, in: Proceedings of INM'07, 2007 SIGCOMM Workshop on Internet Network Management, ACM, New York, NY, USA, 2007, pp. 285–286.

[18] International Telecommunication Union, ITU-T Recommendation Q.700: Introduction to CCITT Signalling System No. 7, 1993.

[19] A. Kershenbaum, Telecommunications Network Design Algorithms, McGraw-Hill, Inc., 1993.

[20] R. Mahajan, N. Spring, D. Wetherall, T. Anderson, Inferring link weights using end-to-end measurements, in: Proceedings of IMW'02, ACM SIGCOMM Internet Measurement Workshop, ACM, 2002, pp. 231–236.

[21] A. Medina, A. Lakhina, I. Matta, J. Byers, BRITE: an approach to universal topology generation, in: MASCOTS'01, International Workshop on Modeling, Analysis, and Simulation of Computer Systems, 2001.

[22] T. Bu, D. Towsley, On distinguishing between Internet power law topology generators, in: Proceedings of INFOCOM'02, IEEE, 2002.

[23] O. Heckmann, M. Piringer, J. Schmitt, R. Steinmetz, On realistic network topologies for simulation, in: Proceedings of MoMeTools'03, ACM SIGCOMM Workshop on Models, Methods and Tools for Reproducible Network Research, ACM, 2003.

[24] P. Francois, C. Filsfils, J. Evans, O. Bonaventure, Achieving sub-second IGP convergence in large IP networks, ACM SIGCOMM Comput. Commun. Rev. 35 (2005) 35–44.

Hammad Iqbal is a Ph.D. candidate at the School of Information Sciences at the University of Pittsburgh. Before joining Pitt, he received an M.S.E. in Telecommunications and Networking from the University of Pennsylvania in 2002. His research interests focus on the design and optimization of communication networks that are secure, evolvable, and can work smartly with minimal human involvement. His dissertation research involves understanding the trade-offs between centralized and distributed network management in large enterprise networks, with the aim of finding the right balance between the complexity of distributed decision-making and the robustness concerns of centralized design. He is the recipient of the 2010 Ofiesh Orner research excellence award from the University of Pittsburgh.

Taieb Znati received a Ph.D. degree in Computer Science from Michigan State University in 1988, and an M.S. degree in Computer Science from Purdue University in 1984. He is a Professor in the Department of Computer Science, with a joint appointment in Telecommunications in the Department of Information Science, and a joint appointment in Computer Engineering at the School of Engineering. He currently serves as the Director of the Computer and Network Systems Division at the National Science Foundation. He also served as a Senior Program Director for networking research at the National Science Foundation. In this capacity, he led the Information Technology Research (ITR) Initiative, a cross-directorate research program, and served as the Chair of the ITR Committee.

His current research interests focus on the design and analysis of evolvable, secure and resilient network architectures and protocols for wired and wireless communication networks. He is a recipient of several research grants from government agencies and from industry. He is frequently invited to present keynotes at networking and distributed systems conferences both in the United States and abroad.

He currently serves as the General Chair of GlobeCom 2010. He also served as the general chair of IEEE INFOCOM 2005; the general chair of SECON 2004, the first IEEE conference on Sensor and Ad Hoc Communications and Networks; the general chair of the Annual Simulation Symposium; and the general chair of the Communication Networks and Distributed Systems Modeling and Simulation Conference. He is a member of the editorial boards of the International Journal of Parallel and Distributed Systems and Networks, the Pervasive and Mobile Computing Journal, the Journal on Wireless Communications and Mobile Computing, Wireless Networks, and the Journal of Mobile Communication, Computation and Information. He was also a member of the editorial board of the Journal on Ad-Hoc Networks, and a member of the editorial board of IEEE Transactions on Parallel and Distributed Systems.