A Fault-Tolerant Network Architecture for Modular Datacenter
International Journal of Software Engineering and Its Applications
Vol. 6, No. 2, April, 2012
restriction on the scalability of MDCN. Intra-container networks can therefore adopt
complex topologies that would be considered unsuitable for traditional DCNs.
In this work, we present SCautz (Shipping Container Kautz network), a novel
hierarchical intra-container network structure. SCautz consists of a base physical
Kautz topology, built by interconnecting servers' NIC ports, and a small number of
redundant COTS (commodity off-the-shelf) switches. The base topology adopts the
server-centric approach: servers take charge of routing traffic and work with the
switches to bypass failed servers, achieving graceful performance degradation.
The basic idea of SCautz is driven by the strict fault-tolerance demands that the
MDC's service-free operating model places on the MDCN, and its design follows the
scale-out principle of datacenter construction. Results from theoretical analysis and
simulations show that SCautz is more viable for MDCN for the following reasons:
First, SCautz's base topology can offer network capacity as high as BCube [5] for
one-to-x (e.g., one-to-one, one-to-all) and all-to-all traffic.
Second, we propose a fault-tolerant routing algorithm called SCRouting+, which
leverages switches and peer servers connected to the same switch to bypass failed
servers. SCautz can thus maintain its throughput for one-to-x traffic, and its
performance for all-to-all traffic degrades smoothly, much more slowly than the
MDC's computation and storage capacity do.
Third, the extra cost of the redundant switches is very low. Theoretical analysis shows
that a typical SCautz-based container with 1280 servers needs only 160 switches.
The rest of the paper is organized as follows. Section 2 discusses related work.
Section 3 presents the SCautz architecture. Section 4 describes routing in SCautz.
Section 5 evaluates SCautz through simulations, and Section 6 concludes the paper.
2. Related Work
As MDCs have gained popularity, modular datacenter networks (MDCN) have attracted
growing interest from cloud providers, hardware vendors and academia. To address the
fatal drawbacks of traditional architectures in supporting cloud data-intensive
computing, many novel datacenter networks (DCN) have been proposed.
VL2 [6] and PortLand [7] organize switches into sophisticated Clos and fat-tree
structures respectively, in which any two servers can communicate with each other at
the maximum rate of their network-interface cards (NICs). Since their routing
intelligence is placed on switches, VL2 and PortLand are switch-centric DCNs, while
DCell [8], BCube and CamCube [9] are server-centric DCNs, because their routing
intelligence is placed on servers. DCell proposes a new recursive structure for high
scalability, BCube leverages low-end COTS switches to implement an intra-container
network based on the Hypercube topology [10], and CamCube designs a direct-connect
3D torus topology, which has been adopted by the Content Addressable Network (CAN)
overlay [11]. Because its servers are always equipped with multiple NICs, a
server-centric DCN is more effective than a switch-centric one in supporting
data-intensive applications and dealing with failures. Moreover, since DCell has a
network performance bottleneck at its lower hierarchy and CamCube mainly studies the
flexibility of its routing API for cloud applications, BCube better offers high
uniform network capacity and achieves graceful performance degradation.
In server-centric MDCN, failures of both servers and switches decrease the overall
performance of a container. For example, BCube's incomplete structure makes its
throughput for one-to-x traffic patterns drop evidently, and its ABT (aggregate
bottleneck throughput) for all-to-all traffic degrades faster than computation and
storage do. Furthermore, switch failures degrade BCube's performance even more
significantly: its ABT shrinks by more than 50% in the presence of 20% switch
failures [5].
SCautz proposes a novel hierarchical network structure, modeled on the undirected
Kautz graph, to avoid the above problems. The Kautz graph achieves a near-optimal
tradeoff between node degree and diameter, and has better bisection width and
bottleneck degree. However, it was considered unsuitable for mega datacenters,
because it is hard to deploy incrementally without violating the original structure.
For MDCN, the number of servers in a container is fixed, and the interior network
will not be changed during its whole lifecycle, so this restriction no longer
exists. Through simulations and comparisons, we show that SCautz is more viable for
MDCN.
3. SCautz Architecture
SCautz comprises two types of components: servers with multiple NICs and COTS
switches. The servers interconnect their NICs to form a physical Kautz topology as
SCautz's base network structure, denoted UK(d,k). The switches use their low-speed
(1 Gbps) ports to connect a specific number of servers, and reserve their high-speed
(10 Gbps) ports for the inter-container network.
3.1. Preliminaries
To define the base undirected Kautz topology of SCautz, we first introduce the
directed Kautz graph. Let A = {0, 1, ..., d} be an alphabet of d+1 letters, and let
the Kautz identifier space be the set of strings of length k over A in which any two
consecutive letters differ.
Figure 1. The Kautz Graph and its Undirected Structure
Definition 1 (Kautz graph [13]). The vertex set V(K(d,k)) and edge set E(K(d,k)) of
the Kautz graph K(d,k) are:
V(K(d,k)) = { x1 x2 ... xk | xi in A, xi != xi+1 for 1 <= i <= k-1 },
E(K(d,k)) = { (x1 x2 ... xk, x2 ... xk y) | y in A, y != xk }.
The Kautz graph K(d,k) is d-regular (in out-degree), its diameter is k, and it has
(d+1)d^(k-1) vertices and (d+1)d^k edges. SCautz's base undirected Kautz structure
UK(d,k) is obtained by omitting the directions of the edges while keeping the
parallel links between vertex pairs of the form (abab...), e.g., (01,10), (21,12)
and (02,20). So UK(d,k) is 2d-regular, unlike the general undirected Kautz graph.
Figure 1 shows K(2,2) and UK(2,2).
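These counts are easy to verify by brute-force enumeration. The following Python
sketch (our own illustration, not part of the paper) generates K(d,k) directly from
Definition 1:

```python
from itertools import product

def kautz_vertices(d, k):
    # Strings of length k over an alphabet of d+1 letters in which
    # consecutive letters differ.
    alphabet = range(d + 1)
    return [v for v in product(alphabet, repeat=k)
            if all(v[i] != v[i + 1] for i in range(k - 1))]

def kautz_edges(d, k):
    # Directed edges (x1...xk) -> (x2...xk, a) for every letter a != xk.
    return [(v, v[1:] + (a,))
            for v in kautz_vertices(d, k)
            for a in range(d + 1) if a != v[-1]]

d, k = 2, 2
print(len(kautz_vertices(d, k)), len(kautz_edges(d, k)))  # 6 12
```

The printed counts match the closed forms (d+1)d^(k-1) = 6 vertices and
(d+1)d^k = 12 edges for K(2,2).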
3.2. SCautz Structure
The complete structure of SCautz with redundant switches is denoted as
SCautz(d,k,t), or SCautz for short, and is defined as follows.
Definition 3. Let SCautz(d,k,t) be the complete SCautz network with base topology
UK(d,k) and a redundant switch structure. Its nodes (N), switches (S_right and
S_left), clusters (C_right and C_left) and links (E), in which the links comprise
the links directly connecting servers (E_server) and the links connecting servers
and switches (E_switch), are as follows:
N = { x1 x2 ... xk | xi in A, xi != xi+1 },
S_right = S_left = { s1 s2 ... st | si in A, si != si+1 },
C_right(s) = { X in N | the rightmost t letters of X equal s }, for s in S_right,
C_left(s) = { X in N | the leftmost t letters of X equal s }, for s in S_left,
E = E_server + E_switch, where E_server = E(UK(d,k)) and E_switch connects each
server to the two switches whose identifiers match its rightmost and leftmost t
letters.
The definition of SCautz(d,k,t)'s nodes is the same as in K(d,k), and t is the
length of a switch's identifier. According to the two rules for grouping servers,
the switches are divided into two categories, S_right and S_left: the servers whose
rightmost (resp. leftmost) substrings of length t are identical to a certain
switch's identifier connect to the corresponding S_right (resp. S_left) switch. So t
determines the number of servers in one cluster and the total number of switches.
The n servers connected to the same switch form a cluster; hence all clusters also
fall into two categories: the clusters built around S_right (resp. S_left) switches
are denoted C_right (resp. C_left), and each server is a member of one C_right and
one C_left simultaneously. For example, in SCautz(2,4,2) the switch S_right = 10
connects with four servers (1010, 2010, 0210 and 1210), forming the cluster
C_right = {10}; the switch S_left = 02 also connects with four servers (0201, 0202,
0210 and 0212), forming the cluster C_left = {02}; and server 0210 is a member of
both clusters {10} and {02}, as shown in Figure 2. In the rest of the paper, we do
not distinguish between a switch S and its cluster C, and refer to both by their
common identifier, e.g., C_right = S_right = 10 and C_left = S_left = 02. The links
in SCautz include those building UK(d,k) and those connecting switches with their
servers. All links in SCautz are undirected, and the physical cables are
full-duplex, so each link can be defined in two equivalent ways.
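The cluster membership rule can be checked mechanically. The sketch below (our own
illustration; function names are not from the paper) enumerates the K(2,4) servers
and reproduces the running example:

```python
from itertools import product

def servers(d, k):
    # All K(d,k) identifiers as strings over the alphabet {0, ..., d}.
    letters = "0123456789"[: d + 1]
    return ["".join(s) for s in product(letters, repeat=k)
            if all(s[i] != s[i + 1] for i in range(k - 1))]

def cluster_right(switch_id, d, k):
    # Servers whose rightmost t letters equal the S_right switch identifier.
    return sorted(s for s in servers(d, k) if s.endswith(switch_id))

def cluster_left(switch_id, d, k):
    # Servers whose leftmost t letters equal the S_left switch identifier.
    return sorted(s for s in servers(d, k) if s.startswith(switch_id))

# The SCautz(2,4,2) example from the text:
print(cluster_right("10", 2, 4))  # ['0210', '1010', '1210', '2010']
print(cluster_left("02", 2, 4))   # ['0201', '0202', '0210', '0212']
```

Server 0210 appears in both clusters, as the text states.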
Figure 2. The Cluster Structures of Two Types in SCautz(2,4,2)
If clusters are treated as virtual nodes and the duplicate links between any pair of
clusters are temporarily ignored, we can easily obtain the following Theorem 1 and
prove it according to Definition 3. SCautz(2,4,2) is shown in Figure 3, including
the two full higher-level logical structures and the corresponding partial physical
structures of servers. Note that the arrows on the links in Figure 3 are only used
to better exhibit SCautz's logical cluster structures.
Figure 3. SCautz(2,4,2)'s Two Full Higher-level Logical Structures and
Partial Physical Structures
THEOREM 1. All the C_right (resp. C_left) clusters form a logical Kautz structure,
denoted K_right(d,t) (resp. K_left(d,t)).
In SCautz, R(X) and L(X) represent the right-neighbors and left-neighbors of a
server X, reached by one L-shift and one R-shift operation respectively. The
right-neighbor and left-neighbor clusters of C_right(X) and C_left(X) are defined in
Definition 4. C_right(X) (resp. C_left(X)) denotes the cluster that server X belongs
to via its S_right (resp. S_left) switch, while P_right(X) (resp. P_left(X)) denotes
the peer servers in the same cluster C_right(X) (resp. C_left(X)) as server X.
Definition 4. For any server X, let CR_right(X), CL_right(X), CR_left(X) and
CL_left(X) be the neighbor clusters of C_right(X) and C_left(X):
CR_right(X) = { C_right(Y) | Y in R(X) }, CL_right(X) = { C_right(Y) | Y in L(X) },
CR_left(X) = { C_left(Y) | Y in R(X) }, CL_left(X) = { C_left(Y) | Y in L(X) }.
Therefore, a server as a member of its C_right cluster has d right-neighbor clusters
and one left-neighbor cluster, while as a member of its C_left cluster it has d
left-neighbor clusters and one right-neighbor cluster. Take server 1210 in
SCautz(2,4,2) as an example: its two right-neighbors 2101 and 2102 lie in the two
right-neighbor clusters 01 and 02, while its two left-neighbors 0121 and 2121 both
lie in the single cluster 21; the symmetric statements hold for its C_left
membership. Combining the hybrid structure of SCautz(d,k,t) and the above
definitions, we obtain the following key properties about any server X, its clusters
and their neighbors.
Property 1. Each server X in the cluster C_right(X) has d right-neighbor servers
R(X), and these d servers are evenly distributed across d different right-neighbor
clusters. Moreover, a cluster C_right has d right-neighbor clusters, and the servers
in this C_right together connect to m servers in each right-neighbor cluster.
Property 2. Each server X in the cluster C_right(X) has d left-neighbor servers
L(X), and these d servers all lie in the same cluster. Moreover, a cluster C_left
has d left-neighbor clusters, and the servers whose relevant substrings are
identical together connect to all the servers in one left-neighbor cluster.
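These neighbor relations can be checked directly on the running SCautz(2,4,2)
example. A small Python sketch (our own illustration, not the paper's notation):

```python
D = 2             # Kautz degree d
ALPHABET = "012"  # d+1 letters
T = 2             # switch identifier length t

def right_neighbors(x):
    # One L-shift: drop the leftmost letter, append any letter != last.
    return {x[1:] + a for a in ALPHABET if a != x[-1]}

def left_neighbors(x):
    # One R-shift: drop the rightmost letter, prepend any letter != first.
    return {a + x[:-1] for a in ALPHABET if a != x[0]}

def c_right(x):
    # Identifier of the S_right cluster of x (rightmost T letters).
    return x[-T:]

x = "1210"
print(sorted(right_neighbors(x)))                # ['2101', '2102']
print({c_right(y) for y in right_neighbors(x)})  # d = 2 right-neighbor clusters
print({c_right(y) for y in left_neighbors(x)})   # one left-neighbor cluster
```

For server 1210 this reproduces the example above: the d right-neighbors fall into
the two clusters 01 and 02, while both left-neighbors fall into the single
cluster 21.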
Therefore, we obtain the following lemmas.
Lemma 1. If t = k-1, then m = 1. That is, all the servers in a cluster C_right
together connect with only one server in each right-neighbor cluster.
Lemma 2. If t <= k-2, then m = d^(k-t-1) >= d. Thus all the servers in one cluster
together connect with at least d corresponding servers in each right-neighbor
cluster.
Figure 4. The Cluster Interconnection Structures in SCautz(2,4,3) and
SCautz(2,4,2)
Therefore, if t <= k-2, there are d node-disjoint paths and m*d edge-disjoint paths
from a cluster to each of its d right-neighbor clusters. Take SCautz(2,4,3) and
SCautz(2,4,2) as examples, shown in Figure 4. For SCautz(2,4,3), there are two
servers in cluster 010 and their neighbor servers are distributed in the two
neighbor-clusters 101 and 102; but according to Lemma 1, each of the two servers
connects to only one server in each neighbor cluster, so if server 0101 fails, all
the links between clusters 010 and 101 are broken. For SCautz(2,4,2), according to
Lemma 2, there are two node-disjoint paths and four edge-disjoint paths between
neighboring clusters, so it is more reliable than SCautz(2,4,3). Thus, we always let
t <= k-2 in this paper.
Lemma 3. Similarly, there are d node-disjoint paths and m*d edge-disjoint paths from
a cluster to each of its d left-neighbor clusters. It is easy to see that the
logical Kautz structures formed by the C_right and C_left clusters are isomorphic,
so the corresponding properties for C_left can be derived and are not listed here.
SCautz is server-centric and its routing intelligence is implemented on servers.
Considering the limits on the number of servers' Ethernet NIC slots and COTS
switches' low-speed ports, we pick SCautz(4,5,3) as a typical structure for MDCN.
SCautz(4,5,3)
supports 1280 servers using only 160 COTS switches. Each server needs to be equipped
with 10 Ethernet ports, of which 8 are used for constructing the Kautz topology and
2 for connecting to the two types of COTS switches. Multi-port (dual-port,
quad-port) Ethernet NICs have become COTS components, and a COTS switch is generally
equipped with tens (e.g., 24) of 1 GigE ports and several (e.g., 4) 10 GigE ports.
SCautz uses the switches' 1 GigE ports to communicate with the servers in the same
cluster and reserves the high-speed 10 GigE ports for the inter-container network.
Thus, SCautz is a practical approach for the intra-container network of MDCN.
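The sizing above follows directly from the Kautz counting formulas; a short sketch
(our own arithmetic check) for the SCautz(4,5,3) container:

```python
def kautz_size(d, k):
    # Number of vertices in K(d,k): strings of length k over d+1 letters
    # with distinct consecutive letters.
    return (d + 1) * d ** (k - 1)

d, k, t = 4, 5, 3                  # the typical SCautz(4,5,3) container
servers = kautz_size(d, k)         # 5 * 4^4 servers in UK(4,5)
switches = 2 * kautz_size(d, t)    # one S_right set and one S_left set
ports_per_server = 2 * d + 2       # 2d Kautz links + 2 switch uplinks

print(servers, switches, ports_per_server)  # 1280 160 10
```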
4. Routing in SCautz
Based on SCautz's hierarchical structure, we propose a suite of routing algorithms
that effectively utilize the redundant resources. In this section, we first
introduce the regular routing method in a fault-free UK(d,k); then we analyze its
disadvantages in dealing with node faults; finally, we present a fault-tolerant
routing algorithm, SCRouting+, to achieve graceful performance degradation.
4.1. Routing in Kautz graph
UK(d,k) is a complete undirected Kautz structure. For the directed Kautz graph, Fiol
[12] proposed a shortest-path routing algorithm from source X to destination Y using
the L-shift operation of Definition 2: find the largest suffix of X that coincides
with a prefix of Y (this overlap is denoted the R-string); then at each hop move to
the node whose suffix coinciding with a prefix of Y is one letter longer than the
previous hop's, until the destination Y is reached, obtaining the R-path. In the
same way, an L-path can be computed using R-shift operations, also defined below.
Definition 2. Let L-shift and R-shift denote the shift operations on X = x1 x2 ... xk:
L-shift(X, a) = x2 ... xk a (a != xk), R-shift(X, a) = a x1 ... xk-1 (a != x1).
Combining Fiol's [12] and Pradhan's [13] ideas, we design a routing algorithm for
UK(d,k), called SCRouting. Let |R-string| and |L-string| denote the lengths of the
R-string and L-string. SCRouting first compares |R-string| and |L-string|: if
|R-string| > |L-string|, the R-path is picked by performing L-shifts; otherwise, the
L-path is picked to route packets.
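The R-path construction can be sketched in a few lines of Python (our own
illustration of Fiol's L-shift routing, assuming string identifiers):

```python
def r_path(x, y):
    # Fiol-style L-shift routing in K(d,k): find the longest suffix of x
    # that is also a prefix of y, then shift in y's remaining letters,
    # one L-shift per hop.
    k = len(x)
    m = next(m for m in range(k, -1, -1) if x[k - m:] == y[:m])
    path, cur = [x], x
    for letter in y[m:]:
        cur = cur[1:] + letter
        path.append(cur)
    return path

print(r_path("12", "01"))      # ['12', '20', '01'] in K(2,2)
print(r_path("0212", "1201"))  # ['0212', '2120', '1201'] in K(2,4)
```

The two printed paths are exactly the examples used later in this section.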
4.2. Fault-tolerant Routing in SCautz
In UK(d,k), there are either d parallel R-paths or d parallel L-paths between any
pair of servers. Generally, the Kautz graph uses one R-path (or L-path) for data
transmission. If the path breaks down, it is discarded and replaced by another one
from the remaining d-1 R-paths (or L-paths). The reason for not finding a sub-path
to bypass the failed links or nodes is that such a sub-path may need up to k hops.
For example, if node 20 fails, the path 12->20->01 is no longer valid, and a new
path 12->21->10->01 from 12 to 01 is computed, as shown in Figure 5. In this way,
although the destination is still reachable, the capacity shrinks. For one-to-one
traffic, the spare paths are always longer than the primitive one, so the delay of
single-path routing increases; and since only d-1 parallel paths are left, the
throughput of multi-path routing decreases by 1/d. For one-to-x traffic, even a
single link or server failure invalidates
all the paths through it, so the network capacity and reliability degrade severely.
To remedy these deficiencies, we propose a fault-tolerant routing algorithm,
SCRouting+, based on SCautz's hybrid structure. It can handle faults in the paths
generated by SCRouting in UK(d,k). SCRouting+ uses a surviving peer server in the
same cluster as the unreachable server to bypass the failed link or server: for an
R-path it utilizes the peer servers in C_right clusters, while for an L-path it
utilizes those in C_left clusters.
Figure 5. Fault-tolerant Routing in Kautz
Figure 6. SCRouting+ fault-tolerant Routing in SCautz
Let R_i(X) (resp. L_i(X)) denote the i-th right-neighbor (resp. left-neighbor)
server of X, reached by i L-shift (resp. R-shift) operations; for example, R_2(X)
means X's right-neighbor's right-neighbor, i.e., its second right-neighbor. Lemma 4
is then obtained and easily proved.
Lemma 4. For any two servers X and Y in the same logical C_right cluster, the
right-neighbors R_1(X) and R_1(Y) obtained with the same appended letter are again
in the same cluster. Moreover, if the m rightmost letters of X and Y are identical
and their m+1 rightmost letters are different, then the m+1 rightmost letters of
R_1(X) and R_1(Y) are identical and their m+2 rightmost letters differ, so in
particular R_1(X) != R_1(Y). The symmetric statement holds for L_1 in the logical
C_left structure.
According to Lemma 4, if a server detects that its next hop is unreachable,
SCRouting+ picks an idle peer server as the new next hop from among its cluster
peers, choosing one whose suffix (resp. prefix) of length t coincides with its own
and whose suffix (resp. prefix) of length t+1 does not. The right-neighbor (resp.
left-neighbor) of that peer is then a cluster peer of the failed server rather than
the failed server itself, so SCRouting+ bypasses the failed hop and rejoins the
original path at the hop after it. Moreover, the new fault-tolerant path is only one
hop longer than the original one, and does not affect the other parallel paths. For
example, if server 2120 is down, the sub-path 0212->2120->1201 of some path becomes
invalid. SCRouting+ constructs the sub-path 0212->1012->0120->1201 to bypass 2120,
as shown in Figure 6, instead of discarding the whole path and computing a new,
longer one as the regular method does.
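The bypass step can be sketched as a small search over cluster peers. The following
Python toy (our own illustration for SCautz(2,4,2); helper names are not the
paper's) reconstructs the detour from the example above:

```python
ALPHABET = "012"
T = 2  # switch identifier length in SCautz(2,4,2)

def right_neighbors(x):
    # One L-shift: x1x2...xk -> x2...xk a, for every letter a != xk.
    return [x[1:] + a for a in ALPHABET if a != x[-1]]

def cluster_right(x):
    # All K(2,4) servers whose rightmost T letters equal x's.
    suffix = x[-T:]
    return [a + b + suffix
            for a in ALPHABET for b in ALPHABET
            if a != b and b != suffix[0]]

def bypass(x, failed, after):
    # SCRouting+-style detour: replace sub-path x -> failed -> after with
    # x -> peer -> q -> after, where peer sits behind x's S_right switch
    # and q is a cluster peer of the failed server.
    for peer in sorted(p for p in cluster_right(x) if p != x):
        for q in right_neighbors(peer):
            if q != failed and q[-T:] == failed[-T:] and after in right_neighbors(q):
                return [x, peer, q, after]
    return None

print(bypass("0212", "2120", "1201"))  # ['0212', '1012', '0120', '1201']
```

The result is the one-hop-longer detour from the text: the first hop crosses the
S_right switch of cluster 12, and the rest follows ordinary L-shift links.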
5. Simulations
In this section, we conduct simulations to evaluate the behavior of SCautz and SCRouting+
on fault-tolerance. First, we analyze the performance of SCautzs base topology on handlingvarious patterns of traffic and compare the results to several representative BCubes. And
then we test the performance decline of SCautz and BCube when failures happen and
increase.In these simulations, we use SCautz(4,5,3) as a typical intra-container network of MDCN,
whose base Kautz topology is UK(4,5) and t=3. There are servers equippedwith 5 dual-port NICs and COTS switches with 24 1 GigE ports and 410 GigE ports. For comparisons, we pick two full BCube structures (BCube(32,1),
BCube(4,4))[6]
and one partial BCube (BCube(8,3))[3]
, in which the partial BCube(8,3) uses
2 complete BCube(8,2) with full layer-4 switches ( ). So there are 1024 servers in allthree BCubes but with 64, 1280, 704 switches in BCube(32,1), BCube(4,4) and BCube(8,3)
respectively.
5.1. Performance of UK(4,5)
We assume the bandwidth of each server's NIC port is 1 Gbps and that intermediate
servers relay traffic without delay. We summarize the key results in Table 1.
Table 1. Key Simulation Results of UK(4,5) and BCube

            UK(4,5)   BCube(32,1)   BCube(4,4)   BCube(8,3)
ave_path    4.38      1.94          3.75         3.51
1-to-1      4         2             5            4
1-to-all    4         2             5            4
ABT         1168.95   1057.03       1365.33      1170.29
From the simulations and comparisons, we see that UK(4,5) offers throughput for
one-to-x traffic and all-to-all traffic as high as BCube(8,3) does. UK(4,5)'s ABT
and per-server throughput are a little lower than BCube(4,4)'s because of its longer
average path length, which directly affects the ABT. When computing path lengths for
BCube, we treat the switches as dumb crossbars as [5] does (unlike [14, 15]), so the
two hops traversing a switch are counted as one. In addition, BCube(4,4) needs an
order of magnitude more switches. The results illustrate that UK(4,5) alone is able
to accelerate various traffic patterns as effectively as BCube when a container is
fault-free.
5.2 Fault-tolerance Evaluation
Since either a link or a server failure makes one hop in a path unreachable, we
assume in our simulations that all faults are caused by servers or switches, and
that server failures also reduce computation and storage capacity.
As shown in Figure 7, when one server fails, the per-server throughput of
BCube(32,1), BCube(4,4) and BCube(8,3) drops by 50%, 20% and 25% respectively for
one-to-x traffic. Using the switches, the SCRouting+ algorithm bypasses the failed
server with one extra hop and keeps the original path valid, so SCautz(4,5,3)
retains the same throughput as a fault-free container.
In Figure 8, when 10% and 20% of servers fail, the overall computation capacity
drops by 10% and 20% correspondingly, while BCube's ABT drops by 15.3% and 25.23%,
shown by the polyline labeled BCube(8,3). In contrast, SCautz(4,5,3) loses only
6.91% and 13.74% of its throughput respectively, degrading much more slowly than
computation and storage do. In addition, BCube's ABT shrinks by more than 50% when
20% of switches fail, but switch failures have no such impact on SCautz(4,5,3).
Figure 7. Throughput Degradation for One-to-one Traffic
Figure 8. ABT Degradation for all-to-all Traffic
5.3 Fault-tolerance Analysis
From the above simulations, we can see that SCautz(4,5,3) is able to leverage its
redundant switches to maintain per-server throughput for one-to-x traffic and to
roughly halve the ABT decrease compared with BCube, which evidently improves its
reliability. Switch faults have little impact on SCautz but make BCube's ABT drop
sharply, because the switches in SCautz are mainly used to tolerate faults, while
the switches in BCube sit between every pair of servers and participate in
forwarding every network packet.
An effective scheme for an SCautz-based container to deal with frequent and
increasing failures follows directly: first let the fault-free container's SCautz
base topology function on its own, and then leverage the switches to tolerate faults
as they accumulate. Thus SCautz retains the merits of its original base structure
and achieves graceful performance degradation.
6. Conclusion
The MDC's distinct service-free model poses stricter demands on the fault tolerance
of the datacenter network. Following the scale-out design principle, we propose a
novel hierarchical intra-container network structure for MDC, named SCautz. SCautz
consists of a base physical Kautz topology and hundreds of redundant COTS switches.
Its base topology, UK(d,k), effectively accelerates one-to-x traffic and offers high
network throughput for all-to-all traffic, performing as well as BCube. Besides,
each switch of the two types, together with a specific number of servers, forms a
cluster, and the clusters build two higher-level logical Kautz structures. Thus,
SCautz retains its throughput for one-to-x traffic in the presence of failures and
achieves more graceful performance degradation, roughly halving the ABT decrease
compared with BCube.
In this paper, we have shown through theoretical analysis and simulation that SCautz
is able to meet the strict requirements of MDCN. In future work, we will study how
to design an inter-container network that interconnects SCautz-based containers into
mega-datacenters. Moreover, we need to design novel load-balanced routing algorithms
to process the bursty network flows of data-intensive applications [16, 17], so that
map-reduce-like applications do not miss their strict deadlines for fetching
intermediate results from worker nodes [18].
Acknowledgements
This work is supported in part by the National Basic Research Program of China (973)
under Grant No. 2011CB302600, the National Natural Science Foundation of China
(NSFC) under Grant No. 60903205, the Foundation for the Author of National Excellent
Doctoral Dissertations of PR China (FANEDD) under Grant No. 200953, and the Research
Fund for the Doctoral Program of Higher Education (RFDP) under Grant No.
20094307110008.
References
[1] J. R. Hamilton, "An Architecture for Modular Data Centers", Proceedings of the
Biennial Conference on Innovative Data Systems Research (CIDR), (2007) January 7-10,
Asilomar, California, USA.
[2] K. V. Vishwanath, A. Greenberg and D. A. Reed, "Modular data centers: how to
design them?", Proceedings of LSAP, (2009) June 10, Munich, Germany.
[3] A. B. Letaifa, A. Haji, M. Jebalia and S. Tabbane, "State of the Art and
Research Challenges of new services architecture technologies: Virtualization, SOA
and Cloud Computing", International Journal of Grid and Distributed Computing
(IJGDC), 3, 68 (2010).
[4] P. Chakraborty, D. Bhattacharyya, N. Y. Sattarova and S. Bedaj, "Green
computing: Practice of Efficient and Eco-Friendly Computing Resources",
International Journal of Grid and Distributed Computing (IJGDC), 2, 33 (2009).
[5] C. Guo, G. Lu, et al., "BCube: a high performance, server-centric network
architecture for modular data centers", Proceedings of SIGCOMM '09, (2009) August
17-21, Barcelona, Spain.
[6] A. Greenberg, J. R. Hamilton, et al., "VL2: a scalable and flexible data center
network", Proceedings of SIGCOMM '09, (2009) August 17-21, Barcelona, Spain.
[7] R. N. Mysore, A. Pamboris, et al., "PortLand: a scalable fault-tolerant layer 2
data center network fabric", Proceedings of SIGCOMM '09, (2009) August 17-21,
Barcelona, Spain.
[8] C. Guo, H. Wu, et al., "DCell: a scalable and fault-tolerant network structure
for data centers", Proceedings of SIGCOMM '08, (2008) August 17-22, Seattle,
Washington, USA.
[9] H. Abu-Libdeh, P. Costa, et al., "Symbiotic routing in future data centers",
Proceedings of SIGCOMM '10, (2010) August 30-September 3, New Delhi, India.
[10] H. Sim, J.-C. Oh and H.-O. Lee, "Multiple Reduced Hypercube (MRH): A New
Interconnection Network Reducing Both Diameter and Edge of Hypercube", International
Journal of Grid and Distributed Computing (IJGDC), 3, 19 (2010).
[11] M. O. Balitanas and T. Kim, "Using Incentives for Heterogeneous peer-to-peer
Network", International Journal of Advanced Science and Technology (IJAST), 14, 23
(2010).
[12] M. A. Fiol and A. S. Llado, "The partial line digraph technique in the design
of large interconnection networks", IEEE Trans. Computers, 41, 848 (1992).
[13] D. K. Pradhan and S. M. Reddy, "A fault-tolerant communication architecture for
distributed systems", IEEE Trans. Computers, 32, 863 (1982).
[14] G. Praveen and P. Vijayrajan, "Analysis of Performance in the Virtual Machines
Environment", International Journal of Advanced Science and Technology (IJAST), 32,
53 (2011).
[15] H. Wu, G. Lu, D. Li, et al., "MDCube: a high performance network structure for
modular data center interconnection", Proceedings of CoNEXT '09, (2009) December,
Rome, Italy.
[16] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang and A. Vahdat, "Hedera:
Dynamic Flow Scheduling for Data Center Networks", Proceedings of the 7th USENIX
Conference on Networked Systems Design and Implementation (NSDI '10), (2010).
[17] C. Raiciu, S. Barre, A. Greenhalgh, D. Wischik and M. Handley, "Improving
datacenter performance and robustness with multipath TCP", Proceedings of SIGCOMM
'11, (2011) August 15-19, Toronto, Ontario, Canada.
[18] C. Wilson and H. Ballani, "Better never than late: meeting deadlines in
datacenter networks", Proceedings of SIGCOMM '11, (2011) August 15-19, Toronto,
Ontario, Canada.
Authors
Feng Huang
He received the B.Sc. degree (with honors) in computer science from the College of
Computer, National University of Defense Technology (NUDT), Changsha, China, in
2001. He is now a Ph.D. student at the National Lab for Parallel and Distributed
Processing, NUDT. His research interests include cloud computing, datacenter
networks, grid computing, virtual machine technology and data-intensive
applications.