Efficiency in Small World Networks

  • 7/30/2019 Efficiency in Small World Networks

    arXiv:cond-mat/0101396v2  12 Oct 2001

    Efficient Behavior of Small-World Networks

    Vito Latora (1,2) and Massimo Marchiori (3,4)

    (1) Laboratoire de Physique Théorique et Modèles Statistiques, Université Paris-Sud, Bât. 100, 91405 Orsay Cedex, France
    (2) Department of Physics and Astronomy, University of Catania, and INFN, Italy
    (3) W3C and Lab. for Computer Science, Massachusetts Institute of Technology, USA
    (4) Department of Computer Science, University of Venice, Italy

    (February 1, 2008)

    We introduce the concept of efficiency of a network, measuring how efficiently it exchanges information. By using this simple measure, small-world networks are seen as systems that are both globally and locally efficient. This gives a clear physical meaning to the concept of small-world, and also allows a precise quantitative analysis of both weighted and unweighted networks. We study neural networks and man-made communication and transportation systems and we show that the underlying general principle of their construction is in fact a small-world principle of high efficiency.

    PACS numbers: 89.70.+c, 05.90.+m, 87.18.Sn, 89.40.+k

    We live in a world of networks. In fact any complex system in nature can be modeled as a network, where vertices are the elements of the system and edges represent the interactions between them. Coupled biological and chemical systems, neural networks, social interacting species, computer networks or the Internet are only a few such examples [1]. Characterizing the structural properties of the networks is then of fundamental importance to understand the complex dynamics of these systems. A recent paper [2] has shown that the connection topology of some biological and social networks is neither completely regular nor completely random. These networks, there named small-worlds in analogy with the concept of the small-world phenomenon developed 30 years ago in social psychology [3], are in fact highly clustered like regular lattices, yet have small characteristic path lengths like random graphs. The original paper has triggered a large interest in the study of the properties of small-worlds (see ref. [4] for a recent review). Researchers have focused their attention on different aspects: study of the onset mechanism [5–7], dynamics [8] and spreading of diseases on small-worlds [9], applications to social networks [10,11] and to the Internet [12,13]. In this letter we introduce the concept of efficiency of a network, measuring how efficiently information is exchanged over the network. By using efficiency, small-world networks emerge as systems that are both globally and locally efficient. This formalization gives a clear physical meaning to the concept of small-world, and also allows a precise quantitative analysis of unweighted and weighted networks. We study several systems, like brains, communication and transportation networks, and show that the underlying general principle of their construction is in fact a small-world principle, provided care is taken not to ignore an important observational property (closure).

    We start by reexamining the original formulation proposed in ref. [2]. There, a generic graph $G$ with $N$ vertices and $K$ edges is considered. $G$ is assumed to

    be unweighted, i.e. edges are all equal; sparse, i.e. $K \ll N(N-1)/2$; and connected, i.e. there exists at least one path connecting any couple of vertices with a finite number of steps. $G$ is therefore represented by simply giving the adjacency (or connection) matrix, i.e. the $N \times N$ matrix whose entry $a_{ij}$ is 1 if there is an edge joining vertex $i$ to vertex $j$, and 0 otherwise. An important quantity of $G$ is the degree of vertex $i$, i.e. the number $k_i$ of edges incident with vertex $i$ (the number of neighbours of $i$). The average value of $k_i$ is $k = 2K/N$. Once $\{a_{ij}\}$ is given it can be used to calculate the matrix of the shortest path lengths $d_{ij}$ between two generic vertices $i$ and $j$. The fact that $G$ is assumed to be connected implies that $d_{ij}$ is positive and finite $\forall i \neq j$. In order to quantify the structural properties of $G$, [2] proposes to evaluate two different quantities: the characteristic path length $L$ and the clustering coefficient $C$. $L$ is the average distance between two generic vertices, $L = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij}$, and $C$ is a local property defined as $C = \frac{1}{N} \sum_i C_i$. Here $C_i$ is the number of edges existing in $G_i$, the subgraph of the neighbors of $i$, divided by the maximum possible number $k_i(k_i-1)/2$.

    In [2] a simple method is considered to produce a class of graphs with increasing randomness. The initial graph $G$ is taken to be a one-dimensional lattice with each vertex connected to its $k$ neighbours and with periodic boundary conditions. By rewiring each edge at random with probability $p$, $G$ can be tuned in a continuous way from a regular lattice ($p = 0$) into a random graph ($p = 1$). For the regular lattice we expect $L \sim N/2k$ and a high clustering coefficient $C = \frac{3}{4}(k-2)/(k-1)$, while for a random graph $L \sim \ln N/\ln(k-1)$ and $C \sim k/N$ [14,5]. Although in the two limit cases a large $C$ is associated with a large $L$ and, vice versa, a small $C$ with a small $L$, the numerical experiment reveals an intermediate regime at small $p$ where the system is highly clustered like regular lattices, yet has small characteristic path lengths like random graphs. This behavior is there called small-world and it is found to be a property of some social and

    biological networks analyzed [2].

    Now we propose a more general set-up to investigate real networks. We will show that:
    - the definition of small-world behavior can be given in terms of a single variable with a physical meaning, the efficiency $E$ of the network;
    - $1/L$ and $C$ can be seen as first approximations of $E$ evaluated, respectively, on a global and on a local scale;
    - we can drop all the restrictions on the system, like unweightedness, connectedness and sparseness.

    We represent a real network as a generic weighted (and possibly even non-sparse and non-connected) graph $G$. Such a graph needs two matrices to be described: the adjacency matrix $\{a_{ij}\}$, defined as for the unweighted graph, and the matrix $\{\ell_{ij}\}$ of physical distances. The number $\ell_{ij}$ can be the space distance between the two vertices or the strength of their possible interaction: we suppose $\ell_{ij}$ to be known even if in the graph there is no edge between $i$ and $j$. For example, $\ell_{ij}$ can be the geographical distance between stations in transportation systems (in such a case $\ell_{ij}$ respects the triangle inequality, though this is not a necessary assumption), the time taken to exchange a packet of information between routers in the Internet, or the inverse velocity of chemical reactions along a direct connection in a biological system. Of course, in the particular case of an unweighted graph $\ell_{ij} = 1$ $\forall i \neq j$. The shortest path length $d_{ij}$ between two generic points $i$ and $j$ is the smallest sum of the physical distances throughout all the possible paths in the graph from $i$ to $j$. The matrix $\{d_{ij}\}$ is therefore calculated by using the information contained both in matrix $\{a_{ij}\}$ and in matrix $\{\ell_{ij}\}$. We have $d_{ij} \geq \ell_{ij}$ $\forall i, j$, the equality being valid when there is an edge between $i$ and $j$.

    Let us now suppose that the system is parallel, i.e. every vertex sends information concurrently along the network, through its edges. The efficiency $\epsilon_{ij}$ in the communication between vertex $i$ and $j$ can then be defined to be inversely proportional to the shortest distance: $\epsilon_{ij} = 1/d_{ij}$ $\forall i, j$. When there is no path in the graph between $i$ and $j$, $d_{ij} = +\infty$ and consistently $\epsilon_{ij} = 0$. The average efficiency of $G$ can be defined as:

    $$E(G) = \frac{\sum_{i \neq j \in G} \epsilon_{ij}}{N(N-1)} = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{d_{ij}} \qquad (1)$$

    To normalize $E$ we consider the ideal case $G^{\rm id}$ in which the graph $G$ has all the $N(N-1)/2$ possible edges. In such a case the information is propagated in the most efficient way since $d_{ij} = \ell_{ij}$ $\forall i, j$, and $E$ assumes its maximum value $E(G^{\rm id}) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{\ell_{ij}}$. The efficiency $E(G)$ considered in the rest of the paper is always divided by $E(G^{\rm id})$ and therefore $0 \leq E(G) \leq 1$. Though the equality $E = 1$ holds when there is an edge between each couple of vertices, real networks can still reach a high value of $E$.
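As an illustration only, eq. (1) and its normalization can be sketched in a few lines of Python. The function names (`shortest_paths`, `raw_efficiency`, `efficiency`) and the list-of-lists layout of the matrices are assumptions made for this sketch and are not taken from the paper. Shortest paths are obtained with a Floyd-Warshall relaxation, and the matrix of physical distances is assumed to be symmetric with positive off-diagonal entries.

```python
import math


def shortest_paths(adj, dist):
    """All-pairs shortest path lengths d_ij via Floyd-Warshall, O(N^3).

    adj[i][j] is 1 if an edge joins i and j, else 0 (assumed layout).
    dist[i][j] is the physical distance l_ij, known for every pair.
    """
    n = len(adj)
    d = [[0.0 if i == j else (dist[i][j] if adj[i][j] else math.inf)
          for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            dik = d[i][k]
            if dik == math.inf:
                continue  # no path i -> k, nothing to relax
            for j in range(n):
                if dik + d[k][j] < d[i][j]:
                    d[i][j] = dik + d[k][j]
    return d


def raw_efficiency(d):
    """Eq. (1): mean of 1/d_ij over ordered pairs i != j (1/inf counts as 0)."""
    n = len(d)
    if n < 2:
        return 0.0
    total = sum(1.0 / d[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def efficiency(adj, dist):
    """E(G) divided by E(G_id), so that the result lies in [0, 1]."""
    e_real = raw_efficiency(shortest_paths(adj, dist))
    # In the ideal graph every pair is joined directly, hence d_ij = l_ij.
    e_ideal = raw_efficiency(dist)
    return e_real / e_ideal if e_ideal > 0 else 0.0
```

For an unweighted path 0-1-2 (all $\ell_{ij} = 1$) this sketch gives $E = 5/6$: each of the two adjacent pairs contributes 1, the end-to-end pair contributes 1/2, and the ideal graph has $E(G^{\rm id}) = 1$. A disconnected pair simply contributes 0, so the value stays finite.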

    In our formalism, we can define the small-world behaviour by using the single measure $E$ to analyze both the local and global behaviour, rather than two different variables $L$ and $C$. The quantity in eq. (1) is the global efficiency of $G$ and we therefore name it $E_{\rm glob}$. Since $E$ is defined also for a disconnected graph we can characterize the local properties of $G$ by evaluating for each vertex $i$ the efficiency of $G_i$, the subgraph of the neighbors of $i$. We define the local efficiency as the average efficiency of the local subgraphs, $E_{\rm loc} = \frac{1}{N} \sum_{i \in G} E(G_i)$. This quantity plays a role similar to the clustering coefficient $C$. Since $i \notin G_i$, the local efficiency $E_{\rm loc}$ tells how fault tolerant the system is, i.e. how efficient the communication between the first neighbours of $i$ is when $i$ is removed [15]. The definition of small-world can now be rephrased and generalized in terms of the information flow: small-world networks have high $E_{\rm glob}$ and $E_{\rm loc}$, i.e. are very efficient in global and local communication. This definition is valid both for unweighted and weighted graphs, and can also be applied to disconnected and/or non-sparse graphs.
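The definition of local efficiency can be sketched along the same lines. The code below is a minimal illustration, not the authors' implementation: the names `efficiency` and `local_efficiency` are hypothetical, matrices are plain lists of lists, and subgraphs $G_i$ with fewer than two vertices are assigned $E(G_i) = 0$, which is a convention assumed here because the text does not spell it out.

```python
import math


def efficiency(adj, dist, nodes=None):
    """Normalized efficiency of the subgraph induced by `nodes`.

    adj[i][j] is 1 if an edge joins i and j; dist[i][j] is the physical
    distance l_ij. Subgraphs with fewer than two vertices get E = 0
    (a convention assumed for this sketch).
    """
    nodes = list(range(len(adj))) if nodes is None else list(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Direct distances inside the subgraph; inf where there is no edge.
    d = [[0.0 if a == b else (dist[a][b] if adj[a][b] else math.inf)
          for b in nodes] for a in nodes]
    # Floyd-Warshall relaxation restricted to the subgraph.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    # The common factor 1/(n(n-1)) cancels in the ratio E(G)/E(G_id).
    real = sum(1.0 / d[i][j] for i in range(n) for j in range(n) if i != j)
    ideal = sum(1.0 / dist[a][b] for a in nodes for b in nodes if a != b)
    return real / ideal


def local_efficiency(adj, dist):
    """E_loc: average over i of the efficiency of G_i (neighbours of i, i excluded)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        neighbours = [j for j in range(n) if j != i and adj[i][j]]
        total += efficiency(adj, dist, neighbours)
    return total / n


# Triangle 0-1-2 with a pendant vertex 3 attached to 0, unweighted.
adj = [[0, 1, 1, 1],
       [1, 0, 1, 0],
       [1, 1, 0, 0],
       [1, 0, 0, 0]]
ones = [[1.0] * 4 for _ in range(4)]
e_loc = local_efficiency(adj, ones)
```

In this small example $E(G_0) = 1/3$ (only the pair 1-2 of the three neighbours is connected), $E(G_1) = E(G_2) = 1$ and $E(G_3) = 0$, so the sketch yields $E_{\rm loc} = 7/12$.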

    It is interesting to see the correspondence between our measure and the quantities $L$ and $C$ of [2] (or, correspondingly, $1/L$ and $C$). The fundamental difference is that $E_{\rm glob}$ is the efficiency of a parallel system, where all the nodes in the network concurrently exchange packets of information (such are all the systems in [2], for example), while $1/L$ measures the efficiency of a sequential system (i.e. only one packet of information goes along the network). $1/L$ is a reasonable approximation of $E_{\rm glob}$ when there are not huge differences among the distances in the graph, and this can explain why $L$ works reasonably well in the unweighted examples of [2]. But in general $1/L$ can significantly depart from $E_{\rm glob}$. For instance, in the Internet, having a few computers with an extremely slow connection does not mean that the whole Internet greatly diminishes its efficiency: in practice, the presence of such very slow computers goes unnoticed, because the other thousands of computers are exchanging packets among them in a very efficient way. Here $1/L$ would give a number very close to zero (strictly 0 in the particular case when a computer is disconnected from the others and $L = +\infty$), while $E_{\rm glob}$ gives the correct efficiency measure of the Internet.

    We turn now our attention to the local properties of a network. $C$ is only one among the many possible intuitive measures [10] of how well connected a cluster is. It can be shown that when in a graph most of its local subgraphs $G_i$ are not sparse, then $C$ is a good approximation of $E_{\rm loc}$. Summing up, there are not two different kinds of analysis to be done for the global and local scales, but just one with a very precise physical meaning: the efficiency in transporting information.

    We now illustrate the onset of the small-world in an unweighted graph by means of the same example used in [2]. A regular lattice with $N = 1000$ and $k = 20$ is rewired

    with probability $p$, and $E_{\rm glob}$ and $E_{\rm loc}$ are reported in Fig. 1 as functions of $p$ [16]. For $p = 0$ we expect the system to be inefficient on a global scale ($E_{\rm glob} \sim (k/N) \log(N/k)$) but locally efficient. The situation is inverted for the random graph. In fact at $p = 1$ $E_{\rm glob}$ assumes a maximum value of 0.4, meaning 40% of the efficiency of the ideal graph with an edge between each couple of vertices. This comes at the expense of the fault tolerance ($E_{\rm loc} \sim 0$).
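The rewiring experiment can be reproduced in outline with the following sketch. It is a simplified stand-in and not the code used for Fig. 1: the helper names are hypothetical, the rewiring rule (replace the far end of a lattice edge by a uniformly chosen non-neighbour) is one common reading of the procedure of [2], and the default size is kept well below $N = 1000$ so that the pure-Python breadth-first searches finish quickly. Since the graph is unweighted, $E(G^{\rm id}) = 1$ and no further normalization is needed.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Ring of n vertices, each joined to its k nearest neighbours (k even)."""
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        for s in range(1, k // 2 + 1):
            j = (i + s) % n
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def rewire(nbrs, k, p, rng):
    """Return a copy in which each lattice edge is rewired with probability p."""
    n = len(nbrs)
    nbrs = [set(s) for s in nbrs]
    for s in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + s) % n
            if j in nbrs[i] and rng.random() < p:
                candidates = [m for m in range(n)
                              if m != i and m not in nbrs[i]]
                if candidates:
                    m = rng.choice(candidates)
                    nbrs[i].discard(j)
                    nbrs[j].discard(i)
                    nbrs[i].add(m)
                    nbrs[m].add(i)
    return nbrs


def efficiency(nbrs, nodes=None):
    """Unweighted efficiency of the subgraph induced by `nodes` (BFS distances)."""
    nodes = set(range(len(nbrs))) if nodes is None else set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0  # convention assumed for this sketch
    total = 0.0
    for src in nodes:
        depth = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if v in nodes and v not in depth:
                    depth[v] = depth[u] + 1
                    total += 1.0 / depth[v]
                    queue.append(v)
    return total / (n * (n - 1))


def local_efficiency(nbrs):
    """Average efficiency of the neighbour subgraphs G_i."""
    return sum(efficiency(nbrs, nbrs[i]) for i in range(len(nbrs))) / len(nbrs)


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, chosen arbitrarily
    lattice = ring_lattice(200, 8)  # smaller than the N = 1000, k = 20 of the text
    for p in (0.0, 0.001, 0.01, 0.1, 1.0):
        g = rewire(lattice, 8, p, rng)
        print(p, round(efficiency(g), 3), round(local_efficiency(g), 3))
```

Running the loop with `ring_lattice(1000, 20)` instead corresponds to the parameters quoted in the text, at the price of a longer run time; a single random realization will of course fluctuate around the averaged curves.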

    [Figure 1: $E_{\rm glob}(p)$ and $E_{\rm loc}(p)$ plotted against the rewiring probability $p$; logarithmic horizontal axis from 0.001 to 1, vertical axis from 0 to 1.]

    FIG. 1. Global and local efficiency for the graph example considered in [2]. A regular lattice with $N = 1000$ and $k = 20$ is rewired with probability $p$. The small-world behaviour results from the increase of $E_{\rm glob}$ caused by the introduction of only a few rewired edges (short cuts), which on the other hand do not affect $E_{\rm loc}$. At $p \sim 0.1$, $E_{\rm glob}$ has almost reached the value of the random graph, though $E_{\rm loc}$ has only diminished by very little from the value of 0.82 of the regular lattice. Small worlds have high $E_{\rm glob}$ and $E_{\rm loc}$.

    The small-world behaviour appears for intermediate values of $p$. It results from the fast increase of $E_{\rm glob}$ (for small $p$ we find a linear increase of $E_{\rm glob}$ in the logarithmic horizontal scale) caused by the introduction of only a few rewired edges (short cuts), which on the other hand do not affect $E_{\rm loc}$. At $p \sim 0.1$, $E_{\rm glob}$ has almost reached the maximum value of 0.4, though $E_{\rm loc}$ has only diminished by very little from the maximum value of 0.82. For an unweighted case the description in terms of network efficiency resembles the approximation given in [2]. In particular we have checked that a good agreement with curves $L(p)$ and $C(p)$ [2] can be obtained by reporting $1/E_{\rm glob}(p)$ and $E_{\rm loc}(p)$. Of course in such an example the short cuts connect at almost no cost vertices that would otherwise be much farther apart (because $\ell_{ij} = 1$ $\forall i \neq j$). On the other hand this is not true when we consider a weighted network.

    As real networks we consider first different examples of natural systems (neural networks), and then we turn our attention to man-made communication and transportation systems.

    1) Neural Networks. Thanks to recent experiments neural structures can be studied at several levels of scale. Here we focus first on the analysis of the neuroanatomical structure of cerebral cortex, and then on a simple nervous system at the level of wiring between neurons. The anatomical connections between cortical areas are of particular importance for their intricate relationship with the functional connectivity of the cerebral cortex [18]. We analyze two databases of cortico-cortical connections in the macaque and in the cat [19]. Tab. 1 indicates the two networks are small-worlds [16]: they have high $E_{\rm glob}$, respectively 52% and 69% of the efficiency of the ideal graph with an edge between each couple of vertices (just slightly smaller than the best possible values of 57% and 70% obtained in random graphs) and high $E_{\rm loc}$, respectively 70% and 83%, i.e. high fault tolerance [22]. These results indicate that in neural cortex each region is intermingled with the others and has grown following a perfect balance between local necessities (fault tolerance) and wide-scope interactions. Next we consider the neural network of C. elegans, the only case of a nervous system completely mapped at the level of neurons and chemical synapses [23]. Tab. 1 shows that this is also a small-world network: C. elegans achieves close to 50% of both global and local efficiency ($E_{\rm glob} = 0.46$, $E_{\rm loc} = 0.47$). Moreover the value of $E_{\rm glob}$ is similar to $E_{\rm loc}$. This differs from the cortex databases, where fault tolerance is slightly privileged with respect to global communication.
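The random-graph reference values quoted above can be estimated numerically. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, $K$ is read as the number of undirected edges, the graphs are unweighted (so that $E(G^{\rm id}) = 1$), and the number of samples and the seed are arbitrary choices. It does not use the cortical databases themselves, only the values of $N$ and $K$ given in the caption of Table I.

```python
import random
from collections import deque
from itertools import combinations


def random_graph(n, k, rng):
    """Unweighted graph with n vertices and k distinct edges drawn uniformly."""
    pairs = list(combinations(range(n), 2))
    nbrs = [set() for _ in range(n)]
    for i, j in rng.sample(pairs, k):
        nbrs[i].add(j)
        nbrs[j].add(i)
    return nbrs


def global_efficiency(nbrs):
    """Eq. (1) for an unweighted graph, with distances from breadth-first search."""
    n = len(nbrs)
    total = 0.0
    for src in range(n):
        depth = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in depth:
                    depth[v] = depth[u] + 1
                    total += 1.0 / depth[v]
                    queue.append(v)
    return total / (n * (n - 1))


def random_baseline(n, k, samples=20, seed=0):
    """Mean global efficiency over `samples` random graphs with n vertices, k edges."""
    rng = random.Random(seed)
    values = [global_efficiency(random_graph(n, k, rng)) for _ in range(samples)]
    return sum(values) / samples


if __name__ == "__main__":
    # N and K as quoted in the caption of Table I.
    print("macaque-sized random graph:", round(random_baseline(69, 413), 2))
    print("cat-sized random graph:", round(random_baseline(55, 564), 2))
```

Comparing the measured $E_{\rm glob}$ of a real network with such a baseline is what allows the text to call the cortical values "just slightly smaller than the best possible values".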

    2) Communication Networks. We have considered two of the most important large-scale communication networks present nowadays: the World Wide Web and the Internet. Tab. 2 shows that they have relatively high values of $E_{\rm glob}$ (slightly smaller than the best possible values obtained for random graphs) and $E_{\rm loc}$. Although the WWW is a virtual network and the Internet is a physical network, at a global scale they transport information essentially in the same way (as their $E_{\rm glob}$ values are almost equal). At a local scale, the larger $E_{\rm loc}$ in the WWW case can be explained both by the tendency in the WWW to create Web communities (where pages talking about the same subject tend to link to each other), and by the fact that many pages within the same site are often quickly connected to each other by some root or menu page.

    3) Transport Networks. Unlike the previous databases, the Boston subway transportation system (MBTA) can be better described by a weighted graph, the matrix $\{\ell_{ij}\}$ being given by the geographical distances between stations. If we consider the MBTA as an unweighted graph we obtain that it is apparently neither locally nor globally efficient (see Tab. 3). On the other hand, when we take into account the geographical distances, we have $E_{\rm glob} = 0.63$: this shows the MBTA is a very efficient transportation system on a global scale, only 37% less efficient than the ideal subway with a direct tunnel from each station to the others. Even in the weighted case $E_{\rm loc}$ stays low (0.03), indicating a poor local behaviour: differently from a neural network the

    MBTA is not fault tolerant and damage to a station will dramatically affect the connection between the previous and the next station. The difference with respect to neural networks comes from different needs and priorities in the construction and evolution mechanism: when we build a subway system, the priority is given to the achievement of global efficiency, and not to fault tolerance. In fact a temporary problem in a station can be solved by other means: for example, walking, or taking a bus from the previous to the next station. That is to say, the MBTA is not a closed system: it can be considered, after all, as a subgraph of a wider transportation network. This property is of fundamental importance when we analyze a system: while global efficiency is without doubt the major characteristic, it is closure that somehow leads a system to have high local efficiency (without alternatives, there should be high fault-tolerance). The MBTA is not a closed system, and this explains why, unlike in the case of the brain, fault tolerance is not a critical issue. Indeed, if we increase the precision of the analysis and change the MBTA subway network by taking into account, for example, the Boston Bus System, this extended transportation system turns out again to be a small-world network ($E_{\rm glob} = 0.72$, $E_{\rm loc} = 0.46$). Qualitatively similar results, confirming the similarity of construction principles, have been obtained for other undergrounds and for a wider transportation system consisting of all the main airplane and highway connections throughout the world [25]. Considering all the transportation alternatives available at that scale makes the system closed again (there are no other reasonable routing alternatives), and so fault-tolerance comes back as a leading construction principle.
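The gap between the unweighted and the weighted description of a subway line can be seen on a toy example. The sketch below uses made-up station coordinates, not MBTA data, and the function name `line_efficiency` is hypothetical; distances $\ell_{ij}$ are taken to be Euclidean in the weighted case and equal to 1 in the unweighted case, and the stations are assumed to sit at distinct positions.

```python
import math


def line_efficiency(coords, tunnels, weighted=True):
    """Normalized global efficiency of a toy transport network.

    coords  : list of (x, y) station positions (hypothetical data)
    tunnels : list of (i, j) index pairs joined by a direct tunnel
    weighted: use Euclidean l_ij if True, l_ij = 1 otherwise
    """
    n = len(coords)
    ell = [[math.dist(coords[i], coords[j]) if weighted else 1.0
            for j in range(n)] for i in range(n)]
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j in tunnels:
        d[i][j] = d[j][i] = ell[i][j]
    # Floyd-Warshall shortest paths along the tunnels.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    real = sum(1.0 / d[i][j] for i in range(n) for j in range(n) if i != j)
    ideal = sum(1.0 / ell[i][j] for i in range(n) for j in range(n) if i != j)
    return real / ideal


# Five stations on a straight line, tunnels between consecutive stations.
stations = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
tunnels = [(0, 1), (1, 2), (2, 3), (3, 4)]
e_unweighted = line_efficiency(stations, tunnels, weighted=False)
e_weighted = line_efficiency(stations, tunnels, weighted=True)
```

For this degenerate straight line the unweighted value is 77/120 (about 0.64), because every intermediate stop counts as a full step, whereas the weighted value is exactly 1: travelling along the line is as short as a direct tunnel. A real network with bends and branches falls between these extremes. In both descriptions the two neighbours of an intermediate station are not joined to each other once that station is removed, which is the kind of structure that keeps $E_{\rm loc}$ low.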

    Summing up, the introduction of the efficiency measure gives a definition of small-world with a clear physical meaning, and provides important hints on why the original formulas of [2] work reasonably well in some cases, and where they fail. The efficiency measure allows a precise quantitative analysis of the information flow, and works both in the unweighted abstraction and in the more realistic assumption of weighted networks. Finally, analysis of real data indicates that various existing (neural, communication and transport) networks exhibit the small-world behaviour (even, in some cases, when their unweighted abstractions do not), substantiating the idea that the diffusion of small-world networks can be interpreted as the need to create networks that are both globally and locally efficient.

    [1] Y. Bar-Yam, Dynamics of Complex Systems (Addison-Wesley, Reading Mass, 1997).
    [2] D.J. Watts and S.H. Strogatz, Nature 393, 440 (1998).
    [3] S. Milgram, Psychol. Today 2, 60 (1967).
    [4] M.E.J. Newman, cond-mat/0001118.
    [5] A. Barrat and M. Weigt, Europ. Phys. J. B 13, 547 (2000).
    [6] M. Marchiori and V. Latora, Physica A 285, 539 (2000).
    [7] M. Barthelemy and L. Amaral, Phys. Rev. Lett. 82, 3180 (1999).
    [8] L.F. Lago-Fernandez et al., Phys. Rev. Lett. 84, 2758 (2000).
    [9] C. Moore and M.E.J. Newman, Phys. Rev. E 61, 5678 (2000).
    [10] M.E.J. Newman, cond-mat/0011144.
    [11] L.A.N. Amaral, A. Scala, M. Barthelemy, and H.E. Stanley, Proc. Natl. Acad. Sci. 97, 11149 (2000).
    [12] R. Albert, H. Jeong, and A.-L. Barabasi, Nature 401, 130 (1999).
    [13] A.-L. Barabasi and R. Albert, Science 286, 509 (1999).
    [14] B. Bollobas, Random Graphs (Academic, London, 1985).
    [15] Our concept of fault tolerance is different from the one adopted in R. Albert, H. Jeong, and A.-L. Barabasi, Nature 406, 378 (2000); R. Cohen et al., Phys. Rev. Lett. 85, 2758 (2000), where the authors consider the response of the entire network to the removal of $i$.
    [16] Here and in the following the matrix $\{d_{ij}\}_{i,j \in G}$ has been computed by using two different methods: the Floyd-Warshall ($O(N^3)$) [17] and the Dijkstra algorithm ($O(N^2 \log N)$) [10].
    [17] G. Gallo and S. Pallottino, Ann. Oper. Res. 13, 3 (1988).
    [18] O. Sporns, G. Tononi, and G.M. Edelman, Cerebral Cortex 10, 127 (2000).
    [19] J.W. Scannell, Nature 386, 452 (1997).
    [20] M.P. Young, Phil. Trans. R. Soc. B 252, 13 (1993).
    [21] J.W. Scannell, M.P. Young, and C. Blakemore, J. Neurosci. 15, 1463 (1995).
    [22] E. Sivan, H. Parnas, and D. Dolev, Biol. Cybern. 81, 11-23 (1999).
    [23] J.G. White et al., Phil. Trans. R. Soc. London B 314, 1 (1986).
    [24] T.B. Achacoso and W.S. Yamamoto, AY's Neuroanatomy of C. elegans for Computation (CRC Press, FL, 1992).
    [25] M. Marchiori and V. Latora, in preparation.

    TABLE I. Macaque and cat cortico-cortical connections [19]. The macaque database contains $N = 69$ cortical areas and $K = 413$ connections [20]. The cat database has $N = 55$ cortical areas (including hippocampus, amygdala, entorhinal cortex and subiculum) and $K = 564$ (revised database and cortical parcellation from [21]). The nervous system of C. elegans consists of $N = 282$ neurons and $K = 2462$ links which can be either synaptic connections or gap junctions [24].

                    E_glob    E_loc
    Macaque         0.52      0.70
    Cat             0.69      0.83
    C. elegans      0.46      0.47

    TABLE II. Communication networks. Data on the World Wide Web from http://www.nd.edu/~networks contains $N = 325729$ documents and $K = 1090108$ links [12], while the Internet database is taken from http://moat.nlanr.net and has $N = 6474$ nodes and $K = 12572$ links.

                    E_glob    E_loc
    WWW             0.28      0.36
    Internet        0.29      0.26

    TABLE III. The Boston underground transportation system (MBTA) consists of $N = 124$ stations and $K = 124$ tunnels. The matrix $\{\ell_{ij}\}$ of the spatial distances between stations, used for the weighted case, has been calculated using databases from http://www.mbta.com/ and the U.S. National Mapping Division.

                          E_glob    E_loc
    MBTA (unweighted)     0.10      0.006
    MBTA (weighted)       0.63      0.03
