Analysis and Simulation of Scientific Networks



Nowadays, the word network immediately evokes the idea of physically wired networks such as those formed by telephone lines or computer links. The concept of a network, however, is a good deal more general than this. This paper describes and evaluates different existing network models and develops a generalization for network bundles.


Institute of Theoretical Physics

University of Cologne

Diploma Thesis

Analysis and Simulation

of Scientific Networks

Felix Pütsch

July 14, 2003


supervised by Prof. D. Stauffer


Hereby, I confirm (according to the Prüfungsordnung of July 12, 1996, §20(5)) that I composed this diploma thesis alone, using no sources and tools other than those mentioned. Citations have been marked.


Though the mountains divide

And the oceans are wide

It's a small world after all

R. M. and R. B. Sherman


Contents

1. Introduction
1.1. Networks
1.2. Six Degrees of Separation
1.3. Small World Effect
1.4. Science Collaboration Networks
2. Network Models
2.1. Measurements
2.1.1. Small World Effect
2.1.2. Clustering
2.1.3. Scale-Free Behavior
2.2. Regular Lattices
2.3. Erdős-Rényi Random Networks
2.4. Watts-Strogatz Small-World Networks
2.5. Barabási-Albert Network Model
2.6. New Approaches
3. Empirical Collaboration Network
3.1. Typology
3.1.1. Citation Graph
3.1.2. Collaboration Graph
3.2. Building the net
3.2.1. Proceeding
3.2.2. Visualization
3.3. Analysis
3.3.1. Authors per paper
3.3.2. Connections per author
3.3.3. Double vs. unique links
3.3.4. Cluster sizes
3.4. Comparison
3.4.1. Erdős-Rényi random graphs
3.4.2. Watts-Strogatz small-world networks
3.4.3. Barabási-Albert networks
4. Spin models
4.1. Leadership effect
4.1.1. Ising model
4.1.2. Phase transition
4.1.3. Degree distribution
4.1.4. Spin flip model
4.2. Cluster limited Ising models
4.2.1. Proceeding
4.2.2. Results
4.2.3. Bias adjustification
4.2.4. Linear Relationship
5. Barabási-Albert network models
5.1. Modified Barabási-Albert model
5.2. Simulation
5.2.1. Isolated clusters
5.2.2. Merging clusters
5.2.3. scale-free behavior
6. Conclusion
A. Acknowledgements
B. Source code
B.1. Ising model
B.2. Spin flip model
B.3. Modified Barabási-Albert model
C. Figures
Bibliography


1. Introduction

1.1. Networks

Nowadays, the word network immediately evokes the idea of physically wired networks such as those formed by telephone lines or computer links. The concept of a network, however, is a good deal more general than this.

Mathematically speaking, networks are graphs, i.e. a set of nodes (of whatever kind) connected by edges (links, connections) between certain pairs.

This abstraction has been known for a long time. Probably the first paper on graph theory was written by Euler [1] on the so-called "bridge problem of Königsberg". Euler discusses whether or not it is possible to make a round walk crossing each of Königsberg's seven bridges exactly once (figure 1.1).
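Euler's argument can be sketched in a few lines. The sketch below assumes the classical layout of four land masses joined by seven bridges (the labels A to D and the function names are our own choice, not taken from [1]) and applies the criterion that a closed walk using every edge exactly once can exist in a connected graph only if every node has an even number of edge ends.

```python
from collections import Counter

# Seven bridges between four land masses; the labels A-D are hypothetical.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def degrees(edges):
    """Count the edge ends at each node (parallel edges count separately)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def has_round_trip(edges):
    """Euler's criterion for a connected multigraph: all degrees are even."""
    return all(d % 2 == 0 for d in degrees(edges).values())

print(dict(degrees(BRIDGES)))     # every land mass has an odd degree
print(has_round_trip(BRIDGES))    # so no round trip exists
```

Since all four degrees are odd, the check fails, which is the negative answer Euler arrived at.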

The concept of networks can be applied to many theoretical or experimental subjects [2-4]: nodes can be people [5], Internet servers [6], scientists [7, 8] or others, and the range of links comprises e-mails [9], friendships [5], citations [10, 11] and more.

Thus, there are numerous different kinds of networks, physical ones (e.g. hard wired) as well as logical (e.g. dependencies) or social ones (e.g. contacts, friendships), stretching out to topics far from wired networks [9, 12]. The area is under vigorous research. Good reviews can be found in [2-4, 13, 14].

1.2. Six Degrees of Separation

From personal experience, nearly everybody has been confronted with what we call the small world effect. There are numerous examples:

At a party, we find out that we know some stranger we just started talking to via only a few intermediaries (or, technically speaking, we are separated from him only by a low degree). E.g., he could be our next-door neighbor's colleague's son. We hear "My god, what a small world".

Rumors are another example. We are astonished at the pace at which they spread. After a few hours, and thus only a few opportunities for passing a rumor on to others, the whole city seems to know.

Milgram [15] conducted an experiment on this. He instructed a set of people to try to send a letter to some stranger using only personal contacts. He found that an astonishingly short chain of social links is needed for this task, which entered everyday language as Six Degrees of Separation. Recently, this has been reviewed on a more popular basis by a German weekly newspaper [5].

1.3. Small World Effect

Six Degrees of Separation is only one manifestation of a more general principle: the small world effect [16-18].

Figure 1.1.: Königsberg bridge problem: Is it possible to make a round trip, passing each bridge exactly once? [1]

Observations of many real-world networks in computer science, biology, chemistry, linguistics, sociology, etc. have revealed a crucial difference from regular lattices.

Regarding average (or sometimes maximum) path lengths on such networks, we would expect to see a linear increase with the number of nodes. Instead, we observe distances growing logarithmically with system size.

This behavior is not only an amusing effect but has far-reaching consequences [19]. Prominent examples are the Internet's stability against attacks [20], disease spreading [21, 22] or path finding strategies [23, 24].

1.4. Science Collaboration Networks

In the context of science, the network with scientists as nodes of the graph is of particular interest. This network belongs to the group of social ones, with humans as nodes. Unlike many other forms of social relationships, which are mostly quite difficult to capture objectively, the field of published papers is very widely covered by the Science Citation Index [25] and so easily available to research.

2. Network Models

There are many types of networks competing to describe observations made in socio-physics. After discussing which measurements describe a given network's structure, we will give a short overview of what we think are the most important ones and discuss their advantages and possible disadvantages.

2.1. Measurements

2.1.1. Small World Effect

As illustrated in the introduction, we are interested in how average (or maximum) path lengths relate to network size. We investigate whether the behavior is linear, logarithmic or otherwise. In the logarithmic case, the net is said to show the Small World Effect.
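The measurement itself can be illustrated with a breadth-first search. The following is a minimal sketch, assuming an unweighted graph stored as adjacency lists; the helper names `ring`, `distances_from` and `average_path_length` are our own. It evaluates a plain ring, where the average path length grows roughly in proportion to the number of nodes, i.e. the opposite of small-world behavior.

```python
from collections import deque

def ring(n):
    """Adjacency lists of a ring of n nodes, each linked to its two neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}

def distances_from(adj, start):
    """Breadth-first search: number of links from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean distance over all ordered pairs of distinct, connected nodes."""
    total = pairs = 0
    for s in adj:
        for t, d in distances_from(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Doubling the ring size roughly doubles the average distance.
for n in (10, 20, 40):
    print(n, average_path_length(ring(n)))
```

Applying the same function to a network model of growing size shows whether its distances grow linearly or logarithmically.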

2.1.2. Clustering

In a friendship network, we often find the friends of one person to be friends themselves. This is true for most social networks and even for other ones. Links are not spread randomly but arranged in clusters.

To describe this mathematically, we introduce a clustering coefficient $C_i$ of a node $i$, describing the portion of $m$ established links between all $k_i$ next neighbors compared to the maximum possible number $M = \binom{k_i}{2}$, i.e.

$$C_i = \frac{m}{M}.$$

This value is averaged over all vertices to give a clustering coefficient $C$ for the whole network.

Typical values observed are far above the results expected for random networks [2, p. 50].
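As an illustration of the definition, the following sketch computes $C_i$ and the network average for a small toy graph. The graph, the function names and the convention $C_i = 0$ for nodes with fewer than two neighbors are assumptions made for this example.

```python
from itertools import combinations

def clustering_coefficient(adj, i):
    """C_i = m / M: links among the k_i neighbors of i over k_i*(k_i-1)/2."""
    neighbors = adj[i]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumed convention: no neighbor pairs, so C_i = 0
    m = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return m / (k * (k - 1) / 2)

def average_clustering(adj):
    """Clustering coefficient C of the whole network (mean of all C_i)."""
    return sum(clustering_coefficient(adj, i) for i in adj) / len(adj)

# Toy friendship network: a triangle 0-1-2 plus node 3 attached to node 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(clustering_coefficient(adj, 0))   # one of three neighbor pairs is linked
print(average_clustering(adj))
```

Node 0 has three neighbors of which only one pair is linked, so $C_0 = 1/3$, while nodes 1 and 2 each have $C_i = 1$.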

2.1.3. Scale-Free Behavior

A third observation regarding social networks concerns their distribution of degrees. Regarding the frequency of vertices of a given coordination number, we do not find an exponential law but a power law [26].

In all, we have three possible means to classify networks. Many social graphs show small path lengths, high clustering and scale-free behavior.

2.2. Regular Lattices

The simplest form of a lattice is a symmetrical formation of nodes connected by edges between all pairs of nodes (or all pairs of adjacent nodes) as shown in figure 2.1a,b. Reasons to choose this linking are e.g. to simulate neighborhood in a town etc.

Such networks show a high degree of clustering, as desired. The average path lengths are very long, though, and scale with system size. So, small world behavior cannot be found, which makes the model inappropriate for our needs.

Figure 2.1.: Network types: a, b regular lattices, c random network, d scale-free graph [14]

2.3. Erdős-Rényi Random Networks

Random networks are the extreme on the other side of the spectrum. A number of nodes is wired by pure chance, i.e. we throw dice to select two nodes and place an edge between them.

Such graphs were first proposed by Solomonoff and Rapoport [27] and have been extensively studied by Erdős and Rényi [28]. Current results have been reviewed in [29]. A typical result can be seen in figure 2.1c.

We find that average path lengths behave logarithmically with network size. While this is as desired for small world simulation, obviously there is no clustering.
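The wiring rule can be sketched as follows. This is a minimal illustration rather than the procedure used in [27, 28]: it assumes a fixed number of edges, discards repeated pairs, and uses a seeded generator so that the result is reproducible. The function name and parameters are our own.

```python
import random

def random_graph(n, n_edges, seed=0):
    """Place n_edges distinct edges between randomly drawn pairs of n nodes."""
    if n_edges > n * (n - 1) // 2:
        raise ValueError("more edges requested than distinct pairs exist")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n), 2)       # "throw the dice": two different nodes
        edges.add((min(u, v), max(u, v)))    # ignore orientation and repeats
    return edges

edges = random_graph(n=100, n_edges=200)
degree = [0] * 100
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
print(len(edges), sum(degree) / 100)         # 200 edges, mean degree 4.0
```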

2.4. Watts-Strogatz Small-World Networks

It is a plausible idea to try combining both presented models in order to unite their respective advantages. A big step towards this goal was made by Watts and Strogatz [16].

Their model starts with a circular graph that is regularly wired (figure 2.2). Step by step, edges are chosen by chance and rewired to an arbitrary destination node. Thus, a small fraction of links are long-range ones. To illustrate, these could be inhabitants of a street of neighbors having relationships with far-away relatives.
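The rewiring step can be sketched as below. The details are assumptions made for illustration and are not taken from [16]: each node starts with links to its `k` nearest neighbors on one side of the ring, every edge is rewired with probability `p`, and a rewiring attempt is skipped if it would create a self-loop or a duplicate link.

```python
import random

def ring_lattice(n, k):
    """Ring of n nodes; node i is linked to i+1, ..., i+k (modulo n)."""
    return {(i, (i + j) % n) for i in range(n) for j in range(1, k + 1)}

def rewire(edges, n, p, seed=0):
    """With probability p, move the far end of each edge to a random node."""
    rng = random.Random(seed)
    result = set(edges)
    for u, v in sorted(edges):
        if rng.random() < p:
            w = rng.randrange(n)
            # skip self-loops and links that already exist in either direction
            if w != u and (u, w) not in result and (w, u) not in result:
                result.remove((u, v))
                result.add((u, w))
    return result

lattice = ring_lattice(n=20, k=2)
small_world = rewire(lattice, n=20, p=0.1)
# The number of links is conserved; a few of them now span the ring.
print(len(lattice), len(small_world), len(small_world - lattice))
```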

At first, the model seems to fulfill our desires. It shows small average path lengths as well as high clustering. Looking at the degree distribution, i.e. the frequency of nodes of a certain degree, however, we find strong differences from real-world data as there is no scale-free behavior.

Figure 2.3.: Barabási and Albert [30]

2.5. Barabási-Albert Network Model

Barabási and Albert [30] introduced a new idea. Their model consists of two ingredients: growth and preferential attachment.

We start with a graph of $m_0 = 3$ vertices, each one connected to each other. Now, in each time step, we add a node that is connected to others by $m = 3$ links. The new node forms one end of each link; the other end is chosen at random from the existing network. The probability of a vertex being selected is proportional to the number of links already attached to it.
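The growth rule can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code of the thesis appendix: the start graph has $m_0 = m$ vertices, a new node never links twice to the same target, and preferential attachment is realized by drawing uniformly from a list in which every node appears once per attached link.

```python
import random

def barabasi_albert(n, m=3, seed=0):
    """Grow a network to n nodes; each new node attaches m links preferentially."""
    if m < 2:
        raise ValueError("this sketch needs m >= 2 so the start graph has links")
    rng = random.Random(seed)
    # Start with m0 = m fully connected vertices (m0 = m = 3 in the text).
    edges = [(i, j) for i in range(m) for j in range(i + 1, m)]
    # Each node is listed once per attached link, so a uniform draw selects
    # a node with probability proportional to its degree.
    ends = [node for edge in edges for node in edge]
    for new in range(m, n):
        targets = set()
        while len(targets) < m:              # m distinct existing nodes
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

edges = barabasi_albert(n=200)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(len(edges), min(degree.values()), max(degree.values()))
```

The early nodes typically collect far more links than the late ones, which is the "rich get richer" effect that produces the broad degree distribution.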

To stay with the picture: If you already have lots of friends, you are more likely to get new ones. "The rich get richer."¹

These rules result in a network (figure 2.1d) that is capable of reproducing small-world behavior, as well as being scale-free. Research has found good agreement with empirical networks, including e.g. the world wide web [6]. Good introductions can be found in [32, 33].

Clustering is present to a certain degree, but still much too small compared with experimental values.

¹ "Whoever has will be given more, and he will have an abundance." [31].

Figure 2.2.: Watts-Strogatz network model: We start with a regular lattice (a) formed to a ring (b) and re-wire a small fraction of links to random destinations (c) [18]

2.6. New Approaches

Recently, new network models have been developed to cope with the shortcomings of existing ones.

Ravasz and Barabási [35] examined networks with a self-similar structure imitating the idea of hierarchical organization in sociology. While combining high clustering and scale-free behavior, their model does not show short path lengths, though.

Klemm and Eguíluz [34]² developed a promising model joining all three demands in one network. The authors present a generalization of the Barabási-Albert model, adding aging of nodes and some random behavior.

Further research on this model will be needed to verify whether it matches reality.

An overview of all models can be found in figure 2.4.

² cf. also [36]

Figure 2.4.: Overview of recent network models (Erdős and Rényi [28], Watts and Strogatz [16], Barabási and Albert [30], Klemm and Eguíluz [34], Ravasz and Barabási [35]). The diagram arranges the regular lattice, the Erdős-Rényi random network, the Watts-Strogatz small world, the Barabási-Albert model, the Ravasz-Barabási hierarchical model and the Klemm-Eguíluz model by the properties scale-free, high clustering and short average path lengths.

3. Empirical Collaboration Network

In the context of science, the network with scientists as nodes of the graph is of particular interest.

3.1. Typology

First, we want to deal with the definition of a collaboration graph. As to the nodes, we have the choice to identify each vertex either with an author or with a paper.

The second possibility is also an area of research [37], but we think studying the relationship of scientists as the papers' authors offers more insight into how research works. So we will make each scientist a node of our network.

As far as the edges are concerned, there are basically two possible choices, both covered equally by the database used [25].

3.1.1. Citation Graph

We might choose to consider citations from one author to another as links [10], thus resulting in a directed graph.

Starting at a given paper, we can enlarge our network by following links recursively up to a certain depth, e.g. by depth-first or breadth-first algorithms. Each new work will cite several to many papers not yet included. Roughly speaking, the number of publications to include will rise exponentially with the maximum depth chosen.

Quickly, we arrive at huge amounts of data. Additionally, there is no canonical end of the hunt for new links. In the extreme case we could be caught in a giant cluster containing all or nearly all of the papers ever published. We see no possibility to narrow this down in a reasonable way without fear of introducing arbitrary boundary conditions.

3.1.2. Collaboration Graph

The second possibility to define the edges of a graph is creating links by co-authorship of one or several papers [7, 8]. If $n$ scientists publish a paper together, they are connected to each other by $\binom{n}{2}$ edges.

As an additional advantage, we have the choice to start with an arbitrary set of authors, establishing links between them by looking at all papers they are involved in. This will result in a graph of limited size.

Of course, we should think carefully about a reasonable selection, to avoid edge effects. We will discuss this in the next section.

3.2. Building the net

3.2.1. Proceeding

As a solution, we choose the following procedure: We start with one paper. As one part of our work will be the comparison of real-world data to Barabási-Albert networks, we take the corresponding paper [30] as the center of investigation.

In order to determine the set of authors we want to deal with, we select all 185 papers that cite this paper.¹ Secondly, we construct a list of unique authors from all these papers. A first approach delivers 559 scientists, some of whom turned out to be identical but appearing in particular papers with typos. We finish with a set of 555 authors to whom we attribute consecutive numbers.

The last step of the network creation process consists in establishing links between all these authors. This is done by selecting one paper after the other and introducing connections between each possible pair of this paper's authors (i.e. $\binom{n}{2}$ links for $n$ authors).

Eventually, this gives us a graph of 555 nodes representing scientific collaboration in the area of Barabási-Albert networks. The network size is relatively small compared to all data in the Science Citation Index [25] (approx. $10^7$ papers). By studying properties of this subnet, we hope to gain insight into what leads to the structure observed. Verification with bigger networks remains a task for the future.
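The construction steps described in this section can be sketched as follows. The author lists are invented for illustration and the function name is our own; the real input would be the author lists of the citing papers, after merging misspelled names.

```python
from collections import Counter
from itertools import combinations

def build_collaboration_graph(papers):
    """Number the unique authors and link every pair of co-authors of a paper.

    Returns (index, links): index maps author name -> consecutive number,
    links maps a pair of numbers -> number of papers written together.
    """
    index = {}
    links = Counter()
    for authors in papers:
        ids = sorted({index.setdefault(name, len(index) + 1) for name in authors})
        for pair in combinations(ids, 2):    # n authors give n*(n-1)/2 pairs
            links[pair] += 1
    return index, links

# Hypothetical author lists, not taken from the database used here.
papers = [["Adams", "Baker", "Clark"], ["Adams", "Baker"], ["Davis"]]
index, links = build_collaboration_graph(papers)
print(index)
print(dict(links))
```

Keeping the number of joint papers per pair makes it possible to evaluate both weighted and unique links later on.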

3.2.2. Visualization

To get an idea of what we are dealing with, we visualize the graph using a spring model [38, 39]. In order to give manageable results we remove a paper on the Human Genome Project [40] with 274 authors. Brief examination shows that this does no harm, as the scientists participating in this work did not cooperate with others in our graph, and form a big cluster on their own. The result is shown in figure 3.1.

¹ We have to be careful not to mix citation data from different dates as new papers are continuously added to the database. The basis of our investigation is October 21st, 2002.

3.3. Analysis

3.3.1. Authors per paper

authors  frequency
1        37
2        69
3        47
4        21
5         6
6         4
7         1

Table 3.1.: Authors per paper

The first thing we are interested in is the frequency distribution of the number of authors per paper. We expect to see many papers with few authors and few papers with many authors (table 3.1).

Figure 3.2.: Frequency distribution of the number of authors per paper (frequency vs. number of authors; fitted curves ~x^3.58*exp(-x/0.54) and ~x^-3.3)

Figure 3.1.: Collaboration network with 555 nodes (plotted using GraphViz package [38, 39])

Indeed, the plot considered (figure 3.2) shows this behavior with one remarkable exception. There are far too few papers written by only one author. This could be due to the fact that collaboration helps in science, but the more (scientific!) partners you have, the slower your scientific output gets as communication overhead increases.

In other words: establishing scientific relationships with other authors is not easy. You have to agree on the field of research, coordinate your efforts etc. Postulating that cooperation with more scientists is always favorable, we can explain the statistics by the difficulty of finding new partners. This difficulty even increases with the number of co-workers you already have, as additional coordination is needed. The risk of research overlap rises, too.

The power law predicted by Lotka [11] with an exponent of -2 cannot be confirmed. This could be due to insufficient statistics for this test. Other recent studies of collaboration networks found an exponent of 2.1 or 2.4 [41], which is another indication that statistical errors dominate the results of our study.

3.3.2. Connections per author

Next, we study the number of connections per author, which is the number of other scientists an author has ever published papers with. This number is weighted by the number of papers, i.e. a coauthor with whom a scientist published $n$ papers contributes $n$ connections (table 3.2).
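The difference between weighted and unique connections can be made concrete with a short sketch. The input format (a mapping from author pairs to the number of joint papers) and the example data are assumptions made for illustration.

```python
from collections import Counter

def connections_per_author(links, n_authors):
    """Return (weighted, unique) connection counts for authors 1..n_authors.

    links maps an author pair (a, b) to the number of papers they share.
    """
    weighted = Counter({a: 0 for a in range(1, n_authors + 1)})
    unique = Counter(weighted)
    for (a, b), n_papers in links.items():
        for author in (a, b):
            weighted[author] += n_papers     # each joint paper counts once
            unique[author] += 1              # each coauthor counts once
    return weighted, unique

# Hypothetical example: authors 1 and 2 share two papers, author 4 is isolated.
links = {(1, 2): 2, (1, 3): 1, (2, 3): 1}
weighted, unique = connections_per_author(links, n_authors=4)
# Frequency of each connection number, i.e. one column of a table like 3.2.
print(sorted(Counter(weighted.values()).items()))
print(sorted(Counter(unique.values()).items()))
```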

Again, we expect to see the frequency decrease with an increasing number of connections. The experimental data (figure 3.3) shows smaller frequencies for "isolated" authors that never publish with others as well as for authors with only one coauthor. This is comparable to the effect observed in the last plot. The most productive seem to be authors with two or three colleagues they are working with.

links  weighted  unique
0      16        16
1      51        67
2      75        69
3      61        67
4      24        27
5      21        22
6      10         2
7       5         6
8       4         3
9       2         -
10      3         -
11      1         -
13      2         1
14      1         -
15      1         -
17      1         1
20      2         -
29      1         -
273    274       274

Table 3.2.: Connections per author

Figure 3.3.: Frequency distribution of the number of connections per author (black: weighted, grey: unique; log-log axes, frequency vs. number of connections; fitted power laws x^-2.85 and x^-3.53)

Statistical data in the area of highly connected authors shows a truncated power law. The exponent of approximately -2.85 falls well within the region bounded by analyses of other scale-free networks (www: around 2.3 [6, 26, 42]). The sharp or exponential cutoff at very high connection numbers has been reported for other networks, too [7, 43]. Mossa et al. [44] offer an explanation using a model with limited (local) information on the network. Surely, no scientist knows all others, so this could lead to the observed effect.

3.3.3. Double vs. unique links

We are interested in how things change when we cease weighting connections by the number of papers published together, i.e. we only take into account how many other unique scientists a researcher published papers with. Results can be found in table 3.2.

We see that despite the different numbers, results are qualitatively the same. Scientists working together with two other authors are the most productive.

Depending on whether your glass is half full or half empty there are two contrary explanations:

1. It is common practice to name persons as authors of your work who did not contribute to it, out of a feeling of debt, be they sponsors or others.

2. Science lives from cooperation. Working together on one subject increases scientific output whilst reducing errors.

The author of this paper will not judge.

3.3.4. Cluster sizes

size  frequency
1     16
2     28
3     15
4     13
5      2
6      3
7      2
8      2
9      2
10     1
26     1
274    1

Table 3.3.: Cluster sizes

Our last focus is on subnets of science that exist in our net of collaboration. Authors group into several clusters by the connections established between them. We investigate the frequency of clusters of a given size. Our expectation is a frequency increase for growing cluster sizes up to a peak, and then a decay as clusters grow very big, comparable to the statistics we saw already.
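Cluster sizes of this kind can be obtained by grouping nodes into connected components. The sketch below uses a union-find structure, which is our own choice of technique and not necessarily the method used for table 3.3; the small example graph is invented. As a consistency check on the data given above, the cluster sizes of table 3.3 weighted by their frequencies add up to the 555 authors of the network.

```python
from collections import Counter

def cluster_sizes(n_nodes, edges):
    """Sizes of the connected components of a graph with nodes 0..n_nodes-1."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)            # merge the two clusters
    return list(Counter(find(x) for x in range(n_nodes)).values())

# Invented example: clusters {0, 1, 2}, {3, 4} and two isolated nodes.
sizes = cluster_sizes(7, [(0, 1), (1, 2), (3, 4)])
print(sorted(Counter(sizes).items()))        # (size, frequency) pairs

# Values of table 3.3: size -> frequency.
table_3_3 = {1: 16, 2: 28, 3: 15, 4: 13, 5: 2, 6: 3,
             7: 2, 8: 2, 9: 2, 10: 1, 26: 1, 274: 1}
print(sum(size * freq for size, freq in table_3_3.items()))   # 555 authors
```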

The experimental data (table 3.3, figure 3.5) shows this behavior, but with one surprise: although the most frequent cluster size is 2 due to a large number of publications with two authors, most scientists maintain collaboration with three others.

This can only be due to scientists being members of research groups involved in different themes, thus connecting different clusters formed by single papers. This can directly be verified by the graphical representation (figure 3.4) of the clusters, ordered by size.

Figure 3.4.: Collaboration clusters, ordered by number of participants. The shaded box no. 458 represents my professor, D. Stauffer.

Figure 3.5.: Frequency distribution of the cluster size (log-log axes, frequency vs. cluster size; fitted power law x^-2.61)

3.4. Comparison

We have collected some statistical figures to express the structure of the network concerned. Now we want to find out whether classical or current network models yield similar results.

3.4.1. Erdős-Rényi random graphs

Connections per author In a random graph the distribution is a binomial one, i.e. the probability of a node with $k$ connections in a net with $N$ nodes is

$$P(k) \propto \binom{N-1}{k} p^k (1-p)^{N-1-k}.$$

In the limit of large $N$ this approaches a Poisson distribution around the expectation value $\langle k \rangle = pN$:

$$P(k) \propto e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}.$$

This is contrary to the power-law statistics observed in the collaboration network.
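How closely the two distributions agree for a network of this size can be checked numerically. The sketch below takes $N = 555$ from the collaboration network, while the mean degree of 3 is an illustrative assumption and not a value fitted to the data; both functions use the normalized forms of the expressions above.

```python
from math import comb, exp, factorial

def binomial(k, n, p):
    """Probability that a node has k of the n - 1 possible connections."""
    return comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)

def poisson(k, mean):
    """Poisson probability of k connections around the given mean."""
    return exp(-mean) * mean**k / factorial(k)

n, mean = 555, 3.0           # mean degree is an assumed, illustrative value
p = mean / n
for k in range(0, 11, 2):
    print(k, round(binomial(k, n, p), 4), round(poisson(k, mean), 4))
```

Both columns fall off faster than any power law for large $k$, which is the qualitative difference from the collaboration data.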

Cluster sizes For random graphs, percolation theory predicts that the cluster size distribution shows an exponential decay for big cluster sizes [2, 45].

Again, the considered network shows a power-law decay rather than an exponential one.

Unsurprisingly, the structure of scientific collaboration differs fundamentally from that of a random network.

3.4.2. Watts-Strogatz small-world networks

Connections per author The degree distribution of Watts-Strogatz small-world networks is similar to that of a random graph [2]. It has a peak and decays exponentially for large connection numbers, contrary to the collaboration network.

Cluster sizes The usual case in Watts-Strogatz networks is re-wiring of only a small portion of links. Thus, the network stays well connected, mostly forming one giant cluster.

This model does not describe scientific collaboration well either.

3.4.3. Barabási-Albert networks

Connections per author Barabási-Albert networks show a vertex degree distribution of $P(k) \propto k^{-3}$ [30, 46]. In our science collaboration network we found exponents of 2.85 and 3.53, respectively, which is only a slight deviation.

Cluster sizes In Barabási-Albert networks, new sites are added with links to already existing nodes. Consequently only one giant cluster forms. Obviously this differs crucially from the net of co-authorship.

4. Spin models

4.1. Leadership effect

4.1.1. Ising model

In 1925, Ising [47] published a paper on a model of spin interaction that later became very famous. The idea for it had been given to him by his teacher Lenz [48], so it is sometimes referred to as the Lenz-Ising model.¹

The idea is to consider spins (e.g. on a square lattice) and an interaction Hamiltonian

$$H = -\sum_{i \neq j} J_{i,j} S_i S_j$$

where $S_i$ are the spins and $J_{i,j}$ is a matrix describing the interaction forces. Usually we consider the case

$$J_{i,j} = \begin{cases} J & i, j \text{ nearest neighbors} \\ 0 & \text{else,} \end{cases}$$

i.e. only allow equal interaction between nearest neighbors ($J > 0$ for ferromagnetic behavior).

In the following chapter, we investigate how such a model behaves on our constructed collaboration network.

¹ A generalization of the Ising model is the Potts model [49, 50]. Instead of Ising spins with two possible states $+1$ and $-1$, Potts allows $k \geq 2$ different spin values. The Hamiltonian is

$$H = -J \sum_{i \neq j} \delta_{S_i, S_j},$$

$i, j$ being nearest neighbors. Applying it to the scientific network, I find results very similar to those of Ising's model.

Figure 4.1.: Ising [47]

We use a Metropolis [51] Ising model, i.e. the probability of a single spin flip is $p \propto e^{-\Delta E / k_B T}$ if $\Delta E > 0$ and 1 otherwise.

To determine $\Delta E$ we sum up the spins of all vertices connected to a given node. In principle, we have the choice between two procedures:

• consider only unique links between two nodes
• count a connection several times according to the number of links, i.e. the number of papers the corresponding authors published together.

Both possibilities have been examined.^2

^2 change the switch NODOUBLE in line 17 of the source code in section B.1
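The two ways of counting links and the Metropolis acceptance rule can be sketched as follows. This is a minimal illustration, not the program of appendix B: the adjacency-list type `Graph` and the helpers `uniqueLinks`, `deltaE`, `flipProbability` and `metropolisStep` are names assumed here, and the energy change is taken, as in the appendix code, to be $\Delta E = 2 J S_i \sum_j S_j$ with every link counted once.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical representation: neighbours[i] lists the partners of node i.
// A pair that published several papers together appears several times.
using Graph = std::vector<std::vector<int>>;

// Collapse repeated links so that every pair of co-authors counts once.
Graph uniqueLinks(Graph g) {
    for (auto& nb : g) {
        std::sort(nb.begin(), nb.end());
        nb.erase(std::unique(nb.begin(), nb.end()), nb.end());
    }
    return g;
}

// Energy change in units of J if spin i were flipped:
// deltaE = 2 * S_i * (sum of the neighbouring spins).
int deltaE(const Graph& g, const std::vector<int>& spin, int i) {
    int sum = 0;
    for (int j : g[i]) sum += spin[j];
    return 2 * spin[i] * sum;
}

// Metropolis acceptance probability: exp(-deltaE / kT) if deltaE > 0, else 1.
double flipProbability(int dE, double kT) {
    assert(kT > 0.0);
    return dE > 0 ? std::exp(-dE / kT) : 1.0;
}

// One attempted flip of spin i; returns the (possibly new) spin value.
template <class Rng>
int metropolisStep(const Graph& g, std::vector<int>& spin, int i,
                   double kT, Rng& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    if (u(rng) < flipProbability(deltaE(g, spin, i), kT)) spin[i] = -spin[i];
    return spin[i];
}
```

Passing the graph through `uniqueLinks` first corresponds to the first procedure; using it unchanged corresponds to the second, where a pair with several joint papers is bound more strongly.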


4.1.2. Phase transition

The results of both experiments are qualitatively similar. We observe a rounded phase transition at about $k_B T / J = 0.8$, as opposed to a value of about 2.3 on the regular square lattice.

A closer look reveals that the decay of magnetization with rising temperature is exponential. This result agrees with research on Ising models on Barabási-Albert networks by Aleksiejuk, Hołyst, and Stauffer [52], who also found an exponential law.

However, the critical temperatures found by me and by Aleksiejuk et al. [52] differ by more than one order of magnitude. This can easily be explained by the different coordination numbers in the two networks. The collaboration graph has a maximum of 14 neighbors per vertex, whereas the graph of Aleksiejuk et al. [52] exceeds this by several orders of magnitude. This makes it far more "difficult" to break the ferromagnetic bonds, resulting in a higher critical temperature.

4.1.3. Degree distribution

Dorogovtsev et al. [53] studied random graphs with a given distribution $P(k)$ of vertex degrees $k$. They derived an estimate for the critical temperature of an Ising model on such networks:

$$\frac{J}{k_B T_c} = \frac{1}{2} \ln\left( \frac{\langle k^2 \rangle}{\langle k^2 \rangle - 2 \langle k \rangle} \right).$$

Considering the collaboration network as a random graph with given degree distribution (table 4.1), we use their formula and get $T_c / J = 2.91$, counting only unique links between scientists. Using all links, we find $T_c / J = 5.46$.

degree  several  unique
0       16       16
1       51       67
2       75       69
3       61       67
4       24       27
5       21       22
6       10       2
7       5        6
8       4        3
9       2        -
10      3        -
11      1        -
13      2        1
14      1        -
15      1        -
17      1        1
20      2        -
29      1        -

Table 4.1.: Degree distribution

Both values are far from the critical temperatures observed in the simulation. This strongly suggests that collaboration networks are crucially different from random networks, even with the same degree distribution.
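The estimate can be checked with a few lines of code. The sketch below evaluates the formula of Dorogovtsev et al. on the two columns of table 4.1; the names `Histogram`, `criticalTemperature`, `uniqueLinksHistogram` and `allLinksHistogram` are chosen here for illustration, and the rows of the table that carry a single entry are assumed to belong to the "several" column. Under that assumption it gives about 2.91 for unique links and about 5.45 for all links.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// (degree k, number of vertices with that degree)
using Histogram = std::vector<std::pair<int, int>>;

// k_B T_c / J from  J / (k_B T_c) = 1/2 ln( <k^2> / (<k^2> - 2<k>) ).
// The number of vertices cancels in the ratio, so plain sums suffice.
double criticalTemperature(const Histogram& histogram) {
    double sumK = 0.0, sumK2 = 0.0;
    for (const auto& entry : histogram) {
        const double k = entry.first, n = entry.second;
        sumK += n * k;
        sumK2 += n * k * k;
    }
    assert(sumK2 > 2.0 * sumK);  // otherwise the estimate is not finite
    return 2.0 / std::log(sumK2 / (sumK2 - 2.0 * sumK));
}

// Column "unique" of table 4.1.
Histogram uniqueLinksHistogram() {
    return {{0, 16}, {1, 67}, {2, 69}, {3, 67}, {4, 27}, {5, 22},
            {6, 2},  {7, 6},  {8, 3},  {13, 1}, {17, 1}};
}

// Column "several" of table 4.1 (assumption: single-entry rows belong here).
Histogram allLinksHistogram() {
    return {{0, 16}, {1, 51}, {2, 75}, {3, 61}, {4, 24}, {5, 21},
            {6, 10}, {7, 5},  {8, 4},  {9, 2},  {10, 3}, {11, 1},
            {13, 2}, {14, 1}, {15, 1}, {17, 1}, {20, 2}, {29, 1}};
}
```

Calling `criticalTemperature(uniqueLinksHistogram())` and `criticalTemperature(allLinksHistogram())` yields the two estimates; vertices of degree 0 do not contribute to either sum.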

4.1.4. Spin flip model

Following a suggestion of Hołyst^3, we can determine the importance of the most connected authors of our collaboration network by successively flipping the most connected spins and pinning them in their new position.

In other words: after some time of equilibration, we choose the author who has the most connections to others and change his/her spin permanently to a value of −1, opposite to all others (at T = 0, or nearly all others otherwise). Subsequently, we allow the system to relax for some time, after which we permanently flip the second most connected spin, and so on.

^3 personal correspondence, cf. [52]
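The order in which spins are pinned, and how many are pinned at a given time, can be sketched as below. This is an illustration only: in the program of appendix B.2 the most connected spins are hard coded, whereas the hypothetical helpers `mostConnected` and `pinnedBy` used here derive them from an adjacency list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // neighbours[i] = partners of i

// Indices of the `count` most connected vertices, highest degree first.
// Ties keep their original index order (stable sort).
std::vector<int> mostConnected(const Graph& g, std::size_t count) {
    assert(count <= g.size());
    std::vector<int> order(g.size());
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(), [&g](int a, int b) {
        return g[a].size() > g[b].size();
    });
    order.resize(count);
    return order;
}

// Number of spins already pinned at time t: the first one after
// `equilibration` steps, one more after every further `relaxation` steps.
std::size_t pinnedBy(long t, long equilibration, long relaxation) {
    assert(equilibration >= 0 && relaxation > 0);
    if (t < equilibration) return 0;
    return static_cast<std::size_t>((t - equilibration) / relaxation) + 1;
}
```

A simulation loop would pin `mostConnected(g, n)[pinnedBy(t, ...) - 1]` to −1 whenever `pinnedBy` increases and exclude pinned vertices from further Metropolis updates.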

Figure 4.2.: Ising model with successive spin flips. After $10^5$ steps of equilibration, we flip the most connected spin and pin it to its new value. After a relaxation of $10^4$ steps, this step is repeated. Both the network with multiple links and the network with unique links are used. Averaged over 1000 runs. T = 0.2. (Plot of magnetization M vs. time t; curves: unique, multiple.)

Results are shown in figure 4.2. We observe two things:

1. Even after switching the 20 most connected spins, the system does not flip into the opposite state with all spins pointing down. In the simulations of Aleksiejuk et al. [52], fewer than 6 spins were enough to flip a whole network of 30,000 nodes.

This is quite obviously due to the fact that we do not have a contiguous graph, but one consisting of different clusters. A spin flip in one cluster cannot affect spins in others.

In the picture of spins representing opinions (yes/no, etc. [54]), this means that a few authors whose view differs from that of the broad mass of scientists are hardly capable of changing the global opinion, even if they are the most connected (known) ones.

2. Allowing multiple links in our net, we expect the magnetization to break down much faster, as a flipped spin is able to influence others more strongly.

Yet the simulation shows the contrary. The graph containing only unique links shows a much steeper decay of magnetization (figure 4.2).

A possible explanation is that choosing the most connected spins in a network having only unique links picks authors with connections to many other authors, whereas in a network allowing multiple links, there are also spins connected strongly to only a few others.

It seems that in order to spread new opinions, it is more advantageous to have a small influence on many other people than a big impact on only a few.

4.2. Cluster limited Ising models

The network we are looking at consists of many distinct clusters of different sizes (figure 3.4). We may ask whether they differ in their properties or behave alike.

4.2.1. Procedure

We split up the network into sub-nets, each containing one cluster, labeled 2, 3a, 3b, ..., 26 (numbers and letters as in figure 3.4).

On each net, we run an Ising model for temperatures from 0.1 to 6.9 in steps of 0.2. Each of these simulations runs for $10^6$ steps, with the magnetization measured every 100 steps, thus resulting in $10^4$ measurements per run, to give good statistics.
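Splitting the network into its clusters amounts to finding connected components. A minimal sketch, assuming an adjacency-list graph; the names `clusterLabels` and `temperatureGrid` are illustrative and do not appear in the thesis code:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Graph = std::vector<std::vector<int>>;  // neighbours[i] = partners of i

// Label every vertex with the index of its cluster (connected component),
// using an iterative depth-first search.
std::vector<int> clusterLabels(const Graph& g) {
    std::vector<int> label(g.size(), -1);
    int next = 0;
    for (std::size_t start = 0; start < g.size(); ++start) {
        if (label[start] != -1) continue;  // already reached from elsewhere
        std::vector<int> stack{static_cast<int>(start)};
        label[start] = next;
        while (!stack.empty()) {
            const int v = stack.back();
            stack.pop_back();
            for (int w : g[v])
                if (label[w] == -1) {
                    label[w] = next;
                    stack.push_back(w);
                }
        }
        ++next;
    }
    return label;
}

// Temperatures first, first + step, ..., last (e.g. 0.1 to 6.9 in 0.2 steps).
// The count is fixed up front to avoid accumulating rounding errors.
std::vector<double> temperatureGrid(double first, double last, double step) {
    assert(step > 0.0 && last >= first);
    const int n = static_cast<int>(std::floor((last - first) / step + 0.5)) + 1;
    std::vector<double> grid;
    for (int i = 0; i < n; ++i) grid.push_back(first + i * step);
    return grid;
}
```

Each group of vertices sharing a label then forms one sub-net on which the Ising model is run for every temperature of the grid.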

4.2.2. Results

Figure 4.3.: Ising model on the different clusters of a collaboration net, averaged over $10^4$ measurements per temperature and net. (Plot of M/N vs. kT/J.)

Examining the results (figure 4.3), we see different curves that all decay with rising temperature, but show no apparent similarities. We wonder why the curves seemingly do not converge to zero but to finite values.

Obviously, the network is so small that macroscopic magnetization flips occur frequently, even at moderate temperatures. That means the expectation value of the magnetization^4 at high temperatures is not zero, but something around one!

4.2.3. Bias adjustment

To validate this hypothesis, we simulate the networks at very high temperature ($k_B T / J = 50$), in order to determine $M_\infty = M(T = \infty)$ (table 4.2).

net  M_∞
2    1.03
3a   1.71
3b   1.51
4a   1.59
4b   1.54
4c   1.55
5a   2.02
5b   1.96
6a   2.00
6b   1.94
7a   2.27
7b   2.23
8a   2.26
8b   2.29
9a   2.55
9b   2.55
10   2.54
26   4.25

Table 4.2.: Bias

^4 Throughout this publication we consider the (unsigned) value |M| as magnetization, not M! Using the latter leads to false results. E.g. at low temperatures, averaging M over very long times would give zero, as flips of the whole system occur (though with very low probability).
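The origin of this bias can be illustrated with the limiting case of completely independent spins, which is an assumption made here for illustration and not the simulation itself: for N spins that are each +1 or −1 with probability 1/2, the expectation of |M| follows from the binomial distribution. The function name `expectedAbsMagnetization` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Exact expectation of |M| for n independent spins, each +1 or -1 with
// probability 1/2. With k spins up, M = 2k - n, weighted by C(n,k) / 2^n.
double expectedAbsMagnetization(int n) {
    assert(n >= 0 && n < 1000);
    double weight = std::pow(0.5, n);  // probability of k = 0 up-spins
    double sum = 0.0;
    for (int k = 0; k <= n; ++k) {
        sum += weight * std::fabs(static_cast<double>(2 * k - n));
        weight = weight * (n - k) / (k + 1);  // C(n,k+1) = C(n,k)(n-k)/(k+1)
    }
    return sum;
}
```

For N = 2, 3, 4 and 5 this gives 1.0, 1.5, 1.5 and 1.875. If the net labels of table 4.2 are read as cluster sizes, the measured values at $k_B T / J = 50$ lie at or slightly above these free-spin numbers, as one would expect while a weak coupling is still present.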


Using these values, we rescale our simulation results from figure 4.3 by use of the scaling function^5

$$\frac{M}{N} \longrightarrow \frac{M - M_\infty}{N - M_\infty},$$

where $N$ is the total number of authors in the cluster (figure 4.4).

Figure 4.4.: Same data as in figure 4.3 but adjusted to a common scale from 0 to 1 by eliminating the bias from random fluctuations. (Plot of $(M - M_\infty)/(N - M_\infty)$ vs. kT/J; one curve per net 2, 3a, 3b, 4a, 4b, 4c, 5a, 5b, 6a, 6b, 7a, 7b, 8a, 8b, 9a, 9b, 10, 26.)

We find a much cleaner picture. Apart from one exception (6b), all curves are parallel up to the M = 0.5 line, and even beyond there are very few crossings.

Each cluster can now be characterized by the temperature $\vartheta$ at which it reaches M = 0.5. This leads to a sort of "melting temperature" (table 4.3).
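The rescaling and the read-off of the melting temperature can be sketched as follows. The thesis does not state how the M = 0.5 crossing was determined; linear interpolation between neighbouring temperature points is an assumption of this sketch, as are the names `rescale` and `meltingTemperature`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Rescaled magnetization (M - Minf) / (N - Minf):
// 1 for M = N (full order), 0 for M = Minf (pure fluctuation bias).
double rescale(double M, double N, double Minf) {
    assert(N > Minf);
    return (M - Minf) / (N - Minf);
}

// Temperature at which a rescaled curve m(T) first drops below `level`,
// found by linear interpolation between adjacent measurements.
// Returns -1 if the curve never crosses the level.
double meltingTemperature(const std::vector<double>& T,
                          const std::vector<double>& m,
                          double level = 0.5) {
    assert(T.size() == m.size());
    for (std::size_t i = 1; i < T.size(); ++i) {
        if (m[i - 1] >= level && m[i] < level) {
            const double frac = (m[i - 1] - level) / (m[i - 1] - m[i]);
            return T[i - 1] + frac * (T[i] - T[i - 1]);
        }
    }
    return -1.0;
}
```

With measurements on a 0.2 temperature grid, the interpolation resolves the crossing more finely than the grid spacing itself.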

To get an idea of this temperature's meaning, we sort the clusters' graphical representations by $\vartheta$ (figure 4.5).

It seems plausible that $\vartheta$ is a measure of the coherence or connectedness of a cluster. Single bonds lead to lower melting temperatures, fully connected subsets to higher ones.

net  ϑ
2    1.7
3a   2.8
3b   1.5
4a   3.3
4b   2.1
4c   2.6
5a   4.0
5b   2.2
6a   4.8
6b   3.5
7a   2.8
7b   1.8
8a   2.5
8b   3.7
9a   2.2
9b   3.1
10   1.9
26   3.8

Table 4.3.: Melting temperatures

^5 This function is a linear approximation that gives 1 for $M \to N$ and zero for $M \to M_\infty$.

This offers a possible explanation why 6b behaves differently from all others in figure 4.4: this cluster consists of two parts. One of them is a completely connected set of five nodes, the other one a single vertex. Both are linked by one single edge. Probably, this "conflict of interest" leads to the observed anomaly.

Additionally, the connectedness described by $\vartheta$ seems to be crucially different from the classification given by the standard clustering coefficient. E.g. the net 3a, consisting of three completely connected vertices, yields a clustering coefficient of 1 but a low melting temperature.

Figure 4.5.: Collaboration clusters from figure 3.4, ordered by $\vartheta$. (Cluster drawings with numbered nodes, arranged along a $\vartheta$ axis from 1.5 to 4.5.)

4.2.4. Linear Relationship

Surprisingly, we find a linear relationship between $N$ and $E/\vartheta$ (figure 4.6). Thus, we postulate

$$\frac{E}{\vartheta} = aN - b$$

and conclude a formula for $\vartheta$:

$$\vartheta_{\mathrm{calc}}(E, N) = \frac{E}{aN - b}.$$

Fitting the parameters to our measurements (yielding a = 0.72, b = 0.89), this gives a good prediction of the melting temperatures. Results can be seen in figure 4.7, together with a diagram showing that the errors are below 10% in most cases.

$$\vartheta_{\mathrm{calc}} = \frac{E}{aN - b} \xrightarrow{N \to \infty} \frac{\langle k \rangle}{2a}.$$

We see that, in the limit of large $N$, the melting temperature $\vartheta$ is proportional to the average number of edges per site $\langle k \rangle = 2E/N$, a result known from mean field theory.
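As a small worked sketch of the two formulas above, using the fitted values a = 0.72 and b = 0.89 as defaults; the function names `meltingTemperatureCalc` and `meltingTemperatureLimit` are chosen here for illustration:

```cpp
#include <cassert>

// theta_calc(E, N) = E / (a N - b), with E edges and N nodes in the cluster.
double meltingTemperatureCalc(double E, double N,
                              double a = 0.72, double b = 0.89) {
    assert(a * N > b);  // the fit is only meaningful for a N > b
    return E / (a * N - b);
}

// Large-N limit: with E = N <k> / 2 the offset b becomes negligible
// and theta_calc tends to <k> / (2 a).
double meltingTemperatureLimit(double meanDegree, double a = 0.72) {
    assert(a > 0.0);
    return meanDegree / (2.0 * a);
}
```

For the smallest cluster (N = 2, E = 1) the formula gives 1/0.55, i.e. about 1.8, to be compared with the measured 1.7 in table 4.3.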

Figure 4.6.: $E/\vartheta$ vs. $N$ shows a surprisingly linear correlation. (Plot of $E/\vartheta$ vs. $N$ with the line 0.76N − 1.0.)

Figure 4.7.: $\vartheta$ determined by $\vartheta_{\mathrm{calc}}(E, N) = \frac{E}{aN - b}$ vs. measurements. $E$ is the cluster's total number of edges. (Two panels, both with a = 0.72, b = 0.89: $\vartheta$ calculated vs. $\vartheta$ measured, and the deviation $\Delta\vartheta$ vs. $N$.)

5. Barabási-Albert network models

5.1. Modified Barabási-Albert model

The network model of Barabási and Albert [30] was introduced in section 2.5. We pointed out that it fits empirical networks rather well, but lacks support for disjoint ones, as the algorithm only delivers one giant cluster.

Thus, to cope with networks consisting of several components, we must modify the model. We choose a very simple approach: in each step of adding nodes, we start a new cluster of $m_0 = 3$ nodes with probability $p$. Vertices added in subsequent time steps can connect to any node in any component, following the same probability rule as in the standard model.

• In the case m = 1, components can only grow (isolated clusters), whereas
• in the case m > 1, new nodes are able to connect two or more existing components of the network (merging clusters).
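The growth rule can be sketched in a few lines. This is not the simulation program used for the results below, only an illustration under stated assumptions: the $m_0 = 3$ nodes of every newly started cluster are taken to be mutually linked (so that they have non-zero degree), a new node chooses m distinct targets with probability proportional to their degree, and the final node count may overshoot the target by up to $m_0 - 1$. The names `growClusters` and `findRoot` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Union-find root lookup with path halving.
int findRoot(std::vector<int>& parent, int v) {
    while (parent[v] != v) {
        parent[v] = parent[parent[v]];
        v = parent[v];
    }
    return v;
}

// Grows one network and returns its cluster sizes, largest first.
std::vector<int> growClusters(int targetNodes, int m, double p, unsigned seed) {
    const int m0 = 3;
    assert(targetNodes >= m0 && m >= 1 && m <= m0 && p >= 0.0 && p <= 1.0);
    std::mt19937 rng(seed);
    std::bernoulli_distribution startNewCluster(p);
    std::vector<int> parent;  // union-find forest over all nodes
    std::vector<int> ends;    // both end points of every link: a uniform draw
                              // picks a node with probability ~ its degree
    auto addSeedCluster = [&]() {
        const int first = static_cast<int>(parent.size());
        for (int i = 0; i < m0; ++i) parent.push_back(first);
        for (int i = 0; i < m0; ++i)
            for (int j = i + 1; j < m0; ++j) {
                ends.push_back(first + i);
                ends.push_back(first + j);
            }
    };
    addSeedCluster();
    while (static_cast<int>(parent.size()) < targetNodes) {
        if (startNewCluster(rng)) {  // with probability p: new cluster
            addSeedCluster();
            continue;
        }
        const int node = static_cast<int>(parent.size());
        parent.push_back(node);
        std::vector<int> targets;  // m distinct existing nodes
        std::uniform_int_distribution<std::size_t> pick(0, ends.size() - 1);
        while (static_cast<int>(targets.size()) < m) {
            const int t = ends[pick(rng)];
            if (std::find(targets.begin(), targets.end(), t) == targets.end())
                targets.push_back(t);
        }
        for (int t : targets) {
            ends.push_back(node);
            ends.push_back(t);
            const int a = findRoot(parent, node);
            const int b = findRoot(parent, t);
            parent[a] = b;  // merge the two clusters (no-op if a == b)
        }
    }
    std::vector<int> count(parent.size(), 0);
    for (int v = 0; v < static_cast<int>(parent.size()); ++v)
        ++count[findRoot(parent, v)];
    std::vector<int> sizes;
    for (int c : count)
        if (c > 0) sizes.push_back(c);
    std::sort(sizes.rbegin(), sizes.rend());
    return sizes;
}
```

For example, `growClusters(555, 1, 0.04, 1u)` grows one isolated-clusters network; repeating the call with different seeds and accumulating the returned sizes gives a cluster size histogram.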

5.2. Simulation

In order to compare simulation results with real-world data from a scientific collaboration network [55], we let the network grow up to a size of 555 nodes. For proper statistics, this is repeated $10^4$ times.

5.2.1. Isolated clusters

scale-free behavior In the case m = 1, i.e. considering isolated clusters, we can be sure to get scale-free behavior within the distinct clusters, as the probabilities for attachment of a new node to an existing one are the same as in a single Barabási-Albert network (up to a constant factor due to a new node having the "choice" between different clusters to connect to).

However, the complete network is a priori not necessarily scale-free, as the total statistics are made up of the sum of all scale-free sub-networks or clusters. So we have to return later to the question of whether scale-free behavior prevails.

cluster size distribution Next, we examine the number of clusters of different sizes (figure 5.1). Obviously, we find that a high probability of starting a new net leads to many smaller networks, whereas low values favor bigger networks. Yet we make an interesting observation: low probabilities lead to a cluster-size distribution that is no longer monotonic, but favors big networks.

Looking at the right-hand plot of figure 5.1, which shows the number of nodes in clusters of a given size instead of the sheer cluster count, makes this more plausible.

• For p = 0, we will see a graph $\propto \delta(555)$, as there is only one giant cluster,
• for p = 1 a graph $\propto \delta(m_0 = 3)$, because there are only embryonic sub-nets.
• What we observe for 0 < p < 1 is the transition between both extremes.

Figure 5.1.: Frequency of clusters (left) resp. number of nodes in clusters of a given size (right) vs. cluster size at different probabilities for a new net. The simulation was run $10^4$ times with a network growing up to 555 nodes. The curve for p = 0.01 is the one with the rightmost peak; to the left follow the other p-values in descending order. (Axes: frequency resp. node count vs. cluster size; curves for p = 0.01, 0.04, 0.1, 0.4, 0.8.)

Figure 5.2.: Negative exponent of the power law part of the curves in figure 5.1 vs. probability p for a new net. The line corresponds to exponent $= -e^{2.25p}$. (Axes: exponent of the power law region vs. probability for a new net.)

power law region For all p, we start with a power law region in the distribution of small and medium cluster sizes. The exponent varies with the network-birth probability p. The semi-logarithmic plot in figure 5.2 shows that the exponential relation $-e^{2.25p}$ describes our data rather well. Of course, this formula cannot be true for general p, as for $p \to 1$ we expect the exponent $m \to -\infty$!

Regarding the empirical data from the collaboration graph of section 3.2, we find good overlap with the model (figure 5.3). Interestingly, the model is even able to explain facts formerly regarded as statistical anomalies, such as the observation of a giant cluster whose size largely exceeds that of all others in the network (section 3.2.2).

5.2.2. Merging clusters

Now, we modify the model by examining m > 1. In this case, newly added vertices develop several links to existing nodes (and thus existing clusters), being able to connect hitherto separated networks. In this paper, we limit our considerations to the standard Barabási-Albert case $m = m_0 = 3$.

Figure 5.3.: Semi-logarithmic plot comparing the simulation with p = 0.04 using the isolated clusters model of figure 5.1 and statistical data from a science collaboration network (section 3.2). (Axes: frequency vs. total authors in clusters of given size; curves: collaboration network, p = 4%.)

Using different p, we quickly recognize that low and medium probabilities make the simulations nearly always end up with a single giant cluster containing all vertices. The region of interest is higher p in the range of 60–90%.

cluster size distribution Again, we plot the total number of nodes contained in clusters of a given size (figure 5.4). For small cluster sizes, we observe a non-uniform behavior of the graph. The explanation is as follows: newly born clusters have a size of $m_0 = 3$ and thus appear very often. Also, clusters of sizes 4 or 7 are very probable, whereas a cluster of size 5 is very rare, because it can only be formed by a new cluster to which two new nodes have connected without gluing it to a second cluster.

In a semi-logarithmic plot (figure 5.4), we find a parabolic dependence for large cluster sizes (i.e. a Gaussian distribution around a mean depending on p). Apparently, the merging clusters model cannot cope with reality.

Figure 5.5.: Frequency of nodes with a certain degree. The simulation was run $10^4$ times with networks growing up to 555 nodes. Left plot is linear, right plot semi-logarithmic. M = 1 (isolated clusters). (Axes: frequency vs. degree; legend: p = 60%, p = 80%.)

Figure 5.6.: Same plots as figure 5.5 using M = 3 (merging clusters). (Axes: frequency vs. degree; legend: p = 1%, p = 40%.)

5.2.3. scale-free behavior

In figure 5.5 we can see that there is no pure scale-free behavior. There seems to be power-law behavior for small degrees and an exponential cutoff (figure 5.5) at higher values. Similar results have been observed by Newman [7] for collaboration networks.

One could argue that this effect is due to the fact that we do not plot the degree distribution for single clusters, but for the whole set of them. This objection only holds at first sight, though. At p = 80% we have several small clusters but virtually only one giant cluster dominating the degree distribution for high degrees. So the averaging over many clusters of different sizes should manifest itself mainly in the region of small degrees, contrary to our observations.

Mossa et al. [44] offer a possible explanation for the exponential cutoff encountered. They use a model which attributes to each node only a restricted knowledge of the network, i.e. the vertex is not able to consider the whole graph's structure, but only a subset according to its limited view.

Figure 5.4.: Number of nodes in clusters of a given size vs. cluster size at different probabilities for a new net. The simulation was run $10^5$ times with a network growing up to 555 nodes. (Axes: node count vs. cluster size; curves for p = 70%, 80%, 90%.)

M  p    ln(N_0)  k     κ
1  1%   16.8     2.27  60.
1  40%  17.1     2.45  9.2
3  60%  18.1     3.1   5.1
3  80%  20.1     4.8   3.5

Table 5.1.: Coefficients for figure 5.7

Figure 5.7.: The plots of figure 5.5 and figure 5.6 show a good fit to exponentially truncated power laws $y = N_0 x^{-k} e^{-x/\kappa}$. (Axes: frequency vs. degree; curves for M=1, p=1%; M=1, p=40%; M=3, p=60%; M=3, p=80%.)
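The fitted curves can be evaluated directly from the coefficients of table 5.1. The sketch below assumes the form $y = N_0 x^{-k} e^{-x/\kappa}$, i.e. a decaying exponential with cutoff scale $\kappa$; the function name `truncatedPowerLaw` is hypothetical. Working with $\ln N_0$, as tabulated, avoids forming the large prefactor separately.

```cpp
#include <cassert>
#include <cmath>

// Exponentially truncated power law y = N0 * x^(-k) * exp(-x / kappa),
// evaluated as exp(ln N0 - k ln x - x / kappa).
double truncatedPowerLaw(double x, double lnN0, double k, double kappa) {
    assert(x > 0.0 && kappa > 0.0);
    return std::exp(lnN0 - k * std::log(x) - x / kappa);
}
```

For instance, `truncatedPowerLaw(8.0, 17.1, 2.45, 9.2)` evaluates the second row of table 5.1 at degree 8. For degrees well below $\kappa$ the power law dominates, beyond it the exponential cutoff takes over.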


6. Conclusion

Dealing with real-world networks, scientists found three predominant properties:

• short average path lengths (Small World Effect),
• scale-free behavior,
• high clustering.

Different models were developed to cope with this challenge, each having different advantages and disadvantages. The model of Barabási and Albert [30] is a promising one, but lacks support for non-contiguous networks.

We constructed a network of co-authorship with 555 authors. Only scientists who cite a specific paper [30] were chosen. The resulting net shows scale-free characteristics but differs substantially from the results of accepted computer models.

Simulating Ising models on the network reveals strong robustness against disturbances (spin flip experiment/leadership effect) and shows agreement with mean field theory: we find the critical temperature of subnets of our graph to be proportional to the average number of edges per site, in the limit of a high node count.

In order to overcome the mentioned disadvantages of the Barabási-Albert model, we developed a modified version, allowing the formation of multiple clusters. We saw that the network structure depends strongly on the number of edges per added node, separating two cases: isolated clusters and merging clusters.

Only the first case leads to results fitting reality. Comparison with statistics from our collaboration net shows similar behavior, and the model is even able to explain facts at first regarded as statistical anomalies, such as the observation of a giant cluster whose size largely exceeds that of all others in the network. Even the exponential cutoff for nodes with high degrees, as encountered empirically, is reproduced.

Re-evaluating the model with a higher number of authors would lead to better statistics and greater reliability.


A. Acknowledgements

I would like to thank D. Stauffer^1 for giving me the idea for the subject and supporting me with comments and discussions during my research.

Thanks to M. Abd-Elmeguid^2 for co-judging this work.

My thanks for writing excellent software go to the authors of

• LaTeX, BibTeX, pdfLaTeX, dvips
• LaTeX packages: KOMA-Script, natbib, custom-bib, graphicx, listings, units, hyperref, hypernat, colortbl, color
• SciTE
• gnuplot
• Ruby, C++, Perl, bash
• dotty, neato [39]

Thanks to A. Sindermann for supporting my work by addressing computer network problems.

Special thanks to K. Godthardt for supporting me.

^1 Institute of Theoretical Physics, University of Cologne
^2 II. Institute of Experimental Physics, University of Cologne


B. Source code

B.1. Ising model

This C++ program simulates an Ising model on a given graph. In section 4.1.1 it was

used on our collaboration network.

// This program reads network data from a file and simulates
// an Ising model on this graph.
// Metropolis probabilities are used.
//
// Felix Puetsch <[email protected] -koeln.de>, 2003-01-22

#include <iostream>
#include <fstream>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <assert.h>

#define MAX_INT 2147483647
#define MAX_CONN 30
#define MAX_NODE 600
#define NODOUBLE 1

// #define NOASSERT

using namespace std;

// === random number generator ================================

class Random {
private:
  int state;
public:
  Random(int seed);
  // multiplicative congruential generator; the product is formed in
  // unsigned arithmetic so that the wrap-around is well defined
  int get() { return state = (int)((unsigned)state * 65539u); } // 16807
};

Random::Random(int seed) {
  assert(seed % 2 == 1);
  state = seed;
}

// === Vertex =================================================

class Ising;

class Vertex {
private:
  int number, conn_count, spin;
  Vertex* neighbour[MAX_CONN];
  Ising* ising;
public:
  Vertex(Ising* is, int nr);
  ~Vertex();
  int getSpin() { return spin; }
  void setSpin(int s) { spin = s; }
  void addConn(Vertex* to, int nodouble = 0);
  int getNumber() { return number; }
  int getConnCount() { return conn_count; }
  Vertex* getConn(int i);
  int simulateStep();
};

// === Ising ==================================================

class Ising {
public:
  Random* rnd;
  int enlimit[2*MAX_CONN+1];
private:
  ifstream net;
  char buffer[80];
  int v_count;
  Vertex* vlist[MAX_NODE];
public:
  Ising(char* fname);
  void buildnet();
  void debug(int nr);
  void reset();
  void simulate(double kT, int maxtime=-1, int steptime=0);
};

// === Vertex =================================================
// ... Struktors .............................................

Vertex::Vertex(Ising* is, int nr) {
  conn_count = 0;
  ising = is;
  number = nr;
}

Vertex::~Vertex() {
  cout << "~Vertex" << endl;
}

// add a link to vertex `to`; with nodouble set, repeated links are ignored
void Vertex::addConn(Vertex* to, int nodouble) {
  if (nodouble)
    for (int i=0; i<conn_count; i++)
      if (neighbour[i]==to) return;
  neighbour[conn_count++] = to;
  assert(conn_count < MAX_CONN);
}

Vertex* Vertex::getConn(int i) {
  assert(i < conn_count);
  return neighbour[i];
}

// one Metropolis update of this spin; returns the (possibly flipped) spin
int Vertex::simulateStep() {
  int spinsum=0;
  for (int i=0; i<conn_count; i++)
    spinsum += neighbour[i]->getSpin();
  spinsum *= spin;
  if (ising->rnd->get() < ising->enlimit[spinsum+MAX_CONN])
    spin *= -1;
  return spin;
}

// === Ising ==================================================
// ... Struktors .............................................

Ising::Ising(char* fname) {
  cout << "Ising" << endl;
  rnd = new Random(1);
  net.open(fname);
  if (!net.is_open()) {
    cerr << "input file not found" << endl;
    exit(1);
  }
  buildnet();
  for (float kT=50; kT<50.1; kT+=0.2) {
    cerr << "starting simulation with kT=" << kT << endl;
    reset();
    simulate(kT, 1000000, 100);
  }
}

// ... Methods ...............................................

void Ising::buildnet() {
  cout << "buildnet" << endl;
  for (int i=0; i<MAX_NODE; i++) {
    vlist[i] = new Vertex(this, i);
    vlist[i]->setSpin(1);
  }
  int from, to;
  while (!net.eof()) {
    net.getline(buffer, 80);
    // one link per line in the form "from -- to;"
    if (sscanf(buffer, "%i -- %i;", &from, &to) != 2) {
      cerr << "input line ignored: '" << buffer << "'" << endl;
      continue;
    }
    assert(from < MAX_NODE); assert(to < MAX_NODE);
    vlist[from]->addConn(vlist[to], NODOUBLE);
    vlist[to]->addConn(vlist[from], NODOUBLE);
  }
  net.close();
  v_count = 0;
  // highest connected node number; unconnected nodes get spin 0
  for (int i=1; i<MAX_NODE; i++)
    if (vlist[i]->getConnCount()>0) v_count=i;
    else vlist[i]->setSpin(0);
  cerr << v_count << " nodes" << endl;
}

void Ising::debug(int nr) {
  Vertex* node = vlist[nr];
  int nb = node->getConnCount();
  cout << "node " << nr << " has " << nb
       << " connections:" << endl;
  for (int i=0; i<nb; i++)
    cout << node->getConn(i)->getNumber() << endl;
}

void Ising::reset() {
  for (int i=0; i<MAX_NODE; i++)
    vlist[i]->setSpin(abs(vlist[i]->getSpin()));
}

// ... Simulation .............................................

void Ising::simulate(double kT, int maxtime, int steptime) {
  if (!steptime) steptime=maxtime;
  // enlimit[s+MAX_CONN] is the threshold below which a random int
  // falls with the Metropolis probability min(1, exp(-2*s/kT))
  for (int i=-MAX_CONN; i<=MAX_CONN; i++) {
    double p = exp(-2.*i/kT);
    if (p > 1.) p = 1.; // clamp: keeps the threshold inside the int range
    enlimit[i+MAX_CONN] = (int)(MAX_INT * (2*p-1));
  }
  int mag, time=0;
  while (time<maxtime) {
    for (int step=0; step<steptime; step++) {
      mag = 0;
      for (int nr=1; nr<=v_count; nr++) {
        assert(vlist[nr]!=NULL);
        mag += vlist[nr]->simulateStep();
      }
    }
    time += steptime;
    cout << time << " " << kT << " " << abs(mag) << " " << mag << endl;
  }
}

// === main ===================================================

int main(int argc, char** argv) {
  assert(cerr << "debug mode on" << endl);
  assert(argc==2);
  Ising ising(argv[1]);
  return 0;
}


B.2. Spin flip model

This C++ program simulates an Ising model on a given graph. At regular time intervals, the most connected spins (hard coded) are flipped and pinned in their new position (spin −1). In section 4.1.4 this was used on our collaboration network.

// This program reads network data from a file and simulates

// an Ising model an this graph.

// additionally , every ... time steps the next most connected

// spin is flipped permanently .

5 // Metropolis probabilities are used .

//

// Felix Puetsch <[email protected] -koeln.de >, 2003-02-05

#inc lude < i o s t r eam>

10 #inc lude < f s t ream>

#inc lude < s t d i o . h>

#inc lude < a s s e r t . h>

#def ine MAX INT 2147483647

15

#def ine MAX CONN 40

#def ine MAX NODE 600

#def ine NODOUBLE 0

20

// #define NOASSERT

us ing namespace s td ;

25 // === random number generator ================================

c l a s s Random f

p r i v a t e :

i n t s t a t e ;

30 pub l i c :

Random( i n t seed ) ;

i n t get ( ) f r e tu rn s t a t e �=65539 ; g // 16807

g ;

35 Random : : Random( i n t seed ) f

a s s e r t ( seed % 2 == 1) ;

s t a t e = seed ;

g

40 // === Vertex =================================================

c l a s s I s i n g ;

c l a s s Ver t ex f

45 p r i v a t e :

i n t number , conn count , sp i n , s t i c k y ;

Ve r t ex � ne i ghbou r [MAX CONN ] ;

46

Page 47: Analysis and Simulation of Scientic Networks

B.2. Spin ip model

  Ising *ising;
public:
  Vertex(Ising *is, int nr);
  ~Vertex();
  int getSpin() { return spin; }
  void stickIt();
  void setSpin(int s) { spin = s; }
  void addConn(Vertex *to, int nodouble = 0);
  int getNumber() { return number; }
  int getConnCount() { return conn_count; }
  Vertex *getConn(int i);
  int simulateStep();
};

// === Ising ==================================================

class Ising {
public:
  Random *rnd;
  int enlimit[2*MAX_CONN+1];
private:
  ifstream net;
  char buffer[80];
  int v_count;
  Vertex *vlist[MAX_NODE];
public:
  Ising(char *fname);
  void build_net();
  void debug(int nr);
  void reset();
  void simulate(double kT, int maxtime=-1, int steptime=0, int starttime=0);
};

// === Vertex =================================================
// ... Constructors and destructors ...........................
Vertex::Vertex(Ising *is, int nr) {
  conn_count = 0;
  sticky = 0;
  ising = is;
  number = nr;
}

Vertex::~Vertex() {
  cout << "~Vertex" << endl;
}

void Vertex::addConn(Vertex *to, int nodouble) {
  if (nodouble)
    for (int i=0; i<conn_count; i++)
      if (neighbour[i]==to) return;
  neighbour[conn_count++] = to;
  assert(conn_count < MAX_CONN);


}

Vertex *Vertex::getConn(int i) {
  assert(i < conn_count);
  return neighbour[i];
}

void Vertex::stickIt() {
  sticky = 1;
  spin = -1;
  cerr << number << " sticked." << endl;
}

int Vertex::simulateStep() {
  if (sticky!=0) return spin;
  int spinsum=0;
  for (int i=0; i<conn_count; i++)
    spinsum += neighbour[i]->getSpin();
  spinsum *= spin;
  if (ising->rnd->get() < ising->enlimit[spinsum+MAX_CONN])
    spin *= -1;
  return spin;
}

// === Ising ==================================================
// ... Constructors and destructors ...........................
Ising::Ising(char *fname) {
  cout << "Ising" << endl;
  rnd = new Random(1);
  net.open(fname);
  if (!net.is_open()) {
    cerr << "input file not found" << endl;
    exit(1);
  }
  build_net();
  double kT=0.2;
  for (int i=1; i<=100; i++) {
    cerr << "starting simulation with kT=" << kT << endl;
    reset();
    // the following is not beautiful, but quick :-)
    simulate(kT,100000,1000);
    vlist[27]->stickIt();
    simulate(kT,10000,1000,100000);
    assert(vlist[27]->getSpin()==-1);
    vlist[329]->stickIt();
    simulate(kT,10000,1000,110000);
    vlist[116]->stickIt();
    simulate(kT,10000,1000,120000);
    vlist[223]->stickIt();
    simulate(kT,10000,1000,130000);
    vlist[251]->stickIt();
    simulate(kT,10000,1000,140000);
    vlist[237]->stickIt();


    simulate(kT,10000,1000,150000);
    vlist[491]->stickIt();
    simulate(kT,10000,1000,160000);
    vlist[365]->stickIt();
    simulate(kT,10000,1000,170000);
    vlist[7]->stickIt();
    simulate(kT,10000,1000,180000);
    vlist[418]->stickIt();
    simulate(kT,10000,1000,190000);
    vlist[381]->stickIt();
    simulate(kT,10000,1000,200000);
    vlist[199]->stickIt();
    simulate(kT,10000,1000,210000);
    vlist[398]->stickIt();
    simulate(kT,10000,1000,220000);
    vlist[15]->stickIt();
    simulate(kT,10000,1000,230000);
    vlist[492]->stickIt();
    simulate(kT,10000,1000,240000);
    vlist[461]->stickIt();
    simulate(kT,10000,1000,250000);
    vlist[371]->stickIt();
    simulate(kT,10000,1000,260000);
    vlist[249]->stickIt();
    simulate(kT,10000,1000,270000);
    vlist[80]->stickIt();
    simulate(kT,10000,1000,280000);
    vlist[486]->stickIt();
    simulate(kT,10000,1000,290000);
  }
}

// ... Methods ................................................

void Ising::build_net() {
  cout << "build_net" << endl;
  for (int i=0; i<MAX_NODE; i++) {
    vlist[i] = new Vertex(this, i);
    vlist[i]->setSpin(1);
  }
  int from, to;
  while (!net.eof()) {
    net.getline(buffer,80);
    if (sscanf(buffer, "%i -- %i;", &from, &to) != 2) {
      cerr << "input line ignored: '" << buffer << "'" << endl;
      continue;
    }
    assert(from < MAX_NODE); assert(to < MAX_NODE);
    vlist[from]->addConn(vlist[to], NODOUBLE);
    vlist[to]->addConn(vlist[from], NODOUBLE);
  }
  net.close();
  v_count = 0;
  for (int i=1; i<MAX_NODE; i++)


    if (vlist[i]->getConnCount()>0) v_count=i;
    else vlist[i]->setSpin(0);
  cerr << v_count << " nodes" << endl;
}

void Ising::debug(int nr) {
  Vertex *node = vlist[nr];
  int nb = node->getConnCount();
  cout << "node " << nr << " has " << nb
       << " connections:" << endl;
  for (int i=0; i<nb; i++)
    cout << node->getConn(i)->getNumber() << endl;
}

void Ising::reset() {
  for (int i=0; i<MAX_NODE; i++)
    vlist[i]->setSpin(abs(vlist[i]->getSpin()));
}

// ... Simulation .............................................

void Ising::simulate(double kT, int maxtime, int steptime, int starttime) {
  if (!steptime) steptime=maxtime;
  for (int i=-MAX_CONN; i<=MAX_CONN; i++)
    enlimit[i+MAX_CONN] =
      (int)(MAX_INT * (2*exp(-2.*i/kT)-1));
  int mag, time=starttime;
  while (time-starttime<maxtime) {
    for (int step=0; step<steptime; step++) {
      mag = 0;
      for (int nr=1; nr<=v_count; nr++) {
        assert(vlist[nr]!=NULL);
        mag += vlist[nr]->simulateStep();
      }
    }
    time += steptime;
    cout << time << " " << kT << " " << mag << endl;
  }
}

// === main ===================================================

int main(int argc, char **argv) {
  assert(cerr << "debug mode on" << endl);
  Ising ising("net.txt");
  return 0;
}
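The listing compares a signed 32-bit random integer directly against a precomputed table (`enlimit`) instead of drawing a floating-point number in every step. The sketch below illustrates that idea under stated assumptions; it is not the thesis code. The names `Lcg`, `acceptanceThreshold` and `buildThresholdTable` are hypothetical, unsigned arithmetic is used so that the multiplication wraps around in a well-defined way, and probabilities above 1 are clamped explicitly before the conversion to an integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

const int32_t kMaxInt = 2147483647;

// Multiplicative congruential generator with multiplier 65539 (mod 2^32).
// The state must be odd; the result is reinterpreted as a signed integer.
struct Lcg {
    uint32_t state;
    int32_t next() {
        state *= 65539u;  // wraps modulo 2^32
        return static_cast<int32_t>(state);
    }
};

// Threshold t such that, for r roughly uniform on [-kMaxInt, kMaxInt],
// the test (r < t) succeeds with probability of about p.
int32_t acceptanceThreshold(double p) {
    if (p >= 1.0) return kMaxInt;   // practically always accept
    if (p <= 0.0) return -kMaxInt;  // never accept
    return static_cast<int32_t>(kMaxInt * (2.0 * p - 1.0));
}

// Table indexed by s + maxConn, where s = s_i * (sum of neighbouring spins)
// lies in [-maxConn, maxConn] and the flip probability is exp(-2 s / kT).
std::vector<int32_t> buildThresholdTable(int maxConn, double kT) {
    std::vector<int32_t> table(2 * maxConn + 1);
    for (int s = -maxConn; s <= maxConn; ++s)
        table[s + maxConn] = acceptanceThreshold(std::exp(-2.0 * s / kT));
    return table;
}
```

With such a table, a flip is accepted when `rng.next() < table[s + maxConn]`, so the exponential is evaluated only once per temperature and possible value of `s`.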


B.3. Modified Barabási-Albert model

This Ruby program creates modified Barabási-Albert models with a given final size. In section 5 this was used on our collaboration network.
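The growth rule of the Ruby listing that follows can be summarised in a short C++ sketch. It is only an illustration under stated assumptions: `GrowingNet`, `startCluster` and `grow` are hypothetical names, and `std::mt19937` replaces the generator of the original program. Preferential attachment is obtained by drawing uniformly from a list in which every node appears once per incident edge, so a node is picked with probability proportional to its degree. With probability `pNew` a step starts a new ring-shaped seed cluster instead of attaching a node.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

struct GrowingNet {
    std::vector<std::pair<int, int>> edges;
    std::vector<int> endpoints;  // each node appears once per incident edge
    int nodeCount = 0;

    void addEdge(int a, int b) {
        edges.push_back({a, b});
        endpoints.push_back(a);
        endpoints.push_back(b);
    }

    // Start a new, disconnected seed cluster: `size` fresh nodes joined in a ring.
    void startCluster(int size) {
        int base = nodeCount;
        nodeCount += size;
        for (int i = 0; i < size; ++i)
            addEdge(base + i, base + (i + 1) % size);
    }

    // One growth step: with probability pNew start a new seed cluster,
    // otherwise add one node with m preferentially attached links.
    // Requires at least one existing edge before the first attachment.
    void grow(double pNew, int m, int seedSize, std::mt19937& rng) {
        std::uniform_real_distribution<double> uniform(0.0, 1.0);
        if (uniform(rng) < pNew) {
            startCluster(seedSize);
            return;
        }
        int newNode = nodeCount++;
        // Sample only from endpoints that existed before this node was added.
        std::size_t known = endpoints.size();
        std::uniform_int_distribution<std::size_t> pick(0, known - 1);
        for (int k = 0; k < m; ++k)
            addEdge(newNode, endpoints[pick(rng)]);
    }
};
```

A network of a given final size is then grown by calling `startCluster` once and repeating `grow` until `nodeCount` reaches the target. For `m` greater than 1 this sketch may create duplicate links, which it does not filter.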

#!/home/fxp/bin/ruby -w

$P = 0.8
$RUNS = 1E0.to_i
$M = 1

$cluster_initial_size = 3

$MAXINT = 2147483647*2+1

class Random
  @@ibm = 1
  def rnd(max=nil)
    @@ibm *= 65539 # 16807 # 65539
    @@ibm &= $MAXINT
    max ? @@ibm*max/$MAXINT : @@ibm
  end
end

class Vertex
  attr_reader :connections
  attr_accessor :nr
  @@total_vertices = 0
  def initialize
    @connections = []
    @@total_vertices += 1
    @nr = @@total_vertices
  end
  def inspect
    "I'm node nr. #{@nr} connected to #{@connections.collect {|c| c.nr}.join(", ")}."
  end
  def add_link(partner)
    @connections <<= partner
  end
  def connect(partner)
    add_link(partner)
    partner.add_link(self)
  end
end

class BA_net
  def initialize(prob_new=0)
    @p_new = (prob_new * $MAXINT).to_i
    @nodes = []
    @kertesz = []
    @r = Random.new
    100.times { @r.rnd }
    start_new_net


  end
  def size
    @nodes.length
  end
  def start_new_net
    max = @nodes.length
    $cluster_initial_size.times { @nodes << Vertex.new }
    $cluster_initial_size.times {|i|
      orig = i + max
      dest = ((i+1) % $cluster_initial_size) + max
      @nodes[orig].connect(@nodes[dest])
      @kertesz << orig << dest
    }
  end
  def inspect
    @nodes.collect {|node| node.inspect}.join("\n") + "\n#{@kertesz.inspect}"
  end
  def add_node
    if @r.rnd < @p_new
      start_new_net
    else
      @nodes << new_node = Vertex.new
      l = @kertesz.length
      $M.times {
        dest = @kertesz[@r.rnd(l-1)]
        new_node.connect(@nodes[dest])
        @kertesz << @nodes.length-1 << dest
      }
    end
  end
  def check_subtree(node, subtree)
    return if !node.nr
    subtree << node.nr
    node.nr = nil
    node.connections.each {|n| check_subtree(n, subtree)}
  end
  def analysis
    @nodes.each {|node|
      next if !node.nr
      tree = []
      check_subtree(node, tree)
      # p tree.sort
      # [1] [2,3] [4,5,6,7] [8...] ...
      bucket = tree.length
      if !$statistik[bucket]
        $statistik[bucket] = 1
      else
        $statistik[bucket] += 1
      end
    }
  end
end


$statistik = {}

$RUNS.times {|i|
  $stderr.print "#{i}... " if i%100==0
  mynet = BA_net.new($P)
  begin
    mynet.add_node
  end while mynet.size < 555
  # p mynet
  mynet.analysis
}

$statistik.keys.sort.each {|l| printf "%3i %6.4f\n", l, $statistik[l]} # .to_f/$RUNS }
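The `analysis` step of the listing counts how many connected clusters of each size the grown network contains. A minimal C++ sketch of the same bookkeeping is given below; `clusterSizeHistogram` is a hypothetical name, nodes are assumed to be numbered from 0, and an explicit stack replaces the recursion of the Ruby version so that large clusters cannot exhaust the call stack.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Returns a map from cluster size to the number of connected clusters
// of that size in an undirected graph given as an edge list.
std::map<int, int> clusterSizeHistogram(
        int nodeCount, const std::vector<std::pair<int, int>>& edges) {
    // Build adjacency lists from the edge list.
    std::vector<std::vector<int>> adj(nodeCount);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    std::vector<bool> visited(nodeCount, false);
    std::map<int, int> histogram;
    for (int start = 0; start < nodeCount; ++start) {
        if (visited[start]) continue;
        // Depth-first traversal of the cluster containing `start`.
        int size = 0;
        std::vector<int> stack{start};
        visited[start] = true;
        while (!stack.empty()) {
            int v = stack.back();
            stack.pop_back();
            ++size;
            for (int w : adj[v]) {
                if (!visited[w]) {
                    visited[w] = true;
                    stack.push_back(w);
                }
            }
        }
        ++histogram[size];
    }
    return histogram;
}
```

For example, a graph with six nodes and the edges 0-1, 1-2 and 3-4 consists of one cluster each of size 3, 2 and 1.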


C. Figures

The figures found in this publication were made by myself, except the following:

• figure 1.1 was found on http://www.math.colostate.edu/~betten/courses/M501/combi.html.

• figure 2.1 was found in [14].

• figure 2.3 was found on http://www.science.nd.edu/physics/Faculty/barabasi.html.

• figure 2.3 was found on http://www.phys.psu.edu/~ralbert/.

• figure 2.2 was found in [18].

• figure 4.1 was found on http://www.physik.tu-dresden.de/itp/members/kobe/isingphbl/.


Bibliography

[1] L. Euler. Solutio problematis ad geometriam situs pertinentis. Commentarii Academiae Scientiarum Imperialis Petropolitanae 8 (1736).

[2] R. Albert and A.-L. Barabási. Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002).

[3] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks. Adv. Phys. 51, 1079–1187 (2002).

[4] A.-L. Barabási. Linked. The New Science of Networks (Perseus, Cambridge, Massachusetts, 2002).

[5] K. Rieger. Sind Sie mit Marlon Brando befreundet? Die Zeit 44, BL21 (1999).

[6] A.-L. Barabási, R. Albert, and H. Jeong. Scale-free characteristics of random networks: the topology of the world-wide web. Physica A 281, 69–77 (2000).

[7] M. E. J. Newman. Scientific collaboration networks. Network construction and fundamental results. Phys. Rev. E 64, 01631 (2001).

[8] M. E. J. Newman. Scientific collaboration networks. Shortest paths, weighted networks, and centrality. Phys. Rev. E 64, 01632 (2001).

[9] H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scale-free topology of e-mail networks. Phys. Rev. E 66, 035103 (2002).

[10] S. Redner. How popular is your paper? An empirical study of the citation distribution. Europ. Phys. J. B 4, 131–134 (1998).

[11] A. J. Lotka. The frequency distribution of scientific productivity. J. Wash. Acad. Sc. 16, 317–323 (1926).

[12] F. Liljeros, C. R. Edling, et al. The web of human sexual contacts. Nature 411, 907 (2001).

[13] S. N. Dorogovtsev and J. F. F. Mendes. Evolution of networks: From Biological Nets to the Internet and WWW (Oxford University Press, Oxford, 2003).

[14] S. H. Strogatz. Exploring complex networks. Nature 410, 268–276 (2001).

[15] S. Milgram. The small world problem. Psych. Today 61 (1967).

[16] D. J. Watts and S. H. Strogatz. Small world. Nature 393, 440–442 (1998).

[17] D. Watts. Small worlds: the dynamics of networks between order and randomness (Princeton University Press, Princeton, New Jersey, 1999).

[18] M. E. J. Newman. Models of the small world: a review (2000). ArXiv:cond-mat/0001118.

[19] M. Gladwell. The Tipping Point. How little things can make a big difference. (Little Brown and Company, Boston, Massachusetts, 2000).

[20] R. Cohen, K. Erez, D. ben Avraham, and S. Havlin. Resilience of the internet to random breakdowns. Phys. Rev. Lett. 85, 21, 4626–4628 (2000).

[21] A. L. Lloyd and R. M. May. How viruses spread among computers and people. Science 292, 1316–1317 (2001).

[22] M. E. J. Newman, S. Forrest, and J. Balthrop. Email networks and the spread of computer viruses. Phys. Rev. E 66, 035101 (2002).

[23] B. J. Kim, C. N. Yoon, S. K. Han, and H. Jeong. Path finding strategies in scale-free networks. Phys. Rev. E 65, 027103 (2002).

[24] J. M. Kleinberg. Navigation in a small world. Nature 406, 845 (2000).

[25] ISI Web of Science (2002). http://www.isinet.com/isi/products/citation/wos/index.html.

[26] R. Albert, H. Jeong, and A.-L. Barabási. Diameter of the world-wide web. Nature 401, 130 (1999).

[27] R. Solomonoff and A. Rapoport. Connectivity of random nets. Bull. Math. Biophys. 13, 107–227 (1951).

[28] P. Erdős and A. Rényi. On random graphs. Pub. Mathem. 6, 290–297 (1959).

[29] M. E. J. Newman. Random graphs as models of networks (2002). ArXiv:cond-mat/0202208.


[30] A.-L. Barabási and R. Albert. Emergence of scaling in random networks. Science 286, 509–512 (1999).

[31] St. Matthew. New Testament. In The Bible, chap. 12, 13 (God, ca. 10AD).

[32] D. Cohen. All the world's a net. New Scientist 174, 24–29 (2002).

[33] A.-L. Barabási and E. Bonabeau. Scale-free networks. Sci. Amer. 3, 50–59 (2003).

[34] K. Klemm and V. M. Eguíluz. Growing scale-free networks with small-world behavior. Phys. Rev. E 65, 057102 (2002).

[35] E. Ravasz and A.-L. Barabási. Hierarchical organization in complex networks. Phys. Rev. E 67, 026112 (2003).

[36] K. Klemm and V. M. Eguíluz. Highly clustered scale-free networks. Phys. Rev. E 65, 036123 (2002).

[37] A. F. J. van Raan. Fractal dimension of co-citations. Nature 347, 626 (1990).

[38] AT&T. Graphviz - open source graph drawing software (2002). http://www.research.att.com/sw/tools/graphviz/.

[39] S. C. North. Drawing graphs with neato. AT&T (2002).

[40] J. C. Venter, M. D. Adams, E. W. Myers, et al. The sequence of the human genome. Science 291, 1304–1351 (2001).

[41] A.-L. Barabási, H. Jeong, Z. Néda, E. Ravasz, A. Schubert, and T. Vicsek. Evolution of the social network of scientific collaborations. Physica A 311, 590–614 (2002).

[42] A.-L. Barabási. The physics of the web (2001). http://www.physicsweb.org/article/world/14/7/09.

[43] L. A. N. Amaral, A. Scala, M. Barthélémy, and H. E. Stanley. Classes of small-world networks. Proc. Nat. Acad. Sci. USA 97, 21, 509–512 (2000).

[44] S. Mossa, M. Barthélemy, H. E. Stanley, and L. A. N. Amaral. Truncation of power law behaviour in "scale-free" network models due to information filtering. Phys. Rev. Lett. 89, 208701 (2002).

[45] D. Stauffer and A. Aharony. Introduction to Percolation Theory (Taylor and Francis, London, 1994), second ed.

[46] A.-L. Barabási, R. Albert, and H. Jeong. Mean field theory for scale-free random networks. Physica A 272, 173–187 (1999).

[47] E. Ising. Beitrag zur Theorie des Ferromagnetismus. Zeitschr. f. Phys. 31, 253–258 (1925).

[48] W. Lenz. Beitrag zum Verständnis der magnetischen Erscheinungen in festen Körpern. Phys. Zeitschr. 21, 613–615 (1920).

[49] R. B. Potts. Some generalized order-disorder transformations. Proc. Cambridge Phil. Soc. 48, 106 (1952).

[50] F. Y. Wu. The Potts model. Rev. Mod. Phys. 54, 235–268 (1982).

[51] N. Metropolis, A. W. Rosenbluth, and A. H. Teller. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 6, 1087–1092 (1953).

[52] A. Aleksiejuk, J. A. Hołyst, and D. Stauffer. Ferromagnetic phase transition in Barabási-Albert networks. Physica A 310, 260–266 (2002).

[53] S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes. Ising model on networks with an arbitrary distribution of connections. Phys. Rev. E 66, 016104 (2002).

[54] K. Kacperski and J. A. Hołyst. Phase transitions as a persistent feature of groups with leaders in models of opinion formation. Physica A 287, 631–643 (2000).

[55] F. Pütsch. Analysis and modeling of science collaboration networks. Adv. Compl. Sys. (2003). Submitted on May 6th.
