Week 5 - Models of Complex Networks I

Dr. Anthony Bonato, Ryerson University

AM8002, Fall 2014

Key properties of complex networks

1. Large scale.

2. Evolving over time.

3. Power law degree distributions.

4. Small world properties.

• in the next two lectures, we consider various models simulating these properties

Why model complex networks?

• uncover and explain the generative mechanisms underlying complex networks

• predict the future

• nice mathematical challenges

• models can uncover the hidden reality of networks

“All models are wrong, but some are useful.” – G.E.P. Box

G(n,p) random graph model (Erdős, Rényi, 63)

• p = p(n) a real number in (0,1), n a positive integer

• G(n,p): probability space on graphs with nodes {1,…,n}, two nodes joined independently and with probability p
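As an illustration of the definition, here is a minimal Python sketch that draws one graph from G(n,p) by flipping an independent coin for each pair of nodes. The function name `sample_gnp` and the seed are illustrative choices, not part of the lecture.

```python
import random
from itertools import combinations

def sample_gnp(n, p, rng=None):
    """Return the edge list of one graph drawn from G(n,p) on nodes 1..n."""
    rng = rng or random.Random()
    # Each of the C(n,2) unordered pairs is joined independently with probability p.
    return [(u, v) for u, v in combinations(range(1, n + 1), 2)
            if rng.random() < p]

# Illustrative call: one sample on 5 nodes with p = 1/2.
edges = sample_gnp(5, 0.5, random.Random(0))
print(edges)
```

This direct approach takes time quadratic in n, which is fine for small experiments.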

Degrees and diameter

• an event A_n happens asymptotically almost surely (a.a.s.) in G(n,p) if it holds there with probability tending to 1 as n→∞

Theorem 5.1: A.a.s. the degree of each vertex of G in G(n,p) equals

$$pn + O\left(\sqrt{pn \log n}\right) = (1+o(1))\,pn.$$

• concentration: binomial distribution

Theorem 5.2: If p is constant, then a.a.s. diam(G(n,p)) = 2.
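The degree concentration in Theorem 5.1 can be observed empirically. The sketch below samples one graph from G(n,p) and reports the smallest and largest degree as multiples of pn. The values n = 1000 and p = 0.1 and the helper name `gnp_degrees` are arbitrary illustrative choices.

```python
import random

def gnp_degrees(n, p, rng):
    """Degree sequence of one sample of G(n,p) on nodes 0..n-1."""
    deg = [0] * n
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:   # join u and v independently with probability p
                deg[u] += 1
                deg[v] += 1
    return deg

n, p = 1000, 0.1                   # illustrative values only
deg = gnp_degrees(n, p, random.Random(1))
# Ratios close to 1 mean every degree is close to pn.
print(min(deg) / (p * n), max(deg) / (p * n))
```

Both ratios should drift toward 1 as n grows with p fixed, in line with the (1+o(1))pn statement.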

Aside: evolution of G(n,p)

• think of G(n,p) as evolving from a co-clique to clique as p increases from 0 to 1

• at p=1/n, Erdős and Rényi observed that something interesting happens a.a.s.:
– with p = c/n and c < 1, the graph is disconnected, every component is a tree or unicyclic, and the largest component has order Θ(log(n))
– with p = c/n and c > 1, the graph contains a giant component of order Θ(n)

• Erdős and Rényi called this the double jump

• physicists call it the phase transition: it is similar to phenomena like freezing or boiling
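The change at p = 1/n is easy to see in simulation. The following sketch measures the order of the largest component of one sample of G(n, c/n) for a value of c below 1 and a value above 1, using a simple union-find structure. The choices n = 2000, c = 0.5 and c = 2.0 are assumptions made for illustration.

```python
import random
from collections import Counter

def largest_component(n, p, rng):
    """Order of the largest component in one sample of G(n,p), via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:           # edge uv is present: merge components
                parent[find(u)] = find(v)
    return max(Counter(find(x) for x in range(n)).values())

n = 2000                                   # illustrative size
rng = random.Random(2)
for c in (0.5, 2.0):
    print(c, largest_component(n, c / n, rng))
```

For c = 0.5 the largest component should be tiny compared with n, while for c = 2.0 it should contain a constant fraction of the nodes.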

G(n,p) is not a model for complex networks

• degree distribution is binomial

• low diameter, rich but uniform substructures

Preferential attachment model

Albert-László Barabási and Réka Albert

Preferential attachment

• say there are n nodes x_i in G, and we add in a new node z

• z is joined to the x_i by preferential attachment if the probability that zx_i is an edge is proportional to the degree of x_i:

$$\frac{\deg(x_i)}{\sum_{1 \le j \le n} \deg(x_j)} = \frac{\deg(x_i)}{2|E(G)|}$$

• the larger deg(x_i), the higher the probability that z is joined to x_i
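To make the rule concrete, the sketch below computes the attachment probability deg(x_i) / (2|E(G)|) for every node of a small graph. The example graph and the function name `attachment_probabilities` are hypothetical, and isolated nodes (which would have probability 0) are simply omitted.

```python
from collections import Counter
from fractions import Fraction

def attachment_probabilities(edges):
    """Map each node x_i to deg(x_i) / (2|E(G)|) for the graph with these edges."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    total = 2 * len(edges)  # the sum of all degrees equals 2|E(G)|
    return {x: Fraction(d, total) for x, d in deg.items()}

# Hypothetical example graph: a star centred at node 1 plus the edge {2, 3}.
edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
probs = attachment_probabilities(edges)
print(probs)
```

Exact fractions are used so that the probabilities visibly sum to 1.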

Preferential attachment (PA) model (Barabási, Albert, 99), (Bollobás, Riordan, Spencer, Tusnády, 01)

• parameter: m a positive integer

• at time 0, add a single edge

• at time t+1, add m edges from a new node v_{t+1} to the existing nodes, which form the graph G_t

– the edge v_{t+1}v_s is added with probability

$$\frac{\deg_{G_t}(v_s)}{2(mt+1)}$$
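The growth process can be simulated in a few lines. This is a sketch under stated assumptions: the initial edge joins two distinct nodes 0 and 1, the m targets of a new node are drawn independently from the degrees in G_t, and multiple edges are allowed. The name `pa_graph` and the parameter values are illustrative.

```python
import random
from collections import Counter

def pa_graph(m, steps, rng):
    """Grow a PA graph: start from the edge {0, 1}, then add `steps` nodes with m edges each."""
    edges = [(0, 1)]
    endpoints = [0, 1]           # node v appears deg(v) times in this list
    for t in range(steps):
        new = t + 2
        # A uniform draw from `endpoints` picks v_s with probability deg(v_s) / (2(mt+1)).
        targets = [rng.choice(endpoints) for _ in range(m)]
        for s in targets:
            edges.append((new, s))
            endpoints.extend((new, s))
    return edges

edges = pa_graph(m=2, steps=20000, rng=random.Random(3))
deg = Counter(v for e in edges for v in e)
counts = Counter(deg.values())
# Fraction of nodes with degree k, for a few values of k.
for k in (2, 4, 8, 16):
    print(k, counts[k] / len(deg))
```

Keeping one list entry per unit of degree turns degree-proportional sampling into a uniform draw, which avoids recomputing probabilities at every step.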

Wilensky, U. (2005). NetLogo Preferential Attachment model. http://ccl.northwestern.edu/netlogo/models/PreferentialAttachment.

Properties of the PA model

• Theorem 5.3 (BRST, 01): A.a.s. for all k satisfying 0 ≤ k ≤ t^{1/15},

$$\frac{N_{k,t}}{t} = (1+o(1))\,k^{-3}.$$

• Theorem 5.4 (Bollobás, Riordan, 04): A.a.s. the diameter of the graph at time t is

$$(1+o(1))\,\frac{\log t}{\log\log t}.$$

Idea of proof of power law degree distribution

1. Derive an asymptotic expression for E(N_{k,t}) via a recurrence relation.

2. Prove that N_{k,t} concentrates around E(N_{k,t}).
– this is accomplished via martingales or using variance

Azuma-Hoeffding inequality

If (X_i : 0 ≤ i ≤ t) is a martingale satisfying the c-Lipschitz condition (|X_i − X_{i−1}| ≤ c for 1 ≤ i ≤ t), then for all real λ > 0,

$$\Pr\left(|X_t - X_0| \ge \lambda\right) \le 2\exp\left(-\frac{\lambda^2}{2tc^2}\right).$$

Sketch of proof of (2), when m=1

• let A = N_{k,t} and Z_i = G_i

• define X_i = E[A | Z_1, …, Z_i]

• it can be shown that (X_i) is a martingale (i.e. a Doob martingale), with X_0 = E(N_{k,t}) and X_t = N_{k,t}

• a new vertex can affect the degrees of at most two existing nodes, so we have that

|X_i − X_{i−1}| ≤ 2

• now apply the Azuma-Hoeffding inequality with λ = √(t log t):

$$\Pr\left(|X_t - X_0| \ge \sqrt{t \log t}\right) \le 2t^{-1/8} = o(1).$$
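The arithmetic behind this last bound can be spelled out. The computation below assumes the Lipschitz constant c = 2 from the sketch and the choice λ = √(t log t). That value of λ is an assumption, inferred from the exponent 1/8 in the final bound.

```latex
\begin{align*}
\frac{\lambda^2}{2tc^2}
  &= \frac{t \log t}{2t \cdot 2^2} = \frac{\log t}{8}, \\
\Pr\bigl(|X_t - X_0| \ge \sqrt{t \log t}\bigr)
  &\le 2\exp\Bigl(-\frac{\log t}{8}\Bigr) = 2t^{-1/8} = o(1).
\end{align*}
```

Since X_0 = E(N_{k,t}) and X_t = N_{k,t}, this says that a.a.s. N_{k,t} lies within √(t log t) of its expectation.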

ACL PA model

• (Aiello, Chung, Lu, 2002) introduced a preferential attachment model where the parameters allow exponents to range over (2,∞)

• Fix p in (0,1). This is the sole parameter of the model.

• At t=0, G_0 is a single vertex with a loop.

• A vertex-step adds a new vertex v and an edge uv, where u is chosen from existing vertices by preferential attachment.

• An edge-step adds an edge uv, where both endpoints are chosen by preferential attachment.

• To form G_{t+1}, with probability p take a vertex-step, and with probability 1-p, an edge-step.

ACL PA, continued

• note that the number of vertices is a random variable; but it concentrates on 1+pt.

• to give a flavour of estimating the expectations of the random variables N_{k,t}, we derive the following result. The case (2) for general k>1 follows by induction.

Power law for expected degree distribution in ACL PA model

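Theorem 5.5 (ACL, 02).

1)

$$\lim_{t\to\infty} \frac{E(N_{1,t})}{t} = \frac{2p}{4-p}.$$

2) For k sufficiently large,

$$\lim_{t\to\infty} \frac{E(N_{k,t})}{t} = O\left(k^{-\left(2+\frac{p}{2-p}\right)}\right).$$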
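Part (1) of Theorem 5.5 and the concentration of the number of vertices on 1+pt can both be checked numerically. The sketch below simulates the ACL process and compares the fraction of degree-1 vertices with 2p/(4−p). It assumes that the two endpoints of an edge-step are drawn independently (so loops can occur) and that a loop contributes 2 to the degree. The name `acl_graph` and the values of p and t are illustrative.

```python
import random
from collections import Counter

def acl_graph(p, steps, rng):
    """Run `steps` steps of the ACL model and return the degree of every vertex."""
    deg = Counter({0: 2})        # G_0: vertex 0 with a loop (the loop counts twice)
    endpoints = [0, 0]           # vertex v appears deg(v) times in this list
    for _ in range(steps):
        u = rng.choice(endpoints)          # endpoint chosen by preferential attachment
        if rng.random() < p:
            v = len(deg)                   # vertex-step: v is a brand new vertex
        else:
            v = rng.choice(endpoints)      # edge-step: second endpoint, also by PA
        deg[u] += 1
        deg[v] += 1
        endpoints.extend((u, v))
    return deg

p, t = 0.5, 200000               # illustrative values only
deg = acl_graph(p, t, random.Random(4))
n1 = sum(1 for d in deg.values() if d == 1)
# Vertices per step, observed N_{1,t}/t, and the limit 2p/(4-p) from part (1).
print(len(deg) / t, n1 / t, 2 * p / (4 - p))
```

The second and third printed values should be close for large t.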

Copying models

• new nodes copy some of the link structure of an existing node

Motivation:

1. web page generation (Kumar et al, 00)

2. mutation in biology (Chung et al, 03)

Copying model (Kumar et al,00)

• Parameters: p in (0,1), d > 0 an integer, and a fixed digraph G_0 = H with constant out-degree d

• Assume G_t has out-degree d.

• At time t+1, an existing vertex, u_t, is chosen u.a.r. The vertex u_t is called the copying vertex.

• To form G_{t+1} a new vertex v_{t+1} is added. For each of the d out-neighbours z of u_t, add a directed edge (v_{t+1}, z) with probability 1-p, and with probability p add instead a directed edge (v_{t+1}, z'), where z' is chosen u.a.r. from G_t.
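The copying step translates directly into code. The sketch below is one possible reading of the model: the seed digraph H is assumed to be the complete digraph on d+1 nodes (so every out-degree is d), and multiple edges are allowed. The name `copying_model` and the parameter values are illustrative.

```python
import random
from collections import Counter

def copying_model(p, d, steps, rng):
    """Return the out-neighbour lists of the copying model after `steps` new vertices."""
    # Assumed seed H: complete digraph on d+1 nodes, so every out-degree is d.
    out = [[z for z in range(d + 1) if z != v] for v in range(d + 1)]
    for _ in range(steps):
        n = len(out)
        u = rng.randrange(n)                     # copying vertex, chosen u.a.r.
        new_out = []
        for z in out[u]:
            if rng.random() < p:
                new_out.append(rng.randrange(n)) # random target chosen u.a.r. from G_t
            else:
                new_out.append(z)                # copy the edge of u
        out.append(new_out)
    return out

out = copying_model(p=0.3, d=3, steps=50000, rng=random.Random(5))
indeg = Counter(z for nbrs in out for z in nbrs)
# Largest in-degree, then the number of nodes with each of the smallest in-degrees.
print(max(indeg.values()), sorted(Counter(indeg.values()).items())[:5])
```

A few nodes should collect very large in-degree while most have small in-degree, which is the heavy-tailed behaviour the model is designed to produce.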

Properties of the copying model

• power laws:
– Kumar et al: exponent in interval (2,∞)
– Chung, Lu: (1,2)

• bipartite subgraphs:
– Kumar et al: larger expected number of bicliques than in PA models
– simplified model of community structure


Theorem 5.6 (Kumar et al, 00): If k > 0, then the copying model with parameter p satisfies a.a.s.

$$\frac{N^{\mathrm{in}}_{k,t}}{t} = \Theta\left(k^{-\frac{2-p}{1-p}}\right).$$

In particular, the in-degree distribution follows a power law with exponent (2-p)/(1-p).

Theorem 5.7 (Kumar et al, 00): A.a.s. with parameter d > 0 and for i ≤ log t,

$$N_{t,i,d} = \Omega\left(t \exp(-i)\right),$$

where N_{t,i,d} is the expected number of K_{i,i} which are subgraphs of G_t.

• indicates strong community structure in copying model