Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford...

33
CS 322: (Social and Information) Network Analysis Jure Leskovec Stanford University

Transcript of Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford...

Page 1: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

CS 322: (Social and Information) Network AnalysisJure LeskovecStanford University

Page 2: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

3 pages per group 3 pages per group Should include both Reaction part Reaction part Proposed work part

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 2

Page 3: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Random networkRandom network Scale‐free (power‐law) network(Erdos‐Renyi random graph)

Jure Leskovec, Stanford CS322: Network Analysis10/13/2009 3

Page 4: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Preferential attachment Preferential attachment [Price 1965, Albert‐Barabasi 1999]: Nodes arrive in orderNodes arrive in order  A new node j creates m out‐links Prob. of linking to a previous node i is g pproportional to its degree di

idijP )(

kk

i

dijP )(

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 4

Page 5: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

[Mitzenmacher, ‘03]

Pages are created in order Pages are created in order 1,2,…,n

When node j is created it When node j is created it produces a single link to some earlier node i chosen:earlier node i chosen:

1) With prob. p, j links to i chosen f l duniformly at random

2) With prob. 1-p, j links to i with ) p p, jprob. proportional to di (degree of i)

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 5

Page 6: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

How does degree d of node i grow over time? How does degree di of node i grow over time?

dd ii 1dtdq

tp

td ii

1d

d

q=1-p

What is the degree di(t) of node i at time t?

q 1 p

1)(

q

i itptd

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis

iq6

Page 7: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Fraction of nodes with degree >d at time t Fraction of nodes with degree >d at time t

qq

dqtidtptd

1

11)(

i dpqtid

iqptd 11)(

Fraction of nodes with degree d?1

kq q

1111111

11

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis

pqpp

1

7

Page 8: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

We observe how the connectivity (length of the paths) of the network changes as the vertices get removed g[Albert et al. 00; Palmer et al. 01]

Vertices can be removed: Uniformly at random In order of decreasing degreeIn order of decreasing degree

It is important for epidemiology Removal of vertices corresponds to p

vaccination

Jure Leskovec, Stanford CS322: Network Analysis10/13/2009 8

Page 9: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Real‐world networks are resilient to random attacks One has to remove all web‐pages of degree > 5 to disconnect the webOne has to remove all web pages of degree > 5 to disconnect the web But this is a very small percentage of web pages

Random network has better resilience to targeted attacks Random networkInternet (Autonomous systems)

h

Preferentialremoval

Random network( y )

path

leng

thM

ean Random

removal

Fraction of removed nodes Fraction of removed nodesJure Leskovec, Stanford CS322: Network Analysis10/13/2009 9

Page 10: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

G : Gnp: not realistic  B t i ll di t But gives small diameters

Small world:

Degree distribution of the

Small world: small diameter + local structure

l dgSmall world model But no power‐law degrees

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 10

Page 11: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Preferential Preferential attachment: Power law degree Power law degree distributions But no local But no local clustering

Can we get all ofCan we get all of them?

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 11

Page 12: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Preferential attachment is a network growth Preferential attachment is a network growthmodel

What governs the network growth and What governs the network growth and evolution? P1) Node arrival process: nodes enter the network P1) Node arrival process: nodes enter the network P2) Edge initiation process: each node decides 

when to initiate an edgewhen to initiate an edge P3) Edge destination process: determines 

destination after a node decides to initiatedestination after a node decides to initiate

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12

Page 13: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

[Leskovec et al. KDD 08]

4 online social networks with 4 online social networks with exact edge arrival sequence For every edge (u,v) we know exactFor every edge (u,v) we know exact time of the appearance tuv

Directly observe mechanisms leadingand so on for millions… Directly observe mechanisms leading 

to global network properties

(F)(D)(A)

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis

(A)(L)

13

Page 14: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

(F) (D)(F) (D)

Flickr: Exponential

Delicious: Linear

(A) (L)

Answers: S b li

LinkedIn: Q d ti

Jure Leskovec, Stanford CS322: Network Analysis

Sub‐linear Quadratic

10/13/2009 14

Page 15: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

How long do nodes live? How long do nodes live? How often nodes “wake up” to create edges?

How long do nodes live? Node life time is the time bet een the 1st and the Node life‐time is the time between the 1st and the last edge of a node

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 15

Page 16: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Lifetime aLi k dI Lifetime a: time between node’s first d l t d

LinkedInand last edge

Node lifetime is exponential: p(a) = λ exp(‐λa) 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 16

Page 17: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

How long do nodes live? How long do nodes live? How often nodes “wake up” to create edges? Edge gap δ(d): time between dth and d+1st edge Edge gap δ(d): time between dth and d+1st edge of a node: Let ti(d) be the creation time of d‐th edge of node iLet ti(d) be the creation time of d th edge of node i δi(d) = ti(d+1) ‐ ti(d) Then δ(d) is a distribution (histogram) of δi(d) over g iall nodes i

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 17

Page 18: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Edge gap δ(d):Edge gap δ(d): inter‐arrival time between 

LinkedIndth and d+1stedge

For every dwe get a different plotp

)()(),);(( dg eddp

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 18

Page 19: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

As the degree of the node degree increases As the degree of the node degree increases, how α and β change?

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 19

Page 20: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

α is const, β linear in d – gaps get smaller with dα is const, β linear in d  gaps get smaller with d

ty

Degreed=1

d=3d=2

Proba

bilit

d 1P

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis

Edge gap20

Page 21: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

What do we know so far? What do we know so far?

Process Our findingProcess Our finding

P1) Node arrival • Node arrival function is given

P2) Edge initiation• Node lifetime is exponential• Edge gaps get smaller as the g g g gdegree increases

Jure Leskovec, Stanford CS322: Network Analysis

P3) Edge destination

10/13/2009 21

Page 22: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Source node wakes up and creates an edge Source node wakes up and creates an edge How does it find a target node? What is the degree of the target? What is the degree of the target? Do preferential attachment really hold?

How many hops away if the target? How many hops away if the target? Are edges really local?

Wh t i  th  d   f th  t t?What is the degree of the target?

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 22

Page 23: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 23

Page 24: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

kk)( H  f  d   d   ?kkpe )(Network τ

How far do edges go?

Network τGnm 0PAPA 1F 1DD 1A 0.9L 0.6

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 24

Page 25: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 25

Page 26: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Fraction of triad closing edges

Network %Δ

F 66%

g g

F 66%

D 28%

A %A 23%

L 50%

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 26

Page 27: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

New triad closing edge (u w) appears next New triad‐closing edge (u,w) appears next  We model this as:1 Ch ’ i hb

wv’

1. Choose u’s neighbor v2. Choose v’s neighbor w

( )

uv

3. Connect (u,w) Compute edge prob. under Random‐R d ( )Random: p(u,w) = 

S f h ( )Score of a graph = p(u,w)

Jure Leskovec, Stanford CS322: Network Analysis10/13/2009 27

Page 28: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Impro ement o er the baseline Improvement over the baseline:Strategy to select v (1st node)

ode)

t  w (2

ndno

Select

Strategies to pick a neighbor:Strategies to pick a neighbor: random: uniformly at random deg: proportional to its degree com: prop. to the number of common friends u

wv

Jure Leskovec, Stanford CS322: Network Analysis

last: prop. to time since last activity comlast: prop. to com*last

v

10/13/2009 28

Page 29: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Th d lif i d d l d Theorem: node lifetimes and edge gaps lead to power law degree distribution

Interesting as temporal behavior predicts structural network propertystructural network property

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 29

Page 30: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Node lifetime: pl(a) = pl( ) Let a node have life‐time a, what will its final degree D be? g

D~

The two exponential functions “cancel out” pand the power law part remains

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 30

Page 31: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Given our model one can take an existing Given our model one can take an existing network continue its evolution

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 31

Page 32: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

Take Flickr at time T/2 and then further evolve it continue evolving it using PA and our model.g g

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 32

Page 33: Stanford Universitysnap.stanford.edu/na09/07-evolution-annot.pdf · 2009. 10. 13. · Stanford University ... 10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 12 [Leskovecet

How do networks evolve at the macroscopic How do networks evolve at the macroscopic level?

Are there global phenomena of network evolution?evolution?

What are analogs of the “small world” and What are analogs of the  small world  and “power‐law degrees”

10/13/2009 Jure Leskovec, Stanford CS322: Network Analysis 33