Lecture 2: Network properties CS 790g: Complex Networks Slides are modified from Networks: Theory...

Post on 18-Jan-2016

218 views 0 download

Tags:

Transcript of Lecture 2: Network properties CS 790g: Complex Networks Slides are modified from Networks: Theory...

Lecture 2:

Network properties

CS 790g: Complex Networks

Slides are modified from Networks: Theory and Application by Lada Adamic

Outline

What is a network? a bunch of nodes and edges

How do you characterize it? with some basic network metrics

Network models

2

What are networks?

Networks are collections of points joined by lines.

“Network” ≡ “Graph”

points lines

vertices edges, arcs math

nodes links computer science

sites bonds physics

actors ties, relations sociology

node

edge

3

Network elements: edges

Directed (also called arcs) A -> B

A likes B, A gave a gift to B, A is B’s child

Undirected A <-> B or A – B

A and B like each other A and B are siblings A and B are co-authors

Edge attributes weight (e.g. frequency of communication) ranking (best friend, second best friend…) type (friend, relative, co-worker) properties depending on the structure of the rest of the graph:

e.g. betweenness

4

Directed networks

2

1

1

2

1

2

1

2

1

2

21

1

2

1

2

1

2

12

1

2

1

2

1

2

1

21

2 1

2

1

2

12 1

2

1

2

12

1

2

12

1

2

1 2

12

Ada

Cora

Louise

Jean

Helen

Martha

Alice

Robin

Marion

Maxine

Lena

Hazel Hilda

Frances

Eva

RuthEdna

Adele

Jane

Anna

Mary

Betty

Ella

Ellen

Laura

Irene

girls’ school dormitory dining-table partners (Moreno, The sociometry reader, 1960)

first and second choices shown

5

Edge weights can have positive or negative values

One gene activates/ inhibits another

One person trusting/ distrusting another

Research challenge: How does one

‘propagate’ negative feelings in a social network?

Is my enemy’s enemy my friend?

Transcription regulatory network in baker’s yeast

6

Adjacency matrices

Representing edges (who is adjacent to whom) as a matrix Aij = 1 if node i has an edge to node j

= 0 if node i does not have an edge to j

Aii = 0 unless the network has self-loops

Aij = Aji if the network is undirected,or if i and j share a reciprocated edge

ij

i

ij

1

2

3

4

Example:

5

0 0 0 0 0

0 0 1 1 0

0 1 0 1 0

0 0 0 0 1

1 1 0 0 0

A =

7

Adjacency lists

Edge list 2 3 2 4 3 2 3 4 4 5 5 2 5 1

Adjacency list is easier to work with if network is

large sparse

quickly retrieve all neighbors for a node 1: 2: 3 4 3: 2 4 4: 5 5: 1 2

1

2

3

45

8

Nodes

Node network properties from immediate connections

indegreehow many directed edges (arcs) are incident on a node

outdegreehow many directed edges (arcs) originate at a node

degree (in or out)number of edges incident on a node

outdegree=2

indegree=3

degree=5

9

Node degree from matrix values

Outdegree =0 0 0 0 0

0 0 1 1 0

0 1 0 1 0

0 0 0 0 1

1 1 0 0 0

A =

n

jijA

1

example: outdegree for node 3 is 2, which we obtain by summing the number of non-zero entries in the 3rd row

Indegree =0 0 0 0 0

0 0 1 1 0

0 1 0 1 0

0 0 0 0 1

1 1 0 0 0

A =

n

iijA

1

example: the indegree for node 3 is 1, which we obtain by summing the number of non-zero entries in the 3rd column

n

iiA

13

n

jjA

13

1

2

3

45

10

Characterizing networks:Is everything connected?

11

Network metrics: connected components

Strongly connected components Each node within the component can be reached from every other node

in the component by following directed links

Strongly connected components B C D E A G H F

Weakly connected components: every node can be reached from every other node by following links in either direction

A

B

C

DE

FG

H

A

B

C

DE

FG

H

Weakly connected components A B C D E G H F

In undirected networks one talks simply about ‘connected components’

12

network metrics: size of giant component

if the largest component encompasses a significant fraction of the graph, it is called the giant component

13

Outline

What is a network? a bunch of nodes and edges

How do you characterize it? with some basic network metrics

Network models

14

Structural Metrics

Degree distribution

Average path length

Centrality Betweenness Closeness

Graph density Clustering coefficient

Several other graph metrics exist Assortativity Modularity …

15

degree sequence and degree distribution

Degree sequence: An ordered list of the (in,out) degree of each node

In-degree sequence: [2, 2, 2, 1, 1, 1, 1, 0]

Out-degree sequence: [2, 2, 2, 2, 1, 1, 1, 0]

(undirected) degree sequence: [3, 3, 3, 2, 2, 1, 1, 1]

Degree distribution: A frequency count of the occurrence of each degree

In-degree distribution: [(2,3) (1,4) (0,1)]

Out-degree distribution: [(2,4) (1,3) (0,1)]

(undirected) distribution: [(3,3) (2,2) (1,3)]

0 1 20

1

2

3

4

5

indegree

fre

qu

en

cy

16

Structural Metrics:Degree distribution

17

Characterizing networks:How far apart are things?

18

Structural metrics: Average path length

19

Characterizing networks:Who is most central?

?

?

?

20

Centrality: betweenness

The fraction of all directed paths between any two vertices that pass through a node

Normalization undirected: (N-1)*(N-2)/2 directed graph: (N-1)*(N-2) e.g.

CB (ni) g jkj ,k

(i) /g jk

betweenness of vertex i paths between j and k that pass through i

all paths between j and k

CB

' (ni) CB(i) /[(N 1)(N 2)]

21

Centrality: closeness

How close the vertex is to others depends on inverse distance to other vertices

Cc (i) d(i , j)j1

g

1

CC' (i) (CC (i))(n 1)

Normalization

22

network metrics: graph density

Of the connections that may exist between n nodes directed graph

emax = n*(n-1)

undirected graphemax = n*(n-1)/2

What fraction are present? density = e/ emax

For example, out of 12 possible connections,

this graph has 7, giving it a density of 7/12 = 0.583

Would this measure be useful for comparing networks of different sizes (different numbers of nodes)?

23

Structural Metrics:Clustering coefficient

24

Outline

What is a network? a bunch of nodes and edges

How do you characterize it? with some basic network metrics

Network models

25

Four structural models

Regular networks

Random networks

Small-world networks

Scale-free networks

26

Regular networks – fully connected

27

Regular networks – Lattice

28

Regular networks – Lattice: ring world

29

modeling networks: random networks

Nodes connected at random Number of edges incident on each node is Poisson

distributed

Poisson distribution

30

Random networks

31

Erdos-Renyi random graphs

What happens to the size of the giant component as the density of the network increases?

http://ccl.northwestern.edu/netlogo/models/run.cgi?GiantComponent.884.53432

Random Networks

33

modeling networks: small worlds

Small worlds a friend of a friend is also

frequently a friend but only six hops separate any

two people in the world

Arnold S. – thomashawk, Flickr;

http://creativecommons.org/licenses/by-nc/2.0/deed.en34

Small world models

Duncan Watts and Steven Strogatz a few random links in an otherwise structured graph make the

network a small world: the average shortest path is short

regular lattice:

my friend’s friend is

always my friend

small world:

mostly structured

with a few random

connections

random graph:

all connections

random

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442. 35

Watts Strogatz Small World Model

As you rewire more and more of the links and random, what happens to the clustering coefficient and average shortest path relative to their values for the regular lattice?

http://projects.si.umich.edu/netlearn/NetLogo4/SmallWorldWS.html36

Small-world networks

37

Scale-free networks

38

Scale-free networks

Many real world networks contain hubs: highly connected nodes

Usually the distribution of edges is extremely skewed

many nodes with few edges

fat tail: a few nodes with a very large numberof edges

no “typical” number of edges

number of edges

num

ber

of n

odes

with

so

man

y ed

ges

39

But is it really a power-law?

A power-law will appear as a straight line on a log-log plot:

A deviation from a straight line could indicate a different distribution: exponential lognormal

log(# edges)

log(

# n

odes

)

40

Scale-free networks

41