Construction Algorithms for Online Social Networks

Minas Gjoka, Balint Tillman, Athina Markopoulou

University of California, Irvine

Source: odysseas.calit2.uci.edu/lib/exe/fetch.php/public:2k_march2015.pdf


Graphs

[Figure: example graphs: Social Networks, Protein interactions, World Wide Web, Autonomous Systems, DNS]


Motivation

§  Measurements/sampling of OSNs
  •  http://odysseas.calit.uci.edu/osn/
  •  [INFOCOM 2010], [SIGMETRICS 2011], 3x [JSAC 2011], [WOSN 2012], …
  •  ~3500 researchers have requested our FB datasets
§  Generate synthetic graphs that resemble real social networks
  •  to use in simulations
  •  for anonymization
§  Q1: resemble in terms of what?
§  Q2: generate how?


dK-Series

§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  “A set of graph properties that describe and constrain random graphs, using degree correlations, in successively finer detail”


dK-Series

§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  0K specifies the average node degree

$\bar{k} = \frac{2|E|}{|V|}$


dK-Series

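§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  0K specifies the average node degree
  •  1K specifies the node degree sequence
    o  node degree “sequence” vs “distribution”

$D(k) = \sum_{a \in V_k} 1$

[Figure: example graph with nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b, each labeled by its degree]

1K: number of nodes of each degree in the example graph

k    1   2   3   4
#    2   1   2   2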


dK-Series

§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  0K specifies the average node degree
  •  1K specifies the node degree sequence
  •  2K specifies the joint node degree matrix (JDM)

$JDM(k,l) = \sum_{a \in V_k} \sum_{b \in V_l} 1_{\{\{a,b\} \in E\}}$

[Figure: example graph with nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b]

2K: JDM of the example graph ("-" marks an empty cell)

(k,l)   1   2   3   4
1       -   -   1   1
2       -   -   1   1
3       1   1   -   4
4       1   1   4   2
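The 0K, 1K, and 2K definitions can be checked on a small graph. The following is a minimal sketch, assuming a hypothetical edge list: it is one graph that is consistent with the 1K and 2K tables of the example, not necessarily the graph drawn on the slide. The JDM is stored as a full symmetric matrix, so a diagonal entry counts each edge twice, as in the double sum above.

```python
from collections import Counter

# Hypothetical edge list: one graph consistent with the slides' 1K and 2K tables.
edges = [("1a", "3a"), ("1b", "4a"), ("2a", "3b"), ("2a", "4b"),
         ("3a", "4a"), ("3a", "4b"), ("3b", "4a"), ("3b", "4b"), ("4a", "4b")]

# Node degrees, computed from the edges (not read off the node names).
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

# 0K: average node degree, 2|E| / |V|.
avg_degree = 2 * len(edges) / len(degree)

# 1K: D(k), the number of nodes with degree k.
degree_counts = Counter(degree.values())

# 2K: JDM(k, l), counted over ordered node pairs, so the matrix is symmetric
# and an edge between two degree-k nodes adds 2 to JDM(k, k).
jdm = Counter()
for a, b in edges:
    jdm[(degree[a], degree[b])] += 1
    jdm[(degree[b], degree[a])] += 1

print(avg_degree, dict(degree_counts), dict(jdm))
```

Each row of the JDM sums to k times the number of degree-k nodes, which is why 2K determines 1K (the "inclusion" property of the dK-series).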


dK-Series

§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  0K specifies the average node degree
  •  1K specifies the node degree distribution
  •  2K specifies the joint node degree matrix (JDM)
  •  3K specifies the number of induced subgraphs of 3 nodes
    o  nodes are labeled by their degree k

[Figure: the two 3-node induced subgraphs counted by 3K, with nodes labeled by degrees k1, k2, k3: #Wedges(k1,k2,k3) and #Triangles(k1,k2,k3)]
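As an illustration of 3K, the sketch below counts induced wedges and triangles by the degrees of their nodes. The edge list is a hypothetical graph consistent with the earlier 7-node example, and the key layout is an assumption made here: a wedge is keyed as (end degree, center degree, end degree) with the two ends sorted, and a triangle as a sorted degree triple.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical edge list (same assumption as for the 7-node example graph).
edges = [("1a", "3a"), ("1b", "4a"), ("2a", "3b"), ("2a", "4b"),
         ("3a", "4a"), ("3a", "4b"), ("3b", "4a"), ("3b", "4b"), ("4a", "4b")]

adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
deg = {v: len(nbrs) for v, nbrs in adj.items()}

wedges = Counter()   # open paths u - v - w, where u and w are not adjacent
closed = Counter()   # closed paths; each triangle is seen once per center node
for v in adj:
    for u, w in combinations(sorted(adj[v]), 2):
        if w in adj[u]:
            closed[tuple(sorted((deg[u], deg[v], deg[w])))] += 1
        else:
            lo, hi = sorted((deg[u], deg[w]))
            wedges[(lo, deg[v], hi)] += 1

# Every triangle was counted three times, once from each of its nodes.
triangles = {key: count // 3 for key, count in closed.items()}

print(dict(wedges))
print(triangles)
```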


dK-Series

§  dK-series framework [Mahadevan et al, Sigcomm ’06]
  •  0K specifies the average node degree
  •  1K specifies the node degree distribution
  •  2K specifies the joint node degree matrix (JDM)
  •  3K specifies the distribution of subgraphs of 3 nodes
  •  …
  •  nK specifies the entire graph
§  Nice properties
  •  Inclusion
  •  Convergence
  •  Tradeoff: accuracy vs. complexity (OSNs: “2K+”)


Related Work

§  Graph Construction Approaches:
  •  Stochastic: reproduces dK-distribution in expectation.
  •  Configuration (“pseudograph”): reproduces dK-distribution exactly.
    o  Deterministic algorithms up to d=2. MCMC for d>=2.
§  1K Construction
  •  Configuration: 1K multigraphs or simple graphs [Molloy’1995]
  •  1K+ [Bansal’2009, Newman’2009, Serrano & Boguna’2005, …]
  •  What else is known: conditions for 1K to be graphical [Erdos-Gallai, Havel]; space of graphs with degree sequence connected [Havel-Hakimi]; MCMC for sampling.
§  2K Construction
  •  Configuration model [Mahadevan’2006]
  •  Balance Degree Invariant: [Amanatidis’2008], [Stanton’2012], [Czabarka’2014].
  •  What else is known: conditions for 2K to be graphical [Amanatidis’08], [Stanton’12]; space of graphs with a given JDM is connected [Stanton’12], [Czabarka’14]; MCMC convergence speed is an open problem.
§  2K+ Construction
  •  2K preserving, 3K targeting using edge rewiring: [Mahadevan’2006]
  •  2.5K heuristic: JDM + degree-dependent clustering coefficient: [Gjoka’13]
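For the 1K case, the graphicality condition attributed above to Erdős and Gallai can be written down directly. This is a minimal sketch of that standard test (the function name is chosen here for illustration): a degree sequence is graphical if its sum is even and, after sorting in non-increasing order, every prefix satisfies the Erdős–Gallai inequality.

```python
def is_graphical(degrees):
    """Erdős–Gallai test: can `degrees` be realized by a simple graph?"""
    d = sorted(degrees, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2:
        return False
    for r in range(1, len(d) + 1):
        # Sum of the r largest degrees vs. the most edges they can absorb:
        # r(r-1) among themselves, plus min(d_i, r) from each remaining node.
        lhs = sum(d[:r])
        rhs = r * (r - 1) + sum(min(x, r) for x in d[r:])
        if lhs > rhs:
            return False
    return True


print(is_graphical([4, 4, 3, 3, 2, 1, 1]))  # degree sequence of the 7-node example
print(is_graphical([3, 3, 1, 1]))
```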


Our Contributions

§  New 2K Construction Algorithm
  §  can produce any simple graph with the exact JDMtarget in O(|E| dmax)
  §  Main benefit: no constraints in constructed graphs
§  2K+ Framework: JDMtarget + Additional Properties
  §  2K + Node Attributes (exactly)
  §  2K + Avg Clustering (approx)
  §  Main benefit: orders of magnitude faster than 2K+MCMC


2K Construction

JDMtarget ("-" marks an empty cell)

(k,l)   1   2   3   4
1       -   -   1   1
2       -   -   1   1
3       1   1   -   4
4       1   1   4   2

§  Input: Joint Degree Matrix JDMtarget
  •  JDMtarget must be graphical
§  Goal:
  •  Construct a simple graph with exactly JDMtarget
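The requirement that JDMtarget be graphical can be tested before construction. The sketch below assumes the conditions usually stated for joint degree matrices in the literature cited on the Related Work slide; the exact formulation, the function name, and the input layout (a dict keyed by (k, l) with k <= l, diagonal entries counting each edge twice as in the JDM definition) are assumptions made here.

```python
from collections import defaultdict

def jdm_is_graphical(jdm):
    """Check whether a JDM can be realized by a simple graph (assumed conditions).

    jdm maps (k, l), k <= l, to the symmetric-matrix entry JDM(k, l);
    a diagonal entry JDM(k, k) is twice the number of k-k edges.
    """
    row_sum = defaultdict(int)
    for (k, l), count in jdm.items():
        if count < 0:
            return False
        row_sum[k] += count
        if k != l:
            row_sum[l] += count
    # Row k accounts for all stubs of degree-k nodes, so |V_k| = row sum / k.
    if any(total % k for k, total in row_sum.items()):
        return False
    n = {k: total // k for k, total in row_sum.items()}
    for (k, l), count in jdm.items():
        if k == l:
            # At most |V_k|(|V_k| - 1) ordered pairs; the entry must be even.
            if count % 2 or count > n[k] * (n[k] - 1):
                return False
        elif count > n[k] * n[l]:
            return False
    return True


target = {(1, 3): 1, (1, 4): 1, (2, 3): 1, (2, 4): 1, (3, 4): 4, (4, 4): 2}
print(jdm_is_graphical(target))
```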


2K Construction

Initialize:
  1K: create nodes and stubs
  JDM(k,l)=0 for all k,l

[Figure: nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b with free stubs and no edges]

JDM/JDMtarget ("-" marks an empty cell)

(k,l)   1     2     3     4
1       -     -     0/1   0/1
2       -     -     0/1   0/1
3       0/1   0/1   -     0/4
4       0/1   0/1   0/4   0/2


2K Construction

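Initialize:
  1K: create nodes and stubs
  JDM(k,l)=0 for all k,l

[Figure: nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b after the first edge is added, between a degree-1 node and a degree-4 node]

JDM/JDMtarget ("-" marks an empty cell)

(k,l)   1     2     3     4
1       -     -     0/1   1/1
2       -     -     0/1   0/1
3       0/1   0/1   -     0/4
4       1/1   0/1   0/4   0/2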


2K Construction

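Initialize:
  1K: create nodes and stubs
  JDM(k,l)=0 for all k,l

[Figure: nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b after a second edge is added, between a degree-2 node and a degree-3 node]

JDM/JDMtarget ("-" marks an empty cell)

(k,l)   1     2     3     4
1       -     -     0/1   1/1
2       -     -     1/1   0/1
3       0/1   1/1   -     0/4
4       1/1   0/1   0/4   0/2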


2K Construction

Initialize:
  1K: create nodes and stubs
  JDM(k, l) = 0 for all k, l

Pick (k, l) degree pair, in any order
  While JDM(k, l) < JDMtarget(k, l)
    Pick (x, y): any pair of disconnected nodes with degrees k and l
    if x does not have free stubs: neighbor switch for x
    if y does not have free stubs: neighbor switch for y
    add edge between (x, y)
    JDM(k, l)++

[Figure: example graph with nodes 1a, 1b, 2a, 3a, 3b, 4a, 4b]

JDM/JDMtarget ("-" marks an empty cell)

(k,l)   1     2     3     4
1       -     -     0/1   1/1
2       -     -     1/1   0/1
3       0/1   1/1   -     0/4
4       1/1   0/1   0/4   0/2


Case 1: x, y both have free stubs

§  JDM(k, l) < JDMtarget(k, l)
§  node x has degree k
§  node y has degree l

[Figure: x (k=3) and y (l=4)]

Add edge between x and y


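Case 2: x has free stubs but y does not

§  JDM(k, l) < JDMtarget(k, l)
§  node x has degree k
§  node y has degree l

[Figure: x (k=3), y (l=4), b, and t]

1. Neighbor switch between y and b using t: b is a degree-l node with a free stub, and t is a neighbor of y that is not a neighbor of b. The switch replaces edge (y, t) with edge (b, t), which frees a stub at y and leaves the JDM unchanged.
2. Add edge between x and y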


Case 3: neither x nor y have free stubs

§  JDM(k, l) < JDMtarget(k, l)
§  node x has degree k
§  node y has degree l

[Figure: x (k=3), y (l=4), b1, t1, b2, and t2]

1. Neighbor switch between y and b1 using t1
2. Neighbor switch between x and b2 using t2
3. Add edge between x and y


Properties of 2K Algorithm

§  Terminates with exact JDMtarget in O(|E| dmax)
  •  It adds 1 edge at a time, while staying below JDMtarget
§  It can produce ALL graphs with the JDMtarget
§  Output graph depends on the order of adding edges
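The construction loop and the three cases can be put together in a short sketch. This is an illustrative reading of the slides, not the authors' implementation: the function name, the (degree, index) node labels, the deterministic "first disconnected pair" choice, and the input layout (a dict keyed by (k, l) with k <= l, diagonal entries counting each edge twice) are all assumptions made here, and the input is assumed to be graphical.

```python
from collections import defaultdict

def build_2k_graph(jdm_target):
    """Build a simple graph with exactly the given JDM (assumed graphical).

    Returns an adjacency dict {node: set of neighbors}; a node is (degree, index).
    """
    # 1K step: row k of the JDM sums to k * |V_k|; each node starts with k free stubs.
    row_sum = defaultdict(int)
    for (k, l), count in jdm_target.items():
        row_sum[k] += count
        if k != l:
            row_sum[l] += count
    nodes_of = {k: [(k, i) for i in range(total // k)] for k, total in row_sum.items()}
    free = {v: k for k, vs in nodes_of.items() for v in vs}
    adj = {v: set() for v in free}

    def neighbor_switch(w, other):
        # w has no free stubs: hand one of its edges to a same-degree node w2 that
        # has a free stub. w2 may be the other endpoint of the edge about to be
        # added only if it keeps a stub for that edge.
        w2 = next(u for u in nodes_of[w[0]] if free[u] > (1 if u == other else 0))
        # t: a neighbor of w that is neither w2 nor already a neighbor of w2.
        t = next(t for t in adj[w] if t != w2 and t not in adj[w2])
        adj[w].remove(t)
        adj[t].remove(w)
        adj[w2].add(t)
        adj[t].add(w2)
        free[w] += 1
        free[w2] -= 1

    for (k, l), count in jdm_target.items():          # degree pairs, in any order
        for _ in range(count // 2 if k == l else count):
            # Any disconnected pair with degrees k and l (here: the first one found).
            x, y = next((x, y) for x in nodes_of[k] for y in nodes_of[l]
                        if x != y and y not in adj[x])
            if free[x] == 0:
                neighbor_switch(x, y)
            if free[y] == 0:
                neighbor_switch(y, x)
            adj[x].add(y)
            adj[y].add(x)
            free[x] -= 1
            free[y] -= 1
    return adj


target = {(1, 3): 1, (1, 4): 1, (2, 3): 1, (2, 4): 1, (3, 4): 4, (4, 4): 2}
graph = build_2k_graph(target)
for node, nbrs in sorted(graph.items()):
    print(node, sorted(nbrs))
```

Which graph comes out depends on the order in which degree pairs and node pairs are visited; this sketch fixes one order for reproducibility, whereas a randomized order would sample different graphs with the same JDM.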


Space of constructed graphs

Example: all 7-node graphs

Generate all non-isomorphic 7-node simple graphs G1, .., G1044
→ All unique Joint Degree Matrices JDM1, ..., JDM768 (Input)
→ 2K Construction Algorithm
→ 768 synthetic graphs, not all distinct (Output)


Flexibility of 2K Algorithm

§  Output graph depends on the order of considering edges to add
§  It can produce ALL graphs with JDMtarget
§  Family of algorithms: add one edge at a time, while staying below JDMtarget
  •  any order of degree pairs (k,l)
  •  any order of node pairs (x,y), even before completing a degree pair
  •  can start with an empty or partially built graph
§  2K+: can target additional properties fast
  §  Previously known: space of graphs with JDMtarget is connected, but MCMC mixing is slow
  §  Property 1: clustering
  §  Property 2: attribute correlation


Extension 1: Target JDM + Clustering

[Figure: three graphs on four degree-2 nodes and two degree-3 nodes, all with the same JDM, containing 0 triangles, 1 triangle, and 2 triangles]

JDM

(k,l)   2   3
2       4   4
3       4   2

Intuition: by controlling the order in which we add edges, we can control clustering.


Extension 1: Target JDM + Clustering

[Figure: two graphs on nodes 2a, 2b, 2c, 2d, 3a, 3b with the same JDM (JDM(2,2) = 4, JDM(2,3) = JDM(3,2) = 4, JDM(3,3) = 2), one with 0 triangles and one with 2 triangles; the nodes are also shown placed on a circle]

§  Random order of node pairs → low clustering
§  [INFOCOM 2013]: add edges in increasing distance → high clustering
  •  place nodes randomly on a circle, consider node pairs’ distance


“Sortedness” of node pairs’ list controls clustering

•  Example: JDMtarget of Facebook Caltech Network
•  Consider many orders of node pairs → create graphs with JDMtarget → compute avg clustering c.

[Figure: nodes 2a, 2b, 2c, 2d, 3a, 3b placed on a circle]

[INFOCOM 2015]: control order of node pairs → control clustering


Extension 1: Target JDM + Clustering

2K + Avg Clustering
Input: target JDM, avg clustering coefficient c

Stage 1
  E’ = list of node pairs s.t. sortedness(E’) ≈ S(c)
  FOR each candidate node pair (v, w) in E’:
    IF both nodes v and w have free stubs and the corresponding JDM(k, l) < JDMtarget(k, l):
      add edge (v, w)

Stage 2
  IF not all |E| edges have been added:
    add remaining edges using 2K_Simple
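Stage 1 can be sketched as a single pass over an ordered list of candidate node pairs. The sketch below covers only that pass and makes several assumptions: the slides do not define sortedness or S(c), so the ordering used here is simply "pairs by increasing distance between randomly placed nodes on a circle" from the earlier slide; the function names are hypothetical; and the JDM is keyed by (k, l) with k <= l, with diagonal entries counting each edge twice. Pairs skipped in this pass are left for Stage 2.

```python
import random
from collections import Counter
from itertools import combinations

def circle_order(nodes, rng):
    """Assumed ordering: random positions on a circle, pairs sorted by distance."""
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    pos = {v: i for i, v in enumerate(shuffled)}
    n = len(shuffled)

    def distance(pair):
        d = abs(pos[pair[0]] - pos[pair[1]])
        return min(d, n - d)

    return sorted(combinations(nodes, 2), key=distance)

def stage1(degree, jdm_target, ordered_pairs):
    """Add an edge only if both nodes have free stubs and JDM(k, l) stays within target."""
    free = dict(degree)
    placed = Counter()
    edges = []
    for v, w in ordered_pairs:
        k, l = sorted((degree[v], degree[w]))
        step = 2 if k == l else 1  # a k-k edge adds 2 to the diagonal entry
        if free[v] > 0 and free[w] > 0 and placed[(k, l)] + step <= jdm_target.get((k, l), 0):
            edges.append((v, w))
            free[v] -= 1
            free[w] -= 1
            placed[(k, l)] += step
    return edges, free, placed


degree = {"2a": 2, "2b": 2, "2c": 2, "2d": 2, "3a": 3, "3b": 3}
jdm_target = {(2, 2): 4, (2, 3): 4, (3, 3): 2}
pairs = circle_order(list(degree), random.Random(0))
edges, free, placed = stage1(degree, jdm_target, pairs)
remaining = sum(free.values()) // 2  # edges left for Stage 2
print(edges, remaining)
```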


Real world examples: target JDM + avg clustering

[Figure panels: Average Clustering Coefficient; Average Node Shortest Path Length; Average Node Closeness]


Real world examples: target JDM + avg clustering

2K+MCMC did not finish after several days


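Extension 2: Node Attributes

[Figure: two graphs with node attributes and the same JDM]

JDM (both graphs; "-" marks an empty cell)

(k,l)   1   2
1       -   2
2       2   6

JAM: Joint Attribute Matrix (or Attribute Mixing Matrix); rows and columns are the two attribute values

2   2
2   4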


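Extension 2: Node Attributes Mixing

[Figure: two graphs with node attributes; same JDM, different JAM]

JDM (both graphs; "-" marks an empty cell)

(k,l)   1   2
1       -   2
2       2   6

JAM: Joint Attribute Matrix (or Attribute Mixing Matrix); rows and columns are the two attribute values

First graph:
2   2
2   4

Second graph:
4   -
-   6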


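Extension 2: Degree+Attribute Mixing

[Figure: two graphs with node attributes; same JDM, different JAM and JDAM]

JDM (both graphs; "-" marks an empty cell)

(k,l)   1   2
1       -   2
2       2   6

JAM; rows and columns are the two attribute values

First graph:
2   2
2   4

Second graph:
4   -
-   6

Joint Degree and Attribute Matrix (JDAM); rows and columns are degree 1, degree 2, degree 2, where the two degree-2 classes differ in attribute value

First graph:
     1   2   2
1    -   1   1
2    1   -   1
2    1   1   4

Second graph:
     1   2   2
1    -   2   -
2    2   -   -
2    -   -   6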


Extension 2: target JDAM

[Figure: the two example graphs with their Joint Degree and Attribute Matrices (JDAM), as on the previous slide]

2K Algorithm also works for target JDAM
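The JDM, JAM, and JDAM are the same kind of count with different node labels (degree, attribute, or the pair of both), which is one way to see why the same construction can target a JDAM. The sketch below illustrates this with a hypothetical 6-node path whose matrices agree with the first example graph; the node names, the attribute values "A" and "B", and the function name are assumptions made here.

```python
from collections import Counter

def joint_matrix(edges, label):
    """Count edges by the labels of their endpoints, as a full symmetric matrix.

    An edge between two nodes with the same label adds 2 to the diagonal entry,
    matching the JDM convention used on the slides.
    """
    m = Counter()
    for a, b in edges:
        m[(label[a], label[b])] += 1
        m[(label[b], label[a])] += 1
    return m


# Hypothetical graph: a path u0 - u1 - u2 - u3 - u4 - u5 with two attribute values.
edges = [("u0", "u1"), ("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u4", "u5")]
attr = {"u0": "A", "u1": "A", "u2": "B", "u3": "B", "u4": "B", "u5": "A"}

degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

jdm = joint_matrix(edges, degree)                                   # label = degree
jam = joint_matrix(edges, attr)                                     # label = attribute
jdam = joint_matrix(edges, {v: (degree[v], attr[v]) for v in attr}) # label = both

print(dict(jdm))
print(dict(jam))
print(dict(jdam))
```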


Real world examples: graphs with node attributes

[Figure panels: Average Clustering Coefficient; Average Node Shortest Path Length; Average Node Closeness]


Real world examples: small graphs with node attributes

Simulation takes ~1 day to target 2K and c = 0.24 with MCMC (using double edge swaps)


Construction of 2K+ Graphs

§  New 2K Construction Algorithm
  •  can produce any simple graph with exact JDMtarget in O(|E| dmax)
§  2K+ Framework: JDMtarget + Additional Properties
  §  Extension 1: 2K (exactly) + Avg Clustering (approx)
  §  Extension 2: 2K (exactly) + Node Attributes (exactly)
§  Future directions
  §  Construction: target attributes + structure (towards 3K)
  §  Applications to privacy

http://odysseas.calit2.uci.edu/osn/


Construction of 2K+ Graphs

Questions?