“Adversarial Deletion in Scale Free Random Graph Process” by A.D. Flaxman et al.
Hammad Iqbal
CS 3150
24 April 2006
Talk Overview
1. Background
   1. Large graphs
   2. Modeling large graphs
2. Robustness and Vulnerability
   1. Problem and Mechanism
   2. Main Results
3. Adversarial Deletions During Graph Generation
   1. Results
   2. Graph Coupling
   3. Construction of the proofs
Large Graphs
- Modeling of large graphs began attracting serious interest only recently, in the 1990s
- Driven by the computerization of data acquisition and greater computing power
- Theoretical models are still being developed
- Modeling difficulties include:
  - Heterogeneity of elements
  - Non-local interactions
Large Graphs: Examples
- Hollywood graph: 225,000 actors as vertices; an edge connects two actors if they were cast in the same movie
- World Wide Web: 800 million pages as vertices; links from one page to another are the edges
- Citation patterns of scientific publications
- Electrical power grid of the US
- Nervous system of the nematode worm Caenorhabditis elegans
Small World of Large Graphs
Large naturally occurring graphs tend to show:
- Sparsity: the Hollywood graph has 13 million edges, versus ~25 billion for a clique on 225,000 vertices
- Clustering: in the WWW, two pages that link to the same page have a higher probability of linking to one another
- Small diameter: ~ log n
D.J. Watts and S.H. Strogatz, Collective dynamics of 'small-world' networks, Nature (1998)
Talk Overview
1. Background
   1. Large graphs
   2. Modeling large graphs
2. Robustness and Vulnerability
   1. Problem and Mechanism
   2. Main Results
3. Adversarial Deletions During Graph Generation
   1. Results
   2. Graph Coupling
   3. Construction of the proofs
Erdős-Rényi Random Graphs
- Developed around 1960 by Hungarian mathematicians Paul Erdős and Alfréd Rényi
- The traditional model of large-scale graphs
- G(n, p): a graph on the vertex set [n] = {1, …, n} where each pair of vertices is joined independently with probability p
- Weaknesses:
  - Fixed number of vertices
  - No clustering
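A minimal sketch of sampling from G(n, p); the function name and the edge-list representation are my own choices, not from the talk:

```python
import random

def gnp(n, p):
    """Sample an Erdos-Renyi graph G(n, p): each of the C(n, 2) possible
    edges on the vertex set {0, ..., n-1} is included independently
    with probability p."""
    return [(u, v)
            for u in range(n)
            for v in range(u + 1, n)
            if random.random() < p]

# Example: n = 1000 vertices with expected degree about 5.
edges = gnp(1000, 5 / 999)
```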
Barabási Model
- Incorporates growth and preferential attachment
- Evolves to a steady 'scale-free' state: the distribution of node degrees does not change over time
- Probability of finding a vertex with k edges ~ k^(-3)
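A minimal sketch of growth with preferential attachment, assuming the simplest variant of the model (repeated edges allowed; function name is mine):

```python
import random

def barabasi_albert(n, m):
    """Grow a graph by preferential attachment: each new vertex attaches
    m edges to existing vertices chosen with probability proportional to
    their current degree.  Sampling uniformly from a list that holds one
    entry per edge endpoint realizes the degree bias."""
    edges = []
    endpoints = list(range(m))          # seed vertices
    for v in range(m, n):
        # repeats are possible in this simplest variant of the model
        targets = [random.choice(endpoints) for _ in range(m)]
        for t in targets:
            edges.append((v, t))
            endpoints += [v, t]         # both endpoints gain a degree unit
    return edges

# Example: the empirical degree tail of this graph approximates ~ k^(-3).
edges = barabasi_albert(10_000, 3)
```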
Degree Distribution
Scale free:
- P[X ≥ k] ~ c k^(-α)
- Power-law distributed
- Heavy tail

Erdős-Rényi:
- P[X = k] = e^(-λ) λ^k / k!, where λ depends on n
- Poisson distributed
- Decays rapidly for large k: P[X ≥ k] → 0
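To make "decays rapidly" concrete, here is a standard Chernoff-type bound on the Poisson upper tail, set against the scale-free tail (the bound is a well-known fact, not from the slides):

```latex
% Poisson (Erdos-Renyi) upper tail, valid for k > \lambda:
P[X \ge k] \;\le\; e^{-\lambda}\left(\frac{e\lambda}{k}\right)^{k}
\qquad \text{vs. the scale-free tail} \qquad
P[X \ge k] \;\sim\; c\,k^{-\alpha}.
```

The Poisson tail dies super-exponentially in k, while the power law decays only polynomially; that is the heavy tail.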
Exponential (ER) vs Scale Free
[Figure: two networks, each with 130 vertices and 430 edges; red = the 5 highest-connected vertices, green = neighbors of red. Albert, Jeong, Barabási 2000]
Degree Sequence of WWW
- In-degree of WWW pages is power-law distributed: ~ x^(-2.1)
- Out-degree: ~ x^(-2.45)
- Average path length between nodes: ~ 16
Talk Overview
1. Background
   1. Large graphs
   2. Modeling large graphs
2. Robustness and Vulnerability
   1. Problem and Mechanism
   2. Main Results
3. Adversarial Deletions During Graph Generation
   1. Results
   2. Graph Coupling
   3. Construction of the proofs
Robustness and Vulnerability
- Many complex systems display inherent tolerance against random failures
- Examples: genetic systems, communication systems (Internet)
- Redundant wiring is common but not the only factor
- This tolerance is shown only by scale-free graphs (Albert, Jeong, Barabási 2000)
Inverse Bond Percolation
- What happens when a fraction p of the edges is removed from a graph?
- Threshold probability pc(N): the graph remains connected if the edge-removal probability p < pc(N)
- Corresponds to infinite-dimensional percolation
- The effect is worse for node removal
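A minimal simulation sketch of this question (union-find and the helper names are my own, not from the talk): remove each edge independently with probability p and watch the largest component.

```python
import random
from collections import defaultdict

def largest_component(n, edges):
    """Size of the largest connected component, via union-find."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    counts = defaultdict(int)
    for v in range(n):
        counts[find(v)] += 1
    return max(counts.values())

def bond_percolation(n, edges, p):
    """Delete each edge independently with probability p and return the
    size of the largest surviving component."""
    kept = [e for e in edges if random.random() >= p]
    return largest_component(n, kept)
```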
General Mechanism
- Barabási (2000): compare networks with the same number of nodes and edges, differing only in degree distribution
- Two types of node removals (simulated in the sketch below):
  - Randomly selected nodes
  - Highly connected nodes (worst case)
- Study parameters:
  - S: size of the largest remaining cluster (giant component)
  - l: average path length
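A sketch of the two removal strategies, assuming the same edge-list representation as above (helper names are mine):

```python
import random
from collections import defaultdict

def remove_nodes(n, edges, fraction, targeted):
    """Delete fraction * n vertices, either uniformly at random or in
    decreasing order of degree (the 'worst case' above), and return the
    surviving edges."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    k = int(fraction * n)
    if targeted:
        doomed = set(sorted(range(n), key=lambda v: -degree[v])[:k])
    else:
        doomed = set(random.sample(range(n), k))
    return [(u, v) for u, v in edges
            if u not in doomed and v not in doomed]
```

Passing the result to largest_component from the previous sketch gives S; on a scale-free graph the targeted variant shrinks S dramatically, while the random one barely changes it.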
Main Results (deletion occurs after generation)
[Plot: S and l as the fraction of removed nodes grows; □ = random node removal, ○ = preferential node removal]
Why is this important?
Talk Overview
1. Background
   1. Large graphs
   2. Modeling large graphs
2. Robustness and Vulnerability
   1. Problem and Mechanism
   2. Main Results
3. Adversarial Deletions During Graph Generation
   1. Results
   2. Graph Coupling
   3. Construction of the proofs
Main Result
- Time steps {1, …, n}
- At each step, a new vertex arrives with m edges attached by preferential attachment
- Total number of deleted vertices ≤ δn (chosen adversarially)
- m >> δ (m sufficiently large relative to δ)
- Then whp there is a component of size ≥ n/30
Formal Statements
Theorem 1. For any sufficiently small constant δ there exists a sufficiently large constant m = m(δ) and a constant θ = θ(δ, m) such that whp Gn has a "giant" connected component of size at least θn.
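A sketch of the generative process behind Theorem 1; the function and callback names are mine, and the bookkeeping is a simplification, not the paper's exact model:

```python
import random

def adversarial_pa(n, m, delta, adversary):
    """At each step a new vertex joins with m edges chosen by
    preferential attachment, then an adversary may delete vertices,
    subject to a total budget of delta * n deletions."""
    alive = set(range(m))
    weights = list(range(m))      # one entry per unit of degree (seed hack)
    edges = []
    budget = int(delta * n)
    for step in range(m, n):
        # degree-proportional choice among live vertices; assumes the
        # adversary never kills every live vertex (true for small delta
        # once the graph has grown)
        targets = []
        while len(targets) < m:
            t = random.choice(weights)
            if t in alive:
                targets.append(t)
        edges.extend((step, t) for t in targets)
        weights.extend(targets)
        weights.extend([step] * m)
        alive.add(step)
        # adversary(step, alive) -> set of vertices to delete this step
        for d in adversary(step, alive):
            if budget > 0 and d in alive:
                alive.discard(d)
                budget -= 1
    return alive, [(u, v) for u, v in edges if u in alive and v in alive]

# Example adversary: delete the newest vertex every 100 steps.
alive, live_edges = adversarial_pa(
    10_000, m=10, delta=0.01,
    adversary=lambda step, alive: {step} if step % 100 == 0 else set())
```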
Graph Coupling
[Figure: a random graph G(n', p) coupled with the scale-free graph; red = vertices of the induced subgraph Γn]
Informal Proof Construction
- A random graph can be tightly coupled with the scale-free graph on the induced subset (Theorem 2)
- Deleting few edges from a random graph with relatively many edges still leaves a giant connected component (Lemma 1)
- whp there will be a sufficient number of vertices for the construction of the induced subset (Lemma 2)
Formal Statements
Theorem 2. We can couple the construction of Gn and a random graph Hn such that Hn ~ G(Γn, p) and whp
e(Hn \ Gn) ≤ A e^(-Bm) n.
That is, the difference between the edge sets of Gn and Hn decreases exponentially in m, the number of edges added per step.
Induced Subgraph Properties
Vertex classification at each time step t:
- Good if created after t/2 and the number of its original edges that remain undeleted is ≥ m/6
- Bad otherwise
- Γt = set of good vertices at time t
- A good vertex can become bad; a bad vertex remains bad
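The classification as executable pseudocode; the dictionaries birth_time and surviving_original_edges are hypothetical bookkeeping, not named in the talk:

```python
def good_vertices(alive, t, birth_time, surviving_original_edges, m):
    """Gamma_t: the set of live vertices that are good at time t, i.e.
    created after t/2 with at least m/6 of their m original edges still
    undeleted.  Goodness can only be lost: the t/2 cutoff advances and
    edges only get deleted, so a bad vertex stays bad."""
    return {v for v in alive
            if birth_time[v] > t / 2
            and surviving_original_edges[v] >= m / 6}
```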
Proof of Theorem 2: Construction
- H_{n/2} ~ G(Γ_{n/2}, p)
- For k > n/2, both Gk and Hk are constructed inductively:
  - Gk is generated by the preferential attachment model
  - Hk is constructed by connecting a new vertex to the vertices that are good in Gk
- A difference can only arise in case of 'failure'
Proof of Theorem 2: Type 0 failure
- Occurs if there are not enough good vertices in Gk
- By Lemma 2, whp γt ≥ t/10, so the probability of occurrence is o(1)
- If it does occur, generate Gn and Hn independently

Proof of Theorem 2: Type 1 failure
- Occurs if not enough good vertices are chosen by x_{k+1} in Gk
- Let r = number of good vertices selected, and let P[a given vertex is good] = ε0
- Failure if r ≤ (1-δ) ε0 m
- Upper bound:
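The bound itself was an image on the original slide and did not survive transcription. Assuming r is binomial, r ~ Bi(m, ε0), as the preceding bullets suggest, a standard Chernoff bound gives an estimate of the required A e^(-Bm) form (my reconstruction, not verified against the paper):

```latex
P\bigl[\, r \le (1-\delta)\,\varepsilon_0 m \,\bigr]
  \;\le\; \exp\!\Bigl(-\tfrac{\delta^{2}\varepsilon_0 m}{2}\Bigr).
```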
Proof of Theorem 2: Type 2 failure
- Occurs if the number of good vertices chosen by x_{k+1} in Gk is less than the number of random vertices generated in Hk
- X ~ Bi(r, ε0) and Y ~ Bi(γk, p)
- Failure if Y > X
- Upper bound on the type 2 failure probability: A e^(-Bm)
Proof of Theorem 2: Coupling and deletion
- Take a random subset of size Y of the good vertices chosen in Gk and connect them to the new vertex in Hk
- Delete from Hk the vertices that the adversary deletes in Gk
- Then Hn ~ G(Γn, p)
- A difference can only occur due to failure
Proof of Theorem 2: Bound on failures
- Probability of failure at each step: A e^(-Bm)
- Expected total number of misplaced edges added: E[M] ≤ A e^(-Bm) n
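The slides jump from this expectation to a whp statement; the missing step is just Markov's inequality (my gloss, not on the slides):

```latex
P\bigl[\, M \ge e^{-Bm/2}\, n \,\bigr]
  \;\le\; \frac{E[M]}{e^{-Bm/2}\, n}
  \;\le\; A e^{-Bm/2},
```

so for large m the misplaced edges are whp a vanishing fraction of n.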
Lemma 1: Statement
Let G be obtained by deleting fewer than n/100 edges from a realization of G_{n, c/n}. If c ≥ 10, then whp G has a component of size at least n/3.
Proof of Lemma 1
- Consider a set S in G_{n, c/n} of size s with n/3 ≤ s ≤ n/2
- Show that P[at most n/100 edges join S to the remaining n - s vertices] is small
- E[number of edges across this cut] = s(n-s) c / n
- Pick some ε so that n/100 ≤ (1-ε) s(n-s) c / n
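Plugging in the extreme values shows how much room the cut bound has (the arithmetic is mine, using the slide's quantities):

```latex
E[\#\{\text{edges across the cut}\}] \;=\; \frac{s(n-s)\,c}{n}
  \;\ge\; \frac{(n/3)(2n/3)\cdot 10}{n} \;=\; \frac{20n}{9} \;\gg\; \frac{n}{100},
```

so for a fixed S a Chernoff bound makes the event exponentially unlikely, and a union bound over the at most 2^n candidate sets S finishes the proof.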
Lemma 2: Statement and Notation
- whp γt ≥ t/10 for n/2 < t ≤ n
- Let zt = number of deleted vertices
- Let ν't = number of vertices in Gt
- It is sufficient to show that: [formula lost in transcription]
Proof of Lemma 2: Coupling
- Couple two generative processes:
  - P: the adversary deletes vertices at each time step
  - P*: no vertices are deleted until time t, and then the same vertices as in P are deleted
- A difference can only occur because of 'failure'
- Derive an upper bound on zt(P*)
Theorem 1: Statement
For any sufficiently small constant δ there exists a sufficiently large constant m = m(δ) and a constant θ = θ(δ, m) such that whp Gn has a "giant" connected component of size at least θn.
Proof of Theorem 1
- Let G1 = Gn and G2 = G(Γn, p)
- Let G = G1 ∩ G2
- e(G2 \ G) ≤ A e^(-Bm) n by Theorem 2
- whp |Γn| = γn ≥ n/10 by Lemma 2
- Choose m large enough that p > 10/γn
- The result follows by Lemma 1
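A reconstruction of how the constants combine in this last step (my arithmetic, following the slide's quantities):

```latex
\gamma_n \ge \frac{n}{10}, \qquad p > \frac{10}{\gamma_n}
\;\Longrightarrow\;
G(\Gamma_n, p) = G_{\gamma_n,\, c/\gamma_n}\ \text{with}\ c = p\,\gamma_n > 10,
```

and for m large, Theorem 2's A e^(-Bm) n misplaced edges number fewer than γn/100, so Lemma 1 applies to G and yields a component of size at least γn/3 ≥ n/30, matching the n/30 in the Main Result.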