
Page 1: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Locality Sensitive Distributed Computing

David Peleg, Weizmann Institute

Page 2: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Structure of mini-course

1. Basics of distributed network algorithms

2. Locality-preserving network representations

3. Constructions and applications

Page 3: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Part 2: Representations

1. Clustered representations
• Basic concepts: clusters, covers, partitions
• Sparse covers and partitions
• Decompositions and regional matchings

2. Skeletal representations

• Spanning trees and tree covers
• Sparse and lightweight spanners

Page 4: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Basic idea of locality-sensitive distributed computing

Utilize locality to both
• simplify control structures and algorithms, and
• reduce their costs

Operation performed in large network may concern few processors in small region

(Global operation may have local sub-operations)

Reduce costs by utilizing “locality of reference”

Page 5: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Components of locality theory

• General framework, complexity measures and algorithmic methodology

• Suitable graph-theoretic structures and efficient construction methods

• Adaptation to wide variety of applications

Page 6: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Fundamental approach

Clustered representation:
• Impose a clustered hierarchical organization on the given network
• Use it efficiently for bounding the complexity of distributed algorithms.

Skeletal representation:
• Sparsify the given network
• Execute applications on the remaining skeleton, reducing complexity

Page 7: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clusters, covers and partitions

Cluster = connected subset of vertices S ⊆ V

Page 8: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clusters, covers and partitions

Cover of G(V,E,ω) = collection of clusters 𝒮 = {S1,...,Sm} containing all vertices of G

(i.e., s.t. ∪i Si = V).

Page 9: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partitions

Partial partition of G = collection of disjoint clusters 𝒮 = {S1,...,Sm}, i.e., s.t. Si ∩ Sj = ∅ for i ≠ j

Partition = cover & partial partition

Page 10: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Evaluation criteria

Locality and Sparsity

Locality level: cluster radius

Sparsity level: vertex / cluster degrees

Page 11: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Evaluation criteria

Locality - sparsity tradeoff:

locality and sparsity parametersgo opposite ways:

better sparsity ⇔ worse locality (and vice versa)

Page 12: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Evaluation criteria

Locality measures

Weighted distances:

Length of path (e1,...,es) = ∑1≤i≤s ω(ei)

dist(u,w,G) = (weighted) length of shortest path

dist(U,W) = min{ dist(u,w) | u∈U, w∈W }

Page 13: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Evaluation criteria

Diameter, radius: As before, except weighted

Denote logD = ⌈log Diam(G)⌉

For a collection of clusters 𝒮:

• Diam(𝒮) = maxi Diam(Si)

• Rad(𝒮) = maxi Rad(Si)

Page 14: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Neighborhoods

Γ(v) = neighborhood of v = set of v's neighbors in G (including v itself)

(Figure: a vertex v and its neighborhood Γ(v))

Page 15: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Neighborhoods

Γρ(v) = ρ-neighborhood of v = vertices at distance ρ or less from v

(Figure: nested neighborhoods Γ0(v) ⊆ Γ1(v) ⊆ Γ2(v))

Page 16: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Neighborhood covers

For W ⊆ V:

Γ̂ρ(W) = ρ-neighborhood cover of W = { Γρ(v) | v∈W }

(the collection of ρ-neighborhoods of W's vertices)
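To make the definition concrete, here is a minimal Python sketch (not from the course; the adjacency-dict graph format and the function names are illustrative assumptions) that computes Γρ(v) by BFS and assembles Γ̂ρ(W):

    from collections import deque

    def neighborhood(adj, v, rho):
        # Gamma_rho(v): all vertices within rho hops of v (BFS on unweighted G)
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            if dist[u] == rho:
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return set(dist)

    def neighborhood_cover(adj, W, rho):
        # Gamma-hat_rho(W): the collection of rho-neighborhoods of W's vertices
        return {v: neighborhood(adj, v, rho) for v in W}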

Page 17: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Neighborhood covers

E.g.: Γ̂0(V) = the partition into singleton clusters

Page 18: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Neighborhood covers

E.g.: Γ̂1(W) = cover of W's nodes by their 1-neighborhoods

(Figure: W = colored nodes; the cover Γ̂1(W))

Page 19: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparsity measures

Different representations

Different ways to measure sparsity

Page 20: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Cover sparsity measure - overlap

deg(v,𝒮) = # occurrences of v in clusters S∈𝒮, i.e., the degree of v in the hypergraph (V,𝒮)

(Figure: a vertex v with deg(v,𝒮) = 3)

ΔC(𝒮) = maximum degree of cover 𝒮

AvΔ(𝒮) = average degree of 𝒮 = ∑v∈V deg(v,𝒮) / n = ∑S∈𝒮 |S| / n

Page 21: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partition sparsity measure - adjacency

Intuition: “contract” the clusters into super-nodes and look at the resulting cluster graph of 𝒮, 𝒢(𝒮) = (𝒮, ℰ(𝒮))

Page 22: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partition sparsity measure - adjacency

ℰ(𝒮) edges = inter-cluster edges

𝒢(𝒮) = (𝒮, ℰ(𝒮)), where ℰ(𝒮) := {(S,S') | S,S' ∈ 𝒮, G contains an edge (u,v) for u∈S and v∈S'}

Page 23: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Cluster-neighborhood

Def: Given a partition 𝒮 and a cluster S ∈ 𝒮:

Cluster-neighborhood of S = the neighborhood of S in the cluster graph 𝒢(𝒮):

Γc(S,G) = Γ(S, 𝒢(𝒮))

(Figure: a cluster S and its cluster-neighborhood Γc(S,G))

Page 24: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparsity measure

Average cluster-degree of partition 𝒮:

Avc(𝒮) = ∑S∈𝒮 |Γc(S)| / n

Note:

Avc(𝒮) ~ # inter-cluster edges

Page 25: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Example: A basic construction

Goal: produce a partition with:

1. clusters of radius ≤ k
2. few inter-cluster edges (or, low Avc(𝒮))

Algorithm BasicPart

Algorithm operates in iterations,each constructing one cluster

Page 26: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Example: A basic construction

At end of iteration:
- Add the resulting cluster S to the output collection 𝒮
- Discard it from V
- If V is not empty then start a new iteration

Page 27: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Iteration structure

• Arbitrarily pick a vertex v from V

• Grow cluster S around v, adding layer by layer

• Vertices added to S are discarded from V

Page 28: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Iteration structure

• The layer merging process is repeated until reaching the required sparsity condition:

- the next layer increases # vertices by a factor of < n^{1/k}

(i.e., |Γ(S)| < |S| · n^{1/k})
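A compact, centralized Python sketch of Algorithm BasicPart (the adjacency-dict format is an assumption carried over from the earlier sketch; the actual algorithm is distributed, as discussed in Part 3):

    def basic_part(adj, k):
        # Sketch of Algorithm BasicPart: grow a cluster around an arbitrary
        # center, layer by layer, while the next layer still multiplies the
        # cluster size by at least n^(1/k); then discard it from V and repeat.
        n = len(adj)
        remaining = set(adj)
        clusters = []
        while remaining:
            v = next(iter(remaining))            # arbitrary center
            cluster = {v}
            while True:
                # cluster plus its next layer, restricted to unclustered vertices
                grown = cluster | {w for u in cluster for w in adj[u]
                                   if w in remaining}
                if grown == cluster or len(grown) < len(cluster) * n ** (1.0 / k):
                    break                        # sparsity condition reached
                cluster = grown                  # merge the next layer
            clusters.append(cluster)
            remaining -= cluster                 # discard S from V
        return clusters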

Page 29: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Av-Deg-Partition Thm: Given an n-vertex graph G(V,E) and integer k≥1, Alg. BasicPart creates a partition 𝒮 satisfying:

1) Rad(𝒮) ≤ k-1,
2) # inter-cluster edges in 𝒢(𝒮) ≤ n^{1+1/k}

(or, Avc(𝒮) ≤ n^{1/k})

Page 30: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Proof:

Correctness:
• Every S added to 𝒮 is a (connected) cluster
• The generated clusters are disjoint
(the Alg. erases from V every v added to a cluster)
• 𝒮 is a partition (covers all vertices)

Page 31: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (2): [ |ℰ(𝒢(𝒮))| ≤ n^{1+1/k} ]
By the termination condition of the internal loop, the resulting S satisfies |Γ(S)| ≤ n^{1/k}·|S|

⇒ (# inter-cluster edges touching S) ≤ n^{1/k}·|S|

The number can only decrease in later iterations, if adjacent vertices get merged into the same cluster

⇒ |ℰ| ≤ ∑S∈𝒮 n^{1/k}·|S| = n^{1+1/k}

Page 32: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (1): [ Rad(𝒮) ≤ k-1 ]
Consider an iteration of the main loop.

Let J = # times the internal loop was executed

Let Si = the S constructed on the i'th internal iteration

|Si| > n^{(i-1)/k} for 2≤i≤J (by induction on i)

Page 33: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

⇒ J ≤ k (otherwise, |S| > n)

Note: Rad(Si) ≤ i-1 for every 1≤i≤J (S1 is composed of a single vertex; each additional layer increases Rad(Si) by 1)

⇒ Rad(SJ) ≤ k-1

Page 34: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Variant - Separated partial partitions

Sep(𝒮) = separation of partial partition 𝒮 = minimal distance between any two clusters

When Sep(𝒮)=s, we say 𝒮 is s-separated

Example: 2-separated partial partition

Page 35: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Coarsening

Cover 𝒯={T1,...,Tq} coarsens 𝒮={S1,...,Sp} if 𝒮's clusters are fully subsumed in 𝒯's clusters (i.e., for every Si there is a Tj with Si ⊆ Tj)

Page 36: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Coarsening (cont)

The radius ratio of the coarsening = Rad(𝒯) / Rad(𝒮)

(Figure: Rad(𝒮) = r, Rad(𝒯) = R, ratio = R / r)

Page 37: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Coarsening (cont)

Motivation: Given a “useful” cover 𝒮 with high overlaps:

Coarsen 𝒮 by merging some clusters together, getting a coarsening cover 𝒯 with
• larger clusters
• better sparsity
• increased radii

Page 38: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse covers

Goal: For an initial cover 𝒮, construct a coarsening 𝒯 with low overlaps, paying little in cluster radii

Simple goal: low average degree

Inherent tradeoff:

lower overlap ⇔ higher radius ratio

(and vice versa)

Page 39: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse covers

Algorithm AvCover

Operates in iterations; each iteration merges together some clusters into one output cluster Z

At end of iteration:
• Add the resulting cluster Z to the output collection 𝒯
• Discard the merged clusters from 𝒮
• If 𝒮 is not empty then start a new iteration

Page 40: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse covers

Algorithm AvCover – high-level flow

Page 41: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Iteration structure

• Arbitrarily pick a cluster S0 in 𝒮 (as the kernel Y of the cluster Z constructed next)

• Repeatedly merge the cluster with intersecting clusters from 𝒮 (adding one layer at a time)

• Clusters added to Z are discarded from 𝒮

Page 42: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Iteration structure

- The layer merging process is repeated until reaching the required sparsity condition:

adding the next layer increases # vertices by a factor of ≤ n^{1/k}

(|Z| ≤ |Y| · n^{1/k})
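A centralized Python sketch of the AvCover merging rule (clusters are plain vertex sets; the selection order and list-based bookkeeping are illustrative assumptions, not the course's data structures):

    def av_cover(cover, n, k):
        # Sketch of Algorithm AvCover: grow each output cluster Z around a
        # kernel Y by merging intersecting input clusters, one layer at a
        # time, while |Z| > |Y| * n^(1/k); merged clusters leave the input.
        S = [set(c) for c in cover]
        T = []
        while S:
            Y = S.pop()                          # kernel: an arbitrary cluster
            while True:
                layer = [c for c in S if c & Y]  # clusters intersecting Y
                Z = Y.union(*layer) if layer else set(Y)
                S = [c for c in S if not (c & Y)]  # merged => discarded
                if len(Z) <= len(Y) * n ** (1.0 / k):
                    break                        # sparsity condition reached
                Y = Z                            # the layer joins the kernel
            T.append(Z)
        return T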

Page 43: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Thm: Given a graph G(V,E,ω), a cover 𝒮 and an integer k≥1,

Algorithm AvCover constructs a cover 𝒯 s.t.:

1. 𝒯 coarsens 𝒮
2. Rad(𝒯) ≤ (2k+1) Rad(𝒮) (radius ratio ≤ 2k+1)
3. AvΔ(𝒯) ≤ n^{1/k} (low average sparsity)

Page 44: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Corollary for ρ-neighborhood covers: Given G(V,E,ω) and integers k,ρ≥1, there exists a cover 𝒯 = 𝒯ρ,k s.t.

1. 𝒯 coarsens the neighborhood cover Γ̂ρ(V)

2. Rad(𝒯) ≤ (2k+1)ρ
3. AvΔ(𝒯) ≤ n^{1/k}

Page 45: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Proof of Thm:

Property (1): [ 𝒯 coarsens 𝒮 ]

Holds directly from the construction

(Each Z added to 𝒯 is a (connected) cluster, since at the beginning 𝒮 contained connected clusters)

Page 46: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Claim: The kernels Y corresponding to clusters Z generated by the algorithm are mutually disjoint.

Proof: By contradiction.
Suppose there is a vertex v s.t. v ∈ Y∩Y'.
W.l.o.g. suppose Y was created before Y'.
v ∈ Y' ⇒ there is a cluster S' s.t. v∈S' and S' was still in 𝒮 when the algorithm started constructing Y'.

Page 47: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

But S' satisfies S'∩Y ≠ ∅

⇒ the final merge creating Z from Y should have added S' into Z and eliminated it from 𝒮; contradiction.

Page 48: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Output clusters and kernels

(Figure: the output clusters of the cover and their mutually disjoint kernels)

Page 49: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (2): [ Rad(𝒯) ≤ (2k+1)·Rad(𝒮) ]

Consider some iteration of the main loop (starting with cluster S)

J = # times the internal loop was executed
Z0 = the initial set
Zi = the set constructed on the i'th internal iteration (1≤i≤J); respectively Yi

Page 50: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Note 1: |Zi| > n^{i/k}, for every 1≤i≤J-1

⇒ J ≤ k

Note 2: Rad(Yi) ≤ (2i-1)Rad(𝒮), for every 1≤i≤J

⇒ Rad(YJ) ≤ (2k-1)Rad(𝒮)

Page 51: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (3): [ AvΔ(𝒯) ≤ n^{1/k} ]

AvΔ(𝒯) = ∑Zi |Zi| / n

≤ ∑Zi |Yi|·n^{1/k} / n

≤ n · n^{1/k} / n (the Yi's are disjoint)
= n^{1/k}

Page 52: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partial partitions

Goal: Given an initial cover 𝒮 and an integer k≥1, construct a partial partition 𝒯:
• subsuming a “large” subset 𝒮' of the clusters of 𝒮,
• with low radius ratio.

Page 53: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partial partitions (cont)

Procedure Part

General structure and iterations similar to Algorithm AvCover, except for two differences:

Small difference:Procedure keeps also “unmerged” collections , of original clusters merged into Y and Z.

Page 54: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Partial partitions (cont)

Small difference (cont):

The sparsity condition concerns the sizes of 𝒴, 𝒵, i.e., # original clusters “captured” by the merge, and not the sizes of Y, Z, i.e., # vertices covered

Merging ends when the next iteration increases # clusters merged into 𝒵 by a factor ≤ |𝒮|^{1/k}.

Page 55: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Main difference

The procedure removes from 𝒮 all clusters in 𝒵,

but takes into the output collection 𝒯 only the kernel Y, not the cluster Z

Page 56: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Main difference

Implication: Each selected cluster Y has an additional “external layer” of clusters around it, acting as a “protective barrier” providing disjointness between the different clusters Y, Y' added to 𝒯

Page 57: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Main difference

Note: Not all 𝒮 clusters are subsumed by 𝒯 (e.g., those merged into some external layer will not be subsumed)

Page 58: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Partial Partition Lemma: Given a graph G(V,E,ω), a cluster collection 𝒮 and an integer k≥1, the collections 𝒯 and 𝒮' (the subsumed clusters) constructed by Procedure Part(𝒮) satisfy:
1. 𝒯 coarsens 𝒮' (as before)
2. 𝒯 is a partial partition (i.e., Y∩Y' = ∅ for every Y,Y' ∈ 𝒯) (guaranteed by construction)
3. |𝒮'| ≥ |𝒮|^{1-1/k} (# clusters discarded ≤ |𝒮|^{1/k} · # clusters taken)
4. Rad(𝒯) ≤ (2k-1)·Rad(𝒮) (as before)

Page 59: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

s-Separated partial partitions

Goal: For an initial ρ-neighborhood cover 𝒮 and s,k≥1,

construct an s-separated partial partition 𝒯 subsuming a “large” subset of the clusters of 𝒮, with low radius ratio.

Page 60: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

s-Separated partial partitions (cont)

Procedure SepPart

• Given 𝒮, construct a modified collection 𝒮' of neighborhoods of radius ρ' = ρ+s/2:

𝒮 = {Γρ(v) | v∈W} for some W⊆V

𝒮' = {Γρ'(v) | v∈W}

Page 61: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Lemma: Given a graph G(V,E,ω), a collection 𝒮 of ρ-neighborhoods and integers s,k, the collections 𝒯 and 𝒮' constructed by Procedure SepPart satisfy:

1. 𝒯 coarsens 𝒮'
2. 𝒯 is an s-separated partial partition
3. |𝒮'| ≥ |𝒮|^{1-1/k}

4. Rad(𝒯) ≤ (2k-1)·ρ + k·s

Page 62: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse covers with low max degree

Goal: For an initial cover 𝒮, construct a coarsening cover 𝒯 with low maximum degree and low radius ratio.

Idea: Reduce to the sub-problem of partial partition

Page 63: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Low max degree covers (cont)

Strategy: Given an initial cover 𝒮 and integer k≥1:
1. Repeatedly select low-radius partial partitions, each subsuming many clusters of 𝒮.
2. Their union should subsume all of 𝒮.
3. The resulting overlap = # partial partitions.

(Figure: three partial-partition layers, 1, 2, 3)

Page 64: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Low max degree covers (cont)

Algorithm MaxCover
• Cover 𝒮's clusters by several partial partitions

(repeatedly applying Procedure Part to the remaining clusters, until 𝒮 is empty)

• Merge the constructed partial partitions into the desired cover 𝒯

Page 65: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Low max degree covers (cont)

Max-Deg-Cover Thm: Given G(V,E,ω), a cover 𝒮 and an integer k≥1, Algorithm MaxCover constructs a cover 𝒯 satisfying:

1. 𝒯 coarsens 𝒮,
2. Rad(𝒯) ≤ (2k-1) Rad(𝒮),
3. ΔC(𝒯) ≤ 2k·|𝒮|^{1/k}

Page 66: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Proof: Define

𝒮i = contents of 𝒮 at the start of phase i; ri = |𝒮i|

𝒯i = the set added to 𝒯 at the end of phase i,
𝒮'i = the set removed from 𝒮 at the end of the phase.

Page 67: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (1): [ 𝒯 coarsens 𝒮 ]

Since 𝒯 = ∪i 𝒯i, 𝒮 = ∪i 𝒮'i, and by the Partial Partition Lemma, 𝒯i coarsens 𝒮'i for every i.

Property (2): [ Rad(𝒯) ≤ (2k-1) Rad(𝒮) ]
Directly by the Partial Partition Lemma

Page 68: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Property (3): [ ΔC(𝒯) ≤ 2k·|𝒮|^{1/k} ]
By the Partial Partition Lemma, the clusters in each 𝒯i are disjoint

⇒ # clusters v belongs to ≤ # phases of the algorithm

Page 69: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Observation: In every phase i, the # of 𝒮'i clusters removed from 𝒮i satisfies

|𝒮'i| ≥ |𝒮i|^{1-1/k}

(by the Partial Partition Lemma)

⇒ the size of the remaining 𝒮i shrinks as r_{i+1} ≤ r_i - r_i^{1-1/k}

Page 70: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Claim: Given the recurrence x_{i+1} = x_i - x_i^δ, 0<δ<1, let f(n) = the least index i s.t. x_i ≤ 1 given x_0 = n. Then

f(n) < ((1-δ) ln 2)^{-1} · n^{1-δ}

Consequently: with r_0 = |𝒮| and δ = 1-1/k (so n^{1-δ} = n^{1/k} and ((1-δ) ln 2)^{-1} = k/ln 2 < 2k), 𝒮 is exhausted after ≤ 2k·|𝒮|^{1/k} phases of Algorithm MaxCover

⇒ ΔC(𝒯) ≤ 2k·|𝒮|^{1/k}

Page 71: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Corollary for ρ-neighborhood covers:

Given G(V,E,ω) and integers k,ρ ≥ 1, there exists a cover 𝒯 = 𝒯ρ,k satisfying:

1. 𝒯 coarsens Γ̂ρ(V)

2. Rad(𝒯) ≤ (2k-1)ρ
3. ΔC(𝒯) ≤ 2k·n^{1/k}

Page 72: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Covers based on s-separated partial partitions

Goal: A cover coarsening the neighborhood cover Γ̂ρ(V), in which the partial partitions are well separated.

Method: Substitute Procedure SepPart for Procedure Part in Algorithm MaxCover.

Page 73: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Covers based on s-separated partial partitions

Thm: Given G(V,E,ω) and integers k,ρ ≥ 1, there exists a cover 𝒯 = 𝒯ρ,k s.t.:

1. 𝒯 coarsens Γ̂ρ(V),

2. Rad(𝒯) ≤ (2k-1)ρ + k·s,
3. ΔC(𝒯) ≤ 2k·n^{1/k},
4. each of the ΔC(𝒯) layers of partial partitions composing 𝒯 is s-separated.

Page 74: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Related graph representations

Network decomposition:

Partition 𝒮 is a (d,c)-decomposition of G(V,E) if

• the radius of 𝒮's clusters in G is Rad(𝒮) ≤ d
• the chromatic number of the cluster graph 𝒢(𝒮) is χ(𝒢(𝒮)) ≤ c

Page 75: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Example: A (2,3)-decomposition

Rad(𝒮) ≤ 2

χ(𝒢(𝒮)) ≤ 3

Page 76: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Decomposition algorithm

Algorithm operates in iterations

In each iteration i:
- Invoke Procedure SepPart to construct a 2-separated partial partition for V

At end of iteration:
- Assign color i to all output clusters
- Delete the covered vertices from V
- If V is not empty then start a new iteration

Page 77: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Decomposition algorithm (cont)

Main properties:
1. Uses Procedure SepPart instead of Part (i.e., guaranteed separation = 2, not 1)

⇒ Ensures all output clusters of a single iteration can be colored by a single color

2. Each iteration applies only to the remaining nodes

⇒ Clusters generated in different iterations are disjoint.

Page 78: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Thm: Given G(V,E,ω) and k ≥ 1, there is a (k, k·n^{1/k})-decomposition.

Proof:
Note: The final collection 𝒮 is a partition
(- each 𝒯 generated by SepPart is a partial partition
- vertices added to the 𝒯 of iteration i are removed from V)

Page 79: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

An iteration starting with 𝒮 results in a 𝒯 of size |𝒯| = Ω(|𝒮|^{1-1/k})

⇒ the process continues for ≤ O(k·n^{1/k}) iterations

⇒ we end with O(k·n^{1/k}) colors, and each cluster has O(k) diameter.

Picking k = log n:
Corollary: Every n-vertex graph G has a (log n, log n)-decomposition.

Page 80: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Skeletal representations

Spanner: connected subgraph spanning all nodes (special case: spanning tree)

Tree cover: collection of trees covering G

Page 81: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Skeletal representations

Evaluation criteria

Locality level: stretch factor

Sparsity level: # edges

As for clustered representations, locality and sparsity parameters go opposite ways:

better sparsity ⇔ worse locality

Page 82: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Stretch

Given a graph G(V,E,ω) and a spanning subgraph G'(V,E'), the stretch factor of G' is:

Stretch(G') = max_{u,v∈V} { dist(u,v,G') / dist(u,v,G) }

(Figure: a graph G and a subgraph G' with Stretch(G') = 2)
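For intuition, the stretch factor can be computed directly from the definition; a minimal Python sketch (the weighted adjacency-list format adj[u] = [(v, weight), ...] is an assumed convention):

    import heapq

    def dijkstra(adj, source):
        # Single-source weighted distances; adj[u] = list of (v, weight) pairs
        dist = {source: 0}
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in adj[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return dist

    def stretch(adj_g, adj_sub):
        # max over pairs u,v of dist(u,v,G') / dist(u,v,G)
        worst = 1.0
        for u in adj_g:
            dg, dsub = dijkstra(adj_g, u), dijkstra(adj_sub, u)
            for v, d in dg.items():
                if v != u and d > 0:
                    worst = max(worst, dsub.get(v, float("inf")) / d)
        return worst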

Page 83: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Depth

Def: Depth of v in tree T = distance from the root: Depth_T(v) = dist(v,r0,T)

Depth(T) = max_v Depth_T(v) = the radius w.r.t. the root:

Depth(T) = Rad(r0,T)

Page 84: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparsity measures

Def: Given a subgraph G'(V',E') of G(V,E,ω):

ω(G') = weight of G' = ∑_{e∈E'} ω(e)

Size of G' = # edges, |E'|

Page 85: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Spanning trees - basic types

MST: minimum-weight spanning tree of G = a spanning tree TM minimizing ω(TM)

SPT: shortest paths tree of G w.r.t. a given root r0 = a spanning tree TS s.t. for every v≠r0, the path from r0 to v in the tree is the shortest possible,

or, Stretch(TS,r0)=1

Page 86: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Spanning trees - basic types

BFS: breadth-first tree of G w.r.t. a given root r0 = a spanning tree TB s.t. for every v≠r0, the path from r0 to v in the tree is the shortest possible, measuring path length in # edges

Page 87: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Controlling tree degrees

deg(v,G) = degree of v in G
Δ(G) = max degree in G

Tree Embedding Thm: For every rooted tree T and integer m ≥ 1, there is an embedded virtual tree S with the same node set and same root (but a different edge set), s.t.
1. Δ(S) ≤ 2m
2. Each edge of S has a corresponding path of length ≤ 2 in T
3. Depth_S(v) ≤ (2·log_m Δ(T) - 1)·Depth_T(v), for every v

Page 88: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Proximity-preserving spanners

Motivation: How good is a shortest paths tree as a spanner?

TS preserves distances in the graph w.r.t. the root r0, i.e., achieves Stretch(TS,r0)=1

However, it fails to preserve distances w.r.t. vertex pairs not involving r0 (or, to bound Stretch(TS))

Q: Construct an example where two neighboring vertices in G are at distance 2·Depth(T) in the SPT

Page 89: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Proximity-preserving spanners

k-Spanner: Given a graph G(V,E,ω), the subgraph G'(V,E') is a k-spanner of G if Stretch(G') ≤ k

Typical goal: Find sparse (small size, small weight) spanners with a small stretch factor

Page 90: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Example - 2-spanner

Page 91: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Example - 2-spanner

Page 92: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers

Basic notion: A tree T covering the ρ-neighborhood of v

(Figure: Γ2(v) and a covering tree T)

Page 93: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

ρ-tree cover for graph G = tree cover for Γ̂ρ(V) = a collection TC of trees in G s.t. for every v∈V there is a tree T∈TC (denoted home(v)) spanning the ρ-neighborhood of v

Depth(TC) = max_{T∈TC} {Depth(T)}

Overlap(TC) = max_v {# trees containing v}

Page 94: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers

Algorithm TreeCover(G,k,ρ)

1. Construct the ρ-neighborhood cover of G, 𝒮 = Γ̂ρ(V)
2. Compute a coarsening cover 𝒯 for 𝒮 as in the Max-Deg-Cover Thm, with parameter k
3. Select in each cluster R∈𝒯 an SPT T(R), rooted at some center of R and spanning R
4. Set TC(k,ρ) = { T(R) | R∈𝒯 }

Page 95: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

Thm: For every graph G(V,E,ω) and integers k,ρ ≥ 1, there is a ρ-tree cover TC=TC(k,ρ) with

• Depth(TC) ≤ (2k-1)ρ
• Overlap(TC) ≤ ⌈2k·n^{1/k}⌉

Page 96: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

Proof:
1. The TC built by Alg. TreeCover is a ρ-tree cover:

Consider v∈V.

𝒯 coarsens 𝒮

⇒ there is a cluster R∈𝒯 s.t. Γρ(v) ⊆ R

⇒ the tree T(R)∈TC covers the ρ-neighborhood Γρ(v)

Page 97: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

2. Bound on Depth(TC): follows from the radius bound on the clusters of the cover 𝒯, guaranteed by the Max-Deg-Cover Thm, as these trees are SPTs.

3. Bound on Overlap(TC): follows from the degree bound on 𝒯 (Max-Deg-Cover Thm), as |𝒮| = n

Page 98: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

Relying on the Theorem and the Tree Embedding Thm, and taking m = n^{1/k}:

Corollary: For every graph G(V,E,ω) and integers k,ρ ≥ 1, there is a (virtual) ρ-tree cover TC=TC(k,ρ) for G, with
• Depth(TC) ≤ (2k-1)²ρ
• Overlap(TC) ≤ ⌈2k·n^{1/k}⌉
• Δ(T) ≤ 2n^{1/k} for every tree T∈TC

Page 99: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Tree covers (cont)

Motivating intuition: a tree cover TC constructed for a given cluster-based cover 𝒯 serves as a way to “materialize” or “implement” 𝒯 efficiently.

(In fact, applications employing covers actually use the corresponding tree cover)

Page 100: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse spanners for unweighted graphs

Basic lemma: For an unweighted graph G(V,E),

a subgraph G' is a k-spanner of G

⇔ for every (u,v)∈E, dist(u,v,G') ≤ k

(No need to look at the stretch of each pair u,v; it suffices to consider the stretch of the edges)

Page 101: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Sparse spanners for unweighted graphs

Algorithm UnweightedSpan(G,k)

1. Set the initial partition 𝒮 = Γ̂0(V) = { {v} | v∈V }

2. Build a coarsening partition 𝒯 using Alg. BasicPart

Page 102: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Algorithm UnweightedSpan(G,k) - cont

3. For every cluster Ti∈𝒯, construct an SPT rooted at some center ci of Ti

4. Add all edges of these trees to the spanner G'

Page 103: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Algorithm UnweightedSpan(G,k) - cont

5. In addition, for every pair of neighboring clusters Ti,Tj:
- select a single “intercluster” edge eij,
- add it to G'
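A centralized Python sketch of the whole construction, reusing the basic_part sketch from Part 2 (BFS trees stand in for the SPTs, since the graph is unweighted; picking an arbitrary root instead of a true cluster center is a simplifying assumption):

    def unweighted_span(adj, k):
        # Sketch of Algorithm UnweightedSpan: BFS-tree edges inside every
        # cluster of the BasicPart partition, plus one representative
        # inter-cluster edge for each pair of neighboring clusters.
        clusters = basic_part(adj, k)            # from the Part 2 sketch
        cluster_of = {v: i for i, c in enumerate(clusters) for v in c}
        spanner = set()
        for cluster in clusters:
            root = next(iter(cluster))           # ideally a center of the cluster
            visited, frontier = {root}, [root]
            while frontier:                      # BFS tree restricted to cluster
                nxt = []
                for u in frontier:
                    for w in adj[u]:
                        if w in cluster and w not in visited:
                            visited.add(w)
                            spanner.add((min(u, w), max(u, w)))
                            nxt.append(w)
                frontier = nxt
        chosen = {}                              # one edge e_ij per cluster pair
        for u in adj:
            for w in adj[u]:
                i, j = cluster_of[u], cluster_of[w]
                if i != j:
                    chosen.setdefault((min(i, j), max(i, j)),
                                      (min(u, w), max(u, w)))
        spanner.update(chosen.values())
        return spanner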

Page 104: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Thm: For every unweighted graph G and k≥1,

there is an O(k)-spanner with O(n^{1+1/k}) edges

Page 105: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

(a) Estimating # edges in the spanner:

1. 𝒯 is a partition of V

⇒ # edges of the trees built for the clusters ≤ n

2. Av-Deg-Partition Thm

⇒ # intercluster edges ≤ n^{1+1/k}

Page 106: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

(b) Bounding the stretch:

Consider an edge e=(u,w) in G (recall: it is enough to look at edges).

If e was selected into the spanner ⇒ stretch = 1.

So suppose e is not in the spanner.

Page 107: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Case 1: the endpoints u,w belong to the same cluster Ti

Clusters have radius ≤ r ≤ k-1

⇒ length of the path from u to w through the center ci ≤ 2r

Page 108: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Case 2: the endpoints belong to different clusters, u∈Ti, w∈Tj

These clusters are connected by an inter-cluster edge eij

Page 109: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

There is a u-w path going from u to ci (≤ r steps), from ci through eij to cj (≤ r+1+r steps), and from cj to w (≤ r steps)

⇒ total length ≤ 4r+1

≤ 4k-3

Page 110: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Stretch factor analysis

Fixing k=log n we get:

Corollary: For every unweighted graph G(V,E)there is an O(log n)-spanner with O(n) edges.

Page 111: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds

Def:Girth(G) = # edges of shortest cycle in G

(Figures: graphs with Girth = 3, Girth = 4, Girth = ∞)

Page 112: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds

Lemma: For every k ≥ 1 and every unweighted G(V,E) with Girth(G) ≥ k+2, the only k-spanner of G is G itself

(no edge can be erased from G)

Page 113: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds

Proof: Suppose, towards contradiction, that G has some spanner G' in which the edge e=(u,v)∈E is omitted.

⇒ G' has an alternative path P of length ≤ k from u to v

⇒ P ∪ {e} = a cycle of length ≤ k+1 < Girth(G); contradiction.

(Figure: edge e and an alternative path P of ≤ k edges)

Page 114: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Size and girth

Lemma:
• For every r≥1 and every n-vertex, m-edge graph G(V,E) with girth Girth(G) ≥ r: m ≤ n^{1+2/(r-2)} + n

• For every r≥3, there are n-vertex, m-edge graphs G(V,E) with girth Girth(G) ≥ r and m ≥ n^{1+1/r} / 4

Page 115: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds (cont)

Thm: For every k≥3, there are graphs G(V,E) for which every (k-2)-spanner requires Ω(n^{1+1/k}) edges

Page 116: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds (cont)

Corollary: For every k≥3, there is an unweighted G(V,E) s.t.

(a) for every cover 𝒯 coarsening Γ̂1(V): if Rad(𝒯) ≤ k then AvΔ(𝒯) = Ω(n^{1/k})

(b) for every partition 𝒯 coarsening Γ̂0(V): if Rad(𝒯) ≤ k then Avc(𝒯) = Ω(n^{1/k})

Page 117: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bounds (cont)

Similar bounds are implied for the average degree partition problem and for all the maximum degree problems.

The radius vs. chromatic number tradeoff for network decomposition presented earlier is also optimal within a factor of k.

A lower bound on the radius-degree tradeoff for ρ-regional matchings on arbitrary graphs follows similarly.

Page 118: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Examples

Restricted graph families behave better:

Graph classes with O(n) edges have a (trivial) optimal spanner (this includes common topologies such as bounded-degree and planar graphs: rings, meshes, trees, butterflies, cube-connected cycles, …)

General picture:

larger k ⇔ sparser spanner

Page 119: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Spanners for weighted graphs

Algorithm Weighted_Span(G,k)

1. For every 1 ≤ i ≤ logD: construct a 2^i-tree-cover TC(k,2^i) for G using Alg. TreeCover

2. Take all edges of the tree covers into the spanner G'(V,E')

Page 120: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Spanners for weighted graphs (cont)

Lemma: The spanner G' built by Alg. Weighted_Span(G,k) has

(1) Stretch(G') ≤ 2k-1
(2) O(logD·k·n^{1+1/k}) edges

Page 121: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Greedy construction

Algorithm GreedySpan(G,k)
/* A generalization of Kruskal's MST algorithm */

1. Sort E by nondecreasing edge weight, getting E={e1,...,em} (sorted: ω(ei) ≤ ω(ei+1))

2. Set E' = ∅ (the spanner edges)

Page 122: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Greedy construction

3. Scan the edges one by one; for each ej=(u,v) do:
• Compute P(u,v) = the shortest path from u to v in G'(V,E')
• If ω(P(u,v)) > k·ω(ej) (the alternative path is too long) then E' ← E' ∪ {ej} (must include ej in the spanner)

4. Output G'(V,E')

(Figure: edge ej and its alternative path P(u,v))
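A Python sketch of GreedySpan (the (weight, u, v) edge-list format and the early-exit Dijkstra helper are illustrative assumptions):

    import heapq

    def bounded_dist(adj, s, t, bound):
        # Dijkstra from s in the current spanner, abandoned beyond `bound`
        dist = {s: 0}
        heap = [(0, s)]
        while heap:
            d, x = heapq.heappop(heap)
            if x == t:
                return d
            if d > bound or d > dist.get(x, float("inf")):
                continue
            for y, w in adj[x]:
                if d + w < dist.get(y, float("inf")):
                    dist[y] = d + w
                    heapq.heappush(heap, (d + w, y))
        return float("inf")

    def greedy_span(vertices, edges, k):
        # Sketch of Algorithm GreedySpan: scan edges by nondecreasing weight,
        # keeping (u,v) only if the spanner built so far has no u-v path of
        # length <= k * w(u,v). `edges` is a list of (w, u, v) triples.
        adj = {v: [] for v in vertices}
        spanner = []
        for w, u, v in sorted(edges):
            if bounded_dist(adj, u, v, k * w) > k * w:
                spanner.append((u, v, w))        # no short substitute: keep it
                adj[u].append((v, w))
                adj[v].append((u, w))
        return spanner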

Page 123: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Lemma: Spanner G' built by Algorithm GreedySpan(G,k) has Stretch(G') ≤ k

Proof: Consider two vertices x,y of G

Px,y = (e1,...,eq) = shortest x - y path in G

Page 124: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Consider an edge ej=(u,v) along Px,y.

If ej was not included in G'

⇒ when ej was examined by the algorithm, E' contained a u-v path Pj = P(u,v) of length ≤ k·ω(ej)

(Figure: edge ej and its substitute path Pj)

Page 125: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

This path exists in the final G'.

To mimic the path Px,y in G': replace each “missing” edge ej (not taken into G') by its substitute Pj.

⇒ The resulting path has total length ≤ k·ω(Px,y)

Page 126: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Lemma: The spanner has Girth(G') > k+1
Proof: Consider a cycle C in G'. Let ej=(u,v) be the last edge added to C by the algorithm.

When the algorithm examined ej, the spanner E' already contained all the other C edges

⇒ the shortest u-v path Pj considered by the algorithm satisfies ω(Pj) ≤ ω(C - {ej})

Page 127: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

ej added to E'

⇒ ω(Pj) > k·ω(ej) (by the selection rule)

⇒ ω(C) ≥ ω(Pj) + ω(ej) > (k+1)·ω(ej) (as ω(C - {ej}) ≥ ω(Pj))

ej = heaviest edge in C

⇒ ω(C) ≤ |C|·ω(ej)

⇒ |C| > k+1

Page 128: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Corollary: |E'| ≤ n^{1+2/k} + n

Thm: For every weighted graph G(V,E,ω) and k ≥ 1, there is a (2k+1)-spanner G'(V,E') s.t. |E'| < n·⌈n^{1/k}⌉

Recall: For every r ≥ 1, every graph G(V,E) with Girth(G) ≥ r has |E| ≤ n^{1+2/(r-2)} + n

Page 129: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Shallow Light Trees

Goal: Find spanning tree T near-optimal in both depth and weight

Candidate 1: SPT
Problem: ∃G s.t.

ω(SPT) = Ω(n·ω(MST))

(Figure: an example graph G, its light MST, and a heavy SPT)

Page 130: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Shallow Light Trees (cont)

Candidate 2: MST
Problem: ∃G s.t.

Depth(MST) = Ω(n·Depth(SPT))

(Figure: an example graph G, its deep MST, and a shallow SPT)

Page 131: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Shallow Light Trees (cont)

Shallow-light tree (SLT) for a graph G(V,E,ω) and root r0:

a spanning tree T satisfying both
• Stretch(T,r0) = O(1)
• ω(T) / ω(MST) = O(1)

Thm: Shallow-light trees exist for every graph G and root r0

Page 132: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Light, sparse, low-stretch spanners

Algorithm GreedySpan guarantees:

Thm: For every graph G(V,E,ω) and integer k≥1, there is a spanner G'(V,E') for G with

1. Stretch(G') < 2k+1

2. |E'| < n·⌈n^{1/k}⌉

3. ω(G') = ω(MST(G))·O(n^{1/k})

Page 133: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bound

Thm: For every k ≥ 3, there are graphs G(V,E,ω) s.t. every spanner G'(V,E') for G with Stretch(G') ≤ k-2 requires

• |E'| = Ω(n^{1+1/k}) and
• ω(G') = Ω(ω(MST(G))·n^{1/k})

Proof: By the bound for unweighted graphs

Page 134: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Part 3: Constructions and Applications

• Distributed construction of the basic partition
• Fast decompositions
• Exploiting topological knowledge: broadcast revisited
• Local coordination: synchronizers revisited
• Hierarchical example: routing revisited
• Advanced symmetry breaking: MIS revisited

Page 135: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Basic partition construction algorithm

Simple distributed implementation of Algorithm BasicPart

Single “thread” of computation (a single locus of activity at any given moment)

Page 136: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Basic partition construction algorithm

Components

ClusterCons: procedure for constructing a cluster around a chosen center v

NextCtr: procedure for selecting the next center v around which to grow a cluster

RepEdge: procedure for selecting a representative inter-cluster edge between any two adjacent clusters

Page 137: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Thm: Distributed Algorithm BasicPart requires

Time = O(nk)

Comm = O(n²)

Page 138: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Efficient cover construction algorithms

Goal: Fast distributed algorithm for coarsening a neighborhood cover

Known: Randomized algorithms for constructing low (average or maximum) degree cover of G,guaranteeing bounds on weak cluster diameter

Page 139: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Efficient decompositions

Goal: fast distributed algorithms for constructing a network decomposition

Basic tool: s-separated, r-ruling set ((s,r)-set)
(a combination of an independent set and a dominating set):

W = {w1,...,wm} ⊆ V in G s.t.

• dist(wi,wj) ≥ s for 1≤i<j≤m,

• for every v∉W, ∃ 1≤i≤m s.t. dist(wi,v) ≤ r

Page 140: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Efficient decompositions

(s,r)-partition (associated with an (s,r)-set W = {w1,...,wm}):

Partition of G, 𝒮(W) = {S1,...,Sm}, s.t. for 1≤i≤m:

• wi ∈ Si

• Rad(wi, G(Si)) ≤ r

(Figure: clusters Si with centers wi at pairwise distance ≥ s and radius ≤ r)

Page 141: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Distributed construction

Using an efficient distributed construction for (3,2)-sets and (3,2)-partitions and a recursive coloring algorithm, one can get:

Thm: There is a deterministic distributed algorithm for constructing a (2^λ, 2^λ)-decomposition for a given n-vertex graph in time O(2^λ), for λ = √(c log n), for some constant c>0

Page 142: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Exploiting topological knowledge: Broadcast revisited

Delay measure: When broadcasting from a source s, the message delivery to node v suffers delay ρ if it reaches v after ρ·dist(s,v) time.

For broadcast algorithm B:

Delay(B) = maxv {Delay(v,B)}

Page 143: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast on a subgraph

Lemma: Flood(G') broadcast on a subgraph G' costs

• Message(Flood(G')) = |E(G')|
• Comm(Flood(G')) = ω(G')
• Delay(Flood(G')) = Stretch(G')

(in both the synchronous and asynchronous models)

Page 144: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast (cont)

Selecting an appropriate subgraph:

For a spanning tree T:

• Message(Flood(T)) = n-1 (optimal)
• Comm(Flood(T)) = ω(T)
• Delay(Flood(T)) = Stretch(T,r0)

Goal: Lower both ω(T) and Stretch(T,r0)

Page 145: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast (cont)

Using a light, low-stretch tree (SLT):

Lemma: For every graph G and source v, there is a spanning tree SLTv s.t. broadcast by Flood(SLTv) costs:

• Message(Flood(SLTv)) = n-1

• Comm(Flood(SLTv)) = O(ω(MST))

• Delay(Flood(SLTv)) = O(1)

Page 146: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcasting on a spanner

Disadvantage of SLT broadcast:

A tree efficient for broadcasting from one source may be poor for another, w.r.t. Delay

Solution 1: Maintain separate tree for every source

(heavy memory / update costs, involved control)

Page 147: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcasting on a spanner

Solution 2: Flood(G') broadcast on spanner G'

Recall: For every graph G(V,E,ω) and integer k≥1, there is a spanner G'(V,E') for G with

1. Stretch(G') ≤ 2k+1
2. |E'| ≤ n^{1+1/k}

3. ω(G') = ω(MST(G))·O(n^{1/k})

Page 148: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcasting on a spanner (cont)

Setting k=log n:

Thm: For every graph G, there is a spanner G' s.t. Algorithm Flood(G') has complexities
• Message(Flood(G')) = O(n · log n · logD)
• Comm(Flood(G')) = O(log n · logD · ω(MST))
• Delay(Flood(G')) = O(log n)

(optimal up to polylog factor in all 3 measures)

Page 149: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Topology knowledge and broadcast

Assumption: No predefined structures exist in G (Broadcast performed “from scratch”)

Focus on message complexity

Extreme models of topological knowledge:

KT∞ model: Full knowledge:
Vertices have full topological knowledge

Page 150: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Topology knowledge and broadcast

KT∞ model: Full topological knowledge

⇒ broadcast with minimal # messages, Message = Θ(n):

1. Each v locally constructs the same tree T, sending no messages

2. Use the tree broadcast algorithm Flood(T)

Page 151: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Topology knowledge and broadcast

KT0 model: “Clean” network:

Vertices know nothing about the topology

KT1 model: Neighbor knowledge:

Vertices know their own and their neighbors' IDs, nothing else

Page 152: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Topology knowledge & msg complexity

Lemma: In the KT0 model, every broadcast algorithm must send ≥ 1 message over every edge of G

Proof: Suppose there is an algorithm Π disobeying the claim.

Consider a graph G and an edge e=(u,w) s.t. Π broadcasts on G without sending any messages over e

Page 153: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Topology knowledge & msg complexity

Then G can be replaced by G' as follows:

Page 154: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clean network model

u and w cannot distinguish between the two topologies G' and G

No msgs sent on e

⇒ no msgs sent on e1, e2

Page 155: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clean network model

In executing the algorithm over G', u and w fail to forward the message to u' and w'

Page 156: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clean network model

⇒ u' and w' do not get the message; contradiction

Page 157: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Clean network model

Thm: Every broadcast protocol Π for the KT0 model has complexity Message(Π) = Ω(|E|)

Page 158: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Msg complexity of broadcast in KT1

Note: In KT1, previous intuition fails !

Nodes know the ID’s of their neighbors

⇓not all edges must be used

Page 159: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast in KT1 (cont)

Traveler algorithm

“Traveler” (token) performs DFS traversal on G

Traveler carries a list L of vertices visited so far.

Page 160: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast in KT1 (cont)

To pick the next neighbor to visit after v:
- Compare L with the list of v's neighbors,
- Make the next choice only from neighbors not in L
(If all of v's neighbors were already visited, backtrack from v on the edge to its parent.)

(Figure: a DFS traversal of a 6-vertex graph; the traveler's list grows {0}, {0,1}, {0,1,3}, {0,1,3,4}, {0,1,3,4,5})
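A centralized Python simulation of the traveler's walk (in the real protocol the list L travels with the token; keeping it as a plain list and returning the hop count are simplifying assumptions):

    def traveler_broadcast(adj, source):
        # Sketch of the traveler algorithm (KT1): a token performs a DFS,
        # carrying the list L of vertices visited so far; forward moves go
        # only to neighbors not yet in L, so non-tree edges are never used.
        L = [source]
        stack = [source]                         # current DFS path
        hops = 0
        while stack:
            v = stack[-1]
            fresh = [w for w in adj[v] if w not in L]
            if fresh:
                L.append(fresh[0])               # token moves to a new vertex
                stack.append(fresh[0])
                hops += 1                        # forward step on a tree edge
            else:
                stack.pop()                      # all neighbors visited:
                if stack:                        # backtrack to the parent
                    hops += 1
        return L, hops                           # hops = 2(|L| - 1)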

Page 161: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast in KT1 (cont)

Note: The traveler's “forward” steps are restricted to the edges of a DFS tree spanning G; non-tree edges are not traversed

⇒ No need to send messages on every edge!

Page 162: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Broadcast in KT1 (cont)

Q: Does the traveler algorithm disprove the Ω(|E|) lower bound on messages?

Observe: # basic (O(log n)-bit) messages sent by the algorithm = Θ(n²) >> 2n
(the lists carried by the traveler contain up to O(n) vertex IDs)

⇒ traversing an edge requires O(n) basic messages on average

Page 163: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Ω(|E|) lower bound for KT1

Idea: To avoid traversing an edge e=(v,u), the traveler algorithm must inform, say, v, that u already got the message.

This can only be done by sending some message carrying ID(u) to v, which is as expensive as traversing e itself…

Intuitively, edge e was “utilized,” just as if a message had actually crossed it

Page 164: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bound (cont)

Def: Edge e=(u,v) ∈ E is utilized during a run of an algorithm on G if one of the following events holds:

1. A message is sent on e
2. u either sends or receives a message containing ID(v)
3. v either sends or receives a message containing ID(u)

Page 165: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bound (cont)

m = # utilized edges in a run of the protocol on G

M = # (basic) messages sent during the run

Lemma: M = Ω(m)
Proof: Consider a message sent over e=(u,v). The message contains O(1) node IDs z1,...,zB. Each zi utilizes ≤ 2 edges, (u,zi) and (v,zi) (if they exist). Also, e itself becomes utilized.

Page 166: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bound (cont)

⇒ To prove a lower bound on # messages, it suffices to prove a lower bound on # edges utilized by the algorithm

Lemma: Every algorithm for broadcast under the KT1 model must utilize every edge of G

Thm: Every broadcast protocol Π for the KT1 model has complexity Message(Π) = Ω(|E|)

Page 167: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Lower bound (cont)

Observation: The Thm no longer holds if, in addition to arbitrary computations, we allow protocols whose running time is unbounded in the network size.

Once such behavior is allowed, one may encode an unbounded number of IDs by the choice of the transmission round, and hence implement, say, the “traveler” algorithm.

(This relates only to the synchronous model; in the asynchronous model such encoding is impossible!)

Page 168: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Hierarchy of partial topological knowledge

KTk model: Known topology to radius k:Every vertex knows the topology of the neighborhood of radius k around it, G(k(v))

Example: In KT2, v knows the topology of its 2-neighnorhood

Page 169: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Hierarchy of partial topological knowledge

KTk model: Known topology to radius k:Every vertex knows topology of subgraph of radius k around it, G(k(v))

Information-communication tradeoff:For every fixed k ≥ 1:# basic messages required for broadcast in the KTk model = (min{|E|,n1+(1)/k})

Page 170: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Hierarchy of partial topological knowledge

Lower bound proof: a variant of the KT1 case.

Upper bound idea: v knows all edges at distance ≤ k from it

⇒ v can detect all short cycles (length ≤ 2k) going through it

⇒ possible to disconnect all short cycles locally, by deleting one edge in each cycle.

Page 171: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

KTk model

Algorithm k-Flood

Assumption: There is some (locally computable) assignment of distinct weights to edges

Page 172: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

KTk model

Algorithm k-Flood

• Define a subgraph G*(V,E*) of G:
1. Mark the heaviest edge in each short cycle “unusable”,
2. include precisely all the unmarked edges in E*

(Only e's endpoints need to know whether e is usable; given the partial topological knowledge, the edge deletions are done locally, sending no messages)
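A centralized Python sketch of the G* construction (assuming distinct edge weights, as the algorithm requires; it uses the equivalent test that an edge is the heaviest edge of some short cycle iff its endpoints are joined by a path of ≤ 2k-1 strictly lighter edges):

    from collections import deque

    def usable_edges(adj_w, k):
        # Sketch of the G* construction in Algorithm k-Flood: edge (u,v) is
        # marked unusable iff u and v are joined by a path of <= 2k-1
        # strictly lighter edges, i.e. (u,v) is the heaviest edge of some
        # short cycle (length <= 2k). adj_w[u] = list of (v, weight) pairs.
        edges = {(min(u, v), max(u, v)): w
                 for u in adj_w for v, w in adj_w[u]}
        usable = []
        for (u, v), w in edges.items():
            dist = {u: 0}                        # BFS over lighter edges only
            queue = deque([u])
            while queue:
                x = queue.popleft()
                if dist[x] == 2 * k - 1:
                    continue
                for y, wy in adj_w[x]:
                    if wy < w and y not in dist:
                        dist[y] = dist[x] + 1
                        queue.append(y)
            if v not in dist:                    # no short cycle where (u,v)
                usable.append((u, v))            # is heaviest: keep the edge
        return usable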

Page 173: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

KTk model

Algorithm k-Flood (cont)

• Perform the broadcast by Alg. Flood(G*) on G* (i.e., whenever v receives the message for the first time, it sends it over all incident usable edges e∈E*)

Page 174: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Lemma: G connected ⇒ G* connected too.

Consequence of the marking process defining G*: all short cycles are disconnected

⇒ Lemma: Girth(G*) ≥ 2k+1

Page 175: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Recall: For every r≥1, a graph G(V,E) with girth Girth(G) ≥ r has

|E| ≤ n^{1+2/(r-2)} + n

Corollary: |E*| = O(n^{1+c/k}) for a constant c>0

Thm: For every G(V,E) and k≥1, Algorithm k-Flood performs broadcast in the KTk model with Message(k-Flood) = O(min{|E|, n^{1+c/k}}) (for a fixed c>0)

Page 176: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizers revisited

Recall:
• Synchronizers enable transforming an algorithm for synchronous networks into an algorithm for asynchronous networks.
• They operate in 2 phases per pulse:

Phase A (of pulse p): Each processor learns (in finite time) that all messages it sent during pulse p have arrived (it is safe)

Phase B (of pulse p): Each processor learns that all its neighbors are safe w.r.t. pulse p

Page 177: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Learning neighbor safety

(Figure: “Safe” / “Ready” message exchange with the neighbors)

Page 178: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer costs

Goal: Synchronizer capturing reasonable middle points on time-communication tradeoff scale

             α          β
Cpulse    O(|E|)     O(n)
Tpulse    O(1)       O(Diam)

Page 179: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer γ

Assumption: Given a low-degree partition 𝒮:

• Rad(𝒮) ≤ k-1,
• # inter-cluster edges in 𝒢(𝒮) ≤ n^{1+1/k}

Page 180: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer (cont)

For each cluster in 𝒮, build a rooted spanning tree.

In addition, between any two neighboring clusters, designate a synchronization link.

(Figure: cluster spanning trees and a synchronization link)

Page 181: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Handling safety information (in Phase B)

Step 1: For every cluster separately, apply synchronizer β
(By the end of the step, every node knows that every node in its cluster is safe)

(Figure: my_subtree_safe convergecast, then cluster_safe broadcast on the tree)

Page 182: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Handling safety information (in Phase B)

Step 2: Every node incident to a synchronization link sends a message to the other cluster, saying “my cluster is safe”

my_cluster_safe

Page 183: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Handling safety information (in Phase B)

Step 3: Repeat step 1, but the convergecast performed in each cluster carries different information:

• Whenever v learns all clusters neighboring its subtree are safe, it reports this to parent.

all_clusters_adjacent_to_my_subtree_are_safe

Page 184: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Handling safety information (in Phase B)

Step 4: When root learns all neighboring clusters are safe, it broadcasts “start new pulse” on tree

all_neighboring_

clusters_are_safe

(By end of step, every node knows that all its neighbors are safe)

Page 185: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Claim: Synchronizer γ is correct.

Claim:
1. Cpulse(γ) = O(n^{1+1/k})

2. Tpulse(γ) = O(k)

Page 186: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Proof:
Time to implement one pulse: ≤ 2 broadcast/convergecast rounds in the clusters (+ 1 message-exchange step among border vertices in neighboring clusters)

⇒ Tpulse(γ) ≤ 4·Rad(𝒮) + 1 = O(k)

Page 187: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Complexity

Messages: The broadcast/convergecast rounds, separately in each cluster, cost O(n) messages in total (the clusters are disjoint)

The communication step among neighboring clusters requires n·Avc(𝒮) = O(n^{1+1/k}) messages

⇒ Cpulse(γ) = O(n^{1+1/k})

Page 188: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer δ

Assumption: Given a sparse k-spanner G'(V,E')

G'(V,E')

Page 189: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer δ (cont)

Handling safety information (in Phase B):

When v learns it is safe for pulse p, for k rounds do:
1. Send “safe” to all spanner neighbors
2. Wait to hear the same from these neighbors

Page 190: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Synchronizer δ

Lemma: For every 1≤i≤k, once v completes i rounds, every node u at distance dist(u,v,G') ≤ i from v in the spanner G' is safe

Page 191: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Proof: By induction on i.

For i=0: Immediate.

For i+1:Consider the time v finishes (i+1)st round.

Page 192: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

v received i+1 “safe” messages from its neighbors in G'

⇒ these neighbors each sent their (i+1)st message only after finishing their i'th round

Page 193: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

By the inductive hypothesis, for every such neighbor u, every w at distance ≤ i from u in G' is safe.

⇒ Every w at distance ≤ i+1 from v in G' is safe too

(Figure: balls of radius i around v's spanner neighbors)

Page 194: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Corollary: When v finishes k rounds, each neighbor of v in G is safe (v is ready for pulse p+1)

Proof: By the lemma, at that time every processor u at distance ≤ k from v in G' is safe.

By the definition of a k-spanner, every neighbor of v in G is at distance ≤ k from v in G'.

⇒ every neighbor is safe.

Page 195: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Lemma: If G has a k-spanner with m edges, then it has a synchronizer δ with

• Tpulse(δ) = O(k)

• Cpulse(δ) = O(k·m)

Page 196: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Summary

On a general n-vertex graph, for parameter k≥1:

             α          β          γ               δ
Cpulse    O(|E|)     O(n)      O(n^{1+1/k})    O(k·n^{1+1/k})
Tpulse    O(1)       O(Diam)   O(k)            O(k)

Page 197: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Compact routing revisited

Tradeoff between stretch and space: Any routing scheme for general n-vertex networks achieving stretch factor k≥1 must use Ω(n^{1+1/(2k+4)}) bits of routing information overall

(The lower bound holds for unweighted networks as well, and concerns total memory requirements)

Page 198: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing

Goal: Given a tree T, design a routing scheme based on interval labeling

Idea: Label each v by an integer interval Int(v) s.t. for every two vertices u,v:

Int(v) ⊆ Int(u) ⇔ v is a descendant of u in T

Page 199: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval labeling

Algorithm IntLab on tree T
1. Perform a depth-first (DFS) tour of T, starting at the root; assign each u∈T a depth-first number DFS(u)

Page 200: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval labeling (cont)

Algorithm IntLab on tree T
2. Label node u by the interval [DFS(u), DFS(w)], where w = the last descendant of u visited by the DFS

(Labels contain ≤ ⌈2 log n⌉ bits)
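A Python sketch of IntLab (the children-dict tree format is an assumption; the iterative DFS avoids recursion limits on deep trees):

    def int_lab(children, root):
        # Sketch of Algorithm IntLab: DFS-number the tree, then label each u
        # with (DFS(u), DFS(w)), w = last descendant of u in DFS order, so
        # Int(v) is contained in Int(u) iff v is a descendant of u.
        interval, counter = {}, 0
        stack = [(root, False)]
        while stack:
            u, closing = stack.pop()
            if closing:                          # u's subtree fully numbered
                interval[u] = (interval[u], counter - 1)
                continue
            interval[u] = counter                # DFS(u)
            counter += 1
            stack.append((u, True))
            for c in reversed(children.get(u, [])):
                stack.append((c, False))
        return interval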

Page 201: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing

Data structures:Vertex u stores its own label Int(u) and the labels of its children in T

Forwarding protocol: Routes along unique path

Page 202: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing

Lemma: For every tree T=(V,E,ω), the scheme ITR(T) has Dilation(ITR,G)=1, and uses O(Δ(T)·log n) bits per vertex and O(n log n) memory in total

Page 203: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing (cont)

Forwarding protocol: Routing M from u to v:

At an intermediate vertex w along the route, compare Int(v) with Int(w).

Possibilities:
1. Int(w) = Int(v) (w = v): receive M.

Page 204: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing (cont)

2. Int(w) Int(v) (w descendent of v):Forward M upwards to parent


Page 205: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing (cont)

3. Disjoint intervals (v, w in different subtrees): forward M upwards to the parent


Page 206: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Interval tree routing (cont)

4. Int(v) ⊆ Int(w) (v is a descendant of w): examine the intervals of w's children, find the unique child w' s.t. Int(v) ⊆ Int(w'), and forward M to w'

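The four cases translate directly into code; here is a hedged Python sketch of one forwarding step (the label, parent and children maps are assumed data structures, not part of the original scheme's specification):

def itr_forward(w, target, label, parent, children):
    # One forwarding step of ITR at node w, routing toward destination
    # interval 'target'.  Returns the next hop, or None upon delivery.
    def contains(a, b):                    # does interval a contain b?
        return a[0] <= b[0] and b[1] <= a[1]

    iw = label[w]
    if iw == target:                       # case 1: w = v -- deliver
        return None
    if contains(target, iw):               # case 2: w is a descendant of v
        return parent[w]                   # forward upwards
    if not contains(iw, target):           # case 3: disjoint subtrees
        return parent[w]                   # forward upwards
    # case 4: v is a descendant of w -- forward to the unique child
    # whose interval contains the destination interval
    return next(c for c in children[w] if contains(label[c], target))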

Page 207: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

ITR for general networks

1. Construct a shortest paths tree T for G
2. Apply ITR to T

Total memory requirement = O(n log n) bits

Problems:
- stretch may be as high as Rad(G)
- maximum memory per vertex depends on the maximum degree of T

Page 208: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Overcoming high max degree problem

Recall: For every rooted tree T and integer m ≥ 1, there is an embedded virtual tree S with the same node set and the same root (but a different edge set), s.t.

1. Δ(S) ≤ 2m
2. Each edge of S corresponds to a path of length ≤ 2 in T
3. Depth_S(v) ≤ (2·log_m Δ(T) − 1)·Depth_T(v) for every v

Page 209: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Overcoming high max degree problem

Setting m = n^{1/k}, embed in T a virtual tree T' with

• Δ(T') < 2·n^{1/k}

• Depth(T') < (2k − 1)·Rad(G)

Page 210: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Overcoming high max degree problem

Lemma: For every G(V,E,ω), the resulting ITR scheme guarantees message delivery in G with communication O(Rad(G)) and uses O(n log n) memory in total

Problem: stretch may still be as high as Rad(G)

Page 211: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

A regional (C,ρ)-routing scheme

For every u,v:

• If dist(u,v) ≤ ρ: the scheme succeeds in delivering M from u to v.

• Else: routing fails, and M returns to u.

Communication cost ≤ C in either case.

Page 212: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

A regional (C,ρ)-routing scheme

Recall: For graph G(V,E,ω) and integers k,ρ ≥ 1, there is a ρ-tree cover TC = TC(k,ρ) with

• Depth(TC) ≤ (2k−1)·ρ

• Overlap(TC) ≤ 2k·n^{1/k}

⇒ sum of tree sizes ≤ n·Overlap(TC) = O(k·n^{1+1/k})

Page 213: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Data structures

1. Construct the tree cover TC(k,ρ)
2. Assign each tree T in TC(k,ρ) a distinct identifier Id(T)
3. Set up an interval tree routing component ITR(T) on each tree T ∈ TC(k,ρ)

Page 214: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Data structures

Recall: Every v ∈ V has a home tree T = home(v) in TC(k,ρ), containing its entire ρ-neighborhood.

Scheme RS_{k,ρ}:

Routing label for v: the pair (Id(T), Int_T(v)), where
• Id(T) = the ID of v's home tree
• Int_T(v) = v's routing label in ITR(T)

Page 215: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Data structures

Forwarding protocol: routing M from u to v with label (Id(T), Int_T(v)):

- If u does not belong to T: detect an “unknown destination” failure and terminate the routing procedure.
- If u belongs to T: send M using the ITR(T) component
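As a sketch (with assumed helper names trees_of and itr_route, which are not part of the original scheme), the forwarding decision at the source u looks like this:

def rs_forward(u, dest_label, trees_of, itr_route):
    # Regional forwarding in RS_{k,ρ} at source u (illustrative sketch).
    # dest_label = (Id(T), Int_T(v)) as defined above; trees_of[u] is the
    # set of trees of TC(k,ρ) containing u.
    tree_id, int_label = dest_label
    if tree_id not in trees_of[u]:             # u outside v's home tree:
        return None                            # "unknown destination" failure
    return itr_route(tree_id, u, int_label)    # route M within the home tree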

Page 216: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Lemma: For every graph G and integers k,ρ ≥ 1, scheme RS_{k,ρ} is a regional (O(kρ), ρ)-routing scheme, and it uses O(k·n^{1+1/k}·log n) memory

Page 217: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Proof: Stretch: Suppose dist(u,v) ≤ ρ for some u,v.

By definition, v ∈ Γ_ρ(u). Let T = the home tree of u.

Γ_ρ(u) ⊆ V(T) ⇒ v ∈ T ⇒ ITR(T) succeeds.

Also, path length = O(Depth(T)) = O(kρ)

Page 218: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Memory: Each v stores O(deg_{T(C)}(v)·log n) bits for each cluster C to which it belongs, where T(C) = the spanning tree constructed for C

⇒ O(k·n^{1+1/k}·log n) memory in total

Page 219: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Hierarchical routing scheme RS_k

Data structures:

For 1 ≤ i ≤ logD: construct a regional (O(k·ρ_i), ρ_i)-routing scheme R_i = RS_{k,ρ_i} for ρ_i = 2^i

Each v belongs to all regional schemes R_i (it has a home tree home_i(v) in each R_i and a routing label at each level, and stores all the information required for each scheme)

Page 220: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Hierarchical routing scheme RS_k

Routing label = concatenation of regional labels

Forwarding protocol: routing M from u to v:

1. Identify the lowest-level regional scheme R_i that is usable (u first checks whether it belongs to tree home_1(v); if not, it checks the second level, etc.)

2. Forward M to v on the ITR(home_i(v)) component of regional scheme R_i
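A corresponding Python sketch of the hierarchical decision (assuming a regional_route helper that behaves like the regional sketch above, returning None on an “unknown destination” failure):

def rsk_forward(u, v_labels, regional_route, num_levels):
    # Hierarchical forwarding in RS_k at source u (illustrative sketch).
    # v_labels[i] = v's label in regional scheme R_i; the concatenated
    # routing label of v is just this sequence for i = 1..num_levels.
    for i in range(1, num_levels + 1):     # try lowest levels first
        outcome = regional_route(i, u, v_labels[i])
        if outcome is not None:            # R_i is usable; its cost
            return outcome                 # O(k·2^i) dominates the total
    # Unreachable: the top-level scheme R_logD always succeeds.
    raise AssertionError("top-level regional scheme must succeed")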

Page 221: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

Lemma: Dilation(RS_k) = O(k).

Proof: Suppose u sends M to v. Let d = dist(u,v) and j = ⌈log d⌉ (so 2^{j−1} < d ≤ 2^j).

Let i = the lowest level s.t. u belongs to v's home tree home_i(v).

Page 222: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

u must belong to home_j(v)

⇒ regional scheme R_j is usable (if no lower level was)

(Note: the highest-level scheme R_logD always succeeds)

Comm(RS_k,u,v) ≤ length of the route traversed

≤ ∑_{i=1}^{j} O(k·2^i)

≤ O(k·2^{j+1}) ≤ O(k)·dist(u,v)

Page 223: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Thm: For every graph G and integer k ≥ 1, the hierarchical routing scheme RS_k has
Dilation(RS_k) = O(k)
Mem(RS_k) = O(k·n^{1+1/k}·log n·logD)

Page 224: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Proof: The memory required by the hierarchical scheme = logD terms, each bounded by O(k·n^{1+1/k}·log n)

⇒ total memory = O(k·n^{1+1/k}·log n·logD) bits

Page 225: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Deterministic decomposition-based MIS

Assumption: we are given a (d,c)-decomposition for G, plus a coloring of the clusters in the cluster graph

MIS computation: c phases; phase i computes an MIS among the vertices belonging to clusters colored i

(These clusters are non-adjacent, so we may compute an MIS for each independently, in parallel, using a PRAM-based distributed algorithm, in time O(d·log² n).)

Page 226: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Deterministic MIS (cont)

Note: A vertex joining the MIS must mark all its neighbors as excluded from the MIS, including those of other colors

⇒ not all occupants of clusters colored i participate in phase i - only those not excluded in earlier phases

Page 227: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Deterministic MIS (cont)

Procedure DecompToMIS(d,c) - code for v

For phase i = 1 through c do: /* each phase consists of O(d·log² n) rounds */

1. If v's cluster is colored i then do:
   a. If v has not yet decided (its status is still undecided)
      then compute an MIS on the cluster using the PRAM-based algorithm
   b. If v joined the MIS then inform all neighbors
2. Else, if a neighbor joined the MIS, then decide 0 (v is not in the MIS)
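For intuition, here is a sequential toy simulation of the procedure in Python (a sketch only: the per-cluster PRAM-based MIS step is replaced by a greedy stand-in, and the O(d·log² n)-round distributed execution is not modeled):

def decomp_to_mis(graph, cluster_of, color_of, c):
    # graph: adjacency dict; cluster_of[v]: v's cluster; color_of[C]:
    # cluster C's color in 1..c.  Status: -1 undecided, 1 in MIS, 0 out.
    status = {v: -1 for v in graph}
    for i in range(1, c + 1):                      # phase i
        for v in graph:
            if color_of[cluster_of[v]] != i or status[v] != -1:
                continue                           # wrong color or decided
            # greedy stand-in for the per-cluster PRAM-based MIS step
            if all(status[u] != 1 for u in graph[v]):
                status[v] = 1                      # v joins the MIS and
                for u in graph[v]:                 # excludes all undecided
                    if status[u] == -1:            # neighbors, including
                        status[u] = 0              # those of other colors
    return {v for v, s in status.items() if s == 1}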

Page 228: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis

# phases = O(c)

⇒ Time = O(c·d·log² n)

Lemma: There is a deterministic distributed algorithm that, given a colored (d,c)-decomposition for G, computes an MIS for G in time O(d·c·log² n)

Recall: For every graph G and k ≥ 1, there is a (k, k·n^{1/k})-decomposition

Page 229: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.

Analysis (cont)

Taking k = log n, we get:

Corollary: Given a colored (log n, log n)-decomposition for G, there is a deterministic distributed MIS algorithm with time O(polylog n)

Recall: There is a deterministic algorithm for computing such a decomposition in time O(2^λ) for λ = c·√(log n), constant c > 0

Corollary: There is a deterministic distributed MIS algorithm with time O(2^{√(c·log n)})

Page 230: Locality Sensitive Distributed Computing David Peleg Weizmann Institute.