Locality Sensitive Distributed Computing David Peleg Weizmann Institute.
Locality Sensitive Distributed Computing
David Peleg, Weizmann Institute
Structure of mini-course
1. Basics of distributed network algorithms
2. Locality-preserving network representations
3. Constructions and applications
Part 2: Representations
1. Clustered representations
• Basic concepts: clusters, covers, partitions
• Sparse covers and partitions
• Decompositions and regional matchings
2. Skeletal representations
• Spanning trees and tree covers
• Sparse and light-weight spanners
Basic idea of locality-sensitive distributed computing
Utilize locality to both
• simplify control structures and algorithms, and
• reduce their costs
Operation performed in large network may concern few processors in small region
(Global operation may have local sub-operations)
Reduce costs by utilizing “locality of reference”
Components of locality theory
• General framework, complexity measures and algorithmic methodology
• Suitable graph-theoretic structures and efficient construction methods
• Adaptation to wide variety of applications
Fundamental approach
Clustered representation:
• Impose a clustered hierarchical organization on the given network
• Use it efficiently for bounding the complexity of distributed algorithms

Skeletal representation:
• Sparsify the given network
• Execute applications on the remaining skeleton, reducing complexity
Clusters, covers and partitions
Cluster = connected subset of vertices S ⊆ V
Clusters, covers and partitions
Cover of G(V,E,ω) = collection of clusters 𝒮 = {S1,...,Sm} containing all vertices of G
(i.e., s.t. ∪i Si = V)

Partitions

Partial partition of G = collection of disjoint clusters 𝒮 = {S1,...,Sm}, i.e., s.t. Si ∩ Sj = ∅
Partition = cover & partial partition
Evaluation criteria
Locality and Sparsity
Locality level: cluster radius
Sparsity level: vertex / cluster degrees
Evaluation criteria
Locality - sparsity tradeoff:
locality and sparsity parametersgo opposite ways:
better sparsity ⇔ worse locality (and vice versa)
Evaluation criteria
Locality measures
Weighted distances:
Length of path (e1,...,es) = ∑1≤i≤s ω(ei)
dist(u,w,G) = (weighted) length of shortest path
dist(U,W) = min{ dist(u,w) | u∈U, w∈W }
Evaluation criteria
Diameter, radius: As before, except weighted
Denote logD = ⌈log Diam(G)⌉
For a collection of clusters 𝒮:
• Diam(𝒮) = maxi Diam(Si)
• Rad(𝒮) = maxi Rad(Si)
Neighborhoods
Γ(v) = neighborhood of v = set of neighbors in G (including v itself)
Neighborhoods
Γρ(v) = ρ-neighborhood of v = vertices at distance ρ or less from v

[Figure: nested neighborhoods Γ0(v), Γ1(v), Γ2(v)]
Neighborhood covers
For W ⊆ V:

𝒮ρ(W) = ρ-neighborhood cover of W = { Γρ(v) | v∈W }

(collection of ρ-neighborhoods of W vertices)
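The two definitions above can be made concrete with a minimal Python sketch (assuming an unweighted graph stored as an adjacency dict; the function names are illustrative, not from the source):

```python
from collections import deque

def neighborhood(G, v, rho):
    """Gamma_rho(v): all vertices within rho hops of v, including v itself.

    G is an adjacency dict {vertex: set of neighbors}; distances here are
    hop counts, i.e. the unweighted special case.
    """
    dist = {v: 0}
    q = deque([v])
    while q:
        u = q.popleft()
        if dist[u] == rho:        # do not expand beyond radius rho
            continue
        for w in G[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return set(dist)

def neighborhood_cover(G, W, rho):
    """S_rho(W) = { Gamma_rho(v) | v in W }."""
    return [neighborhood(G, v, rho) for v in W]
```

For example, on a 5-vertex path, `neighborhood_cover(P, [0, 4], 2)` yields the two 2-neighborhoods of the endpoints.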
Neighborhood covers
E.g.: 𝒮0(V) = partition into singleton clusters
Neighborhood covers
E.g.: 𝒮1(W) = cover of W nodes by neighborhoods

[Figure: W = colored nodes, covered by 𝒮1(W)]
Sparsity measures
Different representations
Different ways to measure sparsity
Cover sparsity measure - overlap
deg(v,𝒮) = # occurrences of v in clusters S∈𝒮, i.e., degree of v in the hypergraph (V,𝒮)

[Figure: vertex v with deg(v) = 3]

ΔC(𝒮) = maximum degree of cover 𝒮

Av(𝒮) = average degree of 𝒮 = ∑v∈V deg(v,𝒮) / n = ∑S∈𝒮 |S| / n
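These sparsity measures are direct to compute; a small sketch (representing a cover as a list of vertex sets; names are illustrative):

```python
def deg(v, cover):
    """deg(v, S): number of clusters of the cover containing v."""
    return sum(1 for S in cover if v in S)

def max_degree(cover):
    """Delta^C(S): maximum degree over all covered vertices."""
    verts = set().union(*cover)
    return max(deg(v, cover) for v in verts)

def avg_degree(cover):
    """Av(S) = sum of cluster sizes / n; equals average vertex degree."""
    verts = set().union(*cover)
    return sum(len(S) for S in cover) / len(verts)
```

E.g., for the cover [{1,2}, {2,3}, {3,4}] both 2 and 3 lie in two clusters, so the maximum degree is 2 and the average degree is 6/4 = 1.5.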
Partition sparsity measure - adjacency
Intuition: “contract” the clusters into super-nodes and look at the resulting cluster graph of 𝒮, 𝒢(𝒮) = (𝒮, ℰ)
Partition sparsity measure - adjacency
ℰ = inter-cluster edges:

𝒢(𝒮) = (𝒮, ℰ), ℰ := {(S,S') | S,S'∈𝒮, G contains an edge (u,v) with u∈S and v∈S'}
Cluster-neighborhood
Def: Given a partition 𝒮, cluster S∈𝒮, integer ρ ≥ 0:

Cluster-neighborhood of S = ρ-neighborhood of S in the cluster graph 𝒢(𝒮):

Γcρ(S,G) = Γρ(S, 𝒢(𝒮))
Sparsity measure
Average cluster-degree of partition 𝒮:

Avc(𝒮) = ∑S∈𝒮 |Γc(S)| / n

Note: Avc(𝒮) ~ # inter-cluster edges (normalized by n)
Example: A basic construction
Goal: produce a partition 𝒮 with:
1. clusters of radius ≤ k
2. few inter-cluster edges (or, low Avc(𝒮))
Algorithm BasicPart
Algorithm operates in iterations,each constructing one cluster
Example: A basic construction
At end of iteration:
- Add the resulting cluster S to the output collection 𝒮
- Discard it from V
- If V is not empty then start a new iteration
Iteration structure
• Arbitrarily pick a vertex v from V
• Grow cluster S around v, adding layer by layer
• Vertices added to S are discarded from V
Iteration structure
• The layer merging process is repeated until reaching the required sparsity condition:
adding the next layer would increase # vertices by a factor of < n^(1/k)
(i.e., |Γ(S)| < |S| · n^(1/k))
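The iteration structure above can be sketched in a few lines of Python (a minimal sequential sketch, assuming an unweighted adjacency-dict graph; the merge continues while the next layer still multiplies the cluster size by at least n^(1/k)):

```python
def basic_part(G, k):
    """Sketch of Algorithm BasicPart.

    G: adjacency dict {v: set of neighbors}. Grows each cluster layer by
    layer from an arbitrary remaining vertex; a layer is merged only while
    it keeps |S ∪ Γ(S)| >= |S| * n**(1/k), so the final cluster has a
    sparse boundary and radius <= k-1.
    """
    n = len(G)
    remaining = set(G)
    clusters = []
    while remaining:
        v = next(iter(remaining))          # arbitrary next center
        S = {v}
        while True:
            frontier = {w for u in S for w in G[u]
                        if w in remaining and w not in S}
            # stop when the next layer grows S by a factor < n^(1/k)
            if not frontier or len(S) + len(frontier) < len(S) * n ** (1 / k):
                break
            S |= frontier
        clusters.append(S)
        remaining -= S                     # discard clustered vertices
    return clusters
```

On K4 with k=2 the whole graph becomes one cluster of radius 1; on a path with k=1 every vertex ends up a singleton, matching the Rad ≤ k-1 bound.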
Analysis
Av-Deg-Partition Thm: Given an n-vertex graph G(V,E) and integer k ≥ 1, Alg. BasicPart creates a partition 𝒮 satisfying:
1) Rad(𝒮) ≤ k-1,
2) # inter-cluster edges in 𝒢(𝒮) ≤ n^(1+1/k) (or, Avc(𝒮) ≤ n^(1/k))
Analysis (cont)
Proof:
Correctness:
• Every S added to 𝒮 is a (connected) cluster
• The generated clusters are disjoint (the Alg erases from V every v added to a cluster)
• 𝒮 is a partition (covers all vertices)
Analysis (cont)
Property (2): [ # edges in 𝒢(𝒮) ≤ n^(1+1/k) ]
By the termination condition of the internal loop, the resulting S satisfies |Γ(S)| ≤ n^(1/k)·|S|
⇒ (# inter-cluster edges touching S) ≤ n^(1/k)·|S|
The number can only decrease in later iterations, if adjacent vertices get merged into the same cluster
⇒ |ℰ| ≤ ∑S∈𝒮 n^(1/k)·|S| = n^(1+1/k)
Analysis (cont)
Property (1): [ Rad(𝒮) ≤ k-1 ]
Consider an iteration of the main loop.
Let J = # times the internal loop was executed
Let Si = S as constructed on the i'th internal iteration
⇒ |Si| > n^((i-1)/k) for 2 ≤ i ≤ J (by induction on i)
Analysis (cont)
⇒ J ≤ k (otherwise, |S| > n)
Note: Rad(Si) ≤ i-1 for every 1 ≤ i ≤ J
(S1 is composed of a single vertex; each additional layer increases Rad(Si) by 1)
⇒ Rad(SJ) ≤ k-1
Variant - Separated partial partitions
Sep(𝒮) = separation of partial partition 𝒮 = minimal distance between any two 𝒮 clusters

When Sep(𝒮) = s, we say 𝒮 is s-separated
Example: 2-separated partial partition
Coarsening
Cover 𝒯 = {T1,...,Tq} coarsens 𝒮 = {S1,...,Sp} if the 𝒮 clusters are fully subsumed in 𝒯 clusters
Coarsening (cont)
The radius ratio of the coarsening = Rad(𝒯) / Rad(𝒮)

[Figure: cluster of radius r coarsened into a cluster of radius R; radius ratio = R / r]
Coarsening (cont)
Motivation: Given a “useful” 𝒮 with high overlaps:
Coarsen 𝒮 by merging some clusters together, getting a coarsening cover 𝒯 with
• larger clusters
• better sparsity
• increased radii
Sparse covers

Goal: For an initial cover 𝒮, construct a coarsening 𝒯 with low overlaps, paying little in cluster radii
Inherent tradeoff:
lower overlap ⇔ higher radius ratio (and vice versa)

Simple goal: low average degree
Sparse covers

Algorithm AvCover
Operates in iterations; each iteration merges together some 𝒮 clusters into one output cluster Z

At end of iteration:
• Add the resulting cluster Z to the output collection 𝒯
• Discard the merged clusters from 𝒮
• If 𝒮 is not empty then start a new iteration
Sparse covers
Algorithm AvCover – high-level flow
Iteration structure
• Arbitrarily pick a cluster S0 in 𝒮 (as kernel Y of the cluster Z constructed next)
• Repeatedly merge the cluster with intersecting clusters from 𝒮 (adding one layer at a time)
• Clusters added to Z are discarded from 𝒮
Iteration structure
- The layer merging process is repeated until reaching the required sparsity condition:
adding the next layer increases # vertices by a factor of ≤ n^(1/k)
(|Z| ≤ |Y| · n^(1/k))
Analysis
Thm: Given a graph G(V,E,ω), a cover 𝒮, and an integer k ≥ 1, Algorithm AvCover constructs a cover 𝒯 s.t.:
1. 𝒯 coarsens 𝒮
2. Rad(𝒯) ≤ (2k+1) Rad(𝒮) (radius ratio ≤ 2k+1)
3. Av(𝒯) ≤ n^(1/k) (low average sparsity)
Analysis (cont)
Corollary for ρ-neighborhood covers:
Given G(V,E,ω) and integers k,ρ ≥ 1, there exists a cover 𝒯 = 𝒯ρ,k s.t.
1. 𝒯 coarsens the neighborhood cover 𝒮ρ(V)
2. Rad(𝒯) ≤ (2k+1)ρ
3. Av(𝒯) ≤ n^(1/k)
Analysis (cont)
Proof of Thm:
Property (1): [ 𝒯 coarsens 𝒮 ]
Holds directly from the construction
(Each Z added to 𝒯 is a (connected) cluster, since at the beginning 𝒮 contained clusters)
Analysis (cont)
Claim: The kernels Y corresponding to clusters Z generated by the algorithm are mutually disjoint.

Proof: By contradiction.
Suppose there is a vertex v s.t. v ∈ Y∩Y'.
W.l.o.g. suppose Y was created before Y'.
v∈Y' ⇒ there is a cluster S' s.t. v∈S' and S' was still in 𝒮 when the algorithm started constructing Y'.

Analysis (cont)

But S' satisfies S'∩Y ≠ ∅
⇒ The final merge creating Z from its kernel Y should have added S' into Z and eliminated it from 𝒮; contradiction.
Output clusters and kernels

[Figure: output cover clusters Z and their mutually disjoint kernels Y]
Analysis (cont)
Property (2): [ Rad(𝒯) ≤ (2k+1)·Rad(𝒮) ]
Consider some iteration of the main loop (starting with cluster S0)
J = # times the internal loop was executed
𝒮0 = initial set
𝒮i = set constructed on the i'th internal iteration (1 ≤ i ≤ J); respectively Zi, Yi

Analysis (cont)

Note 1: |Zi| > n^(i/k) for every 1 ≤ i ≤ J-1
⇒ J ≤ k
Note 2: Rad(Yi) ≤ (2i-1)·Rad(𝒮) for every 1 ≤ i ≤ J
⇒ Rad(YJ) ≤ (2k-1)·Rad(𝒮)
Analysis (cont)
Property (3): [ Av(𝒯) ≤ n^(1/k) ]
Av(𝒯) = ∑Zi |Zi| / n
≤ ∑Zi |Yi|·n^(1/k) / n
≤ n · n^(1/k) / n (the Yi's are disjoint)
= n^(1/k)
Partial partitions
Goal: Given an initial cover 𝒮 and integer k ≥ 1, construct a partial partition 𝒯:
• subsuming a “large” subset 𝒮' of the clusters of 𝒮,
• with low radius ratio.
Partial partitions (cont)
Procedure Part
General structure and iterations similar to Algorithm AvCover, except for two differences:

Small difference: the procedure also keeps the “unmerged” collections of original 𝒮 clusters merged into Y and into Z.
Partial partitions (cont)
Small difference (cont):
The sparsity condition concerns the sizes of these collections, i.e., # original clusters “captured” by the merge, and not the sizes of Y, Z, i.e., # vertices covered.
Merging ends when the next iteration increases # clusters merged into Z by a factor ≤ |𝒮|^(1/k).
Main difference
The procedure removes from 𝒮 all clusters merged into Z, but takes into the output collection 𝒯 only the kernel Y, not the cluster Z
Main difference
Implication: Each selected cluster Y has an additional “external layer” of clusters around it, acting as a “protective barrier” providing disjointness between different clusters Y, Y' added to 𝒯
Main difference
Note: Not all 𝒮 clusters are subsumed by 𝒯
(e.g., those merged into some external layer will not be subsumed)
Analysis
Partial Partition Lemma: Given a graph G(V,E,ω), a cluster collection 𝒮 and an integer k ≥ 1, the collections 𝒯 and 𝒮' constructed by Procedure Part(𝒮) satisfy:
1. 𝒯 coarsens 𝒮' (as before)
2. 𝒯 is a partial partition (i.e., Y∩Y' = ∅ for every Y,Y' ∈ 𝒯) (guaranteed by construction)
3. |𝒮'| ≥ |𝒮|^(1-1/k) (# clusters discarded ≤ |𝒮|^(1/k) · # clusters taken)
4. Rad(𝒯) ≤ (2k-1)·Rad(𝒮) (as before)
s-Separated partial partitions
Goal: For an initial ρ-neighborhood cover 𝒮 and s,k ≥ 1, construct an s-separated partial partition 𝒯 subsuming a “large” subset of the 𝒮 clusters, with low radius ratio.
s-Separated partial partitions (cont)
Procedure SepPart
• Given 𝒮, construct a modified collection 𝒮' of neighborhoods of radius ρ' = ρ + s/2:

𝒮 = {Γρ(v) | v∈W} for some W ⊆ V
𝒮' = {Γρ'(v) | v∈W}
Analysis
Lemma: Given a graph G(V,E,ω), a collection 𝒮 of ρ-neighborhoods and integers s,k, the collections 𝒯 and 𝒮' constructed by Procedure SepPart satisfy:
1. 𝒯 coarsens 𝒮'
2. 𝒯 is an s-separated partial partition
3. |𝒮'| ≥ |𝒮|^(1-1/k)
4. Rad(𝒯) ≤ (2k-1)·ρ + k·s
Sparse covers with low max degree
Goal: For an initial cover 𝒮, construct a coarsening cover 𝒯 with low max degree and low radius ratio.
Idea: Reduce to sub-problem of partial partition
Low max degree covers (cont)
Strategy: Given an initial cover 𝒮 and integer k ≥ 1:
1. Repeatedly select low-radius partial partitions, each subsuming many clusters of 𝒮.
2. Their union should subsume all of 𝒮.
3. The resulting overlap = # partial partitions.
Low max degree covers (cont)
Algorithm MaxCover
• Cover the 𝒮 clusters by several partial partitions (repeatedly applying Procedure Part to the remaining clusters, until 𝒮 is empty)
• Merge the constructed partial partitions into the desired cover 𝒯
Low max degree covers (cont)
Max-Deg-Cover Thm: Given G(V,E,ω), a cover 𝒮, and an integer k ≥ 1, Algorithm MaxCover constructs a cover 𝒯 satisfying:
1. 𝒯 coarsens 𝒮,
2. Rad(𝒯) ≤ (2k-1) Rad(𝒮),
3. ΔC(𝒯) ≤ 2k·|𝒮|^(1/k)
Analysis
Proof: Define
𝒮i = contents of 𝒮 at the start of phase i; ri = |𝒮i|
𝒯i = set added to 𝒯 at the end of phase i
ℛi = set removed from 𝒮 at the end of phase i
Analysis (cont)
Property (1): [ 𝒯 coarsens 𝒮 ]
Since 𝒮 = ∪i ℛi, 𝒯 = ∪i 𝒯i, and by the Partial Partition Lemma, 𝒯i coarsens ℛi for every i.

Property (2): [ Rad(𝒯) ≤ (2k-1) Rad(𝒮) ]
Directly by the Partial Partition Lemma
Analysis (cont)
Property (3): [ ΔC(𝒯) ≤ 2k·|𝒮|^(1/k) ]
By the Partial Partition Lemma, the clusters in each 𝒯i are disjoint
⇒ # clusters v belongs to ≤ # phases of the algorithm
Analysis (cont)
Observation: In every phase i, the # of clusters ℛi removed from 𝒮i satisfies
|ℛi| ≥ |𝒮i|^(1-1/k)
(by the Partial Partition Lemma)
⇒ the size of the remaining 𝒮i shrinks as ri+1 ≤ ri - ri^(1-1/k)
Analysis (cont)
Claim: Given the recurrence xi+1 = xi - xi^δ, 0 < δ < 1, let f(n) = least index i s.t. xi ≤ 1 given x0 = n. Then
f(n) < ((1-δ) ln 2)^(-1) · n^(1-δ)

Consequently: as r0 = |𝒮|, 𝒮 is exhausted after ≤ 2k·|𝒮|^(1/k) phases of Algorithm MaxCover
⇒ ΔC(𝒯) ≤ 2k·|𝒮|^(1/k)
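The claimed bound on the recurrence is easy to check numerically; a small sketch (function names are illustrative):

```python
import math

def f(n, delta):
    """Least index i with x_i <= 1 for the recurrence
    x_{i+1} = x_i - x_i**delta, starting from x_0 = n."""
    x, i = n, 0
    while x > 1:
        x -= x ** delta   # while x > 1, x**delta > 1, so x strictly drops
        i += 1
    return i

def bound(n, delta):
    """The claimed upper bound ((1-delta) ln 2)^(-1) * n^(1-delta)."""
    return n ** (1 - delta) / ((1 - delta) * math.log(2))
```

With δ = 1 - 1/k this bounds the number of MaxCover phases; e.g. f(1000, 0.5) is well below bound(1000, 0.5) ≈ 91.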
Analysis (cont)
Corollary for ρ-neighborhood covers:

Given G(V,E,ω) and integers k,ρ ≥ 1, there exists a cover 𝒯 = 𝒯ρ,k satisfying:
1. 𝒯 coarsens 𝒮ρ(V)
2. Rad(𝒯) ≤ (2k-1)ρ
3. ΔC(𝒯) ≤ 2k·n^(1/k)
Covers based on s-separated partial partitions
Goal: A cover 𝒯 coarsening the neighborhood cover 𝒮ρ(V), in which the partial partitions are well separated.

Method: Substitute Procedure SepPart for Procedure Part in Algorithm MaxCover.
Covers based on s-separated partial partitions
Thm: Given G(V,E,ω) and integers k,ρ ≥ 1, there exists a cover 𝒯 = 𝒯ρ,k s.t.:
1. 𝒯 coarsens 𝒮ρ(V),
2. Rad(𝒯) ≤ (2k-1)ρ + k·s,
3. ΔC(𝒯) ≤ 2k·n^(1/k),
4. each of the ΔC(𝒯) layers of partial partitions composing 𝒯 is s-separated.
Related graph representations
Network decomposition:

A partition 𝒮 is a (d,c)-decomposition of G(V,E) if
• the radius of clusters in G is Rad(𝒮) ≤ d
• the chromatic number of the cluster graph 𝒢(𝒮) is χ(𝒢(𝒮)) ≤ c
Example: A (2,3)-decomposition
Rad(𝒮) ≤ 2
χ(𝒢(𝒮)) ≤ 3
Decomposition algorithm
Algorithm operates in iterations
In each iteration i:
- Invoke Procedure SepPart to construct a 2-separated partial partition for V

At end of iteration:
- Assign color i to all output clusters
- Delete covered vertices from V
- If V is not empty then start a new iteration
Decomposition algorithm (cont)
Main properties:
1. Uses Procedure SepPart instead of Part (i.e., guaranteed separation = 2, not 1)
⇒ Ensures all output clusters of a single iteration can be colored by a single color
2. Each iteration applies only to the remaining nodes
⇒ Clusters generated in different iterations are disjoint.
Analysis
Thm: Given G(V,E,ω) and k ≥ 1, there is a (k, k·n^(1/k))-decomposition.

Proof:
Note: The final collection is a partition
(- each collection generated by SepPart is a partial partition
- vertices added to the output of iteration i are removed from V)

Analysis (cont)

An iteration starting with collection 𝒮 removes a subcollection of size Ω(|𝒮|^(1-1/k))
⇒ The process continues for ≤ O(k·n^(1/k)) iterations
⇒ We end with O(k·n^(1/k)) colors, and each cluster has O(k) diameter.
Picking k = log n:
Corollary: Every n-vertex graph G has a (log n, log n)-decomposition.
Skeletal representations
Spanner: connected subgraph spanning all nodes(Special case: spanning tree)
Tree cover: collection of trees covering G
Skeletal representations
Evaluation criteria
Locality level: stretch factor
Sparsity level: # edges
As for clustered representations, locality and sparsity parameters go opposite ways:
better sparsity ⇔ worse locality
Stretch
Given a graph G(V,E,ω) and a spanning subgraph G'(V,E'), the stretch factor of G' is:

Stretch(G') = maxu,v∈V { dist(u,v,G') / dist(u,v,G) }

[Figure: example graph G and subgraph G' with Stretch(G') = 2]
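The stretch definition can be evaluated directly on small weighted graphs; a minimal sketch (Dijkstra over an adjacency dict {u: {v: weight}}; names are illustrative):

```python
import itertools
from heapq import heappush, heappop

def dist(adj, u, v):
    """Weighted shortest-path distance from u to v (inf if unreachable)."""
    pq, seen = [(0, u)], set()
    while pq:
        d, x = heappop(pq)
        if x == v:
            return d
        if x in seen:
            continue
        seen.add(x)
        for y, w in adj[x].items():
            if y not in seen:
                heappush(pq, (d + w, y))
    return float("inf")

def stretch(G, H):
    """Stretch(H) = max over vertex pairs of dist_H(u,v) / dist_G(u,v)."""
    return max(dist(H, u, v) / dist(G, u, v)
               for u, v in itertools.combinations(G, 2))
```

E.g., deleting one edge of a unit-weight 4-cycle forces its endpoints to a detour of length 3, giving stretch 3.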
Depth
Def: Depth of v in tree T = distance from the root: DepthT(v) = dist(v,r0,T)

Depth(T) = maxv DepthT(v) = radius w.r.t. the root: Depth(T) = Rad(r0,T)
Sparsity measures
Def: Given a subgraph G'(V',E') of G(V,E,ω):

ω(G') = weight of G' = ∑e∈E' ω(e)

Size of G' = # edges, |E'|
Spanning trees - basic types
MST: minimum-weight spanning tree of G = spanning tree TM minimizing ω(TM)

SPT: shortest paths tree of G w.r.t. a given root r0 = spanning tree TS s.t. for every v ≠ r0, the path from r0 to v in the tree is the shortest possible, or, Stretch(TS,r0) = 1
Spanning trees - basic types
BFS: breadth-first tree of G w.r.t. a given root r0 = spanning tree TB s.t. for every v ≠ r0, the path from r0 to v in the tree is the shortest possible, measuring path length in # edges
Controlling tree degrees
deg(v,G) = degree of v in G
Δ(G) = max degree in G

Tree Embedding Thm: For every rooted tree T and integer m ≥ 1, there is an embedded virtual tree S with the same node set and same root (but a different edge set), s.t.
1. Δ(S) ≤ 2m
2. Each edge of S has a corresponding path of length ≤ 2 in T
3. DepthS(v) ≤ (2·logm Δ(T) - 1) · DepthT(v) for every v
Proximity-preserving spanners

Motivation: How good is a shortest paths tree as a spanner?

TS preserves distances in the graph w.r.t. the root r0, i.e., achieves Stretch(TS,r0) = 1

However, it fails to preserve distances w.r.t. vertex pairs not involving r0 (or, to bound Stretch(TS))

Q: Construct an example where two neighboring vertices in G are at distance 2·Depth(T) in the SPT
Proximity-preserving spanners
k-Spanner: Given a graph G(V,E,ω), the subgraph G'(V,E') is a k-spanner of G if Stretch(G') ≤ k
Typical goal:Find sparse (small size, small weight) spanners with small stretch factor
Example - 2-spanner
Tree covers
Basic notion: A tree T covering the ρ-neighborhood of v

[Figure: Γ2(v) and a covering tree T]
Tree covers (cont)
ρ-tree cover for graph G = tree cover for 𝒮ρ(V) = collection TC of trees in G s.t. for every v∈V there is a tree T∈TC (denoted home(v)) spanning the ρ-neighborhood of v

Depth(TC) = maxT∈TC {Depth(T)}
Overlap(TC) = maxv {# trees containing v}
Tree covers
Algorithm TreeCover(G,k,ρ)
1. Construct the ρ-neighborhood cover of G, 𝒮 = 𝒮ρ(V)
2. Compute a coarsening cover 𝒯 for 𝒮 as in the Max-Deg-Cover Thm, with parameter k
3. Select in each cluster R∈𝒯 an SPT T(R) rooted at some center of R and spanning R
4. Set TC(k,ρ) = { T(R) | R∈𝒯 }
Tree covers (cont)
Thm: For every graph G(V,E,ω) and integers k,ρ ≥ 1, there is a ρ-tree cover TC = TC(k,ρ) with
• Depth(TC) ≤ (2k-1)ρ
• Overlap(TC) ≤ ⌈2k·n^(1/k)⌉
Tree covers (cont)
Proof:
1. The TC built by Alg. TreeCover is a ρ-tree cover:
Consider v∈V
𝒯 coarsens 𝒮
⇒ there is a cluster R∈𝒯 s.t. Γρ(v) ⊆ R
⇒ tree T(R)∈TC covers the ρ-neighborhood Γρ(v)
Tree covers (cont)
2. Bound on Depth(TC): follows from the radius bound on the clusters of cover 𝒯, guaranteed by the Max-Deg-Cover Thm, as the trees are SPTs.

3. Bound on Overlap(TC): follows from the degree bound on 𝒯 (Max-Deg-Cover Thm), as |𝒮| = n
Tree covers (cont)
Relying on the Theorem and the Tree Embedding Thm, taking m = n^(1/k):

Corollary: For every graph G(V,E,ω) and integers k,ρ ≥ 1, there is a (virtual) ρ-tree cover TC = TC(k,ρ) for G with
• Depth(TC) ≤ (2k-1)²ρ
• Overlap(TC) ≤ ⌈2k·n^(1/k)⌉
• Δ(T) ≤ 2n^(1/k) for every tree T∈TC
Tree covers (cont)
Motivating intuition: a tree cover TC constructed for a given cluster-based cover 𝒯 serves as a way to “materialize” or “implement” 𝒯 efficiently.

(In fact, applications employing covers actually use the corresponding tree cover)
Sparse spanners for unweighted graphs
Basic lemma: For an unweighted graph G(V,E), a subgraph G' is a k-spanner of G
⇔ for every (u,v)∈E, dist(u,v,G') ≤ k

(No need to look at the stretch of each pair u,v; it suffices to consider the stretch of the edges)
Sparse spanners for unweighted graphs
Algorithm UnweightedSpan(G,k)

1. Set the initial partition 𝒮 = 𝒮0(V) = { {v} | v∈V }
2. Build a coarsening partition 𝒯 using Alg. BasicPart
3. For every cluster Ti∈𝒯, construct an SPT rooted at some center ci of Ti
4. Add all edges of these trees to the spanner G'
5. In addition, for every pair of neighboring clusters Ti,Tj:
- select a single “intercluster” edge eij,
- add it to G'
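The five steps above fit in one short sequential sketch (assuming an unweighted adjacency-dict graph; the BasicPart step is the same sketch as earlier, and names are illustrative):

```python
from collections import deque

def unweighted_span(G, k):
    """Sketch of Algorithm UnweightedSpan: BasicPart partition, one
    BFS (= SPT, unweighted) tree per cluster, plus one representative
    edge per pair of adjacent clusters. Returns edges as frozensets."""
    n = len(G)
    remaining, clusters = set(G), []
    while remaining:                      # Step 2: BasicPart
        S = {next(iter(remaining))}
        while True:
            frontier = {w for u in S for w in G[u]
                        if w in remaining and w not in S}
            if not frontier or len(S) + len(frontier) < len(S) * n ** (1 / k):
                break
            S |= frontier
        clusters.append(S)
        remaining -= S

    edges, home = set(), {}
    for i, S in enumerate(clusters):      # Steps 3-4: BFS tree per cluster
        for v in S:
            home[v] = i
        root = next(iter(S))
        seen, q = {root}, deque([root])
        while q:
            u = q.popleft()
            for w in G[u]:
                if w in S and w not in seen:
                    seen.add(w)
                    edges.add(frozenset((u, w)))
                    q.append(w)

    picked = {}                           # Step 5: one inter-cluster edge
    for u in G:
        for w in G[u]:
            if home[u] != home[w]:
                picked.setdefault(frozenset((home[u], home[w])),
                                  frozenset((u, w)))
    return edges | set(picked.values())
```

On K4 with k=2 the single cluster yields a 3-edge spanning tree; on a path with k=1 the singleton clusters force every edge to be picked as an inter-cluster representative.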
Analysis
Thm: For every unweighted graph G and k ≥ 1,
there is an O(k)-spanner with O(n^(1+1/k)) edges
Analysis (cont)
(a) Estimating # edges in the spanner:
1. 𝒯 is a partition of V
⇒ # edges of the trees built for clusters ≤ n
2. Av-Deg-Partition Thm
⇒ # intercluster edges ≤ n^(1+1/k)
Analysis (cont)
(b) Bounding the stretch:
Consider an edge e=(u,w) in G (recall: it is enough to look at edges)
If e was selected into the spanner ⇒ stretch = 1
So suppose e is not in the spanner.
Analysis (cont)
Case 1: endpoints u,w belong to the same cluster Ti
Clusters have radius r ≤ k-1
⇒ length of the path from u to w through the center ci ≤ 2r
Analysis (cont)
Case 2: the endpoints belong to different clusters, u∈Ti, w∈Tj
These clusters are connected by an inter-cluster edge eij
Analysis (cont)
There is a u-w path going from u to ci (≤ r steps), from ci through eij to cj (≤ r+1+r steps), and from cj to w (≤ r steps)
⇒ total length ≤ 4r+1 ≤ 4k-3
Stretch factor analysis
Fixing k=log n we get:
Corollary: For every unweighted graph G(V,E)there is an O(log n)-spanner with O(n) edges.
Lower bounds
Def:Girth(G) = # edges of shortest cycle in G
Girth = 3 Girth = 4 Girth = ∞
Lower bounds

Lemma: For every k ≥ 1 and every unweighted G(V,E) with Girth(G) ≥ k+2, the only k-spanner of G is G itself
(no edge can be erased from G)
Lower bounds

Proof: Suppose, towards contradiction, that G has some spanner G' in which the edge e=(u,v)∈E is omitted
⇒ G' has an alternative path P of length ≤ k from u to v
⇒ P ∪ {e} = cycle of length ≤ k+1 < Girth(G); contradiction.
Size and girth
Lemma:
• For every r ≥ 3 and every n-vertex, m-edge graph G(V,E) with girth Girth(G) ≥ r, m ≤ n^(1+2/(r-2)) + n
• For every r ≥ 3, there are n-vertex, m-edge graphs G(V,E) with girth Girth(G) ≥ r and m ≥ n^(1+1/r) / 4
Lower bounds (cont)
Thm: For every k ≥ 3, there are graphs G(V,E) for which every (k-2)-spanner requires Ω(n^(1+1/k)) edges
Lower bounds (cont)
Corollary: For every k ≥ 3, there is an unweighted G(V,E) s.t.
(a) for every cover 𝒯 coarsening 𝒮1(V), if Rad(𝒯) ≤ k then Av(𝒯) = Ω(n^(1/k))
(b) for every partition 𝒮 coarsening 𝒮0(V), if Rad(𝒮) ≤ k then Avc(𝒮) = Ω(n^(1/k))
Lower bounds (cont)
Similar bounds are implied for the average-degree partition problem and all maximum-degree problems.

The radius - chromatic number tradeoff for network decomposition presented earlier is also optimal within a factor of k.

A lower bound on the radius-degree tradeoff for ρ-regional matchings on arbitrary graphs follows similarly.
Examples
Restricted graph families behave better:
Graph classes with O(n) edges have a (trivial) optimal spanner (this includes common topologies such as bounded-degree and planar graphs - rings, meshes, trees, butterflies, cube-connected cycles, …)
General picture:
larger k ⇔ sparser spanner
Spanners for weighted graphs
Algorithm Weighted_Span(G,k)

1. For every 1 ≤ i ≤ logD: construct a 2^i-tree-cover TC(k,2^i) for G using Alg. TreeCover
2. Take all edges of the tree covers into the spanner G'(V,E')
Spanners for weighted graphs (cont)
Lemma: The spanner G' built by Alg. Weighted_Span(G,k) has
(1) Stretch(G') ≤ 2k-1
(2) O(logD·k·n^(1+1/k)) edges
Greedy construction
Algorithm GreedySpan(G,k)
/* Generalization of Kruskal's MST algorithm */

1. Sort E by nondecreasing edge weight, getting E = {e1,...,em} (sorted: ω(ei) ≤ ω(ei+1))
2. Set E' = ∅ (spanner edges)
3. Scan the edges one by one; for each ej=(u,v) do:
• Compute P(u,v) = shortest path from u to v in G'(V,E')
• If ω(P(u,v)) > k·ω(ej) (the alternative path is too long)
then E' ← E' ∪ {ej} (must include ej in the spanner)
4. Output G'(V,E')
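The greedy scan is short enough to sketch directly (a minimal sequential sketch over an adjacency dict {u: {v: weight}}; names are illustrative):

```python
from heapq import heappush, heappop

def greedy_span(G, k):
    """Sketch of Algorithm GreedySpan (Kruskal-style).

    An edge (u,v) of weight w enters the spanner H only if H currently
    offers no u-v path of length <= k * w.
    """
    def dist(adj, s, t):
        # Dijkstra inside the partial spanner; inf if t is unreachable
        pq, seen = [(0, s)], set()
        while pq:
            d, x = heappop(pq)
            if x == t:
                return d
            if x in seen:
                continue
            seen.add(x)
            for y, w in adj[x].items():
                heappush(pq, (d + w, y))
        return float("inf")

    # sorted, deduplicated edge list (w, u, v) in nondecreasing weight
    edges = sorted({(w, *sorted((u, v))) for u in G for v, w in G[u].items()})
    H = {u: {} for u in G}
    for w, u, v in edges:
        if dist(H, u, v) > k * w:          # selection rule
            H[u][v] = H[v][u] = w
    return H
```

E.g., on a triangle with weights 1, 1, 3 and k=2, the heavy edge is skipped because the two light edges already give a path of length 2 ≤ 2·3.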
Analysis
Lemma: The spanner G' built by Algorithm GreedySpan(G,k) has Stretch(G') ≤ k

Proof: Consider two vertices x,y of G
Px,y = (e1,...,eq) = shortest x-y path in G
Analysis (cont)
Consider an edge ej=(u,v) along Px,y
If ej was not included in G'
⇒ when ej was examined by the algorithm, E' contained a u-v path Pj = P(u,v) of length ≤ k·ω(ej)
Analysis (cont)
This path exists in the final G'
To mimic the path Px,y in G': replace each “missing” edge ej (not taken into G') by its substitute Pj
⇒ The resulting path has total length ≤ k·ω(Px,y)
Analysis (cont)
Lemma: The spanner satisfies Girth(G') > k+1
Proof: Consider a cycle C in G'. Let ej=(u,v) be the last edge of C added by the algorithm.
When the algorithm examined ej, the spanner E' already contained all other C edges
⇒ the shortest u-v path Pj considered by the algorithm satisfies ω(Pj) ≤ ω(C - {ej})
Analysis (cont)
ej was added to E'
⇒ ω(Pj) > k·ω(ej) (by the selection rule)
⇒ ω(C) ≥ ω(Pj) + ω(ej) > (k+1)·ω(ej)
ej = heaviest edge in C
⇒ ω(C) ≤ |C|·ω(ej)
⇒ |C| > k+1
Analysis (cont)
Corollary: |E'| ≤ n^(1+2/k) + n

Thm: For every weighted graph G(V,E,ω) and k ≥ 1, there is a (2k+1)-spanner G'(V,E') s.t. |E'| < n·⌈n^(1/k)⌉

Recall: For every r ≥ 3, every graph G(V,E) with Girth(G) ≥ r has |E| ≤ n^(1+2/(r-2)) + n
Shallow Light Trees
Goal: Find spanning tree T near-optimal in both depth and weight
Candidate 1: SPT
Problem: ∃G s.t. ω(SPT) = Ω(n·ω(MST))

[Figure: example G, its light MST, and its heavy SPT]
Shallow Light Trees (cont)
Candidate 2: MST
Problem: ∃G s.t. Depth(MST) = Ω(n·Depth(SPT))

[Figure: example G, its deep MST, and its SPT]
Shallow Light Trees (cont)
Shallow-light tree (SLT) for a graph G(V,E,ω) and root r0:
a spanning tree T satisfying both
• Stretch(T,r0) = O(1)
• ω(T) / ω(MST) = O(1)

Thm: Shallow-light trees exist for every graph G and root r0
Light, sparse, low-stretch spanners
Algorithm GreedySpan guarantees:
Thm: For every graph G(V,E,ω) and integer k ≥ 1, there is a spanner G'(V,E') for G with
1. Stretch(G') ≤ 2k+1
2. |E'| < n·⌈n^(1/k)⌉
3. ω(G') = ω(MST(G))·O(n^(1/k))
Lower bound
Thm: For every k ≥ 3, there are graphs G(V,E,ω) s.t. every spanner G'(V,E') for G with Stretch(G') ≤ k-2 requires
• |E'| = Ω(n^(1+1/k)) and
• ω(G') = Ω(ω(MST(G))·n^(1/k))
Proof:By bound for unweighted graphs
Part 3: Constructions and Applications
• Distributed construction of the basic partition
• Fast decompositions
• Exploiting topological knowledge: broadcast revisited
• Local coordination: synchronizers revisited
• Hierarchical example: routing revisited
• Advanced symmetry breaking: MIS revisited
Basic partition construction algorithm
Simple distributed implementation of Algorithm BasicPart

Single “thread” of computation (a single locus of activity at any given moment)
Basic partition construction algorithm
Components
ClusterCons: procedure for constructing a cluster around a chosen center v
NextCtr: procedure for selecting the next center v around which to grow a cluster
RepEdge: procedure for selecting a representative inter-cluster edge between any two adjacent clusters
Analysis
Thm: Distributed Algorithm BasicPart requires
Time = O(nk)
Comm = O(n2)
Efficient cover construction algorithms
Goal: Fast distributed algorithm for coarsening a neighborhood cover
Known: Randomized algorithms for constructing low (average or maximum) degree cover of G,guaranteeing bounds on weak cluster diameter
Efficient decompositions
Goal: fast distributed algorithms for constructing a network decomposition
Basic tool: s-separated, r-ruling set ((s,r)-set):
(a combination of an independent set and a dominating set)

W = {w1,...,wm} ⊆ V s.t.
• dist(wi,wj) ≥ s for 1 ≤ i < j ≤ m,
• for every v∈V there is 1 ≤ i ≤ m s.t. dist(wi,v) ≤ r
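The two conditions of the definition translate directly into a checker; a minimal sketch (unweighted adjacency-dict graph; the function name is illustrative):

```python
from collections import deque

def is_s_r_set(G, W, s, r):
    """Check that W is an s-separated, r-ruling set of unweighted graph G."""
    def dists(src):
        # BFS hop distances from src
        d = {src: 0}
        q = deque([src])
        while q:
            x = q.popleft()
            for y in G[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    q.append(y)
        return d

    dw = {w: dists(w) for w in W}
    inf = float("inf")
    separated = all(dw[a].get(b, inf) >= s
                    for a in W for b in W if a != b)
    ruling = all(any(dw[w].get(v, inf) <= r for w in W) for v in G)
    return separated and ruling
```

On a 5-vertex path, {0, 3} is a (3,2)-set: the two centers are 3 apart and every vertex is within 2 hops of one of them.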
Efficient decompositions
(s,r)-partition: (associated with an (s,r)-set W = {w1,...,wm})

Partition 𝒮(W) = {S1,...,Sm} of G, s.t. for 1 ≤ i ≤ m:
• wi∈Si
• Rad(wi, G(Si)) ≤ r

[Figure: centers wi at pairwise distance ≥ s, each cluster Si of radius ≤ r around wi]
Distributed construction
Using an efficient distributed construction of (3,2)-sets and (3,2)-partitions and a recursive coloring algorithm, one can get:

Thm: There is a deterministic distributed algorithm for constructing a (λ,λ)-decomposition of a given n-vertex graph in time O(λ²), for λ = 2^√(c·log n), for some constant c > 0
Exploiting topological knowledge: Broadcast revisited
Delay measure: When broadcasting from a source s, the message delivery to node v suffers delay ρ if it reaches v after ρ·dist(s,v) time.

For a broadcast algorithm B:
Delay(B) = maxv {Delay(v,B)}
Broadcast on a subgraph
Lemma: Flood(G') broadcast on a subgraph G' costs
• Message(Flood(G')) = |E(G')|
• Comm(Flood(G')) = ω(G')
• Delay(Flood(G')) = Stretch(G')
(in both the synchronous and asynchronous models)
Broadcast (cont)
Selecting an appropriate subgraph:
For a spanning tree T:

• Message(Flood(T)) = n-1 (optimal)
• Comm(Flood(T)) = ω(T)
• Delay(Flood(T)) = Stretch(T,r0)

Goal: lower both ω(T) and Stretch(T,r0)
Broadcast (cont)
Using a light, low-stretch tree (SLT):

Lemma: For every graph G and source v, there is a spanning tree SLTv s.t. broadcast by Flood(SLTv) costs:
• Message(Flood(SLTv)) = n-1
• Comm(Flood(SLTv)) = O(ω(MST))
• Delay(Flood(SLTv)) = O(1)
Broadcasting on a spanner
Disadvantage of SLT broadcast:
A tree efficient for broadcasting from one source is poor for another, w.r.t. Delay

Solution 1: Maintain a separate tree for every source
(heavy memory / update costs, involved control)
Broadcasting on a spanner
Solution 2: Flood(G') broadcast on a spanner G'

Recall: For every graph G(V,E,ω) and integer k ≥ 1, there is a spanner G'(V,E') for G with
1. Stretch(G') ≤ 2k+1
2. |E'| ≤ n^(1+1/k)
3. ω(G') = ω(MST(G))·O(n^(1/k))
Broadcasting on a spanner (cont)
Setting k = log n:

Thm: For every graph G, there is a spanner G' s.t. Algorithm Flood(G') has complexities
• Message(Flood(G')) = O(n · log n · logD)
• Comm(Flood(G')) = O(log n · logD · ω(MST))
• Delay(Flood(G')) = O(log n)
(optimal up to polylog factors in all 3 measures)
Topology knowledge and broadcast
Assumption: No predefined structures exist in G (Broadcast performed “from scratch”)
Focus on message complexity
Extreme models of topological knowledge:
KT∞ model: Full knowledge:
vertices have full topological knowledge
Topology knowledge and broadcast
KT∞ model: Full topological knowledge
⇒ broadcast with the minimal # of messages, Message = Θ(n):
1. Each v locally constructs the same tree T, sending no messages
2. Use the tree broadcast algorithm Flood(T)
Topology knowledge and broadcast
KT0 model: “Clean” network:
Vertices know nothing about the topology

KT1 model: Neighbor knowledge:
Vertices know their own and their neighbors' IDs, nothing else
Topology knowledge & msg complexity
Lemma: In model KT0, every broadcast algorithm must send ≥ 1 message over every edge of G

Proof: Suppose there is an algorithm Π disobeying the claim.
Consider a graph G and edge e=(u,w) s.t. Π broadcasts on G without sending any messages over e
Topology knowledge & msg complexity
Then G can be replaced by G' as follows:

[Figure: G' — edge e is replaced by edges e1, e2, attaching new vertices u' and w']

Clean network model

u and w cannot distinguish between the two topologies G' and G
(no msgs sent on e ⇒ no msgs sent on e1, e2)

Clean network model

In executing the algorithm over G', u and w fail to forward the message to u' and w'
⇒ u' and w' do not get the message; contradiction
Clean network model
Thm: Every broadcast protocol Π for the KT0 model has complexity Message(Π) = Ω(|E|)
Msg complexity of broadcast in KT1
Note: In KT1, the previous intuition fails!

Nodes know the IDs of their neighbors
⇒ not all edges must be used
Broadcast in KT1 (cont)
Traveler algorithm
“Traveler” (token) performs DFS traversal on G
Traveler carries a list L of vertices visited so far.
Broadcast in KT1 (cont)
To pick the next neighbor to visit after v:
- Compare L with the list of v's neighbors,
- Make the next choice only from neighbors not in L
(If all of v's neighbors were already visited, backtrack from v on the edge to its parent.)
[Figure: DFS traversal of a 6-node graph; the traveler's list grows {0}, {0,1}, {0,1,3}, {0,1,3,4}, {0,1,3,4,5}]
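The traveler's rule can be sketched as follows (a minimal sequential simulation, assuming list-valued adjacency for a deterministic visiting order; names are illustrative):

```python
def traveler_broadcast(G, source):
    """Sketch of the KT1 traveler algorithm: a token performs a DFS of G,
    carrying the list L of vertices visited so far; at each node it moves
    only to neighbors not yet in L, backtracking when none remain.

    Returns (visit order, number of forward edges traversed). Backtrack
    steps reuse the same tree edges, so only tree edges are ever crossed.
    """
    L = [source]
    traversed = 0
    stack = [source]                     # current DFS path, for backtracking
    while stack:
        v = stack[-1]
        nxt = next((w for w in G[v] if w not in L), None)
        if nxt is None:
            stack.pop()                  # backtrack toward the parent
        else:
            L.append(nxt)                # forward step on a new tree edge
            traversed += 1
            stack.append(nxt)
    return L, traversed
```

On a 4-node graph with 4 edges, only the 3 DFS-tree edges are traversed forward, illustrating why not every edge carries a message.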
Broadcast in KT1 (cont)
Note: The traveler's “forward” steps are restricted to the edges of the DFS tree spanning G; non-tree edges are not traversed

⇒ No need to send messages on every edge!
Broadcast in KT1 (cont)
Q: Does the traveler algorithm disprove the Ω(|E|) lower bound on messages?

Observe: # basic (O(log n)-bit) messages sent by the algorithm = Θ(n²) >> 2n
(the lists carried by the traveler contain up to O(n) vertex IDs)
⇒ traversing an edge requires O(n) basic messages on average
Ω(|E|) lower bound for KT1

Idea: To avoid traversing an edge e=(v,u), the traveler algorithm must inform, say, v, that u already got the message.
This can only be done by sending some message mentioning u, which is as expensive as traversing e itself…
Intuitively, edge e was “utilized,” just as if a message had actually crossed it
Lower bound (cont)
Def: Edge e=(u,v)∈E is utilized during a run of algorithm Π on G if one of the following events holds:
1. A message is sent on e
2. u either sends or receives a message containing ID(v)
3. v either sends or receives a message containing ID(u)
Lower bound (cont)
m = # utilized edges in a run of the protocol on G
M = # (basic) messages sent during the run

Lemma: M = Ω(m)
Proof: Consider a message sent over e=(u,v). The message contains O(1) node IDs z1,...,zB. Each zi utilizes ≤ 2 edges, (u,zi) and (v,zi) (if they exist). Also, e itself becomes utilized.
Lower bound (cont)
⇒ To prove a lower bound on messages, it suffices to prove a lower bound on # edges utilized by the algorithm

Lemma: Every algorithm for broadcast under the KT1 model must utilize every edge of G

Thm: Every broadcast protocol Π for the KT1 model has complexity Message(Π) = Ω(|E|)
Lower bound (cont)
Observation: The Thm no longer holds if, in addition to arbitrary computations, we allow protocols whose running time is unbounded in the network size.
Once such behavior is allowed, one may encode an unbounded number of ID's by the choice of transmission round, and hence implement, say, the “traveler” algorithm.
(This relates only to the synchronous model;In asynch model such encoding is impossible!)
Hierarchy of partial topological knowledge
KTk model: Known topology to radius k:
Every vertex knows the topology of the neighborhood of radius k around it, G(Γk(v))
Example: In KT2, v knows the topology of its 2-neighborhood
Hierarchy of partial topological knowledge
KTk model: Known topology to radius k:
Every vertex knows the topology of the subgraph of radius k around it, G(Γk(v))
Information-communication tradeoff:
For every fixed k ≥ 1:
# basic messages required for broadcast in the KTk model = Θ(min{|E|, n^(1+Θ(1)/k)})
Hierarchy of partial topological knowledge
Lower bound proof: Variant of KT1 case.
Upper bound idea:
v knows all edges at distance ≤ k from it
⇒ v can detect all short cycles (length ≤ 2k) going through it
⇒ Possible to disconnect all short cycles locally, by deleting one edge in each cycle.
KTk model
Algorithm k-Flood
Assumption: There is some (locally computable) assignment of distinct weights to edges
KTk model
Algorithm k-Flood
• Define subgraph G*(V,E*) of G:
1. Mark the heaviest edge in each short cycle “unusable”,
2. Include precisely all unmarked edges in E*
(Only e's endpoints need to know whether e is usable; given partial topological knowledge, edge deletions are done locally, sending no messages)
KTk model
Algorithm k-Flood (cont)
• Perform broadcast by Alg. Flood(G*) on G* (i.e., whenever v receives the message for the first time, it sends it over all incident usable edges e ∈ E*)
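A minimal centralized sketch of Algorithm k-Flood (all names are mine). It uses the equivalent local marking rule: with distinct weights, e is the heaviest edge on some short cycle (length ≤ 2k) iff its endpoints are joined by a path of ≤ 2k−1 strictly lighter edges.

```python
from collections import deque

def has_lighter_path(adj, u, v, wmax, limit):
    """BFS: is there a u-v path of <= limit hops using only edges
    strictly lighter than wmax?"""
    dist = {u: 0}
    q = deque([u])
    while q:
        x = q.popleft()
        if dist[x] == limit:
            continue
        for y, w in adj[x]:
            if w < wmax and y not in dist:
                if y == v:
                    return True
                dist[y] = dist[x] + 1
                q.append(y)
    return False

def k_flood(adj, src, k):
    """Drop every edge that is the heaviest on a short cycle, then flood
    the message from src on the remaining subgraph G*."""
    usable = set()
    for x in adj:
        for y, w in adj[x]:
            if x < y and not has_lighter_path(adj, x, y, w, 2 * k - 1):
                usable.add((x, y))
    reached, q, messages = {src}, deque([src]), 0
    while q:
        x = q.popleft()
        for a, b in usable:
            if x in (a, b):
                y = b if x == a else a
                messages += 1          # one message per incident usable edge
                if y not in reached:
                    reached.add(y)
                    q.append(y)
    return usable, reached, messages
```

On a weighted triangle with k=2, the heaviest edge is dropped and flooding still reaches every node, illustrating the connectivity lemma below.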
Analysis
Lemma: G connected ⇒ G* connected too.
Consequence of marking process defining G*: All short cycles are disconnected
⇒ Lemma: Girth(G*) ≥ 2k+1
Analysis
Recall: For every r ≥ 1, a graph G(V,E) with girth Girth(G) ≥ r has
|E| ≤ n^(1+2/(r-2)) + n
Corollary: |E*| = O(n^(1+c/k)) for some constant c>0
Thm: For every G(V,E) and k ≥ 1, Algorithm k-Flood performs broadcast in the KTk model, with Message(k-Flood) = O(min{|E|, n^(1+c/k)}) for fixed c>0
Synchronizers revisited
Recall:
• Synchronizers enable transforming an algorithm for synchronous networks into an algorithm for asynchronous networks.
• Operate in 2 phases per pulse
Phase A (of pulse p): Each processor learns (in finite time) that all messages it sent during pulse p have arrived (it is safe)
Phase B (of pulse p): Each processor learns that all its neighbors are safe w.r.t. pulse p
Learning neighbor safety
(Figure: “safe” and “ready” node states)
Synchronizer costs
Goal: Synchronizer capturing reasonable middle points on time-communication tradeoff scale
          Synchronizer α   Synchronizer β
Cpulse    O(|E|)           O(n)
Tpulse    O(1)             O(Diam)
Synchronizer γ
Assumption: Given a low-degree partition P:
• Rad(P) ≤ k-1,
• # inter-cluster edges ≤ n^(1+1/k)
Synchronizer γ (cont)
For each cluster in the partition, build a rooted spanning tree.
(Figure: spanning trees inside clusters; synchronization links between clusters)
In addition, between any two neighboring clusters designate a synchronization link.
Handling safety information (in Phase B)
Step 1: For every cluster separately, apply synchronizer β
(By end of step, every node knows that every node in its cluster is safe)
my_subtree_safe
cluster_safe
Handling safety information (in Phase B)
Step 2: Every node incident to a synchronization link sends a message to the other cluster, saying “my cluster is safe”
my_cluster_safe
Handling safety information (in Phase B)
Step 3: Repeat step 1, but the convergecast performed in each cluster carries different information:
• Whenever v learns all clusters neighboring its subtree are safe, it reports this to parent.
all_clusters_adjacent_to_my_subtree_are_safe
Handling safety information (in Phase B)
Step 4: When root learns all neighboring clusters are safe, it broadcasts “start new pulse” on tree
all_neighboring_clusters_are_safe
(By end of step, every node knows that all its neighbors are safe)
Analysis
Claim: Synchronizer γ is correct.
Claim:
1. Cpulse(γ) = O(n^(1+1/k))
2. Tpulse(γ) = O(k)
Analysis (cont)
Proof:
Time to implement one pulse: ≤ 2 broadcast/convergecast rounds in clusters (+ 1 message-exchange step among border vertices in neighboring clusters)
⇒ Tpulse(γ) ≤ 4·Rad(P) + 1 = O(k)
Complexity
Messages: Broadcast/convergecast rounds, separately in each cluster, cost O(n) messages in total (clusters are disjoint)
Communication step among neighboring clusters costs O(1) messages per inter-cluster synchronization link, i.e., O(n^(1+1/k)) messages
⇒ Cpulse(γ) = O(n^(1+1/k))
Synchronizer δ
Assumption: Given a sparse k-spanner G'(V,E')
G'(V,E')
Synchronizer δ (cont)
Handling safety information (in Phase B):
When v learns it is safe for pulse p:
For k rounds do:
1. Send “safe” to all spanner neighbors
2. Wait to hear same from these neighbors
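The lemma below can be sanity-checked with a small synchronous simulation. This is my own illustrative model (not the lecture's protocol code): each node's set of nodes it has transitively heard “safe” about grows by one spanner hop per round.

```python
def delta_rounds(spanner_adj, k):
    """Model of k rounds of synchronizer delta's Phase B: after round i,
    node v has (transitively) heard 'safe' from every node within
    spanner distance i of v."""
    known = {v: {v} for v in spanner_adj}   # nodes v knows to be safe
    for _ in range(k):
        nxt = {v: set(s) for v, s in known.items()}
        for v in spanner_adj:
            for u in spanner_adj[v]:
                nxt[v] |= known[u]          # v hears one more 'safe' round from u
        known = nxt
    return known
```

On a path spanner 1-2-3-4 with k=2, node 1 ends up knowing exactly its distance-2 spanner ball {1,2,3}, matching the lemma.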
Synchronizer δ
Lemma: For every 1≤i≤k, once v completes i rounds, every node u at distance dist(u,v,G’) ≤ i from v in the spanner G' is safe
Analysis
Proof: By induction on i.
For i=0: Immediate.
For i+1: Consider the time v finishes the (i+1)st round.
Analysis
⇒ v has received i+1 “safe” messages from each of its neighbors in G'
These neighbors each sent their (i+1)st messageonly after finishing their i’th round
Analysis
By the inductive hypothesis, for every such neighbor u, every w at distance ≤ i from u in G' is safe.
⇒ Every w at distance ≤ i+1 from v in G' is safe too
Analysis (cont)
Corollary: When v finishes k rounds, each neighbor of v in G is safe (v is ready for pulse p+1)
Proof:By lemma, at that time, every processor u at
distance ≤ k from v in G' is safe.
By def of k-spanners, every neighbor of v in G is at distance ≤ k from v in G'.
⇒ every neighbor is safe.
Analysis (cont)
Lemma: If G has a k-spanner with m edges, then it has a synchronizer δ with
• Tpulse(δ) = O(k)
• Cpulse(δ) = O(k·m)
Summary
On a general n-vertex graph, for parameter k ≥ 1:

          α         β          γ               δ
Cpulse    O(|E|)    O(n)       O(n^(1+1/k))    O(k·n^(1+1/k))
Tpulse    O(1)      O(Diam)    O(k)            O(k)
Compact routing revisited
Tradeoff between stretch and space:
Any routing scheme for general n-vertex networks achieving stretch factor k ≥ 1 must use Ω(n^(1+1/(2k+4))) bits of routing information overall
(Lower bound holds for unweighted networks as well, and concerns total memory requirements)
Interval tree routing
Goal: Given tree T, design routing scheme based on interval labeling
Idea: Label each v by an integer interval Int(v) s.t. for every two vertices u,v,
Int(v) ⊆ Int(u) ⇔ v is a descendant of u in T
Interval labeling
Algorithm IntLab on tree T
1. Perform a depth-first (DFS) tour of T, starting at the root; assign each u∈T a depth-first number DFS(u)
Interval labeling (cont)
Algorithm IntLab on tree T
2. Label node u by the interval [DFS(u), DFS(w)], where w = last descendant of u visited by the DFS
(Labels contain ≤ ⌈2 log n⌉ bits)
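A minimal sketch of Algorithm IntLab, assuming the tree is given as an adjacency dict (names are mine):

```python
def interval_labels(tree, root):
    """Assign Int(u) = (DFS(u), DFS(w)), where w is the last descendant
    of u visited by the DFS tour of the tree."""
    labels = {}
    counter = 0
    def dfs(u, parent):
        nonlocal counter
        counter += 1
        start = counter                 # DFS number of u
        for c in tree[u]:
            if c != parent:
                dfs(c, u)
        labels[u] = (start, counter)    # counter = last descendant's DFS number
    dfs(root, None)
    return labels
```

With these labels, Int(v) ⊆ Int(u) holds exactly when v lies in u's subtree, which is the property the forwarding protocol relies on.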
Interval tree routing
Data structures:
Vertex u stores its own label Int(u) and the labels of its children in T
Forwarding protocol: Routes along unique path
Interval tree routing
Lemma: For every tree T=(V,E,ω), scheme ITR(T) has Dilation(ITR,T)=1 and uses O(Δ(T)·log n) bits per vertex, and O(n·log n) memory in total
Interval tree routing (cont)
Forwarding protocol: Routing M from u to v:
At intermediate w along route:
Compare Int(v) with Int(w)
Possibilities:
1. Int(w) = Int(v) (w = v): receive M.
Interval tree routing (cont)
2. Int(w) ⊂ Int(v) (w is a descendant of v): forward M upwards to parent
Interval tree routing (cont)
3. Disjoint intervals (v, w in different subtrees): Forward M upwards to parent
Interval tree routing (cont)
4. Int(v) ⊂ Int(w) (v is a descendant of w): examine the intervals of w's children, find the unique child w' s.t. Int(v) ⊆ Int(w'), forward M to w'
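The four cases fold into a single next-hop rule. A sketch (the containment test and all names are my own):

```python
def contains(a, b):
    """Is Int(b) contained in Int(a)?"""
    return a[0] <= b[0] and b[1] <= a[1]

def next_hop(w, target, Int, parent, children):
    """One forwarding step of ITR at node w toward `target`;
    returns None when M has arrived."""
    if Int[w] == Int[target]:
        return None                       # case 1: w = target, receive M
    if contains(Int[w], Int[target]):     # case 4: target is below w
        for c in children[w]:
            if contains(Int[c], Int[target]):
                return c                  # the unique child covering Int(target)
    return parent[w]                      # cases 2-3: forward upwards
```

Repeatedly applying `next_hop` walks up from the source to the lowest common ancestor and then down to the target, i.e., along the unique tree path.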
ITR for general networks
1. Construct a shortest paths tree T for G,
2. Apply ITR to T.
Total memory requirement = O(n·log n) bits
Problems:
- stretch may be as high as Rad(G),
- maximum memory per vertex depends on the maximum degree of T
Overcoming high max degree problem
Recall: For every rooted tree T and integer m ≥ 1, there is an embedded virtual tree S with the same node set and same root (but a different edge set), s.t.
1. Δ(S) ≤ 2m
2. Each edge of S corresponds to a path of length ≤ 2 in T
3. DepthS(v) ≤ (2·log_m Δ(T) − 1)·DepthT(v), for every v
Overcoming high max degree problem
Setting m = n^(1/k), embed in T a virtual tree T' with
• Δ(T') < 2n^(1/k)
• Depth(T') < (2k−1)·Rad(G)
Overcoming high max degree problem
Lemma: For every G(V,E,ω), the ITR(T') scheme guarantees message delivery in G with communication O(k·Rad(G)) and uses O(n·log n) memory
Problem: stretch may be as high as Rad(G)
A regional (C,ρ)-routing scheme
For every u,v:
• If dist(u,v) ≤ ρ: the scheme succeeds in delivering M from u to v.
• Else: routing fails, and M is returned to u
Communication cost ≤ C.
A regional (C,ρ)-routing scheme
Recall: For a graph G(V,E,ω) and integers k,ρ ≥ 1, there is a ρ-tree cover TC=TC(k,ρ) with
• Depth(TC) ≤ (2k−1)·ρ
• Overlap(TC) ≤ 2k·n^(1/k)
⇒ sum of tree sizes = O(k·n^(1+1/k))
Data structures
1. Construct tree cover TC(k,ρ)
2. Assign each tree T in TC(k,ρ) a distinct Id(T)
3. Set up an interval tree routing component ITR(T) on each tree T ∈ TC(k,ρ)
Data structures
Recall: Every v∈V has a home tree T=home(v) in TC(k,ρ), containing its entire ρ-neighborhood.
Scheme RSk,ρ:
Routing label for v: Pair (Id(T),IntT(v)) where• Id(T) = ID of v's home tree• IntT(v) = v's routing label in ITR(T)
Data structures
Forwarding protocol: Routing M from u to v with label (Id(T),IntT(v)):
Examine whether u belongs to T.
- u not in T: detect “unknown destination” failure and terminate the routing procedure.
- u in T: send M using the ITR(T) component
Analysis
Lemma: For every graph G and integers k,ρ ≥ 1, scheme RSk,ρ is a regional (O(kρ),ρ)-routing scheme and it uses O(k·n^(1+1/k)·log n) memory
Analysis (cont)
Proof:
Stretch: Suppose dist(u,v) ≤ ρ for some u,v.
By definition, v ∈ Γρ(u). Let T = home tree of u.
⇒ Γρ(u) ⊆ V(T)
⇒ v ∈ T
⇒ ITR(T) succeeds. Also, path length = O(Depth(T)) = O(kρ)
Analysis (cont)
Memory: Each v stores O(Δ(T(C))·log n) bits per each cluster C to which it belongs, where T(C) = spanning tree constructed for C
⇒ O(k·n^(1+1/k)·log n) memory in total
Hierarchical routing scheme RSk
Data structures:
For 1 ≤ i ≤ log D:
construct a regional (O(k·ρi), ρi)-routing scheme Ri = RSk,ρi for ρi = 2^i
Each v belongs to all regional schemes Ri
(has a home tree homei(v) in each Ri and a routing label at each level, and stores all info required for each scheme)
Hierarchical routing scheme RSk
Routing label = concatenation of regional labels
Forwarding protocol: Routing M from u to v:
1. Identify the lowest-level regional scheme Ri that is usable
(u first checks if it belongs to tree home1(v); if not, it checks the second level, etc.)
2. Forward M to v on ITR(homei(v)) component of regional scheme Ri
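Step 1's lowest-level search can be sketched as follows (the data layout and names are my own illustration):

```python
def lowest_usable_level(u, v_labels, member_of):
    """v_labels: v's hierarchical routing label, a list of
    (tree_id, interval) pairs, one per level i = 1..log D;
    member_of[u]: ids of the trees that contain u.
    Returns the lowest level whose home tree of v also contains u."""
    for level, (tree_id, _interval) in enumerate(v_labels, start=1):
        if tree_id in member_of[u]:
            return level, tree_id       # route M on ITR of this tree
    raise AssertionError("the top-level scheme always contains both endpoints")
```

The hierarchical label is just the concatenation of the regional labels, so u can run this scan locally with no communication.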
Analysis
Lemma: Dilation(RSk)=O(k).
Proof: Suppose u sends M to v.
Let d = dist(u,v) and j = ⌈log d⌉ (so 2^(j−1) < d ≤ 2^j)
i = lowest level s.t. u belongs to v's home tree
Analysis
u must belong to homej(v) (since d ≤ 2^j and homej(v) contains v's entire 2^j-neighborhood)
⇒ regional scheme Rj is usable (if no lower level was)
(Note: the highest-level scheme Rlog D always succeeds)
Comm(RSk,u,v) ≤ ∑i=1..j O(k·2^i) ≤ O(k·2^(j+1)) ≤ O(k)·dist(u,v)
Analysis (cont)
Thm: For every graph G and integer k ≥ 1, the hierarchical routing scheme RSk has
Dilation(RSk) = O(k)
Mem(RSk) = O(k·n^(1+1/k)·log n·log D)
Analysis (cont)
Proof: Memory required by the hierarchical scheme = log D terms, each bounded by O(k·n^(1+1/k)·log n)
⇒ total memory = O(k·n^(1+1/k)·log n·log D) bits
Deterministic decomposition-based MIS
Assumption: given a (d,c)-decomposition for G plus coloring of clusters in cluster graph
MIS computation: c phases. Phase i computes an MIS among vertices belonging to clusters colored i
(These clusters are non-adjacent, so we may compute an MIS for each independently, in parallel, using a PRAM-based distributed algorithm, in time O(d·log²n).)
Deterministic MIS (cont)
Note: A vertex joining the MIS must mark all its neighbors as excluded from the MIS, including those of other colors
⇒ not all occupants of clusters colored i participate in phase i, only those not excluded in earlier phases
Deterministic MIS (cont)
Procedure DecompToMIS(d,c) - code for v
For phase i=1 through c do: /* Each phase consists of O(d·log²n) rounds */
1. If v's cluster is colored i then do:
   a. If v has not decided yet, then compute MIS on the cluster using the PRAM-based algorithm
   b. If v joined the MIS, then inform all neighbors
2. Else, if a neighbor joined the MIS, then decide v ∉ MIS
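A sequential simulation of the phase structure, with the per-cluster PRAM-based MIS routine replaced by a greedy stand-in (all names here are mine):

```python
def decomp_to_mis(adj, cluster, color, num_colors):
    """Phase i decides the undecided vertices in clusters colored i;
    a vertex that joins the MIS excludes all of its neighbors,
    including those in clusters of other colors."""
    state = {v: None for v in adj}       # None = undecided
    for i in range(1, num_colors + 1):
        for v in sorted(adj):
            if color[cluster[v]] == i and state[v] is None:
                if not any(state[u] for u in adj[v]):   # greedy stand-in
                    state[v] = True                     # v joins the MIS
                    for u in adj[v]:
                        if state[u] is None:
                            state[u] = False            # u is excluded
    return {v for v, s in state.items() if s}
```

Because clusters of the same color are non-adjacent, the inner per-cluster computations of a phase never conflict, which is what makes the real algorithm parallel.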
Analysis
# phases = O(c)
⇒ Time = O(c·d·log²n)
Lemma: There is a deterministic distributed algorithm that, given a colored (d,c)-decomposition for G, computes an MIS for G in time O(d·c·log²n)
Recall: For every graph G and k ≥ 1, there is a (k, k·n^(1/k))-decomposition
Analysis (cont)
Taking k=log n, we get:
Corollary: Given a colored (log n, log n)-decomposition for G, there is a deterministic distributed MIS algorithm with time O(polylog n)
Recall: There is a deterministic algorithm for computing such a decomposition in time O(2^λ) for λ = c·√(log n), constant c>0
⇒ Corollary: There is a deterministic distributed MIS algorithm with time O(2^(c·√(log n)))