Distributed Computing Algorithms
David Peleg, Weizmann Institute

Structure of the mini-course: distributed model; broadcast; tree constructions; leader election; coloring, MIS; synchronizers.

What is Distributed Computing All About?

Characterizing distributed computing:
- Sequential-centralized computing: a single processor, a single control point.
- Parallel / distributed computing: multiple processors, multiple control points.
- Coupling level: sequential-centralized, parallel, and distributed systems differ in how much they share a common clock, synchronization, fast communication, and shared memory.
- Cooperation level: sequential-centralized: a common goal, a single process; parallel: positive cooperation, a shared purpose; distributed: negative cooperation, separate agendas.

Sub-models of distributed computing:
- Shared-memory models [multi-cores, parallel machines]: shared variables, atomic registers, concurrent reads/writes.
- Message-passing models [communication networks]: point-to-point, wireless, broadcast medium (Ethernet).

Model

The distributed network model: a point-to-point communication network, described by an undirected weighted graph G(V,E,ω):
- V = {v1,…,vn} - the processors (network sites)
- E - bidirectional communication links
- ω: E → R+ - an edge weight function representing transmission costs (usually satisfying the triangle inequality)
Unique processor IDs: ID: V → S, where S = {s1,s2,…} is an ordered set of integers.

Communication: processor v has deg(v,G) ports (external connection points); an edge e represents a pair ((u,i),(v,j)), the link connecting u's port i to v's port j. Message transmission from u to a neighbor v: u loads M onto port i, and v receives M in the input buffer of port j. Assumption: at most one message can occupy a communication link at any given time (a link is available for the next transmission only after the previous message has been removed from the input buffer by the receiving processor).

Allowable message size:
- CONGEST model: message size = O(log n) bits (a message carries a fixed number of vertex IDs, e.g., sender and destination). Messages longer than O(log n) bits must be broken into packets of O(log n) bits and sent separately.
- LOCAL model: message size unlimited.

Issues unique to distributed computing: there are several inherent differences between the distributed and the traditional centralized-sequential computational models.

Communication: in the centralized setting the issue is nonexistent. In the distributed setting, communication has its limits (in speed and capacity), does not come for free, and should be treated as a computational resource, like time or memory (often it is the dominating consideration).

Communication as a scarce resource: one common model, LOCAL, assumes that local processing comes for free (an algorithm pays only for communication).
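To make the CONGEST packet accounting concrete, here is a minimal Python sketch (the helper name congest_packets and the constant c are illustrative assumptions, not from the slides):

    import math

    def congest_packets(message_bits, n, c=1):
        """Illustrative: a B-bit message is sent as ceil(B / (c*log2 n))
        packets of at most c*log2(n) bits each, one per time unit per link,
        in the CONGEST model."""
        packet_size = max(1, int(c * math.log2(n)))
        return math.ceil(message_bits / packet_size)

    # Forwarding a message holding 10 vertex IDs over one link when n = 1024:
    print(congest_packets(10 * 10, 1024))   # log2(1024) = 10 -> 10 packets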
Incomplete knowledge: in the centralized-sequential setting, the processor knows everything (inputs, intermediate results, etc.). In the distributed setting, processors have a very partial picture.

Partial topological knowledge:
- Model of anonymous networks: identical nodes, no IDs, no topology knowledge.
- Intermediate models: estimates of the network diameter, # nodes, etc.; unique identifiers; neighbor knowledge.
- Permissive models: topological knowledge of large regions, or even the entire network.
- Structured models: a known sub-structure, e.g., a spanning tree / subgraph / hierarchical partition / routing service is available.

Other knowledge deficiencies: nodes know only the local portion of the input, do not know who else participates, and do not know the current stage of other participants.

Coping with failures: in the centralized setting this is straightforward - upon abnormal termination or a system crash, locate the source of the failure, fix it, and go on. In the distributed setting there is a complication: when one component fails, the others continue. Ambitious goal: ensure the protocol runs correctly despite occasional failures at some machines (including confusion-causing failures, e.g., failed processors sending corrupted messages).

Timing and synchrony: in a fully synchronous network, all link delays are bounded, each processor keeps a local clock, and the local pulses satisfy the following property: a message sent from v to a neighbor u at pulse p of v arrives at u before its pulse p+1. Think of the entire system as driven by a global clock. The machine cycle of a processor is composed of 3 steps:
1. Send messages to (some) neighbors.
2. Wait to receive messages from neighbors.
3. Perform some local computation.

Asynchronous model: algorithms are event-driven; there is no access to a global clock; the processing speeds of different processors may differ (and the processing speed of a single processor may vary with time); messages sent from a processor to a neighbor arrive within finite but unpredictable time. A processor cannot rely on its clock to tell whether a message is coming or not: perhaps the message is still on its way. It is impossible to rely on the ordering of events (it might reverse due to different message transmission speeds). Other timing-related issues: simultaneous wake-up (ensure a common start time) and termination detection (verify that the run has terminated).

Nondeterminism: asynchronous computations are inherently nondeterministic (even when the protocols do not use randomization). Reason: the message arrival order may differ from one execution to another (e.g., due to other events concurrently occurring in the system: queues, failures). If you run the same algorithm twice on the same inputs, you might get different outputs / scenarios.

Complexity measures: traditional (time, memory) and new (messages, communication).

Time: for a synchronous algorithm Π, Time(Π) = (worst-case) # pulses during the execution. For an asynchronous algorithm? (Even a single message can incur arbitrary delay!) Definition: Time(Π) = (worst-case) # time units from the start to the end of the execution, assuming each message incurs a delay of at most 1 time unit (*).
Note: 1. Assumption (*) is used only for performance evaluation, not for correctness.
2. (*) does not restrict the set of possible scenarios: any execution can be normalized to fit this constraint.
3. "Worst-case" means over all possible inputs and all possible scenarios on each input.

Memory: Mem(Π) = (worst-case) # memory bits used throughout the network; MaxMem(Π) = the maximum local memory.

Message complexity: a basic message = O(log n) bits; longer messages cost proportionally to their length. Sending a basic message over an edge costs 1. Message(Π) = (worst-case) # basic messages sent during the execution; Comm(Π) = Message(Π).

Examples. Network: the complete graph on n vertices {0,…,n-1}. (Assume a packet must include the IDs of the sender and receiver, so it is at least Ω(log n) bits.)

Problem A1: each vertex i needs to send a message M_i containing 1 bit to vertex i+1. Time(A1) = 1, Comm(A1) = n-1.

Problem A2: a single message M containing 1 bit must start at vertex 0 and be forwarded successively from vertex i to vertex i+1, until reaching vertex n-1. Time(A2) = n-1, Comm(A2) = n-1.

Problem A3: vertex 0 must send a different message M_i containing log n bits to each vertex i. Time(A3) = 1, Comm(A3) = n-1.

Problem A4: a single message M containing 1 bit must start at vertex 0 and be forwarded successively from vertex i to vertex i+1, with each vertex appending one more bit to M, until reaching vertex n-1. Time(A4) = ? It depends on the model: in the LOCAL model, Time(A4) = n-1.
Ex. 1: What happens if we allow pipelining? (I.e., vertex i does not need to wait until it gets the entire message M from i-1; it can start forwarding packets to i+1 once it gets them.)

Distance definitions:
- Length of a path (e1,…,es) = s.
- dist(u,w,G) = the length of the shortest u-w path in G.
- Diameter: Diam(G) = max over u,v in V of dist(u,v,G).
- Radius of a vertex: Rad(v,G) = max over w in V of dist(v,w,G).
- Radius: Rad(G) = min over v in V of Rad(v,G).
- A center of G: a vertex v s.t. Rad(v,G) = Rad(G).
Observe: Rad(G) ≤ Diam(G) ≤ 2·Rad(G).

Basic complexity issues - exercises.
Ex. 2: Prove or disprove: in a graph G(V,E), if there are at least k edge-disjoint paths of length at most d between the nodes v and w, then it is possible to send m messages from v to w in time O(d + m/k).
Ex. 3: Prove or disprove: in a graph G(V,E), if dist(v,w) = k and there are k^2 edge-disjoint paths between the nodes v and w, then it is possible to send k^2 messages from v to w in time O(k).

Broadcast. Goal: disseminate a message M originated at a source r0 to all vertices in the network.

Basic lower bounds. Thm: for every broadcast algorithm B: Message(B) ≥ n-1 and Time(B) ≥ Rad(r0,G) = Ω(Diam(G)).

Tree broadcast. Algorithm Tcast(r0,T): use a spanning tree T of G rooted at r0; the root broadcasts M to all its children; each node v that gets M forwards it to its children. Assumption: the spanning tree is known to all nodes (Q: what does that mean in a distributed context?).
Claim: for a spanning tree T rooted at r0: Message(Tcast) = n-1 and Time(Tcast) = Depth(T).
Tcast on a BFS tree: a BFS (breadth-first search) tree = a shortest-paths tree: the level of each v in T is dist(r0,v,G).
Corollary: for a BFS tree T w.r.t. r0: Message(Tcast) = n-1 and Time(Tcast) ≤ Diam(G) (optimal in both).
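A minimal Python sketch of Tcast, simulated in synchronous rounds (the function name and dict-based tree encoding are mine, not from the slides):

    def tcast(children, root):
        """Synchronous-round sketch of Algorithm Tcast on a rooted spanning
        tree; children maps each node to the list of its children in T.
        Returns (time, messages) = (Depth(T), n-1)."""
        time, messages = 0, 0
        frontier = [root]                  # nodes that got M at the last pulse
        while True:
            nxt = [c for v in frontier for c in children.get(v, [])]
            messages += len(nxt)           # each holder forwards M to children
            if not nxt:
                return time, messages
            frontier = nxt
            time += 1                      # one pulse per tree level

    # A depth-2 spanning tree on 5 nodes: Time = Depth(T) = 2, Message = 4.
    print(tcast({0: [1, 2], 1: [3, 4]}, 0))    # (2, 4)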
But what if no spanning tree is given?

Flooding. Algorithm Flood(r0,M):
1. The source sends M on each outgoing link.
2. Every other vertex v: on receiving M for the first time, over an edge e, store it in the buffer and forward it on every edge e' ≠ e; on receiving M again (over other edges), discard it and do nothing.

Flooding - correctness. Lemma:
1. Algorithm Flood yields a correct broadcast.
2. Time(Flood) = Θ(Rad(r0,G)) = Θ(Diam(G)).
3. Message(Flood) = Θ(|E|),
in both the synchronous and the asynchronous model.
Proof. Message complexity: each edge delivers M at most once in each direction.

Neighborhoods: Γ(v) = the neighborhood of v = the vertices adjacent to v; Γ_t(v) = the t-neighborhood of v = the vertices at distance t or less from v.

Time complexity: verify (by induction on t) that after t time units, M has already reached every vertex at distance ≤ t from r0 (= every vertex in the t-neighborhood Γ_t(r0)). Note: in the asynchronous model, M may have reached additional vertices (messages may travel faster).

Note: Algorithm Flood implicitly constructs a directed spanning tree T rooted at r0, defined as follows: the parent of each v in T is the node from which v received M for the first time.
Lemma: in the synchronous model, T is a BFS tree w.r.t. r0, with depth ≤ Rad(r0,G).
Note: in the asynchronous model, T may be deeper (up to n-1), yet the time is still O(Diam(G)) even in this case! In such an example, every node may get M within 1 time unit while the resulting tree is a path of depth n-1.

Broadcast with echo. Goal: verify successful completion of the broadcast. Method: collect acknowledgements on a spanning tree T.
Converge(Ack) process - code for v. Upon getting M do:
- If v is a leaf in T: send an Ack message up to the parent.
- If v is a non-leaf: collect Ack messages from all children, then send an Ack message to the parent.
Semantics of an Ack from v: a joint ack for the entire subtree T_v rooted at v, signifying that each vertex in T_v received M. Consequently, r0 receives Acks from all its children only after all vertices have received M.
Claim: on a tree T, Message(Converge(Ack)) = O(n) and Time(Converge(Ack)) = O(Depth(T)).

Tree selection:
- Tree broadcast algorithm: take the same tree used for the broadcast; the time / message complexities grow by a constant factor.
- Flooding algorithm: use the tree T defined by the broadcast.
In the synchronous model this is a BFS tree, and the complexities double. In the asynchronous model there is no guarantee: in the deep-tree example above, the acks might take n-1 time units!

Tree selection - complexity. Lemma: in a network G(V,E) of diameter D, the complexities of broadcast with echo are Message(FloodEcho) = O(|E|), and Time(FloodEcho) = O(D) in the synchronous model and O(n) in the asynchronous model. In both models, M reaches all vertices by time D.
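A hedged Python sketch of Flood in the synchronous model (round-by-round simulation; the encoding and counters are mine). It also records the implicit parent tree:

    def flood(adj, r0):
        """Synchronous simulation of Algorithm Flood on a graph given as
        adj: node -> list of neighbors.  Returns (parent, time, messages);
        parent encodes the implicit BFS tree rooted at r0."""
        parent = {r0: None}
        frontier = [(r0, None)]            # (node, edge it received M on)
        time, messages = 0, 0
        while frontier:
            nxt = []
            for v, src in frontier:
                for u in adj[v]:
                    if u == src:
                        continue           # forward on every edge but the arrival edge
                    messages += 1
                    if u not in parent:    # first receipt: store M, adopt v as parent
                        parent[u] = v
                        nxt.append((u, v))
                    # repeated receipts are discarded
            frontier = nxt
            time += 1
        return parent, time - 1, messages

    adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
    print(flood(adj, 0))   # BFS tree, time = Rad(0,G) = 2, messages <= 2|E|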
Topology Knowledge & Broadcast

Assumption: no predefined structures exist in G (the broadcast is performed from scratch). Focus on message complexity. Extreme models of topological knowledge:

KT_inf model: full topological knowledge: vertices have full knowledge of the topology. Broadcast is possible with a minimal # of messages, Message = Θ(n):
1. Each v locally constructs the same tree T, sending no messages.
2. Use the tree broadcast algorithm Tcast(T).

KT_0 model: clean network: vertices know nothing about the topology.
KT_1 model: neighbor knowledge: vertices know their own and their neighbors' IDs, nothing else.

Topology knowledge & message complexity. Lemma: in the model KT_0, every broadcast algorithm must send at least 1 message over every edge of G.
Proof sketch: suppose there is an algorithm Π disobeying the claim: there is a graph G and an edge e=(u,w) s.t. Π broadcasts on G without sending any messages over e. Then G can be replaced by a graph G' in which e is deleted and two new vertices u' and w' are attached in its place via edges e1 and e2. In the clean network model, u and w cannot distinguish between the two topologies G' and G; since no messages were sent on e, no messages are sent on e1 and e2 either, so in the execution of the algorithm over G', u and w fail to forward the message to u' and w'. Hence u' and w' never get the message - contradiction.
Thm: in a clean network (the KT_0 model), every broadcast protocol Π has complexity Message(Π) = Ω(|E|).

Message complexity of broadcast in KT_1. Note: in KT_1 the previous intuition fails! Nodes know the IDs of their neighbors, so not all edges must be used.

Traveler algorithm: a traveler (token) performs a DFS traversal of G, carrying a list L of the vertices visited so far (e.g., L grows {0}, {0,1}, {0,1,3}, {0,1,3,4}, {0,1,3,4,5} along a traversal). To pick the next neighbor to visit after v: compare L with the list of v's neighbors, and make the next choice only from the neighbors not in L. (If all of v's neighbors were already visited, backtrack from v on the edge to its parent.)
Note: the traveler's forward steps are restricted to the edges of the DFS tree spanning G; non-tree edges are not traversed. No need to send messages on every edge!

Q: Does the traveler algorithm disprove the Ω(|E|) lower bound on messages?
Observe: the # of basic (O(log n)-bit) messages sent by the algorithm is Θ(n^2) >> 2n (the lists carried by the traveler contain up to O(n) vertex IDs), so traversing an edge requires O(n) basic messages on average.

Ω(|E|) lower bound for KT_1. Idea: to avoid traversing an edge e=(v,u), the traveler algorithm must inform, say, v, that u already got the message. This can only be done by sending some message mentioning u - as expensive as traversing e itself. Intuitively, edge e was utilized, just as if a message had actually crossed it.

Def: an edge e=(u,v) in E is utilized during a run of algorithm Π on G if one of the following events holds:
1. A message is sent on e.
2. u either sends or receives a message containing ID(v).
3. v either sends or receives a message containing ID(u).

Let m = the # of utilized edges in a run of protocol Π on G, and M = the # of (basic) messages sent during the run.
Lemma: M = Ω(m).
Proof: consider a message sent over e=(u,v). The message contains O(1) node IDs z1,…,zB. Each z_i utilizes at most 2 edges, (u,z_i) and (v,z_i) (if they exist); also, e itself becomes utilized.
To prove a lower bound on the number of messages, it therefore suffices to prove a lower bound on the number of edges utilized by the algorithm.
Lemma: every algorithm for broadcast under the KT_1 model must utilize every edge of G.
Thm: every broadcast protocol Π for the KT_1 model has complexity Message(Π) = Ω(|E|).

Hierarchy of partial topological knowledge. KT_k model: known topology to radius k: every vertex v knows the topology of the subgraph of radius k around it, G(Γ_k(v)). Example: in KT_2, v knows the topology of its 2-neighborhood.
Information-communication tradeoff: for every fixed k ≥ 1, the # of basic messages required for broadcast in the KT_k model is Θ(min{|E|, n^(1+Θ(1)/k)}).
Lower bound proof: a variant of the KT_1 case.

Upper bound idea: v knows all edges at distance ≤ k from it, so v can detect all short cycles (length ≤ 2k+1) going through it. A short cycle can be disconnected locally (without communication) by deleting an edge. Assumption: there is some (locally computable) assignment of distinct weights to the edges, e.g., lexicographic names such as (v4,v5); this is required so that all nodes on a cycle agree on the same deleted edge.

Algorithm k-Flood. Define a subgraph G*(V,E*) of G:
1. In each short cycle (for k=2: length ≤ 5), find the heaviest edge and mark it unusable.
2. Include precisely all unmarked edges in E*.
Only e's endpoints need to know whether e is usable; given the partial topological knowledge, the edge deletions are done locally, sending no messages. Then perform the broadcast by Algorithm Flood on G* (i.e., whenever v receives the message for the first time, it sends it over all incident usable edges e in E*).

Analysis. Lemma: G connected implies G* connected too. (Even erasing all cycles leaves G* connected.) A consequence of the marking process defining G* is that all short cycles are disconnected, hence: Lemma: Girth(G*) ≥ 2k+2.
Known: for every r ≥ 1, a graph G(V,E) with girth Girth(G) ≥ r has |E| ≤ n^(1+2/(r-2)) + n.
Corollary: |E*| = O(n^(1+c/k)) for a constant c > 0.
Thm: for every G(V,E) and k ≥ 1, Algorithm k-Flood performs broadcast in the KT_k model with Message(k-Flood) = O(min{|E|, n^(1+c/k)}) for a fixed c > 0.

Distributed BFS Tree Construction

BFS tree constructions. In the synchronous model, Algorithm Flood generates a BFS tree with optimal Message(Flood) = Θ(|E|) and Time(Flood) = Θ(Diam(G)). In the asynchronous model, the tree generated by Algorithm Flood is not a BFS tree.

Level-synchronized BFS construction (Dijkstra). Idea: develop the BFS tree from the root r0 in phases, level by level; build the next level by adding all vertices adjacent to the nodes in the lowest tree level. After p phases: a partial tree T_p has been constructed; T_p is a BFS tree for Γ_p(r0); each v in T_p knows its parent, children, and depth.

Phase p+1:
1. r0 broadcasts the message Pulse on the current T_p.
2. Each leaf of T_p sends an exploration message Layer to all neighbors except its parent.
3. A vertex w receiving a Layer message for the first time (possibly from many neighbors) picks one neighbor v, lists it as its parent, and sends back Ack messages to all Layer messages. A vertex w already in T_p receiving a Layer message sends back Ack messages to all Layer messages.
4. Each leaf v collects the acks on its exploration messages; if w chose v as its parent, v lists w as a child.
5. Once it has received Acks on all its Layer messages, leaf v sends an Ack to its parent.
6. The Acks are convergecast on T_p back to r0.
7. Once the convergecast terminates, r0 starts the next phase.
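Before the analysis, a hedged Python sketch of the phase structure (a centralized accounting of the layer-by-layer growth, not a message-level implementation; all names are mine). It charges time and messages per phase the way the analysis that follows does:

    def dijkstra_bfs(adj, r0):
        """Grow the BFS tree layer by layer; phase p costs ~2p+2 time
        (Pulse down + Ack convergecast) plus 2 messages per explored edge."""
        parent = {r0: None}
        layer = [r0]                      # the current lowest tree level
        depth, time, messages = 0, 0, 0
        while layer:
            time += 2 * depth + 2         # Pulse down T_p, convergecast back
            messages += 2 * (len(parent) - 1)
            nxt = []
            for v in layer:               # leaves probe all non-parent neighbors
                for u in adj[v]:
                    if u == parent[v]:
                        continue
                    messages += 2         # one Layer message + one Ack
                    if u not in parent:   # u picks its first prober as parent
                        parent[u] = v
                        nxt.append(u)
            layer = nxt
            depth += 1
        return parent, time, messages

    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
    print(dijkstra_bfs(adj, 0))           # tree of depth 2 on the 4-cycle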
Analysis. Correctness: by induction on the phase number p, show that after phase p the variables parent and child define a legal BFS tree spanning r0's p-neighborhood. Hence the algorithm constructs a BFS tree rooted at r0.

Time complexity: Time(Phase p) = 2p+2, so Time(Dijkstra) = sum over p of (2p+2) = O(Diam^2(G)).

Message complexity: for an integer p ≥ 0, let V_p = the vertices in layer p, E_p = the edges internal to V_p, and E_{p,p+1} = the edges between V_p and V_{p+1}. In phase p, Layer messages are sent only on E_p and E_{p,p+1}, and only O(1) messages are sent over each such edge; the T_p edges are traversed twice (≤ 2n messages). Hence Message(Phase p) = O(n) + O(|E_p| + |E_{p,p+1}|), and in total Message(Dijkstra) = sum over p of O(n + |E_p| + |E_{p,p+1}|) = O(n·Diam(G) + |E|).

Dijkstra BFS - exercises.
Ex. 4: Prove the tightness of the message complexity analysis of Dijkstra's algorithm by establishing the following lower bound: for integers n and 1 ≤ D ≤ n-1, there exists an n-node, D-diameter graph G=(V,E) on which the execution of Dijkstra's algorithm requires Ω(nD + |E|) messages.
Ex. 5: Termination detection: modify the distributed Dijkstra algorithm so that the root can tell when the process is completed (and the entire graph is spanned by the constructed BFS tree).

Bellman-Ford BFS construction. Idea: optimistic execution: run Algorithm Flood and build an initial (possibly bad) tree; as the flooding progresses, discover shorter paths to the root r0, fix the tree (adopt a closer parent), and propagate the corrections by informing neighbors.

Algorithm: each v stores a variable L = its layer number (distance from r0). r0 sets L = 0; the other vertices set L = infinity. r0 sends the message Layer(0) to all its neighbors. At a vertex v ≠ r0, upon receiving a message Layer(d) from a neighbor w: if d+1 < L, then do: parent = w; L = d+1; send Layer(d+1) to all neighbors but w.

Analysis. Synchronous model: the Bellman-Ford algorithm behaves like Algorithm Flood. Asynchronous model: for every d ≥ 1, within d time units from the start of the run, every v at distance d from r0 has received a Layer(d-1) message from some neighbor, set L = d, and chosen a parent with L = d-1. Hence the time complexity is O(Diam(G)).
Communication complexity: the first value assigned to L is at most n-1, so v changes its L variable at most n-2 times and sends at most (n-2)·deg(v) messages during the run; Comm ≤ sum over v of n·deg(v) = O(n|E|).

Complexities of BFS algorithms:

  Reference      Messages           Time
  Lower bound    Ω(E)               Ω(D)  (+ sync. model)
  Dijkstra       O(E + nD)          O(D^2)
  Bellman-Ford   O(nE)              O(D)
  Best known     O(E + n·log^3 n)   O(D·log^3 n)
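A sketch of the asynchronous Bellman-Ford process in Python (an event-driven simulation; the delay function, priority-queue scheduler, and names are my modeling assumptions). Despite arbitrary finite delays, L converges to the true distances:

    import heapq

    def bellman_ford_bfs(adj, r0, delay):
        """delay(v, u, t) gives the finite delivery delay of a message sent
        from v to u at time t.  Returns the final layer numbers, the parents,
        and the message count (O(n|E|) in the worst case)."""
        INF = float('inf')
        L = {v: INF for v in adj}
        L[r0], parent, messages = 0, {v: None for v in adj}, 0
        events = []                        # (arrival, sender, receiver, d)
        for u in adj[r0]:
            heapq.heappush(events, (delay(r0, u, 0), r0, u, 0))
            messages += 1
        while events:
            t, w, v, d = heapq.heappop(events)
            if d + 1 < L[v]:               # shorter path: fix tree, re-flood
                L[v], parent[v] = d + 1, w
                for u in adj[v]:
                    if u != w:
                        heapq.heappush(events, (t + delay(v, u, t), v, u, d + 1))
                        messages += 1
        return L, parent, messages

    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(bellman_ford_bfs(adj, 0, lambda v, u, t: 1 + (v + u) % 3))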
Distributed DFS

Distributed depth-first search. DFS: a search process on G, traversing all vertices, progressing over edges, with preference for visiting new vertices.

Sequential DFS algorithm: the search starts at the origin r0. Whenever the search reaches a vertex v:
- If v was visited before: backtrack.
- Else [first visit to v]: if v has unvisited outgoing edges, then visit one of them next; else return to the vertex from which v was first visited.
Whenever backtracking to a vertex v:
- If v has unvisited outgoing edges, then visit one of them next.
- Else return to the vertex from which v was first visited; if v = r0, then end.

Facts: the DFS process visits every vertex in G; the search defines a DFS tree, with r0 as root, where v's parent = the node from which v was visited first; the sequential time complexity is O(|E|).

Direct distributed implementation: completely sequential - there is one locus of activity at any time, and control is carried via a single message (token) traversing G in a depth-first fashion. Note: for v to know whether a neighbor w was visited or not, it must send a message over the edge (v,w). Hence every edge must be explored, and both the time and the message complexities are Θ(|E|).

Improving the time complexity. Idea: explore all non-tree edges for free. When v is visited for the first time:
- temporarily freeze the DFS process,
- inform all neighbors that v was visited ("v told me he was visited"),
- wait for acknowledgements from all neighbors,
- resume the DFS process.
On each visit at a node w, it then knows exactly which neighbors were already visited, so the token can always choose a new next vertex to visit. Only O(1) time is incurred on the first visit to each vertex, and only tree edges are traversed by the DFS process, so the time complexity becomes O(n).
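A minimal Python sketch of this improved token DFS (centralized bookkeeping of who has been notified; names are mine). The token moves only over tree edges, so its step count is O(n):

    def dfs_token(adj, r0):
        """Token DFS with neighbor notification: each first visit costs O(1)
        time (notify + acks in parallel); the token crosses only tree edges."""
        visited, parent = {r0}, {r0: None}
        time, messages = 0, 0
        messages += 2 * len(adj[r0])      # r0 notifies neighbors, collects acks
        time += 2
        v = r0
        while True:
            nxt = next((u for u in adj[v] if u not in visited), None)
            if nxt is not None:
                visited.add(nxt)
                parent[nxt] = v
                time, messages = time + 1, messages + 1   # token on tree edge
                messages += 2 * len(adj[nxt])             # notify + acks
                time += 2
                v = nxt
            elif parent[v] is not None:
                time, messages = time + 1, messages + 1   # backtrack
                v = parent[v]
            else:
                return parent, time, messages

    print(dfs_token({0: [1, 2], 1: [0, 2], 2: [0, 1]}, 0))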
Leader Election

Setting: an undirected network. Goal: agree on a single vertex as the leader.

Early motivation: IBM token rings (in the 60s). Problem: from time to time, token loss causes deadlock, and token duplication causes chaos. Solution: invoke a leader election procedure.

Possible assumptions: known / unknown network size; synchronous / asynchronous communication; uni- / bi-directional communication; no IDs / bounded IDs / unbounded IDs.

Leader election: formal statement. Input: a bit C_v at each vertex v, such that C_v = 1 if v initiates an election and C_v = 0 otherwise. (There might be more than one initiator!)
Output:
- Explicit version: a variable L_v at each vertex v, containing the ID of the chosen leader. Requirement: L_v contains the same ID at all v.
- Implicit version: a bit L_v at each vertex v, such that L_v = 1 if v was chosen to be the leader and L_v = 0 otherwise (e.g., for anonymous networks). Requirement: L_v = 1 at exactly one v, and L_w = 0 at every other w.

More versions:
- Candidate version: the chosen leader must be one of the initiators.
- Arbitrary version: any vertex can be chosen as leader.

Special difficulties: multiple initiators; termination detection is needed; non-simultaneous wakeup (a possible rule: a vertex can initiate an election, namely set its input to C_v = 1, only until it is woken up); anonymous networks (identical processors, identical programs, no IDs) - implicit version only.

Observe: the explicit and implicit versions are closely related. A solution to the explicit version immediately gives a solution to the implicit one. A solution to the implicit version can be transformed into a solution to the explicit one at the cost of a broadcast operation - O(D) time, O(E) messages. For thought: on rings and complete graphs, the asymptotic costs of both versions are the same. Is this always true, or are there network topologies on which the explicit version costs more?

Observe: the candidate and arbitrary versions are closely related. A solution to the candidate version immediately gives a solution to the arbitrary one. A solution to the arbitrary version can be transformed into a solution to the candidate one at the cost of a convergecast + broadcast operation - O(D) time, O(E) messages. (If the chosen leader is not one of the candidates, then it collects candidate IDs by convergecast and broadcasts one of them as the new leader.)

Leader election on general graphs. Difficulty: termination detection (a message from a distant candidate v with a small ID might still be on the way). Conceptual solution: reduction to broadcast: each candidate (C_v = 1) floods its ID, and the smallest-ID vertex is selected as leader. Solution: apply broadcast + convergecast of acks. If the convergecast of candidate v ends before v hears of another candidate v' with a smaller ID, then v knows it is the chosen leader. (Any higher-ID candidate v' will see the ID of v before its own convergecast ends.)
Q: How do non-candidate vertices detect termination? A: Once the leader v knows it was chosen, it broadcasts the message "I am the leader".

Complexity in the synchronous model: with k candidates - O(kE) messages and O(k+n) time (we may need to pipeline k broadcasts over the same edges, hence O(kE) messages).
Improvement: let low-ID broadcasts cancel high-ID broadcasts. Each v keeps X = the lowest ID it has seen; a new broadcast is forwarded only if it is lower than X, else it is blocked.
- O(D) time (the broadcast of the lowest-ID candidate waits for nobody)
- O(min{k,D}·E) messages (at most E messages per round, hence at most O(DE) messages)
For thought: give an example scenario requiring Ω(DE) messages.

Complexity in the asynchronous model. Problems: the convergecast may require Ω(n) time, and asynchrony may prevent broadcast blocking (e.g., if broadcasts by high-ID candidates are fast and broadcasts by low-ID candidates are slow). Hence O(n) time and O(nE) messages. For thought: give an example scenario requiring Ω(nE) messages.

MAX computation. Problem: each vertex v has a value X_v, and we need to find M = max over v of X_v.
Observe: the LE and MAX problems are closely related: Comm(LE) ≤ Comm(MAX) and Time(LE) ≤ Time(MAX). (To solve LE: set X_v = ID_v for each candidate v and X_v = 0 for each non-candidate v, and run the MAX procedure.)
Conversely: Comm(MAX) ≤ Comm(LE) + Comm(Convergecast) and Time(MAX) ≤ Time(LE) + Time(Convergecast). (To solve MAX: elect a leader v; the leader broadcasts a request to collect the inputs; M = the maximum X is convergecast towards v; v broadcasts the maximum.)
Results: in the synchronous model, Comm(MAX) ≤ Comm(LE) + O(E) and Time(MAX) ≤ Time(LE) + O(D); in the asynchronous model, Comm(MAX) ≤ Comm(LE) + O(E) and Time(MAX) ≤ Time(LE) + O(n). On a ring: Comm(MAX) = Θ(Comm(LE)) and Time(MAX) = Θ(Time(LE)). For thought: is this true on every network, or are there network topologies on which MAX costs more than LE (asymptotically)?
Note the key role of leader election: breaking symmetry (once a distinguished leader is chosen, finding MAX on the ring requires only O(n) time and messages).
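A hedged Python sketch of the flooding-with-cancellation election on a general graph (synchronous rounds; node names double as IDs; termination detection is omitted; all names are mine):

    def elect_by_flooding(adj, candidates):
        """Each node keeps X = the lowest candidate ID seen so far and
        forwards only improvements, so high-ID broadcasts are blocked."""
        X = {v: (v if v in candidates else None) for v in adj}
        frontier = {v: v for v in candidates}    # IDs to forward this round
        messages = 0
        while frontier:
            inbox = {}
            for v, x in frontier.items():
                for u in adj[v]:
                    messages += 1
                    if inbox.get(u) is None or x < inbox[u]:
                        inbox[u] = x
            frontier = {u: x for u, x in inbox.items()
                        if X[u] is None or x < X[u]}  # adopt + forward minima
            X.update(frontier)
        return min(candidates), messages

    print(elect_by_flooding({0: [1], 1: [0, 2], 2: [1]}, {0, 2}))   # (0, 5)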
LE Algorithms on the ring ("one ring to rule them all").

LE Algorithm A1 [LeLann, Chang+Roberts]. Assumptions: distinct vertex IDs; asynchronous model; unknown network size n; uni-directional communication. Elected leader: the highest-ID vertex.
Algorithm:
- v keeps Z_v = the largest ID seen so far.
- When v wakes up (spontaneously or by getting a message): set Z_v = ID_v.
- Whenever v gets a new message M: if M contains a higher ID x, then set Z_v = x and forward M to the next neighbor.
- If v gets back its own ID (ID_v completed a trip around the ring, so v has the largest ID and every vertex w has Z_w = ID_v): send the message "ID_v is the leader" around the ring (needed for termination detection).
Note: the elected leader is not necessarily one of the candidates.

Complexity:
- Comm(A1) = O(n^2) (each vertex ID circles the ring at most once). This bound can be met in a worst-case scenario: if the IDs are arranged so that the message carrying ID(i) is faster than the one carrying ID(i+1) for every i, then ID(i) travels to distance i, so Comm(A1) = 1+2+…+n = Θ(n^2).
- Time(A1) = O(n): the highest-ID vertex v wakes up at most n-1 rounds after the start of the execution; its message circles the ring once (n rounds), followed by the leader announcement message (another n rounds).
Question: suppose the IDs are {1,…,n} and they are randomly distributed on the ring. What is the expected message complexity of Algorithm A1? Ex. 6: prove formally.
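A lock-step Python sketch of Algorithm A1 (a simulation I wrote; the announcement round-trip is not counted in the message total):

    def chang_roberts(ids):
        """Unidirectional ring; the messages of ids[i] travel to position
        (i+1) % n.  Returns (leader, # message sends)."""
        n, leader = len(ids), None
        Z = list(ids)                            # Z_v = largest ID seen so far
        msgs = [(i, ids[i]) for i in range(n)]   # every node starts a message
        messages = 0
        while msgs:
            nxt = []
            for pos, x in msgs:
                v = (pos + 1) % n
                messages += 1
                if x == ids[v]:
                    leader = x                   # ID_v completed a full trip
                elif x > Z[v]:
                    Z[v] = x                     # record and forward
                    nxt.append((v, x))
                # smaller IDs are swallowed
            msgs = nxt
        return leader, messages

    print(chang_roberts([3, 1, 4, 2, 5]))          # (5, 11)
    # IDs decreasing along the travel direction give the Theta(n^2) worst case:
    print(chang_roberts(list(range(10, 0, -1))))   # (10, 55) = n(n+1)/2 sends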
LE Algorithm A2 [Hirschberg+Sinclair]. Assumptions: distinct vertex IDs; synchronous model; simultaneous wakeup; bi-directional communication; unknown network size n. Elected leader: the highest-ID vertex. The improvement: O(n log n) messages.
Idea: while there are many candidates, give each candidate a small budget for advertisement, and eliminate some of them.

Algorithm (for vertex v): keep Z_v = the largest ID seen so far (initialized to ID_v), and operate in phases. If you are not a candidate, act only as a relay (forwarding the messages of others).
- In phase 1: if you are a candidate, send your ID_v to distance 1 in both directions.
- In phase i: if you are a candidate, send your ID_v to distance 2^(i-1) in both directions. (Implementation: the message carries a counter that is increased at each step; it is terminated once the counter reaches 2^(i-1).)
- If you get an ID_w greater than Z_v: forward it, keep it in Z_v, and mark yourself non-candidate [from now on, act as a relay].
- If you get an ID_w smaller than Z_v: discard it (do not forward).
- If you get a message containing your own ID_v (in phase ceil(log n)+1): you are the leader; send an announcement around the ring.

Complexity: Time(A2) = O(n). Consider the messages of the winner; these are never discarded. Phase 1 takes 1 time unit and phase i takes 2^(i-1) time units, so the total is 1 + 2 + … + 2^ceil(log n) = O(n) time units; the announcement message takes n additional time units.

Comm(A2) = O(n log n).
Lemma: at the end of phase i, every two surviving candidates are at distance at least 2^(i-1) from each other. (Reason: otherwise they would hear of each other in phase i, and one of them would mark itself non-candidate.) Conclusion: at the end of phase i, # candidates ≤ n/2^(i-1).
Corollary: the # of messages sent in phase i is at most 4n. (Reason: in phase i, # candidates ≤ n/2^(i-2), and each sends at most 2^i messages, 2^(i-1) in each direction.)
Conclusion: over all ceil(log n)+2 phases, # messages ≤ 4n(ceil(log n)+1) + n = O(n log n).

Question: do we need the assumptions of a synchronous model and simultaneous wakeup? Ex. 7: analyze the correctness and the time / communication complexities of Algorithm A2 without these assumptions.

Riddle. Algorithm A1: uni-directional communication, O(n^2) messages. Algorithm A2: bi-directional communication, O(n log n) messages. Is this a coincidence? Or is there an O(n log n)-message algorithm on a uni-directional ring?
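A phase-level Python sketch of A2 (a simulation at the level of the invariant rather than of individual messages; the real algorithm realizes each phase's test with at most 4n messages):

    def hirschberg_sinclair(ids):
        """A candidate survives phase i iff its ID is the largest within
        distance 2^(i-1) on both sides; the last phase's window wraps the
        whole ring, so exactly the global maximum survives."""
        n = len(ids)
        cands, d = set(range(n)), 1
        while d < n:
            cands = {v for v in cands
                     if ids[v] == max(ids[(v + j) % n] for j in range(-d, d + 1))}
            d *= 2
        (winner,) = cands            # single survivor: the maximum ID's node
        return ids[winner]

    print(hirschberg_sinclair([3, 7, 2, 9, 4, 1]))   # 9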
LE Algorithm A3 [Peterson]. Assumptions: distinct vertex IDs; synchronous model (not essential); simultaneous wakeup (not essential); uni-directional communication; unknown network size n. O(n log n) messages.

Algorithm (for vertex v): at any time, v is either an active candidate or inactive (acting only as a relay forwarding the messages of others). If an active candidate: set CL_v = ID_v [the candidate leader ID] and operate in phases.
In each phase:
- If active: send a message containing CL_v two active steps down the ring. (Implementation: the message carries a counter C = the # of active candidates passed, initially C = 0; the message halts once it reaches the second active candidate.)
- If active: upon receiving the two messages containing CL_w and CL_u (where w is the nearest active candidate up the ring and u is the one beyond it): decide to remain active iff CL_u, CL_v < CL_w.
- If you remain active, adopt w's candidate: set CL_v = CL_w.
Confusion - two messages with the same candidate ID? No! If v remains active, then w and z (its active neighbors) become passive, so (Cor 1) any ID will serve a single active candidate.
Remark: the algorithm would work correctly also without candidate switching, but then the elected leader might not be the one with the maximum ID.

Termination: if you receive your own message (a full cycle around the ring) and CL_v > CL_w (w = the other remaining active node): your candidate CL_v is the leader; send an announcement around the ring.

Cor 2: the number of active candidates at least halves in each phase. (Note: in each phase, at least one candidate remains - the one holding the highest ID.) Cor 3: # phases ≤ log n.
Complexity: Comm(A3) = O(n log n). Reason: in each phase, each node (active or inactive) forwards at most 2 messages, so at most 2n messages are sent per phase.
Time(A3) = O(n). Reason: backward tracing of the execution. The last action is v0, the last active node, announcing the leader. What previous action led to this? v0 must have received a message from itself; before that, it must have received a message from some node in the previous phase, which in turn was waiting for a message from an earlier phase, and so on. Tracing backwards, each wait is covered by a bounded stretch of the ring, so the earliest action that v0 transitively waits for happened at most 2n time units before v0 declares the leader.

Question: all the previous algorithms required at least O(n log n) messages in the worst case. Can leader election be achieved with fewer messages?

LE Algorithm A4 [Frederickson]. Assumptions: distinct integer vertex IDs; synchronous model; simultaneous wakeup; uni-directional communication; known network size n. O(n) messages. Idea: exploiting the sound of silence. Elected leader: the lowest-ID vertex.
Algorithm operation: each phase takes n steps. In phase i, only a candidate node with ID = i (if one exists) may initiate a message transmission; the message can be a single bit; the other nodes relay the message. Nodes deduce the leader's ID from the time.
Complexity (where m = the lowest ID in the ring): Comm(A4) = n and Time(A4) = mn. Observation: the algorithm terminates after mn steps; the first m-1 phases are silent, and in phase m the leader's ID surrounds the ring.
Ex. 8: Devise an algorithm based on phases of length n/k for an integer k, with complexity Comm = mn/k, Time = nk·log n.
Ex. 9: Devise a variant of the algorithm that works without the assumption of simultaneous wakeup and still achieves message complexity O(n).
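An accounting sketch of A4 in Python (it encodes the time/message analysis rather than simulating each of the mn silent steps; names are mine):

    def silence_election(ids):
        """Phase i occupies steps (i-1)*n .. i*n - 1, and only a candidate
        with ID = i transmits; the first non-silent phase carries a single
        one-bit token around the ring, electing the minimum ID."""
        n, m = len(ids), min(ids)
        time = m * n        # phases 1..m-1 are silent; phase m elects m
        messages = n        # one single-bit message per node, in phase m
        return m, time, messages

    print(silence_election([7, 3, 9, 12]))   # leader 3, time 12, 4 messages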
LE Algorithm A5. Relaxed assumptions: distinct integer vertex IDs; synchronous model; simultaneous wakeup; uni-directional communication; unknown network size n. O(n) messages. Idea: using slow-down.
To prevent message waste, slow down the message speeds: a message carrying ID i is forwarded only once every 2^i time cycles. Similarly to Algorithm A1: candidate IDs circle the ring, smaller IDs eliminate larger ones, and the smallest ID completes the cycle.
Bounding the # of messages: let m = the lowest candidate ID in the ring; the message carrying m moves once every 2^m time steps and is forwarded n times. Let m2 = the 2nd lowest ID in the ring; since m2 ≥ m+1, its message moves at most once every 2^(m+1) time steps, so it is forwarded at most n/2 times before m's message completes its cycle. In general, the message of m_i = the ith lowest ID is forwarded at most n/2^(i-1) times. Hence Comm(A5) ≤ sum over i of n/2^(i-1) ≤ 2n, and Time(A5) ≤ n·2^m (the cycle of m's message).
Capitalizing on low IDs of non-candidates: act as if all nodes are candidates. After determining the lowest ID v0, make another round-trip from v0 to find the lowest-ID candidate, and another to inform everyone; these two round-trips are done at normal speed. Time(A5) ≤ n·2^m + 2n, where m = the lowest ID in the ring.
Q: What might happen if wake-up is not simultaneous? A: Suppose V = {1,…,n}. With simultaneous wake-up, Comm(A5) ≤ 4n; but if node n wakes up first, Comm(A5) = O(n^2).
Fix: start with a wake-up phase in which each waking node sends a wake-up message around the ring at normal speed. This phase costs Comm ≤ n and Time ≤ n; when it ends, all nodes are awake, and the wake-up times are within n of each other. Then Comm(A5) ≤ 4n and Time(A5) ≤ 4n; the total additional message waste due to slow candidates working during the wake-up period is O(n). Ex. 10: Prove this.

LE Algorithm A6. Assumptions: no vertex IDs; synchronous model; simultaneous wakeup; uni-directional communication; known network size n. O(n) messages and time; randomized.
Idea: phases of n time units. In each phase:
- Set b = 1 with probability 1/n [becoming an active candidate].
- If b = 1, send a message around the ring. (If the nodes have IDs, include your ID; otherwise, include a counter, to be increased at each step.)
- Upon getting such messages: forward the first two, erase the rest.
By the end of a phase: zero messages - failure [no active candidates]; one message - success; two (or more) - failure [two or more candidates].
Complexity analysis: let P_i = P(# active candidates in the phase = i). In particular, P_1 = n·(1/n)·(1-1/n)^(n-1) ≥ 1/e, so in a given phase, with probability at least 1/e there is exactly one active candidate.
Time: E(# phases until A6 halts) = e = O(1); each phase takes n time units, so E(Time(A6)) = O(n). Messages: at most 2n messages per phase, so E(Comm(A6)) = O(n).

Message lower bound. Thm: in the asynchronous model, every deterministic algorithm for leader election on n-node rings requires Ω(n log n) messages. (The O(n)-message algorithms that we have seen were synchronous or randomized.)
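A short Python sketch of A6 (the phase outcome is read off directly here; the real protocol detects 0 / 1 / 2+ candidates by forwarding at most two messages per node per phase):

    import random

    def randomized_ring_election(n):
        """Anonymous unidirectional ring of known size n: in each phase
        (n time units), every node becomes an active candidate with
        probability 1/n; a phase with exactly one candidate succeeds."""
        phases = 0
        while True:
            phases += 1
            active = [v for v in range(n) if random.random() < 1.0 / n]
            if len(active) == 1:
                return active[0], phases    # E[phases] = e = O(1)

    random.seed(1)
    print(randomized_ring_election(100))    # (elected position, ~e phases)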
Ex. 16: Suppose it is known that the number of candidates does not exceed k. Modify each of the algorithms A1 to A6 to be as efficient as possible under this assumption; prove the correctness of the modified algorithms; analyze the time / communication complexities of the modified algorithms under the assumption.

Randomized LE in complete networks. Assumptions: synchronous model; simultaneous wakeup; known network size n; no vertex IDs (implicit version: the output is a bit). Randomized (Monte-Carlo; a Las-Vegas algorithm requires Ω(n) messages). Goals: fewer than n messages, and constant time. Idea: exploit the birthday paradox.

Algorithm (final steps; in the earlier steps, each vertex becomes an active candidate by setting b = 1, picks a random rank, and picks a random set of referees):
4. If picked as a referee by some vertices: find the candidate u with the highest rank r_u among those that picked you as referee, and send u an ack.
5. If b = 1 (active candidate) and you received an ack from all your referees, then decide L_v = 1 [leader]; else decide L_v = 0 [not leader].

Claim 1: the number of candidates C satisfies C ≤ 5·log n with probability at least 1 - 1/n^2.
Obs. 2: Comm(A) = O(C·s), where s is the number of referees per candidate (each of the C candidates talks only to its referees).
Claim 4: the algorithm fails with probability ≤ 1/n.
Expected picture: the sets of referees of every two candidates intersect, so only one candidate becomes the leader. The second possible failure (besides C exceeding 5·log n): two or more active candidates get all their acks (i.e., have disjoint sets of referees); this has probability < 1/n. [Ex. 18: Prove.]

Local Algorithms: Coloring, MIS

Understanding the effects of locality. LOCAL model: synchronous; simultaneous wakeup; large messages allowed; local computations for free. Goal: focus on the limitations stemming from locality of knowledge. Symmetry-breaking algorithms: coloring, maximal independent set (MIS).

Coloring. Vertex coloring problem: associate a color c_v with each v in V s.t. any two adjacent vertices have different colors. Naive solution: use the unique vertex IDs = a legal coloring by n colors. Goal: obtain a coloring with few colors.
Let Δ = Δ(G) = the maximum vertex degree in G. Coloring idea: v's neighbors occupy at most Δ distinct colors, so Δ+1 colors always suffice to find a free color that can be used for coloring v.

First-Free coloring procedure (for a palette of colors P and a node set W contained in V): FirstFree(W, P) = the minimum color in P that is currently not used by any vertex in W. Standard palette: P_m = {1,…,m} for an integer m ≥ 1.

Sequential coloring: for every node v do (sequentially): c_v = FirstFree(Γ(v), P_{Δ+1}) [pick a new color 1 ≤ j ≤ Δ+1, different from those used by the neighboring nodes]. Example palette: P_4 = {1,2,3,4}.

Color reduction. Basic palette reduction procedure: given a legal coloring by m colors, reduce the # of colors.
Procedure Reduce(m) - parallelization; code for v:
For round j = Δ+2 to m do [all nodes colored j recolor themselves simultaneously]:
  If v's original color is c_v = j, then do:
  1. Set c_v = FirstFree(Γ(v), P_{Δ+1}) [pick a new color 1 ≤ j' ≤ Δ+1, different from those used by the neighbors].
  2. Inform all neighbors.

Procedure Reduce(m) - analysis. Lemma: Procedure Reduce produces a legal coloring of G with Δ+1 colors, and Time(Reduce(m)) = m - Δ - 1.
Proof. Time bound: each iteration requires one time unit. Correctness: consider iteration j. When a node v recolors itself, it always finds a non-conflicting color (at most Δ neighbors, and a palette of Δ+1 colors). There is no conflict with the nodes recolored in earlier iterations (or originally colored 1, 2, …, Δ+1), and no conflict with the choices of the other nodes recoloring in iteration j (they are all mutually nonadjacent, by the legality of the original coloring). Hence the new coloring is legal. All nodes recolor simultaneously without coordination, yet this creates no collisions!
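A round-by-round Python sketch of FirstFree and Reduce (names mine; the parallel recoloring of a color class is modeled by computing all the new colors from the old ones):

    def first_free(neighbor_colors, palette_size):
        """FirstFree: the smallest color in {1..palette_size} unused by W."""
        used = set(neighbor_colors)
        return next(c for c in range(1, palette_size + 1) if c not in used)

    def reduce_colors(adj, color, delta, m):
        """Procedure Reduce(m): in round j, all nodes colored j recolor
        simultaneously over the palette {1..delta+1}; same-colored nodes are
        mutually non-adjacent, so no collisions arise."""
        for j in range(delta + 2, m + 1):
            movers = [v for v in adj if color[v] == j]
            new = {v: first_free((color[u] for u in adj[v]), delta + 1)
                   for v in movers}        # all read the pre-round colors
            color.update(new)
        return color

    # The path 0-1-2 (delta = 2), legally colored with m = 5 colors:
    adj = {0: [1], 1: [0, 2], 2: [1]}
    print(reduce_colors(adj, {0: 5, 1: 1, 2: 4}, 2, 5))   # colors now <= 3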
3-coloring trees. Goal: color a tree T with 3 colors in time O(log* n). (Note: 2-coloring a tree T requires time Ω(Depth(T)).)
Recall the iterated logarithm: log^(1) n = log n; log^(i+1) n = log(log^(i) n); log* n = min{ i | log^(i) n ≤ 2 }.
General idea: look at colors as bit strings, and repeatedly reduce the # of bits used for colors.
Notation: |c_v| = the # of bits in the color c_v; c_v[i] = the ith bit in the bit string representing c_v.
Recoloring operation: produce a new color from the old c_v as follows:
1. Find an index 0 ≤ i ≤ |c_v| in which v's color differs from its parent's color. (The root picks, say, index 0.)
2. Set the new color to <i; c_v[i]> [the index i concatenated with the bit c_v[i]].
We will show: (a) neighbors get different new colors; (b) the length of the new coloring is roughly logarithmic in that of the previous coloring.

Algorithm SixColor(T) - code for v:
Set c_v = ID(v) [the initial coloring]
Repeat:
  L = |c_v|
  If v is the root, then set I = 0; else set I = min{ i | c_v[i] ≠ c_parent(v)[i] }
  Set c_v = <I; c_v[I]>
  Inform all children of this choice
until |c_v| = L.

Lemma: in each iteration, Procedure SixColor produces a legal coloring.
Proof: consider an iteration and neighboring nodes v, w in T with v = parent(w); let I = the index picked by v and J = the index picked by w. If I ≠ J: the new colors of v and w differ in the 1st component. If I = J: the new colors differ in the 2nd component (w picked an index where its bit differs from v's).

Let K_i = the # of bits in the color representation after the ith iteration (K_0 = K = O(log n) = the # of bits in the original ID coloring). Note: K_{i+1} = ceil(log K_i) + 1, so the 2nd coloring uses about loglog n bits, the 3rd about logloglog n, etc.
Lemma: the final coloring uses six colors.
Proof: the final iteration i satisfies K_i = K_{i-1} = 3. In the final coloring there are 3 choices for the index into the (i-1)st coloring and two choices for the value of the bit, for a total of six possible colors.

Reducing from 6 to 3 colors. Shift-down operation: given a legal coloring of T:
1. Recolor each non-root vertex by the color of its parent.
2. Recolor the root by a new color (different from its current one).
Claim: (1) the shift-down step preserves coloring legality; (2) in the new coloring, siblings are monochromatic.
Cancelling a color x, for x in {4,5,6}:
1. Perform the shift-down operation on the current coloring.
2. All nodes colored x apply FirstFree(Γ(v), P_3) [choose a new color from {1,2,3} not used by any neighbor].
Claim: the rule for cancelling color x produces a legal coloring. (Example: cancelling color 4 via shift-down followed by FirstFree.)

Overall 3-coloring process. Thm: there is a deterministic distributed algorithm for 3-coloring trees in time O(log* n):
1. Invoke Algorithm SixColor(T) (O(log* n) time).
2. Use shift-down 3 times to cancel the colors 6, 5, 4 (O(1) time).
Remark: the same applies to coloring the ring.
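A compact Python sketch of the SixColor iteration (a centralized simulation, picking the lowest differing bit; the integer encoding 2*i + bit of the pair <i; bit> is my choice):

    def six_color(parent, ids):
        """Rooted tree: parent[v] is v's parent (None at the root); ids are
        distinct initial colors.  Each iteration recolors every node by
        <lowest differing bit index; bit>, shrinking O(log n)-bit colors to
        at most six values {0..5} in O(log* n) iterations."""
        color = dict(ids)
        while max(color.values()) > 5:
            new = {}
            for v, c in color.items():
                p = parent[v]
                pc = color[p] if p is not None else c ^ 1   # root: index 0
                x = c ^ pc                        # nonzero: colors differ
                i = (x & -x).bit_length() - 1     # lowest differing bit index
                new[v] = 2 * i + ((c >> i) & 1)   # new color = <i; bit>
            color = new                           # simultaneous recoloring
        return color

    # A 4-node path rooted at 0: one iteration already yields {0:0,1:1,2:2,3:0}.
    print(six_color({0: None, 1: 0, 2: 1, 3: 2}, {0: 12, 1: 7, 2: 9, 3: 2}))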
Δ+1-coloring bounded-degree graphs. Goal: color an arbitrary bounded-degree graph G (Δ(G) = O(1)) with Δ+1 colors in time O(log* n).
Variant of Algorithm SixColor. In each iteration:
1. v selects, for each of its d neighbors w, an index I_w s.t. c_v[I_w] ≠ c_w[I_w].
2. The new color is the concatenation <I_w1; c_v[I_w1]; … ; I_wd; c_v[I_wd]>.
Correctness: the reduction step is legality-preserving. Progress rate: a K-bit coloring becomes a Δ·(ceil(log K)+1)-bit coloring. For Δ = O(1), the process takes about log* n steps, ending with a 3Δ-bit coloring (at most 2^(3Δ) different colors). This can then be reduced to a Δ+1-coloring in 2^(3Δ) = O(1) time using Procedure Reduce.
Thm: there is a deterministic distributed algorithm for coloring bounded-degree graphs with Δ+1 colors in time O(log* n).
Lower bound (shown later): 3-coloring an n-vertex ring takes Ω(log* n) time.

Δ+1-coloring for arbitrary graphs. Goal: color a graph G of maximum degree Δ with Δ+1 colors in O(Δ·log n) time. The node IDs in G are K-bit strings. Idea: a recursive procedure ReColor(x), where x is a binary string of at most K bits. Let U_x = { v | ID(v) has suffix x } (so |U_x| ≤ 2^(K-|x|)). The procedure is applied to U_x and returns with a coloring of U_x's vertices by Δ+1 colors.

Procedure ReColor(x) - intuition. If |x| = K (U_x has one node), then return color 0. Otherwise:
1. Separate U_x into two sets, U_0x and U_1x.
2. Recursively compute a Δ+1-coloring for each, invoking ReColor(0x) and ReColor(1x).
3. Remove the conflicts between the two colorings by altering the colors of U_1x's vertices, color by color, as in Procedure Reduce.

ReColor - distributed implementation; code for v in U_x [ID(v) = a_1 a_2 … a_K, x = a_{K-|x|+1} … a_K]:
Set L = |x|. If L = K [singleton U_x = {v}], then set c_v = 0 and return. Set b = a_{K-L} [so v is in U_bx]; c_v = ReColor(bx).
[Reconciling the colorings on U_0x and U_1x:] If b = 1, then do: for round i = 1 through Δ+1 do: if c_v = i, then do: c_v = FirstFree(Γ(v), P_{Δ+1}) [pick a new color 1 ≤ j ≤ Δ+1, different from those used by any neighbor]; inform all neighbors of this choice.
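A centralized Python sketch of the recursion (names mine; the palette is {1..delta+1} instead of starting from 0; distinct IDs are assumed):

    def recolor(adj, ids, delta):
        """Split the vertices recursively by successive low-order ID bits,
        (delta+1)-color each half, then merge by recoloring the '1' half
        class by class with FirstFree, as in Procedure Reduce."""
        def solve(group, bit):
            if len(group) <= 1:
                return {v: 1 for v in group}       # singleton: any color
            lo = [v for v in group if (ids[v] >> bit) & 1 == 0]
            hi = [v for v in group if (ids[v] >> bit) & 1 == 1]
            color = solve(lo, bit + 1)
            color.update(solve(hi, bit + 1))
            classes = {}                           # hi's color classes, frozen
            for v in hi:
                classes.setdefault(color[v], []).append(v)
            for j in range(1, delta + 2):          # one round per color class
                for v in classes.get(j, []):       # class members are mutually
                    used = {color[u] for u in adj[v] if u in color}  # non-adjacent
                    color[v] = next(c for c in range(1, delta + 2)
                                    if c not in used)
            return color
        return solve(list(adj), 0)

    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}        # a triangle, delta = 2
    print(recolor(adj, {0: 1, 1: 2, 2: 3}, 2))     # a legal 3-coloring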
In particular, given no inputs but vertex ID's: after t steps, node v learns the topology of its t-neighborhood t (v) (including ID's) Lower bound for 3-coloring the ring On a ring, v learned a (2t+1)-tuple (v) = (x 1,...,x 2t+1 ) from space W 2t+1,n, where W s,n = {(x 1,...,x s ) | 1 x i n, x i x j }, x t+1 = ID(v), x t and x t+2 = ID's of vs two neighbors, etc. Lower bound for 3-coloring the ring Example: The vector (v) held by v at time t is: t=0: (7) t=1: (1,7,2) t=2: (9,1,7,2,8) etc Coloring lower bound (cont) W.l.o.g., any deterministic t(n)-step algorithm A(t,) for coloring a ring in c colors follows a 2-phase policy: Phase 1: For t rounds, exchange topology info. At end, each v holds a tuple (v) W 2t+1,n Phase 2: Select v ( (v)) where : W 2t+1,n {1,...,c} is the coloring function of algorithm A Coloring lower bound (cont) is an s-ary c-coloring function if 1. (x 1,x 2,...,x s ) {1,,c} for every 1 x 1