Fundamentals
Transcript of Fundamentals
CS60002: Distributed Systems
Textbook etc.
• No one textbook
• Will follow for some time "Advanced Concepts in Operating Systems" by Mukesh Singhal and Niranjan G. Shivaratri
• Supplemented by copies of papers
• Will give materials from other books, papers etc. from time to time
Introduction
Distributed System
• A broad definition: a set of autonomous processes that communicate among themselves to perform some task
  • Modes of communication: message passing, shared memory
  • Includes a single machine with multiple communicating processes also
• A more common definition: a network of autonomous computers that communicate by message passing to perform some task
• A practical distributed system may have both
  • Computers that communicate by messages
  • Processes/threads on a computer that communicate by messages or shared memory
Advantages
• Resource sharing
• Higher throughput
• Handle inherent distribution in problem structure
• Fault tolerance
• Scalability
Representing Distributed Systems
Graph representation
• Nodes = processes
• Edges = communication links
• Links can be bidirectional (undirected graph) or unidirectional (directed graph)
• Links can have weights to represent different things (e.g. delay, length, bandwidth, …)
• Links in the graph may or may not correspond with physical links
Why are They Harder to Design?
• Lack of global shared memory
  • Hard to find the global system state at any point
• Lack of global clock
  • Events cannot be started at the same time
  • Events cannot be ordered in time easily
• Hard to verify and prove
  • Arbitrary interleaving of actions makes the system hard to verify
  • The same problem exists for multi-process programs on a single machine
  • Harder here due to communication delays
Example: Lack of Global Memory
Problem of Distributed Search
• A set of elements distributed across multiple machines
• A query comes at any one machine A for an element X
• Need to search for X in the whole system
The sequential algorithm is very simple
• Search and update done on a single array in a single machine
• No. of elements also known, in a single variable
A distributed algorithm has more hurdles to solve
• How to send the query to all other machines?
• Do all machines even know all other machines?
• How to get back the result of the search in each m/c?
• Handling updates (both add/delete of elements at a machine and add/remove of machines) adds more complexity
Main problem: there is no one place (global memory) that a machine can look up to see the current system state (what machines, what elements, how many elements)
Example: Lack of Global Clock
Problem of Distributed Replication
• 3 machines A, B, C have copies of a data item X, say initialized to 1
• Queries/updates can happen at any m/c
• Need to make the copies consistent within a short time in case of an update at any one machine
Naïve algorithm
• On an update, a machine sends the updated value to the other replicas
• A replica, on receiving an update, applies it
(Figure: two runs of the naïve algorithm among the three replicas, with concurrent updates X=2 and X=3 reaching the nodes in different orders)
• In one run, a node receiving X=2 should accept it
• In another run, what should that node do now? Reject X=2, right?
• But it has received exactly the same messages in the same order as in the run where it should accept
• Could be easily solved if all nodes had a synchronized global clock
Models for Distributed Algorithms
• Informally, the guarantees that one can assume the underlying system will (or will not!) give
• Topology: completely connected, ring, tree, arbitrary, …
• Communication: shared memory / message passing (Reliable? Delay? FIFO? Broadcast/multicast? …)
• Synchronous / asynchronous
• Failure possible or not
  • What all can fail?
  • Failure models (crash, omission, Byzantine, timing, …)
• Unique ids
• Other knowledge: no. of nodes, diameter
• Fewer assumptions => weaker model
• A distributed algorithm needs to specify the model on which it is supposed to work
• The model may not always match the underlying physical system
(Figure: the model assumed sits above the physical system; the gap between the assumptions and the system actually available needs to be implemented with h/w-s/w)
Complexity Measures
• Message complexity: total no. of messages sent
• Communication complexity / bit complexity: total no. of bits sent
• Time complexity: for synchronous systems, no. of rounds; for asynchronous systems, different definitions exist
• Space complexity: total no. of bits needed for storage at all the nodes
Example: Distributed Search Again
Assume that all elements are distinct, and the network is represented by a graph G with n nodes and m edges.
Model 1: Asynchronous, completely connected topology, reliable communication
Algorithm:
• Send query to all neighbors
• Wait for reply from all, or till one node says Found
• A node, on receiving a query for X, does a local search for X and replies Found/Not found
Worst case message complexity = 2(n – 1) per query
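The Model 1 algorithm above can be sketched as a small simulation. All names and data structures here are illustrative (not from the lecture); real machines would exchange actual messages.

```python
def distributed_search(x, local_elements, neighbor_elements):
    """Simulate node A searching for x in a completely connected network.

    local_elements: elements stored at the querying node A.
    neighbor_elements: dict mapping neighbor id -> its element set.
    Returns (found, messages_sent) so the message count is visible.
    """
    if x in local_elements:
        return True, 0                      # no messages needed
    messages = 0
    for nid, elems in neighbor_elements.items():
        messages += 1                       # query sent to neighbor
        messages += 1                       # neighbor's Found/Not-found reply
        if x in elems:
            return True, messages
    return False, messages

found, msgs = distributed_search(7, {1, 2}, {"B": {3, 4}, "C": {5, 6}})
```

When the element is absent, every one of the n – 1 neighbors is queried and replies, giving the 2(n – 1) worst case.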
Model 2: Asynchronous, completely connected topology, unreliable communication
Algorithm:
• Send query to all neighbors
• Wait for reply from all, or till one node says Found
• A node, on receiving a query for X, does a local search for X and replies Found/Not found
• If no reply within some time, send query again
Problems!
• How long to wait for? No bound on message delay!
• Messages can be lost again and again, so this still does not solve the problem
• In fact, impossible to solve (may not terminate)!!
Model 3: Synchronous, completely connected topology, reliable communication
• Maximum one-way message delay = α
• Maximum search time at each m/c = β
Algorithm:
• Send query to all neighbors
• Wait for reply from all for T = 2α + β, or till one node says Found
• A node, on receiving a query for X, does a local search for X and replies Found if found; does not reply if not found
• If no reply received within T, return "Not found"
Message complexity = n – 1 if not found, n if found
Message complexity reduced, possibly at the cost of more time
Model 4: Asynchronous, reliable communication, but not completely connected
How to send the query to all?
Algorithm (first attempt):
• Querying node A sends the query for X to all its neighbors
• Any other node, on receiving a query for X, first searches for X. If found, send back "Found" to A. If not, send back "Not found" to A, and also forward the query to all its neighbors other than the one it received from (flooding)
• Eventually all nodes get it and reply
• Message complexity – O(nm) (why?)
But are we done? Suppose X is not there. A gets many "Not found" messages. How does it know if all nodes have replied? (Termination Detection)
Let's change (strengthen) the model
• Suppose A knows n, the total number of nodes
  • A can now count the number of messages received. Terminate on at least one "Found" message, or n "Not found" messages
  • Message complexity – O(nm)
• Suppose A knows an upper bound on the network diameter, and the system is synchronous
  • Can be done with O(m) messages only
• Can you do it without changing the model?
So Which Model to Choose?
• Ideally, as close to the physical system available as possible
  • The algorithm can then directly run on the system
• Should be implementable on the physical system by additional h/w-s/w
  • Ex., reliable communication (say TCP) over an unreliable physical system
• Sometimes, start with a strong model, then weaken it
  • Easier to design algorithms on a stronger model (more guarantees from the system)
  • Helps in understanding the behavior of the system
  • Can use this knowledge to then design algorithms on the weaker model
Some Fundamental Problems
• Ordering events in the absence of a global clock
• Capturing the global state
• Mutual exclusion
• Leader election
• Clock synchronization
• Termination detection
• Building structures
  • Spanning tree
  • Shortest path tree
  • …
Ordering of Events and Logical Clocks
Ordering of Events
Lamport’s Happened Before relationship: for two events a and b, a → b (a happened before b) if
• a and b are events in the same process and a occurred before b, or
• a is the send event of a message m and b is the corresponding receive event at the destination process, or
• a → c and c → b for some event c
a → b implies a is a potential cause of b
Causal ordering: potential dependencies
The “Happened Before” relationship causally orders events
• If a → b, then a causally affects b
• If neither a → b nor b → a, then a and b are concurrent (a || b)
Logical Clock
• Each process i keeps a clock Ci
• Each event a in i is timestamped C(a), the value of Ci when a occurred
• Ci is incremented by 1 for each event in i
• In addition, if a is the send of message m from process i to j, then on receive of m, Cj = max(Cj, C(a) + 1)
Points to note:
• The increment amount can be any positive number, not necessarily 1
• If a → b, then C(a) < C(b)
• → is an irreflexive partial order
• Total ordering possible by arbitrarily ordering concurrent events by process numbers (assuming process numbers are unique)
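The update rules above can be sketched in a few lines. This is a minimal illustration (class and method names are my own); it follows the slide's rule literally, so the receive rule is exactly Cj = max(Cj, C(a) + 1).

```python
class LamportClock:
    """Minimal Lamport logical clock for one process."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1              # every event increments the clock
        return self.time

    def send(self):
        self.time += 1              # a send is itself an event
        return self.time            # timestamp C(a) carried on the message

    def receive(self, msg_ts):
        # Slide rule: Cj = max(Cj, C(a) + 1)
        self.time = max(self.time, msg_ts + 1)
        return self.time
```

A quick trace: if p sends at C(a) = 1 and q's clock is already at 2, q stays at 2; if q's clock is behind the message, it jumps past the sender's timestamp, preserving a → b implies C(a) < C(b).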
Limitation of Lamport’s Clock
a → b implies C(a) < C(b)
BUT
C(a) < C(b) doesn’t imply a → b !!
So not a true clock !!
Solution: Vector Clocks
• Ci is a vector of size n (no. of processes)
• C(a) is similarly a vector of size n
• Update rules:
  • Ci[i]++ for every event at process i
  • If a is a send of message m from i to j with vector timestamp tm, on receive of m: Cj[k] = max(Cj[k], tm[k]) for all k
For events a and b with vector timestamps ta and tb:
• ta = tb iff for all i, ta[i] = tb[i]
• ta ≠ tb iff for some i, ta[i] ≠ tb[i]
• ta ≤ tb iff for all i, ta[i] ≤ tb[i]
• ta < tb iff (ta ≤ tb and ta ≠ tb)
• ta || tb iff neither ta < tb nor tb < ta
a → b iff ta < tb
Events a and b are causally related iff ta < tb or tb < ta, else they are concurrent
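The vector clock rules and the comparison relations can be sketched as follows (names are illustrative; the tick-then-merge ordering on receive is one common convention):

```python
class VectorClock:
    def __init__(self, pid, n):
        self.pid = pid
        self.clock = [0] * n

    def tick(self):
        self.clock[self.pid] += 1          # Ci[i]++ for every local event

    def send(self):
        self.tick()
        return list(self.clock)            # timestamp attached to the message

    def receive(self, tm):
        self.tick()                        # the receive is itself an event
        self.clock = [max(c, t) for c, t in zip(self.clock, tm)]


def happened_before(ta, tb):
    """a -> b iff ta < tb: componentwise <=, and not equal."""
    return all(x <= y for x, y in zip(ta, tb)) and ta != tb


def concurrent(ta, tb):
    return not happened_before(ta, tb) and not happened_before(tb, ta)
```

Unlike Lamport clocks, comparing two vector timestamps fully decides causality: ta < tb iff a → b, and incomparable vectors mean concurrent events.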
Causal ordering of messages: an application of vector clocks
Delivery in causal order: if send(m1) → send(m2), then every recipient of both messages m1 and m2 must “deliver” m1 before m2
“Deliver” – when the message is actually given to the application for processing
Birman-Schiper-Stephenson Protocol
• To broadcast m from process i, increment Ci[i], and timestamp m with VTm = Ci
• When j ≠ i receives m, j delays delivery of m until
  • Cj[i] = VTm[i] – 1, and
  • Cj[k] ≥ VTm[k] for all k ≠ i
• Delayed messages are queued at j, sorted by vector time; concurrent messages are sorted by receive time
• When m is delivered at j, Cj is updated according to the vector clock rule
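The delay condition above is a single predicate; a sketch (function name is my own):

```python
def can_deliver(Cj, VTm, i):
    """Birman-Schiper-Stephenson delivery test at process j for a
    broadcast m from process i carrying vector timestamp VTm.

    Deliver only when m is the next broadcast expected from i AND j has
    already seen everything that causally precedes m."""
    if Cj[i] != VTm[i] - 1:
        return False                       # not the next message from i
    return all(Cj[k] >= VTm[k] for k in range(len(Cj)) if k != i)
```

For example, a process with Cj = [0, 0, 0] can deliver the first broadcast from process 0 (VTm = [1, 0, 0]) but must delay one that causally depends on an unseen message from process 1 (VTm = [1, 1, 0]).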
Problem of Vector Clocks
• Message size increases, since each message needs to be tagged with the vector
• Size can be reduced in some cases by only sending values that have changed
Capturing Global State
Global State Collection
Applications: checking “stable” properties, checkpointing & recovery, …
Issues:
• Need to collect both node and channel states
• The system cannot be stopped
• No global clock
But what is a global state??
Some Notations
• LSi : local state of process i
• send(mij) : send event of message mij from process i to process j
• rec(mij) : similar, receive instead of send
• time(x) : time at which state x was recorded
• time(send(m)) : time at which send(m) occurred
• send(mij) ∈ LSi iff time(send(mij)) < time(LSi)
• rec(mij) ∈ LSj iff time(rec(mij)) < time(LSj)
• transit(LSi, LSj) = { mij | send(mij) ∈ LSi and rec(mij) ∉ LSj }
• inconsistent(LSi, LSj) = { mij | send(mij) ∉ LSi and rec(mij) ∈ LSj }
Global state: a collection of local states, GS = {LS1, LS2, …, LSn}
• GS is consistent iff for all i, j, 1 ≤ i, j ≤ n, inconsistent(LSi, LSj) = Φ
• GS is transitless iff for all i, j, 1 ≤ i, j ≤ n, transit(LSi, LSj) = Φ
• GS is strongly consistent if it is consistent and transitless
Note that channel state may be specified explicitly in a global state, or implicitly in node states using transit()
Chandy-Lamport’s Algorithm
• Uses special marker messages
• One process acts as initiator, and starts the state collection by following the marker sending rule below
Marker sending rule for process P:
• P records its state; then, for each outgoing channel C from P on which a marker has not been sent already, P sends a marker along C before any further message is sent on C
When Q receives a marker along a channel C:
• If Q has not recorded its state, then Q records the state of C as empty; Q then follows the marker sending rule
• If Q has already recorded its state, it records the state of C as the sequence of messages received along C after Q’s state was recorded and before Q received the marker along C
Points to Note
• Markers sent on a channel separate the messages sent before the sender recorded its state from the messages sent after
• The state collected may not be any state that actually happened in reality, rather a state that “could have” happened
• Requires FIFO channels
• Network should be strongly connected (works obviously for connected, undirected also)
• Message complexity O(|E|), where |E| = no. of links
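The two rules can be sketched at the level of a single node; message transport is abstracted into a `send` callback, and all names are illustrative, not from the lecture:

```python
class SnapshotNode:
    """One node's view of the Chandy-Lamport rules (simplified sketch)."""

    def __init__(self, nid, in_channels, out_channels):
        self.nid = nid
        self.state = None                           # recorded local state
        self.recording = {c: False for c in in_channels}
        self.channel_state = {c: [] for c in in_channels}
        self.out_channels = out_channels

    def record_state(self, local_state, send):
        """Marker sending rule: record own state, then send a marker on
        every outgoing channel before any further application message."""
        self.state = local_state
        for c in self.out_channels:
            send(c, "MARKER")
        for c in self.recording:                    # start recording inputs
            self.recording[c] = True

    def on_marker(self, channel, local_state, send):
        if self.state is None:
            # First marker seen: this channel's state is empty,
            # then follow the marker sending rule.
            self.record_state(local_state, send)
        self.recording[channel] = False             # stop recording channel

    def on_message(self, channel, msg):
        # Messages arriving between own recording and the marker on this
        # channel form the channel's recorded state.
        if self.state is not None and self.recording[channel]:
            self.channel_state[channel].append(msg)
```

Because the marker trails every pre-snapshot message on its channel, the FIFO requirement is what makes the marker a clean boundary between pre- and post-snapshot traffic.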
Lai and Young’s Algorithm
• Similar to Chandy-Lamport’s, but does not require FIFO
• Boolean value X at each node: False indicates the state is not recorded yet, True indicates recorded
• Value of X piggybacked with every application message
• Value of X distinguishes pre-snapshot and post-snapshot messages, similar to the marker
• Requires a log of all messages sent before the state is recorded
Mutual Exclusion
Mutual Exclusion
Very well-understood in shared memory systems
Requirements:
• at most one process in the critical section (safety)
• if more than one requesting process, someone enters (liveness)
• a requesting process enters within a finite time (no starvation)
• requests are granted in order (fairness)
Classification of Distributed Mutual Exclusion Algorithms
• Non-token based / permission based
  • Node takes permission from all/a subset of other nodes before entering the critical section
  • Permission from all processes: e.g. Lamport, Ricart-Agrawala, Roucairol-Carvalho etc.
  • Permission from a subset: e.g. Maekawa
• Token based
  • Single token in the system
  • Node enters the critical section if it has the token
  • Algorithms differ in how the token is circulated, e.g. Suzuki-Kasami
Some Complexity Measures
• No. of messages per critical section entry
• Synchronization delay
• Response time
• Throughput
Lamport’s Algorithm
Every node i has a request queue qi, which keeps requests sorted by logical timestamps (total ordering enforced by including the process id in the timestamps)
To request the critical section:
• send timestamped REQUEST (tsi, i) to all other nodes
• put (tsi, i) in its own queue
On receiving a request (tsi, i):
• send a timestamped REPLY to the requesting node i
• put request (tsi, i) in the queue
To enter the critical section:
• i enters the critical section if (tsi, i) is at the top of its own queue, and i has received a message (any message) with timestamp larger than (tsi, i) from ALL other nodes
To release the critical section:
• i removes its request from its own queue and sends a timestamped RELEASE message to all other nodes
• On receiving a RELEASE message from i, i’s request is removed from the local request queue
Some points to note
• The purpose of the REPLY messages from node i to j is to ensure that j knows of all requests of i prior to the sending of the REPLY (and therefore, possibly, any request of i with timestamp lower than j’s request)
• Requires FIFO channels
• 3(n – 1) messages per critical section invocation
• Synchronization delay = max. message transmission time
• Requests are granted in order of increasing timestamps
Ricart-Agrawala Algorithm
• Improvement over Lamport’s
• Main idea: node j need not send a REPLY to node i if j has a request with timestamp lower than the request of i (since i cannot enter before j anyway in this case)
• Does not require FIFO
• 2(n – 1) messages per critical section invocation
• Synchronization delay = max. message transmission time
• Requests granted in order of increasing timestamps
To request the critical section:
• send timestamped REQUEST message (tsi, i)
On receiving request (tsi, i) at j:
• send REPLY to i if j is neither requesting nor executing the critical section, or if j is requesting and i’s request timestamp is smaller than j’s request timestamp; otherwise, defer the request
To enter the critical section:
• i enters the critical section on receiving a REPLY from all nodes
To release the critical section:
• send REPLY to all deferred requests
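The heart of the algorithm is j's reply-or-defer decision; a sketch (state names and the (time, pid) timestamp encoding are illustrative conventions, with the pid breaking ties to give a total order):

```python
def should_reply(j_state, j_request_ts, i_request_ts):
    """Ricart-Agrawala decision at node j for a REQUEST from node i.

    j_state: 'idle' (neither requesting nor in CS), 'requesting', 'in_cs'.
    Timestamps are (lamport_time, process_id) tuples, so the natural
    tuple comparison is a total order."""
    if j_state == "idle":
        return True
    if j_state == "requesting" and i_request_ts < j_request_ts:
        return True                     # i's request is older: let it go first
    return False                        # defer: j is in CS or j's request wins
```

The deferred requests are exactly the ones answered by the release step, so safety follows from the fact that of any two concurrent requesters, at most one ever collects all n – 1 replies.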
Roucairol-Carvalho Algorithm
• Improvement over Ricart-Agrawala
• Main idea: once i has received a REPLY from j, it does not need to send a REQUEST to j again unless it sends a REPLY to j (in response to a REQUEST from j)
• No. of messages required varies between 0 and 2(n – 1), depending on the request pattern
• Worst case message complexity still the same
Maekawa’s Algorithm
• Permission obtained from only a subset of other processes, called the Request Set (or quorum)
• Separate Request Set Ri for each process i
• Requirements:
  • for all i, j: Ri ∩ Rj ≠ Φ
  • for all i: i ∈ Ri
  • for all i: |Ri| = K, for some K
  • any node i is contained in exactly D Request Sets, for some D
• K = D = sqrt(N) for Maekawa’s algorithm
A simple version
To request the critical section:
• i sends a REQUEST message to all processes in Ri
On receiving a REQUEST message:
• send a REPLY message if no REPLY message has been sent since the last RELEASE message was received, and update status to indicate that a REPLY has been sent; otherwise, queue up the REQUEST
To enter the critical section:
• i enters the critical section after receiving a REPLY from all nodes in Ri
To release the critical section:
• send a RELEASE message to all nodes in Ri
• On receiving a RELEASE message, send a REPLY to the next node in the queue and delete that node from the queue; if the queue is empty, update status to indicate that no REPLY message has been sent
• Message complexity: 3*sqrt(N)
• Synchronization delay = 2 * (max. message transmission time)
Major problem: DEADLOCK possible
• Needs three more types of messages (FAILED, INQUIRE, YIELD) to handle deadlock; message complexity can be 5*sqrt(N)
Building the request sets?
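The slide leaves request-set construction as a question. One standard construction (my addition, not from the lecture) arranges the N nodes in a √N × √N grid and takes the row plus column of each node; any two such sets intersect, though here |Ri| = 2√N − 1 rather than exactly √N, still O(√N):

```python
import math

def grid_quorum(i, n):
    """Grid-based request set for node i (0-based ids), n a perfect square.

    Ri = row(i) union column(i). Any two quorums share at least one node,
    because row(a) always meets column(b)."""
    k = math.isqrt(n)
    assert k * k == n, "n must be a perfect square for this construction"
    r, c = divmod(i, k)
    row = {r * k + j for j in range(k)}        # all nodes in i's row
    col = {j * k + c for j in range(k)}        # all nodes in i's column
    return row | col
```

For N = 9, node 0's quorum is {0, 1, 2} ∪ {0, 3, 6}, five nodes; exhaustively checking all pairs confirms the pairwise-intersection requirement.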
Token based Algorithms
• A single token circulates; enter CS when the token is present
• No FIFO required
• Mutual exclusion is obvious
• Algorithms differ in how to find and get the token
• Use sequence numbers rather than timestamps to differentiate between old and current requests
Suzuki Kasami Algorithm
• Broadcast a request for the token
• The process with the token sends it to the requestor if it does not need it
Issues:
• current vs. outdated requests
• determining sites with pending requests
• deciding which site to give the token to
The token:
• Queue (FIFO) Q of requesting processes
• LN[1..n] : LN[j] is the sequence number of the request that j executed most recently
The request message:
• REQUEST(i, k): request message from node i for its kth critical section execution
Other data structures:
• RNi[1..n] for each node i, where RNi[j] is the largest sequence number received so far by i in a REQUEST message from j
To request the critical section:
• If i does not have the token, increment RNi[i] and send REQUEST(i, RNi[i]) to all nodes
• If i has the token already, enter the critical section if the token is idle (no pending requests), else follow the rule to release the critical section
On receiving REQUEST(i, sn) at j:
• set RNj[i] = max(RNj[i], sn)
• if j has the token and the token is idle, send it to i if RNj[i] = LN[i] + 1; if the token is not idle, follow the rule to release the critical section
To enter the critical section:
• enter CS if the token is present
To release the critical section:
• set LN[i] = RNi[i]
• for every node j which is not in Q (in the token), add j to Q if RNi[j] = LN[j] + 1
• if Q is non-empty after the above, delete the first node from Q and send the token to that node
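The release rule is where the token's LN array and queue get reconciled with the local RN array; a sketch of just that step (names are illustrative):

```python
from collections import deque

def release_cs(i, RN_i, token_LN, token_Q, n):
    """Suzuki-Kasami release at node i (which holds the token).

    RN_i: node i's array of highest request numbers seen per node.
    token_LN / token_Q: the LN array and FIFO queue carried in the token.
    Returns the node to send the token to, or None if it stays idle at i."""
    token_LN[i] = RN_i[i]                         # i's request is now satisfied
    for j in range(n):
        # j has an outstanding (next) request not yet queued in the token
        if j not in token_Q and RN_i[j] == token_LN[j] + 1:
            token_Q.append(j)
    return token_Q.popleft() if token_Q else None
```

The comparison RN_i[j] == LN[j] + 1 is what filters out outdated requests: a stale REQUEST rebroadcast by the network can never pass it twice.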
Points to note:
• No. of messages: 0 if the node holds the token already, n otherwise
• Synchronization delay: 0 (node has the token) or max. message delay (token is elsewhere)
• No starvation
Raymond’s Algorithm
• Forms a (logical) directed tree with the token-holder as root
• Each node has a variable “Holder” that points to its parent on the path to the root; the root’s Holder variable points to itself
• Each node i has a FIFO request queue Qi
To request the critical section:
• Send REQUEST to the parent on the tree, provided i does not hold the token currently and Qi is empty. Then place the request in Qi
When a non-root node j receives a REQUEST from i:
• place the request in Qj
• send REQUEST to parent if no previous REQUEST sent
When the root r receives a REQUEST:
• place the request in Qr
• if the token is idle, follow the rule for releasing the critical section (shown later)
When a node receives the token:
• delete the first entry from the queue
• send the token to that node (maybe itself)
• set the Holder variable to point to that node
• if the queue is non-empty, send a REQUEST message to the parent (node pointed at by the Holder variable)
To execute the critical section:
• enter if the token is received and own entry is at the top of the queue; delete the entry from the queue
To release the critical section:
• if the queue is non-empty, delete the first entry from the queue, send the token to that node, and make the Holder variable point to that node
• if the queue is still non-empty, send a REQUEST message to the parent (node pointed at by the Holder variable)
Points to note:
• Avg. message complexity O(log n)
• Sync. delay (T log n)/2, where T = max. message delay
Leader Election
Leader Election in Rings
Models
• Synchronous or asynchronous
• Anonymous (no unique ids) or non-anonymous (unique ids)
• Uniform (no knowledge of n, the number of processes) or non-uniform (knows n)
Known impossibility result: there is no deterministic, synchronous, non-uniform leader election protocol for anonymous rings
Election in Asynchronous Rings
LeLann-Chang-Roberts Algorithm
• send own id to the node on the left
• if an id is received from the right, forward it to the left node only if the received id is greater than own id, else ignore
• if own id is received, declare self “leader”
• Works on unidirectional rings
• Worst case message complexity = O(n^2)
• Average case message complexity = O(n log n)
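The three rules above can be exercised in a round-based simulation (the real algorithm is asynchronous; the round structure here is only a simulation device, and all names are illustrative):

```python
def lcr_election(ids):
    """Simulate LeLann-Chang-Roberts on a unidirectional ring.

    ids[i] forwards to ids[(i+1) % n]. Returns (leader_id, messages)."""
    n = len(ids)
    in_transit = {i: ids[i] for i in range(n)}    # each node sends own id
    messages = n
    while True:
        nxt = {}
        for i, uid in in_transit.items():
            dst = (i + 1) % n
            if uid == ids[dst]:
                return uid, messages              # own id came back: leader
            if uid > ids[dst]:                    # forward only larger ids
                nxt[dst] = uid
                messages += 1
        in_transit = nxt
```

On the ring [3, 1, 2], only the id 3 survives forwarding, and node 0 declares itself leader after 5 messages; on a sorted ring like [1, 2, ..., n] the maximum travels the whole ring, while the reverse ordering triggers the O(n^2) worst case.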
Hirschberg-Sinclair Algorithm
• Operates in phases; requires a bidirectional ring
• In the kth phase, send own id up to 2^k processes on both sides of yourself (the message travels hop by hop, carrying the id and k)
• If an id is received, forward it if the received id is greater than own id, else ignore
• The last process in the chain sends a reply to the originator if its own id is less than the received id
• Replies are always forwarded
• A process goes to the (k+1)th phase only if it receives a reply from both sides in the kth phase
• A process receiving its own id declares itself “leader”
Message complexity: O(n log n)
Lots of other algorithms exist for rings
Lower bound result:
• Any comparison-based leader election algorithm in a ring requires Ω(n log n) messages
• What if not comparison-based?
Leader Election in Arbitrary Networks
FloodMax
• Synchronous, round-based
• In each round, each process sends the max. id seen so far (not necessarily its own) to all its neighbors
• After diameter no. of rounds, if max. id seen = own id, declare self leader
• Complexity = O(d·m), where d = diameter of the network, m = no. of edges
• Does not extend to the asynchronous model trivially
Variations build different types of spanning trees with no pre-specified roots; the root chosen at the end is the leader
Clock Synchronization
Clock Synchronization
• Multiple machines with physical clocks: how can we keep them more or less synchronized?
• Internal vs. external synchronization
• Perfect synchronization is not possible because of communication delays
• Even synchronization within a bound cannot be guaranteed with certainty, because of the unpredictability of communication delays
• But still useful!! Ex. – Kerberos, GPS
How clocks work
• Computer clocks are crystals that oscillate at a certain frequency
• Every H oscillations, the timer chip interrupts once (a clock tick); the no. of interrupts per second is typically 18.2, 50, 60, or 100, can be higher, and is settable in some cases
• The interrupt handler increments a counter that keeps track of the no. of ticks from a reference in the past (the epoch)
• Knowing the no. of ticks per second, we can calculate the year, month, day, time of day etc.
Clock Drift
• Unfortunately, the period of crystal oscillation varies slightly
• If it oscillates faster, there are more ticks per real second, so the clock runs faster; similarly for slower clocks
• For machine p, when the correct reference time is t, let the machine clock show the time as C = Cp(t)
• Ideally, Cp(t) = t for all p, t
• In practice, 1 – ρ ≤ dC/dt ≤ 1 + ρ
• ρ = max. clock drift rate, usually around 10^-5 for cheap oscillators
• Drift => skew between clocks (difference in the clock values of two machines)
Resynchronization
• Periodic resynchronization is needed to offset skew
• If two clocks are drifting in opposite directions, the max. skew after time t is 2ρt
• If the application requires that clock skew < δ, then the resynchronization period r < δ/(2ρ)
• Usually ρ and δ are known
Cristian’s Algorithm
• One m/c acts as the time server
• Each m/c sends a message periodically (within the resync. period r) asking for the current time
• The time server replies with its time
• The sender sets its clock to the reply
Problems:
• message delay
• the time server's time may be less than the sender's current time
Handling message delay: try to estimate the time the message carrying the time server's time took to reach the sender
• Measure the round trip time and halve it
• Make multiple measurements of the round trip time, discard too-high values, take the average of the rest
• Make multiple measurements and take the minimum
• Use knowledge of the processing time at the server, if known
Handling fast clocks:
• Do not set the clock backwards; slow it down over a period of time to bring it in tune with the server's clock
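The round-trip estimate above amounts to one line of arithmetic; a sketch (function and parameter names are my own), assuming symmetric delays in the two directions:

```python
def cristian_adjust(t_request_sent, t_reply_received, server_time,
                    server_processing=0.0):
    """Estimate the value to set the local clock to after a Cristian-style
    exchange, using local timestamps for the request and the reply.

    One-way delay ~ (RTT - server processing) / 2, so the server's reported
    time is that much in the past by the moment the reply arrives."""
    rtt = t_reply_received - t_request_sent
    one_way = (rtt - server_processing) / 2
    return server_time + one_way
```

For instance, with a request sent at local time 10.0, the reply arriving at 10.4, and the server reporting 12.0, the estimated server time on arrival is 12.2. The error is bounded by half the asymmetry between the two one-way delays, which is why taking the minimum over several measurements helps.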
Berkeley Algorithm
• Centralized as in Cristian’s, but the time server is active
• The time server asks for the time of the other m/cs at periodic intervals
• The time server averages the times and sends the new time to the m/cs
• The m/cs set their time (advance immediately or slow down gradually) to the new time
• Estimation of transmission delay as before
External Synchronization
• Clocks must be synchronized with real time
• Cristian’s algorithm can be used if the time server is synchronized with real time somehow
• The Berkeley algorithm cannot be used
• But what is “real time” anyway?
Measurement of time
Astronomical
• traditionally used
• based on the earth’s rotation around its axis and around the sun
• solar day: interval between two consecutive transits of the sun
• solar second: 1/86,400 of a solar day
• the period of the earth’s rotation varies, so the solar second is not stable
• mean solar second: average the length of a large no. of solar days, then divide by 86,400
Atomic
• based on the transitions of the Cesium 133 atom
• 1 sec. = time for 9,192,631,770 transitions
• about 50+ labs maintain Cesium clocks
• International Atomic Time (TAI): mean no. of ticks of the clocks since Jan 1, 1958
• highly stable
• but slightly off-sync with the mean solar day (since the solar day is getting longer)
• a leap second is inserted occasionally to bring it in sync (so far 32, all positive)
• the resulting clock is called UTC – Universal Coordinated Time
UTC time is broadcast from different sources around the world, e.g.
• National Institute of Standards & Technology (NIST) – runs radio stations, the most famous being WWV; anyone with a proper receiver can tune in
• United States Naval Observatory (USNO) – supplies time to all defense sources, among others
• National Physical Laboratory in the UK
• GPS satellites
• Many others
NTP: Network Time Protocol
• Protocol for time sync. in the Internet
• Hierarchical architecture
  • Primary time servers (stratum 1) synchronize to national time standards via radio, satellite etc.
  • Secondary servers and clients (stratum 2, 3, …) synchronize to primary servers in a hierarchical manner (stratum 2 servers sync. with stratum 1, stratum 3 with stratum 2 etc.)
• Reliability ensured by redundant servers
• Communication by multicast (usually within LAN servers), symmetric mode (usually between multiple geographically close servers), or client-server (to higher stratum servers)
• Complex algorithms to combine and filter times
• Sync. possible to within tens of milliseconds for most machines
• But just a best-effort service, no guarantees
• See http://www.ntp.org for more details
Termination Detection
Termination Detection
Model
• processes can be active or idle
• only active processes send messages
• an idle process can become active on receiving a computation message
• an active process can become idle at any time
• termination: all processes are idle and no computation messages are in transit
• can also use a global snapshot to detect termination
Huang’s Algorithm
• One controlling agent, which has weight 1 initially
• All other processes are idle initially and have weight 0
• Computation starts when the controlling agent sends a computation message to a process
• An idle process becomes active on receiving a computation message
• B(DW) – computation message with weight DW; can be sent only by the controlling agent or an active process
• C(DW) – control message with weight DW, sent by active processes to the controlling agent when they are about to become idle
Let the current weight at a process = W
1. Send of B(DW):
• Find W1, W2 such that W1 > 0, W2 > 0, W1 + W2 = W
• Set W = W1 and send B(W2)
2. Receive of B(DW):
• W += DW
• if idle, become active
3. Send of C(DW):
• send C(W) to the controlling agent
• become idle
4. Receive of C(DW) (at the controlling agent):
• W += DW
• if W = 1, declare “termination”
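The four weight-throwing rules can be sketched as follows. Class names are illustrative; exact rational arithmetic (`Fraction`) is used so that repeated halving never loses weight to floating-point round-off, which matters since termination is detected by the weight summing back to exactly 1.

```python
from fractions import Fraction

class HuangProcess:
    """An ordinary process in Huang's termination detection sketch."""

    def __init__(self, weight=Fraction(0)):
        self.weight = weight
        self.active = False

    def send_computation(self):
        """Rule 1: split the weight, keep half, send half on B(DW)."""
        half = self.weight / 2
        self.weight -= half
        return half                                 # DW carried by B(DW)

    def receive_computation(self, dw):
        """Rule 2: absorb the weight and become active."""
        self.weight += dw
        self.active = True

    def become_idle(self):
        """Rule 3: return all weight to the controlling agent via C(DW)."""
        dw, self.weight, self.active = self.weight, Fraction(0), False
        return dw


class ControllingAgent(HuangProcess):
    def __init__(self):
        super().__init__(Fraction(1))               # starts with weight 1

    def receive_control(self, dw):
        """Rule 4: reclaim weight; termination when it is back to 1."""
        self.weight += dw
        return self.weight == 1                     # True => "termination"
```

The invariant is that the weights at all processes plus those in transit always sum to 1, so the agent seeing weight 1 again certifies that no process is active and no computation message is in flight.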
Building Spanning Trees
Building Spanning Trees
Applications:
• Broadcast
• Convergecast
• Leader election
Two variations:
• from a given root r
• root is not given a priori
Flooding Algorithm
• Starts from a given root r
• r initiates by sending message M to all neighbors, and sets its own parent to nil
• For all other nodes: on receiving M from i for the first time, set parent to i and send M to all neighbors except i; ignore any M received after that
• The tree built is an arbitrary spanning tree
• Message complexity = 2m – (n – 1), where m = no. of edges
• Time complexity??
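The flooding construction can be simulated with a frontier of "nodes that just received M". A caveat baked into the comment: processing the frontier in order gives a BFS tree, but with arbitrary asynchronous delays the first M to arrive can come along any edge, which is why the slide calls the result an arbitrary spanning tree. Names here are illustrative.

```python
from collections import deque

def flood_spanning_tree(adj, root):
    """Simulate flooding from root: parent[v] = the node from which v
    first received M (None for the root). adj maps node -> neighbor list.

    This synchronous simulation happens to produce a BFS tree; a real
    asynchronous run may produce any spanning tree."""
    parent = {root: None}
    frontier = deque([root])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v not in parent:          # first receipt of M at v
                parent[v] = v_parent = u
                frontier.append(v)
    return parent

tree = flood_spanning_tree({0: [1, 2], 1: [0, 2], 2: [0, 1]}, 0)
```

On the triangle above, nodes 1 and 2 both attach to the root; the edge between 1 and 2 carries M in both directions but creates no tree edge, illustrating the 2m – (n – 1) message count.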
Constructing a DFS tree with given root
• Plain parallelization of the sequential algorithm by introducing synchronization
• Each node i has a set unexplored, which initially contains all neighbors of i
• A node i (initiated by the root) considers the nodes in unexplored one by one, sending a neighbor j a message M and then waiting for a response (parent or reject) before considering the next node in unexplored
• If j has already received M from some other node, j sends a reject to i
• Else, j sets i as its parent, and considers the nodes in its own unexplored set one by one
• j sends a parent message to i only when it has considered all nodes in its unexplored set
• i then considers the next node in its unexplored set
• The algorithm terminates when the root has received a parent or reject message from all its neighbors
• Worst case no. of messages = 4m
• Time complexity O(m)