Ordering and Consistent Cuts

Post on 31-Dec-2015


Presented By

Biswanath Panda

Introduction

Ordering and global state detection in a “distributed system”

Fundamental questions:

What is a distributed system?

What is a distributed computation?

How can we represent a distributed system?

Why are today’s papers so important?

A distributed system is ….

A collection of sequential processes p1, p2, p3…..pn

Network capable of implementing communication channels between pairs of processes for message exchange

Channels are reliable but may deliver messages out of order

Every process can communicate with every other process (though not necessarily directly)

There is no reasoning based on global clocks

All kinds of synchronization must be done by message passing

Distributed Computation

A distributed computation is a single execution of a distributed program by a collection of processes. Each sequential process generates a sequence of events that are either internal events, or communication events

The local history of process $p_i$ during a computation is a (possibly infinite) sequence of events $h_i = e_i^1, e_i^2, \ldots$

A partial local history of a process is a prefix of the local history: $h_i^n = e_i^1, e_i^2, \ldots, e_i^n$

The global history of a computation is the set $H = \bigcup_{i=1}^{n} h_i$

So what does this global history as defined tell us?

It is just the collection of events that have occurred in the system

It does not give us any idea about the relative times between the events

As there is no notion of global time, events can only be ordered based on a notion of cause and effect

So let’s formalize this idea

Happened Before Relation (→)

If a and b are events in the same process and a occurs before b, then a → b

If a is the sending of a message m by a process and b is the corresponding receive event, then a → b

Finally, if a → b and b → c, then a → c

If neither a → b nor b → a, then a and b are concurrent

→ defines a partial order on the set H
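The three rules can be turned into a small program. The sketch below is only an illustration, not something from the talk: the function names, the event names and the dictionary layout for histories are assumptions. It builds the relation from local order and message pairs, then takes the transitive closure.

```python
from itertools import product

def happened_before(histories, messages):
    """Return the set of pairs (a, b) with a -> b.

    histories: dict process name -> list of event names in local order (assumed layout)
    messages:  list of (send_event, receive_event) pairs
    """
    events = [e for h in histories.values() for e in h]
    hb = set()
    # Rule 1: an earlier event of a process precedes every later event of it.
    for h in histories.values():
        for i in range(len(h)):
            for j in range(i + 1, len(h)):
                hb.add((h[i], h[j]))
    # Rule 2: a send precedes its corresponding receive.
    hb.update(messages)
    # Rule 3: transitivity (Warshall's closure; the intermediate event k is the outer loop).
    for k, a, b in product(events, repeat=3):
        if (a, k) in hb and (k, b) in hb:
            hb.add((a, b))
    return hb

def concurrent(a, b, hb):
    """Two distinct events are concurrent when neither happened before the other."""
    return a != b and (a, b) not in hb and (b, a) not in hb

# Hypothetical computation: p1 sends at a1 (received at b2), p2 sends at b3 (received at c1).
histories = {"p1": ["a1", "a2"], "p2": ["b1", "b2", "b3"], "p3": ["c1"]}
messages = [("a1", "b2"), ("b3", "c1")]
hb = happened_before(histories, messages)
print(("a1", "c1") in hb)          # related through the chain a1 -> b2 -> b3 -> c1
print(concurrent("a2", "c1", hb))  # no path in either direction
```

The cubic closure is fine for slide-sized examples; a real tool would more likely compare vector timestamps instead of materialising the whole relation.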

Space Time Diagram

Graphical representation of a distributed system

If there is a path between two events then they are related

Else they are concurrent

Is this notion of ordering really important?

Some idea of ordering of events is fundamental to reason about how a system works

Global State Detection is a fundamental problem in distributed computing

It enables detecting stable properties of a system

How do we get a snapshot of the system when there is no notion of global time or shared memory?

How do we ensure that the state collected is consistent?

Use this problem to illustrate the importance of ordering

This will also give us the notion of what is a consistent global state

Global States and Cuts

A global state is an n-tuple of local states, one for each process

A cut is a subset of the global history that contains an initial prefix of each local history

Therefore every cut corresponds to a natural global state

Intuitively a cut partitions the space time diagram along the time axis

A cut is identified by the last event of each process that is part of the cut

Example of a Cut

Introduction to consistency

Consider this solution for the common problem of deadlock detection

System has 3 processes p1, p2, p3

An external process p0 sends a message to each process (Active Monitoring)

Each process on getting this message reports its local state

Note that this global state thus collected at p0 is a cut

p0 uses this information to create a wait-for graph

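Consider the space time diagram below and the cut C2

[Figure: space-time diagram with cut C2, and the resulting wait-for graph on nodes 1, 2 and 3, in which a cycle is formed.]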

So what went wrong?

p0 detected a cycle when there was no deadlock

The recorded state contained a message received by p3 which, according to that same state, p1 had not yet sent

The system could never be in such a state and hence the state p0 saw was inconsistent

So we need to make sure that applications see only consistent states

So what is a consistent global state?

A cut C is consistent if for all events e and e’: $(e \in C) \wedge (e' \rightarrow e) \Rightarrow e' \in C$

Intuitively if an event is part of a cut then all events that happened before it must also be part of the cut

A consistent cut defines a consistent global state

Notion of ordering is needed after all !!
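A cut already contains a prefix of every local history, so it is closed under local order. To check consistency it is therefore enough to look at messages: no receive may be inside the cut while its send is outside. The sketch below illustrates that check; the representation of a cut as per-process event counts and the name `is_consistent` are assumptions made for the example.

```python
def is_consistent(cut, messages):
    """Check whether a cut is consistent.

    cut:      dict process -> number of events of that process inside the cut
    messages: list of ((sender, send_index), (receiver, recv_index)),
              where indices are 1-based positions in the local histories
    """
    for (p, s), (q, r) in messages:
        received_in_cut = r <= cut[q]
        sent_in_cut = s <= cut[p]
        # A receive without its send means some event that happened
        # before an event of the cut is missing from the cut.
        if received_in_cut and not sent_in_cut:
            return False
    return True

# Hypothetical run: the 2nd event of p1 sends a message received as the 1st event of p3.
messages = [(("p1", 2), ("p3", 1))]
print(is_consistent({"p1": 1, "p2": 0, "p3": 1}, messages))  # receive seen, send not seen
print(is_consistent({"p1": 2, "p2": 0, "p3": 0}, messages))  # message still in transit
```

The second cut is consistent even though the message has not arrived: a message in transit is allowed, a message from the future is not.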

Passive Deadlock Detection

Let’s change our approach to deadlock detection

p0 now monitors the system passively

Each process sends p0 a message when an event occurs

What global state does p0 now see?

Basically, all hell breaks loose

FIFO Channels

Communication channels need not preserve message order

Therefore p0 can construct any permutation of events as a global state

Some of these may not even be valid (events of the same process may not be in order)

Implement FIFO channels using sequence numbers: $send_i(m) \rightarrow send_i(m') \Rightarrow deliver_j(m) \rightarrow deliver_j(m')$

Now we know that p0 constructs valid runs

But the issue of consistency still remains
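One common way to get FIFO delivery over a reordering channel is for the sender to number its messages and for the receiver to hold back anything that arrives early. The sketch below shows the receiving side for a single sender; the class name `FifoReceiver` and the choice to start sequence numbers at 0 are assumptions for the example.

```python
class FifoReceiver:
    """Receiving end of one channel; delivers messages in sequence-number order."""

    def __init__(self):
        self.next_seq = 0   # sequence number of the next message to deliver
        self.pending = {}   # messages that arrived ahead of their turn

    def receive(self, seq, payload):
        """Accept a message from the network and return the payloads
        that can now be delivered, in send order."""
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

r = FifoReceiver()
print(r.receive(1, "second"))  # held back, message 0 has not arrived yet
print(r.receive(0, "first"))   # releases both messages in send order
```

Receiving and delivering are deliberately separate steps here: the network may hand messages over in any order, but the application only sees them through the return value of `receive`.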

Ok let’s now fix consistency

Assume a global real-time clock and a bound of δ on the message delay

Don’t panic, we shall get rid of this assumption soon

RC(e): time when event e occurs

Each process reports to p0 the global timestamp along with the event

Delivery rule at p0: at time t, deliver all received messages with timestamps up to t - δ in increasing timestamp order

So do we have a consistent state now?
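The delivery rule can be sketched as a buffer at p0 that only releases notifications old enough that nothing earlier can still be in flight. The class `Monitor` and its method names are made up for this illustration, and the global clock is simply passed in as the argument `now`.

```python
import heapq

class Monitor:
    """p0's side of the delivery rule, assuming message delay is at most delta."""

    def __init__(self, delta):
        self.delta = delta
        self.buffer = []    # min-heap of (timestamp, event)

    def receive(self, timestamp, event):
        """Store a notification; it is not yet visible to the application."""
        heapq.heappush(self.buffer, (timestamp, event))

    def deliver(self, now):
        """Deliver every buffered event with timestamp <= now - delta, oldest first."""
        out = []
        while self.buffer and self.buffer[0][0] <= now - self.delta:
            out.append(heapq.heappop(self.buffer))
        return out

m = Monitor(delta=5)
m.receive(12, "e2")       # arrives first although it happened later
m.receive(10, "e1")
print(m.deliver(now=14))  # nothing is old enough yet
print(m.deliver(now=17))  # both events, in timestamp order
```

Waiting δ is what makes the rule safe: any notification stamped at or before t - δ must have arrived by time t, so no earlier event can show up after a later one has been delivered.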

Clock Condition

Yes we do!!

e is observed before e’ iff RC(e) < RC(e’)

Recall our definition of consistency: $(e \in C) \wedge (e' \rightarrow e) \Rightarrow e' \in C$

Therefore state is consistent iff $e \rightarrow e' \Rightarrow RC(e) < RC(e')$

This is the clock condition

For timestamps from a global clock this is obviously true

Can we satisfy it for asynchronous systems?

Logical Clocks

Turns out that the clock condition can be satisfied in asynchronous systems as well

From the way → is defined, the Clock Condition holds if both of the following hold:

If a and b are events of the same process and a comes before b, then RC(a) < RC(b)

If a is the send of a message and b is the corresponding receive, then RC(a) < RC(b)

Lamport’s Clocks

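Local variable LC in every process

LC: a kind of logical clock, a simple counter that assigns timestamps to events

Every send event is timestamped, and the message m carries that timestamp as TS(m)

LC modification rules:

$LC(e_i) = \begin{cases} LC + 1 & \text{if } e_i \text{ is an internal event or a send} \\ \max\{LC, TS(m)\} + 1 & \text{if } e_i \text{ is } receive(m) \end{cases}$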
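The two modification rules map directly onto two methods. This is a minimal sketch; the class and method names are chosen for the example and the counter is assumed to start at 0.

```python
class LamportClock:
    """Logical clock of a single process."""

    def __init__(self):
        self.lc = 0

    def tick(self):
        """Internal or send event: LC := LC + 1. Returns the event's timestamp,
        which a send attaches to the outgoing message as TS(m)."""
        self.lc += 1
        return self.lc

    def receive(self, ts):
        """Receive event for a message stamped ts: LC := max(LC, ts) + 1."""
        self.lc = max(self.lc, ts) + 1
        return self.lc

p1, p2 = LamportClock(), LamportClock()
ts = p1.tick()         # p1 sends m with TS(m) = 1
p2.tick()              # p2 internal event, timestamp 1
p2.tick()              # p2 internal event, timestamp 2
print(p2.receive(ts))  # 3, because max(2, 1) + 1
```

Taking the maximum before incrementing is what guarantees that a receive is always stamped later than its send, whichever of the two clocks was ahead.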

Example of Logical Clocks

[Figure: space-time diagram of p1, p2 and p3 with events labelled by their logical clock values, ranging from 1 to 5.]

Observations on Lamport’s Clocks

Lamport says: if a → b then C(a) < C(b)

However, if C(a) < C(b), does it follow that a → b? Not necessarily: concurrent events can also have ordered timestamps

Solution: Vector Clocks

Clock (C) is a vector of length n

C[i]: own logical time

C[j]: best guess about j’s logical time

Vector Clocks Example

[Figure: space-time diagram with vector timestamps. p1: (1,0,0), (2,0,0), (3,4,1); p2: (0,1,0), (2,2,0), (2,3,1), (2,4,1); p3: (0,0,1).]

Let’s formalise the idea

C[i] is incremented between successive local events

On receiving a message m carrying the vector timestamp $t_m$: $\forall k:\ C[k] := \max(C[k], t_m[k])$

It can be shown that both directions of the relation hold: a → b iff C(a) < C(b)
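A small sketch of these rules follows. The class and function names are assumptions, and so is the message pattern in the usage part: it was picked so that the resulting timestamps match the ones listed in the Vector Clocks Example, but the slide's diagram itself is not available to confirm it.

```python
class VectorClock:
    """Vector clock of process i in a system of n processes."""

    def __init__(self, n, i):
        self.i = i
        self.c = [0] * n

    def tick(self):
        """Local or send event: increment own component, return the timestamp."""
        self.c[self.i] += 1
        return tuple(self.c)

    def receive(self, tm):
        """Receive event: componentwise max with the message timestamp,
        then count the receive itself as a local event."""
        self.c = [max(a, b) for a, b in zip(self.c, tm)]
        return self.tick()

def vc_less(u, v):
    """u < v iff u <= v in every component and u != v."""
    return all(a <= b for a, b in zip(u, v)) and u != v

p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
a = p1.tick()            # (1, 0, 0)
m1 = p1.tick()           # (2, 0, 0), sent to p2
p2.tick()                # (0, 1, 0)
p2.receive(m1)           # (2, 2, 0)
m3 = p3.tick()           # (0, 0, 1), sent to p2
p2.receive(m3)           # (2, 3, 1)
m2 = p2.tick()           # (2, 4, 1), sent to p1
print(p1.receive(m2))    # (3, 4, 1)
print(vc_less(a, m3), vc_less(m3, a))  # neither is smaller: a and m3 are concurrent
```

Unlike Lamport timestamps, two vector timestamps can be incomparable, and that is exactly how concurrency shows up.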

So are Lamport clocks useful only for finding global state? Definitely not!!!

Mutual Exclusion using Lamport clocks:

Only one process can use the resource at a time

Requests are granted in the order in which they are made

If every process releases the resource then every request is eventually granted

Assumptions:

FIFO reliable channels

Direct connection between processes

Algorithm

[Figure: space-time diagram of p1, p2 and p3 running the algorithm, showing each process's request queue holding the requests (1,1) and (1,2), and messages labelled r3 and r4.]

p1 has received messages with higher timestamps from p2 and p3, and its own request is at the head of its queue, so p1 enters

p1 sends release and now p2 enters

Algorithm Summary

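Requesting CS: send a timestamped REQUEST to every other process and place the request on the local request queue

On receiving REQUEST: put the request on the queue and send back a timestamped REPLY

Enter CS if: a REPLY with a larger timestamp has been received from every other process, and own request is at the head of the queue

Releasing CS: remove own request from the queue and send a RELEASE message; on receiving RELEASE, remove the corresponding request from the queue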
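The summary can be exercised with a toy simulation. This is only a sketch under strong simplifications: messages are delivered by direct method calls, which stands in for FIFO reliable channels but means requests are never truly in flight at the same time. The class `Process`, its methods and the shared `peers` dictionary are all invented for the illustration.

```python
import heapq

class Process:
    def __init__(self, pid, peers):
        self.pid = pid
        self.peers = peers      # shared dict pid -> Process, plays the role of the network
        self.clock = 0          # Lamport clock
        self.queue = []         # min-heap of (timestamp, pid) requests
        self.last_seen = {}     # pid -> timestamp of the latest message from that process
        self.my_request = None

    def _others(self):
        return [q for q in self.peers if q != self.pid]

    def _send(self, dest, kind, payload=None):
        self.clock += 1         # a send is an event
        self.peers[dest]._on_message(self.pid, kind, self.clock, payload)

    def _on_message(self, src, kind, ts, payload):
        self.clock = max(self.clock, ts) + 1
        self.last_seen[src] = ts
        if kind == "REQUEST":
            heapq.heappush(self.queue, payload)
            self._send(src, "REPLY")
        elif kind == "RELEASE":
            self.queue.remove(payload)
            heapq.heapify(self.queue)

    def request(self):
        self.clock += 1
        self.my_request = (self.clock, self.pid)
        heapq.heappush(self.queue, self.my_request)
        for q in self._others():
            self._send(q, "REQUEST", self.my_request)

    def can_enter(self):
        # Own request heads the queue and every other process has been
        # heard from with a timestamp larger than the request's.
        return (self.my_request is not None
                and self.queue[0] == self.my_request
                and all(self.last_seen.get(q, 0) > self.my_request[0]
                        for q in self._others()))

    def release(self):
        self.queue.remove(self.my_request)
        heapq.heapify(self.queue)
        released, self.my_request = self.my_request, None
        for q in self._others():
            self._send(q, "RELEASE", released)

peers = {}
for pid in (1, 2, 3):
    peers[pid] = Process(pid, peers)
peers[1].request()
peers[2].request()
print(peers[1].can_enter(), peers[2].can_enter())  # only p1 may enter
peers[1].release()
print(peers[2].can_enter())                        # now p2 may enter
```

Requests are ordered by the pair (timestamp, pid), so ties between equal timestamps are broken by process id, which is the usual way to turn Lamport timestamps into a total order.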

Global State Revisited

Earlier in the talk we had discussed the problem where a process actively tries to get the global state

We now look at a solution to the problem that calculates only consistent global states

Model: a process only knows about its internal events and the messages it sends and receives

Requirements:

Each process records its own local state

The state of the communication channels is recorded

All these small parts form a consistent whole

State detection must run along with the underlying computation

FIFO reliable channels

Global States

What exactly is channel state?

Let c be a channel from p to q

p records its local state (Lp) and so does q (Lq)

p has some sends in Lp whose receives may not be in Lq

It is these sent messages that are the state of c

Intuitively, the messages in transit when the local states were collected

Basic Algorithm Description

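[Figure: space-time diagram of the snapshot algorithm on p0, p1 and p2, with application messages A, B, C and marker M. Event labels: Send A; Send B; Record State; Send M; Recv B; Recv A; Recv C; Send C; Recv M, Record State, Channel (2,1) empty; Recv M, Record State, Channel (0,1) empty, Send M; Recv M, Record State, Channel (0,1) A.]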

Algorithm Summary

Marker sending rule: p sends a marker on every outgoing channel after it records its state and before it sends further messages

Marker receiving rule, when q receives a marker along channel c:

If q has not recorded its state then

begin

q records its state;

q records the state of c as the empty sequence

end

Else

q records the state of c as the sequence of messages it received along c after it recorded its state and before it received the marker
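The two marker rules can be tried out in a small simulation. This is a sketch under assumptions that are not in the talk: processes hold a numeric balance and send amounts to each other, there is one FIFO channel per ordered pair of processes, and the caller decides by hand which channel delivers next. All class and method names are invented for the example.

```python
from collections import deque

MARKER = "MARKER"

class Node:
    def __init__(self, balance):
        self.balance = balance
        self.recorded = None     # recorded local state (None until recorded)
        self.chan_state = {}     # source process -> messages recorded for that channel
        self.recording = set()   # incoming channels still being recorded

class System:
    def __init__(self, balances):
        self.nodes = {n: Node(b) for n, b in balances.items()}
        names = list(balances)
        self.chan = {(s, d): deque() for s in names for d in names if s != d}

    def send(self, src, dst, amount):
        self.nodes[src].balance -= amount
        self.chan[(src, dst)].append(amount)

    def record(self, name):
        """Record local state, then apply the marker sending rule."""
        node = self.nodes[name]
        node.recorded = node.balance
        node.recording = {s for (s, d) in self.chan if d == name}
        node.chan_state = {s: [] for s in node.recording}
        for (s, d) in self.chan:
            if s == name:
                self.chan[(s, d)].append(MARKER)

    def deliver(self, src, dst):
        """Deliver the oldest message on channel (src, dst)."""
        msg = self.chan[(src, dst)].popleft()
        node = self.nodes[dst]
        if msg == MARKER:
            if node.recorded is None:
                self.record(dst)         # this channel ends up recorded as empty
            node.recording.discard(src)  # channel state for (src, dst) is now final
        else:
            node.balance += msg
            if src in node.recording:    # arrived after recording, before the marker
                node.chan_state[src].append(msg)

    def snapshot_total(self):
        return sum(n.recorded + sum(sum(ms) for ms in n.chan_state.values())
                   for n in self.nodes.values())

s = System({"p0": 10, "p1": 10, "p2": 10})
s.send("p0", "p1", 3)
s.record("p0")                    # p0 initiates the snapshot
s.send("p1", "p2", 2)
for ch in [("p0", "p2"), ("p1", "p2"), ("p0", "p1"), ("p0", "p1"),
           ("p2", "p1"), ("p1", "p2"), ("p2", "p0"), ("p1", "p0")]:
    s.deliver(*ch)
print(s.snapshot_total())         # equals the 30 units the system started with
```

The total is a convenient sanity check: no process ever held exactly the recorded balances at the same instant, yet the recorded local states plus the recorded channel contents account for every unit exactly once.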

Comments on Algorithm

Marker ensures liveness of algorithm

Flooding Algorithm: $O(n^2)$ messages

Properties of the recorded global state

So is such a state useful? Yes, for detecting stable properties

[Figure: states s1, s2 and se.]

Conclusion

We looked at:

Fundamental concepts in distributed systems

Ordering in distributed systems

Global State Detection

The papers are some of the classic works in distributed systems

Where theory meets practice!!!!