Distributed Systems: Distributed System Models (transcript of 41 lecture slides)

Page 1: Distributed Systems

• Distributed System models
  1. Physical Networks
  2. Logical Models

• Different Failure Models

• Communication constructs (semantics of distributed programs)

• Ordering of events

• Execution Semantics

Page 2: System Model

Two ways of viewing a DS:

1. As defined by the physical components of the system – physical model

2. As defined from the point of view of processing or computation – logical model

Page 3: The goal of fault tolerance in DS

The goal is to ensure that some property or service in the logical model is preserved despite the failure of some component(s) in the physical system.

Page 4: Physical Network of a DS

Consists of many computers, called nodes, that are typically:

1. Autonomous

2. Geographically Separated

3. Communicate through Communication Networks

Page 5: Distributed vs. Parallel Systems

Distributed systems:

1. Nodes loosely coupled

2. Essentially no shared memory

3. Private clocks for nodes

Parallel systems:

1. Nodes closely coupled

2. May have shared memory b/w nodes

3. May have a single global clock for many/all nodes

Page 6: Point to Point Physical Network

Communication Protocols used: TCP/IP, OSI etc.

Example topologies: fully connected, star, tree.

Page 7: Bus Topology

Nodes are attached to a common bus.

Communication protocol used: CSMA/CD

Page 8: Logical Model

• A Distributed Application consists of a set of concurrently executing processes that cooperate with each other to perform some task.

• A process is the execution of a sequential program, which is a list of instructions.

Page 9: Concurrent Processes

Can be classified into three categories:

1. Independent processes: the sets of objects they access are disjoint.

2. Competing processes: share resources but there is no information exchange between them.

3. Cooperating processes: exchange information either by using shared data or by message passing.

Page 10: A few logical level assumptions

• Finite progress assumption: since no assumptions about the relative speeds of processes can be made, it is assumed that they all have positive rates of execution.

• The underlying network is treated as fully connected; topology is not considered.

• At the logical level the system is made up of processes and the channels between them.

• Channels are assumed to have infinite buffer and to be error-free.

• Channels deliver messages in the order in which they are sent (NB: on a particular channel).

Page 11: A few logical level assumptions (contd.)

• Assumptions about time bounds on the performance of the system are also made.

• A system is said to be synchronous if, whenever it is working correctly, it performs its intended function within a finite and known time bound; otherwise it is said to be asynchronous.

• A synchronous channel is one in which the maximum message delay is known and bounded.

• A synchronous processor is one in which the time to execute a sequence of instructions is finite and bounded.

** Advantage of synchronous systems: failure of components can be deduced from the lack of response.
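
A minimal sketch of this last point in Go (the bound value and channel names are illustrative assumptions, not part of the slides): the caller waits for a reply for at most the known time bound and deduces a failure if none arrives, which is only sound because the system is synchronous.

package main

import (
	"fmt"
	"time"
)

// waitForReply deduces failure from the lack of a response within the
// known bound, which is possible only in a synchronous system.
func waitForReply(reply <-chan string, bound time.Duration) {
	select {
	case r := <-reply:
		fmt.Println("got reply:", r)
	case <-time.After(bound):
		fmt.Println("no reply within bound: component presumed failed")
	}
}

func main() {
	reply := make(chan string, 1)
	// A healthy component would eventually send on the channel; here we
	// simulate a crashed component by never sending.
	waitForReply(reply, 100*time.Millisecond)
}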

Page 12: Failures and Fault Classification

• Crash fault: causes a component to halt or to lose its internal state; the component never undergoes any incorrect state transition when it fails.

• Omission fault: causes a component to not respond to some inputs.

• Timing/performance fault: causes a component to respond either too early or too late.

• Byzantine fault: causes the component to behave in a totally arbitrary manner during failure.

• Incorrect computation fault: produces incorrect outputs.

Page 13: Fault Hierarchy

crash ⊂ omission ⊂ timing ⊂ byzantine (each fault class is contained in the next, more general one)

** The incorrect computation fault is a subset of byzantine faults but is different from the other fault classes.

Page 14: Assumptions about fault types

• For a processor: crash fault or byzantine fault

• For a communication network: all the different types of faults

• For a clock: timing fault, byzantine fault and sometimes omission fault

• For storage media: crash, timing, omission and incorrect computation faults

• For software components: most of the above-defined faults, but the most important is the incorrect computation fault

Page 15: Interprocess Communication

• Synchronization and communication are both achieved by message passing primitives.

• In shared memory systems, primitives like semaphores, conditional critical regions and monitors are used.

Page 16: Process Creation

• Processes are created in a system by the use of some operating system-provided system call.

• At the language level, this is done by using language primitives, e.g. fork and join and the cobegin-coend statement.

Page 17: Fork and Join Primitive

Program P1

………..

fork P2;

……….

join P2;

……….

Program P2

………….

………….

………….

end
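
A rough Go analogue of the pattern above (a goroutine and a done channel stand in for the fork and join primitives; this is an illustration, not the notation used in the slides):

package main

import "fmt"

func p2() {
	fmt.Println("P2 running")
}

func main() {
	// ... statements of P1 before the fork ...
	done := make(chan struct{})
	go func() { // fork P2: start P2 concurrently with P1
		p2()
		close(done)
	}()
	// ... P1 continues concurrently with P2 ...
	<-done // join P2: wait here until P2 has terminated
	// ... statements of P1 after the join ...
	fmt.Println("P1 continues after joining P2")
}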

Page 18: Cobegin-Coend Primitive

cobegin S1 || S2 || S3 || … || Sn coend

The above statement causes n different processes to be created, each executing a different statement Si; the construct ends with the termination of all the Si's.
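
A sketch of the same idea in Go using a sync.WaitGroup, with three placeholder statements S1..S3 (the statement bodies are invented for the example):

package main

import (
	"fmt"
	"sync"
)

func main() {
	// cobegin S1 || S2 || S3 coend
	stmts := []func(){
		func() { fmt.Println("S1") },
		func() { fmt.Println("S2") },
		func() { fmt.Println("S3") },
	}
	var wg sync.WaitGroup
	for _, s := range stmts {
		wg.Add(1)
		go func(stmt func()) { // each Si runs as a separate process
			defer wg.Done()
			stmt()
		}(s)
	}
	wg.Wait() // coend: terminates only when all Si have terminated
}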

Page 19: Asynchronous Message Passing

• In a DS without shared memory, message passing is used both for communication and synchronization.

• A message is sent by a process by executing a send command: send(data, destination)

• Receiving of data is done with a receive command: receive(data, source), or receive(message)**

** in client–server interaction.

Page 20: Assumptions

Message passing requires some buffer between sender and receiver:

• In asynchronous message passing an infinite buffer to store messages is assumed, so the sender can go on sending messages without blocking; the receiver, however, is not non-blocking.

• In reality, though, buffers are of finite size, so the sender may also have to block; this is called buffered message passing.
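
A small Go illustration of the finite-buffer case (the channel name and the buffer size of 2 are arbitrary choices for the example): with a finite buffer the sender blocks once the buffer is full, which is the buffered message passing described above.

package main

import "fmt"

func main() {
	// A channel with capacity 2 models a finite buffer between
	// sender and receiver.
	ch := make(chan string, 2)

	ch <- "m1" // does not block: buffer has room
	ch <- "m2" // does not block: buffer is now full
	// ch <- "m3" would block here until the receiver removes a message.

	fmt.Println(<-ch) // receive(data, source): the receiver blocks if empty
	fmt.Println(<-ch)
}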

Page 21: NB

• Asynchronous and synchronous message passing are different from asynchronous and synchronous DS. The former refers to communication primitives and the size of the buffer between sender and receiver, while the latter deals with bounds on message delays.

• In a synchronous DS, both asynchronous and synchronous message passing can be supported.

Page 22: Synchronous msg passing & CSP

• Has no buffering.

• Has the advantage that at each communication command it is easier to make assertions about processes.

• Has been employed in Communicating Sequential Processes (CSP), a notation proposed for specifying distributed programs.

• CSP uses Guarded Command Language.
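
In Go terms, an unbuffered channel gives exactly this rendezvous behaviour: a send completes only when a matching receive is executed. A minimal sketch (the channel name and value are illustrative):

package main

import "fmt"

func main() {
	ch := make(chan int) // no buffering: send and receive must rendezvous

	go func() {
		// The send blocks until the receiver executes its receive,
		// so at this point both processes are synchronized.
		ch <- 42
	}()

	v := <-ch // blocks until the sender is ready; then both proceed
	fmt.Println("receiver got", v)
}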

Page 23: Guarded Commands

A GC is a statement list that is prefixed by a Boolean expression called a guard:

guard → statement list

The statement list is eligible for execution only if the guard evaluates to true, i.e. it succeeds.

Evaluation of the guard is assumed to have no side effects, i.e. it does not alter the state of the program in any manner.

The alternative construct is formed by using a set of guarded commands as follows:

Page 24: Guarded Commands (contd.)

[ G1 → S1
[] G2 → S2
   …
[] Gn → Sn
]

The execution of this alternative construct aborts if all the guards evaluate to false.

If any guard is true, the corresponding statement is eligible for execution.

In case multiple guards evaluate to true, the statement to be executed is selected non-deterministically.

The repetitive construct is similar, but with a * prefix.

GC notation allows non-determinism within a program.
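
The alternative construct can be mimicked directly in code: evaluate all guards, abort if none holds, and otherwise execute one enabled statement chosen non-deterministically. A sketch in Go, under the assumption that guards are side-effect-free boolean functions (the type and function names are my own):

package main

import (
	"fmt"
	"math/rand"
)

type guardedCommand struct {
	guard func() bool // must have no side effects
	stmt  func()
}

// alternative implements [ G1 -> S1 [] ... [] Gn -> Sn ]: it aborts
// (returns an error here) if no guard is true, and otherwise executes
// one enabled statement chosen non-deterministically.
func alternative(gcs []guardedCommand) error {
	var enabled []guardedCommand
	for _, gc := range gcs {
		if gc.guard() {
			enabled = append(enabled, gc)
		}
	}
	if len(enabled) == 0 {
		return fmt.Errorf("alternative construct aborts: no guard is true")
	}
	enabled[rand.Intn(len(enabled))].stmt()
	return nil
}

func main() {
	x := 5
	_ = alternative([]guardedCommand{
		{func() bool { return x >= 0 }, func() { fmt.Println("x is non-negative") }},
		{func() bool { return x <= 0 }, func() { fmt.Println("x is non-positive") }},
	})
}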

Page 25: Communicating Sequential Processes

• Is a programming notation for expressing concurrent programs.

• Employs synchronous message passing.

• Uses guarded commands to allow selective communication.

• A CSP program may consist of many concurrent processes: e.g. a process Pi sends a message, msg, to a process Pj by an output command of the form: Pj!msg.

Pj receives a message from Pi by an input command: Pi?m.

For a process Pj the overall code is of the form:

Pj :: Initialize; *[ G1 → C1 [] G2 → C2 [] … [] Gn → Cn ].
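
A rough Go rendering of this pattern (Go's channels were themselves inspired by CSP, though the syntax differs): Pi's output command Pj!msg becomes a send on a shared unbuffered channel, and Pj's input command Pi?m becomes a receive inside a loop that plays the role of the repetitive construct. Names are illustrative.

package main

import "fmt"

func main() {
	piToPj := make(chan string) // synchronous channel from Pi to Pj
	done := make(chan struct{})

	// Process Pi: Pj!msg
	go func() {
		piToPj <- "msg"
		close(piToPj)
	}()

	// Process Pj: *[ Pi?m -> handle m ]
	go func() {
		for m := range piToPj { // repeats while Pi still offers input
			fmt.Println("Pj received:", m)
		}
		close(done)
	}()

	<-done
}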

Page 26: Remote Procedure Call

• A higher-level primitive to support client–server interaction.

• An extension of the procedure call mechanism available in most programming languages.

The service to be provided by the server is treated as a procedure that resides on the machine on which the server runs. The client process that needs that service makes “calls” to this procedure, and RPC takes care of the underlying communication.

A call statement is of the form:

call service(value_args, result_args)
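
As one concrete illustration (not the notation used in the slides), Go's standard net/rpc package follows exactly this model; the Calculator service below is a made-up example, with Args playing the role of value_args and the reply pointer the role of result_args.

package main

import (
	"fmt"
	"log"
	"net"
	"net/rpc"
)

type Args struct{ A, B int } // value_args
type Calculator int

// Add is the remote procedure offered by the server.
func (c *Calculator) Add(args *Args, reply *int) error { // reply: result_args
	*reply = args.A + args.B
	return nil
}

func main() {
	// Server side: register the service and accept connections.
	rpc.Register(new(Calculator))
	ln, err := net.Listen("tcp", "127.0.0.1:4321")
	if err != nil {
		log.Fatal(err)
	}
	go rpc.Accept(ln)

	// Client side: the "call" looks like a local procedure call.
	client, err := rpc.Dial("tcp", "127.0.0.1:4321")
	if err != nil {
		log.Fatal(err)
	}
	var result int
	if err := client.Call("Calculator.Add", &Args{2, 3}, &result); err != nil {
		log.Fatal(err)
	}
	fmt.Println("2 + 3 =", result)
}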

Page 27: Remote Procedure Call (contd.)

• The states of the server and the client may both change as a result of a ‘call’.

• However, idempotent remote procedures on the server do not change the state of the server after each ‘call’ from the client.

• Idempotent servers simplify the task of fault tolerance.

• Two basic approaches to specifying the server side in RPC:

1. The remote procedure is just like a sequential procedure, i.e. a single process executes the procedure as calls are made.

2. A new process is created every time a call is made. These processes can be concurrent.

Page 28: Semantics of the RPC in failure conditions

The classification for semantics of remote calls:

• At Least Once: the remote procedure has been executed one or more times if the invocation terminates normally. If it terminates abnormally, nothing can be said about the number of times the remote procedure executed.

• Exactly Once: the remote procedure has executed exactly once if the invocation terminates normally; if not, then it can be asserted that the remote procedure did not execute more than once.

• At Most Once: same as exactly once if the invocation terminates normally; otherwise it is guaranteed that the remote procedure has been executed completely once or has not been executed at all.
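
One common way to approximate at-most-once behaviour (a sketch under the assumption that every call carries a unique request identifier that is reused on retries; this mechanism is not spelled out on the slide) is for the server to remember which request IDs it has already executed and to return the cached reply instead of re-executing:

package main

import "fmt"

type request struct {
	id   int // unique per invocation, reused on client retries
	data string
}

type server struct {
	seen map[int]string // request id -> cached reply
}

// handle executes a request at most once: a retry with the same id
// gets the cached reply rather than a second execution.
func (s *server) handle(r request) string {
	if reply, ok := s.seen[r.id]; ok {
		return reply // duplicate (e.g. a client retry): do not re-execute
	}
	reply := "processed " + r.data
	s.seen[r.id] = reply
	return reply
}

func main() {
	s := &server{seen: map[int]string{}}
	fmt.Println(s.handle(request{id: 1, data: "debit account"}))
	fmt.Println(s.handle(request{id: 1, data: "debit account"})) // retry: cached reply
}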

Page 29: Semantics of the RPC in failure conditions (contd.)

Orphans: unwanted executions of remote procedures caused by communication or processor failure.

e.g. A client that crashes after issuing a call may restart on recovery and reissue the call even though the previous call is still being executed by the server.

Presence of orphans can violate the semantics of RPC and lead to inconsistency.

Call Ordering: this property requires that a sequence of invocations generated by a given client results in computations performed by the server in the same order.

It is automatically satisfied if there are no failures. It is not a strict requirement in the case of idempotent servers.

Page 30: Object-Action Model

• Another high-level communication paradigm.

• In this paradigm, a system consists of many objects, each consisting of some data and well-defined methods (operations) on that data.

• The encapsulated data can only be accessed through the methods defined for it.

• The objects may reside on different nodes.

• A process sends a message to the object concerned, which performs an action by executing a method and returns the result to the process.
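
A small sketch of this idea in Go (the counter object and its single method are invented for illustration): the object's data is reachable only through a request channel, and each message causes the object to execute a method and return the result.

package main

import "fmt"

// request asks the counter object to perform its "increment" action
// and to send the new value back on the reply channel.
type request struct {
	reply chan int
}

// counterObject encapsulates its data; other processes never touch
// the variable directly, they only send messages to the object.
func counterObject(requests <-chan request) {
	value := 0
	for req := range requests {
		value++            // the action: execute the method on the data
		req.reply <- value // return the result to the calling process
	}
}

func main() {
	requests := make(chan request)
	go counterObject(requests)

	for i := 0; i < 3; i++ {
		reply := make(chan int)
		requests <- request{reply: reply}
		fmt.Println("counter is now", <-reply)
	}
}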

Page 31: Object-Action Model (contd.)

• Nested remote procedure calls may be created.

• Methods on objects may execute in parallel.

• Concurrent calls may be made to the same method or to the same object.

• Becoming popular since it supports fault tolerance by possible replication of objects.

Page 32: Ordering of Events

There is no single global clock for defining the happened-before relationship between events of different processors.

Partial Ordering: the relation → on a set of events in a distributed system is the smallest relation satisfying the following three conditions:

1. If a and b are events performed by the same process and a is performed before b, then a → b.

2. If a is the sending of a message by one process and b is the receiving of the same message by another process, then a → b.

3. If a → b and b → c, then a → c. Two events are said to be concurrent if neither a → b nor b → a.

Page 33: Logical Clocks

• The logical clock Ci, for a process Pi, is a function which assigns a value Ci(a) to an event a of the process Pi.

• The system of logical clocks is considered to be correct if it is consistent with the relation →, i.e. for any events a, b, if a → b then C(a) < C(b).

• When a msg is sent from process Pi, the timestamp of the sending event is included in the msg m and can be retrieved by the receiver.

• Let Tm be the timestamp of the message m. There are two conditions that a system of logical clocks should satisfy in order to be correct:

1. Each Pi increments Ci between any two successive events.
2. Upon receiving a msg m, Pj sets Cj greater than or equal to its present value and greater than Tm.
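
A compact sketch of these two rules in Go (the type and method names are my own; only the increment rule and the "advance past both Cj and Tm" rule come from the slide):

package main

import "fmt"

// lamportClock implements the two correctness rules for logical clocks.
type lamportClock struct {
	c int
}

// tick is called before any local or send event (rule 1).
func (l *lamportClock) tick() int {
	l.c++
	return l.c
}

// onReceive is called when a message with timestamp tm arrives (rule 2):
// the clock is advanced past both its own value and tm.
func (l *lamportClock) onReceive(tm int) int {
	if tm > l.c {
		l.c = tm
	}
	l.c++
	return l.c
}

func main() {
	var pi, pj lamportClock
	tm := pi.tick() // Pi sends a message stamped with tm
	fmt.Println("Pi sent at", tm)
	fmt.Println("Pj receives at", pj.onReceive(tm)) // new Cj > Tm and > old Cj
}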

Page 34: Total Ordering of Events

• Order the events by the timestamps assigned to them by the logical clock system. Processes can be ordered in the lexicographic order of their names.

• A relation => on the set of events is defined as follows: for events a and b of processes Pi and Pj respectively,

a => b iff either Ci(a) < Cj(b), or Ci(a) = Cj(b) and Pi comes before Pj in the ordering.
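
A small helper expressing the => relation as a comparison (the timestamps and process names below are illustrative):

package main

import "fmt"

type event struct {
	proc string // name of the process, used to break ties
	ts   int    // logical clock value Ci(a)
}

// before reports whether a => b under the total ordering: compare
// timestamps first, and break ties by the ordering of process names.
func before(a, b event) bool {
	if a.ts != b.ts {
		return a.ts < b.ts
	}
	return a.proc < b.proc // lexicographic order of process names
}

func main() {
	a := event{proc: "P1", ts: 4}
	b := event{proc: "P2", ts: 4}
	fmt.Println(before(a, b)) // true: equal timestamps, P1 precedes P2
}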

Page 35: Execution Model and System State

• At a logical level, a distributed system can be modeled as a directed graph, with nodes representing processes and directed edges representing the channels between them.

• The state of a channel in this model is the sequence of msgs that are still in the channel.

• A process can be considered as consisting of a set of states, an initial state, and a sequence of events (or actions).

• The state of a process is an assignment of a value to each of its variables, along with the specification of its control point, which specifies the event executed last.

• Each event or action of a process is assumed to be atomic.

• An event e of a process p can change the state of p and at most one channel c that is incident on p.

Page 36: Execution Model and System State (contd.)

• Each event has an enabling condition, which is a condition on the state of the process and the channel attached to it.

• An event e can occur only if this enabling condition is true, e.g. when the program counter has a specific value.

• The global state or the system state of a DS consists of the states of each of the processes in the system and the states of the channels in the system.

• The initial global state is one in which each process is in its initial state and all channels are empty.

• An event e can change the system state S by changing the state of process p, iff the enabling condition for e is true in S.

Page 37: Execution Model and System State (contd.)

• A function ready(S) is defined on a global state S as the set of events whose enabling condition is satisfied in S.

• The events in ready(S) can belong to different processes; however, only one of these events will take place.

• Which of the events in ready(S) will occur cannot be predicted deterministically.

• We define another function next, where next(S, e) is the global state immediately following the occurrence of the event e in the global state S.

• The computation of a DS can be defined as a sequence of events.

• Let the initial state of the system be S0 and let seq = (ei, 0 <= i <= n) be a sequence of events.

Suppose that the system state when ei occurs is Si. The sequence of events seq is a computation of the system if the following conditions are satisfied:

Page 38: Execution Model and System State (contd.)

1. The event ei belongs to ready(Si), 0 <= i <= n.

2. Si+1 = next(Si, ei), 0 <= i <= n.

Example: A concurrent shared-memory program.

a: x := 0
b: cobegin
      c: y := 0
      d: cobegin
            e: y := 2*y
         || f: y := y + 3
         coend
   || g: while y = 0 do
         h: x := x + 1
   coend
j: x := 2*y

Page 39: Execution Model and System State (contd.)

An execution sequence for the program:

(each state is shown as [(x, y); ready set], followed by the event that occurs next)

S0: [(2,7); {a}]      (a)
S1: [(0,7); {c,g}]    (c)
S2: [(0,0); {e,f,g}]  (g)
S3: [(0,0); {e,f,h}]  (h)
S4: [(1,0); {e,f,g}]  (f)
S5: [(1,3); {e,g}]    (e)
S6: [(1,6); {g}]      (g)
S7: [(1,6); {j}]      (j)
S8: [(12,6); {}]

The possible states of the system can also be represented as a tree with its root as the initial state and each event in the ready set producing a child of a node.

Such a tree is called a reachability tree, in which each node represents a state, and the number of children of a node equals the cardinality of the ready set at that state.

Each path from the initial node to a leaf node shows one possible execution sequence of the system. The states in the path are called valid or consistent states.
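
The ready/next machinery can be prototyped directly. The sketch below (a deliberately tiny two-event system, not the program from the slide) enumerates every execution sequence by recursively applying each event in ready(S), which is exactly how the reachability tree is built:

package main

import "fmt"

// state is a toy global state: one shared variable plus a flag per
// process recording whether that process has already executed its event.
type state struct {
	x            int
	done1, done2 bool
}

type event struct {
	name    string
	enabled func(state) bool
	next    func(state) state
}

// explore prints every path of the reachability tree rooted at s.
func explore(s state, events []event, path string) {
	ready := []event{}
	for _, e := range events {
		if e.enabled(s) {
			ready = append(ready, e)
		}
	}
	if len(ready) == 0 {
		fmt.Printf("%s -> final state x=%d\n", path, s.x)
		return
	}
	for _, e := range ready {
		explore(e.next(s), events, path+" "+e.name)
	}
}

func main() {
	events := []event{
		{"a: x:=x+1",
			func(s state) bool { return !s.done1 },
			func(s state) state { s.x++; s.done1 = true; return s }},
		{"b: x:=2*x",
			func(s state) bool { return !s.done2 },
			func(s state) state { s.x *= 2; s.done2 = true; return s }},
	}
	explore(state{x: 1}, events, "S0")
	// Prints both interleavings: a then b gives x=4, b then a gives x=3.
}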

Page 40: Reachability Tree

[Figure: a reachability tree rooted at S0, with edges labeled by events (a, b, g, h, e, f) leading to states S1, S2, S3, S4, …]

This model is also called the interleaving model.

Page 41:

For details, please refer to:

'Fault Tolerance in Distributed Systems', by Pankaj Jalote.

~THANK YOU~