OS 18
Transcript of OS 18
-
7/29/2019 OS 18
1/38
Chapter 18
Distributed ProcessManagement
Dave BremerOtago Polytechnic, N.Z.2008, Prentice Hall
Operating Systems:Internals and Design Principles, 6/E
William Stallings
Roadmap
Process Migration
Distributed Global States
Distributed Mutual Exclusion
Distributed Deadlock
-
7/29/2019 OS 18
2/38
Process Migration
Transfer of sufficient amount of the stateof a process from one computer to another
The process executes on the targetmachine
Motivation
Process migration is desirable indistributed computing for several reasonsincluding:
Load sharing
Communications performance
Availability
Utilizing special capabilities
-
7/29/2019 OS 18
3/38
Load Sharing
Move processes from heavily loaded tolightly load systems
Significant improvements are possible
Must be careful that the communicationsoverhead does not exceed the performance
gained.
Communicationsperformance Processes that interact intensively can be
moved to the same node to reducecommunications cost
May be better to move process to the datathan vice versa
Especially when the data is larger than thesize of the process
-
7/29/2019 OS 18
4/38
Availability and
Special Capabilities Availability
Long-running process may need to movebecause of faults or down time
OS must have advance notice of fault
Utilizing special capabilities
Process can take advantage of unique
hardware or software capabilities
Migration issues
For process migration to work we need tosatisfy a few issues:
Who initiates the migration?
What is involved in a Migration?
What portion of the process is migrated?
What happens to outstanding messages andsignals?
-
7/29/2019 OS 18
5/38
Who Initiates Migration?
Depends on the goal or reason formigration
OS initiates
if the goal is load balancing.
May be transparent to process
Process initiates
If the goal is to access a particular resource Process must be aware of the distributed
system
What is Involvedin Migration? Must destroy the process on the source
system and create it on the target system
Process movement, not replication.
Process image and process control blockand any links must be moved
-
7/29/2019 OS 18
6/38
Example of
Process Migration
What is migrated?
Moving the process control block is simple
Several strategies exist for moving theaddress space and data including:
Eager (All)
Precopy
Eager (dirty) Copy-on-reference
Flushing
-
7/29/2019 OS 18
7/38
Eager (all)
Transfer entire address space
No trace of process is left behind
If address space is large and if the process
does not need most of it, then this approachmy be unnecessarily expensive (takingminutes)
Checkpoint/restart capability is useful.
Precopy
Process continues to execute on thesource node while the address space iscopied
Pages modified on the source during precopyoperation have to be copied a second time
Reduces the time that a process is frozen and
cannot execute during migration
-
7/29/2019 OS 18
8/38
Eager (dirty)
Transfer only that portion of the addressspace that is in main memory and havebeen modified
Any additional blocks of the virtual addressspace are transferred on demand
The source machine is involvedthroughout the life of the process
Maintains page and/or segment table entries.
Copy-on-reference
Variation of Eager(All)
Pages are only brought over whenreferenced
Has lowest initial cost of process migration
-
7/29/2019 OS 18
9/38
Flushing
Pages are cleared from main memory byflushing dirty pages to disk
Pages are accessed as needed from disk
Relieves the source of holding any pages ofthe migrated process in main memory
Choosing a Strategy
If the process is not using much addressspace while on the target machine thenbetter to use
Eager (All)
Precopy
Eager (dirty)
Otherwise use
Copy-on-reference
Flushing
-
7/29/2019 OS 18
10/38
What Happens to
Messages and Signals? Need to have a way to temporarily store
outstanding messages and signals duringthe migration activity and then direct themto the new destination.
May need to maintain forwarding details at theinitial site to ensure outstanding messagesand signals get through
Decision to Migrate
Decision to migrate may be made by asingle entity
OS may decide based on load monitoringmodule
Process may decide based on resource
needs
Some systems let the target systemparticipate in the decision.
Negotiated migration
-
7/29/2019 OS 18
11/38
Migration by Negotiation
Charlotte is an example system.
Migration policy is responsibility of aStarter utility
Starter utility is also responsible for long-termscheduling and memory allocation
Migration decision must be reached jointlyby two Starter processes
one on the source and one on the destination
Negotiation of ProcessMigration
-
7/29/2019 OS 18
12/38
Eviction
Destination system may refuse to acceptthe migration of a process to itself
If a workstation is idle, process may havebeen migrated to it
Once the workstation is active, it may benecessary to evict the migrated processes toprovide adequate response time
Preemptive vs.Nonpreemptive Transfers Previous points related to preemptive
processes
Process has been created and may havebegun executing
Nonpreemptive process transfers involveprocesses which have not yet begun
So have no state to transfer
Useful in load balancing.
-
7/29/2019 OS 18
13/38
Roadmap
Process Migration
Distributed Global States
Distributed Mutual Exclusion
Distributed Deadlock
Distributed Goal State
Operating system cannot know the currentstate of all process in the distributedsystem
A process can only know the current stateof all processes on the local system
Remote processes only know stateinformation that is received by messages
-
7/29/2019 OS 18
14/38
Example
Bank account is distributed over twobranches
The total amount in the account is the sumat each branch
At 3 PM the account balance isdetermined
Messages are sent to request theinformation
Example 1
-
7/29/2019 OS 18
15/38
Example 2
If at the time of balance determination, thebalance from branch A is in transit tobranch B
The result is a false reading
Example 3
All messages in transit must be examinedat time of observation
Total consists of balance at both branchesand amount in message
-
7/29/2019 OS 18
16/38
Some Terms
Channel
Exists between two processes if theyexchange messages
State
Sequence of messages that have been sentand received along channels incident with theprocess
Some Terms
Snapshot
Records the state of a process
Global state
The combined state of all processes
Distributed Snapshot
A collection of snapshots, one for eachprocess
-
7/29/2019 OS 18
17/38
Inconsistent
Global State
Consistent Global State
-
7/29/2019 OS 18
18/38
Distributed
Snapshot Algorithm Records a consistent global state
Assumes messages are delivered in orderthat they were sent
And no messages are lost
TCP satisfies requirements
Uses a special control message
Marker
Roadmap
Process Migration
Distributed Global States
Distributed Mutual Exclusion
Distributed Deadlock
-
7/29/2019 OS 18
19/38
Concurrent
Process Problems Two main problems with concurrent
processing are:
Mutual Exclusion
Deadlock
Ch5/6 looked at issues within a singlecomputer
Different issues arise when the processors do
not share common memory or clock
Critical Section
Non-sharable resources are criticalresources
Code accessing it is a critical section of aprogram
Mutual exclusion must be enforced:
only one process at a time is allowed in itscritical section
-
7/29/2019 OS 18
20/38
Distributed Mutual
Exclusion Requirements1. Mutual exclusion must be enforced
2. A process halting in noncritical sectionmust not interfere with other processes
3. A process requiring access to a criticalsection should not be delayed indefinitely
Deadlock and starvation are not allowed
Distributed MutualExclusion Requirements4. If no process is in a critical section, any
process may enter without delay
5. No assumption about the number ofprocessors
6. A process remains inside its critical
section for a finite time only.
-
7/29/2019 OS 18
21/38
Mutual Exclusion
Centralized Algorithmfor Mutual Exclusion One node is designated as the control
node
This node control access to all sharedobjects
Only the control node makes resource-
allocation decision
-
7/29/2019 OS 18
22/38
Properties of a
Centralized Method Only the control node makes resource-
allocation decisions.
All information is located in the controlnode
If control node fails, mutual exclusion breaksdown
Distributed Algorithm
1. All nodes have equal amount ofinformation, on average
2. Each node has only a partial picture ofthe total system and must makedecisions based on this information
3. All nodes bear equal responsibility for thefinal decision
-
7/29/2019 OS 18
23/38
Distributed Algorithm
4. All nodes expend equal effort, onaverage, in effecting a final decision
5. Failure of a node, in general, does notresult in a total system collapse
6. There exits no system-wide commonclock with which to regulate the time ofevents
Ordering of Events in aDistributed System Events must be order to ensure mutual
exclusion and avoid deadlock
But clocks are not synchronized
Communication delays affect timingdecisions
-
7/29/2019 OS 18
24/38
Timestamping
Each system on the network maintains acounter which functions as a clock
Each site has a numerical identifier
When a message is received, thereceiving system sets is clock to one morethan the maximum of its current value andthe incoming timestamp
Timestampingis Ordering Messages Each system ihas a counter Ci functioning
as a clock
Each message increments Ci
Messages sent in the form (m, Ti, i) where
m = contents of the message
Ti= timestamp for this message, set to equalCi i = numerical identifier of this system in the
distributed system
-
7/29/2019 OS 18
25/38
Receiving System
The receiving system j increments itsclock to one more than the maximum of itscurrent value and the incoming timestamp:
Cj 1 + max[Cj, Ti ]
If i sends x, and j sends y then x comesbefore y
1.If Ti < Tj or
2.If Ti = Tj AND i < j
Timestamping
-
7/29/2019 OS 18
26/38
Timestamping
Distributed Queue
An early approach to distributed mutualexclusion uses a distributed queue
Relied on several assumptions
1. Each node (1..N) has 1 process to managemutual exclusion
2. Messages arrive in the same order as sent(this assumption relaxed in later versions)
3. Every message is successfully received
4. Network is fully connected
-
7/29/2019 OS 18
27/38
State Diagram
Token-Passing Approach
Pass a token among the participatingprocesses
The token is an entity that at any time isheld by one process
The process holding the token may enter itscritical section without asking permission
When a process leaves its critical section, itpasses the token to another process
-
7/29/2019 OS 18
28/38
Token-Passing
Token Passing
-
7/29/2019 OS 18
29/38
Roadmap
Process Migration
Distributed Global States
Distributed Mutual Exclusion
Distributed Deadlock
Deadlock
The permanent blocking of a set ofprocesses that either compete for systemresources or communicate with oneanother.
Definition valid for distributed systems asmuch as for single
More complex as no node as knowledge ofthe state of the overall system
-
7/29/2019 OS 18
30/38
Types of Deadlock
Those related to resource allocation
Multiple processes attempt to access thesame resource
Communication of messages
Each process in a set is waiting for amessage from another process in the set,
No process sends a message.
Deadlock inResource Allocation Deadlock required:
Mutual exclusion
Hold and wait
No preemption
Circular wait
In a distributed system there is no controlprocess with an up-to-date knowledge ofthe global state of the system.
-
7/29/2019 OS 18
31/38
Phantom Deadlock
Deadlock Prevention
Circular-wait condition can be preventedby defining a linear ordering of resourcetypes
Hold-and-wait condition can be preventedby requiring that a process request all ofits required resource at one time, andblocking the process until all requests canbe granted simultaneously
-
7/29/2019 OS 18
32/38
Deadlock
Prevention Methods
Deadlock Avoidance
Distributed deadlock avoidance isimpractical
Every node must keep track of the globalstate of the system
The process of checking for a safe global
state must be mutually exclusive
Checking for safe states involvesconsiderable processing overhead
-
7/29/2019 OS 18
33/38
Deadlock Detection
Processes are allowed to obtain freeresources as they wish, and the existenceof a deadlock is determined after the fact.
Each site only knows about its ownresources
But deadlock may involve distributedresources.
DeadlockDetection Strategies
-
7/29/2019 OS 18
34/38
Deadlock Detection
Centralized control
one site is responsible for deadlock detection
Hierarchical control
sites organized in a tree structure
Distributed control
all processes cooperate in the deadlockdetection function
DistributedDeadlock Detection
-
7/29/2019 OS 18
35/38
Distributed Deadlock
Detection States
Deadlock inMessage Communication Types
Mutual Waiting deadlock
Unavailability of Message Buffers
-
7/29/2019 OS 18
36/38
Mutual Waiting
Deadlock Each process in a group is waiting for
another member of the group
And no messages are in transit
Unavailability ofMessage Buffers Common in packet-switching networks
Simplest forms
Store and forward deadlock
For each node, the queue to the adjacentnode in one direction is full with packets
destined for the next node beyond
-
7/29/2019 OS 18
37/38
Direct Store-and-Forward
Deadlock
Indirect Store-and-ForwardDeadlock
-
7/29/2019 OS 18
38/38
Structured Buffer Pool
Communication Deadlock