Civilian Worms: Ensuring Reliability in an Unreliable Environment
Transcript of Civilian Worms: Ensuring Reliability in an Unreliable Environment
Sanjeev R. Kulkarni, University of Wisconsin-Madison
[email protected]
Work with Sambavi Muthukrishnan
Outline
- Motivation and Goals
- Civilian Worms
- Master-Worker Model
- Leader Election
- Forward Progress
- Correctness
- Parallel Applications
What's happening today
- Move towards clusters
- Resource managers, e.g. Condor
- Dynamic environment
Motivation
- Large parallel/standalone applications
- Non-dedicated resources, e.g. a Condor environment: machines can disappear at any time
- Unreliable commodity clusters: hardware failures, network failures
- Security attacks!
What's available
- Parallel platforms
  - MPI: MPI-1, machines can't go away! MPI-2, any takers?
  - PVM: shoot the master!
- Condor: shoot the Central Manager!
Goal
- A bottleneck-free infrastructure in an unreliable environment
- Ensure "normal termination" of applications: users submit their jobs and get e-mail upon completion!
Focus of this talk
- Approaches for reliability
- Standalone applications: a monitor framework (worms!), replication
- Parallel applications: future work!
Worms are here again!
- Usual worms: self-replicating, hard to detect and kill
- Civilian worms: controlled replication, spread legally! Monitor applications
Desired Monitoring System
[Diagram: worm segments (W) on each node, monitoring computations (C)]
Issues
- Management of worms
- Distributed state detection: very hard
- Forward progress: checkpointing
- Correctness
Management Models
- Master-Worker: simple, effective; our choice!
- Symmetric: difficult to manage the model itself!
Our Implementation Model
[Diagram: one master worm (W) and several worker worms (W), each worm monitoring a computation (C)]
Worm States
- Master: maintains the state of all the worm segments, listens on a particular socket, respawns failed worm segments
- Worker: periodically pings the master, starts the encapsulated process if instructed
- Leader Election: invokes the LE algorithm to elect a new master
Note: independent of application state
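The ping/respawn protocol above can be sketched as follows (a minimal sketch: the `Master` class, method names, and the 3-second timeout are illustrative assumptions, not details from the real UDP-based system):

```python
import time

class Master:
    """Tracks when each worm segment last pinged and flags dead ones."""
    def __init__(self, timeout=3.0):
        self.timeout = timeout   # assumed value; the talk gives no number
        self.last_ping = {}      # segment id -> timestamp of last ping

    def on_ping(self, worker_id, now=None):
        """Record a periodic ping from a worker segment."""
        self.last_ping[worker_id] = time.monotonic() if now is None else now

    def failed_segments(self, now=None):
        """Segments whose ping has timed out; these must be respawned."""
        now = time.monotonic() if now is None else now
        return [w for w, t in self.last_ping.items() if now - t > self.timeout]

# Usage: worker 2 goes silent, so the master flags it for respawn.
m = Master(timeout=3.0)
m.on_ping(1, now=0.0)
m.on_ping(2, now=0.0)
m.on_ping(1, now=4.0)              # worker 1 pings again; worker 2 is silent
print(m.failed_segments(now=4.5))  # -> [2]
```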
Leader Election
The woes begin! The master goes down.
- Detection: a worker's ping times out (timeout value), or a worker gets an LE message
- Action: the worker goes into LE state
LE algorithm
- Each worm segment is given an ID (only the master assigns ids)
- Workers broadcast their ids; the worker with the lowest id wins

Brief skeleton:
  While in LE:
    bcast an LE message with your id
    set min = your id
    on getting an LE message with id i:
      if i >= min, ignore; else min = i
  min is the new master
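The lowest-id-wins skeleton above can be simulated in a few lines. This is a sketch assuming reliable, fully connected broadcast; message loss, timeouts, and the COORD/COORD_ACK handshake are not modeled:

```python
def elect(worker_ids):
    """Simulate the election: every live worker broadcasts its id; each
    worker keeps the minimum id it has seen; that minimum is the master."""
    mins = {w: w for w in worker_ids}      # "set min = your id"
    for sender in worker_ids:              # each worker's LE broadcast
        for receiver in worker_ids:
            if sender < mins[receiver]:    # "if i >= min, ignore; else min = i"
                mins[receiver] = sender
    winners = set(mins.values())
    assert len(winners) == 1, "all workers must agree on the master"
    return winners.pop()

# After master M0 dies, workers 1 and 2 run the election; 1 wins.
print(elect([2, 1]))  # -> 1
```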
LE in action (1)
[Diagram: master M0 with workers W1 and W2. Master M0 goes down!]

LE in action (2)
[Diagram: L1 and L2 send out LE messages ("LE, 1" and "LE, 2") to each other]

LE in action (3)
[Diagram: L1 gets "LE, 2" and ignores it; L2 gets "LE, 1" and sends COORD_ACK]

LE in action (4)
[Diagram: M1 sends COORD to W2 and spawns a new W0]
Implementation Problems
- Too many cases, many unclear cases
- Time to converge
- Timeout values
- Network partition
What happens if?
- The master is still up?
  - Incoming id < self id => master goes into LE mode
  - Else => master sends back a COORD message
- The next master in line goes down? Timeout on COORD message receipt.
- A late COORD_ACK? Send a KILL message.
More Bizarre Cases
- Multiple masters? The master bcasts its id periodically; the conflict is resolved using the lowest-id method.
- No master? Workers will time out soon!
Test-Bed
- 64 dual-processor 550 MHz P-III nodes, Linux 2.2.12, 2 GB RAM
- Fast interconnect: 100 Mbps
- Master-Worker comm. via UDP
A Stress Test for LE
- Worker pings every second
- Kill n/4 workers
- After 1 sec, kill the master
- After 0.5 sec, kill the master in line
- Kill n/4 workers again
Convergence
[Graph: convergence time in seconds (y-axis, 0-35) vs. cluster size (x-axis: 2, 4, 8, 16 nodes)]
Forward Progress
- Why? MTTF < application time
- Solutions: checkpointing, at the application level or the process level
- Start from the checkpoint image!
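Application-level checkpointing can be sketched like this (a minimal sketch: the file name, checkpoint interval, and pickled loop state are illustrative assumptions, not the Condor process-level mechanism the talk uses):

```python
import os
import pickle

CKPT = "app.ckpt"  # hypothetical checkpoint file name

def run(n_steps=10, ckpt_every=3):
    """Persist the loop state every few steps, so a restarted process
    resumes from the checkpoint image instead of recomputing from scratch."""
    step, total = 0, 0
    if os.path.exists(CKPT):            # restart path: load the image
        with open(CKPT, "rb") as f:
            step, total = pickle.load(f)
    while step < n_steps:
        total += step                   # stand-in for real work
        step += 1
        if step % ckpt_every == 0:      # periodic checkpoint
            with open(CKPT, "wb") as f:
                pickle.dump((step, total), f)
    return total

print(run())  # -> 45 on a fresh run (sum of 0..9)
```

If the process is killed mid-run, the next invocation of `run()` picks up at the last multiple of `ckpt_every` rather than at step 0.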
Checkpoint
- Address space: the Condor checkpoint library rewrites object files and writes a checkpoint to a file on SIGUSR2
- Files: assumption: a common file system
Correctness
File access:
- Read-only: no problems
- Writes: possible inconsistency if multiple processes access; inconsistency across checkpoints?
- Need a new file access algorithm
Solution: Individual Versions
File access algorithm:
- On open:
  - If first open: for a read, do nothing; for a write, create a local copy and set a mapping
  - Else: if mapped, access the mapped file; if a write, create a local copy and set a mapping
- On close: preserve the mapping
- Commit point: on completion of the computation
- Checkpoint: includes the mapped files
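The individual-versions algorithm can be sketched as follows (a sketch only: the `VersionedFiles` class, the `.v1` naming scheme, and commit-by-rename are illustrative assumptions, not the system's actual implementation):

```python
import os
import shutil

class VersionedFiles:
    """Reads go to the shared file until the first write, which creates a
    private local copy and records a mapping; later opens follow the mapping.
    At the commit point the private versions replace the shared files."""
    def __init__(self):
        self.mapping = {}  # shared path -> private local copy

    def open_path(self, path, mode):
        """Return the path this process should actually open."""
        if path in self.mapping:            # "if mapped, access the mapped file"
            return self.mapping[path]
        if "w" in mode or "a" in mode:      # first write: copy and map
            local = path + ".v1"            # hypothetical naming scheme
            if os.path.exists(path):
                shutil.copy(path, local)
            self.mapping[path] = local
            return local
        return path                         # first read: nothing to do

    def commit(self):
        """On completion of the computation, publish the private versions."""
        for shared, local in self.mapping.items():
            if os.path.exists(local):
                os.replace(local, shared)
        self.mapping.clear()
```

Because the mapping is preserved across close and included in the checkpoint, a restarted process keeps writing to the same private version, which avoids the cross-checkpoint inconsistency described above.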
Being More Fancy
- Security attacks: the civilian-to-military transition
- Hide yourself from ps; re-fork periodically to avoid detection
Conclusion
- LE is VERY HARD; don't take it for a course project!
- Does our system work? 16 nodes: YES; 32 nodes: NO
- Quite reliable
Future Direction
- Robustness
- Extension to parallel programs: re-write send/recv calls, routing issues
- Scalability issues? A hierarchical design?
References
- Cohen, F. B., "A Case for Benevolent Viruses", http://www.all.net/books/integ/goodvcase.html
- M. Litzkow and M. Solomon, "Supporting Checkpointing and Process Migration outside the UNIX Kernel", Usenix Conference Proceedings, San Francisco, CA, January 1992.
- Gurdip Singh, "Leader Election in Complete Networks", PODC '92.
Implementation Arch.
[Diagram: inside a worm, Communicator, Checkpointer, Dispatcher, and Dequeuer modules share a work queue (remove, prepend, append operations); the Checkpointer writes the checkpoint of the computation]
Parallel Programs
- Communication: connectivity across failures; re-write send/recv socket calls
- Limitations of the Master-Worker model? Not really!
- Communication: checkpoint markers; buffer all data between checkpoint markers; rerouting with the help of the master
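The marker-based buffering idea can be sketched like this (a sketch: the `BufferedChannel` class is illustrative, and rerouting through the master is not modeled). Every message sent since the last checkpoint marker is retained, so it can be replayed to a peer that rolls back to that checkpoint after a failure:

```python
class BufferedChannel:
    """Sender-side buffer keyed to checkpoint markers."""
    def __init__(self):
        self.delivered = []      # everything ever sent
        self.since_marker = []   # messages after the last checkpoint marker

    def send(self, msg):
        self.delivered.append(msg)
        self.since_marker.append(msg)

    def marker(self):
        """Checkpoint marker: the peer's checkpoint now covers earlier data,
        so the buffer before this point can be discarded."""
        self.since_marker.clear()

    def replay(self):
        """Messages to resend if the peer restarts from its last checkpoint."""
        return list(self.since_marker)

ch = BufferedChannel()
ch.send("a"); ch.send("b")
ch.marker()          # peer checkpoints here
ch.send("c")
print(ch.replay())   # -> ['c']
```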