Post on 19-Dec-2015
Argonne National Laboratory School of Computing and SCI Institute, University of Utah
Formal Verification of Programs That Use MPI One-Sided Communication
Salman Pervez, Ganesh Gopalakrishnan, Robert M. Kirby
School of Computing, University of Utah
Rajeev Thakur, William Gropp
Mathematics and Computer Science Division, Argonne National Laboratory
• The demand for concurrent software is increasing.
• Concurrent algorithms are notoriously hard to design and verify.
• Formal methods, and in particular finite-state model checking, provide a means of reasoning about concurrent algorithms.
• Principal advantages of the model checking approach:
– provides a formal framework for reasoning
– allows coverage: examination of all possible process interleavings
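To make "examination of all possible process interleavings" concrete, here is a minimal sketch that is not taken from the talk. It assumes a toy system of two processes, A and B, that each read a shared counter and then write back the value plus one. The helper names `interleavings` and `run` are hypothetical. The sketch enumerates every interleaving and counts those in which an update is lost.

```python
from itertools import combinations

def interleavings(n_a, n_b):
    """Yield every order in which n_a steps of process A and n_b steps of
    process B can be interleaved, as tuples of 'A'/'B' labels."""
    total = n_a + n_b
    for a_slots in combinations(range(total), n_a):
        yield tuple('A' if i in a_slots else 'B' for i in range(total))

def run(schedule):
    """Toy system (an assumption): each process reads the shared counter,
    then writes back the value it read plus one."""
    shared = 0
    local = {'A': None, 'B': None}
    for who in schedule:
        if local[who] is None:      # first step of this process: read
            local[who] = shared
        else:                       # second step: write the increment
            shared = local[who] + 1
    return shared

results = {s: run(s) for s in interleavings(2, 2)}
bad = [s for s, final in results.items() if final != 2]
print(len(results), "interleavings,", len(bad), "violate counter == 2")
```

A test run exercises only one of these schedules. The exhaustive loop visits all six, which is the kind of coverage the bullet above refers to.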
Thesis of the Talk
Thesis: If finite-state models are created and exhaustively analyzed for desired formal properties, robust
algorithms and implementations will result.
What is Model Checking?
Navier-Stokes Equations are a mathematical model of fluid flow physics
“V&V” – Validation and Verification
“Validate Models, Verify Codes”
“Formal models,” which translate and abstract algorithms and implementations, can be generated either automatically or by a modeler.
Model Checking: History and Current Practice
• History
– Approach invented around 1981 by Clarke and Emerson, and by Queille and Sifakis
– Widely used in hardware verification since the 1990s
– Use in software verification is the current rage
• Notable Successes
– Bell Labs: Telephone Switch Software Verification
– NASA: Concurrent Java Program Verification
– Microsoft: Device Driver Verification
• Applications in HPC by others:
– Siegel and Avrunin: MPI two-sided communication programs
– Matlin, Lusk, McCune: Verifying parts of MPD
MPI One-Sided Communication
• MPI One-Sided Constructs Examined:
– MPI_Win_lock
– MPI_Win_unlock
– MPI_Put
– MPI_Get
• The desired atomicity is provided by the constructs MPI_Win_lock / MPI_Win_unlock
• Once the lock is relinquished, data values can no longer be trusted
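The lock/put/get/unlock pattern can be sketched without MPI. The class below is a toy stand-in and an assumption of this sketch, not an MPI binding: `SimWindow` and `publish_and_read_others` are hypothetical names, and a `threading.Lock` plays the role of the passive-target lock. The get deliberately skips the caller's own slot, because the epoch reads the values of other processes only.

```python
import threading

class SimWindow:
    """Toy stand-in for the window memory exposed by one target process.
    Not MPI: the methods only mimic the shape of MPI_Win_lock,
    MPI_Win_unlock, MPI_Put and MPI_Get."""
    def __init__(self, size):
        self._mem = [0] * size
        self._lock = threading.Lock()

    def lock(self):
        self._lock.acquire()

    def unlock(self):
        self._lock.release()

    def put(self, offset, values):
        self._mem[offset:offset + len(values)] = values

    def get(self, offset, count):
        return list(self._mem[offset:offset + count])

def publish_and_read_others(win, rank, triple, nprocs):
    """One lock/unlock epoch: write my 3-value slot, read every other slot."""
    win.lock()
    try:
        win.put(3 * rank, triple)
        before = win.get(0, 3 * rank)
        after = win.get(3 * (rank + 1), 3 * (nprocs - rank - 1))
    finally:
        win.unlock()
    # The returned values are a snapshot; they may be stale once unlocked.
    return before + after

win = SimWindow(3 * 2)
print(publish_and_read_others(win, 1, [1, 3, 5], 2))
print(publish_and_read_others(win, 0, [1, 3, 5], 2))
```

In real MPI the put and get are only guaranteed to have completed when MPI_Win_unlock returns, which this in-memory sketch does not model.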
Test Case: Byte-Range Algorithm
• Algorithm implemented using MPI one-sided communication (with passive-target lock-unlock synchronization) for coordinating a collection of parallel processes contending for byte-range locks.
Notes Concerning Algorithm:
• To acquire a lock, a process must checkpoint the global state by ‘simultaneously’ indicating its intent and reading others’ status.
• When the lock owner releases the lock, it wakes up all conflicting ‘sleeping’ processes.
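The notion of a "conflicting" process can be written as a small predicate. This is a hedged sketch: the function name `conflicts`, the `(flag, start, end)` slot layout, and the assumption that byte ranges include both end points are choices made here for illustration.

```python
def conflicts(my_range, other_slot):
    """Return True if another process's published slot conflicts with my range.

    my_range   -- (start, end) of the bytes I want, both ends inclusive (assumed)
    other_slot -- (flag, start, end) as published by the other process
    """
    flag, start, end = other_slot
    if flag == 0:                       # the other process is not contending
        return False
    my_start, my_end = my_range
    return start <= my_end and my_start <= end

print(conflicts((5, 6), (1, 3, 5)))     # ranges share byte 5
print(conflicts((3, 5), (1, 6, 8)))     # disjoint ranges
print(conflicts((3, 5), (0, 3, 5)))     # same range, but flag is 0
```

This predicate covers the acquire-side check only. A releasing process would compare ranges without looking at the flag, since it needs to wake processes that have already cleared theirs.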
Lock Acquire
lock_acquire (start, end) {
  /* Stage 1 */
  val[0] = 1; /* flag */ val[1] = start; val[2] = end;
  while (1) {
    lock_win
    place val in win
    get values of other processes from win
    unlock_win
    for all i, if (Pi conflicts with my range)
      conflict = 1;
    /* Stage 2 */
    if (conflict) {
      val[0] = 0
      lock_win
      place val in win
      unlock_win
      MPI_Recv(ANY_SOURCE)
    }
    else {
      /* lock is acquired */
      break;
    }
  } // end while
}
Window (one slot per process, P0 first; columns: flag start end):
0 -1 -1
0 -1 -1
0 -1 -1
Lock Release
lock_release (start, end) {
  val[0] = 0; /* flag */ val[1] = -1; val[2] = -1;
  lock_win
  place val in win
  get values of other processes from win
  unlock_win
  for all i, if (Pi conflicts with my range)
    MPI_Send(Pi);
}
Window (one slot per process, P0 first; columns: flag start end):
0 -1 -1
0 -1 -1
0 -1 -1
Example 1: Demonstration of Lock Acquire/Release Strategy
Process 0: lock_acquire(3,5); lock_release()
Process 1: lock_acquire(3,5); lock_release()
Each step shows the window as (flag, start, end) slots, P0 first.
Step 1. Window: (1,3,5) (0,-1,-1) (0,-1,-1) (0,-1,-1)
Step 2. Window: (1,3,5) (1,3,5) (0,-1,-1) (0,-1,-1)
  P1: Deduces Conflict – Stage 1
Step 3. Window: (1,3,5) (0,3,5) (0,-1,-1) (0,-1,-1)
  P1: Deduces Conflict – Stage 2; Blocks on Receive
Step 4. Window: (0,-1,-1) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Send Signal to P1
  P1: Deduces Conflict – Stage 2; Blocks on Receive
Step 5. Window: (0,-1,-1) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Send Signal to P1
  P1: Receives Signal; Retry Stage 1
Step 6. Window: (0,-1,-1) (1,3,5) (0,-1,-1) (0,-1,-1)
Step 7. Window: (0,-1,-1) (0,-1,-1) (0,-1,-1) (0,-1,-1)
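The Example 1 trace can be replayed with a short sequential simulation. This is a sketch under assumptions, not the authors' code: the window is a Python list, the messages are per-process lists, byte ranges include both end points, and the flag is set again on every retry of Stage 1 (which is what the window contents in the trace suggest). The helper names `stage1`, `stage2` and `release` are hypothetical.

```python
FREE = (0, -1, -1)

def overlaps(a_start, a_end, b_start, b_end):
    return a_start <= b_end and b_start <= a_end   # inclusive ends (assumed)

def stage1(win, me, start, end):
    """Set my flag, publish my range, return the processes I conflict with."""
    win[me] = (1, start, end)
    return [p for p, (flag, s, e) in enumerate(win)
            if p != me and flag == 1 and overlaps(start, end, s, e)]

def stage2(win, me, start, end):
    """Clear my flag but leave my range visible; the caller then waits."""
    win[me] = (0, start, end)

def release(win, me, mailbox):
    """Reset my slot and signal every process whose range conflicts with mine."""
    _, start, end = win[me]
    win[me] = FREE
    for p, (_, s, e) in enumerate(win):
        if p != me and s != -1 and overlaps(start, end, s, e):
            mailbox[p].append(me)

win = [FREE] * 4
mailbox = [[] for _ in win]
trace = []

stage1(win, 0, 3, 5); trace.append(list(win))               # P0 acquires
conflicts = stage1(win, 1, 3, 5); trace.append(list(win))   # P1 sees P0
stage2(win, 1, 3, 5); trace.append(list(win))               # P1 will block
release(win, 0, mailbox); trace.append(list(win))           # P0 signals P1
waker = mailbox[1].pop(0)                                   # P1 receives
retry = stage1(win, 1, 3, 5); trace.append(list(win))       # P1 acquires
release(win, 1, mailbox); trace.append(list(win))           # P1 releases

for step in trace:
    print(step)
```

Because the steps run one after another in a fixed order, this replay shows only the intended behaviour. It says nothing about other interleavings.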
Modeling in Promela
Example Promela Code for lock_release
• C-like structure
• Powerful abstractions like channels
inline MPI_Win_lock(proc_id) {
  /* try sending a message on a channel of size 1,
     will block if a message is already in the queue. */
  lock_chan!proc_id;
}
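The same idea, a lock modeled as a channel that holds at most one message, can be sketched in Python with a bounded queue. The names `win_lock`, `win_unlock` and `try_win_lock` are hypothetical. Modeling unlock as removing the message is an assumption, since the slide shows only the lock side.

```python
import queue

lock_chan = queue.Queue(maxsize=1)   # analogue of a Promela channel of size 1

def win_lock(proc_id):
    """Blocks while another process's id is already in the channel."""
    lock_chan.put(proc_id)

def win_unlock():
    """Assumed counterpart: remove the holder's id, freeing the slot."""
    return lock_chan.get_nowait()

def try_win_lock(proc_id):
    """Non-blocking variant, handy for showing when win_lock would block."""
    try:
        lock_chan.put_nowait(proc_id)
        return True
    except queue.Full:
        return False

win_lock(0)
print(try_win_lock(1))   # the channel is full, so process 1 would block
print(win_unlock())      # id of the process that held the lock
```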
Example 2: Demonstration of Lock Acquire/Release Limitation
Process 0: lock_acquire(3,5); lock_release(); lock_acquire(3,5)
Process 1: lock_acquire(3,5)
Each step shows the window as (flag, start, end) slots, P0 first.
Step 1. Window: (1,3,5) (0,-1,-1) (0,-1,-1) (0,-1,-1)
Step 2. Window: (1,3,5) (1,3,5) (0,-1,-1) (0,-1,-1)
  P1: Deduces Conflict – Stage 1
Step 3. Window: (0,-1,-1) (1,3,5) (0,-1,-1) (0,-1,-1)
  P0: Send Signal to P1
  P1: Deduces Conflict – Stage 1
Step 4. Window: (1,3,5) (1,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Deduces Conflict – Stage 1
Step 5. Window: (1,3,5) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Deduces Conflict – Stage 2; Block on Receive
Step 6. Window: (1,3,5) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Receive Signal; Retry Stage 1
Step 7. Window: (1,3,5) (1,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Deduces Conflict – Stage 1
Step 8. Window: (0,3,5) (1,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 2; Block on Receive
  P1: Deduces Conflict – Stage 1
Step 9. Window: (0,3,5) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 2; Block on Receive
  P1: Deduces Conflict – Stage 2; Block on Receive
Step 10. Window: (0,3,5) (0,3,5) (0,-1,-1) (0,-1,-1)
  P0: Deduces Conflict – Stage 2; Block on Receive
  P1: Deduces Conflict – Stage 2; Block on Receive
  DEADLOCK
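A deadlock of this kind can be found mechanically by exploring every interleaving of a small model. The sketch below is a hand-written explicit-state search in Python, not the authors' Promela model. It rests on several assumptions: each lock/unlock epoch is one atomic step, all requests use the byte range (3,5), messages are counted per receiver, and each process releases what it acquires so that a blocked process is not left waiting merely because the program ended. A deadlock is a state in which some process is unfinished and no step is enabled.

```python
from collections import deque

RANGE = (3, 5)                       # every request uses this byte range
FREE = (0, -1, -1)
PROGRAMS = (("acq", "rel", "acq", "rel"),   # process 0
            ("acq", "rel"))                 # process 1

def successors(state):
    """Yield (label, next_state) for every enabled atomic step."""
    win, pcs, mail = state
    for me, (op, phase) in enumerate(pcs):
        if op == len(PROGRAMS[me]):
            continue                                 # process has finished
        w, p, m = list(win), list(pcs), list(mail)
        others = [i for i in range(len(pcs)) if i != me]
        if PROGRAMS[me][op] == "rel":
            label = f"P{me} release"
            w[me] = FREE
            for i in others:                         # signal conflicting ranges
                if win[i][1:] == RANGE:
                    m[i] += 1
            p[me] = (op + 1, "s1")
        elif phase == "s1":                          # stage 1 epoch
            label = f"P{me} stage 1"
            w[me] = (1,) + RANGE
            conflict = any(win[i] == (1,) + RANGE for i in others)
            p[me] = (op, "s2") if conflict else (op + 1, "s1")
        elif phase == "s2":                          # stage 2 epoch
            label = f"P{me} stage 2"
            w[me] = (0,) + RANGE
            p[me] = (op, "recv")
        elif mail[me] > 0:                           # blocking receive
            label = f"P{me} receive"
            m[me] -= 1
            p[me] = (op, "s1")
        else:
            continue                                 # receive is not enabled
        yield label, (tuple(w), tuple(p), tuple(m))

def find_deadlock():
    """Breadth-first search; returns the step labels leading to a deadlock."""
    init = ((FREE,) * 2, ((0, "s1"),) * 2, (0, 0))
    parent = {init: None}
    todo = deque([init])
    while todo:
        state = todo.popleft()
        nxt = list(successors(state))
        finished = all(op == len(PROGRAMS[i])
                       for i, (op, _) in enumerate(state[1]))
        if not nxt and not finished:
            trace = []
            while parent[state] is not None:
                state, label = parent[state]
                trace.append(label)
            return trace[::-1]
        for label, s in nxt:
            if s not in parent:
                parent[s] = (state, label)
                todo.append(s)
    return None

trace = find_deadlock()
print(trace)
```

The search returns a shortest sequence of steps that ends with both processes waiting on a receive and no message pending. A dedicated model checker adds property specification and state-space reductions on top of this basic reachability loop.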
Observations After Model Checking
• P0 releases lock before it can see that P1 will be blocked.
• There is no way for P0 to figure out whether P1 merely wants the lock or is actually blocked.
• Multiple unmatched sends can occur (example to follow)
Example 3: Demonstration of Lock Acquire/Release Limitation
Process 0: lock_acquire(3,5); lock_release()
Process 1: lock_acquire(6,8); lock_release()
Process 2: lock_acquire(5,6)
Each step shows the window as (flag, start, end) slots in the order P0, P1, P2.
Step 1. Window: (1,3,5) (1,6,8) (0,-1,-1) (0,-1,-1)
Step 2. Window: (1,3,5) (1,6,8) (1,5,6) (0,-1,-1)
  P2: Deduces Conflict – Stage 1
Step 3. Window: (1,3,5) (1,6,8) (0,5,6) (0,-1,-1)
  P2: Deduces Conflict – Stage 2; Block on Receive
Step 4. Window: (0,-1,-1) (0,-1,-1) (0,5,6) (0,-1,-1)
  P0: Send Signal to P2
  P1: Send Signal to P2
  P2: Deduces Conflict – Stage 2; Block on Receive
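The unmatched send in this three-process scenario can be counted with a few lines of Python. This is an illustrative sketch: the dictionaries, the `release` helper and the inclusive-range test are assumptions, and the starting state is the point at which P2 has cleared its flag and posted one receive.

```python
def overlaps(a, b):
    """Inclusive byte ranges (assumed): (start, end) pairs."""
    return a[0] <= b[1] and b[0] <= a[1]

# State once P2 has reached Stage 2: flag cleared, range still published.
win = {0: (1, 3, 5), 1: (1, 6, 8), 2: (0, 5, 6)}
mailbox = {0: [], 1: [], 2: []}
receives_posted = {0: 0, 1: 0, 2: 1}     # P2 posted exactly one receive

def release(me):
    """Reset my slot, then signal every process with a conflicting range."""
    my_range = win[me][1:]
    win[me] = (0, -1, -1)
    for p, (_, s, e) in win.items():
        if p != me and s != -1 and overlaps(my_range, (s, e)):
            mailbox[p].append(me)

release(0)
release(1)
unmatched = {p: len(mailbox[p]) - receives_posted[p] for p in win}
print(mailbox[2], unmatched)
```

P2 receives two signals but consumes only one. The leftover message can later satisfy a receive that was meant to wait for a different release.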
Proposed Solution 1
• Main idea: Distinguish between processes that want the lock and those that are blocked.
• Three possible flag values:
– 0 = I do not have the lock
– 1 = I have the lock
– 2 = I am trying for the lock
• If a process wants the lock, but finds another conflicting process with a flag value of 2, it must wait until this value changes to either 1 or 0.
• We have added more certainty to the algorithm but taken a possible performance hit and possible livelock.
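The three-valued flag rule can be sketched as a decision function. Only the "wait on flag 2" rule comes from the slide. The function name `decide`, the return labels, and the choice to block when a conflicting process holds the lock (carried over from the original algorithm) are assumptions of this sketch.

```python
NO_LOCK, HAS_LOCK, TRYING = 0, 1, 2

def overlaps(a, b):
    """Inclusive byte ranges (assumed): (start, end) pairs."""
    return a[0] <= b[1] and b[0] <= a[1]

def decide(window, me, my_range):
    """Return 'wait', 'block' or 'acquire' for process `me`.

    window -- list of (flag, start, end) slots, one per process
    """
    saw_holder = False
    for p, (flag, start, end) in enumerate(window):
        if p == me or start == -1 or not overlaps(my_range, (start, end)):
            continue
        if flag == TRYING:
            return "wait"            # outcome unknown until flag becomes 1 or 0
        if flag == HAS_LOCK:
            saw_holder = True
    return "block" if saw_holder else "acquire"

print(decide([(2, 3, 5), (2, 3, 5)], 1, (3, 5)))
print(decide([(1, 3, 5), (2, 3, 5)], 1, (3, 5)))
print(decide([(0, -1, -1), (2, 3, 5)], 1, (3, 5)))
```

Two contenders that both publish flag 2 are each told to wait. How they back off and retry is not specified on the slide, which is where the performance and livelock concerns come in.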
Proposed Solution 2
• Main idea: The process about to be blocked picks who will wake it up and indicates so by writing to shared memory.
• Once processes declare their intentions globally, deadlock can be avoided.
• For there to be deadlock, a dependency cycle must exist.
• The last process to complete this cycle will know about it and must not do so.
Window (one slot per process, P0 first; columns: flag start end pick):
0 -1 -1 -1
0 -1 -1 -1
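The cycle check that the last process must perform can be sketched as a walk along the published picks. The function name `closes_cycle` and the list representation are assumptions. `picks[p]` stands for the pick field of process p, with -1 meaning that no waker has been chosen.

```python
def closes_cycle(picks, me, candidate):
    """Would process `me` picking `candidate` as its waker close a cycle?

    picks[p] is the process that p has chosen to wake it up, or -1 if none.
    Follow the chain of picks from `candidate`; reaching `me` means that
    every process on the chain would be waiting, directly or indirectly,
    on `me`, so `me` must not block on `candidate`.
    """
    seen = set()
    p = candidate
    while p != -1 and p not in seen:
        if p == me:
            return True
        seen.add(p)
        p = picks[p]
    return False

picks = [-1, 0, -1]                  # P1 is blocked and has picked P0
print(closes_cycle(picks, 0, 1))     # P0 picking P1 would complete a cycle
print(closes_cycle(picks, 2, 1))     # P2 picking P1 would not
```

The check has to run inside the same lock/unlock epoch in which the pick is written. Otherwise two processes could each pass the check and then complete the cycle together.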
Example 3: Demonstration of Lock Acquire/Release Proposed Solution 2
Process 0: lock_acquire(3,5); lock_release(); lock_acquire(3,5)
Process 1: lock_acquire(3,5)
Each step shows the window as (flag, start, end, pick) slots, P0 first.
Step 1. Window: (1,3,5,-1) (0,-1,-1,-1) (0,-1,-1,-1)
Step 2. Window: (1,3,5,-1) (1,3,5,-1) (0,-1,-1,-1)
  P1: Deduces Conflict – Stage 1
Step 3. Window: (0,-1,-1,-1) (1,3,5,-1) (0,-1,-1,-1)
  P1: Deduces Conflict – Stage 1
Step 4. Window: (1,3,5,-1) (1,3,5,-1) (0,-1,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Deduces Conflict – Stage 1
Step 5. Window: (1,3,5,-1) (0,3,5,0) (0,-1,-1,-1)
  P0: Deduces Conflict – Stage 1
  P1: Deduces Conflict – Stage 2; Block on Receive
Step 6. Window: (1,3,5,-1) (0,3,5,0) (0,-1,-1,-1)
  P0: No Conflict – Stage 1
  P1: Deduces Conflict – Stage 2; Block on Receive
Step 7. Window: (0,3,5,-1) (0,3,5,0) (0,-1,-1,-1)
  P0: Deduces Deadlock – Stage 2; Reset to Stage 1
  P1: Deduces Conflict – Stage 2; Block on Receive
Discussion and Future Work
“Execution Checking”
In current practice, concrete executions on a few diverse platforms are often used to verify algorithms/codes.
Consequence: Many feasible executions might not be manifested.
“Model Checking”
Model checking forces all executions of a judiciously down-scaled model to be examined.
Current focus of our research: minimize modeling effort and error.
Funding Acknowledgements:
• NSF (CSR–SMA: Toward Reliable and Efficient Message Passing Software Through Formal Analysis)
• Microsoft (Formal Analysis and Code Generation Support for MPI)
• Office of Science – Department of Energy
Summary
• Paradigms such as one-sided MPI and threading create a plethora of execution possibilities, many of which might be algorithmically fatal yet lie dormant at testing time.
• Model checking provides a formal and practical means of reasoning about all possible executions as part of the design, verification and optimization process.
Closing Question (“Food for Thought”):
• Can one come up with safe usages (i.e. easier to verify yet not overly restrictive) of one-sided communication?