Dynamic Analysis of Multithreaded Java Programs
Dr. Abhik Roychoudhury
National University of Singapore
Program Debugging
Age-old activity
Simple bug checks are handled by the compiler, e.g. type errors
The compiler does not detect violations of behavioral invariants
– At control location L1, the value of x cannot be 0
Other errors for different classes of programs
– Multithreaded (concurrent or parallel)
– Real-time
Real-time Constraints
Example:
– Data arrives as a burst every 10 ms
– Every burst is a set of records with a primary key
– Sort the primary keys before the next dataset arrives
Need to guarantee that our sorting program always completes within 10 ms
– Worst Case Execution Time ≤ 10 ms
Worst Case Execution Time should consider all possible inputs
– Exponentially many inputs based on permutations
– Need an efficient analysis technique
Real-time Constraints
Worst Case Execution Time estimation is a static analysis technique.
– Performed at compile-time by analyzing the control flow graph of the program
– No data is collected/analyzed during run-time.
We are trying to prove a property about all possible runs of the program.
Currently working with Li Xianfeng (Ph.D. student)
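As a rough illustration of estimating Worst Case Execution Time from the control flow graph alone, the sketch below computes the longest path through an acyclic graph of basic blocks. The class name WcetSketch, the per-block costs, and the assumption that blocks are numbered in topological order are all made up for this example; a real analysis would also need loop bounds and a timing model of the hardware.

```java
import java.util.Arrays;
import java.util.List;

class WcetSketch {
    /**
     * Longest-path cost from block 0 (entry) to the last block (exit) of an
     * acyclic control flow graph. Assumption: blocks are numbered in
     * topological order, so every edge goes from a lower to a higher index.
     */
    static int worstCaseCost(int[] blockCost, List<List<Integer>> successors) {
        int n = blockCost.length;
        int[] worst = new int[n];
        Arrays.fill(worst, -1);          // -1 marks "not reachable from entry"
        worst[0] = blockCost[0];
        for (int b = 0; b < n; b++) {
            if (worst[b] < 0) continue;  // skip unreachable blocks
            for (int s : successors.get(b)) {
                // keep the most expensive way of reaching successor s
                worst[s] = Math.max(worst[s], worst[b] + blockCost[s]);
            }
        }
        return worst[n - 1];
    }
}
```

For an if-then-else with block costs 2 (condition), 5 (then), 3 (else) and 1 (join), the estimate is 2 + 5 + 1 = 8, obtained without running the program on any input.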
Multithreaded Programs
Threads do computation AND communication
– e.g. update of shared resources in a critical section
Communication among threads
– By reading/writing shared variables/objects (e.g. the Java programming language)
– By explicitly sending/receiving messages (e.g. the Message Passing Interface, MPI)
Multithreaded program execution platform
– Concurrent (several threads run on a single processor with a scheduler)
– Parallel (threads run on different processors)
Static Analysis and Verification
Verification of behavioral properties involves
– Constructing a transition system to show the state changes as the program executes
– Traversing the transition system to check all behaviors
– Roughly corresponds to an efficient version of exhaustive program simulation for all possible inputs
More complicated if the system is unbounded in size
– E.g. the number of threads in a program may be unbounded, but always finite
Currently working with Irina Mariuca Gheorghita (Ph.D. student)
Dynamic Analysis
Given an execution trace of a multi-threaded program
– Analyze the behavior shown in that trace.
Corresponds to the natural notion of debugging
– Debug a program based on a “test case” (the execution trace)
Lower complexity than verification techniques
– No need to explore a large state space graph
But need to collect/store/analyze huge traces
Cyclic debugging
Normal debugging activity is cyclic
1. Try a test case
2. Encounter a “bug”
3. Run the program again with the test case
4. Use breakpoints etc. and try to locate the bug.
5. If not successful, go to 3.
Multi-threaded programs are non-deterministic
How to re-generate the trace?
Simple Example
Initially: x = y = 0
Thread 1        Thread 2
Lock            Lock
x = 1           if (x == 1)
Unlock              y = 1
                else y = 2
                Unlock
Threads communicating via shared variables
All shared variable accesses are protected
Execution Trace 1
User executes as follows:
– Thread 1 selected by scheduler.
– Lock; x = 1; Unlock executed by Thread 1.
– Thread 2 selected by scheduler.
– Thread 2 executes and y is set to 1.
User suspects the result y = 1
Tries to run the program again with x = y = 0
Execution Trace 2
Program executes again
– Scheduler now selects Thread 2 first
– if (x == 1) y = 1 else y = 2 results in y = 2
– Thread 1 is executed now and x is set to 1
User tries to debug based on this trace
But this is a different trace with different results.
Results from the non-determinism of the scheduler.
Non-deterministic execution in parallel platforms also
– Relative processor speeds
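The two traces can be reproduced deterministically with a small sketch in which the scheduler's choice is passed in as a parameter instead of being left to the runtime. The class and method names (SimpleExample, runSchedule) are invented for illustration; the two critical sections follow the Simple Example slide.

```java
import java.util.concurrent.locks.ReentrantLock;

class SimpleExample {
    private final ReentrantLock lock = new ReentrantLock();
    private int x = 0, y = 0;   // shared variables, initially x = y = 0

    /** Thread 1: Lock; x = 1; Unlock */
    void thread1Body() {
        lock.lock();
        try { x = 1; } finally { lock.unlock(); }
    }

    /** Thread 2: Lock; if (x == 1) y = 1 else y = 2; Unlock */
    void thread2Body() {
        lock.lock();
        try { y = (x == 1) ? 1 : 2; } finally { lock.unlock(); }
    }

    /** Simulates the scheduler's choice of which thread enters its critical section first. */
    static int runSchedule(boolean thread1First) {
        SimpleExample p = new SimpleExample();
        if (thread1First) { p.thread1Body(); p.thread2Body(); }
        else              { p.thread2Body(); p.thread1Body(); }
        return p.y;
    }
}
```

Here runSchedule(true) corresponds to Execution Trace 1 and yields y = 1, while runSchedule(false) corresponds to Execution Trace 2 and yields y = 2. With real threads the choice is made by the scheduler, which is why simply re-running the program does not reproduce a trace.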
Problems with Dynamic Analysis
Multithreaded programs need to store the trace (or a portion of it) for future analysis.
– In sequential programs it is enough to store the inputs (test case) generating the trace.
Dynamic analysis is w.r.t. a specific trace
– How to generate “representative” traces? (How to find “good” test cases?)
– How to compactly store traces for offline analysis?
– How to regenerate partially stored traces?
– Offline / post-mortem analysis techniques
Finding representative traces
Describe the invariant property you want to preserve
– At program location L1, x should be greater than zero
– At all locations, x should not be equal to zero
Construct a state transition graph of the program automatically from the program
Efficiently traverse the graph to find possible violations of the property.
Finding representative tracesFinding representative traces
If any violation exists, your search producesIf any violation exists, your search produces– A counterexample traceA counterexample trace– A trace at the end of which the invariant A trace at the end of which the invariant
property/assertion is violated.property/assertion is violated.
Use the counterexample trace as a Use the counterexample trace as a representative trace to guide debugging.representative trace to guide debugging.
Currently working with Daniel Hogberg (Ph.D. Currently working with Daniel Hogberg (Ph.D. student)student)
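A minimal sketch of such a search, applied to the two-thread Simple Example: the state transition graph is explored breadth-first, and the first state violating the invariant is returned together with the steps that led to it. The invariant "y is never 2" is a hypothetical property chosen only for illustration, and each critical section is modeled as a single atomic step because it runs under the lock. The names CounterexampleSearch, State and the step labels "T1"/"T2" are assumptions of this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class CounterexampleSearch {
    /** Program state: which threads have finished, the shared variables, and the path taken. */
    static final class State {
        final boolean done1, done2;
        final int x, y;
        final List<String> trace;

        State(boolean done1, boolean done2, int x, int y, List<String> trace) {
            this.done1 = done1; this.done2 = done2; this.x = x; this.y = y; this.trace = trace;
        }

        /** Identity of the state, ignoring how it was reached. */
        String key() { return done1 + "," + done2 + "," + x + "," + y; }

        State step(String label, boolean d1, boolean d2, int nx, int ny) {
            List<String> t = new ArrayList<>(trace);
            t.add(label);
            return new State(d1, d2, nx, ny, t);
        }
    }

    /** Hypothetical invariant used for illustration: y is never 2. */
    static boolean invariantHolds(State s) { return s.y != 2; }

    /** Returns a shortest counterexample trace, or null if the invariant always holds. */
    static List<String> findCounterexample() {
        Queue<State> frontier = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        State init = new State(false, false, 0, 0, new ArrayList<>());
        frontier.add(init);
        seen.add(init.key());
        while (!frontier.isEmpty()) {
            State s = frontier.remove();
            if (!invariantHolds(s)) return s.trace;
            List<State> next = new ArrayList<>();
            // Thread 1's critical section: x = 1
            if (!s.done1) next.add(s.step("T1", true, s.done2, 1, s.y));
            // Thread 2's critical section: if (x == 1) y = 1 else y = 2
            if (!s.done2) next.add(s.step("T2", s.done1, true, s.x, s.x == 1 ? 1 : 2));
            for (State n : next) {
                if (seen.add(n.key())) frontier.add(n);   // visit each state once
            }
        }
        return null;
    }
}
```

For this program the search reports the one-step trace in which Thread 2 runs first, which is exactly the schedule of Execution Trace 2.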
Compactly storing traces
Treat the program trace as a string s over a pre-defined alphabet.
Create a hierarchical representation of the program trace which is more compact
– (catches repetitions of chunks of code)
The compact representation should be created online, as the program is executing
We have developed a scheme for compressing Java bytecode for multi-threaded programs.
Working with Wang Tao (Ph.D. student) and Ankit Goel (visiting student)
Compressed Path - Example
[Figure: Control Flow Graph of an if-then statement, with nodes 1, 2 and 3]
Uncompressed path: 123123
Compressed representation:
S → AA
A → 123
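To make the hierarchical representation concrete, the sketch below stores the rules S → AA and A → 123 and shows two operations: expanding a symbol back into the path it stands for, and looking up a single position of the path by descending through the rules instead of expanding everything. The class TraceGrammar and its methods are hypothetical names for this example, not the scheme from the referenced work, and the rules are entered by hand rather than built online.

```java
import java.util.HashMap;
import java.util.Map;

class TraceGrammar {
    // Maps a non-terminal (e.g. 'S') to its right-hand side (e.g. "AA").
    private final Map<Character, String> rules = new HashMap<>();

    void addRule(char nonTerminal, String body) { rules.put(nonTerminal, body); }

    /** Fully expands a symbol into the path it derives. */
    String expand(char symbol) {
        String body = rules.get(symbol);
        if (body == null) return String.valueOf(symbol);   // terminal: a CFG node
        StringBuilder out = new StringBuilder();
        for (char c : body.toCharArray()) out.append(expand(c));
        return out.toString();
    }

    /** Length of a symbol's expansion, computed without building the expansion. */
    int expandedLength(char symbol) {
        String body = rules.get(symbol);
        if (body == null) return 1;
        int total = 0;
        for (char c : body.toCharArray()) total += expandedLength(c);
        return total;   // a practical version would cache this per rule
    }

    /** Returns the path element at a 0-based index by walking down the rules. */
    char symbolAt(char symbol, int index) {
        String body = rules.get(symbol);
        if (body == null) return symbol;
        for (char c : body.toCharArray()) {
            int len = expandedLength(c);
            if (index < len) return symbolAt(c, index);
            index -= len;   // skip this child's whole expansion
        }
        throw new IndexOutOfBoundsException("index beyond end of path");
    }
}
```

With the two rules from the slide, expand('S') gives back 123123, and symbolAt('S', 4) returns node 2 while only visiting the second occurrence of A.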
Post-mortem analysis of traces
Compressed traces should be easy to navigate
– Post-mortem analysis should not involve decompressing the entire trace.
– Useful to detect data races in multi-threaded Java programs
– Data races: unprotected shared variable accesses
Thread 1                    Thread 2
x = 1                       x = 2
if (x == 1) then S1         if (x == 2) then S2
The programmer expects S1, S2 to be executed
May not happen due to data races.
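The effect of the race can be checked by enumerating every interleaving of the two threads' statements. The sketch below does this by recursion on the two program counters; it assumes each statement executes atomically and that memory is sequentially consistent, which is a simplification of real Java execution. The class name DataRaceInterleavings and its fields are invented for illustration.

```java
class DataRaceInterleavings {
    int total = 0;          // number of complete interleavings explored
    int bothExecuted = 0;   // interleavings in which both S1 and S2 ran

    /**
     * pc1 and pc2 count how many of each thread's two statements have run;
     * s1 and s2 record whether S1 and S2 were executed.
     */
    void explore(int pc1, int pc2, int x, boolean s1, boolean s2) {
        if (pc1 == 2 && pc2 == 2) {
            total++;
            if (s1 && s2) bothExecuted++;
            return;
        }
        // Next step taken by Thread 1
        if (pc1 == 0) explore(1, pc2, 1, s1, s2);               // x = 1
        else if (pc1 == 1) explore(2, pc2, x, x == 1, s2);      // if (x == 1) then S1
        // Next step taken by Thread 2
        if (pc2 == 0) explore(pc1, 1, 2, s1, s2);               // x = 2
        else if (pc2 == 1) explore(pc1, 2, x, s1, x == 2);      // if (x == 2) then S2
    }
}
```

Calling explore(0, 0, 0, false, false) visits 6 interleavings, and only the 2 in which one thread runs to completion before the other starts execute both S1 and S2.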
Tracing strategy
Too costly to track all shared variable operations in a realistic Java program
Even if the operations are stored compactly, the run-time overheads may be too high.
One possibility is to store only the synchronization operations during run-time.
Regenerate and analyze possible orderings of the other communication operations: unsynchronized shared variable accesses.
Looking for students in these topics.
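One way to picture storing only synchronization operations is a replay harness that forces real threads to acquire the lock in a previously recorded order. The sketch below replays the two-thread Simple Example from a list of thread ids; LockOrderReplay and its methods are hypothetical, the recorded order is assumed to list each thread exactly once, and a monitor with wait/notifyAll stands in for the program's lock.

```java
import java.util.List;

class LockOrderReplay {
    private final List<Integer> recordedOrder;  // thread ids in recorded lock-acquisition order
    private int next = 0;                       // position in recordedOrder
    private int x = 0, y = 0;                   // shared variables of the example

    LockOrderReplay(List<Integer> recordedOrder) { this.recordedOrder = recordedOrder; }

    /** Blocks until the recorded trace says it is this thread's turn to take the lock. */
    private synchronized void acquireInRecordedOrder(int threadId) throws InterruptedException {
        while (recordedOrder.get(next) != threadId) wait();
    }

    /** Releases the replayed lock and wakes the thread recorded next. */
    private synchronized void release() {
        next++;
        notifyAll();
    }

    private void critical(int threadId, Runnable body) {
        try {
            acquireInRecordedOrder(threadId);
            body.run();
            release();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Re-executes the example under the recorded lock order and returns the final y. */
    static int replay(List<Integer> recordedOrder) {
        LockOrderReplay r = new LockOrderReplay(recordedOrder);
        Thread t1 = new Thread(() -> r.critical(1, () -> r.x = 1));
        Thread t2 = new Thread(() -> r.critical(2, () -> r.y = (r.x == 1) ? 1 : 2));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("replay interrupted", e);
        }
        return r.y;
    }
}
```

Replaying the order [1, 2] reproduces Execution Trace 1 (y = 1) and [2, 1] reproduces Execution Trace 2 (y = 2), regardless of how the scheduler starts the threads. Accesses that are not protected by any lock are not constrained by such a log, which is why their possible orderings have to be regenerated and analyzed separately.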
Relevant Papers
Compactly Representing Parallel Program Executions. A. Goel, A. Roychoudhury and T. Mitra. ACM Symposium on Principles and Practice of Parallel Programming (PPoPP) 2003.
Specifying Multithreaded Java Semantics for Program Verification. A. Roychoudhury and T. Mitra. ACM/IEEE International Conference on Software Engineering (ICSE) 2002, pages 489-499.
Depiction and Playout of Multi-threaded Program Executions. A. Roychoudhury. Submitted for publication.
Symbolic Simulation of Live Sequence Charts. S. Choudhary, A. Roychoudhury and R. H. C. Yap. Submitted for publication.
My Contact Information
First two papers available from my web-page.
http://www.comp.nus.edu.sg/~abhik/
If you are interested in the projects, send e-mail
– [email protected] (comp.nus.edu.sg)
– [email protected] (nus.edu.sg)