Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF...
-
Upload
timothy-waters -
Category
Documents
-
view
218 -
download
0
Transcript of Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF...
![Page 1: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/1.jpg)
Scalable Dynamic Analysis for Automated Fault Location and
Avoidance
Rajiv Gupta
Funded by NSF grants from CPA, CSR, & CRI programsand grants from Microsoft Research
![Page 2: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/2.jpg)
Motivation
Software bugs cost the U.S. economy about $59.5 billion each year [NIST 02].
Embedded Systems• Mission Critical / Safety Critical Tasks • A failure can lead to Loss of Mission/Life.
(Ariane 5) arithmetic overflow led to shutdown of guidance computer.(Mars Climate Orbiter) missed unit conversion led to faulty navigation data.(Mariner I) missing superscripted bar in the specification for the guidance
program led to its destruction 293 seconds after launch. (Mars Pathfinder) priority inversion error causing system reset.(Boeing 747-400) loss of engine & flight displays while in flight.(Toyota hybrid Prius) VSC, gasoline-powered engine shut off. (Therac-25) wrong dosage during radiation therapy.…….
![Page 3: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/3.jpg)
Overview
Dynamic SlicingOffline
Environment FaultsOnline
Program Execution
Fault
Fault Location
Fault Avoidance
ScalabilityTracing + Logging
Long-runningMulti-threaded
![Page 4: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/4.jpg)
Fault Location
Goal:
Assist the programmer in debugging by automatically narrowing the fault to a small section of the code.
Dynamic Information• Data dependences• Control dependences• Values
Execution Runs• One failed execution & • Its perturbations
![Page 5: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/5.jpg)
Dynamic Information
……
ExecutionProgram Dynamic Dependence Graph
Data Control
![Page 6: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/6.jpg)
Approach
Detect execution of statement s such that Faulty code Affects the value computed by s; or Faulty code is Affected-by the value computed by s
through a chain of dependences.
Estimate the set of potentially faulty statements from s:
Affects: statements from which s is reachable in the dynamic dependence graph. (Backward Slice)
Affected-by: statements that are reachable from s in the dynamic dependence graph. (Forward Slice)
Intersect slices to obtain a smaller fault candidate set.
![Page 7: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/7.jpg)
Backward & Forward Slices
Erroneous Output
Failure inducingInputBackward Slice
[Korel&Laski,1988] Forward Slice [ASE-05]
![Page 8: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/8.jpg)
Failure InducingInput
ErroneousOutput
[ASE-05]
For memory bugs the number of statements is very small (< 5).
Backward & Forward Slices
![Page 9: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/9.jpg)
Bidirectional Slices
BackwardSlice of CP
ForwardSlice of CP
+ BidirectionalSliceCombined
Slice
[ICSE-06]
• Found critical predicates in 12 out of 15 bugs• Search for critical predicate: Brute force: 32 predicates to 155K predicates; After Filtering and Ordering: 1 to 7K predicates.
Critical Predicate:
An execution instance
of a predicate such
that changing its
outcome “repairs” the
program state.
![Page 10: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/10.jpg)
Pruning Slices
Confidence in v
v
C(v): [0,1]
0 - all values of v produce same
1 - any change in v will change
How? Value profiles.
1 1
1
[PLDI-06]
![Page 11: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/11.jpg)
Test Programs
Real Reported Bugs Injected Bugs
Nine logical bugs (incorrect ouput)
• Unix utilities grep 2.5, grep 2.5.1,
flex 2.5.31, make 3.80.
Six memory bugs (program crashes)
• Unix utilities gzip, ncompress,
polymorph, tar, bc, tidy.
Siemens Suite (numerous versions)
schedule, schedule2, replace, print_tokens..
• Unix utilities gzip, flex
![Page 12: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/12.jpg)
Dynamic Slice Sizes
Buggy Runs BS FS BiS
flex 2.5.31(a) 695 605 225flex 2.5.31(b) 272 257 NA
flex 2.5.31(c) 50 1368 NA
grep 2.5 NA 731 88
grep 2.5.1(a) NA 32 111
grep 2.5.1(b) NA 599 NA
grep 2.5.1(c) NA 12 453
make 3.80(a) 981 1239 1372
make 3.80(b) 1290 1646 1436
gzip-1.2.4 34 3 39
ncompress-4.2.4 18 2 30
polymorph-0.4.0 21 3 22
tar 1.13.25 105 202 117
bc 1.06 204 188 267
tidy 554 367 541
![Page 13: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/13.jpg)
Combined Slices
Buggy Runs BS BS^FS^BiS (%BS)
flex 2.5.31(a) 695 27 (3.9%)
flex 2.5.31(b) 272 102 (37.5%)
flex 2.5.31(c) 50 5 (10%)
grep 2.5 NA 86 (7.4%*EXEC)
grep 2.5.1(a) NA 25 (4.9%*EXEC)
grep 2.5.1(b) NA 599 (53.3%*EXEC)
grep 2.5.1(c) NA 12 (0.9%*EXEC)
make 3.80(a) 981 739 (81.4%)
make 3.80(b) 1290 1051 (75.3%)
gzip-1.2.4 34 3 (8.8%)
ncompress-4.2.4 18 2 (14.3%)
polymorph-0.4.0 21 3 (14.3%)
tar 1.13.25 105 45 (42.9%)
bc 1.06 204 102 (50%)
tidy 554 161 (29.1%)
![Page 14: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/14.jpg)
Evaluation of Pruning
Program Description LOC Versions Tests
print_tokens Lexical analyzer 565 5 4072
print_tokens2 Lexical analyzer 510 5 4057replace Pattern replacement 563 8 5542
schedule Priority scheduler 412 3 2627schedule2 Priority scheduler 307 3 2683
gzip Unix utility 8009 1 1217flex Unix utility 12418 8 525
Siemen’s Suite
• Single error is injected in each version.• All the versions are not included:
No output or the very first output is wrong; Root cause is not contained in the BS (code missing error).
![Page 15: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/15.jpg)
Program BS Pruned Slice Pruned Slice / BS
print_tokens 110 35 31.8%
print_tokens2 114 55 48.2%replace 131 60 45.8%
schedule 117 70 59.8%schedule2 90 58 64.4%
gzip 357 121 33.9%flex 727 27 3.7%
Evaluation of Pruning
![Page 16: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/16.jpg)
Effectiveness
Backward
Slice
[AADEBUG-05]
≈ 31% of Executed
Statements
Combined Slice
[ASE-05,ICSE-06]
≈ 36% of Backward
Slice≈ 11% of Exec.
Erroneous outputFailure inducing inputCritical predicate
Pruned Slice
[PLDI-06]
≈ 41% of Backward
Slice≈ 13% of Exec.
ConfidenceAnalysis
![Page 17: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/17.jpg)
Effectiveness
Slicing is effective in locating faults.
No more than 10 static statements had to be inspected.
Program-bug Inspected Stmts.
mutt – heap overflow 8
pine – stack overflow 3
pine – heap overflow 10
mc – stack overflow 2
squid – heap overflow 5
bc – heap overflow 3
![Page 18: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/18.jpg)
Execution Omission Errors
X=
X=
=X
A =
A<0
X=
X=
=X
A =
A<0
X=
=X
A =
A<0
• Inspect pruned slice.• Dynamically detect an Implicit dependence.• Incrementally expand the pruned slice.
[PLDI-07]
Implicitdependence
![Page 19: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/19.jpg)
Scalability of Tracing
Dynamic Information Needed• Dynamic Dependences
for all slicing• Values for Confidence Analysis
for pruning slices
annotates the static program representation
Whole Execution Trace (WET)
Trace Size• ≈ 15 Bytes / Instruction
![Page 20: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/20.jpg)
Trace Sizes & Collection Overheads
Trace sizes are very large for even 10s of execution.
Program Running Time
Dep. Trace
Collection Time
mysql 13 s 21 GB 2886 s
prozilla 8 s 6 GB 2640 s
proxyC 10 s 456 MB 880 s
mc 10 s 55 GB 418 s
mutt 20 s 388 GB 3238 s
pine 14 s 156 GB 2088 s
squid 15 s 88 GB 1132 s
![Page 21: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/21.jpg)
Compacting Whole Execution Traces
Explicitly remember dynamic control flow trace. Infer as many dynamic dependences as possible
from control flow (94%), remember the remaining dependences explicitly (≈ 6%). Specialized graph representation to enable inference.
Explicitly remember value trace. Use context-based method to compress dynamic
control flow, value, and address trace. Bidirectional traversal with equal ease
[MICRO-04, TACO-05]
![Page 22: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/22.jpg)
Input: N=2
51: for I=1 to N do61: if (i%2==0) then
71: p=&a
81: a=a+191: z=2*(*p)
101: print(z)
11: z=021: a=031: b=2
41: p=&b
52: for I=1 to N do
62: if (i%2==0) then
82: a=a+192: z=2*(*p)
1: z=02: a=03: b=24: p=&b5: for i = 1 to N do6: if ( i %2 == 0) then7: p=&a endif endfor8: a=a+19: z=2*(*p)10: print(z)
Dependence Graph Representation
![Page 23: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/23.jpg)
5:for i=1 to N
6:if (i%2==0) then
7: p=&a
8: a=a+1
9: z=2*(*p)
10: print(z)
T F
1: z=0
2: a=0
3: b=2
4: p=&b
T
Input: N=2
11: z=0
21: a=0
31: b=2
41: p=&b
51: for i = 1 to N do
61: if ( i %2 == 0) then
81: a=a+1
91: z=2*(*p)
52: for i = 1 to N do
62: if ( i %2 == 0) then
71: p=&a
82: a=a+1
92: z=2*(*p)
101: print(z)
T1
2
3
4
5
6
7
8
9
10
11
12
13
14
<3,8><2,7>
<7,12>
<11,13>
<13,14>
<4,8>
<12,13>
<5,6><9,10>
<10,11>
<5,7><9,12>
<5,8><9,13>
F
Dependence Graph Representation
![Page 24: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/24.jpg)
1
2
3
4 5
6
7 8
9
10
1
2
3
4
5
6
7
2
1
10
34679
35679
34689
35689
1
2
3
2
1
10
34679
1
2
3
34 5
67 8
9
Transform: Traces of Blocks
![Page 25: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/25.jpg)
Infer: Local Dependence Labels
X =
Y= X
X =
Y= X
(10,10)
(20,20)
(30,30)
10,20,30
X =
Y= X
=Y
21
(20,21)
...
(...,20)
...
![Page 26: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/26.jpg)
Transform: Local Dep. Labels
X =
*P =
Y= X
X =
*P =
Y= X
(10,10)
(20,20)
10,20
X =
*P =
Y= X
(20,20)
![Page 27: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/27.jpg)
Transform: Local Dep. Labels
X =
*P =
Y= X
X =
*P =
Y= X
(10,10)
(20,20) X =
*P =
Y= X
10,20
X =
*P =
Y= X
2010
11,21
(10,11) (20,21)
=Y
11,21
=Y
(20,21) (10,11)
![Page 28: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/28.jpg)
Group: Non-Local Dep. Edges
X =
Y =
= Y
= X
X =
Y =
X =
Y =
= Y
= X
X =
Y =
(10,21)
(10,21)
(20,11)
(20,11)
X =
Y =
= Y
= X
X =
Y =
(20,11)
(10,21)
11,21
10
20
![Page 29: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/29.jpg)
Compacted WET Sizes
ProgramStatementsExecuted(Millions)
WET Size (MB) Before /
After Before After
300.twolf
256.bzip2
255.vortex
197.parser
181.mcf
164.gzip
130.li
126.gcc
099.go
Average
690
751
609
615
715
650
740
365
685
647
10,666
11,921
8,748
8,730
10,541
9,688
10,399
5,238
10,369
9,589
647
221
105
188
416
538
203
89
575
331
16.49
54.02
83.63
46.34
25.33
18.02
51.22
58.84
18.04
41.33
≈ 4 Bits / Instruction
![Page 30: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/30.jpg)
[PLDI-04]vs.
[ICSE-03]
Slicing Times
![Page 31: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/31.jpg)
Dep. Graph Generation Times
Offline post-processing after collecting address and control flow traces ≈ 35x of execution time
Online techniques [ICSM 2007] Information Flow: 9x to18x slowdown Basic block Opt.: 6x to10x slowdown Trace level Opt.: 5.5x to 7.5x slowdown Dual Core: ≈1.5x slowdown
Online Filtering techniques Forward slice of all inputs User-guided bypassing of functions
![Page 32: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/32.jpg)
Reducing Online Overhead
Record non-deterministic events online• Less than 2x overhead• Deterministic replay of executions
Trace faulty executions off-line• Replay the execution• Switch on tracing• Collect and inspect traces
Trace analysis is still a problem• The traces correspond to huge executions• Off-line overhead of trace collection is still significant
![Page 33: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/33.jpg)
Reducing Trace Sizes
Checkpointing Schemes Trace from the most recent checkpoint
• Checkpoints are of the order of minutes.• Better but the trace sizes are still very large.
Exploiting Program Characteristics Multithreaded and server-like [ISSTA-07, FSE-06]
• Examples : mysql, apache.• Each request spawns a new thread.• Do not trace irrelevant threads.
![Page 34: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/34.jpg)
Beyond Tracing
Checkpoint: capture memory image.
Execute and Record (log) Events.
Upon Crash, Rollback to checkpoint. Reduce log and Replay execution using reduced log. Turn on tracing during replay.
xCheckpoint log
x TraceReducedlog
Applicable to Multithreaded Programs
[ISSTA-07]
![Page 35: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/35.jpg)
An Example
A mysql bug • “ load …” command will crash the server if database is not
specified
sql/mysql_load.cc:int mysql_load (THD *thd,...){
…150 if( …151 … +strlen(thd->db) + 3 <152 FN_REFLEN) ...}
Without typing “use database_name”, thd->db is Null.
![Page 36: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/36.jpg)
Example – Execution and log file
open path=/etc/my.cnf
…
Wait for connection
Create Thread 1
Wait for command
Create Thread 2
Wait for command
Recv “show databases”
Handle command
Recv “load data …”
Handle -- (server crashes)
Recv “use test; select * from b”
Handle command
Run mysql server
User 1 connects to the server
User 2 connects to the server
User 1: “show databases”
User 2: “use test”
“ select * from b”
User 1: “load data into table1”
Time
Blue – T0
Red – T1
Green – T2
Gray - Scheduler
![Page 37: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/37.jpg)
Execution Replay using Reduced log
open path=/etc/my.cnf
…
Wait for connection
Create Thread 1
Recv “load data …”
Handle -- (server crashes)
Run mysql server
User 1 connects to the server
User 2 connects to the server
User 1: “show databases”
User 2: “show databases”
“ select * from b”
User 1: “load data into table1”
Time
![Page 38: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/38.jpg)
Execution Reduction
Effects of Reduction Irrelevant Threads Replay-only vs. Replay & Trace
How? By identifying Inter-thread Dependences Event Dependences - found using the log File Dependences - found using the log Shared-Memory Dependences - found using replay
Space requirement reduced by 4x
Time requirement reduced by 2x
Naïve approach requires thread id of last writer of each address Space and time efficient detection
o Memory Regions: Non-shared vs sharedo Locality of References to Regions
![Page 39: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/39.jpg)
Experimental Results
![Page 40: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/40.jpg)
Original OptimizedProgram-bug
Trace SizesNum. of
dependences
Experimental Results
![Page 41: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/41.jpg)
Orig. OPT.Program-bug
Execution Times
(seconds)
Logging
Experimental Results
![Page 42: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/42.jpg)
Debugging System
SlicingModule
WETSlicesApplication
binary
ExecutionEngineValgrind
Input Output
Traces
Instrumentcode
CompressedTrace
Static BinaryAnalyzer
Diablo
Control Dependence
ReducedLog
RecordReplayJockey
Checkpoint+
log
![Page 43: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/43.jpg)
Fault Avoidance
Large number of faults in server programs are caused by the environment.
• 56 % of faults in Apache server.
Types of Faults Handled Atomicity Violation Faults.
• Try alternate scheduling decisions.
Heap Buffer Overflow Faults.• Pad memory requests.
Bad User Request Faults.• Drop bad requests.
Avoidance Strategy Recover first time, Prevent later.
• Record the change that avoided the fault.
![Page 44: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/44.jpg)
Experiments
Program Type of Bug Env. Change # of Trials
Time taken (secs.)
mysql-1 Atomicity Violn. Scheduler 1 130
mysql-2 Atomicity Violn. Scheduler 1 65
mysql-3 Atomicity Violn. Scheduler 1 65
mysql-4 Buffer Overflow. Mem. Padding 1 700
pine-1 Buffer Overflow. Mem. Padding 1 325
pine-2 Buffer Overflow. Mem. Padding 1 270
mutt-1 Bad User Req. Drop Req. 3 205
bc-1 Bad User Req. Drop Req. 3 290
bc-2 Bad User Req. Drop Req. 3 195
![Page 45: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/45.jpg)
Summary
Dynamic SlicingOffline
Environment FaultsOnline
Program Execution
Fault
Fault Location
Fault Avoidance
ScalabilityTracing + Logging
Long-runningMulti-threaded
![Page 46: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/46.jpg)
Dissertations
Xiangyu Zhang, Purdue University Fault Location Via Precise Dynamic
Slicing, 2006. SIGPLAN Outstanding Doctoral Dissertation Award
Sriraman Tallam, Google
Fault Location and Avoidance in Long-Running Multithreaded Programs, 2007.
![Page 47: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/47.jpg)
Ongoing Work
Monitoring Parallel Applications On Multicores: Vijay Nagarajan [ISCA’09]
Debugging Parallel Applications
State-based Approach: Dennis Jeffrey [ISSTA’08a]
Race-detection in Parallel Applications:
Chen Tian & Vijay Nagarajan [ISSTA’08b]
Optimistic Parallelization for Multicores Speculation: Min Feng & Chen Tian [MICRO’08]
![Page 48: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/48.jpg)
![Page 49: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/49.jpg)
Breakdowns of Different Optimizations
Infer
Transform
Group
Others
Explicit
![Page 50: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/50.jpg)
Reducing Trace Sizes
Checkpointing Schemes Trace from the most recent checkpoint
• Checkpoints are of the order of minutes.• Better but the trace sizes are still very large.
Exploiting Program Characteristics Multithreaded and server-like [ISSTA-07]
• Examples : mysql, apache.• Each request spawns a new thread.• Do not trace irrelevant threads.
Single-threaded and event processing [FSE-06]• Examples : pine, mutt, squid.• Events are independent.• Do not trace irrelevant events.
![Page 51: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/51.jpg)
Example of a Fault
Input:• > ./mysqld –log-bin= binlog• > ./mysql -u root -D test -e 'delete from b' &• > ./mysql -u root -D test -e 'insert into b values (1)' &
Output Observation• > ./mysqlbinlog binlog
set timestamp = 1152573474 insert into b values (1); set timestamp = 1152573470 delete from b
![Page 52: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/52.jpg)
File : mysql_delete.cc
mysql_delete(THD *thd, ...)
...
152 error=generate_table(thd, ...);
...
generate_table(THD *thd, ...)
...
81 pthread_mutex_lock(...);
... // Critical Section
105 pthread_mutex_unlock(...);
108 ... // Logging not locked
109 mysql_update_log.write(thd,...);
...
Log of an execution
(scheduling decision)
TEI 1 Recv “delete …”
Handle it but hasn’t logged it
TEI 2 Recv “insert”
Handle it (logged insert)
TEI 3 Logging delete
Red – T1
Green – T2
Gray - Scheduler
Example of a Fault
![Page 53: Scalable Dynamic Analysis for Automated Fault Location and Avoidance Rajiv Gupta Funded by NSF grants from CPA, CSR, & CRI programs and grants from Microsoft.](https://reader036.fdocuments.us/reader036/viewer/2022062421/56649d195503460f949eec81/html5/thumbnails/53.jpg)
System for Fault Avoidance
Put the server program into a 3-phase system• Log the original execution• Replay execution and apply the environmental changes• Use environment patch to prevent the fault from occurring again
Logging – jockey system. Apply environment changes – valgrind system.
Logging PhaseOrig. Execution
EventLog
Fault AvoidancePhase
(Env. Changes)Env. Fault
Patch
Prevention-LoggingPhase
Fault Avoided
FAULT REPLAY & AVOID
NEWFAULT
OLDFAULT
RECORDPATCH
[COMPSAC-08]