Mining Behavior Graphs for “Backtrace” of Noncrashing Bugs
Chao Liu, Xifeng Yan, Hwanjo Yu, Jiawei Han
University of Illinois at Urbana-Champaign
Philip S. Yu
IBM T. J. Watson Research
Presented by: Chao Liu
Outline
• Motivations
• Related Work
• Classification of Program Executions
• Extract “Backtrace” from Classification Dynamics
• Case Study
• Conclusions
Motivations
• Software is full of bugs
  – Windows 2000, 35M LOC
    • 63,000 known bugs at the time of release, 2 per 1000 lines
• Software failure costs
  – Ariane 5 explosion is due to “errors in the software of the inertial reference system” (Ariane-5 Flight 501 inquiry board report, http://ravel.esrin.esa.it/docs/esa-x-1819eng.pdf)
  – A study by the National Institute of Standards and Technology found that software errors cost the U.S. economy about $59.5 billion annually (http://www.nist.gov/director/prog-ofc/report02-3.pdf)
• Testing and debugging are laborious and expensive
  – “50% of my company employees are testers, and the rest spends 50% of their time testing!” --Bill Gates, in 1995
Image courtesy of CNN.com
Bug Localization
• Automatically circle out the most suspicious places
• Two kinds of bugs w.r.t. symptoms
  – Crashing bugs
    • Typical symptoms: segmentation faults
    • Reasons: memory access violations
  – Noncrashing bugs
    • Typical symptoms: smooth executions but unexpected outputs
    • Reasons: logic or semantic errors
    • An example
Running Example
• Subject program
  – replace: perform regular expression matching and substitutions
  – 563 lines of C code
  – 17 functions are involved
• Execution behaviors
  – 130 out of 5542 test cases fail to give correct outputs
  – No incorrect executions incur segmentation faults
• Debug method
  – Step-by-step tracing
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if (m >= 0){    /* buggy version: the (lastm != m) check is missing */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if ((m >= 0) && (lastm != m) ){    /* correct version */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
Debugging Crashes
Bug Localization via Backtrace
• Backtrace for noncrashing bugs?
• Major challenges
  – No abnormality is visible on the surface.
  – When and where the abnormality happens.
Related Work
• Crashing bugs
  – Memory access monitoring
    • Purify [HJ92], Valgrind [SN00], GDB …
• Noncrashing bugs
  – Static program analysis
  – Traditional model checking
  – Model checking source code
Static Program Analysis
• Methodology
  – Examine source code directly
  – Enumerate all the possible execution paths without running the program
  – Check user-specified properties, e.g.
    • free(p) …… (*p)
    • lock(res) …… unlock(res)
    • receive_ack() … … send_data()
• Strengths
  – Check all possible execution paths
• Problems
  – Shallow semantics
  – Properties should be directly mapped to source code structure
• Tools
  – ESC [DRL+98], LCLint [EGH+94], ESP [DLS02], MC Checker [ECC00] …
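To make the property-checking step concrete, here is a minimal sketch of checking two of the properties listed above along a single, already enumerated execution path. It is not how ESC, LCLint, ESP, or MC Checker work internally. The event encoding, the names `event_t`, `verdict_t`, and `check_path`, and the restriction to one pointer and one lock are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events for one pointer p and one resource res on a path. */
typedef enum { EV_LOCK, EV_UNLOCK, EV_FREE, EV_DEREF } event_t;

typedef enum {
    PATH_OK,
    ERR_USE_AFTER_FREE,       /* free(p) ... (*p) */
    ERR_DOUBLE_LOCK,          /* lock(res) ... lock(res) */
    ERR_UNLOCK_WITHOUT_LOCK,  /* unlock(res) with no lock held */
    ERR_LOCK_NOT_RELEASED     /* path ends while res is still locked */
} verdict_t;

/* Scan one linearized path and report the first property violation. */
verdict_t check_path(const event_t *path, size_t n)
{
    int locked = 0, freed = 0;

    assert(path != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (path[i]) {
        case EV_LOCK:
            if (locked) return ERR_DOUBLE_LOCK;
            locked = 1;
            break;
        case EV_UNLOCK:
            if (!locked) return ERR_UNLOCK_WITHOUT_LOCK;
            locked = 0;
            break;
        case EV_FREE:
            freed = 1;
            break;
        case EV_DEREF:
            if (freed) return ERR_USE_AFTER_FREE;
            break;
        }
    }
    return locked ? ERR_LOCK_NOT_RELEASED : PATH_OK;
}
```

A real static analyzer would run a check like this over every path it enumerates from the source code. The sketch also shows the limitation named above: it only catches properties that map directly onto recognizable code events.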
Traditional Model Checking
• Methodology
  – Model program computation as finite state machines
  – The model is described in a dedicated description language
  – Exhaustively explore all the reachable states in checking desired or undesired properties
• Strengths
  – Model deeper semantics
  – Naturally fit in checking event-driven systems, like protocols
• Problems
  – Significant amount of manual effort in modeling
  – State space explosion
• Tools
  – SMV [M93], SPIN [H97], Murphi [DDH+92] …
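The "exhaustively explore all the reachable states" step can be sketched as a breadth-first search over an explicit finite state machine. This is only an illustration of the idea, not the algorithm used by SMV, SPIN, or Murphi. The adjacency-matrix encoding, the `MAX_STATES` bound, and the function name `reaches_bad_state` are assumptions.

```c
#include <assert.h>
#include <string.h>

#define MAX_STATES 64

/* Explore every state reachable from `init` in a finite state machine given
 * as an adjacency matrix (adj[s][t] != 0 means s can step to t).
 * Returns 1 if some reachable state is marked in `bad`, 0 otherwise. */
int reaches_bad_state(int n, unsigned char adj[][MAX_STATES],
                      int init, const unsigned char *bad)
{
    unsigned char seen[MAX_STATES];
    int queue[MAX_STATES];   /* each state is enqueued at most once */
    int head = 0, tail = 0;

    assert(n > 0 && n <= MAX_STATES && init >= 0 && init < n);
    memset(seen, 0, sizeof seen);
    seen[init] = 1;
    queue[tail++] = init;

    while (head < tail) {
        int s = queue[head++];
        if (bad[s])
            return 1;
        for (int t = 0; t < n; t++) {
            if (adj[s][t] && !seen[t]) {
                seen[t] = 1;
                queue[tail++] = t;
            }
        }
    }
    return 0;
}
```

The `seen` array has one entry per state, so the cost grows with the size of the state space. That is the state space explosion problem listed above.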
Model Checking Source Code
• Methodology
  – Execute the real program in a sandbox (e.g., virtual machine)
  – Manipulate event happenings, e.g.,
    • Incoming messages
    • Return value of memory allocation
• Strengths
  – Less significant manual specification
• Problems
  – Application restrictions, e.g.,
    • Event-driven programs (still)
    • Clear mapping between source code and logic event
• Tools
  – CMC [MPC+02], Verisoft [G97], Java PathFinder [BHP+-00] …
Summary of Related Work
• In summary,
  – Semantic inputs are necessary
    • Program model
    • Properties to be checked (all three methods)
  – Restricted application domain
    • Event-driven model
    • Properties are also event-related.
Example Revisited
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if (m > 0){
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if (m >= 0){    /* buggy version: the (lastm != m) check is missing */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if ((m >= 0) && (lastm != m) ){    /* correct version */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
• No memory violations
• Not an event-driven program
• No explicit error properties
Synopsis of Program Execution
• Program behavior graphs
  – Function-level abstraction of program behaviors
  – Function calls and transitions
  – First-order sequential information about function interactions
int main() { ... A(); ... B(); }
int A() { ... }
int B() { ... C() ... }
int C() { ... }
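One way to read "function calls and transitions" is that a behavior graph records a call edge whenever one function calls another, and a transition edge whenever one callee returns and the same caller then invokes the next callee. The sketch below builds such a graph from a recorded enter/return trace. The trace encoding, the size limits, and the names `behavior_graph` and `build_behavior_graph` are assumptions for illustration, not the authors' implementation.

```c
#include <assert.h>
#include <string.h>

#define MAX_FUNCS 16
#define MAX_DEPTH 64

typedef struct {
    /* call[f][g] = 1: f called g at least once */
    unsigned char call[MAX_FUNCS][MAX_FUNCS];
    /* trans[g][h] = 1: h was called right after g returned, same caller */
    unsigned char trans[MAX_FUNCS][MAX_FUNCS];
} behavior_graph;

/* trace[i] >= 0 means "enter function trace[i]"; -1 means "return". */
void build_behavior_graph(const int *trace, int n, behavior_graph *g)
{
    int stack[MAX_DEPTH];       /* currently active functions */
    int last_child[MAX_DEPTH];  /* last callee that returned at each depth */
    int depth = 0;

    memset(g, 0, sizeof *g);
    for (int i = 0; i < n; i++) {
        int f = trace[i];
        if (f >= 0) {                       /* function entry */
            assert(f < MAX_FUNCS && depth < MAX_DEPTH);
            if (depth > 0) {
                int caller = stack[depth - 1];
                g->call[caller][f] = 1;
                if (last_child[depth - 1] >= 0)
                    g->trans[last_child[depth - 1]][f] = 1;
            }
            stack[depth] = f;
            last_child[depth] = -1;
            depth++;
        } else {                            /* function return */
            assert(depth > 0);
            depth--;
            if (depth > 0)
                last_child[depth - 1] = stack[depth];
        }
    }
}
```

With main = 0, A = 1, B = 2, C = 3, the example program above yields the trace `{0, 1, -1, 2, 3, -1, -1, -1}`. That produces call edges main→A, main→B, B→C and one transition edge A→B.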
Identification of Incorrect Executions
• A two-class classification problem
  – Every execution gives one behavior graph
  – Edges and closed frequent subgraphs as features
• Is classification useful?
  – Classification itself does not work for bug localization
    • Classifier only labels each run as either correct or incorrect as a whole
    • It does not tell when and where abnormality happens
• Observations
  – Good classifiers know the differences between correct and incorrect executions
    • Difference, a kind of abnormality?
  – Where and when does the abnormality happen?
    • Incremental classification
Incremental Classification
• Classification works only when instances from two classes are different.
• Precision as a measure of the difference.
• Incremental classification
• Observe accuracy dynamics
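The slides do not spell out how precision is computed, so the sketch below assumes the standard definition for the "incorrect run" class: of the runs the classifier flags as incorrect, the fraction that really are incorrect. The function name `precision` and the 0/1 label encoding are assumptions. Evaluating it for classifiers trained at successive points of the execution would give the dynamics the slide refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Precision for the "incorrect run" class.
 * Labels: 1 = incorrect execution, 0 = correct execution.
 * Returns 0.0 when the classifier flags no run at all. */
double precision(const int *actual, const int *predicted, size_t n)
{
    size_t true_pos = 0, flagged = 0;

    assert(actual != NULL && predicted != NULL);
    for (size_t i = 0; i < n; i++) {
        if (predicted[i] == 1) {
            flagged++;
            if (actual[i] == 1)
                true_pos++;
        }
    }
    return flagged == 0 ? 0.0 : (double)true_pos / (double)flagged;
}
```

If the behavior graphs of correct and incorrect runs are still indistinguishable at some point of the execution, a classifier trained there can do little better than guessing, and its precision stays low.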
Illustration: Precision Boost
[Figure: behavior graphs of one correct execution and one incorrect execution over the functions main and A through H.]
Bug Relevance
• Precision boost
  – For each function F:
    • Precision boost = Exit precision - Entrance precision.
  – Intuition & heuristics
    • Differences take place within the execution of F
    • Abnormality happens while F is in the stack
    • The larger the boost, the more likely F is relevant to the bug
• Bug-relevant function
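The precision-boost heuristic above can be sketched directly from its definition. The struct `func_precision` and the helper names are hypothetical. The slides do not say how functions are ranked or thresholded, so the sketch simply picks the function with the largest boost.

```c
#include <assert.h>
#include <stddef.h>

/* Per-function precision pair (hypothetical container). */
typedef struct {
    const char *name;
    double entrance_precision;  /* precision of the classifier at F's entrance */
    double exit_precision;      /* precision of the classifier at F's exit */
} func_precision;

/* Precision boost = exit precision - entrance precision. */
double precision_boost(const func_precision *f)
{
    return f->exit_precision - f->entrance_precision;
}

/* Index of the function with the largest precision boost. */
size_t most_bug_relevant(const func_precision *funcs, size_t n)
{
    size_t best = 0;

    assert(funcs != NULL && n > 0);
    for (size_t i = 1; i < n; i++) {
        if (precision_boost(&funcs[i]) > precision_boost(&funcs[best]))
            best = i;
    }
    return best;
}
```

For example, with made-up precision pairs (0.10, 0.15), (0.12, 0.80), and (0.80, 0.82) for three functions, the second has the largest boost and would be reported as the most bug-relevant. These numbers are illustrative only and are not results from the case study.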
Case Study
• Subject program
  – replace: perform regular expression matching and substitutions
  – 563 lines of C code
  – 17 functions are involved
• Execution behaviors
  – 130 out of 5542 test cases fail to give correct outputs
  – No incorrect executions incur segmentation faults
• Task
  – Can we circle out the backtrace for this bug?
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if (m >= 0){    /* buggy version: the (lastm != m) check is missing */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
void subline(char *lin, char *pat, char *sub)
{
    int i, lastm, m;
    lastm = -1;
    i = 0;
    while((lin[i] != ENDSTR)) {
        m = amatch(lin, i, pat, 0);
        if ((m >= 0) && (lastm != m) ){    /* correct version */
            putsub(lin, i, m, sub);
            lastm = m;
        }
        if ((m == -1) || (m == i)){
            fputc(lin[i], stdout);
            i = i + 1;
        } else
            i = m;
    }
}
Precision Pairs
Backtrace for Noncrashing Bugs
Conclusions
• Identify incorrect executions from program runtime behaviors.
• Classification dynamics can give away “backtrace” for noncrashing bugs without any semantic inputs.
• Data mining can contribute to software engineering and systems research in general.
Mining into Software and Systems?
References
• [DRL+98] David L. Detlefs, K. Rustan M. Leino, Greg Nelson, and James B. Saxe. Extended static checking, 1998.
• [EGH+94] David Evans, John Guttag, James Horning, and Yang Meng Tan. LCLint: A tool for using specifications to check code. In Proceedings of the ACM SIGSOFT '94 Symposium on the Foundations of Software Engineering, pages 87-96, 1994.
• [DLS02] Manuvir Das, Sorin Lerner, and Mark Seigle. ESP: Path-sensitive program verification in polynomial time. In Conference on Programming Language Design and Implementation, 2002.
• [ECC00] D.R. Engler, B. Chelf, A. Chou, and S. Hallem. Checking system rules using system-specific, programmer-written compiler extensions. In Proceedings of the Fourth Symposium on Operating Systems Design and Implementation, October 2000.
• [M93] Ken McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993.
• [H97] Gerard J. Holzmann. The model checker SPIN. IEEE Transactions on Software Engineering, 23(5):279-295, 1997.
• [DDH+92] David L. Dill, Andreas J. Drexler, Alan J. Hu, and C. Han Yang. Protocol verification as a hardware design aid. In IEEE International Conference on Computer Design: VLSI in Computers and Processors, pages 522-525, 1992.
• [MPC+02] Madanlal Musuvathi, David Y.W. Park, Andy Chou, Dawson R. Engler, and David L. Dill. CMC: A Pragmatic Approach to Model Checking Real Code. In Proceedings of the Fifth Symposium on Operating Systems Design and Implementation, 2002.
References (cont’d)
• [G97] P. Godefroid. Model Checking for Programming Languages using VeriSoft. In Proceedings of the 24th ACM Symposium on Principles of Programming Languages, 1997.
• [BHP+-00] G. Brat, K. Havelund, S. Park, and W. Visser. Model checking programs. In IEEE International Conference on Automated Software Engineering (ASE), 2000.
• [HJ92] R. Hastings and B. Joyce. Purify: Fast Detection of Memory Leaks and Access Errors. In Proceedings of the Winter 1992 USENIX Conference, pages 125-138, San Francisco, California.
• [SN00] Julian Seward and Nick Nethercote. Valgrind, an open-source memory debugger for x86-GNU/Linux. http://valgrind.org/
• [LLM+04] Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation, 2004.
• [LCS+04] Zhenmin Li, Zhifeng Chen, Sudarshan M. Srinivasan, and Yuanyuan Zhou. C-Miner: Mining Block Correlations in Storage Systems. In Proceedings of the 3rd USENIX Conference on File and Storage Technologies, 2004.
Q & A
Thank You!