Automatic Identification of Bug-Introducing Changes
![Page 1: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/1.jpg)
Automatic Identification of Bug-Introducing Changes.
Presenter: Haroon Malik
![Page 2: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/2.jpg)
Abstract
Bug-fixes do not contain information about the change that initially introduced the bug.
Extraction of bug-introducing changes is challenging.
An algorithm is presented to automatically and accurately identify bug-introducing changes.
The algorithm can remove 38%~51% of false positives and 14%~15% of false negatives compared to the previous algorithm.
![Page 3: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/3.jpg)
Introduction
Software projects control their changes using software configuration management (SCM) systems and capture bug reports using bug tracking software, e.g. Bugzilla.
They then record which change in the SCM system fixes a specific bug in the change tracking system.
Bug progression:
A programmer makes a change that introduces the bug: the bug-introducing change.
The bug manifests itself in some undesirable external behavior and is recorded in the bug tracking system.
A developer modifies the code to fix the bug: the bug-fix change.
![Page 4: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/4.jpg)
Introduction (Cont’d)
With the widespread use of SCM, data concerning bug-fix changes is readily available.
It is easy to mine an SCM repository for changes that have repaired a bug by linking keywords in the change log with bug report references.
E.g. “Bug” or “Fixed” together with a reference such as #902340
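The keyword linking described above can be sketched roughly as follows. This is a minimal illustration, not the tooling used in the study: the keyword list, the `#` plus digits reference format, and the function name `find_bug_fix` are all assumptions made for the example.

```python
import re

# Hypothetical patterns: a bug-related keyword, or a bug-report reference
# written as '#' followed by digits (as in "#902340").
KEYWORD = re.compile(r"\b(?:bug|fix(?:e[sd])?)\b", re.IGNORECASE)
BUG_ID = re.compile(r"#(\d+)")


def find_bug_fix(log_message):
    """Return (is_bug_fix, bug_ids) for one change-log message."""
    bug_ids = BUG_ID.findall(log_message)
    has_keyword = KEYWORD.search(log_message) is not None
    # Assumption: either a keyword or a report reference marks a bug-fix.
    return (has_keyword or bool(bug_ids)), bug_ids


print(find_bug_fix("Fixed #902340: null check in report"))
print(find_bug_fix("Refactor layout code"))
```

A real miner would also check the extracted numbers against the bug tracking system, since a bare number in a log message is not always a bug report.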
![Page 5: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/5.jpg)
Major Problems with bug-fix data
It sheds no light on when a bug was injected.
The person who fixes a bug is not always the one who caused it.
It cannot determine where a bug occurred.
![Page 6: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/6.jpg)
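Background
How the SZZ algorithm works:
First, it locates keywords in the change log to mark bug-fix changes.
Second, it runs a diff tool to determine what changed in the bug-fix. The diff tool returns the changed regions, called “hunks”.
It then uses the annotate feature of the SCM system to find, for the lines in each hunk:
The most recent revision in which the line was changed
Who made the change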
![Page 7: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/7.jpg)
Background (Cont’d)
Revision 1: Origin of bug (line 3).
Revision 2: Function name changed (bar → foo).
Revision 3: Bug removal.
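The diff-then-annotate procedure can be tried on a three-revision example of this shape. The sketch below is a toy version under stated assumptions: the Java snippet in `REV1` to `REV3` is made up to match the revision descriptions, Python's `difflib` stands in for the SCM diff and annotate features, and `annotate` and `szz_blame` are hypothetical names.

```python
import difflib

# Hypothetical file contents: the bug is born on line 3 of revision 1,
# revision 2 renames bar to foo and reformats line 3, revision 3 fixes it.
REV1 = ["public void bar() {", "  // print report", "  if (report == null) {",
        "    println(report);", "  }", "}"]
REV2 = ["public void foo() {", "  // print report", "  if (report == null)",
        "  {", "    println(report);", "  }", "}"]
REV3 = ["public void foo() {", "  // print out report", "  if (report != null)",
        "  {", "    println(report);", "  }", "}"]


def annotate(revisions):
    """For each line of the last revision, give the 1-based number of the
    most recent revision that added or modified it (a toy 'annotate')."""
    origin = [1] * len(revisions[0])
    for rev_no in range(2, len(revisions) + 1):
        old, new = revisions[rev_no - 2], revisions[rev_no - 1]
        new_origin = [rev_no] * len(new)
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":  # unchanged lines keep their earlier origin
                new_origin[j1:j2] = origin[i1:i2]
        origin = new_origin
    return origin


def szz_blame(revisions):
    """Treat the last revision as the bug-fix. Return a dict mapping each
    line number deleted or modified by the fix to the revision blamed."""
    before, after = revisions[-2], revisions[-1]
    origin = annotate(revisions[:-1])
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    blamed = {}
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):  # lines touched by the fix hunk
            for i in range(i1, i2):
                blamed[i + 1] = origin[i]
    return blamed


print(szz_blame([REV1, REV2, REV3]))  # {2: 1, 3: 2}
```

On this input the toy blames the comment on line 2 (a change with no effect on behavior) and attributes line 3 to revision 2, the formatting change, even though the faulty condition dates from revision 1.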
![Page 8: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/8.jpg)
SZZ Limitations
Blank spaces and comments.
Formatting changes (line 3).
Name of the function containing the bug.
![Page 9: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/9.jpg)
Proposed Approach
Applied at the method level to two Java open source projects: Columba and Eclipse.
Two human judges manually verified all hunks in a series of bug-fix changes to ensure that the corresponding hunks are real bug fixes.
![Page 10: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/10.jpg)
Proposed Approach
Steps 1-5 remove 38%~51% of false positives and 14%~15% of false negatives compared to SZZ.
![Page 11: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/11.jpg)
Experimental setup
History extraction
Used Kenyon to extract histories from SCM systems.
![Page 12: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/12.jpg)
Experimental setup (Cont’d)
Accuracy measures
A bug-introducing change set consists of all the changes within specific project revisions that have been identified as bug-introducing.
Assuming R is the more accurate bug-introducing change set, false positives and false negatives for a set P are computed relative to R.
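The formula itself is not preserved in this transcript. A plausible reading, assumed here rather than taken from the slide, is the usual set-difference form: false positive rate = |P − R| / |P| and false negative rate = |R − P| / |R|. A minimal sketch with hypothetical function names and made-up change identifiers:

```python
def false_positive_rate(p, r):
    """Assumed definition: |P - R| / |P|, the share of changes in P
    that the more accurate set R does not contain."""
    return len(p - r) / len(p) if p else 0.0


def false_negative_rate(p, r):
    """Assumed definition: |R - P| / |R|, the share of changes in R
    that P fails to identify."""
    return len(r - p) / len(r) if r else 0.0


# Made-up change identifiers, for illustration only.
P = {"c1", "c2", "c3", "c4"}
R = {"c2", "c3", "c4", "c5", "c6"}
print(false_positive_rate(P, R))  # 0.25
print(false_negative_rate(P, R))  # 0.4
```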
![Page 13: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/13.jpg)
Annotation Graph
A graph that contains information on the cross-revision mappings of individual lines.
A major improvement over SZZ.
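One way to picture such a graph is as a map from each line of a revision to the line it evolved from in the previous revision. The sketch below is only an interpretation: the positional pairing of lines inside a modified hunk, the use of `difflib`, the sample revisions, and the names `build_annotation_graph` and `trace_back` are all assumptions, not the construction used in the presented work.

```python
import difflib


def build_annotation_graph(revisions):
    """Return {(rev, line): (prev_rev, prev_line, kind)}, 1-based.
    kind is "unchanged" or "modified"; newly added lines have no entry."""
    parent = {}
    for rev in range(2, len(revisions) + 1):
        old, new = revisions[rev - 2], revisions[rev - 1]
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                kind = "unchanged"
            elif tag == "replace":
                kind = "modified"
            else:
                continue  # pure insert or delete: no cross-revision edge
            # Assumption: pair lines positionally; zip stops at the shorter side.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                parent[(rev, j + 1)] = (rev - 1, i + 1, kind)
    return parent


def trace_back(parent, rev, line):
    """Follow edges from (rev, line) back to its earliest ancestor."""
    path = [(rev, line)]
    while (rev, line) in parent:
        rev, line, _ = parent[(rev, line)]
        path.append((rev, line))
    return path


# Hypothetical revisions: line 3 is reformatted in revision 2, fixed in 3.
revs = [
    ["public void bar() {", "  // print report", "  if (report == null) {",
     "    println(report);", "  }", "}"],
    ["public void foo() {", "  // print report", "  if (report == null)",
     "  {", "    println(report);", "  }", "}"],
    ["public void foo() {", "  // print out report", "  if (report != null)",
     "  {", "    println(report);", "  }", "}"],
]
graph = build_annotation_graph(revs)
print(trace_back(graph, 3, 3))  # [(3, 3), (2, 3), (1, 3)]
```

Because every step of a line's history stays in the graph, a later pass can skip revisions that only reformatted the line and can look up which function contained the line at each point.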
![Page 14: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/14.jpg)
Experimental setup (Cont’d)
Non-behavior changes
Code format, comments and blank lines.
14%~20% false positives
Format changes
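A change that touches only formatting, comments or blank lines can be detected by normalizing both versions before comparing them. The following is a rough sketch, assuming Java-style comment syntax; it is deliberately naive (it does not protect comment markers inside string literals, and removing all whitespace would also merge tokens such as `int x`), and the function names are hypothetical.

```python
import re

BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"//[^\n]*")


def normalize(source):
    """Strip Java-style comments and all whitespace from a code fragment."""
    source = BLOCK_COMMENT.sub("", source)
    source = LINE_COMMENT.sub("", source)
    return re.sub(r"\s+", "", source)


def is_non_behavior_change(old, new):
    """True when old and new differ only in format, comments or blank lines."""
    return normalize(old) == normalize(new)


print(is_non_behavior_change("if (report == null) {", "if (report == null)\n{"))
print(is_non_behavior_change("if (report == null)", "if (report != null)"))
```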
![Page 15: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/15.jpg)
Manual Verification
If a change log indicates that a revision is a bug-fix, it is assumed that all the hunks in the revision are bug fixes.
Two human judges marked each bug-fix hunk for both projects.
They used a bug-fix hunk verification tool.
![Page 16: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/16.jpg)
Real Bugs?
![Page 17: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/17.jpg)
Validation Hurdles
Non-representative systems.
Open source.
Bug-fix data is incomplete.
Manual verification.
![Page 18: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/18.jpg)
Bug-Introduction Statistics
Eclipse Columba
![Page 19: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/19.jpg)
Conclusion
Refined the SZZ approach by introducing the annotation graph.
Experiments showed removal of 38%~51% of false positives and 14% of false negatives compared to SZZ.
![Page 20: Automatic Identification of Bug-Introducing Changes .](https://reader036.fdocuments.us/reader036/viewer/2022062304/56812d94550346895d92b07b/html5/thumbnails/20.jpg)
Thank You