Finding Predictors of Field Defects for Open Source Software Systems in Commonly Available Data Sources: a Case Study of OpenBSD
Paul Luo Li, Jim Herbsleb, Mary Shaw
Carnegie Mellon University
Open Source Software Systems are Critical Infrastructure
Problem for Decision Makers Using/Planning to Use Open Source Software Systems
- Lack of quantitative information on open source software systems:
  - What is the quality?
  - How many defects are there?
  - When are they going to occur?
Possible Benefits of Field Defect Predictions
- Make informed choices between open source software systems
- Decide whether to adopt the latest software release
- Better manage resources to deal with possible defects
- Insure users against the costs of field defect occurrences
Field Defect Predictions: Not a Novel Idea
- Time-based modeling (software reliability modeling)
  - E.g. Musa 1987, Littlewood 1973
- Metrics-based modeling
  - E.g. Ostrand et al. 2004, Khoshgoftaar et al. 2000
However, no Study Has Examined Open Source Systems
- Time-based modeling (software reliability modeling)
  - May not be possible for open source software systems
- Metrics-based modeling
  - May not have necessary data sources of important predictors from open source projects
Outline
- OpenBSD
- Reason why software reliability modeling cannot be used to predict field defect occurrences at the time of release
- Metrics collected from commonly available data sources
- Relationships between metrics and field defects
- Conclusion
- Sneak peek ahead
OpenBSD is …
- A derivative of the original AT&T Unix system from the Berkeley Software Distribution (forked from NetBSD)
- Distributed under the Berkeley copyrights
- Focused on security and reliability
OpenBSD uses…
- A CVS repository
- Several mailing lists
- A request/problem tracking system
A Field Defect is…
A user-reported problem in the request tracking system of the class “software bugs” whose submit date is after the published date of release
[Figure: field defects for release 2.4. x-axis: months after release; y-axis: field defects]
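
The definition above can be read as a filter over problem reports followed by a count per month. The sketch below is a minimal illustration under stated assumptions: the report tuples, the `sw-bug` class label, and the release date constant are hypothetical placeholders, not data from the study.

```python
from collections import Counter
from datetime import date

# Hypothetical, simplified problem reports: (class, submit date).
# The tuple layout and the "sw-bug" label are assumptions for illustration.
REPORTS = [
    ("sw-bug", date(1998, 11, 20)),         # before release: not a field defect
    ("sw-bug", date(1998, 12, 5)),
    ("doc-bug", date(1998, 12, 9)),         # wrong class: not a field defect
    ("sw-bug", date(1999, 1, 15)),
    ("sw-bug", date(1999, 1, 28)),
    ("change-request", date(1999, 2, 2)),   # wrong class: not a field defect
]


def months_after(release, submitted):
    """Calendar-month difference between the release and the submit date."""
    return (submitted.year - release.year) * 12 + (submitted.month - release.month)


def field_defects_by_month(reports, release, defect_class="sw-bug"):
    """Count reports of the given class submitted after the release date,
    grouped by the number of months since release."""
    counts = Counter()
    for report_class, submitted in reports:
        if report_class == defect_class and submitted > release:
            counts[months_after(release, submitted)] += 1
    return dict(sorted(counts.items()))


RELEASE_DATE = date(1998, 12, 1)  # placeholder release date for the example
print(field_defects_by_month(REPORTS, RELEASE_DATE))
```

Grouping by calendar month is one possible choice; a real analysis might bin by 30-day intervals instead.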
Software Reliability Modeling is …
[Figure: defects for release 3.0. x-axis: months since first reported defect; y-axis: defects]
Fitting a Parametric Model to Defect Occurrences
[Figure: defects for release 3.0. x-axis: months since first reported defect; y-axis: defects]
Li et al. 2004
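
As a rough illustration of fitting a parametric model to monthly defect counts, the sketch below assumes the exponential form λ(t) = N α e^(−αt) that appears later in this deck; the cited study may have used a different model family. The counts are synthetic, and the grid search is a simple stand-in for a proper optimizer.

```python
import math


def expected_counts(total, alpha, months):
    """Expected defects in each month under lambda(t) = N * alpha * exp(-alpha * t),
    obtained by integrating the rate over [k, k + 1)."""
    return [total * (math.exp(-alpha * k) - math.exp(-alpha * (k + 1)))
            for k in range(months)]


def fit_exponential(counts, alphas=None):
    """Least-squares fit of (N, alpha) by grid search over alpha.

    For a fixed alpha the model is linear in N, so the best N has a closed form.
    """
    if alphas is None:
        alphas = [a / 1000 for a in range(1, 2001)]  # 0.001 .. 2.000
    best = None
    for alpha in alphas:
        shape = expected_counts(1.0, alpha, len(counts))
        total = (sum(s * c for s, c in zip(shape, counts))
                 / sum(s * s for s in shape))
        sse = sum((c - total * s) ** 2 for s, c in zip(shape, counts))
        if best is None or sse < best[0]:
            best = (sse, total, alpha)
    _, total, alpha = best
    return total, alpha


# Synthetic monthly counts generated from N = 100, alpha = 0.3 (not OpenBSD data).
observed = expected_counts(100.0, 0.3, 12)
n_hat, alpha_hat = fit_exponential(observed)
print(round(n_hat, 2), round(alpha_hat, 3))
```

With only the first month or two of counts, many (N, α) pairs fit almost equally well, which is the practical difficulty the following slides describe.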
However, when the Release Date is Early
[Figure: defects for release 3.0, with the release date marked. x-axis: months since first reported defect; y-axis: defects]
There is Not Enough Data to Fit a Model at the Time of Release
[Figure: defects for release 3.0. x-axis: months since first reported defect; y-axis: defects]
Using Metrics Available Before Release to Predict Field Defects
- Certain characteristics make the presence of field defects more or less likely:
  - Product
  - Development
  - Deployment and usage
  - Software and hardware configurations in use
- Relationships between predictors and field defects can be modeled using past observations to predict future observations
Product Metrics
- Metrics that measure the attributes of any intermediate or final product of the development process
- Examined by most studies
  - Ostrand and Weyuker 2002
  - Jones et al. 1999
- Computed using snapshots of the code from CVS
- Computed using various automated tools
- 101 product metrics
Sub-categories of Product Metrics from Literature*
- Control
  - E.g. Cyclomatic complexity
- Volume
  - E.g. Unique operands
- Action
  - E.g. Lines of code
- Effort
  - E.g. Halstead’s mental effort metric
- Modularity
  - E.g. Statements at nesting greater than 9

*Categories from [Munson and Khoshgoftaar, 89], based on PCA
Development Metrics
- Metrics that measure attributes of the development process
- Examined by many studies
  - Khoshgoftaar et al. 1999
  - Graves et al. 2000
- Computed using history log information in CVS
- Computed using problem tracking information
- 22 development metrics
Grouping of Development Metrics from Literature*
- Number of changes
  - E.g. Updates during development (deltas)
- Experience of people making changes
  - E.g. Inexperienced developers making changes
- Changes to the code
  - E.g. Change in lines of code
- Development problems in previous release
  - E.g. Defects during development period of previous release
- Problems found during development
  - E.g. Field defects during development
- Development problems in current release
  - E.g. Defects during development against current release

*Groupings from Khoshgoftaar et al. 2000, using PCA
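
To make the "number of changes" and "changes to the code" groups concrete, the sketch below counts deltas and changed lines from `cvs log` style text inside a development window. It assumes the classic `date: YYYY/MM/DD hh:mm:ss;  author: ...;  lines: +a -b` layout; the sample log, author names, and window dates are invented for illustration.

```python
import re
from datetime import datetime

# Invented sample in the style of `cvs log` output for one file.
SAMPLE_LOG = """\
----------------------------
revision 1.3
date: 2001/09/14 10:02:11;  author: alice;  state: Exp;  lines: +12 -4
fix column width
----------------------------
revision 1.2
date: 2001/06/02 08:40:00;  author: bob;  state: Exp;  lines: +3 -1
typo
----------------------------
revision 1.1
date: 2000/12/24 17:15:30;  author: alice;  state: Exp;
initial import
"""

DATE_LINE = re.compile(r"^date: (\S+ \S+);\s+author: (\w+);")
LINES_CHANGED = re.compile(r"lines: \+(\d+) -(\d+)")


def development_metrics(log_text, start, end):
    """Count deltas, distinct authors, and changed lines with start <= date < end."""
    updates = added = removed = 0
    authors = set()
    for line in log_text.splitlines():
        match = DATE_LINE.match(line)
        if not match:
            continue
        when = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
        if not (start <= when < end):
            continue
        updates += 1
        authors.add(match.group(2))
        change = LINES_CHANGED.search(line)
        if change:  # the initial revision carries no "lines:" field
            added += int(change.group(1))
            removed += int(change.group(2))
    return {"updates": updates, "authors": len(authors),
            "lines_added": added, "lines_removed": removed}


window = (datetime(2001, 5, 1), datetime(2001, 11, 1))  # hypothetical development period
print(development_metrics(SAMPLE_LOG, *window))
```

Newer CVS versions print dates in a different format, so the `strptime` pattern would need adjusting for such logs.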
Deployment and Usage Metrics
- Metrics that measure attributes of the deployment of the software system and usage in the field
- Not examined by many studies
- Data sources used in previous studies not available for our study
Problem with and Solution to Deployment and Usage Metrics
- Previous studies use data sources that are not available in our study:
  - Khoshgoftaar et al. 2000: execution time based on known usage profile
  - Mockus et al. 2005: deployment info for monitored telecommunications systems
- Our study captures metrics from commonly available data sources: 9 deployment and usage metrics from:
  - Mailing list archives
  - Problem tracking systems
Metrics from Commonly Available Data Sources
- Mailing list predictors: reflect the amount of interest in OpenBSD, which may be related to deployment and usage
  - TechMailings
  - MiscMailing
  - AdvocacyMailings
  - AnnounceMailings
  - PortsMailings
  - WWWMailings
  - BugsMailings
- Request tracking system predictors: users usually install and use the system before issuing requests
  - ChangeRequests
  - DocBugs
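
A mailing list predictor of this kind amounts to counting archived messages per list within the development period. The sketch below parses raw message headers with Python's standard `email` package; the sample messages, the window dates, and the choice of the `To` header to identify the list are assumptions for illustration, not the study's procedure.

```python
from datetime import datetime, timezone
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# Invented archive entries; only the headers matter here.
RAW_MESSAGES = [
    "To: tech@openbsd.org\nDate: 03 Jul 2001 10:00:00 +0000\nSubject: a\n\nbody\n",
    "To: tech@openbsd.org\nDate: 19 Aug 2001 21:30:00 +0200\nSubject: b\n\nbody\n",
    "To: misc@openbsd.org\nDate: 21 Aug 2001 08:15:00 +0000\nSubject: c\n\nbody\n",
    "To: tech@openbsd.org\nDate: 04 Dec 2001 09:00:00 +0000\nSubject: d\n\nbody\n",
    "To: sparc@openbsd.org\nDate: 30 May 2001 12:00:00 +0000\nSubject: e\n\nbody\n",
]


def mailing_counts(raw_messages, dev_start, release):
    """Count messages per list with dev_start <= posting date < release."""
    counts = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        posted = parsedate_to_datetime(msg["Date"])
        if not (dev_start <= posted < release):
            continue
        # Use the local part of the recipient address as the list name.
        list_name = parseaddr(msg["To"])[1].split("@")[0]
        counts[list_name] = counts.get(list_name, 0) + 1
    return counts


DEV_START = datetime(2001, 6, 1, tzinfo=timezone.utc)  # hypothetical window start
RELEASE = datetime(2001, 12, 1, tzinfo=timezone.utc)   # hypothetical release date
print(mailing_counts(RAW_MESSAGES, DEV_START, RELEASE))
```

Under this reading, the count for `tech` would play the role of TechMailings and the count for `misc` the role of MiscMailing.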
Software and Hardware Configurations Metrics
- Metrics that measure attributes of the software and hardware configurations in use
- Not examined by many studies
- Data sources used in previous study not available for our study
Problem with and Solution to Software and Hardware Configurations Metrics
- Previous studies use data sources that are not available in our study:
  - Mockus et al. 2005: deployment info for monitored telecommunications systems
- Our study captures metrics from commonly available data sources: 7 software and hardware configurations metrics from:
  - Mailing list archives
  - Problem tracking systems
Metrics from Commonly Available Data Sources
- Mailing list predictors: reflect the amount of activity/interest in the specific hardware area
  - SparcMailings
- Request tracking system predictors: users usually install and use the system before reporting a problem
  - AllBSDBugsi386HW
  - AllBSDBugssparcHW
  - AllbugsotherHW
  - CurrentBSDBugsi386HW
  - CurrentBSDBugssparcHW
  - CurrentBSDBugsotherHW
Top Three Rank-correlated Predictors*
- TechMailings
  - Spearman’s ρ (rho): .78
  - Total number of messages to the technical mailing list, a developer’s mailing list, during the development period
- TotMeth
  - Spearman’s ρ (rho): .61
  - Total number of methods
- PubMeth
  - Spearman’s ρ (rho): .61
  - Number of public methods

*Single predictor
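
Spearman's ρ is the Pearson correlation computed on ranks, so it rewards any monotonic relationship between a predictor and field defects. The sketch below is a plain-Python version with average ranks for ties; the two series are invented and do not reproduce the values reported above.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed series."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented per-release values, one entry per release.
predictor = [10, 20, 30, 40, 50]
field_defects = [7, 5, 12, 9, 15]
print(round(spearman_rho(predictor, field_defects), 2))
```

Because only ranks matter, rescaling either series (for example, taking logarithms of the message counts) leaves ρ unchanged.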
First Three Predictors Selected Using AIC Model Selection*
1. TechMailings: total number of messages to the technical mailing list, a developer’s mailing list, during the development period
2. UpdatesNotCFiles: number of updates (deltas) to files that are not C source files during the development period
3. SparcMailing: number of messages to the sparc hardware specific mailing list, a platform specific mailing list, during the development period
*Predictors in combination
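
One common way to select predictors in combination is forward selection: start with an intercept-only model and repeatedly add the predictor that lowers AIC the most. The sketch below assumes ordinary least squares and the form AIC = n ln(RSS/n) + 2k; the slides do not state which model or search strategy the study used, and the metric names and data here are synthetic.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def aic(columns, y):
    """AIC of a least-squares fit with intercept: n * ln(RSS / n) + 2k."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[p] * row[q] for row in design) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * y[i] for i, row in enumerate(design)) for p in range(k)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(b * v for b, v in zip(beta, row))) ** 2
              for i, row in enumerate(design))
    return n * math.log(rss / n) + 2 * k


def forward_select(predictors, y):
    """Greedily add the predictor that lowers AIC most; stop when none does."""
    chosen, best = [], aic([], y)
    while len(chosen) < len(predictors):
        score, name = min((aic([predictors[c] for c in chosen + [p]], y), p)
                          for p in predictors if p not in chosen)
        if score >= best:
            break
        chosen.append(name)
        best = score
    return chosen


# Synthetic data for eight hypothetical releases (not OpenBSD measurements).
PREDICTORS = {
    "metric_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "metric_b": [2, 1, 4, 3, 1, 4, 2, 3],
    "metric_c": [5, 3, 4, 4, 5, 3, 4, 4],
}
NOISE = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
y = [3 * a + 2 * b + e
     for a, b, e in zip(PREDICTORS["metric_a"], PREDICTORS["metric_b"], NOISE)]
selected = forward_select(PREDICTORS, y)
print(selected)
```

The order of selection reflects how much each predictor adds given those already chosen, which is why a predictor with a modest single-variable correlation can still be picked early.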
The Best Predictor: TechMailings
We find…
- It is not possible to use software reliability modeling to predict field defect occurrences of OpenBSD
  - There is not enough development defect information at the time of release
- It is possible to collect deployment and usage and software and hardware configurations metrics for OpenBSD
  - Mailing list archives and problem tracking systems are previously unutilized sources of data
- TechMailings is the most important predictor for OpenBSD
  - Deployment and usage metrics are important for field defect occurrence predictions
What’s Next?
- Repeat for other open source systems
  - Correlations could have happened by chance!
- Repeat for commercial systems
  - Differences in style of development and in data sources available
- Use metrics to predict field defect occurrence rates over time…
Predicting Parameters Using Metrics-based Methods
[Figure: field defects for release 2.4. x-axis: months after release; y-axis: field defects]

$$\lambda(t) = N \alpha e^{-\alpha t}, \qquad N = f_N(i), \quad \alpha = f_\alpha(i)$$

i = metrics available before release
$f_N(i)$ is a metrics-based method
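
To show how the two parameters turn into a forecast, the sketch below evaluates the rate λ(t) = N α e^(−αt) and integrates it over each month after release. The functions `f_total` and `f_alpha` are placeholder stand-ins for the metrics-based methods; their coefficients and the metric value are invented, not results from the study.

```python
import math


def defect_rate(t, total, alpha):
    """lambda(t) = N * alpha * exp(-alpha * t): defects per month at time t."""
    return total * alpha * math.exp(-alpha * t)


def expected_defects(start, end, total, alpha):
    """Integral of lambda(t) from start to end, in months after release."""
    return total * (math.exp(-alpha * start) - math.exp(-alpha * end))


# Placeholder stand-ins for the metrics-based methods; coefficients are invented.
def f_total(metrics):
    return 0.05 * metrics["TechMailings"]


def f_alpha(metrics):
    return 0.3


metrics = {"TechMailings": 2000}  # invented value of a pre-release metric
total, alpha = f_total(metrics), f_alpha(metrics)
forecast = [expected_defects(m, m + 1, total, alpha) for m in range(12)]
print([round(v, 1) for v in forecast])
```

Here N is the total number of field defects expected over the whole lifetime, and α controls how quickly the monthly rate decays.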
Forecasting Field Defect Rates Using a Combined Time-based and Metrics-based Approach: a Case Study of OpenBSD (ISSRE, 2005)
Paul Luo Li, Jim Herbsleb, Mary Shaw
Carnegie Mellon University