Benevol 2011
Similar Tasks, Different Effort: Why the Same Amount of Functionality Requires Different Development Effort?
Alexander Serebrenik
Bogdan Vasilescu
Mark van den Brand
Why do some systems require more effort?
• Empirical study
• ISBSG version 11
• largest publicly available collection: 5052 projects
• 118 project attributes, including
− amount of functionality
− work effort
• Not all projects are suited for the study
• self-reporting: varying data quality
• different ways of measuring project attributes
/ W&I / MDSE PAGE 1 23-4-2012
Project selection
Selection criterion                              Projects remaining
ISBSG v.11                                       5052
Effort: staff hours (recorded)                   3537
Full development lifecycle                       2261
Project-specific activities only                 2079
Functionality: IFPUG                             1661
Data quality: “A” or “B”                         1609
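The successive selection steps above can be sketched as a chain of filters that records how many projects survive each step. This is only an illustration: the field names, the toy records, and the subset of filters are hypothetical and do not reflect the actual ISBSG column names or data.

```python
# Toy records with hypothetical field names (not real ISBSG data).
TOY_PROJECTS = [
    {"effort_unit": "staff hours (recorded)", "lifecycle": "full",
     "count_approach": "IFPUG", "quality": "A"},
    {"effort_unit": "staff hours (recorded)", "lifecycle": "full",
     "count_approach": "IFPUG", "quality": "C"},
    {"effort_unit": "staff hours (recorded)", "lifecycle": "partial",
     "count_approach": "IFPUG", "quality": "B"},
    {"effort_unit": "staff hours (derived)", "lifecycle": "full",
     "count_approach": "COSMIC", "quality": "A"},
]

# Each filter is a (label, predicate) pair, applied in order.
FILTERS = [
    ("effort in recorded staff hours",
     lambda p: p["effort_unit"] == "staff hours (recorded)"),
    ("full development lifecycle", lambda p: p["lifecycle"] == "full"),
    ("functionality counted with IFPUG",
     lambda p: p["count_approach"] == "IFPUG"),
    ("data quality A or B", lambda p: p["quality"] in ("A", "B")),
]


def selection_funnel(projects, filters):
    """Apply the filters in order; return survivors and per-step counts."""
    remaining = list(projects)
    counts = [("all projects", len(remaining))]
    for label, keep in filters:
        remaining = [p for p in remaining if keep(p)]
        counts.append((label, len(remaining)))
    return remaining, counts


selected, counts = selection_funnel(TOY_PROJECTS, FILTERS)
for label, n in counts:
    print(f"{label}: {n}")
```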
Effort and Functionality Distributions
• Effort:
• skewed, outliers
• Functionality: adjusted FP or unadjusted FP?
• Adjusted is more reliable [Kitchenham et al. JSS, 2002]
• skewed, outliers
More functionality, more effort required
• Log-transformation for the skewness / outliers problem
• Adequate
• p-value for the F-statistic ≤ 2.2*10^-16
• p-values for the intercept and the coefficient ≤ 2.2*10^-16
• residuals show a chaotic pattern
log(SWE) = 2.92717 + 0.84617 * log(AFP)
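A minimal sketch of how the fitted model can be used, assuming that log is the natural logarithm (the slides do not state the base), that SWE is the work effort in staff hours, and that AFP is the adjusted function point count. The function names are illustrative only.

```python
import math

# Coefficients taken from the fitted model on the slide.
INTERCEPT = 2.92717
SLOPE = 0.84617


def predicted_effort(afp):
    """Effort predicted by the log-log model (natural log assumed)."""
    return math.exp(INTERCEPT + SLOPE * math.log(afp))


def log_residual(afp, swe):
    """Observed minus predicted log-effort; can be negative."""
    return math.log(swe) - (INTERCEPT + SLOPE * math.log(afp))


# Ratio of predicted effort when the functionality doubles.
print(predicted_effort(200) / predicted_effort(100))
```

Because the slope is below 1, doubling the functionality multiplies the predicted effort by 2^0.84617, roughly 1.8, rather than by 2. A project with a positive residual needed more effort than the model predicts for its size, and one with a negative residual needed less.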
Why do some systems require more effort?
• Closer look at the residuals
• technical aspects:
− primary programming language, language type,
development type, platform, and architecture
• organization type
• intended market
• year of project
• Problem of ISBSG
• missing values due to self-reporting
What attributes impact the development effort?
• Goal: compare different project attributes
• ISBSG – 118 attributes
• Remove projects with missing values
• More attributes, fewer projects
• Keep projects with missing values
• NA-category becomes too important
• We choose
• primary programming language, language type, organization
type, intended market, year of project, development type,
platform, architecture
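The two ways of handling missing values can be sketched as follows. The toy records and function names are hypothetical; `None` stands in for a value the project did not report.

```python
# Toy records (not real ISBSG data); None marks a missing value.
TOY = [
    {"language": "Java", "platform": "PC"},
    {"language": None, "platform": "PC"},
    {"language": "C", "platform": None},
]


def complete_cases(projects, attributes):
    """Option 1: keep only projects with no missing value in `attributes`."""
    return [p for p in projects
            if all(p.get(a) is not None for a in attributes)]


def group_with_na(projects, attribute, na_label="NA"):
    """Option 2: keep every project; missing values form their own group."""
    groups = {}
    for p in projects:
        value = p.get(attribute)
        key = na_label if value is None else value
        groups.setdefault(key, []).append(p)
    return groups


# Each extra attribute can only shrink the set of complete cases.
print(len(complete_cases(TOY, ["language"])))
print(len(complete_cases(TOY, ["language", "platform"])))
print({k: len(v) for k, v in group_with_na(TOY, "language").items()})
```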
Explanation of impact
• Partition individuals into groups
• Partition = explanation [Cowell, Jenkins 1995]
• Inequality within the groups and between the groups
− Inequality indices
• Better explanation: more inequality between the groups
− The lilac partition is better than the red one
− Partition refinement doesn’t deteriorate the explanation
/ SET / W&I / TU/e PAGE 7
Which inequality index?
• We need a decomposable index applicable to negative values
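The slides do not name the index that was chosen. As a stand-in only, the sketch below uses the variance, which is decomposable into a between-group and a within-group part and is defined for negative values. It illustrates the idea of "explanation %" as the between-group share of the total inequality; it is not the authors' index, and the function names are illustrative.

```python
from collections import defaultdict


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    """Population variance: a stand-in inequality index, not the authors'."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def decompose(values, labels):
    """Split total variance into between-group and within-group parts.

    Returns (between, within, share), where share = between / total is
    the fraction of inequality explained by the partition. Assumes the
    values are not all equal (total variance > 0).
    """
    groups = defaultdict(list)
    for v, g in zip(values, labels):
        groups[g].append(v)
    n = len(values)
    overall = mean(values)
    between = sum(len(g) * (mean(g) - overall) ** 2
                  for g in groups.values()) / n
    within = sum(len(g) * variance(g) for g in groups.values()) / n
    return between, within, between / variance(values)


# Toy values, some negative, partitioned by a hypothetical attribute.
values = [-1.0, -0.5, 0.5, 1.0]
between, within, share = decompose(values, ["a", "a", "b", "b"])
print(between, within, share)
```

With this stand-in, a partition into a single group explains nothing (share 0), a partition into singletons explains everything (share 1), and refining a partition never lowers the share, which mirrors the refinement property stated on the previous slide.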
Results
Project attribute               Explanation %
                                No missing values (N = 151)   Missing values (N = 1609)
Primary programming language    25.37%                        16.11%
Organisation type               17.59%                        18.36%
Year of the project             10.88%                        5.41%
Architecture                    8.68%                         3.35%
Development platform            5.43%                         5.05%
Intended market                 4.61%                         1.57%
Language type                   2.45%                         1.28%
Development type                0.05%                         0.07%
Indonesia: expenditure by province 18.9%
Indonesia: expenditure by education level 32.6%
Indonesia: expenditure by gender 2.6%
Linux: LOC by package 17.4%
Linux: LOC by implementation language 5.32%
Linux: LOC by maintainer 4.45%
Conclusions
• Three groups of attributes
• High-impact: primary programming language, organization type
• Middle-impact
− year of the project [cf. Kitchenham et al. 2002]
− architecture, development platform
• Low-impact: intended market, language type, development type
• A new technique for analysis of effort fp
Future work
• Partition should be MECE (mutually exclusive, collectively exhaustive)
• “Wholesale & Retail Trade” and “Financial, Property & Business Services”
• New aggregation/explanation techniques
• Conjecture: relative importance of attributes will be the same for other datasets
• Models based on data from multiple companies are not applicable when a single company's data is considered [Ruhe 1999]
• Both multi-company and company-specific studies are needed