Decision Support Analysis for Software Effort Estimation by Analogy
Jingzhou Li
Guenther Ruhe
University of Calgary, Canada
PROMISE’07, May 20, 2007
Outline
Technology (evaluation)
Decision making
Empirical studies
Which technology is suitable for which situations?
What empirical evidence supports the decision?
Software effort estimation by analogy (EBA)
What are the optional methods for EBA?
What are the basic decision-making problems?
What empirical evidence supports the decision-making?
Empirical study (an example)
Decision–centric process model of EBA
1. Estimation by analogy: An introduction
[Figure: historical data and a new object are input to EBA, which outputs an effort estimate.]
Three steps:
1. Search for analogs (similar objects)
2. Determine the closest analogs
3. Predict by analogy adaptation
Historical data:
      a1    a2    …     am    Effort
r1    v11   …     v1m   e1
r2
…     …     vij   …     …
rn    vn1   ...   vnm   en
New object:
      a1    a2    …     am    Effort
sg    vg1   vg2   …     vgm   ?
How many analogs should we use?
What adaptation strategy should we use?
What if there are missing values?
What similarity measures should we use?
1. What basic tasks must a user accomplish in order to apply or customize EBA?
2. What are the basic decision-making problems and their solution alternatives for applying or customizing EBA?
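The three steps of estimation by analogy can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes purely numeric attributes with no missing values, an unweighted Euclidean distance as the similarity measure, a fixed number of analogs `k`, and the plain mean as the adaptation strategy. The function names and the sample data are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Unweighted Euclidean distance between two equal-length attribute vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def estimate_by_analogy(history, target, k=2):
    """Estimate effort for `target` from `history`, a list of (attributes, effort)."""
    # Step 1: search for analogs by scoring every historical object.
    scored = [(euclidean_distance(attrs, target), effort) for attrs, effort in history]
    # Step 2: determine the k closest analogs (smallest distance first).
    closest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Step 3: analogy adaptation, here the plain mean of the analogs' efforts.
    return sum(effort for _, effort in closest) / len(closest)


# Hypothetical historical objects: (attribute vector, known effort).
history = [([1.0, 2.0], 10.0), ([1.5, 2.5], 14.0), ([8.0, 9.0], 60.0)]
estimate = estimate_by_analogy(history, [1.2, 2.1], k=2)
```

Each open question on the slide corresponds to one of the assumptions fixed here: the value of `k`, the adaptation in step 3, the distance function, and the absence of missing values.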
2. Decision-centric process model of EBA
[Figure: decision-centric process model of EBA. Data items: Raw Historical Data, Processed Historical Data, Objects Under Estimation, Effort Estimates. Decision problems shown in the model:]
D1. Impact analysis of missing values
D2. Dealing with missing values
D3. Object selection
D4. Discretization of attributes
D5. Attribute weighting & selection
D6. Determining similarity measures
D7. Retrieving analogs
D8. Determining closest analogs
D9. Analogy adaptation
D10. Choosing evaluation criteria
D11. Comparing EBA methods in general
3. Decision problems of EBA and solution alternatives
ID | Decision problem | Typical solution alternatives
D1 | Impact analysis of missing values | Preliminary knowledge
D2 | Dealing with missing values | Deletion and imputation techniques; NULL value
D3 | Object selection | Hill climbing, simulated annealing, forward and backward sequential selection algorithms
D4 | Discretization of continuous attributes | For RSA-based attribute weighting; based on interval, frequency, or both; other techniques used in machine learning
D5 | Attribute weighting and selection | S5.1: Brute-force attribute selection; S5.2: WRAPPER attribute selection; S5.3: Rough Sets based attribute selection; S5.4: Attribute weighting using regression; S5.5: Attribute weighting using genetic algorithm; S5.6-S5.9: Attribute weighting using Rough Sets (heuristics H1 to H4)
D6 | Determining similarity measures | Distance-based, local-global similarity principle
D7 | Retrieving analogs | Using similarity measures or rule-based heuristics
D8 | Determining closest analogs | Fixed number of analogs without considering similarity measure; through learning process
D9 | Analogy adaptation strategy | Mean, weighted mean, linear extrapolation
D10 | Choosing evaluation criteria | Some conventional criteria, e.g. MMRE, Pred
D11 | EBA comparison methods in general | Accuracy-based methods
where Si.j represents the j-th solution alternative of decision problem Di
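The conventional evaluation criteria named for D10 can be made concrete with a short sketch. It uses the standard definitions: the magnitude of relative error MRE = |actual - estimated| / actual, MMRE as the mean MRE, and Pred(l) as the fraction of estimates with MRE at most l. The threshold 0.25 and the sample numbers are illustrative assumptions, not values from the slides.

```python
def mre(actual, estimated):
    """Magnitude of relative error for a single estimate."""
    return abs(actual - estimated) / actual


def mmre(actuals, estimates):
    """Mean magnitude of relative error over all estimates (lower is better)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(errors) / len(errors)


def pred(actuals, estimates, level=0.25):
    """Fraction of estimates whose MRE is at most `level` (higher is better)."""
    errors = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(1 for err in errors if err <= level) / len(errors)


# Hypothetical actual and estimated efforts for four projects.
actuals = [100.0, 200.0, 50.0, 80.0]
estimates = [110.0, 150.0, 50.0, 120.0]
mmre_value = mmre(actuals, estimates)
pred_25 = pred(actuals, estimates, level=0.25)
```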
3. Decision problems of EBA and solution alternatives (continued)
General form of EBA:
EBA = F(D1, D2, …, D11)
where the domain of Di is {Si.j}, the solution alternatives of Di, and F is an amalgamation function.
Customization of EBA:
A specific EBA is obtained for a given data set DB by choosing specific solution alternatives Si.j for the Di and aggregating them through the function F:
EBA(DB) = F(D1, D2, …, D11, DB)
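One way to read the general form is as a configuration: each decision problem Di takes one value from its domain {Si.j}. The sketch below assumes that reading. The slides do not define F, so `customize` only validates and collects the choices. The domains are deliberately partial and contain only the alternative IDs that the slides name explicitly.

```python
# Partial domains {Si.j}: only the alternative IDs named explicitly in the slides.
DOMAINS = {
    "D2": {"S2.3"},
    "D4": {"S4.2"},
    "D5": {f"S5.{j}" for j in range(1, 10)},  # S5.1 to S5.9
    "D6": {"S6.5"},
    "D7": {"S7.1"},
    "D8": {"S8.2"},
    "D9": {"S9.1"},
}


def customize(choices):
    """Check that each chosen alternative belongs to its decision problem's domain."""
    for decision, alternative in choices.items():
        if decision not in DOMAINS:
            raise KeyError(f"unknown decision problem: {decision}")
        if alternative not in DOMAINS[decision]:
            raise ValueError(f"{alternative} is not a listed alternative of {decision}")
    # Hypothetical stand-in for F: the customization is just the ordered choices.
    return dict(sorted(choices.items()))


aqua_plus = customize({
    "D2": "S2.3", "D4": "S4.2", "D5": "S5.6", "D6": "S6.5",
    "D7": "S7.1", "D8": "S8.2", "D9": "S9.1",
})
```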
3. Decision problems of EBA and solution alternatives: Customization of EBA
EBA = F (D1, D2, …, D11)
[Figure: data sets are classified according to their characteristics into data set types 1, 2, …, k; each type has its own customization (Customization 1, Customization 2, …, Customization k).]
Si.j for Di?
3. How can empirical studies be used to support decision-making regarding the customization of EBA?
4. Decision support in an example EBA method: AQUA+
[Figure: AQUA+ process]
Pre-Phase (D2, D6): pre-process the raw historical data (missing value, attribute type…) into the data set for AQUA+
Phase0 (D4, D5): attribute weighting and selection, producing attributes & weights
Phase1 (D8): learning, producing the learned accuracy distribution
Phase2 (D7, D9): predicting effort estimates for the objects under estimation
S2.3: NULL value
S6.5: local-global similarity, weighted mean of local-similarity measures
S4.2: equal frequency and equal width discretization
S5.6-S5.9: RSA-based attribute weighting, heuristics H1-H4
S7.1: similarity measure
S8.2: learning process
S9.1: adaptation using weighted mean
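The two discretization schemes named in S4.2 can be sketched as follows. This is a generic illustration of equal-width and equal-frequency binning. How AQUA+ combines the two is not stated in the slides, so they are shown separately, and the function names and sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        return [0 for _ in values]  # all values identical: a single bin
    # The maximum would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign each value to one of n_bins bins holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, index in enumerate(order):
        # Bins are assigned by rank, so tied values may end up in different bins.
        bins[index] = rank * n_bins // len(values)
    return bins


# Hypothetical skewed attribute values.
sizes = [1, 2, 3, 4, 100, 200]
by_width = equal_width_bins(sizes, 3)
by_frequency = equal_frequency_bins(sizes, 3)
```

On skewed data the two schemes differ sharply: equal width puts the four small values into one bin, while equal frequency spreads the objects evenly.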
General form of AQUA+:
AQUA+ = F(D2(S2.3), D4(S4.2), D5(S5.6), D6(S6.5), D7(S7.1), D8(S8.2), D9(S9.1))
For a specific type of data set DB:
AQUA+(DB) = ? e.g. S5.6-S5.9: H1-H4?
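The alternatives S6.5 (global similarity as a weighted mean of local similarities) and S9.1 (adaptation using weighted mean) can be illustrated with a small sketch. Several details are assumptions, not taken from the slides: quantitative attributes use a range-normalized distance, non-quantitative attributes use exact match, a NULL value (`None`) causes the attribute to be skipped for that pair, and efforts are weighted by similarity. All names and data are hypothetical.

```python
def local_similarity(x, y, value_range=None):
    """Similarity in [0, 1] for one attribute; None marks a NULL (missing) value."""
    if x is None or y is None:
        return None  # assumption: a NULL value makes the attribute incomparable
    if value_range is None:
        return 1.0 if x == y else 0.0  # non-quantitative attribute: exact match
    return 1.0 - abs(x - y) / value_range  # quantitative attribute


def global_similarity(a, b, weights, ranges):
    """Weighted mean of the local similarities over attributes that are not NULL."""
    total, weight_sum = 0.0, 0.0
    for x, y, w, r in zip(a, b, weights, ranges):
        s = local_similarity(x, y, r)
        if s is not None:
            total += w * s
            weight_sum += w
    return total / weight_sum if weight_sum else 0.0


def weighted_mean_effort(analogs):
    """Adapt efforts of (similarity, effort) pairs, weighting by similarity."""
    sim_sum = sum(s for s, _ in analogs)
    if sim_sum == 0:
        return sum(e for _, e in analogs) / len(analogs)  # fall back to plain mean
    return sum(s * e for s, e in analogs) / sim_sum


# Hypothetical attributes: application type, team size, and one often-missing value.
weights = [1.0, 2.0, 1.0]
ranges = [None, 20.0, 5.0]  # None marks a non-quantitative attribute
target = ["web", 10.0, None]
history = [(["web", 12.0, 3.0], 100.0), (["desktop", 20.0, 1.0], 200.0)]
analogs = [(global_similarity(target, attrs, weights, ranges), effort)
           for attrs, effort in history]
estimate = weighted_mean_effort(analogs)
```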
4. Decision support in an example EBA method: Comparative study
Data sets used in the comparative study
Data set | #Objects | #Attributes | %Missing values | %Non-quantitative attributes | Source
USP05-RQ | 121 | 14 | 2.54 | 71 | Jingzhou et al., 2005
USP05-FT | 76 | 14 | 6.8 | 71 | Jingzhou et al., 2005
ISBSG04-2 | 158 | 24 | 27.24 | 63 | ISBSG, 2004
Kem87 | 15 | 5 | 0 | 40 | Kemerer et al., 1987
Mends03 | 34 | 6 | 0 | 0 | Mendes et al., 2003
4. Decision support in an example EBA method: Comparative study
Comparison of the four attribute weighting heuristics (AccuH[i] per data set and weighting heuristic)
Data set | H0 | H1 | H2 | H3 | H4
USP05-FT | 0.22 | 0.42 | -1.53 | 0.52 | 0.37
USP05-RQ | -0.79 | 0.03 | − | 0.62 | 0.15
ISBSG04-2 | 0.16 | 1.81 | -2.62 | 0.30 | 0.35
Kem87 | -0.09 | 0.15 | − | -0.05 | -0.05
Mends03 | -0.48 | 1.42 | 1.42 | -0.47 | -0.47
Tentative conclusions:
1. H1 and H3 performed the best, hence RSA-based attribute weighting is recommended for use by AQUA+.
2. H1 performed better than H0 for all data sets, hence it is recommended for use in AQUA+.
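The tentative conclusions can be checked against the AccuH values with a few lines of code. The numbers are copied from the comparison table. The sketch assumes that a larger AccuH means better accuracy, as the conclusions imply, and treats the two empty H2 cells as unavailable (`None`).

```python
# AccuH values from the comparison table; None marks a cell with no value.
ACCU = {
    "USP05-FT":  {"H0": 0.22,  "H1": 0.42, "H2": -1.53, "H3": 0.52,  "H4": 0.37},
    "USP05-RQ":  {"H0": -0.79, "H1": 0.03, "H2": None,  "H3": 0.62,  "H4": 0.15},
    "ISBSG04-2": {"H0": 0.16,  "H1": 1.81, "H2": -2.62, "H3": 0.30,  "H4": 0.35},
    "Kem87":     {"H0": -0.09, "H1": 0.15, "H2": None,  "H3": -0.05, "H4": -0.05},
    "Mends03":   {"H0": -0.48, "H1": 1.42, "H2": 1.42,  "H3": -0.47, "H4": -0.47},
}


def best_heuristics(scores):
    """Return every heuristic that reaches the highest available AccuH value."""
    available = {h: v for h, v in scores.items() if v is not None}
    top = max(available.values())
    return [h for h, v in available.items() if v == top]


best = {name: best_heuristics(scores) for name, scores in ACCU.items()}
h1_beats_h0 = all(scores["H1"] > scores["H0"] for scores in ACCU.values())
```

Under that assumption, H3 is best on the two USP05 data sets and H1 is best or tied for best on the other three.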
4. Decision support in an example EBA method: Apply the knowledge obtained from the comparative study
[Figure: scatter plot of the data sets USP05-RQ, USP05-FT, ISBSG04-2, Kem87, and Mends03. x-axis: %Missing values; y-axis: %Non-quantitative attributes. Class annotations: "H3 is suitable for this class" (one class) and "H1 is suitable for this class" (two classes). A new data set is marked with the question: Which heuristic should be used?]
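One simple way to answer the question for a new data set is to place it in the same two-dimensional space and reuse the recommendation of the nearest known data set. The slides do not say how classification is done, so the nearest-neighbour rule here is an assumption. The coordinates come from the data set table. The recommended heuristic per data set is read off the AccuH comparison, taking H1 for Mends03 where H1 and H2 tie.

```python
import math

# (%missing values, %non-quantitative attributes) -> best heuristic in the study.
KNOWN = {
    "USP05-RQ":  ((2.54, 71.0), "H3"),
    "USP05-FT":  ((6.8, 71.0), "H3"),
    "ISBSG04-2": ((27.24, 63.0), "H1"),
    "Kem87":     ((0.0, 40.0), "H1"),
    "Mends03":   ((0.0, 0.0), "H1"),
}


def recommend(missing_pct, non_quant_pct):
    """Return (nearest known data set, its heuristic) by Euclidean distance."""
    def distance(name):
        (m, q), _ = KNOWN[name]
        return math.hypot(m - missing_pct, q - non_quant_pct)

    nearest = min(KNOWN, key=distance)
    return nearest, KNOWN[nearest][1]


# Hypothetical new data sets.
first = recommend(5.0, 70.0)
second = recommend(25.0, 60.0)
```

With only five reference data sets this is illustrative at best. A practical knowledge base would need more studies and probably more characteristics than these two.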
5. Decision support and empirical studies
[Figure: relationship between Empirical studies, Knowledge base (e.g. knowledge about which alternatives are suitable for which types of data set), DSS for EBA, and Application or customization of EBA. For a new data set: classify, apply knowledge, customize EBA.]
6. Summary and future work
Decision-centric process model
Decision problems and solution alternatives
Example EBA AQUA+
Empirical studies
Knowledge base
DSS for EBA
Decision support
Major references
G. Ruhe, "Software Engineering Decision Support - A New Paradigm for Learning Software Organizations", Advances in Learning Software Organization, Lecture Notes In Computer Science, Vol. 2640, Springer 2003, pp 104-115.
V.R. Basili, G. Caldiera, and H.D. Rombach, "Experience Factory", Encyclopedia of Software Engineering (Eds. J. Marciniak), Vol. 1, 2001, pp 511-519.
G. Ruhe, "Software Engineering Decision Support and Empirical Investigations - A Proposed Marriage", The Future of Empirical Studies in Software Engineering (A. Jedlitschka, M. Ciolkowski, Eds.), Workshop Series on Empirical Studies in Software Engineering, Vol. 2, 2003, pp 25-34.
M. Shepperd, C. Schofield, "Estimating Software Project Effort Using Analogies", IEEE Transactions on Software Engineering, 23(1997) 736-743.
J.Z. Li, G. Ruhe, A. Al-Emran, and M.M. Richter, "A Flexible Method for Effort Estimation by Analogy", Empirical Software Engineering, Vol. 12, No. 1, 2007, pp 65-106.
J.Z. Li, G. Ruhe, "Software Effort Estimation by Analogy Using Attribute Weighting Based on Rough Sets", International Journal of Software Engineering and Knowledge Engineering, To appear.
J.Z. Li, A. Ahmed, G. Ruhe, "Impact Analysis of Missing Values on the Prediction Accuracy of Analogy-based Software Estimation Method AQUA", ESEM’07, Madrid, Spain, September 2007.
Thank you!
Comments and questions?
A preliminary DSS framework for EBA
[Figure: preliminary DSS framework for EBA, with the following components:]
Interface
Tools: Machine learning and reasoning tools; Knowledge representation and acquisition; General data analysis tools
Virtual DB: Database; Documents; Web contents, hypermedia; Other forms of contents; …
Virtual KB: Model base; Rule base; Domain knowledge; …
Decision-centric EBA process (from objects under estimation to effort estimates): Dealing with missing values; Attribute weighting and selection; Discretization of attributes; Object selection; Determining similarity measures; Retrieving & determining analogs; Analogy adaptation strategy; General EBA comparison methods; …