The Human Competitiveness of Search Based Software Engineering
Optimization in Software Engineering Group (GOES.UECE)
State University of Ceará, Brazil
Jerffeson Teixeira de Souza, Camila Loiola Maia, Fabrício Gomes de Freitas, Daniel Pinto Coutinho
2nd International Symposium on Search Based Software Engineering
September 7 – 9, 2010
Benevento, Italy
Jerffeson Teixeira de Souza, Ph.D.
Professor, State University of Ceará, Brazil
http://goes.comp.uece.br/
[email protected]
Nice to meet you.
Our little time will be divided as follows:
Part 01: Research Questions
Part 02: Experimental Design
Part 03: Results and Analyses
Part 04: Final Considerations
The question regarding the human competitiveness of SBSE has already been raised, but no comprehensive work on it has been published to date.
Why?
Mark Harman, The Current State and Future of Search Based Software Engineering, Proceedings of International Conference on Software Engineering / Future of Software Engineering 2007 (ICSE/FOSE '07), Minneapolis: IEEE Computer Society, pp. 342-357, 2007.
One may argue that the human competitiveness of SBSE is not in doubt within the SBSE community!
But, even if that is the case, strong research results on this issue would, at the very least, contribute to the increasing acceptance of SBSE outside its research community!
Can the results generated by Search Based Software Engineering be said to be human competitive? (SBSE human competitiveness)
But how to evaluate the human competitiveness of SBSE?
"The result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs)."
SSBSE 2010: HUMANS VS MACHINE
THURSDAY, SEPTEMBER 9 – 11:30 CEST / 05:30 ET
LIVE ON PAY-PER-VIEW
FROM THE SOLD-OUT MUSEUM OF SANNIO ARENA, BENEVENTO, ITALY
SSBSE 2010: HUMANS VS SBSE ALGORITHMS
THURSDAY, SEPTEMBER 9 – 11:30 CEST / 05:30 ET
LIVE ON PAY-PER-VIEW
FROM THE SOLD-OUT MUSEUM OF SANNIO ARENA, BENEVENTO, ITALY
Which ones?
Can the results generated by Search Based Software Engineering be said to be human competitive? (SBSE human competitiveness)
How do different metaheuristics compare in solving a variety of search based software engineering problems? (SBSE algorithms comparison)
The Problems
The Next Release Problem
The Multi-Objective Next Release Problem
The Workgroup Formation Problem
The Multi-Objective Test Case Selection Problem
Motivation: these can be considered "classical" formulations and, together, they cover three different general phases of the software development life cycle.
THE NEXT RELEASE PROBLEM
Involves determining the set of customers whose selected requirements will be delivered in the next software release.
This selection prioritizes customers with higher importance to the company and must respect a pre-determined budget:

maximize   Σ_{i∈S} w_i
subject to   cost(∪_{i∈S} R_i) ≤ B
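As a concrete illustration, the formulation can be solved exhaustively on a toy instance. The data below (customer weights, requirement sets, costs, and budget) is entirely hypothetical, and exhaustive search works only at this scale; the paper applies GA and SA to realistically sized instances of the same formulation.

```python
from itertools import combinations

# Hypothetical toy instance: customer importance w_i, each customer's
# requirement set R_i, per-requirement implementation costs, and budget B.
weights = {1: 5, 2: 3, 3: 8}
reqs = {1: {"r1", "r2"}, 2: {"r2"}, 3: {"r3"}}
cost = {"r1": 4, "r2": 2, "r3": 7}
budget = 9

def profit(S):
    # Sum of weights of the selected customers.
    return sum(weights[i] for i in S)

def total_cost(S):
    # Cost of the union of requirements needed by the selected customers.
    needed = set().union(*(reqs[i] for i in S)) if S else set()
    return sum(cost[r] for r in needed)

# Enumerate every customer subset that fits the budget and keep the best.
feasible = (S for k in range(len(weights) + 1)
            for S in combinations(weights, k)
            if total_cost(S) <= budget)
best = max(feasible, key=profit)
print(best, profit(best), total_cost(best))  # → (2, 3) 11 9
```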
THE MULTI-OBJECTIVE NEXT RELEASE PROBLEM
The cost of implementing the selected requirements is taken as an independent objective to be optimized, not as a constraint, along with a score representing the importance of each requirement:

minimize   Σ_{i=1}^{n} cost_i · x_i
maximize   Σ_{i=1}^{n} score_i · x_i
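A minimal sketch of how the two objectives trade off, using a random-search baseline over hypothetical cost and score data and a standard Pareto-dominance filter:

```python
import random

random.seed(1)

n = 8
cost = [random.randint(1, 10) for _ in range(n)]    # cost_i (hypothetical)
score = [random.randint(1, 10) for _ in range(n)]   # score_i (hypothetical)

def objectives(x):
    # x is a 0/1 inclusion vector: minimize total cost, maximize total score.
    c = sum(ci for ci, xi in zip(cost, x) if xi)
    s = sum(si for si, xi in zip(score, x) if xi)
    return c, s

def dominates(a, b):
    # a dominates b: no worse on both objectives, strictly better somewhere.
    return a[0] <= b[0] and a[1] >= b[1] and a != b

# Random search baseline: sample solutions, keep only the non-dominated ones.
sols = {tuple(random.randint(0, 1) for _ in range(n)) for _ in range(200)}
objs = {x: objectives(x) for x in sols}
front = [x for x in sols if not any(dominates(objs[y], objs[x]) for y in sols)]
print(len(front))
```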
THE WORKGROUP FORMATION PROBLEM
Deals with the allocation of human resources to project tasks.
The formulation displays a single objective function to be minimized, which composes salary costs with skill and preference factors. Its salary-cost component is

Σ_{p=1}^{P} Σ_{a=1}^{N} Sal_p · A_{pa} · Dur_a

together with the skill and preference terms and the allocation constraints of the original formulation.
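The salary-cost component of the objective (Sal_p · A_pa · Dur_a summed over persons and activities) can be computed directly; the sizes, salaries, durations, and allocation matrix below are hypothetical:

```python
# Hypothetical instance sizes: P persons, N activities.
P, N = 3, 4
Sal = [100, 120, 90]              # Sal_p: salary of person p per unit time
Dur = [2, 1, 3, 2]                # Dur_a: duration of activity a
# A[p][a] = 1 if person p is allocated to activity a, else 0.
A = [[1, 0, 0, 1],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

# Salary-cost term: sum of Sal_p * A_pa * Dur_a over all p and a.
salary_cost = sum(Sal[p] * A[p][a] * Dur[a]
                  for p in range(P) for a in range(N))
print(salary_cost)  # → 790  (100*2 + 100*2 + 120*1 + 90*3)
```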
THE MULTI-OBJECTIVE TEST CASE SELECTION PROBLEM
Extends previously published mono-objective formulations.
The paper discusses two variations: one considering two objectives (code coverage and execution time), used here, and another covering three objectives (code coverage, execution time and fault detection).
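A sketch of the two-objective evaluation (code coverage to maximize, execution time to minimize) on hypothetical test data:

```python
# Hypothetical data: which code blocks each test case covers, and how
# long each test case takes to run.
blocks_covered = {
    "t1": {1, 2, 3},
    "t2": {3, 4},
    "t3": {5},
}
exec_time = {"t1": 4.0, "t2": 1.5, "t3": 2.0}
total_blocks = 6

def evaluate(selection):
    # Objective 1 (maximize): percentage of code blocks covered.
    # Objective 2 (minimize): total execution time of the selected tests.
    covered = (set().union(*(blocks_covered[t] for t in selection))
               if selection else set())
    coverage = 100.0 * len(covered) / total_blocks
    time = sum(exec_time[t] for t in selection)
    return coverage, time

print(evaluate({"t1", "t3"}))
```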
The Data
For each problem (NRP, MONRP, WORK and TEST), two instances, A and B, with increasing sizes, were synthetically generated.
INSTANCES FOR PROBLEM NRP
Instance   # Customers   # Requirements
NRPA       10            20
NRPB       20            40

INSTANCES FOR PROBLEM MONRP
Instance   # Customers   # Requirements
MONRPA     10            20
MONRPB     20            40

INSTANCES FOR PROBLEM WORK
Instance   # Persons   # Skills   # Activities
WORKA      10          5          5
WORKB      20          10         10

INSTANCES FOR PROBLEM TEST
Instance   # Test Cases   # Code Blocks
TESTA      20             40
TESTB      40             80
The Algorithms
For mono-objective problems: Genetic Algorithm (GA), Simulated Annealing (SA)
For multi-objective problems: NSGA-II, MOCell
For both mono- and multi-objective problems: Random Search
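As an illustration of the mono-objective search side, here is a generic simulated annealing sketch on a toy bit-string maximization problem; the objective, parameters, and cooling schedule are hypothetical, not the paper's setup:

```python
import math
import random

random.seed(0)

def fitness(x):
    # Toy maximization objective ("OneMax"): count of 1-bits.
    return sum(x)

def anneal(n=20, steps=2000, t0=1.0, alpha=0.995):
    x = [random.randint(0, 1) for _ in range(n)]
    best = list(x)
    t = t0
    for _ in range(steps):
        y = list(x)
        y[random.randrange(n)] ^= 1          # neighbour: flip one random bit
        delta = fitness(y) - fitness(x)
        # Always accept improvements; accept worsenings with probability
        # exp(delta / t), which shrinks as the temperature cools.
        if delta >= 0 or random.random() < math.exp(delta / t):
            x = y
        if fitness(x) > fitness(best):
            best = list(x)
        t *= alpha                           # geometric cooling schedule
    return best

print(fitness(anneal()))
```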
NUMBER OF HUMAN RESPONDENTS PER INSTANCE
Human Subjects
A total of 63 professional software engineers solved some or all of the instances.
Human SubjectsBesides solving the problem instance, each respondent wasasked to answer the following questions related to each problem instance
How hard was it to solve this problem instance?
How hard would it be for you to solve an instance twice this size?
What do you think the quality of a solution generated by you over an instance twice this size would be?
In addition to these specific questions regarding each problem instance, general questions on the respondent theoretical and
practical experience over software engineering
Comparison Metrics
For mono-objective problems: Quality
For multi-objective problems: Hypervolume, Spread, Number of Solutions
For both mono- and multi-objective problems: Execution Time
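Hypervolume measures the objective-space region dominated by a solution set relative to a reference point. A minimal 2-D sketch, assuming both objectives are minimized and a hypothetical reference point dominated by every front point:

```python
# 2-D hypervolume: area dominated by a non-dominated set, bounded by a
# reference point (assumed worse than every point on both objectives).
def hypervolume_2d(front, ref):
    pts = sorted(front)                  # ascending in f1 -> descending in f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)   # rectangle added by this point
        prev_f2 = f2
    return hv

front = [(2.0, 2.0), (1.0, 4.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(5.0, 5.0)))  # → 12.0
```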
How do different metaheuristics compare in solving a variety of search based software engineering problems? (SBSE algorithms comparison)
RESULTS AND ANALYSES
RESULTS
Quality of results for NRP and WORK (averages and standard deviations over 100 executions)
Problem   GA                   SA                    RAND
NRPA      26.45±0.500          25.74±0.949           15.03±5.950
NRPB      95.41±0.190          90.47±7.023           45.74±11.819
WORKA     16,026.17±51.700     18,644.71±1,260.194   19,391.34±1,220.17
WORKB     24,831.23±388.107    35,174.19±2,464.733   36,892.64±2,428.269
Time (in milliseconds) for NRP and WORK (averages and standard deviations over 100 executions)
Problem   GA                  SA                  RAND
NRPA      40.92±11.112        23.01±7.476         0.00±0.002
NRPB      504.72±95.665       292.62±55.548       0.06±0.016
WORKA     242.42±44.117       73.35±19.702        0.04±0.010
WORKB     4,797.89±645.360    2,211.28±234.256    1.75±0.158
[Figure: boxplots showing average, maximum, minimum and 25%–75% quartile ranges of quality for mono-objective problems NRP and WORK, instances A and B, for GA, SA and Random Search. Panels: NRPA, NRPB, WORKA, WORKB.]
Hypervolume results for MONRP and TEST (averages and standard deviations over 100 executions)
Problem   NSGA-II         MOCell          RAND
MONRPA    0.6519±0.009    0.6494±0.013    0.5479±0.0701
MONRPB    0.6488±0.015    0.6470±0.017    0.5462±0.0584
TESTA     0.5997±0.009    0.5867±0.019    0.5804±0.0648
TESTB     0.6608±0.020    0.6243±0.044    0.5673±0.1083
Spread results for MONRP and TEST (averages and standard deviations over 100 executions)
Problem   NSGA-II         MOCell          RAND
MONRPA    0.4216±0.094    0.3973±0.031    0.5492±0.1058
MONRPB    0.4935±0.098    0.3630±0.032    0.5504±0.1081
TESTA     0.4330±0.076    0.2659±0.038    0.5060±0.1029
TESTB     0.3503±0.178    0.2963±0.072    0.4712±0.1410
Time (in milliseconds) for MONRP and TEST (averages and standard deviations over 100 executions)
Problem   NSGA-II              MOCell               RAND
MONRPA    1,420.48±168.858     993.09±117.227       25.30±10.132
MONRPB    1,756.71±138.505     1,529.32±141.778     30.49±7.204
TESTA     1,661.03±125.131     1,168.47±142.534     25.24±11.038
TESTB     1,693.37±138.895     1,370.96±127.953     32.89±9.335
Number of solutions for MONRP and TEST (averages and standard deviations over 100 executions)
Problem   NSGA-II        MOCell         RAND
MONRPA    31.97±5.712    25.01±5.266    12.45±1.572
MONRPB    60.56±4.835    48.04±4.857    20.46±2.932
TESTA     35.43±4.110    26.20±5.971    12.54±1.282
TESTB     41.86±9.670    19.93±8.514    11.58±2.184
[Figure: example of the obtained solution sets for NSGA-II, MOCell and Random Search over problem MONRP, instance A. Axes: cost vs. value.]
[Figure: example of the obtained solution sets for NSGA-II, MOCell and Random Search over problem MONRP, instance B. Axes: cost vs. value.]
[Figure: example of the obtained solution sets for NSGA-II, MOCell and Random Search over problem TEST, instance A. Axes: cost vs. % coverage.]
[Figure: example of the obtained solution sets for NSGA-II, MOCell and Random Search over problem TEST, instance B. Axes: cost vs. % coverage.]
Can the results generated by Search Based Software Engineering be said to be human competitive? (SBSE human competitiveness)
RESULTS AND ANALYSES
Quality and time (in milliseconds) for NRP and WORK (averages and standard deviations)
Problem   SBSE Quality         SBSE Time            Humans Quality          Humans Time
NRPA      26.48±0.512          40.57±9.938          16.19±6.934             1,731,428.57±2,587,005.57
NRPB      95.77±0.832          534.69±91.133        77.85±23.459            3,084,000.00±2,542,943.10
WORKA     16,049.72±121.858    260.00±50.384        28,615.44±12,862.590    2,593,846.15±1,415,659.62
WORKB     25,047.40±322.085    4,919.30±1,219.912   50,604.60±20,378.740    5,280,000.00±3,400,588.14
[Figure: boxplots showing average, maximum, minimum and 25%–75% quartile ranges of quality for mono-objective problems NRP and WORK, instances A and B, for SBSE and the human subjects. Panels: NRPA, NRPB, WORKA, WORKB.]
Hypervolume and time (in milliseconds) for SBSE and humans for MONRP and TEST (averages and standard deviations)
Problem   SBSE HV         SBSE Time           Humans HV   Humans Time
MONRPA    0.6519±0.009    1,420.48±168.858    0.4448      1,365,000.00±1,065,086.42
MONRPB    0.6488±0.015    1,756.71±138.505    0.2870      2,689,090.91±2,046,662.91
TESTA     0.5997±0.009    1,661.03±125.131    0.4878      1,472,307.69±892,171.07
TESTB     0.6608±0.020    1,693.37±138.895    0.4979      3,617,142.86±3,819,431.52
[Figure: solutions generated by humans, and non-dominated solution sets produced by NSGA-II and MOCell, for problem MONRP, instance A. Axes: cost vs. value.]
[Figure: solutions generated by humans, and non-dominated solution sets produced by NSGA-II and MOCell, for problem MONRP, instance B. Axes: cost vs. value.]
[Figure: solutions generated by humans, and non-dominated solution sets produced by NSGA-II and MOCell, for problem TEST, instance A. Axes: cost vs. % coverage.]
[Figure: solutions generated by humans, and non-dominated solution sets produced by NSGA-II and MOCell, for problem TEST, instance B. Axes: cost vs. % coverage.]
FURTHER HUMAN COMPETITIVENESS ANALYSES
Human participants were asked to rate how difficult they found each problem instance and how confident they were in their solutions.
[Figure: bar chart showing the percentage of human respondents who considered each problem "hard" or "very hard".]
[Figure: bar chart showing the percentage of human respondents who were "confident" or "very confident".]
[Figure: bar charts showing percentage differences in quality between SBSE and the human subjects for the mono- and multi-objective problems NRP, MONRP, WORK and TEST.]
57.33% of the human participants who responded to instance A indicated that solving instance B would be "harder" or "much harder", and 55.00% predicted that their solution for such an instance would be "worse" or "much worse".
62.50% of the instance B respondents pointed out the increased difficulty of a problem instance twice as large, and 57.14% predicted that their solution would be "worse" or "much worse".
These results suggest that, for larger problem instances, the potential of SBSE to generate more accurate results than humans increases.
In fact, this suggests that SBSE may be particularly useful in solving real-world, large-scale software engineering problems.
Threats to Validity
Small instance sizes
Artificial instances
Number and diversity of human participants
Number of problems
FINAL CONSIDERATIONS
This paper reports the results of an extensive experimental study aimed at evaluating the human competitiveness of SBSE.
Secondarily, several tests were performed over four classical SBSE problems in order to evaluate the performance of well-known metaheuristics in solving both mono- and multi-objective problems.
Regarding the comparison of algorithms:
GA generated more accurate solutions than SA for the mono-objective problems.
NSGA-II consistently outperformed MOCell in terms of hypervolume and number of generated solutions.
MOCell outperformed NSGA-II in terms of spread and execution time.
All of these results are consistent with previously published research.
Regarding the human competitiveness question:
Experiments strongly suggest that the results generated by search based software engineering can, indeed, be said to be human competitive.
Results indicate that for real-world, large-scale software engineering problems, the benefits of applying SBSE may be even greater.
That is it! Thanks for your time and attention.