CHAPTER 6
ABC TESTER - ARTIFICIAL BEE COLONY BASED
SOFTWARE TEST SUITE OPTIMIZATION APPROACH
6.1 PROBLEM FORMULATION
A test suite is a set of several test cases for a component or system
under test, where the post condition of one test is often used as the
precondition for the next one. The objective of the test suite optimization
process is to generate effective test cases in a test suite that can cover
the given Software under Test (SUT) in less time.
As per Phil McMinn, random generation of test cases leads to an
unlimited source of tests in which the selection of efficient test cases is very
difficult (McMinn 2003).
This research work proposes an ABC (Artificial Bee Colony
Optimization) based framework, motivated by the intelligent behavior of
honey bees, to automate the test suite optimization process. Here, the bees are
implemented as agents that perform the test suite optimization activities
seamlessly. Since the ABC system combines local search methods carried out
by employed and onlooker bees with global search methods managed by
scouts, the approach attains a near-global-optimal solution.
In the proposed system, the places to search are the nodes in the
SUT. Each test case is represented as a possible solution in the optimization
problem. The artificial bees modify the test cases over time, and the bees’ aim
is to discover the nodes with higher coverage and, finally, the one
with the highest usage by the given test case.
The coverage of the nodes by the test cases is identified by a
heuristic measure, namely the happiness value, which is assigned to each node
and corresponds to the quality or fitness of the associated solution. Hence, the
test cases are selected by means of an intelligent search through the SUT based
on this heuristic.
The weight value associated with each node depends on the coverage of the
test case and is calculated in terms of the happiness value. The objective
function is to maximize the happiness value, obtained by summing the fitness
values associated with each node based on constraint satisfaction. In this way,
a few efficient test cases that can cover the SUT in less time are generated and
stored in the optimal test suite repository.
6.2 REPRESENTATION OF THE SEARCH SPACE – FORMAL
PROBLEM DEFINITION
In the proposed approach, the given Software under Test (SUT) is
represented as a search space. Blocks of executable statements and
conditions are grouped as states or nodes in the representation model. These
nodes are searched to find the feasible and infeasible nodes in the SUT. The
effective test cases are identified by their coverage of the
nodes in the SUT. In this research work, the terms nodes and states are used
interchangeably. The test adequacy criteria applied here ensure path
coverage, state coverage and branch coverage.
6.2.1 Problem Environment
The Software under Test (SUT) is given as input. Let ‘n’ be the
cyclomatic complexity value, which indicates that there are ‘n’ independent test
paths in the SUT. There are ‘m’ test suites, each consisting of
several test cases that must be processed on the ‘n’ test paths / sequences.
6.2.2 Assumption
Software under Test (SUT) is well structured and without any
compilation errors. Code Instrumentation does not affect the functionality.
6.2.3 Objective Criterion
The objective of the proposed approach is to generate an efficient
test suite that can cover the SUT within less time and cost by applying
intelligent search through the SUT using the parallel behavior of a group of
three bees.
6.2.4 Mathematical Model
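The objective function for test suite optimization is:
Maximize
$$\mathrm{Happiness\_value}(\text{test-case}) \tag{6.1}$$
subject to
$$\mathrm{Happiness\_value}(\text{test-case}) = \begin{cases} 1 & \text{if } \mathrm{Coverage}(\text{test-case}) = 100\% \\ 0 & \text{otherwise} \end{cases} \tag{6.2}$$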
The objective function in (6.1) maximizes the happiness value
of each test case by identifying the node with the higher coverage value. The
constraint (6.2) defines the happiness value based on the coverage of the test
case for each node.
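As an illustration of (6.1) and (6.2), the following minimal sketch computes the happiness value of a test case against one target path. It assumes that coverage is measured as the fraction of the path's nodes that the test case exercises; the function names and the node-list representation are hypothetical and are not taken from the ABC Tester implementation.

```python
def coverage(covered_nodes, target_path):
    """Fraction of the nodes on target_path exercised by the test case.

    Assumption: coverage is counted per node of a non-empty target path.
    """
    target = set(target_path)
    return len(target & set(covered_nodes)) / len(target)


def happiness_value(covered_nodes, target_path):
    """Constraint (6.2): 1 if the path is fully covered, 0 otherwise."""
    return 1 if coverage(covered_nodes, target_path) == 1.0 else 0


def best_test_case(traces, target_path):
    """Objective (6.1): pick the test case with the maximum happiness value.

    traces maps a test case (a tuple of inputs) to the nodes it executed.
    """
    return max(traces, key=lambda tc: happiness_value(traces[tc], target_path))
```

For example, with a target path `[1, 2, 4, 13, 14]`, a trace that visits exactly those nodes receives a happiness value of 1, while a trace that visits `[1, 2, 3, 5]` covers only two of the five nodes and receives 0.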
6.3 RELATED WORK
Timothy et al (1991) proposed test coverage as an important aspect
of a validation suite for implementations of a standard such as the CAIS
(Common APSE Interface Set). Their paper presented the
development and application of a constrained optimization process for CAIS
test coverage. The approach next identified resource (time and effort) and
process constraints. A greedy algorithm was developed to provide a partial
solution to the coverage design problem. The decreasing first fit bin packing
heuristic was then applied by them to refine the coverage measure of the
validation test suite within the process constraints.
Tsong et al (1996) proposed a methodology for test suite
optimization. In their approach, they applied dividing strategies to optimize
the given test suite. The domain is divided in such a way that the test suite
contains only effective test cases.
Chen and Lau (1998) proposed a method for optimizing a test
suite. As per their paper, a test case in a test suite is said to be redundant if the
same testing objective can still be satisfied by other test cases of the test suite.
In their paper, they proposed heuristics to optimize the size of a test suite.
They also proposed a divide-and-conquer approach for test suite
optimization (2003).
Jones et al (1998) proposed a strategy using genetic algorithms to
automate branch and fault based testing. Pargas et al (1999) applied a genetic
algorithm for test data generation. In their approach, they applied GA to
generate test cases that satisfy the given test adequacy criteria. Wegener et al
(2001) applied a genetic algorithm based approach to generate test data in
structural software testing.
Tibor et al (1999) expounded a mathematical
optimization method for test suite optimization based on cost and test
coverage, and applied this method to an ISDN protocol.
Christoph et al (2001) applied genetic algorithm for software test
suite optimization. In their paper, they demonstrated the experiments with test
generation problems for larger programs and more complex test adequacy
criteria. They found a widening gap between a technique based on genetic
algorithms and those based on random test generation.
Gregg et al (2003) proposed a methodology for effective test suite
composition in regression testing. Their article reported the results of
controlled experiments examining the effects of two factors in test suite
composition, test suite granularity and test input grouping, on the costs
and benefits of several regression-testing-related methodologies: retest-all,
regression test selection, test suite reduction, and test case prioritization.
Frank and Joachim (2005) applied evolutionary technique for test
suite reduction in white-box testing. Their paper presented the development of
an evolutionary software measure which is able to predict the test effort for
individual test goals. It was based on four attributes of conditional statements.
Initial results showed that the predicted test effort for individual test goals
corresponds well to real measurements.
Lili et al (2005) proposed a new test-suite reduction technique for
modified condition/decision coverage (MC/DC): a bi-objective model that
considers both the coverage degree of test cases for test requirements and the
capability of test cases to reveal errors.
Xiaofang et al (2008) proposed a test suite optimization technique.
The aim of their paper was to provide a guideline for choosing the appropriate
test suite reduction techniques for Boolean specification-based testing. Four
typical heuristic reduction strategies: G, GE, GRE, and H were introduced to
be compared empirically. Furthermore, testing requirement optimization was
combined to enhance these four reduction strategies.
As per Praveen (2009), test data generation is one of the key issues
in software testing. A properly generated test suite may not only locate the
errors in a software system, but also help in reducing the high cost associated
with software testing. His paper proposed Genetic Algorithm for test data
generation to achieve software test optimization.
6.4 PROPOSED ARTIFICIAL BEE COLONY BASED TEST
SUITE OPTIMIZATION FRAMEWORK
6.4.1 Need for Artificial Bee Colony (ABC) Based Approach
As the outcome of the literature study on related work in software
test suite optimization, the following observations were made.
The approaches proposed by Timothy et al (1991), Tsong et al
(1996) and Tibor et al (1999) focused on test suite optimization based on
coverage based test adequacy criteria. However, their approaches have the
drawbacks of focusing on a particular interface, needing a lot of human
intervention, and being applicable only to a particular application, respectively.
The effects of granularity, and grouping technique (Gregg et al
2003) based on the cost and fault-detection effectiveness of regression testing
under the given methodologies were analyzed. The analysis showed that test
suite granularity significantly affects several cost-benefit factors for the
methodologies considered, while test input grouping has limited effects.
Further, the results exposed essential tradeoffs affecting the relationship
between test suite design and regression testing cost-effectiveness.
The approach proposed by Xiaofang et al (2008), presented the
empirical evaluation of the proposed reduction strategies based on a set of
Boolean specifications. This approach suffers from the generic applicability
problem.
MC/DC coverage (Lili et al 2005) requires showing that each
condition independently affects the outcome of a decision. This, in turn,
requires relating several test cases to one another. Model checkers, however,
can only create a single counterexample at a time and do not offer a way to
relate traces with each other.
It has also been observed that several works in recent years have
applied Genetic Algorithms to test suite optimization
(Jones et al 1998, Pargas et al 1999, Wegener et al 2001, Christoph et al 2001,
Frank and Joachim 2005 and Praveen 2009). Their general
drawbacks are getting stuck at locally optimal solutions and the lack of
memorization of the best individuals during each generation.
In the case of software systems with dynamic behavior,
testing is a much more complex task because of their multi-threaded
processing nature. This leads to the application of knowledge based
approaches in test suite optimization. However, the drawbacks of the existing
knowledge based approaches discussed in the literature study led this
research work to focus on an alternate approach for test suite optimization,
which has the advantages of population based approaches without the
problem of local optima.
A population based approach is essential for test suite optimization,
since a pool of test cases is needed to select a few efficient test cases in
forming an optimal test suite. Similarly, the swarm intelligence based
approach is selected here to apply intelligence in the searching process in
order to select the nodes with higher coverage.
The literature study on ABC, which is a population based swarm
intelligence approach, has provided considerable evidence for using it as an
alternate approach to hard optimization problems. Recent research and
development of ABC based systems focuses mostly on applications such
as financial decision making systems, transportation, manufacturing,
aerospace, military and so on (Karaboga et al 2007, 2008, Dusan et al 2006,
Wong et al 2008, Adil et al 2007, Alok Singh 2009, Srinivasa Rao et al
2008, Karaboga 2009, Mohammad et al 2007).
In light of the above considerations, this research work applies
ABC to the software test suite optimization problem. In the proposed approach,
the functionality of the bee is extended to do the testing and monitoring
activity, so that it reduces manual work and improves confidence in
the software by testing it with the coverage based test adequacy criteria. The
proposed ABC based model interacts with the developer and the SUT, and
thus helps in speeding up the development and testing process and also
provides an insight into the execution flow in the system.
6.4.2 Proposed Artificial Bee Colony Framework
The ABC Tester framework is shown in Figure 6.1. In it, the
functionalities of the three bees are extended to three agents, namely the Search
Agent, Selector Agent and Replace Agent, to produce a test suite of
efficient test cases from among a near-infinite number of test cases.
Figure 6.1 ABC Based Test Suite Optimization Framework
The proposed approach maps the intelligent searching of the
three bees onto three agents respectively. The three agents work independently
as per their assigned tasks and communicate with other agents whenever they
have to exchange information.
Because of the parallel behavior of these agents, solution
generation becomes faster, which makes the approach efficient. Since the
basic test adequacy criterion used is path coverage, the quality of the test suite
is improved during each iteration to cover the paths in the software.
(a) Artificial bee colony optimization (ABC) – an introduction
Artificial Bee Colony (ABC) is one of the more recently defined
algorithms, introduced by Dervis Karaboga in 2005 and motivated by the
intelligent behavior of honey bees. It is as simple as the Particle Swarm
Optimization (PSO) and Differential Evolution (DE) algorithms, and uses only
common control parameters such as colony size and maximum cycle number
(Karaboga et al 2007, 2008).
[Figure 6.1 elements: SUT; Search Bee; Selector Bee; Replace Bee; Optimized
Test Suite Repository. Figure note: in the ABC model, the three types of bees
are employed, onlooker and scout bees. In the proposed approach, each bee is
associated with an agent in the optimization model: employed bee – Search
agent; onlooker bee – Selector agent; scout bee – Replace agent.]
ABC as an optimization tool provides a population-based search
procedure in which individuals called food positions are modified by the
artificial bees over time; the bees’ aim is to discover the places of food
sources with high nectar amounts and, finally, the one with the highest nectar.
In the ABC system, artificial bees fly around in a multidimensional
search space. The employed and onlooker bees choose food sources
depending on their own experience and that of their nest mates, and adjust
their positions. The scout bees fly and choose food sources randomly
without using experience. If the nectar amount of a new source is higher than
that of the previous one in their memory, they memorize the new position and
forget the previous one.
In the ABC model, the colony consists of three groups of bees:
employed bees, onlookers and scouts. It is assumed that there is only one
artificial employed bee for each food source. In other words, the number of
employed bees in the colony is equal to the number of food sources around the hive.
Employed bees go to their food source, come back to the hive and
dance in the dance area. An employed bee whose food source has been abandoned
becomes a scout and starts to search for a new food source.
Onlookers watch the dances of employed bees and choose food sources
depending on the dances.
(b) Basic ABC Algorithm
The main steps of the basic ABC algorithm are given below:
Initial food sources are produced for all employed bees.
REPEAT
Each employed bee goes to a food source in her memory and
determines a neighbor source, then evaluates its nectar amount
and dances in the hive.
Each onlooker watches the dance of employed bees and
chooses one of their sources depending on the dances, and then
goes to that source. After choosing a neighbor around that, she
evaluates its nectar amount.
Abandoned food sources are determined and then, they are
replaced with the new food sources discovered by scouts.
The best food source found so far is registered.
UNTIL (requirements are met)
In ABC which is a population based algorithm, the position of a
food source represents a possible solution to the optimization problem and the
nectar amount of a food source corresponds to the quality (fitness) of the
associated solution. The number of the employed bees is equal to the number
of solutions in the population.
At the first step, a randomly distributed initial population (food
source positions) is generated. After initialization, the population is subjected
to repeated cycles of the search processes of the employed, onlooker, and
scout bees, respectively. An employed bee produces a modification on the
source position in her memory and discovers a new food source position.
Provided that the nectar amount of the new one is higher than that
of the previous source, the bee memorizes the new source position and forgets
the old one. Otherwise she keeps the position of the one in her memory.
After all employed bees complete the search process, they share the
position information of the sources with the onlookers on the dance area.
Each onlooker evaluates the nectar information taken from all employed bees
and then chooses a food source depending on the nectar amounts of sources.
As in the case of the employed bee, she produces a modification on
the source position in her memory and checks its nectar amount. Provided
that its nectar is higher than that of the previous one, the bee memorizes the
new position and forgets the old one. The abandoned sources are then
determined, and artificial scouts randomly produce new sources to replace
them.
Thus, the ABC system combines local search methods, carried out by
employed and onlooker bees, with global search methods, managed by scouts,
attempting to balance the exploration and exploitation processes.
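The basic ABC cycle described above can be sketched as follows. This is a minimal illustration of the generic algorithm on a continuous cost function to be minimized, not the ABC Tester itself; the function name `abc_minimize`, the abandonment threshold `limit`, the parameter defaults and the clamping of candidates to the search bounds are assumptions made for the sketch.

```python
import random


def abc_minimize(cost, dim, lower, upper,
                 colony_size=20, limit=20, max_cycles=200, seed=0):
    """Basic ABC sketch: employed, onlooker and scout phases per cycle."""
    rng = random.Random(seed)
    sn = colony_size // 2  # one employed bee per food source

    def random_source():
        return [lower + rng.random() * (upper - lower) for _ in range(dim)]

    def fitness(c):
        # Nectar amount: higher is better, derived from the cost value.
        return 1.0 / (1.0 + c) if c >= 0 else 1.0 + abs(c)

    sources = [random_source() for _ in range(sn)]
    costs = [cost(s) for s in sources]
    trials = [0] * sn  # cycles without improvement, per source
    best_cost = min(costs)
    best = list(sources[costs.index(best_cost)])

    def try_neighbor(i):
        # Modify one dimension of source i relative to another source k.
        k = rng.choice([n for n in range(sn) if n != i])
        j = rng.randrange(dim)
        candidate = list(sources[i])
        step = rng.uniform(-1.0, 1.0) * (candidate[j] - sources[k][j])
        candidate[j] = min(upper, max(lower, candidate[j] + step))
        c = cost(candidate)
        if fitness(c) > fitness(costs[i]):  # greedy selection
            sources[i], costs[i], trials[i] = candidate, c, 0
        else:
            trials[i] += 1

    for _ in range(max_cycles):
        for i in range(sn):  # employed bee phase
            try_neighbor(i)
        fits = [fitness(c) for c in costs]
        for _ in range(sn):  # onlooker phase: choose sources by nectar
            i = rng.choices(range(sn), weights=fits, k=1)[0]
            try_neighbor(i)
        worst = max(range(sn), key=lambda n: trials[n])  # scout phase
        if trials[worst] > limit:
            sources[worst] = random_source()
            costs[worst] = cost(sources[worst])
            trials[worst] = 0
        for s, c in zip(sources, costs):  # register the best source so far
            if c < best_cost:
                best, best_cost = list(s), c
    return best, best_cost
```

A call such as `abc_minimize(lambda x: sum(v * v for v in x), dim=2, lower=-5.0, upper=5.0)` searches for the minimum of a two-dimensional sphere function. The best source is stored separately so that a scout replacing an abandoned source cannot discard it.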
6.5 INTERNAL ARCHITECTURE OF ABC TESTER
The internal architecture of ABC Tester is shown in Figure 6.2. The
system consists of three bees, namely the Search Bee, Selector Bee and
Replace Bee, which act as agents. The bees’ aim is to identify places with a
higher feasibility value, that is, a higher coverage value for the given set of
test cases. The bees communicate among themselves by means of a common agent
communication language. The bees are implemented with multi-threading and
hence work in parallel to achieve the desired result.
Figure 6.2 Internal Architecture of ABC Tester
Initially, a random population of test cases is generated. For each
test case, the Search Bee goes to an executable state in the test path of the
SUT, as per the information in the knowledge source, and determines the best
next neighbor node/state. This determination is
done by analyzing all the neighbor nodes/states of the current node/state
based on the selected test case’s coverage. It then evaluates the fitness value
(nectar amount) of each node surrounding the current node for the selected
test case. The fitness value is the happiness value heuristic, which
is calculated based on the coverage of the given test case for each neighbor
node. The best node to transit to is then chosen based on this
heuristic.
The Selector Bee watches the Search Bee and selects the test cases
depending upon the fitness value associated with each test case. If a node is
not feasible or not covered by a particular test case, then the node is removed
from memory and the Selector Bee starts a new search for the node
with higher feasibility in that path. Based on that, a happiness value or
coverage measure is associated with each test case. The test case with the highest
happiness value or coverage measure is remembered and all the other test
cases are removed from memory.
If the Selector Bee finds that the selected test cases are not efficient
in terms of their coverage, then the Replace Bee generates a new population
of test cases and replaces the test cases in the existing test case set with new
test cases.
This cycle is repeated until a termination condition is met: either the
maximum cycle number (MCN) is reached or the specified coverage
criterion exceeds 95%.
6.6 ALGORITHM FOR TEST SUITE OPTIMIZATION
6.6.1 Heuristics Used in Test Suite Optimization
Happiness Value – It is used as the fitness value and is calculated
based on the coverage of each node by a given
test case.
6.6.2 Proposed ABC algorithm for Test Suite Optimization
Initialize the population of test cases
REPEAT
Step 1: The employed bee applies the test cases to the first executable node
in the SUT. Once the employed bee finishes its search process by
applying each test case to the node, the fitness value (coverage
value) of the test cases along with the node information is then
returned by the employed bee.
Step 2: The onlooker bee takes this information as input and evaluates the
coverage value of each test case taken from the employed bee.
Then it finds out the test case that has the highest coverage value of
the given node. The test case along with the covered node
information is memorized.
Step 3: Then the nodes which are adjacent to the covered node are explored
by the employed bee. Now the fitness value of the selected test case
against the explored neighborhood nodes is evaluated by the
onlooker bee.
Step 4: The node with the highest fitness value is selected and appended
with the existing selected node to indicate the test path. The test
case associated with this path is stored in the optimized-test case-
repository.
Step 5: Other nodes except the covered node and the test cases other than
the selected test case are abandoned and they are stored in
temporary-node-list and temporary-test case-list respectively.
Step 6 : If a test path is not complete, then repeat steps 3 to 6.
Otherwise, the nodes and test cases from the temporary-node-list
and temporary-test case-list are selected for the next test path
generation by the employed bees.
Step 7: If the onlooker bee finds that the selected test cases are not
efficient, then the scout bees generate a new population of test
cases and replace the test cases in the temporary-test case-list with
new test cases.
UNTIL the specified termination criterion is met (all the
nodes have been visited at least once, the number of generations
reaches the maximum, or the coverage criterion is met)
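Steps 3, 4 and 6 of the algorithm above (explore the adjacent nodes, score them for the selected test case, append the best one to the path, repeat until the path is complete) can be sketched on a small control-flow graph. The graph below is a hypothetical encoding modeled on the three-value comparison program of Section 6.7.3; the names `CFG`, `node_fitness`, `build_path` and `optimize`, and the 0/1 scoring of a node, are assumptions for illustration. The scout step (Step 7) is omitted.

```python
# Hypothetical control-flow graph: each node lists (next_node, guard) pairs.
# A guard tells whether the test case (a, b, c) can move to next_node.
CFG = {
    1: [(2, lambda a, b, c: True)],
    2: [(3, lambda a, b, c: a > b), (4, lambda a, b, c: not a > b)],
    3: [(8, lambda a, b, c: True)],
    4: [(5, lambda a, b, c: b > c), (6, lambda a, b, c: not b > c)],
    5: [(8, lambda a, b, c: True)],
    6: [(7, lambda a, b, c: a > c), (8, lambda a, b, c: not a > c)],
    7: [(8, lambda a, b, c: True)],
    8: [],  # exit node: no more frontiers to explore
}


def node_fitness(guard, test_case):
    """Assumed scoring: 1 if the test case covers the neighbor node, else 0."""
    return 1 if guard(*test_case) else 0


def build_path(test_case, start=1):
    """Walk the graph, always moving to the neighbor with the highest fitness."""
    path, node = [start], start
    while CFG[node]:
        scored = [(node_fitness(g, test_case), nxt) for nxt, g in CFG[node]]
        best_fit, best_next = max(scored, key=lambda s: s[0])
        if best_fit == 0:  # no neighbor is covered: stop exploring
            break
        path.append(best_next)
        node = best_next
    return path


def optimize(test_cases):
    """Keep one test case per distinct path (the optimized repository)."""
    repository = {}
    for tc in test_cases:
        repository.setdefault(tuple(build_path(tc)), tc)
    return repository
```

For instance, `build_path((5, 1, 1))` follows nodes 1, 2, 3 and 8, and `optimize` keeps only the first test case found for each path, so redundant test cases that exercise an already covered path are dropped.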
6.6.3 Pseudo code of the ABC Algorithm
The pseudo code of the above algorithm for test suite optimization
problem is given below:
Step 1: Initialize the population of test cases xij (where ‘i’ indicates the
value given for variables and ‘j’ represents the test path in the SUT,
j = 1 to ‘n’, and ‘n’ represents the cyclomatic complexity value).
Step 2: Evaluate the population based on the coverage based test adequacy
criterion.
Step 3: cycle = 1
Step 4: repeat
Step 5: Produce new test cases vij in the neighborhood of xij for the
employed bees using the formula:
$$v_{ij} = x_{ij} + q_{ij}\,(x_{ij} - x_{kj}) \tag{6.3}$$
where ‘k’ is a solution in the neighborhood of ‘i’ and ‘q’ is a random
number in the range [-1,1]; then evaluate them based on the
fitness value / happiness value for satisfying the coverage based test
adequacy criterion.
Step 6: Apply the greedy selection process between ‘xi’ and ‘vi’.
Step 7: Calculate the probability values of the test cases ‘xi’ by means of their
fitness values using the equation:
$$P_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n}, \quad i = 1, \dots, SN \tag{6.4}$$
The Pi values are normalized into [0,1].
In order to calculate the fitness values of solutions, the proposed
approach employs the following equation:
$$fit_i = \begin{cases} \dfrac{1}{1 + hv(i)} & \text{if } hv(i) \ge 0 \\ 1 + |hv(i)| & \text{otherwise} \end{cases} \tag{6.5}$$
The fitness of each node is decided by the happiness value (hv)
associated with each node based on constraint satisfaction.
For a>b, ci(n) = (b-a) and if ci(n) < 0, then hv(i) = MIN, otherwise
hv(i) = MAX or 0 (6.6)
For a>=b, ci(n) = (b-a) and if ci(n) <= 0, then hv(i) = MIN,
otherwise hv(i) = MAX or 0 (6.7)
For a<b, ci(n) = (b-a) and if ci(n) > 0, then hv(i) = MIN, otherwise
hv(i) = MAX or 0 (6.8)
For a<=b, ci(n) = (b-a) and if ci(n) >= 0, then hv(i) = MIN,
otherwise hv(i) = MAX or 0 (6.9)
For a==b, ci(n) = (b-a) and if ci(n) = 0, then hv(i) = MIN, otherwise
hv(i) = MAX or 0 (6.10)
For a!=b, ci(n) = (b-a) and if ci(n) != 0, then hv(i) = MIN, otherwise
hv(i) = MAX or 0 (6.11)
For a OR b, hv(i) = hv(ci(a)) + hv(ci(b)) (6.12)
For a AND b, hv(i) = MIN(hv(ci(a)), hv(ci(b))) (6.13)
Step 8: Produce new test cases ‘vi’ for the onlookers from the test cases ‘xi’,
selected depending on ‘Pi’, and evaluate them.
Step 9: Apply the greedy selection process for the onlookers between ‘xi’
and ‘vi’.
Step 10: Determine the abandoned test case, if one exists, and replace it with a
new randomly produced test case ‘xi’ for the scout using the
equation:
$$x_{ij} = min_j + rand(0,1)\,(max_j - min_j) \tag{6.14}$$
The scout waggle dances to indicate the new test case generation.
Step 11: Memorize the best test case achieved so far using the fitness value.
Step 12: cycle = cycle + 1
Step 13: until cycle = Maximum Cycle Number (MCN)
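The update rules (6.3), (6.4), (6.5) and (6.14) used in the pseudo code can be sketched as small helper functions. The function names and the list-based representation of a test case are assumptions for illustration; the happiness value `hv` is taken as a given number here rather than computed from the predicate rules (6.6) to (6.13).

```python
import random


def neighbor(x_i, x_k, q):
    """Equation (6.3): v_ij = x_ij + q_ij * (x_ij - x_kj), per component."""
    return [xi + qj * (xi - xk) for xi, xk, qj in zip(x_i, x_k, q)]


def fitness(hv):
    """Equation (6.5): map a happiness value to a fitness value."""
    return 1.0 / (1.0 + hv) if hv >= 0 else 1.0 + abs(hv)


def selection_probabilities(fits):
    """Equation (6.4): each fitness divided by the sum of all fitness values."""
    total = sum(fits)
    return [f / total for f in fits]


def scout(mins, maxs, rng):
    """Equation (6.14): a random test case between the per-variable bounds."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(mins, maxs)]


def greedy_select(x_i, v_i, hv_of):
    """Steps 6 and 9: keep whichever of x_i and v_i has the higher fitness.

    hv_of is a caller-supplied (hypothetical) function returning hv for a
    test case.
    """
    return v_i if fitness(hv_of(v_i)) > fitness(hv_of(x_i)) else x_i
```

As a usage example, `selection_probabilities([fitness(0), fitness(3), 0.75])` returns `[0.5, 0.125, 0.375]`, so an onlooker would pick the first test case half of the time. For integer inputs, the values produced by `scout` would still need to be rounded, which the sketch leaves to the caller.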
6.7 EXPERIMENTATION AND EVALUATION
6.7.1 Tested Programs
A range of case studies, from simple desktop applications to
complex web based applications, was conducted to assess the efficiency of the
proposed approach against the existing approach based on GA. The tested
programs are listed in Table 6.1(a) and the experimental setup is given in
Table 6.1(b).
Table 6.1(a) Tested Programs
Case Study # | Object Oriented Systems – in C++ and Java | Case Study Type | #Classes
1 | Binary Search Tree using Java | Academic | 2
2 | Coffee/COCOA/Money Lending Machine using C++ | Industrial | 15
3 | Stack using Java | Academic | 5
4 | Queue using Java | Academic | 5
5 | Library Management System using Java | Industrial | 20
6 | Students Mark Processing System using Java | Academic | 12
7 | Banking Transaction System using Java | Industrial | 14
8 | Shopping Cart using Java | Industrial | 12
9 | File System Manager using C++ | Academic | 7
10 | Network Monitor using C++ | Industrial | 28
11 | Examination Workflow system using Java | Industrial | 35
12 | Quiz using Java | Academic | 17
13 | Management Information System using Java | Industrial | 27
14 | Stock Maintenance using Java | Academic | 8
15 | Credit Card Validation using Java | Industrial | 22
16 | Linked Lists – Singly, Doubly and Circularly using Java | Academic | 14
17 | Anti Money Laundering System using Java | Industrial | 56
18 | Series Calculation – Sine and Cosine using Java | Academic | 6
Table 6.1(b) Experimental Setup - ABC and GA
Parameter setup | ABC Based | GA Based
Type of algorithm | Population Based | Population Based
Fitness Function | Nectar Amount / Happiness value based on Path Coverage | Fitness value based on Path Coverage
No. of Cycles | MCN - Maximum Cycle Number (Prescribed by the tester) | MAX. (Prescribed by the tester)
Termination Criteria | Max. no. of cycles / Acceptable path coverage measure | Max. no. of generations / Acceptable path coverage measure
Population of test cases Generation | Using the formula of xij and vij | Using Crossover and Mutation
Pheromone / Non-Pheromone based | Non-Pheromone Based | Non-Pheromone Based
Communication about selection | Waggle Dance / Status flag | Status flag
The sample case studies demonstrated in this thesis show the
coverage results for variables of type integer. For other data types like float
and double, the value range of ‘q’ should be chosen so that it
produces test cases in floating point and double representations.
For the char data type, the ASCII equivalent of the value is used for the
generation of test cases. Similarly, for Boolean data types, either true or
false values should be generated for the variables; this is achieved by
toggling the values of ‘q’ during each cycle. For the string data type, radix form is
used. Hence, the proposed approach is suited to problems
that involve different data types.
6.7.2 Case Study 1 – Performance Evaluation of ABC
Consider a simple program to classify a triangle, the “Triangle
Classification Problem” shown in Figure 6.3. Its input is a triple of positive
integers (a, b, c); the data type of the input parameters ensures that these are
integers, and their values are greater than 0 and less than or equal to 100. The
program output is one of the following: Scalene Triangle, Isosceles
Triangle, Equilateral Triangle, or Not a Triangle. This problem is chosen as the
benchmark problem because it has been among the best-known problems in software
testing since 1979 (Myers 1979).
// Triangle Classification Problem
main()
{ int a,b,c; boolean isatriangle;
1.  Print("Triangle Classification Problem");
    Print("Enter three integers which are sides of a triangle");
    Read(a,b,c);
2.  If (a<(b+c)) && (b<(a+c)) && (c<(a+b))
3.      isatriangle=true;
4.  Else isatriangle=false;
5.  If (isatriangle)
    {
6.      If (a==b) xor (a==c) xor (b==c) && !(a==b) && (a==c)
7.          Print("Isosceles Triangle");
8.      If (a==b) && (b==c)
9.          Print("Equilateral Triangle");
10.     If (a!=b) && (a!=c) && (b!=c)
11.         Print("Scalene Triangle");
12. }
Figure 6.3 Sample code for Triangle Classification Problem
For the code in Figure 6.3, the cyclomatic complexity value is 5.
There are five independent test paths in the given SUT, listed
in Table 6.2. The blocks of executable statements are given in Figure 6.3, and
sample test cases with expected output and path coverage details are shown in
Table 6.3.
Table 6.2 Independent paths of Case Study #1
Path # | Path
1 | 1-2-4-13-14
2 | 1-2-3-5-6-7-8-10-12-14
3 | 1-2-3-5-6-8-9-10-12-14
4 | 1-2-3-5-6-8-10-12-14
5 | 1-2-3-5-6-8-10-11-12-14
Table 6.3 Sample Test Cases with expected output and path coverage
details of Case Study #1
Test Case | A | B | C | Expected Output | Path Covered
1 | 4 | 1 | 2 | Not a Triangle | 1-2-4-13-14
2 | 1 | 4 | 2 | Not a Triangle | 1-2-4-13-14
3 | 1 | 2 | 4 | Not a Triangle | 1-2-4-13-14
4 | 5 | 5 | 5 | Equilateral | 1-2-3-5-6-8-9-10-12-14
5 | 2 | 2 | 3 | Isosceles | 1-2-3-5-6-7-8-10-12-14
6 | 2 | 3 | 2 | Isosceles | 1-2-3-5-6-7-8-10-12-14
7 | 3 | 2 | 2 | Isosceles | 1-2-3-5-6-7-8-10-12-14
8 | 3 | 4 | 5 | Scalene | 1-2-3-5-6-8-10-11-12-14
9 | 2 | 3 | 4 | Scalene | 1-2-3-5-6-8-10-11-12-14
10 | 5 | 4 | 6 | Scalene | 1-2-3-5-6-8-10-11-12-14
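The expected outputs in Table 6.3 can be checked against a runnable version of the triangle classifier. The sketch below is an independent Python rendering of the behavior described in the text, not a transcription of Figure 6.3; the function name `classify`, the constant `TABLE_6_3` and the explicit range check on the inputs are assumptions.

```python
def classify(a, b, c):
    """Classify a triangle from three integer sides in the range 1..100."""
    for side in (a, b, c):
        if not (isinstance(side, int) and 0 < side <= 100):
            raise ValueError("sides must be integers in the range 1..100")
    # Triangle inequality: every side is shorter than the sum of the others.
    if not (a < b + c and b < a + c and c < a + b):
        return "Not a Triangle"
    if a == b == c:
        return "Equilateral"
    if a == b or a == c or b == c:
        return "Isosceles"
    return "Scalene"


# Test cases and expected outputs as listed in Table 6.3.
TABLE_6_3 = [
    ((4, 1, 2), "Not a Triangle"),
    ((1, 4, 2), "Not a Triangle"),
    ((1, 2, 4), "Not a Triangle"),
    ((5, 5, 5), "Equilateral"),
    ((2, 2, 3), "Isosceles"),
    ((2, 3, 2), "Isosceles"),
    ((3, 2, 2), "Isosceles"),
    ((3, 4, 5), "Scalene"),
    ((2, 3, 4), "Scalene"),
    ((5, 4, 6), "Scalene"),
]

mismatches = [(tc, exp) for tc, exp in TABLE_6_3 if classify(*tc) != exp]
```

Under these assumptions `mismatches` is empty, that is, each of the ten sample test cases yields the output listed in the table.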
The initial set of test cases is generated as a random population
generation. Let Xij be the initial set of test cases.
Xij = {2, 2, 3}, i=1 to 3 for parameters a, b and c. And the values
are a=2, b=2 and c=3.
The search bee searches the node corresponds to it and it has
identified that, the test case is suited for node 2 in path2.
Hence, the fitness value is high for node 2, which makes the
selector bee to select this node as the best node to traverse. Then
the neighbor nodes are explored. And the test case is selected
for node6 which is meant for ‘Isosceles Triangle”. This
exploration is continued until it reaches a node that has no more
frontiers to explore.
At last, it has been identified that, the test case {2,2,3} covered
path j=2.
Then the next generation of test cases is generated using the
formula (6.3):
o V13= 2+1 * (2-2) = 2;
o V23=2+ (1) * (2-3) = 3; and
o V33=3+(1) * (3-2) = 4
Vij = {2,3,4} which is fit for node 2 and so the test case is
selected and exploration is done by the selector bee.
While exploration, the Selector bee identifies that the generated
test case is fit for node10 which is meant for “Scalene
Triangle”, and now the process continues till no more frontiers
to explore.
187
Now, it has been identified that, the test case {2,3,4} covers
path j=5.
Since, there are no more test cases available for selection, the
scout generates a new test case using the formula (6.14):
Xij = minj + rand(0,1)*(maxj-minj)
o Xij = {2,2,3} + (-1) * {2,2,4} – {2,2,3} = {2,2,2}
During exploration, the search bee identifies that the test case
{2,2,2} has the highest fitness value for node 2; the selector bee
then selects it and explores the neighbor nodes. Now node 8 is
selected as the best node to explore, and the process continues
until there are no more frontiers to explore in that path.
Finally, it is identified that the test case {2,2,2} covers
path j=3.
The process is continued to cover the entire SUT.
Again, the next test case is generated by the scout bee using the
formula (6.14):
o Vij = minj + rand(0,1) * (maxj – minj)
o => Vij = {2,2,2} + (1) * ({2,2,4} – {2,2,2}) = {2,2,4}, which
leads to path 1, meant for "Not a triangle".
This is repeated until all the test paths have been covered or a
required path coverage percentage has been achieved.
Hence, the test cases that cover the SUT are generated within
a small number of test runs.
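The neighbor-generation step of formula (6.3) used in this walkthrough can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name `neighbor`, the 0-based cyclic choice of the partner variable and the explicit list of random factors are all hypothetical, chosen so that the arithmetic is reproducible.

```python
def neighbor(x, phi):
    """Return v with v[i] = x[i] + phi[i] * (x[i] - x[k]).

    x   -- current test case, e.g. [a, b, c]
    phi -- one random factor per variable, each assumed to lie in [-1, 1]
    k   -- the next variable index, wrapping around (an assumption)
    """
    n = len(x)
    return [x[i] + phi[i] * (x[i] - x[(i + 1) % n]) for i in range(n)]


# Walkthrough values: test case {2,2,3} with random factors (1, -1, 1)
print(neighbor([2, 2, 3], [1, -1, 1]))  # [2, 3, 4]
```

With the factors (1, -1, 1) the sketch yields {2,3,4}, the second-generation test case of the walkthrough.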
Result of case study 1: The behavior of ABC is evaluated for test
suite optimization of a complex problem, the triangle classification
problem. This problem is one of the well-known benchmark problems used to
test the efficiency of test cases in several approaches. At the end of the
experimentation, it was found that the performance of ABC in producing
efficient test cases does not degrade; it continues to generate optimal
and near optimal solutions for the given benchmark problem. Only five
test runs were needed to cover the entire SUT.
If there are infeasible paths, the bees also report them
by means of the status flag associated with each node.
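The four test cases produced in this case study, {2,2,3}, {2,3,4}, {2,2,2} and {2,2,4}, can be checked against a plain triangle classifier. The sketch below is an assumption about what the SUT computes (the thesis code for this case study is not reproduced here); the function name `classify_triangle` and the category labels are illustrative.

```python
def classify_triangle(a, b, c):
    """Classify three integer side lengths."""
    # Sides must be positive and satisfy the strict triangle inequality.
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        return "Not a triangle"
    if a == b == c:
        return "Equilateral"
    if a == b or b == c or a == c:
        return "Isosceles"
    return "Scalene"


for case in [(2, 2, 3), (2, 3, 4), (2, 2, 2), (2, 2, 4)]:
    print(case, classify_triangle(*case))
```

Under this classifier each of the four test cases lands in a different category, which is in line with the walkthrough assigning them to different paths.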
6.7.3 Case Study 2 - Performance Comparison of ABC and GA
For the performance comparison between ABC and GA, the problem
taken is the comparison of three numbers. Here, the decisions are
made based on the values of the parameters a, b and c, all of which
are integers.
// Program to compare three values
main() {
    int a, b, c;
1   read a, b, c;
2   if (a > b)
3       printf("A is bigger than B");
4   else if (b > c)
5       printf("B is bigger than C");
6   else if (a > c)
7       printf("A is bigger than C");
8   }
Figure 6.4 Sample Code for comparing three values
For the code in Figure 6.4, the cyclomatic complexity value is 4,
which means that there are four independent paths in the given problem.
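To make the node numbering concrete, the following Python sketch traces which numbered nodes of Figure 6.4 a given input visits. It is a literal transcription of the figure under the assumption that the numbers 1 to 8 label the statements in order; the function name `trace_path` is hypothetical. The three decisions (nodes 2, 4 and 6) give a cyclomatic complexity of 3 + 1 = 4.

```python
def trace_path(a, b, c):
    """Return the node sequence of Figure 6.4 executed for inputs (a, b, c)."""
    path = [1, 2]                  # node 1: read inputs, node 2: test a > b
    if a > b:
        path.append(3)
    else:
        path.append(4)             # node 4: test b > c
        if b > c:
            path.append(5)
        else:
            path.append(6)         # node 6: test a > c
            path.append(7 if a > c else 8)
    return "-".join(str(node) for node in path)


print(trace_path(2, 1, 1))  # 1-2-3
print(trace_path(1, 2, 1))  # 1-2-4-5
print(trace_path(1, 1, 1))  # 1-2-4-6-8
```

Under this literal reading, reaching node 7 would need a ≤ b ≤ c together with a > c, so the sketch never returns 1-2-4-6-7; the node numbering assumed here may differ from the one used in the thesis experiments.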
The initial sets of test cases for these paths are created randomly and are
shown in Table 6.4.
Table 6.4 Independent Paths and Initial Set of Random Test Cases of
Case Study #2
S.No.   Path         Initial set of Test Cases (a,b,c), Random Generation
1.      1-2-3        1,1,2
2.      1-2-4-5      1,1,1
3.      1-2-4-6-7    1,1,1
4.      1-2-4-6-8    1,1,1
(a) Performance of ABC
Let xij be the initial set of test cases for each path, where i = 1
to 3 for the variables ‘a’, ‘b’ and ‘c’, and j = 1 to the cyclomatic
complexity value, which is the total number of independent paths in the
program. Thus xij denotes component i of the test case for path ‘j’; the
test case is a triple <a,b,c>, with i=1 for ‘a’, i=2 for ‘b’ and i=3 for ‘c’.
The initial test cases are assigned randomly to the four paths as
xij = {1,1,2} for path 1, {1,1,1} for path 2, {1,1,1} for path 3 and
{1,1,1} for path 4, as given in Table 6.4.
qij is a random number in the range -1 to 1, generated during
program execution.
Now, vij = xij + qij (xij – xkj), where ‘k’ is a solution in the
neighborhood of ‘i’ and ‘q’ is a random number. Here i = 1 to 3, j = 1 and
k = (i mod size) + 1, where size is the size of the test case (size = 3
since the SUT has 3 variables), so that the last variable wraps around to
the first.
o v11=x11+q11(x11-x21) = 1+(1) (1-1) = 1
o v21=x21+q21(x21-x31)=1+(-1)(1-2) = 2
o v31=x31+q31(x31-x11)=1+(0)(2-1) = 1
o Hence, vi = {1,2,1} where i=1 and j=1
Compare the fitness of xij and vij.
Apply the greedy selection process between xij and vij for path j=1.
This must be done for all paths j.
Calculate the probability Pi for the solutions xij and vij by means
of their fitness values:
$P_i = \dfrac{fit_i}{\sum_{n=1}^{SN} fit_n}$
Here, the fitness values are calculated using the formulae (6.4)
and (6.5).
Based on Pi, new test cases vij are selected for the onlookers from
the solutions xij and are then evaluated.
Apply the greedy selection process between xij and vij.
Unfortunately, both of these test cases have a fitness value
of 0, so the solutions need to be abandoned and replaced by new
solutions generated by scout bees.
Determine the abandoned solutions and replace them with a new
randomly produced solution xij using the scout, by the formula
(6.14):
xij = minj + rand(0,1) * (maxj – minj)
o Here the minimum valued test case is {1,1,1} and the maximum
valued test case is {2,1,1}.
o Hence, xij = {1,1,1} + 1 * ({2,1,1} – {1,1,1}) = {1,1,1} +
1 * {1,0,0} = {1,1,1} + {1,0,0} = {2,1,1}
o Now, the new xij is {2,1,1}.
This test case has the highest fitness value for path j=1, and the
scout now starts a new search using this new set of test cases.
Memorize the best test case achieved so far based on its
fitness (probability value).
cycle = cycle + 1
Until cycle = Maximum Cycle Number (MCN)
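Two steps of the procedure above, the onlooker probability and the scout replacement of formula (6.14), can be sketched in Python as below. This is a minimal sketch, not the Java implementation of the thesis: the function names are hypothetical, the random factor is passed in explicitly so that the worked example is reproducible, and returning all-zero probabilities when every fitness is zero is an assumption made to avoid a division by zero.

```python
def selection_probabilities(fitness):
    """P_i = fit_i / sum(fit); all zeros when every fitness value is zero."""
    total = sum(fitness)
    if total == 0:
        return [0.0] * len(fitness)
    return [f / total for f in fitness]


def scout(min_case, max_case, r):
    """New test case x_j = min_j + r * (max_j - min_j), applied per variable."""
    return [lo + r * (hi - lo) for lo, hi in zip(min_case, max_case)]


# Both candidate test cases have fitness 0, so neither is preferred ...
print(selection_probabilities([0, 0]))   # [0.0, 0.0]
# ... and the scout builds a replacement from {1,1,1} and {2,1,1} with r = 1.
print(scout([1, 1, 1], [2, 1, 1], 1))    # [2, 1, 1]
```

The scout call reproduces the replacement test case {2,1,1} derived in the worked example.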
The above procedure is coded in Java, and the approach is
applied to each unit of the software under test. Since the bees work in
parallel, the decision making process is faster.
After the scout bee identifies a new best solution, it waggle-dances
in the dancing area. Here, a status flag is used by the scout bee to indicate this.
The results of the ABC based framework in test case optimization are shown in
Table 6.5.
Table 6.5 Results of ABC based framework in Test Case Optimization for
Case Study #2
S.No.  Test Sequence  Test Case (a,b,c)  Probability value based  Coverage %
                                         on Fitness Value         (Test sequence)
1.     1-2-3          1,1,2              0                        0%
                      1,2,1              0                        0%
                      2,1,1              1                        100%
2.     1-2-4-5        1,1,1              0                        0%
                      1,2,1              1                        100%
3.     1-2-4-6-7      1,1,1              0                        0%
                      1,2,1              0                        0%
                      2,2,1              1                        100%
4.     1-2-4-6-8      1,1,1              1                        100%
(b) Performance of GA
(i) Test Case Construction - One point crossover and Mutation
One Point Crossover: The selected members of the population are split at
some point m; the 0..m-1 portion of the first member is recombined with
the m..n portion of the second member, and the 0..m-1 portion of the second
member is recombined with the m..n portion of the first member.
For the given case study, the values of the variables a, b and c are:
Test Case 1: 1, 1, 2
Test Case 2: 1, 2, 1
After 1-point crossover at the second position, the new test cases
are as follows:
Test Case 11: 1, 2, 2
Test Case 21: 1, 1, 1
This new generation of test cases is then evaluated based on
effectiveness, and each test case is either selected or removed.
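The one-point crossover described above can be sketched in Python as follows. The function name `one_point_crossover` is hypothetical, and the cut point m = 2 (after the second gene) is an assumption about what "at the second position" means.

```python
def one_point_crossover(p1, p2, m):
    """Swap the tails of two parents after the first m genes."""
    child1 = p1[:m] + p2[m:]
    child2 = p2[:m] + p1[m:]
    return child1, child2


print(one_point_crossover([1, 1, 2], [1, 2, 1], 2))  # ([1, 1, 1], [1, 2, 2])
```

With this cut point the sketch produces the same two children as the example, (1,1,1) and (1,2,2), listed in the opposite order.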
Mutation: This operation changes a member at the gene
level and reproduces the remaining genes to create the new generation. In
software testing, the set of test cases is considered as the population, an
individual test case is considered as a member, and the possible values of the
individual variables passed to a method call are considered as the genes.
Parent 1 - Test Case 1: 1, 1, 2
Parent 2 - Test Case 2: 1, 2, 1
After the mutation operator is applied to these test cases, the new
generation of test cases is generated as,
Child 1 - Test Case 11: 2, 1, 2
Child 2 - Test Case 21: 2, 2, 1
Now, this new generation of test cases is evaluated and the
procedure is repeated.
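A gene-level mutation of this kind can be sketched in Python as below. The text does not state which gene is mutated or how the new value is chosen; the children shown are consistent with changing the first gene from 1 to 2, so the sketch takes the gene index and the new value as explicit (hypothetical) parameters.

```python
def mutate(test_case, index, new_value):
    """Return a copy of test_case with the gene at `index` replaced."""
    child = list(test_case)
    child[index] = new_value
    return child


print(mutate([1, 1, 2], 0, 2))  # [2, 1, 2]
print(mutate([1, 2, 1], 0, 2))  # [2, 2, 1]
```

In a full GA the index and the new value would normally be drawn at random according to a mutation rate; they are fixed here only to reproduce the example.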
(ii) Test Case Evaluation – Code Coverage criterion
Members of the population that have the most favorable features are assigned
a higher fitness value during evaluation. In software testing, the favorable
feature is the ability to reveal more errors. The test cases with the highest
coverage should be selected, as they can reveal errors during their
execution.
Test Case 11’s Coverage% is 50%
Test Case 21’s Coverage% is 76%
Parent 1’s Coverage% is 60%
Parent 2’s Coverage% is 70%
(iii) Test Case Selection – Filtering Function
Selecting the best parents from the current population is called the
selection process. This process leads to incremental solution generation. The
test case that has the higher fitness value is selected as a parent for the next
generation.
After the evaluation is done, the Parent 1 test case is replaced by Test
Case 21. Then, during the next iteration, this modified parent is used for the
generation of test cases (offspring). The test case selection process is repeated
until a termination condition, such as the maximum number of generations or
an acceptable coverage %, is reached.
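The selection step can be illustrated with the four coverage values listed under (ii). The sketch below keeps the two test cases with the highest coverage; the function name `select_parents` and the choice of keeping exactly two parents are assumptions made for illustration.

```python
def select_parents(coverage, k=2):
    """Keep the k test cases with the highest coverage, best first."""
    ranked = sorted(coverage, key=coverage.get, reverse=True)
    return ranked[:k]


coverage = {
    "Parent 1": 60,
    "Parent 2": 70,
    "Test Case 11": 50,
    "Test Case 21": 76,
}
print(select_parents(coverage))  # ['Test Case 21', 'Parent 2']
```

Parent 1 (60%) drops out and Test Case 21 (76%) takes its place, as in the text, while Parent 2 (70%) is retained.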
Result of case study 2: Case study 2 shows that the ABC approach
takes only three test runs to capture the test cases needed to cover the
states, statements and branches in the given SUT. Also, the time taken is
very low due to the parallel behavior of the bees. In the case of GA, the
total number of generations keeps increasing because the SUT is not covered
by the current generation of test cases.
In this experiment, even after the number of generations was
increased to ten in GA, some of the paths were still uncovered and some
were covered repeatedly by different test cases. Since the search
in GA is not guided by intelligence as in ABC, the solution
generation is repeated until all the paths are covered (even for infeasible paths).
Also, the coverage does not increase steadily as in ABC; rather, it rises
and falls across the generations.
6.8 PERFORMANCE ANALYSIS
The approach has been tested on all the problems listed in Table
6.1(a). For comparison purposes, five academic problems shown in Table 6.6
and four industrial strength problems listed in Table 6.7 are taken. They
range from simple to complex based on the lines of code and methods per
class. The test suites have been generated using GA and the proposed ABC
tester algorithm.
Table 6.6 Academic Problems (Sample)
S.No.  Academic Test Problem            Test Object No  No. of Classes  Complexity Level
1.     Timetable Generation             ATP1            10              Medium
2.     Binary Search Tree Construction  ATP2            5               Low
3.     Online quiz system               ATP3            17              High
4.     Students marks processing system ATP4            12              Medium
5.     Attendance monitoring system     ATP5            18              High
Table 6.7 Industrial Problems (Sample)
S.No.  Industrial Test Problem        Test Object No  No. of Classes  Complexity Level
1.     Anti-money laundering system   ITP1            56              High
2.     Multi-point software           ITP2            42              High
3.     Ticket Monitoring system       ITP3            29              Medium
4.     Stock Management System        ITP4            15              Low
6.8.1 Performance Comparison – ABC Vs. GA
The results are gathered in terms of path coverage, test runs
and time taken for the entire test case optimization process. The results shown
in Tables 6.8 and 6.9 indicate that, when compared to GA, ABC based test
suite optimization produces higher coverage with fewer test
cases and within fewer test runs.
For the ABC based test suite optimization framework, the parameters
mentioned in Table 6.1 are applied. The initialization of test cases is done as a
random process. The subsequent generations are then carried out by the three
bees using the formulae (6.3) to (6.13).
The selection of test cases is based on the happiness value
heuristic, which reflects the coverage of the SUT. For example, selecting the
test case x[1]=2, x[2]=1, x[3]=1 results in coverage of the sequence
1-2-3. The number of test runs is increased to obtain coverage of the other paths.
The results gathered from the academic problems, shown in
Table 6.8, indicate that, even after the number of generations is increased,
the path coverage percentage is lower in GA than in ABC.
Table 6.8 Results of Academic test problems
                     GA Based Optimization                                 ABC Based Optimization
S.No.  Test Problem  Coverage %  Generations  # Test Cases  Time (Sec.)    Coverage %  Cycles  # Test Cases  Time (Sec.)
1.     ATP1          70%         100          72            20             95%         50      24            10
2.     ATP2          65%         200          128           45             98%         80      52            14
3.     ATP3          75%         70           58            18             97%         35      26            8
4.     ATP4          90%         300          208           52             99%         50      80            10
5.     ATP5          85%         150          106           29             96%         68      68            12
Table 6.9 Results of Industrial Test Problems
                     GA Based Optimization                                 ABC Based Optimization
S.No.  Test Problem  Coverage %  Generations  # Test Cases  Time (Sec.)    Coverage %  Cycles  # Test Cases  Time (Sec.)
1.     ITP1          70%         500          10042         40             92%         150     3540          22
2.     ITP2          75%         300          7120          35             98%         75      2508          20
3.     ITP3          80%         400          8648          52             97%         125     3068          25
4.     ITP4          68%         180          6040          20             98%         60      2164          10
Similarly, for the industrial test problems, Table 6.9 indicates that,
for complex systems which involve multiple data types, the performance of
ABC is superior and it takes less time for the test case generation process.
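As a small worked computation on the figures reported in Table 6.9, the relative reduction in the number of test cases from GA to ABC can be derived per problem. The sketch below only restates the table's "# Test Cases" columns; the helper name `reduction_percent` and the rounding to one decimal place are illustrative choices.

```python
# (GA test cases, ABC test cases) per industrial problem, copied from Table 6.9
test_cases = {
    "ITP1": (10042, 3540),
    "ITP2": (7120, 2508),
    "ITP3": (8648, 3068),
    "ITP4": (6040, 2164),
}


def reduction_percent(ga, abc):
    """Relative reduction in test cases when moving from GA to ABC, in percent."""
    return round(100.0 * (ga - abc) / ga, 1)


reductions = {name: reduction_percent(ga, abc) for name, (ga, abc) in test_cases.items()}
print(reductions)
```

The same helper can be applied to the generations/cycles or time columns of either table.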
For GA based test case optimization, the parameters mentioned
in Table 6.1 are used. Crossover and mutation operations were then applied to
generate the further generations. At the end of each generation, the fittest
individuals were identified by computing the coverage metric. The individuals
with the highest coverage metric survived and were used as parents for
further generations. This process continued until an acceptable path coverage
value or the maximum number of test cycles was reached.
6.8.2 Comparison Charts – ABC Vs. GA
From Figures 6.5 and 6.6, it is understood that the fitness value of the
test cases is higher and improves steadily in ABC, whereas in GA the
improvement of the test cases is non-linear and usually gets stuck at local
optima.
Figure 6.5 Path Coverage of test cases in ABC (chart title: Path Coverage - ABC; x-axis: No. of cycles, 1 to 29; y-axis: 0 to 100)
Figure 6.6 Path Coverage of individuals in GA
The total number of generations/cycles taken by ABC and GA is
shown in Figures 6.7 and 6.9 for the academic and industrial test problems. From
these figures, it is understood that, when compared to GA, the proposed
approach based on ABC takes fewer cycles to achieve optimal or
near optimal solutions due to the parallel behavior of the three agents.
Figure 6.7 Generations/Cycles - ABC Vs. GA for Academic Problems (x-axis: Academic Test Problems, ATP1 to ATP5; y-axis: 0 to 350)
Figure 6.8 Path Coverage % - ABC Vs. GA for Academic Problems (x-axis: Academic Test Problems, ATP1 to ATP5; y-axis: Path Coverage %, 0% to 100%)
Figure 6.9 Generations/Cycles - ABC Vs. GA for Industrial Problems (x-axis: Industrial Test Problems, ITP1 to ITP4; y-axis: Generations / Cycles, 0 to 500)
Figure 6.10 Path Coverage % - ABC Vs. GA for Industrial Problems (x-axis: Industrial Test Problems, ITP1 to ITP4; y-axis: Path Coverage %, 0% to 100%)
Also, Figures 6.8 and 6.10 show the comparison of the path
coverage % of the test cases generated using ABC and GA for the academic
and industrial problems. From these figures, it is inferred that the path
coverage of the test cases is high in ABC due to the selection of individuals at
each generation by the three bees. If there is any improvement in the
coverage %, the scout automatically generates the new set of test cases to be
selected to form the optimal test suite. The time taken by ABC is less, since
all three agents work in parallel and the solution is obtained by
collaborative learning (waggle dance) among them. In the case of GA, there is
no such collaborative learning, and hence it usually takes more time
to generate the solution.
6.9 SUMMARY
The test suite optimization process involves generation of effective
test cases in a test suite that can cover the given SUT within less time. The
proposed framework called “ABC Tester” applies the artificial bee colony
optimization (ABC) approach for test suite optimization.
Artificial Bee Colony (ABC) optimization is a non-pheromone
based swarm intelligence approach motivated by the intelligent behavior of
honey bees; the colony consists of three groups of bees, namely employed
bees, onlookers and scouts.
The proposed approach, based on Artificial Bee Colony (ABC)
optimization, represents each test case as a possible solution in the
optimization problem, and the happiness value, a heuristic attached to each test
case, corresponds to the quality or fitness of the associated solution. The test
suite optimization process is achieved by means of three agents, namely the
Search Agent, Replace Agent and Selector Agent, which mimic the behavior of
employed, onlooker and scout bees in the bee colony. The agents work in
parallel and hence achieve better software test suite optimization.
In this research work, the application of ABC to software test suite
optimization is demonstrated, and the superiority of the proposed
approach over the existing GA based approach is shown. The size of the test suite is
reduced by up to 84.7% (approx.) based on path coverage when compared to
GA. Problems with GA include lack of memorization, non-linear
optimization, risk of suboptimal solutions and delayed convergence. There is
no guarantee of a near global optimal solution, even though one may be reached.
In contrast, ABC model based test suite optimization generates near global optimal
results and converges within fewer test runs.