CHAPTER 6

ABC TESTER - ARTIFICIAL BEE COLONY BASED

SOFTWARE TEST SUITE OPTIMIZATION APPROACH

6.1 PROBLEM FORMULATION

A test suite is a set of test cases for a component or system under test, where the postcondition of one test is often used as the precondition for the next one. The objective of test suite optimization is to generate a test suite of effective test cases that covers the given software under test (SUT) in less time.

As per Phil McMinn, random generation of test cases leads to an

unlimited source of tests in which the selection of efficient test cases is very

difficult (McMinn 2003).

This research work proposes an Artificial Bee Colony (ABC) optimization based framework, motivated by the intelligent behavior of honey bees, to automate the test suite optimization process. Here, the bees are implemented as agents that perform the test suite optimization activities. Since the ABC system combines local search methods carried out by employed and onlooker bees with global search methods managed by scouts, the approach attains a near-global-optimal solution.

In the proposed system, the places to search are the nodes in the SUT. Each test case is represented as a possible solution in the optimization problem. The artificial bees modify the test cases over time, and the bees' aim is to discover the nodes with higher coverage and finally the one with the highest usage by the given test case.

The coverage of the nodes by the test cases is identified by a heuristic measure, namely the happiness value, assigned to each node; it corresponds to the quality or fitness of the associated solution. Hence, the test cases are selected by means of an intelligent search through the SUT based on this heuristic associated with each node.

The search is guided by the weight value associated with each node, which depends on the coverage of the test case and is calculated in terms of the happiness value. The objective function is to maximize the happiness value, found by summing the fitness values associated with each node subject to constraint satisfaction. A few efficient test cases that can cover the SUT in less time are thus generated and stored in the optimal test suite repository.

6.2 REPRESENTATION OF THE SEARCH SPACE – FORMAL

PROBLEM DEFINITION

In the proposed approach, the given Software under Test (SUT) is represented as a search space. Blocks of executable statements and conditions are grouped as states or nodes in the representation model. These nodes are searched to find the feasible and infeasible nodes in the SUT. Effective test cases are identified by their coverage of the nodes in the SUT. In this research work, the terms node and state are used interchangeably. The test adequacy criteria applied here ensure path coverage, state coverage and branch coverage.

6.2.1 Problem Environment

The Software under Test (SUT) is given as input. Let ‘n’ be the cyclomatic complexity value, which indicates that there are ‘n’ independent test paths in the SUT. There are ‘m’ test suites, each consisting of several test cases, that must be processed on the ‘n’ test paths / sequences.

6.2.2 Assumption

Software under Test (SUT) is well structured and without any

compilation errors. Code Instrumentation does not affect the functionality.

6.2.3 Objective Criterion

The objective of the proposed approach is to generate an efficient

test suite that can cover the SUT within less time and cost by applying

intelligent search through the SUT using the parallel behavior of a group of

three bees.

6.2.4 Mathematical Model

The Objective function for test suite optimization is:

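$$\max \; \text{Happiness\_value}(\text{test-case}) \tag{6.1}$$

subject to

$$\text{Happiness\_value}(\text{test-case}) = \begin{cases} 1 & \text{if } \text{Coverage}(\text{test-case}) = 100\% \\ 0 & \text{otherwise} \end{cases} \tag{6.2}$$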

The objective function in (6.1) is to maximize the happiness value

of each test case by identifying the node with higher coverage value. The

constraint (6.2) indicates the happiness value based on coverage of test case

for each node.
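As a minimal sketch of (6.1) and (6.2), the following Python fragment scores test cases by their happiness value and picks the one that maximizes it. The function names and the sample coverage figures are illustrative assumptions, not part of the ABC Tester implementation.

```python
def happiness_value(coverage_percent: float) -> int:
    """Constraint (6.2): 1 when the test case reaches 100% coverage, else 0."""
    return 1 if coverage_percent >= 100.0 else 0


def best_test_case(coverage_by_test_case: dict) -> str:
    """Objective (6.1): return the test case with the maximum happiness value."""
    return max(coverage_by_test_case,
               key=lambda tc: happiness_value(coverage_by_test_case[tc]))


# Hypothetical coverage percentages for three test cases on one node.
sample = {"tc1": 60.0, "tc2": 100.0, "tc3": 85.0}
selected = best_test_case(sample)
print(selected)
```

Because the happiness value is binary, ties between equally scored test cases are broken by insertion order here; a real selector would need its own tie-breaking rule.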

6.3 RELATED WORK

Timothy et al (1991) proposed test coverage as an important aspect of a validation suite for implementations of a standard such as the CAIS (Common APSE Interface Set). Their paper presented the development and application of a constrained optimization process for CAIS test coverage. The approach identified resource (time and effort) and process constraints. A greedy algorithm was developed to provide a partial solution to the coverage design problem. The first-fit decreasing bin packing heuristic was then applied to refine the coverage measure of the validation test suite within the process constraints.

Tsong et al (1996) proposed a methodology for test suite optimization. In their approach, they applied dividing strategies to optimize the given test suite. The domain is divided in such a way that the test suite contains only effective test cases.

Chen and Lau (1998) proposed a method for optimizing a test suite. As per their paper, a test case in a test suite is said to be redundant if the same testing objective can still be satisfied by the other test cases of the test suite. In that paper, they proposed heuristics to optimize the size of a test suite. They later proposed a divide and conquer based approach for test suite optimization (2003).

Jones et al (1998) proposed a strategy using genetic algorithms to automate branch and fault based testing. Pargas et al (1999) applied a genetic algorithm (GA) to test data generation, using it to generate test cases that satisfy the given test adequacy criteria. Wegener et al (2001) applied a genetic algorithm based approach to generate test data in structural software testing.

Tibor et al (1999), in their paper expounded the mathematical

optimization method for test suite optimization based on cost and test

coverage, and applied this method to an ISDN protocol.

Christoph et al (2001) applied genetic algorithm for software test

suite optimization. In their paper, they demonstrated the experiments with test

generation problems for larger programs and more complex test adequacy

criteria. They found a widening gap between a technique based on genetic

algorithms and those based on random test generation.

Gregg et al (2003) proposed a methodology for effective test suite composition in regression testing. Their article reported the results of controlled experiments examining the effects of two factors in test suite composition, test suite granularity and test input grouping, on the costs and benefits of several regression-testing-related methodologies: retest-all, regression test selection, test suite reduction, and test case prioritization.

Frank and Joachim (2005) applied evolutionary technique for test

suite reduction in white-box testing. Their paper presented the development of

an evolutionary software measure which is able to predict the test effort for

individual test goals. It was based on four attributes of conditional statements.

Initial results showed that the predicted test effort for individual test goals

corresponds well to real measurements.

Lili et al (2005) proposed a new test-suite reduction technique for modified condition/decision coverage (MC/DC): a bi-objective model that considers both the degree to which test cases cover the test requirements and the capability of test cases to reveal errors.

Xiaofang et al (2008) proposed a test suite optimization technique.

The aim of their paper was to provide a guideline for choosing the appropriate

test suite reduction techniques for Boolean specification-based testing. Four

typical heuristic reduction strategies: G, GE, GRE, and H were introduced to

be compared empirically. Furthermore, testing requirement optimization was

combined to enhance these four reduction strategies.

As per Praveen (2009), test data generation is one of the key issues

in software testing. A properly generated test suite may not only locate the

errors in a software system, but also help in reducing the high cost associated

with software testing. His paper proposed Genetic Algorithm for test data

generation to achieve software test optimization.

6.4 PROPOSED ARTIFICIAL BEE COLONY BASED TEST

SUITE OPTIMIZATION FRAMEWORK

6.4.1 Need for Artificial Bee Colony (ABC) Based Approach

From the literature study on related work in software test suite optimization, the following observations were made.

The approaches proposed by Timothy et al (1991), Tsong et al (1996) and Tibor et al (1999) focused on test suite optimization based on coverage based test adequacy criteria. Their respective drawbacks are a focus on one particular interface, the need for a lot of human intervention, and applicability to one particular application.

The effects of granularity, and grouping technique (Gregg et al

2003) based on the cost and fault-detection effectiveness of regression testing

under the given methodologies were analyzed. The analysis showed that test

suite granularity significantly affects several cost-benefit factors for the

methodologies considered, while test input grouping has limited effects.

Further, the results exposed essential tradeoffs affecting the relationship

between test suite design and regression testing cost-effectiveness.

The approach proposed by Xiaofang et al (2008) presented an empirical evaluation of the reduction strategies on a set of Boolean specifications. This approach lacks general applicability.

MC/DC coverage (Lili et al 2005) requires showing that each condition independently affects the outcome of a decision. This, in turn, requires relating several test cases to each other. Model checkers, however, can only create a single counterexample at a time and do not offer a way to relate traces with each other.

It has also been observed that several works in recent years have applied Genetic Algorithms to test suite optimization (Jones et al 1998, Pargas et al 1999, Wegener et al 2001, Christoph et al 2001, Frank and Joachim 2005 and Praveen 2009). Their general drawbacks are getting stuck at locally optimal solutions and the lack of memorization of the best individuals during each generation.

For software systems with dynamic behavior, testing is a much more complex task because of their multi-threaded processing nature. This leads to the application of knowledge based approaches in test suite optimization. However, the drawbacks of the existing knowledge based approaches discussed in the literature study led this research work to focus on an alternate approach for test suite optimization, one that has the advantages of population based approaches without the problem of local optima.

A population based approach is essential for test suite optimization,

since a pool of test cases is needed to select a few efficient test cases in

forming an optimal test suite. Similarly, the swarm intelligence based

approach is selected here to apply intelligence in the searching process in

order to select the nodes with higher coverage.

The literature study on ABC, which is a population based swarm intelligence approach, provides considerable evidence for using it as an alternate optimization approach to hard optimization problems. Recent research and development of ABC based systems focus mostly on applications such as financial decision making systems, transportation, manufacturing, aerospace, military and so on (Karaboga et al 2007, 2008, Dusan et al 2006, Wong et al 2008, Adil et al 2007, Alok Singh 2009, Srinivasa Rao et al 2008, Karaboga 2009, Mohammad et al 2007).

In light of the above considerations, this research work applies ABC to the software test suite optimization problem. In the proposed approach, the functionality of the bee is extended to perform the testing and monitoring activity, so that it reduces manual work and improves confidence in the software by testing it against coverage based test adequacy criteria. The proposed ABC based model interacts with the developer and the SUT, and thus helps in speeding up the development and testing process and also provides an insight into the execution flow in the system.

6.4.2 Proposed Artificial Bee Colony Framework

The ABC Tester framework is shown in Figure 6.1. In it, the functionalities of the three bees are extended to three agents, namely the Search Agent, Selector Agent and Replace Agent, to produce a test suite of efficient test cases from among a near-infinite number of test cases.

Figure 6.1 ABC Based Test Suite Optimization Framework

The proposed approach maps the intelligent search behavior of the three bees onto three agents, respectively. The three agents work independently as per their assigned tasks and communicate with the other agents whenever they have to exchange information.

Because these agents work in parallel, solution generation becomes faster, which makes the approach efficient. Since the basic test adequacy criterion used is path coverage, the quality of the test suite is improved during each iteration to cover the paths in the software.

(a) Artificial bee colony optimization (ABC) – an introduction

Artificial Bee Colony (ABC) is one of the most recently defined

algorithms by Dervis Karaboga in 2005, motivated by the intelligent behavior

of honey bees. It is as simple as Particle Swarm Optimization (PSO) and Differential Evolution (DE) algorithms, and uses only common control parameters such as colony size and maximum cycle number (Karaboga et al 2007, 2008).

[Figure 6.1 content: the Search Bee, Selector Bee and Replace Bee operate on the SUT and feed the Optimized Test Suite Repository. In the ABC model, the three types of bees are employed, onlooker and scout. In the proposed approach, each bee is associated with an agent in the optimization model: the employed bee with the Search agent, the onlooker bee with the Selector agent, and the scout bee with the Replace agent.]

ABC as an optimization tool provides a population-based search procedure in which individuals, called food positions, are modified by the artificial bees over time. The bees' aim is to discover the places of food sources with a high nectar amount and finally the one with the highest nectar.

In the ABC system, artificial bees fly around in a multidimensional search space. Employed and onlooker bees choose food sources depending on their own experience and that of their nest mates, and adjust their positions. Scout bees fly and choose food sources randomly without using experience. If the nectar amount of a new source is higher than that of the previous one in their memory, the bees memorize the new position and forget the previous one.

In the ABC model, the colony consists of three groups of bees: employed bees, onlookers and scouts. It is assumed that there is only one artificial employed bee for each food source. In other words, the number of employed bees in the colony is equal to the number of food sources around the hive.

Employed bees go to their food source, come back to the hive and dance in the dance area. An employed bee whose food source has been abandoned becomes a scout and starts to search for a new food source. Onlookers watch the dances of the employed bees and choose food sources depending on the dances.

(b) Basic ABC Algorithm

The main steps of the basic ABC algorithm are given below:

Initial food sources are produced for all employed bees.

REPEAT

    Each employed bee goes to a food source in her memory and determines a neighbor source, then evaluates its nectar amount and dances in the hive.

    Each onlooker watches the dance of employed bees and chooses one of their sources depending on the dances, and then goes to that source. After choosing a neighbor around that, she evaluates its nectar amount.

    Abandoned food sources are determined and then replaced with the new food sources discovered by scouts.

    The best food source found so far is registered.

UNTIL (requirements are met)

In ABC which is a population based algorithm, the position of a

food source represents a possible solution to the optimization problem and the

nectar amount of a food source corresponds to the quality (fitness) of the

associated solution. The number of the employed bees is equal to the number

of solutions in the population.

At the first step, a randomly distributed initial population (food source positions) is generated. After initialization, the population is subjected to repeated cycles of the search processes of the employed, onlooker and scout bees, respectively. An employed bee produces a modification of the source position in her memory and discovers a new food source position.

Provided that the nectar amount of the new one is higher than that of the previous source, the bee memorizes the new source position and forgets the old one. Otherwise she keeps the position of the one in her memory.

After all employed bees complete the search process, they share the position information of the sources with the onlookers in the dance area. Each onlooker evaluates the nectar information taken from all employed bees and then chooses a food source depending on the nectar amounts of the sources.

As in the case of the employed bee, she produces a modification of the source position in her memory and checks its nectar amount. Provided that its nectar is higher than that of the previous one, the bee memorizes the new position and forgets the old one. The abandoned sources are determined, and new sources are randomly produced by artificial scouts to replace them.

Thus, the ABC system combines local search methods, carried out by employed and onlooker bees, with global search methods, managed by scouts, attempting to balance the exploration and exploitation processes.
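The cycle described above can be sketched in Python as follows. This is a minimal illustration of the basic ABC loop on a toy numeric objective (the sphere function), not the ABC Tester itself. The names `abc_minimize`, `colony_size`, `limit` and `max_cycles`, and the abandonment rule based on a trial counter, are assumptions taken from the commonly described form of the algorithm.

```python
import random


def abc_minimize(objective, bounds, colony_size=20, limit=20, max_cycles=200, seed=0):
    """Minimal ABC loop: employed, onlooker and scout phases over food sources."""
    rng = random.Random(seed)
    dim = len(bounds)
    sn = colony_size // 2            # one employed bee per food source

    def random_source():
        return [lo + rng.random() * (hi - lo) for lo, hi in bounds]

    def fitness(cost):
        # Higher fitness (nectar) for lower cost.
        return 1.0 / (1.0 + cost) if cost >= 0 else 1.0 + abs(cost)

    sources = [random_source() for _ in range(sn)]
    costs = [objective(s) for s in sources]
    trials = [0] * sn                # cycles without improvement, per source
    i_best = min(range(sn), key=lambda i: costs[i])
    best_pos, best_cost = sources[i_best][:], costs[i_best]

    def try_neighbor(i):
        # Modify one coordinate of source i relative to a random other source k.
        k = rng.choice([s for s in range(sn) if s != i])
        j = rng.randrange(dim)
        candidate = sources[i][:]
        phi = rng.uniform(-1.0, 1.0)
        candidate[j] = sources[i][j] + phi * (sources[i][j] - sources[k][j])
        lo, hi = bounds[j]
        candidate[j] = min(max(candidate[j], lo), hi)
        cost = objective(candidate)
        if cost < costs[i]:          # greedy selection: keep the better source
            sources[i], costs[i], trials[i] = candidate, cost, 0
        else:
            trials[i] += 1

    for _ in range(max_cycles):
        for i in range(sn):          # employed bee phase
            try_neighbor(i)
        fits = [fitness(c) for c in costs]
        for _ in range(sn):          # onlooker phase: choose sources by nectar
            i = rng.choices(range(sn), weights=fits)[0]
            try_neighbor(i)
        worst = max(range(sn), key=lambda i: trials[i])
        if trials[worst] > limit:    # scout phase: replace an abandoned source
            sources[worst] = random_source()
            costs[worst] = objective(sources[worst])
            trials[worst] = 0
        i_best = min(range(sn), key=lambda i: costs[i])
        if costs[i_best] < best_cost:  # register the best source found so far
            best_pos, best_cost = sources[i_best][:], costs[i_best]
    return best_pos, best_cost


def sphere(x):
    return sum(v * v for v in x)


pos, cost = abc_minimize(sphere, [(-5.0, 5.0)] * 2)
```

The best source is stored separately from the population so that a scout replacing a stagnant source cannot discard the best solution found so far.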

6.5 INTERNAL ARCHITECTURE OF ABC TESTER

The internal architecture of ABC Tester is shown in Figure 6.2. The system consists of three bees, namely the Search Bee, Selector Bee and Replace Bee, which act as agents. The bees' aim is to identify places with a higher feasibility value, that is, a higher coverage value for the given set of test cases. The bees communicate among themselves by means of a common agent communication language. Their parallel behavior is realized by means of multi-threading, and hence the bees work in parallel to achieve the desired result.

Figure 6.2 Internal Architecture of ABC Tester

Initially, a random population of test cases is generated. For each test case, the Search Bee goes to an executable state in the test path, as per the information in the knowledge source, and determines the best next neighbor node/state. This determination is made by analyzing all the neighbor nodes/states of the current node/state based on the selected test case's coverage. The Search Bee then evaluates the fitness value (nectar amount) of each node surrounding the current node for the selected test case. The fitness value is the happiness value heuristic, which is calculated based on the coverage of the given test case for each neighbor node. The best node to transit to is then chosen based on this heuristic.

The Selector Bee watches the Search Bee and selects the test cases

depending upon the fitness value associated with each test case. If the node is

not feasible or not covered by a particular test case, then the node is removed

from memory and the Selector Bee starts a new search for finding the node

with higher feasibility in that path. Based on that, a happiness value or

coverage measure is associated with each test case. A test case with highest

happiness value or coverage measure is remembered and all the other test

cases are removed from the memory.

If the Selector Bee finds that the selected test cases are not efficient

in terms of their coverage, then the Replace Bee generates a new population

of test cases and replaces the test cases in the existing test case set with new

test cases.

This cycle is repeated until a termination condition is met: either the maximum cycle number (MCN) is reached or the achieved coverage exceeds 95%.

6.6 ALGORITHM FOR TEST SUITE OPTIMIZATION

6.6.1 Heuristics Used in Test Suite Optimization

Happiness Value – It is used as the fitness value and is calculated

based on the coverage of each node by a given

test case.

6.6.2 Proposed ABC algorithm for Test Suite Optimization

Initialize the population of test cases

REPEAT

Step 1: The employed bee applies the test cases to the first executable node

in the SUT. Once the employed bee finishes its search process by

applying each test case to the node, the fitness value (coverage

value) of the test cases along with the node information is then

returned by the employed bee.

Step 2: The onlooker bee takes this information as input and evaluates the

coverage value of each test case taken from the employed bee.

Then it finds out the test case that has the highest coverage value of

the given node. The test case along with the covered node

information is memorized.

Step 3: Then the nodes which are adjacent to the covered node are explored

by the employed bee. Now the fitness value of the selected test case

against the explored neighborhood nodes is evaluated by the

onlooker bee.

Step 4: The node with the highest fitness value is selected and appended

with the existing selected node to indicate the test path. The test

case associated with this path is stored in the optimized-test case-

repository.

Step 5: Other nodes except the covered node and the test cases other than

the selected test case are abandoned and they are stored in

temporary-node-list and temporary-test case-list respectively.

Step 6 : If a test path is not complete, then repeat steps 3 to 6.

Otherwise, the nodes and test cases from the temporary-node-list

and temporary-test case-list are selected for the next test path

generation by the employed bees.

Step 7: If the onlooker bee finds that the selected test cases are not

efficient, then the scout bees generate a new population of test

cases and replace the test cases in the temporary-test case-list with

new test cases.

UNTIL the specified termination criterion is met (all the nodes have been visited at least once, the number of generations reaches the maximum, or the coverage criterion is met)
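Steps 3 to 6 amount to a greedy walk over the nodes of the SUT. The sketch below illustrates that walk under simplifying assumptions: the control flow graph is a hypothetical acyclic adjacency dictionary, the happiness value is 1 for a node covered by the selected test case and 0 otherwise, and `build_test_path` is an illustrative name rather than a function of the ABC Tester.

```python
def build_test_path(graph, start, happiness):
    """From `start`, repeatedly move to the adjacent node with the highest happiness value."""
    path = [start]
    abandoned = []                   # plays the role of the temporary-node-list
    current = start
    while graph.get(current):        # stop when the current node has no successors
        neighbors = graph[current]
        best = max(neighbors, key=happiness)
        abandoned.extend(n for n in neighbors if n != best)
        path.append(best)
        current = best
    return path, abandoned


# Hypothetical control flow graph: node -> adjacent nodes.
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}

# Hypothetical set of nodes covered by the selected test case.
covered = {1, 2, 3, 5}

path, abandoned = build_test_path(cfg, 1, lambda node: 1 if node in covered else 0)
print(path, abandoned)
```

The abandoned nodes are kept rather than discarded so that they can seed the search for the next test path, as Step 6 describes.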

6.6.3 Pseudo code of the ABC Algorithm

The pseudo code of the above algorithm for test suite optimization

problem is given below:

Step 1: Initialize the population of test cases xij (where ‘i’ indicates the value given for variables and ‘j’ represents the test path in the SUT, j = 1 to ‘n’, and ‘n’ represents the cyclomatic complexity value).

Step 2: Evaluate the population based on coverage based test adequacy

criterion.

Step 3: cycle =1

Step 4: repeat

Step 5: Produce new test cases vij in the neighborhood of xij for the employed bees using the formula:

$$v_{ij} = x_{ij} + q_{ij}\,(x_{ij} - x_{kj}) \tag{6.3}$$

where ‘k’ is a solution in the neighborhood of ‘i’ and ‘q’ is a random number in the range [-1, 1]. Then evaluate them based on the fitness value / happiness value for satisfying the coverage based test adequacy criterion.

Step 6: Apply the greedy selection process between ‘xi’ and ‘vi’.

Step 7: Calculate the probability values of test cases ‘xi’ by means of their fitness values using the equation:

$$P_i = \frac{fit_i}{\sum_{n=1}^{SN} fit_n}, \quad i = 1, \dots, SN \tag{6.4}$$

The Pi values are normalized into [0, 1].

In order to calculate the fitness values of solutions, the proposed approach employs the following equation:

$$fit_i = \begin{cases} \dfrac{1}{1 + hv(i)} & \text{if } hv(i) \ge 0 \\ 1 + |hv(i)| & \text{otherwise} \end{cases} \tag{6.5}$$

The fitness of each node is decided by the happiness value (hv) associated with each node based on constraint satisfaction.

For a > b, ci(n) = (b-a); if ci(n) < 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.6)

For a >= b, ci(n) = (b-a); if ci(n) <= 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.7)

For a < b, ci(n) = (b-a); if ci(n) > 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.8)

For a <= b, ci(n) = (b-a); if ci(n) >= 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.9)

For a == b, ci(n) = (b-a); if ci(n) = 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.10)

For a != b, ci(n) = (b-a); if ci(n) != 0, then hv(i) = MIN, otherwise hv(i) = MAX or 0 (6.11)

For a OR b, hv(i) = hv(ci(a)) + hv(ci(b)) (6.12)

For a AND b, hv(i) = MIN(hv(ci(a)), hv(ci(b))) (6.13)

Step 8: Produce new test cases vi for the onlookers from the test cases ‘xi’,

selected depending on ‘Pi’ and evaluate them.

Step 9: Apply the greedy selection process for the onlookers between ‘xi’

and ‘vi’.

Step 10: Determine the abandoned test case, if one exists, and replace it with a new randomly produced test case xi for the scout using the equation:

$$x_{ij} = min_j + rand(0,1)\,(max_j - min_j) \tag{6.14}$$

The scout waggle dances to indicate the new test case generation.

Step 11: Memorize the best test case achieved so far using the fitness value.

Step 12: cycle = cycle+1

Step 13: Until cycle=Max. Cycle Number (MCN)
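The happiness value rules (6.6) to (6.13), the fitness (6.5) and the selection probability (6.4) can be sketched as small Python functions. This is one possible reading, not the thesis implementation: the text leaves MIN and MAX unspecified, so the values `HV_MIN = 0.0` and `HV_MAX = 1.0` are assumptions, the "MAX or 0" alternative is read as MAX, and all function names are illustrative.

```python
HV_MIN, HV_MAX = 0.0, 1.0   # assumed constants; the text does not fix MIN and MAX

# Sign test applied to c = b - a for each relational operator, per (6.6)-(6.11).
SATISFIED = {
    ">":  lambda c: c < 0,
    ">=": lambda c: c <= 0,
    "<":  lambda c: c > 0,
    "<=": lambda c: c >= 0,
    "==": lambda c: c == 0,
    "!=": lambda c: c != 0,
}


def hv_relational(op, a, b):
    """Happiness value of the predicate `a op b`: HV_MIN when the sign test holds."""
    c = b - a
    return HV_MIN if SATISFIED[op](c) else HV_MAX


def hv_or(hv_a, hv_b):
    """Rule (6.12): sum of the two operand happiness values."""
    return hv_a + hv_b


def hv_and(hv_a, hv_b):
    """Rule (6.13): minimum of the two operand happiness values."""
    return min(hv_a, hv_b)


def fitness(hv):
    """Equation (6.5)."""
    return 1.0 / (1.0 + hv) if hv >= 0 else 1.0 + abs(hv)


def selection_probabilities(fits):
    """Equation (6.4): each fitness divided by the sum over the population."""
    total = sum(fits)
    return [f / total for f in fits]


hvs = [hv_relational(">", 5, 3), hv_relational("<", 5, 3), hv_relational("==", 2, 4)]
probs = selection_probabilities([fitness(hv) for hv in hvs])
print(probs)
```

With these assumed constants a satisfied predicate gets the lowest happiness value and therefore, through (6.5), the highest fitness, so onlookers favor test cases that satisfy the branch condition.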

6.7 EXPERIMENTATION AND EVALUATION

6.7.1 Tested Programs

A range of case studies, from simple desktop applications to complex web based applications, was conducted to evaluate the efficiency of the proposed approach against the existing GA based approach. The tested programs are listed in Table 6.1(a) and the experimental setup is given in Table 6.1(b).

Table 6.1(a) Tested Programs

Case Study # | Object Oriented Systems – in C++ and Java | Case Study Type | #Classes
-------------|--------------------------------------------|-----------------|---------
1  | Binary Search Tree using Java | Academic | 2
2  | Coffee/COCOA/ Money Lending Machine using C++ | Industrial | 15
3  | Stack using Java | Academic | 5
4  | Queue using Java | Academic | 5
5  | Library Management System using Java | Industrial | 20
6  | Students Mark Processing System using Java | Academic | 12
7  | Banking Transaction System using Java | Industrial | 14
8  | Shopping Cart using Java | Industrial | 12
9  | File System Manager using C++ | Academic | 7
10 | Network Monitor using C++ | Industrial | 28
11 | Examination Workflow system using Java | Industrial | 35
12 | Quiz using Java | Academic | 17
13 | Management Information System using Java | Industrial | 27
14 | Stock Maintenance using Java | Academic | 8
15 | Credit Card Validation using Java | Industrial | 22
16 | Linked Lists – Singly, Doubly and Circularly using Java | Academic | 14
17 | Anti Money Laundering System using Java | Industrial | 56
18 | Series Calculation – Sine and Cosine using Java | Academic | 6

Table 6.1(b) Experimental Setup - ABC and GA

Parameter setup | ABC Based | GA Based
----------------|-----------|---------
Type of algorithm | Population Based | Population Based
Fitness Function | Nectar Amount / Happiness value based on Path Coverage | Fitness value based on Path Coverage
No. of Cycles | MCN (Prescribed by the tester) - Maximum Cycle Number | MAX. (Prescribed by the tester)
Termination Criteria | Max. no. of cycles / Acceptable path coverage measure | Max. no. of generations / Acceptable path coverage measure
Population of test cases Generation | Using the formula of xij and vij | Using Crossover and Mutation
Pheromone / Non-Pheromone based | Non-Pheromone Based | Non-Pheromone Based
Communication about selection | Waggle Dance / Status flag | Status flag

The sample case studies demonstrated in this thesis show the coverage results for variables of type integer. For other data types such as float and double, the value range of ‘q’ should be chosen so that it produces test cases in floating point and double representations. For the char data type, the ASCII equivalent of the value is used for the generation of test cases. Similarly, for Boolean data types, either true or false values should be generated for the variables; this is achieved by toggling the values of ‘q’ during each cycle. For the string data type, radix form is used. Hence, the proposed approach is suited to problems that involve different data types.

6.7.2 Case Study 1 – Performance Evaluation of ABC

Consider a simple program to classify a triangle, the “Triangle Classification Problem”, shown in Figure 6.3. Its input is a triple of positive integers (a, b, c); the data type of the input parameters ensures that these are integers greater than 0 and less than or equal to 100. The program output is one of the following: Scalene Triangle, Isosceles Triangle, Equilateral Triangle, or Not a Triangle. This problem is chosen as the benchmark because it has been one of the best-known problems in software testing since 1979 (Myers 1979).

// Triangle Classification Problem
main()
{
    int a, b, c;
    boolean isatriangle;
1.  Print(“Triangle Classification Problem”);
    Print(“Enter three integers which are sides of a triangle”);
    Read(a, b, c);
2.  If ((a < (b+c)) && (b < (a+c)) && (c < (a+b)))
3.      isatriangle = true;
4.  Else isatriangle = false;
5.  If (isatriangle)
    {
6.      If (((a==b) xor (a==c) xor (b==c)) && !((a==b) && (a==c)))
7.          Print(“Isosceles Triangle”);
8.      If ((a==b) && (b==c))
9.          Print(“Equilateral Triangle”);
10.     If ((a!=b) && (a!=c) && (b!=c))
11.         Print(“Scalene Triangle”);
12. }
}

Figure 6.3 Sample code for Triangle Classification Problem
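For readers who want to run the benchmark, the classification logic of Figure 6.3 can be written as a small Python function. This is an illustrative equivalent, not the instrumented SUT used in the experiments: the function name `classify_triangle` is an assumption, and the range check on the inputs (1 to 100) is left to the caller.

```python
def classify_triangle(a: int, b: int, c: int) -> str:
    """Classify a triple of positive integers as a kind of triangle."""
    # Triangle inequality: every side must be shorter than the sum of the other two.
    if not (a < b + c and b < a + c and c < a + b):
        return "Not a Triangle"
    if a == b == c:
        return "Equilateral"
    if a == b or a == c or b == c:
        return "Isosceles"
    return "Scalene"


for sides in [(4, 1, 2), (5, 5, 5), (2, 2, 3), (3, 4, 5)]:
    print(sides, classify_triangle(*sides))
```

Checking the equilateral case before the isosceles case removes the need for the exclusive-or condition used at node 6 of the figure.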

For the code in Figure 6.3, the cyclomatic complexity value is 5, so there are five independent test paths in the given SUT; they are listed in Table 6.2. The blocks of executable statements are given in Figure 6.3, and sample test cases with expected output and path coverage details are shown in Table 6.3.

Table 6.2 Independent paths of Case Study #1

Path # | Path
-------|-----
1 | 1-2-4-13-14
2 | 1-2-3-5-6-7-8-10-12-14
3 | 1-2-3-5-6-8-9-10-12-14
4 | 1-2-3-5-6-8-10-12-14
5 | 1-2-3-5-6-8-10-11-12-14

Table 6.3 Sample Test Cases with expected output and path coverage details of Case Study #1

Test Case | A | B | C | Expected Output | Path Covered
----------|---|---|---|-----------------|-------------
1  | 4 | 1 | 2 | Not a Triangle | 1-2-4-13-14
2  | 1 | 4 | 2 | Not a Triangle | 1-2-4-13-14
3  | 1 | 2 | 4 | Not a Triangle | 1-2-4-13-14
4  | 5 | 5 | 5 | Equilateral | 1-2-3-5-6-8-9-10-12-14
5  | 2 | 2 | 3 | Isosceles | 1-2-3-5-6-7-8-10-12-14
6  | 2 | 3 | 2 | Isosceles | 1-2-3-5-6-7-8-10-12-14
7  | 3 | 2 | 2 | Isosceles | 1-2-3-5-6-7-8-10-12-14
8  | 3 | 4 | 5 | Scalene | 1-2-3-5-6-8-10-11-12-14
9  | 2 | 3 | 4 | Scalene | 1-2-3-5-6-8-10-11-12-14
10 | 5 | 4 | 6 | Scalene | 1-2-3-5-6-8-10-11-12-14

The initial set of test cases is generated as a random population

generation. Let Xij be the initial set of test cases.

Xij = {2, 2, 3}, i=1 to 3 for parameters a, b and c. And the values

are a=2, b=2 and c=3.

The search bee searches the node corresponds to it and it has

identified that, the test case is suited for node 2 in path2.

Hence, the fitness value is high for node 2, which makes the

selector bee to select this node as the best node to traverse. Then

the neighbor nodes are explored. And the test case is selected

for node6 which is meant for ‘Isosceles Triangle”. This

exploration is continued until it reaches a node that has no more

frontiers to explore.

At last, it has been identified that, the test case {2,2,3} covered

path j=2.

Then the next generation of test cases is generated using the

formula (6.3):

o V13= 2+1 * (2-2) = 2;

o V23=2+ (1) * (2-3) = 3; and

o V33=3+(1) * (3-2) = 4

Vij = {2,3,4} which is fit for node 2 and so the test case is

selected and exploration is done by the selector bee.

During exploration, the selector bee identifies that the generated test case is fit for node 10, which is meant for "Scalene Triangle", and the process continues until there are no more frontiers to explore.

It is then identified that the test case {2,3,4} covers path j=5.

Since there are no more test cases available for selection, the scout generates a new test case using formula (6.14):

Xij = minj + rand(0,1) * (maxj - minj)

o Xij = {2,2,3} + (-1) * ({2,2,4} - {2,2,3}) = {2,2,2}

During exploration, the search bee identifies that the test case {2,2,2} has the highest fitness value for node 2; the selector bee then selects it and explores the neighbor nodes. Node 8 is now selected as the best node to explore, and the process continues until there are no more frontiers to explore in that path.

Finally, it is identified that the test case {2,2,2} covers path j=3.

The process is continued to cover the entire SUT.

Again, the next test case is generated by the scout bee using formula (6.14):

o Vij = minj + rand(0,1) * (maxj - minj)

o => Vij = {2,2,2} + (1) * ({2,2,4} - {2,2,2}) = {2,2,4}, which leads to path 1, meant for "Not a Triangle".

This is repeated until all the test paths have been covered or a required path coverage percentage has been achieved.

Hence the test cases that cover the SUT are generated within fewer test runs.
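The neighbor-generation step of this walkthrough can be sketched in Java as follows. This is an illustrative sketch, not the thesis implementation: the class name `NeighborSketch` and the method `neighbor` are hypothetical, the neighbor index is assumed to be the cyclic successor k = (i + 1) % n with zero-based indices, and the random factors q are fixed to the integer values {1, -1, 1} that reproduce the test case {2,3,4} from {2,2,3}.

```java
import java.util.Arrays;

// Hypothetical sketch of the update v[i] = x[i] + q[i] * (x[i] - x[k]).
class NeighborSketch {
    static int[] neighbor(int[] x, int[] q) {
        int n = x.length;
        int[] v = new int[n];
        for (int i = 0; i < n; i++) {
            int k = (i + 1) % n;                 // assumed neighbor: next component, cyclic
            v[i] = x[i] + q[i] * (x[i] - x[k]);  // move relative to the neighbor
        }
        return v;
    }

    public static void main(String[] args) {
        int[] x = {2, 2, 3};    // initial test case (a, b, c)
        int[] q = {1, -1, 1};   // fixed factors for illustration; normally random in [-1, 1]
        System.out.println(Arrays.toString(neighbor(x, q))); // prints [2, 3, 4]
    }
}
```

In an actual run the factors q would be drawn at random, so the generated test case would differ from cycle to cycle.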

Result of case study 1: The behavior of ABC is evaluated for test suite optimization on a complex problem, the triangle classification problem. This is a well-known benchmark problem used to test the efficiency of test cases in several approaches. At the end of the experimentation, it has been identified that the performance of ABC in producing efficient test cases does not degrade; it continues to generate optimal and near-optimal solutions for the given benchmark problem. Only five test runs are needed to cover the entire SUT. If there are infeasible paths, these are also reported by the bees by means of the status flag associated with each node.

6.7.3 Case Study 2- Performance Comparison of ABC and GA

For the performance comparison between ABC and GA, the problem taken is the comparison of three numbers. Here the decisions are made based on the values of the parameters a, b and c, all of which are of integer data type.

    // Program to compare three values
    main() {
        int a, b, c;
    1   read a, b, c;
    2   if (a > b)
    3       printf("A is bigger than B");
    4   else if (b > c)
    5       printf("B is bigger than C");
    6   else if (a > c)
    7       printf("A is bigger than C");
    8   }

Figure 6.4 Sample Code for comparing three values

For the code in Figure 6.4, the cyclomatic complexity value is 4 (three decision points plus one), which means that there are four independent paths in the given problem.
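The node numbers of Figure 6.4 can be turned into a small path tracer that reports which test sequence a given triple (a, b, c) exercises. This is a hedged sketch: the class name `ComparePathSketch` and the method `tracePath` are hypothetical, and the node labels simply follow the numbering printed in the figure.

```java
// Hypothetical sketch: traces the node sequence of the program in Figure 6.4.
class ComparePathSketch {
    static String tracePath(int a, int b, int c) {
        StringBuilder path = new StringBuilder("1-2"); // node 1: read, node 2: first decision
        if (a > b) {
            path.append("-3");                         // "A is bigger than B"
        } else {
            path.append("-4");                         // second decision
            if (b > c) {
                path.append("-5");                     // "B is bigger than C"
            } else {
                path.append("-6");                     // third decision
                path.append(a > c ? "-7" : "-8");      // node 7: print, node 8: end
            }
        }
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(tracePath(2, 1, 1)); // prints 1-2-3
        System.out.println(tracePath(1, 2, 1)); // prints 1-2-4-5
        System.out.println(tracePath(1, 1, 1)); // prints 1-2-4-6-8
    }
}
```

A tracer of this kind is one simple way to obtain the coverage information that a fitness heuristic needs for each candidate test case.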


The initial set of test cases for each of these paths is created randomly and is shown in Table 6.4.

Table 6.4 Independent Paths and Initial Set of Random Test Cases of Case Study #2

S.No.   Path        Initial set of Test Cases (a,b,c) - Random Generation
1.      1-2-3       1,1,2
2.      1-2-4-5     1,1,1
3.      1-2-4-6-7   1,1,1
4.      1-2-4-6-8   1,1,1

(a) Performance of ABC

Let xij be the initial set of test cases for each path, where i = 1 to 3 for the variables ‘a’, ‘b’ and ‘c’, and j = 1 to the cyclomatic complexity value, which is the total number of independent paths in the program. Thus xij denotes component i of the test case related to path ‘j’; the test case is a triple <a,b,c>, with i = 1 for ‘a’, 2 for ‘b’ and 3 for ‘c’.

The initial test cases are assigned randomly to the four paths as xij = {1,1,2} for path 1, {1,1,1} for path 2, {1,1,1} for path 3 and {1,1,1} for path 4, as given in Table 6.4.

qij is a random number in the range [-1, 1], generated during program execution.

Now, vij = xij + qij (xij - xkj), where ‘k’ is a solution in the neighborhood of ‘i’ and ‘q’ is a random number. Here i = 1 to 3, j = 1, and k is the next index after i in cyclic order, i.e. k = (i+1) % size of the test case with zero-based indices (size = 3 since the SUT has 3 variables).

o v11 = x11 + q11(x11 - x21) = 1 + (1)(1-1) = 1

o v21 = x21 + q21(x21 - x31) = 1 + (-1)(1-2) = 2

o v31 = x31 + q31(x31 - x11) = 2 + (-1)(2-1) = 1

o Hence, vij = {1,2,1} for path j=1

Compare the fitness of xij and vij.

Apply the greedy selection process between xij and vij for path j=1. This must be done for all paths j.

Calculate the probability Pi for the solutions xij and vij by means of their fitness values:

$P_i = \dfrac{fit_i}{\sum_{n=1}^{SN} fit_n}$

Here, the fitness values are calculated using the formulae (6.4) and (6.5).

Based on these probabilities, new test cases vij for the onlookers are produced from the solutions xij selected depending on Pi, and are then evaluated.

Apply the greedy selection process between xij and vij.

Unfortunately, both of these test cases have a fitness value of 0, so the solutions need to be abandoned and replaced by new solutions generated by scout bees.

Determine the abandoned solutions and replace them with a new randomly produced solution xij using the scout by the formula (6.14):


xij = minj + rand(0,1) * (maxj - minj)

o Here the minimum valued test case is {1,1,1} and the maximum valued test case is {2,1,1}.

o Hence, xij = {1,1,1} + 1 * ({2,1,1} - {1,1,1}) = {1,1,1} + 1 * {1,0,0} = {1,1,1} + {1,0,0} = {2,1,1}

o Now, the new xij is {2, 1, 1}.

This test case has the highest fitness value for path j=1, and the scout now starts a new search using this new test case.

Memorize the best test case achieved so far based on fitness (probability value).

cycle = cycle + 1

Until cycle = Maximum Cycle Number (MCN)
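Two steps of the procedure above, the selection probability and the scout replacement, can be sketched in Java as follows. This is a minimal sketch under stated assumptions: the class name `ScoutSketch` and its method names are hypothetical, the random factor is passed in explicitly so that the worked example is reproducible, integer rounding is an assumption made for integer test data, and returning all-zero probabilities when every fitness is 0 is an assumed convention for the case in which all solutions are abandoned.

```java
import java.util.Arrays;

// Hypothetical sketch of the probability and scout steps.
class ScoutSketch {
    // P_i = fit_i / sum of all fit values; all zeros if the sum is zero (assumption).
    static double[] probabilities(double[] fit) {
        double sum = 0.0;
        for (double f : fit) {
            sum += f;
        }
        double[] p = new double[fit.length];
        if (sum == 0.0) {
            return p; // no solution is preferred; the scout takes over
        }
        for (int i = 0; i < fit.length; i++) {
            p[i] = fit[i] / sum;
        }
        return p;
    }

    // x_j = min_j + r * (max_j - min_j), applied component-wise and rounded to integers.
    static int[] scout(int[] min, int[] max, double r) {
        int[] x = new int[min.length];
        for (int j = 0; j < min.length; j++) {
            x[j] = (int) Math.round(min[j] + r * (max[j] - min[j]));
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(probabilities(new double[] {0.0, 0.0}))); // prints [0.0, 0.0]
        int[] replaced = scout(new int[] {1, 1, 1}, new int[] {2, 1, 1}, 1.0);
        System.out.println(Arrays.toString(replaced)); // prints [2, 1, 1]
    }
}
```

With r = 1 the scout reproduces the test case {2,1,1} of the worked example; in practice r would be drawn from rand(0,1) in each cycle.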

The above procedure is coded in Java, and the approach is applied to each unit of the software under test. Since the bees work in parallel, the decision-making process is faster.

After the scout bee identifies a new best solution, it waggle-dances in the dancing area. Here, a status flag is used by the scout bee to indicate this. The results of the ABC based framework in test case optimization are shown in Table 6.5.


Table 6.5 Results of ABC based framework in Test Case Optimization for Case Study #2

S.No.   Test Sequence   Test Case (a,b,c)   Probability value based on Fitness Value   Coverage % (Test sequence)
1.      1-2-3           1,1,2               0                                          0%
                        1,2,1               0                                          0%
                        2,1,1               1                                          100%
2.      1-2-4-5         1,1,1               0                                          0%
                        1,2,1               1                                          100%
3.      1-2-4-6-7       1,1,1               0                                          0%
                        1,2,1               0                                          0%
                        2,2,1               1                                          100%
4.      1-2-4-6-8       1,1,1               1                                          100%

(b) Performance of GA

(i) Test Case Construction - One point crossover and Mutation

One Point Crossover: The selected population is fragmented at some point m. The 0..m-1 portion of the first member is recombined with the m...n portion of the second member, and the 0..m-1 portion of the second member is recombined with the m...n portion of the first member.

For the given case study, the values of the variables a, b and c are:

Test Case 1: 1, 1, 2

Test Case 2: 1, 2, 1

After 1-point crossover at the second position, the new test cases are as follows:

Test Case 11: 1, 2, 2

Test Case 21: 1, 1, 1

This new generation of test cases is then evaluated for effectiveness, and each test case is either selected or removed.
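The one-point crossover described above can be sketched in Java as follows. The class name `CrossoverSketch` and the method `onePoint` are hypothetical, and the cut point m = 2 (a cut after the second gene, with zero-based indexing) is an assumption chosen because it yields the same pair of offspring as the example, {1,1,1} and {1,2,2}.

```java
import java.util.Arrays;

// Hypothetical sketch of one-point crossover on integer test cases.
class CrossoverSketch {
    // child 1 = p1[0..m-1] + p2[m..n-1], child 2 = p2[0..m-1] + p1[m..n-1]
    static int[][] onePoint(int[] p1, int[] p2, int m) {
        int n = p1.length;
        int[] c1 = new int[n];
        int[] c2 = new int[n];
        for (int i = 0; i < n; i++) {
            c1[i] = i < m ? p1[i] : p2[i];
            c2[i] = i < m ? p2[i] : p1[i];
        }
        return new int[][] { c1, c2 };
    }

    public static void main(String[] args) {
        int[][] children = onePoint(new int[] {1, 1, 2}, new int[] {1, 2, 1}, 2);
        System.out.println(Arrays.toString(children[0])); // prints [1, 1, 1]
        System.out.println(Arrays.toString(children[1])); // prints [1, 2, 2]
    }
}
```

The order in which the two children are returned is a choice of this sketch; only the pair of offspring matters for the subsequent evaluation.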

Mutation: This operation changes a member at the gene level and reproduces the remaining genes to create the new generation. In software testing, the set of test cases is considered the population, an individual test case is considered a member, and the possible values of the individual variables passed to a method call are considered the genes.

Parent 1 - Test Case 1: 1, 1, 2

Parent 2 - Test Case 2: 1, 2, 1

After the mutation operator is applied to these test cases, the new generation of test cases is generated as:

Child 1 - Test Case 11: 2, 1, 2

Child 2 - Test Case 21: 2, 2, 1

Now, this new generation of test cases is evaluated and the procedure is repeated.
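In the example above, both children differ from their parents only in the first gene, which changes from 1 to 2. A minimal Java sketch of such a gene-level mutation is given below; the class name `MutationSketch`, the method `mutate` and the explicit choice of gene index and new value are assumptions made to keep the example reproducible, whereas a GA would normally pick both at random.

```java
import java.util.Arrays;

// Hypothetical sketch of a single-gene mutation on an integer test case.
class MutationSketch {
    // Copies the parent and overwrites one gene; the parent itself is left unchanged.
    static int[] mutate(int[] parent, int geneIndex, int newValue) {
        int[] child = parent.clone();
        child[geneIndex] = newValue;
        return child;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(mutate(new int[] {1, 1, 2}, 0, 2))); // prints [2, 1, 2]
        System.out.println(Arrays.toString(mutate(new int[] {1, 2, 1}, 0, 2))); // prints [2, 2, 1]
    }
}
```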

(ii) Test Case Evaluation – Code Coverage criterion

The members of the population that have the most favorable features are assigned higher fitness values for evaluation. In software testing, the favorable feature is revealing a larger number of errors. The test cases with the highest coverage should be selected, as they can reveal errors during their execution.

Test Case 11’s Coverage % is 50%

Test Case 21’s Coverage % is 76%

Parent 1’s Coverage % is 60%

Parent 2’s Coverage % is 70%


(iii) Test Case Selection – Filtering Function

Selecting the best parent from the current population is called the selection process. This process leads to incremental solution generation. The test case that has the higher fitness value is selected as a parent for the next generation.

After the evaluation is done, the Parent 1 test case is replaced by Test Case 21. During the next iteration, this modified parent is used for the generation of test cases (offspring). The test case selection process is repeated until a termination condition, such as the maximum number of generations or an acceptable coverage %, is reached.
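The replacement of Parent 1 by Test Case 21 follows from the coverage values listed earlier (parents 60% and 70%, offspring 50% and 76%). One way to express this filtering step is sketched below in Java; the class name `SelectionSketch`, the method `nextParents` and the rule "replace the weakest parent by the best offspring if the offspring is better" are assumptions that happen to reproduce the outcome described in the text, not necessarily the exact filtering function used.

```java
import java.util.Arrays;

// Hypothetical sketch of a coverage-based filtering step.
class SelectionSketch {
    // Returns the coverage values of the parents for the next generation.
    static double[] nextParents(double[] parents, double[] offspring) {
        int weakest = 0;
        for (int i = 1; i < parents.length; i++) {
            if (parents[i] < parents[weakest]) {
                weakest = i;
            }
        }
        int best = 0;
        for (int i = 1; i < offspring.length; i++) {
            if (offspring[i] > offspring[best]) {
                best = i;
            }
        }
        double[] next = parents.clone();
        if (offspring[best] > parents[weakest]) {
            next[weakest] = offspring[best]; // the weakest parent is replaced
        }
        return next;
    }

    public static void main(String[] args) {
        double[] next = nextParents(new double[] {60, 70}, new double[] {50, 76});
        System.out.println(Arrays.toString(next)); // prints [76.0, 70.0]
    }
}
```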

Result of case study 2: As a result of case study 2, it has been identified that the ABC approach takes only three test runs to capture the test cases needed to cover the states, statements and branches in the given SUT. Also, the time taken is much lower because of the parallel behavior of the bees. In the case of GA, the total number of generations keeps increasing as long as the SUT is not covered by the current generation of test cases.

In this experiment, even after the number of generations was increased to ten in GA, some of the paths were still uncovered and some were covered repeatedly by different test cases. Since the search in GA is not guided by intelligence as in ABC, the solution generation is repeated until all the paths are covered (even for infeasible paths). Also, the coverage does not increase steadily as in ABC; rather, it rises and falls across the generations.


6.8 PERFORMANCE ANALYSIS

The approach has been tested on all the problems listed in Table 6.1(a). For comparison purposes, five academic problems shown in Table 6.6 and four industrial-strength problems listed in Table 6.7 are taken. They range from simple to complex based on the lines of code and methods per class. The test suites have been generated using GA and the proposed ABC Tester algorithm.

Table 6.6 Academic Problems (Sample)

S.No.   Academic Test Problem               Test Object No.   No. of Classes   Complexity Level
1.      Timetable Generation                ATP1              10               Medium
2.      Binary Search Tree Construction     ATP2              5                Low
3.      Online quiz system                  ATP3              17               High
4.      Students marks processing system    ATP4              12               Medium
5.      Attendance monitoring system        ATP5              18               High

Table 6.7 Industrial Problems (Sample)

S.No.   Industrial Test Problem         Test Object No.   No. of Classes   Complexity Level
1.      Anti-money laundering system    ITP1              56               High
2.      Multi-point software            ITP2              42               High
3.      Ticket Monitoring system        ITP3              29               Medium
4.      Stock Management System         ITP4              15               Low


6.8.1 Performance Comparison – ABC Vs. GA

The results are gathered in terms of path coverage, test runs and time taken for the entire test case optimization process. The results shown in Tables 6.8 and 6.9 indicate that, when compared to GA, ABC based test suite optimization produces higher coverage with fewer test cases and within fewer test runs.

For the ABC based test suite optimization framework, the parameters mentioned in Table 6.1 are applied. The initialization of test cases is done as a random process. The subsequent generations are then carried out by the three bees using the formulae (6.3) to (6.13).

The selection of test cases is based on the happiness value heuristic, which reflects the coverage of the SUT. For example, selecting the test case x[1]=2, x[2]=1, x[3]=1 results in coverage of the sequence 1-2-3. The number of test runs is increased to obtain coverage of the other paths.

The results gathered from the academic problems, shown in Table 6.8, indicate that even after the number of generations is increased, the path coverage percentage of GA remains low when compared to ABC.

Table 6.8 Results of Academic test problems

                        GA Based Optimization                                  ABC Based Optimization
S.No.   Test Problem    Coverage %   Generations   # Test Cases   Time (Sec.)   Coverage %   Cycles   # Test Cases   Time (Sec.)
1.      ATP1            70%          100           72             20            95%          50       24             10
2.      ATP2            65%          200           128            45            98%          80       52             14
3.      ATP3            75%          70            58             18            97%          35       26             8
4.      ATP4            90%          300           208            52            99%          50       80             10
5.      ATP5            85%          150           106            29            96%          68       68             12
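When reading a row of Table 6.8, the relative saving of ABC over GA can be computed directly from the listed values. The helper below is only an illustrative sketch: the class name `ReductionSketch` and the method `reductionPercent` are hypothetical, and the numbers used are those of the ATP1 row (72 versus 24 test cases, 100 generations versus 50 cycles).

```java
// Hypothetical helper for reading a row of Table 6.8.
class ReductionSketch {
    // Relative reduction of the ABC value with respect to the GA value, in percent.
    static double reductionPercent(int gaValue, int abcValue) {
        return 100.0 * (gaValue - abcValue) / gaValue;
    }

    public static void main(String[] args) {
        // ATP1: 72 test cases with GA, 24 with ABC; 100 generations with GA, 50 cycles with ABC.
        System.out.printf("Test cases: %.1f%%%n", reductionPercent(72, 24));
        System.out.printf("Cycles: %.1f%%%n", reductionPercent(100, 50));
    }
}
```

For the ATP1 row this gives a reduction of about two thirds in the number of test cases and one half in the number of generations/cycles.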


Table 6.9 Results of Industrial Test Problems

                        GA Based Optimization                                  ABC Based Optimization
S.No.   Test Problem    Coverage %   Generations   # Test Cases   Time (Sec.)   Coverage %   Cycles   # Test Cases   Time (Sec.)
1.      ITP1            70%          500           10042          40            92%          150      3540           22
2.      ITP2            75%          300           7120           35            98%          75       2508           20
3.      ITP3            80%          400           8648           52            97%          125      3068           25
4.      ITP4            68%          180           6040           20            98%          60       2164           10

Similarly, for the industrial test problems, Table 6.9 indicates that for complex systems involving multiple data types, the performance of ABC is superior and the test case generation process takes less time.

For GA based test case optimization, the parameters mentioned in Table 6.1 are used. Crossover and mutation operations were then applied to produce further generations. At the end of each generation, the fittest individuals are identified by computing the coverage metric. The individuals with the highest coverage metric survived and were used as parents for further generations. This process is continued until an acceptable path coverage value or the maximum number of test cycles is reached.

6.8.2 Comparison Charts – ABC Vs. GA

From Figures 6.5 and 6.6, it is understood that the fitness of the test cases is higher and improves steadily in ABC, whereas in GA the improvement of the test cases is non-linear and usually gets stuck at local optima.


[Figure: "Path Coverage - ABC"; y-axis: path coverage, 0 to 100; x-axis: No. of cycles, 1 to 29]

Figure 6.5 Path Coverage of test cases in ABC

Figure 6.6 Path Coverage of individuals in GA

The total number of generations/cycles taken by ABC and GA is shown in Figures 6.7 and 6.9 for the academic and industrial test problems. From these figures, it is understood that, when compared to GA, the proposed ABC based approach takes fewer cycles to achieve optimal or near-optimal solutions because of the parallel behavior of the three agents.

[Figure: "No. of Generations/Cycles - ABC Vs. GA"; x-axis: Academic Test Problems (ATP1 to ATP5); y-axis: 0 to 350; series: ABC, GA]

Figure 6.7 Generations/Cycles - ABC Vs. GA for Academic Problems

[Figure: "Path Coverage % - ABC Vs. GA"; x-axis: Academic Test Problems (ATP1 to ATP5); y-axis: Path Coverage %, 0% to 100%; series: ABC, GA]

Figure 6.8 Path Coverage % - ABC Vs. GA for Academic Problems

[Figure: "No. of Generations/Cycles - ABC Vs. GA"; x-axis: Industrial Test Problems (ITP1 to ITP4); y-axis: Generations / Cycles, 0 to 500; series: ABC, GA]

Figure 6.9 Generations/Cycles - ABC Vs. GA for Industrial Problems

[Figure: "Path Coverage % - ABC Vs. GA"; x-axis: Industrial Test Problems (ITP1 to ITP4); y-axis: Path Coverage %, 0% to 100%; series: GA, ABC]

Figure 6.10 Path Coverage % - ABC Vs. GA for Industrial Problems

Also, Figures 6.8 and 6.10 show the comparison of the path coverage % of the test cases generated using ABC and GA for the academic and industrial problems. From these figures, it is inferred that the path coverage of the test cases is high in ABC because of the selection of individuals at each generation by the three bees. If there is any improvement in the coverage %, the scout automatically generates the new set of test cases to be selected to form the optimal test suite. The time taken by ABC is less, since all three agents work in parallel and the solution is obtained by collaborative learning (waggle dance) among them. In GA, no such collaborative learning is available, and hence it usually takes more time to generate the solution.

6.9 SUMMARY

The test suite optimization process involves the generation of effective test cases in a test suite that can cover the given SUT in less time. The proposed framework, called “ABC Tester”, applies the artificial bee colony optimization (ABC) approach to test suite optimization.


Artificial Bee Colony (ABC) optimization is a non-pheromone based swarm intelligence approach motivated by the intelligent behavior of honey bees; the colony consists of three groups of bees, namely employed bees, onlookers and scouts.

The proposed approach based on Artificial Bee Colony (ABC) optimization represents each test case as a possible solution in the optimization problem, and the happiness value, a heuristic introduced for each test case, corresponds to the quality or fitness of the associated solution. The test suite optimization process is achieved by means of three agents, namely Search Agent, Replace Agent and Selector Agent, that mimic the behavior of the employed, onlooker and scout bees in the bee colony. The agents work in parallel and hence achieve better software test suite optimization.

In this research work, the application of ABC to software test suite optimization is demonstrated, and the superiority of the proposed approach over the existing GA based approach is shown. The size of the test suite is reduced by up to 84.7% (approx.) based on path coverage when compared to GA. Problems with GA include lack of memorization, non-linear optimization, risk of suboptimal solutions and delayed convergence. There is no guarantee of a near-global optimal solution, even though one may be reached. In contrast, ABC model based test suite optimization generates near-global optimal results and converges within fewer test runs.