Automatically Detecting Equivalent Mutants and Infeasible Paths Lectured by Oren Matza By Jefferson...

Automatically Detecting Equivalent Mutants

and Infeasible Paths

Lectured by Oren Matza

By Jefferson Offutt and Jie Pan

Introduction

Mutation testing is a technique proposed by DeMillo et al. (1978) and Hamlet (1977).

The idea is to take a program that behaves correctly on a test case, change it slightly (hence the term mutant), and find a test case on which the changed, faulty program fails.

The goal is to create many mutants of the original program, run them, and obtain results that differ from those of the original program.

A mutant whose behavior differs from the original is considered faulty; it is said to be “killed” (or “dead”) and is removed from the testing system.

The test case that killed it is called an “effective” test case and is saved in the testing system.

Example:

FUNCTION Min (I, J :Integer)

RETURN Integer IS

MinVal : Integer;

Begin

MinVal := I;

If (J < I) THEN

MinVal := j;

END IF;

RETURN (MinVal);

End Min

5 mutants

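FUNCTION Min (I, J :Integer)

RETURN Integer IS

MinVal : Integer;

Begin

MinVal := I;

MinVal := J; -- mutant 1 (replaces the line above)

If (J < I) THEN

If (J > I) THEN -- mutant 2 (replaces the original condition)

If (J < MinVal) THEN -- mutant 3 (replaces the original condition)

MinVal := j;

TRAP -- mutant 4 (replaces the assignment above)

MinVal := I; -- mutant 5 (replaces the assignment above)

END IF;

RETURN (MinVal);

End Min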

Mutants 1, 3, 5: an operand is changed

Mutant 2: an operator is changed

Mutant 4: a statement is inserted/changed
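The five mutants above can be tried out with a small script. This is a minimal sketch in Python rather than in the language of the slides; the function names (`min_original`, `mutant_1` ... `mutant_5`, `killing_tests`) and the `Trap` exception are assumptions made for illustration. On the small domain used here, mutant 3 is never killed, because `MinVal` still holds `I` when the condition is evaluated.

```python
class Trap(Exception):
    """Stands in for the TRAP statement of mutant 4 (hypothetical name)."""


def min_original(i, j):
    min_val = i
    if j < i:
        min_val = j
    return min_val


def mutant_1(i, j):          # MinVal := J replaces MinVal := I
    min_val = j
    if j < i:
        min_val = j
    return min_val


def mutant_2(i, j):          # (J > I) replaces (J < I)
    min_val = i
    if j > i:
        min_val = j
    return min_val


def mutant_3(i, j):          # (J < MinVal) replaces (J < I)
    min_val = i
    if j < min_val:
        min_val = j
    return min_val


def mutant_4(i, j):          # TRAP replaces MinVal := J
    min_val = i
    if j < i:
        raise Trap()
    return min_val


def mutant_5(i, j):          # MinVal := I replaces MinVal := J
    min_val = i
    if j < i:
        min_val = i
    return min_val


MUTANTS = {1: mutant_1, 2: mutant_2, 3: mutant_3, 4: mutant_4, 5: mutant_5}


def run(program, test_case):
    """Run a program; reaching TRAP counts as an observable, different result."""
    try:
        return program(*test_case)
    except Trap:
        return "TRAP"


def killing_tests(mutant, test_cases):
    """Return the test cases on which the mutant differs from the original."""
    return [tc for tc in test_cases if run(mutant, tc) != run(min_original, tc)]


if __name__ == "__main__":
    tests = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
    for number, mutant in MUTANTS.items():
        killers = killing_tests(mutant, tests)
        print(number, killers[0] if killers else "not killed on this domain")
```

A bounded search like this can only show that a mutant is killable; not finding a killing test case on a finite domain does not by itself prove equivalence.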

The process of testing

1. Mutants are created automatically (the types seen in the example).
2. Test cases are created automatically or manually.
3. The original program is run on one of the test cases. If it fails, the bug is fixed and the program is run again.
4. Otherwise, the test case is run against each “live” mutant. If the mutant's result differs from the result of the original program, the mutant is considered incorrect and is “killed” (and the test case is kept as an effective one).

At the end, 2 types of mutants remain:
1. Mutants that are “killable”, but the set of test cases is not wide enough to kill them.
2. Mutants whose behavior is equivalent to that of the original program, i.e. equivalent mutants (EM).

The testing process cannot finish until every remaining mutant is either killed or recognized as equivalent. Detecting which of the mutants are EM has been done manually. Doing this is:
- time consuming;
- difficult, because it is hard to see at a glance which mutant is an EM and which is not; this leads to marking EM as non-EM and vice versa (Acree 1980).
For this reason, considerable effort should go into finding the EM among the mutants.
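The testing loop described above can be sketched as follows. This is a hedged illustration in Python, not the Mothra implementation: `mutation_run`, the `oracle` parameter (the expected behavior against which the original program is checked) and the toy mutants are all assumed names.

```python
def mutation_run(original, oracle, mutants, test_cases):
    """Run every test case against the live mutants.

    Returns the mutants that are still live and the effective test cases.
    """
    live = dict(mutants)
    effective = []
    for tc in test_cases:
        expected = original(*tc)
        if expected != oracle(*tc):
            # The original program itself is wrong: fix the bug and run again.
            raise AssertionError(f"original program fails on {tc}")
        killed = [name for name, mutant in live.items() if mutant(*tc) != expected]
        if killed:
            effective.append(tc)          # keep the test case as an effective one
        for name in killed:
            del live[name]                # dead mutants leave the system
    return live, effective


if __name__ == "__main__":
    toy_mutants = {
        "a - b": lambda a, b: a - b,
        "a * b": lambda a, b: a * b,
        "a + b + 0": lambda a, b: a + b + 0,   # equivalent to the original
    }
    live, effective = mutation_run(
        original=lambda a, b: a + b,
        oracle=lambda a, b: sum((a, b)),
        mutants=toy_mutants,
        test_cases=[(0, 0), (2, 2), (1, 2)],
    )
    print(sorted(live), effective)
```

In this toy run the only mutant left alive is the equivalent one, which is exactly the kind of mutant the rest of the lecture tries to detect automatically.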

Mutants Automatic Creation

The Mothra mutation system (DeMillo et al 1988) uses 22 mutation operators to test Fortran 77 programs.

Mutation Operator   Description

AAR   array reference for array reference replacement
ABS   absolute value insertion
ACR   array reference for constant replacement
AOR   arithmetic operator replacement
ASR   array reference for scalar variable replacement
CAR   constant for array reference replacement
CNR   comparable array name replacement
CRP   constant replacement
CSR   constant for scalar variable replacement
DER   DO statement END replacement
DSA   DATA statement alterations
GLR   GOTO label replacement
LCR   logical connector replacement
ROR   relational operator replacement
RSR   RETURN statement replacement
SAN   statement analysis
SAR   scalar variable for array reference replacement
SCR   scalar for constant replacement
SDL   statement deletion
SRC   source constant replacement
SVR   scalar variable replacement
UOI   unary operator insertion

Distribution of EM

Previous work, and the research done for this work, found that EM are not evenly distributed among the mutant types; they tend to cluster around only a few types. The next table summarizes statistics from the 11 programs used in this work.

Mutant Type    % of equivalent    % of all mutants

ABS            47.19              4.30
ACR            14.10              1.28
SCR             7.05              0.64
UOI             6.04              0.55
SRC             4.89              0.45
SVR             4.46              0.41
ROR             3.60              0.33
SDL             2.16              0.20
CRP             1.58              0.14
AAR             1.44              0.13
RSR             1.44              0.13
LCR             1.15              0.10
ASR             1.15              0.10
CSR             1.01              0.09
SAR             1.01              0.09
All others      1.73              0.16

Total          100                9.10

The ABS mutant type has many more equivalent mutants than any other type. It can be divided into 3 unary operators:
- ABS -- computes the absolute value of the expression
- NEGABS -- computes the negative of the absolute value
- ZPUSH -- kills the mutant if the expression is zero; if the expression can never be zero, the mutant is equivalent (this forces the tester to cause the expression to be zero, a common testing heuristic)
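To see why ABS mutants are so often equivalent, consider an expression that can never be negative. The sketch below is an illustration in Python under assumed names (`span`, `span_mutant`, `is_killed`, and the `Killed` exception are hypothetical); it wraps the expression `hi - lo` with each of the three operators and assumes the precondition `hi >= lo`.

```python
class Killed(Exception):
    """Raised when ZPUSH sees a zero expression, which kills the mutant."""


def abs_op(value):
    return abs(value)


def negabs_op(value):
    return -abs(value)


def zpush_op(value):
    if value == 0:
        raise Killed()
    return value


def span(lo, hi):
    """Original program: number of elements in lo..hi, assuming hi >= lo."""
    return (hi - lo) + 1


def span_mutant(op, lo, hi):
    """Mutant: the expression (hi - lo) is wrapped by a unary operator."""
    return op(hi - lo) + 1


def is_killed(op, lo, hi):
    """True if the test case (lo, hi) distinguishes the mutant from the original."""
    try:
        return span_mutant(op, lo, hi) != span(lo, hi)
    except Killed:
        return True


if __name__ == "__main__":
    domain = [(lo, hi) for lo in range(0, 4) for hi in range(lo, 4)]
    for name, op in [("ABS", abs_op), ("NEGABS", negabs_op), ("ZPUSH", zpush_op)]:
        print(name, [tc for tc in domain if is_killed(op, *tc)])
```

Under the stated precondition the ABS mutant is never killed, NEGABS is killed whenever `hi > lo`, and ZPUSH is killed exactly when `hi == lo`.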

Do procedures for automatically detecting EM exist?

Budd and Angluin (1982) showed that the problem of determining whether two programs are equivalent is undecidable. A mutant and the original program are not just two arbitrary programs, since they are syntactically very similar, but Budd and Angluin also showed that this restricted problem is undecidable.

Still, because of the distribution of EM, and because a mutant differs from the original program in only one statement, in many cases (statistically) it can be determined that a mutant is an EM.

In his Ph.D. thesis (1988), Offutt presented a technique called Constraint-Based Testing (CBT), which uses mathematical constraints for testing.

DeMillo and Offutt (1991) presented how constraints can be used to generate test cases that satisfy mutation testing, but did not give details on how to use them to detect EM.

This work develops the idea, gives strategies and algorithms for detecting EM, and shows some results from implementing those algorithms.

Previous work

Baldwin and Sayward (1979) describe 6 types of compiler optimization techniques that can be used to identify EM. The motivation for using compiler optimization techniques was:
- programs after optimization are mutants of the original program;
- mutants can be optimizations or de-optimizations of the original program.

In 1994, Offutt and Craft designed algorithms for those 6 techniques, built a tool, and succeeded in finding 10% of the EM for 15 programs.

Feasible path problem

The original problem of FPA (feasible path analysis) is defined like this: given a description of a set of control flow paths through a procedure, feasible path analysis determines whether there is input data that causes execution to flow down some path.
The generalized feasible path problem (FTP) is: given a requirement for a test case, determine whether there is input data that can satisfy the requirement (the constraints).
This problem is undecidable (Goldberg et al. 1994; DeMillo and Offutt 1991).
This work focuses on determining feasibility with a heuristic-based set of transformations, and thereby on determining the equivalence of mutants.

In our context, a constraint is a mathematical expression that restricts the input space of the program to the portion of the space that satisfies a certain property. For example, the constraint (x > 0) restricts the input space to positive inputs. (Complex constraints can restrict the input space to inputs that represent, for example, a rectangle or a sorted array.)

We will use CBT to define constraints that represent the conditions under which the mutant is killed.

If some input satisfies the constraints, the mutant is not equivalent and that input kills it. If no such input exists (the constraints are infeasible), the mutant is an EM.
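As a rough illustration of what "an input that satisfies the constraints" means, the sketch below searches a small, finite input space by brute force. This is not the technique of the lecture (which looks for contradictions symbolically); the function `feasible` and the representation of constraints as Python predicates over an environment are assumptions made for the example.

```python
from itertools import product


def feasible(constraints, variables, values):
    """Return one assignment that satisfies every constraint, or None.

    constraints: predicates that take a dict {variable name: value}
    variables:   list of variable names
    values:      finite collection of candidate values for each variable
    """
    for point in product(values, repeat=len(variables)):
        env = dict(zip(variables, point))
        if all(constraint(env) for constraint in constraints):
            return env
    return None


if __name__ == "__main__":
    small_ints = range(-5, 6)

    # (x > 0) AND (x < 0): no input satisfies both.
    print(feasible([lambda e: e["x"] > 0, lambda e: e["x"] < 0], ["x"], small_ints))

    # (x > 0) AND (x + y < 0): satisfiable, the witness is a candidate test case.
    print(feasible([lambda e: e["x"] > 0, lambda e: e["x"] + e["y"] < 0],
                   ["x", "y"], small_ints))
```

A witness found this way is a genuine test case; a `None` result only means that nothing was found inside the bounded domain, which is why the lecture relies on detecting contradictions in the constraints themselves.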

Using Constraints to detect EM

The CBT technique

We will use the following notation: P = program, M = mutant of P, S = statement, TC = test case.

The state of the program is the values of all data items and the program counter.

To kill M, TC must have these 3 characteristics:

- Reachability (Cr): TC must execute the mutated statement. If it does not execute it, it definitely will not kill M.
- Necessity (Cn): TC must cause M to have an incorrect state once it reaches the mutated statement. If S is in a loop, the necessity condition must hold after each iteration. (The necessity constraint requires that two predicates/expressions evaluate to different results.)
- Sufficiency (Cs): the final state of M is different from the final state of P.

(Cn is necessary but not sufficient. Cs is both necessary and sufficient.)

Let D represent the entire domain of all TC for P. D can be divided in several ways, for each mutant:

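D = Dr ∪ ¬Dr
D = Dn ∪ ¬Dn
D = Ds ∪ ¬Ds
(Dr, Dn and Ds are the sets of test cases that satisfy Cr, Cn and Cs; ¬ denotes the complement within D.)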

Some facts

Fact 1 - TC is an effective test case that will kill M ⇔ TC ∈ Ds for M (trivial)
Fact 2 - If TC is an effective test case that will kill M, then TC ∈ Dr ∩ Dn (the converse does not hold: there are TC that satisfy Cr and also Cn but not Cs; an example follows later)
Fact 3 - Ds ⊆ Dr ∩ Dn

Unfortunately, finding TC such that TC ∈ Dr is an undecidable problem, because determining whether TC executes S is reducible to the halting problem. Thus a weaker condition is defined: CR is defined such that if S is executed, then CR is true.

Since Cr ⇒ CR, the following fact is clear:
Fact 4 - Dr ⊆ DR

CBT uses path expressions to describe the reachability condition (the weaker condition), CR, for a statement. A path expression for a statement S in P is an algebraic expression that describes a condition on test cases that is true when P reaches S. Path expressions usually describe multiple paths to S by using a disjunctive formula, where each clause represents a separate path. Path expressions are derived automatically from the program by extracting the predicate expressions on the program’s control flow graph.

[Figure: diagram of the test case domains DR, Dr, Dn and Ds]

Example

FUNCTION Mid (X, Y, Z : Integer) RETURN Integer IS
  MidVal : Integer;
BEGIN
  MidVal := Z;
  IF (Y < Z) THEN
    IF (X < Y) THEN
      MidVal := Y;
    ELSE IF (X < Z) THEN
    ELSE IF (X <= Z) THEN      -- mutant 1 (replaces the line above)
      MidVal := X;
    END IF
  ELSE
    IF (X > Y) THEN
      MidVal := Y;
    ELSE IF (X > Z) THEN
      MidVal := X;
    END IF
  END IF
  RETURN (MidVal);
END

When X = Z, Cn is true because (X < Z) != (X <= Z), but Cs is not satisfied (P returns Z and M returns X, which are equal). The mutant is not killed.
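The difference between necessity and sufficiency in the Mid example can be checked mechanically. The following Python sketch is an illustration only: `mid` transliterates the function above with a `mutated` flag, and `reaches` and `necessity` are hand-written versions of Cr and Cn for this one mutant (hypothetical helper names).

```python
def mid(x, y, z, mutated=False):
    """Middle value of three integers; mutated=True applies mutant 1."""
    mid_val = z
    if y < z:
        if x < y:
            mid_val = y
        elif (x <= z) if mutated else (x < z):
            mid_val = x
    else:
        if x > y:
            mid_val = y
        elif x > z:
            mid_val = x
    return mid_val


def reaches(x, y, z):
    """Cr: the mutated condition is evaluated."""
    return y < z and not x < y


def necessity(x, y, z):
    """Cn: the original and mutated predicates evaluate differently."""
    return (x < z) != (x <= z)


if __name__ == "__main__":
    points = [(x, y, z) for x in range(-3, 4) for y in range(-3, 4) for z in range(-3, 4)]
    cr_and_cn = [p for p in points if reaches(*p) and necessity(*p)]
    killed = [p for p in points if mid(*p) != mid(*p, mutated=True)]
    print(len(cr_and_cn), "inputs satisfy Cr and Cn;", len(killed), "inputs kill the mutant")
```

On this bounded domain there are inputs in Dr ∩ Dn, but none in Ds, which matches the observation that Cn is necessary but not sufficient.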

Formalization of what we saw until now

Constraints and detecting EM

P is a program and M is a mutant of P. P(TC) and M(TC) are the outputs of P and M on TC.

Definition
M is an EM of P ⇔ P(TC) = M(TC) for every TC ∈ D

This says that if a mutant is functionally equivalent to the original program, it is impossible to find any test case to kill the mutant:

¬(∃ TC | TC ∈ D ∧ P(TC) ≠ M(TC)) ≡ (∀ TC | TC ∈ D ⇒ P(TC) = M(TC))

This leads to the following theorems:

THEOREM 1
Dr = ∅ (Cr is infeasible) ⇒ M is EM

Proof
(1) M is equivalent ⇔ Ds = ∅      -- Definition, Fact 1
(2) Ds ⊆ Dr ∩ Dn                  -- Fact 3
(3) Dr = ∅ ⇒ Ds = ∅               -- Rules of sets, (2)
(4) Dr = ∅ ⇒ M is equivalent      -- Substitution of (1) in (3)

THEOREM 2
Dn = ∅ (Cn is infeasible) ⇒ M is EM

Proof
(1) M is equivalent ⇔ Ds = ∅      -- Definition, Fact 1
(2) Ds ⊆ Dr ∩ Dn                  -- Fact 3
(3) Dn = ∅ ⇒ Ds = ∅               -- Rules of sets, (2)
(4) Dn = ∅ ⇒ M is equivalent      -- Substitution of (1) in (3)

THEOREM 3
Dr ∩ Dn = ∅ (Cr ∧ Cn is infeasible) ⇒ M is EM

Proof
(1) M is equivalent ⇔ Ds = ∅          -- Definition, Fact 1
(2) Ds ⊆ Dr ∩ Dn                      -- Fact 3
(3) Dr ∩ Dn = ∅ ⇒ M is equivalent     -- Substitution of (1) in (2)

From the fact that Cr ⇒ CR (so Dr ⊆ DR), the following claims can be derived:

a) DR = ∅ ⇒ M is EM
b) DR ∩ Dn = ∅ ⇒ M is EM

All the above leads to the following conclusions:

(a) If the path expression constraint system (CR) for the mutated statement of M is infeasible, then the set of test cases (DR) that can kill M is empty, implying that M is never killed. So M is equivalent.
(b) If the necessity constraint system (Cn) for a mutant M is infeasible, then the set of test cases (Dn) that can kill M is empty, implying that M is never killed. So M is equivalent.
(c) If the constraint system that is the conjunction of CR and Cn is infeasible, then the set of test cases (DR ∩ Dn) that can kill M is empty, implying that M is never killed. So M is equivalent.

This means that to show that a constraint system is infeasible, we look for a contradiction in the constraint system itself, for example the constraint system (X > 0) ∧ (X < 0). If M has a constraint system like the one in the example, then it is an EM.

So far we have seen how the problem of detecting an EM can be translated into the problem of finding contradictions in a mathematical constraint system.

As a result, test case generation and EM detection both use constraints.

Representation of the constraints

A constraint is an expression composed of variables and operators from the programming language of P; it comes from the right-hand side of assignments and from the decision statements of P itself, and it evaluates to true or false.
A clause is a list of constraints connected with logical AND and OR.
A conjunctive clause uses only AND.
All constraints are kept in disjunctive normal form (DNF), which is a list of conjunctive clauses connected only by ORs.
A DNF formula is referred to as a constraint system, in which each conjunctive clause represents a path expression to a statement. During constraint satisfaction only one clause needs to be satisfied.
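One possible in-memory representation of such a constraint system is sketched below. This is an assumption made for illustration, not Godzilla's actual data structure: the `Constraint` class, the `ROPS` table and the `satisfies` function are hypothetical names, and terms are limited to variable names and integer constants.

```python
import operator
from dataclasses import dataclass

# Relational operators allowed in a constraint (assumed spelling).
ROPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
        "<=": operator.le, "=": operator.eq, "!=": operator.ne}


@dataclass(frozen=True)
class Constraint:
    left: object    # variable name (str) or integer constant
    rop: str        # key of ROPS
    right: object   # variable name (str) or integer constant

    def holds(self, env):
        """Evaluate the constraint to True or False under a variable assignment."""
        def value(term):
            return env[term] if isinstance(term, str) else term
        return ROPS[self.rop](value(self.left), value(self.right))


def satisfies(system, env):
    """DNF: a list of conjunctive clauses; one satisfied clause is enough."""
    return any(all(c.holds(env) for c in clause) for clause in system)


if __name__ == "__main__":
    # Two hypothetical paths to a statement, one conjunctive clause per path.
    system = [
        [Constraint("x", ">", 0), Constraint("y", "<", "x")],
        [Constraint("x", "<=", 0), Constraint("y", "=", 0)],
    ]
    print(satisfies(system, {"x": 1, "y": 0}))
    print(satisfies(system, {"x": 0, "y": 5}))
```

Keeping each path as its own clause means a conflict has to be found in every clause before the whole system can be called infeasible.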

Note: we said that a constraint includes variables from the program. Unfortunately, this includes “internal” variables. For test case generation, symbolic evaluation (King 1976; Offutt 1991) is used to rewrite the constraints in terms of input variables.

Finally, what we have all been waiting for:

The techniques to find the EM

Because the problem is undecidable, it cannot be solved completely by an algorithm, but because EM are currently detected manually, even a partial solution is valuable.

There are some off-the-shelf theorem provers for the infeasible-constraint problem, but we will not use them because: (1) such a prover gives much more than we need, and (2) it is difficult to integrate one into the already-existing software used for testing this work.
Because EM divide into more common and less common types, and because they differ from the original program in a well-defined way, we can use special techniques to deal with these cases.

3 techniques will be shown: Negation, Constraint splitting, Constant comparison.

Negation

Definition 1
Constraint C1 is the negation of C2 ⇔ the domains they describe: (a) do not overlap, and (b) cover the entire domain of the variables in C1 and C2.

Definition 2
Constraint C1 is a partial negation of C2 ⇔ the domains they describe: (a) do not overlap, and (b) do not cover the entire domain.

Definition 3
Two constraints are semantically equal if they describe the same domain.

Definition 4
Two constraints are syntactically equal if they describe the same domain and also have the same string of symbols. (Clearly, two syntactically equal constraints are also semantically equal.)

Examples
(1) A is the constraint x > 1, B is the constraint x <= 1. A is the negation of B and B is the negation of A.
(2) A is the constraint x > 1, B is the constraint x < 1. A is a partial negation of B and B is a partial negation of A.
(3) A is the constraint x > 0, B is the constraint x > 0. They are syntactically equal, so they are also semantically equal.
(4) A is the constraint x > 0, B is the constraint x >= 1 (x integer). They are semantically equal but not syntactically equal.

The negation technique is the basic technique for recognizing infeasible constraints. Just negate one of the constraints and see whether the two are now syntactically equal. If so, the constraints conflict (and the mutant is an EM).

For example, A is (x+y) > z and B is (x+y) <= z.

The following table shows how to negate or partially negate a constraint:

Constraint C     Negation of C    Partial negation 1   Partial negation 2

exp1 > exp2      exp1 <= exp2     exp1 < exp2          exp1 = exp2
exp1 >= exp2     exp1 < exp2      --                   --
exp1 < exp2      exp1 >= exp2     exp1 > exp2          exp1 = exp2
exp1 <= exp2     exp1 > exp2      --                   --
exp1 = exp2      exp1 ≠ exp2      exp1 > exp2          exp1 < exp2
exp1 ≠ exp2      exp1 = exp2      --                   --
true             false            --                   --
false            true             --                   --

What about an algorithm?

Algorithm: Negation (A, B)
Precondition: A and B are properly initialized constraints.
Postcondition: Returns conflict if A and B conflict, no-conflict otherwise.

begin
  neg-A = Negate (A)                        -- use the table shown above
  if (neg-A is syntactically equal to B)
    return conflict
  else if (the relational operator in A is one of {>, <, =})
    partial1-A = PartialNegate1 (A)         -- use the table shown above
    if (partial1-A is syntactically equal to B)
      return conflict
    else
      partial2-A = PartialNegate2 (A)       -- use the table shown above
      if (partial2-A is syntactically equal to B)
        return conflict
      else
        return no-conflict
      end-if
    end-if
  else
    return no-conflict
  end-if
end Negation
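The Negation algorithm can be sketched in a few lines of Python. The representation is an assumption: a constraint is a tuple `(exp1, rop, exp2)` of strings, and "syntactically equal" is taken to mean that the tuples are identical. The `true`/`false` rows of the table are left out of this sketch.

```python
# Negation and partial negation tables, keyed by relational operator.
NEGATION = {">": "<=", ">=": "<", "<": ">=", "<=": ">", "=": "!=", "!=": "="}
PARTIAL_NEGATIONS = {">": ("<", "="), "<": (">", "="), "=": (">", "<")}


def negation(a, b):
    """Return "conflict" if b is the negation or a partial negation of a."""
    exp1, rop, exp2 = a
    candidates = (NEGATION[rop],) + PARTIAL_NEGATIONS.get(rop, ())
    for candidate_rop in candidates:
        if (exp1, candidate_rop, exp2) == b:    # syntactic equality
            return "conflict"
    return "no-conflict"


if __name__ == "__main__":
    print(negation(("x+y", ">", "z"), ("x+y", "<=", "z")))   # full negation
    print(negation(("x", ">", "1"), ("x", "<", "1")))        # partial negation
    print(negation(("x", ">=", "1"), ("x", "=", "1")))       # compatible constraints
```

Because the comparison is purely syntactic, this sketch does not recognize a conflict between, say, `x+y > z` and `z >= x+y`; a real tool would have to normalize expressions first.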

Constraint Splitting

This technique is also used to recognize infeasible constraints. If C and D are two constraints and one of them has the form ((V1 AOP V2) ROP K), then we can split that constraint. Suppose C has this form. We split C into two new constraints A and B, such that C ⇒ A ∨ B. It will be shown that if both A and B conflict with D, then the original C also conflicts with D. If so, the constraints conflict (and the mutant is an EM).

Note: usually A and B are weaker than C, but it is easier to decide whether each of them conflicts with D.

Proof:
  C ⇒ A ∨ B
  ¬C ∨ (A ∨ B)            -- implication
  (A ∨ B) ∨ ¬C            -- commutativity
  ¬¬(A ∨ B) ∨ ¬C          -- negation
  ¬(¬A ∧ ¬B) ∨ ¬C         -- De Morgan
  (¬A ∧ ¬B) ⇒ ¬C          -- implication

Now we assume that A and B both conflict with D, so:
(1) ¬A ∧ ¬B ∧ D           -- assumption
(2) ¬A ∧ ¬B               -- (1), AND property
(3) (¬A ∧ ¬B) ⇒ ¬C        -- derived above
(4) ¬C                    -- implication elimination, (2), (3)
(5) D                     -- (1), AND property
(6) ¬C ∧ D                -- (4), (5), AND property

The following table shows how to split a constraint C into constraints A and B (such that C ⇒ A ∨ B). A "?" marks a relational operator that is missing from the source.

Original Constraint   New Constraint 1 (A)   New Constraint 2 (B)

(x + y) > 0           x > 0                  y > 0
(x + y) ≥ 0           x ≥ 0                  y ≥ 0
(x + y) < 0           x < 0                  y < 0
(x + y) ≤ 0           x ≤ 0                  y ≤ 0
(x + y) = 0           x ? 0                  y ? 0
(x + y) ≠ 0           x ≠ -y
(x - y) > 0           x > 0                  y < 0
(x - y) ≥ 0           x ≥ 0                  y ≤ 0
(x - y) < 0           x < 0                  y > 0
(x - y) ≤ 0           x ≤ 0                  y ≥ 0
(x - y) = 0           x ? 0                  y ? 0
(x - y) ≠ 0           x ≠ y
(x * y) > 0           x > 0 ∧ y > 0          x < 0 ∧ y < 0
(x * y) ≥ 0           x ≥ 0 ∧ y ≥ 0          x ≤ 0 ∧ y ≤ 0
(x * y) < 0           x > 0 ∧ y < 0          x < 0 ∧ y > 0
(x * y) ≤ 0           x ≥ 0 ∧ y ≤ 0          x ≤ 0 ∧ y ≥ 0
(x * y) = 0           x = 0                  y = 0
(x * y) ≠ 0           x ≠ 0                  y ≠ 0
(x / y) > 0           x > 0 ∧ y > 0          x < 0 ∧ y < 0
(x / y) ≥ 0           x ≥ 0 ∧ y > 0          x ≤ 0 ∧ y < 0
(x / y) ≤ 0           x ≥ 0 ∧ y < 0          x ≤ 0 ∧ y > 0
(x / y) < 0           x > 0 ∧ y < 0          x < 0 ∧ y > 0
(x / y) = 0           x = 0
(x / y) ≠ 0           x ≠ 0

The Algorithm

Algorithm: Splitting Constraints (NecConst, PEConst)
Precondition: NecConst and PEConst are properly initialized constraints.
Postcondition: Returns conflict if NecConst and PEConst conflict, no-conflict otherwise.

begin
  -- V1 and V2 are variables, K is a constant, aop is an arithmetic operator, rop is a relational operator.
  if (the format of NecConst is not ((V1 aop V2) rop K))
    return no-conflict
  else                                      -- use the table to split NecConst
    A = NewConstraint1 (NecConst)
    B = NewConstraint2 (NecConst)
  end if

  if (Negation (A, PEConst) == conflict) AND (Negation (B, PEConst) == conflict)
    return conflict
  else
    if (CompareConstants (A, PEConst) == conflict) AND (CompareConstants (B, PEConst) == conflict)
      return conflict
    else
      return no-conflict
    end if
  end if
end
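A reduced version of constraint splitting is sketched below in Python. Several things here are assumptions made for the example: only the `+` and `-` rows with `>`, `>=`, `<`, `<=` and K = 0 are covered, the path expression is treated as one conjunctive clause (a list of simple constraints), a split constraint is taken to conflict with the clause if it conflicts with any member of it, and only the Negation check is used (the constant comparison fallback is omitted).

```python
NEGATION = {">": "<=", ">=": "<", "<": ">=", "<=": ">", "=": "!=", "!=": "="}
PARTIAL_NEGATIONS = {">": ("<", "="), "<": (">", "="), "=": (">", "<")}

# (aop, rop) of the original constraint -> rops of new constraints A and B.
SPLIT = {
    ("+", ">"): (">", ">"),   ("+", ">="): (">=", ">="),
    ("+", "<"): ("<", "<"),   ("+", "<="): ("<=", "<="),
    ("-", ">"): (">", "<"),   ("-", ">="): (">=", "<="),
    ("-", "<"): ("<", ">"),   ("-", "<="): ("<=", ">="),
}


def conflicts(a, d):
    """Negation check: d is the negation or a partial negation of a."""
    var, rop, k = a
    candidates = (NEGATION[rop],) + PARTIAL_NEGATIONS.get(rop, ())
    return any((var, r, k) == d for r in candidates)


def split(nec_const):
    """Split ((v1 aop v2) rop 0) into A and B, or return None if unsupported."""
    (v1, aop, v2), rop, k = nec_const
    if k != 0 or (aop, rop) not in SPLIT:
        return None
    rop_a, rop_b = SPLIT[(aop, rop)]
    return (v1, rop_a, 0), (v2, rop_b, 0)


def splitting(nec_const, path_clause):
    """Return "conflict" only if both halves conflict with the path clause."""
    parts = split(nec_const)
    if parts is None:
        return "no-conflict"
    a, b = parts
    a_conflicts = any(conflicts(a, d) for d in path_clause)
    b_conflicts = any(conflicts(b, d) for d in path_clause)
    return "conflict" if a_conflicts and b_conflicts else "no-conflict"


if __name__ == "__main__":
    necessity = (("x", "+", "y"), ">", 0)
    print(splitting(necessity, [("x", "<=", 0), ("y", "<=", 0)]))
    print(splitting(necessity, [("x", "<=", 0)]))
```

Requiring both halves to conflict is what the proof needs: if the path clause forces ¬A and ¬B, it forces ¬C as well.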

Constant Comparison

This technique works when both constraints have the form (v rop k). The variable v must be the same in both, so the constraints are (v rop k1) and (v rop k2). (This strategy is also known as grounding.)

If a constraint has the format ((v aop k1) rop k2), we can rewrite it as (v rop (k2 aop' k1)), so that it is in the format we want (aop' is the inverse operation of aop). Then the following table can help us determine whether the two constraints conflict.

Constraint A   Constraint B   Predicate (pred)   Conclusion (T for conflict, F for not)

x > k1         x > k2         ---                F
x > k1         x ≥ k2         ---                F
x > k1         x < k2         k1 ≥ k2-1          if pred T, else F
x > k1         x ≤ k2         k1 ≥ k2            if pred T, else F
x > k1         x = k2         k1 ≥ k2            if pred T, else F
x > k1         x ≠ k2         ---                F

x ≥ k1         x > k2         ---                F
x ≥ k1         x ≥ k2         ---                F
x ≥ k1         x < k2         k1 ≥ k2            if pred T, else F
x ≥ k1         x ≤ k2         k1 > k2            if pred T, else F
x ≥ k1         x = k2         k1 > k2            if pred T, else F
x ≥ k1         x ≠ k2         ---                F

x < k1         x > k2         k1 ≤ k2+1          if pred T, else F
x < k1         x ≥ k2         k1 ≤ k2            if pred T, else F
x < k1         x < k2         ---                F
x < k1         x ≤ k2         ---                F
x < k1         x = k2         k1 ≤ k2            if pred T, else F
x < k1         x ≠ k2         ---                F

x ≤ k1         x > k2         k1 ≤ k2            if pred T, else F
x ≤ k1         x ≥ k2         k1 < k2            if pred T, else F
x ≤ k1         x < k2         ---                F
x ≤ k1         x ≤ k2         ---                F
x ≤ k1         x = k2         k1 < k2            if pred T, else F
x ≤ k1         x ≠ k2         ---                F

x = k1         x > k2         k1 ≤ k2            if pred T, else F
x = k1         x ≥ k2         k1 < k2            if pred T, else F
x = k1         x < k2         k1 ≥ k2            if pred T, else F
x = k1         x ≤ k2         k1 > k2            if pred T, else F
x = k1         x = k2         k1 ≠ k2            if pred T, else F
x = k1         x ≠ k2         k1 = k2            if pred T, else F

x ≠ k1         x > k2         ---                F
x ≠ k1         x ≥ k2         ---                F
x ≠ k1         x < k2         ---                F
x ≠ k1         x ≤ k2         ---                F
x ≠ k1         x = k2         k1 = k2            if pred T, else F
x ≠ k1         x ≠ k2         ---                F

Algorithm

Algorithm: CompareConstants (A, B)
Precondition: A and B are properly initialized constraints.
Postcondition: Returns conflict if A and B conflict, no-conflict otherwise.

begin
  -- V is a variable; K, K1, K2 are constants; rop is a relational operator; aop is an arithmetic operator.
  -- rop' is rop with its sides exchanged (for example < becomes >); aop' is the inverse operation of aop.
  if (the format of A is (V rop K))
    keep the format the same
  else if (the format of A is (K rop V))
    modify format to (V rop' K)
  else if (the format of A is ((V aop K1) rop K2))
    modify format to (V rop (K2 aop' K1))
  else if (the format of A is (K1 rop (V aop K2)))
    modify format to (V rop' (K1 aop' K2))
  else
    return no-conflict
  end if

  if (the format of B is (V rop K))
    keep the format the same
  else if (the format of B is (K rop V))
    modify format to (V rop' K)
  else if (the format of B is ((V aop K1) rop K2))
    modify format to (V rop (K2 aop' K1))
  else if (the format of B is (K1 rop (V aop K2)))
    modify format to (V rop' (K1 aop' K2))
  else
    return no-conflict
  end if

  if (the V's in A and B are not the same)
    return no-conflict
  end if
  if (ConstantComparison (A, B) == true)    -- see the table above
    return conflict
  else
    return no-conflict
  end if
end CompareConstants
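The table lookup at the heart of CompareConstants can be sketched as below. This Python version is an illustration under stated assumptions: variables range over the integers (which the k2-1 and k2+1 entries suggest), a constraint is a tuple `(v, rop, k)` or `((v, aop, k1), rop, k2)`, only `+` and `-` are normalized, and the `(K rop V)` forms are left out. `brute_force_conflict` is a hypothetical helper used only to sanity-check the table on a bounded domain.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "=": operator.eq, "!=": operator.ne}

# (rop of A, rop of B) -> predicate on (k1, k2) that means "conflict".
# Pairs that are missing from the table never conflict.
TABLE = {
    (">", "<"): lambda k1, k2: k1 >= k2 - 1,
    (">", "<="): lambda k1, k2: k1 >= k2,
    (">", "="): lambda k1, k2: k1 >= k2,
    (">=", "<"): lambda k1, k2: k1 >= k2,
    (">=", "<="): lambda k1, k2: k1 > k2,
    (">=", "="): lambda k1, k2: k1 > k2,
    ("<", ">"): lambda k1, k2: k1 <= k2 + 1,
    ("<", ">="): lambda k1, k2: k1 <= k2,
    ("<", "="): lambda k1, k2: k1 <= k2,
    ("<=", ">"): lambda k1, k2: k1 <= k2,
    ("<=", ">="): lambda k1, k2: k1 < k2,
    ("<=", "="): lambda k1, k2: k1 < k2,
    ("=", ">"): lambda k1, k2: k1 <= k2,
    ("=", ">="): lambda k1, k2: k1 < k2,
    ("=", "<"): lambda k1, k2: k1 >= k2,
    ("=", "<="): lambda k1, k2: k1 > k2,
    ("=", "="): lambda k1, k2: k1 != k2,
    ("=", "!="): lambda k1, k2: k1 == k2,
    ("!=", "="): lambda k1, k2: k1 == k2,
}


def normalize(constraint):
    """Rewrite ((v aop k1) rop k2) as (v rop k); None if the form is unsupported."""
    left, rop, k = constraint
    if isinstance(left, str):
        return left, rop, k
    v, aop, k1 = left
    if aop == "+":
        return v, rop, k - k1
    if aop == "-":
        return v, rop, k + k1
    return None


def compare_constants(a, b):
    a, b = normalize(a), normalize(b)
    if a is None or b is None or a[0] != b[0]:
        return "no-conflict"
    pred = TABLE.get((a[1], b[1]))
    return "conflict" if pred is not None and pred(a[2], b[2]) else "no-conflict"


def brute_force_conflict(a, b, domain=range(-10, 11)):
    """Sanity check: no integer in the domain satisfies both simple constraints."""
    return not any(OPS[a[1]](x, a[2]) and OPS[b[1]](x, b[2]) for x in domain)


if __name__ == "__main__":
    print(compare_constants(("x", ">", 3), ("x", "<", 4)))
    print(compare_constants((("x", "+", 2), ">", 5), ("x", "<=", 3)))
```

For example, `x > 3` and `x < 4` conflict over the integers because 3 ≥ 4 - 1, while over the reals they would not; the sketch inherits that integer assumption from the table.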

A proof of concept tool

To test the techniques developed above, a tool called Equivalencer was created. Equivalencer:
- is integrated with Godzilla, a test data generator;
- is inserted into the Mothra mutation tool set;
- is implemented in C;
- works on a Sun Sparc workstation running SunOS 4.1.3;
- contains more than 2000 executable lines of code;
- uses some of the Mothra and Godzilla libraries.
It implements the 3 strategies for detecting EM. First it applies Negation; if there is no conflict, Constant comparison; if there is still no conflict, Constraint splitting.

Assertions
In Equivalencer, assertions are constraints that the user inserts into the tested program to manually restrict the input domain of some variables.
Assertions on parameter variables are preconditions and are derived by hand from the specifications.
Assertions on internal variables can be derived automatically, using slicing (Weiser 1984) and control flow analysis (Fischer and LeBlanc 1988).

When Godzilla generates constraints related to an array, it does not take the array index expression into account. This means that A(i) > 0 and A(j) < 0 will generate A() > 0 and A() < 0, which leads to a conflict that may not be real.

To avoid this, Equivalencer checks an array constraint only if it is an assertion constraint that relates to the whole array. This is called array-extension.

Equivalencer Design

1. Initialization: open all the files it needs and load data into memory.
2. Consult Godzilla's failure information about simple EM cases. “Exit” with conflict if found.
3. Get the path expressions from Godzilla and combine them with the assertions. Check with all 3 techniques. “Exit” with conflict if found.
4. Take the necessity constraints and check with all 3 techniques. “Exit” with conflict if found.
5. Combine path expressions and necessity constraints and check with all 3 techniques. “Exit” with conflict if found.
6. Combine necessity constraints and array-extension and check with all 3 techniques. “Exit” with conflict if found.
7. Combine path expressions and array-extension and check with all 3 techniques. “Exit” with conflict if found.
8. Exit with no-conflict (no EM).

Note about efficiency

As a proof-of-concept tool, it was not meant to be efficient, so every path expression is checked against every necessity constraint. This increases execution time.

Empirical results

Program      Statements   Mutants   Equivalents   Equiv. Detected   Percent Detected

Bsearch      20           299       27            19                70.37
Bub          11           338       35            24                68.57
Cal          29           3010      236           37                15.67
Euclid       11           196       24            18                75.00
Find         28           1022      75            63                84.00
Insert       14           460       46            32                69.57
Mid          16           183       13            3                 23.08
Pat          17           513       61            29                47.54
Quad         10           359       31            4                 12.90
Trityp       28           951       109           80                73.39
Warshall     11           305       35            22                62.86

Total/Avg    185          7636      695           331               47.63

Discussion of the results
Although the overall detection rate is less than 50%, the percentage varies dramatically between programs (from 12.9% to 84%). Several factors contribute to this, but perhaps the most important is the limitations of the tools that Equivalencer is based upon.
First is the problem with array references discussed before. If a constraint such as A(4) = 0 could be taken into account, then the mutant could have been killed if A(4) was not 0.
Also, Godzilla associates every variable in each statement with a statement number, which propagates to the next statement if the value of the variable does not change. This makes checking the constraints complicated, especially because Godzilla has a bug in it.

Feasible Path results

Although this research focused on EM, the same technique can be used on the problem of infeasible paths. Specifically, a result of Theorem 1 is that if the reachability condition for a statement is infeasible, then the statement is unreachable. To test this, some programs were artificially created such that some of their statements were unreachable, and Equivalencer was tried on them. (A mutation operator was defined such that S is unreachable ⇔ M is EM.) The results are shown in the table below:

Program       Unreachable   Detected   Percentage detected

prog 1        2             1          50.00
prog 2        1             0          0.00
prog 3        1             1          100.00
prog 4        1             1          100.00
prog 5        1             1          100.00
prog 6        1             1          100.00
prog 7        2             2          100.00
prog 8        3             3          100.00
prog 9        2             0          0.00

Total/Avg.    14            10         71.43

Improving the software

Equivalencer relies very heavily on Godzilla. Godzilla implements symbolic evaluation as a separate step from infeasible constraint detection and throws away considerable information that Equivalencer needs, for example references to internal variables.

Godzilla generates array constraints without indexes. An analysis of the CAL program found that if the indexes could also be taken into account, Equivalencer could find 69 more EM, increasing detection from 15.67% to 44.92%.
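The CAL figures can be re-derived from the results table (236 equivalent mutants, 37 detected) and the 69 additional EM mentioned above. The short Python check below is only arithmetic; `detection_rate` is a hypothetical helper name.

```python
def detection_rate(detected, equivalent):
    """Percentage of equivalent mutants that were detected."""
    return 100.0 * detected / equivalent


if __name__ == "__main__":
    cal_equivalent = 236   # equivalent mutants of CAL (results table)
    cal_detected = 37      # detected by Equivalencer (results table)
    cal_extra = 69         # additional EM if array indexes were available

    print(f"current:   {detection_rate(cal_detected, cal_equivalent):.2f}%")
    print(f"projected: {detection_rate(cal_detected + cal_extra, cal_equivalent):.2f}%")
```

The projected value rounds to 44.92%, in line with the figure quoted above; the current value computes to about 15.68%, so the 15.67 in the table appears to be truncated rather than rounded.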

Give the tester the opportunity to help in recognizing difficult constraints. This is easier for the tester than finding the EM by hand, and it helps the tool find EM.

The software analyzes the program only up to the mutated statement, but the analysis could be carried further, to the end of the program.

Conclusions

A partial solution to the EM problem was shown here. Algorithms were given, and a proof-of-concept tool was built. It was shown that this is an effective partial solution. The technique is general and can be applied to any feasible path problem. Compared with finding EM by hand, even this non-optimized tool was fast. This type of system allows the programmer to submit a software module and, after a few minutes of computation, receive a set of test cases as inputs and a set of outputs to be examined for failures in the software. These inputs and outputs can later be used for debugging when a failure is found.

The End
