Combinatorial Testing Review “Combinatorial Software Testing”, Kuhn, Kacker, Lei, Hunter, IEEE...

Combinatorial Testing Review

• “Combinatorial Software Testing”, Kuhn, Kacker, Lei, Hunter, IEEE Computer, Aug. 2009

• Compared “traditional” software test development with simple combinatorial pairwise testing (2-way combinations)

• Measures: – fault detection effectiveness, – fault detection rate

Pairwise covering array

All pairs of values appear somewhere in array

10 Project Study

• Two teams of testers• One group – ad hoc test development

(common practice) – use cases based on requirements documents

• Second group – all pairwise combinations of system input values

Results

• Higher fault detection effectiveness (13% better) for 2-way tests

• Much higher fault detection per hour 2.4X

Questions to think about

• Why were results better for 2-way test?• Why was fault detection rate so much higher?• Input model refers to the set of parameter

names and values used in testing. For a given input model, when might the “use case” scenario based test method be better than 2-way?

Fault distribution for various application domains

• What does this graph suggest about the effectiveness of pairwise (2-way) testing?

Oracle-free Testing withTwo-layer Covering Arrays

Rick KuhnNational Institute of

Standards and TechnologyGaithersburg, MD

East Carolina UniversityNSF Research Experiences for Undergraduates

June 29, 2015

Some current approaches

• Fuzz testing – send random values until system fails, then analyze memory dump, execution traces

• Metamorphic testing – e.g. cos(x) = cos(x+360), so compare outputs for both, with a difference indicating an error.

• Partial test oracle – e.g., insert element in data structure S, check

New method

• Consider equivalence classes• Example: shipping cost based on distance d and

weight w, with packages < 1 pound are in one class, 1..10 pounds in another, > 10 in a third class.

• Then for cost function f(d,w), f(d, 0.2) = f(d, 0.9), for equal values of d.

• But f(d, 0.2) ≠f(d, 5.0), because two different weight classes are involved.

Basic property of equivalence classes

when a1 and a2 are in the same equivalence class,

f(a1,b,c,d,…) f(a2,b,c,d,…), where is equivalence with respect to some predicate.

If not,- then either the code is wrong, - or equivalence classes are not defined correctly.

Can we use this property for testing?

• Let’s do an example: access control. access is allowed if (1) subject is employee and time is in working hours and it’s a weekday; or(2) subject is an employee with administrative privileges; or (3) subject is an auditor and it is a weekday.

• Equivalence classes for time of day and day of the week• time = minutes past midnight (0..0539), (0540..1020), (1021..1439). • Days of the week = weekend and weekdays,

designated as (1,7) and (2..6) respectively.

Code we want to test

int access_chk() { if (emp && t >= START && t <= END && d >= MON && d <= FRI) return 1; else if (emp && p) return 2; else if (aud && d >= MON && d <= FRI) return 3; else return 0;}

Establish equivalence classes

emp: booleanday: (1,7), (2,6) A1 A2 time: (0,100,539),(540,1020),(1021,1439) B1 B2 B3 priv: booleanaud: boolean

emp (bool) : 0,1day (enum) : A1,A2time (enum): B1,B2,B3priv (bool): 0,1aud (bool) : 0,1

All of these should be equal

Eq. class A1

Eq. class B1

These should also be equal

Eq. class A2

Eq. class B1

Now we’re using class A2

Covering arrayPrimaryarray:0,A2,B1,1,11,A1,B1,0,00,A1,B2,1,01,A2,B2,0,10,A1,B3,0,11,A2,B3,1,0

One secondary array for each row

Class A2 = (2,6)Class B1 = (0,539)

emp: booleanday: (1,7), (2,6) A1 A2 time: (0,539),(540,1020),(1021, 1439) B1 B2 B3 priv: booleanaud: boolean

0 2 0 1 10 6 0 1 10 2 539 1 10 6 539 1 1

Run the tests

• Correct code output:

333300000000111100002222

Faulty code:if (emp && t>=START && t==END && d>=MON && d<=FRI) return 1;

Faulty code output:333300000000331100002222

What’s happening here?Input domain

Incorrect boundary

We simply detect inconsistency between partitions

Can this really work on practical code?

Primary x secondary #tests total

faultsdetected

3-way x 3-way 285x8 2280 64-way x 3-way 970x8 7760 22

Experiment: TCAS code (same used in earlier model checking tests)• Small C module, 12 variables• Seeded faults in 41 variants

• Results:

• More than half of faults detected• Large number of tests -> but fully automated, no human

intervention• We envision this type of checking as part of the build process;

can be used in parallel with static analysis, type checking

Prototype tool

Next Steps• Realistic trial use• Different constructions for secondary array, e.g.,

random values• Formal analysis of applicability – range of

applicability/effectiveness, limitations, special cases

• Determine how many faults can be detected this way

• Develop tools to incorporate into build process

Why this works for testing• If Ai, Bi, Ci, etc. are equivalence classes,

A1 = 0,1,2; A2 = 300,400; etc. B1 = 22,33,55; B2 = 12, 50,99 etc.C1 = 1000, 2000, 4000 …. etc.then:

• f(0, 22, …) = f(1, 22,1000,…) = f(2, 22,1000,…)and since eq. class B1 includes 22,33,55, then f(0, 22, …) = f(0, 33,1000,…) = f(0, 55,1000,…)= f(1, 22,1000,…) = f(1, 33,1000,…) = f(1, 55,1000,…)… and on and on … using all possible combinations of equivalent values

• We use a covering array of t-way combinations to make it tractable

Why this works for testing• If Ai, Bi, Ci, etc. are equivalence classes,

A1 = 0,1,2; A2 = 300,400; etc. B1 = 22,33,55; B2 = 12, 50,99 etc.C1 = 1000, 2000, 4000 …. etc.then:

Combinatorial Testing Review “Combinatorial Software Testing”, Kuhn, Kacker, Lei, Hunter, IEEE...

Documents

Transcript of Combinatorial Testing Review “Combinatorial Software Testing”, Kuhn, Kacker, Lei, Hunter, IEEE...