Software Testing • Basic concepts • Test equivalence ...

Software Testing

• Basic concepts

• Test equivalence

• Terminology

• Dimensions of testing

• Testing methods

– Black box testing

– White box testing

Basic Concepts

• Testing involves

– Running actual code

– On selected inputs / in selected environments

– Then observing the behaviour of the program / system

– And deciding whether the behaviour is acceptable

• Main goal of testing: find bugs, break software

• Subsidiary goal: be thorough

Test Cases

• Test case: everything needed for one test run

• Test case must contain:

– Inputs to the software under test

– Expected outputs

• Also useful:

– Reason for test case

– What it contributes to thoroughness

• Successful test case: one in which the software behaves correctly

• Failing test case: one in which the software behaves incorrectly
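A test case of this shape can be sketched as a small record plus a runner. This is only an illustration: the names `TestCase`, `run_test`, and the deliberately faulty `buggy_abs` function are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class TestCase:
    inputs: Tuple[Any, ...]   # inputs to the software under test
    expected: Any             # expected output
    reason: str = ""          # optional: why this test case exists


def run_test(sut: Callable[..., Any], case: TestCase) -> bool:
    """Return True if the test case is successful (actual == expected)."""
    return sut(*case.inputs) == case.expected


def buggy_abs(x: int) -> int:
    # Hypothetical software under test with a seeded fault:
    # negative numbers are returned unchanged.
    return x if x > 0 else x


cases = [
    TestCase((5,), 5, "typical positive input"),
    TestCase((-5,), 5, "typical negative input"),
]
# One entry per test case: True = successful, False = failing.
results = [run_test(buggy_abs, c) for c in cases]
```

Here the first test case is successful and the second is failing, which is the one that reveals the fault.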

Successful Test Case

Failing Test Case

The Testing Problem

• Could be [infinitely] many possible test cases

– Impossible to test exhaustively

• Testing problem: select test cases that will reveal failures, if any exist

Test Equivalence

• Say two test cases A and B are test-equivalent if we believe software will fail on A iff it will fail on B

• Test-equivalence should be an equivalence relation

• Equivalence class (EC) for test cases: set S of test cases such that if A and B are in S, A and B are test-equivalent

• Equivalence partition for test cases: set P of equivalence classes such that every test case is in some class S in P

Equivalence Partition

• Select only one test case from each equivalence class

• Avoids unnecessary tests

Example Software: TRIANGLE

• Program prompts user for input

• User types 3 positive real numbers separated by commas

– e.g. 5.1, 12.0, 13

• Numbers are lengths of sides of a triangle

• Program responds with

– “Equilateral” if there is a valid triangle with those side lengths which is equilateral

– “Isosceles” similarly

– “Scalene” similarly

– “Not A Triangle” if there is no valid triangle with those side lengths

• Program also says

– “Acute” if largest angle is < 90 degrees

– “Right” if largest angle is 90 degrees

– “Obtuse” if largest angle is > 90 degrees
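For reference, the core of TRIANGLE might look like the sketch below. It is not the original program: the function name `classify`, the tolerance `tol`, and the decision to report no angle type for “Not A Triangle” are all assumptions, and input parsing is left out.

```python
import math
from typing import Tuple


def classify(a: float, b: float, c: float, tol: float = 1e-9) -> Tuple[str, ...]:
    """Classify a triangle by its sides and by its largest angle."""
    x, y, z = sorted((a, b, c))              # z is the longest side
    if x <= 0 or x + y <= z:                 # bad length or triangle inequality fails
        return ("Not A Triangle",)

    if math.isclose(x, z, rel_tol=tol):      # shortest == longest: all sides equal
        shape = "Equilateral"
    elif math.isclose(x, y, rel_tol=tol) or math.isclose(y, z, rel_tol=tol):
        shape = "Isosceles"
    else:
        shape = "Scalene"

    # Law of cosines: the largest angle is opposite z, and the sign of
    # x^2 + y^2 - z^2 tells whether it is acute, right, or obtuse.
    if math.isclose(x * x + y * y, z * z, rel_tol=tol):
        angle = "Right"
    elif x * x + y * y > z * z:
        angle = "Acute"
    else:
        angle = "Obtuse"
    return (shape, angle)


examples = {
    (3, 3, 3): classify(3, 3, 3),
    (3, 4, 5): classify(3, 4, 5),
    (2, 2, 3): classify(2, 2, 3),
    (2, 10, 2): classify(2, 10, 2),
}
```

The tolerance matters because the inputs are real numbers: an exact `==` test for a right angle would depend on floating-point rounding.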

Test Equivalence for TRIANGLE

• Probably consider

– (3, 3, 3) test-equivalent to (4, 4, 4)

– (3, 3, 3) test-equivalent to (5, 5, 5)

– (3, 3, 3) not test-equivalent to (3, 4, 5)

• Therefore:

– If (3, 3, 3) is a test case, (4, 4, 4) and (5, 5, 5) should not be

– (3, 3, 3) and (3, 4, 5) can both be test cases
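Choosing one representative per equivalence class can be sketched as follows. The function `ec_key` encodes one possible belief about which inputs are test-equivalent (grouping by how many sides are equal); it is an assumption made for illustration, not something the slides define.

```python
def ec_key(sides):
    """Hypothetical equivalence-class key: how many sides are equal."""
    a, b, c = sorted(sides)
    if a == c:
        return "all sides equal"
    if a == b or b == c:
        return "two sides equal"
    return "no sides equal"


candidates = [(3, 3, 3), (4, 4, 4), (5, 5, 5), (3, 4, 5)]

# Keep only the first candidate seen for each equivalence class.
representatives = {}
for case in candidates:
    representatives.setdefault(ec_key(case), case)

suite = list(representatives.values())
```

With these candidates the suite keeps (3, 3, 3) and (3, 4, 5) and drops (4, 4, 4) and (5, 5, 5) as unnecessary.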

Terminology: Bugs

• Failure: event in which program performs incorrectly

– e.g. printing wrong answer

• Fault: problem in program causing failure

– e.g. < used instead of <=

• Bug: used informally for both of the above

• Defect: used for both failures and faults, and also for problems in specs, design, documentation

• Error: mistake programmer made in introducing fault

– e.g. thinking that < was right

– Errors can be made by spec writers and designers too

• e.g.

– “The failure was that the system core dumped;

– the fault was that an uninitialized pointer was dereferenced;

– the error was that I thought only initialized pointers would be passed to the function”

Terminology: Test Artifacts

• Test case: input data for test run, expected output, reason for test case

[Figure: a test case made up of input data, expected output, and reason]

• Test suite: set of test cases

[Figure: a test suite containing TC 1, TC 2, TC 3, ..., TC N]

• May have several overlapping test suites for different purposes

[Figure: Test Suite A and Test Suite B overlapping over test cases TC 1 to TC 5, ..., TC N]

Dimensions of Testing

• Scope in program

– Unit testing: testing functions, classes, groups of methods, etc.

– System testing: testing whole programs, software systems

– Integration testing: putting units together into systems

• “Box colour”

– “Black-box” (functional) testing: testing based on reqs, design, etc.

– Tests documented requirements

– “White-box” (structural) testing: testing based on the code itself

– Tests undocumented features, algorithms

• Testing goals

– Correctness testing: trying to force bad output

– Usability testing

– Load/stress/volume testing: seeing how software performs under heavy load

Coverage

• Word “coverage” used to talk about thoroughness

• “100% node coverage of finite state machine”: Set of strings that causes every node to be visited

• “100% statement coverage of code”: Set of inputs that causes every statement to be executed

• “90% statement coverage of code”: Set of inputs that causes 90% of statements to be executed

• Use coverage to guess at equivalence classes

Testing Methods

• Black-box (functional)

• White-box (structural)

• Coverage terminology

• Coverage tools

Black-Box Testing

• Also called “functional testing”

• Testing based on requirements

• Major methods:

– Equivalence partitioning on inputs

– Equivalence partitioning on outputs

– Boundary value analysis

– Extreme value analysis

– Syntax testing

Equivalence Partitioning

• On inputs:

– Identify input variables, e.g. side lengths, number of sides

– Identify valid and invalid value sets for each variable, e.g. {l | l > 0} valid; {l | l ≤ 0} invalid; {l | l not a number} invalid

– Build complete test cases from those sets, e.g. (3, 4, 5), (-1, 3, 3), (3, a, 3), (3), (3, 4, 5, 6, 7)

• On outputs:

– Identify output events, output variables, (sets of) variable values, e.g. triangle types, angle types

– Build test cases that generate each one, e.g. (3, 3, 3), (2, 2, 3), (3, 4, 5), (2, 10, 2)
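The input classes listed above can be turned into executable test cases against an input validator. The sketch below assumes a hypothetical `parse_sides` function that returns the three side lengths, or `None` for any invalid input. How the real TRIANGLE program reports invalid input is not stated in the slides.

```python
import math


def parse_sides(text):
    """Return a tuple of 3 positive floats, or None if the input is invalid."""
    parts = [p.strip() for p in text.split(",")]
    if len(parts) != 3:                      # invalid number of sides
        return None
    try:
        sides = tuple(float(p) for p in parts)
    except ValueError:                       # some value is not a number
        return None
    if any(not math.isfinite(s) or s <= 0 for s in sides):
        return None                          # length outside {l | l > 0}
    return sides


# One test case per input equivalence class: (input text, expected result).
input_cases = [
    ("3, 4, 5", (3.0, 4.0, 5.0)),   # all values valid
    ("-1, 3, 3", None),             # l <= 0
    ("3, a, 3", None),              # l not a number
    ("3", None),                    # too few sides
    ("3, 4, 5, 6, 7", None),        # too many sides
]
failures = [text for text, expected in input_cases
            if parse_sides(text) != expected]
```

An empty `failures` list means every input class behaved as expected.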

Boundary Value Analysis

[Figure: axis for n with a boundary between EC 1 and EC 2]

• Find some n such that as n either increases or decreases, we go from one EC to another (e.g. valid to invalid)

• Identify values of n that are

– Close to boundary on each side

– On boundary (if possible)

• Generate test cases for each

• n can be integer or real; input value, output value, number of repetitions, etc.

• For number representing month: 0, 1, 12, 13

• For TRIANGLE example:

– On outputs: (3, 3, 2.99), (3, 3, 3), (3, 3, 3.01), (3, 4, 4.99), (3, 4, 5), (3, 4, 5.01), . . .

– Number of occurrences: (3, 3), (3, 3, 3), (3, 3, 3, 3), . . .
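The month example can be written out as a short sketch. The helper names `is_valid_month` and `boundary_values` are hypothetical, and the helper assumes an integer variable with an inclusive valid range.

```python
def is_valid_month(n: int) -> bool:
    """Software under test: months are numbered 1 to 12."""
    return 1 <= n <= 12


def boundary_values(low: int, high: int):
    """Values on each boundary of [low, high] and just outside it."""
    return [low - 1, low, high, high + 1]


values = boundary_values(1, 12)
results = {n: is_valid_month(n) for n in values}
```

A fault such as writing `<` instead of `<=` in `is_valid_month` would go unnoticed by a mid-range value like 6, but it changes the result for 1 or 12.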

Extreme Value Analysis

• Find an n again

• Choose extremely large/small values for n

• Generate test cases for each

• n can be integer or real

• n can be input value, output value, number of repetitions/occurrences, etc.

• For TRIANGLE:

– On inputs/outputs: (0.000004, 0.000002, 0.000003), (43728947, 43928410, 54386880)

– On number of repetitions: () (empty test case), (3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)

Syntax Testing

[Figure: state machine for a command syntax with tokens get, put, -f, -g, -m, and filename]

• Basic steps:

– Identify target language and make explicit if not already

– Define syntax formally with state machine

– Write test cases to cover:

– All transitions

– All incomplete inputs

– Each change of a valid token to an invalid one

• For above grammar: (put -f foo), (put -g foo), (put -m foo bar), (get foo), (put -f), (get), (put -f -g), . . .
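The state-machine idea can be sketched in code. The original diagram is not reproduced here, so the transition table below is an assumption inferred from the listed test cases: `get` takes one filename, `put -f` and `put -g` take one filename, and `put -m` takes two. State names are made up.

```python
# Assumed transition table: state -> {token class -> next state}.
TRANSITIONS = {
    "start":  {"get": "need1", "put": "option"},
    "option": {"-f": "need1", "-g": "need1", "-m": "need2"},
    "need2":  {"FILENAME": "need1"},
    "need1":  {"FILENAME": "accept"},
    "accept": {},
}
KEYWORDS = {"get", "put", "-f", "-g", "-m"}


def token_class(tok: str) -> str:
    # Simplification: anything that is not a keyword counts as a filename.
    return tok if tok in KEYWORDS else "FILENAME"


def accepts(command: str) -> bool:
    """Return True if the command is a complete, valid sentence."""
    state = "start"
    for tok in command.split():
        state = TRANSITIONS[state].get(token_class(tok))
        if state is None:          # invalid token in this state
            return False
    return state == "accept"       # incomplete input ends in another state


valid = ["put -f foo", "put -g foo", "put -m foo bar", "get foo"]
invalid = ["put -f", "get", "put -f -g"]
verdicts = {cmd: accepts(cmd) for cmd in valid + invalid}
```

The four valid commands together traverse every transition in the assumed table; the invalid ones cover incomplete inputs and a valid token replaced by an invalid one.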

White-Box Testing

    while (p!=null) {
        if (p->n==y) {
            return p;
        }
        ...
        p = p->next;
    }

• Also called “structural testing”

• Basic idea: if 2 test cases execute code differently, cannot be test-equivalent

• What criteria do we use to decide “executing the code differently”?

– “Coverage criteria”

• Hierarchies of coverage criteria

– Statement coverage

– Decision coverage

– Dataflow coverage

• Levels of coverage

Statement Coverage

• A test suite achieves (100%) statement coverage on code if: For every statement in the code, there is at least one test case which causes the program to execute that statement

• Example: consider the insurance code

    premium = 500;
    if ((age<25) && (sex==male) && (!married)) {
        premium += 1500;
    } else {
        if (married || (sex==female)) {
            premium -= 200;
        }
        if ((age>45) && (age<65)) {
            premium -= 100;
        }
    }

(from Software Testing in the Real World, Edward Kit)

Statement Coverage

• In the insurance code:

– 3 “if” statements, 4 assignment statements

• Two test cases can execute all statements:

– One with ((age<25) && (sex==male) && (!married)) true

– One with ((age<25) && (sex==male) && (!married)) false, but (married || (sex==female)) and ((age>45) && (age<65)) both true

• Statement coverage closely related to block coverage

– (block = sequence of statements that must be executed together)
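The two-test-case claim can be checked with a hand-instrumented Python transliteration of the insurance code. The labels (`S1`, `IF1`, ...), the `executed` set, and the concrete ages are assumptions added for illustration. A real coverage tool would do the instrumentation automatically.

```python
# 4 assignment statements (S1-S4) and 3 "if" statements (IF1-IF3).
ALL_STATEMENTS = {"S1", "IF1", "S2", "IF2", "S3", "IF3", "S4"}


def premium(age, sex, married, executed):
    """Insurance example; each statement first records its label."""
    executed.add("S1")
    amount = 500
    executed.add("IF1")
    if age < 25 and sex == "male" and not married:
        executed.add("S2")
        amount += 1500
    else:
        executed.add("IF2")
        if married or sex == "female":
            executed.add("S3")
            amount -= 200
        executed.add("IF3")
        if 45 < age < 65:
            executed.add("S4")
            amount -= 100
    return amount


executed = set()
first = premium(20, "male", False, executed)     # first decision true
second = premium(50, "female", True, executed)   # first false, other two true
missing = ALL_STATEMENTS - executed              # empty at 100% coverage
```

After both runs `missing` is empty, so these two inputs achieve 100% statement coverage of this version of the code.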

Decision Coverage

• Decision: entire conditional expression after an if, while, etc.

• A test suite achieves (100%) decision coverage on code if:

– For every decision in the code, there is at least one test case that evaluates the decision to true

– For every decision in the code, there is at least one test case that evaluates the decision to false

• Also called “branch coverage”

• Three test cases achieve 100% decision coverage on insurance code:

– One with ((age<25) && (sex==male) && (!married)) true

– One with (age<25) false, but (married || (sex==female)) and ((age>45) && (age<65)) both true

– One with (age<25) false, but (married || (sex==female)) and ((age>45) && (age<65)) both false
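A sketch of the same check for decision coverage: each decision is wrapped in a hypothetical `decide` helper that records the outcomes it has taken. The decision labels `D1` to `D3` and the concrete inputs are assumptions chosen to match the three test cases above.

```python
def premium(age, sex, married, outcomes):
    """Insurance example; records the outcome of every decision evaluated."""
    def decide(name, value):
        outcomes.setdefault(name, set()).add(value)
        return value

    amount = 500
    if decide("D1", age < 25 and sex == "male" and not married):
        amount += 1500
    else:
        if decide("D2", married or sex == "female"):
            amount -= 200
        if decide("D3", 45 < age < 65):
            amount -= 100
    return amount


suite = [
    (20, "male", False),    # D1 true
    (50, "female", True),   # D1 false, D2 and D3 true
    (30, "male", False),    # D1 false, D2 and D3 false
]
outcomes = {}
amounts = [premium(age, sex, married, outcomes) for age, sex, married in suite]
covered = all(outcomes.get(d) == {True, False} for d in ("D1", "D2", "D3"))
```

Dropping the third input leaves `D2` and `D3` without a false outcome, even though the first two inputs alone already execute every statement.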

(Short-Circuit) Condition Coverage

• Condition: atomic expression with no logical operators, within decision

– Logical operator: &&, ||, !, etc.

– Every decision contains one or more conditions

– e.g. ((age<25) && (sex==male) && (!married)): three conditions

• A test suite achieves (100%) condition coverage on code if:

– For every condition in the code, there is at least one test case that evaluates the condition to true

– For every condition in the code, there is at least one test case that evaluates the condition to false

• For many languages we need to assume short-circuit evaluation: e.g. when evaluating B && C, if B false, don’t evaluate C

• Use phrase “short-circuit (SC) condition coverage” for this

• Need at least 5 test cases for SC condition coverage on insurance code
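One possible set of five inputs is sketched below. Each atomic condition is wrapped in a hypothetical recorder `c`, and Python's `and`/`or` short-circuit in the same way as `&&`/`||`, so a condition is recorded only when it is actually evaluated. The labels `C1` to `C7` and the concrete inputs are assumptions. The code shows that five test cases suffice, not that fewer cannot.

```python
def premium(age, sex, married, seen):
    """Insurance example; records each condition that is actually evaluated."""
    def c(name, value):
        seen.setdefault(name, set()).add(bool(value))
        return value

    amount = 500
    if c("C1", age < 25) and c("C2", sex == "male") and not c("C3", married):
        amount += 1500
    else:
        if c("C4", married) or c("C5", sex == "female"):
            amount -= 200
        if c("C6", age > 45) and c("C7", age < 65):
            amount -= 100
    return amount


CONDITIONS = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]
suite = [
    (20, "male", False),    # C1 T, C2 T, C3 F
    (20, "male", True),     # C3 T; C4 T (C5 skipped); C6 F (C7 skipped)
    (20, "female", False),  # C2 F (C3 skipped); C4 F, C5 T
    (50, "male", False),    # C1 F; C5 F; C6 T, C7 T
    (70, "male", False),    # C7 F
]
seen = {}
amounts = [premium(age, sex, married, seen) for age, sex, married in suite]
uncovered = [name for name in CONDITIONS if seen.get(name) != {True, False}]
```

The comments show why so many inputs are needed: short-circuiting hides later conditions, so for example `C7` can only be reached when `C6` is true.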

Comparing Coverage Criteria

• Statement, decision, SC condition = “coverage criteria”

• If a test suite achieves decision coverage, it achieves statement coverage

– (There are unimportant, rare exceptions)

– Decision coverage “stronger than” statement coverage

• If a test suite achieves SC condition coverage, it achieves decision coverage

– SC condition coverage “stronger than” decision coverage

• DO-178B, the RTCA standard for airborne software, defines criteria for testing software at different safety levels

– Highest level (Level A): requires modified condition/decision coverage (MC/DC), which includes both condition and decision coverage

Dataflow Coverage

• Definition statement (“def”): statement which assigns a value to a variable; e.g.

– “x = 3;” is a definition of x

– “add(&tree, i);” is a definition of tree

• Use statement: statement which uses the value of a variable

– Predicate use (p-use): statement which uses value in a decision; e.g. “if (x == y) {...}” is a predicate use of x and y

– Calculation use (c-use): all other uses; e.g.

– “x = y+3;” is a calculation use of y

– “add(&tree, i);” is a calculation use of i

• Def-use pair: def of a variable x + a later use (in same method) of x, that can be reached from the def by a path q with no other defs of x in-between

– Parameter decls considered “virtual defs” here

Def-Use Coverage

• A test suite achieves (100%) def-use coverage on code if:

– For every def-use pair, there is at least one test case that executes the path q between the def and the use

• C-use coverage: def-use coverage for c-uses

• P-use coverage: def-use coverage for p-uses

• Statement, decision, condition coverage: controlflow criteria

• Def-use, C-use, P-use coverage: dataflow criteria

• Some studies have shown that test suites achieving dataflow criteria are more effective than those achieving controlflow criteria
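To make defs, uses, and def-use pairs concrete, here is a small annotated function. The function `clamp_sum` is a made-up example, and the comments track only the variable `total`. The pair list is worked out by hand for this sketch, not produced by a tool.

```python
def clamp_sum(values, limit):
    total = 0                  # def 1 of total
    for v in values:
        if v > 0:
            total = total + v  # c-use of total, then def 2 of total
    if total > limit:          # p-use of total
        total = limit          # def 3 of total
    return total               # c-use of total


# Def-use pairs for total and an input that exercises each one:
#   def 1 -> c-use in loop   [1, 2]   (first positive value)
#   def 1 -> p-use           []       (loop adds nothing)
#   def 1 -> c-use at return []       (no clamping)
#   def 2 -> c-use in loop   [1, 2]   (second positive value)
#   def 2 -> p-use           [1, 2]
#   def 2 -> c-use at return [1, 2]   (no clamping)
#   def 3 -> c-use at return [5, 6]   (sum exceeds limit)
suite = [([], 10), ([1, 2], 10), ([5, 6], 10)]
outputs = [clamp_sum(values, limit) for values, limit in suite]
```

Two inputs, `[1, 2]` and `[5, 6]`, already execute every statement and take every decision both ways, but they never exercise the pairs from def 1 to the final p-use and return. The empty list is needed for those.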

Coverage Terminology

• A test suite achieves (90%) statement coverage on code if 90% of the statements are executed

– Similarly 90%, 80%, etc., decision coverage, def-use coverage, etc.

• Some statements cannot be executed (“dead code”)

– Consider these coverage elements infeasible for statement coverage

– Similarly, infeasible decisions, conditions, etc.

• Generally only count feasible coverage elements

– Thus say “100% feasible statement coverage”, etc.

• Minimal test suite for a criterion: one that achieves the criterion in the smallest number of test cases
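The arithmetic behind “feasible” coverage is simple enough to write down. The function name and the example figures below are made up for illustration.

```python
def feasible_coverage(executed: int, total: int, infeasible: int = 0) -> float:
    """Percentage of feasible coverage elements that were executed."""
    feasible = total - infeasible
    if feasible <= 0:
        raise ValueError("no feasible coverage elements")
    return 100.0 * executed / feasible


# 20 statements, 18 executed: 90% statement coverage.
raw = feasible_coverage(18, 20)
# If the 2 unexecuted statements are dead code, that is 100% feasible coverage.
adjusted = feasible_coverage(18, 20, infeasible=2)
```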

Coverage Tools

• Step 1: run program that either

– Compiles your software under test (SUT) in a special way

– Adds statements to your SUT and compiles it

– Instruments your SUT bytecode

• Step 2: run your compiled program (your SUT)

• Step 3: run coverage reporting program

– Can tell you which statements, etc. in your SUT were executed

• Most tools measure only statement coverage

• For C: gcc / gcov

• For Java: Cobertura
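The instrument, run, report cycle can be imitated in a few lines of Python using the standard `sys.settrace` hook. This is a toy stand-in written for illustration, not how gcov or Cobertura work internally; `run_with_line_trace` and `sign` are hypothetical names.

```python
import sys


def run_with_line_trace(func, *args):
    """Run func(*args); return (result, executed line offsets within func)."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None                      # do not trace other functions
        if event == "line":
            # Offset 0 is the "def" line, offset 1 the first body line, etc.
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)                     # step 1: switch on instrumentation
    try:
        result = func(*args)                 # step 2: run the SUT
    finally:
        sys.settrace(previous)
    return result, executed                  # step 3: data for the report


def sign(x):
    if x < 0:
        return -1
    return 1


positive_result, positive_lines = run_with_line_trace(sign, 5)
negative_result, negative_lines = run_with_line_trace(sign, -5)
all_lines = positive_lines | negative_lines
```

Neither run alone reaches every line of `sign`; the union of the two does, which is the kind of fact a coverage report is meant to expose.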
