Software Testing
• Basic concepts
• Test equivalence
• Terminology
• Dimensions of testing
• Testing methods
– Black box testing
– White box testing
Basic Concepts
• Testing involves
– Running actual code
– On selected inputs / in selected environments
– Then observing the behaviour of the program / system
– And deciding whether the behaviour is acceptable
• Main goal of testing: find bugs, break software
• Subsidiary goal: be thorough
Test Cases
• Test case: everything needed for one test run
• Test case must contain:
– Inputs to the software under test
– Expected outputs
• Also useful:
– Reason for test case
– What it contributes to thoroughness
• Successful test case: one in which the software behaves correctly
• Failing test case: one in which the software behaves incorrectly
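The parts of a test case listed above can be written down as a small record. This is only an illustrative sketch: the names `Case`, `run`, and the stand-in function `absolute` are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class Case:
    inputs: Tuple[Any, ...]  # inputs to the software under test
    expected: Any            # expected output
    reason: str = ""         # optional: why this test case exists


def run(case: Case, sut: Callable[..., Any]) -> bool:
    """Return True if the test case is successful (the SUT behaves correctly)."""
    return sut(*case.inputs) == case.expected


def absolute(x):
    # Hypothetical software under test with a deliberate fault:
    # negative inputs are returned unchanged.
    return x


passing = Case(inputs=(3,), expected=3, reason="positive input")
failing = Case(inputs=(-3,), expected=3, reason="negative input")
```

Here `run(passing, absolute)` is a successful test case and `run(failing, absolute)` is a failing one, because the fault only shows on negative inputs.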
Successful Test Case
Failing Test Case
The Testing Problem
• Could be [infinitely] many possible test cases
– Impossible to test exhaustively
• Testing problem: select test cases that will reveal failures, if any exist
Test Equivalence
• Say two test cases A and B are test-equivalent if we believe the software will fail on A iff it will fail on B
• Test-equivalence should be an equivalence relation
• Equivalence class (EC) for test cases: set S of test cases such that if A and B are in S, A and B are test-equivalent
• Equivalence partition for test cases: set P of equivalence classes such that every test case is in some class S in P
Equivalence Partition
• Select only one test case from each equivalence class
• Avoids unnecessary tests
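Selecting one test case per equivalence class can be sketched as below. The partition used here (integers grouped by sign) and the names `pick_representatives` and `sign_class` are hypothetical. In practice the classes come from a belief about where the software would fail, not from a formula.

```python
def pick_representatives(candidates, class_of):
    """Keep the first candidate seen in each equivalence class."""
    chosen = {}
    for case in candidates:
        chosen.setdefault(class_of(case), case)
    return list(chosen.values())


def sign_class(n):
    # Assumed partition: we believe all negative inputs are test-equivalent,
    # and likewise all positive inputs; zero is a class of its own.
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


candidates = [-7, -2, 0, 5, 9]
selected = pick_representatives(candidates, sign_class)
```

With these candidates, `selected` holds three test cases instead of five: one per class.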
Example Software: TRIANGLE
• Program prompts user for input
• User types 3 positive real numbers separated by commas
– e.g. 5.1, 12.0, 13
• Numbers are lengths of sides of a triangle
• Program responds with
– “Equilateral” if there is a valid triangle with those side lengths which is equilateral
– “Isosceles” similarly
– “Scalene” similarly
– “Not A Triangle” if there is no valid triangle with those side lengths
• Program also says
– “Acute” if largest angle is < 90 degrees
– “Right” if largest angle is 90 degrees
– “Obtuse” if largest angle is > 90 degrees
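One possible implementation of TRIANGLE is sketched below so that later test cases have something concrete to run against. It is not the original program. The function name `classify`, the tuple return value, the treatment of malformed input as “Not A Triangle”, and the use of a floating-point tolerance for the right-angle check are all assumptions.

```python
import math


def classify(line):
    """Return (shape, angle) for a comma-separated triple of side lengths."""
    try:
        sides = sorted(float(part) for part in line.split(","))
    except ValueError:
        # Assumption: malformed input is reported as "Not A Triangle".
        return ("Not A Triangle", None)
    if (len(sides) != 3
            or not all(math.isfinite(s) for s in sides)
            or sides[0] <= 0
            or sides[0] + sides[1] <= sides[2]):
        return ("Not A Triangle", None)
    a, b, c = sides  # c is the longest side, opposite the largest angle
    shape = {1: "Equilateral", 2: "Isosceles", 3: "Scalene"}[len(set(sides))]
    legs, longest = a * a + b * b, c * c
    if math.isclose(legs, longest):  # tolerance-based comparison (assumption)
        angle = "Right"
    elif legs > longest:
        angle = "Acute"
    else:
        angle = "Obtuse"
    return (shape, angle)
```

For example, `classify("3, 4, 5")` reports a scalene right triangle, and `classify("2, 10, 2")` reports that no valid triangle exists.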
Test Equivalence for TRIANGLE
• Probably consider
– (3, 3, 3) test-equivalent to (4, 4, 4)
– (3, 3, 3) test-equivalent to (5, 5, 5)
– (3, 3, 3) not test-equivalent to (3, 4, 5)
• Therefore:
– If (3, 3, 3) is a test case, (4, 4, 4) and (5, 5, 5) should not be
– (3, 3, 3) and (3, 4, 5) can both be test cases
Terminology: Bugs
• Failure: event in which program performs incorrectly
– e.g. printing wrong answer
• Fault: problem in program causing failure
– e.g. < used instead of <=
• Bug: used informally for both of the above
• Defect: used for both failures and faults, and also for problems in specs, design, documentation
• Error: mistake programmer made in introducing fault
– e.g. thinking that < was right
– Errors can be made by spec writers and designers too
• e.g.
– “The failure was that the system core dumped;
– the fault was that an uninitialized pointer was dereferenced;
– the error was that I thought only initialized pointers would be passed to the function”
Terminology: Test Artifacts
• Test case: input data for test run, expected output, reason for test case
• Test suite: set of test cases
• May have several overlapping test suites for different purposes
Dimensions of Testing
• Scope in program
– Unit testing: testing functions, classes, groups of methods, etc.
– System testing: testing whole programs, software systems
– Integration testing: putting units together into systems
• “Box colour”
– “Black-box” (functional) testing: testing based on requirements, design, etc.
– Tests documented requirements
– “White-box” (structural) testing: testing based on the code itself
– Tests undocumented features, algorithms
• Testing goals
– Correctness testing: trying to force bad output
– Usability testing
– Load/stress/volume testing: seeing how software performs under heavy load
Coverage
• Word “coverage” used to talk about thoroughness
• “100% node coverage of finite state machine”: set of strings that causes every node to be visited
• “100% statement coverage of code”: set of inputs that causes every statement to be executed
• “90% statement coverage of code”: set of inputs that causes 90% of statements to be executed
• Use coverage to guess at equivalence classes
Testing Methods
• Black-box (functional)
• White-box (structural)
• Coverage terminology
• Coverage tools
Black-Box Testing
• Also called “functional testing”
• Testing based on requirements
• Major methods:
– Equivalence partitioning on inputs
– Equivalence partitioning on outputs
– Boundary value analysis
– Extreme value analysis
– Syntax testing
Equivalence Partitioning
• On inputs:
– Identify input variables, e.g. side lengths, number of sides
– Identify valid and invalid value sets for each variable, e.g. {l | l > 0} valid; {l | l ≤ 0} invalid; {l | l not a number} invalid
– Build complete test cases from those sets, e.g. (3, 4, 5), (-1, 3, 3), (3, a, 3), (3), (3, 4, 5, 6, 7)
• On outputs:
– Identify output events, output variables, (sets of) variable values, e.g. triangle types, angle types
– Build test cases that generate each one, e.g. (3, 3, 3), (2, 2, 3), (3, 4, 5), (2, 10, 2)
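The input-side steps above can be sketched as a small generator of test inputs. The strategy of substituting one invalid representative at a time into an otherwise valid triple is an assumption made for this example, as are the names `INVALID_REPRESENTATIVES` and `build_cases`. The slides only require that each value set is exercised.

```python
# One representative per invalid value set for a side length (assumed choices).
INVALID_REPRESENTATIVES = {
    "non-positive": -1,
    "not a number": "a",
}


def build_cases(base=(3, 4, 5)):
    """Build labelled test inputs from a valid base triple."""
    cases = [("all valid", base)]
    for label, value in INVALID_REPRESENTATIVES.items():
        for position in range(len(base)):
            sides = list(base)
            sides[position] = value  # exactly one invalid value per case
            cases.append((f"{label} at position {position}", tuple(sides)))
    # Invalid values for the "number of sides" variable.
    cases.append(("too few values", base[:1]))
    cases.append(("too many values", base + (6, 7)))
    return cases


cases = build_cases()
```

Keeping a single invalid value per case makes a failure easier to attribute to one value set.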
Boundary Value Analysis
[Figure: number line for n, with a boundary separating EC 1 from EC 2]
• Find some n such that as n either increases or decreases, we go from one EC to another (e.g. valid to invalid)
• Identify values of n that are
– Close to boundary on each side
– On boundary (if possible)
• Generate test cases for each
• n can be integer or real; input value, output value, number of repetitions, etc.
• For number representing month: 0, 1, 12, 13
• For TRIANGLE example:
– On outputs: (3, 3, 2.99), (3, 3, 3), (3, 3, 3.01), (3, 4, 4.99), (3, 4, 5), (3, 4, 5.01), . . .
– Number of occurrences: (3, 3), (3, 3, 3), (3, 3, 3, 3), . . .
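The month example can be reproduced with a tiny helper. The function names `boundary_values` and `around` are hypothetical. The first helper assumes a closed valid range, and the second assumes a single boundary point.

```python
def boundary_values(low, high, step=1):
    """Values on and just outside each edge of the valid range [low, high]."""
    return [low - step, low, high, high + step]


def around(boundary, step):
    """Values just below, on, and just above a single boundary."""
    return [boundary - step, boundary, boundary + step]


months = boundary_values(1, 12)  # valid months are 1..12
third_side = around(5, 1)        # third side of a (3, 4, n) triangle
```

For real-valued n, such as the TRIANGLE side lengths, a smaller step like 0.01 plays the same role, subject to floating-point rounding.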
Extreme Value Analysis
• Find an n again
• Choose extremely large/small values for n
• Generate test cases for each
• n can be integer or real
• n can be input value, output value, number of repetitions/occurrences, etc.
• For TRIANGLE:
– On inputs/outputs: (0.000004, 0.000002, 0.000003), (43728947, 43928410, 54386880)
– On number of repetitions: () (empty test case), (3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)
Syntax Testing
[Figure: state machine for the command syntax, with transitions labelled get, put, -f, -g, -m, and filename]
• Basic steps:
– Identify target language and make explicit if not already
– Define syntax formally with state machine
– Write test cases to cover:
– All transitions
– All incomplete inputs
– Each change of a valid token to an invalid one
• For above grammar: (put -f foo), (put -g foo), (put -m foo bar), (get foo), (put -f), (get), (put -f -g), . . .
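The sketch below shows what a state-machine definition of the syntax might look like. The grammar is an assumption inferred from the listed test cases (`get filename`, `put -f filename`, `put -g filename`, `put -m filename filename`), because the original diagram did not survive. The state names and the rule for recognising a filename are also hypothetical.

```python
# (state, symbol) -> next state; "done" is the only accepting state.
TRANSITIONS = {
    ("start", "get"): "need_one",
    ("start", "put"): "need_flag",
    ("need_flag", "-f"): "need_one",
    ("need_flag", "-g"): "need_one",
    ("need_flag", "-m"): "need_two",
    ("need_two", "FILENAME"): "need_one",
    ("need_one", "FILENAME"): "done",
}


def symbol_for(state, token):
    """Map a raw token to the symbol used in TRANSITIONS."""
    if (state, token) in TRANSITIONS:
        return token       # a keyword or flag expected in this state
    if not token.startswith("-"):
        return "FILENAME"  # assumption: any other token without a leading "-"
    return token           # unexpected flag; no transition will match


def accepts(command):
    """Return True if the command is a complete, valid sentence."""
    state = "start"
    for token in command.split():
        state = TRANSITIONS.get((state, symbol_for(state, token)))
        if state is None:
            return False
    return state == "done"
```

The valid test cases cover every transition in the table. `put -f` and `get` are incomplete inputs, and `put -f -g` replaces a valid token with an invalid one.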
White-Box Testing
    while (p!=null) {
        if (p->n==y) {
            return p;
        }
        ...
        p = p->next;
    }
• Also called “structural testing”
• Basic idea: if 2 test cases execute code differently, they cannot be test-equivalent
• What criteria do we use to decide “executing the code differently”?
– “Coverage criteria”
• Hierarchies of coverage criteria
– Statement coverage
– Decision coverage
– Dataflow coverage
• Levels of coverage
Statement Coverage
• A test suite achieves (100%) statement coverage on code if: for every statement in the code, there is at least one test case which causes the program to execute that statement
• Example: consider the insurance code
    premium = 500;
    if ((age<25) && (sex==male) && (!married)) {
        premium += 1500;
    } else {
        if (married || (sex==female)) {
            premium -= 200;
        }
        if ((age>45) && (age<65)) {
            premium -= 100;
        }
    }
(from Software Testing in the Real World, Edward Kit)
Statement Coverage
• In the insurance code:
– 3 “if” statements, 4 assignment statements
• Two test cases can execute all statements:
– One with ((age<25) && (sex==male) && (!married)) true
– One with ((age<25) && (sex==male) && (!married)) false, but (married || (sex==female)) and ((age>45) && (age<65)) both true
• Statement coverage closely related to block coverage
– (block = sequence of statements that must be executed together)
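The two-test-case claim can be checked with a hand-instrumented Python port of the insurance code. The port, the statement labels (`S1`, `IF1`, ...), and the concrete input values are assumptions chosen for illustration. A real coverage tool would insert this bookkeeping automatically.

```python
# The 3 "if" statements and 4 assignment statements of the insurance code.
ALL_STATEMENTS = {"S1", "IF1", "S2", "IF2", "S3", "IF3", "S4"}


def premium_for(age, sex, married, executed):
    """Compute the premium and record the label of each executed statement."""
    executed.add("S1")
    premium = 500
    executed.add("IF1")
    if age < 25 and sex == "male" and not married:
        executed.add("S2")
        premium += 1500
    else:
        executed.add("IF2")
        if married or sex == "female":
            executed.add("S3")
            premium -= 200
        executed.add("IF3")
        if 45 < age < 65:
            executed.add("S4")
            premium -= 100
    return premium


executed = set()
first = premium_for(20, "male", False, executed)   # first decision true
second = premium_for(50, "female", True, executed)  # first false, others true
statement_coverage = len(executed) / len(ALL_STATEMENTS)
```

After these two runs every label has been recorded, so the suite reaches 100% statement coverage of the port.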
Decision Coverage
• Decision: entire conditional expression after an if, while, etc.
• A test suite achieves (100%) decision coverage on code if:
– For every decision in the code, there is at least one test case that evaluates the decision to true
– For every decision in the code, there is at least one test case that evaluates the decision to false
• Also called “branch coverage”
• Three test cases achieve 100% decision coverage on insurance code:
– One with ((age<25) && (sex==male) && (!married)) true
– One with (age<25) false, but (married || (sex==female)) and ((age>45) && (age<65)) both true
– One with (age<25) false, but (married || (sex==female)) and ((age>45) && (age<65)) both false
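A sketch of measuring decision coverage on a Python port of the insurance code follows. The decision labels `D1` to `D3`, the helper `decision_coverage`, and the concrete inputs are hypothetical. The point is that the two test cases that suffice for statement coverage leave two decision outcomes unexercised.

```python
# Every decision must be seen evaluating to both True and False.
REQUIRED = {(d, v) for d in ("D1", "D2", "D3") for v in (True, False)}


def premium_for(age, sex, married, outcomes):
    """Compute the premium and record each (decision, outcome) pair."""
    premium = 500
    d1 = bool(age < 25 and sex == "male" and not married)
    outcomes.add(("D1", d1))
    if d1:
        premium += 1500
    else:
        d2 = bool(married or sex == "female")
        outcomes.add(("D2", d2))
        if d2:
            premium -= 200
        d3 = bool(45 < age < 65)
        outcomes.add(("D3", d3))
        if d3:
            premium -= 100
    return premium


def decision_coverage(suite):
    """Fraction of required decision outcomes exercised by the suite."""
    outcomes = set()
    for age, sex, married in suite:
        premium_for(age, sex, married, outcomes)
    return len(outcomes & REQUIRED) / len(REQUIRED)


two_cases = [(20, "male", False), (50, "female", True)]
three_cases = two_cases + [(30, "male", False)]
```

`two_cases` exercises 4 of the 6 outcomes. Adding the third case, which makes the second and third decisions false, completes the set.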
(Short-Circuit) Condition Coverage
• Condition: atomic expression with no logical operators, within decision
– Logical operator:&&, ||, !, etc.
– Every decision contains one or more conditions
– e.g. ((age<25) && (sex==male) && (!married)): three conditions
• A test suite achieves (100%) condition coverage on code if:
– For every condition in the code, there is at least one test case that evaluates the condition to true
– For every condition in the code, there is at least one test case that evaluates the condition to false
• For many languages we need to assume short-circuit evaluation: e.g. when evaluating B && C, if B false, don’t evaluate C
• Use phrase “short-circuit (SC) condition coverage” for this
• Need at least 5 test cases for SC condition coverage on insurance code
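The sketch below records which conditions are actually evaluated under short-circuit rules, using Python's `and` / `or`, which also short-circuit. The helper `cond`, the condition labels, and the five concrete inputs are assumptions. The suite shown is one possible set of five test cases, not necessarily the one the author had in mind.

```python
def cond(name, value, seen):
    """Record that a condition was evaluated, then return its value."""
    seen.add((name, bool(value)))
    return value


def premium_for(age, sex, married, seen):
    premium = 500
    # cond(...) is only called when short-circuit evaluation reaches it.
    if (cond("age<25", age < 25, seen)
            and cond("sex==male", sex == "male", seen)
            and not cond("married (1st if)", married, seen)):
        premium += 1500
    else:
        if (cond("married (2nd if)", married, seen)
                or cond("sex==female", sex == "female", seen)):
            premium -= 200
        if (cond("age>45", age > 45, seen)
                and cond("age<65", age < 65, seen)):
            premium -= 100
    return premium


def condition_outcomes(suite):
    """Set of (condition, outcome) pairs exercised by the suite."""
    seen = set()
    for age, sex, married in suite:
        premium_for(age, sex, married, seen)
    return seen


FIVE_CASES = [
    (20, "male", False),    # all three conditions of the first decision
    (20, "female", False),  # sex==male false; sex==female true
    (20, "male", True),     # married true in both decisions
    (50, "male", False),    # age<25 false; age>45 and age<65 true
    (70, "male", False),    # age<65 false
]
```

There are 7 conditions and therefore 14 outcomes to exercise. The five cases reach all 14, while dropping the last one leaves `age<65` never evaluated to false.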
Comparing Coverage Criteria
• Statement, decision, SC condition = “coverage criteria”
• If a test suite achieves decision coverage, it achieves statement coverage
– (There are unimportant, rare exceptions)
– Decision coverage “stronger than” statement coverage
• If a test suite achieves SC condition coverage, it achieves decision coverage
– SC condition coverage “stronger than” decision coverage
• RTCA standard DO-178B (airborne software) defines criteria for testing software at different safety levels
– Highest level (Level A): requires modified condition/decision coverage (MC/DC), a condition-level criterion stronger than decision coverage
Dataflow Coverage
• Definition statement (“def”): statement which assigns a value to a variable; e.g.
– “x = 3;” is a definition of x
– “add(&tree, i);” is a definition of tree
• Use statement: statement which uses the value of a variable
– Predicate use (p-use): statement which uses value in a decision; e.g. “if (x == y) {...}” is a predicate use of x and y
– Calculation use (c-use): all other uses; e.g.
– “x = y+3;” is a calculation use of y
– “add(&tree, i);” is a calculation use of i
• Def-use pair: def of a variable x + a later use (in same method) of x, that can be reached from the def by a path q with no other defs of x in-between
– Parameter decls considered “virtual defs” here
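For straight-line code, def-use pairs can be found by remembering the most recent def of each variable. The sketch below makes that simplifying assumption. With branches or loops, several defs can reach one use, and a real analysis would work over the control-flow graph. The statement encoding and the name `def_use_pairs` are hypothetical.

```python
def def_use_pairs(statements):
    """Return (variable, def label, use label) triples for straight-line code.

    statements: list of (label, defined_vars, used_vars) in execution order.
    """
    last_def = {}
    pairs = []
    for label, defs, uses in statements:
        # Uses are handled first: "x = x + y" uses the previous def of x.
        for var in sorted(uses):
            if var in last_def:
                pairs.append((var, last_def[var], label))
        for var in defs:
            last_def[var] = label  # this def hides all earlier defs of var
    return pairs


program = [
    ("s1", {"x"}, set()),       # x = 3;
    ("s2", {"y"}, {"x"}),       # y = x + 1;
    ("s3", {"x"}, {"x", "y"}),  # x = x + y;
    ("s4", set(), {"x"}),       # print(x);
]
pairs = def_use_pairs(program)
```

Note that the def of `x` at `s1` does not pair with the use at `s4`, because the def at `s3` lies in between.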
Def-Use Coverage
• A test suite achieves (100%) def-use coverage on code if:
– For every def-use pair, there is at least one test case that executes the path q between the def and the use
• C-use coverage: def-use coverage for c-uses
• P-use coverage: def-use coverage for p-uses
• Statement, decision, condition coverage: controlflow criteria
• Def-use, C-use, P-use coverage: dataflow criteria
• Some studies have shown that test suites achieving dataflow criteria are more effective than those achieving controlflow criteria
Coverage Terminology
• A test suite achieves (90%) statement coverage on code if 90% of the statements are executed
– Similarly 90%, 80%, etc., decision coverage, def-use coverage, etc.
• Some statements cannot be executed (“dead code”)
– Consider these coverage elements infeasible for statement coverage
– Similarly, infeasible decisions, conditions, etc.
• Generally only count feasible coverage elements
– Thus say “100% feasible statement coverage”, etc.
• Minimal test suite for a criterion: one that achieves the criterion in the smallest number of test cases
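The difference between raw and feasible coverage is simple arithmetic, sketched below. The function name `feasible_coverage` and the ten-statement example are hypothetical.

```python
def feasible_coverage(executed, all_elements, infeasible=frozenset()):
    """Fraction of feasible coverage elements that were executed."""
    feasible = set(all_elements) - set(infeasible)
    if not feasible:
        raise ValueError("no feasible coverage elements")
    return len(set(executed) & feasible) / len(feasible)


statements = range(1, 11)  # statements 1..10
executed = range(1, 10)    # statement 10 is never executed
raw = feasible_coverage(executed, statements)
adjusted = feasible_coverage(executed, statements, infeasible={10})
```

If statement 10 is dead code, the same test suite moves from 90% statement coverage to 100% feasible statement coverage.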
Coverage Tools
• Step 1: run program that either
– Compiles your software under test (SUT) in a special way
– Adds statements to your SUT and compiles it
– Instruments your SUT bytecode
• Step 2: run your compiled program (your SUT)
• Step 3: run coverage reporting program
– Can tell you which statements, etc. in your SUT were executed
• Most tools measure only statement coverage
• For C: gcc / gcov
• For Java: Cobertura
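To give a feel for what such a tool does internally, the sketch below records executed lines of one Python function using the standard `sys.settrace` hook. It is a toy under stated assumptions, not how gcov or Cobertura work: it traces a single function, reports line offsets relative to the `def` line, and the names `executed_lines` and `sign` are hypothetical.

```python
import sys


def executed_lines(func, *args):
    """Run func(*args); return executed line offsets relative to its def line."""
    code = func.__code__
    lines = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)      # "step 1": switch on instrumentation
    try:
        func(*args)           # "step 2": run the software under test
    finally:
        sys.settrace(previous)
    return lines              # "step 3": report what was executed


def sign(x):
    if x < 0:
        return "negative"
    return "non-negative"
```

A single call such as `executed_lines(sign, -1)` misses the last line of `sign`. The union over both a negative and a non-negative input covers all three body lines.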