
Intro to Software Testing

An Introduction to Software Testing

This paper provides an introduction to software testing. It serves as a tutorial for developers who are new to formal testing of software, and as a reminder of some finer points for experienced software testers.

Topics covered include basic definitions of testing, validation and verification; the levels of testing from unit testing through to acceptance testing; the relationship with requirements and design specifications; and test documentation.

1. What is Software Testing?

There are many published definitions of software testing; however, all of these definitions boil down to essentially the same thing: software testing is the process of executing software in a controlled manner, in order to answer the question "Does the software behave as specified?".

Software testing is often used in association with the terms verification and validation. Verification is the checking or testing of items, including software, for conformance and consistency with an associated specification. Software testing is just one kind of verification, which also uses techniques such as reviews, analysis, inspections and walkthroughs. Validation is the process of checking that what has been specified is what the user actually wanted.

Validation: Are we doing the right job? Verification: Are we doing the job right?

The term bug is often used to refer to a problem or fault in a computer. There are software bugs and hardware bugs. The term originated in the United States, in the days of pioneering computers built out of relays and valves, when a previously inexplicable fault was eventually traced to a moth trapped inside the computer.

Software testing should not be confused with debugging. Debugging is the process of analyzing and locating bugs when software does not behave as expected. Although the identification of some bugs will be obvious from playing with the software, a methodical approach to software testing is a much more thorough means of identifying bugs. Debugging is therefore an activity which supports testing, but cannot replace testing. However, no amount of testing can be guaranteed to discover all bugs.

Other activities which are often associated with software testing are static analysis and dynamic analysis. Static analysis investigates the source code of software, looking for problems and gathering metrics without actually executing the code. Dynamic analysis looks at the behavior of software while it is executing, to provide information such as execution traces, timing profiles, and test coverage information.
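As a minimal illustration of dynamic analysis, the sketch below uses the Python interpreter's trace hook to record which lines of a unit actually execute during a test. Python is used purely for brevity, and the unit `divide` is hypothetical; tools such as AdaTEST and Cantata gather this kind of coverage information for Ada, C and C++ automatically.

```python
import sys

def divide(a, b):
    # Hypothetical unit under test.
    if b == 0:
        return None
    return a / b

executed_lines = set()

def tracer(frame, event, arg):
    # Record each source line executed inside divide().
    if event == "line" and frame.f_code.co_name == "divide":
        executed_lines.add(frame.f_lineno)
    return tracer

sys.settrace(tracer)
divide(10, 2)            # this test exercises only the b != 0 path
sys.settrace(None)

# Only two lines ran (the `if` test and the final return); the
# b == 0 branch never executed, so this test has a coverage gap.
print(len(executed_lines))
```

The point of the sketch is that dynamic analysis observes behaviour during execution: the coverage gap is invisible from the test's pass/fail outcome alone.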


2. Software Specifications and Testing

The key component of the above definitions is the word specified. Validation and verification activities, such as software testing, cannot be meaningful unless there is a specification for the software. Software could be a single module or unit of code, or an entire system. Depending on the size of the development and the development methods, specification of software can range from a single document to a complex hierarchy of documents.

A hierarchy of software specifications will typically contain three or more levels of software specification documents.

The Requirements Specification, which specifies what the software is required to do and may also specify constraints on how this may be achieved.

The Architectural Design Specification, which describes the architecture of a design which implements the requirements. Components within the software and the relationship between them will be described in this document.

Detailed Design Specifications, which describe how each component in the software, down to individual units, is to be implemented.

With such a hierarchy of specifications, it is possible to test software at various stages of the development, for conformance with each specification. The levels of testing which correspond to the hierarchy of software specifications listed above are:

Unit Testing, in which each unit (basic component) of the software is tested to verify that the detailed design for the unit has been correctly implemented.

Software Integration Testing, in which progressively larger groups of tested software components corresponding to elements of the architectural design are integrated and tested until the software works as a whole.

System Testing, in which the software is integrated into the overall product and tested to show that all requirements are met.

A further level of testing is also concerned with requirements:

Acceptance Testing, upon which acceptance of the completed software is based. This will often use a subset of the system tests, witnessed by the customers for the software or system.
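A test at the lowest of these levels, unit testing, can be sketched as follows. Python is used here purely for brevity, and `clamp` and its design statement are hypothetical; the shape is the same in any language: one test case per design statement, each with an explicit expected result as its pass/fail criterion.

```python
def clamp(value, lo, hi):
    # Hypothetical unit whose detailed design states: "the result is
    # value limited to the range [lo, hi]".
    if value < lo:
        return lo
    if value > hi:
        return hi
    return value

def test_clamp():
    assert clamp(5, 0, 10) == 5     # within range: returned unchanged
    assert clamp(-3, 0, 10) == 0    # below range: limited to lo
    assert clamp(42, 0, 10) == 10   # above range: limited to hi

test_clamp()
print("unit tests passed")
```

Note that each assertion traces back to a statement in the detailed design, not to observed behaviour of the implementation.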

Once each level of software specification has been written, the next step is to design the tests. An important point here is that the tests should be designed before the software is implemented, because if the software was implemented first it would be too tempting to test the software against what it is observed to do (which is not really testing at all), rather than against what it is specified to do.


Within each level of testing, once the tests have been applied, test results are evaluated. If a problem is encountered, then either the tests are revised and applied again, or the software is fixed and the tests applied again. This is repeated until no problems are encountered, at which point development can proceed to the next level of testing.

Testing does not end following the conclusion of acceptance testing. Software has to be maintained to fix problems which show up during use and to accommodate new requirements. Software tests have to be repeated, modified and extended. The effort to revise and repeat tests consequently forms a major part of the overall cost of developing and maintaining software. The term regression testing is used to refer to the repetition of earlier successful tests in order to make sure that changes to the software have not introduced side effects.
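The mechanics of regression testing can be sketched as below. Python is used purely for brevity, and the unit `add` and the recorded cases are hypothetical; the essential idea is that earlier successful tests are kept as records and repeated after every change, so that any mismatch flags a possible side effect.

```python
def add(a, b):
    # Hypothetical unit being maintained.
    return a + b

# Earlier successful tests, kept as (arguments, expected) records
# so they can be repeated after every change to the software.
regression_cases = [
    ((2, 3), 5),
    ((0, 0), 0),
    ((-1, 1), 0),
]

def run_regression(fn, cases):
    # Re-run every recorded test; any mismatch is a candidate side effect.
    return [(args, expected, fn(*args))
            for args, expected in cases
            if fn(*args) != expected]

failures = run_regression(add, regression_cases)
print(len(failures))    # 0: no side effects detected by this suite
```

Because the suite is mechanical and repeatable, it can be run cheaply after every modification, which is exactly what keeps the cost of maintenance testing under control.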

3. Test Design Documentation

The design of tests is subject to the same basic engineering principles as the design of software. Good design consists of a number of stages which progressively elaborate the design of tests from an initial high level strategy to detailed test procedures. These stages are: test strategy, test planning, test case design, and test procedure design.

The design of tests has to be driven by the specification of the software. At the highest level this means that tests will be designed to verify that the software faithfully implements the requirements of the Requirements Specification. At lower levels tests will be designed to verify that items of software implement all design decisions made in the Architectural Design Specification and Detailed Design Specifications. As with any design process, each stage of the test design process should be subject to informal and formal review.

The ease with which tests can be designed is highly dependent on the design of the software. It is important to consider testability as a key (but usually undocumented) requirement for any software development.

3.1 Test Strategy

The first stage is the formulation of a test strategy. A test strategy is a statement of the overall approach to testing, identifying what levels of testing are to be applied and the methods, techniques and tools to be used. A test strategy should ideally be organisation-wide, being applicable to all of an organisation's software developments.

Developing a test strategy which efficiently meets the needs of an organisation is critical to the success of software development within the organisation. The application of a test strategy to a software development project should be detailed in the project's software quality plan.

3.2 Test Plans

The next stage of test design, which is the first stage within a software development project, is the development of a test plan. A test plan states what the items to be tested are, at what level they will be tested, what sequence they are to be tested in, how the test strategy will be applied to the testing of each item, and describes the test environment.

A test plan may be project wide, or may in fact be a hierarchy of plans relating to the various levels of specification and testing:

An Acceptance Test Plan, describing the plan for acceptance testing of the software. This would usually be published as a separate document, but might be published with the system test plan as a single document.

A System Test Plan, describing the plan for system integration and testing. This would also usually be published as a separate document, but might be published with the acceptance test plan.

A Software Integration Test Plan, describing the plan for integration of tested software components. This may form part of the Architectural Design Specification.

Unit Test Plan(s), describing the plans for testing of individual units of software. These may form part of the Detailed Design Specifications.

The objective of each test plan is to provide a plan for verification, by testing the software, that the software produced fulfils the requirements or design statements of the appropriate software specification. In the case of acceptance testing and system testing, this means the Requirements Specification.

3.3 Test Case Design

Once the test plan for a level of testing has been written, the next stage of test design is to specify a set of test cases or test paths for each item to be tested at that level. Each test case will specify how the implementation of a particular requirement or design decision is to be tested, and the criteria for success of the test.

The test cases may be documented with the test plan, as a section of a software specification, or in a separate document called a test specification or test description.

An Acceptance Test Specification, specifying the test cases for acceptance testing of the software. This would usually be published as a separate document, but might be published with the acceptance test plan.

A System Test Specification, specifying the test cases for system integration and testing. This would also usually be published as a separate document, but might be published with the system test plan.

Software Integration Test Specifications, specifying the test cases for each stage of integration of tested software components. These may form sections of the Architectural Design Specification.


Unit Test Specifications, specifying the test cases for testing of individual units of software. These may form sections of the Detailed Design Specifications.

System testing and acceptance testing involve an enormous number of individual test cases. In order to keep track of which requirements are tested by which test cases, an index which cross-references requirements and test cases is often constructed. This is usually referred to as a Verification Cross Reference Index (VCRI) and is attached to the test specification. Cross-reference indexes may also be used with unit testing and software integration testing.
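In its simplest form a VCRI is just a mapping from requirement identifiers to test case identifiers. The sketch below (in Python for brevity; all identifiers are illustrative, not from any real project) shows how such an index immediately exposes requirements with no test coverage:

```python
# Hypothetical VCRI: requirement identifiers mapped to the test
# cases that verify them (all identifiers are illustrative).
vcri = {
    "REQ-001": ["SYS-TC-010", "SYS-TC-011"],
    "REQ-002": ["SYS-TC-020"],
    "REQ-003": [],
}

# The index makes requirements with no test coverage easy to find.
untested = [req for req, cases in vcri.items() if not cases]
print(untested)    # -> ['REQ-003']
```

In practice the index is maintained alongside the test specification, so that every change to the requirements or the test cases can be checked against it.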

It is important to design test cases for both positive testing and negative testing. Positive testing checks that the software does what it should. Negative testing checks that the software doesn't do what it shouldn't.
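The distinction can be made concrete with a small sketch (Python for brevity; the unit `parse_percentage` and its acceptance rule are hypothetical). The positive cases check that valid input is handled correctly; the negative cases check that invalid input is rejected rather than silently accepted:

```python
def parse_percentage(text):
    # Hypothetical unit: accepts the strings "0" to "100" inclusive.
    value = int(text)               # ValueError for non-numeric input
    if not 0 <= value <= 100:
        raise ValueError("out of range")
    return value

# Positive test cases: the software does what it should.
assert parse_percentage("0") == 0
assert parse_percentage("100") == 100

# Negative test cases: it doesn't do what it shouldn't -- invalid
# input must be rejected rather than silently accepted.
for bad in ["-1", "101", "abc"]:
    try:
        parse_percentage(bad)
        raise AssertionError(f"{bad!r} was wrongly accepted")
    except ValueError:
        pass

print("positive and negative tests passed")
```

Note that the negative cases fail the test if no error is raised: a negative test must assert the rejection, not merely attempt the bad input.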

The process of designing test cases, including executing them as thought experiments, will often identify bugs before the software has even been built. It is not uncommon to find more bugs when designing tests than when executing tests.

3.4 Test Procedures

The final stage of test design is to implement a set of test cases as a test procedure, specifying the exact process to be followed to conduct each of the test cases. This is a fairly straightforward process, which can be likened to designing units of code from higher level functional descriptions.

For each item to be tested, at each level of testing, a test procedure will specify the process to be followed in conducting the appropriate test cases. A test procedure cannot leave out steps or make assumptions. The level of detail must be such that the test procedure is deterministic and repeatable.

Test procedures should always be separate items, because they contain a great deal of detail which is irrelevant to software specifications. If AdaTEST or Cantata is used, test procedures may be coded directly as AdaTEST or Cantata test scripts.

4. Test Results Documentation

When tests are executed, the outputs of each test execution should be recorded in a test results file. These results are then assessed against criteria in the test specification to determine the overall outcome of a test. If AdaTEST or Cantata is used, this file will be created and the results assessed automatically according to criteria specified in the test script.
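The record-then-assess split can be sketched as follows (Python for brevity; the harness, the test cases and the file format are all hypothetical, standing in for what AdaTEST or Cantata would do automatically). Actual outputs are written to a results file first, and the overall outcome is determined separately by comparing them with the expected values from the test specification:

```python
import json
import os
import tempfile

def run_and_record(cases, path):
    # Execute each test case and record its actual output alongside
    # the expected value taken from the test specification.
    results = [{"test": name, "actual": fn(), "expected": expected}
               for name, fn, expected in cases]
    with open(path, "w") as f:
        json.dump(results, f)

def assess(path):
    # Assess the recorded results against the specification's
    # criteria to determine the overall outcome of the test.
    with open(path) as f:
        results = json.load(f)
    return all(r["actual"] == r["expected"] for r in results)

cases = [
    ("TC-1", lambda: 2 + 2, 4),
    ("TC-2", lambda: "ok".upper(), "OK"),
]
path = os.path.join(tempfile.gettempdir(), "test_results.json")
run_and_record(cases, path)
print(assess(path))    # True
```

Keeping the raw results on file, rather than only the pass/fail verdict, is what makes later analysis and test reports possible.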

Each test execution should also be noted in a test log. The test log will contain records of when each test has been executed, the outcome of each test execution, and may also include key observations made during test execution. Often a test log is not maintained for lower levels of testing (unit test and software integration test).


Test reports may be produced at various points during the testing process. A test report will summarise the results of testing and document any analysis. An acceptance test report often forms a contractual document within which acceptance of software is agreed.

5. Conclusion

Software can be tested at various stages of the development and with various degrees of rigor. Like any development activity, testing consumes effort and effort costs money. Developers should plan for between 30% and 70% of a project's effort to be expended on verification and validation activities, including software testing.

From an economics point of view, the level of testing appropriate to a particular organisation and software application will depend on the potential consequences of undetected bugs. Such consequences can range from the minor inconvenience of having to find a work-around for a bug, through to multiple deaths. Often overlooked by software developers (but not by customers) is the long-term damage to the credibility of an organisation which delivers software to users with bugs in it, and the resulting negative impact on future business. Conversely, a reputation for reliable software will help an organisation to obtain future business.

Efficiency and quality are best served by testing software as early in the life cycle as practical, with full regression testing whenever changes are made. The later a bug is found, the higher the cost of fixing it, so it is sound economics to identify and fix bugs as early as possible. Designing tests will help to identify bugs, even before the tests are executed, so designing tests as early as practical in a software development is a useful means of reducing the cost of identifying and correcting bugs.

In practice the design of each level of software testing will be developed through a number of layers, each adding more detail to the tests. Each level of tests should be designed before the implementation reaches a point which could influence the design of tests in such a way as to be detrimental to the objectivity of the tests. Remember: software should be tested against what it is specified to do, not against what it is actually observed to do.

The effectiveness of testing effort can be maximised by selection of an appropriate testing strategy, good management of the testing process, and appropriate use of tools such as AdaTEST or Cantata to support the testing process. The net result will be an increase in quality and a decrease in costs, both of which can only be beneficial to a software developer's business.

The following list provides some rules to follow as an aid to effective and beneficial software testing.

Always test against a specification. If tests are not developed from a specification, then it is not testing. Hence, testing is totally reliant upon adequate specification of software.

Document the testing process: specify tests and record test results.

Test hierarchically against each level of specification. Finding more errors earlier will ultimately reduce costs.

Plan verification and validation activities, particularly testing.

Complement testing with techniques such as static analysis and dynamic analysis.

Always test positively (that the software does what it should), but also negatively (that it doesn't do what it shouldn't).

Have the right attitude to testing: it should be a challenge, not the chore it so often becomes.

Software Testing and Software Development Lifecycles

This paper outlines a number of commonly used software development lifecycle models, with particular emphasis on the testing activities involved in each model.

Irrespective of the lifecycle model used for software development, software has to be tested. Efficiency and quality are best served by testing software as early in the lifecycle as practical, with full regression testing whenever changes are made.

1. Introduction

The various activities which are undertaken when developing software are commonly modeled as a software development lifecycle. The software development lifecycle begins with the identification of a requirement for software and ends with the formal verification of the developed software against that requirement.

The software development lifecycle does not exist by itself; it is in fact part of an overall product lifecycle. Within the product lifecycle, software will undergo maintenance to correct errors and to comply with changes to requirements. The simplest overall form is where the product is just software, but it can become much more complicated, with multiple software developments each forming part of an overall system to comprise a product.

There are a number of different models for software development lifecycles. One thing which all models have in common is that at some point in the lifecycle, software has to be tested. This paper outlines some of the more commonly used software development lifecycles, with particular emphasis on the testing activities in each model.

2. Sequential Lifecycle Models

Traditionally, the models used for the software development lifecycle have been sequential, with the development progressing through a number of well defined phases. The sequential phases are usually represented by a V or waterfall diagram, and these models are respectively called the V lifecycle model and the waterfall lifecycle model.


Figure 1 V Lifecycle Model

Figure 2 Waterfall Lifecycle Model

There are in fact many variations of V and waterfall lifecycle models, introducing different phases to the lifecycle and creating different boundaries between phases. The following set of lifecycle phases fits in with the practices of most professional software developers.

The Requirements phase, in which the requirements for the software are gathered and analyzed, to produce a complete and unambiguous specification of what the software is required to do.

The Architectural Design phase, where a software architecture for the implementation of the requirements is designed and specified, identifying the components within the software and the relationships between the components.

The Detailed Design phase, where the detailed implementation of each component is specified.

The Code and Unit Test phase, in which each component of the software is coded and tested to verify that it faithfully implements the detailed design.

The Software Integration phase, in which progressively larger groups of tested software components are integrated and tested until the software works as a whole.

The System Integration phase, in which the software is integrated into the overall product and tested.

The Acceptance Testing phase, where tests are applied and witnessed to validate that the software faithfully implements the specified requirements.

Software specifications will be products of the first three phases of this lifecycle model. The remaining four phases all involve testing the software at various levels, each requiring as an input a test specification against which the testing will be conducted.

3. Progressive Development Lifecycle Models

The sequential V and waterfall lifecycle models represent an idealised model of software development. Other lifecycle models may be used for a number of reasons, such as volatility of requirements, or a need for an interim system with reduced functionality when long timescales are involved. As an example of other lifecycle models, let us look at progressive development and iterative lifecycle models.

A common problem with software development is that software is needed quickly, but it will take a long time to fully develop. The solution is to form a compromise between timescales and functionality, providing "interim" deliveries of software with reduced functionality, but serving as stepping stones towards the fully functional software. It is also possible to use such a stepping-stone approach as a means of reducing risk.

The usual names given to this approach to software development are progressive development or phased implementation. The corresponding lifecycle model is referred to as a progressive development lifecycle. Within a progressive development lifecycle, each individual phase of development will follow its own software development lifecycle, typically using a V or waterfall model. The actual number of phases will depend upon the development.

Figure 3 Progressive Development Lifecycle

Each delivery of software will have to pass acceptance testing to verify that the software fulfils the relevant parts of the overall requirements. The testing and integration of each phase will require time and effort, so there is a point at which an increase in the number of development phases will actually become counterproductive, giving an increased cost and timescale which will have to be weighed carefully against the need for an early solution.

The software produced by an early phase of the model may never actually be used; it may just serve as a prototype. A prototype will take short cuts in order to provide a quick means of validating key requirements and verifying critical areas of design. These short cuts may be in areas such as reduced documentation and testing. When such short cuts are taken, it is essential to plan to discard the prototype and implement the next phase from scratch, because the reduced quality of the prototype will not provide a good foundation for continued development.

4. Iterative Lifecycle Models

An iterative lifecycle model does not attempt to start with a full specification of requirements. Instead, development begins by specifying and implementing just part of the software, which can then be reviewed in order to identify further requirements. This process is then repeated, producing a new version of the software for each cycle of the model.

Consider an iterative lifecycle model which consists of repeating the four phases in sequence, as illustrated by figure 4.

Figure 4 Iterative Lifecycle Model

A Requirements phase, in which the requirements for the software are gathered and analyzed. Iteration should eventually result in a requirements phase which produces a complete and final specification of requirements.

A Design phase, in which a software solution to meet the requirements is designed. This may be a new design, or an extension of an earlier design.

An Implementation and Test phase, in which the software is coded, integrated and tested.


A Review phase, in which the software is evaluated, the current requirements are reviewed, and changes and additions to requirements are proposed.

For each cycle of the model, a decision has to be made as to whether the software produced by the cycle will be discarded, or kept as a starting point for the next cycle (sometimes referred to as incremental prototyping). Eventually a point will be reached where either the requirements are complete and the software can be delivered, or it becomes impossible to enhance the software as required and a fresh start has to be made.

The iterative lifecycle model can be likened to producing software by successive approximation. Drawing an analogy with mathematical methods which use successive approximation to arrive at a final solution, the benefit of such methods depends on how rapidly they converge on a solution.

Continuing the analogy, successive approximation may never find a solution. The iterations may oscillate around a feasible solution or even diverge. The number of iterations required may become so large as to be unrealistic. We have all seen software developments which have made this mistake!

The key to successful use of an iterative software development lifecycle is rigorous validation of requirements, and verification (including testing) of each version of the software against those requirements within each cycle of the model. The first three phases of the example iterative model are in fact an abbreviated form of a sequential V or waterfall lifecycle model. Each cycle of the model produces software which requires testing at the unit level, for software integration, for system integration and for acceptance. As the software evolves through successive cycles, tests have to be repeated and extended to verify each version of the software.

5. Maintenance

Successfully developed software will eventually become part of a product and enter a maintenance phase, during which the software will undergo modification to correct errors and to comply with changes to requirements. Like the initial development, modifications will follow a software development lifecycle, but not necessarily using the same lifecycle model as the initial development.

Throughout the maintenance phase, software tests have to be repeated, modified and extended. The effort to revise and repeat tests consequently forms a major part of the overall costs of developing and maintaining software.

The term regression testing is used to refer to the repetition of earlier successful tests in order to make sure that changes to the software have not introduced side effects.

6. Summary and Conclusion

Irrespective of the lifecycle model used for software development, software has to be tested. Efficiency and quality are best served by testing software as early in the lifecycle as practical, with full regression testing whenever changes are made.


Such practices become even more critical with progressive development and iterative lifecycle models, as the degree of re-testing needed to control the quality of software within such developments is much higher than with a more traditional sequential lifecycle model.

Regression testing is a major part of software maintenance. It is easy for changes to be made without anticipating the full consequences, which without full regression testing could lead to a decrease in the quality of the software. The ease with which tests can be repeated has a major influence on the cost of maintaining software.

A common mistake in the management of software development is to start by badly managing a development within a V or waterfall lifecycle model, which then degenerates into an uncontrolled iterative model. This is another situation which we have all seen cause a software development to go wrong.

AdaTEST and Cantata are tools which facilitate automated, repeatable and maintainable testing of software, offering significant advantages to developers of Ada, C and C++ software. The benefits of repeatable and maintainable testing, gained from using AdaTEST or Cantata, become even more important when a progressive development or iterative model is used for the software development lifecycle.

There are a wide range of software development lifecycle models which have not been discussed in this paper. However, other lifecycle models generally follow the form and share similar properties to one of the models described herein, offering similar benefits from the use of AdaTEST or Cantata.

Why Bother to Unit Test? This paper addresses a question often posed by developers who are new to the concept of thorough testing: Why bother to unit test? The question is answered by adopting the position of devil's advocate, presenting some of the common arguments made against unit testing, then proceeding to show how these arguments are worthless. The case for unit testing is supported by published data.

1. Introduction The quality and reliability of software is often seen as the weak link in industry's attempts to develop new products and services.

The last decade has seen the issue of software quality and reliability addressed through a growing adoption of design methodologies and supporting CASE tools, to the extent that most software designers have had some training and experience in the use of formalised software design methods.

Unfortunately, the same cannot be said of software testing. Many developments applying such design methodologies are still failing to bring the quality and reliability of software under control. It is not unusual for 50% of software maintenance costs to be attributed to fixing bugs left by the initial software development; bugs which should have been eliminated by thorough and effective software testing.

This paper addresses a question often posed by developers who are new to the concept of thorough testing: Why bother to unit test? The question is answered by adopting the position of devil's advocate, presenting some of the common arguments made against unit testing, then proceeding to show how these arguments are worthless. The case for unit testing is supported by published data.

2. What is Unit Testing? The unit test is the lowest level of testing performed during software development, where individual units of software are tested in isolation from other parts of a program.

In a conventional structured programming language, such as C, the unit to be tested is traditionally the function or sub-routine. In object oriented languages such as C++, the basic unit to be tested is the class. With Ada, developers have the choice of unit testing individual procedures and functions, or unit testing at the Ada package level. The principle of unit testing also extends to 4GL development, where the basic unit would typically be a menu or display.

Unit level testing is not just intended for one-off development use, to aid bug free coding. Unit tests have to be repeated whenever software is modified or used in a new environment. Consequently, all tests have to be maintained throughout the life of a software system.

Other activities which are often associated with unit testing are code reviews, static analysis and dynamic analysis. Static analysis investigates the textual source of software, looking for problems and gathering metrics without actually compiling or executing it. Dynamic analysis looks at the behaviour of software while it is executing, to provide information such as execution traces, timing profiles, and test coverage information.

3. Some Popular Misconceptions Having established what unit testing is, we can now proceed to play the devil's advocate. In the following subsections, some of the common arguments made against unit testing are presented, together with reasoned cases showing how these arguments are worthless.

3.1 It Consumes Too Much Time Once code has been written, developers are often keen to get on with integrating the software, so that they can see the actual system starting to work. Activities such as unit testing may be seen to get in the way of this apparent progress, delaying the time when the real fun of debugging the overall system can start.

What really happens with this approach to development is that real progress is traded for apparent progress. There is little point in having a system which "sort of" works, but happens to be full of bugs. In practice, such an approach to development will often result in software which will not even run. The net result is that a lot of time will be spent tracking down relatively simple bugs which are wholly contained within particular units. Individually, such bugs may be trivial, but collectively they result in an excessive period of time integrating the software to produce a system which is unlikely to be reliable when it enters use.

In practice, properly planned unit tests consume approximately as much effort as writing the actual code. Once completed, many bugs will have been corrected and developers can proceed to a much more efficient integration, knowing that they have reliable components to begin with. Real progress has been made, so properly planned unit testing is a much more efficient use of time. Uncontrolled rambling with a debugger consumes a lot more time for less benefit.

Support from tools such as AdaTEST and Cantata can make unit testing more efficient and effective, but it is not essential. Unit testing is a worthwhile activity even without tool support.

3.2 It Only Proves That the Code Does What the Code Does This is a common complaint of developers who jump straight into writing code, without first writing a specification for the unit. Having written the code and confronted with the task of testing it, they read the code to find out what it actually does and base their tests upon the code they have written. Of course they will prove nothing. All that such a test will show is that the compiler works. Yes, they will catch the (hopefully) rare compiler bug; but they could be achieving so much more.

If they had first written a specification, then tests could be based upon the specification. The code could then be tested against its specification, not against itself. Such a test will continue to catch compiler bugs. It will also find a lot more coding errors and even some errors in the specification. Better specifications enable better testing, and the corollary is that better testing requires better specifications.

In practice, there will be situations where a developer is faced with the thankless task of testing a unit given only the code for the unit and no specification. How can you do more than just find compiler bugs? The first step is to understand what the unit is supposed to do - not what it actually does. In effect, reverse engineer an outline specification. The main input to this process is reading the code and comments of the unit itself, and of the units which call it or which it calls. This can be supported by drawing flowgraphs, either by hand or using a tool. The outline specification can then be reviewed, to make sure that there are no fundamental flaws in the unit, and then used to design unit tests, with minimal further reference to the code.

3.3 "I'm too Good a Programmer to Need Unit Tests" There is at least one developer in every organisation who is so good at programming that their software always works first time and consequently does not need to be tested. How often have you heard this excuse?

In the real world, everyone makes mistakes. Even if a developer can muddle through with this attitude for a few simple programs, real software systems are much more complex. Real software systems do not have a hope of working without extensive testing and consequent bug fixing.

Coding is not a one pass process. In the real world software has to be maintained to reflect changes in operational requirements and fix bugs left by the original development. Do you want to be dependent upon the original author to make these changes? The chances are that the "expert" programmer who hacked out the original code without testing it will have moved on to hacking out code elsewhere. With a repeatable unit test the developer making changes will be able to check that there are no undesirable side effects.

3.4 Integration Tests will Catch all the Bugs Anyway We have already addressed this argument in part as a side issue from some of the preceding discussion. The reason why this will not work is that larger integrations of code are more complex. If units have not been tested first, a developer could easily spend a lot of time just getting the software to run, without actually executing any test cases.

Once the software is running, the developer is then faced with the problem of thoroughly testing each unit within the overall complexity of the software. It can be quite difficult to even create a situation where a unit is called, let alone thoroughly exercised once it is called. Thorough testing of unit level functionality during integration is much more complex than testing units in isolation.

The consequence is that testing will not be as thorough as it should be. Gaps will be left and bugs will slip through.

To create an analogy, try cleaning a fully assembled food processor! No matter how much water and detergent is sprayed around, little scraps of food will remain stuck in awkward corners, only to go rotten and surface in a later recipe. On the other hand, if it is disassembled, the awkward corners either disappear or become much more accessible, and each part can be cleaned without too much trouble.

3.5 It is not Cost Effective The level of testing appropriate to a particular organisation and software application depends on the potential consequences of undetected bugs. Such consequences can range from the minor inconvenience of having to find a work-around for a bug, to multiple deaths. Often overlooked by software developers (but not by customers) is the long term damage to the credibility of an organisation which delivers software to users with bugs in it, and the resulting negative impact on future business. Conversely, a reputation for reliable software will help an organisation to obtain future business.

Many studies have shown that efficiency and quality are best served by testing software as early in the life cycle as practical, with full regression testing whenever changes are made. The later a bug is found, the higher the cost of fixing it, so it is sound economics to identify and fix bugs as early as possible. Unit testing is an opportunity to catch bugs early, before the cost of correction escalates too far.

Unit tests are simpler to create, easier to maintain and more convenient to repeat than later stages of testing. When all costs are considered, unit tests are cheap compared to the alternative of complex and drawn out integration testing, or unreliable software.

4. Some Figures Figures from "Applied Software Measurement" (Capers Jones, McGraw-Hill, 1991), for the time taken to prepare tests, execute tests, and fix defects (normalised to one function point), show that unit testing is about twice as cost effective as integration testing and more than three times as cost effective as system testing (see bar chart).

(The term "field test" refers to any tests made in the field, once the software has entered use.)

This does not mean that developers should not perform the latter stages of testing; they are still necessary. What it does mean is that the expense of the later stages of testing can be reduced by eliminating as many bugs as possible as early as possible.

Other figures show that up to 50% of maintenance effort is spent fixing bugs which have always been there. This effort could be saved if the bugs were eliminated during development. When it is considered that software maintenance costs can be many times the initial development cost, a potential saving of 50% on software maintenance can make a sizeable impact on overall lifecycle costs.

5. Conclusion Experience has shown that a conscientious approach to unit testing will detect many bugs at a stage of the software development where they can be corrected economically. In later stages of software development, detection and correction of bugs is much more difficult, time consuming and costly. Efficiency and quality are best served by testing software as early in the lifecycle as practical, with full regression testing whenever changes are made.

Given units which have been tested, the integration process is greatly simplified. Developers will be able to concentrate upon the interactions between units and the overall functionality without being swamped by lots of little bugs within the units.

The effectiveness of testing effort can be maximised by selection of a testing strategy which includes thorough unit testing, good management of the testing process, and appropriate use of tools such as AdaTEST or Cantata to support the testing process. The result will be more reliable software at a lower development cost, and there will be further benefits in simplified maintenance and reduced lifecycle costs. Effective unit testing is all part of developing an overall "quality" culture, which can only be beneficial to a software developer's business.

Designing Unit Test Cases Producing a test specification, including the design of test cases, is the level of test design which has the highest degree of creative input. Furthermore, unit test specifications will usually be produced by a large number of staff with a wide range of experience, not just a few experts.

This paper provides a general process for developing unit test specifications and then describes some specific design techniques for designing unit test cases. It serves as a tutorial for developers who are new to formal testing of software, and as a reminder of some finer points for experienced software testers.

IPL is an independent software house founded in 1979 and based in Bath. IPL was accredited to ISO9001 in 1988, and gained TickIT accreditation in 1991. IPL has developed and supplies the AdaTEST and Cantata software verification products. AdaTEST and Cantata have been produced to these standards.

1. Introduction The design of tests is subject to the same basic engineering principles as the design of software. Good test design, like good software design, consists of a number of stages which progressively elaborate the design of tests:

Test strategy; Test planning; Test specification; Test procedure.

These four stages of test design apply to all levels of testing, from unit testing through to system testing. This paper concentrates on the specification of unit tests; i.e. the design of individual unit test cases within unit test specifications. A more detailed description of the four stages of test design can be found in the IPL paper "An Introduction to Software Testing".

The design of tests has to be driven by the specification of the software. For unit testing, tests are designed to verify that an individual unit implements all design decisions made in the unit's design specification. A thorough unit test specification should include positive testing, that the unit does what it is supposed to do, and also negative testing, that the unit does not do anything that it is not supposed to do.

Producing a test specification, including the design of test cases, is the level of test design which has the highest degree of creative input. Furthermore, unit test specifications will usually be produced by a large number of staff with a wide range of experience, not just a few experts.

This paper provides a general process for developing unit test specifications, and then describes some specific design techniques for designing unit test cases. It serves as a tutorial for developers who are new to formal testing of software, and as a reminder of some finer points for experienced software testers.

2. Developing Unit Test Specifications Once a unit has been designed, the next development step is to design the unit tests. An important point here is that it is more rigorous to design the tests before the code is written. If the code was written first, it would be too tempting to test the software against what it is observed to do (which is not really testing at all), rather than against what it is specified to do.

A unit test specification comprises a sequence of unit test cases. Each unit test case should include four essential elements:

A statement of the initial state of the unit, the starting point of the test case (this is only applicable where a unit maintains state between calls);
The inputs to the unit, including the value of any external data read by the unit;
What the test case actually tests, in terms of the functionality of the unit and the analysis used in the design of the test case (for example, which decisions within the unit are tested);
The expected outcome of the test case (the expected outcome of a test case should always be defined in the test specification, prior to test execution).

The following subsections of this paper provide a six step general process for developing a unit test specification as a set of individual unit test cases. For each step of the process, suitable test case design techniques are suggested. (Note that these are only suggestions. Individual circumstances may be better served by other test case design techniques.) Section 3 of this paper then describes in detail a selection of techniques which can be used within this process to help design test cases.
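The four essential elements can be captured directly in code. The sketch below is in Python (not the Ada, C and C++ environments the paper discusses), and the stateful `Counter` unit is purely hypothetical; each test case records its initial state, inputs, purpose, and expected outcome before execution:

```python
# Hypothetical unit under test: a counter that maintains state between calls.
class Counter:
    def __init__(self, start=0):
        self.value = start

    def increment(self, step=1):
        self.value += step
        return self.value

# Each test case states the four essential elements.
test_cases = [
    {"initial_state": 0, "input_step": 1,
     "tests": "increment from the default state",   "expected": 1},
    {"initial_state": 5, "input_step": 3,
     "tests": "increment preserves existing state", "expected": 8},
]

def run_case(case):
    unit = Counter(case["initial_state"])    # establish the initial state
    outcome = unit.increment(case["input_step"])
    return outcome == case["expected"]       # expected outcome defined up front
```

Writing the expected outcome into the case before execution is what prevents the test from degenerating into a record of whatever the code happens to do.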

2.1 Step 1 - Make it Run The purpose of the first test case in any unit test specification should be to execute the unit under test in the simplest way possible. When the tests are actually executed, knowing that at least the first unit test will execute is a good confidence boost. If it will not execute, then it is preferable to have something as simple as possible as a starting point for debugging.

Suitable techniques: Specification derived tests; Equivalence partitioning.

2.2 Step 2 - Positive Testing

Test cases should be designed to show that the unit under test does what it is supposed to do. The test designer should walk through the relevant specifications; each test case should test one or more statements of specification. Where more than one specification is involved, it is best to make the sequence of test cases correspond to the sequence of statements in the primary specification for the unit.

Suitable techniques: Specification derived tests; Equivalence partitioning; State-transition testing.

2.3 Step 3 - Negative Testing

Existing test cases should be enhanced and further test cases should be designed to show that the software does not do anything that it is not specified to do. This step depends primarily upon error guessing, relying upon the experience of the test designer to anticipate problem areas.

Suitable techniques: Error guessing; Boundary value analysis; Internal boundary value testing; State-transition testing.

2.4 Step 4 - Special Considerations Where appropriate, test cases should be designed to address issues such as performance, safety requirements and security requirements.

Particularly in the cases of safety and security, it can be convenient to give test cases special emphasis to facilitate security analysis or safety analysis and certification. Test cases already designed which address security issues or safety hazards should be identified in the unit test specification. Further test cases should then be added to the unit test specification to ensure that all security issues and safety hazards applicable to the unit will be fully addressed.

Suitable techniques: Specification derived tests.

2.5 Step 5 - Coverage Tests

The test coverage likely to be achieved by the designed test cases should be visualised. Further test cases can then be added to the unit test specification to achieve specific test coverage objectives. Once coverage tests have been designed, the test procedure can be developed and the tests executed.

Suitable techniques: Branch testing; Condition testing; Data definition-use testing; State-transition testing.

Test Execution

A test specification designed using the above five steps should in most cases provide a thorough test for a unit. At this point the test specification can be used to develop an actual test procedure, and the test procedure used to execute the tests. For users of AdaTEST or Cantata, the test procedure will be an AdaTEST or Cantata test script.

Execution of the test procedure will identify errors in the unit which can be corrected and the unit re-tested. Dynamic analysis during execution of the test procedure will yield a measure of test coverage, indicating whether coverage objectives have been achieved. There is therefore a further coverage completion step in the process of designing test specifications.
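The idea of dynamic analysis yielding a coverage measure can be illustrated by hand. In this Python sketch (a real project would use a coverage tool such as AdaTEST, Cantata or similar; the `classify` unit is hypothetical), the unit is instrumented so that executing the designed tests records which branches were taken:

```python
covered = set()   # branch identifiers observed while the tests execute

def classify(n):
    """Toy unit under test, instrumented to record which branches are taken."""
    if n < 0:
        covered.add("negative")
        return "negative"
    elif n == 0:
        covered.add("zero")
        return "zero"
    else:
        covered.add("positive")
        return "positive"

# Execute the designed test cases, then compare observed against required coverage.
for value in (-3, 7):
    classify(value)

all_branches = {"negative", "zero", "positive"}
missing = all_branches - covered    # the coverage gap left by the test cases
```

Here the designed cases leave the `zero` branch uncovered, which is exactly the kind of gap the coverage completion step exists to close.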

2.6 Step 6 - Coverage Completion Depending upon an organisation's standards for the specification of a unit, there may be no structural specification of processing within a unit other than the code itself. There are also likely to have been human errors made in the development of a test specification. Consequently, there may be complex decision conditions, loops and branches within the code for which coverage targets may not have been met when tests were executed. Where coverage objectives are not achieved, analysis must be conducted to determine why. Failure to achieve a coverage objective may be due to:

Infeasible paths or conditions - the corrective action should be to annotate the test specification to provide a detailed justification of why the path or condition is not tested. AdaTEST provides some facilities to help exclude infeasible conditions from Boolean coverage metrics.
Unreachable or redundant code - the corrective action will probably be to delete the offending code. It is easy to make mistakes in this analysis, particularly where defensive programming techniques have been used. If there is any doubt, defensive programming should not be deleted.
Insufficient test cases - test cases should be refined and further test cases added to a test specification to fill the gaps in test coverage.

Ideally, the coverage completion step should be conducted without looking at the actual code. However, in practice some sight of the code may be necessary in order to achieve coverage targets. It is vital that all test designers should recognise that use of the coverage completion step should be minimised. The most effective testing will come from analysis and specification, not from experimentation and over-dependence upon the coverage completion step to cover for sloppy test design.

Suitable techniques: Branch testing; Condition testing; Data definition-use testing; State-transition testing.

2.7 General Guidance

Note that the first five steps in producing a test specification can be achieved: solely from design documentation; without looking at the actual code; and prior to developing the actual test procedure.

It is usually a good idea to avoid long sequences of test cases which depend upon the outcome of preceding test cases. An error identified by a test case early in the sequence could cause secondary errors and reduce the amount of real testing achieved when the tests are executed.

The process of designing test cases, including executing them as "thought experiments", often identifies bugs before the software has even been built. It is not uncommon to find more bugs when designing tests than when executing tests.

Throughout unit test design, the primary input should be the specification documents for the unit under test. While use of actual code as an input to the test design process may be necessary in some circumstances, test designers must take care that they are not testing the code against itself. A test specification developed from the code will only prove that the code does what the code does, not that it does what it is supposed to do.

3. Test Case Design Techniques The preceding section of this paper has provided a "recipe" for developing a unit test specification as a set of individual test cases. In this section a range of techniques which can be used to help define test cases are described.

Test case design techniques can be broadly split into two main categories. Black box techniques use the interface to a unit and a description of functionality, but do not need to know how the inside of a unit is built. White box techniques make use of information about how the inside of a unit works. There are also some other techniques which do not fit into either of the above categories. Error guessing falls into this category.

Black Box (functional): Specification Derived Tests; Equivalence Partitioning; Boundary Value Analysis; State-Transition Testing.
White Box (structural): Branch Testing; Condition Testing; Data Definition-Use Testing; Internal Boundary Value Testing.
Other: Error Guessing.

The most important ingredients of any test design are experience and common sense. Test designers should not let any of the given techniques obstruct the application of experience and common sense.

The selection of test case design techniques described in the following subsections is by no means exhaustive. Further information on techniques for test case design can be found in "Software Testing Techniques" 2nd Edition, B Beizer, Van Nostrand Reinhold, New York 1990.

3.1 Specification Derived Tests As the name suggests, test cases are designed by walking through the relevant specifications. Each test case should test one or more statements of specification. It is often practical to make the sequence of test cases correspond to the sequence of statements in the specification for the unit under test. For example, consider the specification for a function to calculate the square root of a real number.

Input - real number
Output - real number

When given an input of 0 or greater, the positive square root of the input shall be returned. When given an input of less than 0, the error message "Square root error - illegal negative input" shall be displayed and a value of 0 returned. The library routine Print_Line shall be used to display the error message.

There are three statements in this specification, which can be addressed by two test cases. Note that the use of Print_Line conveys structural information in the specification.

Test Case 1: Input 4, Return 2

Exercises the first statement in the specification ("When given an input of 0 or greater, the positive square root of the input shall be returned.").

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative input" using Print_Line
Exercises the second and third statements in the specification ("When given an input of less than 0, the error message "Square root error - illegal negative input" shall be displayed and a value of 0 returned. The library routine Print_Line shall be used to display the error message.").

Specification derived test cases can provide an excellent correspondence to the sequence of statements in the specification for the unit under test, enhancing the readability and maintainability of the test specification. However, specification derived testing is a positive test case design technique. Consequently, specification derived test cases have to be supplemented by negative test cases in order to provide a thorough unit test specification.
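The square root specification and its two specification-derived test cases can be sketched in code. This version is in Python (not one of the paper's target languages), with a simple list standing in for the Print_Line library routine:

```python
import math

printed = []   # stand-in for the Print_Line library routine

def print_line(message):
    printed.append(message)

def square_root(x):
    """Unit under test, written to the square root specification above."""
    if x >= 0:
        return math.sqrt(x)    # statement 1: return the positive root
    print_line("Square root error - illegal negative input")   # statement 3
    return 0.0                 # statement 2: error case returns 0

# Test Case 1: exercises the first statement of the specification.
assert square_root(4) == 2

# Test Case 2: exercises the second and third statements.
assert square_root(-10) == 0
assert printed == ["Square root error - illegal negative input"]
```

Note that the tests check the unit against the specification's statements, not against the code's observed behaviour.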

A variation of specification derived testing is to apply a similar technique to a security analysis, safety analysis, software hazard analysis, or other document which provides supplementary information to the unit's specification.

3.2 Equivalence Partitioning Equivalence partitioning is a much more formalised method of test case design. It is based upon splitting the inputs and outputs of the software under test into a number of partitions, where the behaviour of the software is equivalent for any value within a particular partition. Data which forms partitions is not just routine parameters. Partitions can also be present in data accessed by the software, in time, in input and output sequence, and in state.

Equivalence partitioning assumes that all values within any individual partition are equivalent for test purposes. Test cases should therefore be designed to test one value in each partition. Consider again the square root function used in the previous example. The square root function has two input partitions and two output partitions:

Input Partitions: (i) <0; (ii) >=0
Output Partitions: (a) >=0; (b) Error

These four partitions can be tested with two test cases:

Test Case 1: Input 4, Return 2

Exercises the >=0 input partition (ii).
Exercises the >=0 output partition (a).

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative input" using Print_Line
Exercises the <0 input partition (i).
Exercises the "error" output partition (b).

For a function like square root, we can see that equivalence partitioning is quite simple: one test case for a positive number and a real result, and a second test case for a negative number and an error result. However, as software becomes more complex, the identification of partitions and the inter-dependencies between partitions becomes much more difficult, making it less convenient to use this technique to design test cases. Equivalence partitioning is still basically a positive test case design technique and needs to be supplemented by negative tests.
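The partition analysis lends itself to a data-driven test, with one representative value per partition. A Python sketch (reusing a square root unit written to the earlier specification; the error message output is omitted here for brevity):

```python
import math

def square_root(x):
    """Square root unit written to the earlier specification; the error
    partition returns 0 (the error message is omitted in this sketch)."""
    return math.sqrt(x) if x >= 0 else 0.0

# One representative value per equivalence partition.
partition_cases = [
    (4, 2.0),     # input partition (ii) >=0, output partition (a) >=0
    (-10, 0.0),   # input partition (i) <0, output partition (b) error
]

results = [square_root(value) == expected for value, expected in partition_cases]
```

Keeping the cases as data makes it easy to see, and review, which partitions each test value represents.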

3.3 Boundary Value Analysis Boundary value analysis uses the same analysis of partitions as equivalence partitioning. However, boundary value analysis assumes that errors are most likely to exist at the boundaries between partitions. Boundary value analysis consequently incorporates a degree of negative testing into the test design, by anticipating that errors will occur at or near the partition boundaries. Test cases are designed to exercise the software on, and on either side of, boundary values. Consider the two input partitions in the square root example:

The zero or greater partition has a boundary at 0 and a boundary at the most positive real number. The less than zero partition shares the boundary at 0 and has another boundary at the most negative real number. The output has a boundary at 0, below which it cannot go.


Test Case 1: Input {the most negative real number}, Return 0, Output "Square root error - illegal negative input" using Print_Line

Exercises the lower boundary of partition (i).

Test Case 2: Input {just less than 0}, Return 0, Output "Square root error - illegal negative input" using Print_Line

Exercises the upper boundary of partition (i).

Test Case 3: Input 0, Return 0

Exercises just outside the upper boundary of partition (i), the lower boundary of partition (ii) and the lower boundary of partition (a).

Test Case 4: Input {just greater than 0}, Return {the positive square root of the input}

Exercises just inside the lower boundary of partition (ii).

Test Case 5: Input {the most positive real number}, Return {the positive square root of the input}

Exercises the upper boundary of partition (ii) and the upper boundary of partition (a).

As for equivalence partitioning, it can become impractical to use boundary value analysis thoroughly for more complex software. Boundary value analysis can also be meaningless for non-scalar data, such as enumeration values; in the example, partition (b) does not really have boundaries. For purists, boundary value analysis requires knowledge of the underlying representation of the numbers. A more pragmatic approach is to use any small value above or below each boundary, and suitably large positive and negative numbers.
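Taking the pragmatic approach just described, the five boundary test cases can be sketched in Python (an assumption: IEEE double-precision floats stand in for "real numbers", and `square_root` is the same hypothetical unit as before):

```python
import math
import sys

# Hypothetical unit under test, as in the equivalence partitioning example.
def square_root(x):
    if x < 0:
        print("Square root error - illegal negative input")
        return 0.0
    return math.sqrt(x)

big = sys.float_info.max    # stands in for "the most positive real number"
tiny = sys.float_info.min   # a pragmatic "just greater than 0"

assert square_root(-big) == 0.0              # Test Case 1: lower boundary of (i)
assert square_root(-tiny) == 0.0             # Test Case 2: upper boundary of (i)
assert square_root(0.0) == 0.0               # Test Case 3: shared boundary at 0
assert square_root(tiny) == math.sqrt(tiny)  # Test Case 4: just inside (ii)
assert square_root(big) == math.sqrt(big)    # Test Case 5: upper boundary of (ii)
```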

3.4 State-Transition Testing State transition testing is particularly useful where either the software has been designed as a state machine or the software implements a requirement that has been modeled as a state machine. Test cases are designed to test the transitions between states by creating the events which lead to transitions.

The approach can also be used for negative testing, by designing test cases which apply illegal combinations of states and events. Testing state machines is addressed in detail by the IPL paper "Testing State Machines with AdaTEST and Cantata".
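A minimal sketch of the technique, using a hypothetical two-state machine (not from the paper): test cases create the events that drive each transition, and a negative test applies an illegal state/event combination.

```python
# Hypothetical two-state machine: a lamp toggled by a "press" event.
TRANSITIONS = {("OFF", "press"): "ON", ("ON", "press"): "OFF"}

class Lamp:
    def __init__(self):
        self.state = "OFF"

    def event(self, name):
        key = (self.state, name)
        if key not in TRANSITIONS:
            raise ValueError("illegal event %r in state %s" % (name, self.state))
        self.state = TRANSITIONS[key]

lamp = Lamp()
lamp.event("press")
assert lamp.state == "ON"   # exercises the OFF -> ON transition
lamp.event("press")
assert lamp.state == "OFF"  # exercises the ON -> OFF transition

# Negative test: an illegal state/event combination must be rejected.
try:
    lamp.event("dim")
    raise AssertionError("illegal event was accepted")
except ValueError:
    pass
```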

3.5 Branch Testing In branch testing, test cases are designed to exercise control flow branches or decision points in a unit. This is usually aimed at achieving a target level of Decision Coverage. Given a functional specification for a unit, a "black box" form of branch testing is to "guess" where branches may be coded and to design test cases to follow the branches. However, branch testing is really a "white box" or structural test case design technique. Given a structural specification for a unit, specifying the control flow within the unit, test cases can be designed to exercise branches. Such a structural unit specification will typically include a flowchart or PDL.


Returning to the square root example, a test designer could assume that there would be a branch between the processing of valid and invalid inputs, leading to the following test cases:

Test Case 1: Input 4, Return 2

Exercises the valid input processing branch.

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative input" using Print_Line.

Exercises the invalid input processing branch.

However, there could be many different structural implementations of the square root function. The following four structural specifications are all valid implementations of the square root function, but the above test cases would only achieve decision coverage of the first and third versions of the specification.

Specification 1

IF input<0 THEN
   CALL Print_Line "Square root error - illegal negative input"
   RETURN 0
ELSE
   Use maths co-processor to calculate the answer
   RETURN the answer
END_IF

Specification 2

IF input<0 THEN
   CALL Print_Line "Square root error - illegal negative input"
   RETURN 0
ELSE_IF input=0 THEN
   RETURN 0
ELSE
   Use maths co-processor to calculate the answer
   RETURN the answer
END_IF

Specification 3

Use maths co-processor to calculate the answer
Examine co-processor status registers
IF status=error THEN
   CALL Print_Line "Square root error - illegal negative input"
   RETURN 0
ELSE
   RETURN the answer
END_IF

Specification 4

IF input<0 THEN
   CALL Print_Line "Square root error - illegal negative input"
   RETURN 0
ELSE_IF input=0 THEN
   RETURN 0
ELSE
   Calculate first approximation
   LOOP
      Calculate error
      EXIT_LOOP WHEN error<desired accuracy
      Adjust approximation
   END_LOOP
   RETURN the answer
END_IF

It can be seen that branch testing works best with a structural specification for the unit. A structural unit specification will enable branch test cases to be designed to achieve decision coverage, but a purely functional unit specification could lead to coverage gaps.

One thing to beware of is that by concentrating upon branches, a test designer could lose sight of the overall functionality of a unit. It is important to always remember that it is the overall functionality of a unit that is important, and that branch testing is a means to an end, not an end in itself. Another consideration is that branch testing is based solely on the outcome of decisions. It makes no allowances for the complexity of the logic which leads to a decision.
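To make the coverage gap concrete, here is a hypothetical Python transcription of Specification 2 (a stand-in for the paper's PDL): the two functional test cases above never take the input=0 branch, so a third case is needed for decision coverage of this implementation.

```python
import math

# Transcription of Specification 2; printing replaces Print_Line.
def square_root(x):
    if x < 0:
        print("Square root error - illegal negative input")
        return 0.0
    elif x == 0:
        return 0.0          # branch missed by the two functional test cases
    else:
        return math.sqrt(x)

assert square_root(4) == 2.0    # exercises the valid input branch
assert square_root(-10) == 0.0  # exercises the invalid input branch
assert square_root(0) == 0.0    # extra case needed for decision coverage of Spec 2
```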

3.6 Condition Testing A range of test case design techniques fall under the general title of condition testing, all of which endeavor to mitigate the weaknesses of branch testing when complex logical conditions are encountered. The object of condition testing is to design test cases which show that the individual components of logical conditions, and combinations of those individual components, are correct.

Test cases are designed to test the individual elements of logical expressions, both within branch conditions and within other expressions in a unit. As for branch testing, condition testing could be used as a "black box" technique, where the test designer makes intelligent guesses about the implementation of a functional specification for a unit. However, condition testing is more suited to "white box" test design from a structural specification for a unit.

The test cases should be targeted at achieving a condition coverage metric, such as Modified Condition Decision Coverage (available as Boolean Operand Effectiveness in AdaTEST). The IPL paper entitled "Structural Coverage Metrics" provides more detail of condition coverage metrics.

To illustrate condition testing, consider the example specification for the square root function which uses successive approximation (specification 4). Suppose that the designer for the unit made a decision to limit the algorithm to a maximum of 10 iterations, on the grounds that after 10 iterations the answer would be as close as it would ever get. The PDL specification for the unit could specify an exit condition:


   :
   :
   EXIT_LOOP WHEN (error<desired accuracy) OR (iterations=10)
   :
   :

If the coverage objective is Modified Condition Decision Coverage, test cases have to prove that both error<desired accuracy and iterations=10 can independently affect the outcome of the decision.

Test Case 1: 10 iterations, error>desired accuracy for all iterations.

Both parts of the condition are false for the first 9 iterations. On the tenth iteration, the first part of the condition remains false and the second part becomes true, showing that the iterations=10 part of the condition can independently affect its outcome.

Test Case 2: 2 iterations, error>=desired accuracy for the first iteration, and error<desired accuracy for the second iteration.

Both parts of the condition are false for the first iteration. On the second iteration, the first part of the condition becomes true and the second part remains false, showing that the error<desired accuracy part of the condition can independently affect its outcome.

Condition testing works best when a structural specification for the unit is available. It provides a thorough test of complex conditions, an area of frequent programming and design error which is not addressed by branch testing. As for branch testing, test designers should beware that concentrating on conditions could distract them from the overall functionality of a unit.
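The two MC/DC test cases can be sketched against the exit condition alone, extracted here as a hypothetical predicate; each pair of assertions shows one operand independently flipping the decision while the other is held false:

```python
# The decision from the PDL exit condition, as a standalone predicate.
def exit_loop(error, accuracy, iterations):
    return (error < accuracy) or (iterations == 10)

# Test Case 1: iterations=10 independently makes the decision true.
assert exit_loop(error=0.5, accuracy=0.1, iterations=9) is False
assert exit_loop(error=0.5, accuracy=0.1, iterations=10) is True

# Test Case 2: error<accuracy independently makes the decision true.
assert exit_loop(error=0.5, accuracy=0.1, iterations=2) is False
assert exit_loop(error=0.05, accuracy=0.1, iterations=2) is True
```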

3.7 Data Definition-Use Testing Data definition-use testing designs test cases to test pairs of data definitions and uses. A data definition is anywhere that the value of a data item is set, and a data use is anywhere that a data item is read or used. The objective is to create test cases which will drive execution through paths between specific definitions and uses.

Like branch testing and condition testing, data definition-use testing can be used in combination with a functional specification for a unit, but is better suited to use with a structural specification for a unit.

Consider one of the earlier PDL specifications for the square root function, which sent every input to the maths co-processor and used the co-processor status to determine the validity of the result (specification 3). The first step is to list the pairs of definitions and uses. In this specification there are a number of definition-use pairs:

    Definition                 Use
 1  Input to routine           By the maths co-processor
 2  Co-processor status        Test for status=error
 3  Error message              By Print_Line
 4  RETURN 0                   By the calling unit
 5  Answer by co-processor     RETURN the answer
 6  RETURN the answer          By the calling unit

These pairs of definitions and uses can then be used to design test cases. Two test cases are required to test all six of these definition-use pairs:

Test Case 1: Input 4, Return 2

Tests definition-use pairs 1, 2, 5 and 6.

Test Case 2: Input -10, Return 0, Output "Square root error - illegal negative input" using Print_Line.

Tests definition-use pairs 1, 2, 3 and 4.

The analysis needed to develop test cases using this design technique can also be useful for identifying problems before the tests are even executed; for example, identifying situations where data is used without having been defined. This is the sort of data flow analysis that some static analysis tools can help with. The analysis of data definition-use pairs can become very complex, even for relatively simple units. Consider what the definition-use pairs would be for the successive approximation version of square root!
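The pairs can be annotated directly on a Python analogue of specification 3 (hypothetical: there is no real co-processor here, so its status is modelled by catching an exception), with the two test cases driving execution through all six pairs:

```python
import math

def square_root(x):                      # def 1: input to routine
    try:
        answer = math.sqrt(x)            # use 1: input; def 5: answer
        status = "ok"                    # def 2: status
    except ValueError:
        answer = 0.0
        status = "error"                 # def 2: status
    if status == "error":                # use 2: test for status=error
        print("Square root error - illegal negative input")  # def/use 3: message
        return 0.0                       # def 4: RETURN 0, used by the caller
    return answer                        # use 5, def 6: RETURN the answer

assert square_root(4) == 2.0    # drives definition-use pairs 1, 2, 5 and 6
assert square_root(-10) == 0.0  # drives definition-use pairs 1, 2, 3 and 4
```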

It is possible to split data definition-use tests into two categories: uses which affect control flow (predicate uses) and uses which are purely computational. Refer to "Software Testing Techniques" 2nd Edition, B Beizer, Van Nostrand Reinhold, New York 1990, for a more detailed description of predicate and computational uses.

3.8 Internal Boundary Value Testing In many cases, partitions and their boundaries can be identified from a functional specification for a unit, as described under equivalence partitioning and boundary value analysis above. However, a unit may also have internal boundary values which can only be identified from a structural specification. Consider a fragment of the successive approximation version of the square root unit specification (specification 4):

   :
   Calculate first approximation
   LOOP
      Calculate error
      EXIT_LOOP WHEN error<desired accuracy
      Adjust approximation
   END_LOOP
   RETURN the answer
   :
   :

The calculated error can be in one of two partitions about the desired accuracy, a feature of the structural design for the unit which is not apparent from a purely functional specification. An analysis of internal boundary values yields three conditions for which test cases need to be designed.

Test Case 1: Error just greater than the desired accuracy

Test Case 2: Error equal to the desired accuracy

Test Case 3: Error just less than the desired accuracy

Internal boundary value testing can help to bring out some elusive bugs. For example, suppose "<=" had been coded instead of the specified "<". Nevertheless, internal boundary value testing is a luxury to be applied only as a final supplement to other test case design techniques.
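The three internal boundary test cases can be sketched against the exit comparison itself (the accuracy and step values below are hypothetical); the point is that Test Case 2 is exactly the case that would catch a mistaken "<=":

```python
ACCURACY = 1e-6   # hypothetical desired accuracy

def should_exit(error):
    return error < ACCURACY   # the specified "<"; a coded "<=" fails Test Case 2

STEP = 1e-12      # a small step for "just greater" / "just less"
assert should_exit(ACCURACY + STEP) is False  # Test Case 1: just greater
assert should_exit(ACCURACY) is False         # Test Case 2: equal to the accuracy
assert should_exit(ACCURACY - STEP) is True   # Test Case 3: just less
```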

3.9 Error Guessing Error guessing is based mostly upon experience, with some assistance from other techniques such as boundary value analysis. Based on experience, the test designer guesses the types of errors that could occur in a particular type of software and designs test cases to uncover them. For example, if any type of resource is allocated dynamically, a good place to look for errors is in the deallocation of resources. Are all resources correctly deallocated, or are some lost as the software executes?

Error guessing by an experienced engineer is probably the single most effective method of designing tests which uncover bugs. A well placed error guess can show a bug which could easily be missed by many of the other test case design techniques presented in this paper. Conversely, in the wrong hands error guessing can be a waste of time.

To make the maximum use of available experience and to add some structure to this test case design technique, it is a good idea to build a checklist of types of errors. This checklist can then be used to help "guess" where errors may occur within a unit. The checklist should be maintained with the benefit of experience gained in earlier unit tests, helping to improve the overall effectiveness of error guessing.
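The resource-deallocation guess above can be sketched as follows (a hypothetical pool, not a real API): the error-guess tests probe for leaked resources and for an unrejected double free.

```python
# Hypothetical resource pool used to illustrate error-guess test cases.
class Pool:
    def __init__(self):
        self.live = set()
        self.next_id = 0

    def allocate(self):
        self.next_id += 1
        self.live.add(self.next_id)
        return self.next_id

    def free(self, handle):
        self.live.remove(handle)   # raises KeyError on a double free

pool = Pool()
handles = [pool.allocate() for _ in range(3)]
for h in handles:
    pool.free(h)
assert pool.live == set()          # guess: are all resources released?

try:
    pool.free(handles[0])          # guess: is a double free rejected?
    raise AssertionError("double free was accepted")
except KeyError:
    pass
```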

4. Conclusion Experience has shown that a conscientious approach to unit testing will detect many bugs at a stage of the software development where they can be corrected economically. A rigorous approach to unit testing requires:

That the design of units is documented in a specification before coding begins;
That unit tests are designed from the specification for the unit, also preferably before coding begins;
That the expected outcomes of unit test cases are specified in the unit test specification.

The process for developing unit test specifications presented in this paper is generic, in that it can be applied to any level of testing. Nevertheless, there will be circumstances where it has to be tailored to specific situations. Tailoring of the process and the use of test case design techniques should be documented in the overall test strategy.

Although the square root example used to illustrate the test case design techniques is fairly trivial, it does serve to show the principles behind the techniques. It is unlikely that any single test case design technique will lead to a particularly thorough test specification. When the techniques are used to complement each other through each stage of the test specification development process, their synergy can be much more effective. Nevertheless, test designers should not let any of these techniques obstruct the application of experience and common sense.

AdaTEST and Cantata can be used in association with any of the test case design techniques described in this paper. The dynamic analysis facilities of AdaTEST and Cantata provide a range of test coverage measures which can be used to help with steps 5 and 6 (coverage tests and coverage completion) of the unit test development process.

Organisational Approaches for Unit Testing This paper describes three organisational approaches for unit testing: top down, bottom up and isolation. The organisational approach is a key element of unit test strategy and planning; selection of an inappropriate approach can have a significant impact on the cost of unit testing and software maintenance. A unit test strategy based on isolation testing is recommended.

1. Introduction Unit testing is the testing of individual components (units) of the software. Unit testing is usually conducted as part of a combined code and unit test phase of the software lifecycle, although it is not uncommon for coding and unit testing to be conducted as two distinct phases.

The basic units of design and code in Ada, C and C++ programs are individual subprograms (procedures, functions, member functions). Ada and C++ provide capabilities for grouping basic units together into packages (Ada) and classes (C++). Unit testing for Ada and C++ usually tests units in the context of the containing package or class.

When developing a strategy for unit testing, there are three basic organisational approaches that can be taken. These are top down, bottom up and isolation. These three approaches are described and their advantages and disadvantages discussed in sections 2, 3, and 4 of this paper.

The concepts of test drivers and stubs are used throughout this paper. A test driver is software which executes software in order to test it, providing a framework for setting input parameters, executing the unit, and reading the output parameters. A stub is an imitation of a unit, used in place of the real unit to facilitate testing.

An AdaTEST or Cantata test script comprises a test driver and an (optional) collection of stubs. Using AdaTEST or Cantata to implement the organisational approaches to unit testing presented in this paper is discussed in section 5.

2. Top Down Testing 2.1 Description In top down unit testing, individual units are tested by using them from the units which call them, but in isolation from the units called.

The unit at the top of a hierarchy is tested first, with all called units replaced by stubs. Testing continues by replacing the stubs with the actual called units, with lower level units being stubbed. This process is repeated until the lowest level units have been tested. Top down testing requires test stubs, but not test drivers.

Figure 2.1 illustrates the test stubs and tested units needed to test unit D, assuming that units A, B and C have already been tested in a top down approach.

Figure 2.1 - Top Down Testing

A unit test plan for the program shown in figure 2.1, using a strategy based on the top down organisational approach, could read as follows:

Step (1) Test unit A, using stubs for units B, C and D.
Step (2) Test unit B, by calling it from tested unit A, using stubs for units C and D.
Step (3) Test unit C, by calling it from tested unit A, using tested unit B and a stub for unit D.
Step (4) Test unit D, by calling it from tested unit A, using tested units B and C, and stubs for units E, F and G. (Shown in figure 2.1.)
Step (5) Test unit E, by calling it from tested unit D, which is called from tested unit A, using tested units B and C, and stubs for units F, G, H, I and J.
Step (6) Test unit F, by calling it from tested unit D, which is called from tested unit A, using tested units B, C and E, and stubs for units G, H, I and J.
Step (7) Test unit G, by calling it from tested unit D, which is called from tested unit A, using tested units B, C, E and F, and stubs for units H, I and J.
Step (8) Test unit H, by calling it from tested unit E, which is called from tested unit D, which is called from tested unit A, using tested units B, C, F and G, and stubs for units I and J.
Step (9) Test unit I, by calling it from tested unit E, which is called from tested unit D, which is called from tested unit A, using tested units B, C, F, G and H, and a stub for unit J.
Step (10) Test unit J, by calling it from tested unit E, which is called from tested unit D, which is called from tested unit A, using tested units B, C, F, G, H and I.
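Step (1) of such a plan can be sketched as follows (hypothetical units; the stub is injected as a parameter purely for illustration, where a tool such as AdaTEST or Cantata would substitute it at build time):

```python
# Step (1) of a top down plan: test unit A with a stub in place of called unit B.
def unit_b(x):
    raise NotImplementedError("real unit B is not exercised in this step")

def unit_a(x, call_b=unit_b):
    return call_b(x) + 1       # unit A's own processing on top of B's result

def stub_b(x):
    return 10                  # canned answer standing in for unit B

assert unit_a(5, call_b=stub_b) == 11  # unit A tested; the stub controls unit B
```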

2.2 Advantages Top down unit testing provides an early integration of units before the software integration phase. In fact, top down unit testing is really a combined unit test and software integration strategy.

The detailed design of units is top down, and top down unit testing implements tests in the sequence units are designed, so development time can be shortened by overlapping unit testing with the detailed design and code phases of the software lifecycle.

In a conventionally structured design, where units at the top of the hierarchy provide high level functions, with units at the bottom of the hierarchy implementing details, top down unit testing will provide an early integration of 'visible' functionality. This gives a very requirements oriented approach to unit testing.

Redundant functionality in lower level units will be identified by top down unit testing, because there will be no route to test it. (However, there can be some difficulty in distinguishing between redundant functionality and untested functionality).

2.3 Disadvantages Top down unit testing is controlled by stubs, with test cases often spread across many stubs. With each unit tested, testing becomes more complicated, and consequently more expensive to develop and maintain.

As testing progresses down the unit hierarchy, it also becomes more difficult to achieve the good structural coverage which is essential for high integrity and safety critical applications, and which is required by many standards. Difficulty in achieving structural coverage can also lead to confusion between genuinely redundant functionality and untested functionality. Testing some low level functionality, especially error handling code, can be totally impractical.

Changes to a unit often impact the testing of sibling units and units below it in the hierarchy. For example, consider a change to unit D. Obviously, the unit test for unit D would have to change and be repeated. In addition, unit tests for units E, F, G, H, I and J, which use the tested unit D, would also have to be repeated. These tests may themselves have to change, as a consequence of the change to unit D, even though units E, F, G, H, I and J had not actually changed. This leads to a high cost of re-testing when changes are made, and a high maintenance and overall lifecycle cost.

The design of test cases for top down unit testing requires structural knowledge of when the unit under test calls other units. The sequence in which units can be tested is constrained by the hierarchy of units, with lower units having to wait for higher units to be tested, forcing a 'long and thin' unit test phase. (However, this can overlap substantially with the detailed design and code phases of the software lifecycle).

The relationships between units in the example program in figure 2.1 are much simpler than would be encountered in a real program, where units could be referenced from more than one other unit in the hierarchy. All of the disadvantages of a top down approach to unit testing are compounded when a unit is referenced from more than one other unit.

2.4 Overall A top down strategy will cost more than an isolation based strategy, due to complexity of testing units below the top of the unit hierarchy, and the high impact of changes. The top down organisational approach is not a good choice for unit testing. However, a top down approach to the integration of units, where the units have already been tested in isolation, can be viable.

3. Bottom up Testing 3.1 Description In bottom up unit testing, units are tested in isolation from the units which call them, but using the actual units called as part of the test.

The lowest level units are tested first, then used to facilitate the testing of higher level units. Other units are then tested, using previously tested called units. The process is repeated until the unit at the top of the hierarchy has been tested. Bottom up testing requires test drivers, but does not require test stubs.

Figure 3.1 illustrates the test driver and tested units needed to test unit D, assuming that units E, F, G, H, I and J have already been tested in a bottom up approach.

Figure 3.1 - Bottom Up Testing

A unit test plan for the program shown in figure 3.1, using a strategy based on the bottom up organisational approach, could read as follows:

Step (1) (Note that the sequence of tests within this step is unimportant; all tests within step 1 could be executed in parallel.)
   Test unit H, using a driver to call it in place of unit E;
   Test unit I, using a driver to call it in place of unit E;
   Test unit J, using a driver to call it in place of unit E;
   Test unit F, using a driver to call it in place of unit D;
   Test unit G, using a driver to call it in place of unit D;
   Test unit B, using a driver to call it in place of unit A;
   Test unit C, using a driver to call it in place of unit A.
Step (2) Test unit E, using a driver to call it in place of unit D, and tested units H, I and J.
Step (3) Test unit D, using a driver to call it in place of unit A, and tested units E, F, G, H, I and J. (Shown in figure 3.1.)
Step (4) Test unit A, using tested units B, C, D, E, F, G, H, I and J.
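The first two steps of such a plan can be sketched as follows (hypothetical units): a driver calls the lowest unit directly, and its caller is then tested using the real, already-tested callee, with no stubs required.

```python
def unit_h(x):
    return x * 2               # lowest-level unit

def unit_e(x):
    return unit_h(x) + 1       # unit E calls the real unit H; no stub needed

# Step (1): the test driver calls unit H in place of unit E.
assert unit_h(3) == 6

# Step (2): unit E is tested using the already-tested unit H.
assert unit_e(3) == 7
```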

3.2 Advantages Like top down unit testing, bottom up unit testing provides an early integration of units before the software integration phase. Bottom up unit testing is also really a combined unit test and software integration strategy. All test cases are controlled solely by the test driver, with no stubs required. This can make unit tests near the bottom of the unit hierarchy relatively simple. (However, higher level unit tests can be very complicated).

Test cases for bottom up testing may be designed solely from functional design information, requiring no structural design information (although structural design information may be useful in achieving full coverage). This makes the bottom up approach to unit testing useful when the detailed design documentation lacks structural detail.

Bottom up unit testing provides an early integration of low level functionality, with higher level functionality being added in layers as unit testing progresses up the unit hierarchy. This makes bottom up unit testing readily compatible with the testing of objects.

3.3 Disadvantages As testing progresses up the unit hierarchy, bottom up unit testing becomes more complicated, and consequently more expensive to develop and maintain. As testing progresses up the unit hierarchy, it also becomes more difficult to achieve good structural coverage.

Changes to a unit often impact the testing of units above it in the hierarchy. For example, consider a change to unit H. Obviously, the unit test for unit H would have to change and be repeated. In addition, unit tests for units A, D and E, which use the tested unit H, would also have to be repeated. These tests may also have to change themselves, as a consequence of the change to unit H, even though units A, D and E had not actually changed. This leads to a high cost associated with retesting when changes are made, and a high maintenance and overall lifecycle cost.

The sequence in which units can be tested is constrained by the hierarchy of units, with higher units having to wait for lower units to be tested, forcing a 'long and thin' unit test phase. The first units to be tested are the last units to be designed, so unit testing cannot overlap with the detailed design phase of the software lifecycle.

The relationships between units in the example program in figure 3.1 are much simpler than would be encountered in a real program, where units could be referenced from more than one other unit in the hierarchy. As for top down unit testing, the disadvantages of a bottom up approach to unit testing are compounded when a unit is referenced from more than one other unit.

3.4 Overall The bottom up organisational approach can be a reasonable choice for unit testing, particularly when objects and reuse are considered. However, the bottom up approach is biased towards functional testing, rather than structural testing. This can present difficulties in achieving the high levels of structural coverage essential for high integrity and safety critical applications, and which are required by many standards.

The bottom up approach to unit testing conflicts with the tight timescales required of many software developments. Overall, a bottom up strategy will cost more than an isolation based strategy, due to complexity of testing units above the bottom level in the unit hierarchy and the high impact of changes.

4. Isolation Testing 4.1 Description Isolation testing tests each unit in isolation from the units which call it and the units it calls.

Units can be tested in any sequence, because no unit test requires any other unit to have been tested. Each unit test requires a test driver, and all called units are replaced by stubs. Figure 4.1 illustrates the test driver and test stubs needed to test unit D.

Figure 4.1 - Isolation Testing

A unit test plan for the program shown in figure 4.1, using a strategy based on the isolation organisational approach, need contain only one step, as follows:

Step (1) (Note that there is only one step to the test plan. The sequence of tests is unimportant; all tests could be executed in parallel.)
   Test unit A, using a driver to start the test and stubs in place of units B, C and D;
   Test unit B, using a driver to call it in place of unit A;
   Test unit C, using a driver to call it in place of unit A;
   Test unit D, using a driver to call it in place of unit A and stubs in place of units E, F and G (shown in figure 4.1);
   Test unit E, using a driver to call it in place of unit D and stubs in place of units H, I and J;
   Test unit F, using a driver to call it in place of unit D;
   Test unit G, using a driver to call it in place of unit D;
   Test unit H, using a driver to call it in place of unit E;
   Test unit I, using a driver to call it in place of unit E;
   Test unit J, using a driver to call it in place of unit E.
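Testing unit D in isolation can be sketched as follows (hypothetical units; the stub is injected as a parameter for illustration): the driver and stub together remove every dependency on other unit tests.

```python
# Isolation test of unit D: a driver calls it, a stub replaces called unit E.
def unit_e(x):
    raise NotImplementedError("real unit E is never executed in this test")

def unit_d(x, call_e=unit_e):
    return call_e(x) - 1       # unit D's own processing on top of E's result

def stub_e(x):
    return 100                 # stub standing in for unit E

assert unit_d(7, call_e=stub_e) == 99  # no other unit test is a prerequisite
```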

4.2 Advantages

It is easier to test an isolated unit thoroughly, because the unit test is removed from the complexity of other units. Isolation testing is the easiest way to achieve good structural coverage, and the difficulty of achieving good structural coverage does not vary with the position of a unit in the unit hierarchy.

Because only one unit is being tested at a time, the test drivers tend to be simpler than for bottom up testing, while the stubs tend to be simpler than for top down testing.

With an isolation approach to unit testing, there are no dependencies between the unit tests, so the unit test phase can overlap the detailed design and code phases of the software lifecycle. Any number of units can be tested in parallel, to give a 'short and fat' unit test phase. This is a useful way of using an increase in team size to shorten the overall time of a software development.

A further advantage of the removal of interdependency between unit tests is that changes to a unit only require changes to the unit test for that unit, with no impact on other unit tests. This results in a lower cost than the bottom up or top down organisational approaches, especially when changes are made.

An isolation approach provides a distinct separation of unit testing from integration testing, allowing developers to focus on unit testing during the unit test phase of the software lifecycle, and on integration testing during the integration phase of the software lifecycle. Isolation testing is the only pure approach to unit testing; both top down testing and bottom up testing result in a hybrid of the unit test and integration phases.

Unlike the top down and bottom up approaches, the isolation approach to unit testing is not affected by a unit being referenced from more than one other unit.

4.3 Disadvantages

The main disadvantage of an isolation approach to unit testing is that it does not provide any early integration of units. Integration has to wait for the integration phase of the software lifecycle. (Is this really a disadvantage?)

An isolation approach to unit testing requires structural design information and the use of both stubs and drivers. This can lead to higher costs than bottom up testing for units near the bottom of the unit hierarchy. However, this will be compensated for by simplified testing for units higher in the unit hierarchy, together with lower costs each time a unit is changed.

4.4 Overall

An isolation approach to unit testing is the best overall choice. When supplemented with an appropriate integration strategy, it enables shorter development timescales and provides the lowest cost, both during development and for the overall lifecycle.

Following unit testing in isolation, tested units can be integrated in a top down or bottom up sequence, or in any convenient groupings and combinations of groupings. However, bottom up integration is the strategy most compatible with current trends in object oriented and object biased designs.

An isolation approach to unit testing is the best way of achieving the high levels of structural coverage essential for high integrity and safety critical applications, and required by many standards. With the difficult work of achieving good structural coverage done during unit testing, integration testing can concentrate on overall functionality and the interactions between units.

5. Using AdaTEST and Cantata

A unit test will be repeated many times throughout the software lifecycle, both during the development part of the lifecycle and later during maintenance. A test harness such as AdaTEST or Cantata can be used to automate unit tests, resulting in unit tests which are easy to repeat and have a low cost of repetition, and reducing the risk of human error.

AdaTEST and Cantata test scripts comprise a test driver and an (optional) collection of stubs. AdaTEST and Cantata can be used with any of the organisational approaches to unit testing described by this paper, or with any combination of organisational approaches, enabling the developer to adopt a testing strategy best suited to the needs of a project.

Two related papers are available from IPL:

- Achieving Testability when using Ada Packaging and Data Hiding Methods
- Testing C++ Objects

The paper "Testing C++ Objects" also provides detail about how the complexity of separate class and containment hierarchies leads to problems with a bottom up approach to unit testing. It describes how an isolation approach to unit testing is the only practical way to deal with separate class and containment hierarchies.

6. Conclusion

In practice, it is unlikely that any single approach to unit testing can be used exclusively. Typically, an isolation approach to unit testing is modified with some bottom up testing, in which the called units are a mixture of stubs and already tested real units. For example, it makes more sense for a mathematical function to be used directly, provided that it has already been tested and is unlikely to change.

The recommended strategy is:

Base your unit test strategy on the isolation approach, then integrate groups of tested units bottom up. Compromise by incorporating some bottom up where it is convenient (for example, using real operators, mathematical functions, string manipulation etc.), but remember the potential impact of changes.

This will result in the lowest cost, both to develop unit tests and to repeat and maintain tests following changes to units, whilst also facilitating the thorough test coverage necessary to achieve reliable software.

Remember that unit testing is about testing units, and that integration testing is about testing the interaction between tested units.
