Risk and Testing (2003)
© Neil Thompson
Risk and Testing (extended after session)
Presentation to Testing Master Class, 27/28 March 2003
Neil Thompson
Thompson information Systems Consulting Limited
www.TiSCL.com
23 Oast House Crescent, Farnham, Surrey GU9 0NP, England (UK)
Phone & fax: 01252 726900
[email protected]
Direct phone: 07000 NeilTh (634584); direct fax: 07000 NeilTF (634583)
Some slides included with permission from Paul Gerrard
Agenda
• 1. What testers mean by risk
• 2. “Traditional” use of risk in testing
• 3. More recent contributions to thinking
• 4. Risk-Based Testing: Paul Gerrard (and I)
• 5. Next steps in RBT:
  – end-to-end risk data model
  – automation
• 6. Refinements and ideas for future
Slide 1 of 51
1. What testers mean by risk

• Risk that software in live use will fail:
  – software: could be Commercial Off The Shelf; packages such as ERP; bespoke project; integrated programmes of multiple systems...; industry-wide supply chains; any systems product
• Could include the risk that later stages (higher levels) of testing will be excessively disrupted by failures
• Chain of risks (not all errors result in faults; not all faults result in failures):
  – Error: mistake made by a human (eg spec-writing, program-coding)
  – → RISK → Fault: something wrong in a product (interim, eg spec; final, eg executable software)
  – → RISK → Failure: deviation of product from its expected* delivery or service (doesn’t do what it should, or does what it shouldn’t)
Slide 2 of 51
* “expected” may be as in the spec, or the spec may be wrong (verification & validation)
Extra links (4, 5) in the chain of risks?

• Steve Allott:
  – distinguish faults in specifications from faults in the resulting program code
• Ed Kit (1995 book, Software Testing in the Real World):
  – “error” reserved for its scientific use, the amount by which a result is incorrect (but not always very useful in software testing?)
  – Mistake: a human action that produces an incorrect result (eg in spec-writing, program-coding)
  – Fault: an incorrect step, process or data definition in a computer program (ie executable software)
  – Failure: an incorrect result
• The extended chain:
  – Link 1: mistake made by a human
  – → RISK → Link 2: something wrong in a spec ((undefined): incorrect results in specifications; could use Defect for this?)
  – → RISK → Link 3: something wrong in program code
  – → RISK → Link 4: deviation of product from its expected delivery or service (doesn’t do what it should, or does what it shouldn’t)

Slide 2.x1 of 51
Chain of risks could be up to 6 links?

• Adding in a distinction by John Musa (1998 book, Software Reliability Engineering):
  – not all deviations are failures (but this is just the “anomaly” concept?)
  – (so the associated risks are in the testing process rather than development: that an anomaly may not be noticed, or may be misinterpreted)
• A possible hybrid of all sources:
  – Mistake: a human action that produces an incorrect result (eg in spec-writing, program-coding)
  – → RISK → Defect: incorrect results in specifications (note: this fits its usage in inspections)
  – → RISK (or a direct programming mistake) → Fault: an incorrect step, process or data definition in a computer program (ie executable software)
  – → RISK → Failure: an incorrect result (Error: the amount by which the result is incorrect)
  – → RISK OF MISSING / RISK OF MIS-INTERPRETING → Anomaly: an unexpected result during testing ((false alarm): or Change Request, or testware mistake)

Slide 2.x2 of 51
Three types of software risk

• Project Risk: resource constraints, external interfaces, supplier relationships, contract restrictions
  – primarily a management responsibility
• Process Risk: variances in planning and estimation, shortfalls in staffing, failure to track progress, lack of quality assurance and configuration management
  – planning and the development process are the main issues here
• Product Risk: lack of requirements stability, complexity, design quality, coding quality, non-functional issues, test specifications
  – requirements risks are the most significant risks reported in risk assessments
• Testers are mainly concerned with Product Risk
Slide 3 of 51
Risk management components
• Bad things which could happen, and the probability of each
• The consequence of each bad thing which could happen
Slide 4 of 51
Symmetric view of risk probability & consequence

risk EXPOSURE = probability x consequence

Grid (PROBABILITY / likelihood of bad thing occurring, down; CONSEQUENCE / impact if bad thing does occur, across):
  1  2  3
  2  4  6
  3  6  9

• This is how most people quantify risk
• Adding gives the same rank order as multiplying, but less differentiation

Slide 5 of 51
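A minimal sketch in Python of this calculation and ranking; the 1-3 scales match the grid above, but the risk names and scores are illustrative, not from the slides:

```python
# Minimal sketch of the symmetric exposure calculation. The 1-3 scales
# match the grid above; risk names and scores are illustrative only.

def exposure(probability: int, consequence: int) -> int:
    """risk EXPOSURE = probability x consequence."""
    return probability * consequence

risks = {
    "insidious database corruption": (1, 3),  # unlikely, severe
    "slow page load":                (3, 1),  # likely, minor
    "wrong statement total":         (2, 3),  # possible, severe
}

# Rank risks by exposure, highest first, to direct testing effort.
for name, (p, c) in sorted(risks.items(), key=lambda kv: -exposure(*kv[1])):
    print(f"{name}: exposure {exposure(p, c)}")
```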
Risk management components (revisited)

• Bad things which could happen, and the probability of each
• The consequence of each bad thing which could happen
• ...any other dimensions?

Slide 4r of 51
Do risks have any other dimensions?

• In addition to probability and consequence...
• Undetectability:
  – difficulty of seeing a bad thing if it does happen
  – eg insidious database corruption
• Urgency:
  – advisability of looking for / preventing some bad things before other bad things
  – eg lack of requirements stability
• Any others?
Slide 6 of 51
2. “Traditional” use of risk in testing

• Few authors & trainers in software testing now miss an opportunity to link testing to “risk”
• In recent years, almost a mandatory mantra
• But it isn’t new (what will be new is translating the mantra into everyday doings!)
• Let’s look at the major “traditional” authors:
  – Hetzel
  – Myers
  – Beizer
  – others
Slide 7 of 51
Who wrote this and when?

• “Part of the art of testing is to know when to stop testing”:
  – some recent visionary / pragmatist?
  – Myers? His eponymous 1979 book was the first on testing?
  – No, No and No!
  – Fred Gruenberger, in the original testing book, Program Test Methods (1972-3, Ed. William C. Hetzel)
• Also in this book (which is the proceedings of the first-ever testing conference)...

Slide 8 of 51
Hetzel (Ed.) 1972-3

• Little / nothing explicitly on risk, but:
  – reliability as a factor in quality; inability to cope with the complexity of systems
  – “the probability of being faulty is great” (p255, Jean-Claude Rault, CRL France)...
  – “how to run the test for a given probability of error... number of random input combinations before... considered ‘good’” (p258); sampling as a principle of testing
• Interestingly:
  – “sampling as a principle should decrease in importance and be replaced by hierarchical organization & logical reduction” (p28, William C. Hetzel)
• Other curiosities:
  – ?source of Myers’ triangle exercise (p13, ref. Dr. Richard Hamming, “Computers and Society”)
  – the first “V-model”? (p172) Outside-in design, inside-out testing (Allan L. Scherr, IBM Poughkeepsie NY / his colleagues): SYSTEM DESIGN → COMPONENT DESIGN → MODULE DESIGN, then MODULE TEST → COMPONENT TEST → SYSTEM TEST → CUSTOMER USE
Slide 9 of 51
Myers 1976: Software Reliability - Principles & Practices

• Again, “risk” not explicit, but the principles are there:
  – “reliability must be stated as a function of the severity of errors as well as their frequency”; “software reliability is the probability that the software will execute for a period of time without a failure, weighted by the cost to the user of each failure”; “probability that a user will not enter a particular set of inputs that leads to a failure” (p7)
  – “if there is reason to believe that this set of test cases had a high probability of uncovering all possible errors, then the tests have established some confidence in the program’s correctness”; “each test case used should provide a maximum yield on our investment... the probability that the test case will expose a previously undetected error” (p170, 176)
  – “if a reasonable estimate of [the number of remaining errors in a program] were available during the testing stages, it would help to determine when to stop testing” (p329)
  – hazard function as a component of reliability models (p330)
Slide 10 of 51
Myers 1979: The Art of Software Testing

• Risk is still not in the index, but more principles:
  – “the earlier that errors are found, the lower are the costs of correcting... and the higher is the probability of correcting the errors correctly” (p18)
  – “what subset of all possible test cases has the highest probability of detecting the most errors” (p36)
  – tries to base completion criteria for each phase of testing on an estimate of the number of errors originating in particular design processes, and during what testing phases these errors are likely to be detected (p124)
  – testing adds value by increasing reliability (p5)
  – revisits / updates the reliability models outlined in his 1976 book:
    • those related to hardware reliability theory (reliability growth, Bayesian, Markov, tailored per program)
    • error seeding, statistical sampling theory
    • simple intuitive (parallel independent testers, historic error data)
    • complexity-based (composite design, code properties)
Slide 11 of 51
Hetzel (1984)-8: The Complete Guide to Software Testing

• Risk appears only once in the index, but is prominent:
  – Testing principle #4 (p24): Testing Is Risk-Based
    • the amount of testing depends on the risk of failure, or of missing a defect; so...
    • use risk to decide the number of cases, amount of emphasis, time & resources
• Other principles appear:
  – testing measures software quality; want maximum confidence per unit cost via maximum probability of finding defects (p255)
  – objectives of Testing In The Large include (p123):
    • are major failures unlikely?
    • what level of quality is good enough?
    • what amount of implementation risk is acceptable?
  – System Testing should end when we have enough confidence that Acceptance Testing is ready to start (p134)
Slide 12 of 51
Beizer 1984: Software System Testing & Quality Assurance

• Risk appears twice in the index, but both occurrences are insignificant
• However, some relevant principles are to be found:
  – smartness in software production is the ability to avoid past, present & future bugs (p2) (and bwgs?)
  – now more than a dozen models/variations in software reliability theory: but all far from reality, and all far from providing simple, pragmatic tools that can be used to measure software development (p292-293)
  – six specific criticisms: but if a theory were to overcome these then it would probably be too complicated to be practical (p293-294)
  – a compromise may be possible in future, but for now, suggest going live when the system is considered to be useful, or at least sufficiently useful to permit the risk of failure (p295)
  – plotting and extrapolation of S-curves to assess when this point is attained (p295-304)
Slide 13 of 51
Beizer (1983)-90: Software Testing Techniques

• The “risk” word is indexed as though deliberate:
  – a couple of occurrences are insignificant, but others:
    • the purpose of testing is not to prove anything but to reduce the perceived risk [of software not working] to an acceptable value (penultimate phase of attitude)
    • testing is not an act; it is a mental discipline which results in low-risk software without much testing effort (ultimate phase of attitude) (p4)
    • accepting principles of statistical quality control (but perhaps not yet implementing, because it is not yet obvious how to, and in the case of small products it is dangerous) (p6)
    • add test cases for transactions with high risks (p135)
    • we risk release when confidence is high enough (p6)
• Other occurrences of key principles, including:
  – probability of failure due to hibernating bwgs* low enough to accept (p26)
  – importance of a bwg* depends on frequency, correction cost, [fix] installation cost & consequences (p27)

*bwg: ghost, spectre, bogey, hobgoblin, spirit of the night, any imaginary (?) thing that frightens a person (Welsh)

Slide 14 of 51
Others

• The “traditional” period could be said to cover the 1970s and 1980s. A variety of views can be found:
  – Edward Miller 1978, in Software Testing & Validation Techniques (IEEE Tutorial):
    • “except under very special situations [...], it is important to recognise that program testing, if performed systematically, can serve to guarantee the absence of bugs” (p4)
    • and/but(?) “a program is well tested when the program tester has an adequately high level of confidence that there are no remaining ‘errors’ that further testing would uncover” (p9) (italics by Neil Thompson!)
  – DeMillo, McCracken, Martin & Passafiume 1987: Software Testing & Evaluation:
    • “a technologically sound approach to testing will incorporate... evaluations of software status into overall assessments of risk associated with the development and eventual fielding of the system” (p vii)
Slide 15 of 51
3. More recent contributions to (risk use) thinking

• The traditional basis of testing on risk (although more perceptive than some give credit for) is less than satisfactory because:
  – it tends to be “lip-service”, with no follow-through / practical application
  – if there is follow-through, it involves merely using risk analysis as part of the Testing Strategy (then that is shelved, and it’s “heads down” from then on?)
• Contributions more recently from (for example):
  – Ed Kit (Software Testing in the Real World, 1995)
  – Testing Maturity Model (Illinois Institute of Technology)
  – Test Process Improvement® (Tim Koomen & Martin Pol)
  – Testing Organisation Maturity™ questionnaire (Systeme Evolutif)
  – Hans Schaefer’s work
  – Zen and the Art of Object-Oriented Risk Management (Neil Thompson)
Slide 16x of 51
Kit 1995: Software Testing in the Real World

• Error-fault-failure chain extended (p18)
• Clear statements on risk and risk management (p26):
  – test the parts of the system whose failures would have the most serious consequences (p27)
  – frequent-use areas increase the chances of failures being found (p27)
  – focus on the parts most likely to have errors in them (p27)
  – risk is not only the basis for test management decisions; it is the basis for everyday test practitioner decisions (p28)
• Risk management used in integration testing (p95)

Slide 16.x1 of 51
Testing Maturity Model

• Five levels of increasing maturity, based loosely on decades of testing evolution (eg in the 1950s testing was not even distinguished from debugging)
• Maturity goals and process areas for the five levels do not include risk explicitly, although the emphasis moves from tactical to strategic (eg fault detection to prevention):
  – in level 1, software is released without adequate visibility of quality & risks
  – in level 3, test strategy is determined using risk management techniques
  – in level 4, software products are evaluated using quality criteria (relation to risk?)
  – in level 5, costs & test effectiveness are continually improved (sampling quality)
• Strongly recommended are key practices & subpractices:
  – (not yet available?)
• Little explicitly visible on risk; very process-oriented
Slide 17 of 51
Test Process Improvement®

• Only one entry in the index:
  – risks and recommendations, substantiated with metrics (as part of Reporting)
    • risks indicated with regard to (parts of) the tested object
    • risks can be (eg) delays, quality shortfalls
• But risks incorporated to some extent, eg:
  – Test Strategy (differentiation in test depth depending on risks)
Slide 18 of 51
Testing Organisation Maturity (TOM™) questionnaire

• Risk assessment is not only at the beginning:
  – when development slips, a risk assessment is conducted and a decision to squeeze, maintain or extend test time may be made
  – [for] tests that are descoped, the associated risks are identified & understood
Slide 19 of 51
Hans Schaefer’s work

• Squeeze on testing: prioritise based on risk
• Consider the possibility of stepwise release:
  – test the most important functions first
  – look for functions which can be delayed
• What is “important” in the potential release (key functions, worst problems?):
  – visibility (of function / characteristic)
  – frequency of use
  – possible cost of failure
• Where are the most problems likely? (See the sketch after this slide.)
  – project history (new technology, methods, tools; numerous people, dispersed)
  – product measures (areas complex, changed, needing optimising, faulty before)
Slide 20 of 51
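A hedged sketch of how Schaefer’s factors might be combined into a test ordering; the 1-3 scales, the additive scores and the final multiplication are my assumptions for illustration, not Schaefer’s published method:

```python
# Hedged sketch of Schaefer-style prioritisation. The 1-3 scales and
# the way the factors are combined are assumptions for illustration.

def importance(visibility: int, frequency_of_use: int, cost_of_failure: int) -> int:
    # "What is important": visibility, frequency of use, possible cost of failure.
    return visibility + frequency_of_use + cost_of_failure

def problem_likelihood(project_history: int, product_measures: int) -> int:
    # "Where are problems likely": project history and product measures.
    return project_history + product_measures

functions = {                 # name -> (vis, freq, cost, proj, prod)
    "payments":  (3, 3, 3, 2, 3),
    "audit log": (1, 2, 2, 1, 1),
}

def priority(vis, freq, cost, proj, prod):
    return importance(vis, freq, cost) * problem_likelihood(proj, prod)

# Test the most important, most problem-prone functions first.
for name, factors in sorted(functions.items(), key=lambda kv: -priority(*kv[1])):
    print(name, priority(*factors))
```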
Zen and the Art of Object-Oriented Risk Management

• Testing continues to lag development; but pessimism / delay was currently unacceptable (deregulation, Euro, Y2k)
• Could use OO concepts to help testing:
  – encapsulate risk information with tests (for effectiveness); and/or
  – inherit tests to reuse (efficiency)
• Basic risk management:
  – relationship to V-model (outline level)
  – detail level: test specification
Slide 21 of 51
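A minimal Python sketch of the two OO ideas on this slide, encapsulating risk information with a test and inheriting tests for reuse; all class and attribute names are illustrative, not from the presentation:

```python
# Sketch of encapsulating risk information with tests (effectiveness)
# and inheriting tests for reuse (efficiency). Names are illustrative.

class RiskBasedTest:
    def __init__(self, name: str, risk: str, probability: int, consequence: int):
        self.name = name
        self.risk = risk                           # the risk this test addresses
        self.exposure = probability * consequence  # encapsulated risk information

    def run(self) -> bool:
        raise NotImplementedError

class LoginTest(RiskBasedTest):
    def __init__(self):
        super().__init__("login", "users locked out", probability=2, consequence=3)

    def run(self) -> bool:
        return True  # real checks would go here

class LoginViaProxyTest(LoginTest):
    """Inherits (reuses) the login test to re-run it in another environment."""
    pass

tests = [LoginTest(), LoginViaProxyTest()]
for t in sorted(tests, key=lambda t: -t.exposure):  # highest exposure first
    print(t.name, t.risk, t.run())
```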
Risk management & the V-model

• Right side of the V: Unit testing → Integration testing → System testing → Acceptance testing
• Risks that system(s) have undetected defects in them
Slide 22a of 51
Risk management & the V-model (continued)

• Right side of the V: Unit testing → Integration testing → System testing → Acceptance testing
• Risks that system(s) have undetected defects in them
• Risks that system(s) and business(es) are not right and ready for each other
Slide 22b of 51
Where and how testing manages risks: first, at outline level

• Acceptance testing
  – Risks: service requirements; undetected errors damage business
• System testing
  – Risks: system specification; undetected errors waste user time & damage confidence in Acceptance testing
• Integration testing
  – Risks: interfaces don’t match; undetected errors too late to fix
• Unit testing
  – Risks: units don’t work right; undetected errors won’t be found by later tests
Slide 23a of 51
Where and how testing manages risks: first, at outline level (continued)

• Acceptance testing
  – Risks: service requirements; undetected errors damage business
  – How managed: specify user-wanted tests against the URS; script tests around the user guide and user & operator training materials
• System testing
  – Risks: system specification; undetected errors waste user time & damage confidence in Acceptance testing
  – How managed: use independent testers, functional & technical, to get a fresh view; take the last opportunity to do automated stress testing before environments are re-used
• Integration testing
  – Risks: interfaces don’t match; undetected errors too late to fix
  – How managed: use the skills of designers before they move away; take the last opportunity to exercise interfaces singly
• Unit testing
  – Risks: units don’t work right; undetected errors won’t be found by later tests
  – How managed: use the detailed knowledge of developers before they forget; take the last opportunity to exercise every error message
Slide 23b of 51
Second, at detail level: risk management during test specification

• To help decision-making during the “squeezing of testing”, it would be useful to record explicitly, as part of the specification of each test:
  – the type of risk the set of tests is designed to minimise
  – any specific risks at which a particular test or tests is aimed
• And this was one of the inputs to...
• Test specification based on the total magnitude of risks, for all defects imaginable:
  risk = (estimated probability of defect occurring) x (estimated severity of defect)
Slide 24 of 51
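The slide’s “total magnitude of risks” formula as a minimal sketch; the defect list and all numbers are invented for illustration:

```python
# Minimal sketch: total magnitude of risks, summed over all defects
# imaginable, each weighted as estimated probability x estimated severity.
# The defects and numbers below are invented for illustration.

imaginable_defects = [
    ("statement totals wrong",         0.2, 3),
    ("session timeout too aggressive", 0.5, 1),
    ("payment posted twice",           0.1, 3),
]

total_risk = sum(probability * severity
                 for _, probability, severity in imaginable_defects)
print(f"total magnitude of risks: {total_risk:.2f}")  # 1.40
```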
4. Risk-Based (E-Business) Testing

• Advert!
  – Artech House, 2002
  – ISBN 1-58053-314-0
  – reviews on amazon.com & amazon.co.uk
• Companion website www.riskbasedtesting.com:
  – sample chapters
  – Master Test Planning template
  – comments from readers (reviews, corrections)

With acknowledgements to lead author Paul Gerrard

Slide 25 of 51
Risk-Based E-Business Testing: main themes

• Can define approximately 100 “product” risks threatening a typical e-business system and its implementation and maintenance
• Test objectives can be derived almost directly as the “inverse” of risks
• Usable reliability models are some way off (perhaps even unattainable?), so better for now to work on the basis of stakeholders’ perceptions of risk
• Lists & explains techniques appropriate to each risk type
• Includes information on commercial and DIY tools
• Final chapters are on “making it happen”
• Go-live decision-making: when benefits “now” exceed risks “now”
• Written for e-business but the principles are portable; extended to a wider tutorial for EuroSTAR 2002; the following slides summarise key points

With acknowledgements to lead author Paul Gerrard

Slide 26 of 51
Risks and test objectives - examples

• Risk: the web site fails to function correctly on the user’s client operating system and browser configuration.
  Test objective: to demonstrate that the application functions correctly on selected combinations of operating systems and browser versions.
• Risk: bank statement details presented in the client browser do not match records in the back-end legacy banking systems.
  Test objective: to demonstrate that statement details presented in the client browser reconcile with back-end legacy systems.
• Risk: vulnerabilities that hackers could exploit exist in the web site networking infrastructure.
  Test objective: to demonstrate through audit, scanning and ethical hacking that there are no security vulnerabilities in the web site networking infrastructure.
Slide 27 of 51
Generic test objectives

• Demonstrate component meets requirements (Component Testing)
• Demonstrate component is ready for reuse in a larger sub-system (Component Testing)
• Demonstrate integrated components are correctly assembled/combined and collaborate (Integration Testing)
• Demonstrate system meets functional requirements (Functional System Testing)
• Demonstrate system meets non-functional requirements (Non-Functional System Testing)
• Demonstrate system meets industry regulation requirements (System or Acceptance Testing)
• Demonstrate supplier meets contractual obligations (Contract Acceptance Testing)
• Validate system meets business or user requirements (User Acceptance Testing)
• Demonstrate system, processes and people meet business requirements (User Acceptance Testing)
Slide 28 of 51
Master test planning
Slide 28.x1 of 51
Test Plans for each testing stage (example)

eg for System Testing:
• Risks F1-F4, U1-U3, S1-S2 map to Requirements F1-F3, N1-N2 and thence to Test objectives F1-F4, U1-U3, S1-S2, plus Generic test objectives G4 and G5
• Each risk carries a magnitude (the diagram shows values 125, 10, 60, 32, 12, 25, 30, 100 and 40)
• Tests are then ranked by importance:
  Test:            1  2  3  4  5  6  7  8  9  10  11 ...
  Test importance: H  L  M  H  H  H  M  M  M  L   L  ...
Slide 29 of 51
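A hypothetical sketch of the risk-to-objective-to-test traceability this diagram implies; the identifiers, exposure scores and H/M/L thresholds are illustrative, not the book’s actual data model:

```python
# Hypothetical traceability sketch: risks -> test objectives -> tests,
# with H/M/L importance derived from covered exposure. Identifiers,
# scores and thresholds are illustrative.

risks = {"F1": 125, "F2": 10, "U1": 60}            # risk id -> exposure score
objective_risks = {"TO-F1": ["F1"], "TO-U1": ["U1", "F2"]}
test_objectives = {"Test 1": ["TO-F1"], "Test 2": ["TO-U1"]}

def importance(test: str) -> str:
    total = sum(risks[r]
                for objective in test_objectives[test]
                for r in objective_risks[objective])
    return "H" if total >= 100 else "M" if total >= 30 else "L"

for test in test_objectives:
    print(test, importance(test))   # Test 1 H, Test 2 M
```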
Test Design: target execution schedule (example)

• Diagram: tests 1-11 scheduled across test execution days 1-15+, allocated to teams, environments and testers
• A partition for functional tests (eg balance & transaction reporting, inter-account transfers, payments, direct debits, end-to-end customer scenarios) is kept separate from a partition for disruptive non-functional tests
• Time is reserved at the end for retests & regression tests
• The schedule shows both an earliest completion date and a comfortable completion date
Slide 30 of 51
Plan to manage risk (and scope) during test specification & execution

• Diagram: the familiar trade-off triangles relating Quality, Time, Cost and Scope, with risk standing in for quality
• Of the corners, scope and time appear as the best pair to fine-tune
Slide 31x of 51
Risk as the “inverse” of quality

• Diagram: the Quality-Time-Scope triangle redrawn as a Risk-Time-Scope triangle, since risk falls as quality rises

Slide 31.x1 of 51
Managing risk during test specification

• Chart (risk vs time, within the Risk-Time-Scope triangle):
  – increasing risk as faults are introduced
  – then decreasing risk as faults are found
• Initial scope is set by requirements; the target go-live date is set in advance
Slide 31.x2 of 51
Verification & validation as risk management methods
Slide 31.x3 of 51
Risk management during test specification: “micro-risks”

• Test specification: for all defects imaginable...
  RISKS TO TEST AGAINST = estimated probability x estimated consequence
Slide 32ax of 51
Risk management clarifies during test execution

• Test specification: for all defects imaginable...
  RISKS TO TEST AGAINST = estimated probability x estimated consequence
• Test execution: for each defect detected...
  probability = 1; consequence = f(urgency, importance)
Slide 32bx of 51
But clarity during test execution is only close-range: fog ahead!

• Test specification: for all defects imaginable...
  RISKS TO TEST AGAINST = estimated probability x estimated consequence
• Test execution: for each failure detected...
  probability = 1; consequence = f(urgency, importance)
• ...and for all anomalies as yet undiscovered...
  REMAINING RISKS = estimated probability x estimated consequence
Slide 32cx of 51
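A hedged sketch of this clarification; the slide leaves f(urgency, importance) unspecified, so the simple averaging below is purely an assumption, as are all the numbers:

```python
# Sketch of risk "clarifying" during execution: detected failures take
# probability 1 and consequence f(urgency, importance); undiscovered
# anomalies remain as estimates. f and all values are assumptions.

def consequence(urgency: int, importance: int) -> float:
    return (urgency + importance) / 2   # one plausible f; the slide leaves f open

still_estimated = [(0.3, 3), (0.1, 2)]  # (probability, consequence) not yet seen
detected = [(2, 3), (1, 1)]             # (urgency, importance) of detected failures

remaining_risk = sum(p * c for p, c in still_estimated)
clarified_risk = sum(1 * consequence(u, i) for u, i in detected)

print(f"remaining risks (fog ahead): {remaining_risk}")     # 1.1
print(f"clarified risks (close-range): {clarified_risk}")   # 3.5
```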
Managing risk during test execution: when has risk got low enough?

• The same Risk-Time-Scope picture: increasing risk as faults are introduced, then decreasing risk as faults are found
• Initial scope set by requirements; target go-live date set in advance
Slide 32.x1 of 51
Pragmatic approximation to risk reduction: progress through “test+fix”

• Two cumulative charts (# vs date):
  – tests run: target vs actual, each split into pass / fail
  – anomalies: awaiting fix, resolved, deferred, closed
• Cumulative S-curves are good because they:
  – show several things at once
  – facilitate extrapolation of trends
  – are based on acknowledged theory and empirical data...

Slide 32.x2 of 51
Cumulative S-curves: theoretical basis

• Cumulative # of tests run and anomalies found, plotted against date up to go-live:
  – early tests blocked by hi-impact anomalies (much more than 1 anomaly per test, but only one visible at a time!)
  – middle tests fast and productive (more than 1 anomaly per test)
  – late tests slower because awaiting difficult and/or lo-priority fixes (much less than 1 anomaly per test)
• Alongside, the software reliability growth model: (potential) # of failures per day, depending on the operational profile, with separate hardware and software curves
• Above curves based on Myers 1976: Software Reliability: Principles & Practices, p10

Slide 32.x3 of 51
A possible variation on the software reliability growth model

• Potential # of software failures per day, plotted up to the go-live date
• Good testing & maintenance: convergence on stability
  – TEST & RETEST → failures → CHECK AGAINST EXPECTED RESULTS → DIAGNOSE → FIX (good maintenance)
• Bad testing & maintenance: divergence into instability
  – TEST & RETEST → failures → HAVE A THINK AND TALK TO FOLKS → STROKE CHIN → HACK (bad maintenance)
• Possibility of knock-on errors included in Littlewood & Verrall 1973: A Bayesian Reliability Growth Model for Computer Software (in IEEE Symposium)

Slide 32.x4 of 51
Cumulative S-curves: more theory

• Pattern of fault discovery: Dunn & Ullman 1982, p61 (cumulative # of tests run and anomalies found vs date, up to go-live; also shown as # per day)
• Actually there are several reliability growth models, but:
  – the Rayleigh model is part of hardware reliability methodology and has been used successfully in software reliability during development and testing
  – its curve produces the S-curve when accumulated

Slide 32.x5 of 51
Reliability theory more generally

• # failures per “t” vs execution time Σt:
  – hardware: Poisson distribution (Myers 1976)
  – software: exponential decay (Myers 1976)
• NB: models tend to use execution time rather than elapsed time (because it removes test distortion, and uses the operational profile, ie how often the software is used live)
• Both of our software curves are members of the Weibull distribution:
  – has been used for decades in hardware: Kan 1995, p179
  – two single-parameter cases applied to software by Wagoner in the 1970s: Dunn & Ullman 1982, p318
  – the shape parameter “m” can take various values; only m=1 (Exponential) and m=2 (Rayleigh) are shown here: Kan 1995, p180

Slide 32.x6 of 51
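A sketch of the two Weibull cases named above, m=1 (exponential) and m=2 (Rayleigh); K (total expected defects) and c (timescale) are illustrative parameters, not values from the references:

```python
# Weibull defect-discovery sketch: m=1 gives exponential decay, m=2 the
# Rayleigh curve whose accumulation is the S-curve. K and c illustrative.
import math

def weibull_rate(t: float, m: float, c: float, K: float = 100.0) -> float:
    """Defect discovery rate: the '# failures per t' curve."""
    return K * (m / c) * ((t / c) ** (m - 1)) * math.exp(-((t / c) ** m))

def weibull_cumulative(t: float, m: float, c: float, K: float = 100.0) -> float:
    """Cumulative defects found by execution time t (S-shaped when m=2)."""
    return K * (1.0 - math.exp(-((t / c) ** m)))

for t in range(0, 11, 2):
    print(t,
          round(weibull_rate(t, m=1, c=4.0), 2),   # exponential: falls from the start
          round(weibull_rate(t, m=2, c=4.0), 2))   # Rayleigh: rises, then falls
```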
Reliability models taxonomy (Beizer, Dunn & Ullman, + Musa)

(future addition to slide pack)

Slide 32.x7 of 51
Reliability: problems with theories (Beizer, 1984)

• The main problem is getting theories to match reality
• Several acknowledged shortcomings of many theories, eg:
  – don’t evaluate consequence (severity) of anomalies
  – assume testing is like live use (eg relatively few special cases)
  – don’t correct properly for stress-test effects, or code enhancements
  – don’t consider interactions between faults
  – don’t allow for debugging getting harder over time
• The science is moving on (eg Wiley’s Journal of Software Testing, Verification & Reliability), but:
  – a reliability theory that satisfied all the above would be complex
  – would project managers use it, or would they go live anyway?
• So until these are resolved, let’s turn to empirical data...

Slide 32.x8 of 51
Cumulative S-curves: empirical data

• Observed empirically that the faults plot takes a characteristic shape: Hetzel 1988, p210 (tests run and anomalies found, # vs date, up to go-live)
• S-curve also visible in Kit 1995: Software Testing in the Real World, p135
• Possible to use it to roughly gauge test time or faults remaining: Hetzel 1988, p210
• The Japanese “Project Bankruptcy” study (Abe, Sakamura & Aiso 1979, in Beizer 1984):
  – analysed 23 projects, including application software & system software developments
  – included new code, modifications to existing code, and combinations
  – remarkable similarity across all projects for the shape of the test completion curve
  – anomaly detection rates not significant (eg low could mean good software or bad testing)
  – significant were (a) length of initial slow progress, and (b) shape of anomaly detection curve...

Slide 32.x9 of 51
Project bankruptcy study, summarised by Beizer

• All the projects had 3 phases (Beizer 1984: Software System Testing & Quality Assurance, p297-300), running from the start of the integration stage to 100% of tests run, against the planned go-live date
• (a) Duration of phases was the primary indicator of project success or “bankruptcy”:
  – success profile: Phase I ends at ~15% of the planned period, Phase II at ~55%, Phase III at ~97%
  – “bankrupt” profile (failure inevitable): Phases I+II consume ~72%, with completion projected at ~126%
• (b) A secondary indicator was the anomaly detection rate (its derivative looks like a Rayleigh curve), with distinct shapes for success, failure inevitable, and “bankrupt”

Slide 32.x10 of 51
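A rough, illustrative sketch of indicator (a); the percentage thresholds are those quoted above, but the classification rule itself is my simplification of the study:

```python
# Sketch of bankruptcy indicator (a): compare observed phase boundaries
# (fractions of the planned test period) with the quoted profiles.
# Thresholds are from the slide; the rule is a simplification.

def classify(phase1_end: float, phase2_end: float) -> str:
    if phase1_end <= 0.15 and phase2_end <= 0.55:
        return "success profile (phase III ends ~97%)"
    if phase2_end >= 0.72:
        return '"bankrupt" profile (projected completion ~126% of plan)'
    return "in between: at risk"

print(classify(0.14, 0.50))   # success profile
print(classify(0.30, 0.75))   # bankrupt profile
```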
S-curves & reliability: summary of references

• Many of the original references are very old:
  – illustrates the points that (a) the earliest testers seemed to be “advanced” even at the outset, and (b) software reliability seems not to have penetrated mainstream testing 30 years on!
  – but it means these books & papers are hard to obtain, so...
• Recommended “recent” references:
  – Boris Beizer 1984: Software System Testing & Quality Assurance (ISBN 0-442-21306-9)
  – Stephen H. Kan 1995: Metrics & Models in Software Quality Engineering (ISBN 0-201-63339-6)
  – John Musa 1999: Software Reliability Engineering (ISBN 0-07-913271-5)

Slide 32.x11 of 51
Progress through tests

• We are interested in two main aspects:
  – can we manage the test execution to get complete before the target date?
  – if not, can we do it for those tests of high (and medium?) importance?
• Charts (# tests vs date): target vs actual tests run (split fail / pass), and target vs actual tests passed (split by High / Medium / Low importance)
Slide 33 of 51
Progress through anomaly fixing and retesting

• Similarly, two main aspects:
  – can we manage the workflow to get anomalies fixed and retested before the target date?
  – if not, can we do it for those of material impact?
• Charts (# anomalies vs date): cumulative anomalies (awaiting fix, resolved, deferred, closed), and outstanding anomalies of material impact
Slide 34x of 51
Quantitative and qualitative risk reduction from tests and retests

• Progress through tests (pass / fail, by H / M / L importance)
• Progress through incident fixing & retesting (awaiting fix, resolved, deferred, closed; material impact outstanding)
• Progress & residual risk up the right side of the W-model: Large-Scale Integration Testing → System Testing → Acceptance Testing
Slide 35 of 51
…and also into regression testing

• Specification of tests: Testing Strategy → Testing Plan → Test Design → Test Scripts, driven by desired coverage within time & resource constraints
• Execution of tests, then retesting & regression testing: Execution: first run (“complete” first run) → Execution: retest & 2nd run → Regression Testing → ...
• Difficulties and the squeeze on scripting & execution time, plus physical constraints (environments etc.), determine what’s left for: the second run, and the allowance for further runs
Slide 36 of 51
Managing risk during test execution, against “fixed” scope & time

• The Risk-Time-Scope picture again: increasing risk as faults are introduced, decreasing risk as faults are found; initial scope set by requirements, target go-live date set in advance
Slide 37x of 51
From what we can report, to what we would like to report
Slide 37.1x of 51
Risk-based reporting

• Progress through the test plan, from start to planned end: all risks ‘open’ at the start
Slide 38a of 51
Risk-based reporting (continued)

• Progress through the test plan: from start, through today, to planned end
• Residual risks decline as testing progresses; the chart shows the residual risks of releasing TODAY
Slide 38b of 51
Report not only risk, but also scope, over time

• The Risk-Time-Scope picture again: increasing risk as faults are introduced, decreasing risk as faults are found; initial scope set by requirements, target go-live date set in advance
• Key question: how much scope has been safely delivered so far?

Slide 38.1x of 51
Benefit & objectives based test reporting

• Diagram: risks (open / closed) link to test objectives, which link to benefits; project objectives, hence benefits, available for release
• While risks remain open, the related objectives are blocked (eg blocked function 1, blocked function 2, blocked NFR 1) and their benefits are not yet available

Slide 39ax of 51
Benefit & objectives based test reporting (continued)

• The same diagram later in testing: as risks are closed, objectives unblock, and more project objectives, hence benefits, become available for release

Slide 39bx of 51
Slippages and trade-offs: an example

• If Test Review Boards recommend delay, management may demand a trade-off: “slip in a little of that descoped functionality”
• Chart: scope vs date, showing the original target and the first slip
Slide 40a of 51
Slippages and trade-offs: an example (continued)

• If Test Review Boards recommend delay, management may demand a trade-off: “slip in a little of that descoped functionality”
• This adds benefits but also new risks: more delay?
• Chart: scope vs date, showing the original target, the first slip and the actual go-live
Slide 40b of 51
Tolerable risk-benefit balance: another example

• Even if we resist the temptation to trade off slippage against scope, we may still need to renegotiate the tolerable level of risk balanced against benefits
• Chart: net risk (risk - benefits) vs date, against the original target date and the original target net risk
Slide 41a of 51
Tolerable risk-benefit balance: another example (continued)

• Even if we resist the temptation to trade off slippage against scope, we may still need to renegotiate the tolerable level of risk balanced against benefits
• Chart: net risk (risk - benefits) vs date; a “go for it” margin opens between the original target net risk and the actual go-live date
Slide 41b of 51
5. Next steps in Risk-Based Testing

• End-to-end risk data model:
  – is wanted to keep risk information linked to tests
  – is under development by Paul Gerrard
  – the main reason is of course...
• Automation:
  – clerical methods and spreadsheets are really not enough
  – want to keep the risk-to-test link up-to-date through descopes, reprioritisations etc
  – Paul is working on an Access database for now
  – any vendors with risk in their data models?

With acknowledgements to lead author Paul Gerrard

Slide 42 of 51
6. Refinements and ideas for future

• Although almost universal, the simple multiplication of probability x consequence can be troublingly over-simple: it might descope testing for huge-impact risks which are very unlikely (avionics errors?!). So use an asymmetric view?
• Some risks come from technology, and other risks are business risks to the use of the system. So distinguish “cause” risks from “effect” risks?
• Assessing perceptions of risk is a start, but can metrics give better quantification? Metrics & fault source analysis
• Reliability models were a key part of testing theory in the 1970s, but are still not credibly usable? Reliability engineering
• Wider theoretical basis distinguishing risk from uncertainty: decision theory
Slide 43 of 51
Asymmetric view of risk probability & consequence

• Instead of multiplying, give each cell of the 3x3 PROBABILITY (likelihood) x CONSEQUENCE (impact) grid its own exposure rank from 1 to 9, weighting consequence more heavily than probability
• risk EXPOSURE bands: very high (red), high (orange), high-ish (yellow), moderate (pale yellow), low (pale green), very low (green)
• This is an arbitrary scheme; others are possible, of course

Slide 44 of 51
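A hedged sketch of such an asymmetric grid; the transcript scatters the slide’s cell values, so this table is one plausible layout consistent with the digits shown (1-9, with consequence dominating):

```python
# One plausible asymmetric grid consistent with the slide's digits:
# each (probability, consequence) cell gets its own rank 1-9, with
# consequence weighted more heavily than probability.

ASYMMETRIC = {   # (probability, consequence) -> exposure rank
    (1, 1): 1, (2, 1): 2, (3, 1): 3,
    (1, 2): 4, (2, 2): 5, (3, 2): 6,
    (1, 3): 7, (2, 3): 8, (3, 3): 9,
}

def exposure(probability: int, consequence: int) -> int:
    return ASYMMETRIC[(probability, consequence)]

# A very unlikely but huge-impact risk (avionics!) now outranks a
# likely-but-minor one, which symmetric multiplication would tie (3 vs 3).
print(exposure(1, 3), ">", exposure(3, 1))   # 7 > 3
```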
Distinguishing “cause” risk (from technology) & “effect” risk (to business)

• Two linked exposures, each with its own probability and consequence:
  – RISKS FROM TECHNOLOGY: of faults being made in documents or software
  – RISKS TO BUSINESS: from faulty documents or software
• Each could be symmetric or asymmetric
• Could weight:
  – business risks higher for Acceptance Testing
  – technology risks higher for Unit & Integration Testing
  – equal for System Testing
Slide 45 of 51
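A sketch of level-dependent weighting of the two exposures; the slide says only which side to weight higher at each level, so the numeric weights are assumptions:

```python
# Level-dependent blend of "cause" (technology) and "effect" (business)
# exposures. Which side dominates at each level follows the slide;
# the numeric weights themselves are assumptions.

WEIGHTS = {                   # level -> (technology weight, business weight)
    "unit":        (0.7, 0.3),
    "integration": (0.7, 0.3),
    "system":      (0.5, 0.5),
    "acceptance":  (0.3, 0.7),
}

def combined_exposure(level: str, technology: float, business: float) -> float:
    w_tech, w_bus = WEIGHTS[level]
    return w_tech * technology + w_bus * business

print(combined_exposure("acceptance", technology=6, business=9))  # 8.1
```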
But what about next time: metrics?

• Test specification: for all defects imaginable...
  RISKS TO TEST AGAINST = estimated probability x estimated consequence
• Test execution: for each failure detected...
  probability = 1; consequence = f(urgency, importance)
• ...and for all anomalies as yet undiscovered...
  REMAINING RISKS = estimated probability x estimated consequence
Slide 46a of 51
How metrics can help

• Risks to business: defects reported, fixed & retested, by importance (Low / Medium / High)
• Risks to testing progress: defects reported, fixed & retested, by urgency (Low / Medium / High)
Slide 47a of 51
But what about next time: metrics? (repeat of slide 46a)
Slide 46b of 51
How metrics can help (continued)

• Testing progress: tests executed by each day of testing for a level (cumulative, number vs time)
• Testing productivity: defects found by each day of testing for a level (cumulative, number vs time)
• Risks to business: defects reported, fixed & retested, by importance (Low / Medium / High)
• Risks to testing progress: defects reported, fixed & retested, by urgency (Low / Medium / High)
Slide 47b of 51
But what about next time: metrics? (repeat of slide 46a)
Slide 46c of 51
How metrics can help (continued)

• Testing progress: tests executed by each day of testing for a level (cumulative, number vs time)
• Testing productivity: faults found by each day of testing for a level (cumulative, number vs time)
• Risks to business: defects reported, fixed & retested, by importance (Low / Medium / High)
• Risks to testing progress: defects reported, fixed & retested, by urgency (Low / Medium / High)
• Fault source analysis: source of faults (Requirements, System specification, Design, Coding) by the testing level detecting them (Unit, Integration, System, Acceptance)
Slide 47c of 51
Fault Source Analysis

• Matrix: phase of lifecycle in which the error was made which caused the fault (Requirements, System specification, Design, Coding) against where the fault is detected (Unit Testing, Integration Testing, System & LSI Testing, Acceptance Testing, Pilot, Live Running)
• Faults found at the test level matched to their source are detected as intended; at an earlier level, detected early; at a later level, detected late
• Faults reaching Pilot (a rehearsal of Acceptance Tests) or Live Running are unacceptably late
Slide 48 of 51
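An illustrative sketch of the matrix as a classification rule; the source and level orderings follow the slide, while the function and mapping names are mine:

```python
# Fault source analysis sketch: classify a fault by comparing the level
# that detected it with the level matched to its lifecycle source.
# Orderings follow the slide; helper names are illustrative.

SOURCE_TO_INTENDED_LEVEL = {
    "coding": 0, "design": 1, "system specification": 2, "requirements": 3,
}
LEVELS = ["unit", "integration", "system & LSI", "acceptance", "pilot", "live"]

def classify(source: str, detected_at: str) -> str:
    intended = SOURCE_TO_INTENDED_LEVEL[source]
    actual = LEVELS.index(detected_at)
    if actual >= LEVELS.index("pilot"):
        return "unacceptably late"
    if actual == intended:
        return "detected as intended"
    return "detected early" if actual < intended else "detected late"

print(classify("requirements", "unit"))     # detected early
print(classify("coding", "system & LSI"))   # detected late
print(classify("design", "live"))           # unacceptably late
```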
Reliability engineering

• A 1999 view of reliability (John D. Musa):
  – “division between hardware & software reliability is somewhat artificial... you may combine... to get system reliability” (p35)
  – to model software reliability, consider:
    • fault introduction (development process, product characteristics)
    • fault removal (failure discovery, quality of repair)
    • environment (operational profiles)
  – define the necessary reliability
  – execution time increases reliability
• Other books and sources

Slide 49 of 51
Decision theory

• Decision theory is a body of knowledge and related analytical techniques, of differing degrees of formality, designed to help a decision-maker choose among a set of alternatives in light of their possible consequences. Decision theory can apply to conditions of certainty, risk or uncertainty
• Leads to consideration of utility value, game theory, separating information from noise, etc
• Bayesian reliability modelling was in Myers 1976, but is still being discussed now as new & exciting (because of advances in algorithms & computation?)
Slide 50 of 51
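A minimal decision-theoretic sketch of the go-live choice as an expected-utility comparison; the probabilities, utilities and actions are invented for illustration:

```python
# Expected-utility sketch of a go-live decision under risk. All the
# probabilities, utilities and actions are invented for illustration.

states = {"no serious latent fault": 0.8, "serious latent fault": 0.2}

utility = {                                  # utility[action][state]
    "go live now":       {"no serious latent fault": 100, "serious latent fault": -300},
    "test another week": {"no serious latent fault":  80, "serious latent fault":  -50},
}

def expected_utility(action: str) -> float:
    return sum(p * utility[action][state] for state, p in states.items())

for action in utility:
    print(action, expected_utility(action))           # 20.0 vs 54.0
print("choose:", max(utility, key=expected_utility))  # test another week
```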
Summary

• Risk has been part of testing for longer than many people think: about 30 years?
• Main messages today:
  – all testing should be based on risk
  – risk is difficult to calculate, so use the perceptions of stakeholders, and broker consensus
  – “enough” testing has been done when (benefits - risks) of going live today is a positive quantity
• But there still seems to be far to go before we can use risk scientifically in testing on an everyday basis, and we’re just getting to grips with the art!
• Plenty of fun to come...
Slide 51x of 51