bbst2_2004.ppt
Transcript of bbst2_2004.ppt
7/25/2019 bbst2_2004.ppt
http://slidepdf.com/reader/full/bbst22004ppt 1/29
Black Box Software Testing Copyright © 2003 Cem Kaner & James
Black Box Software Testing
Fall 2004
by
Cem Kaner, J.D., Ph.D., Professor of Software Engineering
Florida Institute of Technology
and
James Bach, Principal, Satisfice Inc.
Copyright (c) Cem Kaner & James Bach, 2000-2004
This work is licensed under the Creative Commons Attribution-ShareAlike License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/2.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
These notes are partially based on research that was supported by NSF Grant EIA-0113539 ITR/SY+PE: "Improving the Education of Software Testers." Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Black Box Software Testing
Part 2
Complete Testing Is Impossible
Complete testing?
What do we mean by "complete testing"?
• Complete "coverage": Tested every line / branch / basis path?
• Testers not finding new bugs?
• Test plan complete?
• After all, if there are more bugs, you can find them if you do more testing. So testing couldn't yet be "complete."
Complete testing must mean that, at the end of
testing, you know there are no remaining
unknown bugs.
Complete coverage?
Some people (try to) simplify away the problem of complete testing by advocating "complete coverage."
• What is coverage?
– Extent of testing of certain attributes or pieces of the program, such as statement coverage or branch coverage or condition coverage.
– Extent of testing completed, compared to a population of possible tests.
• Typical definitions are oversimplified. They miss, for example,
– Interrupts and other parallel operations
– Interesting data values and data combinations
– Missing code
• The number of variables we might measure is stunning. I (Kaner) listed 101 examples in Software Negligence & Testing Coverage.
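As a small illustration of how a coverage figure can oversimplify, here is a minimal sketch in Python. The function `average_rate` and both checks are hypothetical, invented for this example. A single test executes every statement of the function, yet the function still fails on an interesting data value, because the code that should handle that value is missing.

```python
def average_rate(total, count):
    """Return total / count. Hypothetical example function.

    There is no guard for count == 0. That is "missing code":
    statement or branch coverage cannot report it, because there is
    no line or branch to count.
    """
    rate = total / count
    return rate


def run_covering_test():
    """One test that executes every statement of average_rate."""
    return average_rate(10, 4) == 2.5


def probe_interesting_value():
    """Try a data value that the covering test never exercised."""
    try:
        average_rate(10, 0)
    except ZeroDivisionError:
        return "fails on count == 0"
    return "no failure"


if __name__ == "__main__":
    print("covering test passed:", run_covering_test())
    print("interesting value:", probe_interesting_value())
```

A coverage tool would report the first test as complete statement coverage of `average_rate`, which says nothing about the input that breaks it.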
Measuring and achieving high coverage
Coverage measurement is a good tool to show how far you are from complete testing.
• But it's a lousy tool for investigating how close you are to completion.
• Driving testing to achieve "high" coverage is likely to yield a mass of low-power tests.
– People optimize what we measure them against, at the expense of what we don't measure.
• For more on measurement distortion and dysfunction, read Bob Austin's book, Measurement and Management of Performance in Organizations.
– Brian Marick discusses this and several related problems in his papers at www.testing.com (e.g. How to Misuse Code Coverage). Marick has been involved in development of several of the commercial coverage tools.
What about bug find rates?
Some people measure completeness of testing with bug curves:
• New bugs found per week ("Defect arrival rate")
• Bugs still open (each week)
• Ratio of bugs found to bugs fixed (per week)
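The three curves listed above are simple to derive from weekly counts. The sketch below is only an illustration: the function name `bug_curves` is hypothetical, and the weekly numbers are invented, not data from any project.

```python
def bug_curves(found_per_week, fixed_per_week):
    """Compute the three weekly series named above.

    found_per_week[i] and fixed_per_week[i] are counts for week i.
    Returns a list of (arrival_rate, still_open, found_to_fixed_ratio);
    the ratio is None in a week with no fixes.
    """
    rows = []
    still_open = 0
    for found, fixed in zip(found_per_week, fixed_per_week):
        still_open += found - fixed
        ratio = found / fixed if fixed else None
        rows.append((found, still_open, ratio))
    return rows


if __name__ == "__main__":
    # Hypothetical counts, invented purely for illustration.
    found = [5, 12, 20, 14, 6]
    fixed = [0, 8, 10, 16, 10]
    for week, row in enumerate(bug_curves(found, fixed), start=1):
        print(week, row)
```

Nothing in these series says how many bugs remain unfound, which is why they measure activity rather than completeness.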
Weibull reliability model
Bug curves can be useful progress indicators, but some people fit the data to theoretical curves to determine when the project will complete.
The model's assumptions
• Testing occurs in a way that is similar to the way the software will be operated.
• All defects are equally likely to be encountered.
• All defects are independent.
• There is a fixed, finite number of defects in the software at the start of testing.
• The time to arrival of a defect follows the Weibull distribution.
• The number of defects detected in a testing interval is independent of the number detected in other testing intervals for any finite collection of intervals.
– See Erik Simmons, When Will We Be Done Testing? Software Defect Arrival Modelling with the Weibull Distribution.
The Weibull model
I think it's absurd to rely on a distributional model (or any model) when every assumption it makes about testing is obviously false.
• One of the advocates of this approach points out that "Luckily, the Weibull is robust to most violations."
– This illustrates the use of surrogate measures: we don't have an attribute description or model for the attribute we really want to measure, so we use something else, that is allegedly "robust", in its place. This can be very dangerous.
– The Weibull distribution has a shape parameter that allows it to take a very wide range of shapes. If you have a curve that generally rises then falls (one mode), you can approximate it with a Weibull.
BUT WHAT DOES THAT TELL US? HOW SHOULD WE INTERPRET IT?
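To make the point about the shape parameter concrete, here is a minimal sketch of the standard two-parameter Weibull density, f(t) = (k/λ)(t/λ)^(k-1) exp(-(t/λ)^k), with shape k and scale λ. The function names and the parameter values tried are arbitrary choices for illustration and are not taken from Simmons's paper.

```python
import math


def weibull_pdf(t, shape, scale=1.0):
    """Density of the two-parameter Weibull distribution for t > 0."""
    z = t / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))


def weibull_mode(shape, scale=1.0):
    """Location of the peak; for shape <= 1 the density only falls."""
    if shape <= 1:
        return 0.0
    return scale * ((shape - 1) / shape) ** (1 / shape)


if __name__ == "__main__":
    for k in (0.5, 1.0, 1.5, 3.0):
        samples = [round(weibull_pdf(t, k), 3) for t in (0.25, 0.5, 1.0, 2.0)]
        print(f"shape={k}: mode={weibull_mode(k):.3f} pdf={samples}")
```

With a shape of 1 or less the curve only decays; with a larger shape it rises and then falls, and the peak moves as the shape changes. Because so many single-peaked curves can be matched this way, a good fit by itself says little about whether the model's assumptions hold.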
Side effects of bug curves
Earlier in testing, the pressure is to increase bug counts. In response, testers will:
• Run tests of features known to be broken or incomplete.
• Run multiple related tests to find multiple related bugs.
• Look for easy bugs in high quantities rather than hard bugs.
• Put less emphasis on infrastructure, automation architecture, and tools, and more emphasis on bug finding. (Short term payoff but long term inefficiency.)
– For more on measurement dysfunction, read Bob Austin's book, Measurement and Management of Performance in Organizations.
– For more observations of problems like these in reputable software companies, see Doug Hoffman's article, The Dark Side of Software Metrics.
Side effects of bug curves
Later in testing, the pressure is to decrease the new bug rate:
• Run lots of already-run regression tests.
• Don't look as hard for new bugs.
• Shift focus to appraisal, status reporting.
• Classify unrelated bugs as duplicates.
• Classify related bugs as duplicates (and close them), hiding key data about the symptoms / causes of the problem.
• Postpone bug reporting until after the measurement checkpoint (milestone). (Some bugs are lost.)
• Report bugs informally, keeping them out of the tracking system.
• Testers get sent to the movies before measurement checkpoints.
• Programmers ignore bugs they find until testers report them.
• Bugs are taken personally.
• More bugs are rejected.
Bad models are counterproductive
Testers live and breathe tradeoffs
The time needed for test-related tasks is infinitely larger than the time available.
Example: Time you spend on
- analyzing, troubleshooting, and effectively describing a failure
is time no longer available for
- Designing tests
- Documenting tests
- Executing tests
- Automating tests
- Reviews, inspections
- Supporting tech support
- Retooling
- Training other staff
Let's consider the nature of the infinite set of tests
There are enormous numbers of possible tests. To test everything, you would have to:
• Test every possible input to every variable (including output variables and intermediate results variables).
• Test every possible combination of inputs to every combination of variables.
• Test every possible sequence through the program.
• Test every hardware / software configuration, including configurations of servers not under your control.
• Test every way in which any user might try to use the program.
» Read The Impossibility of Complete Testing
Inputs to individual variables
Consider the "valid" inputs
Doug Hoffman worked for MASPAR (the Massively Parallel computer, 64K parallel processors). This machine is used for mission-critical and life-critical applications.
– To test the 32-bit integer square root function, Hoffman checked all values (all 4,294,967,296 of them). It took the computer about 6 minutes to run the tests and compare the results to an oracle.
– There were 2 (two) errors, neither of them near any boundary. (The underlying error was that a bit was sometimes mis-set, but in most error cases, there was no effect on the final calculated result.) Without an exhaustive test, these errors probably wouldn't have shown up.
– What about the 64-bit integer square root? How could we find the time to run all of these? If we don't run them all, don't we risk missing some bugs?
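The exhaustive approach Hoffman used can be sketched at a smaller scale. The code below is an illustration, not MasPar's code: `isqrt_under_test` is a hypothetical bit-by-bit integer square root standing in for the function under test, the oracle is Python's `math.isqrt`, and the input width is cut to 16 bits so the sweep finishes quickly on an ordinary machine.

```python
import math


def isqrt_under_test(n):
    """Hypothetical implementation under test: bit-by-bit integer sqrt."""
    result = 0
    bit = 1 << 30  # highest power of four that fits in a 32-bit input
    while bit > n:
        bit >>= 2
    while bit:
        if n >= result + bit:
            n -= result + bit
            result = (result >> 1) + bit
        else:
            result >>= 1
        bit >>= 2
    return result


def exhaustive_check(bits=16):
    """Compare every input of the given width against the oracle."""
    failures = []
    for n in range(1 << bits):
        if isqrt_under_test(n) != math.isqrt(n):
            failures.append(n)
    return failures


if __name__ == "__main__":
    print("mismatches in 16-bit range:", exhaustive_check(16))
    # At the reported rate (2**32 values in about 6 minutes),
    # a 64-bit sweep would take 2**32 times as long.
    years = 6 * 2 ** 32 / (60 * 24 * 365.25)
    print(f"64-bit sweep at that rate: about {years:,.0f} years")
```

Assuming the same throughput, the 64-bit sweep comes to roughly 49,000 years, which is why the exhaustive strategy that caught the two 32-bit errors does not scale to the next word size.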
Inputs to individual variables
More complex examples
• Easter Eggs
– Bizarre inputs, by design
• Edited inputs
– These can be quite complex. How much editing is enough?
• Variations on input timing
– Try entering the data very quickly, or very slowly. Enter them before, after, and during the processing of some other event, or just as the time-out interval for this data item is about to expire.
• Now, what about all the error handling that you can trigger with "invalid" inputs?
– Think about Whittaker & Jorgensen's constraint-focused attacks (Whittaker, How Software Fails)
– Think about Jorgensen's hostile data stream attacks
When people challenge extreme value tests…
"No user would do that."
"No user I can think of, who I like, would do that on purpose."
Who aren't you thinking of?
Who don't you like who might really use this product?
What might good users do by accident?
Combination testing
Variables interact.
• Example 1: a program crashed when attempting to print preview a high resolution (back then, 600x600 dpi) output on a high resolution screen. The option selections for printer resolution and screen resolution were interacting.
• Example 2: American Airlines couldn't print tickets if a string concatenating the fares associated with all segments was too long.
• Example 3: Memory leak in WordStar if text was marked Bold / Italic (rather than Italic / Bold)
Combination testing
Variables interact.
Suppose there are N variables.
• Label the number of choices for the variables as V1, V2 through VN.
• The total number of possible combinations is V1 x V2 x . . . x VN. This is huge.
– A field that accepts only {1, 2, 3} and another that accepts only {A, B, C} yields 9 cases: 1A, 1B, 1C, 2A, 2B, 2C, 3A, 3B, & 3C.
– Combining two fields that accept one digit (0 to 9) each yields 10 x 10 = 100 possible combinations.
– There are 318,979,564,000 possible combinations of the first four moves in chess.
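The product rule above can be checked directly on the two small examples. This is a minimal sketch; the function names are hypothetical, and the ten-field case at the end is an invented extension to show how fast the count grows.

```python
import itertools
import math


def count_combinations(choices_per_variable):
    """Product V1 x V2 x ... x VN of the number of choices."""
    return math.prod(len(choices) for choices in choices_per_variable)


def all_combinations(choices_per_variable):
    """Enumerate every combination; only feasible for tiny inputs."""
    return ["".join(combo) for combo in itertools.product(*choices_per_variable)]


if __name__ == "__main__":
    fields = [["1", "2", "3"], ["A", "B", "C"]]
    print(count_combinations(fields), all_combinations(fields))
    digits = [list("0123456789")] * 2
    print(count_combinations(digits))
    # Hypothetical larger case: ten independent fields of ten values each.
    print(count_combinations([list("0123456789")] * 10))
```

Counting is cheap even when enumeration is not: ten one-digit fields already give ten billion combinations, far more than anyone could run one at a time.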
Sequences
[Figure: control-flow graph with nodes A, B, C, D, E, F, G, H, I, X and EXIT; the loop from X back to A is labeled as running up to 20 times.]
Here's an example that shows that there are too many paths to test in even a fairly simple program. This is from Myers, The Art of Software Testing.
Sequences
Myers' example in pseudocode
The program starts at A.
From A it can go to B or C
From B it goes to X
From C it can go to D or E
From D it can go to F or G
From F or from G it goes to X
From E it can go to H or I
From H or from I it goes to X
From X the program can go to EXIT or back to A. It can go back to A no more than 19 times.
One path is ABX-Exit. There are 5 ways to get to X and then to the EXIT in one pass.
Another path is ABXACDFX-Exit. There are 5 ways to get to X the first time, 5 more to get back to X the second time, so there are 5 x 5 = 25 cases like this.
Sequences
Analyzing Myers' example
• There are 5^1 + 5^2 + ... + 5^19 + 5^20 ≈ 10^14 = 100 trillion paths through the program.
• It would take only a billion years to test every path (if one could write, execute and verify a test case every five minutes).
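The path count can be reproduced from the pseudocode description of the graph. The sketch below is one way to do it, with hypothetical function names: it lists the routes from A to X in a single pass, then sums the number of paths that make 1 to 20 passes before exiting.

```python
# Edges of the graph as described in the pseudocode for Myers' example.
GRAPH = {
    "A": ["B", "C"],
    "B": ["X"],
    "C": ["D", "E"],
    "D": ["F", "G"],
    "E": ["H", "I"],
    "F": ["X"],
    "G": ["X"],
    "H": ["X"],
    "I": ["X"],
}


def single_pass_paths(node="A", prefix=""):
    """List every route from A to X within one pass of the loop."""
    prefix += node
    if node == "X":
        return [prefix]
    paths = []
    for nxt in GRAPH[node]:
        paths.extend(single_pass_paths(nxt, prefix))
    return paths


def total_paths(max_passes=20):
    """Sum of ways**k for k = 1..max_passes (exit after any pass)."""
    ways = len(single_pass_paths())
    return sum(ways ** k for k in range(1, max_passes + 1))


if __name__ == "__main__":
    print(single_pass_paths())
    paths = total_paths()
    minutes_per_year = 60 * 24 * 365.25
    print(paths, "paths;", paths * 5 / minutes_per_year, "years at 5 min each")
```

The exact sum is (5^21 - 5) / 4, about 1.2 x 10^14, which matches the 100 trillion figure to the nearest power of ten; at five minutes per test that is on the order of a billion years.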
Phone System: The Telenova Stack Failure
Telenova Station Set 1. Integrated voice and data. 108 voice features, 110 data features. 1985.
The Telenova Stack Failure
Context-sensitive display
10-deep hold queue
10-deep wait queue
The Telenova Stack Failure
The bug that triggered the simulation:
Beta customer (a stock broker) reported random failures
• Could be frequent at peak times
• An individual phone would crash and reboot, with other phones crashing while the first was rebooting
• On a particularly busy day, service was disrupted all (East Coast) afternoon
We were mystified:
• All individual functions worked
• We had tested all lines and branches.
Ultimately, we found the bug in the hold queue
• Up to 10 calls on hold, each adds a record to the stack
• Initially, the system checked the stack whenever a call was added or removed, but this took too much system time. So we dropped the checks and added these:
– Stack has room for 20 calls (just in case)
– Stack reset (forced to zero) when we knew it should be empty
• The error handling made it almost impossible for us to detect the problem in the lab. Because we couldn't put more than 10 calls on the stack (unless we knew the magic error), we couldn't get to 21 calls to cause the stack overflow.
The Telenova Stack Failure
A simplified state diagram showing the bug
[Figure: state diagram with states Idle, Ringing, Connected and On Hold; transitions labeled "Caller hung up" and "You hung up".]
Telenova Stack Failure
[Figure: state diagram with states Idle, Ringing, Connected and On Hold; transitions labeled "Caller hung up" and "You hung up".]
Telenova Stack Failure
Why are we spending so much time on this example?
• Because it illustrates several important points:
– Simplistic approaches to path testing can miss critical defects.
– Critical defects can arise under circumstances that appear (in a test lab) so specialized that you would never intentionally test for them.
– When (in some future course or book) you hear a new methodology for combination testing or path testing, I want you to test it against this defect. If you had no suspicion that there was a stack corruption problem in this program, would the new method lead you to find this bug?
• This example also lays a foundation for our introduction to random / statistical testing.
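As a rough preview of what a random test of long sequences might look like, here is a toy model in Python. It is a sketch under stated assumptions, not the Telenova code. The slides do not spell out the fault itself, so the model assumes one for illustration: a caller hanging up while on hold removes the call from the visible hold queue but leaves its record on the stack. The class, function and event names are all hypothetical; only the limits (10 calls on hold, room for 20 records, reset when the stack should be empty) come from the slides.

```python
import random

STACK_CAPACITY = 20   # "room for 20 calls (just in case)"
MAX_ON_HOLD = 10      # hold queue depth


class HoldStackModel:
    """Toy model of the hold stack; the fault is an assumption."""

    def __init__(self):
        self.on_hold = 0      # calls visibly on hold
        self.stack_depth = 0  # records on the internal stack
        self.overflowed = False

    def step(self, event):
        if event == "hold" and self.on_hold < MAX_ON_HOLD:
            self.on_hold += 1
            self.stack_depth += 1
        elif event == "resume" and self.on_hold > 0:
            self.on_hold -= 1
            self.stack_depth -= 1
        elif event == "caller_hangup" and self.on_hold > 0:
            # Assumed fault: the call leaves hold, its record stays.
            self.on_hold -= 1
        if self.on_hold == 0:
            self.stack_depth = 0  # reset when the stack should be empty
        if self.stack_depth > STACK_CAPACITY:
            self.overflowed = True


def run(events):
    """Feed a sequence of events; return the overflow step index or None."""
    model = HoldStackModel()
    for i, event in enumerate(events):
        model.step(event)
        if model.overflowed:
            return i
    return None


def random_events(steps, seed):
    """Random long sequence, weighted toward a busy period."""
    rng = random.Random(seed)
    names = ["hold", "resume", "caller_hangup"]
    return rng.choices(names, weights=[2, 1, 1], k=steps)


if __name__ == "__main__":
    lab_style = ["hold"] * 10 + ["resume"] * 10
    print("lab-style test overflow at:", run(lab_style))
    print("random test overflow at:", run(random_events(100_000, seed=1)))
```

In this model a tidy lab-style test (hold ten calls, resume ten calls) can never overflow the stack, while a sequence that keeps the phone busy and mixes in caller hang-ups can. Whether a particular random run finds the overflow depends on the seed, the weights and the length of the run.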
Telenova Stack Failure
Having found and fixed the hold-stack bug, should we assume that we've taken care of the problem or that if there is one long-sequence bug, there will be more?
Hmmm…
If you kill a cockroach in your kitchen, do you assume you've killed the last bug?
Or do you call the exterminator?
Conclusion
• Complete testing is impossible
– There is no simple answer for this.
– There is no simple, easily automated, comprehensive oracle to deal with it.
– Therefore testers live and breathe tradeoffs.