Unit Testing Tool Competition Round Four
Urko Rueda, René Just, Juan P. Galeotti, Tanja E. J. Vos
The 9th International Workshop on Search-Based Software Testing

Contents
1. About the Tool competition
2. The Tools
3. The Methodology
4. The Results
5. Lessons learned

4th Java unit testing competition

About the Tool competition

Unit Testing Tool Competition

| Year | Round / venue | FITTEST (crest.cs.ucl.ac.uk/fittest) | Coverage metrics | Mutation metrics | CUTs / Projects / Tools | Tools (SBST & non-SBST) |
|---|---|---|---|---|---|---|
| 2012 | ICST'13 | ✓ | Cobertura | Javalanche | 77 / 5 / 2 | Manual & Randoop (baselines) |
| 2013 | Round Two, FITTEST'13 | ✓ | JaCoCo | PITest | 63 / 9 / 4 | 1st + T3 & Evosuite |
| 2014 | Round Three, SBST'15 | ✗ | JaCoCo | PITest | 63 / 9 / 8 | 2nd + Commercial & GRT & jTexPert & Mosa (Evosuite) |
| 2015 | Round Four, SBST'16 | ✗ | Defects4J: github.com/rjust/defects4j + real fault finding metric | Defects4J | 68 / 5 / 4 | Randoop (baseline) & T3 & Evosuite & jTexPert |

Benchmarked Java unit testing at the class level

About the Tool competition
- Why?
  - Towards testing field maturity: this is just Java…
  - Tools improvements, future developments insight
- What is new in the 4th edition?
  - Benchmark infrastructure, split into
    - Test generation
    - Test execution & test assessment (Defects4J)
  - Benchmark subjects (from the Defects4J dataset)
  - Time budgets (1, 2, 4 & 8 minutes)
  - Flaky tests (non-compilable, non-reliable pass)

The Tools

| Tool | Technique | Static analysis | 2012 | 2013 | 2014 | 2015 |
|---|---|---|---|---|---|---|
| Randoop (baseline) | Random | ✗ | ✓ | ✓ | ✓ | ✓ |
| T3 | Random | ✗ | ✗ | ✓ | ✓ | ✓ |
| jTexPert | Random (guided) | ✓ | ✗ | ✗ | ✓ | ✓ |
| Evosuite | Evolutionary algorithm | ✓ | ✗ | ✓ | ✓ | ✓ |

- SBST and non-SBST tools
- Command line tools
- Fully automated: no human intervention

The Methodology
- Tool deployment
  - Installation: Linux environment
  - Wrapper implementation: runtool script
    - Std. IN/OUT communication protocol
    - 4th edition has a time budget
  - Tune-up cycle: setup, run, resolve issues
- Benchmark infrastructure
  - Defects4J integration
  - Decoupling test generation from test execution/assessment
  - Tool: run over non-contest benchmark samples
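
The Methodology

[Sequence diagram: runtool protocol for Tool T, between the benchmark framework and the runtool wrapper]
Preparation:
  benchmark framework -> runtool: "BENCHMARK"
  benchmark framework -> runtool: Src Path / Bin Path / ClassPath
  benchmark framework -> runtool: ClassPath for JUnit Compilation
  runtool -> benchmark framework: "READY"
Loop (under the time budget):
  benchmark framework -> runtool: name of CUT
  runtool: generate file in ./temp/testcases
  runtool -> benchmark framework: "READY"
  benchmark framework: compile + execute + measure test case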
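
The exchange in the diagram can be sketched as a small wrapper loop. This is a minimal sketch in Python, not the contest's actual runtool: the one-value-per-line framing, the order of the path messages, and the names `run_tool`, `generate_tests` and `demo` are all assumptions. The time-budget message is left out because the slide does not show its format.

```python
import io


def run_tool(stdin, stdout, generate_tests):
    """Serve one benchmark session over text streams (hypothetical framing)."""
    # Preparation phase: the framework announces the benchmark, then its paths.
    if stdin.readline().strip() != "BENCHMARK":
        raise ValueError("expected BENCHMARK as the first message")
    # Assumption: one value per line, in the order listed on the slide.
    paths = {
        "src": stdin.readline().strip(),
        "bin": stdin.readline().strip(),
        "classpath": stdin.readline().strip(),
        "junit_classpath": stdin.readline().strip(),
    }
    stdout.write("READY\n")
    stdout.flush()

    # Loop phase: one CUT name per line until the framework closes the stream.
    for line in stdin:
        cut_name = line.strip()
        if not cut_name:
            continue
        # A real wrapper would write JUnit files under ./temp/testcases here.
        generate_tests(cut_name, paths)
        stdout.write("READY\n")
        stdout.flush()


def demo():
    """Drive run_tool with in-memory streams instead of a real framework."""
    requests = io.StringIO(
        "BENCHMARK\nsrc\nbin\nlib/app.jar\nlib/junit.jar\n"
        "org.example.Foo\norg.example.Bar\n"
    )
    replies = io.StringIO()
    seen = []
    run_tool(requests, replies, lambda cut, paths: seen.append(cut))
    return replies.getvalue().split(), seen
```

Passing the streams in as parameters keeps the loop testable without spawning a process; a real wrapper would call it with `sys.stdin` and `sys.stdout`.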

The Methodology
- Benchmark infrastructure
  - Two HP Z820 workstations, each:
    - 2 CPU sockets for a total of 20 cores
    - 256 GB RAM
  - 32 virtual machines (16 per workstation)
    - Test generation
      - 1 core: control tool multi-threading capability
      - 8 GB RAM
    - Test execution/assessment (tool independent)
      - 2 cores
      - 16 GB RAM: resolves out-of-memory issues

The Methodology

[Figure: contest execution layout. The benchmark tool is replicated across 32 VMs. Each of the two HP Z820 workstations (20-core CPU, 256 GB RAM) hosts 16 VMs; one is labeled RUNs 1, 2, 3 and the other RUNs 4, 5, 6. On each, the runtool wrappers for T3, jTexpert, EvoSuite and Randoop generate test cases for 80 CUTs under time budgets of 1, 2, 4 and 8 minutes on 1-core / 8 GB RAM VMs, and metrics are collected on 2-core / 16 GB RAM VMs. An aggregator gathers the metrics and calculates the score.]

The Methodology

[Figure: per-CUT workflow. For each time budget (1, 2, 4, 8 min), the benchmark tool calls the runtool wrappers (Randoop, T3, EvoSuite, jTexpert) to generate test classes for CUT (fixed). If the test classes are compilable (Y/N), they are run to detect and remove flaky tests. The resulting test classes, with no flaky tests, are run against CUT (fixed), CUT (1 real fault) and CUT (mutated) to collect metrics, from which the score is calculated.]

The Methodology
- Flaky tests
  - Pass during generation
  - But might fail during execution/assessment
  - False-positive warnings
    - Non-reliable fault detection
    - Non-reliable mutation analysis
- Defects4J flaky tests sanity
  - Non-compiling test classes
  - Failing tests over 5 executions (fixed CUT versions)
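
The sanity step can be illustrated with a short filter. This is a sketch under one reading of the slide, namely that a test counts as flaky if it fails in any of 5 executions against the fixed CUT version. `remove_flaky_tests` and the `run_test` callback are hypothetical names, not part of Defects4J.

```python
def remove_flaky_tests(tests, run_test, executions=5):
    """Keep only the tests that pass on every execution.

    `run_test(test)` is a hypothetical callback that runs one test against
    the fixed CUT version and returns True on pass, False on failure.
    """
    stable = []
    for test in tests:
        # all() stops at the first failing execution of this test.
        if all(run_test(test) for _ in range(executions)):
            stable.append(test)
    return stable


def demo():
    """Replay scripted outcomes so the example is deterministic."""
    outcomes = {
        "testA": iter([True, True, True, True, True]),
        "testB": iter([True, True, False, True, True]),  # fails once: flaky
    }
    return remove_flaky_tests(["testA", "testB"], lambda t: next(outcomes[t]))
```

Only tests that survive this filter should feed the coverage, mutation and fault-detection measurements, since a test that fails on the fixed version would otherwise be counted as revealing a fault.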

The Methodology
- The Metrics: test effectiveness
  - Code coverage (fixed benchmark versions)
    - Defects4J <- Cobertura
      - Statement coverage
      - Condition coverage
  - Mutation score
    - Defects4J <- Major framework (all mutation operators)
  - Real fault detection (buggy benchmark versions)
    - 1 real fault per benchmark
    - 0 or 1 score, independent of how many tests reveal it

The Methodology
- The Scoring formula

$$\mathrm{covScore}(T,L,C,r) := w_i \cdot cov_i + w_b \cdot cov_b + w_m \cdot cov_m + (\text{real fault found} \;?\; w_f : 0)$$

  T = Tool; L = Time budget; C = CUT; r = RUN (1..6)
  Coverages: $cov_i$ = statement; $cov_b$ = condition; $cov_m$ = mutants kill ratio
  Weights: $w_i = 1$; $w_b = 2$; $w_m = 4$; $w_f = 4$
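
As a worked illustration of the formula, here is a small Python sketch. It assumes that each coverage value is a ratio in [0, 1], which the slide does not state explicitly; the function and constant names are illustrative.

```python
# Weights as listed on the slide: w_i, w_b, w_m, w_f.
W_STATEMENT = 1
W_CONDITION = 2
W_MUTATION = 4
W_FAULT = 4


def cov_score(cov_i, cov_b, cov_m, real_fault_found):
    """covScore for one tool, time budget, CUT and run.

    cov_i, cov_b and cov_m are assumed to be ratios in [0, 1].
    """
    score = W_STATEMENT * cov_i + W_CONDITION * cov_b + W_MUTATION * cov_m
    if real_fault_found:
        # The fault bonus is all-or-nothing, however many tests reveal it.
        score += W_FAULT
    return score
```

For example, `cov_score(0.5, 0.25, 0.5, True)` gives 0.5 + 0.5 + 2.0 + 4 = 7.0, and under the ratio assumption the maximum is 11.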

The Methodology
- The Scoring formula: time penalty
  - Test generation slot: L..2·L
  - No penalty if genTime <= L
  - Penalty for extra time taken (genTime - L)
  - Half covScore if the Tool must be killed (> 2·L)
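
The rules above fix two end points (full covScore up to L, half covScore when the tool is killed past 2·L) but the slide text does not give the shape of the penalty in between. The sketch below assumes a scaling factor of L / genTime, chosen only because it matches both end points; the function name `t_score` and its parameters are likewise illustrative.

```python
def t_score(cov_score, gen_time, budget, killed=False):
    """Time-adjusted score for one run.

    ASSUMPTION: between L and 2*L the score is scaled by budget / gen_time.
    The slide states only the behaviour at the two end points.
    """
    if budget <= 0:
        raise ValueError("time budget must be positive")
    if killed or gen_time > 2 * budget:
        # The tool overran the whole generation slot and had to be killed.
        return cov_score / 2
    if gen_time <= budget:
        return cov_score
    return cov_score * budget / gen_time
```

With a 60-second budget, a covScore of 6.0 stays at 6.0 when generation takes 60 seconds, drops to 4.0 at 90 seconds, and is halved to 3.0 if the tool has to be killed.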

The Methodology
- The Scoring formula: tests penalty
  #Classes = generated test classes; #uClasses = uncompilable test classes
  #Tests = test cases; #fTests = flaky test cases
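
The Methodology
- The Scoring formula: Tool score

$$\mathrm{Score}(T,L,C) := \operatorname{avg}_{r}\,\mathrm{Score}(T,L,C,r) \quad \text{over all } r \text{ executions}$$

$$\mathrm{Score}(T,L,C,r) := \mathrm{tScore}(T,L,C,r) - \mathrm{penalty}(T,L,C,r)$$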
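
The two definitions compose as follows. This is a minimal sketch that takes the per-run tScore and penalty values as given inputs, since the slide text does not spell out how the penalty is computed; the function names are illustrative.

```python
from statistics import mean


def run_score(t_score, penalty):
    """Score(T, L, C, r) = tScore(T, L, C, r) - penalty(T, L, C, r)."""
    return t_score - penalty


def tool_score(runs):
    """Score(T, L, C): average of the per-run scores.

    `runs` is an iterable of (t_score, penalty) pairs, one pair per run r.
    """
    return mean(run_score(t, p) for t, p in runs)
```

For instance, `tool_score([(7.0, 1.0), (5.0, 0.0), (6.0, 2.0)])` averages the per-run scores 6.0, 5.0 and 4.0 to give 5.0.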

The Methodology
- Conclusion validity
  - Reliability of treatment implementation
    - Tool deployment instructions EQUAL for all participants
  - Reliability of measures
    - Efficiency: wall clock time by Java System.currentTimeMillis()
    - Effectiveness: Defects4J
    - Tools' non-deterministic nature: 6 runs (HW capacity)

The Methodology
- Internal validity
  - CUTs from Defects4J (uniform and arbitrary selection from 5 open source projects)
  - Tools and benchmark infrastructure
    - Tune-up samples
    - Contest benchmarks
  - Wrappers (runtool): implemented by the Tools side
- Construct validity
  - Scoring formula weights: quality indicators value
  - Empirical studies: correlation of proxy metrics for test effectiveness and fault finding capability

The Results
Contest run for ~1 week
Test generation, execution and assessment
x32 VMs
A single virtual machine would use 8 CPU months!

Lessons learned
- Testing Tools improvements
  - Automation, Test effectiveness, Comparability
- Benchmarking infrastructure improvements
  - Decoupling test generation from execution/assessment
  - Flaky tests identification and sanity
  - Fault finding capability measurement
  - Test effectiveness due to test generation time
- What next?
  - Automated parallelization of the benchmark contest
  - More Tools, new languages? (e.g. C#?)

Contact us
Universidad Politécnica de Valencia, [email protected], [email protected]
Open Universiteit Heerlen, [email protected]
University of Massachusetts Amherst, MA, [email protected]
University of Buenos Aires, [email protected]
web: http://sbstcontest.dsic.upv.es/