Post on 17-Jan-2016
Cooperative Developer Testing:
How Human and Machine Cooperate to Get Job Done
Tao Xie, North Carolina State University
In collaboration with Xusheng Xiao (NCSU ASE), Nikolai Tillmann and Peli de Halleux (Microsoft Research), and students
Why Automate Testing?
• Software testing is important
  ▪ Software errors cost the U.S. economy about $59.5 billion each year (0.6% of the GDP) [NIST 02]
  ▪ Improving testing infrastructure could save one third of that cost [NIST 02]
• Software testing is costly
  ▪ It accounts for as much as half of the total cost of software development [Beizer 90]
• Automated testing reduces manual testing effort
  ▪ Test execution: JUnit, NUnit, xUnit, etc.
  ▪ Test generation: Pex, AgitarOne, Parasoft Jtest, etc.
  ▪ Test-behavior checking: Pex, AgitarOne, Parasoft Jtest, etc.
Software Testing Problems
[Figure: Program + Test inputs → Outputs =? Expected Outputs (Test Oracles)]
• Test Generation (machine)
  ▪ Generating high-quality test inputs (e.g., achieving high code coverage)
• Test Oracles (human)
  ▪ Specifying high-quality test oracles (e.g., guarding against various faults)
Test Generation
• Human: expensive, incomplete, …
• Brute force: pairwise, predefined data, etc.
• Random: cheap, fast; gives an "it passed a thousand tests" feeling
• Dynamic symbolic execution: Pex, CUTE, EXE
  ▪ Automated white-box
  ▪ Not random: constraint solving
Dynamic Symbolic Execution
Code to generate inputs for:

    void CoverMe(int[] a)
    {
        if (a == null) return;
        if (a.Length > 0)
            if (a[0] == 1234567890)
                throw new Exception("bug");
    }

Loop: Choose next path → Solve → Execute & Monitor
Negated condition: each new set of constraints to solve keeps a prefix of the observed constraints and negates the condition that follows it.

Constraints to solve                       | Data   | Observed constraints
(none)                                     | null   | a==null
a!=null                                    | {}     | a!=null && !(a.Length>0)
a!=null && a.Length>0                      | {0}    | a!=null && a.Length>0 && a[0]!=1234567890
a!=null && a.Length>0 && a[0]==1234567890  | {123…} | a!=null && a.Length>0 && a[0]==1234567890

[Figure: decision tree over a==null, a.Length>0, and a[0]==123… with T/F branches]

Done: There is no path left.
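The execute, negate, and solve loop can be sketched in a few lines of C#. This is a toy under stated assumptions, not how Pex works internally: `ToyDse`, `Cond`, `Execute`, `Solve`, and `Explore` are hypothetical names, the "solver" is hard-coded for the three conditions of CoverMe, and the instrumented copy of CoverMe records a branch outcome instead of throwing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy explorer for CoverMe; NOT how Pex is implemented.
public static class ToyDse
{
    // The three branch conditions of CoverMe, in evaluation order.
    public enum Cond { IsNull, HasElements, FirstIsMagic }

    public const int Magic = 1234567890;

    // Execute & monitor: runs an instrumented copy of CoverMe and records
    // every condition evaluated together with its outcome.
    public static List<(Cond cond, bool taken)> Execute(int[] a)
    {
        var path = new List<(Cond cond, bool taken)>();
        path.Add((Cond.IsNull, a == null));
        if (a == null) return path;
        path.Add((Cond.HasElements, a.Length > 0));
        if (a.Length == 0) return path;
        path.Add((Cond.FirstIsMagic, a[0] == Magic)); // true means "bug" is reached
        return path;
    }

    // Solve: stand-in for a constraint solver, hard-coded for these conditions.
    public static int[] Solve(List<(Cond cond, bool taken)> constraints)
    {
        bool isNull = false, hasElements = false;
        int first = 0;
        foreach (var (cond, taken) in constraints)
        {
            if (cond == Cond.IsNull) isNull = taken;
            else if (cond == Cond.HasElements) hasElements = taken;
            else first = taken ? Magic : 0;
        }
        if (isNull) return null;
        return hasElements ? new[] { first } : new int[0];
    }

    public static string Show(int[] a)
    {
        return a == null ? "null" : "{" + string.Join(",", a) + "}";
    }

    // Choose next path: keep a prefix of the observed path, negate the next
    // condition, and skip constraint sets that were already covered.
    public static List<string> Explore()
    {
        var seen = new HashSet<string>();
        var inputs = new List<string>();
        var worklist = new Queue<int[]>();
        worklist.Enqueue(null); // the first run uses the default input
        while (worklist.Count > 0)
        {
            int[] input = worklist.Dequeue();
            inputs.Add(Show(input));
            var path = Execute(input);
            for (int i = 1; i <= path.Count; i++)
                seen.Add(string.Join("&", path.GetRange(0, i)));
            for (int i = 0; i < path.Count; i++)
            {
                var next = path.GetRange(0, i);
                next.Add((path[i].cond, !path[i].taken));
                if (seen.Add(string.Join("&", next)))
                    worklist.Enqueue(Solve(next));
            }
        }
        return inputs; // the worklist is empty: there is no path left
    }
}
```

Calling `ToyDse.Explore()` yields the inputs null, {}, {0}, and {1234567890}, in that order, which mirrors the Data column of the slide. A real engine derives the constraints from the program's instructions and hands them to a general-purpose constraint solver.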
Pex: Visual Studio Power Tool
Download counts (20 months, Feb. 2008 - Oct. 2009):
• Academic: 17,366
• Devlabs: 13,022
• Total: 30,388
http://research.microsoft.com/projects/pex/
Challenges of DSE
• Loops/path explosion: Fitnex [Xie et al. DSN 09]
• Method sequences: MSeqGen [Thummalapenta et al. ESEC/FSE 09]
• External methods or environments (e.g., file systems, network, db, …): Parameterized Mock Objects [Taneja et al. ASE 10-sp]
• Opportunities
  ▪ Regression testing [Taneja et al. ICSE 09-nier]
  ▪ Manually written unit tests [Thummalapenta et al. FASE 11]
  ▪ Developer guidance (cooperative developer testing) [Xiao et al. ICSE 11]
Open-source Pex extensions: http://pexase.codeplex.com/
Publications: http://research.microsoft.com/en-us/projects/pex/community.aspx#publications
Problems Faced by DSE
DSE Challenges - Preliminary Study
• external-method call problems (EMCP)
• object-creation problems (OCP)

           EMCPs   OCPs
Reported     44     18
Real          0      5
DSE Challenges - Preliminary Study
• object-creation problems (OCP): 64.79%
• external-method call problems (EMCP): 26.76%
• boundary problems: 5.63%
• limitations of the used constraint solver: 2.82%
Preliminary results show that the total block coverage achieved is 49.87%, with the lowest coverage being 15.54%.
External-Method Call Problems (EMCP) Example
• Example 1: File.Exists has data dependencies on program input
  ▪ The subsequent branch at Line 1 uses the return value of File.Exists
• Example 2: Path.GetFullPath has data dependencies on program input
  ▪ Path.GetFullPath throws exceptions
• Example 3: String.Format does not cause any problem
Object-Creation Problems (OCP) Example
• To cover the true branch at Line 5, tools need to generate a sequence of method calls:
    Stack s1 = new Stack();
    s1.Push(new object());
    ……
    s1.Push(new object());
    FixedSizeStack s2 = new FixedSizeStack(s1);
• stack.Count() returns the size of stack.items
• Most tools cannot generate such a sequence
• The true branch at Line 5 has data dependencies on stack.items (List<object>)
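A minimal sketch of this situation follows. It is a hypothetical reconstruction, since the slide's code figure is not part of the text: the class bodies, the `Capacity` value of 10 (taken from the `s0 == 10` path condition shown in the backup slides), and the `CreateWithElements` helper are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction; the slide's actual code figure is not reproduced here.
public static class OcpSketch
{
    public class Stack
    {
        // Private state: a tool cannot set this list directly.
        private readonly List<object> items = new List<object>();
        public void Push(object item) { items.Add(item); }
        public int Count() { return items.Count; }
    }

    public class FixedSizeStack
    {
        public const int Capacity = 10; // assumed limit
        private readonly Stack stack;
        public FixedSizeStack(Stack stack) { this.stack = stack; }

        public void Push(object item)
        {
            if (stack.Count() == Capacity) // the hard-to-cover true branch
                throw new InvalidOperationException("stack is full");
            stack.Push(item);
        }
    }

    // A factory method of the kind a developer could provide as guidance:
    // it encodes the method sequence needed to reach a given size.
    public static FixedSizeStack CreateWithElements(int count)
    {
        var s1 = new Stack();
        for (int i = 0; i < count; i++) s1.Push(new object());
        return new FixedSizeStack(s1);
    }
}
```

With `OcpSketch.CreateWithElements(10)`, a single further `Push` reaches the true branch. Without such a helper, a tool would have to discover on its own that ten `Stack.Push` calls must precede the `FixedSizeStack` constructor.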
Cooperative Developer Testing
• Developers provide guidance to help tools achieve higher structural coverage
  ▪ Apply tools to generate tests
  ▪ Tools report achieved coverage & problems
  ▪ Developers provide guidance
    ▪ EMCP: Instrumentation or Mock Objects
    ▪ OCP: Factory Methods
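As a sketch of the mock-object style of guidance for an EMCP, the external call can be passed in as a delegate so that a test, or a tool, controls its return value. The `EmcpSketch` class and its `Describe` method are hypothetical and exist only for illustration; `File.Exists` is the real .NET method.

```csharp
using System;
using System.IO;

// Hypothetical method under test; names are illustrative only.
public static class EmcpSketch
{
    // The file-system check is a parameter, so a test can replace File.Exists.
    public static string Describe(string path, Func<string, bool> exists)
    {
        if (exists(path)) // branch that depends on the external method's return value
            return "found";
        return "missing";
    }

    // The production entry point wires in the real external method.
    public static string Describe(string path)
    {
        return Describe(path, File.Exists);
    }
}
```

Calling `EmcpSketch.Describe("a.txt", p => true)` and `EmcpSketch.Describe("a.txt", p => false)` covers both branches without touching the file system. A parameterized mock goes one step further by letting the tool choose the mocked return value.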
Existing Solution of Problem Identification
• Existing solution (e.g., in Pex)
  ▪ identifies all external-method calls in the program
  ▪ reports all the non-primitive object types of program inputs and their fields
• Limitations
  ▪ the number of reported problems could be high
  ▪ some identified problems are irrelevant: they are not the reason the tools fail to achieve high structural coverage
Proposed Approach: Covana
• Precisely identify problems faced by tools when achieving structural coverage
• Insight: not-covered branches have data dependencies on real problem candidates
• Three main steps:
  ▪ Problem Candidate Identification
  ▪ Forward Symbolic Execution
  ▪ Data Dependence Analysis
[Xiao et al. ICSE 2011]
Overview of Covana
[Figure: Covana pipeline. Stages: Problem Candidate Identification, Forward Symbolic Execution, Data Dependence Analysis. Artifacts: Program / PUT, Generated Test Inputs, Runtime Events, Problem Candidates, Runtime Information, Coverage, Identified Problems]
Problem Identification
• EMCP Candidate Identification
  ▪ External-method calls whose arguments have data dependencies on program inputs (e.g., NOT method calls that print constant strings or put a thread to sleep for some time)
• OCP Candidate Identification
  ▪ Only non-primitive argument types (e.g., NOT int, boolean, double)
Example EMCP Candidate Identification
Data Dependencies
Example OCP Candidate Identification
OCP Candidates: FixedSizeStack, FixedSizeStack.stack, Stack.items, object
Forward Symbolic Execution
• Turn elements of problem candidates symbolic
  ▪ EMCP: return values of external-method calls
  ▪ OCP: non-primitive program inputs and their fields
• Perform symbolic execution (e.g., DSE/Pex)
• Collect runtime information
  ▪ Symbolic expressions in branches
  ▪ Uncaught exceptions
Data Dependence Analysis
Symbolic Expression: return(File.Exists) == true
Element of EMCP Candidate: return(File.Exists)
Branch Statement Line 1 has data dependency on File.Exists at Line 1
EMCP Analysis
• Data Dependence Analysis: partially-covered branch statements have data dependencies on EMCP candidates for return values
• Exception Analysis: extract external-method calls from the exception trace; the remaining parts of the program after the call site of the external-method call are not covered
Example EMCP Analysis
Branch Statement Line 1 has data dependency on File.Exists at Line 1
False branch at Line 1 is not covered
File.Exists is reported
Path.GetFullPath throws exceptions for all executions
Code after Line 6 is not covered
Path.GetFullPath is reported
OCP Analysis
• Data Dependence Analysis for partially-covered branch statements
  ▪ data dependencies on non-primitive program input: report the program input
  ▪ data dependencies on fields of program input: report the object type of the field directly??
Example OCP Analysis
stack.Count() returns the size of the field stack.items
true branch at Line 5 is not covered
Report List<object>, the object type of stack.items
False Warning!!!
An object of type List<object> cannot be used by the tools: it is not assignable to the field Stack.items by invoking a public constructor or a public setter method of its declaring class Stack!!
Field Declaration Hierarchy
FixedSizeStack → FixedSizeStack.stack → Stack.items
Reflection can build this hierarchy: first look at all fields of FixedSizeStack, then all fields of FixedSizeStack.stack, and finally Stack.items.
OCP Analysis Algorithm
• If the dependency is only on a program input, report it directly
• Otherwise, check whether each field is assignable for its declaring class
  ▪ not assignable: report its declaring class
  ▪ assignable: report the field itself
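The hierarchy walk and the assignability check can be sketched with .NET reflection. This is a simplified illustration, not Covana's implementation: `OcpAnalysisSketch`, `IsAssignable`, and `TypeToReport` are hypothetical names, the two example classes are reconstructed from the slides, and the assignability test is only a heuristic that looks for a public constructor parameter or a public property setter of a compatible type.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sketch; the assignability test is a simplified heuristic, not Covana's.
public static class OcpAnalysisSketch
{
    public class Stack
    {
        private List<object> items = new List<object>();
        public void Push(object o) { items.Add(o); }
        public int Count() { return items.Count; }
    }

    public class FixedSizeStack
    {
        private Stack stack;
        public FixedSizeStack(Stack stack) { this.stack = stack; }
        public int Count() { return stack.Count(); }
    }

    private const BindingFlags AllInstance =
        BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

    // Heuristic: a field counts as assignable if it is public, or if its declaring
    // class has a public constructor parameter or a public settable property
    // whose type fits the field.
    public static bool IsAssignable(FieldInfo field)
    {
        Type owner = field.DeclaringType;
        foreach (ConstructorInfo ctor in owner.GetConstructors())
            foreach (ParameterInfo p in ctor.GetParameters())
                if (field.FieldType.IsAssignableFrom(p.ParameterType)) return true;
        foreach (PropertyInfo prop in owner.GetProperties())
            if (prop.GetSetMethod() != null &&
                field.FieldType.IsAssignableFrom(prop.PropertyType)) return true;
        return field.IsPublic;
    }

    // Walks the field chain starting at the program input type and returns the
    // type to report as an OCP.
    public static Type TypeToReport(Type input, params string[] fieldChain)
    {
        Type current = input;
        foreach (string name in fieldChain)
        {
            FieldInfo field = current.GetField(name, AllInstance);
            if (field == null)
                throw new ArgumentException("no field " + name + " in " + current.Name);
            if (!IsAssignable(field)) return field.DeclaringType; // tools cannot set it
            current = field.FieldType;
        }
        return current; // every field on the chain is assignable
    }
}
```

For the chain FixedSizeStack → stack → items, `TypeToReport(typeof(OcpAnalysisSketch.FixedSizeStack), "stack", "items")` returns `Stack`: the field `stack` is assignable through the public constructor, but `items` is not, so its declaring class is reported rather than `List<object>`.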
Implementation
• An extension to Pex
  ▪ identify problem candidates
  ▪ turn elements of problem candidates symbolic
  ▪ collect runtime information
• Data dependence analyzer
  ▪ analyze runtime information
  ▪ identify problems
• Graphical User Interface (GUI) component
  ▪ show identified problems with detailed analysis information
Evaluation – Subjects and Setup
• Subjects:
  ▪ xUnit: unit testing framework for .NET, 223 classes and interfaces with 11.4 KLOC
  ▪ QuickGraph: C# graph library, 165 classes and interfaces with 8.3 KLOC
• Evaluation setup:
  ▪ Pex with the implemented extension as our DSE test-generation tool
  ▪ Apply Pex to generate tests for the program under test
  ▪ Collect coverage and runtime information for identifying EMCPs and OCPs
Evaluation – Research Questions
RQ1: How effective is Covana in identifying the two main types of problems, EMCPs and OCPs?
RQ2: How effective is Covana in pruning irrelevant problem candidates of EMCPs and OCPs?
Evaluations - RQ1: Problem Identification
Covana identifies
• 43 EMCPs with only 1 false positive and 2 false negatives
• 155 OCPs with 20 false positives and 30 false negatives
Example Identified OCP
• ClassStart: Pex achieved block coverage of 2/27 (7.14%)
• Requires the field typeUnderTest of TestClassCommand to be non-null and to implement at least one interface
• typeUnderTest is assignable for TestClassCommand, so ITypeInfo, the type of typeUnderTest, is reported as an OCP
Evaluations - RQ2: Irrelevant-Problem-Candidate Pruning
Covana prunes
• 97.33% (1567 of 1610) of EMCP candidates, with 1 false positive and 2 false negatives
• 65.63% (296 of 451) of OCP candidates, with 20 false positives and 30 false negatives
Discussion
Assisting other structural test-generation approaches:
• automatic mock-object generation: deal only with the external-method calls of EMCPs
• random approaches: assign higher probability to exploring the object types of OCPs
• advanced method-sequence-generation approaches (e.g., MSeqGen): deal only with the object types of OCPs
Regression Test Generation
• Given a method f(x) (old version) and g(x) (new version), synthesize a meta-program:
    h(x) := Assert(f(x) == g(x))
• Branch coverage:
    if (f(x) != g(x)) throw new Exception("changed behavior!");
• Complications:
  ▪ What if x is a non-primitive type? deep clone, method-sequence generation, …
  ▪ How to compare receiver objects? deep state comparison, …
[Taneja and Xie. ASE 08 SP]
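The meta-program can be written out for a concrete pair of versions. In this sketch the methods `F` and `G` are made-up examples with a deliberately seeded difference at x == 1234; only `H` follows the shape given on the slide.

```csharp
using System;

// Hypothetical old and new versions of a method; only H mirrors the slide.
public static class RegressionSketch
{
    // Old version.
    public static int F(int x) { return x * 2; }

    // New version with a seeded behavioral change for one input.
    public static int G(int x) { return x == 1234 ? 0 : x + x; }

    // Meta-program: covering the throwing branch requires an input
    // on which the two versions disagree.
    public static void H(int x)
    {
        if (F(x) != G(x))
            throw new Exception("changed behavior!");
    }
}
```

A test generator that aims at branch coverage of `H` has to find an input such as 1234 to reach the `throw`, and that input is exactly a regression test exposing the behavioral difference.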
Migrating Pex to the Web/Cloud
Try it at http://www.pexforfun.com/
Demo
• Engineering Pex for serious games in computer science
• Train problem-solving/programming skills and abstraction skills
Thank you!
Questions ?
https://sites.google.com/site/asergrp/
Observation of Path Condition
Path condition that leads to the true branch at Line 5:
    FixedSizeStack s3 = new ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10
This path condition contains all the required fields, since all of them are assigned symbolic values.
Constructing Field Declaration Hierarchy
Extract fields from path conditions and construct a field declaration hierarchy
    FixedSizeStack s3 = new ∧ Stack s2 = s3.stack ∧ List<object> s1 = s2.items ∧ int s0 = s1._size ∧ s0 == 10
FixedSizeStack → FixedSizeStack.stack → Stack.items
Discussion cont.
• Static fields initialized inside a class
  ▪ previous tests cause side effects on symbolic analysis
• Concrete arguments for external-method calls
  ▪ using a constant string to access the external environment affects achieved coverage
Discussion cont.
• Other potential issues
  ▪ argument side effects of external-method calls
  ▪ control dependency
  ▪ static analysis
• Future work
  ▪ carry out experiments to evaluate the effectiveness of incorporating these three additional features