Precise Interface Identification to Improve Testing and Analysis of Web Applications
William G.J. Halfond, Saswat Anand, and Alessandro Orso
Georgia Institute of Technology
Example Web Application
Web Server
End Users
Initial Visit Web Application
getQuote.jsp
buyPolicy.jsp
Quote Information
http://host/getQuote.jsp?action=doquote&car=jeep
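The request URL above carries the interface's name-value pairs in its query string. As a minimal sketch of how those pairs can be read back out, assuming plain `name=value` pairs separated by `&` (the `QueryParams` class and its `parse` method are illustrative names, not part of the presented tool):

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
final class QueryParams {
    // Splits the query component of a URL into decoded name-value pairs,
    // preserving the order in which the parameters appear.
    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("http://host/getQuote.jsp?action=doquote&car=jeep"));
    }
}
```

The names recovered this way (here `action` and `car`) are what an interface identification technique has to discover without running the application.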
Interface Identification
public void write(File outfile, String buffer, int length)
1. Names of parameters
2. Grouping of parameters
3. Domain information
Example Web Application
Interface Domain Constraints
action = "checkeligibility" ∧ integer(age) ∧ age < 16
action = "checkeligibility" ∧ integer(age) ∧ age ≥ 16
public void service(HttpRequest req)
  1. String aValue = req.getIP("action")
  2. if (aValue.equals("checkeligibility"))
  3.   int userAge = getNumIP("age")
  4.   if (userAge < 16)
  5.     displayErrorMsg("Too young.")
  6.   else
  7.     displayQuotePage()
  8. if (aValue.equals("doquote"))
  9.   String nValue = req.getIP("name")
 10.   String carType = req.getIP("type")
 11.   int carYear = getNumIP("year")
 12.   calculateQuote(carType, carYear)
…
public int getNumIP(String name)
  1. String value = getIP(name)
  2. int param = Integer.parseInt(value)
  3. return param
1. Names of parameters
2. Grouping of parameters
3. Domain information

Parameter Names
action, age, name, type, year

Groupings of Parameters
action
action, age
action, name, type, year
Previous Approaches: Interface Identification
Dynamic
Spider:
• Web spider crawls pages of application
• Limitation: No guarantee of completeness
Static
DFW [1]:
• Identify parameter names via static analysis
• Limitation: Only identifies parameter names
WAMDF [2]:
• Uses iterative data-flow analysis
• Limitation: Assumes all paths feasible
1. Deng, Frankl, Wang, SEN 2004.
2. Halfond and Orso, FSE 2007.
(action, age, name, type, year)
  1. String aValue = req.getIP("action")
  2. if (aValue.equals("checkeligibility"))
  …
  8. if (aValue.equals("doquote"))
  4. if (userAge < 16)
  5.   displayErrorMsg("Too young.")
  6. else
  7.   displayQuotePage()
Our Approach
Statically identify interfaces by using symbolic execution to model input parameters and domain constraining operations.
1. Program transformation
2. Symbolic execution
3. Interface identification
1 – Program Transformation
1. Introduce symbolic values
2. Replace domain-constraining operations
Original call:
value ← getIP(name)
Transformed getIP(name):
s ← new SymbolicValue()
s.assignName(name)
SymbolicState.add(s, value)
return s
Domain-constraining operations:
1. Accessing an input parameter
2. Conversion to numeric type
3. String comparison
4. Arithmetic constraints
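The transformed accessor can be sketched in Java roughly as follows. The class names `SymbolicValue` and `SymbolicState` come from the slide, but their bodies, the `s_` naming prefix, and the extra `variable` argument (used to record which program variable receives the symbolic value) are assumptions made for this sketch, not the WAMSE implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: bodies and signatures are assumed, not taken from WAMSE.
final class TransformSketch {
    // Placeholder for an input parameter whose concrete value is unknown.
    static final class SymbolicValue {
        private String name;
        void assignName(String name) { this.name = "s_" + name; }
        String name() { return name; }
    }

    // Maps each symbolic value's name to the program variable that holds it.
    static final class SymbolicState {
        private static final Map<String, String> BINDINGS = new LinkedHashMap<>();
        static void add(SymbolicValue s, String variable) { BINDINGS.put(s.name(), variable); }
        static Map<String, String> bindings() { return BINDINGS; }
    }

    // Transformed accessor: returns a named symbolic value instead of a concrete string.
    static SymbolicValue getIP(String name, String variable) {
        SymbolicValue s = new SymbolicValue();
        s.assignName(name);
        SymbolicState.add(s, variable);
        return s;
    }

    public static void main(String[] args) {
        SymbolicValue aValue = getIP("action", "aValue");
        System.out.println(aValue.name() + " -> " + SymbolicState.bindings().get(aValue.name()));
    }
}
```

Returning a symbolic value rather than a string is what lets later operations on the parameter (numeric conversion, comparison, arithmetic) be recorded as constraints instead of being evaluated.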
2 – Symbolic Execution
Symbolically execute the transformed web application -- track path conditions and symbolic state.
Symbolic Execution
Transformed Web Application
getQuote.jsp
buyPolicy.jsp
Path Conditions
c1 ∧ c2 ∧ c3
c3 ∧ c4 ∧ c5
Symbolic States
s_action → aValue
s_year → carYear
2 – Access Input Parameters
  1. String aValue = req.getIP("action")
(PC, SS)
(PC, SS[s_action → aValue])
PC = Path Condition
SS = Symbolic State
2 – String Comparison
  1. String aValue = req.getIP("action")
(PC, SS[s_action → aValue])
  2. if (aValue.equals("checkeligibility"))
TRUE: (PC ∧ s_action = "checkeligibility", SS[s_action → aValue])
FALSE: (PC ∧ s_action ≠ "checkeligibility", SS[s_action → aValue])
  8. if (aValue.equals("doquote"))
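At a string comparison, symbolic execution continues along both outcomes, and each outcome extends the path condition with its own constraint. A minimal sketch of that forking step, assuming constraints are kept as plain strings (`PathState` and `forkOnEquals` are hypothetical names used only for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of forking a path condition at a string comparison.
final class ForkSketch {
    // A path condition is kept as a list of constraints that are conjoined.
    static final class PathState {
        final List<String> pathCondition;

        PathState(List<String> pathCondition) {
            this.pathCondition = pathCondition;
        }

        // Returns a copy of this state with one more constraint conjoined.
        PathState with(String constraint) {
            List<String> extended = new ArrayList<>(pathCondition);
            extended.add(constraint);
            return new PathState(extended);
        }
    }

    // Models `if (symbol.equals(literal))`:
    // index 0 is the TRUE branch, index 1 is the FALSE branch.
    static List<PathState> forkOnEquals(PathState state, String symbol, String literal) {
        PathState onTrue = state.with(symbol + " = \"" + literal + "\"");
        PathState onFalse = state.with(symbol + " != \"" + literal + "\"");
        return List.of(onTrue, onFalse);
    }

    public static void main(String[] args) {
        PathState start = new PathState(List.of());
        for (PathState s : forkOnEquals(start, "s_action", "checkeligibility")) {
            System.out.println(String.join(" AND ", s.pathCondition));
        }
    }
}
```

A real engine would also ask a constraint solver whether each extended path condition is satisfiable and drop the branch if it is not; that check is omitted here.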
3 – Interface Identification
PC1: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age < 16
PC2: s_action = "checkeligibility" ∧ integer(s_age) ∧ s_age ≥ 16
SS [s_action → aValue, s_age → userAge]
  1. String aValue = req.getIP("action")
  2. if (aValue.equals("checkeligibility"))
  3.   int userAge = getNumIP("age")
  4.   if (userAge < 16)
  5.     displayErrorMsg("Too young.")
  6.   else
  7.     displayQuotePage()
…
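Roughly, each path condition yields one interface: the parameters it mentions form the grouping, and its constraints supply the domain information. The sketch below recovers the grouping from one path condition. It assumes constraints are encoded as strings and that symbolic values are written with an `s_` prefix; both are conventions chosen for this example, not details of the presented tool:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: derive a parameter grouping from one path condition.
final class InterfaceSketch {
    // Matches symbolic value names such as s_action or s_age.
    private static final Pattern SYMBOL = Pattern.compile("\\bs_(\\w+)");

    // Collects, in order of first appearance, the parameters a path condition mentions.
    static Set<String> parameters(List<String> pathCondition) {
        Set<String> names = new LinkedHashSet<>();
        for (String constraint : pathCondition) {
            Matcher m = SYMBOL.matcher(constraint);
            while (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> pc1 = List.of(
            "s_action = \"checkeligibility\"", "integer(s_age)", "s_age < 16");
        System.out.println(parameters(pc1) + " with domain " + pc1);
    }
}
```

Scanning strings with a regular expression is a shortcut for the sketch; an engine that keeps constraints as expression trees would walk those trees instead.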
Empirical Evaluation
Research Questions (RQ):
1. Efficiency -- Is the new approach efficient in terms of its analysis time requirements?
2. Precision -- Is the new approach more precise than previous approaches?
3. Usefulness -- Does the new approach improve the performance of quality assurance techniques?
Implementation: WAMSE
• Written in Java for Java Enterprise Edition (JEE) based web applications
• Implementation Modules
1. TRANSFORM
• Customized JEE libraries
• Stinger for analysis and automated transformation
2. SE ENGINE
• Symbolic execution engine built on Java PathFinder
• Constraint solver is Yices
3. PC ANALYSIS
Implementation: Other Approaches
Dynamic
Spider:
• Web spider crawls pages of application
• OWASP Web Scarab Project
Static
DFW [1]:
• Identify parameter names via static analysis
• Reimplementation of the author-provided code
WAMDF [2]:
• Uses iterative data-flow analysis
• Implementation from previous work
1. Deng, Frankl, Wang, SEN 2004.
2. Halfond and Orso, FSE 2007.
Subject Applications
Subject             LOC      Classes  Servlets
Bookstore           19,402   28       27
Classifieds         10,702   18       18
Employee Directory   5,529   11        9
Events               7,164   13       12
Subjects available online from GotoCode.com
RQ1: Efficiency
[Figure: bar chart of analysis time (s) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]
1. High amount of infeasible paths in subjects
2. Low number of constraints per parameter
3. Web applications highly modular
RQ2: Precision
[Figure: bar chart of number of interfaces for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF]
On average, 80% of WAMDF interfaces were spurious
RQ3: Usefulness
Measure improvement of three quality assurance techniques:
a) Invocation Verification
b) Penetration Testing
c) Test Input Generation
RQ3a – Invocation Verification
Approach  False Positives  False Negatives
WAMDF     0%               50%
Spider    39%              0%
WAMSE     0%               0%
Verification of invocations for subject Bookstore
[Figure: web application with getQuote.jsp and buyPolicy.jsp; the invocation between them is marked with an X]
RQ3b – Penetration Testing
[Figure: bar chart of number of vulnerabilities for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]
Number of vulnerabilities: 2X – 6X higher for WAMSE
RQ3c – Test Input Generation
[Figure: three bar charts for Bookstore, Classifieds, Employee Dir., and Events showing % statement coverage, % branch coverage, and number of command forms; series: WAMSE, WAMDF, DFW, Spider]
Branch coverage increase: 3%-67%
Statement coverage increase: 3%-25%
Command form increase: 651%-1,577%
RQ3c – Test Suite Size
[Figure: bar chart of number of test cases (log scale) for Bookstore, Classifieds, Employee Dir., and Events; series: WAMSE, WAMDF, DFW, Spider]
RQ3c results:
1. Higher coverage for measured metrics
2. Smaller average test suite
Test suite decrease in size: 4X – 10X
Summary of Results
• Developed interface identification technique for web applications based on symbolic execution.
• Empirical evaluation:
  • Similar analysis time to other techniques
  • More precise than current techniques
  • Improves quality assurance techniques