HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla...

64
HAMPI A String Solver for Testing, Analysis and Vulnerability Detection Vijay Ganesh MIT July 16, 2011 Sunday, July 17, 2011

Transcript of HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla...

Page 1: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

HAMPIA String Solver

forTesting, Analysis and Vulnerability Detection

Vijay GaneshMIT

July 16, 2011

Sunday, July 17, 2011

Page 2: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Software Engineering & SMT SolversAn Indispensable Tactic

Formal Methods

Program Analysis

Automatic Testing

Program Synthesis

Program Reasoning

2Sunday, July 17, 2011

Page 3: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Software Engineering & SMT SolversAn Indispensable Tactic

Formal Methods

Program Analysis

Automatic Testing

Program Synthesis

Program ReasoningSAT/SMTSolvers

2Sunday, July 17, 2011

Page 4: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Traditional SMT Problem StatementEfficient Solver for Analysis of Programs

3

Program Reasoning Tool

Program Specification

Program is Correct?or Generate Tests

SMT Solver

Formulas

SAT/UNSAT

Sunday, July 17, 2011

Page 5: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Traditional SMT LogicsEfficient Solvers for Program Expressions

4

• Integer/Real Linear Arithmetic

• Bit-vectors

• Arrays

• Uninterpreted Functions

• Abstract Datatypes

• Quantifiers

• Non-linear Arithmetic

• Strings?

Sunday, July 17, 2011

Page 6: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Key SMT ConceptsLogician’s Question: What’s New?

5

• Approximations

• Asymptotically speaking: probably the same

• SAT• Clause learning using conflict analysis• Backjumping• Variable selection heuristics• Restarts

• SMT• Combinations• Under/Over approximations of formulas• DPLL(T)• Bounding

Sunday, July 17, 2011

Page 7: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Why a String Solver?Efficient Solver for Analysis of String Programs

6

Common String Operations String Programs Types of Errors

FunctionsString concatenation String extraction

PredicatesString comparison String assignmentSanity checking of strings using RE

Traditional AppsC/C++/Java Apps (String Library)C#/.NET

Web AppsSanitization code in PHP, JavaScriptClient-side and server-sideScripting code

Memory-related ErrorsBuffer overflowCode injection

Improper SanitizationSQL injection XSS scriptingIncomplete sanity checking

Sunday, July 17, 2011

Page 8: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

String Solver Problem StatementEfficient Solver for Analysis of String Programs

7

Program Reasoning Tool

String Program Specification

Program is Correct?or Generate Tests

String Solver

String Formulas

SAT/UNSAT

Sunday, July 17, 2011

Page 9: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI String Solver

8

HAMPI String Solver

StringFormulas

UNSAT

SAT

• X = concat(“SELECT...”,v) AND (X ∈ SQL_grammar)• JavaScript, PHP, ... string expressions• NP-complete• ACM Distinguished Paper Award 2009

Sunday, July 17, 2011

Page 10: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Take Home Message

• Theories of Strings are increasingly key for reliability/security

• Conceptual idea: Bounded logics

• Use HAMPI

9Sunday, July 17, 2011

Page 11: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Rest of the Talk

• HAMPI Logic: A Theory of Strings

• Motivating Example: HAMPI-based Vulnerability Detection App

• How HAMPI works

• Experimental Results

• Related Work: Practice and Theory

• HAMPI 2.0

• SMTization: Future of Strings

10Sunday, July 17, 2011

Page 12: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Theory of StringsThe Hampi Language

11

PHP/JavaScript/C++... HAMPI: Theory of Strings Notes

Var a;$a = ‘name’

Var a : 1...20; a = ‘name’

Bounded String Variables String Constants

string_expr.” is ” concat(string_expr, “ is “); Concat Function

substr(string_expr,1,3) string_expr[1:3] Extract Function

assignments/strcmpa = string_expr;a /= string_expr;

equalitya = string_expr;a /= string_expr;

Equality Predicate

Sanity check in regular expression RESanity check in context-free grammar CFG

string_expr in RE string_expr in SQLstring_expr NOT in SQL

Membership Predicate

string_expr contains a sub_strstring_expr does not contain a sub_str

string_expr contains sub_strstring_expr NOT?contains sub_str

Contains Predicate (Substring Predicate)

Sunday, July 17, 2011

Page 13: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Theory of StringsThe Hampi Language

12

• X = concat(“SELECT msg FROM msgs WHERE topicid = ”,v) AND

(X ∈ SQL_Grammar)

• input ∈ RegExp([0-9]+)

• X = concat (str_term1, str_term2, “c”)[1:42]AND

X contains “abc”

Sunday, July 17, 2011

Page 14: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

13

BackendDataBase

Malicious SQL Query

Unauthorized Database Results

BuggyScript

SELECT m FROM messages WHERE id=’1’ OR 1 = 1

Sunday, July 17, 2011

Page 15: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

14

Source: IBM Internet Security Systems, 2009Source: Fatbardh Veseli, Gjovik University College, Norway

Sunday, July 17, 2011

Page 16: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

15

if (input in regexp(“[0-9]+”)) query := “SELECT m FROM messages WHERE id=‘ ” + input + “ ’ “)

Buggy Script

• input passes validation (regular expression check)

• query is syntactically-valid SQL

• query can potentially contain an attack substring (e.g., 1’ OR ‘1’ = ‘1)

Sunday, July 17, 2011

Page 17: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

15

if (input in regexp(“[0-9]+”)) query := “SELECT m FROM messages WHERE id=‘ ” + input + “ ’ “)

Buggy Script

• input passes validation (regular expression check)

• query is syntactically-valid SQL

• query can potentially contain an attack substring (e.g., 1’ OR ‘1’ = ‘1)

Should be: “^[0-9]+$”

Sunday, July 17, 2011

Page 18: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

16

if (input in regexp(“[0-9]+”)) query := “SELECT m FROM messages WHERE id=‘ ” + input + “ ’ “)

Program Reasoning Tool

Specification

Generate Tests/Report Vulnerability

HAMPI

String Formulas

SAT/UNSAT

Sunday, July 17, 2011

Page 19: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Rest of the Talk

• HAMPI Logic: A Theory of Strings

• Motivating Example: HAMPI-based Vulnerability Detection App

• How HAMPI works

• Experimental Results

• Related Work: Theory and Practice

• HAMPI 2.0

• SMTization: Future of Strings

17Sunday, July 17, 2011

Page 20: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Expressing the Problem in HAMPISQL Injection Vulnerabilities

18

Var v : 12;

cfg SqlSmall := "SELECT ” [a-z]+ " FROM ” [a-z]+ " WHERE " Cond;

cfg Cond := Val "=" Val | Cond " OR " Cond;

cfg Val := [a-z]+ | "'” [a-z0-9]* "'" | [0-9]+;

val q := concat("SELECT msg FROM messages WHERE topicid='", v, "'");

assert q in SqlSmall;

assert q contains "OR ‘1'=‘1'";

SQL Grammar

SQL Query

Input String

SQLI attack conditions

“q is a valid SQL query”

“q contains an attack vector”

assert v in [0-9]+;

Sunday, July 17, 2011

Page 21: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Hampi Key Conceptual IdeaBounding, expressiveness and efficiency

19

LiComplexity of∅ = L1 ∩ ... ∩ Ln

Current Solvers

Context-free Undecidable n/a

Regular PSPACE-complete Quantified Boolean Logic

Bounded NP-complete SATEfficient in practice

Sunday, July 17, 2011

Page 22: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Hampi Key Idea: Bounded LogicsTesting, Vulnerability Detection,...

20

•Finding SAT assignment is key

•Short assignments are sufficient

•Bounding is sufficient

•Bounded logics easier to decide

Sunday, July 17, 2011

Page 23: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Hampi Key Idea: Bounded LogicsBounding vs. Completeness

21

• Bounding leads to incompleteness

• Testing (Bounded MC) vs. Verification (MC)

• Bounding allows trade-off (Scalability vs. Completeness)

• Completeness (also, soundness) as resources

Sunday, July 17, 2011

Page 24: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI Solver Motivating ExampleSQL Injection Vulnerabilities

22

Var v : 12;

cfg SqlSmall := "SELECT ” [a-z]+ " FROM ” [a-z]+ " WHERE " Cond;

cfg Cond := Val "=" Val | Cond " OR " Cond;

cfg Val := [a-z]+ | "'” [a-z0-9]* "'" | [0-9]+;

val q := concat("SELECT msg FROM messages WHERE topicid='", v, "'");

assert q in SqlSmall;

assert q contains "OR ‘1'=‘1'";

SQL Grammar

SQL Query

Input String

SQLI attack conditions

“q is a valid SQL query”

“q contains an attack vector”

assert v in [0-9]+;

Sunday, July 17, 2011

Page 25: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksBird’s Eye View: Strings into Bit-vectors

23

Hampi

Find a 4-char string v:• (v) is in E• (v) contains ()()

var v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

STP Encoder

STP DecoderSTP

String Solutionv = )()(

Bit-vector Constraints

Bit-vector Solution

Normalizer

Sunday, July 17, 2011

Page 26: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

STP Bit-vector & Array Solver

24

STP SolverProgram

Expressions(x = z+2 OR

mem[i] + y <= 01)

UNSAT

SAT

• Bit-vector or machine arithmetic• Arrays for memory• C/C++/Java expressions• NP-complete

Sunday, July 17, 2011

Page 27: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

• STP• Enabled Concolic Testing• EXE by Engler et al• BAP/BitBlaze by Song et al.• Model checking by Dill et al.

• Solver-based languages (Alloy team)• Solver-based debuggers• Solver-based type systems • Solver-based concurrency bugfinding

100,000 Constraints

1,000,000 Constraints

2005 2009 Today

• HAMPI: String Solvers• Ardilla by Ernst et al.• Kudzu & Kaluza by Song et al.• Klee by Engler et al.• George Candea’s Cloud 9 tester• STP + HAMPI exceed 100+ projects

The History of STP

25Sunday, July 17, 2011

Page 28: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Impact of STP: Notable Projects

26

Category Research Project Project Leader/Institution

Formal MethodsACL2 Theorem Prover + STPVerification-aware Design CheckerJava PathFinder Model Checker

Eric Smith & David Dill/StanfordJacob Chang & David Dill/StanfordMehlitz & Pasareanu/NASA

Program AnalysisBitBlaze & WebBlazeBAP

Dawn Song et al./BerkeleyDavid Brumley/CMU

Automatic TestingSecurity

Klee, EXESmartFuzzKudzuS2E & Cloud9

Engler & Cadar/StanfordMolnar & Wagner/BerkeleySaxena & Song/BerkeleyBucur & Candea/EPFL

Hardware Bounded Model-cheking (BMC)

Blue-spec BMCBMC

Katelman & Dave/MITHaimed/NVIDIA

• Enabled Concolic Testing• 100+ reliability and security projects

Sunday, July 17, 2011

Page 29: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How STP WorksEager for BV and Lazy for Arrays

27

Substitutions

Simplifications

Linear Solving

Array Abstraction

Conversion to SAT

Boolean SAT Solver

Refinement Loop

Result

Read(A,i0)=0 Read(A,i1)=1 … Read(A,in)=10,000 Θ(i0,i1)

Input

Sunday, July 17, 2011

Page 30: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksUnroll Bounded CFGs into Regular Exp.

28

Bound(E,6) ([()() + (())]) +()[()() + (())] +[()() + (())]()

Hampivar v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

STP Encoder

STP DecoderSTP

String Solutionv = )()(

Bit-vector Constraints

Bit-vector Solution

Normalizer

Sunday, July 17, 2011

Page 31: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksUnroll Bounded CFGs into Regular Exp.

28

Bound(E,6) ([()() + (())]) +()[()() + (())] +[()() + (())]()

Hampivar v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

STP Encoder

STP DecoderSTP

String Solutionv = )()(

Bit-vector Constraints

Bit-vector Solution

Normalizer

Bound Auto-derived

Sunday, July 17, 2011

Page 32: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksBird’s Eye View: Strings into Bit-vectors

29

Hampi

Find a 4-char string v:• (v) is in E• (v) contains ()()

var v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

STP Encoder

STP DecoderSTP

String Solutionv = )()(

Bit-vector Constraints

Bit-vector Solution

Normalizer

Sunday, July 17, 2011

Page 33: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksUnroll Bounded CFGs into Regular Exp.

30

var v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

Auto-derive lower/upper bounds

[L,B]on CFG

[6,6]

cfg E := “()” | E E | “(“ E “)”

Look for minimal length

string“()”

Step 1:

Step 2:

Sunday, July 17, 2011

Page 34: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksUnroll Bounded CFGs into Regular Exp.

31

cfg E := “()” | E E | “(“ E “)”

Recursivelyexpand

non-terminals:

Construct Partitions

[4,2][2,4][3,3][5,1][1,5]

[1,4,1]

Step 3:

Length: 6

Min. length constant: ”()”

cfg E := “()” | E E | “(“ E “)”

Recursivelyexpand

non-terminals:

Construct RE

(())()()(())((()))

Step 4:

Length: 6

Min. length constant: ”()”

Sunday, July 17, 2011

Page 35: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Unroll Bounded CFGs into Regular Exp.Managing Exponential Blow-up

32

•Dynamic programming style

• Works well in practice

cfg E := “()” | E E | “(“ E “)”

Recursivelyexpand

non-terminals:

Construct RE

(())()()(())((()))

...

Length: 6

Min. length constant: ”()”

Sunday, July 17, 2011

Page 36: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Unroll Bounded CFGs into Regular Exp.Managing Exponential Blow-up

33

Bound(E,6) ([()() + (())]) +()[()() + (())] +[()() + (())]()

cfg E := “()” | E E | “(“ E “)”

Recursivelyexpand

non-terminals:

Construct RE

(())()()(())((()))

...

Length: 6

Min. length constant: ”()”

Sunday, July 17, 2011

Page 37: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksConverting Regular Exp. into Bit-vectors

34

( v ) ∈ ()[()() + (())] + [()() + (())]() + ([()() + (())])

Formula Φ1 ∨ Formula Φ2 ∨ Formula Φ3

Encode regular expressions recursively •  Alphabet { (, ) } 0, 1 •  constant bit-vector constant •  union + disjunction ∨ •  concatenation conjunction ∧ •  Kleene star * conjunction ∧•  Membership, equality equality

B[0]=0∧B[1]=1∧{B[2]=0∧B[3]=1∧B[4]=0∧B[5]=1 ∨…

Sunday, July 17, 2011

Page 38: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksConverting Regular Exp. into Bit-vectors

35

( v ) ∈ ()[()() + (())] + [()() + (())]() + ([()() + (())])

Formula Φ1 ∨ Formula Φ2 ∨ Formula Φ3

• Constraint Templates

• Encode once, and reuse

• On-demand formula generation

B[0]=0 ∧ B[1]=1 ∧ {B[2]=0∧B[3]=1∧B[4]=0∧B[5]=1 ∨…

Sunday, July 17, 2011

Page 39: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How Hampi WorksDecoder converts Bit-vectors to Strings

36

Hampi

Find a 4-char string v:• (v) is in E• (v) contains ()()

var v : 4;

cfg E := “()” | E E | “(“ E “)”;

val q := concat( “(“, v, ”)”);

assert q in E;assert q contains “()()”;

STP Encoder

STP DecoderSTP

String Solutionv = )()(

Bit-vector Constraints

Bit-vector Solution

Normalizer

Sunday, July 17, 2011

Page 40: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Rest of the Talk

• HAMPI Logic: A Theory of Strings

• Motivating Example: HAMPI-based Vulnerability Detection App

• How HAMPI works

• Experimental Results

• Related Work: Theory and Practice

• HAMPI 2.0

• SMTization: Future of Strings

37Sunday, July 17, 2011

Page 41: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 1Static SQL Injection Analysis

38

0.01

0.1

1

10

100

1000

1 10 100 1000 10000 100000

Tim

e To

Sol

ve (

sec)

Grammar Size (# of productions)

• 1367 string constraints from Wasserman & Su [PLDI’07] • Hampi scales to large grammars• Hampi solved 99.7% of constraints in < 1sec• All solvable constraints had short solutions

Sunday, July 17, 2011

Page 42: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 2Security Testing and XSS

39

• Attackers inject client-side script into web pages

• Somehow circumvent same-origin policy in websites

• echo “Thank you $my_poster for using the message board”;

• Unsanitized $my_poster

• Can be JavaScript

• Execution can be bad

Sunday, July 17, 2011

Page 43: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 2Security Testing

40

• Hampi used to build Ardilla security tester [Kiezun et al., ICSE’09]

• 60 new vulnerabilities on 5 PHP applications (300+ kLOC)• 23 SQL injection• 37 cross-site scripting (XSS) 5 added to

US National Vulnerability DB

• 46% of constraints solved in < 1 second per constraint

• 100% of constraints solved in <10 seconds per constraint

Sunday, July 17, 2011

Page 44: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 3Comparison with Competing Tools

41av

erag

e tim

e (se

c.)

0 10 20 30 40 50 0

5

10

15

20

25

Hampi

CFGAnalyzer

string size (characters)

• HAMPI vs. CFGAnalyzer (U. Munich): HAMPI ~7x faster for strings of size 50+

Sunday, July 17, 2011

Page 45: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 3Comparison with Competing Tools

42

RE intersection problems

• HAMPI 100x faster than Rex (MSR)

• HAMPI 1000x faster than DPRLE (U. Virginia)

• Pieter Hooimeijer 2010 paper titled ‘Solving String Constraints Lazily’

Sunday, July 17, 2011

Page 46: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 4Helping KLEE Pierce Parsers

43

Symbolic ExecutionEnginewith

Implicit Spec

Crashing Tests

STP

Formulas

SAT/UNSAT

KLEE

Parser

Semantic Core

Generate InputUsing HAMPI;

Mark Partially Symbolic

Sunday, July 17, 2011

Page 47: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How to Automatically Crash Programs?KLEE: Concolic Execution-based Tester

44

Problem: Automatically generate crashing tests given only the code

Symbolic ExecutionEnginewith

Implicit Spec

Program

Crashing Tests

STP

Formulas

SAT/UNSAT

Automatic Tester

Sunday, July 17, 2011

Page 48: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How to Automatically Crash Programs?KLEE: Concolic Execution-based Tester

45

Buggy_C_Program(int* data_field, int len_field) {

int * ptr = malloc(len_field*sizeof(int));int i; //uninitialized

while (i++ < process(len_field)) { //1. Integer overflow causing NULL deref //2. Buffer overflow *(ptr+i) = process_data(*(data_field+i));}

}

Structured input processing code: PDF Reader, Movie Player,...

• Formula captures computation• Tester attaches formula to capture spec

Sunday, July 17, 2011

Page 49: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How to Automatically Crash Programs?KLEE: Concolic Execution-based Tester

45

Buggy_C_Program(int* data_field, int len_field) {

int * ptr = malloc(len_field*sizeof(int));int i; //uninitialized

while (i++ < process(len_field)) { //1. Integer overflow causing NULL deref //2. Buffer overflow *(ptr+i) = process_data(*(data_field+i));}

}

Structured input processing code: PDF Reader, Movie Player,...

data_field, mem_ptr : ARRAY;len_field : BITVECTOR(32); //symbolici, j, ptr : BITVECTOR(32);//symbolic..mem_ptr[ptr+i] = process_data(data_field[i]);mem_ptr[ptr+i+1] = process_data(data_field[i+1]);..

Equivalent Logic Formula derived usingsymbolic execution

• Formula captures computation• Tester attaches formula to capture spec

Sunday, July 17, 2011

Page 50: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How to Automatically Crash Programs?KLEE: Concolic Execution-based Tester

45

Buggy_C_Program(int* data_field, int len_field) {

int * ptr = malloc(len_field*sizeof(int));int i; //uninitialized

while (i++ < process(len_field)) { //1. Integer overflow causing NULL deref //2. Buffer overflow *(ptr+i) = process_data(*(data_field+i));}

}

Structured input processing code: PDF Reader, Movie Player,...

data_field, mem_ptr : ARRAY;len_field : BITVECTOR(32); //symbolici, j, ptr : BITVECTOR(32);//symbolic..mem_ptr[ptr+i] = process_data(data_field[i]);mem_ptr[ptr+i+1] = process_data(data_field[i+1]);..

Equivalent Logic Formula derived usingsymbolic execution

• Formula captures computation• Tester attaches formula to capture spec

Sunday, July 17, 2011

Page 51: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

How to Automatically Crash Programs?KLEE: Concolic Execution-based Tester

45

Buggy_C_Program(int* data_field, int len_field) {

int * ptr = malloc(len_field*sizeof(int));int i; //uninitialized

while (i++ < process(len_field)) { //1. Integer overflow causing NULL deref //2. Buffer overflow *(ptr+i) = process_data(*(data_field+i));}

}

Structured input processing code: PDF Reader, Movie Player,...

data_field, mem_ptr : ARRAY;len_field : BITVECTOR(32); //symbolici, j, ptr : BITVECTOR(32);//symbolic..mem_ptr[ptr+i] = process_data(data_field[i]);mem_ptr[ptr+i+1] = process_data(data_field[i+1]);..

Equivalent Logic Formula derived usingsymbolic execution

• Formula captures computation• Tester attaches formula to capture spec

//INTEGER OVERFLOW QUERY0 <= j <= process(len_field);ptr + i + j = 0?

Sunday, July 17, 2011

Page 52: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 4Helping KLEE Pierce Parsers

46

Symbolic ExecutionEnginewith

Implicit Spec

Crashing Tests

STP

Formulas

SAT/UNSAT

KLEE

Parser

Semantic Core

Mark InputSymbolic

Sunday, July 17, 2011

Page 53: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 4Helping KLEE Pierce Parsers

47

Symbolic ExecutionEnginewith

Implicit Spec

Crashing Tests

STP

Formulas

SAT/UNSAT

KLEE

Parser

Semantic Core

Generate InputUsing HAMPI;

Mark Partially Symbolic

Sunday, July 17, 2011

Page 54: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

HAMPI: Result 4Helping KLEE Pierce Parsers

48

• Klee provides API to place constraints on symbolic inputs

• Manually writing constraints is hard

• Specify grammar using HAMPI, compile to C code

• Particularly useful for programs with highly-structured inputs

• 2-5X improvement in line coverage

Sunday, July 17, 2011

Page 55: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Impact of Hampi: Notable Projects

49

Category Research Project Project Leader/Institution

Static Analysis SQL-injection vulnerabilities Wasserman & Su/UC, Davis

Security TestingArdilla for PHP (SQL injections, cross-site scripting)

Kiezun & Ernst/MIT

Concolic TestingKleeKudzuNoTamper

Engler & Cadar/StanfordSaxena & Song/BerkeleyBisht & Venkatakrishnan/U Chicago

New Solvers Kaluza Saxena & Song/Berkeley

Sunday, July 17, 2011

Page 56: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Impact of Hampi: Notable Projects

50

Tool Name DescriptionProject Leader/Institution

Kudzu JavaScript Bug Finder & Vulnerability Detector

SaxenaAkhawe HannaMaoMcCamantSong/Berkeley

NoTamper Parameter Tamper Detection

BishtHinrichs/U of ChicagoSkrupskyBobrowiczVekatakrishnan/ U. of Illinois, Chicago

Sunday, July 17, 2011

Page 57: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Impact of Hampi: Notable ProjectsNoTamper

51

Server

• Client-side checks (C), no server checks

• Find solutions S1,S2,... to C, and solutions E1,E2,... to ~C by calling HAMPI

• E1,E2,... are candidate exploits

• Submit (S1, E1),... to server

• If server response same, ignore

• If server response differ, report error

Sunday, July 17, 2011

Page 58: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Related Work (Practice)

52

Tool NameProject Leader/Institution

Comparison with HAMPI

RexBjorner, Tillman, Vornkov et al. (Microsoft Research, Redmond)

• HAMPI + Length+Replace(s1,s2,s3) - CFG• Translation to int. linear arith. (Z3)

Mona Karlund et al. (U. of Aarhus)• Can encode HAMPI & Rex• User work• Automata-based• Non-elementary

DPRLE Hooimeijer (U. of Virginia) • Regular expression constraints

Sunday, July 17, 2011

Page 59: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Related Work (Theory)

53

Result Person (Year) Notes

Undecidability of Quantified Word Equations

Quine (1946) Multiplication reduced to concat

Undecidability of Quantified Word Equations with single alternation

Durnev (1996), G. (2011)2-counter machines reduced to words with single quantifier alter.

Decidability (PSPACE) of QF Theory of Word Equations

Makanin (1977)Plandowski (1996, 2002/06)

Makanin result very difficultSimplified by Plandowski

Decidability (PSPACE-complete) of QF Theory of Word Equations + RE

Schultz (1992) RE membership predicate

QF word equations + Length() (?)

Matiyasevich (1971)UnsolvedReduction to Diophantine

QF word equations in solved form + Length() + RE

G. (2011) Practical

Sunday, July 17, 2011

Page 60: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Future of HAMPI & STP• HAMPI will be combined with STP

• Bit-vectors and Arrays• Integer/Real Linear Arithmetic• Uninterpreted Functions• Strings• Floating Point• Non-linear

• Additional features planned in STP• UNSAT Core• Quantifiers• Incremental• DPLL(T)• Parallel STP• MAXSMT?

• Extensibility and hackability by non-expert

54Sunday, July 17, 2011

Page 61: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Future of Strings• Strings SMTization effort started

• Nikolaj Bjorner, G.• Andrei Voronkov, Ruzica Piskac, Ting Zhang• Cesare Tinelli, Clark Barrett, Dawn Song, Prateek Saxena, Pieter Hooimeijer, Tim Hinrichs

• SMT Theory of Strings• Alphabet (UTF, Unicode,...)• String Constants and String Vars (parameterized by length)• Concat, Extract, Replace, Length Functions• Regular Expressions, CFGs (Extended BNF)• Equality, Membership Predicate, Contains Predicate

• Applications• Static/Dynamic Analysis for Vulnerability Detection• Security Testing using Concolic Idea• Formal Methods• Synthesis

55Sunday, July 17, 2011

Page 62: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Conclusions & Take Away

• String solvers essential for testing, analysis, vulnerability detection• String applications in C/C++/Java/C#• Web applications in PHP/JavaScript (client and server-side)

• HAMPI• Multiple string vars, constants• Concat/extract function • Equality between string terms • Membership predicate over RE/CFGs • Contains predicate

• Demand for even richer theories• Attribute grammars• String theories with length

• Bounding: powerful and versatile idea (BMC, bounded logics,...)

• Using completeness as a resource (can we be more systematic?)

56Sunday, July 17, 2011

Page 63: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Topics Covered Today• HAMPI Logic: A Theory of Strings

• HAMPI-based vulnerability detection app

• How HAMPI works

• Another HAMPI-based app: Tamper Detection

• Experimental results (Static, Dynamic, Competing tools, KLEE)

• Related work (Kaluza, Rex,...)

• Future of strings & SMTization

57Sunday, July 17, 2011

Page 64: HAMPI - University of Waterloovganesh/talks/String-Solvers...• HAMPI: String Solvers • Ardilla by Ernst et al. • Kudzu & Kaluza by Song et al. • Klee by Engler et al. • George

Vijay Ganesh, CAV 2011 Tutorial

Key Contributionshttp://people.csail.mit.edu/vganesh

Name Key Concept Impact Pubs

STP Bit-vector & Array Solver1,2

Abstraction-refinement for Solving

Concolic Testing

CAV 2007CCS 2006TISSEC 2008

HAMPI String Solver1

App-driven Bounding for Solving

Analysis of Web Apps

ISSTA 20093

TOSEM 2011(CAV 2011)

Taint-based Fuzzing Information flow is cheaper than concolic

Scales better than concolic

ICSE 2009

Automatic Input Rectification

Acceptability Envelope:Fix the input, not the program

New way of approaching SE

Under Submission

58

1. 100+ research projects use STP and HAMPI2. STP won the SMTCOMP 2006 and 2010 competitions for bit-vector solvers3. HAMPI: ACM Best Paper Award 20094. Retargetable Compiler (DATE 1999)5. Proof-producing decision procedures (TACAS 2003)6. Error-finding in ARBAC policies (CCS 2011)

Sunday, July 17, 2011