Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019...

55
Solving String Constraints via Hardware/Software Model Checking Jie -Hong Roland Jiang , Fang Yu § National Taiwan University § National Chengchi University Meeting on String Constraints and Applications (MOSCA'19), May 6-9, 2019, Bertinoro, Italy

Transcript of Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019...

Page 1: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Solving String Constraints via Hardware/Software Model

Checking

Jie-Hong Roland Jiang†, Fang Yu §

†National Taiwan University§National Chengchi University

Meeting on String Constraints and Applications (MOSCA'19),

May 6-9, 2019, Bertinoro, Italy

Page 2: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Outline

Introduction

Circuit based solving of string constraints (SLOG)

Circuit based solving of string+lengthconstraints (SLENT)

Conclusions

2019/5/9 MOSCA 2019 2

Page 3: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Outline

Introduction

Circuit based solving of string constraints (SLOG)

Circuit based solving of string+lengthconstraints (SLENT)

Conclusions

2019/5/9 MOSCA 2019 3

Page 4: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Motivating Example

Vulnerable to attack pattern: ’ OR ‘1’=‘1’--

<?PHP$user_pass = $_POST[“pwd”];$user_id = $_POST[“uid”];$query = mysql_query(“SELECT username FROM accounts WHERE passwd = ‘ “.$user_pass.” ’ AND userId = ‘ “.$user_id);

?>

2019/5/9 MOSCA 2019 4

Page 5: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Motivating Example

2019/5/9 MOSCA 2019

’ OR ‘1’=‘1’--

SELECT username FROM accounts WHERE passwd=‘’ OR ‘1’=‘1’--

SELECT username FROM accounts WHERE passwd=‘’ OR ‘1’=‘1’--’AND userId = ‘

SELECT username FROM accounts WHERE passwd=‘’ OR ‘1’=‘1’--’AND userId = ‘ 123

123

5

Page 6: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Introduction

String analysis

Problem formulation

Input: an acyclic dependency graph extracted from string manipulating program

Output: possible string values for each string variable under program execution

2019/5/9 MOSCA 2019 6

Page 7: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Related Work

Automata-based approaches

Use finite automata to characterize the set of possible string values in program execution

Can generate filter

Not support counterexample generation

SMT-based approaches

Use string theory to decide whether a set of constraint is satisfiable

Can generate counterexample

Not support filter generation

2019/5/9 MOSCA 2019 7

Page 8: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Our Contribution

We propose

Automata-based string analysis method using logic circuit representation

Advantages of logic circuit representation:

Support both counterexample generation and filter generation

Maintain circuit size linear in the input circuit sizes during string operations

2019/5/9 MOSCA 2019 8

Page 9: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

MOSCA 2019

Motivation of Circuit Representation

Historic evolution of data structures and tools in logic synthesis and verification

Problem Size

Time1950-1970 1980 1990 2000

CNFTT

SOP BDD

AIG16

50

100

100000

Espresso,

MIS, SIS

SIS, VIS,

MVSIS

ABC

Courtesy of Alan Mishchenko

2019/5/9 9

Page 10: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Introduction

Automata operation

2019/5/9 MOSCA 2019

𝐴𝑙𝑖𝑡1

𝐴𝑣𝑎𝑟1

𝐴𝑙𝑖𝑡2

𝐴𝑣𝑎𝑟2

𝐴1 = 𝐶𝐴𝑇 𝐴𝑙𝑖𝑡1 , 𝐴𝑣𝑎𝑟1

𝐴2 = 𝐶𝐴𝑇 𝐴1, 𝐴𝑙𝑖𝑡1

𝐴3 = 𝐶𝐴𝑇 𝐴2, 𝐴𝑣𝑎𝑟2

𝐴𝑠𝑖𝑛𝑘 = 𝐴310

Page 11: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Introduction

Counterexample gen.

2019/5/9 MOSCA 2019

SELECT username FROM accounts WHERE passwd = ‘

’ OR ‘1’=‘1’--

’ AND userId =

123

SELECT username FROM accounts WHERE passwd = ‘’ OR ‘1’=‘1’–’AND userId = 123

Goal: get witness of input string for vulnerable code 𝑡𝑟𝑎𝑐𝑒1

𝑡𝑟𝑎𝑐𝑒𝑢𝑖𝑑

𝑡𝑟𝑎𝑐𝑒2 𝑡𝑟𝑎𝑐𝑒𝑙𝑖𝑡2

𝑡𝑟𝑎𝑐𝑒𝑙𝑖𝑡1𝑡𝑟𝑎𝑐𝑒𝑝𝑤𝑑

𝑞1, S, 𝑞2, E, 𝑞3, L, … , 𝑞𝑛−1, 3, 𝑞n

11

Page 12: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Introduction

Filter generation

2019/5/9 MOSCA 2019

𝐵𝑣𝑎𝑟1 ⊆ 𝐴𝑣𝑎𝑟1

𝐵𝑣𝑎𝑟2 ⊆ 𝐴𝑣𝑎𝑟2

𝐵𝑠𝑖𝑛𝑘 ⊆ 𝐴𝑠𝑖𝑛𝑘

𝐵1

𝐵2

Goal: get black list for each var_node to filter out malicious input

12

Page 13: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Outline

Introduction

Circuit based solving of string constraints

Circuit based solving of string+lengthconstraints

Conclusions

2019/5/9 MOSCA 2019 13

Page 14: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Preliminaries

A (nondeterministic) finite automaton 𝐴 =𝑄, Σ, 𝐼, 𝑇, 𝑂

𝑄: finite state set

Σ: finite alphabet

𝐼 ⊆ 𝑄: set of initial states

𝑇 ⊆ Σ × 𝑄 × 𝑄: transition relation

𝑂 ⊆ 𝑄: set of accepting states

Reserve a symbol for 𝜖 in Σ

ℒ 𝐴 denote the language (set of strings) accepted by 𝐴

2019/5/9 MOSCA 2019 14

Page 15: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Preliminaries

Circuit representation

Boolean encoding on 𝑄, Σ

Use characteristic functions 𝐼 Ԧ𝑠 , 𝑇 Ԧ𝑥, Ԧ𝑠, Ԧ𝑠′ , 𝑂 Ԧ𝑠to represent an automaton

Σ = {𝜖, 𝑎}

00 01 11 10

a a a

𝑥

𝑠1′

𝐼

𝑂

𝑠2′

𝑠2

𝑠1𝑇

2019/5/9 MOSCA 2019 15

Page 16: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

String/Automata Operations

Supported operations

Intersection

Union

Concatenation

Replacement

Reverse

Prefix

Suffix

Emptiness checking

2019/5/9 MOSCA 2019 16

Page 17: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Circuit construction

Construct automaton 𝐴 = 𝐼𝑁𝑇(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 ∩ ℒ(𝐴2):

2019/5/9 MOSCA 2019

Ԧ𝑠 = Ԧ𝑠1, Ԧ𝑠2

𝑇𝐼𝑁𝑇 Ԧ𝑥, Ԧ𝑠, Ԧ𝑠′ = 𝑇1𝜖 Ԧ𝑥, Ԧ𝑠1, Ԧ𝑠1

′ ∧ 𝑇2𝜖 Ԧ𝑥, Ԧ𝑠2, Ԧ𝑠2

𝐼𝐼𝑁𝑇 Ԧ𝑠 = 𝐼1 Ԧ𝑠1 ∧ 𝐼2 Ԧ𝑠2

𝑂𝐼𝑁𝑇 Ԧ𝑠 = 𝑂1 Ԧ𝑠1 ∧ 𝑂2 Ԧ𝑠2

17

Page 18: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Circuit construction

2019/5/9 MOSCA 2019 18

Page 19: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Counterexample generation

2019/5/9 MOSCA 2019 19

Page 20: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Filter generation

Let 𝐵 be the filter for 𝐴 = 𝐼𝑛𝑡 𝐴1, 𝐴2𝐵 can directly applied as a filter for 𝐴1 as well

as 𝐴2

2019/5/9 MOSCA 2019 20

Page 21: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Circuit construction

Construct automaton 𝐴 = 𝑈𝑁𝐼(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 ∪ ℒ(𝐴2):

Union

2019/5/9 MOSCA 2019

Ԧ𝑠 = Ԧ𝑠2, 𝛼Ԧ𝑠2 𝑚 : taking first m variables of Ԧ𝑠2𝛼 = 0: state in 𝐴1, 𝛼 = 1: state in 𝐴2

𝑇𝑈𝑁𝐼 Ԧ𝑥, Ԧ𝑠, Ԧ𝑠′ = ¬𝛼 ∧ ¬𝛼′ ∧ 𝑇1 Ԧ𝑥, Ԧ𝑠2 𝑚, Ԧ𝑠2′𝑚 ∨

𝛼 ∧ 𝛼′ ∧ 𝑇2 Ԧ𝑥, Ԧ𝑠2, Ԧ𝑠2′

𝐼𝑈𝑁𝐼 Ԧ𝑠 = ¬𝛼 ∧ 𝐼1 Ԧ𝑠2 𝑚 ∨ 𝛼 ∧ 𝐼2 Ԧ𝑠2

𝑂𝑈𝑁𝐼 Ԧ𝑠 = ¬𝛼 ∧ 𝑂1 Ԧ𝑠1 𝑚 ∧ 𝛼 ∧ 𝑂2 Ԧ𝑠2

21

Page 22: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Union

Circuit construction

2019/5/9 MOSCA 2019 22

Page 23: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Union

Counterexample generation

2019/5/9 MOSCA 2019 23

Page 24: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Union

Filter generation

Let 𝐵 be the filter for 𝐴 = 𝑈𝑛𝑖 𝐴1, 𝐴2𝐵1 = 𝐼𝑛𝑡 𝐵, 𝐴1 and 𝐵2 = 𝐼𝑛𝑡 𝐵, 𝐴2 form

legitimate filter for 𝐴1 and 𝐴2, respectively

2019/5/9 MOSCA 2019 24

Page 25: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Circuit construction

Construct automaton 𝐴 = 𝐶𝐴𝑇(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 . ℒ 𝐴2 :

2019/5/9 MOSCA 2019

Ԧ𝑠 = Ԧ𝑠2, 𝛼Ԧ𝑠2 𝑚 : taking first m variables of Ԧ𝑠2𝛼 = 0: state in 𝐴1, 𝛼 = 1: state in 𝐴2

𝑇𝐶𝐴𝑇 Ԧ𝑥, Ԧ𝑠, Ԧ𝑠′ = ¬𝛼 ∧ ¬𝛼′ ∧ 𝑇1 Ԧ𝑥, Ԧ𝑠2 𝑚, Ԧ𝑠2′𝑚 ∨

𝛼 ∧ 𝛼′ ∧ 𝑇2 Ԧ𝑥, Ԧ𝑠2, Ԧ𝑠2′ ∨

Ԧ𝑥 = 𝜖 ∧ ¬𝛼 ∧ 𝛼′ ∧ 𝑂1 Ԧ𝑠2 𝑚 ∧ 𝐼2 Ԧ𝑠2′

𝐼𝐶𝐴𝑇 Ԧ𝑠 = ¬𝛼 ∧ 𝐼1 Ԧ𝑠2 𝑚

𝑂𝐶𝐴𝑇 Ԧ𝑠 = 𝛼 ∧ 𝑂2 Ԧ𝑠2

25

Page 26: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Circuit construction

2019/5/9 MOSCA 2019 26

Page 27: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Counterexample generation

2019/5/9 MOSCA 2019 27

Page 28: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Filter generation

Let 𝐵 be the filter for 𝐴 = 𝐶𝑎𝑡 𝐴1, 𝐴2

First construct 𝐵† = 𝐼𝑛𝑡 𝐴, 𝐵

Let 𝐵1 be a copy of 𝐵† but with all the transition between states of 𝛼 = 1 being replaced with 𝜖-transition

Let 𝐵2 be a copy of 𝐵† but with all the transition between states of 𝛼 = 0 being replaced with 𝜖-transition

2019/5/9 MOSCA 2019 28

Page 29: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Emptiness Checking

Decide whether an automaton 𝐴 accept any string or not

𝑇 characterize one step of transition

Convert into sequential circuit, then apply property directed reachability (PDR)

“pdr” command in Berkeley ABC only takes sequential circuits with single initial state and transition functions

2019/5/9 MOSCA 2019 29

Page 30: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Emptiness Checking

Convert transition relation to transition functions

𝑛 new variables Ԧ𝑦 for 𝑛 = |Ԧ𝑠|

One new state variable 𝑧 with initial value 1

Output function: 𝜔 = 𝑂 𝑠 ∧ 𝑧

Next-state functions: 𝛿𝑖 = (𝑦𝑖) for state variables 𝑠𝑖, 𝑖 = 1,… , 𝑛, and 𝛿𝑛+1 = 𝑇 Ԧ𝑥, Ԧ𝑠, Ԧ𝑦 ∧ 𝑧for state variable 𝑧

2019/5/9 MOSCA 2019 30

Page 31: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Emptiness checking

Circuit construction

2019/5/9 MOSCA 2019 31

Page 32: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experiment Setting

Compared string constraint solvers

Our tool: SLOG

Automata-based tools: JSA, Stranger

SMT-based tools: CVC4, Norn, Z3-str2

Environment

Intel Xeon 8-core CPU

16GB memory

Ubuntu 12.04 LCS

2019/5/9 MOSCA 2019 32

Page 33: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Setting

20000+ string analysis instances generated from web applications

Replacement-free small instances

Replacement-free large instances

Instances with replacement operation

2019/5/9 MOSCA 2019 33

Page 34: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (1/4)

Replacement-free small instances

2

2019/5/9 MOSCA 2019 34

Page 35: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (2/4)

Replacement-free large instances

2

2019/5/9 MOSCA 2019 35

Page 36: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (3/4)

Instances with replacement operation

2

2

2019/5/9 MOSCA 2019 36

Page 37: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (4/4)

SLOG performance on counterexample generation

accumulated solving and counterexample generation time

2019/5/9 MOSCA 2019 37

Page 38: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Outline

Introduction

Circuit based solving of string constraints (SLOG)

Circuit based solving of string+lengthconstraints (SLENT)

Conclusions

2019/5/9 MOSCA 2019 38

Page 39: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Length Tracking

Circuit construction

Construct automaton 𝐴𝐿 = 𝑇𝑟𝑘𝐿𝑒𝑛(𝐴) with:

𝑇𝐿 Ԧ𝑥 , Ԧ𝑠, 𝑛, 𝑠′, 𝑛 = 𝑇 Ԧ𝑥, Ԧ𝑠, 𝑠′ ∧

( Ԧ𝑥 ≠ 𝜖 ∧ 𝑛’ = 𝑛 + 1 ∨

( Ԧ𝑥 = 𝜖 ∧ (𝑛′ = 𝑛)))

𝐼𝐿 Ԧ𝑠 = 𝐼(Ԧ𝑠)

𝑂𝐿 Ԧ𝑠 = 𝑂(Ԧ𝑠)

2019/5/9 MOSCA 2019 39

Page 40: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Circuit construction

Construct automaton 𝐴 = 𝐼𝑁𝑇(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 ∩ ℒ(𝐴2):

2019/5/9 MOSCA 2019

𝑇𝐼𝑁𝑇 Ԧ𝑥, Ԧ𝑠, 𝑛, Ԧ𝑠′, 𝑛′ = 𝑇1𝜖 Ԧ𝑥, Ԧ𝑠, 𝑛, Ԧ𝑠′, 𝑛′ ∧ 𝑇2

𝜖 Ԧ𝑥, Ԧ𝑠, 𝑛, Ԧ𝑠′, 𝑛′

𝐼𝐼𝑁𝑇 Ԧ𝑠 = 𝐼1 Ԧ𝑠1 ∧ 𝐼2 Ԧ𝑠2𝑂𝐼𝑁𝑇 Ԧ𝑠 = 𝑂1 Ԧ𝑠1 ∧ 𝑂2 Ԧ𝑠2

Ԧ𝑠 = Ԧ𝑠1, Ԧ𝑠2

40

Page 41: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Intersection

Circuit construction

2019/5/9 MOSCA 2019 41

Page 42: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Circuit construction

Construct automaton 𝐴 = 𝑈𝑁𝐼(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 ∪ ℒ(𝐴2):

Union

2019/5/9 MOSCA 2019

Ԧ𝑠 = Ԧ𝑠2, 𝛼Ԧ𝑠2 𝑚 : taking first m variables of Ԧ𝑠2𝛼 = 0: state in 𝐴1, 𝛼 = 1: state in 𝐴2

𝑇𝑈𝑁𝐼 Ԧ𝑥, Ԧ𝑠, 𝑛, Ԧ𝑠′, 𝑛′ = ¬𝛼 ∧ ¬𝛼′ ∧ 𝑇1 Ԧ𝑥, Ԧ𝑠2 𝑚, 𝑛, Ԧ𝑠2′𝑚, 𝑛′ ∨

𝛼 ∧ 𝛼′ ∧ 𝑇2 Ԧ𝑥, Ԧ𝑠2, 𝑛, Ԧ𝑠2′ , 𝑛′

𝐼𝑈𝑁𝐼 Ԧ𝑠 = ¬𝛼 ∧ 𝐼1 Ԧ𝑠2 𝑚 ∨ 𝛼 ∧ 𝐼2 Ԧ𝑠2

𝑂𝑈𝑁𝐼 Ԧ𝑠 = ¬𝛼 ∧ 𝑂1 Ԧ𝑠1 𝑚 ∧ 𝛼 ∧ 𝑂2 Ԧ𝑠2

42

Page 43: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Union

Circuit construction

2019/5/9 MOSCA 2019 43

Page 44: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Circuit construction

Construct automaton 𝐴 = 𝐶𝐴𝑇(𝐴1, 𝐴2) with ℒ 𝐴 = ℒ 𝐴1 . ℒ 𝐴2 :

2019/5/9 MOSCA 2019

Ԧ𝑠 = Ԧ𝑠2, 𝛼Ԧ𝑠2 𝑚 : taking first m variables of Ԧ𝑠2𝛼 = 0: state in 𝐴1, 𝛼 = 1: state in 𝐴2

𝑇𝐶𝐴𝑇 Ԧ𝑥, Ԧ𝑠, 𝑛, Ԧ𝑠′, 𝑛′ = (¬𝛼 ∧ ¬𝛼′ ∧ 𝑇1 Ԧ𝑥, Ԧ𝑠2 𝑚, 𝑛1, Ԧ𝑠2′𝑚, 𝑛1

′ ∧ 𝑛2 = 𝑛2′ ) ∨

(𝛼 ∧ 𝛼′ ∧ 𝑇2 Ԧ𝑥, Ԧ𝑠2, 𝑛2, Ԧ𝑠2′ , 𝑛2

′ ∧ (𝑛1 = 𝑛1′ )) ∨

Ԧ𝑥 = 𝜖 ∧ ¬𝛼 ∧ 𝛼′ ∧ 𝑂1 Ԧ𝑠2 𝑚 ∧ 𝐼2 Ԧ𝑠2′ ∧ (𝑛 = 𝑛′)

𝐼𝐶𝐴𝑇 Ԧ𝑠 = ¬𝛼 ∧ 𝐼1 Ԧ𝑠2 𝑚

𝑂𝐶𝐴𝑇 Ԧ𝑠 = 𝛼 ∧ 𝑂2 Ԧ𝑠2

44

Page 45: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Concatenation

Circuit construction

2019/5/9 MOSCA 2019 45

Page 46: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Prefix

Circuit construction

2019/5/9 MOSCA 2019 46

Page 47: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experiment Setting

Compared string constraint solvers

Our tool: SLENT (using IC3IA model checker)

Automata-based tools: (UCSB) ABC

SMT-based tools: CVC4, Norn, S3P, Trau, Z3-str3

Environment

Intel Xeon 8-core CPU

16GB memory

Ubuntu 12.04 LCS

2019/5/9 MOSCA 2019 47

Page 48: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (1/3)

2000+ instances converted from Kaluzabenchmarks involving only string concatenation and length constraints

2019/5/9 MOSCA 2019

solver time (s) #SAT #UNSAT #TO (200s)

Z3str3 56.46 1017 983 0

CVC4 88.89 1017 983 0

Norn 2025.30 1013 983 4

ABC 255.76 1013 983 4

S3P 137.90 1015 983 2

Trau 123.85 1017 983 0

SLENT 1397.82 1013 983 4

48

Page 49: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (2/3)

236 instances converted from Stranger benchmarks with involving string-to-string replaceall, concatenation, and length constraints

2019/5/9 MOSCA 2019

solver time (s) #SAT #UNSAT #TO(600s)

#abort

ABC 2282.84 109 (31)

111(0)

0 16

S3P 605.79 30(0)

114(3)

22 70

Trau 687.49 54(2)

139(22)

5 38

SLENT 26692.55 88(0)

141(0)

7 0

49

Page 50: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Experimental Results (3/3)

101 instances converted from Stranger benchmarks with involving language-to-language replaceall, concatenation, and length constraints

2019/5/9 MOSCA 2019

solver time (s) #SAT #UNSAT #TO(600s)

#abort

ABC 977.80 46 (2)

41(0)

1 13

SLENT 4413.25 44(0)

38(0)

19 0

9 under TO 1800s

50

Page 51: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Conclusions

SLOG

Logic circuit based automata manipulation method for string analysis

Support both counterexample generation and filter synthesis

Maintain circuit size linear in input circuit sizes for string operations

Postpone language emptiness checking with model checking at the end

2019/5/9 MOSCA 2019 51

Page 52: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Conclusions (cont’d)

SLENT

Encode length information to string automata as length-encoded automata

Construct characteristic functions of length-encoded automata through automata manipulations that correspond to string and length constraints

Leverage a symbolic model checker for infinite state systems as an engine for language emptiness checking

2019/5/9 MOSCA 2019 52

Page 53: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Conclusions (cont’d)

SLOG and SLENT are based on scalable circuit representation and good at solving complex string constraints

Counterexample generation and filter synthesis are possible

Future work

Support complement operation

Allow relation on string variables

2019/5/9 MOSCA 2019 53

Page 54: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Acknowledgement

2019/5/9 MOSCA 2019

Hung-En WangTsung-Lin TsaiChun-Han LinShih-Yu Chen

54

Page 55: Solving String Constraints via Hardware/Software Model Checking · 2019. 10. 22. · MOSCA 2019 Motivation of Circuit Representation Historic evolution of data structures and tools

Thanks for Your Attention!

Questions?

2019/5/9 MOSCA 2019 55